Why are the CPU and GPU running times the same when I use itkRWSegmentationFilter and itkCudaRWSegmentationFilter?

The code is from: GitHub - enricperera/itkRWSegmentationFilter: ITK-like filter for image segmentation using the random walker algorithm
I ran itkRWSegmentationFilter successfully on the CPU, but I got an error when I ran itkCudaRWSegmentationFilter on the GPU:
BiCStab failed with error-1
So I tried commenting out the following code in itkCudaRWSegmentationFilter.hxx, lines 389~393:

// Call CUDA BiCGStab solver
bicgstab_exit_status = this->BiCGStab(unmarkedLength, nnz);
if (bicgstab_exit_status != 0)
{
  std::cout << "BiCStab failed with error" << bicgstab_exit_status << std::endl;
  return;
}

After this change, I get the same segmentation result and the same running time on the CPU and the GPU.
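A minimal sketch of how the two running times can be compared, for anyone who wants to reproduce the measurement. `ElapsedMilliseconds` is a hypothetical helper written for illustration; it is not part of ITK or of the itkRWSegmentationFilter repository, and it measures wall-clock time around whatever callable it is given.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper (not part of ITK or the filter repository):
// runs `work` exactly once and returns the elapsed wall-clock time
// in milliseconds, measured with a monotonic clock.
inline double ElapsedMilliseconds(const std::function<void()>& work)
{
  const auto start = std::chrono::steady_clock::now();
  work();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Assuming `filter` is an already configured instance of either filter, it could be used as `double ms = ElapsedMilliseconds([&] { filter->Update(); });`. The first CUDA call in a process usually includes one-time initialization overhead, so it may be worth timing a second run as well.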
So my questions are:

  • Is my change to the code the cause of this problem?
  • How can I solve the problem of the CPU and GPU running times being the same?

Thanks in advance!
Kary