[Rtk-users] [EXTERNAL] Help with CudaForwardProjectionImageFilter

Rahman, Obaid rahmano at ornl.gov
Mon Apr 29 13:53:24 CEST 2024


Thank you, Simon.
I did as you asked. Please see the attachment for the status of GPU memory right before ForwarProj.Update().
I also monitored GPU memory during ForwarProj.Update() run (watch -n0.1 nvidia-smi) and I found that it never exceeded 12 GB.
Very strange!
Please note that I am using GPU3.

Best,
Obaid



> On Apr 29, 2024, at 2:11 AM, Simon Rit <simon.rit at creatis.insa-lyon.fr> wrote:
> 
> Hi Obaid,
> Thanks for your report. So your volume and your projections should use 10 GB. One thing I recently noticed while working on CUDA 12 compatibility is that the forward projection code uses layered texture and actually doubles the memory of the volume (for scalars, adds one component for a volume of vectors)
> https://github.com/RTKConsortium/RTK/blob/master/src/rtkCudaForwardProjectionImageFilter.cu#L203 <https://urldefense.us/v2/url?u=https-3A__github.com_RTKConsortium_RTK_blob_master_src_rtkCudaForwardProjectionImageFilter.cu-23L203&d=DwMFaQ&c=v4IIwRuZAmwupIjowmMWUmLasxPEgYsgNI-O7C4ViYc&r=J7uT21mkGp7aMwIrHQkTLGwy72wKx_bOB0IkoGp__bQ&m=YP_feMKOtFmfI_RlzB15o9wbqPG47bYsVuEAUn_txqGKXShZClMCQ2M6hFFvZ-5b&s=sBRf0tTFHZldRkEmINnmghVY4iJ2QPsIRgxML2Xcy1I&e=>
> https://github.com/RTKConsortium/RTK/blob/master/src/rtkCudaUtilities.cu#L171 <https://urldefense.us/v2/url?u=https-3A__github.com_RTKConsortium_RTK_blob_master_src_rtkCudaUtilities.cu-23L171&d=DwMFaQ&c=v4IIwRuZAmwupIjowmMWUmLasxPEgYsgNI-O7C4ViYc&r=J7uT21mkGp7aMwIrHQkTLGwy72wKx_bOB0IkoGp__bQ&m=YP_feMKOtFmfI_RlzB15o9wbqPG47bYsVuEAUn_txqGKXShZClMCQ2M6hFFvZ-5b&s=-rb9umhqxdnfxG2pxnrceB6kFeZfuoHjLUMVlmgeUxk&e=>
> That amounts to 18.8 GB which is still much less than 80 GB. I am probably missing something but I don't know what. I would suggest to add
>   os.system('nvidia-smi')
> in your python code to check what is the memory status (before ForwardProj.Update()).
> To be honest, the projection images is a minor portion of the volume so the streaming will probably not help, sorry.
> Thanks for reporting any additional info on the mailing list,
> Simon
> 
> On Fri, Apr 26, 2024 at 1:36 PM Rahman, Obaid <rahmano at ornl.gov <mailto:rahmano at ornl.gov>> wrote:
>> Hi Simon,
>> 
>> Thank you for the suggestion, I will definitely try that.
>> Please find attached the output of nvidia-smi.
>> Thanks.
>> 
>> Best,
>> Obaid
>> 
>> <Screenshot 2024-04-26 at 7.32.11 AM.png>
>> 
>> 
>>> On Apr 26, 2024, at 3:03 AM, Simon Rit <simon.rit at creatis.insa-lyon.fr <mailto:simon.rit at creatis.insa-lyon.fr>> wrote:
>>> 
>>> Hi Obaid,
>>> If you remove the "ForwardProj.Update()" line and connect instead a StreamingImageFilter <https://urldefense.us/v2/url?u=https-3A__itk.org_Doxygen_html_classitk-5F1-5F1StreamingImageFilter.html&d=DwMFaQ&c=v4IIwRuZAmwupIjowmMWUmLasxPEgYsgNI-O7C4ViYc&r=J7uT21mkGp7aMwIrHQkTLGwy72wKx_bOB0IkoGp__bQ&m=irXqD18u-c9Im_eQprxOMZNcAsAeuIrOn7RWNk8LABzPsfkFpG_WuH03l9pRCH7S&s=wSLRSuiTcAhTsEGggRb-clAgYNx67JL22y315GEq6TI&e=>, your projections can be processed piece by piece.
>>> I'm surprised by your 80 GB memory, can you display the output of nvidia-smi?
>>> Simon
>>> 
>>> On Thu, Apr 25, 2024 at 11:56 AM Rahman, Obaid <rahmano at ornl.gov <mailto:rahmano at ornl.gov>> wrote:
>>>> Thanks, Nils for the swift response.
>>>> I have actually 80 GB memory per GPU which should be plenty.
>>>> The image is 8.65 GB (read as float32), and the projection size is 1.5 GB (read as float32).
>>>> Also, CudaBackProjectionImageFilter works fine for the given projection and image sizes.
>>>> 
>>>> I have tried other forward projectors (Astra) and they work fine with the given memory.
>>>> I was wondering if my code was suboptimal memory-wise.
>>>> Thanks!
>>>> 
>>>> Best,
>>>> Obaid
>>>> 
>>>>> On Apr 25, 2024, at 3:54 AM, krah <nils.krah at creatis.insa-lyon.fr <mailto:nils.krah at creatis.insa-lyon.fr>> wrote:
>>>>> 
>>>>> Hey,
>>>>> ​without knowing further details nor having verified the code, I guess your image is just too large to fit into the memory of your GPU.
>>>>> 
>>>>> Naïvely, I would calculate the size in GBytes of your image and compare with the memory of your GPU. Increasing the spacing means using fewer pixels = less memory. On top of that, I guess there are the projections which require space on the GPU.
>>>>>>>>>> ​If you need a small spacing, maybe you can reconstruct in chunks along the rotation axis and merge the chunks later?
>>>>> ​Or the RTK experts know other tricks ...​
>>>>>>>>>> ​just my non-expert two-cents ...
>>>>> ​Nils
>>>>>>>>>> On Apr 25 2024, at 3:21 am, Rahman, Obaid <rahmano at ornl.gov <mailto:rahmano at ornl.gov>> wrote:
>>>>>> Hello all,
>>>>>> 
>>>>>> I am getting “out of memory” error when I run CudaForwardProjectionImageFilter.
>>>>>> Please refer to the attached screenshot.
>>>>>> 
>>>>>> These are the array sizes with which I get the error:
>>>>>> Projection size: (slice/row, view, column)=(1456,145,1840)
>>>>>> Recon size: (z, y, x)=(1264, 1356, 1356)
>>>>>> 
>>>>>> When I increase the voxel size to twice i.e. Recon size: (z, y, x)=(632, 678, 678), it works fine.
>>>>>> Maybe I’m not being very efficient with the memory?
>>>>>> 
>>>>>> If anyone has suggestion to make it more memory efficient, please let me know.
>>>>>> Thanks.
>>>>>> 
>>>>>> Best,
>>>>>> Obaid
>>>>>> 
>>>>>> Here’s what I’m doing:
>>>>>> 
>>>>>> def rtk_projection(recon_arrar, proj_params, miscalib, vol_params, return_itk=False, return_RVC=True):
>>>>>>     '''
>>>>>>     inputs:
>>>>>>       recon_arrar: image numpy array (ZXY format is expected)
>>>>>>       proj_params: projection parameter dictionary
>>>>>>       miscalib: miscalibation parameter dictionary
>>>>>>       vol_params: image volume parameter dictionary
>>>>>>       return_itk: True if itk image is expected, False if a numpy array is expected
>>>>>>     output:
>>>>>>       itk image (c,r,v format internally) or image numpy array (r,v,c format) of projection data
>>>>>>     '''
>>>>>>     ImageType = itk.Image[itk.F,3]; CudaImageType = itk.CudaImage[itk.F,3]
>>>>>> 
>>>>>>     # Convert the recon to itk format
>>>>>>     ###########################################################################
>>>>>>     # proj_data is in (z,x,y); rtk requires  numpy array in (x,z,y)
>>>>>>     recon_arrar = np.transpose(recon_arrar, [1,0,2]).astype('float32') # now in (x,z,y)
>>>>>>     recon_shape = recon_arrar.shape
>>>>>>     recon_arrar = itk.GetImageFromArray(recon_arrar) # internally img has format (y,z,x)
>>>>>> 
>>>>>>     # Center the image around 0 which is the default center of rotation
>>>>>>     recon_arrar.SetOrigin([-0.5*(recon_arrar.GetLargestPossibleRegion().GetSize()[0]-1)*recon_arrar.GetSpacing()[0],
>>>>>>                     -0.5*(recon_arrar.GetLargestPossibleRegion().GetSize()[1]-1)*recon_arrar.GetSpacing()[1],
>>>>>>                     -0.5*(recon_arrar.GetLargestPossibleRegion().GetSize()[2]-1)*recon_arrar.GetSpacing()[2]])
>>>>>> 
>>>>>>     # Graft the projections to an itk::CudaImage
>>>>>>     ###########################################################################
>>>>>>     vol_xy = float(vol_params['vox_xy']); vol_z = float(vol_params['vox_z'])
>>>>>>     rtk_recon = CudaImageType.New()# img will have format crv
>>>>>>     rtk_recon.SetPixelContainer(recon_arrar.GetPixelContainer()) # img has format crv
>>>>>>     rtk_recon.SetLargestPossibleRegion(recon_arrar.GetLargestPossibleRegion())
>>>>>>     rtk_recon.SetBufferedRegion(recon_arrar.GetBufferedRegion())
>>>>>>     rtk_recon.SetRequestedRegion(recon_arrar.GetRequestedRegion())
>>>>>>     rtk_recon.SetSpacing([vol_xy, vol_z, vol_xy]) # spacing for xzy
>>>>>>     rtk_recon.SetOrigin([-0.5*(recon_shape[2]-1)*vol_xy, # origin for x direction
>>>>>>                   -0.5*(recon_shape[1]-1)*vol_z, # origin for z direction
>>>>>>                   -0.5*(recon_shape[0]-1)*vol_xy])  # origin for y direction
>>>>>>     # print('GPU recon:', rtk_recon)
>>>>>>     del recon_arrar
>>>>>>     ###########################################################################
>>>>>> 
>>>>>>     # Defines the RTK geometry object
>>>>>>     geometry = rtk.ThreeDCircularProjectionGeometry.New()
>>>>>>     for i in range(len(proj_params['angles'])):
>>>>>>         angle_deg = proj_params['angles'][i]*180/np.pi # angle in degree
>>>>>>         geometry.AddProjection(proj_params['cone_params']['src_orig'],
>>>>>>                                proj_params['cone_params']['src_orig']+proj_params['cone_params']['orig_det'],
>>>>>>                                angle_deg, miscalib['delta_u'], miscalib['delta_v'])
>>>>>> 
>>>>>>     # define some parameters (to make code more readable)
>>>>>>     det_pix_x = proj_params['cone_params']['pix_x'] # row direction pixel size
>>>>>>     det_pix_y = proj_params['cone_params']['pix_y'] # channel direction pixel size
>>>>>>     proj_shape = [int(proj_params['dims'][0]), int(proj_params['dims'][1]), int(proj_params['dims'][2])] # r,v,c
>>>>>>     ###########################################################################
>>>>>> 
>>>>>>     # define a zero projection (GPU)
>>>>>>     constantImageSource = rtk.ConstantImageSource[CudaImageType].New()
>>>>>>     constantImageSource.SetSpacing( [det_pix_y, det_pix_x, 1.] ) # c,r,v
>>>>>>     constantImageSource.SetSize( [proj_shape[2], proj_shape[0], proj_shape[1]] ) # c,r,v
>>>>>>     constantImageSource.SetOrigin([ -0.5*(proj_shape[2]-1)*det_pix_y,   # origin for channel direction
>>>>>>                                     -0.5*(proj_shape[0]-1)*det_pix_x,   # origin for row direction
>>>>>>                                     -0.5*(proj_shape[1]-1)*1])          # origin for view direction (will be ignored anyway)
>>>>>>     constantImageSource.SetConstant(0.)
>>>>>>     ###########################################################################
>>>>>> 
>>>>>>     # forward project the recon to fill out the zero projection image
>>>>>>     ForwardProj = rtk.CudaForwardProjectionImageFilter[CudaImageType].New()
>>>>>>     ForwardProj.SetGeometry( geometry )
>>>>>>     ForwardProj.SetInput(0, constantImageSource.GetOutput()) # projection image
>>>>>>     ForwardProj.SetInput(1, cpu2gpu(rtk_recon)) # recon volume
>>>>>>     ForwardProj.SetStepSize(float(vol_params['vox_xy'])/4) # step size along the ray (default is 1mm)
>>>>>>     ForwardProj.Update()
>>>>>>     ###########################################################################
>>>>>> 
>>>>>>     # graft the projection to CPU / extract the array
>>>>>>     rtk_reprojection = gpu2cpu(ForwardProj.GetOutput()) # array inside the image has c,r,v format
>>>>>>     if return_itk:
>>>>>>         return rtk_reprojection # array inside the image has c,r,v format
>>>>>>     else:
>>>>>>         rtk_reprojection = itk.GetArrayViewFromImage(rtk_reprojection) # now v,r,c format
>>>>>>         if return_RVC:
>>>>>>             rtk_reprojection = np.transpose(rtk_reprojection, [1,0,2]) # r,v,c format to match Astra projection
>>>>>>         return rtk_reprojection # numpy array (r,v,c)
>>>>>> 
>>>>>> 
>>>>>> <Screenshot 2024-04-24 at 8.34.20 PM.png>
>>>>>> 
>>>>>> 
>>>>>> _______________________________________________
>>>>>> Rtk-users mailing list
>>>>>> rtk-users at openrtk.org <mailto:rtk-users at openrtk.org>
>>>>>> https://www.creatis.insa-lyon.fr/mailman/listinfo/rtk-users <https://urldefense.us/v2/url?u=https-3A__www.creatis.insa-2Dlyon.fr_mailman_listinfo_rtk-2Dusers&d=DwQFaQ&c=v4IIwRuZAmwupIjowmMWUmLasxPEgYsgNI-O7C4ViYc&r=J7uT21mkGp7aMwIrHQkTLGwy72wKx_bOB0IkoGp__bQ&m=mi_1gmtry3HzMfO0caEhCP-0wcNe3su2gocFndAoByaorId18bE57XL9ozhaj01k&s=K4sXSgvZVGEDHgEKoU168ighF9MBDAXWRLef6iJDkrs&e=><Screenshot 2024-04-24 at 8.34.20 PM.png>
>>>> 
>>>> _______________________________________________
>>>> Rtk-users mailing list
>>>> rtk-users at openrtk.org <mailto:rtk-users at openrtk.org>
>>>> https://www.creatis.insa-lyon.fr/mailman/listinfo/rtk-users <https://urldefense.us/v2/url?u=https-3A__www.creatis.insa-2Dlyon.fr_mailman_listinfo_rtk-2Dusers&d=DwMFaQ&c=v4IIwRuZAmwupIjowmMWUmLasxPEgYsgNI-O7C4ViYc&r=J7uT21mkGp7aMwIrHQkTLGwy72wKx_bOB0IkoGp__bQ&m=irXqD18u-c9Im_eQprxOMZNcAsAeuIrOn7RWNk8LABzPsfkFpG_WuH03l9pRCH7S&s=jxU_OqFdvXhIND4OxiV2Qk7vLRtlZbkMZpGf5fuYKeA&e=>
>> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.creatis.insa-lyon.fr/pipermail/rtk-users/attachments/20240429/b62247b2/attachment-0001.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screenshot 2024-04-29 at 7.16.19?AM.png
Type: image/png
Size: 713290 bytes
Desc: not available
URL: <http://www.creatis.insa-lyon.fr/pipermail/rtk-users/attachments/20240429/b62247b2/attachment-0001.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 1611 bytes
Desc: not available
URL: <http://www.creatis.insa-lyon.fr/pipermail/rtk-users/attachments/20240429/b62247b2/attachment-0001.p7s>


More information about the Rtk-users mailing list