[Dcmlib] std::map vs std::hash_map

Sean McInerney seanm at nmr.mgh.harvard.edu
Tue Jul 5 20:11:16 CEST 2005


Hi Mathieu,

First off, I would say that using std::map is the ideal solution *if*
the tag maps are not a performance bottle neck. All STL implementations
should have a usable std::map while extension classes like hash_map are
a pain to deal with when portability is an issue.

I have some additional data regarding map performance issues from a
library of container classes for vtkObjects which I will be releasing
soon (http://vtkcontainers.sourceforge.net/). My comparison of map vs.
hash_map is very similar to your results. If mapping is your bottleneck,
you may find the performance of my non-STL hash map interesting ...
though it would certainly involve some work to graft such a scheme into
the relevant (two?) GDCM classes.

-Sean

Sean McInerney wrote:
> Hi Mathieu,
> 
> A 5% improvement is surprisingly weak, but may be due to the typical key
> sizes found in a DICOM table as well as the size of the table. Depending
> on the hash function used, you may have a lot of collisions and multiple
> rehashing necessary for any given key. If you really *need* better
> performance, you may want to investigate passing a different hash
> function (functor) as a template parameter to std::hash_map tuned for
> your specific distribution of input values (keys).
> 
> Or better yet, skip the enormous time sink of tweaking your hash
> implementation and enjoy the handy lexicographic sorting offered by
> std::map. By definition, hash maps have no such ordering.
> 
> Happy 4th of July to all ... especially the French members of the list.
> We would not be here without you. Sorry to have made such a mess of the
> place in the meantime, we'll try to fix it up in the near future.
> 
> -Sean
> 
> Mathieu Malaterre wrote:
> 
>>Ok guys,
>>
>>    I did perform a few bench comparing in Release mode on linux debian,
>>gcc 3.3. If I replace the TagKeyHT from a std::map to a
>>std::hash_table.. I only get a ~5% improvement. But again to reach 5%
>>improvement I was parsing (see TestAllPrint.cxx) all files from gdcmData
>>and all files from gdcmSampleData (rsync serveur). Basically the
>>difference between a map and a hash is extremely negligeable. You only
>>notice it when you parse a lot of file and a lot of tags...
>>
>>    Since this is not exactly what I expected I'll differ the merge with
>>my local change until I found a more drastic change that affect the
>>speed of gdcm.
>>
>>    Unless of course people are interested in the patch right away...
>>
>>-----------------------------
>>
>>    My machine: CHIRON, which used to do the win32 part (bcc32, and
>>vs7.1) died last week (I am guessing the power supply), so it will only
>>be back middle of next week.
>>
>>
>>Mathieu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.creatis.insa-lyon.fr/pipermail/dcmlib/attachments/20050705/ec238530/attachment.html>


More information about the Dcmlib mailing list