CREATIS Lab.-France || KULeuven-Belgium || ThoraxCenter Lab.-Netherlands

Evaluation procedure

Goals

The aims of the evaluation metrics are twofold:

  • measure the accuracy of the segmented endocardial surface, using global and local measures of similarity with the reference contours;

  • measure the degree of accuracy of the derived clinical indices.

Distance error metrics

To measure the degree of accuracy of the endocardial border, three standard metrics will be used:

  • Modified Dice similarity index: the modified Dice similarity index, D*, measures the overlap between the surface volume (V) from the automatic method and the reference surface volume (Vref). It takes values between 0 (full overlap) and 1 (no overlap):
    $D^* = 1 - \frac{2\,|V \cap V_{ref}|}{|V| + |V_{ref}|}$

    This value will be computed at both the end-diastolic (D*ED) and end-systolic (D*ES) instants.

  • Mean surface distance: the mean surface distance, dmean, between the surface (S) from the automatic method and the reference surface (Sref) is defined as:
    $d_{mean} = \frac{1}{2}\left[ d(S, S_{ref}) + d(S_{ref}, S) \right]$
    where d(S,Sref) is the mean of the distances between every surface voxel in S and the closest surface voxel in Sref, and d(Sref,S) is computed in the same way with the roles of the two surfaces exchanged. This value will be computed at both the end-diastolic (dmean,ED) and end-systolic (dmean,ES) instants.

  • Hausdorff surface distance: the Hausdorff distance, dH, measures the local maximum distance between the two surfaces S and Sref. This value will be computed at both the end-diastolic (dH,ED) and end-systolic (dH,ES) instants.
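
The three distance metrics above can be illustrated with a short Python sketch. It is not the challenge's evaluation code. It assumes that volumes are given as sets of voxel indices and surfaces as lists of 3D points, that D* is one minus the standard Dice coefficient, that dmean is the average of the two directed mean distances, and that dH is the symmetric maximum of the closest-point distances. All function names are hypothetical.

```python
import math


def modified_dice(vol, vol_ref):
    """D* = 1 - Dice: 0 means full overlap, 1 means no overlap."""
    vol, vol_ref = set(vol), set(vol_ref)
    if not vol and not vol_ref:
        return 0.0  # assumption: two empty volumes count as identical
    overlap = len(vol & vol_ref)
    return 1.0 - 2.0 * overlap / (len(vol) + len(vol_ref))


def closest_distances(points_a, points_b):
    """For each point of points_a, distance to its closest point in points_b."""
    return [min(math.dist(p, q) for q in points_b) for p in points_a]


def mean_surface_distance(surf, surf_ref):
    """Average of the two directed mean distances (assumed symmetric form)."""
    d_ab = closest_distances(surf, surf_ref)
    d_ba = closest_distances(surf_ref, surf)
    return 0.5 * (sum(d_ab) / len(d_ab) + sum(d_ba) / len(d_ba))


def hausdorff_distance(surf, surf_ref):
    """Largest closest-point distance, taken over both directions."""
    return max(max(closest_distances(surf, surf_ref)),
               max(closest_distances(surf_ref, surf)))


# Toy data: two 3-voxel volumes sharing 2 voxels, two 2-point surfaces.
vol = {(0, 0, 0), (1, 0, 0), (2, 0, 0)}
vol_ref = {(1, 0, 0), (2, 0, 0), (3, 0, 0)}
surf = [(0, 0, 0), (10, 0, 0)]
surf_ref = [(0, 0, 1), (10, 0, 3)]

print(modified_dice(vol, vol_ref))            # 1 - 4/6, about 0.333
print(mean_surface_distance(surf, surf_ref))  # 2.0
print(hausdorff_distance(surf, surf_ref))     # 3.0
```

The brute-force closest-point search is quadratic in the number of surface points, which is acceptable for a sketch but would normally be replaced by a spatial index on full-resolution meshes.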

Clinical index metrics

To measure the ability of the algorithms to extract relevant clinical indices, modified correlation (corr*=1-corr), bias and standard deviation (std) measurements will be computed from End Diastolic Volume (EDV), End Systolic Volume (ESV) and Ejection Fraction (EF) measurements. The following notation will be used, shown here for EDV (ESV and EF follow the same pattern):

  • Modified correlation computed from EDV measurements: EDVcorr*

  • Bias computed from EDV measurements: EDVbias

  • Standard deviation computed from EDV measurements: EDVstd
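
As an illustration, the three scores for one clinical index could be computed as in the sketch below. The text above does not specify the exact definitions, so the following are assumptions: corr is the Pearson correlation between automatic and reference values, bias is the mean of the differences (automatic minus reference), and std is the sample standard deviation of those differences. The function name and the sample values are hypothetical.

```python
import math


def clinical_index_scores(auto, ref):
    """Return (corr*, bias, std) for paired measurements of one index, e.g. EDV."""
    n = len(auto)
    mean_a = sum(auto) / n
    mean_r = sum(ref) / n

    # Pearson correlation between automatic and reference values (assumption).
    cov = sum((a - mean_a) * (r - mean_r) for a, r in zip(auto, ref))
    var_a = sum((a - mean_a) ** 2 for a in auto)
    var_r = sum((r - mean_r) ** 2 for r in ref)
    corr = cov / math.sqrt(var_a * var_r)

    # Bias and spread of the paired differences (assumed auto - ref).
    diffs = [a - r for a, r in zip(auto, ref)]
    bias = sum(diffs) / n
    std = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))

    return 1.0 - corr, bias, std


# Made-up EDV values in ml for three patients.
edv_ref = [100.0, 120.0, 140.0]
edv_auto = [102.0, 118.0, 146.0]
edv_corr_star, edv_bias, edv_std = clinical_index_scores(edv_auto, edv_ref)
print(edv_corr_star, edv_bias, edv_std)  # bias is 2.0 and std is 4.0 here
```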


Ranking strategy

The challengers' scores will be ranked according to the following measurements:

  • Global distance errors measure
    Global Distance Measure
  • Global clinical indices measure
    Global Indice Measure
  • Global error measure
    Global Error Measure
Note that each individual measure (for example dH,ED) will be normalized by the maximum value of that measure among the participants. Each measure therefore lies between 0 (best score, obtained if the result perfectly fits the reference mesh) and 1 (worst case among the participants).

The ranking will be performed on the global error measure obtained by each participant.
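
The normalization step can be sketched as follows. Only the division by the per-measure maximum comes from the description above. The aggregation used here to rank participants (an unweighted mean of the normalized measures) is a placeholder assumption and is not the challenge's global error formula. The sketch also assumes non-negative measures such as distances. Function names, participant labels and values are hypothetical.

```python
def normalize_by_max(scores):
    """Divide each measure by its maximum value across participants.

    scores maps participant -> {measure name -> non-negative value}.
    """
    measures = list(next(iter(scores.values())).keys())
    normalized = {name: {} for name in scores}
    for m in measures:
        worst = max(s[m] for s in scores.values())
        for name, s in scores.items():
            # If every participant scored 0, everyone gets the best score.
            normalized[name][m] = s[m] / worst if worst > 0 else 0.0
    return normalized


def rank_participants(normalized):
    """Rank by the unweighted mean of normalized measures (placeholder rule)."""
    def mean_score(name):
        values = normalized[name].values()
        return sum(values) / len(values)
    return sorted(normalized, key=mean_score)


# Made-up raw scores in mm for three participants.
raw = {
    "A": {"dH_ED": 4.0, "dmean_ED": 1.0},
    "B": {"dH_ED": 8.0, "dmean_ED": 2.0},
    "C": {"dH_ED": 6.0, "dmean_ED": 1.5},
}
norm = normalize_by_max(raw)
print(norm["B"])                # {'dH_ED': 1.0, 'dmean_ED': 1.0}
print(rank_participants(norm))  # ['A', 'C', 'B']
```

Because the scaling depends on the worst participant, a normalized score is relative to the current field and changes whenever a new submission sets a new maximum.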

MIDAS mesh distance visualization

Each participant will be able to visualize the distance between their uploaded result mesh and the corresponding reference mesh using WebGL. A typical visualization, as provided through the dedicated MIDAS website, is presented below (the colormap directly encodes the distance between the two meshes).