### Layer-wise relevance propagation

Layer-wise relevance propagation allows to obtain a pixel-wise decomposition of a~model decision and is applicable for most state-of-the-art architectures for image classification and neural reinforcement learning1. LRP uses a notion of relevance $R$ which is equal to the model output for the output neurons and can be propagated to lower layers.

#### LRP for image classification

LRP can be employed to explain classification decisions for a given class $i$ by relevance propagation from the corresponding model output $y^i$ according to:

where $l$ refers to the layer index, $L^l$ to all neurons of layer $l$, and $R_d^{(l)}$ to a relevance of neuron $d$ in layer $l$. Typically, the relevances are propagated to the input layer ($l=1$) yielding a relevance map $M^i$, which enables visualization of input regions influencing the model decision.

#### LRP for semantic segmentation

In order to apply LRP to semantic segmentation models, we cast the segmentation problem as a voxel-wise classification. This means that in order to explain a decision of a segmentation model for a given output region $A$, we propose to compute the input relevance maps according to Eq.1 for each considered output location $a \in A$. Then the relevance map $M^i$ explaining the model decision for class $i$ in the region $A$ can be calculated as:

We normalize $M_a^i$ by its sum to ensure that each output location $a$ equally contributes to the final relevance map $M^i$.

### Relevance analysis of liver tumor segmentation CNN

We train a 3D u-net2 with a 6-channel input and 2-channel output to segment liver tumors in : T2, non contrast enhanced T1 (plain-T1), and four dynamic contrast enhanced (DCE) T1 images acquired 20s (T1-20s), 60s (T1-60s), 120s (T1-120s), and 15min (T1-15min) after contrast agent administration (Gd-EOB-DTPA). All sequences were motion corrected using the T1-15min image as reference3.

#### MRI sequence relevance

Normalized relevance distribution across input MRI sequences for 20 test patients denoted by different colors.

#### Pixel-level explanations

Foreground relevance maps computed for a true positive.

Foreground relevance maps computed for a false negative.

#### Discussion

• CNN used information from all MRI sequences.
• T1-15min sequence, which was used to create training labels was not the most important one.
• Similar relevance attribution for plain T1, T1-20s, T1-60s, and T1-120s.
• Pixel-level explanations are hard to interpret.

#### References

1. S. Bach et al., On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLOS ONE, 2015.

2. Ö. Çiçek et al. 3d u-net: learning dense volumetric segmentation from sparse annotation. MICCAI, 2016.

3. J. Strehlow et al. Landmark-based evaluation of a deformable motion correction for dce-mri of the liver. IJCARS, 2018.