🧪 Evaluation Metrics - AI4Life Calcium Imaging Denoising Challenge 2025


Evaluation Metrics

In Calcium Imaging Denoising, models are required to accurately remove noise from the image sequences, while preserving the integrity of the underlying signal.

In particular, since this kind of data includes both spatial and temporal information, a good denoising model must preserve both the spatial and the temporal structure of the signal.

For this reason, we designed the evaluation method of this challenge to take both aspects into account: each metric has a temporal (t) and a spatial (s) variant, which are then merged into a spatio-temporal (st) metric.

Spatio-Temporal SNR

Given an image stack (video) of size T x H x W, where T is the number of frames, and H, W are the spatial dimensions of each frame:

The spatial Signal-to-Noise Ratio (sSNR) is defined as the widely used Signal-to-Noise Ratio (SNR) metric, computed for each frame:
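A possible written form, assuming the standard power-ratio definition of SNR in decibels and the per-frame averaging listed in the metrics overview table. Here y_t and x_t denote frame t of each stack and k runs over the H x W pixels of that frame; this notation is an assumption, not taken from the evaluation code:

```latex
\mathrm{SNR}(y_t, x_t) = 10 \log_{10} \frac{\sum_{k} y_{t,k}^{2}}{\sum_{k} \left(y_{t,k} - x_{t,k}\right)^{2}},
\qquad
\mathrm{sSNR}(y, x) = \frac{1}{T} \sum_{t=1}^{T} \mathrm{SNR}(y_t, x_t)
```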

Where y is the ground-truth (i.e., clean) image and x is the output of the denoising algorithm.

The temporal Signal-to-Noise Ratio (tSNR) is defined as the SNR metric computed for each temporally resolved signal, i.e., the time series at each spatial location (i, j):
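Under the same assumptions (SNR taken as the ratio of signal power to error power in decibels, and averaging over the spatial grid as listed in the metrics overview table), the temporal variant could be written as follows, where y_{:,i,j} is the length-T time series at pixel (i, j):

```latex
\mathrm{tSNR}(y, x) = \frac{1}{H W} \sum_{i=1}^{H} \sum_{j=1}^{W}
\mathrm{SNR}\left(y_{:, i, j},\; x_{:, i, j}\right)
```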

Lastly, the spatio-temporal Signal-to-Noise Ratio (stSNR) is computed as a convex combination of the spatial and the temporal SNR:
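A convex combination has the general form below. The weight symbol α is introduced here for illustration only; its value is not stated in this section:

```latex
\mathrm{stSNR}(y, x) = \alpha \, \mathrm{sSNR}(y, x) + (1 - \alpha) \, \mathrm{tSNR}(y, x),
\qquad \alpha \in [0, 1]
```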

Final Score

Our leaderboards for each task use stSNR as the final evaluation score, averaged across all files in the leaderboard dataset. Formally:
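Averaging stSNR over files can be written as below, where y^(f) and x^(f) denote the ground-truth and denoised stacks of file f (superscript notation assumed for illustration):

```latex
\mathrm{Score} = \frac{1}{F} \sum_{f=1}^{F} \mathrm{stSNR}\left(y^{(f)}, x^{(f)}\right)
```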

where F is the number of files used in each leaderboard.
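The following is a minimal NumPy sketch of the pipeline described in this section, not the official evaluation code. It assumes the standard power-ratio SNR in decibels, stacks of shape (T, H, W), and a placeholder weight `alpha=0.5`; the function names `snr_db`, `st_snr`, and `final_score` are hypothetical.

```python
import numpy as np


def snr_db(y, x, axis):
    """SNR in dB of estimate x against reference y, reduced over `axis`."""
    y = np.asarray(y, dtype=np.float64)
    x = np.asarray(x, dtype=np.float64)
    signal = np.sum(y ** 2, axis=axis)
    noise = np.sum((y - x) ** 2, axis=axis)
    eps = np.finfo(np.float64).eps  # guards against division by zero and log(0)
    return 10.0 * np.log10((signal + eps) / (noise + eps))


def st_snr(y, x, alpha=0.5):
    """Return (stSNR, sSNR, tSNR) for stacks of shape (T, H, W).

    `alpha` is a placeholder weight (assumption), not the challenge value.
    """
    s_snr = snr_db(y, x, axis=(1, 2)).mean()  # one SNR per frame, averaged over T
    t_snr = snr_db(y, x, axis=0).mean()       # one SNR per pixel, averaged over H*W
    return alpha * s_snr + (1.0 - alpha) * t_snr, s_snr, t_snr


def final_score(pairs, alpha=0.5):
    """Average stSNR over an iterable of (ground_truth, denoised) stacks."""
    return float(np.mean([st_snr(y, x, alpha)[0] for y, x in pairs]))
```

Because the spatial and temporal terms reduce over different axes, a denoiser that smooths well within frames but flickers over time would score high on sSNR and low on tSNR, which is what the combined metric is meant to expose.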

Additional Metrics

Although they do not count toward the final ranking score, participants can also inspect other metrics. These use the same spatio-temporal combination as above, but with a base metric other than SNR.

For metrics that are based on a data range, we use the difference between the 97th and 3rd percentiles of the full ground-truth stack. This reduces the influence of outliers and keeps the spatial and temporal metrics comparable.
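As an illustration, the percentile-based data range described above could be computed as in this sketch. The function name `percentile_data_range` and its default arguments are hypothetical, and the interpolation used is NumPy's default, which may differ from the evaluation code:

```python
import numpy as np


def percentile_data_range(gt_stack, low=3.0, high=97.0):
    """Difference between the `high` and `low` percentiles of the whole stack."""
    lo, hi = np.percentile(np.asarray(gt_stack, dtype=np.float64), [low, high])
    return float(hi - lo)
```

Computing the range once on the full stack, rather than per frame or per pixel, means the spatial and temporal variants are normalized by the same value.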

PSNR

PSNR is a widely used metric in image processing for quantifying the similarity between two images, measured in decibels (dB).
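The standard definition of PSNR is:

```latex
\mathrm{PSNR} = 10 \log_{10} \frac{\mathrm{MAX}_I^{2}}{\mathrm{MSE}}
```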

Where MAX_I is the maximum possible pixel value of the image (for example, 255 when pixels are represented with 8 bits per sample). As stated above, in this challenge MAX_I is instead set to the difference between the 97th and 3rd percentiles of the full ground-truth stack. MSE is the Mean Squared Error between the original and reconstructed images.

We used the scikit-image implementation in our evaluation code.
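A NumPy-only sketch of how spatial and temporal PSNR might be obtained with a shared data range. It is not the evaluation code: the helper names are hypothetical, and the per-frame and per-pixel averaging follows the metrics overview table. For a single pair of arrays, `psnr_db` computes the same quantity as `skimage.metrics.peak_signal_noise_ratio` called with an explicit `data_range`.

```python
import numpy as np


def psnr_db(y, x, data_range, axis):
    """PSNR in dB, with the mean squared error reduced over `axis`."""
    y = np.asarray(y, dtype=np.float64)
    x = np.asarray(x, dtype=np.float64)
    mse = np.mean((y - x) ** 2, axis=axis)
    # A perfect reconstruction (mse == 0) yields inf, as in most PSNR code.
    return 10.0 * np.log10(data_range ** 2 / mse)


def spatial_temporal_psnr(y, x, data_range):
    """Return (sPSNR, tPSNR) for stacks of shape (T, H, W)."""
    s_psnr = psnr_db(y, x, data_range, axis=(1, 2)).mean()  # averaged over frames
    t_psnr = psnr_db(y, x, data_range, axis=0).mean()       # averaged over pixels
    return float(s_psnr), float(t_psnr)
```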

Scale-Invariant PSNR (SI-PSNR)

This metric is described in Luo, Yi, and Nima Mesgarani. "TasNet: Time-domain audio separation network for real-time, single-channel speech separation." 2018.

The SI-PSNR metric is invariant to the scale of the signals being compared: if one signal is a scaled version of another, the SI-PSNR does not change. This addresses a limitation of traditional SNR and PSNR metrics, which are sensitive to changes in signal amplitude.

SI-PSNR is defined as:
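For reference, the cited TasNet paper defines the scale-invariant SNR (SI-SNR) as shown below. How the peak (data range) term enters the SI-PSNR used in the evaluation is not restated here, so this should be read as background on the scale-invariant part only:

```latex
s_{\mathrm{target}} = \frac{\langle \hat{s}, s \rangle \, s}{\lVert s \rVert^{2}},
\qquad
e_{\mathrm{noise}} = \hat{s} - s_{\mathrm{target}},
\qquad
\text{SI-SNR} = 10 \log_{10} \frac{\lVert s_{\mathrm{target}} \rVert^{2}}{\lVert e_{\mathrm{noise}} \rVert^{2}}
```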

Where ŝ and s are the estimated and target clean sources, respectively. Both ŝ and s are normalized to have zero mean to ensure scale-invariance.

Here, we use the scale-invariant implementation from the careamics package, modified to accept a different data-range parameter for normalization.
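To make the idea concrete, here is one possible formulation of a scale-invariant PSNR with an explicit data range. It is an illustrative sketch and not the careamics implementation: the function name `si_psnr_db` is hypothetical, and the choice to rescale the prediction by a least-squares factor (rather than projecting onto the target) is an assumption.

```python
import numpy as np


def si_psnr_db(y, x, data_range):
    """Scale-invariant PSNR sketch: remove means, fit a scale, then apply PSNR."""
    y = np.asarray(y, dtype=np.float64)
    x = np.asarray(x, dtype=np.float64)
    y0 = y - y.mean()  # zero-mean ground truth
    x0 = x - x.mean()  # zero-mean prediction
    # Least-squares scale that best maps the prediction onto the ground truth.
    scale = np.sum(y0 * x0) / np.sum(x0 * x0)
    mse = np.mean((y0 - scale * x0) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)
```

With this formulation, multiplying the prediction by a constant or adding an offset to it leaves the score unchanged, because the mean removal and the fitted scale absorb both.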

Metrics Overview

| Metric Name | Description | Averaging Domain |
| --- | --- | --- |
| `sSNR` | Spatial Signal-to-Noise Ratio | Over frames |
| `tSNR` | Temporal Signal-to-Noise Ratio | Over spatial grid |
| `stSNR` | Spatio-temporal SNR (weighted avg) | Global |
| `sPSNR` | Spatial Peak SNR | Over frames |
| `tPSNR` | Temporal Peak SNR | Over spatial grid |
| `stPSNR` | Spatio-temporal Peak SNR | Global |
| `sSI_PSNR` | Spatial Scale-Invariant PSNR | Over frames |
| `tSI_PSNR` | Temporal Scale-Invariant PSNR | Over spatial grid |
| `stSI_PSNR` | Spatio-temporal Scale-Invariant PSNR | Global |
| `<metric>_std` | Standard deviation (spatial/temporal only), per-file | Dispersion |

Note: Standard deviations are computed only on a per-file basis. To inspect them, open the results page of your algorithm and check the output JSON file.