Missing Data Imputation by Reducing Mutual Information with Rectified Flows

1University of Cambridge 2University of Oxford 3University College London 4University of Bristol

Abstract

This paper introduces a novel iterative method for missing data imputation that sequentially reduces the mutual information between data and the corresponding missingness mask. Inspired by GAN-based approaches that train generators to decrease the predictability of missingness patterns, our method explicitly targets this reduction in mutual information. Specifically, our algorithm iteratively minimizes the KL divergence between the joint distribution of the imputed data and missingness mask, and the product of their marginals from the previous iteration. We show that the optimal imputation under this framework can be achieved by solving an ODE whose velocity field minimizes a rectified flow training objective. We further illustrate that some existing imputation techniques can be interpreted as approximate special cases of our mutual-information-reducing framework. Comprehensive experiments on synthetic and real-world datasets validate the efficacy of our proposed approach, demonstrating its superior imputation performance.

The MIRI Framework

Process Overview

The process begins with a noisy initialization of the missing entries. Each iteration then reduces the mutual information between the imputed data and the missingness mask.

To achieve this, we train a rectified flow, defined by an ODE, that transports the current estimates toward a distribution under which the missingness pattern is unpredictable.

This cycle continues until the imputed values are statistically indistinguishable from the observed data.
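This objective has a precise information-theoretic reading: mutual information equals the KL divergence between the joint law and the product of its marginals,

$$I(\mathbf{X}; \mathbf{M}) = \mathrm{D}_{KL}\left[\, \mathbb{P}_{\mathbf{X}, \mathbf{M}} \,\Vert\, \mathbb{P}_{\mathbf{X}} \otimes \mathbb{P}_{\mathbf{M}} \,\right],$$

which is zero exactly when $\mathbf{X}$ and $\mathbf{M}$ are independent. As stated in the abstract, MIRI replaces $\mathbb{P}_{\mathbf{X}}$ with the marginal of the previous iterate, giving the tractable objective used in the loop below.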

Input: Observed Data & Mask

The raw dataset with missing values, $\tilde{\mathbf{X}}$, together with its missingness mask $\mathbf{M}$.

Initialization: Random Noise Fill

Missing values are filled with random Gaussian noise to form the starting point for imputation: $\mathbf{X}^{(0)} \sim \mathcal{N}(0, I)$.
Iterative Updates (for $t = 1, \ldots, T$)

Objective: Minimize Mutual Information

At each iteration, the imputation function $\mathbf{g}$ is chosen to minimize the KL divergence between the joint distribution of the imputed data and the mask, and the product of their marginals from the previous iteration:

$$\min_{\mathbf{g}} \mathrm{D}_{KL} [\mathbb{P}_{\mathbf{X}(\mathbf{g}), \mathbf{M}} \Vert \mathbb{P}_{\mathbf{X}^{(t-1)}} \otimes \mathbb{P}_{\mathbf{M}}]$$
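The product of marginals can be sampled empirically without any density estimation: within a batch, independently permuting the masks breaks the dependence between data and mask. A minimal NumPy sketch of this batch construction (the variable names and the within-batch permutation are our illustration, not necessarily the paper's exact pipeline):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy batch: current imputations X_prev (n rows, d features) and
# binary masks M, where M[i, j] = 1 marks an observed entry.
n, d = 256, 8
X_prev = rng.standard_normal((n, d))
M = (rng.random((n, d)) < 0.6).astype(float)

# Samples from the joint P_{X^(t-1), M}: the rows as they come, paired.
joint_pairs = (X_prev, M)

# Samples from the product P_{X^(t-1)} (x) P_M: pair each row of X_prev
# with a mask drawn independently of it, here by permuting the masks
# across the batch.
product_pairs = (X_prev, M[rng.permutation(n)])
```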
Training: Rectified Flow Transport

The minimizer is obtained by training a rectified flow: the learned velocity field $\mathbf{v}^*$ defines an ODE that transports the current estimates toward the product of marginals,

$$\frac{d\mathbf{Z}_\tau}{d\tau} = \mathbf{v}^*(\mathbf{Z}_\tau, \tau), \qquad \tau \in [0, 1], \quad \mathbf{Z}_0 = \mathbf{X}^{(t-1)}.$$
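The velocity field is trained with the standard rectified-flow regression: draw a coupling $(\mathbf{Z}_0, \mathbf{Z}_1)$ (here, joint versus product-of-marginals samples), interpolate linearly, and regress onto the straight-line displacement. A minimal PyTorch sketch; the MLP architecture and hyperparameters are our placeholders, not the paper's:

```python
import torch
import torch.nn as nn

class VelocityField(nn.Module):
    """Small MLP v(z, tau); the architecture here is a placeholder."""
    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, z: torch.Tensor, tau: torch.Tensor) -> torch.Tensor:
        # Condition on the flow time by concatenating tau to the state.
        return self.net(torch.cat([z, tau], dim=-1))

def rectified_flow_loss(v: nn.Module, z0: torch.Tensor, z1: torch.Tensor) -> torch.Tensor:
    """E || v(Z_tau, tau) - (Z_1 - Z_0) ||^2 with Z_tau = (1 - tau) Z_0 + tau Z_1."""
    tau = torch.rand(z0.shape[0], 1)
    z_tau = (1.0 - tau) * z0 + tau * z1
    return ((v(z_tau, tau) - (z1 - z0)) ** 2).mean()
```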
Update: Update Imputation

Observed entries are kept fixed, while missing entries are refreshed by integrating the velocity field $\mathbf{v}_t$ learned at iteration $t$ (the $\mathbf{v}^*$ above):

$$\mathbf{X}^{(t)} = \underbrace{\mathbf{M} \odot \tilde{\mathbf{X}}}_{\text{keep observed}} + \underbrace{(1-\mathbf{M}) \odot \left( \mathbf{X}^{(t-1)} + \int_{0}^{1} \mathbf{v}_t(\mathbf{Z}_\tau, \tau) \, \mathrm{d}\tau \right)}_{\text{update missing via flow}}$$

Repeat while $t < T$.
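Numerically, the integral can be approximated with a forward-Euler solver. The sketch below performs one full update, assuming the VelocityField interface above; the step count and the omission of the mask as a network input are our simplifications:

```python
import torch

@torch.no_grad()
def imputation_update(v, x_prev, x_tilde, mask, n_steps: int = 100):
    """One update X^(t-1) -> X^(t) (schematic): integrate dZ/dtau = v(Z, tau)
    with forward Euler from Z_0 = X^(t-1), then recombine with the mask."""
    z = x_prev.clone()
    dt = 1.0 / n_steps
    for k in range(n_steps):
        tau = torch.full((z.shape[0], 1), k * dt)
        z = z + dt * v(z, tau)  # Euler step along the learned ODE
    # M ⊙ X~ keeps observed entries; (1 - M) ⊙ Z_1 refreshes missing ones.
    return mask * x_tilde + (1.0 - mask) * z
```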
Result: Final Imputed Data

After $T$ iterations, the final imputation $\mathbf{X}^{(T)}$ is returned, with the imputed data approximately independent of the mask: $\mathbf{X}^{(T)} \perp \mathbf{M}$.

Experimental Results

Competitive Performance on Tabular and Image Benchmarks

Aggregated MMD Rankings for UCI (40% Missingness) Across 10 Datasets

Rankings are based on the Maximum Mean Discrepancy (MMD); lower is better.

Method        MCAR   MAR    MNAR
TabCSDI       6.4    6.4    6.1
GAIN          6.2    6.2    5.9
TDM           4.5    4.4    4.2
KnewImp       4.1    4.2    3.8
MIWAE         3.0    2.9    2.8
HyperImpute   1.6    1.5    1.6
MIRI (ours)   1.4    2.0    1.3
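For reference, the evaluation metric is the squared MMD between two samples. A minimal NumPy sketch of the standard (biased, V-statistic) estimator with an RBF kernel; the paper's kernel and bandwidth choices are not given here, so the defaults below are assumptions:

```python
import numpy as np

def mmd2_rbf(x: np.ndarray, y: np.ndarray, sigma: float = 1.0) -> float:
    """Biased estimate of squared MMD between samples x (n, d) and y (m, d),
    using the RBF kernel k(a, b) = exp(-||a - b||^2 / (2 * sigma**2))."""
    def gram(a, b):
        sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-sq_dists / (2.0 * sigma ** 2))
    return gram(x, x).mean() + gram(y, y).mean() - 2.0 * gram(x, y).mean()
```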

Citation

@inproceedings{yu2025missing,
  title={Missing Data Imputation by Reducing Mutual Information with Rectified Flows},
  author={Yu, Jiahao and Ying, Qizhen and Wang, Leyang and Jiang, Ziyue and Liu, Song},
  booktitle={Proceedings of the 39th Conference on Neural Information Processing Systems (NeurIPS)},
  year={2025}
}