Missing Data Imputation by Reducing Mutual Information with Rectified Flows

1University of Cambridge 2University of Oxford 3University College London 4University of Bristol

Abstract

This paper introduces a novel iterative method for missing data imputation that sequentially reduces the mutual information between data and the corresponding missingness mask. Inspired by GAN-based approaches that train generators to decrease the predictability of missingness patterns, our method explicitly targets this reduction in mutual information. Specifically, our algorithm iteratively minimizes the KL divergence between the joint distribution of the imputed data and missingness mask, and the product of their marginals from the previous iteration. We show that the optimal imputation under this framework can be achieved by solving an ODE whose velocity field minimizes a rectified flow training objective. We further illustrate that some existing imputation techniques can be interpreted as approximate special cases of our mutual-information-reducing framework. Comprehensive experiments on synthetic and real-world datasets validate the efficacy of our proposed approach, demonstrating its superior imputation performance.

The MIRI Framework

Process Overview

The process begins with a noisy initialization of the missing entries. Each iteration then reduces the mutual information between the imputed data and the missingness mask.

To achieve this, we train a rectified flow, defined by an ODE, that transports the current estimates toward a distribution under which the missingness pattern is unpredictable.

This cycle continues until the imputed values are statistically indistinguishable from the observed data.
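This objective has a precise information-theoretic reading: mutual information equals the KL divergence between the joint law and the product of its marginals,

$$I(\mathbf{X}; \mathbf{M}) = \mathrm{D}_{KL}\left[\, \mathbb{P}_{\mathbf{X}, \mathbf{M}} \,\Vert\, \mathbb{P}_{\mathbf{X}} \otimes \mathbb{P}_{\mathbf{M}} \,\right],$$

which is zero exactly when $\mathbf{X}$ and $\mathbf{M}$ are independent. As stated in the abstract, MIRI replaces $\mathbb{P}_{\mathbf{X}}$ with the marginal of the previous iterate, giving the tractable objective used in the loop below.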

Input: Observed Data & Mask

The raw dataset with missing values, $\tilde{\mathbf{X}}$, together with its missingness mask $\mathbf{M}$.

Initialization: Random Noise Fill

Missing values are filled with random Gaussian noise to form the starting point for imputation: $\mathbf{X}^{(0)} \sim \mathcal{N}(0, I)$.
Iterative Updates (for $t = 1, \ldots, T$)

Objective: Minimize Mutual Information

At each iteration, the imputation function $\mathbf{g}$ is chosen to minimize the KL divergence between the joint distribution of the imputed data and the mask, and the product of their marginals from the previous iteration:

$$\min_{\mathbf{g}} \mathrm{D}_{KL} [\mathbb{P}_{\mathbf{X}(\mathbf{g}), \mathbf{M}} \Vert \mathbb{P}_{\mathbf{X}^{(t-1)}} \otimes \mathbb{P}_{\mathbf{M}}]$$
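The product of marginals can be sampled empirically without any density estimation: within a batch, independently permuting the masks breaks the dependence between data and mask. A minimal NumPy sketch of this batch construction (the variable names and the within-batch permutation are our illustration, not necessarily the paper's exact pipeline):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy batch: current imputations X_prev (n rows, d features) and
# binary masks M, where M[i, j] = 1 marks an observed entry.
n, d = 256, 8
X_prev = rng.standard_normal((n, d))
M = (rng.random((n, d)) < 0.6).astype(float)

# Samples from the joint P_{X^(t-1), M}: the rows as they come, paired.
joint_pairs = (X_prev, M)

# Samples from the product P_{X^(t-1)} (x) P_M: pair each row of X_prev
# with a mask drawn independently of it, here by permuting the masks
# across the batch.
product_pairs = (X_prev, M[rng.permutation(n)])
```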
Training: Rectified Flow Transport

The minimizer is obtained by training a rectified flow: the learned velocity field $\mathbf{v}^*$ defines an ODE that transports the current estimates toward the product of marginals,

$$\frac{d\mathbf{Z}_\tau}{d\tau} = \mathbf{v}^*(\mathbf{Z}_\tau, \tau), \qquad \tau \in [0, 1], \quad \mathbf{Z}_0 = \mathbf{X}^{(t-1)}.$$
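The velocity field is trained with the standard rectified-flow regression: draw a coupling $(\mathbf{Z}_0, \mathbf{Z}_1)$ (here, joint versus product-of-marginals samples), interpolate linearly, and regress onto the straight-line displacement. A minimal PyTorch sketch; the MLP architecture and hyperparameters are our placeholders, not the paper's:

```python
import torch
import torch.nn as nn

class VelocityField(nn.Module):
    """Small MLP v(z, tau); the architecture here is a placeholder."""
    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, z: torch.Tensor, tau: torch.Tensor) -> torch.Tensor:
        # Condition on the flow time by concatenating tau to the state.
        return self.net(torch.cat([z, tau], dim=-1))

def rectified_flow_loss(v: nn.Module, z0: torch.Tensor, z1: torch.Tensor) -> torch.Tensor:
    """E || v(Z_tau, tau) - (Z_1 - Z_0) ||^2 with Z_tau = (1 - tau) Z_0 + tau Z_1."""
    tau = torch.rand(z0.shape[0], 1)
    z_tau = (1.0 - tau) * z0 + tau * z1
    return ((v(z_tau, tau) - (z1 - z0)) ** 2).mean()
```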
Update: Update Imputation

Observed entries are kept fixed, while missing entries are refreshed by integrating the velocity field $\mathbf{v}_t$ learned at iteration $t$ (the $\mathbf{v}^*$ above):

$$\mathbf{X}^{(t)} = \underbrace{\mathbf{M} \odot \tilde{\mathbf{X}}}_{\text{keep observed}} + \underbrace{(1-\mathbf{M}) \odot \left( \mathbf{X}^{(t-1)} + \int_{0}^{1} \mathbf{v}_t(\mathbf{Z}_\tau, \tau) \, \mathrm{d}\tau \right)}_{\text{update missing via flow}}$$

Repeat while $t < T$.
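Numerically, the integral can be approximated with a forward-Euler solver. The sketch below performs one full update, assuming the VelocityField interface above; the step count and the omission of the mask as a network input are our simplifications:

```python
import torch

@torch.no_grad()
def imputation_update(v, x_prev, x_tilde, mask, n_steps: int = 100):
    """One update X^(t-1) -> X^(t) (schematic): integrate dZ/dtau = v(Z, tau)
    with forward Euler from Z_0 = X^(t-1), then recombine with the mask."""
    z = x_prev.clone()
    dt = 1.0 / n_steps
    for k in range(n_steps):
        tau = torch.full((z.shape[0], 1), k * dt)
        z = z + dt * v(z, tau)  # Euler step along the learned ODE
    # M ⊙ X~ keeps observed entries; (1 - M) ⊙ Z_1 refreshes missing ones.
    return mask * x_tilde + (1.0 - mask) * z
```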
Result: Final Imputed Data

After $T$ iterations, the final imputation $\mathbf{X}^{(T)}$ is returned, with the imputed data approximately independent of the mask: $\mathbf{X}^{(T)} \perp \mathbf{M}$.

Experimental Results

Competitive Performance on Tabular and Image Benchmarks

Aggregated MMD Rankings for UCI (40% Missingness) Across 10 Datasets

Rankings are based on the Maximum Mean Discrepancy (MMD); lower is better.

Method        MCAR   MAR    MNAR
TabCSDI       6.4    6.4    6.1
GAIN          6.2    6.2    5.9
TDM           4.5    4.4    4.2
KnewImp       4.1    4.2    3.8
MIWAE         3.0    2.9    2.8
HyperImpute   1.6    1.5    1.6
MIRI (ours)   1.4    2.0    1.3
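For reference, the evaluation metric is the squared MMD between two samples. A minimal NumPy sketch of the standard (biased, V-statistic) estimator with an RBF kernel; the paper's kernel and bandwidth choices are not given here, so the defaults below are assumptions:

```python
import numpy as np

def mmd2_rbf(x: np.ndarray, y: np.ndarray, sigma: float = 1.0) -> float:
    """Biased estimate of squared MMD between samples x (n, d) and y (m, d),
    using the RBF kernel k(a, b) = exp(-||a - b||^2 / (2 * sigma**2))."""
    def gram(a, b):
        sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-sq_dists / (2.0 * sigma ** 2))
    return gram(x, x).mean() + gram(y, y).mean() - 2.0 * gram(x, y).mean()
```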

Citation

@inproceedings{yu2025missing,
  title={Missing Data Imputation by Reducing Mutual Information with Rectified Flows},
  author={Yu, Jiahao and Ying, Qizhen and Wang, Leyang and Jiang, Ziyue and Liu, Song},
  booktitle={Proceedings of the 39th Conference on Neural Information Processing Systems (NeurIPS)},
  year={2025}
}