The State of AI in Survey Cosmology

EuCAIFCon, Cagliari, June 2025


François Lanusse










slides at eiffl.github.io/talks/Cagliari2025

What are cosmologists going after with galaxy surveys?


















We have been dreaming about these surveys for 20 years!

Albrecht et al. (2006)
LSST forecast on dark energy parameters
Image credit: N. Jeffrey / DES Collaboration
Image credit: Euclid Consortium / Planck Collaboration / A. Mellinger



















Stage II: SDSS

Image credit: Peter Melchior



















Stage III: DES

Image credit: Peter Melchior



















Stage IV: Rubin Observatory LSST (HSC)

Image credit: Peter Melchior

Euclid Q1 Data Release: April 2025




















Image credit: ESA/Euclid/Euclid Consortium/NASA
Image processing by J.-C. Cuillandre, E. Bertin, G. Anselmi

the Vera C. Rubin Observatory Legacy Survey of Space and Time



  • 1000 images each night, 15 TB/night for 10 years

  • 18,000 square degrees, observed once every few days

  • Tens of billions of objects, each one observed $\sim1000$ times
Rubin First Look, Monday June 23rd!










Dark Energy Spectroscopic Instrument (DESI) is teasing us with new tensions!


DESI Collaboration (2025)











Ok, exciting, but where is the AI you promised???




What does a conventional cosmological analysis look like?

The limits of traditional cosmological inference

HSC cosmic shear power spectrum
HSC Y1 constraints on $(S_8, \Omega_m)$
(Hikage et al. 2018)
  • Measure the ellipticity $\epsilon = \epsilon_i + \gamma$ of all galaxies
    $\Longrightarrow$ Noisy tracer of the weak lensing shear $\gamma$

  • Compute summary statistics based on 2pt functions,
    e.g. the power spectrum

  • Run an MCMC to recover a posterior on model parameters, using an analytic likelihood $$ p(\theta | x ) \propto \underbrace{p(x | \theta)}_{\mathrm{likelihood}} \ \underbrace{p(\theta)}_{\mathrm{prior}}$$
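As a toy illustration of this last step: a Gaussian analytic likelihood over hypothetical band powers, explored with random-walk Metropolis in JAX. The theory model, data vector, and covariance below are made-up stand-ins, not a real lensing analysis.

    import jax
    import jax.numpy as jnp

    # Toy stand-ins (hypothetical): a 2-parameter "theory" for band
    # powers, a diagonal covariance, and synthetic data
    ells = jnp.arange(10.0)

    def cl_theory(theta):
        s8, om = theta
        return s8 * om / (1.0 + ells)

    cov = 1e-4 * jnp.eye(10)
    cl_data = cl_theory(jnp.array([0.8, 0.3]))

    def log_posterior(theta):
        # analytic Gaussian likelihood p(x | theta) + flat box prior
        r = cl_data - cl_theory(theta)
        loglike = -0.5 * r @ jnp.linalg.solve(cov, r)
        in_box = jnp.all((theta > 0.1) & (theta < 1.2))
        return jnp.where(in_box, loglike, -jnp.inf)

    @jax.jit
    def mh_step(key, theta):
        # random-walk Metropolis: the simplest stand-in for the MCMC step
        k1, k2 = jax.random.split(key)
        proposal = theta + 0.02 * jax.random.normal(k1, theta.shape)
        log_a = log_posterior(proposal) - log_posterior(theta)
        accept = jnp.log(jax.random.uniform(k2)) < log_a
        return jnp.where(accept, proposal, theta)

    key, theta = jax.random.PRNGKey(0), jnp.array([0.7, 0.25])
    chain = []
    for _ in range(5000):
        key, sub = jax.random.split(key)
        theta = mh_step(sub, theta)
        chain.append(theta)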
Main limitation: the need for an explicit likelihood
We can only compute likelihoods from theory for simple summary statistics, and only on large scales

$\Longrightarrow$ We are dismissing a significant fraction of the information!

Full-Field Simulation-Based Inference

  • Instead of trying to analytically evaluate the likelihood of sub-optimal summary statistics, let us build a forward model of the full observables.
    $\Longrightarrow$ The simulator becomes the physical model.

  • Each component of the model is now tractable, but at the cost of a large number of latent variables.


Benefits of a forward modeling approach
  • Fully exploits the information content of the data (aka "full field inference").

  • Easy to incorporate systematic effects.

  • Easy to combine multiple cosmological probes by joint simulations.
(Porqueres et al. 2021)

Just as a reminder, why is it classically hard to do simulation-based inference?

The Challenge of Simulation-Based Inference
$$ p(x|\theta) = \int p(x, z | \theta) dz = \int p(x | z, \theta) p(z | \theta) dz $$ where $z$ are the stochastic latent variables of the simulator.

$\Longrightarrow$ This marginal likelihood is intractable!
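To see why: the naive Monte Carlo estimator $$ \hat{p}(x | \theta) = \frac{1}{N} \sum_{i=1}^{N} p(x | z_i, \theta), \qquad z_i \sim p(z | \theta) $$ is unbiased but useless in practice. For high-dimensional $z$, essentially no prior sample $z_i$ lands where $p(x | z_i, \theta)$ is non-negligible, so the number of simulations needed grows exponentially with $\dim(z)$.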


How to perform inference over forward simulation models?

  • Implicit Inference: Treat the simulator as a black-box with only the ability to sample from the joint distribution $$(x, \theta) \sim p(x, \theta)$$ a.k.a.
    • Simulation-Based Inference (SBI)
    • Likelihood-free inference (LFI)
    • Approximate Bayesian Computation (ABC)

  • Explicit Inference: Treat the simulator as a probabilistic model and perform inference over the joint posterior $$p(\theta, z | x) \propto p(x | z, \theta) \ p(z | \theta) \ p(\theta) $$ a.k.a.
    • Bayesian Hierarchical Modeling (BHM)

$\Longrightarrow$ For a given simulation model, both methods should converge to the same posterior!

Implicit Inference


The land of Neural Density Estimation

We have converged to a standard SBI recipe


A two-step approach to Implicit Inference
  • Automatically learn an optimal low-dimensional summary statistic $$y = f_\varphi(x) $$
  • Use Neural Density Estimation to either:
    • build an estimate $p_\phi$ of the likelihood function $p(y \ | \ \theta)$ (Neural Likelihood Estimation)

    • build an estimate $p_\phi$ of the posterior distribution $p(\theta \ | \ y)$ (Neural Posterior Estimation)
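A minimal sketch of the density-estimation step, assuming a conditional-Gaussian network for $q_\phi(\theta | y)$ fit by maximum likelihood on simulated $(\theta, y)$ pairs. Real analyses use normalizing flows, and the toy "simulator" below is hypothetical.

    import jax
    import jax.numpy as jnp
    import optax

    def init_mlp(key, sizes):
        keys = jax.random.split(key, len(sizes) - 1)
        return [(jax.random.normal(k, (m, n)) / jnp.sqrt(m), jnp.zeros(n))
                for k, m, n in zip(keys, sizes[:-1], sizes[1:])]

    def mlp(params, x):
        for W, b in params[:-1]:
            x = jnp.tanh(x @ W + b)
        W, b = params[-1]
        return x @ W + b

    def log_q(params, theta, y):
        # conditional Gaussian q_phi(theta | y): predict mean and log-std
        mu, log_sig = jnp.split(mlp(params, y), 2, axis=-1)
        return jnp.sum(-0.5 * ((theta - mu) / jnp.exp(log_sig)) ** 2
                       - log_sig - 0.5 * jnp.log(2.0 * jnp.pi), axis=-1)

    def loss(params, thetas, ys):
        return -jnp.mean(log_q(params, thetas, ys))

    # Hypothetical training set: thetas ~ prior, ys from a toy "simulator"
    # (in a real analysis: y = f_varphi(simulate(theta)))
    k1, k2, k3 = jax.random.split(jax.random.PRNGKey(0), 3)
    thetas = jax.random.normal(k1, (5000, 2))
    ys = thetas + 0.1 * jax.random.normal(k2, (5000, 2))

    params = init_mlp(k3, [2, 64, 64, 4])
    opt = optax.adam(1e-3)
    state = opt.init(params)

    @jax.jit
    def train_step(params, state):
        grads = jax.grad(loss)(params, thetas, ys)
        updates, state = opt.update(grads, state)
        return optax.apply_updates(params, updates), state

    for _ in range(2000):
        params, state = train_step(params, state)
    # log_q(params, theta_grid, y_obs) now approximates log p(theta | y_obs)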

Information Point of View on Neural Summarisation



Learning Sufficient Statistics
  • A summary statistic $y$ is sufficient for $\theta$ if $$ I(Y; \Theta) = I(X; \Theta) \Leftrightarrow p(\theta | x ) = p(\theta | y) $$
  • Variational Mutual Information Maximization $$ \mathcal{L} \ = \ \mathbb{E}_{x, \theta} [ \log q_\phi(\theta | y=f_\varphi(x)) ] \leq I(Y; \Theta) $$ (Barber & Agakov variational lower bound)
    Jeffrey, Alsing, Lanusse (2021)
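Schematically, VMIM couples the compressor to the density estimator; reusing the `mlp` and `log_q` helpers from the sketch above (my notation, not the paper's code):

    # Train f_varphi and q_phi *jointly* by maximizing
    # E[log q_phi(theta | y = f_varphi(x))]: gradients flow through the
    # compressor, pushing y towards a sufficient statistic
    def vmim_loss(all_params, thetas, xs):
        comp_params, q_params = all_params
        ys = mlp(comp_params, xs)   # y = f_varphi(x); a CNN on real maps
        return -jnp.mean(log_q(q_params, thetas, ys))

    # grads = jax.grad(vmim_loss)((comp_params, q_params), thetas, xs)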

Another Approach: maximizing the Fisher information

Information Maximization Neural Network (IMNN) $$\mathcal{L} \ = \ - | \det \mathbf{F} | \ \mbox{with} \ \mathbf{F}_{\alpha, \beta} = tr[ \mu_{\alpha}^t C^{-1} \mu_{\beta} ] $$
Charnock, Lavaux, Wandelt (2018)
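A simplified sketch of this objective, using central finite differences over seed-matched simulations at $\theta_{\rm fid} \pm \delta\theta$ (the actual IMNN adds a covariance regularization term; `mlp` as in the earlier sketch):

    def fisher_loss(params, x_fid, x_plus, x_minus, delta_theta):
        # summaries at the fiducial cosmology: shape (n_sims, n_summ)
        s_fid = mlp(params, x_fid)
        C = jnp.cov(s_fid, rowvar=False)
        # d mean-summary / d theta_alpha from seed-matched sims;
        # x_plus, x_minus have shape (n_params, n_sims, data_dim)
        mu = (jnp.mean(mlp(params, x_plus), axis=1)
              - jnp.mean(mlp(params, x_minus), axis=1)) / (2 * delta_theta[:, None])
        F = mu @ jnp.linalg.solve(C, mu.T)  # F_ab = mu_a^T C^-1 mu_b
        return -jnp.linalg.slogdet(F)[1]    # maximize log |det F|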

People use a lot of variants in practice!


* grey rows are papers analyzing survey data

Optimal Neural Summarisation for Cosmological Implicit Inference

Lanzieri, Zeghal et al. (2024)
  • Asymptotically, VMIM yields a sufficient statistic
    • No reason not to use it in practice: it works well and is asymptotically optimal



  • Mean Squared Error (MSE) DOES NOT yield a sufficient statistic, even asymptotically
    • Same for Mean Absolute Error (MAE) and weighted versions of MSE

credit: Justine Zeghal

Our humble beginnings: Likelihood-Free Parameter Inference with DES SV...

Jeffrey, Alsing, Lanusse (2021)

Suite of N-body + raytracing simulations: $\mathcal{D}$

$w$CDM analysis of KiDS-1000 Weak Lensing (Fluri et al. 2022)

Fluri, Kacprzak, Lucchi, Schneider, Refregier, Hofmann (2022)


KiDS-1000 footprint and simulated data
  • Neural Compressor: Graph Convolutional Neural Network on the Sphere
    Trained by Fisher information maximization.

SIMBIG: Field-level SBI of Large Scale Structure (Lemos et al. 2023)



















BOSS CMASS galaxy sample: Data vs Simulations
  • 20,000 simulated galaxy samples at 2,000 cosmologies
Hahn et al. (2022)

Finally, SBI has reached the mainstream: official DES Year 3 SBI $w$CDM results

Jeffrey et al. (2024)




I'm calling it!

Implicit Inference is solved for cosmological surveys!
@EiffL - Cagliari, June 2025

Has it delivered everything we hoped for?

Example of unforeseen impact of shortcuts in simulations

Gatti, Jeffrey, Whiteway et al. (2023)

Is it ok to distribute lensing source galaxies randomly in simulations, or should they be clustered?

$\Longrightarrow$ An SBI analysis could be biased by this effect and you would never know it!

How much usable information is there beyond the power spectrum?

Chisari et al. (2018)

Ratio of power spectra in hydrodynamical vs. N-body simulations
Secco et al. (2021)

DES Y3 Cosmic Shear data vector

$\Longrightarrow$ Can we find non-Gaussian information that is not affected by baryons?

takeaways



Will we be able to exploit all of the information content of LSST, Euclid, DESI?
$\Longrightarrow$ Not right away, but it is not the fault of the inference methodology!

  • Deep Learning has redefined the limits of our statistical tools, creating additional demand on the accuracy of simulations far beyond the power spectrum.

  • Neural compression methods have the downside of being opaque: it is much harder to detect unknown systematics.

  • We will need a significant number of large volume, high resolution simulations.





If Implicit Inference is solved, can we still have fun solving Explicit Inference?
Credit: Yuuki Omori, Chihway Chang, Justine Zeghal, EiffL

https://github.com/EiffL/LPTLensingComparison

More seriously, Explicit Inference has some advantages:
  • More introspectable results to identify systematics
  • Allows for fitting parametric corrections/nuisances from data
  • Provides validation of statistical inference with a different method

Explicit Inference


Where the things are!

Simulators as Hierarchical Bayesian Models

  • If we have access to all latent variables $z$ of the simulator, then the likelihood $p(x | z, \theta)$ is explicit.

  • We need to infer the joint posterior $p(\theta, z | x)$ before marginalization to yield $p(\theta | x) = \int p(\theta, z | x) dz$.
    $\Longrightarrow$ Extremely difficult problem as $z$ is typically very high-dimensional.

  • Necessitates inference strategies with access to gradients of the likelihood: $$\frac{d \log p(x | z, \theta)}{d \theta} \quad ; \quad \frac{d \log p(x | z, \theta)}{d z} $$ For instance: Maximum A Posteriori estimation, Hamiltonian Monte Carlo, Variational Inference.

$\Longrightarrow$ The only hope for explicit cosmological inference is to have fully-differentiable cosmological simulations!
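To make this concrete: a self-contained toy sketch of gradient-based MAP over the latents with JAX and optax, where a made-up smoothing operator stands in for a differentiable N-body simulator.

    import jax
    import jax.numpy as jnp
    import optax

    def forward(z):
        # hypothetical differentiable "simulator" standing in for f(z)
        return z + 0.5 * jnp.roll(z, 1) + 0.1 * z**2

    k1, k2 = jax.random.split(jax.random.PRNGKey(0))
    z_true = jax.random.normal(k1, (100,))
    sigma = 0.1
    x_obs = forward(z_true) + sigma * jax.random.normal(k2, (100,))

    def neg_log_post(z):
        # -log p(x | z) - log p(z): Gaussian noise model + Gaussian prior
        return (0.5 * jnp.sum((x_obs - forward(z)) ** 2) / sigma**2
                + 0.5 * jnp.sum(z**2))

    opt = optax.adam(1e-2)
    z = jnp.zeros(100)
    state = opt.init(z)

    @jax.jit
    def step(z, state):
        grads = jax.grad(neg_log_post)(z)   # gradient w.r.t. the latents
        updates, state = opt.update(grads, state)
        return optax.apply_updates(z, updates), state

    for _ in range(2000):
        z, state = step(z, state)
    # the same gradients drive HMC or variational inference over (theta, z)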

How complicated can it be to simulate the entire Universe?

Forward Models in Cosmology

Linear Field $\xrightarrow{\ \text{N-body simulations}\ }$ Final Dark Matter $\xrightarrow{\ \text{group-finding algorithms}\ }$ Dark Matter Halos $\xrightarrow{\ \text{semi-analytic \& distribution models}\ }$ Galaxies

the Fast Particle-Mesh scheme for N-body simulations

The idea: approximate gravitational forces by estimating densities on a grid.
  • The numerical scheme:

    • Estimate the density of particles on a mesh
      $\Longrightarrow$ compute gravitational forces by FFT

    • Interpolate forces at particle positions

    • Update particle velocities and positions, and iterate

  • Fast and simple, at the cost of approximating short-range interactions.
$\Longrightarrow$ Only a series of FFTs and interpolations.
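In code, one force evaluation of this scheme is only a few lines. A minimal JAX sketch with nearest-grid-point painting (real codes use cloud-in-cell and proper units; positions are assumed in grid units here):

    import jax.numpy as jnp

    def pm_forces(pos, n_mesh=32):
        # 1. paint the particle density on the mesh (NGP for brevity)
        idx = jnp.mod(jnp.round(pos).astype(jnp.int32), n_mesh)
        delta = jnp.zeros((n_mesh,) * 3)
        delta = delta.at[idx[:, 0], idx[:, 1], idx[:, 2]].add(1.0)
        delta = delta / delta.mean() - 1.0
        # 2. solve the Poisson equation by FFT: phi_k = -delta_k / k^2
        k = jnp.fft.fftfreq(n_mesh) * 2.0 * jnp.pi
        kx, ky, kz = jnp.meshgrid(k, k, k, indexing="ij")
        k2 = kx**2 + ky**2 + kz**2
        k2_safe = jnp.where(k2 > 0, k2, 1.0)
        phi_k = jnp.where(k2 > 0, -jnp.fft.fftn(delta) / k2_safe, 0.0)
        # 3. forces F = -grad(phi): one inverse FFT per component
        fx, fy, fz = (jnp.fft.ifftn(-1j * kk * phi_k).real
                      for kk in (kx, ky, kz))
        # 4. read the forces back at the particle positions
        return jnp.stack([f[idx[:, 0], idx[:, 1], idx[:, 2]]
                          for f in (fx, fy, fz)], axis=-1)

    # kick-drift loop: update velocities with these forces, positions with
    # the velocities, and iterate; only FFTs and interpolations, hence
    # differentiable end-to-end with jax.grad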

FlowPM: Particle-Mesh Simulations in TensorFlow

Modi, Lanusse, Seljak (2020)

    import numpy as np
    import tensorflow as tf
    import flowpm

    # Define integration steps
    stages = np.linspace(0.1, 1.0, 10, endpoint=True)

    initial_conds = flowpm.linear_field(32,       # size of the cube
                                        100,      # physical size
                                        ipklin,   # initial power spectrum
                                        batch_size=16)

    # Sample particles and displace them by LPT
    state = flowpm.lpt_init(initial_conds, a0=0.1)

    # Evolve particles down to z=0
    final_state = flowpm.nbody(state, stages, 32)

    # Retrieve final density field
    final_field = flowpm.cic_paint(tf.zeros_like(initial_conds),
                                   final_state[0])
  • Seamless interfacing with deep learning components
  • Now superseded by the JAX-based pmwd and JaxPM libraries









MAP optimization in action

$$\arg\max_z \ \log p(x_{dm} | f(z)) \ + \ \log p(z| \theta) $$
credit: C. Modi


Panels: true initial conditions $z_0$; reconstructed initial conditions $z$; reconstructed dark matter distribution $x_{dm} = f(z)$; data $x_{dm} = f(z_0)$


The need to fill the gap in the accuracy-speed space of PM simulations

CAMELS simulations

PM simulations

Hybrid physical/neural differential equations

Lanzieri, Lanusse, Starck (2022)
$$\left\{ \begin{array}{ll} \frac{d \color{#6699CC}{\mathbf{x}} }{d a} & = \frac{1}{a^3 E(a)} \color{#6699CC}{\mathbf{v}} \\ \frac{d \color{#6699CC}{\mathbf{v}}}{d a} & = \frac{1}{a^2 E(a)} F_\theta( \color{#6699CC}{\mathbf{x}} , a), \\ F_\theta( \color{#6699CC}{\mathbf{x}}, a) &= \frac{3 \Omega_m}{2} \nabla \left[ \color{#669900}{\phi_{PM}} (\color{#6699CC}{\mathbf{x}}) \right] \end{array} \right. $$
  • $\mathbf{x}$ and $\mathbf{v}$ define the position and the velocity of the particles
  • $\phi_{PM}$ is the gravitational potential in the mesh

$\to$ We can use this parametrisation to complement the physical ODE with neural networks.


$$F_\theta(\mathbf{x}, a) = \frac{3 \Omega_m}{2} \nabla \left[ \phi_{PM} (\mathbf{x}) \ast \mathcal{F}^{-1} (1 + \color{#996699}{f_\theta(a,|\mathbf{k}|)}) \right] $$


Correction integrated as a Fourier-based isotropic filter $f_{\theta}$ $\to$ incorporates translation and rotation symmetries
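A sketch of what such a filter can look like; the parametric form below (a sum of Gaussians in $|k|$, modulated by $a$) is an illustrative choice of mine, not the parametrisation used in the paper:

    import jax.numpy as jnp

    def f_theta(a, k_norm, params):
        # hypothetical learnable isotropic filter: depends on |k| only,
        # so translation and rotation symmetries are built in
        amps, centers, widths = params
        return a * jnp.sum(amps * jnp.exp(
            -0.5 * ((k_norm[..., None] - centers) / widths) ** 2), axis=-1)

    def corrected_potential(phi, a, params, box_size=25.0):
        n = phi.shape[0]
        k = 2.0 * jnp.pi * jnp.fft.fftfreq(n, d=box_size / n)
        kx, ky, kz = jnp.meshgrid(k, k, k, indexing="ij")
        k_norm = jnp.sqrt(kx**2 + ky**2 + kz**2)
        # modulate the PM potential by (1 + f_theta) in Fourier space
        filt = 1.0 + f_theta(a, k_norm, params)
        return jnp.fft.ifftn(jnp.fft.fftn(phi) * filt).real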

Projections of final density field



CAMELS simulations
PM simulations
PM+NN correction

Results


  • Neural network trained using a single CAMELS simulation with a $(25\ h^{-1}\,\mathrm{Mpc})^3$ volume and $64^3$ dark matter particles, at the fiducial cosmology $\Omega_m = 0.3$


  • Hybrid N-body simulations with Field-Level Emulator

    Jamieson et al. (2023)
    Doeser et al. (2024)
    Bartlett et al. (2024)

The need for distributed differentiable programming frameworks

  • The state vector of a moderate-size cosmological simulation volume can easily require from 100 GB to several TB.
    $\Longrightarrow$ We need model parallelism! This is not currently fully supported by any mainstream autodiff framework!

(Gholami et al. 2018)
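For scale, a back-of-the-envelope count (my numbers, not from the talk): a $2048^3$-particle state with single-precision positions and velocities already occupies $$ 2048^3 \times 6 \times 4 \ \mathrm{bytes} \approx 206 \ \mathrm{GB}, $$ before counting FFT work buffers, i.e. far beyond the memory of any single GPU.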

JAX-powered differentiable HPC




  • JAX v0.4 has made a strong push to bring automated parallelization and support for multi-host GPU clusters!

  • Scientific HPC still most likely requires dedicated high-performance ops

  • jaxDecomp: Domain Decomposition and Parallel FFTs

    • JAX bindings to the high-performance cuDecomp (Romero et al. 2022) adaptive domain decomposition library.

    • Provides parallel FFTs and halo-exchange operations.

    • Supports a variety of backends: CUDA-aware MPI, NVIDIA NCCL, NVIDIA NVSHMEM.
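To illustrate the halo-exchange idea in pure JAX (a conceptual sketch of mine, not the jaxDecomp API): each GPU owns a slab of the volume and trades boundary planes with its neighbours via `ppermute`.

    import numpy as np
    import jax
    import jax.numpy as jnp
    from jax.sharding import Mesh, PartitionSpec as P
    from jax.experimental.shard_map import shard_map

    ndev = jax.device_count()
    mesh = Mesh(np.array(jax.devices()), axis_names=("z",))

    def exchange_halos(slab):
        # each rank sends its last plane "up" and its first plane "down"
        # (periodic), then pads its slab with the received planes
        up = [(i, (i + 1) % ndev) for i in range(ndev)]
        down = [(i, (i - 1) % ndev) for i in range(ndev)]
        from_prev = jax.lax.ppermute(slab[-1:], "z", up)
        from_next = jax.lax.ppermute(slab[:1], "z", down)
        return jnp.concatenate([from_prev, slab, from_next], axis=0)

    field = jnp.zeros((ndev * 16, 16, 16))  # global volume, slab-decomposed
    padded = shard_map(exchange_halos, mesh=mesh,
                       in_specs=P("z"), out_specs=P("z"))(field)
    # CiC painting on the padded slab is now correct across slab
    # boundaries; after painting, halo regions are summed back by the
    # reverse exchange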

Building PM components from these distributed operations

Kabalan, Lanusse, Boucaud (in prep.)

Distributed 3D FFT for force computation

Halo exchange for CiC painting and reading

$2048^3$ LPT field, 1.02 s on 32 H100 GPUs

Performance Benchmark


Strong scaling plots of 3D FFT

Official performance benchmark from NVIDIA with cuFFTMp

Timing of 1LPT computation

Final piece of the puzzle: Efficient Sampling Algorithms

Simon-Onfroy, Lanusse, de Mattia (2025)

ESS for different sampling algorithms

Microcanonical sampler

Conclusion

The next decade of cosmological inference

  • Stage IV surveys are here and already producing surprising results

  • SBI has become mainstream: the tools are mature and reliable

  • Differentiable computing provides new opportunities for exploration

  • The real challenge is not inference methods but simulation model accuracy

  • The future lies in learnable, adaptive simulation models that can discover new physics