publications
Automatically generated publication list from NASA's ADS service, powered by jekyll-scholar.
2022
Galaxies on graph neural networks: towards robust synthetic galaxy catalogs with deep generative models
Yesukhei Jagvaral, Francois Lanusse, Sukhdeep Singh, and 3 more authors
arXiv e-prints, Dec 2022
Future astronomical imaging surveys are set to provide precise constraints on cosmological parameters, such as dark energy. However, the production of synthetic data for these surveys, to test and validate analysis methods, suffers from a very high computational cost. In particular, generating mock galaxy catalogs at sufficiently large volume and high resolution will soon become computationally unreachable. In this paper, we address this problem with a Deep Generative Model to create robust mock galaxy catalogs that may be used to test and develop the analysis pipelines of future weak lensing surveys. We build our model on custom-built Graph Convolutional Networks, placing each galaxy on a graph node and then connecting the nodes within each gravitationally bound system. We train our model on a cosmological simulation with realistic galaxy populations to capture the 2D and 3D orientations of galaxies. The samples from the model exhibit statistical properties comparable to those in the simulations. To the best of our knowledge, this is the first instance of a generative model on graphs in an astrophysical/cosmological context.
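The graph construction described in the abstract can be illustrated in a few lines. This is a toy sketch, not the paper's pipeline: `halo_graph` and the fixed linking-length threshold are hypothetical stand-ins for the actual halo-membership criterion.

```python
import numpy as np

def halo_graph(positions, linking_length):
    """Connect galaxies of the same bound system into a graph.

    Each galaxy is a node; an edge (in both directions) joins every pair
    of galaxies closer than `linking_length`. Returns a (2, n_edges)
    array of source/destination indices, the format used by common
    graph-neural-network libraries.
    """
    sep = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    src, dst = np.nonzero((sep < linking_length) & (sep > 0))
    return np.stack([src, dst])

# three galaxies: two close together, one far away
pos = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [5.0, 0.0, 0.0]])
edges = halo_graph(pos, linking_length=1.0)  # only nodes 0 and 1 are linked
```

A graph convolution then propagates information along these edges, so each galaxy's predicted orientation can depend on its neighbours in the same bound system.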
Modeling halo and central galaxy orientations on the SO(3) manifold with score-based generative models
Yesukhei Jagvaral, Rachel Mandelbaum, and Francois Lanusse
arXiv e-prints, Dec 2022
Upcoming cosmological weak lensing surveys are expected to constrain cosmological parameters with unprecedented precision. In preparation for these surveys, large simulations with realistic galaxy populations are required to test and validate analysis pipelines. However, these simulations are computationally very costly, and at the volumes and resolutions demanded by upcoming cosmological surveys, they are computationally infeasible. Here, we propose a Deep Generative Modeling approach to address the specific problem of emulating realistic 3D galaxy orientations in synthetic catalogs. For this purpose, we develop a novel Score-Based Diffusion Model specifically for the SO(3) manifold. The model accurately learns and reproduces correlated orientations of galaxies and dark matter halos that are statistically consistent with those of a reference high-resolution hydrodynamical simulation.
The N5K Challenge: Non-Limber Integration for LSST Cosmology
C. D. Leonard, T. Ferreira, X. Fang, and 9 more authors
arXiv e-prints, Dec 2022
The rapidly increasing statistical power of cosmological imaging surveys requires us to reassess the regime of validity for various approximations that accelerate the calculation of relevant theoretical predictions. In this paper, we present the results of the 'N5K non-Limber integration challenge', the goal of which was to quantify the performance of different approaches to calculating the angular power spectrum of galaxy number counts and cosmic shear data without invoking the so-called 'Limber approximation', in the context of the Rubin Observatory Legacy Survey of Space and Time (LSST). We quantify the performance, in terms of accuracy and speed, of three non-Limber implementations: FKEM (CosmoLike), Levin, and matter, themselves based on different integration schemes and approximations. We find that in the challenge's fiducial 3x2pt LSST Year 10 scenario, FKEM (CosmoLike) produces the fastest run time within the required accuracy by a considerable margin, positioning it favourably for use in Bayesian parameter inference. This method, however, requires further development and testing to extend its use to certain analysis scenarios, particularly those involving a scale-dependent growth rate. For this and other reasons discussed herein, alternative approaches such as matter and Levin may be necessary for a full exploration of parameter space. We also find that the usual first-order Limber approximation is insufficiently accurate for LSST Year 10 3x2pt analysis on ℓ = 200-1000, whereas invoking the second-order Limber approximation on these scales (with a full non-Limber method at smaller ℓ) does suffice.
pmwd: A Differentiable Cosmological Particle-Mesh N-body Library
Yin Li, Libin Lu, Chirag Modi, and 7 more authors
arXiv e-prints, Nov 2022
The formation of the large-scale structure, the evolution and distribution of galaxies, quasars, and dark matter on cosmological scales, requires numerical simulations. Differentiable simulations provide gradients of the cosmological parameters, which can accelerate the extraction of physical information from statistical analyses of observational data. The deep learning revolution has brought not only myriad powerful neural networks, but also breakthroughs including automatic differentiation (AD) tools and computational accelerators like GPUs, facilitating forward modeling of the Universe with differentiable simulations. Because AD needs to save the whole forward evolution history to backpropagate gradients, current differentiable cosmological simulations are limited by memory. Using the adjoint method, with reverse time integration to reconstruct the evolution history, we develop a differentiable cosmological particle-mesh (PM) simulation library, pmwd (particle-mesh with derivatives), with a low memory cost. Based on the powerful AD library JAX, pmwd is fully differentiable, and is highly performant on GPUs.
Differentiable Cosmological Simulation with Adjoint Method
Yin Li, Chirag Modi, Drew Jamieson, and 5 more authors
arXiv e-prints, Nov 2022
Rapid advances in deep learning have brought not only myriad powerful neural networks, but also breakthroughs that benefit established scientific research. In particular, automatic differentiation (AD) tools and computational accelerators like GPUs have facilitated forward modeling of the Universe with differentiable simulations. Current differentiable cosmological simulations are limited by memory, and thus are subject to a trade-off between time and space/mass resolution. They typically integrate for only tens of time steps, unlike the standard non-differentiable simulations. We present a new approach free of such constraints, using the adjoint method and reverse time integration. It enables larger and more accurate forward modeling, and will improve gradient-based optimization and inference. We implement it in a particle-mesh (PM) N-body library, pmwd (particle-mesh with derivatives). Based on the powerful AD system JAX, pmwd is fully differentiable, and is highly performant on GPUs.
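The memory-saving idea, reverse time integration in place of storing the forward history, can be illustrated with a toy example. A minimal sketch, assuming a symplectic kick-drift-kick integrator and a harmonic force standing in for the PM gravitational force; the names here are illustrative, not pmwd's API.

```python
import numpy as np

def force(x):
    # toy stand-in for the PM gravitational force (a harmonic potential here)
    return -x

def leapfrog(x, v, dt, n_steps):
    # symplectic kick-drift-kick integrator, as used by PM N-body codes
    for _ in range(n_steps):
        v = v + 0.5 * dt * force(x)   # half kick
        x = x + dt * v                # drift
        v = v + 0.5 * dt * force(x)   # half kick
    return x, v

# forward run: 100 time steps
x0 = np.array([1.0, 0.5, -0.3])
v0 = np.zeros(3)
xT, vT = leapfrog(x0, v0, dt=0.01, n_steps=100)

# reverse time integration: running the same integrator with dt -> -dt
# reconstructs every earlier state, so no history needs to be stored
x_back, v_back = leapfrog(xT, vT, dt=-0.01, n_steps=100)
```

Because the integrator is time-reversible up to floating-point roundoff, the backward pass can regenerate the states needed for the adjoint gradient on the fly instead of checkpointing them all in memory.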
Differentiable Stochastic Halo Occupation Distribution
Benjamin Horowitz, ChangHoon Hahn, Francois Lanusse, and 2 more authors
arXiv e-prints, Nov 2022
In this work, we demonstrate how differentiable stochastic sampling techniques developed in the context of deep Reinforcement Learning can be used to perform efficient parameter inference over stochastic, simulation-based, forward models. As a particular example, we focus on the problem of estimating parameters of Halo Occupancy Distribution (HOD) models, which are used to connect galaxies with their dark matter halos. Using a combination of continuous relaxation and gradient parameterization techniques, we can obtain well-defined gradients with respect to HOD parameters through discrete galaxy catalog realizations. Having access to these gradients allows us to leverage efficient sampling schemes, such as Hamiltonian Monte Carlo, and greatly speed up parameter inference. We demonstrate our technique on a mock galaxy catalog generated from the Bolshoi simulation using the Zheng et al. 2007 HOD model and find near-identical posteriors to standard Markov Chain Monte Carlo techniques, with an ~8x increase in convergence efficiency. Our differentiable HOD model also has broad applications in full forward-model approaches to cosmic structure and cosmological analysis.
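The continuous-relaxation idea can be sketched with a logistic ("Gumbel-softmax"-style) reparameterization of the central-galaxy occupation draw. A minimal sketch assuming the Zheng et al. 2007 error-function form for the mean occupation; the function names and the temperature value are illustrative, not the paper's implementation.

```python
import numpy as np
from math import erf

def mean_occupation(log_mass, log_m_min=12.0, sigma=0.5):
    # smooth central-galaxy occupation probability (Zheng et al. 2007 form)
    x = (np.asarray(log_mass) - log_m_min) / sigma
    return 0.5 * (1.0 + np.vectorize(erf)(x))

def relaxed_bernoulli(p, temperature, rng):
    # logistic reparameterization: a continuous, differentiable surrogate for
    # the discrete draw "this halo hosts a central galaxy"; as the temperature
    # goes to zero, the draws approach exact Bernoulli(p) samples
    u = rng.uniform(1e-6, 1 - 1e-6, size=np.shape(p))
    logits = np.log(p) - np.log1p(-p) + np.log(u) - np.log1p(-u)
    return 1.0 / (1.0 + np.exp(-logits / temperature))

rng = np.random.default_rng(0)
log_mass = rng.uniform(11.0, 14.0, size=100_000)
p = mean_occupation(log_mass)
occupation = relaxed_bernoulli(p, temperature=0.1, rng=rng)
```

Because the randomness enters only through `u`, gradients of summary statistics of `occupation` with respect to the HOD parameters (`log_m_min`, `sigma`) are well defined, which is what makes Hamiltonian Monte Carlo applicable.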
Towards solving model bias in cosmic shear forward modeling
Benjamin Remy, Francois Lanusse, and Jean-Luc Starck
arXiv e-prints, Oct 2022
As the volume and quality of modern galaxy surveys increase, so does the difficulty of measuring the cosmological signal imprinted in galaxy shapes. Weak gravitational lensing sourced by the most massive structures in the Universe generates a slight shearing of galaxy morphologies called cosmic shear, a key probe for cosmological models. Modern techniques of shear estimation based on statistics of ellipticity measurements suffer from the fact that the ellipticity is not a well-defined quantity for arbitrary galaxy light profiles, biasing the shear estimation. We show that a hybrid physical/deep learning Hierarchical Bayesian Model, in which a generative model captures the galaxy morphology, enables us to recover an unbiased estimate of the shear on realistic galaxies, thus solving the model bias.
The DAWES review 10: The impact of deep learning for the analysis of galaxy surveys
Marc Huertas-Company, and François Lanusse
arXiv e-prints, Oct 2022
The amount and complexity of data delivered by modern galaxy surveys has been steadily increasing over the past years. Extracting coherent scientific information from these large and multi-modal data sets remains an open issue, and data-driven approaches such as deep learning have rapidly emerged as a potentially powerful solution to some long-standing challenges. This enthusiasm is reflected in an unprecedented exponential growth of publications using neural networks. Half a decade after the first published work in astronomy mentioning deep learning, we believe it is timely to review what has been the real impact of this new technology in the field and its potential to solve key challenges raised by the size and complexity of the new datasets. In this review we first aim at summarizing the main applications of deep learning for galaxy surveys that have emerged so far. We then extract the major achievements and lessons learned and highlight key open questions and limitations. Overall, state-of-the-art deep learning methods are being rapidly adopted by the astronomical community, reflecting a democratization of these methods. We show that the majority of works using deep learning to date are oriented to computer vision tasks. This is also the domain of application where deep learning has brought the most important breakthroughs so far. We report that the applications are becoming more diverse, and deep learning is used for estimating galaxy properties, identifying outliers, or constraining the cosmological model. Most of these works remain at the exploratory level. Some common challenges will most likely need to be addressed before moving to the next phase of deployment of deep learning in the processing of future surveys; e.g. uncertainty quantification, interpretability, data labeling, and domain shift issues from training with simulations, which constitutes a common practice in astronomy.
Galaxies and haloes on graph neural networks: Deep generative modelling scalar and vector quantities for intrinsic alignment
Yesukhei Jagvaral, François Lanusse, Sukhdeep Singh, and 3 more authors
Monthly Notices of the Royal Astronomical Society, Oct 2022
In order to prepare for the upcoming wide-field cosmological surveys, large simulations of the Universe with realistic galaxy populations are required. In particular, the tendency of galaxies to naturally align towards overdensities, an effect called intrinsic alignments (IA), can be a major source of systematics in the weak lensing analysis. As the details of galaxy formation and evolution relevant to IA cannot be simulated in practice on such volumes, we propose as an alternative a Deep Generative Model. This model is trained on the IllustrisTNG-100 simulation and is capable of sampling the orientations of a population of galaxies so as to recover the correct alignments. In our approach, we model the cosmic web as a set of graphs, where the graphs are constructed for each halo, and galaxy orientations as a signal on those graphs. The generative model is implemented on a Generative Adversarial Network architecture and uses specifically designed Graph Convolutional Networks sensitive to the relative 3D positions of the vertices. Given (sub)halo masses and tidal fields, the model is able to learn and predict scalar features such as galaxy and dark matter subhalo shapes, and more importantly, vector features such as the 3D orientation of the major axis of the ellipsoid and the complex 2D ellipticities. For correlations of 3D orientations, the model is in good quantitative agreement with the measured values from the simulation, except at very small and transition scales. For correlations of 2D ellipticities, the model is in good quantitative agreement with the measured values from the simulation on all scales. Additionally, the model is able to capture the dependence of IA on mass, morphological type, and central/satellite type.
From Data to Software to Science with the Rubin Observatory LSST
Katelyn Breivik, Andrew J. Connolly, K. E. Saavik Ford, and 97 more authors
arXiv e-prints, Aug 2022
The Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST) dataset will dramatically alter our understanding of the Universe, from the origins of the Solar System to the nature of dark matter and dark energy. Much of this research will depend on the existence of robust, tested, and scalable algorithms, software, and services. Identifying and developing such tools ahead of time has the potential to significantly accelerate the delivery of early science from LSST. Developing these collaboratively, and making them broadly available, can enable more inclusive and equitable collaboration on LSST science. To facilitate such opportunities, a community workshop entitled “From Data to Software to Science with the Rubin Observatory LSST” was organized by the LSST Interdisciplinary Network for Collaboration and Computing (LINCC) and partners, and held at the Flatiron Institute in New York, March 28-30, 2022. The workshop included over 50 in-person attendees invited from over 300 applications. It identified seven key software areas of need: (i) scalable cross-matching and distributed joining of catalogs, (ii) robust photometric redshift determination, (iii) software for determination of selection functions, (iv) frameworks for scalable time-series analyses, (v) services for image access and reprocessing at scale, (vi) object image access (cutouts) and analysis at scale, and (vii) scalable job execution systems. This white paper summarizes the discussions of this workshop. It considers the motivating science use cases, identified cross-cutting algorithms, software, and services, their high-level technical specifications, and the principles of inclusive collaborations needed to develop them. We provide it as a useful roadmap of needs, as well as to spur action and collaboration between groups and individuals looking to develop reusable software for early LSST science.
Bayesian uncertainty quantification for machine-learned models in physics
Yarin Gal, Petros Koumoutsakos, Francois Lanusse, and 2 more authors
Nature Reviews Physics, Aug 2022
Being able to quantify uncertainty when comparing a theoretical or computational model to observations is critical to conducting a sound scientific investigation. With the rise of data-driven modelling, understanding various sources of uncertainty and developing methods to estimate them has gained renewed attention. Five researchers discuss uncertainty quantification in machine-learned models with an emphasis on issues relevant to physics problems.
Neural Posterior Estimation with Differentiable Simulators
Justine Zeghal, François Lanusse, Alexandre Boucaud, and 2 more authors
arXiv e-prints, Jul 2022
Simulation-Based Inference (SBI) is a promising Bayesian inference framework that alleviates the need for analytic likelihoods to estimate posterior distributions. Recent advances using neural density estimators in SBI algorithms have demonstrated the ability to achieve high-fidelity posteriors, at the expense of a large number of simulations, which makes their application potentially very time-consuming when using complex physical simulations. In this work we focus on boosting the sample efficiency of posterior density estimation using the gradients of the simulator. We present a new method to perform Neural Posterior Estimation (NPE) with a differentiable simulator. We demonstrate how gradient information helps constrain the shape of the posterior and improves sample efficiency.
Hybrid Physical-Neural ODEs for Fast N-body Simulations
Denise Lanzieri, François Lanusse, and Jean-Luc Starck
arXiv e-prints, Jul 2022
We present a new scheme to compensate for the small-scale approximations resulting from Particle-Mesh (PM) schemes for cosmological N-body simulations. These simulations are fast, low-cost realizations of the large-scale structures, but lack resolution on small scales. To improve their accuracy, we introduce an additional effective force within the differential equations of the simulation, parameterized by a Fourier-space Neural Network acting on the PM-estimated gravitational potential. We compare the resulting matter power spectrum to that obtained with the Potential Gradient Descent (PGD) scheme. We notice a similar improvement in terms of power spectrum, but we find that our approach outperforms PGD for the cross-correlation coefficients, and is more robust to changes in simulation settings (different resolutions, different cosmologies).
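The structure of such a Fourier-space correction can be sketched as follows: solve the Poisson equation on the mesh, then rescale the potential with a parameterized isotropic filter standing in for the neural network. A 2D toy sketch with physical constants absorbed; the Gaussian filter 1 + a·exp(-k²/b) is an illustrative placeholder, not the paper's trained network.

```python
import numpy as np

def pm_force(density, boxsize, a=0.0, b=1.0):
    """2D PM force with an effective Fourier-space correction.

    Solves the unit-normalized Poisson equation in Fourier space, then
    rescales the potential by 1 + a*exp(-k^2/b), an illustrative isotropic
    filter standing in for the trained Fourier-space neural network
    (a=0 recovers the plain PM force).
    """
    n = density.shape[0]
    k = 2 * np.pi * np.fft.fftfreq(n, d=boxsize / n)
    kx, ky = np.meshgrid(k, k, indexing="ij")
    k2 = kx**2 + ky**2
    k2[0, 0] = 1.0                      # avoid 0/0; the zero mode exerts no force
    pot_k = -np.fft.fft2(density) / k2  # inverse Laplacian, constants absorbed
    pot_k *= 1 + a * np.exp(-k2 / b)    # effective small-scale correction
    pot_k[0, 0] = 0.0
    fx = np.real(np.fft.ifft2(-1j * kx * pot_k))  # force = -grad(potential)
    fy = np.real(np.fft.ifft2(-1j * ky * pot_k))
    return fx, fy

# sanity check on a single mode: density sin(x) gives force cos(x)
n, L = 64, 2 * np.pi
grid = np.arange(n) * (L / n)
X, Y = np.meshgrid(grid, grid, indexing="ij")
fx, fy = pm_force(np.sin(X), boxsize=L)
```

Because the correction acts multiplicatively on the potential in Fourier space, its parameters sit inside the equations of motion and can be trained end to end through a differentiable ODE solver.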
ShapeNet: Shape constraint for galaxy image deconvolution
F. Nammour, U. Akhaury, J. N. Girard, and 4 more authors
Astronomy and Astrophysics, Jul 2022
Deep learning (DL) has shown remarkable results in solving inverse problems in various domains. In particular, the Tikhonet approach is very powerful in deconvolving optical astronomical images. However, this approach only uses the ℓ2 loss, which does not guarantee the preservation of physical information (e.g., flux and shape) of the object that is reconstructed in the image. A new loss function has been proposed in the framework of sparse deconvolution that better preserves the shape of galaxies and reduces the pixel error. In this paper, we extend the Tikhonet approach to take this shape constraint into account and apply our new DL method, called ShapeNet, to a simulated optical and radio-interferometry dataset. The originality of the paper lies in i) the shape constraint we use in the neural network framework, ii) the application of DL to radio-interferometry image deconvolution for the first time, and iii) the generation of a simulated radio dataset that we make available to the community. A range of examples illustrates the results.
Rubin-Euclid Derived Data Products: Initial Recommendations
Leanne P. Guy, Jean-Charles Cuillandre, Etienne Bachelet, and 118 more authors
Zenodo id. 5836022, Jan 2022
This report is the result of a joint discussion between the Rubin and Euclid scientific communities. The work presented in this report was focused on designing and recommending an initial set of Derived Data products (DDPs) that could realize the science goals enabled by joint processing. All interested Rubin and Euclid data rights holders were invited to contribute via an online discussion forum and a series of virtual meetings. Strong interest in enhancing science with joint DDPs emerged from across a wide range of astrophysical domains: Solar System, the Galaxy, the Local Volume, from the nearby to the primaeval Universe, and cosmology.
Probabilistic Mass Mapping with Neural Score Estimation
Benjamin Remy, Francois Lanusse, Niall Jeffrey, and 4 more authors
arXiv e-prints, Jan 2022
Weak lensing mass-mapping is a useful tool for accessing the full distribution of dark matter on the sky, but because of intrinsic galaxy ellipticities and finite fields/missing data, the recovery of dark matter maps constitutes a challenging ill-posed inverse problem. We introduce a novel methodology allowing for efficient sampling of the high-dimensional Bayesian posterior of the weak lensing mass-mapping problem, relying on simulations to define a fully non-Gaussian prior. We aim to demonstrate the accuracy of the method on simulations, and then proceed to apply it to the mass reconstruction of the HST/ACS COSMOS field. The proposed methodology combines elements of Bayesian statistics, analytic theory, and a recent class of Deep Generative Models based on Neural Score Matching. This approach allows us to do the following: 1) make full use of analytic cosmological theory to constrain the 2pt statistics of the solution; 2) learn from cosmological simulations any differences between this analytic prior and full simulations; 3) obtain samples from the full Bayesian posterior of the problem for robust Uncertainty Quantification. We demonstrate the method on the κTNG simulations and find that the posterior mean significantly outperforms previous methods (Kaiser-Squires, Wiener filter, sparsity priors) both in root-mean-square error and in terms of the Pearson correlation. We further illustrate the interpretability of the recovered posterior by establishing a close correlation between posterior convergence values and the SNR of clusters artificially introduced into a field. Finally, we apply the method to the reconstruction of the HST/ACS COSMOS field, yielding the highest-quality convergence map of this field to date.
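The score-based sampling at the heart of the method can be illustrated in one dimension: given only the score (the gradient of the log-posterior), unadjusted Langevin dynamics draws samples from the posterior. A toy sketch with an analytic Gaussian score standing in for the learned neural score; the annealing schedule of the actual method is omitted.

```python
import numpy as np

def posterior_score(x, mu=2.0, sigma=0.5):
    # score (gradient of the log-density) of a Gaussian posterior N(mu, sigma^2);
    # in the paper this role is played by an analytic Gaussian-prior term plus
    # a learned neural-network residual
    return -(x - mu) / sigma**2

rng = np.random.default_rng(0)
samples = rng.normal(size=50_000)  # arbitrary initialization of many chains
step = 1e-2
for _ in range(2_000):
    # unadjusted Langevin dynamics: drift along the score, plus Gaussian noise
    noise = rng.normal(size=samples.shape)
    samples = samples + step * posterior_score(samples) + np.sqrt(2 * step) * noise
# `samples` now approximates draws from the N(2.0, 0.5^2) posterior
```

The same update rule carries over to the mass-mapping problem, with `samples` replaced by pixelized convergence maps and the score evaluated by the trained network; posterior uncertainty then comes for free from the spread of the chains.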
Validating Synthetic Galaxy Catalogs for Dark Energy Science in the LSST Era
Eve Kovacs, Yao-Yuan Mao, Michel Aguena, and 36 more authors
The Open Journal of Astrophysics, Jan 2022
Large simulation efforts are required to provide synthetic galaxy catalogs for ongoing and upcoming cosmology surveys. These extragalactic catalogs are being used for many diverse purposes covering a wide range of scientific topics. In order to be useful, they must offer realistically complex information about the galaxies they contain. Hence, it is critical to implement a rigorous validation procedure that ensures that the simulated galaxy properties faithfully capture observations and delivers an assessment of the level of realism attained by the catalog. We present here a suite of validation tests that have been developed by the Rubin Observatory Legacy Survey of Space and Time (LSST) Dark Energy Science Collaboration (DESC). We discuss how the inclusion of each test is driven by the scientific targets for static ground-based dark energy science and by the availability of suitable validation data. The validation criteria that are used to assess the performance of a catalog are flexible and depend on the science goals. We illustrate the utility of this suite by showing examples for the validation of cosmoDC2, the extragalactic catalog recently released for the LSST DESC second Data Challenge.
Euclid preparation. XIII. Forecasts for galaxy morphology with the Euclid Survey using deep generative models
Euclid Collaboration, H. Bretonnière, M. Huertas-Company, and 194 more authors
Astronomy and Astrophysics, Jan 2022
We present a machine learning framework to simulate realistic galaxies for the Euclid Survey, producing more complex and realistic galaxies than the analytical simulations currently used in Euclid. The proposed method combines the control on galaxy shape parameters offered by analytic models with realistic surface brightness distributions learned from real Hubble Space Telescope observations by deep generative models. We simulate a galaxy field of 0.4 deg^2 as it will be seen by the Euclid visible imager VIS, and we show that galaxy structural parameters are recovered to an accuracy similar to that for pure analytic Sérsic profiles. Based on these simulations, we estimate that the Euclid Wide Survey (EWS) will be able to resolve the internal morphological structure of galaxies down to a surface brightness of 22.5 mag arcsec^-2, and the Euclid Deep Survey (EDS) down to 24.9 mag arcsec^-2. This corresponds to approximately 250 million galaxies at the end of the mission and a 50% complete sample for stellar masses above 10^10.6 M_⊙ (resp. 10^9.6 M_⊙) at a redshift z ∼ 0.5 for the EWS (resp. EDS). The approach presented in this work can contribute to improving the preparation of future high-precision cosmological imaging surveys by allowing simulations to incorporate more realistic galaxies.
2021
Anomaly detection in Hyper Suprime-Cam galaxy images with generative adversarial networks
Kate Storey-Fisher, Marc Huertas-Company, Nesar Ramachandra, and 5 more authors
Monthly Notices of the Royal Astronomical Society, Dec 2021
The problem of anomaly detection in astronomical surveys is becoming increasingly important as data sets grow in size. We present the results of an unsupervised anomaly detection method using a Wasserstein generative adversarial network (WGAN) on nearly one million optical galaxy images in the Hyper Suprime-Cam (HSC) survey. The WGAN learns to generate realistic HSC-like galaxies that follow the distribution of the data set; anomalous images are defined based on a poor reconstruction by the generator and outlying features learned by the discriminator. We find that the discriminator is more attuned to potentially interesting anomalies compared to the generator, and compared to a simpler autoencoder-based anomaly detection approach, so we use the discriminator-selected images to construct a high-anomaly sample of ~13,000 objects. We propose a new approach to further characterize these anomalous images: we use a convolutional autoencoder to reduce the dimensionality of the residual differences between the real and WGAN-reconstructed images and perform UMAP clustering on these. We report detected anomalies of interest including galaxy mergers, tidal features, and extreme star-forming galaxies. A follow-up spectroscopic analysis of one of these anomalies is detailed in the Appendix; we find that it is an unusual system most likely to be a metal-poor dwarf galaxy with an extremely blue, higher-metallicity H II region. We have released a catalogue with the WGAN anomaly scores; the code and catalogue are available at https://github.com/kstoreyf/anomalies-GAN-HSC; and our interactive visualization tool for exploring the clustered data is at https://weirdgalaxi.es.
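The two anomaly criteria, poor generator reconstruction and outlying discriminator features, are commonly combined into a single score in the GAN anomaly-detection literature. A toy sketch only: the random feature map, the weighting `lam`, and the stand-in reconstructions are all illustrative assumptions, not the paper's trained WGAN.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 16))  # fixed random projection standing in for the critic

def critic_features(image):
    # placeholder for intermediate activations of the trained WGAN critic
    return np.tanh(image.ravel() @ W)

def anomaly_score(image, reconstruction, lam=0.9):
    # poor generator reconstruction + outlying critic features => high score
    gen_term = np.mean((image - reconstruction) ** 2)
    disc_term = np.mean((critic_features(image) - critic_features(reconstruction)) ** 2)
    return lam * gen_term + (1 - lam) * disc_term

normal_img = rng.normal(size=(8, 8))
reconstruction = normal_img + 0.01 * rng.normal(size=(8, 8))  # generator fits it well
anomalous_img = normal_img + 2.0 * rng.normal(size=(8, 8))    # generator cannot match it
score_normal = anomaly_score(normal_img, reconstruction)
score_anomalous = anomaly_score(anomalous_img, reconstruction)
```

Ranking a survey by such a score surfaces the objects the generative model cannot explain, which is exactly the high-anomaly sample described above.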
The LSST-DESC 3x2pt Tomography Optimization Challenge
Joe Zuntz, François Lanusse, Alex I. Malz, and 26 more authors
The Open Journal of Astrophysics, Oct 2021
This paper presents the results of the Rubin Observatory Dark Energy Science Collaboration (DESC) 3x2pt tomography challenge, which served as a first step toward optimizing the tomographic binning strategy for the main DESC analysis. The task of choosing an optimal tomographic binning scheme for a photometric survey is made particularly delicate in the context of a metacalibrated lensing catalogue, as only the photometry from the bands included in the metacalibration process (usually riz and potentially g) can be used in sample definition. The goal of the challenge was to collect and compare bin assignment strategies under various metrics of a standard 3x2pt cosmology analysis in a highly idealized setting to establish a baseline for realistically complex follow-up studies; in this preliminary study, we used two sets of cosmological simulations of galaxy redshifts and photometry under a simple noise model neglecting photometric outliers and variation in observing conditions, and contributed algorithms were provided with a representative and complete training set. We review and evaluate the entries to the challenge, finding that even from this limited photometry information, multiple algorithms can separate tomographic bins reasonably well, reaching figure-of-merit scores close to the attainable maximum. We further find that adding the g band to riz photometry improves metric performance by ~15% and that the optimal bin assignment strategy depends strongly on the science case: which figure-of-merit is to be optimized, and which observables (clustering, lensing, or both) are included.
FlowPM: Distributed TensorFlow implementation of the FastPM cosmological N-body solver
C. Modi, F. Lanusse, and U. Seljak
Astronomy and Computing, Oct 2021
We present FlowPM, a Particle-Mesh (PM) cosmological N-body code implemented in Mesh-TensorFlow for GPU-accelerated, distributed, and differentiable simulations. We implement and validate the accuracy of a novel multi-grid scheme based on multiresolution pyramids to compute large-scale forces efficiently on distributed platforms. We explore the scaling of the simulation on large-scale supercomputers and compare it with the corresponding Python-based PM code, finding on average a 10x speedup in terms of wall-clock time. We also demonstrate how this novel tool can be used for efficiently solving large-scale cosmological inference problems, in particular the reconstruction of cosmological fields in a forward-model Bayesian framework with a hybrid PM and neural network forward model. We provide skeleton code for these examples, and the entire code is publicly available at https://github.com/modichirag/flowpm
Dark Energy Survey Year 3 results: Curved-sky weak lensing mass map reconstruction
N. Jeffrey, M. Gatti, C. Chang, and 129 more authors
Monthly Notices of the Royal Astronomical Society, Aug 2021
We present reconstructed convergence maps, mass maps, from the Dark Energy Survey (DES) third year (Y3) weak gravitational lensing data set. The mass maps are weighted projections of the density field (primarily dark matter) in the foreground of the observed galaxies. We use four reconstruction methods, each a maximum a posteriori estimate with a different model for the prior probability of the map: Kaiser-Squires, null B-mode prior, Gaussian prior, and a sparsity prior. All methods are implemented on the celestial sphere to accommodate the large sky coverage of the DES Y3 data. We compare the methods using realistic ΛCDM simulations with mock data that are closely matched to the DES Y3 data. We quantify the performance of the methods at the map level and then apply the reconstruction methods to the DES Y3 data, performing tests for systematic error effects. The maps are compared with optical foreground cosmic-web structures and are used to evaluate the lensing signal from cosmic-void profiles. The recovered dark matter map covers the largest sky fraction of any galaxy weak lensing map to date.
Adaptive wavelet distillation from neural networks through interpretations
Wooseok Ha, Chandan Singh, Francois Lanusse, and 2 more authors
arXiv e-prints, Jul 2021
Recent deep-learning models have achieved impressive prediction performance, but often sacrifice interpretability and computational efficiency. Interpretability is crucial in many disciplines, such as science and medicine, where models must be carefully vetted or where interpretation is the goal itself. Moreover, interpretable models are concise and often yield computational efficiency. Here, we propose adaptive wavelet distillation (AWD), a method which aims to distill information from a trained neural network into a wavelet transform. Specifically, AWD penalizes feature attributions of a neural network in the wavelet domain to learn an effective multiresolution wavelet transform. The resulting model is highly predictive, concise, computationally efficient, and has properties (such as a multi-scale structure) which make it easy to interpret. In close collaboration with domain experts, we showcase how AWD addresses challenges in two real-world settings: cosmological parameter inference and molecular-partner prediction. In both cases, AWD yields a scientifically interpretable and concise model which gives predictive performance better than state-of-the-art neural networks. Moreover, AWD identifies predictive features that are scientifically meaningful in the context of the respective domains. All code and models are released in a full-fledged package available on GitHub (https://github.com/Yu-Group/adaptive-wavelets).
Deep generative models for galaxy image simulations. François Lanusse, Rachel Mandelbaum, Siamak Ravanbakhsh, and 3 more authors. Monthly Notices of the Royal Astronomical Society, Jul 2021.
Image simulations are essential tools for preparing and validating the analysis of current and future wide-field optical surveys. However, the galaxy models used as the basis for these simulations are typically limited to simple parametric light profiles, or use a fairly limited amount of available space-based data. In this work, we propose a methodology based on deep generative models to create complex models of galaxy morphologies that may meet the image simulation needs of upcoming surveys. We address the technical challenges associated with learning this morphology model from noisy and point spread function (PSF)-convolved images by building a hybrid Deep Learning/physical Bayesian hierarchical model for observed images, explicitly accounting for the PSF and noise properties. The generative model is further made conditional on physical galaxy parameters, to allow for sampling new light profiles from specific galaxy populations. We demonstrate our ability to train and sample from such a model on galaxy postage stamps from the HST/ACS COSMOS survey, and validate the quality of the model using a range of second- and higher-order morphology statistics. Using this set of statistics, we demonstrate significantly more realistic morphologies using these deep generative models compared to conventional parametric models. To help make these generative models practical tools for the community, we introduce GalSim-Hub, a community-driven repository of generative models, and a framework for incorporating generative models within the GalSim image simulation software.
Real-time Likelihood-free Inference of Roman Binary Microlensing Events with Amortized Neural Posterior Estimation. Keming Zhang, Joshua S. Bloom, B. Scott Gaudi, and 3 more authors. Astronomical Journal, Jun 2021.
Fast and automated inference of binary-lens, single-source (2L1S) microlensing events with sampling-based Bayesian algorithms (e.g., Markov Chain Monte Carlo, MCMC) is challenged on two fronts: the high computational cost of likelihood evaluations with microlensing simulation codes, and a pathological parameter space where the negative-log-likelihood surface can contain a multitude of local minima that are narrow and deep. Analysis of 2L1S events usually involves grid searches over some parameters to locate approximate solutions as a prerequisite to posterior sampling, an expensive process that often requires human-in-the-loop domain expertise. As the next-generation, space-based microlensing survey with the Roman Space Telescope is expected to yield thousands of binary microlensing events, a new fast and automated method is desirable. Here, we present a likelihood-free inference approach named amortized neural posterior estimation, where a neural density estimator (NDE) learns a surrogate posterior p̂(θ|x) as an observation-parameterized conditional probability distribution, from pre-computed simulations over the full prior space. Trained on 291,012 simulated Roman-like 2L1S events, the NDE produces accurate and precise posteriors within seconds for any observation within the prior support without requiring a domain expert in the loop, thus allowing for real-time and automated inference. We show that the NDE also captures expected posterior degeneracies. The NDE posterior could then be refined into the exact posterior with a downstream MCMC sampler with minimal burn-in steps.
TheLastMetric: an information-based observing strategy metric for photometric redshifts, cosmology, and more. A. I. Malz, F. Lanusse, M. L. Graham, and 1 more author. In American Astronomical Society Meeting Abstracts, Jun 2021.
An astronomical survey's observing strategy, which encompasses the frequency and duration of visits to each portion of the sky, impacts the degree to which its data can answer the most pressing questions about the universe. Surveys with diverse scientific goals pose a special challenge for survey design decision-making; even if each physical parameter of interest has a corresponding quantitative metric, there is no guarantee of a "one size fits all" optimal observing strategy. While traditional observing strategy metrics must be specific to the science case in question, we exploit a chain rule of the variational mutual information to engineer TheLastMetric, an interpretable, extensible metric that enables coherent observing strategy optimization over multiple science objectives. The upcoming Vera C. Rubin Observatory's Legacy Survey of Space and Time (LSST) serves as an ideal application for this metric, as many of its extragalactic science goals rely upon purely photometric redshift constraints. As a demonstration, we use the LSST Metrics Analysis Framework (MAF) to quantify how much information about redshift is contained within photometry, conditioned on a fiducial true galaxy catalog and mock observations under each of several given observing strategies, generated by the LSST Operations Simulator (OpSim). We compare traditional metrics of photometric redshift performance to TheLastMetric and interpret their differences from the perspective of observing strategy optimization. Finally, we illustrate how to extend TheLastMetric to cosmological constraints from multiple probes, jointly or individually.
Deep Probabilistic Modeling of Weak Lensing Mass Maps. F. Lanusse, B. Remy, N. Jeffrey, and 2 more authors. In American Astronomical Society Meeting Abstracts, Jun 2021.
While weak gravitational lensing is one of the most promising cosmological probes targeted by upcoming wide-field surveys, exploiting the full information content of the cosmic shear signal remains a major challenge. One dimension of this challenge is the fact that analytic cosmological models only describe the 2-pt functions of the lensing signal, while we know the convergence field to be significantly non-Gaussian. As a result, solving a problem like weak lensing mass-mapping using analytic Gaussian priors will be suboptimal. We do however have access to models that can capture the full statistics of the lensing signal: numerical simulations. But the question is: how can we use samples from numerical simulations to solve a Bayesian inference problem such as weak lensing mass-mapping? In this talk, I will illustrate how recent deep generative modeling provides us with the tools needed to leverage a physical model in the form of numerical simulations to perform proper Bayesian inference. Using Neural Score Estimation, we learn from numerical simulations an estimate of the score function (i.e. the gradient of the log density function) of the distribution of convergence maps. We then use the learned score function as a prior within an Annealed Hamiltonian Monte-Carlo sampling scheme which allows us to access the full posterior distribution of a mass-mapping problem in 10^6 dimensions.
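The sampling idea in this abstract can be sketched with a toy annealed Langevin loop (a simpler cousin of the annealed HMC used in the talk). This is a hedged illustration, not the paper's code: the learned neural score is replaced by the analytic score of a standard Gaussian so the example is self-contained, and all names below are assumptions for illustration.

```python
import numpy as np

# Toy sketch: annealed Langevin sampling with a known score function.
# Here grad_x log p(x) = -x (a standard Gaussian target); in the talk this
# gradient is a neural estimate learned from convergence-map simulations.
def score(x):
    return -x

def annealed_langevin(x0, sigmas, n_steps=100, eps=0.01, rng=None):
    """Anneal from high to low noise levels, running Langevin updates at each."""
    if rng is None:
        rng = np.random.default_rng(0)
    x = x0.copy()
    for sigma in sigmas:
        step = eps * (sigma / sigmas[-1]) ** 2  # larger steps at higher noise
        for _ in range(n_steps):
            noise = rng.standard_normal(x.shape)
            x = x + 0.5 * step * score(x) + np.sqrt(step) * noise
    return x

# Many independent chains, started far from the target N(0, 1).
rng = np.random.default_rng(42)
x0 = 10.0 * rng.standard_normal(5000)
samples = annealed_langevin(x0, sigmas=np.array([10.0, 3.0, 1.0]), rng=rng)
print(samples.mean(), samples.std())  # both should be close to 0 and 1
```

In the actual mass-mapping application, the state is a ~10^6-dimensional map, the score comes from a trained network, and the Langevin updates are replaced by annealed HMC; the high-to-low noise annealing structure is the same.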
Weak-lensing mass reconstruction using sparsity and a Gaussian random field. J. L. Starck, K. E. Themelis, N. Jeffrey, and 2 more authors. Astronomy and Astrophysics, May 2021.
Aims: We introduce a novel approach to reconstructing dark matter mass maps from weak gravitational lensing measurements. The cornerstone of the proposed method lies in a new modelling of the matter density field in the Universe as a mixture of two components: (1) a sparsity-based component that captures the non-Gaussian structure of the field, such as peaks or halos at different spatial scales, and (2) a Gaussian random field, which is known to represent the linear characteristics of the field well. Methods: We propose an algorithm called MCALens that jointly estimates these two components. MCALens is based on an alternating minimisation incorporating both sparse recovery and a proximal iterative Wiener filtering. Results: Experimental results on simulated data show that the proposed method exhibits improved estimation accuracy compared to customised mass-map reconstruction methods.
CosmicRIM: Reconstructing Early Universe by Combining Differentiable Simulations with Recurrent Inference Machines. Chirag Modi, François Lanusse, Uroš Seljak, and 2 more authors. arXiv e-prints, Apr 2021.
Reconstructing the Gaussian initial conditions at the beginning of the Universe from survey data in a forward modeling framework is a major challenge in cosmology. This requires solving a high-dimensional inverse problem with an expensive, non-linear forward model: a cosmological N-body simulation. While intractable until recently, we propose to solve this inference problem using an automatically differentiable N-body solver, combined with recurrent networks that learn the inference scheme and obtain the maximum-a-posteriori (MAP) estimate of the initial conditions of the Universe. We demonstrate, using realistic cosmological observables, that learnt inference is 40 times faster than traditional algorithms such as ADAM and LBFGS, which require specialized annealing schemes, and obtains solutions of higher quality.
An information-based metric for observing strategy optimization, demonstrated in the context of photometric redshifts with applications to cosmology. Alex I. Malz, François Lanusse, John Franklin Crenshaw, and 1 more author. arXiv e-prints, Apr 2021.
The observing strategy of a galaxy survey influences the degree to which its resulting data can be used to accomplish any science goal. LSST is thus seeking metrics of observing strategies for multiple science cases in order to optimally choose a cadence. Photometric redshifts are essential for many extragalactic science applications of LSST's data, including but not limited to cosmology, but there are few metrics available, and they are not straightforwardly integrated with metrics of other cadence-dependent quantities that may influence any given use case. We propose a metric for observing strategy optimization based on the potentially recoverable mutual information about redshift from a photometric sample under the constraints of a realistic observing strategy. We demonstrate a tractable estimation of a variational lower bound of this mutual information implemented in a public code using conditional normalizing flows. By comparing the recoverable redshift information across observing strategies, we can distinguish between those that preclude robust redshift constraints and those whose data will preserve more redshift information, to be generically utilized in a downstream analysis. We recommend the use of this versatile metric for observing strategy optimization for redshift-dependent extragalactic use cases, including but not limited to cosmology, as well as any other science applications for which photometry may be modeled from true parameter values beyond redshift.
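The variational lower bound at the heart of this metric can be illustrated with a hedged toy example: the Barber-Agakov bound I(z; x) ≥ H(z) + E[log q(z|x)] holds for any conditional model q. The paper fits q with conditional normalizing flows; the linear-Gaussian stand-ins below for "redshift" z and "photometry" x are purely illustrative assumptions, not the paper's setup.

```python
import numpy as np

# Toy variational lower bound on mutual information:
#   I(z; x) >= H(z) + E[log q(z|x)]   for any conditional model q.
# z plays the role of redshift, x of photometry; q(z|x) is a linear-Gaussian
# fit standing in for the paper's conditional normalizing flow.
rng = np.random.default_rng(0)
n = 200_000
z = rng.standard_normal(n)               # "redshift", unit variance
x = z + 0.5 * rng.standard_normal(n)     # noisy "photometry"

# Fit q(z|x) = N(a*x + b, s2) by least squares.
a, b = np.polyfit(x, z, 1)
s2 = np.var(z - (a * x + b))

h_z = 0.5 * np.log(2 * np.pi * np.e)     # differential entropy of N(0, 1)
e_log_q = -0.5 * np.log(2 * np.pi * s2) - 0.5
lower_bound = h_z + e_log_q

# Analytic MI for this jointly Gaussian pair: 0.5 * log(1 + signal/noise).
true_mi = 0.5 * np.log(1 + 1.0 / 0.25)
print(lower_bound, true_mi)
```

Here the bound is essentially tight because the linear-Gaussian q matches the true posterior; for non-Gaussian photometry-redshift relations, a richer q (such as a normalizing flow) is what tightens the bound.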
A deep learning approach to test the small-scale galaxy morphology and its relationship with star formation activity in hydrodynamical simulations. Lorenzo Zanisi, Marc Huertas-Company, François Lanusse, and 10 more authors. Monthly Notices of the Royal Astronomical Society, Mar 2021.
Hydrodynamical simulations of galaxy formation and evolution attempt to fully model the physics that shapes galaxies. The agreement between the morphology of simulated and real galaxies, and the way the morphological types are distributed across galaxy scaling relations, are important probes of our knowledge of galaxy formation physics. Here, we propose an unsupervised deep learning approach to perform a stringent test of the fine morphological structure of galaxies coming from the Illustris and IllustrisTNG (TNG100 and TNG50) simulations against observations from a subsample of the Sloan Digital Sky Survey. Our framework is based on PixelCNN, an autoregressive model for image generation with an explicit likelihood. We adopt a strategy that combines the output of two PixelCNN networks in a metric that isolates the small-scale morphological details of galaxies from the sky background. We are able to quantitatively identify the improvements of IllustrisTNG, particularly in the high-resolution TNG50 run, over the original Illustris. However, we find that the fine details of galaxy structure are still different between observed and simulated galaxies. This difference is mostly driven by small, more spheroidal, and quenched galaxies that are globally less accurate regardless of resolution and which have experienced little improvement between the three simulations explored. We speculate that this disagreement, which is less severe for quenched discy galaxies, may stem from a still too coarse numerical resolution, which struggles to properly capture the inner, dense regions of quenched spheroidal galaxies.
The LSST DESC DC2 Simulated Sky Survey. LSST Dark Energy Science Collaboration (LSST DESC), Bela Abolfathi, David Alonso, and 77 more authors. Astrophysical Journal, Supplement, Mar 2021.
We describe the simulated sky survey underlying the second data challenge (DC2) carried out in preparation for analysis of the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST) by the LSST Dark Energy Science Collaboration (LSST DESC). Significant connections across multiple science domains will be a hallmark of LSST; the DC2 program represents a unique modeling effort that stresses this interconnectivity in a way that has not been attempted before. This effort encompasses a full end-to-end approach: starting from a large N-body simulation, through setting up LSST-like observations including realistic cadences, through image simulations, and finally processing with Rubin's LSST Science Pipelines. This last step ensures that we generate data products resembling those to be delivered by the Rubin Observatory as closely as is currently possible. The simulated DC2 sky survey covers six optical bands in a wide-fast-deep area of approximately 300 deg^2, as well as a deep drilling field of approximately 1 deg^2. We simulate 5 yr of the planned 10 yr survey. The DC2 sky survey has multiple purposes. First, the LSST DESC working groups can use the data set to develop a range of DESC analysis pipelines to prepare for the advent of actual data. Second, it serves as a realistic test bed for the image processing software under development for LSST by the Rubin Observatory. In particular, simulated data provide a controlled way to investigate certain image-level systematic effects. Finally, the DC2 sky survey enables the exploration of new scientific ideas in both static and time-domain cosmology.
Likelihood-free inference with neural compression of DES SV weak lensing map statistics. Niall Jeffrey, Justin Alsing, and François Lanusse. Monthly Notices of the Royal Astronomical Society, Feb 2021.
In many cosmological inference problems, the likelihood (the probability of the observed data as a function of the unknown parameters) is unknown or intractable. This necessitates approximations and assumptions, which can lead to incorrect inference of cosmological parameters, including the nature of dark matter and dark energy, or create artificial model tensions. Likelihood-free inference covers a novel family of methods to rigorously estimate posterior distributions of parameters using forward modelling of mock data. We present likelihood-free cosmological parameter inference using weak lensing maps from the Dark Energy Survey (DES) Science Verification data, using neural data compression of weak lensing map summary statistics. We explore combinations of the power spectra, peak counts, and neural compressed summaries of the lensing mass map using deep convolutional neural networks. We demonstrate methods to validate the inference process, for both the data modelling and the probability density estimation steps. Likelihood-free inference provides a robust and scalable alternative for rigorous large-scale cosmological inference with galaxy survey data (for DES, Euclid, and LSST). We have made our simulated lensing maps publicly available.
DESC DC2 Data Release Note. LSST Dark Energy Science Collaboration, Bela Abolfathi, Robert Armstrong, and 54 more authors. arXiv e-prints, Jan 2021.
In preparation for cosmological analyses of the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST), the LSST Dark Energy Science Collaboration (LSST DESC) has created a 300 deg^2 simulated survey as part of an effort called Data Challenge 2 (DC2). The DC2 simulated sky survey, in six optical bands with observations following a reference LSST observing cadence, was processed with the LSST Science Pipelines (19.0.0). In this Note, we describe the public data release of the resulting object catalogs for the coadded images of five years of simulated observations along with associated truth catalogs. We include a brief description of the major features of the available data sets. To enable convenient access to the data products, we have developed a web portal connected to Globus data services. We describe how to access the data and provide example Jupyter Notebooks in Python to aid first interactions with the data. We welcome feedback and questions about the data release via a GitHub repository.
Automating Inference of Binary Microlensing Events with Neural Density Estimation. K. Zhang, J. Bloom, B. Gaudi, and 3 more authors. In American Astronomical Society Meeting Abstracts, Jan 2021.
Automated inference of binary microlensing events with traditional sampling-based algorithms such as MCMC has been hampered by the slowness of the physical forward model and the pathological likelihood surface. Current analysis of such events requires both expert knowledge and large-scale grid searches to locate the approximate solution as a prerequisite to MCMC posterior sampling. As the next-generation, space-based microlensing survey with the Roman Space Telescope is expected to yield thousands of binary microlensing events, a new scalable and automated approach is desired. Here, we present an automated inference method based on neural density estimation (NDE). We show that the NDE trained on simulated Roman data not only produces fast, accurate, and precise posteriors but also captures expected posterior degeneracies. A hybrid NDE-MCMC framework can further be applied to produce the exact posterior.
2020
Anomaly Detection in Astronomical Images with Generative Adversarial Networks. Kate Storey-Fisher, Marc Huertas-Company, Nesar Ramachandra, and 4 more authors. arXiv e-prints, Dec 2020.
We present an anomaly detection method using Wasserstein generative adversarial networks (WGANs) on optical galaxy images from the wide-field survey conducted with the Hyper Suprime-Cam (HSC) on the Subaru Telescope in Hawai’i. The WGAN is trained on the entire sample, and learns to generate realistic HSC-like images that follow the distribution of the training data. We identify images which are less well-represented in the generator’s latent space, and which the discriminator flags as less realistic; these are thus anomalous with respect to the rest of the data. We propose a new approach to characterize these anomalies based on a convolutional autoencoder (CAE) to reduce the dimensionality of the residual differences between the real and WGAN-reconstructed images. We construct a subsample of ~9,000 highly anomalous images from our nearly million-object sample, and further identify interesting anomalies within these; these include galaxy mergers, tidal features, and extreme star-forming galaxies. The proposed approach could boost unsupervised discovery in the era of big data astrophysics.
Denoising Score-Matching for Uncertainty Quantification in Inverse Problems. Zaccharie Ramzi, Benjamin Remy, Francois Lanusse, and 2 more authors. arXiv e-prints, Nov 2020.
Deep neural networks have proven extremely efficient at solving a wide range of inverse problems, but most often the uncertainty on the solution they provide is hard to quantify. In this work, we propose a generic Bayesian framework for solving inverse problems, in which we limit the use of deep neural networks to learning a prior distribution on the signals to recover. We adopt recent denoising score matching techniques to learn this prior from data, and subsequently use it as part of an annealed Hamiltonian Monte-Carlo scheme to sample the full posterior of image inverse problems. We apply this framework to Magnetic Resonance Image (MRI) reconstruction and illustrate how this approach not only yields high-quality reconstructions but can also be used to assess the uncertainty on particular features of a reconstructed image.
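The core identity behind denoising score matching can be checked in closed form on a 1-D Gaussian. This is a hedged sketch: the noise level, signal model, and linear denoiser are all illustrative assumptions, not the paper's MRI setup.

```python
import numpy as np

# Denoising score matching identity: a denoiser d(y) trained to minimise
# E||d(x + sigma*n) - x||^2 recovers the score of the noisy density via
#   grad_y log p_sigma(y) = (d(y) - y) / sigma^2.
rng = np.random.default_rng(1)
sigma = 0.5
x = rng.standard_normal(300_000)               # clean "signal"
y = x + sigma * rng.standard_normal(x.shape)   # noisy observation

# Least-squares linear denoiser d(y) = a*y; the optimum is a = 1/(1 + sigma^2).
a = (y @ x) / (y @ y)

# Implied score slope (a - 1)/sigma^2; analytically -1/(1 + sigma^2),
# since p_sigma = N(0, 1 + sigma^2) for this Gaussian toy model.
score_slope = (a - 1.0) / sigma**2
print(a, score_slope)
```

In the paper the same identity is exploited with a deep denoiser on images, and the resulting score estimate supplies the prior gradient inside an annealed Hamiltonian Monte-Carlo sampler.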
Probabilistic Mapping of Dark Matter by Neural Score Matching. Benjamin Remy, Francois Lanusse, Zaccharie Ramzi, and 3 more authors. arXiv e-prints, Nov 2020.
The Dark Matter present in the Large-Scale Structure of the Universe is invisible, but its presence can be inferred through the small gravitational lensing effect it has on the images of faraway galaxies. By measuring this lensing effect on a large number of galaxies it is possible to reconstruct maps of the Dark Matter distribution on the sky. This, however, represents an extremely challenging inverse problem due to missing data and noise-dominated measurements. In this work, we present a novel methodology for addressing such inverse problems by combining elements of Bayesian statistics, analytic physical theory, and a recent class of Deep Generative Models based on Neural Score Matching. This approach allows us to do the following: (1) make full use of analytic cosmological theory to constrain the 2-pt statistics of the solution, (2) learn from cosmological simulations any differences between this analytic prior and full simulations, and (3) obtain samples from the full Bayesian posterior of the problem for robust Uncertainty Quantification. We present an application of this methodology on the first deep-learning-assisted Dark Matter map reconstruction of the Hubble Space Telescope COSMOS field.
Automating Inference of Binary Microlensing Events with Neural Density Estimation. Keming Zhang, Joshua S. Bloom, B. Scott Gaudi, and 3 more authors. arXiv e-prints, Oct 2020.
Automated inference of binary microlensing events with traditional sampling-based algorithms such as MCMC has been hampered by the slowness of the physical forward model and the pathological likelihood surface. Current analysis of such events requires both expert knowledge and large-scale grid searches to locate the approximate solution as a prerequisite to MCMC posterior sampling. As the next-generation, space-based microlensing survey with the Roman Space Telescope is expected to yield thousands of binary microlensing events, a new scalable and automated approach is desired. Here, we present an automated inference method based on neural density estimation (NDE). We show that the NDE trained on simulated Roman data not only produces fast, accurate, and precise posteriors but also captures expected posterior degeneracies. A hybrid NDE-MCMC framework can further be applied to produce the exact posterior.
Bayesian Neural Networks. Tom Charnock, Laurence Perreault-Levasseur, and François Lanusse. arXiv e-prints, Jun 2020.
In recent times, neural networks have become a powerful tool for the analysis of complex and abstract data models. However, their introduction intrinsically increases our uncertainty about which features of the analysis are model-related and which are due to the neural network. This means that predictions by neural networks have biases which cannot be trivially distinguished from biases inherent in the way the data were created and observed. In order to address such issues we discuss Bayesian neural networks: neural networks where the uncertainty due to the network can be characterised. In particular, we present the Bayesian statistical framework which allows us to categorise uncertainty in terms of the ingrained randomness of observing certain data and the uncertainty from our lack of knowledge about how data can be created and observed. In presenting such techniques we show how errors in prediction by neural networks can be obtained in principle, and provide the two favoured methods for characterising these errors. We also describe how both of these methods have substantial pitfalls when put into practice, highlighting the need for other statistical techniques if we are to truly be able to do inference when using neural networks.
Transformation Importance with Applications to Cosmology. Chandan Singh, Wooseok Ha, Francois Lanusse, and 3 more authors. arXiv e-prints, Mar 2020.
Machine learning lies at the heart of new possibilities for scientific discovery, knowledge generation, and artificial intelligence. Realizing its potential benefits to these fields requires going beyond predictive accuracy and focusing on interpretability. In particular, many scientific problems require interpretations in a domain-specific interpretable feature space (e.g. the frequency domain) whereas attributions to the raw features (e.g. the pixel space) may be unintelligible or even misleading. To address this challenge, we propose TRIM (TRansformation IMportance), a novel approach which attributes importances to features in a transformed space and can be applied post hoc to a fully trained model. TRIM is motivated by a cosmological parameter estimation problem using deep neural networks (DNNs) on simulated data, but it is generally applicable across domains/models and can be combined with any local interpretation method. In our cosmology example, combining TRIM with contextual decomposition shows promising results for identifying which frequencies a DNN uses, helping cosmologists to understand and validate that the model learns appropriate physical features rather than simulation artifacts.
Deep learning dark matter map reconstructions from DES SV weak lensing data. Niall Jeffrey, François Lanusse, Ofer Lahav, and 1 more author. Monthly Notices of the Royal Astronomical Society, Mar 2020.
We present the first reconstruction of dark matter maps from weak lensing observational data using deep learning. We train a convolutional neural network with a U-Net-based architecture on over 3.6 × 10^5 simulated data realizations with non-Gaussian shape noise and with cosmological parameters varying over a broad prior distribution. We interpret our newly created Dark Energy Survey Science Verification (DES SV) map as an approximation of the posterior mean ⟨κ|γ⟩ of the convergence given observed shear. Our DeepMass method is substantially more accurate than existing mass-mapping methods. With a validation set of 8000 simulated DES SV data realizations, compared to Wiener filtering with a fixed power spectrum, the DeepMass method improved the mean square error (MSE) by 11 per cent. With N-body simulated MICE mock data, we show that Wiener filtering, with the optimal known power spectrum, still gives a worse MSE than our generalized method with no input cosmological parameters; we show that the improvement is driven by the non-linear structures in the convergence. With higher galaxy density in future weak lensing data unveiling more non-linear scales, it is likely that deep learning will be a leading approach for mass mapping with Euclid and LSST.
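Why does a network trained on simulations approximate the posterior mean? Because the minimiser of E[(f(γ) − κ)^2] over all functions f is the conditional mean f(γ) = E[κ|γ]. A hedged 1-D sketch follows, with scalar stand-ins for shear and convergence and a per-bin average instead of a U-Net; none of this is the paper's code.

```python
import numpy as np

# The function minimising mean square error is the conditional mean:
#   argmin_f E[(f(gamma) - kappa)^2] = E[kappa | gamma].
# Scalar stand-ins: kappa ~ N(0, 1) is "convergence", gamma a noisy "shear".
rng = np.random.default_rng(0)
kappa = rng.standard_normal(200_000)
gamma = kappa + 0.7 * rng.standard_normal(kappa.shape)

# "Train" f by minimising MSE separately in bins of gamma
# (a piecewise-constant regressor instead of a U-Net).
bins = np.digitize(gamma, np.linspace(-3, 3, 61))
f = np.zeros_like(gamma)
for b in np.unique(bins):
    mask = bins == b
    f[mask] = kappa[mask].mean()

# Analytic posterior mean for this Gaussian pair: gamma / (1 + 0.7**2).
analytic = gamma / (1 + 0.49)
print(np.abs(f - analytic).mean())  # small: the MSE fit tracks E[kappa|gamma]
```

The same argument explains why DeepMass's MSE-trained U-Net can be read as an approximation of the posterior mean map, with the simulation ensemble playing the role of the joint (κ, γ) samples here.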
2019
Hybrid Physical-Deep Learning Model for Astronomical Inverse Problems. Francois Lanusse, Peter Melchior, and Fred Moolekamp. arXiv e-prints, Dec 2019.
We present a Bayesian machine learning architecture that combines a physically motivated parametrization and an analytic error model for the likelihood with a deep generative model providing a powerful data-driven prior for complex signals. This combination yields an interpretable and differentiable generative model, allows the incorporation of prior knowledge, and can be utilized for observations with different data quality without having to retrain the deep network. We demonstrate our approach with an example of astronomical source separation in current imaging data, yielding a physical and interpretable model of astronomical scenes.
CosmoDC2: A Synthetic Sky Catalog for Dark Energy Science with LSST. Danila Korytov, Andrew Hearin, Eve Kovacs, and 28 more authors. Astrophysical Journal, Supplement, Dec 2019.
This paper introduces cosmoDC2, a large synthetic galaxy catalog designed to support precision dark energy science with the Large Synoptic Survey Telescope (LSST). CosmoDC2 is the starting point for the second data challenge (DC2) carried out by the LSST Dark Energy Science Collaboration (LSST DESC). The catalog is based on a trillion-particle, (4.225 Gpc)^3 box cosmological N-body simulation, the Outer Rim run. It covers 440 deg^2 of sky area to a redshift of z = 3 and matches expected number densities from contemporary surveys to a magnitude depth of 28 in the r band. Each galaxy is characterized by a multitude of galaxy properties including stellar mass, morphology, spectral energy distributions, broadband filter magnitudes, host halo information, and weak lensing shear. The size and complexity of cosmoDC2 require an efficient catalog generation methodology; our approach is based on a new hybrid technique that combines data-based empirical approaches with semi-analytic galaxy modeling. A wide range of observation-based validation tests has been implemented to ensure that cosmoDC2 enables the science goals of the planned LSST DESC DC2 analyses. This paper also represents the official release of the cosmoDC2 data set, including an efficient reader that facilitates interaction with the data.
Uncertainty Quantification with Generative Models. Vanessa Böhm, François Lanusse, and Uroš Seljak. arXiv e-prints, Oct 2019.
We develop a generative model-based approach to Bayesian inverse problems, such as image reconstruction from noisy and incomplete images. Our framework addresses two common challenges of Bayesian reconstructions: 1) it makes use of complex, data-driven priors that comprise all available information about the uncorrupted data distribution, and 2) it enables computationally tractable uncertainty quantification in the form of posterior analysis in latent and data space. The method is very efficient in that the generative model only has to be trained once on an uncorrupted data set; after that, the procedure can be used for arbitrary corruption types.
The Role of Machine Learning in the Next Decade of Cosmology. Michelle Ntampaka, Camille Avestruz, Steven Boada, and 27 more authors. Bulletin of the AAS, May 2019.
Machine learning (ML) methods have remarkably improved how cosmologists can interpret data. The next decade will bring new opportunities for data-driven discovery, but will also present new challenges for adopting ML methodologies. ML could transform our field, but it will require the community to promote interdisciplinary research endeavors.
Core Cosmology Library: Precision Cosmological Predictions for LSST. Nora Elisa Chisari, David Alonso, Elisabeth Krause, and 28 more authors. Astrophysical Journal, Supplement, May 2019.
The Core Cosmology Library (CCL) provides routines to compute basic cosmological observables to a high degree of accuracy, which have been verified with an extensive suite of validation tests. Predictions are provided for many cosmological quantities, including distances, angular power spectra, correlation functions, halo bias, and the halo mass function through state-of-the-art modeling prescriptions available in the literature. Fiducial specifications for the expected galaxy distributions for the Large Synoptic Survey Telescope (LSST) are also included, together with the capability of computing redshift distributions for a user-defined photometric redshift model. A rigorous validation procedure, based on comparisons between CCL and independent software packages, allows us to establish a well-defined numerical accuracy for each predicted quantity. As a result, predictions for correlation functions of galaxy clustering, galaxy-galaxy lensing, and cosmic shear are demonstrated to be within a fraction of the expected statistical uncertainty of the observables for the models and in the range of scales of interest to LSST. CCL is an open source software package written in C, with a Python interface, publicly available at https://github.com/LSSTDESC/CCL.
 The strong gravitational lens finding challenge. R. B. Metcalf, M. Meneghetti, C. Avestruz, and 34 more authors. Astronomy and Astrophysics, May 2019
Large-scale imaging surveys will increase the number of galaxy-scale strong lensing candidates by maybe three orders of magnitude beyond the number known today. Finding these rare objects will require picking them out of at least tens of millions of images, and deriving scientific results from them will require quantifying the efficiency and bias of any search method. To achieve these objectives automated methods must be developed. Because gravitational lenses are rare objects, reducing false positives will be particularly important. We present a description and results of an open gravitational lens finding challenge. Participants were asked to classify 100 000 candidate objects as to whether they were gravitational lenses or not, with the goal of developing better automated methods for finding lenses in large data sets. A variety of methods were used, including visual inspection, arc and ring finders, support vector machines (SVM) and convolutional neural networks (CNN). We find that many of the methods will easily be fast enough to analyse the anticipated data flow. In test data, several methods are able to identify upwards of half the lenses, after applying some thresholds on the lens characteristics such as lensed image brightness, size or contrast with the lens galaxy, without making a single false-positive identification. This is significantly better than direct inspection by humans was able to do. Having multi-band, ground-based data is found to be better for this purpose than single-band space-based data with lower noise and higher resolution, suggesting that multi-colour data is crucial. Multi-band space-based data will be superior to ground-based data. The most difficult challenge for a lens finder is differentiating between rare, irregular and ring-like face-on galaxies and true gravitational lenses. The degree to which the efficiency and biases of lens finders can be quantified largely depends on the realism of the simulated data on which the finders are trained.
 Cosmology from cosmic shear power spectra with Subaru Hyper Suprime-Cam first-year data. Chiaki Hikage, Masamune Oguri, Takashi Hamana, and 34 more authors. Publications of the ASJ, Apr 2019
We measure cosmic weak lensing shear power spectra with the Subaru Hyper Suprime-Cam (HSC) survey first-year shear catalog covering 137 deg^2 of the sky. Thanks to the high effective galaxy number density of ∼17 arcmin^-2, even after conservative cuts such as a magnitude cut of i < 24.5 and photometric redshift cut of 0.3 ≤ z ≤ 1.5, we obtain a high-significance measurement of the cosmic shear power spectra in four tomographic redshift bins, achieving a total signal-to-noise ratio of 16 in the multipole range 300 ≤ ℓ ≤ 1900. We carefully account for various uncertainties in our analysis, including the intrinsic alignment of galaxies, scatters and biases in photometric redshifts, residual uncertainties in the shear measurement, and modeling of the matter power spectrum. The accuracy of our power spectrum measurement method as well as our analytic model of the covariance matrix are tested against realistic mock shear catalogs. For a flat Λ cold dark matter model, we find S_8 ≡ σ_8(Ω_m/0.3)^α = 0.800^{+0.029}_{-0.028} for α = 0.45 (S_8 = 0.780^{+0.030}_{-0.033} for α = 0.5) from our HSC tomographic cosmic shear analysis alone. In comparison with Planck cosmic microwave background constraints, our results prefer slightly lower values of S_8, although metrics such as the Bayesian evidence ratio test do not show significant evidence for discordance between these results. We study the effect of possible additional systematic errors that are unaccounted for in our fiducial cosmic shear analysis, and find that they can shift the best-fit values of S_8 by up to ∼0.6σ in both directions. The full HSC survey data will contain several times more area, and will lead to significantly improved cosmological constraints.
2018
 Weak lensing shear calibration with simulations of the HSC survey. Rachel Mandelbaum, François Lanusse, Alexie Leauthaud, and 8 more authors. Monthly Notices of the Royal Astronomical Society, Dec 2018
We present results from a set of simulations designed to constrain the weak lensing shear calibration for the Hyper Suprime-Cam (HSC) survey. These simulations include HSC observing conditions and galaxy images from the Hubble Space Telescope (HST), with fully realistic galaxy morphologies and the impact of nearby galaxies included. We find that the inclusion of nearby galaxies in the images is critical to reproducing the observed distributions of galaxy sizes and magnitudes, due to the non-negligible fraction of unrecognized blends in ground-based data, even with the excellent typical seeing of the HSC survey (0.58 arcsec in the i band). Using these simulations, we detect and remove the impact of selection biases due to the correlation of weights and the quantities used to define the sample (S/N and apparent size) with the lensing shear. We quantify and remove galaxy-property-dependent multiplicative and additive shear biases that are intrinsic to our shear estimation method, including an ∼10 per cent-level multiplicative bias due to the impact of nearby galaxies and unrecognized blends. Finally, we check the sensitivity of our shear calibration estimates to other cuts made on the simulated samples, and find that the changes in shear calibration are well within the requirements for HSC weak lensing analysis. Overall, the simulations suggest that the weak lensing multiplicative biases in the first-year HSC shear catalogue are controlled at the 1 per cent level.
 Improving weak lensing mass map reconstructions using Gaussian and sparsity priors: application to DES SV. N. Jeffrey, F. B. Abdalla, O. Lahav, and 66 more authors. Monthly Notices of the Royal Astronomical Society, Sep 2018
Mapping the underlying density field, including non-visible dark matter, using weak gravitational lensing measurements is now a standard tool in cosmology. Due to its importance to the science results of current and upcoming surveys, the quality of the convergence reconstruction methods should be well understood. We compare three methods: Kaiser-Squires (KS), Wiener filter, and GLIMPSE. Kaiser-Squires is a direct inversion, not accounting for survey masks or noise. The Wiener filter is well-motivated for Gaussian density fields in a Bayesian framework. GLIMPSE uses sparsity, aiming to reconstruct non-linearities in the density field. We compare these methods with several tests using public Dark Energy Survey (DES) Science Verification (SV) data and realistic DES simulations. The Wiener filter and GLIMPSE offer substantial improvements over smoothed Kaiser-Squires with a range of metrics. Both the Wiener filter and GLIMPSE convergence reconstructions show a 12 per cent improvement in Pearson correlation with the underlying truth from simulations. To compare the mapping methods' abilities to find mass peaks, we measure the difference between peak counts from simulated ΛCDM shear catalogues and catalogues with no mass fluctuations (a standard data vector when inferring cosmology from peak statistics); the maximum signal-to-noise of these peak statistics is increased by a factor of 3.5 for the Wiener filter and 9 for GLIMPSE. With simulations, we measure the reconstruction of the harmonic phases; the phase residuals' concentration is improved 17 per cent by GLIMPSE and 18 per cent by the Wiener filter. The correlation between reconstructions from data and foreground redMaPPer clusters is increased 18 per cent by the Wiener filter and 32 per cent by GLIMPSE.
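The direct Kaiser-Squires step compared in the abstract above is simple enough to sketch. The following is a minimal flat-sky toy version assuming periodic boundaries and no treatment of masks or noise (exactly the shortcomings the Wiener filter and GLIMPSE address); it is an illustration, not the DES pipeline implementation.

```python
import numpy as np

def kaiser_squires(gamma1, gamma2):
    """Direct Kaiser-Squires inversion: shear maps -> convergence map.

    Toy flat-sky, periodic-boundary version with no mask/noise handling.
    """
    gamma = np.fft.fft2(gamma1 + 1j * gamma2)
    n1, n2 = gamma.shape
    k1 = np.fft.fftfreq(n1)[:, None]
    k2 = np.fft.fftfreq(n2)[None, :]
    k_sq = k1**2 + k2**2
    k_sq[0, 0] = 1.0  # avoid 0/0; the k=0 (mean) mode is unconstrained by shear
    # Inverse lensing kernel: conj(D), with D = ((k1^2 - k2^2) + 2i k1 k2) / k^2
    D_conj = ((k1**2 - k2**2) - 2j * k1 * k2) / k_sq
    D_conj[0, 0] = 0.0  # drop the unconstrained mean mode
    kappa = np.fft.ifft2(D_conj * gamma)
    return kappa.real  # E-mode convergence; the imaginary part is the B-mode
```

Because |D| = 1 for k ≠ 0, applying the forward kernel and then this inversion recovers the input convergence exactly up to its mean, which is why noise and masks, rather than the inversion itself, limit KS maps in practice.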
 The clustering of z > 7 galaxies: predictions from the BLUETIDES simulation. Aklant K. Bhowmick, Tiziana Di Matteo, Yu Feng, and 1 more author. Monthly Notices of the Royal Astronomical Society, Mar 2018
We study the clustering of the highest-z galaxies (from ∼0.1 to a few tens of Mpc scales) using the BLUETIDES simulation and compare it to current observational constraints from Hubble legacy and Hyper Suprime-Cam (HSC) fields (at z = 6-7.2). With a box length of 400 Mpc h^-1 on each side and 0.7 trillion particles, BLUETIDES is the largest-volume high-resolution cosmological hydrodynamic simulation to date, ideally suited for studies of high-z galaxies. We find that galaxies with magnitude m_UV < 27.7 have a bias (b_g) of 8.1 ± 1.2 at z = 8, and typical halo masses M_H ≳ 6 × 10^10 M_⊙. Given the redshift evolution between z = 8 and z = 10 [b_g ∝ (1 + z)^1.6], our inferred values of the bias and halo masses are consistent with measured angular clustering at z ∼ 6.8 from these brighter samples. The bias of fainter galaxies (in the Hubble legacy field at H_160 ≲ 29.5) is 5.9 ± 0.9 at z = 8, corresponding to halo masses M_H ≳ 10^10 M_⊙. We investigate directly the one-halo term in the clustering and show that it dominates on scales r ≲ 0.1 Mpc h^-1 (Θ ≲ 3 arcsec), with non-linear effects at transition scales between the one-halo and two-halo terms affecting scales 0.1 Mpc h^-1 ≲ r ≲ 20 Mpc h^-1 (3 arcsec ≲ Θ ≲ 90 arcsec). Current clustering measurements probe down to the scales in the transition between the one-halo and two-halo regime, where non-linear effects are important. The amplitude of the one-halo term implies that occupation numbers for satellites in BLUETIDES are somewhat higher than standard halo occupation distributions adopted in these analyses (which predict amplitudes in the one-halo regime suppressed by a factor of 2-3). That possibly implies a higher number of galaxies detected by JWST (at small scales and even fainter magnitudes) observing these fields.
 DESCQA: An Automated Validation Framework for Synthetic Sky Catalogs. Yao-Yuan Mao, Eve Kovacs, Katrin Heitmann, and 25 more authors. Astrophysical Journal, Supplement, Feb 2018
The use of high-quality simulated sky catalogs is essential for the success of cosmological surveys. The catalogs have diverse applications, such as investigating signatures of fundamental physics in cosmological observables, understanding the effect of systematic uncertainties on measured signals and testing mitigation strategies for reducing these uncertainties, aiding analysis pipeline development and testing, and survey strategy optimization. The list of applications is growing with improvements in the quality of the catalogs and the details that they can provide. Given the importance of simulated catalogs, it is critical to provide rigorous validation protocols that enable both catalog providers and users to assess the quality of the catalogs in a straightforward and comprehensive way. For this purpose, we have developed the DESCQA framework for the Large Synoptic Survey Telescope Dark Energy Science Collaboration as well as for the broader community. The goal of DESCQA is to enable the inspection, validation, and comparison of an inhomogeneous set of synthetic catalogs via the provision of a common interface within an automated framework. In this paper, we present the design concept and first implementation of DESCQA. In order to establish and demonstrate its full functionality we use a set of interim catalogs and validation tests. We highlight several important aspects, both technical and scientific, that require thoughtful consideration when designing a validation framework, including validation metrics and how these metrics impose requirements on the synthetic sky catalogs.
 The first-year shear catalog of the Subaru Hyper Suprime-Cam Subaru Strategic Program Survey. Rachel Mandelbaum, Hironao Miyatake, Takashi Hamana, and 28 more authors. Publications of the ASJ, Jan 2018
We present and characterize the catalog of galaxy shape measurements that will be used for cosmological weak lensing measurements in the Wide layer of the first year of the Hyper Suprime-Cam (HSC) survey. The catalog covers an area of 136.9 deg^2 split into six fields, with a mean i-band seeing of 0.58 arcsec and 5σ point-source depth of i ∼ 26. Given conservative galaxy selection criteria for first-year science, the depth and excellent image quality result in unweighted and weighted source number densities of 24.6 and 21.8 arcmin^-2, respectively. We define the requirements for cosmological weak lensing science with this catalog, then focus on characterizing potential systematics in the catalog using a series of internal null tests for problems with point-spread function (PSF) modeling, shear estimation, and other aspects of the image processing. We find that the PSF models narrowly meet requirements for weak lensing science with this catalog, with fractional PSF model size residuals of approximately 0.003 (requirement: 0.004) and the PSF model shape correlation function ρ_1 < 3 × 10^-7 (requirement: 4 × 10^-7) at 0.5° scales. A variety of galaxy shape-related null tests are statistically consistent with zero, but star-galaxy shape correlations reveal additive systematics on >1° scales that are sufficiently large as to require mitigation in cosmic shear measurements. Finally, we discuss the dominant systematics and the planned algorithmic changes to reduce them in future data reductions.
 CMU DeepLens: deep learning for automatic image-based galaxy-galaxy strong lens finding. François Lanusse, Quanbin Ma, Nan Li, and 5 more authors. Monthly Notices of the Royal Astronomical Society, Jan 2018
Galaxy-scale strong gravitational lensing can not only provide a valuable probe of the dark matter distribution of massive galaxies, but also provide valuable cosmological constraints, either by studying the population of strong lenses or by measuring time delays in lensed quasars. Due to the rarity of galaxy-scale strongly lensed systems, fast and reliable automated lens finding methods will be essential in the era of large surveys such as the Large Synoptic Survey Telescope, Euclid and the Wide-Field Infrared Survey Telescope. To tackle this challenge, we introduce CMU DeepLens, a new fully automated galaxy-galaxy lens finding method based on deep learning. This supervised machine learning approach does not require any tuning after the training step, which only requires realistic image simulations of strongly lensed systems. We train and validate our model on a set of 20 000 LSST-like mock observations including a range of lensed systems of various sizes and signal-to-noise ratios (S/N). We find on our simulated data set that for a rejection rate of non-lenses of 99 per cent, a completeness of 90 per cent can be achieved for lenses with Einstein radii larger than 1.4 arcsec and S/N larger than 20 on individual g-band LSST exposures. Finally, we emphasize the importance of realistically complex simulations for training such machine learning methods by demonstrating that the performance of models of significantly different complexities cannot be distinguished on simpler simulations. We make our code publicly available at https://github.com/McWilliamsCenter/CMU-DeepLens.
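The completeness-at-fixed-rejection figure quoted above (90 per cent completeness at 99 per cent non-lens rejection) is obtained by thresholding a classifier score. As a hedged illustration of how that metric is computed from scores (a generic helper for this publication list, not code from CMU DeepLens):

```python
import numpy as np

def completeness_at_rejection(lens_scores, nonlens_scores, rejection=0.99):
    """Fraction of true lenses recovered at the score threshold that
    rejects a given fraction of non-lenses (e.g. 99 per cent)."""
    threshold = np.quantile(nonlens_scores, rejection)
    return float(np.mean(lens_scores > threshold))
```

Sweeping `rejection` traces out the same completeness/purity trade-off that a ROC curve summarizes; the abstract reports one operating point on that curve.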
2017
 Lsstdesc/Descqa: Descqa V2 Alpha Release (V2.0.0-0.4.5). Yao-Yuan Mao, tomuram, Rongpu Zhou, and 9 more authors. Dec 2017
 Sparse Reconstruction of the Merging A520 Cluster System. Austin Peel, François Lanusse, and Jean-Luc Starck. Astrophysical Journal, Sep 2017
Merging galaxy clusters present a unique opportunity to study the properties of dark matter in an astrophysical context. These are rare and extreme cosmic events in which the bulk of the baryonic matter becomes displaced from the dark matter halos of the colliding subclusters. Since all mass bends light, weak gravitational lensing is a primary tool to study the total mass distribution in such systems. Combined with X-ray and optical analyses, mass maps of cluster mergers reconstructed from weak lensing observations have been used to constrain the self-interaction cross-section of dark matter. The dynamically complex Abell 520 (A520) cluster is an exceptional case, even among merging systems: multi-wavelength observations have revealed a surprisingly high mass-to-light concentration of dark mass, the interpretation of which is difficult under the standard assumption of effectively collisionless dark matter. We revisit A520 using a new sparsity-based mass-mapping algorithm to independently assess the presence of the puzzling dark core. We obtain high-resolution mass reconstructions from two separate galaxy shape catalogs derived from Hubble Space Telescope observations of the system. Our mass maps agree well overall with the results of previous studies, but we find important differences. In particular, although we are able to identify the dark core at a certain level in both data sets, it is at much lower significance than has been reported before using the same data. As we cannot confirm the detection in our analysis, we do not consider A520 as posing a significant challenge to the collisionless dark matter scenario.
 Cosmological constraints with weak-lensing peak counts and second-order statistics in a large-field survey. Austin Peel, Chieh-An Lin, François Lanusse, and 3 more authors. Astronomy and Astrophysics, Mar 2017
Peak statistics in weak-lensing maps access the non-Gaussian information contained in the large-scale distribution of matter in the Universe. They are therefore a promising complementary probe to two-point and higher-order statistics to constrain our cosmological models. Next-generation galaxy surveys, with their advanced optics and large areas, will measure the cosmic weak lensing signal with unprecedented precision. To prepare for these anticipated data sets, we assess the constraining power of peak counts in a simulated Euclid-like survey on the cosmological parameters Ω_m, σ_8, and w_0^de. In particular, we study how CAMELUS, a fast stochastic model for predicting peaks, can be applied to such large surveys. The algorithm avoids the need for time-costly N-body simulations, and its stochastic approach provides full PDF information of observables. Considering peaks with a signal-to-noise ratio ≥ 1, we measure the abundance histogram in a mock shear catalogue of approximately 5000 deg^2 using a multi-scale mass-map filtering technique. We constrain the parameters of the mock survey using CAMELUS combined with approximate Bayesian computation, a robust likelihood-free inference algorithm. Peak statistics yield a tight but significantly biased constraint in the σ_8-Ω_m plane, as measured by the width ΔΣ_8 of the 1σ contour. We find Σ_8 = σ_8(Ω_m/0.27)^α = 0.77_{-0.05}^{+0.06} with α = 0.75 for a flat ΛCDM model. The strong bias indicates the need to better understand and control the model systematics before applying it to a real survey of this size or larger. We perform a calibration of the model and compare results to those from the two-point correlation functions ξ_± measured on the same field.
We calibrate the ξ_± result as well, since its contours are also biased, although not as severely as for peaks. In this case, we find for peaks Σ_8 = 0.76_{-0.03}^{+0.02} with α = 0.65, while for the combined ξ_+ and ξ_- statistics the values are Σ_8 = 0.76_{-0.01}^{+0.02} and α = 0.70. We conclude that the constraining power can therefore be comparable between the two weak-lensing observables in large-field surveys. Furthermore, the tilt in the σ_8-Ω_m degeneracy direction for peaks with respect to that of ξ_± suggests that a combined analysis would yield tighter constraints than either measure alone. As expected, w_0^de cannot be well constrained without a tomographic analysis, but its degeneracy directions with the other two varied parameters are still clear for both peaks and ξ_±.
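Approximate Bayesian computation, used above for likelihood-free inference with CAMELUS, reduces in its simplest form to rejection sampling on a distance between simulated and observed summary statistics. The following is a toy numpy sketch of that basic idea (not the more elaborate ABC algorithm actually used with CAMELUS); the helper names and the Gaussian toy problem are illustrative assumptions.

```python
import numpy as np

def abc_rejection(observed, simulate, prior_draw, distance, n_draws=20000, keep=0.01):
    """Simplest ABC: draw parameters from the prior, simulate a summary
    statistic for each, and keep the draws whose simulated summaries lie
    closest to the observed one."""
    thetas = np.array([prior_draw() for _ in range(n_draws)])
    dists = np.array([distance(simulate(t), observed) for t in thetas])
    eps = np.quantile(dists, keep)   # tolerance set by an acceptance quantile
    return thetas[dists <= eps]      # approximate posterior samples

# Toy example: infer the mean of a Gaussian from its sample mean.
rng = np.random.default_rng(2)
observed = 2.0                                         # observed summary statistic
simulate = lambda mu: rng.normal(mu, 1.0, 100).mean()  # forward model -> summary
prior_draw = lambda: rng.uniform(-5.0, 5.0)            # flat prior on mu
distance = lambda a, b: abs(a - b)
posterior = abc_rejection(observed, simulate, prior_draw, distance)
```

The accepted `posterior` samples concentrate around the true mean of 2; no likelihood is ever evaluated, only the forward simulator, which is what makes the approach viable for peak-count statistics whose likelihood is intractable.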
 Cosmological constraints with weak lensing peak counts and second-order statistics in a large-field survey. Austin Peel, Chieh-An Lin, Francois Lanusse, and 3 more authors. In American Astronomical Society Meeting Abstracts #229, Jan 2017
Peak statistics in weak lensing maps access the non-Gaussian information contained in the large-scale distribution of matter in the Universe. They are therefore a promising complementary probe to two-point and higher-order statistics to constrain our cosmological models. To prepare for the high precision afforded by next-generation weak lensing surveys, we assess the constraining power of peak counts in a simulated Euclid-like survey on the cosmological parameters Ω_m, σ_8, and w_0^de. In particular, we study how CAMELUS, a fast stochastic model for predicting peaks, can be applied to such large surveys. The algorithm avoids the need for time-costly N-body simulations, and its stochastic approach provides full PDF information of observables. We measure the abundance histogram of peaks in a mock shear catalogue of approximately 5,000 deg^2 using a multi-scale mass-map filtering technique, and we then constrain the parameters of the mock survey using CAMELUS combined with approximate Bayesian computation, a robust likelihood-free inference algorithm. We find that peak statistics yield a tight but significantly biased constraint in the σ_8-Ω_m plane, indicating the need to better understand and control the model's systematics before applying it to a real survey of this size or larger. We perform a calibration of the model to remove the bias and compare results to those from the two-point correlation functions (2PCF) measured on the same field. In this case, we find the derived parameter Σ_8 = σ_8(Ω_m/0.27)^α = 0.76_{-0.03}^{+0.02} with α = 0.65 for peaks, while for 2PCF the values are Σ_8 = 0.76_{-0.01}^{+0.02} and α = 0.70. We conclude that the constraining power can therefore be comparable between the two weak lensing observables in large-field surveys. Furthermore, the tilt in the σ_8-Ω_m degeneracy direction for peaks with respect to that of 2PCF suggests that a combined analysis would yield tighter constraints than either measure alone. As expected, w_0^de cannot be well constrained without a tomographic analysis, but its degeneracy directions with the other two varied parameters are still clear for both peaks and 2PCF.
 Deep Generative Models of Galaxy Images for the Calibration of the Next Generation of Weak Lensing Surveys. Francois Lanusse, Siamak Ravanbakhsh, Rachel Mandelbaum, and 2 more authors. In American Astronomical Society Meeting Abstracts #229, Jan 2017
Weak gravitational lensing has long been identified as one of the most powerful probes to investigate the nature of dark energy. As such, weak lensing is at the heart of the next generation of cosmological surveys such as LSST, Euclid or WFIRST. One particularly critical source of systematic errors in these surveys comes from the shape measurement algorithms tasked with estimating galaxy shapes. GREAT3, the last community challenge to assess the quality of state-of-the-art shape measurement algorithms, has in particular demonstrated that all current methods are biased to various degrees and, more importantly, that these biases depend on the details of the galaxy morphologies. These biases can be measured and calibrated by generating mock observations where a known lensing signal has been introduced and comparing the resulting measurements to the ground truth. Producing these mock observations, however, requires input galaxy images of higher resolution and S/N than the simulated survey, which typically implies acquiring extremely expensive space-based observations. The goal of this work is to train a deep generative model on already available Hubble Space Telescope data which can then be used to sample new galaxy images conditioned on parameters such as magnitude, size or redshift and exhibiting complex morphologies. Such a model allows us to inexpensively produce large sets of realistic images for calibration purposes. We implement a conditional generative model based on state-of-the-art deep learning methods and fit it to deep galaxy images from the COSMOS survey. The quality of the model is assessed by computing an extensive set of galaxy morphology statistics on the generated images. Beyond simple second-moment statistics such as size and ellipticity, we apply more complex statistics specifically designed to be sensitive to disturbed galaxy morphologies. We find excellent agreement between the morphologies of real and model-generated galaxies. Our results suggest that such deep generative models represent a reliable alternative to the acquisition of expensive high-quality observations for generating the calibration data needed by the next generation of weak lensing surveys.
2016
 Enabling Dark Energy Science with Deep Generative Models of Galaxy Images. Siamak Ravanbakhsh, Francois Lanusse, Rachel Mandelbaum, and 2 more authors. arXiv e-prints, Sep 2016
Understanding the nature of dark energy, the mysterious force driving the accelerated expansion of the Universe, is a major challenge of modern cosmology. The next generation of cosmological surveys, specifically designed to address this issue, rely on accurate measurements of the apparent shapes of distant galaxies. However, shape measurement methods suffer from various unavoidable biases and therefore will rely on a precise calibration to meet the accuracy requirements of the science analysis. This calibration process remains an open challenge as it requires large sets of high-quality galaxy images. To this end, we study the application of deep conditional generative models in generating realistic galaxy images. In particular, we consider variations on the conditional variational autoencoder and introduce a new adversarial objective for the training of conditional generative networks. Our results suggest a reliable alternative to the acquisition of expensive high-quality observations for generating the calibration data needed by the next generation of cosmological surveys.
 High resolution weak lensing mass mapping combining shear and flexion. F. Lanusse, J. L. Starck, A. Leonard, and 1 more author. Astronomy and Astrophysics, Jun 2016
Aims: We propose a new mass-mapping algorithm, specifically designed to recover small-scale information from a combination of gravitational shear and flexion. Including flexion allows us to supplement the shear on small scales in order to increase the sensitivity to substructures and the overall resolution of the convergence map without relying on strong lensing constraints. Methods: To preserve all available small-scale information, we avoid any binning of the irregularly sampled input shear and flexion fields and treat the mass-mapping problem as a general ill-posed inverse problem, which is regularised using a robust multi-scale wavelet sparsity prior. The resulting algorithm incorporates redshift, reduced shear, and reduced flexion measurements for individual galaxies and is made highly efficient by the use of fast Fourier estimators. Results: We tested our reconstruction method on a set of realistic weak lensing simulations corresponding to typical HST/ACS cluster observations and demonstrate our ability to recover substructures with the inclusion of flexion, which are otherwise lost if only shear information is used. In particular, we can detect substructures on the 15 arcsec scale well outside of the critical region of the clusters. In addition, flexion also helps to constrain the shape of the central regions of the main dark matter halos. Our mass-mapping software, called Glimpse2D, is made freely available at <a href="http://www.cosmostat.org/software/glimpse">http://www.cosmostat.org/software/glimpse</a>.
2015
 3D galaxy clustering with future wide-field surveys: Advantages of a spherical Fourier-Bessel analysis. F. Lanusse, A. Rassat, and J. L. Starck. Astronomy and Astrophysics, Jun 2015
Context. Upcoming spectroscopic galaxy surveys are extremely promising to help in addressing the major challenges of cosmology, in particular in understanding the nature of the dark universe. The strength of these surveys, naturally described in spherical geometry, comes from their unprecedented depth and width, but an optimal extraction of their three-dimensional information is of utmost importance to best constrain the properties of the dark universe. Aims: Although there is theoretical motivation and novel tools to explore these surveys using the 3D spherical Fourier-Bessel (SFB) power spectrum of galaxy number counts C_ℓ(k,k'), most survey optimisations and forecasts are based on the tomographic spherical harmonics power spectrum C^(ij)_ℓ. The goal of this paper is to perform a new investigation of the information that can be extracted from these two analyses in the context of planned stage-IV wide-field galaxy surveys. Methods: We compared tomographic and 3D SFB techniques by comparing the forecast cosmological parameter constraints obtained from a Fisher analysis. The comparison was made possible by careful and coherent treatment of non-linear scales in the two analyses, which makes this study the first to compare 3D SFB and tomographic constraints on an equal footing. Nuisance parameters related to a scale- and redshift-dependent galaxy bias were also included in the computation of the 3D SFB and tomographic power spectra for the first time. Results: Tomographic and 3D SFB methods can recover similar constraints in the absence of systematics. This requires choosing an optimal number of redshift bins for the tomographic analysis, which we computed to be N = 26 for z_med ≃ 0.4, N = 30 for z_med ≃ 1.0, and N = 42 for z_med ≃ 1.7. When marginalising over nuisance parameters related to the galaxy bias, the forecast 3D SFB constraints are less affected by this source of systematics than the tomographic constraints.
In addition, the rate of increase of the figure of merit as a function of median redshift is higher for the 3D SFB method than for the 2D tomographic method. Conclusions: Constraints from the 3D SFB analysis are less sensitive to unavoidable systematics stemming from a redshift- and scale-dependent galaxy bias. Even for surveys that are optimised with tomography in mind, a 3D SFB analysis is more powerful. In addition, for survey optimisation, the figure of merit for the 3D SFB method increases more rapidly with redshift, especially at higher redshifts, suggesting that the 3D SFB method should be preferred for designing and analysing future wide-field spectroscopic surveys. CosmicPy, the Python package developed for this paper, is freely available at <a href="https://cosmicpy.github.io">https://cosmicpy.github.io</a>. Appendices are available in electronic form at <a href="http://www.aanda.org/10.1051/0004-6361/201424456/olm">http://www.aanda.org</a>.
 Weak lensing reconstructions in 2D and 3D: implications for cluster studies. Adrienne Leonard, François Lanusse, and Jean-Luc Starck. Monthly Notices of the Royal Astronomical Society, May 2015
We compare the efficiency with which 2D and 3D weak lensing mass mapping techniques are able to detect clusters of galaxies using two state-of-the-art mass reconstruction techniques: MRLens in 2D and GLIMPSE in 3D. We simulate otherwise-empty cluster fields for 96 different virial mass-redshift combinations spanning the ranges 3 × 10^13 h^-1 M_⊙ ≤ M_vir ≤ 10^15 h^-1 M_⊙ and 0.05 ≤ z_cl ≤ 0.75, and for each generate 1000 realizations of noisy shear data in 2D and 3D. For each field, we then compute the cluster (false) detection rate as the mean number of cluster (false) detections per reconstruction over the sample of 1000 reconstructions. We show that both MRLens and GLIMPSE are effective tools for the detection of clusters from weak lensing measurements, and provide comparable quality reconstructions at low redshift. At high redshift, GLIMPSE reconstructions offer increased sensitivity in the detection of clusters, yielding cluster detection rates up to a factor of ∼10 times those seen in 2D reconstructions using MRLens. We conclude that 3D mass mapping techniques are more efficient for the detection of clusters of galaxies in weak lensing surveys than 2D methods, particularly since 3D reconstructions yield unbiased estimators of both the mass and redshift of the detected clusters directly.
SNIa detection in the SNLS photometric analysis using Morphological Component Analysis. A. Möller, V. Ruhlmann-Kleider, F. Lanusse, and 3 more authors. Journal of Cosmology and Astroparticle Physics, Apr 2015
Detection of supernovae (SNe) and, more generally, of transient events in large surveys can produce numerous false detections. In the case of a deferred processing of survey images, this implies reconstructing complete light curves for all detections, requiring sizable processing time and resources. Optimizing the detection of transient events is thus an important issue for both present and future surveys. We present here the optimization done in the SuperNova Legacy Survey (SNLS) for the 5-year data deferred photometric analysis. In this analysis, detections are derived from stacks of subtracted images, with one stack per lunation. The 3-year analysis provided 300,000 detections dominated by signals of bright objects that were not perfectly subtracted. Allowing these artifacts to be detected leads not only to a waste of resources but also to possible contamination of signal coordinates. We developed a subtracted-image stack treatment to reduce the number of non-SN-like events using morphological component analysis. This technique exploits the morphological diversity of the objects to be detected to extract the signal of interest. At the level of our subtraction stacks, SN-like events are rather circular objects, while most spurious detections exhibit different shapes. A two-step procedure was necessary to obtain a proper evaluation of the noise in the subtracted image stacks and thus a reliable signal extraction. We also set up a new detection strategy to obtain coordinates with good resolution for the extracted signal. SNIa Monte Carlo (MC) generated images were used to study detection efficiency and coordinate resolution. When tested on SNLS 3-year data, this procedure decreases the number of detections by a factor of two, while losing only 10% of SN-like events, almost all of them faint. MC results show that SNIa detection efficiency is equivalent to that of the original method for bright events, while the coordinate resolution is improved.
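The morphological-diversity idea behind this analysis can be illustrated on a toy 1D problem: a signal that is the sum of a smooth component (sparse in the Fourier dictionary) and point-like spikes (sparse in the direct domain), separated by alternating soft thresholding with a decreasing threshold schedule. This is only a minimal sketch of the MCA principle, not the SNLS pipeline; the signal, dictionaries, and threshold schedule are invented for the example.

```python
import numpy as np

def soft(x, t):
    """Soft thresholding, valid for real or complex arrays."""
    mag = np.abs(x)
    return np.where(mag > t, x * (1 - t / np.maximum(mag, 1e-12)), 0)

n = 256
grid = np.arange(n)
smooth = np.cos(2 * np.pi * 4 * grid / n)        # sparse in the Fourier dictionary
spikes = np.zeros(n)
spikes[[40, 90, 200]] = [3.0, -2.5, 2.0]         # sparse in the direct domain
y = smooth + spikes                              # observed mixture

s_hat = np.zeros(n)                              # smooth component estimate
p_hat = np.zeros(n)                              # point-like component estimate
for lam in np.linspace(1.0, 0.05, 20):           # decreasing threshold schedule
    # update smooth part: soft-threshold the residual in the Fourier dictionary
    coeffs = soft(np.fft.rfft(y - p_hat), 200 * lam)
    s_hat = np.fft.irfft(coeffs, n)
    # update point-like part: soft-threshold the residual in the direct domain
    p_hat = soft(y - s_hat, 2 * lam)

print(np.flatnonzero(p_hat).tolist())  # -> [40, 90, 200]
```

Each component is estimated by thresholding the residual left over by the other, so whichever dictionary represents a feature most sparsely claims it, which is exactly how MCA separates circular SN-like events from differently shaped artifacts.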
2014
PRISM: Recovery of the primordial spectrum from Planck data. F. Lanusse, P. Paykari, J. L. Starck, and 3 more authors. Astronomy and Astrophysics, Nov 2014
Aims: The primordial power spectrum describes the initial perturbations that seeded the large-scale structure we observe today. It provides an indirect probe of inflation or other structure-formation mechanisms. In this Letter, we recover the primordial power spectrum from the Planck PR1 dataset, using our recently published algorithm PRISM. Methods: PRISM is a sparsity-based inversion method that aims at recovering features in the primordial power spectrum from the empirical power spectrum of the cosmic microwave background (CMB). This ill-posed inverse problem is regularised using a sparsity prior on features in the primordial power spectrum in a wavelet dictionary. Although this non-parametric method does not assume a strong prior on the shape of the primordial power spectrum, it is able to recover both its general shape and localised features. As a result, this approach presents a reliable way of detecting deviations from the currently favoured scale-invariant spectrum. Results: We applied PRISM to 100 simulated Planck data sets to investigate its performance on Planck-like data. We then applied PRISM to the Planck PR1 power spectrum to recover the primordial power spectrum. We also tested the algorithm's ability to recover a small localised feature at k ∼ 0.125 Mpc^-1, which causes a large dip at ℓ ∼ 1800 in the angular power spectrum. Conclusions: We find no significant departures from the fiducial Planck PR1 near scale-invariant primordial power spectrum with A_s = 2.215 × 10^-9 and n_s = 0.9624.
Cluster identification with 3D weak lensing density reconstructions. Adrienne Leonard, François Lanusse, and Jean-Luc Starck. In Building the Euclid Cluster Survey - Scientific Program, Jul 2014
PRISM: Sparse recovery of the primordial power spectrum. P. Paykari, F. Lanusse, J. L. Starck, and 2 more authors. Astronomy and Astrophysics, Jun 2014
Aims: The primordial power spectrum describes the initial perturbations in the Universe which eventually grew into the large-scale structure we observe today, and thereby provides an indirect probe of inflation or other structure-formation mechanisms. Here, we introduce a new method to estimate this spectrum from the empirical power spectrum of cosmic microwave background maps. Methods: A sparsity-based linear inversion method, named PRISM, is presented. This technique leverages a sparsity prior on features in the primordial power spectrum in a wavelet basis to regularise the inverse problem. This non-parametric approach does not assume a strong prior on the shape of the primordial power spectrum, yet is able to correctly reconstruct its global shape as well as localised features. These advantages make this method robust for detecting deviations from the currently favoured scale-invariant spectrum. Results: We investigate the strength of this method on a set of WMAP nine-year simulated data for three types of primordial power spectra: a near scale-invariant spectrum, a spectrum with a small running of the spectral index, and a spectrum with a localised feature. This technique proves that it can easily detect deviations from a pure scale-invariant power spectrum and is suitable for distinguishing between simple models of inflation. We process the WMAP nine-year data and find no significant departure from a near scale-invariant power spectrum with the spectral index n_s = 0.972. Conclusions: A high-resolution primordial power spectrum can be reconstructed with this technique, where any strong local deviations or small global deviations from a pure scale-invariant spectrum can easily be detected.
GLIMPSE: accurate 3D weak lensing reconstructions using sparsity. Adrienne Leonard, François Lanusse, and Jean-Luc Starck. Monthly Notices of the Royal Astronomical Society, May 2014
We present GLIMPSE - Gravitational Lensing Inversion and MaPping with Sparse Estimators - a new algorithm to generate density reconstructions in three dimensions from photometric weak lensing measurements. This is an extension of earlier work in one dimension aimed at applying compressive sensing theory to the inversion of gravitational lensing measurements to recover 3D density maps. Using the assumption that the density can be represented sparsely in our chosen basis - 2D transverse wavelets and 1D line-of-sight Dirac functions - we show that clusters of galaxies can be identified and accurately localized and characterized using this method. Throughout, we use simulated data consistent with the quality currently attainable in large surveys. We present a thorough statistical analysis of the errors and biases in both the redshifts of detected structures and their amplitudes. The GLIMPSE method is able to produce reconstructions at significantly higher resolution than the input data; in this paper, we show reconstructions with 6 times finer redshift resolution than the shear data. Considering cluster simulations with 0.05 ≤ z_cl ≤ 0.75 and 3 × 10^13 ≤ M_vir ≤ 10^15 h^-1 M_⊙, we show that the redshift extent of detected peaks is typically 1-2 pixels, or Δz ≲ 0.07, and that we are able to recover an unbiased estimator of the redshift of a detected cluster by considering many realizations of the noise. We also recover an accurate estimator of the mass, which is largely unbiased when the redshift is known, and whose bias is constrained to ≲ 5 per cent in the majority of our simulations when the estimated redshift is taken to be the true redshift. This represents a substantial improvement over earlier 3D inversion methods, which showed redshift smearing with a typical standard deviation of σ ∼ 0.2-0.3, a significant damping of the amplitude of the peaks detected, and a bias in the detected redshift.
Combining Probes. Anaïs Rassat, François Lanusse, Donnacha Kirk, and 2 more authors. In Statistical Challenges in 21st Century Cosmology, May 2014
With the advent of wide-field surveys, cosmology has entered a new golden age of data, in which our cosmological model and the nature of the dark universe will be tested with unprecedented accuracy, so that we can strive for high-precision cosmology. Observational probes such as weak lensing, galaxy surveys, and the cosmic microwave background, as well as other observations, will all contribute to these advances. These different probes trace the underlying expansion history and growth of structure in complementary ways and can be combined to extract cosmological parameters as effectively as possible. With future wide-field surveys, observational overlap means these probes will trace the same underlying physical dark matter distribution, and extra care must be taken when combining their information. Consideration of probe combination is a fundamental aspect of cosmostatistics and is important to ensure optimal use of future wide-field surveys.
Density reconstruction from 3D lensing: Application to galaxy clusters. François Lanusse, Adrienne Leonard, and Jean-Luc Starck. In Statistical Challenges in 21st Century Cosmology, May 2014
Using the 3D information provided by photometric or spectroscopic weak lensing surveys, it has become possible in the last few years to address the problem of mapping the matter density contrast in three dimensions from gravitational lensing. We recently proposed a new nonlinear sparsity-based reconstruction method allowing for high-resolution reconstruction of the overdensity. This new technique represents a significant improvement over previous linear methods and opens the way to new applications of 3D weak lensing density reconstruction. In particular, we demonstrate for the first time that reconstructed overdensity maps can be used to detect and characterise galaxy clusters in mass and redshift.
PRISM: Sparse recovery of the primordial spectrum from WMAP9 and Planck datasets. P. Paykari, F. Lanusse, J. L. Starck, and 2 more authors. In Statistical Challenges in 21st Century Cosmology, May 2014
The primordial power spectrum is an indirect probe of inflation or other structure-formation mechanisms. We introduce a new method, named PRISM, to estimate this spectrum from the empirical cosmic microwave background (CMB) power spectrum. This is a sparsity-based inversion method, which leverages a sparsity prior on features in the primordial spectrum in a wavelet dictionary to regularise the inverse problem. This non-parametric approach is able to accurately reconstruct the global shape as well as localised features of the primordial spectrum, and proves to be robust for detecting deviations from the currently favoured scale-invariant spectrum. We investigate the strength of this method on a set of WMAP nine-year simulated data for three types of primordial spectra and then process the WMAP nine-year data as well as the Planck PR1 data. We find no significant departures from a near scale-invariant spectrum.
2013
Imaging dark matter using sparsity. François Lanusse, Adrienne Leonard, and Jean-Luc Starck. In Wavelets and Sparsity XV, Sep 2013
We present an application of sparse regularization of ill-posed linear inverse problems to the reconstruction of the 3D distribution of dark matter in the Universe. By its very nature, dark matter cannot be directly observed; nevertheless, it can be studied through its gravitational effects. In particular, the presence of dark matter induces small deformations in the shapes of background galaxies, a phenomenon known as weak gravitational lensing. However, reconstructing the 3D distribution of dark matter from tomographic lensing measurements amounts to solving an ill-posed linear inverse problem. Considering that the 3D dark matter density is sparse in an appropriate wavelet-based 3D dictionary, we propose an iterative thresholding algorithm to solve a penalized least-squares problem. We present our results on simulated dark matter halos and compare them to state-of-the-art linear reconstruction techniques. We show that, thanks to our 3D sparsity constraint, the quality of the reconstructed maps can be greatly improved.
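Penalized least-squares problems with an l1 sparsity term, like the one described above, are commonly solved by iterative soft thresholding (ISTA). A minimal sketch follows, using a random Gaussian forward operator and the identity as the sparsifying dictionary in place of the paper's lensing operator and 3D wavelet dictionary (both stand-ins are purely illustrative):

```python
import numpy as np

def ista(A, y, lam, n_iter=500):
    """Iterative soft thresholding for min_x 0.5*||A x - y||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2                 # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - A.T @ (A @ x - y) / L             # gradient step on the data term
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-thresholding step
    return x

rng = np.random.default_rng(1)
n, p = 60, 100                                    # under-determined: fewer data than unknowns
A = rng.standard_normal((n, p)) / np.sqrt(n)      # toy forward operator
x_true = np.zeros(p)
support = [7, 23, 42, 65, 88]
x_true[support] = [3.0, -2.0, 4.0, -5.0, 2.5]     # sparse ground truth
y = A @ x_true + 0.01 * rng.standard_normal(n)    # noisy measurements

x_hat = ista(A, y, lam=0.02)
print(sorted(np.argsort(-np.abs(x_hat))[:5].tolist()))  # -> [7, 23, 42, 65, 88]
```

Despite having only 60 measurements for 100 unknowns, the sparsity penalty pins the solution to the correct support, which is the same mechanism that lets a sparse 3D dictionary regularize the lensing inversion.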
3D sparse representations on the sphere and applications in astronomy. François Lanusse and Jean-Luc Starck. In Wavelets and Sparsity XV, Sep 2013
We present several 3D sparse decompositions based on wavelets on the sphere that are useful for different kinds of data sets, such as regular 3D spherical measurements (r, θ, φ) and multichannel spherical measurements (λ, θ, φ). We show how these new decompositions can be used for astronomical data denoising and deconvolution when the data are contaminated by Gaussian and Poisson noise.
2012
Spherical 3D isotropic wavelets. F. Lanusse, A. Rassat, and J. L. Starck. Astronomy and Astrophysics, Apr 2012
Context: Future cosmological surveys will provide 3D large-scale structure maps with large sky coverage, for which a 3D spherical Fourier-Bessel (SFB) analysis in spherical coordinates is natural. Wavelets are particularly well-suited to the analysis and denoising of cosmological data, but a spherical 3D isotropic wavelet transform does not currently exist to analyse spherical 3D data. Aims: The aim of this paper is to present a new formalism for a spherical 3D isotropic wavelet, i.e. one based on the SFB decomposition of a 3D field, and to accompany the formalism with a public code to perform wavelet transforms. Methods: We describe a new 3D isotropic spherical wavelet decomposition based on the undecimated wavelet transform (UWT) described in Starck et al. (2006). We also present a new fast discrete spherical Fourier-Bessel transform (DSFBT) based on both a discrete Bessel transform and the HEALPix angular pixelisation scheme. We test the 3D wavelet transform and, as a toy application, apply a denoising algorithm in wavelet space to the Virgo large box cosmological simulations, finding that we can successfully remove noise without much loss to the large-scale structure. Results: We have described a new spherical 3D isotropic wavelet transform, ideally suited to analyse and denoise future 3D spherical cosmological surveys, which uses a novel DSFBT. We illustrate its potential use for denoising using a toy model. All the algorithms presented in this paper are available for download as a public code called MRS3D at http://jstarck.free.fr/mrs3d.html
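The undecimated isotropic wavelet transform underlying this construction can be illustrated in 1D with the à trous B3-spline scheme: smooth the data repeatedly with an increasingly dilated kernel, keep the differences between successive smoothings as wavelet scales, threshold those scales, and sum everything back. This is a 1D Cartesian sketch of the principle only - the paper's transform acts on the SFB coefficients of spherical 3D data - and the test signal, number of scales, and 3σ threshold are illustrative choices.

```python
import numpy as np

def atrous_smooth(c, j):
    """One B3-spline smoothing pass with holes of size 2**j (cyclic boundaries)."""
    h = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
    out = np.zeros_like(c)
    for k, hk in zip((-2, -1, 0, 1, 2), h):
        out += hk * np.roll(c, k * (1 << j))
    return out

def uwt_denoise(x, n_scales=4, k=3.0):
    """Undecimated isotropic wavelet denoising with per-scale MAD thresholds."""
    c = x.copy()
    details = []
    for j in range(n_scales):
        c_next = atrous_smooth(c, j)
        w = c - c_next                          # wavelet (detail) coefficients at scale j
        sigma = np.median(np.abs(w)) / 0.6745   # robust per-scale noise estimate
        w[np.abs(w) < k * sigma] = 0.0          # hard thresholding
        details.append(w)
        c = c_next
    return c + sum(details)                     # exact reconstruction identity

rng = np.random.default_rng(2)
t = np.linspace(0.0, 1.0, 512)
clean = np.sin(2 * np.pi * 3 * t) + 0.5 * np.sin(2 * np.pi * 7 * t)
noisy = clean + 0.3 * rng.standard_normal(512)
denoised = uwt_denoise(noisy)
print(np.mean((denoised - clean) ** 2) < np.mean((noisy - clean) ** 2))  # -> True
```

Because the transform is undecimated, reconstruction is the plain sum of the coarse array and all detail scales, which is what makes simple thresholding in wavelet space a well-behaved denoiser.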