Publications
Automatically generated publication list from NASA's ADS service, powered by jekyll-scholar.
2024
- The Multimodal Universe: Enabling Large-Scale Machine Learning with 100TB of Astronomical Scientific Data. The Multimodal Universe Collaboration, Jeroen Audenaert, Micah Bowles, and 26 more authors. arXiv e-prints, Dec 2024.
We present the MULTIMODAL UNIVERSE, a large-scale multimodal dataset of scientific astronomical data, compiled specifically to facilitate machine learning research. Overall, the MULTIMODAL UNIVERSE contains hundreds of millions of astronomical observations, constituting 100 TB of multi-channel and hyper-spectral images, spectra, multivariate time series, as well as a wide variety of associated scientific measurements and “metadata”. In addition, we include a range of benchmark tasks representative of standard practices for machine learning methods in astrophysics. This massive dataset will enable the development of large multi-modal models specifically targeted towards scientific applications. All code used to compile the MULTIMODAL UNIVERSE, together with a description of how to access the data, is available at https://github.com/MultimodalUniverse/MultimodalUniverse
- Geometric deep learning for galaxy-halo connection: a case study for galaxy intrinsic alignments. Yesukhei Jagvaral, Francois Lanusse, and Rachel Mandelbaum. arXiv e-prints, Sep 2024.
Forthcoming cosmological imaging surveys, such as the Rubin Observatory LSST, require large-scale simulations encompassing realistic galaxy populations for a variety of scientific applications. Of particular concern is the phenomenon of intrinsic alignments (IA), whereby galaxies orient themselves towards overdensities, potentially introducing significant systematic biases in weak gravitational lensing analyses if they are not properly modeled. Due to computational constraints, simulating the intricate details of galaxy formation and evolution relevant to IA across vast volumes is impractical. As an alternative, we propose a Deep Generative Model trained on the IllustrisTNG-100 simulation to sample 3D galaxy shapes and orientations to accurately reproduce intrinsic alignments along with correlated scalar features. We model the cosmic web as a set of graphs, each graph representing a halo with nodes representing the subhalos/galaxies. The architecture consists of an SO(3) × R^n diffusion generative model, for galaxy orientations and n scalars, implemented with E(3)-equivariant Graph Neural Networks that explicitly respect the Euclidean symmetries of our Universe. The model is able to learn and predict features such as galaxy orientations that are statistically consistent with the reference simulation. Notably, our model demonstrates the ability to jointly model Euclidean-valued scalars (galaxy sizes, shapes, and colors) along with non-Euclidean-valued SO(3) quantities (galaxy orientations) that are governed by highly complex galactic physics at non-linear scales.
- Simulation-Based Inference Benchmark for LSST Weak Lensing Cosmology. Justine Zeghal, Denise Lanzieri, François Lanusse, and 5 more authors. arXiv e-prints, Sep 2024.
Standard cosmological analysis, which relies on two-point statistics, fails to extract the full information of the data. This limits our ability to constrain cosmological parameters with precision. Thus, recent years have seen a paradigm shift from analytical likelihood-based to simulation-based inference. However, such methods require a large number of costly simulations. We focus on full-field inference, considered the optimal form of inference. Our objective is to benchmark several ways of conducting full-field inference to gain insight into the number of simulations required for each method. We make a distinction between explicit and implicit full-field inference. Moreover, as it is crucial for explicit full-field inference to use a differentiable forward model, we aim to discuss the advantages of having this property for the implicit approach. We use the sbi_lens package, which provides a fast and differentiable log-normal forward model. This forward model enables us to compare explicit and implicit full-field inference with and without gradient. The former is achieved by sampling the forward model with the No-U-Turn sampler. The latter starts by compressing the data into sufficient statistics and uses the Neural Likelihood Estimation algorithm and the one augmented with gradient. We perform a full-field analysis on LSST Y10-like weak lensing simulated mass maps. We show that explicit and implicit full-field inference yield consistent constraints. Explicit inference requires 630 000 simulations with our particular sampler, corresponding to 400 independent samples. Implicit inference requires a maximum of 101 000 simulations, split into 100 000 simulations to build sufficient statistics (this number is not fine-tuned) and 1 000 simulations to perform inference. Additionally, we show that our way of exploiting the gradients does not significantly help implicit inference.
- Teaching dark matter simulations to speak the halo language. Shivam Pandey, Francois Lanusse, Chirag Modi, and 1 more author. arXiv e-prints, Sep 2024.
We develop a transformer-based conditional generative model for discrete point objects and their properties. We use it to build a model for populating cosmological simulations with gravitationally collapsed structures called dark matter halos. Specifically, we condition our model on the dark matter distribution obtained from fast, approximate simulations to recover the correct three-dimensional positions and masses of individual halos. This leads to a first model that can recover the statistical properties of the halos at small scales to better than the 3% level using an accelerated dark matter simulation. This trained model can then be applied to simulations with significantly larger volumes which would otherwise be computationally prohibitive with traditional simulations, and also provides a crucial missing link in making end-to-end differentiable cosmological simulations. The code, named GOTHAM (Generative cOnditional Transformer for Halo's Auto-regressive Modeling), is publicly available at https://github.com/shivampcosmo/GOTHAM.
- The cosmological analysis of X-ray cluster surveys: VI. Inference based on analytically simulated observable diagrams. M. Kosiba, N. Cerardi, M. Pierre, and 4 more authors. arXiv e-prints, Sep 2024.
The number density of galaxy clusters across mass and redshift has been established as a powerful cosmological probe. Cosmological analyses with galaxy clusters traditionally employ scaling relations. However, many challenges arise from this approach as the scaling relations are highly scattered, may be ill-calibrated, depend on the cosmology, and contain many nuisance parameters with low physical significance. In this paper, we use a simulation-based inference method utilizing artificial neural networks to optimally extract cosmological information from a shallow X-ray survey of galaxy clusters, solely using count rates (CR), hardness ratios (HR), and redshifts. This procedure enables us to conduct likelihood-free inference of the cosmological parameters Ω_m and σ_8. We analytically generate simulations of the galaxy cluster distribution in CR-HR space in multiple redshift bins, based on totally random combinations of cosmological and scaling relation parameters. We train Convolutional Neural Networks (CNNs) to retrieve the cosmological parameters from these simulations. We then use neural density estimation (NDE) neural networks to predict the posterior probability distribution of Ω_m and σ_8 given an input galaxy cluster sample. The 1σ errors of our density estimator on one of the target testing simulations are, for 1000 deg^2: 15.2% for Ω_m and 10.0% for σ_8; for 10000 deg^2: 9.6% for Ω_m and 5.6% for σ_8. We also compare our results with a Fisher analysis. We demonstrate, as a proof of concept, that it is possible to calculate cosmological predictions of Ω_m and σ_8 from a galaxy cluster population without explicitly computing cluster masses or even the scaling relation coefficients, thus avoiding potential biases resulting from such a procedure. [abridged]
- Optimal Neural Summarisation for Full-Field Weak Lensing Cosmological Implicit Inference. Denise Lanzieri, Justine Zeghal, T. Lucas Makinen, and 3 more authors. arXiv e-prints, Jul 2024.
Traditionally, weak lensing cosmological surveys have been analyzed using summary statistics motivated by their analytically tractable likelihoods, or by their ability to access higher-order information, at the cost of requiring Simulation-Based Inference (SBI) approaches. While informative, these statistics are neither designed nor guaranteed to be statistically sufficient. With the rise of deep learning, it becomes possible to create summary statistics optimized to extract the full data information. We compare different neural summarization strategies proposed in the weak lensing literature, to assess which loss functions lead to theoretically optimal summary statistics to perform full-field inference. In doing so, we aim to provide guidelines and insights to the community to help guide future neural-based inference analyses. We design an experimental setup to isolate the impact of the loss function used to train neural networks. We have developed the sbi_lens JAX package, which implements an automatically differentiable lognormal wCDM LSST-Y10 weak lensing simulator. The explicit full-field posterior obtained using a Hamiltonian Monte Carlo sampler gives us a ground truth against which to compare different compression strategies. We provide theoretical insight into the loss functions used in the literature and show that some do not necessarily lead to sufficient statistics (e.g. Mean Square Error (MSE)), while those motivated by information theory (e.g. Variational Mutual Information Maximization (VMIM)) can. Our numerical experiments confirm these insights and show, in our simulated wCDM scenario, that the Figure of Merit (FoM) of an analysis using neural summaries optimized under VMIM achieves 100% of the reference Ω_c-σ_8 full-field FoM, while an analysis using neural summaries trained under MSE achieves only 81% of the same reference FoM.
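The practical difference between the compression losses compared in this paper can be illustrated with a short sketch. Below is a minimal, hypothetical JAX example of a VMIM-style objective, in which a summary network t(x) is trained jointly with a simple diagonal-Gaussian conditional density q(θ | t(x)); all function names, parameter dictionaries, and shapes are illustrative placeholders, not the sbi_lens API.

```python
# Minimal sketch of a VMIM-style compression loss (hypothetical, not the sbi_lens API).
# A summary network t(x) and a diagonal-Gaussian conditional density q(theta | t(x))
# are trained jointly by minimizing -E[log q(theta | t(x))].
import jax
import jax.numpy as jnp

def summary_net(params, x):
    """Toy linear compressor standing in for a CNN: x -> summary s."""
    return x @ params["W"] + params["b"]

def gaussian_head(params, s):
    """Map a summary s to the mean and log-std of q(theta | s)."""
    h = jnp.tanh(s @ params["W1"] + params["b1"])
    mu = h @ params["W_mu"] + params["b_mu"]
    log_sigma = h @ params["W_sig"] + params["b_sig"]
    return mu, log_sigma

def vmim_loss(params, x_batch, theta_batch):
    """Negative log-likelihood of the true parameters under q(theta | t(x))."""
    s = jax.vmap(lambda x: summary_net(params["compressor"], x))(x_batch)
    mu, log_sigma = jax.vmap(lambda si: gaussian_head(params["density"], si))(s)
    log_prob = -0.5 * jnp.sum(
        ((theta_batch - mu) / jnp.exp(log_sigma)) ** 2
        + 2.0 * log_sigma + jnp.log(2.0 * jnp.pi), axis=-1)
    return -jnp.mean(log_prob)  # minimizing this maximizes a mutual-information bound

# An MSE-trained compressor would instead minimize jnp.mean((theta - mu)**2),
# which targets only the posterior mean and need not yield sufficient statistics.
```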
- AstroCLIP: a cross-modal foundation model for galaxies. Liam Parker, Francois Lanusse, Siavash Golkar, and 13 more authors. Monthly Notices of the Royal Astronomical Society, Jul 2024.
We present AstroCLIP, a single, versatile model that can embed both galaxy images and spectra into a shared, physically meaningful latent space. These embeddings can then be used - without any model fine-tuning - for a variety of downstream tasks including (1) accurate in-modality and cross-modality semantic similarity search, (2) photometric redshift estimation, (3) galaxy property estimation from both images and spectra, and (4) morphology classification. Our approach to implementing AstroCLIP consists of two parts. First, we embed galaxy images and spectra separately by pre-training separate transformer-based image and spectrum encoders in self-supervised settings. We then align the encoders using a contrastive loss. We apply our method to spectra from the Dark Energy Spectroscopic Instrument and images from its corresponding Legacy Imaging Survey. Overall, we find remarkable performance on all downstream tasks, even relative to supervised baselines. For example, for a task like photometric redshift prediction, we find similar performance to a specifically trained ResNet18, and for additional tasks like physical property estimation (stellar mass, age, metallicity, and specific star-formation rate), we beat this supervised baseline by 19 per cent in terms of R^2. We also compare our results with a state-of-the-art self-supervised single-modal model for galaxy images, and find that our approach outperforms this benchmark by roughly a factor of two on photometric redshift estimation and physical property prediction in terms of R^2, while remaining roughly in line in terms of morphology classification. Ultimately, our approach represents the first cross-modal self-supervised model for galaxies, and the first self-supervised transformer-based architectures for galaxy images and spectra.
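The cross-modal alignment step described above amounts to a standard symmetric contrastive (CLIP-style) objective between matched image and spectrum embeddings. The NumPy sketch below illustrates that generic loss under assumed array shapes and temperature; it is not the AstroCLIP training code.

```python
# Symmetric contrastive (InfoNCE / CLIP-style) loss between two sets of embeddings.
# Illustrative sketch only; shapes and the temperature value are assumptions.
import numpy as np

def log_softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

def clip_loss(img_emb, spec_emb, temperature=0.07):
    """img_emb, spec_emb: (N, D) embeddings of N matched image/spectrum pairs."""
    # L2-normalize so the dot product is a cosine similarity.
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    spec = spec_emb / np.linalg.norm(spec_emb, axis=1, keepdims=True)
    logits = img @ spec.T / temperature   # (N, N) similarity matrix
    idx = np.arange(len(img))             # matched pairs lie on the diagonal
    # Cross-entropy in both directions (image -> spectrum and spectrum -> image).
    loss_i2s = -log_softmax(logits)[idx, idx].mean()
    loss_s2i = -log_softmax(logits.T)[idx, idx].mean()
    return 0.5 * (loss_i2s + loss_s2i)
```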
- Detecting galaxy tidal features using self-supervised representation learning. Alice Desmons, Sarah Brough, and Francois Lanusse. Monthly Notices of the Royal Astronomical Society, Jul 2024.
Low surface brightness substructures around galaxies, known as tidal features, are a valuable tool in the detection of past or ongoing galaxy mergers, and their properties can answer questions about the progenitor galaxies involved in the interactions. The assembly of current tidal feature samples is primarily achieved using visual classification, making it difficult to construct large samples and draw accurate and statistically robust conclusions about the galaxy evolution process. With upcoming large optical imaging surveys such as the Vera C. Rubin Observatory’s Legacy Survey of Space and Time, predicted to observe billions of galaxies, it is imperative that we refine our methods of detecting and classifying samples of merging galaxies. This paper presents promising results from a self-supervised machine learning model, trained on data from the Ultradeep layer of the Hyper Suprime-Cam Subaru Strategic Program optical imaging survey, designed to automate the detection of tidal features. We find that self-supervised models are capable of detecting tidal features, and that our model outperforms previous automated tidal feature detection methods, including a fully supervised model. An earlier method applied to real galaxy images achieved 76 per cent completeness for 22 per cent contamination, while our model achieves considerably higher (96 per cent) completeness for the same level of contamination. We emphasize a number of advantages of self-supervised models over fully supervised models including maintaining excellent performance when using only 50 labelled examples for training, and the ability to perform similarity searches using a single example of a galaxy with tidal features.
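The similarity search mentioned at the end of this abstract reduces to a nearest-neighbour query in the learned representation space. A minimal sketch of that idea is given below; the embedding arrays and the query are hypothetical placeholders, not the paper's pipeline.

```python
# Minimal sketch of the similarity-search idea: rank galaxies by cosine similarity
# to a single query embedding. The embedding arrays are placeholders.
import numpy as np

def most_similar(query_emb, catalog_embs, k=10):
    """Return indices of the k catalogue entries closest to the query embedding."""
    q = query_emb / np.linalg.norm(query_emb)
    cat = catalog_embs / np.linalg.norm(catalog_embs, axis=1, keepdims=True)
    sims = cat @ q                # cosine similarities
    return np.argsort(-sims)[:k]  # highest similarity first

# Usage: pass the embedding of one galaxy with tidal features to retrieve candidates.
```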
- An Empirical Model For Intrinsic Alignments: Insights From Cosmological Simulations. Nicholas Van Alfen, Duncan Campbell, Jonathan Blazek, and 5 more authors. The Open Journal of Astrophysics, Jun 2024.
We extend current models of the halo occupation distribution (HOD) to include a flexible, empirical framework for the forward modeling of the intrinsic alignment (IA) of galaxies. A primary goal of this work is to produce mock galaxy catalogs for the purpose of validating existing models and methods for the mitigation of IA in weak lensing measurements. This technique can also be used to produce new, simulation-based predictions for IA and galaxy clustering. Our model is probabilistically formulated, and rests upon the assumption that the orientations of galaxies exhibit a correlation with their host dark matter (sub)halo orientation or with their position within the halo. We examine the necessary components and phenomenology of such a model by considering the alignments between (sub)halos in a cosmological dark-matter-only simulation. We then validate this model for a realistic galaxy population in a set of simulations in the Illustris-TNG suite. We create an HOD mock with Illustris-like correlations using our method, constraining the associated IA model parameters; the agreement between our model's correlations and those of Illustris is as close as 1.4 and 1.1 for the orientation-position and orientation-orientation correlation functions, respectively. By modeling the misalignment between galaxies and their host halo, we show that the 3-dimensional two-point position and orientation correlation functions of simulated (sub)halos and galaxies can be accurately reproduced from quasi-linear scales down into the highly nonlinear regime. We also find evidence for environmental influence on IA within a halo. Our publicly available software provides a key component enabling efficient determination of Bayesian posteriors on IA model parameters using observational measurements of galaxy-orientation correlation functions in the highly nonlinear regime.
- Learning Diffusion Priors from Observations by Expectation Maximization. François Rozet, Gérôme Andry, François Lanusse, and 1 more author. arXiv e-prints, May 2024.
Diffusion models recently proved to be remarkable priors for Bayesian inverse problems. However, training these models typically requires access to large amounts of clean data, which could prove difficult in some settings. In this work, we present a novel method based on the expectation-maximization algorithm for training diffusion models from incomplete and noisy observations only. Unlike previous works, our method leads to proper diffusion models, which is crucial for downstream tasks. As part of our method, we propose and motivate an improved posterior sampling scheme for unconditional diffusion models. We present empirical evidence supporting the effectiveness of our method.
- Differentiable stochastic halo occupation distribution. Benjamin Horowitz, ChangHoon Hahn, Francois Lanusse, and 2 more authors. Monthly Notices of the Royal Astronomical Society, Apr 2024.
In this work, we demonstrate how differentiable stochastic sampling techniques developed in the context of deep reinforcement learning can be used to perform efficient parameter inference over stochastic, simulation-based, forward models. As a particular example, we focus on the problem of estimating parameters of halo occupation distribution (HOD) models that are used to connect galaxies with their dark matter haloes. Using a combination of continuous relaxation and gradient re-parametrization techniques, we can obtain well-defined gradients with respect to HOD parameters through discrete galaxy catalogue realizations. Having access to these gradients allows us to leverage efficient sampling schemes, such as Hamiltonian Monte Carlo, and greatly speed up parameter inference. We demonstrate our technique on a mock galaxy catalogue generated from the Bolshoi simulation using a standard HOD model and find near-identical posteriors as standard Markov chain Monte Carlo techniques with an increase of ~8x in convergence efficiency. Our differentiable HOD model also has broad applications in full forward model approaches to cosmic structure and cosmological analysis.
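The continuous-relaxation trick referred to above can be illustrated with a relaxed Bernoulli (Gumbel-sigmoid) draw for central-galaxy occupation, through which gradients with respect to the HOD parameters flow. This is a generic sketch of the technique under assumed parameter names, not the paper's implementation.

```python
# Sketch of a differentiable stochastic occupation draw using a relaxed Bernoulli
# (Gumbel-sigmoid). Generic illustration of continuous relaxation, not the paper's code.
import jax
import jax.numpy as jnp
from jax.scipy.special import erf

def mean_central_occupation(log_mhalo, log_mmin, sigma_logm):
    """Standard HOD mean central occupation: 0.5 * (1 + erf((logM - logMmin) / sigma))."""
    return 0.5 * (1.0 + erf((log_mhalo - log_mmin) / sigma_logm))

def relaxed_occupation(key, log_mhalo, log_mmin, sigma_logm, temperature=0.1):
    """Differentiable 'soft' Bernoulli sample of the central occupation."""
    p = jnp.clip(mean_central_occupation(log_mhalo, log_mmin, sigma_logm), 1e-6, 1 - 1e-6)
    u = jax.random.uniform(key, log_mhalo.shape, minval=1e-6, maxval=1 - 1e-6)
    logistic_noise = jnp.log(u) - jnp.log1p(-u)
    return jax.nn.sigmoid((jnp.log(p) - jnp.log1p(-p) + logistic_noise) / temperature)

# Gradients of any catalogue summary w.r.t. (log_mmin, sigma_logm) can now be taken
# with jax.grad, which is what enables HMC-style inference over HOD parameters.
key = jax.random.PRNGKey(0)
log_mhalo = jnp.linspace(11.0, 14.0, 8)
grad_fn = jax.grad(lambda lm: relaxed_occupation(key, log_mhalo, lm, 0.3).sum())
print(grad_fn(12.0))
```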
- Differentiable Cosmological Simulation with the Adjoint Method. Yin Li, Chirag Modi, Drew Jamieson, and 5 more authors. Astrophysical Journal Supplement, Feb 2024.
Rapid advances in deep learning have brought not only a myriad of powerful neural networks, but also breakthroughs that benefit established scientific research. In particular, automatic differentiation (AD) tools and computational accelerators like GPUs have facilitated forward modeling of the Universe with differentiable simulations. Based on analytic or automatic backpropagation, current differentiable cosmological simulations are limited by memory, and thus are subject to a trade-off between time and space/mass resolution, usually sacrificing both. We present a new approach free of such constraints, using the adjoint method and reverse time integration. It enables larger and more accurate forward modeling at the field level, and will improve gradient-based optimization and inference. We implement it in an open-source particle-mesh (PM) N-body library pmwd (PM with derivatives). Based on the powerful AD system JAX, pmwd is fully differentiable, and is highly performant on GPUs.
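As background for the adjoint approach described above, gradients are obtained by integrating an adjoint state backwards in time alongside the reconstructed trajectory, rather than storing the forward history. In the generic continuous-time form (our notation, not the pmwd paper's), for a state z(t) with dynamics dz/dt = f(z, t, θ) and a scalar objective L(z(T)):

```latex
% Continuous-time adjoint relations in their generic textbook form:
% the state z(t) evolves forward, the adjoint a(t) is integrated backwards from t = T.
\frac{\mathrm{d}z}{\mathrm{d}t} = f(z, t, \theta), \qquad
a(t) \equiv \frac{\partial L}{\partial z(t)}, \qquad
a(T) = \frac{\partial L}{\partial z(T)}, \qquad
\frac{\mathrm{d}a}{\mathrm{d}t} = -\,a(t)^{\top}\frac{\partial f}{\partial z}, \qquad
\frac{\mathrm{d}L}{\mathrm{d}\theta} = \int_{t_0}^{T} a(t)^{\top}\frac{\partial f}{\partial \theta}\,\mathrm{d}t .
```

Because only the current state and adjoint need to be held in memory, the memory cost no longer grows with the number of time steps, which is the constraint the paper removes.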
2023
- Unified framework for diffusion generative models in SO(3): applications in computer vision and astrophysics. Yesukhei Jagvaral, Francois Lanusse, and Rachel Mandelbaum. arXiv e-prints, Dec 2023.
Diffusion-based generative models represent the current state-of-the-art for image generation. However, standard diffusion models are based on Euclidean geometry and do not translate directly to manifold-valued data. In this work, we develop extensions of both score-based generative models (SGMs) and Denoising Diffusion Probabilistic Models (DDPMs) to the Lie group of 3D rotations, SO(3). SO(3) is of particular interest in many disciplines such as robotics, biochemistry, and astronomy/cosmology. Contrary to more general Riemannian manifolds, SO(3) admits a tractable solution to heat diffusion, and allows us to implement efficient training of diffusion models. We apply both SO(3) DDPMs and SGMs to synthetic densities on SO(3) and demonstrate state-of-the-art results. Additionally, we demonstrate the practicality of our model on pose estimation tasks and in predicting correlated galaxy orientations for astrophysics/cosmology.
- Forecasting the power of higher order weak-lensing statistics with automatically differentiable simulations. Denise Lanzieri, François Lanusse, Chirag Modi, and 4 more authors. Astronomy and Astrophysics, Nov 2023.
Aims: We present the fully differentiable physical Differentiable Lensing Lightcone (DLL) model, designed for use as a forward model in Bayesian inference algorithms that require access to derivatives of lensing observables with respect to cosmological parameters. Methods: We extended the public FlowPM N-body code, a particle-mesh N-body solver, while simulating the lensing lightcones and implementing the Born approximation in the TensorFlow framework. Furthermore, DLL is aimed at achieving high accuracy with low computational costs. As such, it integrates a novel hybrid physical-neural (HPN) parameterization that is able to compensate for the small-scale approximations resulting from particle-mesh schemes for cosmological N-body simulations. We validated our simulations in the context of the Vera C. Rubin Observatory's Legacy Survey of Space and Time (LSST) against high-resolution κTNG-Dark simulations by comparing both the lensing angular power spectrum and multiscale peak counts. We demonstrated its ability to recover the lensing C_ℓ to 10% accuracy at ℓ = 1000 for sources at a redshift of 1, with as few as ∼0.6 particles per Mpc/h. As a first use case, we applied this tool to an investigation of the relative constraining power of the angular power spectrum and peak counts statistic in an LSST setting. Such comparisons are typically very costly as they require a large number of simulations and do not scale appropriately with an increasing number of cosmological parameters. As opposed to forecasts based on finite differences, these statistics can be analytically differentiated with respect to cosmology, or any systematics included in the simulations, at the same computational cost as the forward simulation. Results: We find that the peak counts outperform the power spectrum in terms of the cold dark matter parameter, Ω_c, as well as on the amplitude of density fluctuations, σ_8, and the amplitude of the intrinsic alignment signal, A_IA.
- Multiple Physics Pretraining for Physical Surrogate Models. Michael McCabe, Bruno Régaldo-Saint Blancard, Liam Holden Parker, and 11 more authors. arXiv e-prints, Oct 2023.
We introduce multiple physics pretraining (MPP), an autoregressive task-agnostic pretraining approach for physical surrogate modeling. MPP involves training large surrogate models to predict the dynamics of multiple heterogeneous physical systems simultaneously by learning features that are broadly useful across diverse physical tasks. In order to learn effectively in this setting, we introduce a shared embedding and normalization strategy that projects the fields of multiple systems into a single shared embedding space. We validate the efficacy of our approach on both pretraining and downstream tasks over a broad fluid mechanics-oriented benchmark. We show that a single MPP-pretrained transformer is able to match or outperform task-specific baselines on all pretraining sub-tasks without the need for finetuning. For downstream tasks, we demonstrate that finetuning MPP-trained models results in more accurate predictions across multiple time-steps on new physics compared to training from scratch or finetuning pretrained video foundation models. We open-source our code and model weights trained at multiple scales for reproducibility and community experimentation.
- xVal: A Continuous Number Encoding for Large Language Models. Siavash Golkar, Mariel Pettee, Michael Eickenberg, and 11 more authors. arXiv e-prints, Oct 2023.
Large Language Models have not yet been broadly adapted for the analysis of scientific datasets due in part to the unique difficulties of tokenizing numbers. We propose xVal, a numerical encoding scheme that represents any real number using just a single token. xVal represents a given real number by scaling a dedicated embedding vector by the number value. Combined with a modified number-inference approach, this strategy renders the model end-to-end continuous when considered as a map from the numbers of the input string to those of the output string. This leads to an inductive bias that is generally more suitable for applications in scientific domains. We empirically evaluate our proposal on a number of synthetic and real-world datasets. Compared with existing number encoding schemes, we find that xVal is more token-efficient and demonstrates improved generalization.
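The encoding described above can be sketched in a few lines: each number in the input is replaced by a shared [NUM] token whose embedding is multiplied by the (rescaled) numerical value, and a scalar "number head" reads a value back out of the model's output state. The toy sketch below uses assumed shapes, scaling, and an untrained linear head; it is an illustration of the idea, not the released xVal code.

```python
# Toy sketch of the xVal idea: one shared [NUM] embedding, scaled by the number's value.
# Shapes, normalization, and the number head are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d_model = 16
num_embedding = rng.normal(size=d_model)   # single learnable [NUM] embedding
w_number_head = rng.normal(size=d_model)   # linear head that decodes a scalar value

def encode_number(value, scale=100.0):
    """Embed a real number by scaling the shared [NUM] token embedding."""
    return (value / scale) * num_embedding  # the value enters multiplicatively

def decode_number(hidden_state, scale=100.0):
    """Read a continuous value back out of the model's output state."""
    return scale * float(hidden_state @ w_number_head)

x = encode_number(3.72)
print(x.shape, decode_number(x))  # value -> embedding -> value is a continuous map
```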
- Detecting Tidal Features using Self-Supervised Learning. Alice Desmons, Sarah Brough, and Francois Lanusse. In Machine Learning for Astrophysics, Jul 2023.
Low surface brightness substructures around galaxies, known as tidal features, are a valuable tool in the detection of past or ongoing galaxy mergers, and their properties can answer questions about the progenitor galaxies involved in the interactions. The assembly of current tidal feature samples is primarily achieved using visual classification, making it difficult to construct large samples and draw accurate and statistically robust conclusions about the galaxy evolution process. With upcoming large optical imaging surveys such as the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST), predicted to observe billions of galaxies, it is imperative that we refine our methods of detecting and classifying samples of merging galaxies. This paper presents promising results from a self-supervised machine learning model, trained on data from the Ultradeep layer of the Hyper Suprime-Cam Subaru Strategic Program optical imaging survey, designed to automate the detection of tidal features. We find that self-supervised models are capable of detecting tidal features, and that our model outperforms previous automated tidal feature detection methods, including a fully supervised model. An earlier method applied to real galaxy images achieved 76% completeness for 22% contamination, while our model achieves considerably higher (96%) completeness for the same level of contamination. We emphasise a number of advantages of self-supervised models over fully supervised models including maintaining excellent performance when using only 50 labelled examples for training, and the ability to perform similarity searches using a single example of a galaxy with tidal features.
- JAX-COSMO: An End-to-End Differentiable and GPU Accelerated Cosmology Library. Jean-Eric Campagne, François Lanusse, Joe Zuntz, and 7 more authors. The Open Journal of Astrophysics, Apr 2023.
We present jax-cosmo, a library for automatically differentiable cosmological theory calculations. It uses the JAX library, which has created a new coding ecosystem, especially in probabilistic programming. As well as batch acceleration, just-in-time compilation, and automatic optimization of code for different hardware modalities (CPU, GPU, TPU), JAX exposes an automatic differentiation (autodiff) mechanism. Thanks to autodiff, jax-cosmo gives access to the derivatives of cosmological likelihoods with respect to any of their parameters, and thus enables a range of powerful Bayesian inference algorithms, otherwise impractical in cosmology, such as Hamiltonian Monte Carlo and Variational Inference. In its initial release, jax-cosmo implements background evolution, linear and non-linear power spectra (using halofit or the Eisenstein and Hu transfer function), as well as angular power spectra with the Limber approximation for galaxy and weak lensing probes, all differentiable with respect to the cosmological parameters and their other inputs. We illustrate how autodiff can be a game-changer for common tasks involving Fisher matrix computations, or full posterior inference with gradient-based techniques. In particular, we show how Fisher matrices are now fast, exact, no longer require any fine-tuning, and are themselves differentiable. Finally, using a Dark Energy Survey Year 1 3x2pt analysis as a benchmark, we demonstrate how jax-cosmo can be combined with Probabilistic Programming Languages to perform posterior inference with state-of-the-art algorithms including a No U-Turn Sampler, Automatic Differentiation Variational Inference, and Neural Transport HMC. We further demonstrate that Normalizing Flows using Neural Transport are a promising methodology for model validation in the early stages of analysis.
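The Fisher-matrix point above is easy to illustrate: once the log-likelihood is a differentiable JAX function of the cosmological parameters, the Fisher matrix is (minus) its Hessian at the fiducial point, with no finite-difference step sizes to tune. The snippet below uses a toy Gaussian likelihood and a made-up two-parameter model as stand-ins, not the actual jax-cosmo API.

```python
# Fisher matrix as the negative Hessian of a differentiable log-likelihood.
# The toy 'model' and Gaussian likelihood are stand-ins; jax-cosmo would supply the
# real angular power spectra, and the same two lines of autodiff would apply.
import jax
import jax.numpy as jnp

def model(params):
    """Toy observable vector as a function of two 'cosmological' parameters."""
    omega_m, sigma_8 = params
    return jnp.array([omega_m + sigma_8, 3.0 * omega_m * sigma_8])

fiducial = jnp.array([0.3, 0.8])
data = model(fiducial)                            # evaluate at the fiducial cosmology
cov = jnp.array([[0.01, 0.0], [0.0, 0.04]])       # assumed data covariance

def log_likelihood(params):
    r = data - model(params)
    return -0.5 * r @ jnp.linalg.solve(cov, r)

fisher = -jax.hessian(log_likelihood)(fiducial)   # exact derivatives, no step-size tuning
print(fisher)
```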
- Probabilistic mass-mapping with neural score estimation. B. Remy, F. Lanusse, N. Jeffrey, and 4 more authors. Astronomy and Astrophysics, Apr 2023.
Context: Weak lensing mass-mapping is a useful tool for accessing the full distribution of dark matter on the sky, but because of intrinsic galaxy ellipticities, finite fields, and missing data, the recovery of dark matter maps constitutes a challenging, ill-posed inverse problem. Aims: We introduce a novel methodology that enables the efficient sampling of the high-dimensional Bayesian posterior of the weak lensing mass-mapping problem, relying on simulations to define a fully non-Gaussian prior. We aim to demonstrate the accuracy of the method on simulated fields, and then proceed to apply it to the mass reconstruction of the HST/ACS COSMOS field. Methods: The proposed methodology combines elements of Bayesian statistics, analytic theory, and a recent class of deep generative models based on neural score matching. This approach allows us to make full use of analytic cosmological theory to constrain the 2pt statistics of the solution, to understand any differences between this analytic prior and full cosmological simulations, and to obtain samples from the full Bayesian posterior of the problem for robust uncertainty quantification. Results: We demonstrate the method on the κTNG simulations and find that the posterior mean significantly outperforms previous methods (Kaiser-Squires, Wiener filter, sparsity priors) both for the root-mean-square error and in terms of the Pearson correlation. We further illustrate the interpretability of the recovered posterior by establishing a close correlation between posterior convergence values and the S/N of clusters artificially introduced into a field. Finally, we apply the method to the reconstruction of the HST/ACS COSMOS field, which yields the highest-quality convergence map of this field to date. Conclusions: We find the proposed approach to be superior to previous algorithms: it is scalable, provides uncertainties, and uses a fully non-Gaussian prior. All codes and data products associated with this paper are available at https://github.com/CosmoStat/jax-lensing.
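The posterior sampling machinery described above combines a learned prior score with an analytic data-likelihood score inside an annealed Langevin-type update. The heavily simplified sketch below shows one such update; `prior_score_net` is a hypothetical stand-in for the trained score network, and the forward operator is a plain masked identity with Gaussian noise rather than the full lensing model.

```python
# Simplified annealed-Langevin update for score-based posterior sampling.
# 'prior_score_net' is a hypothetical placeholder for a trained denoising score model;
# the likelihood assumes data = mask * x + Gaussian noise, unlike the real shear operator.
import numpy as np

def prior_score_net(x, sigma):
    """Placeholder learned score of the noise-convolved prior, grad_x log p_sigma(x)."""
    return -x / (1.0 + sigma**2)  # score of a unit Gaussian prior, for illustration only

def langevin_step(x, data, mask, noise_std, sigma, step, rng):
    # Gradient of the Gaussian data likelihood log p(data | x).
    likelihood_score = mask * (data - mask * x) / noise_std**2
    score = prior_score_net(x, sigma) + likelihood_score
    return x + step * score + np.sqrt(2.0 * step) * rng.normal(size=x.shape)

rng = np.random.default_rng(1)
x = rng.normal(size=(64, 64))                               # convergence-map-sized array
data = np.where(rng.uniform(size=x.shape) > 0.3, x, 0.0)    # toy masked "observation"
mask = (data != 0).astype(float)
for sigma in np.geomspace(1.0, 0.01, 10):                   # anneal the prior noise level
    x = langevin_step(x, data, mask, noise_std=0.1, sigma=sigma, step=1e-3, rng=rng)
```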
- The N5K Challenge: Non-Limber Integration for LSST Cosmology. C. Danielle Leonard, Tassia Ferreira, Xiao Fang, and 8 more authors. The Open Journal of Astrophysics, Feb 2023.
The rapidly increasing statistical power of cosmological imaging surveys requires us to reassess the regime of validity for various approximations that accelerate the calculation of relevant theoretical predictions. In this paper, we present the results of the 'N5K non-Limber integration challenge', the goal of which was to quantify the performance of different approaches to calculating the angular power spectrum of galaxy number counts and cosmic shear data without invoking the so-called 'Limber approximation', in the context of the Rubin Observatory Legacy Survey of Space and Time (LSST). We quantify the performance, in terms of accuracy and speed, of three non-Limber implementations: FKEM (CosmoLike), Levin, and matter, themselves based on different integration schemes and approximations. We find that in the challenge's fiducial 3x2pt LSST Year 10 scenario, FKEM (CosmoLike) produces the fastest run time within the required accuracy by a considerable margin, positioning it favourably for use in Bayesian parameter inference. This method, however, requires further development and testing to extend its use to certain analysis scenarios, particularly those involving a scale-dependent growth rate. For this and other reasons discussed herein, alternative approaches such as matter and Levin may be necessary for a full exploration of parameter space. We also find that the usual first-order Limber approximation is insufficiently accurate for LSST Year 10 3x2pt analysis on ℓ = 200-1000, whereas invoking the second-order Limber approximation on these scales (with a full non-Limber method at smaller ℓ) does suffice.
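For context, the first-order Limber approximation being tested replaces the exact double line-of-sight integral for the angular power spectrum with a single integral over comoving distance, evaluating the 3D power spectrum at k = (ℓ + 1/2)/χ. In standard notation (not specific to any of the N5K codes), for two tracers with radial kernels W_1 and W_2:

```latex
% First-order Limber approximation for the angular power spectrum of two tracers.
C_\ell \approx \int \mathrm{d}\chi \, \frac{W_1(\chi)\, W_2(\chi)}{\chi^2}\,
P\!\left(k = \frac{\ell + 1/2}{\chi},\; z(\chi)\right)
```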
- The Dawes Review 10: The impact of deep learning for the analysis of galaxy surveys. M. Huertas-Company and F. Lanusse. Publications of the Astronomical Society of Australia, Jan 2023.
The amount and complexity of data delivered by modern galaxy surveys has been steadily increasing over the past years. New facilities will soon provide imaging and spectra of hundreds of millions of galaxies. Extracting coherent scientific information from these large and multi-modal data sets remains an open issue for the community and data-driven approaches such as deep learning have rapidly emerged as a potentially powerful solution to some long-lasting challenges. This enthusiasm is reflected in an unprecedented exponential growth of publications using neural networks, which have gone from a handful of works in 2015 to an average of one paper per week in 2021 in the area of galaxy surveys. Half a decade after the first published work in astronomy mentioning deep learning, and shortly before new big data sets such as Euclid and LSST start becoming available, we believe it is timely to review what has been the real impact of this new technology in the field and its potential to solve key challenges raised by the size and complexity of the new datasets. The purpose of this review is thus two-fold. We first aim at summarising, in a common document, the main applications of deep learning for galaxy surveys that have emerged so far. We then extract the major achievements and lessons learned and highlight key open questions and limitations, which in our opinion, will require particular attention in the coming years. Overall, state-of-the-art deep learning methods are rapidly adopted by the astronomical community, reflecting a democratisation of these methods. This review shows that the majority of works using deep learning to date are oriented to computer vision tasks (e.g. classification, segmentation). This is also the domain of application where deep learning has brought the most important breakthroughs so far. However, we also report that the applications are becoming more diverse and deep learning is used for estimating galaxy properties, identifying outliers or constraining the cosmological model. Most of these works remain at the exploratory level, though, which could partially explain the limited impact in terms of citations. Some common challenges will most likely need to be addressed before moving to the next phase of massive deployment of deep learning in the processing of future surveys; for example, uncertainty quantification, interpretability, data labelling and domain shift issues from training with simulations, which constitutes a common practice in astronomy.
2022
- Modeling halo and central galaxy orientations on the SO(3) manifold with score-based generative models. Yesukhei Jagvaral, Rachel Mandelbaum, and Francois Lanusse. arXiv e-prints, Dec 2022.
Upcoming cosmological weak lensing surveys are expected to constrain cosmological parameters with unprecedented precision. In preparation for these surveys, large simulations with realistic galaxy populations are required to test and validate analysis pipelines. However, these simulations are computationally very costly – and at the volumes and resolutions demanded by upcoming cosmological surveys, they are computationally infeasible. Here, we propose a Deep Generative Modeling approach to address the specific problem of emulating realistic 3D galaxy orientations in synthetic catalogs. For this purpose, we develop a novel Score-Based Diffusion Model specifically for the SO(3) manifold. The model accurately learns and reproduces correlated orientations of galaxies and dark matter halos that are statistically consistent with those of a reference high-resolution hydrodynamical simulation.
- pmwd: A Differentiable Cosmological Particle-Mesh N-body Library. Yin Li, Libin Lu, Chirag Modi, and 7 more authors. arXiv e-prints, Nov 2022.
The formation of the large-scale structure, the evolution and distribution of galaxies, quasars, and dark matter on cosmological scales, requires numerical simulations. Differentiable simulations provide gradients of the cosmological parameters, which can accelerate the extraction of physical information from statistical analyses of observational data. The deep learning revolution has brought not only myriad powerful neural networks, but also breakthroughs including automatic differentiation (AD) tools and computational accelerators like GPUs, facilitating forward modeling of the Universe with differentiable simulations. Because AD needs to save the whole forward evolution history to backpropagate gradients, current differentiable cosmological simulations are limited by memory. Using the adjoint method, with reverse time integration to reconstruct the evolution history, we develop a differentiable cosmological particle-mesh (PM) simulation library, pmwd (particle-mesh with derivatives), with a low memory cost. Based on the powerful AD library JAX, pmwd is fully differentiable, and is highly performant on GPUs.
- Towards solving model bias in cosmic shear forward modeling. Benjamin Remy, Francois Lanusse, and Jean-Luc Starck. arXiv e-prints, Oct 2022.
As the volume and quality of modern galaxy surveys increase, so does the difficulty of measuring the cosmological signal imprinted in galaxy shapes. Weak gravitational lensing sourced by the most massive structures in the Universe generates a slight shearing of galaxy morphologies called cosmic shear, a key probe for cosmological models. Modern techniques of shear estimation based on statistics of ellipticity measurements suffer from the fact that the ellipticity is not a well-defined quantity for arbitrary galaxy light profiles, biasing the shear estimation. We show that a hybrid physical and deep learning Hierarchical Bayesian Model, where a generative model captures the galaxy morphology, enables us to recover an unbiased estimate of the shear on realistic galaxies, thus solving the model bias.
- Galaxies and haloes on graph neural networks: Deep generative modelling scalar and vector quantities for intrinsic alignment. Yesukhei Jagvaral, François Lanusse, Sukhdeep Singh, and 3 more authors. Monthly Notices of the Royal Astronomical Society, Oct 2022.
In order to prepare for the upcoming wide-field cosmological surveys, large simulations of the Universe with realistic galaxy populations are required. In particular, the tendency of galaxies to naturally align towards overdensities, an effect called intrinsic alignments (IA), can be a major source of systematics in weak lensing analysis. As the details of galaxy formation and evolution relevant to IA cannot be simulated in practice on such volumes, we propose as an alternative a Deep Generative Model. This model is trained on the IllustrisTNG-100 simulation and is capable of sampling the orientations of a population of galaxies so as to recover the correct alignments. In our approach, we model the cosmic web as a set of graphs, where the graphs are constructed for each halo, and galaxy orientations as a signal on those graphs. The generative model is implemented on a Generative Adversarial Network architecture and uses specifically designed Graph-Convolutional Networks sensitive to the relative 3D positions of the vertices. Given (sub)halo masses and tidal fields, the model is able to learn and predict scalar features such as galaxy and dark matter subhalo shapes; and more importantly, vector features such as the 3D orientation of the major axis of the ellipsoid and the complex 2D ellipticities. For correlations of 3D orientations the model is in good quantitative agreement with the measured values from the simulation, except at very small and transition scales. For correlations of 2D ellipticities, the model is in good quantitative agreement with the measured values from the simulation on all scales. Additionally, the model is able to capture the dependence of IA on mass, morphological type, and central/satellite type.
- Bayesian uncertainty quantification for machine-learned models in physics. Yarin Gal, Petros Koumoutsakos, Francois Lanusse, and 2 more authors. Nature Reviews Physics, Sep 2022.
Being able to quantify uncertainty when comparing a theoretical or computational model to observations is critical to conducting a sound scientific investigation. With the rise of data-driven modelling, understanding various sources of uncertainty and developing methods to estimate them has gained renewed attention. Five researchers discuss uncertainty quantification in machine-learned models with an emphasis on issues relevant to physics problems.
- From Data to Software to Science with the Rubin Observatory LSST. Katelyn Breivik, Andrew J. Connolly, K. E. Saavik Ford, and 97 more authors. arXiv e-prints, Aug 2022.
The Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST) dataset will dramatically alter our understanding of the Universe, from the origins of the Solar System to the nature of dark matter and dark energy. Much of this research will depend on the existence of robust, tested, and scalable algorithms, software, and services. Identifying and developing such tools ahead of time has the potential to significantly accelerate the delivery of early science from LSST. Developing these collaboratively, and making them broadly available, can enable more inclusive and equitable collaboration on LSST science. To facilitate such opportunities, a community workshop entitled “From Data to Software to Science with the Rubin Observatory LSST” was organized by the LSST Interdisciplinary Network for Collaboration and Computing (LINCC) and partners, and held at the Flatiron Institute in New York, March 28-30th 2022. The workshop included over 50 in-person attendees invited from over 300 applications. It identified seven key software areas of need: (i) scalable cross-matching and distributed joining of catalogs, (ii) robust photometric redshift determination, (iii) software for determination of selection functions, (iv) frameworks for scalable time-series analyses, (v) services for image access and reprocessing at scale, (vi) object image access (cutouts) and analysis at scale, and (vii) scalable job execution systems. This white paper summarizes the discussions of this workshop. It considers the motivating science use cases, identified cross-cutting algorithms, software, and services, their high-level technical specifications, and the principles of inclusive collaborations needed to develop them. We provide it as a useful roadmap of needs, as well as to spur action and collaboration between groups and individuals looking to develop reusable software for early LSST science.
- Hybrid Physical-Neural ODEs for Fast N-body Simulations. Denise Lanzieri, Francois Lanusse, and Jean-Luc Starck. In Machine Learning for Astrophysics, Jul 2022.
We present a new scheme to compensate for the small-scale approximations resulting from Particle-Mesh (PM) schemes for cosmological N-body simulations. These simulations are fast, low-cost realizations of the large-scale structure, but lack resolution on small scales. To improve their accuracy, we introduce an additional effective force within the differential equations of the simulation, parameterized by a Fourier-space Neural Network acting on the PM-estimated gravitational potential. We compare the resulting matter power spectrum to the one obtained with the PGD (potential gradient descent) scheme. We notice a similar improvement in terms of the power spectrum, but we find that our approach outperforms PGD for the cross-correlation coefficients, and is more robust to changes in simulation settings (different resolutions, different cosmologies).
- Neural Posterior Estimation with Differentiable Simulator. Justine Zeghal, Francois Lanusse, Alexandre Boucaud, and 2 more authors. In Machine Learning for Astrophysics, Jul 2022.
Simulation-Based Inference (SBI) is a promising Bayesian inference framework that alleviates the need for analytic likelihoods to estimate posterior distributions. Recent advances using neural density estimators in SBI algorithms have demonstrated the ability to achieve high-fidelity posteriors, at the expense of a large number of simulations, which makes their application potentially very time-consuming when using complex physical simulations. In this work we focus on boosting the sample efficiency of posterior density estimation using the gradients of the simulator. We present a new method to perform Neural Posterior Estimation (NPE) with a differentiable simulator. We demonstrate how gradient information helps constrain the shape of the posterior and improves sample efficiency.
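A rough sketch of how simulator gradients can enter an NPE-style objective: the usual negative log-posterior term is augmented with a penalty matching the parameter-score of the learned posterior to the joint score supplied by the differentiable simulator (valid because grad_theta log p(theta | x) = grad_theta log p(theta, x)). This is a schematic paraphrase under assumed notation and callables, not the paper's exact loss.

```python
# Schematic NPE loss augmented with simulator gradients (assumed notation, not the
# paper's exact objective). 'log_q' is a conditional density estimator q(theta | x);
# 'joint_score' is grad_theta log p(theta, x) obtained from the differentiable simulator.
import jax
import jax.numpy as jnp

def npe_loss_with_gradients(flow_params, log_q, theta, x, joint_score, lam=1.0):
    """Negative log posterior plus a score-matching penalty using simulator gradients."""
    nll = -jnp.mean(jax.vmap(lambda t, xi: log_q(flow_params, t, xi))(theta, x))
    # Score of the learned posterior with respect to the parameters.
    q_score = jax.vmap(jax.grad(lambda t, xi: log_q(flow_params, t, xi)),
                       in_axes=(0, 0))(theta, x)
    penalty = jnp.mean(jnp.sum((q_score - joint_score) ** 2, axis=-1))
    return nll + lam * penalty
```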
- Galaxies on graph neural networks: towards robust synthetic galaxy catalogs with deep generative models. Yesukhei Jagvaral, Rachel Mandelbaum, Francois Lanusse, and 3 more authors. In Machine Learning for Astrophysics, Jul 2022.
Future astronomical imaging surveys are set to provide precise constraints on cosmological parameters, such as dark energy. However, the production of synthetic data for these surveys, to test and validate analysis methods, suffers from a very high computational cost. In particular, generating mock galaxy catalogs at sufficiently large volume and high resolution will soon become computationally unreachable. In this paper, we address this problem with a Deep Generative Model to create robust mock galaxy catalogs that may be used to test and develop the analysis pipelines of future weak lensing surveys. We build our model on custom-built Graph Convolutional Networks, by placing each galaxy on a graph node and then connecting the graphs within each gravitationally bound system. We train our model on a cosmological simulation with realistic galaxy populations to capture the 2D and 3D orientations of galaxies. The samples from the model exhibit comparable statistical properties to those in the simulations. To the best of our knowledge, this is the first instance of a generative model on graphs in an astrophysical/cosmological context.
- ShapeNet: Shape constraint for galaxy image deconvolution. F. Nammour, U. Akhaury, J. N. Girard, and 4 more authors. Astronomy and Astrophysics, Jul 2022.
Deep learning (DL) has shown remarkable results in solving inverse problems in various domains. In particular, the Tikhonet approach is very powerful in deconvolving optical astronomical images. However, this approach only uses the ℓ_2 loss, which does not guarantee the preservation of physical information (e.g., flux and shape) of the object that is reconstructed in the image. A new loss function has been proposed in the framework of sparse deconvolution that better preserves the shape of galaxies and reduces the pixel error. In this paper, we extend the Tikhonet approach to take this shape constraint into account and apply our new DL method, called ShapeNet, to a simulated optical and radio-interferometry dataset. The originality of the paper lies in i) the shape constraint we use in the neural network framework, ii) the application of DL to radio-interferometry image deconvolution for the first time, and iii) the generation of a simulated radio dataset that we make available for the community. A range of examples illustrates the results.
- Rubin-Euclid Derived Data Products: Initial Recommendations. Leanne P. Guy, Jean-Charles Cuillandre, Etienne Bachelet, and 118 more authors. Zenodo, id. 5836022, Jan 2022.
This report is the result of a joint discussion between the Rubin and Euclid scientific communities. The work presented in this report was focused on designing and recommending an initial set of Derived Data products (DDPs) that could realize the science goals enabled by joint processing. All interested Rubin and Euclid data rights holders were invited to contribute via an online discussion forum and a series of virtual meetings. Strong interest in enhancing science with joint DDPs emerged from across a wide range of astrophysical domains: Solar System, the Galaxy, the Local Volume, from the nearby to the primaeval Universe, and cosmology.
- Validating Synthetic Galaxy Catalogs for Dark Energy Science in the LSST Era. Eve Kovacs, Yao-Yuan Mao, Michel Aguena, and 36 more authors. The Open Journal of Astrophysics, Jan 2022.
Large simulation efforts are required to provide synthetic galaxy catalogs for ongoing and upcoming cosmology surveys. These extragalactic catalogs are being used for many diverse purposes covering a wide range of scientific topics. In order to be useful, they must offer realistically complex information about the galaxies they contain. Hence, it is critical to implement a rigorous validation procedure that ensures that the simulated galaxy properties faithfully capture observations and delivers an assessment of the level of realism attained by the catalog. We present here a suite of validation tests that have been developed by the Rubin Observatory Legacy Survey of Space and Time (LSST) Dark Energy Science Collaboration (DESC). We discuss how the inclusion of each test is driven by the scientific targets for static ground-based dark energy science and by the availability of suitable validation data. The validation criteria that are used to assess the performance of a catalog are flexible and depend on the science goals. We illustrate the utility of this suite by showing examples for the validation of cosmoDC2, the extragalactic catalog recently released for the LSST DESC second Data Challenge.
- Euclid preparation. XIII. Forecasts for galaxy morphology with the Euclid Survey using deep generative models. Euclid Collaboration, H. Bretonnière, M. Huertas-Company, and 194 more authors. Astronomy and Astrophysics, Jan 2022.
We present a machine learning framework to simulate realistic galaxies for the Euclid Survey, producing more complex and realistic galaxies than the analytical simulations currently used in Euclid. The proposed method combines a control on galaxy shape parameters offered by analytic models with realistic surface brightness distributions learned from real Hubble Space Telescope observations by deep generative models. We simulate a galaxy field of 0.4 deg^2 as it will be seen by the Euclid visible imager VIS, and we show that galaxy structural parameters are recovered to an accuracy similar to that for pure analytic Sérsic profiles. Based on these simulations, we estimate that the Euclid Wide Survey (EWS) will be able to resolve the internal morphological structure of galaxies down to a surface brightness of 22.5 mag arcsec^-2, and the Euclid Deep Survey (EDS) down to 24.9 mag arcsec^-2. This corresponds to approximately 250 million galaxies at the end of the mission and a 50% complete sample for stellar masses above 10^10.6 M_⊙ (resp. 10^9.6 M_⊙) at a redshift z ∼ 0.5 for the EWS (resp. EDS). The approach presented in this work can contribute to improving the preparation of future high-precision cosmological imaging surveys by allowing simulations to incorporate more realistic galaxies.
2021
- Anomaly detection in Hyper Suprime-Cam galaxy images with generative adversarial networks. Kate Storey-Fisher, Marc Huertas-Company, Nesar Ramachandra, and 5 more authors. Monthly Notices of the Royal Astronomical Society, Dec 2021.
The problem of anomaly detection in astronomical surveys is becoming increasingly important as data sets grow in size. We present the results of an unsupervised anomaly detection method using a Wasserstein generative adversarial network (WGAN) on nearly one million optical galaxy images in the Hyper Suprime-Cam (HSC) survey. The WGAN learns to generate realistic HSC-like galaxies that follow the distribution of the data set; anomalous images are defined based on a poor reconstruction by the generator and outlying features learned by the discriminator. We find that the discriminator is more attuned to potentially interesting anomalies compared to the generator, and compared to a simpler autoencoder-based anomaly detection approach, so we use the discriminator-selected images to construct a high-anomaly sample of ∼13 000 objects. We propose a new approach to further characterize these anomalous images: we use a convolutional autoencoder to reduce the dimensionality of the residual differences between the real and WGAN-reconstructed images and perform UMAP clustering on these. We report detected anomalies of interest including galaxy mergers, tidal features, and extreme star-forming galaxies. A follow-up spectroscopic analysis of one of these anomalies is detailed in the Appendix; we find that it is an unusual system most likely to be a metal-poor dwarf galaxy with an extremely blue, higher-metallicity H II region. We have released a catalogue with the WGAN anomaly scores; the code and catalogue are available at https://github.com/kstoreyf/anomalies-GAN-HSC; and our interactive visualization tool for exploring the clustered data is at https://weirdgalaxi.es.
- The LSST-DESC 3x2pt Tomography Optimization Challenge. Joe Zuntz, François Lanusse, Alex I. Malz, and 26 more authors. The Open Journal of Astrophysics, Oct 2021.
This paper presents the results of the Rubin Observatory Dark Energy Science Collaboration (DESC) 3x2pt tomography challenge, which served as a first step toward optimizing the tomographic binning strategy for the main DESC analysis. The task of choosing an optimal tomographic binning scheme for a photometric survey is made particularly delicate in the context of a metacalibrated lensing catalogue, as only the photometry from the bands included in the metacalibration process (usually riz and potentially g) can be used in sample definition. The goal of the challenge was to collect and compare bin assignment strategies under various metrics of a standard 3x2pt cosmology analysis in a highly idealized setting to establish a baseline for realistically complex follow-up studies; in this preliminary study, we used two sets of cosmological simulations of galaxy redshifts and photometry under a simple noise model neglecting photometric outliers and variation in observing conditions, and contributed algorithms were provided with a representative and complete training set. We review and evaluate the entries to the challenge, finding that even from this limited photometry information, multiple algorithms can separate tomographic bins reasonably well, reaching figures-of-merit scores close to the attainable maximum. We further find that adding the g band to riz photometry improves metric performance by ∼15% and that the optimal bin assignment strategy depends strongly on the science case: which figure-of-merit is to be optimized, and which observables (clustering, lensing, or both) are included.
- FlowPM: Distributed TensorFlow implementation of the FastPM cosmological N-body solver. C. Modi, F. Lanusse, and U. Seljak. Astronomy and Computing, Oct 2021.
We present FlowPM, a Particle-Mesh (PM) cosmological N-body code implemented in Mesh-TensorFlow for GPU-accelerated, distributed, and differentiable simulations. We implement and validate the accuracy of a novel multi-grid scheme based on multiresolution pyramids to compute large-scale forces efficiently on distributed platforms. We explore the scaling of the simulation on large-scale supercomputers and compare it with the corresponding Python-based PM code, finding an average 10x speed-up in terms of wall-clock time. We also demonstrate how this novel tool can be used for efficiently solving large-scale cosmological inference problems, in particular the reconstruction of cosmological fields in a forward-model Bayesian framework with a hybrid PM and neural network forward model. We provide skeleton code for these examples, and the entire code is publicly available at https://github.com/modichirag/flowpm
- Dark Energy Survey Year 3 results: Curved-sky weak lensing mass map reconstructionN. Jeffrey, M. Gatti, C. Chang, and 129 more authorsMonthly Notices of the Royal Astronomical Society Aug 2021
We present reconstructed convergence maps, mass maps, from the Dark Energy Survey (DES) third year (Y3) weak gravitational lensing data set. The mass maps are weighted projections of the density field (primarily dark matter) in the foreground of the observed galaxies. We use four reconstruction methods, each of which is a maximum a posteriori estimate with a different model for the prior probability of the map: Kaiser-Squires, null B-mode prior, Gaussian prior, and a sparsity prior. All methods are implemented on the celestial sphere to accommodate the large sky coverage of the DES Y3 data. We compare the methods using realistic ΛCDM simulations with mock data that are closely matched to the DES Y3 data. We quantify the performance of the methods at the map level and then apply the reconstruction methods to the DES Y3 data, performing tests for systematic error effects. The maps are compared with optical foreground cosmic-web structures and are used to evaluate the lensing signal from cosmic-void profiles. The recovered dark matter map covers the largest sky fraction of any galaxy weak lensing map to date.
- Adaptive wavelet distillation from neural networks through interpretationsWooseok Ha, Chandan Singh, Francois Lanusse, and 2 more authorsarXiv e-prints Jul 2021
Recent deep-learning models have achieved impressive prediction performance, but often sacrifice interpretability and computational efficiency. Interpretability is crucial in many disciplines, such as science and medicine, where models must be carefully vetted or where interpretation is the goal itself. Moreover, interpretable models are concise and often yield computational efficiency. Here, we propose adaptive wavelet distillation (AWD), a method which aims to distill information from a trained neural network into a wavelet transform. Specifically, AWD penalizes feature attributions of a neural network in the wavelet domain to learn an effective multi-resolution wavelet transform. The resulting model is highly predictive, concise, computationally efficient, and has properties (such as a multi-scale structure) which make it easy to interpret. In close collaboration with domain experts, we showcase how AWD addresses challenges in two real-world settings: cosmological parameter inference and molecular-partner prediction. In both cases, AWD yields a scientifically interpretable and concise model which gives predictive performance better than state-of-the-art neural networks. Moreover, AWD identifies predictive features that are scientifically meaningful in the context of respective domains. All code and models are released in a full-fledged package available on Github (https://github.com/Yu-Group/adaptive-wavelets).
- Deep generative models for galaxy image simulationsFrançois Lanusse, Rachel Mandelbaum, Siamak Ravanbakhsh, and 3 more authorsMonthly Notices of the Royal Astronomical Society Jul 2021
Image simulations are essential tools for preparing and validating the analysis of current and future wide-field optical surveys. However, the galaxy models used as the basis for these simulations are typically limited to simple parametric light profiles, or use a fairly limited amount of available space-based data. In this work, we propose a methodology based on deep generative models to create complex models of galaxy morphologies that may meet the image simulation needs of upcoming surveys. We address the technical challenges associated with learning this morphology model from noisy and point spread function (PSF)-convolved images by building a hybrid Deep Learning/physical Bayesian hierarchical model for observed images, explicitly accounting for the PSF and noise properties. The generative model is further made conditional on physical galaxy parameters, to allow for sampling new light profiles from specific galaxy populations. We demonstrate our ability to train and sample from such a model on galaxy postage stamps from the HST/ACS COSMOS survey, and validate the quality of the model using a range of second- and higher-order morphology statistics. Using this set of statistics, we demonstrate significantly more realistic morphologies using these deep generative models compared to conventional parametric models. To help make these generative models practical tools for the community, we introduce GALSIM-HUB, a community-driven repository of generative models, and a framework for incorporating generative models within the GALSIM image simulation software.
- Real-time Likelihood-free Inference of Roman Binary Microlensing Events with Amortized Neural Posterior EstimationKeming Zhang, Joshua S. Bloom, B. Scott Gaudi, and 3 more authorsAstronomical Journal Jun 2021
Fast and automated inference of binary-lens, single-source (2L1S) microlensing events with sampling-based Bayesian algorithms (e.g., Markov Chain Monte Carlo, MCMC) is challenged on two fronts: the high computational cost of likelihood evaluations with microlensing simulation codes, and a pathological parameter space where the negative-log-likelihood surface can contain a multitude of local minima that are narrow and deep. Analysis of 2L1S events usually involves grid searches over some parameters to locate approximate solutions as a prerequisite to posterior sampling, an expensive process that often requires human-in-the-loop domain expertise. As the next-generation, space-based microlensing survey with the Roman Space Telescope is expected to yield thousands of binary microlensing events, a new fast and automated method is desirable. Here, we present a likelihood-free inference approach named amortized neural posterior estimation, where a neural density estimator (NDE) learns a surrogate posterior p̂(θ|x) as an observation-parameterized conditional probability distribution, from pre-computed simulations over the full prior space. Trained on 291,012 simulated Roman-like 2L1S events, the NDE produces accurate and precise posteriors within seconds for any observation within the prior support without requiring a domain expert in the loop, thus allowing for real-time and automated inference. We show that the NDE also captures expected posterior degeneracies. The NDE posterior could then be refined into the exact posterior with a downstream MCMC sampler with minimal burn-in steps.
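A minimal sketch of the amortized-inference idea follows: a network is trained once on simulator outputs to map an observation to a surrogate posterior, after which inference for any new observation is a single forward pass. For simplicity the surrogate here is a diagonal Gaussian rather than the flow-based NDE used in the paper, and the two-parameter toy simulator is an assumption purely for illustration.

```python
import numpy as np
import tensorflow as tf

# Toy simulator: parameters theta -> observations x (stand-in for 2L1S light curves)
rng = np.random.default_rng(2)
theta = rng.uniform(-1, 1, size=(50000, 2)).astype("float32")
x = np.column_stack([theta[:, 0] + theta[:, 1] ** 2,
                     theta[:, 0] * theta[:, 1]]).astype("float32")
x += 0.05 * rng.normal(size=x.shape).astype("float32")

# Minimal neural density estimator: a network mapping x to the mean and
# log-variance of a diagonal Gaussian surrogate posterior p_hat(theta | x).
inputs = tf.keras.Input(shape=(2,))
h = tf.keras.layers.Dense(128, activation="relu")(inputs)
h = tf.keras.layers.Dense(128, activation="relu")(h)
mu = tf.keras.layers.Dense(2)(h)
log_var = tf.keras.layers.Dense(2)(h)
model = tf.keras.Model(inputs, [mu, log_var])

opt = tf.keras.optimizers.Adam(1e-3)
dataset = tf.data.Dataset.from_tensor_slices((x, theta)).shuffle(10000).batch(256)
for epoch in range(5):
    for xb, tb in dataset:
        with tf.GradientTape() as tape:
            mu_b, log_var_b = model(xb, training=True)
            # Negative log-likelihood of the parameters under the Gaussian surrogate
            loss = tf.reduce_mean(0.5 * tf.reduce_sum(
                log_var_b + (tb - mu_b) ** 2 / tf.exp(log_var_b), axis=-1))
        grads = tape.gradient(loss, model.trainable_variables)
        opt.apply_gradients(zip(grads, model.trainable_variables))

# Amortized inference: a surrogate posterior for a new observation in one forward pass
mu_obs, log_var_obs = model(x[:1])
```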
- TheLastMetric: an information-based observing strategy metric for photometric redshifts, cosmology, and moreA. I. Malz, F. Lanusse, M. L. Graham, and 1 more authorIn American Astronomical Society Meeting Abstracts Jun 2021
An astronomical survey’s observing strategy, which encompasses the frequency and duration of visits to each portion of the sky, impacts the degree to which its data can answer the most pressing questions about the universe. Surveys with diverse scientific goals pose a special challenge for survey design decision-making; even if each physical parameter of interest has a corresponding quantitative metric, there’s no guarantee of a “one size fits all” optimal observing strategy. While traditional observing strategy metrics must be specific to the science case in question, we exploit a chain rule of the variational mutual information to engineer TheLastMetric, an interpretable, extensible metric that enables coherent observing strategy optimization over multiple science objectives. The upcoming Vera C. Rubin Observatory’s Legacy Survey of Space and Time (LSST) serves as an ideal application for this metric, as many of its extragalactic science goals rely upon purely photometric redshift constraints. As a demonstration, we use the LSST Metrics Analysis Framework (MAF) to quantify how much information about redshift is contained within photometry, conditioned on a fiducial true galaxy catalog and mock observations under each of several given observing strategies, generated by the LSST Operations Simulator (OpSim). We compare traditional metrics of photometric redshift performance to TheLastMetric and interpret their differences from the perspective of observing strategy optimization. Finally, we illustrate how to extend TheLastMetric to cosmological constraints by multiple probes, jointly or individually.
- Deep Probabilistic Modeling of Weak Lensing Mass MapsF. Lanusse, B. Remy, N. Jeffrey, and 2 more authorsIn American Astronomical Society Meeting Abstracts Jun 2021
While weak gravitational lensing is one of the most promising cosmological probes targeted by upcoming wide-field surveys, exploiting the full information content of the cosmic shear signal remains a major challenge. One dimension of this challenge is the fact that analytic cosmological models only describe the 2pt functions of the lensing signal, while we know the convergence field to be significantly non-Gaussian. As a result, solving a problem like weak lensing mass-mapping using analytic Gaussian priors will be sub-optimal. We do however have access to models that can capture the full statistics of the lensing signal: numerical simulations. But the question is: how can we use samples from numerical simulations to solve a Bayesian inference problem such as weak lensing mass-mapping? In this talk, I will illustrate how recent deep generative modeling provides us with the tools needed to leverage a physical model in the form of numerical simulations to perform proper Bayesian inference. Using Neural Score Estimation, we learn from numerical simulations an estimate of the score function (i.e. the gradient of the log density function) of the distribution of convergence maps. We then use the learned score function as a prior within an Annealed Hamiltonian Monte-Carlo sampling scheme which allows us to access the full posterior distribution of a mass-mapping problem, in 10^6 dimensions.
- Weak-lensing mass reconstruction using sparsity and a Gaussian random fieldJ. -L. Starck, K. E. Themelis, N. Jeffrey, and 2 more authorsAstronomy and Astrophysics May 2021
Aims: We introduce a novel approach to reconstructing dark matter mass maps from weak gravitational lensing measurements. The cornerstone of the proposed method lies in a new modelling of the matter density field in the Universe as a mixture of two components: (1) a sparsity-based component that captures the non-Gaussian structure of the field, such as peaks or halos at different spatial scales, and (2) a Gaussian random field, which is known to represent the linear characteristics of the field well. Methods: We propose an algorithm called MCALens that jointly estimates these two components. MCALens is based on an alternating minimisation incorporating both sparse recovery and a proximal iterative Wiener filtering. Results: Experimental results on simulated data show that the proposed method exhibits improved estimation accuracy compared to customised mass-map reconstruction methods.
- CosmicRIM : Reconstructing Early Universe by Combining Differentiable Simulations with Recurrent Inference MachinesChirag Modi, François Lanusse, Uroš Seljak, and 2 more authorsarXiv e-prints Apr 2021
Reconstructing the Gaussian initial conditions at the beginning of the Universe from the survey data in a forward modeling framework is a major challenge in cosmology. This requires solving a high dimensional inverse problem with an expensive, non-linear forward model: a cosmological N-body simulation. While this was intractable until recently, we propose to solve this inference problem using an automatically differentiable N-body solver, combined with a recurrent network to learn the inference scheme and obtain the maximum-a-posteriori (MAP) estimate of the initial conditions of the Universe. We demonstrate using realistic cosmological observables that learnt inference is 40 times faster than traditional algorithms such as ADAM and LBFGS, which require specialized annealing schemes, and obtains solutions of higher quality.
- An information-based metric for observing strategy optimization, demonstrated in the context of photometric redshifts with applications to cosmologyAlex I. Malz, François Lanusse, John Franklin Crenshaw, and 1 more authorarXiv e-prints Apr 2021
The observing strategy of a galaxy survey influences the degree to which its resulting data can be used to accomplish any science goal. LSST is thus seeking metrics of observing strategies for multiple science cases in order to optimally choose a cadence. Photometric redshifts are essential for many extragalactic science applications of LSST’s data, including but not limited to cosmology, but there are few metrics available, and they are not straightforwardly integrated with metrics of other cadence-dependent quantities that may influence any given use case. We propose a metric for observing strategy optimization based on the potentially recoverable mutual information about redshift from a photometric sample under the constraints of a realistic observing strategy. We demonstrate a tractable estimation of a variational lower bound of this mutual information implemented in a public code using conditional normalizing flows. By comparing the recoverable redshift information across observing strategies, we can distinguish between those that preclude robust redshift constraints and those whose data will preserve more redshift information, to be generically utilized in a downstream analysis. We recommend the use of this versatile metric to observing strategy optimization for redshift-dependent extragalactic use cases, including but not limited to cosmology, as well as any other science applications for which photometry may be modeled from true parameter values beyond redshift.
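The variational lower bound mentioned above follows from I(Z;X) = H(Z) − H(Z|X) ≥ H(Z) + E[log q(Z|X)] for any conditional density q. The sketch below checks this bound numerically on a toy Gaussian redshift-photometry pair where the true mutual information is known analytically; a Gaussian linear model stands in for the conditional normalizing flow, and all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n, noise_std = 200000, 0.5

# Toy "redshift" z and noisy "photometry" x; the true mutual information is known analytically.
z = rng.normal(size=n)
x = z + noise_std * rng.normal(size=n)
mi_true = 0.5 * np.log(1.0 + 1.0 / noise_std**2)

# Variational posterior q(z|x): a simple Gaussian linear model stands in here
# for the conditional normalizing flow used in the actual metric.
a, b = np.polyfit(x, z, 1)
resid = z - (a * x + b)
sigma2 = resid.var()

# Barber-Agakov bound: I(Z;X) >= H(Z) + E[log q(Z|X)]
h_z = 0.5 * np.log(2 * np.pi * np.e * z.var())
e_log_q = -0.5 * np.log(2 * np.pi * sigma2) - 0.5 * np.mean(resid**2) / sigma2
mi_bound = h_z + e_log_q

print(f"true MI = {mi_true:.3f} nats, variational lower bound = {mi_bound:.3f} nats")
```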
- A deep learning approach to test the small-scale galaxy morphology and its relationship with star formation activity in hydrodynamical simulationsLorenzo Zanisi, Marc Huertas-Company, François Lanusse, and 10 more authorsMonthly Notices of the Royal Astronomical Society Mar 2021
Hydrodynamical simulations of galaxy formation and evolution attempt to fully model the physics that shapes galaxies. The agreement between the morphology of simulated and real galaxies, and the way the morphological types are distributed across galaxy scaling relations are important probes of our knowledge of galaxy formation physics. Here, we propose an unsupervised deep learning approach to perform a stringent test of the fine morphological structure of galaxies coming from the Illustris and IllustrisTNG (TNG100 and TNG50) simulations against observations from a subsample of the Sloan Digital Sky Survey. Our framework is based on PixelCNN, an autoregressive model for image generation with an explicit likelihood. We adopt a strategy that combines the output of two PixelCNN networks in a metric that isolates the small-scale morphological details of galaxies from the sky background. We are able to quantitatively identify the improvements of IllustrisTNG, particularly in the high-resolution TNG50 run, over the original Illustris. However, we find that the fine details of galaxy structure are still different between observed and simulated galaxies. This difference is mostly driven by small, more spheroidal, and quenched galaxies that are globally less accurate regardless of resolution and which have experienced little improvement between the three simulations explored. We speculate that this disagreement, which is less severe for quenched discy galaxies, may stem from a still too coarse numerical resolution, which struggles to properly capture the inner, dense regions of quenched spheroidal galaxies.
- The LSST DESC DC2 Simulated Sky SurveyLSST Dark Energy Science Collaboration (LSST DESC), Bela Abolfathi, David Alonso, and 77 more authorsAstrophysical Journal, Supplement Mar 2021
We describe the simulated sky survey underlying the second data challenge (DC2) carried out in preparation for analysis of the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST) by the LSST Dark Energy Science Collaboration (LSST DESC). Significant connections across multiple science domains will be a hallmark of LSST; the DC2 program represents a unique modeling effort that stresses this interconnectivity in a way that has not been attempted before. This effort encompasses a full end-to-end approach: starting from a large N-body simulation, through setting up LSST-like observations including realistic cadences, through image simulations, and finally processing with Rubin’s LSST Science Pipelines. This last step ensures that we generate data products resembling those to be delivered by the Rubin Observatory as closely as is currently possible. The simulated DC2 sky survey covers six optical bands in a wide-fast-deep area of approximately 300 deg^2, as well as a deep drilling field of approximately 1 deg^2. We simulate 5 yr of the planned 10 yr survey. The DC2 sky survey has multiple purposes. First, the LSST DESC working groups can use the data set to develop a range of DESC analysis pipelines to prepare for the advent of actual data. Second, it serves as a realistic test bed for the image processing software under development for LSST by the Rubin Observatory. In particular, simulated data provide a controlled way to investigate certain image-level systematic effects. Finally, the DC2 sky survey enables the exploration of new scientific ideas in both static and time domain cosmology.
- Likelihood-free inference with neural compression of DES SV weak lensing map statisticsNiall Jeffrey, Justin Alsing, and François LanusseMonthly Notices of the Royal Astronomical Society Feb 2021
In many cosmological inference problems, the likelihood (the probability of the observed data as a function of the unknown parameters) is unknown or intractable. This necessitates approximations and assumptions, which can lead to incorrect inference of cosmological parameters, including the nature of dark matter and dark energy, or create artificial model tensions. Likelihood-free inference covers a novel family of methods to rigorously estimate posterior distributions of parameters using forward modelling of mock data. We present likelihood-free cosmological parameter inference using weak lensing maps from the Dark Energy Survey (DES) Science Verification data, using neural data compression of weak lensing map summary statistics. We explore combinations of the power spectra, peak counts, and neural compressed summaries of the lensing mass map using deep convolutional neural networks. We demonstrate methods to validate the inference process, for both the data modelling and the probability density estimation steps. Likelihood-free inference provides a robust and scalable alternative for rigorous large-scale cosmological inference with galaxy survey data (for DES, Euclid, and LSST). We have made our simulated lensing maps publicly available.
- DESC DC2 Data Release NoteLSST Dark Energy Science Collaboration, Bela Abolfathi, Robert Armstrong, and 54 more authorsarXiv e-prints Jan 2021
In preparation for cosmological analyses of the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST), the LSST Dark Energy Science Collaboration (LSST DESC) has created a 300 deg^2 simulated survey as part of an effort called Data Challenge 2 (DC2). The DC2 simulated sky survey, in six optical bands with observations following a reference LSST observing cadence, was processed with the LSST Science Pipelines (19.0.0). In this Note, we describe the public data release of the resulting object catalogs for the coadded images of five years of simulated observations along with associated truth catalogs. We include a brief description of the major features of the available data sets. To enable convenient access to the data products, we have developed a web portal connected to Globus data services. We describe how to access the data and provide example Jupyter Notebooks in Python to aid first interactions with the data. We welcome feedback and questions about the data release via a GitHub repository.
- Automating Inference of Binary Microlensing Events with Neural Density EstimationK. Zhang, J. Bloom, B. Gaudi, and 3 more authorsIn American Astronomical Society Meeting Abstracts Jan 2021
Automated inference of binary microlensing events with traditional sampling-based algorithms such as MCMC has been hampered by the slowness of the physical forward model and the pathological likelihood surface. Current analysis of such events requires both expert knowledge and large-scale grid searches to locate the approximate solution as a prerequisite to MCMC posterior sampling. As the next-generation, space-based microlensing survey with the Roman Space Telescope is expected to yield thousands of binary microlensing events, a new scalable and automated approach is desired. Here, we present an automated inference method based on neural density estimation (NDE). We show that the NDE trained on simulated Roman data not only produces fast, accurate, and precise posteriors but also captures expected posterior degeneracies. A hybrid NDE-MCMC framework can further be applied to produce the exact posterior.
2020
- Anomaly Detection in Astronomical Images with Generative Adversarial NetworksKate Storey-Fisher, Marc Huertas-Company, Nesar Ramachandra, and 4 more authorsarXiv e-prints Dec 2020
We present an anomaly detection method using Wasserstein generative adversarial networks (WGANs) on optical galaxy images from the wide-field survey conducted with the Hyper Suprime-Cam (HSC) on the Subaru Telescope in Hawai’i. The WGAN is trained on the entire sample, and learns to generate realistic HSC-like images that follow the distribution of the training data. We identify images which are less well-represented in the generator’s latent space, and which the discriminator flags as less realistic; these are thus anomalous with respect to the rest of the data. We propose a new approach to characterize these anomalies based on a convolutional autoencoder (CAE) to reduce the dimensionality of the residual differences between the real and WGAN-reconstructed images. We construct a subsample of ~9,000 highly anomalous images from our nearly million object sample, and further identify interesting anomalies within these; these include galaxy mergers, tidal features, and extreme star-forming galaxies. The proposed approach could boost unsupervised discovery in the era of big data astrophysics.
- Denoising Score-Matching for Uncertainty Quantification in Inverse ProblemsZaccharie Ramzi, Benjamin Remy, Francois Lanusse, and 2 more authorsarXiv e-prints Nov 2020
Deep neural networks have proven extremely efficient at solving a wide range of inverse problems, but most often the uncertainty on the solution they provide is hard to quantify. In this work, we propose a generic Bayesian framework for solving inverse problems, in which we limit the use of deep neural networks to learning a prior distribution on the signals to recover. We adopt recent denoising score matching techniques to learn this prior from data, and subsequently use it as part of an annealed Hamiltonian Monte-Carlo scheme to sample the full posterior of image inverse problems. We apply this framework to Magnetic Resonance Image (MRI) reconstruction and illustrate how this approach not only yields high-quality reconstructions but can also be used to assess the uncertainty on particular features of a reconstructed image.
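A minimal sketch of the denoising score matching step is shown below: clean training samples are perturbed with Gaussian noise and a small network is trained to predict the score of the noise-convolved distribution, whose regression target is -noise/sigma^2. The single noise level, toy data, and network size are assumptions; in practice a sequence of noise levels is used to support the annealed sampling stage.

```python
import tensorflow as tf

# Toy "clean" training signals (stand-ins for uncorrupted MR images or convergence maps)
x_train = tf.random.normal([4096, 16])
sigma = 0.1  # single noise level here; annealed schedules are used in practice

# Small network s_theta(x_noisy) approximating the score of the noise-convolved prior
score_net = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(16),
])

opt = tf.keras.optimizers.Adam(1e-3)
dataset = tf.data.Dataset.from_tensor_slices(x_train).shuffle(4096).batch(256)
for epoch in range(5):
    for xb in dataset:
        noise = sigma * tf.random.normal(tf.shape(xb))
        x_noisy = xb + noise
        with tf.GradientTape() as tape:
            # Denoising score matching: regress s_theta(x_noisy) onto -noise / sigma^2
            target = -noise / sigma**2
            loss = tf.reduce_mean(
                tf.reduce_sum((score_net(x_noisy) - target) ** 2, axis=-1))
        grads = tape.gradient(loss, score_net.trainable_variables)
        opt.apply_gradients(zip(grads, score_net.trainable_variables))
```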
- Probabilistic Mapping of Dark Matter by Neural Score MatchingBenjamin Remy, Francois Lanusse, Zaccharie Ramzi, and 3 more authorsarXiv e-prints Nov 2020
The Dark Matter present in the Large-Scale Structure of the Universe is invisible, but its presence can be inferred through the small gravitational lensing effect it has on the images of far away galaxies. By measuring this lensing effect on a large number of galaxies it is possible to reconstruct maps of the Dark Matter distribution on the sky. This, however, represents an extremely challenging inverse problem due to missing data and noise dominated measurements. In this work, we present a novel methodology for addressing such inverse problems by combining elements of Bayesian statistics, analytic physical theory, and a recent class of Deep Generative Models based on Neural Score Matching. This approach allows us to do the following: (1) make full use of analytic cosmological theory to constrain the 2pt statistics of the solution, (2) learn from cosmological simulations any differences between this analytic prior and full simulations, and (3) obtain samples from the full Bayesian posterior of the problem for robust Uncertainty Quantification. We present an application of this methodology on the first deep-learning-assisted Dark Matter map reconstruction of the Hubble Space Telescope COSMOS field.
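To make step (3) concrete, the sketch below runs an annealed Langevin-type sampler that combines an analytic data-likelihood gradient with a learned prior score; the paper itself uses annealed Hamiltonian Monte-Carlo and a real lensing forward operator, so the identity forward model and the hand-written prior_score and likelihood_score functions here are stand-ins for illustration only.

```python
import numpy as np

rng = np.random.default_rng(4)

def prior_score(kappa, temperature):
    """Stand-in for a learned (neural) score of the convergence prior.
    Here: gradient of a per-pixel Gaussian log-prior, purely illustrative."""
    return -kappa / (1.0 + temperature)

def likelihood_score(kappa, gamma_obs, noise_var):
    """Gradient of a Gaussian data log-likelihood for a toy identity forward model."""
    return (gamma_obs - kappa) / noise_var

gamma_obs = rng.normal(size=(64, 64))      # toy "observed shear" map
noise_var = 0.3 ** 2
kappa = np.zeros_like(gamma_obs)

# Annealed Langevin dynamics: start at high temperature and cool down,
# following the gradient of the tempered log-posterior plus injected noise.
for temperature in np.geomspace(10.0, 1e-3, 20):
    step = 1e-2 * temperature
    for _ in range(50):
        grad_log_post = (likelihood_score(kappa, gamma_obs, noise_var)
                         + prior_score(kappa, temperature))
        kappa += step * grad_log_post + np.sqrt(2 * step) * rng.normal(size=kappa.shape)
```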
- Automating Inference of Binary Microlensing Events with Neural Density EstimationKeming Zhang, Joshua S. Bloom, B. Scott Gaudi, and 3 more authorsarXiv e-prints Oct 2020
Automated inference of binary microlensing events with traditional sampling-based algorithms such as MCMC has been hampered by the slowness of the physical forward model and the pathological likelihood surface. Current analysis of such events requires both expert knowledge and large-scale grid searches to locate the approximate solution as a prerequisite to MCMC posterior sampling. As the next-generation, space-based microlensing survey with the Roman Space Telescope is expected to yield thousands of binary microlensing events, a new scalable and automated approach is desired. Here, we present an automated inference method based on neural density estimation (NDE). We show that the NDE trained on simulated Roman data not only produces fast, accurate, and precise posteriors but also captures expected posterior degeneracies. A hybrid NDE-MCMC framework can further be applied to produce the exact posterior.
- Bayesian Neural NetworksTom Charnock, Laurence Perreault-Levasseur, and François LanussearXiv e-prints Jun 2020
In recent times, neural networks have become a powerful tool for the analysis of complex and abstract data models. However, their introduction intrinsically increases our uncertainty about which features of the analysis are model-related and which are due to the neural network. This means that predictions by neural networks have biases which cannot be trivially distinguished from biases arising from the true process by which the data are created and observed. In order to attempt to address such issues we discuss Bayesian neural networks: neural networks where the uncertainty due to the network can be characterised. In particular, we present the Bayesian statistical framework which allows us to categorise uncertainty in terms of the ingrained randomness of observing certain data and the uncertainty from our lack of knowledge about how data can be created and observed. In presenting such techniques we show how errors in prediction by neural networks can be obtained in principle, and provide the two favoured methods for characterising these errors. We will also describe how both of these methods have substantial pitfalls when put into practice, highlighting the need for other statistical techniques to truly be able to do inference when using neural networks.
- Transformation Importance with Applications to CosmologyChandan Singh, Wooseok Ha, Francois Lanusse, and 3 more authorsarXiv e-prints Mar 2020
Machine learning lies at the heart of new possibilities for scientific discovery, knowledge generation, and artificial intelligence. Its potential benefits to these fields require going beyond predictive accuracy and focusing on interpretability. In particular, many scientific problems require interpretations in a domain-specific interpretable feature space (e.g. the frequency domain) whereas attributions to the raw features (e.g. the pixel space) may be unintelligible or even misleading. To address this challenge, we propose TRIM (TRansformation IMportance), a novel approach which attributes importances to features in a transformed space and can be applied post-hoc to a fully trained model. TRIM is motivated by a cosmological parameter estimation problem using deep neural networks (DNNs) on simulated data, but it is generally applicable across domains/models and can be combined with any local interpretation method. In our cosmology example, combining TRIM with contextual decomposition shows promising results for identifying which frequencies a DNN uses, helping cosmologists to understand and validate that the model learns appropriate physical features rather than simulation artifacts.
- Deep learning dark matter map reconstructions from DES SV weak lensing dataNiall Jeffrey, François Lanusse, Ofer Lahav, and 1 more authorMonthly Notices of the Royal Astronomical Society Mar 2020
We present the first reconstruction of dark matter maps from weak lensing observational data using deep learning. We train a convolutional neural network with a U-Net-based architecture on over 3.6 × 10^5 simulated data realizations with non-Gaussian shape noise and with cosmological parameters varying over a broad prior distribution. We interpret our newly created Dark Energy Survey Science Verification (DES SV) map as an approximation of the posterior mean of the convergence κ given the observed shear γ. Our DeepMass method is substantially more accurate than existing mass-mapping methods. With a validation set of 8000 simulated DES SV data realizations, compared to Wiener filtering with a fixed power spectrum, the DeepMass method improved the mean square error (MSE) by 11 per cent. With N-body simulated MICE mock data, we show that Wiener filtering, with the optimal known power spectrum, still gives a worse MSE than our generalized method with no input cosmological parameters; we show that the improvement is driven by the non-linear structures in the convergence. With higher galaxy density in future weak lensing data unveiling more non-linear scales, it is likely that deep learning will be a leading approach for mass mapping with Euclid and LSST.
2019
- Hybrid Physical-Deep Learning Model for Astronomical Inverse ProblemsFrancois Lanusse, Peter Melchior, and Fred MoolekamparXiv e-prints Dec 2019
We present a Bayesian machine learning architecture that combines a physically motivated parametrization and an analytic error model for the likelihood with a deep generative model providing a powerful data-driven prior for complex signals. This combination yields an interpretable and differentiable generative model, allows the incorporation of prior knowledge, and can be utilized for observations with different data quality without having to retrain the deep network. We demonstrate our approach with an example of astronomical source separation in current imaging data, yielding a physical and interpretable model of astronomical scenes.
- CosmoDC2: A Synthetic Sky Catalog for Dark Energy Science with LSSTDanila Korytov, Andrew Hearin, Eve Kovacs, and 28 more authorsAstrophysical Journal, Supplement Dec 2019
This paper introduces cosmoDC2, a large synthetic galaxy catalog designed to support precision dark energy science with the Large Synoptic Survey Telescope (LSST). CosmoDC2 is the starting point for the second data challenge (DC2) carried out by the LSST Dark Energy Science Collaboration (LSST DESC). The catalog is based on a trillion-particle, (4.225 Gpc)^3 box cosmological N-body simulation, the Outer Rim run. It covers 440 deg^2 of sky area to a redshift of z = 3 and matches expected number densities from contemporary surveys to a magnitude depth of 28 in the r band. Each galaxy is characterized by a multitude of galaxy properties including stellar mass, morphology, spectral energy distributions, broadband filter magnitudes, host halo information, and weak lensing shear. The size and complexity of cosmoDC2 requires an efficient catalog generation methodology; our approach is based on a new hybrid technique that combines data-based empirical approaches with semianalytic galaxy modeling. A wide range of observation-based validation tests has been implemented to ensure that cosmoDC2 enables the science goals of the planned LSST DESC DC2 analyses. This paper also represents the official release of the cosmoDC2 data set, including an efficient reader that facilitates interaction with the data.
- Uncertainty Quantification with Generative ModelsVanessa Böhm, François Lanusse, and Uroš SeljakarXiv e-prints Oct 2019
We develop a generative model-based approach to Bayesian inverse problems, such as image reconstruction from noisy and incomplete images. Our framework addresses two common challenges of Bayesian reconstructions: 1) It makes use of complex, data-driven priors that comprise all available information about the uncorrupted data distribution. 2) It enables computationally tractable uncertainty quantification in the form of posterior analysis in latent and data space. The method is very efficient in that the generative model only has to be trained once on an uncorrupted data set, after that, the procedure can be used for arbitrary corruption types.
- The Role of Machine Learning in the Next Decade of CosmologyMichelle Ntampaka, Camille Avestruz, Steven Boada, and 27 more authorsBulletin of the AAS May 2019
Machine learning (ML) methods have remarkably improved how cosmologists can interpret data. The next decade will bring new opportunities for data-driven discovery, but will also present new challenges for adopting ML methodologies. ML could transform our field, but it will require the community to promote interdisciplinary research endeavors.
- Core Cosmology Library: Precision Cosmological Predictions for LSSTNora Elisa Chisari, David Alonso, Elisabeth Krause, and 28 more authorsAstrophysical Journal, Supplement May 2019
The Core Cosmology Library (CCL) provides routines to compute basic cosmological observables to a high degree of accuracy, which have been verified with an extensive suite of validation tests. Predictions are provided for many cosmological quantities, including distances, angular power spectra, correlation functions, halo bias, and the halo mass function through state-of-the-art modeling prescriptions available in the literature. Fiducial specifications for the expected galaxy distributions for the Large Synoptic Survey Telescope (LSST) are also included, together with the capability of computing redshift distributions for a user-defined photometric redshift model. A rigorous validation procedure, based on comparisons between CCL and independent software packages, allows us to establish a well-defined numerical accuracy for each predicted quantity. As a result, predictions for correlation functions of galaxy clustering, galaxy-galaxy lensing, and cosmic shear are demonstrated to be within a fraction of the expected statistical uncertainty of the observables for the models and in the range of scales of interest to LSST. CCL is an open source software package written in C, with a Python interface and publicly available at https://github.com/LSSTDESC/CCL.
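For context, a minimal example of the kind of prediction CCL provides through its Python interface is sketched below (a cosmic shear angular power spectrum for a toy source redshift distribution). The calls follow recent pyccl releases but exact signatures may differ between CCL versions, and the cosmological parameters and n(z) used here are arbitrary.

```python
import numpy as np
import pyccl as ccl

# Define a fiducial cosmology
cosmo = ccl.Cosmology(Omega_c=0.25, Omega_b=0.05, h=0.67, sigma8=0.8, n_s=0.96)

# A toy source redshift distribution for a single lensing tracer
z = np.linspace(0.0, 3.0, 256)
nz = z**2 * np.exp(-((z / 0.5) ** 1.5))
lensing = ccl.WeakLensingTracer(cosmo, dndz=(z, nz))

# Cosmic shear angular power spectrum C_ell
ell = np.arange(20, 2000)
cl_shear = ccl.angular_cl(cosmo, lensing, lensing, ell)

# Background quantity: comoving radial distance at scale factor a = 0.5 (z = 1)
chi = ccl.comoving_radial_distance(cosmo, 0.5)
```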
- The strong gravitational lens finding challengeR. B. Metcalf, M. Meneghetti, C. Avestruz, and 34 more authorsAstronomy and Astrophysics May 2019
Large-scale imaging surveys will increase the number of galaxy-scale strong lensing candidates by perhaps three orders of magnitude beyond the number known today. Finding these rare objects will require picking them out of at least tens of millions of images, and deriving scientific results from them will require quantifying the efficiency and bias of any search method. To achieve these objectives automated methods must be developed. Because gravitational lenses are rare objects, reducing false positives will be particularly important. We present a description and results of an open gravitational lens finding challenge. Participants were asked to classify 100 000 candidate objects as to whether they were gravitational lenses or not with the goal of developing better automated methods for finding lenses in large data sets. A variety of methods were used including visual inspection, arc and ring finders, support vector machines (SVM) and convolutional neural networks (CNN). We find that many of the methods will be easily fast enough to analyse the anticipated data flow. In test data, several methods are able to identify upwards of half the lenses after applying some thresholds on the lens characteristics such as lensed image brightness, size or contrast with the lens galaxy without making a single false-positive identification. This is significantly better than what direct human inspection was able to achieve. Having multi-band, ground-based data is found to be better for this purpose than single-band space-based data with lower noise and higher resolution, suggesting that multi-colour data is crucial. Multi-band space-based data will be superior to ground-based data. The most difficult challenge for a lens finder is differentiating between rare, irregular and ring-like face-on galaxies and true gravitational lenses. The degree to which the efficiency and biases of lens finders can be quantified largely depends on the realism of the simulated data on which the finders are trained.
- Cosmology from cosmic shear power spectra with Subaru Hyper Suprime-Cam first-year dataChiaki Hikage, Masamune Oguri, Takashi Hamana, and 34 more authorsPublications of the ASJ Apr 2019
We measure cosmic weak lensing shear power spectra with the Subaru Hyper Suprime-Cam (HSC) survey first-year shear catalog covering 137 deg^2 of the sky. Thanks to the high effective galaxy number density of ~17 arcmin^-2, even after conservative cuts such as a magnitude cut of i < 24.5 and photometric redshift cut of 0.3 ≤ z ≤ 1.5, we obtain a high-significance measurement of the cosmic shear power spectra in four tomographic redshift bins, achieving a total signal-to-noise ratio of 16 in the multipole range 300 ≤ ℓ ≤ 1900. We carefully account for various uncertainties in our analysis including the intrinsic alignment of galaxies, scatters and biases in photometric redshifts, residual uncertainties in the shear measurement, and modeling of the matter power spectrum. The accuracy of our power spectrum measurement method as well as our analytic model of the covariance matrix are tested against realistic mock shear catalogs. For a flat Λ cold dark matter model, we find S_8 ≡ σ_8(Ω_m/0.3)^α = 0.800^{+0.029}_{-0.028} for α = 0.45 (S_8 = 0.780^{+0.030}_{-0.033} for α = 0.5) from our HSC tomographic cosmic shear analysis alone. In comparison with Planck cosmic microwave background constraints, our results prefer slightly lower values of S_8, although metrics such as the Bayesian evidence ratio test do not show significant evidence for discordance between these results. We study the effect of possible additional systematic errors that are unaccounted for in our fiducial cosmic shear analysis, and find that they can shift the best-fit values of S_8 by up to ~0.6σ in both directions. The full HSC survey data will contain several times more area, and will lead to significantly improved cosmological constraints.
2018
- Weak lensing shear calibration with simulations of the HSC surveyRachel Mandelbaum, François Lanusse, Alexie Leauthaud, and 8 more authorsMonthly Notices of the Royal Astronomical Society Dec 2018
We present results from a set of simulations designed to constrain the weak lensing shear calibration for the Hyper Suprime-Cam (HSC) survey. These simulations include HSC observing conditions and galaxy images from the Hubble Space Telescope (HST), with fully realistic galaxy morphologies and the impact of nearby galaxies included. We find that the inclusion of nearby galaxies in the images is critical to reproducing the observed distributions of galaxy sizes and magnitudes, due to the non-negligible fraction of unrecognized blends in ground-based data, even with the excellent typical seeing of the HSC survey (0.58 arcsec in the i band). Using these simulations, we detect and remove the impact of selection biases due to the correlation of weights and the quantities used to define the sample (S/N and apparent size) with the lensing shear. We quantify and remove galaxy property-dependent multiplicative and additive shear biases that are intrinsic to our shear estimation method, including an ~10 per cent level multiplicative bias due to the impact of nearby galaxies and unrecognized blends. Finally, we check the sensitivity of our shear calibration estimates to other cuts made on the simulated samples, and find that the changes in shear calibration are well within the requirements for HSC weak lensing analysis. Overall, the simulations suggest that the weak lensing multiplicative biases in the first-year HSC shear catalogue are controlled at the 1 per cent level.
- Improving weak lensing mass map reconstructions using Gaussian and sparsity priors: application to DES SVN. Jeffrey, F. B. Abdalla, O. Lahav, and 66 more authorsMonthly Notices of the Royal Astronomical Society Sep 2018
Mapping the underlying density field, including non-visible dark matter, using weak gravitational lensing measurements is now a standard tool in cosmology. Due to its importance to the science results of current and upcoming surveys, the quality of the convergence reconstruction methods should be well understood. We compare three methods: Kaiser-Squires (KS), Wiener filter, and GLIMPSE. Kaiser-Squires is a direct inversion, not accounting for survey masks or noise. The Wiener filter is well-motivated for Gaussian density fields in a Bayesian framework. GLIMPSE uses sparsity, aiming to reconstruct non-linearities in the density field. We compare these methods with several tests using public Dark Energy Survey (DES) Science Verification (SV) data and realistic DES simulations. The Wiener filter and GLIMPSE offer substantial improvements over smoothed Kaiser-Squires with a range of metrics. Both the Wiener filter and GLIMPSE convergence reconstructions show a 12 per cent improvement in Pearson correlation with the underlying truth from simulations. To compare the mapping methods’ abilities to find mass peaks, we measure the difference between peak counts from simulated ΛCDM shear catalogues and catalogues with no mass fluctuations (a standard data vector when inferring cosmology from peak statistics); the maximum signal-to-noise of these peak statistics is increased by a factor of 3.5 for the Wiener filter and 9 for GLIMPSE. With simulations, we measure the reconstruction of the harmonic phases; the phase residuals’ concentration is improved 17 per cent by GLIMPSE and 18 per cent by the Wiener filter. The correlation between reconstructions from data and foreground redMaPPer clusters is increased 18 per cent by the Wiener filter and 32 per cent by GLIMPSE.
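The Kaiser-Squires direct inversion referred to above can be written in a few lines of flat-sky Fourier algebra, which makes its lack of any treatment of masks or noise explicit; the curved-sky and Bayesian variants compared in the paper are more involved. The sketch below is a standard flat-sky implementation applied to a toy shear map.

```python
import numpy as np

def kaiser_squires(gamma1, gamma2):
    """Flat-sky Kaiser-Squires inversion: shear maps -> E-mode convergence map.
    Direct inversion with no treatment of masks or noise, as described above."""
    ny, nx = gamma1.shape
    k1 = np.fft.fftfreq(nx)[np.newaxis, :]
    k2 = np.fft.fftfreq(ny)[:, np.newaxis]
    k_sq = k1**2 + k2**2
    k_sq[0, 0] = 1.0  # avoid division by zero; the mean mode is unconstrained

    g1_hat = np.fft.fft2(gamma1)
    g2_hat = np.fft.fft2(gamma2)
    kappa_hat = ((k1**2 - k2**2) * g1_hat + 2 * k1 * k2 * g2_hat) / k_sq
    kappa_hat[0, 0] = 0.0
    return np.real(np.fft.ifft2(kappa_hat))

# Toy usage on random shear maps
rng = np.random.default_rng(5)
kappa_e = kaiser_squires(rng.normal(size=(128, 128)), rng.normal(size=(128, 128)))
```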
- The clustering of z > 7 galaxies: predictions from the BLUETIDES simulationAklant K. Bhowmick, Tiziana Di Matteo, Yu Feng, and 1 more authorMonthly Notices of the Royal Astronomical Society Mar 2018
We study the clustering of the highest z galaxies (from ~0.1 to a few tens of Mpc scales) using the BLUETIDES simulation and compare it to current observational constraints from Hubble legacy and Hyper Suprime Cam (HSC) fields (at z = 6-7.2). With a box length of 400 Mpc h^-1 on each side and 0.7 trillion particles, BLUETIDES is the largest volume high-resolution cosmological hydrodynamic simulation to date, ideally suited for studies of high-z galaxies. We find that galaxies with magnitude m_UV < 27.7 have a bias (b_g) of 8.1 ± 1.2 at z = 8, and typical halo masses M_H ≳ 6 × 10^10 M_⊙. Given the redshift evolution between z = 8 and z = 10 [b_g ∝ (1 + z)^1.6], our inferred values of the bias and halo masses are consistent with measured angular clustering at z ~ 6.8 from these brighter samples. The bias of fainter galaxies (in the Hubble legacy field at H_160 ≲ 29.5) is 5.9 ± 0.9 at z = 8 corresponding to halo masses M_H ≳ 10^10 M_⊙. We investigate directly the 1-halo term in the clustering and show that it dominates on scales r ≲ 0.1 Mpc h^-1 (Θ ≲ 3 arcsec) with non-linear effects at transition scales between the one-halo and two-halo term affecting scales 0.1 Mpc h^-1 ≲ r ≲ 20 Mpc h^-1 (3 arcsec ≲ Θ ≲ 90 arcsec). Current clustering measurements probe down to the scales in the transition between one-halo and two-halo regime where non-linear effects are important. The amplitude of the one-halo term implies that occupation numbers for satellites in BLUETIDES are somewhat higher than standard halo occupation distributions adopted in these analyses (which predict amplitudes in the one-halo regime suppressed by a factor 2-3). That possibly implies a higher number of galaxies detected by JWST (at small scales and even fainter magnitudes) observing these fields.
- DESCQA: An Automated Validation Framework for Synthetic Sky CatalogsYao-Yuan Mao, Eve Kovacs, Katrin Heitmann, and 25 more authorsAstrophysical Journal, Supplement Feb 2018
The use of high-quality simulated sky catalogs is essential for the success of cosmological surveys. The catalogs have diverse applications, such as investigating signatures of fundamental physics in cosmological observables, understanding the effect of systematic uncertainties on measured signals and testing mitigation strategies for reducing these uncertainties, aiding analysis pipeline development and testing, and survey strategy optimization. The list of applications is growing with improvements in the quality of the catalogs and the details that they can provide. Given the importance of simulated catalogs, it is critical to provide rigorous validation protocols that enable both catalog providers and users to assess the quality of the catalogs in a straightforward and comprehensive way. For this purpose, we have developed the DESCQA framework for the Large Synoptic Survey Telescope Dark Energy Science Collaboration as well as for the broader community. The goal of DESCQA is to enable the inspection, validation, and comparison of an inhomogeneous set of synthetic catalogs via the provision of a common interface within an automated framework. In this paper, we present the design concept and first implementation of DESCQA. In order to establish and demonstrate its full functionality we use a set of interim catalogs and validation tests. We highlight several important aspects, both technical and scientific, that require thoughtful consideration when designing a validation framework, including validation metrics and how these metrics impose requirements on the synthetic sky catalogs.
- The first-year shear catalog of the Subaru Hyper Suprime-Cam Subaru Strategic Program SurveyRachel Mandelbaum, Hironao Miyatake, Takashi Hamana, and 28 more authorsPublications of the ASJ Jan 2018
We present and characterize the catalog of galaxy shape measurements that will be used for cosmological weak lensing measurements in the Wide layer of the first year of the Hyper Suprime-Cam (HSC) survey. The catalog covers an area of 136.9 deg^2 split into six fields, with a mean i-band seeing of 0.58 arcsec and 5σ point-source depth of i ~ 26. Given conservative galaxy selection criteria for first-year science, the depth and excellent image quality result in unweighted and weighted source number densities of 24.6 and 21.8 arcmin^-2, respectively. We define the requirements for cosmological weak lensing science with this catalog, then focus on characterizing potential systematics in the catalog using a series of internal null tests for problems with point-spread function (PSF) modeling, shear estimation, and other aspects of the image processing. We find that the PSF models narrowly meet requirements for weak lensing science with this catalog, with fractional PSF model size residuals of approximately 0.003 (requirement: 0.004) and the PSF model shape correlation function ρ_1 < 3 × 10^-7 (requirement: 4 × 10^-7) at 0.5° scales. A variety of galaxy shape-related null tests are statistically consistent with zero, but star-galaxy shape correlations reveal additive systematics on >1° scales that are sufficiently large as to require mitigation in cosmic shear measurements. Finally, we discuss the dominant systematics and the planned algorithmic changes to reduce them in future data reductions.
- CMU DeepLens: deep learning for automatic image-based galaxy-galaxy strong lens findingFrançois Lanusse, Quanbin Ma, Nan Li, and 5 more authorsMonthly Notices of the Royal Astronomical Society Jan 2018
Galaxy-scale strong gravitational lensing can not only provide a valuable probe of the dark matter distribution of massive galaxies, but also provide valuable cosmological constraints, either by studying the population of strong lenses or by measuring time delays in lensed quasars. Due to the rarity of galaxy-scale strongly lensed systems, fast and reliable automated lens finding methods will be essential in the era of large surveys such as Large Synoptic Survey Telescope, Euclid and Wide-Field Infrared Survey Telescope. To tackle this challenge, we introduce CMU DeepLens, a new fully automated galaxy-galaxy lens finding method based on deep learning. This supervised machine learning approach does not require any tuning after the training step which only requires realistic image simulations of strongly lensed systems. We train and validate our model on a set of 20 000 LSST-like mock observations including a range of lensed systems of various sizes and signal-to-noise ratios (S/N). We find on our simulated data set that for a rejection rate of non-lenses of 99 per cent, a completeness of 90 per cent can be achieved for lenses with Einstein radii larger than 1.4 arcsec and S/N larger than 20 on individual g-band LSST exposures. Finally, we emphasize the importance of realistically complex simulations for training such machine learning methods by demonstrating that the performance of models of significantly different complexities cannot be distinguished on simpler simulations. We make our code publicly available at https://github.com/McWilliamsCenter/CMUDeepLens.
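A heavily simplified sketch of the supervised setup is given below: simulated postage stamps with lens/non-lens labels go in, a probability of being a lens comes out, and thresholding that probability sets the completeness/rejection trade-off quoted above. The tiny CNN, stamp size, and random stand-in data are assumptions; CMU DeepLens itself is a much deeper residual network trained on realistic simulations.

```python
import tensorflow as tf

# Toy stand-ins for simulated postage stamps: 45x45 single-band cutouts with binary labels
x_train = tf.random.normal([1024, 45, 45, 1])
y_train = tf.cast(tf.random.uniform([1024]) > 0.5, tf.float32)

# A small CNN classifier; the actual CMU DeepLens architecture is a much deeper residual net.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(45, 45, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # probability of being a lens
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=2, batch_size=64, verbose=0)

# Scoring new candidates: thresholding this probability sets the completeness/purity trade-off
p_lens = model.predict(x_train[:8], verbose=0)
```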
2017
- Sparse Reconstruction of the Merging A520 Cluster SystemAustin Peel, François Lanusse, and Jean-Luc StarckAstrophysical Journal Sep 2017
Merging galaxy clusters present a unique opportunity to study the properties of dark matter in an astrophysical context. These are rare and extreme cosmic events in which the bulk of the baryonic matter becomes displaced from the dark matter halos of the colliding subclusters. Since all mass bends light, weak gravitational lensing is a primary tool to study the total mass distribution in such systems. Combined with X-ray and optical analyses, mass maps of cluster mergers reconstructed from weak-lensing observations have been used to constrain the self-interaction cross-section of dark matter. The dynamically complex Abell 520 (A520) cluster is an exceptional case, even among merging systems: multi-wavelength observations have revealed a surprising concentration of dark mass with a high mass-to-light ratio, the interpretation of which is difficult under the standard assumption of effectively collisionless dark matter. We revisit A520 using a new sparsity-based mass-mapping algorithm to independently assess the presence of the puzzling dark core. We obtain high-resolution mass reconstructions from two separate galaxy shape catalogs derived from Hubble Space Telescope observations of the system. Our mass maps agree well overall with the results of previous studies, but we find important differences. In particular, although we are able to identify the dark core at a certain level in both data sets, it is at much lower significance than has been reported before using the same data. As we cannot confirm the detection in our analysis, we do not consider A520 as posing a significant challenge to the collisionless dark matter scenario.
- Cosmological constraints with weak-lensing peak counts and second-order statistics in a large-field surveyAustin Peel, Chieh-An Lin, François Lanusse, and 3 more authorsAstronomy and Astrophysics Mar 2017
Peak statistics in weak-lensing maps access the non-Gaussian information contained in the large-scale distribution of matter in the Universe. They are therefore a promising complementary probe to two-point and higher-order statistics to constrain our cosmological models. Next-generation galaxy surveys, with their advanced optics and large areas, will measure the cosmic weak-lensing signal with unprecedented precision. To prepare for these anticipated data sets, we assess the constraining power of peak counts in a simulated Euclid-like survey on the cosmological parameters Ω_m, σ_8, and w_0^de. In particular, we study how CAMELUS, a fast stochastic model for predicting peaks, can be applied to such large surveys. The algorithm avoids the need for time-costly N-body simulations, and its stochastic approach provides full PDF information of observables. Considering peaks with a signal-to-noise ratio ≥ 1, we measure the abundance histogram in a mock shear catalogue of approximately 5000 deg^2 using a multiscale mass-map filtering technique. We constrain the parameters of the mock survey using CAMELUS combined with approximate Bayesian computation, a robust likelihood-free inference algorithm. Peak statistics yield a tight but significantly biased constraint in the σ_8-Ω_m plane, as measured by the width ΔΣ_8 of the 1σ contour. We find Σ_8 = σ_8(Ω_m/0.27)^α = 0.77_{-0.05}^{+0.06} with α = 0.75 for a flat ΛCDM model. The strong bias indicates the need to better understand and control the model systematics before applying it to a real survey of this size or larger. We perform a calibration of the model and compare results to those from the two-point correlation functions ξ_± measured on the same field. We calibrate the ξ_± result as well, since its contours are also biased, although not as severely as for peaks. In this case, we find for peaks Σ_8 = 0.76_{-0.03}^{+0.02} with α = 0.65, while for the combined ξ_+ and ξ_- statistics the values are Σ_8 = 0.76_{-0.01}^{+0.02} and α = 0.70. We conclude that the constraining power can therefore be comparable between the two weak-lensing observables in large-field surveys. Furthermore, the tilt in the σ_8-Ω_m degeneracy direction for peaks with respect to that of ξ_± suggests that a combined analysis would yield tighter constraints than either measure alone. As expected, w_0^de cannot be well constrained without a tomographic analysis, but its degeneracy directions with the other two varied parameters are still clear for both peaks and ξ_±.
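To make the likelihood-free step concrete, the sketch below runs plain rejection ABC on a toy one-parameter peak-count model: draw from the prior, simulate, and keep parameters whose summary statistic lands close to the observed one. The Poisson toy simulator, summary, prior range, and tolerance are all illustrative assumptions; the paper uses CAMELUS with a more sophisticated ABC scheme.

```python
import numpy as np

rng = np.random.default_rng(6)

def simulate_peak_counts(sigma8, n_peaks=1000):
    """Toy stochastic 'simulator': peak counts drawn from a Poisson law whose
    mean depends on the parameter. A stand-in for CAMELUS, not the real model."""
    return rng.poisson(lam=50 * sigma8, size=n_peaks)

def summary(counts):
    return counts.mean()

# "Observed" data generated at a fiducial sigma8 = 0.8
obs_summary = summary(simulate_peak_counts(0.8))

# Rejection ABC: draw from the prior, simulate, keep parameters whose summaries are close
accepted = []
for _ in range(20000):
    sigma8 = rng.uniform(0.4, 1.2)
    if abs(summary(simulate_peak_counts(sigma8)) - obs_summary) < 0.2:
        accepted.append(sigma8)

posterior_samples = np.array(accepted)
print(posterior_samples.mean(), posterior_samples.std())
```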
- Cosmological constraints with weak lensing peak counts and second-order statistics in a large-field surveyAustin Peel, Chieh-An Lin, Francois Lanusse, and 3 more authorsIn American Astronomical Society Meeting Abstracts #229 Jan 2017
Peak statistics in weak lensing maps access the non-Gaussian information contained in the large-scale distribution of matter in the Universe. They are therefore a promising complementary probe to two-point and higher-order statistics to constrain our cosmological models. To prepare for the high precision afforded by next-generation weak lensing surveys, we assess the constraining power of peak counts in a simulated Euclid-like survey on the cosmological parameters Ω_m, σ_8, and w_0^de. In particular, we study how CAMELUS, a fast stochastic model for predicting peaks, can be applied to such large surveys. The algorithm avoids the need for time-costly N-body simulations, and its stochastic approach provides full PDF information of observables. We measure the abundance histogram of peaks in a mock shear catalogue of approximately 5,000 deg^2 using a multiscale mass map filtering technique, and we then constrain the parameters of the mock survey using CAMELUS combined with approximate Bayesian computation, a robust likelihood-free inference algorithm. We find that peak statistics yield a tight but significantly biased constraint in the σ_8-Ω_m plane, indicating the need to better understand and control the model’s systematics before applying it to a real survey of this size or larger. We perform a calibration of the model to remove the bias and compare results to those from the two-point correlation functions (2PCF) measured on the same field. In this case, we find the derived parameter Σ_8 = σ_8(Ω_m/0.27)^α = 0.76_{-0.03}^{+0.02} with α = 0.65 for peaks, while for 2PCF the values are Σ_8 = 0.76_{-0.01}^{+0.02} and α = 0.70. We conclude that the constraining power can therefore be comparable between the two weak lensing observables in large-field surveys. Furthermore, the tilt in the σ_8-Ω_m degeneracy direction for peaks with respect to that of 2PCF suggests that a combined analysis would yield tighter constraints than either measure alone. As expected, w_0^de cannot be well constrained without a tomographic analysis, but its degeneracy directions with the other two varied parameters are still clear for both peaks and 2PCF.
- Deep Generative Models of Galaxy Images for the Calibration of the Next Generation of Weak Lensing SurveysFrancois Lanusse, Siamak Ravanbakhsh, Rachel Mandelbaum, and 2 more authorsIn American Astronomical Society Meeting Abstracts #229 Jan 2017
Weak gravitational lensing has long been identified as one of the most powerful probes to investigate the nature of dark energy. As such, weak lensing is at the heart of the next generation of cosmological surveys such as LSST, Euclid or WFIRST. One particularly critical source of systematic errors in these surveys comes from the shape measurement algorithms tasked with estimating galaxy shapes. GREAT3, the last community challenge to assess the quality of state-of-the-art shape measurement algorithms, has in particular demonstrated that all current methods are biased to various degrees and, more importantly, that these biases depend on the details of the galaxy morphologies. These biases can be measured and calibrated by generating mock observations where a known lensing signal has been introduced and comparing the resulting measurements to the ground truth. Producing these mock observations, however, requires input galaxy images of higher resolution and S/N than the simulated survey, which typically implies acquiring extremely expensive space-based observations. The goal of this work is to train a deep generative model on already available Hubble Space Telescope data, which can then be used to sample new galaxy images conditioned on parameters such as magnitude, size or redshift and exhibiting complex morphologies. Such a model allows us to inexpensively produce large sets of realistic images for calibration purposes. We implement a conditional generative model based on state-of-the-art deep learning methods and fit it to deep galaxy images from the COSMOS survey. The quality of the model is assessed by computing an extensive set of galaxy morphology statistics on the generated images. Beyond simple second-moment statistics such as size and ellipticity, we apply more complex statistics specifically designed to be sensitive to disturbed galaxy morphologies. We find excellent agreement between the morphologies of real and model-generated galaxies. Our results suggest that such deep generative models represent a reliable alternative to the acquisition of expensive high-quality observations for generating the calibration data needed by the next generation of weak lensing surveys.
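The unweighted second-moment statistics mentioned above (size and ellipticity) can be written down in a few lines; the NumPy sketch below is only meant to make those quantities concrete and is not the full morphology-statistics suite used for the comparison.

```python
import numpy as np

def second_moment_shape(image):
    """Size and complex ellipticity from unweighted second moments of a
    background-subtracted galaxy postage stamp."""
    y, x = np.indices(image.shape)
    flux = image.sum()
    xc, yc = (x * image).sum() / flux, (y * image).sum() / flux
    qxx = ((x - xc) ** 2 * image).sum() / flux
    qyy = ((y - yc) ** 2 * image).sum() / flux
    qxy = ((x - xc) * (y - yc) * image).sum() / flux
    size = qxx + qyy                                    # trace of the moment matrix
    ellipticity = (qxx - qyy + 2j * qxy) / (qxx + qyy)  # distortion-style definition
    return size, ellipticity
```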
2016
- Enabling Dark Energy Science with Deep Generative Models of Galaxy ImagesSiamak Ravanbakhsh, Francois Lanusse, Rachel Mandelbaum, and 2 more authorsarXiv e-prints Sep 2016
Understanding the nature of dark energy, the mysterious force driving the accelerated expansion of the Universe, is a major challenge of modern cosmology. The next generation of cosmological surveys, specifically designed to address this issue, rely on accurate measurements of the apparent shapes of distant galaxies. However, shape measurement methods suffer from various unavoidable biases and therefore will rely on a precise calibration to meet the accuracy requirements of the science analysis. This calibration process remains an open challenge, as it requires large sets of high-quality galaxy images. To this end, we study the application of deep conditional generative models to generating realistic galaxy images. In particular, we consider variations on the conditional variational autoencoder and introduce a new adversarial objective for the training of conditional generative networks. Our results suggest a reliable alternative to the acquisition of expensive high-quality observations for generating the calibration data needed by the next generation of cosmological surveys.
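As a rough orientation on the conditional variational autoencoder mentioned in this abstract, the toy PyTorch sketch below shows the conditioning, the reparameterisation trick and the ELBO loss for a small fully connected model; the layer sizes and conditioning variables are placeholders, and the adversarial objective introduced in the paper is not included.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConditionalVAE(nn.Module):
    """Toy conditional VAE: encode a flattened image x conditioned on parameters y
    (e.g. magnitude, size, redshift), then decode from (z, y)."""
    def __init__(self, n_pix=64 * 64, n_cond=3, n_latent=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_pix + n_cond, 256), nn.ReLU(),
                                 nn.Linear(256, 2 * n_latent))
        self.dec = nn.Sequential(nn.Linear(n_latent + n_cond, 256), nn.ReLU(),
                                 nn.Linear(256, n_pix))

    def forward(self, x, y):
        mu, logvar = self.enc(torch.cat([x, y], dim=-1)).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterisation trick
        return self.dec(torch.cat([z, y], dim=-1)), mu, logvar

def elbo_loss(x, x_rec, mu, logvar):
    rec = F.mse_loss(x_rec, x, reduction="sum")                    # reconstruction term
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())   # KL divergence to N(0, I)
    return rec + kl
```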
- High resolution weak lensing mass mapping combining shear and flexionF. Lanusse, J. -L. Starck, A. Leonard, and 1 more authorAstronomy and Astrophysics Jun 2016
Aims: We propose a new mass mapping algorithm, specifically designed to recover small-scale information from a combination of gravitational shear and flexion. Including flexion allows us to supplement the shear on small scales in order to increase the sensitivity to substructures and the overall resolution of the convergence map without relying on strong lensing constraints. Methods: To preserve all available small-scale information, we avoid any binning of the irregularly sampled input shear and flexion fields and treat the mass mapping problem as a general ill-posed inverse problem, which is regularised using a robust multi-scale wavelet sparsity prior. The resulting algorithm incorporates redshift, reduced shear, and reduced flexion measurements for individual galaxies and is made highly efficient by the use of fast Fourier estimators. Results: We tested our reconstruction method on a set of realistic weak lensing simulations corresponding to typical HST/ACS cluster observations and demonstrate our ability to recover substructures with the inclusion of flexion, which are otherwise lost if only shear information is used. In particular, we can detect substructures on the 15″ scale well outside of the critical region of the clusters. In addition, flexion also helps to constrain the shape of the central regions of the main dark matter halos. Our mass mapping software, called Glimpse2D, is made freely available at http://www.cosmostat.org/software/glimpse
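For context, the classical binned Fourier estimator that maps shear to convergence on a flat sky can be written in a few lines. This Kaiser-Squires-type sketch is only the simplest point of comparison; the Glimpse2D algorithm described above instead works on unbinned shear and flexion with a wavelet sparsity prior.

```python
import numpy as np

def kaiser_squires(gamma1, gamma2):
    """Flat-sky Fourier inversion of binned shear maps into an E-mode convergence map."""
    ny, nx = gamma1.shape
    k1 = np.fft.fftfreq(nx)[None, :]
    k2 = np.fft.fftfreq(ny)[:, None]
    k_sq = k1 ** 2 + k2 ** 2
    k_sq[0, 0] = 1.0                                   # avoid division by zero at k = 0
    g1_hat, g2_hat = np.fft.fft2(gamma1), np.fft.fft2(gamma2)
    kappa_hat = ((k1 ** 2 - k2 ** 2) * g1_hat + 2.0 * k1 * k2 * g2_hat) / k_sq
    kappa_hat[0, 0] = 0.0                              # the mean convergence is unconstrained
    return np.real(np.fft.ifft2(kappa_hat))
```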
2015
- 3D galaxy clustering with future wide-field surveys: Advantages of a spherical Fourier-Bessel analysisF. Lanusse, A. Rassat, and J. -L. StarckAstronomy and Astrophysics Jun 2015
Context. Upcoming spectroscopic galaxy surveys are extremely promising to help in addressing the major challenges of cosmology, in particular in understanding the nature of the dark universe. The strength of these surveys, naturally described in spherical geometry, comes from their unprecedented depth and width, but an optimal extraction of their three-dimensional information is of utmost importance to best constrain the properties of the dark universe. Aims: Although there is theoretical motivation and novel tools to explore these surveys using the 3D spherical Fourier-Bessel (SFB) power spectrum of galaxy number counts C_ℓ(k,k'), most survey optimisations and forecasts are based on the tomographic spherical harmonics power spectrum C_ℓ^(ij). The goal of this paper is to perform a new investigation of the information that can be extracted from these two analyses in the context of planned stage IV wide-field galaxy surveys. Methods: We compared tomographic and 3D SFB techniques by comparing the forecast cosmological parameter constraints obtained from a Fisher analysis. The comparison was made possible by careful and coherent treatment of non-linear scales in the two analyses, which makes this study the first to compare 3D SFB and tomographic constraints on an equal footing. Nuisance parameters related to a scale- and redshift-dependent galaxy bias were also included in the computation of the 3D SFB and tomographic power spectra for the first time. Results: Tomographic and 3D SFB methods can recover similar constraints in the absence of systematics. This requires choosing an optimal number of redshift bins for the tomographic analysis, which we computed to be N = 26 for z_med ≃ 0.4, N = 30 for z_med ≃ 1.0, and N = 42 for z_med ≃ 1.7. When marginalising over nuisance parameters related to the galaxy bias, the forecast 3D SFB constraints are less affected by this source of systematics than the tomographic constraints. In addition, the rate of increase of the figure of merit as a function of median redshift is higher for the 3D SFB method than for the 2D tomographic method. Conclusions: Constraints from the 3D SFB analysis are less sensitive to unavoidable systematics stemming from a redshift- and scale-dependent galaxy bias. Even for surveys that are optimised with tomography in mind, a 3D SFB analysis is more powerful. In addition, for survey optimisation, the figure of merit for the 3D SFB method increases more rapidly with redshift, especially at higher redshifts, suggesting that the 3D SFB method should be preferred for designing and analysing future wide-field spectroscopic surveys. CosmicPy, the Python package developed for this paper, is freely available at https://cosmicpy.github.io. Appendices are available in electronic form at http://www.aanda.org/10.1051/0004-6361/201424456/olm
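The Fisher forecasting machinery behind this comparison can be sketched generically. In the snippet below, cl_model and cov_inv are hypothetical stand-ins for either the tomographic or the 3D SFB data vector and its inverse covariance; the finite-difference step is arbitrary and would need tuning in a real forecast.

```python
import numpy as np

def fisher_matrix(cl_model, theta0, cov_inv, rel_step=1e-3):
    """Gaussian Fisher matrix F_ij = dC/dtheta_i . cov_inv . dC/dtheta_j,
    with central finite differences around the fiducial parameters theta0."""
    theta0 = np.asarray(theta0, dtype=float)
    derivs = []
    for i in range(theta0.size):
        dp = np.zeros_like(theta0)
        dp[i] = rel_step * max(abs(theta0[i]), 1.0)
        derivs.append((cl_model(theta0 + dp) - cl_model(theta0 - dp)) / (2.0 * dp[i]))
    derivs = np.array(derivs)
    return derivs @ cov_inv @ derivs.T

def figure_of_merit(fisher, i=0, j=1):
    """DETF-style figure of merit for the marginalised (i, j) parameter pair."""
    sub = np.linalg.inv(fisher)[np.ix_([i, j], [i, j])]
    return 1.0 / np.sqrt(np.linalg.det(sub))
```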
- Weak lensing reconstructions in 2D and 3D: implications for cluster studiesAdrienne Leonard, François Lanusse, and Jean-Luc StarckMonthly Notices of the Royal Astronomical Society May 2015
We compare the efficiency with which 2D and 3D weak lensing mass mapping techniques are able to detect clusters of galaxies using two state-of-the-art mass reconstruction techniques: MRLens in 2D and GLIMPSE in 3D. We simulate otherwise-empty cluster fields for 96 different virial mass-redshift combinations spanning the ranges 3 × 10^13 h^-1 M_⊙ ≤ M_vir ≤ 10^15 h^-1 M_⊙ and 0.05 ≤ z_cl ≤ 0.75, and for each generate 1000 realizations of noisy shear data in 2D and 3D. For each field, we then compute the cluster (false) detection rate as the mean number of cluster (false) detections per reconstruction over the sample of 1000 reconstructions. We show that both MRLens and GLIMPSE are effective tools for the detection of clusters from weak lensing measurements, and provide comparable quality reconstructions at low redshift. At high redshift, GLIMPSE reconstructions offer increased sensitivity in the detection of clusters, yielding cluster detection rates up to a factor of ∼10 times that seen in 2D reconstructions using MRLens. We conclude that 3D mass mapping techniques are more efficient for the detection of clusters of galaxies in weak lensing surveys than 2D methods, particularly since 3D reconstructions yield unbiased estimators of both the mass and redshift of the detected clusters directly.
- SNIa detection in the SNLS photometric analysis using Morphological Component AnalysisA. Möller, V. Ruhlmann-Kleider, F. Lanusse, and 3 more authorsJournal of Cosmology and Astroparticle Physics Apr 2015
Detection of supernovae (SNe) and, more generally, of transient events in large surveys can produce numerous false detections. In the case of a deferred processing of survey images, this implies reconstructing complete light curves for all detections, requiring sizable processing time and resources. Optimizing the detection of transient events is thus an important issue for both present and future surveys. We present here the optimization done in the SuperNova Legacy Survey (SNLS) for the 5-year data deferred photometric analysis. In this analysis, detections are derived from stacks of subtracted images with one stack per lunation. The 3-year analysis provided 300,000 detections dominated by signals of bright objects that were not perfectly subtracted. Allowing these artifacts to be detected leads not only to a waste of resources but also to possible signal coordinate contamination. We developed a subtracted image stack treatment to reduce the number of non-SN-like events using morphological component analysis. This technique exploits the morphological diversity of objects to be detected to extract the signal of interest. At the level of our subtraction stacks, SN-like events are rather circular objects, while most spurious detections exhibit different shapes. A two-step procedure was necessary to have a proper evaluation of the noise in the subtracted image stacks and thus a reliable signal extraction. We also set up a new detection strategy to obtain coordinates with good resolution for the extracted signal. SNIa Monte Carlo (MC) generated images were used to study detection efficiency and coordinate resolution. When tested on SNLS 3-year data, this procedure decreases the number of detections by a factor of two, while losing only 10% of SN-like events, almost all of them faint. MC results show that SNIa detection efficiency is equivalent to that of the original method for bright events, while the coordinate resolution is improved.
2014
- PRISM: Recovery of the primordial spectrum from Planck dataF. Lanusse, P. Paykari, J. -L. Starck, and 3 more authorsAstronomy and Astrophysics Nov 2014
Aims: The primordial power spectrum describes the initial perturbations that seeded the large-scale structure we observe today. It provides an indirect probe of inflation or other structure-formation mechanisms. In this Letter, we recover the primordial power spectrum from the Planck PR1 dataset, using our recently published algorithm PRISM. Methods: PRISM is a sparsity-based inversion method that aims at recovering features in the primordial power spectrum from the empirical power spectrum of the cosmic microwave background (CMB). This ill-posed inverse problem is regularised using a sparsity prior on features in the primordial power spectrum in a wavelet dictionary. Although this non-parametric method does not assume a strong prior on the shape of the primordial power spectrum, it is able to recover both its general shape and localised features. As a result, this approach presents a reliable way of detecting deviations from the currently favoured scale-invariant spectrum. Results: We applied PRISM to 100 simulated Planck data sets to investigate its performance on Planck-like data. We then applied PRISM to the Planck PR1 power spectrum to recover the primordial power spectrum. We also tested the algorithm's ability to recover a small localised feature at k ∼ 0.125 Mpc^-1, which caused a large dip at ℓ ∼ 1800 in the angular power spectrum. Conclusions: We find no significant departures from the fiducial Planck PR1 near scale-invariant primordial power spectrum with A_s = 2.215 × 10^-9 and n_s = 0.9624.
- Cluster identification with 3D weak lensing density reconstructionsAdrienne Leonard, François Lanusse, and Jean-Luc StarckIn Building the Euclid Cluster Survey - Scientific Program Jul 2014
- PRISM: Sparse recovery of the primordial power spectrumP. Paykari, F. Lanusse, J. -L. Starck, and 2 more authorsAstronomy and Astrophysics Jun 2014
Aims: The primordial power spectrum describes the initial perturbations in the Universe which eventually grew into the large-scale structure we observe today, and thereby provides an indirect probe of inflation or other structure-formation mechanisms. Here, we introduce a new method to estimate this spectrum from the empirical power spectrum of cosmic microwave background maps. Methods: A sparsity-based linear inversion method, named PRISM, is presented. This technique leverages a sparsity prior on features in the primordial power spectrum in a wavelet basis to regularise the inverse problem. This non-parametric approach does not assume a strong prior on the shape of the primordial power spectrum, yet is able to correctly reconstruct its global shape as well as localised features. These advantages make this method robust for detecting deviations from the currently favoured scale-invariant spectrum. Results: We investigate the strength of this method on a set of WMAP nine-year simulated data for three types of primordial power spectra: a near scale-invariant spectrum, a spectrum with a small running of the spectral index, and a spectrum with a localised feature. We show that this technique can easily detect deviations from a pure scale-invariant power spectrum and is suitable for distinguishing between simple models of inflation. We process the WMAP nine-year data and find no significant departure from a near scale-invariant power spectrum with the spectral index n_s = 0.972. Conclusions: A high-resolution primordial power spectrum can be reconstructed with this technique, where any strong local deviations or small global deviations from a pure scale-invariant spectrum can easily be detected.
- GLIMPSE: accurate 3D weak lensing reconstructions using sparsityAdrienne Leonard, François Lanusse, and Jean-Luc StarckMonthly Notices of the Royal Astronomical Society May 2014
We present GLIMPSE - Gravitational Lensing Inversion and MaPping with Sparse Estimators - a new algorithm to generate density reconstructions in three dimensions from photometric weak lensing measurements. This is an extension of earlier work in one dimension aimed at applying compressive sensing theory to the inversion of gravitational lensing measurements to recover 3D density maps. Using the assumption that the density can be represented sparsely in our chosen basis - 2D transverse wavelets and 1D line-of-sight Dirac functions - we show that clusters of galaxies can be identified and accurately localized and characterized using this method. Throughout, we use simulated data consistent with the quality currently attainable in large surveys. We present a thorough statistical analysis of the errors and biases in both the redshifts of detected structures and their amplitudes. The GLIMPSE method is able to produce reconstructions at significantly higher resolution than the input data; in this paper, we show reconstructions with 6 times finer redshift resolution than the shear data. Considering cluster simulations with 0.05 ≤ z_cl ≤ 0.75 and 3 × 10^13 ≤ M_vir ≤ 10^15 h^-1 M_⊙, we show that the redshift extent of detected peaks is typically 1-2 pixels, or ∆z ≲ 0.07, and that we are able to recover an unbiased estimator of the redshift of a detected cluster by considering many realizations of the noise. We also recover an accurate estimator of the mass, which is largely unbiased when the redshift is known and whose bias is constrained to ≲ 5 per cent in the majority of our simulations when the estimated redshift is taken to be the true redshift. This shows a substantial improvement over earlier 3D inversion methods, which showed redshift smearing with a typical standard deviation of σ ∼ 0.2-0.3, a significant damping of the amplitude of the peaks detected, and a bias in the detected redshift.
- Combining ProbesAnaïs Rassat, François Lanusse, Donnacha Kirk, and 2 more authorsIn Statistical Challenges in 21st Century Cosmology May 2014
With the advent of wide-field surveys, cosmology has entered a new golden age of data in which our cosmological model and the nature of the dark universe will be tested with unprecedented accuracy, so that we can strive for high-precision cosmology. Observational probes like weak lensing, galaxy surveys and the cosmic microwave background, as well as other observations, will all contribute to these advances. These different probes trace the underlying expansion history and growth of structure in complementary ways and can be combined in order to extract cosmological parameters as well as possible. With future wide-field surveys, observational overlap means these will trace the same physical underlying dark matter distribution, and extra care must be taken when combining information from different probes. Consideration of probe combination is a fundamental aspect of cosmostatistics and important to ensure optimal use of future wide-field surveys.
- Density reconstruction from 3D lensing: Application to galaxy clustersFrançois Lanusse, Adrienne Leonard, and Jean-Luc StarckIn Statistical Challenges in 21st Century Cosmology May 2014
Using the 3D information provided by photometric or spectroscopic weak lensing surveys, it has become possible in the last few years to address the problem of mapping the matter density contrast in three dimensions from gravitational lensing. We recently proposed a new non-linear sparsity-based reconstruction method allowing for high-resolution reconstruction of the over-density. This new technique represents a significant improvement over previous linear methods and opens the way to new applications of 3D weak lensing density reconstruction. In particular, we demonstrate for the first time that reconstructed over-density maps can be used to detect and characterise galaxy clusters in mass and redshift.
- PRISM: Sparse recovery of the primordial spectrum from WMAP9 and Planck datasetsP. Paykari, F. Lanusse, J. -L. Starck, and 2 more authorsIn Statistical Challenges in 21st Century Cosmology May 2014
The primordial power spectrum is an indirect probe of inflation or other structure-formation mechanisms. We introduce a new method, named PRISM, to estimate this spectrum from the empirical cosmic microwave background (CMB) power spectrum. This is a sparsity-based inversion method, which leverages a sparsity prior on features in the primordial spectrum in a wavelet dictionary to regularise the inverse problem. This non-parametric approach is able to reconstruct the global shape as well as localised features of the primordial spectrum accurately and proves to be robust for detecting deviations from the currently favoured scale-invariant spectrum. We investigate the strength of this method on a set of WMAP nine-year simulated data for three types of primordial spectra and then process the WMAP nine-year data as well as the Planck PR1 data. We find no significant departures from a near scale-invariant spectrum.
2013
- Imaging dark matter using sparsityFrançois Lanusse, Adrienne Leonard, and Jean-Luc StarckIn Wavelets and Sparsity XV Sep 2013
We present an application of sparse regularization of ill-posed linear inverse problems to the reconstruction of the 3D distribution of dark matter in the Universe. By its very nature, dark matter cannot be directly observed; nevertheless, it can be studied through its gravitational effects. In particular, the presence of dark matter induces small deformations in the shapes of background galaxies, a phenomenon known as weak gravitational lensing. However, reconstructing the 3D distribution of dark matter from tomographic lensing measurements amounts to solving an ill-posed linear inverse problem. Considering that the 3D dark matter density is sparse in an appropriate wavelet-based 3D dictionary, we propose an iterative thresholding algorithm to solve a penalized least-squares problem. We present our results on simulated dark matter halos and compare them to state-of-the-art linear reconstruction techniques. We show that, thanks to our 3D sparsity constraint, the quality of the reconstructed maps can be greatly improved.
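The iterative thresholding scheme for the sparsity-penalised least-squares problem described above follows the standard ISTA template. The sketch below is that generic template, with A and At as hypothetical handles for the forward lensing operator (composed with wavelet synthesis) and its adjoint; the paper's 3D implementation adds the specific dictionary and further refinements.

```python
import numpy as np

def power_iteration(A, At, shape, n_iter=30, seed=0):
    """Estimate the largest eigenvalue of At(A(.)), i.e. the squared operator norm of A."""
    v = np.random.default_rng(seed).standard_normal(shape)
    for _ in range(n_iter):
        v = At(A(v))
        v /= np.linalg.norm(v)
    return np.linalg.norm(At(A(v)))

def ista(y, A, At, lam, n_iter=200):
    """Iterative soft-thresholding for min_x 0.5 * ||y - A(x)||^2 + lam * ||x||_1,
    where x are (wavelet) coefficients of the reconstructed density."""
    x = np.zeros_like(At(y))
    step = 1.0 / power_iteration(A, At, x.shape)       # gradient step below the Lipschitz bound
    for _ in range(n_iter):
        x = x - step * At(A(x) - y)                    # gradient step on the data fidelity term
        x = np.sign(x) * np.maximum(np.abs(x) - step * lam, 0.0)  # soft thresholding
    return x
```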
- 3D sparse representations on the sphere and applications in astronomyFrançois Lanusse, and Jean-Luc StarckIn Wavelets and Sparsity XV Sep 2013
We present several 3D sparse decompositions based on wavelets on the sphere that are useful for different kinds of data sets, such as regular 3D spherical measurements (r, θ, φ) and multichannel spherical measurements (λ, θ, φ). We show how these new decompositions can be used for astronomical data denoising and deconvolution when the data are contaminated by Gaussian and Poisson noise.
2012
- Spherical 3D isotropic waveletsF. Lanusse, A. Rassat, and J. -L. StarckAstronomy and Astrophysics Apr 2012
Context. Future cosmological surveys will provide 3D large-scale structure maps with large sky coverage, for which a 3D spherical Fourier-Bessel (SFB) analysis in spherical coordinates is natural. Wavelets are particularly well suited to the analysis and denoising of cosmological data, but a spherical 3D isotropic wavelet transform does not currently exist to analyse spherical 3D data. Aims: The aim of this paper is to present a new formalism for a spherical 3D isotropic wavelet, i.e. one based on the SFB decomposition of a 3D field, and to accompany the formalism with a public code to perform wavelet transforms. Methods: We describe a new 3D isotropic spherical wavelet decomposition based on the undecimated wavelet transform (UWT) described in Starck et al. (2006). We also present a new fast discrete spherical Fourier-Bessel transform (DSFBT) based on both a discrete Bessel transform and the HEALPix angular pixelisation scheme. We test the 3D wavelet transform and, as a toy application, apply a denoising algorithm in wavelet space to the Virgo large box cosmological simulations, finding that we can successfully remove noise without much loss to the large-scale structure. Results: We have described a new spherical 3D isotropic wavelet transform, ideally suited to analyse and denoise future 3D spherical cosmological surveys, which uses a novel DSFBT. We illustrate its potential use for denoising using a toy model. All the algorithms presented in this paper are available for download as a public code called MRS3D at http://jstarck.free.fr/mrs3d.html
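For reference, one common convention for the spherical Fourier-Bessel expansion that this wavelet construction builds on is written below; normalisations differ between references, so this should be read as a sketch of the decomposition rather than the exact convention adopted in the paper or in MRS3D.

```latex
% SFB synthesis of a 3D field from its coefficients f_{lm}(k), and the
% corresponding analysis integral (one common normalisation convention):
f(r,\theta,\varphi) = \sqrt{\tfrac{2}{\pi}} \sum_{\ell=0}^{\infty} \sum_{m=-\ell}^{\ell}
  \int_0^{\infty} f_{\ell m}(k)\, j_\ell(k r)\, Y_{\ell m}(\theta,\varphi)\, k\,\mathrm{d}k ,
\qquad
f_{\ell m}(k) = \sqrt{\tfrac{2}{\pi}} \int f(r,\theta,\varphi)\, j_\ell(k r)\, Y^{*}_{\ell m}(\theta,\varphi)\, k\,\mathrm{d}^3 r .
```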