Issue 1, December 2022
Issue Editors: Jessi Cisewski-Kehe, David W. Hogg, Vinay L. Kashyap, Aneta Siemiginowska
Introducing the Newsletter
Astrostatistics News (AN) is a newsletter designed to inform, promote, cultivate, and inspire the astrostatistics community. The AN editors are Jessi Cisewski-Kehe (UW-Madison), David W. Hogg (NYU; Flatiron), Vinay L. Kashyap (CfA), and Aneta Siemiginowska (CfA). AN was established in late 2022 with encouragement from the International Astrostatistics Association. We anticipate 2-3 issues per year, with potential for more.
Astrostatistics News Mission Statement
Astrostatistics News serves the astrostatistics community by highlighting and describing recent research developments in astrostatistics at an accessible level to the diverse backgrounds of its members, sharing interesting new algorithms, software, or data sets, promoting relevant events, and striving to inspire new researchers to join in the fun.
Subscribe to Astrostatistics News
To subscribe to Astrostatistics News, go to https://groups.google.com/g/astrostatistics-news and select the “Join group” button. You will need to be logged into your Google account to join the group.
Please forward this information to anyone who may be interested!
Astrostatistics Student Paper Finalists
This first AN issue highlights the finalists of the 2022 Student Paper Competition conducted by the Astrostatistics Interest Group of the American Statistical Association (AIG/ASA). Each of the finalists was invited to provide a statistical and an astronomical summary of their work; these summaries were lightly edited and may include additional commentary or information from the editors. All five finalists presented their papers at the Joint Statistical Meetings in Washington, D.C. The paper by Andrew Saydjari was the winner of the competition.
Stratified Learning: A General-Purpose Statistical Method for Improved Learning under Covariate Shift
By M Autenrieth, DA van Dyk, R Trotta, DC Stenning
(Statistics) Flux-limited observations are subject to Malmquist bias, which means that nearby and/or brighter objects are more likely to have spectra that enable classification or redshift estimation. Using these objects as a training set for making estimates for more distant/fainter objects may result in bias.
To mitigate this bias, a method is proposed that partitions the data set into subgroups, based on the estimated probability of being included in the training set. Within these subgroups the training data represents the fainter objects more accurately, and machine learning models can be trained without further adjustment, improving predictions compared to existing methodologies in the literature.
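The partitioning step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the logistic model for the inclusion probability, the choice of five strata, the toy magnitudes, and the helper names `fit_logistic` and `assign_strata` are all assumptions made for illustration. The sketch estimates each object's probability of being in the training set, splits objects into quantile-based strata of that probability, and compares how well the training objects match the unlabelled objects overall and within each stratum.

```python
import numpy as np

def fit_logistic(x, y, n_iter=2000, lr=0.5):
    """Fit P(y=1 | x) by gradient descent; returns (intercept, slope)."""
    X = np.column_stack([np.ones_like(x), x])
    w = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def assign_strata(scores, n_strata=5):
    """Label each object 0..n_strata-1 by quantile of its estimated score."""
    inner_edges = np.quantile(scores, np.linspace(0, 1, n_strata + 1)[1:-1])
    return np.searchsorted(inner_edges, scores, side="right")

rng = np.random.default_rng(0)

# Toy survey (assumption): brighter objects, i.e. smaller magnitudes,
# are more likely to have been labelled and so to be in the training set.
mag = rng.uniform(18.0, 24.0, size=4000)
p_labelled = 1.0 / (1.0 + np.exp(1.5 * (mag - 21.0)))
in_training = rng.random(mag.size) < p_labelled

# Estimate each object's probability of being included in the training set.
x = (mag - mag.mean()) / mag.std()
w = fit_logistic(x, in_training.astype(float))
scores = 1.0 / (1.0 + np.exp(-(w[0] + w[1] * x)))
strata = assign_strata(scores)

# Mismatch between training and unlabelled objects, overall and per stratum.
overall_gap = abs(mag[in_training].mean() - mag[~in_training].mean())
stratum_gaps = [
    abs(mag[(strata == k) & in_training].mean()
        - mag[(strata == k) & ~in_training].mean())
    for k in range(5)
]
print(f"overall gap: {overall_gap:.2f} mag")
print("within-stratum gaps:", np.round(stratum_gaps, 2))
```

In this toy setting the training and unlabelled objects differ much less in magnitude within each stratum than they do overall, which is the property that lets a model be trained separately per stratum.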
(Astronomy) New telescopes and large-scale astronomical surveys are producing rapidly growing numbers of observations of astronomical objects. Few of these objects (mostly nearby and bright ones) are accurately characterized/categorized by astronomers via time-expensive follow-up analysis.
This requires the development of automated classification/regression methods that accurately predict labels for the large set of unlabeled objects, based on the small set of non-representative training data. The proposed methodology is particularly beneficial in classifying different types of supernovae (exploding stars with very high luminosity) and in accurately estimating galaxy redshift, a key measure of distance in cosmology.
Functional Data Analysis for Extracting the Intrinsic Dimensionality of Spectra: Application to Chemical Homogeneity in the Open Cluster M67
By AA Patil, J Bovy, G Eadie, S Jaimungal
The Astrophysical Journal, 926(51), 2022
(Statistics) A novel statistical method based on Functional Principal Component Analysis is used to disentangle the low-dimensional chemical structure of stars hidden in noisy spectroscopic data. Using this structure, populations of stars can be directly analyzed without needing to infer chemistry. This is verified by dimensionally reducing the spectra of stars in the open cluster M67 and inferring stellar chemistry. To ensure accurate and efficient inference, a state-of-the-art Bayesian method called Sequential Neural Likelihood, which trains neural networks to infer model parameters, is applied. Results suggest that the birth chemistry of M67 is homogeneous, which has promising implications for understanding the star formation history and dynamical evolution of the Milky Way and other galaxies.
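The idea of extracting a low-dimensional structure from noisy spectra can be sketched as follows. This is a simplified stand-in, not the paper's method: it applies ordinary PCA (via the singular value decomposition) to spectra sampled on a common pixel grid rather than a full functional PCA, and the two spectral features, the noise level, and the 95% variance threshold are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_stars, n_pix = 200, 300
pixel = np.arange(n_pix)

# Two hypothetical spectral features whose strengths vary from star to star.
feature_a = np.exp(-0.5 * ((pixel - 100) / 5.0) ** 2)
feature_b = np.exp(-0.5 * ((pixel - 200) / 8.0) ** 2)
true_scores = rng.normal(size=(n_stars, 2))

# Toy spectra: flat continuum, two varying features, small pixel noise.
spectra = 1.0 - 0.1 * (true_scores @ np.vstack([feature_a, feature_b]))
spectra += rng.normal(scale=0.002, size=spectra.shape)

# PCA via SVD of the mean-subtracted spectra.
centered = spectra - spectra.mean(axis=0)
_, singular_values, components = np.linalg.svd(centered, full_matrices=False)
explained = singular_values**2 / np.sum(singular_values**2)

# Smallest number of components explaining at least 95% of the variance.
intrinsic_dim = int(np.searchsorted(np.cumsum(explained), 0.95) + 1)

# Low-dimensional representation of each star.
star_scores = centered @ components[:intrinsic_dim].T
print("estimated intrinsic dimensionality:", intrinsic_dim)
```

Each star is then described by a handful of scores rather than hundreds of pixel values, and populations of stars can be compared in that reduced space.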
(Astronomy) Star clusters are excellent astrophysical laboratories to study the history of star formation and chemical enrichment in our Galaxy. These are groupings of stars born out of the same gas cloud and are theoretically expected to have similar chemical compositions. Empirically validating this chemical homogeneity is important because it allows us to trace stars back to their birth locations and understand the dynamical history of our Galaxy. However, measuring the chemistry of stars accurately and precisely from noisy, high-dimensional spectroscopic observations is challenging. In particular, the pre-processing pipelines and standard inference procedures are based on problematic assumptions that lead to systematic errors.
Photometry on Structured Backgrounds: Local Pixel-wise Infilling by Regression
By AK Saydjari*, DP Finkbeiner
The Astrophysical Journal, 933(2), 2022
*winner of the competition
(Statistics) To correctly measure the brightness of a star that appears on top of complicated background features, the proposed approach predicts what the background might have looked like if the star were not there. A distribution of possible backgrounds is built from samples of nearby uncontaminated background. Predictions are then made that are consistent with both that distribution and the actual background observed at the edge of the star’s influence. Making many such predictions provides not only a correction to the measured brightness of the star but also a correct estimate of the uncertainty of that measurement. The method is repeated for every detection of a star in an astronomical survey; it was carried out for all 34 billion detections in the Dark Energy Camera Plane Survey, the largest photometric catalog of objects taken with a single camera.
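One way to picture the infilling step is with a conditional Gaussian prediction. The sketch below is an illustration under stated assumptions rather than the authors' pipeline: it assumes the background pixels are jointly Gaussian, uses a one-dimensional strip of nine pixels instead of an image, and simulates the "nearby uncontaminated" samples from a made-up smooth covariance. The helper name `condition_gaussian` is hypothetical.

```python
import numpy as np

def condition_gaussian(mean, cov, observed_idx, hidden_idx, observed_values):
    """Mean and covariance of hidden pixels given the observed pixels."""
    c_oo = cov[np.ix_(observed_idx, observed_idx)]
    c_ho = cov[np.ix_(hidden_idx, observed_idx)]
    c_hh = cov[np.ix_(hidden_idx, hidden_idx)]
    gain = np.linalg.solve(c_oo, c_ho.T).T  # equals c_ho @ inv(c_oo)
    cond_mean = mean[hidden_idx] + gain @ (observed_values - mean[observed_idx])
    cond_cov = c_hh - gain @ c_ho.T
    return cond_mean, 0.5 * (cond_cov + cond_cov.T)  # symmetrize

rng = np.random.default_rng(2)
n_pix = 9
idx = np.arange(n_pix)

# Made-up smooth background statistics used only to simulate toy data.
true_mean = np.full(n_pix, 10.0)
true_cov = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / 2.0) ** 2)
true_cov += 0.01 * np.eye(n_pix)

# Build the background distribution from nearby star-free samples.
nearby = rng.multivariate_normal(true_mean, true_cov, size=2000)
mean_hat = nearby.mean(axis=0)
cov_hat = np.cov(nearby, rowvar=False)

# Pixels 3-5 are covered by the star; the rest are clean background.
hidden_idx = np.array([3, 4, 5])
observed_idx = np.array([0, 1, 2, 6, 7, 8])
patch = rng.multivariate_normal(true_mean, true_cov)

cond_mean, cond_cov = condition_gaussian(
    mean_hat, cov_hat, observed_idx, hidden_idx, patch[observed_idx]
)

# Many plausible backgrounds under the star, and the implied background flux.
draws = rng.multivariate_normal(cond_mean, cond_cov, size=1000)
background_flux = draws.sum(axis=1)
print(f"background under star: {background_flux.mean():.2f} "
      f"+/- {background_flux.std():.2f}")
```

The spread of `background_flux` across draws is what feeds into the uncertainty of the corrected stellar brightness in this toy picture.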
(Astronomy) Using telescope images to measure the brightness of stars in different colors of light, astronomers can infer properties of the stars themselves, their distribution in the galaxy, and properties of the interstellar medium that lies between us on Earth and a given star. However, in the presence of structured backgrounds such as bright gaseous nebulae, it is difficult to correctly measure how much light is coming from the star as opposed to the gas. This problem is particularly troublesome in the Galactic plane where the majority of both the stars and gas reside. Thus, it is imperative to develop techniques to correct the brightness of stars measured from images as complicated as those of the Galactic plane in order to learn about our Galaxy.
The Mass of the Milky Way from the H3 Survey
By J Shen, GM Eadie, N Murray, D Zaritsky, JS Speagle, Y-S Ting, C Conroy, PA Cargile, BD Johnson, RP Naidu, JJ Han
The Astrophysical Journal, 925(1), 2022
(Statistics) Measurements in astronomy are always noisy, and some measurements are very difficult to obtain, leading to partially missing data. A flexible model is designed that incorporates this noise and missing data when using the dynamics of stars to infer the mass of the Milky Way. However, this flexibility makes the model very computationally expensive: with standard tools, inference would take far too long. Instead, the No-U-Turn sampler is applied, an algorithm that uses many clever tricks to make this inference orders of magnitude faster.
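The No-U-Turn sampler builds on Hamiltonian Monte Carlo (HMC), which uses gradients of the log-probability to propose distant moves. The sketch below shows plain HMC with a fixed number of leapfrog steps, not the No-U-Turn sampler itself, which chooses the trajectory length automatically. The two-dimensional Gaussian target, the step size, and the function names are assumptions for illustration and are unrelated to the Milky Way model in the paper.

```python
import numpy as np

def hmc_sample(log_prob_and_grad, x0, n_samples, step=0.2, n_leapfrog=10, seed=0):
    """Basic HMC with a fixed-length leapfrog trajectory."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    log_p, grad = log_prob_and_grad(x)
    samples = np.empty((n_samples, x.size))
    n_accepted = 0
    for i in range(n_samples):
        p0 = rng.standard_normal(x.size)   # fresh momentum each iteration
        x_new = x.copy()
        p = p0 + 0.5 * step * grad         # initial half step in momentum
        for j in range(n_leapfrog):
            x_new = x_new + step * p       # full step in position
            log_p_new, grad_new = log_prob_and_grad(x_new)
            if j < n_leapfrog - 1:
                p = p + step * grad_new    # full step in momentum
        p = p + 0.5 * step * grad_new      # final half step in momentum
        # Metropolis correction for the leapfrog discretization error.
        log_accept = (log_p_new - 0.5 * p @ p) - (log_p - 0.5 * p0 @ p0)
        if np.log(rng.random()) < log_accept:
            x, log_p, grad = x_new, log_p_new, grad_new
            n_accepted += 1
        samples[i] = x
    return samples, n_accepted / n_samples

# Toy target (assumption): a correlated two-dimensional Gaussian.
target_mean = np.array([1.0, -2.0])
target_cov = np.array([[1.0, 0.8], [0.8, 1.0]])
precision = np.linalg.inv(target_cov)

def gaussian_log_prob_and_grad(x):
    diff = x - target_mean
    return -0.5 * diff @ precision @ diff, -precision @ diff

samples, accept_rate = hmc_sample(gaussian_log_prob_and_grad, np.zeros(2), 2000)
kept = samples[200:]  # discard early samples as warm-up
print("posterior mean estimate:", np.round(kept.mean(axis=0), 2))
print(f"acceptance rate: {accept_rate:.2f}")
```

In practice the step size and trajectory length need tuning for each problem, which is the burden that adaptive samplers such as the No-U-Turn sampler are designed to remove.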
(Astronomy) Accurately measuring the mass of the Milky Way can provide many clues about how it evolved and what might happen to it in the future. Measuring the mass of the Milky Way can be difficult because most of the mass comes from “dark matter”, which is not directly observable with telescopes. However, because dark matter influences the motion of objects in the Galaxy, it is possible to work backwards to infer the mass of the Galaxy from the motion of those objects. Data from a combination of space-based and ground-based telescopes reveal how stars move in the very outer regions of the Galaxy (where there is a lot of dark matter), and these motions are used to estimate how massive the Milky Way is.
Testing the consistency of dust laws in SN Ia host galaxies: a BayeSN examination of Foundation DR1
By S Thorp, KS Mandel, DO Jones, SM Ward, G Narayan
Monthly Notices of the Royal Astronomical Society, 508(3), 2021
(Statistics) A hierarchical Bayesian framework is built to simultaneously model the light curves (time series of brightness) of a sample of Type Ia supernovae (SNe Ia), allowing for inferences about the properties of the underlying SN Ia population and the dust in their host galaxies. A forward model is designed that describes mathematically how a sample of supernova light curves could be generated, with this model including population-level effects (common to all supernovae) and supernova-level effects (specific to each individual in the population). The aim is to solve the inverse problem of taking a sample of SN Ia light curves and making a statistical inference about the population-level properties of interest (the properties of dust in the SN Ia host galaxies). This is done by constructing a joint posterior probability distribution over all of the supernova-level and population-level parameters associated with the sample. Samples are drawn from this many-dimensional probability distribution using Stan's Hamiltonian Monte Carlo algorithm.
(Astronomy) SNe Ia are standardizable candles, meaning that models of their light curves can be built that enable their use as distance indicators to far-away galaxies. To correctly estimate the distances to SNe Ia, a good understanding is needed of the interstellar dust in their host galaxies, which has a dimming effect on the supernova light curves and can thus be confounded with distance. It is particularly important to understand whether the properties of dust correlate with other host galaxy properties, such as stellar mass. The aim is to infer the population distribution of dust laws (functions describing dimming vs. wavelength) in SN Ia host galaxies by building a model for SN Ia light curves that includes the effect of dust. Within this model, particular care must be taken to separate the effect of dust from the random variation intrinsic to the population of SNe Ia. Doing this correctly, while avoiding systematic calibration variations by using a homogeneous SN survey (Foundation DR1, based exclusively on Pan-STARRS-1), can help to obtain more accurate distances and thus cosmological constraints.
Astrostatistics 2023 Student Paper Competition
The 2023 astrostatistics student paper contest run by the ASA Astrostatistics Interest Group is accepting submissions now until Monday, December 5, 2022 at 11:59pm EST.
See https://astrostat.org/competition/ for more details about the contest including the eligibility requirements.
The field of astrostatistics has been growing rapidly over the past decade or so. If you are new to astrostatistics and would like an overview of the field, we list several references below that may be of interest to read – some are older and some are more recent. If you know of other good references, please let us know (see “Content suggestions” below for how to communicate with us).
Loredo, Rice, and Stein. The Annals of Applied Statistics Special Section on Statistics and Astronomy (2009).
Feigelson and Babu. “Modern statistical methods for astronomy: with R applications.” Cambridge University Press (2012).
Feigelson and Babu. “Statistical Challenges in Modern Astronomy V.” Springer Science & Business Media (2012).
Hilbe et al. “Life, the universe, and everything.” Significance 11.5 (2014): 48-75.
Ivezić et al. “Statistics, data mining, and machine learning in astronomy.” Princeton University Press (2014).
Schafer. “A Framework for Statistical Inference in Astrophysics.” Annual Review of Statistics and Its Application 2 (2015): 141-162.
Feigelson. “The changing landscape of astrostatistics and astroinformatics.” Proceedings of the International Astronomical Union 12.S325 (2016): 3-9.
Hilbe, De Souza, and Ishida. “Bayesian models for astrophysical data: using R, JAGS, Python, and Stan.” Cambridge University Press (2017).
Algeri et al. “Statistical challenges in the search for dark matter.” arXiv preprint arXiv:1807.09273 (2018).
Eadie et al. “Realizing the potential of astrostatistics and astroinformatics.” arXiv preprint arXiv:1909.11714 (2019).
Ntampaka et al. “The role of machine learning in the next decade of cosmology.” arXiv preprint arXiv:1902.10159 (2019).
Schafer and Cisewski-Kehe. Chance Magazine, Special Issue on Astrostatistics (2019).
Siemiginowska et al. “The next decade of astroinformatics and astrostatistics.” (2019).
A list of references will be maintained at our website, astrostatisticsnews.com/references.
There are a number of astrostatistics-related societies that you may be interested in joining. If you know of others, please let us know (see “Content suggestions” below for how to communicate with us).
American Astronomical Society Working Group on Astroinformatics and Astrostatistics (AAS WGAA)
American Statistical Association Astrostatistics Interest Group (AIG/ASA)
Cosmostatistics Initiative (COIN)
International Astroinformatics Association (IAIA)
International Astrostatistics Association (IAA)
International Astronomical Union Commission on Astroinformatics and Astrostatistics (IAU CAA)
International Statistical Institute Special Interest Group in Astrostatistics (ISI SIGA)
A list of societies will be maintained at our website, astrostatisticsnews.com/societies.
Job Opportunities in Astrostatistics
Research Assistant/Associate in Supernovae/Astrostatistics/Data Science
with Dr. Kaisey Mandel
Institute of Astronomy at the University of Cambridge, UK
Deadlines: Application deadline is Dec 15, 2022, selection deadline is Feb 15, 2023
Assistant Professor of Statistics - Cluster Hire
to deepen UW-Madison leadership in multi-messenger/time domain Astrophysics
University of Wisconsin-Madison
Deadlines: open until filled, anticipated begin date Aug 21, 2023
Ten PhD and Postdoc positions
related to Euclid, SKA, and JWST; cosmology (weak lensing); machine learning, and/or astrostatistics methodologies
at CEA Saclay France or project partners FORTH Heraklion Greece, ENSICAEN Caen France and Nice University
Deadlines: First deadlines are on Nov 15, 2022 and last deadlines are on Jan 31, 2023
A list of job opportunities will be maintained at our website, astrostatisticsnews.com/job-opportunities.
Statistical Challenges in Modern Astronomy VIII
June 12-16, 2023
Center for Astrostatistics, Pennsylvania State University
Deadlines: Abstract submission deadline is Feb 1, 2023, Final program will be announced Mar 15, 2023
A list of events will be maintained at our website, astrostatisticsnews.com/upcoming-events.
If you have ideas for AN content, please send a message to firstname.lastname@example.org. We may include your idea in a future issue if we think it is a good fit.
Ideas may include relevant astrostatistics papers/data/code, visualizations, upcoming events, job postings, format or commentary suggestions, etc.