Issue 4 (April 2024)

Astrostatistics News


Issue Editors:  Jessi Cisewski-Kehe, David W. Hogg, Vinay L. Kashyap, Aneta Siemiginowska

Astrostatistics News (AN) is a newsletter designed to inform, promote, cultivate, and inspire the astrostatistics community.  


Highlights

The end of Chandra?

By Jessi Cisewski-Kehe


Chandra is an Earth-orbiting X-ray observatory that has been collecting high-quality data since 1999.  Chandra has been the source and inspiration of many astrostatistics projects and collaborations. The recently released NASA budget suggests major reductions in Chandra’s financial support over the coming years.  You can get an overview of the situation in the Space.com article linked below, or learn more details (and show support for Chandra, if desired) at https://www.savechandra.org/.



Overview of Banff Workshop:  Astrostatistics in Canada and Beyond

By David Stenning (Simon Fraser University) 


The workshop Astrostatistics in Canada and Beyond was held at the Banff International Research Station (BIRS) in Banff, AB, Canada, from Oct 30 - Nov 3, 2023. Rather than focus on a particular astrophysical application, the workshop sought to cultivate possibilities for collaboration between astronomers and statisticians. At the beginning of the week, each meeting participant introduced themselves with a 60-second “lightning talk” to briefly mention their research area and to advertise their learning interests at the workshop. Throughout the week, there was a combination of review talks, research talks, and collaboration sessions. All researchers wanting to give a longer research talk were given the opportunity to do so in the mornings.  Early career researchers were given the longest time slots for research talks to emphasize their contributions. The afternoons of days 2 and 4 were spent in self-organized small-group discussions on topics such as simulation-based and likelihood-free inference, Bayesian computing, noisy/biased/incomplete data, and probabilistic catalogs. Less formal discussions continued during meals, in the evenings, and in the afternoon on day 3 with a trip to Lake Louise; the communal aspect of the BIRS workshops is a key feature that aided cross-disciplinary communication. Since the workshop, participants have already begun reporting that what began as informal conversations has led to continued discussion and the beginnings of new collaborations between statisticians and astronomers.


Recordings of the talks are available to the public at https://www.birs.ca/events/2023/5-day-workshops/23w5094/videos



Historical Astrostatistics

Astrostatistics innovations of the past are highlighted in this section.  


There is not a historical article in this issue.  If you know of a good historical astrostatistics article or innovation to highlight, please contact us (astrostatisticsnews@gmail.com).  



Spotlight

Astrostatistics innovations of the present are highlighted in this section. 


Multivariate, heteroscedastic empirical Bayes via nonparametric maximum likelihood

By  Bodhisattva Sen

Paper:  Soloff, J.A., Guntuboyina, A. and Sen, B., 2021. Multivariate, heteroscedastic empirical Bayes via nonparametric maximum likelihood. arXiv preprint arXiv:2109.03466.


Many astronomy data sets are characterized by a calibrated measurement error distribution attached to each observation, and typically these errors are heteroscedastic. For example, the noise in the raw color-magnitude diagram (CMD) (e.g., from the Gaia TGAS catalog) may obscure many known features of stellar evolution, rendering the raw CMD unreliable for downstream parallax inference. This calls for statistical methods that can denoise the raw CMD to construct a precise low-noise CMD. Empirical Bayes methods are attractive in such settings, but standard parametric approaches rest on assumptions about the form of the latent variable (prior) distribution which can be hard to justify and which introduce unnecessary tuning parameters. In this paper, we extend the nonparametric maximum likelihood estimator (NPMLE) for Gaussian location mixture densities to allow for multivariate, heteroscedastic errors. NPMLEs estimate an arbitrary latent variable distribution by solving an infinite-dimensional convex optimization problem; we show that this convex optimization problem can be tractably approximated by a finite-dimensional analogue.


The denoised estimates (aka empirical Bayes posterior means based on an NPMLE) have low regret, meaning they closely target the oracle posterior means one would compute with the true prior in hand. We apply our method to multiple denoising problems in astronomy (e.g., constructing a fully data-driven color-magnitude diagram of 1.4 million stars in the Milky Way and investigating the distribution of 19 chemical abundance ratios for 27 thousand stars in the red clump).
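To make the idea concrete, here is a minimal one-dimensional sketch of a grid-based NPMLE with heteroscedastic Gaussian errors, followed by the empirical Bayes posterior means. It is an illustration under stated assumptions, not the paper's implementation: the fixed grid of support points, the EM updates for the mixing weights, the simulated two-clump prior, and all function names are choices made here for the example, and the paper's multivariate solver may differ.

```python
import numpy as np

def npmle_grid(x, sigma, grid, n_iter=500):
    """Approximate NPMLE of a 1-D prior on a fixed grid via EM.

    x     : observed values, x_i = theta_i + noise_i
    sigma : known per-observation noise standard deviations (heteroscedastic)
    grid  : candidate support points for the prior (an assumption of this sketch)
    """
    # Likelihood matrix L[i, j] = N(x_i | grid_j, sigma_i^2)
    z = (x[:, None] - grid[None, :]) / sigma[:, None]
    L = np.exp(-0.5 * z**2) / (np.sqrt(2.0 * np.pi) * sigma[:, None])
    w = np.full(grid.size, 1.0 / grid.size)  # start from uniform weights
    for _ in range(n_iter):
        post = L * w                          # unnormalized posterior over the grid
        post /= post.sum(axis=1, keepdims=True)
        w = post.mean(axis=0)                 # EM update of the mixing weights
    return w, L

def posterior_means(w, L, grid):
    """Empirical Bayes posterior mean of each theta_i under the fitted prior."""
    post = L * w
    post /= post.sum(axis=1, keepdims=True)
    return post @ grid

# Simulated data (hypothetical): latent values in two clumps, noisy observations.
rng = np.random.default_rng(0)
n = 2000
theta = rng.choice([-2.0, 2.0], size=n)
sigma = rng.uniform(0.5, 1.5, size=n)         # each point has its own noise level
x = theta + sigma * rng.normal(size=n)

grid = np.linspace(x.min(), x.max(), 200)
w, L = npmle_grid(x, sigma, grid)
theta_hat = posterior_means(w, L, grid)

mse_raw = np.mean((x - theta) ** 2)
mse_eb = np.mean((theta_hat - theta) ** 2)
print(f"raw MSE = {mse_raw:.3f}, denoised MSE = {mse_eb:.3f}")
```

In this toy setting the denoised estimates should sit much closer to the true latent values than the raw observations do, which is the behavior the low-regret statement above describes.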


AstroPhot: fitting everything everywhere all at once in astronomical images

By  Conor Stone (Université de Montréal)

Paper:  Stone, C.J., Courteau, S., Cuillandre, J.C., Hezaveh, Y., Perreault-Levasseur, L. and Arora, N., 2023. AstroPhot: fitting everything everywhere all at once in astronomical images. Monthly Notices of the Royal Astronomical Society, 525(4), pp.6377-6393.


We are in a golden age of astronomy!

New telescopes (JWST, Euclid, Rubin, Roman, ELT, GMT, and more) are providing mountains of fascinating data full of exciting new discoveries. However, the challenges of data processing have never been greater; with such large volumes of data we need more speed, while these rich data also require more detailed analysis. To exploit the potential of these data, we present AstroPhot. It is a fully featured 2D astronomical image modeling code built from the ground up on PyTorch, a Python numerical library that gives it access to GPU acceleration and automatic differentiation.  Further, AstroPhot provides an object-oriented interface to build complex multi-band (wavelength) and multi-epoch (time) models of galaxies, point sources (e.g., stars and supernovae), and more. With a focus on statistical precision, one can seamlessly transition between maximum likelihood methods (which give parameter variance/covariance) and full MCMC modeling of the posterior density. An exciting result found in the development of AstroPhot was that simultaneous fitting of a galaxy model and the PSF was robust, though covariances notably exist between the galaxy shape and the PSF width. This means that fitting a galaxy shape with a fixed PSF model is biased in some circumstances, which could be crucial for precision applications such as weak lensing.
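The covariance between galaxy shape and PSF width can be seen in a toy calculation. The sketch below does not use AstroPhot; it assumes a one-dimensional Gaussian galaxy and a Gaussian PSF, for which convolution adds the variances, and all numbers and the function name are hypothetical.

```python
import numpy as np

def inferred_galaxy_sigma(sigma_obs, sigma_psf_assumed):
    """Galaxy width implied by an observed width under an assumed Gaussian PSF."""
    return np.sqrt(sigma_obs**2 - sigma_psf_assumed**2)

sigma_gal_true = 2.0   # hypothetical true galaxy width (pixels)
sigma_psf_true = 1.5   # hypothetical true PSF width (pixels)

# Convolving two Gaussians adds their variances.
sigma_obs = np.hypot(sigma_gal_true, sigma_psf_true)

# Fixing the PSF at a slightly wrong width shifts the inferred galaxy width.
for psf_assumed in (1.3, 1.5, 1.7):
    est = inferred_galaxy_sigma(sigma_obs, psf_assumed)
    print(f"assumed PSF {psf_assumed:.1f} -> inferred galaxy width {est:.3f}")
```

Only the correct PSF width recovers the true galaxy width here; an underestimated PSF inflates the galaxy and an overestimated one shrinks it, which is a simple version of the bias from fitting with a fixed PSF model.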


Measuring the robustness of Gaussian processes to kernel choice

By  Sameer Deshpande (University of Wisconsin-Madison)

Paper:  Stephenson, W.T., Ghosh, S., Nguyen, T.D., Yurochkin, M., Deshpande, S. and Broderick, T., 2022, May. Measuring the robustness of Gaussian processes to kernel choice. In International Conference on Artificial Intelligence and Statistics (pp. 3308-3331). PMLR.


Gaussian processes (GPs) are an increasingly popular tool for the analysis of astronomical data, with applications ranging from exoplanet discovery to gravitational waves. To use GPs for any application, data analysts must select a covariance kernel, which governs the shape, smoothness, and other properties of the latent function of interest. Ideally, analysts would elicit precise prior beliefs about the latent function (e.g., “the function is exactly twice mean-square differentiable with a particular power spectrum”) and would select a kernel that exactly encodes these beliefs (e.g., the Matérn-5/2 kernel with length scale 1). But in practice, analysts often only have vague, qualitative prior information (e.g., “the function is smooth but not too smooth”) and will select a kernel from among a handful of convenient, standard families (e.g., squared exponential, Matérn, quasi-periodic). And generally speaking, infinitely many kernels may be equally compatible with such vague beliefs; we call these “qualitatively interchangeable” kernels. We provide a practical, two-step workflow to determine whether substantive conclusions drawn based on an initially specified kernel would change if an analyst had used a qualitatively interchangeable kernel instead. In the first step, we expand a neighborhood around the initial kernel until we find a kernel that leads to different conclusions (e.g., changing the sign of the prediction at a given input). Then, we use numeric and visual diagnostics to assess whether the conclusion-changing kernel is qualitatively interchangeable with the initial kernel. If it is, we conclude that the original analysis was non-robust to kernel choice.
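As a rough illustration of why kernel choice matters, the following sketch fits the same simulated data with two standard kernels and compares the sign of the prediction at one input. This is not the paper's workflow, which searches a neighborhood of kernels around the initial one; the data, length scale, noise level, and function names are assumptions made for the example.

```python
import numpy as np

def sq_exp(r, ell):
    """Squared exponential kernel as a function of distance r."""
    return np.exp(-0.5 * (r / ell) ** 2)

def matern52(r, ell):
    """Matern-5/2 kernel as a function of distance r."""
    a = np.sqrt(5.0) * r / ell
    return (1.0 + a + a**2 / 3.0) * np.exp(-a)

def gp_posterior_mean(x_train, y_train, x_test, kernel, ell, noise_var):
    """Posterior mean of a zero-mean GP with the given stationary kernel."""
    K = kernel(np.abs(x_train[:, None] - x_train[None, :]), ell)
    K_star = kernel(np.abs(x_test[:, None] - x_train[None, :]), ell)
    alpha = np.linalg.solve(K + noise_var * np.eye(x_train.size), y_train)
    return K_star @ alpha

# Simulated data (hypothetical): a noisy sine curve sampled at random inputs.
rng = np.random.default_rng(1)
x_train = np.sort(rng.uniform(0.0, 10.0, size=30))
y_train = np.sin(x_train) + 0.1 * rng.normal(size=x_train.size)
x_test = np.array([11.0])   # a point just outside the range of the data

m_se = gp_posterior_mean(x_train, y_train, x_test, sq_exp, 1.0, 0.1**2)
m_m52 = gp_posterior_mean(x_train, y_train, x_test, matern52, 1.0, 0.1**2)
print("squared exponential:", m_se[0], " Matern-5/2:", m_m52[0])
print("same sign:", np.sign(m_se[0]) == np.sign(m_m52[0]))
```

If the two predictions disagreed in sign, the next question, following the workflow described above, would be whether the two kernels are qualitatively interchangeable for the problem at hand.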


Jargon Dictionary

General definitions of astronomy or statistical terms are included in this section.


The International Astronomical Union is constructing a glossary of common astronomy terms (see https://astro4edu.org/resources/glossary/search/).  Here we plan to build up a similar dictionary, focusing on both statistics and astronomy jargon.  


If you have comments, questions, concerns, edits, or terms you would like included please let us know at astrostatisticsnews@gmail.com.


Photons vs Counts vs Events

In high-energy astronomy, where observations are usually made at low rates of photon arrivals, astronomers make a distinction among photons, counts, and events.  By convention, the term photon usually refers to the photons before they pass through the telescope, while counts refer to the observed signal in the detector. That is, a count is the result of a photon passing through the telescope/detector system and being registered in the detector.  A count can also arise from several other causes, like cosmic rays or ambient solar wind particles, so a detected count is also referred to as an event.  Each photon that arrives at and is detected in the instrumentation as a count is an event.  Statisticians can think of events as a marked Poisson process, where the marks are attributes such as energy, photon arrival time, and arrival direction.

-VLK
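For readers who prefer code, a minimal simulation of an event list as a marked Poisson process might look like the sketch below. The exposure time, event rate, and mark distributions are placeholders chosen for illustration, not properties of any real instrument.

```python
import numpy as np

rng = np.random.default_rng(2)

exposure = 1000.0   # observation length in seconds (hypothetical)
rate = 0.05         # mean event rate in events per second (hypothetical)

# Homogeneous Poisson process: Poisson-distributed total, uniform arrival times.
n_events = rng.poisson(rate * exposure)
arrival_time = np.sort(rng.uniform(0.0, exposure, size=n_events))

# Marks attached to each event; the distributions below are placeholders.
energy_kev = rng.uniform(0.5, 8.0, size=n_events)
det_x = rng.normal(0.0, 1.0, size=n_events)   # arrival direction, detector coords
det_y = rng.normal(0.0, 1.0, size=n_events)

# One row per event: time, energy, and position on the detector.
events = np.column_stack([arrival_time, energy_kev, det_x, det_y])
print(events.shape)
```

Each row of the resulting array plays the role of one event, with its marks stored in the columns.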


Heteroscedastic (also spelled Heteroskedastic)

A data set is said to be heteroscedastic when different individual data points in the data set have different uncertainties, or have noise contributions drawn from distributions that differ in variance. The quantity that is often called χ² (the sum over data points of each residual squared divided by its uncertainty squared) is one of the cost functions employed when the data are heteroscedastic, because it weights each data point with its own unique inverse variance.

–DH
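A short sketch of the χ² cost function for heteroscedastic data follows. The measurements and uncertainties are made up, and the function name is chosen here for illustration. The example also uses the standard result that, for a constant model, χ² is minimized by the inverse-variance weighted mean.

```python
import numpy as np

def chi_squared(y, model, sigma):
    """Sum of squared residuals, each weighted by its own inverse variance."""
    return np.sum(((y - model) / sigma) ** 2)

y = np.array([1.0, 2.0, 4.0])        # made-up measurements
sigma = np.array([0.1, 0.5, 2.0])    # made-up per-point uncertainties

# For a constant model, chi-squared is minimized by the
# inverse-variance weighted mean of the data.
weights = 1.0 / sigma**2
mu_hat = np.sum(weights * y) / np.sum(weights)
print(mu_hat, chi_squared(y, mu_hat, sigma))
```

Because the first point has by far the smallest uncertainty, the weighted mean lands close to it, whereas an unweighted mean would be pulled toward the noisier points.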



Other News


Job Opportunities in Astrostatistics


Assistant Professor Big Data Astronomy
Radboud Universiteit in Nijmegen, Netherlands

Details: https://aas.org/jobregister/ad/d2d32e7c

Deadlines: Application deadline is April 21, 2024


AAS Job Register is a good resource for jobs in astronomy, and may include relevant jobs in astrostatistics or data science.


Please let us know if you hear of job opportunities in astrostatistics (astrostatisticsnews@gmail.com).  


*We are looking for a “Jobs Editor” to help identify and organize job opportunities in astrostatistics.  If you are interested, please contact Jessi at astrostatisticsnews@gmail.com.


A list of job opportunities will be maintained at our website, astrostatisticsnews.com/job-opportunities.



Astrostatistics Events


Hotwiring the Transient Universe VII

May 13-16, 2024

Dunlap Institute at the University of Toronto

Details: https://www.dunlap.utoronto.ca/hotwired7/

Deadlines:  Registration deadline: April 15, 2024 (or until full; there is capacity for 100 in-person attendees)


COSMO 21

May 20-24, 2024

Chania, Greece

Details: https://cosmo21.cosmostat.org/

Deadlines: Registration deadline: March 29, 2024 (or before if the maximum of 100 people is reached).  While the registration deadline has already passed, we include this in our list so the AN readers are aware of the event for future reference.


Summer School in Statistics for Astronomers XIX

June 3-7, 2024

Enhanced virtual online format through Penn State University

Details: https://sites.psu.edu/astrostatistics/su24/

Deadlines: Registration deadline is May 11, 2024.


Joint Statistical Meeting

Aug 3-8, 2024

Portland, OR

Details: https://ww2.amstat.org/meetings/jsm/2024/ 

Deadlines: Submission of Topic-Contributed Session Proposals Nov 15-Dec 7, 2023; Abstract submission Dec 1, 2023 - Feb 1, 2024; Meeting and Event Request submissions Jan 25-Apr 4, 2024; Registration opens May 1, 2024


The IAA maintains a list of events at

http://iaa.mi.oa-brera.inaf.it/adm_program/modules/dates/dates.php 


A list of events is maintained at our website, astrostatisticsnews.com/upcoming-events.



Content suggestions


If you have ideas for AN content, please send a message to astrostatisticsnews@gmail.com.  We may include your idea in a future issue if we think it is a good fit.


Ideas may include relevant astrostatistics papers/data/code, visualizations, upcoming events, job postings, format or commentary suggestions, etc.  



Astrostatistics News website


See astrostatisticsnews.com for more information, such as past issues and lists of astrostatistics references and societies.



Subscribe to Astrostatistics News

To subscribe to Astrostatistics News, go to https://groups.google.com/g/astrostatistics-news and select the “Join group” button.  You will need to be logged into your Google account to join the group.


Please forward this information to anyone who may be interested!