Issue 3 (Oct 2023)

Astrostatistics News

Issue 3, October 2023

Issue Editors:  Jessi Cisewski-Kehe, David W. Hogg, Vinay L. Kashyap, Aneta Siemiginowska

Astrostatistics News (AN) is a newsletter designed to inform, promote, cultivate, and inspire the astrostatistics community.  


In this issue, you will find a retrospective on Statistical Challenges in Modern Astronomy VIII, a summary of astrostatistics papers by the finalists of the Student Paper Competition of the American Statistical Association Astrostatistics Interest Group, a spotlight on several recent astrostatistics innovations, and more.


Highlights


Retrospective:  Statistical Challenges in Modern Astronomy VIII

By Eric Feigelson (Pennsylvania State University)


After seven previous installments dating back to 1991, the Statistical Challenges in Modern Astronomy VIII conference was held in person at Penn State University during June 12-16, 2023. This was the first in-person SCMA since the SCMA VI conference in 2016. Among the 65 participants in attendance, there was roughly one statistician for every 2-3 astronomers. Since their founding, SCMA meetings have specialized in bringing the two communities together. The SCMA VIII Scientific Organizing Committee was led by Emille Ishida (Clermont-Auvergne), Chad Schafer (CMU), Hyungsuk Tak (Penn State), and Ashley Villar (Penn State), along with members Francois Lanusse (CNRS), Joel Leja (Penn State), and David van Dyk (ICL).


The week of fascinating presentations commenced with inspirational opening remarks by Tom Loredo (Cornell), who has attended every SCMA meeting since 1991. He brought out a guitar and had the entire room sing a paean in honor of SCMA conferences.  The talks and posters covered a variety of topics, with links to many of the slides available in the online program.  Keynote speaker Kyle Cranmer (Wisconsin) discussed the challenge of performing statistical inference on simulators when the scientific phenomena are complex and likelihoods are intractable. He discussed several approaches to augment these computationally expensive simulations, such as importance sampling of high-dimensional spaces of nuisance parameters using probabilistic programming.  Xiao-Li Meng (Harvard) gave a keynote presentation with a lively perspective on how statisticians can effectively work with physical scientists. He demonstrated self-shrinkage estimates of variances, which correct astronomers’ often inaccurate estimates of statistical and systematic errors.



Astrostatistics Student Paper Competition Finalists


The five finalists for the 2022 Student Paper Competition conducted by the Astrostatistics Interest Group of the American Statistical Association (AIG/ASA) are listed below, along with bibliographic information and a short overview of the main astrostatistics contribution of their work.  The papers were highlighted during presentations at the 2023 Joint Statistical Meetings in Toronto, Ontario.  The paper by Jacob Nibauer (Princeton University) was the winner of the competition.


Dayi (David) Li (University of Toronto)

Light from the Darkness: Detecting Ultra-diffuse Galaxies in the Perseus Cluster through Over-densities of Globular Clusters with a Log-Gaussian Cox Process

Paper: Li, D.D., Eadie, G.M., Abraham, R., Brown, P.E., Harris, W.E., Janssens, S.R., Romanowsky, A.J., Van Dokkum, P. and Danieli, S., 2022. The Astrophysical Journal, 935(1), p.3.

Summary: We introduce a new method for detecting ultra-diffuse galaxies (UDGs) by searching for over-densities in intergalactic globular cluster (GC) populations via an application of the log-Gaussian Cox process, a model commonly used in the spatial statistics literature but not in astronomy. This method is applied to the GC data obtained from the PIPER survey, a Hubble program targeting the Perseus cluster, where we successfully detect all confirmed UDGs with known GC populations and also identify a potential first-of-its-kind system that has no detected diffuse stellar content and is unlikely to be an accidental clump of GCs or other objects.


Antoine Meyer (Imperial College London)

TD-CARMA: Painless, Accurate, and Scalable Estimates of Gravitational Lens Time Delays with Flexible CARMA Processes

Paper: Meyer, A.D., van Dyk, D.A., Tak, H. and Siemiginowska, A., 2023. The Astrophysical Journal, 950(1), p.37.

Summary: We propose TD-CARMA, a Bayesian method to estimate cosmological time delays by modeling observed and irregularly sampled light curves as realizations of a continuous auto-regressive moving average (CARMA) process. We apply TD-CARMA to six doubly lensed quasars (HS2209+1914, SDSS J1001+5027, SDSS J1206+4332, SDSS J1515+1511, SDSS J1455+1447, and SDSS J1349+1227) and produce estimates that are consistent with those derived in the relevant literature, but are typically two to four times more precise.


Jacob Nibauer (Princeton University)

Charting Galactic Accelerations with Stellar Streams and Machine Learning

Paper:  Nibauer, J., Belokurov, V., Cranmer, M., Goodman, J. and Ho, S., 2022. The Astrophysical Journal, 940(1), p.22.

Summary:  We derive a model-independent approach to connect the observed kinematics and morphology of stellar streams to a direct estimate of the galactic acceleration field local to each stream. Constraints can be combined across multiple streams at the likelihood level, providing a new flexible approach to map the dark matter in the Milky Way.


Martijn Oei (Leiden University)

Measuring the giant radio galaxy length distribution with the LoTSS

Paper:  Oei, M.S., van Weeren, R.J., Gast, A.R., Botteon, A., Hardcastle, M.J., Dabhade, P., Shimwell, T.W., Röttgering, H.J. and Drabent, A., 2023. Astronomy & Astrophysics, 672, p.A163.

Summary:  Jets launched by supermassive black holes can travel for megaparsecs beyond the borders of their host galaxies, thereby enriching the intergalactic medium with cosmic rays, heat, heavy elements, and magnetic fields. In this work we derive and put into practice a suite of analytic expressions for Bayesian inference of the cosmological abundance and geometry of Nature's largest, and most powerful, jet systems.


Sam Ward (University of Cambridge)

SN 2021hpr and its two siblings in the Cepheid calibrator galaxy NGC 3147: A hierarchical BayeSN analysis of a Type Ia supernova trio, and a Hubble constant constraint

Paper:  Ward, S.M., Thorp, S., Mandel, K.S., Dhawan, S., Jones, D.O., Taggart, K., Foley, R.J., Narayan, G., Chambers, K.C., Coulter, D.A. and Davis, K.W., 2022. arXiv:2209.10558.

Summary: This work simultaneously fits the light curves of NGC 3147’s trio of Type Ia supernova siblings using a hierarchical Bayesian model, BayeSN. We marginalise over the siblings' correlation to robustly infer the distance to the host galaxy, and estimate the Hubble constant, H0 (the present day expansion rate of the Universe).



Historical Astrostatistics

Astrostatistics innovations of the past are highlighted in this section.  


Avni 1976: Uncertainties on Model Parameter Estimates

By Vinay Kashyap (Center for Astrophysics)

Paper: Avni, Y., 1976. Energy spectra of X-ray clusters of galaxies. Astrophysical Journal, vol. 210, Dec. 15, 1976, pt. 1, p. 642-646. 


I often joke about how more important papers get fewer citations because they rapidly reach the “everybody knows” category.  Avni (1976) is one such paper.  It only has 718 citations in ADS as of September 2023, which is honestly a travesty, considering that all, but all, of high-energy spectroscopy is built upon it.  Astronomers owe a massive debt to Avni, who first wrote down the recipe for how to do non-linear regression to estimate parameters and their uncertainties for high-energy spectra.


About 50 years ago, it was well understood that minimizing χ² would yield a “best” fit.  But how to establish confidence bounds on the parameter estimates, especially when the number of free parameters could vary?  You may have fitted a power-law model with a low-energy cut-off to an Uhuru spectrum of the Perseus cluster, but how could you be sure that the cut-off was needed?  How could you tell whether the uncertainty on the cut-off fell off the left edge of the energy domain?  There were several proposals on what to do, and no consensus.  Avni pointed out that the right prescription was to set a Δχ²(q,α) that correctly accounted for both the number of degrees of freedom q and the confidence level α, and demonstrated exactly how to do the calculations.  For the Perseus cluster specifically, Avni demonstrated that a cut-off was necessary in the fits to the spectra if you assumed a power-law spectral model, but not if you assumed a thermal bremsstrahlung spectral model.  Guess what?  We know now that the spectrum is dominated by a complex combination of many thermal line+continuum spectra.  There is no cut-off.
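
As a rough illustration of the Δχ²(q,α) prescription, the sketch below computes the threshold as the α quantile of a χ² distribution with q degrees of freedom, using only the Python standard library.  The function names (chi2_cdf, delta_chi2) and the argument name level are hypothetical choices made for this example and do not come from Avni's paper; the quantile is found by simple bisection, which is one of several ways to do it.

```python
import math


def chi2_cdf(x: float, dof: int) -> float:
    """CDF of a chi-square distribution with `dof` degrees of freedom.

    Evaluated with the power series for the regularized lower incomplete
    gamma function P(a, z), where a = dof / 2 and z = x / 2.
    """
    if x <= 0.0:
        return 0.0
    a, z = dof / 2.0, x / 2.0
    term = 1.0 / a          # n = 0 term of the series
    total = term
    n = 1
    while abs(term) > 1e-16 * abs(total):
        term *= z / (a + n)  # ratio of consecutive series terms
        total += term
        n += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


def delta_chi2(q: int, level: float) -> float:
    """Delta chi-square threshold for q degrees of freedom at confidence `level`.

    Assumption of this sketch: the threshold is the `level` quantile of a
    chi-square distribution with q degrees of freedom.
    """
    if q < 1 or not 0.0 < level < 1.0:
        raise ValueError("need q >= 1 and 0 < level < 1")
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, q) < level:   # expand until the quantile is bracketed
        hi *= 2.0
    for _ in range(100):             # bisection on the monotonic CDF
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, q) < level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for q in (1, 2, 3):
        print(q, round(delta_chi2(q, 0.90), 2))
```

Under these assumptions, a 90% bound on a single parameter corresponds to a Δχ² of about 2.71, and a joint 90% region for two parameters to about 4.61.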


Spotlight

Astrostatistics innovations of the present are highlighted in this section.  


Scalable Bayesian inference for detection and deblending in astronomical images

By Jeffrey Regier (University of Michigan)

Paper:  Hansen, D., Mendoza, I., Liu, R., Pang, Z., Zhao, Z., Avestruz, C. and Regier, J., 2022. Scalable Bayesian inference for detection and deblending in astronomical images. arXiv preprint arXiv:2207.05642.


The forthcoming generation of astronomical surveys will peer deeper into space, revealing many more astronomical light sources than their predecessors. Due to the higher density of light sources in the images from these surveys, many more light sources will visually overlap. Visually overlapping light sources, called "blends," are challenging for traditional (non-probabilistic) astronomical image processing pipelines because they introduce ambiguity into the interpretation of the image data. We present a new probabilistic method for detecting, deblending, and cataloging astronomical objects called the Bayesian Light Source Separator (BLISS). In the BLISS statistical model, the latent space is interpretable: one random variable encodes the number of stars and galaxies imaged, a random vector encodes the locations and fluxes of these astronomical objects, and another random vector encodes the galaxy morphologies. For posterior inference, BLISS uses a new form of variational inference based on stochastic optimization, deep neural networks, and the forward Kullback-Leibler divergence. Once its inference network is trained with synthetic images, BLISS can produce highly accurate probabilistic catalogs for megapixel astronomical images in seconds.


A novel approach to detect line emission under high background in high-resolution X-ray spectra

By Sara Algeri (University of Minnesota)

Paper:  Zhang, X., Algeri, S., Kashyap, V. and Karovska, M., 2023. A novel approach to detect line emission under high background in high-resolution X-ray spectra. Monthly Notices of the Royal Astronomical Society, 521(1), pp.969-983.


The challenge of signal detection in the presence of a high background (when the signal in the aperture is dominated by emission other than from the source of interest) is ubiquitous in physics and astronomy. This article addresses the problem within the context of high-resolution X-ray spectra using a two-stage analysis. Specifically, the first stage aims to assess and calibrate the background models proposed by scientists using a source-free sample. This calibration is achieved by means of smooth tests, originally introduced by Neyman in 1937. In the second stage, likelihood-ratio tests are employed for signal detection and the construction of upper limits. To account for potentially complex background structures, the analysis is conducted separately across different regions of the spectra. Therefore, to ensure adequate control of the desired probability of Type I error, corrections for multiple hypothesis testing are ultimately implemented.  This method was applied to check whether narrow spectral lines could be detected in a background-dominated spectrum of RT Cru.  RT Cru is a symbiotic star, a binary in which a white dwarf accretes material from a giant companion.  The goal of the X-ray observations was to choose between different emission scenarios, but the data quality was insufficient to tell, and upper limits were set instead.


Clearing the hurdle: The mass of globular cluster systems as a function of host galaxy mass

By Aneta Siemiginowska (Center for Astrophysics)

Paper:  Eadie, G.M., Harris, W.E. and Springford, A., 2022. Clearing the hurdle: The mass of globular cluster systems as a function of host galaxy mass. The Astrophysical Journal, 926(2), p.162.


Globular star clusters (GCs) are old massive star clusters present in almost all galaxies.  The relation between the total mass of the GCs in a galaxy and the mass of the host galaxy halo is an important component in theoretical modeling of galaxy formation. Until now, however, this relation has been derived only for massive galaxies that host GCs, ignoring galaxies without GCs (data with zeros) and potentially biasing the results.  The Bayesian hurdle model presented in this paper incorporates the galaxies with no GCs when deriving this relationship, extending it to low-mass galaxies and additionally providing a threshold on the mass of galaxies that contain GCs. Interestingly, the model applied to a large number of local galaxies gives a broad mass transition region for galaxies to host GCs, and sets the minimum mass for GCs to survive over a Hubble time. A range of individual histories in the evolution of dwarf galaxies appears to be the major reason for such a broad mass transition region.  In general, the hurdle model can be applied to any data with a lot of zeros.



Jargon Dictionary

General definitions of astronomy or statistical terms are included in this section.


The International Astronomical Union is constructing a glossary of common astronomy terms (see https://astro4edu.org/resources/glossary/search/).  Here we plan to build up a similar dictionary, focusing on both statistics and astronomy jargon.  


If you have comments, questions, concerns, edits, or terms you would like included, please let us know at astrostatisticsnews@gmail.com.


Calibration

In astronomy, calibration is the process by which the electrical signals output from an instrument are converted into physically meaningful quantities.  You may point your telescope at an object and collect 1000 photons from it, but how does that translate to a physically meaningful flux (energy per unit time per unit area passing through the telescope aperture)?  Answering such questions requires calibration of the detector.  Calibration seeps into every aspect of the measurement process, even to establish when you can say you have detected a photon in a sensor (like a CCD, a charge-coupled device imaging semiconductor circuit).

–VLK


Color

In astronomy, color is the difference between two magnitudes (a standardized measure of ranked brightness of an astronomical object such as a star).  Magnitudes are defined as scaled logarithmic intensities relative to some standard value, and colors are thus scaled logarithmic ratios of intensities or counts from a source measured in different passbands.  Colors are often used as proxies for properties of the source; e.g., B-V in the Johnson filter system tracks the surface temperatures of stars.

–VLK
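
The definition of color above can be made concrete with a small sketch.  It assumes the conventional scaling in which a magnitude is -2.5 times the base-10 logarithm of the intensity relative to a reference value; that factor, the function names, and the argument names are assumptions of this example and are not spelled out in the entry itself.

```python
import math


def magnitude(intensity: float, reference: float) -> float:
    """Scaled logarithmic intensity relative to a reference value.

    Assumes the conventional factor of -2.5 and a base-10 logarithm,
    so brighter sources have smaller magnitudes.
    """
    if intensity <= 0.0 or reference <= 0.0:
        raise ValueError("intensity and reference must be positive")
    return -2.5 * math.log10(intensity / reference)


def color(intensity_a: float, reference_a: float,
          intensity_b: float, reference_b: float) -> float:
    """Color as the difference of magnitudes in passbands a and b."""
    return magnitude(intensity_a, reference_a) - magnitude(intensity_b, reference_b)


if __name__ == "__main__":
    # Hypothetical source that is ten times brighter, relative to the
    # reference, in passband b than in passband a.
    print(color(1.0, 1.0, 10.0, 1.0))
```

Because the two magnitudes are subtracted, the color depends only on the ratio of the reference-scaled intensities in the two passbands, which is why it can serve as a proxy for source properties such as temperature.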


cstat 

In astronomy, cstat (aka C-stat or the Cash statistic) is a measure of the ln(Poisson likelihood).  It is a drop-in replacement for χ², and is a convenient statistic used by high-energy astronomers to find maximum likelihood estimates in a non-linear regression of Poisson binned counts data (see, e.g., Kaastra 2017, A&A 605, A51).  It is written in a form that can be derived either from the Stirling approximation to the factorial, or as a likelihood ratio of two models where the model of interest is compared to a “perfect” model which matches the observed counts exactly,

$$\mathrm{cstat} = 2 \sum_{k} \left[ (\mathrm{model}_k - \mathrm{counts}_k) + \mathrm{counts}_k \left( \ln \mathrm{counts}_k - \ln \mathrm{model}_k \right) \right],$$

where the summation is carried out over all bins k.  In this form it is asymptotically distributed as χ².  There is another statistic called wstat, which is sometimes also called cstat, which includes a term describing a marginalized background.

–VLK
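
The cstat entry above can be sketched in a few lines of Python.  This sketch uses the likelihood-ratio form with a positive prefactor, 2∑[(model − counts) + counts⋅(ln(counts) − ln(model))], which is non-negative and vanishes when the model matches the counts exactly.  The function name, the argument names, and the handling of empty bins (the logarithmic term is taken as zero when a bin has zero counts) are assumptions of this example, not a reference implementation from any fitting package.

```python
import math
from typing import Sequence


def cstat(counts: Sequence[float], model: Sequence[float]) -> float:
    """Cash-type statistic for binned Poisson counts.

    counts: observed counts per bin (non-negative)
    model:  predicted counts per bin (strictly positive)
    """
    if len(counts) != len(model):
        raise ValueError("counts and model must have the same number of bins")
    total = 0.0
    for observed, predicted in zip(counts, model):
        if predicted <= 0.0 or observed < 0.0:
            raise ValueError("need model > 0 and counts >= 0 in every bin")
        total += predicted - observed
        if observed > 0.0:
            # The term counts * ln(counts / model) tends to 0 as counts -> 0,
            # so empty bins contribute only through (model - counts).
            total += observed * (math.log(observed) - math.log(predicted))
    return 2.0 * total


if __name__ == "__main__":
    observed = [3, 0, 5, 2]
    print(cstat(observed, observed[:1] + [0.5] + observed[2:]))  # only the empty bin differs
    print(cstat(observed, [2.5, 0.5, 4.0, 3.0]))
```

In a fit, the model counts would be recomputed for each trial set of parameters and the statistic minimized, in the same way that χ² is minimized for Gaussian data.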


Effective Area (EA)

In astronomy, the sensitivity of a telescope and detector system varies strongly with photon energy.  The cross-section that the system presents to the sky is a fraction of the geometric area, and is called the effective area (often abbreviated EA).  It is regularly reevaluated over the lifetime of a telescope, and is a critical part of instrument calibration.  In optical and IR astronomy, this is often called sensitivity or response, and variations in the transmissions of different filters are often used to obtain color information.  In high-energy astronomy, the effective area is first measured at ground-based facilities and adjusted in-flight by observations of objects with well-understood spectra.  These effective areas are usually available as Ancillary Response Files (ARFs) with the data. 

–VLK


Other News


Job Opportunities in Astrostatistics


Please let us know if you have astrostatistics job opportunities by emailing us at astrostatisticsnews@gmail.com.


A list of job opportunities is maintained at our website, astrostatisticsnews.com/job-opportunities.



Astrostatistics Events


Please let us know if you have astrostatistics events by emailing us at astrostatisticsnews@gmail.com.


International Astrostatistics Association (IAA) Elections

Event dates:  October 6-31, 2023

Event location:  Online

Details:  Only IAA members are allowed to vote.  If you are an IAA member,  you should have already received an email with a link to the voting system.

Deadlines: IAA members vote by October 31, 2023


The International Astrostatistics Association (IAA) is the global scientific association devoted to astrostatistics and astroinformatics, with the goal of promoting collaboration between astronomers, statisticians, and computer scientists. Membership is free and open to any scientist interested in the statistical analysis of astronomical data. To apply for membership, please visit the IAA website: http://iaa.mi.oa-brera.inaf.it/IAA/home.html


International Astronomical Union nominations for memberships

Details: IAU Individual or Junior memberships are open for nominations

Nominations open on: October 7, 2023

Nominations end on: December 15, 2023

Website: https://www.iau.org/administration/membership/individual/qualification/#:~:text=The%20call%20for%20membership%20applications,or%20to%20the%20Adhering%20Organizations 


Cosmo21: 4th Edition

Event dates: May 20-24, 2024

Event location: Chania, Greece

Event website: https://cosmo21.cosmostat.org

Deadlines: Registration will open in December 2023


A list of events is maintained at our website, astrostatisticsnews.com/upcoming-events.



Content suggestions

If you have ideas for AN content, please send a message to astrostatisticsnews@gmail.com.  We may include your idea in a future issue if we think it is a good fit.


Ideas may include relevant astrostatistics papers/data/code, visualizations, upcoming events, job postings, format or commentary suggestions, etc.  



Astrostatistics News website

See astrostatisticsnews.com for more information, such as past issues and lists of astrostatistics references and societies.



Subscribe to Astrostatistics News

To subscribe to Astrostatistics News, go to https://groups.google.com/g/astrostatistics-news and select the “Join group” button.  You will need to be logged into your Google account to join the group.


Please forward this information to anyone who may be interested!