Jargon Dictionary

General definitions of astronomy or statistical terms are included on this page.


The International Astronomical Union is constructing a glossary of common astronomy terms (see https://astro4edu.org/resources/glossary/search/).  Here we plan to build up a similar dictionary, focusing on both statistics and astronomy jargon.  


If you have comments, questions, concerns, edits, or terms you would like included please let us know at astrostatisticsnews@gmail.com.


Bolometric correction, Distance modulus, Color excess, Absolute magnitude, etc.

Because astronomical techniques have spanned many centuries and many technologies, there is a lot of old observational and quasi-observational jargon. Some of it is subtle: for example, a magnitude is not a (negative) logarithmic measurement of brightness; it is a logarithmic measurement of a brightness ratio. A pedagogical introduction to all these magnitude-connected quantities is available in “Magnitudes, distance moduli, bolometric corrections, and so much more” found at https://arxiv.org/abs/2206.00989.

The note is aimed at physicists, but it can be understood by most people with a quantitative background.
–DWH
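As a minimal sketch of the point that a magnitude measures a brightness ratio, the snippet below uses the standard relation m1 − m2 = −2.5 log10(F1/F2) and the distance modulus m − M = 5 log10(d / 10 pc). The function names are illustrative only, and extinction is ignored.

```python
import math


def magnitude_difference(flux_1, flux_2):
    """Return m1 - m2 for two fluxes measured in the same passband and units."""
    if flux_1 <= 0 or flux_2 <= 0:
        raise ValueError("fluxes must be positive")
    # The magnitude scale is a scaled, negative log of a flux *ratio*.
    return -2.5 * math.log10(flux_1 / flux_2)


def distance_modulus(distance_pc):
    """Return m - M for a source at distance_pc parsecs (no extinction)."""
    if distance_pc <= 0:
        raise ValueError("distance must be positive")
    # Absolute magnitude is defined at a reference distance of 10 pc.
    return 5.0 * math.log10(distance_pc / 10.0)


if __name__ == "__main__":
    # A source 100 times brighter is 5 magnitudes smaller (brighter).
    print(magnitude_difference(100.0, 1.0))
    # A source at 100 pc appears 5 magnitudes fainter than at 10 pc.
    print(distance_modulus(100.0))
```

Only ratios enter these functions: a single flux has no magnitude until a reference flux (a zero point) is chosen.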

Calibration

In astronomy, calibration is the process by which the electrical signals output from an instrument are converted into physically meaningful quantities.  You may point your telescope at an object and collect 1000 photons from it, but how does that translate to a physically meaningful flux (energy per unit time per unit area passing through the telescope aperture)?  Answering such questions requires calibration of the detector.  Calibration seeps into every aspect of the measurement process, even to establish when you can say you have detected a photon in a sensor (like a CCD – a charge coupled device imaging semiconductor circuit).

–VLK

Color

In astronomy, color is the difference between two magnitudes (a standardized measure of ranked brightness of an astronomical object such as a star).  Magnitudes are defined as scaled logarithmic intensities relative to some standard value, and colors are thus scaled logarithmic ratios of intensities or counts from a source measured in different passbands.  Colors are often used as proxies for properties of the source; e.g., B-V in the Johnson filter system tracks the surface temperatures of stars.

–VLK
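The following sketch illustrates how a color is a scaled logarithmic ratio of measurements in two passbands. The function name and the zero-point arguments are hypothetical; real photometric systems define their zero points through standard stars or reference spectra, which are not modeled here.

```python
import math


def color_index(counts_band_1, counts_band_2, zero_point_1=0.0, zero_point_2=0.0):
    """Return the color m1 - m2 from counts (or fluxes) in two passbands.

    The zero points are hypothetical calibration constants, one per band.
    """
    if counts_band_1 <= 0 or counts_band_2 <= 0:
        raise ValueError("counts must be positive")
    # Difference of two magnitudes = scaled log of the ratio, plus an offset.
    ratio_term = -2.5 * math.log10(counts_band_1 / counts_band_2)
    return ratio_term + (zero_point_1 - zero_point_2)


if __name__ == "__main__":
    # Ten times fewer counts in band 1 than in band 2 gives a color of +2.5.
    print(color_index(100.0, 1000.0))
```

With band 1 = B and band 2 = V, a larger value means the source is relatively fainter in B, that is, redder.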

cstat 

In astronomy, cstat (aka C-stat or the Cash statistic) is a fit statistic based on the logarithm of the Poisson likelihood.  It is a drop-in replacement for χ², and is a convenient statistic used by high-energy astronomers to find maximum likelihood estimates in a non-linear regression of Poisson binned counts data (see, e.g., Kaastra 2017, A&A 605, A51).  It is written in a form that can be derived either from the Stirling approximation to the factorial, or as a likelihood ratio of two models where the model of interest is compared to a “perfect” model which matches the observed counts exactly: cstat = 2 ∑_k [model_k − counts_k + counts_k⋅(ln(counts_k) − ln(model_k))], where the summation is carried out over all bins k and the term counts_k⋅ln(counts_k) is taken to be zero in bins with zero counts.  In this form it is asymptotically distributed as χ².  There is another statistic called wstat, which is sometimes also called cstat, which includes a term describing a marginalized background.

–VLK
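A minimal sketch of the likelihood-ratio form of cstat, 2 ∑ [model − counts + counts⋅(ln(counts) − ln(model))], is given below. It assumes the model predicts a strictly positive expected count in every bin, treats the counts⋅ln(counts) term as zero for empty bins, and does not implement wstat or any background handling. The function name is illustrative, not taken from any fitting package.

```python
import math


def cstat(counts, model):
    """Return the cstat value for observed counts and model-predicted counts per bin."""
    if len(counts) != len(model):
        raise ValueError("counts and model must have the same number of bins")
    total = 0.0
    for observed, expected in zip(counts, model):
        if expected <= 0:
            raise ValueError("model counts must be positive in every bin")
        if observed < 0:
            raise ValueError("observed counts cannot be negative")
        total += expected - observed
        if observed > 0:
            # For observed == 0 this term vanishes (limit of d * ln d as d -> 0).
            total += observed * (math.log(observed) - math.log(expected))
    return 2.0 * total


if __name__ == "__main__":
    observed = [3, 0, 5, 2]
    predicted = [2.5, 0.4, 4.0, 3.1]
    print(cstat(observed, predicted))
```

The statistic is zero when the model matches the counts exactly and grows as the model departs from the data, which is why it can stand in for χ² in a minimizer.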

Effective Area (EA)

In astronomy, the sensitivity of a telescope and detector system varies strongly with photon energy.  The cross-section that the system presents to the sky is a fraction of the geometric area, and is called the effective area (often abbreviated EA).  It is regularly reevaluated over the lifetime of a telescope, and is a critical part of instrument calibration.  In optical and IR astronomy, this is often called sensitivity or response, and variations in the transmissions of different filters are often used to obtain color information.  In high-energy astronomy, the effective area is first measured at ground-based facilities and adjusted in-flight by observations of objects with well-understood spectra.  These effective areas are usually available as Ancillary Response Files (ARFs) with the data. 

–VLK
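To illustrate how an effective area turns detected counts into a flux, here is a rough sketch that divides the counts in each energy bin by the effective area and the exposure time. It deliberately ignores background, energy redistribution in the detector, and any variation of the effective area within a bin; all function and argument names are hypothetical.

```python
KEV_TO_ERG = 1.602176634e-9  # energy of 1 keV expressed in erg


def photon_flux(counts, effective_area_cm2, exposure_s):
    """Return the photon flux per bin, in photons / cm^2 / s."""
    if exposure_s <= 0:
        raise ValueError("exposure time must be positive")
    if len(counts) != len(effective_area_cm2):
        raise ValueError("need one effective area value per bin")
    fluxes = []
    for n_counts, area in zip(counts, effective_area_cm2):
        if area <= 0:
            raise ValueError("effective area must be positive")
        fluxes.append(n_counts / (area * exposure_s))
    return fluxes


def energy_flux(counts, effective_area_cm2, exposure_s, bin_energy_kev):
    """Return the total energy flux in erg / cm^2 / s, summed over bins."""
    per_bin = photon_flux(counts, effective_area_cm2, exposure_s)
    return sum(f * e * KEV_TO_ERG for f, e in zip(per_bin, bin_energy_kev))


if __name__ == "__main__":
    # 1000 counts in one bin, 500 cm^2 effective area, 2000 s exposure.
    print(photon_flux([1000], [500.0], 2000.0))
    print(energy_flux([1000], [500.0], 2000.0, [1.0]))
```

In practice the effective area would be read from the ARF supplied with the data rather than typed in by hand.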

Heteroscedastic (also spelled Heteroskedastic)

A data set is said to be heteroscedastic when different individual data points in the data set have different uncertainties, or have noise contributions drawn from distributions that differ in variance. The quantity that is often called χ² (the sum, over data points, of each residual squared divided by its uncertainty squared) is one of the cost functions employed when the data are heteroscedastic, because it weights each data point by its own inverse variance.

–DWH
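As a small illustration of inverse-variance weighting, the sketch below computes χ² for data with per-point uncertainties, along with the inverse-variance weighted mean, which is the value that minimizes χ² when the model is a single constant. The function names are illustrative, and independent Gaussian noise is assumed.

```python
def chi_squared(data, model, sigma):
    """Return the sum over points of ((data - model) / sigma)^2."""
    if not (len(data) == len(model) == len(sigma)):
        raise ValueError("data, model, and sigma must have equal length")
    if any(s <= 0 for s in sigma):
        raise ValueError("uncertainties must be positive")
    return sum(((d - m) / s) ** 2 for d, m, s in zip(data, model, sigma))


def weighted_mean(data, sigma):
    """Return the inverse-variance weighted mean of the data."""
    if len(data) != len(sigma):
        raise ValueError("data and sigma must have equal length")
    if any(s <= 0 for s in sigma):
        raise ValueError("uncertainties must be positive")
    weights = [1.0 / s ** 2 for s in sigma]
    return sum(w * d for w, d in zip(weights, data)) / sum(weights)


if __name__ == "__main__":
    data = [1.0, 3.0]
    sigma = [1.0, 0.5]  # the second point is more precise
    best = weighted_mean(data, sigma)
    print(best)  # pulled toward the more precise point
    print(chi_squared(data, [best, best], sigma))
```

If all the uncertainties were equal (the homoscedastic case), the weighted mean would reduce to the ordinary mean.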

Marginalization

In Bayesian contexts, we have priors over parameters, which permit integration (in the calculus sense). Priors are probability distributions, and are therefore also measures; they make integration for a statistical analysis possible. If you have a likelihood function p(y|a,b) (in statistics we often write this as “L(a,b|y)”), where y represents the data, a represents the parameters you care about, and b represents the nuisance parameters, you can integrate over b, but only if you have a prior on b. That is, to integrate, you need a prior p(b), and when you integrate p(y|a,b)p(b) over all b, you obtain the marginalized likelihood or marginal likelihood p(y|a). The rules for constructing and integrating probability distributions are set down in (among other places) “Data analysis recipes: Probability calculus for inference” found at https://arxiv.org/abs/1205.4446.
–DWH
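The integral of p(y|a,b)p(b) over b can be sketched numerically for a toy problem. The example assumes a single datum y drawn from a Gaussian with mean a + b and unit variance, and a zero-mean Gaussian prior on the nuisance parameter b; these choices, the grid limits, and the function names are all illustrative. For this toy model the marginal likelihood is known in closed form (a Gaussian with variance 1 + sigma_b²), which gives a check on the numerical integral.

```python
import math


def normal_pdf(x, mean, sigma):
    """Gaussian probability density."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def likelihood(y, a, b, sigma_y=1.0):
    """Toy likelihood p(y | a, b): datum y has mean a + b."""
    return normal_pdf(y, a + b, sigma_y)


def prior_b(b, sigma_b=2.0):
    """Toy prior p(b) on the nuisance parameter."""
    return normal_pdf(b, 0.0, sigma_b)


def marginal_likelihood(y, a, b_min=-20.0, b_max=20.0, n_steps=4000):
    """Integrate p(y | a, b) p(b) over b with the trapezoid rule."""
    step = (b_max - b_min) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        b = b_min + i * step
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * likelihood(y, a, b) * prior_b(b)
    return total * step


if __name__ == "__main__":
    numeric = marginal_likelihood(1.0, 0.5)
    analytic = normal_pdf(1.0, 0.5, math.sqrt(1.0 + 2.0 ** 2))
    print(numeric, analytic)
```

A one-dimensional grid is enough here; with many nuisance parameters the same integral is usually done by sampling rather than on a grid.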

Metallicity

Elemental composition is a critical parameter required to understand most astronomical objects.  There are various ways to summarize it, ranging from crude (e.g., abundance relative to solar) to highly detailed (e.g., absolute abundances of specific elements and even their ionic species).  A common and highly useful summary is the so-called metallicity, which is simply the abundance of iron (Fe) relative to hydrogen (H), and which stands as a proxy for all metals; astronomers refer to all elements other than the two most abundant in the Universe, H and helium (He), as metals.
–VLK
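One common way to quote the iron-to-hydrogen abundance is the logarithmic bracket notation [Fe/H], the log of the Fe/H number ratio in the object minus the same quantity for the Sun. The sketch below assumes that convention; the function name is illustrative and the solar ratio used in the example is a round placeholder, not a recommended value.

```python
import math


def fe_over_h(number_ratio_object, number_ratio_sun):
    """Return [Fe/H] = log10((N_Fe/N_H)_object) - log10((N_Fe/N_H)_sun)."""
    if number_ratio_object <= 0 or number_ratio_sun <= 0:
        raise ValueError("abundance ratios must be positive")
    return math.log10(number_ratio_object) - math.log10(number_ratio_sun)


if __name__ == "__main__":
    solar_ratio = 3.0e-5  # placeholder value for illustration only
    # An object with one tenth of the solar Fe/H ratio has [Fe/H] close to -1.
    print(fe_over_h(0.1 * solar_ratio, solar_ratio))
```

On this scale the Sun has [Fe/H] = 0 by construction, and negative values indicate objects poorer in iron than the Sun.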

Nuisance parameter

The structure of most astrophysics projects (science projects?) is that there is a fundamental model you care about (e.g., an astronomer may care about a model of an exoplanet orbiting a distant star) and then an auxiliary model to handle the things you do not care about (e.g., this might be something that describes the stochastic variability of the star that the planet is orbiting). The parameters of the fundamental model are the parameters you care about, and the parameters of the auxiliary model are what we often call “nuisance parameters.”  Though you generally need to represent the nuisance parameters, and often have to infer them, you do not care about them. Trouble arises when the nuisance parameters are covariant with the parameters you do care about; a solution is to marginalize or profile (see definitions for "Marginalization" and "Profiling (statistics)").
–DWH

Photons vs Counts vs Events

In high-energy astronomy, where observations are usually made at low rates of photon arrivals, astronomers distinguish between photons, counts, and events.  By convention, the term photon usually refers to the photons before they pass through the telescope, while counts refer to the observed signal in the detector. That is, a count is the result of a photon passing through the telescope/detector system and being registered in the detector.  A count can also arise from other causes, such as cosmic rays or ambient solar wind particles, so a detected count is also referred to as an event.  Each photon that arrives at and is detected in the instrumentation as a count is an event.  Statisticians can think of events as a marked Poisson process, where the marks are attributes such as energy, photon arrival time, and arrival direction.

–VLK
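The marked-Poisson-process picture can be sketched with a short simulation. The example draws exponential waiting times at a constant total rate and attaches marks to each event. The mark distributions (uniform energies, Gaussian detector positions), the rates, and the "origin" label are all hypothetical; in real data the origin of an individual event is not recorded.

```python
import random


def simulate_events(source_rate, background_rate, exposure_s, seed=None):
    """Simulate a list of events from a constant-rate marked Poisson process."""
    total_rate = source_rate + background_rate
    if total_rate <= 0 or exposure_s <= 0:
        raise ValueError("total rate and exposure must be positive")
    rng = random.Random(seed)
    events = []
    # Waiting times between events of a Poisson process are exponential.
    time = rng.expovariate(total_rate)
    while time < exposure_s:
        from_source = rng.random() < source_rate / total_rate
        events.append({
            "time": time,
            "energy_keV": rng.uniform(0.5, 7.0),  # illustrative mark
            "x": rng.gauss(0.0, 1.0),              # illustrative mark
            "y": rng.gauss(0.0, 1.0),              # illustrative mark
            "origin": "source" if from_source else "background",
        })
        time += rng.expovariate(total_rate)
    return events


if __name__ == "__main__":
    events = simulate_events(4.0, 1.0, 100.0, seed=1)
    n_source = sum(1 for e in events if e["origin"] == "source")
    print(len(events), "events, of which", n_source, "came from the source")
```

Each dictionary plays the role of one row in an event list: one detected count together with its recorded attributes.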

Profiling (statistics)

In frequentist statistics contexts, or when you feel uncomfortable putting a prior p(b) on your nuisance parameters b, you can nonetheless account for covariances between the parameters of interest and the nuisance parameters in your inferences by profiling. The profile likelihood p_b(y|a) is the full likelihood p(y|a,b) evaluated, at each value of a, at the maximum-likelihood value of b given that setting of a. That is, the profile likelihood is the likelihood function optimized over the nuisance parameters. Profiling can be useful when you do not have a principled way to put priors on your nuisance parameters. A recent paper on profiling, aimed at cosmologists, is “Profile Likelihoods in Cosmology: When, Why and How illustrated with ΛCDM, Massive Neutrinos and Dark Energy” found at https://arxiv.org/abs/2408.07700.
–DWH
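Profiling can be sketched for a toy straight-line model y = a⋅x + b, where the slope a is the parameter of interest and the offset b is the nuisance parameter. The example assumes independent Gaussian noise with the same known sigma on every point, in which case the maximum-likelihood b at fixed a is the mean of the residuals y − a⋅x. The model, data, and function names are illustrative.

```python
import math


def log_likelihood(a, b, x, y, sigma):
    """Gaussian log-likelihood ln p(y | a, b) for the line y = a * x + b."""
    norm = math.log(sigma * math.sqrt(2.0 * math.pi))
    return sum(-0.5 * ((yi - (a * xi + b)) / sigma) ** 2 - norm
               for xi, yi in zip(x, y))


def best_b(a, x, y):
    """Maximum-likelihood offset b at fixed slope a (equal uncertainties)."""
    return sum(yi - a * xi for xi, yi in zip(x, y)) / len(x)


def profile_log_likelihood(a, x, y, sigma):
    """Log-likelihood at slope a with b set to its best value for that a."""
    return log_likelihood(a, best_b(a, x, y), x, y, sigma)


if __name__ == "__main__":
    x = [0.0, 1.0, 2.0, 3.0]
    y = [1.1, 2.9, 5.2, 6.8]
    for slope in (1.5, 2.0, 2.5):
        print(slope, profile_log_likelihood(slope, x, y, sigma=0.2))
```

When the inner optimization has no closed form, best_b would be replaced by a numerical optimizer run at each value of a.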