October 2024 Digest

Sep 30, 2024 2:35 pm

Consulting and Training

If you are interested in consulting or training engagements or even commissioning me to create a presentation on a topic of interest then don’t hesitate to reach out to me at inquiries@symplectomorphic.com.

Recent Writing

Do you like

- retro video games,

- modeling race outcomes,

- exchangeability and hierarchical modeling in practice,

- inferring rankings,

- and/or making sophisticated, conditional predictions?

In my latest case study I develop a model for the outcome of randomized video game races, inferring the skills of each participant and using those inferences to inform player rankings and subtle conditional predictions.

HTML: https://betanalpha.github.io/assets/chapters_html/racing.html

PDF: https://betanalpha.github.io/assets/chapters_pdf/racing.pdf

Recent Code

I recently updated my Markov chain Monte Carlo analysis and diagnostic tools, https://github.com/betanalpha/mcmc_diagnostics. New features include the ability to overlay expectand pushforward histograms, an ensemble quantile estimator, an implicit subset probability estimator, and a general expectand pushforward evaluation function.

Besides the code itself the GitHub repository also includes HTML and PDF files that document and demonstrate the functionality of the tools. If you have an opportunity to use the code then all feedback is encouraged and welcome!

The code nominally supports RStan, PyStan2, and PyStan3 but by re-implementing just a few functions it can be easily extended to any Markov chain Monte Carlo code in R or Python.

Office Hours Livestream

You can watch the VOD of this week’s office hours anytime at https://www.patreon.com/posts/back-to-school-9-110594928. Unsurprisingly we talked a lot about challenges in statistics, from generalizing HMC to scaling analyses to larger data sets to publishing novel analyses. I also ranted about how frustratingly bad the “one long Markov chain versus many short Markov chains” debates in MCMC are, with three pages of math to back my complaints.

Support Me on Patreon

If you would like to support my writing then consider becoming a patron, https://www.patreon.com/betanalpha.

Probabilistic Modeling Discord

I have a Discord server dedicated to discussion about (narratively) generative modeling of all kinds, https://discord.gg/WwDcsWUX.

Recent Rants

On Entropy

Friendly reminder that entropy is not a property of individual states/configurations of a system but rather of a probability distribution over all of the possible states/configurations (relative to some reference measure, if we want to get technical).

An apparently “messy” or “disorganized” configuration of a room is not by itself high entropy. By definition _any_ room configuration completely describes the room. In other words there is no uncertainty about where every individual object is placed.

On the other hand if we don’t know what the configuration of the room is then we might describe its _possible_ configurations with a probability distribution over room configurations.

If this probability distribution exhibits a high entropy (relative to a uniform measure) then all room configurations will be nearly-equally probable. Moreover if there are many more messy configurations than clean configurations then we can say that clean room configurations are rare.

This statement, however, is entirely a consequence of that probability distribution over room configurations and not any particular room configuration.
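As a minimal sketch of this distinction, consider a toy room with a finite number of configurations and compute the Shannon entropy relative to a uniform (counting) reference measure. The number of configurations, the number of "clean" configurations, and the function name `entropy` are all hypothetical choices made only for illustration.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution,
    relative to the counting measure on its support."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical toy room: 1000 possible configurations,
# only the first 10 of which we label "clean".
n_configs = 1000
n_clean = 10

# Maximal uncertainty: every configuration is equally probable.
uniform = [1 / n_configs] * n_configs

# No uncertainty: one configuration, messy or not, is known exactly.
known = [0.0] * n_configs
known[123] = 1.0

uniform_entropy = entropy(uniform)   # equals log(n_configs) up to rounding
known_entropy = entropy(known)       # zero, regardless of which configuration
prob_clean = sum(uniform[:n_clean])  # probability allocated to clean rooms

print(f"uniform entropy: {uniform_entropy:.3f} nats")
print(f"known-configuration entropy: {known_entropy:.3f} nats")
print(f"probability of a clean room under uniform: {prob_clean:.3f}")
```

The entropy here is computed from the entire list of probabilities. A single configuration, such as index 123 above, has no entropy of its own. It only contributes zero entropy once all of the probability has been concentrated on it.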

Alternatively we might be interested not in the full configuration of a room but rather the position of a particular item, say a book. Again entropy is not a way to describe a particular position of this item but rather a probability distribution of possible positions.

In this case a “messy” room might be described by a highly entropic probability distribution of possible positions (again relative to some assumed uniform measure) which spreads out over many possible positions, quantifying large uncertainty about where the book might be before we are able to investigate.

That said once we investigate we know the position of the book and, even if the room was “messy”, there is no longer any uncertainty about where that book is.

On Directed Acyclic Graphs

The use of directed acyclic graphs in data analysis suffers from the cardinal sin of applied mathematics: a general mathematical concept that is confounded with particular applications to the exclusion of others. You better believe it's time for another thread.

Formally DAGs are abstract mathematical objects that encode abstract, directed relationships between abstract entities. That abstraction can then be used to consistently represent a variety of different objects.

Because of this generality, however, the interpretation of any particular DAG is at best ambiguous without additional context.

For example DAGs can be used to represent the structure of the covariance matrix in a multivariate normal density function. This approach is commonly, although not always explicitly, assumed in structural equation modeling.

That said this precise association between a DAG and a multivariate normal is not always accepted. In that case it’s not clear what, if anything, any particular DAG actually represents.

DAGs are also great ways of encoding conditional decompositions of joint probability distributions over product spaces, https://betanalpha.github.io/assets/case_studies/probability_on_product_spaces.html#4_Directed_Graphical_Models.
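To make the encoding concrete, here is a sketch of the conditional decomposition that a DAG over N component variables is usually taken to represent. The notation is an assumption made for illustration and may not match the linked case study: `\pi` denotes a probability density function and `\mathrm{pa}(n)` denotes the set of parents of the n-th node.

```latex
\pi(x_1, \ldots, x_N) = \prod_{n = 1}^{N} \pi(x_n \mid x_{\mathrm{pa}(n)})

% Hypothetical three-node chain, \mu \rightarrow \theta \rightarrow y:
\pi(\mu, \theta, y) = \pi(\mu) \, \pi(\theta \mid \mu) \, \pi(y \mid \theta)
```

Each node contributes one conditional factor and each directed edge marks a variable that appears on the right-hand side of a conditioning bar. Nodes without parents contribute marginal factors.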

Because we can interpret probability distributions and their conditional decompositions in many different ways, however, the precise interpretation of these probabilistic DAGs and the information they encode also requires some additional context.

For example one might engineer a conditional decomposition for computational reasons, such as setting up ancestral sampling to sample from the joint probability distribution. In this case the corresponding DAG takes on an algorithmic interpretation, https://betanalpha.github.io/assets/case_studies/generative_modeling.html#11_Generative_As_Sampling.
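A minimal sketch of this algorithmic interpretation, assuming a hypothetical three-node DAG with normal conditional distributions, is below. The node names, the conditional distributions, and the function `ancestral_sample` are all illustrative assumptions. None of them are taken from the linked case study.

```python
import random
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical three-node DAG: mu -> theta -> y.
# Each node maps to the list of its parents.
parents = {
    "mu": [],
    "theta": ["mu"],
    "y": ["theta"],
}

# Hypothetical conditional samplers, one per node. Each takes a random
# number generator and a dictionary of already-sampled parent values.
samplers = {
    "mu": lambda rng, pa: rng.gauss(0.0, 1.0),
    "theta": lambda rng, pa: rng.gauss(pa["mu"], 0.5),
    "y": lambda rng, pa: rng.gauss(pa["theta"], 0.1),
}

def ancestral_sample(parents, samplers, rng):
    """Draw one joint sample by visiting nodes in an order where every
    parent is sampled before any of its children."""
    values = {}
    for node in TopologicalSorter(parents).static_order():
        parent_values = {p: values[p] for p in parents[node]}
        values[node] = samplers[node](rng, parent_values)
    return values

rng = random.Random(8675309)
draw = ancestral_sample(parents, samplers, rng)
order = list(TopologicalSorter(parents).static_order())

print("sampling order:", order)
print("joint sample:", draw)
```

The DAG contributes only the sampling order and the parent lookups. The conditional samplers have to be supplied separately, so the same graph is compatible with many different joint probability distributions.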

Similarly we might engineer a conditional decomposition for modeling reasons, with the conditional structure capturing the possible evolutions of a data generating process from latent behaviors to observed data, https://betanalpha.github.io/assets/case_studies/generative_modeling.html#12_Generative_As_Story_Telling.

From this perspective we can use a DAG to define the basic structure of a data generating process before filling the probabilistic details in piece by piece until we have defined the full model.

Occasionally a conditional decomposition, and the DAG that represents it, can manifest both algorithmic and data generative interpretations at the same time. In this case ancestral sampling becomes an actual _simulation_ of the data generating process.

Perhaps most frustrating, at least to me, are the “causal” DAGs. Of course most of the literature claims that causal DAGs encode the “causal relationship” between different behaviors and outcomes.

The frustration arises when trying to understand what mathematical objects those relationships, and the structure of any particular DAG, are meant to represent. Indeed there are multiple possibilities, most of which are rarely if ever explicitly stated.

For example in some interpretations of “causal inference” the so-called causal DAGs imply certain regression models. Nodes in the DAGs mostly refer to observed variables but sometimes they refer to unobservable variables with heuristic conventions assumed to accommodate them.

When those conventions are unclear or misunderstood the utility of DAGs for these “causal inference” analyses is far more limited than often assumed. Without context a DAG does not define a unique model, let alone a unique analysis.

Personally I am a probabilistic modeler. I use DAGs all of the time to reason about and communicate the conditional decompositional structure of probabilistic models in my analyses.

Even then, however, I have to be careful to clearly specify when that structure is a mathematical convenience and when it is meant to capture some assumed narratively generative structure.

Math is just math until it is given context. Statistics is all about defining and communicating that context.
