January 2025 Digest
Dec 30, 2025 3:36 am
Recent Writing
I was looking into Voronoi diagrams for a side project, but I just could not follow the details. So I did what I am wont to do and dove into those details, working out a lot of geometry and making too many figures.
HTML: https://betanalpha.github.io/assets/chapters_html/voronoi.html
PDF: https://betanalpha.github.io/assets/chapters_pdf/voronoi.pdf
In this note I work through the geometric structure of Voronoi diagrams, and then use that structure to understand the motivation for, and implementation of, Fortune’s algorithm. I placed a particular focus on the details that, to me, seem to be hand-waved in all of the standard references.
To demonstrate all of the details I put together an object-oriented implementation of Fortune’s algorithm, including a doubly-connected edge list, event queue, and self-balancing binary search tree to ensure optimal computational scaling.
I never could figure out how to balance the heterogeneous tree structures recommended by the standard references. Instead I ended up working out my own based on the geometry of Fortune’s algorithm. That’ll be more than enough computer science for me for the next few years…
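For context, the diagram that all of this machinery computes is defined by a simple nearest-site rule. The sketch below is a hypothetical, brute-force illustration of that rule, not the sweep line implementation described in the note; Fortune’s algorithm exists precisely to avoid this kind of per-point search.

```python
import numpy as np

# Hypothetical example sites; the actual note implements Fortune's
# O(n log n) sweep line, but the diagram itself is defined by this
# nearest-site rule.
sites = np.array([[0.2, 0.3],
                  [0.7, 0.8],
                  [0.5, 0.1]])

def voronoi_cell(point, sites):
    """Index of the site whose Voronoi cell contains `point`:
    simply the nearest site under Euclidean distance."""
    d2 = np.sum((sites - point) ** 2, axis=1)
    return int(np.argmin(d2))

# The point (0.6, 0.2) is closest to the third site.
assert voronoi_cell(np.array([0.6, 0.2]), sites) == 2
```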
Slightly Less Recent Writing
I reviewed all of my writing this year over on Patreon, https://www.patreon.com/posts/146964612. Included are not only all of the pieces that I’ve already shared in the newsletter this year but also a few pieces that have until now not been publicly available. Take a look to make sure that you haven’t missed anything.
Office Hours VOD
The VOD for my recent office hour live stream is available at https://www.patreon.com/posts/holiday-surprise-143657758. We ended up going for two and a half hours, talking about time series, hierarchical, and even factor models in constrained spaces, what makes for a well-defined model comparison and what makes for a practically useful one, and more.
Consulting and Training
Are you, or is someone you know, struggling to develop, implement, or scale a Bayesian analysis compatible with your domain expertise? If so then I can help.
I can be reached at inquiries@symplectomorphic.com for any questions about potential consulting engagements and training courses.
Probabilistic Modeling Discord
I set up a Discord server dedicated to discussion about (narratively) generative modeling of all kinds. For directions on how to join see https://www.patreon.com/posts/generative-88674175. Come join the conversation!
Support Me on Patreon
If my work has benefited your own and you have the resources then consider supporting me on Patreon, https://www.patreon.com/c/betanalpha.
Recent Rants
On The Frequentist Versus Bayes Debate
So much statistics discourse orbits around Bayes versus frequentist debates, but in practice Bayes versus frequentist is often a false dichotomy. Just as important as the inferential strategy is the underlying model, and how that model is specified. 🧵
(To be precise I’m referring to formal frequentist methods, which require calibrating decisions, and formal Bayesian methods, which require accurately quantifying posterior distributions. Haphazard approaches, such as uncalibrated point estimators or posterior summary estimates without error estimates, make these discussions even less productive.)
Default models are black boxes chosen with little to no consideration, let alone validation, of the underlying assumptions. Common black boxes include regression, ANOVA, and the implicit models assumed with many asymptotic methods.
Bespoke models, on the other hand, are developed for a particular application. My personal favorites are narratively generative models that approximate the underlying data generating process. Crossing inferential strategy with model specification gives four combinations: (I) frequentist methods with default models, (II) Bayesian methods with default models, (III) frequentist methods with bespoke models, and (IV) Bayesian methods with bespoke models.
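As a toy illustration of what “narratively generative” means here (my own hypothetical example, not one from the thread): write the data generating story as a simulation, one step per assumption.

```python
import numpy as np

rng = np.random.default_rng(1)

# A toy narratively generative model for daily event counts; every
# number here is a hypothetical assumption, one step per line of
# the data generating story.
n_days = 30
baseline_rate = 5.0                               # average events per day
day_effects = rng.normal(0.0, 0.2, size=n_days)   # day-to-day variation
rates = baseline_rate * np.exp(day_effects)       # positive daily rates
counts = rng.poisson(rates)                       # observed event counts

assert counts.shape == (n_days,) and (counts >= 0).all()
```

Each line corresponds to a narrative assumption about how the data arise, which is exactly what makes the resulting model straightforward to critique and refine.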
Many fields where Bayesian methods have become more normalized <cough> psychology <cough> mostly consider only II.
In many applications I and II aren’t all that different, with the Bayesian methods largely being equivalent to slightly regularized frequentist methods. That doesn’t, however, imply that either approach is particularly effective.
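The classic instance of this equivalence is ridge regression: the frequentist penalized estimate coincides exactly with the Bayesian posterior mode under independent normal priors. A minimal sketch, with hypothetical simulated data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simulated regression data.
X = rng.normal(size=(50, 3))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + rng.normal(scale=0.5, size=50)

lam = 2.0  # ridge penalty

# Frequentist: ridge regression estimate.
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)

# Bayesian: posterior mode under a normal likelihood with variance
# sigma2 and independent normal priors with variance sigma2 / lam.
sigma2 = 0.25
prior_var = sigma2 / lam
beta_map = np.linalg.solve(X.T @ X / sigma2 + np.eye(3) / prior_var,
                           X.T @ y / sigma2)

# The two estimates agree to numerical precision.
assert np.allclose(beta_ridge, beta_map)
```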
On the other hand III and IV often diverge wildly.
III is possible in theory, but in practice it typically relies on maximum likelihood, or related techniques like profile likelihood, which in turn require additional asymptotic assumptions to be well-behaved. Deriving frequentist properties for custom models is challenging.
Many of the most passionate Bayes versus frequentist debates are not between I and II or III and IV but rather between I and IV. People are often more excited about the freedom to work with better models than about formal uncertainty quantification.
Of course the correct approach is IV, which allows one to extract as much productivity as possible out of carefully considered modeling assumptions. 🫡