Rejection Sampling and the Ziggurat Algorithm
It’s blazingly fast and is the best RNG you’ve never heard of. Let me show you how it works with some interactive sketches. We’ll throw darts along the way!
Contents:
- Throwing darts
- Rejection blues
- I never liked y anyway
- The algorithm
- My very first ziggurat
- Sampling from the ziggurat
- You’re still here?
Throwing darts
How do you calculate π? One way to do it is to throw (a lot) of darts.
Huh? I’ll explain step by step. It might be useful to have the code open in another tab.
Step 1: First, we throw a dart randomly inside the square by calling this aptly named function:
Math.random()
gives us i.i.d. standard uniform variates, which is a fancy way to say that:
- Each time we call it we get any number between 0 and 1.
- There’s no way to know in advance which one it’ll be,
- but we do know no one is favored.
Everything I just said is flat out wrong, but it’s very practical to pretend this is the way it works. Go read about true randomness somewhere else.
Step 2: We check where the dart landed. That’s line 31 in the code, if (dart.dist(origin) < radius)
(you did open it in another tab, right?). If it falls within the circle, we color it blue and increase the counter in_circle
by one. Otherwise we color it red.
Step 3: We do some math.
-
It is intuitively clear to some people that
\[P(\mathrm{a\ dart\ is\ blue}) =\frac{\mathrm{circle\ area}}{\mathrm{square\ area}}.\]The intuition is that if no number is favored by the uniform coordinates
x
andy
then every piece of the square should get its “fair share”. The proof, however, just so happens to require some (slightly) nontrivial math. I’ll go ahead and assume you’re OK with the intuition. -
The area of the circle is \(\pi r^2\) while the area of the square is \((2r)^2\) (because its side is twice the length of the radius). Take the ratio of the areas and you have
\[\frac{\mathrm{circle\ area}}{\mathrm{square\ area}} = \pi/4.\] -
The law of large numbers says that
Step 4: We calculate 4*in_circle/total
as our estimate for π, and it seems to work, in the long run.
Rejection blues
Throwing darts in a square is easy, two portions of Math.random()
and you’re done. What if I wanted something a bit more special? Maybe sampling uniformly in, say, a circle?
Let’s change the game: now we throw the same old darts in the square, though every time we miss the circle we start over. About \(\pi/4\approx 78 \%\) of the time we’ll score on the first try, the rest of the time we’ve wasted a throw, but we pick ourselves up and try again.
THROWS: 0 | HITS: 0
`By clicking this button until you get a hit you manually iterate through the following loop:
Once the loop terminates, we know for sure that the last dart landed within the white circle, it fulfilled the condition. In fact,
is a general way to get a sample of X
conditioned on A
. It’s called the acceptance-rejection method. In our case we reject the red darts and start over until we accept a blue dart and move on.
This works with everything, so why limit ourselves to circles?
THROWS: 0 | HITS: 0
`What you see is a zoomed in bell curve:
\[\mathrm{bell}(x) = C\cdot {e^{-x^2/2}}\]Where \(C\) is just some constant I chose to make the bell curve touch the top of the sketch (It’s \(4\sqrt{2\pi}\) if you have to know). We want to accept any dart that hits below this curve, so the loop looks like this now:
There’s one more thing that I’ve intentionally made easy to miss: the square we’re throwing at has changed. It’s now within these boundaries:
\[-2 <x <2 ,\ 0< y < 4.\]Look very hard at the grid lines to see why.
This means we need to change the dart throwing function. Luckily, this is an easy fix:
I never liked y anyway
Let’s do the same thing, but slower. I want us to get just an x
at first.
Now it’s up to you to randomly choose y
. Close your eyes and click the button.
THROWS: 0 | HITS: 0
`Hopefully this experience convinced you that it’s harder to score the further \(x\) is from zero. How much harder though? Suppose throwX()
gave us 1.5 back, what are the chances we’ll score a hit? The vertical line above \(x=1.5\) turns red at \(y= \mathrm{bell}(1.5)\approx 4/3\), which is a third of the way up, so I’d say
The logic works the other way around as well: this selection process makes it less likely that we’ll see darts with larger \(x\) values. In fact, it turns out that if we:
- Take the blue darts.
- Ignore their
y
’s. - Keep just the
x
’s.
Then what we get is no longer uniform, but normally distributed! I won’t go through the proper proof here (it’s Bayes’ law, basically), but here’s a proof by statistics: Below you’ll find a live histogram generated from the darts we’ve been throwing automatically since you clicked “LET’S GO”.
You should be thinking: Surely a sample or two should be farther out than \(\pm2\). We’ll get to it later.
The algorithm
Our little game is actually an algorithm: it calls Math.random()
a few times and spits out other flavors of randomness, all you need is to:
- Draw the graph of the target density. In our case this is the bell curve.
- Put a box around it.
- Keep throwing darts at the box until one hits under the density.
- The \(x\) coordinate of the dart is a sample from this density.
My very first ziggurat
Every time we miss a dart, we have to start over. That’s a lot of wasted effort. Could we do something about this? Yes, think inside the box!
The ziggurat hugs the density, so there’s very little leftover area. We’ll throw darts within the ziggurat, and reject the small minority that land outside the bell curve. The incredible part is that sampling from the ziggurat is even faster than sampling from inside a plain old square: each throw is more accurate and cheaper!
Sampling from the ziggurat
Play with the sketch and let’s talk later.
THROWS: 0 | HITS: 0
You probably got it, but just to make sure:
- We start by choosing one of the three levels of the ziggurat at random.
When you get down to the nuts and bolts, this costs basically nothing.
- Now we want to throw a dart within the chosen level, so we start by getting just an
x
, like withthrowX
from earlier. - Most of the time we get an
x
in the safe zone (that’s the blue box that pops up). This means the dart’s going to be blue, no matter what. We don’t need to get ay
in this case.This is where most of the savings come from.
- Some of the time
x
lands in the danger zone (red boxes). Now we’re forced to get ay
because the dart can be blue or red given the current information.
This thing should only work if in the end we don’t play favorites with any area inside the ziggurat. We have to throw a dart that’s equally likely to land anywhere inside. Obviously, within each level everyone gets their fair share, so we just need to make sure the levels themselves are chosen in proportion to their area.
Think about it, if the top is tiny, it doesn’t make sense to go there a third of the time.
The problem also shows us the solution: when setting up the ziggurat, we move things around to make sure the levels have the same area.
You’re still here?
- I still owe you an answer regarding the tails: there’s more to the normal distribution than the range -2 to 2, but you’re not getting any of values beyond it with what we’ve done so far. So… 99% of the solution is to make the box/ziggurat wider, thereby covering more of the density. The last 1% is tricky. Not satisfied? Read the next comment then.
- Most academic papers don’t end up on my recommended reading list, but George Marsaglia’s are actually fun. Check out (one of) the original ziggurat papers. This paper is also where you find out how to set up the zigguart.
- The zigguart is cool, but the alias method is the OG RNG. An algorithm that runs at \(O(1)\)? It’s one of those rare free lunches. Here’s a nice writeup.