Rejection Sampling and the Ziggurat Algorithm

It’s blazingly fast and is the best RNG you’ve never heard of. Let me show you how it works with some interactive sketches. We’ll throw darts along the way!

Throwing darts

How do you calculate π? One way to do it is to throw (a lot) of darts.

Run this sketch yourself in the p5.js web editor.

Huh? I’ll explain step by step. It might be useful to have the code open in another tab.

Step 1: First, we throw a dart randomly inside the square by calling this aptly named function:

function throwDart() {
const x = Math.random();
const y = Math.random();
return createVector(x, y);
}

Math.random() gives us i.i.d. standard uniform variates, which is a fancy way to say that:

Each time we call it we get any number between 0 and 1.
There’s no way to know in advance which one it’ll be,
but we do know no one is favored.

Everything I just said is flat out wrong, but it’s very practical to pretend this is the way it works. Go read about true randomness somewhere else.

Step 2: We check where the dart landed. That’s line 31 in the code, if (dart.dist(origin) < radius) (you did open it in another tab, right?). If it falls within the circle, we color it blue and increase the counter in_circle by one. Otherwise we color it red.

Step 3: We do some math.

It is intuitively clear to some people that
\[P(\mathrm{a\ dart\ is\ blue}) =\frac{\mathrm{circle\ area}}{\mathrm{square\ area}}.\]
The intuition is that if no number is favored by the uniform coordinates x and y then every piece of the square should get its “fair share”. The proof, however, just so happens to require some (slightly) nontrivial math. I’ll go ahead and assume you’re OK with the intuition.
The area of the circle is \(\pi r^2\) while the area of the square is \((2r)^2\) (because its side is twice the length of the radius). Take the ratio of the areas and you have
\[\frac{\mathrm{circle\ area}}{\mathrm{square\ area}} = \pi/4.\]
The law of large numbers says that

\[P(\mathrm{a\ dart\ is\ blue})\approx \frac{\mathrm{\#\ of\ blue\ darts}}{\mathrm{\#\ of\ total\ darts}}.\]

Step 4: We calculate 4*in_circle/total as our estimate for π, and it seems to work, in the long run.

Rejection blues

Throwing darts in a square is easy, two portions of Math.random() and you’re done. What if I wanted something a bit more special? Maybe sampling uniformly in, say, a circle?

Let’s change the game: now we throw the same old darts in the square, though every time we miss the circle we start over. About \(\pi/4\approx 78 \%\) of the time we’ll score on the first try, the rest of the time we’ve wasted a throw, but we pick ourselves up and try again.

`THROWS: 0 | HITS: 0`

By clicking this button until you get a hit you manually iterate through the following loop:

do {
  dart = throwDart();
} while (dart.dist(origin) > radius)

Once the loop terminates, we know for sure that the last dart landed within the white circle, it fulfilled the condition. In fact,

do {
  Sample X
} while (X does not fulfill condition A)

is a general way to get a sample of X conditioned on A. It’s called the acceptance-rejection method. In our case we reject the red darts and start over until we accept a blue dart and move on.

This works with everything, so why limit ourselves to circles?

`THROWS: 0 | HITS: 0`

What you see is a zoomed in bell curve:

\[\mathrm{bell}(x) = C\cdot {e^{-x^2/2}}\]

Where \(C\) is just some constant I chose to make the bell curve touch the top of the sketch (It’s \(4\sqrt{2\pi}\) if you have to know). We want to accept any dart that hits below this curve, so the loop looks like this now:

do {
  dart = throwDart();
  } while (dart.y > bell(dart.x))

There’s one more thing that I’ve intentionally made easy to miss: the square we’re throwing at has changed. It’s now within these boundaries:

\[-2 <x <2 ,\ 0< y < 4.\]

Look very hard at the grid lines to see why.

This means we need to change the dart throwing function. Luckily, this is an easy fix:

function throwDart() {
  const x = 4*Math.random() - 2; //x ~ U[-2,2]
  const y = 4*Math.random();     //y ~ U[0,4]
  return createVector(x, y);
}

I never liked y anyway

Let’s do the same thing, but slower. I want us to get just an x at first.

function throwX() {
  return 4*Math.random() - 2; 
}

Now it’s up to you to randomly choose y. Close your eyes and click the button.

`THROWS: 0 | HITS: 0`

Hopefully this experience convinced you that it’s harder to score the further \(x\) is from zero. How much harder though? Suppose throwX() gave us 1.5 back, what are the chances we’ll score a hit? The vertical line above \(x=1.5\) turns red at \(y= \mathrm{bell}(1.5)\approx 4/3\), which is a third of the way up, so I’d say

\[P(\mathrm{the\ dart\ is\ blue, \ given \ that\ } x=1.5) \approx 1/3.\]

The logic works the other way around as well: this selection process makes it less likely that we’ll see darts with larger \(x\) values. In fact, it turns out that if we:

Take the blue darts.
Ignore their y’s.
Keep just the x’s.

Then what we get is no longer uniform, but normally distributed! I won’t go through the proper proof here (it’s Bayes’ law, basically), but here’s a proof by statistics: Below you’ll find a live histogram generated from the darts we’ve been throwing automatically since you clicked “LET’S GO”.

You should be thinking: Surely a sample or two should be farther out than \(\pm2\). We’ll get to it later.

The algorithm

Our little game is actually an algorithm: it calls Math.random() a few times and spits out other flavors of randomness, all you need is to:

Draw the graph of the target density. In our case this is the bell curve.
Put a box around it.
Keep throwing darts at the box until one hits under the density.
The \(x\) coordinate of the dart is a sample from this density.

My very first ziggurat

Every time we miss a dart, we have to start over. That’s a lot of wasted effort. Could we do something about this? Yes, think inside the box!

The ziggurat hugs the density, so there’s very little leftover area. We’ll throw darts within the ziggurat, and reject the small minority that land outside the bell curve. The incredible part is that sampling from the ziggurat is even faster than sampling from inside a plain old square: each throw is more accurate and cheaper!

Sampling from the ziggurat

Play with the sketch and let’s talk later.

THROWS: 0 | HITS: 0

You probably got it, but just to make sure:

We start by choosing one of the three levels of the ziggurat at random.

When you get down to the nuts and bolts, this costs basically nothing.
Now we want to throw a dart within the chosen level, so we start by getting just an x, like with throwX from earlier.
Most of the time we get an x in the safe zone (that’s the blue box that pops up). This means the dart’s going to be blue, no matter what. We don’t need to get a y in this case.

This is where most of the savings come from.
Some of the time x lands in the danger zone (red boxes). Now we’re forced to get a y because the dart can be blue or red given the current information.

This thing should only work if in the end we don’t play favorites with any area inside the ziggurat. We have to throw a dart that’s equally likely to land anywhere inside. Obviously, within each level everyone gets their fair share, so we just need to make sure the levels themselves are chosen in proportion to their area.

Think about it, if the top is tiny, it doesn’t make sense to go there a third of the time.

The problem also shows us the solution: when setting up the ziggurat, we move things around to make sure the levels have the same area.

You’re still here?

I still owe you an answer regarding the tails: there’s more to the normal distribution than the range -2 to 2, but you’re not getting any of values beyond it with what we’ve done so far. So… 99% of the solution is to make the box/ziggurat wider, thereby covering more of the density. The last 1% is tricky. Not satisfied? Read the next comment then.
Most academic papers don’t end up on my recommended reading list, but George Marsaglia’s are actually fun. Check out (one of) the original ziggurat papers. This paper is also where you find out how to set up the zigguart.
The zigguart is cool, but the alias method is the OG RNG. An algorithm that runs at \(O(1)\)? It’s one of those rare free lunches. Here’s a nice writeup.

Contents:

Throwing darts

Rejection blues

I never liked y anyway

The algorithm

My very first ziggurat

Sampling from the ziggurat

You’re still here?