Inside: two short proofs and a cool viz of the SVD.

It’s in the title

What does a linear map do to the unit circle? Well, since linear maps make parallelograms out of squares [3b1b], you might guess that a circle becomes an ellipse. Then you might check an example or two. Let me help you:

You can move the red and blue image vectors (on the right)

With the intuition affirmed, let’s get to proving “circles → ellipses”.

Easy mode

Every matrix has an SVD [Wiki], which basically says its action is equivalent to a composition of three simple actions:

You know what to do
LEGEND:
  1. The first action is a rotation, possibly combined with a reflection (you might know it as \(V^T\)). The unit circle being, well, a circle, doesn’t care about this one.
  2. Map No. 2 is

    \[\begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}.\]

    It scales the \(x\) and \(y\) axes by different values, which means the circle

    \[x^2+y^2=1\]

    becomes

    \[\left(\frac{u}{\sigma_1}\right) ^2+\left( \frac{v}{\sigma_2}\right) ^2=1.\]

    Not a circle anymore, but still an ellipse. So far so good.

  3. The last action is another rotation, again possibly with a reflection (\(U\) is the name). The ellipse stays an ellipse, though its axes might not be aligned with the up/down and left/right directions anymore.
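The three actions can also be checked numerically. Here is a minimal sketch, assuming NumPy; the matrix `A` is a made-up example, not one from this post. It pushes sampled points of the unit circle through each action in turn and tests the equations from the legend.

```python
import numpy as np

# Hypothetical example matrix; any invertible real 2x2 matrix works here.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# NumPy returns A = U @ diag(s) @ Vt, with s sorted in descending order.
U, s, Vt = np.linalg.svd(A)

# Sample the unit circle as the columns of a 2 x 200 array.
t = np.linspace(0.0, 2.0 * np.pi, 200)
circle = np.vstack([np.cos(t), np.sin(t)])

step1 = Vt @ circle          # action 1: still the unit circle
step2 = np.diag(s) @ step1   # action 2: axis-aligned ellipse
step3 = U @ step2            # action 3: the same ellipse, turned

# Action 1 keeps every point on the unit circle.
on_circle = np.allclose(np.sum(step1**2, axis=0), 1.0)

# After action 2, (u/sigma_1)^2 + (v/sigma_2)^2 = 1 holds.
on_ellipse = np.allclose((step2[0] / s[0])**2 + (step2[1] / s[1])**2, 1.0)

# The three actions together reproduce the action of A itself.
same_as_A = np.allclose(step3, A @ circle)

print(on_circle, on_ellipse, same_as_A)
```

All three checks should come out true for this `A`. The division by `s[1]` is exactly where a singular matrix would cause trouble.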

The ellipse equation in step #2 breaks down when \(\sigma_2 = 0\), since it divides by \(\sigma_2\). What happens then? Take a guess, then check it by aligning both image vectors in the same direction.

There’s really nothing more to it. Except… What if you wanted to work backwards and use the “circles → ellipses” property of linear maps to prove that the SVD exists? This isn’t that far-fetched: it’s essentially the geometric interpretation of Camille Jordan’s original proof.

Normal difficulty

We need to avoid circular reasoning, so the SVD is not allowed. How do we prove it then?

If the matrix \(A\) is invertible then we can write

\[\begin{bmatrix} u \\ v \end{bmatrix} = A \begin{bmatrix} x \\ y \end{bmatrix}\]

as

\[\begin{bmatrix} x \\ y \end{bmatrix} = A^{-1} \begin{bmatrix} u \\ v \end{bmatrix}\]

which unpacks to

\[x = a^{11}u+a^{12}v\]

\[y = a^{21}u+a^{22}v\]

where \(a^{ij}\) is what you think it is: the \((i,j)\) entry of \(A^{-1}\). With this in mind, the unit circle

\[\left\{ \begin{bmatrix} x \\ y \end{bmatrix} : x^2+y^2=1 \right\}\]

becomes

\[\left\{ \begin{bmatrix} u \\ v \end{bmatrix} : (a^{11}u+a^{12}v)^2+(a^{21}u+a^{22}v)^2=1 \right\}\]

after the action of \(A\).
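To see that this defining equation really is a quadratic in \(u\) and \(v\), you can expand it and test it on actual image points. A minimal sketch, assuming NumPy; the matrix `A` and the coefficient names `E`, `F`, `G` are made up for illustration and are not notation from this post.

```python
import numpy as np

# Hypothetical invertible example matrix.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
A_inv = np.linalg.inv(A)

# Expanding (a11*u + a12*v)^2 + (a21*u + a22*v)^2 = 1, where aij are the
# entries of A_inv, gives E*u^2 + F*u*v + G*v^2 = 1 with:
E = A_inv[0, 0]**2 + A_inv[1, 0]**2
F = 2.0 * (A_inv[0, 0] * A_inv[0, 1] + A_inv[1, 0] * A_inv[1, 1])
G = A_inv[0, 1]**2 + A_inv[1, 1]**2

# Map sampled points of the unit circle through A.
t = np.linspace(0.0, 2.0 * np.pi, 200)
u, v = A @ np.vstack([np.cos(t), np.sin(t)])

# Every image point should satisfy the expanded quadratic equation.
satisfies = np.allclose(E * u**2 + F * u * v + G * v**2, 1.0)

print(E, F, G, satisfies)
```

For this particular `A` the coefficients work out to roughly 0.4, -0.4 and 0.2.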

While this set is certainly not as attractive as the one before it, the defining equation is still quadratic. The solution sets of quadratic equations in two variables are conic sections [Wiki], and there’s just not that much of a variety there:

Conic Sections

Conics. By Pbroks13 [Wikimedia]
Check out the interactive version at CindyJS

Specifically, the only bounded conic (degenerate cases aside) is an ellipse. Since the action of \(A\) is continuous, it maps compact sets to compact, hence bounded, sets, and the unit circle we started with is certainly compact. So the image has to be an ellipse.
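If you’d rather not lean on compactness, the discriminant offers a second opinion. Write the defining equation as \(Eu^2 + Fuv + Gv^2 = 1\) (the names \(E\), \(F\), \(G\) are mine, introduced only for this check). Expanding the squares gives

\[E = (a^{11})^2+(a^{21})^2, \quad F = 2\left(a^{11}a^{12}+a^{21}a^{22}\right), \quad G = (a^{12})^2+(a^{22})^2,\]

and a short computation shows

\[F^2 - 4EG = -4\left(a^{11}a^{22}-a^{12}a^{21}\right)^2 = -4\det(A^{-1})^2 < 0.\]

A negative discriminant is the ellipse case among conics, which agrees with the boundedness argument.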

What happens when \(A\) is not invertible? If you count segments (rank one) and single points (the zero matrix) as degenerate ellipses, then it’s all good.
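Here is a quick numerical look at the segment case. It’s a minimal sketch, assuming NumPy; the rank-one matrix `A` is a made-up example. It checks that every image point lies on a single line through the origin and never strays further than \(\sigma_1\) from it.

```python
import numpy as np

# Hypothetical rank-one matrix: its second row is twice its first.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])

# Singular values only, in descending order.
s = np.linalg.svd(A, compute_uv=False)

# Image of sampled unit-circle points under A.
t = np.linspace(0.0, 2.0 * np.pi, 400)
image = A @ np.vstack([np.cos(t), np.sin(t)])

# Every image point is a multiple of (1, 2): the 2D cross product vanishes.
direction = np.array([1.0, 2.0])
cross = image[0] * direction[1] - image[1] * direction[0]
collinear = np.allclose(cross, 0.0)

# Largest distance from the origin among the sampled image points.
half_length = np.max(np.linalg.norm(image, axis=0))

print(s, collinear, half_length)
```

The second singular value comes out as zero up to rounding, and the sampled half-length sits just under the first one.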

It’s not over?

Let me play devil’s advocate for a sec:

  • Exhibit A: the textbook approach to proving that the SVD exists (Beltrami’s original) goes through the diagonalization of \(A^TA\).
  • Exhibit B: we usually do the same to show that the solution set of the quadratic equation in \(u\) and \(v\) is a conic section.

Isn’t that cheating?

No! Even Descartes already knew that all you need is to get rid of the \(u\cdot v\) term with a well chosen rotation. Diagonalization is overkill.
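Here is roughly what that rotation looks like in practice. It’s a minimal sketch in plain Python; the function name `remove_cross_term` and the coefficient names `E`, `F`, `G` (for a conic \(Eu^2 + Fuv + Gv^2 = 1\)) are hypothetical, and the sample numbers are made up. The angle comes from the classic condition \(\tan 2\theta = F/(E-G)\).

```python
import math

def remove_cross_term(E, F, G):
    """Rotate coordinates so E*u^2 + F*u*v + G*v^2 loses its u*v term.

    Uses the substitution u = c*u' - s*v', v = s*u' + c*v' with
    c = cos(theta), s = sin(theta), and returns
    (theta, E_new, F_new, G_new) for the rotated equation.
    """
    # atan2 also copes with E == G, where tan(2*theta) would blow up.
    theta = 0.5 * math.atan2(F, E - G)
    c, s = math.cos(theta), math.sin(theta)
    E_new = E * c * c + F * c * s + G * s * s
    F_new = F * (c * c - s * s) - 2.0 * (E - G) * c * s
    G_new = E * s * s - F * c * s + G * c * c
    return theta, E_new, F_new, G_new

# Hypothetical coefficients with F^2 - 4*E*G < 0.
theta, E_new, F_new, G_new = remove_cross_term(0.4, -0.4, 0.2)
print(theta, E_new, F_new, G_new)
```

After the rotation the cross term is zero up to rounding, and what is left, \(E'u'^2 + G'v'^2 = 1\) with both coefficients positive, is an axis-aligned ellipse with semi-axes \(1/\sqrt{E'}\) and \(1/\sqrt{G'}\). No eigenvalue machinery required.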

Go further

  1. Want more SVD visualizations? Check out this video.
  2. The SVD is the bomb, but don’t take it from me. Here’s Gilbert Strang himself singing its praises.
  3. BTW, if we’re on the subject of singing, I learned the SVD exists from this love song 12 years ago. No lie.
  4. If you’ve made it all the way here, you’re ready for the history of the SVD.