Principal Component Analysis:

Three Examples and some Theory

Very often, especially in applications to the life sciences, useful low-dimensional projections exist and allow humans to grasp a data set that would otherwise be inscrutable.

Tony Phillips
Stony Brook University


Principal component analysis (PCA), an algorithm for helping us understand high-dimensional data sets, has become very useful in science (for example, a search in Nature for the year 2020 picks it up in 124 different articles). Human minds are good at recognizing patterns in two dimensions and to some extent in three, but are essentially useless in higher dimensions. The goal of PCA is to produce the most useful possible 2- or 3-dimensional projection of a high-dimensional data set—most useful in that the smallest amount of information is lost by the projection. One can construct artificial examples where all the dimensions must be considered, but very often, especially in applications to the life sciences, useful low-dimensional projections exist and allow humans to grasp a data set that would otherwise be inscrutable. In this column we will look at PCA in three recently published applications (to anthropology and paleontology), and go through the mathematics that makes it work.

Example 1. Distinguishing nearby soil samples in a 78,000-year-old burial

burial PCA
In this PCA, 13-dimensional data from some 80 soil samples are projected into the plane spanned by their two principal components. The projection shows a clear distinction (highlighted by the superimposed 95% confidence ellipses) between samples from the burial pit (red dots) and samples (purple dots) from outside the pit at the same level (Layer 19) of the excavation. Image adapted from Nature, 593, 95-100, Extended Data Figure 4

“Earliest known human burial in Africa” was published in Nature on May 5, 2021. The authors are an international team of 36 led by María Martinón-Torres, Nicole Boivin and Michael Petraglia. In documenting the 78,000-year-old grave of a child in Panga ya Saidi, Kenya, they needed to establish that there had actually been a burial; they did this in part by showing that the earth surrounding the skeleton was different from the surrounding earth at that level (layer 19) in the archaeological excavation, i.e. that a pit had been dug to receive the child’s corpse. “The burial fill is a mix of ferruginous silt and sand, compositionally similar to the top of layer 18 and the base of layer 17, and different from layer 19, in which the pit was excavated.” They confirmed the difference by measuring the concentrations of 13 chemical elements occurring in the burial fill and in those three layers. To visualize their ensemble of measurements (80 data points, each with 13 components), “[w]e performed PCA of EDXRF concentrations for the thirteen major, minor, and trace elements more frequently detected (Si, K, Ca, Ti, Mn, Fe, Zn, Ga, As, Rb, Y, Zr and Ba).” As they report, “The burial pit samples markedly differ from layer 19.”

The basic mathematics

In the “Earliest … burial” example, PCA gives a useful 2-dimensional representation of a cloud of points in a 13-dimensional space. (Astronomers describe the general procedure as “finding the principal axes of inertia of the cloud of samples about their mean in feature space.”) The a priori unlikely combination of basic elements of statistics, multivariable calculus and linear algebra that makes it work will be outlined in this section.

To be specific, suppose the data set consists of $n$ measurements taken on $M$ samples; geometrically it would be a set of $M$ points distributed in $n$-dimensional space, where each sample is a point and its $n$ measurements are its $n$ coordinates.

  • To fix notation, let measurements $x_1, \dots, x_n$ be taken on $M$ samples $s_1, \dots, s_M$.

  • For each $x_i$ we calculate the average, or mean

    $$\overline{x_i} = \frac{1}{M}\sum_{j=1}^M x_i(s_j).$$

  • Then for each $x_i$ we calculate the variance: this is the average of the square of the difference between the $x_i$-measurements and the mean: $${\rm Var}(x_i) = \frac{1}{M}\sum_{k=1}^M (x_i(s_k)-\overline{x_i})^2.$$

  • The variance ${\rm Var}(x_i)$ estimates the spread of $x_i$. Squaring the differences puts deviations to the right or the left of the mean on the same footing; it also gives more weight to the larger differences (the “outliers”). (The standard deviation, a more common measure of spread, is defined as the square root of the variance.)

  • For each pair $(x_i, x_j)$ of measurements we calculate the covariance $${\rm Cov}(x_i, x_j) = \frac{1}{M}\sum_{k=1}^M (x_i(s_k)-\overline{x_i})(x_j(s_k)-\overline{x_j}).$$ Note that ${\rm Cov}(x_i, x_i)={\rm Var}(x_i)$.

The various covariances are naturally displayed in  the covariance matrix

$${\bf Cov} ={\bf Cov}(x_1,\dots, x_n)=\left (\begin{array}{cccc}{\rm Cov}(x_1,x_1)& {\rm Cov}(x_1,x_2)&\dots&{\rm Cov}(x_1,x_n)\\ {\rm Cov}(x_2,x_1)&{\rm Cov}(x_2,x_2) &\dots &{\rm Cov}(x_2,x_n)\\ \vdots&\vdots&\vdots&\vdots\\ {\rm Cov}(x_n,x_1)&{\rm Cov}(x_n,x_2)&\dots&{\rm Cov}(x_n,x_n) \end{array} \right ).$$

Note that since ${\rm Cov}(x_i, x_j) = {\rm Cov}(x_j, x_i)$ the matrix ${\bf Cov}$ is symmetric. This fact will be very important when we study its algebraic properties.

So much for the statistics. The next calculations will be simpler to write if we assume our initial measurements have been centered (by subtracting from each one its original mean), so that each one has mean equal to zero. So now ${\rm Cov}(x_i, x_j) = \frac{1}{M}\sum_{k=1}^M x_i(s_k)x_j(s_k), $ for all $i,j=1,\dots,n$.
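The statistics so far can be sketched in a few lines of NumPy (a language choice of mine, not the paper's); the numbers below are random stand-ins with the same shape as the burial data set (13 measurements on 80 samples), stored so that `X[i, k]` plays the role of $x_i(s_k)$:

```python
import numpy as np

# Illustrative stand-in data: n = 13 measurements on M = 80 samples,
# with X[i, k] = x_i(s_k) (rows are measurements, columns are samples).
rng = np.random.default_rng(0)
X = rng.normal(size=(13, 80))

# Center each measurement by subtracting its mean over the samples.
Xc = X - X.mean(axis=1, keepdims=True)

# Covariance matrix of the centered data:
# Cov(x_i, x_j) = (1/M) * sum_k x_i(s_k) * x_j(s_k).
M = Xc.shape[1]
Cov = (Xc @ Xc.T) / M

# Cov is symmetric, and its diagonal holds the variances Var(x_i).
```

Note that dividing by $M$ (rather than $M-1$) matches the definitions above; the symmetry of `Cov` is what the eigenvector argument below depends on.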

PCA starts with finding the direction in $n$-space (measurement-space) along which the data are maximally spread out, so that the clustering and spacing of the data points along that direction carry the most information. A direction is given by a unit vector ${\bf a}$ with components $a_1, \dots, a_n$, and the measurement in this direction is the linear combination $y=a_1 x_1 + \dots + a_n x_n$. Since the $x_i$ all have mean zero, the same must hold for $y$, so it is automatically centered. “Maximally spread out” in our context means having the greatest variance ${\rm Var}(y) = \frac{1}{M}\sum_{j=1}^M y(s_j)^2 $.

So we want to find the components $a_1, \dots, a_n$ leading to the maximum of ${\rm Var}(y) = {\rm Var}(a_1 x_1 + \dots + a_n x_n)$, subject to the constraint $ a_1 ^2 +\cdots +a_n^2=1$. Geometrically, we are maximizing ${\rm Var}(y)$ on the surface of the unit sphere in $n$-space. The way we learn to do this in multivariable calculus is to look for the points on the sphere $a_1 ^2 +\cdots +a_n^2=1$ where the gradient of ${\rm Var}(y)$ is perpendicular to the surface of the sphere or, equivalently, parallel to the gradient of the function $a_1 ^2 +\cdots +a_n^2$, which is $(2a_1, \dots, 2a_n)$. One of these points will be the maximum. (The  gradient of a function $f$ of $n$ variables $x_1, \dots, x_n$ is the vector $(\partial f/\partial x_1, \dots, \partial f/\partial x_n)$. It points in the direction of steepest increase.)

Numerically, the statement that the gradients of ${\rm Var}(y)$ and $a_1 ^2 +\cdots +a_n^2$ are parallel means that there is a non-zero scalar $\lambda$ such that one is $\lambda$ times the other (this is the basis of the method of Lagrange multipliers):

$$ \frac{\partial {\rm Var}(y)}{\partial a_1}= 2\lambda a_1$$

$$~~~~~~~~~~~~~~~~\vdots ~~~~~~~~~~~~~~~~(*)$$

$$ \frac{\partial {\rm Var}(y)}{\partial a_n}   =  2\lambda a_n. $$

Let us calculate $\partial {\rm Var}(y)/\partial a_j$.

$$\frac{\partial}{\partial a_j}{\rm Var}(y) = \frac{\partial}{\partial a_j}{\rm Var}(a_1 x_1 + \dots + a_n x_n).$$

Remembering that the $x_i$ are centered,

$${\rm Var}(a_1 x_1 + \dots + a_n x_n)= \frac{1}{M}\sum_{k=1}^M (a_1 x_1(s_k) + \dots + a_n x_n(s_k))^2$$

so by the chain rule

$$\frac{\partial}{\partial a_j}{\rm Var}(y)= \frac{1}{M} \sum_{k=1}^M [2(a_1 x_1(s_k) + \dots + a_n x_n(s_k))\cdot x_j(s_k)]$$

$$= 2a_1{\rm Cov}(x_1,x_j) + \cdots + 2a_n {\rm Cov}(x_n,x_j).$$

Equations $(*)$ then imply that for each $j$ between 1 and $n$,

$$2a_1 {\rm Cov}(x_1,x_j) + \cdots + 2a_n {\rm Cov}(x_n,x_j) = 2\lambda a_j.$$

Here is where linear algebra makes a miraculous intervention. Cancelling the 2s, and remembering that  ${\rm Cov}(x_i,x_j) = {\rm Cov}(x_j,x_i)$, these equations together can be rewritten as

$$ \left (\begin{array}{cccc}
{\rm Cov}(x_1,x_1)& {\rm Cov}(x_1,x_2)&\dots&{\rm Cov}(x_1,x_n)\\
{\rm Cov}(x_2,x_1)&{\rm Cov}(x_2,x_2) &\dots &{\rm Cov}(x_2,x_n)\\
\vdots&\vdots&\vdots&\vdots\\
{\rm Cov}(x_n,x_1)&{\rm Cov}(x_n,x_2)&\dots&{\rm Cov}(x_n,x_n)
\end{array} \right ) \left( \begin{array}{c} a_1\\ a_2\\ \vdots\\ a_n \end{array} \right )= \lambda \left( \begin{array}{c} a_1\\ a_2\\ \vdots\\ a_n \end{array} \right ).~(**)$$

We say that the column vector ${\bf a}$  is an eigenvector for the covariance matrix ${\bf Cov}$, with eigenvalue $\lambda$.

Finding the eigenvalues and eigenvectors for the $n\times n$ matrix ${\bf Cov}$ is a standard set of calculations in linear algebra, which I will summarize here.

  • The eigenvalues of ${\bf Cov}$ are defined as the roots of the degree-$n$ polynomial equation $\det({\bf Cov}-\lambda {\rm I})=0$, where ${\rm I}$ is the $n\times n$ identity matrix; but for large $n$ this definition is not practical: eigenvectors and eigenvalues are calculated by the Lagrange multiplier method as above. In applications to the life sciences, where there are not likely to be hidden symmetries in the problem, the eigenvalues will be all distinct; furthermore, since ${\bf Cov}$ is a symmetric matrix these are all real numbers, and can therefore be ordered $\lambda_1 > \lambda_2 > \cdots > \lambda_n$.

  • For each index $j$, the vector equation ${\bf Cov} \cdot {\bf a} = \lambda_j {\bf a}$ has a non-zero solution, unique up to a scalar multiple, so unique up to sign if we fix its length to be 1. This solution, call it ${\bf a}_j$, is the eigenvector corresponding to $\lambda_j$.

  • (For future reference, here is a useful fact: because the matrix is symmetric, eigenvectors corresponding to different eigenvalues are perpendicular.)

In particular the column vector ${\bf a}_1$, with components $ a_{11}, \dots, a_{1n}$, corresponding to the maximum eigenvalue $\lambda_1$, gives us the answer to our original question: the direction in measurement-space along which the data are maximally spread out. Notice that the indeterminacy in the sign of ${\bf a}_1$ does not affect its direction. In PCA, the quantity $y_1 = a_{11}x_1 + \dots + a_{1n}x_n$ is called PC1, the first principal component.
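In practice ${\bf a}_1$ is found by a numerical eigensolver. A minimal NumPy sketch with random stand-in data: `numpy.linalg.eigh` diagonalizes the symmetric covariance matrix, and the eigenvector belonging to the largest eigenvalue is the direction of maximal spread:

```python
import numpy as np

# Illustrative centered data: 5 measurements on 200 samples.
rng = np.random.default_rng(1)
X = rng.normal(size=(5, 200))
Xc = X - X.mean(axis=1, keepdims=True)
Cov = (Xc @ Xc.T) / Xc.shape[1]

# eigh is for symmetric matrices; it returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(Cov)
a1 = eigvecs[:, -1]        # unit eigenvector for the largest eigenvalue

# PC1: the value of y_1 = a_11 x_1 + ... + a_1n x_n on each sample.
pc1 = a1 @ Xc

# Spot check: no random unit direction b spreads the data out more.
for _ in range(200):
    b = rng.normal(size=5)
    b /= np.linalg.norm(b)
    assert (b @ Xc).var() <= pc1.var() + 1e-9
```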

What is this maximum variance? The answer involves some calculation.

Remember that for $\displaystyle{y_1 = a_{11}x_1 +\dots + a_{1n}x_n}$,
$${\rm Var}(y_1) ={\rm Var}(a_{11} x_1 + \dots + a_{1n} x_n)= \frac{1}{M}\sum_{k=1}^M (a_{11} x_1(s_k) + \dots + a_{1n} x_n(s_k))^2$$
$${\rm Var}(y_1) =\frac{1}{M}\sum_{k=1}^M \sum_{i=1}^n\sum_{j=1}^n a_{1i}a_{1j}x_i(s_k)x_j(s_k)$$

$$= \sum_{i=1}^n\sum_{j=1}^n a_{1i}a_{1j}\frac{1}{M}\sum_{k=1}^M x_i(s_k)x_j(s_k) = \sum_{i=1}^n\sum_{j=1}^n a_{1i}a_{1j} {\rm Cov}(x_i,x_j)$$

$$ = (a_{11} ~ a_{12}~ \cdots ~a_{1n}) \left (\begin{array}{cccc}
{\rm Cov}(x_1,x_1)& {\rm Cov}(x_1,x_2)&\dots&{\rm Cov}(x_1,x_n)\\
{\rm Cov}(x_2,x_1)&{\rm Cov}(x_2,x_2) &\dots &{\rm Cov}(x_2,x_n)\\
\vdots&\vdots&\vdots&\vdots\\
{\rm Cov}(x_n,x_1)&{\rm Cov}(x_n,x_2)&\dots&{\rm Cov}(x_n,x_n)
\end{array} \right ) \left( \begin{array}{c} a_{11}\\ a_{12}\\ \vdots\\ a_{1n} \end{array} \right ) $$
$$=(a_{11} ~ a_{12}~ \cdots ~a_{1n})~ \lambda_1 \left( \begin{array}{c} a_{11}\\ a_{12}\\ \vdots\\ a_{1n} \end{array} \right ) = \lambda_1 (a_{11}^2 + a_{12}^2 + \cdots + a_{1n}^2) = \lambda_1.$$

Amazingly enough, the variance of the linear combination $y_1 = a_{11}x_1 + \dots + a_{1n}x_n$ using the coefficients of ${\bf a}_1$ is exactly the corresponding eigenvalue $\lambda_1$.
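The identity ${\rm Var}(y_1)=\lambda_1$ is easy to confirm numerically; a quick sketch with random stand-in data:

```python
import numpy as np

# Random centered data: 4 measurements on 500 samples (shapes illustrative).
rng = np.random.default_rng(2)
Xc = rng.normal(size=(4, 500))
Xc -= Xc.mean(axis=1, keepdims=True)
Cov = (Xc @ Xc.T) / Xc.shape[1]

eigvals, eigvecs = np.linalg.eigh(Cov)
a1, lam1 = eigvecs[:, -1], eigvals[-1]   # top eigenpair

y1 = a1 @ Xc                             # PC1 scores for each sample
assert np.isclose(y1.var(), lam1)        # Var(y_1) equals the eigenvalue
```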

Once we have found ${\bf a}_1$, we go on to look for the next most useful direction in which to sort the data. We look in the hyperplane (in measurement-space) perpendicular to ${\bf a}_1$ and repeat the analysis. As before we want to maximize ${\rm Var}(y) = {\rm Var}(b_1 x_1 + \dots + b_n x_n)$, subject to the constraint $b_1 ^2 +\cdots +b_n^2=1$; now we add the additional constraint $b_1a_{11} + \cdots + b_na_{1n} = 0$. The solution will be a point satisfying $b_1 ^2 +\cdots +b_n^2=1$ where the gradient of ${\rm Var}(y)$ is parallel to the gradient of $b_1 ^2 +\cdots +b_n^2$ and lies in the plane $b_1a_{11} + \cdots + b_na_{1n} = 0$. As before, the solution is an eigenvector of the covariance matrix; the condition $b_1a_{11} + \cdots + b_na_{1n} = 0$ eliminates ${\bf a}_1$, leaving ${\bf a}_2$ as the direction maximizing variance; $y_2 = a_{21}x_1 + \cdots + a_{2n}x_n$ will be PC2, the second principal component, with variance calculated as above to be $\lambda_2$.

Some more linear algebra: equation $(**)$ specialized to the vector ${\bf a}_i$ can be written in matrix notation ${\bf Cov}\cdot{\bf a}_i = \lambda_i{\bf a}_i$. Amalgamating the $n$ column vectors ${\bf a}_1, \dots {\bf a}_n$ into a matrix ${\bf A}$ will give $ {\bf Cov}\cdot{\bf A} = {\bf A}\cdot{\bf \Lambda}$, where ${\bf \Lambda}$ is the diagonal matrix with entries $\lambda_1, \dots, \lambda_n$. Now ${\bf A}$, being made up of orthogonal columns of length 1, is an orthogonal matrix; its inverse is easy to calculate: it is the transpose ${\bf A}^{\top}$. Multiplying both sides of the last equation on the left by ${\bf A}^{\top}$ gives ${\bf A}^{\top}\cdot{\bf Cov}\cdot{\bf A} = {\bf A}^{\top}\cdot {\bf A}\cdot {\bf \Lambda}= {\bf \Lambda}$. On the other hand, ${\bf A}^{\top}\cdot {\bf Cov}\cdot{\bf A}$, being similar to ${\bf Cov}$, has the same trace ${\rm Cov}(x_1, x_1) + \cdots + {\rm Cov}(x_n, x_n)$, which then must be equal to the trace of ${\bf \Lambda}$, i.e. $\lambda_1 + \cdots + \lambda_n$. The sum of the eigenvalues is equal to the total variance. Often in applications to the life sciences the first two eigenvalues together account for a large proportion of the variance, and their associated principal components give a satisfactory two-dimensional representation of the entire $n$-dimensional panorama of the measurements (in our first two examples they account for 76% and 95.5%, respectively, of the total variance). Often also the third principal component is calculated to further refine the picture.
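The trace identity, and the "proportion of variance explained" that papers quote (76% and 95.5% in the first two examples), can be sketched as follows with random stand-in data:

```python
import numpy as np

# Random centered data: 6 measurements on 150 samples (shapes illustrative).
rng = np.random.default_rng(3)
Xc = rng.normal(size=(6, 150))
Xc -= Xc.mean(axis=1, keepdims=True)
Cov = (Xc @ Xc.T) / Xc.shape[1]

eigvals = np.linalg.eigvalsh(Cov)        # ascending order

# The total variance is the trace of Cov, i.e. the sum of the eigenvalues.
total = np.trace(Cov)
assert np.isclose(eigvals.sum(), total)

# Fraction of the total variance carried by the first two PCs:
explained = (eigvals[-1] + eigvals[-2]) / total
```

For uncorrelated random data `explained` is unremarkable; in real life-science data sets it is often large, which is what makes the two-dimensional picture faithful.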

To complete the picture, the $M$ points in $n$-space corresponding to the original measurements can be mapped into the (PC1, PC2)-plane using the matrix ${\bf A}$: $y_1(s_k) = a_{11}x_1(s_k)+\cdots+a_{1n}x_n(s_k)$, $y_2(s_k) = a_{21}x_1(s_k)+\cdots+a_{2n}x_n(s_k)$, or into (PC1, PC2, PC3)-space by using also $y_3(s_k) = a_{31}x_1(s_k)+\cdots+a_{3n}x_n(s_k)$.
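A sketch of this final projection step, again with random stand-in data of the burial-example shape; the two rows of `Y` are the coordinates one would feed to a 2-D scatter plot:

```python
import numpy as np

# Illustrative data: 13 measurements on 80 samples, centered.
rng = np.random.default_rng(4)
X = rng.normal(size=(13, 80))
Xc = X - X.mean(axis=1, keepdims=True)
Cov = (Xc @ Xc.T) / Xc.shape[1]
_, eigvecs = np.linalg.eigh(Cov)

# Rows of A2: the top two eigenvectors a_1, a_2 (largest eigenvalues first).
A2 = eigvecs[:, ::-1][:, :2].T           # shape (2, 13)

# Each sample s_k maps to (y_1(s_k), y_2(s_k)), its (PC1, PC2) coordinates.
Y = A2 @ Xc                              # shape (2, 80)
```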

Example 2. Sorting Kuehneotherium teeth

Kuehneotherium (right) and its contemporary of 200 million years ago, Morganucodon, are imagined by John Sibbick foraging for insects in what is now part of Wales. The painting, which appeared on the cover of Nature, is reproduced courtesy of Pamela Gill.

This example, which comes from a review article on PCA by Ian T. Jolliffe and Jorge Cadima in  Philosophical Transactions of the Royal Society A, “examines a dataset consisting of nine measurements on 88 fossil teeth from the early mammalian insectivore Kuehneotherium . …  Kuehneotherium is one of the earliest mammals and remains have been found during quarrying of limestone in South Wales, UK. The bones and teeth were washed into fissures in the rock, about 200 million years ago, and all the lower molar teeth used in this analysis are from a single fissure. However, it looked possible that there were teeth from more than one species of  Kuehneotherium in the sample.” The teeth came from the collection of the Natural History Museum London, and the measurements were made by Pamela G. Gill, School of Earth Sciences, Bristol. Her nine measurements are identified in the next figure.

tooth parameters
The nine parameters measured by Pamela Gill on each Kuehneotherium tooth. Image courtesy of Pamela Gill.

Cadima and Jolliffe ran a PCA of a subset of Gill’s measurements (all the teeth from one of the fissures), and obtained the plot shown here.

tooth PCA
Cadima and Jolliffe’s PCA of the Kuehneotherium dataset. The original 88 data points have been projected onto the (PC1, PC2) plane, along with vectors corresponding to the nine original measurements. Image courtesy of Jorge Cadima.

They remark: “The first two PCs account for 78.8% and 16.7%, respectively, of the total variation in the dataset, so the two-dimensional scatter-plot of the 88 teeth given by [the figure] is a very good approximation to the original scatter-plot in nine-dimensional space.” Furthermore “All of the loadings in the first PC have the same sign, so it is a weighted average of all variables, representing ‘overall size’. In [the figure], large teeth are on the left and small teeth on the right. The second PC has negative loadings for the three length variables and positive loadings for the other six variables, representing an aspect of the ‘shape’ of teeth. Fossils near the top of [the figure] have smaller lengths, relative to their heights and widths, than those towards the bottom.”

They conclude: “The relatively compact cluster of points in the bottom half of [the figure] is thought to correspond to a species of Kuehneotherium, while the broader group at the top cannot be assigned to Kuehneotherium, but to some related, but as yet unidentified, animal.”

When the measurement space has very high dimension

In applications to genetics or to image analysis (where each pixel is a “measurement”) the number $n$ of measurements is often enormous compared to the number $M$ of samples. Following the PCA algorithm sketched above would mean dealing with an intractably large matrix.

Now this problem is only potentially $n$-dimensional, because all we have to work with are data from $M$ points. One way to conceptualize it is to think of each of the aggregate measurement results $x_i$ as a vector ${\bf x}_i$ with components $x_i(s_1), \dots x_i(s_M)$. Now we have $n$ vectors in an $M$-dimensional space, with $n>M$, so there is a set of (at most) $M$ linearly independent vectors which give all the ${\bf x}_i$ by linear combination. Suppose for simplicity there are exactly $M$, say ${\bf z}_1,\dots, {\bf z}_M$. Then our problem of finding a variance-maximizing linear combination $y=a_1x_1 + \dots +a_nx_n$, i.e. ${\bf y} = a_1{\bf x}_1 + \dots +a_n{\bf x}_n$ reduces to finding the much smaller linear combination ${\bf y} = c_1{\bf z}_1 + \dots +c_M{\bf z}_M$ maximizing variance. But how do we identify ${\bf z}_1,\dots, {\bf z}_M$?

Again, linear algebra has just the tools we need. With the same notation as above, let us compute for every $i,j$ between 1 and $M$ the quantity $ \frac{1}{n}(x_1(s_i)x_1(s_j)+x_2(s_i)x_2(s_j) + \cdots + x_n(s_i)x_n(s_j))$. These numbers fit in a new, $M\times M$, matrix we can call ${\bf Voc}$ to distinguish it from ${\bf Cov}$ defined above. We calculate the eigenvectors ${\bf b}_1, \dots, {\bf b}_M$ and corresponding eigenvalues $\lambda_1, \dots, \lambda_M$ for ${\bf Voc}$ as we did before.

There is a useful relation between ${\bf Cov}$ and ${\bf Voc}$. Suppose we call ${\bf X}$ the matrix of all the measurements of all the samples, so ${\bf X}_{ij} = x_i(s_j)$, and its transpose ${\bf X}^{\top}_{ij} = {\bf X}_{ji} = x_j(s_i)$. The matrix ${\bf X}$ has $n$ rows and $M$ columns, and vice-versa for ${\bf X}^{\top}$. The matrix product ${\bf X}\cdot {\bf X}^{\top}$ is defined by $[{\bf X}\cdot {\bf X}^{\top}]_{ij} = {\bf X}_{i1} {\bf X}^{\top}_{1j} + \cdots + {\bf X}_{iM} {\bf X}^{\top}_{Mj} = x_i(s_1)x_j(s_1) + \cdots + x_i(s_M)x_j(s_M) = M\,{\rm Cov}(x_i, x_j)$, whereas $[{\bf X}^{\top}\cdot {\bf X}]_{ij} = {\bf X}^{\top}_{i1} {\bf X}_{1j} + \cdots + {\bf X}^{\top}_{in} {\bf X}_{nj} = x_1(s_i)x_1(s_j) + \cdots + x_n(s_i)x_n(s_j) = n\,{\rm Voc}(s_i,s_j)$. So ${\bf X}\cdot {\bf X}^{\top} = M\,{\bf Cov}$, $M$ times the intractable matrix we are interested in, and ${\bf X}^{\top}\cdot {\bf X} = n\,{\bf Voc}$, $n$ times its smaller cousin. Now a little more linear algebra:

  • ${\bf X}^{\top}\cdot {\bf X}$ and ${\bf X}\cdot {\bf X}^{\top}$ have the same non-zero eigenvalues, $\lambda_1, \dots, \lambda_M$.
  • For each eigenvector ${\bf b}_i$ of ${\bf X}^{\top}\cdot {\bf X}$, the vector ${\bf X}\cdot {\bf b}_i$ is an eigenvector for ${\bf X}\cdot {\bf X}^{\top}$ with the same eigenvalue $\lambda_i$.

The proofs of these a priori unlikely statements are immediate: If there is a non-zero vector ${\bf b}$ with ${\bf X}^{\top}\cdot {\bf X}\cdot {\bf b}=\lambda{\bf b}$, then ${\bf X}\cdot {\bf X}^{\top}\cdot {\bf X}\cdot {\bf b} =\lambda{\bf X}\cdot {\bf b}$, so ${\bf X}\cdot {\bf b}$ is an eigenvector for ${\bf X}\cdot {\bf X}^{\top}$ with eigenvalue $\lambda$.

Now if ${\bf b}$ is an eigenvector of the matrix ${\bf A}$ with eigenvalue $\lambda$, then ${\bf b}$ will also be an eigenvector of $k$ times ${\bf A}$, with eigenvalue $k\lambda$ (since $k{\bf A}\cdot {\bf b} = k({\bf A}\cdot {\bf b}) = k(\lambda {\bf b})=(k\lambda) {\bf b} $). So once we know the eigenvectors and eigenvalues for the smaller matrix ${\bf Voc}$, we can retrieve the eigenvectors and eigenvalues for ${\bf Cov}$, and complete our PCA.
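The whole trick fits in a few lines of NumPy; the sizes below are illustrative ($n = 1000$ measurements, $M = 20$ samples), and the small matrix formed here is $n$ times the matrix called ${\bf Voc}$ above:

```python
import numpy as np

# When n >> M, eigendecompose the small M x M matrix X^T X rather than
# the huge n x n matrix X X^T. Sizes here are illustrative.
rng = np.random.default_rng(5)
n, M = 1000, 20
X = rng.normal(size=(n, M))
X -= X.mean(axis=1, keepdims=True)       # center each "measurement"

small = X.T @ X                          # M x M, proportional to Voc
vals, B = np.linalg.eigh(small)
b, lam = B[:, -1], vals[-1]              # top eigenpair of the small matrix

# Each eigenvector b of X^T X lifts to an eigenvector X b of X X^T,
# with the same eigenvalue: (X X^T)(X b) = X (X^T X b) = lam (X b).
a = X @ b
assert np.allclose((X @ X.T) @ a, lam * a)
```

(Forming `X @ X.T` at the end is only for the check; in a real $n \gg M$ application one would never build the large matrix at all.)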

The switch to the transpose makes examples like the next one possible.

Example 3. Parsing Native American genes in Polynesian ancestry

“Native American gene flow into Polynesia predating Easter Island settlement” was published in Nature, July 8, 2020. The authors are a team of 32 led by Alexander G. Ioannidis (Stanford), Javier Blanco-Portillo and Andrés Moreno-Estrada (both at LANGEBIO, Irapuato, Mexico). They begin: “The possibility of voyaging contact between prehistoric Polynesian and Native American populations has long intrigued researchers.” They mention Thor Heyerdahl’s controversial suggestion “that prehistoric South American populations had an important role in the settlement of east Polynesia and particularly of Easter Island (Rapa Nui),” and that the matter is still “hotly contested.” No longer, presumably: “We find conclusive evidence for prehistoric contact of Polynesian individuals with Native American individuals (around AD 1200)… Our analyses suggest strongly that a single contact event occurred in eastern Polynesia, before the settlement of Rapa Nui, between Polynesian individuals and a Native American group most closely related to the indigenous inhabitants of present-day Colombia.”

Part of their evidence is a detailed analysis of the ancestry of present-day inhabitants of Rapa Nui. They split them into two groups: those who have European ancestry and those who do not. They present a PCA comparing these two groups (along with other Pacific islanders) with a spectrum of present-day Native American inhabitants of Central and South America. It shows that the Rapa Nui inhabitants with some European ancestry group with populations in the south of Chile (Mapuche, Pehuenche, Huilliche, Chilote) while those with no European ancestry group with the Zenu, a population in Colombia, at the opposite end of South America.

ioannides PCA
Part of Supplemental Figure 14 from Nature 583, 572-577 (2020), used with permission. For legibility I have cut the PC1 axis in two and have suppressed the top of the range of PC2. The color key is shown separately below.
color key
Color key for the PCA by Ioannidis et al.

The caption in Nature gives more detail:  “Our new weighted ancestry-specific SVD-completed principal component analysis (PCA) applied to the Native American component from Pacific islanders and reference individuals from the Americas. The Native American ancestries of reference individuals are plotted with open circles. The Native American ancestries of Pacific islanders are represented as solid points. … In addition, labels are plotted for the location of the allele frequency vector created from aggregating all Native American ancestry haplotypes in each island population. The Rapanui are split into two populations, those without European ancestry tracts (dark green) and those with high European and Native American ancestry (blue). The first principal component axis separates the reference Native Americans on a north-south axis, while the second principal component separates the Yamana of southern Patagonia. Pacific islanders’ Native American ancestry is seen to cluster with northern South American references with the exception of the Rapanui with high European ancestry, who cluster with the southern references (Chilean).”
Some background details about this research appear in a Stanford press release.

Conversations with Folkert Tangerman were very helpful in the preparation of this column.

Updated 31 August in response to a comment from a reader.

Completing the Square

The prehistory of the quadratic formula

Completing the square is the essential ingredient in the generation of our handy quadratic formula. And for all the years before the formula existed, that was how quadratic equations were solved…

Tony Phillips
Stony Brook University

Quadratic equations have been considered and solved since Old Babylonian times (c. 1800 BC), but the quadratic formula students memorize today is an 18th century AD development. What did people do in the meantime?

The difficulty with the general quadratic equation ($ax^2+bx+c=0$ as we write it today) is that, unlike a linear equation, it cannot be solved by arithmetic manipulation of the terms themselves: a creative intervention, the addition and subtraction of a new term, is required to change the form of the equation into one which is arithmetically solvable. We call this maneuver completing the square.

Here is how students learn about the maneuver today:

The coefficient $a$ is non-zero (or else the equation is linear) so you can divide through by $a$, giving $x^2 +\frac{b}{a}x + \frac{c}{a} =0$. Now comes the maneuver: you add and subtract $\frac{b^2}{4a^2}$, giving $x^2+\frac{b}{a}x +\frac{b^2}{4a^2} -\frac{b^2}{4a^2}+ \frac{c}{a} =0.$ Now the first three terms are a perfect square: $x^2+\frac{b}{a}x +\frac{b^2}{4a^2} = (x+\frac{b}{2a})^2$: you have completed the first two terms to a square. Instead of having both the unknown $x$ and its square $x^2$ in the equation, you have only the second power $(x+\frac{b}{2a})^2$ of a new unknown. Now arithmetic can do the rest. The equation gets rearranged as $(x+\frac{b}{2a})^2 = \frac{b^2}{4a^2} - \frac{c}{a}$, so $x+\frac{b}{2a} = \pm\sqrt{\frac{b^2}{4a^2} - \frac{c}{a}} = \pm\sqrt{\frac{b^2-4ac}{4a^2}}=\pm\frac{\sqrt{b^2-4ac}}{2a}$, giving the solution $x=-\frac{b}{2a}\pm\frac{\sqrt{b^2-4ac}}{2a} = \frac{-b\pm\sqrt{b^2-4ac}}{2a}$. This expression of the solution to the equation as an algebraic function of the coefficients is known as the quadratic formula.
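The derivation translates directly into code. A minimal Python sketch, assuming $a \neq 0$ and real roots, and using the made-up example $x^2 + 2x - 8 = 0$:

```python
import math

def solve_quadratic(a, b, c):
    """Solve a*x^2 + b*x + c = 0 by completing the square.

    Assumes a != 0 and a non-negative discriminant (real roots).
    """
    # Completing the square gives (x + b/2a)^2 = (b^2 - 4ac) / (4a^2),
    # i.e. x = (-b +/- sqrt(b^2 - 4ac)) / (2a).
    root = math.sqrt(b * b - 4 * a * c)
    return (-b + root) / (2 * a), (-b - root) / (2 * a)

print(solve_quadratic(1, 2, -8))   # x^2 + 2x - 8 = 0  ->  (2.0, -4.0)
```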

So completing the square is the essential ingredient in the generation of our handy quadratic formula. And for all the years before the formula existed, that was how quadratic equations were solved. The maneuver was originally explicitly geometric: numbers were identified with areas, and a certain L-shaped polygon was completed to a geometric square. With the development of algebra the job could be done by manipulation of symbols, but for about a millennium the symbolic solution was always buttressed by a geometric argument, as if the algebra alone could not be trusted. Meanwhile our conception of the numbers that the symbols represented evolved, but very slowly. Negative solutions were considered “false” even by Descartes (1596-1650), but once they were accepted, the identification of equations with actual geometric problems eventually evaporated: in Euler’s Algebra (1770) it is gone without a trace.

In this column I will explore some episodes in this history, starting with the Old Babylonians and ending with an occurrence of square-completion in an important calculation in 20th century physics.

  • In Old Babylonian Mathematics
  • The connection to Plimpton 322
  • In Islamic mathematics
  • Aryabhata and Brahmagupta
  • In Fibonacci’s Liber Abaci
  • In Simon Stevin’s L’Arithmétique
  • Descartes’ La Géometrie
  • In Maria Agnesi’s Instituzioni Analitiche …
  • In Leonhard Euler’s Vollständige Anleitung zur Algebra
  • In 20th century physics

    In Old Babylonian Mathematics

    Obverse of tablet YBC 6967
    Reverse of tablet YBC 6967

    The Yale Babylonian Collection’s tablet YBC 6967, as transcribed in Neugebauer and Sachs, Mathematical Cuneiform Texts, American Oriental Society, New Haven, 1986. Size 4.5 $\times$ 6.5cm.

    The Old Babylonian tablet YBC 6967 (about 1900 BC) contains a problem and its solution. Here is a line-by-line literal translation from Jöran Friberg, A Remarkable Collection of Babylonian Mathematical Texts, (Springer, New York, 2007).

    The igibi over the igi 7 is beyond.
    The igi and the igibi are what?
    7 the igibi over the igi is beyond
    to two break, then 3 30.
    3 30 with 3 30 let them eat each other, then 12 15
    To 12 15 that came up for you
    1, the field, add, then 1 12 15.
    The equalside of 1 12 15 is what? 8 30.
    8 30 and 8 30, its equal, lay down, then
    3 30, the holder,
    from one tear out,
    to one add.
    one is 12, the second 5.
    12 is the igibi, 5 the igi.

    The Old Babylonians used a base-60 floating-point notation for numbers, so that the symbol corresponding to 1 can represent for example 60 or 1 or 1/60. In the context of YBC 6967, the reciprocal numbers, the igi and the igibi, have product 1 0. Their difference is given as 7.

    Robson's YBC 6967 solution diagram

    Here is a diagram of the solution to the YBC 6967 problem, adapted from Eleanor Robson’s “Words and Pictures: New Light on Plimpton 322” (MAA Monthly, February 2002, 105-120). Robson uses a semi-colon to separate the whole and the fractional part of a number, but this is a modern insertion for our convenience. The two unknown reciprocals are conceptualized as the sides of a rectangle of area (yellow) 1 0 [or 60 in decimal notation]. A rectangle with one side 3;30 [$= 3\frac{1}{2}$] is moved from the side of the figure to the top, creating an L-shaped figure of area 1 0 which can be completed to a square by adding a small square of area 3;30 $\times$ 3;30 = 12;15 [$= 12\frac{1}{4}$]. The area of the large square is 1 0 + 12;15 = 1 12;15 [$ = 72\frac{1}{4}$] with square root 8;30 [$=8\frac{1}{2}$]. It follows that our unknown reciprocals must be 8;30 + 3;30 = 12 and 8;30 − 3;30 = 5 respectively.

    In modern notation, the YBC 6967 problem would be $xy = 60, x-y = 7$, or $x^2-7x-60=0$. In this case the term to be added in completing the square is $\frac{b^2}{4a^2}=\frac{49}{4}=12\frac{1}{4}$, corresponding exactly to the area of the small square in the diagram.
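The tablet's recipe can be replayed step by step; a short Python sketch in decimal, with the quoted phrases taken from the literal translation above and the sexagesimal readings noted in comments:

```python
import math

# YBC 6967 in modern terms: find x and y with x * y = 60 and x - y = 7.
area, diff = 60, 7

half = diff / 2               # "to two break":             3;30  = 3.5
square = half * half          # "let them eat each other": 12;15  = 12.25
big = area + square           # "1, the field, add":     1 12;15  = 72.25
side = math.sqrt(big)         # "the equalside":            8;30  = 8.5

x = side + half               # "to one add":        12
y = side - half               # "from one tear out":  5
assert x * y == area and x - y == diff
print(x, y)                   # -> 12.0 5.0
```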


    This tablet, and the several related ones from the same period that exist in various collections (they are cataloged in Friberg’s book mentioned above), are significant because they hold a piece of real mathematics: a calculation that goes well beyond tallying to a creative attack on a problem. It should also be noted that none of these tablets contains a figure, even though Old Babylonian tablets often have diagrams. It is as if those mathematicians thought of “breaking,” “laying down” and “tearing out” as purely abstract operations on quantities, despite the geometrical/physical language and the clear (to us) geometrical conceptualization.

    The connection to Plimpton 322

    Plimpton 322
    The Old Babylonian tablet Plimpton 322 (approx. $13 \times 9$ cm) is in the Columbia University Library. Image courtesy of Bill Casselman (larger image). For scholarly discussions of its content see Friberg’s book or (available online) Eleanor Robson’s article referenced above.

    There have been many attempts at understanding why Plimpton 322 was made and also how that particular set of numbers was generated. It has been described as a list of Pythagorean triples and as an exact sexagesimal trigonometry table. An interpretation mentioned by R. Creighton Buck and attributed to D. L. Voils (in a paper that was never published) is “that the Plimpton tablet has nothing to do with Pythagorean triplets or trigonometry but, instead, is a pedagogical tool intended to help a mathematics teacher of the period make up a large number of igi-igibi quadratic equation exercises having known solutions and intermediate solution steps that are easily checked” (igi-igibi problems involve a number and its reciprocal; the one on YBC 6967 is exactly of this type).

    In the solution of the problem of YBC 6967, we squared 3;30 to get 12;15, added 1 to get $1\, 12;15$ and took the square root to get 8;30. Then the solutions were 8;30 + 3;30 and 8;30 $-$ 3;30. So to set up an igi-igibi problem which will work out evenly we need a square like 12;15 which, when a 1 is placed in the next base-60 place to the left, becomes another square.

    Now the first column of Plimpton 322 contains exactly numbers of this type. For example, in row 5, the first undamaged row, the column I entry is $48\, 54\, 01\, 40$ [=10562500] with square root $54\, 10$ [=3250]. Adding 1 gives $1\, 48\, 54\, 01\, 40$ [=23522500] with square root $80\, 50$ [=4850]. The corresponding igi-igibi problem would ask for two reciprocals differing by two times $54\, 10$, i.e. $1\, 48\, 20$; the answer would be $80\, 50 + 54\, 10 = 2\, 15\, 00$ and $80\, 50 - 54\, 10 = 26\, 40$.

    Unfortunately, neither the problem on YBC 6967 nor any of the five igi-igibi problems recorded by Friberg from tablet MS 3971 corresponds to parameters on Plimpton 322. It is possible that lines on the bottom and reverse of the tablet mean that it was supposed to be extended by an additional 20 or so rows, where those missing parameters would appear. In fact, none of the proposed explanations is completely satisfactory. As Robson remarked, “The Mystery of the Cuneiform Tablet has not yet been fully solved.”

    In Islamic Mathematics

    Solving quadratic equations by completing the square was treated by Diophantus (c.200-c.285 AD) in his Arithmetica, but the explanations are in the six lost books of that work. Here we’ll look at the topic as covered by Muhammad ibn Musa, born in Khwarazm (Khiva in present-day Uzbekistan) and known as al-Khwarizmi, in his Compendium on Calculation by Completion and Reduction dating to c. 820 AD. (I’m using the translation published by Frederic Rosen in 1831). Negative numbers were still unavailable, so al-Khwarizmi, to solve a general quadratic, has to consider three cases. In each case he supposes a preliminary division has been done so the coefficient of the squares is equal to one (“Whenever you meet with a multiple or sub-multiple of a square, reduce it to the entire square”).

    1. “roots and squares are equal to numbers” [$x^2 + bx = a$]
    2. “squares and numbers are equal to roots” [$x^2 +a = bx$]
    3. “roots and numbers are equal to squares” [$x^2=bx+a$]

    Case 1. al-Khwarizmi works out a specific numerical example, which can serve as a template for any other equation of this form: “what must be the square which, when increased by ten of its roots, amounts to thirty-nine.”

    “The solution is this: you halve the number of the roots, which in the present case equals five. This you multiply by itself; the product is twenty-five. Add this to thirty-nine, the sum is sixty-four. Now take the root of this, which is eight, and subtract from it half the number of the roots, which is five; the remainder is three. This is the root of the square which you sought for; the square itself is nine.”
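    The verbal recipe translates line by line into code; here is a minimal sketch (the function name is mine):

```python
import math

def case1_root(b, c):
    """Solve "roots and squares are equal to numbers": x^2 + b*x = c."""
    half = b / 2                     # halve the number of the roots
    total = half * half + c          # square it and add the numbers
    return math.sqrt(total) - half   # take the root, subtract the half

# "ten of its roots ... thirty-nine": the root is 3, the square 9
x = case1_root(10, 39)
assert x == 3.0 and x * x == 9.0
```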

    Note that this is exactly the Old Babylonian recipe, updated from $x(x+7)=60$ to $x^2 +10x = 39$, and that the figure Eleanor Robson uses for her explanation is essentially identical to the one al-Khwarizmi gives for his second demonstration, reproduced here:

    al-Khwarizmi's first figure

    “We proceed from the quadrate AB, which represents the square. It is our next business to add to it the ten roots of the same. We halve for this purpose the ten, so it becomes five, and construct two quadrangles on two sides of the quadrate AB, namely, G and D, the length of each of them being five, as the moiety of the ten roots, whilst the breadth of each is equal to a side of the quadrate AB. Then a quadrate remains opposite the corner of the quadrate AB. This is equal to five multiplied by five: this five being half of the number of roots which we have added to each side of the first quadrate. Thus we know that the first quadrate, which is the square, and the two quadrangles on its sides, which are the ten roots, make together thirty-nine. In order to complete the great quadrate, there wants only a square of five multiplied by five, or twenty-five. This we add to the thirty-nine, in order to complete the great square SH. The sum is sixty-four. We extract its root, eight, which is one of the sides of the great quadrangle. By subtracting from this the same quantity which we have before added, namely five, we obtain three as the remainder. This is the side of the quadrangle AB, which represents the square; it is the root of this square, and the square itself is nine.”

    Case 2. “for instance, ‘a square and twenty-one in numbers are equal to ten roots of the same square.’”

    Solution: “Halve the number of the roots; the moiety is five. Multiply this by itself; the product is twenty-five. Subtract from this the twenty-one which are connected with the square; the remainder is four. Extract its root; it is two. Subtract this from the moiety of the roots, which is five; the remainder is three. This is the root of the square you required, and the square is nine.”

    Here is a summary of al-Khwarizmi’s demonstration. The last of the four figures appears (minus the modern embellishments) in Rosen, p. 18.

    al-Khwarizmi's demonstration

    The problem set up geometrically. I have labeled the unknown root $x$ for modern convenience. The square ABCD has area $x^2$, the rectangle CHND has area $10x$, the rectangle AHNB has area 21, so $x^2+21=10x$.

    al-Khwarizmi's demonstration part 2

    The side CH is divided in half at G, so the segment AG measures $5-x$. The segment TG parallel to DC is extended by GK with length also $5-x$. Al-Khwarizmi says this is done “in order to complete the square.”

    al-Khwarizmi's demonstration part 3

    The segment TK then measures 5, so the figure KMNT, obtained by drawing KM parallel to GH and adding MH, is a square with area 25.

    al-Khwarizmi's demonstration part 4

    Measuring off KL equal to KG, and drawing LR parallel to KG leads to a square KLRG. Since HR has length $5-(5-x)=x$ the rectangles LMHR and AGTB have the same area, so the area of the region formed by adding LMHR to GHNT is the same as that of the rectangle formed by adding AGTB to GHNT, i.e. 21. And since that region together with the square KLRG makes up the square KMNT of area 25, it follows that the area of KLRG is $25-21=4$, and that its side-length $5-x$ is equal to 2. Hence $x=3$, and the sought-for square is 9.


    Al-Khwarizmi remarks that if you add that 2 to the length of CG then “the sum is seven, represented by the line CR, which is the root to a larger square,” and that this square is also a solution to the problem.

    Case 3. Example: “Three roots and four simple numbers are equal to a square.”

    Solution: “Halve the roots; the moiety is one and a half. Multiply this by itself; the product is two and a quarter. Add this to the four; the sum is six and a quarter. Extract its root; it is two and a half. Add this to the moiety of the roots, which was one and a half; the sum is four. This is the root of the square, and the square is sixteen.”
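    The recipes for Cases 2 and 3 check out the same way (function names are mine); Case 2 has two positive roots, Case 3 a single positive one:

```python
import math

def case2_roots(b, c):
    """ "squares and numbers are equal to roots": x^2 + c = b*x."""
    half = b / 2
    r = math.sqrt(half * half - c)   # domain error if c > (b/2)^2: no root
    return half - r, half + r

def case3_root(b, c):
    """ "roots and numbers are equal to squares": x^2 = b*x + c."""
    half = b / 2
    return half + math.sqrt(half * half + c)

assert case2_roots(10, 21) == (3.0, 7.0)   # the roots 3 and 7 of x^2 + 21 = 10x
assert case3_root(3, 4) == 4.0             # the root 4 of x^2 = 3x + 4
```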

    As above, we summarize al-Khwarizmi’s demonstration. The last figure minus decoration appears on Rosen, p. 20.

    al-Khwarizmi's demonstration with total square'

    We represent the unknown square as ABDC, with side-length $x$. We cut off the rectangle HRDC with side-lengths 3 and $x$. Since $x^2 = 3x + 4$ the remaining rectangle ABRH has area 4.

    al-Khwarizmi's demonstration with total square part 2

    Halve the side HC at the point G, and construct the square HKTG. Since HG has length $1\frac{1}{2}$, the square HKTG has area $2\frac{1}{4}$.

    al-Khwarizmi's demonstration with total square part 3

    Extend CT by a segment TL equal to AH. Then the segments GL and AG have the same length, so drawing LM parallel to AG gives a square AGLM. Now TL = AH = MN and NL = HG = GC = BM, so the rectangles MBRN and KNLT have equal area, and so the region formed by AMNH and KNLT has the same area as ABRH, namely 4. It follows that the square AMLG has area $4+2\frac{1}{4}=6\frac{1}{4}$ and consequently side-length AG = $2\frac{1}{2}$. Since GC = $1\frac{1}{2}$, it follows that $x = 2\frac{1}{2} + 1\frac{1}{2} = 4$.

    Aryabhata and Brahmagupta

    The study of quadratic equations in India dates back to Aryabhata (476-550) and Brahmagupta (598-c.665). Aryabhata’s work on the topic was referenced at the time but is now lost; Brahmagupta’s has been preserved. He gives a solution algorithm in words (in verse!) which turns out to be equivalent to part of the quadratic formula—it only gives the root involving $+$ the radical. Here’s Brahmagupta’s rule with a translation, from Brahmagupta as an Algebraist (a chapter of Brahmasphutasiddhanta, Vol. 1):

    Brahmagupta's quadratic formula figure

    “The quadratic: the absolute quantities multiplied by four times the coefficient of the square of the unknown are increased by the square of the coefficient of the middle (i.e. unknown); the square-root of the result being diminished by the coefficient of the middle and divided by twice the coefficient of the square of the unknown, is (the value of) the middle.”

    There is only the rule, and no indication of how it was derived.
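    In modern symbols, for the equation $ax^2 + bx = c$ (with $b$ the “coefficient of the middle” and $c$ the “absolute quantities”), Brahmagupta’s rule reads $$x = \frac{\sqrt{4ac + b^2} - b}{2a},$$ which is the quadratic formula restricted to the root involving $+$ the radical.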

    In Fibonacci’s Liber Abaci

    The third section of chapter 15 of the Liber Abaci, written by Fibonacci (Leonardo of Pisa, c. 1170-c. 1245) in 1202, concerns “Problems of Algebra and Almuchabala,” referring directly to the Arabic words for “Completion and Reduction” in the title of al-Khwarizmi’s compendium. I am using the translation of Liber Abaci by L. E. Sigler, Springer, 2002. In that section, a short introduction presenting the methods is followed by a collection of over a hundred problems.

    Fibonacci follows al-Khwarizmi in distinguishing three “modes” of compound problems involving a square (“census”) and its roots (his second mode corresponds to al-Khwarizmi’s case 3 and vice-versa).

    • “the first mode is when census plus roots are equal to a number.” The example Fibonacci gives, “the census plus ten roots is equal to 39” is inherited directly from al-Khwarizmi. But the demonstration he gives is different: he starts with a square $abcd$ of side length greater than 5, marks off lengths of 5 in two directions from one corner and thus partitions $abcd$ into a square of area 25, two rectangles and another square, which he identifies with the unknown census. Then the two rectangles add up to 10 times the root, but since “the census plus 10 roots is equal to 39 denari” the area of $abcd$ must be $25+39=64$, so its side length is 8, our root is $8-5=3$ and the census is 9. The figure and the calculation are essentially the same as al-Khwarizmi’s, but the “completion” narrative has disappeared.
    • “let the census be equal to 10 roots plus 39 denari.” Here the example is new. (Fibonacci’s root-finding figure; from Sigler, p. 557.) Fibonacci starts by representing the census as a square $abcd$ of side length larger than 10. He draws a segment $fe$ parallel to $ab$ so that the segments $fd$ and $ec$ each have length 10 and assigns a midpoint $g$ to $ec$. So the ten roots will be equal to the area of $fecd$, and the area of the rectangle $abfe$, which is also $fe\times eb$, is 39. But $fe=bc$, therefore $be \times bc = 39$. “If to this is added the square of the line segment $eg$, then 64 results for the square of the line segment $bg$.” [$bg^2 = be^2 + 2be\times eg +eg^2$ $= be\times(be + 2eg)+ eg^2 = be\times(be+eg+gc)+eg^2$ $= be\times bc +eg^2 = 39+25=64$]. So $bg = 8$ and $bc = bg+gc = 8+5=13$ is the root of the sought census, and the census is 169.
    • “when it will occur that the census plus a number will equal a number of roots, then you know that you can operate whenever the number is equal to or less than the square of half the number of roots.” Here Fibonacci goes beyond al-Khwarizmi in remarking that (in modern notation) $x^2+c=bx$ has no solution if $c>(b/2)^2$. The example he gives is “let the census plus 40 equal 14 roots.” (Fibonacci’s root-finding figures; from Sigler, pp. 557, 558.) Fibonacci gives two constructions for the two positive roots. In each case he starts with a segment $ab$ of length 14, with center at $g$. He chooses another point $d$ dividing $ab$ into two unequal parts.

      For the first root he constructs a square $dbez$ over $db$—this will be the census— and extends $ez$ to $iz$ matching $ab$. The rectangle $abzi$ now represents 14 roots, so that $adei$ measures (in modern notation) $14x – x^2 = 40$. Now $dg = 7-x$ so $dg^2 = 49-14x+x^2$ and $dg^2 + 40 = 49$. It follows that $dg=3$ and the root $x$ is $db=gb-gd=7-3=4$.

      For the second root the square $adlk$ is over $ad$; the segment $kl$ is extended to $km$ matching $ab$, and $lmbd$ measures $14x – x^2 = 40$. Now $gd = x-7$ so again $gd^2 = x^2-14x+49 = 9$ and $gd=3$. This time the root is $x=ag+gd = 10$.
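    Both constructions reduce to the same computation on the halved segment; a quick check in Python (names mine):

```python
import math

# "the census plus 40 equals 14 roots": x^2 + 40 = 14x
g = 14 / 2                      # g is the midpoint of ab, so ag = gb = 7
dg = math.sqrt(g * g - 40)      # in both figures dg^2 + 40 = 49, so dg = 3
first, second = g - dg, g + dg  # db = 7 - 3 and ag + gd = 7 + 3

assert (first, second) == (4.0, 10.0)
assert all(x * x + 40 == 14 * x for x in (first, second))
```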


    In Simon Stevin’s L’Arithmétique

    Stevin’s L’Arithmétique was published in 1594 (excerpts here are from the 1625 edition, supervised by Albert Girard). Among other things it treated quadratic equations. Stevin, following Bombelli, used a notation for powers that turns out to be intermediate between cossic notation (which used different symbols for the unknown, its square, its cube etc.) and the exponents that started with Descartes. For Stevin, the unknown was represented by a 1 in a circle, its square by a 2 in a circle, etc., and the numerical unit by a 0 in a circle. Here let us write ${\bf 1}$ for 1 in a circle, etc. He also had an idiosyncratic way of expressing the solution to an equation in one unknown, using the “fourth proportional.” For example, his setting of the problem of finding $x$ such that $x^2=4x+12$ could be schematized as $${\bf 2} : 4\,{\bf 1} + 12\,{\bf 0} : : {\bf 1} : ?$$ (he would actually write: given the three terms for the problem — the first ${\bf 2}$, the second $4\,{\bf 1} + 12\,{\bf 0}$, the third ${\bf 1}$, find the fourth proportional). As Girard explains, the “proportion” is equality. So the problem should be read as: “if ${\bf 2} = 4\,{\bf 1} + 12\,{\bf 0}$, then ${\bf 1} =$ what?”

    Some writers have claimed “[Stevin’s Arithmetic] brought to the western world for the first time a general solution of the quadratic equation …” but in fact only this step towards modern notation, together with his use of minus signs in his equations and his admission of irrational numbers as coefficients, separates him from Fibonacci. Notably, he too considers three types of quadratic equations separately [his examples, in post-Descartes notation]

    1. “second term ${\bf 1} + {\bf 0}$” [$x^2=4x+12$]
    2. “second term $-{\bf 1} + {\bf 0}$” [$x^2 = -6x + 16$]
    3. “second term ${\bf 1} - {\bf 0}$” [$x^2=6x-5$].

    and does not entertain negative solutions, so for the first equation he gives only $6$ and not $-2$, for the second he gives $2$ and not $-8$; for the third he gives the two (positive) solutions $5$ and $1$.
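    Stevin’s three types differ only in the signs of the “second term”; here is a sketch that computes both roots of each of his examples and keeps, as he does, only the positive ones (the helper function is my own, not Stevin’s notation):

```python
import math

def roots(p, q):
    # both roots of x^2 = p*x + q
    disc = math.sqrt(p * p / 4 + q)
    return p / 2 + disc, p / 2 - disc

examples = [(4, 12), (-6, 16), (6, -5)]   # his three types
accepted = [[r for r in roots(p, q) if r > 0] for p, q in examples]
assert accepted == [[6.0], [2.0], [5.0, 1.0]]   # 6 (not -2); 2 (not -8); 5 and 1
```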


    Stevin gives a geometric justification for his solution of each type of equation. For example for the first type his solution is:

    Half of the 4 (from the $4\,{\bf 1}$) is 2
    Its square 4
    To the same added the given ${\bf 0}$, which is 12
    Gives the sum 16
    Its square root 4
    To the same added the 2 from the first line 6

    I say that 6 is the sought-for fourth proportional term.

    Here is his geometrical proof:

    Stevin's root-finding figure

    Figure from Stevin, L’Arithmétique p. 267.

    With modern notation: Stevin starts with a square ABCD representing $x^2$, so its side BC has length $x$. He draws EF parallel to AD, with AE $= 4$. So the rectangle ADFE measures $4x$ and then the rectangle EFCB has area $x^2-4x = 12$. Pick G the midpoint of AE, and draw the square GLKB.

    The rest of the argument as it appears in L’Arithmétique:

    Half of AE $=4$ which is GE 2
    Its square GEHI 4
    Add to the same the given ${\bf 0}$, i.e. EFCB 12
    Gives sum for the gnomon HIGBCF
    or for the square GBKL of same area 16
    Its root BK 4
    Add to the same GE$=2$ or instead KC = GE, makes for BC 6
    Q.E.D. [My translations -TP]

    Notice that the argument and even the diagram are inherited directly from al-Khwarizmi.

    Descartes’ La Géométrie

    René Descartes (1596-1650) published La Géométrie (1885 edition in modern French) in 1637. One of the first topics he covers is the solution of quadratic equations. Besides the actual geometry in his book, two notational and conceptual features started the modern era in mathematics. The first was his use of exponents for powers. This started as a typographical convenience: he still usually wrote $xx$ instead of $x^2$. It turned out to be a short series of steps (but one he could not have imagined) from his $x^3$ to Euler’s $e^x$. The second was his use of $x$ and $y$ as coordinates in the plane. The first would eventually allow the general quadratic equation to be written as $ax^2 + bx +c =0$, and the second would allow the solution to be viewed as the intersection of a parabola with the $x$-axis. But Descartes’ avoidance of negative numbers (the original Cartesian plane was a quadrant) kept him from the full exploitation of his own inventions.

    In particular, he still had to follow al-Khwarizmi and Stevin in distinguishing three forms of quadratic equations. In his notation:

    • $z^2=az + b^2$
    • $y^2=-ay + b^2$
    • $z^2 = az-b^2$.


    His solutions, however, are completely different from the earlier methods. He shows in all three cases how a ruler-and-compass construction can lead from the lengths $a$ and $b$ to a segment whose length is the solution.

    Descartes' root-finding figure

    Cases 1 and 2. “I construct the right triangle NLM with one leg $LM$ equal to $b$, and the other $LN$ is $\frac{1}{2}a$, half of the other known quantity which was multiplied by $z$, which I suppose to be the unknown length; then extending $MN$, the hypotenuse of this triangle, to the point $O$, such that $NO$ is equal to $NL$, the entirety $OM$ is the sought-for length; and it can be expressed in this way: $$z=\frac{1}{2}a + \sqrt{\frac{1}{4}aa + bb}.$$ Whereas if I have $yy=-ay+bb$, with $y$ the unknown quantity, I construct the same right triangle $NLM$, and from the hypotenuse $MN$ I subtract $NP$ equal to $NL$, and the remainder $PM$ is $y$, the sought-for root. So that I have $y=-\frac{1}{2}a + \sqrt{\frac{1}{4}aa + bb}.$”

    Descartes' root-finding part 2

    Case 3. “Finally, if I have $$z^2=az-bb$$ I draw $NL$ equal to $\frac{1}{2}a$ and $LM$ equal to $b$, as before; then, instead of joining the points $MN$, I draw $MQR$ parallel to $LN$, and having drawn a circle with center $N$ which cuts it at the points $Q$ and $R$, the sought-for length is $MQ$, or $MR$; since in this case it can be expressed two ways, to wit, $z=\frac{1}{2}a + \sqrt{\frac{1}{4}aa - bb}$ and $z=\frac{1}{2}a - \sqrt{\frac{1}{4}aa - bb}.$ And if the circle, which has its center at $N$ and passes through $L$ neither cuts nor touches the straight line $MQR$, the equation has no root, so one can state that the construction of the proposed problem is impossible.” [My translation. Checking Descartes’ constructions involves some standard Euclidean geometry.-TP]
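    Since checking the constructions “involves some standard Euclidean geometry,” here is a numeric sketch of that check, placing $L$ at the origin, $N$ at $(0, \frac{1}{2}a)$ and $M$ at $(b, 0)$ (the coordinates are mine, not Descartes’):

```python
import math

a, b = 6.0, 4.0

# Cases 1 and 2: right triangle NLM with legs LM = b and LN = a/2
mn = math.hypot(b, a / 2)          # the hypotenuse MN
om = mn + a / 2                    # extend by NO = NL
pm = mn - a / 2                    # cut off NP = NL
assert om * om == a * om + b * b   # OM solves z^2 = a z + b^2
assert pm * pm == -a * pm + b * b  # PM solves y^2 = -a y + b^2

# Case 3: the circle of radius NL = a/2 about N cuts the line x = b
a, b = 10.0, 4.0
half_chord = math.sqrt((a / 2) ** 2 - b * b)
mq, mr = a / 2 - half_chord, a / 2 + half_chord
assert all(z * z == a * z - b * b for z in (mq, mr))
```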

    In Maria Gaetana Agnesi’s Instituzioni analitiche ad uso della gioventù italiana

    Agnesi’s textbook was published in 1748. She does algebraically complete the square for one case of the quadratic equation:

    “Consider the equation $xx+2ax = bb$; add to one and to the other sides the square of half the coefficient of the second term, i.e. $aa$, and it becomes $xx+2ax+aa = aa + bb$ and, extracting the root, $x=\pm\sqrt{aa+bb} -a$.”

    Specifically (with the letters Descartes used) she takes $a$ as a positive quantity (necessary for the geometric construction) and gives

    • MP and $-$MO as roots of $xx+ax-bb=0$
    • MO and $-$MP as roots of $xx-ax-bb=0$
    • $-$MQ and $-$MR as roots of $xx+ax+bb=0$
    • MQ and MR as roots of $xx-ax+bb=0$.

    Imaginary numbers do not yet have a geometric meaning. “Whenever the equation, to which the particulars of the problems have led us, produces only imaginary values, this means that the problem has no solution, and that it is impossible.” [My translations -TP]

    Note that she uses the $\pm$ notation. She explains: “So the ambiguity of the sign attached to the square root implies two values for the unknown, which can be both positive, both negative, one positive and the other negative, or even both imaginary, depending on the quantities from which they have been computed.”

    But when Agnesi comes to a general treatment she follows Descartes, using identical geometric constructions, with one significant improvement, as implied above: negative roots are calculated and given the same status as positive ones:

    “Negative values, which are still called false, are no less real than the positive, and have the only difference, that if in the solution to the problem the positives are taken from the fixed starting point of the unknown toward one side, the negatives are taken from the same point in the opposite direction.”

    In Leonhard Euler’s Vollständige Anleitung zur Algebra

    Euler’s text was published in St. Petersburg in 1770; John Hewlett’s English translation, Elements of Algebra, appeared in 1840. Chapter VI covers the general quadratic equation: Euler writes it as $ax^2\pm bx\pm c=0$, and then remarks that it can always be put in the form $x^2 + px = q$, where $p$ and $q$ can be positive or negative. He explains how the left-hand side can be made into the square of $(x+\frac{1}{2}p)$ by adding $\frac{1}{4}p^2$ to both sides, leading to $x+\frac{1}{2}p = \sqrt{\frac{1}{4}p^2 + q}$ and, “as every square root may be taken either affirmatively or negatively,” $$x = -\frac{1}{2}p \pm \sqrt{\frac{1}{4}p^2 + q}.$$ In deriving this solution, completely equivalent to the quadratic formula, Euler has completed the square in a purely algebraic manner. The gnomons and Euclidean diagrams that for some 2500 years had seemed necessary to justify the maneuver have evaporated.
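    Euler’s purely algebraic completion is, at last, just the quadratic formula, and a few lines of code capture it in his normalization $x^2 + px = q$ (the function name is mine):

```python
import math

def euler_roots(p, q):
    # x^2 + p x = q, taking the square root both ways
    r = math.sqrt(p * p / 4 + q)
    return -p / 2 + r, -p / 2 - r

# al-Khwarizmi's old example x^2 + 10x = 39 now yields both roots at once
assert euler_roots(10, 39) == (3.0, -13.0)
```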

    In Elements of Algebra Euler is receptive to imaginary numbers:

    §145. “But notwithstanding [their being impossible] these numbers present themselves to the mind; they exist in our imagination, and we still have a sufficient idea of them; since we know that by $\sqrt{-4}$ is meant a number which, multiplied by itself, produces $-4$; for this reason also, nothing prevents us from making use of these imaginary numbers, and employing them in calculation.”

    but does not consider them “possible” as roots of quadratic equations. For example:

    §700. “A very remarkable case sometimes occurs, in which both values of $x$ become imaginary, or impossible; and it is then wholly impossible to assign any value for $x$, that would satisfy the terms of the equation.”

    In 20th century physics

    A Gaussian function $f(x)=C\exp(-\frac{1}{2}ax^2)$ corresponds to a familiar “bell-shaped curve.” In multivariable calculus we learn that $\int_{-\infty}^{\infty}\,f(x)\,dx=C\sqrt{\frac{2\pi}{a}}$; this also holds for $\int_{-\infty}^{\infty}\,f(x-\mu)\,dx$, with $\mu$ any finite number.

    Gaussian curve

    The width of the Gaussian $f(x)=C\exp(-\frac{1}{2}ax^2)$, defined as the distance between the two points where $\,f=\frac{C}{2}$, can be calculated to be $2\sqrt{\frac{2\ln 2}{a}}$. In this image with $C=4$, $a=\frac{1}{2}$, the width is 3.33. Note that the width does not depend on the factor $C$.
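    The half-maximum points can be confirmed directly (values as in the figure):

```python
import math

C, a = 4.0, 0.5
f = lambda x: C * math.exp(-0.5 * a * x * x)

width = 2 * math.sqrt(2 * math.log(2) / a)
assert abs(width - 3.33) < 0.005           # the 3.33 quoted in the caption
assert abs(f(width / 2) - C / 2) < 1e-12   # f is at half its peak there
assert abs(f(-width / 2) - C / 2) < 1e-12
```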

    The Fourier transform of $f$ (taking $C=1$) is the function $${\hat f}(y)= \int_{-\infty}^{\infty}\exp(ixy)\,f(x)~dx = \int_{-\infty}^{\infty}\exp(-\frac{1}{2}ax^2 + ixy)~dx.$$ This integral can be computed by completing the square: write $-\frac{1}{2}ax^2 + ixy$ as $-\frac{1}{2}a(x^2 -\frac{2iyx}{a} -\frac{y^2}{a^2}+\frac{y^2}{a^2})= -\frac{1}{2}a (x-\frac{iy}{a})^2 - \frac{y^2}{2a}$. Then $$ {\hat f}(y)= \exp(-\frac{y^2}{2a})\int_{-\infty}^{\infty}\exp\left (-\frac{1}{2}a(x-\frac{iy}{a})^2\right )\,dx=\sqrt{\frac{2\pi}{a}}\,\exp(-\frac{y^2}{2a})$$ (the remaining integral still equals $\sqrt{2\pi/a}$ even though the variable is shifted by the imaginary constant $\frac{iy}{a}$: the contour can be moved back to the real axis because the integrand is entire and decays rapidly). This means that the Fourier transform of $f$ is again a Gaussian; the parameter $a$ has become $\frac{1}{a}$, so the product of the widths of $\,f$ and its Fourier transform ${\,\hat f}$ is constant. The wider $\,f$ is, the narrower ${\,\hat f}$ must be, and vice-versa. This phenomenon is the mathematical form of the uncertainty principle.
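    The claim that the transform is again a Gaussian with $a$ replaced by $\frac{1}{a}$ is easy to test numerically; here is a sketch using a plain Riemann sum (the truncation limits and step size are ad hoc choices of mine):

```python
import cmath
import math

a = 2.0

def fhat(y, h=0.01, L=10.0):
    # Riemann sum approximating the integral of exp(i x y - a x^2 / 2)
    total = 0.0 + 0.0j
    x = -L
    while x < L:
        total += cmath.exp(1j * x * y - 0.5 * a * x * x) * h
        x += h
    return total

for y in (0.0, 1.0, 2.5):
    exact = math.sqrt(2 * math.pi / a) * math.exp(-y * y / (2 * a))
    assert abs(fhat(y) - exact) < 1e-6
```

The tiny tolerance is no accident: for a rapidly decaying analytic integrand, an equally spaced sum of this kind converges far faster than the usual Riemann-sum error bound suggests.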

