# Alan Turing and the Countability of Computable Numbers

Turing’s methodology was unique: he imagined hypothetical machines that could perform complicated mathematical tasks in a deterministic manner, in the way computers do today. In this way, he inadvertently kickstarted the entire field of modern computer science…

University of Puget Sound

Alan Turing may be the most-influential-yet-least-read figure in 20th Century mathematics. His landmark paper, “On computable numbers, with an application to the Entscheidungsproblem”, was published in the Proceedings of the London Mathematical Society in late 1936. This is the paper that spun computer science off of mathematics, into the distinct field that we know today. Yet, if you ask mathematicians and computer scientists alike if they have read this paper, the answer is frequently “No.” It is so long and amazingly dense that even experts often have a very hard time parsing his arguments. This column aims to rectify this slightly, by explaining one small part of Turing’s paper: the set of computable numbers, and its place within the real numbers.

The Entscheidungsproblem (German for “decision problem”) is the problem of developing an algorithm that can determine whether a statement of first-order logic is universally valid. However, Turing did not know that American mathematician Alonzo Church had already proven that such an algorithm is impossible a few months previously. Even so, Turing’s methodology was unique: he imagined hypothetical machines that could perform complicated mathematical tasks in a deterministic manner, in the way computers do today. In this way, he inadvertently kickstarted the entire field of modern computer science (which has made some modest improvements since 1936). As part of his proof, he developed the concept of computable numbers. Turing used his imaginary machines (that have since come to be called “Turing machines” in his honor) for the process of calculating numbers in this set. It contains any number that can be calculated to within an arbitrary precision in a finite amount of time. This set includes all rational and algebraic numbers (e.g. $$\sqrt{2}$$), as well as many transcendental numbers such as $$\pi$$ and $$e$$. Interestingly, Turing created a very natural extension to Georg Cantor’s set theory, when he proved that the set of computable numbers is countably infinite!

Most mathematicians are familiar with the idea of countability. That is, the notion developed by Cantor in the 1870s that not all infinite sets have the same cardinality. A set that is countably infinite is one for which there exists some one-to-one correspondence between each of its elements and the set of natural numbers $$\mathbb{N}$$. For example, the set of integers $$\mathbb{Z}$$ (“Z” for “Zahlen”, meaning “numbers” in German) can be easily shown to be countably infinite. Cantor himself developed a well-known proof that the set of rational numbers $$\mathbb{Q}$$ (for “quotient”) is also countably infinite. Thus, in a very real sense $$\mathbb{N}$$, $$\mathbb{Z}$$, and $$\mathbb{Q}$$ all have the same cardinality, even though $$\mathbb{N} \subset \mathbb{Z} \subset \mathbb{Q}$$.

Conversely, an infinite set for which there is no one-to-one correspondence with $$\mathbb{N}$$ is said to be “uncountably infinite”, or just “uncountable”. $$\mathbb{R}$$, the set of real numbers, is one such set. Cantor’s “diagonalization proof” showed that no infinite enumeration of real numbers could possibly contain them all. Of course, there are many irrational real numbers that are useful in common computations ($$\pi$$, $$e$$, $$\phi$$, $$\sqrt{2}$$, and $$\log~2$$, just to name a few). A common misconception is that a set of all such useful numbers (including infinitely many transcendental numbers) is somehow “too complicated” to be merely countably infinite. Turing’s paper proved this intuition to be incorrect.

The rest of this column is laid out as follows. First, we repeat Cantor’s proofs showing that $$\mathbb{Z}$$ and $$\mathbb{Q}$$ are countable and $$\mathbb{R}$$ is uncountable. Then we will show how Turing extended Cantor’s work, by proving the countability of the set of computable numbers. We will call this set $$\mathbb{K}$$, to better fit in with the other sets of numbers. However, we will reprove Turing’s ideas using Python rather than his original Turing machines. The ideas behind the proofs will remain unchanged, while making it much more easily understood to a modern audience.

### The sets of integers and rationals are both countably infinite

Intuitively, one might think of the set of integers $$\mathbb{Z}$$ as being “bigger” than the set of the natural numbers $$\mathbb{N}$$. After all, $$\mathbb{Z}$$ is a proper superset of $$\mathbb{N}$$! However, by rearranging the integers to start with $$0$$ and count up in magnitude alternating between positive and negative, we can create an infinite list of all the integers, with a definite starting point. This rearrangement allows a bijective function between $$\mathbb{N}$$ and $$\mathbb{Z}$$. This means the function gives a one-to-one and onto correspondence between the two sets, so the function is invertible.

Here’s a proof. Rearrange $$\mathbb{Z}$$ to the order $$\{0, 1, -1, 2, -2, 3, -3, …\}$$. Then map them to the elements of $$\mathbb{N}$$ in their standard order, so that $$0\rightarrow 0$$, $$1\rightarrow 1$$, $$-1\rightarrow 2$$, and so on. This is injective (“one-to-one”) , since no two different integers will ever map to the same natural number. And it is surjective (“onto”), because there exists an integer to map to every natural number. Therefore the relationship is a bijection, and thus the integers are a countable set.

Proving that the set of rational numbers is countable is more difficult, given that there are two “degrees of freedom” in a rational number: the numerator and the denominator. It seems difficult to rearrange $$\mathbb{Q}$$ into a list the same way we did with $$\mathbb{Z}$$. (That is, one with a definite starting point, that extends infinitely forward in a single dimension.) One cannot simply list all the rational numbers with numerator 1, then all with numerator 2, etc., because there are an infinite number of them in each subset. If we started out by listing all the rationals with numerator 1 (for example), we’d never get to the others!

Instead, Cantor thought to traverse the set in an alternating “zig-zag” fashion, as shown here. We start by only considering the positive rational numbers $$\mathbb{Q}^+$$. We’ll extend to the complete set later. This zig-zag pattern lets us get to any given positive rational number, eventually.

Let’s prove that the rational numbers are countable. We start by showing a bijection between the positive rationals $$\mathbb{Q}^+$$ and $$\mathbb{N}$$. Lay out the elements of $$\mathbb{Q}^+$$ as shown in the figure above, so that the first row consists of all numbers with a $$1$$ numerator in order of increasing denominator ($$\frac{1}{1}$$, $$\frac{1}{2}$$, $$\frac{1}{3}$$, $$\ldots$$), the second row has all those with a $$2$$ demoninator in the same order, and so on. Then, traverse them as shown in the figure, moving diagonally while weaving back and forth. We skip any fractions not in lowest terms, so that no rational number appears twice. In this way, we can arrange the set $$\mathbb{Q}^+$$ into the order $$\{\frac{1}{1}, \frac{2}{1}, \frac{1}{2}, \frac{1}{3}, \frac{3}{1}, \ldots\}$$. When mapping them to the natural numbers in this order, the mapping function is an injection because no two different elements of $$\mathbb{Q}^+$$ map to the same member of $$\mathbb{N}$$, and it is a surjection because there exists a member of $$\mathbb{Q}^+$$ for every member of $$\mathbb{N}$$. Thus it is a bijection, and $$\mathbb{Q}^+$$ is countable.

To show that the set of all rational numbers $$\mathbb{Q}$$ is countable, use the reordering strategy employed to prove that $$\mathbb{Z}$$ is countable. Reorder $$\mathbb{Q}$$ to start with $$0$$, and then proceed through the rationals in the order shown in the figure, alternativing positive and negative. Thus $$\mathbb{Q}$$ will be ordered $$\{\frac{0}{1}, \frac{1}{1}, -\frac{1}{1}, \frac{2}{1}, -\frac{2}{1}, \frac{1}{2}, -\frac{1}{2}, \frac{1}{3}, -\frac{1}{3}, \frac{3}{1}, -\frac{3}{1}, \ldots\}$$. These elements can be mapped to the elements of $$\mathbb{N}$$ in its usual order, and this is a bijection for the same reasons as given above.

### The set of reals is uncountably infinite

However, real numbers are inherently uncountable. A rephrasing of Cantor’s original proof follows, using a trick that has come to be known as “diagonalization.” No matter what infinite list of real numbers is given, we can generate a new number $$x$$ that cannot possibly be in that list.

To prove the real numbers are uncountable, let’s assume they are not. Then there would exist some one-to-one correspondence between the real numbers $$\mathbb{R}$$ and the natural numbers $$\mathbb{N}$$. If that were the case, then it would be possible to enumerate all the real numbers in some order, such as that illustrated in the table shown. However, we can construct a real number $$x$$ that cannot possibly be in the list, which means that it’s not a complete list. We do this by taking one digit from each number in the list, and changing it. (In the table shown, the change is effected by adding 1 to the digit, but skipping 9 and 0 so as to avoid numbers such as $$0.0999\ldots$$ and $$0.1000\ldots$$ that are equal, even though their digits are different.) This way, $$x$$ differs from every single number in the list in at least one of its digits. The real number $$x$$ has no corresponding element in the natural numbers, and therefore the original assumption must be false. $$\mathbb{R}$$ must be uncountable.

Thus, integers and rational numbers are both countable sets, while real numbers are uncountable. Cantor also proved that the algebraic numbers ($$\mathbb{A}$$) are also a countable set. (These are the roots of polynomials with rational coefficients.) For his insight, Cantor was allegedly called an “apostate” (“Abtrünnigen” in German) and a “corrupter of youth” (“Verderber der Jugend”), among other colorful names. Many of his contemporaries felt that the nature of infinity was beyond human mathematics, being the realm of God. It was not until David Hilbert began to champion his ideas that they became more widely accepted in the early 1900s. Today, they are part of a standard undergraduate curriculum.

### Defining computable numbers

We will now move on to Turing’s work on computable numbers, which he defined in his famous paper. We use $$\mathbb{K}$$ to represent this set. ($$\mathbb{C}$$ is taken, as it usually refers to the set of complex numbers.) Turing defined a computable number as one that can be calculated to within an arbitrary precision, within a finite amount of time. These days, the easiest way to do that is often with a computer program—though any algorithmic approach will do.

First, let us show that all rational numbers are computable. When represented as a decimal, a rational number either terminates, or eventually begins to infinitely repeat the same set of digits. Thus, given enough time, a program could be written to display any rational number to the desired precision.

The figure here shows a non-terminating Python program, similar in spirit to the machines Turing laid out in his paper. This one outputs the never-ending decimal representation of one third. It first prints out “0.”, and then loops forever printing the digit “3” until the user gets tired and pulls the plug. In this way, it “calculates” the number to as accurate a degree as the user wishes.

It may come as a surprise that in his seminal paper, Turing preferred Turing machines that never terminated their calculations. Today, Turing machines are often associated with the “halting problem”, which is the impossible task of proving whether or not an arbitrary program will eventually terminate. However, Turing used his machines to calculate numbers to ever-increasing precisions. In fact, he referred to his machines as “unsatisfactory” if they ever stopped, and “satisfactory” if they kept on going forever. Although Turing’s paper touches on concepts similar to the halting problem, this would not be posed as we know it today until the 1950s.

The procedure to calculate a computable number does not need to be in code form (though any of the below approaches may be programmed, if needed). For example, $$\pi$$ is also a computable number, and to show this we only need to express it as an infinite sum, like this one:

$$\pi = \frac{4}{1} – \frac{4}{3} + \frac{4}{5} – \frac{4}{7} + \frac{4}{9} – \ldots$$

This is not as “computery” as the above code, but it is algorithmic all the same. One may continue adding further terms until the approximation is as accurate as needed. There are other well-known infinite series to calculate other irrational numbers like $$e$$ and $$\phi$$. So long as there exists a mechanical way to approximate a number to ever-increasing precision, within finite time, the number is computable.

Writing an efficient Python program to calculate $$\pi$$ is a little more difficult. The program shown in this figure uses a “streaming” method developed by J. Gibbons and implemented by D. Bau. It is not as easily comprehensible as the above infinite series, but is more in the spirit of Turing’s paper. It is able to continue printing further digits of $$\pi,$$ only needing to maintain six variables between each print statement (rather than reviewing all the digits it has already output), making for a very efficient calculation.

Some other numbers such as $$\sqrt{2}$$ and $$\log~2$$ also have infinite sums, products, or fractions that can be used to calculate them. However, it’s often more efficient to compute them via successive approximations. For example, let us approximate $$\sqrt{2}$$ with a number $$r$$. We know that $$r$$ is a positive number such that $$r^2 = 2$$. Since $$1 < r^2 < 4$$, that means that $$1 < r < 2$$. We can then successively approximate $$r$$ as $$1.5$$, $$1.25$$, $$1.375$$, $$1.4375$$, $$1.40625$$, and so on, each time adding or subtracting the next smaller power of two, depending on whether the current approximation’s square is greater than or less than 2. We stop when the power added or subtracted is less than half the desired maximum error.

There is one flaw with calculating irrational numbers like these on a modern computer: any computer has only a finite amount of memory, and eventually that memory will be exhausted. This is one key difference between Turing machines and real computers: Turing assumed that his machines had inexhaustible memory. Thus, it is fair to do likewise when proving that the set of computable numbers is countable. However, one could also slightly redefine a computable number to be one that can be calculated to within an arbitrary precision in a finite amount of time and with a finite amount of resources. This is equivalent to Turing’s original definition, since if a calculation required unbounded memory, it would need unbounded time to access that memory.

In general, any number with a well-defined value, or well-defined method to calculate its value, may be said to be an element of $$\mathbb{K}$$. If you can write a program to calculate a number (to within some arbitrary precision, in a finite amount of time), it’s computable. This includes many well-known non-algebraic real numbers. In fact, it’s very difficult to come up with a number that’s not computable.

### Gödel numbers

Now that we’ve defined our set $$\mathbb{K}$$, we need to show that it is countable. The proof relies on the concept of a Gödel number. Gödel numbers were developed by Kurt Gödel himself, in order to prove his “incompleteness theorems.” (If you are new to German, Gödel’s name is pronounced similarly to the English word “girdle”, but without the “r” sound.)

Definition: Gödel Number
A Gödel number is a natural number that is a unique representation of a particular member of a set $$S$$. The Gödel number is calculated with a function $$\mathrm{g}$$ whose domain is $$S$$ and whose codomain is $$\mathbb{N}$$.

Note that $$\mathrm{g}$$ is an injective function, so that if $$\mathrm{g}(s) = \mathrm{g}(s’)$$, it must be the case that $$s = s’$$. However some integers may not be valid Gödel numbers for members of $$S$$, so the function $$\mathrm{g}$$ is frequently not a bijection between $$S$$ and $$\mathbb{N}$$.

If a set $$S$$ is infinite, and there exists a function $$\mathrm{g}$$ to calculate a valid Gödel number for every member $$s$$, then $$S$$ must be countably infinite. To prove this, we must show that $$S$$ has a one-to-one correspondence with $$\mathbb{N}$$. Since the Gödel number of each set element $$s$$ is unique, one of them must have the least value. Let this element of $$S$$ correspond to $$0$$. Then, the one with the second-least Gödel number will correspond to $$1$$, the next one to $$2$$, and so on. Since both sets are infinite, the correspondence can continue infinitely. Hence there exists a bijection between $$S$$ and $$\mathbb{N}$$, and therefore $$S$$ is countable.

### Bringing it together

It remains to show that a Gödel number function $$\mathrm{g}$$ exists for $$\mathbb{K}$$. The trick is to calculate a Gödel number from the program that generates a computable number, rather than from the number itself. Recall that a program consists of a finite block of text. Most computers use the ASCII code or Unicode to represent a text file, in which every valid character is a encoded as a number between 1 and 127. (A 0 would represent a null character, which isn’t used in text files.) Thus, the source code itself may be represented as a base-128 number with no 0s, and as many digits as there are characters in the file. We will simply translate this base-128 number into a common base-10 one. For example, in the case of the tiny program above that prints one third, the computation is:
$$112\cdot128^0 + 114\cdot128^1 + 105\cdot128^2 + \ldots + 10\cdot128^{50}$$
Here, the $$112$$, $$114$$, and $$105$$ represent the characters p, r, and i respectively: the first three characters in the program. The fifty-first character at the end is a newline, which is represented by $$10$$. The resulting sum is the number:

23, 674, 419, 604, 088, 055, 738, 162, 829, 936, 727, 249, 274, 729, 959, 756, 590, 023, 485, 007, 031, 946, 824, 166, 916, 359, 320, 099, 837, 891, 922, 699, 574, 436, 329, 840.

The other program, to calculate $$\pi$$, results in a Gödel number that is approximately $$1.77\times10^{654}$$. These numbers are unwieldy, but remember that they encode entire programs. Because one can recover the complete program from its number, it must be the case that the function $$\mathrm{g}$$ is injective.

This is similar to Turing’s approach: he created Turing machines to compute numbers, and then represented each machine as a large table of states. This let him systematically create a unique “description number” for each possible machine. (The term “Gödel number” was not yet in common use.) The number was a large positive integer that described every aspect of the machine, and no two different machines could ever have the same number. Thus, as in our case, Turing showed that there exists a valid Gödel function $$\mathrm{g}$$, and used this to show that $$\mathbb{K}$$ is countable.

It may be argued that any given computable number has an infinite number of programs that can calculate it, and hence infinite Gödel numbers associated with it. We need the function $$\mathrm{g}$$ to map each computable number to one Gödel number. Therefore, we choose the number with the least magnitude, which corresponds to the shortest program that produces the computable number in question. A critic may argue that finding this shortest program is not always possible, and so the Gödel function $$\mathrm{g}$$ cannot be executed. However, for our purposes it does not matter if the function is practical. All that matters, to prove that the set of computable numbers is countable, is that each one can be shown to have an associated Gödel number.

Here’s the final proof, bringing this all together. The computable numbers are an infinite set. We have provided an injective function $$\mathrm{g}$$ that maps every computable number to a single natural number: a Gödel number. Any set with such a function is countable, and therefore computable numbers are countable.

### A counter proof?

Some readers may wonder why we cannot use the same diaganolization trick that Cantor used, and so prove that the computable numbers are uncountable. Let’s try to do it, and we will see where it fails.

Let us say that we have an allegedly complete (though infinite) enumeration of every computable number. We will now attempt to use diagonalization in order to produce a new computable number $$x$$, that could not possibly be in our list. First we calculate the tenths place digit of the first computable number, and assign $$x$$’s tenths place digit to be something else. Then we calculate the hundredths place digit of the second computable number, and assign $$x$$’s corresponding digit to be another one. We proceed as before, until we have computed a number that could not possibly be in the enumeration. Thus, it appears that we have a contradiction, and so $$\mathbb{K}$$ appears uncountable.

The problem with this “proof” is that we never showed that the number $$x$$ is itself computable. When Cantor used diagonalization to prove $$\mathbb{R}$$ uncountable, the resulting $$x$$ was obviously a real number, even though it’s not in the “complete” list. Hence he had a contradiction. But proving that a number is computable is more difficult, because we would need to show that the computation to some arbitrary accuracy can be done in finite time. Since we haven’t done this (and in fact we cannot do it), the alleged proof falls apart.

It might seem straightforward at first to prove that $$x$$ is computable, since the diagonalization technique appears to give an algorithm to calculate it. However the algorithm is flawed, and will enter an infinite loop from which it cannot recover. Remember that it calculates $$x$$ by using each computable number in turn, which would include $$x$$ itself. Let us say that $$x$$ happens to be the $$i$$th number in the list. Then, in order to calculate $$x$$ to the $$i$$th digit, one must first calculate $$x$$ to the $$i$$th digit. The algorithm enters an infinite regress from which it cannot recover. Thus the computation fails: $$x$$ is not computable, there is no contradiction, and diagonalization technique did not work.

### Conclusions

What does it mean that the set of computable numbers is countable, but the set of real numbers is not? In a very true sense, the set $$\mathbb{K}$$ contains all the numbers that are knowable. Real numbers that aren’t computable can’t be calculated, or even thought about except in the most abstract manner (such as $$x$$ from the last section, the uncalculable result of using the diagonalization process on $$\mathbb{K}$$). Yet, since $$\mathbb{R}$$ is uncountable, the vast majority of real numbers must be unknowable. They are the infinite streams of digits, with absolutely no rhyme or reason to their sequences. Any real number used in day-to-day life is part of the computable set $$\mathbb{K}$$.

Others have gone on to describe other incomputable numbers. For example, in 1949 the Swiss mathematician Ernst Specker found a monotonically increasing, bounded, computable sequence of rational numbers whose least upper bound is not computable. Computability theory originated with Turing’s paper as a branch of mathematics in its own right. It attempts to classify noncomputable functions into a hierarchy, based on their degrees of noncomputability. It is of interest to both mathematicians and computer scientists.

Fortunately for Alan Turing, his Turing machines were thought to be innovative enought that his paper found publication, despite not being the first to prove that the Entscheidungsproblem is not solvable. In fact, his paper served as his introduction to Alonzo Church, the very man who had scooped him. Church would go on to supervise Turing’s doctoral work at Princeton, from which he graduated in 1938 at the age of 26! The two would go on to lay much of the groundwork for modern computer science, including the Church-Turing thesis (which roughly states that human computation, machine computation, and Church’s lambda calculus are all equivalent).

Turing’s paper doesn’t mention complex numbers, but an extension of his work is straightforward. A complex computable number would be a complex number whose real portion is a computable number, and whose imaginary portion is another computable number times $$i$$ $$(\sqrt{-1})$$. Such a number is computable to within an arbitrary precision in a finite amount of time, so long as the program that calculates it alternates regularly between calculating the real portion and the imaginary portion (perhaps outputting the two numbers via separate streams or into separate files). Since such a complex number can be calculated using a program, the same proof of countability applies.

Charles Petzold’s “The Annotated Turing” is an excellent source for those who wish to know more about Turing and his proof that the Entscheidungsproblem is impossible. It reproduces Turing’s entire paper one small piece at a time, interspersed with copious explanatory detail and context. If you wish to know more about Turing’s seminal work, this is without a doubt the best place to start. Those more interested in Turing’s personal life should read Andrew Hodges’ “Alan Turing: the Enigma”. This covers his whole life in great detail, from his early childhood, through his codebreaking efforts during the war years, and ending with his tragic persecution and death just shy of his 42nd birthday.

Turing’s idea of computable numbers form a logical and intuitive extension to Cantor’s original work on countability. His work is a unique bridge between mathematics and computer science, that is still relevant today.

### Acknowledgements

Thank you to David Bau for permission to use his implementation of the $$\pi$$ spigot algorithm. Thank you also to Ursula Whitcher for encouragement and assistance with one of the proofs.

### References

• D. Bau. A dabbler’s weblog: Python pi.py spigot. http://davidbau.com/archives/2010/03/14/python_pipy_spigot.html, March 2010.
• G. Cantor. Über eine Eigenschaft des Inbegriffs aller reellen algebraischen Zahlen (On a property of the collection of all real algebraic numbers). Journal für die reine und angewandte Mathematik, 77:258–262, 1874.
• G. Cantor. Über eine elementare Frage der Mannigfaltigkeitslehre (On an elementary question concerning the theory of manifolds). Jahresbericht der Deutschen Mathematiker-Vereinigung, 1, 1892.
• A. Church. A note on the Entscheidungsproblem. The Journal of Symbolic Logic, 1(01):40–41, 1936.
• J. Gibbons. Unbounded spigot algorithms for the digits of pi. The American Mathematical Monthly, 113(4):pp. 318–328, 2006.
• K. Gödel. Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I (On formally undecidable propositions of Principia Mathematica and related systems I). Monatshefte für Mathematik und Physik, 38(1):173–198, 1931.
• A. Hodges. Alan Turing: the Enigma. Burnett Books, London, 1983.
• C. Petzold. The Annotated Turing: A Guided Tour Through Alan Turing’s Historic Paper on Computability and the Turing Machine. Wiley Publishing, 2008.
• E. Specker. Nicht konstruktiv beweisbare Sätze der Analysis (Theorems of analysis that cannot be proven constructively). The Journal of Symbolic Logic, 14:145–158, 1949.
• A. M. Turing. On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, 42, 1936.
• A. M. Turing. Systems of logic based on ordinals. PhD thesis, Princeton University, 1938.

# Meet me up in space!

Rather than closing the distance, however, the target seemed to move down and away in defiance of everyday intuition…

David Austin
Grand Valley State University

Complex space missions rely on the ability to bring two spacecraft together, a procedure called orbital rendezvous. A spacecraft docking at the International Space Station is a typical example.

Historically, rendezvous was a vital component of the Apollo lunar missions. While three astronauts traveled to the moon in a single craft, the command module, two of them descended to the lunar surface in a second craft, the lunar excursion module (LEM). A successful mission required the ascending LEM to rendezvous with the command module before returning to Earth.

Gemini 4 Capsule at the Smithsonian National Air and Space Museum.

The first attempt at orbital rendezvous was one component of the Gemini IV mission in 1965 when the pilot James McDivitt tried to bring his Gemini capsule close to the spent booster that lifted them into orbit. Upon first seeing the booster at an estimated 120 meters, McDivitt aimed the capsule at the booster and thrusted toward it. Rather than closing the distance, however, the target seemed to move down and away in defiance of everyday intuition. He repeated this procedure several times with similar results.

NASA engineer André Meyer later recalled, “There is a good explanation [for] what went wrong with rendezvous.” Mission planners “just didn’t understand or reason out the orbital mechanics involved” due to competing mission priorities. That’s what we’re going to do here. This column will explain the relative motion of two spacecraft flying closely in Earth orbit and why that motion behaves in such a counter-intuitive way.

Science fiction films often depict spacecrafts that continually alter their trajectories by firing their engines. In the real world, however, fuel is a scarce resource due to the energy required to lift the fuel into space. As a result, spacecraft spend most of their time coasting and only apply short bursts of thrust at opportune moments to adjust their trajectory. This means that a spacecraft operating near, say, Earth or the Moon will spend most of its time traveling in an elliptical orbit, as dictated by Kepler’s second law.

We therefore imagine a target spacecraft $T$ traveling in a circular orbit, and an interceptor spacecraft $I$ attempting to rendezvous with the target and traveling in another elliptical orbit. It’s not essential that the target’s orbit be circular, but it’s a convenient assumption for our purposes. The object of orbital rendezvous is to guide the interceptor so that it comes together with the target.

### Equations of relative motion

The interceptor perceives the target as a fixed point that it wants to move toward. For this reason, we are interested in describing the motion of the interceptor in a coordinate system that rotates with the target, which means we want to understand the evolution of the vector joining the spacecrafts.

As a note to the reader, we’ll walk through some calculations that explain the evolution of this vector and eventually find a rather elegant description. I’d like to give some intuition for what controls the motion, but I won’t give every detail and trust that the reader can fill in as much or as little as they wish. This exposition follows a 1962 technical report by NASA engineer Donald Mueller and a more recent survey by Bradley Carroll. Apollo 11 astronaut Edwin “Buzz” Aldrin’s M.I.T. Ph.D. thesis covers similar ground.

We’ll denote the interceptor’s location by $I$ and the target’s by $T$. The vector ${\mathbf r}$ measures the displacement of the interceptor from the target. The vectors ${\mathbf R}^*$ and ${\mathbf r}^*$ measure the displacement of the target and interceptor from Earth’s center.

Since our goal is to understand the motion of the interceptor $I$ relative to the target $T$, we will focus our attention on ${\mathbf r}$. Moreover, we will consider a coordinate system that rotates with the target $T$ as shown below.

For convenience, we will assume that the target is moving in a circular orbit of radius $R_0$ in the $xy$-plane. We see that ${\mathbf i}$ is a unit vector pointing in the direction of the target’s motion, ${\mathbf j}$ points radially from the center of Earth, and ${\mathbf k}$ is a unit vector pointing out of the page.

The vectors ${\mathbf i}^*$, ${\mathbf j}^*$, and ${\mathbf k}^* = {\mathbf k}$ are fixed. In general, quantities denoted with a $*$ are with respect to this fixed, Earth-centered coordinate system and quantities without a $*$ are with respect to the rotating, target-centered coordinate system.

We can describe $${\mathbf R^*} = R_0\left[\cos(\omega_0T){\mathbf i}^* -\sin(\omega_0t){\mathbf j}^*\right],$$ where $\omega_0$ is the angular velocity of the target. Differentiating twice gives the acceleration $$\frac{d^2}{dt^2}{\mathbf R}^* = R_0\omega_0^2\left[-\cos(\omega_0T){\mathbf i}^* +\sin(\omega_0t){\mathbf j}^*\right].$$

Since gravity is the only force acting on the target, the magnitude of the acceleration vector is $GM/R_0^2$, where $G$ is the gravitational constant and $M$ is Earth’s mass. This tells us that $R_0\omega_0^2 = GM/R_0^2$ so we have $$\omega_0=\sqrt{\frac{GM}{R_0^3}}.$$ Notice that the angular velocity decreases as the target’s altitude increases.

This is really an expression of Kepler’s Third Law for circular orbits. If $P$ is the period of the orbit, then $\omega_0P = 2\pi$, which says that $$P^2 = \frac{4\pi^2}{\omega_0^2} = \frac{4\pi^2}{GM}R_0^3$$ so that the square of the period is proportional to the cube of the radius.

We would like to relate rates of change measured in the fixed, Earth-centered coordinate system to those measured in the rotating, target-centered system. In particular, we are interested in the derivatives $$\frac{d^*}{dt}{\mathbf r},\hspace{24pt} \frac{d}{dt}{\mathbf r},$$ the first of which measures the rate of change of ${\mathbf r}$ in the fixed coordinate system while the second measures the rate of change in the rotating coordinate system.

Writing $${\mathbf r} = x{\mathbf i} + y{\mathbf j} + z{\mathbf k}$$ shows us that $$\frac{d^*}{dt}{\mathbf r} = \left(\frac{d}{dt}x\right){\mathbf i} + \left(\frac{d}{dt}y\right){\mathbf j} + \left(\frac{d}{dt}z\right){\mathbf k} + x\frac{d}{dt}{\mathbf i} + y\frac{d}{dt}{\mathbf j}.$$

Notice that there are some additional terms that come from differentiating the coordinate vectors. Introducing the angular velocity vector ${\mathbf \omega} = -\omega_0{\mathbf k}$ allows us to see, with a little calculation, that $$\frac{d^*}{dt}{\mathbf r} = \frac{d}{dt}{\mathbf r} + \omega\times{\mathbf r}.$$

After differentiating again, one can show that $$\frac{d^2}{dt^2}{\mathbf r} = \frac{{d^*}^2}{dt^2}{\mathbf r} – \omega\times(\omega\times{\mathbf r}) – 2\omega\times\frac{d}{dt}{\mathbf r}.$$

Remember that ${\mathbf r} = {\mathbf r}^* – {\mathbf R}^*$ so that $$\frac{{d^*}^2}{dt^2}{\mathbf r} = \frac{{d^*}^2}{dt^2}{\mathbf r}^* – \frac{{d^*}^2}{dt^2}{\mathbf R}^* = -\frac{GM}{|{\mathbf r}^*|^3}{\mathbf r}^* +\frac{GM}{|{\mathbf R}^*|^3}{\mathbf R}^*.$$ Since the target is moving in a circular orbit of radius $R_0$, we see that ${\mathbf R}^* = R_0{\mathbf j}$. Also, because ${\mathbf r}^* = {\mathbf R}^* + {\mathbf r}$, we have $${\mathbf r}^* = x{\mathbf i} + (R_0+y){\mathbf j} + z{\mathbf k}.$$

Since we are interested in guiding the interceptor to rendezvous with the target, we will assume that the distance between the target and interceptor is much less than the radius of the target’s orbit. For instance, ${\mathbf r} = x{\mathbf i} + y{\mathbf j} + z{\mathbf k}$ may represent a length of a few kilometers, while ${\mathbf R}^* = R_0{\mathbf j}$ represents a length of several thousands of kilometers.

This leads us to approximate $|{\mathbf r}^*| \approx R_0 + y$ and to apply the linear approximation $$\frac{1}{|{\mathbf r}^*|^3} \approx \frac{1}{R_0^3} -\frac{3y}{R_0^4}.$$ With a little algebra, we can write the components of the vector $\frac{d^2}{dt^2}{\mathbf r}$ as \begin{aligned} \frac{d^2x}{dt^2} & = -2\omega_0 \frac{dy}{dt} \\ \frac{d^2y}{dt^2} & = 3\omega_0^2y + 2\omega_0\frac{dx}{dt} \\ \frac{d^2z}{dt^2} & = -\omega_0^2z \end{aligned} These are known as the Hill equations that describe the interceptor’s path in the rotating coordinate frame, and we will see how to find explicit solutions to them.

### Ellipses everywhere

The equation for the $z$-coordinate is a simple harmonic oscillator whose solution is $$z(t) = A\cos(\omega_0t) + B\sin(\omega_0t).$$

Remember that the $z$-axis is fixed in both coordinate systems. This form for $z(t)$ simply expresses the fact that in the fixed, Earth-centric coordinate system, the interceptor is traveling in an elliptical orbit about Earth’s center. More specifically, this expression describes the motion of the interceptor out of the plane of the target’s orbit. To achieve orbital rendezvous, we need to align the plane of the interceptor with that of the target. In practice, this incurs a considerable cost in fuel so care is taken to launch the interceptor in an orbital plane that’s close to that of the target.

We will assume that the interceptor has performed such a maneuver so that its orbital plane agrees with that of the target. In other words, we’ll assume that $z(t) = 0$ constantly and focus on the other two equations: \begin{aligned} \frac{d^2x}{dt^2} & = -2\omega_0 \frac{dy}{dt} \\ \frac{d^2y}{dt^2} & = 3\omega_0^2y + 2\omega_0\frac{dx}{dt} \\ \end{aligned}

We can develop some intuition by thinking about these equations more carefully. Remember that gravity is the only force acting on the interceptor. These equations tell us, however, that in the rotating coordinate system, the interceptor behaves as if it were subjected to two forces per unit mass: \begin{aligned} {\mathbf F}_{\mbox{Tid}} & = 3\omega_0^2y{\mathbf j} \\ {\mathbf F}_{\mbox{Cor}} & = -2\omega_0\frac{dy}{dt}{\mathbf i}+2\omega_0\frac{dy}{dt}{\mathbf j} = -2\omega\times\frac{d}{dt}{\mathbf r}. \\ \end{aligned}

The first ${\mathbf F}_{\mbox{Tid}}$, known as the tidal force, tends to push the interceptor away from the target’s circular orbit and makes $y=0$ an unstable equilibrium.

 The tidal force The Coriolis force

The second force, the Coriolis force, causes the interceptor to rotate counterclockwise as it moves in the rotating coordinate system.

Imagine what happens if we start just a little above the $x$-axis. The tidal force pushes us upward further away from the axis and causes us to pick up speed. The Coriolis force then begins to pull us to the left.

One simple solution to the equations of motion in the rotating frame, which can be verified directly from the equations, is given by \begin{aligned} x(t) & = a\cos(\omega_0 t) \\ y(t) & = \frac a2\sin(\omega_0 t). \end{aligned} Of course, we know by Kepler’s Second Law that the interceptor’s trajectory is elliptical in the fixed, Earth-centered frame. What’s remarkable about this solution is that it says that the interceptor’s path in the rotating frame is also elliptical.

The following figures illustrate this solution in both frames along with the positions at equal time intervals. It’s worth taking a moment to study the relationship between them.

 In the rotating frame In the fixed frame

For this special solution, one can check that $$\frac{d^2}{dt^2}{\mathbf r} = -\omega_0^2{\mathbf r}.$$ This is the equation for a simple harmonic oscillator, the kind of motion observed by a mass on a spring. In other words, the interceptor moves as if it were connected to the target by a spring, which gives some insight into why orbital rendezvous is so counter-intuitive.

While this solution is certainly special, it shares many features with the general solution. Denoting the initial position by $(x_0, y_0)$ and the initial velocity by $(\dot{x}_0, \dot{y}_0)$, one can check that the general solution is \begin{aligned} x(t) & = \left(6y_0 + \frac{4\dot{x}_0}{\omega_0}\right) \sin(\omega_0 t) + \frac{2\dot{y}_0}{\omega_0}\cos(\omega_0t) + \left[x_0 – \frac{2\dot{y}_0}{\omega_0} – (3\dot{x}_0 + 6\omega_0y_0)t\right] \\ y(t) & = -\left(3y_0 + \frac{2\dot{x}_0}{\omega_0}\right)\cos(\omega_0 t) + \frac{\dot{y}_0}{\omega_0}\sin(\omega_0t) + \left[4y_0 + \frac{2\dot{x}_0}{\omega_0}\right] \end{aligned}

These expressions, known as the Clohessy-Wiltshire equations, initially look complicated, but a little study reveals a suprising simplicity. Defining \begin{aligned} C & = 3y_0 + \frac{2\dot{x}_0}{\omega_0} \\ D & = \frac{\dot{y}_0}{\omega_0} \\ x_c & = x_0 – \frac{2\dot{y}_0}{\omega_0} – (3\dot{x}_0 + 6\omega_0y_0)t \\ y_c & = 4y_0 + \frac{2\dot{x}_0}{\omega_0}, \end{aligned} enables us to write \begin{aligned} x(t) = 2C\sin(\omega_0 t) + 2D\cos(\omega_0 t) + x_c \\ y(t) = -C\cos(\omega_0 t) + D\sin(\omega_0 t) + y_c. \end{aligned}

This shows that the interceptor, once again, is traveling in an elliptical path centered about the point $(x_c, y_c)$. In fact, the semi-major axis $a$ and semi-minor axis $b$ of the ellipse are \begin{aligned} a & = 2\sqrt{C^2+D^2} \\ b & = a/2, \end{aligned} which shows that the eccentricity of the path is always $\sqrt{3}/2$.

Notice, however, that the point $(x_c, y_c)$ is moving since $x_c$ has velocity $$v_{\mbox{drift}} = – (3\dot{x}_0 + 6\omega_0y_0)t = -\frac32\omega_0 y_c.$$ This is important because it says that the center of the ellipse drifts left if $y_c \gt 0$ and drifts right if $y_c \lt 0$.

### Parking orbits

Here are some special orbits that Mueller calls parking orbits.

Remember that the orbit is determined by its initial conditions $(x_0,y_0)$ and $(\dot{x}_0,\dot{y}_0)$. With the right choice of initial conditions, we can arrange that $C=D=0$ so that the ellipse collapses to a point. Mueller’s type I parking orbit has the interceptor in a circular orbit, as seen in the fixed frame, at the same altitude as the target. In the rotating frame, the interceptor is not moving relative to the target.

A type III parking orbit has the interceptor again in a circular orbit but at a different altitude than the target. Since the angular velocity decreases with the altitude, the interceptor appears to move horizontally in the rotating frame. The Clohessy-Wiltshire equations explain this by noting that the center of the ellipse moves left when $y_c\gt0$ and right when $y_c\lt0$.

Things get a little more interesting if we allow $C$ and $D$ to be nonzero so that the interceptor moves on an elliptical trajectory in the rotating frame. A type II parking orbit has $y_c = 0$ so that the center of the ellipse is centered on the $x$-axis. In this case, the center does not move so the elliptical orbit is stationary. The special solution we looked at earlier is an example. The orbit depicted here shows how an astronaut could retrieve a tool accidentally dropped by just waiting one orbital period for its return.

A type IV orbit has the interceptor traveling an elliptical orbit whose center is drifting. The orbit shown has $y_c \gt 0$ so the center of the ellipse is moving to the left.

With any solution, one can check that the interceptor moves as if connected by a spring to the moving center of the ellipse, which again speaks to the counter-intuitive nature of the motion.

These orbits explain the behavior of Gemini IV, which we described in the introduction. Imagine that the interceptor is initially in a type I parking orbit, traveling in the same circular orbit as the target and 500 meters away.

McDivitt aimed at the target and fired Gemini’s thruster, which would make $\dot{x}_0 \gt 0$ and move the craft into the type IV orbit shown here. This explains why the target paradoxically appeared to move down and away from the capsule.

One scenario Carroll considers has an astronaut stranded outside the ISS at an initial position of $(100,100)$, which is 100 meters in front of the space station and 100 meters above. Applying a momentary thrust to create a velocity of 1 meter per second aimed directly at the space station leads to a type IV orbit that badly misses the station.

So how can we put the interceptor on a path to dock with the target? We’ll assume the interceptor is at an initial position $(x_0, y_0)$ and choose a time $T$ at which we’d like to dock. Setting $x(T) = 0$ and $y(T) = 0$ gives the equations \begin{aligned} x(T) = 0 & = \left(6y_0 + \frac{4\dot{x}_0}{\omega_0}\right) \sin(\omega_0 T) + \frac{2\dot{y}_0}{\omega_0}\cos(\omega_0T) + \left[x_0 – \frac{2\dot{y}_0}{\omega_0} – (3\dot{x}_0 + 6\omega_0y_0)T\right] \\ y(T) = 0 & = -\left(3y_0 + \frac{2\dot{x}_0}{\omega_0}\right)\cos(\omega_0 T) + \frac{\dot{y}_0}{\omega_0}\sin(\omega_0T) + \left[4y_0 + \frac{2\dot{x}_0}{\omega_0}\right] \end{aligned}

Once again, these equations appear complicated, but notice that we know everything except the components of the initial velocity. This means that these are two linear equations for the two components of the unknown initial velocity $(\dot{x}_0, \dot{y}_0)$. Once we have found those components, the interceptor applies a momentary thrust to change its velocity to $(\dot{x}_0, \dot{y}_0)$ bringing it the target’s position at time $T$.

Carroll’s paper shows how to apply this idea to reproduce the rendezvous technique used in the Apollo 11 mission as Neil Armstrong and Buzz Aldrin returned from the lunar surface to dock with Michael Collins in the command module.

### Summary

NASA learned from Gemini IV’s failure to rendezvous with the booster target, and several months later, the Gemini VI mission successfully navigated to within 130 feet of Gemini VII. The pilot of Gemini VI, Walter Schirra, later explained that the difficulty was getting to that distance but that completing docking from there is relatively straightforward. Carroll’s paper examines that claim and confirms that line-of-sight flying is effective within about 100 feet of the target.

The International Space Station in 2021.

In the past, spacecraft docking with the International Space Station would fly in formation with the station, which would grab the craft with the Canadarm2 and manually complete the docking procedure. Recently, however, SpaceX has automated the docking process so that software guides the craft rather than astronauts who are not involved at all.

Would you like to try to dock with the ISS? This SpaceX simulator gives you the chance.