We use the Internet for many things, from reading news articles, to keeping in touch with friends on social media, to shopping from the comfort of our own homes. Many of these tasks involve sending sensitive personal information (such as credit card numbers and home addresses) to complete strangers. We would like to keep this information safe, making sure no malicious third party is able to read our messages even if they are intercepted. RSA is one of the first practical public-key cryptosystems and is widely used for secure data transmission. RSA has stood the test of nearly 40 years of attacks, making it the algorithm of choice for encrypting Internet credit-card transactions, securing e-mail, and authenticating phone calls.
One of the distinguishing techniques employed in public-key cryptography is the use of asymmetric keys. In this scheme, one key (the public key) is used to encrypt the message while a different key (the private key) is used to decrypt it. The keys are related mathematically, but the parameters are chosen so that calculating the private key from the public key is either impossible or prohibitively expensive.
At MIT in the fall of 1976, computer scientists Ronald Rivest and Adi Shamir, along with number theorist Leonard Adleman, began the work that led to the public-key encryption scheme that bears their initials (RSA encryption), which has been in use ever since to secure electronic transactions. The RSA algorithm was the culmination of many months of work, which started when Rivest read a paper by Diffie and Hellman proposing that a good public-key encryption scheme would need to be based on what they called a “trap-door one-way function.” This function would be easy to compute, but hard to invert unless you knew the secret (the “trap door”). Rivest and Shamir would come up with numerous number-theoretic schemes to fit this “trap door” idea, and then Adleman would try to poke holes in them (usually succeeding after a few minutes’ thought). However, one evening Rivest called Adleman with a new idea, which Adleman agreed was a good one because it seemed that only factoring would break the algorithm. Rivest wrote up a paper about the new algorithm and sent a copy to Adleman. The authors were listed in the standard alphabetical order: Adleman, Rivest, and Shamir. Adleman objected to this ordering, however, because he felt he had not done enough work to be listed first. He consented to being listed as an author only if his name was put last, reflecting what he considered his minimal contribution. Adleman said later, “I remember thinking that this is probably the least interesting paper I will ever write and no one will read it and it will appear in some obscure journal.” The paper in question was published in Communications of the ACM, and so the RSA (instead of the ARS) algorithm was born.
The RSA algorithm involves three steps: key generation, encryption, and decryption. RSA involves two keys – a public key and a private key. As the names suggest, anyone can be given information about the public key, whereas the private key must be kept secret. Anyone can use the public key to encrypt a message, but only someone with knowledge of the private key can hope to decrypt the message in a reasonable amount of time. So how are these keys generated? The power and security of the RSA cryptosystem is based on the fact that the factoring problem is “hard.” That is, it is believed that the full decryption of an RSA ciphertext is infeasible because no efficient algorithm currently exists for factoring large numbers. The keys for the RSA algorithm are generated as follows:
1. Choose two distinct prime numbers $p$ and $q$. In order for the system to be secure, the integers $p$ and $q$ should be chosen at random and should be of similar bit-length. To find large primes, the numbers can be chosen at random and, using one of several fast probabilistic methods, we can test their primality.
2. Compute $n = pq$. The product $n$ will be used as the modulus for both the public and private keys. Its length, usually expressed in bits, is the length of the key.
3. Compute $\varphi(n) = (p-1)(q-1)$, where $\varphi$ is Euler’s totient function.
4. Choose an integer $e$ such that $1 < e < \varphi(n)$ and $\gcd(e, \varphi(n)) = 1$ (that is, $e$ and $\varphi(n)$ are coprime). The number $e$ is released as the public key exponent.
5. Determine $d$ as $d \equiv e^{-1} \pmod{\varphi(n)}$. That is, $d$ is the multiplicative inverse of $e$ (modulo $\varphi(n)$). This is often computed using the extended Euclidean algorithm. The number $d$ is kept as the private key exponent.
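The five steps above can be sketched in Python. This is a minimal illustration rather than a production implementation: the function names are hypothetical, Miller-Rabin stands in for the "fast probabilistic methods" mentioned in step 1, and the default exponent 65537 is an assumption (a commonly used value, not one taken from this article).

```python
import math
import secrets


def is_probable_prime(n: int, rounds: int = 40) -> bool:
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29):
        if n % p == 0:
            return n == p
    # Write n - 1 as 2^r * s with s odd.
    r, s = 0, n - 1
    while s % 2 == 0:
        r += 1
        s //= 2
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2  # random base in [2, n - 2]
        x = pow(a, s, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


def random_prime(bits: int) -> int:
    """Step 1: draw random odd candidates of the given bit length until one passes."""
    while True:
        candidate = secrets.randbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(candidate):
            return candidate


def extended_gcd(a: int, b: int):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_x, x = x, old_x - quotient * x
        old_y, y = y, old_y - quotient * y
    return old_r, old_x, old_y


def mod_inverse(e: int, phi: int) -> int:
    """Step 5: multiplicative inverse of e modulo phi."""
    g, x, _ = extended_gcd(e, phi)
    if g != 1:
        raise ValueError("e and phi are not coprime")
    return x % phi


def generate_keypair(bits: int = 2048, e: int = 65537):
    """Return ((n, e), (n, d)) with a modulus of roughly `bits` bits."""
    while True:
        p = random_prime(bits // 2)
        q = random_prime(bits // 2)
        phi = (p - 1) * (q - 1)           # step 3
        if p != q and math.gcd(e, phi) == 1:  # step 4: e must be coprime to phi
            break
    n = p * q                              # step 2
    d = mod_inverse(e, phi)                # step 5
    return (n, e), (n, d)


# Small demo key so the example runs quickly; far too short for real use.
public_key, private_key = generate_keypair(bits=256)
print("modulus bit length:", public_key[0].bit_length())
```

Fixing $e$ in advance and redrawing the primes until $\gcd(e, \varphi(n)) = 1$ is one design choice among several; choosing $e$ after the primes, as the steps above describe, works equally well.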
The public key is formed by the pair $(n, e)$, where $n$ is called the modulus and $e$ is called the public (or encryption) exponent. The private key is formed by the pair $(n, d)$, where $d$ is called the private (or decryption) exponent. It is imperative that the decryption exponent $d$ is kept secret. In addition, the numbers $p$, $q$, and $\varphi(n)$ must also be kept private because they can be used to calculate $d$.
Once the keys are determined, secure messages can now be sent. Suppose Bob would like to send Alice a message. Alice transmits her public key $(n, e)$ to Bob, keeping her private key secret. In order to send Alice an encrypted message, Bob first has to turn the message into an integer $m$, such that $0 \le m < n$. This is done by using a previously agreed-upon reversible protocol known as a padding scheme. Bob then computes the ciphertext $c$ corresponding to his message. The ciphertext can be found by computing $c \equiv m^e \pmod{n}$. Bob then transmits the encoded message $c$ to Alice. In order to recover the message, Alice uses her private key, computing $m \equiv c^d \pmod{n}$. Given $m$, she can recover the original message by reversing the padding scheme.
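The exchange between Bob and Alice can be sketched as follows. Several things here are assumptions made for illustration: the demo key is built from two well-known Mersenne primes rather than random primes, the helper names are hypothetical, and the message is converted to an integer directly with no padding scheme at all, which real systems should never do. The modular inverse via `pow(E, -1, ...)` needs Python 3.8 or later.

```python
# Hypothetical demo key; real keys use large random primes of similar size.
P = 2**31 - 1          # a Mersenne prime
Q = 2**61 - 1          # another Mersenne prime
N = P * Q              # modulus, part of both keys
E = 65537              # public exponent (assumed, a common choice)
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent


def encode(message: bytes, n: int) -> int:
    """Turn bytes into an integer m with 0 <= m < n (no padding applied)."""
    m = int.from_bytes(message, "big")
    if m >= n:
        raise ValueError("message too long for this modulus")
    return m


def decode(m: int) -> bytes:
    """Reverse encode(); leading zero bytes of the original are not preserved."""
    return m.to_bytes((m.bit_length() + 7) // 8, "big")


def encrypt(m: int, e: int, n: int) -> int:
    """Bob's step: c = m^e mod n, using Alice's public key (n, e)."""
    return pow(m, e, n)


def decrypt(c: int, d: int, n: int) -> int:
    """Alice's step: m = c^d mod n, using her private key (n, d)."""
    return pow(c, d, n)


m = encode(b"Hi Alice", N)
c = encrypt(m, E, N)
print(decode(decrypt(c, D, N)))
```

The three-argument form of `pow` performs modular exponentiation without ever forming the huge intermediate power, which is what makes these operations fast even for large exponents.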
We will now look at a small (and insecure) example. Alice would like to create a public and private key to use for her secure internet transactions. In order to create these keys, she chooses $p = 101$ and $q = 113$. Next, she computes $n = pq = 11413$ as well as the totient $\varphi(n) = (p-1)(q-1) = 11200$. In order to find the public key exponent, Alice must choose a number $1 < e < 11200$ which is also coprime to 11200. For this example, Alice will choose $e = 3533$. Finally, Alice must compute $d = e^{-1} \bmod 11200 = 6597$. Alice publishes the public key pair $(n, e) = (11413, 3533)$, while keeping $d$ and $\varphi(n)$ private. Now suppose that Bob would like to send the message $m = 9726$ to Alice. Bob would compute
$$c = 9726^{3533} \bmod 11413 = 5761,$$
which he would then send to Alice. After receiving the ciphertext $c = 5761$, Alice can decode the message using her private key
$$m = 5761^{6597} \bmod 11413 = 9726.$$
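The arithmetic in this example is easy to check by machine. The sketch below assumes the primes are $p = 101$ and $q = 113$, which is consistent with the totient 11200 and with the other values quoted in the example; the exponent, message, and expected results come from the text. The call `pow(e, -1, phi)` needs Python 3.8 or later.

```python
p, q = 101, 113            # assumed primes, consistent with phi = 11200
n = p * q                  # modulus
phi = (p - 1) * (q - 1)    # Euler's totient of n
e = 3533                   # Alice's public exponent
d = pow(e, -1, phi)        # Alice's private exponent

m = 9726                   # Bob's message as an integer
c = pow(m, e, n)           # Bob encrypts with the public key (n, e)
recovered = pow(c, d, n)   # Alice decrypts with the private key (n, d)

print(n, phi, d, c, recovered)  # prints: 11413 11200 6597 5761 9726
```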
The RSA algorithm has remained a secure scheme for sending encrypted messages for almost 40 years, earning Rivest, Shamir, and Adleman the Association for Computing Machinery’s 2002 A.M. Turing Award, one of the highest honors in computer science. Currently, the only known way to completely break the RSA cryptosystem in use today (which is slightly more sophisticated than that described here) is to factor the modulus $n$. With the ability to recover the prime factors $p$ and $q$, an attacker can compute the secret exponent $d$ from the public key $(n, e)$. Once they have the secret exponent, the attacker can decrypt any message sent using the public key. What keeps RSA safe from such an attack is the fact that no polynomial-time algorithm for factoring large integers on a classical computer has been found yet. However, it also has not been proven that no such algorithm exists. As of 2010, the largest known number factored by a general-purpose factoring algorithm was 768 bits long, using a state-of-the-art distributed implementation. RSA keys are typically 1024 to 2048 bits long, though some experts believe that 1024-bit keys could be broken in the near future. It is generally believed that 4096-bit keys are unlikely to be broken in the foreseeable future, meaning that RSA should remain secure as long as $n$ is chosen to be sufficiently large. It is currently recommended that $n$ be at least 2048 bits long.
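To see why factoring the modulus breaks the system, here is a sketch of the attack on a toy key. It assumes the small public key $(n, e) = (11413, 3533)$, where $11413 = 101 \times 113$ is an assumed modulus consistent with the totient 11200 used in the example, and it factors $n$ by plain trial division. The function names are hypothetical, and `math.isqrt` and `pow(e, -1, phi)` need Python 3.8 or later.

```python
import math


def factor_semiprime(n: int):
    """Find p, q with p * q == n by trial division. Only feasible for tiny n."""
    for p in range(2, math.isqrt(n) + 1):
        if n % p == 0:
            return p, n // p
    raise ValueError("n has no nontrivial factor")


def recover_private_exponent(n: int, e: int) -> int:
    """Rebuild d from the public key alone, given that n can be factored."""
    p, q = factor_semiprime(n)
    phi = (p - 1) * (q - 1)
    return pow(e, -1, phi)


n, e = 11413, 3533          # toy public key (assumed modulus, see above)
d = recover_private_exponent(n, e)
intercepted = 5761          # ciphertext from the example
print(d, pow(intercepted, d, n))  # prints: 6597 9726
```

Trial division takes on the order of $\sqrt{n}$ steps, which is instant here but hopeless for a 2048-bit modulus; even the best known classical factoring algorithms remain super-polynomial in the bit length of $n$.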
In 1994, Peter Shor showed that a quantum computer could be used to factor a number in polynomial time, thus effectively breaking RSA. Stay tuned for my next article where we will look at Shor’s algorithm in depth!
Robinson, Sara. “Still Guarding Secrets after Years of Attacks, RSA Earns Accolades for its Founders.” SIAM News, Volume 36, Number 5, June 2003.
R. L. Rivest, A. Shamir, and L. Adleman. 1978. A method for obtaining digital signatures and public-key cryptosystems. Commun. ACM 21, 2 (February 1978), 120-126. DOI=10.1145/359340.359342 http://doi.acm.org/10.1145/359340.359342