Introduction

What is Cryptography?

Cryptography is the study of how to securely communicate a message even when a third party can intercept all transmissions. Plaintext is the original message to be sent. Encryption converts plaintext into ciphertext, which hides the original message. Finally, decryption converts ciphertext back into plaintext to reveal the initial message.

An important issue arises when this process happens on the Internet: how can we communicate securely when we must assume that eavesdroppers are able to intercept all communications, including the method of encryption? For example, it may be relatively difficult to crack a substitution cipher (a cipher in which each character is substituted for another) in a vacuum, but if the eavesdropper was able to intercept everything being sent, then they would also be able to intercept the cipher!

This problem is actually impossible to solve the way it is currently stated. If an eavesdropper has access to both the ciphertext and the method of encryption, they can simply encrypt every possible plaintext message until it matches the ciphertext. The goal is to make it computationally infeasible for the message to be intercepted. For example, trying trillions of possible messages is infeasible. However, smarter techniques can reverse engineer the plaintext input from the ciphertext, or at least narrow down the possibilities until trying each one becomes feasible.

This brings us to the concept of a trapdoor function. A trapdoor function is easy to calculate but difficult to find the inverse of, without some private secret. This is useful in cryptography: the recipient can generate a trapdoor function along with the private key needed to solve it, and only send out the details of the function. The sender can then use the function to encode plaintext into ciphertext, and the recipient can use their private key to invert it back into plaintext. The eavesdropper, however, does not know the private key, and so must manually find the function’s inverse, which takes far too long.

Why Not RSA?

RSA encryption is a popular method used today that uses the above principle to safely communicate messages. If you don’t know anything about it, there is another article, "Modular arithmetic and public-key cryptography," dedicated to explaining how it works. That article also introduces mods, so if even if you’re not interested in RSA, it will still be helpful to read the first two sections.

RSA still has not been cracked, but many advancements have made it more feasible to solve. The algorithm’s safety relies on the difficulty of tasks such as factoring large numbers, which is something many mathematicians have made important progress in solving. As a result, the keys used in RSA encryption must get progressively larger to deter attacks. While this maintains security, it makes encryption and decryption less convenient. Elliptic curves, however, don’t have algorithms that are as effective in solving them, thus allowing for smaller keys with the same level of protection and faster processing.

Elliptic Curves

So what exactly is an elliptic curve, and why is it so useful in cryptography?

An elliptic curve is the graph of the equation $$y^2=x^3+ax+b$$. A couple examples are displayed below:

We can define a group using the points on the elliptic curve. A group is a set of objects which are connected by some binary operation. In this case, we’ll define what it means to add points together. This operation must have certain properties:

• Closure: Performing the operation on any two elements of the set must yield another element of the set.

• Commutativity and Associativity: For a particular expression, no matter how you order the terms or perform the operations, the result must be the same.

• Identity: There must exist some identity 0 such that for all elements $$a$$ in the group, $$a+0=a$$.

• Inverse: For every element in the set $$a$$, there exists another element in in the group $$-a$$ such that $$a + -a = 0$$.

Let’s define this function. The identity is represented with $$0$$, which doesn’t represent a specific point on the curve but will still be important. For every point $$(x, y)$$, the inverse will be the point $$(x, -y)$$, or the point that lies opposite the x-axis on the curve. Finally, three points $$P$$, $$Q$$, and $$R$$ on the curve are collinear if and only if $$P+Q+R=0$$.

This equation allows us to define the sum of two points: $$P+Q=-R$$. This can be represented with the following process: draw the line $$PQ$$ on the curve, then find the point $$R$$ which is the third point where the line intersects the curve, and then flip it across the x-axis to get the point $$-R$$, which is the sum of the two points $$P$$ and $$Q$$!

The above definition works fine with a pen and paper, but how can a computer perform this addition? Firstly, let’s define the coordinates of the points $$P=(x_P, y_P)$$ and $$Q=(x_Q, y_Q)$$. We then define the slope $$m = \frac{y_P-y_Q}{x_P-x_Q}$$, and let $$k$$ be the $$y$$-intercept of the line passing through $$P$$ and $$Q$$ (which we don’t need to calculate for now). Then the line through $$P$$ and $$Q$$ has equation $$y = mx+k$$, and we want the third point which lies on this line and the elliptic curve. We can solve for it using the following process: $y^2=x^3+ax+b$ $(mx+k)^2=x^3+ax+b$ $x^3+ax+b-(mx+k)^2=0$ We can now use Vieta’s formulas to find the third root $$x_R$$. In a monic cubic, the sum of the roots is equal to $$-b$$, where $$b$$ is the coefficient of the $$x^2$$ term. Therefore, $x_P+x_Q+x_R=-(-m^2)$ $x_R=m^2-x_P-x_Q$ And once we find $$x_R$$, we can use our knowledge that this point lies on the line through $$P$$ and $$Q$$ to solve for $$y_R$$: $m=\frac{y_R-y_P}{x_R-x_P}$ $m(x_R-x_P)=y_R-y_P$ $y_R=y_P+m(x_R-x_P)$ Then, once we get the point $$R$$ that’s collinear with $$P$$ and $$Q$$, the answer to $$P+Q$$ is $$-R$$, or the point $$(x_R, -y_R)$$. These equations will work for almost every case. There is one exception, however: what if $$P=Q$$? First of all, we’re going to interpret the line through $$PQ$$ as the line tangent to the curve at point $$P$$ in this instance. And since the slope is now going to be the slope of the tangent line rather than the slope between $$P$$ and $$Q$$, we need a new formula for this case. $m=\frac{3x_P^2+a}{2y_P}$ For those who are familiar with calculus, you may recognize that this formula for $$m$$ corresponds with the first derivative of the function $$y=\pm \sqrt{x_P^3+ax_P+b}$$.

Scalar Multiplication

If we can successfully define an addition operation, then we can also define a scalar multiplication operation: for an integer $$n$$ and a point $$P$$ on an elliptic curve, the expression $$nP$$ is the result of adding $$P$$ to itself $$n$$ times.

How can we do this operation on a computer? One technique is to simply add $$P$$ to itself $$n$$ times. But as $$n$$ gets extremely large, this technique no longer becomes feasible. For example, $$n$$ can get as large as $$2^{512}$$, but adding a point to itself that number of times is far too difficult for any computer to do.

We can introduce a new algorithm to quickly perform scalar multiplication. The procedure looks something like this: for a point $$P$$, first calculate $$2P$$, then $$4P$$, $$8P$$, and so on, adding the result to itself at each each step to get the next result in the sequence. Then, for each power of 2, if that power of 2 shows up in the bit representation of the desired coefficient $$n$$, then we add it to a running total. By doing this, we ensure that the amount of time we take is proportional to the number of powers of 2 we must cycle through in order to get the result. To be more precise, the time complexity of this algorithm is $$O(\log_2n)$$.

Finite Fields

We can now apply a new concept to add another layer of complexity to the groups we have just defined. This particular section will use modular arithmetic to show how an elliptic curve can be limited to a finite field.

First, a finite field $$\mathbb{F}_p$$ is a group of the nonnegative integers less than $$p$$ that are connected by addition, subtraction, and multiplication in modular arithmetic. So for example, if we were dealing with $$\mathbb{F}_5$$ we could say that $$4\cdot3 = 12 \equiv 2 \pmod{5}$$.

Now we can define the elliptic curve over a finite field: for two nonnegative integers $$x$$ and $$y$$ both less than $$p$$, we say that the point $$(x, y)$$ lies on the elliptic curve if $$y^2 \equiv x^3+ax+b \pmod{p}$$. Now can we define addition over the finite field as well: we’ll still say that three points $$P$$, $$Q$$, and $$R$$ are collinear if they lie on the same line, but this time the line will be defined as the set of points satisfying $$ax+by+c \equiv 0 \pmod{p}$$, the form of a line in a field. The algebraic equations for finding the sum $$P+Q$$ will be very similar: we once again look for the point $$R$$ on the curve that’s collinear with $$P$$ and $$Q$$ using the following formulas: $x_R = (m^2 - x_P - x_Q)\pmod{p}$ $y_R = y_P+m(x_R-x_P)\pmod{p}$ Where if $$P \neq Q$$, then $m = \frac{y_P-y_Q}{x_P-x_Q}\pmod{p}$ And if $$P=Q$$, then $m=\frac{3x_P^2+a}{2y_P}\pmod{p}$ Note that in a finite field, division is defined as multiplying by the modular inverse rather than actually dividing. This is done to ensure that we remain within the field of integers.

Once we calculate the coordinates $$R=(x_R, y_R)$$, we can now flip it to get $$P+Q=-R=(x_R, -y_R)$$, reaching our answer.

For an elliptic curve taken on a finite field (aka an elliptic curve group), we will define the order of the elliptic curve group as the number of distinct points on the curve. Schoof’s algorithm was designed to calculate this quantity fast enough to justify using it. While the algorithm is too complex to lie within the scope of this article, just know that it exists and is able to calculate this useful quantity.

Cyclic Subgroups

In a finite field, the multiples of a given point $$P$$, called the base point, will eventually form an infinite loop. It’s important to note that the points that lie along that loop form a cyclic subgroup, because adding any two multiples of $$P$$ will yield another multiple of $$P$$ that also lies within the subgroup. We can define the order of that subgroup as the number of points in the subgroup, but we can also create an equivalent definition that may be easier to use: the order of a cyclic subgroup generated with the multiples of $$P$$ is the lowest number $$n$$ such that $$nP=0$$.

How do we find the order of a subgroup when that order is potentially very large? Well, we can make use of Lagrange’s theorem, which states that the order of a subgroup is always a divisor of the order of the group. If we know the order of the group, $$N$$ (which can be calculated with Schoof’s algorithm), then we can generate all the divisors of that number and individually test each possibility to find the smallest $$n$$ where $$nP=0$$.

Next, the goal is to find a base point that generates a cyclic subgroup with a large, prime order. With the right approach, we can actually pick the order we want, and then go about finding a base point that creates a subgroup with that order. We can make use of the cofactor of $$n$$, or the integer quantity $$\frac{N}{n}$$, which we’ll call $$h$$. For every point $$P$$, we know that $$NP=0$$, since the order of the cyclic subgroup generated by $$P$$ must divide $$N$$. We can express that statement as $$nhP=0$$, since $$h$$ is defined as $$\frac{N}{n}$$. Finally, this can be expressed as $$n(hP)=0$$, which can be true in two different ways: either $$hP=0$$ and the equation is trivially true, or $$hP$$ is a nonzero point, in which case we can use it as a base point to generate a cyclic subgroup of order $$n$$. The probability that $$hP=0$$ is sufficiently small enough that picking random points $$P$$ and checking if $$hP$$ is a valid base point for a cyclic subgroup of order $$n$$ will yield an answer very quickly.

The Discrete Logarithm Problem

After establishing the properties of this elliptic field, we can finally create a trapdoor function that’s easy to create but very difficult to solve: Given two points $$P$$ and $$Q$$, what’s the integer $$k$$ such that $$kP=Q$$? At the current moment, no classical computing algorithm can solve this problem in a feasible amount of time. This is a discrete logarithm problem: discrete because it relates to finite fields and cyclic subgroups, and logarithmic because it’s similar to the original logarithm, which determines how many times to repeat the operation of multiplication to get the result. What separates elliptic curve cryptography from other discrete logarithm problems is that it’s uniquely difficult to break, which allows for smaller keys with the same level of security, as well as faster computation.

Applications of Elliptic Curve Cryptography

Elliptic-curve Diffie-Hellman

Elliptic-curve Diffie-Hellman is a key-agreement protocol that allows two parties to agree on a key without an eavesdropper knowing that key. Once the two parties can achieve this, they can go on to communicate via some other encryption algorithm, like AES encryption, which requires a secret key that’s unknown to any eavesdroppers. The process goes as follows:

• Alice and Bob publicly agree on an elliptic curve field to perform their calculations on, as well as a base point $$P$$ which generates a cyclic subgroup with a large, prime order.

• Alice and Bob then each privately pick numbers $$a$$ and $$b$$, respectively. Alice calculates and publicly sends the point $$aP$$, and Bob calculates and publicly sends the point $$bP$$.

• Alice then multiplies the point $$bP$$ by $$a$$ to get $$abP$$. Bob does the same, multiplying the point $$aP$$ by $$b$$ to get $$abP$$.

Note that the value of the end result, $$abP$$, is completely random and unintentional. But if both parties have the same secret information, that’s enough to use it for a different form of encryption with, for example, the $$x$$-coordinate being used as the key for that encryption method.

Authenticating Signatures

Let’s say I want to transfer some bitcoin from my account to a friend’s account. I would create and send an instruction that initiates that transaction. But anybody else can also create these instructions, which would perform the transfer whether I want to or not. So we must introduce a signature, some confirmation that I am the person intending to make the transaction. If I have a private key that nobody else knows, then my signature would involve demonstrating my knowledge of the private key without revealing the true value of this key. This is a very tough problem, but we can solve it using elliptic curves.

Let’s first define a hash function $$hash(m, R)$$ that takes an integer $$m$$ and a point $$R$$ on an elliptic curve to produce a pseudorandom integer output. There are many necessary properties of good hash functions, but the main one that matters here is that trying to reverse engineer the input from the output of a hash function is computationally infeasible.

Let’s give each user a private key $$x$$ that only they know. We can also take a predetermined base point $$P$$ and calculate $$X = x \cdot P$$, which will be publicly available. We can publish $$X$$ without revealing $$x$$ because of the discrete logarithm problem having no known solution.

The user will first privately pick integers $$m$$ and $$r$$. Then they will calculate the point $$R = r \cdot P$$, and then calculate the integer $$s = hash(m, R) \cdot x + r$$. Notice that the following equalities all hold true: $hash(m, r \cdot P) \cdot x \cdot P + r \cdot P = (hash(m, r \cdot P) \cdot x+r) \cdot P$ $hash(m, R) \cdot X + R = (hash(m, R) \cdot x+r) \cdot P$ $hash(m, R) \cdot X + R = s \cdot P$ So the signature works in the following manner: if the user really does have access to the private key $$x$$, then they will easily be able to generate $$m$$, $$R$$, and $$s$$ which satisfy this final equation using the above process. Even though nobody else knows the value of $$x$$, they can use the public point $$X$$ to confirm that the generated solution does indeed work.

The security of this method has two parts to it: first, nobody else should be able to feasibly generate this working solution with only knowledge of $$X$$, and second, publicly releasing this solution $$m$$, $$R$$, and $$s$$ should not reveal anything about the original value of $$x$$.

The first part relies on the security of both the elliptic curve discrete logarithm and the hash function. This prevents a hacker from randomly picking $$m$$ and $$R$$, and trying to reverse engineer some value $$s$$ such that $$s \cdot P = hash(m, R) \cdot X+R$$. It also prevents them from randomly selecting $$s$$ and $$R$$, then trying to find an $$m$$ which satisfies $$hash(m, R) \cdot X+R=s \cdot P$$.

The second part uses the security of the discrete logarithm problem. Note that $$m$$ and $$R$$ are completely random, and so they reveal nothing about $$x$$. We have that $$s = hash(m, R) \cdot x+r$$, so $$x = \frac{s-r}{hash(m, R)})$$. In order to evaluate that expression, we must know the value of $$r$$, but that requires being able to solve the discrete logarithm problem for the given point $$R$$. In fact, Bitcoin and Ethereum both use this algorithm to facilitate secure transactions, in which a user requesting a transfer of cryptocurrency from them to another user can confirm that they are indeed the one requesting this transfer.

Citations

1. Corbellini, Andrea. “Elliptic Curve Cryptography: a Gentle Introduction,” May 17, 2015. https://andrea.corbellini.name/2015/05/17/elliptic-curve-cryptography-a-gentle-introduction/.

2. Knutson, Hans. “What Is the Math behind Elliptic Curve Cryptography?” Hacker Noon, April 16, 2018. https://hackernoon.com/what-is-the-math-behind-elliptic-curve-cryptography-f61b25253da3.

3. Silverman, Joseph H. An Introduction to the Theory of Elliptic Curves. Laramie, Wyoming: University of Wyoming, 2006.
https://www.math.brown.edu/johsilve/Presentations/WyomingEllipticCurve.pdf.