A Primer on Elliptic Curve Cryptography

Introduction

If you have used Bitcoin, Ethereum or any other cryptocurrency you would be familiar with public and private keys and if you are a more savvy person you might have also heard about ECC(Elliptic Curve Cryptography) or ECDSA(Elliptic Curve Digital Signature Algorithm). Today we will find out what exactly ECC is and how and why it’s used in case of Bitcoin and maybe other cryptocurrencies.

Brief History on Public-Key Cryptography

Public key cryptography is not new, it’s been around for a long time. The first public key cryptography algorithm was Diffie-Hellman. ECC is just one way of doing public-key cryptography. It was invented by two guys named Whitfield Diffie and Martin Hellman. Apparently Hellman was obsessed with the problem of how do I send my friend an encrypted message? Well it’s easy to send the encrypted message but how do I send the decryption key to the encrypted message. I could encrypt the decryption key, but then how do I send him the key to decrypt the decryption key? As you can see this quickly becomes a recursive problem that never ends. Hellman then teamed up with Diffie and went across the country to meet him. After approximately 4 years, in 1976 they came up with the Diffie–Hellman key exchange algorithm. So Diffie-Hellman is first of it’s kind. There are several others now, to name a few there are ElGamal, Paillier cryptosystem and RSA.

Fun fact: You use the RSA algorithm in your daily life because the HTTPS protocol uses RSA encryption. Without HTTPS your WhatsApp conversations wouldn’t be secure(not that they are secure to begin with because your data is getting zucked by Facebook, but at least people near you can’t see the data in clear text) and web security in general would be a security nightmare.

Properties of Public-Key Cryptography Systems

Why does Bitcoin use asymmetric-key cryptography? Well, asymmetric keys offer 2 properties:

Public and private keys - unlike symmetric-key cryptography you have a private and a public key - the public key is used to identify you publicly and the private key, as the name implies is kept safe. If your private key is lost or compromised then the attacker has full control over you funds. Also note that it’s computationally unfeasible to work your way back to a private key given a public key. Functions that have this characteristic, easy to compute one way but hard to compute the other way, are called Trapdoor functions. More formally, given a function f(x) it’s hard to find f^’(x).
Sign and verify messages - this lets you prove that you are the owner of the corresponding public key without actually giving away the private key. This lets you sign transactions which is then verified by the miners that the transaction actually came from you.

Okay enough talk, let’s dive into the mathematics of elliptic curves.

Elliptic Curves

Elliptic curves generally have a mathematical form of y² = x³ + ax + b. Your public key is just a point on this curve. In case of Bitcoin, it uses a standard called secp256k1 in which a=0 and b=7 so the equation ends up as y² = x³ + 7. See Bitcoin wiki for more info.

Elliptic curves are symmetric along their x-axis - this means that given point x on the x-axis, the corresponding points on the y-axis are y and -y. This property is smartly used in Bitcoin addresses to save transaction fees. Originally Bitcoin used “uncompressed” public keys in which your public key is both the x and y coordinate of a point. But later, using this property, it introduced a new “compressed” key format that lets you save data (as the public key is reduced to half of it’s size by omitting the y coordinate) and thus save transaction fees. See bitcoin-book on how compressed keys are represented.

Image showing different elliptic curves.

Elliptic Curves over a Finite Field:

Bitcoin public keys are computed by multiplying a base point, usually denoted by G, that is known to everybody times your private key (a 256-bit number). When you multiply a point P on the curve by a number n, it is not always guaranteed to be less than 2²⁵⁶. Why should it be less than 2²⁵⁶ you ask? Well there are always standards, or more like limits, on how big the number can be (depending on the use-case). In case of Bitcoin and most other use cases the limit is 2²⁵⁶. Bitcoin’s public keys are 512 bits long (256 for the x-coordinate and 256 for the y-coordinate).

Note: I’m only talking about the points associated with an uncompressed public key, the actual public keys are more than 512 bits long, this is because you encode some information to the key along with the points.

So in order for the public keys to fit the 512 bit length obviously each coordinate should be less than or equal to 2²⁵⁶. So how do we ensure that the computed number is always less than 2²⁵⁶? The answer is that we compute the Elliptic Curve over a finite field. Finite Fields are beyond the scope of this article so if you don’t know what finite fields are then this would be a good place to start. To be frank even I don’t know finite fields to it’s core, but I got the basics down. Calculating Elliptic Curves over a finite field would make sure that the computed number is always less then 2²⁵⁶. A naive explanation of finite fields would be that we take the mod of the result computed with respect to a number p. This number p is usually prime. In case of secp256k1 p is the largest prime number that is less than 2²⁵⁶.

Point Operations on Elliptic Curve

Point Addition

Lets say you have 2 points P and Q on the elliptic curve E and you need to add these two points. How do you add two points on the elliptic curve? It turns out that you need to find a line L that’s passing through these 2 points. Once you have done that, you have to find another point R at which the line L intersects the curve, usually denoted by E. You then reflect that point R about x-axis, i.e flip the sign of y-coordinate of the point. This point is our 3rd point R that we get by adding the 2 points P and Q. Lets visually see what’s going on. Images are taken from this hackernoon article.

point-addition

Mathematically adding is given as follows: (taken from wikipedia)

slope = (Q[y] – P[y]) / (Q[x] – P[x]) mod p # finite fields
R[x] = slope² – P[x] – Q[x] mod p
R[y] = slope * (P[x] – R[x]) – P[y] mod p

If you were paying attention you might have noticed something. Given 2 points and a line L which passes through those 2 points, it is not always guaranteed that the line L will intersect the curve at a 3rd point. Now that you know that it will not always intersect the curve at a 3rd point, I’d like you to think of a point at which this will happen. Did you find out which point it is? Yeah it’s the point of inflection (I’m not entirely sure if it’s called the point of inflection, so I just included an image with the point highlighted). There is also another image showing the tangent line at that point, as you can see the slope is infinity and the line always stays parallel to the y-axis hence it doesn’t intersect the curve at any other point (graphs are made using desmos).

So here are few properties of the elliptic curve that you might find useful later:

A non-vertical line intersecting two non-tangent points on the curve will always intersect the curve at a third point.
A non-vertical line tangent to the curve at a point will intersect precisely one other point on the curve.

Note: The lines have to be non-vertical for the reasons mentioned above.

Point Doubling

Adding 2 points isn’t always the case. Sometimes you need to add a point to itself. In that case we can’t use the above formula because we can’t find the slope of a line using just one point and moreover there are infinite lines passing through a single point P, so which one do we choose? Well we take the tangent line at point P. Why the tangent line you ask? Well you can still think of it as 2 points and that point Q is approaching P (infinitesimally closer to P) and now the line becomes the tangent line at point P. If you have taken any calculus class before you might be familiar that the slope of an equation at any given point is given by it’s first derivative. In this case the first derivative of y² = x³ + ax + b is:

$$\frac{\mathrm{d}y}{\mathrm{d}x} = \frac{(3x^2 + a)}{2y}$$

Then the equation of the tangent line is given by:

$$y = mx + c \text{, where } m = \frac{(3x^2 + a)}{2y}$$

Now that we have the equation to the line, we substitute it in the equation of the curve to find where they coincide, note that x1 and y1 are coordinates of point P.

$$[m(x - x1) + y1]^2 = x^3 + ax + b \text{, where } m = \frac{(3x^2 + a)}{2y}$$

You then solve the equation to get point R. Now, let’s visualize what’s happening: (images are taken from this hackernoon article).

point-doubling

Mathematically adding is given as follows: (taken from wikipedia)

slope = (3 P[x]² + a) / (2 P[y]) mod p
R[x] = slope² – 2P[x] mod p
R[y] = slope * (P[x] – R[x]) – P[y] mod p

Point Multiplication

Okay now how do we multiply a point with a scalar? Let’s say you want to compute n * P, where n is a scalar. A naive approach would be to add that point to itself using the point addition algorithm n number of times. But in real life applications these numbers are humungously large so it’s not very practical. Turns out there are multiple methods to do this, see wikipedia for all the different methods, in this article we focus on the double-and-add method. The algorithm for double-and-add is as follows: (taken from wikipedia)

def f(P, d):
    if d == 0:
        return 0                         # computation complete
    elif d == 1:
        return P
    elif d % 2 == 1:
        return point_add(P, f(P, d - 1)) # addition when d is odd
    else:
        return f(point_double(P), d/2)   # doubling when d is even

There is another implementation, of the same algorithm, on the wikipedia page, but I find it a little confusing. If you are a computer science student you might recognize that this algorithm is analogous to the binary exponentiation algorithm. Turns out this is exactly the same, only that we are doubling and adding instead of multiplying and squaring.

Let’s break it down a little more with an example. Let’s say you want to compute 10*P. We can approach the problem in 2 ways:

you could add P to itself 10 times - P + (P + (P + (P + (P + (P + (P + (P + (P + P)))))))) or
another approach would be to calculate it this way

2P = P + P
4P = 2P + 2P
8P = 4P + 4P
10P = 2P + 8P

Clearly the 2nd approach is far better as it only uses the point addition algorithm 4 times compared to 10 times in the first approach. Just to make thing clear here’s another example. In this example we calculate 9*P, this time it’s more like the original algorithm:

9 = 1001 (in binary - just means 8 + 1)
R = 9 P
R = P + 8P
R = P + 2(4P)
R = P + 2(2(2P))

# we start calculating 9P here
R = P + 2(2(P + P))
R = P + 2(2P + 2P)
R = P + 4P + 4P

As you can see computing 9*P again only requires 4 point additions compared to 9 if we were to use the naive approach.

Fun fact that you don’t need to know and is totally irrelevant: I kind of came up with this on my own once I got to know that adding point P to itself n times is not practical (most probably because I was already familiar with the binary exponentiation algorithm). Yeah, feels a little validating xD.

Finally here is the python code for the above operations that I copied from stackoverflow and fixed it.

Now that we got the basics down, we can then go on and explore how digital signatures work. In the next article we will explore how Digital Signatures works. At it’s core it’s basically “tell me that you know a number without actually telling me the number”(if you get the reference, lol).

This is my first article so I don’t know how it came out. I took a ton of ~~copied~~ inspiration from this hackernoon article, you should check it out though he did an amazing job at explaining Elliptic Curve Cryptography. Thanks for reading.