Discrete math for cryptography#

Note: I wrote this for UBC’s CPSC 436S (cybersecurity) course when I got to TA it in 2024. It was put here to test the MathML-generating abilities of the script that compiles this site. Since it also works as a general page about the basics of cryptography-related math, it lives here now.

Unfortunately, the UBC computer science curriculum doesn’t include much discrete math, apart from the intro in CPSC 121. This page will try to fill in the gap—it will present much more math than is required for CPSC 436S, both for your understanding of asymmetric cryptography, and hopefully to also show that the math itself is interesting.

Computational hardness (as motivation)#

Modern cryptography depends on the existence of one-way functions. Informally speaking, these are functions where finding the output from an input is easy (can be done in polynomial time), but finding the input given an output is hard.

You might find it concerning that no one knows if one-way functions actually exist. If someone did prove they exist, then we would know P != NP. We do the next best thing by using things that look like one-way functions and praying. The two hopefully-one-way functions we cover in class are:

Factoring: If a number has large prime factors, there is no quick way to find them. But given a list of factors, we can find their product in polynomial time.
The discrete logarithm problem: Given $g$ and $x$ in a finite group, finding an exponent $a$ such that $g^{a} = x$ can be hard, whereas finding $x$ given $g$ and $a$ is easy.

The above terminology will need some explaining. Along with relevant definitions and notation, this page offers:

An explanation of these hard problems
Properties of abstract mathematical objects used in cryptography
Properties of the not-abstract integers and modular arithmetic, also used in cryptography

Primes and factoring#

Recall that a prime number is an integer greater than 1 which can’t be factored further; its only divisors are one and itself.

For any number, we can get its list of prime factors by repeated division. Let $f$ be a function that does this, which takes a positive integer and produces an orderless array of its prime factors, so $f(12) = f(2^{2} \cdot 3) = \{2, 2, 3\}$.

It is clear that every list of prime numbers corresponds to exactly one product (multiply them all together and you’ll get the same result every time). Notably, the reverse is also true: every number corresponds to exactly one unordered list of factors. This uniqueness of a number’s prime factorization is known as the fundamental theorem of arithmetic. If you are familiar with the term, we can say our function $f$ is a bijection between the positive integers and the set of unordered lists of primes.

Every whole positive number* $x$ is either prime (i.e. $f(x)$ has length 1) or composite ($f(x)$ has length $\geq 2$). Two numbers are coprime or relatively prime if they share no prime factors at all (equivalently, their greatest common divisor is 1).

*When we’re talking about multiplication, 1 is more of a no-op than a number, so it doesn’t count as a number for this sentence. It is neither composite nor prime; you can think of $f(1)$ as having length 0. More on 1 being the identity of multiplication later.
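The function $f$ described above can be sketched with plain trial division. This is only an illustration: the name `prime_factors` is made up for this page, a sorted Python list stands in for the unordered list, and the running time grows with the square root of the input, which is exactly why this approach is hopeless for numbers with large prime factors.

```python
def prime_factors(n):
    """Return the prime factors of n, with repetition, by repeated division."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    factors = []
    d = 2
    while n >= d * d:
        while n % d == 0:      # divide out d as many times as it appears
            factors.append(d)
            n //= d
        d += 1
    if n > 1:                  # whatever remains is itself prime
        factors.append(n)
    return factors

print(prime_factors(12))  # [2, 2, 3]
print(prime_factors(1))   # []  (1 has no prime factors)
```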

To do anything interesting with our numbers, we need to define the rules they should follow.

Fields#

Roughly speaking, a field is a set where math ( $+, -, \times, \div$ ) works in the usual way. Precisely, a field is a set $S$ along with two functions $+$ ("addition") and $\cdot$ ("multiplication") where the following rules hold:

0. Closure: The given operations on elements of $S$ should never "escape" the set. $a + b$ is in $S$ and $a \cdot b$ is in $S$ for any $a$ and $b$ in $S$.
1. Distributive property: $\cdot$ distributes over $+$, so $a \cdot (b + c) = a \cdot b + a \cdot c$.
2. Addition is associative: $(a + b) + c = a + (b + c)$ for every choice of $a, b, c$ in $S$.
3. Addition is commutative: $a + b = b + a$ for every $a, b$ in $S$.
4. Additive identity: There is some element (call it $I_{+}$) in $S$ where adding it is a no-op. That is, $a + I_{+} = a$ for any $a$ in $S$. You may recognize $I_{+}$ to be 0 in familiar fields.
5. Additive inverse: For any $a$ in $S$, there is an "inverse" element $-a$ such that adding it brings you back to the identity: $a + (-a) = I_{+}$.
6. Multiplication is also associative.
7. Multiplication is also commutative.
8. Multiplicative identity: There is some $I$ in $S$ such that $I \cdot a = a$ for any $a$ in $S$. You may recognize this to be 1 in familiar fields.
9. Multiplicative inverse: For any $a$ in $S$ with the exception of $a = I_{+}$, there is an "inverse" element $a^{-1}$ in $S$ such that $a \cdot a^{-1} = I$.

Usually these rules are combined for brevity, but we’ll keep the list long so we can choose ones to cross out later. If all of these are true for your choices of $S, +, \cdot$ , then you have yourself a field!

Exercises

Is $ℤ = \{\dots, -1, 0, 1, 2, \dots\}$ (with normal addition and multiplication) a field?

No.

Which rule does it break from above?

Number 9. There is no integer $x$ such that $3 \cdot x = 1$, so 3 has no multiplicative inverse in $ℤ$.

Is $ℚ$ (the rational numbers) under normal addition and multiplication a field?

Yes.

Finite fields and modular arithmetic#

By defining all the rules we needed to do arithmetic, we can find another field to do arithmetic in. The field we define next will also get us closer to a real explanation of the discrete log problem mentioned above.

Sort the set of all integers into 5 buckets (congruence classes) depending on the result of taking the integer modulo 5. We can denote $[1]$ to mean the congruence class 1 goes in, so $[1] = [6] = [- 4]$ and

$[0] = \{\dots, -5, 0, 5, 10, \dots\}$

$[1] = \{\dots, -4, 1, 6, 11, \dots\}$

$[2] = \{\dots, -3, 2, 7, 12, \dots\}$

$[3] = \{\dots, -2, 3, 8, 13, \dots\}$

$[4] = \{\dots, -1, 4, 9, 14, \dots\}$

Then by defining addition as $[a] + [b] = [a + b]$ and multiplication as $[a] \cdot [b] = [a \cdot b]$ , all rules hold! (proof of 0-8 left as an exercise to the reader).

The least intuitive thing here is finding the multiplicative inverse of each congruence class. If we were dealing with the rational numbers, finding this inverse would be as easy as swapping the numerator and denominator. In $ℤ_{5} = \{[0], [1], [2], [3], [4]\}$, the multiplicative identity is $[1]$, and we can see $[1] \cdot [1] = [1]$, $[2] \cdot [3] = [6] = [1]$, and $[4] \cdot [4] = [16] = [1]$, so we know ${[1]}^{-1} = [1]$, ${[2]}^{-1} = [3]$, ${[3]}^{-1} = [2]$, and ${[4]}^{-1} = [4]$. So everything but $[0]$ is invertible. This means $ℤ_{5}$ along with modular addition and multiplication is indeed a field.

Side note

Finding multiplicative inverses in larger fields can be done in polynomial time using the extended Euclidean algorithm. If you’re using Python 3.8 or newer, pow(x, -1, p) will find the inverse of x mod p.
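For the curious, here is a minimal sketch of how the extended Euclidean algorithm produces an inverse. The function names `extended_gcd` and `mod_inverse` are invented for this example and are not from any library. The idea is to find integers $s$ and $t$ with $s \cdot x + t \cdot n = \gcd(x, n)$; when the gcd is 1, $s$ is the inverse of $x$ modulo $n$.

```python
def extended_gcd(a, b):
    """Return (g, s, t) such that g = gcd(a, b) and s*a + t*b == g."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t

def mod_inverse(x, n):
    """Return the inverse of x modulo n, or raise if x is not invertible."""
    g, s, _ = extended_gcd(x % n, n)
    if g != 1:
        raise ValueError(f"{x} has no inverse modulo {n}")
    return s % n   # s may be negative, so bring it into the range 0..n-1

print(mod_inverse(2, 5))  # 3
print(mod_inverse(3, 7))  # 5
```

Calling `mod_inverse(2, 10)` raises an error, since 2 and 10 share the factor 2.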

Exercises

What are all the multiplicative inverses in $ℤ_{7}$?

${[1]}^{-1} = [1]$

${[2]}^{-1} = [4]$ and ${[4]}^{-1} = [2]$

${[3]}^{-1} = [5]$ and ${[5]}^{-1} = [3]$

${[6]}^{-1} = [6]$ (since ${[-1]}^{2} = [1]$)

What are all the additive inverses in $ℤ_{7}$?

$[-0] = [0]$

$[-1] = [6]$ and $[-6] = [1]$

$[-2] = [5]$ and $[-5] = [2]$

$[-3] = [4]$ and $[-4] = [3]$

(rhetorical question) The elements of $ℤ_{n}$ seem to pair up very nicely; every element has a unique inverse. Are the additive and multiplicative inverses always like this?

Yes: whenever an inverse exists, it is unique. This is a property we get as a reward for defining rules.

Another fun property of fields: given a natural number $n$, there is either exactly one finite field with $n$ elements (having order $n$) or none at all, where fields that differ only by a relabeling of their elements count as the same. This is not true of other mathematical objects; for example, there can be many dissimilar groups of the same order. The finite field of size $n$ exists if and only if $n$ is a prime power: $n = p^{k}$ for some prime $p$ and positive integer $k$. Since there is only one finite field of a given order, we call it "the Galois field of order $p^{k}$," or $GF(p^{k})$. Simple modular math allows us to understand fields where $k = 1$.

When we do math in $ℤ_{n} = \{[0], [1], \dots, [n - 1]\}$ with non-prime $n$, we run into issues with invertibility. Take $ℤ_{10}$ and attempt to find the multiplicative inverse of $[2]$: no matter what, $[2]$ makes the product even, so getting to $[1]$ is impossible. So $ℤ_{10}$ isn’t a field, but the remaining properties are worth noting.
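To see the difference between a prime and a non-prime modulus concretely, the following sketch searches exhaustively for inverses. The helper name `inverses_mod` is made up for this page, and the brute-force search is only sensible for tiny $n$.

```python
def inverses_mod(n):
    """Map each invertible a in 1..n-1 to its inverse, found by exhaustive search."""
    table = {}
    for a in range(1, n):
        for b in range(1, n):
            if (a * b) % n == 1:
                table[a] = b
                break
    return table

print(inverses_mod(5))   # {1: 1, 2: 3, 3: 2, 4: 4}
print(inverses_mod(10))  # {1: 1, 3: 7, 7: 3, 9: 9}
```

Modulo 5, every nonzero class appears in the table. Modulo 10, the classes 2, 4, 5, 6, and 8 are missing because no multiple of them lands on 1.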

Rings#

A ring is a set $S$ and two operations ( $+$ and $\cdot$ ) where the following rules hold:

0. Closure: The given operations on elements of $S$ should never "escape" the set. $a + b$ is in $S$ and $a \cdot b$ is in $S$ for any $a$ and $b$ in $S$.
1. Distributive property: $\cdot$ distributes over $+$, so $a \cdot (b + c) = a \cdot b + a \cdot c$.
2. Addition is associative: $(a + b) + c = a + (b + c)$ for every choice of $a, b, c$ in $S$.
3. Addition is commutative: $a + b = b + a$ for every $a, b$ in $S$.
4. Additive identity: There is some element (call it $I_{+}$) in $S$ where adding it is a no-op. That is, $a + I_{+} = a$ for any $a$ in $S$.
5. Additive inverse: For any $a$ in $S$, there is an "inverse" element $-a$ such that adding it brings you back to the identity: $a + (-a) = I_{+}$.
6. Multiplication is associative.
7. Multiplication is commutative. Note: since we’re talking about modular arithmetic, this will be true anyway, but we don’t need commutative multiplication in a ring (for example, the set of 2x2 matrices with integer entries forms a non-commutative ring).
8. Multiplicative identity: There is some $I$ in $S$ such that $I \cdot a = a$ for any $a$ in $S$.
9. (Crossed out: not required in a ring) Multiplicative inverse: For any $a$ in $S$ with the exception of $a = I_{+}$, there is an "inverse" element $a^{-1}$ in $S$ such that $a \cdot a^{-1} = I$.

Exercises

Is $ℤ = \{\dots, -1, 0, 1, 2, \dots\}$ (with normal addition and multiplication) a ring?

Yes.

Which elements of $ℤ_{10}$ have no multiplicative inverse?

$[0]$ (of course), $[2]$, $[4]$, $[6]$, $[8]$, and $[5]$.

If we choose a non-prime $n$ for $ℤ_{n} = \{[0], [1], \dots, [n - 1]\}$, we don’t get a field, but we still get a ring as a participation medal.

More modular arithmetic: Euler’s Totient#

You might notice something in common between the troublesome elements of $ℤ_{10}$ with no multiplicative inverse: they share common factors with 10. The invertible elements, on the other hand, are coprime to 10. This count of coprime-to- $n$ numbers that are smaller than $n$ is denoted by $φ (n)$ , where $φ$ is Euler’s totient function. This function is quick to calculate, but only if we know the factors of $n$ .

Exercises

What is $φ(p)$ for some prime $p$?

The only numbers that share factors with $p$ are multiples of $p$, which are too big to be counted by the totient. Hence $φ(p) = p - 1$.

What is $φ(p^{k})$ for some prime $p$ and $k \geq 1$? (Hint: how many multiples of $p$ are less than $p^{k}$?)

$φ(p^{k}) = p^{k} - p^{k - 1}$

What is $φ(pq)$ where $p$ and $q$ are different primes?

There are $pq - 1$ numbers less than $pq$. $p - 1$ of these are multiples of $q$ (so $q$, $2q$, ..., $(p - 1)q$), and similarly, $q - 1$ multiples of $p$. They are coprime, so no overlap between their multiples until $pq$. So $φ(pq) = (pq - 1) - (p - 1) - (q - 1) = (p - 1)(q - 1)$.
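The three totient formulas from the exercises can be sanity-checked by counting directly. This sketch follows the page's definition of $φ(n)$ (count the positive integers smaller than $n$ that are coprime to $n$), so it is meant for $n \geq 2$; the function name `phi` is just a label chosen here, and the brute-force count is far too slow for cryptographic sizes.

```python
from math import gcd

def phi(n):
    """Count the integers in 1..n-1 that are coprime to n (brute force, n >= 2)."""
    return sum(1 for a in range(1, n) if gcd(a, n) == 1)

print(phi(7))    # 6  = 7 - 1
print(phi(27))   # 18 = 3**3 - 3**2
print(phi(15))   # 8  = (3 - 1) * (5 - 1)
print(phi(10))   # 4
```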

By removing the non-invertible numbers (including 0) from $ℤ_{n}$ , we can reinstate rule 9, but we need to ignore addition, or else we could produce a sum that is one of our removed elements (breaking rule 0). Our result $ℤ_{n}^{*}$ has $φ (n)$ elements and is something called a group.

Groups#

A group is a set $S$ with one operation, $\cdot$ , ("multiplication") where the following rules hold:

0. Closure: The given operation on elements of $S$ should never "escape" the set. $a \cdot b$ is in $S$ for any $a$ and $b$ in $S$.
1. (Crossed out) Distributive property: $\cdot$ distributes over $+$, so $a \cdot (b + c) = a \cdot b + a \cdot c$.
2. (Crossed out) Addition is associative: $(a + b) + c = a + (b + c)$ for every choice of $a, b, c$ in $S$.
3. (Crossed out) Addition is commutative: $a + b = b + a$ for every $a, b$ in $S$.
4. (Crossed out) Additive identity: There is some element (call it $I_{+}$) in $S$ where adding it is a no-op. That is, $a + I_{+} = a$ for any $a$ in $S$.
5. (Crossed out) Additive inverse: For any $a$ in $S$, there is an "inverse" element $-a$ such that adding it brings you back to the identity: $a + (-a) = I_{+}$.
6. Multiplication is associative.
7. Multiplication is commutative. Note: as with rings, we don’t need commutativity for a group, but since we’re talking about integers modulo $n$, our multiplication is commutative. Commutative groups are usually called abelian. Usually when you see an introduction to groups, they’ll provide a non-abelian example like the group of rotations and reflections of a polygon ($D_{n}$), the 2x2 matrices with real-number entries and nonzero determinant ($GL(2, ℝ)$), the group of moves on the Rubik’s cube, etc.
8. Multiplicative identity: There is some $I$ in $S$ such that $I \cdot a = a$ for any $a$ in $S$.
9. Multiplicative inverse: For any $a$ in $S$, there is an "inverse" element $a^{-1}$ in $S$ such that $a \cdot a^{-1} = I$. (There is no $I_{+}$ left to make an exception for.)

We call $ℤ_{n}^{*}$ "the multiplicative group of integers modulo $n$ ." It is a handy piece of notation that specifies both the operation and the set we’re working with. Our example $ℤ_{10}^{*}$ would contain $φ (10) = 4$ elements: $[1], [3], [7]$ , and $[9]$ .

Exercises

We named our operations arbitrarily and didn’t specify what they did, other than requiring an identity and an inverse. Would using rules 0, 2, 4, and 5 (instead of 0, 6, 8, 9) also give us a group?

Yes! The field rules give us two copies of the group rules, one using addition, and one using multiplication over the nonzero elements. A short definition of a field is a set that is an abelian group under $+$, where nonzero elements are an abelian group under $\cdot$, and the distributive law holds, which would be a pretty unhelpful definition to put at the top of this page.

The discrete log problem#

Fix $g$ and $x$ in some group. What integer $a$ makes $g^{a} = x$ ? In the group $ℤ_{p}^{*}$ , there is no easy way to find $a$ (assuming $p$ was chosen well).
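To make the problem concrete, here is a brute-force search for the discrete log in $ℤ_{p}^{*}$. The function name is invented for this example. It tries every exponent in turn, so its running time is proportional to $p$, which is exponential in the number of bits of $p$; that is fine for $p = 17$ and useless for a realistically sized prime.

```python
def discrete_log_bruteforce(g, x, p):
    """Return the smallest a >= 1 with g**a % p == x % p, or None if there is none."""
    target = x % p
    value = 1
    for a in range(1, p):
        value = (value * g) % p    # value is now g**a mod p
        if value == target:
            return a
    return None

print(discrete_log_bruteforce(3, 13, 17))  # 4, since 3**4 = 81 = 13 (mod 17)
print(discrete_log_bruteforce(2, 3, 17))   # None: 3 is not a power of 2 mod 17
```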

Diffie-Hellman key exchange works in any group where finding the discrete log is hard. In 436S, we only discuss it using the group $ℤ_{p}^{*}$ , but elliptic curve Diffie-Hellman uses a group obtained from points on an elliptic curve in a finite field. RSA operates in the group of integers mod $p q$ under multiplication (so $ℤ_{p q}^{*}$ ).
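As a rough illustration of the Diffie-Hellman exchange in $ℤ_{p}^{*}$, the sketch below uses toy values ($p = 17$, $g = 3$, and fixed secret exponents). The names Alice and Bob, the function name, and all the numbers are assumptions made for this example; a real exchange would use a very large prime and randomly chosen secrets.

```python
def diffie_hellman_demo(p, g, a, b):
    """Run a toy Diffie-Hellman exchange and return (A, B, alice_key, bob_key)."""
    A = pow(g, a, p)            # Alice publishes g^a
    B = pow(g, b, p)            # Bob publishes g^b
    alice_key = pow(B, a, p)    # Alice computes (g^b)^a
    bob_key = pow(A, b, p)      # Bob computes (g^a)^b
    return A, B, alice_key, bob_key

print(diffie_hellman_demo(17, 3, 5, 11))  # (5, 7, 11, 11)
```

Both sides end up with the same key because $(g^{b})^{a} = (g^{a})^{b}$. An eavesdropper sees only $g$, $p$, $A$, and $B$, and would need to solve a discrete log to recover $a$ or $b$.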

Exercises

Let $p$ be prime and call the additive group of integers modulo $p$ $ℤ_{p}^{+}$. Let $g$ and $x$ be elements of it. Why is finding an integer $a$ such that $g^{a} = x$ NOT difficult (where exponentiation is repeated addition in this case)?

$ℤ_{p}$ is also a field. We can just find the multiplicative inverse of $g$ in the field, which is quick to do, and then pick $a = g^{-1} x$.

Generating a group#

A cyclic group has an additional rule: There must be some element in the set (usually given the letter $g$) such that $\{g, g^{2}, g^{3}, \dots\} = \{g, (g \cdot g), (g \cdot g \cdot g), \dots\} = S$. We say $g$ generates the whole group. Angle brackets denote the set of every possible combination (using $\cdot$) of the items inside, where $⟨ x, y, z ⟩$ is read as "the group generated by $x$, $y$, and $z$." If a group $G$ is cyclic, then $G = ⟨ g ⟩$ for some element $g$.

Groups in general are not necessarily cyclic. For example, we can make a group of 2-bit binary numbers with XOR as the "multiplication". This is a group, but no 2-bit binary number generates all of them through repeated XORs: $⟨ 00 ⟩ = \{00\}$, $⟨ 01 ⟩ = \{01, 00\}$, $⟨ 10 ⟩ = \{10, 00\}$, $⟨ 11 ⟩ = \{00, 11\}$.
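The XOR example is small enough to check by machine. In this sketch the 2-bit numbers are represented as the Python integers 0 to 3, and `generated_by_xor` is a name made up for this page.

```python
def generated_by_xor(x):
    """Return the set {x, x^x, x^x^x, ...} where ^ is bitwise XOR."""
    seen = set()
    value = x
    while value not in seen:
        seen.add(value)
        value ^= x      # "multiply" by x once more
    return seen

for x in range(4):
    members = sorted(format(v, "02b") for v in generated_by_xor(x))
    print(format(x, "02b"), "generates", members)
```

No element generates more than two of the four group members, so the group is not cyclic.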

Back to our multiplicative groups over the integers, we may find ourselves wanting a generator of $ℤ_{n}^{*}$ , (e.g. for the Diffie-Hellman parameter $g$ ). In this particular kind of group, a generator is called a primitive root modulo $n$ (but we’ll stick with "generator" for now). Interestingly:

What we’re looking for might not exist, meaning the group $ℤ_{n}^{*}$ is not always cyclic. Note that we’re safe with prime $n$. If you’re curious, the group is cyclic if and only if $n$ is one of $1, 2, 4, p^{k}$, or $2 p^{k}$ where $p$ is an odd prime and $k > 0$.
There is no general formula to find a generator given $n$, but if we know $ℤ_{n}^{*}$ is cyclic, there happen to be many possible generators, so guessing $g$ has a good chance of success, and checking that it generates everything can be done quickly using theorems described below.
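One standard way to do the quick check for a prime modulus $p$ (a common technique, not something specific to this page) is to test $g^{(p-1)/q} \neq 1$ for every prime $q$ dividing $p - 1$: if $g$ failed to generate the group, its powers would loop back to 1 early, at an exponent dividing one of those values. The sketch below assumes $p$ is prime and that $p - 1$ is small enough to factor by trial division; both function names are hypothetical.

```python
def distinct_prime_factors(n):
    """Return the set of primes dividing n, found by trial division."""
    factors = set()
    d = 2
    while n >= d * d:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

def is_generator(g, p):
    """Check whether g generates Z_p^* (p is assumed to be prime)."""
    if g % p == 0:
        return False               # [0] is not in the group at all
    order = p - 1                  # size of Z_p^*
    return all(pow(g, order // q, p) != 1
               for q in distinct_prime_factors(order))

print([g for g in range(1, 17) if is_generator(g, 17)])
# [3, 5, 6, 7, 10, 11, 12, 14]
```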

Exercises

In the group $ℤ_{10}^{*}$, what is $⟨ [3] ⟩$?

$\{[3], [3^{2}] = [9], [3^{3}] = [7], [3^{4}] = [1]\}$. So $ℤ_{10}^{*} = ⟨ [3] ⟩$, meaning $[3]$ generates the group.

Does $[9]$ also generate $ℤ_{10}^{*}$?

No. $⟨ [9] ⟩ = \{[9], [1]\}$, which is not equal to $ℤ_{10}^{*} = \{[1], [3], [7], [9]\}$.

Say we have a group $ℤ_{n}^{*}$ with generator $g$ (so $ℤ_{n}^{*} = \{g, g^{2}, g^{3}, \dots\}$). If we list out all of $g$, $g^{2}$, $g^{3}$, ... will we loop back to $g^{a} = [1]$ at some point?

Yes. $[1]$ is in the group (by rule 8), so if it’s not in $⟨ g ⟩$, then $g$ doesn’t properly generate the group. (You could also argue that the inverse $g^{-1}$ is in the group somewhere, so $g^{-1} = g^{b}$ for some $b$. Then we loop back when we hit $g^{1 + b} = g \cdot g^{b} = g \cdot g^{-1} = [1]$.)

In an earlier exercise, we found all inverses in $ℤ_{7}^{*}$ by guessing and checking and we found that inverses formed neat pairs. Given the additional information that $[3]$ generates the group and $\{[3], [3^{2}], [3^{3}], [3^{4}], [3^{5}], [3^{6}]\} = \{[3], [2], [6], [4], [5], [1]\}$ in that order, can you find all inverses again without guessing?

Since $[3^{6}] = [1]$, we can take any $[3^{a}]$ and pick $b$ such that $a + b =$ some multiple of 6. Then $[3^{b}]$ is the inverse of $[3^{a}]$.

$[3^{1}] \cdot [3^{5}] = [1]$, so $[3]$ and $[5]$ are inverses

$[3^{2}] \cdot [3^{4}] = [1]$, so $[2]$ and $[4]$ are inverses

$[3^{3}] \cdot [3^{3}] = [1]$, so $[6]$ is its own inverse

and, of course, $[3^{6}] = [1]$ is its own inverse.

Here is an example of finding generators of $ℤ_{17}^{*}$ by brute force. Can you spot any patterns?

# [[pow(g, i, 17) for i in range(1, 17)] for g in range(1, 17)]

   i=1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16 | generator?
g=  ----------------------------------------------------------------+-----------
1   [1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1] |
2   [2,  4,  8,  16, 15, 13, 9,  1,  2,  4,  8,  16, 15, 13, 9,  1] |
3   [3,  9,  10, 13, 5,  15, 11, 16, 14, 8,  7,  4,  12, 2,  6,  1] | yes
4   [4,  16, 13, 1,  4,  16, 13, 1,  4,  16, 13, 1,  4,  16, 13, 1] |
5   [5,  8,  6,  13, 14, 2,  10, 16, 12, 9,  11, 4,  3,  15, 7,  1] | yes
6   [6,  2,  12, 4,  7,  8,  14, 16, 11, 15, 5,  13, 10, 9,  3,  1] | yes
7   [7,  15, 3,  4,  11, 9,  12, 16, 10, 2,  14, 13, 6,  8,  5,  1] | yes
8   [8,  13, 2,  16, 9,  4,  15, 1,  8,  13, 2,  16, 9,  4,  15, 1] |
9   [9,  13, 15, 16, 8,  4,  2,  1,  9,  13, 15, 16, 8,  4,  2,  1] |
10  [10, 15, 14, 4,  6,  9,  5,  16, 7,  2,  3,  13, 11, 8,  12, 1] | yes
11  [11, 2,  5,  4,  10, 8,  3,  16, 6,  15, 12, 13, 7,  9,  14, 1] | yes
12  [12, 8,  11, 13, 3,  2,  7,  16, 5,  9,  6,  4,  14, 15, 10, 1] | yes
13  [13, 16, 4,  1,  13, 16, 4,  1,  13, 16, 4,  1,  13, 16, 4,  1] |
14  [14, 9,  7,  13, 12, 15, 6,  16, 3,  8,  10, 4,  5,  2,  11, 1] | yes
15  [15, 4,  9,  16, 2,  13, 8,  1,  15, 4,  9,  16, 2,  13, 8,  1] |
16  [16, 1,  16, 1,  16, 1,  16, 1,  16, 1,  16, 1,  16, 1,  16, 1] |

Euler’s theorem, Fermat’s little theorem#

It would be very useful if the pattern of ones in the right-hand column held true for any element of any $ℤ_{n}^{*}$ ; that would allow huge exponents to be reduced (modulo 16 in this case).

Not only is it true (for coprime $a$ and $n$), but we now have enough tools to prove it! Let $ℤ_{n}^{*}$ be cyclic with generator $g$, so the whole group can be written as a list (call it $L_{1}$):

$ℤ_{n}^{*} = L_{1} = \{g, g^{2}, g^{3}, \dots, g^{φ(n)}\}$

We know all the group’s elements are in $L_{1}$, just in an unknown order (recall the elements are $[1], [2], \dots, [n - 1]$ minus any terms not coprime to $n$). Let $a$ be any element of the group. Notice that the list

$L_{2} = \{a g, a g^{2}, a g^{3}, \dots, a g^{φ(n)}\}$

is some permutation of $L_{1}$: multiplication by $a$ is invertible, so we can’t multiply two different things by $a$ and get the same thing out. Then, since the lists contain the same elements, and we are in a commutative group, the product of all elements is the same for both lists, $\mathrm{Prod}(L_{2}) = \mathrm{Prod}(L_{1})$:

$\prod_{i = 1}^{φ(n)} a g^{i} = \prod_{i = 1}^{φ(n)} g^{i}$

Factoring all the $a$ terms out, we get:

$a^{φ(n)} \prod_{i = 1}^{φ(n)} g^{i} = \prod_{i = 1}^{φ(n)} g^{i}$

Finally, multiplying both sides by the inverse of $\prod_{i = 1}^{φ(n)} g^{i}$ (which is just some member of the group, and with your knowledge of how inverses work, you might know which one):

$a^{φ(n)} = [1] \pmod{n}$

which is exactly Euler’s theorem. $□$

(The argument never uses the fact that the list consists of powers of $g$, so it works equally well with any listing of the group’s elements when $ℤ_{n}^{*}$ is not cyclic.)

Fermat’s little theorem follows straight from Euler’s by substituting prime $p$ for $n$ (recall that $φ(p) = p - 1$): $a^{p - 1} = [1] \pmod{p}$. So we know that, no matter what element we pick from the multiplicative group modulo $n$, raising it to the power $φ(n)$ (the size of the group) will give us $[1]$.
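Euler’s theorem is easy to check numerically, and it is what justifies reducing huge exponents modulo $φ(n)$. The helper names below (`units`, `euler_holds`, `pow_reduced`) are invented for this sketch, and $φ(n)$ is found by brute-force counting, so it only suits small $n$.

```python
from math import gcd

def units(n):
    """Elements of Z_n^*, represented by the integers in 1..n-1 coprime to n."""
    return [a for a in range(1, n) if gcd(a, n) == 1]

def euler_holds(n):
    """Check that a**phi(n) is 1 modulo n for every a in Z_n^*."""
    phi_n = len(units(n))
    return all(pow(a, phi_n, n) == 1 for a in units(n))

def pow_reduced(a, e, n):
    """Compute a**e mod n after reducing e modulo phi(n); needs gcd(a, n) == 1."""
    if gcd(a, n) != 1:
        raise ValueError("a must be coprime to n")
    return pow(a, e % len(units(n)), n)

print(all(euler_holds(n) for n in range(2, 200)))   # True
print(pow_reduced(3, 10**18 + 7, 10))               # 7
print(pow(3, 10**18 + 7, 10))                       # 7, the same answer
```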

Lagrange’s theorem#

The proof above is handy because it doesn’t require any definitions beyond what we already have. Admittedly, it doesn’t provide much intuition as to why.

A subgroup is a subset of a group where the group laws are satisfied. For example, in the group of the integers under addition, the even numbers are a subgroup. A group is always a subgroup of itself.
Lagrange’s theorem states that, in a finite group, the order of any subgroup divides the order of the group (in other words, the order of the group is a multiple of the order of the subgroup).

Then (with some details intentionally left out) Euler’s theorem is Lagrange’s theorem applied to the group $ℤ_{n}^{*}$ and the subgroup $⟨ a ⟩$ .
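The divisibility that Lagrange’s theorem promises can be observed directly in $ℤ_{17}^{*}$, which has 16 elements: the size of each $⟨ a ⟩$ is the number of steps before the powers of $a$ return to $[1]$. The function name `element_order` is a label chosen for this sketch.

```python
from math import gcd

def element_order(a, n):
    """Return the smallest k >= 1 with a**k equal to 1 modulo n (size of the subgroup generated by a)."""
    if gcd(a, n) != 1:
        raise ValueError("a must be coprime to n")
    k, value = 1, a % n
    while value != 1:
        value = (value * a) % n
        k += 1
    return k

orders = {a: element_order(a, 17) for a in range(1, 17)}
print(orders[2], orders[3], orders[4], orders[16])   # 8 16 4 2
print(all(16 % k == 0 for k in orders.values()))     # True
```

Every order divides 16, so $a^{16}$ is a power of $a^{k} = [1]$, which is the intuition behind Euler’s theorem.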

Notation notes#

In this page we used square brackets to be very clear that the "numbers" we’re using are congruence classes of integers. Usually you’ll see $[n]$ shortened to $n$ and $[1] = [5]$ written as $1 \equiv 5$ (the triple bar meaning "is equivalent/congruent to").
The common (mod $n$) notation can be confusing when mistaking "mod" for the % operator, which is a different thing (and even this operator can refer to different things; try calculating (-1 % 5) in Python, and again in your browser console). (mod $n$) is a statement about the world we’re living in when reading something like $1 \equiv 100$ (mod $n$).
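The difference between the two conventions can be seen without leaving Python: `%` returns a result with the sign of the divisor, while `math.fmod` returns the truncated remainder with the sign of the dividend, which is the convention JavaScript’s `%` follows. The helper `same_class` is a made-up name showing that congruence is a relation between two numbers rather than an operation on one.

```python
import math

def same_class(a, b, n):
    """True when a and b lie in the same congruence class modulo n."""
    return (a - b) % n == 0

print(-1 % 5)                  # 4
print(math.fmod(-1, 5))        # -1.0
print(same_class(-1, 4, 5))    # True: both answers name the same class
print(same_class(1, 100, 9))   # True: 1 and 100 are congruent mod 9
```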

Conclusion#

If you would like to learn more, The Joy of Cryptography is a wonderful and well-written undergraduate-level textbook. It teaches cryptography along with the underlying math, has sane notation, and is freely available online. See chapters 0.1, 0.2, 0.4, 13.1, and 14.1 for things relevant to this webpage.