A Gentle Introduction to Lattices and Lattice-Based Key Exchange: Part 1

This post gives a (hopefully!) simple introduction to lattices and hard problems based on lattices. Lattices are interesting to the field of cryptography because the problems below are believed to be difficult even for a quantum computer to solve, unlike the discrete logarithm and factoring problems underlying Diffie-Hellman and RSA.

In Part 2, we’ll look at one simple lattice-based key exchange, to understand how to use these underlying hard problems in cryptosystems.

Much of the material from this post is assembled from notes from a class by Douglas Stebila and Chris Peikert’s A Decade of Lattice Cryptography (Peikert 2015).

Important Terms and Notation

Let’s review the notation we’ll use and a few basic linear algebra terms.

We will denote a vector using the following notation: \(\vec{a}\), as opposed to a scalar, \(a\). Matrices will be denoted by capital letters, \(A\). We denote random sampling via \(\stackrel{\$}{\gets}\); for example, sampling a vector of length \(n\) uniformly at random from the integers modulo \(q\) is denoted: \(\vec{a} \stackrel{\$}{\gets}{\mathbb{Z}_q}^n\).

The inner product of two vectors is denoted by the notation \(\langle \vec{a}, \vec{b} \rangle\), and can also be referred to as the dot product, denoted as \(\vec{a} \cdot \vec{b}\). The same operation extends to a matrix and a vector, where each row of the matrix is dotted with the vector.
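To make the notation concrete, here is a minimal Python sketch (using NumPy; the modulus and dimensions are illustrative, not from the post) of sampling \(\vec{a} \stackrel{\$}{\gets}{\mathbb{Z}_q}^n\) and computing inner products modulo \(q\):

```python
import numpy as np

q = 97  # illustrative modulus
n = 4   # illustrative vector length

rng = np.random.default_rng()

# a <-$- Z_q^n: sample each coordinate uniformly from {0, ..., q-1}
a = rng.integers(0, q, size=n)
b = rng.integers(0, q, size=n)

# Inner product <a, b>, reduced mod q
inner = int(np.dot(a, b)) % q

# The same operation between a matrix and a vector: each row of A
# is dotted with b
A = rng.integers(0, q, size=(3, n))
Ab = (A @ b) % q
```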

A basis is a set of vectors that are linearly independent. Linear independence means that no vector in the set can be written as a linear combination of the others; in particular, no vector may be a multiple of another. Bases can also be reduced, and we say that one basis is "better" than another if it is more reduced, i.e., its vectors are shorter and closer to orthogonal. For example, the set \(\{[0, 1], [1, 0]\}\) is a basis (and a fully-reduced one), but \(\{[1, 0], [2, 0]\}\) is not a basis at all, because \([2, 0]\) is a multiple of \([1, 0]\).

A basis generates a vector space by combining basis vectors to create new points in the vector space. For a basis \(B\) comprised of the set of vectors \(\{[0, 1], [1, 0]\}\), \(B\) generates a vector space that includes points such as \([2, 3] = 3[0, 1] + 2[1, 0]\).

What is a Lattice?

A lattice, denoted \(\mathcal{L}\), is a discrete (and infinite) set of points (the points are represented as vectors) generated by a basis, such that any lattice point is an integer linear combination of basis vectors. This means that a lattice point is equal to some combination of basis vectors added together, where each basis vector is multiplied by an integer scalar. Taking the example basis \(\{[0, 1], [1, 0]\}\) from the previous section, one lattice point generated from an integer linear combination of basis vectors is \([4, 2] = 2[0, 1] + 4[1, 0]\).

The lattice generated by a basis \(B\) is denoted as \(\mathcal{L}(B)\).
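To make this concrete, here is a small Python sketch (names and the search range are just illustrative) that generates lattice points as integer linear combinations of the basis \(\{[0, 1], [1, 0]\}\):

```python
import numpy as np

# Basis vectors as the rows of a matrix
B = np.array([[0, 1],
              [1, 0]])

# The lattice point [4, 2] = 2*[0, 1] + 4*[1, 0] from the example above
coeffs = np.array([2, 4])
print(coeffs @ B)  # [4 2]

# Enumerate a small patch of the (infinite) lattice L(B)
patch = [c1 * B[0] + c2 * B[1]
         for c1 in range(-2, 3)
         for c2 in range(-2, 3)]
```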

It is important to emphasize that a lattice can be generated by multiple possible bases. Two bases are equivalent (in that they generate the same lattice) if the following relation holds: \(B_1 = B_2 \times U\), where \(U\) is a unimodular matrix. A unimodular matrix is a square integer matrix whose determinant is \( \pm 1 \). For example, the basis \(B_1 = \{[2, 0], [0, 2]\}\) and the basis \(B_2 = \{[2, -6], [-2, 8]\}\) both generate the same lattice. This is easy to see after confirming \(\{[2, -6], [-2, 8]\} = \{[2, 0], [0, 2]\} \times \{[1, -3], [-1, 4]\}\), where \(\{[1, -3], [-1, 4]\}\) is a unimodular matrix.
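We can check this equivalence numerically; the sketch below verifies that \(U\) has determinant \(\pm 1\) and that \(B_1 \times U = B_2\):

```python
import numpy as np

B1 = np.array([[2, 0],
               [0, 2]])
B2 = np.array([[2, -6],
               [-2, 8]])
U = np.array([[1, -3],
              [-1, 4]])

# U is unimodular: integer entries with determinant +/- 1
assert round(np.linalg.det(U)) in (1, -1)

# B1 transformed by U yields B2, so both generate the same lattice
assert np.array_equal(B1 @ U, B2)
```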

A more formal way of expressing the definition of a lattice as a combination of integer scalar multiples of basis vectors can be defined as follows (as demonstrated by Peikert (Peikert 2015)):

\[\mathcal{L} = \mathcal{L}(B) = \sum_{i=1}^{m} (\mathbb{Z} \cdot b_i)\]

Another important concept to understand when studying lattices is the \(i\)th successive minimum of a given lattice \(\mathcal{L}\), written \(\lambda_i(\mathcal{L})\). Intuitively, \(\lambda_1\) is the length of a shortest nonzero vector in the lattice, \(\lambda_2\) is the smallest radius within which we can find two linearly independent lattice vectors, and in general \(\lambda_i\) is the smallest radius within which we can find \(i\) linearly independent lattice vectors.

Each successive minimum is realized by some lattice vector, i.e., an integer linear combination of basis vectors, and the bounds grow with \(i\): \(\lambda_1 \leq \lambda_2 \leq \dots\)
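For a toy two-dimensional lattice, we can find \(\lambda_1\) by brute force, enumerating small integer combinations of the basis vectors (a sketch; the basis and search box are illustrative):

```python
import numpy as np
from itertools import product

B = np.array([[2, 0],
              [0, 2]])

# Enumerate integer coefficient pairs and keep the shortest nonzero
# lattice vector; its length is lambda_1 (for a large enough search box).
shortest = min(
    np.linalg.norm(c1 * B[0] + c2 * B[1])
    for c1, c2 in product(range(-5, 6), repeat=2)
    if (c1, c2) != (0, 0)
)
print(shortest)  # 2.0: lambda_1 of this lattice is 2
```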

Hard Problems based on Lattices

In cryptography, it is important to have assurance of the hardness of given problems, so that we can build cryptosystems on top of these hard problems. This ensures that the difficulty an adversary faces when attacking the cryptosystem reduces to the hardness of the underlying problem. We will now examine a series of hard lattice problems. In Part 2, we’ll see how to use these hard problems to build cryptosystems which are in turn hard for a given adversary to break.

Shortest Vector Problem (SVP)

Intuitively, the Shortest Vector Problem captures the hardness an adversary faces when required to find a short nonzero lattice point within a given bound, if all that the adversary knows is a set of basis vectors for the lattice. As any point in the lattice is an integer linear combination of basis vectors, and no efficient algorithm is known for searching all candidate points, SVP is believed to be hard. Furthermore, a lattice can be generated by multiple bases (as described in the previous section), where some valid bases are "better" (i.e., more reduced) than others; working from a "bad" basis further frustrates an adversary's search.

More formally, the Shortest Vector Problem is defined as follows:

Shortest Vector Problem. Given a basis \(B\) of \(\mathcal{L}\), find a nonzero point \(v \in \mathcal{L}\) (in other words, "v is in the lattice") such that:
\[||v|| \leq \gamma \lambda_1 (\mathcal{L})\]
Where \(\gamma\) is some multiplicative approximation factor, and \(\lambda_1\) is the first successive minimum of \(\mathcal{L}\) (the length of a shortest nonzero lattice vector).

Note that algorithms do exist to solve the SVP problem exactly, but they run in exponential time \(2^{O(n)}\). Therefore, as the dimension \(n\) increases, the difficulty for an adversary increases exponentially as the adversary tries to find a lattice point \(v\) within the given bound \(\gamma \lambda_1\). Why? Because lattice vectors are integer linear combinations of basis vectors, an exhaustive-search approach (or something similar) is required to find the shortest lattice points.

Closest Vector Problem

The Closest Vector Problem (CVP) is a variant of SVP. Intuitively, the CVP requires an adversary to find the lattice point \(v \in \mathcal{L}(B)\) closest to a given point \(w \in \mathbb{Q}^n\), where \(w\) is an arbitrary target point, generally not in the lattice.

More formally, this can be written as follows:

Closest Vector Problem. Given a basis \(B\) of a lattice \(\mathcal{L}\) and a point \(w \in \mathbb{Q}^n\), find \(v \in \mathcal{L}\) such that \(||w - v||\) is minimized.
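One classic heuristic for approximating CVP when the basis is "good" (well-reduced) is Babai's rounding technique: write \(w\) in basis coordinates, round each coordinate to the nearest integer, and map back into the lattice. A minimal sketch (illustrative only; it gives an approximation, and fails badly for "bad" bases):

```python
import numpy as np

def babai_round(B, w):
    """Approximate the lattice point in L(B) closest to w.

    B holds the basis vectors as rows; w is the target point.
    """
    # Solve w = coeffs @ B for real coefficients, then round to integers
    coeffs = np.linalg.solve(B.T, w)
    return np.rint(coeffs) @ B

B = np.array([[2.0, 0.0],
              [0.0, 2.0]])
w = np.array([3.2, 1.1])
print(babai_round(B, w))  # [4. 2.], the lattice point closest to w
```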

Bounded Distance Decoding Problem

While multiple points in the lattice could be solutions to the Closest Vector Problem, sometimes it is useful to bound the solution to a single lattice point. The Bounded Distance Decoding (BDD) problem is similar to the Closest Vector Problem, but requires that the distance between the lattice point \(v\) and the arbitrary point \(w\) be less than the bound \(\alpha \lambda_1\), where \(0 \leq \alpha \leq \frac{1}{\sqrt{2}}\).

More formally:

Bounded Distance Decoding Problem. Given a basis \(B\) for a lattice \(\mathcal{L}\) and a point \(w \in \mathbb{Q}^n\), find \(v \in \mathcal{L}\) such that: \[||w-v|| \leq \alpha \lambda_1(\mathcal{L}), \text{ where } 0 \leq \alpha \leq \frac{1}{\sqrt{2}}.\]

Hard Problems Continued: Learning with Errors

Many current lattice-based cryptosystems are based on the hardness of the Learning with Errors (LWE) problem, first introduced by Regev (Regev 2005). Intuitively, the LWE problem is based on the hardness of finding a hidden input from a "noisy inner product," where the noise is induced by explicitly adding a randomly-selected error term.

More formally, LWE can be defined as follows:

Learning with Errors. Define \(n, q \in \mathbb{N}\), where \(n\) is the length of the secret vector, and \(q\) is the modulus. Let \(\chi_e\) be an error distribution over \(\mathbb{Z}\), and let \(s \in {\mathbb{Z}_q}^n\).

Define the LWE distribution \(A_{s, \chi_e}\) over \({\mathbb{Z}_q}^n \times \mathbb{Z}_q\) by performing the following:
1. Sample \(a \stackrel{\$}{\gets}{\mathbb{Z}_q}^n\) uniformly at random.
2. Sample \(e \stackrel{\$}{\gets}{\chi_e}\) from the error distribution.
3. Compute \(b \leftarrow \langle s, a \rangle + e \bmod q\).
4. Return \((a, b)\)
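Here is a toy sampler for the LWE distribution (a sketch; the parameters are illustrative, and a real scheme would sample \(e\) from a proper discrete Gaussian rather than a rounded normal):

```python
import numpy as np

q = 97        # modulus (toy size; real parameters are far larger)
n = 8         # secret length
sigma = 1.0   # error width, standing in for the discrete Gaussian chi_e

rng = np.random.default_rng()
s = rng.integers(0, q, size=n)   # the secret

def lwe_sample():
    """Return one pair (a, b) from the LWE distribution A_{s, chi_e}."""
    a = rng.integers(0, q, size=n)           # uniform a in Z_q^n
    e = int(np.rint(rng.normal(0, sigma)))   # small error from chi_e
    b = (int(np.dot(a, s)) + e) % q          # noisy inner product
    return a, b

a, b = lwe_sample()
print(a, b)
```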

In this case, \(b\) is the "noisy inner product", comprised of the inner product between the publicly-known value \(a\) and the secret value \(s\), plus noise. If the error term \(e\) were not added to \(\langle s, a \rangle\), then given enough samples, the secret \(s\) would be easy to recover using Gaussian elimination. However, \(b\) is safe to use as a public value because \(e\) is first added to \(\langle s, a \rangle\). Specifically, \(e\) is secret and chosen at random, so a network adversary cannot guess its value. This in turn means that an adversary cannot easily find the secret \(s\) given only \(a\) and \(b\).

Note that \(\chi_e\) is typically defined as a discrete Gaussian distribution of width \(\alpha q\), where \(\alpha\) is termed the "error rate".

This leads to the definition of two more computationally-hard problems based on lattices, Search-LWE and Decision-LWE.

Search-LWE

Intuitively, Search-LWE asks an adversary to find the secret \(s\), given \(m\) samples \((a_i, b_i)\) of the "noisy inner product". The Search-LWE assumption says that the problem of finding \(s\) should be hard for an adversary to solve.

Search-LWE is often represented in aggregate using matrix/vector form, where all \(m\) samples are collected into the representation \(A^T\vec{s} + \vec{e}\), with the samples \(a_i\) forming the columns of the matrix \(A\).

More formally, we can state Search-LWE as the following:

Search-LWE. Fix LWE parameters as in the LWE definition above. Let \(\chi_s\) be a distribution over \({\mathbb{Z}_q}^n\). Let \(m \in \mathbb{N}\).

The Search-LWE problem is as follows: Sample \(s \stackrel{\$}{\gets}\chi_s\). Given \(m\) samples \((a_i, b_i)\) drawn from the LWE distribution \(A_{s, \chi_e}\), find \(s\).
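With tiny (completely insecure) parameters, Search-LWE can be solved by brute force over all \(q^n\) candidate secrets, which makes the exponential cost of the naive attack explicit. A self-contained sketch with illustrative parameters:

```python
import numpy as np
from itertools import product

q, n, m = 17, 2, 8   # tiny toy parameters so q**n is searchable
rng = np.random.default_rng()

s = rng.integers(0, q, size=n)
A = rng.integers(0, q, size=(m, n))          # rows are the samples a_i
e = np.rint(rng.normal(0, 1.0, size=m)).astype(int)
b = (A @ s + e) % q                          # m noisy inner products

def centered(x):
    """Map residues mod q to representatives in (-q/2, q/2]."""
    return (x + q // 2) % q - q // 2

# Try every candidate secret; the true one leaves a residual that
# looks like the small error vector e.
best = min(
    (np.array(cand) for cand in product(range(q), repeat=n)),
    key=lambda cand: int(np.sum(centered((b - A @ cand) % q) ** 2)),
)
print(s, best)  # with high probability these match
```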

Reduction to Bounded Distance Decoding

It is important to understand how LWE relates to the lattice problems above: Regev showed that solving LWE is at least as hard as solving the Bounded Distance Decoding (BDD) problem, which is in turn connected to the Shortest Vector Problem (SVP). This connection is what ensures cryptosystems based on LWE rest on hard-to-solve lattice problems, which an adversary must overcome in order to break the cryptosystem.

The proof that breaking LWE lets us solve worst-case lattice problems is fairly involved; see Regev’s 2005 publication for that direction (Regev 2005). The reverse direction is much simpler: LWE reduces to BDD, meaning a BDD solver can be used to solve LWE. We’ll focus on this simpler reduction below; it is also why we call LWE a "lattice-based" problem.

Let’s go over a (high level!) explanation of why LWE reduces to BDD. This means that if we have a "black box" that solves BDD, then this box can also be used to solve LWE.

First, let’s draw from the LWE distribution defined above \(m\) times, meaning we will have \(m\) samples \((a_i, b_i)\). In this case, the \(m\) vectors \(a_i\) form the columns of a matrix \(A\). The lattice generated by \(A\) is denoted \(\mathcal{L}(A)\), and every point in the lattice can be represented by the following:

\[\mathcal{L}(A) = \{A^T \vec{s} : \vec{s} \in {\mathbb{Z}_q}^n\} + q \mathbb{Z}^m\]

The above just defines every point in \(\mathcal{L}(A)\) as a vector whose \(i\)th coordinate is the inner product \(\langle a_i, \vec{s} \rangle\) for some \(\vec{s}\) (up to multiples of \(q\)), so the coordinates of the secret \(\vec{s}\) play the role of the integer scalars in the combination. We can see that the vector \(\vec{b}\), collecting the samples \(b_i\), is close to a point in the lattice \(\mathcal{L}(A)\), as \(\vec{b}\) is simply \(A^T \vec{s}\) plus the small error vector \(\vec{e}\).

For a network adversary to be able to find \(\vec{s}\), where \(A^T \vec{s}\) is a point in the lattice \(\mathcal{L}(A)\), the adversary would only be able to use their knowledge of \(\vec{b}\) and the publicly-disclosed matrix \(A\) in order to find the lattice point \(A^T \vec{s}\).

We can now argue that if we had a BDD solver, and this solver were provided with the "noisy inner product" described above, the BDD solver would recover the secret lattice point \(A^T \vec{s}\) from the noisy inner product, since the noisy inner product is created by adding a small error term to \(A^T \vec{s}\); from the lattice point, \(\vec{s}\) itself follows by linear algebra. Therefore, LWE reduces to Bounded Distance Decoding: if we can solve BDD instances, we can solve the Search-LWE problem.
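A toy check of this closeness property (illustrative parameters; here \(A\) is stored with the samples as rows, so the lattice point is written `A @ s`): we build an LWE instance and confirm that \(\vec{b}\) sits at distance exactly \(||\vec{e}||\) from the lattice point, which is precisely the nearby point a BDD oracle would return:

```python
import numpy as np

q, n, m = 97, 4, 16
rng = np.random.default_rng()

s = rng.integers(0, q, size=n)
A = rng.integers(0, q, size=(m, n))   # rows are the samples a_i
e = np.rint(rng.normal(0, 1.0, size=m)).astype(int)
b = (A @ s + e) % q                   # the "noisy inner product" vector

def centered(x):
    """Map residues mod q to representatives in (-q/2, q/2]."""
    return (x + q // 2) % q - q // 2

# The distance from b to the lattice point A s (mod q) is exactly ||e||,
# far smaller than expected for a random target. A BDD oracle handed b
# would return this lattice point, from which s falls out by linear
# algebra mod q.
dist = np.linalg.norm(centered((b - A @ s) % q))
print(dist, np.linalg.norm(e))  # equal
```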

Decision-LWE

Decision-LWE (DLWE) challenges an adversary to distinguish output from the LWE distribution \(A_{s, \chi_e}\) from output drawn uniformly at random. This is interesting in and of itself: other cryptosystems have suffered in the past due to their distinguishable features, which a network adversary could use to block or censor encrypted content even without being able to decrypt it, in turn motivating obfuscation schemes such as Elligator (Bernstein et al. 2013). The "noisy inner product" pairs \((a_i, b_i)\) generated from LWE, however, should be indistinguishable from samples drawn uniformly at random.

Formally, this can be defined as follows:

Decision-LWE. Fix parameters as in the Search-LWE definition above.
The Decision-LWE problem is as follows: Sample \(s \stackrel{\$}{\gets}\chi_s\). Given \(m\) samples \((a_i, b_i)\) where every sample is either:
1. Drawn from the LWE distribution \(A_{s, \chi_e}\), or
2. Drawn uniformly at random from \({\mathbb{Z}_q}^n \times \mathbb{Z}_q\)

The challenge to the adversary is to reliably distinguish (with probability greater than random guessing) which is the case.
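A toy version of the distinguishing experiment looks like this (illustrative; with real parameters no efficient test should beat a coin flip):

```python
import numpy as np

q, n = 97, 8
rng = np.random.default_rng()
s = rng.integers(0, q, size=n)

def lwe_pair():
    """One pair from the LWE distribution A_{s, chi_e}."""
    a = rng.integers(0, q, size=n)
    e = int(np.rint(rng.normal(0, 1.0)))
    return a, (int(np.dot(a, s)) + e) % q

def uniform_pair():
    """One pair drawn uniformly from Z_q^n x Z_q."""
    return rng.integers(0, q, size=n), int(rng.integers(0, q))

# The challenger flips a coin and hands the adversary m samples from
# one of the two distributions; the adversary must guess which.
sampler = lwe_pair if rng.integers(0, 2) == 0 else uniform_pair
samples = [sampler() for _ in range(10)]
print(samples[0])
```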

Learning with Errors Tradeoffs: Security vs Correctness

One important point to observe in lattice cryptography is the tradeoff between security and correctness. The larger the error distribution used in LWE, the larger the range that errors can be randomly sampled from. When the error distribution is small, the deviation of the "noisy inner product" from the secret lattice point will consequently also be small, which makes the adversary's decoding task easier. As the error terms grow, the deviation from the secret lattice point grows, making the scheme more secure, as the range for errors to be randomly sampled from is larger.

However, increasing the size of the error distribution in turn impacts the correctness of the scheme. Imagining our lattice grid, we can see that as the range that errors are sampled from grows, the "noisy inner product" can drift further away from the correct secret lattice point, and towards other points in the lattice. This can be pictured as a "ball" around each lattice point, representing the range within which a value can be uniquely decoded back to that point. As error terms grow, the noisy value risks drifting outside the ball in which the lattice point can be uniquely decoded.
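The sketch below makes this tradeoff visible: we encode a bit in the high-order part of a value mod \(q\) (as many LWE-based schemes do) and measure how often decoding fails as the error width grows. Everything here is illustrative:

```python
import numpy as np

q = 97
rng = np.random.default_rng()

def failure_rate(sigma, trials=2000):
    """Encode a bit as bit * (q // 2) plus noise, decode by rounding."""
    failures = 0
    for _ in range(trials):
        bit = int(rng.integers(0, 2))
        noisy = (bit * (q // 2) + int(np.rint(rng.normal(0, sigma)))) % q
        # Decode: is the residue closer to 0 or to q // 2 (mod q)?
        dist0 = min(noisy, q - noisy)
        dist1 = abs(noisy - q // 2)
        failures += (0 if dist0 < dist1 else 1) != bit
    return failures / trials

# The failure rate climbs as the error width sigma grows
for sigma in (1.0, 5.0, 10.0, 20.0):
    print(sigma, failure_rate(sigma))
```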

In Part 2, we’ll see a key-exchange protocol based on LWE where two parties compute "approximate" points, and then come to exact agreement on a point in the lattice. Its security reduces to the hardness of LWE as defined in the Search-LWE definition above, as a network observer would need to find a specific point in the lattice given only public values that carry an error term.

Finally, if you are interested in hearing more about modern cryptography and privacy topics, you might consider following me on Twitter!



Bernstein, Daniel J., Mike Hamburg, Anna Krasnova, and Tanja Lange. 2013. “Elligator: Elliptic-Curve Points Indistinguishable from Uniform Random Strings.” In Proceedings of the ACM Conference on Computer and Communications Security, 967–80. https://doi.org/10.1145/2508859.2516734.

Peikert, Chris. 2015. “A Decade of Lattice Cryptography.” Cryptology ePrint Archive, Report 2015/939.

Regev, Oded. 2005. “On Lattices, Learning with Errors, Random Linear Codes, and Cryptography.” In Proceedings of the Thirty-Seventh Annual ACM Symposium on Theory of Computing, 84–93. STOC ’05. New York, NY, USA: ACM. https://doi.org/10.1145/1060590.1060603.

Chelsea H. Komlo
