da1729's Blog

cryptography, digital design, embedded, rf, ...

Chapter 1: Introduction to Lattices

To go deep into cryptanalysis, we need a solid understanding of the system we are analyzing. The new, modern, “quantum-safe” lattice-based cryptography is built on mathematical structures called lattices, hence the name.

Lattices and Bases

A lattice is defined as:

Definition (Lattice)
A lattice $\mathcal{L}$ is a discrete subgroup of $\mathbb{R}^m$.

We now define a lattice constructively using a basis. This definition is very important for our purposes:

Definition (Constructive Definition)
Let $\mathbf{b}_1, \cdots, \mathbf{b}_n$ be $n$ linearly independent vectors in $\mathbb{R}^m$. The lattice they generate is the set of all their integer linear combinations:
$$
\mathcal{L}(\mathbf{B}) =
\{
\sum_{i=1}^{n} x_i \mathbf{b}_i : x_i \in \mathbb{Z}
\}
$$

How is a lattice different from a vector space? In a vector space, we would not constrain ourselves to integer linear combinations, but allow any real combination. In other words, a lattice is discrete while a vector space is continuous. In the study of lattice cryptography, you will see how this “discreteness” allows us to formulate hard problems to build cryptosystems upon. More technically, this discreteness is what makes the standard machinery of linear algebra (especially Gaussian elimination) break down.

The Basis Matrix

The matrix whose columns are the basis vectors is called the basis matrix, represented as $$\mathbf{B} = [\mathbf{b}_1, \cdots, \mathbf{b}_n] \in \mathbb{R}^{m \times n}$$

It is important to have the dimensions clear. $m$ is the dimension of the ambient space we are in, and $n$ is the number of linearly independent vectors we have in our basis. Note that the given basis matrix has rank equal to $n$.

Definition (Full-Rank Lattice)
A given lattice $\mathcal{L}$ is called full-rank if $n = m$.

Now with these new notations, we can define our lattice mathematically as: $$\mathcal{L}(\mathbf{B}) = \{\mathbf{B}\mathbf{z}: \mathbf{z} \in \mathbb{Z}^n\}$$
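To make the definition $\mathcal{L}(\mathbf{B}) = \{\mathbf{Bz}\}$ concrete, here is a minimal NumPy sketch (the basis is an illustrative choice of mine) that enumerates lattice points for small integer coefficient vectors $\mathbf{z}$:

```python
import numpy as np

# A 2D basis: the columns of B are the basis vectors b1 = (2,0), b2 = (1,3).
B = np.array([[2, 1],
              [0, 3]])

# Enumerate the lattice points Bz for small integer coefficients z.
points = {tuple(B @ np.array([x, y])) for x in range(-2, 3) for y in range(-2, 3)}

# The origin (z = 0) and both basis vectors are lattice points.
print((0, 0) in points)  # True
print((2, 0) in points and (1, 3) in points)  # True
```

Scanning only a small coefficient box is of course just a finite window into the (infinite) lattice.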

It is very important to note an inference:

A given basis $\mathbf{B}$ generates a unique lattice $\mathcal{L}$, but the converse is not true: a lattice does not determine a unique basis.

This fact is of great importance in lattice cryptography. A given lattice $\mathcal{L}$ in fact has infinitely many bases, and we classify them as follows:

  • Good Basis: The vectors are short and nearly orthogonal.
  • Bad Basis: The vectors are long and highly skewed (nearly parallel).

Now a question arises: are these infinitely many bases somehow related to each other? The answer is yes. We can transform between bases using Unimodular Matrices: matrices with integer entries and determinant $\pm 1$.

Theorem
Two bases $\mathbf{B}$ and $\mathbf{B’}$ generate the same lattice $\mathcal{L}$ if and only if there exists a unimodular matrix $\mathbf{U}$ such that: $$\mathbf{B’} = \mathbf{BU}$$

Why does this theorem hold? Because $\det(\mathbf{U}) = \pm 1$, the inverse $\mathbf{U}^{-1}$ is also an integer matrix, in fact unimodular. This means any integer combination of the columns of $\mathbf{B}$ can be written as an integer combination of the columns of $\mathbf{B’}$, and vice versa. They generate exactly the same set of points.
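A quick numerical sanity check of the theorem (a NumPy sketch; the basis and unimodular matrix are illustrative choices): given two bases of the same lattice, $\mathbf{B}^{-1}\mathbf{B’}$ recovers the change-of-basis matrix, which must be integral with determinant $\pm 1$:

```python
import numpy as np

B = np.array([[2, 1],
              [0, 3]], dtype=float)
# A unimodular matrix: integer entries, determinant +1.
U = np.array([[1, 4],
              [0, 1]], dtype=float)

B2 = B @ U  # another basis for the same lattice

# B^{-1} B2 recovers U: integral with det ±1, so every column of B2 is an
# integer combination of the columns of B, and vice versa.
T = np.linalg.inv(B) @ B2
print(np.allclose(T, np.round(T)))        # True: integer entries
print(round(abs(np.linalg.det(T))) == 1)  # True: determinant ±1
```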

To give some cryptological insights, and a higher level idea of how almost all public-key lattice cryptosystems work:

  • Alice (Keys): Generates a good basis $\mathbf{G}$ (secret key). She multiplies it by a random unimodular matrix $\mathbf{U}$ to get $\mathbf{B}_{pub} = \mathbf{GU}$ (public key).
  • Eve (Attacker): Only sees $\mathbf{B}_{pub}$. Her goal is to find $\mathbf{U}^{-1}$ or otherwise transform $\mathbf{B}_{pub}$ back into a “good” basis to break the encryption.
  • Algorithms: LLL and BKZ (to be covered in later chapters) are cryptanalytic algorithms that search for a transformation $\mathbf{U}$ that makes the basis “better”.

The Fundamental Parallelepiped

To measure the “density” of lattice points, we look at the shape formed by the basis vectors. This shape is called the Fundamental Parallelepiped, denoted by $\mathcal{P}(\mathbf{B})$ and defined by: $$\mathcal{P}(\mathbf{B}) = \{\mathbf{Bx}: \mathbf{x} \in [0, 1)^n\}$$

It is the “tile” that, if repeated, would tile the entire span of the lattice.

Definition (Determinant of the Lattice)
The volume of the fundamental parallelepiped,
$$ \operatorname{vol}(\mathcal{P}(\mathbf{B})) = \sqrt{\det(\mathbf{B}^T \mathbf{B})}, $$
is called the determinant of the lattice:
$$ \det(\mathcal{L}) = \operatorname{vol}(\mathcal{P}(\mathbf{B})) = \sqrt{\det(\mathbf{B}^T \mathbf{B})}. $$
If the lattice is full-rank ($n = m$), so that $\mathbf{B}$ is square, this simplifies to:
$$\det(\mathcal{L}) = |\det({\mathbf{B}})|$$

Note that we can use any basis to compute the determinant of the lattice; the result is always the same.

Why doesn’t the volume change if we change the basis?
Using the unimodular property:
$$\det(\mathbf{B’}) = \det(\mathbf{BU}) = \det(\mathbf{B})\det(\mathbf{U}) = \det(\mathbf{B})\cdot (\pm 1)$$
Since volume is an absolute value, $|\det(\mathbf{B’})| = |\det(\mathbf{B})|$.
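This invariance is easy to verify numerically. A small sketch (the example matrices are my own choices):

```python
import numpy as np

B = np.array([[2, 1],
              [0, 3]], dtype=float)
U = np.array([[1, 4],
              [0, 1]], dtype=float)  # unimodular: det = 1
B2 = B @ U                           # another basis for the same lattice

# det(L) = sqrt(det(B^T B)); for a square full-rank basis this equals |det(B)|.
vol  = np.sqrt(np.linalg.det(B.T  @ B))
vol2 = np.sqrt(np.linalg.det(B2.T @ B2))
print(round(vol), round(vol2))  # 6 6
```

Both bases report the same lattice determinant, even though their vectors have very different lengths.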

Fundamental Invariants and Measuring “Badness”

In the previous section, I classified the infinitely many bases of a lattice very ambiguously. I mentioned good ones are short and quite orthogonal, and bad ones are long and skewed. But how do we know whether the vectors in a given basis are short and orthogonal enough? One can perhaps classify them visually for lattices up to 3 dimensions, but would definitely struggle any higher, and we work with very high dimensions like 500 or 1000.

Hadamard’s Inequality and The Orthogonality Defect

Imagine a rectangle. It is completely described by a set of two vectors in 2D, and its area is just the product of the lengths of these vectors. Now, if you skew this rectangle while keeping the side lengths the same, the area decreases; it is given by the determinant of the basis matrix formed by these two vectors. This geometric fact is formalized as Hadamard’s Inequality:

Theorem (Hadamard’s Inequality)
$$\det(\mathcal{L}) \leq \prod_{i=1}^{n} ||\mathbf{b}_i|| $$
The equality holds if and only if the basis vectors are mutually orthogonal.

Now that we know the determinant is the “minimum possible product” of lengths (achieved only by a perfectly orthogonal basis), we can compare our current basis against this ideal.

With this, we define Orthogonality Defect (often denoted as $\gamma$ or just “Hadamard Ratio”) as:

Definition (Orthogonality Defect/ Hadamard Ratio)
$$\gamma(\mathbf{B}) = \frac{\prod_{i=1}^{n} ||\mathbf{b}_i||}{\det(\mathcal{L})}$$

Now, how do we interpret this ratio?

Interpreting Hadamard Ratio

  • $\gamma = 1$: the basis is perfectly orthogonal. The vectors are as short as they can possibly be.
  • $\gamma > 1$: the basis is skewed. The vectors are longer than necessary.
  • $\gamma \gg 1$: the basis is terrible. The vectors are incredibly long and nearly parallel. This is what a lattice public key looks like.
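As a sketch (with toy bases chosen for illustration), the ratio can be computed directly from the definition:

```python
import numpy as np

def hadamard_ratio(B):
    """Orthogonality defect: product of column norms divided by the lattice volume."""
    vol = np.sqrt(np.linalg.det(B.T @ B))
    return np.prod(np.linalg.norm(B, axis=0)) / vol

good = np.array([[1.0, 0.0],
                 [0.0, 1.0]])    # orthogonal basis of Z^2
bad  = np.array([[1.0, 100.0],
                 [0.0,   1.0]])  # same lattice Z^2, but long, nearly parallel vectors
print(hadamard_ratio(good))  # 1.0
print(hadamard_ratio(bad))   # ≈ 100
```

Both matrices are bases of the same lattice $\mathbb{Z}^2$ (they differ by a unimodular transform), yet the defect exposes the second one as “bad”.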

In later chapters, when we get to the analysis, we will see that the LLL and BKZ algorithms aim precisely to lower this ratio: they try to force the product of the lengths closer to the fixed volume.

As mentioned earlier, we work with very high dimensions in practical lattice cryptosystems (I am talking 500, 1000). At that scale, the Hadamard ratio becomes astronomically large, and it gets hard to compare a defect of $10^{500}$ against one of $10^{400}$.

To make the number manageable, cryptanalysts often take the $n$-th root. This is related to the Hermite Factor.

Definition (Hermite Factor)
$$\delta \approx \gamma ^ \frac{1}{n}$$

This factor gives you a normalized “score” independent of the dimension.

Gram-Schmidt Orthogonalization

Up to this point, we have repeatedly used vague language such as “skewed”, “nearly parallel”, and “orthogonal enough”. While these notions are geometrically intuitive in low dimensions, they become meaningless in the high-dimensional lattices used in cryptography. We therefore need a precise mathematical tool that allows us to measure how a basis behaves internally.

For this, we use one of the most famous tools of linear algebra, that is the Gram-Schmidt Orthogonalization.

Let me propose a question first before diving deep into the topic:

Given several vectors, how do we measure the volume of the shape they span?

Let’s start from the basics: 1D. In 1D, the volume is just the length of the single basis vector, so: $$V_1 = ||\mathbf{v}_1||$$

Let’s add the second dimension now. Take two vectors $\mathbf{v}_1, \mathbf{v}_2 \in \mathbb{R}^2$. They span a parallelogram. The naive guess for its area would be the product of their norms, but that is only true when they are orthogonal. The correct area is the norm of the first vector times the norm of the component of the second vector perpendicular to the first: $$V_2 = ||\mathbf{v}_1||\cdot ||\mathbf{v}_2^*||$$ where $\mathbf{v}_2^*$ is the component of $\mathbf{v}_2$ perpendicular to $\mathbf{v}_1$.

Why am I overexplaining this? To highlight a perspective: in this case, only the perpendicular component of $\mathbf{v}_2$ contributed to the area.

Euclidean geometry admits exactly one way to decompose a vector into parallel and orthogonal components. The projection of $\mathbf{v}_2$ onto $\mathbf{v}_1$ is: $$\text{proj}_{\mathbf{v}_1}(\mathbf{v}_2) = \frac{\langle \mathbf{v}_2, \mathbf{v}_1 \rangle}{||\mathbf{v}_1||^2}\mathbf{v}_1$$

Now, to get the orthogonal component $\mathbf{v}_2^*$, we simply subtract this projection from the original vector: $$\mathbf{v}_2^* = \mathbf{v}_2 - \text{proj}_{\mathbf{v}_1}(\mathbf{v}_2)$$

Let’s extend this by another dimension. We have $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3 \in \mathbb{R}^3$. To compute the volume of the parallelepiped they span, the third vector must contribute only what lies outside the plane generated by the first two.

This requires removing from $\mathbf{v}_3$ all components lying in previously used directions. Hence, $$\mathbf{v}_3^* = \mathbf{v}_3 - \text{proj}_{\mathbf{v}_1^*}(\mathbf{v}_3) - \text{proj}_{\mathbf{v}_2^*}(\mathbf{v}_3)$$

With this, one should be able to see a pattern arising. And the pattern’s generalization is nothing but the Gram-Schmidt Orthogonalization.

Gram-Schmidt Orthogonalization
Given a basis $\{\mathbf{v}_1, \cdots, \mathbf{v}_n\}$, the corresponding basis $\{\mathbf{v}_1^*, \cdots, \mathbf{v}_n^*\}$ generated by the following procedure:
$$\mathbf{v}_1^* = \mathbf{v}_1,$$
For $k \geq 2$
$$\mathbf{v}_k^* = \mathbf{v}_k - \sum_{j=1}^{k-1} \frac{\langle \mathbf{v}_k, \mathbf{v}_j^* \rangle}{||\mathbf{v}_j^*||^2}\mathbf{v}_j^*$$
is mutually orthogonal and spans the same linear subspace as the original vectors.
Once the vectors are orthogonal, volume factorizes naturally: $$V_n = \prod_{i=1}^{n} ||\mathbf{v}_i^*||$$

Thus, GSO is the unique procedure that allows us to decompose volume into independent, one-dimensional contributions.

Now let’s see the use of GSO in our lattice theory. Let’s say that a given lattice $\mathcal{L}$ has a basis $\mathbf{B} = \{\mathbf{b}_1, \cdots, \mathbf{b}_n\}$. Applying GSO to this basis, let’s say we get $$\{\mathbf{b}_1^*, \cdots, \mathbf{b}_n^*\}$$ One has to note that this generated basis may not consist of lattice vectors, and the majority of the time, they are not. Instead, they reveal the true orthogonal structure hidden inside the basis.

From the volume decomposition above, the determinant of a lattice satisfies: $$\det(\mathcal{L}) = \prod_{i=1}^{n} ||\mathbf{b}_i^*||$$
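The procedure above translates almost line for line into code. A minimal sketch (NumPy; the example basis is an illustrative choice) that verifies both the orthogonality of the GSO vectors and the determinant identity:

```python
import numpy as np

def gram_schmidt(B):
    """Return the Gram-Schmidt vectors b_i^* of the columns of B (no normalization)."""
    n = B.shape[1]
    Bs = B.astype(float).copy()
    for k in range(1, n):
        for j in range(k):
            # mu_{k,j} = <b_k, b_j^*> / ||b_j^*||^2
            mu = np.dot(B[:, k], Bs[:, j]) / np.dot(Bs[:, j], Bs[:, j])
            Bs[:, k] -= mu * Bs[:, j]
    return Bs

B = np.array([[2, 1],
              [0, 3]], dtype=float)
Bs = gram_schmidt(B)

# The GSO vectors are mutually orthogonal, and det(L) = prod ||b_i^*||.
print(abs(np.dot(Bs[:, 0], Bs[:, 1])) < 1e-9)  # True
print(np.isclose(np.prod(np.linalg.norm(Bs, axis=0)),
                 abs(np.linalg.det(B))))       # True
```

Note that, exactly as stated above, the GSO vectors returned here are generally not lattice vectors.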

Successive Minima and Minkowski’s Theorems

In the previous section, we used GSO to measure the quality of a specific basis. However, as cryptanalysts, we often need to measure the properties of the lattice itself, independent of which basis is currently representing it.

We know that a lattice is a discrete grid. This discreteness implies that there is a limit to how close two distinct lattice points can be. This brings us to the concept of Successive Minima, which generalizes the idea of the “shortest vector.”

Successive Minima

The most famous invariant of a lattice is the length of its shortest non-zero vector. We denote this length as $\lambda_1$.

Definition (First Successive Minimum/ Shortest Vector)
The first successive minimum $\lambda_1(\mathcal{L})$ is the length of the shortest non-zero vector in the lattice: $$\lambda_1(\mathcal{L}) = \min_{\mathbf{v} \in \mathcal{L}\setminus \{0\}} ||\mathbf{v}||$$

Finding a vector of length $\lambda_1$ is exactly the Shortest Vector Problem (SVP), which is the computational foundation of modern lattice cryptography.

To get more formal intuition for this shortest vector, imagine a lattice (2D or 3D). From the origin, start inflating a circle/sphere; initially it is just a point at the origin. As it expands, at a critical radius the sphere touches the first non-zero lattice point. So, the shortest vector is the first lattice point the expanding sphere collides with. Additionally, the closed ball of this critical radius $\lambda_1$ has a special property: it contains no lattice points beyond the origin and the shortest vector(s): $$B(\lambda_1)\cap \mathcal{L} = \{\mathbf{0}, \pm\mathbf{v}_1\}$$ (note that $-\mathbf{v}_1$ always has the same length as $\mathbf{v}_1$, and further ties of length exactly $\lambda_1$ are possible).
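In tiny dimensions we can find $\lambda_1$ by brute force, mirroring the expanding-sphere picture: scan all small integer coefficient vectors and keep the shortest non-zero $\mathbf{Bz}$. A sketch (illustrative basis; utterly infeasible in cryptographic dimensions):

```python
import numpy as np
from itertools import product

B = np.array([[2, 1],
              [0, 3]])

# Brute-force SVP in 2D: try every small nonzero coefficient vector z
# and keep the shortest lattice vector Bz.
best = min(
    (B @ np.array(z) for z in product(range(-10, 11), repeat=2) if z != (0, 0)),
    key=np.linalg.norm,
)
print(np.linalg.norm(best))  # 2.0, the lambda_1 of this lattice
```

For this lattice every point has the form $(2x + y, 3y)$, so any vector with $y \neq 0$ has norm at least 3, and the minimum $\lambda_1 = 2$ is attained at $\pm(2, 0)$.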

Definition (Successive Minima)
For $k \in \{1, \cdots, n\}$, the $k$-th successive minimum $\lambda_k(\mathcal{L})$ is the smallest radius $r$ such that the closed ball of radius $r$ centered at the origin contains at least $k$ linearly independent lattice vectors. These minima satisfy $$0 < \lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n < \infty $$

This sequence gives us a “geometric profile” of the lattice.

  • If $\lambda_1 \approx \lambda_n$, the lattice is “well-rounded” (like a grid of squares).
  • If $\lambda_1 \ll \lambda_n$, the lattice is “long and thin” (collapsed in some directions).

Minkowski’s Theorems

This is the fundamental theorem of the Geometry of Numbers. It guarantees that if a lattice is dense enough (its determinant small enough relative to the dimension), then short vectors must exist.
For a cryptanalyst, this is a double-edged sword:

  • The Good: it proves that a short key (or a weakness) exists.
  • The Bad: It does not tell us how to find it.

To state the theorem formally, we first need a lemma regarding volume packing, often called the “Pigeonhole Principle for Geometry.”

Blichfeldt’s Theorem

Theorem (Blichfeldt’s Theorem)
Let $S \subset \mathbb{R}^n$ be a measurable set with volume $\operatorname{vol}(S) > \det(\mathcal{L})$. Then, there exist two distinct points $\mathbf{z}_1, \mathbf{z}_2 \in S$ such that their difference is a lattice vector:
$$\mathbf{z}_1 - \mathbf{z}_2 \in \mathcal{L}$$

Proof:
Let $\mathcal{P}(\mathbf{B})$ be the fundamental parallelepiped. We can tile the entire space $\mathbb{R}^n$ using shifts of this parallelepiped by lattice vectors. Any point $\mathbf{x} \in \mathbb{R}^n$ can be uniquely written as $\mathbf{x} = \mathbf{b} + \mathbf{p}$, where $\mathbf{b} \in \mathcal{L}$ and $\mathbf{p} \in \mathcal{P}(\mathbf{B})$.

Imagine “folding” the set $S$ into a single fundamental parallelepiped by mapping every point $\mathbf{x} \in S$ to its relative position $\mathbf{p}$. Since $\operatorname{vol}(S) > \operatorname{vol}(\mathcal{P}(\mathbf{B})) = \det(\mathcal{L})$, by the Pigeonhole Principle, at least two points $\mathbf{z}_1, \mathbf{z}_2$ must land on the same spot $\mathbf{p}$ after folding.

This implies $\mathbf{z}_1 = \mathbf{b}_1 + \mathbf{p}$ and $\mathbf{z}_2 = \mathbf{b}_2 + \mathbf{p}$.
Subtracting them gives $\mathbf{z}_1 - \mathbf{z}_2 = \mathbf{b}_1 - \mathbf{b}_2$. Since lattices are closed under subtraction, this difference is a lattice vector. $\blacksquare$

Minkowski’s First Theorem (Convex Body Theorem)

Using Blichfeldt’s result, we can now prove the core theorem. We consider a set $S$ that is convex and centrally symmetric (e.g., a sphere or a cube centered at the origin).

Theorem (Minkowski’s First Theorem)
Let $\mathcal{L}$ be a full-rank lattice in $\mathbb{R}^n$. If $S \subset \mathbb{R}^n$ is a convex, centrally symmetric set with volume:
$$\operatorname{vol}(S) > 2^n \det(\mathcal{L})$$
Then $S$ contains at least one non-zero lattice vector.

Proof:
Define a shrunk version of the set, $S’ = \frac{1}{2}S = \{ \frac{1}{2}\mathbf{x} : \mathbf{x} \in S \}$.
The volume scales by $(1/2)^n$, so:
$$\operatorname{vol}(S’) = \frac{1}{2^n} \operatorname{vol}(S) > \det(\mathcal{L})$$

By Blichfeldt’s Theorem, there exist distinct $\mathbf{x}_1, \mathbf{x}_2 \in S’$ such that $\mathbf{x}_1 - \mathbf{x}_2 \in \mathcal{L}$.

We must now show this difference vector lies inside $S$.
Since $\mathbf{x}_1, \mathbf{x}_2 \in S’$, we know that $2\mathbf{x}_1 \in S$ and $2\mathbf{x}_2 \in S$.
Because $S$ is symmetric about the origin, if $2\mathbf{x}_2 \in S$, then $-2\mathbf{x}_2 \in S$.
Because $S$ is convex, the midpoint of any segment connecting two points in $S$ is also in $S$. We take the midpoint of $2\mathbf{x}_1$ and $-2\mathbf{x}_2$:
$$\frac{(2\mathbf{x}_1) + (-2\mathbf{x}_2)}{2} = \mathbf{x}_1 - \mathbf{x}_2$$

Thus, the difference vector lies in $S$. Since it is a lattice vector and non-zero (as $\mathbf{x}_1 \neq \mathbf{x}_2$), the theorem holds. $\blacksquare$

Minkowski’s Second Theorem

While the First Theorem gives us a bound on the shortest vector $\lambda_1$, it tells us nothing about the rest of the lattice structure. The Second Theorem generalizes this bound to the entire set of successive minima.

Theorem (Minkowski’s Second Theorem)
For any rank-$n$ lattice $\mathcal{L}$, the product of the successive minima is bounded by the determinant:
$$\prod_{i=1}^n \lambda_i(\mathcal{L}) \leq \gamma_n^{n/2} \cdot \det(\mathcal{L})$$
where $\gamma_n$ is Hermite’s constant, which depends only on the dimension (and is related to the volume of the unit ball). A looser but easier-to-remember bound is:
$$\left(\prod_{i=1}^n \lambda_i(\mathcal{L})\right)^{1/n} \leq \sqrt{n} \cdot \det(\mathcal{L})^{1/n}$$

Why is this powerful?
It links the “geometric lengths” ($\lambda_i$) directly to the “algebraic volume” ($\det(\mathcal{L})$).

  • Ideally: If the lattice has an orthogonal basis of short vectors, then $\prod \lambda_i \approx \det(\mathcal{L})$.
  • In Practice: If the lattice is highly skewed (like a Public Key), the product $\prod \lambda_i$ might be much smaller than the product of the basis vector lengths $\prod ||\mathbf{b}_i||$.

This theorem proves that “short bases” mathematically exist. The goal of cryptanalysis is to find a basis where $||\mathbf{b}_i|| \approx \lambda_i$. In dimensions $n > 500$, the gap between Minkowski’s existence bound and what algorithms like LLL can actually find is where the security of schemes like Kyber and Dilithium lives.
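The looser bound above is easy to check on a toy lattice. A sketch (illustrative basis; the brute-force search for $\lambda_1$ only works because $n = 2$):

```python
import numpy as np
from itertools import product

# Check lambda_1 <= sqrt(n) * det(L)^(1/n) on a small example lattice.
B = np.array([[2, 1],
              [0, 3]])
n = 2
det_L = abs(np.linalg.det(B.astype(float)))

# lambda_1 by brute force over small coefficient vectors.
lam1 = min(
    np.linalg.norm(B @ np.array(z))
    for z in product(range(-10, 11), repeat=2) if z != (0, 0)
)
bound = np.sqrt(n) * det_L ** (1 / n)
print(lam1 <= bound)  # True: 2.0 <= sqrt(2) * sqrt(6) ≈ 3.46
```

The theorem guarantees a short vector exists below the bound, but (as stressed above) gives no algorithm to find it; here we only find it because exhaustive search is trivial in 2D.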

The Dual Lattice

We have one final structure to define before we close our introduction to the geometry of numbers. In many areas of mathematics (like Fourier analysis or optimization), every object has a “dual” counterpart. Lattices are no exception.

The Dual Lattice is essentially the set of all vectors that have an integer inner product with every vector in your original lattice.

Definition (Dual Lattice)
Let $\mathcal{L}$ be a lattice in $\mathbb{R}^n$. The dual lattice, denoted $\mathcal{L}^*$, is defined as:
$$
\mathcal{L}^* = \{ \mathbf{x} \in \text{span}(\mathcal{L}) : \langle \mathbf{x}, \mathbf{y} \rangle \in \mathbb{Z}, \forall \mathbf{y} \in \mathcal{L} \}
$$

If $\mathcal{L}$ is full-rank (which it almost always is in cryptography), the condition simplifies to $\mathbf{x} \in \mathbb{R}^n$ such that the inner products are integers.

The Dual Basis

If we have a basis matrix $\mathbf{B}$ for the primal lattice $\mathcal{L}$, how do we find a basis $\mathbf{D}$ for $\mathcal{L}^*$?
We need the inner product of every column of $\mathbf{B}$ with every column of $\mathbf{D}$ to be an integer; the natural choice is to make the two bases biorthogonal, satisfying the condition $\mathbf{D}^T \mathbf{B} = \mathbf{I}$.

Theorem (Dual Basis)
If $\mathbf{B}$ is a basis for a full-rank lattice $\mathcal{L}$, then the “inverse transpose” matrix:
$$\mathbf{D} = (\mathbf{B}^{-1})^T$$
is a basis for $\mathcal{L}^*$.

Proof:
Let’s check the inner product condition. We compute the matrix product $\mathbf{D}^T \mathbf{B}$:
$$\mathbf{D}^T \mathbf{B} = ((\mathbf{B}^{-1})^T)^T \mathbf{B} = \mathbf{B}^{-1} \mathbf{B} = \mathbf{I}$$
Since the result is the Identity matrix (which consists entirely of integers $0$ and $1$), the condition holds. $\blacksquare$
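Both the defining condition $\mathbf{D}^T \mathbf{B} = \mathbf{I}$ and the integer-inner-product property can be checked numerically. A sketch (the basis and coefficient vectors are arbitrary choices of mine):

```python
import numpy as np

B = np.array([[2, 1],
              [0, 3]], dtype=float)

# Dual basis: D = (B^{-1})^T, so D^T B = I.
D = np.linalg.inv(B).T
print(np.allclose(D.T @ B, np.eye(2)))  # True

# Every dual vector has an integer inner product with every lattice vector:
v = B @ np.array([3, -2])  # a primal lattice vector
w = D @ np.array([1, 5])   # a dual lattice vector
print(np.dot(w, v))        # -7.0, an integer as promised
```

Indeed $\langle \mathbf{Dz}, \mathbf{Bx} \rangle = \mathbf{z}^T \mathbf{D}^T \mathbf{B} \mathbf{x} = \mathbf{z}^T \mathbf{x}$, which is always an integer.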

Properties of the Dual

The dual lattice has a beautiful “inverse” relationship with the primal lattice.

  1. Reflexivity: The dual of the dual is the original lattice.
    $$(\mathcal{L}^*)^* = \mathcal{L}$$

  2. Volume: The determinant of the dual is the inverse of the primal determinant.
    $$\det(\mathcal{L}^*) = \frac{1}{\det(\mathcal{L})}$$

    Proof: $\det(\mathbf{D}) = \det((\mathbf{B}^{-1})^T) = \det(\mathbf{B}^{-1}) = 1/\det(\mathbf{B})$.

Cryptanalytic Significance: The “Slicing” Intuition

Why do we care about this?
In the primal lattice $\mathcal{L}$, a short vector represents a point close to the origin.
In the dual lattice $\mathcal{L}^*$, a short vector $\mathbf{w} \in \mathcal{L}^*$ represents a set of hyperplanes that slice the original lattice into widely spaced layers.

  • The distance between these layers is exactly $1/||\mathbf{w}||$.
  • Therefore, finding a short vector in the dual lattice corresponds to finding sparse layers in the primal lattice.

This is the heart of the Dual Attack on LWE. If we can find a short vector in the dual, we can distinguish the lattice from random noise because the lattice points will cluster on these specific hyperplanes, whereas random noise will be uniform.

Computational Perspective

Before moving to Chapter 2, a note on implementation.
Throughout this chapter, we have assumed we can compute things like $\mathbf{B}^{-1}$ or Gram-Schmidt coefficients exactly.
In code (C++ / Python), we face two realities:

  1. Integer Lattices: We typically work with $\mathcal{L} \subseteq \mathbb{Z}^n$. This means $\det(\mathcal{L})$ is an integer. However, the dual lattice $\mathcal{L}^*$ will usually contain rational numbers (fractions).
    Implementation Tip: To avoid floating point errors, we often scale the dual lattice by a factor of $k = \det(\mathcal{L})$ to map it back to integers.
  2. Floating Point Precision: For high dimensions ($n > 100$), standard double precision is insufficient for LLL and BKZ. The rounding errors in Gram-Schmidt will destroy the orthogonality. You will often need to use libraries like MPFR (for high-precision floats) or strict integer arithmetic strategies.
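The scaling trick from point 1 can be sketched as follows (illustrative integer basis; it works because $\det(\mathbf{B}) \cdot (\mathbf{B}^{-1})^T$ is, up to sign, the transposed adjugate of $\mathbf{B}$, which is integral whenever $\mathbf{B}$ is):

```python
import numpy as np

B = np.array([[2, 1],
              [0, 3]])
det_L = round(abs(np.linalg.det(B.astype(float))))  # 6, an integer

# The dual basis has rational entries; scaling by det(L) maps it back to integers.
D = np.linalg.inv(B.astype(float)).T
D_scaled = det_L * D
print(np.allclose(D_scaled, np.round(D_scaled)))  # True: integer matrix
```

In a serious implementation one would of course do this with exact integer/rational arithmetic rather than floats, precisely to avoid the rounding issues point 2 warns about.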

What’s Next?

This concludes Chapter 1.
We have defined our playground:

  • The Objects: Lattices and Bases.
  • The Metric: Orthogonality Defect (how “bad” a basis is).
  • The Tool: Gram-Schmidt Orthogonalization.
  • The Guarantee: Minkowski’s Theorems (short vectors exist).
  • The Mirror: The Dual Lattice.

In Chapter 2, we will tackle the Hard Problems. We will formally define SVP, CVP, and LWE, and explain why they are hard. This sets the stage for the algorithms (LLL/BKZ) that try to solve them.