Code-Based Cryptography
Overview
Code-based cryptography builds public-key encryption and key encapsulation on the hardness of decoding random linear error-correcting codes. The field traces back to Robert McEliece’s 1978 cryptosystem, published just one year after RSA, making it the oldest family of post-quantum cryptographic schemes still standing. Unlike lattice-based schemes that rose to prominence during NIST’s standardization process, code-based constructions have endured over four decades of sustained cryptanalysis without a fundamental break.
This durability is not an accident. The underlying computational problem — decoding a random linear code — is NP-hard in its general formulation, and the best known algorithms for attacking well-parameterized instances remain exponential. When NIST selected HQC as an additional KEM standard in March 2025, it validated the code-based family as a critical pillar of algorithmic diversity in the post-quantum ecosystem.
This page covers the mathematical foundations of error-correcting codes as they apply to cryptography, the syndrome decoding problem, the three major code-based candidates from NIST’s process (Classic McEliece, HQC, and BIKE), the historical attack landscape, and why maintaining code-based alternatives to lattice-based schemes is a strategic imperative.
For the broader context of NIST’s selection process, see NIST PQC Standardization Process. For background on why classical cryptography is vulnerable, see Classical Cryptography at Risk.
1. Error-Correcting Codes Primer
Before diving into cryptographic constructions, we need to establish the coding theory fundamentals that underpin every scheme in this family. Error-correcting codes were originally designed to protect data transmitted over noisy channels — but their mathematical structure turns out to be ideal for building trapdoor functions.
1.1 Linear Codes
A linear code C is a k-dimensional subspace of the n-dimensional vector space F₂ⁿ (the set of all binary vectors of length n). The code is characterized by three parameters:
- n: the block length (total number of bits in a codeword)
- k: the dimension (number of information bits encoded)
- d: the minimum distance (minimum Hamming weight of any nonzero codeword)
A code with these parameters is denoted as an [n, k, d] code. The minimum distance d determines the error-correcting capability: a code can correct up to t = ⌊(d−1)/2⌋ errors.
For example, the classic [7, 4, 3] Hamming code encodes 4 information bits into 7-bit codewords, with minimum distance 3, correcting any single-bit error.
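As a concrete instance of c = mG, the short Python sketch below (NumPy assumed) encodes a message with a systematic generator matrix for the [7, 4, 3] Hamming code. The particular parity portion P is one conventional choice among several equivalent ones.

```python
import numpy as np

# Systematic generator matrix G = [I_4 | P] for the [7, 4, 3] Hamming code.
# This parity portion P is one standard choice; any column reordering
# yields an equivalent code.
P = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1],
              [1, 1, 1]])
G = np.hstack([np.eye(4, dtype=int), P])  # 4 x 7 generator matrix

def encode(m):
    """Encode a 4-bit message into a 7-bit codeword: c = mG over F_2."""
    return (np.array(m) @ G) % 2

c = encode([1, 0, 1, 1])
print(c.tolist())  # [1, 0, 1, 1, 0, 1, 0]: message bits, then parity bits
```

Because G is in systematic form, the first four bits of every codeword are the message itself; only the last three are redundancy.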
1.2 Generator and Parity-Check Matrices
Every linear [n, k] code can be represented by two fundamental matrices:
Generator Matrix G (k × n): Encodes a message vector m ∈ F₂ᵏ into a codeword c = mG ∈ F₂ⁿ. The rows of G form a basis for the code. In systematic form, G = [Iₖ | P] where Iₖ is the k × k identity matrix and P is the k × (n−k) parity portion.
Parity-Check Matrix H ((n−k) × n): Defines the code through the relationship Hcᵀ = 0 for every codeword c ∈ C. The parity-check matrix satisfies GHᵀ = 0. In systematic form corresponding to G = [Iₖ | P], we have H = [Pᵀ | I_{n−k}].
The relationship between these matrices is the core duality that makes code-based cryptography possible:
Encoding: c = mG (easy: matrix-vector multiplication)
Verification: Hcᵀ = 0 (easy: check membership in code)
Decoding: Given c + e, find m (hard for random codes, easy for structured codes)
1.3 Syndrome Decoding
When a codeword c is corrupted by an error vector e of weight t (meaning t bits are flipped), the receiver obtains y = c + e. Computing the syndrome of y gives:
s = Hyᵀ = H(c + e)ᵀ = Hcᵀ + Heᵀ = 0 + Heᵀ = Heᵀ
The syndrome depends only on the error pattern, not on the transmitted codeword. For a structured code (Goppa, BCH, Reed-Solomon, etc.), efficient decoding algorithms can recover e from s when the number of errors t is within the code’s correction capability. For a random linear code, no efficient decoding algorithm is known.
This asymmetry — efficient decoding for structured codes, intractable decoding for random codes — is the trapdoor at the heart of code-based cryptography.
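The syndrome calculation above can be checked directly. The minimal Python sketch below (NumPy assumed, using the [7, 4, 3] Hamming code from Section 1.1) verifies that the syndrome depends only on the error and, for a single-bit error, pinpoints its position.

```python
import numpy as np

# [7, 4, 3] Hamming code in systematic form: G = [I_4 | P], H = [P^T | I_3].
P = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1],
              [1, 1, 1]])
G = np.hstack([np.eye(4, dtype=int), P])
H = np.hstack([P.T, np.eye(3, dtype=int)])
assert not ((G @ H.T) % 2).any()          # G H^T = 0

c = (np.array([1, 0, 1, 1]) @ G) % 2      # a valid codeword: H c^T = 0
e = np.zeros(7, dtype=int)
e[5] = 1                                  # flip one bit in transit
y = (c + e) % 2

s = (H @ y) % 2                           # syndrome of the received word
assert np.array_equal(s, (H @ e) % 2)     # depends only on the error

# For a single-bit error, s equals the column of H at the error position.
pos = next(i for i in range(7) if np.array_equal(H[:, i], s))
y[pos] ^= 1
assert np.array_equal(y, c)               # error corrected
```

For the Hamming code the columns of H are all distinct and nonzero, so a single error is always uniquely identified; this is the structured-code half of the asymmetry described next.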
1.4 Key Code Families Used in Cryptography
| Code Family | Structure | Decoding Algorithm | Use in PQC |
|---|---|---|---|
| Binary Goppa codes | Defined by a polynomial g(x) over F₂ᵐ | Patterson’s algorithm (polynomial time) | Classic McEliece |
| Quasi-cyclic codes | Generator/parity-check matrix has cyclic block structure | Varies by construction | HQC, BIKE |
| QC-MDPC codes | Quasi-cyclic moderate-density parity-check | Bit-flipping decoders | BIKE |
| Reed-Solomon codes | Evaluation of polynomials over finite fields | Berlekamp-Massey, Guruswami-Sudan | Historical (broken variants) |
| LRPC codes | Low-rank parity-check codes | Rank metric decoding | ROLLO (broken) |
2. The Syndrome Decoding Problem
2.1 Formal Definition
The Syndrome Decoding Problem (SDP) is defined as follows:
Given: A binary matrix H ∈ F₂^{(n−k)×n}, a syndrome s ∈ F₂^{n−k}, and an integer t
Find: A vector e ∈ F₂ⁿ of Hamming weight ≤ t such that Heᵀ = s
Equivalently stated: given a random linear code defined by its parity-check matrix and a target syndrome, find a low-weight error vector that produces that syndrome.
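To make the problem statement concrete, here is a brute-force SDP solver in Python (NumPy assumed; the matrix sizes are toy values chosen purely for illustration). It terminates quickly only because the search space is tiny; at cryptographic sizes the same loop is hopeless.

```python
import itertools
import numpy as np

def sdp_brute_force(H, s, t):
    """Solve the SDP by trying every e of weight <= t (toy sizes only).

    The search space is sum_{w<=t} C(n, w); for cryptographic parameters
    (n in the thousands, t near 100) this is astronomically large."""
    n = H.shape[1]
    for w in range(t + 1):
        for support in itertools.combinations(range(n), w):
            e = np.zeros(n, dtype=int)
            e[list(support)] = 1
            if np.array_equal((H @ e) % 2, s % 2):
                return e
    return None

rng = np.random.default_rng(1)
H = rng.integers(0, 2, size=(8, 16))      # random parity-check matrix
e_planted = np.zeros(16, dtype=int)
e_planted[[2, 9]] = 1                     # hidden weight-2 error
s = (H @ e_planted) % 2

e = sdp_brute_force(H, s, t=2)
assert e is not None and e.sum() <= 2
assert np.array_equal((H @ e) % 2, s)
```

Note that the solver may return any low-weight preimage of the syndrome, not necessarily the planted one; for well-chosen parameters the solution is unique with high probability.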
2.2 Computational Hardness
The syndrome decoding problem was proven NP-complete by Berlekamp, McEliece, and van Tilborg in 1978 — the same year as the McEliece cryptosystem’s publication. This result holds for the general (worst-case) formulation. While NP-hardness does not automatically guarantee average-case hardness (which is what cryptography requires), decades of algorithmic research have produced strong evidence that the average-case problem is also intractable.
The distinction matters: lattice problems like LWE are also believed to be hard on average, but lattice-based reductions often involve polynomial approximation factors that leave theoretical room for improvement. The code-based setting has a cleaner separation — the best known algorithms for random instances are essentially the same as the best known algorithms for worst-case instances.
2.3 Best Known Attacks
The best generic algorithms for syndrome decoding belong to the Information Set Decoding (ISD) family, which has evolved through several generations:
| Algorithm | Year | Complexity (for [n, k] code, t errors) | Key Innovation |
|---|---|---|---|
| Prange | 1962 | 2^{0.1207n} (for rate 1/2) | Randomly select k coordinates, check if error-free |
| Lee-Brickell | 1988 | Improved constant | Allow p errors in information set |
| Leon | 1988 | Similar to Lee-Brickell | Collision-based approach |
| Stern | 1989 | 2^{0.1164n} | Birthday-paradox meet-in-the-middle |
| Dumer | 1991 | 2^{0.1162n} | Generalized splitting |
| MMT (May-Meurer-Thomae) | 2011 | 2^{0.1114n} | Representation technique |
| BJMM (Becker-Joux-May-Meurer) | 2012 | 2^{0.1019n} | Nearest-neighbor search + representations |
| Both-May | 2018 | 2^{0.0966n} | Further representation improvements |
The critical observation: over 60 years of research, the exponent in the best known attack has improved from 0.1207n to 0.0966n — a roughly 20% reduction. This is a gradual, well-understood improvement curve, not the sudden algorithmic breakthroughs that shattered schemes like SIKE. Cryptographers can confidently set parameters with generous security margins.
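For intuition about how ISD works, below is a toy sketch of Prange's original 1962 algorithm in Python (NumPy assumed; the GF(2) solver, parameters, and iteration budget are illustrative, not optimized). Each trial guesses n − k columns that might contain the whole error, equivalently an error-free information set, and checks the guess with one Gaussian elimination.

```python
import numpy as np

def solve_gf2(A, b):
    """Solve the square system A x = b over F_2; None if A is singular."""
    A, b = A.copy() % 2, b.copy() % 2
    m = A.shape[0]
    for col in range(m):
        piv = next((r for r in range(col, m) if A[r, col]), None)
        if piv is None:
            return None
        A[[col, piv]], b[[col, piv]] = A[[piv, col]], b[[piv, col]]
        for r in range(m):
            if r != col and A[r, col]:
                A[r] ^= A[col]
                b[r] ^= b[col]
    return b

def prange(H, s, t, iterations=2000, seed=0):
    """Prange's ISD: hope a random choice of n-k columns covers the error.

    Each trial costs one Gaussian elimination; the exponential price of
    the attack is the expected number of trials."""
    rng = np.random.default_rng(seed)
    r, n = H.shape
    for _ in range(iterations):
        J = rng.choice(n, size=r, replace=False)  # candidate error support
        x = solve_gf2(H[:, J], s)
        if x is not None and x.sum() <= t:
            e = np.zeros(n, dtype=int)
            e[J] = x
            return e
    return None

rng = np.random.default_rng(7)
H = rng.integers(0, 2, size=(10, 20))
e_planted = np.zeros(20, dtype=int)
e_planted[[3, 15]] = 1
s = (H @ e_planted) % 2

e = prange(H, s, t=2)
assert e is not None and e.sum() <= 2 and np.array_equal((H @ e) % 2, s)
```

Later ISD variants (Stern, BJMM, and the rest of the table above) keep this outer random-guessing loop but spend each trial more cleverly, which is why the improvements shave the exponent rather than eliminate it.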
2.4 Quantum Speedup for ISD
Grover’s algorithm provides a generic quadratic speedup for search problems, which applies to ISD algorithms. The quantum versions of ISD achieve approximately:
Classical ISD: ~2^{0.0966n}
Quantum ISD: ~2^{0.0483n} (roughly halving the exponent)
However, the actual quantum speedup for ISD is nuanced. The internal structure of ISD algorithms involves nested search and collision-finding subroutines, and the degree to which each component benefits from quantum speedup varies. Kachigar and Tillich (2017) and Kirshanova (2018) performed detailed analyses showing that the quantum improvement is meaningful but does not fundamentally change the asymptotic picture — it approximately halves the security level, which is accounted for in parameter selection.
This is a crucial advantage over schemes broken by Shor’s algorithm, where the security drops from exponential to polynomial. For code-based cryptography, quantum computers provide at most a quadratic speedup — the same category of improvement as Grover’s attack on AES, which is mitigated by simply doubling key lengths.
3. The McEliece Cryptosystem (1978)
3.1 How It Works
Robert McEliece introduced his cryptosystem in 1978, one year after RSA. The fundamental idea is elegant: hide a structured code (which you can efficiently decode) inside what appears to be a random code (which nobody can efficiently decode).
graph TD
subgraph Key Generation
A["Choose Goppa code<br/>with generator G<br/>(k × n matrix)"] --> B["Generate random<br/>scrambling matrix S<br/>(k × k, invertible)"]
B --> C["Generate random<br/>permutation matrix P<br/>(n × n)"]
C --> D["Public key:<br/>G' = SGP"]
A --> E["Private key:<br/>(S, G, P)"]
end
subgraph Encryption
F["Message m ∈ F₂ᵏ"] --> G2["Compute c = mG'"]
G2 --> H["Add random error e<br/>of weight t"]
H --> I["Ciphertext:<br/>y = mG' + e"]
end
subgraph Decryption
I --> J["Compute yP⁻¹"]
J --> K["Decode using<br/>Goppa decoder<br/>(removes error)"]
K --> L["Recover mS"]
L --> M["Compute m = (mS)S⁻¹"]
end
style A fill:#1a1a2e,stroke:#e94560,color:#eee
style B fill:#1a1a2e,stroke:#e94560,color:#eee
style C fill:#1a1a2e,stroke:#e94560,color:#eee
style D fill:#1a1a2e,stroke:#0f3460,color:#eee
style E fill:#1a1a2e,stroke:#0f3460,color:#eee
style F fill:#1a1a2e,stroke:#16213e,color:#eee
style G2 fill:#1a1a2e,stroke:#16213e,color:#eee
style H fill:#1a1a2e,stroke:#16213e,color:#eee
style I fill:#1a1a2e,stroke:#16213e,color:#eee
style J fill:#1a1a2e,stroke:#e94560,color:#eee
style K fill:#1a1a2e,stroke:#e94560,color:#eee
style L fill:#1a1a2e,stroke:#e94560,color:#eee
style M fill:#1a1a2e,stroke:#e94560,color:#eee
Key Generation:
- Choose a binary Goppa code with parameters [n, k, t] — a structured code for which efficient decoding (Patterson’s algorithm) exists
- Compute the generator matrix G for this code (k × n)
- Generate a random invertible k × k binary matrix S (the scrambling matrix)
- Generate a random n × n permutation matrix P
- Public key: G’ = SGP (this looks like a generator matrix for a random code)
- Private key: (S, G, P) along with the Goppa polynomial defining the code
Encryption:
- Encode message m as c = mG’
- Generate a random error vector e of Hamming weight exactly t
- Ciphertext: y = mG’ + e = c + e
Decryption:
- Compute y’ = yP⁻¹ = mSG + eP⁻¹ (permuting the error positions, but not changing the weight)
- Apply the Goppa code decoder to y’ — this removes the error and recovers mS (since mSG is a valid codeword of the Goppa code, and eP⁻¹ has weight t within correction capability)
- Recover m = (mS)S⁻¹
The security argument: without knowing S, G, and P individually, the attacker sees only G’, which is believed to be computationally indistinguishable from a random k × n binary matrix. Decoding a random linear code with t errors is the syndrome decoding problem, which is NP-hard in general and practically intractable for appropriate parameters.
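The full key-generation/encryption/decryption cycle can be demonstrated end to end on a toy instance. In the Python sketch below (NumPy assumed), the [7, 4, 3] Hamming code stands in for a binary Goppa code and its syndrome-lookup decoder stands in for Patterson's algorithm, so this is illustrative only and in no way secure.

```python
import numpy as np

def gf2_inv(M):
    """Invert a square binary matrix by Gauss-Jordan; None if singular."""
    n = M.shape[0]
    A = np.hstack([M % 2, np.eye(n, dtype=int)])
    for col in range(n):
        piv = next((r for r in range(col, n) if A[r, col]), None)
        if piv is None:
            return None
        A[[col, piv]] = A[[piv, col]]
        for r in range(n):
            if r != col and A[r, col]:
                A[r] ^= A[col]
    return A[:, n:]

# Structured code: the [7, 4, 3] Hamming code (t = 1), decodable by
# syndrome lookup. A real deployment uses a binary Goppa code instead.
Par = np.array([[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]])
G = np.hstack([np.eye(4, dtype=int), Par])
H = np.hstack([Par.T, np.eye(3, dtype=int)])

rng = np.random.default_rng(42)
while True:                                    # random invertible scrambler S
    S = rng.integers(0, 2, size=(4, 4))
    S_inv = gf2_inv(S)
    if S_inv is not None:
        break
Pm = np.eye(7, dtype=int)[rng.permutation(7)]  # permutation matrix P

G_pub = (S @ G @ Pm) % 2                       # public key G' = SGP

def encrypt(m, err_pos):
    e = np.zeros(7, dtype=int)
    e[err_pos] = 1                             # weight-t error (t = 1 here)
    return (m @ G_pub + e) % 2

def decrypt(y):
    y = (y @ Pm.T) % 2                         # undo P (P^{-1} = P^T)
    s = (H @ y) % 2
    if s.any():                                # Hamming syndrome decoding
        y[next(i for i in range(7) if np.array_equal(H[:, i], s))] ^= 1
    return (y[:4] @ S_inv) % 2                 # unscramble: (mS) S^{-1} = m

m = np.array([1, 0, 1, 1])
assert np.array_equal(decrypt(encrypt(m, err_pos=5)), m)
```

The decryption path mirrors the scheme description exactly: undo P, decode away the error, read off mS from the systematic positions, then apply S⁻¹.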
3.2 Why It Survived 45+ Years
McEliece’s cryptosystem has an extraordinary track record. To understand why, consider what an attacker must do:
1. Structural attack: Recover the hidden Goppa code from the public key G’. This requires “unscrambling” the combined transformation SGP. For binary Goppa codes, no efficient algorithm for this has ever been found despite extensive research.
2. Decoding attack: Ignore the structure entirely and attempt to decode the ciphertext directly using ISD algorithms. This is the generic attack, and its cost is well-understood (see Section 2.3).
3. Algebraic attack: Exploit algebraic relationships in the public key or ciphertext. These have been effective against variants that use non-Goppa codes (see Section 8) but have consistently failed against binary Goppa codes.
The key insight is that binary Goppa codes appear to have no exploitable algebraic structure when disguised by the SGP transformation. This has been tested by some of the best cryptanalysts in the world over four decades, spanning advances in both classical and quantum algorithm design.
By contrast, several alternative code families proposed as McEliece variants were broken relatively quickly:
| Variant | Code Family Used | Year Proposed | Year Broken | Attack |
|---|---|---|---|---|
| McEliece with Reed-Solomon codes | Reed-Solomon | 1986 | 1992 | Sidelnikov-Shestakov structural attack |
| McEliece with Reed-Muller codes | Reed-Muller | 1994 | 2007 | Minder-Shokrollahi distinguisher |
| McEliece with algebraic geometry codes | AG codes | 1996 | 2014 | Couvreur-Márquez-Corbella-Pellikaan |
| McEliece with GRS subcodes | Generalized RS | 2005 | 2010 | Wieschebrink attack |
| McEliece with wild Goppa codes | Wild Goppa | 2010 | 2014 | Distinguisher by Faugère et al. |
The pattern is clear: the original binary Goppa construction is special. Its resistance to structural attacks is not an arbitrary choice — it reflects deep properties of these codes that cryptanalysts have been unable to exploit.
3.3 The Key Size Problem
McEliece’s Achilles’ heel has always been key size. The public key G’ is a k × n binary matrix. Even in systematic form (where k columns form an identity), the non-trivial portion requires k × (n − k) bits of storage.
For parameters targeting 128-bit classical security (NIST Level 1), the public key is on the order of hundreds of kilobytes; the Level 5 parameter sets exceed a megabyte. In both cases the keys are roughly 1,000x larger than comparable ML-KEM public keys:
ML-KEM-768 public key: 1,184 bytes (~1.2 KB)
McEliece-348864 public key: 261,120 bytes (~255 KB)
McEliece-6960119 public key: 1,047,319 bytes (~1 MB)
This size disparity is not a tunable parameter — it is intrinsic to the construction. The public key must be large enough that it is computationally indistinguishable from a random matrix, and the information content of a random binary matrix grows quadratically with the code dimension.
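The quadratic growth is easy to verify from the systematic-form formula of k × (n − k) bits. The snippet below reproduces the published sizes from the (n, k) parameters; mceliece6960119 is omitted because its n − k is not a multiple of 8, so its serialized size does not follow the simple formula.

```python
# Systematic-form public keys store the k x (n - k) parity portion.
params = {                     # name: (n, k), from the Classic McEliece sets
    "mceliece348864":  (3488, 2720),
    "mceliece460896":  (4608, 3360),
    "mceliece6688128": (6688, 5024),
    "mceliece8192128": (8192, 6528),
}
sizes = {name: k * (n - k) // 8 for name, (n, k) in params.items()}
for name, nbytes in sizes.items():
    print(f"{name}: {nbytes:,} bytes")   # matches the table in Section 3.4
```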
3.4 Classic McEliece: The NIST Submission
The Classic McEliece submission to NIST’s PQC competition is a modernized version of the original McEliece system, converted to a KEM (Key Encapsulation Mechanism) using standard transformations. The submission team included Daniel J. Bernstein, Tung Chou, and other leading code-based cryptography researchers.
Parameter Sets:
| Parameter Set | NIST Level | n | k | t | Public Key | Ciphertext | Shared Secret |
|---|---|---|---|---|---|---|---|
| mceliece348864 | 1 | 3,488 | 2,720 | 64 | 261,120 B (~255 KB) | 128 B | 32 B |
| mceliece460896 | 3 | 4,608 | 3,360 | 96 | 524,160 B (~512 KB) | 188 B | 32 B |
| mceliece6688128 | 5 | 6,688 | 5,024 | 128 | 1,044,992 B (~1 MB) | 240 B | 32 B |
| mceliece6960119 | 5 | 6,960 | 5,413 | 119 | 1,047,319 B (~1 MB) | 226 B | 32 B |
| mceliece8192128 | 5 | 8,192 | 6,528 | 128 | 1,357,824 B (~1.3 MB) | 240 B | 32 B |
Note the extreme asymmetry: ciphertexts are tiny (128–240 bytes), but public keys are enormous. This makes Classic McEliece unsuitable for protocols where public keys are transmitted frequently (TLS handshakes, certificate chains) but potentially viable for scenarios where public keys can be cached or pre-distributed (long-term encryption keys, key escrow systems).
Why NIST Did Not Select It as Primary Standard:
NIST was explicit in their reasoning: Classic McEliece has “the strongest security argument” among all KEM candidates, but its public key sizes make it impractical for the majority of deployment scenarios. Specifically:
- TLS 1.3 handshakes would need to transmit 255 KB+ public keys, causing severe latency issues and potential fragmentation problems
- X.509 certificate chains with McEliece keys would exceed typical MTU sizes by orders of magnitude
- Constrained environments (IoT, embedded, smart cards) cannot store megabyte-scale keys
- Even for desktop/server environments, the bandwidth overhead is significant at scale
NIST chose ML-KEM as the primary KEM standard because it offers a dramatically better size/performance tradeoff for the common case, while acknowledging that Classic McEliece provides the strongest known security guarantee. The selection of HQC as an additional standard was partly motivated by providing code-based diversity without McEliece’s key size burden.
4. HQC (Hamming Quasi-Cyclic)
4.1 NIST Round 4 Selection
In March 2025, NIST announced the selection of HQC as an additional KEM standard, complementing ML-KEM (FIPS 203). The decision capped several years of Round 4 evaluation following the initial standardization of ML-KEM in August 2024.
HQC’s selection was driven primarily by the algorithmic diversity argument: ML-KEM is lattice-based (MLWE problem), and if a breakthrough were to undermine lattice assumptions, the entire KEM infrastructure would be compromised. HQC provides a fallback based on an entirely different mathematical foundation — the hardness of decoding quasi-cyclic codes — ensuring that the cryptographic ecosystem does not have a single point of failure.
4.2 How HQC Works
HQC is built on the decisional quasi-cyclic syndrome decoding problem (QCSD). Unlike Classic McEliece, which hides a structured code behind random transformations, HQC uses the quasi-cyclic structure openly — the security comes from the difficulty of decoding with respect to a random quasi-cyclic code, not from hiding the code’s structure.
Quasi-Cyclic Structure:
A quasi-cyclic code of index 2 over F₂ has a parity-check matrix of the form H = [I | rot(h)], where rot(h) is a circulant matrix defined by a vector h. The circulant structure means each row is a cyclic shift of the previous row. This structure dramatically reduces key sizes compared to fully random codes (like in McEliece), because an n × n circulant matrix is fully defined by its first row of n elements.
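The storage saving is a one-line fact to check. The Python sketch below (NumPy assumed; toy length n = 5) confirms that multiplying by rot(h) is the same as polynomial multiplication by h in F₂[X]/(Xⁿ − 1), so only the first row of the circulant ever needs to be stored or transmitted.

```python
import numpy as np

def rot(h):
    """Circulant matrix whose i-th row is h cyclically shifted by i."""
    return np.array([np.roll(h, i) for i in range(len(h))])

def polymul(a, b):
    """Multiply a, b in F_2[X]/(X^n - 1): cyclic convolution mod 2."""
    out = np.zeros(len(a), dtype=int)
    for i in np.flatnonzero(a):
        out = (out + np.roll(b, i)) % 2
    return out

h = np.array([1, 0, 1, 1, 0])                 # toy length n = 5
y = np.array([0, 1, 1, 0, 1])

# Multiplying by the full n x n circulant equals polynomial multiplication,
# so transmitting h alone (n bits, not n^2) loses nothing.
assert np.array_equal((y @ rot(h)) % 2, polymul(h, y))
print(polymul(h, y).tolist())
```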
KEM Construction:
HQC uses a variant of the Alekhnovich cryptosystem combined with the Fujisaki-Okamoto transform to achieve IND-CCA2 security:
Key Generation:
- Sample sparse random vectors x and y from F₂ⁿ (the secret key)
- Sample a random vector h ∈ F₂ⁿ (part of the public structure)
- Compute s = x + h·y (where · denotes multiplication in F₂[X]/(Xⁿ−1))
- Public key: (h, s)
- Private key: (x, y)
Encapsulation:
- Sample random sparse vectors r₁, r₂, e ∈ F₂ⁿ
- Compute u = r₁ + h·r₂
- Encode the message m using a concatenated code (Reed-Muller + Reed-Solomon) to get a codeword v
- Compute v’ = s·r₂ + e + v
- Ciphertext: (u, v’)
- Apply the FO transform to derive the shared secret
Decapsulation:
- Compute v’ − u·y = v + (noise terms)
- The noise terms are the sum of sparse vectors with bounded weight
- Decode using the concatenated decoder to recover m
- Re-encapsulate to verify consistency (FO transform requirement)
The security relies on the fact that without knowing the secret key (x, y), the value s·r₂ looks random, and the error + encoding structure cannot be separated from random noise.
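The cancellation that decapsulation relies on can be verified numerically. The sketch below (Python with NumPy; toy length n = 101 and weight w = 3, far below real HQC parameters, with a random vector standing in for the encoded message) checks that v′ − u·y differs from v only by the sparse noise x·r₂ + r₁·y + e. The real scheme then removes that noise with the concatenated decoder, which is not implemented here.

```python
import numpy as np

def polymul(a, b):
    """Multiplication in F_2[X]/(X^n - 1) as cyclic convolution mod 2."""
    out = np.zeros(len(a), dtype=int)
    for i in np.flatnonzero(a):
        out = (out + np.roll(b, i)) % 2
    return out

def sparse(rng, n, w):
    """Random weight-w binary vector of length n."""
    v = np.zeros(n, dtype=int)
    v[rng.choice(n, size=w, replace=False)] = 1
    return v

n, w = 101, 3                     # toy sizes; HQC-128 uses n = 17,669
rng = np.random.default_rng(5)

x, y = sparse(rng, n, w), sparse(rng, n, w)       # secret key
h = rng.integers(0, 2, size=n)                    # public randomness
s = (x + polymul(h, y)) % 2                       # public key: (h, s)

r1, r2, e = (sparse(rng, n, w) for _ in range(3))
v = rng.integers(0, 2, size=n)                    # stand-in for encoded m
u = (r1 + polymul(h, r2)) % 2
v_prime = (polymul(s, r2) + e + v) % 2            # ciphertext: (u, v')

# Receiver: v' - u*y = v + (x*r2 + r1*y + e), because the h*y*r2 terms
# cancel. Over F_2, subtraction is addition.
noise = (v_prime + polymul(u, y) + v) % 2
assert noise.sum() <= 2 * w * w + w               # sparse-times-sparse bound
print(noise.sum(), "noisy positions out of", n)
```

The weight bound in the assert follows from the fact that a product of weight-a and weight-b vectors has weight at most a·b; keeping this noise inside the concatenated code's correction capability is exactly what HQC's parameter selection guarantees.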
4.3 The Concatenated Code Decoder
HQC’s internal error correction uses a two-layer concatenated code, which is a distinctive engineering choice:
- Inner code: A Reed-Muller code RM(1, m) — provides high error-correction capability for burst errors at the cost of rate
- Outer code: A Reed-Solomon code — corrects residual errors from the inner decoder using algebraic decoding
This concatenation provides a Decoding Failure Rate (DFR) below 2⁻¹²⁸ for all parameter sets, which is critical for IND-CCA2 security under the Fujisaki-Okamoto transform. A DFR that is too high enables reaction attacks where an adversary submits many specially crafted ciphertexts and learns secret key information from decapsulation failures.
4.4 Parameter Sets
| Parameter Set | NIST Level | n | Public Key | Ciphertext | Shared Secret | Est. Encaps Time |
|---|---|---|---|---|---|---|
| HQC-128 | 1 (AES-128) | 17,669 | 2,249 B | 4,497 B | 64 B | ~100 μs |
| HQC-192 | 3 (AES-192) | 35,851 | 4,522 B | 9,042 B | 64 B | ~200 μs |
| HQC-256 | 5 (AES-256) | 57,637 | 7,245 B | 14,485 B | 64 B | ~350 μs |
4.5 Comparison with ML-KEM
graph LR
subgraph ML-KEM["ML-KEM (FIPS 203)"]
direction TB
A1["Public Key: 1,184 B"]
A2["Ciphertext: 1,088 B"]
A3["Problem: Module-LWE"]
A4["Family: Lattice-based"]
A5["Status: Primary standard"]
end
subgraph HQC["HQC (Pending FIPS)"]
direction TB
B1["Public Key: 2,249 B"]
B2["Ciphertext: 4,497 B"]
B3["Problem: QCSD"]
B4["Family: Code-based"]
B5["Status: Additional standard"]
end
ML-KEM ---|"Algorithmic<br/>Diversity"| HQC
style A1 fill:#1a1a2e,stroke:#0f3460,color:#eee
style A2 fill:#1a1a2e,stroke:#0f3460,color:#eee
style A3 fill:#1a1a2e,stroke:#0f3460,color:#eee
style A4 fill:#1a1a2e,stroke:#0f3460,color:#eee
style A5 fill:#1a1a2e,stroke:#0f3460,color:#eee
style B1 fill:#1a1a2e,stroke:#e94560,color:#eee
style B2 fill:#1a1a2e,stroke:#e94560,color:#eee
style B3 fill:#1a1a2e,stroke:#e94560,color:#eee
style B4 fill:#1a1a2e,stroke:#e94560,color:#eee
style B5 fill:#1a1a2e,stroke:#e94560,color:#eee
Detailed Comparison at NIST Level 3:
| Property | ML-KEM-768 | HQC-192 | Factor |
|---|---|---|---|
| Public Key | 1,184 B | 4,522 B | HQC ~3.8x larger |
| Ciphertext | 1,088 B | 9,042 B | HQC ~8.3x larger |
| Shared Secret | 32 B | 64 B | HQC 2x larger |
| Key Generation | ~50 μs | ~150 μs | HQC ~3x slower |
| Encapsulation | ~50 μs | ~200 μs | HQC ~4x slower |
| Decapsulation | ~50 μs | ~250 μs | HQC ~5x slower |
| Underlying Problem | Module-LWE | Quasi-cyclic SD | Different families |
| Problem Age | ~2005 (LWE), ~2012 (MLWE) | ~1978 (SD), ~2010 (QCSD) | Code-based older |
| Best Known Attack | Lattice sieving/BKZ | Information Set Decoding | Both exponential |
| Quantum Impact | Moderate (core sieving speedup debated) | ~Halves security level (Grover on ISD) | Both resistant |
The size and performance penalties of HQC relative to ML-KEM are real but manageable. HQC’s ciphertexts at Level 3 are about 9 KB — significant for bandwidth-constrained protocols but not prohibitive for most modern networks. The key insight is that HQC is not meant to replace ML-KEM in the common case; it exists as a backup with a fundamentally different security foundation.
When to choose HQC over ML-KEM:
- When organizational risk policy requires algorithmic diversity
- In hybrid deployments combining both schemes (belt-and-suspenders approach)
- When long-term confidence in security assumptions outweighs performance optimization
- In scenarios where a lattice-specific breakthrough would be catastrophic
5. BIKE (Bit Flipping Key Encapsulation)
5.1 Construction
BIKE (Bit Flipping Key Encapsulation) is based on Quasi-Cyclic Moderate Density Parity-Check (QC-MDPC) codes. It was a Round 4 candidate alongside HQC and Classic McEliece, and while it was not selected for standardization, it remains an active research subject with distinctive properties.
MDPC codes differ from the sparse (LDPC) and dense (Goppa) codes used in other schemes. Their parity-check matrices have moderate row weight — heavy enough to provide good error correction but light enough to enable efficient iterative decoding.
How BIKE works:
Key Generation:
- Sample two sparse random polynomials h₀, h₁ ∈ F₂[X]/(Xⁿ−1) of weight w/2 each
- The public key is h = h₁ · h₀⁻¹ mod (Xⁿ−1)
- The private key is (h₀, h₁)
Encapsulation:
- Sample a random error vector e = (e₀, e₁) of total weight t
- Compute syndrome s = e₀ + e₁ · h
- Ciphertext is (s, c) where c includes the FO-transform components
- Shared secret is derived by hashing
Decapsulation:
- Use the private key to construct the QC-MDPC parity-check matrix H = [rot(h₀) | rot(h₁)], whose circulant blocks are generated by h₀ and h₁
- Apply a bit-flipping decoder to recover the error vector e from syndrome s
- Re-encapsulate and verify consistency
5.2 The Bit-Flipping Decoder
BIKE’s decoding algorithm is an iterative bit-flipping procedure similar to those used for LDPC codes in communications:
- Compute the syndrome of the received word
- For each bit position, count how many unsatisfied parity checks involve that position
- Flip bits that exceed a threshold number of unsatisfied checks
- Repeat until the syndrome is zero or a maximum iteration count is reached
The simplicity of this decoder is a strength (easy to implement, easy to protect against side channels) and a weakness (the decoder can fail with non-negligible probability for some inputs).
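The loop above can be sketched in a few lines. The toy decoder below (Python with NumPy) uses a single sparse circulant block with hypothetical parameters (n = 31, row weight 5) rather than BIKE's two-block H = [rot(h₀) | rot(h₁)], and a flip-the-maximum rule rather than BIKE's tuned thresholds; it is illustrative only.

```python
import numpy as np

def rot(h):
    """Circulant matrix whose rows are cyclic shifts of h."""
    return np.array([np.roll(h, i) for i in range(len(h))])

def bit_flip_decode(H, s, max_iters=50):
    """Recover a sparse e with He^T = s by iterative bit flipping.

    Each pass counts, for every position, how many unsatisfied parity
    checks involve it, then flips the positions with the highest count.
    (BIKE's BGF decoder uses tuned thresholds instead of a plain max.)"""
    e = np.zeros(H.shape[1], dtype=int)
    for _ in range(max_iters):
        syn = (s + H @ e) % 2            # residual syndrome
        if not syn.any():
            return e                     # syndrome cleared: success
        counts = H.T @ syn               # unsatisfied-check counts
        e[counts == counts.max()] ^= 1   # flip the worst offenders
    return None                          # no convergence: a decoding failure

# Toy sparse circulant H (one block; BIKE uses two side by side).
n = 31
h = np.zeros(n, dtype=int)
h[[0, 1, 3, 7, 15]] = 1                  # row weight 5
H = rot(h)

e_true = np.zeros(n, dtype=int)
e_true[[4, 20]] = 1                       # planted weight-2 error
s = (H @ e_true) % 2

e = bit_flip_decode(H, s)
assert np.array_equal(e, e_true)
```

The `return None` branch is the decoding failure discussed in the next subsection: for some syndromes the loop never clears, and that failure event is exactly what the DFR analysis must bound.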
5.3 Decoding Failure Rate Challenges
BIKE’s most significant technical challenge is its Decoding Failure Rate (DFR). Unlike algebraic decoders (Patterson’s for Goppa codes, Berlekamp-Massey for Reed-Solomon), the bit-flipping decoder is probabilistic — it may fail to converge for certain error patterns.
Why this matters for CCA security:
- The Fujisaki-Okamoto transform requires the decapsulation to either succeed perfectly or reject — it cannot leak partial information
- If the DFR is too high, an attacker can craft ciphertexts that cause decoding failures, then use the failure/success pattern as a side channel to extract the secret key (GJS attack, Guo-Johansson-Stankovski, 2016)
- The DFR must be proven to be below approximately 2⁻¹²⁸ for CCA2 security at NIST Level 1
Achieving a provably low DFR with the bit-flipping decoder requires either:
- Very conservative parameters (increasing key/ciphertext sizes)
- More sophisticated decoders (which may introduce timing side channels)
- Careful threshold selection backed by extensive statistical analysis
The BIKE team made significant progress on DFR analysis, including the development of the Black-Gray-Flip (BGF) decoder, but the analysis remained more complex and less clean than the algebraic decoding guarantees of HQC and Classic McEliece. This was a factor in NIST’s decision to select HQC over BIKE for the additional KEM standard.
5.4 BIKE Parameters
| Parameter Set | NIST Level | n | Public Key | Ciphertext | Shared Secret |
|---|---|---|---|---|---|
| BIKE-1 | 1 | 12,323 | 1,541 B | 1,573 B | 32 B |
| BIKE-3 | 3 | 24,659 | 3,083 B | 3,115 B | 32 B |
| BIKE-5 | 5 | 40,973 | 5,122 B | 5,154 B | 32 B |
Notably, BIKE’s sizes are competitive with HQC at Level 1 and actually smaller at higher security levels. BIKE-1’s combined public key + ciphertext (~3.1 KB) is smaller than HQC-128’s (~6.7 KB). If the DFR analysis matures further, BIKE could re-emerge as a viable alternative in future standardization efforts.
5.5 Current Status
BIKE was not selected by NIST in the March 2025 announcement. The team continues active research, and the scheme remains a candidate for future consideration. BIKE’s compact sizes and simple decoder make it attractive for constrained environments, and its DFR challenges are an active area of academic research rather than a fundamental impossibility.
6. Comprehensive Comparison of Code-Based Candidates
6.1 Parameter Comparison Table
The following table compares all three major code-based NIST candidates at equivalent security levels:
NIST Level 1 (128-bit classical / AES-128 equivalent):
| Property | Classic McEliece (348864) | HQC-128 | BIKE-1 |
|---|---|---|---|
| Public Key | 261,120 B (255 KB) | 2,249 B (2.2 KB) | 1,541 B (1.5 KB) |
| Ciphertext | 128 B | 4,497 B (4.4 KB) | 1,573 B (1.5 KB) |
| PK + CT Total | 261,248 B (255 KB) | 6,746 B (6.6 KB) | 3,114 B (3.0 KB) |
| Shared Secret | 32 B | 64 B | 32 B |
| Underlying Code | Binary Goppa | QC (RM+RS concatenation) | QC-MDPC |
| Decoder Type | Algebraic (Patterson) | Algebraic (RM+RS) | Probabilistic (bit-flip) |
| DFR | 0 (exact decoding) | < 2⁻¹²⁸ | < 2⁻¹²⁸ (with caveats) |
| Security Confidence | Highest (45+ years) | High (~15 years for QCSD) | High (DFR analysis ongoing) |
| NIST Status | Round 4 (not selected) | Selected (Mar 2025) | Round 4 (not selected) |
NIST Level 5 (256-bit classical / AES-256 equivalent):
| Property | Classic McEliece (6688128) | HQC-256 | BIKE-5 |
|---|---|---|---|
| Public Key | 1,044,992 B (~1 MB) | 7,245 B (7.1 KB) | 5,122 B (5.0 KB) |
| Ciphertext | 240 B | 14,485 B (14.1 KB) | 5,154 B (5.0 KB) |
| PK + CT Total | 1,045,232 B (~1 MB) | 21,730 B (21.2 KB) | 10,276 B (10.0 KB) |
6.2 Design Philosophy Comparison
graph TD
subgraph McEliece["Classic McEliece"]
M1["Hide structured code<br/>behind random transform"]
M2["Binary Goppa codes"]
M3["Algebraic decoder"]
M4["Zero DFR"]
M5["Enormous keys,<br/>tiny ciphertexts"]
end
subgraph HQC_D["HQC"]
H1["Use quasi-cyclic<br/>structure openly"]
H2["QC codes + RM/RS<br/>concatenation"]
H3["Algebraic decoder"]
H4["Negligible DFR"]
H5["Moderate keys,<br/>moderate ciphertexts"]
end
subgraph BIKE_D["BIKE"]
B1["Use quasi-cyclic<br/>MDPC structure"]
B2["QC-MDPC codes"]
B3["Bit-flipping decoder"]
B4["Low DFR<br/>(probabilistic)"]
B5["Compact keys,<br/>compact ciphertexts"]
end
style M1 fill:#1a1a2e,stroke:#e94560,color:#eee
style M2 fill:#1a1a2e,stroke:#e94560,color:#eee
style M3 fill:#1a1a2e,stroke:#e94560,color:#eee
style M4 fill:#1a1a2e,stroke:#e94560,color:#eee
style M5 fill:#1a1a2e,stroke:#e94560,color:#eee
style H1 fill:#1a1a2e,stroke:#0f3460,color:#eee
style H2 fill:#1a1a2e,stroke:#0f3460,color:#eee
style H3 fill:#1a1a2e,stroke:#0f3460,color:#eee
style H4 fill:#1a1a2e,stroke:#0f3460,color:#eee
style H5 fill:#1a1a2e,stroke:#0f3460,color:#eee
style B1 fill:#1a1a2e,stroke:#16213e,color:#eee
style B2 fill:#1a1a2e,stroke:#16213e,color:#eee
style B3 fill:#1a1a2e,stroke:#16213e,color:#eee
style B4 fill:#1a1a2e,stroke:#16213e,color:#eee
style B5 fill:#1a1a2e,stroke:#16213e,color:#eee
7. The Diversity Argument
7.1 Why Algorithmic Diversity Is Non-Negotiable
The history of cryptography is a history of unexpected breakthroughs. MD5 was considered secure until Wang et al. demonstrated practical collisions in 2004. SHA-1 was deprecated after the SHAttered attack in 2017. SIKE, one of NIST’s Round 4 candidates, was catastrophically broken in 2022 by Castryck and Decru using techniques from arithmetic geometry that had not previously been applied to isogeny-based cryptography.
The lesson is stark: no mathematical assumption should be trusted unconditionally. If the entire post-quantum infrastructure relies on a single problem family (lattice problems, in the case of ML-KEM + ML-DSA), a single breakthrough could collapse the entire system simultaneously.
7.2 The Lattice Monoculture Risk
As of 2026, the NIST post-quantum standards heavily favor lattice-based constructions:
- ML-KEM (FIPS 203): Module-LWE (lattice)
- ML-DSA (FIPS 204): Module-LWE/MSIS (lattice)
- FN-DSA (pending): NTRU lattice
- SLH-DSA (FIPS 205): Hash-based (not lattice — but only for signatures)
For key encapsulation, ML-KEM was the only standardized option until HQC’s selection. If MLWE were broken, there would have been no standardized quantum-resistant KEM to fall back on. HQC eliminates this single point of failure.
The specific risk scenarios include:
1. Quantum algorithm improvements: While Shor’s algorithm does not apply to lattice or code problems, new quantum algorithms could provide super-quadratic speedups against specific structures. The algebraic structure of lattice problems (polynomial rings, module structure) could potentially be exploited by future quantum algorithms in ways that the less structured code-based problems might resist.
2. Classical algorithm breakthroughs: Lattice reduction algorithms (BKZ, sieving) have seen steady improvements. The 2023 improvements by Ducas and others reduced the gap between theoretical and practical lattice attack complexities. While current parameters remain secure, the trajectory of improvements warrants caution.
3. Structural attacks: The module/ring structure that makes ML-KEM efficient also provides algebraic footholds for potential attacks. Code-based schemes (especially McEliece with Goppa codes) have less exploitable algebraic structure.
7.3 Code-Based Cryptography as a Hedge
Code-based schemes provide an ideal diversification hedge because:
- Different mathematical foundation: Syndrome decoding is unrelated to lattice problems. A breakthrough in one has no implications for the other.
- Longer track record: The syndrome decoding problem has been studied since 1962 (Prange’s ISD algorithm). McEliece dates to 1978. This is a deeper pool of cryptanalytic evidence than lattice-based cryptography can claim.
- Well-understood quantum impact: The quantum speedup for ISD is at most quadratic (Grover-type), which is the mildest possible quantum impact. There is no known quantum algorithm that provides a super-polynomial speedup for syndrome decoding.
- Conservative parameter selection: Because the attack landscape is well-characterized, parameters can be set with high confidence. There are no “unknown unknowns” in the attack model comparable to the open questions about quantum lattice sieving.
8. Historical Attacks and What Survived
8.1 Attacks on the McEliece Framework
The history of attacks on McEliece variants provides a roadmap of what works, what does not, and why the original construction remains secure.
Structural Attacks (attacking the key):
| Attack | Target | Result |
|---|---|---|
| Sidelnikov-Shestakov (1992) | McEliece with GRS codes | Broken: Recovered the code structure in polynomial time |
| Loidreau-Sendrier (2001) | Distinguishing Goppa codes from random | Failed: No efficient distinguisher found for binary Goppa codes |
| Faugère-Otmani-Perret-Tillich (2010) | Algebraic attack on short-key McEliece | Limited: Only applicable to unrealistically small parameters |
| Couvreur-Márquez-Corbella-Pellikaan (2014) | McEliece with algebraic geometry codes | Broken: Exploited algebraic structure of AG codes |
| Barelli-Couvreur (2018) | Quasi-cyclic alternant/Goppa codes | Partial: Reduced security for specific quasi-cyclic Goppa instantiations |
Generic Decoding Attacks (attacking the ciphertext):
These are the ISD algorithms listed in Section 2.3. They apply to all code-based schemes and represent the baseline attack cost. The steady but slow improvement in ISD efficiency is well-tracked and accounted for in parameter selection.
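To make the ISD baseline concrete, here is a toy Python sketch of Prange’s original 1962 idea: repeatedly permute the code’s columns, row-reduce the parity-check matrix, and hope the error vector’s support lands entirely in the pivot positions. This is purely pedagogical — parameter choices, helper structure, and variable names are illustrative, and real attacks use far more advanced ISD variants (Stern, BJMM) with bitsliced arithmetic at cryptographic sizes.

```python
import random

def prange_isd(H, s, t, max_iters=20000, seed=0):
    """Toy Prange information-set decoding over GF(2).

    H: r x n parity-check matrix (list of bit-lists); s: syndrome (r bits);
    t: target error weight. Returns e with H e^T = s and wt(e) = t, or None.
    Educational only -- exponentially slow at real parameter sizes.
    """
    rng = random.Random(seed)
    r, n = len(H), len(H[0])
    for _ in range(max_iters):
        perm = list(range(n))
        rng.shuffle(perm)                      # guess a random column ordering
        # Build augmented rows [H_perm | s] as bitmask integers.
        rows = []
        for i in range(r):
            m = s[i] << n
            for j in range(n):
                if H[i][perm[j]]:
                    m |= 1 << j
            rows.append(m)
        # Gaussian elimination aiming for an identity on the first r columns.
        ok = True
        for col in range(r):
            piv = next((i for i in range(col, r) if rows[i] >> col & 1), None)
            if piv is None:
                ok = False                     # singular submatrix: retry
                break
            rows[col], rows[piv] = rows[piv], rows[col]
            for i in range(r):
                if i != col and rows[i] >> col & 1:
                    rows[i] ^= rows[col]
        if not ok:
            continue
        # If the error avoided the information set, the reduced syndrome IS
        # the permuted error restricted to the pivot positions.
        e_perm = [rows[i] >> n & 1 for i in range(r)]
        if sum(e_perm) == t:
            e = [0] * n
            for j in range(r):
                e[perm[j]] = e_perm[j]
            return e
    return None
```

At toy sizes (say n = 12, r = 6, t = 2) this succeeds within a handful of iterations; the exponential cost at real sizes comes precisely from the vanishing probability that a random information set avoids the error support.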
Side-Channel Attacks:
| Attack | Target | Mitigation |
|---|---|---|
| Strenzke (2010) | Timing leaks in Patterson’s decoder | Constant-time implementation |
| Shoufan et al. (2010) | Power analysis of McEliece | Masking and blinding countermeasures |
| GJS attack (Guo-Johansson-Stankovski, 2016) | Decoding failure side channel | Ensure DFR < 2⁻¹²⁸; use FO transform with implicit rejection |
| Sim et al. (2020) | Timing in BIKE decoder | Constant-time BGF decoder |
8.2 Attacks on Other Code-Based Submissions
During the NIST competition, several code-based submissions faced cryptanalytic challenges:
- ROLLO (Rank-metric codes): Broken in 2020 by Bardet et al. using algebraic attacks on the MinRank problem. The rank metric setting proved more vulnerable to algebraic techniques than the Hamming metric setting.
- RQC (Rank Quasi-Cyclic): Affected by the same rank-metric algebraic attacks as ROLLO.
- LEDAcrypt (QC-LDPC based): Not broken, but concerns about the DFR analysis and reaction attacks limited its viability.
- NTS-KEM (McEliece variant): Secure but offered no significant advantage over Classic McEliece.
The pattern is instructive: Hamming-metric schemes (McEliece, HQC, BIKE) have proven far more resilient than rank-metric schemes (ROLLO, RQC). The Hamming metric setting benefits from deeper theoretical foundations and a larger body of cryptanalytic work.
8.3 What Survived and Why
The schemes that survived NIST’s evaluation share key properties:
- Well-studied underlying problem: Binary syndrome decoding in the Hamming metric, not exotic metric spaces
- Conservative parameter selection: Security margins that account for foreseeable algorithmic improvements
- Clean security reductions: Reductions to well-defined hard problems, not ad hoc security arguments
- Resistance to algebraic exploitation: No exploitable algebraic structure in the public key (McEliece) or reliance on generic quasi-cyclic hardness (HQC, BIKE)
- Mature implementation techniques: Well-understood constant-time implementation strategies
9. Implementation Considerations
9.1 Constant-Time Requirements
All three code-based candidates require constant-time implementation to prevent timing side channels. The specific challenges vary:
Classic McEliece: Patterson’s algorithm for Goppa code decoding involves polynomial GCD computations and root-finding over finite fields. Implementing these operations in constant time is well-understood but requires care — branching on secret-dependent values must be eliminated.
HQC: The Reed-Muller and Reed-Solomon decoders used internally are algebraic and comparatively easy to implement in constant time. The main concern is the sparse polynomial multiplication in key generation and encapsulation.
BIKE: The bit-flipping decoder is inherently iterative, and naive implementations leak timing information through the variable number of iterations. The BGF decoder with a fixed iteration count mitigates this, but proving that the fixed count is always sufficient ties back to the DFR analysis.
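The fixed-iteration idea can be illustrated with a toy bit-flipping decoder. This is a pedagogical sketch, not BIKE’s actual Black-Gray-Flip decoder, and Python gives no constant-time guarantees anyway — the point is only the structural property that the loop bound is independent of secret data, with success checked afterwards via the syndrome:

```python
def bit_flip_decode(H, y, iterations=5):
    """Toy majority-rule bit-flipping decoder over GF(2).

    H: r x n sparse parity-check matrix (list of bit-lists); y: received word.
    Runs a FIXED number of iterations (no secret-dependent early exit), so a
    decoding failure surfaces as a nonzero residual syndrome, not as a timing
    difference -- mirroring the motivation for BIKE's fixed-count decoders.
    """
    r, n = len(H), len(H[0])
    y = list(y)
    col_deg = [sum(H[i][j] for i in range(r)) for j in range(n)]
    for _ in range(iterations):        # fixed count, independent of secrets
        syn = [sum(H[i][j] & y[j] for j in range(n)) & 1 for i in range(r)]
        # Count unsatisfied parity checks touching each bit position.
        upc = [sum(syn[i] for i in range(r) if H[i][j]) for j in range(n)]
        # Flip every bit that violates a strict majority of its checks.
        for j in range(n):
            if 2 * upc[j] > col_deg[j]:
                y[j] ^= 1
    return y
```

The caller must recompute the syndrome of the output and reject on mismatch: with a fixed iteration count, decoding can fail with some probability, which is exactly the decoding failure rate (DFR) that BIKE’s security analysis must bound.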
9.2 Random Number Generation
Code-based schemes require high-quality randomness for:
- Key generation: Sampling the secret key components (sparse vectors, Goppa polynomial)
- Encapsulation: Generating the error vector with precise Hamming weight
- FO transform: The re-encryption step requires deterministic randomness derived from the message
Biased or predictable random number generation can be catastrophic. In particular, the error vector in McEliece must have exactly weight t — if an attacker can predict or influence even a few positions, the effective decoding problem becomes much easier.
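A minimal sketch of sampling a weight-t error vector, assuming a CSPRNG such as Python’s `secrets` module. Real implementations (e.g., the samplers in the NIST submissions) additionally make this step constant time; rejection sampling as written here has data-dependent running time:

```python
import secrets

def sample_fixed_weight(n, t):
    """Sample a uniformly random binary vector of length n with Hamming
    weight exactly t, by drawing distinct positions from a CSPRNG.

    Illustrative only: the rejection loop is not constant time, and a
    production sampler must also avoid secret-dependent memory access.
    """
    positions = set()
    while len(positions) < t:
        positions.add(secrets.randbelow(n))   # duplicates are simply retried
    e = [0] * n
    for p in positions:
        e[p] = 1
    return e
```

The exact-weight requirement is why a generic "flip each bit with probability t/n" approach is unacceptable here: it yields weight t only on average, and any deviation either breaks decryption or weakens the decoding problem.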
9.3 Hardware Acceleration
Code-based schemes benefit from different hardware features than lattice-based schemes:
- Lattice-based (ML-KEM): Benefits from NTT (Number Theoretic Transform) hardware, SIMD instructions for polynomial arithmetic
- Code-based (HQC, BIKE): Benefits from carry-less multiplication (CLMUL/PCLMULQDQ), population count (POPCNT), and vector bitwise operations
Modern x86 processors with AVX2/AVX-512 and AArch64 processors with NEON/SVE provide adequate support for efficient code-based implementations. The BIKE and HQC reference implementations achieve performance within an order of magnitude of ML-KEM on mainstream hardware, which is acceptable for most applications.
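For readers unfamiliar with carry-less multiplication, the following Python sketch computes the same operation the CLMUL/PCLMULQDQ instruction performs in hardware: multiplication of two polynomials over GF(2), where XOR replaces addition so no carries propagate. Code-based schemes rely on this primitive for fast arithmetic in rings like F₂[x]/(xⁿ − 1):

```python
def clmul(a, b):
    """Carry-less (GF(2) polynomial) multiplication of two integers,
    interpreting each bit i as the coefficient of x^i."""
    acc = 0
    while b:
        if b & 1:
            acc ^= a   # XOR instead of +: coefficients live in GF(2)
        a <<= 1
        b >>= 1
    return acc
```

For example, `clmul(0b11, 0b11)` returns `0b101`: over GF(2), (1 + x)² = 1 + x², because the cross terms 2x cancel. Hamming-weight computations, the other hot spot in these schemes, map directly to the POPCNT instruction (`bin(v).count("1")` in Python terms).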
10. Future Directions
10.1 HQC Standardization Timeline
Following the March 2025 selection announcement, HQC is expected to follow a standardization path similar to ML-KEM:
- Draft FIPS publication: Expected late 2025 or early 2026
- Public comment period: Typically 90 days
- Final FIPS publication: Estimated 2026-2027
Organizations should begin planning for HQC integration now, particularly those in regulated sectors (finance, government, healthcare) where algorithmic diversity requirements are likely to be mandated.
10.2 Classic McEliece’s Role
Despite not being selected as a NIST standard, Classic McEliece remains relevant:
- Highest security confidence: When maximum assurance is required and key size is not a constraint, McEliece remains the gold standard
- Pre-distributed key scenarios: Applications like encrypted backup, long-term key storage, or satellite communication where public keys can be pre-loaded
- Government and military: High-assurance environments that prioritize security margin over bandwidth efficiency
- ISO standardization: Classic McEliece is being considered independently by ISO/IEC JTC 1/SC 27
10.3 BIKE’s Potential
BIKE’s compact sizes make it an attractive candidate for future standardization if the DFR analysis matures. Active research areas include:
- Improved decoders: New bit-flipping variants with provably low DFR
- Tighter DFR bounds: Better analytical and empirical characterization of failure probability
- Side-channel resistant implementations: Fully constant-time decoders with proven timing uniformity
- Hardware implementations: FPGA and ASIC designs optimized for constrained environments
10.4 Code-Based Signatures
The NIST additional signatures call (2023) includes two code-based signature candidates:
- CROSS: Based on the restricted syndrome decoding problem, using zero-knowledge proof techniques
- LESS: Based on the linear code equivalence problem
These schemes produce relatively large signatures (5-13 KB) compared to lattice-based alternatives, but they diversify the signature algorithm landscape beyond lattices and hashes. Results from this evaluation are expected in 2026-2027. For more on post-quantum signature schemes, see NIST PQC Standardization Process.
11. Key Takeaways for Security Professionals
Strategic Recommendations
- Deploy ML-KEM now, plan for HQC: ML-KEM should be the primary KEM for immediate migration. Begin architectural planning for HQC as a secondary/backup KEM.
- Hybrid mode as a bridge: Consider hybrid deployments (ML-KEM + HQC, or classical + HQC) during the transition period. This provides defense in depth against both quantum attacks and potential post-quantum algorithm failures.
- Monitor the DFR landscape: If BIKE achieves clean DFR proofs, it could offer the most compact code-based KEM. Track academic progress for future procurement decisions.
- Do not dismiss McEliece: For applications where key size is not a binding constraint (offline encryption, pre-shared key scenarios, long-term archival), Classic McEliece offers the strongest security assurance available.
- Understand your threat model: The choice between ML-KEM and HQC is fundamentally a risk management decision. If your adversary is a nation-state with long time horizons and the data you protect has multi-decade sensitivity, the additional bandwidth cost of HQC is trivial compared to the diversification benefit.
The Bottom Line
Code-based cryptography has earned its place in the post-quantum landscape through sheer endurance. While lattice-based schemes offer superior performance and smaller sizes for the common case, code-based schemes provide the longest-standing, most battle-tested foundation for quantum-resistant encryption. The selection of HQC as a NIST standard in March 2025 ensures that this proven mathematical family will remain a cornerstone of cryptographic infrastructure for decades to come.
Further Reading
- NIST Post-Quantum Standardization: NIST PQC Standardization Process — full timeline and selection criteria
- Classical Vulnerabilities: Classical Cryptography at Risk — why current systems are threatened
- Quantum Algorithms: Shor’s and Grover’s Algorithms — the quantum attacks driving PQC adoption
- Quantum Computing Basics: Quantum Computing Fundamentals — prerequisite quantum mechanics and computing concepts