Dixon's factorization method

In number theory, Dixon's factorization method (also Dixon's random squares method{{cite book |first=Thorsten |last=Kleinjung |chapter=Factorization of a 768-Bit RSA Modulus |title=Advances in Cryptology – CRYPTO 2010 |series=Lecture Notes in Computer Science |year=2010 |volume=6223 |pages=333–350 |doi=10.1007/978-3-642-14623-7_18 |display-authors=etal|isbn=978-3-642-14622-0 |s2cid=11556080 }} or Dixon's algorithm) is a general-purpose integer factorization algorithm; it is the prototypical factor base method. Unlike for other factor base methods, its run-time bound comes with a rigorous proof that does not rely on conjectures about the smoothness properties of the values taken by a polynomial.

The algorithm was designed by John D. Dixon, a mathematician at Carleton University, and was published in 1981.{{cite journal |last=Dixon |first=J. D. |author-link=John D. Dixon |year=1981 |title=Asymptotically fast factorization of integers |url=https://www.ams.org/journals/mcom/1981-36-153/S0025-5718-1981-0595059-1/S0025-5718-1981-0595059-1.pdf |journal=Math. Comp. |volume=36 |issue=153 |pages=255–260 |doi=10.1090/S0025-5718-1981-0595059-1 |jstor=2007743 |doi-access=free}}

Basic idea

Dixon's method is based on finding a congruence of squares modulo the integer N which is intended to factor. Fermat's factorization method finds such a congruence by selecting random or pseudo-random x values and hoping that the integer x² mod N is a perfect square (in the integers):

: $x^2\equiv y^2\quad(\hbox{mod }N),\qquad x\not\equiv\pm y\quad(\hbox{mod }N).$

For example, if {{nowrap|N {{=}} 84923}}, (by starting at 292, the first number greater than {{radic|N}} and counting up) the {{nowrap|505² mod 84923}} is 256, the square of 16. So {{nowrap|(505 − 16)(505 + 16) {{=}} 0 mod 84923}}. Computing the greatest common divisor of {{nowrap|505 − 16}} and N using Euclid's algorithm gives 163, which is a factor of N.

In practice, selecting random x values will take an impractically long time to find a congruence of squares, since there are only {{radic|N}} squares less than N.

Dixon's method replaces the condition "is the square of an integer" with the much weaker one "has only small prime factors"; for example, there are 292 squares smaller than 84923; 662 numbers smaller than 84923 whose prime factors are only 2,3,5 or 7; and 4767 whose prime factors are all less than 30. (Such numbers are called B-smooth with respect to some bound B.)

If there are many numbers $a_1 \ldots a_n$ whose squares can be factorized as $a_i^2 \mod N = \prod_{j=1}^m b_j^{e_{ij}}$ for a fixed set $b_1 \ldots b_m$ of small primes, linear algebra modulo 2 on the matrix $e_{ij}$ will give a subset of the $a_i$ whose squares combine to a product of small primes to an even power — that is, a subset of the $a_i$ whose squares multiply to the square of a (hopefully different) number mod N.

Method

Suppose the composite number N is being factored. Bound B is chosen, and the factor base is identified (which is called P), the set of all primes less than or equal to B. Next, positive integers z are sought such that z² mod N is B-smooth. Therefore we can write, for suitable exponents a_i,

: $z^2 \text{ mod } N = \prod_{p_i\in P} p_i^{a_i}$

When enough of these relations have been generated (it is generally sufficient that the number of relations be a few more than the size of P), the methods of linear algebra, such as Gaussian elimination, can be used to multiply together these various relations in such a way that the exponents of the primes on the right-hand side are all even:

: ${z_1^2 z_2^2 \cdots z_k^2 \equiv \prod_{p_i\in P} p_i^{a_{i,1}+a_{i,2}+\cdots+a_{i,k}}\ \pmod{N}\quad (\text{where } a_{i,1}+a_{i,2}+\cdots+a_{i,k} \equiv 0\pmod{2}) }$

This yields a congruence of squares of the form {{nowrap|a² ≡ b² (mod N),}} which can be turned into a factorization of N, {{nowrap|N {{=}} gcd(a + b, N) × (N/gcd(a + b, N)).}} This factorization might turn out to be trivial (i.e. {{nowrap|N {{=}} N × 1}}), which can only happen if {{nowrap|a ≡ ±b (mod N),}} in which case another try must be made with a different combination of relations; but if a nontrivial pair of factors of N is reached, the algorithm terminates.

Pseudocode

This section is taken directly from Dixon(1981).

Dixon's Algorithm

Initialization. Let L be a list of integers in the range [1, n], and let P = {p₁, ..., pₕ} be the list of the h primes ≤ v. Let B and Z be initially empty lists (Z will be indexed by B).

Step 1. If L is empty, exit (algorithm unsuccessful). Otherwise, take the first term z from L, remove it from L, and proceed to Step 2.

Step 2. Compute w as the least positive remainder of z² mod n. Factor w as:

$w = w' \prod_i p_i^{a_i}$

where w′ has no factor in P. If w′ = 1, proceed to Step 3; otherwise, return to Step 1.

Step 3. Let a ← (a₁, ..., aₕ). Add a to B and z to Z. If B has at most h elements, return to Step 1; otherwise, proceed to Step 4.

Step 4. Find the first vector c in B that is linearly dependent (mod 2) on earlier vectors in B. Remove c from B and $z_c$ from Z. Compute coefficients $f_b$ such that:

$\mathbf{c} \equiv \sum_{b \in B} f_b \mathbf{b} \pmod{2}$

Define:

$\mathbf{d} = (d_1, \dots, d_n) \gets \frac{1}{2} \left(\mathbf{c} + \sum f_b \mathbf{b} \right)$

Proceed to Step 5.

Step 5. Compute:

$x \gets z_c \prod_b z_b^{f_b}, \quad y \gets \prod_i p_i^{d_i}$

so that:

$x^2 \equiv \prod_i p_i^{2d_i} = y^2 \mod n.$

If $x \equiv y$ or $x \equiv -y \pmod{n}$ , return to Step 1. Otherwise, return:

$\gcd(n, x + y)$

which provides a nontrivial factor of n, and terminate successfully.

Step-by-step example : factorizing ( n = 84923) using Dixon’s Algorithm

This example is lifted directly from the LeetArxiv substack.Kibicho, Murage (2025). [https://leetarxiv.substack.com/p/hand-written-paper-implementation Hand-Written Paper Implementation]: Asymptotically Fast Factorization of Integers. Credit is given to the original author.

Initialization:
Define a list of numbers L, ranging from 1 to 84923:

::: $L = \{1, \dots, 84923\}$

:* Define a value v, which is the smoothness factor:

::: $v = 7$

:* Define a list P containing all the prime numbers less than or equal to v:

::: $P = {2, 3, 5, 7}$

:* Define B and Z, two empty lists. B is a list of powers, while Z is a list of accepted integers:

::: $B = [ ]$

::: $Z = [ ]$

Step 1: Iterating $z$ values
Write a for loop that indexes the list $L$ . Each element in $L$ is indexed as $z$ . The for loop exits at the end of the list.

int n = 84923;

for(int i = 1; i <= n; i++)

{

int z = i;

}

Step 2: Computing $z^2 \mod n$ and v-smooth Prime Factorization
To proceed, compute $z^2 \mod 84923$ for each z, then express the result as a prime factorization.

:: $1^2 \mod 84923 \equiv 1 \mod 84923 = 2^0 \cdot 3^0 \cdot 5^0 \cdot 7^0 \mod 84923$

:: $\vdots$

:: $513^2 \mod 84923 = 8400 \mod 84923 = 2^4 \cdot 3^1 \cdot 5^2 \cdot 7^1 \mod 84923$

:: $\vdots$

:: $537^2 \mod 84923 = 33600 \mod 84923 = 2^6 \cdot 3^1 \cdot 5^2 \cdot 7^1 \mod 84923$

:: $538^2 \mod 84923 = 34675 \mod 84923 = 5^2 \cdot 19^1 \cdot 73^1 \mod 84923$

::This step continues for all values of z in the range.

Step 3: If $z^2 \mod 84923$ is 7-smooth, then append its powers to list $B$ and append $z$ to list $Z$ .

:: $Z = \{1, 513, 537\}$

:: $B = \{ [0, 0, 0, 0], [4, 1, 2, 1], [6, 1, 2, 1] \}$

Step 4: This step is split into two parts.

:: Part 1: Find $B$ modulo 2.

:: $B =
\begin{pmatrix}
0 & 0 & 0 & 0 \\
4 & 1 & 2 & 1 \\
6 & 1 & 2 & 1
\end{pmatrix}
\mod 2
\equiv
B =
\begin{pmatrix}
0 & 0 & 0 & 0 \\
0 & 1 & 0 & 1 \\
0 & 1 & 0 & 1
\end{pmatrix}$

:: Part 2: Check if any row combinations of $B$ sum to even numbers

:::For example, summing Row $2$ and Row $3$ gives us a vector of even numbers.

::: $R_2 = \{0, 1, 0, 1\}$ and $R_3 = \{0, 1, 0, 1\}$

:::then

::: $R_2 + R_3 = \{0, 1, 0, 1\} + \{0, 1, 0, 1\}$

::: $R_2 + R_3 = \{0, 2, 0, 2\}$ .

Step 5 : This step is split into four parts.
Part 1. (Finding x): Multiply the corresponding $z$ values for the rows found in Step 4. Then find the square root. This gives us $x$ .
For Row 2, we had $2^4 * 3^1 * 5^2 * 7^1$ .
For Row 3, we had $2^6 * 3^1 * 5^2 * 7^1$ .
Thus, we find $x$ :

:::: $\begin{array}{ll}
(513 \cdot 537) ^ 2 \mod 84923 = y ^ 2 \\
\\
\text{where} \quad x^2 \mod 84923 = (513 \cdot 537) ^ 2 \mod 84923 \\
\\
\text{thus} \quad x = (513 \cdot 537) \mod 84923 \\
\\
\text{so} \quad x = 275481 \mod 84923\\
\\
\text{Finally} \quad x = 20712 \mod 84923\\
\end{array}$

:* Part 2. (Finding y): Multiply the corresponding smooth factorizations for the rows found in Step 4. Then find the square root. This gives us $y$ .

::: $\begin{array}{ll}
y^2 = (2^4 \cdot 3^1 \cdot 5^2 \cdot 7^1) \times (2^6 \cdot 3^1 \cdot 5^2 \cdot 7^1) \\
\\
\text{By the multiplication law of exponents,} \\
y^2 = 2^{(4+6)} \cdot 3^{(1+1)} \cdot 5^{(2+2)} \cdot 7^{(1+1)} \\
\\
\text{Thus,} \\
y^2 = 2^{10} \cdot 3^2 \cdot 5^4 \cdot 7^2 \\
\\
\text{Taking square roots on both sides gives} \\
y = 2^5 \cdot 3^1 \cdot 5^2 \cdot 7^1 \\
\\
\text{Therefore,} \\
y = 32 \times 3 \times 25 \times 7 \\
\\
\text{Finally,} \\
y = 16800
\end{array}$

:* Part 3. (Find x + y and x - y) where x = 20712 and y = 16800.

::: $x + y = 20712 + 16800 = 37512$

::: $x - y = 20712 - 16800 = 3912$

:* Part 4. Compute GCD(x+y, n) and GCD(x-y, n), where n = 84923, x+y = 292281 and x-y = 258681

::: $\begin{array}{ll}
\gcd(37512, 84923) = 521 \\
\gcd(3912, 84923) = 163
\end{array}$

Quick check shows $84923 = 521 \times 163$ .

Optimizations

The quadratic sieve is an optimization of Dixon's method. It selects values of x close to the square root of {{var|N}} such that x² modulo N is small, thereby largely increasing the chance of obtaining a smooth number.

Other ways to optimize Dixon's method include using a better algorithm to solve the matrix equation, taking advantage of the sparsity of the matrix: a number z cannot have more than $\log_2 z$ factors, so each row of the matrix is almost all zeros. In practice, the block Lanczos algorithm is often used. Also, the size of the factor base must be chosen carefully: if it is too small, it will be difficult to find numbers that factorize completely over it, and if it is too large, more relations will have to be collected.

A more sophisticated analysis, using the approximation that a number has all its prime factors less than $N^{1/a}$ with probability about $a^{-a}$ (an approximation to the Dickman–de Bruijn function), indicates that choosing too small a factor base is much worse than too large, and that the ideal factor base size is some power of $\exp\left(\sqrt{\log N \log \log N}\right)$ .

The optimal complexity of Dixon's method is

: $O\left(\exp\left(2 \sqrt 2 \sqrt{\log n \log \log n}\right)\right)$

in big-O notation, or

: $L_n [1/2, 2 \sqrt 2]$

in L-notation.

References

Category:Integer factorization algorithms

Category:Squares in number theory