cyclotomic fast Fourier transform

The cyclotomic fast Fourier transform is a type of fast Fourier transform algorithm over finite fields.S.V. Fedorenko and P.V. Trifonov, {{cite journal |last1=Fedorenko |first1=S. V. |first2=P. V.. |last2=Trifonov |url=http://dcn.ftk.spbstu.ru/~petert/papers/pushkin2.pdf |title=On Computing the Fast Fourier Transform over Finite Fields |journal=Proceedings of International Workshop on Algebraic and Combinatorial Coding Theory |pages=108–111|year=2003}} This algorithm first decomposes a DFT into several circular convolutions, and then derives the DFT results from the circular convolution results. When applied to a DFT over GF(2^m), this algorithm has a very low multiplicative complexity. In practice, since there usually exist efficient algorithms for circular convolutions with specific lengths, this algorithm is very efficient.{{cite journal | last1=Wu | first1=Xuebin | last2=Wang | first2=Ying |last3=Yan | first3=Zhiyuan | title=On Algorithms and Complexities of Cyclotomic Fast Fourier Transforms Over Arbitrary Finite Fields| journal=IEEE Transactions on Signal Processing|volume=60|issue=3|year=2012|pages=1149–1158 | doi=10.1109/tsp.2011.2178844| bibcode=2012ITSP...60.1149W }}

Background

The discrete Fourier transform over finite fields finds widespread application in the decoding of error-correcting codes such as BCH codes and Reed–Solomon codes. Generalized from the complex field, a discrete Fourier transform of a sequence \{f_i\}_{0}^{N-1} over a finite field GF(pm) is defined as

:F_j=\sum_{i=0}^{N-1}f_i\alpha^{ij}, 0\le j\le N-1,

where \alpha is the N-th primitive root of 1 in GF(pm). If we define the polynomial representation of \{f_i\}_{0}^{N-1} as

:f(x) = f_0+f_1x+f_2x^2+\cdots+f_{N-1}x^{N-1}=\sum_{0}^{N-1}f_ix^i,

it is easy to see that F_j is simply f(\alpha^j). That is, the discrete Fourier transform of a sequence converts it to a polynomial evaluation problem.

Written in matrix format,

:\mathbf{F}=\left[\begin{matrix}F_0\\F_1\\ \vdots \\ F_{N-1}\end{matrix}\right]=

\left[\begin{matrix}

\alpha^0&\alpha^0 &\cdots & \alpha^0\\

\alpha^0 & \alpha^1 &\cdots &\alpha^{N-1}\\

\vdots &\vdots & \ddots & \vdots \\

\alpha^{0} & \alpha^{N-1} &\cdots & \alpha^{(N-1)(N-1)}

\end{matrix}\right]

\left[\begin{matrix}f_0\\f_1\\\vdots\\f_{N-1}\end{matrix}\right]=\mathcal{F}\mathbf{f}.

Direct evaluation of DFT has an O(N^2) complexity. Fast Fourier transforms are just efficient algorithms evaluating the above matrix-vector product.

Algorithm

First, we define a linearized polynomial over GF(pm) as

:L(x) = \sum_{i} l_i x^{p^i}, l_i \in \mathrm{GF}(p^m).

L(x) is called linearized because L(x_1+x_2) = L(x_1) + L(x_2), which comes from the fact that for elements x_1,x_2 \in \mathrm{GF}(p^m),(x_1+x_2)^p=x_1^p+x_2^p.

Notice that p is invertible modulo N because N must divide the order p^m-1 of the multiplicative group of the field \mathrm{GF}(p^m). So, the elements \{0, 1, 2, \ldots, N-1\} can be partitioned into l+1 cyclotomic cosets modulo N:

:\{0\},

:\{k_1, pk_1, p^2k_1, \ldots, p^{m_1-1}k_1\},

:\ldots,

:\{k_l, pk_l, p^2k_l, \ldots, p^{m_l-1}k_l\},

where k_i=p^{m_i}k_i \pmod{N}. Therefore, the input to the Fourier transform can be rewritten as

:f(x)=\sum_{i=0}^l L_i(x^{k_i}),\quad L_i(y) = \sum_{t=0}^{m_i-1}y^{p^t}f_{p^tk_i\bmod{N}}.

In this way, the polynomial representation is decomposed into a sum of linear polynomials, and hence F_j is given by

:F_j=f(\alpha^j)=\sum_{i=0}^lL_i(\alpha^{jk_i}).

Expanding \alpha^{jk_i} \in \mathrm{GF}(p^{m_i}) with the proper basis \{\beta_{i,0}, \beta_{i,1}, \ldots, \beta_{i,m_i-1}\}, we have \alpha^{jk_i} = \sum_{s=0}^{m_i-1}a_{ijs}\beta_{i,s} where a_{ijs} \in \mathrm{GF}(p), and by the property of the linearized polynomial L_i(x), we have

:F_j=\sum_{i=0}^l\sum_{s=0}^{m_i-1}a_{ijs}\left(\sum_{t=0}^{m_i-1}\beta_{i,s}^{p^t}f_{p^{t}k_i\bmod{N}}\right)

This equation can be rewritten in matrix form as \mathbf{F}=\mathbf{AL\Pi f}, where \mathbf{A} is an N\times N matrix over GF(p) that contains the elements a_{ijs}, \mathbf{L} is a block diagonal matrix, and \mathbf{\Pi} is a permutation matrix regrouping the elements in \mathbf{f} according to the cyclotomic coset index.

Note that if the normal basis \{\gamma_i^{p^0}, \gamma_i^{p^1}, \cdots, \gamma_i^{p^{m_i-1}}\} is used to expand the field elements of \mathrm{GF}(p^{m_i}), the i-th block of \mathbf{L} is given by:

:\mathbf{L}_i=

\begin{bmatrix}

\gamma_i^{p^0} &\gamma_i^{p^1} &\cdots &\gamma_i^{p^{m_i-1}}\\

\gamma_i^{p^1} &\gamma_i^{p^2} &\cdots &\gamma_i^{p^{0}}\\

\vdots & \vdots & \ddots & \vdots\\

\gamma_i^{p^{m_i-1}} &\gamma_i^{p^0} &\cdots &\gamma_i^{p^{m_i-2}}\\

\end{bmatrix}

which is a circulant matrix. It is well known that a circulant matrix-vector product can be efficiently computed by convolutions. Hence we successfully reduce the discrete Fourier transform into short convolutions.

Complexity

When applied to a characteristic-2 field GF(2m), the matrix \mathbf{A} is just a binary matrix. Only addition is used when calculating the matrix-vector product of \mathrm{A} and \mathrm{L\Pi f}. It has been shown that the multiplicative complexity of the cyclotomic algorithm is given by O(n(\log_2n)^{\log_2\frac{3}{2}}), and the additive complexity is given by O(n^2/(\log_2 n)^{\log_2\frac{8}{3}}).

References