vector-radix FFT algorithm

{{Short description|Multidimensional fast Fourier transform algorithm}}

The vector-radix FFT algorithm, is a multidimensional fast Fourier transform (FFT) algorithm, which is a generalization of the ordinary Cooley–Tukey FFT algorithm that divides the transform dimensions by arbitrary radices. It breaks a multidimensional (MD) discrete Fourier transform (DFT) down into successively smaller MD DFTs until, ultimately, only trivial MD DFTs need to be evaluated.{{cite book|last1=Dudgeon|first1=Dan|last2=Russell|first2=Mersereau|title=Multidimensional Digital Signal Processing|date=September 1983|publisher=Prentice Hall|isbn=0136049591|pages=76}}

The most common multidimensional FFT algorithm is the row-column algorithm, which means transforming the array first in one index and then in the other, see more in FFT. Then a radix-2 direct 2-D FFT has been developed,{{cite journal|last1=Rivard|first1=G.|title=Direct fast Fourier transform of bivariate functions|journal=IEEE Transactions on Acoustics, Speech, and Signal Processing|volume=25|issue=3|pages=250–252|doi=10.1109/TASSP.1977.1162951|year=1977}} and it can eliminate 25% of the multiplies as compared to the conventional row-column approach. And this algorithm has been extended to rectangular arrays and arbitrary radices,{{cite book|last1=Harris|first1=D.|last2=McClellan|first2=J.|last3=Chan|first3=D.|last4=Schuessler|first4=H.|title=ICASSP '77. IEEE International Conference on Acoustics, Speech, and Signal Processing |chapter=Vector radix fast Fourier transform |volume=2|pages=548–551|doi=10.1109/ICASSP.1977.1170349|year=1977}} which is the general vector-radix algorithm.

Vector-radix FFT algorithm can reduce the number of complex multiplications significantly, compared to row-vector algorithm. For example, for a $N^M$ element matrix (M dimensions, and size N on each dimension), the number of complex multiples of vector-radix FFT algorithm for radix-2 is $\frac{2^M -1}{2^M} N^M \log_2 N$ , meanwhile, for row-column algorithm, it is $\frac{M N^M} 2 \log_2 N$ . And generally, even larger savings in multiplies are obtained when this algorithm is operated on larger radices and on higher dimensional arrays.

Overall, the vector-radix algorithm significantly reduces the structural complexity of the traditional DFT having a better indexing scheme, at the expense of a slight increase in arithmetic operations. So this algorithm is widely used for many applications in engineering, science, and mathematics, for example, implementations in image processing,{{cite journal|last1=Buijs|first1=H.|last2=Pomerleau|first2=A.|last3=Fournier|first3=M.|last4=Tam|first4=W.|title=Implementation of a fast Fourier transform (FFT) for image processing applications|journal=IEEE Transactions on Acoustics, Speech, and Signal Processing|date=Dec 1974|volume=22|issue=6|pages=420–424|doi=10.1109/TASSP.1974.1162620}} and high speed FFT processor designing.{{cite book|last1=Badar|first1=S.|last2=Dandekar|first2=D.|title=2015 International Conference on Industrial Instrumentation and Control (ICIC) |chapter=High speed FFT processor design using radix −⁴ pipelined architecture |pages=1050–1055|doi=10.1109/IIC.2015.7150901|year=2015|isbn=978-1-4799-7165-7|s2cid=11093545 }}

2-D DIT case

As with the Cooley–Tukey FFT algorithm, the two dimensional vector-radix FFT is derived by decomposing the regular 2-D DFT into sums of smaller DFT's multiplied by "twiddle" factors.

A decimation-in-time (DIT) algorithm means the decomposition is based on time domain $x$ , see more in Cooley–Tukey FFT algorithm.

We suppose the 2-D DFT is defined

: $X(k_1,k_2) = \sum_{n_1=0}^{N_1-1} \sum_{n_2=0}^{N_2-1} x[n_1, n_2] \cdot W_{N_1}^{k_1 n_1} W_{N_2}^{k_2 n_2},$

where $k_1 = 0,\dots,N_1-1$ ,and $k_2 = 0,\dots,N_2-1$ , and $x[n_1, n_2]$ is an $N_1 \times N_2$ matrix, and $W_N = \exp(-j 2\pi /N)$ .

For simplicity, let us assume that $N_1=N_2=N$ , and the radix- $(r\times r)$ is such that $N/r$ is an integer.

Using the change of variables:

$n_i=rp_i+q_i$ , where $p_i=0,\ldots,(N/r)-1; q_i = 0,\ldots,r-1;$
$k_i=u_i+v_i N/r$ , where $u_i=0,\ldots,(N/r)-1; v_i = 0,\ldots,r-1;$

where $i = 1$ or $2$ , then the two dimensional DFT can be written as:{{cite journal|last1=Chan|first1=S. C.|last2=Ho|first2=K. L.|title=Split vector-radix fast Fourier transform|journal=IEEE Transactions on Signal Processing|volume=40|issue=8|pages=2029–2039|doi=10.1109/78.150004|bibcode=1992ITSP...40.2029C|year=1992}}

: $X(u_1+v_1 N/r,u_2+v_2 N/r)=\sum_{q_1=0}^{r-1} \sum_{q_2=0}^{r-1} \left[ \sum_{p_1=0}^{N/r-1} \sum_{p_2=0}^{N/r-1} x[rp_1+q_1, rp_2+q_2] W_{N/r}^{p_1 u_1} W_{N/r}^{p_2 u_2} \right] \cdot W_N^{q_1 u_1+q_2 u_2} W_r^{q_1 v_1} W_r^{q_2 v_2},$

File:2x2 radix butterfly diagram.svg

The equation above defines the basic structure of the 2-D DIT radix- $(r\times r)$ "butterfly". (See 1-D "butterfly" in Cooley–Tukey FFT algorithm)

When $r=2$ , the equation can be broken into four summations, and this leads to:

: $X(k_1,k_2) = S_{00}(k_1,k_2) + S_{01}(k_1,k_2) W_N^{k_2} +S_{10}(k_1,k_2) W_N^{k_1} + S_{11}(k_1,k_2) W_N^{k_1+k_2}$ for $0\leq k_1, k_2 < \frac{N}{2}$ ,

where $S_{ij}(k_1,k_2)=\sum_{n_1=0}^{N/2-1} \sum_{n_2=0}^{N/2-1} x[2 n_1 + i, 2 n_2 + j] \cdot W_{N/2}^{n_1 k_1} W_{N/2}^{n_2 k_2}$ .

The $S_{ij}$ can be viewed as the $N/2$ -dimensional DFT, each over a subset of the original sample:

$S_{00}$ is the DFT over those samples of $x$ for which both $n_1$ and $n_2$ are even;
$S_{01}$ is the DFT over the samples for which $n_1$ is even and $n_2$ is odd;
$S_{10}$ is the DFT over the samples for which $n_1$ is odd and $n_2$ is even;
$S_{11}$ is the DFT over the samples for which both $n_1$ and $n_2$ are odd.

Thanks to the periodicity of the complex exponential, we can obtain the following additional identities, valid for $0\leq k_1, k_2 < \frac{N}{2}$ :

$X\biggl(k_1+\frac{N}{2},k_2\biggr) = S_{00}(k_1,k_2) + S_{01}(k_1,k_2) W_N^{k_2} -S_{10}(k_1,k_2) W_N^{k_1} - S_{11}(k_1,k_2) W_N^{k_1+k_2}$ ;
$X\biggl(k_1,k_2+\frac{N}{2}\biggr) = S_{00}(k_1,k_2) - S_{01}(k_1,k_2) W_N^{k_2} +S_{10}(k_1,k_2) W_N^{k_1} - S_{11}(k_1,k_2) W_N^{k_1+k_2}$ ;
$X\biggl(k_1+\frac{N}{2},k_2+\frac{N}{2}\biggr) = S_{00}(k_1,k_2) - S_{01}(k_1,k_2) W_N^{k_2} -S_{10}(k_1,k_2) W_N^{k_1} + S_{11}(k_1,k_2) W_N^{k_1+k_2}$ .

2-D DIF case

Similarly, a decimation-in-frequency (DIF, also called the Sande–Tukey algorithm) algorithm means the decomposition is based on frequency domain $X$ , see more in Cooley–Tukey FFT algorithm.

Using the change of variables:

$n_i=p_i+q_i N/r$ , where $p_i=0,\ldots,(N/r)-1; q_i = 0,\ldots,r-1;$
$k_i=r u_i+v_i$ , where $u_i=0,\ldots,(N/r)-1; v_i = 0,\ldots,r-1;$

where $i = 1$ or $2$ , and the DFT equation can be written as:

: $X(r u_1+v_1,r u_2+v_2)=\sum_{p_1=0}^{N/r-1} \sum_{p_2=0}^{N/r-1} \left[ \sum_{q_1=0}^{r-1} \sum_{q_2=0}^{r-1} x[p_1+q_1 N/r, p_2+q_2 N/r] W_{r}^{q_1 v_1} W_{r}^{q_2 v_2} \right] \cdot W_{N}^{p_1 v_1+p_2 v_2} W_{N/r}^{p_1 u_1} W_{N/r}^{p_2 u_2},$

Other approaches

The split-radix FFT algorithm has been proved to be a useful method for 1-D DFT. And this method has been applied to the vector-radix FFT to obtain a split vector-radix FFT.{{cite book|last1=Pei|first1=Soo-Chang|last2=Wu|first2=Ja-Lin|title=ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing |chapter=Split vector radix 2D fast Fourier transform |volume=12|date=April 1987|pages=1987–1990|doi=10.1109/ICASSP.1987.1169345|s2cid=118173900 }}

In conventional 2-D vector-radix algorithm, we decompose the indices $k_1,k_2$ into 4 groups:

: $\begin{array}{lcl}
X(2 k_1, 2 k_2) & : & \text{even-even} \\
X(2 k_1, 2 k_2 +1) & : & \text{even-odd} \\
X(2 k_1 +1, 2 k_2) & : & \text{odd-even} \\
X(2 k_1+1, 2 k_2+1) & : & \text{odd-odd}
\end{array}$

By the split vector-radix algorithm, the first three groups remain unchanged, the fourth odd-odd group is further decomposed into another four sub-groups, and seven groups in total:

: $\begin{array}{lcl}
X(2 k_1, 2 k_2) & : & \text{even-even} \\
X(2 k_1, 2 k_2 +1) & : & \text{even-odd} \\
X(2 k_1 +1, 2 k_2) & : & \text{odd-even} \\
X(4 k_1+1, 4 k_2+1) & : & \text{odd-odd} \\
X(4 k_1+1, 4 k_2+3) & : & \text{odd-odd} \\
X(4 k_1+3, 4 k_2+1) & : & \text{odd-odd} \\
X(4 k_1+3, 4 k_2+3) & : & \text{odd-odd}
\end{array}$

That means the fourth term in 2-D DIT radix- $(2\times 2)$ equation, $S_{11}(k_1,k_2) W_{N}^{k_1+k_2}$ becomes:{{cite journal|last1=Wu|first1=H.|last2=Paoloni|first2=F.|title=On the two-dimensional vector split-radix FFT algorithm|journal=IEEE Transactions on Acoustics, Speech, and Signal Processing|date=Aug 1989|volume=37|issue=8|pages=1302–1304|doi=10.1109/29.31283}}

: $A_{11}(k_1,k_2) W_N^{k_1+k_2} + A_{13}(k_1,k_2) W_N^{k_1+3 k_2} +A_{31}(k_1,k_2) W_N^{3 k_1+k_2} + A_{33}(k_1,k_2) W_N^{3(k_1+k_2)},$

where $A_{ij}(k_1,k_2)=\sum_{n_1=0}^{N/4-1} \sum_{n_2=0}^{N/4-1} x[4 n_1 + i, 4 n_2 + j] \cdot W_{N/4}^{n_1 k_1} W_{N/4}^{n_2 k_2}$

The 2-D N by N DFT is then obtained by successive use of the above decomposition, up to the last stage.

It has been shown that the split vector radix algorithm has saved about 30% of the complex multiplications and about the same number of the complex additions for typical $1024\times 1024$ array, compared with the vector-radix algorithm.

References

Category:FFT algorithms

Category:Digital signal processing

Category:Discrete transforms