Sparse Fourier transform

The sparse Fourier transform (SFT) is a kind of discrete Fourier transform (DFT) for handling big data signals. Specifically, it is used in GPS synchronization, spectrum sensing and analog-to-digital converters.:{{Cite journal |doi = 10.1109/MSP.2014.2329131|title = Recent Developments in the Sparse Fourier Transform: A compressed Fourier transform for big data|journal = IEEE Signal Processing Magazine|volume = 31|issue = 5|pages = 91–100|year = 2014|last1 = Gilbert|first1 = Anna C.|last2 = Indyk|first2 = Piotr|last3 = Iwen|first3 = Mark|last4 = Schmidt|first4 = Ludwig|bibcode = 2014ISPM...31...91G|url = https://dspace.mit.edu/bitstream/1721.1/113828/1/Recent%20developments.pdf|hdl = 1721.1/113828| s2cid=14585685 |hdl-access = free}}

The fast Fourier transform (FFT) plays an indispensable role on many scientific domains, especially on signal processing. It is one of the top-10 algorithms in the twentieth century.{{ Cite journal | title = The best of the 20th century: Editors name top 10 algorithms | last1 = Cipra | first1 = Barry A. | date =2000 | journal = SIAM News |volume= 33 | issue = 4 }} However, with the advent of big data era, the FFT still needs to be improved in order to save more computing power. Recently, the sparse Fourier transform (SFT) has gained a considerable amount of attention, for it performs well on analyzing the long sequence of data with few signal components.

Definition

Consider a sequence x_n of complex numbers. By Fourier series, x_n can be written as

: $x_n=(F^*X)_n=\sum_{k=0}^{N-1}X_k e^{j\frac{2\pi}{N}kn}.$

Similarly, X_k can be represented as

: $X_k=\frac{1}{N}(Fx)_k=\frac{1}{N}\sum_{n=0}^{N-1}x_n e^{-j\frac{2\pi}{N}kn}.$

Hence, from the equations above, the mapping is $F:\mathbb C^N\to \mathbb C^N$ .

=Single frequency recovery=

Assume only a single frequency exists in the sequence. In order to recover this frequency from the sequence, it is reasonable to utilize the relationship between adjacent points of the sequence.

==Phase encoding==

The phase k can be obtained by dividing the adjacent points of the sequence. In other words,

: $\frac{x_{n+1}}{x_n}=e^{j\frac{2\pi}{N}k} = \cos\left(\frac{2\pi k}{N}\right)+j \sin\left(\frac{2\pi k}{N}\right).$

Notice that $x_n \in \mathbb C^N$ .

==An aliasing-based search==

File:Aliasing-based search.png

Seeking phase k can be done by Chinese remainder theorem (CRT).{{cite journal |last1=Iwen |first1=M. A. |title=Combinatorial Sublinear-Time Fourier Algorithms |journal=Foundations of Computational Mathematics |date=2010-01-05 |volume=10 |issue=3 |pages=303–338 |doi=10.1007/s10208-009-9057-1|s2cid=1631513 }}

Take $k=104{,}134$ for an example. Now, we have three relatively prime integers 100, 101, and 103. Thus, the equation can be described as

: $k=104{,}134\equiv \left\{ \begin{array}{rl} 34 & \bmod 100, \\ 3 & \bmod 101, \\ 1 & \bmod 103. \end{array} \right.$

By CRT, we have

: $k=104{,}134\bmod (100\cdot101\cdot103)=104{,}134\bmod 1{,}040{,}300$

==Randomly binning frequencies==

File:Randomlybinning.pdf

Now, we desire to explore the case of multiple frequencies, instead of a single frequency. The adjacent frequencies can be separated by the scaling c and modulation b properties. Namely, by randomly choosing the parameters of c and b, the distribution of all frequencies can be almost a uniform distribution. The figure Spread all frequencies reveals by randomly binning frequencies, we can utilize the single frequency recovery to seek the main components.

: $x_n'=X_k e^{j\frac{2\pi}{N}(c\cdot k+b)},$

where c is scaling property and b is modulation property.

By randomly choosing c and b, the whole spectrum can be looked like uniform distribution. Then, taking them into filter banks can separate all frequencies, including Gaussians,{{cite book |author1=Haitham Hassanieh |author2=Piotr Indyk |author3=Dina Katabi |author4=Eric Price |title=Simple and Practical Algorithm for Sparse Fourier Transform |pages=1183–1194 |date=2012 |doi=10.1137/1.9781611973099.93 |hdl=1721.1/73474 |isbn=978-1-61197-210-8 }} indicator functions,{{cite book |author1=A. C. Gilbert |others=S. Guha, P. Indyk, S. Muthukrishnan, M. Strauss |title=Proceedings of the thiry-fourth annual ACM symposium on Theory of computing |chapter=Near-optimal sparse fourier representations via sampling |date=2002 |pages=152–161 |doi=10.1145/509907.509933|isbn=1581134959 |s2cid=14320243 }}{{cite book|author1=A. C. Gilbert |author2=S. Muthukrishnan|author2-link=S. Muthukrishnan (computer scientist)|author3= M. Strauss |editor-first1=Manos |editor-first2=Andrew F |editor-first3=Michael A |editor-last1=Papadakis |editor-last2=Laine |editor-last3=Unser |chapter=Improved time bounds for near-optimal sparse Fourier representations |title=Wavelets XI |volume=5914|pages=59141A|date=21 September 2005 |doi=10.1117/12.615931|series=Proceedings of SPIE|bibcode=2005SPIE.5914..398G|s2cid=12622592 }} spike trains,{{Cite book |doi = 10.1109/Allerton.2013.6736670|chapter = Sample-optimal average-case sparse Fourier Transform in two dimensions|title = 2013 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton)|pages = 1258–1265|year = 2013|last1 = Ghazi|first1 = Badih|last2 = Hassanieh|first2 = Haitham|last3 = Indyk|first3 = Piotr|last4 = Katabi|first4 = Dina|last5 = Price|first5 = Eric|last6 = Lixin Shi|isbn = 978-1-4799-3410-2|arxiv = 1303.1209| s2cid=6151728 }}{{cite journal |last1=Iwen |first1=M. A. |title=Combinatorial Sublinear-Time Fourier Algorithms |journal=Foundations of Computational Mathematics |date=2010-01-05 |volume=10 |issue=3 |pages=303–338 |doi=10.1007/s10208-009-9057-1|s2cid=1631513 }}{{cite journal |author1=Mark A.Iwen |title=Improved approximation guarantees for sublinear-time Fourier algorithms |journal=Applied and Computational Harmonic Analysis |date=2013-01-01 |volume=34 |issue=1 |pages=57–82 |doi=10.1016/j.acha.2012.03.007 |language=en |issn=1063-5203|arxiv=1010.0014 |s2cid=16808450 }}{{Cite book |doi = 10.1109/ISIT.2013.6620269|chapter = Computing a k-sparse n-length Discrete Fourier Transform using at most 4k samples and O(k log k) complexity|title = 2013 IEEE International Symposium on Information Theory|pages = 464–468|year = 2013|last1 = Pawar|first1 = Sameer|last2 = Ramchandran|first2 = Kannan|isbn = 978-1-4799-0446-4| s2cid=601496 }} and Dolph-Chebyshev filters.{{cite book |last1=Hassanieh |first1=Haitham |last2=Indyk |first2=Piotr |last3=Katabi |first3=Dina |last4=Price |first4=Eric |title=Proceedings of the forty-fourth annual ACM symposium on Theory of computing |chapter=Nearly optimal sparse fourier transform |date=2012 |pages=563–578 |doi=10.1145/2213977.2214029 |chapter-url=https://dl.acm.org/citation.cfm?id=2214029 |publisher=ACM |arxiv=1201.2501 |series=STOC'12 |isbn=9781450312455 |s2cid=3760962 }} Each bank only contains a single frequency.

The prototypical SFT

Generally, all SFT follows the three stages

=Identifying frequencies=

By randomly bining frequencies, all components can be separated. Then, taking them into filter banks, so each band only contains a single frequency. It is convenient to use the methods we mentioned to recover this signal frequency.

=Estimating coefficients=

After identifying frequencies, we will have many frequency components. We can use Fourier transform to estimate their coefficients.

: $X_k'=\frac 1 L \sum_{l=1}^L x_n'e^{-j\frac{2\pi}{N}n'\ell}$

=Repeating=

Finally, repeating these two stages can we extract the most important components from the original signal.

: $x_n-\sum_{k'=1}^k X_k' e^{j\frac{2\pi}{N}k'n}$

Sparse Fourier transform in the discrete setting

In 2012, Hassanieh, Indyk, Katabi, and Price proposed an algorithm that takes $O ( k \log n \log (n/k) )$ samples and runs in the same running time.

Sparse Fourier transform in the high dimensional setting

In 2014, Indyk and Kapralov {{cite journal |last1=Indyk |first1=Piotr |last2=Kapralov |first2=Michael |title=Sample-optimal Fourier sampling in any constant dimension |url=https://archive.org/details/arxiv-1403.5804 |journal= Annual Symposium on Foundations of Computer Science |date=2014 |pages=514–523 |arxiv=1403.5804 |series=FOCS'14 }} proposed an algorithm that takes $2^{O(d \log d)} k \log n$ samples and runs in nearly linear time in $n$ . In 2016, Kapralov {{cite book |last1=Kapralov |first1=Michael |title=Proceedings of the forty-eighth annual ACM symposium on Theory of Computing |chapter=Sparse fourier transform in any constant dimension with nearly-optimal sample complexity in sublinear time |url=https://archive.org/details/arxiv-1604.00845 |date=2016 | arxiv=1604.00845 |series=STOC'16 |pages=264–277 |doi=10.1145/2897518.2897650 |isbn=9781450341325 |s2cid=11847086 }} proposed an algorithm that uses sublinear samples $2^{O(d^2)} k \log n \log \log n$ and sublinear decoding time $k \log^{O(d)} n$ . In 2019, Nakos, Song, and Wang {{cite journal |last1=Nakos |first1=Vasileios |last2=Song |first2=Zhao | last3 = Wang | first3 = Zhengyu |title= (Nearly) Sample-Optimal Sparse Fourier Transform in Any Dimension; RIPless and Filterless |journal= Annual Symposium on Foundations of Computer Science |date=2019 | arxiv = 1909.11123 |series=FOCS'19 }} introduced a new algorithm which uses nearly optimal samples $O( k \log n \log k )$ and requires nearly linear time decoding time. A dimension-incremental algorithm was proposed by Potts, Volkmer {{cite journal |last1=Potts |first1=Daniel |last2=Volkmer |first2=Toni |title= Sparse high-dimensional FFT based on rank-1 lattice sampling |journal= Applied and Computational Harmonic Analysis|date=2016|volume=41 |issue=3 |pages=713–748 | doi= 10.1016/j.acha.2015.05.002}} based on sampling along rank-1 lattices.

Sparse Fourier transform in the continuous setting

There are several works about generalizing the discrete setting into the continuous setting.{{cite journal |last1=Price |first1=Eric |last2=Song |first2=Zhao |title=A Robust Sparse Fourier Transform in the Continuous Setting |journal= Annual Symposium on Foundations of Computer Science |date=2015 |pages=583–600 |arxiv=1609.00896 |series=FOCS'15 }}{{cite journal |last1 = Chen | first1 = Xue | last2 = Kane | first2 = Daniel M.| last3=Price |first3=Eric |last4=Song |first4=Zhao |title=Fourier-Sparse Interpolation without a Frequency Gap |url = https://archive.org/details/arxiv-1609.01361 |journal= Annual Symposium on Foundations of Computer Science |date=2016 |pages=741–750

|arxiv=1609.01361 |series=FOCS'16 }}

Implementations

There are several works based on MIT, MSU, ETH and University of Technology Chemnitz [TUC]. Also, they are free online.

[https://sourceforge.net/projects/aafftannarborfa/ MSU implementations]
[http://www.spiral.net/software/sfft.html ETH implementations]
[http://groups.csail.mit.edu/netmit/sFFT/ MIT implementations]
[https://github.com/davidediger/sfft GitHub]
[https://www.tu-chemnitz.de/~tovo/software.php TUC implementations]

References

Category:Fourier analysis

Category:Big data