Embarrassingly parallel
{{Short description|Computational problem that can be trivially divided into parallel tasks}}
In parallel computing, an embarrassingly parallel workload or problem (also called embarrassingly parallelizable, perfectly parallel, delightfully parallel or pleasingly parallel) is one where little or no effort is needed to split the problem into a number of parallel tasks.{{cite book|last1=Herlihy|first1=Maurice|last2=Shavit|first2=Nir|title=The Art of Multiprocessor Programming, Revised Reprint|date=2012|publisher=Elsevier|isbn=9780123977953|page=14|edition=revised|url=https://books.google.com/books?id=vfvPrSz7R7QC&q=embarrasingly|access-date=28 February 2016|quote=Some computational problems are “embarrassingly parallel”: they can easily be divided into components that can be executed concurrently.}} This is because the parallel tasks have little or no need to communicate with one another or to share intermediate results. Section 1.4.4 of: {{cite book
|url=http://www.mcs.anl.gov/~itf/dbpp/text/node10.html
|title=Designing and Building Parallel Programs
|author=Foster, Ian
|publisher=Addison–Wesley
|archive-url=https://web.archive.org/web/20110301095228/http://www.mcs.anl.gov/~itf/dbpp/text/node10.html
|archive-date=2011-03-01
|year=1995
|url-status=dead
|isbn=9780201575941
}}
These differ from distributed computing problems, which require communication between tasks, especially communication of intermediate results. They are easier to perform on server farms that lack the special infrastructure used in a true supercomputer cluster, are well suited to large, Internet-based volunteer computing platforms such as BOINC, and suffer less from parallel slowdown. The opposite of an embarrassingly parallel problem is an inherently serial problem, which cannot be parallelized at all.
A common example of an embarrassingly parallel problem is 3D video rendering handled by a graphics processing unit, where each frame (forward method) or pixel (ray tracing method) can be handled with no interdependency.{{cite book|author1=Alan Chalmers|author2=Erik Reinhard|author3=Tim Davis|title=Practical Parallel Rendering|url=https://books.google.com/books?id=loxhtzzG1FYC&q=%22embarrassingly+parallel%22|date=21 March 2011|publisher=CRC Press|isbn=978-1-4398-6380-0}} Some forms of password cracking are other embarrassingly parallel tasks that are easily distributed across central processing units, CPU cores, or clusters.
Etymology
"Embarrassingly" is used here to refer to parallelization problems which are "embarrassingly easy".Matloff, Norman (2011). The Art of R Programming: A Tour of Statistical Software Design, p.347. No Starch. {{ISBN|9781593274108}}. The term may imply embarrassment on the part of developers or compilers: "Because so many important problems remain unsolved mainly due to their intrinsic computational complexity, it would be embarrassing not to develop parallel implementations of polynomial homotopy continuation methods."{{cite book|last1=Leykin|first1=Anton|last2=Verschelde|first2=Jan|last3=Zhuang|first3=Yan|title=Mathematical Software - ICMS 2006 |chapter=Parallel Homotopy Algorithms to Solve Polynomial Systems |year=2006 |volume=4151|pages=225–234|doi=10.1007/11832225_22|series=Lecture Notes in Computer Science|isbn=978-3-540-38084-9}} The term is first found in the literature in a 1986 book on multiprocessors by MATLAB's creator Cleve Moler,{{cite book
| contribution = Matrix Computation on Distributed Memory Multiprocessors
| author = Moler, Cleve
| publisher = Society for Industrial and Applied Mathematics, Philadelphia
| title = Hypercube Multiprocessors
| editor-last = Heath
| editor-first = Michael T.
| year = 1986
| isbn = 978-0898712094
}} who claims to have invented the term.[http://blogs.mathworks.com/cleve/2013/11/12/the-intel-hypercube-part-2-reposted/#096367ea-045e-4f28-8fa2-9f7db8fb7b01 The Intel hypercube part 2 reposted on Cleve's Corner blog on The MathWorks website]
An alternative term, pleasingly parallel, has gained some use, perhaps to avoid the negative connotations of embarrassment in favor of a positive reflection on the parallelizability of the problems: "Of course, there is nothing embarrassing about these programs at all."Kepner, Jeremy (2009). Parallel MATLAB for Multicore and Multinode Computers, p.12. SIAM. {{isbn|9780898716733}}.
Examples
A trivial example involves serving static data. It would take very little effort to have many processing units produce the same set of bits. Indeed, the famous "Hello, World!" program could easily be parallelized with minimal programming effort or computational cost.
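The trivial case can be sketched in a few lines of Python (an illustrative sketch, not drawn from the cited sources): every worker independently produces the same payload, so no coordination between tasks is ever required.

```python
from concurrent.futures import ThreadPoolExecutor

def serve_static(_request_id: int) -> bytes:
    # Every task independently returns the same bytes; tasks never
    # communicate, so adding more workers needs no extra design effort.
    return b"Hello, World!\n"

if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=4) as pool:
        responses = list(pool.map(serve_static, range(8)))
    print(all(r == b"Hello, World!\n" for r in responses))  # True
```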
Some examples of embarrassingly parallel problems include:
- Monte Carlo method{{cite book|author=Erricos John Kontoghiorghes|title=Handbook of Parallel Computing and Statistics|url=https://books.google.com/books?id=BnNnKPkFH2kC&q=%22embarrassingly+parallel%22|date=21 December 2005|publisher=CRC Press|isbn=978-1-4200-2868-3}}
- Distributed relational database queries using [http://www.mysqlperformanceblog.com/2011/05/14/distributed-set-processing-with-shard-query/ distributed set processing].
- Numerical integration{{cite book|author=Yuefan Deng|title=Applied Parallel Computing|url=https://books.google.com/books?id=YS9wvVeWrXgC&q=%22embarrassingly+parallel%22|year=2013|publisher=World Scientific|isbn=978-981-4307-60-4}}
- Bulk processing of unrelated files of similar nature in general, such as photo gallery resizing and conversion.
- The Mandelbrot set, Perlin noise and similar images, where each point is calculated independently.
- Rendering of computer graphics. In computer animation, each frame or pixel may be rendered independently {{xref|(see: Parallel rendering)}}.
- Some brute-force searches in cryptography.{{Cite journal|url=https://tools.ietf.org/html/rfc7914#page-2|title=The scrypt Password-Based Key Derivation Function|first1=Simon|last1=Josefsson|first2=Colin|last2=Percival|author-link2=Colin Percival|date=August 2016|website=tools.ietf.org|doi=10.17487/RFC7914 |access-date=2016-12-12}} Notable real-world examples include distributed.net and proof-of-work systems used in cryptocurrency.
- BLAST searches in bioinformatics with split databases.{{cite journal |last1=Mathog |first1=DR |title=Parallel BLAST on split databases. |journal=Bioinformatics |date=22 September 2003 |volume=19 |issue=14 |pages=1865–6 |doi=10.1093/bioinformatics/btg250 |pmid=14512366|doi-access=free }}
- Large-scale facial recognition systems that compare thousands of arbitrarily acquired faces (e.g., from security or surveillance video via closed-circuit television) with a similarly large number of previously stored faces (e.g., a rogues gallery or similar watch list).[http://lbrandy.com/blog/2008/10/how-we-made-our-face-recognizer-25-times-faster/ How we made our face recognizer 25 times faster] (developer blog post)
- Computer simulations comparing many independent scenarios.
- Genetic algorithms.{{cite book|author1=Shigeyoshi Tsutsui|author2=Pierre Collet|title=Massively Parallel Evolutionary Computation on GPGPUs|url=https://books.google.com/books?id=Hv68BAAAQBAJ&q=%22embarrassingly+parallel%22|date=5 December 2013|publisher=Springer Science & Business Media|isbn=978-3-642-37959-8}}
- Ensemble calculations of numerical weather prediction.
- Event simulation and reconstruction in particle physics.
- The marching squares algorithm.
- Sieving step of the quadratic sieve and the number field sieve.
- Tree growth step of the random forest machine learning technique.
- Discrete Fourier transform where each harmonic is independently calculated.
- Convolutional neural networks running on GPUs.
- Parallel search in constraint programming{{cite book|author1=Youssef Hamadi|author2=Lakhdar Sais|title=Handbook of Parallel Constraint Reasoning|url=https://books.google.com/books?id=w5JUDwAAQBAJ&q=%22embarrassingly+parallel%22|date=5 April 2018|publisher=Springer|isbn=978-3-319-63516-3}}
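Many of the entries above share the same structure: a large number of independent subcomputations whose only interaction is a final aggregation. The Monte Carlo method illustrates this well. The sketch below (an illustrative Python example, not taken from the cited sources) estimates π by splitting the random samples across worker processes; the only communication is the final sum of per-worker hit counts.

```python
import random
from multiprocessing import Pool

def count_hits(n_samples: int) -> int:
    """Count random points in the unit square that fall inside the quarter circle."""
    rng = random.Random()  # each worker seeds independently from the OS
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits

def estimate_pi(total_samples: int = 1_000_000, workers: int = 4) -> float:
    # Each worker receives an independent share of the samples; no worker
    # needs any intermediate result from another. The only communication
    # is collecting the per-worker hit counts at the end.
    per_worker = total_samples // workers
    with Pool(workers) as pool:
        hits = pool.map(count_hits, [per_worker] * workers)
    return 4.0 * sum(hits) / (per_worker * workers)

if __name__ == "__main__":
    print(estimate_pi())  # approaches 3.14159... as total_samples grows
```

Because the tasks are independent, the speedup scales almost linearly with the number of workers, which is exactly the property that makes such problems "embarrassingly" parallel.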
Implementations
- In R (programming language) – The Simple Network of Workstations (SNOW) package implements a simple mechanism for using a set of workstations or a Beowulf cluster for embarrassingly parallel computations.[http://www.stat.uiowa.edu/~luke/R/cluster/cluster.html Simple Network of Workstations (SNOW) package] Similar R packages include "future", "parallel" and others.
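A comparable facility exists in many language standard libraries. As a single-machine analogue of SNOW's cluster-wide apply functions, the illustrative Python sketch below (the `simulate` function is a hypothetical stand-in, not part of any cited source) farms independent inputs out to worker processes with no inter-task communication.

```python
from concurrent.futures import ProcessPoolExecutor

def simulate(seed: int) -> int:
    # Hypothetical stand-in for any expensive, independent scenario:
    # a toy deterministic computation keyed only by its own input.
    return sum(i * seed for i in range(1000))

if __name__ == "__main__":
    # map() distributes the independent inputs over worker processes;
    # as with SNOW's parallel apply functions, no task talks to any other.
    with ProcessPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(simulate, range(10)))
    print(results[:3])  # [0, 499500, 999000]
```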
See also
- Amdahl's law defines the parallel fraction P, which is close or equal to 1 for embarrassingly parallel problems.
- Cellular automaton
- Connection Machine
- CUDA framework
- Manycore processor
- Map (parallel pattern)
- Massively parallel
- Multiprocessing
- Parallel computing
- Process-oriented programming
- Shared-nothing architecture (SN)
- Symmetric multiprocessing (SMP)
- Vector processor
References
External links
{{Wiktionary}}
- [http://www.phy.duke.edu/~rgb/Beowulf/beowulf_book/beowulf_book/node30.html Embarrassingly Parallel Computations], Engineering a Beowulf-style Compute Cluster
- "[http://www.cs.ucsb.edu/~gilbert/reports/hpec04.pdf Star-P: High Productivity Parallel Computing]"
{{Parallel computing}}
Category:Applications of distributed computing