Fork–join model

{{Short description|Way of setting up and executing parallel computer programs}}

Image:Fork join.svg

In parallel computing, the fork–join model is a way of setting up and executing parallel programs, such that execution branches off in parallel at designated points in the program, to "join" (merge) at a subsequent point and resume sequential execution. Parallel sections may fork recursively until a certain task granularity is reached. Fork–join can be considered a parallel design pattern.{{cite book |author1=Michael McCool |author2=James Reinders |author3=Arch Robison |title=Structured Parallel Programming: Patterns for Efficient Computation |publisher=Elsevier |year=2013}}{{rp|209 ff.}} It was formulated as early as 1963.{{cite conference |author=Melvin E. Conway |title=A multiprocessor system design |conference=Fall Joint Computer Conference |year=1963 |pages=139–146 |doi=10.1145/1463822.1463838|doi-access=free }}{{cite journal |last=Nyman |first=Linus |last2=Laakso |first2=Mikael |title=Notes on the History of Fork and Join |journal=IEEE Annals of the History of Computing |publisher=IEEE Computer Society |volume=38 |issue=3 |pages=84–87 |doi=10.1109/MAHC.2016.34 |year=2016 }}

By nesting fork–join computations recursively, one obtains a parallel version of the divide and conquer paradigm, expressed by the following generic pseudocode:{{cite conference |author=Doug Lea |author-link=Doug Lea |title=A Java fork/join framework |year=2000 |url=http://gee.cs.oswego.edu/dl/papers/fj.pdf |conference=ACM Conference on Java}}

    solve(problem):
        if problem is small enough:
            solve problem directly (sequential algorithm)
        else:
            for part in subdivide(problem):
                fork subtask to solve(part)
            join all subtasks spawned in previous loop
            return combined results
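As an illustration, the generic pseudocode can be sketched with Java's fork/join framework. This is a minimal sketch rather than a reference implementation: the class name `SumTask`, the helper `parallelSum` and the cutoff `THRESHOLD` are hypothetical choices made for this example, and summing an array merely stands in for "the problem".

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

/** Sums a slice of an array using the generic fork-join recursion. */
class SumTask extends RecursiveTask<Long> {
    // Hypothetical cutoff below which the problem counts as "small enough".
    private static final int THRESHOLD = 1_000;

    private final long[] data;
    private final int lo, hi; // half-open range [lo, hi)

    SumTask(long[] data, int lo, int hi) {
        this.data = data;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected Long compute() {
        if (hi - lo <= THRESHOLD) {
            // Small enough: solve directly with a sequential loop.
            long sum = 0;
            for (int i = lo; i < hi; i++) {
                sum += data[i];
            }
            return sum;
        }
        // Subdivide the problem into two parts and fork a subtask for each.
        int mid = lo + (hi - lo) / 2;
        SumTask left = new SumTask(data, lo, mid);
        SumTask right = new SumTask(data, mid, hi);
        left.fork();
        right.fork();
        // Join the forked subtasks and combine their results.
        return right.join() + left.join();
    }

    /** Hypothetical entry point: runs the root task on the common pool. */
    static long parallelSum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
    }
}
```

Calling `SumTask.parallelSum(values)` blocks until the root task and all of its subtasks have been joined. Both halves are forked here only to mirror the loop in the pseudocode.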

Examples

The simple parallel merge sort of CLRS is a fork–join algorithm.{{Introduction to Algorithms|3|pages=797}}

    mergesort(A, lo, hi):                   // sorts A[lo..hi], bounds inclusive
        if lo < hi:                         // at least two elements of input
            mid = ⌊lo + (hi - lo) / 2⌋
            fork mergesort(A, lo, mid)      // process (potentially) in parallel with main task
            mergesort(A, mid + 1, hi)       // main task handles second recursion
            join
            merge(A, lo, mid, hi)

The first recursive call is "forked off", meaning that it may execute in parallel (in a separate thread) with the rest of the function, up to the {{mono|join}}, at which point all threads synchronize. Although the {{mono|join}} resembles a barrier, the two differ: after a barrier all threads continue working, whereas after a {{mono|join}} only one thread continues.{{r|spp}}{{rp|88}}

The second recursive call is deliberately not a fork in the pseudocode above, because forking a task has a cost. If both recursive calls were set up as subtasks, the main task would have no further work to do before blocking at the {{mono|join}}.
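The fork-one, run-one pattern described above might look as follows in Java. This is a sketch under stated assumptions, not the CLRS algorithm verbatim: the class name `MergeSortTask` and the helper `sort` are hypothetical, the index range is taken to be inclusive, and the merge step is a plain sequential merge. For brevity the recursion forks all the way down to single elements; a practical version would switch to a sequential sort below some cutoff.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

/** Fork-join merge sort over the inclusive index range [lo, hi]. */
class MergeSortTask extends RecursiveAction {
    private final int[] a;
    private final int lo, hi;

    MergeSortTask(int[] a, int lo, int hi) {
        this.a = a;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected void compute() {
        if (lo >= hi) {
            return; // zero or one element: already sorted
        }
        int mid = lo + (hi - lo) / 2;
        MergeSortTask left = new MergeSortTask(a, lo, mid);
        left.fork();                                 // may run in parallel
        new MergeSortTask(a, mid + 1, hi).compute(); // current task sorts the right half itself
        left.join();                                 // wait for the forked half
        merge(mid);
    }

    /** Sequentially merges the sorted runs a[lo..mid] and a[mid+1..hi]. */
    private void merge(int mid) {
        int[] tmp = new int[hi - lo + 1];
        int i = lo, j = mid + 1, k = 0;
        while (i <= mid && j <= hi) {
            tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
        }
        while (i <= mid) {
            tmp[k++] = a[i++];
        }
        while (j <= hi) {
            tmp[k++] = a[j++];
        }
        System.arraycopy(tmp, 0, a, lo, tmp.length);
    }

    /** Hypothetical entry point: sorts the whole array in place. */
    static void sort(int[] a) {
        ForkJoinPool.commonPool().invoke(new MergeSortTask(a, 0, a.length - 1));
    }
}
```

Only the left half is forked. The task that performs the fork sorts the right half directly, so it stays busy until it reaches the join.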

Implementations

Implementations of the fork–join model will typically fork tasks, fibers or lightweight threads, not operating-system-level "heavyweight" threads or processes, and use a thread pool to execute these tasks: the fork primitive allows the programmer to specify potential parallelism, which the implementation then maps onto actual parallel execution. The reason for this design is that creating new threads tends to result in too much overhead.{{r|lea}}

The lightweight threads used in fork–join programming usually have their own scheduler (typically a work-stealing one) that maps them onto the underlying thread pool. This scheduler can be much simpler than a fully featured, preemptive operating system scheduler: general-purpose thread schedulers must deal with blocking for locks, but in the fork–join paradigm, threads only block at the join point.
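The mapping of many forked tasks onto a small thread pool can be observed with a short Java experiment. This is an illustrative sketch: `LeafCountTask`, `WORKERS` and `run` are hypothetical names, and the scheduling behaviour shown is that of Java's `ForkJoinPool`, not of fork–join implementations in general.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

/** Counts leaf tasks while recording which pool threads executed them. */
class LeafCountTask extends RecursiveTask<Integer> {
    // Names of the worker threads that ran at least one leaf task.
    static final Set<String> WORKERS = ConcurrentHashMap.newKeySet();

    private final int depth;

    LeafCountTask(int depth) {
        this.depth = depth;
    }

    @Override
    protected Integer compute() {
        if (depth == 0) {
            // Leaf task: note the thread it was scheduled on.
            WORKERS.add(Thread.currentThread().getName());
            return 1;
        }
        LeafCountTask left = new LeafCountTask(depth - 1);
        left.fork();                                        // potential parallelism only
        int right = new LeafCountTask(depth - 1).compute(); // done by the current task
        return left.join() + right;
    }

    /** Runs 2^depth leaf tasks on a pool with the given target parallelism. */
    static int run(int depth, int parallelism) {
        WORKERS.clear();
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.invoke(new LeafCountTask(depth));
        } finally {
            pool.shutdown();
        }
    }
}
```

After `LeafCountTask.run(10, 4)`, which executes 1,024 leaf tasks, `WORKERS` typically holds only a handful of thread names. The exact number depends on the runtime and the machine, since the pool decides how forked tasks are distributed among its workers.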

Fork–join is the main model of parallel execution in the OpenMP framework, although OpenMP implementations may or may not support nesting of parallel sections.{{cite web |author=Blaise Barney |title=OpenMP |publisher=Lawrence Livermore National Laboratory |url=https://computing.llnl.gov/tutorials/openMP/ |access-date=5 April 2014 |date=12 June 2013}} It is also supported by the Java concurrency framework,{{cite web |title=Fork/Join |website=The Java Tutorials |url=http://docs.oracle.com/javase/tutorial/essential/concurrency/forkjoin.html |access-date=5 April 2014}} the Task Parallel Library for .NET,{{cite conference |author1=Daan Leijen |author2=Wolfram Schulte |author3=Sebastian Burckhardt |year=2009 |title=The design of a Task Parallel Library |conference=OOPSLA}} and Intel's Threading Building Blocks (TBB). The Cilk programming language has language-level support for fork and join, in the form of the spawn and sync keywords, or cilk_spawn and cilk_sync in Cilk Plus.

See also

{{Portal|Computer programming}}

References

{{Reflist|30em}}