Automatic Parallelization of Recursive Procedures

Gupta, Manish; Mukhopadhyay, Sayak; Sinha, Navin
December 2000
International Journal of Parallel Programming;Dec2000, Vol. 28 Issue 6, p537
Academic Journal
Parallelizing compilers have traditionally focused mainly on parallelizing loops. This paper presents a new framework for automatically parallelizing recursive procedures that typically appear in divide-and-conquer algorithms. We present compile-time analysis, using powerful symbolic array section analysis, to detect the independence of multiple recursive calls in a procedure. This allows exploitation of a scalable form of nested parallelism, where each parallel task can further spawn parallel work in subsequent recursive calls. We describe a runtime system that efficiently supports this kind of nested parallelism without unnecessarily blocking tasks. We have implemented this framework in a parallelizing compiler, which is able to automatically parallelize programs such as quicksort and mergesort written in C. For cases where even the advanced compile-time analysis we describe cannot prove the independence of procedure calls, we propose novel techniques for speculative runtime parallelization, which are more efficient and powerful in this context than analogous techniques proposed previously for speculatively parallelizing loops. Our experimental results on an IBM G30 SMP machine show good speedups obtained by following our approach.
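The kind of nested parallelism the abstract describes — independent recursive calls, each of which may spawn further parallel work — can be sketched as follows. This is an illustrative hand-written pthreads version of quicksort, not the authors' compiler-generated code or their runtime system; the function names and the depth cutoff are assumptions made for the example.

```c
/* Sketch of nested parallelism in a divide-and-conquer procedure:
 * each recursive call may hand one sub-range to a new thread while
 * the current thread sorts the other. A depth cutoff bounds thread
 * creation; beyond it, recursion falls back to sequential calls. */
#include <pthread.h>

struct range { int *a; int lo, hi, depth; };

/* Lomuto partition around the last element of a[lo..hi]. */
static int partition(int *a, int lo, int hi) {
    int pivot = a[hi], i = lo;
    for (int j = lo; j < hi; j++) {
        if (a[j] < pivot) { int t = a[i]; a[i] = a[j]; a[j] = t; i++; }
    }
    int t = a[i]; a[i] = a[hi]; a[hi] = t;
    return i;
}

static void *psort(void *arg) {
    struct range *r = arg;
    if (r->lo >= r->hi) return NULL;
    int p = partition(r->a, r->lo, r->hi);
    struct range left  = { r->a, r->lo, p - 1, r->depth + 1 };
    struct range right = { r->a, p + 1, r->hi, r->depth + 1 };
    if (r->depth < 3) {           /* spawn nested parallel work */
        pthread_t t;
        pthread_create(&t, NULL, psort, &left);
        psort(&right);            /* current thread keeps working */
        pthread_join(t, NULL);    /* block only at the join point */
    } else {                      /* sequential recursion past cutoff */
        psort(&left);
        psort(&right);
    }
    return NULL;
}

void parallel_quicksort(int *a, int n) {
    struct range whole = { a, 0, n - 1, 0 };
    psort(&whole);
}
```

The two recursive calls operate on disjoint sub-ranges of the array, which is exactly the independence property the paper's symbolic array section analysis tries to prove at compile time; when the analysis succeeds, such calls can safely run concurrently.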


Related Articles

  • Path Analysis and Renaming for Predicated Instruction Scheduling. Carter, Lori; Simon, Beth; Calder, Brad; Carter, Larry; Ferrante, Jeanne // International Journal of Parallel Programming;Dec2000, Vol. 28 Issue 6, p563 

    Increases in instruction level parallelism are needed to exploit the potential parallelism available in future wide issue architectures. Predicated execution is an architectural mechanism that increases instruction level parallelism by removing branches and allowing simultaneous execution of...

  • Exploring the structure of the space of compilation sequences using randomized search algorithms. Cooper, Keith; Grosul, Alexander; Harvey, Timothy; Reeves, Steve; Subramanian, Devika; Torczon, Linda; Waterman, Todd // Journal of Supercomputing;May2006, Vol. 36 Issue 2, p135 

    Modern optimizing compilers apply a fixed sequence of optimizations, which we call a compilation sequence, to each program that they compile. These compilers let the user modify their behavior in a small number of specified ways, using command-line flags (e.g.,-O1,-O2,...). For five years, we...

  • Nvidia Unleashes Cuda Technology.  // Computer Graphics World;Dec2006, Vol. 29 Issue 12, p4 

    The article provides information on Nvidia's Cuda technology, a C-compiler development environment for the graphics processing unit (GPU). Reportedly up to 100 times faster than traditional methods, GPU computing with Cuda takes advantage of hundreds of on-chip processor cores that work in...

  • Experiences with Sweep3D implementations in Co-array Fortran. Coarfa, Cristian; Dotsenko, Yuri; Mellor-Crummey, John // Journal of Supercomputing;May2006, Vol. 36 Issue 2, p101 

    As part of the recent focus on increasing the productivity of parallel application developers, Co-array Fortran (CAF) has emerged as an appealing alternative to the Message Passing Interface (MPI). CAF belongs to the family of global address space parallel programming languages; such languages...

  • Parallel Compiler.  // Network Dictionary;2007, p365 

    A definition of the term "parallel compiler" in the context of computer software is presented. This refers to a type of compilation technique that speeds up the process of compilation on multi-processor machines. Supercomputers and other large-scale multi-processor machines make use of...

  • Parallel Processing Is Speedy and Efficient.  // USA Today Magazine;Jun1998 Science, Vol. 126 Issue 2637, p13 

    Reports on the development of a compiler program by electrical and computer engineering professor Rudolf Eigenmann that translates conventional programs so they can run on a parallel processing computer.

  • A general data dependence analysis for parallelizing compilers. Jing Zhou; Guosun Zeng // Journal of Supercomputing;Aug2008, Vol. 45 Issue 2, p236 

    Many dependence tests have been proposed for loop parallelization in the case of arrays with linear subscripts, but little work has been done on the arrays with nonlinear subscripts, which sometimes occur in parallel benchmarks and scientific and engineering applications. This paper focuses on...

  • Containers on the Parallelization of General-Purpose Java Programs. Peng Wu; Padua, David // International Journal of Parallel Programming;Dec2000, Vol. 28 Issue 6, p589 

    Static parallelization of general-purpose programs is still impossible, in general, due to their common use of pointers, irregular data structures, and complex control-flows. One promising strategy is to exploit parallelism at runtime. Runtime parallelization schemes, particularly data...

  • Model, Design, and Evaluation of a Compiler for a Parallel Processing Environment. Baer, Jean-Loup; Ellis, Carla S. // IEEE Transactions on Software Engineering;Nov77, Vol. 3 Issue 6, p394 

    The problem of designing compilers for a multiprocessing environment is approached. We show that by modeling an existing sequential compiler, we gain an understanding of the modifications necessary to transform the sequential structure into a pipeline of processes. The pipelined compiler is then...

