TITLE

A Vectorizing Compiler for Multimedia Extensions

AUTHOR(S)
Sreraman, N.; Govindarajan, R.
PUB. DATE
August 2000
SOURCE
International Journal of Parallel Programming;Aug2000, Vol. 28 Issue 4, p363
SOURCE TYPE
Academic Journal
DOC. TYPE
Article
ABSTRACT
In this paper, we present an implementation of a vectorizing C compiler for Intel's MMX (Multimedia Extension). The compiler identifies data-parallel sections of the code using scalar and array dependence analysis. To enhance the scope for applying subword semantics, it performs several code transformations, including strip mining, scalar expansion, grouping and reduction, and distribution. Thereafter, inline assembly instructions corresponding to the data-parallel sections are generated. We used the Stanford University Intermediate Format (SUIF), a public-domain compiler tool, for our implementation and evaluated the performance of the code generated by our compiler on a number of benchmarks. Initial performance results reveal that the compiler-generated code produces a reasonable performance improvement (speedup of 2 to 6.5) over the code generated without the vectorizing transformations/inline assembly. In certain cases, the performance of the compiler-generated code is within 85% of hand-tuned code for the MMX architecture.
ACCESSION #
17143257

Related Articles

  • Compilation Techniques for Multimedia Processors. Krall, Andreas; Lelait, Sylvain // International Journal of Parallel Programming;Aug2000, Vol. 28 Issue 4, p347 

    The huge processing power needed by multimedia applications has led to multimedia extensions in the instruction set of microprocessors which exploit subword parallelism. Examples of these extended instruction sets are the Visual Instruction Set of the UltraSPARC processor, the AltiVec...

  • An Abstract Semantically Rich Compiler Collocative and Interpretative Model for OpenMP Programs. Mokbel, Mohammed F.; Kent, Robert D.; Wong, Michael // Computer Journal;Aug2011, Vol. 54 Issue 8, p1325 

    To understand the behavior of OpenMP programs, special tools and adaptive techniques are needed for performance analysis. However, these tools provide low-level profile information at the assembly and functions boundaries via instrumentation at the binary or code level, which are very hard to...

  • The Design and Implementation of an ASN. 1-C Compiler. Neufeld, Gerald W.; Yueli Yang // IEEE Transactions on Software Engineering;Oct90, Vol. 16 Issue 10, p1209 

    A basic requirement for communication in a heterogeneous computing environment is a standard external data representation. Abstract Syntax Notation One (ASN.1) has been widely used in international standard specifications; its transfer-syntax, the Basic Encoding Rules (BER), is used as the...

  • Loop Shifting for Loop Compaction. Darte, Alain; Huard, Guillaume // International Journal of Parallel Programming;Oct2000, Vol. 28 Issue 5, p499 

    The idea of decomposed software pipelining is to decouple the software pipelining problem into a cyclic scheduling problem without resource constraints and an acyclic scheduling problem with resource constraints. In terms of loop transformation and code motion, the technique can be formulated as...

  • Experiences with Sweep3D implementations in Co-array Fortran. Coarfa, Cristian; Dotsenko, Yuri; Mellor-Crummey, John // Journal of Supercomputing;May2006, Vol. 36 Issue 2, p101 

    As part of the recent focus on increasing the productivity of parallel application developers, Co-array Fortran (CAF) has emerged as an appealing alternative to the Message Passing Interface (MPI). CAF belongs to the family of global address space parallel programming languages; such languages...

  • Handling Global Constraints in Compiler Strategy. Rohou, Erven; Bodin, François; Eisenbeis, Christine; Seznec, André // International Journal of Parallel Programming;Aug2000, Vol. 28 Issue 4, p325 

    To achieve high-performance on processors featuring ILP, most compilers apply locally a set of heuristics. This leads to a potentially high-performance on separate code fragments. Unfortunately, most optimizations also increase code size, which may lead to a global net performance loss. In this...

  • A Case Study on Compiler Optimizations for the Intel® CoreTM 2 Duo Processor. Bik, Aart; Kreitzer, David; Tian, Xinmin // International Journal of Parallel Programming;Dec2008, Vol. 36 Issue 6, p571 

    The complexity of modern processors poses increasingly more difficult challenges to software optimization. Modern optimizing compilers have become essential tools for leveraging the power of recent processors by means of high-level optimizations to exploit multi-core platforms and...

  • Parallel Compiler.  // Network Dictionary;2007, p365 

    A definition of the term "parallel compiler" in the context of computer software is presented. This refers to a type of computer compiling technique that speeds up the process of compilation on multi-processor machines. Super-computers and other large scale multi-processor machines make use of...
