Performance Evaluation of Mixed-Mode OpenMP/MPI Implementations

Bull, J.; Enright, James; Guo, Xu; Maynard, Chris; Reid, Fiona
October 2010
International Journal of Parallel Programming;Oct2010, Vol. 38 Issue 5/6, p396
Academic Journal
With the current prevalence of multi-core processors in HPC architectures, mixed-mode programming, using both MPI and OpenMP in the same application, is seen as an important technique for achieving high levels of scalability. As there are few standard benchmarks written in this paradigm, it is difficult to assess the likely performance of such programs. To help address this, we examine the performance of mixed-mode OpenMP/MPI on a number of popular HPC architectures, using a synthetic benchmark suite and two large-scale applications. We find performance characteristics which differ significantly between implementations, and which highlight possible areas for improvement, especially when multiple OpenMP threads communicate simultaneously via MPI.


Related Articles

  • Instruction Level Parallelism through Microthreading—A Scalable Approach to Chip Multiprocessors. BOUSIAS, KOSTAS; HASASNEH, NABIL; JESSHOPE, CHRIS // Computer Journal;2006, Vol. 49 Issue 2, p211 

    Most microprocessor chips today use an out-of-order instruction execution mechanism. This mechanism allows superscalar processors to extract reasonably high levels of instruction level parallelism (ILP). The most significant problem with this approach is a large instruction window and the logic...

  • Towards Automated Memory Model Generation Via Event Tracing. Perks, O.F.J.; Beckingsale, D.A.; Hammond, S.D.; Miller, I.; Herdman, J.A.; Vadgama, A.; Bhalerao, A.H.; He, L.; Jarvis, S.A. // Computer Journal;Feb2013, Vol. 56 Issue 2, p156 

    The importance of memory performance and capacity is a growing concern for high performance computing laboratories around the world. It has long been recognized that improvements in processor speed exceed the rate of improvement in dynamic random access memory speed and, as a result, memory...

  • A career in HPC. Robson, David // Scientific Computing World;Apr/May2008 HPC Supplement, Issue 1, p16 

    The article profiles the professional career of Fiona Reid, an applications consultant at Edinburgh Parallel Computing Centre (EPCC) in Scotland. Reid chose the middle road of working in high-performance computing, a career path that allowed her to play an important part in the latest...

  • Multilingual Support for A-JUMP. Malik, Usman A.; Riaz, Naveed; Asghar, Sajjad; Hafeez, Mehnaz; Rehman, Adeel ur // International Journal of Grid & Distributed Computing;Dec2011, Vol. 4 Issue 4, p57 

    The Architecture for Java Universal Message Passing (A-JUMP) is a message passing implementation based on MPJ specifications. It is written purely in Java. One of the design goals of A-JUMP is interoperability, where applications written in different programming languages must be able to...

  • An Efficient Algorithm for Aggregating PEPA Models. Gilmore, Stephen; Hillston, Jane; Ribaudo, Marina // IEEE Transactions on Software Engineering;May2001, Vol. 27 Issue 5, p449 

    Performance Evaluation Process Algebra (PEPA) is a formal language for performance modeling based on process algebra. It has previously been shown that, by using the process algebra apparatus, compact performance models can be derived which retain the essential behavioral characteristics of the...

  • Restructuring Fortran legacy applications for parallel computing in multiprocessors. Tinetti, Fernando; Méndez, Mariano; Giusti, Armando // Journal of Supercomputing;May2013, Vol. 64 Issue 2, p638 

    As is widely known, multi-core computers are broadly used these days, and automatic parallelization of sequential programs is still a challenge. In this context, we propose a set of code transformations to be applied automatically by a tool in order to transform sequential legacy systems into...

  • Boost Performance with Parallel Processing. Kelly, Andrew J. // SQL Server Magazine;Nov2007, Vol. 9 Issue 11, p16 

    The author reflects on setting the Max Degree of Parallelism in the SQL Server environment. He notes that parallelism is one of the aspects of SQL that people lack understanding of when making a configuration decision. The author also discusses the notion that SQL Server utilizes more than...

  • Quicksort Revisited. Davidson, Colin M. // IEEE Transactions on Software Engineering;Oct88, Vol. 14 Issue 10, p1480 

    Mills and Linger propose adding the datatype "set" to existing programming languages. During some investigations using sets, it became apparent that Quicksort can be written without using stacks (or recursion). Using sets can lead to efficient multiprocessor usage, because if the elements of a...

  • Rationale for Ada 2012: 4 Tasking and Real-Time. Barnes, John // Ada User Journal;Sep2012, Vol. 33 Issue 3, p178 

    This paper describes various improvements in the tasking and real-time areas for Ada 2012. The most important is perhaps the recognition of the need to provide control over task allocation on multiprocessor architectures. There are also various improvements to the scheduling mechanisms and...

