A scalable method-level parallel library and its improvement

Zhang, Yang; Ji, Weixing
September 2012
Journal of Supercomputing;Sep2012, Vol. 61 Issue 3, p1154
Academic Journal
This paper proposes a method-level parallel library, called JPL, and an improved version based on aspect-oriented programming, called AOJPL. Both JPL and AOJPL execute methods via run-time reflection. By adding OpenMP-like annotations to a method's definition, AOJPL automates some of the tasks that JPL requires the programmer to perform, making AOJPL easier to use. Experimental results show that when JPL and AOJPL are applied to several benchmarks from the JGF Benchmark Suite, with varying numbers of threads on two different multicore processors (Intel Xeon and Sun Sparc T2), their performance tracks JOMP very closely as long as the number of software threads does not exceed a threshold (8 for the Intel Xeon, 64 for the Sun Sparc T2). Once the number of software threads surpasses that threshold, JPL and AOJPL significantly outperform JOMP on all four benchmarks. Both JPL and AOJPL achieve good scalability.
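The abstract describes executing a method in parallel via run-time reflection, the core mechanism shared by JPL and AOJPL. The following is a minimal sketch of that idea only; the class and method names (`ParallelInvoker`, `square`) and the API shape are hypothetical illustrations, not the actual JPL/AOJPL interface.

```java
import java.lang.reflect.Method;
import java.util.concurrent.*;

// Hypothetical sketch: look up a method by name via run-time reflection
// and invoke it once per input element on a fixed-size thread pool.
public class ParallelInvoker {

    public static long[] invokeAll(Object target, String methodName,
                                   long[] inputs, int nThreads) throws Exception {
        // Run-time reflection: resolve the method from its name and signature.
        Method m = target.getClass().getMethod(methodName, long.class);
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        Future<Long>[] futures = new Future[inputs.length];
        for (int i = 0; i < inputs.length; i++) {
            final long arg = inputs[i];
            // Each reflective invocation runs as an independent pool task.
            futures[i] = pool.submit(() -> (Long) m.invoke(target, arg));
        }
        long[] results = new long[inputs.length];
        for (int i = 0; i < inputs.length; i++) {
            results[i] = futures[i].get();  // gather results in input order
        }
        pool.shutdown();
        return results;
    }

    // Example worker method, found by name at run time.
    public long square(long x) { return x * x; }

    public static void main(String[] args) throws Exception {
        long[] out = invokeAll(new ParallelInvoker(), "square",
                               new long[]{1, 2, 3, 4}, 4);
        System.out.println(java.util.Arrays.toString(out)); // [1, 4, 9, 16]
    }
}
```

In AOJPL, per the abstract, the programmer would instead attach OpenMP-like annotations to the method definition and an aspect-oriented layer would generate this kind of boilerplate automatically.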


Related Articles

  • An Efficient Scalable Runtime System for Macro Data Flow Processing Using S-Net. Gijsbers, Bert; Grelck, Clemens // International Journal of Parallel Programming;Dec2014, Vol. 42 Issue 6, p988 

    S-Net is a declarative coordination language and component technology aimed at radically facilitating software engineering for modern parallel compute systems by near-complete separation of concerns between application (component) engineering and concurrency orchestration. S-Net builds on the...

  • POWER-AWARE SPEED-UP FOR MULTITHREADED NUMERICAL LINEAR ALGEBRAIC SOLVERS ON CHIP MULTICORE PROCESSORS. Mukherjee, Jayanta; Raha, Soumyendu // Scalable Computing: Practice & Experience;2009, Vol. 10 Issue 2, p217 

    With the advent of multicore chips new parallel computing metrics and models have become essential for redesigning traditional scientific application libraries tuned to a single chip. In this paper we evolve metrics specific to generalized chip multicore processors (CMP) and use them for...

  • Performance Analysis of Matrix-Vector Multiplication in Hybrid (MPI + OpenMP). Waghmare, Vivek N.; Kendre, Sandip V.; Chordiya, Sanket G. // International Journal of Computer Applications;May2011, Vol. 22, p22 

    Computing of multiple tasks simultaneously on multiple processors is called Parallel Computing. The parallel program consists of multiple active processes simultaneously solving a given problem. Parallel computers can be roughly classified as Multi-Processor and Multi-Core. In both these...

  • Preface. Thor, Miroslaw; Tudruj, Marek // Journal of Supercomputing;Jul2011, Vol. 57 Issue 1, p1 

    An introduction is presented in which the editor discusses various reports within the issue on topics including multi-core processor architectures, hierarchical parallelization techniques, and machine and task heterogeneity.

  • A Review of Transactional Memory in Multicore Processors. Wang, X.; Ji, Z.; Fu, C.; Hu, M. // Information Technology Journal;2010, Vol. 9 Issue 1, p192 

    To develop composable parallel programs easily and get high performance, many transactional memory systems have been proposed to solve the synchronization problem of multicore processors. Transactional memory can be implemented in hardware, software, or a hybrid of the two. There are many hot...

  • VORD: A Versatile On-the-fly Race Detection Tool in OpenMP Programs. Kim, Young-Joo; Song, Sejun; Jun, Yong-Kee // International Journal of Parallel Programming;Dec2014, Vol. 42 Issue 6, p900 

    Shared-memory based parallel programming with OpenMP and Posix-thread APIs becomes more common to fully take advantage of multiprocessor computing environments. One of the critical risks in multithreaded programming is data races that are hard to debug and greatly damaging to parallel...

  • Performance scalability and energy consumption on distributed and many-core platforms. Karanikolaou, E.; Milovanović, E.; Milovanović, I.; Bekakos, M. // Journal of Supercomputing;Oct2014, Vol. 70 Issue 1, p349 

    In this paper, the performance evaluation of distributed and many-core computer complexes, in conjunction with their consumed energy, is investigated. The distributed execution of a specific problem on an interconnected processors platform requires a larger amount of energy compared to the...

  • A Scalable Farm Skeleton for Hybrid Parallel and Distributed Programming. Ernsting, Steffen; Kuchen, Herbert // International Journal of Parallel Programming;Dec2014, Vol. 42 Issue 6, p968 

    Multi-core processors and clusters of multi-core processors are ubiquitous. They provide scalable performance yet introduce complex and low-level programming models for shared and distributed memory programming. Thus, fully exploiting the potential of shared and distributed memory...

  • Mesh Structure VLSI of 9/7 Lifting Wavelet Parallel Transform. Xiaoqiang Yang // Advanced Materials Research;2014, Vol. 971-973, p1647 

    The 9/7 lifting wavelet is widely applied to data processing. By taking advantage of the regular data flows and data-locality properties of the 9/7 wavelet transform, a mesh structure for the 9/7 lifting wavelet parallel transform is put forward in this paper. The structure realizes the parallel processing of wavelet...
