Performance Prediction and Evaluation of Parallel Processing on a NUMA Multiprocessor

Xiaodong Zhang; Xiaohan Qin
October 1991
IEEE Transactions on Software Engineering;Oct91, Vol. 17 Issue 10, p1059
Academic Journal
Non-Uniform Memory Access (NUMA) architectures make it possible to build large-scale, shared-memory multiprocessor systems, in comparison with nonscalable Uniform Memory Access (UMA) architectures. Most NUMA multiprocessor operations, such as scheduling and synchronizing processes, accessing data from processors to memory modules, and allocating distributed memory space to different processors, are performed through interconnection networks such as a multistage switching network. The efficiency of these basic operations determines the parallel processing performance on a NUMA multiprocessor. This paper presents several analytical models to predict and evaluate the overhead of interprocessor communication, process scheduling, process synchronization, and remote memory access where network contention and memory contention are considered. Performance measurements to support the models and analyses through several numerical examples have been done on the BBN GP1000, a NUMA shared-memory multiprocessor. Both analytical and experimental results give a comprehensive and clear understanding of the various effects, which are important for the effective use of a NUMA shared-memory multiprocessor. The results in this paper may be used to determine optimal strategies in developing an efficient programming environment for a NUMA system.


Related Articles

  • Non-Strict Execution in Parallel and Distributed Computing. Cristobal-Salas, Alfredo; Tchernykh, Andrei; Gaudiot, Jean-Luc; Lin, Wen-Yen // International Journal of Parallel Programming;Apr2003, Vol. 31 Issue 2, p77 

    This paper surveys and demonstrates the power of non-strict evaluation in applications executed on distributed architectures. We present the design, implementation, and experimental evaluation of single assignment, incomplete data structures in a distributed memory architecture and Abstract...

  • An Efficient Hardware-oriented Algorithm of Spatial Motion Vector Prediction for AVS HD Video Encoder. Minghui Yang; Xiaodong Xie // Applied Mechanics & Materials;2014, Issue 556-562, p4365 

    Motion Vector Prediction (MVP) plays an important role in improving coding efficiency in HEVC, H.264/AVC and AVS video coding standard. MVP is implemented by exploiting redundancy of adjacent-block optimal coding information under the constraint that MVP must be performed in a serial way. The...

  • Comparing the Optimal Performance of Parallel Architectures. Klonowska, Kamilla; Lundberg, Lars; Lennerstad, Håkan; Broberg, Magnus // Computer Journal;2004, Vol. 47 Issue 5, p527 

    Consider a parallel program with n processes and a synchronization granularity z. Consider also two parallel architectures: an SMP with q processors and run-time reallocation of processes to processors, and a distributed system (or cluster) with k processors and no run-time reallocation. There...

  • A Multi-Level WEB Based Parallel Processing System A Hierarchical Volunteer Computing Approach. Mohamed Osman, Abdelrahman Ahmed // Enformatika;2006, Vol. 13, p66 

    Over the past few years, a number of efforts have been exerted to build parallel processing systems that utilize the idle power of LAN's and PC's available in many homes and corporations. The main advantage of these approaches is that they provide cheap parallel processing environments for those...

  • Hierarchical parallel processing of large scale data clustering on a PC cluster with GPU co-processing. Takizawa, Hiroyuki; Kobayashi, Hiroaki // Journal of Supercomputing;Jun2006, Vol. 36 Issue 3, p219 

    This paper presents an effective scheme for clustering a huge data set using a PC cluster system, in which each PC is equipped with a commodity programmable graphics processing unit (GPU). The proposed scheme is devised to achieve three-level hierarchical parallel processing of massive data...

  • A Compositional Framework for Developing Parallel Programs on Two-Dimensional Arrays. Emoto, Kento; Hu, Zhenjiang; Kakehi, Kazuhiko; Takeichi, Masato // International Journal of Parallel Programming;Dec2007, Vol. 35 Issue 6, p615 

    Computations on two-dimensional arrays such as matrices and images are one of the most fundamental and ubiquitous things in computational science and its vast application areas, but development of efficient parallel programs on two-dimensional arrays is known to be hard. In this paper, we...

  • Efficient parallel processing with spin-wave nanoarchitectures. Eshaghian-Wilner, Mary; Navab, Shiva // Journal of Supercomputing;Aug2009, Vol. 49 Issue 2, p248 

    In this paper, we study the algorithm design aspects of three newly developed spin-wave architectures. The architectures are capable of simultaneously transmitting multiple signals using different frequencies, and allow for concurrent read/write operations. Using such features, we show a number...

  • Parallel processing of multicomponent seismic data. Falfushinsky, V. V. // Cybernetics & Systems Analysis;Mar2011, Vol. 47 Issue 2, p330 

    An algorithm for processing multicomponent seismic data is proposed. It is implemented in and its performance is measured on the Inparcom cluster. Several improvements are applied to speed up the program and to reduce the filesystem load, in particular, local folders are used to store temporary...

  • Leveraging computation sharing and parallel processing in location-dependent query processing. Cazalas, Jonathan; Guha, Ratan // Journal of Supercomputing;Jul2012, Vol. 61 Issue 1, p215 

    A variety of research exists for the processing of continuous queries in large, mobile environments. Each method tries, in its own way, to address the computational bottleneck of constantly processing so many queries. In this paper, we introduce an efficient and scalable system for monitoring...

