TITLE

Efficient and Scalable Routing Algorithms for Collective Communication Operations on 2D All-Port Torus Networks

AUTHOR(S)
İmre, Kayhan M.; Baransel, Cesur; Artuner, Harun
PUB. DATE
December 2011
SOURCE
International Journal of Parallel Programming;Dec2011, Vol. 39 Issue 6, p746
SOURCE TYPE
Academic Journal
DOC. TYPE
Article
ABSTRACT
Collective Communication Algorithms for 2D torus networks have been investigated quite extensively in the literature and two broad approaches, namely direct methods and indirect (message combining) methods are recognized in the field. While direct methods minimize the volume of data, the indirect methods reduce the number of message start-ups. Consequently, either a suite of algorithms must be employed for efficiency over a wide range of message lengths and communication operations or algorithms should be able to adapt themselves to the current case, possibly by switching between direct and indirect routing modes as appropriate. In this paper, we propose adaptive routing algorithms for all-port, wormhole routed, synchronous, 2D torus networks optimized for one-to-all broadcast, gossiping and complete exchange collective communication operations. The proposed algorithms employ completely-connected subnetworks where complete exchange amongst the nodes in the subnetwork can be accomplished in one step only. Combined with suitable 2D plane tiling techniques, the proposed algorithms share the same set of primitive operations and yield superior performance compared to previously proposed methods, either pure or hybridized.
ACCESSION #
63995374

 

Related Articles

  • System-level Buffer Allocation for Application Specific Network-on-chip with Wormhole Routing. Jian Wang; Yubai Li; Qicong Peng // IETE Technical Review (Medknow Publications & Media Pvt. Ltd.);Nov/Dec2012, Vol. 29 Issue 6, p482 

    A novel buffer allocation algorithm that can be used to customize the router design in application-specific Network-on-Chip (NoC) is proposed. More precisely, according to the target application and the total budget of the available buffer resources, the proposed method automatically assigns the...

  • Faust: An Integrated Environment for Parallel Programming. Guarna Jr., Vincent A.; Gannon, Dennis; Jablonowski, David; Malony, Allen D.; Gaur, Yogesh // IEEE Software;Jul89, Vol. 6 Issue 4, p20 

    Features Faust, a workstation-based programming environment for scientific applications from the Center for Supercomputing Research and Development at the University of Illinois. Implementation of set of tools specifically designed to help develop efficient parallel programs; Significance of...

  • A Decomposition-Based Approach for Scalable Many-Field Packet Classification on Multi-core Processors. Qu, Yun; Zhou, Shijie; Prasanna, Viktor // International Journal of Parallel Programming;Dec2015, Vol. 43 Issue 6, p965 

    As a kernel function in network routers, packet classification requires the incoming packet headers to be checked against a set of predefined rules. There are two trends for packet classification: (1) to examine a large number of packet header fields, and (2) to use software-based solutions on...

  • PERFORMANCE MODELLING OF TORUS INTERCONNECTION NETWORKS WITH DEEP BUFFERS. Alzeidi, N.; O.-Khaoua, M.; Khonsari, A. // International Journal of Computers & Applications (Acta Press);2010, Vol. 32 Issue 1, p1 

    Analytical models of deterministic routing in wormhole-switched networks have been widely reported in the literature. However, all of these models have assumed no, or neglectable, buffers at each switching element of the network. This paper proposes a new analytical model for wormhole-switched...

  • A Generic and Extensible Spidergon NoC. Zitouni, Abdelkrim; Zid, Mounir; Badrouchi, Sami; Tourki, Rached // Proceedings of World Academy of Science: Engineering & Technolog;2007, Vol. 25, p14 

    The Globally Asynchronous Locally Synchronous Network on Chip (GALS NoC) is the most efficient solution that provides low latency transfers and power efficient System on Chip (SoC) interconnect. This study presents a GALS and generic NoC architecture based on a configurable router. This router...

  • Research on Security of Routing Protocols Against Wormhole Attack in the Ad hoc Networks. Huang Wen hua // WSEAS Transactions on Computers;Jun2013, Vol. 12 Issue 6, p255 

    Ad hoc networks are made of a group of portable terminals with wireless transmitter as multi-hops provisional autonomy system. Ad hoc networks are very easy to be attacked by various kinds of network attacks, as the limited resources, dynamic topology and open communication channel and so on....

  • DBR: A Simple, Fast and Efficient Dynamic Network Reconfiguration Mechanism Based on Deadlock Recovery Scheme. ValadBeigi, Majed; Safaei, Farshad; Pourshirazi, Bahareh // International Journal of VLSI Design & Communication Systems;Oct2012, Vol. 3 Issue 5, p13 

    Dynamic network reconfiguration is described as the process of replacing one routing function with another while the network keeps running. The main challenge is avoiding deadlock anomalies while keeping limitations on message injection and forwarding minimal. Current approaches, whose...

  • Routing Algorithms, Process Model for Quality of Services (QoS) and Architectures for Two-Dimensional 4x4 Mesh Topology Network-on-Chip. Jalil, Nauman; Qureshi, Adnan; Khan, Furqan; Qazi, Sohaib Ayyaz // International Journal of Computer Theory & Engineering;Feb2012, Vol. 4 Issue 1, p85 

    In this paper we have provided routing algorithms, process model for quality of service (QoS) and architecture for new Timer based adaptive routing algorithm for a generic network, based on a two-dimensional mesh topology. Compared to previous work, our proposed work has provided with details of...

  • An Efficient Path-Based Multicast Algorithm for Minimum Communication Steps. El-Obaid, Amnah; Wan-Li Zuo // Information Technology Journal;2008, Vol. 7 Issue 1, p32 

    Multicasting is an information dissemination problem which consists, for a processor of a distributed memory parallel computer, to send a same message to a subset of processors. This study presents a new efficient multicast path-based algorithm Two-Path-Pipelined (TPP for short), which can...

Share

Read the Article

Courtesy of VIRGINIA BEACH PUBLIC LIBRARY AND SYSTEM

Sorry, but this item is not currently available from your library.

Try another library?
Sign out of this library

Other Topics