Exploitation of Fine-Grain Parallelism by Günter Böckle

By Günter Böckle

Many parallel computer architectures are especially suited to particular classes of applications. However, only a few parallel architectures are equally well suited to standard programs. Much effort is invested in research on compiler techniques to make programming parallel machines easier.
This book presents methods for automatic parallelization, so that programs need not be tailored to specific architectures; the focus here is on fine-grain parallelism, which is offered by most new microprocessor architectures. The book addresses compiler writers, computer architects, and students by demonstrating the manifold complex relationships between architecture and compiler technology.



Best microprocessors & system design books

Designing Embedded Systems with PIC Microcontrollers: Principles and Applications

This book is a hands-on introduction to the principles and practice of embedded system design using the PIC microcontroller. Packed with helpful examples and illustrations, it gives an in-depth treatment of microcontroller design, covers programming in both assembly language and C, and features advanced topics such as networking and real-time operating systems.

Logic and Language Models for Computer Science

This text makes in-depth explorations of a broad range of theoretical topics in computer science. It plunges into the applications of the abstract concepts in order to confront and address the skepticism of readers, and to instill in them an appreciation for the usefulness of theory. A two-part presentation integrates logic and formal language, both with applications.

Extra resources for Exploitation of Fine-Grain Parallelism

Sample text

The nodes of the tree represent compare operations (or, respectively, conditional branches), and the edges represent the other operations occurring between the compare operations. [...] i.e. L1 ... L4 in figure 21. The processing of such an instruction tree starts by choosing an execution path through the tree. This is performed by evaluating the compare operations top-down. These comparisons are performed in parallel. [...] i.e. their results are marked as valid, while operations on other paths are invalid. Afterwards, processing continues at the successor determined by the chosen path through the instruction tree.
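The path selection described in this excerpt can be illustrated with a small software model. The following is only a sketch under stated assumptions, not the hardware mechanism from the book: the names `Node`, `Edge`, and `process_tree`, the example operations, and the register values are all hypothetical, and "parallel" evaluation is modelled by evaluating every compare against the same register state on entry to the tree.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, List, Tuple, Union


@dataclass
class Edge:
    """Operations lying between two compares, plus where the edge leads."""
    ops: List[str]
    target: Union["Node", str]  # another compare node, or a successor label


@dataclass
class Node:
    """A compare operation (conditional branch) in the instruction tree."""
    name: str
    compare: Callable[[Dict[str, int]], bool]
    taken: Edge
    not_taken: Edge


def all_nodes(node: Node) -> Iterator[Node]:
    """Yield every compare node in the tree, root first."""
    yield node
    for edge in (node.taken, node.not_taken):
        if isinstance(edge.target, Node):
            yield from all_nodes(edge.target)


def process_tree(root: Node, regs: Dict[str, int]) -> Tuple[List[str], str]:
    """Return (operations marked valid, successor label) for one tree."""
    # Assumption: all compares see the register state at tree entry,
    # which stands in for evaluating them in parallel.
    outcomes = {n.name: n.compare(regs) for n in all_nodes(root)}

    # Walk top-down along the path selected by the compare outcomes.
    valid_ops: List[str] = []
    node = root
    while True:
        edge = node.taken if outcomes[node.name] else node.not_taken
        valid_ops.extend(edge.ops)  # operations on the chosen path are valid
        if isinstance(edge.target, Node):
            node = edge.target
        else:
            return valid_ops, edge.target


# Hypothetical example tree with two compares and three successors.
c2 = Node("c2", lambda r: r["a"] == 0,
          taken=Edge(["op1"], "L1"), not_taken=Edge(["op2"], "L2"))
c1 = Node("c1", lambda r: r["a"] < r["b"],
          taken=Edge(["op0"], c2), not_taken=Edge(["op3"], "L3"))

valid, successor = process_tree(c1, {"a": 1, "b": 2})
```

In this model, operations on edges that are not on the chosen path are never added to `valid_ops`, which corresponds to their results being treated as invalid.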

Wolfe and Shen present in [Wolfe/Shen 91] the XIMD architecture, which tries to overcome the disadvantages of VLIW architectures while keeping their advantages. In applications with many branch and call operations, VLIW architectures are limited in their capability to exploit enough parallelism; this holds even more for superscalar architectures. Multiway branches are quite complex to implement, and thus the number of branches which can be executed concurrently is limited. Unpredictable memory and peripheral behaviour may decrease performance because a parallelizing compiler has to make worst-case assumptions to guarantee correct program execution.

The branch-target addresses are specified in three of the six 32-bit immediate fields of the instruction word. The output of the first processing element may also be specified as a branch-target address to execute computed branches. [...] is used to extract parallelism from application programs during compilation. [...]

3 Cydra

The Cydra 5 was built by Cydrome, Inc. (see [Rim 88], [Rau 92] and [Rau/Yen/Towle 89]). Called a "Departmental Supercomputer", it is a heterogeneous multiprocessor system for engineering and scientific applications.

