We are now well into an era in which processors deliver diminishing improvements in serial execution. All new processors are multicore. New development tools support parallel programming, and more programmers are developing parallel applications. On the software side, scheduling of work is moving from the OS up to user mode, and work-stealing schedulers support composable parallel programming, in which software components such as libraries, themselves implemented as parallel programs, can be combined and still perform well, without oversubscription.
The multicore trend is meeting two additional trends: GPUs are becoming programmable, and processor makers cannot keep adding cores at a high rate.
The combination of the three trends results in heterogeneous computing. In heterogeneous computing, the application executes on a combination of execution resources that are different from each other.
Heterogeneous programming allows the programmer to target these heterogeneous resources. When a hardware resource is fixed-function, programming it is as simple as making an API call. When it is programmable, as is the trend with GPUs, heterogeneous programming becomes an extension of parallel programming in which the programmer indicates where each portion of the program should execute.
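As a minimal sketch of what such an indication might look like, the example below tags each portion of a small program with a target resource. The `run_on` decorator, the `PLACEMENT` registry, and the target names are hypothetical, invented for illustration only; in this sketch both portions still execute on the host, whereas a real toolchain would compile and dispatch the tagged portion to the named device.

```python
from typing import Callable, Dict, List

# Hypothetical registry: maps each function name to the resource the
# programmer asked for. Not part of any real library.
PLACEMENT: Dict[str, str] = {}


def run_on(target: str) -> Callable:
    """Tag a function with the execution resource it should run on."""
    if target not in ("cpu", "gpu"):
        raise ValueError(f"unknown target: {target}")

    def decorate(fn: Callable) -> Callable:
        PLACEMENT[fn.__name__] = target
        return fn  # assumption: in this sketch the function still runs on the host

    return decorate


@run_on("gpu")
def scale(values: List[float], factor: float) -> List[float]:
    # Data-parallel portion: every element is independent of the others.
    return [v * factor for v in values]


@run_on("cpu")
def total(values: List[float]) -> float:
    # Serial reduction, kept on the CPU.
    return sum(values)


if __name__ == "__main__":
    print(total(scale([1.0, 2.0, 3.0], 2.0)))
    print(PLACEMENT)
```

The point of the sketch is that the algorithm itself is ordinary parallel code; the only heterogeneous part is the placement tag attached to each portion.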
In current systems, moving data and computation between different resource types carries a high overhead. Clearly, these systems are going to mature and improve. The impact on programming today is that code can move across execution unit types only at very coarse granularity, infrequently, and with minimal movement of data. New trends in hardware will allow fine-grained sharing by making the non-CPU components more programmable, introducing shared memory, and reducing overhead. This is expected to lead to programming models in which work-stealing schedulers can become heterogeneous, and applications can benefit from the heterogeneity even when the programmer doesn’t express heterogeneous programming explicitly.
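The effect of overhead on granularity can be illustrated with a simple break-even model: offloading pays off only when the launch cost plus the data transfer plus the accelerator's compute time is less than the CPU's compute time. The function names and all the numbers below are assumptions chosen for illustration, not measurements of any real system.

```python
def offload_time(accel_compute_s: float, bytes_moved: int,
                 bandwidth_bytes_per_s: float, launch_s: float) -> float:
    """Total time to run one piece of work on the accelerator."""
    transfer_s = bytes_moved / bandwidth_bytes_per_s
    return launch_s + transfer_s + accel_compute_s


def should_offload(cpu_compute_s: float, accel_compute_s: float,
                   bytes_moved: int, bandwidth_bytes_per_s: float,
                   launch_s: float) -> bool:
    """Offload only if the accelerator path, overhead included, is faster."""
    accel_total = offload_time(accel_compute_s, bytes_moved,
                               bandwidth_bytes_per_s, launch_s)
    return accel_total < cpu_compute_s


# Illustrative (assumed) system parameters.
BANDWIDTH = 1e9   # bytes per second between CPU and accelerator memory
LAUNCH = 1e-3     # seconds of fixed cost per offload

# Coarse grain: one large piece of work, 10x faster on the accelerator.
coarse = should_offload(cpu_compute_s=1.0, accel_compute_s=0.1,
                        bytes_moved=100_000_000,
                        bandwidth_bytes_per_s=BANDWIDTH, launch_s=LAUNCH)

# Fine grain: the same work cut into 10,000 pieces, each offloaded separately.
fine = should_offload(cpu_compute_s=1e-4, accel_compute_s=1e-5,
                      bytes_moved=10_000,
                      bandwidth_bytes_per_s=BANDWIDTH, launch_s=LAUNCH)

if __name__ == "__main__":
    print("offload coarse-grained work:", coarse)
    print("offload fine-grained work:", fine)
```

Under these assumed parameters the fixed launch cost dominates each small piece, which is why lowering overhead and sharing memory are prerequisites for fine-grained heterogeneous scheduling.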
Author: Robert Geva
Principal engineer, Intel SSG
Robert joined Intel in 1991 and has since developed expertise in compilers and in performance analysis and tuning for microarchitectures. Robert has worked on compiler optimizations for a variety of Intel microprocessor-based systems, including the 80486, the Pentium processor, the Pentium Pro processor, Itanium, the Pentium 4, the Pentium M, and the Core 2 Duo.
Currently, Robert is an architect in the development products division, responsible for driving language extensions and programming models for parallel and heterogeneous programming. Robert holds a BA and an MSc from the Technion, Israel Institute of Technology.