Tutorial:
Sep 24, 2005
2 pm - 5 pm
San Francisco, CA

 

Software Thread Integration and Related Techniques
Presented by:
Alex Dean

Held in conjunction with the 2005 International Conference on Compilers, Architectures, and Synthesis for Embedded Systems (CASES 2005).



 

Abstract:

Software Thread Integration (STI) methods are software transformation and design techniques that interleave functions from two or more threads, or portions of a program, at the assembly language level to produce a single implicitly multithreaded function. The resulting function provides two large advantages over the original functions: minimal context-switching overhead and increased instruction-level parallelism (ILP).

Reducing context-switching overhead increases program efficiency in applications with frequent switches. This is useful when performing hardware-to-software migration, as it lowers the processor throughput an application requires and increases the maximum performance a processor can deliver. This software improvement enables the use of a slower and less expensive processor.

Increasing ILP allows more efficient scheduling of instructions. Dependences between instructions typically limit the efficiency of processors with multiple instruction issue or moderately deep pipelines to levels well below the capabilities of the hardware. Because STI creates an implicitly multithreaded function from separate functions, this function has much more instruction-level parallelism than the original functions, allowing much more efficient scheduling and hence processor use.

This tutorial presents Software Thread Integration methods and applications. We first introduce the software transformations used for STI, and present desirable characteristics of target hardware and software. We present and discuss the run-time model for STI. We then present applications in which STI provides a benefit.

Next we present Asynchronous STI (ASTI), which uses the STI transformations in conjunction with coroutine calls to create code which enables independent progress among integrated threads. This extends the range of applications which can benefit from these technologies. We present and discuss the transformations and desirable characteristics of target hardware and software, as well as the run-time model. We then present applications which benefit from ASTI.

We finish by presenting STI as used for increasing ILP on a very-long-instruction-word (VLIW) digital signal processor. In this case, C source code rather than assembly code can be integrated, enabling the developer to leverage the software pipelining, predication, and other powerful features that modern compilers rely upon to improve performance. We present the techniques and demonstrate results on various DSP library functions.
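
As a rough illustration of what source-level integration could look like, the sketch below hand-merges two independent loops into one function. The kernels (dot_product, scale) and the integrated function (dot_and_scale) are hypothetical names chosen for this example and are not taken from the tutorial material; the actual STI transformations may differ considerably.

```c
#include <stddef.h>

/* Hypothetical kernel 1: dot product of two integer vectors. */
int dot_product(const int *a, const int *b, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Hypothetical kernel 2: scale a vector by a constant. */
void scale(int *dst, const int *src, int k, size_t m)
{
    for (size_t j = 0; j < m; j++)
        dst[j] = src[j] * k;
}

/*
 * Hand-integrated version (an illustrative assumption, not the
 * tutorial's method). The two loop bodies share no data, so placing
 * them in one loop gives the compiler's scheduler independent
 * operations to overlap. Assumes dst does not alias a or b.
 */
int dot_and_scale(const int *a, const int *b, size_t n,
                  int *dst, const int *src, int k, size_t m)
{
    int sum = 0;
    size_t common = (n < m) ? n : m;
    size_t i;

    /* Integrated region: one iteration of each kernel per pass. */
    for (i = 0; i < common; i++) {
        sum += a[i] * b[i];      /* work from kernel 1 */
        dst[i] = src[i] * k;     /* work from kernel 2 */
    }

    /* Clean-up loops: finish whichever kernel has iterations left. */
    for (size_t r = i; r < n; r++)
        sum += a[r] * b[r];
    for (size_t r = i; r < m; r++)
        dst[r] = src[r] * k;

    return sum;
}
```

The clean-up loops handle unequal trip counts, so the integrated function produces the same results as calling the two kernels one after the other.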