Lifeng Liu Ph.D. Dissertation Defense "An Optimization Compiler Framework Based on Polyhedron Model for GPGPUs"

Thursday, April 28, 2016, 10 am to 1 pm
Campus: Dayton
304 Russ
Audience: Current Students, Faculty, Staff

Lifeng Liu's Ph.D. Dissertation Defense, "An Optimization Compiler Framework Based on Polyhedron Model for GPGPUs", will be held on Thursday, April 28, at 10:00 am.

ABSTRACT:

General purpose GPU (GPGPU) is an effective many-core architecture that can yield high throughput for many scientific applications with thread-level parallelism. However, several challenges still limit further performance improvements and make GPU programming difficult for programmers who lack hardware knowledge of GPUs. In this dissertation, we propose an Optimization Compiler Framework Based on Polyhedron Model for GPGPUs to bridge the speed gap between the GPU cores and the off-chip memory and to improve the overall performance of GPU systems.

The proposed optimization compiler framework includes a detailed data reuse analyzer based on the extended polyhedron model for GPU kernels, a compiler-assisted programmable warp scheduler, a compiler-assisted CTA mapping scheme, a compiler-assisted software-managed cache optimization framework, and a compiler-assisted automatic synchronization optimization framework to help GPU programmers optimize their GPU kernels.

The extended polyhedron model is used to detect intra-warp and cross-warp data dependencies and to perform data reuse analysis. The compiler-assisted programmable warp scheduler for GPGPUs can take advantage of inter-warp locality and intra-warp locality simultaneously. The compiler-assisted CTA mapping scheme is proposed to further improve the performance of the programmable warp scheduler by taking inter-thread-block data reuse into account. The compiler-assisted software-managed cache optimization framework is proposed to make better use of the shared memory of GPU systems and to bridge the speed gap between GPU cores and global off-chip memory. The synchronization optimization framework automatically inserts synchronization statements into GPU kernels at compile time while minimizing the number of inserted statements.

The proposed optimization compiler framework is implemented and evaluated on GPU simulators. Experimental results show that the framework can automatically optimize GPU kernel programs and significantly improve their performance.
