Performance Optimization

5/20/98


Table of Contents

Performance Optimization

Outline

Performance

Performance Examples

Issues in Performance

Issues in Performance

Issues in Performance: Problem Size and Precision

Performance Metrics

Performance Metrics

Performance Metrics

Performance Metrics

Performance Metrics

Performance Metrics

Asymptotic Analysis

Amdahl’s Law

Amdahl’s Law

Efficiency

What is Optimization?

Types of Optimization

Steps of Optimization

Performance Strategies

Performance Strategies

Considerations when Optimizing

Locality

Memory Hierarchy

SP2 Access Times

Cache Performance

Cache Architecture

Cache Mapping

2 way set associative cache

Memory Access

Memory Access Example

Serial Optimizations

Array Optimization

Array Allocation

Array Initialization

Array Initialization

Array Padding

Array Padding: a = a + b * c

Stride Minimization

Stride Minimization

Stride Minimization

Loop Fusion

Loop Fusion

Loop Fusion

Loop Interchange

Loop Interchange

Loop Interchange

Floating IF’s

Floating IF’s

Floating IF’s

Loop Defactorization

Loop Defactorization

Loop Defactorization

Loop Defactorization

Loop Peeling

Loop Peeling

Loop Peeling

Loop Collapse

Loop Collapse

Loop Collapse

Loop Collapse

Loop Unrolling

Loop Unrolling

Loop Unrolling

Loop Unrolling

Loop Unrolling and Sum Reductions

Loop Unrolling and Sum Reductions

Loop Unrolling and Sum Reductions

Loop Unrolling and Sum Reductions

Outer Loop Unrolling

Outer Loop Unrolling

Outer Loop Unrolling

Outer Loop Unrolling

Loop Structure

Strength Reduction

Strength Reduction: Horner’s Rule

Strength Reduction: Horner’s Rule

Strength Reduction: Integer Division by a Power of 2

Strength Reduction: Integer Division by a Power of 2

Strength Reduction: Integer Division by a Power of 2

Strength Reduction: Factorization

Strength Reduction: Factorization

Strength Reduction: Factorization

Subexpression Elimination: Parenthesis

Subexpression Elimination: Parenthesis

Subexpression Elimination: Parenthesis

Subexpression Elimination: Type Considerations

Subexpression Elimination: Type Considerations

Subroutine Inlining

Optimized Arithmetic Libraries

Optimized Arithmetic Libraries

SGI Origin 2000

SGI Origin 2000

SP2

SP2

T3E

T3E

T3E

T3E Streams

T3E Streams

T3E E-registers

T3E Cache Bypass

T3E E-registers

O2K Flags and Libraries

SP2 Flags and Libraries

T3E Flags and Libraries

Timers

Timers

Timers

prof

prof

prof

SpeedShop on the SGI

SpeedShop on the SGI

SpeedShop Usage

Ideal Experiment

Pcsamp Experiment

Example of UserTime

tprof for the SP2

tprof for the SP2

PAT for the T3E

PAT for the T3E

Apprentice for the T3E

PPT Slide

Additional Material

References

References

Author: Philip J. Mucci, Kevin S. London

Email: mucci@cs.utk.edu
london@cs.utk.edu

Home Page: http://www.cs.utk.edu/~london
