T3E Flags and Libraries
-O3 - Maximum optimization; may alter program semantics
-apad - Pad arrays to avoid cache line conflicts
-unroll - Apply aggressive loop unrolling
-pipeline - Software pipelining
-split - Apply loop splitting
-aggress - Apply aggressive code motion
-Wl"-Dallocate(alignsz)=64b" - Align common blocks on cache line boundaries
-lmfastv - Fastest vectorized intrinsic library
-lsci - Include library with BLAS, LAPACK and ESSL routines
-inlinefrom=<> - Specify the source file or directory containing the functions to inline
-inline2 - Aggressively inline function calls
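
As a rough illustration of what loop unrolling does, and of why aggressive optimization may alter semantics, the C sketch below hand-unrolls a summation loop by four. The function names (sum_simple, sum_unrolled4) and the unroll factor are assumptions made for this example; they are not taken from the T3E compilers, which apply their own transformations and choose their own factors.

```c
#include <assert.h>
#include <stddef.h>

/* Baseline: one add per iteration, strictly left-to-right. */
double sum_simple(const double *x, size_t n)
{
    assert(x != NULL || n == 0);   /* precondition check */
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += x[i];
    return s;
}

/* Hand-unrolled by 4 (hypothetical factor) with independent partial sums.
 * The four accumulators remove the dependence between consecutive adds,
 * but they also reorder the additions, so floating-point rounding can
 * differ from sum_simple. */
double sum_unrolled4(const double *x, size_t n)
{
    assert(x != NULL || n == 0);   /* precondition check */
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;

    /* Main unrolled body: four elements per trip. */
    for (; i + 4 <= n; i += 4) {
        s0 += x[i];
        s1 += x[i + 1];
        s2 += x[i + 2];
        s3 += x[i + 3];
    }

    /* Remainder loop for the last n % 4 elements. */
    for (; i < n; i++)
        s0 += x[i];

    return (s0 + s1) + (s2 + s3);
}
```

For inputs that are small, exactly representable integers, both functions return the same value. For general floating-point data the low-order bits of the result may differ because the additions are performed in a different order.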