efficient
Improving I/O Performance
Improving Run-time Efficiency
compilation
Efficient Compilation
Optimizing Compilation Process Overview
use of arrays
use of record buffers
enabling
auto-parallelizer
implied-DO loop collapsing
inlining
parallelizer
PGO options
END PARALLEL DO
using
endian data
enhancing optimization
enhancing performance
environment variables
and OpenMP* extension routines
for auto-parallelization
for little endian conversion
FORT_BUFFERED
OMP_NUM_THREADS
OMP_SCHEDULE
OpenMP*
PROF_DUMP_INTERVAL
routines overriding
EQUIVALENCE
effect on run-time efficiency
example of
auto-parallelization
dumping profile information
loop constructs
parallel program development
using OpenMP*
using profile-guided optimization
vectorization
exceptions
denormal
exclude code
code coverage tool
execution environment routines
execution flow
execution mode
explicit-shape arrays
files
.dpi
Basic PGO Options
Code-coverage Tool
Profmerge and Proforder Utilities
Test-prioritization Tool
.dyn
Basic PGO Options
Code-coverage Tool
Dumping and Resetting Profile Information
Dumping Profile Information
PGO Environment Variables
Profile an Application
Profmerge and Proforder Utilities
Test-prioritization Tool
.hpi
.spi
Code-coverage Tool
Test-prioritization Tool
.tb5
formatted
OpenMP* header
pgopti.dpi
pgopti.spi
source
Efficient Compilation
Example of Profile-Guided Optimization
unformatted
FIRSTPRIVATE
in worksharing constructs
summary of data scope attribute clauses
using
floating-point applications
optimizing
flow dependency in loops
flush-to-zero mode
formatted files
FORT_BUFFERED environment variable
FTZ mode
function expansion
function order list
function preemption
function splitting
enabling or disabling
general compiler directives
Loop Unrolling Support
Prefetching Support
generating
instrumented code
processor-specific code
profile-optimized executable
profiling information
reports
guidelines
for auto-parallelization
for IA-32 architecture
for improving run-time efficiency
for profile-guided optimization
for vectorization
Key Programming Guidelines for Vectorization
Vectorization Overview
helper thread optimization
heuristics
affecting data prefetches
Loop Count and Loop Distribution
Prefetching Support
affecting software pipelining
Loop Count and Loop Distribution
Pipelining for IA-64 Architecture
for inlining functions
high-level optimization
high-level optimizer
HLO Overview
Optimizer Report Generation
high performance
high performance programming
HLO
High-Level Optimization (HLO) Report
HLO Overview
reports
hotspots
Hyper-Threading Technology
parallel loops
thread pools
I/O
improving performance
list
parsing
performance
IA-32 architecture
applications for
dispatch options for
guidelines for
options for
processors for
Automatic Processor-specific Optimization (IA-32 Architecture)
Parallelism Overview
report generation
IA-64 architecture based applications
auto-vectorization in
HLO
options
pipelining for
report generation
targeting
using intrinsics in
ILO
implied-DO loop
improving
I/O performance
run-time performance
inefficient
code
initialization values for reduction variables
inlining
Controlling Inline Expansion of User Functions
Efficient Compilation
Improving Run-time Efficiency
Inline Function Expansion
Profile-guided Optimizations Overview
User Directed Inline Expansion of User Functions
compiler directed
developer directed
preemption
instruction-level parallelism
instrumentation
compilation
instrumented code
execution
feedback compilation
generating
program
integer pointers
preventing aliasing
Intel(R)-extended intrinsics
Intel(R) architectures
Intel(R) Celeron D processors
Intel(R) Celeron M processors
Intel(R) compiler-generated code
Intel(R) Core™ Duo processors
Intel(R) Core™ Solo processors
Intel(R) Core™2 Duo processors
Intel(R) Core™2 Extreme processors
Intel(R) Core™2 Quad processors
Intel(R) Debugger
Intel(R) extension environment variables
Intel(R) extension routines
Intel(R) linking tools
Intel(R) Pentium(R) 4 processors
Intel(R) Pentium(R) II processors
Intel(R) Pentium(R) III processors
Intel(R) Pentium(R) Pro processors
Intel(R) Pentium(R) processors
Intel(R) Threading Tools
Intel(R) Xeon(R) processors
intermediate language scalar optimizer
intermediate representation (IR)
Interprocedural Optimization (IPO) Overview
Using IPO
intermediate results
using memory for
internal subprograms
interprocedural optimizations
Controlling Inline Expansion of User Functions
Efficient Compilation
Optimizer Report Generation
Profile-guided Optimizations Overview
interval profile dumping
initiating
intrinsics
introduction to Optimizing Applications
IPO
capturing intermediate output
code layout
compilation
compiling
considerations
creating and using an executable for
creating libraries
issues
large programs
linking
Interprocedural Optimization (IPO) Overview
Using IPO
options
overview
performance
reports
samples
using
whole program analysis
xiar
xild
Creating a Library from IPO Objects
Creating a Multi-file IPO Executable
xilibtool
xilink
IR
Interprocedural Optimization (IPO) Overview
Using IPO
Itanium(R) 2 processors
Itanium(R) processors
IVDEP
effect of compiler option on
effect when tuning applications
IVDEP directive