July 02, 2006

Performance Optimization

Results of a performance optimization study on both PowerPC and Core Duo machines. Each of the same two functions was run 100 times, and the best time for each was recorded as changes were made to the code and compiler flags. The "Sum" test sums 10,000 vectors (c = a + b). The "Diffuse" test runs a fluid diffusion pass on a 2D array of vectors.
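The measurement method described above (run 100 times, keep the best time) can be sketched roughly as follows. This is only an illustration, not the code that was actually timed: the `Vec4` type and the `sumVectors` and `bestTimeMs` names are made up for the example, and `std::chrono` stands in for whatever timer was really used.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical 4-float vector type; the Vector class used in the study is not shown.
struct Vec4 {
    float x, y, z, w;
};

// The "Sum" kernel: c[i] = a[i] + b[i] for every element.
inline void sumVectors(const std::vector<Vec4>& a,
                       const std::vector<Vec4>& b,
                       std::vector<Vec4>& c) {
    assert(a.size() == b.size() && a.size() == c.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        c[i] = Vec4{a[i].x + b[i].x, a[i].y + b[i].y,
                    a[i].z + b[i].z, a[i].w + b[i].w};
    }
}

// Calls fn `runs` times and returns the fastest wall-clock time in milliseconds.
// Keeping the minimum filters out runs disturbed by the scheduler or cold caches.
template <typename Fn>
double bestTimeMs(Fn&& fn, int runs = 100) {
    assert(runs > 0);
    double best = std::numeric_limits<double>::infinity();
    for (int r = 0; r < runs; ++r) {
        const auto start = std::chrono::steady_clock::now();
        fn();
        const auto stop = std::chrono::steady_clock::now();
        const double ms =
            std::chrono::duration<double, std::milli>(stop - start).count();
        if (ms < best) best = ms;
    }
    return best;
}
```

With three vectors of 10,000 elements each, a call such as `bestTimeMs([&] { sumVectors(a, b, c); })` would produce one cell of a results table like the ones below.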

PowerPC (G5 1.8GHz)

Change                            Sum     Diffuse
Baseline                          28ms    48ms
Switch to vFloat type             68ms    116ms
'inline' Vector ctor              69ms    128ms
AltiVec Vector functions          27ms    62ms
'inline' AltiVec functions        25ms    58ms
'inline' getNeighborSum()         25ms    38ms
Hand tune diffuse with vec_madd   n/a     23ms
-mtune=G5                         24ms    22ms
-ffast-math=16                    24ms    22ms
-falign-loops=16                  24ms    22ms
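Inlining getNeighborSum() took Diffuse from 58ms to 38ms. The function itself is not shown here, so the following is only a guess at its general shape: a small neighbour-sum helper called once per cell from the inner loop of a Jacobi-style diffusion pass. The `Grid` type, the `diffusePass` name, the update formula, and the use of plain floats instead of vectors are all assumptions made to keep the sketch short and portable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2D grid of floats; the study used a 2D array of vectors.
struct Grid {
    std::size_t width, height;
    std::vector<float> cells;
    Grid(std::size_t w, std::size_t h) : width(w), height(h), cells(w * h, 0.0f) {}
    float& at(std::size_t x, std::size_t y) { return cells[y * width + x]; }
    float at(std::size_t x, std::size_t y) const { return cells[y * width + x]; }
};

// Sum of the four direct neighbours of an interior cell (x, y).
// Being a small inline function defined before its caller, it can be
// folded into the loop body instead of costing a call per cell.
inline float getNeighborSum(const Grid& g, std::size_t x, std::size_t y) {
    return g.at(x - 1, y) + g.at(x + 1, y) + g.at(x, y - 1) + g.at(x, y + 1);
}

// One diffusion pass over the interior cells (assumed update rule):
//   dst = (src + a * neighbourSum) / (1 + 4a)
// Border cells of dst are left untouched.
inline void diffusePass(const Grid& src, Grid& dst, float a) {
    assert(src.width == dst.width && src.height == dst.height);
    const float inv = 1.0f / (1.0f + 4.0f * a);
    for (std::size_t y = 1; y + 1 < src.height; ++y) {
        for (std::size_t x = 1; x + 1 < src.width; ++x) {
            dst.at(x, y) = (src.at(x, y) + a * getNeighborSum(src, x, y)) * inv;
        }
    }
}
```

Whether the compiler actually inlines the helper depends on the optimization level and flags, so a change like this is worth re-timing rather than assuming.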

Intel (Core Duo 2GHz)

Change        Sum     Diffuse
Baseline      43ms    81ms
Inline SSE    18ms    29ms
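For the "Inline SSE" row, a minimal sketch of what an inlined SSE vector add might look like is given below. It is not the code that was measured. `SimdVec4`, `addSimd`, `sumVectorsSimd`, and the `SKETCH_HAVE_SSE` macro are names invented for this example, while `_mm_loadu_ps`, `_mm_add_ps`, and `_mm_storeu_ps` are the standard SSE intrinsics from `<xmmintrin.h>`. A scalar fallback keeps the sketch compiling on machines without SSE.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAVE_SSE 1
#else
#define SKETCH_HAVE_SSE 0
#endif

// Hypothetical 4-float vector that maps onto one 128-bit SSE register.
struct SimdVec4 {
    float v[4];
};

// c = a + b. With SSE this is one packed add instead of four scalar adds.
// Unaligned loads and stores are used so the sketch makes no alignment assumptions.
inline SimdVec4 addSimd(const SimdVec4& a, const SimdVec4& b) {
    SimdVec4 c;
#if SKETCH_HAVE_SSE
    _mm_storeu_ps(c.v, _mm_add_ps(_mm_loadu_ps(a.v), _mm_loadu_ps(b.v)));
#else
    for (int i = 0; i < 4; ++i) c.v[i] = a.v[i] + b.v[i];
#endif
    return c;
}

// The "Sum" test shape: c[i] = a[i] + b[i] over n vectors.
inline void sumVectorsSimd(const SimdVec4* a, const SimdVec4* b,
                           SimdVec4* c, std::size_t n) {
    assert(n == 0 || (a != nullptr && b != nullptr && c != nullptr));
    for (std::size_t i = 0; i < n; ++i) {
        c[i] = addSimd(a[i], b[i]);
    }
}
```

If the data is known to be 16-byte aligned, `_mm_load_ps` and `_mm_store_ps` could replace the unaligned variants.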

Posted by sean at July 2, 2006 11:33 PM

© Sean Houghton