Optimizing for UltraSPARC III

(This is a copy of a mail I sent to the CeBiTec's cluster-user mailing list.)

If you use the GNU Compiler Collection on the CeBiTec's SPARC machines to write high-performance software, I highly recommend adding some optimization options to your CFLAGS and/or CXXFLAGS.

Of course, -O2 or even -O3 should always be used, since GCC does not optimize at all by default – I have encountered code that ran twenty times faster with these options enabled.

The reason for this mail is that I have “discovered” the -mcpu= option for gcc and g++ – and its effectiveness. Use “-mcpu=ultrasparc3” to generate code which uses special instructions available only on an UltraSPARC III processor. In my case, this cut the run time of a CPU-bound binary by over 20% (303 s versus 232 s, a reduction of about 23%):

  using only -O2:                5:03 min
  using -mcpu=ultrasparc3 -O2:   3:52 min
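
For builds driven from the shell, the flags above could be set roughly like this. It is only a sketch: it assumes your Makefile or configure script picks up CFLAGS and CXXFLAGS from the environment, which many but not all build systems do, and the file name myprog.c in the comment is hypothetical.

```shell
# Sketch: optimization flags for an UltraSPARC III target.
# Assumes the build system reads CFLAGS/CXXFLAGS from the environment.
CFLAGS="-O2 -mcpu=ultrasparc3"
CXXFLAGS="$CFLAGS"
export CFLAGS CXXFLAGS

# A direct invocation would look like this (myprog.c is a hypothetical file):
#   gcc $CFLAGS -o myprog myprog.c
```

If your build system ignores the environment, the same two options can be put into the CFLAGS/CXXFLAGS lines of the Makefile itself.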

Of course, you should make sure that the target processor actually is an UltraSPARC III: a binary built with this option may not run on older SPARC processors.
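
One way to avoid using the option on the wrong machine is to pick the flags from a description of the CPU. The following is a minimal sketch, not a tested recipe for the cluster: the function name flags_for_cpu and the matched text "UltraSPARC-III" are assumptions for illustration, and how you obtain the description string depends on the operating system.

```shell
# Sketch: choose compiler flags from a CPU description string.
# The function name and the matched text are assumptions for illustration.
flags_for_cpu() {
    case "$1" in
        # Also matches variant names that begin with "UltraSPARC-III".
        *UltraSPARC-III*) echo "-O2 -mcpu=ultrasparc3" ;;
        # Anything else: fall back to plain -O2, which runs everywhere.
        *)                echo "-O2" ;;
    esac
}

# On Solaris the description might come from psrinfo (assumption):
#   CFLAGS=$(flags_for_cpu "$(psrinfo -pv)")
```

The fallback is deliberately conservative: an unrecognized processor gets only -O2, so the resulting binary is slower rather than unusable.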