I was experimenting with the attached program (taken from an IBM developerWorks article) to measure context switch times on an AMD64 machine. Built as a 64-bit binary it averages 5 to 8 usec/cswitch, whereas the same program built as a 32-bit binary consistently gives >= 10 usec/cswitch, and sometimes as much as 13 usec/cswitch.

Is there more context-switching overhead when running 32-bit programs on a 64-bit kernel? The kernel version is 2.6.11-gentoo x86_64.

64-bit compile:
    g++ -O2 -pthread csfast5.cpp -ocsfast64

32-bit compile:
    g++ -m32 -O2 -pthread csfast5.cpp -ocsfast32

Run:
    ./csfast{32/64} -t 40 -c4 10

Parag
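
For anyone reading without the attachment, the kind of measurement described above can be sketched roughly as follows. This is only an illustration and is not csfast5.cpp: it assumes a simple ping-pong between two pthreads over one mutex and one condition variable, and every name in it (PingPong, partner, usec_per_handoff) is hypothetical. Note also that on a multiprocessor the two threads may run on different CPUs, so the number is better read as a thread handoff latency than as a pure context switch cost unless both threads are pinned to one CPU.

```cpp
#include <pthread.h>
#include <time.h>
#include <cstddef>

// Hypothetical sketch, not the csfast5.cpp program from the post.
struct PingPong {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int  turn;    // 0: measuring thread runs next, 1: partner runs next
    long rounds;  // number of round trips to perform
};

// Partner thread: waits for its turn, then hands control straight back.
static void* partner(void* arg) {
    PingPong* p = static_cast<PingPong*>(arg);
    pthread_mutex_lock(&p->lock);
    for (long i = 0; i < p->rounds; ++i) {
        while (p->turn != 1)
            pthread_cond_wait(&p->cond, &p->lock);
        p->turn = 0;
        pthread_cond_signal(&p->cond);
    }
    pthread_mutex_unlock(&p->lock);
    return NULL;
}

// Returns the average time per handoff in microseconds.
// Each round trip contains two handoffs (there and back).
double usec_per_handoff(long rounds) {
    PingPong p;
    pthread_mutex_init(&p.lock, NULL);
    pthread_cond_init(&p.cond, NULL);
    p.turn = 0;
    p.rounds = rounds;

    pthread_t tid;
    pthread_create(&tid, NULL, partner, &p);

    timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);

    pthread_mutex_lock(&p.lock);
    for (long i = 0; i < rounds; ++i) {
        p.turn = 1;                        // give the partner its turn
        pthread_cond_signal(&p.cond);
        while (p.turn != 0)                // sleep until it hands back
            pthread_cond_wait(&p.cond, &p.lock);
    }
    pthread_mutex_unlock(&p.lock);

    clock_gettime(CLOCK_MONOTONIC, &end);
    pthread_join(tid, NULL);
    pthread_cond_destroy(&p.cond);
    pthread_mutex_destroy(&p.lock);

    double elapsed_usec = (end.tv_sec - start.tv_sec) * 1e6
                        + (end.tv_nsec - start.tv_nsec) / 1e3;
    return elapsed_usec / (2.0 * rounds);
}
```

Calling usec_per_handoff(100000) from a small main() and printing the result, once in a binary built with the 64-bit flags above and once with -m32, would give a comparison in the same spirit. The figure includes mutex and condition variable overhead, so it is an upper bound on the switch cost rather than an exact value.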