Christoph Lameter wrote:
> Could you come up with a patch? Currently, I do not seem to be able to
> spend enough time on it.

Please have a look at this patch. It is a temporary solution while we are
waiting for:

	test_and_set_bit(int nr, volatile void *addr, MODE_BARRIER) & co.

Changing the temporary variables to be 64 bits wide was not a good idea =>
alignment faults. In order to eliminate the extra "zxt4", I changed the
type of the return values of my intrinsic macros to be 32 bits wide.

Here is what I get (NOP-s removed):

reserve_bootmem_core+240:	[MMI]	mf;;
reserve_bootmem_core+241:		and r10=31,r18
reserve_bootmem_core+257:		extr r11=r18,5,27;;
reserve_bootmem_core+272:	[MFI]	shladd r16=r11,2,r16
reserve_bootmem_core+274:		shl r17=r19,r10;;
reserve_bootmem_core+288:	[MMI]	ld4.bias.nta r20=[r16];;
reserve_bootmem_core+289:		or r22=r17,r20
reserve_bootmem_core+305:		mov.m ar.ccv=r20;;
reserve_bootmem_core+320:	[MMI]	cmpxchg4.acq.nta r21=[r16],r22,ar.ccv;;
reserve_bootmem_core+322:		cmp4.eq p14,p15=r20,r21
reserve_bootmem_core+336:	[BBB]	(p15) br.cond.dptk.few reserve_bootmem_core+288

BTW, why do all the intrinsic macros return 64 bit wide values,
independently of their actual operand width? E.g.:

#define ia64_cmpxchg4_acq(ptr, new, old) ... __u64 ia64_intri_res;

Thanks,

Zoltan

Signed-off-by: Zoltan Menyhart