On Tuesday 13 November 2007 22:30, Denys Vlasenko wrote:
> I will resubmit the patch without de-unrolling.
> Meanwhile, I'd like to ask you guys to think about ways
> to make size/speed tradeoffs selectable at build time.

Here is the patch, which still has the loops unrolled but is
otherwise unchanged.

Description:

Use alternative key setup implementation with mostly 64-bit ops
if BITS_PER_LONG >= 64. Both much smaller and much faster.

Unify camellia_en/decrypt128/256 into camellia_do_en/decrypt.
The code was similar; with just one additional if() we can use
the same code.

Replace (x & 0xff) with (u8)x; gcc is not smart enough to realize
that it can do (x & 0xff) this way (which is smaller, at least on i386).

Don't do (x & 0xff) in a few places where x cannot be > 255 anyway:
	t0 = il >> 16;
	v = camellia_sp0222[(t1 >> 8) & 0xff];
il16 is u32, thus (t1 >> 8) is one byte!

Signed-off-by: Denys Vlasenko
--
vda
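
For anyone who wants to check the two byte-extraction points from the description outside the kernel tree, here is a minimal standalone sketch. It is not taken from the patch: the helper names are hypothetical, and the u8/u32 typedefs only stand in for the kernel's own types so that it builds in userspace. It assumes nothing beyond standard C unsigned arithmetic.

```c
#include <stdint.h>

/* Stand-ins for the kernel's u8/u32 (assumption: userspace build). */
typedef uint8_t  u8;
typedef uint32_t u32;

/* Hypothetical helper: low byte taken with an explicit mask. */
static inline u32 low_byte_mask(u32 x)
{
	return x & 0xff;
}

/*
 * Hypothetical helper: low byte taken with a truncating cast.
 * Conversion to an 8-bit unsigned type keeps the value modulo 256,
 * so the result equals the masked form for every input.
 */
static inline u32 low_byte_cast(u32 x)
{
	return (u8)x;
}

/*
 * Hypothetical helper: after a 32-bit value is shifted right by 16,
 * at most 16 significant bits remain. Shifting by another 8 leaves
 * at most 8 bits, so no "& 0xff" is needed.
 */
static inline u32 high_byte_nomask(u32 il)
{
	u32 t0 = il >> 16;

	return t0 >> 8;
}

/* Same extraction with the redundant mask, for comparison only. */
static inline u32 high_byte_masked(u32 il)
{
	u32 t0 = il >> 16;

	return (t0 >> 8) & 0xff;
}
```

Both pairs of helpers return identical values for any u32 input. Whether the cast form actually produces smaller code depends on the compiler and target, so compare the generated assembly (e.g. with gcc -S) rather than taking the sketch as proof of the size win.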