Hello all,

Here is a cleaned-up version of the previous experimental patch:
http://marc.info/?l=linux-bluetooth&m=123245036109697&w=2

I changed it to be alignment and byte order neutral (input data is now
read one byte at a time). This is a bit slower than reading via an
int16_t * pointer, but it avoids the headache of worrying about those
problems. Endian conversion is still kept as well (when reading one
byte at a time, it does not affect performance anyway).

The patch should be safe to apply. Benchmarks show a consistent
performance improvement of ~30% on both x86 and ARM Cortex-A8. That is
even more than I measured before, simply because the optimizations are
cumulative: the effect of each individual change becomes more visible
when the other parts also get faster (the previous benchmark was run
before the "-funroll-loops" optimization got committed).

ARM Cortex-A8:

before:
real    1m 24.78s
user    1m 21.20s
sys     0m 3.57s

after:
real    1m 4.72s
user    1m 1.03s
sys     0m 3.68s

Intel Core2:

before:
real    0m10.210s
user    0m9.761s
sys     0m0.324s

after:
real    0m7.729s
user    0m7.268s
sys     0m0.376s

Best regards,
Siarhei Siamashka
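
Note: as an illustration of the byte-at-a-time reading described above, here is a minimal sketch. It is not code from the patch: the helper names (read_s16_be, read_s16_le, unpack_s16) and the big_endian flag are hypothetical and chosen only for this example. The point it tries to show is that assembling each 16-bit sample from two individual bytes makes the result independent of both the alignment of the input buffer and the byte order of the host CPU.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers, not taken from the patch. */

/* Convert a raw 16-bit pattern to a signed value without relying on
 * implementation-defined narrowing conversions. */
static inline int16_t u16_to_s16(uint16_t u)
{
    return (u & 0x8000u) ? (int16_t)((int32_t)u - 0x10000) : (int16_t)u;
}

/* Read one big-endian signed 16-bit sample from any byte address. */
static inline int16_t read_s16_be(const uint8_t *p)
{
    return u16_to_s16((uint16_t)(((unsigned)p[0] << 8) | p[1]));
}

/* Read one little-endian signed 16-bit sample from any byte address. */
static inline int16_t read_s16_le(const uint8_t *p)
{
    return u16_to_s16((uint16_t)(((unsigned)p[1] << 8) | p[0]));
}

/* Unpack n samples from a byte buffer into host-order int16_t values.
 * The input pointer needs no particular alignment, and the result does
 * not depend on the endianness of the machine running the code. */
static void unpack_s16(const uint8_t *in, size_t n, int big_endian,
                       int16_t *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = big_endian ? read_s16_be(in + 2 * i)
                            : read_s16_le(in + 2 * i);
}
```

Because the byte order of the input is selected explicitly while the bytes are being combined, the endian conversion comes essentially for free in this style of loop, which matches the remark that it does not affect performance.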