This turned out to be a huge win on 32-bit i386 in PAE mode, but it is likely less significant on x86_64; I haven't measured the cost there, so I don't know. I don't have 64-bit hardware that I can afford to reboot right now, so this patch has not been run. If someone wants to try it out, it might show a measurable win on fork/exit. I lost my cycle count measurement diffs, but I don't think they would apply cleanly to x86_64 anyway.

The patch at least looks good and compiles cleanly on 2.6.13-rc5-mm1, so it is compile-tested only.

It might also show reduced latency on preemptible kernels during heavy fork/exit activity, possibly allowing ZAP_BLOCK_SIZE to be raised for some architectures (with the analogous patch on i386 with CONFIG_PREEMPT, I measured a ~30-50% reduction in cycle timings for zap_pte_range).

Zach