Philippe Gerum wrote:
> On Thu, 2006-09-28 at 12:31 +0200, Jan Kiszka wrote:
>> Does it still make sense then to optimise like I proposed?
>>
>
> If you did measure a 5% improvement on low-end x86 hw, then I guess so.
> Moving __ipipe_unstall_root to the arch-dependent section would do, so
> that we could use the most efficient implementation on a case by case
> basis.
>

I'm not seeing 5% alone due to this change, it's a piece in the puzzle.

I just benchmarked the fixed version below against the unmodified one.
There is an improvement of the average hackbench runtime, though it's
minimal. The wrong variant I posted first doesn't show noticeable
benefit compared to the correct one. I guess the "critical" part is
switching IRQs off when they were on, and not just enabling them when
they weren't disabled.

So I think the one below could be applied for now, but per-arch
optimisations may remain on the to-do list for the future if we discover
archs that could do better with hand-optimised code here. The x86
assembly looks quite good already.

--- linux-2.6.17.13.orig/kernel/ipipe/core.c
+++ linux-2.6.17.13/kernel/ipipe/core.c
@@ -166,6 +158,7 @@ void __ipipe_unstall_root(void)
 {
 	ipipe_declare_cpuid;
 
+#ifdef CONFIG_SMP
 	local_irq_disable_hw();
 
 	ipipe_load_cpuid();
@@ -174,6 +167,14 @@ void __ipipe_unstall_root(void)
 
 	if (ipipe_root_domain->cpudata[cpuid].irq_pending_hi != 0)
 		__ipipe_sync_pipeline(IPIPE_IRQMASK_ANY);
 
+#else /* !CONFIG_SMP */
+	clear_bit(IPIPE_STALL_FLAG, &ipipe_root_domain->cpudata[cpuid].status);
+
+	if (unlikely(ipipe_root_domain->cpudata[cpuid].irq_pending_hi != 0)) {
+		local_irq_disable_hw();
+		__ipipe_sync_pipeline(IPIPE_IRQMASK_ANY);
+	}
+#endif /* CONFIG_SMP */
 	local_irq_enable_hw();
 }
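
For illustration only, here is a minimal user-space sketch of why the uniprocessor path may be cheaper in the common case. It is not kernel code: struct model_cpu and all model_* functions are hypothetical stand-ins, and the model assumes the only cost of interest is the number of writes to the hardware interrupt flag. It compares an "always disable, then enable" unstall with one that only disables when something is pending.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of per-CPU state; not the real I-pipe structures. */
struct model_cpu {
    bool hw_irqs_on;        /* stands in for the CPU interrupt flag */
    bool root_stalled;      /* stands in for IPIPE_STALL_FLAG */
    unsigned long pending;  /* stands in for irq_pending_hi */
    unsigned hw_writes;     /* number of interrupt-flag writes performed */
};

static void model_hw_disable(struct model_cpu *c)
{
    c->hw_irqs_on = false;
    c->hw_writes++;
}

static void model_hw_enable(struct model_cpu *c)
{
    c->hw_irqs_on = true;
    c->hw_writes++;
}

/* Replaying pending interrupts must happen with hardware IRQs off. */
static void model_sync(struct model_cpu *c)
{
    assert(!c->hw_irqs_on);
    c->pending = 0;
}

/* Shape of the unmodified path: always disable first, then enable. */
static void model_unstall_always_disable(struct model_cpu *c)
{
    model_hw_disable(c);
    c->root_stalled = false;
    if (c->pending != 0)
        model_sync(c);
    model_hw_enable(c);
}

/* Shape of the !CONFIG_SMP path: disable only if there is work to replay. */
static void model_unstall_up_fast(struct model_cpu *c)
{
    c->root_stalled = false;
    if (c->pending != 0) {
        model_hw_disable(c);
        model_sync(c);
    }
    model_hw_enable(c);
}

/* Run one unstall variant from a stalled state and count flag writes. */
static unsigned model_hw_writes(void (*unstall)(struct model_cpu *),
                                unsigned long pending)
{
    struct model_cpu c = { .hw_irqs_on = true, .root_stalled = true,
                           .pending = pending };

    unstall(&c);
    assert(c.hw_irqs_on && !c.root_stalled && c.pending == 0);
    return c.hw_writes;
}
```

In this model, with nothing pending, model_hw_writes(model_unstall_always_disable, 0) returns 2 while model_hw_writes(model_unstall_up_fast, 0) returns 1; with something pending both return 2. Whether the saved flag write is measurable on real hardware depends on the architecture, which is the reason for leaving per-arch variants open.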