Date: Wed, 7 May 2008 10:22:51 -0600
From: Matthew Wilcox
To: Andrew Morton
Cc: Andi Kleen, Linus Torvalds, "Zhang, Yanmin", Ingo Molnar, LKML, Alexander Viro
Subject: Re: AIM7 40% regression with 2.6.26-rc1
Message-ID: <20080507162251.GX19219@parisc-linux.org>
In-Reply-To: <20080507083105.b9874d78.akpm@linux-foundation.org>

On Wed, May 07, 2008 at 08:31:05AM -0700, Andrew Morton wrote:
> On Wed, 07 May 2008 16:57:52 +0200 Andi Kleen wrote:
> 
> > Or figure out what made the semaphore consolidation slower?  As Ingo
> > pointed out earlier 40% is unlikely to be a fast path problem, but some
> > algorithmic problem.  Surely that is fixable (even for .26)?
> 
> Absolutely.  Yanmin is apparently showing that each call to __down()
> results in 1,451 calls to schedule().  wtf?

I can't figure it out either.  Unless schedule() is broken somehow ...
but that should have shown up with semaphore-sleepers.c, shouldn't it?

One other difference between semaphore-sleepers and the new generic code
is that, in effect, semaphore-sleepers does a little bit of spinning
before it sleeps.  That is, if up() and down() are called more-or-less
simultaneously, the increment of sem->count will happen before __down()
calls schedule().

How about something like this:

diff --git a/kernel/semaphore.c b/kernel/semaphore.c
index 5c2942e..ef83f5a 100644
--- a/kernel/semaphore.c
+++ b/kernel/semaphore.c
@@ -211,6 +211,7 @@ static inline int __sched __down_common(struct semaphore *sem, long state,
 	waiter.up = 0;
 
 	for (;;) {
+		int i;
 		if (state == TASK_INTERRUPTIBLE && signal_pending(task))
 			goto interrupted;
 		if (state == TASK_KILLABLE && fatal_signal_pending(task))
@@ -219,7 +220,15 @@ static inline int __sched __down_common(struct semaphore *sem, long state,
 			goto timed_out;
 		__set_task_state(task, state);
 		spin_unlock_irq(&sem->lock);
+
+		for (i = 0; i < 10; i++) {
+			if (waiter.up)
+				goto skip_schedule;
+			cpu_relax();
+		}
+
 		timeout = schedule_timeout(timeout);
+ skip_schedule:
 		spin_lock_irq(&sem->lock);
 		if (waiter.up)
 			return 0;

Maybe it'd be enough to test it once ... or maybe we should use
spin_is_locked() ... Ingo?

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."