From: Peter Zijlstra <peterz@infradead.org>
To: "Török Edwin" <edwin@clamav.net>
Cc: Ingo Molnar <mingo@elte.hu>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	aCaB <acab@clamav.net>, David Howells <dhowells@redhat.com>,
	Nick Piggin <npiggin@suse.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: Mutex vs semaphores scheduler bug
Date: Mon, 12 Oct 2009 16:53:27 +0200
Message-ID: <1255359207.10420.31.camel@twins>
In-Reply-To: <4AD0A0F7.9070700@clamav.net>

On Sat, 2009-10-10 at 17:57 +0300, Török Edwin wrote:
> If a semaphore (such as mmap_sem) is heavily congested, then using a
> userspace mutex makes the program faster.
> 
> For example, using a mutex around *anonymous* mmaps speeds it up
> significantly (~80% on this microbenchmark, ~15% on real
> applications). Such workarounds shouldn't be necessary for userspace
> applications; the kernel should by default use the most efficient
> implementation for locks.

Should, yes; does, no.

> However when using a mutex the number of context switches is SMALLER by
> 40-60%.

That matches the problem, see below.

> I think it's a bug in the scheduler; it schedules the mutex case much
> better.

It's not; the scheduler doesn't know about mutexes/futexes/rwsems.

> Maybe because userspace also spins a bit before actually calling
> futex().

Nope. If we were ever to spin, it would be in the kernel after calling
FUTEX_LOCK (which currently doesn't exist). glibc shouldn't do any
spinning on its own (if it does, I have yet another reason to try and
supplant the glibc futex code).

> I think it's important to optimize the mmap_sem semaphore.

It is.

The problem appears to be that rwsem doesn't allow lock-stealing, and
very strictly maintains FIFO order on contention. This results in extra
schedules and reduced performance as you noticed.

What happens is that when we release a contended rwsem, we assign it to
the next waiter. If, before that waiter gets to run, another (running)
task comes along and tries to acquire the lock, that task gets put to
sleep, even though it could possibly have acquired the lock (in which
case the woken waiter would detect the failure and go back to sleep).

So what I think we need to do is have a look at all this lib/rwsem.c
slowpath code and hack in lock stealing.
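
To make the difference concrete, here is a minimal single-threaded toy
model of the two release policies. It is only a sketch: it is not the
lib/rwsem.c code, it models exclusive acquisition only, and every name
in it (toy_lock, toy_acquire, toy_release, toy_waiter_runs, scenario)
is hypothetical. It counts how often a task has to go to sleep when a
running task wants the lock briefly while a woken waiter has not yet
been scheduled.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_WAITERS 8
#define NO_OWNER    (-1)

enum policy { HANDOFF, STEAL };

struct toy_lock {
	enum policy policy;
	int owner;               /* task id holding the lock, or NO_OWNER */
	int queue[MAX_WAITERS];  /* FIFO of sleeping task ids */
	int nwait;
	int sleeps;              /* number of times a task was put to sleep */
};

static void toy_init(struct toy_lock *l, enum policy p)
{
	l->policy = p;
	l->owner = NO_OWNER;
	l->nwait = 0;
	l->sleeps = 0;
}

/* A running task tries to take the lock; on failure it is queued and sleeps.
 * (A woken waiter that fails is requeued at the tail, for simplicity.) */
static bool toy_acquire(struct toy_lock *l, int task)
{
	if (l->owner == NO_OWNER) {
		l->owner = task;
		return true;
	}
	assert(l->nwait < MAX_WAITERS);
	l->queue[l->nwait++] = task;
	l->sleeps++;
	return false;
}

/* Release the lock and wake the head waiter; returns its id or NO_OWNER. */
static int toy_release(struct toy_lock *l)
{
	int woken;

	l->owner = NO_OWNER;
	if (l->nwait == 0)
		return NO_OWNER;

	woken = l->queue[0];
	for (int i = 1; i < l->nwait; i++)
		l->queue[i - 1] = l->queue[i];
	l->nwait--;

	if (l->policy == HANDOFF)
		l->owner = woken;  /* assigned before the waiter ever runs */
	return woken;
}

/* The woken waiter finally gets the CPU. */
static bool toy_waiter_runs(struct toy_lock *l, int task)
{
	if (l->owner == task)
		return true;              /* handoff: the lock is already ours */
	return toy_acquire(l, task);      /* steal: retry, may sleep again */
}

/* Task 0 holds the lock, task 1 waits, task 2 is running on another CPU. */
int scenario(enum policy p)
{
	struct toy_lock l;
	int woken;

	toy_init(&l, p);
	toy_acquire(&l, 0);
	toy_acquire(&l, 1);               /* contended: task 1 sleeps */
	woken = toy_release(&l);          /* task 0 releases, wakes task 1 */
	assert(woken == 1);

	/* Before task 1 is scheduled, task 2 wants the lock for a moment. */
	if (toy_acquire(&l, 2))
		toy_release(&l);          /* stolen, used and released */

	toy_waiter_runs(&l, 1);
	return l.sleeps;
}
```

In this model scenario(HANDOFF) puts two tasks to sleep (task 1, then
task 2, which finds the lock already assigned), while scenario(STEAL)
only puts task 1 to sleep once. If task 2 still held the lock when
task 1 ran, task 1 would fail its retry and sleep again, which is the
"detect failure and go back to sleep" case above.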




Thread overview: 6+ messages
2009-10-10 14:57 Mutex vs semaphores scheduler bug Török Edwin
2009-10-12 14:53 ` Peter Zijlstra [this message]
2009-10-12 15:37   ` Török Edwin
2009-10-15 23:44   ` David Howells
2009-10-17 15:32     ` Peter Zijlstra
2009-10-20 19:02       ` Török Edwin
