All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Nick Piggin <npiggin@suse.de>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [rfc] "fair" rw spinlocks
Date: Fri, 27 Nov 2009 18:07:39 -0800	[thread overview]
Message-ID: <20091128020739.GA18149@linux.vnet.ibm.com> (raw)
In-Reply-To: <20091123145409.GA29627@wotan.suse.de>

On Mon, Nov 23, 2009 at 03:54:09PM +0100, Nick Piggin wrote:
> Hi,
> 
> Last time this issue came up that I could see, I don't think
> there were objections to making rwlocks fair, the main
> difficulty seemed to be that we allow reentrant read locks
> (so a write lock waiting must not block arbitrary read lockers).
> 
> Nowadays our rwlock usage is smaller although still quite a
> few, so it would make better sense to do a conversion by
> introducing a new lock type and move them over I guess.
> 
> Anyway, I would like to add some kind of fairness or at least
> anti starvation for writers. We have a customer seeing total
> livelock on tasklist_lock for write locking on a system as small
> as 8 core Opteron.
> 
> This was basically reproduced by several cores executing wait
> with WNOHANG.
> 
> Of course it would always be nice to improve locking so
> contention isn't an issue, but so long as we have rwlocks, we
> could possibly get into a situation where starvation is
> triggered *somehow*. So I'd really like to fix this.
> 
> This particular starvation on tasklist lock I guess is a local
> DoS vulnerability even if the workload is not particularly
> realistic.
> 
> Anyway, I don't have a patch yet. I'm sure it can be done
> without extra atomics in fastpaths. Comments?

The usual trick would be to keep per-fair-rwlock state in per-CPU
variables.  If it is forbidden to read-acquire one nestable fair rwlock
while read-holding another, then this per-CPU state can be a single
pointer and a nesting count.  On the other hand, if it is permitted to
read-acquire one nestable fair rwlock while holding another, then one
can use a small per-CPU array of pointer/count pairs.

Readers check the per-CPU state.  If they already read-hold the lock,
they increment the nesting count, otherwise, they contend directly for
the lock (and set up the per-CPU state).

Same number of atomics on the fastpath as the current implementation.
Too bad about those array access, though!  ;-)

(Though on modern hardware, the array accesses might be a non-event,
performance-wise.)

Hey, you asked!!!  And there are other ways to make this work, including
variations on brlock.

							Thanx, Paul

  parent reply	other threads:[~2009-11-28  2:07 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-11-23 14:54 [rfc] "fair" rw spinlocks Nick Piggin
2009-11-24 20:19 ` David Miller
2009-11-25  6:52   ` Nick Piggin
2009-11-25  8:49   ` Andi Kleen
2009-11-25  8:56     ` Nick Piggin
2009-11-24 20:47 ` Andi Kleen
2009-11-25  6:54   ` Nick Piggin
2009-11-25  8:48     ` Andi Kleen
2009-11-25 13:09       ` Arnd Bergmann
2009-11-28  2:07 ` Paul E. McKenney [this message]
2009-11-28 11:15   ` Andi Kleen
2009-11-28 15:20     ` Paul E. McKenney
2009-11-28 17:30 ` Linus Torvalds
2009-11-29 18:51   ` Paul E. McKenney
2009-11-30  7:57     ` Nick Piggin
2009-11-30  7:55   ` Nick Piggin
2009-11-30 15:22     ` Linus Torvalds
2009-11-30 15:40       ` Nick Piggin
2009-11-30 16:07         ` Linus Torvalds
2009-11-30 16:17           ` Nick Piggin
2009-11-30 16:39           ` Paul E. McKenney
2009-11-30 17:05             ` Linus Torvalds
2009-11-30 17:13               ` Nick Piggin
2009-11-30 17:18                 ` Linus Torvalds
2009-12-01 17:03                   ` Arnd Bergmann
2009-12-01 17:15                     ` Linus Torvalds
2009-11-30 18:29                 ` Paul E. McKenney
2009-11-30 16:20     ` Paul E. McKenney
2009-11-30 10:00   ` Christoph Hellwig
2009-11-30 15:52     ` Linus Torvalds
2009-11-30 17:46       ` Ingo Molnar
2009-11-30 21:12         ` Thomas Gleixner
2009-11-30 21:27           ` Peter Zijlstra
2009-11-30 22:02             ` Thomas Gleixner
2009-11-30 22:11               ` Linus Torvalds
2009-11-30 22:37                 ` Thomas Gleixner
2009-11-30 22:49                   ` Linus Torvalds
2009-12-01 17:37                     ` [PATCH] audit: Call tty_audit_push_task() outside preempt disabled region Thomas Gleixner
2009-12-01 18:22                       ` Oleg Nesterov
2009-12-01 19:53                         ` Thomas Gleixner
2009-12-06  3:12                     ` [rfc] "fair" rw spinlocks Eric W. Biederman
2009-12-07 18:18                       ` Paul E. McKenney
2009-12-07 22:24                         ` Eric W. Biederman
2009-12-07 22:35                           ` Andi Kleen
2009-12-07 23:19                             ` Eric W. Biederman
2009-12-08  1:39                               ` Paul E. McKenney
2009-12-08  2:11                                 ` Eric W. Biederman
2009-12-08  2:37                                   ` Paul E. McKenney
2009-12-07 18:32                       ` Oleg Nesterov
2009-12-07 20:38                         ` Peter Zijlstra
2009-12-09 15:55                           ` Oleg Nesterov
2009-12-07 22:10                         ` Eric W. Biederman
2009-12-09 15:37                           ` Oleg Nesterov
2009-12-10  3:36                             ` Eric W. Biederman
2009-12-10  6:22                             ` Paul E. McKenney
2009-12-10 10:31                               ` Eric W. Biederman
2009-12-10 16:41                                 ` Paul E. McKenney
2009-12-01 19:01 ` Mathieu Desnoyers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20091128020739.GA18149@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=npiggin@suse.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.