All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@elte.hu>,
	Christoph Hellwig <hch@infradead.org>,
	Nick Piggin <npiggin@suse.de>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Oleg Nesterov <oleg@redhat.com>
Subject: Re: [rfc] "fair" rw spinlocks
Date: Mon, 7 Dec 2009 10:18:16 -0800	[thread overview]
Message-ID: <20091207181816.GF6808@linux.vnet.ibm.com> (raw)
In-Reply-To: <m1y6lg7q3n.fsf@fess.ebiederm.org>

On Sat, Dec 05, 2009 at 07:12:28PM -0800, Eric W. Biederman wrote:
> Linus Torvalds <torvalds@linux-foundation.org> writes:
> 
> > On Mon, 30 Nov 2009, Thomas Gleixner wrote:
> >> 
> >> I'm aware of that. The number of places where we read_lock
> >> tasklist_lock is 79 in 36 files right now. That's not a horrible task
> >> to go through them one by one and do a case by case conversion with a
> >> proper changelog. That would only leave the write_lock sites. 
> >
> > The write_lock sites should be fine, since just changing them to a 
> > spinlock should be 100% semantically equivalent - except for the lack of 
> > interrupt disable. And the lack of interrupt disable will result in a nice 
> > big deadlock if some interrupt really does take the spinlock, which is 
> > much easier to debug than a subtle race that would get the wrong read 
> > value.
> >
> >> We can then either do the rw_lock to spin_lock conversion or keep the
> >> rw_lock which has no readers anymore and behaves like a spinlock for a
> >> transition time so reverts of one of the read_lock -> rcu patches
> >> could be done to debug stuff.
> >
> > So as per the above, I wouldn't worry about the write lockers. Might as 
> > well change it to a spinlock, since that's what it will act as. It's not 
> > as if there is any chance that the spinlock code is subtly buggy.
> >
> > So the only reason to keep it as a rwlock would be if you decide to do the 
> > read-locked cases one by one, and don't end up with all of them converted. 
> > Which is a reasonable strategy too, of course. We don't _have_ to convert 
> > them all - if the main problem is some starvation issue, it's sufficient 
> > to convert just the main read-lock cases so that writers never get 
> > starved.
> >
> > But converting it all would be nice, because that whole
> >
> > 	write_lock_irq(&tasklist_lock);
> >
> > to
> >
> > 	spin_lock(&tasklist_lock);
> >
> > conversion would likely be a measurable performance win. Both because 
> > spinlocks are fundamentally faster (no atomic on unlock), and because you 
> > get rid of the irq disable/enable. But in order to get there, you'd have 
> > to convert _all_ the read-lockers, so you'd miss the opportunity to only 
> > convert the easy cases.
> 
> Atomically sending signal to every member of a process group, is the
> big fly in the ointment I am aware of.  Last time I looked I could
> not see how to convert it rcu.
> 
> Fundamentally: "kill -KILL -pgrp" should be usable to kill all of
> the processes in a process group, and "kill -KILL -1" should be usable
> to kill everything except the sender and init.  Something I have seen
> in shutdown scripts on more than one occasion.
> 
> This is a subtle in the sense that it won't show up in simple tests if
> you get it wrong.
> 
> This is a pain because we occasionally signal a process group from
> interrupt context.

Is it required that all of the processes see the signal before the
corresponding interrupt handler returns?  (My guess is "no", which
enables a trick or two, but thought I should ask.)

> The trouble as I recall is how to ensure new processes see the signal.

And can we afford to serialize signals to groups of processes?  Not
necessarily one at a time, but a limited set at a given time?
Alternatively, a long list of pending group signals for each new task to
walk?

							Thanx, Paul

  reply	other threads:[~2009-12-07 18:18 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-11-23 14:54 [rfc] "fair" rw spinlocks Nick Piggin
2009-11-24 20:19 ` David Miller
2009-11-25  6:52   ` Nick Piggin
2009-11-25  8:49   ` Andi Kleen
2009-11-25  8:56     ` Nick Piggin
2009-11-24 20:47 ` Andi Kleen
2009-11-25  6:54   ` Nick Piggin
2009-11-25  8:48     ` Andi Kleen
2009-11-25 13:09       ` Arnd Bergmann
2009-11-28  2:07 ` Paul E. McKenney
2009-11-28 11:15   ` Andi Kleen
2009-11-28 15:20     ` Paul E. McKenney
2009-11-28 17:30 ` Linus Torvalds
2009-11-29 18:51   ` Paul E. McKenney
2009-11-30  7:57     ` Nick Piggin
2009-11-30  7:55   ` Nick Piggin
2009-11-30 15:22     ` Linus Torvalds
2009-11-30 15:40       ` Nick Piggin
2009-11-30 16:07         ` Linus Torvalds
2009-11-30 16:17           ` Nick Piggin
2009-11-30 16:39           ` Paul E. McKenney
2009-11-30 17:05             ` Linus Torvalds
2009-11-30 17:13               ` Nick Piggin
2009-11-30 17:18                 ` Linus Torvalds
2009-12-01 17:03                   ` Arnd Bergmann
2009-12-01 17:15                     ` Linus Torvalds
2009-11-30 18:29                 ` Paul E. McKenney
2009-11-30 16:20     ` Paul E. McKenney
2009-11-30 10:00   ` Christoph Hellwig
2009-11-30 15:52     ` Linus Torvalds
2009-11-30 17:46       ` Ingo Molnar
2009-11-30 21:12         ` Thomas Gleixner
2009-11-30 21:27           ` Peter Zijlstra
2009-11-30 22:02             ` Thomas Gleixner
2009-11-30 22:11               ` Linus Torvalds
2009-11-30 22:37                 ` Thomas Gleixner
2009-11-30 22:49                   ` Linus Torvalds
2009-12-01 17:37                     ` [PATCH] audit: Call tty_audit_push_task() outside preempt disabled region Thomas Gleixner
2009-12-01 18:22                       ` Oleg Nesterov
2009-12-01 19:53                         ` Thomas Gleixner
2009-12-06  3:12                     ` [rfc] "fair" rw spinlocks Eric W. Biederman
2009-12-07 18:18                       ` Paul E. McKenney [this message]
2009-12-07 22:24                         ` Eric W. Biederman
2009-12-07 22:35                           ` Andi Kleen
2009-12-07 23:19                             ` Eric W. Biederman
2009-12-08  1:39                               ` Paul E. McKenney
2009-12-08  2:11                                 ` Eric W. Biederman
2009-12-08  2:37                                   ` Paul E. McKenney
2009-12-07 18:32                       ` Oleg Nesterov
2009-12-07 20:38                         ` Peter Zijlstra
2009-12-09 15:55                           ` Oleg Nesterov
2009-12-07 22:10                         ` Eric W. Biederman
2009-12-09 15:37                           ` Oleg Nesterov
2009-12-10  3:36                             ` Eric W. Biederman
2009-12-10  6:22                             ` Paul E. McKenney
2009-12-10 10:31                               ` Eric W. Biederman
2009-12-10 16:41                                 ` Paul E. McKenney
2009-12-01 19:01 ` Mathieu Desnoyers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20091207181816.GF6808@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=ebiederm@xmission.com \
    --cc=hch@infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=npiggin@suse.de \
    --cc=oleg@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.