From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@elte.hu>,
Christoph Hellwig <hch@infradead.org>,
Nick Piggin <npiggin@suse.de>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Oleg Nesterov <oleg@redhat.com>
Subject: Re: [rfc] "fair" rw spinlocks
Date: Mon, 7 Dec 2009 10:18:16 -0800 [thread overview]
Message-ID: <20091207181816.GF6808@linux.vnet.ibm.com> (raw)
In-Reply-To: <m1y6lg7q3n.fsf@fess.ebiederm.org>
On Sat, Dec 05, 2009 at 07:12:28PM -0800, Eric W. Biederman wrote:
> Linus Torvalds <torvalds@linux-foundation.org> writes:
>
> > On Mon, 30 Nov 2009, Thomas Gleixner wrote:
> >>
> >> I'm aware of that. The number of places where we read_lock
> >> tasklist_lock is 79 in 36 files right now. That's not a horrible task
> >> to go through them one by one and do a case by case conversion with a
> >> proper changelog. That would only leave the write_lock sites.
> >
> > The write_lock sites should be fine, since just changing them to a
> > spinlock should be 100% semantically equivalent - except for the lack of
> > interrupt disable. And the lack of interrupt disable will result in a nice
> > big deadlock if some interrupt really does take the spinlock, which is
> > much easier to debug than a subtle race that would get the wrong read
> > value.
> >
> >> We can then either do the rw_lock to spin_lock conversion or keep the
> >> rw_lock which has no readers anymore and behaves like a spinlock for a
> >> transition time so reverts of one of the read_lock -> rcu patches
> >> could be done to debug stuff.
> >
> > So as per the above, I wouldn't worry about the write lockers. Might as
> > well change it to a spinlock, since that's what it will act as. It's not
> > as if there is any chance that the spinlock code is subtly buggy.
> >
> > So the only reason to keep it as a rwlock would be if you decide to do the
> > read-locked cases one by one, and don't end up with all of them converted.
> > Which is a reasonable strategy too, of course. We don't _have_ to convert
> > them all - if the main problem is some starvation issue, it's sufficient
> > to convert just the main read-lock cases so that writers never get
> > starved.
> >
> > But converting it all would be nice, because that whole
> >
> > write_lock_irq(&tasklist_lock);
> >
> > to
> >
> > spin_lock(&tasklist_lock);
> >
> > conversion would likely be a measurable performance win. Both because
> > spinlocks are fundamentally faster (no atomic on unlock), and because you
> > get rid of the irq disable/enable. But in order to get there, you'd have
> > to convert _all_ the read-lockers, so you'd miss the opportunity to only
> > convert the easy cases.
>
> Atomically sending signal to every member of a process group, is the
> big fly in the ointment I am aware of. Last time I looked I could
> not see how to convert it rcu.
>
> Fundamentally: "kill -KILL -pgrp" should be usable to kill all of
> the processes in a process group, and "kill -KILL -1" should be usable
> to kill everything except the sender and init. Something I have seen
> in shutdown scripts on more than one occasion.
>
> This is a subtle in the sense that it won't show up in simple tests if
> you get it wrong.
>
> This is a pain because we occasionally signal a process group from
> interrupt context.
Is it required that all of the processes see the signal before the
corresponding interrupt handler returns? (My guess is "no", which
enables a trick or two, but thought I should ask.)
> The trouble as I recall is how to ensure new processes see the signal.
And can we afford to serialize signals to groups of processes? Not
necessarily one at a time, but a limited set at a given time?
Alternatively, a long list of pending group signals for each new task to
walk?
Thanx, Paul
next prev parent reply other threads:[~2009-12-07 18:18 UTC|newest]
Thread overview: 58+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-11-23 14:54 [rfc] "fair" rw spinlocks Nick Piggin
2009-11-24 20:19 ` David Miller
2009-11-25 6:52 ` Nick Piggin
2009-11-25 8:49 ` Andi Kleen
2009-11-25 8:56 ` Nick Piggin
2009-11-24 20:47 ` Andi Kleen
2009-11-25 6:54 ` Nick Piggin
2009-11-25 8:48 ` Andi Kleen
2009-11-25 13:09 ` Arnd Bergmann
2009-11-28 2:07 ` Paul E. McKenney
2009-11-28 11:15 ` Andi Kleen
2009-11-28 15:20 ` Paul E. McKenney
2009-11-28 17:30 ` Linus Torvalds
2009-11-29 18:51 ` Paul E. McKenney
2009-11-30 7:57 ` Nick Piggin
2009-11-30 7:55 ` Nick Piggin
2009-11-30 15:22 ` Linus Torvalds
2009-11-30 15:40 ` Nick Piggin
2009-11-30 16:07 ` Linus Torvalds
2009-11-30 16:17 ` Nick Piggin
2009-11-30 16:39 ` Paul E. McKenney
2009-11-30 17:05 ` Linus Torvalds
2009-11-30 17:13 ` Nick Piggin
2009-11-30 17:18 ` Linus Torvalds
2009-12-01 17:03 ` Arnd Bergmann
2009-12-01 17:15 ` Linus Torvalds
2009-11-30 18:29 ` Paul E. McKenney
2009-11-30 16:20 ` Paul E. McKenney
2009-11-30 10:00 ` Christoph Hellwig
2009-11-30 15:52 ` Linus Torvalds
2009-11-30 17:46 ` Ingo Molnar
2009-11-30 21:12 ` Thomas Gleixner
2009-11-30 21:27 ` Peter Zijlstra
2009-11-30 22:02 ` Thomas Gleixner
2009-11-30 22:11 ` Linus Torvalds
2009-11-30 22:37 ` Thomas Gleixner
2009-11-30 22:49 ` Linus Torvalds
2009-12-01 17:37 ` [PATCH] audit: Call tty_audit_push_task() outside preempt disabled region Thomas Gleixner
2009-12-01 18:22 ` Oleg Nesterov
2009-12-01 19:53 ` Thomas Gleixner
2009-12-06 3:12 ` [rfc] "fair" rw spinlocks Eric W. Biederman
2009-12-07 18:18 ` Paul E. McKenney [this message]
2009-12-07 22:24 ` Eric W. Biederman
2009-12-07 22:35 ` Andi Kleen
2009-12-07 23:19 ` Eric W. Biederman
2009-12-08 1:39 ` Paul E. McKenney
2009-12-08 2:11 ` Eric W. Biederman
2009-12-08 2:37 ` Paul E. McKenney
2009-12-07 18:32 ` Oleg Nesterov
2009-12-07 20:38 ` Peter Zijlstra
2009-12-09 15:55 ` Oleg Nesterov
2009-12-07 22:10 ` Eric W. Biederman
2009-12-09 15:37 ` Oleg Nesterov
2009-12-10 3:36 ` Eric W. Biederman
2009-12-10 6:22 ` Paul E. McKenney
2009-12-10 10:31 ` Eric W. Biederman
2009-12-10 16:41 ` Paul E. McKenney
2009-12-01 19:01 ` Mathieu Desnoyers
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20091207181816.GF6808@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=ebiederm@xmission.com \
--cc=hch@infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=npiggin@suse.de \
--cc=oleg@redhat.com \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).