From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Don Zickus <dzickus@redhat.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Ingo Molnar <mingo@elte.hu>,
	Jerome Marchand <jmarchan@redhat.com>,
	Mandeep Singh Baines <msb@google.com>,
	linux-kernel@vger.kernel.org, stable@kernel.org,
	"Eric W. Biederman" <ebiederm@xmission.com>
Subject: Re: while_each_thread() under rcu_read_lock() is broken?
Date: Thu, 24 Jun 2010 20:37:44 -0700
Message-ID: <20100625033744.GC2391@linux.vnet.ibm.com>
In-Reply-To: <20100624211446.5713743188@magilla.sf.frob.com>

On Thu, Jun 24, 2010 at 02:14:46PM -0700, Roland McGrath wrote:
> > First, what "bad things" can happen to a reader scanning a thread
> > group?
> > 
> > 1.	The thread-group leader might do exec(), destroying the old
> > 	list and forming a new one.  In this case, we want any readers
> > 	to stop scanning.
> 
> This doesn't do anything different (for these concerns) from just all the
> other threads happening to exit right before the exec.  There is no
> "destroying the old" and "forming the new", it's just that all the other
> threads are convinced to die now.  There is no problem here.

Understood -- I wasn't saying that each category posed a unique problem,
but rather making sure that I really had enumerated all the possibilities.
The reason for my "destroying the old" and "forming the new" is the
possibility of someone doing proc_task_readdir() when the group leader
does exec(), which causes all to die, and then the new process does
pthread_create(), forming a new thread group.  Because proc_task_readdir()
neither holds a lock nor stays in an RCU read-side critical section
for the full /proc scan, the old group might really be destroyed from
the reader's point of view.

That said, I freely admit that I am not very familiar with this code.

> > 2.	Some other thread might do exec(), destroying the old list and
> > 	forming a new one.  In this case, we also want any readers to
> > 	stop scanning.
> 
> Again, the list is not really destroyed, just everybody dies.  What is
> different here is that ->group_leader changes.  This is the only time
> that ever happens.  Moreover, it's the only time that a task that was
> previously pointed to by any ->group_leader can be reaped before the
> rest of the group has already been reaped first (and thus the
> thread_group made a singleton).

Yep!  Same proc_task_readdir() situation as before.  The group leader
cannot go away because proc_task_readdir() takes a reference.

> > 3.	The thread-group leader might do pthread_exit(), removing itself
> > 	from the thread group -- and might do so while the hapless reader
> > 	is referencing that thread.
> 
> This is called the delay_group_leader() case.  It doesn't happen in a
> way that has the problems you are concerned with.  The group_leader
> remains in EXIT_ZOMBIE state and can't be reaped until all the other
> threads have been reaped.  There is no time at which any thread in the
> group is in any hashes or accessible by any means after the (final)
> group_leader is reaped.

OK, good to know -- that does make things simpler.

> > 4.	Some other thread might do pthread_exit(), removing itself
> > 	from the thread group, and again might do so while the hapless
> > 	reader is referencing that thread.  In this case, we want
> > 	the hapless reader to continue scanning the remainder of the
> > 	thread group.
> 
> This is the most normal case (and #1 is effectively just this repeated
> by every thread in parallel).

Agreed.  One possible difference is that in #1, no one is going to
complain about the reader quitting, while in this case someone might
well be annoyed if the list of threads is incomplete.
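
To make case #4 concrete, here is a rough single-threaded user-space
sketch, not kernel code.  The names toy_thread and toy_scan_with_exit()
are made up for illustration.  The only property it leans on is that an
RCU-style list removal leaves the removed element's forward pointer
intact, so a reader that happens to be standing on the exiting thread
can still reach the rest of the group.  In the real kernel the reader
would also need rcu_read_lock(), and the freeing of the removed element
would have to wait for a grace period, neither of which is modelled here.

```c
#include <stddef.h>

/* Hypothetical stand-in for a thread on a circular thread-group list. */
struct toy_thread {
	struct toy_thread *next;	/* next thread in the group */
	int id;
};

/*
 * Scan the group starting at leader.  When the reader reaches victim,
 * unlink victim before advancing, as if it had exited concurrently.
 * victim must be a non-leader thread that is currently on the list.
 * Visited ids are stored in out[]; at most cap entries are written.
 * Returns the number of threads visited.
 */
static size_t toy_scan_with_exit(struct toy_thread *leader,
				 struct toy_thread *victim,
				 int *out, size_t cap)
{
	struct toy_thread *t = leader;
	size_t n = 0;

	do {
		if (n == cap)
			break;
		out[n++] = t->id;
		if (t == victim) {
			struct toy_thread *p = leader;

			/* Find the predecessor and make it skip victim. */
			while (p->next != victim)
				p = p->next;
			p->next = victim->next;
			/* victim->next is deliberately left untouched. */
		}
	} while ((t = t->next) != leader);

	return n;
}
```

With a three-thread group and the middle thread as victim, the scan
still reports all three ids, and the list afterwards holds only the
two survivors.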

> > 5.	The thread-group leader might do exit(), destroying the old
> > 	list without forming a new one.  In this case, we want any
> > 	readers to stop scanning.
> 
> All this means is everybody is convinced to die, and the group_leader
> dies too.  It is not discernibly different from #6.

Seems reasonable.

> > 6.	Some other thread might do exit(), destroying the old list
> > 	without forming a new one.  In this case, we also want any
> > 	readers to stop scanning.
> 
> This just means everybody is convinced to die and is not materially
> different from each individual thread all happening to die at the same
> time.  

Fair enough.  Again, my goal was to ensure that I had covered all the
cases as opposed to ensuring that I had described them minimally.

> You've described all these cases as "we want any readers to stop
> scanning".  That is far too strong, and sounds like some kind of
> guaranteed synchronization, which does not and need not exist.  Any
> reader that needs a dead thread to be off the list holds siglock
> and/or tasklist_lock.  For the casual readers that only use
> rcu_read_lock, we only "want any readers' loops eventually to
> terminate and never to dereference stale pointers".  That's why
> normal RCU listiness is generally fine.

OK, so maybe "it is OK for readers to stop scanning" is a better way
of putting it?

> The only problem we have is in #2.  This is only a problem because
> readers' loops may be using the old ->group_leader pointer as the
> anchor for their circular-list round-robin loop.  Once the former
> leader is removed from the list, that loop termination condition can
> never be met.

Does Oleg's checking for the group leader still being alive look correct
to you?
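
For anyone following along, the failure in #2 can be reproduced in a
small user-space model.  This is only a sketch under stated
assumptions: toy_task, its on_list flag, and the two walk functions are
hypothetical names, and the on_list test merely stands in for some form
of "is the anchor still alive" check, not for the actual patch under
discussion.  The limit argument exists only so that the naive walk can
report non-termination instead of spinning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task on a circular thread-group list. */
struct toy_task {
	struct toy_task *next;	/* next thread in the group */
	bool on_list;		/* cleared when the task is unlinked */
};

/*
 * Unlink t RCU-style: its predecessor now skips it, but t->next is
 * left intact so that a reader standing on t can still move forward.
 */
static void toy_unlink(struct toy_task *prev, struct toy_task *t)
{
	prev->next = t->next;
	t->on_list = false;
}

/*
 * Naive walk: terminate only on returning to the anchor g.
 * Returns the number of tasks visited, or -1 if more than limit
 * steps were taken without getting back to g.
 */
static int toy_walk_naive(struct toy_task *g, int limit)
{
	struct toy_task *t = g;
	int visited = 0;

	assert(limit > 0);
	do {
		if (++visited > limit)
			return -1;	/* would loop forever unbounded */
	} while ((t = t->next) != g);
	return visited;
}

/* Same walk, but give up as soon as the anchor is seen to be off-list. */
static int toy_walk_checked(struct toy_task *g, int limit)
{
	struct toy_task *t = g;
	int visited = 0;

	assert(limit > 0);
	do {
		if (++visited > limit)
			return -1;
	} while (g->on_list && (t = t->next) != g);
	return visited;
}
```

On an intact three-task ring both walks visit three tasks.  Once the
anchor has been unlinked, the naive walk circles the two survivors until
it hits the limit, while the checked walk stops after the first task.
The checked walk may therefore return an incomplete scan, which matches
the weaker "it is OK for readers to stop scanning" wording.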

							Thanx, Paul
