public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: "Jan Schönherr" <schnhrr@cs.tu-berlin.de>
Cc: Paul Turner <pjt@google.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@elte.hu>, Dipankar Sarma <dipankar@in.ibm.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFCv2 2/8] rcu: More rcu-variants for list manipulation
Date: Tue, 02 Aug 2011 15:39:31 +0200	[thread overview]
Message-ID: <1312292371.1147.178.camel@twins> (raw)
In-Reply-To: <4E327232.204@cs.tu-berlin.de>

On Fri, 2011-07-29 at 10:41 +0200, Jan Schönherr wrote:
> Am 27.07.2011 21:10, schrieb Jan H. Schönherr:
> >  /**
> > + * list_add_tail_nobackref - add element to list without setting up
> > + *			     back references
> > + * @new: element to add
> > + * @head: list to add the element to
> > + *
> > + * @new is appended to @head.
> > + *
> > + * In contrast to list_add_tail(), this function does not maintain the
> > + * references back to @head. So, neither @new->next will be changed, nor
> > + * -- in case @new becomes the first entry of @head -- @new->pref.
> > + *
> > + * This is useful when reorganizing deleted elements of a RCU-protected
> > + * list as concurrent readers must always find their way back to the list.
> > + * When used in this context, special care must be taken in order to not
> > + * disturb concurrent readers on @head (or @new):
> > + * a) @new->next must lead back to the list
> > + * b) @new->next must not lead to any node already on @head
> > + * c) @new must be already known to existing readers on @head
> 
> This might actually race with physical deletion. Consider a list:
> 
>    HEAD --> A --> B --> C --> D --> HEAD
> 
> 1. Remove C from list, then D:
> 
>    HEAD --> A --> B --------------> HEAD
>                         C --> D ----^
> 
> 2. Attempt to physically delete D by using call_rcu() or similar.
>    This is delayed until all already running readers have finished.
> 
> 3. Let a new reader begin to traverse the list, advancing until A.
> 
> 4. Remove A.
> 
>    HEAD --------> B --------------> HEAD
>             A ----^      C --> D ----^
> 
> 5. Call list_add_tail_nobackref() for A, then C.
> 
>    HEAD --------> B --------------> HEAD
>    LIST --> A ---------> C --> D ----^
> 
> 6. Finish step 2, really deleting D.
> 
> 7. Let the new reader continue its traversal.
> 
> This reader will hit D, which is obviously bad.
> 
> 
> I think, we can avoid this by modifying the next pointers
> of the deleted elements and letting them point to HEAD.
> That is:
> 
> 5a. Call list_add_tail_nobackref() for A.
> 
>    HEAD --------> B --------------> HEAD
>                         C --> D ----^
>    LIST --> A -----------------------^
> 
> 5b. Call list_add_tail_nobackref() for C.
> 
>    HEAD --------> B --------------> HEAD
>                               D ----^
>    LIST --> A --------> C -----------^
> 
> 
> 
> However, this (and also the previous version) might
> cause readers to skip elements that are on the list
> (B in the example above). Can we tolerate that?
> 
> My current guess would be: no.
> 
> Thinking more about that, we already have that case:
> 
> 
>    HEAD --> A --> B --> C --> HEAD
> 
> 1. Let a reader traverse until A.
> 
> 2. Remove A.
> 
>    HEAD --------> B --> C --> HEAD
>             A ----^
> 
> 3. Add A after B.
> 
>    HEAD --------> B     C --> HEAD
>                   +> A -^
> 
> 4. Let reader continue traversal, skipping B.
> 
> (Similarly, if we had removed C and re-added it between A and B,
> we could have traversed B twice.)
> 
> So either the presented solution in this patch series
> is valid, or -- when we are not allowed to re-add
> elements which might still have readers -- the current
> handling of CFS leaf runqueues has an additional problem
> besides mixing up the order sometimes...
> 
> Paul? [Both of you, actually. :)]

*head still reels a little*

Right so traditional wisdom dictates not re-using entries that might
possibly still be in used. I remember us having had a lot of 'fun' with
that in kernel/smp.c..

See commit 6dc19899958e420a931274b94019e267e2396d3e and the follow ups
from Milton Miller.

Our use-case makes this hard because we need to re-insert the entry as
soon as a task wakes up again.. ho humm..

I don't think we really mind missing an entry or visiting a few twice,
as long as there's a solid termination condition for each iteration. Its
load-balancing after all, if we mess up, we'll fix up next time
round ;-)



  reply	other threads:[~2011-08-02 13:40 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-07-27 19:10 [PATCH RFCv2 0/8] sched: Enforce order of leaf CFS runqueues (and list cleanup) Jan H. Schönherr
2011-07-27 19:10 ` [PATCH RFCv2 1/8] list, treewide: Rename __list_del() to __list_link() Jan H. Schönherr
2011-07-27 19:10 ` [PATCH RFCv2 2/8] rcu: More rcu-variants for list manipulation Jan H. Schönherr
2011-07-29  8:41   ` Jan Schönherr
2011-08-02 13:39     ` Peter Zijlstra [this message]
2011-07-27 19:10 ` [PATCH RFCv2 3/8] sched: Handle on_list ancestor in list_add_leaf_cfs_rq() Jan H. Schönherr
2011-08-02 13:50   ` Peter Zijlstra
2011-08-03 20:44     ` Jan Schönherr
2011-07-27 19:10 ` [PATCH RFCv2 4/8] list, treewide: Rename __list_del_entry() to __list_del() Jan H. Schönherr
2011-07-27 19:10 ` [PATCH RFCv2 5/8] treewide: Use __list_del() instead of __list_link() Jan H. Schönherr
2011-07-27 19:10 ` [PATCH RFCv2 6/8] list: Make use " Jan H. Schönherr
2011-07-27 19:10 ` [PATCH RFCv2 7/8] rcu: Make use of __list_link() and __list_link_rcu() Jan H. Schönherr
2011-07-27 19:10 ` [PATCH RFCv2 8/8] rcu: Rewrite and rename list_splice_init_rcu() Jan H. Schönherr
2011-08-03 21:05 ` [PATCH RFCv2 0/8] sched: Enforce order of leaf CFS runqueues (and list cleanup) Jan Schönherr
2011-08-03 21:35   ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1312292371.1147.178.camel@twins \
    --to=peterz@infradead.org \
    --cc=dipankar@in.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=pjt@google.com \
    --cc=schnhrr@cs.tu-berlin.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox