public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, avi@redhat.com
Subject: Re: array of pointers with rcu
Date: Sun, 28 Jun 2009 22:06:36 +0300	[thread overview]
Message-ID: <20090628190636.GA14136@redhat.com> (raw)
In-Reply-To: <20090628161411.GE7070@linux.vnet.ibm.com>

On Sun, Jun 28, 2009 at 09:14:11AM -0700, Paul E. McKenney wrote:
> On Sun, Jun 28, 2009 at 04:22:24PM +0300, Michael S. Tsirkin wrote:
> > Paul,
> > I'd like to implement a static array of pointers with rcu.
> > (Note that Documentation/RCU/arrayRCU.txt addresses only the case of
> > static arrays where the data (rather than a pointer to the data) is
> > located in each array element).
> > The array is implemented today in kvm as follows:
> > 
> > struct kvm_io_bus {
> >         int                   dev_count;
> > #define NR_IOBUS_DEVS 6
> >         struct kvm_io_device *devs[NR_IOBUS_DEVS];
> > };
> > 
> > It's easy to use rcu_assign_pointer and
> > rcu_dereference to access each value in the array,
> > so that's fine.
> > 
> > However, I also have the dev_count value to handle.
> > This value is usually used in loops like this
> > 
> > for (i = 0; i < bus->dev_count; ++i) {
> > 	... uses bus->devs[i] ...
> > }
> 
> So the count can vary, then.  In other words, someone might change the
> value of ->dev_count from (say) 5 to (say) 2, implicitly invalidating
> the last three ->devs pointers, correct?
> 
> Of course, this implicit invalidation might happen just after you
> fetched the value of (say) ->devs[3].  Or just after you fetch the
> value of ->dev_count, for that matter.

Yes. So I need to be careful to call synchronize_rcu after changing dev_count
and before I can assume readers see an updated value.

> > I can assign the dev_count value with rcu_assign_pointer and even though
> > it is not a pointer I think it should work fine.  However, to access
> > dev_count I think that rcu_dereference will not do what I want:
> > 
> > #define rcu_dereference(p)     ({ \
> >                                 typeof(p) _________p1 = ACCESS_ONCE(p);
> > \
> >                                 smp_read_barrier_depends(); \
> >                                 (_________p1); \
> >                                 })
> > 
> > 
> > The use of dev_count is not through a dependency so
> > smp_read_barrier_depends will not be enough on most architectures.
> > 
> > So it seems that what I really need is something like:
> > #define rcu_read_value(p)     ({ \
> >                                 typeof(p) _________p1 = ACCESS_ONCE(p);
> > \
> >                                 smp_rmb(); \
> >                                 (_________p1); \
> >                                 })
> > 
> > And maybe
> > #define rcu_assign_value rcu_assign_pointer
> > for symmetry.
> > 
> > Comments?
> 
> Hmmm...
> 
> What you would really need to do in the above scheme is to make careful
> use of memory barriers trao order the accesses to ->dev_count and the the
> elements of the ->devs[] array.

Right, that's my plan. I think that the strategy I outlined above where
dev_count is always set with rcu_assign_value and read with
rcu_read_value, all array entries are read after dev_count is read with
rcu_dereference, and are assigned before dev count is updated with
rcu_assign_pointer will work in this case.

What I was thinking is that maybe this pattern is generic
enough to be of use to others. But maybe not.

> The usual trick to avoid such memory
> barriers it to dynamically allocate the array, but putting the count
> into a struct that includes the array, so that readers are guaranteed
> to get a value of ->dev_count that matches the ->devs[] array.
> 
> Of course, you might have other reasons for the array to be statically
> allocated.  In that case, one trick is to statically allocate two
> arrays, each with its own ->dev_count.  Then a size-change update will
> copy from the current array to the new array, set the value of the new
> ->dev_count appropriately, and then use rcu_assign_pointer() to cause
> new readers to see the updated array.  In short, use the approach
> described for resizeable arrays in Documentation/RCU/arrayRCU.txt
> even though the physical array stays the same size.
> 
> But again given that you only have six elements, why not just scan the
> whole array unconditionally, using NULL pointers in the ->devs[] elements
> to indicate that the corresponding element is invalid?  Then the normal
> rcu_assign_pointer() and rcu_dereference() will work normally.
> 
> 							Thanx, Paul


People are speaking about increasing the array size to 512. That's
a nice 4K page but adding the number of entries gets us just above that
which is kind of annoying.

-- 
MST

  reply	other threads:[~2009-06-28 19:07 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-06-28 13:22 array of pointers with rcu Michael S. Tsirkin
2009-06-28 16:14 ` Paul E. McKenney
2009-06-28 19:06   ` Michael S. Tsirkin [this message]
2009-06-28 21:10     ` Paul E. McKenney
2009-06-29  9:27       ` Michael S. Tsirkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090628190636.GA14136@redhat.com \
    --to=mst@redhat.com \
    --cc=avi@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=paulmck@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox