Re: RCU + SMR for preemptive kernel/user threads.

From: "Joe Seigh" <jseigh_02@xemaps.com>
To: linux-kernel@vger.kernel.org
Subject: Re: RCU + SMR for preemptive kernel/user threads.
Date: Wed, 11 May 2005 17:47:52 -0400
Message-ID: <opsqmr52snehbc72@grunion>
In-Reply-To: 20050511150454.GA1343@us.ibm.com

On Wed, 11 May 2005 08:04:54 -0700, Paul E. McKenney <paulmck@us.ibm.com> wrote:

> On Tue, May 10, 2005 at 06:40:20PM -0400, Joe Seigh wrote:

> In classic RCU, the release is supplied by the context switch.  In your
> scheme, couldn't you do the following on the update side?
>
> 	1.  Gather up all the hazard pointers.
> 	2.  Send IPIs to all other CPUs.
> 	3.  Check the hazard pointers gathered in #1 against the
> 	    blocks to be freed.

You need to do the IPIs before you look at the hazard pointers.

>
> The read side would do the following when letting go of a hazard pointer:
>
> 	1.  Prevent the compiler from reordering memory references
> 	    (the CPU would still be free to do so).
> 	2.  Set the hazard pointer to NULL.
> 	3.  begin non-critical-section code.
>
> Checking where the IPI is received by the read side:
>
> 1.  Before this point, the updater would have seen the non-NULL hazard
>     pointer (if the hazard pointer referenced the data item that was
>     previously removed).
> 2.  Ditto.
> 3.  Before this point, the hazard pointer could be seen as NULL, but
>     the read-side CPU will also have stopped using the pointer (since
>     we are assuming precise interrupts).

The problem is you don't know when the hazard pointer was set to NULL.
It could have been set soon after the IPI was received, and any
outstanding accesses made since the IPI aren't synchronized with
respect to setting the hazard pointer to NULL.

But if you looked at the hazard pointer in the IPI interrupt handler,
you could use that information to decide whether you had to wait an
additional RCU interval.  So updater logic would be

          1.  Set global pointer to NULL.  // make object unreachable
          2.  Send IPIs  to all other CPUs
              (IPI interrupt handler will copy CPU's hazard pointers)
          3.  Check objects to be freed against copied hazard pointers.
          4.  There is no step 4.  Even if the actual hazard pointer
              that pointed to the object is NULL by this point (but
              its copy is not), you'd still have to wait an additional
              RCU interval, so you might as well leave it out as redundant.
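
The updater steps above can be sketched in user-space C as a
single-threaded simulation.  Everything named here is an assumption made
for illustration: hazard, hazard_copy, ipi_handler, updater_retire, the
fixed NCPUS, and the reader's publish-and-recheck loop, which is the usual
hazard-pointer acquire and is not spelled out in this message.  The "IPI"
is just a function call; a real IPI runs asynchronously on the target CPU.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NCPUS 4                       /* hypothetical CPU count */

static _Atomic(void *) global_ptr;    /* how readers reach the object */
static _Atomic(void *) hazard[NCPUS]; /* one hazard pointer per CPU */
static void *hazard_copy[NCPUS];      /* snapshots taken by the "IPI" */

/* Reader: publish a hazard pointer, then re-check the global pointer. */
static void *reader_acquire(int cpu)
{
    void *p;

    assert(cpu >= 0 && cpu < NCPUS);
    do {
        p = atomic_load(&global_ptr);
        atomic_store(&hazard[cpu], p);
    } while (p != atomic_load(&global_ptr));
    return p;
}

/* Reader: let go of the hazard pointer. */
static void reader_release(int cpu)
{
    assert(cpu >= 0 && cpu < NCPUS);
    atomic_store(&hazard[cpu], NULL);
}

/* Stand-in for the IPI handler: copy this CPU's hazard pointer. */
static void ipi_handler(int cpu)
{
    hazard_copy[cpu] = atomic_load(&hazard[cpu]);
}

/*
 * Updater: returns true if obj can be freed after this round, false if
 * some copied hazard pointer still names it and it must wait for another
 * round.
 */
static bool updater_retire(void *obj)
{
    atomic_store(&global_ptr, NULL);          /* 1. make unreachable */
    for (int cpu = 0; cpu < NCPUS; cpu++)     /* 2. "send IPIs" */
        ipi_handler(cpu);
    for (int cpu = 0; cpu < NCPUS; cpu++)     /* 3. check the copies */
        if (hazard_copy[cpu] == obj)
            return false;
    return true;                              /* there is no step 4 */
}
```

The decision uses only the copies, never the live hazard pointers, which
is the point of step 4: a live pointer that has since gone NULL proves
nothing about accesses still in flight.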

This is better.  I may try that trick I used to make NPTL condvars
faster to see if I can keep the Linux user-space version of this from
tanking.  It uses Unix signals instead of IPIs.
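
A minimal sketch of what substituting signals for IPIs could look like in
user space.  It does not attempt the condvar trick, which this message
does not describe.  Assumptions, all hypothetical: SIGUSR1 as the signal,
a fixed table of reader slots, reader threads that never exit, a single
updater at a time, lock-free pointer atomics, and thread-local storage
that is safe to read from a signal handler (true in practice on Linux for
an already-initialized thread, but not promised by POSIX).

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_READERS 8                 /* hypothetical fixed reader limit */

struct reader_slot {
    pthread_t tid;
    bool in_use;                      /* protected by reg_lock */
    _Atomic(void *) hazard;           /* written by the owning reader */
    _Atomic(void *) copy;             /* written by the signal handler */
};

static struct reader_slot slots[MAX_READERS];
static pthread_mutex_t reg_lock = PTHREAD_MUTEX_INITIALIZER;
static _Thread_local struct reader_slot *my_slot;
static sem_t acks;                    /* one post per handled signal */

/* Plays the role of the IPI handler: snapshot this thread's hazard pointer. */
static void snapshot_handler(int sig)
{
    int saved_errno = errno;

    (void)sig;
    if (my_slot)
        atomic_store(&my_slot->copy, atomic_load(&my_slot->hazard));
    sem_post(&acks);                  /* sem_post is async-signal-safe */
    errno = saved_errno;
}

static int smr_init(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = snapshot_handler;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    if (sem_init(&acks, 0, 0) != 0)
        return -1;
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Each reader thread calls this once before using hazard pointers. */
static int reader_register(void)
{
    int rc = -1;

    pthread_mutex_lock(&reg_lock);
    for (int i = 0; i < MAX_READERS && rc != 0; i++) {
        if (!slots[i].in_use) {
            slots[i].tid = pthread_self();
            slots[i].in_use = true;
            my_slot = &slots[i];
            rc = 0;
        }
    }
    pthread_mutex_unlock(&reg_lock);
    return rc;
}

static void hazard_set(void *p) { atomic_store(&my_slot->hazard, p); }
static void hazard_clear(void)  { atomic_store(&my_slot->hazard, NULL); }

/* Call after obj is unreachable.  True means no copied pointer names it. */
static bool updater_may_free(void *obj)
{
    int outstanding = 0;
    bool may_free = true;

    pthread_mutex_lock(&reg_lock);
    for (int i = 0; i < MAX_READERS; i++) {
        if (!slots[i].in_use)
            continue;
        if (pthread_kill(slots[i].tid, SIGUSR1) == 0)
            outstanding++;
        else
            may_free = false;         /* no fresh copy: be conservative */
    }
    while (outstanding > 0) {         /* wait for every handler to run */
        if (sem_wait(&acks) == 0) {
            outstanding--;
        } else if (errno != EINTR) {
            may_free = false;
            break;
        }
    }
    for (int i = 0; i < MAX_READERS; i++)
        if (slots[i].in_use && atomic_load(&slots[i].copy) == obj)
            may_free = false;
    pthread_mutex_unlock(&reg_lock);
    return may_free;
}
```

The semaphore wait stands in for knowing that every handler has run.  The
cost is one signal delivery per reader per round, which is the overhead
that would have to be kept down for this to be usable.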

>
> Again, not sure if all CPUs support precise interrupts.  The ones that I
> am familiar with do, at least for IPIs.
>
>> Additionally if you replace any non NULL hazard pointer value you will
>> need to use release semantics.
>
> The trick is that the IPI provides the release semantics, but only
> when needed.  Right?
>
>> There might be something you can do to avoid the extra RCU wait but
>> I'd have to study it a little more to get a better handle on the
>> trade offs.
>
> True, there will need to be either two RCU waits or two rounds of IPIs.

Yes, it might better be called RCU+SMR+RCU in that case.

>>
>> I suppose I should do some kind of formal analysis of this.  I'm figuring
>> out if this technique is interesting enough first before I go through all
>> that work.
>
> Hard to say without some experimentation.

I've done plenty of that.  I have some atomically thread-safe reference
counting implementations and a proxy GC based on those, which I compare to
an RCU for user threads implementation.  Using lock-free in user space
gives you much more dramatic performance improvements than in the kernel.
It cuts down on unnecessary context switching, which can slow things down
considerably.  Also, mutexes and rwlocks are prone to starvation.  Making
them FIFO for guaranteed service order slows them down even further.

I usually use a semaphore to keep the updaters from running out of
resources.  It slows down updater throughput, but then I'm more concerned
with reader throughput.  If I want faster recovery of resources, I'll use
the atomic refcounted pointer or the proxy based on it.  Slightly slower
updater performance, but resources are recovered more quickly.
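
One way to read the semaphore remark, as a sketch only: a counting
semaphore caps how many retired objects may be awaiting reclamation, so an
updater blocks instead of exhausting memory.  The names (free_slots,
retire, reclaim_all), the MAX_PENDING cap, and the assumption that
reclaim_all is called once a grace period has passed are all hypothetical
and not taken from the actual implementation.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_PENDING 2   /* hypothetical cap on objects awaiting reclamation */

static sem_t free_slots;                 /* counts unused pending slots */
static pthread_mutex_t pending_lock = PTHREAD_MUTEX_INITIALIZER;
static void *pending[MAX_PENDING];
static int npending;

static int throttle_init(void)
{
    npending = 0;
    return sem_init(&free_slots, 0, MAX_PENDING);
}

/*
 * Updater: queue obj for deferred freeing.  With block == true the caller
 * waits for a slot; otherwise -1 is returned when the queue is full.
 */
static int retire(void *obj, bool block)
{
    int rc;

    do {
        rc = block ? sem_wait(&free_slots) : sem_trywait(&free_slots);
    } while (rc != 0 && errno == EINTR);
    if (rc != 0)
        return -1;

    pthread_mutex_lock(&pending_lock);
    pending[npending++] = obj;           /* bounded by the semaphore count */
    pthread_mutex_unlock(&pending_lock);
    return 0;
}

/* Reclaimer: assumed to run after a grace period.  Returns objects freed. */
static int reclaim_all(void)
{
    int n;

    pthread_mutex_lock(&pending_lock);
    n = npending;
    for (int i = 0; i < n; i++)
        free(pending[i]);
    npending = 0;
    pthread_mutex_unlock(&pending_lock);

    for (int i = 0; i < n; i++)
        sem_post(&free_slots);           /* let blocked updaters proceed */
    return n;
}
```

This shows the trade-off described above: updaters stall when reclamation
lags, and readers never touch the semaphore.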

-- 
Joe Seigh
