[PATCH] rcu,doc: lock-free update site

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH] rcu,doc: lock-free update site
@ 2011-06-14  9:00 Lai Jiangshan
  2011-06-14 12:50 ` Mathieu Desnoyers
       [not found] ` <BLU0-SMTP15CB1FCA05185652974DF796680@phx.gbl>
  0 siblings, 2 replies; 4+ messages in thread
From: Lai Jiangshan @ 2011-06-14  9:00 UTC (permalink / raw)
  To: Paul E. McKenney, Mathieu Desnoyers, josh, Manfred Spraul, LKML

Add a document which describes a pattern of using RCU to implement lock-free(lockless)
update site.

Singed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 Documentation/RCU/00-INDEX                  |    2 +
 Documentation/RCU/lock-free-update-site.txt |  143 +++++++++++++++++++++++++++
 2 files changed, 145 insertions(+), 0 deletions(-)

diff --git a/Documentation/RCU/00-INDEX b/Documentation/RCU/00-INDEX
index 1d7a885..7178dd5 100644
--- a/Documentation/RCU/00-INDEX
+++ b/Documentation/RCU/00-INDEX
@@ -8,6 +8,8 @@ listRCU.txt
 	- Using RCU to Protect Read-Mostly Linked Lists
 lockdep.txt
 	- RCU and lockdep checking
+lock-free-update-site.txt
+	- RCU pattern of lock-free update site
 NMI-RCU.txt
 	- Using RCU to Protect Dynamic NMI Handlers
 rcubarrier.txt
diff --git a/Documentation/RCU/lock-free-update-site.txt b/Documentation/RCU/lock-free-update-site.txt
new file mode 100644
index 0000000..9b6984a
--- /dev/null
+++ b/Documentation/RCU/lock-free-update-site.txt
@@ -0,0 +1,143 @@
+Lock-free(lockless) update site
+
+This article describes a pattern of using RCU to implement lock-free(lockless)
+update site. RCU update site is considered call-rare and it is protected
+by a update-site lock generally. But blocking algorithms are undesirable
+in some cases for some reasons, thus, this pattern may help.
+
+This pattern can only protect a single pointer which is the only reference
+of the object.
+
+object pointer:
+
+struct my_struct *gptr;
+
+wait-free read site:
+{
+	rcu_read_lock();
+	ptr = rcu_dereference(gptr);
+	my_struct_read(ptr);
+	rcu_read_unlock();
+}
+
+lock-free update site(update as new):
+{
+	new_ptr = my_struct_alloc();
+	for (;;) {
+		rcu_read_lock();
+
+		old_ptr = rcu_dereference(gptr);
+
+		/* copy data from old_ptr to new_ptr and update it */
+		my_struct_update(new_ptr, old_ptr);
+
+		/* atomically publish the new_ptr and de-publish the old_ptr */
+		if (cmpxchg(&gptr, old_ptr, new_ptr) == old_ptr) {
+			rcu_read_unlock();
+
+			/*
+			 * free it after a grace-period, read sites and other
+			 * update sites may be reading it in parallel.
+			 */
+			kfree_rcu(old_ptr);
+
+			/* success, exit the loop */
+			break;
+		} else {
+			rcu_read_unlock();
+
+			/*
+			 * Other update site successfully update it, we need
+			 * to read the latest data and try the update again.
+			 *
+			 * If the other update site did the same thing we need,
+			 * we can free the new_ptr and exit this loop too,
+			 * and it may becomes a wait-free algorithm.
+			 */
+		}
+	}
+}
+
+1) In update site, rcu_read_lock() is needed for my_struct_update().
+
+   In this kind of lock-free update site, many update sides
+   may run parallel, other update side may had successfully
+   de-published old_ptr and tried to free it. rcu_read_lock()
+   prevents old_ptr from freeing and ensures it valid for
+   my_struct_update().
+
+2) In update site, rcu_read_lock() is needed until cmpxchg() finished.
+
+   Although the content of old_ptr is not accessed when cmpxchg(),
+   but old_ptr should not be freed until cmpxchg() finished.
+   Otherwise we may miss other successful update and publish a
+   new_ptr without information from the latest object.
+
+   Example:(wrong update site code, rcu_read_unlock() is moved up before cmpxchg())
+   (cause ABA-problem: http://en.wikipedia.org/wiki/ABA_problem)
+
+   CPU0						CPU1
+   rcu_read_lock()
+   old_ptr = rcu_dereference(gptr);
+   my_struct_update(new_ptr, old_ptr);
+   rcu_read_unlock();
+   .						successfully update, now gptr=other_ptr
+   .						old_ptr is freed
+   .
+   .						other update, my_struct_alloc() returns old_ptr
+   .						successfully publish and de-publish
+   .						now gptr=old_ptr again
+   .
+   cmpxchg(&gptr, old_ptr, new_ptr)
+     cmpxchg() success, but the 2 updates
+     of CPU1 are completely missed.
+
+   This exmaple shows rcu_read_lock() is needed to prevent old_ptr from reusing
+   before cmpxchg() finished and to prevent ABA-problem.
+
+3) Beware NULL pointer.
+
+   Some use cases may set gptr to NULL when needed. (the previous gptr != NULL)
+
+lock-free update site(dispose, wait-free):
+{
+	old_ptr = xchg(&gptr, NULL);
+	if (old_ptr != NULL)
+		kfree_rcu(old_ptr);
+}
+
+   This code cause NULL reusing and may cause ABA-problem like above example:
+
+   CPU0						CPU1
+   rcu_read_lock()
+   old_ptr = rcu_dereference(gptr);
+   /* old_ptr = NULL */
+   my_struct_update(new_ptr, NULL);
+   .						successfully update, now gptr=other_ptr
+   .
+   .						successfully dispose
+   .						now gptr=NULL again
+   .
+   cmpxchg(&gptr, NULL, new_ptr)
+     cmpxchg() success, but the update
+     and the dispose of CPU1 are missed
+     consideration by CPU0.
+   rcu_read_unlock();
+
+   In many use cases, these behaviors are OK. In these use cases,
+   my_struct_update(new_ptr, NULL) give us the same result even we retry.
+
+   But in some raw use cases(I can't find any use-case now, I believe it exist),
+   the missed considerations of the updates are not acceptable, in this case,
+   we should use different null-value for NULL pointer for every disposing.
+
+lock-free update site(dispose, wait-free, paranoid version):
+{
+	null_ptr = alloc_null_ptr();
+	old_ptr = xchg(&gptr, null_ptr);
+	if (is_null_ptr(old_ptr))
+		free_null_ptr_by_rcu_for_preventing_it_from_reusing(old_ptr);
+	else
+		kfree_rcu(old_ptr);
+}
+

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] rcu,doc: lock-free update site
  2011-06-14  9:00 [PATCH] rcu,doc: lock-free update site Lai Jiangshan
@ 2011-06-14 12:50 ` Mathieu Desnoyers
       [not found] ` <BLU0-SMTP15CB1FCA05185652974DF796680@phx.gbl>
  1 sibling, 0 replies; 4+ messages in thread
From: Mathieu Desnoyers @ 2011-06-14 12:50 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: Paul E. McKenney, josh, Manfred Spraul, LKML

* Lai Jiangshan (laijs@cn.fujitsu.com) wrote:
> Add a document which describes a pattern of using RCU to implement lock-free(lockless)
> update site.
> 
[...]
> @@ -0,0 +1,143 @@
> +Lock-free(lockless) update site
> +
> +This article describes a pattern of using RCU to implement lock-free(lockless)
> +update site. RCU update site is considered call-rare and it is protected
> +by a update-site lock generally. But blocking algorithms are undesirable
> +in some cases for some reasons, thus, this pattern may help.

Hi Lai,

Yes, using this kind of rcu read-side lock to protect against the
cmpxchg ABA problem is well-known (to me at least) ;) I used this
technique in the userspace RCU library "lock-free queue" and "lock-free
stack" in 2010*. Please feel free to dig through my RCU data containers code
to bring in more data structure examples:

http://git.lttng.org/?p=userspace-rcu.git;a=blob;f=urcu/static/rculfqueue.h;h=b627e450cfdd581692b474d89437e3fd47f18463;hb=HEAD

http://git.lttng.org/?p=userspace-rcu.git;a=blob;f=urcu/static/rculfqueue.h;h=b627e450cfdd581692b474d89437e3fd47f18463;hb=HEAD

Thanks!

Mathieu

* AFAIK I introduced this technique using RCU read-side C.S. to deal
  with cmpxchg ABA at that point, but someone might have thought about
  it before me without my knowledge. My litterature survey so far
  indicates that using a double-word CAS on a pointer/counter was one of
  the usual technique used to protect against cmpxchg ABA so far. Other
  techniques imply allocating elements in a limited-size array (so a
  simple cmpxchg can update the array index and counter atomically),
  Hasard Pointers, or having a full-blown GC which provides similar
  guarantees to the RCU grace period with a read-side lock held.
  Ref.:

  [1998] Maged Michael, Michael Scott "Simple, fast, and practical non-blocking and blocking concurrent queue algorithms"
  [2002] Maged M.Michael "Safe memory reclamation for dynamic lock-free objects using atomic reads and writes"
  [2003] Maged M.Michael "Hazard Pointers: Safe memory reclamation for lock-free objects"

> +
> +This pattern can only protect a single pointer which is the only reference
> +of the object.
> +
> +object pointer:
> +
> +struct my_struct *gptr;
> +
> +wait-free read site:
> +{
> +	rcu_read_lock();
> +	ptr = rcu_dereference(gptr);
> +	my_struct_read(ptr);
> +	rcu_read_unlock();
> +}
> +
> +lock-free update site(update as new):
> +{
> +	new_ptr = my_struct_alloc();
> +	for (;;) {
> +		rcu_read_lock();
> +
> +		old_ptr = rcu_dereference(gptr);
> +
> +		/* copy data from old_ptr to new_ptr and update it */
> +		my_struct_update(new_ptr, old_ptr);
> +
> +		/* atomically publish the new_ptr and de-publish the old_ptr */
> +		if (cmpxchg(&gptr, old_ptr, new_ptr) == old_ptr) {
> +			rcu_read_unlock();
> +
> +			/*
> +			 * free it after a grace-period, read sites and other
> +			 * update sites may be reading it in parallel.
> +			 */
> +			kfree_rcu(old_ptr);
> +
> +			/* success, exit the loop */
> +			break;
> +		} else {
> +			rcu_read_unlock();
> +
> +			/*
> +			 * Other update site successfully update it, we need
> +			 * to read the latest data and try the update again.
> +			 *
> +			 * If the other update site did the same thing we need,
> +			 * we can free the new_ptr and exit this loop too,
> +			 * and it may becomes a wait-free algorithm.
> +			 */
> +		}



> +	}
> +}
> +
> +1) In update site, rcu_read_lock() is needed for my_struct_update().
> +
> +   In this kind of lock-free update site, many update sides
> +   may run parallel, other update side may had successfully
> +   de-published old_ptr and tried to free it. rcu_read_lock()
> +   prevents old_ptr from freeing and ensures it valid for
> +   my_struct_update().
> +
> +2) In update site, rcu_read_lock() is needed until cmpxchg() finished.
> +
> +   Although the content of old_ptr is not accessed when cmpxchg(),
> +   but old_ptr should not be freed until cmpxchg() finished.
> +   Otherwise we may miss other successful update and publish a
> +   new_ptr without information from the latest object.
> +
> +   Example:(wrong update site code, rcu_read_unlock() is moved up before cmpxchg())
> +   (cause ABA-problem: http://en.wikipedia.org/wiki/ABA_problem)
> +
> +   CPU0						CPU1
> +   rcu_read_lock()
> +   old_ptr = rcu_dereference(gptr);
> +   my_struct_update(new_ptr, old_ptr);
> +   rcu_read_unlock();
> +   .						successfully update, now gptr=other_ptr
> +   .						old_ptr is freed
> +   .
> +   .						other update, my_struct_alloc() returns old_ptr
> +   .						successfully publish and de-publish
> +   .						now gptr=old_ptr again
> +   .
> +   cmpxchg(&gptr, old_ptr, new_ptr)
> +     cmpxchg() success, but the 2 updates
> +     of CPU1 are completely missed.
> +
> +   This exmaple shows rcu_read_lock() is needed to prevent old_ptr from reusing
> +   before cmpxchg() finished and to prevent ABA-problem.
> +
> +3) Beware NULL pointer.
> +
> +   Some use cases may set gptr to NULL when needed. (the previous gptr != NULL)
> +
> +lock-free update site(dispose, wait-free):
> +{
> +	old_ptr = xchg(&gptr, NULL);
> +	if (old_ptr != NULL)
> +		kfree_rcu(old_ptr);
> +}
> +
> +   This code cause NULL reusing and may cause ABA-problem like above example:
> +
> +   CPU0						CPU1
> +   rcu_read_lock()
> +   old_ptr = rcu_dereference(gptr);
> +   /* old_ptr = NULL */
> +   my_struct_update(new_ptr, NULL);
> +   .						successfully update, now gptr=other_ptr
> +   .
> +   .						successfully dispose
> +   .						now gptr=NULL again
> +   .
> +   cmpxchg(&gptr, NULL, new_ptr)
> +     cmpxchg() success, but the update
> +     and the dispose of CPU1 are missed
> +     consideration by CPU0.
> +   rcu_read_unlock();
> +
> +   In many use cases, these behaviors are OK. In these use cases,
> +   my_struct_update(new_ptr, NULL) give us the same result even we retry.
> +
> +   But in some raw use cases(I can't find any use-case now, I believe it exist),
> +   the missed considerations of the updates are not acceptable, in this case,
> +   we should use different null-value for NULL pointer for every disposing.
> +
> +lock-free update site(dispose, wait-free, paranoid version):
> +{
> +	null_ptr = alloc_null_ptr();
> +	old_ptr = xchg(&gptr, null_ptr);
> +	if (is_null_ptr(old_ptr))
> +		free_null_ptr_by_rcu_for_preventing_it_from_reusing(old_ptr);
> +	else
> +		kfree_rcu(old_ptr);
> +}
> +

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] rcu,doc: lock-free update site
       [not found] ` <BLU0-SMTP15CB1FCA05185652974DF796680@phx.gbl>
@ 2011-06-16  2:40   ` Lai Jiangshan
  2011-06-16  4:06     ` Paul E. McKenney
  0 siblings, 1 reply; 4+ messages in thread
From: Lai Jiangshan @ 2011-06-16  2:40 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: Paul E. McKenney, josh, Manfred Spraul, LKML

On 06/14/2011 08:50 PM, Mathieu Desnoyers wrote:
> * Lai Jiangshan (laijs@cn.fujitsu.com) wrote:
>> Add a document which describes a pattern of using RCU to implement lock-free(lockless)
>> update site.
>>
> [...]
>> @@ -0,0 +1,143 @@
>> +Lock-free(lockless) update site
>> +
>> +This article describes a pattern of using RCU to implement lock-free(lockless)
>> +update site. RCU update site is considered call-rare and it is protected
>> +by a update-site lock generally. But blocking algorithms are undesirable
>> +in some cases for some reasons, thus, this pattern may help.
> 
> Hi Lai,
> 
> Yes, using this kind of rcu read-side lock to protect against the
> cmpxchg ABA problem is well-known (to me at least) ;) I used this
> technique in the userspace RCU library "lock-free queue" and "lock-free
> stack" in 2010*. Please feel free to dig through my RCU data containers code
> to bring in more data structure examples:
> 
> http://git.lttng.org/?p=userspace-rcu.git;a=blob;f=urcu/static/rculfqueue.h;h=b627e450cfdd581692b474d89437e3fd47f18463;hb=HEAD
> 
> http://git.lttng.org/?p=userspace-rcu.git;a=blob;f=urcu/static/rculfqueue.h;h=b627e450cfdd581692b474d89437e3fd47f18463;hb=HEAD
> 
> Thanks!
> 
> Mathieu


Hi, Mathieu

I will try to make simple update site wait-free, so I wrote a simple guide/document
at first. I didn't notice your work. Your work is excellent,
I will add some references. I hope you rewrite/pretty this document also.

The lock-free stack is a good example, but the lock-free queue is not good here I think.

Thanks,
Lai


> 
> * AFAIK I introduced this technique using RCU read-side C.S. to deal
>   with cmpxchg ABA at that point, but someone might have thought about
>   it before me without my knowledge. My litterature survey so far
>   indicates that using a double-word CAS on a pointer/counter was one of
>   the usual technique used to protect against cmpxchg ABA so far. Other
>   techniques imply allocating elements in a limited-size array (so a
>   simple cmpxchg can update the array index and counter atomically),
>   Hasard Pointers, or having a full-blown GC which provides similar
>   guarantees to the RCU grace period with a read-side lock held.
>   Ref.:
> 
>   [1998] Maged Michael, Michael Scott "Simple, fast, and practical non-blocking and blocking concurrent queue algorithms"
>   [2002] Maged M.Michael "Safe memory reclamation for dynamic lock-free objects using atomic reads and writes"
>   [2003] Maged M.Michael "Hazard Pointers: Safe memory reclamation for lock-free objects"
> 


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] rcu,doc: lock-free update site
  2011-06-16  2:40   ` Lai Jiangshan
@ 2011-06-16  4:06     ` Paul E. McKenney
  0 siblings, 0 replies; 4+ messages in thread
From: Paul E. McKenney @ 2011-06-16  4:06 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: Mathieu Desnoyers, josh, Manfred Spraul, LKML

On Thu, Jun 16, 2011 at 10:40:16AM +0800, Lai Jiangshan wrote:
> On 06/14/2011 08:50 PM, Mathieu Desnoyers wrote:
> > * Lai Jiangshan (laijs@cn.fujitsu.com) wrote:
> >> Add a document which describes a pattern of using RCU to implement lock-free(lockless)
> >> update site.
> >>
> > [...]
> >> @@ -0,0 +1,143 @@
> >> +Lock-free(lockless) update site
> >> +
> >> +This article describes a pattern of using RCU to implement lock-free(lockless)
> >> +update site. RCU update site is considered call-rare and it is protected
> >> +by a update-site lock generally. But blocking algorithms are undesirable
> >> +in some cases for some reasons, thus, this pattern may help.
> > 
> > Hi Lai,
> > 
> > Yes, using this kind of rcu read-side lock to protect against the
> > cmpxchg ABA problem is well-known (to me at least) ;) I used this
> > technique in the userspace RCU library "lock-free queue" and "lock-free
> > stack" in 2010*. Please feel free to dig through my RCU data containers code
> > to bring in more data structure examples:
> > 
> > http://git.lttng.org/?p=userspace-rcu.git;a=blob;f=urcu/static/rculfqueue.h;h=b627e450cfdd581692b474d89437e3fd47f18463;hb=HEAD
> > 
> > http://git.lttng.org/?p=userspace-rcu.git;a=blob;f=urcu/static/rculfqueue.h;h=b627e450cfdd581692b474d89437e3fd47f18463;hb=HEAD
> > 
> > Thanks!
> > 
> > Mathieu
> 
> 
> Hi, Mathieu
> 
> I will try to make simple update site wait-free, so I wrote a simple guide/document
> at first. I didn't notice your work. Your work is excellent,
> I will add some references. I hope you rewrite/pretty this document also.
> 
> The lock-free stack is a good example, but the lock-free queue is not good here I think.

I look forward to seeing the update!

							Thanx, Paul

> Thanks,
> Lai
> 
> 
> > 
> > * AFAIK I introduced this technique using RCU read-side C.S. to deal
> >   with cmpxchg ABA at that point, but someone might have thought about
> >   it before me without my knowledge. My litterature survey so far
> >   indicates that using a double-word CAS on a pointer/counter was one of
> >   the usual technique used to protect against cmpxchg ABA so far. Other
> >   techniques imply allocating elements in a limited-size array (so a
> >   simple cmpxchg can update the array index and counter atomically),
> >   Hasard Pointers, or having a full-blown GC which provides similar
> >   guarantees to the RCU grace period with a read-side lock held.
> >   Ref.:
> > 
> >   [1998] Maged Michael, Michael Scott "Simple, fast, and practical non-blocking and blocking concurrent queue algorithms"
> >   [2002] Maged M.Michael "Safe memory reclamation for dynamic lock-free objects using atomic reads and writes"
> >   [2003] Maged M.Michael "Hazard Pointers: Safe memory reclamation for lock-free objects"
> > 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2011-06-16  4:06 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-06-14  9:00 [PATCH] rcu,doc: lock-free update site Lai Jiangshan
2011-06-14 12:50 ` Mathieu Desnoyers
     [not found] ` <BLU0-SMTP15CB1FCA05185652974DF796680@phx.gbl>
2011-06-16  2:40   ` Lai Jiangshan
2011-06-16  4:06     ` Paul E. McKenney

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox