All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lai Jiangshan <laijs@cn.fujitsu.com>
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>,
	josh@joshtriplett.org, Manfred Spraul <manfred@colorfullife.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: [PATCH] rcu,doc: lock-free update site
Date: Tue, 14 Jun 2011 17:00:11 +0800	[thread overview]
Message-ID: <4DF7231B.1080708@cn.fujitsu.com> (raw)

Add a document which describes a pattern of using RCU to implement lock-free(lockless)
update site.

Singed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 Documentation/RCU/00-INDEX                  |    2 +
 Documentation/RCU/lock-free-update-site.txt |  143 +++++++++++++++++++++++++++
 2 files changed, 145 insertions(+), 0 deletions(-)

diff --git a/Documentation/RCU/00-INDEX b/Documentation/RCU/00-INDEX
index 1d7a885..7178dd5 100644
--- a/Documentation/RCU/00-INDEX
+++ b/Documentation/RCU/00-INDEX
@@ -8,6 +8,8 @@ listRCU.txt
 	- Using RCU to Protect Read-Mostly Linked Lists
 lockdep.txt
 	- RCU and lockdep checking
+lock-free-update-site.txt
+	- RCU pattern of lock-free update site
 NMI-RCU.txt
 	- Using RCU to Protect Dynamic NMI Handlers
 rcubarrier.txt
diff --git a/Documentation/RCU/lock-free-update-site.txt b/Documentation/RCU/lock-free-update-site.txt
new file mode 100644
index 0000000..9b6984a
--- /dev/null
+++ b/Documentation/RCU/lock-free-update-site.txt
@@ -0,0 +1,143 @@
+Lock-free(lockless) update site
+
+This article describes a pattern of using RCU to implement lock-free(lockless)
+update site. RCU update site is considered call-rare and it is protected
+by a update-site lock generally. But blocking algorithms are undesirable
+in some cases for some reasons, thus, this pattern may help.
+
+This pattern can only protect a single pointer which is the only reference
+of the object.
+
+object pointer:
+
+struct my_struct *gptr;
+
+wait-free read site:
+{
+	rcu_read_lock();
+	ptr = rcu_dereference(gptr);
+	my_struct_read(ptr);
+	rcu_read_unlock();
+}
+
+lock-free update site(update as new):
+{
+	new_ptr = my_struct_alloc();
+	for (;;) {
+		rcu_read_lock();
+
+		old_ptr = rcu_dereference(gptr);
+
+		/* copy data from old_ptr to new_ptr and update it */
+		my_struct_update(new_ptr, old_ptr);
+
+		/* atomically publish the new_ptr and de-publish the old_ptr */
+		if (cmpxchg(&gptr, old_ptr, new_ptr) == old_ptr) {
+			rcu_read_unlock();
+
+			/*
+			 * free it after a grace-period, read sites and other
+			 * update sites may be reading it in parallel.
+			 */
+			kfree_rcu(old_ptr);
+
+			/* success, exit the loop */
+			break;
+		} else {
+			rcu_read_unlock();
+
+			/*
+			 * Other update site successfully update it, we need
+			 * to read the latest data and try the update again.
+			 *
+			 * If the other update site did the same thing we need,
+			 * we can free the new_ptr and exit this loop too,
+			 * and it may becomes a wait-free algorithm.
+			 */
+		}
+	}
+}
+
+1) In update site, rcu_read_lock() is needed for my_struct_update().
+
+   In this kind of lock-free update site, many update sides
+   may run parallel, other update side may had successfully
+   de-published old_ptr and tried to free it. rcu_read_lock()
+   prevents old_ptr from freeing and ensures it valid for
+   my_struct_update().
+
+2) In update site, rcu_read_lock() is needed until cmpxchg() finished.
+
+   Although the content of old_ptr is not accessed when cmpxchg(),
+   but old_ptr should not be freed until cmpxchg() finished.
+   Otherwise we may miss other successful update and publish a
+   new_ptr without information from the latest object.
+
+   Example:(wrong update site code, rcu_read_unlock() is moved up before cmpxchg())
+   (cause ABA-problem: http://en.wikipedia.org/wiki/ABA_problem)
+
+   CPU0						CPU1
+   rcu_read_lock()
+   old_ptr = rcu_dereference(gptr);
+   my_struct_update(new_ptr, old_ptr);
+   rcu_read_unlock();
+   .						successfully update, now gptr=other_ptr
+   .						old_ptr is freed
+   .
+   .						other update, my_struct_alloc() returns old_ptr
+   .						successfully publish and de-publish
+   .						now gptr=old_ptr again
+   .
+   cmpxchg(&gptr, old_ptr, new_ptr)
+     cmpxchg() success, but the 2 updates
+     of CPU1 are completely missed.
+
+   This exmaple shows rcu_read_lock() is needed to prevent old_ptr from reusing
+   before cmpxchg() finished and to prevent ABA-problem.
+
+3) Beware NULL pointer.
+
+   Some use cases may set gptr to NULL when needed. (the previous gptr != NULL)
+
+lock-free update site(dispose, wait-free):
+{
+	old_ptr = xchg(&gptr, NULL);
+	if (old_ptr != NULL)
+		kfree_rcu(old_ptr);
+}
+
+   This code cause NULL reusing and may cause ABA-problem like above example:
+
+   CPU0						CPU1
+   rcu_read_lock()
+   old_ptr = rcu_dereference(gptr);
+   /* old_ptr = NULL */
+   my_struct_update(new_ptr, NULL);
+   .						successfully update, now gptr=other_ptr
+   .
+   .						successfully dispose
+   .						now gptr=NULL again
+   .
+   cmpxchg(&gptr, NULL, new_ptr)
+     cmpxchg() success, but the update
+     and the dispose of CPU1 are missed
+     consideration by CPU0.
+   rcu_read_unlock();
+
+   In many use cases, these behaviors are OK. In these use cases,
+   my_struct_update(new_ptr, NULL) give us the same result even we retry.
+
+   But in some raw use cases(I can't find any use-case now, I believe it exist),
+   the missed considerations of the updates are not acceptable, in this case,
+   we should use different null-value for NULL pointer for every disposing.
+
+lock-free update site(dispose, wait-free, paranoid version):
+{
+	null_ptr = alloc_null_ptr();
+	old_ptr = xchg(&gptr, null_ptr);
+	if (is_null_ptr(old_ptr))
+		free_null_ptr_by_rcu_for_preventing_it_from_reusing(old_ptr);
+	else
+		kfree_rcu(old_ptr);
+}
+

             reply	other threads:[~2011-06-14  9:00 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-06-14  9:00 Lai Jiangshan [this message]
2011-06-14 12:50 ` [PATCH] rcu,doc: lock-free update site Mathieu Desnoyers
     [not found] ` <BLU0-SMTP15CB1FCA05185652974DF796680@phx.gbl>
2011-06-16  2:40   ` Lai Jiangshan
2011-06-16  4:06     ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4DF7231B.1080708@cn.fujitsu.com \
    --to=laijs@cn.fujitsu.com \
    --cc=josh@joshtriplett.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=manfred@colorfullife.com \
    --cc=mathieu.desnoyers@polymtl.ca \
    --cc=paulmck@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.