linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org, jiangshanlai@gmail.com,
	dipankar@in.ibm.com, akpm@linux-foundation.org,
	mathieu.desnoyers@efficios.com, josh@joshtriplett.org,
	tglx@linutronix.de, peterz@infradead.org, rostedt@goodmis.org,
	dhowells@redhat.com, edumazet@google.com, dvhart@linux.intel.com,
	fweisbec@gmail.com, oleg@redhat.com, bobby.prani@gmail.com,
	Patrick Marlier <patrick.marlier@gmail.com>
Subject: Re: [PATCH tip/core/rcu 11/13] rculist: Make list_entry_rcu() use lockless_dereference()
Date: Mon, 26 Oct 2015 07:55:52 -0700	[thread overview]
Message-ID: <20151026145552.GG5105@linux.vnet.ibm.com> (raw)
In-Reply-To: <20151026084506.GA28423@gmail.com>

On Mon, Oct 26, 2015 at 09:45:06AM +0100, Ingo Molnar wrote:
> 
> * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > From: Patrick Marlier <patrick.marlier@gmail.com>
> > 
> > The current list_entry_rcu() implementation copies the pointer to a stack
> > variable, then invokes rcu_dereference_raw() on it.  This results in an
> > additional store-load pair.  Now, most compilers will emit normal store
> > and load instructions, which might seem to be of negligible overhead,
> > but this results in a load-hit-store situation that can cause surprisingly
> > long pipeline stalls, even on modern microprocessors.  The problem is
> > that it takes time for the store to get the store buffer updated, which
> > can delay the subsequent load, which immediately follows.
> > 
> > This commit therefore switches to the lockless_dereference() primitive,
> > which does not expect the __rcu annotations (that are anyway not present
> > in the list_head structure) and which, like rcu_dereference_raw(),
> > does not check for an enclosing RCU read-side critical section.
> > Most importantly, it does not copy the pointer, thus avoiding the
> > load-hit-store overhead.
> > 
> > Signed-off-by: Patrick Marlier <patrick.marlier@gmail.com>
> > [ paulmck: Switched to lockless_dereference() to suppress sparse warnings. ]
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > ---
> >  include/linux/rculist.h | 5 +----
> >  1 file changed, 1 insertion(+), 4 deletions(-)
> > 
> > diff --git a/include/linux/rculist.h b/include/linux/rculist.h
> > index 17c6b1f84a77..5ed540986019 100644
> > --- a/include/linux/rculist.h
> > +++ b/include/linux/rculist.h
> > @@ -247,10 +247,7 @@ static inline void list_splice_init_rcu(struct list_head *list,
> >   * primitives such as list_add_rcu() as long as it's guarded by rcu_read_lock().
> >   */
> >  #define list_entry_rcu(ptr, type, member) \
> > -({ \
> > -	typeof(*ptr) __rcu *__ptr = (typeof(*ptr) __rcu __force *)ptr; \
> > -	container_of((typeof(ptr))rcu_dereference_raw(__ptr), type, member); \
> > -})
> > +	container_of(lockless_dereference(ptr), type, member)
> 
> So this commit:
> 
>   8db70b132dd5 ("rculist: Make list_entry_rcu() use lockless_dereference()")
> 
> when merged with Linus's latest tree, triggers the following build failure on 
> allyesconfig/allmodconfig x86:
> 
> triton:~/tip> make fs/fs-writeback.o
>   CHK     include/config/kernel.release
>   CHK     include/generated/uapi/linux/version.h
>   CHK     include/generated/utsrelease.h
>   CHK     include/generated/bounds.h
>   CHK     include/generated/timeconst.h
>   CHK     include/generated/asm-offsets.h
>   CALL    scripts/checksyscalls.sh
>   CC      fs/fs-writeback.o
> In file included from fs/fs-writeback.c:16:0:
> fs/fs-writeback.c: In function ‘bdi_split_work_to_wbs’:
> include/linux/compiler.h:281:20: error: lvalue required as unary ‘&’ operand
>    __read_once_size(&(x), __u.__c, sizeof(x));  \
>                     ^
> include/linux/kernel.h:811:49: note: in definition of macro ‘container_of’
>   const typeof( ((type *)0)->member ) *__mptr = (ptr); \
>                                                  ^
> include/linux/compiler.h:286:22: note: in expansion of macro ‘__READ_ONCE’
>  #define READ_ONCE(x) __READ_ONCE(x, 1)
>                       ^
> include/linux/compiler.h:533:26: note: in expansion of macro ‘READ_ONCE’
>   typeof(p) _________p1 = READ_ONCE(p); \
>                           ^
> include/linux/rculist.h:250:15: note: in expansion of macro ‘lockless_dereference’
>   container_of(lockless_dereference(ptr), type, member)
>                ^
> fs/fs-writeback.c:782:29: note: in expansion of macro ‘list_entry_rcu’
>   struct bdi_writeback *wb = list_entry_rcu(&bdi->wb_list,
>                              ^
> include/linux/compiler.h:283:28: error: lvalue required as unary ‘&’ operand
>    __read_once_size_nocheck(&(x), __u.__c, sizeof(x)); \
>                             ^
> include/linux/kernel.h:811:49: note: in definition of macro ‘container_of’
>   const typeof( ((type *)0)->member ) *__mptr = (ptr); \
>                                                  ^
> include/linux/compiler.h:286:22: note: in expansion of macro ‘__READ_ONCE’
>  #define READ_ONCE(x) __READ_ONCE(x, 1)
>                       ^
> include/linux/compiler.h:533:26: note: in expansion of macro ‘READ_ONCE’
>   typeof(p) _________p1 = READ_ONCE(p); \
>                           ^
> include/linux/rculist.h:250:15: note: in expansion of macro ‘lockless_dereference’
>   container_of(lockless_dereference(ptr), type, member)
>                ^
> fs/fs-writeback.c:782:29: note: in expansion of macro ‘list_entry_rcu’
>   struct bdi_writeback *wb = list_entry_rcu(&bdi->wb_list,
>                              ^
> scripts/Makefile.build:258: recipe for target 'fs/fs-writeback.o' failed
> make[1]: *** [fs/fs-writeback.o] Error 1
> Makefile:1526: recipe for target 'fs/fs-writeback.o' failed
> make: *** [fs/fs-writeback.o] Error 2
> 
> It's this new usage in fs/fs-writeback.c:
> 
> static void bdi_split_work_to_wbs(struct backing_dev_info *bdi,
>                                   struct wb_writeback_work *base_work,
>                                   bool skip_if_busy)
> {
>         struct bdi_writeback *last_wb = NULL;
>         struct bdi_writeback *wb = list_entry_rcu(&bdi->wb_list,

I believe that the above should instead be:

	struct bdi_writeback *wb = list_entry_rcu(bdi->wb_list.next,

After all, RCU read-side list primitives need to fetch pointers in order
to traverse those pointers in an RCU-safe manner.  The patch below clears
this up for me, does it also work for you?

							Thanx, Paul

------------------------------------------------------------------------

commit 9221c4e78cc3511cd15b1433617ae2548ad8f631
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Mon Oct 26 07:47:20 2015 -0700

    fs: Fix argument to list_entry_rcu()
    
    The list_entry_rcu() macro requires that its first argument be the
    pointer to the list_head contained in the structure whose pointer is to
    be returned, but fetched in an RCU-safe manner.  This commit therefore
    fixes the use in bdi_split_work_to_wbs() to follow this convention.
    
    Reported-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 29e4599f6fc1..126d4de8faad 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -779,7 +779,7 @@ static void bdi_split_work_to_wbs(struct backing_dev_info *bdi,
 				  bool skip_if_busy)
 {
 	struct bdi_writeback *last_wb = NULL;
-	struct bdi_writeback *wb = list_entry_rcu(&bdi->wb_list,
+	struct bdi_writeback *wb = list_entry_rcu(bdi->wb_list.next,
 						struct bdi_writeback, bdi_node);
 
 	might_sleep();


  reply	other threads:[~2015-10-26 14:56 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-06 16:13 [PATCH tip/core/rcu 0/13] Miscellaneous fixes for 4.4 Paul E. McKenney
2015-10-06 16:13 ` [PATCH tip/core/rcu 01/13] sched: Export sched_setscheduler_nocheck Paul E. McKenney
2015-10-06 16:13   ` [PATCH tip/core/rcu 02/13] rcu: Use rcu_callback_t in call_rcu*() and friends Paul E. McKenney
2015-10-06 16:13   ` [PATCH tip/core/rcu 03/13] rcu: Use call_rcu_func_t to replace explicit type equivalents Paul E. McKenney
2015-10-06 16:13   ` [PATCH tip/core/rcu 04/13] rcu: Don't disable preemption for Tiny and Tree RCU readers Paul E. McKenney
2015-10-06 16:20     ` [Kernel networking modules.] OSI levels 2 & 3, Assistance - If anyone knows anyone in the US. North West region John D Allen, Leveridge Systems INC
2015-10-06 16:44     ` [PATCH tip/core/rcu 04/13] rcu: Don't disable preemption for Tiny and Tree RCU readers Josh Triplett
2015-10-06 17:01       ` Paul E. McKenney
2015-10-06 17:16         ` Josh Triplett
2015-10-06 17:42           ` Paul E. McKenney
2015-10-06 17:46             ` Josh Triplett
2015-10-06 20:05             ` Peter Zijlstra
2015-10-06 20:18               ` Paul E. McKenney
2015-10-06 20:52                 ` Peter Zijlstra
2015-10-06 21:05                   ` Paul E. McKenney
2015-10-07  7:19                     ` Peter Zijlstra
2015-10-06 16:13   ` [PATCH tip/core/rcu 05/13] rcu: Eliminate panic when silly boot-time fanout specified Paul E. McKenney
2015-10-06 16:13   ` [PATCH tip/core/rcu 06/13] rcu: Add online/offline info to stall warning message Paul E. McKenney
2015-10-06 17:15     ` Josh Triplett
2015-10-06 16:13   ` [PATCH tip/core/rcu 07/13] rcu: Move preemption disabling out of __srcu_read_lock() Paul E. McKenney
2015-10-06 17:18     ` Josh Triplett
2015-10-06 17:36       ` Paul E. McKenney
2015-10-06 17:43         ` Josh Triplett
2015-10-06 18:07           ` Paul E. McKenney
2015-10-06 20:07     ` Peter Zijlstra
2015-10-06 20:19       ` Paul E. McKenney
2015-10-06 20:32         ` Peter Zijlstra
2015-10-06 21:03           ` Paul E. McKenney
2015-10-07  7:20             ` Peter Zijlstra
2015-10-07 14:18               ` Paul E. McKenney
2015-10-06 16:13   ` [PATCH tip/core/rcu 08/13] rcu: Finish folding ->fqs_state into ->gp_state Paul E. McKenney
2015-10-06 16:13   ` [PATCH tip/core/rcu 09/13] rcu: Correct comment for values of ->gp_state field Paul E. McKenney
2015-10-06 16:13   ` [PATCH tip/core/rcu 10/13] rcu: Add rcu_pointer_handoff() Paul E. McKenney
2015-10-06 17:21     ` Josh Triplett
2015-10-06 17:31       ` Paul E. McKenney
2015-10-06 17:36         ` Josh Triplett
2015-10-06 20:27     ` Peter Zijlstra
2015-10-06 21:02       ` Paul E. McKenney
2015-10-07  7:22         ` Peter Zijlstra
2015-10-07 14:20           ` Paul E. McKenney
2015-10-06 16:13   ` [PATCH tip/core/rcu 11/13] rculist: Make list_entry_rcu() use lockless_dereference() Paul E. McKenney
2015-10-26  8:45     ` Ingo Molnar
2015-10-26 14:55       ` Paul E. McKenney [this message]
2015-10-26 18:02         ` Ingo Molnar
2015-10-27  3:37         ` Linus Torvalds
2015-10-27  5:19           ` Tejun Heo
2015-10-27  5:33             ` Paul E. McKenney
2015-10-28  8:33             ` Ingo Molnar
2015-10-28 20:35               ` Patrick Marlier
2015-10-29  0:00                 ` Paul E. McKenney
2015-10-29  2:13                   ` Tejun Heo
2015-10-28 20:58             ` [tip:core/rcu] fs/writeback, rcu: Don't use list_entry_rcu() for pointer offsetting in bdi_split_work_to_wbs() tip-bot for Tejun Heo
2015-10-27  5:32           ` [PATCH tip/core/rcu 11/13] rculist: Make list_entry_rcu() use lockless_dereference() Paul E. McKenney
2015-10-06 16:13   ` [PATCH tip/core/rcu 12/13] rcu: Remove deprecated rcu_lockdep_assert() Paul E. McKenney
2015-10-06 16:13   ` [PATCH tip/core/rcu 13/13] rculist: Use WRITE_ONCE() when deleting from reader-visible list Paul E. McKenney
2015-10-06 17:23 ` [PATCH tip/core/rcu 0/13] Miscellaneous fixes for 4.4 Josh Triplett
2015-10-06 17:38   ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151026145552.GG5105@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=bobby.prani@gmail.com \
    --cc=dhowells@redhat.com \
    --cc=dipankar@in.ibm.com \
    --cc=dvhart@linux.intel.com \
    --cc=edumazet@google.com \
    --cc=fweisbec@gmail.com \
    --cc=jiangshanlai@gmail.com \
    --cc=josh@joshtriplett.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mingo@kernel.org \
    --cc=oleg@redhat.com \
    --cc=patrick.marlier@gmail.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).