From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-qg0-f41.google.com (mail-qg0-f41.google.com [209.85.192.41]) by kanga.kvack.org (Postfix) with ESMTP id BF0D86B0031 for ; Thu, 19 Jun 2014 11:03:08 -0400 (EDT)
Received: by mail-qg0-f41.google.com with SMTP id i50so2235028qgf.28 for ; Thu, 19 Jun 2014 08:03:08 -0700 (PDT)
Received: from qmta08.emeryville.ca.mail.comcast.net (qmta08.emeryville.ca.mail.comcast.net. [2001:558:fe2d:43:76:96:30:80]) by mx.google.com with ESMTP id n7si6852311qas.81.2014.06.19.08.03.07 for ; Thu, 19 Jun 2014 08:03:08 -0700 (PDT)
Date: Thu, 19 Jun 2014 10:03:04 -0500 (CDT)
From: Christoph Lameter
Subject: Re: slub/debugobjects: lockup when freeing memory
In-Reply-To: <53A2F406.4010109@oracle.com>
Message-ID:
References: <53A2F406.4010109@oracle.com>
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-linux-mm@kvack.org
List-ID:
To: Sasha Levin
Cc: Pekka Enberg, Thomas Gleixner, Matt Mackall, Andrew Morton, Dave Jones, "linux-mm@kvack.org", "Paul E. McKenney" Dave Jones, LKML

On Thu, 19 Jun 2014, Sasha Levin wrote:

> [ 690.770137] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
> [ 690.770137] __slab_alloc (mm/slub.c:1732 mm/slub.c:2205 mm/slub.c:2369)
> [ 690.770137] ? __lock_acquire (kernel/locking/lockdep.c:3189)
> [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312)
> [ 690.770137] kmem_cache_alloc (mm/slub.c:2442 mm/slub.c:2484 mm/slub.c:2489)
> [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312)
> [ 690.770137] ? debug_object_activate (lib/debugobjects.c:439)
> [ 690.770137] __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312)
> [ 690.770137] debug_object_init (lib/debugobjects.c:365)
> [ 690.770137] rcuhead_fixup_activate (kernel/rcu/update.c:231)
> [ 690.770137] debug_object_activate (lib/debugobjects.c:280 lib/debugobjects.c:439)
> [ 690.770137] ? discard_slab (mm/slub.c:1486)
> [ 690.770137] __call_rcu (kernel/rcu/rcu.h:76 (discriminator 2) kernel/rcu/tree.c:2585 (discriminator 2))

__call_rcu does a slab allocation? This means __call_rcu can no longer be
used in slab allocators? What happened?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-yh0-f44.google.com (mail-yh0-f44.google.com [209.85.213.44]) by kanga.kvack.org (Postfix) with ESMTP id A30876B0031 for ; Thu, 19 Jun 2014 12:52:55 -0400 (EDT)
Received: by mail-yh0-f44.google.com with SMTP id f10so1929549yha.17 for ; Thu, 19 Jun 2014 09:52:55 -0700 (PDT)
Received: from e34.co.us.ibm.com (e34.co.us.ibm.com. [32.97.110.152]) by mx.google.com with ESMTPS id h29si8978700yhi.30.2014.06.19.09.52.54 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Thu, 19 Jun 2014 09:52:55 -0700 (PDT)
Received: from /spool/local by e34.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 19 Jun 2014 10:52:53 -0600
Received: from b03cxnp08026.gho.boulder.ibm.com (b03cxnp08026.gho.boulder.ibm.com [9.17.130.18]) by d03dlp03.boulder.ibm.com (Postfix) with ESMTP id 4B60519D805E for ; Thu, 19 Jun 2014 10:52:40 -0600 (MDT)
Received: from d03av06.boulder.ibm.com (d03av06.boulder.ibm.com [9.17.195.245]) by b03cxnp08026.gho.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id s5JGpjkk4391396 for ; Thu, 19 Jun 2014 18:51:45 +0200
Received: from d03av06.boulder.ibm.com (loopback [127.0.0.1]) by d03av06.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id s5JGuia8001957 for ; Thu, 19 Jun 2014 10:56:44 -0600
Date: Thu, 19 Jun 2014 09:52:47 -0700
From: "Paul E. McKenney"
Subject: Re: slub/debugobjects: lockup when freeing memory
Message-ID: <20140619165247.GA4904@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com
References: <53A2F406.4010109@oracle.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Sender: owner-linux-mm@kvack.org
List-ID:
To: Christoph Lameter
Cc: Sasha Levin, Pekka Enberg, Thomas Gleixner, Matt Mackall, Andrew Morton, Dave Jones, "linux-mm@kvack.org", LKML

On Thu, Jun 19, 2014 at 10:03:04AM -0500, Christoph Lameter wrote:
> On Thu, 19 Jun 2014, Sasha Levin wrote:
>
> > [ 690.770137] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
> > [ 690.770137] __slab_alloc (mm/slub.c:1732 mm/slub.c:2205 mm/slub.c:2369)
> > [ 690.770137] ? __lock_acquire (kernel/locking/lockdep.c:3189)
> > [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312)
> > [ 690.770137] kmem_cache_alloc (mm/slub.c:2442 mm/slub.c:2484 mm/slub.c:2489)
> > [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312)
> > [ 690.770137] ? debug_object_activate (lib/debugobjects.c:439)
> > [ 690.770137] __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312)
> > [ 690.770137] debug_object_init (lib/debugobjects.c:365)
> > [ 690.770137] rcuhead_fixup_activate (kernel/rcu/update.c:231)
> > [ 690.770137] debug_object_activate (lib/debugobjects.c:280 lib/debugobjects.c:439)
> > [ 690.770137] ? discard_slab (mm/slub.c:1486)
> > [ 690.770137] __call_rcu (kernel/rcu/rcu.h:76 (discriminator 2) kernel/rcu/tree.c:2585 (discriminator 2))
>
> __call_rcu does a slab allocation? This means __call_rcu can no longer be
> used in slab allocators? What happened?

My guess is that the root cause is a double call_rcu(), call_rcu_sched(),
call_rcu_bh(), or call_srcu().

Perhaps the DEBUG_OBJECTS code now allocates memory to report errors?
That would be unfortunate...

Thanx, Paul

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-we0-f181.google.com (mail-we0-f181.google.com [74.125.82.181]) by kanga.kvack.org (Postfix) with ESMTP id 868C66B0031 for ; Thu, 19 Jun 2014 15:29:21 -0400 (EDT)
Received: by mail-we0-f181.google.com with SMTP id q59so2843950wes.12 for ; Thu, 19 Jun 2014 12:29:21 -0700 (PDT)
Received: from Galois.linutronix.de (Galois.linutronix.de. [2001:470:1f0b:db:abcd:42:0:1]) by mx.google.com with ESMTPS id t2si8406051wjw.106.2014.06.19.12.29.19 for (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Thu, 19 Jun 2014 12:29:19 -0700 (PDT)
Date: Thu, 19 Jun 2014 21:29:08 +0200 (CEST)
From: Thomas Gleixner
Subject: Re: slub/debugobjects: lockup when freeing memory
In-Reply-To: <20140619165247.GA4904@linux.vnet.ibm.com>
Message-ID:
References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-linux-mm@kvack.org
List-ID:
To: "Paul E. McKenney"
Cc: Christoph Lameter, Sasha Levin, Pekka Enberg, Matt Mackall, Andrew Morton, Dave Jones, "linux-mm@kvack.org", LKML

On Thu, 19 Jun 2014, Paul E. McKenney wrote:

> On Thu, Jun 19, 2014 at 10:03:04AM -0500, Christoph Lameter wrote:
> > On Thu, 19 Jun 2014, Sasha Levin wrote:
> >
> > > [ 690.770137] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
> > > [ 690.770137] __slab_alloc (mm/slub.c:1732 mm/slub.c:2205 mm/slub.c:2369)
> > > [ 690.770137] ? __lock_acquire (kernel/locking/lockdep.c:3189)
> > > [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312)
> > > [ 690.770137] kmem_cache_alloc (mm/slub.c:2442 mm/slub.c:2484 mm/slub.c:2489)
> > > [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312)
> > > [ 690.770137] ? debug_object_activate (lib/debugobjects.c:439)
> > > [ 690.770137] __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312)
> > > [ 690.770137] debug_object_init (lib/debugobjects.c:365)
> > > [ 690.770137] rcuhead_fixup_activate (kernel/rcu/update.c:231)
> > > [ 690.770137] debug_object_activate (lib/debugobjects.c:280 lib/debugobjects.c:439)
> > > [ 690.770137] ? discard_slab (mm/slub.c:1486)
> > > [ 690.770137] __call_rcu (kernel/rcu/rcu.h:76 (discriminator 2) kernel/rcu/tree.c:2585 (discriminator 2))
> >
> > __call_rcu does a slab allocation? This means __call_rcu can no longer be
> > used in slab allocators? What happened?
>
> My guess is that the root cause is a double call_rcu(), call_rcu_sched(),
> call_rcu_bh(), or call_srcu().
>
> Perhaps the DEBUG_OBJECTS code now allocates memory to report errors?
> That would be unfortunate...

Well, no. Look at the callchain:

__call_rcu
    debug_object_activate
        rcuhead_fixup_activate
            debug_object_init
                kmem_cache_alloc

So call rcu activates the object, but the object has no reference in
the debug objects code so the fixup code is called which inits the
object and allocates a reference ....

Thanks,

	tglx

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-qc0-f179.google.com (mail-qc0-f179.google.com [209.85.216.179]) by kanga.kvack.org (Postfix) with ESMTP id 986E16B0031 for ; Thu, 19 Jun 2014 16:19:43 -0400 (EDT)
Received: by mail-qc0-f179.google.com with SMTP id x3so2643150qcv.38 for ; Thu, 19 Jun 2014 13:19:43 -0700 (PDT)
Received: from qmta12.emeryville.ca.mail.comcast.net (qmta12.emeryville.ca.mail.comcast.net. [2001:558:fe2d:44:76:96:27:227]) by mx.google.com with ESMTP id b2si8023395qar.16.2014.06.19.13.19.42 for ; Thu, 19 Jun 2014 13:19:42 -0700 (PDT)
Date: Thu, 19 Jun 2014 15:19:39 -0500 (CDT)
From: Christoph Lameter
Subject: Re: slub/debugobjects: lockup when freeing memory
In-Reply-To:
Message-ID:
References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com>
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-linux-mm@kvack.org
List-ID:
To: Thomas Gleixner
Cc: "Paul E. McKenney", Sasha Levin, Pekka Enberg, Matt Mackall, Andrew Morton, Dave Jones, "linux-mm@kvack.org", LKML

On Thu, 19 Jun 2014, Thomas Gleixner wrote:

> Well, no. Look at the callchain:
>
> __call_rcu
>     debug_object_activate
>         rcuhead_fixup_activate
>             debug_object_init
>                 kmem_cache_alloc
>
> So call rcu activates the object, but the object has no reference in
> the debug objects code so the fixup code is called which inits the
> object and allocates a reference ....

So we need to init the object in the page struct before the __call_rcu?

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wg0-f42.google.com (mail-wg0-f42.google.com [74.125.82.42]) by kanga.kvack.org (Postfix) with ESMTP id 58FA46B0037 for ; Thu, 19 Jun 2014 16:28:27 -0400 (EDT)
Received: by mail-wg0-f42.google.com with SMTP id z12so2764441wgg.1 for ; Thu, 19 Jun 2014 13:28:26 -0700 (PDT)
Received: from Galois.linutronix.de (Galois.linutronix.de. [2001:470:1f0b:db:abcd:42:0:1]) by mx.google.com with ESMTPS id u5si8592557wjf.58.2014.06.19.13.28.25 for (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Thu, 19 Jun 2014 13:28:26 -0700 (PDT)
Date: Thu, 19 Jun 2014 22:28:14 +0200 (CEST)
From: Thomas Gleixner
Subject: Re: slub/debugobjects: lockup when freeing memory
In-Reply-To:
Message-ID:
References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-linux-mm@kvack.org
List-ID:
To: Christoph Lameter
Cc: "Paul E. McKenney", Sasha Levin, Pekka Enberg, Matt Mackall, Andrew Morton, Dave Jones, "linux-mm@kvack.org", LKML

On Thu, 19 Jun 2014, Christoph Lameter wrote:

> On Thu, 19 Jun 2014, Thomas Gleixner wrote:
>
> > Well, no. Look at the callchain:
> >
> > __call_rcu
> >     debug_object_activate
> >         rcuhead_fixup_activate
> >             debug_object_init
> >                 kmem_cache_alloc
> >
> > So call rcu activates the object, but the object has no reference in
> > the debug objects code so the fixup code is called which inits the
> > object and allocates a reference ....
>
> So we need to init the object in the page struct before the __call_rcu?

Looks like RCU is lazily relying on the state callback to initialize
the objects.

There is an unused debug_init_rcu_head() inline in kernel/rcu/update.c

Paul????

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-qc0-f177.google.com (mail-qc0-f177.google.com [209.85.216.177]) by kanga.kvack.org (Postfix) with ESMTP id 281356B0038 for ; Thu, 19 Jun 2014 16:29:42 -0400 (EDT)
Received: by mail-qc0-f177.google.com with SMTP id r5so2659982qcx.8 for ; Thu, 19 Jun 2014 13:29:41 -0700 (PDT)
Received: from e31.co.us.ibm.com (e31.co.us.ibm.com. [32.97.110.149]) by mx.google.com with ESMTPS id k6si7827166qct.2.2014.06.19.13.29.41 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Thu, 19 Jun 2014 13:29:41 -0700 (PDT)
Received: from /spool/local by e31.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 19 Jun 2014 14:29:40 -0600
Received: from b03cxnp08028.gho.boulder.ibm.com (b03cxnp08028.gho.boulder.ibm.com [9.17.130.20]) by d03dlp02.boulder.ibm.com (Postfix) with ESMTP id 719DC3E40083 for ; Thu, 19 Jun 2014 14:29:30 -0600 (MDT)
Received: from d03av06.boulder.ibm.com (d03av06.boulder.ibm.com [9.17.195.245]) by b03cxnp08028.gho.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id s5JKTTIU10354976 for ; Thu, 19 Jun 2014 22:29:30 +0200
Received: from d03av06.boulder.ibm.com (loopback [127.0.0.1]) by d03av06.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id s5JKXPEi028413 for ; Thu, 19 Jun 2014 14:33:25 -0600
Date: Thu, 19 Jun 2014 13:29:28 -0700
From: "Paul E. McKenney"
Subject: Re: slub/debugobjects: lockup when freeing memory
Message-ID: <20140619202928.GG4904@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com
References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Sender: owner-linux-mm@kvack.org
List-ID:
To: Thomas Gleixner
Cc: Christoph Lameter, Sasha Levin, Pekka Enberg, Matt Mackall, Andrew Morton, Dave Jones, "linux-mm@kvack.org", LKML

On Thu, Jun 19, 2014 at 09:29:08PM +0200, Thomas Gleixner wrote:
> On Thu, 19 Jun 2014, Paul E. McKenney wrote:
>
> > On Thu, Jun 19, 2014 at 10:03:04AM -0500, Christoph Lameter wrote:
> > > On Thu, 19 Jun 2014, Sasha Levin wrote:
> > >
> > > > [ 690.770137] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
> > > > [ 690.770137] __slab_alloc (mm/slub.c:1732 mm/slub.c:2205 mm/slub.c:2369)
> > > > [ 690.770137] ? __lock_acquire (kernel/locking/lockdep.c:3189)
> > > > [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312)
> > > > [ 690.770137] kmem_cache_alloc (mm/slub.c:2442 mm/slub.c:2484 mm/slub.c:2489)
> > > > [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312)
> > > > [ 690.770137] ? debug_object_activate (lib/debugobjects.c:439)
> > > > [ 690.770137] __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312)
> > > > [ 690.770137] debug_object_init (lib/debugobjects.c:365)
> > > > [ 690.770137] rcuhead_fixup_activate (kernel/rcu/update.c:231)
> > > > [ 690.770137] debug_object_activate (lib/debugobjects.c:280 lib/debugobjects.c:439)
> > > > [ 690.770137] ? discard_slab (mm/slub.c:1486)
> > > > [ 690.770137] __call_rcu (kernel/rcu/rcu.h:76 (discriminator 2) kernel/rcu/tree.c:2585 (discriminator 2))
> > >
> > > __call_rcu does a slab allocation? This means __call_rcu can no longer be
> > > used in slab allocators? What happened?
> >
> > My guess is that the root cause is a double call_rcu(), call_rcu_sched(),
> > call_rcu_bh(), or call_srcu().
> >
> > Perhaps the DEBUG_OBJECTS code now allocates memory to report errors?
> > That would be unfortunate...
>
> Well, no. Look at the callchain:
>
> __call_rcu
>     debug_object_activate
>         rcuhead_fixup_activate
>             debug_object_init
>                 kmem_cache_alloc
>
> So call rcu activates the object, but the object has no reference in
> the debug objects code so the fixup code is called which inits the
> object and allocates a reference ....

OK, got it. And you are right, call_rcu() has done this for a very
long time, so not sure what changed. But it seems like the right
approach is to provide a debug-object-free call_rcu_alloc() for use
by the memory allocators.

Seem reasonable? If so, please see the following patch.

Thanx, Paul

------------------------------------------------------------------------

rcu: Provide call_rcu_alloc() and call_rcu_sched_alloc() to avoid recursion

The sl*b allocators use call_rcu() to manage object lifetimes, but
call_rcu() can use debug-objects, which in turn invokes the sl*b
allocators. These allocators are not prepared for this sort of
recursion, which can result in failures.

This commit therefore creates call_rcu_alloc() and call_rcu_sched_alloc(),
which act as their call_rcu() and call_rcu_sched() counterparts, but
which avoid invoking debug-objects. These new API members are intended
only for use by the sl*b allocators, and this commit makes the sl*b
allocators use call_rcu_alloc(). Why call_rcu_sched_alloc()? Because
in CONFIG_PREEMPT=n kernels, call_rcu() maps to call_rcu_sched(), so
therefore call_rcu_alloc() must map to call_rcu_sched_alloc().

Reported-by: Sasha Levin
Set-straight-by: Thomas Gleixner
Signed-off-by: Paul E. McKenney

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index d5e40a42cc43..1f708a7f9e7d 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -140,13 +140,24 @@ void do_trace_rcu_torture_read(const char *rcutorturename,
  * if CPU A and CPU B are the same CPU (but again only if the system has
  * more than one CPU).
  */
-void call_rcu(struct rcu_head *head,
-	      void (*func)(struct rcu_head *head));
+void call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *head));
+
+/**
+ * call_rcu_alloc() - Queue an RCU for invocation after grace period.
+ * @head: structure to be used for queueing the RCU updates.
+ * @func: actual callback function to be invoked after the grace period
+ *
+ * Similar to call_rcu(), but avoids invoking debug-objects. This permits
+ * this to be called from allocators without needing to worry about
+ * recursive calls into those allocators for debug-objects allocations.
+ */
+void call_rcu_alloc(struct rcu_head *head, void (*func)(struct rcu_head *rcu));
 
 #else /* #ifdef CONFIG_PREEMPT_RCU */
 
 /* In classic RCU, call_rcu() is just call_rcu_sched(). */
 #define call_rcu call_rcu_sched
+#define call_rcu_alloc call_rcu_sched_alloc
 
 #endif /* #else #ifdef CONFIG_PREEMPT_RCU */
 
@@ -196,6 +207,19 @@ void call_rcu_bh(struct rcu_head *head,
 void call_rcu_sched(struct rcu_head *head,
 		    void (*func)(struct rcu_head *rcu));
 
+/**
+ * call_rcu_sched_alloc() - Queue RCU for invocation after sched grace period.
+ * @head: structure to be used for queueing the RCU updates.
+ * @func: actual callback function to be invoked after the grace period
+ *
+ * Similar to call_rcu_sched(), but avoids invoking debug-objects.
+ * This permits this to be called from allocators without needing to
+ * worry about recursive calls into those allocators for debug-objects
+ * allocations.
+ */
+void call_rcu_sched_alloc(struct rcu_head *head,
+			  void (*func)(struct rcu_head *rcu));
+
 void synchronize_sched(void);
 
 #ifdef CONFIG_PREEMPT_RCU
diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c
index d9efcc13008c..515e60067c53 100644
--- a/kernel/rcu/tiny.c
+++ b/kernel/rcu/tiny.c
@@ -338,15 +338,14 @@ void synchronize_sched(void)
 EXPORT_SYMBOL_GPL(synchronize_sched);
 
 /*
- * Helper function for call_rcu() and call_rcu_bh().
+ * Provide call_rcu() function, but avoid invoking debug objects.
  */
-static void __call_rcu(struct rcu_head *head,
-		       void (*func)(struct rcu_head *rcu),
-		       struct rcu_ctrlblk *rcp)
+static void __call_rcu_nodo(struct rcu_head *head,
+			    void (*func)(struct rcu_head *rcu),
+			    struct rcu_ctrlblk *rcp)
 {
 	unsigned long flags;
 
-	debug_rcu_head_queue(head);
 	head->func = func;
 	head->next = NULL;
 
@@ -358,6 +357,17 @@ static void __call_rcu(struct rcu_head *head,
 }
 
 /*
+ * Helper function for call_rcu() and call_rcu_bh().
+ */
+static void __call_rcu(struct rcu_head *head,
+		       void (*func)(struct rcu_head *rcu),
+		       struct rcu_ctrlblk *rcp)
+{
+	debug_rcu_head_queue(head);
+	__call_rcu_nodo(head, func, rcp);
+}
+
+/*
  * Post an RCU callback to be invoked after the end of an RCU-sched grace
  * period. But since we have but one CPU, that would be after any
  * quiescent state.
@@ -369,6 +379,18 @@ void call_rcu_sched(struct rcu_head *head, void (*func)(struct rcu_head *rcu))
 EXPORT_SYMBOL_GPL(call_rcu_sched);
 
 /*
+ * Similar to call_rcu_sched(), but avoids debug-objects and thus calls
+ * into the memory allocators, which don't appreciate that sort of
+ * recursion.
+ */
+void call_rcu_sched_alloc(struct rcu_head *head,
+			  void (*func)(struct rcu_head *rcu))
+{
+	__call_rcu_nodo(head, func, &rcu_sched_ctrlblk);
+}
+EXPORT_SYMBOL_GPL(call_rcu_sched_alloc);
+
+/*
  * Post an RCU bottom-half callback to be invoked after any subsequent
  * quiescent state.
  */
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 8c47d04ecdea..593195d38850 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2640,25 +2640,16 @@ static void rcu_leak_callback(struct rcu_head *rhp)
 }
 
 /*
- * Helper function for call_rcu() and friends. The cpu argument will
- * normally be -1, indicating "currently running CPU". It may specify
- * a CPU only if that CPU is a no-CBs CPU. Currently, only _rcu_barrier()
- * is expected to specify a CPU.
+ * Provide call_rcu() function, but avoid invoking debug objects.
  */
 static void
-__call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *rcu),
-	   struct rcu_state *rsp, int cpu, bool lazy)
+__call_rcu_nodo(struct rcu_head *head, void (*func)(struct rcu_head *rcu),
+		struct rcu_state *rsp, int cpu, bool lazy)
 {
 	unsigned long flags;
 	struct rcu_data *rdp;
 
 	WARN_ON_ONCE((unsigned long)head & 0x1); /* Misaligned rcu_head! */
-	if (debug_rcu_head_queue(head)) {
-		/* Probable double call_rcu(), so leak the callback. */
-		ACCESS_ONCE(head->func) = rcu_leak_callback;
-		WARN_ONCE(1, "__call_rcu(): Leaked duplicate callback\n");
-		return;
-	}
 	head->func = func;
 	head->next = NULL;
 
@@ -2704,6 +2695,25 @@ __call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *rcu),
 }
 
 /*
+ * Helper function for call_rcu() and friends. The cpu argument will
+ * normally be -1, indicating "currently running CPU". It may specify
+ * a CPU only if that CPU is a no-CBs CPU. Currently, only _rcu_barrier()
+ * is expected to specify a CPU.
+ */
+static void
+__call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *rcu),
+	   struct rcu_state *rsp, int cpu, bool lazy)
+{
+	if (debug_rcu_head_queue(head)) {
+		/* Probable double call_rcu(), so leak the callback. */
+		ACCESS_ONCE(head->func) = rcu_leak_callback;
+		WARN_ONCE(1, "__call_rcu(): Leaked duplicate callback\n");
+		return;
+	}
+	__call_rcu_nodo(head, func, rsp, cpu, lazy);
+}
+
+/*
  * Queue an RCU-sched callback for invocation after a grace period.
  */
 void call_rcu_sched(struct rcu_head *head, void (*func)(struct rcu_head *rcu))
@@ -2713,6 +2723,18 @@ void call_rcu_sched(struct rcu_head *head, void (*func)(struct rcu_head *rcu))
 EXPORT_SYMBOL_GPL(call_rcu_sched);
 
 /*
+ * Similar to call_rcu_sched(), but avoids debug-objects and thus calls
+ * into the memory allocators, which don't appreciate that sort of
+ * recursion.
+ */
+void call_rcu_sched_alloc(struct rcu_head *head,
+			  void (*func)(struct rcu_head *rcu))
+{
+	__call_rcu_nodo(head, func, &rcu_sched_state, -1, 0);
+}
+EXPORT_SYMBOL_GPL(call_rcu_sched_alloc);
+
+/*
  * Queue an RCU callback for invocation after a quicker grace period.
  */
 void call_rcu_bh(struct rcu_head *head, void (*func)(struct rcu_head *rcu))
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 569b390daa15..e9362d7f8328 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -679,6 +679,17 @@ void call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *rcu))
 }
 EXPORT_SYMBOL_GPL(call_rcu);
 
+/*
+ * Similar to call_rcu(), but avoids debug-objects and thus calls
+ * into the memory allocators, which don't appreciate that sort of
+ * recursion.
+ */
+void call_rcu_alloc(struct rcu_head *head, void (*func)(struct rcu_head *rcu))
+{
+	__call_rcu_nodo(head, func, &rcu_preempt_state, -1, 0);
+}
+EXPORT_SYMBOL_GPL(call_rcu_alloc);
+
 /**
  * synchronize_rcu - wait until a grace period has elapsed.
  *
diff --git a/mm/slab.c b/mm/slab.c
index 9ca3b87edabc..1e5de0d39701 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1994,7 +1994,7 @@ static void slab_destroy(struct kmem_cache *cachep, struct page *page)
 		 * we can use it safely.
 		 */
 		head = (void *)&page->rcu_head;
-		call_rcu(head, kmem_rcu_free);
+		call_rcu_alloc(head, kmem_rcu_free);
 
 	} else {
 		kmem_freepages(cachep, page);
diff --git a/mm/slob.c b/mm/slob.c
index 21980e0f39a8..47ad4a43521a 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -605,7 +605,7 @@ void kmem_cache_free(struct kmem_cache *c, void *b)
 		struct slob_rcu *slob_rcu;
 		slob_rcu = b + (c->size - sizeof(struct slob_rcu));
 		slob_rcu->size = c->size;
-		call_rcu(&slob_rcu->head, kmem_rcu_free);
+		call_rcu_alloc(&slob_rcu->head, kmem_rcu_free);
 	} else {
 		__kmem_cache_free(b, c->size);
 	}
diff --git a/mm/slub.c b/mm/slub.c
index b2b047327d76..7f01e57fd99f 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1512,7 +1512,7 @@ static void free_slab(struct kmem_cache *s, struct page *page)
 			head = (void *)&page->lru;
 		}
 
-		call_rcu(head, rcu_free_slab);
+		call_rcu_alloc(head, rcu_free_slab);
 	} else
 		__free_slab(s, page);
 }
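
The recursion that the patch above tries to break can be modelled in user space. What follows is only an illustrative sketch, not kernel code: every name in it (struct model, model_alloc(), model_call_rcu(), model_call_rcu_nodo(), model_free_slab()) is made up for this example, and the "allocator" is reduced to a flag that counts re-entry instead of locking up. It assumes, as in the call chain discussed earlier, that activating an untracked object forces an allocation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model {
	bool in_allocator;	/* true while the non-reentrant allocator runs */
	int reentries;		/* how often the allocator was re-entered */
	int queued;		/* callbacks handed to the RCU stand-in */
};

/* Stand-in for kmem_cache_alloc(): records re-entry instead of deadlocking. */
static void model_alloc(struct model *m)
{
	if (m->in_allocator) {
		m->reentries++;
		return;
	}
	m->in_allocator = true;
	/* ... a real allocator would do its work here ... */
	m->in_allocator = false;
}

/* Stand-in for activating an untracked object: the fixup path allocates. */
static void model_debug_activate(struct model *m)
{
	model_alloc(m);
}

/* Core queueing step with no debug hook, like __call_rcu_nodo() in the patch. */
static void model_call_rcu_nodo(struct model *m)
{
	m->queued++;
}

/* Debug-checked wrapper, like __call_rcu() after the patch. */
static void model_call_rcu(struct model *m)
{
	model_debug_activate(m);
	model_call_rcu_nodo(m);
}

/*
 * Stand-in for free_slab(): runs inside the allocator and queues a callback.
 * Returns how many times the allocator was re-entered by this call.
 */
static int model_free_slab(struct model *m, bool skip_debug)
{
	int before = m->reentries;

	assert(!m->in_allocator);
	m->in_allocator = true;
	if (skip_debug)
		model_call_rcu_nodo(m);
	else
		model_call_rcu(m);
	m->in_allocator = false;
	return m->reentries - before;
}
```

Calling model_free_slab() with skip_debug set to false re-enters the model allocator once; with skip_debug set to true it does not, which is the effect the _alloc variants are after. The sketch says nothing about whether skipping the debug hook is the right trade-off, which is what the rest of the thread debates.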

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pd0-f170.google.com (mail-pd0-f170.google.com [209.85.192.170]) by kanga.kvack.org (Postfix) with ESMTP id 6794C6B0039 for ; Thu, 19 Jun 2014 16:32:50 -0400 (EDT)
Received: by mail-pd0-f170.google.com with SMTP id z10so2188888pdj.1 for ; Thu, 19 Jun 2014 13:32:50 -0700 (PDT)
Received: from aserp1040.oracle.com (aserp1040.oracle.com. [141.146.126.69]) by mx.google.com with ESMTPS id nx10si6985811pbb.197.2014.06.19.13.32.49 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Thu, 19 Jun 2014 13:32:49 -0700 (PDT)
Message-ID: <53A348E6.3050404@oracle.com>
Date: Thu, 19 Jun 2014 16:32:38 -0400
From: Sasha Levin
MIME-Version: 1.0
Subject: Re: slub/debugobjects: lockup when freeing memory
References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140619202928.GG4904@linux.vnet.ibm.com>
In-Reply-To: <20140619202928.GG4904@linux.vnet.ibm.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: paulmck@linux.vnet.ibm.com, Thomas Gleixner
Cc: Christoph Lameter, Pekka Enberg, Matt Mackall, Andrew Morton, Dave Jones, "linux-mm@kvack.org", LKML

On 06/19/2014 04:29 PM, Paul E. McKenney wrote:
> On Thu, Jun 19, 2014 at 09:29:08PM +0200, Thomas Gleixner wrote:
>> > On Thu, 19 Jun 2014, Paul E. McKenney wrote:
>> >
>>> > > On Thu, Jun 19, 2014 at 10:03:04AM -0500, Christoph Lameter wrote:
>>>> > > > On Thu, 19 Jun 2014, Sasha Levin wrote:
>>>> > > >
>>>>> > > > > [ 690.770137] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
>>>>> > > > > [ 690.770137] __slab_alloc (mm/slub.c:1732 mm/slub.c:2205 mm/slub.c:2369)
>>>>> > > > > [ 690.770137] ? __lock_acquire (kernel/locking/lockdep.c:3189)
>>>>> > > > > [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312)
>>>>> > > > > [ 690.770137] kmem_cache_alloc (mm/slub.c:2442 mm/slub.c:2484 mm/slub.c:2489)
>>>>> > > > > [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312)
>>>>> > > > > [ 690.770137] ? debug_object_activate (lib/debugobjects.c:439)
>>>>> > > > > [ 690.770137] __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312)
>>>>> > > > > [ 690.770137] debug_object_init (lib/debugobjects.c:365)
>>>>> > > > > [ 690.770137] rcuhead_fixup_activate (kernel/rcu/update.c:231)
>>>>> > > > > [ 690.770137] debug_object_activate (lib/debugobjects.c:280 lib/debugobjects.c:439)
>>>>> > > > > [ 690.770137] ? discard_slab (mm/slub.c:1486)
>>>>> > > > > [ 690.770137] __call_rcu (kernel/rcu/rcu.h:76 (discriminator 2) kernel/rcu/tree.c:2585 (discriminator 2))
>>>> > > >
>>>> > > > __call_rcu does a slab allocation? This means __call_rcu can no longer be
>>>> > > > used in slab allocators? What happened?
>>> > >
>>> > > My guess is that the root cause is a double call_rcu(), call_rcu_sched(),
>>> > > call_rcu_bh(), or call_srcu().
>>> > >
>>> > > Perhaps the DEBUG_OBJECTS code now allocates memory to report errors?
>>> > > That would be unfortunate...
>> >
>> > Well, no. Look at the callchain:
>> >
>> > __call_rcu
>> >     debug_object_activate
>> >         rcuhead_fixup_activate
>> >             debug_object_init
>> >                 kmem_cache_alloc
>> >
>> > So call rcu activates the object, but the object has no reference in
>> > the debug objects code so the fixup code is called which inits the
>> > object and allocates a reference ....
> OK, got it. And you are right, call_rcu() has done this for a very
> long time, so not sure what changed.

It's probably my fault. I've introduced clone() and unshare() fuzzing.

Those two are full of issues, and I've been holding off on enabling them
until the rest of the kernel could survive trinity for more than an hour.

Thanks,
Sasha

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-qa0-f52.google.com (mail-qa0-f52.google.com [209.85.216.52]) by kanga.kvack.org (Postfix) with ESMTP id 078BB6B003B for ; Thu, 19 Jun 2014 16:37:14 -0400 (EDT)
Received: by mail-qa0-f52.google.com with SMTP id w8so2448863qac.11 for ; Thu, 19 Jun 2014 13:37:14 -0700 (PDT)
Received: from e31.co.us.ibm.com (e31.co.us.ibm.com. [32.97.110.149]) by mx.google.com with ESMTPS id r64si7914262qga.37.2014.06.19.13.37.14 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Thu, 19 Jun 2014 13:37:14 -0700 (PDT)
Received: from /spool/local by e31.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 19 Jun 2014 14:37:13 -0600
Received: from b03cxnp08027.gho.boulder.ibm.com (b03cxnp08027.gho.boulder.ibm.com [9.17.130.19]) by d03dlp01.boulder.ibm.com (Postfix) with ESMTP id 6DD261FF001C for ; Thu, 19 Jun 2014 14:37:00 -0600 (MDT)
Received: from d03av06.boulder.ibm.com (d03av06.boulder.ibm.com [9.17.195.245]) by b03cxnp08027.gho.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id s5JKZveS43319434 for ; Thu, 19 Jun 2014 22:35:57 +0200
Received: from d03av06.boulder.ibm.com (loopback [127.0.0.1]) by d03av06.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id s5JKeu0K022150 for ; Thu, 19 Jun 2014 14:40:56 -0600
Date: Thu, 19 Jun 2014 13:36:59 -0700
From: "Paul E. McKenney"
Subject: Re: slub/debugobjects: lockup when freeing memory
Message-ID: <20140619203659.GH4904@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com
References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Sender: owner-linux-mm@kvack.org
List-ID:
To: Christoph Lameter
Cc: Thomas Gleixner, Sasha Levin, Pekka Enberg, Matt Mackall, Andrew Morton, Dave Jones, "linux-mm@kvack.org", LKML

On Thu, Jun 19, 2014 at 03:19:39PM -0500, Christoph Lameter wrote:
> On Thu, 19 Jun 2014, Thomas Gleixner wrote:
>
> > Well, no. Look at the callchain:
> >
> > __call_rcu
> >     debug_object_activate
> >         rcuhead_fixup_activate
> >             debug_object_init
> >                 kmem_cache_alloc
> >
> > So call rcu activates the object, but the object has no reference in
> > the debug objects code so the fixup code is called which inits the
> > object and allocates a reference ....
>
> So we need to init the object in the page struct before the __call_rcu?

Good point. The patch I just sent will complain at callback-invocation
time because the debug-object information won't be present.

One way to handle this would be for rcu_do_batch() to avoid complaining
if it gets a callback that has not been through call_rcu()'s
debug_rcu_head_queue(). One way to do that would be to have an
alternative to debug_object_deactivate() that does not complain
if it is handed an unactivated object.

Another way to handle this would be for me to put the definition of
debug_rcu_head_queue() somewhere where the sl*b allocators could get
at it, and have the sl*b allocators invoke it at initialization
and within the RCU callback.

Other thoughts?

Thanx, Paul
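
One of the options raised in the message above, a deactivate variant that does not complain about an object it never saw activated, might look roughly like the sketch below. This is a hypothetical illustration only: enum obj_state, struct tracked, deactivate_strict() and deactivate_tolerant() are names invented here and are not the lib/debugobjects.c interface, and the real code would report through the debug-objects machinery rather than return a bool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sketch; these names are illustrative, not the debugobjects API. */
enum obj_state {
	OBJ_UNTRACKED,	/* never passed through the activation hook */
	OBJ_ACTIVE,	/* queued via the debug-checked path */
	OBJ_INACTIVE	/* callback already invoked */
};

struct tracked {
	enum obj_state state;
};

/* Strict variant: returns false (a complaint) unless the object was ACTIVE. */
static bool deactivate_strict(struct tracked *t)
{
	assert(t != NULL);
	if (t->state != OBJ_ACTIVE)
		return false;
	t->state = OBJ_INACTIVE;
	return true;
}

/*
 * Tolerant variant: an object that was never tracked is accepted silently,
 * so callbacks queued through a debug-free path do not trigger a report.
 * Any other mismatch (for example a double deactivate) still complains.
 */
static bool deactivate_tolerant(struct tracked *t)
{
	assert(t != NULL);
	if (t->state == OBJ_UNTRACKED)
		return true;
	return deactivate_strict(t);
}
```

The cost of the tolerant variant in this sketch is that a callback which should have been tracked but was not goes unnoticed at invocation time, which is one form of the loss of coverage that the follow-up reply objects to.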
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wi0-f181.google.com (mail-wi0-f181.google.com [209.85.212.181]) by kanga.kvack.org (Postfix) with ESMTP id BE74A6B003C for ; Thu, 19 Jun 2014 16:37:25 -0400 (EDT) Received: by mail-wi0-f181.google.com with SMTP id n3so3482980wiv.8 for ; Thu, 19 Jun 2014 13:37:25 -0700 (PDT) Received: from Galois.linutronix.de (Galois.linutronix.de. [2001:470:1f0b:db:abcd:42:0:1]) by mx.google.com with ESMTPS id f1si4132119wjw.158.2014.06.19.13.37.24 for (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Thu, 19 Jun 2014 13:37:24 -0700 (PDT) Date: Thu, 19 Jun 2014 22:37:17 +0200 (CEST) From: Thomas Gleixner Subject: Re: slub/debugobjects: lockup when freeing memory In-Reply-To: <20140619202928.GG4904@linux.vnet.ibm.com> Message-ID: References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140619202928.GG4904@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-linux-mm@kvack.org List-ID: To: "Paul E. McKenney" Cc: Christoph Lameter , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML On Thu, 19 Jun 2014, Paul E. McKenney wrote: > On Thu, Jun 19, 2014 at 09:29:08PM +0200, Thomas Gleixner wrote: > > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > Well, no. Look at the callchain: > > > > __call_rcu > > debug_object_activate > > rcuhead_fixup_activate > > debug_object_init > > kmem_cache_alloc > > > > So call rcu activates the object, but the object has no reference in > > the debug objects code so the fixup code is called which inits the > > object and allocates a reference .... > > OK, got it. And you are right, call_rcu() has done this for a very > long time, so not sure what changed. But it seems like the right > approach is to provide a debug-object-free call_rcu_alloc() for use > by the memory allocators. > > Seem reasonable? If so, please see the following patch. 
Not really, you're torpedoing the whole purpose of debugobjects :)

So, why can't we just init the rcu head when the stuff is created?

If that's impossible due to other memory allocator constraints, then instead of inventing a whole new API we can simply flag the relevant data in the memory allocator as we do with the debug objects mem cache itself (SLAB_DEBUG_OBJECTS).

Thanks,

	tglx

From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qc0-f169.google.com (mail-qc0-f169.google.com [209.85.216.169]) by kanga.kvack.org (Postfix) with ESMTP id ABBF16B0036 for ; Thu, 19 Jun 2014 16:42:05 -0400 (EDT) Received: by mail-qc0-f169.google.com with SMTP id c9so2723339qcz.14 for ; Thu, 19 Jun 2014 13:42:05 -0700 (PDT) Received: from e37.co.us.ibm.com (e37.co.us.ibm.com. [32.97.110.158]) by mx.google.com with ESMTPS id d7si8113986qar.50.2014.06.19.13.42.04 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Thu, 19 Jun 2014 13:42:05 -0700 (PDT) Received: from /spool/local by e37.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 19 Jun 2014 14:42:04 -0600 Received: from b03cxnp08026.gho.boulder.ibm.com (b03cxnp08026.gho.boulder.ibm.com [9.17.130.18]) by d03dlp02.boulder.ibm.com (Postfix) with ESMTP id 3881E3E40388 for ; Thu, 19 Jun 2014 14:39:17 -0600 (MDT) Received: from d03av06.boulder.ibm.com (d03av06.boulder.ibm.com [9.17.195.245]) by b03cxnp08026.gho.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id s5JKc75F66388166 for ; Thu, 19 Jun 2014 22:38:07 +0200 Received: from d03av06.boulder.ibm.com (loopback [127.0.0.1]) by d03av06.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id s5JKh6R7029865 for ; Thu, 19 Jun 2014 14:43:07 -0600 Date: Thu, 19 Jun 2014 13:39:09 -0700 From: "Paul E. 
McKenney" Subject: Re: slub/debugobjects: lockup when freeing memory Message-ID: <20140619203909.GI4904@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140619202928.GG4904@linux.vnet.ibm.com> <53A348E6.3050404@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <53A348E6.3050404@oracle.com> Sender: owner-linux-mm@kvack.org List-ID: To: Sasha Levin Cc: Thomas Gleixner , Christoph Lameter , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML On Thu, Jun 19, 2014 at 04:32:38PM -0400, Sasha Levin wrote: > On 06/19/2014 04:29 PM, Paul E. McKenney wrote: > > On Thu, Jun 19, 2014 at 09:29:08PM +0200, Thomas Gleixner wrote: > >> > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > >> > > >>> > > On Thu, Jun 19, 2014 at 10:03:04AM -0500, Christoph Lameter wrote: > >>>> > > > On Thu, 19 Jun 2014, Sasha Levin wrote: > >>>> > > > > >>>>> > > > > [ 690.770137] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63) > >>>>> > > > > [ 690.770137] __slab_alloc (mm/slub.c:1732 mm/slub.c:2205 mm/slub.c:2369) > >>>>> > > > > [ 690.770137] ? __lock_acquire (kernel/locking/lockdep.c:3189) > >>>>> > > > > [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) > >>>>> > > > > [ 690.770137] kmem_cache_alloc (mm/slub.c:2442 mm/slub.c:2484 mm/slub.c:2489) > >>>>> > > > > [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) > >>>>> > > > > [ 690.770137] ? 
debug_object_activate (lib/debugobjects.c:439) > >>>>> > > > > [ 690.770137] __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) > >>>>> > > > > [ 690.770137] debug_object_init (lib/debugobjects.c:365) > >>>>> > > > > [ 690.770137] rcuhead_fixup_activate (kernel/rcu/update.c:231) > >>>>> > > > > [ 690.770137] debug_object_activate (lib/debugobjects.c:280 lib/debugobjects.c:439) > >>>>> > > > > [ 690.770137] ? discard_slab (mm/slub.c:1486) > >>>>> > > > > [ 690.770137] __call_rcu (kernel/rcu/rcu.h:76 (discriminator 2) kernel/rcu/tree.c:2585 (discriminator 2)) > >>>> > > > > >>>> > > > __call_rcu does a slab allocation? This means __call_rcu can no longer be > >>>> > > > used in slab allocators? What happened? > >>> > > > >>> > > My guess is that the root cause is a double call_rcu(), call_rcu_sched(), > >>> > > call_rcu_bh(), or call_srcu(). > >>> > > > >>> > > Perhaps the DEBUG_OBJECTS code now allocates memory to report errors? > >>> > > That would be unfortunate... > >> > > >> > Well, no. Look at the callchain: > >> > > >> > __call_rcu > >> > debug_object_activate > >> > rcuhead_fixup_activate > >> > debug_object_init > >> > kmem_cache_alloc > >> > > >> > So call rcu activates the object, but the object has no reference in > >> > the debug objects code so the fixup code is called which inits the > >> > object and allocates a reference .... > > OK, got it. And you are right, call_rcu() has done this for a very > > long time, so not sure what changed. > > It's probable my fault. I've introduced clone() and unshare() fuzzing. > > Those two are full with issues and I've been waiting with enabling those > until the rest of the kernel could survive trinity for more than an hour. Well, that might explain why I haven't seen it in my testing. ;-) Thanx, Paul -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pd0-f170.google.com (mail-pd0-f170.google.com [209.85.192.170]) by kanga.kvack.org (Postfix) with ESMTP id 86F276B003B for ; Thu, 19 Jun 2014 16:42:19 -0400 (EDT) Received: by mail-pd0-f170.google.com with SMTP id z10so2195645pdj.1 for ; Thu, 19 Jun 2014 13:42:19 -0700 (PDT) Received: from userp1040.oracle.com (userp1040.oracle.com. [156.151.31.81]) by mx.google.com with ESMTPS id ko1si7031308pbc.100.2014.06.19.13.42.18 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Thu, 19 Jun 2014 13:42:18 -0700 (PDT) Message-ID: <53A34B23.1000401@oracle.com> Date: Thu, 19 Jun 2014 16:42:11 -0400 From: Sasha Levin MIME-Version: 1.0 Subject: Re: slub/debugobjects: lockup when freeing memory References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140619202928.GG4904@linux.vnet.ibm.com> In-Reply-To: <20140619202928.GG4904@linux.vnet.ibm.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: paulmck@linux.vnet.ibm.com, Thomas Gleixner Cc: Christoph Lameter , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML On 06/19/2014 04:29 PM, Paul E. McKenney wrote: > rcu: Provide call_rcu_alloc() and call_rcu_sched_alloc() to avoid recursion > > The sl*b allocators use call_rcu() to manage object lifetimes, but > call_rcu() can use debug-objects, which in turn invokes the sl*b > allocators. These allocators are not prepared for this sort of > recursion, which can result in failures. > > This commit therefore creates call_rcu_alloc() and call_rcu_sched_alloc(), > which act as their call_rcu() and call_rcu_sched() counterparts, but > which avoid invoking debug-objects. These new API members are intended > only for use by the sl*b allocators, and this commit makes the sl*b > allocators use call_rcu_alloc(). Why call_rcu_sched_alloc()? 
Because > in CONFIG_PREEMPT=n kernels, call_rcu() maps to call_rcu_sched(), so > therefore call_rcu_alloc() must map to call_rcu_sched_alloc(). > > Reported-by: Sasha Levin > Set-straight-by: Thomas Gleixner > Signed-off-by: Paul E. McKenney Paul, what is this patch based on? It won't apply cleanly on -next or Linus's tree. Thanks, Sasha -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-yh0-f52.google.com (mail-yh0-f52.google.com [209.85.213.52]) by kanga.kvack.org (Postfix) with ESMTP id DF99C6B0031 for ; Thu, 19 Jun 2014 16:53:20 -0400 (EDT) Received: by mail-yh0-f52.google.com with SMTP id a41so2146525yho.39 for ; Thu, 19 Jun 2014 13:53:20 -0700 (PDT) Received: from e31.co.us.ibm.com (e31.co.us.ibm.com. [32.97.110.149]) by mx.google.com with ESMTPS id g26si9956701yhl.210.2014.06.19.13.53.19 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Thu, 19 Jun 2014 13:53:20 -0700 (PDT) Received: from /spool/local by e31.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 19 Jun 2014 14:53:19 -0600 Received: from b03cxnp07029.gho.boulder.ibm.com (b03cxnp07029.gho.boulder.ibm.com [9.17.130.16]) by d03dlp02.boulder.ibm.com (Postfix) with ESMTP id 3EB423E40066 for ; Thu, 19 Jun 2014 14:53:09 -0600 (MDT) Received: from d03av06.boulder.ibm.com (d03av06.boulder.ibm.com [9.17.195.245]) by b03cxnp07029.gho.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id s5JIneMX59179116 for ; Thu, 19 Jun 2014 20:49:40 +0200 Received: from d03av06.boulder.ibm.com (loopback [127.0.0.1]) by d03av06.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id s5JKv5a4005332 for ; Thu, 19 Jun 2014 14:57:05 -0600 Date: Thu, 19 Jun 2014 13:53:07 -0700 From: "Paul E. 
McKenney" Subject: Re: slub/debugobjects: lockup when freeing memory Message-ID: <20140619205307.GL4904@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140619202928.GG4904@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Sender: owner-linux-mm@kvack.org List-ID: To: Thomas Gleixner Cc: Christoph Lameter , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML On Thu, Jun 19, 2014 at 10:37:17PM +0200, Thomas Gleixner wrote: > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > On Thu, Jun 19, 2014 at 09:29:08PM +0200, Thomas Gleixner wrote: > > > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > > Well, no. Look at the callchain: > > > > > > __call_rcu > > > debug_object_activate > > > rcuhead_fixup_activate > > > debug_object_init > > > kmem_cache_alloc > > > > > > So call rcu activates the object, but the object has no reference in > > > the debug objects code so the fixup code is called which inits the > > > object and allocates a reference .... > > > > OK, got it. And you are right, call_rcu() has done this for a very > > long time, so not sure what changed. But it seems like the right > > approach is to provide a debug-object-free call_rcu_alloc() for use > > by the memory allocators. > > > > Seem reasonable? If so, please see the following patch. > > Not really, you're torpedoing the whole purpose of debugobjects :) > > So, why can't we just init the rcu head when the stuff is created? That would allow me to keep my code unchanged, so I am in favor. ;-) Thanx, Paul > If that's impossible due to other memory allocator constraints, then > instead of inventing a whole new API we can simply flag the relevent > data in the memory allocator as we do with the debug objects mem cache > itself (SLAB_DEBUG_OBJECTS). 
> > Thanks, > > tglx > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qa0-f42.google.com (mail-qa0-f42.google.com [209.85.216.42]) by kanga.kvack.org (Postfix) with ESMTP id 04C4F6B0036 for ; Thu, 19 Jun 2014 16:53:46 -0400 (EDT) Received: by mail-qa0-f42.google.com with SMTP id dc16so2483835qab.1 for ; Thu, 19 Jun 2014 13:53:46 -0700 (PDT) Received: from e36.co.us.ibm.com (e36.co.us.ibm.com. [32.97.110.154]) by mx.google.com with ESMTPS id h50si7955895qgf.62.2014.06.19.13.53.46 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Thu, 19 Jun 2014 13:53:46 -0700 (PDT) Received: from /spool/local by e36.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 19 Jun 2014 14:53:44 -0600 Received: from b03cxnp08026.gho.boulder.ibm.com (b03cxnp08026.gho.boulder.ibm.com [9.17.130.18]) by d03dlp02.boulder.ibm.com (Postfix) with ESMTP id 693913E40062 for ; Thu, 19 Jun 2014 14:53:39 -0600 (MDT) Received: from d03av06.boulder.ibm.com (d03av06.boulder.ibm.com [9.17.195.245]) by b03cxnp08026.gho.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id s5JKqZAW51380350 for ; Thu, 19 Jun 2014 22:52:35 +0200 Received: from d03av06.boulder.ibm.com (loopback [127.0.0.1]) by d03av06.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id s5JKvXYo006543 for ; Thu, 19 Jun 2014 14:57:34 -0600 Date: Thu, 19 Jun 2014 13:53:36 -0700 From: "Paul E. 
McKenney" Subject: Re: slub/debugobjects: lockup when freeing memory Message-ID: <20140619205336.GM4904@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140619202928.GG4904@linux.vnet.ibm.com> <53A34B23.1000401@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <53A34B23.1000401@oracle.com> Sender: owner-linux-mm@kvack.org List-ID: To: Sasha Levin Cc: Thomas Gleixner , Christoph Lameter , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML On Thu, Jun 19, 2014 at 04:42:11PM -0400, Sasha Levin wrote: > On 06/19/2014 04:29 PM, Paul E. McKenney wrote: > > rcu: Provide call_rcu_alloc() and call_rcu_sched_alloc() to avoid recursion > > > > The sl*b allocators use call_rcu() to manage object lifetimes, but > > call_rcu() can use debug-objects, which in turn invokes the sl*b > > allocators. These allocators are not prepared for this sort of > > recursion, which can result in failures. > > > > This commit therefore creates call_rcu_alloc() and call_rcu_sched_alloc(), > > which act as their call_rcu() and call_rcu_sched() counterparts, but > > which avoid invoking debug-objects. These new API members are intended > > only for use by the sl*b allocators, and this commit makes the sl*b > > allocators use call_rcu_alloc(). Why call_rcu_sched_alloc()? Because > > in CONFIG_PREEMPT=n kernels, call_rcu() maps to call_rcu_sched(), so > > therefore call_rcu_alloc() must map to call_rcu_sched_alloc(). > > > > Reported-by: Sasha Levin > > Set-straight-by: Thomas Gleixner > > Signed-off-by: Paul E. McKenney > > Paul, what is this patch based on? It won't apply cleanly on -next > or Linus's tree. On my -rcu tree, but I think that Thomas's approach is better. Thanx, Paul -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. 
For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wg0-f51.google.com (mail-wg0-f51.google.com [74.125.82.51]) by kanga.kvack.org (Postfix) with ESMTP id 7A1E46B0031 for ; Thu, 19 Jun 2014 17:32:56 -0400 (EDT) Received: by mail-wg0-f51.google.com with SMTP id x12so2879488wgg.22 for ; Thu, 19 Jun 2014 14:32:55 -0700 (PDT) Received: from Galois.linutronix.de (Galois.linutronix.de. [2001:470:1f0b:db:abcd:42:0:1]) by mx.google.com with ESMTPS id a17si7297906wib.72.2014.06.19.14.32.54 for (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Thu, 19 Jun 2014 14:32:55 -0700 (PDT) Date: Thu, 19 Jun 2014 23:32:41 +0200 (CEST) From: Thomas Gleixner Subject: Re: slub/debugobjects: lockup when freeing memory In-Reply-To: <20140619205307.GL4904@linux.vnet.ibm.com> Message-ID: References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140619202928.GG4904@linux.vnet.ibm.com> <20140619205307.GL4904@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-linux-mm@kvack.org List-ID: To: "Paul E. McKenney" Cc: Christoph Lameter , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML On Thu, 19 Jun 2014, Paul E. McKenney wrote: > On Thu, Jun 19, 2014 at 10:37:17PM +0200, Thomas Gleixner wrote: > > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > > On Thu, Jun 19, 2014 at 09:29:08PM +0200, Thomas Gleixner wrote: > > > > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > > > Well, no. Look at the callchain: > > > > > > > > __call_rcu > > > > debug_object_activate > > > > rcuhead_fixup_activate > > > > debug_object_init > > > > kmem_cache_alloc > > > > > > > > So call rcu activates the object, but the object has no reference in > > > > the debug objects code so the fixup code is called which inits the > > > > object and allocates a reference .... > > > > > > OK, got it. 
And you are right, call_rcu() has done this for a very > > > long time, so not sure what changed. But it seems like the right > > > approach is to provide a debug-object-free call_rcu_alloc() for use > > > by the memory allocators. > > > > > > Seem reasonable? If so, please see the following patch. > > > > Not really, you're torpedoing the whole purpose of debugobjects :) > > > > So, why can't we just init the rcu head when the stuff is created? > > That would allow me to keep my code unchanged, so I am in favor. ;-) Almost unchanged. You need to provide a function to do so, i.e. make use of debug_init_rcu_head() Thanks, tglx -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qg0-f51.google.com (mail-qg0-f51.google.com [209.85.192.51]) by kanga.kvack.org (Postfix) with ESMTP id 3BE986B0031 for ; Thu, 19 Jun 2014 18:04:56 -0400 (EDT) Received: by mail-qg0-f51.google.com with SMTP id z60so2727657qgd.10 for ; Thu, 19 Jun 2014 15:04:55 -0700 (PDT) Received: from e33.co.us.ibm.com (e33.co.us.ibm.com. [32.97.110.151]) by mx.google.com with ESMTPS id do2si8084921qcb.21.2014.06.19.15.04.54 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Thu, 19 Jun 2014 15:04:55 -0700 (PDT) Received: from /spool/local by e33.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! 
Violators will be prosecuted for from ; Thu, 19 Jun 2014 16:04:53 -0600 Received: from b03cxnp08025.gho.boulder.ibm.com (b03cxnp08025.gho.boulder.ibm.com [9.17.130.17]) by d03dlp02.boulder.ibm.com (Postfix) with ESMTP id 90F5B3E4003E for ; Thu, 19 Jun 2014 16:04:50 -0600 (MDT) Received: from d03av06.boulder.ibm.com (d03av06.boulder.ibm.com [9.17.195.245]) by b03cxnp08025.gho.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id s5JM4oA566322650 for ; Fri, 20 Jun 2014 00:04:50 +0200 Received: from d03av06.boulder.ibm.com (loopback [127.0.0.1]) by d03av06.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id s5JM8kU9017296 for ; Thu, 19 Jun 2014 16:08:46 -0600 Date: Thu, 19 Jun 2014 15:04:49 -0700 From: "Paul E. McKenney" Subject: Re: slub/debugobjects: lockup when freeing memory Message-ID: <20140619220449.GT4904@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140619202928.GG4904@linux.vnet.ibm.com> <20140619205307.GL4904@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Sender: owner-linux-mm@kvack.org List-ID: To: Thomas Gleixner Cc: Christoph Lameter , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML On Thu, Jun 19, 2014 at 11:32:41PM +0200, Thomas Gleixner wrote: > > > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > > On Thu, Jun 19, 2014 at 10:37:17PM +0200, Thomas Gleixner wrote: > > > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > > > On Thu, Jun 19, 2014 at 09:29:08PM +0200, Thomas Gleixner wrote: > > > > > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > > > > Well, no. 
Look at the callchain:
> > > > >
> > > > > __call_rcu
> > > > > debug_object_activate
> > > > > rcuhead_fixup_activate
> > > > > debug_object_init
> > > > > kmem_cache_alloc
> > > > >
> > > > > So call rcu activates the object, but the object has no reference in
> > > > > the debug objects code so the fixup code is called which inits the
> > > > > object and allocates a reference ....
> > > >
> > > > OK, got it. And you are right, call_rcu() has done this for a very
> > > > long time, so not sure what changed. But it seems like the right
> > > > approach is to provide a debug-object-free call_rcu_alloc() for use
> > > > by the memory allocators.
> > > >
> > > > Seem reasonable? If so, please see the following patch.
> > >
> > > Not really, you're torpedoing the whole purpose of debugobjects :)
> > >
> > > So, why can't we just init the rcu head when the stuff is created?
> >
> > That would allow me to keep my code unchanged, so I am in favor. ;-)
>
> Almost unchanged. You need to provide a function to do so, i.e. make
> use of
>
> debug_init_rcu_head()

You mean like this?

							Thanx, Paul

------------------------------------------------------------------------

rcu: Export debug_init_rcu_head() and debug_rcu_head_free()

Currently, call_rcu() relies on implicit allocation and initialization
for the debug-objects handling of RCU callbacks. If you hammer the
kernel hard enough with Sasha's modified version of trinity, you can end
up with the sl*b allocators recursing into themselves via this implicit
call_rcu() allocation.

This commit therefore exports the debug_init_rcu_head() and
debug_rcu_head_free() functions, which permits the allocators to allocate
and pre-initialize the debug-objects information, so that there is no
longer any need for call_rcu() to do that initialization, which in turn
prevents the recursion into the memory allocators.

Reported-by: Sasha Levin
Suggested-by: Thomas Gleixner
Signed-off-by: Paul E. 
McKenney

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 063a6bf1a2b6..34ae5c376e35 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -358,9 +358,19 @@ void wait_rcu_gp(call_rcu_func_t crf);
  * initialization.
  */
 #ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD
+void debug_init_rcu_head(struct rcu_head *head);
+void debug_rcu_head_free(struct rcu_head *head);
 void init_rcu_head_on_stack(struct rcu_head *head);
 void destroy_rcu_head_on_stack(struct rcu_head *head);
 #else /* !CONFIG_DEBUG_OBJECTS_RCU_HEAD */
+static inline void debug_init_rcu_head(struct rcu_head *head)
+{
+}
+
+static inline void debug_rcu_head_free(struct rcu_head *head)
+{
+}
+
 static inline void init_rcu_head_on_stack(struct rcu_head *head)
 {
 }
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index a2aeb4df0f60..a41c81a26506 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -200,12 +200,12 @@ void wait_rcu_gp(call_rcu_func_t crf)
 EXPORT_SYMBOL_GPL(wait_rcu_gp);
 
 #ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD
-static inline void debug_init_rcu_head(struct rcu_head *head)
+void debug_init_rcu_head(struct rcu_head *head)
 {
 	debug_object_init(head, &rcuhead_debug_descr);
 }
 
-static inline void debug_rcu_head_free(struct rcu_head *head)
+void debug_rcu_head_free(struct rcu_head *head)
 {
 	debug_object_free(head, &rcuhead_debug_descr);
 }

From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wi0-f172.google.com (mail-wi0-f172.google.com [209.85.212.172]) by kanga.kvack.org (Postfix) with ESMTP id 5D17D6B0035 for ; Fri, 20 Jun 2014 04:17:54 -0400 (EDT) Received: by mail-wi0-f172.google.com with SMTP id hi2so345986wib.11 for ; Fri, 20 Jun 2014 01:17:53 -0700 (PDT) Received: from Galois.linutronix.de (Galois.linutronix.de. 
[2001:470:1f0b:db:abcd:42:0:1]) by mx.google.com with ESMTPS id qi1si10186962wjc.18.2014.06.20.01.17.52 for (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Fri, 20 Jun 2014 01:17:52 -0700 (PDT) Date: Fri, 20 Jun 2014 10:17:32 +0200 (CEST) From: Thomas Gleixner Subject: Re: slub/debugobjects: lockup when freeing memory In-Reply-To: <20140619220449.GT4904@linux.vnet.ibm.com> Message-ID: References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140619202928.GG4904@linux.vnet.ibm.com> <20140619205307.GL4904@linux.vnet.ibm.com> <20140619220449.GT4904@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-linux-mm@kvack.org List-ID: To: "Paul E. McKenney" Cc: Christoph Lameter , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML On Thu, 19 Jun 2014, Paul E. McKenney wrote: > On Thu, Jun 19, 2014 at 11:32:41PM +0200, Thomas Gleixner wrote: > > > > > > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > > > > On Thu, Jun 19, 2014 at 10:37:17PM +0200, Thomas Gleixner wrote: > > > > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > > > > On Thu, Jun 19, 2014 at 09:29:08PM +0200, Thomas Gleixner wrote: > > > > > > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > > > > > Well, no. Look at the callchain: > > > > > > > > > > > > __call_rcu > > > > > > debug_object_activate > > > > > > rcuhead_fixup_activate > > > > > > debug_object_init > > > > > > kmem_cache_alloc > > > > > > > > > > > > So call rcu activates the object, but the object has no reference in > > > > > > the debug objects code so the fixup code is called which inits the > > > > > > object and allocates a reference .... > > > > > > > > > > OK, got it. And you are right, call_rcu() has done this for a very > > > > > long time, so not sure what changed. But it seems like the right > > > > > approach is to provide a debug-object-free call_rcu_alloc() for use > > > > > by the memory allocators. 
> > > > > > > > > > Seem reasonable? If so, please see the following patch. > > > > > > > > Not really, you're torpedoing the whole purpose of debugobjects :) > > > > > > > > So, why can't we just init the rcu head when the stuff is created? > > > > > > That would allow me to keep my code unchanged, so I am in favor. ;-) > > > > Almost unchanged. You need to provide a function to do so, i.e. make > > use of > > > > debug_init_rcu_head() > > You mean like this? I'd rather name it init_rcu_head() and free_rcu_head() w/o the debug_ prefix, so it's consistent with init_rcu_head_on_stack / destroy_rcu_head_on_stack. But either way works for me. Acked-by: Thomas Gleixner -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qa0-f49.google.com (mail-qa0-f49.google.com [209.85.216.49]) by kanga.kvack.org (Postfix) with ESMTP id 81CF56B0036 for ; Fri, 20 Jun 2014 10:30:59 -0400 (EDT) Received: by mail-qa0-f49.google.com with SMTP id w8so3203941qac.22 for ; Fri, 20 Jun 2014 07:30:59 -0700 (PDT) Received: from qmta10.emeryville.ca.mail.comcast.net (qmta10.emeryville.ca.mail.comcast.net. [2001:558:fe2d:43:76:96:30:17]) by mx.google.com with ESMTP id l66si10795242qgf.78.2014.06.20.07.30.58 for ; Fri, 20 Jun 2014 07:30:58 -0700 (PDT) Date: Fri, 20 Jun 2014 09:30:52 -0500 (CDT) From: Christoph Lameter Subject: Re: slub/debugobjects: lockup when freeing memory In-Reply-To: <20140619220449.GT4904@linux.vnet.ibm.com> Message-ID: References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140619202928.GG4904@linux.vnet.ibm.com> <20140619205307.GL4904@linux.vnet.ibm.com> <20140619220449.GT4904@linux.vnet.ibm.com> Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-linux-mm@kvack.org List-ID: To: "Paul E. 
McKenney" Cc: Thomas Gleixner , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML On Thu, 19 Jun 2014, Paul E. McKenney wrote: > This commit therefore exports the debug_init_rcu_head() and > debug_rcu_head_free() functions, which permits the allocators to allocated > and pre-initialize the debug-objects information, so that there no longer > any need for call_rcu() to do that initialization, which in turn prevents > the recursion into the memory allocators. Looks-good-to: Christoph Lameter -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qc0-f179.google.com (mail-qc0-f179.google.com [209.85.216.179]) by kanga.kvack.org (Postfix) with ESMTP id ABEE36B0037 for ; Fri, 20 Jun 2014 11:40:21 -0400 (EDT) Received: by mail-qc0-f179.google.com with SMTP id x3so3639999qcv.38 for ; Fri, 20 Jun 2014 08:40:21 -0700 (PDT) Received: from e35.co.us.ibm.com (e35.co.us.ibm.com. [32.97.110.153]) by mx.google.com with ESMTPS id k31si4075424qge.52.2014.06.20.08.40.20 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Fri, 20 Jun 2014 08:40:20 -0700 (PDT) Received: from /spool/local by e35.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! 
Violators will be prosecuted for from ; Fri, 20 Jun 2014 09:40:19 -0600 Received: from b03cxnp07028.gho.boulder.ibm.com (b03cxnp07028.gho.boulder.ibm.com [9.17.130.15]) by d03dlp03.boulder.ibm.com (Postfix) with ESMTP id 1E7BF19D8041 for ; Fri, 20 Jun 2014 09:40:08 -0600 (MDT) Received: from d03av06.boulder.ibm.com (d03av06.boulder.ibm.com [9.17.195.245]) by b03cxnp07028.gho.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id s5KFct9j7471470 for ; Fri, 20 Jun 2014 17:38:55 +0200 Received: from d03av06.boulder.ibm.com (loopback [127.0.0.1]) by d03av06.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id s5KFiCN4021685 for ; Fri, 20 Jun 2014 09:44:13 -0600 Date: Fri, 20 Jun 2014 08:40:14 -0700 From: "Paul E. McKenney" Subject: Re: slub/debugobjects: lockup when freeing memory Message-ID: <20140620154014.GC4904@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140619202928.GG4904@linux.vnet.ibm.com> <20140619205307.GL4904@linux.vnet.ibm.com> <20140619220449.GT4904@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Sender: owner-linux-mm@kvack.org List-ID: To: Thomas Gleixner Cc: Christoph Lameter , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML On Fri, Jun 20, 2014 at 10:17:32AM +0200, Thomas Gleixner wrote: > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > On Thu, Jun 19, 2014 at 11:32:41PM +0200, Thomas Gleixner wrote: > > > > > > > > > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > > > > > > On Thu, Jun 19, 2014 at 10:37:17PM +0200, Thomas Gleixner wrote: > > > > > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > > > > > On Thu, Jun 19, 2014 at 09:29:08PM +0200, Thomas Gleixner wrote: > > > > > > > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > > > > > > Well, no. 
> > > > > > > Look at the callchain:
> > > > > > >
> > > > > > >   __call_rcu
> > > > > > >     debug_object_activate
> > > > > > >       rcuhead_fixup_activate
> > > > > > >         debug_object_init
> > > > > > >           kmem_cache_alloc
> > > > > > >
> > > > > > > So call rcu activates the object, but the object has no reference in
> > > > > > > the debug objects code so the fixup code is called which inits the
> > > > > > > object and allocates a reference ....
> > > > > >
> > > > > > OK, got it. And you are right, call_rcu() has done this for a very
> > > > > > long time, so not sure what changed. But it seems like the right
> > > > > > approach is to provide a debug-object-free call_rcu_alloc() for use
> > > > > > by the memory allocators.
> > > > > >
> > > > > > Seem reasonable? If so, please see the following patch.
> > > > >
> > > > > Not really, you're torpedoing the whole purpose of debugobjects :)
> > > > >
> > > > > So, why can't we just init the rcu head when the stuff is created?
> > > >
> > > > That would allow me to keep my code unchanged, so I am in favor. ;-)
> > >
> > > Almost unchanged. You need to provide a function to do so, i.e. make
> > > use of
> > >
> > >     debug_init_rcu_head()
> >
> > You mean like this?
>
> I'd rather name it init_rcu_head() and free_rcu_head() w/o the debug_
> prefix, so it's consistent with init_rcu_head_on_stack /
> destroy_rcu_head_on_stack. But either way works for me.
>
> Acked-by: Thomas Gleixner

So just drop the _on_stack() from the other names, then. Please see below.

Thanx, Paul

------------------------------------------------------------------------

rcu: Export debug_init_rcu_head() and debug_rcu_head_free()

Currently, call_rcu() relies on implicit allocation and initialization
for the debug-objects handling of RCU callbacks. If you hammer the
kernel hard enough with Sasha's modified version of trinity, you can end
up with the sl*b allocators recursing into themselves via this implicit
call_rcu() allocation.

This commit therefore exports the debug_init_rcu_head() and
debug_rcu_head_free() functions, renamed to init_rcu_head() and
destroy_rcu_head(), which permits the allocators to allocate and
pre-initialize the debug-objects information, so that there is no longer
any need for call_rcu() to do that initialization, which in turn prevents
the recursion into the memory allocators.

Reported-by: Sasha Levin
Suggested-by: Thomas Gleixner
Signed-off-by: Paul E. McKenney
Acked-by: Thomas Gleixner

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 063a6bf1a2b6..37c92cfef9ec 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -358,9 +358,19 @@ void wait_rcu_gp(call_rcu_func_t crf);
  * initialization.
  */
 #ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD
+void init_rcu_head(struct rcu_head *head);
+void destroy_rcu_head(struct rcu_head *head);
 void init_rcu_head_on_stack(struct rcu_head *head);
 void destroy_rcu_head_on_stack(struct rcu_head *head);
 #else /* !CONFIG_DEBUG_OBJECTS_RCU_HEAD */
+static inline void init_rcu_head(struct rcu_head *head)
+{
+}
+
+static inline void destroy_rcu_head(struct rcu_head *head)
+{
+}
+
 static inline void init_rcu_head_on_stack(struct rcu_head *head)
 {
 }
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index a2aeb4df0f60..0fb691e63ce6 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -200,12 +200,12 @@ void wait_rcu_gp(call_rcu_func_t crf)
 EXPORT_SYMBOL_GPL(wait_rcu_gp);
 
 #ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD
-static inline void debug_init_rcu_head(struct rcu_head *head)
+void init_rcu_head(struct rcu_head *head)
 {
 	debug_object_init(head, &rcuhead_debug_descr);
 }
 
-static inline void debug_rcu_head_free(struct rcu_head *head)
+void destroy_rcu_head(struct rcu_head *head)
 {
 	debug_object_free(head, &rcuhead_debug_descr);
 }
-- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ .
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Sasha Levin Subject: slub/debugobjects: lockup when freeing memory Date: Thu, 19 Jun 2014 10:30:30 -0400 Message-ID: <53A2F406.4010109@oracle.com> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Return-path: Sender: linux-kernel-owner@vger.kernel.org To: Pekka Enberg , Christoph Lameter , Thomas Gleixner , Matt Mackall Cc: Andrew Morton , Dave Jones , "linux-mm@kvack.org" , "Paul E. McKenney" Dave Jones , LKML List-Id: linux-mm.kvack.org Hi all, While fuzzing with trinity inside a KVM tools guest running the latest -next kernel I've stumbled on the following spew. It seems to cause an actual lockup as hung task messages followed soon after. [ 690.762537] ============================================= [ 690.764196] [ INFO: possible recursive locking detected ] [ 690.765247] 3.16.0-rc1-next-20140618-sasha-00029-g9e4acf8-dirty #664 Tainted: G W [ 690.766457] --------------------------------------------- [ 690.767237] kworker/u95:0/256 is trying to acquire lock: [ 690.767886] (&(&n->list_lock)->rlock){-.-.-.}, at: get_partial_node.isra.35 (mm/slub.c:1630) [ 690.769162] [ 690.769162] but task is already holding lock: [ 690.769851] (&(&n->list_lock)->rlock){-.-.-.}, at: kmem_cache_close (mm/slub.c:3209 mm/slub.c:3233) [ 690.770137] [ 690.770137] other info that might help us debug this: [ 690.770137] Possible unsafe locking scenario: [ 690.770137] [ 690.770137] CPU0 [ 690.770137] ---- [ 690.770137] lock(&(&n->list_lock)->rlock); [ 690.770137] lock(&(&n->list_lock)->rlock); [ 690.770137] [ 690.770137] *** DEADLOCK *** [ 690.770137] [ 690.770137] May be due to missing lock nesting notation [ 690.770137] [ 690.770137] 7 locks held by kworker/u95:0/256: [ 690.770137] #0: ("%s"("netns")){.+.+.+}, at: process_one_work (include/linux/workqueue.h:185 kernel/workqueue.c:599 kernel/workqueue.c:626 kernel/workqueue.c:2074) [ 690.770137] #1: 
(net_cleanup_work){+.+.+.}, at: process_one_work (include/linux/workqueue.h:185 kernel/workqueue.c:599 kernel/workqueue.c:626 kernel/workqueue.c:2074) [ 690.770137] #2: (net_mutex){+.+.+.}, at: cleanup_net (net/core/net_namespace.c:287) [ 690.770137] #3: (cpu_hotplug.lock){++++++}, at: get_online_cpus (kernel/cpu.c:90) [ 690.770137] #4: (mem_hotplug.lock){.+.+.+}, at: get_online_mems (mm/memory_hotplug.c:83) [ 690.770137] #5: (slab_mutex){+.+.+.}, at: kmem_cache_destroy (mm/slab_common.c:343) [ 690.770137] #6: (&(&n->list_lock)->rlock){-.-.-.}, at: kmem_cache_close (mm/slub.c:3209 mm/slub.c:3233) [ 690.770137] [ 690.770137] stack backtrace: [ 690.770137] CPU: 18 PID: 256 Comm: kworker/u95:0 Tainted: G W 3.16.0-rc1-next-20140618-sasha-00029-g9e4acf8-dirty #664 [ 690.770137] Workqueue: netns cleanup_net [ 690.770137] ffff8808a172b000 ffff8808a1737628 ffffffff9d5179a0 0000000000000003 [ 690.770137] ffffffffa0b499c0 ffff8808a1737728 ffffffff9a1cac52 ffff8808a1737668 [ 690.770137] ffffffff9a1a74f8 23e00d8075e32f12 ffff8808a172b000 23e00d8000000001 [ 690.770137] Call Trace: [ 690.770137] dump_stack (lib/dump_stack.c:52) [ 690.770137] __lock_acquire (kernel/locking/lockdep.c:3034 kernel/locking/lockdep.c:3180) [ 690.770137] ? sched_clock_cpu (kernel/sched/clock.c:311) [ 690.770137] ? __lock_acquire (kernel/locking/lockdep.c:3189) [ 690.770137] ? __lock_acquire (kernel/locking/lockdep.c:3189) [ 690.770137] lock_acquire (./arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602) [ 690.770137] ? get_partial_node.isra.35 (mm/slub.c:1630) [ 690.770137] ? sched_clock (./arch/x86/include/asm/paravirt.h:192 arch/x86/kernel/tsc.c:305) [ 690.770137] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151) [ 690.770137] ? get_partial_node.isra.35 (mm/slub.c:1630) [ 690.770137] get_partial_node.isra.35 (mm/slub.c:1630) [ 690.770137] ? __slab_alloc (mm/slub.c:2304) [ 690.770137] ? 
__this_cpu_preempt_check (lib/smp_processor_id.c:63) [ 690.770137] __slab_alloc (mm/slub.c:1732 mm/slub.c:2205 mm/slub.c:2369) [ 690.770137] ? __lock_acquire (kernel/locking/lockdep.c:3189) [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) [ 690.770137] kmem_cache_alloc (mm/slub.c:2442 mm/slub.c:2484 mm/slub.c:2489) [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) [ 690.770137] ? debug_object_activate (lib/debugobjects.c:439) [ 690.770137] __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) [ 690.770137] debug_object_init (lib/debugobjects.c:365) [ 690.770137] rcuhead_fixup_activate (kernel/rcu/update.c:231) [ 690.770137] debug_object_activate (lib/debugobjects.c:280 lib/debugobjects.c:439) [ 690.770137] ? discard_slab (mm/slub.c:1486) [ 690.770137] __call_rcu (kernel/rcu/rcu.h:76 (discriminator 2) kernel/rcu/tree.c:2585 (discriminator 2)) [ 690.770137] call_rcu (kernel/rcu/tree_plugin.h:679) [ 690.770137] discard_slab (mm/slub.c:1515 mm/slub.c:1523) [ 690.770137] kmem_cache_close (mm/slub.c:3212 mm/slub.c:3233) [ 690.770137] ? trace_hardirqs_on (kernel/locking/lockdep.c:2607) [ 690.770137] __kmem_cache_shutdown (mm/slub.c:3245) [ 690.770137] kmem_cache_destroy (mm/slab_common.c:349) [ 690.770137] nf_conntrack_cleanup_net_list (net/netfilter/nf_conntrack_core.c:1569 (discriminator 2)) [ 690.770137] nf_conntrack_pernet_exit (net/netfilter/nf_conntrack_standalone.c:558) [ 690.770137] ops_exit_list.isra.1 (net/core/net_namespace.c:135) [ 690.770137] cleanup_net (net/core/net_namespace.c:302 (discriminator 2)) [ 690.770137] process_one_work (kernel/workqueue.c:2081 include/linux/jump_label.h:115 include/trace/events/workqueue.h:111 kernel/workqueue.c:2086) [ 690.770137] ? process_one_work (include/linux/workqueue.h:185 kernel/workqueue.c:599 kernel/workqueue.c:626 kernel/workqueue.c:2074) [ 690.770137] worker_thread (kernel/workqueue.c:2213) [ 690.770137] ? 
rescuer_thread (kernel/workqueue.c:2157) [ 690.770137] kthread (kernel/kthread.c:210) [ 690.770137] ? kthread_create_on_node (kernel/kthread.c:176) [ 690.770137] ret_from_fork (arch/x86/kernel/entry_64.S:349) Thanks, Sasha From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pd0-f181.google.com (mail-pd0-f181.google.com [209.85.192.181]) by kanga.kvack.org (Postfix) with ESMTP id B70D96B0035 for ; Sat, 12 Jul 2014 14:04:09 -0400 (EDT) Received: by mail-pd0-f181.google.com with SMTP id v10so3068617pde.12 for ; Sat, 12 Jul 2014 11:04:09 -0700 (PDT) Received: from userp1040.oracle.com (userp1040.oracle.com. [156.151.31.81]) by mx.google.com with ESMTPS id tx10si5622201pac.29.2014.07.12.11.04.07 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Sat, 12 Jul 2014 11:04:08 -0700 (PDT) Message-ID: <53C1788D.9080800@oracle.com> Date: Sat, 12 Jul 2014 14:03:57 -0400 From: Sasha Levin MIME-Version: 1.0 Subject: Re: slub/debugobjects: lockup when freeing memory References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140619202928.GG4904@linux.vnet.ibm.com> <20140619205307.GL4904@linux.vnet.ibm.com> <20140619220449.GT4904@linux.vnet.ibm.com> <20140620154014.GC4904@linux.vnet.ibm.com> In-Reply-To: <20140620154014.GC4904@linux.vnet.ibm.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: paulmck@linux.vnet.ibm.com, Thomas Gleixner Cc: Christoph Lameter , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML On 06/20/2014 11:40 AM, Paul E. McKenney wrote: > rcu: Export debug_init_rcu_head() and and debug_init_rcu_head() > > Currently, call_rcu() relies on implicit allocation and initialization > for the debug-objects handling of RCU callbacks. 
If you hammer the > kernel hard enough with Sasha's modified version of trinity, you can end > up with the sl*b allocators recursing into themselves via this implicit > call_rcu() allocation. > > This commit therefore exports the debug_init_rcu_head() and > debug_rcu_head_free() functions, which permits the allocators to allocated > and pre-initialize the debug-objects information, so that there no longer > any need for call_rcu() to do that initialization, which in turn prevents > the recursion into the memory allocators. > > Reported-by: Sasha Levin > Suggested-by: Thomas Gleixner > Signed-off-by: Paul E. McKenney > Acked-by: Thomas Gleixner Hi Paul, Oddly enough, I still see the issue in -next (I made sure that this patch was in the tree): [ 393.810123] ============================================= [ 393.810123] [ INFO: possible recursive locking detected ] [ 393.810123] 3.16.0-rc4-next-20140711-sasha-00046-g07d3099-dirty #813 Not tainted [ 393.810123] --------------------------------------------- [ 393.810123] trinity-c32/9762 is trying to acquire lock: [ 393.810123] (&(&n->list_lock)->rlock){-.-...}, at: get_partial_node.isra.39 (mm/slub.c:1628) [ 393.810123] [ 393.810123] but task is already holding lock: [ 393.810123] (&(&n->list_lock)->rlock){-.-...}, at: __kmem_cache_shutdown (mm/slub.c:3210 mm/slub.c:3233 mm/slub.c:3244) [ 393.810123] [ 393.810123] other info that might help us debug this: [ 393.810123] Possible unsafe locking scenario: [ 393.810123] [ 393.810123] CPU0 [ 393.810123] ---- [ 393.810123] lock(&(&n->list_lock)->rlock); [ 393.810123] lock(&(&n->list_lock)->rlock); [ 393.810123] [ 393.810123] *** DEADLOCK *** [ 393.810123] [ 393.810123] May be due to missing lock nesting notation [ 393.810123] [ 393.810123] 5 locks held by trinity-c32/9762: [ 393.810123] #0: (net_mutex){+.+.+.}, at: copy_net_ns (net/core/net_namespace.c:254) [ 393.810123] #1: (cpu_hotplug.lock){++++++}, at: get_online_cpus (kernel/cpu.c:90) [ 393.810123] #2: 
(mem_hotplug.lock){.+.+.+}, at: get_online_mems (mm/memory_hotplug.c:83) [ 393.810123] #3: (slab_mutex){+.+.+.}, at: kmem_cache_destroy (mm/slab_common.c:344) [ 393.810123] #4: (&(&n->list_lock)->rlock){-.-...}, at: __kmem_cache_shutdown (mm/slub.c:3210 mm/slub.c:3233 mm/slub.c:3244) [ 393.810123] [ 393.810123] stack backtrace: [ 393.810123] CPU: 32 PID: 9762 Comm: trinity-c32 Not tainted 3.16.0-rc4-next-20140711-sasha-00046-g07d3099-dirty #813 [ 393.843284] ffff880bc26730e0 0000000000000000 ffffffffb4ae7ff0 ffff880bc26a3848 [ 393.843284] ffffffffb0e47068 ffffffffb4ae7ff0 ffff880bc26a38f0 ffffffffac258586 [ 393.843284] ffff880bc2673e30 000000050000000a ffffffffb444dee0 ffff880bc2673e48 [ 393.843284] Call Trace: [ 393.843284] dump_stack (lib/dump_stack.c:52) [ 393.843284] __lock_acquire (kernel/locking/lockdep.c:1739 kernel/locking/lockdep.c:1783 kernel/locking/lockdep.c:2115 kernel/locking/lockdep.c:3182) [ 393.843284] lock_acquire (kernel/locking/lockdep.c:3602) [ 393.843284] ? get_partial_node.isra.39 (mm/slub.c:1628) [ 393.843284] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151) [ 393.843284] ? get_partial_node.isra.39 (mm/slub.c:1628) [ 393.843284] get_partial_node.isra.39 (mm/slub.c:1628) [ 393.843284] ? check_irq_usage (kernel/locking/lockdep.c:1638) [ 393.843284] ? __slab_alloc (mm/slub.c:2307) [ 393.843284] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63) [ 393.843284] __slab_alloc (mm/slub.c:1730 mm/slub.c:2208 mm/slub.c:2372) [ 393.843284] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) [ 393.843284] ? kvm_clock_read (./arch/x86/include/asm/preempt.h:90 arch/x86/kernel/kvmclock.c:86) [ 393.843284] ? sched_clock (./arch/x86/include/asm/paravirt.h:192 arch/x86/kernel/tsc.c:304) [ 393.843284] kmem_cache_alloc (mm/slub.c:2445 mm/slub.c:2487 mm/slub.c:2492) [ 393.843284] ? debug_smp_processor_id (lib/smp_processor_id.c:57) [ 393.843284] ? 
__debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) [ 393.843284] ? check_chain_key (kernel/locking/lockdep.c:2188) [ 393.843284] __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) [ 393.843284] ? _raw_spin_unlock_irqrestore (include/linux/spinlock_api_smp.h:160 kernel/locking/spinlock.c:191) [ 393.843284] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63) [ 393.843284] debug_object_init (lib/debugobjects.c:365) [ 393.843284] rcuhead_fixup_activate (kernel/rcu/update.c:260) [ 393.843284] debug_object_activate (lib/debugobjects.c:280 lib/debugobjects.c:439) [ 393.843284] ? preempt_count_sub (kernel/sched/core.c:2600) [ 393.843284] ? slab_cpuup_callback (mm/slub.c:1484) [ 393.843284] __call_rcu (kernel/rcu/rcu.h:76 (discriminator 8) kernel/rcu/tree.c:2665 (discriminator 8)) [ 393.843284] ? __kmem_cache_shutdown (mm/slub.c:3210 mm/slub.c:3233 mm/slub.c:3244) [ 393.843284] call_rcu (kernel/rcu/tree_plugin.h:679) [ 393.843284] discard_slab (mm/slub.c:1522) [ 393.843284] __kmem_cache_shutdown (mm/slub.c:3210 mm/slub.c:3233 mm/slub.c:3244) [ 393.843284] kmem_cache_destroy (mm/slab_common.c:350) [ 393.843284] nf_conntrack_cleanup_net_list (net/netfilter/nf_conntrack_core.c:1569 (discriminator 3)) [ 393.843284] nf_conntrack_pernet_exit (net/netfilter/nf_conntrack_standalone.c:558) [ 393.843284] ops_exit_list.isra.1 (net/core/net_namespace.c:135) [ 393.843284] setup_net (net/core/net_namespace.c:180 (discriminator 3)) [ 393.843284] copy_net_ns (net/core/net_namespace.c:255) [ 393.843284] create_new_namespaces (kernel/nsproxy.c:95) [ 393.843284] unshare_nsproxy_namespaces (kernel/nsproxy.c:190 (discriminator 4)) [ 393.843284] SyS_unshare (kernel/fork.c:1865 kernel/fork.c:1814) [ 393.843284] tracesys (arch/x86/kernel/entry_64.S:542) Thanks, Sasha -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-oa0-f46.google.com (mail-oa0-f46.google.com [209.85.219.46]) by kanga.kvack.org (Postfix) with ESMTP id 545CC6B0037 for ; Sat, 12 Jul 2014 15:33:58 -0400 (EDT) Received: by mail-oa0-f46.google.com with SMTP id m1so2663280oag.5 for ; Sat, 12 Jul 2014 12:33:58 -0700 (PDT) Received: from e36.co.us.ibm.com (e36.co.us.ibm.com. [32.97.110.154]) by mx.google.com with ESMTPS id pa2si10729703obb.49.2014.07.12.12.33.56 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Sat, 12 Jul 2014 12:33:57 -0700 (PDT) Received: from /spool/local by e36.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Sat, 12 Jul 2014 13:33:55 -0600 Received: from b03cxnp08025.gho.boulder.ibm.com (b03cxnp08025.gho.boulder.ibm.com [9.17.130.17]) by d03dlp03.boulder.ibm.com (Postfix) with ESMTP id 345CE19D8026 for ; Sat, 12 Jul 2014 13:33:43 -0600 (MDT) Received: from d03av06.boulder.ibm.com (d03av06.boulder.ibm.com [9.17.195.245]) by b03cxnp08025.gho.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id s6CJXq674653526 for ; Sat, 12 Jul 2014 21:33:52 +0200 Received: from d03av06.boulder.ibm.com (loopback [127.0.0.1]) by d03av06.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id s6CJbv8x022019 for ; Sat, 12 Jul 2014 13:37:57 -0600 Date: Sat, 12 Jul 2014 12:33:49 -0700 From: "Paul E. 
McKenney" Subject: Re: slub/debugobjects: lockup when freeing memory Message-ID: <20140712193349.GD16041@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <20140619165247.GA4904@linux.vnet.ibm.com> <20140619202928.GG4904@linux.vnet.ibm.com> <20140619205307.GL4904@linux.vnet.ibm.com> <20140619220449.GT4904@linux.vnet.ibm.com> <20140620154014.GC4904@linux.vnet.ibm.com> <53C1788D.9080800@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <53C1788D.9080800@oracle.com> Sender: owner-linux-mm@kvack.org List-ID: To: Sasha Levin Cc: Thomas Gleixner , Christoph Lameter , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML On Sat, Jul 12, 2014 at 02:03:57PM -0400, Sasha Levin wrote: > On 06/20/2014 11:40 AM, Paul E. McKenney wrote: > > rcu: Export debug_init_rcu_head() and and debug_init_rcu_head() > > > > Currently, call_rcu() relies on implicit allocation and initialization > > for the debug-objects handling of RCU callbacks. If you hammer the > > kernel hard enough with Sasha's modified version of trinity, you can end > > up with the sl*b allocators recursing into themselves via this implicit > > call_rcu() allocation. > > > > This commit therefore exports the debug_init_rcu_head() and > > debug_rcu_head_free() functions, which permits the allocators to allocated > > and pre-initialize the debug-objects information, so that there no longer > > any need for call_rcu() to do that initialization, which in turn prevents > > the recursion into the memory allocators. > > > > Reported-by: Sasha Levin > > Suggested-by: Thomas Gleixner > > Signed-off-by: Paul E. McKenney > > Acked-by: Thomas Gleixner > > Hi Paul, > > Oddly enough, I still see the issue in -next (I made sure that this patch > was in the tree): Hello, Sasha, This commit is only part of the solution. The allocators need to change to make use of it. 
Thanx, Paul
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ig0-f177.google.com (mail-ig0-f177.google.com [209.85.213.177]) by kanga.kvack.org (Postfix) with ESMTP id A05676B0035 for ; Mon, 18 Aug 2014 14:51:37 -0400 (EDT) Received: by mail-ig0-f177.google.com with SMTP id hn18so8511310igb.10 for ; Mon, 18 Aug 2014 11:51:37 -0700 (PDT) Received: from e35.co.us.ibm.com (e35.co.us.ibm.com.
[32.97.110.153]) by mx.google.com with ESMTPS id np5si21758152icc.97.2014.08.18.11.51.36 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Mon, 18 Aug 2014 11:51:36 -0700 (PDT) Received: from /spool/local by e35.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Mon, 18 Aug 2014 12:51:36 -0600 Received: from b03cxnp07029.gho.boulder.ibm.com (b03cxnp07029.gho.boulder.ibm.com [9.17.130.16]) by d03dlp02.boulder.ibm.com (Postfix) with ESMTP id 9097E3E4003D for ; Mon, 18 Aug 2014 12:51:34 -0600 (MDT) Received: from d03av06.boulder.ibm.com (d03av06.boulder.ibm.com [9.17.195.245]) by b03cxnp07029.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id s7IGlfvL3801560 for ; Mon, 18 Aug 2014 18:47:41 +0200 Received: from d03av06.boulder.ibm.com (loopback [127.0.0.1]) by d03av06.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id s7IItqm7000761 for ; Mon, 18 Aug 2014 12:55:53 -0600 Date: Mon, 18 Aug 2014 09:37:57 -0700 From: "Paul E. McKenney" Subject: Re: slub/debugobjects: lockup when freeing memory Message-ID: <20140818163757.GA30742@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Sender: owner-linux-mm@kvack.org List-ID: To: Christoph Lameter Cc: Thomas Gleixner , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML On Thu, Jun 19, 2014 at 03:19:39PM -0500, Christoph Lameter wrote: > On Thu, 19 Jun 2014, Thomas Gleixner wrote: > > > Well, no. 
Look at the callchain: > > > > __call_rcu > > debug_object_activate > > rcuhead_fixup_activate > > debug_object_init > > kmem_cache_alloc > > > > So call rcu activates the object, but the object has no reference in > > the debug objects code so the fixup code is called which inits the > > object and allocates a reference .... > > So we need to init the object in the page struct before the __call_rcu? And the needed APIs are now in mainline: void init_rcu_head(struct rcu_head *head); void destroy_rcu_head(struct rcu_head *head); Over to you, Christoph! ;-) Thanx, Paul -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f43.google.com (mail-pa0-f43.google.com [209.85.220.43]) by kanga.kvack.org (Postfix) with ESMTP id 5C3F46B0035 for ; Mon, 18 Aug 2014 23:44:41 -0400 (EDT) Received: by mail-pa0-f43.google.com with SMTP id lf10so9101954pab.16 for ; Mon, 18 Aug 2014 20:44:41 -0700 (PDT) Received: from qmta05.emeryville.ca.mail.comcast.net (qmta05.emeryville.ca.mail.comcast.net. [2001:558:fe2d:43:76:96:30:48]) by mx.google.com with ESMTPS id nk17si24711342pdb.140.2014.08.18.20.44.37 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Mon, 18 Aug 2014 20:44:37 -0700 (PDT) Date: Mon, 18 Aug 2014 22:44:34 -0500 (CDT) From: Christoph Lameter Subject: Re: slub/debugobjects: lockup when freeing memory In-Reply-To: <20140818163757.GA30742@linux.vnet.ibm.com> Message-ID: References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140818163757.GA30742@linux.vnet.ibm.com> Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-linux-mm@kvack.org List-ID: To: "Paul E. McKenney" Cc: Thomas Gleixner , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML On Mon, 18 Aug 2014, Paul E. 
McKenney wrote:

> > > So call rcu activates the object, but the object has no reference in
> > > the debug objects code so the fixup code is called which inits the
> > > object and allocates a reference ....
> >
> > So we need to init the object in the page struct before the __call_rcu?
>
> And the needed APIs are now in mainline:
>
>     void init_rcu_head(struct rcu_head *head);
>     void destroy_rcu_head(struct rcu_head *head);
>
> Over to you, Christoph! ;-)

The field we are using for the rcu head serves other purposes before
the free action. We cannot init the field at slab creation as we
thought, since it is used for the queueing of slabs on the partial, free
and full lists. The kmem_cache information is not available when doing
the freeing, so we must force the allocation of reserve fields and the
use of the reserved areas for rcu on all kmem_caches.

I made this conditional on CONFIG_RCU_DEBUG_XYZ, which is a placeholder
for the actual debug options that require allocations when initializing
rcu heads.

Also note that the allocations in the rcu head initialization must be
restricted to non-RCU slabs; otherwise the recursion may not terminate.

Subject: RFC: Allow allocations on initializing rcu fields in slub.

Signed-off-by: Christoph Lameter

Index: linux/mm/slub.c
===================================================================
--- linux.orig/mm/slub.c
+++ linux/mm/slub.c
@@ -1308,6 +1308,41 @@ static inline struct page *alloc_slab_pa
 	return page;
 }
 
+#ifdef CONFIG_RCU_DEBUG_XYZ
+/*
+ * We may have to do allocations during the initialization of the
+ * debug portion of the rcu structure for a slab. It must therefore
+ * be separately allocated and initialized on allocation.
+ * We cannot overload the lru field in the page struct at all.
+ */
+#define need_reserve_slab_rcu 1
+#else
+/*
+ * Overload the lru field in struct page if it fits.
+ * Should struct rcu_head grow due to debugging fields etc then
+ * additional space will be allocated from the end of the slab to
+ * store the rcu_head.
+ */
+#define need_reserve_slab_rcu \
+	(sizeof(((struct page *)NULL)->lru) < sizeof(struct rcu_head))
+#endif
+
+static struct rcu_head *get_rcu_head(struct kmem_cache *s, struct page *page)
+{
+	if (need_reserve_slab_rcu) {
+		int order = compound_order(page);
+		int offset = (PAGE_SIZE << order) - s->reserved;
+
+		VM_BUG_ON(s->reserved != sizeof(struct rcu_head));
+		return page_address(page) + offset;
+	} else {
+		/*
+		 * RCU free overloads the RCU head over the LRU
+		 */
+		return (void *)&page->lru;
+	}
+}
+
 static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
 {
 	struct page *page;
@@ -1357,6 +1392,21 @@ static struct page *allocate_slab(struct
 			kmemcheck_mark_unallocated_pages(page, pages);
 	}
 
+#ifdef CONFIG_RCU_DEBUG_XYZ
+	if (unlikely(s->flags & SLAB_DESTROY_BY_RCU) && page)
+		/*
+		 * Initialize rcu_head and potentially do other
+		 * allocations. Note that this is still a recursive
+		 * call into the allocator which may recurse endlessly
+		 * if the same kmem_cache is used for allocation here.
+		 *
+		 * So in order to be safe the slab caches used
+		 * in init_rcu_head must be restricted to be of the
+		 * non rcu kind only.
+		 */
+		init_rcu_head(get_rcu_head(s, page));
+#endif
+
 	if (flags & __GFP_WAIT)
 		local_irq_disable();
 	if (!page)
@@ -1452,13 +1502,13 @@ static void __free_slab(struct kmem_cach
 	memcg_uncharge_slab(s, order);
 }
 
-#define need_reserve_slab_rcu \
-	(sizeof(((struct page *)NULL)->lru) < sizeof(struct rcu_head))
-
 static void rcu_free_slab(struct rcu_head *h)
 {
 	struct page *page;
 
+#ifdef CONFIG_RCU_DEBUG_XYZ
+	destroy_rcu_head(h);
+#endif
 	if (need_reserve_slab_rcu)
 		page = virt_to_head_page(h);
 	else
@@ -1469,24 +1519,9 @@ static void rcu_free_slab(struct rcu_hea
 
 static void free_slab(struct kmem_cache *s, struct page *page)
 {
-	if (unlikely(s->flags & SLAB_DESTROY_BY_RCU)) {
-		struct rcu_head *head;
-
-		if (need_reserve_slab_rcu) {
-			int order = compound_order(page);
-			int offset = (PAGE_SIZE << order) - s->reserved;
-
-			VM_BUG_ON(s->reserved != sizeof(*head));
-			head = page_address(page) + offset;
-		} else {
-			/*
-			 * RCU free overloads the RCU head over the LRU
-			 */
-			head = (void *)&page->lru;
-		}
-
-		call_rcu(head, rcu_free_slab);
-	} else
+	if (unlikely(s->flags & SLAB_DESTROY_BY_RCU))
+		call_rcu(get_rcu_head(s, page), rcu_free_slab);
+	else
 		__free_slab(s, page);
 }
 
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ig0-f174.google.com (mail-ig0-f174.google.com [209.85.213.174]) by kanga.kvack.org (Postfix) with ESMTP id 474086B0035 for ; Mon, 18 Aug 2014 23:58:35 -0400 (EDT) Received: by mail-ig0-f174.google.com with SMTP id c1so9408968igq.7 for ; Mon, 18 Aug 2014 20:58:35 -0700 (PDT) Received: from e35.co.us.ibm.com (e35.co.us.ibm.com.
 [32.97.110.153]) by mx.google.com with ESMTPS id t8si10544744igs.16.2014.08.18.20.58.34 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Mon, 18 Aug 2014 20:58:34 -0700 (PDT)
Received: from /spool/local by e35.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Mon, 18 Aug 2014 21:58:33 -0600
Received: from b03cxnp07029.gho.boulder.ibm.com (b03cxnp07029.gho.boulder.ibm.com [9.17.130.16]) by d03dlp03.boulder.ibm.com (Postfix) with ESMTP id E826E19D8039 for ; Mon, 18 Aug 2014 21:58:18 -0600 (MDT)
Received: from d03av06.boulder.ibm.com (d03av06.boulder.ibm.com [9.17.195.245]) by b03cxnp07029.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id s7J1sbec3473744 for ; Tue, 19 Aug 2014 03:54:37 +0200
Received: from d03av06.boulder.ibm.com (loopback [127.0.0.1]) by d03av06.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id s7J42no1013731 for ; Mon, 18 Aug 2014 22:02:49 -0600
Date: Mon, 18 Aug 2014 20:58:28 -0700
From: "Paul E. McKenney"
Subject: Re: slub/debugobjects: lockup when freeing memory
Message-ID: <20140819035828.GI4752@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com
References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140818163757.GA30742@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Sender: owner-linux-mm@kvack.org
List-ID:
To: Christoph Lameter
Cc: Thomas Gleixner, Sasha Levin, Pekka Enberg, Matt Mackall, Andrew Morton, Dave Jones, "linux-mm@kvack.org", LKML

On Mon, Aug 18, 2014 at 10:44:34PM -0500, Christoph Lameter wrote:
> On Mon, 18 Aug 2014, Paul E. McKenney wrote:
>
> > > > So call rcu activates the object, but the object has no reference in
> > > > the debug objects code so the fixup code is called which inits the
> > > > object and allocates a reference ....
> > >
> > > So we need to init the object in the page struct before the __call_rcu?
> >
> > And the needed APIs are now in mainline:
> >
> > 	void init_rcu_head(struct rcu_head *head);
> > 	void destroy_rcu_head(struct rcu_head *head);
> >
> > Over to you, Christoph! ;-)
>
> The field we are using for the rcu head serves other purposes before
> the free action. We cannot init the field at slab creation as we
> thought since it is used for the queueing of slabs on the partial, free
> and full lists. The kmem_cache information is not available when doing
> the freeing so we must force the allocation of reserve fields and the
> use of the reserved areas for rcu on all kmem_caches.

Yow! I am glad I didn't try doing this myself!

> I made this conditional on CONFIG_RCU_DEBUG_XYZ. This needs to be the actual
> Debug options that will require allocations when initializing rcu heads.
>
> Also note that the allocations in the rcu head initialization must be
> restricted to non RCU slabs otherwise the recursion may not terminate.
>
>
> Subject RFC: Allow allocations on initializing rcu fields in slub.
>
> Signed-off-by: Christoph Lameter
>
> Index: linux/mm/slub.c
> ===================================================================
> --- linux.orig/mm/slub.c
> +++ linux/mm/slub.c
> @@ -1308,6 +1308,41 @@ static inline struct page *alloc_slab_pa
>  	return page;
>  }
>
> +#ifdef CONFIG_RCU_DEBUG_XYZ

If you make CONFIG_RCU_DEBUG_XYZ instead be CONFIG_DEBUG_OBJECTS_RCU_HEAD,
then it will automatically show up when it needs to.

The rest looks plausible, for whatever that is worth.

							Thanx, Paul

> +/*
> + * We may have to do allocations during the initialization of the
> + * debug portion of the rcu structure for a slab. It must therefore
> + * be separately allocated and initialized on allocation.
> + * We cannot overload the lru field in the page struct at all.
> + */
> +#define need_reserve_slab_rcu 1
> +#else
> +/*
> + * Overload the lru field in struct page if it fits.
> + * Should struct rcu_head grow due to debugging fields etc then
> + * additional space will be allocated from the end of the slab to
> + * store the rcu_head.
> + */
> +#define need_reserve_slab_rcu \
> +	(sizeof(((struct page *)NULL)->lru) < sizeof(struct rcu_head))
> +#endif
> +
> +static struct rcu_head *get_rcu_head(struct kmem_cache *s, struct page *page)
> +{
> +	if (need_reserve_slab_rcu) {
> +		int order = compound_order(page);
> +		int offset = (PAGE_SIZE << order) - s->reserved;
> +
> +		VM_BUG_ON(s->reserved != sizeof(struct rcu_head));
> +		return page_address(page) + offset;
> +	} else {
> +		/*
> +		 * RCU free overloads the RCU head over the LRU
> +		 */
> +		return (void *)&page->lru;
> +	}
> +}
> +
>  static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
>  {
>  	struct page *page;
> @@ -1357,6 +1392,21 @@ static struct page *allocate_slab(struct
>  		kmemcheck_mark_unallocated_pages(page, pages);
>  	}
>
> +#ifdef CONFIG_RCU_DEBUG_XYZ
> +	if (unlikely(s->flags & SLAB_DESTROY_BY_RCU) && page)
> +		/*
> +		 * Initialize rcu_head and potentially do other
> +		 * allocations. Note that this is still a recursive
> +		 * call into the allocator which may recurse endlessly
> +		 * if the same kmem_cache is used for allocation here.
> +		 *
> +		 * So in order to be safe the slab caches used
> +		 * in init_rcu_head must be restricted to be of the
> +		 * non rcu kind only.
> +		 */
> +		init_rcu_head(get_rcu_head(s, page));
> +#endif
> +
>  	if (flags & __GFP_WAIT)
>  		local_irq_disable();
>  	if (!page)
> @@ -1452,13 +1502,13 @@ static void __free_slab(struct kmem_cach
>  	memcg_uncharge_slab(s, order);
>  }
>
> -#define need_reserve_slab_rcu \
> -	(sizeof(((struct page *)NULL)->lru) < sizeof(struct rcu_head))
> -
>  static void rcu_free_slab(struct rcu_head *h)
>  {
>  	struct page *page;
>
> +#ifdef CONFIG_RCU_DEBUG_XYZ
> +	destroy_rcu_head(h);
> +#endif
>  	if (need_reserve_slab_rcu)
>  		page = virt_to_head_page(h);
>  	else
> @@ -1469,24 +1519,9 @@ static void rcu_free_slab(struct rcu_hea
>
>  static void free_slab(struct kmem_cache *s, struct page *page)
>  {
> -	if (unlikely(s->flags & SLAB_DESTROY_BY_RCU)) {
> -		struct rcu_head *head;
> -
> -		if (need_reserve_slab_rcu) {
> -			int order = compound_order(page);
> -			int offset = (PAGE_SIZE << order) - s->reserved;
> -
> -			VM_BUG_ON(s->reserved != sizeof(*head));
> -			head = page_address(page) + offset;
> -		} else {
> -			/*
> -			 * RCU free overloads the RCU head over the LRU
> -			 */
> -			head = (void *)&page->lru;
> -		}
> -
> -		call_rcu(head, rcu_free_slab);
> -	} else
> +	if (unlikely(s->flags & SLAB_DESTROY_BY_RCU))
> +		call_rcu(get_rcu_head(s, page), rcu_free_slab);
> +	else
>  		__free_slab(s, page);
>  }
>

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pd0-f182.google.com (mail-pd0-f182.google.com [209.85.192.182]) by kanga.kvack.org (Postfix) with ESMTP id B83D56B0035 for ; Tue, 19 Aug 2014 22:00:10 -0400 (EDT)
Received: by mail-pd0-f182.google.com with SMTP id fp1so10913996pdb.13 for ; Tue, 19 Aug 2014 19:00:10 -0700 (PDT)
Received: from qmta13.emeryville.ca.mail.comcast.net (qmta13.emeryville.ca.mail.comcast.net.
 [2001:558:fe2d:44:76:96:27:243]) by mx.google.com with ESMTP id sf8si17580091pbb.149.2014.08.19.19.00.09 for ; Tue, 19 Aug 2014 19:00:09 -0700 (PDT)
Date: Tue, 19 Aug 2014 21:00:05 -0500 (CDT)
From: Christoph Lameter
Subject: Re: slub/debugobjects: lockup when freeing memory
In-Reply-To: <20140819035828.GI4752@linux.vnet.ibm.com>
Message-ID:
References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140818163757.GA30742@linux.vnet.ibm.com> <20140819035828.GI4752@linux.vnet.ibm.com>
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-linux-mm@kvack.org
List-ID:
To: "Paul E. McKenney"
Cc: Thomas Gleixner, Sasha Levin, Pekka Enberg, Matt Mackall, Andrew Morton, Dave Jones, "linux-mm@kvack.org", LKML

On Mon, 18 Aug 2014, Paul E. McKenney wrote:

> > +#ifdef CONFIG_RCU_DEBUG_XYZ
>
> If you make CONFIG_RCU_DEBUG_XYZ instead be CONFIG_DEBUG_OBJECTS_RCU_HEAD,
> then it will automatically show up when it needs to.

Ok.

> The rest looks plausible, for whatever that is worth.

We talked in the hallway about init_rcu_head not touching
the contents of the rcu_head. If that is the case then we can simplify
the patch.

We could also remove the #ifdefs if init_rcu_head and destroy_rcu_head
are no ops if CONFIG_DEBUG_OBJECTS_RCU_HEAD is not defined.
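
[Editorial illustration, not part of the original message.] The idea
of dropping the #ifdefs at the call sites can be sketched in plain
userspace C. This is only a minimal sketch under stated assumptions:
the struct layout, the tracked_heads counter and the
head_contents_preserved() helper are hypothetical stand-ins, and the
debug branch merely imitates bookkeeping kept in separate storage; it
is not the kernel's debug-objects implementation.

```c
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct rcu_head. */
struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *head);
};

/* Stand-in for separate bookkeeping that a debug build would keep. */
static int tracked_heads;

#ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD
/* Debug build: record the head elsewhere, leave its contents alone. */
static inline void init_rcu_head(struct rcu_head *head)
{
	(void)head;
	tracked_heads++;
}

static inline void destroy_rcu_head(struct rcu_head *head)
{
	(void)head;
	tracked_heads--;
}
#else
/* Non-debug build: empty stubs, so callers need no #ifdef of their own. */
static inline void init_rcu_head(struct rcu_head *head) { (void)head; }
static inline void destroy_rcu_head(struct rcu_head *head) { (void)head; }
#endif

/*
 * Hypothetical caller: invokes both hooks unconditionally and reports
 * whether the head's contents survived untouched (1) or not (0).
 */
static int head_contents_preserved(void)
{
	struct rcu_head sentinel = { NULL, NULL };
	struct rcu_head head = { &sentinel, NULL };

	init_rcu_head(&head);
	/* In between, the same storage may be reused, e.g. as a list link. */
	destroy_rcu_head(&head);

	return head.next == &sentinel && head.func == NULL;
}
```

With the option undefined both calls compile to nothing, which is why
a caller written this way needs no conditional compilation of its own.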

Index: linux/mm/slub.c
===================================================================
--- linux.orig/mm/slub.c
+++ linux/mm/slub.c
@@ -1308,6 +1308,25 @@ static inline struct page *alloc_slab_pa
 	return page;
 }
 
+#define need_reserve_slab_rcu \
+	(sizeof(((struct page *)NULL)->lru) < sizeof(struct rcu_head))
+
+static struct rcu_head *get_rcu_head(struct kmem_cache *s, struct page *page)
+{
+	if (need_reserve_slab_rcu) {
+		int order = compound_order(page);
+		int offset = (PAGE_SIZE << order) - s->reserved;
+
+		VM_BUG_ON(s->reserved != sizeof(struct rcu_head));
+		return page_address(page) + offset;
+	} else {
+		/*
+		 * RCU free overloads the RCU head over the LRU
+		 */
+		return (void *)&page->lru;
+	}
+}
+
 static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
 {
 	struct page *page;
@@ -1357,6 +1376,22 @@ static struct page *allocate_slab(struct
 		kmemcheck_mark_unallocated_pages(page, pages);
 	}
 
+#ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD
+	if (unlikely(s->flags & SLAB_DESTROY_BY_RCU) && page)
+		/*
+		 * Initialize various things. However, this init is
+		 * not allowed to modify the contents of the rcu head.
+		 * Allocations are permitted. However, the use of
+		 * the same cache or another cache with SLAB_DESTROY_BY_RCU
+		 * set may cause additional recursions.
+		 *
+		 * So in order to be safe the slab caches used
+		 * in init_rcu_head should be restricted to be of the
+		 * non rcu kind only.
+		 */
+		init_rcu_head(get_rcu_head(s, page));
+#endif
+
 	if (flags & __GFP_WAIT)
 		local_irq_disable();
 	if (!page)
@@ -1452,13 +1487,13 @@ static void __free_slab(struct kmem_cach
 	memcg_uncharge_slab(s, order);
 }
 
-#define need_reserve_slab_rcu \
-	(sizeof(((struct page *)NULL)->lru) < sizeof(struct rcu_head))
-
 static void rcu_free_slab(struct rcu_head *h)
 {
 	struct page *page;
 
+#ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD
+	destroy_rcu_head(h);
+#endif
 	if (need_reserve_slab_rcu)
 		page = virt_to_head_page(h);
 	else
@@ -1469,24 +1504,9 @@ static void rcu_free_slab(struct rcu_hea
 
 static void free_slab(struct kmem_cache *s, struct page *page)
 {
-	if (unlikely(s->flags & SLAB_DESTROY_BY_RCU)) {
-		struct rcu_head *head;
-
-		if (need_reserve_slab_rcu) {
-			int order = compound_order(page);
-			int offset = (PAGE_SIZE << order) - s->reserved;
-
-			VM_BUG_ON(s->reserved != sizeof(*head));
-			head = page_address(page) + offset;
-		} else {
-			/*
-			 * RCU free overloads the RCU head over the LRU
-			 */
-			head = (void *)&page->lru;
-		}
-
-		call_rcu(head, rcu_free_slab);
-	} else
+	if (unlikely(s->flags & SLAB_DESTROY_BY_RCU))
+		call_rcu(get_rcu_head(s, page), rcu_free_slab);
+	else
 		__free_slab(s, page);
 }
 

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-ig0-f169.google.com (mail-ig0-f169.google.com [209.85.213.169]) by kanga.kvack.org (Postfix) with ESMTP id E4F106B0035 for ; Tue, 19 Aug 2014 22:31:28 -0400 (EDT)
Received: by mail-ig0-f169.google.com with SMTP id r2so10568972igi.4 for ; Tue, 19 Aug 2014 19:31:28 -0700 (PDT)
Received: from e33.co.us.ibm.com (e33.co.us.ibm.com.
 [32.97.110.151]) by mx.google.com with ESMTPS id i12si1261758igt.5.2014.08.19.19.31.27 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Tue, 19 Aug 2014 19:31:28 -0700 (PDT)
Received: from /spool/local by e33.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 19 Aug 2014 20:31:27 -0600
Received: from b03cxnp07029.gho.boulder.ibm.com (b03cxnp07029.gho.boulder.ibm.com [9.17.130.16]) by d03dlp02.boulder.ibm.com (Postfix) with ESMTP id 0523B3E4004E for ; Tue, 19 Aug 2014 20:31:24 -0600 (MDT)
Received: from d03av06.boulder.ibm.com (d03av06.boulder.ibm.com [9.17.195.245]) by b03cxnp07029.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id s7K0RUux6029676 for ; Wed, 20 Aug 2014 02:27:30 +0200
Received: from d03av06.boulder.ibm.com (loopback [127.0.0.1]) by d03av06.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id s7K2Zg1A007900 for ; Tue, 19 Aug 2014 20:35:43 -0600
Date: Tue, 19 Aug 2014 19:31:21 -0700
From: "Paul E. McKenney"
Subject: Re: slub/debugobjects: lockup when freeing memory
Message-ID: <20140820023121.GS4752@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com
References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140818163757.GA30742@linux.vnet.ibm.com> <20140819035828.GI4752@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Sender: owner-linux-mm@kvack.org
List-ID:
To: Christoph Lameter
Cc: Thomas Gleixner, Sasha Levin, Pekka Enberg, Matt Mackall, Andrew Morton, Dave Jones, "linux-mm@kvack.org", LKML

On Tue, Aug 19, 2014 at 09:00:05PM -0500, Christoph Lameter wrote:
> On Mon, 18 Aug 2014, Paul E. McKenney wrote:
>
> > > +#ifdef CONFIG_RCU_DEBUG_XYZ
> >
> > If you make CONFIG_RCU_DEBUG_XYZ instead be CONFIG_DEBUG_OBJECTS_RCU_HEAD,
> > then it will automatically show up when it needs to.
>
> Ok.
>
> > The rest looks plausible, for whatever that is worth.
>
> We talked in the hallway about init_rcu_head not touching
> the contents of the rcu_head. If that is the case then we can simplify
> the patch.

That is correct -- the debug-objects code uses separate storage to track
states, and does not touch the memory to which the state applies.

> We could also remove the #ifdefs if init_rcu_head and destroy_rcu_head
> are no ops if CONFIG_DEBUG_OBJECTS_RCU_HEAD is not defined.

And indeed they are, good point! It appears to me that both sets of
#ifdefs can go away.

							Thanx, Paul

> Index: linux/mm/slub.c
> ===================================================================
> --- linux.orig/mm/slub.c
> +++ linux/mm/slub.c
> @@ -1308,6 +1308,25 @@ static inline struct page *alloc_slab_pa
>  	return page;
>  }
>
> +#define need_reserve_slab_rcu \
> +	(sizeof(((struct page *)NULL)->lru) < sizeof(struct rcu_head))
> +
> +static struct rcu_head *get_rcu_head(struct kmem_cache *s, struct page *page)
> +{
> +	if (need_reserve_slab_rcu) {
> +		int order = compound_order(page);
> +		int offset = (PAGE_SIZE << order) - s->reserved;
> +
> +		VM_BUG_ON(s->reserved != sizeof(struct rcu_head));
> +		return page_address(page) + offset;
> +	} else {
> +		/*
> +		 * RCU free overloads the RCU head over the LRU
> +		 */
> +		return (void *)&page->lru;
> +	}
> +}
> +
>  static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
>  {
>  	struct page *page;
> @@ -1357,6 +1376,22 @@ static struct page *allocate_slab(struct
>  		kmemcheck_mark_unallocated_pages(page, pages);
>  	}
>
> +#ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD
> +	if (unlikely(s->flags & SLAB_DESTROY_BY_RCU) && page)
> +		/*
> +		 * Initialize various things. However, this init is
> +		 * not allowed to modify the contents of the rcu head.
> +		 * Allocations are permitted. However, the use of
> +		 * the same cache or another cache with SLAB_DESTROY_BY_RCU
> +		 * set may cause additional recursions.
> +		 *
> +		 * So in order to be safe the slab caches used
> +		 * in init_rcu_head should be restricted to be of the
> +		 * non rcu kind only.
> +		 */
> +		init_rcu_head(get_rcu_head(s, page));
> +#endif
> +
>  	if (flags & __GFP_WAIT)
>  		local_irq_disable();
>  	if (!page)
> @@ -1452,13 +1487,13 @@ static void __free_slab(struct kmem_cach
>  	memcg_uncharge_slab(s, order);
>  }
>
> -#define need_reserve_slab_rcu \
> -	(sizeof(((struct page *)NULL)->lru) < sizeof(struct rcu_head))
> -
>  static void rcu_free_slab(struct rcu_head *h)
>  {
>  	struct page *page;
>
> +#ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD
> +	destroy_rcu_head(h);
> +#endif
>  	if (need_reserve_slab_rcu)
>  		page = virt_to_head_page(h);
>  	else
> @@ -1469,24 +1504,9 @@ static void rcu_free_slab(struct rcu_hea
>
>  static void free_slab(struct kmem_cache *s, struct page *page)
>  {
> -	if (unlikely(s->flags & SLAB_DESTROY_BY_RCU)) {
> -		struct rcu_head *head;
> -
> -		if (need_reserve_slab_rcu) {
> -			int order = compound_order(page);
> -			int offset = (PAGE_SIZE << order) - s->reserved;
> -
> -			VM_BUG_ON(s->reserved != sizeof(*head));
> -			head = page_address(page) + offset;
> -		} else {
> -			/*
> -			 * RCU free overloads the RCU head over the LRU
> -			 */
> -			head = (void *)&page->lru;
> -		}
> -
> -		call_rcu(head, rcu_free_slab);
> -	} else
> +	if (unlikely(s->flags & SLAB_DESTROY_BY_RCU))
> +		call_rcu(get_rcu_head(s, page), rcu_free_slab);
> +	else
>  		__free_slab(s, page);
>  }
>

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pd0-f180.google.com (mail-pd0-f180.google.com [209.85.192.180]) by kanga.kvack.org (Postfix) with ESMTP id 09F136B0035 for ; Wed, 20 Aug 2014 02:01:23 -0400 (EDT)
Received: by mail-pd0-f180.google.com with SMTP id v10so11206266pde.39 for ; Tue, 19 Aug 2014 23:01:23 -0700 (PDT)
Received: from qmta14.emeryville.ca.mail.comcast.net (qmta14.emeryville.ca.mail.comcast.net. [2001:558:fe2d:44:76:96:27:212]) by mx.google.com with ESMTP id t9si30208709pas.58.2014.08.19.23.01.22 for ; Tue, 19 Aug 2014 23:01:22 -0700 (PDT)
Date: Wed, 20 Aug 2014 01:01:19 -0500 (CDT)
From: Christoph Lameter
Subject: Re: slub/debugobjects: lockup when freeing memory
In-Reply-To: <20140820023121.GS4752@linux.vnet.ibm.com>
Message-ID:
References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140818163757.GA30742@linux.vnet.ibm.com> <20140819035828.GI4752@linux.vnet.ibm.com> <20140820023121.GS4752@linux.vnet.ibm.com>
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-linux-mm@kvack.org
List-ID:
To: "Paul E. McKenney"
Cc: Thomas Gleixner, Sasha Levin, Pekka Enberg, Matt Mackall, Andrew Morton, Dave Jones, "linux-mm@kvack.org", LKML

On Tue, 19 Aug 2014, Paul E. McKenney wrote:

> > We could also remove the #ifdefs if init_rcu_head and destroy_rcu_head
> > are no ops if CONFIG_DEBUG_OBJECTS_RCU_HEAD is not defined.
>
> And indeed they are, good point! It appears to me that both sets of
> #ifdefs can go away.

Ok then this is a first workable version I think. How do we test this?

From: Christoph Lameter
Subject: slub: Add init/destroy function calls for rcu_heads

In order to do proper debugging for rcu_head use we need some
additional structures allocated when an object potentially
using a rcu_head is allocated in the slub allocator.

This adds the proper calls to init_rcu_head()
and destroy_rcu_head().

init_rcu_head() is a bit of an unusual function since:
1. It does not touch the contents of the rcu_head. This is
   required since the rcu_head is only used during
   slab_page freeing. Outside of that the same memory location
   is used for slab page list management. However, the
   initialization occurs when the slab page is initially allocated.
   So in the time between init_rcu_head() and destroy_rcu_head()
   there may be multiple uses of the indicated address as a
   list_head.

2. It is called without gfp flags and could potentially
   be called from atomic contexts. Allocations from init_rcu_head()
   context need to deal with this.

3. init_rcu_head() is called from within the slab allocation
   functions. Since init_rcu_head() calls the allocator again
   for more allocations it must avoid using slabs that use
   rcu freeing. Otherwise endless recursion may occur
   (We may have to convince lockdep that what we do here is sane).

Signed-off-by: Christoph Lameter

Index: linux/mm/slub.c
===================================================================
--- linux.orig/mm/slub.c
+++ linux/mm/slub.c
@@ -1308,6 +1308,25 @@ static inline struct page *alloc_slab_pa
 	return page;
 }
 
+#define need_reserve_slab_rcu \
+	(sizeof(((struct page *)NULL)->lru) < sizeof(struct rcu_head))
+
+static struct rcu_head *get_rcu_head(struct kmem_cache *s, struct page *page)
+{
+	if (need_reserve_slab_rcu) {
+		int order = compound_order(page);
+		int offset = (PAGE_SIZE << order) - s->reserved;
+
+		VM_BUG_ON(s->reserved != sizeof(struct rcu_head));
+		return page_address(page) + offset;
+	} else {
+		/*
+		 * RCU free overloads the RCU head over the LRU
+		 */
+		return (void *)&page->lru;
+	}
+}
+
 static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
 {
 	struct page *page;
@@ -1357,6 +1376,29 @@ static struct page *allocate_slab(struct
 		kmemcheck_mark_unallocated_pages(page, pages);
 	}
 
+	if (unlikely(s->flags & SLAB_DESTROY_BY_RCU) && page)
+		/*
+		 * Initialize various things. However, this init is
+		 * not allowed to modify the contents of the rcu head.
+		 * The allocator typically overloads the rcu head over
+		 * page->lru which is also used to manage lists of
+		 * slab pages.
+		 *
+		 * Allocations are permitted in init_rcu_head().
+		 * However, the use of the same cache or another
+		 * cache with SLAB_DESTROY_BY_RCU set will cause
+		 * additional recursions.
+		 *
+		 * So in order to be safe the slab caches used
+		 * in init_rcu_head() should be restricted to be of the
+		 * non rcu kind only.
+		 *
+		 * Note also that no GFPFLAG is passed. The function
+		 * may therefore be called from atomic contexts
+		 * and somehow(?) needs to do the right thing.
+		 */
+		init_rcu_head(get_rcu_head(s, page));
+
 	if (flags & __GFP_WAIT)
 		local_irq_disable();
 	if (!page)
@@ -1452,13 +1494,11 @@ static void __free_slab(struct kmem_cach
 	memcg_uncharge_slab(s, order);
 }
 
-#define need_reserve_slab_rcu \
-	(sizeof(((struct page *)NULL)->lru) < sizeof(struct rcu_head))
-
 static void rcu_free_slab(struct rcu_head *h)
 {
 	struct page *page;
 
+	destroy_rcu_head(h);
 	if (need_reserve_slab_rcu)
 		page = virt_to_head_page(h);
 	else
@@ -1469,24 +1509,9 @@ static void rcu_free_slab(struct rcu_hea
 
 static void free_slab(struct kmem_cache *s, struct page *page)
 {
-	if (unlikely(s->flags & SLAB_DESTROY_BY_RCU)) {
-		struct rcu_head *head;
-
-		if (need_reserve_slab_rcu) {
-			int order = compound_order(page);
-			int offset = (PAGE_SIZE << order) - s->reserved;
-
-			VM_BUG_ON(s->reserved != sizeof(*head));
-			head = page_address(page) + offset;
-		} else {
-			/*
-			 * RCU free overloads the RCU head over the LRU
-			 */
-			head = (void *)&page->lru;
-		}
-
-		call_rcu(head, rcu_free_slab);
-	} else
+	if (unlikely(s->flags & SLAB_DESTROY_BY_RCU))
+		call_rcu(get_rcu_head(s, page), rcu_free_slab);
+	else
 		__free_slab(s, page);
 }
 

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-ig0-f170.google.com (mail-ig0-f170.google.com [209.85.213.170]) by kanga.kvack.org (Postfix) with ESMTP id 50F726B0035 for ; Wed, 20 Aug 2014 08:20:06 -0400 (EDT)
Received: by mail-ig0-f170.google.com with SMTP id h3so11056741igd.5 for ; Wed, 20 Aug 2014 05:20:06 -0700 (PDT)
Received: from e34.co.us.ibm.com (e34.co.us.ibm.com. [32.97.110.152]) by mx.google.com with ESMTPS id gj19si26740295icb.4.2014.08.20.05.20.05 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Wed, 20 Aug 2014 05:20:05 -0700 (PDT)
Received: from /spool/local by e34.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Wed, 20 Aug 2014 06:20:04 -0600
Received: from b03cxnp07029.gho.boulder.ibm.com (b03cxnp07029.gho.boulder.ibm.com [9.17.130.16]) by d03dlp03.boulder.ibm.com (Postfix) with ESMTP id DA71319D803F for ; Wed, 20 Aug 2014 06:19:50 -0600 (MDT)
Received: from d03av06.boulder.ibm.com (d03av06.boulder.ibm.com [9.17.195.245]) by b03cxnp07029.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id s7KAG8Zd9699660 for ; Wed, 20 Aug 2014 12:16:08 +0200
Received: from d03av06.boulder.ibm.com (loopback [127.0.0.1]) by d03av06.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id s7KCOLrw019643 for ; Wed, 20 Aug 2014 06:24:21 -0600
Date: Wed, 20 Aug 2014 05:19:59 -0700
From: "Paul E.
 McKenney"
Subject: Re: slub/debugobjects: lockup when freeing memory
Message-ID: <20140820121959.GT4752@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com
References: <20140619165247.GA4904@linux.vnet.ibm.com> <20140818163757.GA30742@linux.vnet.ibm.com> <20140819035828.GI4752@linux.vnet.ibm.com> <20140820023121.GS4752@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Sender: owner-linux-mm@kvack.org
List-ID:
To: Christoph Lameter
Cc: Thomas Gleixner, Sasha Levin, Pekka Enberg, Matt Mackall, Andrew Morton, Dave Jones, "linux-mm@kvack.org", LKML

On Wed, Aug 20, 2014 at 01:01:19AM -0500, Christoph Lameter wrote:
> On Tue, 19 Aug 2014, Paul E. McKenney wrote:
>
> > > We could also remove the #ifdefs if init_rcu_head and destroy_rcu_head
> > > are no ops if CONFIG_DEBUG_OBJECTS_RCU_HEAD is not defined.
> >
> > And indeed they are, good point! It appears to me that both sets of
> > #ifdefs can go away.
>
> Ok then this is a first workable version I think. How do we test this?

It looks good to me.

Sasha, could you please try this out? This should fix the problem
you reported here: https://lkml.org/lkml/2014/6/19/306

							Thanx, Paul

> From: Christoph Lameter
> Subject: slub: Add init/destroy function calls for rcu_heads
>
> In order to do proper debugging for rcu_head use we need some
> additional structures allocated when an object potentially
> using a rcu_head is allocated in the slub allocator.
>
> This adds the proper calls to init_rcu_head()
> and destroy_rcu_head().
>
> init_rcu_head() is a bit of an unusual function since:
> 1. It does not touch the contents of the rcu_head. This is
>    required since the rcu_head is only used during
>    slab_page freeing. Outside of that the same memory location
>    is used for slab page list management. However, the
>    initialization occurs when the slab page is initially allocated.
>    So in the time between init_rcu_head() and destroy_rcu_head()
>    there may be multiple uses of the indicated address as a
>    list_head.
>
> 2. It is called without gfp flags and could potentially
>    be called from atomic contexts. Allocations from init_rcu_head()
>    context need to deal with this.
>
> 3. init_rcu_head() is called from within the slab allocation
>    functions. Since init_rcu_head() calls the allocator again
>    for more allocations it must avoid using slabs that use
>    rcu freeing. Otherwise endless recursion may occur
>    (We may have to convince lockdep that what we do here is sane).
>
> Signed-off-by: Christoph Lameter
>
> Index: linux/mm/slub.c
> ===================================================================
> --- linux.orig/mm/slub.c
> +++ linux/mm/slub.c
> @@ -1308,6 +1308,25 @@ static inline struct page *alloc_slab_pa
>  	return page;
>  }
>
> +#define need_reserve_slab_rcu \
> +	(sizeof(((struct page *)NULL)->lru) < sizeof(struct rcu_head))
> +
> +static struct rcu_head *get_rcu_head(struct kmem_cache *s, struct page *page)
> +{
> +	if (need_reserve_slab_rcu) {
> +		int order = compound_order(page);
> +		int offset = (PAGE_SIZE << order) - s->reserved;
> +
> +		VM_BUG_ON(s->reserved != sizeof(struct rcu_head));
> +		return page_address(page) + offset;
> +	} else {
> +		/*
> +		 * RCU free overloads the RCU head over the LRU
> +		 */
> +		return (void *)&page->lru;
> +	}
> +}
> +
>  static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
>  {
>  	struct page *page;
> @@ -1357,6 +1376,29 @@ static struct page *allocate_slab(struct
>  		kmemcheck_mark_unallocated_pages(page, pages);
>  	}
>
> +	if (unlikely(s->flags & SLAB_DESTROY_BY_RCU) && page)
> +		/*
> +		 * Initialize various things. However, this init is
> +		 * not allowed to modify the contents of the rcu head.
> +		 * The allocator typically overloads the rcu head over
> +		 * page->lru which is also used to manage lists of
> +		 * slab pages.
> +		 *
> +		 * Allocations are permitted in init_rcu_head().
> +		 * However, the use of the same cache or another
> +		 * cache with SLAB_DESTROY_BY_RCU set will cause
> +		 * additional recursions.
> +		 *
> +		 * So in order to be safe the slab caches used
> +		 * in init_rcu_head() should be restricted to be of the
> +		 * non rcu kind only.
> +		 *
> +		 * Note also that no GFPFLAG is passed. The function
> +		 * may therefore be called from atomic contexts
> +		 * and somehow(?) needs to do the right thing.
> +		 */
> +		init_rcu_head(get_rcu_head(s, page));
> +
>  	if (flags & __GFP_WAIT)
>  		local_irq_disable();
>  	if (!page)
> @@ -1452,13 +1494,11 @@ static void __free_slab(struct kmem_cach
>  	memcg_uncharge_slab(s, order);
>  }
>
> -#define need_reserve_slab_rcu \
> -	(sizeof(((struct page *)NULL)->lru) < sizeof(struct rcu_head))
> -
>  static void rcu_free_slab(struct rcu_head *h)
>  {
>  	struct page *page;
>
> +	destroy_rcu_head(h);
>  	if (need_reserve_slab_rcu)
>  		page = virt_to_head_page(h);
>  	else
> @@ -1469,24 +1509,9 @@ static void rcu_free_slab(struct rcu_hea
>
>  static void free_slab(struct kmem_cache *s, struct page *page)
>  {
> -	if (unlikely(s->flags & SLAB_DESTROY_BY_RCU)) {
> -		struct rcu_head *head;
> -
> -		if (need_reserve_slab_rcu) {
> -			int order = compound_order(page);
> -			int offset = (PAGE_SIZE << order) - s->reserved;
> -
> -			VM_BUG_ON(s->reserved != sizeof(*head));
> -			head = page_address(page) + offset;
> -		} else {
> -			/*
> -			 * RCU free overloads the RCU head over the LRU
> -			 */
> -			head = (void *)&page->lru;
> -		}
> -
> -		call_rcu(head, rcu_free_slab);
> -	} else
> +	if (unlikely(s->flags & SLAB_DESTROY_BY_RCU))
> +		call_rcu(get_rcu_head(s, page), rcu_free_slab);
> +	else
>  		__free_slab(s, page);
>  }
>
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758047AbaFSOa7 (ORCPT ); Thu, 19 Jun 2014 10:30:59 -0400 Received: from aserp1040.oracle.com ([141.146.126.69]:38418 "EHLO aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757835AbaFSOa5 (ORCPT ); Thu, 19 Jun 2014 10:30:57 -0400 Message-ID: <53A2F406.4010109@oracle.com> Date: Thu, 19 Jun 2014 10:30:30 -0400 From: Sasha Levin User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0 MIME-Version: 1.0 To: Pekka Enberg , Christoph Lameter , Thomas Gleixner , Matt Mackall CC: Andrew Morton , Dave Jones , "linux-mm@kvack.org" , "Paul E. McKenney" , Dave Jones , LKML Subject: slub/debugobjects: lockup when freeing memory X-Enigmail-Version: 1.6 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Source-IP: acsinet21.oracle.com [141.146.126.237] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi all, While fuzzing with trinity inside a KVM tools guest running the latest -next kernel I've stumbled on the following spew. It seems to cause an actual lockup as hung task messages followed soon after. 
[ 690.762537] ============================================= [ 690.764196] [ INFO: possible recursive locking detected ] [ 690.765247] 3.16.0-rc1-next-20140618-sasha-00029-g9e4acf8-dirty #664 Tainted: G W [ 690.766457] --------------------------------------------- [ 690.767237] kworker/u95:0/256 is trying to acquire lock: [ 690.767886] (&(&n->list_lock)->rlock){-.-.-.}, at: get_partial_node.isra.35 (mm/slub.c:1630) [ 690.769162] [ 690.769162] but task is already holding lock: [ 690.769851] (&(&n->list_lock)->rlock){-.-.-.}, at: kmem_cache_close (mm/slub.c:3209 mm/slub.c:3233) [ 690.770137] [ 690.770137] other info that might help us debug this: [ 690.770137] Possible unsafe locking scenario: [ 690.770137] [ 690.770137] CPU0 [ 690.770137] ---- [ 690.770137] lock(&(&n->list_lock)->rlock); [ 690.770137] lock(&(&n->list_lock)->rlock); [ 690.770137] [ 690.770137] *** DEADLOCK *** [ 690.770137] [ 690.770137] May be due to missing lock nesting notation [ 690.770137] [ 690.770137] 7 locks held by kworker/u95:0/256: [ 690.770137] #0: ("%s"("netns")){.+.+.+}, at: process_one_work (include/linux/workqueue.h:185 kernel/workqueue.c:599 kernel/workqueue.c:626 kernel/workqueue.c:2074) [ 690.770137] #1: (net_cleanup_work){+.+.+.}, at: process_one_work (include/linux/workqueue.h:185 kernel/workqueue.c:599 kernel/workqueue.c:626 kernel/workqueue.c:2074) [ 690.770137] #2: (net_mutex){+.+.+.}, at: cleanup_net (net/core/net_namespace.c:287) [ 690.770137] #3: (cpu_hotplug.lock){++++++}, at: get_online_cpus (kernel/cpu.c:90) [ 690.770137] #4: (mem_hotplug.lock){.+.+.+}, at: get_online_mems (mm/memory_hotplug.c:83) [ 690.770137] #5: (slab_mutex){+.+.+.}, at: kmem_cache_destroy (mm/slab_common.c:343) [ 690.770137] #6: (&(&n->list_lock)->rlock){-.-.-.}, at: kmem_cache_close (mm/slub.c:3209 mm/slub.c:3233) [ 690.770137] [ 690.770137] stack backtrace: [ 690.770137] CPU: 18 PID: 256 Comm: kworker/u95:0 Tainted: G W 3.16.0-rc1-next-20140618-sasha-00029-g9e4acf8-dirty #664 [ 690.770137] 
Workqueue: netns cleanup_net [ 690.770137] ffff8808a172b000 ffff8808a1737628 ffffffff9d5179a0 0000000000000003 [ 690.770137] ffffffffa0b499c0 ffff8808a1737728 ffffffff9a1cac52 ffff8808a1737668 [ 690.770137] ffffffff9a1a74f8 23e00d8075e32f12 ffff8808a172b000 23e00d8000000001 [ 690.770137] Call Trace: [ 690.770137] dump_stack (lib/dump_stack.c:52) [ 690.770137] __lock_acquire (kernel/locking/lockdep.c:3034 kernel/locking/lockdep.c:3180) [ 690.770137] ? sched_clock_cpu (kernel/sched/clock.c:311) [ 690.770137] ? __lock_acquire (kernel/locking/lockdep.c:3189) [ 690.770137] ? __lock_acquire (kernel/locking/lockdep.c:3189) [ 690.770137] lock_acquire (./arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602) [ 690.770137] ? get_partial_node.isra.35 (mm/slub.c:1630) [ 690.770137] ? sched_clock (./arch/x86/include/asm/paravirt.h:192 arch/x86/kernel/tsc.c:305) [ 690.770137] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151) [ 690.770137] ? get_partial_node.isra.35 (mm/slub.c:1630) [ 690.770137] get_partial_node.isra.35 (mm/slub.c:1630) [ 690.770137] ? __slab_alloc (mm/slub.c:2304) [ 690.770137] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63) [ 690.770137] __slab_alloc (mm/slub.c:1732 mm/slub.c:2205 mm/slub.c:2369) [ 690.770137] ? __lock_acquire (kernel/locking/lockdep.c:3189) [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) [ 690.770137] kmem_cache_alloc (mm/slub.c:2442 mm/slub.c:2484 mm/slub.c:2489) [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) [ 690.770137] ? debug_object_activate (lib/debugobjects.c:439) [ 690.770137] __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) [ 690.770137] debug_object_init (lib/debugobjects.c:365) [ 690.770137] rcuhead_fixup_activate (kernel/rcu/update.c:231) [ 690.770137] debug_object_activate (lib/debugobjects.c:280 lib/debugobjects.c:439) [ 690.770137] ? 
discard_slab (mm/slub.c:1486) [ 690.770137] __call_rcu (kernel/rcu/rcu.h:76 (discriminator 2) kernel/rcu/tree.c:2585 (discriminator 2)) [ 690.770137] call_rcu (kernel/rcu/tree_plugin.h:679) [ 690.770137] discard_slab (mm/slub.c:1515 mm/slub.c:1523) [ 690.770137] kmem_cache_close (mm/slub.c:3212 mm/slub.c:3233) [ 690.770137] ? trace_hardirqs_on (kernel/locking/lockdep.c:2607) [ 690.770137] __kmem_cache_shutdown (mm/slub.c:3245) [ 690.770137] kmem_cache_destroy (mm/slab_common.c:349) [ 690.770137] nf_conntrack_cleanup_net_list (net/netfilter/nf_conntrack_core.c:1569 (discriminator 2)) [ 690.770137] nf_conntrack_pernet_exit (net/netfilter/nf_conntrack_standalone.c:558) [ 690.770137] ops_exit_list.isra.1 (net/core/net_namespace.c:135) [ 690.770137] cleanup_net (net/core/net_namespace.c:302 (discriminator 2)) [ 690.770137] process_one_work (kernel/workqueue.c:2081 include/linux/jump_label.h:115 include/trace/events/workqueue.h:111 kernel/workqueue.c:2086) [ 690.770137] ? process_one_work (include/linux/workqueue.h:185 kernel/workqueue.c:599 kernel/workqueue.c:626 kernel/workqueue.c:2074) [ 690.770137] worker_thread (kernel/workqueue.c:2213) [ 690.770137] ? rescuer_thread (kernel/workqueue.c:2157) [ 690.770137] kthread (kernel/kthread.c:210) [ 690.770137] ? 
kthread_create_on_node (kernel/kthread.c:176) [ 690.770137] ret_from_fork (arch/x86/kernel/entry_64.S:349) Thanks, Sasha From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933310AbaFSPDK (ORCPT ); Thu, 19 Jun 2014 11:03:10 -0400 Received: from qmta08.emeryville.ca.mail.comcast.net ([76.96.30.80]:49613 "EHLO qmta08.emeryville.ca.mail.comcast.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933188AbaFSPDI (ORCPT ); Thu, 19 Jun 2014 11:03:08 -0400 Date: Thu, 19 Jun 2014 10:03:04 -0500 (CDT) From: Christoph Lameter To: Sasha Levin cc: Pekka Enberg , Thomas Gleixner , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , "Paul E. McKenney" , Dave Jones , LKML Subject: Re: slub/debugobjects: lockup when freeing memory In-Reply-To: <53A2F406.4010109@oracle.com> Message-ID: References: <53A2F406.4010109@oracle.com> Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 19 Jun 2014, Sasha Levin wrote: > [ 690.770137] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63) > [ 690.770137] __slab_alloc (mm/slub.c:1732 mm/slub.c:2205 mm/slub.c:2369) > [ 690.770137] ? __lock_acquire (kernel/locking/lockdep.c:3189) > [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) > [ 690.770137] kmem_cache_alloc (mm/slub.c:2442 mm/slub.c:2484 mm/slub.c:2489) > [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) > [ 690.770137] ? debug_object_activate (lib/debugobjects.c:439) > [ 690.770137] __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) > [ 690.770137] debug_object_init (lib/debugobjects.c:365) > [ 690.770137] rcuhead_fixup_activate (kernel/rcu/update.c:231) > [ 690.770137] debug_object_activate (lib/debugobjects.c:280 lib/debugobjects.c:439) > [ 690.770137] ? 
discard_slab (mm/slub.c:1486) > [ 690.770137] __call_rcu (kernel/rcu/rcu.h:76 (discriminator 2) kernel/rcu/tree.c:2585 (discriminator 2)) __call_rcu does a slab allocation? This means __call_rcu can no longer be used in slab allocators? What happened? From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933764AbaFSQw4 (ORCPT ); Thu, 19 Jun 2014 12:52:56 -0400 Received: from e31.co.us.ibm.com ([32.97.110.149]:60997 "EHLO e31.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932775AbaFSQwz (ORCPT ); Thu, 19 Jun 2014 12:52:55 -0400 Date: Thu, 19 Jun 2014 09:52:47 -0700 From: "Paul E. McKenney" To: Christoph Lameter Cc: Sasha Levin , Pekka Enberg , Thomas Gleixner , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML Subject: Re: slub/debugobjects: lockup when freeing memory Message-ID: <20140619165247.GA4904@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <53A2F406.4010109@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 14061916-8236-0000-0000-000003393AC1 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Jun 19, 2014 at 10:03:04AM -0500, Christoph Lameter wrote: > On Thu, 19 Jun 2014, Sasha Levin wrote: > > > [ 690.770137] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63) > > [ 690.770137] __slab_alloc (mm/slub.c:1732 mm/slub.c:2205 mm/slub.c:2369) > > [ 690.770137] ? __lock_acquire (kernel/locking/lockdep.c:3189) > > [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) > > [ 690.770137] kmem_cache_alloc (mm/slub.c:2442 mm/slub.c:2484 mm/slub.c:2489) > > [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) > > [ 690.770137] ? 
debug_object_activate (lib/debugobjects.c:439) > > [ 690.770137] __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) > > [ 690.770137] debug_object_init (lib/debugobjects.c:365) > > [ 690.770137] rcuhead_fixup_activate (kernel/rcu/update.c:231) > > [ 690.770137] debug_object_activate (lib/debugobjects.c:280 lib/debugobjects.c:439) > > [ 690.770137] ? discard_slab (mm/slub.c:1486) > > [ 690.770137] __call_rcu (kernel/rcu/rcu.h:76 (discriminator 2) kernel/rcu/tree.c:2585 (discriminator 2)) > > __call_rcu does a slab allocation? This means __call_rcu can no longer be > used in slab allocators? What happened? My guess is that the root cause is a double call_rcu(), call_rcu_sched(), call_rcu_bh(), or call_srcu(). Perhaps the DEBUG_OBJECTS code now allocates memory to report errors? That would be unfortunate... Thanx, Paul From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934469AbaFST3Y (ORCPT ); Thu, 19 Jun 2014 15:29:24 -0400 Received: from www.linutronix.de ([62.245.132.108]:55127 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S934265AbaFST3V (ORCPT ); Thu, 19 Jun 2014 15:29:21 -0400 Date: Thu, 19 Jun 2014 21:29:08 +0200 (CEST) From: Thomas Gleixner To: "Paul E. 
McKenney" cc: Christoph Lameter , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML Subject: Re: slub/debugobjects: lockup when freeing memory In-Reply-To: <20140619165247.GA4904@linux.vnet.ibm.com> Message-ID: References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> User-Agent: Alpine 2.10 (DEB 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Linutronix-Spam-Score: -1.0 X-Linutronix-Spam-Level: - X-Linutronix-Spam-Status: No , -1.0 points, 5.0 required, ALL_TRUSTED=-1,SHORTCIRCUIT=-0.0001 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 19 Jun 2014, Paul E. McKenney wrote: > On Thu, Jun 19, 2014 at 10:03:04AM -0500, Christoph Lameter wrote: > > On Thu, 19 Jun 2014, Sasha Levin wrote: > > > > > [ 690.770137] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63) > > > [ 690.770137] __slab_alloc (mm/slub.c:1732 mm/slub.c:2205 mm/slub.c:2369) > > > [ 690.770137] ? __lock_acquire (kernel/locking/lockdep.c:3189) > > > [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) > > > [ 690.770137] kmem_cache_alloc (mm/slub.c:2442 mm/slub.c:2484 mm/slub.c:2489) > > > [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) > > > [ 690.770137] ? debug_object_activate (lib/debugobjects.c:439) > > > [ 690.770137] __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) > > > [ 690.770137] debug_object_init (lib/debugobjects.c:365) > > > [ 690.770137] rcuhead_fixup_activate (kernel/rcu/update.c:231) > > > [ 690.770137] debug_object_activate (lib/debugobjects.c:280 lib/debugobjects.c:439) > > > [ 690.770137] ? discard_slab (mm/slub.c:1486) > > > [ 690.770137] __call_rcu (kernel/rcu/rcu.h:76 (discriminator 2) kernel/rcu/tree.c:2585 (discriminator 2)) > > > > __call_rcu does a slab allocation? 
This means __call_rcu can no longer be > > used in slab allocators? What happened? > > My guess is that the root cause is a double call_rcu(), call_rcu_sched(), > call_rcu_bh(), or call_srcu(). > > Perhaps the DEBUG_OBJECTS code now allocates memory to report errors? > That would be unfortunate... Well, no. Look at the callchain: __call_rcu debug_object_activate rcuhead_fixup_activate debug_object_init kmem_cache_alloc So call rcu activates the object, but the object has no reference in the debug objects code so the fixup code is called which inits the object and allocates a reference .... Thanks, tglx From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S965103AbaFSUTo (ORCPT ); Thu, 19 Jun 2014 16:19:44 -0400 Received: from qmta12.emeryville.ca.mail.comcast.net ([76.96.27.227]:49968 "EHLO qmta12.emeryville.ca.mail.comcast.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756631AbaFSUTm (ORCPT ); Thu, 19 Jun 2014 16:19:42 -0400 Date: Thu, 19 Jun 2014 15:19:39 -0500 (CDT) From: Christoph Lameter To: Thomas Gleixner cc: "Paul E. McKenney" , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML Subject: Re: slub/debugobjects: lockup when freeing memory In-Reply-To: Message-ID: References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 19 Jun 2014, Thomas Gleixner wrote: > Well, no. Look at the callchain: > > __call_rcu > debug_object_activate > rcuhead_fixup_activate > debug_object_init > kmem_cache_alloc > > So call rcu activates the object, but the object has no reference in > the debug objects code so the fixup code is called which inits the > object and allocates a reference .... So we need to init the object in the page struct before the __call_rcu? 
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934464AbaFSU22 (ORCPT ); Thu, 19 Jun 2014 16:28:28 -0400 Received: from www.linutronix.de ([62.245.132.108]:55336 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754264AbaFSU21 (ORCPT ); Thu, 19 Jun 2014 16:28:27 -0400 Date: Thu, 19 Jun 2014 22:28:14 +0200 (CEST) From: Thomas Gleixner To: Christoph Lameter cc: "Paul E. McKenney" , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML Subject: Re: slub/debugobjects: lockup when freeing memory In-Reply-To: Message-ID: References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> User-Agent: Alpine 2.10 (DEB 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Linutronix-Spam-Score: -1.0 X-Linutronix-Spam-Level: - X-Linutronix-Spam-Status: No , -1.0 points, 5.0 required, ALL_TRUSTED=-1,SHORTCIRCUIT=-0.0001 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 19 Jun 2014, Christoph Lameter wrote: > On Thu, 19 Jun 2014, Thomas Gleixner wrote: > > > Well, no. Look at the callchain: > > > > __call_rcu > > debug_object_activate > > rcuhead_fixup_activate > > debug_object_init > > kmem_cache_alloc > > > > So call rcu activates the object, but the object has no reference in > > the debug objects code so the fixup code is called which inits the > > object and allocates a reference .... > > So we need to init the object in the page struct before the __call_rcu? Looks like RCU is lazily relying on the state callback to initialize the objects. There is an unused debug_init_rcu_head() inline in kernel/rcu/update.c Paul???? 
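A minimal user-space sketch of the call chain quoted above, assuming a toy model rather than the real kernel code. Every name here (model_alloc, model_obj, model_track, model_init, model_activate, model_free_slab) is hypothetical and exists only for illustration. It shows why lazy initialization through the activate fixup ends up allocating while the list lock is held, and why initializing the head before taking the lock would avoid that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; none of these names are kernel APIs. */
struct model_alloc {
	bool list_lock_held;   /* stands in for n->list_lock */
	int  recursive_allocs; /* tracking allocations made under the lock */
};

struct model_obj {
	bool tracked;          /* tracker already holds metadata for the object */
};

/* The tracker needs metadata: this is the step that re-enters the allocator. */
static void model_track(struct model_alloc *a, struct model_obj *o)
{
	if (a->list_lock_held)
		a->recursive_allocs++; /* the nested acquisition lockdep reported */
	o->tracked = true;
}

/* Explicit init, meant to be called before any lock is taken. */
static void model_init(struct model_alloc *a, struct model_obj *o)
{
	if (!o->tracked)
		model_track(a, o);
}

/* Models activate: an untracked object goes through the lazy fixup path. */
static void model_activate(struct model_alloc *a, struct model_obj *o)
{
	if (!o->tracked)
		model_track(a, o);
}

/* Models freeing a slab under the list lock; returns nested allocations. */
static int model_free_slab(struct model_alloc *a, struct model_obj *head,
			   bool init_early)
{
	assert(a != NULL && head != NULL);
	a->recursive_allocs = 0;
	if (init_early)
		model_init(a, head);  /* lock not held yet */
	a->list_lock_held = true;
	model_activate(a, head);      /* stands in for call_rcu() */
	a->list_lock_held = false;
	return a->recursive_allocs;
}
```

With init_early set to false the model counts one allocation made under the lock; with it set to true the count stays at zero, which is the behaviour Christoph's question is pointing at.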
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934559AbaFSU3t (ORCPT ); Thu, 19 Jun 2014 16:29:49 -0400 Received: from e39.co.us.ibm.com ([32.97.110.160]:44520 "EHLO e39.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932071AbaFSU3r (ORCPT ); Thu, 19 Jun 2014 16:29:47 -0400 Date: Thu, 19 Jun 2014 13:29:28 -0700 From: "Paul E. McKenney" To: Thomas Gleixner Cc: Christoph Lameter , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML Subject: Re: slub/debugobjects: lockup when freeing memory Message-ID: <20140619202928.GG4904@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 14061920-9332-0000-0000-000001255E24 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Jun 19, 2014 at 09:29:08PM +0200, Thomas Gleixner wrote: > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > > On Thu, Jun 19, 2014 at 10:03:04AM -0500, Christoph Lameter wrote: > > > On Thu, 19 Jun 2014, Sasha Levin wrote: > > > > > > > [ 690.770137] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63) > > > > [ 690.770137] __slab_alloc (mm/slub.c:1732 mm/slub.c:2205 mm/slub.c:2369) > > > > [ 690.770137] ? __lock_acquire (kernel/locking/lockdep.c:3189) > > > > [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) > > > > [ 690.770137] kmem_cache_alloc (mm/slub.c:2442 mm/slub.c:2484 mm/slub.c:2489) > > > > [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) > > > > [ 690.770137] ? 
debug_object_activate (lib/debugobjects.c:439) > > > > [ 690.770137] __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) > > > > [ 690.770137] debug_object_init (lib/debugobjects.c:365) > > > > [ 690.770137] rcuhead_fixup_activate (kernel/rcu/update.c:231) > > > > [ 690.770137] debug_object_activate (lib/debugobjects.c:280 lib/debugobjects.c:439) > > > > [ 690.770137] ? discard_slab (mm/slub.c:1486) > > > > [ 690.770137] __call_rcu (kernel/rcu/rcu.h:76 (discriminator 2) kernel/rcu/tree.c:2585 (discriminator 2)) > > > > > > __call_rcu does a slab allocation? This means __call_rcu can no longer be > > > used in slab allocators? What happened? > > > > My guess is that the root cause is a double call_rcu(), call_rcu_sched(), > > call_rcu_bh(), or call_srcu(). > > > > Perhaps the DEBUG_OBJECTS code now allocates memory to report errors? > > That would be unfortunate... > > Well, no. Look at the callchain: > > __call_rcu > debug_object_activate > rcuhead_fixup_activate > debug_object_init > kmem_cache_alloc > > So call rcu activates the object, but the object has no reference in > the debug objects code so the fixup code is called which inits the > object and allocates a reference .... OK, got it. And you are right, call_rcu() has done this for a very long time, so not sure what changed. But it seems like the right approach is to provide a debug-object-free call_rcu_alloc() for use by the memory allocators. Seem reasonable? If so, please see the following patch. Thanx, Paul ------------------------------------------------------------------------ rcu: Provide call_rcu_alloc() and call_rcu_sched_alloc() to avoid recursion The sl*b allocators use call_rcu() to manage object lifetimes, but call_rcu() can use debug-objects, which in turn invokes the sl*b allocators. These allocators are not prepared for this sort of recursion, which can result in failures. 
This commit therefore creates call_rcu_alloc() and call_rcu_sched_alloc(), which act as their call_rcu() and call_rcu_sched() counterparts, but which avoid invoking debug-objects. These new API members are intended only for use by the sl*b allocators, and this commit makes the sl*b allocators use call_rcu_alloc(). Why call_rcu_sched_alloc()? Because in CONFIG_PREEMPT=n kernels, call_rcu() maps to call_rcu_sched(), so therefore call_rcu_alloc() must map to call_rcu_sched_alloc(). Reported-by: Sasha Levin Set-straight-by: Thomas Gleixner Signed-off-by: Paul E. McKenney diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h index d5e40a42cc43..1f708a7f9e7d 100644 --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h @@ -140,13 +140,24 @@ void do_trace_rcu_torture_read(const char *rcutorturename, * if CPU A and CPU B are the same CPU (but again only if the system has * more than one CPU). */ -void call_rcu(struct rcu_head *head, - void (*func)(struct rcu_head *head)); +void call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *head)); + +/** + * call_rcu_alloc() - Queue an RCU callback for invocation after grace period. + * @head: structure to be used for queueing the RCU updates. + * @func: actual callback function to be invoked after the grace period + * + * Similar to call_rcu(), but avoids invoking debug-objects. This permits + * this to be called from allocators without needing to worry about + * recursive calls into those allocators for debug-objects allocations. + */ +void call_rcu_alloc(struct rcu_head *head, void (*func)(struct rcu_head *rcu)); #else /* #ifdef CONFIG_PREEMPT_RCU */ /* In classic RCU, call_rcu() is just call_rcu_sched().
*/ #define call_rcu call_rcu_sched +#define call_rcu_alloc call_rcu_sched_alloc #endif /* #else #ifdef CONFIG_PREEMPT_RCU */ @@ -196,6 +207,19 @@ void call_rcu_bh(struct rcu_head *head, void call_rcu_sched(struct rcu_head *head, void (*func)(struct rcu_head *rcu)); +/** + * call_rcu_sched_alloc() - Queue RCU for invocation after sched grace period. + * @head: structure to be used for queueing the RCU updates. + * @func: actual callback function to be invoked after the grace period + * + * Similar to call_rcu_sched(), but avoids invoking debug-objects. + * This permits this to be called from allocators without needing to + * worry about recursive calls into those allocators for debug-objects + * allocations. + */ +void call_rcu_sched_alloc(struct rcu_head *head, + void (*func)(struct rcu_head *rcu)); + void synchronize_sched(void); #ifdef CONFIG_PREEMPT_RCU diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c index d9efcc13008c..515e60067c53 100644 --- a/kernel/rcu/tiny.c +++ b/kernel/rcu/tiny.c @@ -338,15 +338,14 @@ void synchronize_sched(void) EXPORT_SYMBOL_GPL(synchronize_sched); /* - * Helper function for call_rcu() and call_rcu_bh(). + * Provide call_rcu() function, but avoid invoking debug objects. */ -static void __call_rcu(struct rcu_head *head, - void (*func)(struct rcu_head *rcu), - struct rcu_ctrlblk *rcp) +static void __call_rcu_nodo(struct rcu_head *head, + void (*func)(struct rcu_head *rcu), + struct rcu_ctrlblk *rcp) { unsigned long flags; - debug_rcu_head_queue(head); head->func = func; head->next = NULL; @@ -358,6 +357,17 @@ static void __call_rcu(struct rcu_head *head, } /* + * Helper function for call_rcu() and call_rcu_bh(). + */ +static void __call_rcu(struct rcu_head *head, + void (*func)(struct rcu_head *rcu), + struct rcu_ctrlblk *rcp) +{ + debug_rcu_head_queue(head); + __call_rcu_nodo(head, func, rcp); +} + +/* * Post an RCU callback to be invoked after the end of an RCU-sched grace * period. 
But since we have but one CPU, that would be after any * quiescent state. @@ -369,6 +379,18 @@ void call_rcu_sched(struct rcu_head *head, void (*func)(struct rcu_head *rcu)) EXPORT_SYMBOL_GPL(call_rcu_sched); /* + * Similar to call_rcu_sched(), but avoids debug-objects and thus calls + * into the memory allocators, which don't appreciate that sort of + * recursion. + */ +void call_rcu_sched_alloc(struct rcu_head *head, + void (*func)(struct rcu_head *rcu)) +{ + __call_rcu_nodo(head, func, &rcu_sched_ctrlblk); +} +EXPORT_SYMBOL_GPL(call_rcu_sched_alloc); + +/* * Post an RCU bottom-half callback to be invoked after any subsequent * quiescent state. */ diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 8c47d04ecdea..593195d38850 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2640,25 +2640,16 @@ static void rcu_leak_callback(struct rcu_head *rhp) } /* - * Helper function for call_rcu() and friends. The cpu argument will - * normally be -1, indicating "currently running CPU". It may specify - * a CPU only if that CPU is a no-CBs CPU. Currently, only _rcu_barrier() - * is expected to specify a CPU. + * Provide call_rcu() function, but avoid invoking debug objects. */ static void -__call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *rcu), - struct rcu_state *rsp, int cpu, bool lazy) +__call_rcu_nodo(struct rcu_head *head, void (*func)(struct rcu_head *rcu), + struct rcu_state *rsp, int cpu, bool lazy) { unsigned long flags; struct rcu_data *rdp; WARN_ON_ONCE((unsigned long)head & 0x1); /* Misaligned rcu_head! */ - if (debug_rcu_head_queue(head)) { - /* Probable double call_rcu(), so leak the callback. */ - ACCESS_ONCE(head->func) = rcu_leak_callback; - WARN_ONCE(1, "__call_rcu(): Leaked duplicate callback\n"); - return; - } head->func = func; head->next = NULL; @@ -2704,6 +2695,25 @@ __call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *rcu), } /* + * Helper function for call_rcu() and friends. 
The cpu argument will + * normally be -1, indicating "currently running CPU". It may specify + * a CPU only if that CPU is a no-CBs CPU. Currently, only _rcu_barrier() + * is expected to specify a CPU. + */ +static void +__call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *rcu), + struct rcu_state *rsp, int cpu, bool lazy) +{ + if (debug_rcu_head_queue(head)) { + /* Probable double call_rcu(), so leak the callback. */ + ACCESS_ONCE(head->func) = rcu_leak_callback; + WARN_ONCE(1, "__call_rcu(): Leaked duplicate callback\n"); + return; + } + __call_rcu_nodo(head, func, rsp, cpu, lazy); +} + +/* * Queue an RCU-sched callback for invocation after a grace period. */ void call_rcu_sched(struct rcu_head *head, void (*func)(struct rcu_head *rcu)) @@ -2713,6 +2723,18 @@ void call_rcu_sched(struct rcu_head *head, void (*func)(struct rcu_head *rcu)) EXPORT_SYMBOL_GPL(call_rcu_sched); /* + * Similar to call_rcu_sched(), but avoids debug-objects and thus calls + * into the memory allocators, which don't appreciate that sort of + * recursion. + */ +void call_rcu_sched_alloc(struct rcu_head *head, + void (*func)(struct rcu_head *rcu)) +{ + __call_rcu_nodo(head, func, &rcu_sched_state, -1, 0); +} +EXPORT_SYMBOL_GPL(call_rcu_sched_alloc); + +/* * Queue an RCU callback for invocation after a quicker grace period. */ void call_rcu_bh(struct rcu_head *head, void (*func)(struct rcu_head *rcu)) diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 569b390daa15..e9362d7f8328 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -679,6 +679,17 @@ void call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *rcu)) } EXPORT_SYMBOL_GPL(call_rcu); +/* + * Similar to call_rcu(), but avoids debug-objects and thus calls + * into the memory allocators, which don't appreciate that sort of + * recursion. 
+ */ +void call_rcu_alloc(struct rcu_head *head, void (*func)(struct rcu_head *rcu)) +{ + __call_rcu_nodo(head, func, &rcu_preempt_state, -1, 0); +} +EXPORT_SYMBOL_GPL(call_rcu_alloc); + /** * synchronize_rcu - wait until a grace period has elapsed. * diff --git a/mm/slab.c b/mm/slab.c index 9ca3b87edabc..1e5de0d39701 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -1994,7 +1994,7 @@ static void slab_destroy(struct kmem_cache *cachep, struct page *page) * we can use it safely. */ head = (void *)&page->rcu_head; - call_rcu(head, kmem_rcu_free); + call_rcu_alloc(head, kmem_rcu_free); } else { kmem_freepages(cachep, page); diff --git a/mm/slob.c b/mm/slob.c index 21980e0f39a8..47ad4a43521a 100644 --- a/mm/slob.c +++ b/mm/slob.c @@ -605,7 +605,7 @@ void kmem_cache_free(struct kmem_cache *c, void *b) struct slob_rcu *slob_rcu; slob_rcu = b + (c->size - sizeof(struct slob_rcu)); slob_rcu->size = c->size; - call_rcu(&slob_rcu->head, kmem_rcu_free); + call_rcu_alloc(&slob_rcu->head, kmem_rcu_free); } else { __kmem_cache_free(b, c->size); } diff --git a/mm/slub.c b/mm/slub.c index b2b047327d76..7f01e57fd99f 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -1512,7 +1512,7 @@ static void free_slab(struct kmem_cache *s, struct page *page) head = (void *)&page->lru; } - call_rcu(head, rcu_free_slab); + call_rcu_alloc(head, rcu_free_slab); } else __free_slab(s, page); } From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934616AbaFSUdI (ORCPT ); Thu, 19 Jun 2014 16:33:08 -0400 Received: from aserp1040.oracle.com ([141.146.126.69]:51156 "EHLO aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932071AbaFSUdH (ORCPT ); Thu, 19 Jun 2014 16:33:07 -0400 Message-ID: <53A348E6.3050404@oracle.com> Date: Thu, 19 Jun 2014 16:32:38 -0400 From: Sasha Levin User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0 MIME-Version: 1.0 To: paulmck@linux.vnet.ibm.com, Thomas 
Gleixner CC: Christoph Lameter , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML Subject: Re: slub/debugobjects: lockup when freeing memory References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140619202928.GG4904@linux.vnet.ibm.com> In-Reply-To: <20140619202928.GG4904@linux.vnet.ibm.com> X-Enigmail-Version: 1.6 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Source-IP: ucsinet22.oracle.com [156.151.31.94] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 06/19/2014 04:29 PM, Paul E. McKenney wrote: > On Thu, Jun 19, 2014 at 09:29:08PM +0200, Thomas Gleixner wrote: >> > On Thu, 19 Jun 2014, Paul E. McKenney wrote: >> > >>> > > On Thu, Jun 19, 2014 at 10:03:04AM -0500, Christoph Lameter wrote: >>>> > > > On Thu, 19 Jun 2014, Sasha Levin wrote: >>>> > > > >>>>> > > > > [ 690.770137] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63) >>>>> > > > > [ 690.770137] __slab_alloc (mm/slub.c:1732 mm/slub.c:2205 mm/slub.c:2369) >>>>> > > > > [ 690.770137] ? __lock_acquire (kernel/locking/lockdep.c:3189) >>>>> > > > > [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) >>>>> > > > > [ 690.770137] kmem_cache_alloc (mm/slub.c:2442 mm/slub.c:2484 mm/slub.c:2489) >>>>> > > > > [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) >>>>> > > > > [ 690.770137] ? debug_object_activate (lib/debugobjects.c:439) >>>>> > > > > [ 690.770137] __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) >>>>> > > > > [ 690.770137] debug_object_init (lib/debugobjects.c:365) >>>>> > > > > [ 690.770137] rcuhead_fixup_activate (kernel/rcu/update.c:231) >>>>> > > > > [ 690.770137] debug_object_activate (lib/debugobjects.c:280 lib/debugobjects.c:439) >>>>> > > > > [ 690.770137] ? 
discard_slab (mm/slub.c:1486) >>>>> > > > > [ 690.770137] __call_rcu (kernel/rcu/rcu.h:76 (discriminator 2) kernel/rcu/tree.c:2585 (discriminator 2)) >>>> > > > >>>> > > > __call_rcu does a slab allocation? This means __call_rcu can no longer be >>>> > > > used in slab allocators? What happened? >>> > > >>> > > My guess is that the root cause is a double call_rcu(), call_rcu_sched(), >>> > > call_rcu_bh(), or call_srcu(). >>> > > >>> > > Perhaps the DEBUG_OBJECTS code now allocates memory to report errors? >>> > > That would be unfortunate... >> > >> > Well, no. Look at the callchain: >> > >> > __call_rcu >> > debug_object_activate >> > rcuhead_fixup_activate >> > debug_object_init >> > kmem_cache_alloc >> > >> > So call rcu activates the object, but the object has no reference in >> > the debug objects code so the fixup code is called which inits the >> > object and allocates a reference .... > OK, got it. And you are right, call_rcu() has done this for a very > long time, so not sure what changed. It's probably my fault. I've introduced clone() and unshare() fuzzing. Those two are full of issues, and I've been holding off on enabling them until the rest of the kernel could survive trinity for more than an hour. Thanks, Sasha From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934497AbaFSUhJ (ORCPT ); Thu, 19 Jun 2014 16:37:09 -0400 Received: from e38.co.us.ibm.com ([32.97.110.159]:49283 "EHLO e38.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933503AbaFSUhH (ORCPT ); Thu, 19 Jun 2014 16:37:07 -0400 Date: Thu, 19 Jun 2014 13:36:59 -0700 From: "Paul E. 
McKenney" To: Christoph Lameter Cc: Thomas Gleixner , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML Subject: Re: slub/debugobjects: lockup when freeing memory Message-ID: <20140619203659.GH4904@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 14061920-1344-0000-0000-0000024F1CBF Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Jun 19, 2014 at 03:19:39PM -0500, Christoph Lameter wrote: > On Thu, 19 Jun 2014, Thomas Gleixner wrote: > > > Well, no. Look at the callchain: > > > > __call_rcu > > debug_object_activate > > rcuhead_fixup_activate > > debug_object_init > > kmem_cache_alloc > > > > So call rcu activates the object, but the object has no reference in > > the debug objects code so the fixup code is called which inits the > > object and allocates a reference .... > > So we need to init the object in the page struct before the __call_rcu? Good point. The patch I just sent will complain at callback-invocation time because the debug-object information won't be present. One way to handle this would be for rcu_do_batch() to avoid complaining if it gets a callback that has not been through call_rcu()'s debug_rcu_head_queue(). One way to do that would be to have an alternative to debug_object_deactivate() that does not complain if it is handed an unactivated object. Another way to handle this would be for me to put the definition of debug_rcu_head_queue() somewhere where the sl*b allocator could get at it, and have the sl*b allocators invoke it some at initialization and within the RCU callback. Other thoughts? 
Thanx, Paul From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934563AbaFSUh2 (ORCPT ); Thu, 19 Jun 2014 16:37:28 -0400 Received: from www.linutronix.de ([62.245.132.108]:55392 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933503AbaFSUh0 (ORCPT ); Thu, 19 Jun 2014 16:37:26 -0400 Date: Thu, 19 Jun 2014 22:37:17 +0200 (CEST) From: Thomas Gleixner To: "Paul E. McKenney" cc: Christoph Lameter , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML Subject: Re: slub/debugobjects: lockup when freeing memory In-Reply-To: <20140619202928.GG4904@linux.vnet.ibm.com> Message-ID: References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140619202928.GG4904@linux.vnet.ibm.com> User-Agent: Alpine 2.10 (DEB 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Linutronix-Spam-Score: -1.0 X-Linutronix-Spam-Level: - X-Linutronix-Spam-Status: No , -1.0 points, 5.0 required, ALL_TRUSTED=-1,SHORTCIRCUIT=-0.0001 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 19 Jun 2014, Paul E. McKenney wrote: > On Thu, Jun 19, 2014 at 09:29:08PM +0200, Thomas Gleixner wrote: > > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > Well, no. Look at the callchain: > > > > __call_rcu > > debug_object_activate > > rcuhead_fixup_activate > > debug_object_init > > kmem_cache_alloc > > > > So call rcu activates the object, but the object has no reference in > > the debug objects code so the fixup code is called which inits the > > object and allocates a reference .... > > OK, got it. And you are right, call_rcu() has done this for a very > long time, so not sure what changed. But it seems like the right > approach is to provide a debug-object-free call_rcu_alloc() for use > by the memory allocators. > > Seem reasonable? 
If so, please see the following patch. Not really, you're torpedoing the whole purpose of debugobjects :) So, why can't we just init the rcu head when the stuff is created? If that's impossible due to other memory allocator constraints, then instead of inventing a whole new API we can simply flag the relevant data in the memory allocator as we do with the debug objects mem cache itself (SLAB_DEBUG_OBJECTS). Thanks, tglx From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S964956AbaFSUm3 (ORCPT ); Thu, 19 Jun 2014 16:42:29 -0400 Received: from e35.co.us.ibm.com ([32.97.110.153]:59185 "EHLO e35.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754837AbaFSUm2 (ORCPT ); Thu, 19 Jun 2014 16:42:28 -0400 Date: Thu, 19 Jun 2014 13:39:09 -0700 From: "Paul E. McKenney" To: Sasha Levin Cc: Thomas Gleixner , Christoph Lameter , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML Subject: Re: slub/debugobjects: lockup when freeing memory Message-ID: <20140619203909.GI4904@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140619202928.GG4904@linux.vnet.ibm.com> <53A348E6.3050404@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <53A348E6.3050404@oracle.com> User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 14061920-6688-0000-0000-000002AF0E62 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Jun 19, 2014 at 04:32:38PM -0400, Sasha Levin wrote: > On 06/19/2014 04:29 PM, Paul E. McKenney wrote: > > On Thu, Jun 19, 2014 at 09:29:08PM +0200, Thomas Gleixner wrote: > >> > On Thu, 19 Jun 2014, Paul E. 
McKenney wrote: > >> > > >>> > > On Thu, Jun 19, 2014 at 10:03:04AM -0500, Christoph Lameter wrote: > >>>> > > > On Thu, 19 Jun 2014, Sasha Levin wrote: > >>>> > > > > >>>>> > > > > [ 690.770137] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63) > >>>>> > > > > [ 690.770137] __slab_alloc (mm/slub.c:1732 mm/slub.c:2205 mm/slub.c:2369) > >>>>> > > > > [ 690.770137] ? __lock_acquire (kernel/locking/lockdep.c:3189) > >>>>> > > > > [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) > >>>>> > > > > [ 690.770137] kmem_cache_alloc (mm/slub.c:2442 mm/slub.c:2484 mm/slub.c:2489) > >>>>> > > > > [ 690.770137] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) > >>>>> > > > > [ 690.770137] ? debug_object_activate (lib/debugobjects.c:439) > >>>>> > > > > [ 690.770137] __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) > >>>>> > > > > [ 690.770137] debug_object_init (lib/debugobjects.c:365) > >>>>> > > > > [ 690.770137] rcuhead_fixup_activate (kernel/rcu/update.c:231) > >>>>> > > > > [ 690.770137] debug_object_activate (lib/debugobjects.c:280 lib/debugobjects.c:439) > >>>>> > > > > [ 690.770137] ? discard_slab (mm/slub.c:1486) > >>>>> > > > > [ 690.770137] __call_rcu (kernel/rcu/rcu.h:76 (discriminator 2) kernel/rcu/tree.c:2585 (discriminator 2)) > >>>> > > > > >>>> > > > __call_rcu does a slab allocation? This means __call_rcu can no longer be > >>>> > > > used in slab allocators? What happened? > >>> > > > >>> > > My guess is that the root cause is a double call_rcu(), call_rcu_sched(), > >>> > > call_rcu_bh(), or call_srcu(). > >>> > > > >>> > > Perhaps the DEBUG_OBJECTS code now allocates memory to report errors? > >>> > > That would be unfortunate... > >> > > >> > Well, no. 
Look at the callchain: > >> > > >> > __call_rcu > >> > debug_object_activate > >> > rcuhead_fixup_activate > >> > debug_object_init > >> > kmem_cache_alloc > >> > > >> > So call rcu activates the object, but the object has no reference in > >> > the debug objects code so the fixup code is called which inits the > >> > object and allocates a reference .... > > OK, got it. And you are right, call_rcu() has done this for a very > > long time, so not sure what changed. > > It's probably my fault. I've introduced clone() and unshare() fuzzing. > > Those two are full of issues, and I've been holding off on enabling them > until the rest of the kernel could survive trinity for more than an hour. Well, that might explain why I haven't seen it in my testing. ;-) Thanx, Paul From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S965092AbaFSUmh (ORCPT ); Thu, 19 Jun 2014 16:42:37 -0400 Received: from userp1040.oracle.com ([156.151.31.81]:43227 "EHLO userp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S964975AbaFSUmf (ORCPT ); Thu, 19 Jun 2014 16:42:35 -0400 Message-ID: <53A34B23.1000401@oracle.com> Date: Thu, 19 Jun 2014 16:42:11 -0400 From: Sasha Levin User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0 MIME-Version: 1.0 To: paulmck@linux.vnet.ibm.com, Thomas Gleixner CC: Christoph Lameter , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML Subject: Re: slub/debugobjects: lockup when freeing memory References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140619202928.GG4904@linux.vnet.ibm.com> In-Reply-To: <20140619202928.GG4904@linux.vnet.ibm.com> X-Enigmail-Version: 1.6 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Source-IP: ucsinet22.oracle.com [156.151.31.94] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: 
linux-kernel@vger.kernel.org On 06/19/2014 04:29 PM, Paul E. McKenney wrote: > rcu: Provide call_rcu_alloc() and call_rcu_sched_alloc() to avoid recursion > > The sl*b allocators use call_rcu() to manage object lifetimes, but > call_rcu() can use debug-objects, which in turn invokes the sl*b > allocators. These allocators are not prepared for this sort of > recursion, which can result in failures. > > This commit therefore creates call_rcu_alloc() and call_rcu_sched_alloc(), > which act as their call_rcu() and call_rcu_sched() counterparts, but > which avoid invoking debug-objects. These new API members are intended > only for use by the sl*b allocators, and this commit makes the sl*b > allocators use call_rcu_alloc(). Why call_rcu_sched_alloc()? Because > in CONFIG_PREEMPT=n kernels, call_rcu() maps to call_rcu_sched(), so > therefore call_rcu_alloc() must map to call_rcu_sched_alloc(). > > Reported-by: Sasha Levin > Set-straight-by: Thomas Gleixner > Signed-off-by: Paul E. McKenney Paul, what is this patch based on? It won't apply cleanly on -next or Linus's tree. Thanks, Sasha From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934631AbaFSVDh (ORCPT ); Thu, 19 Jun 2014 17:03:37 -0400 Received: from e32.co.us.ibm.com ([32.97.110.150]:46179 "EHLO e32.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932343AbaFSVDg (ORCPT ); Thu, 19 Jun 2014 17:03:36 -0400 Date: Thu, 19 Jun 2014 13:53:07 -0700 From: "Paul E. 
McKenney" To: Thomas Gleixner Cc: Christoph Lameter , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML Subject: Re: slub/debugobjects: lockup when freeing memory Message-ID: <20140619205307.GL4904@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140619202928.GG4904@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 14061920-0928-0000-0000-000002D0F915 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Jun 19, 2014 at 10:37:17PM +0200, Thomas Gleixner wrote: > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > On Thu, Jun 19, 2014 at 09:29:08PM +0200, Thomas Gleixner wrote: > > > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > > Well, no. Look at the callchain: > > > > > > __call_rcu > > > debug_object_activate > > > rcuhead_fixup_activate > > > debug_object_init > > > kmem_cache_alloc > > > > > > So call rcu activates the object, but the object has no reference in > > > the debug objects code so the fixup code is called which inits the > > > object and allocates a reference .... > > > > OK, got it. And you are right, call_rcu() has done this for a very > > long time, so not sure what changed. But it seems like the right > > approach is to provide a debug-object-free call_rcu_alloc() for use > > by the memory allocators. > > > > Seem reasonable? If so, please see the following patch. > > Not really, you're torpedoing the whole purpose of debugobjects :) > > So, why can't we just init the rcu head when the stuff is created? That would allow me to keep my code unchanged, so I am in favor. 
;-) Thanx, Paul > If that's impossible due to other memory allocator constraints, then > instead of inventing a whole new API we can simply flag the relevant > data in the memory allocator as we do with the debug objects mem cache > itself (SLAB_DEBUG_OBJECTS). > > Thanks, > > tglx > From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934646AbaFSVEA (ORCPT ); Thu, 19 Jun 2014 17:04:00 -0400 Received: from e39.co.us.ibm.com ([32.97.110.160]:55439 "EHLO e39.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S934613AbaFSVD6 (ORCPT ); Thu, 19 Jun 2014 17:03:58 -0400 Date: Thu, 19 Jun 2014 13:53:36 -0700 From: "Paul E. McKenney" To: Sasha Levin Cc: Thomas Gleixner , Christoph Lameter , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML Subject: Re: slub/debugobjects: lockup when freeing memory Message-ID: <20140619205336.GM4904@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140619202928.GG4904@linux.vnet.ibm.com> <53A34B23.1000401@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <53A34B23.1000401@oracle.com> User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 14061920-9332-0000-0000-000001256D3C Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Jun 19, 2014 at 04:42:11PM -0400, Sasha Levin wrote: > On 06/19/2014 04:29 PM, Paul E. McKenney wrote: > > rcu: Provide call_rcu_alloc() and call_rcu_sched_alloc() to avoid recursion > > > > The sl*b allocators use call_rcu() to manage object lifetimes, but > > call_rcu() can use debug-objects, which in turn invokes the sl*b > > allocators. These allocators are not prepared for this sort of > > recursion, which can result in failures. 
> > > > This commit therefore creates call_rcu_alloc() and call_rcu_sched_alloc(), > > which act as their call_rcu() and call_rcu_sched() counterparts, but > > which avoid invoking debug-objects. These new API members are intended > > only for use by the sl*b allocators, and this commit makes the sl*b > > allocators use call_rcu_alloc(). Why call_rcu_sched_alloc()? Because > > in CONFIG_PREEMPT=n kernels, call_rcu() maps to call_rcu_sched(), so > > therefore call_rcu_alloc() must map to call_rcu_sched_alloc(). > > > > Reported-by: Sasha Levin > > Set-straight-by: Thomas Gleixner > > Signed-off-by: Paul E. McKenney > > Paul, what is this patch based on? It won't apply cleanly on -next > or Linus's tree. On my -rcu tree, but I think that Thomas's approach is better. Thanx, Paul From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S965813AbaFSVc6 (ORCPT ); Thu, 19 Jun 2014 17:32:58 -0400 Received: from www.linutronix.de ([62.245.132.108]:55645 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S964833AbaFSVc4 (ORCPT ); Thu, 19 Jun 2014 17:32:56 -0400 Date: Thu, 19 Jun 2014 23:32:41 +0200 (CEST) From: Thomas Gleixner To: "Paul E. 
McKenney" cc: Christoph Lameter , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML Subject: Re: slub/debugobjects: lockup when freeing memory In-Reply-To: <20140619205307.GL4904@linux.vnet.ibm.com> Message-ID: References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140619202928.GG4904@linux.vnet.ibm.com> <20140619205307.GL4904@linux.vnet.ibm.com> User-Agent: Alpine 2.10 (DEB 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Linutronix-Spam-Score: -1.0 X-Linutronix-Spam-Level: - X-Linutronix-Spam-Status: No , -1.0 points, 5.0 required, ALL_TRUSTED=-1,SHORTCIRCUIT=-0.0001 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 19 Jun 2014, Paul E. McKenney wrote: > On Thu, Jun 19, 2014 at 10:37:17PM +0200, Thomas Gleixner wrote: > > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > > On Thu, Jun 19, 2014 at 09:29:08PM +0200, Thomas Gleixner wrote: > > > > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > > > Well, no. Look at the callchain: > > > > > > > > __call_rcu > > > > debug_object_activate > > > > rcuhead_fixup_activate > > > > debug_object_init > > > > kmem_cache_alloc > > > > > > > > So call rcu activates the object, but the object has no reference in > > > > the debug objects code so the fixup code is called which inits the > > > > object and allocates a reference .... > > > > > > OK, got it. And you are right, call_rcu() has done this for a very > > > long time, so not sure what changed. But it seems like the right > > > approach is to provide a debug-object-free call_rcu_alloc() for use > > > by the memory allocators. > > > > > > Seem reasonable? If so, please see the following patch. > > > > Not really, you're torpedoing the whole purpose of debugobjects :) > > > > So, why can't we just init the rcu head when the stuff is created? 
> > That would allow me to keep my code unchanged, so I am in favor. ;-) Almost unchanged. You need to provide a function to do so, i.e. make use of debug_init_rcu_head() Thanks, tglx From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S965831AbaFSWE4 (ORCPT ); Thu, 19 Jun 2014 18:04:56 -0400 Received: from e38.co.us.ibm.com ([32.97.110.159]:48203 "EHLO e38.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S965319AbaFSWEz (ORCPT ); Thu, 19 Jun 2014 18:04:55 -0400 Date: Thu, 19 Jun 2014 15:04:49 -0700 From: "Paul E. McKenney" To: Thomas Gleixner Cc: Christoph Lameter , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML Subject: Re: slub/debugobjects: lockup when freeing memory Message-ID: <20140619220449.GT4904@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140619202928.GG4904@linux.vnet.ibm.com> <20140619205307.GL4904@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 14061922-1344-0000-0000-0000024F7A12 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Jun 19, 2014 at 11:32:41PM +0200, Thomas Gleixner wrote: > > > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > > On Thu, Jun 19, 2014 at 10:37:17PM +0200, Thomas Gleixner wrote: > > > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > > > On Thu, Jun 19, 2014 at 09:29:08PM +0200, Thomas Gleixner wrote: > > > > > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > > > > Well, no. 
Look at the callchain: > > > > > > > > > > __call_rcu > > > > > debug_object_activate > > > > > rcuhead_fixup_activate > > > > > debug_object_init > > > > > kmem_cache_alloc > > > > > > > > > > So call rcu activates the object, but the object has no reference in > > > > > the debug objects code so the fixup code is called which inits the > > > > > object and allocates a reference .... > > > > > > > > OK, got it. And you are right, call_rcu() has done this for a very > > > > long time, so not sure what changed. But it seems like the right > > > > approach is to provide a debug-object-free call_rcu_alloc() for use > > > > by the memory allocators. > > > > > > > > Seem reasonable? If so, please see the following patch. > > > > > > Not really, you're torpedoing the whole purpose of debugobjects :) > > > > > > So, why can't we just init the rcu head when the stuff is created? > > > > That would allow me to keep my code unchanged, so I am in favor. ;-) > > Almost unchanged. You need to provide a function to do so, i.e. make > use of > > debug_init_rcu_head() You mean like this? Thanx, Paul ------------------------------------------------------------------------ rcu: Export debug_init_rcu_head() and debug_rcu_head_free() Currently, call_rcu() relies on implicit allocation and initialization for the debug-objects handling of RCU callbacks. If you hammer the kernel hard enough with Sasha's modified version of trinity, you can end up with the sl*b allocators recursing into themselves via this implicit call_rcu() allocation. This commit therefore exports the debug_init_rcu_head() and debug_rcu_head_free() functions, which permits the allocators to allocate and pre-initialize the debug-objects information, so that there is no longer any need for call_rcu() to do that initialization, which in turn prevents the recursion into the memory allocators. Reported-by: Sasha Levin Suggested-by: Thomas Gleixner Signed-off-by: Paul E. 
McKenney diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h index 063a6bf1a2b6..34ae5c376e35 100644 --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h @@ -358,9 +358,19 @@ void wait_rcu_gp(call_rcu_func_t crf); * initialization. */ #ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD +void debug_init_rcu_head(struct rcu_head *head); +void debug_rcu_head_free(struct rcu_head *head); void init_rcu_head_on_stack(struct rcu_head *head); void destroy_rcu_head_on_stack(struct rcu_head *head); #else /* !CONFIG_DEBUG_OBJECTS_RCU_HEAD */ +static inline void debug_init_rcu_head(struct rcu_head *head) +{ +} + +static inline void debug_rcu_head_free(struct rcu_head *head) +{ +} + static inline void init_rcu_head_on_stack(struct rcu_head *head) { } diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c index a2aeb4df0f60..a41c81a26506 100644 --- a/kernel/rcu/update.c +++ b/kernel/rcu/update.c @@ -200,12 +200,12 @@ void wait_rcu_gp(call_rcu_func_t crf) EXPORT_SYMBOL_GPL(wait_rcu_gp); #ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD -static inline void debug_init_rcu_head(struct rcu_head *head) +void debug_init_rcu_head(struct rcu_head *head) { debug_object_init(head, &rcuhead_debug_descr); } -static inline void debug_rcu_head_free(struct rcu_head *head) +void debug_rcu_head_free(struct rcu_head *head) { debug_object_free(head, &rcuhead_debug_descr); } From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934861AbaFTIR5 (ORCPT ); Fri, 20 Jun 2014 04:17:57 -0400 Received: from www.linutronix.de ([62.245.132.108]:57404 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S934721AbaFTIRx (ORCPT ); Fri, 20 Jun 2014 04:17:53 -0400 Date: Fri, 20 Jun 2014 10:17:32 +0200 (CEST) From: Thomas Gleixner To: "Paul E. 
McKenney" cc: Christoph Lameter , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML Subject: Re: slub/debugobjects: lockup when freeing memory In-Reply-To: <20140619220449.GT4904@linux.vnet.ibm.com> Message-ID: References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140619202928.GG4904@linux.vnet.ibm.com> <20140619205307.GL4904@linux.vnet.ibm.com> <20140619220449.GT4904@linux.vnet.ibm.com> User-Agent: Alpine 2.10 (DEB 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Linutronix-Spam-Score: -1.0 X-Linutronix-Spam-Level: - X-Linutronix-Spam-Status: No , -1.0 points, 5.0 required, ALL_TRUSTED=-1,SHORTCIRCUIT=-0.0001 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 19 Jun 2014, Paul E. McKenney wrote: > On Thu, Jun 19, 2014 at 11:32:41PM +0200, Thomas Gleixner wrote: > > > > > > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > > > > On Thu, Jun 19, 2014 at 10:37:17PM +0200, Thomas Gleixner wrote: > > > > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > > > > On Thu, Jun 19, 2014 at 09:29:08PM +0200, Thomas Gleixner wrote: > > > > > > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > > > > > Well, no. Look at the callchain: > > > > > > > > > > > > __call_rcu > > > > > > debug_object_activate > > > > > > rcuhead_fixup_activate > > > > > > debug_object_init > > > > > > kmem_cache_alloc > > > > > > > > > > > > So call rcu activates the object, but the object has no reference in > > > > > > the debug objects code so the fixup code is called which inits the > > > > > > object and allocates a reference .... > > > > > > > > > > OK, got it. And you are right, call_rcu() has done this for a very > > > > > long time, so not sure what changed. But it seems like the right > > > > > approach is to provide a debug-object-free call_rcu_alloc() for use > > > > > by the memory allocators. 
> > > > > > > > > > Seem reasonable? If so, please see the following patch. > > > > > > > > Not really, you're torpedoing the whole purpose of debugobjects :) > > > > > > > > So, why can't we just init the rcu head when the stuff is created? > > > > > > That would allow me to keep my code unchanged, so I am in favor. ;-) > > > > Almost unchanged. You need to provide a function to do so, i.e. make > > use of > > > > debug_init_rcu_head() > > You mean like this? I'd rather name it init_rcu_head() and free_rcu_head() w/o the debug_ prefix, so it's consistent with init_rcu_head_on_stack / destroy_rcu_head_on_stack. But either way works for me. Acked-by: Thomas Gleixner From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753119AbaFTObA (ORCPT ); Fri, 20 Jun 2014 10:31:00 -0400 Received: from qmta12.emeryville.ca.mail.comcast.net ([76.96.27.227]:38963 "EHLO qmta12.emeryville.ca.mail.comcast.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751397AbaFTOa6 (ORCPT ); Fri, 20 Jun 2014 10:30:58 -0400 Date: Fri, 20 Jun 2014 09:30:52 -0500 (CDT) From: Christoph Lameter To: "Paul E. McKenney" cc: Thomas Gleixner , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML Subject: Re: slub/debugobjects: lockup when freeing memory In-Reply-To: <20140619220449.GT4904@linux.vnet.ibm.com> Message-ID: References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140619202928.GG4904@linux.vnet.ibm.com> <20140619205307.GL4904@linux.vnet.ibm.com> <20140619220449.GT4904@linux.vnet.ibm.com> Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 19 Jun 2014, Paul E. 
McKenney wrote: > This commit therefore exports the debug_init_rcu_head() and > debug_rcu_head_free() functions, which permits the allocators to allocate > and pre-initialize the debug-objects information, so that there is no longer > any need for call_rcu() to do that initialization, which in turn prevents > the recursion into the memory allocators. Looks-good-to: Christoph Lameter From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934351AbaFTPk0 (ORCPT ); Fri, 20 Jun 2014 11:40:26 -0400 Received: from e35.co.us.ibm.com ([32.97.110.153]:34160 "EHLO e35.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932228AbaFTPkV (ORCPT ); Fri, 20 Jun 2014 11:40:21 -0400 Date: Fri, 20 Jun 2014 08:40:14 -0700 From: "Paul E. McKenney" To: Thomas Gleixner Cc: Christoph Lameter , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML Subject: Re: slub/debugobjects: lockup when freeing memory Message-ID: <20140620154014.GC4904@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140619202928.GG4904@linux.vnet.ibm.com> <20140619205307.GL4904@linux.vnet.ibm.com> <20140619220449.GT4904@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 14062015-6688-0000-0000-000002B3E1AC Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Jun 20, 2014 at 10:17:32AM +0200, Thomas Gleixner wrote: > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > On Thu, Jun 19, 2014 at 11:32:41PM +0200, Thomas Gleixner wrote: > > > > > > > > > On Thu, 19 Jun 2014, Paul E. 
McKenney wrote: > > > > > > > On Thu, Jun 19, 2014 at 10:37:17PM +0200, Thomas Gleixner wrote: > > > > > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > > > > > On Thu, Jun 19, 2014 at 09:29:08PM +0200, Thomas Gleixner wrote: > > > > > > > On Thu, 19 Jun 2014, Paul E. McKenney wrote: > > > > > > > Well, no. Look at the callchain: > > > > > > > > > > > > > > __call_rcu > > > > > > > debug_object_activate > > > > > > > rcuhead_fixup_activate > > > > > > > debug_object_init > > > > > > > kmem_cache_alloc > > > > > > > > > > > > > > So call rcu activates the object, but the object has no reference in > > > > > > > the debug objects code so the fixup code is called which inits the > > > > > > > object and allocates a reference .... > > > > > > > > > > > > OK, got it. And you are right, call_rcu() has done this for a very > > > > > > long time, so not sure what changed. But it seems like the right > > > > > > approach is to provide a debug-object-free call_rcu_alloc() for use > > > > > > by the memory allocators. > > > > > > > > > > > > Seem reasonable? If so, please see the following patch. > > > > > > > > > > Not really, you're torpedoing the whole purpose of debugobjects :) > > > > > > > > > > So, why can't we just init the rcu head when the stuff is created? > > > > > > > > That would allow me to keep my code unchanged, so I am in favor. ;-) > > > > > > Almost unchanged. You need to provide a function to do so, i.e. make > > > use of > > > > > > debug_init_rcu_head() > > > > You mean like this? > > I'd rather name it init_rcu_head() and free_rcu_head() w/o the debug_ > prefix, so it's consistent with init_rcu_head_on_stack / > destroy_rcu_head_on_stack. But either way works for me. > > Acked-by: Thomas Gleixner So just drop the _on_stack() from the other names, then. Please see below. 
Thanx, Paul ------------------------------------------------------------------------ rcu: Export debug_init_rcu_head() and debug_rcu_head_free() Currently, call_rcu() relies on implicit allocation and initialization for the debug-objects handling of RCU callbacks. If you hammer the kernel hard enough with Sasha's modified version of trinity, you can end up with the sl*b allocators recursing into themselves via this implicit call_rcu() allocation. This commit therefore exports the debug_init_rcu_head() and debug_rcu_head_free() functions, which permits the allocators to allocate and pre-initialize the debug-objects information, so that there is no longer any need for call_rcu() to do that initialization, which in turn prevents the recursion into the memory allocators. Reported-by: Sasha Levin Suggested-by: Thomas Gleixner Signed-off-by: Paul E. McKenney Acked-by: Thomas Gleixner diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h index 063a6bf1a2b6..37c92cfef9ec 100644 --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h @@ -358,9 +358,19 @@ void wait_rcu_gp(call_rcu_func_t crf); * initialization. 
*/ #ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD +void init_rcu_head(struct rcu_head *head); +void destroy_rcu_head(struct rcu_head *head); void init_rcu_head_on_stack(struct rcu_head *head); void destroy_rcu_head_on_stack(struct rcu_head *head); #else /* !CONFIG_DEBUG_OBJECTS_RCU_HEAD */ +static inline void init_rcu_head(struct rcu_head *head) +{ +} + +static inline void destroy_rcu_head(struct rcu_head *head) +{ +} + static inline void init_rcu_head_on_stack(struct rcu_head *head) { } diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c index a2aeb4df0f60..0fb691e63ce6 100644 --- a/kernel/rcu/update.c +++ b/kernel/rcu/update.c @@ -200,12 +200,12 @@ void wait_rcu_gp(call_rcu_func_t crf) EXPORT_SYMBOL_GPL(wait_rcu_gp); #ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD -static inline void debug_init_rcu_head(struct rcu_head *head) +void init_rcu_head(struct rcu_head *head) { debug_object_init(head, &rcuhead_debug_descr); } -static inline void debug_rcu_head_free(struct rcu_head *head) +void destroy_rcu_head(struct rcu_head *head) { debug_object_free(head, &rcuhead_debug_descr); } From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752307AbaGLSEZ (ORCPT ); Sat, 12 Jul 2014 14:04:25 -0400 Received: from userp1040.oracle.com ([156.151.31.81]:18174 "EHLO userp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751452AbaGLSEX (ORCPT ); Sat, 12 Jul 2014 14:04:23 -0400 Message-ID: <53C1788D.9080800@oracle.com> Date: Sat, 12 Jul 2014 14:03:57 -0400 From: Sasha Levin User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0 MIME-Version: 1.0 To: paulmck@linux.vnet.ibm.com, Thomas Gleixner CC: Christoph Lameter , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML Subject: Re: slub/debugobjects: lockup when freeing memory References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> 
<20140619202928.GG4904@linux.vnet.ibm.com> <20140619205307.GL4904@linux.vnet.ibm.com> <20140619220449.GT4904@linux.vnet.ibm.com> <20140620154014.GC4904@linux.vnet.ibm.com> In-Reply-To: <20140620154014.GC4904@linux.vnet.ibm.com> X-Enigmail-Version: 1.6 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Source-IP: ucsinet22.oracle.com [156.151.31.94] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 06/20/2014 11:40 AM, Paul E. McKenney wrote: > rcu: Export debug_init_rcu_head() and and debug_init_rcu_head() > > Currently, call_rcu() relies on implicit allocation and initialization > for the debug-objects handling of RCU callbacks. If you hammer the > kernel hard enough with Sasha's modified version of trinity, you can end > up with the sl*b allocators recursing into themselves via this implicit > call_rcu() allocation. > > This commit therefore exports the debug_init_rcu_head() and > debug_rcu_head_free() functions, which permits the allocators to allocated > and pre-initialize the debug-objects information, so that there no longer > any need for call_rcu() to do that initialization, which in turn prevents > the recursion into the memory allocators. > > Reported-by: Sasha Levin > Suggested-by: Thomas Gleixner > Signed-off-by: Paul E. 
McKenney > Acked-by: Thomas Gleixner Hi Paul, Oddly enough, I still see the issue in -next (I made sure that this patch was in the tree): [ 393.810123] ============================================= [ 393.810123] [ INFO: possible recursive locking detected ] [ 393.810123] 3.16.0-rc4-next-20140711-sasha-00046-g07d3099-dirty #813 Not tainted [ 393.810123] --------------------------------------------- [ 393.810123] trinity-c32/9762 is trying to acquire lock: [ 393.810123] (&(&n->list_lock)->rlock){-.-...}, at: get_partial_node.isra.39 (mm/slub.c:1628) [ 393.810123] [ 393.810123] but task is already holding lock: [ 393.810123] (&(&n->list_lock)->rlock){-.-...}, at: __kmem_cache_shutdown (mm/slub.c:3210 mm/slub.c:3233 mm/slub.c:3244) [ 393.810123] [ 393.810123] other info that might help us debug this: [ 393.810123] Possible unsafe locking scenario: [ 393.810123] [ 393.810123] CPU0 [ 393.810123] ---- [ 393.810123] lock(&(&n->list_lock)->rlock); [ 393.810123] lock(&(&n->list_lock)->rlock); [ 393.810123] [ 393.810123] *** DEADLOCK *** [ 393.810123] [ 393.810123] May be due to missing lock nesting notation [ 393.810123] [ 393.810123] 5 locks held by trinity-c32/9762: [ 393.810123] #0: (net_mutex){+.+.+.}, at: copy_net_ns (net/core/net_namespace.c:254) [ 393.810123] #1: (cpu_hotplug.lock){++++++}, at: get_online_cpus (kernel/cpu.c:90) [ 393.810123] #2: (mem_hotplug.lock){.+.+.+}, at: get_online_mems (mm/memory_hotplug.c:83) [ 393.810123] #3: (slab_mutex){+.+.+.}, at: kmem_cache_destroy (mm/slab_common.c:344) [ 393.810123] #4: (&(&n->list_lock)->rlock){-.-...}, at: __kmem_cache_shutdown (mm/slub.c:3210 mm/slub.c:3233 mm/slub.c:3244) [ 393.810123] [ 393.810123] stack backtrace: [ 393.810123] CPU: 32 PID: 9762 Comm: trinity-c32 Not tainted 3.16.0-rc4-next-20140711-sasha-00046-g07d3099-dirty #813 [ 393.843284] ffff880bc26730e0 0000000000000000 ffffffffb4ae7ff0 ffff880bc26a3848 [ 393.843284] ffffffffb0e47068 ffffffffb4ae7ff0 ffff880bc26a38f0 ffffffffac258586 [ 393.843284] 
ffff880bc2673e30 000000050000000a ffffffffb444dee0 ffff880bc2673e48 [ 393.843284] Call Trace: [ 393.843284] dump_stack (lib/dump_stack.c:52) [ 393.843284] __lock_acquire (kernel/locking/lockdep.c:1739 kernel/locking/lockdep.c:1783 kernel/locking/lockdep.c:2115 kernel/locking/lockdep.c:3182) [ 393.843284] lock_acquire (kernel/locking/lockdep.c:3602) [ 393.843284] ? get_partial_node.isra.39 (mm/slub.c:1628) [ 393.843284] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151) [ 393.843284] ? get_partial_node.isra.39 (mm/slub.c:1628) [ 393.843284] get_partial_node.isra.39 (mm/slub.c:1628) [ 393.843284] ? check_irq_usage (kernel/locking/lockdep.c:1638) [ 393.843284] ? __slab_alloc (mm/slub.c:2307) [ 393.843284] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63) [ 393.843284] __slab_alloc (mm/slub.c:1730 mm/slub.c:2208 mm/slub.c:2372) [ 393.843284] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) [ 393.843284] ? kvm_clock_read (./arch/x86/include/asm/preempt.h:90 arch/x86/kernel/kvmclock.c:86) [ 393.843284] ? sched_clock (./arch/x86/include/asm/paravirt.h:192 arch/x86/kernel/tsc.c:304) [ 393.843284] kmem_cache_alloc (mm/slub.c:2445 mm/slub.c:2487 mm/slub.c:2492) [ 393.843284] ? debug_smp_processor_id (lib/smp_processor_id.c:57) [ 393.843284] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) [ 393.843284] ? check_chain_key (kernel/locking/lockdep.c:2188) [ 393.843284] __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) [ 393.843284] ? _raw_spin_unlock_irqrestore (include/linux/spinlock_api_smp.h:160 kernel/locking/spinlock.c:191) [ 393.843284] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63) [ 393.843284] debug_object_init (lib/debugobjects.c:365) [ 393.843284] rcuhead_fixup_activate (kernel/rcu/update.c:260) [ 393.843284] debug_object_activate (lib/debugobjects.c:280 lib/debugobjects.c:439) [ 393.843284] ? preempt_count_sub (kernel/sched/core.c:2600) [ 393.843284] ? 
slab_cpuup_callback (mm/slub.c:1484) [ 393.843284] __call_rcu (kernel/rcu/rcu.h:76 (discriminator 8) kernel/rcu/tree.c:2665 (discriminator 8)) [ 393.843284] ? __kmem_cache_shutdown (mm/slub.c:3210 mm/slub.c:3233 mm/slub.c:3244) [ 393.843284] call_rcu (kernel/rcu/tree_plugin.h:679) [ 393.843284] discard_slab (mm/slub.c:1522) [ 393.843284] __kmem_cache_shutdown (mm/slub.c:3210 mm/slub.c:3233 mm/slub.c:3244) [ 393.843284] kmem_cache_destroy (mm/slab_common.c:350) [ 393.843284] nf_conntrack_cleanup_net_list (net/netfilter/nf_conntrack_core.c:1569 (discriminator 3)) [ 393.843284] nf_conntrack_pernet_exit (net/netfilter/nf_conntrack_standalone.c:558) [ 393.843284] ops_exit_list.isra.1 (net/core/net_namespace.c:135) [ 393.843284] setup_net (net/core/net_namespace.c:180 (discriminator 3)) [ 393.843284] copy_net_ns (net/core/net_namespace.c:255) [ 393.843284] create_new_namespaces (kernel/nsproxy.c:95) [ 393.843284] unshare_nsproxy_namespaces (kernel/nsproxy.c:190 (discriminator 4)) [ 393.843284] SyS_unshare (kernel/fork.c:1865 kernel/fork.c:1814) [ 393.843284] tracesys (arch/x86/kernel/entry_64.S:542) Thanks, Sasha From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753063AbaGLTd6 (ORCPT ); Sat, 12 Jul 2014 15:33:58 -0400 Received: from e32.co.us.ibm.com ([32.97.110.150]:34851 "EHLO e32.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752786AbaGLTd5 (ORCPT ); Sat, 12 Jul 2014 15:33:57 -0400 Date: Sat, 12 Jul 2014 12:33:49 -0700 From: "Paul E. 
McKenney" To: Sasha Levin Cc: Thomas Gleixner , Christoph Lameter , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML Subject: Re: slub/debugobjects: lockup when freeing memory Message-ID: <20140712193349.GD16041@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <20140619165247.GA4904@linux.vnet.ibm.com> <20140619202928.GG4904@linux.vnet.ibm.com> <20140619205307.GL4904@linux.vnet.ibm.com> <20140619220449.GT4904@linux.vnet.ibm.com> <20140620154014.GC4904@linux.vnet.ibm.com> <53C1788D.9080800@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <53C1788D.9080800@oracle.com> User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 14071219-0928-0000-0000-0000035AE9FB Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sat, Jul 12, 2014 at 02:03:57PM -0400, Sasha Levin wrote: > On 06/20/2014 11:40 AM, Paul E. McKenney wrote: > > rcu: Export debug_init_rcu_head() and and debug_init_rcu_head() > > > > Currently, call_rcu() relies on implicit allocation and initialization > > for the debug-objects handling of RCU callbacks. If you hammer the > > kernel hard enough with Sasha's modified version of trinity, you can end > > up with the sl*b allocators recursing into themselves via this implicit > > call_rcu() allocation. > > > > This commit therefore exports the debug_init_rcu_head() and > > debug_rcu_head_free() functions, which permits the allocators to allocated > > and pre-initialize the debug-objects information, so that there no longer > > any need for call_rcu() to do that initialization, which in turn prevents > > the recursion into the memory allocators. > > > > Reported-by: Sasha Levin > > Suggested-by: Thomas Gleixner > > Signed-off-by: Paul E. 
McKenney > > Acked-by: Thomas Gleixner > > Hi Paul, > > Oddly enough, I still see the issue in -next (I made sure that this patch > was in the tree): Hello, Sasha, This commit is only part of the solution. The allocators need to change to make use of it. Thanx, Paul > [ 393.810123] ============================================= > [ 393.810123] [ INFO: possible recursive locking detected ] > [ 393.810123] 3.16.0-rc4-next-20140711-sasha-00046-g07d3099-dirty #813 Not tainted > [ 393.810123] --------------------------------------------- > [ 393.810123] trinity-c32/9762 is trying to acquire lock: > [ 393.810123] (&(&n->list_lock)->rlock){-.-...}, at: get_partial_node.isra.39 (mm/slub.c:1628) > [ 393.810123] > [ 393.810123] but task is already holding lock: > [ 393.810123] (&(&n->list_lock)->rlock){-.-...}, at: __kmem_cache_shutdown (mm/slub.c:3210 mm/slub.c:3233 mm/slub.c:3244) > [ 393.810123] > [ 393.810123] other info that might help us debug this: > [ 393.810123] Possible unsafe locking scenario: > [ 393.810123] > [ 393.810123] CPU0 > [ 393.810123] ---- > [ 393.810123] lock(&(&n->list_lock)->rlock); > [ 393.810123] lock(&(&n->list_lock)->rlock); > [ 393.810123] > [ 393.810123] *** DEADLOCK *** > [ 393.810123] > [ 393.810123] May be due to missing lock nesting notation > [ 393.810123] > [ 393.810123] 5 locks held by trinity-c32/9762: > [ 393.810123] #0: (net_mutex){+.+.+.}, at: copy_net_ns (net/core/net_namespace.c:254) > [ 393.810123] #1: (cpu_hotplug.lock){++++++}, at: get_online_cpus (kernel/cpu.c:90) > [ 393.810123] #2: (mem_hotplug.lock){.+.+.+}, at: get_online_mems (mm/memory_hotplug.c:83) > [ 393.810123] #3: (slab_mutex){+.+.+.}, at: kmem_cache_destroy (mm/slab_common.c:344) > [ 393.810123] #4: (&(&n->list_lock)->rlock){-.-...}, at: __kmem_cache_shutdown (mm/slub.c:3210 mm/slub.c:3233 mm/slub.c:3244) > [ 393.810123] > [ 393.810123] stack backtrace: > [ 393.810123] CPU: 32 PID: 9762 Comm: trinity-c32 Not tainted 3.16.0-rc4-next-20140711-sasha-00046-g07d3099-dirty 
#813 > [ 393.843284] ffff880bc26730e0 0000000000000000 ffffffffb4ae7ff0 ffff880bc26a3848 > [ 393.843284] ffffffffb0e47068 ffffffffb4ae7ff0 ffff880bc26a38f0 ffffffffac258586 > [ 393.843284] ffff880bc2673e30 000000050000000a ffffffffb444dee0 ffff880bc2673e48 > [ 393.843284] Call Trace: > [ 393.843284] dump_stack (lib/dump_stack.c:52) > [ 393.843284] __lock_acquire (kernel/locking/lockdep.c:1739 kernel/locking/lockdep.c:1783 kernel/locking/lockdep.c:2115 kernel/locking/lockdep.c:3182) > [ 393.843284] lock_acquire (kernel/locking/lockdep.c:3602) > [ 393.843284] ? get_partial_node.isra.39 (mm/slub.c:1628) > [ 393.843284] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151) > [ 393.843284] ? get_partial_node.isra.39 (mm/slub.c:1628) > [ 393.843284] get_partial_node.isra.39 (mm/slub.c:1628) > [ 393.843284] ? check_irq_usage (kernel/locking/lockdep.c:1638) > [ 393.843284] ? __slab_alloc (mm/slub.c:2307) > [ 393.843284] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63) > [ 393.843284] __slab_alloc (mm/slub.c:1730 mm/slub.c:2208 mm/slub.c:2372) > [ 393.843284] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) > [ 393.843284] ? kvm_clock_read (./arch/x86/include/asm/preempt.h:90 arch/x86/kernel/kvmclock.c:86) > [ 393.843284] ? sched_clock (./arch/x86/include/asm/paravirt.h:192 arch/x86/kernel/tsc.c:304) > [ 393.843284] kmem_cache_alloc (mm/slub.c:2445 mm/slub.c:2487 mm/slub.c:2492) > [ 393.843284] ? debug_smp_processor_id (lib/smp_processor_id.c:57) > [ 393.843284] ? __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) > [ 393.843284] ? check_chain_key (kernel/locking/lockdep.c:2188) > [ 393.843284] __debug_object_init (lib/debugobjects.c:100 lib/debugobjects.c:312) > [ 393.843284] ? _raw_spin_unlock_irqrestore (include/linux/spinlock_api_smp.h:160 kernel/locking/spinlock.c:191) > [ 393.843284] ? 
__this_cpu_preempt_check (lib/smp_processor_id.c:63) > [ 393.843284] debug_object_init (lib/debugobjects.c:365) > [ 393.843284] rcuhead_fixup_activate (kernel/rcu/update.c:260) > [ 393.843284] debug_object_activate (lib/debugobjects.c:280 lib/debugobjects.c:439) > [ 393.843284] ? preempt_count_sub (kernel/sched/core.c:2600) > [ 393.843284] ? slab_cpuup_callback (mm/slub.c:1484) > [ 393.843284] __call_rcu (kernel/rcu/rcu.h:76 (discriminator 8) kernel/rcu/tree.c:2665 (discriminator 8)) > [ 393.843284] ? __kmem_cache_shutdown (mm/slub.c:3210 mm/slub.c:3233 mm/slub.c:3244) > [ 393.843284] call_rcu (kernel/rcu/tree_plugin.h:679) > [ 393.843284] discard_slab (mm/slub.c:1522) > [ 393.843284] __kmem_cache_shutdown (mm/slub.c:3210 mm/slub.c:3233 mm/slub.c:3244) > [ 393.843284] kmem_cache_destroy (mm/slab_common.c:350) > [ 393.843284] nf_conntrack_cleanup_net_list (net/netfilter/nf_conntrack_core.c:1569 (discriminator 3)) > [ 393.843284] nf_conntrack_pernet_exit (net/netfilter/nf_conntrack_standalone.c:558) > [ 393.843284] ops_exit_list.isra.1 (net/core/net_namespace.c:135) > [ 393.843284] setup_net (net/core/net_namespace.c:180 (discriminator 3)) > [ 393.843284] copy_net_ns (net/core/net_namespace.c:255) > [ 393.843284] create_new_namespaces (kernel/nsproxy.c:95) > [ 393.843284] unshare_nsproxy_namespaces (kernel/nsproxy.c:190 (discriminator 4)) > [ 393.843284] SyS_unshare (kernel/fork.c:1865 kernel/fork.c:1814) > [ 393.843284] tracesys (arch/x86/kernel/entry_64.S:542) > > > Thanks, > Sasha > From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752096AbaHRSvk (ORCPT ); Mon, 18 Aug 2014 14:51:40 -0400 Received: from e38.co.us.ibm.com ([32.97.110.159]:59468 "EHLO e38.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751676AbaHRSvj (ORCPT ); Mon, 18 Aug 2014 14:51:39 -0400 Date: Mon, 18 Aug 2014 09:37:57 -0700 From: "Paul E. 
McKenney" To: Christoph Lameter Cc: Thomas Gleixner , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML Subject: Re: slub/debugobjects: lockup when freeing memory Message-ID: <20140818163757.GA30742@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 14081818-1344-0000-0000-0000038C1EFB Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Jun 19, 2014 at 03:19:39PM -0500, Christoph Lameter wrote: > On Thu, 19 Jun 2014, Thomas Gleixner wrote: > > > Well, no. Look at the callchain: > > > > __call_rcu > > debug_object_activate > > rcuhead_fixup_activate > > debug_object_init > > kmem_cache_alloc > > > > So call rcu activates the object, but the object has no reference in > > the debug objects code so the fixup code is called which inits the > > object and allocates a reference .... > > So we need to init the object in the page struct before the __call_rcu? And the needed APIs are now in mainline: void init_rcu_head(struct rcu_head *head); void destroy_rcu_head(struct rcu_head *head); Over to you, Christoph! ;-) Thanx, Paul From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752914AbaHSDoi (ORCPT ); Mon, 18 Aug 2014 23:44:38 -0400 Received: from qmta08.emeryville.ca.mail.comcast.net ([76.96.30.80]:33681 "EHLO qmta08.emeryville.ca.mail.comcast.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752636AbaHSDoh (ORCPT ); Mon, 18 Aug 2014 23:44:37 -0400 Date: Mon, 18 Aug 2014 22:44:34 -0500 (CDT) From: Christoph Lameter X-X-Sender: cl@gentwo.org To: "Paul E. 
McKenney" cc: Thomas Gleixner , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML Subject: Re: slub/debugobjects: lockup when freeing memory In-Reply-To: <20140818163757.GA30742@linux.vnet.ibm.com> Message-ID: References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140818163757.GA30742@linux.vnet.ibm.com> Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, 18 Aug 2014, Paul E. McKenney wrote: > > > So call rcu activates the object, but the object has no reference in > > > the debug objects code so the fixup code is called which inits the > > > object and allocates a reference .... > > > > So we need to init the object in the page struct before the __call_rcu? > > And the needed APIs are now in mainline: > > void init_rcu_head(struct rcu_head *head); > void destroy_rcu_head(struct rcu_head *head); > > Over to you, Christoph! ;-) The field we are using for the rcu head serves other purposes before the free action. We cannot init the field at slab creation as we thought, since it is used for the queueing of slabs on the partial, free and full lists. The kmem_cache information is not available when doing the freeing, so we must force the allocation of reserve fields and the use of the reserved areas for rcu on all kmem_caches. I made this conditional on CONFIG_RCU_DEBUG_XYZ. This is a placeholder and needs to become the actual debug option that requires allocations when initializing rcu heads. Also note that the allocations in the rcu head initialization must be restricted to non-RCU slabs; otherwise the recursion may not terminate. Subject: RFC: Allow allocations on initializing rcu fields in slub. 
Signed-off-by: Christoph Lameter Index: linux/mm/slub.c =================================================================== --- linux.orig/mm/slub.c +++ linux/mm/slub.c @@ -1308,6 +1308,41 @@ static inline struct page *alloc_slab_pa return page; } +#ifdef CONFIG_RCU_DEBUG_XYZ +/* + * We may have to do allocations during the initialization of the + * debug portion of the rcu structure for a slab. It must therefore + * be separately allocated and initialized on allocation. + * We cannot overload the lru field in the page struct at all. + */ +#define need_reserve_slab_rcu 1 +#else +/* + * Overload the lru field in struct page if it fits. + * Should struct rcu_head grow due to debugging fields etc then + * additional space will be allocated from the end of the slab to + * store the rcu_head. + */ +#define need_reserve_slab_rcu \ + (sizeof(((struct page *)NULL)->lru) < sizeof(struct rcu_head)) +#endif + +static struct rcu_head *get_rcu_head(struct kmem_cache *s, struct page *page) +{ + if (need_reserve_slab_rcu) { + int order = compound_order(page); + int offset = (PAGE_SIZE << order) - s->reserved; + + VM_BUG_ON(s->reserved != sizeof(struct rcu_head)); + return page_address(page) + offset; + } else { + /* + * RCU free overloads the RCU head over the LRU + */ + return (void *)&page->lru; + } +} + static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node) { struct page *page; @@ -1357,6 +1392,21 @@ static struct page *allocate_slab(struct kmemcheck_mark_unallocated_pages(page, pages); } +#ifdef CONFIG_RCU_DEBUG_XYZ + if (unlikely(s->flags & SLAB_DESTROY_BY_RCU) && page) + /* + * Initialize rcu_head and potentially do other + * allocations. Note that this is still a recursive + * call into the allocator which may recurse endlessly + * if the same kmem_cache is used for allocation here. + * + * So in order to be safe the slab caches used + * in init_rcu_head must be restricted to be of the + * non rcu kind only. 
+ */ + init_rcu_head(get_rcu_head(s, page)); +#endif + if (flags & __GFP_WAIT) local_irq_disable(); if (!page) @@ -1452,13 +1502,13 @@ static void __free_slab(struct kmem_cach memcg_uncharge_slab(s, order); } -#define need_reserve_slab_rcu \ - (sizeof(((struct page *)NULL)->lru) < sizeof(struct rcu_head)) - static void rcu_free_slab(struct rcu_head *h) { struct page *page; +#ifdef CONFIG_RCU_DEBUG_XYZ + destroy_rcu_head(h); +#endif if (need_reserve_slab_rcu) page = virt_to_head_page(h); else @@ -1469,24 +1519,9 @@ static void rcu_free_slab(struct rcu_hea static void free_slab(struct kmem_cache *s, struct page *page) { - if (unlikely(s->flags & SLAB_DESTROY_BY_RCU)) { - struct rcu_head *head; - - if (need_reserve_slab_rcu) { - int order = compound_order(page); - int offset = (PAGE_SIZE << order) - s->reserved; - - VM_BUG_ON(s->reserved != sizeof(*head)); - head = page_address(page) + offset; - } else { - /* - * RCU free overloads the RCU head over the LRU - */ - head = (void *)&page->lru; - } - - call_rcu(head, rcu_free_slab); - } else + if (unlikely(s->flags & SLAB_DESTROY_BY_RCU)) + call_rcu(get_rcu_head(s, page), rcu_free_slab); + else __free_slab(s, page); } From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752589AbaHSD6f (ORCPT ); Mon, 18 Aug 2014 23:58:35 -0400 Received: from e34.co.us.ibm.com ([32.97.110.152]:43251 "EHLO e34.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752449AbaHSD6e (ORCPT ); Mon, 18 Aug 2014 23:58:34 -0400 Date: Mon, 18 Aug 2014 20:58:28 -0700 From: "Paul E. 
McKenney" To: Christoph Lameter Cc: Thomas Gleixner , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML Subject: Re: slub/debugobjects: lockup when freeing memory Message-ID: <20140819035828.GI4752@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140818163757.GA30742@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 14081903-1542-0000-0000-0000041CF189 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Aug 18, 2014 at 10:44:34PM -0500, Christoph Lameter wrote: > On Mon, 18 Aug 2014, Paul E. McKenney wrote: > > > > > So call rcu activates the object, but the object has no reference in > > > > the debug objects code so the fixup code is called which inits the > > > > object and allocates a reference .... > > > > > > So we need to init the object in the page struct before the __call_rcu? > > > > And the needed APIs are now in mainline: > > > > void init_rcu_head(struct rcu_head *head); > > void destroy_rcu_head(struct rcu_head *head); > > > > Over to you, Christoph! ;-) > > The field we are using for the rcu head serves other purposes before > the free action. We cannot init the field at slab creation as we > thought since it is used for the queueing of slabs on the partial, free > and full lists. The kmem_cache information is not available when doing > the freeing so we must force the allocation of reserve fields and the > use of the reserved areas for rcu on all kmem_caches. Yow! I am glad I didn't try doing this myself! > I made this conditional on CONFIG_RCU_XYZ. This needs to be the actual > Debug options that will require allocations when initializing rcu heads. 
> > Also note that the allocations in the rcu head initialization must be > restricted to non RCU slabs otherwise the recursion may not terminate. > > > Subject RFC: Allow allocations on initializing rcu fields in slub. > > Signed-off-by: Christoph Lameter > > Index: linux/mm/slub.c > =================================================================== > --- linux.orig/mm/slub.c > +++ linux/mm/slub.c > @@ -1308,6 +1308,41 @@ static inline struct page *alloc_slab_pa > return page; > } > > +#ifdef CONFIG_RCU_DEBUG_XYZ If you make CONFIG_RCU_DEBUG_XYZ instead be CONFIG_DEBUG_OBJECTS_RCU_HEAD, then it will automatically show up when it needs to. The rest looks plausible, for whatever that is worth. Thanx, Paul > +/* > + * We may have to do alloations during the initialization of the > + * debug portion of the rcu structure for a slab. It must therefore > + * be separately allocated and initized on allocation. > + * We cannot overload the lru field in the page struct at all. > + */ > +#define need_reserve_slab_rcu 1 > +#else > +/* > + * Overload the lru field in struct page if it fits. > + * Should struct rcu_head grow due to debugging fields etc then > + * additional space will be allocated from the end of the slab to > + * store the rcu_head. 
> + */ > +#define need_reserve_slab_rcu \ > + (sizeof(((struct page *)NULL)->lru) < sizeof(struct rcu_head)) > +#endif > + > +static struct rcu_head *get_rcu_head(struct kmem_cache *s, struct page *page) > +{ > + if (need_reserve_slab_rcu) { > + int order = compound_order(page); > + int offset = (PAGE_SIZE << order) - s->reserved; > + > + VM_BUG_ON(s->reserved != sizeof(struct rcu_head)); > + return page_address(page) + offset; > + } else { > + /* > + * RCU free overloads the RCU head over the LRU > + */ > + return (void *)&page->lru; > + } > +} > + > static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node) > { > struct page *page; > @@ -1357,6 +1392,21 @@ static struct page *allocate_slab(struct > kmemcheck_mark_unallocated_pages(page, pages); > } > > +#ifdef CONFIG_RCU_DEBUG_XYZ > + if (unlikely(s->flags & SLAB_DESTROY_BY_RCU) && page) > + /* > + * Initialize rcu_head and potentially do other > + * allocations. Note that this is still a recursive > + * call into the allocator which may recurse endlessly > + * if the same kmem_cache is used for allocation here. > + * > + * So in order to be safe the slab caches used > + * in init_rcu_head must be restricted to be of the > + * non rcu kind only. 
> + */ > + init_rcu_head(get_rcu_head(s, page)); > +#endif > + > if (flags & __GFP_WAIT) > local_irq_disable(); > if (!page) > @@ -1452,13 +1502,13 @@ static void __free_slab(struct kmem_cach > memcg_uncharge_slab(s, order); > } > > -#define need_reserve_slab_rcu \ > - (sizeof(((struct page *)NULL)->lru) < sizeof(struct rcu_head)) > - > static void rcu_free_slab(struct rcu_head *h) > { > struct page *page; > > +#ifdef CONFIG_RCU_DEBUG_XYZ > + destroy_rcu_head(h); > +#endif > if (need_reserve_slab_rcu) > page = virt_to_head_page(h); > else > @@ -1469,24 +1519,9 @@ static void rcu_free_slab(struct rcu_hea > > static void free_slab(struct kmem_cache *s, struct page *page) > { > - if (unlikely(s->flags & SLAB_DESTROY_BY_RCU)) { > - struct rcu_head *head; > - > - if (need_reserve_slab_rcu) { > - int order = compound_order(page); > - int offset = (PAGE_SIZE << order) - s->reserved; > - > - VM_BUG_ON(s->reserved != sizeof(*head)); > - head = page_address(page) + offset; > - } else { > - /* > - * RCU free overloads the RCU head over the LRU > - */ > - head = (void *)&page->lru; > - } > - > - call_rcu(head, rcu_free_slab); > - } else > + if (unlikely(s->flags & SLAB_DESTROY_BY_RCU)) > + call_rcu(get_rcu_head(s, page), rcu_free_slab); > + else > __free_slab(s, page); > } > From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752015AbaHTCAL (ORCPT ); Tue, 19 Aug 2014 22:00:11 -0400 Received: from qmta15.emeryville.ca.mail.comcast.net ([76.96.27.228]:46865 "EHLO qmta15.emeryville.ca.mail.comcast.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751733AbaHTCAJ (ORCPT ); Tue, 19 Aug 2014 22:00:09 -0400 Date: Tue, 19 Aug 2014 21:00:05 -0500 (CDT) From: Christoph Lameter X-X-Sender: cl@gentwo.org To: "Paul E. 
McKenney" cc: Thomas Gleixner , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML Subject: Re: slub/debugobjects: lockup when freeing memory In-Reply-To: <20140819035828.GI4752@linux.vnet.ibm.com> Message-ID: References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140818163757.GA30742@linux.vnet.ibm.com> <20140819035828.GI4752@linux.vnet.ibm.com> Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, 18 Aug 2014, Paul E. McKenney wrote: > > +#ifdef CONFIG_RCU_DEBUG_XYZ > > If you make CONFIG_RCU_DEBUG_XYZ instead be CONFIG_DEBUG_OBJECTS_RCU_HEAD, > then it will automatically show up when it needs to. Ok. > The rest looks plausible, for whatever that is worth. We talked in the hallway about init_rcu_head not touching the contents of the rcu_head. If that is the case then we can simplify the patch. We could also remove the #ifdefs if init_rcu_head and destroy_rcu_head are no-ops when CONFIG_DEBUG_OBJECTS_RCU_HEAD is not defined. 
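To illustrate the last point, here is a minimal userspace sketch, assuming simplified stand-ins for the kernel types. SKETCH_DEBUG_OBJECTS_RCU_HEAD, tracked_heads and slab_lifecycle_preserves_head() are hypothetical names chosen for illustration; only init_rcu_head() and destroy_rcu_head() come from the thread. It shows how empty static inline stubs would let a caller invoke both hooks without any #ifdef, and it checks the assumption that the hooks leave the rcu_head contents alone.

```c
/* Userspace sketch only; this is not kernel code. */
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct rcu_head. */
struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *head);
};

/* Hypothetical switch standing in for CONFIG_DEBUG_OBJECTS_RCU_HEAD. */
#ifdef SKETCH_DEBUG_OBJECTS_RCU_HEAD
/* Stand-in for the debug-objects bookkeeping kept outside the head. */
static int tracked_heads;

static void init_rcu_head(struct rcu_head *head)
{
	(void)head;		/* contents of *head are not modified */
	tracked_heads++;
}

static void destroy_rcu_head(struct rcu_head *head)
{
	(void)head;
	tracked_heads--;
}
#else
/* Debugging off: empty stubs, so callers need no #ifdef of their own. */
static inline void init_rcu_head(struct rcu_head *head)
{
	(void)head;
}

static inline void destroy_rcu_head(struct rcu_head *head)
{
	(void)head;
}
#endif

/*
 * Hypothetical caller: runs the init/destroy pair unconditionally and
 * returns 1 if the head still holds the values stored before init.
 */
static int slab_lifecycle_preserves_head(void)
{
	struct rcu_head head = { .next = &head, .func = NULL };
	int untouched;

	init_rcu_head(&head);
	untouched = (head.next == &head && head.func == NULL);
	destroy_rcu_head(&head);

	assert(untouched);
	return untouched;
}
```

Building the sketch with or without -DSKETCH_DEBUG_OBJECTS_RCU_HEAD leaves the caller unchanged, which is the property that would make the #ifdefs in the patch unnecessary.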
Index: linux/mm/slub.c =================================================================== --- linux.orig/mm/slub.c +++ linux/mm/slub.c @@ -1308,6 +1308,25 @@ static inline struct page *alloc_slab_pa return page; } +#define need_reserve_slab_rcu \ + (sizeof(((struct page *)NULL)->lru) < sizeof(struct rcu_head)) + +static struct rcu_head *get_rcu_head(struct kmem_cache *s, struct page *page) +{ + if (need_reserve_slab_rcu) { + int order = compound_order(page); + int offset = (PAGE_SIZE << order) - s->reserved; + + VM_BUG_ON(s->reserved != sizeof(struct rcu_head)); + return page_address(page) + offset; + } else { + /* + * RCU free overloads the RCU head over the LRU + */ + return (void *)&page->lru; + } +} + static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node) { struct page *page; @@ -1357,6 +1376,22 @@ static struct page *allocate_slab(struct kmemcheck_mark_unallocated_pages(page, pages); } +#ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD + if (unlikely(s->flags & SLAB_DESTROY_BY_RCU) && page) + /* + * Initialize various things. However, this init is + * not allowed to modify the contents of the rcu head. + * Allocations are permitted. However, the use of + * the same cache or another cache with SLAB_RCU_DESTROY + * set may cause additional recursions. + * + * So in order to be safe the slab caches used + * in init_rcu_head should be restricted to be of the + * non rcu kind only. 
+ */ + init_rcu_head(get_rcu_head(s, page)); +#endif + if (flags & __GFP_WAIT) local_irq_disable(); if (!page) @@ -1452,13 +1487,13 @@ static void __free_slab(struct kmem_cach memcg_uncharge_slab(s, order); } -#define need_reserve_slab_rcu \ - (sizeof(((struct page *)NULL)->lru) < sizeof(struct rcu_head)) - static void rcu_free_slab(struct rcu_head *h) { struct page *page; +#ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD + destroy_rcu_head(h); +#endif if (need_reserve_slab_rcu) page = virt_to_head_page(h); else @@ -1469,24 +1504,9 @@ static void rcu_free_slab(struct rcu_hea static void free_slab(struct kmem_cache *s, struct page *page) { - if (unlikely(s->flags & SLAB_DESTROY_BY_RCU)) { - struct rcu_head *head; - - if (need_reserve_slab_rcu) { - int order = compound_order(page); - int offset = (PAGE_SIZE << order) - s->reserved; - - VM_BUG_ON(s->reserved != sizeof(*head)); - head = page_address(page) + offset; - } else { - /* - * RCU free overloads the RCU head over the LRU - */ - head = (void *)&page->lru; - } - - call_rcu(head, rcu_free_slab); - } else + if (unlikely(s->flags & SLAB_DESTROY_BY_RCU)) + call_rcu(get_rcu_head(s, page), rcu_free_slab); + else __free_slab(s, page); } From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752057AbaHTCb2 (ORCPT ); Tue, 19 Aug 2014 22:31:28 -0400 Received: from e34.co.us.ibm.com ([32.97.110.152]:57744 "EHLO e34.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751738AbaHTCb0 (ORCPT ); Tue, 19 Aug 2014 22:31:26 -0400 Date: Tue, 19 Aug 2014 19:31:21 -0700 From: "Paul E. 
McKenney" To: Christoph Lameter Cc: Thomas Gleixner , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML Subject: Re: slub/debugobjects: lockup when freeing memory Message-ID: <20140820023121.GS4752@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140818163757.GA30742@linux.vnet.ibm.com> <20140819035828.GI4752@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 14082002-1542-0000-0000-000004245647 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Aug 19, 2014 at 09:00:05PM -0500, Christoph Lameter wrote: > On Mon, 18 Aug 2014, Paul E. McKenney wrote: > > > > +#ifdef CONFIG_RCU_DEBUG_XYZ > > > > If you make CONFIG_RCU_DEBUG_XYZ instead be CONFIG_DEBUG_OBJECTS_RCU_HEAD, > > then it will automatically show up when it needs to. > > Ok. > > > The rest looks plausible, for whatever that is worth. > > We talked in the hallway about init_rcu_head not touching > the contents of the rcu_head. If that is the case then we can simplify > the patch. That is correct -- the debug-objects code uses separate storage to track states, and does not touch the memory to which the state applies. > We could also remove the #ifdefs if init_rcu_head and destroy_rcu_head > are no ops if CONFIG_DEBUG_RCU_HEAD is not defined. And indeed they are, good point! It appears to me that both sets of #ifdefs can go away. 
Thanx, Paul > Index: linux/mm/slub.c > =================================================================== > --- linux.orig/mm/slub.c > +++ linux/mm/slub.c > @@ -1308,6 +1308,25 @@ static inline struct page *alloc_slab_pa > return page; > } > > +#define need_reserve_slab_rcu \ > + (sizeof(((struct page *)NULL)->lru) < sizeof(struct rcu_head)) > + > +static struct rcu_head *get_rcu_head(struct kmem_cache *s, struct page *page) > +{ > + if (need_reserve_slab_rcu) { > + int order = compound_order(page); > + int offset = (PAGE_SIZE << order) - s->reserved; > + > + VM_BUG_ON(s->reserved != sizeof(struct rcu_head)); > + return page_address(page) + offset; > + } else { > + /* > + * RCU free overloads the RCU head over the LRU > + */ > + return (void *)&page->lru; > + } > +} > + > static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node) > { > struct page *page; > @@ -1357,6 +1376,22 @@ static struct page *allocate_slab(struct > kmemcheck_mark_unallocated_pages(page, pages); > } > > +#ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD > + if (unlikely(s->flags & SLAB_DESTROY_BY_RCU) && page) > + /* > + * Initialize various things. However, this init is > + * not allowed to modify the contents of the rcu head. > + * Allocations are permitted. However, the use of > + * the same cache or another cache with SLAB_RCU_DESTROY > + * set may cause additional recursions. > + * > + * So in order to be safe the slab caches used > + * in init_rcu_head should be restricted to be of the > + * non rcu kind only. 
> + */ > + init_rcu_head(get_rcu_head(s, page)); > +#endif > + > if (flags & __GFP_WAIT) > local_irq_disable(); > if (!page) > @@ -1452,13 +1487,13 @@ static void __free_slab(struct kmem_cach > memcg_uncharge_slab(s, order); > } > > -#define need_reserve_slab_rcu \ > - (sizeof(((struct page *)NULL)->lru) < sizeof(struct rcu_head)) > - > static void rcu_free_slab(struct rcu_head *h) > { > struct page *page; > > +#ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD > + destroy_rcu_head(h); > +#endif > if (need_reserve_slab_rcu) > page = virt_to_head_page(h); > else > @@ -1469,24 +1504,9 @@ static void rcu_free_slab(struct rcu_hea > > static void free_slab(struct kmem_cache *s, struct page *page) > { > - if (unlikely(s->flags & SLAB_DESTROY_BY_RCU)) { > - struct rcu_head *head; > - > - if (need_reserve_slab_rcu) { > - int order = compound_order(page); > - int offset = (PAGE_SIZE << order) - s->reserved; > - > - VM_BUG_ON(s->reserved != sizeof(*head)); > - head = page_address(page) + offset; > - } else { > - /* > - * RCU free overloads the RCU head over the LRU > - */ > - head = (void *)&page->lru; > - } > - > - call_rcu(head, rcu_free_slab); > - } else > + if (unlikely(s->flags & SLAB_DESTROY_BY_RCU)) > + call_rcu(get_rcu_head(s, page), rcu_free_slab); > + else > __free_slab(s, page); > } > From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752119AbaHTGBY (ORCPT ); Wed, 20 Aug 2014 02:01:24 -0400 Received: from qmta10.emeryville.ca.mail.comcast.net ([76.96.30.17]:42447 "EHLO qmta10.emeryville.ca.mail.comcast.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750984AbaHTGBW (ORCPT ); Wed, 20 Aug 2014 02:01:22 -0400 Date: Wed, 20 Aug 2014 01:01:19 -0500 (CDT) From: Christoph Lameter X-X-Sender: cl@gentwo.org To: "Paul E. 
McKenney" cc: Thomas Gleixner , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML Subject: Re: slub/debugobjects: lockup when freeing memory In-Reply-To: <20140820023121.GS4752@linux.vnet.ibm.com> Message-ID: References: <53A2F406.4010109@oracle.com> <20140619165247.GA4904@linux.vnet.ibm.com> <20140818163757.GA30742@linux.vnet.ibm.com> <20140819035828.GI4752@linux.vnet.ibm.com> <20140820023121.GS4752@linux.vnet.ibm.com> Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 19 Aug 2014, Paul E. McKenney wrote: > > We could also remove the #ifdefs if init_rcu_head and destroy_rcu_head > > are no ops if CONFIG_DEBUG_RCU_HEAD is not defined. > > And indeed they are, good point! It appears to me that both sets of > #ifdefs can go away. Ok then this is a first workable version I think. How do we test this? From: Christoph Lameter Subject: slub: Add init/destroy function calls for rcu_heads In order to do proper debugging for rcu_head use we need some additional structures allocated when an object potentially using a rcu_head is allocated in the slub allocator. This adds the proper calls to init_rcu_head() and destroy_rcu_head(). init_rcu_head() is a bit of an unusual function since: 1. It does not touch the contents of the rcu_head. This is required since the rcu_head is only used during slab_page freeing. Outside of that the same memory location is used for slab page list management. However, the initialization occurs when the slab page is initially allocated. So in the time between init_rcu_head() and destroy_rcu_head() there may be multiple uses of the indicated address as a list_head. 2. It is called without gfp flags and could potentially be called from atomic contexts. Allocations from init_rcu_head() context need to deal with this. 3. init_rcu_head() is called from within the slab allocation functions. 
Since init_rcu_head() calls the allocator again for more allocations it must avoid to use slabs that use rcu freeing. Otherwise endless recursion may occur (We may have to convince lockdep that what we do here is sane). Signed-off-by: Christoph Lameter Index: linux/mm/slub.c =================================================================== --- linux.orig/mm/slub.c +++ linux/mm/slub.c @@ -1308,6 +1308,25 @@ static inline struct page *alloc_slab_pa return page; } +#define need_reserve_slab_rcu \ + (sizeof(((struct page *)NULL)->lru) < sizeof(struct rcu_head)) + +static struct rcu_head *get_rcu_head(struct kmem_cache *s, struct page *page) +{ + if (need_reserve_slab_rcu) { + int order = compound_order(page); + int offset = (PAGE_SIZE << order) - s->reserved; + + VM_BUG_ON(s->reserved != sizeof(struct rcu_head)); + return page_address(page) + offset; + } else { + /* + * RCU free overloads the RCU head over the LRU + */ + return (void *)&page->lru; + } +} + static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node) { struct page *page; @@ -1357,6 +1376,29 @@ static struct page *allocate_slab(struct kmemcheck_mark_unallocated_pages(page, pages); } + if (unlikely(s->flags & SLAB_DESTROY_BY_RCU) && page) + /* + * Initialize various things. However, this init is + * not allowed to modify the contents of the rcu head. + * The allocator typically overloads the rcu head over + * page->lru which is also used to manage lists of + * slab pages. + * + * Allocations are permitted in init_rcu_head(). + * However, the use of the same cache or another + * cache with SLAB_DESTROY_BY_RCU set will cause + * additional recursions. + * + * So in order to be safe the slab caches used + * in init_rcu_head() should be restricted to be of the + * non rcu kind only. + * + * Note also that no GFPFLAG is passed. The function + * may therefore be called from atomic contexts + * and somehow(?) needs to do the right thing. 
+ */ + init_rcu_head(get_rcu_head(s, page)); + if (flags & __GFP_WAIT) local_irq_disable(); if (!page) @@ -1452,13 +1494,11 @@ static void __free_slab(struct kmem_cach memcg_uncharge_slab(s, order); } -#define need_reserve_slab_rcu \ - (sizeof(((struct page *)NULL)->lru) < sizeof(struct rcu_head)) - static void rcu_free_slab(struct rcu_head *h) { struct page *page; + destroy_rcu_head(h); if (need_reserve_slab_rcu) page = virt_to_head_page(h); else @@ -1469,24 +1509,9 @@ static void rcu_free_slab(struct rcu_hea static void free_slab(struct kmem_cache *s, struct page *page) { - if (unlikely(s->flags & SLAB_DESTROY_BY_RCU)) { - struct rcu_head *head; - - if (need_reserve_slab_rcu) { - int order = compound_order(page); - int offset = (PAGE_SIZE << order) - s->reserved; - - VM_BUG_ON(s->reserved != sizeof(*head)); - head = page_address(page) + offset; - } else { - /* - * RCU free overloads the RCU head over the LRU - */ - head = (void *)&page->lru; - } - - call_rcu(head, rcu_free_slab); - } else + if (unlikely(s->flags & SLAB_DESTROY_BY_RCU)) + call_rcu(get_rcu_head(s, page), rcu_free_slab); + else __free_slab(s, page); } From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752880AbaHTMUJ (ORCPT ); Wed, 20 Aug 2014 08:20:09 -0400 Received: from e35.co.us.ibm.com ([32.97.110.153]:57991 "EHLO e35.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752208AbaHTMUG (ORCPT ); Wed, 20 Aug 2014 08:20:06 -0400 Date: Wed, 20 Aug 2014 05:19:59 -0700 From: "Paul E. 
McKenney" To: Christoph Lameter Cc: Thomas Gleixner , Sasha Levin , Pekka Enberg , Matt Mackall , Andrew Morton , Dave Jones , "linux-mm@kvack.org" , LKML Subject: Re: slub/debugobjects: lockup when freeing memory Message-ID: <20140820121959.GT4752@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <20140619165247.GA4904@linux.vnet.ibm.com> <20140818163757.GA30742@linux.vnet.ibm.com> <20140819035828.GI4752@linux.vnet.ibm.com> <20140820023121.GS4752@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 14082012-6688-0000-0000-000004268796 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Aug 20, 2014 at 01:01:19AM -0500, Christoph Lameter wrote: > On Tue, 19 Aug 2014, Paul E. McKenney wrote: > > > > We could also remove the #ifdefs if init_rcu_head and destroy_rcu_head > > > are no ops if CONFIG_DEBUG_RCU_HEAD is not defined. > > > > And indeed they are, good point! It appears to me that both sets of > > #ifdefs can go away. > > Ok then this is a first workable version I think. How do we test this? It looks good to me. Sasha, could you please try this out? This should fix the problem you reported here: https://lkml.org/lkml/2014/6/19/306 Thanx, Paul > From: Christoph Lameter > Subject: slub: Add init/destroy function calls for rcu_heads > > In order to do proper debugging for rcu_head use we need some > additional structures allocated when an object potentially > using a rcu_head is allocated in the slub allocator. > > This adds the proper calls to init_rcu_head() > and destroy_rcu_head(). > > init_rcu_head() is a bit of an unusual function since: > 1. It does not touch the contents of the rcu_head. This is > required since the rcu_head is only used during > slab_page freeing. 
Outside of that the same memory location > is used for slab page list management. However, the > initialization occurs when the slab page is initially allocated. > So in the time between init_rcu_head() and destroy_rcu_head() > there may be multiple uses of the indicated address as a > list_head. > > 2. It is called without gfp flags and could potentially > be called from atomic contexts. Allocations from init_rcu_head() > context need to deal with this. > > 3. init_rcu_head() is called from within the slab allocation > functions. Since init_rcu_head() calls the allocator again > for more allocations it must avoid to use slabs that use > rcu freeing. Otherwise endless recursion may occur > (We may have to convince lockdep that what we do here is sane). > > Signed-off-by: Christoph Lameter > > Index: linux/mm/slub.c > =================================================================== > --- linux.orig/mm/slub.c > +++ linux/mm/slub.c > @@ -1308,6 +1308,25 @@ static inline struct page *alloc_slab_pa > return page; > } > > +#define need_reserve_slab_rcu \ > + (sizeof(((struct page *)NULL)->lru) < sizeof(struct rcu_head)) > + > +static struct rcu_head *get_rcu_head(struct kmem_cache *s, struct page *page) > +{ > + if (need_reserve_slab_rcu) { > + int order = compound_order(page); > + int offset = (PAGE_SIZE << order) - s->reserved; > + > + VM_BUG_ON(s->reserved != sizeof(struct rcu_head)); > + return page_address(page) + offset; > + } else { > + /* > + * RCU free overloads the RCU head over the LRU > + */ > + return (void *)&page->lru; > + } > +} > + > static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node) > { > struct page *page; > @@ -1357,6 +1376,29 @@ static struct page *allocate_slab(struct > kmemcheck_mark_unallocated_pages(page, pages); > } > > + if (unlikely(s->flags & SLAB_DESTROY_BY_RCU) && page) > + /* > + * Initialize various things. However, this init is > + * not allowed to modify the contents of the rcu head. 
> + * The allocator typically overloads the rcu head over > + * page->lru which is also used to manage lists of > + * slab pages. > + * > + * Allocations are permitted in init_rcu_head(). > + * However, the use of the same cache or another > + * cache with SLAB_DESTROY_BY_RCU set will cause > + * additional recursions. > + * > + * So in order to be safe the slab caches used > + * in init_rcu_head() should be restricted to be of the > + * non rcu kind only. > + * > + * Note also that no GFPFLAG is passed. The function > + * may therefore be called from atomic contexts > + * and somehow(?) needs to do the right thing. > + */ > + init_rcu_head(get_rcu_head(s, page)); > + > if (flags & __GFP_WAIT) > local_irq_disable(); > if (!page) > @@ -1452,13 +1494,11 @@ static void __free_slab(struct kmem_cach > memcg_uncharge_slab(s, order); > } > > -#define need_reserve_slab_rcu \ > - (sizeof(((struct page *)NULL)->lru) < sizeof(struct rcu_head)) > - > static void rcu_free_slab(struct rcu_head *h) > { > struct page *page; > > + destroy_rcu_head(h); > if (need_reserve_slab_rcu) > page = virt_to_head_page(h); > else > @@ -1469,24 +1509,9 @@ static void rcu_free_slab(struct rcu_hea > > static void free_slab(struct kmem_cache *s, struct page *page) > { > - if (unlikely(s->flags & SLAB_DESTROY_BY_RCU)) { > - struct rcu_head *head; > - > - if (need_reserve_slab_rcu) { > - int order = compound_order(page); > - int offset = (PAGE_SIZE << order) - s->reserved; > - > - VM_BUG_ON(s->reserved != sizeof(*head)); > - head = page_address(page) + offset; > - } else { > - /* > - * RCU free overloads the RCU head over the LRU > - */ > - head = (void *)&page->lru; > - } > - > - call_rcu(head, rcu_free_slab); > - } else > + if (unlikely(s->flags & SLAB_DESTROY_BY_RCU)) > + call_rcu(get_rcu_head(s, page), rcu_free_slab); > + else > __free_slab(s, page); > } >
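For readers following the get_rcu_head() helper in the patch above, the placement decision can be modelled in userspace roughly as below. This is a sketch under stated assumptions, not the kernel implementation: SKETCH_PAGE_SIZE is an assumed 4096-byte page, the sizes are passed in as plain parameters instead of being taken from struct page and struct rcu_head, and need_reserve_sketch() and rcu_head_offset_sketch() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size, for illustration only */

/*
 * Models need_reserve_slab_rcu: space is reserved at the end of the slab
 * only when the rcu head does not fit into the page->lru field.
 */
static int need_reserve_sketch(size_t lru_size, size_t rcu_head_size)
{
	return lru_size < rcu_head_size;
}

/*
 * Models the reserved case of get_rcu_head(): the head sits in the last
 * 'reserved' bytes of a compound page of the given order.
 */
static size_t rcu_head_offset_sketch(unsigned int order, size_t reserved,
				     size_t rcu_head_size)
{
	/* Mirrors VM_BUG_ON(s->reserved != sizeof(struct rcu_head)). */
	assert(reserved == rcu_head_size);
	return (SKETCH_PAGE_SIZE << order) - reserved;
}
```

With a 16-byte head, an order-0 slab would place it at offset 4080 and an order-1 slab at offset 8176. When the lru field and the rcu head are the same size (two pointers each on a typical 64-bit build, as far as I know), need_reserve_sketch() returns 0 and the head simply overlays page->lru, which is why init_rcu_head() must not write to it.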