From: ebiederm@xmission.com (Eric W. Biederman)
To: Christoph Lameter <cl@linux.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
	 Sasha Levin <levinsasha928@gmail.com>,
	 Dave Jones <davej@redhat.com>,  davem <davem@davemloft.net>,
	 Pekka Enberg <penberg@kernel.org>,
	 Matt Mackall <mpm@selenic.com>,
	 kaber@trash.net,  pablo@netfilter.org,
	 linux-kernel <linux-kernel@vger.kernel.org>,
	 linux-mm <linux-mm@kvack.org>,
	 netfilter-devel@vger.kernel.org,
	 netdev <netdev@vger.kernel.org>
Subject: Re: Hung task when calling clone() due to netfilter/slab
Date: Thu, 19 Jan 2012 14:15:01 -0800
Message-ID: <m14nvr2vbu.fsf@fess.ebiederm.org>
In-Reply-To: <m1bopz2ws3.fsf@fess.ebiederm.org> (Eric W. Biederman's message of "Thu, 19 Jan 2012 13:43:40 -0800")

ebiederm@xmission.com (Eric W. Biederman) writes:

> Christoph Lameter <cl@linux.com> writes:
>
>> Another version that drops the slub lock for both invocations of sysfs
>> functions from kmem_cache_create. The invocation from slab_sysfs_init
>> is not a problem since user space is not active at that point.
>>
>>
>> Subject: slub: Do not take the slub lock while calling into sysfs
>>
>> This patch avoids holding the slub_lock during kmem_cache_create()
>> when calling sysfs. It is possible because kmem_cache_create()
>> allocates the kmem_cache object and therefore is the only context
>> that can access the newly created object. It is therefore possible
>> to drop the slub_lock early. We defer the adding of the new kmem_cache
>> to the end of processing because the new kmem_cache structure would
>> be reachable otherwise via scans over slabs. This allows sysfs_slab_add()
>> to run without holding any locks.
>>
>> The case is different if we are creating an alias instead of a new
>> kmem_cache structure. In that case we can also drop the slub lock
>> early because we have taken a refcount on the kmem_cache structure.
>> It therefore cannot vanish from under us.
>> But if the sysfs_slab_alias() call fails we can no longer simply
>> decrement the refcount since the other references may have gone
>> away in the meantime. Call kmem_cache_destroy() to cause the
>> refcount to be decremented and the kmem_cache structure to be
>> freed if all references are gone.
>>
>> Signed-off-by: Christoph Lameter <cl@linux.com>
>
> I am dense.  Is the deadlock here that you are fixing slub calling sysfs
> with the slub_lock held but sysfs then calling kmem_cache_zalloc?
>
> I don't see what sysfs is doing in the creation path that would cause
> a deadlock except for using slab.

Oh.  I see.  The problem is calling kobject_uevent() (which slub calls
from sysfs_slab_add()) with a lock held.  And kobject_uevent() makes a
blocking call to userspace.

No locks held seems to be a good policy on that one.
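
To make that policy concrete, here is a minimal userspace sketch, not the
actual slub code.  Everything in it is a hypothetical stand-in: the
pthread mutex plays the role of slub_lock, notify_userspace() plays the
role of the kobject_uevent() call, and the registry list plays the role
of the list of slab caches.  It only illustrates the ordering described
in the patch: run the possibly blocking call while the new object is
still private, and publish the object under the lock afterwards.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; none of these names come from mm/slub.c. */
struct cache {
	char name[32];
	struct cache *next;
};

static pthread_mutex_t registry_lock = PTHREAD_MUTEX_INITIALIZER;
static struct cache *registry;	/* protected by registry_lock */

/*
 * Stand-in for a call that may block on user space.  It only probes the
 * lock: if the lock is already held, the real call could hang, so this
 * stand-in reports failure instead.
 */
bool notify_userspace(const struct cache *c)
{
	(void)c;
	if (pthread_mutex_trylock(&registry_lock) != 0)
		return false;	/* lock is held: unsafe to block here */
	pthread_mutex_unlock(&registry_lock);
	return true;
}

/* Problem pattern: the blocking call runs with the lock held. */
struct cache *cache_create_holding_lock(const char *name)
{
	struct cache *c = calloc(1, sizeof(*c));

	assert(name != NULL);
	if (!c)
		return NULL;
	strncpy(c->name, name, sizeof(c->name) - 1);

	pthread_mutex_lock(&registry_lock);
	if (!notify_userspace(c)) {
		pthread_mutex_unlock(&registry_lock);
		free(c);
		return NULL;
	}
	c->next = registry;
	registry = c;
	pthread_mutex_unlock(&registry_lock);
	return c;
}

/*
 * Pattern from the patch description: the object is not yet reachable
 * by anyone else, so the blocking call needs no lock.  The object is
 * added to the shared list only at the end.
 */
struct cache *cache_create(const char *name)
{
	struct cache *c = calloc(1, sizeof(*c));

	assert(name != NULL);
	if (!c)
		return NULL;
	strncpy(c->name, name, sizeof(c->name) - 1);

	if (!notify_userspace(c)) {	/* no locks held here */
		free(c);
		return NULL;
	}

	pthread_mutex_lock(&registry_lock);
	c->next = registry;		/* publish last */
	registry = c;
	pthread_mutex_unlock(&registry_lock);
	return c;
}

/* Look a cache up by name under the lock; returns NULL if absent. */
struct cache *cache_lookup(const char *name)
{
	struct cache *c;

	pthread_mutex_lock(&registry_lock);
	for (c = registry; c; c = c->next)
		if (strcmp(c->name, name) == 0)
			break;
	pthread_mutex_unlock(&registry_lock);
	return c;
}
```

In this sketch cache_create_holding_lock() always fails, because the
stand-in notifier sees the lock held, while cache_create() succeeds and
the new entry becomes visible to cache_lookup() only after the notifier
has returned.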

Eric

