From: Eric Dumazet <eric.dumazet@gmail.com>
To: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Zdenek Kabelac <zdenek.kabelac@gmail.com>,
Patrick McHardy <kaber@trash.net>,
Christoph Lameter <cl@linux-foundation.org>,
Robin Holt <holt@sgi.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Jesper Dangaard Brouer <hawk@comx.dk>,
Linux Netdev List <netdev@vger.kernel.org>,
Netfilter Developers <netfilter-devel@vger.kernel.org>,
paulmck@linux.vnet.ibm.com
Subject: Re: [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU
Date: Thu, 03 Sep 2009 09:38:43 +0200
Message-ID: <4A9F7283.1090306@gmail.com>
In-Reply-To: <84144f020909022331x2b275aa5n428f88670e0ae8bc@mail.gmail.com>
Pekka Enberg wrote:
> On Thu, Sep 3, 2009 at 4:04 AM, Eric Dumazet<eric.dumazet@gmail.com> wrote:
>> Zdenek Kabelac wrote:
>>> Well I'm not noticing any ill behavior - also note - rcu_barrier() is
>>> there before the cache is destroyed.
>>> But as I said - it's just my shot into the dark - which seems to work for me...
>>>
>> Reading again your traces, I do believe there are two bugs in slub
>>
>> Maybe not explaining your problem, but worth to fix !
>>
>> Thank you
>>
>> [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU
>>
>> When SLAB_POISON is used and slab_pad_check() finds an overwrite of the
>> slab padding, we call restore_bytes() on the whole slab, not only
>> on the padding.
>>
>> kmem_cache_destroy() should call rcu_barrier() *after* kmem_cache_close()
>> and *before* sysfs_slab_remove() or risk rcu_free_slab()
>> being called after kmem_cache is deleted (kfreed).
>>
>> rmmod nf_conntrack can crash the machine because it has to
>> kmem_cache_destroy() a SLAB_DESTROY_BY_RCU enabled cache.
>>
>> Reported-by: Zdenek Kabelac <zdenek.kabelac@gmail.com>
>> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
>> ---
>> diff --git a/mm/slub.c b/mm/slub.c
>> index b9f1491..0ac839f 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -646,7 +646,7 @@ static int slab_pad_check(struct kmem_cache *s, struct page *page)
>> slab_err(s, page, "Padding overwritten. 0x%p-0x%p", fault, end - 1);
>> print_section("Padding", end - remainder, remainder);
>>
>> - restore_bytes(s, "slab padding", POISON_INUSE, start, end);
>> + restore_bytes(s, "slab padding", POISON_INUSE, end - remainder, end);
>
> OK, makes sense.
>
>> return 0;
>> }
>>
>> @@ -2594,8 +2594,6 @@ static inline int kmem_cache_close(struct kmem_cache *s)
>> */
>> void kmem_cache_destroy(struct kmem_cache *s)
>> {
>> - if (s->flags & SLAB_DESTROY_BY_RCU)
>> - rcu_barrier();
>> down_write(&slub_lock);
>> s->refcount--;
>> if (!s->refcount) {
>> @@ -2606,6 +2604,8 @@ void kmem_cache_destroy(struct kmem_cache *s)
>> "still has objects.\n", s->name, __func__);
>> dump_stack();
>> }
>> + if (s->flags & SLAB_DESTROY_BY_RCU)
>> + rcu_barrier();
>> sysfs_slab_remove(s);
>> } else
>> up_write(&slub_lock);
>
> The rcu_barrier() call was added by this commit:
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=7ed9f7e5db58c6e8c2b4b738a75d5dcd8e17aad5
>
> I guess we should CC Paul as well.
Sure!

rcu_barrier() is definitely better than synchronize_rcu() in
kmem_cache_destroy(), but its location was not quite right (for SLUB at least).

SLAB_DESTROY_BY_RCU means the subsystem will call kfree(elems) without waiting
for an RCU grace period. By the time the subsystem calls kmem_cache_destroy(),
all previously allocated elems must already have been kfree()d by this
subsystem.

However, we must wait until all slabs queued for freeing by rcu_free_slab()
are indeed freed, since this freeing needs access to the kmem_cache pointer.
As kmem_cache_close() might clean/purge the cache and call rcu_free_slab(),
we must call rcu_barrier() *after* kmem_cache_close() and before
kfree(kmem_cache *s).

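The ordering constraint can be illustrated with a small user-space sketch. It is
only a toy model, not kernel code: every toy_* name is hypothetical, the
"RCU" here is a single-threaded callback list, and toy_rcu_barrier() simply
runs whatever is queued. The point it tries to show is that the queued
callback dereferences the cache, so the barrier has to sit between the close
step and the final free.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for struct kmem_cache (hypothetical, user space only). */
struct toy_cache {
	int slabs_freed;
};

/* Toy stand-in for struct rcu_head plus the slab it belongs to. */
struct toy_rcu_head {
	struct toy_rcu_head *next;
	void (*func)(struct toy_rcu_head *head);
	struct toy_cache *cache;
};

static struct toy_rcu_head *toy_pending;

/* Queue a callback, loosely mimicking call_rcu(). */
static void toy_call_rcu(struct toy_rcu_head *head,
			 void (*func)(struct toy_rcu_head *head))
{
	head->func = func;
	head->next = toy_pending;
	toy_pending = head;
}

/* Run every queued callback, loosely mimicking rcu_barrier(). */
static void toy_rcu_barrier(void)
{
	while (toy_pending) {
		struct toy_rcu_head *head = toy_pending;

		toy_pending = head->next;
		head->func(head);
	}
}

/* The callback touches the cache, so the cache must still exist. */
static void toy_rcu_free_slab(struct toy_rcu_head *head)
{
	head->cache->slabs_freed++;
	free(head);
}

/* Purging the cache queues one deferred free per remaining slab. */
static void toy_cache_close(struct toy_cache *s, int nr_slabs)
{
	int i;

	for (i = 0; i < nr_slabs; i++) {
		struct toy_rcu_head *head = malloc(sizeof(*head));

		assert(head);
		head->cache = s;
		toy_call_rcu(head, toy_rcu_free_slab);
	}
}

/* Close first, then barrier, and only then free the cache itself. */
int toy_cache_destroy(int nr_slabs)
{
	struct toy_cache *s = calloc(1, sizeof(*s));
	int freed;

	assert(s);
	toy_cache_close(s, nr_slabs);
	toy_rcu_barrier();
	freed = s->slabs_freed;
	free(s);
	return freed;
}
```

In this model, moving toy_rcu_barrier() above toy_cache_close() would leave the
callbacks queued while free(s) runs, which corresponds to the use-after-free
described above.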
Alternatively, we could delay this final kfree(s) (with call_rcu()), but we
would have to copy s->name in kmem_cache_create() instead of keeping a pointer
to a string that might be in a module and freed at rmmod time.

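A rough user-space sketch of that alternative follows. Again, this is a toy
model under stated assumptions: the dfr_* names are hypothetical, the callback
list stands in for call_rcu(), and a local buffer that gets wiped stands in
for a module string that disappears at rmmod. It shows why the cache would
need its own copy of the name once its final free is deferred.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for struct rcu_head (hypothetical, user space only). */
struct dfr_rcu_head {
	struct dfr_rcu_head *next;
	void (*func)(struct dfr_rcu_head *head);
};

/* Toy cache: rcu is the first member so the callback can cast back. */
struct dfr_cache {
	struct dfr_rcu_head rcu;
	char *name;		/* private copy, owned by the cache */
};

static struct dfr_rcu_head *dfr_pending;
static int dfr_caches_freed;

static void dfr_call_rcu(struct dfr_rcu_head *head,
			 void (*func)(struct dfr_rcu_head *head))
{
	head->func = func;
	head->next = dfr_pending;
	dfr_pending = head;
}

/* Run queued callbacks; stands in for the end of a grace period. */
static void dfr_run_callbacks(void)
{
	while (dfr_pending) {
		struct dfr_rcu_head *head = dfr_pending;

		dfr_pending = head->next;
		head->func(head);
	}
}

/* Deferred final free: releases the name copy and the cache. */
static void dfr_free_cache(struct dfr_rcu_head *head)
{
	struct dfr_cache *s = (struct dfr_cache *)head;

	free(s->name);
	free(s);
	dfr_caches_freed++;
}

/* Copy the name instead of keeping the caller's pointer. */
static struct dfr_cache *dfr_cache_create(const char *name)
{
	struct dfr_cache *s = malloc(sizeof(*s));

	assert(s);
	s->name = malloc(strlen(name) + 1);
	assert(s->name);
	strcpy(s->name, name);
	return s;
}

/* Destroy returns at once; the cache is freed later by the callback. */
static void dfr_cache_destroy(struct dfr_cache *s)
{
	dfr_call_rcu(&s->rcu, dfr_free_cache);
}

/* Returns 1 if the name survived the "rmmod" and the cache was freed once. */
int dfr_demo(void)
{
	char module_string[] = "toy_conntrack";
	struct dfr_cache *s = dfr_cache_create(module_string);
	int name_ok;

	dfr_cache_destroy(s);
	/* Simulate rmmod: the caller's string goes away immediately. */
	memset(module_string, 0, sizeof(module_string));
	/* The cache is still live until the callbacks run. */
	name_ok = strcmp(s->name, "toy_conntrack") == 0;
	dfr_run_callbacks();
	return name_ok && dfr_caches_freed == 1;
}
```

Had dfr_cache_create() stored the caller's pointer instead, the name read after
the simulated rmmod would see wiped memory.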
Given that there are few uses in the current tree that call
kmem_cache_destroy() on a SLAB_DESTROY_BY_RCU cache, there is no need to try
to optimize this rcu_barrier() call, unless we want superfast reboot/halt
sequences...