Re: System freeze on reboot - general protection fault

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Re: System freeze on reboot - general protection fault
       [not found]                   ` <c4e36d110908140233v59421ba6y82192b858210370d@mail.gmail.com>
@ 2009-08-16  9:16                     ` Eric Dumazet
  2009-08-17 14:03                       ` Patrick McHardy
  0 siblings, 1 reply; 50+ messages in thread
From: Eric Dumazet @ 2009-08-16  9:16 UTC (permalink / raw)
  To: Zdenek Kabelac
  Cc: Christoph Lameter, Robin Holt, Linux Kernel Mailing List,
	Pekka Enberg, Jesper Dangaard Brouer, Linux Netdev List,
	Netfilter Developers, Patrick McHardy

Zdenek Kabelac a écrit :
> 2009/8/13 Zdenek Kabelac <zdenek.kabelac@gmail.com>:
>> 2009/8/13 Christoph Lameter <cl@linux-foundation.org>:
>>> On Thu, 13 Aug 2009, Zdenek Kabelac wrote:
>>>
>>>>> I've added authors of some recent conntrack commits to Cc: - maybe
>>>>> they might know?
>>>> I've tested v2.6.30 - and it's crashing in the same way - so any other
>>>> starting point where slub has the same detection mechanism and
>>>> conntrack module should be working reliable ?
>>> Next point is 2.6.29.
>>>
>> Ok  - played lengthy game between 2.6.29 which appeared to be ok and 2.6.30
>>
>> And the winner is: ea781f197d6a835cbb93a0bf88ee1696296ed8aa
>> netfilter: nf_conntrack: use SLAB_DESTROY_BY_RCU and get rid of call_rcu()
>>
>> The error is actually being hit by  libvirtd networking rules added
>> during boot for my kvm usage.
>> (Which I noticed after some time...  leading my game into wrong
>> direction ;))....
>>
>> Here are the some last bisect entries:
>>
>> git bisect bad 54dc79fe0d895758bdaa1dcf8512d3d21263d105
>> # bad: [5c0de29d06318ec8f6e3ba0d17d62529dbbdc1e8] netfilter:
>> nf_conntrack: add generic function to get len of generic policy
>> git bisect bad 5c0de29d06318ec8f6e3ba0d17d62529dbbdc1e8
>> # good: [e487eb99cf9381a4f8254fa01747a85818da612b] netlink: add nla_policy_len()
>> git bisect good e487eb99cf9381a4f8254fa01747a85818da612b
>> # good: [1f9352ae2253a97b07b34dcf16ffa3b4ca12c558] netfilter:
>> {ip,ip6,arp}_tables: fix incorrect loop detection
>> git bisect good 1f9352ae2253a97b07b34dcf16ffa3b4ca12c558
>> # bad: [2732c4e45bb67006fdc9ae6669be866762711ab5] netfilter:
>> ctnetlink: allocate right-sized ctnetlink skb
>> git bisect bad 2732c4e45bb67006fdc9ae6669be866762711ab5
>>
>>
>> Unfortunately the commit cannot be reverted with current tree - thus I
>> cannot easily check if it's the only problem.
>> (warning: too many files (created: 3096 deleted: 1096), skipping
>> inexact rename detection
>> Automatic revert failed.  After resolving the conflicts,)
> 
> Hmm after checking today with serial cable attached - it looks like
> I've tracked the problem but to the wrong commit - my original 'slub'
> error was now actually something else - so there are most probably two
> kinds of problem - as with this kernel the nf_conntrack_ipv4 fails to
> register tcp so it's not loaded at all.
> This might get fixed later, but different error was there.
> 
> I'll need to play the game again and check when I'll start to get the
> same slub oops/
> 
> Here is the second oops I've got with 2.6.29-rc5 kernel:
> 
> IP: [<ffffffffa02b2c2c>] nf_conntrack_helper_unregister+0x16c/0x320
> [nf_conntrack]
> PGD 13bfb1067 PUD 1384c8067 PMD 0
> Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
> last sysfs file: /sys/module/nf_conntrack_ftp/refcnt
> CPU 0
> Modules linked in: sit tunnel4 nf_defrag_ipv4 bridge stp llc autofs4
> ipv6 nf_conntrack_ftp(-) nf_conntrack binfmt_misc loop dm_mirror
> dm_region_hash dm_log dm_mod kvm_intel kvm i915 drm i2c_algo_bit
> uinput i2c_i801 arc4 ecb cryptomgr aead crypto_blkcipher crypto_hash
> crypto_algapi iwl3945 iwlcore mac80211 video thinkpad_acpi i2c_core
> sr_mod rfkill led_class evdev iTCO_wdt backlight usbhid hid cfg80211
> iTCO_vendor_support e1000e psmouse serio_raw cdrom output rtc_cmos
> rtc_core battery intel_agp nvram rtc_lib button ac uhci_hcd ohci_hcd
> ehci_hcd usbcore [last unloaded: x_tables]
> Pid: 2824, comm: modprobe Not tainted 2.6.29-rc5-00889-gea781f1 #25 6464CTO
> RIP: 0010:[<ffffffffa02b2c2c>]  [<ffffffffa02b2c2c>]
> nf_conntrack_helper_unregister+0x16c/0x320 [nf_conntrack]
> RSP: 0018:ffff88013982fe68  EFLAGS: 00010202
> RAX: 0000000000000200 RBX: 0000000000000001 RCX: ffffffffa02b2b31
> RDX: 00000000000001ff RSI: 8f5c28f5c28f5c29 RDI: 0000000000000001
> RBP: ffff88013982feb8 R08: 0000000000000000 R09: 0000000000000000
> R10: 000000000000006d R11: 0000000000000000 R12: ffffffffa02c6a00
> R13: ffffffffa02c71a0 R14: ffffffff81188e20 R15: ffff88013982fe78
> FS:  00007ffbd4984700(0000) GS:ffffffff8092e040(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000000000000038 CR3: 000000013779b000 CR4: 00000000000026e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process modprobe (pid: 2824, threadinfo ffff88013982e000, task ffff880138710000)
> Stack:
>  ffff88013982fe88 0000020080271bf2 ffffffff806a47c0 0000000000000246
>  ffff88013982fe98 ffffffffa02c6a00 0000000000000000 ffffffffa02c71a0
>  0000000000000000 000000000040f510 ffff88013982fed8 ffffffffa02c502f
> Call Trace:
>  [<ffffffffa02c502f>] nf_conntrack_ftp_fini+0x2f/0x70 [nf_conntrack_ftp]
>  [<ffffffff8027bcc5>] sys_delete_module+0x1a5/0x270
>  [<ffffffff8020d329>] ? retint_swapgs+0xe/0x13
>  [<ffffffff80271bf2>] ? trace_hardirqs_on_caller+0x162/0x1b0
>  [<ffffffff80292121>] ? audit_syscall_entry+0x191/0x1c0
>  [<ffffffff80526dae>] ? trace_hardirqs_on_thunk+0x3a/0x3f
>  [<ffffffff8020c84b>] system_call_fastpath+0x16/0x1b
> Code: c6 00 00 0f 82 66 ff ff ff 49 8b 9e d8 05 00 00 48 85 db 75 16
> e9 8e 00 00 00 0f 1f 44 00 00 48 85 c0 0f 84 80 00 00 00 48 89 c3 <0f>
> b6 4b 37 48 8b 03 48 8d 14 cd 00 00 00 00 0f 18 08 48 29 ca
> RIP  [<ffffffffa02b2c2c>] nf_conntrack_helper_unregister+0x16c/0x320
> [nf_conntrack]
>  RSP <ffff88013982fe68>
> CR2: 0000000000000038
> ---[ end trace bc3a0ede3d0084db ]---
> 
> Zdenek

Hello Zdenek

I am currently traveling and wont be able to help you before next week.

I added netdev, Patrick, and netfilter-devel in CC so that more eyes can take a look.

Thank you

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: System freeze on reboot - general protection fault
  2009-08-16  9:16                     ` System freeze on reboot - general protection fault Eric Dumazet
@ 2009-08-17 14:03                       ` Patrick McHardy
  2009-09-02 21:45                         ` Zdenek Kabelac
  0 siblings, 1 reply; 50+ messages in thread
From: Patrick McHardy @ 2009-08-17 14:03 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Zdenek Kabelac, Christoph Lameter, Robin Holt,
	Linux Kernel Mailing List, Pekka Enberg, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

Eric Dumazet wrote:
> Zdenek Kabelac a écrit :
>>  [<ffffffffa02c502f>] nf_conntrack_ftp_fini+0x2f/0x70 [nf_conntrack_ftp]
>>  [<ffffffff8027bcc5>] sys_delete_module+0x1a5/0x270
>>  [<ffffffff8020d329>] ? retint_swapgs+0xe/0x13
>>  [<ffffffff80271bf2>] ? trace_hardirqs_on_caller+0x162/0x1b0
>>  [<ffffffff80292121>] ? audit_syscall_entry+0x191/0x1c0
>>  [<ffffffff80526dae>] ? trace_hardirqs_on_thunk+0x3a/0x3f
>>  [<ffffffff8020c84b>] system_call_fastpath+0x16/0x1b
>> Code: c6 00 00 0f 82 66 ff ff ff 49 8b 9e d8 05 00 00 48 85 db 75 16
>> e9 8e 00 00 00 0f 1f 44 00 00 48 85 c0 0f 84 80 00 00 00 48 89 c3 <0f>
>> b6 4b 37 48 8b 03 48 8d 14 cd 00 00 00 00 0f 18 08 48 29 ca
>> RIP  [<ffffffffa02b2c2c>] nf_conntrack_helper_unregister+0x16c/0x320
>> [nf_conntrack]
>>  RSP <ffff88013982fe68>
>> CR2: 0000000000000038
>> ---[ end trace bc3a0ede3d0084db ]---
>>
> I am currently traveling and wont be able to help you before next week.
> 
> I added netdev, Patrick, and netfilter-devel in CC so that more eyes can take a look.

Thanks for the report, I'll have a look at this. Zdenek, please
send me the nf_conntrack.ko file used in the above oops. Thanks.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: System freeze on reboot - general protection fault
  2009-08-17 14:03                       ` Patrick McHardy
@ 2009-09-02 21:45                         ` Zdenek Kabelac
  2009-09-02 22:17                           ` Eric Dumazet
  0 siblings, 1 reply; 50+ messages in thread
From: Zdenek Kabelac @ 2009-09-02 21:45 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Eric Dumazet, Christoph Lameter, Robin Holt,
	Linux Kernel Mailing List, Pekka Enberg, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

2009/8/17 Patrick McHardy <kaber@trash.net>:
> Eric Dumazet wrote:
>> Zdenek Kabelac a écrit :
>>>  [<ffffffffa02c502f>] nf_conntrack_ftp_fini+0x2f/0x70 [nf_conntrack_ftp]
>>>  [<ffffffff8027bcc5>] sys_delete_module+0x1a5/0x270
>>>  [<ffffffff8020d329>] ? retint_swapgs+0xe/0x13
>>>  [<ffffffff80271bf2>] ? trace_hardirqs_on_caller+0x162/0x1b0
>>>  [<ffffffff80292121>] ? audit_syscall_entry+0x191/0x1c0
>>>  [<ffffffff80526dae>] ? trace_hardirqs_on_thunk+0x3a/0x3f
>>>  [<ffffffff8020c84b>] system_call_fastpath+0x16/0x1b
>>> Code: c6 00 00 0f 82 66 ff ff ff 49 8b 9e d8 05 00 00 48 85 db 75 16
>>> e9 8e 00 00 00 0f 1f 44 00 00 48 85 c0 0f 84 80 00 00 00 48 89 c3 <0f>
>>> b6 4b 37 48 8b 03 48 8d 14 cd 00 00 00 00 0f 18 08 48 29 ca
>>> RIP  [<ffffffffa02b2c2c>] nf_conntrack_helper_unregister+0x16c/0x320
>>> [nf_conntrack]
>>>  RSP <ffff88013982fe68>
>>> CR2: 0000000000000038
>>> ---[ end trace bc3a0ede3d0084db ]---
>>>
>> I am currently traveling and wont be able to help you before next week.
>>
>> I added netdev, Patrick, and netfilter-devel in CC so that more eyes can take a look.
>
> Thanks for the report, I'll have a look at this. Zdenek, please
> send me the nf_conntrack.ko file used in the above oops. Thanks.
>

Ok

I've found the solution for my problem.

http://thread.gmane.org/gmane.comp.security.firewalls.netfilter.devel/30483

I've made this small fix from this thread:

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core
index b5869b9..68488f8 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1108,6 +1108,7 @@ static void nf_conntrack_cleanup_init_net(void)
 {
        nf_conntrack_helper_fini();
        nf_conntrack_proto_fini();
+       rcu_barrier();
        kmem_cache_destroy(nf_conntrack_cachep);
 }

@@ -1266,7 +1267,7 @@ static int nf_conntrack_init_init_net(void)

        nf_conntrack_cachep = kmem_cache_create("nf_conntrack",
                                                sizeof(struct nf_conn),
-                                               0, SLAB_DESTROY_BY_RCU, NULL);
+                                               0, 0, NULL);
        if (!nf_conntrack_cachep) {
                printk(KERN_ERR "Unable to create nf_conn slab cache\n");
                ret = -ENOMEM;


As the thread nf_conntrack: Use rcu_barrier() and fix kmem_cache_create flags
seems to be samewhat 'unfinished'  and already a bit old and I've no
idea whether it actually fixes problem completely or just hides it in
my case - I'm leaving it to some RCU gurus to fix this issue.

All I could say is - this this extra rcu_barrier() and removal of
SLAB_DESTROY removes my GPF on reboot.

Zdenek

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: System freeze on reboot - general protection fault
  2009-09-02 21:45                         ` Zdenek Kabelac
@ 2009-09-02 22:17                           ` Eric Dumazet
  2009-09-02 22:31                             ` Zdenek Kabelac
  2009-09-03 18:17                             ` System freeze on reboot - general protection fault Paul E. McKenney
  0 siblings, 2 replies; 50+ messages in thread
From: Eric Dumazet @ 2009-09-02 22:17 UTC (permalink / raw)
  To: Zdenek Kabelac
  Cc: Patrick McHardy, Christoph Lameter, Robin Holt,
	Linux Kernel Mailing List, Pekka Enberg, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

Zdenek Kabelac a écrit :
> 2009/8/17 Patrick McHardy <kaber@trash.net>:
>> Eric Dumazet wrote:
>>> Zdenek Kabelac a écrit :
>>>>  [<ffffffffa02c502f>] nf_conntrack_ftp_fini+0x2f/0x70 [nf_conntrack_ftp]
>>>>  [<ffffffff8027bcc5>] sys_delete_module+0x1a5/0x270
>>>>  [<ffffffff8020d329>] ? retint_swapgs+0xe/0x13
>>>>  [<ffffffff80271bf2>] ? trace_hardirqs_on_caller+0x162/0x1b0
>>>>  [<ffffffff80292121>] ? audit_syscall_entry+0x191/0x1c0
>>>>  [<ffffffff80526dae>] ? trace_hardirqs_on_thunk+0x3a/0x3f
>>>>  [<ffffffff8020c84b>] system_call_fastpath+0x16/0x1b
>>>> Code: c6 00 00 0f 82 66 ff ff ff 49 8b 9e d8 05 00 00 48 85 db 75 16
>>>> e9 8e 00 00 00 0f 1f 44 00 00 48 85 c0 0f 84 80 00 00 00 48 89 c3 <0f>
>>>> b6 4b 37 48 8b 03 48 8d 14 cd 00 00 00 00 0f 18 08 48 29 ca
>>>> RIP  [<ffffffffa02b2c2c>] nf_conntrack_helper_unregister+0x16c/0x320
>>>> [nf_conntrack]
>>>>  RSP <ffff88013982fe68>
>>>> CR2: 0000000000000038
>>>> ---[ end trace bc3a0ede3d0084db ]---
>>>>
>>> I am currently traveling and wont be able to help you before next week.
>>>
>>> I added netdev, Patrick, and netfilter-devel in CC so that more eyes can take a look.
>> Thanks for the report, I'll have a look at this. Zdenek, please
>> send me the nf_conntrack.ko file used in the above oops. Thanks.
>>
> 
> Ok
> 
> I've found the solution for my problem.
> 
> http://thread.gmane.org/gmane.comp.security.firewalls.netfilter.devel/30483
> 
> I've made this small fix from this thread:
> 
> diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core
> index b5869b9..68488f8 100644
> --- a/net/netfilter/nf_conntrack_core.c
> +++ b/net/netfilter/nf_conntrack_core.c
> @@ -1108,6 +1108,7 @@ static void nf_conntrack_cleanup_init_net(void)
>  {
>         nf_conntrack_helper_fini();
>         nf_conntrack_proto_fini();
> +       rcu_barrier();
>         kmem_cache_destroy(nf_conntrack_cachep);
>  }
> 
> @@ -1266,7 +1267,7 @@ static int nf_conntrack_init_init_net(void)
> 
>         nf_conntrack_cachep = kmem_cache_create("nf_conntrack",
>                                                 sizeof(struct nf_conn),
> -                                               0, SLAB_DESTROY_BY_RCU, NULL);
> +                                               0, 0, NULL);
>         if (!nf_conntrack_cachep) {
>                 printk(KERN_ERR "Unable to create nf_conn slab cache\n");
>                 ret = -ENOMEM;
> 
> 
> As the thread nf_conntrack: Use rcu_barrier() and fix kmem_cache_create flags
> seems to be samewhat 'unfinished'  and already a bit old and I've no
> idea whether it actually fixes problem completely or just hides it in
> my case - I'm leaving it to some RCU gurus to fix this issue.
> 
> All I could say is - this this extra rcu_barrier() and removal of
> SLAB_DESTROY removes my GPF on reboot.
> 
> Zdenek

Ouch..

Dont think such a patch makes your kernel better, it'll crash too.

You cannot remove SLAB_DESTROY_BY_RCU like this, it's there for very good reasons.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: System freeze on reboot - general protection fault
  2009-09-02 22:17                           ` Eric Dumazet
@ 2009-09-02 22:31                             ` Zdenek Kabelac
  2009-09-03  1:04                               ` [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU Eric Dumazet
  2009-09-03 18:17                             ` System freeze on reboot - general protection fault Paul E. McKenney
  1 sibling, 1 reply; 50+ messages in thread
From: Zdenek Kabelac @ 2009-09-02 22:31 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Patrick McHardy, Christoph Lameter, Robin Holt,
	Linux Kernel Mailing List, Pekka Enberg, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

2009/9/3 Eric Dumazet <eric.dumazet@gmail.com>:
> Zdenek Kabelac a écrit :
>> 2009/8/17 Patrick McHardy <kaber@trash.net>:
>>> Eric Dumazet wrote:
>>>> Zdenek Kabelac a écrit :
>>>>>  [<ffffffffa02c502f>] nf_conntrack_ftp_fini+0x2f/0x70 [nf_conntrack_ftp]
>>>>>  [<ffffffff8027bcc5>] sys_delete_module+0x1a5/0x270
>>>>>  [<ffffffff8020d329>] ? retint_swapgs+0xe/0x13
>>>>>  [<ffffffff80271bf2>] ? trace_hardirqs_on_caller+0x162/0x1b0
>>>>>  [<ffffffff80292121>] ? audit_syscall_entry+0x191/0x1c0
>>>>>  [<ffffffff80526dae>] ? trace_hardirqs_on_thunk+0x3a/0x3f
>>>>>  [<ffffffff8020c84b>] system_call_fastpath+0x16/0x1b
>>>>> Code: c6 00 00 0f 82 66 ff ff ff 49 8b 9e d8 05 00 00 48 85 db 75 16
>>>>> e9 8e 00 00 00 0f 1f 44 00 00 48 85 c0 0f 84 80 00 00 00 48 89 c3 <0f>
>>>>> b6 4b 37 48 8b 03 48 8d 14 cd 00 00 00 00 0f 18 08 48 29 ca
>>>>> RIP  [<ffffffffa02b2c2c>] nf_conntrack_helper_unregister+0x16c/0x320
>>>>> [nf_conntrack]
>>>>>  RSP <ffff88013982fe68>
>>>>> CR2: 0000000000000038
>>>>> ---[ end trace bc3a0ede3d0084db ]---
>>>>>
>>>> I am currently traveling and wont be able to help you before next week.
>>>>
>>>> I added netdev, Patrick, and netfilter-devel in CC so that more eyes can take a look.
>>> Thanks for the report, I'll have a look at this. Zdenek, please
>>> send me the nf_conntrack.ko file used in the above oops. Thanks.
>>>
>>
>> Ok
>>
>> I've found the solution for my problem.
>>
>> http://thread.gmane.org/gmane.comp.security.firewalls.netfilter.devel/30483
>>
>> I've made this small fix from this thread:
>>
>> diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core
>> index b5869b9..68488f8 100644
>> --- a/net/netfilter/nf_conntrack_core.c
>> +++ b/net/netfilter/nf_conntrack_core.c
>> @@ -1108,6 +1108,7 @@ static void nf_conntrack_cleanup_init_net(void)
>>  {
>>         nf_conntrack_helper_fini();
>>         nf_conntrack_proto_fini();
>> +       rcu_barrier();
>>         kmem_cache_destroy(nf_conntrack_cachep);
>>  }
>>
>> @@ -1266,7 +1267,7 @@ static int nf_conntrack_init_init_net(void)
>>
>>         nf_conntrack_cachep = kmem_cache_create("nf_conntrack",
>>                                                 sizeof(struct nf_conn),
>> -                                               0, SLAB_DESTROY_BY_RCU, NULL);
>> +                                               0, 0, NULL);
>>         if (!nf_conntrack_cachep) {
>>                 printk(KERN_ERR "Unable to create nf_conn slab cache\n");
>>                 ret = -ENOMEM;
>>
>>
>> As the thread nf_conntrack: Use rcu_barrier() and fix kmem_cache_create flags
>> seems to be samewhat 'unfinished'  and already a bit old and I've no
>> idea whether it actually fixes problem completely or just hides it in
>> my case - I'm leaving it to some RCU gurus to fix this issue.
>>
>> All I could say is - this this extra rcu_barrier() and removal of
>> SLAB_DESTROY removes my GPF on reboot.
>>
>> Zdenek
>
> Ouch..
>
> Dont think such a patch makes your kernel better, it'll crash too.
>
> You cannot remove SLAB_DESTROY_BY_RCU like this, it's there for very good reasons.
>

Well I'm not noticing any ill behavior - also note - rcu_barrier() is
there before the cache is destroyed.
But as I said - it's just my shot into the dark - which seems to work for me...

Zdenek

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU
  2009-09-02 22:31                             ` Zdenek Kabelac
@ 2009-09-03  1:04                               ` Eric Dumazet
  2009-09-03  6:31                                 ` Pekka Enberg
  2009-09-03 13:28                                 ` Zdenek Kabelac
  0 siblings, 2 replies; 50+ messages in thread
From: Eric Dumazet @ 2009-09-03  1:04 UTC (permalink / raw)
  To: Zdenek Kabelac
  Cc: Patrick McHardy, Christoph Lameter, Robin Holt,
	Linux Kernel Mailing List, Pekka Enberg, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

Zdenek Kabelac a écrit :
> 
> Well I'm not noticing any ill behavior - also note - rcu_barrier() is
> there before the cache is destroyed.
> But as I said - it's just my shot into the dark - which seems to work for me...
> 

Reading again your traces, I do believe there are two bugs in slub

Maybe not explaining your problem, but worth to fix !

Thank you

[PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU

When SLAB_POISON is used and slab_pad_check() finds an overwrite of the
slab padding, we call restore_bytes() on the whole slab, not only
on the padding.

kmem_cache_destroy() should call rcu_barrier() *after* kmem_cache_close()
and *before* sysfs_slab_remove() or risk rcu_free_slab()
being called after kmem_cache is deleted (kfreed).

rmmod nf_conntrack can crash the machine because it has to
kmem_cache_destroy() a SLAB_DESTROY_BY_RCU enabled cache.

Reported-by: Zdenek Kabelac <zdenek.kabelac@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
diff --git a/mm/slub.c b/mm/slub.c
index b9f1491..0ac839f 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -646,7 +646,7 @@ static int slab_pad_check(struct kmem_cache *s, struct page *page)
 	slab_err(s, page, "Padding overwritten. 0x%p-0x%p", fault, end - 1);
 	print_section("Padding", end - remainder, remainder);
 
-	restore_bytes(s, "slab padding", POISON_INUSE, start, end);
+	restore_bytes(s, "slab padding", POISON_INUSE, end - remainder, end);
 	return 0;
 }
 
@@ -2594,8 +2594,6 @@ static inline int kmem_cache_close(struct kmem_cache *s)
  */
 void kmem_cache_destroy(struct kmem_cache *s)
 {
-	if (s->flags & SLAB_DESTROY_BY_RCU)
-		rcu_barrier();
 	down_write(&slub_lock);
 	s->refcount--;
 	if (!s->refcount) {
@@ -2606,6 +2604,8 @@ void kmem_cache_destroy(struct kmem_cache *s)
 				"still has objects.\n", s->name, __func__);
 			dump_stack();
 		}
+		if (s->flags & SLAB_DESTROY_BY_RCU)
+			rcu_barrier();
 		sysfs_slab_remove(s);
 	} else
 		up_write(&slub_lock);

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU
  2009-09-03  1:04                               ` [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU Eric Dumazet
@ 2009-09-03  6:31                                 ` Pekka Enberg
  2009-09-03  7:38                                   ` Eric Dumazet
  2009-09-03 13:42                                   ` Paul E. McKenney
  2009-09-03 13:28                                 ` Zdenek Kabelac
  1 sibling, 2 replies; 50+ messages in thread
From: Pekka Enberg @ 2009-09-03  6:31 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Zdenek Kabelac, Patrick McHardy, Christoph Lameter, Robin Holt,
	Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers, paulmck

On Thu, Sep 3, 2009 at 4:04 AM, Eric Dumazet<eric.dumazet@gmail.com> wrote:
> Zdenek Kabelac a écrit :
>>
>> Well I'm not noticing any ill behavior - also note - rcu_barrier() is
>> there before the cache is destroyed.
>> But as I said - it's just my shot into the dark - which seems to work for me...
>>
>
> Reading again your traces, I do believe there are two bugs in slub
>
> Maybe not explaining your problem, but worth to fix !
>
> Thank you
>
> [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU
>
> When SLAB_POISON is used and slab_pad_check() finds an overwrite of the
> slab padding, we call restore_bytes() on the whole slab, not only
> on the padding.
>
> kmem_cache_destroy() should call rcu_barrier() *after* kmem_cache_close()
> and *before* sysfs_slab_remove() or risk rcu_free_slab()
> being called after kmem_cache is deleted (kfreed).
>
> rmmod nf_conntrack can crash the machine because it has to
> kmem_cache_destroy() a SLAB_DESTROY_BY_RCU enabled cache.
>
> Reported-by: Zdenek Kabelac <zdenek.kabelac@gmail.com>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---
> diff --git a/mm/slub.c b/mm/slub.c
> index b9f1491..0ac839f 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -646,7 +646,7 @@ static int slab_pad_check(struct kmem_cache *s, struct page *page)
>        slab_err(s, page, "Padding overwritten. 0x%p-0x%p", fault, end - 1);
>        print_section("Padding", end - remainder, remainder);
>
> -       restore_bytes(s, "slab padding", POISON_INUSE, start, end);
> +       restore_bytes(s, "slab padding", POISON_INUSE, end - remainder, end);

OK, makes sense.

>        return 0;
>  }
>
> @@ -2594,8 +2594,6 @@ static inline int kmem_cache_close(struct kmem_cache *s)
>  */
>  void kmem_cache_destroy(struct kmem_cache *s)
>  {
> -       if (s->flags & SLAB_DESTROY_BY_RCU)
> -               rcu_barrier();
>        down_write(&slub_lock);
>        s->refcount--;
>        if (!s->refcount) {
> @@ -2606,6 +2604,8 @@ void kmem_cache_destroy(struct kmem_cache *s)
>                                "still has objects.\n", s->name, __func__);
>                        dump_stack();
>                }
> +               if (s->flags & SLAB_DESTROY_BY_RCU)
> +                       rcu_barrier();
>                sysfs_slab_remove(s);
>        } else
>                up_write(&slub_lock);

The rcu_barrier() call was added by this commit:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=7ed9f7e5db58c6e8c2b4b738a75d5dcd8e17aad5

I guess we should CC Paul as well.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU
  2009-09-03  6:31                                 ` Pekka Enberg
@ 2009-09-03  7:38                                   ` Eric Dumazet
  2009-09-03  7:51                                     ` Pekka Enberg
  2009-09-03 17:45                                     ` [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU Christoph Lameter
  2009-09-03 13:42                                   ` Paul E. McKenney
  1 sibling, 2 replies; 50+ messages in thread
From: Eric Dumazet @ 2009-09-03  7:38 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Zdenek Kabelac, Patrick McHardy, Christoph Lameter, Robin Holt,
	Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers, paulmck

Pekka Enberg a écrit :
> On Thu, Sep 3, 2009 at 4:04 AM, Eric Dumazet<eric.dumazet@gmail.com> wrote:
>> Zdenek Kabelac a écrit :
>>> Well I'm not noticing any ill behavior - also note - rcu_barrier() is
>>> there before the cache is destroyed.
>>> But as I said - it's just my shot into the dark - which seems to work for me...
>>>
>> Reading again your traces, I do believe there are two bugs in slub
>>
>> Maybe not explaining your problem, but worth to fix !
>>
>> Thank you
>>
>> [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU
>>
>> When SLAB_POISON is used and slab_pad_check() finds an overwrite of the
>> slab padding, we call restore_bytes() on the whole slab, not only
>> on the padding.
>>
>> kmem_cache_destroy() should call rcu_barrier() *after* kmem_cache_close()
>> and *before* sysfs_slab_remove() or risk rcu_free_slab()
>> being called after kmem_cache is deleted (kfreed).
>>
>> rmmod nf_conntrack can crash the machine because it has to
>> kmem_cache_destroy() a SLAB_DESTROY_BY_RCU enabled cache.
>>
>> Reported-by: Zdenek Kabelac <zdenek.kabelac@gmail.com>
>> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
>> ---
>> diff --git a/mm/slub.c b/mm/slub.c
>> index b9f1491..0ac839f 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -646,7 +646,7 @@ static int slab_pad_check(struct kmem_cache *s, struct page *page)
>>        slab_err(s, page, "Padding overwritten. 0x%p-0x%p", fault, end - 1);
>>        print_section("Padding", end - remainder, remainder);
>>
>> -       restore_bytes(s, "slab padding", POISON_INUSE, start, end);
>> +       restore_bytes(s, "slab padding", POISON_INUSE, end - remainder, end);
> 
> OK, makes sense.
> 
>>        return 0;
>>  }
>>
>> @@ -2594,8 +2594,6 @@ static inline int kmem_cache_close(struct kmem_cache *s)
>>  */
>>  void kmem_cache_destroy(struct kmem_cache *s)
>>  {
>> -       if (s->flags & SLAB_DESTROY_BY_RCU)
>> -               rcu_barrier();
>>        down_write(&slub_lock);
>>        s->refcount--;
>>        if (!s->refcount) {
>> @@ -2606,6 +2604,8 @@ void kmem_cache_destroy(struct kmem_cache *s)
>>                                "still has objects.\n", s->name, __func__);
>>                        dump_stack();
>>                }
>> +               if (s->flags & SLAB_DESTROY_BY_RCU)
>> +                       rcu_barrier();
>>                sysfs_slab_remove(s);
>>        } else
>>                up_write(&slub_lock);
> 
> The rcu_barrier() call was added by this commit:
> 
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=7ed9f7e5db58c6e8c2b4b738a75d5dcd8e17aad5
> 
> I guess we should CC Paul as well.

Sure !

rcu_barrier() is definitly better than synchronize_rcu() in 
kmem_cache_destroy()

But its location was not really right (for SLUB at least)

SLAB_DESTROY_BY_RCU means subsystem will call kfree(elems) without waiting RCU
grace period.

By the time subsystem calls kmem_cache_destroy(), all previously allocated
elems must have already be kfreed() by this subsystem.

We must however wait that all slabs, queued for freeing by rcu_free_slab(),
are indeed freed, since this freeing needs access to kmem_cache pointer.

As kmem_cache_close() might clean/purge the cache and call rcu_free_slab(),
we must call rcu_barrier() *after* kmem_cache_close(), and before kfree(kmem_cache *s)

Alternatively we could delay this final kfree(s) (with call_rcu()) but would
have to copy s->name in kmem_cache_create() instead of keeping a pointer to
 a string that might be in a module, and freed at rmmod time.

Given that there is few uses in current tree that call kmem_cache_destroy()
on a SLAB_DESTROY_BY_RCU cache, there is no need to try to optimize this
rcu_barrier() call, unless we want superfast reboot/halt sequences...

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU
  2009-09-03  7:38                                   ` Eric Dumazet
@ 2009-09-03  7:51                                     ` Pekka Enberg
  2009-09-03 17:50                                       ` Christoph Lameter
  2009-09-03 17:45                                     ` [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU Christoph Lameter
  1 sibling, 1 reply; 50+ messages in thread
From: Pekka Enberg @ 2009-09-03  7:51 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Zdenek Kabelac, Patrick McHardy, Christoph Lameter, Robin Holt,
	Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers, paulmck

Hi Eric,

On Thu, Sep 3, 2009 at 10:38 AM, Eric Dumazet<eric.dumazet@gmail.com> wrote:
>> The rcu_barrier() call was added by this commit:
>>
>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=7ed9f7e5db58c6e8c2b4b738a75d5dcd8e17aad5
>>
>> I guess we should CC Paul as well.
>
> Sure !
>
> rcu_barrier() is definitly better than synchronize_rcu() in
> kmem_cache_destroy()
>
> But its location was not really right (for SLUB at least)
>
> SLAB_DESTROY_BY_RCU means subsystem will call kfree(elems) without waiting RCU
> grace period.
>
> By the time subsystem calls kmem_cache_destroy(), all previously allocated
> elems must have already be kfreed() by this subsystem.
>
> We must however wait that all slabs, queued for freeing by rcu_free_slab(),
> are indeed freed, since this freeing needs access to kmem_cache pointer.
>
> As kmem_cache_close() might clean/purge the cache and call rcu_free_slab(),
> we must call rcu_barrier() *after* kmem_cache_close(), and before kfree(kmem_cache *s)
>
> Alternatively we could delay this final kfree(s) (with call_rcu()) but would
> have to copy s->name in kmem_cache_create() instead of keeping a pointer to
>  a string that might be in a module, and freed at rmmod time.
>
> Given that there is few uses in current tree that call kmem_cache_destroy()
> on a SLAB_DESTROY_BY_RCU cache, there is no need to try to optimize this
> rcu_barrier() call, unless we want superfast reboot/halt sequences...

Oh, sure, the fix looks sane to me. It's just that I am a complete
coward when it comes to merging RCU related patches so I always try to
fish an Acked-by from Paul or Christoph ;).

                        Pekka

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU
  2009-09-03  1:04                               ` [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU Eric Dumazet
  2009-09-03  6:31                                 ` Pekka Enberg
@ 2009-09-03 13:28                                 ` Zdenek Kabelac
  2009-09-03 13:46                                   ` Eric Dumazet
  1 sibling, 1 reply; 50+ messages in thread
From: Zdenek Kabelac @ 2009-09-03 13:28 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Patrick McHardy, Christoph Lameter, Robin Holt,
	Linux Kernel Mailing List, Pekka Enberg, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

2009/9/3 Eric Dumazet <eric.dumazet@gmail.com>:
> Zdenek Kabelac a écrit :
>>
>> Well I'm not noticing any ill behavior - also note - rcu_barrier() is
>> there before the cache is destroyed.
>> But as I said - it's just my shot into the dark - which seems to work for me...
>>
>
> Reading again your traces, I do believe there are two bugs in slub
>
> Maybe not explaining your problem, but worth to fix !
>
> Thank you
>
> [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU
>


Ok - I can confirm that this patch fixes my reboot problem.

Thanks

Zdenek

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU
  2009-09-03  6:31                                 ` Pekka Enberg
  2009-09-03  7:38                                   ` Eric Dumazet
@ 2009-09-03 13:42                                   ` Paul E. McKenney
  1 sibling, 0 replies; 50+ messages in thread
From: Paul E. McKenney @ 2009-09-03 13:42 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Eric Dumazet, Zdenek Kabelac, Patrick McHardy, Christoph Lameter,
	Robin Holt, Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

On Thu, Sep 03, 2009 at 09:31:01AM +0300, Pekka Enberg wrote:
> On Thu, Sep 3, 2009 at 4:04 AM, Eric Dumazet<eric.dumazet@gmail.com> wrote:
> > Zdenek Kabelac a écrit :
> >>
> >> Well I'm not noticing any ill behavior - also note - rcu_barrier() is
> >> there before the cache is destroyed.
> >> But as I said - it's just my shot into the dark - which seems to work for me...
> >>
> >
> > Reading again your traces, I do believe there are two bugs in slub
> >
> > Maybe not explaining your problem, but worth to fix !
> >
> > Thank you
> >
> > [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU
> >
> > When SLAB_POISON is used and slab_pad_check() finds an overwrite of the
> > slab padding, we call restore_bytes() on the whole slab, not only
> > on the padding.
> >
> > kmem_cache_destroy() should call rcu_barrier() *after* kmem_cache_close()
> > and *before* sysfs_slab_remove() or risk rcu_free_slab()
> > being called after kmem_cache is deleted (kfreed).
> >
> > rmmod nf_conntrack can crash the machine because it has to
> > kmem_cache_destroy() a SLAB_DESTROY_BY_RCU enabled cache.
> >
> > Reported-by: Zdenek Kabelac <zdenek.kabelac@gmail.com>
> > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> > ---
> > diff --git a/mm/slub.c b/mm/slub.c
> > index b9f1491..0ac839f 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -646,7 +646,7 @@ static int slab_pad_check(struct kmem_cache *s, struct page *page)
> >        slab_err(s, page, "Padding overwritten. 0x%p-0x%p", fault, end - 1);
> >        print_section("Padding", end - remainder, remainder);
> >
> > -       restore_bytes(s, "slab padding", POISON_INUSE, start, end);
> > +       restore_bytes(s, "slab padding", POISON_INUSE, end - remainder, end);
> 
> OK, makes sense.
> 
> >        return 0;
> >  }
> >
> > @@ -2594,8 +2594,6 @@ static inline int kmem_cache_close(struct kmem_cache *s)
> >  */
> >  void kmem_cache_destroy(struct kmem_cache *s)
> >  {
> > -       if (s->flags & SLAB_DESTROY_BY_RCU)
> > -               rcu_barrier();
> >        down_write(&slub_lock);
> >        s->refcount--;
> >        if (!s->refcount) {
> > @@ -2606,6 +2604,8 @@ void kmem_cache_destroy(struct kmem_cache *s)
> >                                "still has objects.\n", s->name, __func__);
> >                        dump_stack();
> >                }
> > +               if (s->flags & SLAB_DESTROY_BY_RCU)
> > +                       rcu_barrier();
> >                sysfs_slab_remove(s);
> >        } else
> >                up_write(&slub_lock);
> 
> The rcu_barrier() call was added by this commit:
> 
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=7ed9f7e5db58c6e8c2b4b738a75d5dcd8e17aad5
> 
> I guess we should CC Paul as well.

I agree that moving the rcu_barrier() as you have done is the right
thing to do -- no point in doing the rcu_barrier() unless you actually
are destroying the kmem_cache!

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU
  2009-09-03 13:28                                 ` Zdenek Kabelac
@ 2009-09-03 13:46                                   ` Eric Dumazet
  2009-09-03 14:35                                     ` Zdenek Kabelac
  0 siblings, 1 reply; 50+ messages in thread
From: Eric Dumazet @ 2009-09-03 13:46 UTC (permalink / raw)
  To: Zdenek Kabelac
  Cc: Patrick McHardy, Christoph Lameter, Robin Holt,
	Linux Kernel Mailing List, Pekka Enberg, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

Zdenek Kabelac a écrit :
> 2009/9/3 Eric Dumazet <eric.dumazet@gmail.com>:
>> Zdenek Kabelac a écrit :
>>> Well I'm not noticing any ill behavior - also note - rcu_barrier() is
>>> there before the cache is destroyed.
>>> But as I said - it's just my shot into the dark - which seems to work for me...
>>>
>> Reading again your traces, I do believe there are two bugs in slub
>>
>> Maybe not explaining your problem, but worth to fix !
>>
>> Thank you
>>
>> [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU
>>
> 
> 
> Ok - I can confirm that this patch fixes my reboot problem.
> 
> Thanks
> 
> Zdenek

Good news !

Hmm... but you had a problem with a 2.6.29.something kernel if I remember well ?

At that time, conntrack did not use SLAB_DESTROY_BY_RCU yet.

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU
  2009-09-03 17:50                                       ` Christoph Lameter
@ 2009-09-03 14:05                                         ` Pekka Enberg
  2009-09-03 14:18                                           ` [PATCH] slub: Fix kmem_cache_destroy() with SLAB_DESTROY_BY_RCU Eric Dumazet
  0 siblings, 1 reply; 50+ messages in thread
From: Pekka Enberg @ 2009-09-03 14:05 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Eric Dumazet, Zdenek Kabelac, Patrick McHardy, Robin Holt,
	Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers, paulmck

Hi Christoph,

On Thu, 3 Sep 2009, Pekka Enberg wrote:
>> Oh, sure, the fix looks sane to me. It's just that I am a complete
>> coward when it comes to merging RCU related patches so I always try to
>> fish an Acked-by from Paul or Christoph ;).

On Thu, Sep 3, 2009 at 8:50 PM, Christoph
Lameter<cl@linux-foundation.org> wrote:
> I am fine with acking the poison piece.

Ok.

On Thu, Sep 3, 2009 at 8:50 PM, Christoph
Lameter<cl@linux-foundation.org> wrote:
> I did not ack the patch that added rcu to kmem_cache_destroy() and I
> likely wont ack that piece either.

Right. I didn't remember that I merged that over your NAK but I don't
share your view that kmem_cache_destroy() callers should do
rcu_barrier() or whatever is necessary if we can do it from
kmem_cache_destroy(). So I am happy to take both changes but I would
appreciate them being split into two separate patches.

                        Pekka

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH] slub: fix slab_pad_check()
  2009-09-03 17:45                                     ` [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU Christoph Lameter
@ 2009-09-03 14:08                                       ` Eric Dumazet
  2009-09-03 18:38                                         ` Christoph Lameter
  2009-09-03 19:34                                         ` Pekka Enberg
  2009-09-03 15:00                                       ` [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU Paul E. McKenney
  1 sibling, 2 replies; 50+ messages in thread
From: Eric Dumazet @ 2009-09-03 14:08 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Pekka Enberg, Zdenek Kabelac, Patrick McHardy, Robin Holt,
	Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers, paulmck

Christoph Lameter a écrit :
> On Thu, 3 Sep 2009, Eric Dumazet wrote:
> 
>> on a SLAB_DESTROY_BY_RCU cache, there is no need to try to optimize this
>> rcu_barrier() call, unless we want superfast reboot/halt sequences...
> 
> I stilll think that the action to quiesce rcu is something that the caller
> of kmem_cache_destroy must take care of.

Do you mean :

if (kmem_cache_shrink(s) == 0) {
	rcu_barrier();
	kmem_cache_destroy_no_rcu_barrier(s);
} else {
	kmem_cache_destroy_with_rcu_barrier_because_SLAB_DESTROY_BY_RCU_cache(s);
}

What would be the point ?



> 
> Could you split this into two patches: One that addresses the poison and
> another that deals with rcu?
> 

Sure, here is the poison thing

[PATCH] slub: fix slab_pad_check()

When SLAB_POISON is used and slab_pad_check() finds an overwrite of the
slab padding, we call restore_bytes() on the whole slab, not only
on the padding.

Reported-by: Zdenek Kabelac <zdenek.kabelac@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
diff --git a/mm/slub.c b/mm/slub.c
index b9f1491..0ac839f 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -646,7 +646,7 @@ static int slab_pad_check(struct kmem_cache *s, struct page *page)
 	slab_err(s, page, "Padding overwritten. 0x%p-0x%p", fault, end - 1);
 	print_section("Padding", end - remainder, remainder);
 
-	restore_bytes(s, "slab padding", POISON_INUSE, start, end);
+	restore_bytes(s, "slab padding", POISON_INUSE, end - remainder, end);
 	return 0;
 }
 

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH] slub: Fix kmem_cache_destroy() with SLAB_DESTROY_BY_RCU
  2009-09-03 14:05                                         ` Pekka Enberg
@ 2009-09-03 14:18                                           ` Eric Dumazet
  2009-09-03 19:48                                             ` Pekka Enberg
  0 siblings, 1 reply; 50+ messages in thread
From: Eric Dumazet @ 2009-09-03 14:18 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Christoph Lameter, Zdenek Kabelac, Patrick McHardy, Robin Holt,
	Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers, paulmck,
	stable@kernel.org

Pekka Enberg a écrit :
> Hi Christoph,
> 
> On Thu, 3 Sep 2009, Pekka Enberg wrote:
>>> Oh, sure, the fix looks sane to me. It's just that I am a complete
>>> coward when it comes to merging RCU related patches so I always try to
>>> fish an Acked-by from Paul or Christoph ;).
> 
> On Thu, Sep 3, 2009 at 8:50 PM, Christoph
> Lameter<cl@linux-foundation.org> wrote:
>> I am fine with acking the poison piece.
> 
> Ok.
> 
> On Thu, Sep 3, 2009 at 8:50 PM, Christoph
> Lameter<cl@linux-foundation.org> wrote:
>> I did not ack the patch that added rcu to kmem_cache_destroy() and I
>> likely wont ack that piece either.
> 
> Right. I didn't remember that I merged that over your NAK but I don't
> share your view that kmem_cache_destroy() callers should do
> rcu_barrier() or whatever is necessary if we can do it from
> kmem_cache_destroy(). So I am happy to take both changes but I would
> appreciate them being split into two separate patches.
> 

Here is the second patch (RCU thing). Stable candidate

[PATCH] slub: Fix kmem_cache_destroy() with SLAB_DESTROY_BY_RCU

kmem_cache_destroy() should call rcu_barrier() *after* kmem_cache_close()
and *before* sysfs_slab_remove() or risk rcu_free_slab()
being called after kmem_cache is deleted (kfreed).

rmmod nf_conntrack can crash the machine because it has to
kmem_cache_destroy() a SLAB_DESTROY_BY_RCU enabled cache.

Reported-by: Zdenek Kabelac <zdenek.kabelac@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---

diff --git a/mm/slub.c b/mm/slub.c
index b9f1491..b627675 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2594,8 +2594,6 @@ static inline int kmem_cache_close(struct kmem_cache *s)
  */
 void kmem_cache_destroy(struct kmem_cache *s)
 {
-	if (s->flags & SLAB_DESTROY_BY_RCU)
-		rcu_barrier();
 	down_write(&slub_lock);
 	s->refcount--;
 	if (!s->refcount) {
@@ -2606,6 +2604,8 @@ void kmem_cache_destroy(struct kmem_cache *s)
 				"still has objects.\n", s->name, __func__);
 			dump_stack();
 		}
+		if (s->flags & SLAB_DESTROY_BY_RCU)
+			rcu_barrier();
 		sysfs_slab_remove(s);
 	} else
 		up_write(&slub_lock);

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU
  2009-09-03 13:46                                   ` Eric Dumazet
@ 2009-09-03 14:35                                     ` Zdenek Kabelac
  0 siblings, 0 replies; 50+ messages in thread
From: Zdenek Kabelac @ 2009-09-03 14:35 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Patrick McHardy, Christoph Lameter, Robin Holt,
	Linux Kernel Mailing List, Pekka Enberg, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

2009/9/3 Eric Dumazet <eric.dumazet@gmail.com>:
> Zdenek Kabelac a écrit :
>> 2009/9/3 Eric Dumazet <eric.dumazet@gmail.com>:
>>> Zdenek Kabelac a écrit :
>>>> Well I'm not noticing any ill behavior - also note - rcu_barrier() is
>>>> there before the cache is destroyed.
>>>> But as I said - it's just my shot into the dark - which seems to work for me...
>>>>
>>> Reading again your traces, I do believe there are two bugs in slub
>>>
>>> Maybe not explaining your problem, but worth to fix !
>>>
>>> Thank you
>>>
>>> [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU
>>>
>>
>>
>> Ok - I can confirm that this patch fixes my reboot problem.
>>
>> Thanks
>>
>> Zdenek
>
> Good news !
>
> Hmm... but you had a problem with a 2.6.29.something kernel if I remember well ?
>
> At that time, conntrack did not use SLAB_DESTROY_BY_RCU yet.

Yes - but it was also giving different error message - which even not
loaded ftp conntrack module. Basically it has been fine until
conntrack started to use rcu. From that moment various errors started
to appear with slub and some kernel debugging options - slab was fine.
With your recent patch it's the first time I do not see oops with nf
after 2.6.29 kernel.

Zdenek
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU
  2009-09-03 17:45                                     ` [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU Christoph Lameter
  2009-09-03 14:08                                       ` [PATCH] slub: fix slab_pad_check() Eric Dumazet
@ 2009-09-03 15:00                                       ` Paul E. McKenney
  1 sibling, 0 replies; 50+ messages in thread
From: Paul E. McKenney @ 2009-09-03 15:00 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Eric Dumazet, Pekka Enberg, Zdenek Kabelac, Patrick McHardy,
	Robin Holt, Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

On Thu, Sep 03, 2009 at 12:45:43PM -0500, Christoph Lameter wrote:
> On Thu, 3 Sep 2009, Eric Dumazet wrote:
> 
> > on a SLAB_DESTROY_BY_RCU cache, there is no need to try to optimize this
> > rcu_barrier() call, unless we want superfast reboot/halt sequences...
> 
> I stilll think that the action to quiesce rcu is something that the caller
> of kmem_cache_destroy must take care of.

Why?

							Thanx, Paul

> Could you split this into two patches: One that addresses the poison and
> another that deals with rcu?
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-03 18:38                                         ` Christoph Lameter
@ 2009-09-03 15:01                                           ` Paul E. McKenney
  2009-09-03 15:02                                           ` Eric Dumazet
  1 sibling, 0 replies; 50+ messages in thread
From: Paul E. McKenney @ 2009-09-03 15:01 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Eric Dumazet, Pekka Enberg, Zdenek Kabelac, Patrick McHardy,
	Robin Holt, Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

On Thu, Sep 03, 2009 at 01:38:50PM -0500, Christoph Lameter wrote:
> On Thu, 3 Sep 2009, Eric Dumazet wrote:
> 
> > Christoph Lameter a ?crit :
> > > On Thu, 3 Sep 2009, Eric Dumazet wrote:
> > >
> > >> on a SLAB_DESTROY_BY_RCU cache, there is no need to try to optimize this
> > >> rcu_barrier() call, unless we want superfast reboot/halt sequences...
> > >
> > > I stilll think that the action to quiesce rcu is something that the caller
> > > of kmem_cache_destroy must take care of.
> >
> > Do you mean :
> >
> > if (kmem_cache_shrink(s) == 0) {
> > 	rcu_barrier();
> > 	kmem_cache_destroy_no_rcu_barrier(s);
> > } else {
> > 	kmem_cache_destroy_with_rcu_barrier_because_SLAB_DESTROY_BY_RCU_cache(s);
> > }
> >
> > What would be the point ?
> 
> The above is port of slub?
> 
> I mean that (in this case) the net subsystem would have to deal with RCU quietness
> before disposing of the slab cache. There may be multiple ways of dealing
> with RCU. The RCU barrier may be unnecessary for future uses. Typically
> one would expect that all deferred handling of structures must be complete
> for correctness before disposing of the whole cache.

Which is precisely the point of the rcu_barrier(), right?

							Thanx, Paul

> > [PATCH] slub: fix slab_pad_check()
> 
> Acked-by: Christoph Lameter <cl@linux-foundation.org>
> --
> To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-03 18:38                                         ` Christoph Lameter
  2009-09-03 15:01                                           ` Paul E. McKenney
@ 2009-09-03 15:02                                           ` Eric Dumazet
  2009-09-03 19:24                                             ` Christoph Lameter
  1 sibling, 1 reply; 50+ messages in thread
From: Eric Dumazet @ 2009-09-03 15:02 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Pekka Enberg, Zdenek Kabelac, Patrick McHardy, Robin Holt,
	Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers, paulmck

Christoph Lameter a écrit :
> On Thu, 3 Sep 2009, Eric Dumazet wrote:
> 
>> Christoph Lameter a ?crit :
>>> On Thu, 3 Sep 2009, Eric Dumazet wrote:
>>>
>>>> on a SLAB_DESTROY_BY_RCU cache, there is no need to try to optimize this
>>>> rcu_barrier() call, unless we want superfast reboot/halt sequences...
>>> I stilll think that the action to quiesce rcu is something that the caller
>>> of kmem_cache_destroy must take care of.
>> Do you mean :
>>
>> if (kmem_cache_shrink(s) == 0) {
>> 	rcu_barrier();
>> 	kmem_cache_destroy_no_rcu_barrier(s);
>> } else {
>> 	kmem_cache_destroy_with_rcu_barrier_because_SLAB_DESTROY_BY_RCU_cache(s);
>> }
>>
>> What would be the point ?
> 
> The above is port of slub?

No, I am trying to code what you suggest, and I could not find a clean way with
current API (SLAB/SLUB/SLOB)

> 
> I mean that (in this case) the net subsystem would have to deal with RCU quietness
> before disposing of the slab cache. There may be multiple ways of dealing
> with RCU. The RCU barrier may be unnecessary for future uses. Typically
> one would expect that all deferred handling of structures must be complete
> for correctness before disposing of the whole cache.

Point is we cannot deal with RCU quietness before disposing the slab cache,
(if SLAB_DESTROY_BY_RCU was set on the cache) since this disposing *will* 
make call_rcu() calls when a full slab is freed/purged.
And when RCU grace period is elapsed, the callback *will* need access to 
the cache we want to dismantle. Better to not have kfreed()/poisoned it...

I believe you mix two RCU uses here.

1) The one we all know, is use normal caches (!SLAB_DESTROY_BY_RCU) 
(or kmalloc()), and use call_rcu(... kfree_something)

   In this case, you are 100% right that the subsystem itself has
   to call rcu_barrier() (or respect whatever self-synchro) itself,
   before calling kmem_cache_destroy()

2) The SLAB_DESTROY_BY_RCU one.

   Part of cache dismantle needs to call rcu_barrier() itself.
   Caller doesnt have to use rcu_barrier(). It would be a waste of time,
   as kmem_cache_destroy() will refill rcu wait queues with its own stuff.






^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-03 19:24                                             ` Christoph Lameter
@ 2009-09-03 17:44                                               ` Paul E. McKenney
  2009-09-03 22:43                                                 ` Christoph Lameter
  2009-09-03 17:59                                               ` Eric Dumazet
  1 sibling, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2009-09-03 17:44 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Eric Dumazet, Pekka Enberg, Zdenek Kabelac, Patrick McHardy,
	Robin Holt, Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

On Thu, Sep 03, 2009 at 02:24:17PM -0500, Christoph Lameter wrote:
> On Thu, 3 Sep 2009, Eric Dumazet wrote:
> 
> > Point is we cannot deal with RCU quietness before disposing the slab cache,
> > (if SLAB_DESTROY_BY_RCU was set on the cache) since this disposing *will*
> > make call_rcu() calls when a full slab is freed/purged.
> 
> There is no need to do call_rcu calls for frees at that point since
> objects are no longer in use. We could simply disable SLAB_DESTROY_BY_RCU
> for the final clearing of caches.

Suppose we have the following sequence of events:

1.	CPU 0 is running a task that is using the slab cache.

	This CPU does kmem_cache_free(), which happens to free up
	some memory to the system.  Because SLAB_DESTROY_BY_RCU is
	set, an RCU callback is posted to do the actual freeing.

	Please note that this RCU callback is internal to the slab,
	so that the slab user cannot be aware of it.  In fact, the
	slab user isn't doing any call_rcu()s whatever.

2.	CPU 0 discovers that the slab cache can now be destroyed.

	It determines that there are no users, and has guaranteed
	that there will be no future users.  So it knows that it
	can safely do kmem_cache_destroy().

3.	In absence of rcu_barrier(), kmem_cache_destroy() would
	immediately tear down the slab data structures.

4.	At the end of the next grace period, the RCU callback posted
	(again, internally by the slab cache) is invoked.  It has a
	coronary due to the slab data structures having already been
	freed, and (worse yet) possibly reallocated for other uses.

Hence the need for the rcu_barrier() when tearing down SLAB_DESTROY_BY_RCU
slab caches.

> > And when RCU grace period is elapsed, the callback *will* need access to
> > the cache we want to dismantle. Better to not have kfreed()/poisoned it...
> 
> But going through the RCU period is pointless since no user of the cache
> remains.

Which is irrelevant.  The outstanding RCU callback was posted by the
slab cache itself, -not- by the user of the slab cache.

> > I believe you mix two RCU uses here.
> >
> > 1) The one we all know, is use normal caches (!SLAB_DESTROY_BY_RCU)
> > (or kmalloc()), and use call_rcu(... kfree_something)
> >
> >    In this case, you are 100% right that the subsystem itself has
> >    to call rcu_barrier() (or respect whatever self-synchro) itself,
> >    before calling kmem_cache_destroy()
> >
> > 2) The SLAB_DESTROY_BY_RCU one.
> >
> >    Part of cache dismantle needs to call rcu_barrier() itself.
> >    Caller doesnt have to use rcu_barrier(). It would be a waste of time,
> >    as kmem_cache_destroy() will refill rcu wait queues with its own stuff.
> 
> The dismantling does not need RCU since there are no operations on the
> objects in progress. So simply switch DESTROY_BY_RCU off for close.

Unless I am missing something, this patch re-introduces the bug that
the rcu_barrier() was added to prevent.  So, in absence of a better
explanation of what I am missing:

NACK.

							Thanx, Paul

> ---
>  mm/slub.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> Index: linux-2.6/mm/slub.c
> ===================================================================
> --- linux-2.6.orig/mm/slub.c	2009-09-03 10:14:51.000000000 -0500
> +++ linux-2.6/mm/slub.c	2009-09-03 10:18:32.000000000 -0500
> @@ -2594,9 +2594,9 @@ static inline int kmem_cache_close(struc
>   */
>  void kmem_cache_destroy(struct kmem_cache *s)
>  {
> -	if (s->flags & SLAB_DESTROY_BY_RCU)
> -		rcu_barrier();
>  	down_write(&slub_lock);
> +	/* Stop deferring frees so that we can immediately free structures */
> +	s->flags &= ~SLAB_DESTROY_BY_RCU;
>  	s->refcount--;
>  	if (!s->refcount) {
>  		list_del(&s->list);

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU
  2009-09-03  7:38                                   ` Eric Dumazet
  2009-09-03  7:51                                     ` Pekka Enberg
@ 2009-09-03 17:45                                     ` Christoph Lameter
  2009-09-03 14:08                                       ` [PATCH] slub: fix slab_pad_check() Eric Dumazet
  2009-09-03 15:00                                       ` [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU Paul E. McKenney
  1 sibling, 2 replies; 50+ messages in thread
From: Christoph Lameter @ 2009-09-03 17:45 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Pekka Enberg, Zdenek Kabelac, Patrick McHardy, Robin Holt,
	Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers, paulmck

On Thu, 3 Sep 2009, Eric Dumazet wrote:

> on a SLAB_DESTROY_BY_RCU cache, there is no need to try to optimize this
> rcu_barrier() call, unless we want superfast reboot/halt sequences...

I stilll think that the action to quiesce rcu is something that the caller
of kmem_cache_destroy must take care of.

Could you split this into two patches: One that addresses the poison and
another that deals with rcu?

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU
  2009-09-03  7:51                                     ` Pekka Enberg
@ 2009-09-03 17:50                                       ` Christoph Lameter
  2009-09-03 14:05                                         ` Pekka Enberg
  0 siblings, 1 reply; 50+ messages in thread
From: Christoph Lameter @ 2009-09-03 17:50 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Eric Dumazet, Zdenek Kabelac, Patrick McHardy, Robin Holt,
	Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers, paulmck

On Thu, 3 Sep 2009, Pekka Enberg wrote:

> Oh, sure, the fix looks sane to me. It's just that I am a complete
> coward when it comes to merging RCU related patches so I always try to
> fish an Acked-by from Paul or Christoph ;).

I am fine with acking the poison piece.

I did not ack the patch that added rcu to kmem_cache_destroy() and I
likely wont ack that piece either.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-03 19:24                                             ` Christoph Lameter
  2009-09-03 17:44                                               ` Paul E. McKenney
@ 2009-09-03 17:59                                               ` Eric Dumazet
  2009-09-03 19:00                                                 ` Pekka Enberg
  2009-09-03 22:44                                                 ` Christoph Lameter
  1 sibling, 2 replies; 50+ messages in thread
From: Eric Dumazet @ 2009-09-03 17:59 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Pekka Enberg, Zdenek Kabelac, Patrick McHardy, Robin Holt,
	Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers, paulmck

Christoph Lameter a écrit :
> On Thu, 3 Sep 2009, Eric Dumazet wrote:
> 
>> Point is we cannot deal with RCU quietness before disposing the slab cache,
>> (if SLAB_DESTROY_BY_RCU was set on the cache) since this disposing *will*
>> make call_rcu() calls when a full slab is freed/purged.
> 
> There is no need to do call_rcu calls for frees at that point since
> objects are no longer in use. We could simply disable SLAB_DESTROY_BY_RCU
> for the final clearing of caches.
> 
>> And when RCU grace period is elapsed, the callback *will* need access to
>> the cache we want to dismantle. Better to not have kfreed()/poisoned it...
> 
> But going through the RCU period is pointless since no user of the cache
> remains.
> 
>> I believe you mix two RCU uses here.
>>
>> 1) The one we all know, is use normal caches (!SLAB_DESTROY_BY_RCU)
>> (or kmalloc()), and use call_rcu(... kfree_something)
>>
>>    In this case, you are 100% right that the subsystem itself has
>>    to call rcu_barrier() (or respect whatever self-synchro) itself,
>>    before calling kmem_cache_destroy()
>>
>> 2) The SLAB_DESTROY_BY_RCU one.
>>
>>    Part of cache dismantle needs to call rcu_barrier() itself.
>>    Caller doesnt have to use rcu_barrier(). It would be a waste of time,
>>    as kmem_cache_destroy() will refill rcu wait queues with its own stuff.
> 
> The dismantling does not need RCU since there are no operations on the
> objects in progress. So simply switch DESTROY_BY_RCU off for close.
> 
> 
> ---
>  mm/slub.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> Index: linux-2.6/mm/slub.c
> ===================================================================
> --- linux-2.6.orig/mm/slub.c	2009-09-03 10:14:51.000000000 -0500
> +++ linux-2.6/mm/slub.c	2009-09-03 10:18:32.000000000 -0500
> @@ -2594,9 +2594,9 @@ static inline int kmem_cache_close(struc
>   */
>  void kmem_cache_destroy(struct kmem_cache *s)
>  {
> -	if (s->flags & SLAB_DESTROY_BY_RCU)
> -		rcu_barrier();
>  	down_write(&slub_lock);
> +	/* Stop deferring frees so that we can immediately free structures */
> +	s->flags &= ~SLAB_DESTROY_BY_RCU;
>  	s->refcount--;
>  	if (!s->refcount) {
>  		list_del(&s->list);

It seems very smart, but needs review of all callers to make sure no slabs
are waiting for final freeing in call_rcu queue on some cpu.

I suspect most of them will then have to use rcu_barrier() before calling
kmem_cache_destroy(), so why not factorizing code in one place ?

net/dccp/ipv6.c:1145:   .slab_flags        = SLAB_DESTROY_BY_RCU,
net/dccp/ipv4.c:941:    .slab_flags             = SLAB_DESTROY_BY_RCU,
net/ipv4/udp.c:1593:    .slab_flags        = SLAB_DESTROY_BY_RCU,
net/ipv4/udplite.c:54:  .slab_flags        = SLAB_DESTROY_BY_RCU,
net/ipv4/tcp_ipv4.c:2446:       .slab_flags             = SLAB_DESTROY_BY_RCU,
net/ipv4/udp.c.orig:1587:       .slab_flags        = SLAB_DESTROY_BY_RCU,
net/ipv6/udp.c:1274:    .slab_flags        = SLAB_DESTROY_BY_RCU,
net/ipv6/udplite.c:52:  .slab_flags        = SLAB_DESTROY_BY_RCU,
net/ipv6/tcp_ipv6.c:2085:       .slab_flags             = SLAB_DESTROY_BY_RCU,
net/netfilter/nf_conntrack_core.c:1269:                                         0, SLAB_DESTROY_BY_RCU, NULL);

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: System freeze on reboot - general protection fault
  2009-09-02 22:17                           ` Eric Dumazet
  2009-09-02 22:31                             ` Zdenek Kabelac
@ 2009-09-03 18:17                             ` Paul E. McKenney
  1 sibling, 0 replies; 50+ messages in thread
From: Paul E. McKenney @ 2009-09-03 18:17 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Zdenek Kabelac, Patrick McHardy, Christoph Lameter, Robin Holt,
	Linux Kernel Mailing List, Pekka Enberg, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

On Thu, Sep 03, 2009 at 12:17:43AM +0200, Eric Dumazet wrote:
> Zdenek Kabelac a écrit :
> > 2009/8/17 Patrick McHardy <kaber@trash.net>:
> >> Eric Dumazet wrote:
> >>> Zdenek Kabelac a écrit :
> >>>>  [<ffffffffa02c502f>] nf_conntrack_ftp_fini+0x2f/0x70 [nf_conntrack_ftp]
> >>>>  [<ffffffff8027bcc5>] sys_delete_module+0x1a5/0x270
> >>>>  [<ffffffff8020d329>] ? retint_swapgs+0xe/0x13
> >>>>  [<ffffffff80271bf2>] ? trace_hardirqs_on_caller+0x162/0x1b0
> >>>>  [<ffffffff80292121>] ? audit_syscall_entry+0x191/0x1c0
> >>>>  [<ffffffff80526dae>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> >>>>  [<ffffffff8020c84b>] system_call_fastpath+0x16/0x1b
> >>>> Code: c6 00 00 0f 82 66 ff ff ff 49 8b 9e d8 05 00 00 48 85 db 75 16
> >>>> e9 8e 00 00 00 0f 1f 44 00 00 48 85 c0 0f 84 80 00 00 00 48 89 c3 <0f>
> >>>> b6 4b 37 48 8b 03 48 8d 14 cd 00 00 00 00 0f 18 08 48 29 ca
> >>>> RIP  [<ffffffffa02b2c2c>] nf_conntrack_helper_unregister+0x16c/0x320
> >>>> [nf_conntrack]
> >>>>  RSP <ffff88013982fe68>
> >>>> CR2: 0000000000000038
> >>>> ---[ end trace bc3a0ede3d0084db ]---
> >>>>
> >>> I am currently traveling and wont be able to help you before next week.
> >>>
> >>> I added netdev, Patrick, and netfilter-devel in CC so that more eyes can take a look.
> >> Thanks for the report, I'll have a look at this. Zdenek, please
> >> send me the nf_conntrack.ko file used in the above oops. Thanks.
> >>
> > 
> > Ok
> > 
> > I've found the solution for my problem.
> > 
> > http://thread.gmane.org/gmane.comp.security.firewalls.netfilter.devel/30483
> > 
> > I've made this small fix from this thread:
> > 
> > diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core
> > index b5869b9..68488f8 100644
> > --- a/net/netfilter/nf_conntrack_core.c
> > +++ b/net/netfilter/nf_conntrack_core.c
> > @@ -1108,6 +1108,7 @@ static void nf_conntrack_cleanup_init_net(void)
> >  {
> >         nf_conntrack_helper_fini();
> >         nf_conntrack_proto_fini();
> > +       rcu_barrier();
> >         kmem_cache_destroy(nf_conntrack_cachep);
> >  }
> > 
> > @@ -1266,7 +1267,7 @@ static int nf_conntrack_init_init_net(void)
> > 
> >         nf_conntrack_cachep = kmem_cache_create("nf_conntrack",
> >                                                 sizeof(struct nf_conn),
> > -                                               0, SLAB_DESTROY_BY_RCU, NULL);
> > +                                               0, 0, NULL);
> >         if (!nf_conntrack_cachep) {
> >                 printk(KERN_ERR "Unable to create nf_conn slab cache\n");
> >                 ret = -ENOMEM;
> > 
> > 
> > As the thread nf_conntrack: Use rcu_barrier() and fix kmem_cache_create flags
> > seems to be samewhat 'unfinished'  and already a bit old and I've no
> > idea whether it actually fixes problem completely or just hides it in
> > my case - I'm leaving it to some RCU gurus to fix this issue.
> > 
> > All I could say is - this this extra rcu_barrier() and removal of
> > SLAB_DESTROY removes my GPF on reboot.
> > 
> > Zdenek
> 
> Ouch..
> 
> Dont think such a patch makes your kernel better, it'll crash too.
> 
> You cannot remove SLAB_DESTROY_BY_RCU like this, it's there for very good reasons.

And if I understand correctly, this is more evidence that
kmem_cache_destroy() needs to do an rcu_barrier() in the
SLAB_DESTROY_BY_RCU case.

							Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-03 14:08                                       ` [PATCH] slub: fix slab_pad_check() Eric Dumazet
@ 2009-09-03 18:38                                         ` Christoph Lameter
  2009-09-03 15:01                                           ` Paul E. McKenney
  2009-09-03 15:02                                           ` Eric Dumazet
  2009-09-03 19:34                                         ` Pekka Enberg
  1 sibling, 2 replies; 50+ messages in thread
From: Christoph Lameter @ 2009-09-03 18:38 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Pekka Enberg, Zdenek Kabelac, Patrick McHardy, Robin Holt,
	Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers, paulmck

On Thu, 3 Sep 2009, Eric Dumazet wrote:

> Christoph Lameter a ?crit :
> > On Thu, 3 Sep 2009, Eric Dumazet wrote:
> >
> >> on a SLAB_DESTROY_BY_RCU cache, there is no need to try to optimize this
> >> rcu_barrier() call, unless we want superfast reboot/halt sequences...
> >
> > I stilll think that the action to quiesce rcu is something that the caller
> > of kmem_cache_destroy must take care of.
>
> Do you mean :
>
> if (kmem_cache_shrink(s) == 0) {
> 	rcu_barrier();
> 	kmem_cache_destroy_no_rcu_barrier(s);
> } else {
> 	kmem_cache_destroy_with_rcu_barrier_because_SLAB_DESTROY_BY_RCU_cache(s);
> }
>
> What would be the point ?

The above is port of slub?

I mean that (in this case) the net subsystem would have to deal with RCU quietness
before disposing of the slab cache. There may be multiple ways of dealing
with RCU. The RCU barrier may be unnecessary for future uses. Typically
one would expect that all deferred handling of structures must be complete
for correctness before disposing of the whole cache.

> [PATCH] slub: fix slab_pad_check()

Acked-by: Christoph Lameter <cl@linux-foundation.org>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-03 17:59                                               ` Eric Dumazet
@ 2009-09-03 19:00                                                 ` Pekka Enberg
  2009-09-03 22:44                                                 ` Christoph Lameter
  1 sibling, 0 replies; 50+ messages in thread
From: Pekka Enberg @ 2009-09-03 19:00 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Christoph Lameter, Zdenek Kabelac, Patrick McHardy, Robin Holt,
	Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers, paulmck

On Thu, Sep 3, 2009 at 8:59 PM, Eric Dumazet<eric.dumazet@gmail.com> wrote:
> Christoph Lameter a écrit :
>> On Thu, 3 Sep 2009, Eric Dumazet wrote:
>>
>>> Point is we cannot deal with RCU quietness before disposing the slab cache,
>>> (if SLAB_DESTROY_BY_RCU was set on the cache) since this disposing *will*
>>> make call_rcu() calls when a full slab is freed/purged.
>>
>> There is no need to do call_rcu calls for frees at that point since
>> objects are no longer in use. We could simply disable SLAB_DESTROY_BY_RCU
>> for the final clearing of caches.
>>
>>> And when RCU grace period is elapsed, the callback *will* need access to
>>> the cache we want to dismantle. Better to not have kfreed()/poisoned it...
>>
>> But going through the RCU period is pointless since no user of the cache
>> remains.
>>
>>> I believe you mix two RCU uses here.
>>>
>>> 1) The one we all know, is use normal caches (!SLAB_DESTROY_BY_RCU)
>>> (or kmalloc()), and use call_rcu(... kfree_something)
>>>
>>>    In this case, you are 100% right that the subsystem itself has
>>>    to call rcu_barrier() (or respect whatever self-synchro) itself,
>>>    before calling kmem_cache_destroy()
>>>
>>> 2) The SLAB_DESTROY_BY_RCU one.
>>>
>>>    Part of cache dismantle needs to call rcu_barrier() itself.
>>>    Caller doesnt have to use rcu_barrier(). It would be a waste of time,
>>>    as kmem_cache_destroy() will refill rcu wait queues with its own stuff.
>>
>> The dismantling does not need RCU since there are no operations on the
>> objects in progress. So simply switch DESTROY_BY_RCU off for close.
>>
>>
>> ---
>>  mm/slub.c |    4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> Index: linux-2.6/mm/slub.c
>> ===================================================================
>> --- linux-2.6.orig/mm/slub.c  2009-09-03 10:14:51.000000000 -0500
>> +++ linux-2.6/mm/slub.c       2009-09-03 10:18:32.000000000 -0500
>> @@ -2594,9 +2594,9 @@ static inline int kmem_cache_close(struc
>>   */
>>  void kmem_cache_destroy(struct kmem_cache *s)
>>  {
>> -     if (s->flags & SLAB_DESTROY_BY_RCU)
>> -             rcu_barrier();
>>       down_write(&slub_lock);
>> +     /* Stop deferring frees so that we can immediately free structures */
>> +     s->flags &= ~SLAB_DESTROY_BY_RCU;
>>       s->refcount--;
>>       if (!s->refcount) {
>>               list_del(&s->list);
>
> It seems very smart, but needs review of all callers to make sure no slabs
> are waiting for final freeing in call_rcu queue on some cpu.
>
> I suspect most of them will then have to use rcu_barrier() before calling
> kmem_cache_destroy(), so why not factorizing code in one place ?

\x10[snip]

Can someone please explain what's the upside in Christoph's approach?
Performance? Correctness? Something else entirely? We're looking at a
tested bug fix here and I don't understand why I shouldn't just go
ahead and merge it. Hmm?

                        Pekka

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-03 15:02                                           ` Eric Dumazet
@ 2009-09-03 19:24                                             ` Christoph Lameter
  2009-09-03 17:44                                               ` Paul E. McKenney
  2009-09-03 17:59                                               ` Eric Dumazet
  0 siblings, 2 replies; 50+ messages in thread
From: Christoph Lameter @ 2009-09-03 19:24 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Pekka Enberg, Zdenek Kabelac, Patrick McHardy, Robin Holt,
	Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers, paulmck

On Thu, 3 Sep 2009, Eric Dumazet wrote:

> Point is we cannot deal with RCU quietness before disposing the slab cache,
> (if SLAB_DESTROY_BY_RCU was set on the cache) since this disposing *will*
> make call_rcu() calls when a full slab is freed/purged.

There is no need to do call_rcu calls for frees at that point since
objects are no longer in use. We could simply disable SLAB_DESTROY_BY_RCU
for the final clearing of caches.

> And when RCU grace period is elapsed, the callback *will* need access to
> the cache we want to dismantle. Better to not have kfreed()/poisoned it...

But going through the RCU period is pointless since no user of the cache
remains.

> I believe you mix two RCU uses here.
>
> 1) The one we all know, is use normal caches (!SLAB_DESTROY_BY_RCU)
> (or kmalloc()), and use call_rcu(... kfree_something)
>
>    In this case, you are 100% right that the subsystem itself has
>    to call rcu_barrier() (or respect whatever self-synchro) itself,
>    before calling kmem_cache_destroy()
>
> 2) The SLAB_DESTROY_BY_RCU one.
>
>    Part of cache dismantle needs to call rcu_barrier() itself.
>    Caller doesnt have to use rcu_barrier(). It would be a waste of time,
>    as kmem_cache_destroy() will refill rcu wait queues with its own stuff.

The dismantling does not need RCU since there are no operations on the
objects in progress. So simply switch DESTROY_BY_RCU off for close.


---
 mm/slub.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c	2009-09-03 10:14:51.000000000 -0500
+++ linux-2.6/mm/slub.c	2009-09-03 10:18:32.000000000 -0500
@@ -2594,9 +2594,9 @@ static inline int kmem_cache_close(struc
  */
 void kmem_cache_destroy(struct kmem_cache *s)
 {
-	if (s->flags & SLAB_DESTROY_BY_RCU)
-		rcu_barrier();
 	down_write(&slub_lock);
+	/* Stop deferring frees so that we can immediately free structures */
+	s->flags &= ~SLAB_DESTROY_BY_RCU;
 	s->refcount--;
 	if (!s->refcount) {
 		list_del(&s->list);

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-03 14:08                                       ` [PATCH] slub: fix slab_pad_check() Eric Dumazet
  2009-09-03 18:38                                         ` Christoph Lameter
@ 2009-09-03 19:34                                         ` Pekka Enberg
  1 sibling, 0 replies; 50+ messages in thread
From: Pekka Enberg @ 2009-09-03 19:34 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Christoph Lameter, Zdenek Kabelac, Patrick McHardy, Robin Holt,
	Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers, paulmck

On Thu, Sep 3, 2009 at 5:08 PM, Eric Dumazet<eric.dumazet@gmail.com> wrote:
> Sure, here is the poison thing
>
> [PATCH] slub: fix slab_pad_check()
>
> When SLAB_POISON is used and slab_pad_check() finds an overwrite of the
> slab padding, we call restore_bytes() on the whole slab, not only
> on the padding.
>
> Reported-by: Zdenek Kabelac <zdenek.kabelac@gmail.com>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied, thanks!

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: Fix kmem_cache_destroy() with SLAB_DESTROY_BY_RCU
  2009-09-03 14:18                                           ` [PATCH] slub: Fix kmem_cache_destroy() with SLAB_DESTROY_BY_RCU Eric Dumazet
@ 2009-09-03 19:48                                             ` Pekka Enberg
  2009-09-03 19:56                                               ` Eric Dumazet
  0 siblings, 1 reply; 50+ messages in thread
From: Pekka Enberg @ 2009-09-03 19:48 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Christoph Lameter, Zdenek Kabelac, Patrick McHardy, Robin Holt,
	Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers, paulmck,
	stable@kernel.org

On Thu, Sep 3, 2009 at 5:18 PM, Eric Dumazet<eric.dumazet@gmail.com> wrote:
> Here is the second patch (RCU thing). Stable candidate
>
> [PATCH] slub: Fix kmem_cache_destroy() with SLAB_DESTROY_BY_RCU
>
> kmem_cache_destroy() should call rcu_barrier() *after* kmem_cache_close()
> and *before* sysfs_slab_remove() or risk rcu_free_slab()
> being called after kmem_cache is deleted (kfreed).
>
> rmmod nf_conntrack can crash the machine because it has to
> kmem_cache_destroy() a SLAB_DESTROY_BY_RCU enabled cache.

Do we have a bugzilla URL for this?

> Reported-by: Zdenek Kabelac <zdenek.kabelac@gmail.com>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

OK, this is in for-next now and queued for 2.6.31. If you guys want to
fix this in a different way, lets do that in 2.6.32.

                        Pekka

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: Fix kmem_cache_destroy() with SLAB_DESTROY_BY_RCU
  2009-09-03 19:48                                             ` Pekka Enberg
@ 2009-09-03 19:56                                               ` Eric Dumazet
  0 siblings, 0 replies; 50+ messages in thread
From: Eric Dumazet @ 2009-09-03 19:56 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Christoph Lameter, Zdenek Kabelac, Patrick McHardy, Robin Holt,
	Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers, paulmck,
	stable@kernel.org

Pekka Enberg a écrit :
> On Thu, Sep 3, 2009 at 5:18 PM, Eric Dumazet<eric.dumazet@gmail.com> wrote:
>> Here is the second patch (RCU thing). Stable candidate
>>
>> [PATCH] slub: Fix kmem_cache_destroy() with SLAB_DESTROY_BY_RCU
>>
>> kmem_cache_destroy() should call rcu_barrier() *after* kmem_cache_close()
>> and *before* sysfs_slab_remove() or risk rcu_free_slab()
>> being called after kmem_cache is deleted (kfreed).
>>
>> rmmod nf_conntrack can crash the machine because it has to
>> kmem_cache_destroy() a SLAB_DESTROY_BY_RCU enabled cache.
> 
> Do we have a bugzilla URL for this?

Well, I can crash my 2.6.30.5 machine just doing rmmod nf_conntrack

(You'll need CONFIG_SLUB_DEBUG_ON or equivalent)

Original Zdenek report : http://thread.gmane.org/gmane.linux.kernel/876016/focus=876086

> 
>> Reported-by: Zdenek Kabelac <zdenek.kabelac@gmail.com>
>> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
>> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> OK, this is in for-next now and queued for 2.6.31. If you guys want to
> fix this in a different way, lets do that in 2.6.32.

Seems the right thing IMHO
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-03 22:43                                                 ` Christoph Lameter
@ 2009-09-03 22:03                                                   ` Paul E. McKenney
  2009-09-04 15:33                                                     ` Christoph Lameter
  2009-09-03 22:08                                                   ` Eric Dumazet
  1 sibling, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2009-09-03 22:03 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Eric Dumazet, Pekka Enberg, Zdenek Kabelac, Patrick McHardy,
	Robin Holt, Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

On Thu, Sep 03, 2009 at 05:43:12PM -0500, Christoph Lameter wrote:
> On Thu, 3 Sep 2009, Paul E. McKenney wrote:
> 
> > 2.	CPU 0 discovers that the slab cache can now be destroyed.
> >
> > 	It determines that there are no users, and has guaranteed
> > 	that there will be no future users.  So it knows that it
> > 	can safely do kmem_cache_destroy().
> >
> > 3.	In absence of rcu_barrier(), kmem_cache_destroy() would
> > 	immediately tear down the slab data structures.
> 
> Of course. This has been discussed before.
> 
> You need to ensure that no objects are in use before destroying a slab. In
> case of DESTROY_BY_RCU you must ensure that there are no potential
> readers. So use a suitable rcu barrier or something else like a
> synchronize_rcu...

If by "you must ensure" you mean "kmem_cache_destroy must ensure", then
we are in complete agreement.  Otherwise, not a chance.

> > > But going through the RCU period is pointless since no user of the cache
> > > remains.
> >
> > Which is irrelevant.  The outstanding RCU callback was posted by the
> > slab cache itself, -not- by the user of the slab cache.
> 
> There will be no rcu callbacks generated at kmem_cache_destroy with the
> patch I posted.

That is true.  However, there well might be RCU callbacks generated by
kmem_cache_free() that have not yet been invoked.  Since kmem_cache_free()
generated them, it is ridiculous to insist that the user account for them.
That responsibility must instead fall on kmem_cache_destroy().

> > > The dismantling does not need RCU since there are no operations on the
> > > objects in progress. So simply switch DESTROY_BY_RCU off for close.
> >
> > Unless I am missing something, this patch re-introduces the bug that
> > the rcu_barrier() was added to prevent.  So, in absence of a better
> > explanation of what I am missing:
> 
> The "fix" was ill advised. Slab users must ensure that no objects are in
> use before destroying a slab. Only the slab users know how the objects
> are being used. The slab allocator itself cannot know how to ensure that
> there are no pending references. Putting a rcu_barrier in there creates an
> inconsistency in the operation of kmem_cache_destroy() and an expectation
> of functionality that the function cannot provide.

If by "must ensure that no objects are in use", you mean "must have
no further references to them", then we are in agreement.  And in my
scenario above, it is not the -user- who later references the memory,
but rather the slab code itself.

Put the rcu_barrier() in kmem_cache_free().  Please.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-03 22:43                                                 ` Christoph Lameter
  2009-09-03 22:03                                                   ` Paul E. McKenney
@ 2009-09-03 22:08                                                   ` Eric Dumazet
  2009-09-03 22:17                                                     ` Eric Dumazet
  2009-09-04 15:38                                                     ` Christoph Lameter
  1 sibling, 2 replies; 50+ messages in thread
From: Eric Dumazet @ 2009-09-03 22:08 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul E. McKenney, Pekka Enberg, Zdenek Kabelac, Patrick McHardy,
	Robin Holt, Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

Christoph Lameter a écrit :
> On Thu, 3 Sep 2009, Paul E. McKenney wrote:
> 
>> 2.	CPU 0 discovers that the slab cache can now be destroyed.
>>
>> 	It determines that there are no users, and has guaranteed
>> 	that there will be no future users.  So it knows that it
>> 	can safely do kmem_cache_destroy().
>>
>> 3.	In absence of rcu_barrier(), kmem_cache_destroy() would
>> 	immediately tear down the slab data structures.
> 
> Of course. This has been discussed before.
> 
> You need to ensure that no objects are in use before destroying a slab. In
> case of DESTROY_BY_RCU you must ensure that there are no potential
> readers. So use a suitable rcu barrier or something else like a
> synchronize_rcu...
> 
>>> But going through the RCU period is pointless since no user of the cache
>>> remains.
>> Which is irrelevant.  The outstanding RCU callback was posted by the
>> slab cache itself, -not- by the user of the slab cache.
> 
> There will be no rcu callbacks generated at kmem_cache_destroy with the
> patch I posted.
> 
>>> The dismantling does not need RCU since there are no operations on the
>>> objects in progress. So simply switch DESTROY_BY_RCU off for close.
>> Unless I am missing something, this patch re-introduces the bug that
>> the rcu_barrier() was added to prevent.  So, in absence of a better
>> explanation of what I am missing:
> 
> The "fix" was ill advised. Slab users must ensure that no objects are in
> use before destroying a slab. Only the slab users know how the objects
> are being used. The slab allocator itself cannot know how to ensure that
> there are no pending references. Putting a rcu_barrier in there creates an
> inconsistency in the operation of kmem_cache_destroy() and an expectation
> of functionality that the function cannot provide.
> 



Problem is not _objects_ Christoph, but _slabs_, and your patch is not working.

Its true that when User calls kmem_cache_destroy(), all _objects_ were previously freed.
This is mandatory, with or withou SLAB_DESTROY_BY_RCU thing

Problem is that slub has some internal state, including some to-be-freed _slabs_,
that User have no control at all on it.

User cannot "know" slabs are freed, inuse, or whatever state in cache or call_rcu queues.

Face it, SLAB_DESTROY_BY_RCU is internal affair (to slub/slab/... allocators)

We absolutely need a rcu_barrier() somewhere, believe it or not. You can argue that it should
be done *before*, but it gives no speedup, only potential bugs.

Only case User should do its rcu_barrier() itself is if it knows some call_rcu() are pending
and are delaying _objects_ freeing (typical !SLAB_DESTROY_RCU usage in RCU algos).

I dont even understand why you care so much about kmem_cache_destroy(SLAB_DESTROY_BY_RCU),
given that almost nobody use it. We took almost one month to find out what the bug was in first
place...
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-03 22:08                                                   ` Eric Dumazet
@ 2009-09-03 22:17                                                     ` Eric Dumazet
  2009-09-04 15:39                                                       ` Christoph Lameter
  2009-09-04 20:42                                                       ` Paul E. McKenney
  2009-09-04 15:38                                                     ` Christoph Lameter
  1 sibling, 2 replies; 50+ messages in thread
From: Eric Dumazet @ 2009-09-03 22:17 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul E. McKenney, Pekka Enberg, Zdenek Kabelac, Patrick McHardy,
	Robin Holt, Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

Eric Dumazet a écrit :
> 
> 
> 
> Problem is not _objects_ Christoph, but _slabs_, and your patch is not working.
> 
> Its true that when User calls kmem_cache_destroy(), all _objects_ were previously freed.
> This is mandatory, with or withou SLAB_DESTROY_BY_RCU thing
> 
> Problem is that slub has some internal state, including some to-be-freed _slabs_,
> that User have no control at all on it.
> 
> User cannot "know" slabs are freed, inuse, or whatever state in cache or call_rcu queues.
> 
> Face it, SLAB_DESTROY_BY_RCU is internal affair (to slub/slab/... allocators)
> 
> We absolutely need a rcu_barrier() somewhere, believe it or not. You can argue that it should
> be done *before*, but it gives no speedup, only potential bugs.
> 
> Only case User should do its rcu_barrier() itself is if it knows some call_rcu() are pending
> and are delaying _objects_ freeing (typical !SLAB_DESTROY_RCU usage in RCU algos).
> 
> I dont even understand why you care so much about kmem_cache_destroy(SLAB_DESTROY_BY_RCU),
> given that almost nobody use it. We took almost one month to find out what the bug was in first
> place...


So maybe the safest thing would be to include the rcu_barrier() to insure all objects where freed

And another one for SLAB_DESTROY_BY_RCU to make sure slabs where freed

void kmem_cache_destroy(struct kmem_cache *s)
{
	/*
	 * Make sure no objects are waiting in call_rcu queues to be freed
	 */
	rcu_barrier();

	down_write(&slub_lock);
	s->refcount--;
	if (!s->refcount) {
                list_del(&s->list);
                up_write(&slub_lock);
                if (kmem_cache_close(s)) {
                        printk(KERN_ERR "SLUB %s: %s called for cache that "
                                "still has objects.\n", s->name, __func__);
                        dump_stack();
                }
		/*
		 * Make sure no slabs are waiting in call_rcu queues to be freed
		 */
                if (s->flags & SLAB_DESTROY_BY_RCU)
                        rcu_barrier();
                sysfs_slab_remove(s);
        } else
                up_write(&slub_lock);
}

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-03 17:44                                               ` Paul E. McKenney
@ 2009-09-03 22:43                                                 ` Christoph Lameter
  2009-09-03 22:03                                                   ` Paul E. McKenney
  2009-09-03 22:08                                                   ` Eric Dumazet
  0 siblings, 2 replies; 50+ messages in thread
From: Christoph Lameter @ 2009-09-03 22:43 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Eric Dumazet, Pekka Enberg, Zdenek Kabelac, Patrick McHardy,
	Robin Holt, Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

On Thu, 3 Sep 2009, Paul E. McKenney wrote:

> 2.	CPU 0 discovers that the slab cache can now be destroyed.
>
> 	It determines that there are no users, and has guaranteed
> 	that there will be no future users.  So it knows that it
> 	can safely do kmem_cache_destroy().
>
> 3.	In absence of rcu_barrier(), kmem_cache_destroy() would
> 	immediately tear down the slab data structures.

Of course. This has been discussed before.

You need to ensure that no objects are in use before destroying a slab. In
case of DESTROY_BY_RCU you must ensure that there are no potential
readers. So use a suitable rcu barrier or something else like a
synchronize_rcu...

> > But going through the RCU period is pointless since no user of the cache
> > remains.
>
> Which is irrelevant.  The outstanding RCU callback was posted by the
> slab cache itself, -not- by the user of the slab cache.

There will be no rcu callbacks generated at kmem_cache_destroy with the
patch I posted.

> > The dismantling does not need RCU since there are no operations on the
> > objects in progress. So simply switch DESTROY_BY_RCU off for close.
>
> Unless I am missing something, this patch re-introduces the bug that
> the rcu_barrier() was added to prevent.  So, in absence of a better
> explanation of what I am missing:

The "fix" was ill advised. Slab users must ensure that no objects are in
use before destroying a slab. Only the slab users know how the objects
are being used. The slab allocator itself cannot know how to ensure that
there are no pending references. Putting a rcu_barrier in there creates an
inconsistency in the operation of kmem_cache_destroy() and an expectation
of functionality that the function cannot provide.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-03 17:59                                               ` Eric Dumazet
  2009-09-03 19:00                                                 ` Pekka Enberg
@ 2009-09-03 22:44                                                 ` Christoph Lameter
  2009-09-03 23:17                                                   ` Paul E. McKenney
  1 sibling, 1 reply; 50+ messages in thread
From: Christoph Lameter @ 2009-09-03 22:44 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Pekka Enberg, Zdenek Kabelac, Patrick McHardy, Robin Holt,
	Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers, paulmck

On Thu, 3 Sep 2009, Eric Dumazet wrote:

> It seems very smart, but needs review of all callers to make sure no slabs
> are waiting for final freeing in call_rcu queue on some cpu.

Yes. Again this is the first time we encounter a situation where a
DESTROY_BY_RCU slab has to be destroyed. So the review is quite short.

> I suspect most of them will then have to use rcu_barrier() before calling
> kmem_cache_destroy(), so why not factorizing code in one place ?

There are different forms of RCU which require different forms of
barriers. Its best to leave that up to the user. Again the user must make
sure that no objects are in use before a slab is destroyed. For
SLAB_DESTROY_BY_RCU this means that there are no potential outstanding
reads on the structure. You may need an rcu_barrier() to accomplish that.

Slight variations in the use of RCU could require different method. Better
reduce the entanglement of slabs to RCU to a mininum possible.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-03 22:44                                                 ` Christoph Lameter
@ 2009-09-03 23:17                                                   ` Paul E. McKenney
  2009-09-04 15:42                                                     ` Christoph Lameter
  0 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2009-09-03 23:17 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Eric Dumazet, Pekka Enberg, Zdenek Kabelac, Patrick McHardy,
	Robin Holt, Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

On Thu, Sep 03, 2009 at 05:44:54PM -0500, Christoph Lameter wrote:
> On Thu, 3 Sep 2009, Eric Dumazet wrote:
> 
> > It seems very smart, but needs review of all callers to make sure no slabs
> > are waiting for final freeing in call_rcu queue on some cpu.
> 
> Yes. Again this is the first time we encounter a situation where a
> DESTROY_BY_RCU slab has to be destroyed. So the review is quite short.
> 
> > I suspect most of them will then have to use rcu_barrier() before calling
> > kmem_cache_destroy(), so why not factorizing code in one place ?
> 
> There are different forms of RCU which require different forms of
> barriers. Its best to leave that up to the user. Again the user must make
> sure that no objects are in use before a slab is destroyed. For
> SLAB_DESTROY_BY_RCU this means that there are no potential outstanding
> reads on the structure. You may need an rcu_barrier() to accomplish that.
> 
> Slight variations in the use of RCU could require different method. Better
> reduce the entanglement of slabs to RCU to a mininum possible.

If it were the user of the slab who was invoking some variant of
call_rcu(), then I would agree with you.

However, call_rcu() is instead being invoked by the slab itself in the
case of SLAB_DESTROY_BY_RCU, so that there is no variation in usage.
Requiring that the user call rcu_barrier() is asking for subtle bugs.
Therefore, the best approach is to have kmem_cache_destroy() handle
the RCU cleanup, given that this cleanup is for actions taken by
kmem_cache_free(), not by the user.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-03 22:03                                                   ` Paul E. McKenney
@ 2009-09-04 15:33                                                     ` Christoph Lameter
  0 siblings, 0 replies; 50+ messages in thread
From: Christoph Lameter @ 2009-09-04 15:33 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Eric Dumazet, Pekka Enberg, Zdenek Kabelac, Patrick McHardy,
	Robin Holt, Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

On Thu, 3 Sep 2009, Paul E. McKenney wrote:

> > You need to ensure that no objects are in use before destroying a slab. In
> > case of DESTROY_BY_RCU you must ensure that there are no potential
> > readers. So use a suitable rcu barrier or something else like a
> > synchronize_rcu...
>
> If by "you must ensure" you mean "kmem_cache_destroy must ensure", then
> we are in complete agreement.  Otherwise, not a chance.

Well then we are going down to a crappy interface and mixing rcu with slab
semantics. More bugs to follow in the future.

> If by "must ensure that no objects are in use", you mean "must have
> no further references to them", then we are in agreement.  And in my
> scenario above, it is not the -user- who later references the memory,
> but rather the slab code itself.

The slab code itself is not referencing the later memory with my patch. It
can only be the user.

> Put the rcu_barrier() in kmem_cache_free().  Please.

Guess we are doing this ... Sigh. Are you going to add support other rcu
versions to slab as well as time permits and as the need arises? Should we
add you as a maintainer? ;-)

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-03 22:08                                                   ` Eric Dumazet
  2009-09-03 22:17                                                     ` Eric Dumazet
@ 2009-09-04 15:38                                                     ` Christoph Lameter
  1 sibling, 0 replies; 50+ messages in thread
From: Christoph Lameter @ 2009-09-04 15:38 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Paul E. McKenney, Pekka Enberg, Zdenek Kabelac, Patrick McHardy,
	Robin Holt, Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

On Fri, 4 Sep 2009, Eric Dumazet wrote:

> Problem is not _objects_ Christoph, but _slabs_, and your patch is not working.

Why?

> Its true that when User calls kmem_cache_destroy(), all _objects_ were
> previously freed. This is mandatory, with or withou SLAB_DESTROY_BY_RCU
> thing

Right.

> Problem is that slub has some internal state, including some to-be-freed _slabs_,
> that User have no control at all on it.

Those are going to be freed without calls to rcu with my patch. The only
reason for earlier rcu frees are user calls to kfree.

> Face it, SLAB_DESTROY_BY_RCU is internal affair (to slub/slab/... allocators)

Nope the user must follow RCU guidelines when using objects.

> We absolutely need a rcu_barrier() somewhere, believe it or not. You can
> argue that it should be done *before*, but it gives no speedup, only
> potential bugs.

I never said that you do not need an rcu_barrier() for this particular
situation? Why suggest such a thing?

The insertion of rcu stuff in the slab code will lead to future bugs since
now the slab logic is tied to the semantics of a particular rcu
implementatin.

> Only case User should do its rcu_barrier() itself is if it knows some
> call_rcu() are pending and are delaying _objects_ freeing (typical
> !SLAB_DESTROY_RCU usage in RCU algos).

Ok then the user already has to deal with the barriers. The API is
inconsistent if you put this into kmem_cache_destroy.

> I dont even understand why you care so much about
> kmem_cache_destroy(SLAB_DESTROY_BY_RCU), given that almost nobody use
> it. We took almost one month to find out what the bug was in first
> place...

This is already the second bug on this issue. Given the complexity of rcu
it is to be experted that inserting more RCU semantics into the slab
allocators is likely to cause future chains of new features and
bugs in slab allocators.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-03 22:17                                                     ` Eric Dumazet
@ 2009-09-04 15:39                                                       ` Christoph Lameter
  2009-09-04 20:42                                                       ` Paul E. McKenney
  1 sibling, 0 replies; 50+ messages in thread
From: Christoph Lameter @ 2009-09-04 15:39 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Paul E. McKenney, Pekka Enberg, Zdenek Kabelac, Patrick McHardy,
	Robin Holt, Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

On Fri, 4 Sep 2009, Eric Dumazet wrote:

> And another one for SLAB_DESTROY_BY_RCU to make sure slabs where freed

Now its two rcu_barriers(). What I feared comes true.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-03 23:17                                                   ` Paul E. McKenney
@ 2009-09-04 15:42                                                     ` Christoph Lameter
  2009-09-04 20:43                                                       ` Paul E. McKenney
  0 siblings, 1 reply; 50+ messages in thread
From: Christoph Lameter @ 2009-09-04 15:42 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Eric Dumazet, Pekka Enberg, Zdenek Kabelac, Patrick McHardy,
	Robin Holt, Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

On Thu, 3 Sep 2009, Paul E. McKenney wrote:

> If it were the user of the slab who was invoking some variant of
> call_rcu(), then I would agree with you.

The user already has to deal with it as explained by Eric.

> However, call_rcu() is instead being invoked by the slab itself in the
> case of SLAB_DESTROY_BY_RCU, so that there is no variation in usage.
> Requiring that the user call rcu_barrier() is asking for subtle bugs.
> Therefore, the best approach is to have kmem_cache_destroy() handle
> the RCU cleanup, given that this cleanup is for actions taken by
> kmem_cache_free(), not by the user.

The user already has to properly handle barriers and rcu logic in order to
use objects handled with RCU properly. Moreover the user even has to check
that the object is not suddenly checked under it. Its already complex.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-03 22:17                                                     ` Eric Dumazet
  2009-09-04 15:39                                                       ` Christoph Lameter
@ 2009-09-04 20:42                                                       ` Paul E. McKenney
  1 sibling, 0 replies; 50+ messages in thread
From: Paul E. McKenney @ 2009-09-04 20:42 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Christoph Lameter, Pekka Enberg, Zdenek Kabelac, Patrick McHardy,
	Robin Holt, Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

On Fri, Sep 04, 2009 at 12:17:34AM +0200, Eric Dumazet wrote:
> Eric Dumazet a écrit :
> > 
> > 
> > 
> > Problem is not _objects_ Christoph, but _slabs_, and your patch is not working.
> > 
> > Its true that when User calls kmem_cache_destroy(), all _objects_ were previously freed.
> > This is mandatory, with or withou SLAB_DESTROY_BY_RCU thing
> > 
> > Problem is that slub has some internal state, including some to-be-freed _slabs_,
> > that User have no control at all on it.
> > 
> > User cannot "know" slabs are freed, inuse, or whatever state in cache or call_rcu queues.
> > 
> > Face it, SLAB_DESTROY_BY_RCU is internal affair (to slub/slab/... allocators)
> > 
> > We absolutely need a rcu_barrier() somewhere, believe it or not. You can argue that it should
> > be done *before*, but it gives no speedup, only potential bugs.
> > 
> > Only case User should do its rcu_barrier() itself is if it knows some call_rcu() are pending
> > and are delaying _objects_ freeing (typical !SLAB_DESTROY_RCU usage in RCU algos).
> > 
> > I dont even understand why you care so much about kmem_cache_destroy(SLAB_DESTROY_BY_RCU),
> > given that almost nobody use it. We took almost one month to find out what the bug was in first
> > place...
> 
> 
> So maybe the safest thing would be to include the rcu_barrier() to
> insure all objects where freed

I argue that the above is the user's responsibility.  That said, I don't
see why the user would pass a SLAB_DESTROY_BY_RCU object to call_rcu().
So I would want to see an example of this before inflicting a pair
of rcu_barrier() calls on kmem_cache_destroy().

> And another one for SLAB_DESTROY_BY_RCU to make sure slabs where freed

This last is I believe kmem_cache's responsibility.

							Thanx, Paul

> void kmem_cache_destroy(struct kmem_cache *s)
> {
> 	/*
> 	 * Make sure no objects are waiting in call_rcu queues to be freed
> 	 */
> 	rcu_barrier();
> 
> 	down_write(&slub_lock);
> 	s->refcount--;
> 	if (!s->refcount) {
>                 list_del(&s->list);
>                 up_write(&slub_lock);
>                 if (kmem_cache_close(s)) {
>                         printk(KERN_ERR "SLUB %s: %s called for cache that "
>                                 "still has objects.\n", s->name, __func__);
>                         dump_stack();
>                 }
> 		/*
> 		 * Make sure no slabs are waiting in call_rcu queues to be freed
> 		 */
>                 if (s->flags & SLAB_DESTROY_BY_RCU)
>                         rcu_barrier();
>                 sysfs_slab_remove(s);
>         } else
>                 up_write(&slub_lock);
> }
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-04 15:42                                                     ` Christoph Lameter
@ 2009-09-04 20:43                                                       ` Paul E. McKenney
  2009-09-08 19:57                                                         ` Christoph Lameter
  0 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2009-09-04 20:43 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Eric Dumazet, Pekka Enberg, Zdenek Kabelac, Patrick McHardy,
	Robin Holt, Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

On Fri, Sep 04, 2009 at 11:42:17AM -0400, Christoph Lameter wrote:
> On Thu, 3 Sep 2009, Paul E. McKenney wrote:
> 
> > If it were the user of the slab who was invoking some variant of
> > call_rcu(), then I would agree with you.
> 
> The user already has to deal with it as explained by Eric.

I didn't read his email that way.  Perhaps I misinterpreted it.

> > However, call_rcu() is instead being invoked by the slab itself in the
> > case of SLAB_DESTROY_BY_RCU, so that there is no variation in usage.
> > Requiring that the user call rcu_barrier() is asking for subtle bugs.
> > Therefore, the best approach is to have kmem_cache_destroy() handle
> > the RCU cleanup, given that this cleanup is for actions taken by
> > kmem_cache_free(), not by the user.
> 
> The user already has to properly handle barriers and rcu logic in order to
> use objects handled with RCU properly. Moreover the user even has to check
> that the object is not suddenly checked under it. Its already complex.

mm/slab.c has had RCU calls since 2.6.9, so this is not new.

> Guess we are doing this ... Sigh. Are you going to add support other rcu
> versions to slab as well as time permits and as the need arises? Should
> we add you as a maintainer? ;-)

I don't see any point in adding anything resembling SLAB_DESTROY_BY_RCU_BH,
SLAB_DESTROY_BY_RCU_SCHED, or SLAB_DESTROY_BY_SRCU unless and until
someone needs it.  And I most especially don't see the point of adding
(say) SLAB_DESTROY_BY_RCU_BH_SCHED until someone convinces me of the
need for it.  I would prefer to put the energy into further streamlining
the underlying RCU implementation, maybe someday collapsing RCU-BH back
into RCU.  ;-)

We have gotten along fine with only SLAB_DESTROY_BY_RCU for almost
five years, so I think we are plenty fine with what we have.  So, as
you say, "as the need arises".

I don't see any more need to add me as maintainer of slab and friends
than of btrfs, netfilter, selinux, decnet, afs, wireless, or any of a
number of other subsystems that use RCU.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-04 20:43                                                       ` Paul E. McKenney
@ 2009-09-08 19:57                                                         ` Christoph Lameter
  2009-09-08 22:20                                                           ` Paul E. McKenney
  0 siblings, 1 reply; 50+ messages in thread
From: Christoph Lameter @ 2009-09-08 19:57 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Eric Dumazet, Pekka Enberg, Zdenek Kabelac, Patrick McHardy,
	Robin Holt, Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

On Fri, 4 Sep 2009, Paul E. McKenney wrote:

> We have gotten along fine with only SLAB_DESTROY_BY_RCU for almost
> five years, so I think we are plenty fine with what we have.  So, as
> you say, "as the need arises".

These were the glory years where SLAB_DESTROY_BY_RCU was only used for
anonymous vmas. Now Eric has picked it up for the net subsystem. You may
see the RCU use proliferate.

The kmem_cache_destroy rcu barriers did not matter until
SLAB_DESTROY_BY_RCU spread.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-08 19:57                                                         ` Christoph Lameter
@ 2009-09-08 22:20                                                           ` Paul E. McKenney
  2009-09-08 22:41                                                             ` Christoph Lameter
  0 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2009-09-08 22:20 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Eric Dumazet, Pekka Enberg, Zdenek Kabelac, Patrick McHardy,
	Robin Holt, Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

On Tue, Sep 08, 2009 at 03:57:04PM -0400, Christoph Lameter wrote:
> On Fri, 4 Sep 2009, Paul E. McKenney wrote:
> 
> > We have gotten along fine with only SLAB_DESTROY_BY_RCU for almost
> > five years, so I think we are plenty fine with what we have.  So, as
> > you say, "as the need arises".
> 
> These were the glory years where SLAB_DESTROY_BY_RCU was only used for
> anonymous vmas. Now Eric has picked it up for the net subsystem. You may
> see the RCU use proliferate.
> 
> The kmem_cache_destroy rcu barriers did not matter until
> SLAB_DESTROY_BY_RCU spread.

Certainly it is true that increased use of RCU has resulted in new
requirements, which have in turn led to any number of changes over
the years.

Are you saying that people have already asked you for additional
variants of SLAB_DESTROY_BY_RCU?  If so, please don't keep them a secret!
Otherwise, experience indicates that it is best to wait for the new uses,
because they usually aren't quite what one might expect.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-08 22:20                                                           ` Paul E. McKenney
@ 2009-09-08 22:41                                                             ` Christoph Lameter
  2009-09-08 22:59                                                               ` Paul E. McKenney
  0 siblings, 1 reply; 50+ messages in thread
From: Christoph Lameter @ 2009-09-08 22:41 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Eric Dumazet, Pekka Enberg, Zdenek Kabelac, Patrick McHardy,
	Robin Holt, Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

On Tue, 8 Sep 2009, Paul E. McKenney wrote:

> Are you saying that people have already asked you for additional
> variants of SLAB_DESTROY_BY_RCU?  If so, please don't keep them a secret!
> Otherwise, experience indicates that it is best to wait for the new uses,
> because they usually aren't quite what one might expect.

No direct request but I have seen the network developers discover these
features and their caching benefits over the last year. It is likely that
they will try to push it into more components of the net subsystem.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-08 22:41                                                             ` Christoph Lameter
@ 2009-09-08 22:59                                                               ` Paul E. McKenney
  2009-09-09 14:04                                                                 ` Christoph Lameter
  0 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2009-09-08 22:59 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Eric Dumazet, Pekka Enberg, Zdenek Kabelac, Patrick McHardy,
	Robin Holt, Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

On Tue, Sep 08, 2009 at 06:41:14PM -0400, Christoph Lameter wrote:
> On Tue, 8 Sep 2009, Paul E. McKenney wrote:
> 
> > Are you saying that people have already asked you for additional
> > variants of SLAB_DESTROY_BY_RCU?  If so, please don't keep them a secret!
> > Otherwise, experience indicates that it is best to wait for the new uses,
> > because they usually aren't quite what one might expect.
> 
> No direct request but I have seen the network developers discover these
> features and their caching benefits over the last year. It is likely that
> they will try to push it into more components of the net subsystem.

So if they push it far enough, they might well decide that they need
a SLAB_DESTROY_BY_RCU_BH, for example.  Looks like seven bits left,
so unless I am missing something, should not be a huge problem should
this need arise.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-08 22:59                                                               ` Paul E. McKenney
@ 2009-09-09 14:04                                                                 ` Christoph Lameter
  2009-09-09 14:42                                                                   ` Paul E. McKenney
  0 siblings, 1 reply; 50+ messages in thread
From: Christoph Lameter @ 2009-09-09 14:04 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Eric Dumazet, Pekka Enberg, Zdenek Kabelac, Patrick McHardy,
	Robin Holt, Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers

On Tue, 8 Sep 2009, Paul E. McKenney wrote:

> > No direct request but I have seen the network developers discover these
> > features and their caching benefits over the last year. It is likely that
> > they will try to push it into more components of the net subsystem.
>
> So if they push it far enough, they might well decide that they need
> a SLAB_DESTROY_BY_RCU_BH, for example.  Looks like seven bits left,
> so unless I am missing something, should not be a huge problem should
> this need arise.

I'd rather have the call_rcu in the slabs replaced by a function that
can be set by the user. Then we can remove all rcu barriers from the code
and the slabs could be used with any type of rcu functionality.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-09 14:04                                                                 ` Christoph Lameter
@ 2009-09-09 14:42                                                                   ` Paul E. McKenney
  2009-09-09 14:53                                                                     ` Christoph Lameter
  0 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2009-09-09 14:42 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Eric Dumazet, Pekka Enberg, Zdenek Kabelac, Patrick McHardy,
	Robin Holt, Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers, dhowells, lethal, kernel,
	mpm

On Wed, Sep 09, 2009 at 10:04:20AM -0400, Christoph Lameter wrote:
> On Tue, 8 Sep 2009, Paul E. McKenney wrote:
> 
> > > No direct request but I have seen the network developers discover these
> > > features and their caching benefits over the last year. It is likely that
> > > they will try to push it into more components of the net subsystem.
> >
> > So if they push it far enough, they might well decide that they need
> > a SLAB_DESTROY_BY_RCU_BH, for example.  Looks like seven bits left,
> > so unless I am missing something, should not be a huge problem should
> > this need arise.
> 
> I'd rather have the call_rcu in the slabs replaced by a function that
> can be set by the user. Then we can remove all rcu barriers from the code
> and the slabs could be used with any type of rcu functionality.

If the embedded guys are OK with an additional pointer in the slab data
structure, I have no objection to this approach.  I am assuming that we
would use the usual ops-style structure full of pointers to functions
in order to avoid a pair of extra pointers.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-09 14:42                                                                   ` Paul E. McKenney
@ 2009-09-09 14:53                                                                     ` Christoph Lameter
  2009-09-09 15:09                                                                       ` Paul E. McKenney
  0 siblings, 1 reply; 50+ messages in thread
From: Christoph Lameter @ 2009-09-09 14:53 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Eric Dumazet, Pekka Enberg, Zdenek Kabelac, Patrick McHardy,
	Robin Holt, Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers, dhowells, lethal, kernel,
	mpm

On Wed, 9 Sep 2009, Paul E. McKenney wrote:

> If the embedded guys are OK with an additional pointer in the slab data
> structure, I have no objection to this approach.  I am assuming that we
> would use the usual ops-style structure full of pointers to functions
> in order to avoid a pair of extra pointers.

This would also allow us to get rid of the ctor pointer in
kmem_cache_create(). We can put it into the same structure.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH] slub: fix slab_pad_check()
  2009-09-09 14:53                                                                     ` Christoph Lameter
@ 2009-09-09 15:09                                                                       ` Paul E. McKenney
  0 siblings, 0 replies; 50+ messages in thread
From: Paul E. McKenney @ 2009-09-09 15:09 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Eric Dumazet, Pekka Enberg, Zdenek Kabelac, Patrick McHardy,
	Robin Holt, Linux Kernel Mailing List, Jesper Dangaard Brouer,
	Linux Netdev List, Netfilter Developers, dhowells, lethal, kernel,
	mpm

On Wed, Sep 09, 2009 at 10:53:22AM -0400, Christoph Lameter wrote:
> On Wed, 9 Sep 2009, Paul E. McKenney wrote:
> 
> > If the embedded guys are OK with an additional pointer in the slab data
> > structure, I have no objection to this approach.  I am assuming that we
> > would use the usual ops-style structure full of pointers to functions
> > in order to avoid a pair of extra pointers.
> 
> This would also allow us to get rid of the ctor pointer in
> kmem_cache_create(). We can put it into the same structure.

Sounds good -- that would result in no increase in size.  Could even
add a dtor if people want it.  ;-)  (Sorry, couldn't resist...)

						Thanx, Paul

^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2009-09-09 15:09 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <c4e36d110908110542l3de51aaepcbd62c84b9848f2b@mail.gmail.com>
     [not found] ` <alpine.DEB.1.10.0908111032110.30494@gentwo.org>
     [not found]   ` <c4e36d110908110752x253e30epb00cc71f4683052b@mail.gmail.com>
     [not found]     ` <alpine.DEB.1.10.0908111056550.30494@gentwo.org>
     [not found]       ` <c4e36d110908110832y21fba830ub8804613df571228@mail.gmail.com>
     [not found]         ` <20090811154853.GF2763@sgi.com>
     [not found]           ` <c4e36d110908111410y29b922ceod6871fda2514f6e6@mail.gmail.com>
     [not found]             ` <c4e36d110908121516u504809e9y537e9babfa95df1d@mail.gmail.com>
     [not found]               ` <alpine.DEB.1.10.0908121820410.3257@gentwo.org>
     [not found]                 ` <c4e36d110908131009l78b29bffvb4d41b90f9b83288@mail.gmail.com>
     [not found]                   ` <c4e36d110908140233v59421ba6y82192b858210370d@mail.gmail.com>
2009-08-16  9:16                     ` System freeze on reboot - general protection fault Eric Dumazet
2009-08-17 14:03                       ` Patrick McHardy
2009-09-02 21:45                         ` Zdenek Kabelac
2009-09-02 22:17                           ` Eric Dumazet
2009-09-02 22:31                             ` Zdenek Kabelac
2009-09-03  1:04                               ` [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU Eric Dumazet
2009-09-03  6:31                                 ` Pekka Enberg
2009-09-03  7:38                                   ` Eric Dumazet
2009-09-03  7:51                                     ` Pekka Enberg
2009-09-03 17:50                                       ` Christoph Lameter
2009-09-03 14:05                                         ` Pekka Enberg
2009-09-03 14:18                                           ` [PATCH] slub: Fix kmem_cache_destroy() with SLAB_DESTROY_BY_RCU Eric Dumazet
2009-09-03 19:48                                             ` Pekka Enberg
2009-09-03 19:56                                               ` Eric Dumazet
2009-09-03 17:45                                     ` [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU Christoph Lameter
2009-09-03 14:08                                       ` [PATCH] slub: fix slab_pad_check() Eric Dumazet
2009-09-03 18:38                                         ` Christoph Lameter
2009-09-03 15:01                                           ` Paul E. McKenney
2009-09-03 15:02                                           ` Eric Dumazet
2009-09-03 19:24                                             ` Christoph Lameter
2009-09-03 17:44                                               ` Paul E. McKenney
2009-09-03 22:43                                                 ` Christoph Lameter
2009-09-03 22:03                                                   ` Paul E. McKenney
2009-09-04 15:33                                                     ` Christoph Lameter
2009-09-03 22:08                                                   ` Eric Dumazet
2009-09-03 22:17                                                     ` Eric Dumazet
2009-09-04 15:39                                                       ` Christoph Lameter
2009-09-04 20:42                                                       ` Paul E. McKenney
2009-09-04 15:38                                                     ` Christoph Lameter
2009-09-03 17:59                                               ` Eric Dumazet
2009-09-03 19:00                                                 ` Pekka Enberg
2009-09-03 22:44                                                 ` Christoph Lameter
2009-09-03 23:17                                                   ` Paul E. McKenney
2009-09-04 15:42                                                     ` Christoph Lameter
2009-09-04 20:43                                                       ` Paul E. McKenney
2009-09-08 19:57                                                         ` Christoph Lameter
2009-09-08 22:20                                                           ` Paul E. McKenney
2009-09-08 22:41                                                             ` Christoph Lameter
2009-09-08 22:59                                                               ` Paul E. McKenney
2009-09-09 14:04                                                                 ` Christoph Lameter
2009-09-09 14:42                                                                   ` Paul E. McKenney
2009-09-09 14:53                                                                     ` Christoph Lameter
2009-09-09 15:09                                                                       ` Paul E. McKenney
2009-09-03 19:34                                         ` Pekka Enberg
2009-09-03 15:00                                       ` [PATCH] slub: fix slab_pad_check() and SLAB_DESTROY_BY_RCU Paul E. McKenney
2009-09-03 13:42                                   ` Paul E. McKenney
2009-09-03 13:28                                 ` Zdenek Kabelac
2009-09-03 13:46                                   ` Eric Dumazet
2009-09-03 14:35                                     ` Zdenek Kabelac
2009-09-03 18:17                             ` System freeze on reboot - general protection fault Paul E. McKenney

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).