* namespaces?: bug at mm/slub.c:2750
@ 2009-02-06 11:35 Eric Sesterhenn
2009-02-07 0:15 ` Andrew Morton
0 siblings, 1 reply; 14+ messages in thread
From: Eric Sesterhenn @ 2009-02-06 11:35 UTC (permalink / raw)
To: linux-kernel
Hi,
with todays -git i get the following bug when i reboot the machine
[ 94.369135] ------------[ cut here ]------------
[ 94.369344] kernel BUG at mm/slub.c:2750!
[ 94.369463] invalid opcode: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
[ 94.369828] last sysfs file: /sys/devices/pnp0/00:0c/id
[ 94.369952] Modules linked in:
[ 94.370035]
[ 94.370035] Pid: 0, comm: swapper Not tainted
(2.6.29-rc3-00653-g8285bbf #240) System Name
[ 94.370035] EIP: 0060:[<c018340b>] EFLAGS: 00010246 CPU: 0
[ 94.370035] EIP is at kfree+0x3c/0xe4
[ 94.370035] EAX: 00000400 EBX: c1139000 ECX: 00000000 EDX: c09fb204
[ 94.370035] ESI: c114cf60 EDI: c09fb204 EBP: c0ab0f94 ESP: c0ab0f80
[ 94.370035] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
[ 94.370035] Process swapper (pid: 0, ti=c0ab0000 task=c09f23ac
task.ti=c0a50000)
[ 94.370035] Stack:
[ 94.370035] c0ab0f94 00000286 c09fb204 c014e784 00000020 c0ab0fa0
c014e79c c09fb204
[ 94.370035] c0ab0fb0 c04dca32 cede1240 00000282 c0ab0fc4 c01297d1
c012c8aa cf9e4dc0
[ 94.370035] cf9e4e20 c0ab0fd0 c0135b0d 00000000 c0ab0fe0 c015a783
00000100 00000001
[ 94.370035] Call Trace:
[ 94.370035] [<c014e784>] ? free_user_ns+0x0/0x1b
[ 94.370035] [<c014e79c>] ? free_user_ns+0x18/0x1b
[ 94.370035] [<c04dca32>] ? kref_put+0x3b/0x49
[ 94.370035] [<c01297d1>] ? free_uid+0x49/0xa4
[ 94.370035] [<c012c8aa>] ? groups_free+0x31/0x35
[ 94.370035] [<c0135b0d>] ? put_cred_rcu+0x52/0x63
[ 94.370035] [<c015a783>] ? rcu_process_callbacks+0x60/0x74
[ 94.370035] [<c0125af2>] ? __do_softirq+0x6a/0xf1
[ 94.370035] [<c0125a88>] ? __do_softirq+0x0/0xf1
[ 94.370035] <IRQ> <0> [<c01598a8>] ? handle_level_irq+0x0/0xbc
[ 94.370035] [<c0125a0a>] ? irq_exit+0x3b/0x77
[ 94.370035] [<c010411d>] ? do_IRQ+0xdc/0xf3
[ 94.370035] [<c010340c>] ? common_interrupt+0x2c/0x34
[ 94.370035] [<c013007b>] ? param_array_set+0x18/0xc6
[ 94.370035] [<c051f5d2>] ? acpi_idle_enter_simple+0x167/0x1d1
[ 94.370035] [<c062c69d>] ? cpuidle_idle_call+0x57/0x8e
[ 94.370035] [<c01019eb>] ? cpu_idle+0x59/0x86
[ 94.370035] [<c077a585>] ? rest_init+0x61/0x63
[ 94.370035] Code: 00 00 00 8b 1d e4 0f 03 c1 e8 fb f3 f8 ff c1 e8 0c
c1 e0 05 8d 34 03 f6 46 01 40 74 03 8b 76 0c 8b 06 84 c0 78 15 f6 c4 60
75 04 <0f> 0b eb fe 89 f0 e8 90 7a fe ff e9 90 00 00 00 8b 45 04 89 45
[ 94.370035] EIP: [<c018340b>] kfree+0x3c/0xe4 SS:ESP 0068:c0ab0f80
[ 94.383497] ---[ end trace 4779637014de8de6 ]---
[ 94.383620] Kernel panic - not syncing: Fatal exception in interrupt
The bug triggered is BUG_ON(!PageCompound(page)) in kfree();
(gdb) l *(free_user_ns+0x0)
0xc014e784 is in free_user_ns (kernel/user_namespace.c:64).
59
60 return 0;
61 }
62
63 void free_user_ns(struct kref *kref)
64 {
65 struct user_namespace *ns;
66
67 ns = container_of(kref, struct user_namespace, kref);
68 free_uid(ns->creator);
(gdb) l *(groups_free)
0xc012c879 is in groups_free (kernel/sys.c:1160).
1155 }
1156
1157 EXPORT_SYMBOL(groups_alloc);
1158
1159 void groups_free(struct group_info *group_info)
1160 {
1161 if (group_info->blocks[0] != group_info->small_block) {
1162 int i;
1163 for (i = 0; i < group_info->nblocks; i++)
1164 free_page((unsigned
long)group_info->blocks[i]);
Greetings, Eric
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: namespaces?: bug at mm/slub.c:2750 2009-02-06 11:35 namespaces?: bug at mm/slub.c:2750 Eric Sesterhenn @ 2009-02-07 0:15 ` Andrew Morton 2009-02-09 11:53 ` Eric Sesterhenn 2009-02-11 7:55 ` Vegard Nossum 0 siblings, 2 replies; 14+ messages in thread From: Andrew Morton @ 2009-02-07 0:15 UTC (permalink / raw) To: Eric Sesterhenn; +Cc: linux-kernel On Fri, 6 Feb 2009 12:35:56 +0100 Eric Sesterhenn <snakebyte@gmx.de> wrote: > Hi, > > with todays -git i get the following bug when i reboot the machine > > [ 94.369135] ------------[ cut here ]------------ > [ 94.369344] kernel BUG at mm/slub.c:2750! > [ 94.369463] invalid opcode: 0000 [#1] PREEMPT DEBUG_PAGEALLOC > [ 94.369828] last sysfs file: /sys/devices/pnp0/00:0c/id > [ 94.369952] Modules linked in: > [ 94.370035] > [ 94.370035] Pid: 0, comm: swapper Not tainted > (2.6.29-rc3-00653-g8285bbf #240) System Name > [ 94.370035] EIP: 0060:[<c018340b>] EFLAGS: 00010246 CPU: 0 > [ 94.370035] EIP is at kfree+0x3c/0xe4 > [ 94.370035] EAX: 00000400 EBX: c1139000 ECX: 00000000 EDX: c09fb204 > [ 94.370035] ESI: c114cf60 EDI: c09fb204 EBP: c0ab0f94 ESP: c0ab0f80 > [ 94.370035] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 > [ 94.370035] Process swapper (pid: 0, ti=c0ab0000 task=c09f23ac > task.ti=c0a50000) > [ 94.370035] Stack: > [ 94.370035] c0ab0f94 00000286 c09fb204 c014e784 00000020 c0ab0fa0 > c014e79c c09fb204 > [ 94.370035] c0ab0fb0 c04dca32 cede1240 00000282 c0ab0fc4 c01297d1 > c012c8aa cf9e4dc0 > [ 94.370035] cf9e4e20 c0ab0fd0 c0135b0d 00000000 c0ab0fe0 c015a783 > 00000100 00000001 > [ 94.370035] Call Trace: > [ 94.370035] [<c014e784>] ? free_user_ns+0x0/0x1b > [ 94.370035] [<c014e79c>] ? free_user_ns+0x18/0x1b > [ 94.370035] [<c04dca32>] ? kref_put+0x3b/0x49 > [ 94.370035] [<c01297d1>] ? free_uid+0x49/0xa4 > [ 94.370035] [<c012c8aa>] ? groups_free+0x31/0x35 > [ 94.370035] [<c0135b0d>] ? put_cred_rcu+0x52/0x63 > [ 94.370035] [<c015a783>] ? rcu_process_callbacks+0x60/0x74 > [ 94.370035] [<c0125af2>] ? __do_softirq+0x6a/0xf1 > [ 94.370035] [<c0125a88>] ? __do_softirq+0x0/0xf1 > [ 94.370035] <IRQ> <0> [<c01598a8>] ? handle_level_irq+0x0/0xbc > [ 94.370035] [<c0125a0a>] ? irq_exit+0x3b/0x77 > [ 94.370035] [<c010411d>] ? do_IRQ+0xdc/0xf3 > [ 94.370035] [<c010340c>] ? common_interrupt+0x2c/0x34 > [ 94.370035] [<c013007b>] ? param_array_set+0x18/0xc6 > [ 94.370035] [<c051f5d2>] ? acpi_idle_enter_simple+0x167/0x1d1 > [ 94.370035] [<c062c69d>] ? cpuidle_idle_call+0x57/0x8e > [ 94.370035] [<c01019eb>] ? cpu_idle+0x59/0x86 > [ 94.370035] [<c077a585>] ? rest_init+0x61/0x63 > [ 94.370035] Code: 00 00 00 8b 1d e4 0f 03 c1 e8 fb f3 f8 ff c1 e8 0c > c1 e0 05 8d 34 03 f6 46 01 40 74 03 8b 76 0c 8b 06 84 c0 78 15 f6 c4 60 > 75 04 <0f> 0b eb fe 89 f0 e8 90 7a fe ff e9 90 00 00 00 8b 45 04 89 45 > [ 94.370035] EIP: [<c018340b>] kfree+0x3c/0xe4 SS:ESP 0068:c0ab0f80 > [ 94.383497] ---[ end trace 4779637014de8de6 ]--- > [ 94.383620] Kernel panic - not syncing: Fatal exception in interrupt > > The bug triggered is BUG_ON(!PageCompound(page)) in kfree(); > > (gdb) l *(free_user_ns+0x0) > 0xc014e784 is in free_user_ns (kernel/user_namespace.c:64). > 59 > 60 return 0; > 61 } > 62 > 63 void free_user_ns(struct kref *kref) > 64 { > 65 struct user_namespace *ns; > 66 > 67 ns = container_of(kref, struct user_namespace, kref); > 68 free_uid(ns->creator); > (gdb) l *(groups_free) > 0xc012c879 is in groups_free (kernel/sys.c:1160). > 1155 } > 1156 > 1157 EXPORT_SYMBOL(groups_alloc); > 1158 > 1159 void groups_free(struct group_info *group_info) > 1160 { > 1161 if (group_info->blocks[0] != group_info->small_block) { > 1162 int i; > 1163 for (i = 0; i < group_info->nblocks; i++) > 1164 free_page((unsigned > long)group_info->blocks[i]); > Well that's ugly. We seem to have passed a non-slab address into kfree(). void kfree(const void *x) { struct page *page; void *object = (void *)x; if (unlikely(ZERO_OR_NULL_PTR(x))) return; page = virt_to_head_page(x); if (unlikely(!PageSlab(page))) { BUG_ON(!PageCompound(page)); doing BUG_ON(!PageCompound) is a rather odd way of reporting that. I'm unsure what could have caused this. Could you have a play around please? Set all the memory debug options, try using slab instead of slub, etc? ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: namespaces?: bug at mm/slub.c:2750 2009-02-07 0:15 ` Andrew Morton @ 2009-02-09 11:53 ` Eric Sesterhenn 2009-02-11 7:55 ` Vegard Nossum 1 sibling, 0 replies; 14+ messages in thread From: Eric Sesterhenn @ 2009-02-09 11:53 UTC (permalink / raw) To: Andrew Morton; +Cc: linux-kernel * Andrew Morton (akpm@linux-foundation.org) wrote: > On Fri, 6 Feb 2009 12:35:56 +0100 > Eric Sesterhenn <snakebyte@gmx.de> wrote: > > > Hi, > > > > with todays -git i get the following bug when i reboot the machine > > > > [ 94.369135] ------------[ cut here ]------------ > > [ 94.369344] kernel BUG at mm/slub.c:2750! > > [ 94.369463] invalid opcode: 0000 [#1] PREEMPT DEBUG_PAGEALLOC > > [ 94.369828] last sysfs file: /sys/devices/pnp0/00:0c/id > > [ 94.369952] Modules linked in: > > [ 94.370035] > > [ 94.370035] Pid: 0, comm: swapper Not tainted > > (2.6.29-rc3-00653-g8285bbf #240) System Name > > [ 94.370035] EIP: 0060:[<c018340b>] EFLAGS: 00010246 CPU: 0 > > [ 94.370035] EIP is at kfree+0x3c/0xe4 > > [ 94.370035] EAX: 00000400 EBX: c1139000 ECX: 00000000 EDX: c09fb204 > > [ 94.370035] ESI: c114cf60 EDI: c09fb204 EBP: c0ab0f94 ESP: c0ab0f80 > > [ 94.370035] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 > > [ 94.370035] Process swapper (pid: 0, ti=c0ab0000 task=c09f23ac > > task.ti=c0a50000) > > [ 94.370035] Stack: > > [ 94.370035] c0ab0f94 00000286 c09fb204 c014e784 00000020 c0ab0fa0 > > c014e79c c09fb204 > > [ 94.370035] c0ab0fb0 c04dca32 cede1240 00000282 c0ab0fc4 c01297d1 > > c012c8aa cf9e4dc0 > > [ 94.370035] cf9e4e20 c0ab0fd0 c0135b0d 00000000 c0ab0fe0 c015a783 > > 00000100 00000001 > > [ 94.370035] Call Trace: > > [ 94.370035] [<c014e784>] ? free_user_ns+0x0/0x1b > > [ 94.370035] [<c014e79c>] ? free_user_ns+0x18/0x1b > > [ 94.370035] [<c04dca32>] ? kref_put+0x3b/0x49 > > [ 94.370035] [<c01297d1>] ? free_uid+0x49/0xa4 > > [ 94.370035] [<c012c8aa>] ? groups_free+0x31/0x35 > > [ 94.370035] [<c0135b0d>] ? put_cred_rcu+0x52/0x63 > > [ 94.370035] [<c015a783>] ? rcu_process_callbacks+0x60/0x74 > > [ 94.370035] [<c0125af2>] ? __do_softirq+0x6a/0xf1 > > [ 94.370035] [<c0125a88>] ? __do_softirq+0x0/0xf1 > > [ 94.370035] <IRQ> <0> [<c01598a8>] ? handle_level_irq+0x0/0xbc > > [ 94.370035] [<c0125a0a>] ? irq_exit+0x3b/0x77 > > [ 94.370035] [<c010411d>] ? do_IRQ+0xdc/0xf3 > > [ 94.370035] [<c010340c>] ? common_interrupt+0x2c/0x34 > > [ 94.370035] [<c013007b>] ? param_array_set+0x18/0xc6 > > [ 94.370035] [<c051f5d2>] ? acpi_idle_enter_simple+0x167/0x1d1 > > [ 94.370035] [<c062c69d>] ? cpuidle_idle_call+0x57/0x8e > > [ 94.370035] [<c01019eb>] ? cpu_idle+0x59/0x86 > > [ 94.370035] [<c077a585>] ? rest_init+0x61/0x63 > > [ 94.370035] Code: 00 00 00 8b 1d e4 0f 03 c1 e8 fb f3 f8 ff c1 e8 0c > > c1 e0 05 8d 34 03 f6 46 01 40 74 03 8b 76 0c 8b 06 84 c0 78 15 f6 c4 60 > > 75 04 <0f> 0b eb fe 89 f0 e8 90 7a fe ff e9 90 00 00 00 8b 45 04 89 45 > > [ 94.370035] EIP: [<c018340b>] kfree+0x3c/0xe4 SS:ESP 0068:c0ab0f80 > > [ 94.383497] ---[ end trace 4779637014de8de6 ]--- > > [ 94.383620] Kernel panic - not syncing: Fatal exception in interrupt > > > > The bug triggered is BUG_ON(!PageCompound(page)) in kfree(); > > > > (gdb) l *(free_user_ns+0x0) > > 0xc014e784 is in free_user_ns (kernel/user_namespace.c:64). > > 59 > > 60 return 0; > > 61 } > > 62 > > 63 void free_user_ns(struct kref *kref) > > 64 { > > 65 struct user_namespace *ns; > > 66 > > 67 ns = container_of(kref, struct user_namespace, kref); > > 68 free_uid(ns->creator); > > (gdb) l *(groups_free) > > 0xc012c879 is in groups_free (kernel/sys.c:1160). > > 1155 } > > 1156 > > 1157 EXPORT_SYMBOL(groups_alloc); > > 1158 > > 1159 void groups_free(struct group_info *group_info) > > 1160 { > > 1161 if (group_info->blocks[0] != group_info->small_block) { > > 1162 int i; > > 1163 for (i = 0; i < group_info->nblocks; i++) > > 1164 free_page((unsigned > > long)group_info->blocks[i]); > > > > Well that's ugly. We seem to have passed a non-slab address into > kfree(). > > void kfree(const void *x) > { > struct page *page; > void *object = (void *)x; > > if (unlikely(ZERO_OR_NULL_PTR(x))) > return; > > page = virt_to_head_page(x); > if (unlikely(!PageSlab(page))) { > BUG_ON(!PageCompound(page)); > > > doing BUG_ON(!PageCompound) is a rather odd way of reporting that. > > I'm unsure what could have caused this. Could you have a play around > please? Set all the memory debug options, try using slab instead of > slub, etc Short story, I cant reproduce this anymore. Long story: I tried bisecting this. I tested each kernel with at least two reboots, the ones marked bad paniced on both, all the good ones rebooted cleanly. root@whiterabbit:/usr/src/linux# git-bisect log git-bisect start # good: [97499aafffd20f018d719acc5ed73cf65705784d] Merge branch 'master' of # git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 git-bisect good 97499aafffd20f018d719acc5ed73cf65705784d # bad: [695874d11965eee158e9bf45807a7a2db3f652c9] Merge branch 'master' of # git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 git-bisect bad 695874d11965eee158e9bf45807a7a2db3f652c9 # good: [67e70baf043cfdcdaf5972bc94be82632071536b] V4L/DVB (10411): s5h1409: # Perform s5h1409 soft reset after tuning git-bisect good 67e70baf043cfdcdaf5972bc94be82632071536b # good: [5c350d93ff4736086a1b08fef1d0b5e22138d2e0] Merge branch 'for-linus' of # git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc git-bisect good 5c350d93ff4736086a1b08fef1d0b5e22138d2e0 # bad: [93bfbd71db4d2e01c05e219f285249a74808b1d4] Merge branch 'merge' of # git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc git-bisect bad 93bfbd71db4d2e01c05e219f285249a74808b1d4 # bad: [31c952dcf83d5b0fd57b514cbe8a1664647c26e7] Merge branch # 'sched-fixes-for-linus' of # git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip git-bisect bad 31c952dcf83d5b0fd57b514cbe8a1664647c26e7 # good: [a9f3e2b549f83a9cdab873abf4140be27c05a3f2] sched: clear buddies more # aggressively git-bisect good a9f3e2b549f83a9cdab873abf4140be27c05a3f2 # good: [3d398703ef06fd97b4c28c86b580546d5b57e7b7] sched_rt: don't use # first_cpu on cpumask created with cpumask_and git-bisect good 3d398703ef06fd97b4c28c86b580546d5b57e7b7 # good: [9e6235e997bf091326b2f3ac92217c2ac2e27eb5] Merge branch 'for_linus' of # git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6 git-bisect good 9e6235e997bf091326b2f3ac92217c2ac2e27eb5 root@whiterabbit:/usr/src/linux# git-bisect good 31c952dcf83d5b0fd57b514cbe8a1664647c26e7 is first bad commit root@whiterabbit:/usr/src/linux# git-show 31c952dcf83d5b0fd57b514cbe8a1664647c26e7 commit 31c952dcf83d5b0fd57b514cbe8a1664647c26e7 Merge: 9e6235e... 3d39870... Author: Linus Torvalds <torvalds@linux-foundation.org> Date: Mon Feb 2 19:26:29 2009 -0800 Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched_rt: don't use first_cpu on cpumask created with cpumask_and sched: fix buddie group latency sched: clear buddies more aggressively sched: symmetric sync vs avg_overlap sched: fix sync wakeups cpuset: fix possible deadlock in async_rebuild_sched_domains This seemed bogus to me, as far as i understand git, the merge commits do not actually change the code, they only server as a reference that other commits where pulled in. Tried a git-reset --hard 31c952dcf83d5b0fd57b514cbe8a1664647c26e7 to get to the always failing version again. Which turned out to fail in the last of three reboots... :| With dmesg -n 8 I didnt get more information out of this. I enabled slab on this 2.6.29-rc3-00410-g31c952d and tested 10 reboots. None showed anything strange :( After this I tried with SLOB, but it often hang during boot after [ 0.762297] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253) [ 0.762556] io scheduler noop registered [ 0.762662] io scheduler cfq registered (default) [ 0.767357] PCI: VIA PCI bridge detected. Disabling DAC. So I checked out a fresh tree (2.6.29-rc3-00742-ge83102c) and build it with the same config and SLUB enabled again. Just to see if this was still reproducible. 16 reboots later I didnt see this again. I then tested 2.6.29-rc4-00001-gd5b5623 but saw nothing in 13 reboots. Noticed CONFIG_SLUB_DEBUG_ON=n and set it to y, tried another 12 boots and nothing happened. Based on this I assume this was just a somehow miscompiled kernel since being really really hard to reproduce on some kernels, while always reproducible on others sounds somewhat strange. Greetings, Eric ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: namespaces?: bug at mm/slub.c:2750 2009-02-07 0:15 ` Andrew Morton 2009-02-09 11:53 ` Eric Sesterhenn @ 2009-02-11 7:55 ` Vegard Nossum 2009-02-11 8:07 ` Andrew Morton 1 sibling, 1 reply; 14+ messages in thread From: Vegard Nossum @ 2009-02-11 7:55 UTC (permalink / raw) To: Andrew Morton; +Cc: Eric Sesterhenn, linux-kernel On Sat, Feb 7, 2009 at 1:15 AM, Andrew Morton <akpm@linux-foundation.org> wrote: > On Fri, 6 Feb 2009 12:35:56 +0100 > Eric Sesterhenn <snakebyte@gmx.de> wrote: >> >> with todays -git i get the following bug when i reboot the machine >> >> [ 94.369135] ------------[ cut here ]------------ >> [ 94.369344] kernel BUG at mm/slub.c:2750! [...] > Well that's ugly. We seem to have passed a non-slab address into > kfree(). > > void kfree(const void *x) > { > struct page *page; > void *object = (void *)x; > > if (unlikely(ZERO_OR_NULL_PTR(x))) > return; > > page = virt_to_head_page(x); > if (unlikely(!PageSlab(page))) { > BUG_ON(!PageCompound(page)); > > > doing BUG_ON(!PageCompound) is a rather odd way of reporting that. > This might be pointing out the obvious, but page allocator pass-through means that we can't really tell slab pages from page allocator pages. But since pass-through uses GFP_COMP, we know it'd be a bug to pass a non-compound page into this function. > I'm unsure what could have caused this. Could you have a play around > please? Set all the memory debug options, try using slab instead of > slub, etc? I think it is trying to free the init_user_ns (because of a refcounting bug). I am able to reproduce it with a clean stack trace: [ 648.864710] ------------[ cut here ]------------ [ 648.865554] kernel BUG at mm/slub.c:2750! [ 648.865554] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC [ 648.865554] last sysfs file: /sys/devices/pnp0/00:0d/id [ 648.865554] CPU 1 [ 648.865554] Modules linked in: [ 648.865554] Pid: 917, comm: udevd Not tainted 2.6.29-rc3 #223 [ 648.865554] RIP: 0010:[<ffffffff810b99c9>] [<ffffffff810b99c9>] kfree+0x29/0x87 [ 648.865554] RSP: 0018:ffff88003f187e28 EFLAGS: 00010246 [ 648.865554] RAX: 0100000000000400 RBX: ffffffff8171fe00 RCX: 0000000000000086 [ 648.865554] RDX: ffffe20000050ec8 RSI: 0000000000000085 RDI: ffffe20000050ec8 [ 648.865554] RBP: ffff88003f187e38 R08: 0000000000000000 R09: ffffffff81819cb0 [ 648.865554] R10: 0000000000000002 R11: ffff88003f187e98 R12: ffffffff81072144 [ 648.865554] R13: 00000000000000cf R14: ffffffff818b13e0 R15: 000000000000000a [ 648.865554] FS: 0000000000000000(0000) GS:ffff88003f156f80(0063) knlGS:00000000f7fac700 [ 648.865554] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b [ 648.865554] CR2: 0000000009b2d630 CR3: 000000003e92c000 CR4: 00000000000006a0 [ 648.865554] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 648.865554] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 648.865554] Process udevd (pid: 917, threadinfo ffff88003e458000, task ffff88003f1b8e80) [ 648.865554] Stack: [ 648.865554] ffffffff8171fe00 ffffffff81072144 ffff88003f187e58 ffffffff81072169 [ 648.865554] ffffffff818b13e0 ffffffff8171fe00 ffff88003f187e78 ffffffff811b2f68 [ 648.865554] ffff88003dd4c500 0000000000000286 ffff88003f187e98 ffffffff81046712 [ 648.865554] Call Trace: [ 648.865554] <IRQ> <0> [<ffffffff81072144>] ? free_user_ns+0x0/0x29 [ 648.865554] [<ffffffff81072169>] free_user_ns+0x25/0x29 [ 648.865554] [<ffffffff811b2f68>] kref_put+0x66/0x72 [ 648.865554] [<ffffffff81046712>] free_uid+0x4f/0x90 [ 648.865554] [<ffffffff810555b8>] put_cred_rcu+0xa5/0xb9 [ 648.865554] [<ffffffff8107ea86>] __rcu_process_callbacks+0x1f4/0x263 [ 648.865554] [<ffffffff8107eb2c>] rcu_process_callbacks+0x37/0x5d [ 648.865554] [<ffffffff81041bd3>] __do_softirq+0x88/0x149 [ 648.865554] [<ffffffff8100d53c>] call_softirq+0x1c/0x28 [ 648.865554] [<ffffffff8100e4e9>] do_softirq+0x39/0x7b [ 648.865554] [<ffffffff81041946>] irq_exit+0x44/0x7e [ 648.865554] [<ffffffff814c3d8f>] smp_apic_timer_interrupt+0x98/0xb1 [ 648.865554] [<ffffffff8100cf73>] apic_timer_interrupt+0x13/0x20 [ 648.865554] <EOI> <0>Code: c9 c3 55 48 89 e5 41 54 53 0f 1f 44 00 00 48 83 ff 10 48 89 fb 7 6 6d e8 1d ed ff ff 48 89 c7 48 8b 00 84 c0 78 10 f6 c4 60 75 04 <0f> 0b eb fe e8 f4 cd fd ff e b 4e 48 8b 4d 08 4c 8b 4f 10 9c 41 [ 648.865554] RIP [<ffffffff810b99c9>] kfree+0x29/0x87 [ 648.865554] RSP <ffff88003f187e28> [ 649.131912] ---[ end trace a48f49f69d2bcb3e ]--- [ 649.136840] Kernel panic - not syncing: Fatal exception in interrupt [ 649.143558] Rebooting in 10 seconds.. And if you look closely in the stack dump, you will find init_user_ns: ffffffff8171fe00 D init_user_ns In case you missed it, KOSAKI Motohiro posted a similar stack-trace (but not the same BUG) in this thread: http://lkml.org/lkml/2009/2/10/7 Vegard -- "The animistic metaphor of the bug that maliciously sneaked in while the programmer was not looking is intellectually dishonest as it disguises that the error is the programmer's own creation." -- E. W. Dijkstra, EWD1036 ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: namespaces?: bug at mm/slub.c:2750 2009-02-11 7:55 ` Vegard Nossum @ 2009-02-11 8:07 ` Andrew Morton 2009-02-11 10:48 ` Vegard Nossum 0 siblings, 1 reply; 14+ messages in thread From: Andrew Morton @ 2009-02-11 8:07 UTC (permalink / raw) To: Vegard Nossum Cc: Eric Sesterhenn, linux-kernel, David Howells, KOSAKI Motohiro, containers (cc's added) On Wed, 11 Feb 2009 08:55:08 +0100 Vegard Nossum <vegard.nossum@gmail.com> wrote: > On Sat, Feb 7, 2009 at 1:15 AM, Andrew Morton <akpm@linux-foundation.org> wrote: > > On Fri, 6 Feb 2009 12:35:56 +0100 > > Eric Sesterhenn <snakebyte@gmx.de> wrote: > >> > >> with todays -git i get the following bug when i reboot the machine > >> > >> [ 94.369135] ------------[ cut here ]------------ > >> [ 94.369344] kernel BUG at mm/slub.c:2750! > > [...] > > > Well that's ugly. We seem to have passed a non-slab address into > > kfree(). > > > > void kfree(const void *x) > > { > > struct page *page; > > void *object = (void *)x; > > > > if (unlikely(ZERO_OR_NULL_PTR(x))) > > return; > > > > page = virt_to_head_page(x); > > if (unlikely(!PageSlab(page))) { > > BUG_ON(!PageCompound(page)); > > > > > > doing BUG_ON(!PageCompound) is a rather odd way of reporting that. > > > > This might be pointing out the obvious, but page allocator > pass-through means that we can't really tell slab pages from page > allocator pages. But since pass-through uses GFP_COMP, we know it'd be > a bug to pass a non-compound page into this function. > > > I'm unsure what could have caused this. Could you have a play around > > please? Set all the memory debug options, try using slab instead of > > slub, etc? > > I think it is trying to free the init_user_ns (because of a > refcounting bug). I am able to reproduce it with a clean stack trace: > > [ 648.864710] ------------[ cut here ]------------ > [ 648.865554] kernel BUG at mm/slub.c:2750! > [ 648.865554] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC > [ 648.865554] last sysfs file: /sys/devices/pnp0/00:0d/id > [ 648.865554] CPU 1 > [ 648.865554] Modules linked in: > [ 648.865554] Pid: 917, comm: udevd Not tainted 2.6.29-rc3 #223 > [ 648.865554] RIP: 0010:[<ffffffff810b99c9>] [<ffffffff810b99c9>] > kfree+0x29/0x87 > [ 648.865554] RSP: 0018:ffff88003f187e28 EFLAGS: 00010246 > [ 648.865554] RAX: 0100000000000400 RBX: ffffffff8171fe00 RCX: 0000000000000086 > [ 648.865554] RDX: ffffe20000050ec8 RSI: 0000000000000085 RDI: ffffe20000050ec8 > [ 648.865554] RBP: ffff88003f187e38 R08: 0000000000000000 R09: ffffffff81819cb0 > [ 648.865554] R10: 0000000000000002 R11: ffff88003f187e98 R12: ffffffff81072144 > [ 648.865554] R13: 00000000000000cf R14: ffffffff818b13e0 R15: 000000000000000a > [ 648.865554] FS: 0000000000000000(0000) GS:ffff88003f156f80(0063) > knlGS:00000000f7fac700 > [ 648.865554] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b > [ 648.865554] CR2: 0000000009b2d630 CR3: 000000003e92c000 CR4: 00000000000006a0 > [ 648.865554] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 648.865554] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > [ 648.865554] Process udevd (pid: 917, threadinfo ffff88003e458000, > task ffff88003f1b8e80) > [ 648.865554] Stack: > [ 648.865554] ffffffff8171fe00 ffffffff81072144 ffff88003f187e58 > ffffffff81072169 > [ 648.865554] ffffffff818b13e0 ffffffff8171fe00 ffff88003f187e78 > ffffffff811b2f68 > [ 648.865554] ffff88003dd4c500 0000000000000286 ffff88003f187e98 > ffffffff81046712 > [ 648.865554] Call Trace: > [ 648.865554] <IRQ> <0> [<ffffffff81072144>] ? free_user_ns+0x0/0x29 > [ 648.865554] [<ffffffff81072169>] free_user_ns+0x25/0x29 > [ 648.865554] [<ffffffff811b2f68>] kref_put+0x66/0x72 > [ 648.865554] [<ffffffff81046712>] free_uid+0x4f/0x90 > [ 648.865554] [<ffffffff810555b8>] put_cred_rcu+0xa5/0xb9 > [ 648.865554] [<ffffffff8107ea86>] __rcu_process_callbacks+0x1f4/0x263 > [ 648.865554] [<ffffffff8107eb2c>] rcu_process_callbacks+0x37/0x5d > [ 648.865554] [<ffffffff81041bd3>] __do_softirq+0x88/0x149 > [ 648.865554] [<ffffffff8100d53c>] call_softirq+0x1c/0x28 > [ 648.865554] [<ffffffff8100e4e9>] do_softirq+0x39/0x7b > [ 648.865554] [<ffffffff81041946>] irq_exit+0x44/0x7e > [ 648.865554] [<ffffffff814c3d8f>] smp_apic_timer_interrupt+0x98/0xb1 > [ 648.865554] [<ffffffff8100cf73>] apic_timer_interrupt+0x13/0x20 > [ 648.865554] <EOI> <0>Code: c9 c3 55 48 89 e5 41 54 53 0f 1f 44 00 > 00 48 83 ff 10 48 89 fb 7 > 6 6d e8 1d ed ff ff 48 89 c7 48 8b 00 84 c0 78 10 f6 c4 60 75 04 <0f> > 0b eb fe e8 f4 cd fd ff e > b 4e 48 8b 4d 08 4c 8b 4f 10 9c 41 > [ 648.865554] RIP [<ffffffff810b99c9>] kfree+0x29/0x87 > [ 648.865554] RSP <ffff88003f187e28> > [ 649.131912] ---[ end trace a48f49f69d2bcb3e ]--- > [ 649.136840] Kernel panic - not syncing: Fatal exception in interrupt > [ 649.143558] Rebooting in 10 seconds.. > > And if you look closely in the stack dump, you will find init_user_ns: > > ffffffff8171fe00 D init_user_ns > > In case you missed it, KOSAKI Motohiro posted a similar stack-trace > (but not the same BUG) in this thread: > http://lkml.org/lkml/2009/2/10/7 > Both traces include the newly-added put_cred_rcu(). Suspicious. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: namespaces?: bug at mm/slub.c:2750 2009-02-11 8:07 ` Andrew Morton @ 2009-02-11 10:48 ` Vegard Nossum 2009-02-11 16:37 ` Serge E. Hallyn 0 siblings, 1 reply; 14+ messages in thread From: Vegard Nossum @ 2009-02-11 10:48 UTC (permalink / raw) To: Andrew Morton Cc: Eric Sesterhenn, linux-kernel, David Howells, KOSAKI Motohiro, containers On Wed, Feb 11, 2009 at 9:07 AM, Andrew Morton <akpm@linux-foundation.org> wrote: >> In case you missed it, KOSAKI Motohiro posted a similar stack-trace >> (but not the same BUG) in this thread: >> http://lkml.org/lkml/2009/2/10/7 >> > > Both traces include the newly-added put_cred_rcu(). Suspicious. > I have this test case which triggers it regularly after some minutes: #include <sys/wait.h> #include <sys/types.h> #include <errno.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> int main(int argc, char* argv[]) { unsigned int i; if (fork() == 0) { while (1) system("echo -n ."); } for (i = 0; i < 2; ++i) { if (fork() == 0) { while (1) { setreuid(0, 0xcafeba); setreuid(0xcafeba, 0); setreuid(0, 0xcafebb); setreuid(0xcafebb, 0); } exit(EXIT_SUCCESS); } } while (!(wait(NULL) == -1 && errno == ECHILD)) ; return 0; } It seems to be the combination of exec() and setreuid(), but I couldn't get it to work with just exec() instead of system(). It is possible that CONFIG_USER_SCHED must be =y for this to work. It can probably be simplified too... Vegard -- "The animistic metaphor of the bug that maliciously sneaked in while the programmer was not looking is intellectually dishonest as it disguises that the error is the programmer's own creation." -- E. W. Dijkstra, EWD1036 ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: namespaces?: bug at mm/slub.c:2750 2009-02-11 10:48 ` Vegard Nossum @ 2009-02-11 16:37 ` Serge E. Hallyn 2009-02-11 17:02 ` David Howells 0 siblings, 1 reply; 14+ messages in thread From: Serge E. Hallyn @ 2009-02-11 16:37 UTC (permalink / raw) To: Vegard Nossum Cc: Andrew Morton, David Howells, Eric Sesterhenn, containers, linux-kernel, Dhaval Giani, Peter Zijlstra Quoting Vegard Nossum (vegard.nossum@gmail.com): > On Wed, Feb 11, 2009 at 9:07 AM, Andrew Morton > <akpm@linux-foundation.org> wrote: > >> In case you missed it, KOSAKI Motohiro posted a similar stack-trace > >> (but not the same BUG) in this thread: > >> http://lkml.org/lkml/2009/2/10/7 > >> > > > > Both traces include the newly-added put_cred_rcu(). Suspicious. > > > > I have this test case which triggers it regularly after some minutes: > > #include <sys/wait.h> > #include <sys/types.h> > > #include <errno.h> > #include <stdio.h> > #include <stdlib.h> > #include <string.h> > #include <unistd.h> > > int main(int argc, char* argv[]) > { > unsigned int i; > > if (fork() == 0) { > while (1) > system("echo -n ."); > } > > for (i = 0; i < 2; ++i) { > if (fork() == 0) { > while (1) { > setreuid(0, 0xcafeba); > setreuid(0xcafeba, 0); > > setreuid(0, 0xcafebb); > setreuid(0xcafebb, 0); > } > > exit(EXIT_SUCCESS); > } > } > > while (!(wait(NULL) == -1 && errno == ECHILD)) > ; > > return 0; > } > > It seems to be the combination of exec() and setreuid(), but I > couldn't get it to work with just exec() instead of system(). It is > possible that CONFIG_USER_SCHED must be =y for this to work. It can > probably be simplified too... > > > Vegard I haven't yet been able to reproduce it, but I'm wondering whether something like the following is needed. Note that free_user() still seems wrong there, as it won't call uid_hash_remove(up) for a user which is not inthe init_user_ns at all, right? >From 2bc9da9728e0a4242bd8b1362d34e8d425be1b66 Mon Sep 17 00:00:00 2001 From: Serge E. Hallyn <serue@us.ibm.com> Date: Wed, 11 Feb 2009 11:29:52 -0500 Subject: [PATCH 1/1] user namespaces: only put the userns when we unhash the uid uids in namespaces other than init don't get a sysfs entry. For those in the init namespace, while we're waiting to remove the sysfs entry for the uid the uid is still hashed, and alloc_uid() may re-grab that uid without getting a new reference to the user_ns, which we've already put in free_user before scheduling remove_user_sysfs_dir(). Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> --- kernel/user.c | 3 +-- 1 files changed, 1 insertions(+), 2 deletions(-) diff --git a/kernel/user.c b/kernel/user.c index 477b666..bb401a5 100644 --- a/kernel/user.c +++ b/kernel/user.c @@ -71,6 +71,7 @@ static void uid_hash_insert(struct user_struct *up, struct hlist_head *hashent) static void uid_hash_remove(struct user_struct *up) { + put_user_ns(up->user_ns); hlist_del_init(&up->uidhash_node); } @@ -334,7 +335,6 @@ static void free_user(struct user_struct *up, unsigned long flags) atomic_inc(&up->__count); spin_unlock_irqrestore(&uidhash_lock, flags); - put_user_ns(up->user_ns); INIT_WORK(&up->work, remove_user_sysfs_dir); schedule_work(&up->work); } @@ -357,7 +357,6 @@ static void free_user(struct user_struct *up, unsigned long flags) sched_destroy_user(up); key_put(up->uid_keyring); key_put(up->session_keyring); - put_user_ns(up->user_ns); kmem_cache_free(uid_cachep, up); } -- 1.6.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: namespaces?: bug at mm/slub.c:2750 2009-02-11 16:37 ` Serge E. Hallyn @ 2009-02-11 17:02 ` David Howells 2009-02-11 17:24 ` Serge E. Hallyn 0 siblings, 1 reply; 14+ messages in thread From: David Howells @ 2009-02-11 17:02 UTC (permalink / raw) To: Serge E. Hallyn Cc: dhowells, Vegard Nossum, Andrew Morton, Eric Sesterhenn, containers, linux-kernel, Dhaval Giani, Peter Zijlstra Serge E. Hallyn <serue@us.ibm.com> wrote: > static void uid_hash_remove(struct user_struct *up) > { > + put_user_ns(up->user_ns); > hlist_del_init(&up->uidhash_node); > } Don't you need to do the hlist_del_init() first? Otherwise, mightn't the put_user_ns() cause the namespace to be freed before hlist_del_init() removes the user_struct from it? David ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: namespaces?: bug at mm/slub.c:2750 2009-02-11 17:02 ` David Howells @ 2009-02-11 17:24 ` Serge E. Hallyn 2009-02-11 18:00 ` Vegard Nossum 2009-02-11 18:03 ` David Howells 0 siblings, 2 replies; 14+ messages in thread From: Serge E. Hallyn @ 2009-02-11 17:24 UTC (permalink / raw) To: David Howells Cc: Vegard Nossum, Andrew Morton, Eric Sesterhenn, containers, linux-kernel, Dhaval Giani, Peter Zijlstra Quoting David Howells (dhowells@redhat.com): > Serge E. Hallyn <serue@us.ibm.com> wrote: > > > static void uid_hash_remove(struct user_struct *up) > > { > > + put_user_ns(up->user_ns); > > hlist_del_init(&up->uidhash_node); > > } > > Don't you need to do the hlist_del_init() first? Otherwise, mightn't the > put_user_ns() cause the namespace to be freed before hlist_del_init() removes > the user_struct from it? It's called under uidhash_lock spinlock so should be ok, but in principle you're right so it's probably a good idea. The main point is that without this patch, put_user_ns is done before the hlist_del_init and *not* atomically under uidhash_lock. thanks, -serge ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: namespaces?: bug at mm/slub.c:2750 2009-02-11 17:24 ` Serge E. Hallyn @ 2009-02-11 18:00 ` Vegard Nossum 2009-02-11 18:03 ` David Howells 1 sibling, 0 replies; 14+ messages in thread From: Vegard Nossum @ 2009-02-11 18:00 UTC (permalink / raw) To: Serge E. Hallyn, KOSAKI Motohiro Cc: David Howells, Andrew Morton, Eric Sesterhenn, containers, linux-kernel, Dhaval Giani, Peter Zijlstra On Wed, Feb 11, 2009 at 6:24 PM, Serge E. Hallyn <serue@us.ibm.com> wrote: > Quoting David Howells (dhowells@redhat.com): >> Serge E. Hallyn <serue@us.ibm.com> wrote: >> >> > static void uid_hash_remove(struct user_struct *up) >> > { >> > + put_user_ns(up->user_ns); >> > hlist_del_init(&up->uidhash_node); >> > } >> >> Don't you need to do the hlist_del_init() first? Otherwise, mightn't the >> put_user_ns() cause the namespace to be freed before hlist_del_init() removes >> the user_struct from it? > > It's called under uidhash_lock spinlock so should be ok, but in > principle you're right so it's probably a good idea. > > The main point is that without this patch, put_user_ns is done before > the hlist_del_init and *not* atomically under uidhash_lock. Congrats, your (unmodified) patch made it through the first 20 minutes of testing! :-D (In comparison, the unpatched kernel would usually crash after ~3 minutes) I wonder why you couldn't reproduce it, though. KOSAKI Motohiro: You might want to see if this patch helps too. It is here: http://lkml.org/lkml/2009/2/11/251 Vegard -- "The animistic metaphor of the bug that maliciously sneaked in while the programmer was not looking is intellectually dishonest as it disguises that the error is the programmer's own creation." -- E. W. Dijkstra, EWD1036 ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: namespaces?: bug at mm/slub.c:2750 2009-02-11 17:24 ` Serge E. Hallyn 2009-02-11 18:00 ` Vegard Nossum @ 2009-02-11 18:03 ` David Howells 2009-02-11 19:38 ` Serge E. Hallyn 1 sibling, 1 reply; 14+ messages in thread From: David Howells @ 2009-02-11 18:03 UTC (permalink / raw) To: Serge E. Hallyn Cc: dhowells, Vegard Nossum, Andrew Morton, Eric Sesterhenn, containers, linux-kernel, Dhaval Giani, Peter Zijlstra Serge E. Hallyn <serue@us.ibm.com> wrote: > It's called under uidhash_lock spinlock so should be ok, but in > principle you're right so it's probably a good idea. The lock is nothing to do with it. put_user_ns() may call kfree() on the user_namespace, but the user_struct given to uid_hash_remove() may still be attached to it. David ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: namespaces?: bug at mm/slub.c:2750 2009-02-11 18:03 ` David Howells @ 2009-02-11 19:38 ` Serge E. Hallyn 2009-02-11 20:42 ` David Howells 0 siblings, 1 reply; 14+ messages in thread From: Serge E. Hallyn @ 2009-02-11 19:38 UTC (permalink / raw) To: David Howells Cc: Vegard Nossum, Andrew Morton, Eric Sesterhenn, containers, linux-kernel, Dhaval Giani, Peter Zijlstra Quoting David Howells (dhowells@redhat.com): > Serge E. Hallyn <serue@us.ibm.com> wrote: > > > It's called under uidhash_lock spinlock so should be ok, but in > > principle you're right so it's probably a good idea. > > The lock is nothing to do with it. put_user_ns() may call kfree() on the > user_namespace, but the user_struct given to uid_hash_remove() may still be > attached to it. Yes, but noone will pull the user_struct off the list without taking the lock. what am I missing? Anyway, I do like swapping the lines (as below) better. -serge >From 8b83d11023c1064e99bffae3c2a05580b915de60 Mon Sep 17 00:00:00 2001 From: Serge E. Hallyn <serue@us.ibm.com> Date: Wed, 11 Feb 2009 11:29:52 -0500 Subject: [PATCH 1/1] user namespaces: only put the userns when we unhash the uid uids in namespaces other than init don't get a sysfs entry. For those in the init namespace, while we're waiting to remove the sysfs entry for the uid the uid is still hashed, and alloc_uid() may re-grab that uid without getting a new reference to the user_ns, which we've already put in free_user before scheduling remove_user_sysfs_dir(). Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> --- kernel/user.c | 3 +-- 1 files changed, 1 insertions(+), 2 deletions(-) diff --git a/kernel/user.c b/kernel/user.c index 477b666..3551ac7 100644 --- a/kernel/user.c +++ b/kernel/user.c @@ -72,6 +72,7 @@ static void uid_hash_insert(struct user_struct *up, struct hlist_head *hashent) static void uid_hash_remove(struct user_struct *up) { hlist_del_init(&up->uidhash_node); + put_user_ns(up->user_ns); } static struct user_struct *uid_hash_find(uid_t uid, struct hlist_head *hashent) @@ -334,7 +335,6 @@ static void free_user(struct user_struct *up, unsigned long flags) atomic_inc(&up->__count); spin_unlock_irqrestore(&uidhash_lock, flags); - put_user_ns(up->user_ns); INIT_WORK(&up->work, remove_user_sysfs_dir); schedule_work(&up->work); } @@ -357,7 +357,6 @@ static void free_user(struct user_struct *up, unsigned long flags) sched_destroy_user(up); key_put(up->uid_keyring); key_put(up->session_keyring); - put_user_ns(up->user_ns); kmem_cache_free(uid_cachep, up); } -- 1.6.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: namespaces?: bug at mm/slub.c:2750 2009-02-11 19:38 ` Serge E. Hallyn @ 2009-02-11 20:42 ` David Howells 2009-02-11 21:21 ` Serge E. Hallyn 0 siblings, 1 reply; 14+ messages in thread From: David Howells @ 2009-02-11 20:42 UTC (permalink / raw) To: Serge E. Hallyn Cc: dhowells, Vegard Nossum, Andrew Morton, Eric Sesterhenn, containers, linux-kernel, Dhaval Giani, Peter Zijlstra Serge E. Hallyn <serue@us.ibm.com> wrote: > Yes, but noone will pull the user_struct off the list without > taking the lock. > > what am I missing? I believe that the hash link (uidhash_node) in the user_struct that is passed to uid_hash_remove() points to, and is pointed to by the user_namespace to which the user_struct belongs. In which case calling put_user_ns() may kfree the head pointer of the list _before_ hlist_del_init() is invoked - in which case hlist_del_init() will act upon freed memory. At least, I think it works like this. Anyway, I have no objection to your new patch. Acked-by: David Howells <dhowells@redhat.com> ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: namespaces?: bug at mm/slub.c:2750 2009-02-11 20:42 ` David Howells @ 2009-02-11 21:21 ` Serge E. Hallyn 0 siblings, 0 replies; 14+ messages in thread From: Serge E. Hallyn @ 2009-02-11 21:21 UTC (permalink / raw) To: David Howells Cc: Vegard Nossum, Andrew Morton, Eric Sesterhenn, containers, linux-kernel, Dhaval Giani, Peter Zijlstra Quoting David Howells (dhowells@redhat.com): > Serge E. Hallyn <serue@us.ibm.com> wrote: > > > Yes, but noone will pull the user_struct off the list without > > taking the lock. > > > > what am I missing? > > I believe that the hash link (uidhash_node) in the user_struct that is passed > to uid_hash_remove() points to, and is pointed to by the user_namespace to > which the user_struct belongs. > > In which case calling put_user_ns() may kfree the head pointer of the list > _before_ hlist_del_init() is invoked - in which case hlist_del_init() will act > upon freed memory. > > At least, I think it works like this. Yikes, you're right. I was thinking there was on hash table with the key calculated from ns+uid, but instead each ns has its own hash table keyed on uid. > Anyway, I have no objection to your new patch. > > Acked-by: David Howells <dhowells@redhat.com> thanks, -serge ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2009-02-11 21:22 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-02-06 11:35 namespaces?: bug at mm/slub.c:2750 Eric Sesterhenn 2009-02-07 0:15 ` Andrew Morton 2009-02-09 11:53 ` Eric Sesterhenn 2009-02-11 7:55 ` Vegard Nossum 2009-02-11 8:07 ` Andrew Morton 2009-02-11 10:48 ` Vegard Nossum 2009-02-11 16:37 ` Serge E. Hallyn 2009-02-11 17:02 ` David Howells 2009-02-11 17:24 ` Serge E. Hallyn 2009-02-11 18:00 ` Vegard Nossum 2009-02-11 18:03 ` David Howells 2009-02-11 19:38 ` Serge E. Hallyn 2009-02-11 20:42 ` David Howells 2009-02-11 21:21 ` Serge E. Hallyn
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox