* [GIT PULL] SLAB changes for v3.10
@ 2013-05-07 6:37 Pekka Enberg
2013-05-08 0:30 ` Tony Lindgren
0 siblings, 1 reply; 22+ messages in thread
From: Pekka Enberg @ 2013-05-07 6:37 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Andrew Morton, Christoph Lameter, linux-kernel
Hi Linus,
Please pull the latest SLAB tree from:
git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux.git slab/for-linus
Bulk of the changes are more slab unification from Christoph. There's
also few fixes from Aaron, Glauber, and Joonsoo thrown into the mix.
Pekka
------------------>
The following changes since commit 51a26ae7a14b85c99c9be470c2d28eeeba0f26a3:
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2013-05-06 15:51:10 -0700)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux.git slab/for-linus
Aaron Tomlin (1):
mm: slab: Verify the nodeid passed to ____cache_alloc_node
Christoph Lameter (19):
slab: Use proper formatting specs for unsigned size_t
slab: Move kmalloc related function defs
slab: Common kmalloc slab index determination
slab: Use common kmalloc_index/kmalloc_size functions
slab: Common name for the per node structures
slab: Rename nodelists to node
slab: Common constants for kmalloc boundaries
slab: Common definition for the array of kmalloc caches
slab: Common function to create the kmalloc array
stat: Use size_t for sizes instead of unsigned
slab: Common Kmalloc cache determination
slab: Rename list3/l3 to node
slab: Common definition for kmem_cache_node
slab: Handle ARCH_DMA_MINALIGN correctly
slab: Fixup CONFIG_PAGE_ALLOC/DEBUG_SLAB_LEAK sections
slub: Do not dereference NULL pointer in node_match
slub: tid must be retrieved from the percpu area of the current processor
slab: Return NULL for oversized allocations
mm, slab_common: Fix bootstrap creation of kmalloc caches
Glauber Costa (1):
slub: correctly bootstrap boot caches
Joonsoo Kim (3):
mm/sl[au]b: correct allocation type check in kmalloc_slab()
slub: correct to calculate num of acquired objects in get_partial_node()
slub: add 'likely' macro to inc_slabs_node()
Pekka Enberg (1):
Merge branch 'slab/next' into slab/for-linus
fs/proc/stat.c | 2 +-
include/linux/kmalloc_sizes.h | 45 ---
include/linux/slab.h | 231 +++++++++----
include/linux/slab_def.h | 54 +--
include/linux/slub_def.h | 136 +-------
mm/slab.c | 790 +++++++++++++++++------------------------
mm/slab.h | 43 +++-
mm/slab_common.c | 174 +++++++++-
mm/slub.c | 221 ++----------
9 files changed, 781 insertions(+), 915 deletions(-)
delete mode 100644 include/linux/kmalloc_sizes.h
^ permalink raw reply [flat|nested] 22+ messages in thread* Re: [GIT PULL] SLAB changes for v3.10 2013-05-07 6:37 [GIT PULL] SLAB changes for v3.10 Pekka Enberg @ 2013-05-08 0:30 ` Tony Lindgren 2013-05-08 4:24 ` Tony Lindgren 0 siblings, 1 reply; 22+ messages in thread From: Tony Lindgren @ 2013-05-08 0:30 UTC (permalink / raw) To: Pekka Enberg Cc: Linus Torvalds, Andrew Morton, Christoph Lameter, linux-kernel * Pekka Enberg <penberg@kernel.org> [130506 23:42]: > Hi Linus, > > Please pull the latest SLAB tree from: > > git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux.git slab/for-linus ... > mm, slab_common: Fix bootstrap creation of kmalloc caches This one seems to cause a regression for me on at least arm omap and vexpress that depends on some kconfig option I have not been able to figure out yet. Reverting 8a965b3b fixes the issue. Any ideas? Regards, Tony Without reverting 8a965b3b I'm getting: [ 0.000000] Unable to handle kernel NULL pointer dereference at virtual address 00000054 [ 0.000000] pgd = c0004000 [ 0.000000] [00000054] *pgd=00000000 [ 0.000000] Internal error: Oops: 5 [#1] SMP ARM [ 0.000000] Modules linked in: [ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.9.0-11486-g6d13e11 #689 [ 0.000000] task: c06f77c8 ti: c06ec000 task.ti: c06ec000 [ 0.000000] PC is at kmem_cache_alloc_trace+0x50/0x178 [ 0.000000] LR is at kmem_cache_alloc_trace+0x38/0x178 [ 0.000000] pc : [<c00edab4>] lr : [<c00eda9c>] psr: 600001d3 [ 0.000000] sp : c06edf70 ip : 600001d3 fp : 00000000 [ 0.000000] r10: 000000c0 r9 : c0086958 r8 : 000080d0 [ 0.000000] r7 : 00000000 r6 : 600001d3 r5 : 00000000 r4 : 00008000 [ 0.000000] r3 : 00000050 r2 : c06ec000 r1 : c06f77c8 r0 : c00eda9c [ 0.000000] Flags: nZCv IRQs off FIQs off Mode SVC_32 ISA ARM Segment kernel [ 0.000000] Control: 10c5387d Table: 6000406a DAC: 00000015 [ 0.000000] Process swapper/0 (pid: 0, stack limit = 0xc06ec240) [ 0.000000] Stack: (0xc06edf70 to 0xc06ee000) [ 0.000000] df60: 00002014 00000010 c0702890 00000000 [ 0.000000] df80: 00000000 00000000 00000000 410fc090 c06f4480 c0086958 c0702890 00000000 [ 0.000000] dfa0: 00000010 c06dc0b0 ffffffff c06b44f4 9fffffff c073834c c0738340 c06a5828 [ 0.000000] dfc0: ffffffff ffffffff c06a5478 00000000 00000000 c06dc0b0 00000000 10c5387d [ 0.000000] dfe0: c06f44c8 c06dc4b4 c06f8cac 6000406a 00000000 60008074 00000000 00000000 [ 0.000000] [<c00edab4>] (kmem_cache_alloc_trace+0x50/0x178) from [<c0086958>] (alloc_desc+0x24/0xb4) [ 0.000000] [<c0086958>] (alloc_desc+0x24/0xb4) from [<c06b44f4>] (early_irq_init+0x78/0xec) [ 0.000000] [<c06b44f4>] (early_irq_init+0x78/0xec) from [<c06a5828>] (start_kernel+0x17c/0x33c) [ 0.000000] [<c06a5828>] (start_kernel+0x17c/0x33c) from [<60008074>] (0x60008074) [ 0.000000] Code: e5923014 e2833014 e1a03103 e0873003 (e5931004) [ 0.000000] ---[ end trace 1b75b31a2719ed1c ]--- ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [GIT PULL] SLAB changes for v3.10 2013-05-08 0:30 ` Tony Lindgren @ 2013-05-08 4:24 ` Tony Lindgren 2013-05-08 5:16 ` Tony Lindgren ` (2 more replies) 0 siblings, 3 replies; 22+ messages in thread From: Tony Lindgren @ 2013-05-08 4:24 UTC (permalink / raw) To: Pekka Enberg Cc: Linus Torvalds, Andrew Morton, Christoph Lameter, linux-kernel * Tony Lindgren <tony@atomide.com> [130507 17:35]: > * Pekka Enberg <penberg@kernel.org> [130506 23:42]: > > Hi Linus, > > > > Please pull the latest SLAB tree from: > > > > git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux.git slab/for-linus > ... > > > mm, slab_common: Fix bootstrap creation of kmalloc caches > > This one seems to cause a regression for me on at least arm omap > and vexpress that depends on some kconfig option I have not been > able to figure out yet. Reverting 8a965b3b fixes the issue. > > Any ideas? OK got it narrowed down to CONFIG_DEBUG_SPINLOCK=y causing the problem with commit 8a965b3b. Ain't nothing like bisecting and booting and then diffing .config files on top of that. > Without reverting 8a965b3b I'm getting: > > [ 0.000000] Unable to handle kernel NULL pointer dereference at virtual address 00000054 > [ 0.000000] pgd = c0004000 > [ 0.000000] [00000054] *pgd=00000000 > [ 0.000000] Internal error: Oops: 5 [#1] SMP ARM > [ 0.000000] Modules linked in: > [ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.9.0-11486-g6d13e11 #689 > [ 0.000000] task: c06f77c8 ti: c06ec000 task.ti: c06ec000 > [ 0.000000] PC is at kmem_cache_alloc_trace+0x50/0x178 > [ 0.000000] LR is at kmem_cache_alloc_trace+0x38/0x178 > [ 0.000000] pc : [<c00edab4>] lr : [<c00eda9c>] psr: 600001d3 > [ 0.000000] sp : c06edf70 ip : 600001d3 fp : 00000000 > [ 0.000000] r10: 000000c0 r9 : c0086958 r8 : 000080d0 > [ 0.000000] r7 : 00000000 r6 : 600001d3 r5 : 00000000 r4 : 00008000 > [ 0.000000] r3 : 00000050 r2 : c06ec000 r1 : c06f77c8 r0 : c00eda9c > [ 0.000000] Flags: nZCv IRQs off FIQs off Mode SVC_32 ISA ARM Segment kernel > [ 0.000000] Control: 10c5387d Table: 6000406a DAC: 00000015 > [ 0.000000] Process swapper/0 (pid: 0, stack limit = 0xc06ec240) > [ 0.000000] Stack: (0xc06edf70 to 0xc06ee000) > [ 0.000000] df60: 00002014 00000010 c0702890 00000000 > [ 0.000000] df80: 00000000 00000000 00000000 410fc090 c06f4480 c0086958 c0702890 00000000 > [ 0.000000] dfa0: 00000010 c06dc0b0 ffffffff c06b44f4 9fffffff c073834c c0738340 c06a5828 > [ 0.000000] dfc0: ffffffff ffffffff c06a5478 00000000 00000000 c06dc0b0 00000000 10c5387d > [ 0.000000] dfe0: c06f44c8 c06dc4b4 c06f8cac 6000406a 00000000 60008074 00000000 00000000 > [ 0.000000] [<c00edab4>] (kmem_cache_alloc_trace+0x50/0x178) from [<c0086958>] (alloc_desc+0x24/0xb4) > [ 0.000000] [<c0086958>] (alloc_desc+0x24/0xb4) from [<c06b44f4>] (early_irq_init+0x78/0xec) > [ 0.000000] [<c06b44f4>] (early_irq_init+0x78/0xec) from [<c06a5828>] (start_kernel+0x17c/0x33c) > [ 0.000000] [<c06a5828>] (start_kernel+0x17c/0x33c) from [<60008074>] (0x60008074) > [ 0.000000] Code: e5923014 e2833014 e1a03103 e0873003 (e5931004) > [ 0.000000] ---[ end trace 1b75b31a2719ed1c ]--- > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [GIT PULL] SLAB changes for v3.10 2013-05-08 4:24 ` Tony Lindgren @ 2013-05-08 5:16 ` Tony Lindgren 2013-05-08 6:20 ` Pekka Enberg 2013-05-08 13:56 ` Christoph Lameter 2013-05-08 18:13 ` Chris Mason 2 siblings, 1 reply; 22+ messages in thread From: Tony Lindgren @ 2013-05-08 5:16 UTC (permalink / raw) To: Pekka Enberg Cc: Linus Torvalds, Andrew Morton, Christoph Lameter, linux-kernel * Tony Lindgren <tony@atomide.com> [130507 21:30]: > * Tony Lindgren <tony@atomide.com> [130507 17:35]: > > * Pekka Enberg <penberg@kernel.org> [130506 23:42]: > > > Hi Linus, > > > > > > Please pull the latest SLAB tree from: > > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux.git slab/for-linus > > ... > > > > > mm, slab_common: Fix bootstrap creation of kmalloc caches > > > > This one seems to cause a regression for me on at least arm omap > > and vexpress that depends on some kconfig option I have not been > > able to figure out yet. Reverting 8a965b3b fixes the issue. > > > > Any ideas? > > OK got it narrowed down to CONFIG_DEBUG_SPINLOCK=y causing the problem > with commit 8a965b3b. Ain't nothing like bisecting and booting and then > diffing .config files on top of that. Hmm it's actually CONFIG_PROVE_LOCKING=y that causes the problem, not CONFIG_DEBUG_SPINLOCK=y. CONFIG_DEBUG_SPINLOCK=y was just selected by CONFIG_PROVE_LOCKING=y in my non-booting .config. I can now fix my non-booting .config by removing PROVE_LOCKING and DEBUG_SPINLOCK, but I cannot break my other working .config by adding PROVE_LOCKING and DEBUG_SPINLOCK. Hmm. > > Without reverting 8a965b3b I'm getting: > > > > [ 0.000000] Unable to handle kernel NULL pointer dereference at virtual address 00000054 > > [ 0.000000] pgd = c0004000 > > [ 0.000000] [00000054] *pgd=00000000 > > [ 0.000000] Internal error: Oops: 5 [#1] SMP ARM > > [ 0.000000] Modules linked in: > > [ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.9.0-11486-g6d13e11 #689 > > [ 0.000000] task: c06f77c8 ti: c06ec000 task.ti: c06ec000 > > [ 0.000000] PC is at kmem_cache_alloc_trace+0x50/0x178 > > [ 0.000000] LR is at kmem_cache_alloc_trace+0x38/0x178 > > [ 0.000000] pc : [<c00edab4>] lr : [<c00eda9c>] psr: 600001d3 > > [ 0.000000] sp : c06edf70 ip : 600001d3 fp : 00000000 > > [ 0.000000] r10: 000000c0 r9 : c0086958 r8 : 000080d0 > > [ 0.000000] r7 : 00000000 r6 : 600001d3 r5 : 00000000 r4 : 00008000 > > [ 0.000000] r3 : 00000050 r2 : c06ec000 r1 : c06f77c8 r0 : c00eda9c > > [ 0.000000] Flags: nZCv IRQs off FIQs off Mode SVC_32 ISA ARM Segment kernel > > [ 0.000000] Control: 10c5387d Table: 6000406a DAC: 00000015 > > [ 0.000000] Process swapper/0 (pid: 0, stack limit = 0xc06ec240) > > [ 0.000000] Stack: (0xc06edf70 to 0xc06ee000) > > [ 0.000000] df60: 00002014 00000010 c0702890 00000000 > > [ 0.000000] df80: 00000000 00000000 00000000 410fc090 c06f4480 c0086958 c0702890 00000000 > > [ 0.000000] dfa0: 00000010 c06dc0b0 ffffffff c06b44f4 9fffffff c073834c c0738340 c06a5828 > > [ 0.000000] dfc0: ffffffff ffffffff c06a5478 00000000 00000000 c06dc0b0 00000000 10c5387d > > [ 0.000000] dfe0: c06f44c8 c06dc4b4 c06f8cac 6000406a 00000000 60008074 00000000 00000000 > > [ 0.000000] [<c00edab4>] (kmem_cache_alloc_trace+0x50/0x178) from [<c0086958>] (alloc_desc+0x24/0xb4) > > [ 0.000000] [<c0086958>] (alloc_desc+0x24/0xb4) from [<c06b44f4>] (early_irq_init+0x78/0xec) > > [ 0.000000] [<c06b44f4>] (early_irq_init+0x78/0xec) from [<c06a5828>] (start_kernel+0x17c/0x33c) > > [ 0.000000] [<c06a5828>] (start_kernel+0x17c/0x33c) from [<60008074>] (0x60008074) > > [ 0.000000] Code: e5923014 e2833014 e1a03103 e0873003 (e5931004) > > [ 0.000000] ---[ end trace 1b75b31a2719ed1c ]--- > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > Please read the FAQ at http://www.tux.org/lkml/ > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [GIT PULL] SLAB changes for v3.10 2013-05-08 5:16 ` Tony Lindgren @ 2013-05-08 6:20 ` Pekka Enberg 2013-05-08 11:58 ` Glauber Costa 0 siblings, 1 reply; 22+ messages in thread From: Pekka Enberg @ 2013-05-08 6:20 UTC (permalink / raw) To: Tony Lindgren Cc: Linus Torvalds, Andrew Morton, Christoph Lameter, LKML, Glauber Costa, Tetsuo Handa Hi Tony, On Wed, May 8, 2013 at 8:16 AM, Tony Lindgren <tony@atomide.com> wrote: > * Tony Lindgren <tony@atomide.com> [130507 21:30]: >> * Tony Lindgren <tony@atomide.com> [130507 17:35]: >> > * Pekka Enberg <penberg@kernel.org> [130506 23:42]: >> > > Hi Linus, >> > > >> > > Please pull the latest SLAB tree from: >> > > >> > > git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux.git slab/for-linus >> > ... >> > >> > > mm, slab_common: Fix bootstrap creation of kmalloc caches >> > >> > This one seems to cause a regression for me on at least arm omap >> > and vexpress that depends on some kconfig option I have not been >> > able to figure out yet. Reverting 8a965b3b fixes the issue. >> > >> > Any ideas? >> >> OK got it narrowed down to CONFIG_DEBUG_SPINLOCK=y causing the problem >> with commit 8a965b3b. Ain't nothing like bisecting and booting and then >> diffing .config files on top of that. > > Hmm it's actually CONFIG_PROVE_LOCKING=y that causes the problem, > not CONFIG_DEBUG_SPINLOCK=y. CONFIG_DEBUG_SPINLOCK=y was just selected > by CONFIG_PROVE_LOCKING=y in my non-booting .config. I can now fix my > non-booting .config by removing PROVE_LOCKING and DEBUG_SPINLOCK, > but I cannot break my other working .config by adding PROVE_LOCKING > and DEBUG_SPINLOCK. Hmm. That's almost certainly a slab bootstrap issue. Lockdep et al just change struct sizes which is why the problem shows up. Is this with SLAB or SLUB? Christoph, Glauber, care to take a look? >> > Without reverting 8a965b3b I'm getting: >> > >> > [ 0.000000] Unable to handle kernel NULL pointer dereference at virtual address 00000054 >> > [ 0.000000] pgd = c0004000 >> > [ 0.000000] [00000054] *pgd=00000000 >> > [ 0.000000] Internal error: Oops: 5 [#1] SMP ARM >> > [ 0.000000] Modules linked in: >> > [ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.9.0-11486-g6d13e11 #689 >> > [ 0.000000] task: c06f77c8 ti: c06ec000 task.ti: c06ec000 >> > [ 0.000000] PC is at kmem_cache_alloc_trace+0x50/0x178 >> > [ 0.000000] LR is at kmem_cache_alloc_trace+0x38/0x178 >> > [ 0.000000] pc : [<c00edab4>] lr : [<c00eda9c>] psr: 600001d3 >> > [ 0.000000] sp : c06edf70 ip : 600001d3 fp : 00000000 >> > [ 0.000000] r10: 000000c0 r9 : c0086958 r8 : 000080d0 >> > [ 0.000000] r7 : 00000000 r6 : 600001d3 r5 : 00000000 r4 : 00008000 >> > [ 0.000000] r3 : 00000050 r2 : c06ec000 r1 : c06f77c8 r0 : c00eda9c >> > [ 0.000000] Flags: nZCv IRQs off FIQs off Mode SVC_32 ISA ARM Segment kernel >> > [ 0.000000] Control: 10c5387d Table: 6000406a DAC: 00000015 >> > [ 0.000000] Process swapper/0 (pid: 0, stack limit = 0xc06ec240) >> > [ 0.000000] Stack: (0xc06edf70 to 0xc06ee000) >> > [ 0.000000] df60: 00002014 00000010 c0702890 00000000 >> > [ 0.000000] df80: 00000000 00000000 00000000 410fc090 c06f4480 c0086958 c0702890 00000000 >> > [ 0.000000] dfa0: 00000010 c06dc0b0 ffffffff c06b44f4 9fffffff c073834c c0738340 c06a5828 >> > [ 0.000000] dfc0: ffffffff ffffffff c06a5478 00000000 00000000 c06dc0b0 00000000 10c5387d >> > [ 0.000000] dfe0: c06f44c8 c06dc4b4 c06f8cac 6000406a 00000000 60008074 00000000 00000000 >> > [ 0.000000] [<c00edab4>] (kmem_cache_alloc_trace+0x50/0x178) from [<c0086958>] (alloc_desc+0x24/0xb4) >> > [ 0.000000] [<c0086958>] (alloc_desc+0x24/0xb4) from [<c06b44f4>] (early_irq_init+0x78/0xec) >> > [ 0.000000] [<c06b44f4>] (early_irq_init+0x78/0xec) from [<c06a5828>] (start_kernel+0x17c/0x33c) >> > [ 0.000000] [<c06a5828>] (start_kernel+0x17c/0x33c) from [<60008074>] (0x60008074) >> > [ 0.000000] Code: e5923014 e2833014 e1a03103 e0873003 (e5931004) >> > [ 0.000000] ---[ end trace 1b75b31a2719ed1c ]--- ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [GIT PULL] SLAB changes for v3.10 2013-05-08 6:20 ` Pekka Enberg @ 2013-05-08 11:58 ` Glauber Costa 2013-05-08 12:26 ` Pekka Enberg 0 siblings, 1 reply; 22+ messages in thread From: Glauber Costa @ 2013-05-08 11:58 UTC (permalink / raw) To: Pekka Enberg Cc: Tony Lindgren, Linus Torvalds, Andrew Morton, Christoph Lameter, LKML, Tetsuo Handa On 05/08/2013 10:20 AM, Pekka Enberg wrote: >> > Hmm it's actually CONFIG_PROVE_LOCKING=y that causes the problem, >> > not CONFIG_DEBUG_SPINLOCK=y. CONFIG_DEBUG_SPINLOCK=y was just selected >> > by CONFIG_PROVE_LOCKING=y in my non-booting .config. I can now fix my >> > non-booting .config by removing PROVE_LOCKING and DEBUG_SPINLOCK, >> > but I cannot break my other working .config by adding PROVE_LOCKING >> > and DEBUG_SPINLOCK. Hmm. > That's almost certainly a slab bootstrap issue. Lockdep et al just > change struct sizes which is why the problem shows up. Is this with > SLAB or SLUB? Christoph, Glauber, care to take a look? > >>>> >> > My first guess is that it hit a NULL cache. Being a NULL pointer dereference, the thing among all that has the biggest chances of being NULL and accessed unconditionally is the cache pointer itself. Due to the size being too big. But if that were the case, he would have hit the WARN_ON recently introduced: if (WARN_ON_ONCE(size > KMALLOC_MAX_SIZE)) return NULL; Is this WARN hit ? ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [GIT PULL] SLAB changes for v3.10 2013-05-08 11:58 ` Glauber Costa @ 2013-05-08 12:26 ` Pekka Enberg 2013-05-08 12:38 ` Glauber Costa 0 siblings, 1 reply; 22+ messages in thread From: Pekka Enberg @ 2013-05-08 12:26 UTC (permalink / raw) To: Glauber Costa Cc: Tony Lindgren, Linus Torvalds, Andrew Morton, Christoph Lameter, LKML, Tetsuo Handa On Wed, May 8, 2013 at 2:58 PM, Glauber Costa <glommer@parallels.com> wrote: > My first guess is that it hit a NULL cache. Being a NULL pointer > dereference, the thing among all that has the biggest chances of being > NULL and accessed unconditionally is the cache pointer itself. > > Due to the size being too big. But if that were the case, he would have > hit the WARN_ON recently introduced: > > if (WARN_ON_ONCE(size > KMALLOC_MAX_SIZE)) > return NULL; > > > Is this WARN hit ? I doubt it: [ 0.000000] r7 : 00000000 r6 : 600001d3 r5 : 00000000 r4 : 00008000 [ 0.000000] r3 : 00000050 r2 : c06ec000 r1 : c06f77c8 r0 : c00eda9c [ 0.000000] [<c00edab4>] (kmem_cache_alloc_trace+0x50/0x178) from [<c0086958>] (alloc_desc+0x24/0xb4) It's the kzalloc_node() in kernel/irq/irqdesc.c::alloc_desc() and AFAICT based on r4 it's a 32 KB allocation. It's more likely that KMALLOC_SHIFT_HIGH is less than 25 and because kmalloc_index() doesn't respect it, we return a pointer to an uninitialized kmalloc cache. Pekka ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [GIT PULL] SLAB changes for v3.10 2013-05-08 12:26 ` Pekka Enberg @ 2013-05-08 12:38 ` Glauber Costa 0 siblings, 0 replies; 22+ messages in thread From: Glauber Costa @ 2013-05-08 12:38 UTC (permalink / raw) To: Pekka Enberg Cc: Tony Lindgren, Linus Torvalds, Andrew Morton, Christoph Lameter, LKML, Tetsuo Handa On 05/08/2013 04:26 PM, Pekka Enberg wrote: > On Wed, May 8, 2013 at 2:58 PM, Glauber Costa <glommer@parallels.com> wrote: >> My first guess is that it hit a NULL cache. Being a NULL pointer >> dereference, the thing among all that has the biggest chances of being >> NULL and accessed unconditionally is the cache pointer itself. >> >> Due to the size being too big. But if that were the case, he would have >> hit the WARN_ON recently introduced: >> >> if (WARN_ON_ONCE(size > KMALLOC_MAX_SIZE)) >> return NULL; >> >> >> Is this WARN hit ? > > I doubt it: > > [ 0.000000] r7 : 00000000 r6 : 600001d3 r5 : 00000000 r4 : 00008000 > [ 0.000000] r3 : 00000050 r2 : c06ec000 r1 : c06f77c8 r0 : c00eda9c > > [ 0.000000] [<c00edab4>] (kmem_cache_alloc_trace+0x50/0x178) from > [<c0086958>] (alloc_desc+0x24/0xb4) > > It's the kzalloc_node() in kernel/irq/irqdesc.c::alloc_desc() and > AFAICT based on r4 it's a 32 KB allocation. It's more likely that > KMALLOC_SHIFT_HIGH is less than 25 and because kmalloc_index() doesn't > respect it, we return a pointer to an uninitialized kmalloc cache. > Exactly, but then the index is calculated from the size. If we are allocating with a size that would lead to an invalid index, we should WARN. If this is not happening, that WARN is really really badly placed. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [GIT PULL] SLAB changes for v3.10 2013-05-08 4:24 ` Tony Lindgren 2013-05-08 5:16 ` Tony Lindgren @ 2013-05-08 13:56 ` Christoph Lameter 2013-05-08 15:45 ` Tony Lindgren 2013-05-08 18:13 ` Chris Mason 2 siblings, 1 reply; 22+ messages in thread From: Christoph Lameter @ 2013-05-08 13:56 UTC (permalink / raw) To: Tony Lindgren; +Cc: Pekka Enberg, Linus Torvalds, Andrew Morton, linux-kernel On Tue, 7 May 2013, Tony Lindgren wrote: > OK got it narrowed down to CONFIG_DEBUG_SPINLOCK=y causing the problem > with commit 8a965b3b. Ain't nothing like bisecting and booting and then > diffing .config files on top of that. > > > Without reverting 8a965b3b I'm getting: The patch (commit 8a965b3baa89ffedc73c0fbc750006c631012ced) merely changed the sequence of slab creation to address an issue in SLAB. Hmmm.. But if KMALLOC_SHIFT_LOW is higher than 6 or 7 then the creation of the non-power of two slab could be skipped as a result of the patch. But we should not need them in those cases. Can I see the kernel config? What is the value of KMALLOC_SHIFT_LOW? Can you figure out which kernel slab the function is trying to access? ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [GIT PULL] SLAB changes for v3.10 2013-05-08 13:56 ` Christoph Lameter @ 2013-05-08 15:45 ` Tony Lindgren [not found] ` <alpine.DEB.2.02.1305081309310.3124@gentwo.org> 0 siblings, 1 reply; 22+ messages in thread From: Tony Lindgren @ 2013-05-08 15:45 UTC (permalink / raw) To: Christoph Lameter Cc: Pekka Enberg, Linus Torvalds, Andrew Morton, linux-kernel [-- Attachment #1: Type: text/plain, Size: 1694 bytes --] * Christoph Lameter <cl@linux.com> [130508 07:01]: > On Tue, 7 May 2013, Tony Lindgren wrote: > > > OK got it narrowed down to CONFIG_DEBUG_SPINLOCK=y causing the problem > > with commit 8a965b3b. Ain't nothing like bisecting and booting and then > > diffing .config files on top of that. > > > > > Without reverting 8a965b3b I'm getting: > > The patch (commit 8a965b3baa89ffedc73c0fbc750006c631012ced) merely changed the sequence of > slab creation to address an issue in SLAB. > > Hmmm.. But if KMALLOC_SHIFT_LOW is higher than 6 or 7 then the creation of > the non-power of two slab could be skipped as a result of the patch. But > we should not need them in those cases. > > Can I see the kernel config? Attached are minimal defconfig-bad and defconfig-good with pretty much everything disabled. These boot vexpress in qemu using earlyprintk for the output. > What is the value of KMALLOC_SHIFT_LOW? It's 6. > Can you figure out which kernel slab the function is trying to access? I can certainly debug it further, but it's also pretty easy to reproduce: 1. Download and build qemu-linaro from: https://git.linaro.org/gitweb?p=qemu/qemu-linaro.git The last time I tried the stock qemu I could not get vexpress to boot with it, don't know if that's still the case. But the above works for me. 2. Cross compile kernel with the attached defconfig-bad $ ARCH=arm CROSS_COMPILE=... make zImage 3. Try to boot the kernel in qemu $ qemu-system-arm -machine vexpress-a9 -m 1024 -curses -net nic \ -net user -serial stdio -append "console=ttyAMA0,115200n8 \ root=/dev/mmcblk0p2 ro rootwait physmap.enabled=0 \ debug earlyprintk" -kernel zImage Regards, Tony [-- Attachment #2: defconfig-bad --] [-- Type: text/plain, Size: 1922 bytes --] CONFIG_SYSVIPC=y CONFIG_NO_HZ=y CONFIG_HIGH_RES_TIMERS=y CONFIG_BSD_PROCESS_ACCT=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=16 CONFIG_BLK_DEV_INITRD=y CONFIG_EXPERT=y CONFIG_SLAB=y CONFIG_PROFILING=y CONFIG_OPROFILE=y # CONFIG_BLOCK is not set CONFIG_ARCH_VEXPRESS_CA9X4=y CONFIG_ARM_THUMBEE=y CONFIG_SMP=y CONFIG_NR_CPUS=2 CONFIG_ZBOOT_ROM_TEXT=0x0 CONFIG_ZBOOT_ROM_BSS=0x0 CONFIG_CMDLINE="root=/dev/mmcblk0p2 rootwait console=ttyO2,115200" CONFIG_KEXEC=y CONFIG_FPE_NWFPE=y CONFIG_BINFMT_MISC=y CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug" CONFIG_DEVTMPFS=y CONFIG_DEVTMPFS_MOUNT=y CONFIG_CMA=y CONFIG_INPUT_JOYDEV=y CONFIG_INPUT_EVDEV=y CONFIG_KEYBOARD_GPIO=y CONFIG_KEYBOARD_MATRIX=y CONFIG_INPUT_TOUCHSCREEN=y CONFIG_INPUT_MISC=y CONFIG_VT_HW_CONSOLE_BINDING=y # CONFIG_LEGACY_PTYS is not set CONFIG_SERIAL_8250=y CONFIG_SERIAL_8250_CONSOLE=y CONFIG_SERIAL_8250_NR_UARTS=32 CONFIG_SERIAL_8250_EXTENDED=y CONFIG_SERIAL_8250_MANY_PORTS=y CONFIG_SERIAL_8250_SHARE_IRQ=y CONFIG_SERIAL_8250_DETECT_IRQ=y CONFIG_SERIAL_8250_RSA=y CONFIG_SERIAL_AMBA_PL011=y CONFIG_SERIAL_AMBA_PL011_CONSOLE=y CONFIG_DEBUG_GPIO=y CONFIG_GPIO_SYSFS=y # CONFIG_HWMON is not set CONFIG_FB=y CONFIG_FIRMWARE_EDID=y CONFIG_FB_MODE_HELPERS=y CONFIG_FB_TILEBLITTING=y CONFIG_BACKLIGHT_LCD_SUPPORT=y CONFIG_LCD_PLATFORM=y CONFIG_FRAMEBUFFER_CONSOLE=y CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y CONFIG_FONTS=y CONFIG_FONT_8x8=y CONFIG_FONT_8x16=y CONFIG_LOGO=y # CONFIG_USB_SUPPORT is not set # CONFIG_IOMMU_SUPPORT is not set # CONFIG_FILE_LOCKING is not set # CONFIG_DNOTIFY is not set # CONFIG_INOTIFY_USER is not set CONFIG_TMPFS=y # CONFIG_MISC_FILESYSTEMS is not set CONFIG_PRINTK_TIME=y CONFIG_MAGIC_SYSRQ=y CONFIG_SCHEDSTATS=y CONFIG_TIMER_STATS=y CONFIG_PROVE_LOCKING=y # CONFIG_DEBUG_BUGVERBOSE is not set CONFIG_DEBUG_INFO=y # CONFIG_FTRACE is not set CONFIG_DEBUG_LL=y CONFIG_DEBUG_VEXPRESS_UART0_CA9=y CONFIG_EARLY_PRINTK=y [-- Attachment #3: defconfig-good --] [-- Type: text/plain, Size: 1921 bytes --] CONFIG_SYSVIPC=y CONFIG_NO_HZ=y CONFIG_HIGH_RES_TIMERS=y CONFIG_BSD_PROCESS_ACCT=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=16 CONFIG_BLK_DEV_INITRD=y CONFIG_EXPERT=y CONFIG_KALLSYMS_ALL=y CONFIG_SLAB=y CONFIG_PROFILING=y CONFIG_OPROFILE=y # CONFIG_BLOCK is not set CONFIG_ARCH_VEXPRESS_CA9X4=y CONFIG_ARM_THUMBEE=y CONFIG_SMP=y CONFIG_NR_CPUS=2 CONFIG_ZBOOT_ROM_TEXT=0x0 CONFIG_ZBOOT_ROM_BSS=0x0 CONFIG_CMDLINE="root=/dev/mmcblk0p2 rootwait console=ttyO2,115200" CONFIG_KEXEC=y CONFIG_FPE_NWFPE=y CONFIG_BINFMT_MISC=y CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug" CONFIG_DEVTMPFS=y CONFIG_DEVTMPFS_MOUNT=y CONFIG_CMA=y CONFIG_INPUT_JOYDEV=y CONFIG_INPUT_EVDEV=y CONFIG_KEYBOARD_GPIO=y CONFIG_KEYBOARD_MATRIX=y CONFIG_INPUT_TOUCHSCREEN=y CONFIG_INPUT_MISC=y CONFIG_VT_HW_CONSOLE_BINDING=y # CONFIG_LEGACY_PTYS is not set CONFIG_SERIAL_8250=y CONFIG_SERIAL_8250_CONSOLE=y CONFIG_SERIAL_8250_NR_UARTS=32 CONFIG_SERIAL_8250_EXTENDED=y CONFIG_SERIAL_8250_MANY_PORTS=y CONFIG_SERIAL_8250_SHARE_IRQ=y CONFIG_SERIAL_8250_DETECT_IRQ=y CONFIG_SERIAL_8250_RSA=y CONFIG_SERIAL_AMBA_PL011=y CONFIG_SERIAL_AMBA_PL011_CONSOLE=y CONFIG_DEBUG_GPIO=y CONFIG_GPIO_SYSFS=y # CONFIG_HWMON is not set CONFIG_FB=y CONFIG_FIRMWARE_EDID=y CONFIG_FB_MODE_HELPERS=y CONFIG_FB_TILEBLITTING=y CONFIG_BACKLIGHT_LCD_SUPPORT=y CONFIG_LCD_PLATFORM=y CONFIG_FRAMEBUFFER_CONSOLE=y CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y CONFIG_FONTS=y CONFIG_FONT_8x8=y CONFIG_FONT_8x16=y CONFIG_LOGO=y # CONFIG_USB_SUPPORT is not set # CONFIG_IOMMU_SUPPORT is not set # CONFIG_FILE_LOCKING is not set # CONFIG_DNOTIFY is not set # CONFIG_INOTIFY_USER is not set CONFIG_TMPFS=y # CONFIG_MISC_FILESYSTEMS is not set CONFIG_PRINTK_TIME=y CONFIG_MAGIC_SYSRQ=y CONFIG_SCHEDSTATS=y CONFIG_TIMER_STATS=y # CONFIG_DEBUG_BUGVERBOSE is not set CONFIG_DEBUG_INFO=y # CONFIG_FTRACE is not set CONFIG_DEBUG_LL=y CONFIG_DEBUG_VEXPRESS_UART0_CA9=y CONFIG_EARLY_PRINTK=y ^ permalink raw reply [flat|nested] 22+ messages in thread
[parent not found: <alpine.DEB.2.02.1305081309310.3124@gentwo.org>]
* Re: [GIT PULL] SLAB changes for v3.10 [not found] ` <alpine.DEB.2.02.1305081309310.3124@gentwo.org> @ 2013-05-08 18:29 ` Christoph Lameter 0 siblings, 0 replies; 22+ messages in thread From: Christoph Lameter @ 2013-05-08 18:29 UTC (permalink / raw) To: Tony Lindgren Cc: Pekka Enberg, Linus Torvalds, Andrew Morton, linux-kernel, Chris Mason > The 1.4.0 verion in ubuntu 13.04 s not good enough? qemu 1.4.0 reproduces the bug here on arm. And Chris Mason;s patch fixes it. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [GIT PULL] SLAB changes for v3.10 2013-05-08 4:24 ` Tony Lindgren 2013-05-08 5:16 ` Tony Lindgren 2013-05-08 13:56 ` Christoph Lameter @ 2013-05-08 18:13 ` Chris Mason 2013-05-08 18:25 ` Christoph Lameter 2013-05-08 21:01 ` Konrad Rzeszutek Wilk 2 siblings, 2 replies; 22+ messages in thread From: Chris Mason @ 2013-05-08 18:13 UTC (permalink / raw) To: Christoph Lameter Cc: Pekka Enberg, Linus Torvalds, Andrew Morton, linux-kernel, Tony Lindgren, Christoph Lameter [ Sorry if I break the threading on this, I had to pull it off gmane ] On Tue, 7 May 2013, Tony Lindgren wrote: > OK got it narrowed down to CONFIG_DEBUG_SPINLOCK=y causing the problem > with commit 8a965b3b. Ain't nothing like bisecting and booting and then > diffing .config files on top of that. I'm unable to boot with slab on current Linus -master, and bisected it down almost as far as Tony did before trying SLUB and then finding this thread. My box is a standard x86-64, nothing exciting and spinlock debugging isn't on. A few printks stuffed into Christoph's code: cache ffff88047f000080 at index 5 creating slab cache at 6 create special cache #1 (this is kmalloc_caches[1]) cache ffff88047f0001c0 at index 7 creating slab cache at 8 creating slab cache at 9 creating slab cache at 10 ... more get created Pulling this into the code from commit 8a965b3b: for (i = KMALLOC_SHIFT_LOW; i <= KMALLOC_SHIFT_HIGH; i++) { if (!kmalloc_caches[i]) { kmalloc_caches[i] = create_kmalloc_cache(NULL, 1 << i, flags); /* * Caches that are not of the two-to-the-power-of size. * These have to be created immediately after the * earlier power of two caches */ if (KMALLOC_MIN_SIZE <= 32 && !kmalloc_caches[1] && i == 6) kmalloc_caches[1] = create_kmalloc_cache(NULL, 96, flags); if (KMALLOC_MIN_SIZE <= 64 && !kmalloc_caches[2] && i == 7) kmalloc_caches[2] = create_kmalloc_cache(NULL, 192, flags); } } kmalloc_caches[7] was not null, and so kmalloc_caches[2] was never created. I get this oops (with early printk on) ------------[ cut here ]------------ kernel BUG at mm/slab.c:1635! invalid opcode: 0000 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.9.0-josef+ #920 Hardware name: Supermicro X9SRE/X9SRE-3F/X9SRi/X9SRi-3F/X9SRE/X9SRE-3F/X9SRi/X9SRi-3F, BIOS 1.0a 03/06/2012 task: ffffffff8196a410 ti: ffffffff8195a000 task.ti: ffffffff8195a000 RIP: 0010:[<ffffffff81a49c9d>] [<ffffffff81a49c9d>] kmem_cache_init_late+0x40/0x7d RSP: 0000:ffffffff8195bf78 EFLAGS: 00010282 RAX: 00000000fffffff4 RBX: ffff88047f006480 RCX: 000000000000ff31 RDX: 000000000000000e RSI: 0000000000000046 RDI: ffffffff81baf238 RBP: ffffffff8195bf80 R08: 0000000000000400 R09: 0000000000000000 R10: 0000000000002fa4 R11: 0000000000000000 R12: ffffffff81ab58d0 R13: ffffffff81abd2c0 R14: ffff88047ffaf0c0 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88047fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff88047ffff000 CR3: 0000000001965000 CR4: 00000000000406b0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffffffffffffffff ffffffff8195bfc0 ffffffff81a25b64 ffffffff81a2573d ffffffff81abd2c0 0000000000003000 0000000000000000 0000000000000000 0000000000000000 ffffffff8195bfd0 ffffffff81a2547f ffffffff8195bfe8 Call Trace: [<ffffffff81a25b64>] start_kernel+0x235/0x394 [<ffffffff81a2573d>] ? repair_env_string+0x58/0x58 [<ffffffff81a2547f>] x86_64_start_reservations+0x2a/0x2c [<ffffffff81a25548>] x86_64_start_kernel+0xc7/0xca Code: 53 e8 40 57 bd ff 48 8b 05 21 f2 f4 ff 48 8d 58 a8 48 8d 43 58 48 3d a0 8e 99 81 74 1a 31 f6 48 89 df e8 ba 7f 6d ff 85 c0 74 02 <0f> 0b 48 8b 5b 58 48 83 eb 58 eb da 48 c7 c7 70 8e 99 81 e8 e6 RIP [<ffffffff81a49c9d>] kmem_cache_init_late+0x40/0x7d RSP <ffffffff8195bf78> ---[ end trace 2e5587581263f881 ]--- This patch fixes things for me, but to maintain the rules from Christoph's patch, kmalloc_caches[2] should have been created whenever kmalloc_caches[7] was done. diff --git a/mm/slab_common.c b/mm/slab_common.c index d2517b0..ff3218a 100644 --- a/mm/slab_common.c +++ b/mm/slab_common.c @@ -446,18 +446,18 @@ void __init create_kmalloc_caches(unsigned long flags) if (!kmalloc_caches[i]) { kmalloc_caches[i] = create_kmalloc_cache(NULL, 1 << i, flags); + } - /* - * Caches that are not of the two-to-the-power-of size. - * These have to be created immediately after the - * earlier power of two caches - */ - if (KMALLOC_MIN_SIZE <= 32 && !kmalloc_caches[1] && i == 6) - kmalloc_caches[1] = create_kmalloc_cache(NULL, 96, flags); + /* + * Caches that are not of the two-to-the-power-of size. + * These have to be created immediately after the + * earlier power of two caches + */ + if (KMALLOC_MIN_SIZE <= 32 && !kmalloc_caches[1] && i == 6) + kmalloc_caches[1] = create_kmalloc_cache(NULL, 96, flags); - if (KMALLOC_MIN_SIZE <= 64 && !kmalloc_caches[2] && i == 7) - kmalloc_caches[2] = create_kmalloc_cache(NULL, 192, flags); - } + if (KMALLOC_MIN_SIZE <= 64 && !kmalloc_caches[2] && i == 7) + kmalloc_caches[2] = create_kmalloc_cache(NULL, 192, flags); } /* Kmalloc array is now usable */ ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [GIT PULL] SLAB changes for v3.10 2013-05-08 18:13 ` Chris Mason @ 2013-05-08 18:25 ` Christoph Lameter 2013-05-08 18:48 ` Chris Mason 2013-05-08 19:05 ` [GIT PULL] SLAB changes for v3.10 Tony Lindgren 2013-05-08 21:01 ` Konrad Rzeszutek Wilk 1 sibling, 2 replies; 22+ messages in thread From: Christoph Lameter @ 2013-05-08 18:25 UTC (permalink / raw) To: Chris Mason Cc: Pekka Enberg, Linus Torvalds, Andrew Morton, linux-kernel, Tony Lindgren On Wed, 8 May 2013, Chris Mason wrote: > This patch fixes things for me, but to maintain the rules from > Christoph's patch, kmalloc_caches[2] should have been created whenever > kmalloc_caches[7] was done. Not necessary. The early slab bootstrap must create some slab caches of specific sizes, it will only use those during very early bootstrap. The later creation of the array must skip those. You correctly moved the checks out of the if (!kmalloc_cacheS()) condition so that the caches are created properly. Acked-by: Christoph Lameter <cl@linux.com> ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [GIT PULL] SLAB changes for v3.10 2013-05-08 18:25 ` Christoph Lameter @ 2013-05-08 18:48 ` Chris Mason 2013-05-08 19:01 ` Christoph Lameter 2013-05-08 19:05 ` [GIT PULL] SLAB changes for v3.10 Tony Lindgren 1 sibling, 1 reply; 22+ messages in thread From: Chris Mason @ 2013-05-08 18:48 UTC (permalink / raw) To: Christoph Lameter Cc: Pekka Enberg, Linus Torvalds, Andrew Morton, linux-kernel@vger.kernel.org, Tony Lindgren Quoting Christoph Lameter (2013-05-08 14:25:49) > On Wed, 8 May 2013, Chris Mason wrote: > > > This patch fixes things for me, but to maintain the rules from > > Christoph's patch, kmalloc_caches[2] should have been created whenever > > kmalloc_caches[7] was done. > > Not necessary. The early slab bootstrap must create some slab caches of > specific sizes, it will only use those during very early bootstrap. > > The later creation of the array must skip those. > > You correctly moved the checks out of the if (!kmalloc_cacheS()) > condition so that the caches are created properly. But if the ordering is required at all, why is it ok to create cache 2 after cache 6 instead of after cache 7? IOW if we can safely do cache 2 after cache 6, why can't we just do both cache 1 and cache 2 after the loop? -chris ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [GIT PULL] SLAB changes for v3.10 2013-05-08 18:48 ` Chris Mason @ 2013-05-08 19:01 ` Christoph Lameter 2013-05-08 19:11 ` Tony Lindgren 2013-05-08 19:56 ` [PATCH] Fix crash during slab init Chris Mason 0 siblings, 2 replies; 22+ messages in thread From: Christoph Lameter @ 2013-05-08 19:01 UTC (permalink / raw) To: Chris Mason Cc: Pekka Enberg, Linus Torvalds, Andrew Morton, linux-kernel@vger.kernel.org, Tony Lindgren On Wed, 8 May 2013, Chris Mason wrote: > > You correctly moved the checks out of the if (!kmalloc_cacheS()) > > condition so that the caches are created properly. > > But if the ordering is required at all, why is it ok to create cache 2 > after cache 6 instead of after cache 7? The power of two caches are 2^x beginning with KMALLOC_MIN_SHIFT. The non power of two caches were folded into number 1 + 2 since they do not fit into the scheme and they are special cased throughout. This works since the minimal slab cache size is 8 bytes. > IOW if we can safely do cache 2 after cache 6, why can't we just do both > cache 1 and cache 2 after the loop? Because the cache creation in SLAB can cause the use of a fractional slab size if kmem_cache_create() thinks its better to put the metadata on a different slab cache (OFF_SLAB type) because data will align better that way. Its weird I know but its due to the way that SLAB aligns data in the page frame. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [GIT PULL] SLAB changes for v3.10 2013-05-08 19:01 ` Christoph Lameter @ 2013-05-08 19:11 ` Tony Lindgren 2013-05-08 19:56 ` [PATCH] Fix crash during slab init Chris Mason 1 sibling, 0 replies; 22+ messages in thread From: Tony Lindgren @ 2013-05-08 19:11 UTC (permalink / raw) To: Christoph Lameter Cc: Chris Mason, Pekka Enberg, Linus Torvalds, Andrew Morton, linux-kernel@vger.kernel.org * Christoph Lameter <cl@linux.com> [130508 12:06]: > On Wed, 8 May 2013, Chris Mason wrote: > > > > You correctly moved the checks out of the if (!kmalloc_cacheS()) > > > condition so that the caches are created properly. > > > > But if the ordering is required at all, why is it ok to create cache 2 > > after cache 6 instead of after cache 7? > > The power of two caches are 2^x beginning with KMALLOC_MIN_SHIFT. The non > power of two caches were folded into number 1 + 2 since they do not fit > into the scheme and they are special cased throughout. This works since > the minimal slab cache size is 8 bytes. > > > IOW if we can safely do cache 2 after cache 6, why can't we just do both > > cache 1 and cache 2 after the loop? > > Because the cache creation in SLAB can cause the use of a fractional slab > size if kmem_cache_create() thinks its better to put the metadata on a > different slab cache (OFF_SLAB type) because data will align better that > way. Its weird I know but its due to the way that SLAB aligns data in the > page frame. Hmm OK so kmalloc_caches[7] got created earlier with INDEX_AC != INDEX_NODE, and those are defined as: #define INDEX_AC kmalloc_index(sizeof(struct arraycache_init)) #define INDEX_NODE kmalloc_index(sizeof(struct kmem_cache_node)) So the different sizes for the structs can trigger it like Pekka was speculating earlier. Regards, Tony ^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH] Fix crash during slab init 2013-05-08 19:01 ` Christoph Lameter 2013-05-08 19:11 ` Tony Lindgren @ 2013-05-08 19:56 ` Chris Mason 2013-05-08 20:10 ` Sören Brinkmann ` (2 more replies) 1 sibling, 3 replies; 22+ messages in thread From: Chris Mason @ 2013-05-08 19:56 UTC (permalink / raw) To: Christoph Lameter Cc: Pekka Enberg, Linus Torvalds, Andrew Morton, linux-kernel@vger.kernel.org, Tony Lindgren Commit 8a965b3b introduced a regression that caused us to crash early during boot. The commit was introducing ordering of slab creation, making sure two odd-sized slabs were created after specific powers of two sizes. But, if any of the power of two slabs were created earlier during boot, slabs at index 1 or 2 might not get created at all. This patch makes sure none of the slabs get skipped. Tony Lindgren bisected this down to the offending commit, which really helped because bisect kept bringing me to almost but not quite this one. Signed-off-by: Chris Mason <chris.mason@fusionio.com> Acked-by: Christoph Lameter <cl@linux.com> Acked-by: Tony Lindgren <tony@atomide.com> --- v1->v2 reword description diff --git a/mm/slab_common.c b/mm/slab_common.c index d2517b0..ff3218a 100644 --- a/mm/slab_common.c +++ b/mm/slab_common.c @@ -446,18 +446,18 @@ void __init create_kmalloc_caches(unsigned long flags) if (!kmalloc_caches[i]) { kmalloc_caches[i] = create_kmalloc_cache(NULL, 1 << i, flags); + } - /* - * Caches that are not of the two-to-the-power-of size. - * These have to be created immediately after the - * earlier power of two caches - */ - if (KMALLOC_MIN_SIZE <= 32 && !kmalloc_caches[1] && i == 6) - kmalloc_caches[1] = create_kmalloc_cache(NULL, 96, flags); + /* + * Caches that are not of the two-to-the-power-of size. + * These have to be created immediately after the + * earlier power of two caches + */ + if (KMALLOC_MIN_SIZE <= 32 && !kmalloc_caches[1] && i == 6) + kmalloc_caches[1] = create_kmalloc_cache(NULL, 96, flags); - if (KMALLOC_MIN_SIZE <= 64 && !kmalloc_caches[2] && i == 7) - kmalloc_caches[2] = create_kmalloc_cache(NULL, 192, flags); - } + if (KMALLOC_MIN_SIZE <= 64 && !kmalloc_caches[2] && i == 7) + kmalloc_caches[2] = create_kmalloc_cache(NULL, 192, flags); } /* Kmalloc array is now usable */ ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH] Fix crash during slab init 2013-05-08 19:56 ` [PATCH] Fix crash during slab init Chris Mason @ 2013-05-08 20:10 ` Sören Brinkmann 2013-05-08 21:48 ` Tetsuo Handa 2013-05-08 22:09 ` Andrew Morton 2 siblings, 0 replies; 22+ messages in thread From: Sören Brinkmann @ 2013-05-08 20:10 UTC (permalink / raw) To: Chris Mason Cc: Christoph Lameter, Pekka Enberg, Linus Torvalds, Andrew Morton, linux-kernel@vger.kernel.org, Tony Lindgren On Wed, May 08, 2013 at 03:56:28PM -0400, Chris Mason wrote: > Commit 8a965b3b introduced a regression that caused us to crash early > during boot. The commit was introducing ordering of slab creation, > making sure two odd-sized slabs were created after specific powers of > two sizes. > > But, if any of the power of two slabs were created earlier during boot, > slabs at index 1 or 2 might not get created at all. This patch makes > sure none of the slabs get skipped. > > Tony Lindgren bisected this down to the offending commit, which really > helped because bisect kept bringing me to almost but not quite this one. > > Signed-off-by: Chris Mason <chris.mason@fusionio.com> > Acked-by: Christoph Lameter <cl@linux.com> > Acked-by: Tony Lindgren <tony@atomide.com> Acked-by: Soren Brinkmann <soren.brinkmann@xilinx.com> Fixes things for me on Zynq, too. Sören ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] Fix crash during slab init 2013-05-08 19:56 ` [PATCH] Fix crash during slab init Chris Mason 2013-05-08 20:10 ` Sören Brinkmann @ 2013-05-08 21:48 ` Tetsuo Handa 2013-05-08 22:09 ` Andrew Morton 2 siblings, 0 replies; 22+ messages in thread From: Tetsuo Handa @ 2013-05-08 21:48 UTC (permalink / raw) To: clmason, cl; +Cc: penberg, torvalds, akpm, linux-kernel, tony Chris Mason wrote: > Commit 8a965b3b introduced a regression that caused us to crash early > during boot. The commit was introducing ordering of slab creation, > making sure two odd-sized slabs were created after specific powers of > two sizes. > > But, if any of the power of two slabs were created earlier during boot, > slabs at index 1 or 2 might not get created at all. This patch makes > sure none of the slabs get skipped. > > Tony Lindgren bisected this down to the offending commit, which really > helped because bisect kept bringing me to almost but not quite this one. > > Signed-off-by: Chris Mason <chris.mason@fusionio.com> > Acked-by: Christoph Lameter <cl@linux.com> > Acked-by: Tony Lindgren <tony@atomide.com> This patch keeps my x86_32 debug kernel boots fine. Thank you. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] Fix crash during slab init 2013-05-08 19:56 ` [PATCH] Fix crash during slab init Chris Mason 2013-05-08 20:10 ` Sören Brinkmann 2013-05-08 21:48 ` Tetsuo Handa @ 2013-05-08 22:09 ` Andrew Morton 2 siblings, 0 replies; 22+ messages in thread From: Andrew Morton @ 2013-05-08 22:09 UTC (permalink / raw) To: Chris Mason Cc: Christoph Lameter, Pekka Enberg, Linus Torvalds, linux-kernel@vger.kernel.org, Tony Lindgren On Wed, 8 May 2013 15:56:28 -0400 Chris Mason <clmason@fusionio.com> wrote: > Commit 8a965b3b introduced a regression that caused us to crash early > during boot. The commit was introducing ordering of slab creation, > making sure two odd-sized slabs were created after specific powers of > two sizes. > > But, if any of the power of two slabs were created earlier during boot, > slabs at index 1 or 2 might not get created at all. This patch makes > sure none of the slabs get skipped. > > Tony Lindgren bisected this down to the offending commit, which really > helped because bisect kept bringing me to almost but not quite this one. err, yes. Without your patch, current mainline does an ignominious faceplant on my test box. From: Chris Mason <clmason@fusionio.com> Subject: slab: fix crash during slab init Commit 8a965b3b ("mm, slab_common: Fix bootstrap creation of kmalloc caches") introduced a regression that caused us to crash early during boot. The commit was introducing ordering of slab creation, making sure two odd-sized slabs were created after specific powers of two sizes. But, if any of the power of two slabs were created earlier during boot, slabs at index 1 or 2 might not get created at all. This patch makes sure none of the slabs get skipped. Tony Lindgren bisected this down to the offending commit, which really helped because bisect kept bringing me to almost but not quite this one. Signed-off-by: Chris Mason <chris.mason@fusionio.com> Acked-by: Christoph Lameter <cl@linux.com> Acked-by: Tony Lindgren <tony@atomide.com> Tested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: Pekka Enberg <penberg@kernel.org> Tested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> --- mm/slab_common.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff -puN mm/slab_common.c~slab-fix-crash-during-slab-init mm/slab_common.c --- a/mm/slab_common.c~slab-fix-crash-during-slab-init +++ a/mm/slab_common.c @@ -446,18 +446,18 @@ void __init create_kmalloc_caches(unsign if (!kmalloc_caches[i]) { kmalloc_caches[i] = create_kmalloc_cache(NULL, 1 << i, flags); + } - /* - * Caches that are not of the two-to-the-power-of size. - * These have to be created immediately after the - * earlier power of two caches - */ - if (KMALLOC_MIN_SIZE <= 32 && !kmalloc_caches[1] && i == 6) - kmalloc_caches[1] = create_kmalloc_cache(NULL, 96, flags); + /* + * Caches that are not of the two-to-the-power-of size. + * These have to be created immediately after the + * earlier power of two caches + */ + if (KMALLOC_MIN_SIZE <= 32 && !kmalloc_caches[1] && i == 6) + kmalloc_caches[1] = create_kmalloc_cache(NULL, 96, flags); - if (KMALLOC_MIN_SIZE <= 64 && !kmalloc_caches[2] && i == 7) - kmalloc_caches[2] = create_kmalloc_cache(NULL, 192, flags); - } + if (KMALLOC_MIN_SIZE <= 64 && !kmalloc_caches[2] && i == 7) + kmalloc_caches[2] = create_kmalloc_cache(NULL, 192, flags); } /* Kmalloc array is now usable */ _ ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [GIT PULL] SLAB changes for v3.10 2013-05-08 18:25 ` Christoph Lameter 2013-05-08 18:48 ` Chris Mason @ 2013-05-08 19:05 ` Tony Lindgren 1 sibling, 0 replies; 22+ messages in thread From: Tony Lindgren @ 2013-05-08 19:05 UTC (permalink / raw) To: Christoph Lameter Cc: Chris Mason, Pekka Enberg, Linus Torvalds, Andrew Morton, linux-kernel * Christoph Lameter <cl@linux.com> [130508 11:31]: > On Wed, 8 May 2013, Chris Mason wrote: > > > This patch fixes things for me, but to maintain the rules from > > Christoph's patch, kmalloc_caches[2] should have been created whenever > > kmalloc_caches[7] was done. > > Not necessary. The early slab bootstrap must create some slab caches of > specific sizes, it will only use those during very early bootstrap. > > The later creation of the array must skip those. > > You correctly moved the checks out of the if (!kmalloc_cacheS()) > condition so that the caches are created properly. > > Acked-by: Christoph Lameter <cl@linux.com> Great this fixes things for me too: Acked-by: Tony Lindgren <tony@atomide.com> ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [GIT PULL] SLAB changes for v3.10 2013-05-08 18:13 ` Chris Mason 2013-05-08 18:25 ` Christoph Lameter @ 2013-05-08 21:01 ` Konrad Rzeszutek Wilk 1 sibling, 0 replies; 22+ messages in thread From: Konrad Rzeszutek Wilk @ 2013-05-08 21:01 UTC (permalink / raw) To: Chris Mason Cc: Christoph Lameter, Pekka Enberg, Linus Torvalds, Andrew Morton, linux-kernel, Tony Lindgren On Wed, May 08, 2013 at 02:13:53PM -0400, Chris Mason wrote: > [ Sorry if I break the threading on this, I had to pull it off gmane ] > > On Tue, 7 May 2013, Tony Lindgren wrote: > > OK got it narrowed down to CONFIG_DEBUG_SPINLOCK=y causing the problem > > with commit 8a965b3b. Ain't nothing like bisecting and booting and then > > diffing .config files on top of that. > > I'm unable to boot with slab on current Linus -master, and bisected it > down almost as far as Tony did before trying SLUB and then finding this > thread. My box is a standard x86-64, nothing exciting and spinlock > debugging isn't on. > > A few printks stuffed into Christoph's code: > > cache ffff88047f000080 at index 5 > creating slab cache at 6 > create special cache #1 (this is kmalloc_caches[1]) > cache ffff88047f0001c0 at index 7 > creating slab cache at 8 > creating slab cache at 9 > creating slab cache at 10 > ... more get created > > Pulling this into the code from commit 8a965b3b: > > for (i = KMALLOC_SHIFT_LOW; i <= KMALLOC_SHIFT_HIGH; i++) { > if (!kmalloc_caches[i]) { > kmalloc_caches[i] = create_kmalloc_cache(NULL, > 1 << i, flags); > > /* > * Caches that are not of the two-to-the-power-of size. > * These have to be created immediately after the > * earlier power of two caches > */ > if (KMALLOC_MIN_SIZE <= 32 && !kmalloc_caches[1] && i == 6) > kmalloc_caches[1] = create_kmalloc_cache(NULL, 96, flags); > > if (KMALLOC_MIN_SIZE <= 64 && !kmalloc_caches[2] && i == 7) > kmalloc_caches[2] = create_kmalloc_cache(NULL, 192, flags); > } > } > > kmalloc_caches[7] was not null, and so kmalloc_caches[2] was never > created. > > I get this oops (with early printk on) > > ------------[ cut here ]------------ > kernel BUG at mm/slab.c:1635! > invalid opcode: 0000 [#1] PREEMPT SMP > Modules linked in: > CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.9.0-josef+ #920 > Hardware name: Supermicro X9SRE/X9SRE-3F/X9SRi/X9SRi-3F/X9SRE/X9SRE-3F/X9SRi/X9SRi-3F, BIOS 1.0a 03/06/2012 > task: ffffffff8196a410 ti: ffffffff8195a000 task.ti: ffffffff8195a000 > RIP: 0010:[<ffffffff81a49c9d>] [<ffffffff81a49c9d>] kmem_cache_init_late+0x40/0x7d > RSP: 0000:ffffffff8195bf78 EFLAGS: 00010282 > RAX: 00000000fffffff4 RBX: ffff88047f006480 RCX: 000000000000ff31 > RDX: 000000000000000e RSI: 0000000000000046 RDI: ffffffff81baf238 > RBP: ffffffff8195bf80 R08: 0000000000000400 R09: 0000000000000000 > R10: 0000000000002fa4 R11: 0000000000000000 R12: ffffffff81ab58d0 > R13: ffffffff81abd2c0 R14: ffff88047ffaf0c0 R15: 0000000000000000 > FS: 0000000000000000(0000) GS:ffff88047fc00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: ffff88047ffff000 CR3: 0000000001965000 CR4: 00000000000406b0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > Stack: > ffffffffffffffff ffffffff8195bfc0 ffffffff81a25b64 ffffffff81a2573d > ffffffff81abd2c0 0000000000003000 0000000000000000 0000000000000000 > 0000000000000000 ffffffff8195bfd0 ffffffff81a2547f ffffffff8195bfe8 > Call Trace: > [<ffffffff81a25b64>] start_kernel+0x235/0x394 > [<ffffffff81a2573d>] ? repair_env_string+0x58/0x58 > [<ffffffff81a2547f>] x86_64_start_reservations+0x2a/0x2c > [<ffffffff81a25548>] x86_64_start_kernel+0xc7/0xca > Code: 53 e8 40 57 bd ff 48 8b 05 21 f2 f4 ff 48 8d 58 a8 48 8d 43 58 48 3d a0 8e 99 81 74 1a 31 f6 48 89 df e8 ba 7f 6d ff 85 c0 74 02 <0f> 0b 48 8b 5b 58 48 83 eb 58 eb da 48 c7 c7 70 8e 99 81 e8 e6 > RIP [<ffffffff81a49c9d>] kmem_cache_init_late+0x40/0x7d > RSP <ffffffff8195bf78> > ---[ end trace 2e5587581263f881 ]--- > > This patch fixes things for me, but to maintain the rules from > Christoph's patch, kmalloc_caches[2] should have been created whenever > kmalloc_caches[7] was done. And you might also want to add: Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > diff --git a/mm/slab_common.c b/mm/slab_common.c > index d2517b0..ff3218a 100644 > --- a/mm/slab_common.c > +++ b/mm/slab_common.c > @@ -446,18 +446,18 @@ void __init create_kmalloc_caches(unsigned long flags) > if (!kmalloc_caches[i]) { > kmalloc_caches[i] = create_kmalloc_cache(NULL, > 1 << i, flags); > + } > > - /* > - * Caches that are not of the two-to-the-power-of size. > - * These have to be created immediately after the > - * earlier power of two caches > - */ > - if (KMALLOC_MIN_SIZE <= 32 && !kmalloc_caches[1] && i == 6) > - kmalloc_caches[1] = create_kmalloc_cache(NULL, 96, flags); > + /* > + * Caches that are not of the two-to-the-power-of size. > + * These have to be created immediately after the > + * earlier power of two caches > + */ > + if (KMALLOC_MIN_SIZE <= 32 && !kmalloc_caches[1] && i == 6) > + kmalloc_caches[1] = create_kmalloc_cache(NULL, 96, flags); > > - if (KMALLOC_MIN_SIZE <= 64 && !kmalloc_caches[2] && i == 7) > - kmalloc_caches[2] = create_kmalloc_cache(NULL, 192, flags); > - } > + if (KMALLOC_MIN_SIZE <= 64 && !kmalloc_caches[2] && i == 7) > + kmalloc_caches[2] = create_kmalloc_cache(NULL, 192, flags); > } > > /* Kmalloc array is now usable */ > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > ^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2013-05-08 22:09 UTC | newest]
Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-05-07 6:37 [GIT PULL] SLAB changes for v3.10 Pekka Enberg
2013-05-08 0:30 ` Tony Lindgren
2013-05-08 4:24 ` Tony Lindgren
2013-05-08 5:16 ` Tony Lindgren
2013-05-08 6:20 ` Pekka Enberg
2013-05-08 11:58 ` Glauber Costa
2013-05-08 12:26 ` Pekka Enberg
2013-05-08 12:38 ` Glauber Costa
2013-05-08 13:56 ` Christoph Lameter
2013-05-08 15:45 ` Tony Lindgren
[not found] ` <alpine.DEB.2.02.1305081309310.3124@gentwo.org>
2013-05-08 18:29 ` Christoph Lameter
2013-05-08 18:13 ` Chris Mason
2013-05-08 18:25 ` Christoph Lameter
2013-05-08 18:48 ` Chris Mason
2013-05-08 19:01 ` Christoph Lameter
2013-05-08 19:11 ` Tony Lindgren
2013-05-08 19:56 ` [PATCH] Fix crash during slab init Chris Mason
2013-05-08 20:10 ` Sören Brinkmann
2013-05-08 21:48 ` Tetsuo Handa
2013-05-08 22:09 ` Andrew Morton
2013-05-08 19:05 ` [GIT PULL] SLAB changes for v3.10 Tony Lindgren
2013-05-08 21:01 ` Konrad Rzeszutek Wilk
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox