* [PATCH] mm/slub: fix a BUG_ON() when offlining a memory node and CONFIG_SLUB_DEBUG is on
@ 2012-07-17 16:50 Jiang Liu
From: Jiang Liu @ 2012-07-17 16:50 UTC
To: Christoph Lameter, Pekka Enberg, Matt Mackall, Mel Gorman
Cc: Jianguo Wu, Jiang Liu, Tony Luck, KAMEZAWA Hiroyuki,
KOSAKI Motohiro, David Rientjes, Minchan Kim, Keping Chen,
linux-mm, linux-kernel, Jiang Liu
From: Jianguo Wu <wujianguo@huawei.com>
The SLUB allocator may trigger a BUG_ON() when offlining a memory node if
CONFIG_SLUB_DEBUG is enabled. The scenario is:

1) When creating the kmem_cache_node slab, inc_slabs_node() is called twice:
   early_kmem_cache_node_alloc
     -> new_slab
          -> inc_slabs_node
     -> inc_slabs_node
2) Later, when offlining a memory node, the extra inc_slabs_node() call made
   in early_kmem_cache_node_alloc() triggers the BUG_ON() in
   slab_mem_offline_callback():
	{
		if (n) {
			/*
			 * if n->nr_slabs > 0, slabs still exist on the node
			 * that is going down. We were unable to free them,
			 * and offline_pages() function shouldn't call this
			 * callback. So, we must fail.
			 */
			BUG_ON(slabs_node(s, offline_node));
		}
------------[ cut here ]------------
kernel BUG at mm/slub.c:3590!
invalid opcode: 0000 [#1] SMP
CPU 61
Modules linked in: autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ipv6 vfat fat dm_mirror dm_region_hash dm_log uinput iTCO_wdt iTCO_vendor_support coretemp hwmon kvm_intel kvm crc32c_intel ghash_clmulni_intel serio_raw pcspkr cdc_ether usbnet mii i2c_i801 i2c_core sg lpc_ich mfd_core shpchp ioatdma i7core_edac edac_core igb dca bnx2 ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif aesni_intel cryptd aes_x86_64 aes_generic bfa scsi_transport_fc scsi_tgt pata_acpi ata_generic ata_piix megaraid_sas dm_mod [last unloaded: microcode]
Pid: 46287, comm: sh Not tainted 3.5.0-rc4-pgtable-00215-g35f0828-dirty #85 IBM System x3850 X5 -[7143O3G]-/Node 1, Processor Card
RIP: 0010:[<ffffffff81160b2a>] [<ffffffff81160b2a>] slab_memory_callback+0x1ba/0x1c0
RSP: 0018:ffff880efdcb7c68 EFLAGS: 00010202
RAX: 0000000000000001 RBX: ffff880f7ec06100 RCX: 0000000100400001
RDX: 0000000100400002 RSI: ffff880f7ec02000 RDI: ffff880f7ec06100
RBP: ffff880efdcb7c78 R08: ffff88107b6fb098 R09: ffffffff81160a00
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000019
R13: 00000000fffffffb R14: 0000000000000000 R15: ffffffff81abe930
FS: 00007f709f342700(0000) GS:ffff880f7f3a0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000003b5a874570 CR3: 0000000f0da20000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process sh (pid: 46287, threadinfo ffff880efdcb6000, task ffff880f0fa50000)
Stack:
0000000000000004 ffff880efdcb7da8 ffff880efdcb7cb8 ffffffff81524af5
0000000000000001 ffffffff81a8b620 ffffffff81a8b640 0000000000000004
ffff880efdcb7da8 00000000ffffffff ffff880efdcb7d08 ffffffff8107a89a
Call Trace:
[<ffffffff81524af5>] notifier_call_chain+0x55/0x80
[<ffffffff8107a89a>] __blocking_notifier_call_chain+0x5a/0x80
[<ffffffff8107a8d6>] blocking_notifier_call_chain+0x16/0x20
[<ffffffff81352f0b>] memory_notify+0x1b/0x20
[<ffffffff81507104>] offline_pages+0x624/0x700
[<ffffffff811619de>] remove_memory+0x1e/0x20
[<ffffffff813530cc>] memory_block_change_state+0x13c/0x2e0
[<ffffffff81153e96>] ? alloc_pages_current+0xb6/0x120
[<ffffffff81353332>] store_mem_state+0xc2/0xd0
[<ffffffff8133e190>] dev_attr_store+0x20/0x30
[<ffffffff811e2d4f>] sysfs_write_file+0xef/0x170
[<ffffffff81173e28>] vfs_write+0xc8/0x190
[<ffffffff81173ff1>] sys_write+0x51/0x90
[<ffffffff81528d29>] system_call_fastpath+0x16/0x1b
Code: 8b 3d cb fd c4 00 be d0 00 00 00 e8 71 de ff ff 48 85 c0 75 9c 48 c7 c7 c0 7f a5 81 e8 c0 89 f1 ff b8 0d 80 00 00 e9 69 fe ff ff <0f> 0b eb fe 66 90 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83
RIP [<ffffffff81160b2a>] slab_memory_callback+0x1ba/0x1c0
RSP <ffff880efdcb7c68>
---[ end trace 749e9e9a67c78c12 ]---
Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
mm/slub.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/mm/slub.c b/mm/slub.c
index 8c691fa..f8276db 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2840,7 +2840,6 @@ static void early_kmem_cache_node_alloc(int node)
 	init_tracking(kmem_cache_node, n);
 #endif
 	init_kmem_cache_node(n);
-	inc_slabs_node(kmem_cache_node, node, page->objects);
 	add_partial(n, page, DEACTIVATE_TO_HEAD);
 }
 
--
1.7.9.5

* Re: [PATCH] mm/slub: fix a BUG_ON() when offlining a memory node and CONFIG_SLUB_DEBUG is on
From: Christoph Lameter @ 2012-07-17 17:39 UTC
To: Jiang Liu
Cc: Pekka Enberg, Matt Mackall, Mel Gorman, Jianguo Wu, Jiang Liu,
    Tony Luck, KAMEZAWA Hiroyuki, KOSAKI Motohiro, David Rientjes,
    Minchan Kim, Keping Chen, linux-mm, linux-kernel

On Wed, 18 Jul 2012, Jiang Liu wrote:

> From: Jianguo Wu <wujianguo@huawei.com>
>
> The SLUB allocator may trigger a BUG_ON() when offlining a memory node if
> CONFIG_SLUB_DEBUG is enabled. The scenario is:
>
> 1) When creating the kmem_cache_node slab, inc_slabs_node() is called twice:
>    early_kmem_cache_node_alloc
>      -> new_slab
>           -> inc_slabs_node
>      -> inc_slabs_node

new_slab() will not be able to increment the slab counter at that point.
inc_slabs_node() checks that there is no per-node structure yet and then
skips the increment.

This suggests that the call to early_kmem_cache_node_alloc() was not
needed because the per-node structure already existed. Let's fix that
instead.

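For context, the guard Christoph refers to is in inc_slabs_node(). A
minimal sketch, paraphrased from 3.5-era mm/slub.c (treat the exact
comment wording as approximate):

	static inline void inc_slabs_node(struct kmem_cache *s, int node,
								int objects)
	{
		struct kmem_cache_node *n = get_node(s, node);

		/*
		 * May be called early in order to allocate the
		 * kmem_cache_node structure itself. Solve the chicken-egg
		 * dilemma by checking whether the per-node structure
		 * exists before touching the counters.
		 */
		if (likely(n)) {
			atomic_long_inc(&n->nr_slabs);
			atomic_long_add(objects, &n->total_objects);
		}
	}

While kmem_cache_node is bootstrapping itself, the call made from inside
new_slab() finds n == NULL and is a no-op; only the explicit call at the
end of early_kmem_cache_node_alloc() actually counts the slab.
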
* RE: [PATCH] mm/slub: fix a BUG_ON() when offlining a memory node and CONFIG_SLUB_DEBUG is on
From: Luck, Tony @ 2012-07-17 17:53 UTC
To: Christoph Lameter, Jiang Liu
Cc: Pekka Enberg, Matt Mackall, Mel Gorman, Jianguo Wu, Jiang Liu,
    KAMEZAWA Hiroyuki, KOSAKI Motohiro, David Rientjes, Minchan Kim,
    Keping Chen, linux-mm@kvack.org, linux-kernel@vger.kernel.org

> This suggests that the call to early_kmem_cache_node_alloc() was not
> needed because the per-node structure already existed. Let's fix that
> instead.

Perhaps by just having one API for users to call? It seems odd to force
users to figure out whether they are being called before some magic time
during boot and to use the "early...()" call. Shouldn't we hide this sort
of detail from them?

-Tony

* RE: [PATCH] mm/slub: fix a BUG_ON() when offlining a memory node and CONFIG_SLUB_DEBUG is on
From: Christoph Lameter @ 2012-07-18 15:30 UTC
To: Luck, Tony
Cc: Jiang Liu, Pekka Enberg, Matt Mackall, Mel Gorman, Jianguo Wu,
    Jiang Liu, KAMEZAWA Hiroyuki, KOSAKI Motohiro, David Rientjes,
    Minchan Kim, Keping Chen, linux-mm@kvack.org, linux-kernel@vger.kernel.org

On Tue, 17 Jul 2012, Luck, Tony wrote:

> > This suggests that the call to early_kmem_cache_node_alloc() was not
> > needed because the per-node structure already existed. Let's fix that
> > instead.
>
> Perhaps by just having one API for users to call? It seems odd to force
> users to figure out whether they are being called before some magic time
> during boot and to use the "early...()" call. Shouldn't we hide this sort
> of detail from them?

The early_ calls are internal to the allocator and are not exposed to
users.

* Re: [PATCH] mm/slub: fix a BUG_ON() when offlining a memory node and CONFIG_SLUB_DEBUG is on
From: Jiang Liu @ 2012-07-18 16:52 UTC
To: Christoph Lameter
Cc: Pekka Enberg, Matt Mackall, Mel Gorman, Jianguo Wu, Jiang Liu,
    Tony Luck, KAMEZAWA Hiroyuki, KOSAKI Motohiro, David Rientjes,
    Minchan Kim, Keping Chen, linux-mm, linux-kernel

Hi Chris,
	I found that the previous analysis of the BUG_ON() issue is incorrect
after another round of code review.
	The real issue is that early_kmem_cache_node_alloc() calls
inc_slabs_node(kmem_cache_node, node, page->objects) to increase the object
count on the local node, no matter whether the page was allocated from the
local node or a remote one. With the current implementation that is OK,
because every memory node has normal memory, so the page is always
allocated from the local node. We are now working on a patch set to improve
memory hotplug. The basic idea is to let some memory nodes host only a
ZONE_MOVABLE zone, so the whole memory node can easily be removed when
needed. That means some memory nodes have no ZONE_NORMAL/ZONE_DMA, and the
page will be allocated from a remote node in early_kmem_cache_node_alloc().
But early_kmem_cache_node_alloc() still increases the object count on the
local node, which eventually triggers the BUG_ON() when removing the
affected memory node.
	I will try to work out another version of the patch.
	Thanks!
	Gerry

On 07/18/2012 01:39 AM, Christoph Lameter wrote:
> On Wed, 18 Jul 2012, Jiang Liu wrote:
>
>> From: Jianguo Wu <wujianguo@huawei.com>
>>
>> The SLUB allocator may trigger a BUG_ON() when offlining a memory node if
>> CONFIG_SLUB_DEBUG is enabled. The scenario is:
>>
>> 1) When creating the kmem_cache_node slab, inc_slabs_node() is called twice:
>>    early_kmem_cache_node_alloc
>>      -> new_slab
>>           -> inc_slabs_node
>>      -> inc_slabs_node
>
> new_slab() will not be able to increment the slab counter at that point.
> inc_slabs_node() checks that there is no per-node structure yet and then
> skips the increment.
>
> This suggests that the call to early_kmem_cache_node_alloc() was not
> needed because the per-node structure already existed. Let's fix that
> instead.

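To illustrate the mismatch Gerry describes, here is a hedged sketch of the
relevant steps in early_kmem_cache_node_alloc() on a movable-only node;
the scenario comes from the hotplug patch set under discussion, not from
mainline behavior at the time:

	/* "node" hosts only ZONE_MOVABLE in this scenario. */
	page = new_slab(kmem_cache_node, GFP_NOWAIT, node);
	/*
	 * With no ZONE_NORMAL/ZONE_DMA on "node", the page allocator
	 * falls back to another node, so page_to_nid(page) != node.
	 */

	/* ... bootstrap of the kmem_cache_node structure omitted ... */

	/*
	 * The object count is nevertheless credited to "node". Since no
	 * slab page actually lives on "node", nothing ever decrements the
	 * count there, slabs_node(s, node) stays nonzero, and the BUG_ON()
	 * in slab_mem_offline_callback() fires when "node" is offlined.
	 */
	inc_slabs_node(kmem_cache_node, node, page->objects);
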
* Re: [PATCH] mm/slub: fix a BUG_ON() when offlining a memory node and CONFIG_SLUB_DEBUG is on
From: Christoph Lameter @ 2012-07-18 18:53 UTC
To: Jiang Liu
Cc: Pekka Enberg, Matt Mackall, Mel Gorman, Jianguo Wu, Jiang Liu,
    Tony Luck, KAMEZAWA Hiroyuki, KOSAKI Motohiro, David Rientjes,
    Minchan Kim, Keping Chen, linux-mm, linux-kernel

On Thu, 19 Jul 2012, Jiang Liu wrote:

> I found that the previous analysis of the BUG_ON() issue is incorrect
> after another round of code review.
> The real issue is that early_kmem_cache_node_alloc() calls
> inc_slabs_node(kmem_cache_node, node, page->objects) to increase the
> object count on the local node, no matter whether the page was allocated
> from the local node or a remote one. With the current implementation that
> is OK, because every memory node has normal memory, so the page is always
> allocated from the local node. We are now working on a patch set to
> improve memory hotplug. The basic idea is to let some memory nodes host
> only a ZONE_MOVABLE zone, so the whole memory node can easily be removed
> when needed. That means some memory nodes have no ZONE_NORMAL/ZONE_DMA,
> and the page will be allocated from a remote node in
> early_kmem_cache_node_alloc(). But early_kmem_cache_node_alloc() still
> increases the object count on the local node, which eventually triggers
> the BUG_ON() when removing the affected memory node.

That does not work. If the node has only ZONE_MOVABLE then no slab object
can be allocated from that node. You need to modify the slab allocators to
not allocate a per-node structure for those nodes and to forbid all slab
allocations from such a node.

Actually, that should already work, because only ZONE_NORMAL nodes should
get a per-node structure: slab objects can only be allocated from
ZONE_NORMAL.

* [RFC PATCH v2] SLUB: enhance slub to handle memory nodes without normal memory
From: Jiang Liu @ 2012-07-24 9:55 UTC
To: Christoph Lameter
Cc: WuJianguo, Tony Luck, Pekka Enberg, Matt Mackall, Mel Gorman,
    Yinghai Lu, KAMEZAWA Hiroyuki, KOSAKI Motohiro, David Rientjes,
    Minchan Kim, Keping Chen, linux-mm, linux-kernel, Jiang Liu

From: WuJianguo <wujianguo@huawei.com>

When handling a memory node with only a movable zone,
early_kmem_cache_node_alloc() will allocate a page from a remote node but
still increase the object count on the local node, which triggers the
BUG_ON() below when hot-removing that memory node. Actually, there is no
need to create a kmem_cache_node for a memory node with only a movable
zone at all.

------------[ cut here ]------------
kernel BUG at mm/slub.c:3590!
invalid opcode: 0000 [#1] SMP
CPU 61
Modules linked in: autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ipv6 vfat fat dm_mirror dm_region_hash dm_log uinput iTCO_wdt iTCO_vendor_support coretemp hwmon kvm_intel kvm crc32c_intel ghash_clmulni_intel serio_raw pcspkr cdc_ether usbnet mii i2c_i801 i2c_core sg lpc_ich mfd_core shpchp ioatdma i7core_edac edac_core igb dca bnx2 ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif aesni_intel cryptd aes_x86_64 aes_generic bfa scsi_transport_fc scsi_tgt pata_acpi ata_generic ata_piix megaraid_sas dm_mod [last unloaded: microcode]
Pid: 46287, comm: sh Not tainted 3.5.0-rc4-pgtable-00215-g35f0828-dirty #85 IBM System x3850 X5 -[7143O3G]-/Node 1, Processor Card
RIP: 0010:[<ffffffff81160b2a>] [<ffffffff81160b2a>] slab_memory_callback+0x1ba/0x1c0
RSP: 0018:ffff880efdcb7c68 EFLAGS: 00010202
RAX: 0000000000000001 RBX: ffff880f7ec06100 RCX: 0000000100400001
RDX: 0000000100400002 RSI: ffff880f7ec02000 RDI: ffff880f7ec06100
RBP: ffff880efdcb7c78 R08: ffff88107b6fb098 R09: ffffffff81160a00
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000019
R13: 00000000fffffffb R14: 0000000000000000 R15: ffffffff81abe930
FS: 00007f709f342700(0000) GS:ffff880f7f3a0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000003b5a874570 CR3: 0000000f0da20000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process sh (pid: 46287, threadinfo ffff880efdcb6000, task ffff880f0fa50000)
Stack:
0000000000000004 ffff880efdcb7da8 ffff880efdcb7cb8 ffffffff81524af5
0000000000000001 ffffffff81a8b620 ffffffff81a8b640 0000000000000004
ffff880efdcb7da8 00000000ffffffff ffff880efdcb7d08 ffffffff8107a89a
Call Trace:
[<ffffffff81524af5>] notifier_call_chain+0x55/0x80
[<ffffffff8107a89a>] __blocking_notifier_call_chain+0x5a/0x80
[<ffffffff8107a8d6>] blocking_notifier_call_chain+0x16/0x20
[<ffffffff81352f0b>] memory_notify+0x1b/0x20
[<ffffffff81507104>] offline_pages+0x624/0x700
[<ffffffff811619de>] remove_memory+0x1e/0x20
[<ffffffff813530cc>] memory_block_change_state+0x13c/0x2e0
[<ffffffff81153e96>] ? alloc_pages_current+0xb6/0x120
[<ffffffff81353332>] store_mem_state+0xc2/0xd0
[<ffffffff8133e190>] dev_attr_store+0x20/0x30
[<ffffffff811e2d4f>] sysfs_write_file+0xef/0x170
[<ffffffff81173e28>] vfs_write+0xc8/0x190
[<ffffffff81173ff1>] sys_write+0x51/0x90
[<ffffffff81528d29>] system_call_fastpath+0x16/0x1b
Code: 8b 3d cb fd c4 00 be d0 00 00 00 e8 71 de ff ff 48 85 c0 75 9c 48 c7 c7 c0 7f a5 81 e8 c0 89 f1 ff b8 0d 80 00 00 e9 69 fe ff ff <0f> 0b eb fe 66 90 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83
RIP [<ffffffff81160b2a>] slab_memory_callback+0x1ba/0x1c0
RSP <ffff880efdcb7c68>
---[ end trace 749e9e9a67c78c12 ]---

Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
Signed-off-by: Jiang Liu <liuj97@gmail.com>
---
 mm/slub.c |   44 +++++++++++++++++++++++++++++++++-----------
 1 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 8c691fa..3976745 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2803,6 +2803,17 @@ static inline int alloc_kmem_cache_cpus(struct kmem_cache *s)
 
 static struct kmem_cache *kmem_cache_node;
 
+static bool node_has_normal_memory(int node)
+{
+	int i;
+
+	for (i = ZONE_NORMAL; i >= 0; i--)
+		if (populated_zone(&NODE_DATA(node)->node_zones[i]))
+			return true;
+
+	return false;
+}
+
 /*
  * No kmalloc_node yet so do it by hand. We know that this is the first
  * slab on the node for this slabcache. There are no concurrent accesses
@@ -2866,6 +2877,10 @@ static int init_kmem_cache_nodes(struct kmem_cache *s)
 	for_each_node_state(node, N_NORMAL_MEMORY) {
 		struct kmem_cache_node *n;
 
+		/* Do not create kmem_cache_node for node without normal memory */
+		if (!node_has_normal_memory(node))
+			continue;
+
 		if (slab_state == DOWN) {
 			early_kmem_cache_node_alloc(node);
 			continue;
@@ -3178,9 +3193,11 @@ static inline int kmem_cache_close(struct kmem_cache *s)
 	for_each_node_state(node, N_NORMAL_MEMORY) {
 		struct kmem_cache_node *n = get_node(s, node);
 
-		free_partial(s, n);
-		if (n->nr_partial || slabs_node(s, node))
-			return 1;
+		if (n) {
+			free_partial(s, n);
+			if (n->nr_partial || slabs_node(s, node))
+				return 1;
+		}
 	}
 	free_kmem_cache_nodes(s);
 	return 0;
@@ -3509,7 +3526,7 @@ int kmem_cache_shrink(struct kmem_cache *s)
 	for_each_node_state(node, N_NORMAL_MEMORY) {
 		n = get_node(s, node);
 
-		if (!n->nr_partial)
+		if (!n || !n->nr_partial)
 			continue;
 
 		for (i = 0; i < objects; i++)
@@ -4170,7 +4187,8 @@ static long validate_slab_cache(struct kmem_cache *s)
 	for_each_node_state(node, N_NORMAL_MEMORY) {
 		struct kmem_cache_node *n = get_node(s, node);
 
-		count += validate_slab_node(s, n, map);
+		if (n)
+			count += validate_slab_node(s, n, map);
 	}
 	kfree(map);
 	return count;
@@ -4339,7 +4357,7 @@ static int list_locations(struct kmem_cache *s, char *buf,
 		unsigned long flags;
 		struct page *page;
 
-		if (!atomic_long_read(&n->nr_slabs))
+		if (!n || !atomic_long_read(&n->nr_slabs))
 			continue;
 
 		spin_lock_irqsave(&n->list_lock, flags);
@@ -4534,11 +4552,13 @@ static ssize_t show_slab_objects(struct kmem_cache *s,
 		for_each_node_state(node, N_NORMAL_MEMORY) {
 			struct kmem_cache_node *n = get_node(s, node);
 
-			if (flags & SO_TOTAL)
-				x = atomic_long_read(&n->total_objects);
-			else if (flags & SO_OBJECTS)
-				x = atomic_long_read(&n->total_objects) -
-					count_partial(n, count_free);
+			if (!n)
+				continue;
+			if (flags & SO_TOTAL)
+				x = atomic_long_read(&n->total_objects);
+			else if (flags & SO_OBJECTS)
+				x = atomic_long_read(&n->total_objects) -
+					count_partial(n, count_free);
 			else
 				x = atomic_long_read(&n->nr_slabs);
 
@@ -4552,6 +4572,8 @@ static ssize_t show_slab_objects(struct kmem_cache *s,
 		for_each_node_state(node, N_NORMAL_MEMORY) {
 			struct kmem_cache_node *n = get_node(s, node);
 
+			if (!n)
+				continue;
 			if (flags & SO_TOTAL)
 				x = count_partial(n, count_total);
 			else if (flags & SO_OBJECTS)
--
1.7.1

* Re: [RFC PATCH v2] SLUB: enhance slub to handle memory nodes without normal memory
From: Christoph Lameter @ 2012-07-24 14:45 UTC
To: Jiang Liu
Cc: WuJianguo, Tony Luck, Pekka Enberg, Matt Mackall, Mel Gorman,
    Yinghai Lu, KAMEZAWA Hiroyuki, KOSAKI Motohiro, David Rientjes,
    Minchan Kim, Keping Chen, linux-mm, linux-kernel, Jiang Liu

On Tue, 24 Jul 2012, Jiang Liu wrote:

> diff --git a/mm/slub.c b/mm/slub.c
> index 8c691fa..3976745 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2803,6 +2803,17 @@ static inline int alloc_kmem_cache_cpus(struct kmem_cache *s)
>
>  static struct kmem_cache *kmem_cache_node;
>
> +static bool node_has_normal_memory(int node)
> +{
> +	int i;
> +
> +	for (i = ZONE_NORMAL; i >= 0; i--)
> +		if (populated_zone(&NODE_DATA(node)->node_zones[i]))
> +			return true;
> +
> +	return false;
> +}

There is already an N_NORMAL_MEMORY node map that contains the list of
nodes that have *normal* memory usable by the slab allocators etc. I think
the cleanest solution would be to clear the corresponding node bits for
your special movable-only nodes. Then you won't need to modify other
subsystems anymore.

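A hedged sketch of what Christoph suggests; the helper name and the exact
hook point are hypothetical, while node_set_state()/node_clear_state() and
populated_zone() are existing APIs:

	/*
	 * Recompute the node state when a node's memory goes online or
	 * offline: a movable-only node is never marked N_NORMAL_MEMORY,
	 * so every for_each_node_state(node, N_NORMAL_MEMORY) loop in the
	 * slab allocators skips it without any per-subsystem changes.
	 */
	static void update_node_normal_memory_state(int nid)
	{
		pg_data_t *pgdat = NODE_DATA(nid);
		enum zone_type zt;

		for (zt = 0; zt <= ZONE_NORMAL; zt++) {
			if (populated_zone(&pgdat->node_zones[zt])) {
				node_set_state(nid, N_NORMAL_MEMORY);
				return;
			}
		}
		node_clear_state(nid, N_NORMAL_MEMORY);
	}
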
* Re: [RFC PATCH v2] SLUB: enhance slub to handle memory nodes without normal memory
From: Jiang Liu @ 2012-07-24 17:00 UTC
To: Christoph Lameter
Cc: Jiang Liu, WuJianguo, Tony Luck, Pekka Enberg, Matt Mackall,
    Mel Gorman, Yinghai Lu, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
    David Rientjes, Minchan Kim, Keping Chen, linux-mm, linux-kernel

On 07/24/2012 10:45 PM, Christoph Lameter wrote:
> There is already an N_NORMAL_MEMORY node map that contains the list of
> nodes that have *normal* memory usable by the slab allocators etc. I
> think the cleanest solution would be to clear the corresponding node bits
> for your special movable-only nodes. Then you won't need to modify other
> subsystems anymore.

Hi Chris,
	Thanks for your comments! I have thought about the solution you
mention, but it doesn't seem to work. We have node masks for both
N_NORMAL_MEMORY and N_HIGH_MEMORY to distinguish between normal and high
memory on platforms such as x86, but we still have no mechanism to
distinguish between "normal" and "movable" memory. So for memory nodes
with only movable zones, we still set N_NORMAL_MEMORY. One possible
solution is to add a node mask such as "N_NORMAL_OR_MOVABLE_MEMORY", but I
haven't tried that yet. Will have a try.
	Thanks!
	Gerry

* Re: [RFC PATCH v2] SLUB: enhance slub to handle memory nodes without normal memory
From: Christoph Lameter @ 2012-07-25 15:31 UTC
To: Jiang Liu
Cc: Jiang Liu, WuJianguo, Tony Luck, Pekka Enberg, Matt Mackall,
    Mel Gorman, Yinghai Lu, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
    David Rientjes, Minchan Kim, Keping Chen, linux-mm, linux-kernel

On Wed, 25 Jul 2012, Jiang Liu wrote:

> Thanks for your comments! I have thought about the solution you mention,
> but it doesn't seem to work. We have node masks for both N_NORMAL_MEMORY
> and N_HIGH_MEMORY to distinguish between normal and high memory on
> platforms such as x86, but we still have no mechanism to distinguish
> between "normal" and "movable" memory. So for memory nodes with only
> movable zones, we still set N_NORMAL_MEMORY. One possible solution is to
> add a node mask such as "N_NORMAL_OR_MOVABLE_MEMORY", but I haven't tried
> that yet. Will have a try.

Hmmm... Maybe add another N_LRU_MEMORY bitmask and replace those
N_NORMAL_MEMORY uses with N_LRU_MEMORY as needed? Use N_NORMAL_MEMORY for
subsystems that need to do regular (non-LRU) allocations that are not
movable?

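A sketch of the N_LRU_MEMORY idea as an extension of the node_states enum
(illustrative only; no such state existed in mainline at the time):

	enum node_states {
		N_POSSIBLE,		/* the node could become online */
		N_ONLINE,		/* the node is online */
		N_NORMAL_MEMORY,	/* has regular, slab-capable memory */
	#ifdef CONFIG_HIGHMEM
		N_HIGH_MEMORY,		/* has regular or high memory */
	#else
		N_HIGH_MEMORY = N_NORMAL_MEMORY,
	#endif
		N_LRU_MEMORY,		/* has memory usable for LRU pages,
					   including movable-only nodes
					   (hypothetical) */
		N_CPU,			/* the node has one or more cpus */
		NR_NODE_STATES
	};

Subsystems that manage LRU/movable pages would then iterate with
for_each_node_state(node, N_LRU_MEMORY), while the slab allocators keep
using N_NORMAL_MEMORY and never see movable-only nodes at all.
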