* Re: 4.10-rc2 list_lru_isolate list corruption
       [not found] <20170106052056.jihy5denyxsnfuo5@codemonkey.org.uk>
@ 2017-01-06 16:59 ` Johannes Weiner
  2017-01-06 19:58   ` Dave Jones
  0 siblings, 1 reply; 7+ messages in thread
From: Johannes Weiner @ 2017-01-06 16:59 UTC
  To: Dave Jones; +Cc: Jan Kara, linux-mm

Dave, can you reproduce this by any chance with this patch applied?

diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index 6f382e07de77..0783af1c0ebb 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -640,6 +640,8 @@ static inline void radix_tree_shrink(struct radix_tree_root *root,
             update_node(node, private);
         }
 
+        WARN_ON_ONCE(!list_empty(&node->private_list));
+
         radix_tree_node_free(node);
     }
 }
@@ -666,6 +668,8 @@ static void delete_node(struct radix_tree_root *root,
             root->rnode = NULL;
         }
 
+        WARN_ON_ONCE(!list_empty(&node->private_list));
+
         radix_tree_node_free(node);
 
         node = parent;
@@ -767,6 +771,7 @@ static void radix_tree_free_nodes(struct radix_tree_node *node)
             struct radix_tree_node *old = child;
             offset = child->offset + 1;
             child = child->parent;
+            WARN_ON_ONCE(!list_empty(&node->private_list));
             radix_tree_node_free(old);
             if (old == entry_to_node(node))
                 return;

On Fri, Jan 06, 2017 at 12:20:56AM -0500, Dave Jones wrote:
> While fuzzing today, I triggered list corruption in the mm code twice.
>
> Exhibit a:
>
> WARNING: CPU: 1 PID: 53 at lib/list_debug.c:55 __list_del_entry_valid+0x5c/0xc0
> list_del corruption. next->prev should be ffff8804c31b8e60, but was ffffffff813d2dc0
> CPU: 1 PID: 53 Comm: kswapd0 Not tainted 4.10.0-rc2-think+ #2
> Call Trace:
>  dump_stack+0x4f/0x73
>  __warn+0xcb/0xf0
>  warn_slowpath_fmt+0x5f/0x80
>  ? warn_slowpath_fmt+0x5/0x80
>  ? radix_tree_free_nodes+0xa0/0xa0
>  __list_del_entry_valid+0x5c/0xc0
>  list_lru_isolate+0x1a/0x40
>  shadow_lru_isolate+0x3e/0x220
>  __list_lru_walk_one.isra.4+0x9b/0x190
>  ? memcg_drain_all_list_lrus+0x1d0/0x1d0
>  list_lru_walk_one+0x23/0x30
>  scan_shadow_nodes+0x2e/0x40
>  shrink_slab.part.44+0x23d/0x5d0
>  ? 0xffffffffa0285077
>  shrink_node+0x22c/0x330
>  kswapd+0x392/0x8f0
>  kthread+0x10f/0x150
>  ? mem_cgroup_shrink_node+0x2e0/0x2e0
>  ? kthread_create_on_node+0x60/0x60
>  ret_from_fork+0x22/0x30
>
>
> Exhibit b:
>
>
> WARNING: CPU: 0 PID: 17728 at lib/list_debug.c:55 __list_del_entry_valid+0x5c/0xc0
> list_del corruption. next->prev should be ffff8804f8972030, but was ffffffff813d2dc0
> CPU: 0 PID: 17728 Comm: trinity-c28 Not tainted 4.10.0-rc2-think+ #2
> Call Trace:
>  dump_stack+0x4f/0x73
>  __warn+0xcb/0xf0
>  warn_slowpath_fmt+0x5f/0x80
>  ? warn_slowpath_fmt+0x5/0x80
>  ? radix_tree_free_nodes+0xa0/0xa0
>  __list_del_entry_valid+0x5c/0xc0
>  list_lru_isolate+0x1a/0x40
>  shadow_lru_isolate+0x3e/0x220
>  __list_lru_walk_one.isra.4+0x9b/0x190
>  ? memcg_drain_all_list_lrus+0x1d0/0x1d0
>  list_lru_walk_one+0x23/0x30
>  scan_shadow_nodes+0x2e/0x40
>  shrink_slab.part.44+0x23d/0x5d0
>  ? 0xffffffffa0333077
>  shrink_node+0x22c/0x330
>  do_try_to_free_pages+0xf5/0x330
>  try_to_free_pages+0x132/0x310
>  __alloc_pages_slowpath+0x357/0xaa0
>  __alloc_pages_nodemask+0x3cc/0x460
>  __do_page_cache_readahead+0x165/0x370
>  ? __do_page_cache_readahead+0xed/0x370
>  ? __do_page_cache_readahead+0x5/0x370
>  ondemand_readahead+0x112/0x350
>  ? page_cache_sync_readahead+0x5/0x50
>  page_cache_sync_readahead+0x31/0x50
>  generic_file_read_iter+0x724/0x960
>  ? rw_copy_check_uvector+0x8e/0x190
>  ? generic_file_read_iter+0x5/0x960
>  do_iter_readv_writev+0xb8/0x120
>  do_readv_writev+0x1a4/0x250
>  ? do_readv_writev+0x5/0x250
>  ? vfs_readv+0x5/0x50
>  vfs_readv+0x3c/0x50
>  do_preadv+0xb5/0xd0
>  SyS_preadv+0x11/0x20
>  do_syscall_64+0x61/0x170
>  entry_SYSCALL64_slow_path+0x25/0x25
> RIP: 0033:0x7f5cb7c1e119
> RSP: 002b:00007ffc7e7d2758 EFLAGS: 00000246
> [CONT START] ORIG_RAX: 0000000000000127
> RAX: ffffffffffffffda RBX: 0000000000000127 RCX: 00007f5cb7c1e119
> RDX: 0000000000000037 RSI: 00005561d7798a70 RDI: 000000000000000c
> RBP: 00007f5cb8228000 R08: 00000000a0000033 R09: 0000000000000030
> R10: 0000000000400000 R11: 0000000000000246 R12: 0000000000000002
> R13: 00007f5cb8228048 R14: 00007f5cb82f3ad8 R15: 00007f5cb8228000
>
>
> Interesting that the 'but was' value is the same on two separate boots.
>
>
> It looks like mm/list_lru.c didn't change recently, but mm/workingset.c did,
> which calls into this.. Johannes ?
>
> Dave
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

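The "list_del corruption. next->prev should be ..., but was ..." lines in the report above come from a sanity check that runs before a list entry is unlinked: both neighbours must still point back at the entry being removed. A minimal userspace sketch of that kind of check follows. It assumes a simplified `struct list_head`, and the helper names (`list_add_tail_entry`, `list_del_entry_is_valid`, `list_del_checked`) are hypothetical; it is an illustration, not the kernel's lib/list_debug.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct list_head (assumption). */
struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/* Make a list head (or a detached entry) point at itself. */
static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Link entry just before head, i.e. at the tail of the list. */
static void list_add_tail_entry(struct list_head *entry, struct list_head *head)
{
    assert(entry != NULL && head != NULL);
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Hypothetical helper: an entry may only be unlinked if both
 * neighbours still point back at it. */
static bool list_del_entry_is_valid(const struct list_head *entry)
{
    return entry->prev->next == entry && entry->next->prev == entry;
}

/* Unlink entry if the check passes; otherwise leave the list alone
 * and report failure (the kernel prints a warning at this point). */
static bool list_del_checked(struct list_head *entry)
{
    if (!list_del_entry_is_valid(entry))
        return false;
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    list_init(entry);
    return true;
}
```

If a neighbour's memory has been freed and reused while it was still linked, its `prev` pointer no longer refers to the entry, so `list_del_checked()` returns false. In this sketch that corresponds to the "but was" value in the traces.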
* Re: 4.10-rc2 list_lru_isolate list corruption
  2017-01-06 16:59 ` 4.10-rc2 list_lru_isolate list corruption Johannes Weiner
@ 2017-01-06 19:58   ` Dave Jones
  2017-01-07  1:19     ` Johannes Weiner
  0 siblings, 1 reply; 7+ messages in thread
From: Dave Jones @ 2017-01-06 19:58 UTC
  To: Johannes Weiner; +Cc: Jan Kara, linux-mm

On Fri, Jan 06, 2017 at 11:59:41AM -0500, Johannes Weiner wrote:
> Dave, can you reproduce this by any chance with this patch applied?

yep.

> diff --git a/lib/radix-tree.c b/lib/radix-tree.c
> index 6f382e07de77..0783af1c0ebb 100644
> --- a/lib/radix-tree.c
> +++ b/lib/radix-tree.c
> @@ -640,6 +640,8 @@ static inline void radix_tree_shrink(struct radix_tree_root *root,
>              update_node(node, private);
>          }
>
> +        WARN_ON_ONCE(!list_empty(&node->private_list));
> +
>          radix_tree_node_free(node);
>      }
>  }

[ 8467.462878] WARNING: CPU: 2 PID: 53 at lib/radix-tree.c:643 delete_node+0x1e4/0x200
[ 8467.468770] CPU: 2 PID: 53 Comm: kswapd0 Not tainted 4.10.0-rc2-think+ #3
[ 8467.480436] Call Trace:
[ 8467.486213]  dump_stack+0x4f/0x73
[ 8467.491999]  __warn+0xcb/0xf0
[ 8467.497769]  warn_slowpath_null+0x1d/0x20
[ 8467.503566]  delete_node+0x1e4/0x200
[ 8467.509468]  __radix_tree_delete_node+0xd/0x10
[ 8467.515425]  shadow_lru_isolate+0xe6/0x220
[ 8467.521337]  __list_lru_walk_one.isra.4+0x9b/0x190
[ 8467.527176]  ? memcg_drain_all_list_lrus+0x1d0/0x1d0
[ 8467.533066]  list_lru_walk_one+0x23/0x30
[ 8467.538953]  scan_shadow_nodes+0x2e/0x40
[ 8467.544840]  shrink_slab.part.44+0x23d/0x5d0
[ 8467.550751]  ? 0xffffffffa023a077
[ 8467.556639]  shrink_node+0x22c/0x330
[ 8467.562542]  kswapd+0x392/0x8f0
[ 8467.568422]  kthread+0x10f/0x150
[ 8467.574313]  ? mem_cgroup_shrink_node+0x2e0/0x2e0
[ 8467.580266]  ? kthread_create_on_node+0x60/0x60
[ 8467.586203]  ret_from_fork+0x29/0x40
[ 8467.592109] ---[ end trace f790bafb683609d5 ]---

* Re: 4.10-rc2 list_lru_isolate list corruption
  2017-01-06 19:58   ` Dave Jones
@ 2017-01-07  1:19     ` Johannes Weiner
  2017-01-08  0:07       ` Dave Jones
  0 siblings, 1 reply; 7+ messages in thread
From: Johannes Weiner @ 2017-01-07  1:19 UTC
  To: Dave Jones; +Cc: Jan Kara, linux-mm

On Fri, Jan 06, 2017 at 02:58:51PM -0500, Dave Jones wrote:
> On Fri, Jan 06, 2017 at 11:59:41AM -0500, Johannes Weiner wrote:
> > diff --git a/lib/radix-tree.c b/lib/radix-tree.c
> > index 6f382e07de77..0783af1c0ebb 100644
> > --- a/lib/radix-tree.c
> > +++ b/lib/radix-tree.c
> > @@ -640,6 +640,8 @@ static inline void radix_tree_shrink(struct radix_tree_root *root,
> >              update_node(node, private);
> >          }
> >
> > +        WARN_ON_ONCE(!list_empty(&node->private_list));
> > +
> >          radix_tree_node_free(node);
> >      }
> >  }
>
> [ 8467.462878] WARNING: CPU: 2 PID: 53 at lib/radix-tree.c:643 delete_node+0x1e4/0x200
> [ 8467.468770] CPU: 2 PID: 53 Comm: kswapd0 Not tainted 4.10.0-rc2-think+ #3
> [ 8467.480436] Call Trace:
> [ 8467.486213]  dump_stack+0x4f/0x73
> [ 8467.491999]  __warn+0xcb/0xf0
> [ 8467.497769]  warn_slowpath_null+0x1d/0x20
> [ 8467.503566]  delete_node+0x1e4/0x200
> [ 8467.509468]  __radix_tree_delete_node+0xd/0x10
> [ 8467.515425]  shadow_lru_isolate+0xe6/0x220
> [ 8467.521337]  __list_lru_walk_one.isra.4+0x9b/0x190
> [ 8467.527176]  ? memcg_drain_all_list_lrus+0x1d0/0x1d0
> [ 8467.533066]  list_lru_walk_one+0x23/0x30
> [ 8467.538953]  scan_shadow_nodes+0x2e/0x40
> [ 8467.544840]  shrink_slab.part.44+0x23d/0x5d0
> [ 8467.550751]  ? 0xffffffffa023a077
> [ 8467.556639]  shrink_node+0x22c/0x330
> [ 8467.562542]  kswapd+0x392/0x8f0
> [ 8467.568422]  kthread+0x10f/0x150
> [ 8467.574313]  ? mem_cgroup_shrink_node+0x2e0/0x2e0
> [ 8467.580266]  ? kthread_create_on_node+0x60/0x60
> [ 8467.586203]  ret_from_fork+0x29/0x40
> [ 8467.592109] ---[ end trace f790bafb683609d5 ]---

Argh, __radix_tree_delete_node() makes the flawed assumption that only
the immediate branch it's mucking with can collapse. But this warning
points out that a sibling branch can collapse too, including its leaf.

Can you try if this patch fixes the problem?

---

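The WARN_ON_ONCE() in the debugging patch encodes an invariant: a node must be off its private_list before it is handed to radix_tree_node_free(). Otherwise the list it was on keeps a pointer into freed memory. The following userspace sketch illustrates only that invariant. It assumes a simplified list type, and the names `tree_node`, `tree_node_alloc` and `tree_node_free_checked` are hypothetical. It is not the fix discussed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-in for the kernel's struct list_head (assumption). */
struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

static bool list_is_empty(const struct list_head *head)
{
    return head->next == head;
}

/* Link entry right after head. */
static void list_add_entry(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Unlink entry and leave it pointing at itself. */
static void list_del_init_entry(struct list_head *entry)
{
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    list_init(entry);
}

/* Hypothetical tree node that can sit on an external LRU list. */
struct tree_node {
    struct list_head private_list;
};

static struct tree_node *tree_node_alloc(void)
{
    struct tree_node *node = calloc(1, sizeof(*node));

    if (node != NULL)
        list_init(&node->private_list);
    return node;
}

/* Free the node only if it is no longer linked anywhere. Unlike the
 * kernel's WARN_ON_ONCE(), which warns and then frees regardless,
 * this sketch refuses so the caller can observe the violation. */
static bool tree_node_free_checked(struct tree_node *node)
{
    if (!list_is_empty(&node->private_list))
        return false;
    free(node);
    return true;
}
```

A caller that links a node onto an LRU list and then tries to free it gets `false` back until `list_del_init_entry()` has been called on the node's `private_list`. Freeing while still linked is the situation the warning in the trace above reports.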
* Re: 4.10-rc2 list_lru_isolate list corruption
  2017-01-07  1:19     ` Johannes Weiner
@ 2017-01-08  0:07       ` Dave Jones
  2017-01-08  0:37         ` Hugh Dickins
  0 siblings, 1 reply; 7+ messages in thread
From: Dave Jones @ 2017-01-08  0:07 UTC
  To: Johannes Weiner; +Cc: Jan Kara, linux-mm

On Fri, Jan 06, 2017 at 08:19:31PM -0500, Johannes Weiner wrote:

> Argh, __radix_tree_delete_node() makes the flawed assumption that only
> the immediate branch it's mucking with can collapse. But this warning
> points out that a sibling branch can collapse too, including its leaf.
>
> Can you try if this patch fixes the problem?

18 hours and still running.. I think we can call it good.

Dave

* Re: 4.10-rc2 list_lru_isolate list corruption
  2017-01-08  0:07       ` Dave Jones
@ 2017-01-08  0:37         ` Hugh Dickins
  2017-01-08  2:02           ` Johannes Weiner
  0 siblings, 1 reply; 7+ messages in thread
From: Hugh Dickins @ 2017-01-08  0:37 UTC
  To: Johannes Weiner; +Cc: Dave Jones, Jan Kara, linux-mm

On Sat, 7 Jan 2017, Dave Jones wrote:
> On Fri, Jan 06, 2017 at 08:19:31PM -0500, Johannes Weiner wrote:
>
> > Argh, __radix_tree_delete_node() makes the flawed assumption that only
> > the immediate branch it's mucking with can collapse. But this warning
> > points out that a sibling branch can collapse too, including its leaf.
> >
> > Can you try if this patch fixes the problem?
>
> 18 hours and still running.. I think we can call it good.

I'm inclined to agree, though I haven't had it running long enough
(on a load like when it hit me a few times before) to be sure yet myself.
I'd rather see the proposed fix go in than wait longer for me:
I've certainly seen nothing bad from it yet.

Hugh

* Re: 4.10-rc2 list_lru_isolate list corruption
  2017-01-08  0:37         ` Hugh Dickins
@ 2017-01-08  2:02           ` Johannes Weiner
  2017-01-08 20:30             ` Hugh Dickins
  0 siblings, 1 reply; 7+ messages in thread
From: Johannes Weiner @ 2017-01-08  2:02 UTC
  To: Hugh Dickins; +Cc: Dave Jones, Jan Kara, linux-mm

On Sat, Jan 07, 2017 at 04:37:43PM -0800, Hugh Dickins wrote:
> On Sat, 7 Jan 2017, Dave Jones wrote:
> > On Fri, Jan 06, 2017 at 08:19:31PM -0500, Johannes Weiner wrote:
> >
> > > Argh, __radix_tree_delete_node() makes the flawed assumption that only
> > > the immediate branch it's mucking with can collapse. But this warning
> > > points out that a sibling branch can collapse too, including its leaf.
> > >
> > > Can you try if this patch fixes the problem?
> >
> > 18 hours and still running.. I think we can call it good.
>
> I'm inclined to agree, though I haven't had it running long enough
> (on a load like when it hit me a few times before) to be sure yet myself.
> I'd rather see the proposed fix go in than wait longer for me:
> I've certainly seen nothing bad from it yet.

Thank you both!

* Re: 4.10-rc2 list_lru_isolate list corruption
  2017-01-08  2:02           ` Johannes Weiner
@ 2017-01-08 20:30             ` Hugh Dickins
  0 siblings, 0 replies; 7+ messages in thread
From: Hugh Dickins @ 2017-01-08 20:30 UTC
  To: Johannes Weiner; +Cc: Hugh Dickins, Dave Jones, Jan Kara, linux-mm

On Sat, 7 Jan 2017, Johannes Weiner wrote:
> On Sat, Jan 07, 2017 at 04:37:43PM -0800, Hugh Dickins wrote:
> > On Sat, 7 Jan 2017, Dave Jones wrote:
> > > On Fri, Jan 06, 2017 at 08:19:31PM -0500, Johannes Weiner wrote:
> > >
> > > > Argh, __radix_tree_delete_node() makes the flawed assumption that only
> > > > the immediate branch it's mucking with can collapse. But this warning
> > > > points out that a sibling branch can collapse too, including its leaf.
> > > >
> > > > Can you try if this patch fixes the problem?
> > >
> > > 18 hours and still running.. I think we can call it good.
> >
> > I'm inclined to agree, though I haven't had it running long enough
> > (on a load like when it hit me a few times before) to be sure yet myself.
> > I'd rather see the proposed fix go in than wait longer for me:
> > I've certainly seen nothing bad from it yet.
>
> Thank you both!

Been running successfully for 36 and 24 hours on two machines, each
with a different load that showed it much sooner before: I too call
it good, and thanks to Dave and you and Linus for getting the fix in
for -rc3.

Hugh

Thread overview: 7+ messages
[not found] <20170106052056.jihy5denyxsnfuo5@codemonkey.org.uk>
2017-01-06 16:59 ` 4.10-rc2 list_lru_isolate list corruption Johannes Weiner
2017-01-06 19:58 ` Dave Jones
2017-01-07 1:19 ` Johannes Weiner
2017-01-08 0:07 ` Dave Jones
2017-01-08 0:37 ` Hugh Dickins
2017-01-08 2:02 ` Johannes Weiner
2017-01-08 20:30 ` Hugh Dickins