From: Vlastimil Babka <vbabka@suse.cz>
To: Andrew Morton <akpm@linux-foundation.org>,
	Xishi Qiu <qiuxishi@huawei.com>
Cc: Joonsoo Kim <js1304@gmail.com>,
	David Rientjes <rientjes@google.com>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Laura Abbott <lauraa@codeaurora.org>,
	zhuhui@xiaomi.com, wangxq10@lzu.edu.cn,
	Linux MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mm: fix invalid node in alloc_migrate_target()
Date: Tue, 29 Mar 2016 15:06:16 +0200
Message-ID: <56FA7DC8.4000902@suse.cz>
In-Reply-To: <20160325122237.4ca4e0dbca215ccbf4f49922@linux-foundation.org>

On 03/25/2016 08:22 PM, Andrew Morton wrote:
> On Fri, 25 Mar 2016 14:56:04 +0800 Xishi Qiu <qiuxishi@huawei.com> wrote:
>
>> It is incorrect to use next_node to find a target node, it will
>> return MAX_NUMNODES or invalid node. This will lead to crash in
>> buddy system allocation.
>>
>> ...
>>
>> --- a/mm/page_isolation.c
>> +++ b/mm/page_isolation.c
>> @@ -289,11 +289,11 @@ struct page *alloc_migrate_target(struct page *page, unsigned long private,
>>   	 * now as a simple work-around, we use the next node for destination.
>>   	 */
>>   	if (PageHuge(page)) {
>> -		nodemask_t src = nodemask_of_node(page_to_nid(page));
>> -		nodemask_t dst;
>> -		nodes_complement(dst, src);
>> +		int node = next_online_node(page_to_nid(page));
>> +		if (node == MAX_NUMNODES)
>> +			node = first_online_node;
>>   		return alloc_huge_page_node(page_hstate(compound_head(page)),
>> -					    next_node(page_to_nid(page), dst));
>> +					    node);
>>   	}
>>
>>   	if (PageHighMem(page))
>
> Indeed.  Can you tell us more about the circumstances under which the
> kernel will crash?  I need to decide which kernel version(s) need the
> patch, but the changelog doesn't contain the info needed to make this
> decision (it should).
>
>
>
> next_node() isn't a very useful interface, really.  Just about every
> caller does this:
>
>
> 	node = next_node(node, XXX);
> 	if (node == MAX_NUMNODES)
> 		node = first_node(XXX);
>
> so how about we write a function which does that, and stop open-coding
> the same thing everywhere?

Good idea.
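
To illustrate the wrap-around pattern, here is a minimal userspace
sketch. It is only a toy model: the toy_* names, the plain-integer
nodemask and the tiny TOY_MAX_NUMNODES value are assumptions made for
the example, not the kernel's nodemask implementation.

```c
#define TOY_MAX_NUMNODES 8		/* assumption: small value for illustration */

typedef unsigned int toy_nodemask_t;	/* bit n set => node n is in the mask */

/* First node in the mask, or TOY_MAX_NUMNODES if the mask is empty. */
static int toy_first_node(toy_nodemask_t mask)
{
	for (int n = 0; n < TOY_MAX_NUMNODES; n++)
		if (mask & (1u << n))
			return n;
	return TOY_MAX_NUMNODES;
}

/* Next node in the mask strictly after 'node', or TOY_MAX_NUMNODES if none. */
static int toy_next_node(int node, toy_nodemask_t mask)
{
	for (int n = node + 1; n < TOY_MAX_NUMNODES; n++)
		if (mask & (1u << n))
			return n;
	return TOY_MAX_NUMNODES;
}

/* Wrap-around variant: falls back to the first node when the end is hit. */
static int toy_next_node_in(int node, toy_nodemask_t mask)
{
	int ret = toy_next_node(node, mask);

	if (ret == TOY_MAX_NUMNODES)
		ret = toy_first_node(mask);
	return ret;
}
```

With nodes 1 and 5 in the mask, toy_next_node(5, mask) returns the
sentinel while toy_next_node_in(5, mask) wraps to node 1. Note that the
wrap-around variant still returns the sentinel for an empty mask.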

> And I think your fix could then use such a function:
>
> 	int node = that_new_function(page_to_nid(page), node_online_map);
>
>
>
> Also, mm/mempolicy.c:offset_il_node() worries me:
>
> 	do {
> 		nid = next_node(nid, pol->v.nodes);
> 		c++;
> 	} while (c <= target);
>
> Can't `nid' hit MAX_NUMNODES?

AFAICS it can. interleave_nid() uses offset_il_node(), and the returned
nid is then used e.g. in node_zonelist(), where it is passed to
NODE_DATA(nid). That's quite scary. The code also predates git. Why
don't we see crashes, or KASAN reports, from this?
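
A toy userspace model of that loop shape shows when the sentinel can
come out. This is a sketch under assumptions: the toy_* names and the
integer mask are made up, and the starting values nid = -1 and c = 0
are assumed because the quoted excerpt does not show them. It says
nothing about which target values the real caller can actually pass.

```c
#define TOY_MAX_NUMNODES 8		/* assumption: small value for illustration */

typedef unsigned int toy_nodemask_t;	/* bit n set => node n is in the mask */

/* Next node in the mask strictly after 'node', or TOY_MAX_NUMNODES if none. */
static int toy_next_node(int node, toy_nodemask_t mask)
{
	for (int n = node + 1; n < TOY_MAX_NUMNODES; n++)
		if (mask & (1u << n))
			return n;
	return TOY_MAX_NUMNODES;
}

/*
 * Same loop shape as the quoted excerpt: step through the mask
 * target + 1 times without any wrap-around.
 */
static int toy_offset_walk(toy_nodemask_t mask, int target)
{
	int nid = -1;	/* assumed start value */
	int c = 0;	/* assumed start value */

	do {
		nid = toy_next_node(nid, mask);
		c++;
	} while (c <= target);
	return nid;
}
```

In this model the walk returns a valid node as long as target is
smaller than the number of nodes in the mask, and returns
TOY_MAX_NUMNODES as soon as it is not (or if the mask is empty).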

>
> And can someone please explain mem_cgroup_select_victim_node() to me?
> How can we hit the "node = numa_node_id()" path?  Only if
> memcg->scan_nodes is empty?  is that even valid?  The comment seems to
> have not much to do with the code?

My reading of the comment is that an empty scan_nodes is valid, and
that the comment lists the reasons why that can happen (in somewhat
broken language). Note that I didn't verify these reasons:
- we call this when hitting the memcg limit, not when adding pages to
an LRU; adding to an LRU would mean scan_nodes contains that LRU's node
- adding to the unevictable LRU means the node is not added to
scan_nodes (probably because scanning the unevictable LRU would be
useless)
- for other reasons (which?) the memcg might have pages that are not on
any LRU, and be so small that it has no other pages that would be on
an LRU
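
If an empty mask is indeed valid, the selection needs a second check
after the wrap-around, since wrapping on an empty mask still yields the
sentinel. A minimal userspace sketch, assuming a toy integer nodemask
and a hypothetical 'fallback' parameter standing in for the local node
(the numa_node_id() path mentioned above); none of the toy_* names
exist in the kernel:

```c
#define TOY_MAX_NUMNODES 8		/* assumption: small value for illustration */

typedef unsigned int toy_nodemask_t;	/* bit n set => node n is in the mask */

/* First node in the mask, or TOY_MAX_NUMNODES if the mask is empty. */
static int toy_first_node(toy_nodemask_t mask)
{
	for (int n = 0; n < TOY_MAX_NUMNODES; n++)
		if (mask & (1u << n))
			return n;
	return TOY_MAX_NUMNODES;
}

/* Next node in the mask strictly after 'node', or TOY_MAX_NUMNODES if none. */
static int toy_next_node(int node, toy_nodemask_t mask)
{
	for (int n = node + 1; n < TOY_MAX_NUMNODES; n++)
		if (mask & (1u << n))
			return n;
	return TOY_MAX_NUMNODES;
}

/*
 * Pick the node after 'last' with wrap-around; if the mask is empty,
 * return the caller-supplied 'fallback' (hypothetical parameter).
 */
static int toy_select_node(int last, toy_nodemask_t mask, int fallback)
{
	int node = toy_next_node(last, mask);

	if (node == TOY_MAX_NUMNODES)
		node = toy_first_node(mask);
	if (node == TOY_MAX_NUMNODES)	/* only reachable with an empty mask */
		node = fallback;
	return node;
}
```

In this model the fallback branch is taken only when the mask has no
bits set at all.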

> mpol_rebind_nodemask() is similar.
>
>
>
> Something like this?
>
>
> From: Andrew Morton <akpm@linux-foundation.org>
> Subject: include/linux/nodemask.h: create next_node_in() helper
>
> Lots of code does
>
> 	node = next_node(node, XXX);
> 	if (node == MAX_NUMNODES)
> 		node = first_node(XXX);
>
> so create next_node_in() to do this and use it in various places.
>
> Cc: Xishi Qiu <qiuxishi@huawei.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

The patch doesn't touch offset_il_node(), which is good: if it's indeed
buggy, that's serious and needs a separate, non-cleanup patch.


