From: Suzuki K Poulose <Suzuki.Poulose@arm.com>
To: Andrew Morton <akpm@linux-foundation.org>, Jia He <hejianet@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>,
Minchan Kim <minchan@kernel.org>,
Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>,
Arvind Yadav <arvind.yadav.cs@gmail.com>,
Mike Rapoport <rppt@linux.vnet.ibm.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
jia.he@hxt-semitech.com, Hugh Dickins <hughd@google.com>
Subject: Re: [PATCH v2] mm/ksm: ignore STABLE_FLAG of rmap_item->address in rmap_walk_ksm
Date: Mon, 14 May 2018 10:45:02 +0100 [thread overview]
Message-ID: <2cd6b39b-1496-bbd5-9e31-5e3dcb31feda@arm.com> (raw)
In-Reply-To: <20180509163101.02f23de1842a822c61fc68ff@linux-foundation.org>
On 10/05/18 00:31, Andrew Morton wrote:
> On Fri, 4 May 2018 11:11:46 +0800 Jia He <hejianet@gmail.com> wrote:
>
>> In our armv8a server(QDF2400), I noticed lots of WARN_ON caused by PAGE_SIZE
>> unaligned for rmap_item->address under memory pressure tests(start 20 guests
>> and run memhog in the host).
>>
>> ...
>>
>> In rmap_walk_ksm, the rmap_item->address might still have the STABLE_FLAG,
>> then the start and end in handle_hva_to_gpa might not be PAGE_SIZE aligned.
>> Thus it will cause exceptions in handle_hva_to_gpa on arm64.
>>
>> This patch fixes it by ignoring(not removing) the low bits of address when
>> doing rmap_walk_ksm.
>>
>> Signed-off-by: jia.he@hxt-semitech.com
>
> I assumed you wanted this patch to be committed as
> From:jia.he@hxt-semitech.com rather than From:hejianet@gmail.com, so I
> made that change. Please let me know if this was inappropriate.
>
> You can do this yourself by adding an explicit From: line to the very
> start of the patch's email text.
>
> Also, a storm of WARN_ONs is pretty poor behaviour. Is that the only
> misbehaviour which this bug causes? Do you think the fix should be
> backported into earlier kernels?
>
I think its just not the WARN_ON(). We do more than what is probably
intended with an unaligned address. i.e, We could be modifying the
flags for other pages that were not affected.
e.g :
In the original report [0], the trace looked like :
[ 800.511498] [<ffff0000080b4f2c>] kvm_age_hva_handler+0xcc/0xd4
[ 800.517324] [<ffff0000080b4838>] handle_hva_to_gpa+0xec/0x15c
[ 800.523063] [<ffff0000080b6c5c>] kvm_age_hva+0x5c/0xcc
[ 800.528194] [<ffff0000080a7c3c>] kvm_mmu_notifier_clear_flush_young+0x54/0x90
[ 800.535324] [<ffff00000827a0e8>] __mmu_notifier_clear_flush_young+0x6c/0xa8
[ 800.542279] [<ffff00000825a644>] page_referenced_one+0x1e0/0x1fc
[ 800.548279] [<ffff00000827e8f8>] rmap_walk_ksm+0x124/0x1a0
[ 800.553759] [<ffff00000825c974>] rmap_walk+0x94/0x98
[ 800.558717] [<ffff00000825ca98>] page_referenced+0x120/0x180
[ 800.564369] [<ffff000008228c58>] shrink_active_list+0x218/0x4a4
[ 800.570281] [<ffff000008229470>] shrink_node_memcg+0x58c/0x6fc
[ 800.576107] [<ffff0000082296c4>] shrink_node+0xe4/0x328
[ 800.581325] [<ffff000008229c9c>] do_try_to_free_pages+0xe4/0x3b8
[ 800.587324] [<ffff00000822a094>] try_to_free_pages+0x124/0x234
[ 800.593150] [<ffff000008216aa0>] __alloc_pages_nodemask+0x564/0xf7c
[ 800.599412] [<ffff000008292814>] khugepaged_alloc_page+0x38/0xb8
[ 800.605411] [<ffff0000082933bc>] collapse_huge_page+0x74/0xd70
[ 800.611238] [<ffff00000829470c>] khugepaged_scan_mm_slot+0x654/0xa98
[ 800.617585] [<ffff000008294e0c>] khugepaged+0x2bc/0x49c
[ 800.622803] [<ffff0000080ffb70>] kthread+0x124/0x150
[ 800.627762] [<ffff0000080849f0>] ret_from_fork+0x10/0x1c
[ 800.633066] ---[ end trace 944c130b5252fb01 ]---
Now, the ksm wants to mark *a page* as referenced via page_referenced_one(),
passing it an unaligned address. This could eventually turn out to be
one of :
ptep_clear_flush_young_notify(address, address + PAGE_SIZE)
or
pmdp_clear_flush_young_notify(address, address + PMD_SIZE)
which now spans two pages/pmds and the notifier consumer might
take an action on the second page as well, which is not something
intended. So, I do think that old behavior is wrong and has other
side effects as mentioned above.
[0] https://lkml.kernel.org/r/1525244911-5519-1-git-send-email-hejianet@gmail.com
Suzuki
next prev parent reply other threads:[~2018-05-14 9:45 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-05-03 8:34 [PATCH] mm/ksm: ignore STABLE_FLAG of rmap_item->address in rmap_walk_ksm Jia He
2018-05-03 10:44 ` Claudio Imbrenda
2018-05-03 13:23 ` Jia He
2018-05-04 3:11 ` [PATCH v2] " Jia He
2018-05-04 5:56 ` Jia He
2018-05-09 23:31 ` Andrew Morton
2018-05-10 1:26 ` Jia He
2018-05-14 9:09 ` Suzuki K Poulose
2018-05-14 9:45 ` Suzuki K Poulose [this message]
2018-05-24 8:44 ` Suzuki K Poulose
[not found] ` <198fe48c-0d26-7dd1-89dc-415bd3c7ab41@gmail.com>
2018-05-24 9:01 ` Suzuki K Poulose
2018-05-24 9:36 ` Jia He
2018-05-24 20:38 ` Andrew Morton
2018-06-07 22:13 ` Andrew Morton
2018-06-07 23:38 ` Andrea Arcangeli
2018-06-08 1:32 ` Jia He
2018-06-08 1:23 ` Jia He
2018-06-08 11:08 ` Suzuki K Poulose
2018-05-03 13:41 ` [PATCH] " Michal Hocko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2cd6b39b-1496-bbd5-9e31-5e3dcb31feda@arm.com \
--to=suzuki.poulose@arm.com \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=arvind.yadav.cs@gmail.com \
--cc=hejianet@gmail.com \
--cc=hughd@google.com \
--cc=imbrenda@linux.vnet.ibm.com \
--cc=jia.he@hxt-semitech.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=minchan@kernel.org \
--cc=rppt@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox