From: zhong jiang <zhongjiang@huawei.com>
To: David Hildenbrand <david@redhat.com>
Cc: <akpm@linux-foundation.org>, <mhocko@suse.com>,
	<hannes@cmpxchg.org>, <ktkhai@virtuozzo.com>,
	<linux-mm@kvack.org>
Subject: Re: [PATCH] mm: fix unevictable page reclaim when calling madvise_pageout
Date: Mon, 28 Oct 2019 23:45:27 +0800
Message-ID: <5DB70D17.9040108@huawei.com>
In-Reply-To: <3ac2e87d-2899-ab17-8b0b-8aa6a5035d4a@redhat.com>

On 2019/10/28 23:27, David Hildenbrand wrote:
> On 28.10.19 16:08, zhong jiang wrote:
>> Recently, I hit the following issue when running in the upstream.
>>
>> kernel BUG at mm/vmscan.c:1521!
>> invalid opcode: 0000 [#1] SMP KASAN PTI
>> CPU: 0 PID: 23385 Comm: syz-executor.6 Not tainted 5.4.0-rc4+ #1
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
>> RIP: 0010:shrink_page_list+0x12b6/0x3530 mm/vmscan.c:1521
>> Code: de f5 ff ff e8 ab 79 eb ff 4c 89 f7 e8 43 33 0d 00 e9 cc f5 ff ff e8 99 79 eb ff 48 c7 c6 a0 34 2b a0 4c 89 f7 e8 1a 4d 05 00 <0f> 0b e8 83 79 eb ff 48 89 d8 48 c1 e8 03 42 80 3c 38 00 0f 85 74
>> RSP: 0018:ffff88819a3df5a0 EFLAGS: 00010286
>> RAX: 0000000000040000 RBX: ffffea00061c3980 RCX: ffffffff814fba36
>> RDX: 00000000000056f7 RSI: ffffc9000c02c000 RDI: ffff8881f70268cc
>> RBP: ffff88819a3df898 R08: ffffed103ee05de0 R09: ffffed103ee05de0
>> R10: 0000000000000001 R11: ffffed103ee05ddf R12: ffff88819a3df6f0
>> R13: ffff88819a3df6f0 R14: ffffea00061c3980 R15: dffffc0000000000
>> FS:  00007f21b9d8e700(0000) GS:ffff8881f7000000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000001b2d621000 CR3: 00000001c8c46004 CR4: 00000000007606f0
>> DR0: 0000000020000140 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
>> PKRU: 55555554
>> Call Trace:
>>   reclaim_pages+0x499/0x800 mm/vmscan.c:2188
>>   madvise_cold_or_pageout_pte_range+0x58a/0x710 mm/madvise.c:453
>>   walk_pmd_range mm/pagewalk.c:53 [inline]
>>   walk_pud_range mm/pagewalk.c:112 [inline]
>>   walk_p4d_range mm/pagewalk.c:139 [inline]
>>   walk_pgd_range mm/pagewalk.c:166 [inline]
>>   __walk_page_range+0x45a/0xc20 mm/pagewalk.c:261
>>   walk_page_range+0x179/0x310 mm/pagewalk.c:349
>>   madvise_pageout_page_range mm/madvise.c:506 [inline]
>>   madvise_pageout+0x1f0/0x330 mm/madvise.c:542
>>   madvise_vma mm/madvise.c:931 [inline]
>>   __do_sys_madvise+0x7d2/0x1600 mm/madvise.c:1113
>>   do_syscall_64+0x9f/0x4c0 arch/x86/entry/common.c:290
>>   entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>
>> madvise_pageout walks the specified range of the vma and isolates
>> its pages, then runs shrink_page_list to reclaim the memory. But it
>> also isolates unevictable pages for reclaim. Hence, we hit the
>> check in shrink_page_list.
>>
>> We can fix it by preventing unevictable pages from being isolated.
>> Another way to fix the issue by removing the condition of
>> BUG_ON(PageUnevictable(page)) in shrink_page_list. I think it
>> is better  to use the latter. Because We has taken the unevictable
>> page and skip it into account in shrink_page_list.
> I really don't understand the last sentence. Looks like
> something got messed up :)
I mean that shrink_page_list already checks page_evictable(page);
if the page is unevictable, we put it back on the correct LRU.

Given that check, I chose the second approach. It seems simpler. :-)
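
To make the two options from the patch description easier to compare,
here is a minimal userspace sketch. It is a toy model, not kernel code:
struct toy_page and every toy_* function are hypothetical names invented
for illustration, and the model assumes that every isolated page reaches
the "keep:" label, which is a simplification of shrink_page_list.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page: only the two flags discussed here. */
struct toy_page {
	bool lru;		/* models PageLRU() */
	bool unevictable;	/* models PageUnevictable() */
};

/*
 * Models isolation on the madvise_pageout path: the page leaves its
 * LRU list. With skip_unevictable set (option 1), unevictable pages
 * are never isolated.
 */
static bool toy_isolate(struct toy_page *p, bool skip_unevictable)
{
	if (!p->lru)
		return false;
	if (skip_unevictable && p->unevictable)
		return false;
	p->lru = false;
	return true;
}

/*
 * Models the assertion at "keep:". strict_check models
 * VM_BUG_ON_PAGE(PageLRU(page) || PageUnevictable(page), page);
 * !strict_check models the relaxed VM_BUG_ON_PAGE(PageLRU(page), page)
 * from the patch (option 2). Returns true if the assertion would fire.
 */
static bool toy_keep_would_bug(const struct toy_page *p, bool strict_check)
{
	if (strict_check)
		return p->lru || p->unevictable;
	return p->lru;
}

/* Counts the pages in the range that would trip the assertion. */
static size_t toy_pageout(struct toy_page *pages, size_t n,
			  bool skip_unevictable, bool strict_check)
{
	size_t bugs = 0;

	for (size_t i = 0; i < n; i++) {
		if (!toy_isolate(&pages[i], skip_unevictable))
			continue;
		if (toy_keep_would_bug(&pages[i], strict_check))
			bugs++;
	}
	return bugs;
}
```

In this model, a range containing one unevictable page trips the strict
check once when nothing is filtered, and not at all under either option.
The difference is that option 1 leaves the unevictable page on its list,
while option 2 isolates it and relies on the later page_evictable(page)
check to put it back.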

Thanks,
zhong jiang
>
>> Signed-off-by: zhong jiang <zhongjiang@huawei.com>
>> ---
>>   mm/vmscan.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index f7d1301..1c6e959 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1524,7 +1524,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>>   		unlock_page(page);
>>   keep:
>>   		list_add(&page->lru, &ret_pages);
>> -		VM_BUG_ON_PAGE(PageLRU(page) || PageUnevictable(page), page);
>> +		VM_BUG_ON_PAGE(PageLRU(page), page);
> So, this comes from
>
> commit b291f000393f5a0b679012b39d79fbc85c018233
> Author: Nick Piggin <npiggin@suse.de>
> Date:   Sat Oct 18 20:26:44 2008 -0700
>
>     mlock: mlocked pages are unevictable
>     
>     Make sure that mlocked pages also live on the unevictable LRU, so kswapd
>     will not scan them over and over again.
>
>
> That patch is fairly old. How come we can suddenly trigger this?
> Which commit is responsible for that? Was it always broken?
>
> I can see that
>
> commit ad6b67041a45497261617d7a28b15159b202cb5a
> Author: Minchan Kim <minchan@kernel.org>
> Date:   Wed May 3 14:54:13 2017 -0700
>
>     mm: remove SWAP_MLOCK in ttu
>
> Performed some changes in that area. But also some time ago.
I think the following patch introduced the issue.

commit 1a4e58cce84ee88129d5d49c064bd2852b481357
Author: Minchan Kim <minchan@kernel.org>
Date:   Wed Sep 25 16:49:15 2019 -0700

    mm: introduce MADV_PAGEOUT

    When a process expects no accesses to a certain memory range for a long

Thanks,
zhong jiang


