From: Byungchul Park <byungchul@sk.com>
To: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: akpm@linux-foundation.org, ying.huang@intel.com,
osalvador@suse.de, hannes@cmpxchg.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
kernel_team@skhynix.com, stable@vger.kernel.org
Subject: Re: [PATCH] mm/vmscan: Fix a bug calling wakeup_kswapd() with a wrong zone index
Date: Mon, 19 Feb 2024 15:31:39 +0900
Message-ID: <20240219063139.GC65758@system.software.com>
In-Reply-To: <517e58d4-7537-4d9f-8893-0130c65c3fdb@linux.alibaba.com>
On Mon, Feb 19, 2024 at 02:25:11PM +0800, Baolin Wang wrote:
>
>
> On 2024/2/16 19:15, Byungchul Park wrote:
> > With NUMA balancing enabled, the following oops has been observed on a
> > NUMA system where a node has no local memory and therefore no managed
> > zones. It happens because wakeup_kswapd() is called with a wrong zone
> > index, -1. Fix it by checking the index before calling
> > wakeup_kswapd().
> >
> > > BUG: unable to handle page fault for address: 00000000000033f3
> > > #PF: supervisor read access in kernel mode
> > > #PF: error_code(0x0000) - not-present page
> > > PGD 0 P4D 0
> > > Oops: 0000 [#1] PREEMPT SMP NOPTI
> > > CPU: 2 PID: 895 Comm: masim Not tainted 6.6.0-dirty #255
> > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> > > rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> > > RIP: 0010:wakeup_kswapd (./linux/mm/vmscan.c:7812)
> > > Code: (omitted)
> > > RSP: 0000:ffffc90004257d58 EFLAGS: 00010286
> > > RAX: ffffffffffffffff RBX: ffff88883fff0480 RCX: 0000000000000003
> > > RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88883fff0480
> > > RBP: ffffffffffffffff R08: ff0003ffffffffff R09: ffffffffffffffff
> > > R10: ffff888106c95540 R11: 0000000055555554 R12: 0000000000000003
> > > R13: 0000000000000000 R14: 0000000000000000 R15: ffff88883fff0940
> > > FS: 00007fc4b8124740(0000) GS:ffff888827c00000(0000) knlGS:0000000000000000
> > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > CR2: 00000000000033f3 CR3: 000000026cc08004 CR4: 0000000000770ee0
> > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > PKRU: 55555554
> > > Call Trace:
> > > <TASK>
> > > ? __die
> > > ? page_fault_oops
> > > ? __pte_offset_map_lock
> > > ? exc_page_fault
> > > ? asm_exc_page_fault
> > > ? wakeup_kswapd
> > > migrate_misplaced_page
> > > __handle_mm_fault
> > > handle_mm_fault
> > > do_user_addr_fault
> > > exc_page_fault
> > > asm_exc_page_fault
> > > RIP: 0033:0x55b897ba0808
> > > Code: (omitted)
> > > RSP: 002b:00007ffeefa821a0 EFLAGS: 00010287
> > > RAX: 000055b89983acd0 RBX: 00007ffeefa823f8 RCX: 000055b89983acd0
> > > RDX: 00007fc2f8122010 RSI: 0000000000020000 RDI: 000055b89983acd0
> > > RBP: 00007ffeefa821a0 R08: 0000000000000037 R09: 0000000000000075
> > > R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000
> > > R13: 00007ffeefa82410 R14: 000055b897ba5dd8 R15: 00007fc4b8340000
> > > </TASK>
> >
> > Signed-off-by: Byungchul Park <byungchul@sk.com>
> > Reported-by: Hyeongtak Ji <hyeongtak.ji@sk.com>
> > Cc: stable@vger.kernel.org
> > Fixes: c574bbe917036 ("NUMA balancing: optimize page placement for memory tiering system")
>
> Does this mean that there is no memory on the target node? If so, we can
> add a check at the beginning to avoid calling migrate_misplaced_folio()
> unnecessarily.
Right. However, the check is necessary regardless of where the call
comes from. FYI, the fix is being discussed in other threads:

https://lore.kernel.org/lkml/20240216114045.24828-1-byungchul@sk.com/
https://lore.kernel.org/lkml/20240219041920.1183-1-byungchul@sk.com/

Byungchul

> diff --git a/mm/memory.c b/mm/memory.c
> index e95503d7544e..a64a1aac463f 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -5182,7 +5182,7 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
> else
> last_cpupid = folio_last_cpupid(folio);
> target_nid = numa_migrate_prep(folio, vma, vmf->address, nid,
> &flags);
> - if (target_nid == NUMA_NO_NODE) {
> + if (target_nid == NUMA_NO_NODE || !node_state(target_nid, N_MEMORY)) {
> folio_put(folio);
> goto out_map;
> }
>
> (similar changes for do_huge_pmd_numa_page())
>
> > ---
> > mm/migrate.c | 8 ++++++++
> > 1 file changed, 8 insertions(+)
> >
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index fbc8586ed735..51ee6865b0f6 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -2825,6 +2825,14 @@ static int numamigrate_isolate_folio(pg_data_t *pgdat, struct folio *folio)
> > if (managed_zone(pgdat->node_zones + z))
> > break;
> > }
> > +
> > + /*
> > + * If there are no managed zones, it should not proceed
> > + * further.
> > + */
> > + if (z < 0)
> > + return 0;
> > +
> > wakeup_kswapd(pgdat->node_zones + z, 0,
> > folio_order(folio), ZONE_MOVABLE);
> > return 0;
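
For readers following the patch, here is a minimal userspace sketch of why
the index ends up as -1 and what the added check does. It is not the kernel
code: struct zone_model, struct pgdat_model, MODEL_MAX_ZONES and all the
*_model helpers are hypothetical stand-ins, and it assumes the loop above
the hunk counts down from the node's highest zone index, which the hunk
context suggests but does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define MODEL_MAX_ZONES 4 /* hypothetical bound, for this model only */

/* Hypothetical stand-in for struct zone. */
struct zone_model {
	unsigned long managed_pages; /* 0 means the zone is not managed */
};

/* Hypothetical stand-in for pg_data_t. */
struct pgdat_model {
	struct zone_model node_zones[MODEL_MAX_ZONES];
	int nr_zones;
};

static bool managed_zone_model(const struct zone_model *zone)
{
	return zone->managed_pages != 0;
}

/*
 * Scan from the highest zone downwards. If no zone is managed, the
 * loop runs to completion and z is left at -1.
 */
static int highest_managed_zone(const struct pgdat_model *pgdat)
{
	int z;

	for (z = pgdat->nr_zones - 1; z >= 0; z--) {
		if (managed_zone_model(&pgdat->node_zones[z]))
			break;
	}
	return z;
}

/* Models a callee that dereferences node_zones[z]; z must be valid. */
static void wakeup_kswapd_model(const struct pgdat_model *pgdat, int z)
{
	assert(z >= 0 && z < pgdat->nr_zones);
	printf("waking kswapd for zone %d (%lu managed pages)\n",
	       z, pgdat->node_zones[z].managed_pages);
}

/* The guard from the patch: bail out when there is no managed zone. */
static bool wakeup_if_managed(const struct pgdat_model *pgdat)
{
	int z = highest_managed_zone(pgdat);

	if (z < 0)
		return false;

	wakeup_kswapd_model(pgdat, z);
	return true;
}
```

Calling wakeup_if_managed() on a model node whose zones all have
managed_pages == 0 returns false without touching node_zones[-1]. Without
the z < 0 check, the same call would pass an out-of-range index to the
callee, which is the situation the oops above reports.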