From: Sasha Levin <sashal@kernel.org>
To: Suren Baghdasaryan <surenb@google.com>
Cc: David Hildenbrand <david@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
peterx@redhat.com, aarcange@redhat.com, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH] mm/userfaultfd: fix missing PTE unmap for non-migration entries
Date: Thu, 31 Jul 2025 08:43:11 -0400
Message-ID: <aItk34ca4Mp6KLUB@lappy>
In-Reply-To: <CAJuCfpF3K49Z8uevF6M9FZX-tFgJDCkCi54iL=xwDuQB2RMqoA@mail.gmail.com>
On Tue, Jul 08, 2025 at 09:34:48AM -0700, Suren Baghdasaryan wrote:
>On Tue, Jul 8, 2025 at 8:57 AM Sasha Levin <sashal@kernel.org> wrote:
>>
>> On Tue, Jul 08, 2025 at 08:39:47AM -0700, Suren Baghdasaryan wrote:
>> >On Tue, Jul 8, 2025 at 8:33 AM Sasha Levin <sashal@kernel.org> wrote:
>> >>
>> >> On Tue, Jul 08, 2025 at 05:10:44PM +0200, David Hildenbrand wrote:
>> >> >On 01.07.25 02:57, Andrew Morton wrote:
>> >> >>On Sun, 29 Jun 2025 23:19:58 -0400 Sasha Levin <sashal@kernel.org> wrote:
>> >> >>
>> >> >>>When handling non-swap entries in move_pages_pte(), the error handling
>> >> >>>for entries that are NOT migration entries fails to unmap the page table
>> >> >>>entries before jumping to the error handling label.
>> >> >>>
>> >> >>>This results in a kmap/kunmap imbalance which on CONFIG_HIGHPTE systems
>> >> >>>triggers a WARNING in kunmap_local_indexed() because the kmap stack is
>> >> >>>corrupted.
>> >> >>>
>> >> >>>Example call trace on ARM32 (CONFIG_HIGHPTE enabled):
>> >> >>> WARNING: CPU: 1 PID: 633 at mm/highmem.c:622 kunmap_local_indexed+0x178/0x17c
>> >> >>> Call trace:
>> >> >>> kunmap_local_indexed from move_pages+0x964/0x19f4
>> >> >>> move_pages from userfaultfd_ioctl+0x129c/0x2144
>> >> >>> userfaultfd_ioctl from sys_ioctl+0x558/0xd24
>> >> >>>
>> >> >>>The issue was introduced with the UFFDIO_MOVE feature but became more
>> >> >>>frequent with the addition of guard pages (commit 7c53dfbdb024 ("mm: add
>> >> >>>PTE_MARKER_GUARD PTE marker")) which made the non-migration entry code
>> >> >>>path more commonly executed during userfaultfd operations.
>> >> >>>
>> >> >>>Fix this by ensuring PTEs are properly unmapped in all non-swap entry
>> >> >>>paths before jumping to the error handling label, not just for migration
>> >> >>>entries.
>> >> >>
>> >> >>I don't get it.
>> >> >>
>> >> >>>--- a/mm/userfaultfd.c
>> >> >>>+++ b/mm/userfaultfd.c
>> >> >>>@@ -1384,14 +1384,15 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
>> >> >>> entry = pte_to_swp_entry(orig_src_pte);
>> >> >>> if (non_swap_entry(entry)) {
>> >> >>>+ pte_unmap(src_pte);
>> >> >>>+ pte_unmap(dst_pte);
>> >> >>>+ src_pte = dst_pte = NULL;
>> >> >>> if (is_migration_entry(entry)) {
>> >> >>>- pte_unmap(src_pte);
>> >> >>>- pte_unmap(dst_pte);
>> >> >>>- src_pte = dst_pte = NULL;
>> >> >>> migration_entry_wait(mm, src_pmd, src_addr);
>> >> >>> err = -EAGAIN;
>> >> >>>- } else
>> >> >>>+ } else {
>> >> >>> err = -EFAULT;
>> >> >>>+ }
>> >> >>> goto out;
>> >> >>
>> >> >>where we have
>> >> >>
>> >> >>out:
>> >> >> ...
>> >> >> if (dst_pte)
>> >> >> pte_unmap(dst_pte);
>> >> >> if (src_pte)
>> >> >> pte_unmap(src_pte);
>> >> >
>> >> >AI slop?
>> >>
>> >> Nah, this one is sadly all me :(
>> >>
>> >> I was trying to resolve some of the issues found with linus-next on
>> >> LKFT, and misunderstood the code. Funny enough, I thought that the
>> >> change above "fixed" it by making the warnings go away, but clearly is
>> >> the wrong thing to do so I went back to the drawing table...
>> >>
>> >> If you're curious, here's the issue: https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.13-rc7-43418-g558c6dd4d863/testrun/29030370/suite/log-parser-test/test/exception-warning-cpu-pid-at-mmhighmem-kunmap_local_indexed/details/
>> >
>> >Any way to symbolize that Call trace? I can't find build artefacts to
>> >extract vmlinux image...
>>
>> The build artifacts are at
>> https://storage.tuxsuite.com/public/linaro/lkft/builds/2zSrTao2x4P640QKIx18JUuFdc1/
>> but I couldn't get it to do the right thing. I'm guessing that I need
>> some magical arm32 toolchain bits that I don't carry:
>>
>> cat tr.txt | ./scripts/decode_stacktrace.sh vmlinux
>> <4>[ 38.566145] ------------[ cut here ]------------
>> <4>[ 38.566392] WARNING: CPU: 1 PID: 637 at mm/highmem.c:622 kunmap_local_indexed+0x198/0x1a4
>> <4>[ 38.569398] Modules linked in: nfnetlink ip_tables x_tables
>> <4>[ 38.570481] CPU: 1 UID: 0 PID: 637 Comm: uffd-unit-tests Not tainted 6.16.0-rc4 #1 NONE
>> <4>[ 38.570815] Hardware name: Generic DT based system
>> <4>[ 38.571073] Call trace:
>> <4>[ 38.571239] unwind_backtrace from show_stack (arch/arm64/kernel/stacktrace.c:465)
>> <4>[ 38.571602] show_stack from dump_stack_lvl (lib/dump_stack.c:118 (discriminator 1))
>> <4>[ 38.571805] dump_stack_lvl from __warn (kernel/panic.c:791)
>> <4>[ 38.572002] __warn from warn_slowpath_fmt+0xa8/0x174
>> <4>[ 38.572290] warn_slowpath_fmt from kunmap_local_indexed+0x198/0x1a4
>> <4>[ 38.572520] kunmap_local_indexed from move_pages_pte+0xc40/0xf48
>> <4>[ 38.572970] move_pages_pte from move_pages+0x428/0x5bc
>> <4>[ 38.573189] move_pages from userfaultfd_ioctl+0x900/0x1ec0
>> <4>[ 38.573376] userfaultfd_ioctl from sys_ioctl+0xd24/0xd90
>> <4>[ 38.573581] sys_ioctl from ret_fast_syscall+0x0/0x5c
>> <4>[ 38.573810] Exception stack(0xf9d69fa8 to 0xf9d69ff0)
>> <4>[ 38.574546] 9fa0: 00001000 00000005 00000005 c028aa05 b2d3ecd8 b2d3ecc8
>> <4>[ 38.574919] 9fc0: 00001000 00000005 b2d3ece0 00000036 b2d3ed84 b2d3ed50 b2d3ed7c b2d3ed58
>> <4>[ 38.575131] 9fe0: 00000036 b2d3ecb0 b6df1861 b6d5f736
>> <4>[ 38.575511] ---[ end trace 0000000000000000 ]---
>
>Ah, I know what's going on. 6.13.rc7 which is used in this test does
>not have my fix 927e926d72d9 ("userfaultfd: fix PTE unmapping
>stack-allocated PTE copies") (see
>https://elixir.bootlin.com/linux/v6.13.7/source/mm/userfaultfd.c#L1284).
>It was backported into 6.13.rc8. So, it tries to unmap a copy of a
>mapped PTE, which will fail when CONFIG_HIGHPTE is enabled. So, it
>makes sense that it is failing on arm32.
Sorry, I missed this.

The tree only identifies itself as 6.13-rc7; in practice it's a much
newer version, since it merges in PRs from the ML.

The issue still reproduced on v6.16, which already contains
927e926d72d9.

I've sent out https://lore.kernel.org/all/aItjffoR7molh3QF@lappy/ which
fixed the issue for me.
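
For anyone following along, here is a rough userspace sketch of the point
raised about the out: label in the quoted patch. It is not kernel code:
model_map(), model_unmap(), model_move() and the two counters are
hypothetical stand-ins, and the model only counts map/unmap calls (it does
not model the LIFO ordering that kmap_local expects). It shows that a path
which skips the early unmap is still balanced by the NULL-checked cleanup,
and that an early unmap is only safe when the pointers are cleared
afterwards.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kmap_local bookkeeping; not kernel APIs. */
static int map_depth;          /* number of currently "mapped" PTE pages */
static int unbalanced_unmaps;  /* unmaps issued with nothing mapped */

static int *model_map(int *slot)
{
	map_depth++;
	return slot;
}

static void model_unmap(int *p)
{
	(void)p;
	if (map_depth == 0) {
		/* Roughly where the real code would hit the WARNING. */
		unbalanced_unmaps++;
		return;
	}
	map_depth--;
}

enum path { PATH_MIGRATION, PATH_OTHER_NONSWAP };

/*
 * Same shape as the quoted error path: the migration branch unmaps early,
 * the other branch relies on the common cleanup at "out".
 */
static int model_move(enum path p, int clear_after_early_unmap)
{
	int a = 0, b = 0;
	int *src = model_map(&a);
	int *dst = model_map(&b);
	int err;

	if (p == PATH_MIGRATION) {
		model_unmap(src);
		model_unmap(dst);
		if (clear_after_early_unmap)
			src = dst = NULL; /* keeps "out" from unmapping again */
		err = -EAGAIN;
	} else {
		err = -EFAULT;
	}

	/* "out" label in the original: unmap whatever is still mapped. */
	if (dst)
		model_unmap(dst);
	if (src)
		model_unmap(src);
	return err;
}
```

In this model the PATH_OTHER_NONSWAP case ends with map_depth back at 0
without any early unmap, which is why moving the pte_unmap() calls up did
not change the balance on that path.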
--
Thanks,
Sasha