From: Wandun <chenwandun1@gmail.com>
To: Alexander Krabler <Alexander.Krabler@kuka.com>,
"Vlastimil Babka (SUSE)" <vbabka@kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-trace-kernel@vger.kernel.org"
<linux-trace-kernel@vger.kernel.org>,
"linux-rt-devel@lists.linux.dev" <linux-rt-devel@lists.linux.dev>
Cc: "akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"surenb@google.com" <surenb@google.com>,
"mhocko@suse.com" <mhocko@suse.com>,
"jackmanb@google.com" <jackmanb@google.com>,
"hannes@cmpxchg.org" <hannes@cmpxchg.org>,
"ziy@nvidia.com" <ziy@nvidia.com>,
"rostedt@goodmis.org" <rostedt@goodmis.org>,
"mhiramat@kernel.org" <mhiramat@kernel.org>,
"mathieu.desnoyers@efficios.com" <mathieu.desnoyers@efficios.com>,
"david@kernel.org" <david@kernel.org>,
"ljs@kernel.org" <ljs@kernel.org>,
"liam@infradead.org" <liam@infradead.org>,
"rppt@kernel.org" <rppt@kernel.org>,
"bigeasy@linutronix.de" <bigeasy@linutronix.de>,
"clrkwllms@kernel.org" <clrkwllms@kernel.org>,
Hugh Dickins <hughd@google.com>
Subject: Re: [RFC PATCH 1/3] mm/compaction: skip isolate mlocked folios when compact_unevictable_allowed=0
Date: Mon, 29 Jun 2026 17:07:36 +0800
Message-ID: <05d7369e-341c-49f4-ae13-df3d0ad930d7@gmail.com>
In-Reply-To: <PR3PR01MB6666EC8D53E75F742B37270282EB2@PR3PR01MB6666.eurprd01.prod.exchangelabs.com>
On 6/26/26 21:42, Alexander Krabler wrote:
> On 6/26/26 11:38, Wandun wrote:
>> On 6/26/26 16:45, Alexander Krabler wrote:
>>> However, we were not able to reproduce the actual race
>>> (mlockall() process waiting on a migration PTE),
>>> not in the past, not now. Might be hard to trigger that race.
>>
>> It is not hard to trigger that case. I added a debug message like the one
>> below, and lots of messages appear within a few seconds.
>>
>> diff --cc mm/memory.c
>> index ff338c2abe92,ff338c2abe92..6552b3b14f78
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@@ -4768,6 -4768,6 +4768,8 @@@ vm_fault_t do_swap_page(struct vm_faul
>>  		if (softleaf_is_migration(entry)) {
>>  			migration_entry_wait(vma->vm_mm, vmf->pmd,
>>  					     vmf->address);
>> +			if (!strcmp(current->comm, "repro"))
>> +				pr_err("============== hit ================\n");
>>  		} else if (softleaf_is_device_exclusive(entry)) {
>>  			vmf->page = softleaf_to_page(entry);
>>  			ret = remove_device_exclusive_entry(vmf);
>
> I have set a kprobe on migration_entry_wait and log hits to an ftrace buffer
> (including the kernel stack trace).
> Yes, this function is hit, but only inside the mmap() syscall, which is okay:
> memory allocation is not realtime-safe.
>
> repro-2090 [002] d.... 811.129549: frt_migration_entry_wait: (migration_entry_wait+0x0/0x100)
> repro-2090 [002] d.... 811.129553: <stack trace>
> => migration_entry_wait
> => __handle_mm_fault
> => handle_mm_fault
> => __get_user_pages
> => populate_vma_page_range
> => __mm_populate
> => vm_mmap_pgoff
> => ksys_mmap_pgoff
> => __arm64_sys_mmap
> => el0_svc_common.constprop.0
> => do_el0_svc
> => el0_svc
> => el0t_64_sync_handler
> => el0t_64_sync
>
> The original race was an instruction abort that appeared out of nowhere
> because of a migration PTE set by kcompactd.
> I see this kind of race quite often in processes that do not use mlockall(),
> but I can't reproduce it in memory-locked processes.
>
> Example:
> podman-832 [000] d.... 812.447820: frt_migration_entry_wait: (migration_entry_wait+0x0/0x100)
> podman-832 [000] d.... 812.447823: <stack trace>
> => migration_entry_wait
> => __handle_mm_fault
> => handle_mm_fault
> => do_page_fault
> => do_translation_fault
> => do_mem_abort
> => el0_da
> => el0t_64_sync_handler
> => el0t_64_sync

Hi Alexander,

In terms of root cause, there is no fundamental difference between these
two call stacks: in both, the task faults on a migration entry and waits in
migration_entry_wait(). I modified the reproduction program, and it still
reproduces the second call stack (a fault on a plain user access rather than
inside the mmap() syscall), although less frequently. The complete
reproduction program is as follows:

#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/sysinfo.h>
#include <unistd.h>

#define PAGE_SIZE 4096
#define NR_PAGES 10000

/* Worker: repeatedly remap the file and touch every byte of the mapping. */
static void *worker_fn(void *arg)
{
	int fd = (long)arg;
	size_t len = NR_PAGES * PAGE_SIZE;

	while (1) {
		/* Truncate to zero and back so the file's pages are discarded each round. */
		if (ftruncate(fd, 0) < 0) {}
		if (ftruncate(fd, len) < 0) {}

		char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			       MAP_SHARED, fd, 0);
		if (p == MAP_FAILED)
			continue;

		/* MCL_FUTURE | MCL_ONFAULT: later mappings are locked page by page as they are faulted in. */
		mlockall(MCL_ONFAULT | MCL_FUTURE);

		for (int i = 0; i < NR_PAGES; i++) {
			for (int j = 0; j < PAGE_SIZE; j++) {
				p[i * PAGE_SIZE + j] = 1;
			}
		}
		usleep(200);
		munmap(p, len);
	}
	return NULL;
}

/* Compactor: trigger a full compaction run every 5 ms. */
static void *compact_fn(void *arg)
{
	(void)arg;
	int fd = open("/proc/sys/vm/compact_memory", O_WRONLY);

	if (fd < 0)
		return NULL;
	while (1) {
		if (write(fd, "1", 1) < 0) {}
		usleep(5000);
	}
}

int main(void)
{
	int nproc = sysconf(_SC_NPROCESSORS_ONLN);

	if (nproc < 1)
		nproc = 1;

	int *fds = calloc((size_t)nproc, sizeof(int));
	if (!fds)
		return 1;

	size_t len = NR_PAGES * PAGE_SIZE;

	/* One backing file per worker thread. */
	for (int i = 0; i < nproc; i++) {
		char path[64];

		snprintf(path, sizeof(path), "./repro_%d.dat", i);
		unlink(path);
		fds[i] = open(path, O_RDWR | O_CREAT, 0600);
		if (fds[i] < 0)
			return 1;
		if (ftruncate(fds[i], len) < 0)
			return 1;
	}

	printf("repro: %d workers, %d pages, Ctrl-C to stop\n",
	       nproc, NR_PAGES);

	pthread_t compact;
	pthread_create(&compact, NULL, compact_fn, NULL);

	pthread_t *threads = calloc((size_t)nproc, sizeof(pthread_t));
	for (int i = 0; i < nproc; i++)
		pthread_create(&threads[i], NULL, worker_fn, (void *)(long)fds[i]);

	/* The compactor never returns on success, so this blocks until Ctrl-C. */
	pthread_join(compact, NULL);
	return 0;
}
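
One note for anyone rerunning this: as far as I understand, a hit is only
relevant to this series when vm.compact_unevictable_allowed is 0, so it may be
worth checking the sysctl before starting the workers. A minimal sketch of such
a check is below. The helper names (parse_sysctl_long, read_sysctl_long,
compact_unevictable_disabled) are made up for illustration and are not part of
the program above or of the patches.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parse a sysctl-style decimal string such as "0\n".
 * Returns 0 on success and stores the value in *out, -1 on malformed input. */
static int parse_sysctl_long(const char *s, long *out)
{
	char *end;
	long v;

	errno = 0;
	v = strtol(s, &end, 10);
	if (errno || end == s)
		return -1;
	/* Accept only trailing whitespace after the number. */
	while (*end == ' ' || *end == '\t' || *end == '\n')
		end++;
	if (*end != '\0')
		return -1;
	*out = v;
	return 0;
}

/* Hypothetical helper: read one integer from a sysctl file.
 * Returns 0 on success, -1 if the file cannot be opened, read, or parsed. */
static int read_sysctl_long(const char *path, long *out)
{
	char buf[32];
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return parse_sysctl_long(buf, out);
}

/* Returns 1 if compaction of unevictable pages is disabled (sysctl is 0),
 * 0 if it is allowed, and -1 if the sysctl could not be read. */
static int compact_unevictable_disabled(void)
{
	long v;

	if (read_sysctl_long("/proc/sys/vm/compact_unevictable_allowed", &v) < 0)
		return -1;
	return v == 0;
}
```

Calling compact_unevictable_disabled() at the top of main() and bailing out
unless it returns 1 would be one way to use it.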
>
> Thanks,
> Alexander