From: Wandun <chenwandun1@gmail.com>
To: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
linux-trace-kernel@vger.kernel.org,
linux-rt-devel@lists.linux.dev
Cc: akpm@linux-foundation.org, surenb@google.com, mhocko@suse.com,
jackmanb@google.com, hannes@cmpxchg.org, ziy@nvidia.com,
rostedt@goodmis.org, mhiramat@kernel.org,
mathieu.desnoyers@efficios.com, david@kernel.org, ljs@kernel.org,
liam@infradead.org, rppt@kernel.org, bigeasy@linutronix.de,
clrkwllms@kernel.org, Alexander.Krabler@kuka.com,
Hugh Dickins <hughd@google.com>
Subject: Re: [RFC PATCH 1/3] mm/compaction: skip isolate mlocked folios when compact_unevictable_allowed=0
Date: Thu, 18 Jun 2026 19:43:09 +0800
Message-ID: <040788a9-e0d5-478e-bb48-3d22b8b41020@gmail.com>
In-Reply-To: <969cb14b-5b8b-48e6-add6-4dd13101dd89@kernel.org>
On 6/18/26 02:52, Vlastimil Babka (SUSE) wrote:
> On 6/4/26 04:38, Wandun Chen wrote:
>> From: Wandun Chen <chenwandun@lixiang.com>
>>
>> compact_unevictable_allowed is default 0 under PREEMPT_RT,
>> isolate_migratepages_block() skips folios with PG_unevictable set.
>> However, mlock_folio() sets PG_mlocked immediately but defers
>> PG_unevictable to mlock_folio_batch(), result in a folio with
>> PG_mlocked=1 but PG_unevictable=0. Compaction will isolate such a
>> folio.
>>
>> Fix by checking folio_test_mlocked() together with the existing
>> folio_test_unevictable() check.
>>
>> A similar issue has been reported by Alexander Krabler on a 6.12-rt
>> aarch64 system. Vlastimil suggested to check the mlocked flag [1].
>>
>> Reported-by: Alexander Krabler <Alexander.Krabler@kuka.com>
>> Closes: https://lore.kernel.org/all/DU0PR01MB10385345F7153F334100981888259A@DU0PR01MB10385.eurprd01.prod.exchangelabs.com/
>> Suggested-by: Vlastimil Babka <vbabka@suse.cz>
>> Signed-off-by: Wandun Chen <chenwandun@lixiang.com>
>> Link: https://lore.kernel.org/all/33275585-f2db-4779-89f0-3ae24b455a67@suse.cz/ [1]
>
> Well in that thread, Hugh doubted my suggestion and then it seems we didn't
> concluded anything. Did you actually in practice observe the issue that
> Alexander had, and that this patch fixed it, or is that theoretical?
>
Yes, I wrote a test case that reproduces it within a few seconds.
The test case has 3 steps:
1. mlockall
2. mmap a 2GB file and trigger write page faults on it;
3. while step 2 is running, trigger compaction via /proc/sys/vm/compact_memory
My reproduction environment is qemu (4GB RAM, 8 cores, aarch64)
running a PREEMPT_RT kernel that includes the tracepoint from patch 2/3.
After the reproducer has run for a few seconds, the following output
appears.
repro-403 [004] ....1 101.270505: mm_compaction_isolate_folio: pfn=0x71e3a mode=0x0 flags=referenced|uptodate|mlocked
repro-403 [004] ....1 101.270507: mm_compaction_isolate_folio: pfn=0x71e3b mode=0x0 flags=referenced|uptodate|mlocked
repro-403 [004] ....1 101.270513: mm_compaction_isolate_folio: pfn=0x71e3c mode=0x0 flags=referenced|uptodate|mlocked
repro-403 [004] ....1 101.270515: mm_compaction_isolate_folio: pfn=0x71e3d mode=0x0 flags=uptodate|mlocked
repro-403 [004] ....1 101.270517: mm_compaction_isolate_folio: pfn=0x71e3e mode=0x0 flags=uptodate|mlocked
repro-403 [004] ....1 101.270520: mm_compaction_isolate_folio: pfn=0x71e3f mode=0x0 flags=uptodate|mlocked
Unfortunately, I recently found that the fix is still incomplete.
mlock_folio() can set the mlocked flag after the folio has already
been isolated successfully, so the check at isolation time cannot
fully prevent migration. I need to think more about how to fix this.
Perhaps we should check again whether the folio is mlocked during
the actual migration phase.
What do you think of this best-effort approach?
Best regards,
Wandun
The full reproducer is below:

/* gcc repro.c -o repro -lpthread */
#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#define PAGE_SIZE 4096
#define NR_PAGES 32
#define FILE_SIZE (2ULL * 1024 * 1024 * 1024)

/* Map the file and write-fault it in batches of NR_PAGES pages. */
static void *worker_fn(void *arg)
{
	int fd = (long)arg;
	size_t len = (size_t)FILE_SIZE;
	char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

	if (p == MAP_FAILED)
		return NULL;

	for (size_t off = 0; off + NR_PAGES * PAGE_SIZE <= len;
	     off += NR_PAGES * PAGE_SIZE) {
		for (int i = 0; i < NR_PAGES; i++)
			p[off + i * PAGE_SIZE] = 1;
		usleep(200);
	}

	munmap(p, len);
	return NULL;
}

/* Trigger compaction every 5ms until the process exits. */
static void *compact_fn(void *arg)
{
	(void)arg;
	int fd = open("/proc/sys/vm/compact_memory", O_WRONLY);

	if (fd < 0)
		return NULL;

	while (1) {
		/* Write errors are deliberately ignored. */
		if (write(fd, "1", 1) < 0) {}
		usleep(5000);
	}
}

int main(void)
{
	/* Without mlockall() the test is meaningless, so bail out on failure. */
	if (mlockall(MCL_CURRENT | MCL_FUTURE) < 0) {
		perror("mlockall");
		return 1;
	}

	int fd = open("./repro_largefile.dat", O_RDWR | O_CREAT, 0600);

	if (fd < 0)
		return 1;
	unlink("./repro_largefile.dat");
	if (ftruncate(fd, (off_t)FILE_SIZE) < 0)
		return 1;

	printf("repro_largefile: 1 worker, %d pages/batch, Ctrl-C to stop\n",
	       NR_PAGES);

	pthread_t compact, worker;

	pthread_create(&compact, NULL, compact_fn, NULL);
	pthread_create(&worker, NULL, worker_fn, (void *)(long)fd);
	pthread_join(worker, NULL);
	return 0;
}

>> ---
>> mm/compaction.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/compaction.c b/mm/compaction.c
>> index b776f35ad020..7e07b792bcb5 100644
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>> @@ -1116,7 +1116,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>> is_unevictable = folio_test_unevictable(folio);
>>
>> /* Compaction might skip unevictable pages but CMA takes them */
>> - if (!(mode & ISOLATE_UNEVICTABLE) && is_unevictable)
>> + if (!(mode & ISOLATE_UNEVICTABLE) &&
>> + (is_unevictable || folio_test_mlocked(folio)))
>> goto isolate_fail_put;
>>
>> /*
>