* Minor page faults from memory compaction causing in-band transitions
@ 2026-03-31 23:14 Brandon Ho
  2026-04-02 21:23 ` Jan Kiszka
  2026-04-03 12:55 ` Philippe Gerum
  0 siblings, 2 replies; 13+ messages in thread
From: Brandon Ho @ 2026-03-31 23:14 UTC (permalink / raw)
  To: xenomai; +Cc: Brandon Ho, Jay Sridharan

Hi Xenomai team,

I'm working on a real-time control application using Xenomai and we've been
experiencing unexpected in-band transitions caused by minor page faults during
kernel memory compaction. Even though our RT threads use mlockall(MCL_CURRENT |
MCL_FUTURE) and we've pre-faulted memory, the kernel's compaction process seems
to temporarily invalidate PTEs on our locked pages, causing faults when the RT
thread accesses them.

Has this issue come up before? I'm wondering if there's a mechanism to reserve
a section of RAM that's excluded from compaction/migration entirely, or if
there are kernel/Xenomai configurations we should be using to prevent this.

Any guidance would be appreciated!

Thanks,
Brandon

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: Minor page faults from memory compaction causing in-band transitions
@ 2026-05-04 17:52 Jay Sridharan
  2026-05-06  7:40 ` Philippe Gerum
  0 siblings, 1 reply; 13+ messages in thread
From: Jay Sridharan @ 2026-05-04 17:52 UTC (permalink / raw)
  To: rpm; +Cc: Brandon Ho, brho, jsridharan, xenomai

Hi Philippe,

Let me know what you think of this patch for the docs:

diff --git a/content/core/caveat.md b/content/core/caveat.md
index b24e270..e11d605 100644
--- a/content/core/caveat.md
+++ b/content/core/caveat.md
@@ -70,6 +70,68 @@ which depend on instrumenting the spinlock constructs (e.g.
 `CONFIG_DEBUG_PREEMPT`), you may want to disable all the related kernel
 options, starting with `CONFIG_SMP`.

+### Memory compaction and page migration impact real-time behavior {#caveat-memory-compaction}
+
+Several kernel configuration options related to memory management can
+introduce unpredictable latency through page migration and page fault
+handling, which is problematic for real-time workloads:
+
+- `CONFIG_COMPACTION`: Enables memory compaction to reduce fragmentation
+  by migrating pages to create larger contiguous memory regions. The
+  compaction process can trigger page migrations that introduce latency.
+
+- `CONFIG_MIGRATION`: Allows the kernel to move the physical location of
+  a process's memory pages without changing their virtual addresses, for
+  example when balancing pages across NUMA nodes or during compaction.
+  Page migration involves copying page contents and updating page table
+  entries, which takes time and can cause page faults while those
+  entries are temporarily unavailable. The use of `mlock` or `mlockall`
+  does **NOT** automatically guarantee that a page will not be migrated.
+  See below for more.
+
+- `CONFIG_TRANSPARENT_HUGEPAGE`: Transparent Hugepages (THP) reduce the
+  number of page faults by mapping 2MB instead of 4KB pages, but each
+  fault becomes more expensive because of the larger allocation and
+  zeroing work involved, whether on initial access or when khugepaged
+  compacts memory in the background.
+
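+You can check whether your running kernel enables these options with a
+quick grep (the config file location shown here varies by distribution,
+and some kernels expose it via `/proc/config.gz` instead):
+
+```sh
+grep -E 'CONFIG_(COMPACTION|MIGRATION|TRANSPARENT_HUGEPAGE)' \
+     /boot/config-$(uname -r)
+```
+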
+The best approach depends on your workload characteristics. In an ideal
+situation, you would disable `CONFIG_COMPACTION`, `CONFIG_MIGRATION`,
+and `CONFIG_TRANSPARENT_HUGEPAGE` entirely. In other situations, you may
+need to keep these options enabled.
+
+If you are running applications that allocate and free memory often,
+and/or need a steady source of contiguous pages, you may need to keep
+`CONFIG_MIGRATION` and `CONFIG_COMPACTION` enabled. Without them, your
+in-band applications may experience out-of-memory issues.
+
+Luckily, procfs provides tunables to control compaction behavior:
+
+- `vm.compaction_proactiveness`: determines how aggressively compaction
+  runs in the background. Writing a non-zero value to this tunable
+  immediately triggers a proactive compaction run. Setting it to 0
+  disables proactive compaction.
+- `vm.compact_unevictable_allowed`: When set to 1, compaction is allowed
+  to examine the unevictable lru (mlocked pages) for pages to compact. This
+  should be used on systems where stalls for minor page faults are an
+  acceptable trade for large contiguous free memory. Set to 0 to prevent
+  compaction from moving pages that are unevictable. On EVL, the default
+  value is 0 in order to avoid a page fault due to compaction.
+  (`CONFIG_COMPACT_UNEVICTABLE_DEFAULT`)
+
+procfs also provides an interface to manually trigger a memory compaction
+operation using `vm.compact_memory`.
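+
+For example, the tunables above could be combined as follows to avoid
+compaction-induced faults on locked pages (sysctl names as found in
+recent kernels; requires root):
+
+```sh
+sysctl -w vm.compaction_proactiveness=0     # no background compaction
+sysctl -w vm.compact_memory=1               # one full compaction pass now
+sysctl -w vm.compact_unevictable_allowed=0  # leave mlocked pages alone
+```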
+
+Using these tunables, you can perform a sequence such as the following
+to prime the system for reliability:
+
+- Launch your EVL threads and complete their initialization, but do not
+  call `mlockall` yet.
+- Trigger a memory compaction manually using `vm.compact_memory`.
+- Once complete, call `mlockall` to lock pages.
+- Then, disable `vm.compact_unevictable_allowed` to prevent those pages
+  from getting migrated.
+- Finally, launch other in-band applications.
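+
+A minimal sketch of that sequence from a shell, assuming the application
+defers its `mlockall` call until signaled (the application name and the
+use of `SIGUSR1` here are hypothetical):
+
+```sh
+./my-evl-app &                               # hypothetical EVL application
+APP=$!
+sysctl -w vm.compact_memory=1                # compact before locking pages
+kill -USR1 $APP                              # app now calls mlockall(2)
+sysctl -w vm.compact_unevictable_allowed=0   # keep locked pages in place
+# ...now launch other in-band applications...
+```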
+
 ## Architecture-specific issues

 ### x86 {#x86-caveat}

^ permalink raw reply related	[flat|nested] 13+ messages in thread
* Re: Minor page faults from memory compaction causing in-band transitions
@ 2026-04-13 19:45 Jay Sridharan
  0 siblings, 0 replies; 13+ messages in thread
From: Jay Sridharan @ 2026-04-13 19:45 UTC (permalink / raw)
  To: rpm; +Cc: Brandon Ho, brho, jsridharan, xenomai

I think it would also be very useful to add these considerations to
the "Caveat" page in the EVL documentation.

Perhaps some discussion on the impacts of CONFIG_COMPACTION,
CONFIG_MIGRATION and CONFIG_TRANSPARENT_HUGEPAGE, as well as the
impact of COMPACT_UNEVICTABLE_DEFAULT.

I know the Caveat page already has an in-depth discussion of CPU
isolation and SMIs, so it feels like a good place to mention it.

Is there a place where we can submit a patch to the docs?

Jay

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Minor page faults from memory compaction causing in-band transitions
@ 2026-03-31 22:52 Brandon Ho
  0 siblings, 0 replies; 13+ messages in thread
From: Brandon Ho @ 2026-03-31 22:52 UTC (permalink / raw)
  To: xenomai@lists.linux.dev; +Cc: Jay Sridharan

Hi Xenomai team,

I'm working on a real-time control application using Xenomai and we've been experiencing unexpected in-band transitions caused by minor page faults during kernel memory compaction. Even though our RT threads use mlockall(MCL_CURRENT | MCL_FUTURE) and we've pre-faulted memory, the kernel's compaction process seems to temporarily invalidate PTEs on our locked pages, causing faults when the RT thread accesses them.

We've tried the usual mitigations (disabling THP, MAP_LOCKED, etc.) but compaction can still trigger these faults on already-locked pages.

Has this issue come up before? I'm wondering if there's a mechanism to reserve a section of RAM that's excluded from compaction/migration entirely, or if there are kernel/Xenomai configurations we should be using to prevent this.

Any guidance would be appreciated!

Thanks,
Brandon

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2026-05-06  7:41 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-03-31 23:14 Minor page faults from memory compaction causing in-band transitions Brandon Ho
2026-04-02 21:23 ` Jan Kiszka
2026-04-03 12:55 ` Philippe Gerum
2026-04-13 18:19   ` Brandon Ho
2026-04-13 18:31     ` Philippe Gerum
     [not found]       ` <CY1P110MB0760484C0708C37BA89D5A02D624A@CY1P110MB0760.NAMP110.PROD.OUTLOOK.COM>
2026-04-14  7:05         ` Philippe Gerum
2026-04-14  7:37       ` Florian Bezdeka
2026-04-14  7:42         ` Philippe Gerum
2026-04-14  7:52           ` Philippe Gerum
  -- strict thread matches above, loose matches on Subject: below --
2026-05-04 17:52 Jay Sridharan
2026-05-06  7:40 ` Philippe Gerum
2026-04-13 19:45 Jay Sridharan
2026-03-31 22:52 Brandon Ho
