Subject: [v3 PATCH 0/1] fs/proc: Expose mm_cpumask in /proc/[pid]/status
From: Aaron Tomlin
Date: 2026-01-15 20:54 UTC
To: oleg, akpm, gregkh, david, brauner, mingo
Cc: neelx, sean, linux-kernel, linux-fsdevel

Hi Oleg, David, Greg, Andrew,

This patch introduces a mechanism to expose the mm_cpumask of a process via
the /proc/[pid]/status interface.

In high-performance and large-scale NUMA environments, diagnosing latency
spikes attributed to Inter-Processor Interrupts (IPIs) can be particularly
challenging. While cpus_allowed describes where a thread may execute, it
does not describe the "memory footprint" - specifically, the set of CPUs
that may hold stale Translation Lookaside Buffer (TLB) entries for the
process.

It is this footprint (mm_cpumask) that determines which CPUs receive TLB
flush IPIs. Discrepancies between a process's scheduling affinity and its
memory footprint are a common source of system noise and performance
degradation. By exposing this mask, we give userspace the visibility
required to debug these "invisible" sources of latency.
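
To illustrate how userspace might consume this, here is a minimal sketch
(not part of the patch) that compares Cpus_allowed_list with
Cpus_active_mm_list from the text of /proc/[pid]/status. The helper names
status_field() and cpulist_to_mask() are hypothetical, and the sketch
assumes at most 64 CPUs:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a pointer to the value of "key:" in status-style text, or NULL
 * if no line starts with that key. The value runs up to the next '\n'.
 */
const char *status_field(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line != NULL && *line != '\0') {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            const char *v = line + klen + 1;
            while (*v == ' ' || *v == '\t')
                v++;
            return v;
        }
        line = strchr(line, '\n');
        if (line != NULL)
            line++;
    }
    return NULL;
}

/*
 * Parse a CPU list such as "0-3,8" into a 64-bit mask. Parsing stops at
 * '\0' or '\n'. CPUs >= 64 are silently ignored (assumption of this
 * sketch). Returns 0 on success, -1 on malformed input.
 */
int cpulist_to_mask(const char *list, uint64_t *mask)
{
    const char *p = list;

    assert(list != NULL && mask != NULL);
    *mask = 0;
    while (*p != '\0' && *p != '\n') {
        char *end;
        unsigned long lo = strtoul(p, &end, 10);
        unsigned long hi;

        if (end == p)
            return -1;
        hi = lo;
        p = end;
        if (*p == '-') {
            p++;
            hi = strtoul(p, &end, 10);
            if (end == p || hi < lo)
                return -1;
            p = end;
        }
        for (unsigned long c = lo; c <= hi && c < 64; c++)
            *mask |= UINT64_C(1) << c;
        if (*p == ',')
            p++;
        else if (*p != '\0' && *p != '\n')
            return -1;
    }
    return 0;
}
```

With both masks in hand, `active & ~allowed` gives the CPUs that may still
receive TLB flush IPIs for the process even though none of its threads are
allowed to run there.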

The new fields (Cpus_active_mm and Cpus_active_mm_list) are exposed only
on architectures that explicitly opt in via
CONFIG_ARCH_WANT_PROC_CPUS_ACTIVE_MM. This is necessary because
mm_cpumask semantics vary significantly across architectures; some
(e.g., x86) actively maintain the mask for coherency, while others may
never clear bits, rendering the data misleading for this specific use
case. x86 is updated to select this feature by default.

For example, on a non-x86 build, where the option is not selected, the
helper compiles down to an empty stub:

    # make fs/proc/array.i
    # grep task_cpus_active_mm -B 1 -A 3 --max-count 1 fs/proc/array.i
    # 430 "fs/proc/array.c"
    static inline __attribute__((__gnu_inline__)) __attribute__((__unused__)) __attribute__((no_instrument_function)) void task_cpus_active_mm(struct seq_file *m, struct mm_struct *mm)
    {
    }
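
The empty body above is what a compile-time opt-in with a stub fallback
produces. A rough userspace sketch of that pattern follows; it is not the
code from the patch. WANT_CPUS_ACTIVE_MM is a hypothetical stand-in for the
Kconfig symbol, a plain FILE * stands in for struct seq_file, and the
output format is an assumption:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for CONFIG_ARCH_WANT_PROC_CPUS_ACTIVE_MM.
 * Build with -DWANT_CPUS_ACTIVE_MM to opt in.
 */
#ifdef WANT_CPUS_ACTIVE_MM
static inline int show_cpus_active_mm(FILE *out, uint64_t mask)
{
    assert(out != NULL);
    /* Assumed format: field name, tab, hex mask. */
    return fprintf(out, "Cpus_active_mm:\t%llx\n",
                   (unsigned long long)mask);
}
#else
/* Opt-out build: callers stay unconditional, the stub emits nothing. */
static inline int show_cpus_active_mm(FILE *out, uint64_t mask)
{
    (void)out;
    (void)mask;
    return 0;
}
#endif
```

Keeping the stub means the caller needs no #ifdef of its own, and the
compiler can drop the call entirely on architectures that do not opt in.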

The implementation reads the mask directly without introducing additional
locks or snapshots. While this implies that the hex mask and list format
could theoretically observe slightly different states on a rapidly
changing system, this "best-effort" approach aligns with the standard
design philosophy of /proc and avoids imposing locking overhead on
critical memory management paths.
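
To make the consequence concrete, here is a small userspace sketch (again,
not the kernel code) in which the hex and list strings are produced from
two separate, unsynchronised reads of the same mask. The names
mask_to_list() and show_both() are hypothetical, and a single 64-bit
atomic word stands in for the cpumask:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 64-bit mask as a range list, e.g. 0x10F -> "0-3,8".
 * Returns the string length, or -1 if buf is too small.
 */
int mask_to_list(uint64_t mask, char *buf, size_t len)
{
    size_t used = 0;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    for (int cpu = 0; cpu < 64; cpu++) {
        int start, n;

        if (!((mask >> cpu) & 1))
            continue;
        start = cpu;
        /* Extend the range while the next bit is also set. */
        while (cpu + 1 < 64 && ((mask >> (cpu + 1)) & 1))
            cpu++;
        if (start == cpu)
            n = snprintf(buf + used, len - used, "%s%d",
                         used ? "," : "", start);
        else
            n = snprintf(buf + used, len - used, "%s%d-%d",
                         used ? "," : "", start, cpu);
        if (n < 0 || (size_t)n >= len - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}

/*
 * Emit both representations without a lock or snapshot. Each format is
 * built from its own load, so a concurrent writer may make them disagree.
 */
void show_both(_Atomic uint64_t *maskp, char *hex, size_t hexlen,
               char *list, size_t listlen)
{
    uint64_t first = atomic_load_explicit(maskp, memory_order_relaxed);
    snprintf(hex, hexlen, "%llx", (unsigned long long)first);

    uint64_t second = atomic_load_explicit(maskp, memory_order_relaxed);
    mask_to_list(second, list, listlen);
}
```

A reader that needs the two fields to agree can simply re-read the file;
taking a snapshot in the kernel would add cost for every reader to guard
against a case that rarely matters for debugging.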


Changes since v2 [1]:
 - Introduce a new Kconfig option, ARCH_WANT_PROC_CPUS_ACTIVE_MM. The x86
   architecture now explicitly selects this feature, ensuring that the
   field is only exposed where the mm_cpumask semantics are meaningful for
   TLB coherency (David Hildenbrand)

Changes since v1 [2]:
 - Document new Cpus_active_mm and Cpus_active_mm_list entries in
   /proc/[pid]/status (Oleg Nesterov)

[1]: https://lore.kernel.org/lkml/20251226211407.2252573-1-atomlin@atomlin.com/ 
[2]: https://lore.kernel.org/lkml/20251217024603.1846651-1-atomlin@atomlin.com/

Aaron Tomlin (1):
  fs/proc: Expose mm_cpumask in /proc/[pid]/status

 Documentation/filesystems/proc.rst |  7 +++++++
 arch/x86/Kconfig                   |  1 +
 fs/proc/Kconfig                    | 14 ++++++++++++++
 fs/proc/array.c                    | 28 +++++++++++++++++++++++++++-
 4 files changed, 49 insertions(+), 1 deletion(-)

-- 
2.51.0

