* tracepoints expose s_dev of kernel-internal superblocks -- no generic resolution interface
@ 2026-05-31 17:10 Yiyang Chen
From: Yiyang Chen @ 2026-05-31 17:10 UTC (permalink / raw)
  To: linux-fsdevel, linux-mm, linux-trace-devel
  Cc: Christian Brauner, Steven Rostedt, Andrew Morton, Matthew Wilcox

Hi all,

While tracing page cache activity via mm_filemap_add_to_page_cache, I
noticed s_dev values that do not appear in /proc/*/mountinfo:

  mm_filemap_add_to_page_cache: dev 0:18 ino dea89 pfn=0x13ba00 ofs=0 order=9

Using a kernel module to enumerate all active superblocks, I confirmed
this is the hugetlbfs internal mount created by hugetlbfs_init() ->
kern_mount().  The actual s_dev value is dynamically allocated via
get_anon_bdev() and varies across systems (0:18 on my test machine).
Because kern_mount() marks the mount as MNT_NS_INTERNAL instead of
attaching it to a mount namespace, it never shows up in any mountinfo.

Each internal filesystem requires its own trick to discover the
s_dev -> fs_type mapping:

  - shmem:      create a memfd, call fstatfs() -> TMPFS_MAGIC
  - bdev:       open a block device, call fstatfs()
  - hugetlbfs:  memfd_create(MFD_HUGETLB) creates a file on the
                internal mount itself, so fstat() gives s_dev and
                fstatfs() returns HUGETLBFS_MAGIC

None of these is a general, authoritative interface -- each requires
filesystem-specific knowledge of which object to create and which
syscall to call.

The question is: should there be a single stable interface that exposes
the s_dev -> fs_type mapping for all active superblocks, including
internal ones? Here are some options:

  - Add fs_type to the affected tracepoints.
  - Provide a generic interface (BPF iterator, /proc, /sys) that
    exposes s_dev + fs_type for all active superblocks including
    internal ones.


Steps to reproduce:

The s_dev value is allocated dynamically via get_anon_bdev(), so it
will differ from system to system.

--- test_hugetlb_sdev.c ---
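#include <sys/mman.h>
#include <string.h>
#define LEN (2UL << 20)   /* one 2 MiB huge page */
int main(void) {
    /* An anonymous MAP_HUGETLB mapping is backed by a file on the
     * internal hugetlbfs mount. */
    void *p = mmap(NULL, LEN, PROT_READ|PROT_WRITE,
                   MAP_SHARED|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0);
    if (p == MAP_FAILED) return 1;
    memset(p, 0, LEN);    /* fault the page in so the tracepoint fires */
    munmap(p, LEN);
    return 0;
}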

  gcc -o test_hugetlb_sdev test_hugetlb_sdev.c
  # precondition:
  cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages   # must be > 0

--- trace script (run with sudo) ---
  TP=/sys/kernel/tracing
  DEVS=$(awk '{print $3}' /proc/self/mountinfo | sort -u)

  echo 1 > $TP/events/filemap/mm_filemap_add_to_page_cache/enable
  echo 1 > $TP/events/hugetlbfs/hugetlbfs_alloc_inode/enable
  echo > $TP/trace

  ./test_hugetlb_sdev
  sleep 1

  echo "=== s_dev from tracepoint, NOT in mountinfo ==="
  # filemap uses "dev X:Y", hugetlbfs uses "dev X,Y"
  grep -v '^#' $TP/trace | grep -oP 'dev[= ]\K\d+[:,]\d+' | tr ',' ':' | sort -u \
      | while read d; do echo "$DEVS" | grep -qx "$d" || echo "  $d"; done

  echo 0 > $TP/events/filemap/enable
  echo 0 > $TP/events/hugetlbfs/enable
  echo > $TP/trace


The output lists the hugetlbfs s_dev, which is not present in any
mountinfo.


Thanks.

