Linux-mm Archive on lore.kernel.org
* [SECURITY] mm/userfaultfd: cross-inode page cache injection via mfill_copy_folio_retry()
@ 2026-05-02  8:20 Anindya Roy
  2026-05-02  9:40 ` Andrew Morton
  0 siblings, 1 reply; 5+ messages in thread
From: Anindya Roy @ 2026-05-02  8:20 UTC (permalink / raw)
  To: akpm, rppt, david; +Cc: ljs, security, linux-mm, lokeshgidra

Hello,

I would like to report a vulnerability in the Linux kernel userfaultfd
implementation that allows cross-inode page cache injection due to VMA
replacement during the UFFDIO_COPY retry path.

*Summary:*

A local unprivileged attacker can inject controlled data into the page
cache of a different shmem/memfd/tmpfs inode than originally targeted.

This occurs because mfill_copy_folio_retry() drops all locks, allowing the
destination VMA to be replaced, but does not verify that the underlying
file (vm_file) remains the same when execution resumes.

As a result, a folio allocated for one inode can be inserted into another
inode’s page cache.

*Affected component:*

File: mm/userfaultfd.c
Function: mfill_copy_folio_retry()

*Affected versions:*

Introduced by: f5f035a72423
Incomplete fix: 292411fda25b
Tested on:
- Ubuntu 6.17.0-23-generic (x86_64)
- Debian 6.1.159-1

The issue appears to be present in the current mainline at the time of testing.

*Root cause:*

During UFFDIO_COPY:

   1. A folio is allocated based on the original VMA (inode A)
   2. mfill_copy_folio_retry() drops all locks
   3. While locks are dropped, another thread replaces the VMA using
   mmap(MAP_FIXED) with a different file (inode B)
   4. After re-acquiring the VMA, no validation ensures that vm_file is
   unchanged
   5. The previously allocated folio is inserted into inode B’s page cache

The existing fix (292411fda25b) compares vma_uffd_ops(), which is
insufficient because different shmem mappings share the same ops pointer.
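
To illustrate why an ops-pointer comparison cannot tell two shmem mappings
apart, here is a minimal user-space model. It is only a sketch: the
model_* types, shmem_ops and the helper names are hypothetical stand-ins,
not the kernel's definitions, and it assumes (as the report states) that
every shmem mapping resolves to the same ops table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
struct model_uffd_ops { const char *name; };
struct model_inode { unsigned long ino; };
struct model_file { struct model_inode *inode; };
struct model_vma {
    const struct model_uffd_ops *ops; /* models what vma_uffd_ops() returns */
    struct model_file *file;          /* models vma->vm_file */
};

/* Assumption: one ops table is shared by every shmem mapping. */
static const struct model_uffd_ops shmem_ops = { "shmem" };

/* Models the ops-only check: true when the ops table is unchanged. */
static bool ops_unchanged(const struct model_vma *before,
                          const struct model_vma *after)
{
    assert(before != NULL && after != NULL);
    return before->ops == after->ops;
}

/* Models a file-identity check: true when both VMAs map the same inode. */
static bool inode_unchanged(const struct model_vma *before,
                            const struct model_vma *after)
{
    assert(before != NULL && after != NULL);
    if (before->file == NULL || after->file == NULL)
        return before->file == after->file;
    return before->file->inode == after->file->inode;
}

/* Build two shmem VMAs backed by different inodes and run both checks. */
static void compare_checks(bool *ops_ok, bool *inode_ok)
{
    struct model_inode inode_a = { 1 }, inode_b = { 2 };
    struct model_file file_a = { &inode_a }, file_b = { &inode_b };
    struct model_vma before = { &shmem_ops, &file_a };
    struct model_vma after = { &shmem_ops, &file_b };

    *ops_ok = ops_unchanged(&before, &after);
    *inode_ok = inode_unchanged(&before, &after);
}
```

In this model compare_checks() reports that the ops check passes while the
inode check fails, which is the gap described above.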

*Impact:*

This provides a deterministic primitive:

   - Arbitrary page cache write into shmem/tmpfs files
   - Full control over contents (PAGE_SIZE per injection)
   - No timing race required
   - Works as an unprivileged user (userfaultfd or via user namespace)
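
Whether an unprivileged user can reach this path depends on system
configuration. As a rough sketch for checking one relevant knob, the
helper below reads the vm.unprivileged_userfaultfd sysctl through procfs.
The function names are hypothetical, and the interpretation is an
assumption to verify against the kernel documentation for your version:
0 is understood to restrict unprivileged callers of userfaultfd(2) to
user-mode faults, 1 to lift that restriction. Access through
/dev/userfaultfd is governed separately by the device node's permissions.

```c
#include <assert.h>
#include <stdio.h>

/* Read a single integer from a procfs-style file; -1 on any failure. */
static int read_int_file(const char *path)
{
    int value = -1;
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Returns the sysctl value, or -1 if it cannot be read on this system. */
static int unprivileged_uffd_setting(void)
{
    return read_int_file("/proc/sys/vm/unprivileged_userfaultfd");
}
```

A return value of -1 only means the file could not be read (for example,
the kernel was built without userfaultfd support); it says nothing about
exposure by itself.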

Security implications include:

   - Container escape via shared tmpfs (e.g. /dev/shm) in misconfigured
   environments
   - Bypass of memfd seals (F_SEAL_WRITE not enforced in this path)
   - Corruption of shared memory used by privileged processes
   - Enabling privilege escalation chains

*Reproduction:*

The issue is deterministic and can be reproduced as follows:

   1. Map a shmem/memfd file (fd_A) and register userfaultfd
   2. Trigger UFFDIO_COPY with a faulting source buffer
   3. During retry, replace the destination VMA with a mapping of a
   different file (fd_B)
   4. Resume execution and observe injected data in fd_B’s page cache

*Proposed fix:*

Validate that the VMA still maps the same file after retry:

    if (state->vma->vm_file != orig_file)
            return -EAGAIN;

Additionally, shmem_mfill_filemap_add() should enforce memfd seals (e.g.
reject when F_SEAL_WRITE or F_SEAL_FUTURE_WRITE is set).
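
For reference, this is how seal enforcement looks on the ordinary write(2)
path, which is the behaviour the report says the UFFDIO_COPY path lacks.
It is a user-space sketch only; the function name sealed_write_errno() and
the "seal-demo" label are made up for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Create a sealable memfd, apply F_SEAL_WRITE, then attempt write(2).
 * Returns the errno of the rejected write, 0 if the write unexpectedly
 * succeeded, or -1 if the setup steps failed.
 */
static int sealed_write_errno(void)
{
    int fd = memfd_create("seal-demo", MFD_CLOEXEC | MFD_ALLOW_SEALING);
    int err = -1;

    if (fd < 0)
        return -1;
    if (ftruncate(fd, 4096) == 0 &&
        fcntl(fd, F_ADD_SEALS, F_SEAL_WRITE) == 0) {
        ssize_t n = write(fd, "x", 1);

        err = (n < 0) ? errno : 0;
    }
    close(fd);
    return err;
}
```

Per fcntl(2), a write to a file sealed with F_SEAL_WRITE fails with EPERM,
so that is the value this helper is expected to return when setup succeeds.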

*Notes:*

   - The issue is deterministic and reproducible in a single attempt
   - No timing race is required
   - It breaks isolation between independent shmem inodes

For demonstration, I can provide a proof-of-concept that shows one possible
exploitation path: a privileged process reading commands from a /dev/shm
file can be influenced via injected page cache contents, leading to
unintended command execution.

This demonstrates exploitability, but is not required to trigger the
underlying vulnerability.

Please let me know if further details are needed.

Best regards,
Andy
https://github.com/theteatoast


* Re: [SECURITY] mm/userfaultfd: cross-inode page cache injection via mfill_copy_folio_retry()
  2026-05-02  8:20 [SECURITY] mm/userfaultfd: cross-inode page cache injection via mfill_copy_folio_retry() Anindya Roy
@ 2026-05-02  9:40 ` Andrew Morton
  2026-05-02 10:15   ` Anindya Roy
  2026-05-02 19:26   ` Mike Rapoport
  0 siblings, 2 replies; 5+ messages in thread
From: Andrew Morton @ 2026-05-02  9:40 UTC (permalink / raw)
  To: Anindya Roy; +Cc: rppt, david, ljs, security, linux-mm, lokeshgidra

On Sat, 2 May 2026 13:50:02 +0530 Anindya Roy <anindyaandy1904@gmail.com> wrote:

> Hello,
> 
> I would like to report a vulnerability in the Linux kernel userfaultfd
> implementation that allows cross-inode page cache injection due to VMA
> replacement during the UFFDIO_COPY retry path.
> 
> *Summary:*
> 
> A local unprivileged attacker can inject controlled data into the page
> cache of a different shmem/memfd/tmpfs inode than originally targeted.
> 
> This occurs because mfill_copy_folio_retry() drops all locks, allowing the
> destination VMA to be replaced, but does not verify that the underlying
> file (vm_file) remains the same when execution resumes.
> 
> As a result, a folio allocated for one inode can be inserted into another
> inode’s page cache.
> 
> *Affected component:*
> 
> File: mm/userfaultfd.c
> Function: mfill_copy_folio_retry()

Thanks.

> *Affected versions:*
> 
> Introduced by: f5f035a72423

Only in 7.1-rc1, fortunately.

May I ask how you found this?

Annoyingly, Sashiko found it:

: If the locks are dropped during the retry, can the anonymous VMA be
: replaced by a different type of VMA (such as hugetlb or shared shmem)
: at the same address?
: 
: mfill_get_vma() accepts these VMAs and returns 0.  When execution
: resumes in mfill_atomic_pte_copy(), it proceeds to
: mfill_atomic_install_pte() using the newly populated state->vma.
: 
: For a hugetlb VMA, mfill_establish_pmd() will allocate a standard 4KB
: PMD into a hugepage page table hierarchy.  For a shared shmem VMA,
: folio_add_new_anon_rmap() will be called on a shared file-backed VMA.
: Could this sequence cause page table corruption or kernel crashes?

We must have simply failed to look :(

It claims to have found other things:
	https://sashiko.dev/#/patchset/20260402041156.1377214-6-rppt@kernel.org

>
> ...
>



* Re: [SECURITY] mm/userfaultfd: cross-inode page cache injection via mfill_copy_folio_retry()
  2026-05-02  9:40 ` Andrew Morton
@ 2026-05-02 10:15   ` Anindya Roy
  2026-05-02 19:26   ` Mike Rapoport
  1 sibling, 0 replies; 5+ messages in thread
From: Anindya Roy @ 2026-05-02 10:15 UTC (permalink / raw)
  To: Andrew Morton; +Cc: rppt, david, ljs, security, linux-mm, lokeshgidra

Hello,

Thanks for the response.

I used multiple agents to analyze recent changes and explore potential
attack surfaces in the userfaultfd retry path, and then followed up with
manual reasoning and validation. This led me to the missing vm_file
validation after mfill_copy_folio_retry(), which I confirmed in my Ubuntu
VM.

I wasn’t aware of the Sashiko report beforehand. From your excerpt, it
appears to focus on VMA type mismatches, whereas my report demonstrates a
deterministic cross-inode page cache injection primitive due to folio reuse
across different vm_file mappings.

Happy to provide a reproducer if helpful.


* Re: [SECURITY] mm/userfaultfd: cross-inode page cache injection via mfill_copy_folio_retry()
  2026-05-02  9:40 ` Andrew Morton
  2026-05-02 10:15   ` Anindya Roy
@ 2026-05-02 19:26   ` Mike Rapoport
  2026-05-04 12:53     ` Anindya Roy
  1 sibling, 1 reply; 5+ messages in thread
From: Mike Rapoport @ 2026-05-02 19:26 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Anindya Roy, david, ljs, security, linux-mm, lokeshgidra

On Sat, May 02, 2026 at 02:40:12AM -0700, Andrew Morton wrote:
> On Sat, 2 May 2026 13:50:02 +0530 Anindya Roy <anindyaandy1904@gmail.com> wrote:
> 
> > Hello,
> > 
> > I would like to report a vulnerability in the Linux kernel userfaultfd
> > implementation that allows cross-inode page cache injection due to VMA
> > replacement during the UFFDIO_COPY retry path.
> > 
> > *Summary:*
> > 
> > A local unprivileged attacker can inject controlled data into the page
> > cache of a different shmem/memfd/tmpfs inode than originally targeted.
> > 
> > This occurs because mfill_copy_folio_retry() drops all locks, allowing the
> > destination VMA to be replaced, but does not verify that the underlying
> > file (vm_file) remains the same when execution resumes.
> > 
> > As a result, a folio allocated for one inode can be inserted into another
> > inode’s page cache.
> > 
> > *Affected component:*
> > 
> > File: mm/userfaultfd.c
> > Function: mfill_copy_folio_retry()
> 
> Thanks.
> 
> > *Affected versions:*
> > 
> > Introduced by: f5f035a72423
> 
> Only in 7.1-rc1, fortunately.
> 
> May I ask how you found this?
> 
> Annoyingly, Sashiko found it:
> 
> : If the locks are dropped during the retry, can the anonymous VMA be
> : replaced by a different type of VMA (such as hugetlb or shared shmem)
> : at the same address?
> : 
> : mfill_get_vma() accepts these VMAs and returns 0.  When execution
> : resumes in mfill_atomic_pte_copy(), it proceeds to
> : mfill_atomic_install_pte() using the newly populated state->vma.
> : 
> : For a hugetlb VMA, mfill_establish_pmd() will allocate a standard 4KB
> : PMD into a hugepage page table hierarchy.  For a shared shmem VMA,
> : folio_add_new_anon_rmap() will be called on a shared file-backed VMA.
> : Could this sequence cause page table corruption or kernel crashes?
> 
> We must have simply failed to look :(

We did look; we just didn't reach a consensus.

David Carlier sent a basic fix for a changing VMA:

https://lore.kernel.org/all/20260328170101.184163-1-devnexen@gmail.com/

then Peter suggested taking a "VMA snapshot" to verify that the VMA didn't
change:

https://lore.kernel.org/all/acrRRRT1xrAqNraj@x1.local

and then David sent a patch that compared vma->flags and
vma->vm_file->f_inode:

https://lore.kernel.org/all/20260331134158.622084-1-devnexen@gmail.com

but since comparing all vma->flags is overkill and could break existing
userspace, we decided to postpone this until after rc1 and only check the
ops to ensure the kernel won't crash if the VMA changes.

As for me I'd check if we are retrying with the same VMA, but we don't have
a way to verify that :/
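
A "VMA snapshot" along the lines discussed above could, as a rough sketch,
record the backing inode plus only the flags the copy path depends on,
rather than all of vma->flags. Everything below is a user-space model
under stated assumptions: the MODEL_* flag bits, struct snap_vma, struct
vma_snapshot and both helpers are hypothetical and do not correspond to
any posted patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag bits for illustration; not the kernel's VM_* values. */
#define MODEL_VM_SHARED        0x1UL
#define MODEL_VM_UFFD_MISSING  0x2UL
#define MODEL_VM_OTHER         0x4UL /* a flag the copy path ignores */

/* Only these bits take part in the comparison. */
#define MODEL_SNAPSHOT_MASK (MODEL_VM_SHARED | MODEL_VM_UFFD_MISSING)

struct snap_vma {
    unsigned long flags;
    const void *inode; /* models vma->vm_file->f_inode; NULL if anonymous */
};

struct vma_snapshot {
    unsigned long flags;
    const void *inode;
};

/* Record the identity of the VMA before the locks are dropped. */
static struct vma_snapshot take_snapshot(const struct snap_vma *vma)
{
    struct vma_snapshot snap;

    assert(vma != NULL);
    snap.flags = vma->flags & MODEL_SNAPSHOT_MASK;
    snap.inode = vma->inode;
    return snap;
}

/* After re-acquiring the VMA: 0 if it still matches, -EAGAIN otherwise. */
static int revalidate(const struct vma_snapshot *snap,
                      const struct snap_vma *vma)
{
    assert(snap != NULL && vma != NULL);
    if ((vma->flags & MODEL_SNAPSHOT_MASK) != snap->flags)
        return -EAGAIN;
    if (vma->inode != snap->inode)
        return -EAGAIN;
    return 0;
}
```

Masking the flags is one way to avoid the "comparing all vma->flags is
overkill" concern: a change to an unrelated flag does not fail the retry,
while a change of backing inode does. Which flags belong in the mask is
left open here.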

-- 
Sincerely yours,
Mike.



* Re: [SECURITY] mm/userfaultfd: cross-inode page cache injection via mfill_copy_folio_retry()
  2026-05-02 19:26   ` Mike Rapoport
@ 2026-05-04 12:53     ` Anindya Roy
  0 siblings, 0 replies; 5+ messages in thread
From: Anindya Roy @ 2026-05-04 12:53 UTC (permalink / raw)
  To: Mike Rapoport; +Cc: Andrew Morton, david, ljs, security, linux-mm, lokeshgidra

Hi Mike,

To add context: I have a working proof of concept that demonstrates this as
a complete local privilege escalation, not just a theoretical primitive.

The current ops check does not prevent the injection when a shmem VMA is
replaced by another shmem VMA with a different backing inode. The PoC uses
/dev/userfaultfd to intercept the kernel retry fault deterministically,
swaps the destination VMA to a different file via mmap(MAP_FIXED) during
that window, and injects controlled contents into the replacement file's
page cache. A victim process reading commands from /dev/shm then executes
the injected payload, producing euid=0.

I can share the full PoC if that would help drive the fix forward.

Thanks.
