* [PATCH v2] mm/rmap: Add anon_vma lifetime debug check
@ 2025-07-25 12:16 Jann Horn
2025-07-25 14:11 ` Vlastimil Babka
2025-07-28 4:05 ` Harry Yoo
0 siblings, 2 replies; 9+ messages in thread
From: Jann Horn @ 2025-07-25 12:16 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Rik van Riel,
Liam R. Howlett, Vlastimil Babka, Harry Yoo
Cc: linux-mm, linux-kernel, Jann Horn
If an anon folio is mapped into userspace, its anon_vma must be alive,
otherwise rmap walks can hit UAF.
There have been syzkaller reports a few months ago[1][2] of UAF in rmap
walks that seems to indicate that there can be pages with elevated mapcount
whose anon_vma has already been freed, but I think we never figured out
what the cause is; and syzkaller only hit these UAFs when memory pressure
randomly caused reclaim to rmap-walk the affected pages, so it of course
didn't manage to create a reproducer.
Add a VM_WARN_ON_FOLIO() when we add/remove mappings of anonymous folios to
hopefully catch such issues more reliably.
[1] https://lore.kernel.org/r/67abaeaf.050a0220.110943.0041.GAE@google.com
[2] https://lore.kernel.org/r/67a76f33.050a0220.3d72c.0028.GAE@google.com
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Jann Horn <jannh@google.com>
---
Changes in v2:
- applied akpm's fixup (use FOLIO_MAPPING_ANON, ...)
- remove CONFIG_DEBUG_VM check and use folio_test_* helpers (David)
- more verbose comment (Lorenzo)
- replaced "page" mentions with "folio" in commit message
- Link to v1: https://lore.kernel.org/r/20250724-anonvma-uaf-debug-v1-1-29989ddc4e2a@google.com
---
include/linux/rmap.h | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 20803fcb49a7..6cd020eea37a 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -449,6 +449,28 @@ static inline void __folio_rmap_sanity_checks(const struct folio *folio,
default:
VM_WARN_ON_ONCE(true);
}
+
+ /*
+ * Anon folios must have an associated live anon_vma as long as they're
+ * mapped into userspace.
+ * Note that the atomic_read() mainly does two things:
+ *
+ * 1. In KASAN builds with CONFIG_SLUB_RCU_DEBUG, it causes KASAN to
+ * check that the associated anon_vma has not yet been freed (subject
+ * to KASAN's usual limitations). This check will pass if the
+ * anon_vma's refcount has already dropped to 0 but an RCU grace
+ * period hasn't passed since then.
+ * 2. If the anon_vma has not yet been freed, it checks that the
+ * anon_vma still has a nonzero refcount (as opposed to being in the
+ * middle of an RCU delay for getting freed).
+ */
+ if (folio_test_anon(folio) && !folio_test_ksm(folio)) {
+ unsigned long mapping = (unsigned long)folio->mapping;
+ struct anon_vma *anon_vma;
+
+ anon_vma = (void *)(mapping - FOLIO_MAPPING_ANON);
+ VM_WARN_ON_FOLIO(atomic_read(&anon_vma->refcount) == 0, folio);
+ }
}
/*
---
base-commit: 1d1c610e32ab2489c49fccb7472a6bef136a0a8b
change-id: 20250724-anonvma-uaf-debug-a9db0eb4177b
--
Jann Horn <jannh@google.com>
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v2] mm/rmap: Add anon_vma lifetime debug check
2025-07-25 12:16 [PATCH v2] mm/rmap: Add anon_vma lifetime debug check Jann Horn
@ 2025-07-25 14:11 ` Vlastimil Babka
2025-07-25 14:44 ` Jann Horn
2025-07-28 4:05 ` Harry Yoo
1 sibling, 1 reply; 9+ messages in thread
From: Vlastimil Babka @ 2025-07-25 14:11 UTC (permalink / raw)
To: Jann Horn, Andrew Morton, David Hildenbrand, Lorenzo Stoakes,
Rik van Riel, Liam R. Howlett, Harry Yoo
Cc: linux-mm, linux-kernel
On 7/25/25 14:16, Jann Horn wrote:
> If an anon folio is mapped into userspace, its anon_vma must be alive,
> otherwise rmap walks can hit UAF.
>
> There have been syzkaller reports a few months ago[1][2] of UAF in rmap
> walks that seems to indicate that there can be pages with elevated mapcount
> whose anon_vma has already been freed, but I think we never figured out
> what the cause is; and syzkaller only hit these UAFs when memory pressure
> randomly caused reclaim to rmap-walk the affected pages, so it of course
> didn't manage to create a reproducer.
>
> Add a VM_WARN_ON_FOLIO() when we add/remove mappings of anonymous folios to
> hopefully catch such issues more reliably.
>
> [1] https://lore.kernel.org/r/67abaeaf.050a0220.110943.0041.GAE@google.com
> [2] https://lore.kernel.org/r/67a76f33.050a0220.3d72c.0028.GAE@google.com
>
> Acked-by: David Hildenbrand <david@redhat.com>
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Signed-off-by: Jann Horn <jannh@google.com>
> ---
> Changes in v2:
> - applied akpm's fixup (use FOLIO_MAPPING_ANON, ...)
> - remove CONFIG_DEBUG_VM check and use folio_test_* helpers (David)
> - more verbose comment (Lorenzo)
> - replaced "page" mentions with "folio" in commit message
> - Link to v1: https://lore.kernel.org/r/20250724-anonvma-uaf-debug-v1-1-29989ddc4e2a@google.com
> ---
> include/linux/rmap.h | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/include/linux/rmap.h b/include/linux/rmap.h
> index 20803fcb49a7..6cd020eea37a 100644
> --- a/include/linux/rmap.h
> +++ b/include/linux/rmap.h
> @@ -449,6 +449,28 @@ static inline void __folio_rmap_sanity_checks(const struct folio *folio,
> default:
> VM_WARN_ON_ONCE(true);
> }
> +
> + /*
> + * Anon folios must have an associated live anon_vma as long as they're
> + * mapped into userspace.
> + * Note that the atomic_read() mainly does two things:
> + *
> + * 1. In KASAN builds with CONFIG_SLUB_RCU_DEBUG, it causes KASAN to
> + * check that the associated anon_vma has not yet been freed (subject
I think more precisely it checks that the slab folio hosting the anon_vma
could not have been yet freed, IIUC? If the anon_vma itself has been freed
then this will not trigger.
> + * to KASAN's usual limitations). This check will pass if the
> + * anon_vma's refcount has already dropped to 0 but an RCU grace
> + * period hasn't passed since then.
AFAIU this says it more accurately and matches my interpretation above?
> + * 2. If the anon_vma has not yet been freed, it checks that the
> + * anon_vma still has a nonzero refcount (as opposed to being in the
> + * middle of an RCU delay for getting freed).
Again the RCU delay would apply to the slab page, unless you talk about the
CONFIG_SLUB_RCU_DEBUG specific path (IIRC).
That said, I wonder if here in __folio_rmap_sanity_checks() we are even in a
situation where we rely on SLAB_TYPESAFE_BY_RCU in order to not touch
something that's not anon_vma anymore? I think we expect it to exist? Can we
thus invent a CONFIG_SLUB_RCU_DEBUG-specific assert that assert the anon_vma
itself has not been freed yet (i.e. even if within a grace period?).
I was wondering what actually does rely on SLAB_TYPESAFE_BY_RCU, thanks
Lorenzo for pointing me to folio_get_anon_vma(). But that's only used
elsewhere than here, no?
> + */
> + if (folio_test_anon(folio) && !folio_test_ksm(folio)) {
So you verified this compiles out completely without DEBUG_VM?
> + unsigned long mapping = (unsigned long)folio->mapping;
> + struct anon_vma *anon_vma;
> +
> + anon_vma = (void *)(mapping - FOLIO_MAPPING_ANON);
> + VM_WARN_ON_FOLIO(atomic_read(&anon_vma->refcount) == 0, folio);
> + }
> }
>
> /*
>
> ---
> base-commit: 1d1c610e32ab2489c49fccb7472a6bef136a0a8b
> change-id: 20250724-anonvma-uaf-debug-a9db0eb4177b
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] mm/rmap: Add anon_vma lifetime debug check
2025-07-25 14:11 ` Vlastimil Babka
@ 2025-07-25 14:44 ` Jann Horn
2025-07-25 15:38 ` Vlastimil Babka
2025-07-25 15:40 ` David Hildenbrand
0 siblings, 2 replies; 9+ messages in thread
From: Jann Horn @ 2025-07-25 14:44 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Rik van Riel,
Liam R. Howlett, Harry Yoo, linux-mm, linux-kernel
On Fri, Jul 25, 2025 at 4:11 PM Vlastimil Babka <vbabka@suse.cz> wrote:
> On 7/25/25 14:16, Jann Horn wrote:
> > If an anon folio is mapped into userspace, its anon_vma must be alive,
> > otherwise rmap walks can hit UAF.
> >
> > There have been syzkaller reports a few months ago[1][2] of UAF in rmap
> > walks that seems to indicate that there can be pages with elevated mapcount
> > whose anon_vma has already been freed, but I think we never figured out
> > what the cause is; and syzkaller only hit these UAFs when memory pressure
> > randomly caused reclaim to rmap-walk the affected pages, so it of course
> > didn't manage to create a reproducer.
> >
> > Add a VM_WARN_ON_FOLIO() when we add/remove mappings of anonymous folios to
> > hopefully catch such issues more reliably.
> >
> > [1] https://lore.kernel.org/r/67abaeaf.050a0220.110943.0041.GAE@google.com
> > [2] https://lore.kernel.org/r/67a76f33.050a0220.3d72c.0028.GAE@google.com
> >
> > Acked-by: David Hildenbrand <david@redhat.com>
> > Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > Signed-off-by: Jann Horn <jannh@google.com>
> > ---
> > Changes in v2:
> > - applied akpm's fixup (use FOLIO_MAPPING_ANON, ...)
> > - remove CONFIG_DEBUG_VM check and use folio_test_* helpers (David)
> > - more verbose comment (Lorenzo)
> > - replaced "page" mentions with "folio" in commit message
> > - Link to v1: https://lore.kernel.org/r/20250724-anonvma-uaf-debug-v1-1-29989ddc4e2a@google.com
> > ---
> > include/linux/rmap.h | 22 ++++++++++++++++++++++
> > 1 file changed, 22 insertions(+)
> >
> > diff --git a/include/linux/rmap.h b/include/linux/rmap.h
> > index 20803fcb49a7..6cd020eea37a 100644
> > --- a/include/linux/rmap.h
> > +++ b/include/linux/rmap.h
> > @@ -449,6 +449,28 @@ static inline void __folio_rmap_sanity_checks(const struct folio *folio,
> > default:
> > VM_WARN_ON_ONCE(true);
> > }
> > +
> > + /*
> > + * Anon folios must have an associated live anon_vma as long as they're
> > + * mapped into userspace.
> > + * Note that the atomic_read() mainly does two things:
> > + *
> > + * 1. In KASAN builds with CONFIG_SLUB_RCU_DEBUG, it causes KASAN to
> > + * check that the associated anon_vma has not yet been freed (subject
>
> I think more precisely it checks that the slab folio hosting the anon_vma
> could not have been yet freed, IIUC? If the anon_vma itself has been freed
> then this will not trigger.
The point of CONFIG_SLUB_RCU_DEBUG, which I'm talking about here, is
that it allows KASAN to catch UAF once the anon_vma has been freed and
an RCU grace period has passed; it is not necessary that the slab
folio has been freed.
You can see that working in the linked syzkaller reports - KASAN
tracked the object as freed after slab_free_after_rcu_debug(), which
is an RCU callback scheduled from kmem_cache_free().
> > + * to KASAN's usual limitations). This check will pass if the
> > + * anon_vma's refcount has already dropped to 0 but an RCU grace
> > + * period hasn't passed since then.
>
> AFAIU this says it more accurately and matches my interpretation above?
>
> > + * 2. If the anon_vma has not yet been freed, it checks that the
> > + * anon_vma still has a nonzero refcount (as opposed to being in the
> > + * middle of an RCU delay for getting freed).
>
> Again the RCU delay would apply to the slab page, unless you talk about the
> CONFIG_SLUB_RCU_DEBUG specific path (IIRC).
Yes, right, the "RCU delay" in the second bullet point refers to
CONFIG_SLUB_RCU_DEBUG.
Here I'm saying "If the anon_vma has not yet been freed" because
that's the only case in which I can reliably say what will happen, and
this is the main case that isn't already covered by the first bullet
point in a CONFIG_SLUB_RCU_DEBUG build.
> That said, I wonder if here in __folio_rmap_sanity_checks() we are even in a
> situation where we rely on SLAB_TYPESAFE_BY_RCU in order to not touch
> something that's not anon_vma anymore? I think we expect it to exist?
Yes, we expect it to exist. That's why I'm not just asserting that the
anon_vma is still considered live by KASAN, but also that its refcount
is non-zero.
> Can we
> thus invent a CONFIG_SLUB_RCU_DEBUG-specific assert that assert the anon_vma
> itself has not been freed yet (i.e. even if within a grace period?).
That is essentially what I'm doing - checking that the count is
nonzero verifies that it's not within a grace period, and the implicit
KASAN check verifies it can't be in a KASAN quarantine after the grace
period is over.
> I was wondering what actually does rely on SLAB_TYPESAFE_BY_RCU, thanks
> Lorenzo for pointing me to folio_get_anon_vma(). But that's only used
> elsewhere than here, no?
Yes; the point of this assertion is that folio_get_anon_vma() and
folio_lock_anon_vma_read() (which show up in the stack traces of the
two linked syzkaller reports) rely on the ->mapping not being
dangling; and they can happen in the background anytime as long as the
folio mapcount is elevated, but they only actually happen sporadically
(in particular under memory pressure, which syzkaller apparently
mostly triggers randomly and non-reproducibly). If we have bugs where
the ->mapping of a folio goes away while it still has an elevated
mapcount, then instead of randomly getting KASAN splats under memory
pressure, it would be nice to detect this reliably. So I figured a
nice place to check this is when we're decrementing the mapcount -
that should catch almost all cases, except ones where such a bug
happens because we somehow leaked both a mapcount and a refcount of a
folio.
> > + */
> > + if (folio_test_anon(folio) && !folio_test_ksm(folio)) {
>
> So you verified this compiles out completely without DEBUG_VM?
No, after David's suggestion to remove the explicit CONFIG_DEBUG_VM
check, I looked at the definitions of these things to check that it's
all just plain reads and arithmetic, and removed the CONFIG_DEBUG_VM
check, but haven't actually compile-tested it.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] mm/rmap: Add anon_vma lifetime debug check
2025-07-25 14:44 ` Jann Horn
@ 2025-07-25 15:38 ` Vlastimil Babka
2025-07-25 15:40 ` David Hildenbrand
1 sibling, 0 replies; 9+ messages in thread
From: Vlastimil Babka @ 2025-07-25 15:38 UTC (permalink / raw)
To: Jann Horn
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Rik van Riel,
Liam R. Howlett, Harry Yoo, linux-mm, linux-kernel
On 7/25/25 16:44, Jann Horn wrote:
> On Fri, Jul 25, 2025 at 4:11 PM Vlastimil Babka <vbabka@suse.cz> wrote:
>> On 7/25/25 14:16, Jann Horn wrote:
>> > If an anon folio is mapped into userspace, its anon_vma must be alive,
>> > otherwise rmap walks can hit UAF.
>> >
>> > There have been syzkaller reports a few months ago[1][2] of UAF in rmap
>> > walks that seems to indicate that there can be pages with elevated mapcount
>> > whose anon_vma has already been freed, but I think we never figured out
>> > what the cause is; and syzkaller only hit these UAFs when memory pressure
>> > randomly caused reclaim to rmap-walk the affected pages, so it of course
>> > didn't manage to create a reproducer.
>> >
>> > Add a VM_WARN_ON_FOLIO() when we add/remove mappings of anonymous folios to
>> > hopefully catch such issues more reliably.
>> >
>> > [1] https://lore.kernel.org/r/67abaeaf.050a0220.110943.0041.GAE@google.com
>> > [2] https://lore.kernel.org/r/67a76f33.050a0220.3d72c.0028.GAE@google.com
>> >
>> > Acked-by: David Hildenbrand <david@redhat.com>
>> > Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>> > Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
>> > ---
>> > Changes in v2:
>> > - applied akpm's fixup (use FOLIO_MAPPING_ANON, ...)
>> > - remove CONFIG_DEBUG_VM check and use folio_test_* helpers (David)
>> > - more verbose comment (Lorenzo)
>> > - replaced "page" mentions with "folio" in commit message
>> > - Link to v1: https://lore.kernel.org/r/20250724-anonvma-uaf-debug-v1-1-29989ddc4e2a@google.com
>> > ---
>> > include/linux/rmap.h | 22 ++++++++++++++++++++++
>> > 1 file changed, 22 insertions(+)
>> >
>> > diff --git a/include/linux/rmap.h b/include/linux/rmap.h
>> > index 20803fcb49a7..6cd020eea37a 100644
>> > --- a/include/linux/rmap.h
>> > +++ b/include/linux/rmap.h
>> > @@ -449,6 +449,28 @@ static inline void __folio_rmap_sanity_checks(const struct folio *folio,
>> > default:
>> > VM_WARN_ON_ONCE(true);
>> > }
>> > +
>> > + /*
>> > + * Anon folios must have an associated live anon_vma as long as they're
>> > + * mapped into userspace.
>> > + * Note that the atomic_read() mainly does two things:
>> > + *
>> > + * 1. In KASAN builds with CONFIG_SLUB_RCU_DEBUG, it causes KASAN to
>> > + * check that the associated anon_vma has not yet been freed (subject
>>
>> I think more precisely it checks that the slab folio hosting the anon_vma
>> could not have been yet freed, IIUC? If the anon_vma itself has been freed
>> then this will not trigger.
>
> The point of CONFIG_SLUB_RCU_DEBUG, which I'm talking about here, is
> that it allows KASAN to catch UAF once the anon_vma has been freed and
> an RCU grace period has passed; it is not necessary that the slab
> folio has been freed.
>
> You can see that working in the linked syzkaller reports - KASAN
> tracked the object as freed after slab_free_after_rcu_debug(), which
> is an RCU callback scheduled from kmem_cache_free().
>
>> > + * to KASAN's usual limitations). This check will pass if the
>> > + * anon_vma's refcount has already dropped to 0 but an RCU grace
>> > + * period hasn't passed since then.
>>
>> AFAIU this says it more accurately and matches my interpretation above?
>>
>> > + * 2. If the anon_vma has not yet been freed, it checks that the
>> > + * anon_vma still has a nonzero refcount (as opposed to being in the
>> > + * middle of an RCU delay for getting freed).
>>
>> Again the RCU delay would apply to the slab page, unless you talk about the
>> CONFIG_SLUB_RCU_DEBUG specific path (IIRC).
>
> Yes, right, the "RCU delay" in the second bullet point refers to
> CONFIG_SLUB_RCU_DEBUG.
OK I misunderstood that while bullet 1 notes the check only happens with
CONFIG_SLUB_RCU_DEBUG, I assumed the description is still meant semantically
from the point of anon_vma users (particularly what "freed" means - moment
of kfree() vs KASAN quarantine). Once considered from the point of what
happens with the object under CONFIG_SLUB_RCU_DEBUG, it all makes sense.
> Here I'm saying "If the anon_vma has not yet been freed" because
> that's the only case in which I can reliably say what will happen, and
> this is the main case that isn't already covered by the first bullet
> point in a CONFIG_SLUB_RCU_DEBUG build.
>
>> That said, I wonder if here in __folio_rmap_sanity_checks() we are even in a
>> situation where we rely on SLAB_TYPESAFE_BY_RCU in order to not touch
>> something that's not anon_vma anymore? I think we expect it to exist?
>
> Yes, we expect it to exist. That's why I'm not just asserting that the
> anon_vma is still considered live by KASAN, but also that its refcount
> is non-zero.
>
>> Can we
>> thus invent a CONFIG_SLUB_RCU_DEBUG-specific assert that assert the anon_vma
>> itself has not been freed yet (i.e. even if within a grace period?).
>
> That is essentially what I'm doing - checking that the count is
> nonzero verifies that it's not within a grace period, and the implicit
> KASAN check verifies it can't be in a KASAN quarantine after the grace
> period is over.
OK I guess that's sufficient and we'd be unlikely to find a bug scenario
where anon_vma was kfree'd() and thus KASAN with CONFIG_SLUB_RCU_DEBUG is
waiting for the grace period, yet it doesn't have a zero refcount.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] mm/rmap: Add anon_vma lifetime debug check
2025-07-25 14:44 ` Jann Horn
2025-07-25 15:38 ` Vlastimil Babka
@ 2025-07-25 15:40 ` David Hildenbrand
1 sibling, 0 replies; 9+ messages in thread
From: David Hildenbrand @ 2025-07-25 15:40 UTC (permalink / raw)
To: Jann Horn, Vlastimil Babka
Cc: Andrew Morton, Lorenzo Stoakes, Rik van Riel, Liam R. Howlett,
Harry Yoo, linux-mm, linux-kernel
>>> + */
>>> + if (folio_test_anon(folio) && !folio_test_ksm(folio)) {
>>
>> So you verified this compiles out completely without DEBUG_VM?
>
> No, after David's suggestion to remove the explicit CONFIG_DEBUG_VM
> check, I looked at the definitions of these things to check that it's
> all just plain reads and arithmetic, and removed the CONFIG_DEBUG_VM
> check, but haven't actually compile-tested it.
The compiler seems to know its judo well.
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] mm/rmap: Add anon_vma lifetime debug check
2025-07-25 12:16 [PATCH v2] mm/rmap: Add anon_vma lifetime debug check Jann Horn
2025-07-25 14:11 ` Vlastimil Babka
@ 2025-07-28 4:05 ` Harry Yoo
2025-07-28 4:33 ` Lorenzo Stoakes
2025-07-28 14:12 ` Jann Horn
1 sibling, 2 replies; 9+ messages in thread
From: Harry Yoo @ 2025-07-28 4:05 UTC (permalink / raw)
To: Jann Horn
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Rik van Riel,
Liam R. Howlett, Vlastimil Babka, linux-mm, linux-kernel
On Fri, Jul 25, 2025 at 02:16:24PM +0200, Jann Horn wrote:
> If an anon folio is mapped into userspace, its anon_vma must be alive,
> otherwise rmap walks can hit UAF.
>
> There have been syzkaller reports a few months ago[1][2] of UAF in rmap
> walks that seems to indicate that there can be pages with elevated mapcount
> whose anon_vma has already been freed, but I think we never figured out
> what the cause is; and syzkaller only hit these UAFs when memory pressure
> randomly caused reclaim to rmap-walk the affected pages, so it of course
> didn't manage to create a reproducer.
>
> Add a VM_WARN_ON_FOLIO() when we add/remove mappings of anonymous folios to
> hopefully catch such issues more reliably.
>
> [1] https://lore.kernel.org/r/67abaeaf.050a0220.110943.0041.GAE@google.com
> [2] https://lore.kernel.org/r/67a76f33.050a0220.3d72c.0028.GAE@google.com
>
> Acked-by: David Hildenbrand <david@redhat.com>
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Signed-off-by: Jann Horn <jannh@google.com>
> ---
> Changes in v2:
> - applied akpm's fixup (use FOLIO_MAPPING_ANON, ...)
> - remove CONFIG_DEBUG_VM check and use folio_test_* helpers (David)
> - more verbose comment (Lorenzo)
> - replaced "page" mentions with "folio" in commit message
> - Link to v1: https://lore.kernel.org/r/20250724-anonvma-uaf-debug-v1-1-29989ddc4e2a@google.com
> ---
Oops, I'm late to the party.
A question; does it make sense to disable reuse of anon_vmas during
anon_vma_clone() to increase chances of detecting this? (of course,
for debugging-purpose only)
Regardless of that:
Acked-by: Harry Yoo <harry.yoo@oracle.com>
--
Cheers,
Harry / Hyeonggon
> include/linux/rmap.h | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/include/linux/rmap.h b/include/linux/rmap.h
> index 20803fcb49a7..6cd020eea37a 100644
> --- a/include/linux/rmap.h
> +++ b/include/linux/rmap.h
> @@ -449,6 +449,28 @@ static inline void __folio_rmap_sanity_checks(const struct folio *folio,
> default:
> VM_WARN_ON_ONCE(true);
> }
> +
> + /*
> + * Anon folios must have an associated live anon_vma as long as they're
> + * mapped into userspace.
> + * Note that the atomic_read() mainly does two things:
> + *
> + * 1. In KASAN builds with CONFIG_SLUB_RCU_DEBUG, it causes KASAN to
> + * check that the associated anon_vma has not yet been freed (subject
> + * to KASAN's usual limitations). This check will pass if the
> + * anon_vma's refcount has already dropped to 0 but an RCU grace
> + * period hasn't passed since then.
> + * 2. If the anon_vma has not yet been freed, it checks that the
> + * anon_vma still has a nonzero refcount (as opposed to being in the
> + * middle of an RCU delay for getting freed).
> + */
> + if (folio_test_anon(folio) && !folio_test_ksm(folio)) {
> + unsigned long mapping = (unsigned long)folio->mapping;
> + struct anon_vma *anon_vma;
> +
> + anon_vma = (void *)(mapping - FOLIO_MAPPING_ANON);
> + VM_WARN_ON_FOLIO(atomic_read(&anon_vma->refcount) == 0, folio);
> + }
> }
>
> /*
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] mm/rmap: Add anon_vma lifetime debug check
2025-07-28 4:05 ` Harry Yoo
@ 2025-07-28 4:33 ` Lorenzo Stoakes
2025-07-29 2:41 ` Harry Yoo
2025-07-28 14:12 ` Jann Horn
1 sibling, 1 reply; 9+ messages in thread
From: Lorenzo Stoakes @ 2025-07-28 4:33 UTC (permalink / raw)
To: Harry Yoo
Cc: Jann Horn, Andrew Morton, David Hildenbrand, Rik van Riel,
Liam R. Howlett, Vlastimil Babka, linux-mm, linux-kernel
On Mon, Jul 28, 2025 at 01:05:54PM +0900, Harry Yoo wrote:
> On Fri, Jul 25, 2025 at 02:16:24PM +0200, Jann Horn wrote:
> > If an anon folio is mapped into userspace, its anon_vma must be alive,
> > otherwise rmap walks can hit UAF.
> >
> > There have been syzkaller reports a few months ago[1][2] of UAF in rmap
> > walks that seems to indicate that there can be pages with elevated mapcount
> > whose anon_vma has already been freed, but I think we never figured out
> > what the cause is; and syzkaller only hit these UAFs when memory pressure
> > randomly caused reclaim to rmap-walk the affected pages, so it of course
> > didn't manage to create a reproducer.
> >
> > Add a VM_WARN_ON_FOLIO() when we add/remove mappings of anonymous folios to
> > hopefully catch such issues more reliably.
> >
> > [1] https://lore.kernel.org/r/67abaeaf.050a0220.110943.0041.GAE@google.com
> > [2] https://lore.kernel.org/r/67a76f33.050a0220.3d72c.0028.GAE@google.com
> >
> > Acked-by: David Hildenbrand <david@redhat.com>
> > Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > Signed-off-by: Jann Horn <jannh@google.com>
> > ---
> > Changes in v2:
> > - applied akpm's fixup (use FOLIO_MAPPING_ANON, ...)
> > - remove CONFIG_DEBUG_VM check and use folio_test_* helpers (David)
> > - more verbose comment (Lorenzo)
> > - replaced "page" mentions with "folio" in commit message
> > - Link to v1: https://lore.kernel.org/r/20250724-anonvma-uaf-debug-v1-1-29989ddc4e2a@google.com
> > ---
>
> Oops, I'm late to the party.
Isn't this the fashionable time to turn up? :P
>
> A question; does it make sense to disable reuse of anon_vmas during
> anon_vma_clone() to increase chances of detecting this? (of course,
> for debugging-purpose only)
This won't impact this issue that much AFAICT, as we only reuse an anon_vma
if it has a refcount == 1 and for that to be the case it has to be have
children >= 1.
We'd then have to rely on this bug triggering by this firing when the child
no longer references it but then it is dereffed, but we're seeing the bug
now so presumably it's not required.
On the other hand, it would obviously cause more anon_vma's to get to
refcount 0, so maybe it'd increase the prevelance of it.
However, we might actually be seeing the bug _because_ of anon_vma reuse :)
at which point obviously it would not help increase prevelance... so we
should keep behaviour as close to 'reality' as possible IMO.
Finally, I'm not in favour of introducing some special debug mode for this
or changing this code to be arbitrarily disabled in existing debug modes -
let's keep this change simple.
Cheers, Lorenzo
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] mm/rmap: Add anon_vma lifetime debug check
2025-07-28 4:05 ` Harry Yoo
2025-07-28 4:33 ` Lorenzo Stoakes
@ 2025-07-28 14:12 ` Jann Horn
1 sibling, 0 replies; 9+ messages in thread
From: Jann Horn @ 2025-07-28 14:12 UTC (permalink / raw)
To: Harry Yoo
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Rik van Riel,
Liam R. Howlett, Vlastimil Babka, linux-mm, linux-kernel
On Mon, Jul 28, 2025 at 6:06 AM Harry Yoo <harry.yoo@oracle.com> wrote:
> On Fri, Jul 25, 2025 at 02:16:24PM +0200, Jann Horn wrote:
> > If an anon folio is mapped into userspace, its anon_vma must be alive,
> > otherwise rmap walks can hit UAF.
> >
> > There have been syzkaller reports a few months ago[1][2] of UAF in rmap
> > walks that seems to indicate that there can be pages with elevated mapcount
> > whose anon_vma has already been freed, but I think we never figured out
> > what the cause is; and syzkaller only hit these UAFs when memory pressure
> > randomly caused reclaim to rmap-walk the affected pages, so it of course
> > didn't manage to create a reproducer.
> >
> > Add a VM_WARN_ON_FOLIO() when we add/remove mappings of anonymous folios to
> > hopefully catch such issues more reliably.
[...]
> Oops, I'm late to the party.
>
> A question; does it make sense to disable reuse of anon_vmas during
> anon_vma_clone() to increase chances of detecting this? (of course,
> for debugging-purpose only)
As Lorenzo said, I think making such a change would risk making it
impossible to hit some bugs in debug builds even though they can
happen in normal builds, which would be bad.
> Regardless of that:
> Acked-by: Harry Yoo <harry.yoo@oracle.com>
Thanks!
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2] mm/rmap: Add anon_vma lifetime debug check
2025-07-28 4:33 ` Lorenzo Stoakes
@ 2025-07-29 2:41 ` Harry Yoo
0 siblings, 0 replies; 9+ messages in thread
From: Harry Yoo @ 2025-07-29 2:41 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Jann Horn, Andrew Morton, David Hildenbrand, Rik van Riel,
Liam R. Howlett, Vlastimil Babka, linux-mm, linux-kernel
On Mon, Jul 28, 2025 at 05:33:34AM +0100, Lorenzo Stoakes wrote:
> On Mon, Jul 28, 2025 at 01:05:54PM +0900, Harry Yoo wrote:
> > On Fri, Jul 25, 2025 at 02:16:24PM +0200, Jann Horn wrote:
> > > If an anon folio is mapped into userspace, its anon_vma must be alive,
> > > otherwise rmap walks can hit UAF.
> > >
> > > There have been syzkaller reports a few months ago[1][2] of UAF in rmap
> > > walks that seems to indicate that there can be pages with elevated mapcount
> > > whose anon_vma has already been freed, but I think we never figured out
> > > what the cause is; and syzkaller only hit these UAFs when memory pressure
> > > randomly caused reclaim to rmap-walk the affected pages, so it of course
> > > didn't manage to create a reproducer.
> > >
> > > Add a VM_WARN_ON_FOLIO() when we add/remove mappings of anonymous folios to
> > > hopefully catch such issues more reliably.
> > >
> > > [1] https://lore.kernel.org/r/67abaeaf.050a0220.110943.0041.GAE@google.com
> > > [2] https://lore.kernel.org/r/67a76f33.050a0220.3d72c.0028.GAE@google.com
> > >
> > > Acked-by: David Hildenbrand <david@redhat.com>
> > > Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > > Signed-off-by: Jann Horn <jannh@google.com>
> > > ---
> > > Changes in v2:
> > > - applied akpm's fixup (use FOLIO_MAPPING_ANON, ...)
> > > - remove CONFIG_DEBUG_VM check and use folio_test_* helpers (David)
> > > - more verbose comment (Lorenzo)
> > > - replaced "page" mentions with "folio" in commit message
> > > - Link to v1: https://lore.kernel.org/r/20250724-anonvma-uaf-debug-v1-1-29989ddc4e2a@google.com
> > > ---
> >
> > A question; does it make sense to disable reuse of anon_vmas during
> > anon_vma_clone() to increase chances of detecting this? (of course,
> > for debugging-purpose only)
>
> On the other hand, it would obviously cause more anon_vma's to get to
> refcount 0, so maybe it'd increase the prevelance of it.
>
> However, we might actually be seeing the bug _because_ of anon_vma reuse :)
> at which point obviously it would not help increase prevelance... so we
> should keep behaviour as close to 'reality' as possible IMO.
That's fair enough. Agree with you that adding a new config option that
introduces behavior diverging from reality for debugging doesn't really
add much value - it may even prevent some bugs from being reported.
> Finally, I'm not in favour of introducing some special debug mode for this
> or changing this code to be arbitrarily disabled in existing debug modes -
> let's keep this change simple.
Sure. Thanks for the answer!
> Cheers, Lorenzo
--
Cheers,
Harry / Hyeonggon
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2025-07-29 2:42 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-25 12:16 [PATCH v2] mm/rmap: Add anon_vma lifetime debug check Jann Horn
2025-07-25 14:11 ` Vlastimil Babka
2025-07-25 14:44 ` Jann Horn
2025-07-25 15:38 ` Vlastimil Babka
2025-07-25 15:40 ` David Hildenbrand
2025-07-28 4:05 ` Harry Yoo
2025-07-28 4:33 ` Lorenzo Stoakes
2025-07-29 2:41 ` Harry Yoo
2025-07-28 14:12 ` Jann Horn
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).