* [PATCH 2/2] mm/swap: add BUG_ON(folio==NULL) to folios_put_refs()
2025-08-26 23:16 [PATCH 1/2] huge_mm.h: is_huge_zero_folio(NULL) should return false Max Kellermann
@ 2025-08-26 23:16 ` Max Kellermann
2025-08-27 1:42 ` Zi Yan
2025-08-27 9:38 ` David Hildenbrand
2025-08-27 1:19 ` [PATCH 1/2] huge_mm.h: is_huge_zero_folio(NULL) should return false Zi Yan
2025-08-27 1:55 ` Andrew Morton
2 siblings, 2 replies; 27+ messages in thread
From: Max Kellermann @ 2025-08-26 23:16 UTC (permalink / raw)
To: akpm, david, lorenzo.stoakes, ziy, baolin.wang, Liam.Howlett,
npache, ryan.roberts, dev.jain, baohua, shikemeng, kasong,
nphamcs, bhe, chrisl, linux-mm, linux-kernel
Cc: Max Kellermann
It is not legal to have NULL pointers in a folio_batch.
However, the Ceph code does exactly this, and a refactoring patch gone
wrong has exposed this to folios_put_refs(), see
https://lore.kernel.org/ceph-devel/aK4v548CId5GIKG1@swift.blarg.de/
I believe this should Oops instead of crashing due to NULL pointer
reference (guarded by is_huge_zero_folio(), which may silently hide
the bug).
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
---
mm/swap.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/mm/swap.c b/mm/swap.c
index 3632dd061beb..07ccda00e7ee 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -952,6 +952,7 @@ void folios_put_refs(struct folio_batch *folios, unsigned int *refs)
for (i = 0, j = 0; i < folios->nr; i++) {
struct folio *folio = folios->folios[i];
unsigned int nr_refs = refs ? refs[i] : 1;
+ BUG_ON(folio == NULL);
if (is_huge_zero_folio(folio))
continue;
--
2.47.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH 2/2] mm/swap: add BUG_ON(folio==NULL) to folios_put_refs()
2025-08-26 23:16 ` [PATCH 2/2] mm/swap: add BUG_ON(folio==NULL) to folios_put_refs() Max Kellermann
@ 2025-08-27 1:42 ` Zi Yan
2025-08-27 2:12 ` Chris Li
2025-08-27 9:38 ` David Hildenbrand
1 sibling, 1 reply; 27+ messages in thread
From: Zi Yan @ 2025-08-27 1:42 UTC (permalink / raw)
To: Max Kellermann
Cc: akpm, david, lorenzo.stoakes, baolin.wang, Liam.Howlett, npache,
ryan.roberts, dev.jain, baohua, shikemeng, kasong, nphamcs, bhe,
chrisl, linux-mm, linux-kernel
On 26 Aug 2025, at 19:16, Max Kellermann wrote:
> It is not legal to have NULL pointers in a folio_batch.
>
> However, the Ceph code does exactly this, and a refactoring patch gone
> wrong has exposed this to folios_put_refs(), see
> https://lore.kernel.org/ceph-devel/aK4v548CId5GIKG1@swift.blarg.de/
>
> I believe this should Oops instead of crashing due to NULL pointer
> reference (guarded by is_huge_zero_folio(), which may silently hide
> the bug).
>
> Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
> ---
> mm/swap.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/mm/swap.c b/mm/swap.c
> index 3632dd061beb..07ccda00e7ee 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -952,6 +952,7 @@ void folios_put_refs(struct folio_batch *folios, unsigned int *refs)
> for (i = 0, j = 0; i < folios->nr; i++) {
> struct folio *folio = folios->folios[i];
> unsigned int nr_refs = refs ? refs[i] : 1;
> + BUG_ON(folio == NULL);
>
> if (is_huge_zero_folio(folio))
> continue;
We are moving away from BUG_ON. It is better to use WARN_ON_ONCE and skip
NULL values:
if (WARN_ON_ONCE(!folio))
continue;
--
Best Regards,
Yan, Zi
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 2/2] mm/swap: add BUG_ON(folio==NULL) to folios_put_refs()
2025-08-27 1:42 ` Zi Yan
@ 2025-08-27 2:12 ` Chris Li
0 siblings, 0 replies; 27+ messages in thread
From: Chris Li @ 2025-08-27 2:12 UTC (permalink / raw)
To: Zi Yan
Cc: Max Kellermann, akpm, david, lorenzo.stoakes, baolin.wang,
Liam.Howlett, npache, ryan.roberts, dev.jain, baohua, shikemeng,
kasong, nphamcs, bhe, linux-mm, linux-kernel
On Tue, Aug 26, 2025 at 6:43 PM Zi Yan <ziy@nvidia.com> wrote:
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -952,6 +952,7 @@ void folios_put_refs(struct folio_batch *folios, unsigned int *refs)
> > for (i = 0, j = 0; i < folios->nr; i++) {
> > struct folio *folio = folios->folios[i];
> > unsigned int nr_refs = refs ? refs[i] : 1;
> > + BUG_ON(folio == NULL);
> >
> > if (is_huge_zero_folio(folio))
> > continue;
>
> We are moving away from BUG_ON. It is better to use WARN_ON_ONCE and skip
> NULL values:
>
> if (WARN_ON_ONCE(!folio))
> continue;
+1
Chris
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 2/2] mm/swap: add BUG_ON(folio==NULL) to folios_put_refs()
2025-08-26 23:16 ` [PATCH 2/2] mm/swap: add BUG_ON(folio==NULL) to folios_put_refs() Max Kellermann
2025-08-27 1:42 ` Zi Yan
@ 2025-08-27 9:38 ` David Hildenbrand
1 sibling, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2025-08-27 9:38 UTC (permalink / raw)
To: Max Kellermann, akpm, lorenzo.stoakes, ziy, baolin.wang,
Liam.Howlett, npache, ryan.roberts, dev.jain, baohua, shikemeng,
kasong, nphamcs, bhe, chrisl, linux-mm, linux-kernel
On 27.08.25 01:16, Max Kellermann wrote:
> It is not legal to have NULL pointers in a folio_batch.
>
> However, the Ceph code does exactly this, and a refactoring patch gone
> wrong has exposed this to folios_put_refs(), see
> https://lore.kernel.org/ceph-devel/aK4v548CId5GIKG1@swift.blarg.de/
>
> I believe this should Oops instead of crashing due to NULL pointer
> reference (guarded by is_huge_zero_folio(), which may silently hide
> the bug).
>
> Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
> ---
> mm/swap.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/mm/swap.c b/mm/swap.c
> index 3632dd061beb..07ccda00e7ee 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -952,6 +952,7 @@ void folios_put_refs(struct folio_batch *folios, unsigned int *refs)
> for (i = 0, j = 0; i < folios->nr; i++) {
> struct folio *folio = folios->folios[i];
> unsigned int nr_refs = refs ? refs[i] : 1;
> + BUG_ON(folio == NULL);
We shouldn't be adding such checks into each and every function that
expects a valid folio.
The is_huge_zero_folio() is special because it does not crash
immediately when the function is abused.
So we don't want this (and in particular not a BUG_ON).
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] huge_mm.h: is_huge_zero_folio(NULL) should return false
2025-08-26 23:16 [PATCH 1/2] huge_mm.h: is_huge_zero_folio(NULL) should return false Max Kellermann
2025-08-26 23:16 ` [PATCH 2/2] mm/swap: add BUG_ON(folio==NULL) to folios_put_refs() Max Kellermann
@ 2025-08-27 1:19 ` Zi Yan
2025-08-27 1:55 ` Andrew Morton
2 siblings, 0 replies; 27+ messages in thread
From: Zi Yan @ 2025-08-27 1:19 UTC (permalink / raw)
To: Max Kellermann
Cc: akpm, david, lorenzo.stoakes, baolin.wang, Liam.Howlett, npache,
ryan.roberts, dev.jain, baohua, shikemeng, kasong, nphamcs, bhe,
chrisl, linux-mm, linux-kernel
On 26 Aug 2025, at 19:16, Max Kellermann wrote:
> Calling is_huge_zero_folio(NULL) should not be legal - it makes no
> sense, and a different (theoretical) implementation may dereference
> the pointer. But currently, lacking any explicit documentation, this
> call is legal.
>
> But if somebody really passes NULL, the function should not return
> true - this isn't the huge zero folio after all! However, if the
> `huge_zero_folio` hasn't been allocated yet, it's NULL, and
> is_huge_zero_folio(NULL) just happens to return true, which is a lie.
>
> I believe this is a negligible corner case and I don't want to add any
> overhead for this; but in debugging kernels, it may be helpful to add
> this check, therefore I put it inside an `#ifdef CONFIG_DEBUG_VM`.
IMHO, we should make is_huge_zero_folio() return the correct answer.
Adding hzf == NULL check should not add much overhead. So removing
#ifdef is better.
>
> This weird side effect prevented me from reproducing a kernel crash
> that occurred when the elements of a folio_batch were NULL - since
> folios_put_refs() skips huge zero folios, this sometimes causes a
> crash, but sometimes does not. For debugging, it is better to reveal
> such bugs reliably and not hide them behind random preconditions like
> "has the huge zero folio already been created?"
>
> Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
> ---
> include/linux/huge_mm.h | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index 7748489fde1b..e4c617c0b445 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -479,7 +479,12 @@ extern unsigned long huge_zero_pfn;
>
> static inline bool is_huge_zero_folio(const struct folio *folio)
> {
> - return READ_ONCE(huge_zero_folio) == folio;
> + const struct folio *hzf = READ_ONCE(huge_zero_folio);
> +#ifdef CONFIG_DEBUG_VM
> + if (hzf == NULL)
> + return false;
> +#endif
> + return hzf == folio;
> }
>
> static inline bool is_huge_zero_pfn(unsigned long pfn)
> --
> 2.47.2
--
Best Regards,
Yan, Zi
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] huge_mm.h: is_huge_zero_folio(NULL) should return false
2025-08-26 23:16 [PATCH 1/2] huge_mm.h: is_huge_zero_folio(NULL) should return false Max Kellermann
2025-08-26 23:16 ` [PATCH 2/2] mm/swap: add BUG_ON(folio==NULL) to folios_put_refs() Max Kellermann
2025-08-27 1:19 ` [PATCH 1/2] huge_mm.h: is_huge_zero_folio(NULL) should return false Zi Yan
@ 2025-08-27 1:55 ` Andrew Morton
2025-08-27 4:39 ` Max Kellermann
2025-08-27 9:35 ` David Hildenbrand
2 siblings, 2 replies; 27+ messages in thread
From: Andrew Morton @ 2025-08-27 1:55 UTC (permalink / raw)
To: Max Kellermann
Cc: david, lorenzo.stoakes, ziy, baolin.wang, Liam.Howlett, npache,
ryan.roberts, dev.jain, baohua, shikemeng, kasong, nphamcs, bhe,
chrisl, linux-mm, linux-kernel
On Wed, 27 Aug 2025 01:16:24 +0200 Max Kellermann <max.kellermann@ionos.com> wrote:
> Calling is_huge_zero_folio(NULL) should not be legal - it makes no
> sense, and a different (theoretical) implementation may dereference
> the pointer. But currently, lacking any explicit documentation, this
> call is legal.
>
> But if somebody really passes NULL, the function should not return
> true - this isn't the huge zero folio after all! However, if the
> `huge_zero_folio` hasn't been allocated yet, it's NULL, and
> is_huge_zero_folio(NULL) just happens to return true, which is a lie.
Isn't it a bug to call is_huge_zero_folio() before the huge_zero_folio
has been created?
Being a simple soul, I'm thinking
VM_BUG_ON(!huge_zero_folio);
VM_BUG_ON(!folio);
or similar will settle matters?
> I believe this is a negligible corner case and I don't want to add any
> overhead for this; but in debugging kernels, it may be helpful to add
> this check, therefore I put it inside an `#ifdef CONFIG_DEBUG_VM`.
>
> This weird side effect prevented me from reproducing a kernel crash
> that occurred when the elements of a folio_batch were NULL - since
> folios_put_refs() skips huge zero folios, this sometimes causes a
> crash, but sometimes does not. For debugging, it is better to reveal
> such bugs reliably and not hide them behind random preconditions like
> "has the huge zero folio already been created?"
>
> ...
>
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -479,7 +479,12 @@ extern unsigned long huge_zero_pfn;
>
> static inline bool is_huge_zero_folio(const struct folio *folio)
> {
> - return READ_ONCE(huge_zero_folio) == folio;
> + const struct folio *hzf = READ_ONCE(huge_zero_folio);
> +#ifdef CONFIG_DEBUG_VM
> + if (hzf == NULL)
> + return false;
> +#endif
> + return hzf == folio;
> }
>
Yeah, this all seems rather ... complicated.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] huge_mm.h: is_huge_zero_folio(NULL) should return false
2025-08-27 1:55 ` Andrew Morton
@ 2025-08-27 4:39 ` Max Kellermann
2025-08-27 9:36 ` David Hildenbrand
2025-08-27 9:35 ` David Hildenbrand
1 sibling, 1 reply; 27+ messages in thread
From: Max Kellermann @ 2025-08-27 4:39 UTC (permalink / raw)
To: Andrew Morton
Cc: david, lorenzo.stoakes, ziy, baolin.wang, Liam.Howlett, npache,
ryan.roberts, dev.jain, baohua, shikemeng, kasong, nphamcs, bhe,
chrisl, linux-mm, linux-kernel
On Wed, Aug 27, 2025 at 3:55 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> Isn't it a bug to call is_huge_zero_folio() before the huge_zero_folio
> has been created?
I have no idea, but folios_put_refs() calls it and I don't see why it
shouldn't be allowed to do that. What else should folios_put_refs()
do?
(Thanks Zi Yan, I'll send v2 with your suggestions.)
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] huge_mm.h: is_huge_zero_folio(NULL) should return false
2025-08-27 4:39 ` Max Kellermann
@ 2025-08-27 9:36 ` David Hildenbrand
2025-08-27 10:13 ` Max Kellermann
0 siblings, 1 reply; 27+ messages in thread
From: David Hildenbrand @ 2025-08-27 9:36 UTC (permalink / raw)
To: Max Kellermann, Andrew Morton
Cc: lorenzo.stoakes, ziy, baolin.wang, Liam.Howlett, npache,
ryan.roberts, dev.jain, baohua, shikemeng, kasong, nphamcs, bhe,
chrisl, linux-mm, linux-kernel
On 27.08.25 06:39, Max Kellermann wrote:
> On Wed, Aug 27, 2025 at 3:55 AM Andrew Morton <akpm@linux-foundation.org> wrote:
>> Isn't it a bug to call is_huge_zero_folio() before the huge_zero_folio
>> has been created?
>
> I have no idea, but folios_put_refs() calls it and I don't see why it
> shouldn't be allowed to do that. What else should folios_put_refs()
> do?
Why should it be allowed to pass in garbage (folio == NULL) into a
function that operates on valid folios?
>
> (Thanks Zi Yan, I'll send v2 with your suggestions.)
No, use a VM_WARN_ON_ONCE().
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] huge_mm.h: is_huge_zero_folio(NULL) should return false
2025-08-27 9:36 ` David Hildenbrand
@ 2025-08-27 10:13 ` Max Kellermann
2025-08-27 11:56 ` David Hildenbrand
0 siblings, 1 reply; 27+ messages in thread
From: Max Kellermann @ 2025-08-27 10:13 UTC (permalink / raw)
To: David Hildenbrand
Cc: Andrew Morton, lorenzo.stoakes, ziy, baolin.wang, Liam.Howlett,
npache, ryan.roberts, dev.jain, baohua, shikemeng, kasong,
nphamcs, bhe, chrisl, linux-mm, linux-kernel
On Wed, Aug 27, 2025 at 11:36 AM David Hildenbrand <david@redhat.com> wrote:
> Why should it be allowed to pass in garbage (folio == NULL) into a
> function that operates on valid folios?
This patch isn't about the function parameter but about the global
variable being NULL.
(Don't mix up with my other patch.)
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] huge_mm.h: is_huge_zero_folio(NULL) should return false
2025-08-27 10:13 ` Max Kellermann
@ 2025-08-27 11:56 ` David Hildenbrand
2025-08-27 13:06 ` Max Kellermann
0 siblings, 1 reply; 27+ messages in thread
From: David Hildenbrand @ 2025-08-27 11:56 UTC (permalink / raw)
To: Max Kellermann
Cc: Andrew Morton, lorenzo.stoakes, ziy, baolin.wang, Liam.Howlett,
npache, ryan.roberts, dev.jain, baohua, shikemeng, kasong,
nphamcs, bhe, chrisl, linux-mm, linux-kernel
On 27.08.25 12:13, Max Kellermann wrote:
> On Wed, Aug 27, 2025 at 11:36 AM David Hildenbrand <david@redhat.com> wrote:
>> Why should it be allowed to pass in garbage (folio == NULL) into a
>> function that operates on valid folios?
>
> This patch isn't about the function parameter but about the global
> variable being NULL.
> (Don't mix up with my other patch.)
Huh?
"Calling is_huge_zero_folio(NULL) should not be legal - it makes no
sense, and a different (theoretical) implementation may dereference
the pointer."
And then
"But currently, lacking any explicit documentation, this
call is legal."
No. It isn't. It never was.
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] huge_mm.h: is_huge_zero_folio(NULL) should return false
2025-08-27 11:56 ` David Hildenbrand
@ 2025-08-27 13:06 ` Max Kellermann
2025-08-27 14:34 ` David Hildenbrand
0 siblings, 1 reply; 27+ messages in thread
From: Max Kellermann @ 2025-08-27 13:06 UTC (permalink / raw)
To: David Hildenbrand
Cc: Andrew Morton, lorenzo.stoakes, ziy, baolin.wang, Liam.Howlett,
npache, ryan.roberts, dev.jain, baohua, shikemeng, kasong,
nphamcs, bhe, chrisl, linux-mm, linux-kernel
On Wed, Aug 27, 2025 at 1:56 PM David Hildenbrand <david@redhat.com> wrote:
> > This patch isn't about the function parameter but about the global
> > variable being NULL.
> > (Don't mix up with my other patch.)
>
> Huh?
Oh, I see the misunderstanding. You are right here, yes my patch is
indeed about passing NULL to is_huge_zero_folio(). My previous reply
was confusing.
But you were replying to a sub-thread by Andrew who questioned whether
a call is_huge_zero_folio() is legal at all before the huge zero folio
has been created.
The question I asked Andrew, which you replied to, was not about
passing NULL to is_huge_zero_folio(), but about whether the call to
is_huge_zero_folio() is legal at all, no matter which parameter value.
I agree with you that is_huge_zero_folio(NULL) should not be legal and
makes no sense. It's a bug somewhere in the caller.
But I saw that the current implementation effectively (randomly) hides
a bug in another part of the kernel (= Ceph), one that I'd rather like
to be visible (and get fixed). And that is the point of my patch, and
is the reason why I chose to have the additional debug-only code
inside an #ifdef - I don't want normal users to pay the price for a
debugging feature, no matter how small the price.
David, what is your opinion here?
- leave it as it is; is_huge_zero_folio(NULL) may randomly return true
or false (= reject this patch)
- add the huge_zero_folio==NULL check, but guarded with #ifdef DEBUG
(= this patch)
- add the huge_zero_folio==NULL check unconditionally (suggested by Zi Yan)
And do you agree with Andrew that calling is_huge_zero_folio() should
only be legal if huge_zero_folio!=NULL?
Max
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] huge_mm.h: is_huge_zero_folio(NULL) should return false
2025-08-27 13:06 ` Max Kellermann
@ 2025-08-27 14:34 ` David Hildenbrand
2025-08-27 15:03 ` [PATCH v2] huge_mm.h: disallow is_huge_zero_folio(NULL) Max Kellermann
2025-08-27 15:05 ` [PATCH 1/2] huge_mm.h: is_huge_zero_folio(NULL) should return false Max Kellermann
0 siblings, 2 replies; 27+ messages in thread
From: David Hildenbrand @ 2025-08-27 14:34 UTC (permalink / raw)
To: Max Kellermann
Cc: Andrew Morton, lorenzo.stoakes, ziy, baolin.wang, Liam.Howlett,
npache, ryan.roberts, dev.jain, baohua, shikemeng, kasong,
nphamcs, bhe, chrisl, linux-mm, linux-kernel
On 27.08.25 15:06, Max Kellermann wrote:
> On Wed, Aug 27, 2025 at 1:56 PM David Hildenbrand <david@redhat.com> wrote:
>>> This patch isn't about the function parameter but about the global
>>> variable being NULL.
>>> (Don't mix up with my other patch.)
>>
>> Huh?
>
> Oh, I see the misunderstanding. You are right here, yes my patch is
> indeed about passing NULL to is_huge_zero_folio(). My previous reply
> was confusing.
>
> But you were replying to a sub-thread by Andrew who questioned whether
> a call is_huge_zero_folio() is legal at all before the huge zero folio
> has been created.
Sure it is, I mean on some systems it is never initialized.
> The question I asked Andrew, which you replied to, was not about
> passing NULL to is_huge_zero_folio(), but about whether the call to
> is_huge_zero_folio() is legal at all, no matter which parameter value.
Yes, see how get_huge_zero_folio() dynamically allocates it on demand to
then update huge_zero_folio.
>
> I agree with you that is_huge_zero_folio(NULL) should not be legal and
> makes no sense. It's a bug somewhere in the caller.
> But I saw that the current implementation effectively (randomly) hides
> a bug in another part of the kernel (= Ceph), one that I'd rather like
> to be visible (and get fixed). And that is the point of my patch, and
> is the reason why I chose to have the additional debug-only code
> inside an #ifdef - I don't want normal users to pay the price for a
> debugging feature, no matter how small the price.
>
> David, what is your opinion here?
>
> - leave it as it is; is_huge_zero_folio(NULL) may randomly return true
> or false (= reject this patch)
> - add the huge_zero_folio==NULL check, but guarded with #ifdef DEBUG
> (= this patch)
> - add the huge_zero_folio==NULL check unconditionally (suggested by Zi Yan)
VM_WARN_ON_ONCE(!folio);
Should allow for catching such bugs without causing any runtime overhead
on non-debug kernels.
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH v2] huge_mm.h: disallow is_huge_zero_folio(NULL)
2025-08-27 14:34 ` David Hildenbrand
@ 2025-08-27 15:03 ` Max Kellermann
2025-08-27 15:16 ` Lorenzo Stoakes
2025-08-27 23:53 ` [PATCH v2] " Andrew Morton
2025-08-27 15:05 ` [PATCH 1/2] huge_mm.h: is_huge_zero_folio(NULL) should return false Max Kellermann
1 sibling, 2 replies; 27+ messages in thread
From: Max Kellermann @ 2025-08-27 15:03 UTC (permalink / raw)
To: akpm, david, lorenzo.stoakes, ziy, baolin.wang, Liam.Howlett,
npache, ryan.roberts, dev.jain, baohua, shikemeng, kasong,
nphamcs, bhe, chrisl, linux-mm, linux-kernel
Cc: Max Kellermann
Calling is_huge_zero_folio(NULL) should not be legal - it makes no
sense, and a different (theoretical) implementation may dereference
the pointer. But currently, lacking any explicit documentation, this
call is possible.
But if somebody really passes NULL, the function should not return
true - this isn't the huge zero folio after all! However, if the
`huge_zero_folio` hasn't been allocated yet, it's NULL, and
is_huge_zero_folio(NULL) just happens to return true, which is a lie.
This weird side effect prevented me from reproducing a kernel crash
that occurred when the elements of a folio_batch were NULL - since
folios_put_refs() skips huge zero folios, this sometimes causes a
crash, but sometimes does not. For debugging, it is better to reveal
such bugs reliably and not hide them behind random preconditions like
"has the huge zero folio already been created?"
To improve detection of such bugs, David Hildenbrand suggested adding
a VM_WARN_ON_ONCE().
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
---
include/linux/huge_mm.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 7748489fde1b..bc9ca7798f2e 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -2,6 +2,7 @@
#ifndef _LINUX_HUGE_MM_H
#define _LINUX_HUGE_MM_H
+#include <linux/mmdebug.h> // for VM_WARN_ON_ONCE()
#include <linux/mm_types.h>
#include <linux/fs.h> /* only for vma_is_dax() */
@@ -479,6 +480,8 @@ extern unsigned long huge_zero_pfn;
static inline bool is_huge_zero_folio(const struct folio *folio)
{
+ VM_WARN_ON_ONCE(folio == NULL);
+
return READ_ONCE(huge_zero_folio) == folio;
}
--
2.47.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH v2] huge_mm.h: disallow is_huge_zero_folio(NULL)
2025-08-27 15:03 ` [PATCH v2] huge_mm.h: disallow is_huge_zero_folio(NULL) Max Kellermann
@ 2025-08-27 15:16 ` Lorenzo Stoakes
2025-08-27 15:38 ` Max Kellermann
2025-08-27 20:01 ` David Hildenbrand
2025-08-27 23:53 ` [PATCH v2] " Andrew Morton
1 sibling, 2 replies; 27+ messages in thread
From: Lorenzo Stoakes @ 2025-08-27 15:16 UTC (permalink / raw)
To: Max Kellermann
Cc: akpm, david, ziy, baolin.wang, Liam.Howlett, npache, ryan.roberts,
dev.jain, baohua, shikemeng, kasong, nphamcs, bhe, chrisl,
linux-mm, linux-kernel
On Wed, Aug 27, 2025 at 05:03:30PM +0200, Max Kellermann wrote:
> Calling is_huge_zero_folio(NULL) should not be legal - it makes no
> sense, and a different (theoretical) implementation may dereference
> the pointer. But currently, lacking any explicit documentation, this
> call is possible.
This is true of a huge amount of kernel logic though. We can't always cover
off every single possibility when documenting these things, and there's
often an implicit assumption of this not being null.
>
> But if somebody really passes NULL, the function should not return
> true - this isn't the huge zero folio after all! However, if the
> `huge_zero_folio` hasn't been allocated yet, it's NULL, and
> is_huge_zero_folio(NULL) just happens to return true, which is a lie.
Hmm seems like this is a bug under a bug. folio_put_refs() shouldn't be
passed a folio batch of NULL's.
Shouldn't we just put the VM_WARN_ON_ONCE() there?
>
> This weird side effect prevented me from reproducing a kernel crash
> that occurred when the elements of a folio_batch were NULL - since
> folios_put_refs() skips huge zero folios, this sometimes causes a
> crash, but sometimes does not. For debugging, it is better to reveal
> such bugs reliably and not hide them behind random preconditions like
> "has the huge zero folio already been created?"
>
> To improve detection of such bugs, David Hildenbrand suggested adding
> a VM_WARN_ON_ONCE().
But I really don't think passing NULL to is_huge_zero_folio() is a valid
enough situation to justify this?
You've encountered a case where a bug caused folio_put_refs() to be called
with an invalid parameter, then you're arbitrarily changing
is_huge_zero_folio() so it would deref the folio and splat.
This seems not so great.
I really think the VM_WARN_ON_ONCE() should be in folios_put_refs() based
on what you've said.
>
> Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
> ---
> include/linux/huge_mm.h | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index 7748489fde1b..bc9ca7798f2e 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -2,6 +2,7 @@
> #ifndef _LINUX_HUGE_MM_H
> #define _LINUX_HUGE_MM_H
>
> +#include <linux/mmdebug.h> // for VM_WARN_ON_ONCE()
Please don't do //.
This include is suspect though, huge_mm.h is included from mm.h and thus
this very easily might break some arch that is weird about this stuff,
because a ton of stuff includes mm.h including things that might absolutely
baulk at mmdebug.
I've had this kind of thing happen several times before.
> #include <linux/mm_types.h>
>
> #include <linux/fs.h> /* only for vma_is_dax() */
> @@ -479,6 +480,8 @@ extern unsigned long huge_zero_pfn;
>
> static inline bool is_huge_zero_folio(const struct folio *folio)
> {
> + VM_WARN_ON_ONCE(folio == NULL);
> +
Convention is VM_WARN_ON_ONCE(!folio);
> return READ_ONCE(huge_zero_folio) == folio;
> }
>
> --
> 2.47.2
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2] huge_mm.h: disallow is_huge_zero_folio(NULL)
2025-08-27 15:16 ` Lorenzo Stoakes
@ 2025-08-27 15:38 ` Max Kellermann
2025-08-27 20:01 ` David Hildenbrand
1 sibling, 0 replies; 27+ messages in thread
From: Max Kellermann @ 2025-08-27 15:38 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: akpm, david, ziy, baolin.wang, Liam.Howlett, npache, ryan.roberts,
dev.jain, baohua, shikemeng, kasong, nphamcs, bhe, chrisl,
linux-mm, linux-kernel
On Wed, Aug 27, 2025 at 5:21 PM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
> > But if somebody really passes NULL, the function should not return
> > true - this isn't the huge zero folio after all! However, if the
> > `huge_zero_folio` hasn't been allocated yet, it's NULL, and
> > is_huge_zero_folio(NULL) just happens to return true, which is a lie.
>
> Hmm seems like this is a bug under a bug. folio_put_refs() shouldn't be
> passed a folio batch of NULL's.
Agree! That was exactly my point - I was hunting down a bug that
sometimes caused folio_put_refs() to crash, but most of the time not
(when no zero huge page was allocated yet). And this randomness is
what I'd like to get rid of.
> Shouldn't we just put the VM_WARN_ON_ONCE() there?
Agree, but that was the 2/2 patch I dropped after David's objection.
> But I really don't think passing NULL to is_huge_zero_folio() is a valid
> enough situation to justify this?
>
> You've encountered a case where a bug caused folio_put_refs() to be called
> with an invalid parameter, then you're arbitrarily changing
> is_huge_zero_folio() so it would deref the folio and splat.
Actually, my v1 patch did not do that. Instead, it checked whether the
huge zero page was already allocated, in order to make
is_huge_zero_folio(NULL) to reliably return false, because NULL is not
the huge zero page. Then David disagreed and asked me to add
VM_WARN_ON_ONCE() instead.
> I really think the VM_WARN_ON_ONCE() should be in folios_put_refs() based
> on what you've said.
You only disagree with David, but not with me. I'm happy with either
way of dealing with this kind of bug/abuse.
> > +#include <linux/mmdebug.h> // for VM_WARN_ON_ONCE()
>
> Please don't do //.
In Linux-main, there are currently 432 comments documenting #include
lines. This is a pretty common coding style.
> This include is suspect though, huge_mm.h is included from mm.h and thus
> this very easily might break some arch that is weird about this stuff,
> because a ton of stuff includes mm.h including things that might absolutely
> baulk at mmdebug.
What would you suggest doing instead, to make the VM_WARN_ON_ONCE()
macro available?
> I've had this kind of thing happen several times before.
I know, #includes in Linux are a big mess. A while ago, I tried to
help clean it up, but my effort was rejected by the kernel
maintainers. Which is a pity.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2] huge_mm.h: disallow is_huge_zero_folio(NULL)
2025-08-27 15:16 ` Lorenzo Stoakes
2025-08-27 15:38 ` Max Kellermann
@ 2025-08-27 20:01 ` David Hildenbrand
2025-08-27 20:26 ` David Hildenbrand
2025-08-28 8:48 ` [PATCH v3] " Max Kellermann
1 sibling, 2 replies; 27+ messages in thread
From: David Hildenbrand @ 2025-08-27 20:01 UTC (permalink / raw)
To: Lorenzo Stoakes, Max Kellermann
Cc: akpm, ziy, baolin.wang, Liam.Howlett, npache, ryan.roberts,
dev.jain, baohua, shikemeng, kasong, nphamcs, bhe, chrisl,
linux-mm, linux-kernel
> This seems not so great.
>
> I really think the VM_WARN_ON_ONCE() should be in folios_put_refs() based
> on what you've said.
No.
Everything in folio_put_refs() will dereference the folio and properly
crash the machine when doing something stupid.
This function, however, will silently swallow a "NULL" pointer.
>
>>
>> Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
>> ---
>> include/linux/huge_mm.h | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
>> index 7748489fde1b..bc9ca7798f2e 100644
>> --- a/include/linux/huge_mm.h
>> +++ b/include/linux/huge_mm.h
>> @@ -2,6 +2,7 @@
>> #ifndef _LINUX_HUGE_MM_H
>> #define _LINUX_HUGE_MM_H
>>
>> +#include <linux/mmdebug.h> // for VM_WARN_ON_ONCE()
>
> Please don't do //.
>
> This include is suspect though, huge_mm.h is included from mm.h and thus
> this very easily might break some arch that is weird about this stuff,
> because a ton of stuff includes mm.h including things that might absolutely
> baulk at mmdebug.
Jup. Very likely this is not required.
>
> I've had this kind of thing happen several times before.
>
>> #include <linux/mm_types.h>
>>
>> #include <linux/fs.h> /* only for vma_is_dax() */
>> @@ -479,6 +480,8 @@ extern unsigned long huge_zero_pfn;
>>
>> static inline bool is_huge_zero_folio(const struct folio *folio)
>> {
>> + VM_WARN_ON_ONCE(folio == NULL);
>> +
>
> Convention is VM_WARN_ON_ONCE(!folio);
Wanted to comment the same thing.
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2] huge_mm.h: disallow is_huge_zero_folio(NULL)
2025-08-27 20:01 ` David Hildenbrand
@ 2025-08-27 20:26 ` David Hildenbrand
2025-08-27 20:44 ` Lorenzo Stoakes
2025-08-28 8:48 ` [PATCH v3] " Max Kellermann
1 sibling, 1 reply; 27+ messages in thread
From: David Hildenbrand @ 2025-08-27 20:26 UTC (permalink / raw)
To: Lorenzo Stoakes, Max Kellermann
Cc: akpm, ziy, baolin.wang, Liam.Howlett, npache, ryan.roberts,
dev.jain, baohua, shikemeng, kasong, nphamcs, bhe, chrisl,
linux-mm, linux-kernel
On 27.08.25 22:01, David Hildenbrand wrote:
>
>> This seems not so great.
>>
>> I really think the VM_WARN_ON_ONCE() should be in folios_put_refs() based
>> on what you've said.
>
> No.
>
> Everything in folio_put_refs() will dereference the folio and properly
> crash the machine when doing something stupid.
>
> This function, however, will silently swallow a "NULL" pointer.
>
>>
>>>
>>> Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
>>> ---
>>> include/linux/huge_mm.h | 3 +++
>>> 1 file changed, 3 insertions(+)
>>>
>>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
>>> index 7748489fde1b..bc9ca7798f2e 100644
>>> --- a/include/linux/huge_mm.h
>>> +++ b/include/linux/huge_mm.h
>>> @@ -2,6 +2,7 @@
>>> #ifndef _LINUX_HUGE_MM_H
>>> #define _LINUX_HUGE_MM_H
>>>
>>> +#include <linux/mmdebug.h> // for VM_WARN_ON_ONCE()
>>
>> Please don't do //.
>>
>> This include is suspect though, huge_mm.h is included from mm.h and thus
>> this very easily might break some arch that is weird about this stuff,
>> because a ton of stuff includes mm.h including things that might absolutely
>> baulk at mmdebug.
>
> Jup. Very likely this is not required.
Took another look and including mmdebug.h should likely be fine, because
all it includes is really
(1) include/linux/bug.h
(2) include/linux/stringify.h
But originally, I wonder if we even need mmdebug.h or if it is already
implicitly included.
huge_mm.h includes mm_types.h (where we already do use VM_WARN...).
I assume there we get it implicitly through percpu.h, mmap_lock.h or
percpu.h
So the include should just get dropped.
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2] huge_mm.h: disallow is_huge_zero_folio(NULL)
2025-08-27 20:26 ` David Hildenbrand
@ 2025-08-27 20:44 ` Lorenzo Stoakes
0 siblings, 0 replies; 27+ messages in thread
From: Lorenzo Stoakes @ 2025-08-27 20:44 UTC (permalink / raw)
To: David Hildenbrand
Cc: Max Kellermann, akpm, ziy, baolin.wang, Liam.Howlett, npache,
ryan.roberts, dev.jain, baohua, shikemeng, kasong, nphamcs, bhe,
chrisl, linux-mm, linux-kernel
On Wed, Aug 27, 2025 at 10:26:09PM +0200, David Hildenbrand wrote:
> On 27.08.25 22:01, David Hildenbrand wrote:
> >
> > > This seems not so great.
> > >
> > > I really think the VM_WARN_ON_ONCE() should be in folios_put_refs() based
> > > on what you've said.
> >
> > No.
> >
> > Everything in folio_put_refs() will dereference the folio and properly
> > crash the machine when doing something stupid.
> >
> > This function, however, will silently swallow a "NULL" pointer.
> >
> > >
> > > >
> > > > Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
> > > > ---
> > > > include/linux/huge_mm.h | 3 +++
> > > > 1 file changed, 3 insertions(+)
> > > >
> > > > diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> > > > index 7748489fde1b..bc9ca7798f2e 100644
> > > > --- a/include/linux/huge_mm.h
> > > > +++ b/include/linux/huge_mm.h
> > > > @@ -2,6 +2,7 @@
> > > > #ifndef _LINUX_HUGE_MM_H
> > > > #define _LINUX_HUGE_MM_H
> > > >
> > > > +#include <linux/mmdebug.h> // for VM_WARN_ON_ONCE()
> > >
> > > Please don't do //.
> > >
> > > This include is suspect though, huge_mm.h is included from mm.h and thus
> > > this very easily might break some arch that is weird about this stuff,
> > > because a ton of stuff includes mm.h including things that might absolutely
> > > baulk at mmdebug.
> >
> > Jup. Very likely this is not required.
>
> Took another look and including mmdebug.h should likely be fine, because all
> it includes is really
>
> (1) include/linux/bug.h
> (2) include/linux/stringify.h
>
> But originally, I wonder if we even need mmdebug.h or if it is already
> implicitly included.
>
> huge_mm.h includes mm_types.h (where we already do use VM_WARN...).
>
> I assume there we get it implicitly through percpu.h, mmap_lock.h or
> percpu.h
>
> So the include should just get dropped.
Ah nice, let's do this then please! :)
Thanks for checking!
Cheers, Lorenzo
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH v3] huge_mm.h: disallow is_huge_zero_folio(NULL)
2025-08-27 20:01 ` David Hildenbrand
2025-08-27 20:26 ` David Hildenbrand
@ 2025-08-28 8:48 ` Max Kellermann
2025-08-28 9:35 ` Lorenzo Stoakes
2025-08-29 1:25 ` Zi Yan
1 sibling, 2 replies; 27+ messages in thread
From: Max Kellermann @ 2025-08-28 8:48 UTC (permalink / raw)
To: akpm, david, lorenzo.stoakes, ziy, baolin.wang, Liam.Howlett,
npache, ryan.roberts, dev.jain, baohua, shikemeng, kasong,
nphamcs, bhe, chrisl, linux-mm, linux-kernel
Cc: Max Kellermann
Calling is_huge_zero_folio(NULL) should not be legal - it makes no
sense, and a different (theoretical) implementation may dereference
the pointer. But currently, lacking any explicit documentation, this
call is possible.
But if somebody really passes NULL, the function should not return
true - this isn't the huge zero folio after all! However, if the
`huge_zero_folio` hasn't been allocated yet, it's NULL, and
is_huge_zero_folio(NULL) just happens to return true, which is a lie.
This weird side effect prevented me from reproducing a kernel crash
that occurred when the elements of a folio_batch were NULL - since
folios_put_refs() skips huge zero folios, this sometimes causes a
crash, but sometimes does not. For debugging, it is better to reveal
such bugs reliably and not hide them behind random preconditions like
"has the huge zero folio already been created?"
To improve detection of such bugs, David Hildenbrand suggested adding
a VM_WARN_ON_ONCE().
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
---
v1->v2: using VM_WARN_ON_ONCE() instead of checking huge_zero_folio
v2->v3: use "!" to check NULL; removed the #include
---
include/linux/huge_mm.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 7748489fde1b..96ac47603d97 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -479,6 +479,8 @@ extern unsigned long huge_zero_pfn;
static inline bool is_huge_zero_folio(const struct folio *folio)
{
+ VM_WARN_ON_ONCE(!folio);
+
return READ_ONCE(huge_zero_folio) == folio;
}
--
2.47.2
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH v3] huge_mm.h: disallow is_huge_zero_folio(NULL)
2025-08-28 8:48 ` [PATCH v3] " Max Kellermann
@ 2025-08-28 9:35 ` Lorenzo Stoakes
2025-08-29 1:25 ` Zi Yan
1 sibling, 0 replies; 27+ messages in thread
From: Lorenzo Stoakes @ 2025-08-28 9:35 UTC (permalink / raw)
To: Max Kellermann
Cc: akpm, david, ziy, baolin.wang, Liam.Howlett, npache, ryan.roberts,
dev.jain, baohua, shikemeng, kasong, nphamcs, bhe, chrisl,
linux-mm, linux-kernel
Please don't send this in reply to previous versions, just send as a
separate mail :) (yes email for dev sucks and there's a bunch of weird
nuances so totally understandable!)
Hopefully Andrew will pick this up regardless.
On Thu, Aug 28, 2025 at 10:48:20AM +0200, Max Kellermann wrote:
> Calling is_huge_zero_folio(NULL) should not be legal - it makes no
> sense, and a different (theoretical) implementation may dereference
> the pointer. But currently, lacking any explicit documentation, this
> call is possible.
>
> But if somebody really passes NULL, the function should not return
> true - this isn't the huge zero folio after all! However, if the
> `huge_zero_folio` hasn't been allocated yet, it's NULL, and
> is_huge_zero_folio(NULL) just happens to return true, which is a lie.
>
> This weird side effect prevented me from reproducing a kernel crash
> that occurred when the elements of a folio_batch were NULL - since
> folios_put_refs() skips huge zero folios, this sometimes causes a
> crash, but sometimes does not. For debugging, it is better to reveal
> such bugs reliably and not hide them behind random preconditions like
> "has the huge zero folio already been created?"
>
> To improve detection of such bugs, David Hildenbrand suggested adding
> a VM_WARN_ON_ONCE().
>
> Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
LGTM, so:
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> ---
> v1->v2: using VM_WARN_ON_ONCE() instead of checking huge_zero_folio
> v2->v3: use "!" to check NULL; removed the #include
Putting the history in is great though thanks! :)
> ---
> include/linux/huge_mm.h | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index 7748489fde1b..96ac47603d97 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -479,6 +479,8 @@ extern unsigned long huge_zero_pfn;
>
> static inline bool is_huge_zero_folio(const struct folio *folio)
> {
> + VM_WARN_ON_ONCE(!folio);
> +
> return READ_ONCE(huge_zero_folio) == folio;
> }
>
> --
> 2.47.2
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v3] huge_mm.h: disallow is_huge_zero_folio(NULL)
2025-08-28 8:48 ` [PATCH v3] " Max Kellermann
2025-08-28 9:35 ` Lorenzo Stoakes
@ 2025-08-29 1:25 ` Zi Yan
1 sibling, 0 replies; 27+ messages in thread
From: Zi Yan @ 2025-08-29 1:25 UTC (permalink / raw)
To: Max Kellermann
Cc: akpm, david, lorenzo.stoakes, baolin.wang, Liam.Howlett, npache,
ryan.roberts, dev.jain, baohua, shikemeng, kasong, nphamcs, bhe,
chrisl, linux-mm, linux-kernel
On 28 Aug 2025, at 4:48, Max Kellermann wrote:
> Calling is_huge_zero_folio(NULL) should not be legal - it makes no
> sense, and a different (theoretical) implementation may dereference
> the pointer. But currently, lacking any explicit documentation, this
> call is possible.
>
> But if somebody really passes NULL, the function should not return
> true - this isn't the huge zero folio after all! However, if the
> `huge_zero_folio` hasn't been allocated yet, it's NULL, and
> is_huge_zero_folio(NULL) just happens to return true, which is a lie.
>
> This weird side effect prevented me from reproducing a kernel crash
> that occurred when the elements of a folio_batch were NULL - since
> folios_put_refs() skips huge zero folios, this sometimes causes a
> crash, but sometimes does not. For debugging, it is better to reveal
> such bugs reliably and not hide them behind random preconditions like
> "has the huge zero folio already been created?"
>
> To improve detection of such bugs, David Hildenbrand suggested adding
> a VM_WARN_ON_ONCE().
>
> Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
> ---
> v1->v2: using VM_WARN_ON_ONCE() instead of checking huge_zero_folio
> v2->v3: use "!" to check NULL; removed the #include
> ---
> include/linux/huge_mm.h | 2 ++
> 1 file changed, 2 insertions(+)
>
LGTM. Reviewed-by: Zi Yan <ziy@nvidia.com>
--
Best Regards,
Yan, Zi
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2] huge_mm.h: disallow is_huge_zero_folio(NULL)
2025-08-27 15:03 ` [PATCH v2] huge_mm.h: disallow is_huge_zero_folio(NULL) Max Kellermann
2025-08-27 15:16 ` Lorenzo Stoakes
@ 2025-08-27 23:53 ` Andrew Morton
2025-08-28 5:17 ` Max Kellermann
2025-08-28 7:57 ` David Hildenbrand
1 sibling, 2 replies; 27+ messages in thread
From: Andrew Morton @ 2025-08-27 23:53 UTC (permalink / raw)
To: Max Kellermann
Cc: david, lorenzo.stoakes, ziy, baolin.wang, Liam.Howlett, npache,
ryan.roberts, dev.jain, baohua, shikemeng, kasong, nphamcs, bhe,
chrisl, linux-mm, linux-kernel
On Wed, 27 Aug 2025 17:03:30 +0200 Max Kellermann <max.kellermann@ionos.com> wrote:
> Calling is_huge_zero_folio(NULL) should not be legal - it makes no
> sense, and a different (theoretical) implementation may dereference
> the pointer. But currently, lacking any explicit documentation, this
> call is possible.
>
> But if somebody really passes NULL, the function should not return
> true - this isn't the huge zero folio after all! However, if the
> `huge_zero_folio` hasn't been allocated yet, it's NULL, and
> is_huge_zero_folio(NULL) just happens to return true, which is a lie.
>
> This weird side effect prevented me from reproducing a kernel crash
> that occurred when the elements of a folio_batch were NULL - since
> folios_put_refs() skips huge zero folios, this sometimes causes a
> crash, but sometimes does not. For debugging, it is better to reveal
> such bugs reliably and not hide them behind random preconditions like
> "has the huge zero folio already been created?"
>
> To improve detection of such bugs, David Hildenbrand suggested adding
> a VM_WARN_ON_ONCE().
>
> ...
>
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -2,6 +2,7 @@
> #ifndef _LINUX_HUGE_MM_H
> #define _LINUX_HUGE_MM_H
>
> +#include <linux/mmdebug.h> // for VM_WARN_ON_ONCE()
> #include <linux/mm_types.h>
>
> #include <linux/fs.h> /* only for vma_is_dax() */
> @@ -479,6 +480,8 @@ extern unsigned long huge_zero_pfn;
>
> static inline bool is_huge_zero_folio(const struct folio *folio)
> {
> + VM_WARN_ON_ONCE(folio == NULL);
> +
> return READ_ONCE(huge_zero_folio) == folio;
> }
OK, but it remains the case that we have seen code which calls
is_huge_zero_folio() prior to the initialization of huge_zero_folio.
Is this a bug? I think so. Should we be checking for recurrences of
this bug?
Also, sigh. I do dislike seeing VM_WARN_ON_ONCE() in an inline
function - heaven knows how much bloat that adds. Defconfig
mm/huge_memory.o (which has three calls) grows by 80 bytes so I guess
that's livable with.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2] huge_mm.h: disallow is_huge_zero_folio(NULL)
2025-08-27 23:53 ` [PATCH v2] " Andrew Morton
@ 2025-08-28 5:17 ` Max Kellermann
2025-08-28 7:57 ` David Hildenbrand
1 sibling, 0 replies; 27+ messages in thread
From: Max Kellermann @ 2025-08-28 5:17 UTC (permalink / raw)
To: Andrew Morton
Cc: david, lorenzo.stoakes, ziy, baolin.wang, Liam.Howlett, npache,
ryan.roberts, dev.jain, baohua, shikemeng, kasong, nphamcs, bhe,
chrisl, linux-mm, linux-kernel
On Thu, Aug 28, 2025 at 1:53 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> OK, but it remains the case that we have seen code which calls
> is_huge_zero_folio() prior to the initialization of huge_zero_folio.
>
> Is this a bug? I think so. Should we be checking for recurrences of
> this bug?
Why do you believe it's a bug? To me, as a newbie here, such a call
seems perfectly fine; please explain.
> Also, sigh. I do dislike seeing VM_WARN_ON_ONCE() in an inline
> function - heaven knows how much bloat that adds. Defconfig
> mm/huge_memory.o (which has three calls) grows by 80 bytes so I guess
> that's livable with.
Oh, how is this possible? defconfig has CONFIG_DEBUG_VM=n and that
means VM_WARN_ON_ONCE() is just BUILD_BUG_ON_INVALID() which should
generate no code at all.
Here on my computer (aarch64 with GCC 14), the size (and the
disassembly) is exactly the same. This confirms what David Hildenbrand
predicted.
Are you seeing some compiler weirdness, maybe?
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2] huge_mm.h: disallow is_huge_zero_folio(NULL)
2025-08-27 23:53 ` [PATCH v2] " Andrew Morton
2025-08-28 5:17 ` Max Kellermann
@ 2025-08-28 7:57 ` David Hildenbrand
1 sibling, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2025-08-28 7:57 UTC (permalink / raw)
To: Andrew Morton, Max Kellermann
Cc: lorenzo.stoakes, ziy, baolin.wang, Liam.Howlett, npache,
ryan.roberts, dev.jain, baohua, shikemeng, kasong, nphamcs, bhe,
chrisl, linux-mm, linux-kernel
On 28.08.25 01:53, Andrew Morton wrote:
> On Wed, 27 Aug 2025 17:03:30 +0200 Max Kellermann <max.kellermann@ionos.com> wrote:
>
>> Calling is_huge_zero_folio(NULL) should not be legal - it makes no
>> sense, and a different (theoretical) implementation may dereference
>> the pointer. But currently, lacking any explicit documentation, this
>> call is possible.
>>
>> But if somebody really passes NULL, the function should not return
>> true - this isn't the huge zero folio after all! However, if the
>> `huge_zero_folio` hasn't been allocated yet, it's NULL, and
>> is_huge_zero_folio(NULL) just happens to return true, which is a lie.
>>
>> This weird side effect prevented me from reproducing a kernel crash
>> that occurred when the elements of a folio_batch were NULL - since
>> folios_put_refs() skips huge zero folios, this sometimes causes a
>> crash, but sometimes does not. For debugging, it is better to reveal
>> such bugs reliably and not hide them behind random preconditions like
>> "has the huge zero folio already been created?"
>>
>> To improve detection of such bugs, David Hildenbrand suggested adding
>> a VM_WARN_ON_ONCE().
>>
>> ...
>>
>> --- a/include/linux/huge_mm.h
>> +++ b/include/linux/huge_mm.h
>> @@ -2,6 +2,7 @@
>> #ifndef _LINUX_HUGE_MM_H
>> #define _LINUX_HUGE_MM_H
>>
>> +#include <linux/mmdebug.h> // for VM_WARN_ON_ONCE()
>> #include <linux/mm_types.h>
>>
>> #include <linux/fs.h> /* only for vma_is_dax() */
>> @@ -479,6 +480,8 @@ extern unsigned long huge_zero_pfn;
>>
>> static inline bool is_huge_zero_folio(const struct folio *folio)
>> {
>> + VM_WARN_ON_ONCE(folio == NULL);
>> +
>> return READ_ONCE(huge_zero_folio) == folio;
>> }
>
> OK, but it remains the case that we have seen code which calls
> is_huge_zero_folio() prior to the initialization of huge_zero_folio.
>
> Is this a bug? I think so. Should we be checking for recurrences of
> this bug?
As answered elsewhere, this is perfectly fine as the huge zero folio is
allocated on demand only (and only once enabled).
>
>
> Also, sigh. I do dislike seeing VM_WARN_ON_ONCE() in an inline
> function - heaven knows how much bloat that adds. Defconfig
> mm/huge_memory.o (which has three calls) grows by 80 bytes so I guess
> that's livable with.
Common practice to use VM_WARN_ON_ONCE() and friend in inline functions.
Just look at page-flags.h.
If someone complains about kernel image size with CONFIG_DEBUG_VM, they
shopuld reevaluate life choices. :)
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] huge_mm.h: is_huge_zero_folio(NULL) should return false
2025-08-27 14:34 ` David Hildenbrand
2025-08-27 15:03 ` [PATCH v2] huge_mm.h: disallow is_huge_zero_folio(NULL) Max Kellermann
@ 2025-08-27 15:05 ` Max Kellermann
1 sibling, 0 replies; 27+ messages in thread
From: Max Kellermann @ 2025-08-27 15:05 UTC (permalink / raw)
To: David Hildenbrand
Cc: Andrew Morton, lorenzo.stoakes, ziy, baolin.wang, Liam.Howlett,
npache, ryan.roberts, dev.jain, baohua, shikemeng, kasong,
nphamcs, bhe, chrisl, linux-mm, linux-kernel
On Wed, Aug 27, 2025 at 4:34 PM David Hildenbrand <david@redhat.com> wrote:
> VM_WARN_ON_ONCE(!folio);
>
> Should allow for catching such bugs without causing any runtime overhead
> on non-debug kernels.
Aye. I have posted v2 of this patch and I have dropped the
folios_put_refs() patch.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] huge_mm.h: is_huge_zero_folio(NULL) should return false
2025-08-27 1:55 ` Andrew Morton
2025-08-27 4:39 ` Max Kellermann
@ 2025-08-27 9:35 ` David Hildenbrand
1 sibling, 0 replies; 27+ messages in thread
From: David Hildenbrand @ 2025-08-27 9:35 UTC (permalink / raw)
To: Andrew Morton, Max Kellermann
Cc: lorenzo.stoakes, ziy, baolin.wang, Liam.Howlett, npache,
ryan.roberts, dev.jain, baohua, shikemeng, kasong, nphamcs, bhe,
chrisl, linux-mm, linux-kernel
On 27.08.25 03:55, Andrew Morton wrote:
> On Wed, 27 Aug 2025 01:16:24 +0200 Max Kellermann <max.kellermann@ionos.com> wrote:
>
>> Calling is_huge_zero_folio(NULL) should not be legal - it makes no
>> sense, and a different (theoretical) implementation may dereference
>> the pointer. But currently, lacking any explicit documentation, this
>> call is legal.
>>
>> But if somebody really passes NULL, the function should not return
>> true - this isn't the huge zero folio after all! However, if the
>> `huge_zero_folio` hasn't been allocated yet, it's NULL, and
>> is_huge_zero_folio(NULL) just happens to return true, which is a lie.
>
> Isn't it a bug to call is_huge_zero_folio() before the huge_zero_folio
> has been created?
It's a bug to call a function that expects a folio without a folio.
>
> Being a simple soul, I'm thinking
>
> VM_BUG_ON(!huge_zero_folio);
> VM_BUG_ON(!folio);
We should just do VM_WARN_ON(!folio);
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 27+ messages in thread