linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] arm64: Check pxd_leaf() instead of !pxd_table() while tearing down page tables
@ 2025-05-15  6:34 Dev Jain
  2025-05-15  8:13 ` David Hildenbrand
  0 siblings, 1 reply; 16+ messages in thread
From: Dev Jain @ 2025-05-15  6:34 UTC (permalink / raw)
  To: catalin.marinas, will
  Cc: david, ryan.roberts, anshuman.khandual, mark.rutland, yang,
	linux-kernel, linux-arm-kernel, Dev Jain, stable

Commit 9c006972c3fe removes the pxd_present() checks because the caller
checks pxd_present(). But, in case of vmap_try_huge_pud(), the caller only
checks pud_present(); pud_free_pmd_page() recurses on each pmd through
pmd_free_pte_page(), wherein the pmd may be none. Thus it is possible to
hit a warning in the latter, since pmd_none => !pmd_table(). Thus, enforce
these checks again through pxd_leaf().
This problem was found by code inspection.
The patch is based on 6.15-rc6.

Fixes: 9c006972c3fe (arm64: mmu: drop pXd_present() checks from pXd_free_pYd_table())
Cc: <stable@vger.kernel.org>
Reported-by: Ryan Roberts <ryan.roberts@arm.com> 
Signed-off-by: Dev Jain <dev.jain@arm.com>
---
 arch/arm64/mm/mmu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index ea6695d53fb9..3d6789413a9b 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1255,7 +1255,7 @@ int pmd_free_pte_page(pmd_t *pmdp, unsigned long addr)
 
 	pmd = READ_ONCE(*pmdp);
 
-	if (!pmd_table(pmd)) {
+	if (pmd_leaf(pmd)) {
 		VM_WARN_ON(1);
 		return 1;
 	}
@@ -1276,7 +1276,7 @@ int pud_free_pmd_page(pud_t *pudp, unsigned long addr)
 
 	pud = READ_ONCE(*pudp);
 
-	if (!pud_table(pud)) {
+	if (pud_leaf(pud)) {
 		VM_WARN_ON(1);
 		return 1;
 	}
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] arm64: Check pxd_leaf() instead of !pxd_table() while tearing down page tables
  2025-05-15  6:34 [PATCH] arm64: Check pxd_leaf() instead of !pxd_table() while tearing down page tables Dev Jain
@ 2025-05-15  8:13 ` David Hildenbrand
  2025-05-15  8:22   ` Dev Jain
  0 siblings, 1 reply; 16+ messages in thread
From: David Hildenbrand @ 2025-05-15  8:13 UTC (permalink / raw)
  To: Dev Jain, catalin.marinas, will
  Cc: ryan.roberts, anshuman.khandual, mark.rutland, yang, linux-kernel,
	linux-arm-kernel, stable

On 15.05.25 08:34, Dev Jain wrote:
> Commit 9c006972c3fe removes the pxd_present() checks because the caller
> checks pxd_present(). But, in case of vmap_try_huge_pud(), the caller only
> checks pud_present(); pud_free_pmd_page() recurses on each pmd through
> pmd_free_pte_page(), wherein the pmd may be none.
The commit states: "The core code already has a check for pXd_none()", 
so I assume that assumption was not true in all cases?

Should that one problematic caller then check for pmd_none() instead?

If you were able to trigger this WARN, it's always a good idea to 
include the splat in the commit.

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] arm64: Check pxd_leaf() instead of !pxd_table() while tearing down page tables
  2025-05-15  8:13 ` David Hildenbrand
@ 2025-05-15  8:22   ` Dev Jain
  2025-05-15  8:36     ` David Hildenbrand
  0 siblings, 1 reply; 16+ messages in thread
From: Dev Jain @ 2025-05-15  8:22 UTC (permalink / raw)
  To: David Hildenbrand, catalin.marinas, will
  Cc: ryan.roberts, anshuman.khandual, mark.rutland, yang, linux-kernel,
	linux-arm-kernel, stable



On 15/05/25 1:43 pm, David Hildenbrand wrote:
> On 15.05.25 08:34, Dev Jain wrote:
>> Commit 9c006972c3fe removes the pxd_present() checks because the caller
>> checks pxd_present(). But, in case of vmap_try_huge_pud(), the caller 
>> only
>> checks pud_present(); pud_free_pmd_page() recurses on each pmd through
>> pmd_free_pte_page(), wherein the pmd may be none.
> The commit states: "The core code already has a check for pXd_none()", 
> so I assume that assumption was not true in all cases?
> 
> Should that one problematic caller then check for pmd_none() instead?

 From what I could gather of Will's commit message, my interpretation is 
that the concerned callers are vmap_try_huge_pud and vmap_try_huge_pmd. 
These individually check for pxd_present():

if (pmd_present(*pmd) && !pmd_free_pte_page(pmd, addr))
	return 0;

The problem is that vmap_try_huge_pud will also iterate on pte entries. 
So if the pud is present, then pud_free_pmd_page -> pmd_free_pte_page 
may encounter a none pmd and trigger a WARN.

> 
> If you were able to trigger this WARN, it's always a good idea to 
> include the splat in the commit.

I wasn't able to, it is just an observation from code inspection.

> 



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] arm64: Check pxd_leaf() instead of !pxd_table() while tearing down page tables
  2025-05-15  8:22   ` Dev Jain
@ 2025-05-15  8:36     ` David Hildenbrand
  2025-05-15  8:40       ` Dev Jain
  2025-05-15  8:47       ` Dev Jain
  0 siblings, 2 replies; 16+ messages in thread
From: David Hildenbrand @ 2025-05-15  8:36 UTC (permalink / raw)
  To: Dev Jain, catalin.marinas, will
  Cc: ryan.roberts, anshuman.khandual, mark.rutland, yang, linux-kernel,
	linux-arm-kernel, stable

On 15.05.25 10:22, Dev Jain wrote:
> 
> 
> On 15/05/25 1:43 pm, David Hildenbrand wrote:
>> On 15.05.25 08:34, Dev Jain wrote:
>>> Commit 9c006972c3fe removes the pxd_present() checks because the caller
>>> checks pxd_present(). But, in case of vmap_try_huge_pud(), the caller
>>> only
>>> checks pud_present(); pud_free_pmd_page() recurses on each pmd through
>>> pmd_free_pte_page(), wherein the pmd may be none.
>> The commit states: "The core code already has a check for pXd_none()",
>> so I assume that assumption was not true in all cases?
>>
>> Should that one problematic caller then check for pmd_none() instead?
> 
>   From what I could gather of Will's commit message, my interpretation is
> that the concerned callers are vmap_try_huge_pud and vmap_try_huge_pmd.
> These individually check for pxd_present():
> 
> if (pmd_present(*pmd) && !pmd_free_pte_page(pmd, addr))
> 	return 0;
> 
> The problem is that vmap_try_huge_pud will also iterate on pte entries.
> So if the pud is present, then pud_free_pmd_page -> pmd_free_pte_page
> may encounter a none pmd and trigger a WARN.

Yeah, pud_free_pmd_page()->pmd_free_pte_page() looks shaky.

I assume we should either have an explicit pmd_none() check in 
pud_free_pmd_page() before calling pmd_free_pte_page(), or one in 
pmd_free_pte_page().

With your patch, we'd be calling pte_free_kernel() on a NULL pointer, 
which sounds wrong -- unless I am missing something important.

> 
>>
>> If you were able to trigger this WARN, it's always a good idea to
>> include the splat in the commit.
> 
> I wasn't able to, it is just an observation from code inspection.

That better be included in the patch description :)

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] arm64: Check pxd_leaf() instead of !pxd_table() while tearing down page tables
  2025-05-15  8:36     ` David Hildenbrand
@ 2025-05-15  8:40       ` Dev Jain
  2025-05-15  8:49         ` David Hildenbrand
  2025-05-15  8:47       ` Dev Jain
  1 sibling, 1 reply; 16+ messages in thread
From: Dev Jain @ 2025-05-15  8:40 UTC (permalink / raw)
  To: David Hildenbrand, catalin.marinas, will
  Cc: ryan.roberts, anshuman.khandual, mark.rutland, yang, linux-kernel,
	linux-arm-kernel, stable



On 15/05/25 2:06 pm, David Hildenbrand wrote:
> On 15.05.25 10:22, Dev Jain wrote:
>>
>>
>> On 15/05/25 1:43 pm, David Hildenbrand wrote:
>>> On 15.05.25 08:34, Dev Jain wrote:
>>>> Commit 9c006972c3fe removes the pxd_present() checks because the caller
>>>> checks pxd_present(). But, in case of vmap_try_huge_pud(), the caller
>>>> only
>>>> checks pud_present(); pud_free_pmd_page() recurses on each pmd through
>>>> pmd_free_pte_page(), wherein the pmd may be none.
>>> The commit states: "The core code already has a check for pXd_none()",
>>> so I assume that assumption was not true in all cases?
>>>
>>> Should that one problematic caller then check for pmd_none() instead?
>>
>>   From what I could gather of Will's commit message, my interpretation is
>> that the concerned callers are vmap_try_huge_pud and vmap_try_huge_pmd.
>> These individually check for pxd_present():
>>
>> if (pmd_present(*pmd) && !pmd_free_pte_page(pmd, addr))
>>     return 0;
>>
>> The problem is that vmap_try_huge_pud will also iterate on pte entries.
>> So if the pud is present, then pud_free_pmd_page -> pmd_free_pte_page
>> may encounter a none pmd and trigger a WARN.
> 
> Yeah, pud_free_pmd_page()->pmd_free_pte_page() looks shaky.
> 
> I assume we should either have an explicit pmd_none() check in 
> pud_free_pmd_page() before calling pmd_free_pte_page(), or one in 
> pmd_free_pte_page().
> 
> With your patch, we'd be calling pte_free_kernel() on a NULL pointer, 
> which sounds wrong -- unless I am missing something important.
> 
>>
>>>
>>> If you were able to trigger this WARN, it's always a good idea to
>>> include the splat in the commit.
>>
>> I wasn't able to, it is just an observation from code inspection.
> 
> That better be included in the patch description :)

I did, actually. My bad for not putting in spaces, I notice now that the 
description looks horrible to the eye :)

> 



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] arm64: Check pxd_leaf() instead of !pxd_table() while tearing down page tables
  2025-05-15  8:36     ` David Hildenbrand
  2025-05-15  8:40       ` Dev Jain
@ 2025-05-15  8:47       ` Dev Jain
  2025-05-15  8:53         ` David Hildenbrand
  1 sibling, 1 reply; 16+ messages in thread
From: Dev Jain @ 2025-05-15  8:47 UTC (permalink / raw)
  To: David Hildenbrand, catalin.marinas, will
  Cc: ryan.roberts, anshuman.khandual, mark.rutland, yang, linux-kernel,
	linux-arm-kernel, stable



On 15/05/25 2:06 pm, David Hildenbrand wrote:
> On 15.05.25 10:22, Dev Jain wrote:
>>
>>
>> On 15/05/25 1:43 pm, David Hildenbrand wrote:
>>> On 15.05.25 08:34, Dev Jain wrote:
>>>> Commit 9c006972c3fe removes the pxd_present() checks because the caller
>>>> checks pxd_present(). But, in case of vmap_try_huge_pud(), the caller
>>>> only
>>>> checks pud_present(); pud_free_pmd_page() recurses on each pmd through
>>>> pmd_free_pte_page(), wherein the pmd may be none.
>>> The commit states: "The core code already has a check for pXd_none()",
>>> so I assume that assumption was not true in all cases?
>>>
>>> Should that one problematic caller then check for pmd_none() instead?
>>
>>   From what I could gather of Will's commit message, my interpretation is
>> that the concerned callers are vmap_try_huge_pud and vmap_try_huge_pmd.
>> These individually check for pxd_present():
>>
>> if (pmd_present(*pmd) && !pmd_free_pte_page(pmd, addr))
>>     return 0;
>>
>> The problem is that vmap_try_huge_pud will also iterate on pte entries.
>> So if the pud is present, then pud_free_pmd_page -> pmd_free_pte_page
>> may encounter a none pmd and trigger a WARN.
> 
> Yeah, pud_free_pmd_page()->pmd_free_pte_page() looks shaky.
> 
> I assume we should either have an explicit pmd_none() check in 
> pud_free_pmd_page() before calling pmd_free_pte_page(), or one in 
> pmd_free_pte_page().
> 
> With your patch, we'd be calling pte_free_kernel() on a NULL pointer, 
> which sounds wrong -- unless I am missing something important.

Ah thanks, you seem to be right. We will be extracting table from a none 
pmd. Perhaps we should still bail out for !pxd_present() but without the 
warning, which the fix commit used to do.

> 
>>
>>>
>>> If you were able to trigger this WARN, it's always a good idea to
>>> include the splat in the commit.
>>
>> I wasn't able to, it is just an observation from code inspection.
> 
> That better be included in the patch description :)
> 



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] arm64: Check pxd_leaf() instead of !pxd_table() while tearing down page tables
  2025-05-15  8:40       ` Dev Jain
@ 2025-05-15  8:49         ` David Hildenbrand
  0 siblings, 0 replies; 16+ messages in thread
From: David Hildenbrand @ 2025-05-15  8:49 UTC (permalink / raw)
  To: Dev Jain, catalin.marinas, will
  Cc: ryan.roberts, anshuman.khandual, mark.rutland, yang, linux-kernel,
	linux-arm-kernel, stable

On 15.05.25 10:40, Dev Jain wrote:
> 
> 
> On 15/05/25 2:06 pm, David Hildenbrand wrote:
>> On 15.05.25 10:22, Dev Jain wrote:
>>>
>>>
>>> On 15/05/25 1:43 pm, David Hildenbrand wrote:
>>>> On 15.05.25 08:34, Dev Jain wrote:
>>>>> Commit 9c006972c3fe removes the pxd_present() checks because the caller
>>>>> checks pxd_present(). But, in case of vmap_try_huge_pud(), the caller
>>>>> only
>>>>> checks pud_present(); pud_free_pmd_page() recurses on each pmd through
>>>>> pmd_free_pte_page(), wherein the pmd may be none.
>>>> The commit states: "The core code already has a check for pXd_none()",
>>>> so I assume that assumption was not true in all cases?
>>>>
>>>> Should that one problematic caller then check for pmd_none() instead?
>>>
>>>    From what I could gather of Will's commit message, my interpretation is
>>> that the concerned callers are vmap_try_huge_pud and vmap_try_huge_pmd.
>>> These individually check for pxd_present():
>>>
>>> if (pmd_present(*pmd) && !pmd_free_pte_page(pmd, addr))
>>>      return 0;
>>>
>>> The problem is that vmap_try_huge_pud will also iterate on pte entries.
>>> So if the pud is present, then pud_free_pmd_page -> pmd_free_pte_page
>>> may encounter a none pmd and trigger a WARN.
>>
>> Yeah, pud_free_pmd_page()->pmd_free_pte_page() looks shaky.
>>
>> I assume we should either have an explicit pmd_none() check in
>> pud_free_pmd_page() before calling pmd_free_pte_page(), or one in
>> pmd_free_pte_page().
>>
>> With your patch, we'd be calling pte_free_kernel() on a NULL pointer,
>> which sounds wrong -- unless I am missing something important.
>>
>>>
>>>>
>>>> If you were able to trigger this WARN, it's always a good idea to
>>>> include the splat in the commit.
>>>
>>> I wasn't able to, it is just an observation from code inspection.
>>
>> That better be included in the patch description :)
> 
> I did, actually. My bad for not putting in spaces, I notice now that the
> description looks horrible to the eye :)

Ahh, there it is. Sorry :) Yeah, some empty lines won't hurt!

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] arm64: Check pxd_leaf() instead of !pxd_table() while tearing down page tables
  2025-05-15  8:47       ` Dev Jain
@ 2025-05-15  8:53         ` David Hildenbrand
  2025-05-15  9:27           ` Dev Jain
  2025-05-15 10:07           ` Ryan Roberts
  0 siblings, 2 replies; 16+ messages in thread
From: David Hildenbrand @ 2025-05-15  8:53 UTC (permalink / raw)
  To: Dev Jain, catalin.marinas, will
  Cc: ryan.roberts, anshuman.khandual, mark.rutland, yang, linux-kernel,
	linux-arm-kernel, stable

On 15.05.25 10:47, Dev Jain wrote:
> 
> 
> On 15/05/25 2:06 pm, David Hildenbrand wrote:
>> On 15.05.25 10:22, Dev Jain wrote:
>>>
>>>
>>> On 15/05/25 1:43 pm, David Hildenbrand wrote:
>>>> On 15.05.25 08:34, Dev Jain wrote:
>>>>> Commit 9c006972c3fe removes the pxd_present() checks because the caller
>>>>> checks pxd_present(). But, in case of vmap_try_huge_pud(), the caller
>>>>> only
>>>>> checks pud_present(); pud_free_pmd_page() recurses on each pmd through
>>>>> pmd_free_pte_page(), wherein the pmd may be none.
>>>> The commit states: "The core code already has a check for pXd_none()",
>>>> so I assume that assumption was not true in all cases?
>>>>
>>>> Should that one problematic caller then check for pmd_none() instead?
>>>
>>>    From what I could gather of Will's commit message, my interpretation is
>>> that the concerned callers are vmap_try_huge_pud and vmap_try_huge_pmd.
>>> These individually check for pxd_present():
>>>
>>> if (pmd_present(*pmd) && !pmd_free_pte_page(pmd, addr))
>>>      return 0;
>>>
>>> The problem is that vmap_try_huge_pud will also iterate on pte entries.
>>> So if the pud is present, then pud_free_pmd_page -> pmd_free_pte_page
>>> may encounter a none pmd and trigger a WARN.
>>
>> Yeah, pud_free_pmd_page()->pmd_free_pte_page() looks shaky.
>>
>> I assume we should either have an explicit pmd_none() check in
>> pud_free_pmd_page() before calling pmd_free_pte_page(), or one in
>> pmd_free_pte_page().
>>
>> With your patch, we'd be calling pte_free_kernel() on a NULL pointer,
>> which sounds wrong -- unless I am missing something important.
> 
> Ah thanks, you seem to be right. We will be extracting table from a none
> pmd. Perhaps we should still bail out for !pxd_present() but without the
> warning, which the fix commit used to do.

Right. We just make sure that all callers of pmd_free_pte_page() already check for it.

I'd just do something like:

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 8fcf59ba39db7..e98dd7af147d5 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1274,10 +1274,8 @@ int pmd_free_pte_page(pmd_t *pmdp, unsigned long addr)
  
         pmd = READ_ONCE(*pmdp);
  
-       if (!pmd_table(pmd)) {
-               VM_WARN_ON(1);
-               return 1;
-       }
+       VM_WARN_ON(!pmd_present(pmd));
+       VM_WARN_ON(!pmd_table(pmd));
  
         table = pte_offset_kernel(pmdp, addr);
         pmd_clear(pmdp);
@@ -1305,7 +1303,8 @@ int pud_free_pmd_page(pud_t *pudp, unsigned long addr)
         next = addr;
         end = addr + PUD_SIZE;
         do {
-               pmd_free_pte_page(pmdp, next);
+               if (pmd_present(*pmdp))
+                       pmd_free_pte_page(pmdp, next);
         } while (pmdp++, next += PMD_SIZE, next != end);
  
         pud_clear(pudp);


-- 
Cheers,

David / dhildenb



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] arm64: Check pxd_leaf() instead of !pxd_table() while tearing down page tables
  2025-05-15  8:53         ` David Hildenbrand
@ 2025-05-15  9:27           ` Dev Jain
  2025-05-15  9:32             ` David Hildenbrand
  2025-05-15 10:07           ` Ryan Roberts
  1 sibling, 1 reply; 16+ messages in thread
From: Dev Jain @ 2025-05-15  9:27 UTC (permalink / raw)
  To: David Hildenbrand, catalin.marinas, will
  Cc: ryan.roberts, anshuman.khandual, mark.rutland, yang, linux-kernel,
	linux-arm-kernel, stable



On 15/05/25 2:23 pm, David Hildenbrand wrote:
> On 15.05.25 10:47, Dev Jain wrote:
>>
>>
>> On 15/05/25 2:06 pm, David Hildenbrand wrote:
>>> On 15.05.25 10:22, Dev Jain wrote:
>>>>
>>>>
>>>> On 15/05/25 1:43 pm, David Hildenbrand wrote:
>>>>> On 15.05.25 08:34, Dev Jain wrote:
>>>>>> Commit 9c006972c3fe removes the pxd_present() checks because the 
>>>>>> caller
>>>>>> checks pxd_present(). But, in case of vmap_try_huge_pud(), the caller
>>>>>> only
>>>>>> checks pud_present(); pud_free_pmd_page() recurses on each pmd 
>>>>>> through
>>>>>> pmd_free_pte_page(), wherein the pmd may be none.
>>>>> The commit states: "The core code already has a check for pXd_none()",
>>>>> so I assume that assumption was not true in all cases?
>>>>>
>>>>> Should that one problematic caller then check for pmd_none() instead?
>>>>
>>>>    From what I could gather of Will's commit message, my 
>>>> interpretation is
>>>> that the concerned callers are vmap_try_huge_pud and vmap_try_huge_pmd.
>>>> These individually check for pxd_present():
>>>>
>>>> if (pmd_present(*pmd) && !pmd_free_pte_page(pmd, addr))
>>>>      return 0;
>>>>
>>>> The problem is that vmap_try_huge_pud will also iterate on pte entries.
>>>> So if the pud is present, then pud_free_pmd_page -> pmd_free_pte_page
>>>> may encounter a none pmd and trigger a WARN.
>>>
>>> Yeah, pud_free_pmd_page()->pmd_free_pte_page() looks shaky.
>>>
>>> I assume we should either have an explicit pmd_none() check in
>>> pud_free_pmd_page() before calling pmd_free_pte_page(), or one in
>>> pmd_free_pte_page().
>>>
>>> With your patch, we'd be calling pte_free_kernel() on a NULL pointer,
>>> which sounds wrong -- unless I am missing something important.
>>
>> Ah thanks, you seem to be right. We will be extracting table from a none
>> pmd. Perhaps we should still bail out for !pxd_present() but without the
>> warning, which the fix commit used to do.
> 
> Right. We just make sure that all callers of pmd_free_pte_page() already 
> check for it.
> 
> I'd just do something like:
> 
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 8fcf59ba39db7..e98dd7af147d5 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -1274,10 +1274,8 @@ int pmd_free_pte_page(pmd_t *pmdp, unsigned long 
> addr)
> 
>          pmd = READ_ONCE(*pmdp);
> 
> -       if (!pmd_table(pmd)) {
> -               VM_WARN_ON(1);
> -               return 1;
> -       }
> +       VM_WARN_ON(!pmd_present(pmd));
> +       VM_WARN_ON(!pmd_table(pmd));

And also return 1?
Also we should BUG_ON(!pmd_present(pmd)) to avoid the null dereference?

> 
>          table = pte_offset_kernel(pmdp, addr);
>          pmd_clear(pmdp);
> @@ -1305,7 +1303,8 @@ int pud_free_pmd_page(pud_t *pudp, unsigned long 
> addr)
>          next = addr;
>          end = addr + PUD_SIZE;
>          do {
> -               pmd_free_pte_page(pmdp, next);
> +               if (pmd_present(*pmdp))
> +                       pmd_free_pte_page(pmdp, next);

Ah yes, the "caller" of pmd_free_pte_page() is not only 
vmap_try_huge_pmd but this also...my mind has been foggy lately...
need to solve a math problem or two to sharpen it :)

>          } while (pmdp++, next += PMD_SIZE, next != end);
> 
>          pud_clear(pudp);
> 
> 



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] arm64: Check pxd_leaf() instead of !pxd_table() while tearing down page tables
  2025-05-15  9:27           ` Dev Jain
@ 2025-05-15  9:32             ` David Hildenbrand
  2025-05-15 12:56               ` Will Deacon
  0 siblings, 1 reply; 16+ messages in thread
From: David Hildenbrand @ 2025-05-15  9:32 UTC (permalink / raw)
  To: Dev Jain, catalin.marinas, will
  Cc: ryan.roberts, anshuman.khandual, mark.rutland, yang, linux-kernel,
	linux-arm-kernel, stable

On 15.05.25 11:27, Dev Jain wrote:
> 
> 
> On 15/05/25 2:23 pm, David Hildenbrand wrote:
>> On 15.05.25 10:47, Dev Jain wrote:
>>>
>>>
>>> On 15/05/25 2:06 pm, David Hildenbrand wrote:
>>>> On 15.05.25 10:22, Dev Jain wrote:
>>>>>
>>>>>
>>>>> On 15/05/25 1:43 pm, David Hildenbrand wrote:
>>>>>> On 15.05.25 08:34, Dev Jain wrote:
>>>>>>> Commit 9c006972c3fe removes the pxd_present() checks because the
>>>>>>> caller
>>>>>>> checks pxd_present(). But, in case of vmap_try_huge_pud(), the caller
>>>>>>> only
>>>>>>> checks pud_present(); pud_free_pmd_page() recurses on each pmd
>>>>>>> through
>>>>>>> pmd_free_pte_page(), wherein the pmd may be none.
>>>>>> The commit states: "The core code already has a check for pXd_none()",
>>>>>> so I assume that assumption was not true in all cases?
>>>>>>
>>>>>> Should that one problematic caller then check for pmd_none() instead?
>>>>>
>>>>>     From what I could gather of Will's commit message, my
>>>>> interpretation is
>>>>> that the concerned callers are vmap_try_huge_pud and vmap_try_huge_pmd.
>>>>> These individually check for pxd_present():
>>>>>
>>>>> if (pmd_present(*pmd) && !pmd_free_pte_page(pmd, addr))
>>>>>       return 0;
>>>>>
>>>>> The problem is that vmap_try_huge_pud will also iterate on pte entries.
>>>>> So if the pud is present, then pud_free_pmd_page -> pmd_free_pte_page
>>>>> may encounter a none pmd and trigger a WARN.
>>>>
>>>> Yeah, pud_free_pmd_page()->pmd_free_pte_page() looks shaky.
>>>>
>>>> I assume we should either have an explicit pmd_none() check in
>>>> pud_free_pmd_page() before calling pmd_free_pte_page(), or one in
>>>> pmd_free_pte_page().
>>>>
>>>> With your patch, we'd be calling pte_free_kernel() on a NULL pointer,
>>>> which sounds wrong -- unless I am missing something important.
>>>
>>> Ah thanks, you seem to be right. We will be extracting table from a none
>>> pmd. Perhaps we should still bail out for !pxd_present() but without the
>>> warning, which the fix commit used to do.
>>
>> Right. We just make sure that all callers of pmd_free_pte_page() already
>> check for it.
>>
>> I'd just do something like:
>>
>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>> index 8fcf59ba39db7..e98dd7af147d5 100644
>> --- a/arch/arm64/mm/mmu.c
>> +++ b/arch/arm64/mm/mmu.c
>> @@ -1274,10 +1274,8 @@ int pmd_free_pte_page(pmd_t *pmdp, unsigned long
>> addr)
>>
>>           pmd = READ_ONCE(*pmdp);
>>
>> -       if (!pmd_table(pmd)) {
>> -               VM_WARN_ON(1);
>> -               return 1;
>> -       }
>> +       VM_WARN_ON(!pmd_present(pmd));
>> +       VM_WARN_ON(!pmd_table(pmd));
> 
> And also return 1?

I'll leave that to Catalin + Will.

I'm not a friend for adding runtime-overhead for soemthing that should 
not happen and be caught early during testing -> VM_WARN_ON_ONCE().

> Also we should BUG_ON(!pmd_present(pmd)) to avoid the null dereference?

Definitely no BUG_ON :)

> 
>>
>>           table = pte_offset_kernel(pmdp, addr);
>>           pmd_clear(pmdp);
>> @@ -1305,7 +1303,8 @@ int pud_free_pmd_page(pud_t *pudp, unsigned long
>> addr)
>>           next = addr;
>>           end = addr + PUD_SIZE;
>>           do {
>> -               pmd_free_pte_page(pmdp, next);
>> +               if (pmd_present(*pmdp))
>> +                       pmd_free_pte_page(pmdp, next);
> 
> Ah yes, the "caller" of pmd_free_pte_page() is not only
> vmap_try_huge_pmd but this also...my mind has been foggy lately...
> need to solve a math problem or two to sharpen it :)

Haha.

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] arm64: Check pxd_leaf() instead of !pxd_table() while tearing down page tables
  2025-05-15  8:53         ` David Hildenbrand
  2025-05-15  9:27           ` Dev Jain
@ 2025-05-15 10:07           ` Ryan Roberts
  2025-05-15 13:01             ` David Hildenbrand
  1 sibling, 1 reply; 16+ messages in thread
From: Ryan Roberts @ 2025-05-15 10:07 UTC (permalink / raw)
  To: David Hildenbrand, Dev Jain, catalin.marinas, will
  Cc: anshuman.khandual, mark.rutland, yang, linux-kernel,
	linux-arm-kernel, stable

On 15/05/2025 09:53, David Hildenbrand wrote:
> On 15.05.25 10:47, Dev Jain wrote:
>>
>>
>> On 15/05/25 2:06 pm, David Hildenbrand wrote:
>>> On 15.05.25 10:22, Dev Jain wrote:
>>>>
>>>>
>>>> On 15/05/25 1:43 pm, David Hildenbrand wrote:
>>>>> On 15.05.25 08:34, Dev Jain wrote:
>>>>>> Commit 9c006972c3fe removes the pxd_present() checks because the caller
>>>>>> checks pxd_present(). But, in case of vmap_try_huge_pud(), the caller
>>>>>> only
>>>>>> checks pud_present(); pud_free_pmd_page() recurses on each pmd through
>>>>>> pmd_free_pte_page(), wherein the pmd may be none.
>>>>> The commit states: "The core code already has a check for pXd_none()",
>>>>> so I assume that assumption was not true in all cases?
>>>>>
>>>>> Should that one problematic caller then check for pmd_none() instead?
>>>>
>>>>    From what I could gather of Will's commit message, my interpretation is
>>>> that the concerned callers are vmap_try_huge_pud and vmap_try_huge_pmd.
>>>> These individually check for pxd_present():
>>>>
>>>> if (pmd_present(*pmd) && !pmd_free_pte_page(pmd, addr))
>>>>      return 0;
>>>>
>>>> The problem is that vmap_try_huge_pud will also iterate on pte entries.
>>>> So if the pud is present, then pud_free_pmd_page -> pmd_free_pte_page
>>>> may encounter a none pmd and trigger a WARN.
>>>
>>> Yeah, pud_free_pmd_page()->pmd_free_pte_page() looks shaky.
>>>
>>> I assume we should either have an explicit pmd_none() check in
>>> pud_free_pmd_page() before calling pmd_free_pte_page(), or one in
>>> pmd_free_pte_page().
>>>
>>> With your patch, we'd be calling pte_free_kernel() on a NULL pointer,
>>> which sounds wrong -- unless I am missing something important.
>>
>> Ah thanks, you seem to be right. We will be extracting table from a none
>> pmd. Perhaps we should still bail out for !pxd_present() but without the
>> warning, which the fix commit used to do.
> 
> Right. We just make sure that all callers of pmd_free_pte_page() already check
> for it.
> 
> I'd just do something like:

I just reviewed the patch and had the same feedback as David. I agree with the
patch below, with some small mods...

> 
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 8fcf59ba39db7..e98dd7af147d5 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -1274,10 +1274,8 @@ int pmd_free_pte_page(pmd_t *pmdp, unsigned long addr)
>  
>         pmd = READ_ONCE(*pmdp);
>  
> -       if (!pmd_table(pmd)) {
> -               VM_WARN_ON(1);
> -               return 1;
> -       }
> +       VM_WARN_ON(!pmd_present(pmd));
> +       VM_WARN_ON(!pmd_table(pmd));

You don't need both of these warnings; pmd_table() is only true if the pmd is
present (well actually only if it's _valid_ which is more strict than present),
so the second one is sufficient on its own.

>  
>         table = pte_offset_kernel(pmdp, addr);
>         pmd_clear(pmdp);
> @@ -1305,7 +1303,8 @@ int pud_free_pmd_page(pud_t *pudp, unsigned long addr)

Given you are removing the runtime check and early return in
pmd_free_pte_page(), I think you should modify this function to use the same
style too.

>         next = addr;
>         end = addr + PUD_SIZE;
>         do {
> -               pmd_free_pte_page(pmdp, next);
> +               if (pmd_present(*pmdp))

question: I wonder if it is better to use !pmd_none() as the condition here? It
should either be none or a table at this point, so this allows the warning in
pmd_free_pte_page() to catch more error conditions. No strong opinion though.

Thanks,
Ryan

> +                       pmd_free_pte_page(pmdp, next);
>         } while (pmdp++, next += PMD_SIZE, next != end);
>  
>         pud_clear(pudp);
> 
> 



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] arm64: Check pxd_leaf() instead of !pxd_table() while tearing down page tables
  2025-05-15  9:32             ` David Hildenbrand
@ 2025-05-15 12:56               ` Will Deacon
  2025-05-15 13:04                 ` David Hildenbrand
  0 siblings, 1 reply; 16+ messages in thread
From: Will Deacon @ 2025-05-15 12:56 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Dev Jain, catalin.marinas, ryan.roberts, anshuman.khandual,
	mark.rutland, yang, linux-kernel, linux-arm-kernel, stable

On Thu, May 15, 2025 at 11:32:22AM +0200, David Hildenbrand wrote:
> On 15.05.25 11:27, Dev Jain wrote:
> > 
> > 
> > On 15/05/25 2:23 pm, David Hildenbrand wrote:
> > > On 15.05.25 10:47, Dev Jain wrote:
> > > > 
> > > > 
> > > > On 15/05/25 2:06 pm, David Hildenbrand wrote:
> > > > > On 15.05.25 10:22, Dev Jain wrote:
> > > > > > 
> > > > > > 
> > > > > > On 15/05/25 1:43 pm, David Hildenbrand wrote:
> > > > > > > On 15.05.25 08:34, Dev Jain wrote:
> > > > > > > > Commit 9c006972c3fe removes the pxd_present() checks because the
> > > > > > > > caller
> > > > > > > > checks pxd_present(). But, in case of vmap_try_huge_pud(), the caller
> > > > > > > > only
> > > > > > > > checks pud_present(); pud_free_pmd_page() recurses on each pmd
> > > > > > > > through
> > > > > > > > pmd_free_pte_page(), wherein the pmd may be none.
> > > > > > > The commit states: "The core code already has a check for pXd_none()",
> > > > > > > so I assume that assumption was not true in all cases?
> > > > > > > 
> > > > > > > Should that one problematic caller then check for pmd_none() instead?
> > > > > > 
> > > > > >     From what I could gather of Will's commit message, my
> > > > > > interpretation is
> > > > > > that the concerned callers are vmap_try_huge_pud and vmap_try_huge_pmd.
> > > > > > These individually check for pxd_present():
> > > > > > 
> > > > > > if (pmd_present(*pmd) && !pmd_free_pte_page(pmd, addr))
> > > > > >       return 0;
> > > > > > 
> > > > > > The problem is that vmap_try_huge_pud will also iterate on pte entries.
> > > > > > So if the pud is present, then pud_free_pmd_page -> pmd_free_pte_page
> > > > > > may encounter a none pmd and trigger a WARN.
> > > > > 
> > > > > Yeah, pud_free_pmd_page()->pmd_free_pte_page() looks shaky.
> > > > > 
> > > > > I assume we should either have an explicit pmd_none() check in
> > > > > pud_free_pmd_page() before calling pmd_free_pte_page(), or one in
> > > > > pmd_free_pte_page().
> > > > > 
> > > > > With your patch, we'd be calling pte_free_kernel() on a NULL pointer,
> > > > > which sounds wrong -- unless I am missing something important.
> > > > 
> > > > Ah thanks, you seem to be right. We will be extracting table from a none
> > > > pmd. Perhaps we should still bail out for !pxd_present() but without the
> > > > warning, which the fix commit used to do.
> > > 
> > > Right. We just make sure that all callers of pmd_free_pte_page() already
> > > check for it.
> > > 
> > > I'd just do something like:
> > > 
> > > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> > > index 8fcf59ba39db7..e98dd7af147d5 100644
> > > --- a/arch/arm64/mm/mmu.c
> > > +++ b/arch/arm64/mm/mmu.c
> > > @@ -1274,10 +1274,8 @@ int pmd_free_pte_page(pmd_t *pmdp, unsigned long
> > > addr)
> > > 
> > >           pmd = READ_ONCE(*pmdp);
> > > 
> > > -       if (!pmd_table(pmd)) {
> > > -               VM_WARN_ON(1);
> > > -               return 1;
> > > -       }
> > > +       VM_WARN_ON(!pmd_present(pmd));
> > > +       VM_WARN_ON(!pmd_table(pmd));
> > 
> > And also return 1?
> 
> I'll leave that to Catalin + Will.
> 
> I'm not a friend for adding runtime-overhead for soemthing that should not
> happen and be caught early during testing -> VM_WARN_ON_ONCE().

I definitely think we should return early if the pmd isn't a table.
Otherwise, we could end up descending into God-knows-what!

Will


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] arm64: Check pxd_leaf() instead of !pxd_table() while tearing down page tables
  2025-05-15 10:07           ` Ryan Roberts
@ 2025-05-15 13:01             ` David Hildenbrand
  2025-05-15 13:14               ` Ryan Roberts
  0 siblings, 1 reply; 16+ messages in thread
From: David Hildenbrand @ 2025-05-15 13:01 UTC (permalink / raw)
  To: Ryan Roberts, Dev Jain, catalin.marinas, will
  Cc: anshuman.khandual, mark.rutland, yang, linux-kernel,
	linux-arm-kernel, stable

On 15.05.25 12:07, Ryan Roberts wrote:
> On 15/05/2025 09:53, David Hildenbrand wrote:
>> On 15.05.25 10:47, Dev Jain wrote:
>>>
>>>
>>> On 15/05/25 2:06 pm, David Hildenbrand wrote:
>>>> On 15.05.25 10:22, Dev Jain wrote:
>>>>>
>>>>>
>>>>> On 15/05/25 1:43 pm, David Hildenbrand wrote:
>>>>>> On 15.05.25 08:34, Dev Jain wrote:
>>>>>>> Commit 9c006972c3fe removes the pxd_present() checks because the caller
>>>>>>> checks pxd_present(). But, in case of vmap_try_huge_pud(), the caller
>>>>>>> only
>>>>>>> checks pud_present(); pud_free_pmd_page() recurses on each pmd through
>>>>>>> pmd_free_pte_page(), wherein the pmd may be none.
>>>>>> The commit states: "The core code already has a check for pXd_none()",
>>>>>> so I assume that assumption was not true in all cases?
>>>>>>
>>>>>> Should that one problematic caller then check for pmd_none() instead?
>>>>>
>>>>>     From what I could gather of Will's commit message, my interpretation is
>>>>> that the concerned callers are vmap_try_huge_pud and vmap_try_huge_pmd.
>>>>> These individually check for pxd_present():
>>>>>
>>>>> if (pmd_present(*pmd) && !pmd_free_pte_page(pmd, addr))
>>>>>       return 0;
>>>>>
>>>>> The problem is that vmap_try_huge_pud will also iterate on pte entries.
>>>>> So if the pud is present, then pud_free_pmd_page -> pmd_free_pte_page
>>>>> may encounter a none pmd and trigger a WARN.
>>>>
>>>> Yeah, pud_free_pmd_page()->pmd_free_pte_page() looks shaky.
>>>>
>>>> I assume we should either have an explicit pmd_none() check in
>>>> pud_free_pmd_page() before calling pmd_free_pte_page(), or one in
>>>> pmd_free_pte_page().
>>>>
>>>> With your patch, we'd be calling pte_free_kernel() on a NULL pointer,
>>>> which sounds wrong -- unless I am missing something important.
>>>
>>> Ah thanks, you seem to be right. We will be extracting table from a none
>>> pmd. Perhaps we should still bail out for !pxd_present() but without the
>>> warning, which the fix commit used to do.
>>
>> Right. We just make sure that all callers of pmd_free_pte_page() already check
>> for it.
>>
>> I'd just do something like:
> 
> I just reviewed the patch and had the same feedback as David. I agree with the
> patch below, with some small mods...
> 
>>
>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>> index 8fcf59ba39db7..e98dd7af147d5 100644
>> --- a/arch/arm64/mm/mmu.c
>> +++ b/arch/arm64/mm/mmu.c
>> @@ -1274,10 +1274,8 @@ int pmd_free_pte_page(pmd_t *pmdp, unsigned long addr)
>>   
>>          pmd = READ_ONCE(*pmdp);
>>   
>> -       if (!pmd_table(pmd)) {
>> -               VM_WARN_ON(1);
>> -               return 1;
>> -       }
>> +       VM_WARN_ON(!pmd_present(pmd));
>> +       VM_WARN_ON(!pmd_table(pmd));
> 
> You don't need both of these warnings; pmd_table() is only true if the pmd is
> present (well actually only if it's _valid_ which is more strict than present),
> so the second one is sufficient on its own.

Ah, right.

> 
>>   
>>          table = pte_offset_kernel(pmdp, addr);
>>          pmd_clear(pmdp);
>> @@ -1305,7 +1303,8 @@ int pud_free_pmd_page(pud_t *pudp, unsigned long addr)
> 
> Given you are removing the runtime check and early return in
> pmd_free_pte_page(), I think you should modify this function to use the same
> style too.

BTW, the "return 1" is weird. But looking at x86, we seem to be making a 
private copy of the page table first, to defer freeing the page tables 
after the TLB flush.

I wonder if there isn't a better way (e.g., clear PUDP + flush tlb, then 
walk over the effectively-disconnected page table). But I'm sure there 
is a good reason for that.

> 
>>          next = addr;
>>          end = addr + PUD_SIZE;
>>          do {
>> -               pmd_free_pte_page(pmdp, next);
>> +               if (pmd_present(*pmdp))
> 
> question: I wonder if it is better to use !pmd_none() as the condition here? It
> should either be none or a table at this point, so this allows the warning in
> pmd_free_pte_page() to catch more error conditions. No strong opinion though.

Same here. The existing callers check pmd_present().

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] arm64: Check pxd_leaf() instead of !pxd_table() while tearing down page tables
  2025-05-15 12:56               ` Will Deacon
@ 2025-05-15 13:04                 ` David Hildenbrand
  2025-05-15 13:21                   ` Will Deacon
  0 siblings, 1 reply; 16+ messages in thread
From: David Hildenbrand @ 2025-05-15 13:04 UTC (permalink / raw)
  To: Will Deacon
  Cc: Dev Jain, catalin.marinas, ryan.roberts, anshuman.khandual,
	mark.rutland, yang, linux-kernel, linux-arm-kernel, stable

On 15.05.25 14:56, Will Deacon wrote:
> On Thu, May 15, 2025 at 11:32:22AM +0200, David Hildenbrand wrote:
>> On 15.05.25 11:27, Dev Jain wrote:
>>>
>>>
>>> On 15/05/25 2:23 pm, David Hildenbrand wrote:
>>>> On 15.05.25 10:47, Dev Jain wrote:
>>>>>
>>>>>
>>>>> On 15/05/25 2:06 pm, David Hildenbrand wrote:
>>>>>> On 15.05.25 10:22, Dev Jain wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 15/05/25 1:43 pm, David Hildenbrand wrote:
>>>>>>>> On 15.05.25 08:34, Dev Jain wrote:
>>>>>>>>> Commit 9c006972c3fe removes the pxd_present() checks because the
>>>>>>>>> caller
>>>>>>>>> checks pxd_present(). But, in case of vmap_try_huge_pud(), the caller
>>>>>>>>> only
>>>>>>>>> checks pud_present(); pud_free_pmd_page() recurses on each pmd
>>>>>>>>> through
>>>>>>>>> pmd_free_pte_page(), wherein the pmd may be none.
>>>>>>>> The commit states: "The core code already has a check for pXd_none()",
>>>>>>>> so I assume that assumption was not true in all cases?
>>>>>>>>
>>>>>>>> Should that one problematic caller then check for pmd_none() instead?
>>>>>>>
>>>>>>>      From what I could gather of Will's commit message, my
>>>>>>> interpretation is
>>>>>>> that the concerned callers are vmap_try_huge_pud and vmap_try_huge_pmd.
>>>>>>> These individually check for pxd_present():
>>>>>>>
>>>>>>> if (pmd_present(*pmd) && !pmd_free_pte_page(pmd, addr))
>>>>>>>        return 0;
>>>>>>>
>>>>>>> The problem is that vmap_try_huge_pud will also iterate on pte entries.
>>>>>>> So if the pud is present, then pud_free_pmd_page -> pmd_free_pte_page
>>>>>>> may encounter a none pmd and trigger a WARN.
>>>>>>
>>>>>> Yeah, pud_free_pmd_page()->pmd_free_pte_page() looks shaky.
>>>>>>
>>>>>> I assume we should either have an explicit pmd_none() check in
>>>>>> pud_free_pmd_page() before calling pmd_free_pte_page(), or one in
>>>>>> pmd_free_pte_page().
>>>>>>
>>>>>> With your patch, we'd be calling pte_free_kernel() on a NULL pointer,
>>>>>> which sounds wrong -- unless I am missing something important.
>>>>>
>>>>> Ah thanks, you seem to be right. We will be extracting table from a none
>>>>> pmd. Perhaps we should still bail out for !pxd_present() but without the
>>>>> warning, which the fix commit used to do.
>>>>
>>>> Right. We just make sure that all callers of pmd_free_pte_page() already
>>>> check for it.
>>>>
>>>> I'd just do something like:
>>>>
>>>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>>>> index 8fcf59ba39db7..e98dd7af147d5 100644
>>>> --- a/arch/arm64/mm/mmu.c
>>>> +++ b/arch/arm64/mm/mmu.c
>>>> @@ -1274,10 +1274,8 @@ int pmd_free_pte_page(pmd_t *pmdp, unsigned long
>>>> addr)
>>>>
>>>>            pmd = READ_ONCE(*pmdp);
>>>>
>>>> -       if (!pmd_table(pmd)) {
>>>> -               VM_WARN_ON(1);
>>>> -               return 1;
>>>> -       }
>>>> +       VM_WARN_ON(!pmd_present(pmd));
>>>> +       VM_WARN_ON(!pmd_table(pmd));
>>>
>>> And also return 1?
>>
>> I'll leave that to Catalin + Will.
>>
>> I'm not a friend for adding runtime-overhead for soemthing that should not
>> happen and be caught early during testing -> VM_WARN_ON_ONCE().
> 
> I definitely think we should return early if the pmd isn't a table.
> Otherwise, we could end up descending into God-knows-what!

The question is: how did something that is not a table end up here, and 
why is it valid to check exactly that at runtime. Not strong opinion, it 
just feels a bit arbitrary to test for exactly that at runtime if it is 
completely unexpected.

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] arm64: Check pxd_leaf() instead of !pxd_table() while tearing down page tables
  2025-05-15 13:01             ` David Hildenbrand
@ 2025-05-15 13:14               ` Ryan Roberts
  0 siblings, 0 replies; 16+ messages in thread
From: Ryan Roberts @ 2025-05-15 13:14 UTC (permalink / raw)
  To: David Hildenbrand, Dev Jain, catalin.marinas, will
  Cc: anshuman.khandual, mark.rutland, yang, linux-kernel,
	linux-arm-kernel, stable

On 15/05/2025 14:01, David Hildenbrand wrote:
> On 15.05.25 12:07, Ryan Roberts wrote:
>> On 15/05/2025 09:53, David Hildenbrand wrote:
>>> On 15.05.25 10:47, Dev Jain wrote:
>>>>
>>>>
>>>> On 15/05/25 2:06 pm, David Hildenbrand wrote:
>>>>> On 15.05.25 10:22, Dev Jain wrote:
>>>>>>
>>>>>>
>>>>>> On 15/05/25 1:43 pm, David Hildenbrand wrote:
>>>>>>> On 15.05.25 08:34, Dev Jain wrote:
>>>>>>>> Commit 9c006972c3fe removes the pxd_present() checks because the caller
>>>>>>>> checks pxd_present(). But, in case of vmap_try_huge_pud(), the caller
>>>>>>>> only
>>>>>>>> checks pud_present(); pud_free_pmd_page() recurses on each pmd through
>>>>>>>> pmd_free_pte_page(), wherein the pmd may be none.
>>>>>>> The commit states: "The core code already has a check for pXd_none()",
>>>>>>> so I assume that assumption was not true in all cases?
>>>>>>>
>>>>>>> Should that one problematic caller then check for pmd_none() instead?
>>>>>>
>>>>>>     From what I could gather of Will's commit message, my interpretation is
>>>>>> that the concerned callers are vmap_try_huge_pud and vmap_try_huge_pmd.
>>>>>> These individually check for pxd_present():
>>>>>>
>>>>>> if (pmd_present(*pmd) && !pmd_free_pte_page(pmd, addr))
>>>>>>       return 0;
>>>>>>
>>>>>> The problem is that vmap_try_huge_pud will also iterate on pte entries.
>>>>>> So if the pud is present, then pud_free_pmd_page -> pmd_free_pte_page
>>>>>> may encounter a none pmd and trigger a WARN.
>>>>>
>>>>> Yeah, pud_free_pmd_page()->pmd_free_pte_page() looks shaky.
>>>>>
>>>>> I assume we should either have an explicit pmd_none() check in
>>>>> pud_free_pmd_page() before calling pmd_free_pte_page(), or one in
>>>>> pmd_free_pte_page().
>>>>>
>>>>> With your patch, we'd be calling pte_free_kernel() on a NULL pointer,
>>>>> which sounds wrong -- unless I am missing something important.
>>>>
>>>> Ah thanks, you seem to be right. We will be extracting table from a none
>>>> pmd. Perhaps we should still bail out for !pxd_present() but without the
>>>> warning, which the fix commit used to do.
>>>
>>> Right. We just make sure that all callers of pmd_free_pte_page() already check
>>> for it.
>>>
>>> I'd just do something like:
>>
>> I just reviewed the patch and had the same feedback as David. I agree with the
>> patch below, with some small mods...
>>
>>>
>>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>>> index 8fcf59ba39db7..e98dd7af147d5 100644
>>> --- a/arch/arm64/mm/mmu.c
>>> +++ b/arch/arm64/mm/mmu.c
>>> @@ -1274,10 +1274,8 @@ int pmd_free_pte_page(pmd_t *pmdp, unsigned long addr)
>>>            pmd = READ_ONCE(*pmdp);
>>>   -       if (!pmd_table(pmd)) {
>>> -               VM_WARN_ON(1);
>>> -               return 1;
>>> -       }
>>> +       VM_WARN_ON(!pmd_present(pmd));
>>> +       VM_WARN_ON(!pmd_table(pmd));
>>
>> You don't need both of these warnings; pmd_table() is only true if the pmd is
>> present (well actually only if it's _valid_ which is more strict than present),
>> so the second one is sufficient on its own.
> 
> Ah, right.
> 
>>
>>>            table = pte_offset_kernel(pmdp, addr);
>>>          pmd_clear(pmdp);
>>> @@ -1305,7 +1303,8 @@ int pud_free_pmd_page(pud_t *pudp, unsigned long addr)
>>
>> Given you are removing the runtime check and early return in
>> pmd_free_pte_page(), I think you should modify this function to use the same
>> style too.
> 
> BTW, the "return 1" is weird. But looking at x86, we seem to be making a private
> copy of the page table first, to defer freeing the page tables after the TLB flush.
> 
> I wonder if there isn't a better way (e.g., clear PUDP + flush tlb, then walk
> over the effectively-disconnected page table). But I'm sure there is a good
> reason for that.

As I understand it, the actual TLB entries should have been invalidated when the
previous mappings we vfree'd. So the single page __flush_tlb_kernel_pgtable()
calls here are to zap any table entries that may be in the walk cache. We could
do an all-levels TLBI for the entire range, but for a system that doesn't
support the tlbi-range operations, we would end up issuing a tlbi per page
across the whole range which I think would be much slower than the one tlbi per
pgtable we have here.

Things could be rearranged a bit so that we issue all the tlbis with only a
single set of barriers (currently each __flush_tlb_kernel_pgtable() issues it's
own barriers), but I'm not sure how important that micro-optimization is given I
guess we never even call pud_free_pmd_page() in practice given we have had no
reports of the warning tripping.

> 
>>
>>>          next = addr;
>>>          end = addr + PUD_SIZE;
>>>          do {
>>> -               pmd_free_pte_page(pmdp, next);
>>> +               if (pmd_present(*pmdp))
>>
>> question: I wonder if it is better to use !pmd_none() as the condition here? It
>> should either be none or a table at this point, so this allows the warning in
>> pmd_free_pte_page() to catch more error conditions. No strong opinion though.
> 
> Same here. The existing callers check pmd_present().

Yeah fair let's be consistent and use pmd_present().

Thanks,
Ryan



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] arm64: Check pxd_leaf() instead of !pxd_table() while tearing down page tables
  2025-05-15 13:04                 ` David Hildenbrand
@ 2025-05-15 13:21                   ` Will Deacon
  0 siblings, 0 replies; 16+ messages in thread
From: Will Deacon @ 2025-05-15 13:21 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Dev Jain, catalin.marinas, ryan.roberts, anshuman.khandual,
	mark.rutland, yang, linux-kernel, linux-arm-kernel, stable

On Thu, May 15, 2025 at 03:04:50PM +0200, David Hildenbrand wrote:
> On 15.05.25 14:56, Will Deacon wrote:
> > On Thu, May 15, 2025 at 11:32:22AM +0200, David Hildenbrand wrote:
> > > On 15.05.25 11:27, Dev Jain wrote:
> > > > 
> > > > 
> > > > On 15/05/25 2:23 pm, David Hildenbrand wrote:
> > > > > On 15.05.25 10:47, Dev Jain wrote:
> > > > > > 
> > > > > > 
> > > > > > On 15/05/25 2:06 pm, David Hildenbrand wrote:
> > > > > > > On 15.05.25 10:22, Dev Jain wrote:
> > > > > > > > 
> > > > > > > > 
> > > > > > > > On 15/05/25 1:43 pm, David Hildenbrand wrote:
> > > > > > > > > On 15.05.25 08:34, Dev Jain wrote:
> > > > > > > > > > Commit 9c006972c3fe removes the pxd_present() checks because the
> > > > > > > > > > caller
> > > > > > > > > > checks pxd_present(). But, in case of vmap_try_huge_pud(), the caller
> > > > > > > > > > only
> > > > > > > > > > checks pud_present(); pud_free_pmd_page() recurses on each pmd
> > > > > > > > > > through
> > > > > > > > > > pmd_free_pte_page(), wherein the pmd may be none.
> > > > > > > > > The commit states: "The core code already has a check for pXd_none()",
> > > > > > > > > so I assume that assumption was not true in all cases?
> > > > > > > > > 
> > > > > > > > > Should that one problematic caller then check for pmd_none() instead?
> > > > > > > > 
> > > > > > > >      From what I could gather of Will's commit message, my
> > > > > > > > interpretation is
> > > > > > > > that the concerned callers are vmap_try_huge_pud and vmap_try_huge_pmd.
> > > > > > > > These individually check for pxd_present():
> > > > > > > > 
> > > > > > > > if (pmd_present(*pmd) && !pmd_free_pte_page(pmd, addr))
> > > > > > > >        return 0;
> > > > > > > > 
> > > > > > > > The problem is that vmap_try_huge_pud will also iterate on pte entries.
> > > > > > > > So if the pud is present, then pud_free_pmd_page -> pmd_free_pte_page
> > > > > > > > may encounter a none pmd and trigger a WARN.
> > > > > > > 
> > > > > > > Yeah, pud_free_pmd_page()->pmd_free_pte_page() looks shaky.
> > > > > > > 
> > > > > > > I assume we should either have an explicit pmd_none() check in
> > > > > > > pud_free_pmd_page() before calling pmd_free_pte_page(), or one in
> > > > > > > pmd_free_pte_page().
> > > > > > > 
> > > > > > > With your patch, we'd be calling pte_free_kernel() on a NULL pointer,
> > > > > > > which sounds wrong -- unless I am missing something important.
> > > > > > 
> > > > > > Ah thanks, you seem to be right. We will be extracting table from a none
> > > > > > pmd. Perhaps we should still bail out for !pxd_present() but without the
> > > > > > warning, which the fix commit used to do.
> > > > > 
> > > > > Right. We just make sure that all callers of pmd_free_pte_page() already
> > > > > check for it.
> > > > > 
> > > > > I'd just do something like:
> > > > > 
> > > > > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> > > > > index 8fcf59ba39db7..e98dd7af147d5 100644
> > > > > --- a/arch/arm64/mm/mmu.c
> > > > > +++ b/arch/arm64/mm/mmu.c
> > > > > @@ -1274,10 +1274,8 @@ int pmd_free_pte_page(pmd_t *pmdp, unsigned long
> > > > > addr)
> > > > > 
> > > > >            pmd = READ_ONCE(*pmdp);
> > > > > 
> > > > > -       if (!pmd_table(pmd)) {
> > > > > -               VM_WARN_ON(1);
> > > > > -               return 1;
> > > > > -       }
> > > > > +       VM_WARN_ON(!pmd_present(pmd));
> > > > > +       VM_WARN_ON(!pmd_table(pmd));
> > > > 
> > > > And also return 1?
> > > 
> > > I'll leave that to Catalin + Will.
> > > 
> > > I'm not a friend for adding runtime-overhead for soemthing that should not
> > > happen and be caught early during testing -> VM_WARN_ON_ONCE().
> > 
> > I definitely think we should return early if the pmd isn't a table.
> > Otherwise, we could end up descending into God-knows-what!
> 
> The question is: how did something that is not a table end up here, and why
> is it valid to check exactly that at runtime. Not strong opinion, it just
> feels a bit arbitrary to test for exactly that at runtime if it is
> completely unexpected.

I see it a little bit like type-checking: we could see an invalid entry,
a leaf entry or a table entry and we should only ever dereference the
latter. If the VM_WARN_ON() is justified, then I find it jarring to go
ahead with the dereference regardless of the type.

That said, maybe the VM_WARN_ON() should either be deleted or moved out
to the callers in mm/vmalloc.c?

Will


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2025-05-15 13:30 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-15  6:34 [PATCH] arm64: Check pxd_leaf() instead of !pxd_table() while tearing down page tables Dev Jain
2025-05-15  8:13 ` David Hildenbrand
2025-05-15  8:22   ` Dev Jain
2025-05-15  8:36     ` David Hildenbrand
2025-05-15  8:40       ` Dev Jain
2025-05-15  8:49         ` David Hildenbrand
2025-05-15  8:47       ` Dev Jain
2025-05-15  8:53         ` David Hildenbrand
2025-05-15  9:27           ` Dev Jain
2025-05-15  9:32             ` David Hildenbrand
2025-05-15 12:56               ` Will Deacon
2025-05-15 13:04                 ` David Hildenbrand
2025-05-15 13:21                   ` Will Deacon
2025-05-15 10:07           ` Ryan Roberts
2025-05-15 13:01             ` David Hildenbrand
2025-05-15 13:14               ` Ryan Roberts

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).