linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] arm64/ptdump: Ensure memory hotplug is prevented during ptdump_check_wx()
@ 2025-06-09  4:12 Anshuman Khandual
  2025-06-09  5:09 ` Dev Jain
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Anshuman Khandual @ 2025-06-09  4:12 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Anshuman Khandual, stable, Catalin Marinas, Will Deacon,
	Ryan Roberts, linux-kernel, Dev Jain

The arm64 page table dump code can race with concurrent modification of the
kernel page tables. When a leaf entries are modified concurrently, the dump
code may log stale or inconsistent information for a VA range, but this is
otherwise not harmful.

When intermediate levels of table are freed, the dump code will continue to
use memory which has been freed and potentially reallocated for another
purpose. In such cases, the dump code may dereference bogus addresses,
leading to a number of potential problems.

This problem was fixed for ptdump_show() earlier via commit 'bf2b59f60ee1
("arm64/mm: Hold memory hotplug lock while walking for kernel page table
dump")' but a same was missed for ptdump_check_wx() which faced the race
condition as well. Let's just take the memory hotplug lock while executing
ptdump_check_wx().

Cc: stable@vger.kernel.org
Fixes: bbd6ec605c0f ("arm64/mm: Enable memory hot remove")
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Reported-by: Dev Jain <dev.jain@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
This patch applies on v6.16-rc1

Dev Jain found this via code inspection.

 arch/arm64/mm/ptdump.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index 421a5de806c62..551f80d41e8d2 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -328,7 +328,7 @@ static struct ptdump_info kernel_ptdump_info __ro_after_init = {
 	.mm		= &init_mm,
 };
 
-bool ptdump_check_wx(void)
+static bool __ptdump_check_wx(void)
 {
 	struct ptdump_pg_state st = {
 		.seq = NULL,
@@ -367,6 +367,16 @@ bool ptdump_check_wx(void)
 	}
 }
 
+bool ptdump_check_wx(void)
+{
+	bool ret;
+
+	get_online_mems();
+	ret = __ptdump_check_wx();
+	put_online_mems();
+	return ret;
+}
+
 static int __init ptdump_init(void)
 {
 	u64 page_offset = _PAGE_OFFSET(vabits_actual);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] arm64/ptdump: Ensure memory hotplug is prevented during ptdump_check_wx()
  2025-06-09  4:12 [PATCH] arm64/ptdump: Ensure memory hotplug is prevented during ptdump_check_wx() Anshuman Khandual
@ 2025-06-09  5:09 ` Dev Jain
  2025-06-09 11:10 ` Ryan Roberts
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Dev Jain @ 2025-06-09  5:09 UTC (permalink / raw)
  To: Anshuman Khandual, linux-arm-kernel
  Cc: stable, Catalin Marinas, Will Deacon, Ryan Roberts, linux-kernel


On 09/06/25 9:42 am, Anshuman Khandual wrote:
> The arm64 page table dump code can race with concurrent modification of the
> kernel page tables. When a leaf entries are modified concurrently, the dump
> code may log stale or inconsistent information for a VA range, but this is
> otherwise not harmful.
>
> When intermediate levels of table are freed, the dump code will continue to
> use memory which has been freed and potentially reallocated for another
> purpose. In such cases, the dump code may dereference bogus addresses,
> leading to a number of potential problems.
>
> This problem was fixed for ptdump_show() earlier via commit 'bf2b59f60ee1
> ("arm64/mm: Hold memory hotplug lock while walking for kernel page table
> dump")' but a same was missed for ptdump_check_wx() which faced the race
> condition as well. Let's just take the memory hotplug lock while executing
> ptdump_check_wx().
>
> Cc: stable@vger.kernel.org
> Fixes: bbd6ec605c0f ("arm64/mm: Enable memory hot remove")
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Ryan Roberts <ryan.roberts@arm.com>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Reported-by: Dev Jain <dev.jain@arm.com>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---

LGTM

Reviewed-by: Dev Jain <dev.jain@arm.com>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] arm64/ptdump: Ensure memory hotplug is prevented during ptdump_check_wx()
  2025-06-09  4:12 [PATCH] arm64/ptdump: Ensure memory hotplug is prevented during ptdump_check_wx() Anshuman Khandual
  2025-06-09  5:09 ` Dev Jain
@ 2025-06-09 11:10 ` Ryan Roberts
  2025-06-10 16:42 ` Catalin Marinas
  2025-06-12 14:58 ` Will Deacon
  3 siblings, 0 replies; 8+ messages in thread
From: Ryan Roberts @ 2025-06-09 11:10 UTC (permalink / raw)
  To: Anshuman Khandual, linux-arm-kernel
  Cc: stable, Catalin Marinas, Will Deacon, linux-kernel, Dev Jain

On 09/06/2025 05:12, Anshuman Khandual wrote:
> The arm64 page table dump code can race with concurrent modification of the
> kernel page tables. When a leaf entries are modified concurrently, the dump
> code may log stale or inconsistent information for a VA range, but this is
> otherwise not harmful.
> 
> When intermediate levels of table are freed, the dump code will continue to
> use memory which has been freed and potentially reallocated for another
> purpose. In such cases, the dump code may dereference bogus addresses,
> leading to a number of potential problems.
> 
> This problem was fixed for ptdump_show() earlier via commit 'bf2b59f60ee1
> ("arm64/mm: Hold memory hotplug lock while walking for kernel page table
> dump")' but a same was missed for ptdump_check_wx() which faced the race
> condition as well. Let's just take the memory hotplug lock while executing
> ptdump_check_wx().
> 
> Cc: stable@vger.kernel.org
> Fixes: bbd6ec605c0f ("arm64/mm: Enable memory hot remove")
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Ryan Roberts <ryan.roberts@arm.com>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Reported-by: Dev Jain <dev.jain@arm.com>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>

Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>

> ---
> This patch applies on v6.16-rc1
> 
> Dev Jain found this via code inspection.
> 
>  arch/arm64/mm/ptdump.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
> index 421a5de806c62..551f80d41e8d2 100644
> --- a/arch/arm64/mm/ptdump.c
> +++ b/arch/arm64/mm/ptdump.c
> @@ -328,7 +328,7 @@ static struct ptdump_info kernel_ptdump_info __ro_after_init = {
>  	.mm		= &init_mm,
>  };
>  
> -bool ptdump_check_wx(void)
> +static bool __ptdump_check_wx(void)
>  {
>  	struct ptdump_pg_state st = {
>  		.seq = NULL,
> @@ -367,6 +367,16 @@ bool ptdump_check_wx(void)
>  	}
>  }
>  
> +bool ptdump_check_wx(void)
> +{
> +	bool ret;
> +
> +	get_online_mems();
> +	ret = __ptdump_check_wx();
> +	put_online_mems();
> +	return ret;
> +}
> +
>  static int __init ptdump_init(void)
>  {
>  	u64 page_offset = _PAGE_OFFSET(vabits_actual);


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] arm64/ptdump: Ensure memory hotplug is prevented during ptdump_check_wx()
  2025-06-09  4:12 [PATCH] arm64/ptdump: Ensure memory hotplug is prevented during ptdump_check_wx() Anshuman Khandual
  2025-06-09  5:09 ` Dev Jain
  2025-06-09 11:10 ` Ryan Roberts
@ 2025-06-10 16:42 ` Catalin Marinas
  2025-06-12 14:58 ` Will Deacon
  3 siblings, 0 replies; 8+ messages in thread
From: Catalin Marinas @ 2025-06-10 16:42 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, stable, Will Deacon, Ryan Roberts, linux-kernel,
	Dev Jain

On Mon, Jun 09, 2025 at 05:12:14AM +0100, Anshuman Khandual wrote:
> The arm64 page table dump code can race with concurrent modification of the
> kernel page tables. When a leaf entries are modified concurrently, the dump
> code may log stale or inconsistent information for a VA range, but this is
> otherwise not harmful.
> 
> When intermediate levels of table are freed, the dump code will continue to
> use memory which has been freed and potentially reallocated for another
> purpose. In such cases, the dump code may dereference bogus addresses,
> leading to a number of potential problems.
> 
> This problem was fixed for ptdump_show() earlier via commit 'bf2b59f60ee1
> ("arm64/mm: Hold memory hotplug lock while walking for kernel page table
> dump")' but a same was missed for ptdump_check_wx() which faced the race
> condition as well. Let's just take the memory hotplug lock while executing
> ptdump_check_wx().
> 
> Cc: stable@vger.kernel.org
> Fixes: bbd6ec605c0f ("arm64/mm: Enable memory hot remove")
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Ryan Roberts <ryan.roberts@arm.com>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Reported-by: Dev Jain <dev.jain@arm.com>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] arm64/ptdump: Ensure memory hotplug is prevented during ptdump_check_wx()
  2025-06-09  4:12 [PATCH] arm64/ptdump: Ensure memory hotplug is prevented during ptdump_check_wx() Anshuman Khandual
                   ` (2 preceding siblings ...)
  2025-06-10 16:42 ` Catalin Marinas
@ 2025-06-12 14:58 ` Will Deacon
  2025-06-13  5:09   ` Anshuman Khandual
  3 siblings, 1 reply; 8+ messages in thread
From: Will Deacon @ 2025-06-12 14:58 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, stable, Catalin Marinas, Ryan Roberts,
	linux-kernel, Dev Jain

On Mon, Jun 09, 2025 at 05:12:14AM +0100, Anshuman Khandual wrote:
> The arm64 page table dump code can race with concurrent modification of the
> kernel page tables. When a leaf entries are modified concurrently, the dump
> code may log stale or inconsistent information for a VA range, but this is
> otherwise not harmful.
> 
> When intermediate levels of table are freed, the dump code will continue to
> use memory which has been freed and potentially reallocated for another
> purpose. In such cases, the dump code may dereference bogus addresses,
> leading to a number of potential problems.
> 
> This problem was fixed for ptdump_show() earlier via commit 'bf2b59f60ee1
> ("arm64/mm: Hold memory hotplug lock while walking for kernel page table
> dump")' but a same was missed for ptdump_check_wx() which faced the race
> condition as well. Let's just take the memory hotplug lock while executing
> ptdump_check_wx().

How do other architectures (e.g. x86) handle this? I don't see any usage
of {get,put}_online_mems() over there. Should this be moved into the core
code?

Will

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] arm64/ptdump: Ensure memory hotplug is prevented during ptdump_check_wx()
  2025-06-12 14:58 ` Will Deacon
@ 2025-06-13  5:09   ` Anshuman Khandual
  2025-06-18 11:36     ` Will Deacon
  0 siblings, 1 reply; 8+ messages in thread
From: Anshuman Khandual @ 2025-06-13  5:09 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-arm-kernel, stable, Catalin Marinas, Ryan Roberts,
	linux-kernel, Dev Jain



On 12/06/25 8:28 PM, Will Deacon wrote:
> On Mon, Jun 09, 2025 at 05:12:14AM +0100, Anshuman Khandual wrote:
>> The arm64 page table dump code can race with concurrent modification of the
>> kernel page tables. When a leaf entries are modified concurrently, the dump
>> code may log stale or inconsistent information for a VA range, but this is
>> otherwise not harmful.
>>
>> When intermediate levels of table are freed, the dump code will continue to
>> use memory which has been freed and potentially reallocated for another
>> purpose. In such cases, the dump code may dereference bogus addresses,
>> leading to a number of potential problems.
>>
>> This problem was fixed for ptdump_show() earlier via commit 'bf2b59f60ee1
>> ("arm64/mm: Hold memory hotplug lock while walking for kernel page table
>> dump")' but a same was missed for ptdump_check_wx() which faced the race
>> condition as well. Let's just take the memory hotplug lock while executing
>> ptdump_check_wx().
> 
> How do other architectures (e.g. x86) handle this? I don't see any usage
> of {get,put}_online_mems() over there. Should this be moved into the core
> code?

Memory hot remove on arm64 unmaps kernel linear and vmemmap mapping while
also freeing page table pages if those become empty. Although this might
not be true for all other architectures, which might just unmap affected
kernel regions but does not tear down the kernel page table.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] arm64/ptdump: Ensure memory hotplug is prevented during ptdump_check_wx()
  2025-06-13  5:09   ` Anshuman Khandual
@ 2025-06-18 11:36     ` Will Deacon
  2025-06-19  7:55       ` Anshuman Khandual
  0 siblings, 1 reply; 8+ messages in thread
From: Will Deacon @ 2025-06-18 11:36 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, stable, Catalin Marinas, Ryan Roberts,
	linux-kernel, Dev Jain

On Fri, Jun 13, 2025 at 10:39:02AM +0530, Anshuman Khandual wrote:
> 
> 
> On 12/06/25 8:28 PM, Will Deacon wrote:
> > On Mon, Jun 09, 2025 at 05:12:14AM +0100, Anshuman Khandual wrote:
> >> The arm64 page table dump code can race with concurrent modification of the
> >> kernel page tables. When a leaf entries are modified concurrently, the dump
> >> code may log stale or inconsistent information for a VA range, but this is
> >> otherwise not harmful.
> >>
> >> When intermediate levels of table are freed, the dump code will continue to
> >> use memory which has been freed and potentially reallocated for another
> >> purpose. In such cases, the dump code may dereference bogus addresses,
> >> leading to a number of potential problems.
> >>
> >> This problem was fixed for ptdump_show() earlier via commit 'bf2b59f60ee1
> >> ("arm64/mm: Hold memory hotplug lock while walking for kernel page table
> >> dump")' but a same was missed for ptdump_check_wx() which faced the race
> >> condition as well. Let's just take the memory hotplug lock while executing
> >> ptdump_check_wx().
> > 
> > How do other architectures (e.g. x86) handle this? I don't see any usage
> > of {get,put}_online_mems() over there. Should this be moved into the core
> > code?
> 
> Memory hot remove on arm64 unmaps kernel linear and vmemmap mapping while
> also freeing page table pages if those become empty. Although this might
> not be true for all other architectures, which might just unmap affected
> kernel regions but does not tear down the kernel page table.

... that sounds like something we should be able to give a definitive
answer to?

Will

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] arm64/ptdump: Ensure memory hotplug is prevented during ptdump_check_wx()
  2025-06-18 11:36     ` Will Deacon
@ 2025-06-19  7:55       ` Anshuman Khandual
  0 siblings, 0 replies; 8+ messages in thread
From: Anshuman Khandual @ 2025-06-19  7:55 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-arm-kernel, stable, Catalin Marinas, Ryan Roberts,
	linux-kernel, Dev Jain



On 18/06/25 5:06 PM, Will Deacon wrote:
> On Fri, Jun 13, 2025 at 10:39:02AM +0530, Anshuman Khandual wrote:
>>
>>
>> On 12/06/25 8:28 PM, Will Deacon wrote:
>>> On Mon, Jun 09, 2025 at 05:12:14AM +0100, Anshuman Khandual wrote:
>>>> The arm64 page table dump code can race with concurrent modification of the
>>>> kernel page tables. When a leaf entries are modified concurrently, the dump
>>>> code may log stale or inconsistent information for a VA range, but this is
>>>> otherwise not harmful.
>>>>
>>>> When intermediate levels of table are freed, the dump code will continue to
>>>> use memory which has been freed and potentially reallocated for another
>>>> purpose. In such cases, the dump code may dereference bogus addresses,
>>>> leading to a number of potential problems.
>>>>
>>>> This problem was fixed for ptdump_show() earlier via commit 'bf2b59f60ee1
>>>> ("arm64/mm: Hold memory hotplug lock while walking for kernel page table
>>>> dump")' but a same was missed for ptdump_check_wx() which faced the race
>>>> condition as well. Let's just take the memory hotplug lock while executing
>>>> ptdump_check_wx().
>>>
>>> How do other architectures (e.g. x86) handle this? I don't see any usage
>>> of {get,put}_online_mems() over there. Should this be moved into the core
>>> code?
>>
>> Memory hot remove on arm64 unmaps kernel linear and vmemmap mapping while
>> also freeing page table pages if those become empty. Although this might
>> not be true for all other architectures, which might just unmap affected
>> kernel regions but does not tear down the kernel page table.
> 
> ... that sounds like something we should be able to give a definitive
> answer to?

Agreed.

arch_remove_memory() is the primary arch callback which does the unmapping
and also tearing down of the required kernel page table regions i.e linear
and vmemmap mapping . These are the call paths that reach platform specific
memory removal via arch_remove_memory().

A) ZONE_DEVICE

devm_memremap_pages()
    devm_memremap_pages_release()
        devm_memunmap_pages()
            memunmap_pages()
                arch_remove_memory()

B) Normal DRAM

echo 1 > /sys/devices/system/memory/memoryX/offline

memory_subsys_offline()
    device_offline()
        memory_offline()
            offline_memory_block()
                remove_memory()
                    __remove_memory()
                        arch_remove_memory()

Currently there are six platforms which enable ARCH_ENABLE_MEMORY_HOTREMOVE
thus implementing arch_remove_memory(). Core memory hot removal process does
not have any set expectations from these callbacks. So platforms are free to
implement unmap and page table tearing down operation as deemed necessary.

ARCH_ENABLE_MEMORY_HOTREMOVE - arm64, loongarch, powerpc, riscv, s390, x86
ARCH_HAS_PTDUMP              - arm64, powerpc, riscv, s390, x86

In summary all the platforms that support memory hot remove and ptdump do
try and free the unmapped regions of the page table when possible. Hence
they are indeed exposed to possible race with ptdump walk.

But as mentioned earlier the callback arch_remove_memory() does not have to
tear down the page tables. Unless there are objections from other platforms,
standard memory hotplug lock could indeed be taken during all generic ptdump
walk paths. 

arm64
=====
    arch_remove_memory()
        __remove_pages()
            sparse_remove_section()
                section_deactivate()
                    depopulate_section_memmap()
                    free_map_bootmem()
                        vmemmap_free()              /* vmemap mapping */
                            unmap_hotplug_range()   /* Unmap */
                            free_empty_tables()     /* Tear down */

        __remove_pgd_mapping()
            __remove_pgd_mapping()                  /* linear Mapping */
                unmap_hotplug_range()               /* Unmap */
                free_empty_tables()                 /* Tear down */

powerpc
=======
    arch_remove_memory()
        __remove_pages()
            sparse_remove_section()
                    section_deactivate()
                        depopulate_section_memmap()
                            vmemmap_free()
                                __vmemmap_free()            /* Hash */
                                radix__vmemmap_free()       /* Radix */

        arch_remove_linear_mapping()
            remove_section_mapping()
                hash__remove_section_mapping()              /* Hash */
                radix__remove_section_mapping()             /* Radix */
    
riscv
=====
    arch_remove_memory()
        __remove_pages()
            sparse_remove_section()
                    section_deactivate()
                        depopulate_section_memmap()
                            vmemmap_free()
                                remove_pgd_mapping()

        remove_linear_mapping()
            remove_pgd_mapping()
    
remove_pgd_mapping() recursively calls remove_pxd_mapping() and
free_pxd_table() when applicable.

s390
=====
    arch_remove_memory()
        __remove_pages()
            sparse_remove_section()
                section_deactivate()
                    depopulate_section_memmap()
                        vmemmap_free()
                            remove_pagetable()
                                modify_pagetable()
        
        vmem_remove_mapping()
            vmem_remove_range()
                remove_pagetable() 
                        modify_pagetable()

modify_pagetable() on s390 does try to tear down the page table
when possible.
 
x86
===
    arch_remove_memory()
        __remove_pages()
            sparse_remove_section()
                section_deactivate()
                    depopulate_section_memmap()
                    free_map_bootmem()
                        vmemmap_free()              /* vmemap mapping */
                            remove_pagetable()

        kernel_physical_mapping_remove()            /* linear Mapping */
            remove_pagetable()

remove_pagetable() on x86 calls remove_pxd_table() followed up call
with free_pxd_table() which does tear down the page table as well
and hence exposed to race with PTDUMP which scans over the entire
kernel page table.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2025-06-19  7:55 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-09  4:12 [PATCH] arm64/ptdump: Ensure memory hotplug is prevented during ptdump_check_wx() Anshuman Khandual
2025-06-09  5:09 ` Dev Jain
2025-06-09 11:10 ` Ryan Roberts
2025-06-10 16:42 ` Catalin Marinas
2025-06-12 14:58 ` Will Deacon
2025-06-13  5:09   ` Anshuman Khandual
2025-06-18 11:36     ` Will Deacon
2025-06-19  7:55       ` Anshuman Khandual

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).