Re: [PATCH V3 1/2] arm64/mm: Enable batched TLB flush in unmap_hotplug_range()

public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed

From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Anshuman Khandual <anshuman.khandual@arm.com>,
	linux-arm-kernel@lists.infradead.org
Cc: mark.rutland@arm.com, Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Ryan Roberts <ryan.roberts@arm.com>,
	Yang Shi <yang@os.amperecomputing.com>,
	Christoph Lameter <cl@gentwo.org>,
	linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH V3 1/2] arm64/mm: Enable batched TLB flush in unmap_hotplug_range()
Date: Mon, 2 Mar 2026 16:28:08 +0100	[thread overview]
Message-ID: <bf956adb-06bc-4b68-b846-7dddb9413867@kernel.org> (raw)
In-Reply-To: <20260224062423.972404-2-anshuman.khandual@arm.com>

On 2/24/26 07:24, Anshuman Khandual wrote:
> During a memory hot remove operation, both linear and vmemmap mappings for
> the memory range being removed, get unmapped via unmap_hotplug_range() but
> mapped pages get freed only for vmemmap mapping. This is just a sequential
> operation where each table entry gets cleared, followed by a leaf specific
> TLB flush, and then followed by memory free operation when applicable.
> 
> This approach was simple and uniform both for vmemmap and linear mappings.
> But linear mapping might contain CONT marked block memory where it becomes
> necessary to first clear out all entire in the range before a TLB flush.
> This is as per the architecture requirement. Hence batch all TLB flushes
> during the table tear down walk and finally do it in unmap_hotplug_range().
> 
> Prior to this fix, it was hypothetically possible for a speculative access
> to a higher address in the contiguous block to fill the TLB with shattered
> entries for the entire contiguous range after a lower address had already
> been cleared and invalidated. Due to the table entries being shattered, the
> subsequent TLB invalidation for the higher address would not then clear the
> TLB entries for the lower address, meaning stale TLB entries could persist.
> 
> Besides it also helps in improving the performance via TLBI range operation
> along with reduced synchronization instructions. The time spent executing
> unmap_hotplug_range() improved 97% measured over a 2GB memory hot removal
> in KVM guest.
> 
> This scheme is not applicable during vmemmap mapping tear down where memory
> needs to be freed and hence a TLB flush is required after clearing out page
> table entry.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Closes: https://lore.kernel.org/all/aWZYXhrT6D2M-7-N@willie-the-truck/
> Fixes: bbd6ec605c0f ("arm64/mm: Enable memory hot remove")
> Cc: stable@vger.kernel.org
> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>  arch/arm64/mm/mmu.c | 81 +++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 67 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index a6a00accf4f9..dfb61d218579 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -1458,10 +1458,32 @@ static void unmap_hotplug_pte_range(pmd_t *pmdp, unsigned long addr,
>  
>  		WARN_ON(!pte_present(pte));
>  		__pte_clear(&init_mm, addr, ptep);
> -		flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> -		if (free_mapped)
> +		if (free_mapped) {
> +			/*
> +			 * If page is part of an existing contiguous
> +			 * memory block, individual TLB invalidation
> +			 * here would not be appropriate. Instead it
> +			 * will require clearing all entries for the
> +			 * memory block and subsequently a TLB flush
> +			 * for the entire range.
> +			 */

I'm not sure about repeating these longish comments a couple of times :)

For example, I think you can drop the ones regarding the TLB flush
("TLB flush is batched in unmap_hotplug_range ...") completely.
unmap_hotplug_range(), the only caller, is pretty clear about that. And
anybody reading that code should be able to spot the "!free_mapped" case
easily.

Alternatively, just say

	/* unmap_hotplug_range() flushes TLB for !free_mapped */

"TLB flush is essential for freeing memory." is rather obvious when
freeing memory, so I would drop that as well.

Regarding pte_cont(), can we shorten that to

	/* CONT blocks in the vmemmap are not supported. */

Anybody who wants to figure *why* can lookup your patch where you add
that comment+check.


I did not check whether people suggested to add these comments in
previous versions. But to me they don't add a lot of real value that
couldn't be had from the code already (or common sense: freeing requires
prior TLB flush).

> +			WARN_ON(pte_cont(pte));
> +
> +			/*
> +			 * TLB flush is essential for freeing memory.
> +			 */
> +			flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
>  			free_hotplug_page_range(pte_page(pte),
>  						PAGE_SIZE, altmap);
> +		}

[...]

>  		WARN_ON(!pud_table(pud));
> @@ -1553,6 +1597,7 @@ static void unmap_hotplug_p4d_range(pgd_t *pgdp, unsigned long addr,
>  static void unmap_hotplug_range(unsigned long addr, unsigned long end,
>  				bool free_mapped, struct vmem_altmap *altmap)
>  {
> +	unsigned long start = addr;
>  	unsigned long next;
>  	pgd_t *pgdp, pgd;
>  
> @@ -1574,6 +1619,14 @@ static void unmap_hotplug_range(unsigned long addr, unsigned long end,
>  		WARN_ON(!pgd_present(pgd));
>  		unmap_hotplug_p4d_range(pgdp, addr, next, free_mapped, altmap);
>  	} while (addr = next, addr < end);
> +
> +	/*
> +	 * Batched TLB flush only for linear mapping which
> +	 * might contain CONT blocks, and does not require
> +	 * freeing up memory as well.
> +	 */

Also, here, I don't think the comment really adds value.

* !free_mapped -> linear mapping, no freeing of memory
* CONT blocks -> irrelevant, you can batch in either case

> +	if (!free_mapped)
> +		flush_tlb_kernel_range(start, end);
>  }
-- 
Cheers,

David

next prev parent reply	other threads:[~2026-03-02 15:28 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20260224062423.972404-1-anshuman.khandual@arm.com>
2026-02-24  6:24 ` [PATCH V3 1/2] arm64/mm: Enable batched TLB flush in unmap_hotplug_range() Anshuman Khandual
2026-03-02 15:28   ` David Hildenbrand (Arm) [this message]
2026-03-03  5:55     ` Anshuman Khandual

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bf956adb-06bc-4b68-b846-7dddb9413867@kernel.org \
    --to=david@kernel.org \
    --cc=anshuman.khandual@arm.com \
    --cc=catalin.marinas@arm.com \
    --cc=cl@gentwo.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=ryan.roberts@arm.com \
    --cc=stable@vger.kernel.org \
    --cc=will@kernel.org \
    --cc=yang@os.amperecomputing.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox