Date: Tue, 3 Mar 2026 17:34:33 +0000
From: Jonathan Cameron
To: Ryan Roberts
CC: Will Deacon, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
 Linus Torvalds, Oliver Upton, Marc Zyngier, "Dev Jain", Linu Cherian
Subject: Re: [PATCH v3 11/13] arm64: mm: More flags for __flush_tlb_range()
Message-ID: <20260303173433.0000031f@huawei.com>
In-Reply-To: <28c7506f-a57e-4c59-b3ef-4596d8e1b11e@arm.com>
References: <20260302135602.3716920-1-ryan.roberts@arm.com>
 <20260302135602.3716920-12-ryan.roberts@arm.com>
 <20260303095718.00001320@huawei.com>
 <28c7506f-a57e-4c59-b3ef-4596d8e1b11e@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 3 Mar 2026 13:54:33 +0000
Ryan Roberts wrote:

> On 03/03/2026 09:57, Jonathan Cameron wrote:
> > On Mon, 2 Mar 2026 13:55:58 +0000
> > Ryan Roberts wrote:
> >
> >> Refactor function variants with "_nosync", "_local" and "_nonotify" into
> >> a single __always_inline implementation that takes flags and rely on
> >> constant folding to select the parts that are actually needed at any
> >> given callsite, based on the provided flags.
> >>
> >> Flags all live in the tlbf_t (TLB flags) type; TLBF_NONE (0) continues
> >> to provide the strongest semantics (i.e. evict from walk cache,
> >> broadcast, synchronise and notify). Each flag reduces the strength in
> >> some way; TLBF_NONOTIFY, TLBF_NOSYNC and TLBF_NOBROADCAST are added to
> >> complement the existing TLBF_NOWALKCACHE.
> >>
> >> There are no users that require TLBF_NOBROADCAST without
> >> TLBF_NOWALKCACHE so implement that as BUILD_BUG() to avoid needing to
> >> introduce dead code for vae1 invalidations.
> >>
> >> The result is a clearer, simpler, more powerful API.
> > Hi Ryan,
> >
> > There is one subtle change to rounding that should be called out at least.
>
> Thanks for the review. I'm confident that there isn't actually a change to the
> rounding here, but the responsibility has moved to the caller. See below...
>
> >
> > Might even be worth pulling it to a precursor patch where you can add an
> > explanation of why original code was rounding to a larger value than was
> > ever needed.
> >
> > Jonathan
> >
> >
> >>
> >> Signed-off-by: Ryan Roberts
> >
> >
> >>  static inline void __flush_tlb_range(struct vm_area_struct *vma,
> >> @@ -586,24 +615,9 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
> >>  				     unsigned long stride, int tlb_level,
> >>  				     tlbf_t flags)
> >>  {
> >> -	__flush_tlb_range_nosync(vma->vm_mm, start, end, stride,
> >> -				 tlb_level, flags);
> >> -	__tlbi_sync_s1ish();
> >> -}
> >> -
> >> -static inline void local_flush_tlb_contpte(struct vm_area_struct *vma,
> >> -					   unsigned long addr)
> >> -{
> >> -	unsigned long asid;
> >> -
> >> -	addr = round_down(addr, CONT_PTE_SIZE);
> > See below.
> >> -
> >> -	dsb(nshst);
> >> -	asid = ASID(vma->vm_mm);
> >> -	__flush_s1_tlb_range_op(vale1, addr, CONT_PTES, PAGE_SIZE, asid, 3);
> >> -	mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, addr,
> >> -						    addr + CONT_PTE_SIZE);
> >> -	dsb(nsh);
> >> +	start = round_down(start, stride);
> > See below.
> >> +	end = round_up(end, stride);
> >> +	__do_flush_tlb_range(vma, start, end, stride, tlb_level, flags);
> >>  }
> >
> >>
> >>  static inline bool __pte_flags_need_flush(ptdesc_t oldval, ptdesc_t newval)
> >> diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
> >> index 681f22fac52a1..3f1a3e86353de 100644
> >> --- a/arch/arm64/mm/contpte.c
> >> +++ b/arch/arm64/mm/contpte.c
> > ...
> >
> >> @@ -641,7 +641,10 @@ int contpte_ptep_set_access_flags(struct vm_area_struct *vma,
> >>  		__ptep_set_access_flags(vma, addr, ptep, entry, 0);
> >>
> >>  		if (dirty)
> >> -			local_flush_tlb_contpte(vma, start_addr);
> >> +			__flush_tlb_range(vma, start_addr,
> >> +					  start_addr + CONT_PTE_SIZE,
> >> +					  PAGE_SIZE, 3,
> >
> > This results in a different stride to round down.
> > local_flush_tlb_contpte() did
> > 	addr = round_down(addr, CONT_PTE_SIZE);
> >
> > With this call we have
> > 	start = round_down(start, stride); where stride is PAGE_SIZE.
> >
> > I'm too lazy to figure out if that matters.
>
> contpte_ptep_set_access_flags() is operating on a contpte block of ptes, and as
> such, start_addr has already been rounded down to the start of the block, which
> is always bigger than (and perfectly divisible by) PAGE_SIZE.
>
> Previously, local_flush_tlb_contpte() allowed passing any VA within the
> contpte block and the function would automatically round it down to the start of
> the block and invalidate the full block.
>
> After the change, we are explicitly passing the already aligned block;
> start_addr is already guaranteed to be at the start of the block and "start_addr
> + CONT_PTE_SIZE" is the end.
>
> So in both cases, the rounding down that is done by local_flush_tlb_contpte() /
> __flush_tlb_range() doesn't actually change the value.

Ah ok, so the key is that the round down in local_flush_tlb_contpte()
never did anything in practice, because the only caller is
contpte_ptep_set_access_flags() and that does the align down a couple
of lines before the call.

I should have spent a few seconds looking! :(

Maybe if you are respinning, just throw in a one-line comment on this in
the commit description.

Reviewed-by: Jonathan Cameron

>
> Thanks,
> Ryan
>
> >
> >
> >
> >> +					  TLBF_NOWALKCACHE | TLBF_NOBROADCAST);
> >>  	} else {
> >>  		__contpte_try_unfold(vma->vm_mm, addr, ptep, orig_pte);
> >>  		__ptep_set_access_flags(vma, addr, ptep, entry, dirty);
> >
> >
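
For anyone following along, here is a rough userspace sketch of the alignment argument above. It is not kernel code: the constants (4K pages, 16 PTEs per contpte block) are assumed values for illustration only, and round_down_pow2(), round_up_pow2() and check_block() are hypothetical stand-ins for the kernel's round_down() / round_up() and for the caller's align-down. It only shows that once an address has been aligned to CONT_PTE_SIZE, rounding it again by either CONT_PTE_SIZE or PAGE_SIZE leaves it unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration: 4K pages, 16 PTEs per contiguous block. */
#define PAGE_SIZE      UINT64_C(4096)
#define CONT_PTES      UINT64_C(16)
#define CONT_PTE_SIZE  (CONT_PTES * PAGE_SIZE)

/* Hypothetical stand-ins for round_down()/round_up(); align must be a power of two. */
static uint64_t round_down_pow2(uint64_t x, uint64_t align)
{
	return x & ~(align - 1);
}

static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Mimic the caller: align an arbitrary VA down to its contpte block, then
 * check that every later rounding step discussed in the thread is a no-op.
 * Returns the block start.
 */
static uint64_t check_block(uint64_t addr)
{
	uint64_t start_addr = round_down_pow2(addr, CONT_PTE_SIZE);
	uint64_t end_addr = start_addr + CONT_PTE_SIZE;

	/* Old helper: round_down(addr, CONT_PTE_SIZE) on an aligned address. */
	assert(round_down_pow2(start_addr, CONT_PTE_SIZE) == start_addr);

	/* New path: stride is PAGE_SIZE, which divides CONT_PTE_SIZE. */
	assert(round_down_pow2(start_addr, PAGE_SIZE) == start_addr);
	assert(round_up_pow2(end_addr, PAGE_SIZE) == end_addr);

	return start_addr;
}
```

Calling check_block() with any VA inside a block returns the same block start, which is the property the old helper's round_down() was providing and that the caller already guarantees.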