Date: Tue, 3 Mar 2026 09:57:18 +0000
From: Jonathan Cameron
To: Ryan Roberts
CC: Will Deacon, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
    Linus Torvalds, Oliver Upton, Marc Zyngier, "Dev Jain", Linu Cherian
Subject: Re: [PATCH v3 11/13] arm64: mm: More flags for __flush_tlb_range()
Message-ID: <20260303095718.00001320@huawei.com>
In-Reply-To: <20260302135602.3716920-12-ryan.roberts@arm.com>
References: <20260302135602.3716920-1-ryan.roberts@arm.com>
    <20260302135602.3716920-12-ryan.roberts@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 2 Mar 2026 13:55:58 +0000
Ryan Roberts wrote:

> Refactor function variants with "_nosync", "_local" and "_nonotify" into
> a single __always_inline implementation that takes flags and rely on
> constant folding to select the parts that are actually needed at any
> given callsite, based on the provided flags.
>
> Flags all live in the tlbf_t (TLB flags) type; TLBF_NONE (0) continues
> to provide the strongest semantics (i.e. evict from walk cache,
> broadcast, synchronise and notify). Each flag reduces the strength in
> some way; TLBF_NONOTIFY, TLBF_NOSYNC and TLBF_NOBROADCAST are added to
> complement the existing TLBF_NOWALKCACHE.
>
> There are no users that require TLBF_NOBROADCAST without
> TLBF_NOWALKCACHE so implement that as BUILD_BUG() to avoid needing to
> introduce dead code for vae1 invalidations.
>
> The result is a clearer, simpler, more powerful API.

Hi Ryan,

There is one subtle change to rounding that should be called out at least.
Might even be worth pulling it to a precursor patch where you can add an
explanation of why original code was rounding to a larger value than was
ever needed.

Jonathan

>
> Signed-off-by: Ryan Roberts

>  static inline void __flush_tlb_range(struct vm_area_struct *vma,
> @@ -586,24 +615,9 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
>  				     unsigned long stride, int tlb_level,
>  				     tlbf_t flags)
>  {
> -	__flush_tlb_range_nosync(vma->vm_mm, start, end, stride,
> -				 tlb_level, flags);
> -	__tlbi_sync_s1ish();
> -}
> -
> -static inline void local_flush_tlb_contpte(struct vm_area_struct *vma,
> -					   unsigned long addr)
> -{
> -	unsigned long asid;
> -
> -	addr = round_down(addr, CONT_PTE_SIZE);

See below.

> -
> -	dsb(nshst);
> -	asid = ASID(vma->vm_mm);
> -	__flush_s1_tlb_range_op(vale1, addr, CONT_PTES, PAGE_SIZE, asid, 3);
> -	mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, addr,
> -						    addr + CONT_PTE_SIZE);
> -	dsb(nsh);
> +	start = round_down(start, stride);

See below.

> +	end = round_up(end, stride);
> +	__do_flush_tlb_range(vma, start, end, stride, tlb_level, flags);
>  }
>
>  static inline bool __pte_flags_need_flush(ptdesc_t oldval, ptdesc_t newval)
> diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
> index 681f22fac52a1..3f1a3e86353de 100644
> --- a/arch/arm64/mm/contpte.c
> +++ b/arch/arm64/mm/contpte.c

...

> @@ -641,7 +641,10 @@ int contpte_ptep_set_access_flags(struct vm_area_struct *vma,
>  		__ptep_set_access_flags(vma, addr, ptep, entry, 0);
>
>  		if (dirty)
> -			local_flush_tlb_contpte(vma, start_addr);
> +			__flush_tlb_range(vma, start_addr,
> +					  start_addr + CONT_PTE_SIZE,
> +					  PAGE_SIZE, 3,

This results in a different stride to round down.
local_flush_tlb_contpte() did
	addr = round_down(addr, CONT_PTE_SIZE);

With this call we have
	start = round_down(start, stride);
where stride is PAGE_SIZE.

I'm too lazy to figure out if that matters.

> +					  TLBF_NOWALKCACHE | TLBF_NOBROADCAST);
>  	} else {
>  		__contpte_try_unfold(vma->vm_mm, addr, ptep, orig_pte);
>  		__ptep_set_access_flags(vma, addr, ptep, entry, dirty);