From: Thomas Gleixner
To: "Russell King (Oracle)"
Cc: Andrew Morton, linux-mm@kvack.org, Christoph Hellwig,
 Uladzislau Rezki, Lorenzo Stoakes, Peter Zijlstra, Baoquan He,
 John Ogness, linux-arm-kernel@lists.infradead.org, Mark Rutland,
 Marc Zyngier, x86@kernel.org
Subject: Re: Excessive TLB flush ranges
References: <87a5y5a6kj.ffs@tglx> <87353x9y3l.ffs@tglx> <87zg658fla.ffs@tglx>
 <87r0rg93z5.ffs@tglx> <87ilcs8zab.ffs@tglx> <87fs7w8z6y.ffs@tglx>
Date: Tue, 16 May 2023 11:03:34 +0200
Message-ID: <874joc8x7d.ffs@tglx>

On Tue, May 16 2023 at 09:27, Russell King wrote:
> On Tue, May 16, 2023 at 10:20:37AM +0200, Thomas Gleixner wrote:
>> On Tue, May 16 2023 at 10:18, Thomas Gleixner wrote:
>>
>> > On Tue, May 16 2023 at 08:37, Thomas Gleixner wrote:
>> >> On Mon, May 15 2023 at 22:31, Russell King wrote:
>> >>>> +	list_for_each_entry(va, list, list) {
>> >>>> +		/* flush range by one by one 'invlpg' */
>> >>>> +		for (addr = va->va_start; addr < va->va_end; addr += PAGE_SIZE)
>> >>>> +			flush_tlb_one_kernel(addr);
>> >>>
>> >>> Isn't this just the same as:
>> >>>	flush_tlb_kernel_range(va->va_start, va->va_end);
>> >>
>> >> Indeed.
>> >
>> > Actually not. At least not on x86 where it'd end up with 3 IPIs for that
>> > case again, instead of having one which walks the list on each CPU.
>>
>> ARM32 has the same problem when tlb_ops_need_broadcast() is true.
>
> If tlb_ops_need_broadcast() is true, then isn't it one IPI to other
> CPUs to flush the range, and possibly another for the Cortex-A15
> erratum?
>
> I've no idea what flush_tlb_one_kernel() is. I can find no such

The patch is against x86 and that function exists there. At least git
grep claims so. :)

> implementation, there is flush_tlb_kernel_page() though, which I
> think is what you're referring to above. On ARM32, that will issue
> one IPI each time it's called, and possibly another IPI for the
> Cortex-A15 erratum.
>
> Given that, flush_tlb_kernel_range() is still going to be more
> efficient on ARM32 when tlb_ops_need_broadcast() is true than doing
> it page by page.

Something like the untested below?

I did not attempt anything to decide whether a full flush might be worth
it, but that's a separate problem.

Thanks,

        tglx
---
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -270,6 +270,10 @@ config ARCH_HAS_SET_MEMORY
 config ARCH_HAS_SET_DIRECT_MAP
 	bool
 
+# Select if architecture provides flush_tlb_kernel_vas()
+config ARCH_HAS_FLUSH_TLB_KERNEL_VAS
+	bool
+
 #
 # Select if the architecture provides the arch_dma_set_uncached symbol to
 # either provide an uncached segment alias for a DMA allocation, or
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -10,6 +10,7 @@ config ARM
 	select ARCH_HAS_DMA_WRITE_COMBINE if !ARM_DMA_MEM_BUFFERABLE
 	select ARCH_HAS_ELF_RANDOMIZE
 	select ARCH_HAS_FORTIFY_SOURCE
+	select ARCH_HAS_FLUSH_TLB_KERNEL_VAS
 	select ARCH_HAS_KEEPINITRD
 	select ARCH_HAS_KCOV
 	select ARCH_HAS_MEMBARRIER_SYNC_CORE
--- a/arch/arm/kernel/smp_tlb.c
+++ b/arch/arm/kernel/smp_tlb.c
@@ -7,6 +7,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -69,6 +70,19 @@ static inline void ipi_flush_tlb_kernel_
 	local_flush_tlb_kernel_range(ta->ta_start, ta->ta_end);
 }
 
+static inline void local_flush_tlb_kernel_vas(struct list_head *vmap_list)
+{
+	struct vmap_area *va;
+
+	list_for_each_entry(va, vmap_list, list)
+		local_flush_tlb_kernel_range(va->va_start, va->va_end);
+}
+
+static inline void ipi_flush_tlb_kernel_vas(void *arg)
+{
+	local_flush_tlb_kernel_vas(arg);
+}
+
 static inline void ipi_flush_bp_all(void *ignored)
 {
 	local_flush_bp_all();
@@ -244,6 +258,15 @@ void flush_tlb_kernel_range(unsigned lon
 	broadcast_tlb_a15_erratum();
 }
 
+void flush_tlb_kernel_vas(struct list_head *vmap_list, unsigned long num_entries)
+{
+	if (tlb_ops_need_broadcast()) {
+		on_each_cpu(ipi_flush_tlb_kernel_vas, vmap_list, 1);
+	} else
+		local_flush_tlb_kernel_vas(vmap_list);
+	broadcast_tlb_a15_erratum();
+}
+
 void flush_bp_all(void)
 {
 	if (tlb_ops_need_broadcast())
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -77,6 +77,7 @@ config X86
 	select ARCH_HAS_DEVMEM_IS_ALLOWED
 	select ARCH_HAS_EARLY_DEBUG if KGDB
 	select ARCH_HAS_ELF_RANDOMIZE
+	select ARCH_HAS_FLUSH_TLB_KERNEL_VAS
 	select ARCH_HAS_FAST_MULTIPLIER
 	select ARCH_HAS_FORTIFY_SOURCE
 	select ARCH_HAS_GCOV_PROFILE_ALL
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -10,6 +10,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -1081,6 +1082,27 @@ void flush_tlb_kernel_range(unsigned lon
 	}
 }
 
+static void do_flush_tlb_vas(void *arg)
+{
+	struct list_head *vmap_list = arg;
+	struct vmap_area *va;
+	unsigned long addr;
+
+	list_for_each_entry(va, vmap_list, list) {
+		/* flush range by one by one 'invlpg' */
+		for (addr = va->va_start; addr < va->va_end; addr += PAGE_SIZE)
+			flush_tlb_one_kernel(addr);
+	}
+}
+
+void flush_tlb_kernel_vas(struct list_head *vmap_list, unsigned long num_entries)
+{
+	if (num_entries > tlb_single_page_flush_ceiling)
+		on_each_cpu(do_flush_tlb_all, NULL, 1);
+	else
+		on_each_cpu(do_flush_tlb_vas, vmap_list, 1);
+}
+
 /*
  * This can be used from process context to figure out what the value of
  * CR3 is without needing to do a (slow) __read_cr3().
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -295,4 +295,6 @@ bool vmalloc_dump_obj(void *object);
 static inline bool vmalloc_dump_obj(void *object) { return false; }
 #endif
 
+void flush_tlb_kernel_vas(struct list_head *list, unsigned long num_entries);
+
 #endif /* _LINUX_VMALLOC_H */
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1724,7 +1724,8 @@ static void purge_fragmented_blocks_allc
  */
 static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
 {
-	unsigned long resched_threshold;
+	unsigned long resched_threshold, num_entries = 0, num_alias_entries = 0;
+	struct vmap_area alias_va = { .va_start = start, .va_end = end };
 	unsigned int num_purged_areas = 0;
 	struct list_head local_purge_list;
 	struct vmap_area *va, *n_va;
@@ -1736,18 +1737,29 @@ static bool __purge_vmap_area_lazy(unsig
 	list_replace_init(&purge_vmap_area_list, &local_purge_list);
 	spin_unlock(&purge_vmap_area_lock);
 
-	if (unlikely(list_empty(&local_purge_list)))
-		goto out;
+	start = min(start, list_first_entry(&local_purge_list, struct vmap_area, list)->va_start);
+	end = max(end, list_last_entry(&local_purge_list, struct vmap_area, list)->va_end);
 
-	start = min(start,
-		    list_first_entry(&local_purge_list,
-			struct vmap_area, list)->va_start);
-
-	end = max(end,
-		  list_last_entry(&local_purge_list,
-			struct vmap_area, list)->va_end);
+	if (IS_ENABLED(CONFIG_ARCH_HAS_FLUSH_TLB_KERNEL_VAS)) {
+		list_for_each_entry(va, &local_purge_list, list)
+			num_entries += (va->va_end - va->va_start) >> PAGE_SHIFT;
+
+		if (unlikely(!num_entries))
+			goto out;
+
+		if (alias_va.va_end > alias_va.va_start) {
+			num_alias_entries = (alias_va.va_end - alias_va.va_start) >> PAGE_SHIFT;
+			list_add(&alias_va.list, &local_purge_list);
+		}
+
+		flush_tlb_kernel_vas(&local_purge_list, num_entries + num_alias_entries);
+
+		if (num_alias_entries)
+			list_del(&alias_va.list);
+	} else {
+		flush_tlb_kernel_range(start, end);
+	}
 
-	flush_tlb_kernel_range(start, end);
 	resched_threshold = lazy_max_pages() << 1;
 
 	spin_lock(&free_vmap_area_lock);
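
For readers following the thread, here is a minimal userspace sketch in C of
the accounting the patch relies on. It is not kernel code. It models the page
count accumulated in num_entries, the x86-style choice between per-page
invalidation and a full flush, and the number of cross-CPU broadcasts needed
when flushing one range at a time versus walking the whole list in a single
call. struct area, count_pages(), choose_flush(), the broadcasts_*() helpers
and the 4 KiB page size are hypothetical names and assumptions made for
illustration only. The ceiling is passed in as a plain parameter instead of
being read from tlb_single_page_flush_ceiling.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the fields of struct vmap_area used here. */
struct area {
	unsigned long va_start;
	unsigned long va_end;
};

enum flush_kind { FLUSH_NONE, FLUSH_PER_PAGE, FLUSH_ALL };

/* Sum the pages covered by all areas, like num_entries in the patch. */
static unsigned long count_pages(const struct area *areas, size_t n)
{
	unsigned long pages = 0;

	for (size_t i = 0; i < n; i++) {
		assert(areas[i].va_end >= areas[i].va_start);
		pages += (areas[i].va_end - areas[i].va_start) >> PAGE_SHIFT;
	}
	return pages;
}

/*
 * Model of the x86 decision in the patch: more pages than the ceiling
 * selects a full flush, otherwise each page is invalidated individually.
 */
static enum flush_kind choose_flush(unsigned long pages, unsigned long ceiling)
{
	if (!pages)
		return FLUSH_NONE;
	return pages > ceiling ? FLUSH_ALL : FLUSH_PER_PAGE;
}

/* One broadcast per area when each range is flushed by a separate call. */
static unsigned long broadcasts_per_range(size_t n_areas)
{
	return n_areas;
}

/* A single broadcast when every CPU walks the whole list itself. */
static unsigned long broadcasts_list_walk(size_t n_areas)
{
	return n_areas ? 1 : 0;
}
```

With three purged areas, broadcasts_per_range() yields three broadcasts and
broadcasts_list_walk() yields one, which is the "3 IPIs" versus one IPI
comparison made earlier in the thread. The extra IPI for the Cortex-A15
erratum on ARM32 is deliberately left out of this model.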