From: Robin Murphy
Date: Wed, 24 May 2023 12:05:57 +0100
Subject: Re: [PATCH] ARM: tlb: Prevent flushing insane large ranges one by one
To: "Russell King (Oracle)"
Cc: Thomas Gleixner, linux-arm-kernel@lists.infradead.org, John Ogness, Arnd Bergmann
References: <87pm6qm5wo.ffs@tglx> <14ca0866-cdf1-736c-409b-7318bdfba71f@arm.com>

On 2023-05-24 11:23, Russell King (Oracle) wrote:
> On Wed, May 24, 2023 at 11:18:12AM +0100, Robin Murphy wrote:
>> On 2023-05-24 10:32, Thomas Gleixner wrote:
>>> vmalloc uses lazy TLB flushes for unmapped ranges to avoid excessive TLB
>>> flushing on every unmap. The lazy flushing coalesces unmapped ranges and
>>> invokes flush_tlb_kernel_range() with the combined range.
>>>
>>> The coalescing can result in ranges which spawn the full vmalloc address
>>> range. In the case of flushing an executable mapping in the module address
>>> space this range is extended to also flush the direct map alias.
>>>
>>> flush_tlb_kernel_range() then walks insane large ranges, the worst case
>>> observed was ~1.5GB.
>>>
>>> The range is flushed page by page, which takes several milliseconds to
>>> complete in the worst case and obviously affects all processes in the
>>> system.
>>> In the worst case observed this causes the runtime of a realtime
>>> task on an isolated CPU to be almost doubled over the normal worst
>>> case, which makes it miss the deadline.
>>>
>>> Cure this by sanity checking the range against a threshold and fall back to
>>> tlb_flush_all() when the range is too large.
>>>
>>> The default threshold is 32 pages, but for CPUs with CP15 this is evaluated
>>> at boot time via read_cpuid(CPUID_TLBTYPE) and set to the half of the TLB
>>> size.
>>>
>>> The vmalloc range coalescing could be improved to provide a list or
>>> array of ranges to flush, which allows to avoid overbroad flushing, but
>>> that's a major surgery and does not solve the problem of actual
>>> justified large range flushes which can happen due to the lazy flush
>>> mechanics in vmalloc. The lazy flush results in batching which is biased
>>> towards large range flushes by design.
>>>
>>> Fixes: db64fe02258f ("mm: rewrite vmap layer")
>>> Reported-by: John Ogness
>>> Debugged-by: John Ogness
>>> Signed-off-by: Thomas Gleixner
>>> Tested-by: John Ogness
>>> Link: https://lore.kernel.org/all/87a5y5a6kj.ffs@tglx
>>> ---
>>>  arch/arm/include/asm/cputype.h  |  5 +++++
>>>  arch/arm/include/asm/tlbflush.h |  2 ++
>>>  arch/arm/kernel/setup.c         | 10 ++++++++++
>>>  arch/arm/kernel/smp_tlb.c       |  4 ++++
>>>  4 files changed, 21 insertions(+)
>>>
>>> --- a/arch/arm/include/asm/cputype.h
>>> +++ b/arch/arm/include/asm/cputype.h
>>> @@ -196,6 +196,11 @@ static inline unsigned int __attribute_c
>>>  	return read_cpuid(CPUID_MPUIR);
>>>  }
>>> +static inline unsigned int __attribute_const__ read_cpuid_tlbsize(void)
>>> +{
>>> +	return 64 << ((read_cpuid(CPUID_TLBTYPE) >> 1) & 0x03);
>>> +}
>>
>> This appears to be specific to Cortex-A9 - these bits are
>> implementation-defined, and it looks like on on most other Arm Ltd. CPUs
>> they have no meaning at all, e.g.[1][2][3], but they could still hold some
>> wildly unrelated value on other implementations.
>
> That sucks.
> I guess we'll need to decode the main CPU ID register and
> have a table, except for Cortex-A9 where we can read the TLB size.

Yes, it seems like Cortex-A9 is the odd one out for having
configurability here, otherwise the sizes seem to range from 32 entries
on Cortex-A8 to 1024 entries for Cortex-A17's main TLB, so having just
one single default value would seem less than optimal.

Thanks,
Robin.

> If that's not going to work either, then the MM layer needs to get
> fixed not to be so utterly stupid to request a TLB flush over an
> insanely large range - or people will just have to put up with
> latency sucking on 32-bit ARM platforms.
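
For illustration only, the "decode the main CPU ID register and have a table" idea could look roughly like the sketch below. It is plain user-space C, not kernel code and not part of the posted patch. The names tlb_size_table, tlb_flush_threshold() and range_needs_full_flush() are hypothetical, the table holds only the two sizes mentioned in this thread, halving the TLB size follows the patch description, and the 4 KiB page size is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* MIDR: implementer in bits [31:24], primary part number in bits [15:4]. */
#define MIDR_IMPL_PART_MASK	0xff00fff0u

#define PART_CORTEX_A8		0x4100c080u
#define PART_CORTEX_A9		0x4100c090u
#define PART_CORTEX_A17		0x4100c0e0u

/* Fallback named in the patch description: 32 pages. */
#define DEFAULT_FLUSH_THRESHOLD	32u

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT	12

struct tlb_size_entry {
	uint32_t part;		/* masked MIDR value */
	unsigned int entries;	/* main TLB entries */
};

/* Hypothetical table; only the sizes quoted in this thread are listed. */
static const struct tlb_size_entry tlb_size_table[] = {
	{ PART_CORTEX_A8,  32 },
	{ PART_CORTEX_A17, 1024 },
};

/* Return the flush-all threshold in pages: half of the TLB size. */
static unsigned int tlb_flush_threshold(uint32_t midr, uint32_t tlbtype)
{
	uint32_t part = midr & MIDR_IMPL_PART_MASK;
	size_t i;

	/* Cortex-A9 only: size is encoded in TLBTYPE bits [2:1]. */
	if (part == PART_CORTEX_A9)
		return (64u << ((tlbtype >> 1) & 0x03)) / 2;

	for (i = 0; i < sizeof(tlb_size_table) / sizeof(tlb_size_table[0]); i++)
		if (tlb_size_table[i].part == part)
			return tlb_size_table[i].entries / 2;

	/* Unknown CPU: do not trust TLBTYPE, use the default. */
	return DEFAULT_FLUSH_THRESHOLD;
}

/* Non-zero when [start, end) covers more pages than the threshold. */
static int range_needs_full_flush(unsigned long start, unsigned long end,
				  unsigned int threshold_pages)
{
	return ((end - start) >> SKETCH_PAGE_SHIFT) > threshold_pages;
}
```

A caller would compute the threshold once at boot from the two register values and then consult range_needs_full_flush() before walking a range page by page. Any CPU missing from the table falls back to the 32-page default rather than interpreting implementation-defined TLBTYPE bits.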