From: Christoph Lameter (Ampere)
To: linux-arm-kernel@lists.infradead.org
Cc: "tokamoto@jp.fujitsu.com"
Cc: "qi.fuli@fujitsu.com"
Cc: Takao Indoh
Cc: Will Deacon
Subject: [RFC 0/8] ARM64 TLB logic revision for more control and enhanced diagnostics
Date: Mon, 04 Dec 2023 11:15:26 -0800
Message-ID: <20231204191526.913822216@linux.com>
User-Agent: quilt/0.66

WARNING: First draft of the patchset, tested with kernel compiles, may
corrupt your memory.

This patchset is intended to help with debugging and scaling TLB
operations on ARM64. This is particularly desirable for ARM
architectures with large numbers of cores. What is often seen there is
that the mesh is flooded with snoop traffic related to TLBI broadcasts.

The patchset adds the following features:

- Allow diagnostics via /proc/vmstat, as is already possible on x86 with
  the CONFIG_DEBUG_TLBFLUSH option.

  Some sample output:

    cat /proc/vmstat
    ...
    nr_tlb_remote_flush 554104            Flushes converted to IPIs with local flushing
    nr_tlb_remote_flush_received 1312243  IPIs received to perform local flushing
    nr_tlb_local_flush_all 141837         Local flush alls
    nr_tlb_local_flush_range 571784       Local flush range
    nr_tlb_local_flush_one 8011880        Local individual page flushes
    nr_tlb_flush_all 28239                Flush alls through the mesh
    nr_tlb_flush_range 13003              Flush range through the mesh
    nr_tlb_flush_one 54764                Flush one through the mesh
    nr_tlb_skipped 0                      Suppressed flush

- Track the cores that have used an address space. From that we can
  compute the weight of the cpus (the number of cpus) that have used
  this address space and decide how best to do the flushing when such
  an action is required.

- Control the TLB flushing behavior via the kernel command line and also
  on a running system.

    New kernel parameter

        tlb_mode=

    New debugfs setting

        /sys/kernel/debug/tlb_mode

  tlb_mode consists of a set of feature flags starting at bit 10.
  Bits 0-9 set the boundary for the cpu weight that will lead to a mesh
  flush. If the cpu weight is lower than that boundary then IPIs are
  sent instead, avoiding the mesh.

  Feature flags:

  Bit 10 = If the current cpu is the only one that has ever used an
           address space then perform local invalidation. This catches
           the majority of flushes on boot and the activities of
           typical single threaded Unixy processes.

  Bit 11 = Enable TLB range. Various hardware has problems with TLB
           range. Clearing this bit tells the kernel that TLB range
           should not be used and that an alternate method is to be
           used to do the flushing.

  Bit 12 = Suppress TLB flushes if the address space is unused.
           If this bit is set and a flush is requested in an unused
           address space then no flush will be performed, since there
           cannot be any TLB entries. If this bit is not set then a
           mesh flush is performed (just to be sure).

- Autotune the feature flags on bootup if the user has not specified
  tlb_mode. This calculates a balance between IPIs and mesh flushing
  based on the number of cpus in the system. It always enables local
  invalidation, and enables TLB range flushing if the processor
  features indicate that the processor supports it.

We need a more detailed description but I hope this is enough to get
started.

These issues have been discussed before in a patchset from 2019 that
contains a similar feature:

https://lore.kernel.org/linux-arm-kernel/20190617143255.10462-1-indou.takao@jp.fujitsu.com/
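
To make the tlb_mode layout described above concrete, here is a minimal
userspace sketch of how the value could be decoded and turned into a
flush decision. It is not taken from the patches: every identifier
(TLB_MODE_*, enum tlb_action, tlb_mode_make(), tlb_choose_action()) is
hypothetical, and the order in which the checks are applied is an
assumption. Only the bit positions and their meaning come from the
description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit layout as described in this mail; all names are hypothetical. */
#define TLB_MODE_WEIGHT_MASK     0x3ffu      /* bits 0-9: cpu weight boundary */
#define TLB_MODE_LOCAL           (1u << 10)  /* local invalidation if only user */
#define TLB_MODE_RANGE           (1u << 11)  /* TLB range operations enabled */
#define TLB_MODE_SUPPRESS_UNUSED (1u << 12)  /* skip flush of unused mm */

enum tlb_action {
	TLB_SKIP,   /* no flush at all */
	TLB_LOCAL,  /* invalidate on the current cpu only */
	TLB_IPI,    /* send IPIs, each target flushes locally */
	TLB_MESH,   /* broadcast TLBI through the mesh */
};

/* Compose a tlb_mode value from a weight boundary and feature flags. */
static unsigned int tlb_mode_make(unsigned int boundary, unsigned int flags)
{
	assert(boundary <= TLB_MODE_WEIGHT_MASK);
	assert((flags & TLB_MODE_WEIGHT_MASK) == 0);
	return boundary | flags;
}

/*
 * Pick a flush method for an address space.
 *
 * weight:            number of cpus that have ever used the address space
 * only_current_cpu:  true if the current cpu is the only such cpu
 *
 * The ordering of the checks below is an assumption made for this sketch.
 */
static enum tlb_action tlb_choose_action(unsigned int mode,
					 unsigned int weight,
					 bool only_current_cpu)
{
	unsigned int boundary = mode & TLB_MODE_WEIGHT_MASK;

	/* Unused address space: skip if bit 12 is set, else mesh flush. */
	if (weight == 0)
		return (mode & TLB_MODE_SUPPRESS_UNUSED) ? TLB_SKIP : TLB_MESH;

	/* Bit 10: the current cpu is the only one that ever ran this mm. */
	if (only_current_cpu && (mode & TLB_MODE_LOCAL))
		return TLB_LOCAL;

	/* Below the boundary use IPIs, otherwise go through the mesh. */
	return (weight < boundary) ? TLB_IPI : TLB_MESH;
}

/* Bit 11 only permits range operations; the hardware must support them too. */
static bool tlb_use_range(unsigned int mode, bool hw_has_tlb_range)
{
	return (mode & TLB_MODE_RANGE) && hw_has_tlb_range;
}
```

With a boundary of 8 and bits 10 and 12 set, the composed value is
8 + 1024 + 4096 = 5128. In this sketch an address space used by 4 cpus
would then be flushed via IPIs, one used by 8 or more cpus via the mesh,
and one that no cpu has used would not be flushed at all.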