Date: Sat, 08 Jan 2022 12:51:23 +0000
Message-ID: <87pmp2tmpg.wl-maz@kernel.org>
From: Marc Zyngier
To: He Ying
Subject: Re: [PATCH] arm64: Make CONFIG_ARM64_PSEUDO_NMI macro wrap all the pseudo-NMI code
In-Reply-To: <20220107085536.214501-1-heying24@huawei.com>
References: <20220107085536.214501-1-heying24@huawei.com>

On Fri, 07 Jan 2022 08:55:36 +0000,
He Ying wrote:
>
> Our product has been updating its kernel from 4.4 to 5.10 recently and
> found a performance issue. We do a bussiness test called ARP test, which
> tests the latency for a ping-pong packets traffic with a certain payload.
> The result is as following.
>
> - 4.4 kernel: avg = ~20s
> - 5.10 kernel (CONFIG_ARM64_PSEUDO_NMI is not set): avg = ~40s
>
> I have been just learning arm64 pseudo-NMI code and have a question,
> why is the related code not wrapped by CONFIG_ARM64_PSEUDO_NMI?
> I wonder if this brings some performance regression.
>
> First, I make this patch and then do the test again. Here's the result.
>
> - 5.10 kernel with this patch not applied: avg = ~40s
> - 5.10 kernel with this patch applied: avg = ~23s
>
> Amazing! Note that all kernel is built with CONFIG_ARM64_PSEUDO_NMI not
> set. It seems the pseudo-NMI feature actually brings some overhead to
> performance event if CONFIG_ARM64_PSEUDO_NMI is not set.
>
> Furthermore, I find the feature also brings some overhead to vmlinux size.
> I build 5.10 kernel with this patch applied or not while
> CONFIG_ARM64_PSEUDO_NMI is not set.
>
> - 5.10 kernel with this patch not applied: vmlinux size is 384060600 Bytes.
> - 5.10 kernel with this patch applied: vmlinux size is 383842936 Bytes.
>
> That means arm64 pseudo-NMI feature may bring ~200KB overhead to
> vmlinux size.
>
> Above all, arm64 pseudo-NMI feature brings some overhead to vmlinux size
> and performance even if config is not set. To avoid it, add macro control
> all around the related code.

This obviously attracted my attention, and I took this patch for a
ride on 5.16-rc8 on a machine that doesn't support GICv3 NMIs, to make
sure that any extra code would only result in pure overhead.

There was no measurable difference with this patch applied or not,
with CONFIG_ARM64_PSEUDO_NMI selected or not, for the workloads I
tried (I/O heavy virtual machines, hackbench).

Mark already asked a number of questions (test case, implementation,
test on a modern kernel). Please provide as much detail as you
possibly can, because such a regression really isn't expected, and
doesn't show up on the systems I have at hand. Some profiling numbers
could also be interesting, in case this is the result of a particular
resource being thrashed (TLB, cache...).

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.