Date: Wed, 12 May 2021 14:49:09 +0100
From: Dave Martin
To: Mark Brown
Cc: Catalin Marinas, Will Deacon, linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v2 3/3] arm64/sve: Skip flushing Z registers with 128 bit vectors
Message-ID: <20210512134909.GF4187@arm.com>
References: <20210511160446.42871-1-broonie@kernel.org> <20210511160446.42871-4-broonie@kernel.org>
In-Reply-To: <20210511160446.42871-4-broonie@kernel.org>

On Tue, May 11, 2021 at 05:04:46PM +0100, Mark Brown wrote:
> When the SVE vector length is 128 bits then there are no bits in the Z
> registers which are not shared with the V registers so we can skip them
> when zeroing state not shared with FPSIMD, this results in a minor
> performance improvement.
>
> Signed-off-by: Mark Brown
> ---
>  arch/arm64/include/asm/fpsimd.h  | 2 +-
>  arch/arm64/kernel/entry-fpsimd.S | 9 +++++++--
>  arch/arm64/kernel/fpsimd.c       | 6 ++++--
>  3 files changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
> index 2599504674b5..c072161d5c65 100644
> --- a/arch/arm64/include/asm/fpsimd.h
> +++ b/arch/arm64/include/asm/fpsimd.h
> @@ -69,7 +69,7 @@ static inline void *sve_pffr(struct thread_struct *thread)
>  extern void sve_save_state(void *state, u32 *pfpsr);
>  extern void sve_load_state(void const *state, u32 const *pfpsr,
>  			   unsigned long vq_minus_1);
> -extern void sve_flush_live(void);
> +extern void sve_flush_live(unsigned long vq_minus_1);
>  extern void sve_load_from_fpsimd_state(struct user_fpsimd_state const *state,
>  				       unsigned long vq_minus_1);
>  extern unsigned int sve_get_vl(void);
> diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
> index dd8382e5ce82..87ef25836963 100644
> --- a/arch/arm64/kernel/entry-fpsimd.S
> +++ b/arch/arm64/kernel/entry-fpsimd.S
> @@ -69,10 +69,15 @@ SYM_FUNC_START(sve_load_from_fpsimd_state)
>  	ret
>  SYM_FUNC_END(sve_load_from_fpsimd_state)
>
> -/* Zero all SVE registers but the first 128-bits of each vector */
> +/*
> + * Zero all SVE registers but the first 128-bits of each vector
> + *
> + * x0 = VQ - 1

This does require that ZCR_EL1.LEN has already been set to match x0, and
is not changed again before entering userspace.

It would be a good idea to at least describe this in a comment so that
this doesn't get forgotten later on, but there's a limit to how
foolproof this low-level backend code needs to be...

> + */
>  SYM_FUNC_START(sve_flush_live)
> +	cbz		x0, 1f	// A VQ-1 of 0 is 128 bits so no extra Z state
>  	sve_flush_z
> -	sve_flush_p_ffr
> +1:	sve_flush_p_ffr
>  	ret
>  SYM_FUNC_END(sve_flush_live)
>
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index ad3dd34a83cf..e57b23f95284 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -957,8 +957,10 @@ void do_sve_acc(unsigned int esr, struct pt_regs *regs)
>  	 * disabling the trap, otherwise update our in-memory copy.
>  	 */
>  	if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
> -		sve_set_vq(sve_vq_from_vl(current->thread.sve_vl) - 1);
> -		sve_flush_live();
> +		unsigned long vq_minus_one =
> +			sve_vq_from_vl(current->thread.sve_vl) - 1;
> +		sve_set_vq(vq_minus_one);
> +		sve_flush_live(vq_minus_one);
>  		fpsimd_bind_task_to_cpu();
>  	} else {
>  		fpsimd_to_sve(current);
> --
> 2.20.1

With a comment added as outlined above,

Reviewed-by: Dave Martin

Cheers
---Dave
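
For illustration only, the arithmetic behind the cbz check can be modelled in plain userspace C. This is a rough sketch, not kernel code: every model_* name and MODEL_VQ_BYTES is hypothetical, and the model assumes the vector length is given in bytes and that one quadword (VQ) is 16 bytes. It only shows why a VQ - 1 of 0 leaves no Z register state outside the V registers; it does not model ZCR_EL1.LEN or the requirement that LEN already matches x0.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: one quadword (VQ) is 128 bits, i.e. 16 bytes. */
#define MODEL_VQ_BYTES 16u

/*
 * Hypothetical stand-in for sve_vq_from_vl(): convert a vector length
 * in bytes to a count of 128-bit quadwords. SVE vector lengths are
 * multiples of 128 bits, so anything else is rejected.
 */
static unsigned int model_vq_from_vl(unsigned int vl_bytes)
{
	assert(vl_bytes >= MODEL_VQ_BYTES && vl_bytes % MODEL_VQ_BYTES == 0);
	return vl_bytes / MODEL_VQ_BYTES;
}

/*
 * Bytes of each Z register that lie beyond the low 128 bits shared
 * with the corresponding V register.
 */
static unsigned int model_z_bytes_not_shared(unsigned int vl_bytes)
{
	return (model_vq_from_vl(vl_bytes) - 1) * MODEL_VQ_BYTES;
}

/*
 * Models the decision made by "cbz x0, 1f": the Z flush is only
 * needed when VQ - 1 is non-zero. P registers and FFR are flushed
 * regardless, so they are not part of this decision.
 */
static bool model_needs_z_flush(unsigned long vq_minus_1)
{
	return vq_minus_1 != 0;
}
```

Under these assumptions, a 16-byte vector length gives model_needs_z_flush(model_vq_from_vl(16) - 1) == false and no unshared Z bytes, while any larger vector length gives true.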