Date: Thu, 30 May 2024 15:03:42 +0100
From: Dave Martin
To: Mark Brown
Cc: Catalin Marinas, Will Deacon, Benjamin Herrenschmidt,
    linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v6] arm64/fpsimd: Suppress SVE access traps when loading FPSIMD state
In-Reply-To: <20240529-arm64-sve-trap-mitigation-v6-1-c2037be6aced@kernel.org>

Hi Mark,

On Wed, May 29, 2024 at 08:46:23PM +0100, Mark Brown wrote:
> When we are in a syscall we take the opportunity to discard the SVE state,
> saving only the FPSIMD subset of the register state. When we reload the
> state from memory we reenable SVE access traps, stopping tracking SVE until
> the task takes another SVE access trap. This means that for a task which is
> actively using SVE many blocking system calls will have the additional
> overhead of a SVE access trap.

Playing devil's advocate here: doesn't a blocking syscall already imply
a high overhead (at least in terms of latency for the thread concerned)?

i.e., does letting TIF_SVE linger across some blocking syscalls make a
meaningful difference in some known use case?

(For non-blocking syscalls the argument for allowing TIF_SVE to linger
seems a lot stronger.)
> As SVE deployment is progressing we are seeing much wider use of the SVE
> instruction set, including performance optimised implementations of
> operations like memset() and memcpy(), which mean that even tasks which are
> not obviously floating point based can end up with substantial SVE usage.
>
> It does not, however, make sense to just unconditionally use the full SVE
> register state all the time since it is larger than the FPSIMD register
> state, so there is overhead saving and restoring it on context switch, and
> our requirement to flush the register state not shared with FPSIMD on
> syscall also creates a noticeable overhead on system call.
>
> I did some instrumentation which counted the number of SVE access traps
> and the number of times we loaded FPSIMD only register state for each task.
> Testing with Debian Bookworm this showed that during boot the overwhelming
> majority of tasks triggered another SVE access trap more than 50% of the
> time after loading FPSIMD only state, with a substantial number near 100%,
> though some programs had a very small number of SVE accesses, most likely
> from startup. There were few tasks in the range 5-45%; most tasks either
> used SVE frequently or used it only a tiny proportion of the time. As
> expected, older distributions which do not have the SVE performance work
> available showed no SVE usage in general applications.
>
> This indicates that there should be some useful benefit from reducing the
> number of SVE access traps for blocking system calls, like we did for
> non-blocking system calls in commit 8c845e273104 ("arm64/sve: Leave SVE
> enabled on syscall if we don't context switch"). Let's do this with a
> timeout: when we take an SVE access trap, record a jiffies value after
> which we'll reenable SVE traps, then check this whenever we load FPSIMD
> only floating point state from memory.
> If the time has passed then we reenable traps, otherwise we leave traps
> disabled and flush the non-shared register state like we would on trap.
>
> The timeout is currently set to a second; I pulled this number out of
> thin air, so there is doubtless some room for tuning. This means that for
> a task which is actively using SVE the number of SVE access traps will be
> substantially reduced, but applications which use SVE only very
> infrequently will avoid the overheads associated with tracking SVE state
> after a second. The extra cost from additional tracking of SVE state
> only occurs when a task is preempted, so short running tasks should be
> minimally affected.

Could your instrumentation be extended to build a histogram of the
delay between successive SVE traps per task?

There's a difference here between a task that takes a lot of traps in a
burst (perhaps due to startup or a specific library call), versus a task
that uses SVE sporadically for all time. I wonder whether the sweet spot
for the timeout may be quite a lot shorter than a second.

Still, once we have something we can tune, we can always tune it later
as you suggest.

>
> There should be no functional change resulting from this, it is purely a
> performance optimisation.
>
> Signed-off-by: Mark Brown
> ---
> Changes in v6:
> - Rebase onto v6.10-rc1.
> - Link to v5: https://lore.kernel.org/r/20240405-arm64-sve-trap-mitigation-v5-1-126fe2515ef1@kernel.org
>
> Changes in v5:
> - Rebase onto v6.9-rc1.
> - Use a timeout rather than number of state loads to decide when to
>   reenable traps.
> - Link to v4: https://lore.kernel.org/r/20240122-arm64-sve-trap-mitigation-v4-1-54e0d78a3ae9@kernel.org
>
> Changes in v4:
> - Rebase onto v6.8-rc1.
> - Link to v3: https://lore.kernel.org/r/20231113-arm64-sve-trap-mitigation-v3-1-4779c9382483@kernel.org
>
> Changes in v3:
> - Rebase onto v6.7-rc1.
> - Link to v2: https://lore.kernel.org/r/20230913-arm64-sve-trap-mitigation-v2-1-1bdeff382171@kernel.org
>
> Changes in v2:
> - Rebase onto v6.6-rc1.
> - Link to v1: https://lore.kernel.org/r/20230807-arm64-sve-trap-mitigation-v1-1-d92eed1d2855@kernel.org
> ---
>  arch/arm64/include/asm/processor.h |  1 +
>  arch/arm64/kernel/fpsimd.c         | 42 ++++++++++++++++++++++++++++++++------
>  2 files changed, 37 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
> index f77371232d8c..7a6ed0551291 100644
> --- a/arch/arm64/include/asm/processor.h
> +++ b/arch/arm64/include/asm/processor.h
> @@ -164,6 +164,7 @@ struct thread_struct {
>  	unsigned int		fpsimd_cpu;
>  	void			*sve_state;	/* SVE registers, if any */
>  	void			*sme_state;	/* ZA and ZT state, if any */
> +	unsigned long		sve_timeout;	/* jiffies to drop TIF_SVE */
>  	unsigned int		vl[ARM64_VEC_MAX];	/* vector length */
>  	unsigned int		vl_onexec[ARM64_VEC_MAX]; /* vl after next exec */
>  	unsigned long		fault_address;	/* fault info */
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index 82e8a6017382..4741e4fb612a 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -354,6 +354,7 @@ static void task_fpsimd_load(void)
>  {
>  	bool restore_sve_regs = false;
>  	bool restore_ffr;
> +	unsigned long sve_vq_minus_one;
>
>  	WARN_ON(!system_supports_fpsimd());
>  	WARN_ON(preemptible());
> @@ -365,18 +366,12 @@ static void task_fpsimd_load(void)
>  	if (system_supports_sve() || system_supports_sme()) {
>  		switch (current->thread.fp_type) {
>  		case FP_STATE_FPSIMD:
> -			/* Stop tracking SVE for this task until next use. */
> -			if (test_and_clear_thread_flag(TIF_SVE))
> -				sve_user_disable();
>  			break;
>  		case FP_STATE_SVE:
>  			if (!thread_sm_enabled(&current->thread) &&
>  			    !WARN_ON_ONCE(!test_and_set_thread_flag(TIF_SVE)))
>  				sve_user_enable();
>
> -			if (test_thread_flag(TIF_SVE))
> -				sve_set_vq(sve_vq_from_vl(task_get_sve_vl(current)) - 1);
> -
>  			restore_sve_regs = true;
>  			restore_ffr = true;
>  			break;
> @@ -395,6 +390,15 @@ static void task_fpsimd_load(void)
>  		}
>  	}
>
> +	/*
> +	 * If SVE has been enabled we may keep it enabled even if
> +	 * loading only FPSIMD state, so always set the VL.
> +	 */
> +	if (system_supports_sve() && test_thread_flag(TIF_SVE)) {
> +		sve_vq_minus_one = sve_vq_from_vl(task_get_sve_vl(current)) - 1;
> +		sve_set_vq(sve_vq_minus_one);
> +	}
> +
>  	/* Restore SME, override SVE register configuration if needed */
>  	if (system_supports_sme()) {
>  		unsigned long sme_vl = task_get_sme_vl(current);
> @@ -421,6 +425,25 @@ static void task_fpsimd_load(void)
>  	} else {
>  		WARN_ON_ONCE(current->thread.fp_type != FP_STATE_FPSIMD);
>  		fpsimd_load_state(&current->thread.uw.fpsimd_state);
> +
> +		/*
> +		 * If the task had been using SVE we keep it enabled
> +		 * when loading FPSIMD only state for a period to
> +		 * minimise overhead for tasks actively using SVE,
> +		 * disabling it periodically to ensure that tasks that
> +		 * use SVE intermittently do eventually avoid the
> +		 * overhead of carrying SVE state. The timeout is
> +		 * initialised when we take an SVE trap in
> +		 * do_sve_acc().
> +		 */
> +		if (system_supports_sve() && test_thread_flag(TIF_SVE)) {
> +			if (time_after(jiffies, current->thread.sve_timeout)) {
> +				clear_thread_flag(TIF_SVE);
> +				sve_user_disable();
> +			} else {
> +				sve_flush_live(true, sve_vq_minus_one);

Didn't we already flush Zn[max:128] as a side-effect of loading the
V-regs in fpsimd_load_state() above?
Also, unless I'm missing something, prior to this patch we could just
fall through this code with TIF_SVE still set, suggesting that either
this flush is not needed for some reason, or it is shadowed by another
flush done somewhere else, or a flush is currently needed but missing.

Am I just getting myself confused here?

(Or, do the deletions from the switch in the earlier hunk cancel this
out?)

> +			}
> +		}
>  	}
>  }
>
> @@ -1397,6 +1420,13 @@ void do_sve_acc(unsigned long esr, struct pt_regs *regs)
>
>  	get_cpu_fpsimd_context();
>
> +	/*
> +	 * We will keep SVE enabled when loading FPSIMD only state for
> +	 * the next second to minimise traps when userspace is
> +	 * actively using SVE.
> +	 */
> +	current->thread.sve_timeout = jiffies + HZ;
> +
>  	if (test_and_set_thread_flag(TIF_SVE))
>  		WARN_ON(1);	/* SVE access shouldn't have trapped */
>
>
> ---
> base-commit: 1613e604df0cd359cf2a7fbd9be7a0bcfacfabd0
> change-id: 20230807-arm64-sve-trap-mitigation-2e7e2663c849
>
> Best regards,
> --
> Mark Brown

[...]

Cheers
---Dave