Date: Wed, 15 Jul 2020 17:49:34 +0100
From: Dave Martin
To: Mark Brown
Cc: Julien Grall, Catalin Marinas, zhang.lei@jp.fujitsu.com, Will Deacon, linux-arm-kernel@lists.infradead.org, Daniel Kiss
Subject: Re: [PATCH v3 0/8] arm64/sve: First steps towards optimizing syscalls
Message-ID: <20200715164931.GC30452@arm.com>
References: <20200629133556.39825-1-broonie@kernel.org>
In-Reply-To: <20200629133556.39825-1-broonie@kernel.org>

On Mon, Jun 29, 2020 at
02:35:48PM +0100, Mark Brown wrote:
> This is a first attempt to optimize the syscall path when the user
> application uses SVE. The patch series was originally written by Julien
> Grall but has been left for a long time; I've updated it to current
> kernels and tried to address the pending review feedback that I found
> (which was mostly documentation issues). I may have missed some things
> there, apologies if I did, and one thing I've not yet done is produce a
> diagram of the states the relevant TIF_ flags can have - I need to work
> out a sensible format for that.
>
> Per the syscall ABI, SVE registers will be unknown after a syscall. In
> practice, the kernel will disable SVE and the registers will be zeroed
> (except the first 128 bits of each vector) on the next SVE instruction.
> In a workload mixing SVE and syscalls, this will result in 2 entries/exits
> to the kernel per syscall as we trap on the first SVE access after the
> syscall. This series aims to avoid the second entry/exit by zeroing the
> SVE registers on syscall return, with a twist when the task will get
> rescheduled.
>
> This implementation will have an impact on applications using SVE
> only once. SVE will now be turned on until the application terminates
> (unless disabling it via ptrace). Cleverer strategies for choosing
> between SVE and FPSIMD context switching are possible (see fpu_counter
> for SH in mainline, or [1]), but it is difficult to assess the benefit
> right now. We could improve the behaviour in the future as a selection
> of mature hardware platforms emerges that we can benchmark.
>
> It is also possible to optimize the case when the SVE vector-length
> is 128-bit (i.e. the same size as the FPSIMD vectors). This could be
> explored in the future.
>
> Note that the last patch for the series is not here to optimize syscall
> but SVE trap access by directly converting in hardware the FPSIMD state
> to SVE state. If there is interest to have this optimization earlier,
> I can reshuffle the patches in the series.
>
> v3:
> - Rebased to current kernels.
> - Addressed review comments from v2, mostly around tweaks in the
>   documentation.

Looks reasonable overall, apart from a few questions on some details
(partly because I haven't thought deeply about this stuff for a while).

I wonder whether we ought to accompany this with a crude mechanism to
choose dynamically between setting TIF_SVE_NEEDS_FLUSH and keeping the
old behaviour.

My concern with doing this unconditionally has been that we can end up
with TIF_SVE permanently stuck on, which increases the per-task
overhead. This is not a worry if the user task really does use SVE once
per context switch, but not so good if, say, the libc startup probes for
SVE to initialise some internal logic but the task otherwise doesn't use
it.

(This is just a worry: I haven't looked for evidence to support it.)

Either way, we should keep it pretty dumb until/unless we have a
compelling rationale for doing something cleverer.

Cheers
---Dave
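
For illustration only, here is a rough userspace model of the sort of dumb heuristic discussed above. It is not code from the series or from the kernel: struct task_model, the recent_traps counter, SVE_TRAP_THRESHOLD and all function names are made up for this sketch, and only the two flag names come from the thread. The idea modelled is a saturating per-task count of SVE access traps that decays on context switch; a task keeps the old behaviour (SVE disabled at syscall) until it has trapped often enough recently, and only then gets the flush-on-return behaviour.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tuning constant, not taken from the series. */
#define SVE_TRAP_THRESHOLD 4u

/* Userspace model of per-task state; fields only mirror the flag names. */
struct task_model {
	bool tif_sve;              /* models TIF_SVE */
	bool tif_sve_needs_flush;  /* models TIF_SVE_NEEDS_FLUSH */
	unsigned int recent_traps; /* saturating, decays on context switch */
	unsigned int total_traps;  /* bookkeeping for the experiment only */
};

/* The task executes an SVE instruction; it traps if SVE is disabled. */
static void model_sve_insn(struct task_model *t)
{
	if (t->tif_sve)
		return;
	t->tif_sve = true;
	t->total_traps++;
	if (t->recent_traps < SVE_TRAP_THRESHOLD)
		t->recent_traps++;
}

/* One syscall: choose a policy on entry, apply any flush on return. */
static void model_syscall(struct task_model *t)
{
	if (t->tif_sve) {
		if (t->recent_traps >= SVE_TRAP_THRESHOLD)
			t->tif_sve_needs_flush = true; /* keep SVE on */
		else
			t->tif_sve = false; /* old behaviour: next use traps */
	}
	if (t->tif_sve_needs_flush) {
		assert(t->tif_sve); /* flushing only makes sense with SVE on */
		t->tif_sve_needs_flush = false; /* registers zeroed here */
	}
}

/* Decay, so that a task which stops using SVE drifts back to FPSIMD. */
static void model_context_switch(struct task_model *t)
{
	if (t->recent_traps > 0)
		t->recent_traps--;
}

/*
 * Run a startup probe (one SVE instruction) followed by a syscall loop,
 * optionally using SVE before each syscall, with a context switch every
 * switch_every iterations (0 means never). Returns the traps taken.
 */
unsigned int count_traps(unsigned int iterations, bool sve_each_iteration,
			 unsigned int switch_every)
{
	struct task_model t = { 0 };
	unsigned int i;

	model_sve_insn(&t); /* e.g. a libc startup probe */
	for (i = 1; i <= iterations; i++) {
		if (sve_each_iteration)
			model_sve_insn(&t);
		model_syscall(&t);
		if (switch_every && i % switch_every == 0)
			model_context_switch(&t);
	}
	return t.total_traps;
}
```

With these made-up constants, count_traps(100, true, 0) takes 4 traps before settling into the flush path, while count_traps(100, false, 0) takes only the single probe trap and leaves TIF_SVE clear afterwards. The cost of the decay is roughly one extra trap per context switch for a task that really does use SVE, which is the trade-off a real heuristic would have to be benchmarked against.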