From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 46953C433EF for ; Tue, 14 Dec 2021 13:05:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:In-Reply-To:MIME-Version:References: Message-ID:Subject:Cc:To:From:Date:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=FIqn9eCqC2g6/8mox48UI8hsNqm3ERpyblJDQl1Tcuc=; b=mr3v3fhxcqFJ03 8U2TrFFSnjyiUTpncDMUXj66aIAJ7pCbVypnTXt4blkskceQ9pTXXdexek+nUmaGn4y74b+fkC2aM oad2vlejB/B7FPKAuQmfhHyqpceQ9qCY42j8jnYPFnMSczkSKLg714HLWnFmmL8SQ7lbqlHYgQMTs dc8OXzfzZs55BN/pXOdIFOWsDEoyLhcMXV0L0OTJa1/bGtc4epDt3x7pO3s4IBynYWvUHQX1C/AOt XO6M3p/CDUf3GUm4eDEM0dA3PBPBQTp0n3gXC3AMcDJxXDA1CXXnDe53sHr9/8Tr6xavRcaSFhQ2t XyWeNnQCg78Pk5RjTVwA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1mx7Tn-00E71Y-Qq; Tue, 14 Dec 2021 13:04:27 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1mx7Tj-00E70P-KB for linux-arm-kernel@lists.infradead.org; Tue, 14 Dec 2021 13:04:25 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 1A0C76D; Tue, 14 Dec 2021 05:04:20 -0800 (PST) Received: from FVFF77S0Q05N (unknown [10.57.66.239]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 1D2313F793; Tue, 14 Dec 2021 05:04:18 -0800 (PST) Date: Tue, 14 Dec 2021 13:04:02 +0000 From: Mark Rutland To: Will Deacon Cc: linux-arm-kernel@lists.infradead.org, boqun.feng@gmail.com, catalin.marinas@arm.com, peterz@infradead.org Subject: Re: [PATCH 4/5] arm64: atomics: lse: improve constraints for simple ops Message-ID: References: <20211210151410.2782645-1-mark.rutland@arm.com> <20211210151410.2782645-5-mark.rutland@arm.com> <20211213194022.GD12868@willie-the-truck> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20211213194022.GD12868@willie-the-truck> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20211214_050423_811404_EF57C768 X-CRM114-Status: GOOD ( 26.84 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On Mon, Dec 13, 2021 at 07:40:23PM +0000, Will Deacon wrote: > On Fri, Dec 10, 2021 at 03:14:09PM +0000, Mark Rutland wrote: > > We have overly conservative assembly constraints for the basic FEAT_LSE > > atomic instructions, and using more accurate and permissive constraints > > will allow for better code generation. > > > > The FEAT_LSE basic atomic instructions have come in two forms: > > > > LD{op}{order}{size} , , [] > > ST{op}{order}{size} , [] > > For either form, both and are read but not written back to, > > and is written with the original value of the memory location. > > Where ( == ) or ( == ), is written *after* the > > other register value(s) are consumed. There are no UNPREDICTABLE or > > CONSTRAINED UNPREDICTABLE behaviours when any pair of , , or > > are the same register. > > Our current inline assembly always uses == , treating this > > register as both an input and an output (using a '+r' constraint). This > > forces the compiler to do some unnecessary register shuffling and/or > > redundant value generation. > > We can improve this with more accurate constraints, separating and > > , where is an input-only register ('r'), and is an > > output-only value ('=r'). As is written back after is > > consumed, it does not need to be earlyclobber ('=&r'), leaving the > > compiler free to use the same register for both and where this > > is desirable. > > > > At the same time, the redundant 'r' constraint for `v` is removed, as > > the `+Q` constraint is sufficient. > Makes sense to me. I'm not sure _why_ the current constraints are so weird; > maybe a hangover from when we patched them inline? Yup; the constraints used to make sense when we were patching the instructions, and we just never cleaned them up. Details below -- I can add a link tag to this mail if you'd like? For example, back when we introduced the ops in commit: c0385b24af15020a ("arm64: introduce CONFIG_ARM64_LSE_ATOMICS as fallback to ll/sc atomics") We had: | #define ATOMIC_OP(op, asm_op) \ | static inline void atomic_##op(int i, atomic_t *v) \ | { \ | register int w0 asm ("w0") = i; \ | register atomic_t *x1 asm ("x1") = v; \ | \ | asm volatile( \ | __LL_SC_CALL(op) \ | : "+r" (w0), "+Q" (v->counter) \ | : "r" (x1) \ | : "x30"); \ | } \ IIUC, for those constraints: * "+r" (w0) -- necessary because the LL-SC forms might clobber w0 (e.g. if used for their `tmp` variable). For the value-returning ops we had to use w0/x0 for the return value as the out-of-line LL-SC ops followed the AAPCS parameter passing rules. * "+Q" (v->counter) -- necessary to clobber the memory location, "Q" specifically to ensure the compiler didn't expect some write-back addressing mode to have adjusted a register. * "r" (x1) -- necessary to ensure the address was available in X1 for the LL-SC code, since I beleive the "+Q" constraint *could* have used a copy in another register. I could be mistaken on this one, since there's no rationale in the original commit message or code. When that approach was thrown away in commits addfc38672c73efd ("arm64: atomics: avoid out-of-line ll/sc atomics") 3337cb5aea594e40 ("arm64: avoid using hard-coded registers for LSE atomics") ... we stopped hard-coding the registers, and added a `tmp` register for ops with a return value, but otherwise left the constraints as-is. > Anywho: > > Acked-by: Will Deacon Thanks! Mark. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel