From: Puranjay Mohan
To: Andrea Parri, Daniel Lustig
Cc: Will Deacon, Peter Zijlstra, Boqun Feng, Mark Rutland, Paul Walmsley,
    Palmer Dabbelt, Albert Ou, linux-kernel@vger.kernel.org,
    linux-riscv@lists.infradead.org
Subject: Re: [PATCH] riscv/atomic.h: optimize ops with acquire/release ordering
References: <20240505123340.38495-1-puranjay@kernel.org>
Date: Tue, 07 May 2024 14:07:26 +0000

Andrea Parri writes:

> Hi Puranjay,
>
> On Sun, May 05, 2024 at 12:33:40PM +0000, Puranjay Mohan wrote:
>> Currently, atomic ops with acquire or release ordering are implemented
>> as atomic ops with relaxed ordering followed by or preceded by an
>> acquire fence or a release fence.
>>
>> Section 8.1 of the "The RISC-V Instruction Set Manual Volume I:
>> Unprivileged ISA", titled, "Specifying Ordering of Atomic Instructions"
>> says:
>>
>> | To provide more efficient support for release consistency [5], each
>> | atomic instruction has two bits, aq and rl, used to specify additional
>> | memory ordering constraints as viewed by other RISC-V harts.
>>
>> and
>>
>> | If only the aq bit is set, the atomic memory operation is treated as
>> | an acquire access.
>> | If only the rl bit is set, the atomic memory operation is treated as a
>> | release access.
>>
>> So, rather than using two instructions (relaxed atomic op + fence), use
>> a single atomic op instruction with acquire/release ordering.
>>
>> Example program:
>>
>>   atomic_t cnt = ATOMIC_INIT(0);
>>   atomic_fetch_add_acquire(1, &cnt);
>>   atomic_fetch_add_release(1, &cnt);
>>
>> Before:
>>
>>   amoadd.w        a4,a5,(a4)  // Atomic add with relaxed ordering
>>   fence   r,rw                // Fence to force Acquire ordering
>>
>>   fence   rw,w                // Fence to force Release ordering
>>   amoadd.w        a4,a5,(a4)  // Atomic add with relaxed ordering
>>
>> After:
>>
>>   amoadd.w.aq     a4,a5,(a4)  // Atomic add with Acquire ordering
>>
>>   amoadd.w.rl     a4,a5,(a4)  // Atomic add with Release ordering
>>
>> Signed-off-by: Puranjay Mohan
>
> Your changes are effectively partially reverting:
>
>   5ce6c1f3535fa ("riscv/atomic: Strengthen implementations with fences")
>
> Can you please provide (and possibly include in the changelog of v2) a more
> thoughtful explanation for the correctness of such revert?
>
> (Anticipating a somewhat non-trivial analysis...)

Hi Andrea,

I think Guo Ren sent a patch[1] like this earlier but it did not get
comments yet. I will reply on that thread[1] as well.

I saw the commit 5ce6c1f3535fa ("riscv/atomic: Strengthen
implementations with fences") and all the related discussions.

This is what I understand from the discussions:

RISCV's LR.aq/SC.rl were following RCpc ordering but the LKMM expected
RCsc ordering from lock() and unlock(). So you added fences to force RCsc
ordering in the locks/atomics.
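
As a rough illustration of the RCpc/RCsc distinction described above, the
sketch below models when program order between two accesses is preserved.
It is a deliberately simplified toy model, not the RVWMO or LKMM
definition: the names Op, ordered and chain_ordered are made up for this
example, and the three rules in ordered() are an assumption that only
captures the acquire/release annotations (no fences, dependencies or
same-address rules).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """One memory access with optional acquire/release annotations (hypothetical type)."""
    name: str
    acquire: bool = False
    release: bool = False


def ordered(first: Op, second: Op, rcsc: bool) -> bool:
    """Return True if program order first -> second is preserved.

    Simplified rules assumed by this sketch:
      * nothing later may be reordered before an acquire,
      * nothing earlier may be reordered after a release,
      * only under RCsc is a release also ordered before a later acquire.
    """
    if first.acquire or second.release:
        return True
    return rcsc and first.release and second.acquire


# An access, then unlock (release), then lock (acquire), then another access.
before = Op("load before unlock")
unlock = Op("unlock", release=True)
lock = Op("lock", acquire=True)
after = Op("load after lock")


def chain_ordered(rcsc: bool) -> bool:
    """Is 'before' ordered ahead of 'after' across the unlock+lock pair?"""
    return (ordered(before, unlock, rcsc)
            and ordered(unlock, lock, rcsc)
            and ordered(lock, after, rcsc))


if __name__ == "__main__":
    print("RCsc keeps the chain ordered:", chain_ordered(rcsc=True))
    print("RCpc keeps the chain ordered:", chain_ordered(rcsc=False))
```

In this toy model the only link that differs between the two variants is
unlock -> lock, which is the ordering the LKMM expects from an unlock
followed by a lock on the same CPU.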

An experiment with LKMM and RISCV MM:

The following litmus test should not reach (1:r0=1 /\ 1:r1=0) with LKMM:

C unlock-lock-read-ordering

{}
/* s initially owned by P1 */

P0(int *x, int *y)
{
        WRITE_ONCE(*x, 1);
        smp_wmb();
        WRITE_ONCE(*y, 1);
}

P1(int *x, int *y, spinlock_t *s)
{
        int r0;
        int r1;

        r0 = READ_ONCE(*y);
        spin_unlock(s);
        spin_lock(s);
        r1 = READ_ONCE(*x);
}

exists (1:r0=1 /\ 1:r1=0)

Which is indeed true:

Test unlock-lock-read-ordering Allowed
States 3
1:r0=0; 1:r1=0;
1:r0=0; 1:r1=1;
1:r0=1; 1:r1=1;
No
Witnesses
Positive: 0 Negative: 3
Flag unmatched-unlock
Condition exists (1:r0=1 /\ 1:r1=0)
Observation unlock-lock-read-ordering Never 0 3
Time unlock-lock-read-ordering 0.01
Hash=ab0cfdcde54d1bb1faa731533980f424

And when I map this test to RISC-V:

RISCV RISCV-unlock-lock-read-ordering
{
0:x2=x;
0:x4=y;

1:x2=x;
1:x4=y;
1:x6=s;
}
 P0           |  P1                      ;
 ori x1,x0,1  | lw x1,0(x4)              ;
 sw x1,0(x2)  | amoswap.w.rl x0,x0,(x6)  ;
 fence w,w    | ori x5,x0,1              ;
 ori x3,x0,1  | amoswap.w.aq x0,x5,(x6)  ;
 sw x3,0(x4)  | lw x3,0(x2)              ;
exists (1:x1=1 /\ 1:x3=0)

This also doesn't reach the condition:

Test RISCV-unlock-lock-read-ordering Allowed
States 3
1:x1=0; 1:x3=0;
1:x1=0; 1:x3=1;
1:x1=1; 1:x3=1;
No
Witnesses
Positive: 0 Negative: 3
Condition exists (1:x1=1 /\ 1:x3=0)
Observation RISCV-unlock-lock-read-ordering Never 0 3
Time RISCV-unlock-lock-read-ordering 0.01
Hash=d845d36e2a8480165903870d135dd81e

Your commit mentioned that the above test would reach the exists
condition for RISCV.

So, maybe the model has been modified to make .aq and .rl RCsc now?

This presentation[2] by Dan Lustig says on page 31:

 | PPO RULES 5-7
 | A release that precedes an acquire in program
 | order also precedes it in global memory order
 | • i.e., the RCsc variant of release consistency

If the above is true, the weak fences could be removed and LR, SC, and
AMOs with the aq, rl, and aqrl bits could be used for the kernel atomics
and locks.
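
For reference, the three states reported for the litmus test above are
exactly the outcomes that remain when every thread's program order is
kept intact. The sketch below enumerates them under plain sequential
consistency. It is only a baseline, not a model of LKMM or RVWMO, and it
assumes that smp_wmb() and the unlock/lock pair contribute ordering only
and no data; the names P0, P1 and sc_outcomes are local to this example.

```python
from itertools import combinations

# Each thread as a list of (kind, location, value-or-register) steps.
P0 = [("W", "x", 1), ("W", "y", 1)]        # smp_wmb() adds nothing under SC
P1 = [("R", "y", "r0"), ("R", "x", "r1")]  # unlock/lock touch neither x nor y


def sc_outcomes():
    """Enumerate (r0, r1) over all interleavings that keep program order."""
    total = len(P0) + len(P1)
    results = set()
    # Choose which global slots are taken by P0; P1 fills the rest in order.
    for p0_slots in combinations(range(total), len(P0)):
        p0_iter, p1_iter = iter(P0), iter(P1)
        mem = {"x": 0, "y": 0}
        regs = {}
        for slot in range(total):
            kind, loc, arg = next(p0_iter if slot in p0_slots else p1_iter)
            if kind == "W":
                mem[loc] = arg
            else:
                regs[arg] = mem[loc]
        results.add((regs["r0"], regs["r1"]))
    return results


if __name__ == "__main__":
    for r0, r1 in sorted(sc_outcomes()):
        print(f"1:r0={r0}; 1:r1={r1};")
```

The outcome (r0=1, r1=0) never shows up here, so reaching it requires
reordering P0's two writes or P1's two reads, which is what the
unlock+lock pair on P1 is expected to prevent.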

[1] https://lore.kernel.org/all/20220420144417.2453958-3-guoren@kernel.org/
[2] https://riscv.org/wp-content/uploads/2018/05/14.25-15.00-RISCVMemoryModelTutorial.pdf

Thanks,
Puranjay