From: Puranjay Mohan
To: Andrea Parri
Cc: Daniel Lustig, Will Deacon, Peter Zijlstra, Boqun Feng, Mark Rutland,
 Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-kernel@vger.kernel.org,
 linux-riscv@lists.infradead.org
Subject: Re: [PATCH] riscv/atomic.h: optimize ops with acquire/release ordering
References: <20240505123340.38495-1-puranjay@kernel.org>
Date: Wed, 08 May 2024 13:58:58 +0000

Andrea Parri writes:

>> I think Guo Ren sent a patch[1] like this earlier but it did not get
>> comments yet. I will reply on that thread[1] as well.
>
> TBF, those changes appeared in a later submission/series,
>
>   https://lore.kernel.org/lkml/20220505035526.2974382-1-guoren@kernel.org/
>
> a submission that received similar feedback from the ATOMIC INFRASTRUCTURE
> maintainers and myself: in short, "please explain _why_ you are doing what
> you are doing".
>
>
>> I saw the commit 5ce6c1f3535fa ("riscv/atomic: Strengthen
>> implementations with fences") and all the related discussions.
>>
>> This is what I understand from the discussions:
>>
>> RISCV's LR.aq/SC.rl were following RCpc ordering but the LKMM expected
>> RCsc ordering from lock() and unlock(). So you added fences to force RCsc
>> ordering in the locks/atomics.
>
> Appreciate the effort.  Some corrections/clarifications:
>
> When 5ce6c1f3535fa was developed, the LKMM expected "less-than-RCsc" ordering
> from the lock operations.  Some of those properties were illustrated by the
> unlock-lock-read-ordering litmus test you reported (and included in
>
>   0123f4d76ca63 ("riscv/spinlock: Strengthen implementations with fences") ).
>
> It's also worth mentioning that, when 5ce6c1f3535fa was discussed, the LKMM
> expected similar read-read ordering properties to hold for ordinary acquire
> and release operations, i.e. not necessarily a lock operation.
>
> Later changes to the LKMM relaxed those properties for ordinary acquires and
> releases, and added extra ordering for locks, cf.
>
>   6e89e831a9017 ("tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire")
>   ddfe12944e848 ("tools/memory-model: Provide extra ordering for unlock+lock pair on the same CPU")
>
> Roughly speaking, such changes made the LKMM's locks RCtso, and this matches
> the current LKMM's approach.  (Yes I know, there is code assuming/wishing RCsc
> locks... long story, not strictly related to this discussion/thread: IAC, for
> completeness, I'll say more about that in my comments below.)
>
> My changes/the current implementations provide RCtso (not RCsc) ordering for
> RISCV's locks and atomics; in fact, by their very design, this RCtso is pretty
> easy to see/prove:
>
>   (1) every release op provides RW to W order (or stronger);
>
>   (2) every acquire op provides more than R to R order (typically R to RW
>       order, but in atomic_cond_load_acquire() & co. that R-to-W order is
>       limited to the "R" associated with the acquire op itself).
>
> Put together, (1-2) give R-to-R, R-to-W and W-to-W order (aka RCtso) as claimed.
> Notice that this argument holds for all lock operations and types (spinlock,
> rwlock, mutex, rt_mutex, semaphore, rw_semaphore, etc.) and that it does _not_
> require any audit of the locking code.  More on this point below.
>
>
>> An experiment with LKMM and RISCV MM:
>>
>> The following litmus test should not reach (1:r0=1 /\ 1:r1=0) with LKMM:
>>
>> C unlock-lock-read-ordering
>>
>> {}
>> /* s initially owned by P1 */
>>
>> P0(int *x, int *y)
>> {
>>         WRITE_ONCE(*x, 1);
>>         smp_wmb();
>>         WRITE_ONCE(*y, 1);
>> }
>>
>> P1(int *x, int *y, spinlock_t *s)
>> {
>>         int r0;
>>         int r1;
>>
>>         r0 = READ_ONCE(*y);
>>         spin_unlock(s);
>>         spin_lock(s);
>>         r1 = READ_ONCE(*x);
>> }
>>
>> exists (1:r0=1 /\ 1:r1=0)
>>
>> Which is indeed true:
>>
>> Test unlock-lock-read-ordering Allowed
>> States 3
>> 1:r0=0; 1:r1=0;
>> 1:r0=0; 1:r1=1;
>> 1:r0=1; 1:r1=1;
>> No
>> Witnesses
>> Positive: 0 Negative: 3
>> Flag unmatched-unlock
>> Condition exists (1:r0=1 /\ 1:r1=0)
>> Observation unlock-lock-read-ordering Never 0 3
>> Time unlock-lock-read-ordering 0.01
>> Hash=ab0cfdcde54d1bb1faa731533980f424
>>
>> And when I map this test to RISC-V:
>>
>> RISCV RISCV-unlock-lock-read-ordering
>> {
>> 0:x2=x;
>> 0:x4=y;
>>
>> 1:x2=x;
>> 1:x4=y;
>> 1:x6=s;
>> }
>>  P0           |  P1                      ;
>>  ori x1,x0,1  | lw x1,0(x4)              ;
>>  sw x1,0(x2)  | amoswap.w.rl x0,x0,(x6)  ;
>>  fence w,w    | ori x5,x0,1              ;
>>  ori x3,x0,1  | amoswap.w.aq x0,x5,(x6)  ;
>>  sw x3,0(x4)  | lw x3,0(x2)              ;
>> exists (1:x1=1 /\ 1:x3=0)
>>
>> This also doesn't reach the condition:
>>
>> Test RISCV-unlock-lock-read-ordering Allowed
>> States 3
>> 1:x1=0; 1:x3=0;
>> 1:x1=0; 1:x3=1;
>> 1:x1=1; 1:x3=1;
>> No
>> Witnesses
>> Positive: 0 Negative: 3
>> Condition exists (1:x1=1 /\ 1:x3=0)
>> Observation RISCV-unlock-lock-read-ordering Never 0 3
>> Time RISCV-unlock-lock-read-ordering 0.01
>> Hash=d845d36e2a8480165903870d135dd81e
>
> Which "mapping" did you use for this experiment/analysis?
> Looking at the

Actually, by mapping I meant:

R1
amoswap.w.rl
amoswap.w.aq
R2

will provide R1->R2 ordering like:

R1
spin_unlock()
spin_lock()
R2

That test is for read->read ordering enforced by unlock()->lock() and I
just wanted to say that the current RISC-V memory model provides that
with all (amo/lr/sc) .rl -> .aq operations.

> current spinlock code for RISCV (and including the atomic changes at stake)
> P1 seems to be better described by something like:
>
>   fence rw,w            // arch_spin_unlock --> smp_store_release
>   sw
>
>   lr.w                  // arch_spin_trylock --> arch_try_cmpxchg
>   bne
>   sc.w.rl
>   bnez
>   fence rw,rw
>
> or
>
>   amoadd.w.aqrl         // arch_spin_lock --> atomic_fetch_add
>
> or
>
>   lw                    // arch_spin_lock --> atomic_cond_read_acquire ; smp_mb  (??)
>   bne
>   fence r,r
>
>   fence rw,rw
>
> Looking at the rwlock code (for which the same RCtso property is expected to
> hold, even though that hasn't been formalized in the LKMM yet), I see (again,
> including your atomic changes):
>
>   amoadd.w.rl           // queued_read_unlock --> atomic_sub_return_release
>   amoadd.w.aq           // queued_read_lock --> atomic_add_return_acquire
>
> and
>
>   fence rw,w            // queue_write_unlock --> smp_store_release
>   sw
>   lr.w                  // queue_write_lock --> atomic_try_cmpxchg_acquire
>   bne
>   sc.w
>   bnez
>   fence r,rw
>
> I won't list the slowpath scenarios.  Or even the mutex, semaphore, etc.  I
> believe you got the point...

Thanks for these details, I will do more testing with this on herd7.


Something out of curiosity:

From my understanding of the current version of the RV memory model:

.aq provides .aq -> all ordering
.rl provides all -> .rl ordering
and because this is the RCsc variant of release consistency
.rl -> .aq

which means

R/W
amoswap.w.rl
amoswap.w.aq
R/W

Should act as a full fence? R/W -> rl -> aq -> R/W

>
>> Your commit mentioned that the above test would reach the exists
>> condition for RISCV.
>>
>> So, maybe the model has been modified to make .aq and .rl RCsc now?
>
> Yes.  .aq and .rl are RCsc.
> They were considered RCpc when 5ce6c1f3535fa and
> 0123f4d76ca63 were discussed (which happened _before_ the RISC-V memory
> model was ratified) as clearly remarked in their commit messages.
>
> The ATOMIC maintainers went as far as "bisecting" the RISC-V ISA spec in
>
>   https://lore.kernel.org/lkml/YrPei6q4rIAx6Ymf@boqun-archlinux/
>
> but, as they say, it's hard to help people who don't want to be helped...
>
>
>> This presentation[2] by Dan Lustig says on page 31:
>>
>>  | PPO RULES 5-7
>>  | A release that precedes an acquire in program
>>  | order also precedes it in global memory order
>>  | • i.e., the RCsc variant of release consistency
>>
>> If the above is true, the weak fences could be removed, and LR, SC, and AMOs
>> with aq, rl, and aqrl bits could be used in the kernel AMOs and locks.
>
> The problem with this argument is that it relies on all lock ops to come with
> an RCsc annotation, which is simply not true in the current/RISCV code as the
> few snippets above also suggested.
>
> BTW, arm64 uses a similar argument, except all its releases/acquires come with
> RCsc annotations (which greatly simplifies the analysis).  The argument could
> be easily made to work in RISCV _provided_ its ISA were augmented with lw.aq
> and sw.rl, but it's been ~6 years...
>
> That said, maybe we're "lucky" and all the unlock+lock pairs will just work w/
> your changes.  I haven't really checked, and I probably won't as long as the
> only motivation for such changes is "lower inst count in qemu".
>
> In that regard, note that Section A.5, "Code porting and mapping guidelines"
> of the RISCV ISA spec provides alternative mappings for our atomics, including
> AMO mappings w/ .aq and .rl annotations: I'm sure those mappings were subject
> to a fair amount of review and formal analysis (although I was not involved in
> that work/review at the time): if inst count is so important to you, why not
> simply follow those guidelines?
> (Notice that such a re-write would require some
> modification to non-AMO mappings, cf. smp_store_release() and LR/SC mappings.)

So, I will do the following now:

1. Do some benchmarking on real hardware and find out how much overhead
   these weak fences add.
2. Study the LKMM and the RVWMO for the next few weeks/months or however
   much time it takes me to confidently reason about things written in
   these two models.
3. Study the locking / related code of RISC-V to see what could break if
   we change all these operations in accordance with "Code Porting and
   Mapping Guidelines" of the RISC-V ISA spec.
4. Use the herd7 models of LKMM and RVWMO to see if everything works as
   expected after these changes.


And if I am convinced after all this, I will send a patch to implement
"Code Porting and Mapping Guidelines" + provide performance numbers from
real hardware.

Thanks for the detailed explanations, especially regarding how the
LKMM evolved.


Puranjay
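
For the "full fence" question above (R/W -> rl -> aq -> R/W), a
store-buffering test along the following lines could be fed to herd7
with the RVWMO model.  It is only a sketch and has not been run: the
test name and the lock words s and t (one per hart, so that the two
harts never touch the same lock) are made up for illustration.

RISCV SB+rl-aq
{
0:x2=x; 0:x4=y; 0:x6=s;
1:x2=x; 1:x4=y; 1:x6=t;
}
 P0                       | P1                       ;
 ori x1,x0,1              | ori x1,x0,1              ;
 sw x1,0(x2)              | sw x1,0(x4)              ;
 amoswap.w.rl x0,x0,(x6)  | amoswap.w.rl x0,x0,(x6)  ;
 amoswap.w.aq x0,x1,(x6)  | amoswap.w.aq x0,x1,(x6)  ;
 lw x3,0(x4)              | lw x3,0(x2)              ;
exists (0:x3=0 /\ 1:x3=0)

If the .rl -> .aq pair really orders the earlier store before the later
load, the exists clause should never be satisfied; if it turns out to be
reachable, the pair is weaker than a fence rw,rw in the same position.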