From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev, Boqun Feng <boqun.feng@gmail.com>,
Peter Zijlstra <peterz@infradead.org>,
Mark Rutland <mark.rutland@arm.com>,
Arnd Bergmann <arnd@arndb.de>,
Catalin Marinas <catalin.marinas@arm.com>,
Steve Capper <steve.capper@arm.com>,
Will Deacon <will@kernel.org>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.10 59/64] arm64: cmpxchg_double*: hazard against entire exchange variable
Date: Mon, 16 Jan 2023 16:52:06 +0100
Message-ID: <20230116154745.631943317@linuxfoundation.org>
In-Reply-To: <20230116154743.577276578@linuxfoundation.org>
From: Mark Rutland <mark.rutland@arm.com>
[ Upstream commit 031af50045ea97ed4386eb3751ca2c134d0fc911 ]
The inline assembly for arm64's cmpxchg_double*() implementations uses a
+Q constraint to hazard against other accesses to the memory location
being exchanged. However, the pointer passed to the constraint is a
pointer to unsigned long, and thus the hazard only applies to the first
8 bytes of the location.
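As a rough illustration of the size mismatch (not part of the patch), the
sketch below reports how many bytes each form of the operand expression
covers. The names hazard_bytes_narrow(), hazard_bytes_wide() and struct pair
are hypothetical, and it assumes an LP64 target built with GCC or Clang so
that unsigned long is 8 bytes and __uint128_t is available:

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: LP64 target, so unsigned long is 8 bytes wide. */
static_assert(sizeof(unsigned long) == 8, "sketch assumes LP64");
static_assert(sizeof(__uint128_t) == 16, "sketch assumes a 16-byte __uint128_t");

/* Hypothetical stand-in for a 16-byte exchange variable. */
struct pair {
	unsigned long lo, hi;
} __attribute__((aligned(16)));

/* Bytes covered by an operand written as *(unsigned long *)ptr. */
size_t hazard_bytes_narrow(void *ptr)
{
	/* sizeof does not evaluate its operand, so nothing is dereferenced */
	return sizeof(*(unsigned long *)ptr);
}

/* Bytes covered by an operand written as *(__uint128_t *)ptr. */
size_t hazard_bytes_wide(void *ptr)
{
	return sizeof(*(__uint128_t *)ptr);
}
```

Calling both helpers on a struct pair shows the narrow form covering only the
first half of the object, while the wide form covers all of it.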
GCC can take advantage of this, assuming that other portions of the
location are unchanged, leading to a number of potential problems.
This is similar to what we fixed back in commit:
fee960bed5e857eb ("arm64: xchg: hazard against entire exchange variable")
... but we forgot to adjust cmpxchg_double*() similarly at the same
time.
The same problem applies, as demonstrated with the following test:
| struct big {
| u64 lo, hi;
| } __aligned(128);
|
| unsigned long foo(struct big *b)
| {
| u64 hi_old, hi_new;
|
| hi_old = b->hi;
| cmpxchg_double_local(&b->lo, &b->hi, 0x12, 0x34, 0x56, 0x78);
| hi_new = b->hi;
|
| return hi_old ^ hi_new;
| }
... which GCC 12.1.0 compiles as:
| 0000000000000000 <foo>:
| 0: d503233f paciasp
| 4: aa0003e4 mov x4, x0
| 8: 1400000e b 40 <foo+0x40>
| c: d2800240 mov x0, #0x12 // #18
| 10: d2800681 mov x1, #0x34 // #52
| 14: aa0003e5 mov x5, x0
| 18: aa0103e6 mov x6, x1
| 1c: d2800ac2 mov x2, #0x56 // #86
| 20: d2800f03 mov x3, #0x78 // #120
| 24: 48207c82 casp x0, x1, x2, x3, [x4]
| 28: ca050000 eor x0, x0, x5
| 2c: ca060021 eor x1, x1, x6
| 30: aa010000 orr x0, x0, x1
| 34: d2800000 mov x0, #0x0 // #0 <--- BANG
| 38: d50323bf autiasp
| 3c: d65f03c0 ret
| 40: d2800240 mov x0, #0x12 // #18
| 44: d2800681 mov x1, #0x34 // #52
| 48: d2800ac2 mov x2, #0x56 // #86
| 4c: d2800f03 mov x3, #0x78 // #120
| 50: f9800091 prfm pstl1strm, [x4]
| 54: c87f1885 ldxp x5, x6, [x4]
| 58: ca0000a5 eor x5, x5, x0
| 5c: ca0100c6 eor x6, x6, x1
| 60: aa0600a6 orr x6, x5, x6
| 64: b5000066 cbnz x6, 70 <foo+0x70>
| 68: c8250c82 stxp w5, x2, x3, [x4]
| 6c: 35ffff45 cbnz w5, 54 <foo+0x54>
| 70: d2800000 mov x0, #0x0 // #0 <--- BANG
| 74: d50323bf autiasp
| 78: d65f03c0 ret
Notice that at the lines with "BANG" comments, GCC has assumed that the
higher 8 bytes are unchanged by the cmpxchg_double() call, and that
`hi_old ^ hi_new` can be reduced to a constant zero, for both LSE and
LL/SC versions of cmpxchg_double().
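To see why a constant zero is wrong, here is a plain C model of the test
above. It is only a sketch: model_cmpxchg_double() and foo_model() are
hypothetical names, the model is not atomic and is not the kernel
implementation, and returning 1 on a successful exchange is a choice made
for this model:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct big {
	uint64_t lo, hi;
} __attribute__((aligned(128)));

static_assert(offsetof(struct big, hi) == 8, "hi is the upper 8 bytes");

/*
 * Hypothetical, non-atomic model: if both words match the expected old
 * values, store both new values. Returns 1 if the exchange happened.
 */
int model_cmpxchg_double(uint64_t *p1, uint64_t *p2,
			 uint64_t old1, uint64_t old2,
			 uint64_t new1, uint64_t new2)
{
	if (*p1 != old1 || *p2 != old2)
		return 0;
	*p1 = new1;
	*p2 = new2;
	return 1;
}

/* Same shape as foo() in the test, using the model above. */
uint64_t foo_model(struct big *b)
{
	uint64_t hi_old, hi_new;

	hi_old = b->hi;
	model_cmpxchg_double(&b->lo, &b->hi, 0x12, 0x34, 0x56, 0x78);
	hi_new = b->hi;

	return hi_old ^ hi_new;
}
```

With b->lo == 0x12 and b->hi == 0x34 the exchange succeeds, so the model
returns 0x34 ^ 0x78 == 0x4c rather than zero. Zero is only the right answer
when the comparison fails and the high word is left alone.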
This patch fixes the issue by passing a pointer to __uint128_t into the
+Q constraint, ensuring that the compiler hazards against the entire 16
bytes being modified.
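The shape of the operand can be sketched outside the kernel as follows. This
is an illustration under stated assumptions rather than the patched code:
hazard_full() and sample_hi_around_hazard() are hypothetical names, the asm
body is empty, the generic "+m" constraint stands in for the arm64 "+Q"
constraint so that the sketch builds with GCC or Clang on other targets, and
it assumes -fno-strict-aliasing as used for kernel builds:

```c
#include <assert.h>
#include <stdint.h>

struct big {
	uint64_t lo, hi;
} __attribute__((aligned(16)));

static_assert(sizeof(struct big) == sizeof(__uint128_t),
	      "operand must span both halves");

/*
 * Illustrative only: an empty asm whose read-write memory operand spans
 * all 16 bytes at ptr. "+m" is a stand-in for the arm64 "+Q" constraint
 * used by the patch. Assumes -fno-strict-aliasing.
 */
void hazard_full(void *ptr)
{
	__asm__ volatile("" : "+m" (*(__uint128_t *)ptr));
}

/* Samples the high word before and after the hazard, like foo(). */
uint64_t sample_hi_around_hazard(struct big *b)
{
	uint64_t hi_old, hi_new;

	hi_old = b->hi;
	hazard_full(b);	/* both lo and hi are declared as possibly written */
	hi_new = b->hi;

	return hi_old ^ hi_new;
}
```

Because the asm here is empty, the function returns 0 at run time. The point
is that the operand now names the whole object, so under the assumptions
above the compiler has no basis for treating the second load of hi as
redundant.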
With this change, GCC 12.1.0 compiles the above test as:
| 0000000000000000 <foo>:
| 0: f9400407 ldr x7, [x0, #8]
| 4: d503233f paciasp
| 8: aa0003e4 mov x4, x0
| c: 1400000f b 48 <foo+0x48>
| 10: d2800240 mov x0, #0x12 // #18
| 14: d2800681 mov x1, #0x34 // #52
| 18: aa0003e5 mov x5, x0
| 1c: aa0103e6 mov x6, x1
| 20: d2800ac2 mov x2, #0x56 // #86
| 24: d2800f03 mov x3, #0x78 // #120
| 28: 48207c82 casp x0, x1, x2, x3, [x4]
| 2c: ca050000 eor x0, x0, x5
| 30: ca060021 eor x1, x1, x6
| 34: aa010000 orr x0, x0, x1
| 38: f9400480 ldr x0, [x4, #8]
| 3c: d50323bf autiasp
| 40: ca0000e0 eor x0, x7, x0
| 44: d65f03c0 ret
| 48: d2800240 mov x0, #0x12 // #18
| 4c: d2800681 mov x1, #0x34 // #52
| 50: d2800ac2 mov x2, #0x56 // #86
| 54: d2800f03 mov x3, #0x78 // #120
| 58: f9800091 prfm pstl1strm, [x4]
| 5c: c87f1885 ldxp x5, x6, [x4]
| 60: ca0000a5 eor x5, x5, x0
| 64: ca0100c6 eor x6, x6, x1
| 68: aa0600a6 orr x6, x5, x6
| 6c: b5000066 cbnz x6, 78 <foo+0x78>
| 70: c8250c82 stxp w5, x2, x3, [x4]
| 74: 35ffff45 cbnz w5, 5c <foo+0x5c>
| 78: f9400480 ldr x0, [x4, #8]
| 7c: d50323bf autiasp
| 80: ca0000e0 eor x0, x7, x0
| 84: d65f03c0 ret
... sampling the high 8 bytes before and after the cmpxchg, and
performing an EOR, as we'd expect.
For backporting, I've tested this atop linux-4.9.y with GCC 5.5.0. Note
that linux-4.9.y is the oldest currently supported stable release, and
mandates GCC 5.1+. Unfortunately I couldn't get a GCC 5.1 binary to run
on my machines due to library incompatibilities.
I've also used a standalone test to check that we can use a __uint128_t
pointer in a +Q constraint at least as far back as GCC 4.8.5 and LLVM
3.9.1.
Fixes: 5284e1b4bc8a ("arm64: xchg: Implement cmpxchg_double")
Fixes: e9a4b795652f ("arm64: cmpxchg_dbl: patch in lse instructions when supported by the CPU")
Reported-by: Boqun Feng <boqun.feng@gmail.com>
Link: https://lore.kernel.org/lkml/Y6DEfQXymYVgL3oJ@boqun-archlinux/
Reported-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/Y6GXoO4qmH9OIZ5Q@hirez.programming.kicks-ass.net/
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: stable@vger.kernel.org
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230104151626.3262137-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/arm64/include/asm/atomic_ll_sc.h | 2 +-
arch/arm64/include/asm/atomic_lse.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/atomic_ll_sc.h b/arch/arm64/include/asm/atomic_ll_sc.h
index 906e2d8c254c..abd302e521c0 100644
--- a/arch/arm64/include/asm/atomic_ll_sc.h
+++ b/arch/arm64/include/asm/atomic_ll_sc.h
@@ -315,7 +315,7 @@ __ll_sc__cmpxchg_double##name(unsigned long old1, \
" cbnz %w0, 1b\n" \
" " #mb "\n" \
"2:" \
- : "=&r" (tmp), "=&r" (ret), "+Q" (*(unsigned long *)ptr) \
+ : "=&r" (tmp), "=&r" (ret), "+Q" (*(__uint128_t *)ptr) \
: "r" (old1), "r" (old2), "r" (new1), "r" (new2) \
: cl); \
\
diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h
index ab661375835e..28e96118c1e5 100644
--- a/arch/arm64/include/asm/atomic_lse.h
+++ b/arch/arm64/include/asm/atomic_lse.h
@@ -403,7 +403,7 @@ __lse__cmpxchg_double##name(unsigned long old1, \
" eor %[old2], %[old2], %[oldval2]\n" \
" orr %[old1], %[old1], %[old2]" \
: [old1] "+&r" (x0), [old2] "+&r" (x1), \
- [v] "+Q" (*(unsigned long *)ptr) \
+ [v] "+Q" (*(__uint128_t *)ptr) \
: [new1] "r" (x2), [new2] "r" (x3), [ptr] "r" (x4), \
[oldval1] "r" (oldval1), [oldval2] "r" (oldval2) \
: cl); \
--
2.35.1