Linux Perf Users
 help / color / mirror / Atom feed
* [PATCH v2] arm64/hw_breakpoint: reject unaligned watchpoints that would truncate BAS
@ 2026-06-09 13:15 Breno Leitao
  2026-06-09 13:32 ` sashiko-bot
  0 siblings, 1 reply; 2+ messages in thread
From: Breno Leitao @ 2026-06-09 13:15 UTC (permalink / raw)
  To: Will Deacon, Mark Rutland, Catalin Marinas, Pratyush Anand
  Cc: linux-arm-kernel, linux-perf-users, linux-kernel, clm, leo.bras,
	kernel-team, Breno Leitao

hw_breakpoint_arch_parse() positions the BAS bit pattern in
hw->ctrl.len with

	offset = hw->address & alignment_mask;	/* 0..7 */
	hw->ctrl.len <<= offset;

ctrl.len is an 8-bit bitfield (struct arch_hw_breakpoint_ctrl::len is
u32 :8), so the shift silently drops any bits past bit 7.  For
non-compat AArch64 watchpoints the offset is unbounded relative to
ctrl.len: a perf_event_open(PERF_TYPE_BREAKPOINT) caller asking for
HW_BREAKPOINT_W with bp_addr=page+1 and bp_len=HW_BREAKPOINT_LEN_8
ends up with 0xff << 1 = 0x1fe, stored as 0xfe.  The kernel programs
WCR.BAS=0xfe and the hardware watches bytes [1..7] instead of the
requested [1..8] -- the eighth byte is silently dropped.  The
syscall still returns success, leaving userspace to discover the
gap by empirical probing.

The same class affects HW_BREAKPOINT_LEN_{2,4} when offset pushes the
high BAS bit past bit 7 (e.g. LEN_4 with offset=5 yields 0xe0
instead of 0x1e0).  No memory-safety impact -- the value is masked
into 8 bits before encoding -- but debuggers and perf users observe
missed events on bytes they thought they were watching.

The AArch32 branch immediately above already rejects unrepresentable
(offset, len) combinations via an explicit switch.  Mirror that for
the non-compat branch by checking that the shifted pattern fits in
the BAS field, returning -EINVAL when it does not.

GDB and similar debuggers are unaffected by the stricter check.
aarch64_linux_set_debug_regs() already treats EINVAL on
NT_ARM_HW_WATCH as a downgrade signal: it clears
kernel_supports_any_contiguous_range, calls aarch64_downgrade_regs()
to round the BAS up to a legacy 0x01/03/0f/ff mask with an aligned
base, and retries -- the same fallback path that PR-20207 introduced.
The new -EINVAL is therefore reachable only from a raw
perf_event_open() that pairs an unaligned base with an oversized
bp_len, which is precisely the bug.

Reproducer:

  struct perf_event_attr a = {
      .type = PERF_TYPE_BREAKPOINT, .size = sizeof(a),
      .bp_type = HW_BREAKPOINT_W,
      .bp_addr = (uintptr_t)(buf + 1),
      .bp_len = HW_BREAKPOINT_LEN_8,
      .exclude_kernel = 1, .exclude_hv = 1,
  };
  int fd = perf_event_open(&a, 0, -1, -1, 0);
  /* before this fix: succeeds, watches 7 bytes (buf+1..buf+7)   */
  /* after  this fix: fails with EINVAL                          */

Fixes: b08fb180bb88 ("arm64: Allow hw watchpoint at varied offset from base address")
Signed-off-by: Breno Leitao <leitao@debian.org>
---
Changes in v2:
- Use ARM_BREAKPOINT_LEN_8 instead of the 0xff magic number (Will).
- Checked and Expand the commit log to explain why GDB and similar
  debuggers are unaffected by the new -EINVAL
- Link to v1: https://patch.msgid.link/20260430-arm64_bas-v1-1-c6ab2b15aec0@debian.org
---
 arch/arm64/kernel/hw_breakpoint.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/arm64/kernel/hw_breakpoint.c b/arch/arm64/kernel/hw_breakpoint.c
index ab76b36dce82..73cce8ac8368 100644
--- a/arch/arm64/kernel/hw_breakpoint.c
+++ b/arch/arm64/kernel/hw_breakpoint.c
@@ -559,6 +559,15 @@ int hw_breakpoint_arch_parse(struct perf_event *bp,
 		else
 			alignment_mask = 0x7;
 		offset = hw->address & alignment_mask;
+
+		/*
+		 * BAS is an 8-bit field in WCR/BCR; the shift below would
+		 * silently drop the high bits of ctrl.len when offset + len
+		 * exceeds 8, programming hardware to watch fewer bytes than
+		 * the user requested.
+		 */
+		if (((u32)hw->ctrl.len << offset) > ARM_BREAKPOINT_LEN_8)
+			return -EINVAL;
 	}
 
 	hw->address &= ~alignment_mask;

---
base-commit: 0787c45ea08a13b5482e701fabc741877cf681f6
change-id: 20260430-arm64_bas-77e37d830226

Best regards,
-- 
Breno Leitao <leitao@debian.org>


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH v2] arm64/hw_breakpoint: reject unaligned watchpoints that would truncate BAS
  2026-06-09 13:15 [PATCH v2] arm64/hw_breakpoint: reject unaligned watchpoints that would truncate BAS Breno Leitao
@ 2026-06-09 13:32 ` sashiko-bot
  0 siblings, 0 replies; 2+ messages in thread
From: sashiko-bot @ 2026-06-09 13:32 UTC (permalink / raw)
  To: Breno Leitao; +Cc: linux-perf-users

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:

Pre-existing issues:
- [High] User-space watchpoints with MTE-tagged addresses incorrectly fail validation checks and exact-match detection because tag bits are not stripped.
- [High] System register writes to debug registers are not immediately synchronized, violating architectural context synchronization rules.
- [High] Unaligned execution breakpoints bypass length validation and are programmed into hardware with invalid BAS constraints, corrupting RES0 bits.
--

commit 53ef92756387d64bcb01ab3be39ec7396912b68e
Author: Breno Leitao <leitao@debian.org>

arm64/hw_breakpoint: reject unaligned watchpoints that would truncate BAS

This patch addresses an issue where unaligned non-compat AArch64 watchpoints
could silently truncate the BAS bit pattern when the offset shifted the length
past 8 bits. The kernel now correctly rejects these unrepresentable length
and offset combinations with -EINVAL, preventing the hardware from watching
fewer bytes than requested.

> diff --git a/arch/arm64/kernel/hw_breakpoint.c b/arch/arm64/kernel/hw_breakpoint.c
> index ab76b36dce820..73cce8ac83681 100644
> --- a/arch/arm64/kernel/hw_breakpoint.c
> +++ b/arch/arm64/kernel/hw_breakpoint.c
> @@ -559,6 +559,15 @@ int hw_breakpoint_arch_parse(struct perf_event *bp,
>  		else
>  			alignment_mask = 0x7;
>  		offset = hw->address & alignment_mask;

[Severity: High]
This isn't a bug introduced by this patch, but I noticed that MTE-tagged
addresses might not be handled correctly here.

When perf_event_open() is called with an MTE-tagged user-space address,
arch_build_bp_info() saves the address into hw->address without untagging it.

Since hw->address retains the tag, could this cause the bounds check
(va >= TASK_SIZE) in arch_check_bp_in_kernelspace() to evaluate to true due
to the high-order tag bits? This would erroneously classify the user
watchpoint as a kernel watchpoint and reject it.

Additionally, in get_distance_from_watchpoint(), the tagged address
retrieved from the hardware register is used to calculate wp_low, which is
then mathematically compared against the explicitly untagged fault address:

arch/arm64/kernel/hw_breakpoint.c:get_distance_from_watchpoint() {
    addr = untagged_addr(addr);
    ...
    wp_low = val + lens;
    wp_high = val + lene;
    if (addr < wp_low)
    ...
}

Might this cause the distance calculation to silently fail the exact-match
watchpoint heuristic?

> +
> +		/*
> +		 * BAS is an 8-bit field in WCR/BCR; the shift below would
> +		 * silently drop the high bits of ctrl.len when offset + len
> +		 * exceeds 8, programming hardware to watch fewer bytes than
> +		 * the user requested.
> +		 */
> +		if (((u32)hw->ctrl.len << offset) > ARM_BREAKPOINT_LEN_8)
> +			return -EINVAL;

[Severity: High]
This is a pre-existing issue, but does this new check allow unaligned
execution breakpoints to bypass length validation?

For AArch64 execution breakpoints, hw->ctrl.len is fixed at 0xF (4 bits).
If an unaligned address is provided (e.g., offset 1, 2, or 3), the shifted
length becomes 0x1E, 0x3C, or 0x78. Since these are less than 0xFF
(ARM_BREAKPOINT_LEN_8), they bypass this validation check.

When encode_ctrl_reg() executes (ctrl.len << 5), wouldn't these invalid
lengths overflow out of the 4-bit BAS field (bits [8:5] in DBGBCR_EL1)
and overwrite the RES0 bits [11:9]? 

This would also fail to program the architecturally mandated BAS value of
0b1111, leading to CONSTRAINED UNPREDICTABLE hardware behavior.

>  	}
>  
>  	hw->address &= ~alignment_mask;

[Severity: High]
This is a pre-existing issue, but I noticed a potential problem with system
register synchronization in write_wb_reg():

arch/arm64/kernel/hw_breakpoint.c:write_wb_reg() {
...
#define WRITE_WB_REG_CASE(OFF, N, REG, VAL) \
	case (OFF + N): \
		AARCH64_DBG_WRITE(N, REG, VAL); \
		break
...
	switch (reg + n) {
	GEN_WRITE_WB_REG_CASES(AARCH64_DBG_REG_BVR, AARCH64_DBG_REG_NAME_BVR, val);
	...
	}
	isb();
}

According to ARM64 architectural rules, every write to a control-plane system
register must be followed by an isb() as the very next instruction.

Will the break statement in WRITE_WB_REG_CASE introduce an intervening branch
between the system register write and the isb() at the end of the switch
block?

Executing an intervening branch before the Context Synchronization Event (CSE)
means the branch observes an architecturally undefined pipeline state.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260609-arm64_bas-v2-1-9f34dbbded88@debian.org?part=1

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2026-06-09 13:32 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-09 13:15 [PATCH v2] arm64/hw_breakpoint: reject unaligned watchpoints that would truncate BAS Breno Leitao
2026-06-09 13:32 ` sashiko-bot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox