* [PATCH 0/7] Rework perf and ptrace watchpoint tracking
@ 2023-08-01 1:17 Benjamin Gray
2023-08-01 1:17 ` [PATCH 1/7] powerpc/watchpoints: Explain thread_change_pc() more Benjamin Gray
` (8 more replies)
0 siblings, 9 replies; 11+ messages in thread
From: Benjamin Gray @ 2023-08-01 1:17 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Benjamin Gray
Syzkaller triggered a null pointer dereference in the
arch_unregister_hw_breakpoint() hook. This is due to accessing
the bp->ctx->task field changing to -1 while we iterate the breakpoints.
This series refactors the breakpoint tracking logic to remove the
dependency on bp->ctx entirely. It also simplifies handling of ptrace and
perf breakpoints, making insertion less restrictive.
If merged, it allows several arch hooks that PowerPC was the sole user of
to be removed.
Benjamin Gray (7):
powerpc/watchpoints: Explain thread_change_pc() more
powerpc/watchpoints: Don't track info persistently
powerpc/watchpoints: Track perf single step directly on the breakpoint
powerpc/watchpoints: Simplify watchpoint reinsertion
powerpc/watchpoints: Remove ptrace/perf exclusion tracking
selftests/powerpc/ptrace: Update ptrace-perf watchpoint selftest
perf/hw_breakpoint: Remove arch breakpoint hooks
arch/powerpc/include/asm/hw_breakpoint.h | 1 +
arch/powerpc/include/asm/processor.h | 5 -
arch/powerpc/kernel/hw_breakpoint.c | 388 +-----
include/linux/hw_breakpoint.h | 3 -
kernel/events/hw_breakpoint.c | 28 -
.../testing/selftests/powerpc/ptrace/Makefile | 1 +
.../powerpc/ptrace/ptrace-perf-asm.S | 33 +
.../powerpc/ptrace/ptrace-perf-hwbreak.c | 1104 +++++++----------
8 files changed, 537 insertions(+), 1026 deletions(-)
create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-perf-asm.S
rewrite tools/testing/selftests/powerpc/ptrace/ptrace-perf-hwbreak.c (93%)
--
2.41.0
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/7] powerpc/watchpoints: Explain thread_change_pc() more
2023-08-01 1:17 [PATCH 0/7] Rework perf and ptrace watchpoint tracking Benjamin Gray
@ 2023-08-01 1:17 ` Benjamin Gray
2023-08-01 1:17 ` [PATCH 2/7] powerpc/watchpoints: Don't track info persistently Benjamin Gray
` (7 subsequent siblings)
8 siblings, 0 replies; 11+ messages in thread
From: Benjamin Gray @ 2023-08-01 1:17 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Benjamin Gray
The behaviour of the thread_change_pc() function is a bit cryptic
without being more familiar with how the watchpoint logic handles
perf's after-execute semantics.
Expand the comment to explain why we can re-insert the breakpoint and
unset the perf_single_step flag.
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
---
arch/powerpc/kernel/hw_breakpoint.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
index e1b4e70c8fd0..bad2991f906b 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -499,6 +499,10 @@ int hw_breakpoint_arch_parse(struct perf_event *bp,
* Restores the breakpoint on the debug registers.
* Invoke this function if it is known that the execution context is
* about to change to cause loss of MSR_SE settings.
+ *
+ * The perf watchpoint will simply re-trigger once the thread is started again,
+ * and the watchpoint handler will set up MSR_SE and perf_single_step as
+ * needed.
*/
void thread_change_pc(struct task_struct *tsk, struct pt_regs *regs)
{
--
2.41.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 2/7] powerpc/watchpoints: Don't track info persistently
2023-08-01 1:17 [PATCH 0/7] Rework perf and ptrace watchpoint tracking Benjamin Gray
2023-08-01 1:17 ` [PATCH 1/7] powerpc/watchpoints: Explain thread_change_pc() more Benjamin Gray
@ 2023-08-01 1:17 ` Benjamin Gray
2023-08-01 1:17 ` [PATCH 3/7] powerpc/watchpoints: Track perf single step directly on the breakpoint Benjamin Gray
` (6 subsequent siblings)
8 siblings, 0 replies; 11+ messages in thread
From: Benjamin Gray @ 2023-08-01 1:17 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Benjamin Gray
info is cheap to retrieve, and is likely optimised by the compiler
anyway. On the other hand, propagating it across the functions makes it
possible to be inconsistent and adds needless complexity.
Remove it, and invoke counter_arch_bp() when we need to work with it.
As we don't persist it, we just use the local bp array to track whether
we are ignoring a breakpoint.
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
---
arch/powerpc/kernel/hw_breakpoint.c | 60 +++++++++++++++--------------
1 file changed, 32 insertions(+), 28 deletions(-)
diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
index bad2991f906b..e6749642604c 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -538,23 +538,22 @@ static bool is_octword_vsx_instr(int type, int size)
* We've failed in reliably handling the hw-breakpoint. Unregister
* it and throw a warning message to let the user know about it.
*/
-static void handler_error(struct perf_event *bp, struct arch_hw_breakpoint *info)
+static void handler_error(struct perf_event *bp)
{
WARN(1, "Unable to handle hardware breakpoint. Breakpoint at 0x%lx will be disabled.",
- info->address);
+ counter_arch_bp(bp)->address);
perf_event_disable_inatomic(bp);
}
-static void larx_stcx_err(struct perf_event *bp, struct arch_hw_breakpoint *info)
+static void larx_stcx_err(struct perf_event *bp)
{
printk_ratelimited("Breakpoint hit on instruction that can't be emulated. Breakpoint at 0x%lx will be disabled.\n",
- info->address);
+ counter_arch_bp(bp)->address);
perf_event_disable_inatomic(bp);
}
static bool stepping_handler(struct pt_regs *regs, struct perf_event **bp,
- struct arch_hw_breakpoint **info, int *hit,
- ppc_inst_t instr)
+ int *hit, ppc_inst_t instr)
{
int i;
int stepped;
@@ -565,7 +564,7 @@ static bool stepping_handler(struct pt_regs *regs, struct perf_event **bp,
if (!hit[i])
continue;
current->thread.last_hit_ubp[i] = bp[i];
- info[i] = NULL;
+ bp[i] = NULL;
}
regs_set_return_msr(regs, regs->msr | MSR_SE);
return false;
@@ -576,15 +575,15 @@ static bool stepping_handler(struct pt_regs *regs, struct perf_event **bp,
for (i = 0; i < nr_wp_slots(); i++) {
if (!hit[i])
continue;
- handler_error(bp[i], info[i]);
- info[i] = NULL;
+ handler_error(bp[i]);
+ bp[i] = NULL;
}
return false;
}
return true;
}
-static void handle_p10dd1_spurious_exception(struct arch_hw_breakpoint **info,
+static void handle_p10dd1_spurious_exception(struct perf_event **bp,
int *hit, unsigned long ea)
{
int i;
@@ -596,10 +595,14 @@ static void handle_p10dd1_spurious_exception(struct arch_hw_breakpoint **info,
* spurious exception.
*/
for (i = 0; i < nr_wp_slots(); i++) {
- if (!info[i])
+ struct arch_hw_breakpoint *info;
+
+ if (!bp[i])
continue;
- hw_end_addr = ALIGN(info[i]->address + info[i]->len, HW_BREAKPOINT_SIZE);
+ info = counter_arch_bp(bp[i]);
+
+ hw_end_addr = ALIGN(info->address + info->len, HW_BREAKPOINT_SIZE);
/*
* Ending address of DAWR range is less than starting
@@ -629,9 +632,9 @@ static void handle_p10dd1_spurious_exception(struct arch_hw_breakpoint **info,
return;
for (i = 0; i < nr_wp_slots(); i++) {
- if (info[i]) {
+ if (bp[i]) {
hit[i] = 1;
- info[i]->type |= HW_BRK_TYPE_EXTRANEOUS_IRQ;
+ counter_arch_bp(bp[i])->type |= HW_BRK_TYPE_EXTRANEOUS_IRQ;
}
}
}
@@ -642,7 +645,6 @@ int hw_breakpoint_handler(struct die_args *args)
int rc = NOTIFY_STOP;
struct perf_event *bp[HBP_NUM_MAX] = { NULL };
struct pt_regs *regs = args->regs;
- struct arch_hw_breakpoint *info[HBP_NUM_MAX] = { NULL };
int i;
int hit[HBP_NUM_MAX] = {0};
int nr_hit = 0;
@@ -667,18 +669,20 @@ int hw_breakpoint_handler(struct die_args *args)
wp_get_instr_detail(regs, &instr, &type, &size, &ea);
for (i = 0; i < nr_wp_slots(); i++) {
+ struct arch_hw_breakpoint *info;
+
bp[i] = __this_cpu_read(bp_per_reg[i]);
if (!bp[i])
continue;
- info[i] = counter_arch_bp(bp[i]);
- info[i]->type &= ~HW_BRK_TYPE_EXTRANEOUS_IRQ;
+ info = counter_arch_bp(bp[i]);
+ info->type &= ~HW_BRK_TYPE_EXTRANEOUS_IRQ;
- if (wp_check_constraints(regs, instr, ea, type, size, info[i])) {
+ if (wp_check_constraints(regs, instr, ea, type, size, info)) {
if (!IS_ENABLED(CONFIG_PPC_8xx) &&
ppc_inst_equal(instr, ppc_inst(0))) {
- handler_error(bp[i], info[i]);
- info[i] = NULL;
+ handler_error(bp[i]);
+ bp[i] = NULL;
err = 1;
continue;
}
@@ -697,7 +701,7 @@ int hw_breakpoint_handler(struct die_args *args)
/* Workaround for Power10 DD1 */
if (!IS_ENABLED(CONFIG_PPC_8xx) && mfspr(SPRN_PVR) == 0x800100 &&
is_octword_vsx_instr(type, size)) {
- handle_p10dd1_spurious_exception(info, hit, ea);
+ handle_p10dd1_spurious_exception(bp, hit, ea);
} else {
rc = NOTIFY_DONE;
goto out;
@@ -715,7 +719,7 @@ int hw_breakpoint_handler(struct die_args *args)
if (!hit[i])
continue;
perf_bp_event(bp[i], regs);
- info[i] = NULL;
+ bp[i] = NULL;
}
rc = NOTIFY_DONE;
goto reset;
@@ -726,13 +730,13 @@ int hw_breakpoint_handler(struct die_args *args)
for (i = 0; i < nr_wp_slots(); i++) {
if (!hit[i])
continue;
- larx_stcx_err(bp[i], info[i]);
- info[i] = NULL;
+ larx_stcx_err(bp[i]);
+ bp[i] = NULL;
}
goto reset;
}
- if (!stepping_handler(regs, bp, info, hit, instr))
+ if (!stepping_handler(regs, bp, hit, instr))
goto reset;
}
@@ -743,15 +747,15 @@ int hw_breakpoint_handler(struct die_args *args)
for (i = 0; i < nr_wp_slots(); i++) {
if (!hit[i])
continue;
- if (!(info[i]->type & HW_BRK_TYPE_EXTRANEOUS_IRQ))
+ if (!(counter_arch_bp(bp[i])->type & HW_BRK_TYPE_EXTRANEOUS_IRQ))
perf_bp_event(bp[i], regs);
}
reset:
for (i = 0; i < nr_wp_slots(); i++) {
- if (!info[i])
+ if (!bp[i])
continue;
- __set_breakpoint(i, info[i]);
+ __set_breakpoint(i, counter_arch_bp(bp[i]));
}
out:
--
2.41.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 3/7] powerpc/watchpoints: Track perf single step directly on the breakpoint
2023-08-01 1:17 [PATCH 0/7] Rework perf and ptrace watchpoint tracking Benjamin Gray
2023-08-01 1:17 ` [PATCH 1/7] powerpc/watchpoints: Explain thread_change_pc() more Benjamin Gray
2023-08-01 1:17 ` [PATCH 2/7] powerpc/watchpoints: Don't track info persistently Benjamin Gray
@ 2023-08-01 1:17 ` Benjamin Gray
2023-08-01 1:17 ` [PATCH 4/7] powerpc/watchpoints: Simplify watchpoint reinsertion Benjamin Gray
` (5 subsequent siblings)
8 siblings, 0 replies; 11+ messages in thread
From: Benjamin Gray @ 2023-08-01 1:17 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Benjamin Gray
There is a bug in the current watchpoint tracking logic, where the
teardown in arch_unregister_hw_breakpoint() uses bp->ctx->task, which it
does not have a reference of and parallel threads may be in the process
of destroying. This was partially addressed in commit fb822e6076d9
("powerpc/hw_breakpoint: Fix oops when destroying hw_breakpoint event"),
but the underlying issue of accessing a struct member in an unknown
state still remained. Syzkaller managed to trigger a null pointer
derefernce due to the race between the task destructor and checking the
pointer and dereferencing it in the loop.
While this null pointer dereference could be fixed by using READ_ONCE
to access the task up front, that just changes the error to manipulating
possbily freed memory.
Instead, the breakpoint logic needs to be reworked to remove any
dependency on a context or task struct during breakpoint removal.
The reason we have this currently is to clear thread.last_hit_ubp. This
member is used to differentiate the perf DAWR single-step sequence from
other causes of single-step, such as userspace just calling
ptrace(PTRACE_SINGLESTEP, ...). We need to differentiate them because,
when the single step interrupt is received, we need to know whether to
re-insert the DAWR breakpoint (perf) or not (ptrace / other).
arch_unregister_hw_breakpoint() needs to clear this information to
prevent dangling pointers to possibly freed memory. These pointers are
dereferenced in single_step_dabr_instruction() without a way to check
their validity.
This patch moves the tracking of this information to the breakpoint
itself. This means we no longer have to do anything special to clean up.
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
---
arch/powerpc/include/asm/hw_breakpoint.h | 1 +
arch/powerpc/include/asm/processor.h | 5 --
arch/powerpc/kernel/hw_breakpoint.c | 69 ++++++++----------------
3 files changed, 23 insertions(+), 52 deletions(-)
diff --git a/arch/powerpc/include/asm/hw_breakpoint.h b/arch/powerpc/include/asm/hw_breakpoint.h
index 84d39fd42f71..66db0147d5b4 100644
--- a/arch/powerpc/include/asm/hw_breakpoint.h
+++ b/arch/powerpc/include/asm/hw_breakpoint.h
@@ -18,6 +18,7 @@ struct arch_hw_breakpoint {
u16 len; /* length of the target data symbol */
u16 hw_len; /* length programmed in hw */
u8 flags;
+ bool perf_single_step; /* temporarily uninstalled for a perf single step */
};
/* Note: Don't change the first 6 bits below as they are in the same order
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index 8a6754ffdc7e..9e67cb1c72e9 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -172,11 +172,6 @@ struct thread_struct {
unsigned int align_ctl; /* alignment handling control */
#ifdef CONFIG_HAVE_HW_BREAKPOINT
struct perf_event *ptrace_bps[HBP_NUM_MAX];
- /*
- * Helps identify source of single-step exception and subsequent
- * hw-breakpoint enablement
- */
- struct perf_event *last_hit_ubp[HBP_NUM_MAX];
#endif /* CONFIG_HAVE_HW_BREAKPOINT */
struct arch_hw_breakpoint hw_brk[HBP_NUM_MAX]; /* hardware breakpoint info */
unsigned long trap_nr; /* last trap # on this thread */
diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
index e6749642604c..624375c18882 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -43,16 +43,6 @@ int hw_breakpoint_slots(int type)
return 0; /* no instruction breakpoints available */
}
-static bool single_step_pending(void)
-{
- int i;
-
- for (i = 0; i < nr_wp_slots(); i++) {
- if (current->thread.last_hit_ubp[i])
- return true;
- }
- return false;
-}
/*
* Install a perf counter breakpoint.
@@ -84,7 +74,7 @@ int arch_install_hw_breakpoint(struct perf_event *bp)
* Do not install DABR values if the instruction must be single-stepped.
* If so, DABR will be populated in single_step_dabr_instruction().
*/
- if (!single_step_pending())
+ if (!info->perf_single_step)
__set_breakpoint(i, info);
return 0;
@@ -371,28 +361,6 @@ void arch_release_bp_slot(struct perf_event *bp)
}
}
-/*
- * Perform cleanup of arch-specific counters during unregistration
- * of the perf-event
- */
-void arch_unregister_hw_breakpoint(struct perf_event *bp)
-{
- /*
- * If the breakpoint is unregistered between a hw_breakpoint_handler()
- * and the single_step_dabr_instruction(), then cleanup the breakpoint
- * restoration variables to prevent dangling pointers.
- * FIXME, this should not be using bp->ctx at all! Sayeth peterz.
- */
- if (bp->ctx && bp->ctx->task && bp->ctx->task != ((void *)-1L)) {
- int i;
-
- for (i = 0; i < nr_wp_slots(); i++) {
- if (bp->ctx->task->thread.last_hit_ubp[i] == bp)
- bp->ctx->task->thread.last_hit_ubp[i] = NULL;
- }
- }
-}
-
/*
* Check for virtual address in kernel space.
*/
@@ -510,7 +478,9 @@ void thread_change_pc(struct task_struct *tsk, struct pt_regs *regs)
int i;
for (i = 0; i < nr_wp_slots(); i++) {
- if (unlikely(tsk->thread.last_hit_ubp[i]))
+ struct perf_event *bp = __this_cpu_read(bp_per_reg[i]);
+
+ if (unlikely(bp && counter_arch_bp(bp)->perf_single_step))
goto reset;
}
return;
@@ -520,7 +490,7 @@ void thread_change_pc(struct task_struct *tsk, struct pt_regs *regs)
for (i = 0; i < nr_wp_slots(); i++) {
info = counter_arch_bp(__this_cpu_read(bp_per_reg[i]));
__set_breakpoint(i, info);
- tsk->thread.last_hit_ubp[i] = NULL;
+ info->perf_single_step = false;
}
}
@@ -563,7 +533,8 @@ static bool stepping_handler(struct pt_regs *regs, struct perf_event **bp,
for (i = 0; i < nr_wp_slots(); i++) {
if (!hit[i])
continue;
- current->thread.last_hit_ubp[i] = bp[i];
+
+ counter_arch_bp(bp[i])->perf_single_step = true;
bp[i] = NULL;
}
regs_set_return_msr(regs, regs->msr | MSR_SE);
@@ -770,24 +741,28 @@ NOKPROBE_SYMBOL(hw_breakpoint_handler);
static int single_step_dabr_instruction(struct die_args *args)
{
struct pt_regs *regs = args->regs;
- struct perf_event *bp = NULL;
- struct arch_hw_breakpoint *info;
- int i;
bool found = false;
/*
* Check if we are single-stepping as a result of a
* previous HW Breakpoint exception
*/
- for (i = 0; i < nr_wp_slots(); i++) {
- bp = current->thread.last_hit_ubp[i];
+ for (int i = 0; i < nr_wp_slots(); i++) {
+ struct perf_event *bp;
+ struct arch_hw_breakpoint *info;
+
+ bp = __this_cpu_read(bp_per_reg[i]);
if (!bp)
continue;
- found = true;
info = counter_arch_bp(bp);
+ if (!info->perf_single_step)
+ continue;
+
+ found = true;
+
/*
* We shall invoke the user-defined callback function in the
* single stepping handler to confirm to 'trigger-after-execute'
@@ -795,19 +770,19 @@ static int single_step_dabr_instruction(struct die_args *args)
*/
if (!(info->type & HW_BRK_TYPE_EXTRANEOUS_IRQ))
perf_bp_event(bp, regs);
- current->thread.last_hit_ubp[i] = NULL;
+
+ info->perf_single_step = false;
}
if (!found)
return NOTIFY_DONE;
- for (i = 0; i < nr_wp_slots(); i++) {
- bp = __this_cpu_read(bp_per_reg[i]);
+ for (int i = 0; i < nr_wp_slots(); i++) {
+ struct perf_event *bp = __this_cpu_read(bp_per_reg[i]);
if (!bp)
continue;
- info = counter_arch_bp(bp);
- __set_breakpoint(i, info);
+ __set_breakpoint(i, counter_arch_bp(bp));
}
/*
--
2.41.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 4/7] powerpc/watchpoints: Simplify watchpoint reinsertion
2023-08-01 1:17 [PATCH 0/7] Rework perf and ptrace watchpoint tracking Benjamin Gray
` (2 preceding siblings ...)
2023-08-01 1:17 ` [PATCH 3/7] powerpc/watchpoints: Track perf single step directly on the breakpoint Benjamin Gray
@ 2023-08-01 1:17 ` Benjamin Gray
2023-08-01 1:17 ` [PATCH 5/7] powerpc/watchpoints: Remove ptrace/perf exclusion tracking Benjamin Gray
` (4 subsequent siblings)
8 siblings, 0 replies; 11+ messages in thread
From: Benjamin Gray @ 2023-08-01 1:17 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Benjamin Gray
We only remove watchpoints when they have the perf_single_step flag set,
so we can reinsert them during the first iteration.
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
---
arch/powerpc/kernel/hw_breakpoint.c | 12 +-----------
1 file changed, 1 insertion(+), 11 deletions(-)
diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
index 624375c18882..bf8dda1a7e04 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -772,16 +772,6 @@ static int single_step_dabr_instruction(struct die_args *args)
perf_bp_event(bp, regs);
info->perf_single_step = false;
- }
-
- if (!found)
- return NOTIFY_DONE;
-
- for (int i = 0; i < nr_wp_slots(); i++) {
- struct perf_event *bp = __this_cpu_read(bp_per_reg[i]);
- if (!bp)
- continue;
-
__set_breakpoint(i, counter_arch_bp(bp));
}
@@ -789,7 +779,7 @@ static int single_step_dabr_instruction(struct die_args *args)
* If the process was being single-stepped by ptrace, let the
* other single-step actions occur (e.g. generate SIGTRAP).
*/
- if (test_thread_flag(TIF_SINGLESTEP))
+ if (!found || test_thread_flag(TIF_SINGLESTEP))
return NOTIFY_DONE;
return NOTIFY_STOP;
--
2.41.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 5/7] powerpc/watchpoints: Remove ptrace/perf exclusion tracking
2023-08-01 1:17 [PATCH 0/7] Rework perf and ptrace watchpoint tracking Benjamin Gray
` (3 preceding siblings ...)
2023-08-01 1:17 ` [PATCH 4/7] powerpc/watchpoints: Simplify watchpoint reinsertion Benjamin Gray
@ 2023-08-01 1:17 ` Benjamin Gray
2023-08-01 1:17 ` [PATCH 6/7] selftests/powerpc/ptrace: Update ptrace-perf watchpoint selftest Benjamin Gray
` (3 subsequent siblings)
8 siblings, 0 replies; 11+ messages in thread
From: Benjamin Gray @ 2023-08-01 1:17 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Benjamin Gray
ptrace and perf watchpoints were considered incompatible in
commit 29da4f91c0c1 ("powerpc/watchpoint: Don't allow concurrent perf
and ptrace events"), but the logic in that commit doesn't really apply.
Ptrace doesn't automatically single step; the ptracer must request this
explicitly. And the ptracer can do so regardless of whether a
ptrace/perf watchpoint triggered or not: it could single step every
instruction if it wanted to. Whatever stopped the ptracee before
executing the instruction that would trigger the perf watchpoint is no
longer relevant by this point.
To get correct behaviour when perf and ptrace are watching the same
data we must ignore the perf watchpoint. After all, ptrace has
before-execute semantics, and perf is after-execute, so perf doesn't
actually care about the watchpoint trigger at this point in time.
Pausing before execution does not mean we will actually end up executing
the instruction.
Importantly though, we don't remove the perf watchpoint yet. This is
key.
The ptracer is free to do whatever it likes right now. E.g., it can
continue the process, single step. or even set the child PC somewhere
completely different.
If it does try to execute the instruction though, without reinserting
the watchpoint (in which case we go back to the start of this example),
the perf watchpoint would immediately trigger. This time there is no
ptrace watchpoint, so we can safely perform a single step and increment
the perf counter. Upon receiving the single step exception, the existing
code already handles propagating or consuming it based on whether
another subsystem (e.g. ptrace) requested a single step. Again, this is
needed with or without perf/ptrace exclusion, because ptrace could be
single stepping this instruction regardless of if a watchpoint is
involved.
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
---
Whether it's a _good_ idea to mix ptrace and perf is another thing
entirely mind... . But they are not inherently incompatible.
---
arch/powerpc/kernel/hw_breakpoint.c | 249 +---------------------------
1 file changed, 1 insertion(+), 248 deletions(-)
diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
index bf8dda1a7e04..b8513dc3e53a 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -114,253 +114,6 @@ static bool is_ptrace_bp(struct perf_event *bp)
return bp->overflow_handler == ptrace_triggered;
}
-struct breakpoint {
- struct list_head list;
- struct perf_event *bp;
- bool ptrace_bp;
-};
-
-/*
- * While kernel/events/hw_breakpoint.c does its own synchronization, we cannot
- * rely on it safely synchronizing internals here; however, we can rely on it
- * not requesting more breakpoints than available.
- */
-static DEFINE_SPINLOCK(cpu_bps_lock);
-static DEFINE_PER_CPU(struct breakpoint *, cpu_bps[HBP_NUM_MAX]);
-static DEFINE_SPINLOCK(task_bps_lock);
-static LIST_HEAD(task_bps);
-
-static struct breakpoint *alloc_breakpoint(struct perf_event *bp)
-{
- struct breakpoint *tmp;
-
- tmp = kzalloc(sizeof(*tmp), GFP_KERNEL);
- if (!tmp)
- return ERR_PTR(-ENOMEM);
- tmp->bp = bp;
- tmp->ptrace_bp = is_ptrace_bp(bp);
- return tmp;
-}
-
-static bool bp_addr_range_overlap(struct perf_event *bp1, struct perf_event *bp2)
-{
- __u64 bp1_saddr, bp1_eaddr, bp2_saddr, bp2_eaddr;
-
- bp1_saddr = ALIGN_DOWN(bp1->attr.bp_addr, HW_BREAKPOINT_SIZE);
- bp1_eaddr = ALIGN(bp1->attr.bp_addr + bp1->attr.bp_len, HW_BREAKPOINT_SIZE);
- bp2_saddr = ALIGN_DOWN(bp2->attr.bp_addr, HW_BREAKPOINT_SIZE);
- bp2_eaddr = ALIGN(bp2->attr.bp_addr + bp2->attr.bp_len, HW_BREAKPOINT_SIZE);
-
- return (bp1_saddr < bp2_eaddr && bp1_eaddr > bp2_saddr);
-}
-
-static bool alternate_infra_bp(struct breakpoint *b, struct perf_event *bp)
-{
- return is_ptrace_bp(bp) ? !b->ptrace_bp : b->ptrace_bp;
-}
-
-static bool can_co_exist(struct breakpoint *b, struct perf_event *bp)
-{
- return !(alternate_infra_bp(b, bp) && bp_addr_range_overlap(b->bp, bp));
-}
-
-static int task_bps_add(struct perf_event *bp)
-{
- struct breakpoint *tmp;
-
- tmp = alloc_breakpoint(bp);
- if (IS_ERR(tmp))
- return PTR_ERR(tmp);
-
- spin_lock(&task_bps_lock);
- list_add(&tmp->list, &task_bps);
- spin_unlock(&task_bps_lock);
- return 0;
-}
-
-static void task_bps_remove(struct perf_event *bp)
-{
- struct list_head *pos, *q;
-
- spin_lock(&task_bps_lock);
- list_for_each_safe(pos, q, &task_bps) {
- struct breakpoint *tmp = list_entry(pos, struct breakpoint, list);
-
- if (tmp->bp == bp) {
- list_del(&tmp->list);
- kfree(tmp);
- break;
- }
- }
- spin_unlock(&task_bps_lock);
-}
-
-/*
- * If any task has breakpoint from alternate infrastructure,
- * return true. Otherwise return false.
- */
-static bool all_task_bps_check(struct perf_event *bp)
-{
- struct breakpoint *tmp;
- bool ret = false;
-
- spin_lock(&task_bps_lock);
- list_for_each_entry(tmp, &task_bps, list) {
- if (!can_co_exist(tmp, bp)) {
- ret = true;
- break;
- }
- }
- spin_unlock(&task_bps_lock);
- return ret;
-}
-
-/*
- * If same task has breakpoint from alternate infrastructure,
- * return true. Otherwise return false.
- */
-static bool same_task_bps_check(struct perf_event *bp)
-{
- struct breakpoint *tmp;
- bool ret = false;
-
- spin_lock(&task_bps_lock);
- list_for_each_entry(tmp, &task_bps, list) {
- if (tmp->bp->hw.target == bp->hw.target &&
- !can_co_exist(tmp, bp)) {
- ret = true;
- break;
- }
- }
- spin_unlock(&task_bps_lock);
- return ret;
-}
-
-static int cpu_bps_add(struct perf_event *bp)
-{
- struct breakpoint **cpu_bp;
- struct breakpoint *tmp;
- int i = 0;
-
- tmp = alloc_breakpoint(bp);
- if (IS_ERR(tmp))
- return PTR_ERR(tmp);
-
- spin_lock(&cpu_bps_lock);
- cpu_bp = per_cpu_ptr(cpu_bps, bp->cpu);
- for (i = 0; i < nr_wp_slots(); i++) {
- if (!cpu_bp[i]) {
- cpu_bp[i] = tmp;
- break;
- }
- }
- spin_unlock(&cpu_bps_lock);
- return 0;
-}
-
-static void cpu_bps_remove(struct perf_event *bp)
-{
- struct breakpoint **cpu_bp;
- int i = 0;
-
- spin_lock(&cpu_bps_lock);
- cpu_bp = per_cpu_ptr(cpu_bps, bp->cpu);
- for (i = 0; i < nr_wp_slots(); i++) {
- if (!cpu_bp[i])
- continue;
-
- if (cpu_bp[i]->bp == bp) {
- kfree(cpu_bp[i]);
- cpu_bp[i] = NULL;
- break;
- }
- }
- spin_unlock(&cpu_bps_lock);
-}
-
-static bool cpu_bps_check(int cpu, struct perf_event *bp)
-{
- struct breakpoint **cpu_bp;
- bool ret = false;
- int i;
-
- spin_lock(&cpu_bps_lock);
- cpu_bp = per_cpu_ptr(cpu_bps, cpu);
- for (i = 0; i < nr_wp_slots(); i++) {
- if (cpu_bp[i] && !can_co_exist(cpu_bp[i], bp)) {
- ret = true;
- break;
- }
- }
- spin_unlock(&cpu_bps_lock);
- return ret;
-}
-
-static bool all_cpu_bps_check(struct perf_event *bp)
-{
- int cpu;
-
- for_each_online_cpu(cpu) {
- if (cpu_bps_check(cpu, bp))
- return true;
- }
- return false;
-}
-
-int arch_reserve_bp_slot(struct perf_event *bp)
-{
- int ret;
-
- /* ptrace breakpoint */
- if (is_ptrace_bp(bp)) {
- if (all_cpu_bps_check(bp))
- return -ENOSPC;
-
- if (same_task_bps_check(bp))
- return -ENOSPC;
-
- return task_bps_add(bp);
- }
-
- /* perf breakpoint */
- if (is_kernel_addr(bp->attr.bp_addr))
- return 0;
-
- if (bp->hw.target && bp->cpu == -1) {
- if (same_task_bps_check(bp))
- return -ENOSPC;
-
- return task_bps_add(bp);
- } else if (!bp->hw.target && bp->cpu != -1) {
- if (all_task_bps_check(bp))
- return -ENOSPC;
-
- return cpu_bps_add(bp);
- }
-
- if (same_task_bps_check(bp))
- return -ENOSPC;
-
- ret = cpu_bps_add(bp);
- if (ret)
- return ret;
- ret = task_bps_add(bp);
- if (ret)
- cpu_bps_remove(bp);
-
- return ret;
-}
-
-void arch_release_bp_slot(struct perf_event *bp)
-{
- if (!is_kernel_addr(bp->attr.bp_addr)) {
- if (bp->hw.target)
- task_bps_remove(bp);
- if (bp->cpu != -1)
- cpu_bps_remove(bp);
- }
-}
-
/*
* Check for virtual address in kernel space.
*/
@@ -687,7 +440,7 @@ int hw_breakpoint_handler(struct die_args *args)
*/
if (ptrace_bp) {
for (i = 0; i < nr_wp_slots(); i++) {
- if (!hit[i])
+ if (!hit[i] || !is_ptrace_bp(bp[i]))
continue;
perf_bp_event(bp[i], regs);
bp[i] = NULL;
--
2.41.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 6/7] selftests/powerpc/ptrace: Update ptrace-perf watchpoint selftest
2023-08-01 1:17 [PATCH 0/7] Rework perf and ptrace watchpoint tracking Benjamin Gray
` (4 preceding siblings ...)
2023-08-01 1:17 ` [PATCH 5/7] powerpc/watchpoints: Remove ptrace/perf exclusion tracking Benjamin Gray
@ 2023-08-01 1:17 ` Benjamin Gray
2023-08-01 1:17 ` [PATCH 7/7] perf/hw_breakpoint: Remove arch breakpoint hooks Benjamin Gray
` (2 subsequent siblings)
8 siblings, 0 replies; 11+ messages in thread
From: Benjamin Gray @ 2023-08-01 1:17 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Benjamin Gray
Now that ptrace and perf are no longer exclusive, update the
test to exercise interesting interactions.
An assembly file is used for the children to allow precise instruction
choice and addresses, while avoiding any compiler quirks.
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
---
.../testing/selftests/powerpc/ptrace/Makefile | 1 +
.../powerpc/ptrace/ptrace-perf-asm.S | 33 +
.../powerpc/ptrace/ptrace-perf-hwbreak.c | 1104 +++++++----------
3 files changed, 479 insertions(+), 659 deletions(-)
create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-perf-asm.S
rewrite tools/testing/selftests/powerpc/ptrace/ptrace-perf-hwbreak.c (93%)
diff --git a/tools/testing/selftests/powerpc/ptrace/Makefile b/tools/testing/selftests/powerpc/ptrace/Makefile
index cbeeaeae8837..1b39b86849da 100644
--- a/tools/testing/selftests/powerpc/ptrace/Makefile
+++ b/tools/testing/selftests/powerpc/ptrace/Makefile
@@ -36,6 +36,7 @@ $(TM_TESTS): CFLAGS += -I../tm -mhtm
CFLAGS += $(KHDR_INCLUDES) -fno-pie
$(OUTPUT)/ptrace-gpr: ptrace-gpr.S
+$(OUTPUT)/ptrace-perf-hwbreak: ptrace-perf-asm.S
$(OUTPUT)/ptrace-pkey $(OUTPUT)/core-pkey: LDLIBS += -pthread
$(TEST_GEN_PROGS): ../harness.c ../utils.c ../lib/reg.S
diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-perf-asm.S b/tools/testing/selftests/powerpc/ptrace/ptrace-perf-asm.S
new file mode 100644
index 000000000000..9aa2e58f3189
--- /dev/null
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-perf-asm.S
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+
+#include <ppc-asm.h>
+
+.global same_watch_addr_load
+.global same_watch_addr_trap
+
+FUNC_START(same_watch_addr_child)
+ nop
+same_watch_addr_load:
+ ld 0,0(3)
+ nop
+same_watch_addr_trap:
+ trap
+ blr
+FUNC_END(same_watch_addr_child)
+
+
+.global perf_then_ptrace_load1
+.global perf_then_ptrace_load2
+.global perf_then_ptrace_trap
+
+FUNC_START(perf_then_ptrace_child)
+ nop
+perf_then_ptrace_load1:
+ ld 0,0(3)
+perf_then_ptrace_load2:
+ ld 0,0(4)
+ nop
+perf_then_ptrace_trap:
+ trap
+ blr
+FUNC_END(perf_then_ptrace_child)
diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-perf-hwbreak.c b/tools/testing/selftests/powerpc/ptrace/ptrace-perf-hwbreak.c
dissimilarity index 93%
index d8a9e95fc03d..a0a0b9bb5854 100644
--- a/tools/testing/selftests/powerpc/ptrace/ptrace-perf-hwbreak.c
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-perf-hwbreak.c
@@ -1,659 +1,445 @@
-// SPDX-License-Identifier: GPL-2.0+
-#include <stdio.h>
-#include <string.h>
-#include <signal.h>
-#include <stdlib.h>
-#include <unistd.h>
-#include <errno.h>
-#include <linux/hw_breakpoint.h>
-#include <linux/perf_event.h>
-#include <asm/unistd.h>
-#include <sys/ptrace.h>
-#include <sys/wait.h>
-#include "ptrace.h"
-
-char data[16];
-
-/* Overlapping address range */
-volatile __u64 *ptrace_data1 = (__u64 *)&data[0];
-volatile __u64 *perf_data1 = (__u64 *)&data[4];
-
-/* Non-overlapping address range */
-volatile __u64 *ptrace_data2 = (__u64 *)&data[0];
-volatile __u64 *perf_data2 = (__u64 *)&data[8];
-
-static unsigned long pid_max_addr(void)
-{
- FILE *fp;
- char *line, *c;
- char addr[100];
- size_t len = 0;
-
- fp = fopen("/proc/kallsyms", "r");
- if (!fp) {
- printf("Failed to read /proc/kallsyms. Exiting..\n");
- exit(EXIT_FAILURE);
- }
-
- while (getline(&line, &len, fp) != -1) {
- if (!strstr(line, "pid_max") || strstr(line, "pid_max_max") ||
- strstr(line, "pid_max_min"))
- continue;
-
- strncpy(addr, line, len < 100 ? len : 100);
- c = strchr(addr, ' ');
- *c = '\0';
- return strtoul(addr, &c, 16);
- }
- fclose(fp);
- printf("Could not find pid_max. Exiting..\n");
- exit(EXIT_FAILURE);
- return -1;
-}
-
-static void perf_user_event_attr_set(struct perf_event_attr *attr, __u64 addr, __u64 len)
-{
- memset(attr, 0, sizeof(struct perf_event_attr));
- attr->type = PERF_TYPE_BREAKPOINT;
- attr->size = sizeof(struct perf_event_attr);
- attr->bp_type = HW_BREAKPOINT_R;
- attr->bp_addr = addr;
- attr->bp_len = len;
- attr->exclude_kernel = 1;
- attr->exclude_hv = 1;
-}
-
-static void perf_kernel_event_attr_set(struct perf_event_attr *attr)
-{
- memset(attr, 0, sizeof(struct perf_event_attr));
- attr->type = PERF_TYPE_BREAKPOINT;
- attr->size = sizeof(struct perf_event_attr);
- attr->bp_type = HW_BREAKPOINT_R;
- attr->bp_addr = pid_max_addr();
- attr->bp_len = sizeof(unsigned long);
- attr->exclude_user = 1;
- attr->exclude_hv = 1;
-}
-
-static int perf_cpu_event_open(int cpu, __u64 addr, __u64 len)
-{
- struct perf_event_attr attr;
-
- perf_user_event_attr_set(&attr, addr, len);
- return syscall(__NR_perf_event_open, &attr, -1, cpu, -1, 0);
-}
-
-static int perf_thread_event_open(pid_t child_pid, __u64 addr, __u64 len)
-{
- struct perf_event_attr attr;
-
- perf_user_event_attr_set(&attr, addr, len);
- return syscall(__NR_perf_event_open, &attr, child_pid, -1, -1, 0);
-}
-
-static int perf_thread_cpu_event_open(pid_t child_pid, int cpu, __u64 addr, __u64 len)
-{
- struct perf_event_attr attr;
-
- perf_user_event_attr_set(&attr, addr, len);
- return syscall(__NR_perf_event_open, &attr, child_pid, cpu, -1, 0);
-}
-
-static int perf_thread_kernel_event_open(pid_t child_pid)
-{
- struct perf_event_attr attr;
-
- perf_kernel_event_attr_set(&attr);
- return syscall(__NR_perf_event_open, &attr, child_pid, -1, -1, 0);
-}
-
-static int perf_cpu_kernel_event_open(int cpu)
-{
- struct perf_event_attr attr;
-
- perf_kernel_event_attr_set(&attr);
- return syscall(__NR_perf_event_open, &attr, -1, cpu, -1, 0);
-}
-
-static int child(void)
-{
- int ret;
-
- ret = ptrace(PTRACE_TRACEME, 0, NULL, 0);
- if (ret) {
- printf("Error: PTRACE_TRACEME failed\n");
- return 0;
- }
- kill(getpid(), SIGUSR1); /* --> parent (SIGUSR1) */
-
- return 0;
-}
-
-static void ptrace_ppc_hw_breakpoint(struct ppc_hw_breakpoint *info, int type,
- __u64 addr, int len)
-{
- info->version = 1;
- info->trigger_type = type;
- info->condition_mode = PPC_BREAKPOINT_CONDITION_NONE;
- info->addr = addr;
- info->addr2 = addr + len;
- info->condition_value = 0;
- if (!len)
- info->addr_mode = PPC_BREAKPOINT_MODE_EXACT;
- else
- info->addr_mode = PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE;
-}
-
-static int ptrace_open(pid_t child_pid, __u64 wp_addr, int len)
-{
- struct ppc_hw_breakpoint info;
-
- ptrace_ppc_hw_breakpoint(&info, PPC_BREAKPOINT_TRIGGER_RW, wp_addr, len);
- return ptrace(PPC_PTRACE_SETHWDEBUG, child_pid, 0, &info);
-}
-
-static int test1(pid_t child_pid)
-{
- int perf_fd;
- int ptrace_fd;
- int ret = 0;
-
- /* Test:
- * if (new per thread event by ptrace)
- * if (existing cpu event by perf)
- * if (addr range overlaps)
- * fail;
- */
-
- perf_fd = perf_cpu_event_open(0, (__u64)perf_data1, sizeof(*perf_data1));
- if (perf_fd < 0)
- return -1;
-
- ptrace_fd = ptrace_open(child_pid, (__u64)ptrace_data1, sizeof(*ptrace_data1));
- if (ptrace_fd > 0 || errno != ENOSPC)
- ret = -1;
-
- close(perf_fd);
- return ret;
-}
-
-static int test2(pid_t child_pid)
-{
- int perf_fd;
- int ptrace_fd;
- int ret = 0;
-
- /* Test:
- * if (new per thread event by ptrace)
- * if (existing cpu event by perf)
- * if (addr range does not overlaps)
- * allow;
- */
-
- perf_fd = perf_cpu_event_open(0, (__u64)perf_data2, sizeof(*perf_data2));
- if (perf_fd < 0)
- return -1;
-
- ptrace_fd = ptrace_open(child_pid, (__u64)ptrace_data2, sizeof(*ptrace_data2));
- if (ptrace_fd < 0) {
- ret = -1;
- goto perf_close;
- }
- ptrace(PPC_PTRACE_DELHWDEBUG, child_pid, 0, ptrace_fd);
-
-perf_close:
- close(perf_fd);
- return ret;
-}
-
-static int test3(pid_t child_pid)
-{
- int perf_fd;
- int ptrace_fd;
- int ret = 0;
-
- /* Test:
- * if (new per thread event by ptrace)
- * if (existing thread event by perf on the same thread)
- * if (addr range overlaps)
- * fail;
- */
- perf_fd = perf_thread_event_open(child_pid, (__u64)perf_data1,
- sizeof(*perf_data1));
- if (perf_fd < 0)
- return -1;
-
- ptrace_fd = ptrace_open(child_pid, (__u64)ptrace_data1, sizeof(*ptrace_data1));
- if (ptrace_fd > 0 || errno != ENOSPC)
- ret = -1;
-
- close(perf_fd);
- return ret;
-}
-
-static int test4(pid_t child_pid)
-{
- int perf_fd;
- int ptrace_fd;
- int ret = 0;
-
- /* Test:
- * if (new per thread event by ptrace)
- * if (existing thread event by perf on the same thread)
- * if (addr range does not overlaps)
- * fail;
- */
- perf_fd = perf_thread_event_open(child_pid, (__u64)perf_data2,
- sizeof(*perf_data2));
- if (perf_fd < 0)
- return -1;
-
- ptrace_fd = ptrace_open(child_pid, (__u64)ptrace_data2, sizeof(*ptrace_data2));
- if (ptrace_fd < 0) {
- ret = -1;
- goto perf_close;
- }
- ptrace(PPC_PTRACE_DELHWDEBUG, child_pid, 0, ptrace_fd);
-
-perf_close:
- close(perf_fd);
- return ret;
-}
-
-static int test5(pid_t child_pid)
-{
- int perf_fd;
- int ptrace_fd;
- int cpid;
- int ret = 0;
-
- /* Test:
- * if (new per thread event by ptrace)
- * if (existing thread event by perf on the different thread)
- * allow;
- */
- cpid = fork();
- if (!cpid) {
- /* Temporary Child */
- pause();
- exit(EXIT_SUCCESS);
- }
-
- perf_fd = perf_thread_event_open(cpid, (__u64)perf_data1, sizeof(*perf_data1));
- if (perf_fd < 0) {
- ret = -1;
- goto kill_child;
- }
-
- ptrace_fd = ptrace_open(child_pid, (__u64)ptrace_data1, sizeof(*ptrace_data1));
- if (ptrace_fd < 0) {
- ret = -1;
- goto perf_close;
- }
-
- ptrace(PPC_PTRACE_DELHWDEBUG, child_pid, 0, ptrace_fd);
-perf_close:
- close(perf_fd);
-kill_child:
- kill(cpid, SIGINT);
- return ret;
-}
-
-static int test6(pid_t child_pid)
-{
- int perf_fd;
- int ptrace_fd;
- int ret = 0;
-
- /* Test:
- * if (new per thread kernel event by perf)
- * if (existing thread event by ptrace on the same thread)
- * allow;
- * -- OR --
- * if (new per cpu kernel event by perf)
- * if (existing thread event by ptrace)
- * allow;
- */
- ptrace_fd = ptrace_open(child_pid, (__u64)ptrace_data1, sizeof(*ptrace_data1));
- if (ptrace_fd < 0)
- return -1;
-
- perf_fd = perf_thread_kernel_event_open(child_pid);
- if (perf_fd < 0) {
- ret = -1;
- goto ptrace_close;
- }
- close(perf_fd);
-
- perf_fd = perf_cpu_kernel_event_open(0);
- if (perf_fd < 0) {
- ret = -1;
- goto ptrace_close;
- }
- close(perf_fd);
-
-ptrace_close:
- ptrace(PPC_PTRACE_DELHWDEBUG, child_pid, 0, ptrace_fd);
- return ret;
-}
-
-static int test7(pid_t child_pid)
-{
- int perf_fd;
- int ptrace_fd;
- int ret = 0;
-
- /* Test:
- * if (new per thread event by perf)
- * if (existing thread event by ptrace on the same thread)
- * if (addr range overlaps)
- * fail;
- */
- ptrace_fd = ptrace_open(child_pid, (__u64)ptrace_data1, sizeof(*ptrace_data1));
- if (ptrace_fd < 0)
- return -1;
-
- perf_fd = perf_thread_event_open(child_pid, (__u64)perf_data1,
- sizeof(*perf_data1));
- if (perf_fd > 0 || errno != ENOSPC)
- ret = -1;
-
- ptrace(PPC_PTRACE_DELHWDEBUG, child_pid, 0, ptrace_fd);
- return ret;
-}
-
-static int test8(pid_t child_pid)
-{
- int perf_fd;
- int ptrace_fd;
- int ret = 0;
-
- /* Test:
- * if (new per thread event by perf)
- * if (existing thread event by ptrace on the same thread)
- * if (addr range does not overlaps)
- * allow;
- */
- ptrace_fd = ptrace_open(child_pid, (__u64)ptrace_data2, sizeof(*ptrace_data2));
- if (ptrace_fd < 0)
- return -1;
-
- perf_fd = perf_thread_event_open(child_pid, (__u64)perf_data2,
- sizeof(*perf_data2));
- if (perf_fd < 0) {
- ret = -1;
- goto ptrace_close;
- }
- close(perf_fd);
-
-ptrace_close:
- ptrace(PPC_PTRACE_DELHWDEBUG, child_pid, 0, ptrace_fd);
- return ret;
-}
-
-static int test9(pid_t child_pid)
-{
- int perf_fd;
- int ptrace_fd;
- int cpid;
- int ret = 0;
-
- /* Test:
- * if (new per thread event by perf)
- * if (existing thread event by ptrace on the other thread)
- * allow;
- */
- ptrace_fd = ptrace_open(child_pid, (__u64)ptrace_data1, sizeof(*ptrace_data1));
- if (ptrace_fd < 0)
- return -1;
-
- cpid = fork();
- if (!cpid) {
- /* Temporary Child */
- pause();
- exit(EXIT_SUCCESS);
- }
-
- perf_fd = perf_thread_event_open(cpid, (__u64)perf_data1, sizeof(*perf_data1));
- if (perf_fd < 0) {
- ret = -1;
- goto kill_child;
- }
- close(perf_fd);
-
-kill_child:
- kill(cpid, SIGINT);
- ptrace(PPC_PTRACE_DELHWDEBUG, child_pid, 0, ptrace_fd);
- return ret;
-}
-
-static int test10(pid_t child_pid)
-{
- int perf_fd;
- int ptrace_fd;
- int ret = 0;
-
- /* Test:
- * if (new per cpu event by perf)
- * if (existing thread event by ptrace on the same thread)
- * if (addr range overlaps)
- * fail;
- */
- ptrace_fd = ptrace_open(child_pid, (__u64)ptrace_data1, sizeof(*ptrace_data1));
- if (ptrace_fd < 0)
- return -1;
-
- perf_fd = perf_cpu_event_open(0, (__u64)perf_data1, sizeof(*perf_data1));
- if (perf_fd > 0 || errno != ENOSPC)
- ret = -1;
-
- ptrace(PPC_PTRACE_DELHWDEBUG, child_pid, 0, ptrace_fd);
- return ret;
-}
-
-static int test11(pid_t child_pid)
-{
- int perf_fd;
- int ptrace_fd;
- int ret = 0;
-
- /* Test:
- * if (new per cpu event by perf)
- * if (existing thread event by ptrace on the same thread)
- * if (addr range does not overlap)
- * allow;
- */
- ptrace_fd = ptrace_open(child_pid, (__u64)ptrace_data2, sizeof(*ptrace_data2));
- if (ptrace_fd < 0)
- return -1;
-
- perf_fd = perf_cpu_event_open(0, (__u64)perf_data2, sizeof(*perf_data2));
- if (perf_fd < 0) {
- ret = -1;
- goto ptrace_close;
- }
- close(perf_fd);
-
-ptrace_close:
- ptrace(PPC_PTRACE_DELHWDEBUG, child_pid, 0, ptrace_fd);
- return ret;
-}
-
-static int test12(pid_t child_pid)
-{
- int perf_fd;
- int ptrace_fd;
- int ret = 0;
-
- /* Test:
- * if (new per thread and per cpu event by perf)
- * if (existing thread event by ptrace on the same thread)
- * if (addr range overlaps)
- * fail;
- */
- ptrace_fd = ptrace_open(child_pid, (__u64)ptrace_data1, sizeof(*ptrace_data1));
- if (ptrace_fd < 0)
- return -1;
-
- perf_fd = perf_thread_cpu_event_open(child_pid, 0, (__u64)perf_data1, sizeof(*perf_data1));
- if (perf_fd > 0 || errno != ENOSPC)
- ret = -1;
-
- ptrace(PPC_PTRACE_DELHWDEBUG, child_pid, 0, ptrace_fd);
- return ret;
-}
-
-static int test13(pid_t child_pid)
-{
- int perf_fd;
- int ptrace_fd;
- int ret = 0;
-
- /* Test:
- * if (new per thread and per cpu event by perf)
- * if (existing thread event by ptrace on the same thread)
- * if (addr range does not overlap)
- * allow;
- */
- ptrace_fd = ptrace_open(child_pid, (__u64)ptrace_data2, sizeof(*ptrace_data2));
- if (ptrace_fd < 0)
- return -1;
-
- perf_fd = perf_thread_cpu_event_open(child_pid, 0, (__u64)perf_data2, sizeof(*perf_data2));
- if (perf_fd < 0) {
- ret = -1;
- goto ptrace_close;
- }
- close(perf_fd);
-
-ptrace_close:
- ptrace(PPC_PTRACE_DELHWDEBUG, child_pid, 0, ptrace_fd);
- return ret;
-}
-
-static int test14(pid_t child_pid)
-{
- int perf_fd;
- int ptrace_fd;
- int cpid;
- int ret = 0;
-
- /* Test:
- * if (new per thread and per cpu event by perf)
- * if (existing thread event by ptrace on the other thread)
- * allow;
- */
- ptrace_fd = ptrace_open(child_pid, (__u64)ptrace_data1, sizeof(*ptrace_data1));
- if (ptrace_fd < 0)
- return -1;
-
- cpid = fork();
- if (!cpid) {
- /* Temporary Child */
- pause();
- exit(EXIT_SUCCESS);
- }
-
- perf_fd = perf_thread_cpu_event_open(cpid, 0, (__u64)perf_data1,
- sizeof(*perf_data1));
- if (perf_fd < 0) {
- ret = -1;
- goto kill_child;
- }
- close(perf_fd);
-
-kill_child:
- kill(cpid, SIGINT);
- ptrace(PPC_PTRACE_DELHWDEBUG, child_pid, 0, ptrace_fd);
- return ret;
-}
-
-static int do_test(const char *msg, int (*fun)(pid_t arg), pid_t arg)
-{
- int ret;
-
- ret = fun(arg);
- if (ret)
- printf("%s: Error\n", msg);
- else
- printf("%s: Ok\n", msg);
- return ret;
-}
-
-char *desc[14] = {
- "perf cpu event -> ptrace thread event (Overlapping)",
- "perf cpu event -> ptrace thread event (Non-overlapping)",
- "perf thread event -> ptrace same thread event (Overlapping)",
- "perf thread event -> ptrace same thread event (Non-overlapping)",
- "perf thread event -> ptrace other thread event",
- "ptrace thread event -> perf kernel event",
- "ptrace thread event -> perf same thread event (Overlapping)",
- "ptrace thread event -> perf same thread event (Non-overlapping)",
- "ptrace thread event -> perf other thread event",
- "ptrace thread event -> perf cpu event (Overlapping)",
- "ptrace thread event -> perf cpu event (Non-overlapping)",
- "ptrace thread event -> perf same thread & cpu event (Overlapping)",
- "ptrace thread event -> perf same thread & cpu event (Non-overlapping)",
- "ptrace thread event -> perf other thread & cpu event",
-};
-
-static int test(pid_t child_pid)
-{
- int ret = TEST_PASS;
-
- ret |= do_test(desc[0], test1, child_pid);
- ret |= do_test(desc[1], test2, child_pid);
- ret |= do_test(desc[2], test3, child_pid);
- ret |= do_test(desc[3], test4, child_pid);
- ret |= do_test(desc[4], test5, child_pid);
- ret |= do_test(desc[5], test6, child_pid);
- ret |= do_test(desc[6], test7, child_pid);
- ret |= do_test(desc[7], test8, child_pid);
- ret |= do_test(desc[8], test9, child_pid);
- ret |= do_test(desc[9], test10, child_pid);
- ret |= do_test(desc[10], test11, child_pid);
- ret |= do_test(desc[11], test12, child_pid);
- ret |= do_test(desc[12], test13, child_pid);
- ret |= do_test(desc[13], test14, child_pid);
-
- return ret;
-}
-
-static void get_dbginfo(pid_t child_pid, struct ppc_debug_info *dbginfo)
-{
- if (ptrace(PPC_PTRACE_GETHWDBGINFO, child_pid, NULL, dbginfo)) {
- perror("Can't get breakpoint info");
- exit(-1);
- }
-}
-
-static int ptrace_perf_hwbreak(void)
-{
- int ret;
- pid_t child_pid;
- struct ppc_debug_info dbginfo;
-
- child_pid = fork();
- if (!child_pid)
- return child();
-
- /* parent */
- wait(NULL); /* <-- child (SIGUSR1) */
-
- get_dbginfo(child_pid, &dbginfo);
- SKIP_IF_MSG(dbginfo.num_data_bps <= 1, "Not enough data watchpoints (need at least 2)");
-
- ret = perf_cpu_event_open(0, (__u64)perf_data1, sizeof(*perf_data1));
- SKIP_IF_MSG(ret < 0, "perf_event_open syscall failed");
- close(ret);
-
- ret = test(child_pid);
-
- ptrace(PTRACE_CONT, child_pid, NULL, 0);
- return ret;
-}
-
-int main(int argc, char *argv[])
-{
- return test_harness(ptrace_perf_hwbreak, "ptrace-perf-hwbreak");
-}
+// SPDX-License-Identifier: GPL-2.0+
+
+#include <asm/unistd.h>
+#include <linux/hw_breakpoint.h>
+#include <linux/ptrace.h>
+#include <memory.h>
+#include <stdlib.h>
+#include <sys/wait.h>
+
+#include "utils.h"
+
+/*
+ * Child subroutine that performs a load on the address, then traps
+ */
+void same_watch_addr_child(unsigned long *addr);
+
+/* Address of the ld instruction in same_watch_addr_child() */
+extern char same_watch_addr_load[];
+
+/* Address of the end trap instruction in same_watch_addr_child() */
+extern char same_watch_addr_trap[];
+
+/*
+ * Child subroutine that performs a load on the first address, then a load on
+ * the second address (with no instructions separating this from the first
+ * load), then traps.
+ */
+void perf_then_ptrace_child(unsigned long *first_addr, unsigned long *second_addr);
+
+/* Address of the first ld instruction in perf_then_ptrace_child() */
+extern char perf_then_ptrace_load1[];
+
+/* Address of the second ld instruction in perf_then_ptrace_child() */
+extern char perf_then_ptrace_load2[];
+
+/* Address of the end trap instruction in perf_then_ptrace_child() */
+extern char perf_then_ptrace_trap[];
+
+static inline long sys_ptrace(long request, pid_t pid, unsigned long addr, unsigned long data)
+{
+ return syscall(__NR_ptrace, request, pid, addr, data);
+}
+
+static long ptrace_traceme(void)
+{
+ return sys_ptrace(PTRACE_TRACEME, 0, 0, 0);
+}
+
+static long ptrace_getregs(pid_t pid, struct pt_regs *result)
+{
+ return sys_ptrace(PTRACE_GETREGS, pid, 0, (unsigned long)result);
+}
+
+static long ptrace_setregs(pid_t pid, struct pt_regs *result)
+{
+ return sys_ptrace(PTRACE_SETREGS, pid, 0, (unsigned long)result);
+}
+
+static long ptrace_cont(pid_t pid, long signal)
+{
+ return sys_ptrace(PTRACE_CONT, pid, 0, signal);
+}
+
+static long ptrace_singlestep(pid_t pid, long signal)
+{
+ return sys_ptrace(PTRACE_SINGLESTEP, pid, 0, signal);
+}
+
+static long ppc_ptrace_gethwdbginfo(pid_t pid, struct ppc_debug_info *dbginfo)
+{
+ return sys_ptrace(PPC_PTRACE_GETHWDBGINFO, pid, 0, (unsigned long)dbginfo);
+}
+
+static long ppc_ptrace_sethwdbg(pid_t pid, struct ppc_hw_breakpoint *bp_info)
+{
+ return sys_ptrace(PPC_PTRACE_SETHWDEBUG, pid, 0, (unsigned long)bp_info);
+}
+
+static long ppc_ptrace_delhwdbg(pid_t pid, int bp_id)
+{
+ return sys_ptrace(PPC_PTRACE_DELHWDEBUG, pid, 0L, bp_id);
+}
+
+static long ptrace_getreg_pc(pid_t pid, void **pc)
+{
+ struct pt_regs regs;
+ long err;
+
+ err = ptrace_getregs(pid, ®s);
+ if (err)
+ return err;
+
+ *pc = (void *)regs.nip;
+
+ return 0;
+}
+
+static long ptrace_setreg_pc(pid_t pid, void *pc)
+{
+ struct pt_regs regs;
+ long err;
+
+ err = ptrace_getregs(pid, ®s);
+ if (err)
+ return err;
+
+ regs.nip = (unsigned long)pc;
+
+ err = ptrace_setregs(pid, ®s);
+ if (err)
+ return err;
+
+ return 0;
+}
+
+static int perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu,
+ int group_fd, unsigned long flags)
+{
+ return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
+}
+
+static void perf_user_event_attr_set(struct perf_event_attr *attr, void *addr, u64 len)
+{
+ memset(attr, 0, sizeof(struct perf_event_attr));
+
+ attr->type = PERF_TYPE_BREAKPOINT;
+ attr->size = sizeof(struct perf_event_attr);
+ attr->bp_type = HW_BREAKPOINT_R;
+ attr->bp_addr = (u64)addr;
+ attr->bp_len = len;
+ attr->exclude_kernel = 1;
+ attr->exclude_hv = 1;
+}
+
+static int perf_watchpoint_open(pid_t child_pid, void *addr, u64 len)
+{
+ struct perf_event_attr attr;
+
+ perf_user_event_attr_set(&attr, addr, len);
+ return perf_event_open(&attr, child_pid, -1, -1, 0);
+}
+
+static int perf_read_counter(int perf_fd, u64 *count)
+{
+ /*
+ * A perf counter is retrieved by the read() syscall. It contains
+ * the current count as 8 bytes that are interpreted as a u64
+ */
+ ssize_t len = read(perf_fd, count, sizeof(*count));
+
+ if (len != sizeof(*count))
+ return -1;
+
+ return 0;
+}
+
+static void ppc_ptrace_init_breakpoint(struct ppc_hw_breakpoint *info,
+ int type, void *addr, int len)
+{
+ info->version = 1;
+ info->trigger_type = type;
+ info->condition_mode = PPC_BREAKPOINT_CONDITION_NONE;
+ info->addr = (u64)addr;
+ info->addr2 = (u64)addr + len;
+ info->condition_value = 0;
+ if (!len)
+ info->addr_mode = PPC_BREAKPOINT_MODE_EXACT;
+ else
+ info->addr_mode = PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE;
+}
+
+/*
+ * Checks if we can place at least 2 watchpoints on the child process
+ */
+static int check_watchpoints(pid_t pid)
+{
+ struct ppc_debug_info dbginfo;
+
+ FAIL_IF_MSG(ppc_ptrace_gethwdbginfo(pid, &dbginfo), "PPC_PTRACE_GETHWDBGINFO failed");
+ SKIP_IF_MSG(dbginfo.num_data_bps <= 1, "Not enough data watchpoints (need at least 2)");
+
+ return 0;
+}
+
+/*
+ * Wrapper around a plain fork() call that sets up the child for
+ * ptrace-ing. Both the parent and child return from this, though
+ * the child is stopped until ptrace_cont(pid) is run by the parent.
+ */
+static int ptrace_fork_child(pid_t *pid)
+{
+ int status;
+
+ *pid = fork();
+
+ if (*pid < 0)
+ FAIL_IF_MSG(1, "Failed to fork child");
+
+ if (!*pid) {
+ FAIL_IF_EXIT_MSG(ptrace_traceme(), "PTRACE_TRACEME failed");
+ FAIL_IF_EXIT_MSG(raise(SIGSTOP), "Child failed to raise SIGSTOP");
+ } else {
+ /* Synchronise on child SIGSTOP */
+ FAIL_IF_MSG(waitpid(*pid, &status, 0) == -1, "Failed to wait for child");
+ FAIL_IF_MSG(!WIFSTOPPED(status), "Child is not stopped");
+ }
+
+ return 0;
+}
+
+/*
+ * Tests the interaction between ptrace and perf watching the same data.
+ *
+ * We expect ptrace to take 'priority', as it is has before-execute
+ * semantics.
+ *
+ * The perf counter should not be incremented yet because perf has after-execute
+ * semantics. E.g., if ptrace changes the child PC, we don't even execute the
+ * instruction at all.
+ *
+ * When the child is stopped for ptrace, we test both continue and single step.
+ * Both should increment the perf counter. We also test changing the PC somewhere
+ * different and stepping, which should not increment the perf counter.
+ */
+int same_watch_addr_test(void)
+{
+ struct ppc_hw_breakpoint bp_info; /* ptrace breakpoint info */
+ int bp_id; /* Breakpoint handle of ptrace watchpoint */
+ int perf_fd; /* File descriptor of perf performance counter */
+ u64 perf_count; /* Most recently fetched perf performance counter value */
+ pid_t pid; /* PID of child process */
+ void *pc; /* Most recently fetched child PC value */
+ int status; /* Stop status of child after waitpid */
+ unsigned long value; /* Dummy value to be read/written to by child */
+ int err;
+
+ err = ptrace_fork_child(&pid);
+ if (err)
+ return err;
+
+ if (!pid) {
+ same_watch_addr_child(&value);
+ exit(1);
+ }
+
+ err = check_watchpoints(pid);
+ if (err)
+ return err;
+
+ /* Place a perf watchpoint counter on value */
+ perf_fd = perf_watchpoint_open(pid, &value, sizeof(value));
+ FAIL_IF_MSG(perf_fd < 0, "Failed to open perf performance counter");
+
+ /* Place a ptrace watchpoint on value */
+ ppc_ptrace_init_breakpoint(&bp_info, PPC_BREAKPOINT_TRIGGER_READ, &value, sizeof(value));
+ bp_id = ppc_ptrace_sethwdbg(pid, &bp_info);
+ FAIL_IF_MSG(bp_id < 0, "Failed to set ptrace watchpoint");
+
+ /* Let the child run. It should stop on the ptrace watchpoint */
+ FAIL_IF_MSG(ptrace_cont(pid, 0), "Failed to continue child");
+
+ FAIL_IF_MSG(waitpid(pid, &status, 0) == -1, "Failed to wait for child");
+ FAIL_IF_MSG(!WIFSTOPPED(status), "Child is not stopped");
+ FAIL_IF_MSG(ptrace_getreg_pc(pid, &pc), "Failed to get child PC");
+ FAIL_IF_MSG(pc != same_watch_addr_load, "Child did not stop on load instruction");
+
+ /*
+ * We stopped before executing the load, so perf should not have
+ * recorded any events yet
+ */
+ FAIL_IF_MSG(perf_read_counter(perf_fd, &perf_count), "Failed to read perf counter");
+ FAIL_IF_MSG(perf_count != 0, "perf recorded unexpected event");
+
+ /* Single stepping over the load should increment the perf counter */
+ FAIL_IF_MSG(ptrace_singlestep(pid, 0), "Failed to single step child");
+
+ FAIL_IF_MSG(waitpid(pid, &status, 0) == -1, "Failed to wait for child");
+ FAIL_IF_MSG(!WIFSTOPPED(status), "Child is not stopped");
+ FAIL_IF_MSG(ptrace_getreg_pc(pid, &pc), "Failed to get child PC");
+ FAIL_IF_MSG(pc != same_watch_addr_load + 4, "Failed to single step load instruction");
+ FAIL_IF_MSG(perf_read_counter(perf_fd, &perf_count), "Failed to read perf counter");
+ FAIL_IF_MSG(perf_count != 1, "perf counter did not increment");
+
+ /*
+ * Set up a ptrace watchpoint on the value again and trigger it.
+ * The perf counter should not have incremented because we do not
+ * execute the load yet.
+ */
+ FAIL_IF_MSG(ppc_ptrace_delhwdbg(pid, bp_id), "Failed to remove old ptrace watchpoint");
+ bp_id = ppc_ptrace_sethwdbg(pid, &bp_info);
+ FAIL_IF_MSG(bp_id < 0, "Failed to set ptrace watchpoint");
+ FAIL_IF_MSG(ptrace_setreg_pc(pid, same_watch_addr_load), "Failed to set child PC");
+ FAIL_IF_MSG(ptrace_cont(pid, 0), "Failed to continue child");
+
+ FAIL_IF_MSG(waitpid(pid, &status, 0) == -1, "Failed to wait for child");
+ FAIL_IF_MSG(!WIFSTOPPED(status), "Child is not stopped");
+ FAIL_IF_MSG(ptrace_getreg_pc(pid, &pc), "Failed to get child PC");
+ FAIL_IF_MSG(pc != same_watch_addr_load, "Child did not stop on load trap");
+ FAIL_IF_MSG(perf_read_counter(perf_fd, &perf_count), "Failed to read perf counter");
+ FAIL_IF_MSG(perf_count != 1, "perf counter should not have changed");
+
+ /* Continuing over the load should increment the perf counter */
+ FAIL_IF_MSG(ptrace_cont(pid, 0), "Failed to continue child");
+
+ FAIL_IF_MSG(waitpid(pid, &status, 0) == -1, "Failed to wait for child");
+ FAIL_IF_MSG(!WIFSTOPPED(status), "Child is not stopped");
+ FAIL_IF_MSG(ptrace_getreg_pc(pid, &pc), "Failed to get child PC");
+ FAIL_IF_MSG(pc != same_watch_addr_trap, "Child did not stop on end trap");
+ FAIL_IF_MSG(perf_read_counter(perf_fd, &perf_count), "Failed to read perf counter");
+ FAIL_IF_MSG(perf_count != 2, "perf counter did not increment");
+
+ /*
+ * If we set the child PC back to the load instruction, then continue,
+ * we should reach the end trap (because ptrace is one-shot) and have
+ * another perf event.
+ */
+ FAIL_IF_MSG(ptrace_setreg_pc(pid, same_watch_addr_load), "Failed to set child PC");
+ FAIL_IF_MSG(ptrace_cont(pid, 0), "Failed to continue child");
+
+ FAIL_IF_MSG(waitpid(pid, &status, 0) == -1, "Failed to wait for child");
+ FAIL_IF_MSG(!WIFSTOPPED(status), "Child is not stopped");
+ FAIL_IF_MSG(ptrace_getreg_pc(pid, &pc), "Failed to get child PC");
+ FAIL_IF_MSG(pc != same_watch_addr_trap, "Child did not stop on end trap");
+ FAIL_IF_MSG(perf_read_counter(perf_fd, &perf_count), "Failed to read perf counter");
+ FAIL_IF_MSG(perf_count != 3, "perf counter did not increment");
+
+ /*
+ * If we set the child PC back to the load instruction, set a ptrace
+ * watchpoint on the load, then continue, we should immediately get
+ * the ptrace trap without incrementing the perf counter
+ */
+ FAIL_IF_MSG(ppc_ptrace_delhwdbg(pid, bp_id), "Failed to remove old ptrace watchpoint");
+ bp_id = ppc_ptrace_sethwdbg(pid, &bp_info);
+ FAIL_IF_MSG(bp_id < 0, "Failed to set ptrace watchpoint");
+ FAIL_IF_MSG(ptrace_setreg_pc(pid, same_watch_addr_load), "Failed to set child PC");
+ FAIL_IF_MSG(ptrace_cont(pid, 0), "Failed to continue child");
+
+ FAIL_IF_MSG(waitpid(pid, &status, 0) == -1, "Failed to wait for child");
+ FAIL_IF_MSG(!WIFSTOPPED(status), "Child is not stopped");
+ FAIL_IF_MSG(ptrace_getreg_pc(pid, &pc), "Failed to get child PC");
+ FAIL_IF_MSG(pc != same_watch_addr_load, "Child did not stop on load instruction");
+ FAIL_IF_MSG(perf_read_counter(perf_fd, &perf_count), "Failed to read perf counter");
+ FAIL_IF_MSG(perf_count != 3, "perf counter should not have changed");
+
+ /*
+ * If we change the PC while stopped on the load instruction, we should
+ * not increment the perf counter (because ptrace is before-execute,
+ * perf is after-execute).
+ */
+ FAIL_IF_MSG(ptrace_setreg_pc(pid, same_watch_addr_load + 4), "Failed to set child PC");
+ FAIL_IF_MSG(ptrace_cont(pid, 0), "Failed to continue child");
+
+ FAIL_IF_MSG(waitpid(pid, &status, 0) == -1, "Failed to wait for child");
+ FAIL_IF_MSG(!WIFSTOPPED(status), "Child is not stopped");
+ FAIL_IF_MSG(ptrace_getreg_pc(pid, &pc), "Failed to get child PC");
+ FAIL_IF_MSG(pc != same_watch_addr_trap, "Child did not stop on end trap");
+ FAIL_IF_MSG(perf_read_counter(perf_fd, &perf_count), "Failed to read perf counter");
+ FAIL_IF_MSG(perf_count != 3, "perf counter should not have changed");
+
+ /* Clean up child */
+ FAIL_IF_MSG(kill(pid, SIGKILL) != 0, "Failed to kill child");
+
+ return 0;
+}
+
+/*
+ * Tests the interaction between ptrace and perf when:
+ * 1. perf watches a value
+ * 2. ptrace watches a different value
+ * 3. The perf value is read, then the ptrace value is read immediately after
+ *
+ * A breakpoint implementation may accidentally misattribute/skip one of
+ * the ptrace or perf handlers, as interrupt based work is done after perf
+ * and before ptrace.
+ *
+ * We expect the perf counter to increment before the ptrace watchpoint
+ * triggers.
+ */
+int perf_then_ptrace_test(void)
+{
+ struct ppc_hw_breakpoint bp_info; /* ptrace breakpoint info */
+ int bp_id; /* Breakpoint handle of ptrace watchpoint */
+ int perf_fd; /* File descriptor of perf performance counter */
+ u64 perf_count; /* Most recently fetched perf performance counter value */
+ pid_t pid; /* PID of child process */
+ void *pc; /* Most recently fetched child PC value */
+ int status; /* Stop status of child after waitpid */
+ unsigned long perf_value; /* Dummy value to be watched by perf */
+ unsigned long ptrace_value; /* Dummy value to be watched by ptrace */
+ int err;
+
+ err = ptrace_fork_child(&pid);
+ if (err)
+ return err;
+
+ /*
+ * If we are the child, run a subroutine that reads the perf value,
+ * then reads the ptrace value with consecutive load instructions
+ */
+ if (!pid) {
+ perf_then_ptrace_child(&perf_value, &ptrace_value);
+ exit(0);
+ }
+
+ err = check_watchpoints(pid);
+ if (err)
+ return err;
+
+ /* Place a perf watchpoint counter */
+ perf_fd = perf_watchpoint_open(pid, &perf_value, sizeof(perf_value));
+ FAIL_IF_MSG(perf_fd < 0, "Failed to open perf performance counter");
+
+ /* Place a ptrace watchpoint */
+ ppc_ptrace_init_breakpoint(&bp_info, PPC_BREAKPOINT_TRIGGER_READ,
+ &ptrace_value, sizeof(ptrace_value));
+ bp_id = ppc_ptrace_sethwdbg(pid, &bp_info);
+ FAIL_IF_MSG(bp_id < 0, "Failed to set ptrace watchpoint");
+
+ /* Let the child run. It should stop on the ptrace watchpoint */
+ FAIL_IF_MSG(ptrace_cont(pid, 0), "Failed to continue child");
+
+ FAIL_IF_MSG(waitpid(pid, &status, 0) == -1, "Failed to wait for child");
+ FAIL_IF_MSG(!WIFSTOPPED(status), "Child is not stopped");
+ FAIL_IF_MSG(ptrace_getreg_pc(pid, &pc), "Failed to get child PC");
+ FAIL_IF_MSG(pc != perf_then_ptrace_load2, "Child did not stop on ptrace load");
+
+ /* perf should have recorded the first load */
+ FAIL_IF_MSG(perf_read_counter(perf_fd, &perf_count), "Failed to read perf counter");
+ FAIL_IF_MSG(perf_count != 1, "perf counter did not increment");
+
+ /* Clean up child */
+ FAIL_IF_MSG(kill(pid, SIGKILL) != 0, "Failed to kill child");
+
+ return 0;
+}
+
+int main(int argc, char *argv[])
+{
+ int err = 0;
+
+ err |= test_harness(same_watch_addr_test, "same_watch_addr");
+ err |= test_harness(perf_then_ptrace_test, "perf_then_ptrace");
+
+ return err;
+}
--
2.41.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 7/7] perf/hw_breakpoint: Remove arch breakpoint hooks
2023-08-01 1:17 [PATCH 0/7] Rework perf and ptrace watchpoint tracking Benjamin Gray
` (5 preceding siblings ...)
2023-08-01 1:17 ` [PATCH 6/7] selftests/powerpc/ptrace: Update ptrace-perf watchpoint selftest Benjamin Gray
@ 2023-08-01 1:17 ` Benjamin Gray
2023-08-01 9:50 ` [PATCH 0/7] Rework perf and ptrace watchpoint tracking Christophe Leroy
2023-08-23 11:55 ` Michael Ellerman
8 siblings, 0 replies; 11+ messages in thread
From: Benjamin Gray @ 2023-08-01 1:17 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Benjamin Gray
PowerPC was the only user of these hooks, and has been refactored to no
longer require them. There is no need to keep them around, so remove
them to reduce complexity.
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
---
include/linux/hw_breakpoint.h | 3 ---
kernel/events/hw_breakpoint.c | 28 ----------------------------
2 files changed, 31 deletions(-)
diff --git a/include/linux/hw_breakpoint.h b/include/linux/hw_breakpoint.h
index 7fbb45911273..db199d653dd1 100644
--- a/include/linux/hw_breakpoint.h
+++ b/include/linux/hw_breakpoint.h
@@ -90,9 +90,6 @@ extern int dbg_reserve_bp_slot(struct perf_event *bp);
extern int dbg_release_bp_slot(struct perf_event *bp);
extern int reserve_bp_slot(struct perf_event *bp);
extern void release_bp_slot(struct perf_event *bp);
-int arch_reserve_bp_slot(struct perf_event *bp);
-void arch_release_bp_slot(struct perf_event *bp);
-void arch_unregister_hw_breakpoint(struct perf_event *bp);
extern void flush_ptrace_hw_breakpoint(struct task_struct *tsk);
diff --git a/kernel/events/hw_breakpoint.c b/kernel/events/hw_breakpoint.c
index c3797701339c..6c2cb4e4f48d 100644
--- a/kernel/events/hw_breakpoint.c
+++ b/kernel/events/hw_breakpoint.c
@@ -523,26 +523,6 @@ toggle_bp_slot(struct perf_event *bp, bool enable, enum bp_type_idx type, int we
return 0;
}
-__weak int arch_reserve_bp_slot(struct perf_event *bp)
-{
- return 0;
-}
-
-__weak void arch_release_bp_slot(struct perf_event *bp)
-{
-}
-
-/*
- * Function to perform processor-specific cleanup during unregistration
- */
-__weak void arch_unregister_hw_breakpoint(struct perf_event *bp)
-{
- /*
- * A weak stub function here for those archs that don't define
- * it inside arch/.../kernel/hw_breakpoint.c
- */
-}
-
/*
* Constraints to check before allowing this new breakpoint counter.
*
@@ -594,7 +574,6 @@ static int __reserve_bp_slot(struct perf_event *bp, u64 bp_type)
enum bp_type_idx type;
int max_pinned_slots;
int weight;
- int ret;
/* We couldn't initialize breakpoint constraints on boot */
if (!constraints_initialized)
@@ -613,10 +592,6 @@ static int __reserve_bp_slot(struct perf_event *bp, u64 bp_type)
if (max_pinned_slots > hw_breakpoint_slots_cached(type))
return -ENOSPC;
- ret = arch_reserve_bp_slot(bp);
- if (ret)
- return ret;
-
return toggle_bp_slot(bp, true, type, weight);
}
@@ -634,8 +609,6 @@ static void __release_bp_slot(struct perf_event *bp, u64 bp_type)
enum bp_type_idx type;
int weight;
- arch_release_bp_slot(bp);
-
type = find_slot_idx(bp_type);
weight = hw_breakpoint_weight(bp);
WARN_ON(toggle_bp_slot(bp, false, type, weight));
@@ -645,7 +618,6 @@ void release_bp_slot(struct perf_event *bp)
{
struct mutex *mtx = bp_constraints_lock(bp);
- arch_unregister_hw_breakpoint(bp);
__release_bp_slot(bp, bp->attr.bp_type);
bp_constraints_unlock(mtx);
}
--
2.41.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 0/7] Rework perf and ptrace watchpoint tracking
2023-08-01 1:17 [PATCH 0/7] Rework perf and ptrace watchpoint tracking Benjamin Gray
` (6 preceding siblings ...)
2023-08-01 1:17 ` [PATCH 7/7] perf/hw_breakpoint: Remove arch breakpoint hooks Benjamin Gray
@ 2023-08-01 9:50 ` Christophe Leroy
2023-08-02 12:00 ` Michael Ellerman
2023-08-23 11:55 ` Michael Ellerman
8 siblings, 1 reply; 11+ messages in thread
From: Christophe Leroy @ 2023-08-01 9:50 UTC (permalink / raw)
To: Benjamin Gray, linuxppc-dev@lists.ozlabs.org
Le 01/08/2023 à 03:17, Benjamin Gray a écrit :
> Syzkaller triggered a null pointer dereference in the
> arch_unregister_hw_breakpoint() hook. This is due to accessing
> the bp->ctx->task field changing to -1 while we iterate the breakpoints.
>
> This series refactors the breakpoint tracking logic to remove the
> dependency on bp->ctx entirely. It also simplifies handling of ptrace and
> perf breakpoints, making insertion less restrictive.
Is there any link between this series and the following issue:
https://github.com/linuxppc/issues/issues/38
Christophe
>
> If merged, it allows several arch hooks that PowerPC was the sole user of
> to be removed.
>
> Benjamin Gray (7):
> powerpc/watchpoints: Explain thread_change_pc() more
> powerpc/watchpoints: Don't track info persistently
> powerpc/watchpoints: Track perf single step directly on the breakpoint
> powerpc/watchpoints: Simplify watchpoint reinsertion
> powerpc/watchpoints: Remove ptrace/perf exclusion tracking
> selftests/powerpc/ptrace: Update ptrace-perf watchpoint selftest
> perf/hw_breakpoint: Remove arch breakpoint hooks
>
> arch/powerpc/include/asm/hw_breakpoint.h | 1 +
> arch/powerpc/include/asm/processor.h | 5 -
> arch/powerpc/kernel/hw_breakpoint.c | 388 +-----
> include/linux/hw_breakpoint.h | 3 -
> kernel/events/hw_breakpoint.c | 28 -
> .../testing/selftests/powerpc/ptrace/Makefile | 1 +
> .../powerpc/ptrace/ptrace-perf-asm.S | 33 +
> .../powerpc/ptrace/ptrace-perf-hwbreak.c | 1104 +++++++----------
> 8 files changed, 537 insertions(+), 1026 deletions(-)
> create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-perf-asm.S
> rewrite tools/testing/selftests/powerpc/ptrace/ptrace-perf-hwbreak.c (93%)
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 0/7] Rework perf and ptrace watchpoint tracking
2023-08-01 9:50 ` [PATCH 0/7] Rework perf and ptrace watchpoint tracking Christophe Leroy
@ 2023-08-02 12:00 ` Michael Ellerman
0 siblings, 0 replies; 11+ messages in thread
From: Michael Ellerman @ 2023-08-02 12:00 UTC (permalink / raw)
To: Christophe Leroy, Benjamin Gray, linuxppc-dev@lists.ozlabs.org
Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> Le 01/08/2023 à 03:17, Benjamin Gray a écrit :
>> Syzkaller triggered a null pointer dereference in the
>> arch_unregister_hw_breakpoint() hook. This is due to accessing
>> the bp->ctx->task field changing to -1 while we iterate the breakpoints.
>>
>> This series refactors the breakpoint tracking logic to remove the
>> dependency on bp->ctx entirely. It also simplifies handling of ptrace and
>> perf breakpoints, making insertion less restrictive.
>
> Is there any link between this series and the following issue:
> https://github.com/linuxppc/issues/issues/38
AFAIK no, Ben started looking at the breakpoint code due to a syzkaller
report of an oops.
But this series would resolve that issue AFAICS, so I guess they are
linked in that sense.
cheers
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 0/7] Rework perf and ptrace watchpoint tracking
2023-08-01 1:17 [PATCH 0/7] Rework perf and ptrace watchpoint tracking Benjamin Gray
` (7 preceding siblings ...)
2023-08-01 9:50 ` [PATCH 0/7] Rework perf and ptrace watchpoint tracking Christophe Leroy
@ 2023-08-23 11:55 ` Michael Ellerman
8 siblings, 0 replies; 11+ messages in thread
From: Michael Ellerman @ 2023-08-23 11:55 UTC (permalink / raw)
To: linuxppc-dev, Benjamin Gray
On Tue, 01 Aug 2023 11:17:37 +1000, Benjamin Gray wrote:
> Syzkaller triggered a null pointer dereference in the
> arch_unregister_hw_breakpoint() hook. This is due to accessing
> the bp->ctx->task field changing to -1 while we iterate the breakpoints.
>
> This series refactors the breakpoint tracking logic to remove the
> dependency on bp->ctx entirely. It also simplifies handling of ptrace and
> perf breakpoints, making insertion less restrictive.
>
> [...]
Applied to powerpc/next.
[1/7] powerpc/watchpoints: Explain thread_change_pc() more
https://git.kernel.org/powerpc/c/8f8f1cd67aa026c9dab8eb4e087e4a2d8fa9d5bc
[2/7] powerpc/watchpoints: Don't track info persistently
https://git.kernel.org/powerpc/c/668a6ec6ed57f0248070c490aba75a9572e4b0a4
[3/7] powerpc/watchpoints: Track perf single step directly on the breakpoint
https://git.kernel.org/powerpc/c/1e60f3564bad09962646bf8c2af588ecf518d337
[4/7] powerpc/watchpoints: Simplify watchpoint reinsertion
https://git.kernel.org/powerpc/c/5a2d8b9c06712b52b2f0f2fc9a144242277fda74
[5/7] powerpc/watchpoints: Remove ptrace/perf exclusion tracking
https://git.kernel.org/powerpc/c/bd29813ae10698f7bdfb3c68eacbb6464ec701ff
[6/7] selftests/powerpc/ptrace: Update ptrace-perf watchpoint selftest
https://git.kernel.org/powerpc/c/58709f6fc327a997daeeca77aa5e6bd4d4c238cf
[7/7] perf/hw_breakpoint: Remove arch breakpoint hooks
https://git.kernel.org/powerpc/c/53834a0c09252dea7918a9e1788bad880690900b
cheers
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2023-08-23 12:05 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-08-01 1:17 [PATCH 0/7] Rework perf and ptrace watchpoint tracking Benjamin Gray
2023-08-01 1:17 ` [PATCH 1/7] powerpc/watchpoints: Explain thread_change_pc() more Benjamin Gray
2023-08-01 1:17 ` [PATCH 2/7] powerpc/watchpoints: Don't track info persistently Benjamin Gray
2023-08-01 1:17 ` [PATCH 3/7] powerpc/watchpoints: Track perf single step directly on the breakpoint Benjamin Gray
2023-08-01 1:17 ` [PATCH 4/7] powerpc/watchpoints: Simplify watchpoint reinsertion Benjamin Gray
2023-08-01 1:17 ` [PATCH 5/7] powerpc/watchpoints: Remove ptrace/perf exclusion tracking Benjamin Gray
2023-08-01 1:17 ` [PATCH 6/7] selftests/powerpc/ptrace: Update ptrace-perf watchpoint selftest Benjamin Gray
2023-08-01 1:17 ` [PATCH 7/7] perf/hw_breakpoint: Remove arch breakpoint hooks Benjamin Gray
2023-08-01 9:50 ` [PATCH 0/7] Rework perf and ptrace watchpoint tracking Christophe Leroy
2023-08-02 12:00 ` Michael Ellerman
2023-08-23 11:55 ` Michael Ellerman
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).