* [PATCH] uprobes: Improve scalability by reducing the contention on siglock
@ 2024-08-01 8:24 Liao Chang
2024-08-01 14:06 ` Oleg Nesterov
0 siblings, 1 reply; 11+ messages in thread
From: Liao Chang @ 2024-08-01 8:24 UTC (permalink / raw)
To: mhiramat, oleg, peterz, mingo, acme, namhyung, mark.rutland,
alexander.shishkin, jolsa, irogers, adrian.hunter, kan.liang
Cc: linux-kernel, linux-trace-kernel, linux-perf-users
Profiling the BPF selftest benchmark on an ARM64 platform shows that
contention on current->sighand->siglock in handle_singlestep() is the
scalability bottleneck. The reason is straightforward: every producer
thread of the benchmark has to contend for this spinlock in order to
restore the TIF_SIGPENDING bit in thread_info, which may have been
cleared in uprobe_deny_signal().
This patch introduces UTASK_SSTEP_DENY_SIGNAL to mark that TIF_SIGPENDING
is temporarily suppressed during the uprobe single-step. Once the single-step
has been handled and UTASK_SSTEP_DENY_SIGNAL is confirmed, TIF_SIGPENDING can
be restored directly, without acquiring the siglock in most cases, which
reduces contention and improves overall performance.
I've used the script developed by Andrii in [1] to run the benchmark. The CPU
used was a Kunpeng916 (Hi1616), 4 NUMA nodes, 64 cores@2.4GHz, running
upstream kernel v6.11-rc1 + my optimization [2] for get_xol_insn_slot().
before-opt
----------
uprobe-nop ( 1 cpus): 0.907 ± 0.003M/s ( 0.907M/s/cpu)
uprobe-nop ( 2 cpus): 1.676 ± 0.008M/s ( 0.838M/s/cpu)
uprobe-nop ( 4 cpus): 3.210 ± 0.003M/s ( 0.802M/s/cpu)
uprobe-nop ( 8 cpus): 4.457 ± 0.003M/s ( 0.557M/s/cpu)
uprobe-nop (16 cpus): 3.724 ± 0.011M/s ( 0.233M/s/cpu)
uprobe-nop (32 cpus): 2.761 ± 0.003M/s ( 0.086M/s/cpu)
uprobe-nop (64 cpus): 1.293 ± 0.015M/s ( 0.020M/s/cpu)
uprobe-push ( 1 cpus): 0.883 ± 0.001M/s ( 0.883M/s/cpu)
uprobe-push ( 2 cpus): 1.642 ± 0.005M/s ( 0.821M/s/cpu)
uprobe-push ( 4 cpus): 3.086 ± 0.002M/s ( 0.771M/s/cpu)
uprobe-push ( 8 cpus): 3.390 ± 0.003M/s ( 0.424M/s/cpu)
uprobe-push (16 cpus): 2.652 ± 0.005M/s ( 0.166M/s/cpu)
uprobe-push (32 cpus): 2.713 ± 0.005M/s ( 0.085M/s/cpu)
uprobe-push (64 cpus): 1.313 ± 0.009M/s ( 0.021M/s/cpu)
uprobe-ret ( 1 cpus): 1.774 ± 0.000M/s ( 1.774M/s/cpu)
uprobe-ret ( 2 cpus): 3.350 ± 0.001M/s ( 1.675M/s/cpu)
uprobe-ret ( 4 cpus): 6.604 ± 0.000M/s ( 1.651M/s/cpu)
uprobe-ret ( 8 cpus): 6.706 ± 0.005M/s ( 0.838M/s/cpu)
uprobe-ret (16 cpus): 5.231 ± 0.001M/s ( 0.327M/s/cpu)
uprobe-ret (32 cpus): 5.743 ± 0.003M/s ( 0.179M/s/cpu)
uprobe-ret (64 cpus): 4.726 ± 0.016M/s ( 0.074M/s/cpu)
after-opt
---------
uprobe-nop ( 1 cpus): 0.985 ± 0.002M/s ( 0.985M/s/cpu)
uprobe-nop ( 2 cpus): 1.773 ± 0.005M/s ( 0.887M/s/cpu)
uprobe-nop ( 4 cpus): 3.304 ± 0.001M/s ( 0.826M/s/cpu)
uprobe-nop ( 8 cpus): 5.328 ± 0.002M/s ( 0.666M/s/cpu)
uprobe-nop (16 cpus): 6.475 ± 0.002M/s ( 0.405M/s/cpu)
uprobe-nop (32 cpus): 4.831 ± 0.082M/s ( 0.151M/s/cpu)
uprobe-nop (64 cpus): 2.564 ± 0.053M/s ( 0.040M/s/cpu)
uprobe-push ( 1 cpus): 0.964 ± 0.001M/s ( 0.964M/s/cpu)
uprobe-push ( 2 cpus): 1.766 ± 0.002M/s ( 0.883M/s/cpu)
uprobe-push ( 4 cpus): 3.290 ± 0.009M/s ( 0.823M/s/cpu)
uprobe-push ( 8 cpus): 4.670 ± 0.002M/s ( 0.584M/s/cpu)
uprobe-push (16 cpus): 5.197 ± 0.004M/s ( 0.325M/s/cpu)
uprobe-push (32 cpus): 5.068 ± 0.161M/s ( 0.158M/s/cpu)
uprobe-push (64 cpus): 2.605 ± 0.026M/s ( 0.041M/s/cpu)
uprobe-ret ( 1 cpus): 1.833 ± 0.001M/s ( 1.833M/s/cpu)
uprobe-ret ( 2 cpus): 3.384 ± 0.003M/s ( 1.692M/s/cpu)
uprobe-ret ( 4 cpus): 6.677 ± 0.004M/s ( 1.669M/s/cpu)
uprobe-ret ( 8 cpus): 6.854 ± 0.005M/s ( 0.857M/s/cpu)
uprobe-ret (16 cpus): 6.508 ± 0.006M/s ( 0.407M/s/cpu)
uprobe-ret (32 cpus): 5.793 ± 0.009M/s ( 0.181M/s/cpu)
uprobe-ret (64 cpus): 4.743 ± 0.016M/s ( 0.074M/s/cpu)
The benchmark results above demonstrate an obvious improvement in the
scalability of trig-uprobe-nop and trig-uprobe-push: their peak throughput
rises from 4.457M/s to 6.475M/s and from 3.390M/s to 5.197M/s, respectively.
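As a quick sanity check on the peak figures in the tables above, the relative gain can be computed directly. This is only an illustrative sketch: relative_gain() and print_peak_gains() are hypothetical helper names, not part of the patch or the benchmark script, and the inputs are the peak M/s values copied from the before-opt and after-opt tables.

```c
#include <assert.h>
#include <stdio.h>

/* Relative change in throughput: 0.25 means a 25% improvement. */
double relative_gain(double before, double after)
{
	assert(before > 0.0);
	return (after - before) / before;
}

/* Peak throughput in M/s, taken from the tables above (hypothetical helper). */
void print_peak_gains(void)
{
	/* uprobe-nop: peak at 8 CPUs before, at 16 CPUs after */
	printf("uprobe-nop:  %+.1f%%\n", 100.0 * relative_gain(4.457, 6.475));
	/* uprobe-push: peak at 8 CPUs before, at 16 CPUs after */
	printf("uprobe-push: %+.1f%%\n", 100.0 * relative_gain(3.390, 5.197));
}
```

With these inputs the gain is roughly 45% for uprobe-nop and roughly 53% for uprobe-push. Note that the peak also moves from 8 CPUs to 16 CPUs in both cases.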
[1] https://lore.kernel.org/all/20240731214256.3588718-1-andrii@kernel.org
[2] https://lore.kernel.org/all/20240727094405.1362496-1-liaochang1@huawei.com
Signed-off-by: Liao Chang <liaochang1@huawei.com>
---
include/linux/uprobes.h | 1 +
kernel/events/uprobes.c | 18 +++++++++++-------
2 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
index b503fafb7fb3..50acbf96bccd 100644
--- a/include/linux/uprobes.h
+++ b/include/linux/uprobes.h
@@ -53,6 +53,7 @@ enum uprobe_task_state {
UTASK_SSTEP,
UTASK_SSTEP_ACK,
UTASK_SSTEP_TRAPPED,
+ UTASK_SSTEP_DENY_SIGNAL,
};
/*
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 76a51a1f51e2..4f9c10b3c7b9 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1980,6 +1980,7 @@ bool uprobe_deny_signal(void)
if (task_sigpending(t)) {
clear_tsk_thread_flag(t, TIF_SIGPENDING);
+ utask->state = UTASK_SSTEP_DENY_SIGNAL;
if (__fatal_signal_pending(t) || arch_uprobe_xol_was_trapped(t)) {
utask->state = UTASK_SSTEP_TRAPPED;
@@ -2276,22 +2277,25 @@ static void handle_singlestep(struct uprobe_task *utask, struct pt_regs *regs)
int err = 0;
uprobe = utask->active_uprobe;
- if (utask->state == UTASK_SSTEP_ACK)
+ switch (utask->state) {
+ case UTASK_SSTEP_ACK:
err = arch_uprobe_post_xol(&uprobe->arch, regs);
- else if (utask->state == UTASK_SSTEP_TRAPPED)
+ break;
+ case UTASK_SSTEP_TRAPPED:
arch_uprobe_abort_xol(&uprobe->arch, regs);
- else
+ fallthrough;
+ case UTASK_SSTEP_DENY_SIGNAL:
+ set_tsk_thread_flag(current, TIF_SIGPENDING);
+ break;
+ default:
WARN_ON_ONCE(1);
+ }
put_uprobe(uprobe);
utask->active_uprobe = NULL;
utask->state = UTASK_RUNNING;
xol_free_insn_slot(current);
- spin_lock_irq(&current->sighand->siglock);
- recalc_sigpending(); /* see uprobe_deny_signal() */
- spin_unlock_irq(&current->sighand->siglock);
-
if (unlikely(err)) {
uprobe_warn(current, "execute the probed insn, sending SIGILL.");
force_sig(SIGILL);
--
2.34.1
* Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock
2024-08-01 8:24 [PATCH] uprobes: Improve scalability by reducing the contention on siglock Liao Chang
@ 2024-08-01 14:06 ` Oleg Nesterov
2024-08-02 1:38 ` Liao, Chang
0 siblings, 1 reply; 11+ messages in thread
From: Oleg Nesterov @ 2024-08-01 14:06 UTC (permalink / raw)
To: Liao Chang
Cc: mhiramat, peterz, mingo, acme, namhyung, mark.rutland,
alexander.shishkin, jolsa, irogers, adrian.hunter, kan.liang,
linux-kernel, linux-trace-kernel, linux-perf-users
On 08/01, Liao Chang wrote:
>
> @@ -2276,22 +2277,25 @@ static void handle_singlestep(struct uprobe_task *utask, struct pt_regs *regs)
> int err = 0;
>
> uprobe = utask->active_uprobe;
> - if (utask->state == UTASK_SSTEP_ACK)
> + switch (utask->state) {
> + case UTASK_SSTEP_ACK:
> err = arch_uprobe_post_xol(&uprobe->arch, regs);
> - else if (utask->state == UTASK_SSTEP_TRAPPED)
> + break;
> + case UTASK_SSTEP_TRAPPED:
> arch_uprobe_abort_xol(&uprobe->arch, regs);
> - else
> + fallthrough;
> + case UTASK_SSTEP_DENY_SIGNAL:
> + set_tsk_thread_flag(current, TIF_SIGPENDING);
> + break;
> + default:
> WARN_ON_ONCE(1);
> + }
Liao, at first glance this change looks "obviously wrong" to me.
But let me read this patch more carefully and reply on weekend,
I am a bit busy right now.
Thanks,
Oleg.
* Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock
2024-08-01 14:06 ` Oleg Nesterov
@ 2024-08-02 1:38 ` Liao, Chang
2024-08-02 9:24 ` Oleg Nesterov
0 siblings, 1 reply; 11+ messages in thread
From: Liao, Chang @ 2024-08-02 1:38 UTC (permalink / raw)
To: Oleg Nesterov
Cc: mhiramat, peterz, mingo, acme, namhyung, mark.rutland,
alexander.shishkin, jolsa, irogers, adrian.hunter, kan.liang,
linux-kernel, linux-trace-kernel, linux-perf-users
On 2024/8/1 22:06, Oleg Nesterov wrote:
> On 08/01, Liao Chang wrote:
>>
>> @@ -2276,22 +2277,25 @@ static void handle_singlestep(struct uprobe_task *utask, struct pt_regs *regs)
>> int err = 0;
>>
>> uprobe = utask->active_uprobe;
>> - if (utask->state == UTASK_SSTEP_ACK)
>> + switch (utask->state) {
>> + case UTASK_SSTEP_ACK:
>> err = arch_uprobe_post_xol(&uprobe->arch, regs);
>> - else if (utask->state == UTASK_SSTEP_TRAPPED)
>> + break;
>> + case UTASK_SSTEP_TRAPPED:
>> arch_uprobe_abort_xol(&uprobe->arch, regs);
>> - else
>> + fallthrough;
>> + case UTASK_SSTEP_DENY_SIGNAL:
>> + set_tsk_thread_flag(current, TIF_SIGPENDING);
>> + break;
>> + default:
>> WARN_ON_ONCE(1);
>> + }
>
> Liao, at first glance this change looks "obviously wrong" to me.
Oleg, did I overlook something obvious here?
>
> But let me read this patch more carefully and reply on weekend,
> I am a bit busy right now.
Sure, thanks.
>
> Thanks,
>
> Oleg.
>
>
--
BR
Liao, Chang
* Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock
2024-08-02 1:38 ` Liao, Chang
@ 2024-08-02 9:24 ` Oleg Nesterov
2024-08-06 3:06 ` Liao, Chang
0 siblings, 1 reply; 11+ messages in thread
From: Oleg Nesterov @ 2024-08-02 9:24 UTC (permalink / raw)
To: Liao, Chang
Cc: mhiramat, peterz, mingo, acme, namhyung, mark.rutland,
alexander.shishkin, jolsa, irogers, adrian.hunter, kan.liang,
linux-kernel, linux-trace-kernel, linux-perf-users
On 08/02, Liao, Chang wrote:
>
>
> On 2024/8/1 22:06, Oleg Nesterov wrote:
> > On 08/01, Liao Chang wrote:
> >>
> >> @@ -2276,22 +2277,25 @@ static void handle_singlestep(struct uprobe_task *utask, struct pt_regs *regs)
> >> int err = 0;
> >>
> >> uprobe = utask->active_uprobe;
> >> - if (utask->state == UTASK_SSTEP_ACK)
> >> + switch (utask->state) {
> >> + case UTASK_SSTEP_ACK:
> >> err = arch_uprobe_post_xol(&uprobe->arch, regs);
> >> - else if (utask->state == UTASK_SSTEP_TRAPPED)
> >> + break;
> >> + case UTASK_SSTEP_TRAPPED:
> >> arch_uprobe_abort_xol(&uprobe->arch, regs);
> >> - else
> >> + fallthrough;
> >> + case UTASK_SSTEP_DENY_SIGNAL:
> >> + set_tsk_thread_flag(current, TIF_SIGPENDING);
> >> + break;
> >> + default:
> >> WARN_ON_ONCE(1);
> >> + }
> >
> > Liao, at first glance this change looks "obviously wrong" to me.
>
> Oleg, did I overlook something obvious here?
OK, let's suppose uprobe_deny_signal() sets UTASK_SSTEP_DENY_SIGNAL.
In this case handle_singlestep() will only set TIF_SIGPENDING and
do nothing else. This is wrong, either _post_xol() or _abort_xol()
must be called.
But I think handle_singlestep() will never hit this case. In the
likely case uprobe_post_sstep_notifier() will replace _DENY_SIGNAL
with _ACK, and this means that handle_singlestep() won't restore
TIF_SIGPENDING cleared by uprobe_deny_signal().
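The two cases described above can be traced with a small userspace model. This is a sketch under stated assumptions, not kernel code: struct model_task and the model_* functions are hypothetical stand-ins that only mirror the state transitions of the posted patch, with sigpending standing in for TIF_SIGPENDING and xol_finished recording whether _post_xol() or _abort_xol() would have run.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: names mirror the kernel, behaviour is simplified. */
enum utask_state {
	UTASK_RUNNING,
	UTASK_SSTEP,
	UTASK_SSTEP_ACK,
	UTASK_SSTEP_TRAPPED,
	UTASK_SSTEP_DENY_SIGNAL,
};

struct model_task {
	enum utask_state state;
	bool sigpending;   /* stands in for TIF_SIGPENDING */
	bool xol_finished; /* true once _post_xol()/_abort_xol() "ran" */
};

/* Patched uprobe_deny_signal(): hide the signal, remember it in the state. */
void model_deny_signal(struct model_task *t)
{
	assert(t->state == UTASK_SSTEP);
	if (t->sigpending) {
		t->sigpending = false;
		t->state = UTASK_SSTEP_DENY_SIGNAL;
	}
}

/* uprobe_post_sstep_notifier(): the single-step completed normally. */
void model_post_sstep_notifier(struct model_task *t)
{
	t->state = UTASK_SSTEP_ACK; /* overwrites _DENY_SIGNAL */
}

/* Patched handle_singlestep(): no siglock, no recalc_sigpending(). */
void model_handle_singlestep(struct model_task *t)
{
	switch (t->state) {
	case UTASK_SSTEP_ACK:
		t->xol_finished = true;
		break;
	case UTASK_SSTEP_TRAPPED:
		t->xol_finished = true;
		/* fall through */
	case UTASK_SSTEP_DENY_SIGNAL:
		t->sigpending = true;
		break;
	default:
		break;
	}
	t->state = UTASK_RUNNING;
}

/* Likely path: returns true if the hidden signal is pending again. */
bool model_signal_survives_v1(void)
{
	struct model_task t = { .state = UTASK_SSTEP, .sigpending = true };

	model_deny_signal(&t);
	model_post_sstep_notifier(&t);
	model_handle_singlestep(&t);
	return t.sigpending;
}
```

In this model the likely path loses the signal, because the _ACK state set by the notifier erases the only record that TIF_SIGPENDING was cleared. If handle_singlestep() did see _DENY_SIGNAL directly, the flag would be restored but neither xol completion step would run.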
Oleg.
* Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock
2024-08-02 9:24 ` Oleg Nesterov
@ 2024-08-06 3:06 ` Liao, Chang
2024-08-06 17:25 ` Oleg Nesterov
0 siblings, 1 reply; 11+ messages in thread
From: Liao, Chang @ 2024-08-06 3:06 UTC (permalink / raw)
To: Oleg Nesterov
Cc: mhiramat, peterz, mingo, acme, namhyung, mark.rutland,
alexander.shishkin, jolsa, irogers, adrian.hunter, kan.liang,
linux-kernel, linux-trace-kernel, linux-perf-users
On 2024/8/2 17:24, Oleg Nesterov wrote:
> On 08/02, Liao, Chang wrote:
>>
>>
>> On 2024/8/1 22:06, Oleg Nesterov wrote:
>>> On 08/01, Liao Chang wrote:
>>>>
>>>> @@ -2276,22 +2277,25 @@ static void handle_singlestep(struct uprobe_task *utask, struct pt_regs *regs)
>>>> int err = 0;
>>>>
>>>> uprobe = utask->active_uprobe;
>>>> - if (utask->state == UTASK_SSTEP_ACK)
>>>> + switch (utask->state) {
>>>> + case UTASK_SSTEP_ACK:
>>>> err = arch_uprobe_post_xol(&uprobe->arch, regs);
>>>> - else if (utask->state == UTASK_SSTEP_TRAPPED)
>>>> + break;
>>>> + case UTASK_SSTEP_TRAPPED:
>>>> arch_uprobe_abort_xol(&uprobe->arch, regs);
>>>> - else
>>>> + fallthrough;
>>>> + case UTASK_SSTEP_DENY_SIGNAL:
>>>> + set_tsk_thread_flag(current, TIF_SIGPENDING);
>>>> + break;
>>>> + default:
>>>> WARN_ON_ONCE(1);
>>>> + }
>>>
>>> Liao, at first glance this change looks "obviously wrong" to me.
>>
>> Oleg, did I overlook something obvious here?
>
> OK, let's suppose uprobe_deny_signal() sets UTASK_SSTEP_DENY_SIGNAL.
>
> In this case handle_singlestep() will only set TIF_SIGPENDING and
> do nothing else. This is wrong, either _post_xol() or _abort_xol()
> must be called.
>
> But I think handle_singlestep() will never hit this case. In the
> likely case uprobe_post_sstep_notifier() will replace _DENY_SIGNAL
> with _ACK, and this means that handle_singlestep() won't restore
> TIF_SIGPENDING cleared by uprobe_deny_signal().
You're absolutely right. handle_singlestep() has no chance to handle _DENY_SIGNAL
unless it is followed by setting TIF_UPROBE in uprobe_deny_signal(). This means
_DENY_SIGNAL is likely to be replaced during the next uprobe single-step.
I believe introducing _DENY_SIGNAL as the intermediate state between UTASK_SSTEP
and UTASK_SSTEP_ACK is still necessary. This allows uprobe_post_sstep_notifier()
to correctly restore TIF_SIGPENDING upon completion of the single-step.
A revised implementation would look like this:
------------------%<------------------
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1980,6 +1980,7 @@ bool uprobe_deny_signal(void)
if (task_sigpending(t)) {
clear_tsk_thread_flag(t, TIF_SIGPENDING);
+ utask->state = UTASK_SSTEP_DENY_SIGNAL;
if (__fatal_signal_pending(t) || arch_uprobe_xol_was_trapped(t)) {
utask->state = UTASK_SSTEP_TRAPPED;
@@ -2276,22 +2277,23 @@ static void handle_singlestep(struct uprobe_task *utask, struct pt_regs *regs)
int err = 0;
uprobe = utask->active_uprobe;
- if (utask->state == UTASK_SSTEP_ACK)
+ switch (utask->state) {
+ case UTASK_SSTEP_ACK:
err = arch_uprobe_post_xol(&uprobe->arch, regs);
- else if (utask->state == UTASK_SSTEP_TRAPPED)
+ break;
+ case UTASK_SSTEP_TRAPPED:
arch_uprobe_abort_xol(&uprobe->arch, regs);
- else
+ set_thread_flag(TIF_SIGPENDING);
+ break;
+ default:
WARN_ON_ONCE(1);
+ }
put_uprobe(uprobe);
utask->active_uprobe = NULL;
utask->state = UTASK_RUNNING;
xol_free_insn_slot(current);
- spin_lock_irq(&current->sighand->siglock);
- recalc_sigpending(); /* see uprobe_deny_signal() */
- spin_unlock_irq(&current->sighand->siglock);
-
if (unlikely(err)) {
uprobe_warn(current, "execute the probed insn, sending SIGILL.");
force_sig(SIGILL);
@@ -2351,6 +2353,8 @@ int uprobe_post_sstep_notifier(struct pt_regs *regs)
/* task is currently not uprobed */
return 0;
+ if (utask->state == UTASK_SSTEP_DENY_SIGNAL)
+ set_thread_flag(TIF_SIGPENDING);
utask->state = UTASK_SSTEP_ACK;
set_thread_flag(TIF_UPROBE);
return 1;
------------------>%------------------
>
> Oleg.
>
>
--
BR
Liao, Chang
* Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock
2024-08-06 3:06 ` Liao, Chang
@ 2024-08-06 17:25 ` Oleg Nesterov
2024-08-07 10:17 ` Oleg Nesterov
0 siblings, 1 reply; 11+ messages in thread
From: Oleg Nesterov @ 2024-08-06 17:25 UTC (permalink / raw)
To: Liao, Chang
Cc: mhiramat, peterz, mingo, acme, namhyung, mark.rutland,
alexander.shishkin, jolsa, irogers, adrian.hunter, kan.liang,
linux-kernel, linux-trace-kernel, linux-perf-users
On 08/06, Liao, Chang wrote:
>
> You're absolutely right. handle_singlestep() has no chance to handle _DENY_SIGNAL
> unless it is followed by setting TIF_UPROBE in uprobe_deny_signal(). This means
> _DENY_SIGNAL is likely to be replaced during the next uprobe single-step.
>
> I believe introducing _DENY_SIGNAL as the intermediate state between UTASK_SSTEP
> and UTASK_SSTEP_ACK is still necessary. This allows uprobe_post_sstep_notifier()
> to correctly restore TIF_SIGPENDING upon completion of the single-step.
>
> A revised implementation would look like this:
Still looks "obviously wrong" to me... even the approach itself.
Perhaps I am wrong, yet another day when I can't even read emails on lkml
carefully, sorry.
But can you please send the patch which I could actually apply? This one
looks white-space damaged...
I'll try to reply with more details as soon as I convince myself I fully
understand what your patch actually does, but most probably not tomorrow.
Thanks,
Oleg.
* Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock
2024-08-06 17:25 ` Oleg Nesterov
@ 2024-08-07 10:17 ` Oleg Nesterov
2024-08-08 7:30 ` Liao, Chang
0 siblings, 1 reply; 11+ messages in thread
From: Oleg Nesterov @ 2024-08-07 10:17 UTC (permalink / raw)
To: Liao, Chang
Cc: mhiramat, peterz, mingo, acme, namhyung, mark.rutland,
alexander.shishkin, jolsa, irogers, adrian.hunter, kan.liang,
linux-kernel, linux-trace-kernel, linux-perf-users
So. Liao, I am sorry, but I dislike your patch/approach in any case.
UTASK_SSTEP_DENY_SIGNAL complicates the state machine. And I don't like the fact
that set_thread_flag(TIF_SIGPENDING) is called twice, from handle_singlestep()
and uprobe_post_sstep_notifier(), this complicates the logic even more.
We need a flag, not the new state.
And if I read this patch correctly it is wrong:
- uprobe_deny_signal() clears TIF_SIGPENDING and sets UTASK_SSTEP_DENY_SIGNAL
- another signal comes after that and sets TIF_SIGPENDING again
- in this case the task won't return to user-space and execute the probed
insn, exit_to_user_mode_loop() will notice another TIF_SIGPENDING and
call arch_do_signal_or_restart()->get_signal() again.
- get_signal() will call uprobe_deny_signal() again and hit
WARN_ON_ONCE(utask->state != UTASK_SSTEP);
And no, we shouldn't change this check into UTASK_SSTEP || UTASK_SSTEP_DENY_SIGNAL.
Again, the fact that uprobe_deny_signal() cleared TIF_SIGPENDING must not be the
new state.
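The sequence listed above can be replayed in a simplified userspace model. This is a sketch, not kernel code: struct model_task and the model_* functions are hypothetical, sigpending stands in for TIF_SIGPENDING, and the warnings counter stands in for hitting WARN_ON_ONCE(utask->state != UTASK_SSTEP).

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: names mirror the kernel, behaviour is simplified. */
enum utask_state {
	UTASK_RUNNING,
	UTASK_SSTEP,
	UTASK_SSTEP_DENY_SIGNAL,
};

struct model_task {
	enum utask_state state;
	bool sigpending; /* stands in for TIF_SIGPENDING */
	int warnings;    /* counts hits of the modelled WARN_ON_ONCE() */
};

/* Patched uprobe_deny_signal(): the state doubles as "signal was hidden". */
void model_deny_signal(struct model_task *t)
{
	assert(t);
	if (t->state != UTASK_SSTEP)
		t->warnings++; /* WARN_ON_ONCE(utask->state != UTASK_SSTEP) */

	if (t->sigpending) {
		t->sigpending = false;
		t->state = UTASK_SSTEP_DENY_SIGNAL;
	}
}

/* Another signal is queued before the task gets back to user space. */
void model_signal_arrives(struct model_task *t)
{
	assert(t);
	t->sigpending = true;
}

/* Replays the scenario and returns how many times the warning fired. */
int model_second_signal_warnings(void)
{
	struct model_task t = { .state = UTASK_SSTEP, .sigpending = true };

	model_deny_signal(&t);    /* first signal hidden, state changes */
	model_signal_arrives(&t); /* second signal sets the bit again */
	model_deny_signal(&t);    /* get_signal() path runs once more */
	return t.warnings;
}
```

In the model the second call finds the state already changed to _DENY_SIGNAL, so the check fires once. Keeping the "signal was cleared" fact out of the state field avoids this, since the state then stays UTASK_SSTEP across repeated calls.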
Oleg.
* Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock
2024-08-07 10:17 ` Oleg Nesterov
@ 2024-08-08 7:30 ` Liao, Chang
2024-08-08 10:28 ` Oleg Nesterov
0 siblings, 1 reply; 11+ messages in thread
From: Liao, Chang @ 2024-08-08 7:30 UTC (permalink / raw)
To: Oleg Nesterov
Cc: mhiramat, peterz, mingo, acme, namhyung, mark.rutland,
alexander.shishkin, jolsa, irogers, adrian.hunter, kan.liang,
linux-kernel, linux-trace-kernel, linux-perf-users
On 2024/8/7 18:17, Oleg Nesterov wrote:
> So. Liao, I am sorry, but I dislike your patch/approach in any case.
>
> UTASK_SSTEP_DENY_SIGNAL complicates the state machine. And I don't like the fact
> that set_thread_flag(TIF_SIGPENDING) is called twice, from handle_singlestep()
> and uprobe_post_sstep_notifier(), this complicates the logic even more.
>
> We need a flag, not the new state.
>
> And if I read this patch correctly it is wrong:
>
> - uprobe_deny_signal() clears TIF_SIGPENDING and sets UTASK_SSTEP_DENY_SIGNAL
>
> - another signal comes after that and sets TIF_SIGPENDING again
>
> - in this case the task won't return to user-space and execute the probed
> insn, exit_to_user_mode_loop() will notice another TIF_SIGPENDING and
> call arch_do_signal_or_restart()->get_signal() again.
>
> - get_signal() will call uprobe_deny_signal() again and hit
>
> WARN_ON_ONCE(utask->state != UTASK_SSTEP);
>
>
> And no, we shouldn't change this check into UTASK_SSTEP || UTASK_SSTEP_DENY_SIGNAL.
> Again, the fact that uprobe_deny_signal() cleared TIF_SIGPENDING must not be the
> new state.
Oleg, if I understand correctly, the current state machine expects single-step handling
to end in either _ACK or _TRAPPED, and any new state would disrupt this logic. If so,
I'm convinced that adding a new state is unnecessary. As you suggested, I propose using a
boolean flag in the uprobe_task data to track whether a signal should be restored, at the
cost of increasing the structure size. Here's an outline of the changes:
- pre_ssout() resets the deny signal flag.
- uprobe_deny_signal() sets the deny signal flag when TIF_SIGPENDING is cleared.
- handle_singlestep() checks the deny signal flag and restores TIF_SIGPENDING if necessary.
Does this approach look correct to you, or do you have another way to implement the "flag"?
Thanks.
>
> Oleg.
>
--
BR
Liao, Chang
* Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock
2024-08-08 7:30 ` Liao, Chang
@ 2024-08-08 10:28 ` Oleg Nesterov
2024-08-08 12:31 ` Liao, Chang
0 siblings, 1 reply; 11+ messages in thread
From: Oleg Nesterov @ 2024-08-08 10:28 UTC (permalink / raw)
To: Liao, Chang
Cc: mhiramat, peterz, mingo, acme, namhyung, mark.rutland,
alexander.shishkin, jolsa, irogers, adrian.hunter, kan.liang,
linux-kernel, linux-trace-kernel, linux-perf-users
On 08/08, Liao, Chang wrote:
>
> - pre_ssout() resets the deny signal flag
>
> - uprobe_deny_signal() sets the deny signal flag when TIF_SIGPENDING is cleared.
>
> - handle_singlestep() checks the deny signal flag and restores TIF_SIGPENDING if necessary.
>
> Does this approach look correct to you, or do you have another way to implement the "flag"?
Yes. But I don't think pre_ssout() needs to clear this flag. handle_singlestep() resets/clears
state, active_uprobe, frees insn slot. So I guess we only need
--- x/kernel/events/uprobes.c
+++ x/kernel/events/uprobes.c
@@ -2308,9 +2308,10 @@ static void handle_singlestep(struct upr
utask->state = UTASK_RUNNING;
xol_free_insn_slot(current);
- spin_lock_irq(&current->sighand->siglock);
- recalc_sigpending(); /* see uprobe_deny_signal() */
- spin_unlock_irq(&current->sighand->siglock);
+ if (utask->xxx) {
+ set_thread_flag(TIF_SIGPENDING);
+ utask->xxx = 0;
+ }
if (unlikely(err)) {
uprobe_warn(current, "execute the probed insn, sending SIGILL.");
and that is all.
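A userspace sketch of the flag-based idea, for illustration only. The field name signal_denied is a hypothetical stand-in for the utask->xxx placeholder in the diff above, struct model_task and the model_* functions are not kernel code, and sigpending models TIF_SIGPENDING. The sketch assumes uprobe_deny_signal() sets the flag at the point where it clears TIF_SIGPENDING, as outlined earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: names mirror the kernel, behaviour is simplified. */
enum utask_state {
	UTASK_RUNNING,
	UTASK_SSTEP,
	UTASK_SSTEP_ACK,
	UTASK_SSTEP_TRAPPED,
};

struct model_task {
	enum utask_state state;
	bool sigpending;    /* stands in for TIF_SIGPENDING */
	bool signal_denied; /* hypothetical name for utask->xxx */
	int warnings;       /* counts hits of the modelled WARN_ON_ONCE() */
};

/* uprobe_deny_signal(): record the cleared bit in a flag, not in the state. */
void model_deny_signal(struct model_task *t)
{
	if (t->state != UTASK_SSTEP)
		t->warnings++; /* WARN_ON_ONCE(utask->state != UTASK_SSTEP) */

	if (t->sigpending) {
		t->sigpending = false;
		t->signal_denied = true;
	}
}

/* uprobe_post_sstep_notifier(): the single-step completed normally. */
void model_post_sstep_notifier(struct model_task *t)
{
	t->state = UTASK_SSTEP_ACK;
}

/* handle_singlestep(): restore the bit without taking any lock. */
void model_handle_singlestep(struct model_task *t)
{
	assert(t->state == UTASK_SSTEP_ACK || t->state == UTASK_SSTEP_TRAPPED);
	t->state = UTASK_RUNNING;

	if (t->signal_denied) {
		t->sigpending = true;
		t->signal_denied = false;
	}
}

/* Two signals arrive during one single-step; returns true if all is well. */
bool model_two_signals_ok(void)
{
	struct model_task t = { .state = UTASK_SSTEP, .sigpending = true };

	model_deny_signal(&t);         /* first signal hidden */
	t.sigpending = true;           /* second signal arrives */
	model_deny_signal(&t);         /* hidden again, state untouched */
	model_post_sstep_notifier(&t);
	model_handle_singlestep(&t);
	return t.sigpending && !t.signal_denied && t.warnings == 0;
}
```

Because handle_singlestep() clears the flag itself, the model needs no separate reset step before the next single-step, which matches the remark about pre_ssout() above.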
Oleg.
* Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock
2024-08-08 10:28 ` Oleg Nesterov
@ 2024-08-08 12:31 ` Liao, Chang
2024-08-08 13:17 ` Oleg Nesterov
0 siblings, 1 reply; 11+ messages in thread
From: Liao, Chang @ 2024-08-08 12:31 UTC (permalink / raw)
To: Oleg Nesterov
Cc: mhiramat, peterz, mingo, acme, namhyung, mark.rutland,
alexander.shishkin, jolsa, irogers, adrian.hunter, kan.liang,
linux-kernel, linux-trace-kernel, linux-perf-users, bpf,
Andrii Nakryiko, Masami Hiramatsu, Steven Rostedt, paulmck
On 2024/8/8 18:28, Oleg Nesterov wrote:
> On 08/08, Liao, Chang wrote:
>>
>> - pre_ssout() resets the deny signal flag
>>
>> - uprobe_deny_signal() sets the deny signal flag when TIF_SIGPENDING is cleared.
>>
>> - handle_singlestep() checks the deny signal flag and restores TIF_SIGPENDING if necessary.
>>
>> Does this approach look correct to you, or do you have another way to implement the "flag"?
>
> Yes. But I don't think pre_ssout() needs to clear this flag. handle_singlestep() resets/clears
> state, active_uprobe, frees insn slot. So I guess we only need
>
>
> --- x/kernel/events/uprobes.c
> +++ x/kernel/events/uprobes.c
> @@ -2308,9 +2308,10 @@ static void handle_singlestep(struct upr
> utask->state = UTASK_RUNNING;
> xol_free_insn_slot(current);
>
> - spin_lock_irq(&current->sighand->siglock);
> - recalc_sigpending(); /* see uprobe_deny_signal() */
> - spin_unlock_irq(&current->sighand->siglock);
> + if (utask->xxx) {
> + set_thread_flag(TIF_SIGPENDING);
> + utask->xxx = 0;
> + }
Agreed. If there is no more discussion about this flag, I will send v2 today.
Thanks.
>
> if (unlikely(err)) {
> uprobe_warn(current, "execute the probed insn, sending SIGILL.");
>
> and that is all.
>
> Oleg.
>
>
--
BR
Liao, Chang
* Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock
2024-08-08 12:31 ` Liao, Chang
@ 2024-08-08 13:17 ` Oleg Nesterov
0 siblings, 0 replies; 11+ messages in thread
From: Oleg Nesterov @ 2024-08-08 13:17 UTC (permalink / raw)
To: Liao, Chang
Cc: mhiramat, peterz, mingo, acme, namhyung, mark.rutland,
alexander.shishkin, jolsa, irogers, adrian.hunter, kan.liang,
linux-kernel, linux-trace-kernel, linux-perf-users, bpf,
Andrii Nakryiko, Steven Rostedt, paulmck
On 08/08, Liao, Chang wrote:
>
>
> On 2024/8/8 18:28, Oleg Nesterov wrote:
> > --- x/kernel/events/uprobes.c
> > +++ x/kernel/events/uprobes.c
> > @@ -2308,9 +2308,10 @@ static void handle_singlestep(struct upr
> > utask->state = UTASK_RUNNING;
> > xol_free_insn_slot(current);
> >
> > - spin_lock_irq(&current->sighand->siglock);
> > - recalc_sigpending(); /* see uprobe_deny_signal() */
> > - spin_unlock_irq(&current->sighand->siglock);
> > + if (utask->xxx) {
> > + set_thread_flag(TIF_SIGPENDING);
> > + utask->xxx = 0;
> > + }
>
> Agreed. If there is no more discussion about this flag, I will send v2 today.
Please also resend the previous patch as 1/2, and this one as 2/2.
Oleg.