* [PATCH] uprobes: Improve scalability by reducing the contention on siglock
@ 2024-08-01  8:24 Liao Chang
  2024-08-01 14:06 ` Oleg Nesterov
  0 siblings, 1 reply; 11+ messages in thread
From: Liao Chang @ 2024-08-01  8:24 UTC (permalink / raw)
  To: mhiramat, oleg, peterz, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, kan.liang
  Cc: linux-kernel, linux-trace-kernel, linux-perf-users
The profiling result of BPF selftest on ARM64 platform reveals the
significant contention on the current->sighand->siglock within the
handle_singlestep() is the scalability bottleneck. The reason is also
very straightforward that all producer threads of benchmark have to
contend the spinlock mentioned to resume the TIF_SIGPENDING bit in the
thread_info that might be removed in uprobe_deny_signal().
This patch introduces UTASK_SSTEP_DENY_SIGNAL to mark TIF_SIGPENDING is
suppress temporarily during the uprobe single-step. Upon uprobe single-step
is handled and UTASK_SSTEP_DENY_SIGNAL is confirmed, it could resume the
TIF_SIGPENDING directly without acquiring the siglock in most case, then
reducing contention and improving overall performance.
I've use the script developed by Andrii in [1] to run benchmark. The CPU
used was Kunpeng916 (Hi1616), 4 NUMA nodes, 64 cores@2.4GHz running
upstream kernel v6.11-rc1 + my optimization [2] for get_xol_insn_slot().
before-opt
----------
uprobe-nop      ( 1 cpus):    0.907 ± 0.003M/s  (  0.907M/s/cpu)
uprobe-nop      ( 2 cpus):    1.676 ± 0.008M/s  (  0.838M/s/cpu)
uprobe-nop      ( 4 cpus):    3.210 ± 0.003M/s  (  0.802M/s/cpu)
uprobe-nop      ( 8 cpus):    4.457 ± 0.003M/s  (  0.557M/s/cpu)
uprobe-nop      (16 cpus):    3.724 ± 0.011M/s  (  0.233M/s/cpu)
uprobe-nop      (32 cpus):    2.761 ± 0.003M/s  (  0.086M/s/cpu)
uprobe-nop      (64 cpus):    1.293 ± 0.015M/s  (  0.020M/s/cpu)
uprobe-push     ( 1 cpus):    0.883 ± 0.001M/s  (  0.883M/s/cpu)
uprobe-push     ( 2 cpus):    1.642 ± 0.005M/s  (  0.821M/s/cpu)
uprobe-push     ( 4 cpus):    3.086 ± 0.002M/s  (  0.771M/s/cpu)
uprobe-push     ( 8 cpus):    3.390 ± 0.003M/s  (  0.424M/s/cpu)
uprobe-push     (16 cpus):    2.652 ± 0.005M/s  (  0.166M/s/cpu)
uprobe-push     (32 cpus):    2.713 ± 0.005M/s  (  0.085M/s/cpu)
uprobe-push     (64 cpus):    1.313 ± 0.009M/s  (  0.021M/s/cpu)
uprobe-ret      ( 1 cpus):    1.774 ± 0.000M/s  (  1.774M/s/cpu)
uprobe-ret      ( 2 cpus):    3.350 ± 0.001M/s  (  1.675M/s/cpu)
uprobe-ret      ( 4 cpus):    6.604 ± 0.000M/s  (  1.651M/s/cpu)
uprobe-ret      ( 8 cpus):    6.706 ± 0.005M/s  (  0.838M/s/cpu)
uprobe-ret      (16 cpus):    5.231 ± 0.001M/s  (  0.327M/s/cpu)
uprobe-ret      (32 cpus):    5.743 ± 0.003M/s  (  0.179M/s/cpu)
uprobe-ret      (64 cpus):    4.726 ± 0.016M/s  (  0.074M/s/cpu)
after-opt
---------
uprobe-nop      ( 1 cpus):    0.985 ± 0.002M/s  (  0.985M/s/cpu)
uprobe-nop      ( 2 cpus):    1.773 ± 0.005M/s  (  0.887M/s/cpu)
uprobe-nop      ( 4 cpus):    3.304 ± 0.001M/s  (  0.826M/s/cpu)
uprobe-nop      ( 8 cpus):    5.328 ± 0.002M/s  (  0.666M/s/cpu)
uprobe-nop      (16 cpus):    6.475 ± 0.002M/s  (  0.405M/s/cpu)
uprobe-nop      (32 cpus):    4.831 ± 0.082M/s  (  0.151M/s/cpu)
uprobe-nop      (64 cpus):    2.564 ± 0.053M/s  (  0.040M/s/cpu)
uprobe-push     ( 1 cpus):    0.964 ± 0.001M/s  (  0.964M/s/cpu)
uprobe-push     ( 2 cpus):    1.766 ± 0.002M/s  (  0.883M/s/cpu)
uprobe-push     ( 4 cpus):    3.290 ± 0.009M/s  (  0.823M/s/cpu)
uprobe-push     ( 8 cpus):    4.670 ± 0.002M/s  (  0.584M/s/cpu)
uprobe-push     (16 cpus):    5.197 ± 0.004M/s  (  0.325M/s/cpu)
uprobe-push     (32 cpus):    5.068 ± 0.161M/s  (  0.158M/s/cpu)
uprobe-push     (64 cpus):    2.605 ± 0.026M/s  (  0.041M/s/cpu)
uprobe-ret      ( 1 cpus):    1.833 ± 0.001M/s  (  1.833M/s/cpu)
uprobe-ret      ( 2 cpus):    3.384 ± 0.003M/s  (  1.692M/s/cpu)
uprobe-ret      ( 4 cpus):    6.677 ± 0.004M/s  (  1.669M/s/cpu)
uprobe-ret      ( 8 cpus):    6.854 ± 0.005M/s  (  0.857M/s/cpu)
uprobe-ret      (16 cpus):    6.508 ± 0.006M/s  (  0.407M/s/cpu)
uprobe-ret      (32 cpus):    5.793 ± 0.009M/s  (  0.181M/s/cpu)
uprobe-ret      (64 cpus):    4.743 ± 0.016M/s  (  0.074M/s/cpu)
Above benchmark results demonstrates a obivious improvement in the
scalability of trig-uprobe-nop and trig-uprobe-push, the peak throughput
of which are from 4.5M/s to 6.4M/s and 3.3M/s to 5.1M/s individually.
[1] https://lore.kernel.org/all/20240731214256.3588718-1-andrii@kernel.org
[2] https://lore.kernel.org/all/20240727094405.1362496-1-liaochang1@huawei.com
Signed-off-by: Liao Chang <liaochang1@huawei.com>
---
 include/linux/uprobes.h |  1 +
 kernel/events/uprobes.c | 18 +++++++++++-------
 2 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
index b503fafb7fb3..50acbf96bccd 100644
--- a/include/linux/uprobes.h
+++ b/include/linux/uprobes.h
@@ -53,6 +53,7 @@ enum uprobe_task_state {
 	UTASK_SSTEP,
 	UTASK_SSTEP_ACK,
 	UTASK_SSTEP_TRAPPED,
+	UTASK_SSTEP_DENY_SIGNAL,
 };
 
 /*
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 76a51a1f51e2..4f9c10b3c7b9 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1980,6 +1980,7 @@ bool uprobe_deny_signal(void)
 
 	if (task_sigpending(t)) {
 		clear_tsk_thread_flag(t, TIF_SIGPENDING);
+		utask->state = UTASK_SSTEP_DENY_SIGNAL;
 
 		if (__fatal_signal_pending(t) || arch_uprobe_xol_was_trapped(t)) {
 			utask->state = UTASK_SSTEP_TRAPPED;
@@ -2276,22 +2277,25 @@ static void handle_singlestep(struct uprobe_task *utask, struct pt_regs *regs)
 	int err = 0;
 
 	uprobe = utask->active_uprobe;
-	if (utask->state == UTASK_SSTEP_ACK)
+	switch (utask->state) {
+	case UTASK_SSTEP_ACK:
 		err = arch_uprobe_post_xol(&uprobe->arch, regs);
-	else if (utask->state == UTASK_SSTEP_TRAPPED)
+		break;
+	case UTASK_SSTEP_TRAPPED:
 		arch_uprobe_abort_xol(&uprobe->arch, regs);
-	else
+		fallthrough;
+	case UTASK_SSTEP_DENY_SIGNAL:
+		set_tsk_thread_flag(current, TIF_SIGPENDING);
+		break;
+	default:
 		WARN_ON_ONCE(1);
+	}
 
 	put_uprobe(uprobe);
 	utask->active_uprobe = NULL;
 	utask->state = UTASK_RUNNING;
 	xol_free_insn_slot(current);
 
-	spin_lock_irq(¤t->sighand->siglock);
-	recalc_sigpending(); /* see uprobe_deny_signal() */
-	spin_unlock_irq(¤t->sighand->siglock);
-
 	if (unlikely(err)) {
 		uprobe_warn(current, "execute the probed insn, sending SIGILL.");
 		force_sig(SIGILL);
-- 
2.34.1
^ permalink raw reply related	[flat|nested] 11+ messages in thread
* Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock
  2024-08-01  8:24 [PATCH] uprobes: Improve scalability by reducing the contention on siglock Liao Chang
@ 2024-08-01 14:06 ` Oleg Nesterov
  2024-08-02  1:38   ` Liao, Chang
  0 siblings, 1 reply; 11+ messages in thread
From: Oleg Nesterov @ 2024-08-01 14:06 UTC (permalink / raw)
  To: Liao Chang
  Cc: mhiramat, peterz, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, kan.liang,
	linux-kernel, linux-trace-kernel, linux-perf-users
On 08/01, Liao Chang wrote:
>
> @@ -2276,22 +2277,25 @@ static void handle_singlestep(struct uprobe_task *utask, struct pt_regs *regs)
>  	int err = 0;
>  
>  	uprobe = utask->active_uprobe;
> -	if (utask->state == UTASK_SSTEP_ACK)
> +	switch (utask->state) {
> +	case UTASK_SSTEP_ACK:
>  		err = arch_uprobe_post_xol(&uprobe->arch, regs);
> -	else if (utask->state == UTASK_SSTEP_TRAPPED)
> +		break;
> +	case UTASK_SSTEP_TRAPPED:
>  		arch_uprobe_abort_xol(&uprobe->arch, regs);
> -	else
> +		fallthrough;
> +	case UTASK_SSTEP_DENY_SIGNAL:
> +		set_tsk_thread_flag(current, TIF_SIGPENDING);
> +		break;
> +	default:
>  		WARN_ON_ONCE(1);
> +	}
Liao, at first glance this change looks "obviously wrong" to me.
But let me read this patch more carefully and reply on weekend,
I am a bit busy right now.
Thanks,
Oleg.
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock
  2024-08-01 14:06 ` Oleg Nesterov
@ 2024-08-02  1:38   ` Liao, Chang
  2024-08-02  9:24     ` Oleg Nesterov
  0 siblings, 1 reply; 11+ messages in thread
From: Liao, Chang @ 2024-08-02  1:38 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: mhiramat, peterz, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, kan.liang,
	linux-kernel, linux-trace-kernel, linux-perf-users
在 2024/8/1 22:06, Oleg Nesterov 写道:
> On 08/01, Liao Chang wrote:
>>
>> @@ -2276,22 +2277,25 @@ static void handle_singlestep(struct uprobe_task *utask, struct pt_regs *regs)
>>  	int err = 0;
>>  
>>  	uprobe = utask->active_uprobe;
>> -	if (utask->state == UTASK_SSTEP_ACK)
>> +	switch (utask->state) {
>> +	case UTASK_SSTEP_ACK:
>>  		err = arch_uprobe_post_xol(&uprobe->arch, regs);
>> -	else if (utask->state == UTASK_SSTEP_TRAPPED)
>> +		break;
>> +	case UTASK_SSTEP_TRAPPED:
>>  		arch_uprobe_abort_xol(&uprobe->arch, regs);
>> -	else
>> +		fallthrough;
>> +	case UTASK_SSTEP_DENY_SIGNAL:
>> +		set_tsk_thread_flag(current, TIF_SIGPENDING);
>> +		break;
>> +	default:
>>  		WARN_ON_ONCE(1);
>> +	}
> 
> Liao, at first glance this change looks "obviously wrong" to me.
Oleg. Did i overlook some thing obvious here?
> 
> But let me read this patch more carefully and reply on weekend,
> I am a bit busy right now.
Sure, thanks.
> 
> Thanks,
> 
> Oleg.
> 
> 
-- 
BR
Liao, Chang
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock
  2024-08-02  1:38   ` Liao, Chang
@ 2024-08-02  9:24     ` Oleg Nesterov
  2024-08-06  3:06       ` Liao, Chang
  0 siblings, 1 reply; 11+ messages in thread
From: Oleg Nesterov @ 2024-08-02  9:24 UTC (permalink / raw)
  To: Liao, Chang
  Cc: mhiramat, peterz, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, kan.liang,
	linux-kernel, linux-trace-kernel, linux-perf-users
On 08/02, Liao, Chang wrote:
>
>
> 在 2024/8/1 22:06, Oleg Nesterov 写道:
> > On 08/01, Liao Chang wrote:
> >>
> >> @@ -2276,22 +2277,25 @@ static void handle_singlestep(struct uprobe_task *utask, struct pt_regs *regs)
> >>  	int err = 0;
> >>
> >>  	uprobe = utask->active_uprobe;
> >> -	if (utask->state == UTASK_SSTEP_ACK)
> >> +	switch (utask->state) {
> >> +	case UTASK_SSTEP_ACK:
> >>  		err = arch_uprobe_post_xol(&uprobe->arch, regs);
> >> -	else if (utask->state == UTASK_SSTEP_TRAPPED)
> >> +		break;
> >> +	case UTASK_SSTEP_TRAPPED:
> >>  		arch_uprobe_abort_xol(&uprobe->arch, regs);
> >> -	else
> >> +		fallthrough;
> >> +	case UTASK_SSTEP_DENY_SIGNAL:
> >> +		set_tsk_thread_flag(current, TIF_SIGPENDING);
> >> +		break;
> >> +	default:
> >>  		WARN_ON_ONCE(1);
> >> +	}
> >
> > Liao, at first glance this change looks "obviously wrong" to me.
>
> Oleg. Did i overlook some thing obvious here?
OK, lets suppose uprobe_deny_signal() sets UTASK_SSTEP_DENY_SIGNAL.
In this case handle_singlestep() will only set TIF_SIGPENDING and
do nothing else. This is wrong, either _post_xol() or _abort_xol()
must be called.
But I think handle_singlestep() will never hit this case. In the
likely case uprobe_post_sstep_notifier() will replace _DENY_SIGNAL
with _ACK, and this means that handle_singlestep() won't restore
TIF_SIGPENDING cleared by uprobe_deny_signal().
Oleg.
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock
  2024-08-02  9:24     ` Oleg Nesterov
@ 2024-08-06  3:06       ` Liao, Chang
  2024-08-06 17:25         ` Oleg Nesterov
  0 siblings, 1 reply; 11+ messages in thread
From: Liao, Chang @ 2024-08-06  3:06 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: mhiramat, peterz, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, kan.liang,
	linux-kernel, linux-trace-kernel, linux-perf-users
在 2024/8/2 17:24, Oleg Nesterov 写道:
> On 08/02, Liao, Chang wrote:
>>
>>
>> 在 2024/8/1 22:06, Oleg Nesterov 写道:
>>> On 08/01, Liao Chang wrote:
>>>>
>>>> @@ -2276,22 +2277,25 @@ static void handle_singlestep(struct uprobe_task *utask, struct pt_regs *regs)
>>>>  	int err = 0;
>>>>
>>>>  	uprobe = utask->active_uprobe;
>>>> -	if (utask->state == UTASK_SSTEP_ACK)
>>>> +	switch (utask->state) {
>>>> +	case UTASK_SSTEP_ACK:
>>>>  		err = arch_uprobe_post_xol(&uprobe->arch, regs);
>>>> -	else if (utask->state == UTASK_SSTEP_TRAPPED)
>>>> +		break;
>>>> +	case UTASK_SSTEP_TRAPPED:
>>>>  		arch_uprobe_abort_xol(&uprobe->arch, regs);
>>>> -	else
>>>> +		fallthrough;
>>>> +	case UTASK_SSTEP_DENY_SIGNAL:
>>>> +		set_tsk_thread_flag(current, TIF_SIGPENDING);
>>>> +		break;
>>>> +	default:
>>>>  		WARN_ON_ONCE(1);
>>>> +	}
>>>
>>> Liao, at first glance this change looks "obviously wrong" to me.
>>
>> Oleg. Did i overlook some thing obvious here?
> 
> OK, lets suppose uprobe_deny_signal() sets UTASK_SSTEP_DENY_SIGNAL.
> 
> In this case handle_singlestep() will only set TIF_SIGPENDING and
> do nothing else. This is wrong, either _post_xol() or _abort_xol()
> must be called.
> 
> But I think handle_singlestep() will never hit this case. In the
> likely case uprobe_post_sstep_notifier() will replace _DENY_SIGNAL
> with _ACK, and this means that handle_singlestep() won't restore
> TIF_SIGPENDING cleared by uprobe_deny_signal().
You're absolutely right. handle_signlestep() has chance to handle _DENY_SIGANL
unless it followed by setting TIF_UPROBE in uprobe_deny_signal(). This means
_DENY_SIGNAL is likey replaced during next uprobe single-stepping.
I believe introducing _DENY_SIGNAL as the immediate state between UTASK_SSTEP
and UTASK_SSTEP_ACK is still necessary. This allow uprobe_post_sstep_notifier()
to correctly restore TIF_SIGPENDING upon the completion of single-step.
A revised implementation would look like this:
------------------%<------------------
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1980,6 +1980,7 @@ bool uprobe_deny_signal(void)
        if (task_sigpending(t)) {
                clear_tsk_thread_flag(t, TIF_SIGPENDING);
+               utask->state = UTASK_SSTEP_DENY_SIGNAL;
                if (__fatal_signal_pending(t) || arch_uprobe_xol_was_trapped(t)) {
                        utask->state = UTASK_SSTEP_TRAPPED;
@@ -2276,22 +2277,23 @@ static void handle_singlestep(struct uprobe_task *utask, struct pt_regs *regs)
        int err = 0;
        uprobe = utask->active_uprobe;
-       if (utask->state == UTASK_SSTEP_ACK)
+       switch (utask->state) {
+       case UTASK_SSTEP_ACK:
                err = arch_uprobe_post_xol(&uprobe->arch, regs);
-       else if (utask->state == UTASK_SSTEP_TRAPPED)
+               break;
+       case UTASK_SSTEP_TRAPPED:
                arch_uprobe_abort_xol(&uprobe->arch, regs);
-       else
+               set_thread_flag(TIF_SIGPENDING);
+               break;
+       default:
                WARN_ON_ONCE(1);
+       }
        put_uprobe(uprobe);
        utask->active_uprobe = NULL;
        utask->state = UTASK_RUNNING;
        xol_free_insn_slot(current);
-       spin_lock_irq(¤t->sighand->siglock);
-       recalc_sigpending(); /* see uprobe_deny_signal() */
-       spin_unlock_irq(¤t->sighand->siglock);
-
        if (unlikely(err)) {
                uprobe_warn(current, "execute the probed insn, sending SIGILL.");
                force_sig(SIGILL);
@@ -2351,6 +2353,8 @@ int uprobe_post_sstep_notifier(struct pt_regs *regs)
                /* task is currently not uprobed */
                return 0;
+       if (utask->state == UTASK_SSTEP_DENY_SIGNAL)
+               set_thread_flag(TIF_SIGPENDING);
        utask->state = UTASK_SSTEP_ACK;
        set_thread_flag(TIF_UPROBE);
        return 1;
------------------>%------------------
> 
> Oleg.
> 
> 
-- 
BR
Liao, Chang
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock
  2024-08-06  3:06       ` Liao, Chang
@ 2024-08-06 17:25         ` Oleg Nesterov
  2024-08-07 10:17           ` Oleg Nesterov
  0 siblings, 1 reply; 11+ messages in thread
From: Oleg Nesterov @ 2024-08-06 17:25 UTC (permalink / raw)
  To: Liao, Chang
  Cc: mhiramat, peterz, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, kan.liang,
	linux-kernel, linux-trace-kernel, linux-perf-users
On 08/06, Liao, Chang wrote:
>
> You're absolutely right. handle_signlestep() has chance to handle _DENY_SIGANL
> unless it followed by setting TIF_UPROBE in uprobe_deny_signal(). This means
> _DENY_SIGNAL is likey replaced during next uprobe single-stepping.
>
> I believe introducing _DENY_SIGNAL as the immediate state between UTASK_SSTEP
> and UTASK_SSTEP_ACK is still necessary. This allow uprobe_post_sstep_notifier()
> to correctly restore TIF_SIGPENDING upon the completion of single-step.
>
> A revised implementation would look like this:
Still looks "obviously wrong" to me... even the approach itself.
Perhaps I am wrong, yet another day when I can't even read emails on lkml
carefully, sorry.
But can you please send the patch which I could actually apply? This one
looks white-space damaged...
I'll try to reply with more details as soon I convince myself I fully
understand what does your patch actually do, but most probably not tomorrow.
Thanks,
Oleg.
> ------------------%<------------------
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -1980,6 +1980,7 @@ bool uprobe_deny_signal(void)
> 
>         if (task_sigpending(t)) {
>                 clear_tsk_thread_flag(t, TIF_SIGPENDING);
> +               utask->state = UTASK_SSTEP_DENY_SIGNAL;
> 
>                 if (__fatal_signal_pending(t) || arch_uprobe_xol_was_trapped(t)) {
>                         utask->state = UTASK_SSTEP_TRAPPED;
> @@ -2276,22 +2277,23 @@ static void handle_singlestep(struct uprobe_task *utask, struct pt_regs *regs)
>         int err = 0;
> 
>         uprobe = utask->active_uprobe;
> -       if (utask->state == UTASK_SSTEP_ACK)
> +       switch (utask->state) {
> +       case UTASK_SSTEP_ACK:
>                 err = arch_uprobe_post_xol(&uprobe->arch, regs);
> -       else if (utask->state == UTASK_SSTEP_TRAPPED)
> +               break;
> +       case UTASK_SSTEP_TRAPPED:
>                 arch_uprobe_abort_xol(&uprobe->arch, regs);
> -       else
> +               set_thread_flag(TIF_SIGPENDING);
> +               break;
> +       default:
>                 WARN_ON_ONCE(1);
> +       }
> 
>         put_uprobe(uprobe);
>         utask->active_uprobe = NULL;
>         utask->state = UTASK_RUNNING;
>         xol_free_insn_slot(current);
> 
> -       spin_lock_irq(¤t->sighand->siglock);
> -       recalc_sigpending(); /* see uprobe_deny_signal() */
> -       spin_unlock_irq(¤t->sighand->siglock);
> -
>         if (unlikely(err)) {
>                 uprobe_warn(current, "execute the probed insn, sending SIGILL.");
>                 force_sig(SIGILL);
> @@ -2351,6 +2353,8 @@ int uprobe_post_sstep_notifier(struct pt_regs *regs)
>                 /* task is currently not uprobed */
>                 return 0;
> 
> +       if (utask->state == UTASK_SSTEP_DENY_SIGNAL)
> +               set_thread_flag(TIF_SIGPENDING);
>         utask->state = UTASK_SSTEP_ACK;
>         set_thread_flag(TIF_UPROBE);
>         return 1;
> 
> ------------------>%------------------
> 
> > 
> > Oleg.
> > 
> > 
> 
> -- 
> BR
> Liao, Chang
> 
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock
  2024-08-06 17:25         ` Oleg Nesterov
@ 2024-08-07 10:17           ` Oleg Nesterov
  2024-08-08  7:30             ` Liao, Chang
  0 siblings, 1 reply; 11+ messages in thread
From: Oleg Nesterov @ 2024-08-07 10:17 UTC (permalink / raw)
  To: Liao, Chang
  Cc: mhiramat, peterz, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, kan.liang,
	linux-kernel, linux-trace-kernel, linux-perf-users
So. Liao, I am sorry, but I dislike your patch/approach in any case.
UTASK_SSTEP_DENY_SIGNAL complicates the state machine. And I don't like the fact
that set_thread_flag(TIF_SIGPENDING) is called twice, from handle_singlestep()
and uprobe_post_sstep_notifier(), this complicates the logic even more.
We need a flag, not the new state.
And if I read this patch correctly it is wrong:
	- uprobe_deny_signal() clears TIF_SIGPENDING and sets UTASK_SSTEP_DENY_SIGNAL
	- another signal cames after that and sets TIF_SIGPENDING again
	- in this case the task won't return to user-space and execute the probed
	  insn, exit_to_user_mode_loop() will notice another TIF_SIGPENDING and
	  call arch_do_signal_or_restart()->get_signal() again.
	- get_signal() will call uprobe_deny_signal() again hit
		WARN_ON_ONCE(utask->state != UTASK_SSTEP);
And no, we shouldn't change this check into UTASK_SSTEP || UTASK_SSTEP_DENY_SIGNAL.
Again, the fact that uprobe_deny_signal() cleared TIF_SIGPENDING must not be the
new state.
Oleg.
On 08/06, Oleg Nesterov wrote:
>
> On 08/06, Liao, Chang wrote:
> >
> > You're absolutely right. handle_signlestep() has chance to handle _DENY_SIGANL
> > unless it followed by setting TIF_UPROBE in uprobe_deny_signal(). This means
> > _DENY_SIGNAL is likey replaced during next uprobe single-stepping.
> >
> > I believe introducing _DENY_SIGNAL as the immediate state between UTASK_SSTEP
> > and UTASK_SSTEP_ACK is still necessary. This allow uprobe_post_sstep_notifier()
> > to correctly restore TIF_SIGPENDING upon the completion of single-step.
> >
> > A revised implementation would look like this:
>
> Still looks "obviously wrong" to me... even the approach itself.
>
> Perhaps I am wrong, yet another day when I can't even read emails on lkml
> carefully, sorry.
>
> But can you please send the patch which I could actually apply? This one
> looks white-space damaged...
>
> I'll try to reply with more details as soon I convince myself I fully
> understand what does your patch actually do, but most probably not tomorrow.
>
> Thanks,
>
> Oleg.
>
> > ------------------%<------------------
> > --- a/kernel/events/uprobes.c
> > +++ b/kernel/events/uprobes.c
> > @@ -1980,6 +1980,7 @@ bool uprobe_deny_signal(void)
> >
> >         if (task_sigpending(t)) {
> >                 clear_tsk_thread_flag(t, TIF_SIGPENDING);
> > +               utask->state = UTASK_SSTEP_DENY_SIGNAL;
> >
> >                 if (__fatal_signal_pending(t) || arch_uprobe_xol_was_trapped(t)) {
> >                         utask->state = UTASK_SSTEP_TRAPPED;
> > @@ -2276,22 +2277,23 @@ static void handle_singlestep(struct uprobe_task *utask, struct pt_regs *regs)
> >         int err = 0;
> >
> >         uprobe = utask->active_uprobe;
> > -       if (utask->state == UTASK_SSTEP_ACK)
> > +       switch (utask->state) {
> > +       case UTASK_SSTEP_ACK:
> >                 err = arch_uprobe_post_xol(&uprobe->arch, regs);
> > -       else if (utask->state == UTASK_SSTEP_TRAPPED)
> > +               break;
> > +       case UTASK_SSTEP_TRAPPED:
> >                 arch_uprobe_abort_xol(&uprobe->arch, regs);
> > -       else
> > +               set_thread_flag(TIF_SIGPENDING);
> > +               break;
> > +       default:
> >                 WARN_ON_ONCE(1);
> > +       }
> >
> >         put_uprobe(uprobe);
> >         utask->active_uprobe = NULL;
> >         utask->state = UTASK_RUNNING;
> >         xol_free_insn_slot(current);
> >
> > -       spin_lock_irq(¤t->sighand->siglock);
> > -       recalc_sigpending(); /* see uprobe_deny_signal() */
> > -       spin_unlock_irq(¤t->sighand->siglock);
> > -
> >         if (unlikely(err)) {
> >                 uprobe_warn(current, "execute the probed insn, sending SIGILL.");
> >                 force_sig(SIGILL);
> > @@ -2351,6 +2353,8 @@ int uprobe_post_sstep_notifier(struct pt_regs *regs)
> >                 /* task is currently not uprobed */
> >                 return 0;
> >
> > +       if (utask->state == UTASK_SSTEP_DENY_SIGNAL)
> > +               set_thread_flag(TIF_SIGPENDING);
> >         utask->state = UTASK_SSTEP_ACK;
> >         set_thread_flag(TIF_UPROBE);
> >         return 1;
> >
> > ------------------>%------------------
> >
> > >
> > > Oleg.
> > >
> > >
> >
> > --
> > BR
> > Liao, Chang
> >
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock
  2024-08-07 10:17           ` Oleg Nesterov
@ 2024-08-08  7:30             ` Liao, Chang
  2024-08-08 10:28               ` Oleg Nesterov
  0 siblings, 1 reply; 11+ messages in thread
From: Liao, Chang @ 2024-08-08  7:30 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: mhiramat, peterz, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, kan.liang,
	linux-kernel, linux-trace-kernel, linux-perf-users
在 2024/8/7 18:17, Oleg Nesterov 写道:
> So. Liao, I am sorry, but I dislike your patch/approach in any case.
> 
> UTASK_SSTEP_DENY_SIGNAL complicates the state machine. And I don't like the fact
> that set_thread_flag(TIF_SIGPENDING) is called twice, from handle_singlestep()
> and uprobe_post_sstep_notifier(), this complicates the logic even more.
> 
> We need a flag, not the new state.
> 
> And if I read this patch correctly it is wrong:
> 
> 	- uprobe_deny_signal() clears TIF_SIGPENDING and sets UTASK_SSTEP_DENY_SIGNAL
> 
> 	- another signal cames after that and sets TIF_SIGPENDING again
> 
> 	- in this case the task won't return to user-space and execute the probed
> 	  insn, exit_to_user_mode_loop() will notice another TIF_SIGPENDING and
> 	  call arch_do_signal_or_restart()->get_signal() again.
> 
> 	- get_signal() will call uprobe_deny_signal() again hit
> 
> 		WARN_ON_ONCE(utask->state != UTASK_SSTEP);
> 
> 
> And no, we shouldn't change this check into UTASK_SSTEP || UTASK_SSTEP_DENY_SIGNAL.
> Again, the fact that uprobe_deny_signal() cleared TIF_SIGPENDING must not be the
> new state.
Oleg, If i understand correctly, current state machine expects the single-step handling
should end up with either _ACK or _TRAPPED. Any new state would disrupt this logic. If so,
I'm convinced that adding a new state is uncessary. As you mentioned, I propose using a
boolean flag in the uprobe_task data to track whether a signal should be restored at the
cost of increased size. Here's outline of the changes:
  - pre_ssout() resets the deny signal flag
  - uprobe_deny_signal() sets the deny signal flag when TIF_SIGPENDING is cleared.
  - handle_singlestep() check the deny signal flag and restore TIF_SIGPENDING if necessary.
Does this approach look correct to you,do do you have any other way to implement the "flag"?
Thanks.
> 
> Oleg.
> 
> On 08/06, Oleg Nesterov wrote:
>>
>> On 08/06, Liao, Chang wrote:
>>>
>>> You're absolutely right. handle_signlestep() has chance to handle _DENY_SIGANL
>>> unless it followed by setting TIF_UPROBE in uprobe_deny_signal(). This means
>>> _DENY_SIGNAL is likey replaced during next uprobe single-stepping.
>>>
>>> I believe introducing _DENY_SIGNAL as the immediate state between UTASK_SSTEP
>>> and UTASK_SSTEP_ACK is still necessary. This allow uprobe_post_sstep_notifier()
>>> to correctly restore TIF_SIGPENDING upon the completion of single-step.
>>>
>>> A revised implementation would look like this:
>>
>> Still looks "obviously wrong" to me... even the approach itself.
>>
>> Perhaps I am wrong, yet another day when I can't even read emails on lkml
>> carefully, sorry.
>>
>> But can you please send the patch which I could actually apply? This one
>> looks white-space damaged...
>>
>> I'll try to reply with more details as soon I convince myself I fully
>> understand what does your patch actually do, but most probably not tomorrow.
>>
>> Thanks,
>>
>> Oleg.
>>
>>> ------------------%<------------------
>>> --- a/kernel/events/uprobes.c
>>> +++ b/kernel/events/uprobes.c
>>> @@ -1980,6 +1980,7 @@ bool uprobe_deny_signal(void)
>>>
>>>         if (task_sigpending(t)) {
>>>                 clear_tsk_thread_flag(t, TIF_SIGPENDING);
>>> +               utask->state = UTASK_SSTEP_DENY_SIGNAL;
>>>
>>>                 if (__fatal_signal_pending(t) || arch_uprobe_xol_was_trapped(t)) {
>>>                         utask->state = UTASK_SSTEP_TRAPPED;
>>> @@ -2276,22 +2277,23 @@ static void handle_singlestep(struct uprobe_task *utask, struct pt_regs *regs)
>>>         int err = 0;
>>>
>>>         uprobe = utask->active_uprobe;
>>> -       if (utask->state == UTASK_SSTEP_ACK)
>>> +       switch (utask->state) {
>>> +       case UTASK_SSTEP_ACK:
>>>                 err = arch_uprobe_post_xol(&uprobe->arch, regs);
>>> -       else if (utask->state == UTASK_SSTEP_TRAPPED)
>>> +               break;
>>> +       case UTASK_SSTEP_TRAPPED:
>>>                 arch_uprobe_abort_xol(&uprobe->arch, regs);
>>> -       else
>>> +               set_thread_flag(TIF_SIGPENDING);
>>> +               break;
>>> +       default:
>>>                 WARN_ON_ONCE(1);
>>> +       }
>>>
>>>         put_uprobe(uprobe);
>>>         utask->active_uprobe = NULL;
>>>         utask->state = UTASK_RUNNING;
>>>         xol_free_insn_slot(current);
>>>
>>> -       spin_lock_irq(¤t->sighand->siglock);
>>> -       recalc_sigpending(); /* see uprobe_deny_signal() */
>>> -       spin_unlock_irq(¤t->sighand->siglock);
>>> -
>>>         if (unlikely(err)) {
>>>                 uprobe_warn(current, "execute the probed insn, sending SIGILL.");
>>>                 force_sig(SIGILL);
>>> @@ -2351,6 +2353,8 @@ int uprobe_post_sstep_notifier(struct pt_regs *regs)
>>>                 /* task is currently not uprobed */
>>>                 return 0;
>>>
>>> +       if (utask->state == UTASK_SSTEP_DENY_SIGNAL)
>>> +               set_thread_flag(TIF_SIGPENDING);
>>>         utask->state = UTASK_SSTEP_ACK;
>>>         set_thread_flag(TIF_UPROBE);
>>>         return 1;
>>>
>>> ------------------>%------------------
>>>
>>>>
>>>> Oleg.
>>>>
>>>>
>>>
>>> --
>>> BR
>>> Liao, Chang
>>>
> 
> 
-- 
BR
Liao, Chang
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock
  2024-08-08  7:30             ` Liao, Chang
@ 2024-08-08 10:28               ` Oleg Nesterov
  2024-08-08 12:31                 ` Liao, Chang
  0 siblings, 1 reply; 11+ messages in thread
From: Oleg Nesterov @ 2024-08-08 10:28 UTC (permalink / raw)
  To: Liao, Chang
  Cc: mhiramat, peterz, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, kan.liang,
	linux-kernel, linux-trace-kernel, linux-perf-users
On 08/08, Liao, Chang wrote:
>
>   - pre_ssout() resets the deny signal flag
>
>   - uprobe_deny_signal() sets the deny signal flag when TIF_SIGPENDING is cleared.
>
>   - handle_singlestep() check the deny signal flag and restore TIF_SIGPENDING if necessary.
>
> Does this approach look correct to you,do do you have any other way to implement the "flag"?
Yes. But I don't think pre_ssout() needs to clear this flag. handle_singlestep() resets/clears
state, active_uprobe, frees insn slot. So I guess we only need
--- x/kernel/events/uprobes.c
+++ x/kernel/events/uprobes.c
@@ -2308,9 +2308,10 @@ static void handle_singlestep(struct upr
 	utask->state = UTASK_RUNNING;
 	xol_free_insn_slot(current);
 
-	spin_lock_irq(¤t->sighand->siglock);
-	recalc_sigpending(); /* see uprobe_deny_signal() */
-	spin_unlock_irq(¤t->sighand->siglock);
+	if (utask->xxx) {
+		set_thread_flag(TIF_SIGPENDING);
+		utask->xxx = 0;
+	}
 
 	if (unlikely(err)) {
 		uprobe_warn(current, "execute the probed insn, sending SIGILL.");
and that is all.
Oleg.
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock
  2024-08-08 10:28               ` Oleg Nesterov
@ 2024-08-08 12:31                 ` Liao, Chang
  2024-08-08 13:17                   ` Oleg Nesterov
  0 siblings, 1 reply; 11+ messages in thread
From: Liao, Chang @ 2024-08-08 12:31 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: mhiramat, peterz, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, kan.liang,
	linux-kernel, linux-trace-kernel, linux-perf-users, bpf,
	Andrii Nakryiko, Masami Hiramatsu, Steven Rostedt, paulmck
在 2024/8/8 18:28, Oleg Nesterov 写道:
> On 08/08, Liao, Chang wrote:
>>
>>   - pre_ssout() resets the deny signal flag
>>
>>   - uprobe_deny_signal() sets the deny signal flag when TIF_SIGPENDING is cleared.
>>
>>   - handle_singlestep() check the deny signal flag and restore TIF_SIGPENDING if necessary.
>>
>> Does this approach look correct to you,do do you have any other way to implement the "flag"?
> 
> Yes. But I don't think pre_ssout() needs to clear this flag. handle_singlestep() resets/clears
> state, active_uprobe, frees insn slot. So I guess we only need
> 
> 
> --- x/kernel/events/uprobes.c
> +++ x/kernel/events/uprobes.c
> @@ -2308,9 +2308,10 @@ static void handle_singlestep(struct upr
>  	utask->state = UTASK_RUNNING;
>  	xol_free_insn_slot(current);
>  
> -	spin_lock_irq(¤t->sighand->siglock);
> -	recalc_sigpending(); /* see uprobe_deny_signal() */
> -	spin_unlock_irq(¤t->sighand->siglock);
> +	if (utask->xxx) {
> +		set_thread_flag(TIF_SIGPENDING);
> +		utask->xxx = 0;
> +	}
Agree, if no more discussion about this flag, I will just send v2 today.
Thanks.
>  
>  	if (unlikely(err)) {
>  		uprobe_warn(current, "execute the probed insn, sending SIGILL.");
> 
> and that is all.
> 
> Oleg.
> 
> 
-- 
BR
Liao, Chang
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] uprobes: Improve scalability by reducing the contention on siglock
  2024-08-08 12:31                 ` Liao, Chang
@ 2024-08-08 13:17                   ` Oleg Nesterov
  0 siblings, 0 replies; 11+ messages in thread
From: Oleg Nesterov @ 2024-08-08 13:17 UTC (permalink / raw)
  To: Liao, Chang
  Cc: mhiramat, peterz, mingo, acme, namhyung, mark.rutland,
	alexander.shishkin, jolsa, irogers, adrian.hunter, kan.liang,
	linux-kernel, linux-trace-kernel, linux-perf-users, bpf,
	Andrii Nakryiko, Steven Rostedt, paulmck
On 08/08, Liao, Chang wrote:
>
>
> 在 2024/8/8 18:28, Oleg Nesterov 写道:
> > --- x/kernel/events/uprobes.c
> > +++ x/kernel/events/uprobes.c
> > @@ -2308,9 +2308,10 @@ static void handle_singlestep(struct upr
> >  	utask->state = UTASK_RUNNING;
> >  	xol_free_insn_slot(current);
> >
> > -	spin_lock_irq(¤t->sighand->siglock);
> > -	recalc_sigpending(); /* see uprobe_deny_signal() */
> > -	spin_unlock_irq(¤t->sighand->siglock);
> > +	if (utask->xxx) {
> > +		set_thread_flag(TIF_SIGPENDING);
> > +		utask->xxx = 0;
> > +	}
>
> Agree, if no more discussion about this flag, I will just send v2 today.
Please also resend the previous patch a 1/2, this one as 2/2.
Oleg.
^ permalink raw reply	[flat|nested] 11+ messages in thread
end of thread, other threads:[~2024-08-08 13:17 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-01  8:24 [PATCH] uprobes: Improve scalability by reducing the contention on siglock Liao Chang
2024-08-01 14:06 ` Oleg Nesterov
2024-08-02  1:38   ` Liao, Chang
2024-08-02  9:24     ` Oleg Nesterov
2024-08-06  3:06       ` Liao, Chang
2024-08-06 17:25         ` Oleg Nesterov
2024-08-07 10:17           ` Oleg Nesterov
2024-08-08  7:30             ` Liao, Chang
2024-08-08 10:28               ` Oleg Nesterov
2024-08-08 12:31                 ` Liao, Chang
2024-08-08 13:17                   ` Oleg Nesterov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).