public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH REPOST 0/2] signal: Avoid preempt_disable() in ptrace_stop() on PREEMPT_RT
@ 2023-06-06  8:55 Sebastian Andrzej Siewior
  2023-06-06  8:55 ` [PATCH REPOST 1/2] signal: Add proper comment about the preempt-disable in ptrace_stop() Sebastian Andrzej Siewior
  2023-06-06  8:55 ` [PATCH REPOST 2/2] signal: Don't disable preemption in ptrace_stop() on PREEMPT_RT Sebastian Andrzej Siewior
  0 siblings, 2 replies; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-06-06  8:55 UTC (permalink / raw)
  To: linux-kernel
  Cc: Eric W. Biederman, Oleg Nesterov, Peter Zijlstra, Thomas Gleixner

this mini series updates the comment to properly explain why the
preempt-disable section has been added and then disables this on
PREEMPT_RT explaining why it can not be done.

This is a repost of
	https://lore.kernel.org/20230406205713.1843072-1-bigeasy@linutronix.de

Sebastian



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH REPOST 1/2] signal: Add proper comment about the preempt-disable in ptrace_stop().
  2023-06-06  8:55 [PATCH REPOST 0/2] signal: Avoid preempt_disable() in ptrace_stop() on PREEMPT_RT Sebastian Andrzej Siewior
@ 2023-06-06  8:55 ` Sebastian Andrzej Siewior
  2023-06-06 11:06   ` Oleg Nesterov
  2023-06-06  8:55 ` [PATCH REPOST 2/2] signal: Don't disable preemption in ptrace_stop() on PREEMPT_RT Sebastian Andrzej Siewior
  1 sibling, 1 reply; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-06-06  8:55 UTC (permalink / raw)
  To: linux-kernel
  Cc: Eric W. Biederman, Oleg Nesterov, Peter Zijlstra, Thomas Gleixner,
	Sebastian Andrzej Siewior

Commit 53da1d9456fe7 ("fix ptrace slowness") added a preempt-disable section
between read_unlock() and the following schedule() invocation without
explaining why it is needed.

Replace the comment with an explanation why this is needed. Clarify that
it is needed for correctness but for performance reasons.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 kernel/signal.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 2547fa73bde51..da017a5461163 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2313,10 +2313,21 @@ static int ptrace_stop(int exit_code, int why, unsigned long message,
 		do_notify_parent_cldstop(current, false, why);
 
 	/*
-	 * Don't want to allow preemption here, because
-	 * sys_ptrace() needs this task to be inactive.
+	 * The previous do_notify_parent_cldstop() invocation woke ptracer.
+	 * One a PREEMPTION kernel this can result in preemption requirement
+	 * which will be fulfilled after read_unlock() and the ptracer will be
+	 * put on the CPU.
+	 * The ptracer is in wait_task_inactive(, __TASK_TRACED) waiting for
+	 * this task wait in schedule(). If this task gets preempted then it
+	 * remains enqueued on the runqueue. The ptracer will observe this and
+	 * then sleep for a delay of one HZ tick. In the meantime this task
+	 * gets scheduled, enters schedule() and will wait for the ptracer.
 	 *
-	 * XXX: implement read_unlock_no_resched().
+	 * This preemption point is not bad from correctness point of view but
+	 * extends the runtime by one HZ tick time due to the ptracer's sleep.
+	 * The preempt-disable section ensures that there will be no preemption
+	 * between unlock and schedule() and so improving the performance since
+	 * the ptracer has no reason to sleep.
 	 */
 	preempt_disable();
 	read_unlock(&tasklist_lock);
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH REPOST 2/2] signal: Don't disable preemption in ptrace_stop() on PREEMPT_RT.
  2023-06-06  8:55 [PATCH REPOST 0/2] signal: Avoid preempt_disable() in ptrace_stop() on PREEMPT_RT Sebastian Andrzej Siewior
  2023-06-06  8:55 ` [PATCH REPOST 1/2] signal: Add proper comment about the preempt-disable in ptrace_stop() Sebastian Andrzej Siewior
@ 2023-06-06  8:55 ` Sebastian Andrzej Siewior
  2023-06-06 11:04   ` Oleg Nesterov
  1 sibling, 1 reply; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-06-06  8:55 UTC (permalink / raw)
  To: linux-kernel
  Cc: Eric W. Biederman, Oleg Nesterov, Peter Zijlstra, Thomas Gleixner,
	Sebastian Andrzej Siewior

On PREEMPT_RT keeping preemption disabled during the invocation of
cgroup_enter_frozen() is a problem because the function acquires css_set_lock
which is a sleeping lock on PREEMPT_RT and must not be acquired with disabled
preemption.
The preempt-disabled section is only for performance optimisation
reasons and can be avoided.

Extend the comment and don't disable preemption before scheduling on
PREEMPT_RT.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 kernel/signal.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index da017a5461163..9e07b3075c72e 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2328,11 +2328,16 @@ static int ptrace_stop(int exit_code, int why, unsigned long message,
 	 * The preempt-disable section ensures that there will be no preemption
 	 * between unlock and schedule() and so improving the performance since
 	 * the ptracer has no reason to sleep.
+	 *
+	 * This optimisation is not doable on PREEMPT_RT due to the spinlock_t
+	 * within the preempt-disable section.
 	 */
-	preempt_disable();
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_disable();
 	read_unlock(&tasklist_lock);
 	cgroup_enter_frozen();
-	preempt_enable_no_resched();
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_enable_no_resched();
 	schedule();
 	cgroup_leave_frozen(true);
 
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH REPOST 2/2] signal: Don't disable preemption in ptrace_stop() on PREEMPT_RT.
  2023-06-06  8:55 ` [PATCH REPOST 2/2] signal: Don't disable preemption in ptrace_stop() on PREEMPT_RT Sebastian Andrzej Siewior
@ 2023-06-06 11:04   ` Oleg Nesterov
  2023-06-06 11:14     ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Oleg Nesterov @ 2023-06-06 11:04 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, Eric W. Biederman, Peter Zijlstra, Thomas Gleixner

The patch LGTM, but I am a bit confused by the changelog/comments,
I guess I missed something...

On 06/06, Sebastian Andrzej Siewior wrote:
>
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -2328,11 +2328,16 @@ static int ptrace_stop(int exit_code, int why, unsigned long message,
>  	 * The preempt-disable section ensures that there will be no preemption
>  	 * between unlock and schedule() and so improving the performance since
>  	 * the ptracer has no reason to sleep.
> +	 *
> +	 * This optimisation is not doable on PREEMPT_RT due to the spinlock_t
> +	 * within the preempt-disable section.
>  	 */
> -	preempt_disable();
> +	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
> +		preempt_disable();

Not only we the problems with cgroup_enter_frozen(), afaics (please correct me)
this optimisation doesn't work on RT anyway?

IIUC, read_lock() on RT disables migration but not preemption, so it is simply
too late to do preempt_disable() before unlock/schedule. The tracer can preempt
the tracee right after do_notify_parent_cldstop().

Oleg.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH REPOST 1/2] signal: Add proper comment about the preempt-disable in ptrace_stop().
  2023-06-06  8:55 ` [PATCH REPOST 1/2] signal: Add proper comment about the preempt-disable in ptrace_stop() Sebastian Andrzej Siewior
@ 2023-06-06 11:06   ` Oleg Nesterov
  0 siblings, 0 replies; 9+ messages in thread
From: Oleg Nesterov @ 2023-06-06 11:06 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, Eric W. Biederman, Peter Zijlstra, Thomas Gleixner

On 06/06, Sebastian Andrzej Siewior wrote:
>
> Commit 53da1d9456fe7 ("fix ptrace slowness") added a preempt-disable section
> between read_unlock() and the following schedule() invocation without
> explaining why it is needed.
>
> Replace the comment with an explanation why this is needed. Clarify that
> it is needed for correctness but for performance reasons.
>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Acked-by: Oleg Nesterov <oleg@redhat.com>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH REPOST 2/2] signal: Don't disable preemption in ptrace_stop() on PREEMPT_RT.
  2023-06-06 11:04   ` Oleg Nesterov
@ 2023-06-06 11:14     ` Peter Zijlstra
  2023-06-06 11:38       ` Oleg Nesterov
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2023-06-06 11:14 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Sebastian Andrzej Siewior, linux-kernel, Eric W. Biederman,
	Thomas Gleixner

On Tue, Jun 06, 2023 at 01:04:48PM +0200, Oleg Nesterov wrote:
> The patch LGTM, but I am a bit confused by the changelog/comments,
> I guess I missed something...
> 
> On 06/06, Sebastian Andrzej Siewior wrote:
> >
> > --- a/kernel/signal.c
> > +++ b/kernel/signal.c
> > @@ -2328,11 +2328,16 @@ static int ptrace_stop(int exit_code, int why, unsigned long message,
> >  	 * The preempt-disable section ensures that there will be no preemption
> >  	 * between unlock and schedule() and so improving the performance since
> >  	 * the ptracer has no reason to sleep.
> > +	 *
> > +	 * This optimisation is not doable on PREEMPT_RT due to the spinlock_t
> > +	 * within the preempt-disable section.
> >  	 */
> > -	preempt_disable();
> > +	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
> > +		preempt_disable();
> 
> Not only we the problems with cgroup_enter_frozen(), afaics (please correct me)
> this optimisation doesn't work on RT anyway?
> 
> IIUC, read_lock() on RT disables migration but not preemption, so it is simply
> too late to do preempt_disable() before unlock/schedule. The tracer can preempt
> the tracee right after do_notify_parent_cldstop().

Correct -- but I think you can disable preemption over what is
effectivly rwsem_up_read(), but you can't over the effective
rtmutex_lock() that cgroup_enter_frozen() will then attempt.

(iow, unlock() doesn't tend to sleep, while lock() does)

But you're correct to point out that the whole preempt_disable() thing
is entirely pointless due to the whole task_lock region being
preemptible before it.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH REPOST 2/2] signal: Don't disable preemption in ptrace_stop() on PREEMPT_RT.
  2023-06-06 11:14     ` Peter Zijlstra
@ 2023-06-06 11:38       ` Oleg Nesterov
  2023-06-06 13:16         ` [PATCH 2/2 v2] " Sebastian Andrzej Siewior
  0 siblings, 1 reply; 9+ messages in thread
From: Oleg Nesterov @ 2023-06-06 11:38 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Sebastian Andrzej Siewior, linux-kernel, Eric W. Biederman,
	Thomas Gleixner

On 06/06, Peter Zijlstra wrote:
>
> On Tue, Jun 06, 2023 at 01:04:48PM +0200, Oleg Nesterov wrote:
> > The patch LGTM, but I am a bit confused by the changelog/comments,
> > I guess I missed something...
> >
> > On 06/06, Sebastian Andrzej Siewior wrote:
> > >
> > > --- a/kernel/signal.c
> > > +++ b/kernel/signal.c
> > > @@ -2328,11 +2328,16 @@ static int ptrace_stop(int exit_code, int why, unsigned long message,
> > >  	 * The preempt-disable section ensures that there will be no preemption
> > >  	 * between unlock and schedule() and so improving the performance since
> > >  	 * the ptracer has no reason to sleep.
> > > +	 *
> > > +	 * This optimisation is not doable on PREEMPT_RT due to the spinlock_t
> > > +	 * within the preempt-disable section.
> > >  	 */
> > > -	preempt_disable();
> > > +	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
> > > +		preempt_disable();
> >
> > Not only we the problems with cgroup_enter_frozen(), afaics (please correct me)
> > this optimisation doesn't work on RT anyway?
> >
> > IIUC, read_lock() on RT disables migration but not preemption, so it is simply
> > too late to do preempt_disable() before unlock/schedule. The tracer can preempt
> > the tracee right after do_notify_parent_cldstop().
>
> Correct -- but I think you can disable preemption over what is
> effectivly rwsem_up_read(), but you can't over the effective
> rtmutex_lock() that cgroup_enter_frozen() will then attempt.
>
> (iow, unlock() doesn't tend to sleep, while lock() does)
>
> But you're correct to point out that the whole preempt_disable() thing
> is entirely pointless due to the whole task_lock region being
> preemptible before it.

Thanks Peter.

So I think the comment should be updated. Otherwise it looks as if it makes
sense to try to move cgroup_enter_frozen() up before preempt_disable().

Oleg.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 2/2 v2] signal: Don't disable preemption in ptrace_stop() on PREEMPT_RT.
  2023-06-06 11:38       ` Oleg Nesterov
@ 2023-06-06 13:16         ` Sebastian Andrzej Siewior
  2023-06-06 13:20           ` Oleg Nesterov
  0 siblings, 1 reply; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-06-06 13:16 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Peter Zijlstra, linux-kernel, Eric W. Biederman, Thomas Gleixner

On PREEMPT_RT keeping preemption disabled during the invocation of
cgroup_enter_frozen() is a problem because the function acquires css_set_lock
which is a sleeping lock on PREEMPT_RT and must not be acquired with disabled
preemption.
The preempt-disabled section is only for performance optimisation
reasons and can be avoided.

Extend the comment and don't disable preemption before scheduling on
PREEMPT_RT.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---

Is this better?

v1…v2:
  - Extend the comment to note that preemption isn't disabled due to
    the lock to make it obvious that the optimisation isn't just
    harmful but also pointless.

 kernel/signal.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index da017a5461163..dcb0b1fbcb3a8 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2328,11 +2328,20 @@ static int ptrace_stop(int exit_code, int why, unsigned long message,
 	 * The preempt-disable section ensures that there will be no preemption
 	 * between unlock and schedule() and so improving the performance since
 	 * the ptracer has no reason to sleep.
+	 *
+	 * On PREEMPT_RT locking tasklist_lock does not disable preemption.
+	 * Therefore the task can be preempted (after
+	 * do_notify_parent_cldstop()) before unlocking tasklist_lock so there
+	 * is no benefit in doing this. The optimisation is harmful on
+	 * PEEMPT_RT because the spinlock_t (in cgroup_enter_frozen()) must not
+	 * be acquired with disabled preemption.
 	 */
-	preempt_disable();
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_disable();
 	read_unlock(&tasklist_lock);
 	cgroup_enter_frozen();
-	preempt_enable_no_resched();
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_enable_no_resched();
 	schedule();
 	cgroup_leave_frozen(true);
 
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2 v2] signal: Don't disable preemption in ptrace_stop() on PREEMPT_RT.
  2023-06-06 13:16         ` [PATCH 2/2 v2] " Sebastian Andrzej Siewior
@ 2023-06-06 13:20           ` Oleg Nesterov
  0 siblings, 0 replies; 9+ messages in thread
From: Oleg Nesterov @ 2023-06-06 13:20 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Peter Zijlstra, linux-kernel, Eric W. Biederman, Thomas Gleixner

On 06/06, Sebastian Andrzej Siewior wrote:
>
> v1…v2:
>   - Extend the comment to note that preemption isn't disabled due to
>     the lock to make it obvious that the optimisation isn't just
>     harmful but also pointless.

Thanks,

Acked-by: Oleg Nesterov <oleg@redhat.com>



^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-06-06 13:22 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-06-06  8:55 [PATCH REPOST 0/2] signal: Avoid preempt_disable() in ptrace_stop() on PREEMPT_RT Sebastian Andrzej Siewior
2023-06-06  8:55 ` [PATCH REPOST 1/2] signal: Add proper comment about the preempt-disable in ptrace_stop() Sebastian Andrzej Siewior
2023-06-06 11:06   ` Oleg Nesterov
2023-06-06  8:55 ` [PATCH REPOST 2/2] signal: Don't disable preemption in ptrace_stop() on PREEMPT_RT Sebastian Andrzej Siewior
2023-06-06 11:04   ` Oleg Nesterov
2023-06-06 11:14     ` Peter Zijlstra
2023-06-06 11:38       ` Oleg Nesterov
2023-06-06 13:16         ` [PATCH 2/2 v2] " Sebastian Andrzej Siewior
2023-06-06 13:20           ` Oleg Nesterov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox