* [patch] don't preempt not TASK_RUNNING tasks
@ 2009-03-20  9:43 Miklos Szeredi
  2009-03-20 10:03 ` Peter Zijlstra
  0 siblings, 1 reply; 6+ messages in thread
From: Miklos Szeredi @ 2009-03-20  9:43 UTC (permalink / raw)
  To: mingo
  Cc: peterz, roland, efault, rjw, jdike, user-mode-linux-devel,
	linux-kernel, torvalds, akpm

Ingo,

I tested this one, and I think it makes sense in any case as an
optimization.  It should also be good for -stable kernels.

Does it look OK?

Thanks,
Miklos
----

From: Miklos Szeredi <mszeredi@suse.cz>

This patch fixes bug #12208:

  http://bugzilla.kernel.org/show_bug.cgi?id=12208

Don't preempt tasks in preempt_schedule() if they are already in the
process of going to sleep.  Otherwise the task would wake up only to
go to sleep again.

Due to the way wait_task_inactive() works this can also drastically
slow down ptrace:

 - task A is ptracing task B
 - task B stops on a trace event
 - task A is woken up and preempts task B
 - task A calls ptrace on task B, which does ptrace_check_attach()
 - this calls wait_task_inactive(), which sees that task B is still on the runq
 - task A goes to sleep for a jiffy
 - ...

Since UML does lots of the above sequences, those jiffies quickly add
up to make it slow as hell.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: stable@kernel.org
---
 kernel/sched.c |    4 ++++
 1 file changed, 4 insertions(+)

Index: linux.git/kernel/sched.c
===================================================================
--- linux.git.orig/kernel/sched.c	2009-03-20 09:40:47.000000000 +0100
+++ linux.git/kernel/sched.c	2009-03-20 10:28:56.000000000 +0100
@@ -4632,6 +4632,10 @@ asmlinkage void __sched preempt_schedule
 	if (likely(ti->preempt_count || irqs_disabled()))
 		return;
 
+	/* No point in preempting if we are just about to go to sleep. */
+	if (current->state != TASK_RUNNING)
+		return;
+
 	do {
 		add_preempt_count(PREEMPT_ACTIVE);
 		schedule();


* Re: [patch] don't preempt not TASK_RUNNING tasks
  2009-03-20  9:43 [patch] don't preempt not TASK_RUNNING tasks Miklos Szeredi
@ 2009-03-20 10:03 ` Peter Zijlstra
  2009-03-20 10:37   ` Miklos Szeredi
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2009-03-20 10:03 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: mingo, roland, efault, rjw, jdike, user-mode-linux-devel,
	linux-kernel, torvalds, akpm

On Fri, 2009-03-20 at 10:43 +0100, Miklos Szeredi wrote:
> Ingo,
> 
> I tested this one, and I think it makes sense in any case as an
> optimization.  It should also be good for -stable kernels.
> 
> Does it look OK?

The idea is good, but there is a risk of preemption latencies here. Some
code paths aren't real quick between setting ->state != TASK_RUNNING and
calling schedule.

[ Both quick: as in O(1) and few instructions ]

So if we're going to do this, we'd need to audit all such code paths --
and there be lots.

The first line of attack for this problem is making wait_task_inactive()
suck less, which shouldn't be too hard; that unconditional 1 jiffy
sleep is simply retarded.

> Index: linux.git/kernel/sched.c
> ===================================================================
> --- linux.git.orig/kernel/sched.c	2009-03-20 09:40:47.000000000 +0100
> +++ linux.git/kernel/sched.c	2009-03-20 10:28:56.000000000 +0100
> @@ -4632,6 +4632,10 @@ asmlinkage void __sched preempt_schedule
>  	if (likely(ti->preempt_count || irqs_disabled()))
>  		return;
>  
> +	/* No point in preempting if we are just about to go to sleep. */
> +	if (current->state != TASK_RUNNING)
> +		return;
> +
>  	do {
>  		add_preempt_count(PREEMPT_ACTIVE);
>  		schedule();


* Re: [patch] don't preempt not TASK_RUNNING tasks
  2009-03-20 10:03 ` Peter Zijlstra
@ 2009-03-20 10:37   ` Miklos Szeredi
  2009-03-20 10:53     ` Ingo Molnar
  0 siblings, 1 reply; 6+ messages in thread
From: Miklos Szeredi @ 2009-03-20 10:37 UTC (permalink / raw)
  To: peterz
  Cc: miklos, mingo, roland, efault, rjw, jdike, user-mode-linux-devel,
	linux-kernel, torvalds, akpm

On Fri, 20 Mar 2009, Peter Zijlstra wrote:
> On Fri, 2009-03-20 at 10:43 +0100, Miklos Szeredi wrote:
> > Ingo,
> > 
> > I tested this one, and I think it makes sense in any case as an
> > optimization.  It should also be good for -stable kernels.
> > 
> > Does it look OK?
> 
> The idea is good, but there is a risk of preemption latencies here. Some
> code paths aren't real quick between setting ->state != TASK_RUNNING and
> calling schedule.
> 
> [ Both quick: as in O(1) and few instructions ]
> 
> So if we're going to do this, we'd need to audit all such code paths --
> and there be lots.

Oh, yes.

In a random sample the most common pattern is something like this:

	spin_lock(&some_lock);
	/* do something */
	set_task_state(TASK_SOMESLEEP);
	/* do something more */
	spin_unlock(&some_lock);
	schedule();
	...

Which should only be positively impacted by the change.  But I can
imagine rare cases where it's more complex.

> The first line of attack for this problem is making wait_task_inactive()
> suck less, which shouldn't be too hard; that unconditional 1 jiffy
> sleep is simply retarded.

I completely agree.  However, I'd like to have a non-invasive solution
that can go into current and stable kernels so UML users don't need to
suffer any more.

Thanks,
Miklos

> 
> > Index: linux.git/kernel/sched.c
> > ===================================================================
> > --- linux.git.orig/kernel/sched.c	2009-03-20 09:40:47.000000000 +0100
> > +++ linux.git/kernel/sched.c	2009-03-20 10:28:56.000000000 +0100
> > @@ -4632,6 +4632,10 @@ asmlinkage void __sched preempt_schedule
> >  	if (likely(ti->preempt_count || irqs_disabled()))
> >  		return;
> >  
> > +	/* No point in preempting if we are just about to go to sleep. */
> > +	if (current->state != TASK_RUNNING)
> > +		return;
> > +
> >  	do {
> >  		add_preempt_count(PREEMPT_ACTIVE);
> >  		schedule();
> 


* Re: [patch] don't preempt not TASK_RUNNING tasks
  2009-03-20 10:37   ` Miklos Szeredi
@ 2009-03-20 10:53     ` Ingo Molnar
  2009-03-20 11:25       ` Miklos Szeredi
  0 siblings, 1 reply; 6+ messages in thread
From: Ingo Molnar @ 2009-03-20 10:53 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: peterz, roland, efault, rjw, jdike, user-mode-linux-devel,
	linux-kernel, torvalds, akpm


* Miklos Szeredi <miklos@szeredi.hu> wrote:

> On Fri, 20 Mar 2009, Peter Zijlstra wrote:
> > On Fri, 2009-03-20 at 10:43 +0100, Miklos Szeredi wrote:
> > > Ingo,
> > > 
> > > I tested this one, and I think it makes sense in any case as an
> > > optimization.  It should also be good for -stable kernels.
> > > 
> > > Does it look OK?
> > 
> > The idea is good, but there is a risk of preemption latencies here. Some
> > code paths aren't real quick between setting ->state != TASK_RUNNING and
> > calling schedule.
> > 
> > [ Both quick: as in O(1) and few instructions ]
> > 
> > So if we're going to do this, we'd need to audit all such code paths --
> > and there be lots.
> 
> Oh, yes.
> 
> In a random sample the most common pattern is something like this:
> 
> 	spin_lock(&some_lock);
> 	/* do something */
> 	set_task_state(TASK_SOMESLEEP);
> 	/* do something more */
> 	spin_unlock(&some_lock);
> 	schedule();
> 	...
> 
> Which should only be positively impacted by the change.  But I can 
> imagine rare cases where it's more complex.

I'd suggest spin_unlock_no_resched() and task_unlock_no_resched() 
instead of open-coding preempt-disable sequences.

> > The first line of attack for this problem is making 
> > wait_task_inactive() suck less, which shouldn't be too hard; 
> > that unconditional 1 jiffy sleep is simply retarded.
> 
> I completely agree.  However, I'd like to have a non-invasive 
> solution that can go into current and stable kernels so UML users 
> don't need to suffer any more.

Agreed. task_unlock_no_resched() should do that i think.

	Ingo


* Re: [patch] don't preempt not TASK_RUNNING tasks
  2009-03-20 10:53     ` Ingo Molnar
@ 2009-03-20 11:25       ` Miklos Szeredi
  2009-03-20 11:39         ` Ingo Molnar
  0 siblings, 1 reply; 6+ messages in thread
From: Miklos Szeredi @ 2009-03-20 11:25 UTC (permalink / raw)
  To: mingo
  Cc: miklos, peterz, roland, efault, rjw, jdike, user-mode-linux-devel,
	linux-kernel, torvalds, akpm

On Fri, 20 Mar 2009, Ingo Molnar wrote:
> > > The first line of attack for this problem is making 
> > > wait_task_inactive() suck less, which shouldn't be too hard; 
> > > that unconditional 1 jiffy sleep is simply retarded.
> > 
> > I completely agree.  However, I'd like to have a non-invasive 
> > solution that can go into current and stable kernels so UML users 
> > don't need to suffer any more.
> 
> Agreed. task_unlock_no_resched() should do that i think.

I don't see how that would help.

ptrace_stop() specifically would need read_unlock_no_resched().  But
I'm reluctant to add more spinlock functions with all their
variants...

Thanks,
Miklos


* Re: [patch] don't preempt not TASK_RUNNING tasks
  2009-03-20 11:25       ` Miklos Szeredi
@ 2009-03-20 11:39         ` Ingo Molnar
  0 siblings, 0 replies; 6+ messages in thread
From: Ingo Molnar @ 2009-03-20 11:39 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: peterz, roland, efault, rjw, jdike, user-mode-linux-devel,
	linux-kernel, torvalds, akpm


* Miklos Szeredi <miklos@szeredi.hu> wrote:

> On Fri, 20 Mar 2009, Ingo Molnar wrote:
> > > > The first line of attack for this problem is making 
> > > > wait_task_inactive() suck less, which shouldn't be too hard; 
> > > > that unconditional 1 jiffy sleep is simply retarded.
> > > 
> > > I completely agree.  However, I'd like to have a non-invasive 
> > > solution that can go into current and stable kernels so UML users 
> > > don't need to suffer any more.
> > 
> > Agreed. task_unlock_no_resched() should do that i think.
> 
> I don't see how that would help.

it more clearly expresses the need there, and we already have 
_no_resched API variants (we add them on an as-needed basis).

Doing:

 preempt_disable();
 read_lock();
 ...
 read_unlock();
 preempt_enable_no_resched();

Really just open-codes read_unlock_no_resched() and uglifies the 
code.

> ptrace_stop() specifically would need read_unlock_no_resched().  
> But I'm reluctant to add more spinlock functions with all their 
> variants...

if you worry about backportability, we can certainly add the easy 
fix too, if it's followed by the more involved fix(es).

	Ingo
