public inbox for linux-kernel@vger.kernel.org
From: Michal Schmidt <mschmidt@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Jon Masters <jcm@redhat.com>,
	Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Subject: [PATCH] kthread: always create the kernel threads with normal priority
Date: Mon, 7 Jan 2008 11:06:03 +0100
Message-ID: <20080107110603.09b72450@brian.englab.brq.redhat.com>
In-Reply-To: <20071222013021.db2528cb.akpm@linux-foundation.org>

On Sat, 22 Dec 2007 01:30:21 -0800
Andrew Morton <akpm@linux-foundation.org> wrote:

> On Mon, 17 Dec 2007 23:43:14 +0100 Michal Schmidt
> <mschmidt@redhat.com> wrote:
> 
> > kthreadd, the creator of other kernel threads, runs as a normal
> > priority task. This is a potential for priority inversion when a
> > task wants to spawn a high-priority kernel thread. A middle priority
> > SCHED_FIFO task can block kthreadd's execution indefinitely and thus
> > prevent the timely creation of the high-priority kernel thread.
> >     
> > This causes a practical problem. When a runaway real-time task is
> > eating 100% CPU and we attempt to put the CPU offline, sometimes we
> > block while waiting for the creation of the highest-priority
> > "kstopmachine" thread. 
> > 
> > The fix is to run kthreadd with the highest possible SCHED_FIFO
> > priority. Its children must still run as slightly negatively reniced
> > SCHED_NORMAL tasks.
> 
> Did you hit this problem with the stock kernel, or have you been
> working on other stuff?

This was with RHEL5 and with current Fedora kernels.

> A locked-up SCHED_FIFO process will cause kernel threads all sorts of
> problems.  You've hit one instance, but there will be others.
> (pdflush stops working, for one).
> 
> The general approach we've taken to this is "don't do that".  Yes, we
> could boost lots of kernel threads in the way which this patch does
> but this actually takes control *away* from userspace.  Userspace no
> longer has the ability to guarantee itself minimum possible latency
> without getting preempted by kernel threads.
> 
> And yes, giving userspace this minimum-latency capability does imply
> that userspace has a responsibility to not 100% starve kernel
> threads.  It's a reasonable compromise, I think?

You're right. We should not run kthreadd with SCHED_FIFO by default.
But the user should be able to change it with chrt if he wants to
avoid this particular problem. So how about this instead?



kthreadd, the creator of other kernel threads, runs as a normal-priority task.
This creates the potential for priority inversion when a task wants to spawn a
high-priority kernel thread. A middle-priority SCHED_FIFO task can block
kthreadd's execution indefinitely and thus prevent the timely creation of the
high-priority kernel thread.

This causes a practical problem. When a runaway real-time task is eating 100%
CPU and we attempt to put the CPU offline, sometimes we block while waiting for
the creation of the highest-priority "kstopmachine" thread.

This could be solved by always running kthreadd with the highest possible
SCHED_FIFO priority, but that would be an undesirable policy decision in the
kernel. kthreadd would cause unwanted latencies even for the realtime users who
know what they're doing.

Let's not make the decision for the user. Just allow the administrator to
change kthreadd's priority safely if he chooses to do it. Ensure that the
kernel threads are created with the usual nice level even if kthreadd's
priority is changed from the default.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
---
 kernel/kthread.c |   11 +++++++++++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index dcfe724..e832a85 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -94,10 +94,21 @@ static void create_kthread(struct kthread_create_info *create)
 	if (pid < 0) {
 		create->result = ERR_PTR(pid);
 	} else {
+		struct sched_param param = { .sched_priority = 0 };
 		wait_for_completion(&create->started);
 		read_lock(&tasklist_lock);
 		create->result = find_task_by_pid(pid);
 		read_unlock(&tasklist_lock);
+		/*
+		 * root may want to change our (kthreadd's) priority to
+		 * realtime to solve a corner case priority inversion problem
+		 * (a realtime task consuming 100% CPU blocking the creation of
+		 * kernel threads). The kernel thread should not inherit the
+		 * higher priority. Let's always create it with the usual nice
+		 * level.
+		 */
+		sched_setscheduler(create->result, SCHED_NORMAL, &param);
+		set_user_nice(create->result, -5);
 	}
 	complete(&create->done);
 }
-- 
1.5.3.3
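
For readers who want to try the two halves of this from userspace, here is a
rough sketch. It is an analogue only, not the kernel code in the patch:
the helper names boost_to_fifo() and reset_to_normal() are made up for
illustration, and finding kthreadd's pid is left to the caller. The first
helper does roughly what "chrt -f -p <max> <pid>" would do; the second mirrors
the sched_setscheduler()/set_user_nice() pair from the patch using the
userspace calls sched_setscheduler(2) and setpriority(2).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <sys/resource.h>
#include <sys/types.h>

/*
 * Hypothetical helper: move task `pid` to SCHED_FIFO at the highest
 * priority the system reports. Needs CAP_SYS_NICE (typically root).
 * Returns 0 on success, -1 with errno set on failure.
 */
int boost_to_fifo(pid_t pid)
{
	struct sched_param param;
	int max = sched_get_priority_max(SCHED_FIFO);

	if (max == -1)
		return -1;
	param.sched_priority = max;
	return sched_setscheduler(pid, SCHED_FIFO, &param);
}

/*
 * Hypothetical helper: put task `pid` back on the default time-sharing
 * policy and apply a nice level. SCHED_OTHER is the userspace name for
 * what the kernel calls SCHED_NORMAL. pid 0 means the calling task.
 * Returns 0 on success, -1 with errno set on failure.
 */
int reset_to_normal(pid_t pid, int nice_level)
{
	struct sched_param param = { .sched_priority = 0 };

	assert(nice_level >= -20 && nice_level <= 19);

	/* Non-realtime policies require sched_priority == 0. */
	if (sched_setscheduler(pid, SCHED_OTHER, &param) == -1)
		return -1;

	/* A negative nice level (e.g. -5) normally needs CAP_SYS_NICE. */
	return setpriority(PRIO_PROCESS, (id_t)pid, nice_level);
}
```

An unprivileged process can call reset_to_normal(0, 19) on itself, since
raising the nice value is always allowed. Reproducing the patch's -5 or
calling boost_to_fifo() requires the appropriate privilege.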

