From: Oleg Nesterov <oleg@redhat.com>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: joseph.salisbury@canonical.com, JBottomley@parallels.com,
Nagalakshmi.Nandigama@lsi.com, Sreekanth.Reddy@lsi.com,
rientjes@google.com, akpm@linux-foundation.org,
torvalds@linux-foundation.org, tj@kernel.org, tglx@linutronix.de,
linux-kernel@vger.kernel.org, kernel-team@lists.ubuntu.com,
linux-scsi@vger.kernel.org, thenzl@redhat.com
Subject: Re: please fix FUSION (Was: [v3.13][v3.14][Regression]kthread:makekthread_create()killable)
Date: Sat, 22 Mar 2014 20:25:11 +0100 [thread overview]
Message-ID: <20140322192511.GA14811@redhat.com> (raw)
In-Reply-To: <201403221525.FJC69759.MSLOtHOJOVFFQF@I-love.SAKURA.ne.jp>
On 03/22, Tetsuo Handa wrote:
>
> Oleg Nesterov wrote:
> > Tetsuo, what do you think?
>
> I don't like blocking SIGKILL while doing operations that depend on memory
> allocation by other processes. If the OOM killer is triggered and it chose
> the process blocking SIGKILL in mptsas_init() (I know it unlikely happens),
> it generates the OOM killer deadlock.
It seems that we do not understand each other.
I do not like this hack too. And it is even wrong, you can't really block
SIGKILL. And it is unnecessary in a sense that (I think) it is fine that
module_init() reacts to SIGKILL and aborts, just the fact it crashes the
kernel in the error paths is not fine.
The driver should be fixed anyway. As for timeout, either userspace/systemd
should be changed to not send SIGKILL after 30 secs, or (better) the driver
should be changed to not waste 30 secs.
The hack I sent can only serve as a short term solution, it should be
reverted once we have something better. And, otoh, this hack only affects
the problematic driver which should be fixed in any case.
Do you see my point? I can be wrong, but I think that you constantly
misunderstand the intent.
> My preference is to fix kthread_create() to handle SIGKILL gracefully.
And now I do not understand you too. I do not understand why should we
"fix" kthread_create().
> Many kernel operations (e.g. mutex_lock() wait_event() wait_for_completion())
> ignore SIGKILL and the current users depend on SIGKILL being ignored. Thus,
> commit 786235ee sounds like a kernel API breakage.
Personally I do not really think so, but OK. In this case lets revert
786235ee.
> Commit 786235ee "kthread: make kthread_create() killable" changed to
> leave kthread_create() as soon as receiving SIGKILL. But this change
> caused boot failures because systemd-udevd receives SIGKILL due to timeout
> while loading SCSI controller drivers using finit_module() [1].
And I still think that 786235ee simply uncovered the problems in this
driver. Perhaps we should change kthread_create() for some reason, but
(imho) not because we need to help the buggy code.
> Therefore, this patch changes kthread_create() from "wait forever in
> killable state" to "wait for 10 seconds in unkillable state, check for
> the OOM killer every second".
Personally I dislike this change. In fact I think it is ugly. But this
is only my opinion.
If you convince someone to take this patch I won't argue.
> This patch also changes the return value of timeout case from -ENOMEM
> to -EINTR because -ENOMEM could make sense for only TIF_MEMDIE case.
>
> [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1276705
>
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> ---
> kernel/kthread.c | 37 +++++++++++++++++++++----------------
> 1 files changed, 21 insertions(+), 16 deletions(-)
>
> diff --git a/kernel/kthread.c b/kernel/kthread.c
> index b5ae3ee..6a25a9f 100644
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -269,6 +269,7 @@ struct task_struct *kthread_create_on_node(int (*threadfn)(void *data),
> const char namefmt[],
> ...)
> {
> + int i = 0;
> DECLARE_COMPLETION_ONSTACK(done);
> struct task_struct *task;
> struct kthread_create_info *create = kmalloc(sizeof(*create),
> @@ -287,24 +288,28 @@ struct task_struct *kthread_create_on_node(int (*threadfn)(void *data),
>
> wake_up_process(kthreadd_task);
> /*
> - * Wait for completion in killable state, for I might be chosen by
> - * the OOM killer while kthreadd is trying to allocate memory for
> - * new kernel thread.
> + * Wait for completion with 10 seconds timeout. Unless the kthreadd is
> + * blocked for direct memory reclaim path, the kthreadd will be able to
> + * complete the request within 10 seconds. Also, check every second if
> + * I was chosen by the OOM killer in order to avoid the OOM killer
> + * deadlock.
> */
> - if (unlikely(wait_for_completion_killable(&done))) {
> - /*
> - * If I was SIGKILLed before kthreadd (or new kernel thread)
> - * calls complete(), leave the cleanup of this structure to
> - * that thread.
> - */
> - if (xchg(&create->done, NULL))
> - return ERR_PTR(-ENOMEM);
> - /*
> - * kthreadd (or new kernel thread) will call complete()
> - * shortly.
> - */
> - wait_for_completion(&done);
> + do {
> + if (likely(wait_for_completion_timeout(&done, HZ)))
> + goto ready;
> + } while (i++ < 10 && !test_thread_flag(TIF_MEMDIE));
> + /*
> + * The kthreadd was unable to complete the request within 10 seconds
> + * (or I was chosen by the OOM killer). Thus, give up and leave the
> + * cleanup of this structure to the kthreadd (or new kernel thread).
> + */
> + if (xchg(&create->done, NULL)) {
> + WARN(1, "Gave up waiting for kthreadd.\n");
> + return ERR_PTR(-EINTR);
> }
> + /* kthreadd (or new kernel thread) will call complete() shortly. */
> + wait_for_completion(&done);
> +ready:
> task = create->result;
> if (!IS_ERR(task)) {
> static const struct sched_param param = { .sched_priority = 0 };
> --
> 1.7.1
next prev parent reply other threads:[~2014-03-22 19:25 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-03-14 20:46 [v3.13][v3.14][Regression] kthread: make kthread_create() killable Joseph Salisbury
2014-03-15 0:43 ` Tetsuo Handa
2014-03-16 15:13 ` Tetsuo Handa
2014-03-16 16:25 ` Oleg Nesterov
2014-03-17 12:38 ` [v3.13][v3.14][Regression] kthread: make kthread_create()killable Tetsuo Handa
2014-03-17 14:22 ` Oleg Nesterov
2014-03-18 12:03 ` [v3.13][v3.14][Regression] kthread: makekthread_create()killable Tetsuo Handa
2014-03-18 17:16 ` Oleg Nesterov
2014-03-19 11:49 ` [v3.13][v3.14][Regression] kthread:makekthread_create()killable Tetsuo Handa
2014-03-19 16:13 ` Joseph Salisbury
2014-03-19 17:52 ` Oleg Nesterov
2014-03-19 18:29 ` please fix FUSION (Was: [v3.13][v3.14][Regression] kthread:makekthread_create()killable) Oleg Nesterov
2014-03-19 19:42 ` Oleg Nesterov
2014-03-19 21:04 ` Joseph Salisbury
2014-03-20 16:46 ` Joseph Salisbury
2014-03-20 19:23 ` Oleg Nesterov
2014-03-21 18:34 ` Oleg Nesterov
2014-03-21 19:32 ` Linus Torvalds
2014-03-21 20:31 ` Oleg Nesterov
2014-03-21 22:56 ` James Bottomley
2014-03-22 6:25 ` please fix FUSION (Was: [v3.13][v3.14][Regression]kthread:makekthread_create()killable) Tetsuo Handa
2014-03-22 19:25 ` Oleg Nesterov [this message]
2014-03-22 20:48 ` James Bottomley
2014-03-24 17:01 ` Oleg Nesterov
2014-03-22 21:25 ` Thomas Gleixner
2014-03-22 22:01 ` Thomas Gleixner
2014-03-22 23:57 ` please fix FUSION (Was:[v3.13][v3.14][Regression]kthread:makekthread_create()killable) Tetsuo Handa
2014-03-23 8:04 ` Thomas Gleixner
2014-03-23 14:19 ` James Bottomley
2014-03-23 14:28 ` Thomas Gleixner
2014-03-23 14:29 ` James Bottomley
2014-03-22 23:50 ` Tetsuo Handa
2014-03-17 20:02 ` [v3.13][v3.14][Regression] kthread: make kthread_create() killable Andrew Morton
2014-03-17 20:19 ` Oleg Nesterov
2014-03-17 20:39 ` Andrew Morton
2014-03-18 17:45 ` Oleg Nesterov
2014-06-03 13:03 ` [PATCH] kthread: Fix return value of kthread_create() upon SIGKILL Tetsuo Handa
2014-06-03 21:35 ` David Rientjes
2014-03-17 21:32 ` [v3.13][v3.14][Regression] kthread: make kthread_create()killable Tetsuo Handa
2014-03-17 23:18 ` [v3.13][v3.14][Regression] kthread: make kthread_create() killable One Thousand Gnomes
2014-03-18 17:50 ` Oleg Nesterov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140322192511.GA14811@redhat.com \
--to=oleg@redhat.com \
--cc=JBottomley@parallels.com \
--cc=Nagalakshmi.Nandigama@lsi.com \
--cc=Sreekanth.Reddy@lsi.com \
--cc=akpm@linux-foundation.org \
--cc=joseph.salisbury@canonical.com \
--cc=kernel-team@lists.ubuntu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
--cc=penguin-kernel@I-love.SAKURA.ne.jp \
--cc=rientjes@google.com \
--cc=tglx@linutronix.de \
--cc=thenzl@redhat.com \
--cc=tj@kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.