From: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Tejun Heo <tj@kernel.org>, Petr Mladek <pmladek@suse.com>,
Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
Jan Kara <jack@suse.cz>,
Andrew Morton <akpm@linux-foundation.org>,
Peter Zijlstra <peterz@infradead.org>,
Rafael Wysocki <rjw@rjwysocki.net>, Pavel Machek <pavel@ucw.cz>,
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
linux-kernel@vger.kernel.org,
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Subject: Re: [RFC][PATCHv6 00/12] printk: introduce printing kernel thread
Date: Thu, 28 Dec 2017 15:48:58 +0900 [thread overview]
Message-ID: <20171228064858.GA892@jagdpanzerIV> (raw)
In-Reply-To: <20171221231932.27727fab@vmware.local.home>
Hello,
On (12/21/17 23:19), Steven Rostedt wrote:
[..]
> > I wrote this before but this isn't a theoretical problem. We see
> > these stalls a lot. Preemption isn't enabled to begin with. Memory
> > pressure is high and OOM triggers and printk starts printing out OOM
> > warnings; then, a network packet comes in which triggers allocations
> > in the network layer, which fails due to memory pressure, which then
> > generates memory allocation failure messages which then generates
> > netconsole packets which then tries to allocate more memory and so on.
>
> It doesn't matter if preemption is enabled or not. The hand off should
> happen either way.
preemption does matter. it matters so much, that it makes the first
thing your patch depends on to be questionable - if CPUA stuck
in console_unlock() printing other CPU's messages, it's
because there are currently printk()-s happening on CPUB-CPUZ.
which is not always true. with preemption enabled CPUA can stuck in
console_unlock() because printk()-s from CPUB-CPUZ could have happened
seconds or even minutes ago. I have demonstrated it with the traces; it
didn't convenience you. OK, I sat down and went trough Tetsuo's reports
starting from 2016:
https://marc.info/?l=linux-kernel&m=151375384500555
and Tetsuo has reported several times that printing CPU can sleep for
minutes with console_sem being locked; while other CPUs are happy to
printk()->log_store() as much as they want. and that's why I believe
that that "The hand off should happen either way" is dangerous, especially
when we hand off from a preemptible context to an atomic context. you
don't think that this is a problem, because your patch requires logbuf to
be small enough for any atomic context (irq, under spin_lock, etc) to be
able to print out all the pending messages to all registered consoles [we
print to consoles sequentially] within watchdog_threshold seconds (10
seconds by default). who does adjust the size of the logbuf in a way that
console_unlock() can print the entire logbuf under 10 seconds?
> > It's just that there's no one else to give that flushing duty too, so
> > the pingpoinging that your patch implements can't really help
> > anything.
> >
> > You argue that it isn't made worse by your patch, which may be true
> > but your patch doesn't solve actual problems and is most likely
> > unnecessary complication which gets in the way for the actual
> > solution. It's a weird argument to push an approach which is
> > fundamentally broken. Let's please stop that.
>
> BULLSHIT!
>
> It's not a complex solution, and coming from the cgroup and workqueue
> maintainer, that's a pretty ironic comment.
>
> It has already been proven that it can solve problems:
>
> http://lkml.kernel.org/r/20171219143147.GB15210@dhcp22.suse.cz
and Tetsuo also said that his test does not cover all the scenarios;
and he has sent us all a pretty clear warning:
: Therefore, while it is true that any approach would survive my
: environment, it is dangerous to assume that any approach is safe
: for my customer's enterprise servers.
https://marc.info/?l=linux-kernel&m=151377161705573
when it comes to lockups, printk() design is so flexible, that one can
justify nearly every patch no matter how many regressions it introduces,
by simply saying:
"but the same lockup scenario could have happened even before my
patch. it's just printk() was designed this way. you were lucky
enough not to hit the problem before; now you are less lucky".
been there, done that. it's a trap.
> You don't think handing off printks to an offloaded thread isn't more
> complex nor can it cause even more issues (like more likely to lose
> relevant information on kernel crashes)?
printk_kthread *used* to be way too independent. basically what we had
before was
bad_function()
printk()
vprintk_emit()
{
if (oops_in_progress)
can_offload_to_printk = false;
if (can_offload_to_printk)
wake_up(printk_kthread);
else if (console_trylock())
console_unlock();
}
the bad things:
- first, we do not take into account the fact that printk_kthread can
already be active.
- second, we do not take into account the fact that printk_kthread can
be way-way behind - a huge gap between `console_seq' and `log_next_seq'.
- third, we do not take into account the fact that printk_kthread can
be preempted under console_sem.
so, IOW, the CPU which was in trouble declared the "oh, we should not
offload to printk kthread" emergency only for itself, basically. the CPU
which printk_kthread was running on or sleeping on [preemption] did not
care, neither did printk_kthread.
what we have now:
- printk_kthread cannot be preempted under console_sem. if it acquires
the console_sem it stays RUNNING, printing the messages.
- printk_kthread does check for emergency on other CPUs. right before
it starts printing the next pending logbuf line (every line).
IOW, instead of that:
console_unlock()
{
again:
for (;;) {
call_console_drivers();
}
up();
if (more_pending_messages && down_trylock())
goto again;
}
it now does this:
console_unlock()
{
preempt_disable();
again:
for (;;) {
if (something_is_not_right)
break;
if (printing_for_too_long)
break;
call_console_drivers();
}
up();
preempt_enable();
if (!something_is_not_right && !printing_for_too_long &&
more_pending_messages && down_trylock())
goto again;
}
so if CPUA declares emergency (oops_in_progress, etc) then printk_kthread,
if it's running, will get out of CPUA's way as soon as possible. so I
don't think we are still *more likely* to lose relevant information on
kernel crashes because of printk_kthread.
and I'm actually thinking about returning back the old vprintk_emit()
behavior
vprintk_emit()
{
+ preempt_disable();
if (console_trylock())
console_unlock();
+ preempt_enable();
}
and letting console_unlock() to offload when needed. preemption in
vprintk_emit() is no good.
-ss
next prev parent reply other threads:[~2017-12-28 6:48 UTC|newest]
Thread overview: 79+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-12-04 13:48 [RFC][PATCHv6 00/12] printk: introduce printing kernel thread Sergey Senozhatsky
2017-12-04 13:48 ` [RFC][PATCHv6 01/12] printk: move printk_pending out of per-cpu Sergey Senozhatsky
2017-12-04 13:48 ` [RFC][PATCHv6 02/12] printk: introduce printing kernel thread Sergey Senozhatsky
2017-12-04 13:48 ` [RFC][PATCHv6 03/12] printk: consider watchdogs thresholds for offloading Sergey Senozhatsky
2017-12-04 13:48 ` [RFC][PATCHv6 04/12] printk: add sync printk_emergency API Sergey Senozhatsky
2017-12-04 13:48 ` [RFC][PATCHv6 05/12] printk: enable printk offloading Sergey Senozhatsky
2017-12-04 13:48 ` [RFC][PATCHv6 06/12] PM: switch between printk emergency modes Sergey Senozhatsky
2017-12-04 13:48 ` [RFC][PATCHv6 07/12] printk: register syscore notifier Sergey Senozhatsky
2017-12-04 13:48 ` [RFC][PATCHv6 08/12] printk: force printk_kthread to offload printing Sergey Senozhatsky
2017-12-04 13:48 ` [RFC][PATCHv6 09/12] printk: do not cond_resched() when we can offload Sergey Senozhatsky
2017-12-04 13:48 ` [RFC][PATCHv6 10/12] printk: move offloading logic to per-cpu Sergey Senozhatsky
2017-12-04 13:48 ` [RFC][PATCHv6 11/12] printk: add offloading watchdog API Sergey Senozhatsky
2017-12-04 13:48 ` [RFC][PATCHv6 12/12] printk: improve printk offloading mechanism Sergey Senozhatsky
2017-12-04 13:53 ` [PATCH 0/4] printk: offloading testing module/trace events Sergey Senozhatsky
2017-12-04 13:53 ` [PATCH 1/4] printk/lib: add offloading trace events and test_printk module Sergey Senozhatsky
2017-12-04 13:53 ` [PATCH 2/4] printk/lib: simulate slow consoles Sergey Senozhatsky
2017-12-04 13:53 ` [PATCH 3/4] printk: add offloading takeover traces Sergey Senozhatsky
2017-12-04 13:53 ` [PATCH 4/4] printk: add task name and CPU to console messages Sergey Senozhatsky
2017-12-14 14:27 ` [RFC][PATCHv6 00/12] printk: introduce printing kernel thread Petr Mladek
2017-12-14 14:39 ` Sergey Senozhatsky
2017-12-15 15:55 ` Steven Rostedt
2017-12-14 15:25 ` Tejun Heo
2017-12-14 17:55 ` Steven Rostedt
2017-12-14 18:11 ` Tejun Heo
2017-12-14 18:21 ` Steven Rostedt
2017-12-22 0:09 ` Tejun Heo
2017-12-22 4:19 ` Steven Rostedt
2017-12-28 6:48 ` Sergey Senozhatsky [this message]
2017-12-28 10:07 ` Sergey Senozhatsky
2017-12-29 13:59 ` Tetsuo Handa
2017-12-31 1:44 ` Sergey Senozhatsky
2018-01-09 20:06 ` Tejun Heo
2018-01-09 22:08 ` Tetsuo Handa
2018-01-09 22:17 ` Tejun Heo
2018-01-11 11:14 ` Tetsuo Handa
2018-01-09 22:08 ` Steven Rostedt
2018-01-09 22:17 ` Tejun Heo
2018-01-09 22:47 ` Steven Rostedt
2018-01-09 22:53 ` Tejun Heo
2018-01-10 7:18 ` Steven Rostedt
2018-01-10 14:04 ` Tejun Heo
2017-12-15 2:10 ` Sergey Senozhatsky
2017-12-15 3:18 ` Steven Rostedt
2017-12-15 5:06 ` Sergey Senozhatsky
2017-12-15 6:52 ` Sergey Senozhatsky
2017-12-15 15:39 ` Steven Rostedt
2017-12-15 8:31 ` Petr Mladek
2017-12-15 8:42 ` Sergey Senozhatsky
2017-12-15 9:08 ` Petr Mladek
2017-12-15 15:47 ` Steven Rostedt
2017-12-18 9:36 ` Sergey Senozhatsky
2017-12-18 10:36 ` Sergey Senozhatsky
2017-12-18 12:35 ` Sergey Senozhatsky
2017-12-18 13:51 ` Petr Mladek
2017-12-18 13:31 ` Petr Mladek
2017-12-18 13:39 ` Sergey Senozhatsky
2017-12-18 14:13 ` Petr Mladek
2017-12-18 17:46 ` Steven Rostedt
2017-12-19 1:03 ` Sergey Senozhatsky
2017-12-19 1:08 ` Steven Rostedt
2017-12-19 1:24 ` Sergey Senozhatsky
2017-12-19 2:03 ` Steven Rostedt
2017-12-19 2:46 ` Sergey Senozhatsky
2017-12-19 3:38 ` Steven Rostedt
2017-12-19 4:58 ` Sergey Senozhatsky
2017-12-19 14:40 ` Steven Rostedt
2017-12-20 7:46 ` Sergey Senozhatsky
2017-12-19 14:31 ` Michal Hocko
2017-12-20 7:10 ` Sergey Senozhatsky
2017-12-20 12:06 ` Tetsuo Handa
2017-12-21 6:52 ` Sergey Senozhatsky
2017-12-19 4:36 ` Sergey Senozhatsky
2017-12-18 14:10 ` Petr Mladek
2017-12-19 1:09 ` Sergey Senozhatsky
2017-12-15 15:42 ` Steven Rostedt
2017-12-15 15:19 ` Steven Rostedt
2017-12-19 0:52 ` Sergey Senozhatsky
2017-12-19 1:03 ` Steven Rostedt
2018-01-05 2:54 ` Sergey Senozhatsky
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171228064858.GA892@jagdpanzerIV \
--to=sergey.senozhatsky.work@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=jack@suse.cz \
--cc=linux-kernel@vger.kernel.org \
--cc=pavel@ucw.cz \
--cc=penguin-kernel@I-love.SAKURA.ne.jp \
--cc=peterz@infradead.org \
--cc=pmladek@suse.com \
--cc=rjw@rjwysocki.net \
--cc=rostedt@goodmis.org \
--cc=sergey.senozhatsky@gmail.com \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.