From: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
To: Eugeniu Rosca <erosca@de.adit-jv.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
linux-kernel@vger.kernel.org,
John Ogness <john.ogness@linutronix.de>,
Petr Mladek <pmladek@suse.com>,
Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ingo Molnar <mingo@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>,
Jisheng Zhang <Jisheng.Zhang@synaptics.com>,
Valdis Kletnieks <valdis.kletnieks@vt.edu>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Andrew Gabbasov <andrew_gabbasov@mentor.com>,
Dirk Behme <dirk.behme@de.bosch.com>,
Eugeniu Rosca <roscaeugeniu@gmail.com>
Subject: Re: [RFC PATCH 3/3] watchdog: Turn console verbosity on when reporting softlockup
Date: Thu, 19 Mar 2020 15:48:36 +0900 [thread overview]
Message-ID: <20200319064836.GB24020@google.com> (raw)
In-Reply-To: <20200318180525.GA5790@lxhi-065.adit-jv.com>
Hi,
On (20/03/18 19:05), Eugeniu Rosca wrote:
[..]
> > I'm afraid, as of now, this approach is not going to work the way it's
> > supposed to work in 100% of cases. Because the only thing that printk()
> > call sort of guarantees is that the message will be stored somewhere.
> > Either in the main kernel log buffer, on in one of auxiliary per-CPU
> > log buffers. It does not guarantee, generally speaking, that the message
> > will be printed on the console immediately.
>
> I take this passage as an acknowledgement of the problem being _real_,
> in spite of the fix being not perfect.
>
> One aspect I would like to emphasize is that (please, NAK this
> statement if it's not accurate) the problem reported in this patch is
> not specific to the existing printk mechanism, but also applies to the
> upcoming kthread-based printk. If that's true, then IMHO this is a
> compelling argument to join forces and try to find a working, safe and
> future-proof solution.
Yes, this is a "known issue".
> > Consider the following example:
> >
> > CPU0 CPU1
> > console_lock();
> > schedule();
> >
> > watchdog()
> > console_verbose_start();
> > printk()
> > log_store()
> > if (!console_trylock())
> > return;
> > console_verbose_end();
> >
> > ...
> > console_unlock()
> > print logbuf messages to the consoles
> > we missed the console_verbose_start/end
> > on CPU1
>
> This looks plausible. However, I wonder to which degree the same
> scenario is a concern in the kthread-based approach?
This is a problem for any async/deferred printk() printout. When we
print messages on the consoles we don't have any idea anymore if the
CPU which added those messages to the logbuf had any issues going on.
> My current standpoint is that as long as points [A-D] are met, it
> should do no harm to accept a (partial) fix like seen in my series:
>
> - [A] the patch tackles at least a subset of problematic use-cases
> - [B] the fix is non-intrusive and easy to review
> - [C] there is hope to reuse it in the new lockless buffer based printk
> - [D] there are no regressions employing the major console knobs
> (ignore_loglevel, quiet, loglevel, etc) as it happened in
> a6ae928c25835c ("Revert "printk: make sure to print log on console."")
So the issue is that when we bump `console_loglevel' or `ignore_loglevel'
we lift restrictions globally. For all CPUs, for all contexts which call
printk(). This may have negative impact. Fuzzing is one example. It
usually generates a lot of printk() noise, so lifting global printk()
log_level restrictions does cause problems. I recall that fuzzing people
really disliked the
old_level = console_loglevel
console_loglevel = new_level
....
console_loglevel = old_level
approach. Because if lets all CPUs and tasks to pollute serial logs.
While what we want is to print messages from a particular CPU/task.
So _maybe_ we can explore the per-context printk() flags instead. The idea
is to store a "desired"/emergency ->log_level. When a context sets that
->log_level we need to make any printk() messages issued by that context
at least of that ->log_level. IOW we overwrite messages log_levels, because
printk() is not synchronous by design.
We had similar discussion before [0].
I was misunderstood back then and in order to avoid the same
misunderstanding this time around, let me first state that - *I DO NOT*
propose to disable preemption for printk(). The code in question
was *just a POC snippet*.
To move that POC further - we can have the same per-context printk()
fields in every task_struct struct. Ideally without consuming additional
memory, or consuming as few extra bytes per task_struct as possible.
But even if we will have ->log_level embedded into every task_struct, we
still need to distinguish various cases (contexts):
- when current task wants to declare printk() emergency and sets
current->log_level, only this process is in printk() emergency.
- IRQs and NMIs which interrupt current should not be in printk()
emergency, unless IRQ and/or NMI explicitly declare printk()
emergency.
- when IRQ declares printk() emergency and sets current->log_level, only
this IRQ is in printk() emergency
- when NMI declares printk() emergency and sets current->log_level, only
this NMI is in printk() emergency
Another thing is. Suppose current sets ->log_level, and then encounters
a page_fault, which printk-s from exception handler. We shall distinguish
such cases too, I think.
So something like
printk_emergency_enter(LOG_LEVEL)
current->context[ctx] = LOG_LEVEL;
...
printk()
// overwrite message log_level to current->context[ctx] level
dump_stack()
// same log_level overwrite
...
printk_emergency_exit()
current->context[ctx] = -1
So when we iterate pending logbuf messages (CPU0->console_unlock() in
the example from my first email) messages from CPU1 will have appropriate
"desired" log_level and we will print those messages on the consoles.
Well. Something like this.
[0] https://lore.kernel.org/linuxppc-dev/20191112021747.GA68506@google.com/
-ss
next prev parent reply other threads:[~2020-03-19 6:48 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-03-15 17:09 [RFC PATCH 0/3] Fix quiet console in pre-panic scenarios Eugeniu Rosca
2020-03-15 17:09 ` [RFC PATCH 1/3] printk: convert ignore_loglevel to atomic_t Eugeniu Rosca
2020-03-15 17:09 ` [RFC PATCH 2/3] printk: add console_verbose_{start,end} Eugeniu Rosca
2020-03-16 21:19 ` [RFC PATCH 2/3] printk: add console_verbose_{start, end} kbuild test robot
2020-03-16 21:30 ` kbuild test robot
2020-03-18 16:00 ` [RFC PATCH 2/3] printk: add console_verbose_{start,end} John Ogness
2020-03-15 17:09 ` [RFC PATCH 3/3] watchdog: Turn console verbosity on when reporting softlockup Eugeniu Rosca
2020-03-17 2:18 ` Sergey Senozhatsky
2020-03-18 18:05 ` Eugeniu Rosca
2020-03-19 6:48 ` Sergey Senozhatsky [this message]
2020-03-19 7:38 ` Valdis Klētnieks
2020-03-19 8:01 ` Sergey Senozhatsky
2020-03-19 8:18 ` Valdis Klētnieks
2020-05-19 14:43 ` Petr Mladek
2020-03-19 8:20 ` John Ogness
2020-03-16 18:35 ` [RFC PATCH 0/3] Fix quiet console in pre-panic scenarios Steven Rostedt
2020-03-16 19:09 ` Sebastian Andrzej Siewior
2020-03-18 14:56 ` Eugeniu Rosca
2020-03-18 15:40 ` John Ogness
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200319064836.GB24020@google.com \
--to=sergey.senozhatsky.work@gmail.com \
--cc=Jisheng.Zhang@synaptics.com \
--cc=andrew_gabbasov@mentor.com \
--cc=bigeasy@linutronix.de \
--cc=dirk.behme@de.bosch.com \
--cc=erosca@de.adit-jv.com \
--cc=john.ogness@linutronix.de \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=peterz@infradead.org \
--cc=pmladek@suse.com \
--cc=roscaeugeniu@gmail.com \
--cc=rostedt@goodmis.org \
--cc=sergey.senozhatsky@gmail.com \
--cc=tglx@linutronix.de \
--cc=valdis.kletnieks@vt.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.