From: Petr Mladek <pmladek@suse.com>
To: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Jan Kara <jack@suse.com>, Tejun Heo <tj@kernel.org>,
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
linux-kernel@vger.kernel.org,
Byungchul Park <byungchul.park@lge.com>,
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
Jan Kara <jack@suse.cz>
Subject: Re: [RFC][PATCH v6 1/2] printk: Make printk() completely async
Date: Tue, 22 Mar 2016 14:11:07 +0100 [thread overview]
Message-ID: <20160322131106.GA5522@pathway.suse.cz> (raw)
In-Reply-To: <1458581130-8677-2-git-send-email-sergey.senozhatsky@gmail.com>
On Tue 2016-03-22 02:25:29, Sergey Senozhatsky wrote:
> From: Jan Kara <jack@suse.cz>
>
> This patch makes printk() completely asynchronous (similar to what
> printk_deferred() did until now). It appends message to the kernel
> printk buffer and wake_up()s a special dedicated kthread to do the
> printing to console. This has the advantage that printing always happens
> from a schedulable contex and thus we don't lockup any particular CPU or
> even interrupts. Also it has the advantage that printk() is fast and
> thus kernel booting is not slowed down by slow serial console.
> Disadvantage of this method is that in case of crash there is higher
> chance that important messages won't appear in console output (we may
> need working scheduling to print message to console). We somewhat
> mitigate this risk by switching printk to the original method of
> immediate printing to console if oops is in progress. Also for
> debugging purposes we provide printk.synchronous kernel parameter which
> resorts to the original printk behavior.
>
> printk() is expected to work under different conditions and in different
> scenarios, including corner cases of OOM when all of the workers are busy
> (e.g. allocating memory), thus printk() uses its own dedicated printing
> kthread, rather than relying on workqueue (even with WQ_MEM_RECLAIM bit
> set we potentially can receive delays in printing until workqueue
> declares a ->mayday, as noted by Tetsuo Handa). Another thing to mention,
> is that deferred printk() messages may appear before printing kthread
> created, so in the very beginning we have to print deferred messages
> the old way -- via IRQs.
>
> Signed-off-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> ---
> Documentation/kernel-parameters.txt | 10 ++
> kernel/printk/printk.c | 209 ++++++++++++++++++++++++++----------
> 2 files changed, 161 insertions(+), 58 deletions(-)
>
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index e38579d..99c105d6 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> +/*
> + * Delayed printk version, for scheduler-internal messages:
> + */
> +#define PRINTK_PENDING_WAKEUP (1<<0)
> +#define PRINTK_PENDING_OUTPUT (1<<1)
> +
> +static DEFINE_PER_CPU(int, printk_pending);
> +
> +static void wake_up_klogd_work_func(struct irq_work *irq_work)
> +{
> + int pending = __this_cpu_xchg(printk_pending, 0);
> +
> + if (pending & PRINTK_PENDING_OUTPUT) {
> + /* If trylock fails, someone else is doing the printing */
> + if (console_trylock())
> + console_unlock();
This is called from IRQ context and thus keeps the original problem
for printk_deferred(). We should try to
wake_up_process(printk_kthread) when allowed.
if (pending & PRINTK_PENDING_OUTPUT) {
if (printk_sync || !printk_kthread) {
/* If trylock fails, someone else is printing. */
if (console_trylock())
console_unlock();
} else {
wake_up_process(printk_kthread);
}
}
> @@ -1609,6 +1709,8 @@ asmlinkage int vprintk_emit(int facility, int level,
> const char *dict, size_t dictlen,
> const char *fmt, va_list args)
> {
> + /* cpu currently holding logbuf_lock in this function */
> + static unsigned int logbuf_cpu = UINT_MAX;
> static bool recursion_bug;
> static char textbuf[LOG_LINE_MAX];
> char *text = textbuf;
> @@ -1619,12 +1721,21 @@ asmlinkage int vprintk_emit(int facility, int level,
> int printed_len = 0;
> int nmi_message_lost;
> bool in_sched = false;
> - /* cpu currently holding logbuf_lock in this function */
> - static unsigned int logbuf_cpu = UINT_MAX;
> + bool in_panic = console_loglevel == CONSOLE_LOGLEVEL_MOTORMOUTH;
> + bool sync_print = printk_sync;
I still think that this local variable adds more mess than good.
Please, rename it to do_printk_sync at least. It will a poor human
to distinguish it from the global one ;-)
I still hope that we could get rid if it, see below.
> +
> + /* disable async printk */
> + if (in_panic)
> + printk_sync = true;
>
> if (level == LOGLEVEL_SCHED) {
> level = LOGLEVEL_DEFAULT;
> + /*
> + * Deferred sched messages must not be printed
> + * synchronously regardless the @printk_sync or @in_panic.
> + */
> in_sched = true;
> + sync_print = false;
> }
>
> boot_delay_msec(level);
> @@ -1657,6 +1768,14 @@ asmlinkage int vprintk_emit(int facility, int level,
> raw_spin_lock(&logbuf_lock);
> logbuf_cpu = this_cpu;
>
> + /*
> + * Set printing_func() sleep condition early, under the @logbuf_lock.
> + * So printing kthread (if RUNNING) will go to console_lock() and spin
> + * on @logbuf_lock.
> + */
> + if (!printk_sync)
> + need_flush_console = true;
We set this variable for each call and also when printk_kthread is
NULL or when sync_printk is false.
We migth want to clear it also from console_unlock(). I think that
a good place would be in the check:
raw_spin_lock(&logbuf_lock);
retry = console_seq != log_next_seq;
raw_spin_unlock_irqrestore(&logbuf_lock, flags);
> if (unlikely(recursion_bug)) {
> static const char recursion_msg[] =
> "BUG: recent printk recursion!";
> @@ -1762,10 +1881,38 @@ asmlinkage int vprintk_emit(int facility, int level,
> logbuf_cpu = UINT_MAX;
> raw_spin_unlock(&logbuf_lock);
> lockdep_on();
> + /*
> + * By default we print message to console asynchronously so that kernel
> + * doesn't get stalled due to slow serial console. That can lead to
> + * softlockups, lost interrupts, or userspace timing out under heavy
> + * printing load.
> + *
> + * However we resort to synchronous printing of messages during early
> + * boot, when synchronous printing was explicitly requested by
> + * kernel parameter, or when console_verbose() was called to print
> + * everything during panic / oops.
> + */
> + if (!sync_print) {
> + if (in_sched) {
> + /*
> + * @in_sched messages may come too early, when we don't
> + * yet have @printk_kthread. We can't print deferred
> + * messages directly, because this may deadlock, route
> + * them via IRQ context.
> + */
> + __this_cpu_or(printk_pending,
> + PRINTK_PENDING_OUTPUT);
> + irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
> + } else if (printk_kthread && !in_panic) {
> + /* Offload printing to a schedulable context. */
> + wake_up_process(printk_kthread);
> + } else {
> + sync_print = true;
> + }
> + }
> local_irq_restore(flags);
Please, what is the exact reason to call the above stuff before
we release IRQs here? I guess that it is related to the discussions
about possible lock debugging, infinite loops, ...
I wonder how the disabled IRQs help here. It is very likely that I
miss something.
In each case, irq_work_queue() is lock-less and was used with IRQs
enabled before.
So, it might be related to wake_up_process(). It takes a lock but
using
raw_spin_lock_irqsave(&p->pi_lock, flags);
, so it disables IRQs in the critical section. Do we need to guard
it in between?
I think that you actually wanted to disable lockdep or any other lock
debugging, instead.
I wonder if the following code would do the job and is better readable
to others.
/*
* By default we print message to console asynchronously so that kernel
* doesn't get stalled due to slow serial console. That can lead to
* softlockups, lost interrupts, or userspace timing out under heavy
* printing load.
*
* However we resort to synchronous printing of messages during early
* boot, when synchronous printing was explicitly requested by
* kernel parameter, or when console_verbose() was called to print
* everything during panic / oops.
*/
if (in_sched) {
/*
* @in_sched messages may come too early, when we don't
* yet have @printk_kthread. We can't print deferred
* messages directly, because this may deadlock, route
* them via IRQ context.
*/
__this_cpu_or(printk_pending,
PRINTK_PENDING_OUTPUT);
irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
goto out;
}
/* Avoid printk recursion */
lockdep_off();
if (printk_kthread && !in_panic) {
/* Offload printing to a schedulable context. */
wake_up_process(printk_kthread);
goto out_lockdep;
} else {
/*
* Try to acquire and then immediately release the console
* semaphore. The release will print out buffers and wake up
* /dev/kmsg and syslog() users.
*/
if (console_trylock())
console_unlock();
}
lockdep_on();
I do not say that it is a "dream-of-like" code. One important thing for
me is that it does not use "sync_printk" variable.
You original code modified "sync_printk" according to "in_sched" and
"in_panic" variables earlier in vprintk_emit. Then it again checked
all three variables here which produced strange twists in my head ;-)
Best regards,
Petr
next prev parent reply other threads:[~2016-03-22 13:11 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-03-21 17:25 [RFC][PATCH v6 0/2] printk: Make printk() completely async Sergey Senozhatsky
2016-03-21 17:25 ` [RFC][PATCH v6 1/2] " Sergey Senozhatsky
2016-03-22 13:11 ` Petr Mladek [this message]
2016-03-22 14:04 ` Petr Mladek
2016-03-23 0:37 ` Sergey Senozhatsky
2016-03-23 8:42 ` Sergey Senozhatsky
2016-03-23 10:04 ` Petr Mladek
2016-03-24 2:24 ` Sergey Senozhatsky
2016-03-22 16:36 ` Petr Mladek
2016-03-23 1:24 ` Sergey Senozhatsky
2016-03-23 9:25 ` Petr Mladek
2016-03-23 13:20 ` Jan Kara
2016-03-23 14:30 ` Sergey Senozhatsky
2016-03-23 14:41 ` Jan Kara
2016-03-21 17:25 ` [RFC][PATCH v6 2/2] printk: Make wake_up_klogd_work_func() async Sergey Senozhatsky
2016-03-22 6:49 ` [RFC][PATCH v6 0/2] printk: Make printk() completely async Jan Kara
2016-03-22 7:57 ` Sergey Senozhatsky
2016-03-22 8:15 ` Jan Kara
2016-04-23 19:40 ` Pavel Machek
2016-04-24 5:14 ` Sergey Senozhatsky
2016-04-24 13:35 ` Pavel Machek
2016-04-24 15:00 ` Sergey Senozhatsky
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160322131106.GA5522@pathway.suse.cz \
--to=pmladek@suse.com \
--cc=akpm@linux-foundation.org \
--cc=byungchul.park@lge.com \
--cc=jack@suse.com \
--cc=jack@suse.cz \
--cc=linux-kernel@vger.kernel.org \
--cc=penguin-kernel@I-love.SAKURA.ne.jp \
--cc=sergey.senozhatsky.work@gmail.com \
--cc=sergey.senozhatsky@gmail.com \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.