All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Kara <jack@suse.com>, LKML <linux-kernel@vger.kernel.org>,
	pmladek@suse.com, rostedt@goodmis.org,
	Gavin Hu <gavin.hu.2010@gmail.com>,
	KY Srinivasan <kys@microsoft.com>, Jan Kara <jack@suse.cz>
Subject: Re: [PATCH 1/4] printk: Hand over printing to console if printing too long
Date: Tue, 22 Sep 2015 12:27:43 +0200	[thread overview]
Message-ID: <20150922102743.GJ9028@quack.suse.cz> (raw)
In-Reply-To: <20150918151459.a73804a65137f3e1049b8dd7@linux-foundation.org>

On Fri 18-09-15 15:14:59, Andrew Morton wrote:
> On Wed, 19 Aug 2015 17:38:28 +0200 Jan Kara <jack@suse.com> wrote:
> 
> > From: Jan Kara <jack@suse.cz>
> > 
> > Currently, console_unlock() prints messages from kernel printk buffer to
> > console while the buffer is non-empty. When serial console is attached,
> > printing is slow and thus other CPUs in the system have plenty of time
> > to append new messages to the buffer while one CPU is printing. Thus the
> > CPU can spend unbounded amount of time doing printing in console_unlock().
> > This is especially serious problem if the printk() calling
> > console_unlock() was called with interrupts disabled.
> > 
> > In practice users have observed a CPU can spend tens of seconds printing
> > in console_unlock() (usually during boot when hundreds of SCSI devices
> > are discovered) resulting in RCU stalls (CPU doing printing doesn't
> > reach quiescent state for a long time), softlockup reports (IPIs for the
> > printing CPU don't get served and thus other CPUs are spinning waiting
> > for the printing CPU to process IPIs), and eventually a machine death
> > (as messages from stalls and lockups append to printk buffer faster than
> > we are able to print). So these machines are unable to boot with serial
> > console attached. Also during artificial stress testing SATA disk
> > disappears from the system because its interrupts aren't served for too
> > long.
> > 
> > This patch implements a mechanism where after printing specified number
> > of characters (tunable as a kernel parameter printk.offload_chars), CPU
> > doing printing asks for help by waking up one of dedicated kthreads.  As
> > soon as the printing CPU notices kthread got scheduled and is spinning
> > on print_lock dedicated for that purpose, it drops console_sem,
> > print_lock, and exits console_unlock(). Kthread then takes over printing
> > instead. This way no CPU should spend printing too long even if there
> > is heavy printk traffic.
> > 
> > ...
> >
> > @@ -2230,6 +2292,8 @@ void console_unlock(void)
> >  	unsigned long flags;
> >  	bool wake_klogd = false;
> >  	bool retry;
> > +	bool hand_over = false;
> > +	int printed_chars = 0;
> >  
> >  	if (console_suspended) {
> >  		up_console_sem();
> > @@ -2241,12 +2305,18 @@ void console_unlock(void)
> >  	/* flush buffered message fragment immediately to console */
> >  	console_cont_flush(text, sizeof(text));
> >  again:
> > +	spin_lock(&print_lock);
> 
> I'm surprised this isn't spin_lock_irqsave().  How come this isn't
> deadlockable?

Yes, it should be spin_lock_irqsave(). My original plan was to nest
print_lock inside logbuf_lock which would provide the protection but later
I've ordered them the other way around and forgot to update the irq
protection. Will fix.

> >  	for (;;) {
> >  		struct printk_log *msg;
> >  		size_t ext_len = 0;
> >  		size_t len;
> >  		int level;
> >  
> > +		if (cpu_stop_printing(printed_chars)) {
> > +			hand_over = true;
> > +			break;
> > +		}
> > +
> >  		raw_spin_lock_irqsave(&logbuf_lock, flags);
> >  		if (seen_seq != log_next_seq) {
> >  			wake_klogd = true;
> >
> > ...
> >
> > +/* Kthread which takes over printing from a CPU which asks for help */
> > +static int printing_task(void *arg)
> > +{
> > +	DEFINE_WAIT(wait);
> > +
> > +	while (1) {
> > +		prepare_to_wait_exclusive(&print_queue, &wait,
> > +					  TASK_INTERRUPTIBLE);
> > +		schedule();
> > +		finish_wait(&print_queue, &wait);
> > +		preempt_disable();
> 
> I don't understand the preempt_disable().  Code comment, please?

We don't want to be scheduled away in preemptible kernels when spinning on
print_lock or after we acquired print_lock and before we got console_sem.
I'll add a comment.

Thanks for review!
								Honza

> 
> > +		atomic_inc(&printing_tasks_spinning);
> > +		/*
> > +		 * Store printing_tasks_spinning value before we spin. Matches
> > +		 * the barrier in cpu_stop_printing().
> > +		 */
> > +		smp_mb__after_atomic();
> > +		/*
> > +		 * Wait for currently printing thread to complete. We spin on
> > +		 * print_lock instead of waiting on console_sem since we don't
> > +		 * want to sleep once we got scheduled to make sure we take
> > +		 * over printing without depending on the scheduler.
> > +		 */
> > +		spin_lock(&print_lock);
> > +		atomic_dec(&printing_tasks_spinning);
> > +		spin_unlock(&print_lock);
> > +		if (console_trylock())
> > +			console_unlock();
> > +		preempt_enable();
> > +	}
> > +	return 0;
> > +}
> >
> > ...
> >
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

  reply	other threads:[~2015-09-22 10:27 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-08-19 15:38 [PATCH 0/4] printk: Softlockup avoidance Jan Kara
2015-08-19 15:38 ` [PATCH 1/4] printk: Hand over printing to console if printing too long Jan Kara
2015-09-18 22:14   ` Andrew Morton
2015-09-22 10:27     ` Jan Kara [this message]
2015-08-19 15:38 ` [PATCH 2/4] printk: Start printing handover kthreads on demand Jan Kara
2015-08-19 15:38 ` [PATCH 3/4] kernel: Avoid softlockups in stop_machine() during heavy printing Jan Kara
2015-09-18 22:15   ` Andrew Morton
2015-09-22 10:55     ` Jan Kara
2015-09-23  8:37       ` Jan Kara
2015-08-19 15:38 ` [PATCH 4/4] printk: Add config option for disabling printk offloading Jan Kara
2015-09-18 22:15   ` Andrew Morton
2015-09-22 11:51     ` Jan Kara
2015-08-20  2:37 ` [PATCH 0/4] printk: Softlockup avoidance KY Srinivasan
2015-09-18 22:14 ` Andrew Morton
2015-09-22 10:10   ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150922102743.GJ9028@quack.suse.cz \
    --to=jack@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=gavin.hu.2010@gmail.com \
    --cc=jack@suse.com \
    --cc=kys@microsoft.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pmladek@suse.com \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.