All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Kara <jack@suse.com>, LKML <linux-kernel@vger.kernel.org>,
	pmladek@suse.com, rostedt@goodmis.org,
	Gavin Hu <gavin.hu.2010@gmail.com>,
	KY Srinivasan <kys@microsoft.com>, Jan Kara <jack@suse.cz>
Subject: Re: [PATCH 3/4] kernel: Avoid softlockups in stop_machine() during heavy printing
Date: Tue, 22 Sep 2015 12:55:02 +0200	[thread overview]
Message-ID: <20150922105502.GK9028@quack.suse.cz> (raw)
In-Reply-To: <20150918151521.c5c8caa59a0e254fdd713337@linux-foundation.org>

On Fri 18-09-15 15:15:21, Andrew Morton wrote:
> On Wed, 19 Aug 2015 17:38:30 +0200 Jan Kara <jack@suse.com> wrote:
> 
> > From: Jan Kara <jack@suse.cz>
> > 
> > When there are lots of messages accumulated in printk buffer, printing
> > them (especially over serial console) can take a long time (tens of
> > seconds). stop_machine() will effectively make all cpus spin in
> > multi_cpu_stop() waiting for the CPU doing printing to print all the
> > messages which triggers NMI softlockup watchdog and RCU stall detector
> > which add even more to the messages to print. Since machine doesn't do
> > anything (except serving interrupts) during this time, also network
> > connections are dropped and other disturbances may happen.
> > 
> > Paper over the problem by waiting for printk buffer to be empty before
> > starting to stop CPUs. In theory a burst of new messages can be appended
> > to the printk buffer before CPUs enter multi_cpu_stop() so this isn't a 100%
> > solution but it works OK in practice and I'm not aware of a reasonably
> > simple better solution.
> 
> Confused.  Why don't patches 1 and 2 already fix this problem?

Because stop_machine() will not allow printing threads to run on any CPU
(all but one CPUs are spinning in multi_cpu_stop() without possibility of
preemption) and thus any printk offloading cannot happen.

> > ...
> >
> > @@ -2489,6 +2489,28 @@ struct tty_driver *console_device(int *index)
> >  }
> >  
> >  /*
> > + * Wait until all messages accumulated in the printk buffer are printed to
> > + * console. Note that as soon as this function returns, new messages may be
> > + * added to the printk buffer by other CPUs.
> > + */
> > +void console_flush(void)
> 
> This doesn't seem a very good name.  We already have
> console_cont_flush(), cont_flush(), etc.  Can we think of something
> more specific?  printk_log_buf_drain() perhaps.

Thanks for suggestion. I'll change the name.

> > +{
> > +	bool retry;
> > +	unsigned long flags;
> > +
> > +	while (1) {
> > +		raw_spin_lock_irqsave(&logbuf_lock, flags);
> > +		retry = console_seq != log_next_seq;
> > +		raw_spin_unlock_irqrestore(&logbuf_lock, flags);
> 
> Does this lock/unlock do anything useful?
> 
> > +		if (!retry || console_suspended)
> > +			break;
> > +		/* Cycle console_sem to wait for outstanding printing */
> > +		console_lock();
> > +		console_unlock();
> > +	}
> > +}
> > +
> > +/*
> >   * Prevent further output on the passed console device so that (for example)
> >   * serial drivers can disable console output before suspending a port, and can
> >   * re-enable output afterwards.
> > diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
> > index fd643d8c4b42..016d34621d2e 100644
> > --- a/kernel/stop_machine.c
> > +++ b/kernel/stop_machine.c
> > @@ -21,6 +21,7 @@
> >  #include <linux/smpboot.h>
> >  #include <linux/atomic.h>
> >  #include <linux/lglock.h>
> > +#include <linux/console.h>
> >  
> >  /*
> >   * Structure to determine completion condition and record errors.  May
> > @@ -543,6 +544,14 @@ int __stop_machine(int (*fn)(void *), void *data, const struct cpumask *cpus)
> >  		return ret;
> >  	}
> >  
> > +	/*
> > +	 * If there are lots of outstanding messages, printing them can take a
> > +	 * long time and all cpus would be spinning waiting for the printing to
> > +	 * finish thus triggering NMI watchdog, RCU lockups etc. Wait for the
> > +	 * printing here to avoid these.
> > +	 */
> > +	console_flush();
> 
> This is pretty pointless if num_possible_cpus==1.  I'd suggest setting
> printk_offload_chars=0 in this case, add some early bale-out into
> console_flush().  Or something along those lines.
> 
> And make console_flush() go away altogether if CONFIG_SMP=n - it's
> pointless bloat.

Sure, I'll do that.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

  reply	other threads:[~2015-09-22 10:55 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-08-19 15:38 [PATCH 0/4] printk: Softlockup avoidance Jan Kara
2015-08-19 15:38 ` [PATCH 1/4] printk: Hand over printing to console if printing too long Jan Kara
2015-09-18 22:14   ` Andrew Morton
2015-09-22 10:27     ` Jan Kara
2015-08-19 15:38 ` [PATCH 2/4] printk: Start printing handover kthreads on demand Jan Kara
2015-08-19 15:38 ` [PATCH 3/4] kernel: Avoid softlockups in stop_machine() during heavy printing Jan Kara
2015-09-18 22:15   ` Andrew Morton
2015-09-22 10:55     ` Jan Kara [this message]
2015-09-23  8:37       ` Jan Kara
2015-08-19 15:38 ` [PATCH 4/4] printk: Add config option for disabling printk offloading Jan Kara
2015-09-18 22:15   ` Andrew Morton
2015-09-22 11:51     ` Jan Kara
2015-08-20  2:37 ` [PATCH 0/4] printk: Softlockup avoidance KY Srinivasan
2015-09-18 22:14 ` Andrew Morton
2015-09-22 10:10   ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150922105502.GK9028@quack.suse.cz \
    --to=jack@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=gavin.hu.2010@gmail.com \
    --cc=jack@suse.com \
    --cc=kys@microsoft.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pmladek@suse.com \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.