From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Jiri Kosina <jkosina@suse.cz>
Cc: Rik van Riel <riel@redhat.com>,
	linux-kernel@vger.kernel.org, joern@logfs.org,
	peterz@infradead.org, Andrew Morton <akpm@linux-foundation.org>,
	cxie@redhat.com, Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Jiri Slaby <jslaby@suse.cz>
Subject: Re: [PATCH RFC] sysrq: rcu-ify __handle_sysrq
Date: Wed, 23 Apr 2014 18:46:49 -0700
Message-ID: <20140424014648.GK4496@linux.vnet.ibm.com>
In-Reply-To: <alpine.LRH.2.00.1404232349350.1491@twin.jikos.cz>

On Wed, Apr 23, 2014 at 11:51:55PM +0200, Jiri Kosina wrote:
> On Wed, 23 Apr 2014, Rik van Riel wrote:
> 
> > >> Echoing values into /proc/sysrq-trigger seems to be a popular way to
> > >> get information out of the kernel. However, dumping information about
> > >> thousands of processes, or hundreds of CPUs to serial console can
> > >> result in IRQs being blocked for minutes, resulting in various kinds
> > >> of cascade failures.
> > >>
> > >> The most common failure is due to interrupts being blocked for a very
> > >> long time. This can lead to things like failed IO requests, and other
> > >> things the system cannot easily recover from.
> > >>
> > >> This problem is easily fixable by making __handle_sysrq use RCU
> > >> instead of spin_lock_irqsave.
> > >>
> > >> This leaves the warning that RCU grace periods have not elapsed for a
> > >> long time, but the system will come back from that automatically.
> > > 
> > > This, however, will make RCU stall detector to send NMI to all online CPUs 
> > > so that they can dump their stacks.

Hey, if dumping the stacks once is a good idea, dumping them twice
must be twice as good, right?  ;-)

> > It already does that, since several of the longer-running
> > sysrq handlers already grab rcu_read_lock(), for example
> > show_state().
> > 
> > > IOW, this might actually make the whole sysrq dump last for much longer, 
> > > and have the log polluted with all-CPU dumps for no good reason.
> > > 
> > > I wonder whether explicitly setting rcu_cpu_stall_suppress during sysrq 
> > > handling might be a viable workaround for this.
> > 
> > I suppose that would do the trick.
> 
> I can imagine Paul opposing this though ... this variable is supposed to 
> be changed only by cmdline/modparam, not really flipped during runtime as 
> a bandaid ... let's add Paul to CC.

Well, we already crowbar it to 1 when panic starts, see rcu_panic().

How about something like the following?

	void rcu_sysrq_start(void)
	{
		rcu_cpu_stall_suppress = 2;
	}

	void rcu_sysrq_end(void)
	{
		if (rcu_cpu_stall_suppress == 2)
			rcu_cpu_stall_suppress = 0;
	}

If many more reasons for temporarily suppressing RCU CPU stall
warnings turn up, I can then swap out to a better implementation,
for some definition or another of "better".
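
For illustration only, here is a userspace model of the two helpers
above. It is a sketch, not kernel code: the global below merely stands
in for the kernel variable, and model_sysrq() and
rcu_sysrq_start_guarded() are hypothetical names that do not appear in
this thread. The model shows why rcu_sysrq_end() tests for 2: a panic
that crowbars the value to 1 in the middle of a dump is left alone. It
also shows one thing to watch for, assuming the variable may already be
nonzero from the command line: an unconditional store of 2 would end up
clearing that setting, which a guarded start avoids.

```c
#include <stdbool.h>

/*
 * Userspace stand-in for the kernel variable (assumption: plain int).
 * 0 = stall warnings enabled
 * 1 = suppressed by cmdline/modparam or by panic
 * 2 = suppressed temporarily by sysrq
 */
int rcu_cpu_stall_suppress;

/* The two helpers exactly as proposed above. */
void rcu_sysrq_start(void)
{
	rcu_cpu_stall_suppress = 2;
}

void rcu_sysrq_end(void)
{
	if (rcu_cpu_stall_suppress == 2)
		rcu_cpu_stall_suppress = 0;
}

/* Hypothetical variant: only claim the variable if nobody else has. */
void rcu_sysrq_start_guarded(void)
{
	if (!rcu_cpu_stall_suppress)
		rcu_cpu_stall_suppress = 2;
}

/*
 * Hypothetical helper modeling one sysrq dump.
 * initial:      value of the variable before the dump starts
 * panic_during: whether a panic sets the variable to 1 mid-dump
 * guarded:      use the guarded start instead of the proposed one
 * Returns the value of the variable after the dump ends.
 */
int model_sysrq(int initial, bool panic_during, bool guarded)
{
	rcu_cpu_stall_suppress = initial;

	if (guarded)
		rcu_sysrq_start_guarded();
	else
		rcu_sysrq_start();

	if (panic_during)
		rcu_cpu_stall_suppress = 1;	/* models the panic path */

	rcu_sysrq_end();
	return rcu_cpu_stall_suppress;
}
```

In this model, model_sysrq(0, true, false) leaves the variable at 1, so
the panic's suppression survives the end of the dump, while
model_sysrq(1, false, false) leaves it at 0 and model_sysrq(1, false,
true) leaves it at 1.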

							Thanx, Paul



