public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Jiri Kosina <jkosina@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Petr Mladek <pmladek@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Dave Anderson <anderson@redhat.com>, Kay Sievers <kay@vrfy.org>,
	Michal Hocko <mhocko@suse.cz>, Jan Kara <jack@suse.cz>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH 00/11] printk: safe printing in NMI context
Date: Wed, 18 Jun 2014 07:36:12 -0700	[thread overview]
Message-ID: <20140618143612.GC4669@linux.vnet.ibm.com> (raw)
In-Reply-To: <alpine.LNX.2.00.1406181245230.2303@pobox.suse.cz>

On Wed, Jun 18, 2014 at 01:03:05PM +0200, Jiri Kosina wrote:
> On Tue, 10 Jun 2014, Linus Torvalds wrote:
> 
> > > Lets be crazy and Cc Linus on that.
> > 
> > Quite frankly, I hate seeing something like this:
> > 
> >  kernel/printk/printk.c              | 1218 +++++++++++++++++++++++++----------
> > 
> > for something that is stupid and broken. Printing from NMI context
> > isn't really supposed to work, and we all *know* it's not supposed to
> > work.
> > 
> > I'd much rather disallow it, and if there is one or two places that
> > really want to print a warning and know that they are in NMI context,
> > have a special workaround just for them, with something that does
> > *not* try to make printk in general work any better.
> > 
> > Dammit, NMI context is special. I absolutely refuse to buy into the
> > broken concept that we should make more stuff work in NMI context.
> > Hell no, we should *not* try to make more crap work in NMI. NMI people
> > should be careful.
> > 
> > Make a trivial "printk_nmi()" wrapper that tries to do a trylock on
> > logbuf_lock, and *maybe* the existing sequence of
> > 
> >         if (console_trylock_for_printk())
> >                 console_unlock();
> > 
> > then works for actually triggering the printout. But the wrapper
> > should be 15 lines of code for "if possible, try to print things", and
> > *not* a thousand lines of changes.
> 
> Alright, so this went silent again without any real progress. Is everyone 
> hoping this gets sorted out on kernel summit, or ... ?
> 
> Let me sum up the current situation:
> 
> - both RCU stall detector and 'echo l > sysrq-trigger' can (and we've 
>   seen it happening for real) cause a complete, undebuggable, silent hang 
>   of machine (deadlock in NMI context)

I could easily add an option to RCU to allow people to tell it not to
use NMIs to dump the stack.  Would that help?

							Thanx, Paul

> - before 7ff9554bb578 and friends, this was trivial to fix more or less 
>   exactly the way Linus is proposing above. We've been carrying the 
>   fix in our kernels for a while already [1]. With printk() having got  
>   overly complicated recently, the "in principle trivial" fix turns out  
>   into crazy mess due to handling of all the indexes, sequence numbers, 
>   etc.
> 
> - printk() from NMI is actually useful in rare cases (such as inter-CPU 
>   stack dumping)
> 
> - using lockless buffers that trace_printk() is using has its own 
>   problems, as described by Petr elsewhere in this thread
> 
> 
> I find it rather outrageous that fixing *real bugs* (leading to hangs) 
> becomes impossible due to printk() being too complex. It's very 
> unfortunate that the same level of pushback didn't happen when new 
> features (that actually *made* it so complicated) have been pushed; that 
> would be much more valuable an appropriate.
> 
> I believe Jan Kara is in the same situation with his softlockup fixes for 
> printk. Those are real problems, as they are bringing machines down, and 
> after two years, still not fixed, because "printk() code is scary enough 
> as-is"
> 
> [1] http://kernel.suse.com/cgit/kernel/commit/?h=SLE11-SP3&id=8d62ae68ff61d77ae3c4899f05dbd9c9742b14c9
> 
> -- 
> Jiri Kosina
> SUSE Labs
> 


  reply	other threads:[~2014-06-18 14:36 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-09  9:10 [RFC PATCH 00/11] printk: safe printing in NMI context Petr Mladek
2014-05-09  9:10 ` [RFC PATCH 01/11] printk: rename struct printk_log to printk_msg Petr Mladek
2014-05-09  9:10 ` [RFC PATCH 02/11] printk: allow to handle more log buffers Petr Mladek
2014-05-09  9:10 ` [RFC PATCH 03/11] printk: rename "logbuf_lock" to "main_logbuf_lock" Petr Mladek
2014-05-09  9:10 ` [RFC PATCH 04/11] printk: add NMI ring and cont buffers Petr Mladek
2014-05-09  9:10 ` [RFC PATCH 05/11] printk: allow to modify NMI log buffer size using boot parameter Petr Mladek
2014-05-09  9:11 ` [RFC PATCH 06/11] printk: NMI safe printk Petr Mladek
2014-05-09  9:11 ` [RFC PATCH 07/11] printk: right ordering of the cont buffers from NMI context Petr Mladek
2014-05-09  9:11 ` [RFC PATCH 08/11] printk: try hard to print Oops message in " Petr Mladek
2014-05-09  9:11 ` [RFC PATCH 09/11] printk: merge and flush NMI buffer predictably via IRQ work Petr Mladek
2014-05-09  9:11 ` [RFC PATCH 10/11] printk: survive rotation of sequence numbers Petr Mladek
2014-05-09  9:11 ` [RFC PATCH 11/11] printk: avoid staling when merging NMI log buffer Petr Mladek
2014-05-28 22:02 ` [RFC PATCH 00/11] printk: safe printing in NMI context Jiri Kosina
2014-05-29  0:09   ` Frederic Weisbecker
2014-05-29  8:09     ` Jiri Kosina
2014-06-10 16:46       ` Frederic Weisbecker
2014-06-10 16:57         ` Linus Torvalds
2014-06-10 17:32           ` Jiri Kosina
2014-06-11  9:01             ` Petr Mládek
2014-06-18 11:03           ` Jiri Kosina
2014-06-18 14:36             ` Paul E. McKenney [this message]
2014-06-18 14:41               ` Jiri Kosina
2014-06-18 14:44                 ` Paul E. McKenney
2014-06-18 14:53                   ` Jiri Kosina
2014-06-18 15:07                     ` Paul E. McKenney
     [not found]               ` <CA+55aFwPgDC6gSEPfu3i-pA4f0ZbsTSvykxzX4sXMeLbdXuKrw@mail.gmail.com>
2014-06-18 16:21                 ` Paul E. McKenney
2014-06-18 16:38                   ` Steven Rostedt
2014-06-18 16:43                     ` Paul E. McKenney
2014-06-18 20:36                   ` Jiri Kosina
2014-06-18 21:07                     ` Paul E. McKenney
2014-06-18 21:12                       ` Jiri Kosina
2014-06-18 21:20                         ` Paul E. McKenney
2014-06-18 21:32                           ` Jiri Kosina
2014-06-18 21:37                             ` Paul E. McKenney
2014-06-18 23:20                         ` Steven Rostedt
2014-05-30  8:13     ` Jan Kara
2014-05-30 10:10       ` Jiri Kosina
2014-06-10 16:49       ` Frederic Weisbecker
2014-06-12 11:50     ` Petr Mládek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140618143612.GC4669@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=anderson@redhat.com \
    --cc=fweisbec@gmail.com \
    --cc=jack@suse.cz \
    --cc=jkosina@suse.cz \
    --cc=kay@vrfy.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhocko@suse.cz \
    --cc=pmladek@suse.cz \
    --cc=rostedt@goodmis.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox