All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Jiri Kosina <jkosina@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Petr Mladek <pmladek@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Dave Anderson <anderson@redhat.com>, Kay Sievers <kay@vrfy.org>,
	Michal Hocko <mhocko@suse.cz>, Jan Kara <jack@suse.cz>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH 00/11] printk: safe printing in NMI context
Date: Wed, 18 Jun 2014 07:36:12 -0700	[thread overview]
Message-ID: <20140618143612.GC4669@linux.vnet.ibm.com> (raw)
In-Reply-To: <alpine.LNX.2.00.1406181245230.2303@pobox.suse.cz>

On Wed, Jun 18, 2014 at 01:03:05PM +0200, Jiri Kosina wrote:
> On Tue, 10 Jun 2014, Linus Torvalds wrote:
> 
> > > Lets be crazy and Cc Linus on that.
> > 
> > Quite frankly, I hate seeing something like this:
> > 
> >  kernel/printk/printk.c              | 1218 +++++++++++++++++++++++++----------
> > 
> > for something that is stupid and broken. Printing from NMI context
> > isn't really supposed to work, and we all *know* it's not supposed to
> > work.
> > 
> > I'd much rather disallow it, and if there is one or two places that
> > really want to print a warning and know that they are in NMI context,
> > have a special workaround just for them, with something that does
> > *not* try to make printk in general work any better.
> > 
> > Dammit, NMI context is special. I absolutely refuse to buy into the
> > broken concept that we should make more stuff work in NMI context.
> > Hell no, we should *not* try to make more crap work in NMI. NMI people
> > should be careful.
> > 
> > Make a trivial "printk_nmi()" wrapper that tries to do a trylock on
> > logbuf_lock, and *maybe* the existing sequence of
> > 
> >         if (console_trylock_for_printk())
> >                 console_unlock();
> > 
> > then works for actually triggering the printout. But the wrapper
> > should be 15 lines of code for "if possible, try to print things", and
> > *not* a thousand lines of changes.
> 
> Alright, so this went silent again without any real progress. Is everyone 
> hoping this gets sorted out on kernel summit, or ... ?
> 
> Let me sum up the current situation:
> 
> - both RCU stall detector and 'echo l > sysrq-trigger' can (and we've 
>   seen it happening for real) cause a complete, undebuggable, silent hang 
>   of machine (deadlock in NMI context)

I could easily add an option to RCU to allow people to tell it not to
use NMIs to dump the stack.  Would that help?

							Thanx, Paul

> - before 7ff9554bb578 and friends, this was trivial to fix more or less 
>   exactly the way Linus is proposing above. We've been carrying the 
>   fix in our kernels for a while already [1]. With printk() having got  
>   overly complicated recently, the "in principle trivial" fix turns out  
>   into crazy mess due to handling of all the indexes, sequence numbers, 
>   etc.
> 
> - printk() from NMI is actually useful in rare cases (such as inter-CPU 
>   stack dumping)
> 
> - using lockless buffers that trace_printk() is using has its own 
>   problems, as described by Petr elsewhere in this thread
> 
> 
> I find it rather outrageous that fixing *real bugs* (leading to hangs) 
> becomes impossible due to printk() being too complex. It's very 
> unfortunate that the same level of pushback didn't happen when new 
> features (that actually *made* it so complicated) have been pushed; that 
> would be much more valuable an appropriate.
> 
> I believe Jan Kara is in the same situation with his softlockup fixes for 
> printk. Those are real problems, as they are bringing machines down, and 
> after two years, still not fixed, because "printk() code is scary enough 
> as-is"
> 
> [1] http://kernel.suse.com/cgit/kernel/commit/?h=SLE11-SP3&id=8d62ae68ff61d77ae3c4899f05dbd9c9742b14c9
> 
> -- 
> Jiri Kosina
> SUSE Labs
> 


  reply	other threads:[~2014-06-18 14:36 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-09  9:10 [RFC PATCH 00/11] printk: safe printing in NMI context Petr Mladek
2014-05-09  9:10 ` [RFC PATCH 01/11] printk: rename struct printk_log to printk_msg Petr Mladek
2014-05-09  9:10 ` [RFC PATCH 02/11] printk: allow to handle more log buffers Petr Mladek
2014-05-09  9:10 ` [RFC PATCH 03/11] printk: rename "logbuf_lock" to "main_logbuf_lock" Petr Mladek
2014-05-09  9:10 ` [RFC PATCH 04/11] printk: add NMI ring and cont buffers Petr Mladek
2014-05-09  9:10 ` [RFC PATCH 05/11] printk: allow to modify NMI log buffer size using boot parameter Petr Mladek
2014-05-09  9:11 ` [RFC PATCH 06/11] printk: NMI safe printk Petr Mladek
2014-05-09  9:11 ` [RFC PATCH 07/11] printk: right ordering of the cont buffers from NMI context Petr Mladek
2014-05-09  9:11 ` [RFC PATCH 08/11] printk: try hard to print Oops message in " Petr Mladek
2014-05-09  9:11 ` [RFC PATCH 09/11] printk: merge and flush NMI buffer predictably via IRQ work Petr Mladek
2014-05-09  9:11 ` [RFC PATCH 10/11] printk: survive rotation of sequence numbers Petr Mladek
2014-05-09  9:11 ` [RFC PATCH 11/11] printk: avoid staling when merging NMI log buffer Petr Mladek
2014-05-28 22:02 ` [RFC PATCH 00/11] printk: safe printing in NMI context Jiri Kosina
2014-05-29  0:09   ` Frederic Weisbecker
2014-05-29  8:09     ` Jiri Kosina
2014-06-10 16:46       ` Frederic Weisbecker
2014-06-10 16:57         ` Linus Torvalds
2014-06-10 17:32           ` Jiri Kosina
2014-06-11  9:01             ` Petr Mládek
2014-06-18 11:03           ` Jiri Kosina
2014-06-18 14:36             ` Paul E. McKenney [this message]
2014-06-18 14:41               ` Jiri Kosina
2014-06-18 14:44                 ` Paul E. McKenney
2014-06-18 14:53                   ` Jiri Kosina
2014-06-18 15:07                     ` Paul E. McKenney
     [not found]               ` <CA+55aFwPgDC6gSEPfu3i-pA4f0ZbsTSvykxzX4sXMeLbdXuKrw@mail.gmail.com>
2014-06-18 16:21                 ` Paul E. McKenney
2014-06-18 16:38                   ` Steven Rostedt
2014-06-18 16:43                     ` Paul E. McKenney
2014-06-18 20:36                   ` Jiri Kosina
2014-06-18 21:07                     ` Paul E. McKenney
2014-06-18 21:12                       ` Jiri Kosina
2014-06-18 21:20                         ` Paul E. McKenney
2014-06-18 21:32                           ` Jiri Kosina
2014-06-18 21:37                             ` Paul E. McKenney
2014-06-18 23:20                         ` Steven Rostedt
2014-05-30  8:13     ` Jan Kara
2014-05-30 10:10       ` Jiri Kosina
2014-06-10 16:49       ` Frederic Weisbecker
2014-06-12 11:50     ` Petr Mládek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140618143612.GC4669@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=anderson@redhat.com \
    --cc=fweisbec@gmail.com \
    --cc=jack@suse.cz \
    --cc=jkosina@suse.cz \
    --cc=kay@vrfy.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhocko@suse.cz \
    --cc=pmladek@suse.cz \
    --cc=rostedt@goodmis.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.