Linux MIPS Architecture development
 help / color / mirror / Atom feed
From: Petr Mladek <pmladek@suse.com>
To: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Russell King <rmk+kernel@arm.linux.org.uk>,
	Daniel Thompson <daniel.thompson@linaro.org>,
	Jiri Kosina <jkosina@suse.com>, Ingo Molnar <mingo@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Chris Metcalf <cmetcalf@ezchip.com>,
	linux-kernel@vger.kernel.org, x86@kernel.org,
	linux-arm-kernel@lists.infradead.org,
	adi-buildroot-devel@lists.sourceforge.net,
	linux-cris-kernel@axis.com, linux-mips@linux-mips.org,
	linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org,
	linux-sh@vger.kernel.org, sparclinux@vger.kernel.org,
	Jan Kara <jack@suse.cz>, Ralf Baechle <ralf@linux-mips.org>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	David Miller <davem@davemloft.net>
Subject: Re: [PATCH v5 1/4] printk/nmi: generic solution for safe printk in NMI
Date: Thu, 20 Apr 2017 15:11:54 +0200	[thread overview]
Message-ID: <20170420131154.GL3452@pathway.suse.cz> (raw)
In-Reply-To: <20170420033112.GB542@jagdpanzerIV.localdomain>

On Thu 2017-04-20 12:31:12, Sergey Senozhatsky wrote:
> Hello Steven,
> 
> On (04/19/17 13:13), Steven Rostedt wrote:
> > > printk() takes some locks and could not be used a safe way in NMI context.
> > 
> > I just found a problem with this solution. It kills ftrace dumps from
> > NMI context :-(
> > 
> > [ 1295.168495]    <...>-67423  10dNh1 382171111us : do_raw_spin_lock <-_raw_spin_lock
> > [ 1295.168495]    <...>-67423  10dNh1 382171111us : sched_stat_runtime: comm=cc1 pid=67423 runtime=96858 [ns] vruntime=11924198270 [ns]
> > [ 1295.168496]    <...>-67423  10dNh1 382171111us : lock_acquire: ffffffff81c5c940 read rcu_read_lock
> > [ 1295.168497]
> > [ 1295.168498] Lost 4890096 message(s)!
> > [ 1296.805063] ---[ end Kernel panic - not syncing: Hard LOCKUP
> > [ 1296.811553] unchecked MSR access error: WRMSR to 0x83f (tried to write 0x00000000000000f6) at rIP: 0xffffffff81046fc7 (native_apic_msr_write+0x27/0x40)
> > [ 1296.811553] Call Trace:
> > [ 1296.811553]  <NMI>
> > 
> > I was hoping to see a cause of a hard lockup by enabling
> > ftrace_dump_on_oops. But as NMIs now have a very small buffer that
> > gets flushed, we need to find a new way to print out the full ftrace
> > buffer over serial.
> > 
> > Thoughts?
> 
> hmmm... a really tough one.
> 
> well, someone has to say this:
>  the simplest thing is to have a bigger PRINTK_SAFE_LOG_BUF_SHIFT value :)
> 
> 
> just thinking (well, sort of) out loud. the problem is that we can't tell if
> we already hold any printk related locks ("printk related locks" is not even
> well defined term). so printk from NMI can deadlock or it can be OK, we
> never know. but looking and vprintk_emit() and console_unlock() it seems that
> we have some sort of a hint now, which is this_cpu_read(printk_context) - if
> we are not in printk_safe context then we can say that _probably_ (and that's
> a Russian roulette) doing "normal" printk() will work. that is a *very-very*
> risky (and admittedly dumb) thing to assume, so we will move in a slightly
> different direction. checking this_cpu_read(printk_context) only assures us
> that we don't hold `logbuf_lock' on this CPU. and that is sort of something,
> at least we can be sure that doing printk_deferred() from this CPU is safe.
> printk_deferred() means that your NMI messages will end up in the logbuf,
> which is a) bigger in size than per-CPU buffer and b) some other CPU can
> immediately print those messages (hopefully).
> 
> we also switch to printk_safe mode for call_console_drivers() in
> console_unlock(). but we can't make any solid assumptions there - serial
> console lock can already be acquired, we don't have any markers for that.
> it may be reasonable to assume that if we are not in printk_safe mode on
> this CPU then serial console is not locked from this CPU, but there is
> nothing that can assure us.

Good analyze. I would summarize it that we need to be careful of:

  + logbug_lock
  + PRINTK_SAFE_CONTEXT
  + locks used by console drivers

The first two things are easy to check. Except that a check for logbuf_lock
might produce false negatives. The last check is very hard.

> so at the moment what I can think of is something like
> 
>   -- check this_cpu_read(printk_context) in NMI prink
> 
> 	-- if we are NOT in printk_safe on this CPU, then do printk_deferred()
> 	   and bypass `nmi_print_seq' buffer

I would add also a check for logbuf_lock.

> 	-- if we are in printk_safe
> 	  -- well... bad luck... have a bigger buffer.

Yup, we do the best effort while still trying to stay on the safe
side.

I have cooked up a patch based on this. It uses printk_deferred()
in NMI when it is safe. Note that console_flush_on_panic() will
try to get them on the console when a kdump is not generated.
I believe that it will help Steven.

  reply	other threads:[~2017-04-20 13:12 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-21 11:48 [PATCH v5 0/4] Cleaning printk stuff in NMI context Petr Mladek
2016-04-21 11:48 ` [PATCH v5 1/4] printk/nmi: generic solution for safe printk in NMI Petr Mladek
2016-04-27  9:31   ` Russell King - ARM Linux
2017-04-19 17:13   ` Steven Rostedt
2017-04-19 17:21     ` Peter Zijlstra
2017-04-20  3:31     ` Sergey Senozhatsky
2017-04-20 13:11       ` Petr Mladek [this message]
2017-04-21  1:57         ` Sergey Senozhatsky
2017-04-21 12:06           ` Petr Mladek
2017-04-24  2:17             ` Sergey Senozhatsky
2017-04-27 13:38               ` Petr Mladek
2017-04-27 14:31                 ` Steven Rostedt
2017-04-27 15:28                   ` Petr Mladek
2017-04-27 15:42                     ` Steven Rostedt
2017-04-28  9:02                 ` Peter Zijlstra
2017-04-28 13:44                   ` Petr Mladek
2017-04-28 13:58                     ` Peter Zijlstra
2017-04-28 14:47                       ` Steven Rostedt
2017-04-27 16:14         ` Steven Rostedt
2017-04-28  1:35           ` Sergey Senozhatsky
2017-04-28 12:57             ` Petr Mladek
2017-04-28 14:16               ` Steven Rostedt
2017-04-28  1:25         ` Sergey Senozhatsky
2017-04-28 12:38           ` Petr Mladek
2016-04-21 11:48 ` [PATCH v5 2/4] printk/nmi: warn when some message has been lost in NMI context Petr Mladek
2016-04-27  9:34   ` Russell King - ARM Linux
2016-04-21 11:48 ` [PATCH v5 3/4] printk/nmi: increase the size of NMI buffer and make it configurable Petr Mladek
2016-04-21 11:48 ` [PATCH v5 4/4] printk/nmi: flush NMI messages on the system panic Petr Mladek
2016-04-23  3:49   ` Sergey Senozhatsky
2016-04-26 14:21     ` Petr Mladek
2016-04-27  0:34       ` Sergey Senozhatsky
2016-04-27  0:36 ` [PATCH v5 0/4] Cleaning printk stuff in NMI context Sergey Senozhatsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170420131154.GL3452@pathway.suse.cz \
    --to=pmladek@suse.com \
    --cc=adi-buildroot-devel@lists.sourceforge.net \
    --cc=akpm@linux-foundation.org \
    --cc=benh@kernel.crashing.org \
    --cc=cmetcalf@ezchip.com \
    --cc=daniel.thompson@linaro.org \
    --cc=davem@davemloft.net \
    --cc=jack@suse.cz \
    --cc=jkosina@suse.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-cris-kernel@axis.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@linux-mips.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linux-sh@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=ralf@linux-mips.org \
    --cc=rmk+kernel@arm.linux.org.uk \
    --cc=rostedt@goodmis.org \
    --cc=schwidefsky@de.ibm.com \
    --cc=sergey.senozhatsky.work@gmail.com \
    --cc=sparclinux@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox