Re: [PATCH v4 1/5] printk/nmi: generic solution for safe printk in NMI

From: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
To: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Russell King <rmk+kernel@arm.linux.org.uk>,
	Daniel Thompson <daniel.thompson@linaro.org>,
	Jiri Kosina <jkosina@suse.com>, Ingo Molnar <mingo@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
	Chris Metcalf <cmetcalf@ezchip.com>,
	linux-kernel@vger.kernel.org, x86@kernel.org,
	linux-arm-kernel@lists.infradead.org,
	adi-buildroot-devel@lists.sourceforge.net,
	linux-cris-kernel@axis.com, linux-mips@linux-mips.org,
	linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org,
	linux-sh@vger.kernel.org, sparclinux@vger.kernel.org,
	Jan Kara <jack@suse.cz>, Ralf Baechle <ralf@linux-mips.org>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	David Miller <davem@davemloft.net>
Subject: Re: [PATCH v4 1/5] printk/nmi: generic solution for safe printk in NMI
Date: Mon, 4 Apr 2016 13:49:28 +0900
Message-ID: <20160404044928.GD6164@swordfish>
In-Reply-To: <1459353210-20260-2-git-send-email-pmladek@suse.com>

Hello,

On (03/30/16 17:53), Petr Mladek wrote:
[..]
> @@ -67,10 +67,12 @@ extern void irq_exit(void);
>  		preempt_count_add(NMI_OFFSET + HARDIRQ_OFFSET);	\
>  		rcu_nmi_enter();				\
>  		trace_hardirq_enter();				\
> +		printk_nmi_enter();				\
>  	} while (0)
>  
>  #define nmi_exit()						\
>  	do {							\
> +		printk_nmi_exit();				\
>  		trace_hardirq_exit();				\
>  		rcu_nmi_exit();					\
>  		BUG_ON(!in_nmi());				\

isn't it a bit too early to call printk_nmi_exit()? rcu_nmi_exit() runs
after it and can WARN_ON_ONCE() in 3 places.

the same goes for printk_nmi_enter(), which comes too late: rcu_nmi_enter()
runs before it and can WARN_ON_ONCE() as well.

seems that in both cases we can end up with a WARN_ON_ONCE() issued from
NMI context, but going through the default printk function.
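
to illustrate the ordering concern, here is a small userspace model, not
kernel code. every model_* name is a hypothetical stand-in: a global
"current printk path" replaces the per-CPU switch done by
printk_nmi_enter()/printk_nmi_exit(), and model_rcu_hook_warns() stands
for an RCU hook whose WARN_ON_ONCE() fires. the reordered variant is just
one possible arrangement, assumed here for comparison.

```c
#include <assert.h>

/* Which printk path a message would take at this instant (model only). */
enum printk_path { PATH_DEFAULT, PATH_NMI_SAFE };

static enum printk_path cur_path = PATH_DEFAULT;
static int warns_default;	/* warnings that took the default path */
static int warns_nmi_safe;	/* warnings that took the NMI-safe path */

/* Hypothetical stand-ins for printk_nmi_enter()/printk_nmi_exit(). */
static void model_printk_nmi_enter(void) { cur_path = PATH_NMI_SAFE; }
static void model_printk_nmi_exit(void)  { cur_path = PATH_DEFAULT; }

/* Stand-in for rcu_nmi_enter()/rcu_nmi_exit() hitting a WARN_ON_ONCE(). */
static void model_rcu_hook_warns(void)
{
	if (cur_path == PATH_NMI_SAFE)
		warns_nmi_safe++;
	else
		warns_default++;
}

/* Ordering as in the quoted hunk: switch late on entry, early on exit. */
static void enter_exit_as_posted(void)
{
	model_rcu_hook_warns();		/* rcu_nmi_enter() */
	model_printk_nmi_enter();
	/* ... NMI handler body would run here ... */
	model_printk_nmi_exit();
	model_rcu_hook_warns();		/* rcu_nmi_exit() */
}

/* Assumed alternative: switch first on entry, switch back last on exit. */
static void enter_exit_reordered(void)
{
	model_printk_nmi_enter();
	model_rcu_hook_warns();		/* rcu_nmi_enter() */
	/* ... NMI handler body would run here ... */
	model_rcu_hook_warns();		/* rcu_nmi_exit() */
	model_printk_nmi_exit();
}

/* Run both orderings and check which path the two warnings took. */
void compare_orderings(void)
{
	warns_default = warns_nmi_safe = 0;
	enter_exit_as_posted();
	assert(warns_default == 2 && warns_nmi_safe == 0);

	warns_default = warns_nmi_safe = 0;
	enter_exit_reordered();
	assert(warns_default == 0 && warns_nmi_safe == 2);
}
```

in this model both warnings miss the NMI-safe path with the posted
ordering, and both hit it once the switch wraps the RCU hooks.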


> +/*
> + * Flush data from the associated per_CPU buffer. The function
> + * can be called either via IRQ work or independently.
> + */
> +static void __printk_nmi_flush(struct irq_work *work)
> +{
> +	static raw_spinlock_t read_lock =
> +		__RAW_SPIN_LOCK_INITIALIZER(read_lock);
> +	struct nmi_seq_buf *s = container_of(work, struct nmi_seq_buf, work);
> +	unsigned long flags;
> +	size_t len, size;
> +	int i, last_i;
> +
> +	/*
> +	 * The lock has two functions. First, one reader has to flush all
> +	 * available message to make the lockless synchronization with
> +	 * writers easier. Second, we do not want to mix messages from
> +	 * different CPUs. This is especially important when printing
> +	 * a backtrace.
> +	 */
> +	raw_spin_lock_irqsave(&read_lock, flags);
> +

hm... so here we have
	for (; i < size; i++)
		printk()

under the spinlock. the thing is that one of those printk() calls can end
up in the console_unlock()->call_console_drivers() loop and iterate there
long enough to cause a spinlock lockup on other CPUs that want to flush
their NMI buffers (if any). this assumes there are enough printk() calls
happening concurrently on other CPUs (or maybe a slow serial console)
to keep the current ->read_lock owner busy in that loop. async printk can
help here, but the user can request the sync version of printk.

how about using deferred printk for the NMI flush, i.e.
print_nmi_seq_line()->printk_deferred()?
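
a rough userspace sketch of the difference, under stated assumptions: the
model_* functions are hypothetical stand-ins, not the kernel API.
model_printk_sync() stores a line and immediately does the console work,
while model_printk_deferred() only stores it and leaves the console work
for later. a plain flag models ->read_lock so we can count how often
console work runs while the lock is held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static bool read_lock_held;		/* models ->read_lock being taken */
static int console_runs_under_lock;	/* console work done with lock held */
static char logbuf[128];		/* models the main log buffer */
static size_t log_len;
static size_t console_pos;		/* how much of logbuf reached the console */

/* Models the (possibly slow) console loop. */
static void model_console_flush(void)
{
	if (read_lock_held)
		console_runs_under_lock++;
	console_pos = log_len;
}

/* Copy a line into the log buffer, truncating if it does not fit. */
static void model_log_store(const char *s, size_t n)
{
	if (n > sizeof(logbuf) - log_len)
		n = sizeof(logbuf) - log_len;
	memcpy(logbuf + log_len, s, n);
	log_len += n;
}

/* Synchronous variant: store, then push to the console right away. */
static void model_printk_sync(const char *s, size_t n)
{
	model_log_store(s, n);
	model_console_flush();
}

/* Deferred variant: store only; console work happens later. */
static void model_printk_deferred(const char *s, size_t n)
{
	model_log_store(s, n);
}

typedef void (*emit_fn)(const char *s, size_t n);

/* Line-by-line flush under the modelled lock, as in the quoted loop. */
static void model_flush(const char *buf, size_t size, emit_fn emit)
{
	size_t i, last = 0;

	read_lock_held = true;
	for (i = 0; i < size; i++) {
		if (buf[i] == '\n') {
			emit(buf + last, i - last + 1);
			last = i + 1;
		}
	}
	if (last < size)	/* partial last line */
		emit(buf + last, size - last);
	read_lock_held = false;
}

void compare_flush_variants(void)
{
	static const char nmi_buf[] = "line one\nline two\n";
	size_t size = sizeof(nmi_buf) - 1;

	log_len = console_pos = 0;
	console_runs_under_lock = 0;
	model_flush(nmi_buf, size, model_printk_sync);
	assert(console_runs_under_lock == 2);	/* once per line */

	log_len = console_pos = 0;
	console_runs_under_lock = 0;
	model_flush(nmi_buf, size, model_printk_deferred);
	assert(console_runs_under_lock == 0 && console_pos == 0);

	model_console_flush();			/* later, lock not held */
	assert(console_runs_under_lock == 0 && console_pos == log_len);
}
```

with the deferred variant the lock hold time in this model depends only on
copying the lines, not on how long the console takes.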

	-ss

> +	i = 0;
> +more:
> +	len = atomic_read(&s->len);
> +
> +	/*
> +	 * This is just a paranoid check that nobody has manipulated
> +	 * the buffer an unexpected way. If we printed something then
> +	 * @len must only increase.
> +	 */
> +	if (i && i >= len)
> +		pr_err("printk_nmi_flush: internal error: i=%d >= len=%zu\n",
> +		       i, len);
> +
> +	if (!len)
> +		goto out; /* Someone else has already flushed the buffer. */
> +
> +	/* Make sure that data has been written up to the @len */
> +	smp_rmb();
> +
> +	size = min(len, sizeof(s->buffer));
> +	last_i = i;
> +
> +	/* Print line by line. */
> +	for (; i < size; i++) {
> +		if (s->buffer[i] == '\n') {
> +			print_nmi_seq_line(s, last_i, i);
> +			last_i = i + 1;
> +		}
> +	}
> +	/* Check if there was a partial line. */
> +	if (last_i < size) {
> +		print_nmi_seq_line(s, last_i, size - 1);
> +		pr_cont("\n");
> +	}
> +
> +	/*
> +	 * Check that nothing has got added in the meantime and truncate
> +	 * the buffer. Note that atomic_cmpxchg() is an implicit memory
> +	 * barrier that makes sure that the data were copied before
> +	 * updating s->len.
> +	 */
> +	if (atomic_cmpxchg(&s->len, len, 0) != len)
> +		goto more;
> +
> +out:
> +	raw_spin_unlock_irqrestore(&read_lock, flags);
> +}
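
for reference, the read/truncate retry in the quoted function can be
modelled in userspace with C11 atomics. this is only a sketch: the writer
side is not part of the quoted text, so model_append() is an assumed,
simplified writer, and race_hook is a hypothetical test hook that injects
one append between printing and the compare-and-exchange.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define MODEL_BUF_SIZE 64

struct model_seq_buf {
	atomic_int len;			/* bytes currently valid in buffer */
	char buffer[MODEL_BUF_SIZE];
};

static char out[2 * MODEL_BUF_SIZE];	/* everything the reader printed */
static size_t out_len;
static int passes;			/* how many times the reader looped */
static void (*race_hook)(struct model_seq_buf *s);

static void model_emit(const char *s, size_t n)
{
	if (n > sizeof(out) - out_len)
		n = sizeof(out) - out_len;
	memcpy(out + out_len, s, n);
	out_len += n;
}

/* Assumed, simplified writer: copy the data, then publish the new length. */
static void model_append(struct model_seq_buf *s, const char *msg)
{
	int len = atomic_load(&s->len);
	size_t n = strlen(msg);

	if ((size_t)len + n > MODEL_BUF_SIZE)
		return;
	memcpy(s->buffer + len, msg, n);
	atomic_store(&s->len, len + (int)n);
}

static void late_writer(struct model_seq_buf *s)
{
	model_append(s, "second\n");
}

/* Reader: print up to len, then truncate only if len is unchanged. */
static void model_flush(struct model_seq_buf *s)
{
	int i = 0;

	for (;;) {
		int len = atomic_load(&s->len);
		int expected = len;

		if (!len)
			return;		/* already flushed */
		passes++;
		model_emit(s->buffer + i, (size_t)(len - i));
		i = len;

		if (race_hook) {	/* simulate a writer racing with us */
			void (*hook)(struct model_seq_buf *) = race_hook;

			race_hook = NULL;
			hook(s);
		}

		/* Fails if len grew in the meantime; go around again. */
		if (atomic_compare_exchange_strong(&s->len, &expected, 0))
			return;
	}
}

void demo_flush_retry(void)
{
	static struct model_seq_buf s;

	atomic_store(&s.len, 0);
	out_len = 0;
	passes = 0;
	model_append(&s, "first\n");
	race_hook = late_writer;
	model_flush(&s);

	assert(passes == 2);
	assert(out_len == 13 && memcmp(out, "first\nsecond\n", 13) == 0);
	assert(atomic_load(&s.len) == 0);
}
```

the second pass prints only the newly added bytes because i carries over,
which is the same reason the quoted code keeps i across the "goto more".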
