Re: [RFC][PATCHv4 4/6] printk: report lost messages in printk safe/nmi contexts

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Petr Mladek <pmladek@suse.com>
To: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Jan Kara <jack@suse.cz>, Tejun Heo <tj@kernel.org>,
	Calvin Owens <calvinowens@fb.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Mel Gorman <mgorman@techsingularity.net>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Laura Abbott <labbott@redhat.com>,
	Andy Lutomirski <luto@kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Kees Cook <keescook@chromium.org>,
	linux-kernel@vger.kernel.org,
	Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Subject: Re: [RFC][PATCHv4 4/6] printk: report lost messages in printk safe/nmi contexts
Date: Fri, 25 Nov 2016 12:07:30 +0100	[thread overview]
Message-ID: <20161125110730.GH24103@pathway.suse.cz> (raw)
In-Reply-To: <20161027154933.1211-5-sergey.senozhatsky@gmail.com>

On Fri 2016-10-28 00:49:31, Sergey Senozhatsky wrote:
> Account lost messages in pritk-safe and printk-safe-nmi
> contexts and report those numbers during printk_safe_flush().
> 
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> ---
>  kernel/printk/internal.h    | 17 ---------------
>  kernel/printk/printk.c      | 10 ---------
>  kernel/printk/printk_safe.c | 50 +++++++++++++++++++++++++++++++++++++++++++--
>  3 files changed, 48 insertions(+), 29 deletions(-)
> 
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 1878b2d..4675b8d 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -1795,7 +1795,6 @@ asmlinkage int vprintk_emit(int facility, int level,
>  	unsigned long flags;
>  	int this_cpu;
>  	int printed_len = 0;
> -	int nmi_message_lost;
>  	bool in_sched = false;
>  	/* cpu currently holding logbuf_lock in this function */
>  	static unsigned int logbuf_cpu = UINT_MAX;
> @@ -1846,15 +1845,6 @@ asmlinkage int vprintk_emit(int facility, int level,
>  					 strlen(recursion_msg));
>  	}
>  
> -	nmi_message_lost = get_nmi_message_lost();
> -	if (unlikely(nmi_message_lost)) {
> -		text_len = scnprintf(textbuf, sizeof(textbuf),
> -				     "BAD LUCK: lost %d message(s) from NMI context!",
> -				     nmi_message_lost);
> -		printed_len += log_store(0, 2, LOG_PREFIX|LOG_NEWLINE, 0,
> -					 NULL, 0, textbuf, text_len);
> -	}

I really like that we are moving this out of vprintk_emit() but...

> -
>  	/*
>  	 * The printf needs to come first; we need the syslog
>  	 * prefix which might be passed-in as a parameter.
> diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
> index af74d9c..721b091 100644
> --- a/kernel/printk/printk_safe.c
> +++ b/kernel/printk/printk_safe.c
> @@ -52,10 +52,11 @@ struct printk_safe_seq_buf {
>  
>  static DEFINE_PER_CPU(struct printk_safe_seq_buf, safe_print_seq);
>  static DEFINE_PER_CPU(int, printk_safe_context);
> +static atomic_t safe_message_lost;
>  
>  #ifdef CONFIG_PRINTK_NMI
>  static DEFINE_PER_CPU(struct printk_safe_seq_buf, nmi_print_seq);
> -atomic_t nmi_message_lost;
> +static atomic_t nmi_message_lost;
>  #endif
>  
>  static int printk_safe_log_store(struct printk_safe_seq_buf *s,
> @@ -122,6 +123,41 @@ static void printk_safe_flush_seq_line(struct printk_safe_seq_buf *s,
>  	printk_safe_flush_line(buf, (end - start) + 1);
>  }
>  
> +static void report_message_lost(atomic_t *num_lost, char *fmt)
> +{
> +	int lost = atomic_xchg(num_lost, 0);
> +
> +	if (lost) {
> +		char msg[56];

I would really like to avoid a hard coded buffer size. Such things
are likely to bite us in the future.

I thought about reshuffling a lot of logic, adding more wrappers,
... But the solution might be easy in the end, see below.

> +
> +		scnprintf(msg, sizeof(msg), fmt, lost);
> +
> +		printk_safe_flush_line(msg, strlen(msg));

This made my brain spin a lot. I wondered if it did what we wanted
and it was safe.

On one hand, it is supposed to work because use exactly this
function in __printk_safe_flush() where you call this from.

One question is if it does what we want in different contexts.
Let's look at it:

1. It calls printk_deferred() in NMI context. There is a risk
   of a deadlock. But it is called only from
   printk_safe_flush_on_panic() which is the last resort. Therefore
   it does exactly what we want.

2. It calls printk()->printk_func()->vprintk_emit() in normal context.
   It is what we want in normal context.

3. It calls printk()->printk_func()->v printk_safe() in printk_safe
   context. This does not look correct. IMHO, this might happen
   only printk_safe_flush_on_panic() and we want to use
   printk_deferred() here as well.

   Also note that printk_safe buffer might still be full at this
   point. It means that the warning will not be stored. There also
   might be a risk of an infinite loop. Both problems are
   solved by printk_deferred().

The easiest solution would be to simply call printk_deferred()
here. Everything will be deferred after the async printk() patchset
anyway.

I would even use printk_deferred() in printk_safe_flush_line()
for each context. It is not optimal but it works very well
and it makes the code much more straightforward.

> +	}
> +}
> +
> +#ifdef CONFIG_PRINTK_NMI
> +
> +static void report_nmi_message_lost(void)
> +{
> +	report_message_lost(&nmi_message_lost,
> +			"Lost %d message(s) from NMI context!");
> +}
> +
> +#else
> +
> +static void report_nmi_message_lost(void)
> +{
> +}
> +
> +#endif /* CONFIG_PRINTK_NMI */
> +
> +static void report_safe_message_lost(void)
> +{
> +	report_message_lost(&safe_message_lost,
> +			"Lost %d message(s) from printk-safe context!");
> +}
> +
>  /*
>   * Flush data from the associated per-CPU buffer. The function
>   * can be called either via IRQ work or independently.
> @@ -147,6 +183,9 @@ static void __printk_safe_flush(struct irq_work *work)
>  
>  	i = 0;
>  more:
> +	report_nmi_message_lost();
> +	report_safe_message_lost();

Please, move this at the end of this function. There are few reasons
for this:

1. This causes an infinite loop if called in printk_safe() context
   and vpritnk_safe() is used to print the warning.

2. This prints the warning about lost messages in the wrong order.
   We should print the warning after we print the messages
   that fit into the buffer.

Best Regards,
Petr

next prev parent reply	other threads:[~2016-11-25 11:07 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-27 15:49 [RFC][PATCHv4 0/6] printk: use printk_safe to handle printk() recursive calls Sergey Senozhatsky
2016-10-27 15:49 ` [RFC][PATCHv4 1/6] printk: use vprintk_func in vprintk() Sergey Senozhatsky
2016-11-24 16:28   ` Petr Mladek
2016-10-27 15:49 ` [RFC][PATCHv4 2/6] printk: rename nmi.c and exported api Sergey Senozhatsky
2016-11-24 16:35   ` Petr Mladek
2016-12-01  1:07     ` Sergey Senozhatsky
2016-12-01 12:12       ` Petr Mladek
2016-10-27 15:49 ` [RFC][PATCHv4 3/6] printk: introduce per-cpu safe_print seq buffer Sergey Senozhatsky
2016-11-24 16:58   ` Petr Mladek
2016-12-01  1:08     ` Sergey Senozhatsky
2016-12-01  5:32     ` Sergey Senozhatsky
2016-10-27 15:49 ` [RFC][PATCHv4 4/6] printk: report lost messages in printk safe/nmi contexts Sergey Senozhatsky
2016-11-25 11:07   ` Petr Mladek [this message]
2016-12-01  2:10     ` Sergey Senozhatsky
2016-12-01 12:50       ` Petr Mladek
2016-10-27 15:49 ` [RFC][PATCHv4 5/6] printk: use printk_safe buffers Sergey Senozhatsky
2016-11-25 14:28   ` Petr Mladek
2016-12-01  2:14     ` Sergey Senozhatsky
2016-10-27 15:49 ` [RFC][PATCHv4 6/6] printk: remove zap_locks() function Sergey Senozhatsky
2016-11-25 15:01   ` Petr Mladek
2016-11-25 15:17     ` Peter Zijlstra
2016-12-01  2:34       ` Sergey Senozhatsky
2016-12-01  5:42         ` Peter Zijlstra
2016-12-01 13:36           ` Petr Mladek
2016-12-02  1:11             ` Sergey Senozhatsky
2016-12-01  2:18     ` Sergey Senozhatsky
2016-12-01 12:50     ` Sergey Senozhatsky
2016-12-01 13:15       ` Petr Mladek
2016-10-28  3:30 ` [RFC][PATCHv4 0/6] printk: use printk_safe to handle printk() recursive calls Linus Torvalds
2016-10-28  4:05   ` Sergey Senozhatsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161125110730.GH24103@pathway.suse.cz \
    --to=pmladek@suse.com \
    --cc=akpm@linux-foundation.org \
    --cc=calvinowens@fb.com \
    --cc=jack@suse.cz \
    --cc=keescook@chromium.org \
    --cc=labbott@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mgorman@techsingularity.net \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=sergey.senozhatsky.work@gmail.com \
    --cc=sergey.senozhatsky@gmail.com \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.