public inbox for linux-kernel@vger.kernel.org
From: Petr Mladek <pmladek@suse.com>
To: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Subject: Re: [PATCH printk v4 23/27] printk: nbcon: Implement emergency sections
Date: Fri, 12 Apr 2024 17:27:49 +0200
Message-ID: <ZhlS9fsKrLe242Zn@localhost.localdomain>
In-Reply-To: <20240402221129.2613843-24-john.ogness@linutronix.de>

On Wed 2024-04-03 00:17:25, John Ogness wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> In emergency situations (something has gone wrong but the
> system continues to operate), usually important information
> (such as a backtrace) is generated via printk(). Each
> individual printk record has little meaning. It is the
> collection of printk messages that is most often needed by
> developers and users.
> 
> In order to help ensure that the collection of printk messages
> in an emergency situation are all stored to the ringbuffer as
> quickly as possible, disable console output for that CPU while
> it is in the emergency situation. The consoles need to be
> flushed when exiting the emergency situation.
> 
> Add per-CPU emergency nesting tracking because an emergency
> can arise while in an emergency situation.
> 
> Add functions to mark the beginning and end of emergency
> sections where the urgent messages are generated.
> 
> Do not print if the current CPU is in an emergency state.
> 
> When exiting all emergency nesting, flush nbcon consoles
> directly using their atomic callback. Legacy consoles are
> triggered for flushing via irq_work because it is not known
> if the context was safe for a trylock on the console lock.
> 
> Note that the emergency state is not system-wide. While one CPU
> is in an emergency state, another CPU may continue to print
> console messages.
> 
> Co-developed-by: John Ogness <john.ogness@linutronix.de>
> Signed-off-by: John Ogness <john.ogness@linutronix.de>
> Signed-off-by: Thomas Gleixner (Intel) <tglx@linutronix.de>
> ---
>  include/linux/console.h |  4 ++
>  kernel/printk/nbcon.c   | 83 +++++++++++++++++++++++++++++++++++++++++
>  kernel/printk/printk.c  | 13 ++++++-
>  3 files changed, 98 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/console.h b/include/linux/console.h
> index 5f1758891aec..7712e4145164 100644
> --- a/include/linux/console.h
> +++ b/include/linux/console.h
> @@ -559,10 +559,14 @@ static inline bool console_is_registered(const struct console *con)
>  	hlist_for_each_entry(con, &console_list, node)
>  
>  #ifdef CONFIG_PRINTK
> +extern void nbcon_cpu_emergency_enter(void);
> +extern void nbcon_cpu_emergency_exit(void);
>  extern bool nbcon_can_proceed(struct nbcon_write_context *wctxt);
>  extern bool nbcon_enter_unsafe(struct nbcon_write_context *wctxt);
>  extern bool nbcon_exit_unsafe(struct nbcon_write_context *wctxt);
>  #else
> +static inline void nbcon_cpu_emergency_enter(void) { }
> +static inline void nbcon_cpu_emergency_exit(void) { }
>  static inline bool nbcon_can_proceed(struct nbcon_write_context *wctxt) { return false; }
>  static inline bool nbcon_enter_unsafe(struct nbcon_write_context *wctxt) { return false; }
>  static inline bool nbcon_exit_unsafe(struct nbcon_write_context *wctxt) { return false; }
> diff --git a/kernel/printk/nbcon.c b/kernel/printk/nbcon.c
> index 47f39402a22b..4c852c2e8d89 100644
> --- a/kernel/printk/nbcon.c
> +++ b/kernel/printk/nbcon.c
> @@ -936,6 +936,29 @@ static bool nbcon_emit_next_record(struct nbcon_write_context *wctxt)
>  	return nbcon_context_exit_unsafe(ctxt);
>  }
>  
> +/* Track the nbcon emergency nesting per CPU. */
> +static DEFINE_PER_CPU(unsigned int, nbcon_pcpu_emergency_nesting);
> +static unsigned int early_nbcon_pcpu_emergency_nesting __initdata;
> +
> +/**
> + * nbcon_get_cpu_emergency_nesting - Get the per CPU emergency nesting pointer
> + *
> + * Return:	Either a pointer to the per CPU emergency nesting counter of
> + *		the current CPU or to the init data during early boot.
> + */
> +static __ref unsigned int *nbcon_get_cpu_emergency_nesting(void)
> +{
> +	/*
> +	 * The value of __printk_percpu_data_ready gets set in normal
> +	 * context and before SMP initialization. As a result it could
> +	 * never change while inside an nbcon emergency section.
> +	 */
> +	if (!printk_percpu_data_ready())
> +		return &early_nbcon_pcpu_emergency_nesting;
> +
> +	return this_cpu_ptr(&nbcon_pcpu_emergency_nesting);
> +}
> +
>  /**
>   * nbcon_atomic_emit_one - Print one record for an nbcon console using the
>   *				write_atomic() callback
> @@ -977,9 +1000,15 @@ static bool nbcon_atomic_emit_one(struct nbcon_write_context *wctxt)
>   */
>  enum nbcon_prio nbcon_get_default_prio(void)
>  {
> +	unsigned int *cpu_emergency_nesting;
> +
>  	if (this_cpu_in_panic())
>  		return NBCON_PRIO_PANIC;
>  
> +	cpu_emergency_nesting = nbcon_get_cpu_emergency_nesting();
> +	if (*cpu_emergency_nesting)
> +		return NBCON_PRIO_EMERGENCY;
> +
>  	return NBCON_PRIO_NORMAL;
>  }
>  
> @@ -1146,6 +1175,60 @@ void nbcon_atomic_flush_unsafe(void)
>  	__nbcon_atomic_flush_pending(prb_next_reserve_seq(prb), true);
>  }
>  
> +/**
> + * nbcon_cpu_emergency_enter - Enter an emergency section where printk()
> + *	messages for that CPU are only stored
> + *
> + * Upon exiting the emergency section, all stored messages are flushed.
> + *
> + * Context:	Any context. Disables preemption.
> + *
> + * When within an emergency section, no printing occurs on that CPU. This
> + * is to allow all emergency messages to be dumped into the ringbuffer before
> + * flushing the ringbuffer. The actual printing occurs when exiting the
> + * outermost emergency section.
> + */
> +void nbcon_cpu_emergency_enter(void)
> +{
> +	unsigned int *cpu_emergency_nesting;
> +
> +	preempt_disable();
> +
> +	cpu_emergency_nesting = nbcon_get_cpu_emergency_nesting();
> +	(*cpu_emergency_nesting)++;
> +}
> +
> +/**
> + * nbcon_cpu_emergency_exit - Exit an emergency section and flush the
> + *	stored messages
> + *
> + * Flushing only occurs when exiting all nesting for the CPU.
> + *
> + * Context:	Any context. Enables preemption.
> + */
> +void nbcon_cpu_emergency_exit(void)
> +{
> +	unsigned int *cpu_emergency_nesting;
> +	bool do_trigger_flush = false;
> +
> +	cpu_emergency_nesting = nbcon_get_cpu_emergency_nesting();
> +
> +	WARN_ON_ONCE(*cpu_emergency_nesting == 0);

We should save the situation and make sure that the counter
does not go below zero.

Also, the WARN_ON_ONCE() might be moved after the flush. IMHO, it is
not that important.
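
To illustrate the concern: the counter is declared as unsigned int in
the patch, so an unbalanced exit would not make it negative. It would
wrap to UINT_MAX and the CPU would look like it is in an emergency
section forever. A minimal userspace sketch of a guarded decrement
follows. The helper names nesting_dec_clamped() and nesting_dec_plain()
are made up for this illustration and are not part of the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper: decrement an unsigned nesting counter without
 * letting it wrap. Returns false if the counter was already zero,
 * which is the case where the kernel code would want to warn.
 */
static bool nesting_dec_clamped(unsigned int *nesting)
{
	if (*nesting == 0)
		return false;	/* unbalanced exit, keep the counter at 0 */

	(*nesting)--;
	return true;
}

/* For comparison: a plain unsigned decrement, which wraps at zero. */
static unsigned int nesting_dec_plain(unsigned int nesting)
{
	return nesting - 1;
}
```

With this shape, the zero test happens before the decrement, so the
counter never needs a "below zero" comparison afterwards.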


> +	if (*cpu_emergency_nesting == 1) {
> +		nbcon_atomic_flush_pending();
> +		do_trigger_flush = true;
> +	}

I wanted to reshuffle the code. Then I realized that the messages
are flushed before decrementing the counter, probably on purpose.

> +
> +	/* Undo the nesting count of nbcon_cpu_emergency_enter(). */
> +	(*cpu_emergency_nesting)--;

I suggest replacing the above code with something like this:

	/*
	 * Flush the messages before enabling preemption to see them ASAP.
	 *
	 * Reduce the risk of a potential softlockup by using the flush_pending()
	 * variant, which ignores messages added later. It is called before
	 * decrementing the counter to also prevent flushing of messages from
	 * another nested emergency section.
	 */
	if (*cpu_emergency_nesting == 1) {
		nbcon_atomic_flush_pending();
		do_trigger_flush = true;
	}

	/* The counter is unsigned, so check for zero before decrementing. */
	if (!WARN_ON_ONCE(*cpu_emergency_nesting == 0))
		(*cpu_emergency_nesting)--;
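
The intended ordering can be modeled in userspace roughly as follows.
This is only a sketch under simplifying assumptions: a single CPU, no
real consoles, and plain counters instead of the ringbuffer. All
model_*() names and the three variables are invented for illustration
and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins; none of this is the kernel implementation. */
static unsigned int nesting;	/* models the per-CPU nesting counter */
static unsigned int stored;	/* records written to the "ringbuffer" */
static unsigned int flushed;	/* records emitted to the "console" */

/* Models a flush_pending() style flush: emit only what is stored now. */
static void model_flush_pending(void)
{
	flushed = stored;
}

/* Models printk(): always store, print directly only outside emergency. */
static void model_printk(void)
{
	stored++;
	if (nesting == 0)
		model_flush_pending();
}

static void model_emergency_enter(void)
{
	nesting++;
}

/* Returns true if this call left the outermost section and flushed. */
static bool model_emergency_exit(void)
{
	bool did_flush = false;

	/* Flush while the counter is still non-zero. */
	if (nesting == 1) {
		model_flush_pending();
		did_flush = true;
	}

	/* Guard against an unbalanced exit; the counter is unsigned. */
	if (nesting > 0)
		nesting--;

	return did_flush;
}
```

In this model, two nested enter calls with one message each leave
nothing flushed until the second exit, and an extra unbalanced exit
leaves the counter at zero.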

> +
> +	preempt_enable();
>
> +	if (do_trigger_flush)
> +		printk_trigger_flush();
> +}
> +
>  /**
>   * nbcon_alloc - Allocate buffers needed by the nbcon console
>   * @con:	Console to allocate buffers for

With the above change, feel free to use:

Reviewed-by: Petr Mladek <pmladek@suse.com>

Best Regards,
Petr
