public inbox for linux-kernel@vger.kernel.org
From: Petr Mladek <pmladek@suse.com>
To: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Subject: Re: [PATCH printk v5 26/30] printk: nbcon: Implement emergency sections
Date: Tue, 21 May 2024 15:38:05 +0200
Message-ID: <ZkyjvcAg8HnYEUdt@pathway.suse.cz>
In-Reply-To: <20240502213839.376636-27-john.ogness@linutronix.de>

On Thu 2024-05-02 23:44:35, John Ogness wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> In emergency situations (something has gone wrong but the
> system continues to operate), usually important information
> (such as a backtrace) is generated via printk(). Each
> individual printk record has little meaning. It is the
> collection of printk messages that is most often needed by
> developers and users.
> 
> In order to help ensure that the collection of printk messages
> in an emergency situation are all stored to the ringbuffer as
> quickly as possible, disable console output for that CPU while
> it is in the emergency situation. The consoles need to be
> flushed when exiting the emergency situation.
> 
> Add per-CPU emergency nesting tracking because an emergency
> can arise while in an emergency situation.
> 
> Add functions to mark the beginning and end of emergency
> sections where the urgent messages are generated.
> 
> Do not print if the current CPU is in an emergency state.
> 
> When exiting all emergency nesting, flush nbcon consoles
> directly using their atomic callback. Legacy consoles are
> triggered for flushing via irq_work because it is not known
> if the context was safe for a trylock on the console lock.
> 
> Note that the emergency state is not system-wide. While one CPU
> is in an emergency state, another CPU may continue to print
> console messages.
> 
> Co-developed-by: John Ogness <john.ogness@linutronix.de>
> Signed-off-by: John Ogness <john.ogness@linutronix.de>
> Signed-off-by: Thomas Gleixner (Intel) <tglx@linutronix.de>

> --- a/kernel/printk/nbcon.c
> +++ b/kernel/printk/nbcon.c
> @@ -1199,6 +1228,93 @@ void nbcon_atomic_flush_unsafe(void)
>  	__nbcon_atomic_flush_pending(prb_next_reserve_seq(prb), true);
>  }
>  
> +/**
> + * nbcon_cpu_emergency_enter - Enter an emergency section where printk()
> + *				messages for that CPU are only stored
> + *
> + * Upon exiting the emergency section, all stored messages are flushed.
> + *
> + * Context:	Any context. Disables preemption.
> + *
> + * When within an emergency section, no printing occurs on that CPU. This
> + * is to allow all emergency messages to be dumped into the ringbuffer before
> + * flushing the ringbuffer. The actual printing occurs when exiting the
> + * outermost emergency section.
> + */
> +void nbcon_cpu_emergency_enter(void)
> +{
> +	unsigned int *cpu_emergency_nesting;
> +
> +	preempt_disable();
> +
> +	cpu_emergency_nesting = nbcon_get_cpu_emergency_nesting();
> +	(*cpu_emergency_nesting)++;
> +}
> +
> +/**
> + * nbcon_cpu_emergency_exit - Exit an emergency section and flush the
> + *				stored messages
> + *
> + * Flushing only occurs when exiting all nesting for the CPU.
> + *
> + * Context:	Any context. Enables preemption.
> + */
> +void nbcon_cpu_emergency_exit(void)
> +{
> +	unsigned int *cpu_emergency_nesting;
> +	bool do_trigger_flush = false;
> +
> +	cpu_emergency_nesting = nbcon_get_cpu_emergency_nesting();
> +
> +	/*
> +	 * Flush the messages before enabling preemtion to see them ASAP.
> +	 *
> +	 * Reduce the risk of potential softlockup by using the
> +	 * flush_pending() variant which ignores messages added later. It is
> +	 * called before decrementing the counter so that the printing context
> +	 * for the emergency messages is NBCON_PRIO_EMERGENCY.
> +	 */
> +	if (*cpu_emergency_nesting == 1) {
> +		nbcon_atomic_flush_pending();
> +		do_trigger_flush = true;
> +	}

The commit message says:

  "Legacy consoles are triggered for flushing via irq_work because
   it is not known if the context was safe for a trylock on the
   console lock."

I do not feel completely comfortable with this. If printk() knows
when it is safe to call console_trylock() then we should know as
well.

IMHO, we could just call nbcon_cpu_emergency_flush(), which is
implemented below, here.
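
To illustrate the idea, here is a rough userspace model of the exit path
calling the flush helper directly on the outermost exit. It is only a
sketch: the sketch_* names are made up, a plain global stands in for the
per-CPU nesting counter, and a counter stands in for the real console
flushing. Unlike the patch, the model checks for an unbalanced exit before
the decrement, because the counter is unsigned.

```c
#include <assert.h>

/* Userspace model: one global stands in for the per-CPU nesting counter. */
static unsigned int sketch_nesting;
static int sketch_flushes;	/* counts modeled console flushes */

/* Models nbcon_cpu_emergency_flush(): a no-op outside an emergency section. */
static void sketch_emergency_flush(void)
{
	if (sketch_nesting == 0)
		return;

	sketch_flushes++;
}

/* Models nbcon_cpu_emergency_enter() without the preemption handling. */
static void sketch_emergency_enter(void)
{
	sketch_nesting++;
}

/* Models nbcon_cpu_emergency_exit() with the flush called directly. */
static void sketch_emergency_exit(void)
{
	/* Catch an unbalanced exit before touching the unsigned counter. */
	assert(sketch_nesting > 0);

	/* Flush only on the outermost exit, while still counted as emergency. */
	if (sketch_nesting == 1)
		sketch_emergency_flush();

	sketch_nesting--;
}
```

In this model, nested enter/exit pairs flush exactly once, when the last
level is left.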

> +
> +	(*cpu_emergency_nesting)--;
> +
> +	if (WARN_ON_ONCE(*cpu_emergency_nesting < 0))
> +		*cpu_emergency_nesting = 0;
> +
> +	preempt_enable();
> +
> +	if (do_trigger_flush)
> +		printk_trigger_flush();
> +}
> +
> +/**
> + * nbcon_cpu_emergency_flush - Explicitly flush consoles while
> + *				within emergency context
> + *
> + * Both nbcon and legacy consoles are flushed.
> + *
> + * It should be used only when there are too many messages printed
> + * in emergency context, for example, printing backtraces of all
> + * CPUs or processes. It is typically needed when the watchdogs
> + * need to be touched as well.
> + */
> +void nbcon_cpu_emergency_flush(void)
> +{
> +	/* The explicit flush is needed only in the emergency context. */
> +	if (*(nbcon_get_cpu_emergency_nesting()) == 0)
> +		return;
> +
> +	nbcon_atomic_flush_pending();
> +
> +	if (printing_via_unlock && !in_nmi()) {
> +		if (console_trylock())
> +			console_unlock();
> +	}

Hmm, we should also check whether we are in printk_safe context.
We could implement:

bool is_printk_deferred(void)
{
	/*
	 * The per-CPU variable @printk_context can be read safely in any
	 * context. CPU migration is always disabled when it is set.
	 */
	return this_cpu_read(printk_context) || in_nmi();
}
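
The flush function could then guard the legacy path with this helper
instead of the open-coded in_nmi() check. Below is a rough userspace model
of that logic, not kernel code: the model_* globals are stand-ins for
@printk_context, in_nmi(), the nesting counter and printing_via_unlock,
and the counters stand in for the real nbcon and legacy flushing.

```c
#include <stdbool.h>

/* Userspace stand-ins for kernel state; all names here are hypothetical. */
static unsigned int model_printk_context;	/* per-CPU @printk_context */
static bool model_in_nmi;			/* result of in_nmi() */
static unsigned int model_emergency_nesting;	/* per-CPU nesting counter */
static bool model_printing_via_unlock = true;	/* legacy consoles present */
static int model_nbcon_flushes;
static int model_legacy_flushes;

/* Same logic as the proposed helper, reading the model state. */
static bool is_printk_deferred(void)
{
	return model_printk_context || model_in_nmi;
}

/* Models nbcon_cpu_emergency_flush() with the proposed check. */
static void model_emergency_flush(void)
{
	/* The explicit flush is needed only in the emergency context. */
	if (model_emergency_nesting == 0)
		return;

	/* Stands in for nbcon_atomic_flush_pending(). */
	model_nbcon_flushes++;

	/* Stands in for console_trylock() followed by console_unlock(). */
	if (model_printing_via_unlock && !is_printk_deferred())
		model_legacy_flushes++;
}
```

In the model, the nbcon flush always happens inside an emergency section,
while the legacy flush is skipped in NMI or printk_safe context.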

How does that sound?

Best Regards,
Petr
