public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: John Ogness <john.ogness@linutronix.de>
To: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH printk v4 17/27] printk: nbcon: Use nbcon consoles in console_flush_all()
Date: Thu, 18 Apr 2024 23:51:01 +0206	[thread overview]
Message-ID: <87v84e4a76.fsf@jogness.linutronix.de> (raw)
In-Reply-To: <ZiEWaA3CeQsccEdj@pathway.suse.cz>

On 2024-04-18, Petr Mladek <pmladek@suse.com> wrote:
>> > 	   Solve this problem by introducing[*] nbcon_atomic_flush_all()
>> > 	   which would flush even newly added messages and
>> > 	   call this in nbcon_cpu_emergency_exit() when the printk
>> > 	   kthread does not work. It should bail out when there
>> > 	   is a panic in progress.
>> >
>> > 	   Motivation: It does not matter which "atomic" context
>> > 		flushes NORMAL/EMERGENCY messages when
>> > 		the printk kthread is not available.
>> 
>> I do not think that solves the problem. If the console is in an unsafe
>> section, nothing can be printed.
>
> IMHO, it solves the problem.
>
> The idea is simple:
>
>   "The current nbcon console owner will be responsible for flushing
>    all messages when the printk kthread does not exist."

Currently this is not valid. It assumes owners are printers. That is not
always true. The owner might be some task modifying the baud rate and
has nothing to do with printing.

> The prove is more complicated:
>
>    1. Let's put aside panic. We already do the best effort there.
>
>    2. Emergency mode currently violates the rule because
>       nbcon_atomic_flush_pending() ignores the simple rule.
>
>       => FIX: improve nbcon_cpu_emergency_exit() to flush
> 	      all messages when kthreads are not ready.

Emergency mode cannot flush _anything_ if there is an owner in an unsafe
region. (And that owner may not be a printer.)

>    3. Normal mode flushes nbcon consoles via
>       nbcon_legacy_emit_next_record() from console_unlock()
>       before the kthreads are started.
>
>       It is not reliable when nbcon_try_acquire() fails.
>       But it would fail only when there is another user.
>       The other owner might be:
>
> 	+ panic: will handle everything
>
> 	+ emergency: should flush everything [*]
>
> 	+ normal: can't happen because of con->device() lock.

As the code is now, "normal" does not imply con->device() lock. When
using con->write_atomic(), we do not (and can not) use the con->device()
lock. So normal _can_ fail in nbcon_legacy_emit_next_record() if another
CPU is adjusting the baud rate. This is why I said the problem with
"emergency" is basically the same problem as "normal" (WRT using
write_atomic()).

> => The only remaining problem is to fix nbcon_atomic_flush_pending()
>    to flush everything when the kthreads are not started yet.

I think my proposed change handles it better. I have been doing various
tests and also adjusted it a bit.

The reason the flushing fails is because another context owns the
console. So I think it makes sense for that owner to handle flushing
responsibility when releasing ownership (even if that context was just
changing the baud rate).

[ Keep in mind we are only talking about printing via write_atomic()
when the kthread is not available. ]

If the current owner is a normal printing context, it will print to
completion anyway (via console_flush_all()).

If the current owner is an emergency printing context, it will only
print the emergency messages (as PRIO_EMERGENCY). However, when it
releases ownership, it could flush the remaining records (as
PRIO_NORMAL) in the same fashion as console_flush_all() does it.

If the current owner is a non-printing context, when it releases
ownership, it could flush the remaining records (as PRIO_NORMAL) in the
same fashion as console_flush_all() does it.

So what I am proposing is that we add two new normal-flushing sites that
are only used when the kthread is not available:

1. after exiting emergency mode: in nbcon_cpu_emergency_exit()

2. after releasing ownership for non-printing: in nbcon_driver_release()

I think this will close the gap and it does not require irq_work.

> Sigh, all this is so complicated. I wonder how to document
> this so that other people do not have to discover these
> dependencies again and again. Is it even possible?

In the end we will have the following 5 scenarios (assuming my
proposal):

1. PRIO_NORMAL non-printing activity, always under con->device() lock,
   upon release flushes[*] full backlog

2. PRIO_NORMAL printing using write_thread(), always called from task
   context and under con->device() lock, always flushes full backlog

3. PRIO_NORMAL printing using write_atomic(), called with hardware
   interrupts disabled, always flushes full backlog, (only used when the
   kthread is not available)

4. PRIO_EMERGENCY printing using write_atomic(), called with hardware
   interrupts disabled, flushes through emergency, upon exit flushes[*]
   full backlog

5. PRIO_PANIC printing using write_atomic(), called with hardware
   interrupts disabled, flushes full backlog

[*] Only when the kthread is not available. And in that case #3 is the
    scenario used for the trailing full flushing by #1 and #4.


Full flushing is attempted in all 5 scenarios. A failed attempt means
there is a new owner, but that owner is also one of the 5 scenarios.

Am I missing something?

IMHO #3 is the only bizarre scenario. It exists only to cover when the
kthread is not available.

John

  reply	other threads:[~2024-04-18 21:45 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-02 22:11 [PATCH printk v4 00/27] wire up write_atomic() printing John Ogness
2024-04-02 22:11 ` [PATCH printk v4 01/27] printk: Add notation to console_srcu locking John Ogness
2024-04-02 22:11 ` [PATCH printk v4 02/27] printk: Properly deal with nbcon consoles on seq init John Ogness
2024-04-05 15:37   ` Petr Mladek
2024-04-02 22:11 ` [PATCH printk v4 03/27] printk: nbcon: Remove return value for write_atomic() John Ogness
2024-04-08 13:17   ` Petr Mladek
2024-04-02 22:11 ` [PATCH printk v4 04/27] printk: Check printk_deferred_enter()/_exit() usage John Ogness
2024-04-02 22:11 ` [PATCH printk v4 05/27] printk: nbcon: Add detailed doc for write_atomic() John Ogness
2024-04-08 15:20   ` Petr Mladek
2024-04-02 22:11 ` [PATCH printk v4 06/27] printk: nbcon: Add callbacks to synchronize with driver John Ogness
2024-04-09  9:20   ` Petr Mladek
2024-04-16 15:01     ` John Ogness
2024-04-17 13:03       ` Petr Mladek
2024-04-17 14:54         ` John Ogness
2024-04-18 10:33           ` Petr Mladek
2024-04-18 12:10             ` John Ogness
2024-04-18 15:03               ` Petr Mladek
2024-04-02 22:11 ` [PATCH printk v4 07/27] printk: nbcon: Use driver synchronization while registering John Ogness
2024-04-09  9:46   ` Petr Mladek
2024-04-02 22:11 ` [PATCH printk v4 08/27] serial: core: Provide low-level functions to lock port John Ogness
2024-04-09 12:00   ` Petr Mladek
2024-04-09 13:23   ` Greg Kroah-Hartman
2024-04-02 22:11 ` [PATCH printk v4 09/27] printk: nbcon: Implement processing in port->lock wrapper John Ogness
2024-04-03 11:35   ` John Ogness
2024-04-10 12:35   ` Petr Mladek
2024-04-02 22:11 ` [PATCH printk v4 10/27] printk: nbcon: Do not rely on proxy headers John Ogness
2024-04-03 10:33   ` Andy Shevchenko
2024-04-10 13:06   ` Petr Mladek
2024-04-02 22:11 ` [PATCH printk v4 11/27] printk: nbcon: Fix kerneldoc for enums John Ogness
2024-04-02 22:11 ` [PATCH printk v4 12/27] printk: Make console_is_usable() available to nbcon John Ogness
2024-04-02 22:11 ` [PATCH printk v4 13/27] printk: Let console_is_usable() handle nbcon John Ogness
2024-04-02 22:11 ` [PATCH printk v4 14/27] printk: Add @flags argument for console_is_usable() John Ogness
2024-04-02 22:11 ` [PATCH printk v4 15/27] printk: nbcon: Provide function to flush using write_atomic() John Ogness
2024-04-10 14:56   ` Petr Mladek
2024-04-02 22:11 ` [PATCH printk v4 16/27] printk: Track registered boot consoles John Ogness
2024-04-02 22:11 ` [PATCH printk v4 17/27] printk: nbcon: Use nbcon consoles in console_flush_all() John Ogness
2024-04-11 14:14   ` Petr Mladek
2024-04-11 15:32     ` Petr Mladek
2024-04-17 23:05       ` John Ogness
2024-04-18 12:47         ` Petr Mladek
2024-04-18 21:45           ` John Ogness [this message]
2024-04-19  9:55             ` Petr Mladek
2024-04-12  9:07     ` Petr Mladek
2024-04-17 21:59     ` John Ogness
2024-04-02 22:11 ` [PATCH printk v4 18/27] printk: nbcon: Assign priority based on CPU state John Ogness
2024-04-02 22:11 ` [PATCH printk v4 19/27] printk: nbcon: Add unsafe flushing on panic John Ogness
2024-04-11 14:18   ` Petr Mladek
2024-04-02 22:11 ` [PATCH printk v4 20/27] printk: Avoid console_lock dance if no legacy or boot consoles John Ogness
2024-04-03 10:01   ` John Ogness
2024-04-12 14:21     ` Petr Mladek
2024-04-02 22:11 ` [PATCH printk v4 21/27] printk: Track nbcon consoles John Ogness
2024-04-12 14:22   ` Petr Mladek
2024-04-02 22:11 ` [PATCH printk v4 22/27] printk: Coordinate direct printing in panic John Ogness
2024-04-12 14:39   ` Petr Mladek
2024-04-02 22:11 ` [PATCH printk v4 23/27] printk: nbcon: Implement emergency sections John Ogness
2024-04-12 15:27   ` Petr Mladek
2024-04-02 22:11 ` [PATCH printk v4 24/27] panic: Mark emergency section in warn John Ogness
2024-04-15 13:16   ` Petr Mladek
2024-04-02 22:11 ` [PATCH printk v4 25/27] panic: Mark emergency section in oops John Ogness
2024-04-15 13:22   ` Petr Mladek
2024-04-02 22:11 ` [PATCH printk v4 26/27] rcu: Mark emergency section in rcu stalls John Ogness
2024-04-15 13:32   ` Petr Mladek
2024-04-02 22:11 ` [PATCH printk v4 27/27] lockdep: Mark emergency sections in lockdep splats John Ogness
2024-04-16  9:51   ` Petr Mladek
2024-04-16 11:17   ` Peter Zijlstra
2024-04-16 12:53     ` John Ogness

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87v84e4a76.fsf@jogness.linutronix.de \
    --to=john.ogness@linutronix.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pmladek@suse.com \
    --cc=rostedt@goodmis.org \
    --cc=senozhatsky@chromium.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox