From: John Ogness <john.ogness@linutronix.de>
To: Petr Mladek <pmladek@suse.com>
Cc: "Toshiyuki Sato (Fujitsu)" <fj6611ie@fujitsu.com>,
'Michael Kelley' <mhklinux@outlook.com>,
'Ryo Takakura' <ryotkkr98@gmail.com>,
Russell King <linux@armlinux.org.uk>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Jiri Slaby <jirislaby@kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-serial@vger.kernel.org" <linux-serial@vger.kernel.org>,
"linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>
Subject: Re: Problem with nbcon console and amba-pl011 serial port
Date: Fri, 06 Jun 2025 19:04:22 +0206 [thread overview]
Message-ID: <84msakdcy9.fsf@jogness.linutronix.de> (raw)
In-Reply-To: <aEL0tZgSEhsR9qbf@pathway.suse.cz>
On 2025-06-06, Petr Mladek <pmladek@suse.com> wrote:
>> What if during non-panic-CPU shutdown, we allow reacquires to succeed
>> only for _direct_ acquires? The below diff shows how this could be
>> implemented. Since it only supports direct acquires, it does not violate
>> any state rules. And also, since it only involves the reacquire, there
>> is no ugly battling for console printing between the panic and non-panic
>> CPUs.
>
> Interesting idea. I thought a lot about it, see below.
>
>
>> diff --git a/include/linux/printk.h b/include/linux/printk.h
>> index 5b462029d03c1..d58ebdc8170b3 100644
>> --- a/include/linux/printk.h
>> +++ b/include/linux/printk.h
>> diff --git a/kernel/panic.c b/kernel/panic.c
>> index b0b9a8bf4560d..8f572630c9f7e 100644
>> --- a/kernel/panic.c
>> +++ b/kernel/panic.c
>> @@ -304,6 +310,8 @@ static void panic_other_cpus_shutdown(bool crash_kexec)
>> smp_send_stop();
>> else
>> crash_smp_send_stop();
>> +
>> + nbcon_panic_allow_reacquire_set(false);
>> }
>
> I have two concerns here:
>
> 1. I wonder whether this is reliable enough. It seems that
> smp_send_stop() waits at least 1 sec until the CPUs
> get stopped. But is this enough on virtualized systems?
>
> 2. It might increase a risk when CPUs are stopped using NMI.
> The change would allow a non-panic CPU to reacquire the ownership
> and enter _unsafe_ section right before being stopped by NMI.
>
>
> The 1st problem might be avoided by allowing the reacquire all
> the time unless an "unsafe" takeover happened.
>
> The 2nd problem is worse. But allowing the reacquire all the time
> might actually help as well.
>
> Note that the information about the "unsafe_takeover" is stored
> in struct console so that we even won't need a new global
> variable.
That is a good idea. We should add unsafe_takeover as a condition as
well. That can only happen way after the smp_send_stop() anyway. But it
means that only the atomic printing code would ever need to worry about
unsafe_takeover (assuming a driver were to implement some sort of
handling of it).
>> /**
>> diff --git a/kernel/printk/nbcon.c b/kernel/printk/nbcon.c
>> index d60596777d278..d960cb8a05558 100644
>> --- a/kernel/printk/nbcon.c
>> +++ b/kernel/printk/nbcon.c
>> @@ -235,7 +235,8 @@ static void nbcon_seq_try_update(struct nbcon_context *ctxt, u64 new_seq)
>> * the handover acquire method.
>> */
>> static int nbcon_context_try_acquire_direct(struct nbcon_context *ctxt,
>> - struct nbcon_state *cur)
>> + struct nbcon_state *cur,
>> + bool ignore_other_cpu_in_panic)
>> {
>> unsigned int cpu = smp_processor_id();
>> struct console *con = ctxt->console;
>> @@ -249,7 +250,7 @@ static int nbcon_context_try_acquire_direct(struct nbcon_context *ctxt,
>> * nbcon_waiter_matches(). In particular, the assumption that
>> * lower priorities are ignored during panic.
>> */
>> - if (other_cpu_in_panic())
>> + if (other_cpu_in_panic() && !ignore_other_cpu_in_panic)
>
> If you agree with allowing the reacquire all the time then I would
> rename the parameter to @is_reacquire and do something like:
>
> if (other_cpu_in_panic() &&
> (!is_reacquire || cur->unsafe_takeover))
Looks fine to me.
>> return -EPERM;
>>
>> if (ctxt->prio <= cur->prio || ctxt->prio <= cur->req_prio)
>> @@ -913,7 +920,7 @@ void nbcon_reacquire_nobuf(struct nbcon_write_context *wctxt)
>> {
>> struct nbcon_context *ctxt = &ACCESS_PRIVATE(wctxt, ctxt);
>>
>> - while (!nbcon_context_try_acquire(ctxt))
>> + while (!nbcon_context_try_acquire(ctxt, READ_ONCE(nbcon_panic_allow_reacquire)))
>
> And here it would be:
>
> while (!nbcon_context_try_acquire(ctxt, true))
Right.
>> cpu_relax();
>>
>> nbcon_write_context_set_buf(wctxt, NULL, 0);
>
>
> Summary:
>
> I open to give this alternative approach a chance when we allow the
> reacquire all the time. It might work well. And it won't require
> any special "panic" handling in all console drivers.
Agreed. Thanks for being open about this approach. I will put together
an official patch.
John
next prev parent reply other threads:[~2025-06-06 16:58 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-06-03 3:18 Problem with nbcon console and amba-pl011 serial port Michael Kelley
2025-06-03 9:03 ` Ryo Takakura
2025-06-03 9:36 ` Toshiyuki Sato (Fujitsu)
2025-06-03 10:13 ` John Ogness
2025-06-03 10:44 ` John Ogness
2025-06-04 1:22 ` Toshiyuki Sato (Fujitsu)
2025-06-04 7:44 ` John Ogness
2025-06-04 8:11 ` Russell King (Oracle)
2025-06-03 11:09 ` John Ogness
2025-06-04 4:11 ` Toshiyuki Sato (Fujitsu)
2025-06-04 7:52 ` John Ogness
2025-06-04 11:08 ` Petr Mladek
2025-06-04 11:50 ` John Ogness
2025-06-04 13:42 ` Petr Mladek
2025-06-05 5:27 ` Toshiyuki Sato (Fujitsu)
2025-06-05 13:39 ` Petr Mladek
2025-06-06 6:46 ` Toshiyuki Sato (Fujitsu)
2025-06-06 10:19 ` John Ogness
2025-06-06 10:35 ` John Ogness
2025-06-06 14:01 ` Petr Mladek
2025-06-06 16:58 ` John Ogness [this message]
2025-06-05 2:49 ` Michael Kelley
2025-06-05 6:22 ` Toshiyuki Sato (Fujitsu)
2025-06-05 7:42 ` John Ogness
2025-06-09 3:38 ` Michael Kelley
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=84msakdcy9.fsf@jogness.linutronix.de \
--to=john.ogness@linutronix.de \
--cc=fj6611ie@fujitsu.com \
--cc=gregkh@linuxfoundation.org \
--cc=jirislaby@kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-serial@vger.kernel.org \
--cc=linux@armlinux.org.uk \
--cc=mhklinux@outlook.com \
--cc=pmladek@suse.com \
--cc=ryotkkr98@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox