From: John Ogness <john.ogness@linutronix.de>
To: Petr Mladek <pmladek@suse.com>
Cc: "Sergey Senozhatsky" <senozhatsky@chromium.org>,
"Steven Rostedt" <rostedt@goodmis.org>,
"Thomas Gleixner" <tglx@linutronix.de>,
linux-kernel@vger.kernel.org,
"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
"Jiri Slaby" <jirislaby@kernel.org>,
"Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>,
"Andy Shevchenko" <andriy.shevchenko@linux.intel.com>,
"Tony Lindgren" <tony@atomide.com>,
"Geert Uytterhoeven" <geert+renesas@glider.be>,
"Justin Chen" <justin.chen@broadcom.com>,
"Jiaqing Zhao" <jiaqing.zhao@linux.intel.com>,
linux-serial@vger.kernel.org
Subject: Re: [PATCH printk v2 08/26] printk: nbcon: Implement processing in port->lock wrapper
Date: Mon, 11 Mar 2024 18:14:19 +0106 [thread overview]
Message-ID: <87le6oy9vg.fsf@jogness.linutronix.de> (raw)
In-Reply-To: <Zdh4eEJJpasEWqa5@alley>
Hi Petr,
Thanks for the detailed feedback. Here is a lengthy response. I hope it
clarifies the uart port and console fields. And I think you identified a
bug relating to the setup() callback.
On 2024-02-23, Petr Mladek <pmladek@suse.com> wrote:
> My main (only) concern was the synchronization of the various accessed
> variables, especially, port->cons.
The only issue is if port->cons disappears between lock and unlock. I
see there is code setting port->cons to NULL, although I do not know
why. Once port->cons is set, there is never a reason to unset it.
Regardless, I will add port->lock synchronization when modifying
port->cons. There are only a few code sites and they are all during
driver setup.
> Note: I am not completely sure how the early and valid console drivers
> share the same struct uart_port. So, maybe I miss some important
> guarantee.
The struct uart_port is _not_ shared between the early and normal
consoles. However, the struct console is shared for normal consoles
amoung various ports of a particular driver.
> Anyway. synchronization of port->cons looks like a shade area.
> IMHO, the existing code expects that it is used _only when the console
> is registered_. But this patch wants to access it _even before
> the console is registered_!
Indeed. It is not enough for uart_is_nbcon() to check if it is an
NBCON. It must also check if it is registered, locklessly:
hlist_unhashed_lockless(&con->node);
Most importantly to be sure that nbcon_init() has already been called.
register_console() clears the nbcon state after cons->index has been
set, but before the console has been added to the list.
> For example, it took me quite a lot of time to shake my head around:
>
> #define uart_console(port) \
> ((port)->cons && (port)->cons->index == (port)->line)
>
> + port->cons and port->line are updated in the uart code.
> It seems that the update is not serialized by port->lock.
> Something might be done under port->mutex.
>
> + cons->index is updated in register_console() under
> console_list_lock.
>
> Spoiler: I propose a solution which does not use uart_console().
This check is necessary because multiple ports of a driver will set and
share the same port->cons value, even if they are not really the
console. I looked into enforcing that port->cons is NULL if it is not a
registered console, but this is tricky. port->cons is driver-internal
and hidden from printk. The driver will set port->cons in case it
becomes the console and printk will set cons->index once it has chosen
which port will be the actual console. But there is no way to unset
port->cons if a port was not chosen by printk.
The various fields have the following meaning (AFAICT):
port->line: An identifier to represent a particular port supported by a
driver.
port->cons: The struct console to use if this port is chosen to be a
console.
port->console: Boolean, true if this port was chosen to be a
console. (Used only by the tty layer.)
cons->index: The port chosen by printk to be a console.
None of those fields specify if the port is currently registered as a
console. For that you would need to check if port->cons->node is hashed
and then verify that port->line matches port->cons->index. This is what
uart_nbcon_acquire() needs to do (as well as check if it is an nbcon
console).
>> --- a/drivers/tty/serial/8250/8250_port.c
>> +++ b/drivers/tty/serial/8250/8250_port.c
>> @@ -3284,6 +3284,7 @@ void serial8250_init_port(struct uart_8250_port *up)
>> struct uart_port *port = &up->port;
>>
>> spin_lock_init(&port->lock);
>> + port->nbcon_locked_port = false;
>
> I am not sure if the variable really helps anything:
>
> + nbcon_context release() must handle the case when it
> lost the ownership anyway.
uart_nbcon_release() must first check if the provided port is a
registered nbcon console. A simple boolean check is certainly faster
than the 4 checks mentioned above. After all, if it was never locked,
there is no reason to unlock it.
> + nbcon_acquire() is called under port->lock. It means that
> nbcon_release() should always be called when the acquire
> succeeded earlier.
Same answer as above.
> We just need to make sure that port->cons can be cleared only
> under port->lock. But this would be required even with
> port->nbcon_locked_port.
Agreed. I will add this.
>> --- a/kernel/printk/nbcon.c
>> +++ b/kernel/printk/nbcon.c
>> @@ -995,3 +996,79 @@ void nbcon_free(struct console *con)
>>
>> con->pbufs = NULL;
>> }
>> +
>> +static inline bool uart_is_nbcon(struct uart_port *up)
>> +{
>> + int cookie;
>> + bool ret;
>> +
>> + if (!uart_console(up))
>
> This function accesses up->cons. I am not completely sure how
> this value is synchronized, for example:
>
> + serial_core_add_one_port() sets uport->cons under port->mutex.
> The function is called uart_add_one_port() in various probe()
> or init() calls.
I will add port->lock synchronization.
> + univ8250_console_setup() sets and clears port->cons with
> no explicit synchronization. The function is called from
> try_enable_preferred_console() under console_list_lock.
I will add port->lock synchronization.
> IMHO, the assignment is done when the drivers are being initialized.
> It is done when the port might already be used by early consoles.
>
> Especially, according to my understanding, newcon->setup() callbacks
> are responsible for using the same port by early and real console drivers.
>
> I guess that uart_port_lock() API is used by early consoles as well.
> It means that they might access up->cons here while it is being
> manipulated by the proper driver.
Note that port->lock does not synchronize early and normal
consoles. Only the console lock can do that. But you bring up a very
good point with setup(). serial8250_console_setup() can call
probe_baud(), which will write to the hardware.
I think that con->setup() needs to be called under the console lock
(just as already with unblank() and device()).
>> + return false;
>> +
>> + cookie = console_srcu_read_lock();
>> + ret = (console_srcu_read_flags(up->cons) & CON_NBCON);
>> + console_srcu_read_unlock(cookie);
>
> Hmm, console_srcu_read_flags(con) is called under
> console_srcu_read_lock() to make sure that it does not
> disappear. It makes sense when it is used by registered consoles.
>
> But uart_port_lock() might be called even when the console
> is not registered.
>
> I suggest to remove the SRCU lock here. In this code path,
> it does not guarantee anything and is rather misleading.
Agreed.
> I would use a READ_ONCE(), for example by splitting:
>
> /*
> * Locklessly reading console->flags provides a consistent
> * read value because there is at most one CPU modifying
> * console->flags and that CPU is using only read-modify-write
> * operations to do so.
> *
> * The caller must make sure that @con does not disappear.
> * It can be done by console_srcu_read_lock() when used
> * only for registered consoles.
> */
> static inline short console_read_flags(const struct console *con)
> {
> return data_race(READ_ONCE(con->flags));
> }
>
> /* Locklessly reading console->flags for registered consoles */
> static inline short console_srcu_read_flags(const struct console *con)
> {
> WARN_ON_ONCE(!console_srcu_read_lock_is_held());
>
> console_read_flags();
> }
OK.
>> +void uart_nbcon_acquire(struct uart_port *up)
>> +{
>> + struct console *con = up->cons;
>> + struct nbcon_context ctxt;
>
> I would add:
>
> lockdep_assert_held(&up->lock);
OK.
>> +
>> + if (!uart_is_nbcon(up))
>> + return;
>> +
>> + WARN_ON_ONCE(up->nbcon_locked_port);
>> +
>> + do {
>> + do {
>> + memset(&ctxt, 0, sizeof(ctxt));
>> + ctxt.console = con;
>> + ctxt.prio = NBCON_PRIO_NORMAL;
>> + } while (!nbcon_context_try_acquire(&ctxt));
>> +
>> + } while (!nbcon_context_enter_unsafe(&ctxt));
>> +
>> + up->nbcon_locked_port = true;
>> +}
>> +EXPORT_SYMBOL_GPL(uart_nbcon_acquire);
>
> I would prefer to split the uart and nbcon specific code, for example:
Can you explain why? This code will not be used anywhere else.
> /**
> * nbcon_normal_context_acquire - Acquire a generic context with
> * the normal priority for nbcon console
> * @con: nbcon console
> *
> * Context: Any context which could not be migrated to another CPU.
> *
> * Acquire a generic context with the normal priority for the given console.
> * Prevent the release by entering the unsafe state.
> *
> * Note: The console might still loose the ownership by a hostile takeover.
> * But it can be done only by the final flush in panic() when other
> * CPUs should be stopped and other contexts interrupted.
> */
> static void nbcon_normal_context_acquire(struct console *con)
> {
> struct nbcon_context ctxt;
>
> do {
> do {
> memset(&ctxt, 0, sizeof(ctxt));
> ctxt.console = con;
> ctxt.prio = NBCON_PRIO_NORMAL;
> } while (!nbcon_context_try_acquire(&ctxt));
>
> } while (!nbcon_context_enter_unsafe(&ctxt));
> }
>
> /**
> * uart_nbcon_acquire - Acquire nbcon console associated with the gived port.
> * @up: uart port
> *
> * Context: Must be called under up->lock to prevent manipulating
> * up->cons and migrating to another CPU.
> *
> * Note: The console might still loose the ownership by a hostile takeover.
> * But it can be done only by the final flush in panic() when other
> * CPUs should be stopped and other contexts interrupted.
> */
> void uart_nbcon_acquire(struct uart_port *up)
> {
> struct console *con; = up->cons;
>
> lockdep_assert_held(&up->lock);
>
> /*
> * FIXME: This would require adding WRITE_ONCE()
> * on the write part.
> *
> * Or even better, the value should be modified under
> * port->lock so that simple read would be enough here.
> */
> con = data_race(READ_ONCE(up->cons));
>
> if (!con)
> return;
>
> if (!console_read_flags(con) & CON_NBCON)
> return;
>
> nbcon_normal_context_acquire(con);
> }
>
> Note that I did not use up->nbcon_locked_port as explained above.
Note that it will not work because other ports will share the same
up->cons value even though they are not consoles. up->cons only
specifies which struct console to use _if_ printk chooses that port as a
console. It does _not_ mean that printk has chosen that port.
>> +void uart_nbcon_release(struct uart_port *up)
>> +{
>> + struct console *con = up->cons;
>> + struct nbcon_context ctxt = {
>> + .console = con,
>> + .prio = NBCON_PRIO_NORMAL,
>> + };
>> +
>
> I would add here as well.
>
> lockdep_assert_held(&up->lock);
OK.
> This deserves a comment why we do not complain when this function
> is called for nbcon and it is not locked. Something like:
>
> /*
> * Even port used by nbcon console might be seen unlocked
> * when it was locked and the console has been registered
> * at the same time.
> */
I think a more appropriate comment would be:
/*
* This function is called for ports that are not consoles
* and for ports that may be consoles but are not nbcon
* consoles. In those the cases the nbcon console was
* never locked and this context must not unlock.
*/
>> + if (!up->nbcon_locked_port)
>> + return;
>> +
>> + if (nbcon_context_exit_unsafe(&ctxt))
>> + nbcon_context_release(&ctxt);
>> +
>> + up->nbcon_locked_port = false;
>> +}
>
> Again I would better split the nbcon and uart part and create:
I can do the split, but I do not see the reason for it.
John
next prev parent reply other threads:[~2024-03-11 17:09 UTC|newest]
Thread overview: 71+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-02-18 18:57 [PATCH printk v2 00/26] wire up write_atomic() printing John Ogness
2024-02-18 18:57 ` [PATCH printk v2 01/26] serial: core: Provide port lock wrappers John Ogness
2024-02-18 18:57 ` [PATCH printk v2 02/26] serial: core: Use " John Ogness
2024-02-18 18:57 ` [PATCH printk v2 03/26] serial: core: fix kernel-doc for uart_port_unlock_irqrestore() John Ogness
2024-02-18 18:57 ` [PATCH printk v2 04/26] printk: Consider nbcon boot consoles on seq init John Ogness
2024-02-20 10:26 ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 05/26] printk: Add notation to console_srcu locking John Ogness
2024-02-20 10:29 ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 06/26] printk: nbcon: Ensure ownership release on failed emit John Ogness
2024-02-20 15:16 ` Petr Mladek
2024-02-20 16:29 ` John Ogness
2024-02-21 13:23 ` John Ogness
2024-02-18 18:57 ` [PATCH printk v2 07/26] printk: Check printk_deferred_enter()/_exit() usage John Ogness
2024-02-21 9:55 ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 08/26] printk: nbcon: Implement processing in port->lock wrapper John Ogness
2024-02-19 12:16 ` Andy Shevchenko
2024-02-19 14:18 ` John Ogness
2024-02-19 14:35 ` Andy Shevchenko
2024-02-19 16:52 ` John Ogness
2024-02-19 17:14 ` Andy Shevchenko
2024-02-23 10:51 ` Petr Mladek
2024-03-11 17:08 ` John Ogness [this message]
2024-03-13 9:49 ` John Ogness
2024-03-22 6:23 ` Tony Lindgren
2024-03-27 9:32 ` John Ogness
2024-03-27 13:12 ` Andy Shevchenko
2024-03-14 11:37 ` John Ogness
2024-03-14 14:26 ` Petr Mladek
2024-03-15 15:04 ` John Ogness
2024-03-18 15:42 ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 09/26] printk: nbcon: Add detailed doc for write_atomic() John Ogness
2024-02-23 13:11 ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 10/26] printk: nbcon: Fix kerneldoc for enums John Ogness
2024-02-18 19:10 ` Randy Dunlap
2024-02-23 13:13 ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 11/26] printk: Make console_is_usable() available to nbcon John Ogness
2024-02-18 18:57 ` [PATCH printk v2 12/26] printk: Let console_is_usable() handle nbcon John Ogness
2024-02-18 18:57 ` [PATCH printk v2 13/26] printk: Add @flags argument for console_is_usable() John Ogness
2024-02-18 18:57 ` [PATCH printk v2 14/26] printk: nbcon: Provide function to flush using write_atomic() John Ogness
2024-02-23 15:47 ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 15/26] printk: Track registered boot consoles John Ogness
2024-02-23 15:57 ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 16/26] printk: nbcon: Use nbcon consoles in console_flush_all() John Ogness
2024-02-23 17:15 ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 17/26] printk: nbcon: Assign priority based on CPU state John Ogness
2024-02-29 13:50 ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 18/26] printk: nbcon: Add unsafe flushing on panic John Ogness
2024-02-29 13:53 ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 19/26] printk: Avoid console_lock dance if no legacy or boot consoles John Ogness
2024-02-29 15:19 ` Petr Mladek
2024-02-29 16:19 ` READ_ONCE: was: " Petr Mladek
2024-02-29 22:51 ` Paul E. McKenney
2024-02-18 18:57 ` [PATCH printk v2 20/26] printk: Track nbcon consoles John Ogness
2024-03-01 9:39 ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 21/26] printk: Coordinate direct printing in panic John Ogness
2024-03-01 13:05 ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 22/26] printk: nbcon: Implement emergency sections John Ogness
2024-03-01 13:28 ` Petr Mladek
2024-03-01 15:49 ` flush was: " Petr Mladek
2024-03-01 16:12 ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 23/26] panic: Mark emergency section in warn John Ogness
2024-03-01 13:57 ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 24/26] panic: Mark emergency section in oops John Ogness
2024-03-01 14:55 ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 25/26] rcu: Mark emergency section in rcu stalls John Ogness
2024-03-01 15:13 ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 26/26] lockdep: Mark emergency section in lockdep splats John Ogness
2024-02-19 4:14 ` Waiman Long
2024-02-19 11:11 ` John Ogness
2024-02-19 15:07 ` Waiman Long
2024-03-01 15:18 ` Petr Mladek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87le6oy9vg.fsf@jogness.linutronix.de \
--to=john.ogness@linutronix.de \
--cc=andriy.shevchenko@linux.intel.com \
--cc=geert+renesas@glider.be \
--cc=gregkh@linuxfoundation.org \
--cc=ilpo.jarvinen@linux.intel.com \
--cc=jiaqing.zhao@linux.intel.com \
--cc=jirislaby@kernel.org \
--cc=justin.chen@broadcom.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-serial@vger.kernel.org \
--cc=pmladek@suse.com \
--cc=rostedt@goodmis.org \
--cc=senozhatsky@chromium.org \
--cc=tglx@linutronix.de \
--cc=tony@atomide.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox