Re: [PATCH printk v2 08/26] printk: nbcon: Implement processing in port->lock wrapper

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Petr Mladek <pmladek@suse.com>
To: John Ogness <john.ogness@linutronix.de>
Cc: "Sergey Senozhatsky" <senozhatsky@chromium.org>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org,
	"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	"Jiri Slaby" <jirislaby@kernel.org>,
	"Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>,
	"Andy Shevchenko" <andriy.shevchenko@linux.intel.com>,
	"Tony Lindgren" <tony@atomide.com>,
	"Geert Uytterhoeven" <geert+renesas@glider.be>,
	"Justin Chen" <justin.chen@broadcom.com>,
	"Jiaqing Zhao" <jiaqing.zhao@linux.intel.com>,
	linux-serial@vger.kernel.org
Subject: Re: [PATCH printk v2 08/26] printk: nbcon: Implement processing in port->lock wrapper
Date: Fri, 23 Feb 2024 11:51:06 +0100	[thread overview]
Message-ID: <Zdh4eEJJpasEWqa5@alley> (raw)
In-Reply-To: <20240218185726.1994771-9-john.ogness@linutronix.de>

On Sun 2024-02-18 20:03:08, John Ogness wrote:
> Currently the port->lock wrappers uart_port_lock(),
> uart_port_unlock() (and their variants) only lock/unlock
> the spin_lock.
> 
> If the port is an nbcon console, the wrappers must also
> acquire/release the console and mark the region as unsafe. This
> allows general port->lock synchronization to be synchronized
> with the nbcon console ownership.
> 
> Add a flag to struct uart_port to track nbcon console ownership.

The patch makes sense.

My main (only) concern was the synchronization of the various accessed
variables, especially, port->cons.

Note: I am not completely sure how the early and valid console drivers
      share the same struct uart_port. So, maybe I miss some important
      guarantee.

Anyway. synchronization of port->cons looks like a shade area.
IMHO, the existing code expects that it is used _only when the console
is registered_. But this patch wants to access it _even before
the console is registered_!

For example, it took me quite a lot of time to shake my head around:

#define uart_console(port) \
	((port)->cons && (port)->cons->index == (port)->line)

  + port->cons and port->line are updated in the uart code.
    It seems that the update is not serialized by port->lock.
    Something might be done under port->mutex.

  + cons->index is updated in register_console() under
    console_list_lock.

Spoiler: I propose a solution which does not use uart_console().

See below for more details.

> --- a/drivers/tty/serial/8250/8250_port.c
> +++ b/drivers/tty/serial/8250/8250_port.c
> @@ -3284,6 +3284,7 @@ void serial8250_init_port(struct uart_8250_port *up)
>  	struct uart_port *port = &up->port;
>  
>  	spin_lock_init(&port->lock);
> +	port->nbcon_locked_port = false;

I am not sure if the variable really helps anything:

   + nbcon_context release() must handle the case when it
     lost the ownership anyway.

     It is the same as if it did not acquire the context
     in the first place [*].


   + nbcon_acquire() is called under port->lock. It means that
     nbcon_release() should always be called when the acquire
     succeeded earlier.

We just need to make sure that port->cons can be cleared only
under port->lock. But this would be required even with
port->nbcon_locked_port.

[*] I am not super-happy with this semantic because it prevents
    catching bugs by lockdep. But it is how nbcon_acquire/release
    work and it is important part of the design.


Apology: It is possible that I suggested to add this variable. I just
	see now that it does not really help much. It rather makes
	a false feeling about that nbcon acquire/release are always
	paired.

>  	port->ctrl_id = 0;
>  	port->pm = NULL;
>  	port->ops = &serial8250_pops;
> --- a/kernel/printk/nbcon.c
> +++ b/kernel/printk/nbcon.c
> @@ -995,3 +996,79 @@ void nbcon_free(struct console *con)
>  
>  	con->pbufs = NULL;
>  }
> +
> +static inline bool uart_is_nbcon(struct uart_port *up)
> +{
> +	int cookie;
> +	bool ret;
> +
> +	if (!uart_console(up))

This function accesses up->cons. I am not completely sure how
this value is synchronized, for example:

  + serial_core_add_one_port() sets uport->cons under port->mutex.
    The function is called uart_add_one_port() in various probe()
    or init() calls.

  + univ8250_console_setup() sets and clears port->cons with
    no explicit synchronization. The function is called from
    try_enable_preferred_console() under console_list_lock.

IMHO, the assignment is done when the drivers are being initialized.
It is done when the port might already be used by early consoles.

Especially, according to my understanding, newcon->setup() callbacks
are responsible for using the same port by early and real console drivers.

I guess that uart_port_lock() API is used by early consoles as well.
It means that they might access up->cons here while it is being
manipulated by the proper driver.

A minimal solution would be access/modify port->cons using
READ_ONCE()/WRITE_ONCE().

But I think that we do not need to call uart_console() at all.
We do not really need to synchronize the consistency of
con->index and port->line.

Instead, we could read up->cons using READ_ONCE() and
check if it is defined. Or even better would be to always
set/read port->cons under the port->lock.


> +		return false;
> +
> +	cookie = console_srcu_read_lock();
> +	ret = (console_srcu_read_flags(up->cons) & CON_NBCON);
> +	console_srcu_read_unlock(cookie);

Hmm, console_srcu_read_flags(con) is called under
console_srcu_read_lock() to make sure that it does not
disappear. It makes sense when it is used by registered consoles.

But uart_port_lock() might be called even when the console
is not registered.

I suggest to remove the SRCU lock here. In this code path,
it does not guarantee anything and is rather misleading.

I would use a READ_ONCE(), for example by splitting:

/*
 * Locklessly reading console->flags provides a consistent
 * read value because there is at most one CPU modifying
 * console->flags and that CPU is using only read-modify-write
 * operations to do so.
 *
 * The caller must make sure that @con does not disappear.
 * It can be done by console_srcu_read_lock() when used
 * only for registered consoles.
 */
static inline short console_read_flags(const struct console *con)
{
	return data_race(READ_ONCE(con->flags));
}

/* Locklessly reading console->flags for registered consoles */
static inline short console_srcu_read_flags(const struct console *con)
{
	WARN_ON_ONCE(!console_srcu_read_lock_is_held());

	console_read_flags();
}

> +	return ret;
> +}
> +
> +/**
> + * uart_nbcon_acquire - The second half of the port locking wrapper
> + * @up:		The uart port whose @lock was locked
> + *
> + * The uart_port_lock() wrappers will first lock the spin_lock @up->lock.
> + * Then this function is called to implement nbcon-specific processing.
> + *
> + * If @up is an nbcon console, this console will be acquired and marked as
> + * unsafe. Otherwise this function does nothing.
> + */
> +void uart_nbcon_acquire(struct uart_port *up)
> +{
> +	struct console *con = up->cons;
> +	struct nbcon_context ctxt;

I would add:

	lockdep_assert_held(&up->lock);
> +
> +	if (!uart_is_nbcon(up))
> +		return;
> +
> +	WARN_ON_ONCE(up->nbcon_locked_port);
> +
> +	do {
> +		do {
> +			memset(&ctxt, 0, sizeof(ctxt));
> +			ctxt.console	= con;
> +			ctxt.prio	= NBCON_PRIO_NORMAL;
> +		} while (!nbcon_context_try_acquire(&ctxt));
> +
> +	} while (!nbcon_context_enter_unsafe(&ctxt));
> +
> +	up->nbcon_locked_port = true;
> +}
> +EXPORT_SYMBOL_GPL(uart_nbcon_acquire);

I would prefer to split the uart and nbcon specific code, for example:

/**
 * nbcon_normal_context_acquire - Acquire a generic context with
 *	the normal priority for nbcon console
 * @con:	nbcon console
 *
 * Context:	Any context which could not be migrated to another CPU.
 *
 * Acquire a generic context with the normal priority for the given console.
 * Prevent the release by entering the unsafe state.
 *
 * Note: The console might still loose the ownership by a hostile takeover.
 *	 But it can be done only by the final flush in panic() when other
 *	 CPUs should be stopped and other contexts interrupted.
 */
static void nbcon_normal_context_acquire(struct console *con)
{
	struct nbcon_context ctxt;

	do {
		do {
			memset(&ctxt, 0, sizeof(ctxt));
			ctxt.console	= con;
			ctxt.prio	= NBCON_PRIO_NORMAL;
		} while (!nbcon_context_try_acquire(&ctxt));

	} while (!nbcon_context_enter_unsafe(&ctxt));
}

/**
 * uart_nbcon_acquire - Acquire nbcon console associated with the gived port.
 * @up:		uart port
 *
 * Context:	Must be called under up->lock to prevent manipulating
 *		up->cons and migrating to another CPU.
 *
 * Note: The console might still loose the ownership by a hostile takeover.
 *	 But it can be done only by the final flush in panic() when other
 *	 CPUs should be stopped and other contexts interrupted.
 */
void uart_nbcon_acquire(struct uart_port *up)
{
	struct console *con; = up->cons;

	lockdep_assert_held(&up->lock);

	/*
	 * FIXME: This would require adding WRITE_ONCE()
	 * on the write part.
	 *
	 * Or even better, the value should be modified under
	 * port->lock so that simple read would be enough here.
	 */
	con = data_race(READ_ONCE(up->cons));

	if (!con)
		return;

	if (!console_read_flags(con) & CON_NBCON)
		return;

	nbcon_normal_context_acquire(con);
}

Note that I did not use up->nbcon_locked_port as explained above.


> +
> +/**
> + * uart_nbcon_release - The first half of the port unlocking wrapper
> + * @up:		The uart port whose @lock is about to be unlocked
> + *
> + * The uart_port_unlock() wrappers will first call this function to implement
> + * nbcon-specific processing. Then afterwards the uart_port_unlock() wrappers
> + * will unlock the spin_lock @up->lock.
> + *
> + * If @up is an nbcon console, the console will be marked as safe and
> + * released. Otherwise this function does nothing.
> + */
> +void uart_nbcon_release(struct uart_port *up)
> +{
> +	struct console *con = up->cons;
> +	struct nbcon_context ctxt = {
> +		.console	= con,
> +		.prio		= NBCON_PRIO_NORMAL,
> +	};
> +

I would add here as well.

	lockdep_assert_held(&up->lock);


This deserves a comment why we do not complain when this function
is called for nbcon and it is not locked. Something like:

	/*
	 * Even port used by nbcon console might be seen unlocked
	 * when it was locked and the console has been registered
	 * at the same time.
	 */
> +	if (!up->nbcon_locked_port)
> +		return;

Wait, if up->cons might be set asynchronously then it might also
disapper assynchronously. Which would be really bad because we would
not be able to release the normal context.

IMHO, we really need to synchronize up->cons acceess using up->lock.

> +
> +	if (nbcon_context_exit_unsafe(&ctxt))
> +		nbcon_context_release(&ctxt);
> +
> +	up->nbcon_locked_port = false;
> +}

Again I would better split the nbcon and uart part and create:

/**
 * nbcon_normal_context_release - Release a generic context with
 *	the normal priority.
 * @con:	nbcon console
 *
 * Context:	Any context which could not be migrated to another CPU.
 *
 * Release a generic context with the normal priority for the given console.
 * Prevent the release by entering the unsafe state.
 *
 * Note: The console might have lost the ownership by a hostile takeover.
 *	 But it should not happen in reality because the hostile
 *       takeover is allowed only for the final flush in panic()
 *	 when other CPUs should be stopped and other contexts interrupted.
 */
static void nbcon_normal_context_release(struct console *con)
{
	struct nbcon_context ctxt = {
		.console	= con,
		.prio		= NBCON_PRIO_NORMAL,
	};

	if (nbcon_context_exit_unsafe(&ctxt))
		nbcon_context_release(&ctxt);
}

/**
 * uart_nbcon_acquire - Release nbcon console associated with the given port.
 * @up:		uart port
 *
 * Context:	Must be called under up->lock to prevent manipulating
 *		up->cons and migrating to another CPU.
 */
void uart_nbcon_release(struct uart_port *up)
{
	struct console *con;

	lockdep_assert_held(&up->lock);

	/*
	 * FIXME: This would require adding WRITE_ONCE()
	 * on the write part.
	 *
	 * Or even better, the value should be modified under
	 * port->lock so that simple read would be enough here.
	 */
	con = data_race(READ_ONCE(up->cons));

	if (!con)
		return;

	if (!console_read_flags(con) & CON_NBCON)
		return;

	nbcon_normal_context_release(con);
}

Best Regards,
Petr

next prev parent reply	other threads:[~2024-02-23 10:51 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-18 18:57 [PATCH printk v2 00/26] wire up write_atomic() printing John Ogness
2024-02-18 18:57 ` [PATCH printk v2 01/26] serial: core: Provide port lock wrappers John Ogness
2024-02-18 18:57 ` [PATCH printk v2 02/26] serial: core: Use " John Ogness
2024-02-18 18:57 ` [PATCH printk v2 03/26] serial: core: fix kernel-doc for uart_port_unlock_irqrestore() John Ogness
2024-02-18 18:57 ` [PATCH printk v2 04/26] printk: Consider nbcon boot consoles on seq init John Ogness
2024-02-20 10:26   ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 05/26] printk: Add notation to console_srcu locking John Ogness
2024-02-20 10:29   ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 06/26] printk: nbcon: Ensure ownership release on failed emit John Ogness
2024-02-20 15:16   ` Petr Mladek
2024-02-20 16:29     ` John Ogness
2024-02-21 13:23       ` John Ogness
2024-02-18 18:57 ` [PATCH printk v2 07/26] printk: Check printk_deferred_enter()/_exit() usage John Ogness
2024-02-21  9:55   ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 08/26] printk: nbcon: Implement processing in port->lock wrapper John Ogness
2024-02-19 12:16   ` Andy Shevchenko
2024-02-19 14:18     ` John Ogness
2024-02-19 14:35       ` Andy Shevchenko
2024-02-19 16:52         ` John Ogness
2024-02-19 17:14           ` Andy Shevchenko
2024-02-23 10:51   ` Petr Mladek [this message]
2024-03-11 17:08     ` John Ogness
2024-03-13  9:49       ` John Ogness
2024-03-22  6:23         ` Tony Lindgren
2024-03-27  9:32           ` John Ogness
2024-03-27 13:12             ` Andy Shevchenko
2024-03-14 11:37       ` John Ogness
2024-03-14 14:26       ` Petr Mladek
2024-03-15 15:04         ` John Ogness
2024-03-18 15:42           ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 09/26] printk: nbcon: Add detailed doc for write_atomic() John Ogness
2024-02-23 13:11   ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 10/26] printk: nbcon: Fix kerneldoc for enums John Ogness
2024-02-18 19:10   ` Randy Dunlap
2024-02-23 13:13   ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 11/26] printk: Make console_is_usable() available to nbcon John Ogness
2024-02-18 18:57 ` [PATCH printk v2 12/26] printk: Let console_is_usable() handle nbcon John Ogness
2024-02-18 18:57 ` [PATCH printk v2 13/26] printk: Add @flags argument for console_is_usable() John Ogness
2024-02-18 18:57 ` [PATCH printk v2 14/26] printk: nbcon: Provide function to flush using write_atomic() John Ogness
2024-02-23 15:47   ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 15/26] printk: Track registered boot consoles John Ogness
2024-02-23 15:57   ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 16/26] printk: nbcon: Use nbcon consoles in console_flush_all() John Ogness
2024-02-23 17:15   ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 17/26] printk: nbcon: Assign priority based on CPU state John Ogness
2024-02-29 13:50   ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 18/26] printk: nbcon: Add unsafe flushing on panic John Ogness
2024-02-29 13:53   ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 19/26] printk: Avoid console_lock dance if no legacy or boot consoles John Ogness
2024-02-29 15:19   ` Petr Mladek
2024-02-29 16:19   ` READ_ONCE: was: " Petr Mladek
2024-02-29 22:51     ` Paul E. McKenney
2024-02-18 18:57 ` [PATCH printk v2 20/26] printk: Track nbcon consoles John Ogness
2024-03-01  9:39   ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 21/26] printk: Coordinate direct printing in panic John Ogness
2024-03-01 13:05   ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 22/26] printk: nbcon: Implement emergency sections John Ogness
2024-03-01 13:28   ` Petr Mladek
2024-03-01 15:49   ` flush was: " Petr Mladek
2024-03-01 16:12     ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 23/26] panic: Mark emergency section in warn John Ogness
2024-03-01 13:57   ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 24/26] panic: Mark emergency section in oops John Ogness
2024-03-01 14:55   ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 25/26] rcu: Mark emergency section in rcu stalls John Ogness
2024-03-01 15:13   ` Petr Mladek
2024-02-18 18:57 ` [PATCH printk v2 26/26] lockdep: Mark emergency section in lockdep splats John Ogness
2024-02-19  4:14   ` Waiman Long
2024-02-19 11:11     ` John Ogness
2024-02-19 15:07       ` Waiman Long
2024-03-01 15:18   ` Petr Mladek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zdh4eEJJpasEWqa5@alley \
    --to=pmladek@suse.com \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=geert+renesas@glider.be \
    --cc=gregkh@linuxfoundation.org \
    --cc=ilpo.jarvinen@linux.intel.com \
    --cc=jiaqing.zhao@linux.intel.com \
    --cc=jirislaby@kernel.org \
    --cc=john.ogness@linutronix.de \
    --cc=justin.chen@broadcom.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-serial@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=senozhatsky@chromium.org \
    --cc=tglx@linutronix.de \
    --cc=tony@atomide.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox