Lock contention with console kthreads

public inbox for linux-serial@vger.kernel.org
 help / color / mirror / Atom feed

* Lock contention with console kthreads
@ 2025-09-09 20:14 Andrew Murray
  2025-09-10  9:14 ` Petr Mladek
  0 siblings, 1 reply; 4+ messages in thread
From: Andrew Murray @ 2025-09-09 20:14 UTC (permalink / raw)
  To: John Ogness, senozhatsky, pmladek, rostedt
  Cc: linux-serial, Andrew Murray, Michael Cobb

Hello,

Since commit 5f53ca3ff83b ("printk: Implement legacy printer kthread
for PREEMPT_RT"), and only when PREEMPT_RT is enabled, legacy consoles
use a kthread to write out to the device. This is beneficial for boot
time when using a slow console such as a serial port. However, this
gain can be short-lived as console_on_rootfs in init/main.c appears to
act as a synchronisation point.

legacy_kthread_func holds console_lock when flushing, and is held for
a long period of time during boot. Towards the end of boot,
console_on_rootfs is called which calls filp_open("/devconsole"). This
does a tty_lookup_driver which uses the console_device function to
obtain the tty_driver. console_device attempts to acquire
console_lock. At this point in time, with a (very) slow UART, progress
through kernel_init_freeable is halted waiting for the lock,
eliminating the gains of offloading to a kthread. If this could be
avoided, then boot time would be improved.

I also see something similar with an nbcon console (with
b63e6f60eab45b16a1bf734fef9035a4c4187cd5 on 8250), with
console_on_rootfs being held up by the writeout, as far as I can tell
this is due to lock contention in many places, for example with long
delays acquiring uart_port_lock_irq.

These issues can easily be reproduced by adding 'initcall_debug=1
loglevel=10' on a 8250 console at 9600 bps.

I know the legacy printer and nbcon work relate to PREEMPT_RT rather
than boot time, and so I thought I'd point out my observations as it
seems we're pretty close to getting boot time benefits from this work
as well.

Assuming my observations are correct, are these known issues, is there
any interest in accepting patches to address them? Or pointers for a
direction forward?

Andrew Murray

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Lock contention with console kthreads
  2025-09-09 20:14 Lock contention with console kthreads Andrew Murray
@ 2025-09-10  9:14 ` Petr Mladek
  2025-09-10  9:39   ` John Ogness
  0 siblings, 1 reply; 4+ messages in thread
From: Petr Mladek @ 2025-09-10  9:14 UTC (permalink / raw)
  To: Andrew Murray
  Cc: John Ogness, senozhatsky, rostedt, linux-serial, Michael Cobb

On Tue 2025-09-09 21:14:21, Andrew Murray wrote:
> Hello,
> 
> Since commit 5f53ca3ff83b ("printk: Implement legacy printer kthread
> for PREEMPT_RT"), and only when PREEMPT_RT is enabled, legacy consoles
> use a kthread to write out to the device. This is beneficial for boot
> time when using a slow console such as a serial port. However, this
> gain can be short-lived as console_on_rootfs in init/main.c appears to
> act as a synchronisation point.
> 
> legacy_kthread_func holds console_lock when flushing, and is held for
> a long period of time during boot.

This is the main principle of the legacy loop.

At the very beginning, printk() emitted the messages to the console(s)
directly. But it slowed everything. Several CPUs were waiting for
the serial console...

In v2.4.10 (Aug 2001), printk() started using console_trylock().
Other CPUs were allowed to store new messages and skip emitting them
when the trylock failed. The owner of the console_lock became
responsible for flushing everything.

I believe that we could change this rule for the printk kthread,
see below.

> Towards the end of boot,
> console_on_rootfs is called which calls filp_open("/devconsole"). This
> does a tty_lookup_driver which uses the console_device function to
> obtain the tty_driver. console_device attempts to acquire
> console_lock. At this point in time, with a (very) slow UART, progress
> through kernel_init_freeable is halted waiting for the lock,
> eliminating the gains of offloading to a kthread. If this could be
> avoided, then boot time would be improved.
> 
> I also see something similar with an nbcon console (with
> b63e6f60eab45b16a1bf734fef9035a4c4187cd5 on 8250), with
> console_on_rootfs being held up by the writeout, as far as I can tell
> this is due to lock contention in many places, for example with long
> delays acquiring uart_port_lock_irq.
> 
> These issues can easily be reproduced by adding 'initcall_debug=1
> loglevel=10' on a 8250 console at 9600 bps.
> 
> I know the legacy printer and nbcon work relate to PREEMPT_RT rather
> than boot time, and so I thought I'd point out my observations as it
> seems we're pretty close to getting boot time benefits from this work
> as well.
> 
> Assuming my observations are correct, are these known issues,

I hear about this issue for the first time. But I am not surprised.
The description of the problem makes perfect sense.

> is there any interest in accepting patches to address them?
> Or pointers for a direction forward?

IMHO, it is worth fixing. It is ugly to slow down the boot by
the speed of the serial console.

I would suggest to modify legacy_kthread_func() and flush one
record-per-cycle and get/release the console_lock() in each cycle.

Something like, inspired by nbcon_kthread_func():

static int legacy_kthread_func(void *unused)
{

wait_for_event:

	wait_event_interruptible(legacy_wait, legacy_kthread_should_wakeup());

	do {

		if (kthread_should_stop())
			break;

		console_lock();

		progress = console_flush_all_one_record(&next_seq, &handover)
		if (handover)
			break;

		__console_unlock()
		cond_resched();
	} while(progress);

	return 0;
}

, where console_flush_all_one_record() would implement one inner-cycle
  from console_flush_all(), something like:

static bool console_flush_all_one_record(u64 *next_seq, bool *handover, bool *any_usable)
{
	any_progress = false;
	*any_usable = false;

	printk_get_console_flush_type(&ft);

	cookie = console_srcu_read_lock();
	for_each_console_srcu(con) {
		short flags = console_srcu_read_flags(con);
		u64 printk_seq;
		bool progress;

		/*
		 * console_flush_all() is only responsible for nbcon
		 * consoles when the nbcon consoles cannot print via
		 * their atomic or threaded flushing.
		 */
		if ((flags & CON_NBCON) && (ft.nbcon_atomic || ft.nbcon_offload))
			continue;

		if (!console_is_usable(con, flags, !do_cond_resched))
			continue;
		*any_usable = true;

		if (flags & CON_NBCON) {
			progress = nbcon_legacy_emit_next_record(con, handover, cookie,
								 !do_cond_resched);
			printk_seq = nbcon_seq_read(con);
		} else {
			progress = console_emit_next_record(con, handover, cookie);
			printk_seq = con->seq;
		}

		/*
		 * If a handover has occurred, the SRCU read lock
		 * is already released.
		 */
		if (*handover)
			return false;

		/* Track the next of the highest seq flushed. */
		if (printk_seq > *next_seq)
			*next_seq = printk_seq;

		if (!progress)
			continue;
		any_progress = true;

		/* Allow panic_cpu to take over the consoles safely. */
		if (other_cpu_in_panic())
			goto abandon;
	}

	console_srcu_read_unlock(cookie);

	return any_progress;

abandon:
	console_srcu_read_unlock(cookie);
	return false;
}

This function would be used also in console_flush_all(), something
like:

static bool console_flush_all(bool do_cond_resched, u64 *next_seq, bool *handover)
{
	struct console_flush_type ft;
	struct console *con;
	bool any_progress;
	bool any_usable;
	int cookie;

	*next_seq = 0;
	*handover = false;

	do {
		any_progress = console_flush_all_one_record(next_seq, handover, &any_usable);

		if (*handover)
			return false;

		/* Allow panic_cpu to take over the consoles safely. */
		if (other_cpu_in_panic())
			return false;

		if (do_cond_resched)
			cond_resched();
	} while (any_progress);

	return any_usable;
}

BTW: I see one more advantage. The new code will take console_srcu_read_lock()
     only around one record. It might remove contention also in the global
     SRCU machinery.

Warning: The above code is just a concept. It is not even compile tested.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Lock contention with console kthreads
  2025-09-10  9:14 ` Petr Mladek
@ 2025-09-10  9:39   ` John Ogness
  2025-09-11 10:12     ` Andrew Murray
  0 siblings, 1 reply; 4+ messages in thread
From: John Ogness @ 2025-09-10  9:39 UTC (permalink / raw)
  To: Petr Mladek, Andrew Murray
  Cc: senozhatsky, rostedt, linux-serial, Michael Cobb

On 2025-09-10, Petr Mladek <pmladek@suse.com> wrote:
>> Since commit 5f53ca3ff83b ("printk: Implement legacy printer kthread
>> for PREEMPT_RT"), and only when PREEMPT_RT is enabled, legacy consoles
>> use a kthread to write out to the device. This is beneficial for boot
>> time when using a slow console such as a serial port. However, this
>> gain can be short-lived as console_on_rootfs in init/main.c appears to
>> act as a synchronisation point.
>> 
>> legacy_kthread_func holds console_lock when flushing, and is held for
>> a long period of time during boot.

[...]

> I would suggest to modify legacy_kthread_func() and flush one
> record-per-cycle and get/release the console_lock() in each cycle.

Agreed. This is an excellent suggestion!

John

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Lock contention with console kthreads
  2025-09-10  9:39   ` John Ogness
@ 2025-09-11 10:12     ` Andrew Murray
  0 siblings, 0 replies; 4+ messages in thread
From: Andrew Murray @ 2025-09-11 10:12 UTC (permalink / raw)
  To: John Ogness; +Cc: Petr Mladek, senozhatsky, rostedt, linux-serial, Michael Cobb

Hi Petr, John,

On Wed, 10 Sept 2025 at 10:39, John Ogness <john.ogness@linutronix.de> wrote:
>
> On 2025-09-10, Petr Mladek <pmladek@suse.com> wrote:
> >> Since commit 5f53ca3ff83b ("printk: Implement legacy printer kthread
> >> for PREEMPT_RT"), and only when PREEMPT_RT is enabled, legacy consoles
> >> use a kthread to write out to the device. This is beneficial for boot
> >> time when using a slow console such as a serial port. However, this
> >> gain can be short-lived as console_on_rootfs in init/main.c appears to
> >> act as a synchronisation point.
> >>
> >> legacy_kthread_func holds console_lock when flushing, and is held for
> >> a long period of time during boot.
>
> [...]
>
> > I would suggest to modify legacy_kthread_func() and flush one
> > record-per-cycle and get/release the console_lock() in each cycle.
>
> Agreed. This is an excellent suggestion!
>

Thanks for your quick feedback and suggestions.

I'll see if I can make some magic happen :)

Thanks,

Andrew Murray

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2025-09-11 10:13 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-09 20:14 Lock contention with console kthreads Andrew Murray
2025-09-10  9:14 ` Petr Mladek
2025-09-10  9:39   ` John Ogness
2025-09-11 10:12     ` Andrew Murray

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox