public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: John Ogness <john.ogness@linutronix.de>
To: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH printk-rework 11/12] printk: remove logbuf_lock
Date: Tue, 02 Feb 2021 12:47:20 +0106	[thread overview]
Message-ID: <87czxivgrj.fsf@jogness.linutronix.de> (raw)
In-Reply-To: <YBkYOKL22kADKTeG@alley>

On 2021-02-02, Petr Mladek <pmladek@suse.com> wrote:
> On Tue 2021-01-26 22:21:50, John Ogness wrote:
>> Since the ringbuffer is lockless, there is no need for it to be
>> protected by @logbuf_lock. Remove @logbuf_lock.
>> 
>> This means that printk_nmi_direct and printk_safe_flush_on_panic()
>> no longer need to acquire any lock to run.
>> 
>> @console_seq, @exclusive_console_stop_seq, @console_dropped are
>> protected by @console_lock.
>> 
>> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
>> index d14a4afc5b72..b57dba7f077d 100644
>> --- a/kernel/printk/printk.c
>> +++ b/kernel/printk/printk.c
>> @@ -401,6 +366,7 @@ static u64 syslog_seq;
>>  static size_t syslog_partial;
>>  static bool syslog_time;
>>  
>> +/* All 3 protected by @console_sem. */
>>  /* the next printk record to write to the console */
>>  static u64 console_seq;
>>  static u64 exclusive_console_stop_seq;
>> @@ -762,27 +728,27 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf,
>>  	if (ret)
>>  		return ret;
>>  
>> -	logbuf_lock_irq();
>> +	printk_safe_enter_irq();
>
> What is the exact reason to keep this, please?

As Sergey pointed out [0], logbuf_lock_irq() does 2 things: logbuf_lock
and safe buffers. This series is not trying to remove the safe buffers
(a later series will). The series is only removing logbuf_lock. So all
logbuf_lock_*() calls will turn into printk_safe_*() calls. There are a
few exceptions, which you noticed and I will respond to.

[0] https://lkml.kernel.org/r/20201208203539.GB1667627@google.com

> 1. The primary function of the printk_safe context is to avoid deadlock
>    caused by logbuf_lock. It might have happened with recursive or nested
>    printk(). But logbuf_lock is gone now.

Agreed. Deadlock is not a concern anymore.

> 2. There are still some hidded locks that were guarded by this as
>    well. For example, console_owner_lock, or spinlock inside
>    console_sem, or scheduler locks taken when console_sem()
>    wakes another waiting process. It might still make sense
>    to somehow guard these.

This was not my motivation and I do not think it is an issue. I am not
aware of any technical need for the safe buffers to protect such
synchronization.

> 3. It kind of prevented infinite printk() recursion by using another
>    code path. The other path was limited by the size of the per-cpu
>    buffer. Well, recursion inside printk_safe code would likely
>    hit end of the stack first.

Yes, this was my main motivation. The safe buffers carry this
responsibility in mainline. So until a replacement for recursion
protection is in place, the safe buffers should remain.

And even if we decide we do not need/want recursion protection, I still
do not think this series should be the one to remove it. I only wanted
to remove logbuf_lock for now.

If we later have regressions, it will be helpful to bisect if the safe
buffers (with their local_irq_disable()) or the logbuf_lock were
involved.

> IMHO, we do not need printk safe context here in devkmsg_read().
> It does not belong into any categoty that is described above.
> logbug_lock() is gone. devkmsg_read() is never called directly
> from printk().

No. But it is calling printk_ringbuffer functions that can trigger
WARN_ONs that can trigger printk's.

> The same is true for almost entire patch. There are only two or so
> exceptions, see below.
>
>
>>  	if (!prb_read_valid(prb, atomic64_read(&user->seq), r)) {
>>  		if (file->f_flags & O_NONBLOCK) {
>>  			ret = -EAGAIN;
>> -			logbuf_unlock_irq();
>> +			printk_safe_exit_irq();
>>  			goto out;
>>  		}
>>  
>> -		logbuf_unlock_irq();
>> +		printk_safe_exit_irq();
>>  		ret = wait_event_interruptible(log_wait,
>>  				prb_read_valid(prb, atomic64_read(&user->seq), r));
>>  		if (ret)
>>  			goto out;
>> -		logbuf_lock_irq();
>> +		printk_safe_enter_irq();
>>  	}
>>  
>>  	if (atomic64_read(&user->seq) < prb_first_valid_seq(prb)) {
>>  		/* our last seen message is gone, return error and reset */
>>  		atomic64_set(&user->seq, prb_first_valid_seq(prb));
>>  		ret = -EPIPE;
>> -		logbuf_unlock_irq();
>> +		printk_safe_exit_irq();
>>  		goto out;
>>  	}
>>  
>
>
>> @@ -2593,7 +2559,6 @@ void console_unlock(void)
>>  		size_t len;
>>  
>>  		printk_safe_enter_irqsave(flags);
>> -		raw_spin_lock(&logbuf_lock);
>>  skip:
>>  		if (!prb_read_valid(prb, console_seq, &r))
>>  			break;
>
> I agree that printk_safe context makes sense in console_unlock().
> It prevents eventual deadlocks caused, for example by
> console_lock_owner.
>
> It also somehow prevents an infinite loop when a console driver would
> add a new message that would need to get flushed out as well which
> would trigger another messages ...
>
>
>> @@ -2973,9 +2933,7 @@ void register_console(struct console *newcon)
>>  		/*
>>  		 * console_unlock(); will print out the buffered messages
>>  		 * for us.
>> -		 */
>> -		logbuf_lock_irqsave(flags);
>
> I am just curious what was the motivation to remove printk_safe
> context here? It is a bit inconsistent with the other locations
> where you kept it.

This never should have been logbuf_lock_irqsave(flags) in the first
place. It would have been enough to just grab @logbuf_lock. The safe
buffers only make sense if printk or printk_ringbuffer functions are
called. Here we are just copying some variables (which are already
protected by console_sem and syslog_lock).

> IMHO, it can be removed. It does not fit into any category
> where it would help as described above.
>
> Anyway, we have to be consistent and explain it in the commit message.

I could add an earlier patch that changes logbuf_lock_irqsave() here to
spin_lock_irqsave() and explain. Then for this patch it would be clear
that it is just dropped.

>> -		/*
>> +		 *
>>  		 * We're about to replay the log buffer.  Only do this to the
>>  		 * just-registered console to avoid excessive message spam to
>>  		 * the already-registered consoles.
>> @@ -2988,11 +2946,9 @@ void register_console(struct console *newcon)
>>  		exclusive_console_stop_seq = console_seq;
>>  
>>  		/* Get a consistent copy of @syslog_seq. */
>> -		raw_spin_lock(&syslog_lock);
>> +		raw_spin_lock_irqsave(&syslog_lock, flags);
>
> I guess that you have added _irqsafe() variant here to preserve the
> original behavior.

Yes.

> I just wonder if it makes sense. We take the sleeping console_lock()
> in this function. This should not be allowed in a context
> with disabled interrupts.
>
> IMHO, we do not need the irqsafe variant here. But it probably should
> be removed in a separate patch. We should not hide it in this huge patch.

Yes. Let's put that in another patch. It is not related to logbuf_lock
removal.

>>  		console_seq = syslog_seq;
>> -		raw_spin_unlock(&syslog_lock);
>> -
>> -		logbuf_unlock_irqrestore(flags);
>> +		raw_spin_unlock_irqrestore(&syslog_lock, flags);
>>  	}
>>  	console_unlock();
>>  	console_sysfs_notify();
>> @@ -3414,9 +3366,11 @@ bool kmsg_dump_get_line_nolock(struct kmsg_dumper_iter *iter, bool syslog,
>>  	struct printk_info info;
>>  	unsigned int line_count;
>>  	struct printk_record r;
>> +	unsigned long flags;
>>  	size_t l = 0;
>>  	bool ret = false;
>>  
>> +	printk_safe_enter_irqsave(flags);
>
> This change is neither obvious nor documented.

I noticed that this function was calling ringbuffer functions without
marking the region for safe buffers. I should have included a separate
patch before this one adding the safe buffers so that it would be
clear. Sorry.

> I guess that this is preparation step for unfying
> kmsg_dump_get_line_nolock() and kmsg_dump_get_line().
>
> Please, do it in the next patch and keep this one strightforward.

Or I will just do it in the following unification patch.

> That said, IMHO, the printk_safe() context is not needed here at all.
> This code is not called from printk() directly. The recursion is
> prevented by iter->next_seq or the buffer size.

My logic was: "If it is calling prb_*(), it should be protected by safe
buffers."

>>  	prb_rec_init_rd(&r, &info, line, size);
>>  
>>  	if (!iter->active)
>
>
>> @@ -3569,8 +3517,12 @@ EXPORT_SYMBOL_GPL(kmsg_dump_get_buffer);
>>   */
>>  void kmsg_dump_rewind_nolock(struct kmsg_dumper_iter *iter)
>>  {
>> +	unsigned long flags;
>> +
>> +	printk_safe_enter_irqsave(flags);
>
> This is the same as in kmsg_dump_get_line_nolock().
>
> Please, keep this huge patch strightforward. Either replace
> logbuf_lock*() with the corresponding printk_safe*() calls.
> Or do not add printk_safe*() where it is not needed.

Agreed. Sorry.

>>  	iter->cur_seq = latched_seq_read_nolock(&clear_seq);
>>  	iter->next_seq = prb_next_seq(prb);
>> +	printk_safe_exit_irqrestore(flags);
>>  }
>>  
>>  /**
>
>> diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
>> index a0e6f746de6c..a9a3137bd972 100644
>> --- a/kernel/printk/printk_safe.c
>> +++ b/kernel/printk/printk_safe.c
>> @@ -368,20 +354,21 @@ __printf(1, 0) int vprintk_func(const char *fmt, va_list args)
>>  #endif
>>  
>>  	/*
>> -	 * Try to use the main logbuf even in NMI. But avoid calling console
>> +	 * Use the main logbuf even in NMI. But avoid calling console
>>  	 * drivers that might have their own locks.
>>  	 */
>> -	if ((this_cpu_read(printk_context) & PRINTK_NMI_DIRECT_CONTEXT_MASK) &&
>> -	    raw_spin_trylock(&logbuf_lock)) {
>> +	if ((this_cpu_read(printk_context) & PRINTK_NMI_DIRECT_CONTEXT_MASK)) {
>> +		unsigned long flags;
>>  		int len;
>>  
>> +		printk_safe_enter_irqsave(flags);
>
> The printk_safe context does not make much sense here. The per-context
> redirection is done in vprintk_func(). It will always use this path
> because PRINTK_NMI_DIRECT_CONTEXT_MASK is handled before
> PRINTK_SAFE_CONTEXT_MASK.

If the following vprintk_store() calls printk(), those printk's should
land in safe buffers. If @printk_context is not incremented, they land
here again.

vprintk_store() relies on its caller to update @printk_context.

>>  		len = vprintk_store(0, LOGLEVEL_DEFAULT, NULL, fmt, args);
>> -		raw_spin_unlock(&logbuf_lock);
>> +		printk_safe_exit_irqrestore(flags);
>>  		defer_console_output();
>>  		return len;
>>  	}
>>  
>> -	/* Use extra buffer in NMI when logbuf_lock is taken or in safe mode. */
>> +	/* Use extra buffer in NMI. */
>>  	if (this_cpu_read(printk_context) & PRINTK_NMI_CONTEXT_MASK)
>>  		return vprintk_nmi(fmt, args);
>
> I agree that it makes sense to keep vprintk_nmi() for now. It prevents
> an infinite recursion. I still hope that we will agree on a better
> solution later.
>
>
> Summary:
>
> I do not have a strong opinion whether to remove printk_safe
> enter/exit calls in this patch or in a separate one. I would
> be fine with both solutions.
>
> Anyway, please keep the patch as straightforward as possible.
> Please, do not move printk_safe context into another locations.
>
> Also provide more reasoning in the commit message. Why printk_safe
> enter/exit calls are preserved or why they can be removed.
>
> Feel free to mention that printk_safe context still makes sense on
> some locations, explain why, and say that it will be removed later.

OK. Thank you for the excellent response. I will go over this again.

John Ogness

  reply	other threads:[~2021-02-02 11:42 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-26 21:15 [PATCH printk-rework 00/12] printk: remove logbuf_lock John Ogness
2021-01-26 21:15 ` [PATCH printk-rework 01/12] printk: kmsg_dump: remove unused fields John Ogness
2021-01-26 21:15 ` [PATCH printk-rework 02/12] printk: refactor kmsg_dump_get_buffer() John Ogness
2021-01-26 21:15 ` [PATCH printk-rework 03/12] printk: consolidate kmsg_dump_get_buffer/syslog_print_all code John Ogness
     [not found]   ` <YBQgTQYTA5p6Wgj6@alley>
2021-02-01  9:49     ` John Ogness
2021-02-02 12:31       ` Petr Mladek
2021-01-26 21:15 ` [PATCH printk-rework 04/12] printk: define CONSOLE_LOG_MAX in printk.h John Ogness
     [not found]   ` <YBQtbKrdwUAZQB9v@alley>
2021-02-01  8:24     ` LINE_MAX: was: " John Ogness
2021-02-02 11:22       ` Petr Mladek
2021-01-26 21:15 ` [PATCH printk-rework 05/12] printk: use seqcount_latch for clear_seq John Ogness
2021-01-26 21:15 ` [PATCH printk-rework 06/12] printk: use atomic64_t for devkmsg_user.seq John Ogness
2021-01-26 21:15 ` [PATCH printk-rework 07/12] printk: add syslog_lock John Ogness
2021-02-01 12:26   ` Petr Mladek
2021-02-01 13:11     ` John Ogness
2021-02-02 12:50       ` Petr Mladek
2021-01-26 21:15 ` [PATCH printk-rework 08/12] printk: introduce a kmsg_dump iterator John Ogness
2021-02-01 13:17   ` Petr Mladek
2021-02-01 13:32     ` John Ogness
2021-01-26 21:15 ` [PATCH printk-rework 09/12] um: synchronize kmsg_dumper John Ogness
2021-02-01 10:26   ` Petr Mladek
2021-02-01 14:15   ` Petr Mladek
2021-02-01 16:51     ` John Ogness
2021-02-01 16:54       ` Richard Weinberger
2021-02-01 20:25         ` John Ogness
2021-02-01 20:40           ` Richard Weinberger
2021-02-02 13:26       ` Petr Mladek
2021-01-26 21:15 ` [PATCH printk-rework 10/12] hv: " John Ogness
2021-01-27 21:32   ` Michael Kelley
2021-02-01 10:56     ` John Ogness
2021-01-26 21:15 ` [PATCH printk-rework 11/12] printk: remove logbuf_lock John Ogness
2021-02-02  9:15   ` Petr Mladek
2021-02-02 11:41     ` John Ogness [this message]
2021-02-02 16:11       ` Petr Mladek
2021-01-26 21:15 ` [PATCH printk-rework 12/12] printk: kmsg_dump: remove _nolock() variants John Ogness
2021-02-02  9:45   ` Petr Mladek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87czxivgrj.fsf@jogness.linutronix.de \
    --to=john.ogness@linutronix.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pmladek@suse.com \
    --cc=rostedt@goodmis.org \
    --cc=sergey.senozhatsky.work@gmail.com \
    --cc=sergey.senozhatsky@gmail.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox