From: John Ogness <john.ogness@linutronix.de>
To: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>,
Steven Rostedt <rostedt@goodmis.org>,
Thomas Gleixner <tglx@linutronix.de>,
linux-kernel@vger.kernel.org,
Mukesh Ojha <quic_mojha@quicinc.com>,
Chunlei Wang <chunlei.wang@mediatek.com>
Subject: Re: [RFC PATCH printk v1] printk: ringbuffer: Do not skip non-finalized with prb_next_seq()
Date: Thu, 02 Nov 2023 14:54:23 +0106 [thread overview]
Message-ID: <87zfzwp8pk.fsf@jogness.linutronix.de> (raw)
In-Reply-To: <ZTkxOJbDLPy12n41@alley>
On 2023-10-25, Petr Mladek <pmladek@suse.com> wrote:
> there seems to be missing word in the subject:
>
> s/non-finalized/non-finalized records/
Ack.
> On Thu 2023-10-19 15:31:45, John Ogness wrote:
>> Commit f244b4dc53e5 ("printk: ringbuffer: Improve prb_next_seq()
>> performance") introduced an optimization for prb_next_seq() by
>> using best-effort to track recently finalized records. However,
>> the order of finalization does not necessarily match the order
>> of the records. This can lead to prb_next_seq() returning
>> higher than desired sequence numbers, which results in the
>> reader skipping over records that are not yet finalized. From
>> the reader's perspective it results in messages never being
>> seen.
>
> IMHO, "messages never being seen" is too strong.
Agreed. A reader does not use prb_next_seq() to decide what to print
next. Worst case it thinks records are available that are not (available
for that reader).
> I have found only one (or two) scenarios where the messages might
> really get lost.
>
> 1. It might happen when real console is replacing a boot console.
> The real console is initialized with the value returned
> by prb_next_seq(). And the boot console might not be able
> to flush earlier non-finalized records.
This cannot happen because in this situation console_init_seq() sets
@seq to the lowest boot console counter.
> 2. The other scenario is based on the fact that console_unlock()
> or pr_flush() might see lower prb_next_seq() than the last
> reserved record. It means that they might not flush all
> pending records.
>
> But wait! This is actually the opposite case. pr_flush()
> and console_unlock() might miss the messages when
> they see too low prb_next_seq().
>
> Important: This problem existed since introducing
> the lockless ring buffer!
>
> The question is. Should pr_flush() and console_unlock()
> wait until all registered messages get finalized?
>
> It would need to ignore only the last record when it
> is not finalized because it might be a continuous line.
Yes, this is the question to answer.
With the lockless ringbuffer we allow multiple CPUs/contexts to write
simultaneously into the buffer. This creates an ambiguity as some
writers will finalize sooner.
IMHO we need 2 different functions:
1. A function that reports the last contiguous finalized record for a
reader. This is useful for syslog and kmsg_dump to know what is
available for them to read. We can use @last_finalized_seq for this,
optimizing it correctly this time.
2. A function that reports the last reserved sequence number of a
writer. This is useful for pr_flush and console_unlock to know when they
are finished. This function can begin with @last_finalized_seq, looking
for the last finalized record (skipping over any non-finalized).
> I agree that this might be optimized. I think about reducing the
> number of cmpxchg even more, something like:
>
> static void desc_update_last_finalized(struct prb_desc_ring *desc_ring)
> {
> struct prb_desc_ring *desc_ring = &rb->desc_ring;
> u64 prev_seq = desc_last_finalized_seq(desc_ring);
> u64 seq = prev_seq;
>
> try_again:
> while (_prb_read_valid(rb, &seq, NULL, NULL))
> seq++;
>
> if (seq == prev_seq)
> return;
>
> oldval = __u64seq_to_ulseq(prev_seq);
> newval = __u64seq_to_ulseq(seq);
>
> if (!atomic_long_try_cmpxchg_relaxed(&desc_ring->last_finalized_seq,
> &oldval, newval)) {
> prev_seq = seq;
> goto try_again;
> }
> }
I am fine with this implementation.
> It looks to me that we could keep passing desc_ring as the parameter.
No, _prb_read_valid() needs it.
> I feel that we need a read barrier here. It should be between the
> above
>
> atomic_long_read(&desc_ring->last_finalized_seq))
>
> and the below
>
> while (_prb_read_valid(rb, &seq, NULL, NULL))
> seq++;
>
> It should make sure that the _prb_read_valid() will see all messages
> finalized which were seen finalized by the CPU updating
> desc_ring->last_finalized_seq.
Generally we have not concerned ourselves with readers. But I agree we
should make the optimization coherent with what a reader can actually
read. It might save some CPU cycles for polling tasks.
Writing and reading of @last_finalized_seq will provide the necessary
boundaries to guarantee this:
...finalize record...
atomic_long_try_cmpxchg_release(&desc_ring->last_finalized_seq, ...);
and
atomic_long_read_acquire(&desc_ring->last_finalized_seq);
...read record...
This guarantees that if a reader sees a certain @last_finalized_seq
value, that they will also see the record that was finalized.
This will be the 13th memory barrier pair to be added to the
documentation.
I would like to submit a new patch implementing things as described
here.
John
next prev parent reply other threads:[~2023-11-02 13:48 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-19 13:25 [RFC PATCH printk v1] printk: ringbuffer: Do not skip non-finalized with prb_next_seq() John Ogness
2023-10-19 13:39 ` John Ogness
2023-10-22 1:14 ` kernel test robot
2023-10-25 15:16 ` Petr Mladek
2023-10-26 8:07 ` Petr Mladek
2023-11-02 13:48 ` John Ogness [this message]
2023-11-03 12:31 ` Petr Mladek
2023-11-03 13:31 ` John Ogness
2023-11-03 15:58 ` Petr Mladek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87zfzwp8pk.fsf@jogness.linutronix.de \
--to=john.ogness@linutronix.de \
--cc=chunlei.wang@mediatek.com \
--cc=linux-kernel@vger.kernel.org \
--cc=pmladek@suse.com \
--cc=quic_mojha@quicinc.com \
--cc=rostedt@goodmis.org \
--cc=senozhatsky@chromium.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.