public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Petr Mladek <pmladek@suse.com>
To: John Ogness <john.ogness@linutronix.de>
Cc: Daniil Tatianin <d-tatianin@yandex-team.ru>,
	linux-kernel@vger.kernel.org,
	Steven Rostedt <rostedt@goodmis.org>,
	Sergey Senozhatsky <senozhatsky@chromium.org>
Subject: Re: [PATCH v2 0/2] printk_ringbuffer: don't needlessly wrap data blocks around
Date: Mon, 15 Sep 2025 17:08:01 +0200	[thread overview]
Message-ID: <aMgr0dId_UfBptzW@pathway.suse.cz> (raw)
In-Reply-To: <84a52zy0iu.fsf@jogness.linutronix.de>

On Fri 2025-09-12 20:49:37, John Ogness wrote:
> Hi Petr,
> 
> Summary: printk() is not in danger but we should correct a loose bounds
> check.
> 
> On 2025-09-12, Petr Mladek <pmladek@suse.com> wrote:
> > Honestly, I would really like to limit the maximal record size to
> > 1/4 of the buffer size. I do not want to make the design more
> > complicated just to be able to fill just one record, definitely.
> 
> So I was able to track this down. Your usage of
> 
> DEFINE_PRINTKRB(test_rb, 4, 4);
> 
> actually made it relatively easy because there are only 16
> descriptors. All I needed to do was dump the descriptors before each
> reserve, between reserve and commit, after commit, and when reserve
> fails. This allowed me to easily see exactly how the ringbuffer is
> behaving.
> 
> The problem can be reproduced with a single writer, no reader
> needed. Using
> 
> #define MAX_RBDATA_TEXT_SIZE (0x256 - sizeof(struct prbtest_rbdata))
> 
> provides a wild range of attempts that trigger the problem within about
> 20 write cycles.
> 
> The problem comes from the function data_make_reusable(). The job of
> this function is to push the data_ring tail forward, one data block at a
> time, while setting the related descriptors to reusable.
> 
> After pushing the tail forward, if it still has not pushed it far enough
> for new requested reservation, it must push it further. For this it
> _assumes the current position of the tail is a descriptor ID for the
> next data block_. But what if the tail was pushed all the way to the
> head? Then there is no next data block and it will read in garbage,
> thinking it is the next descriptor ID to set reusable. And from there it
> just goes crazy because it is reading garbage to determine how big the
> data block is so that it can continue pushing the tail (beyond the head!).
>
> Example: Assume the 96 byte ringbuffer has a single message of 64
> bytes. Then we try to reserve space for a 72-byte
> message. data_make_reusable() will first set the descriptor of the
> 64-byte message to reusable and push the tail forward to index 64. But
> the new message needs 72 bytes, so data_make_reusable() will keep going
> and read the descriptor ID at index 64, but there is only random garbage
> at that position. 64 is the head and there is nothing valid after it.

Great catch and example!

I wondered why data_make_reusable() needed to push the tail that far.
The buffer was empty after making the 64 bytes long message free.

My understanding is that it is combination of the following effects:

  1. The message is wrapped.

  2. The ring buffer does not support proper wrapping. Instead,
     the non-sufficient space at the end of the buffer stays
     unused (last wrap). And the messages will be written
     from the beginning of the buffer (next wrap).

      => the message will occupy more space than expected

	  unused space from last wrap + full message size in new wrap

In our case:

    + size of the buffer: 96
    + unused space in old wrap: 96 - 64 = 32
    + occupied space in new wrap: 72

    => total occupied space: = 32 + 72 = 104 > 96

    => lpos passed to data_push_tail() is from a never used space

    => This is why data_push_tail() tries to read
       descriptor from a never used space and reads a garbage

> This situation can never happen for printk because of your 1/4 limit
> (MAX_LOG_TAKE_PART), although it is over-conservative.

I would say that it is conservative. It would survive mistakes from the
off-by-one family, ... ;-)

And it is still far from practical limits. Because having this
powerful ring buffer for 1, 2, or 4 messages looks line an overkill.

> It is enough to limit messages to 1/2 of the data ring
> (with Daniil's series). Otherwise the limit must be
> "1/2 - sizeof(long)" to also leave room for the
> trailing ID of a wrapping data block.

I am not sure why it is important to push it to the limits.
That said, I could live with it. Especially how, when we
understood what happened.


> I am still positive about Daniil's series.

Yes, the patch which prevents wrapping for perfectly fitting messages
looks good to me.

> And we should fix
> data_check_size() to be provide a proper limit as well as describe the
> critical relationship between data_check_size() and
> data_make_reusable().

Yup.

> I prefer not modify data_make_reusable() to handle this case. Currently
> data_make_reusable() does nothing with the head, so it would introduce
> new memory barriers. Also, the "push tail beyond head" scenario is a bit
> odd to handle. It is better just to document the assumption and put in
> the correct bounds checks.

It might be possible to catch this in either in data_alloc().
or in get_next_lpos(). They could ignore/yell about when
the really occupied space would be bigger than DATA_SIZE(data_ring).

Something like:

diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
index 17b741b2eccd..d7ba4c0d8c3b 100644
--- a/kernel/printk/printk_ringbuffer.c
+++ b/kernel/printk/printk_ringbuffer.c
@@ -1056,8 +1056,16 @@ static char *data_alloc(struct printk_ringbuffer *rb, unsigned int size,
 	do {
 		next_lpos = get_next_lpos(data_ring, begin_lpos, size);
 
-		if (!data_push_tail(rb, next_lpos - DATA_SIZE(data_ring))) {
-			/* Failed to allocate, specify a data-less block. */
+		/*
+		 * Double check that the really used space won't be bigger than
+		 * the ring buffer. Wrapped messages need to reserve more space,
+		 * see get_next_lpos.
+		 *
+		 * Specify a data-less block when the check or the allocation
+		 * fails.
+		 */
+		if (WARN_ON_ONCE(next_lpos - begin_lpos > DATA_SIZE(data_ring)) ||
+		    !data_push_tail(rb, next_lpos - DATA_SIZE(data_ring))) {
 			blk_lpos->begin = FAILED_LPOS;
 			blk_lpos->next = FAILED_LPOS;
 			return NULL;


Similar check would need to be done also in data_realloc().

I am not sure if it is worth it. Maybe, we could rule this out
when we limit the allocated size to 1/2 or 1/4 of the ring buffer size.

Best Regards,
Petr

  parent reply	other threads:[~2025-09-15 15:08 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-05 14:41 [PATCH v2 0/2] printk_ringbuffer: don't needlessly wrap data blocks around Daniil Tatianin
2025-09-05 14:41 ` [PATCH v2 1/2] " Daniil Tatianin
2025-09-05 15:27   ` John Ogness
2025-09-05 15:29     ` Daniil Tatianin
2025-09-05 16:10       ` John Ogness
2025-09-11  8:34         ` Daniil Tatianin
2025-09-11 15:33           ` Petr Mladek
2025-09-05 15:30     ` John Ogness
2025-09-26 14:35   ` Daniil Tatianin
2025-09-26 14:44   ` Petr Mladek
2025-09-26 14:53     ` Daniil Tatianin
2025-10-22 12:46   ` Petr Mladek
2025-09-05 14:41 ` [PATCH v2 2/2] printk_ringbuffer: allow one data block to occupy the entire data ring Daniil Tatianin
2025-09-05 15:27   ` John Ogness
2025-09-11 15:30 ` [PATCH v2 0/2] printk_ringbuffer: don't needlessly wrap data blocks around Petr Mladek
2025-09-11 15:55   ` Petr Mladek
2025-09-11 16:11     ` Petr Mladek
2025-09-11 15:58   ` Petr Mladek
2025-09-11 16:12     ` John Ogness
2025-09-12  9:25       ` Petr Mladek
2025-09-12  9:54         ` Petr Mladek
2025-09-12 14:49           ` Petr Mladek
2025-09-12 15:15             ` Petr Mladek
2025-09-12 18:43             ` John Ogness
2025-09-13 17:38               ` Daniil Tatianin
2025-09-14  9:23                 ` John Ogness
2025-09-14  9:56                   ` Daniil Tatianin
2025-09-15 15:07                     ` John Ogness
2025-09-15 16:00                       ` Petr Mladek
2025-09-15 16:07                       ` Daniil Tatianin
2025-09-15 15:08               ` Petr Mladek [this message]
2025-09-15 15:25                 ` John Ogness

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aMgr0dId_UfBptzW@pathway.suse.cz \
    --to=pmladek@suse.com \
    --cc=d-tatianin@yandex-team.ru \
    --cc=john.ogness@linutronix.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=senozhatsky@chromium.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox