From: John Ogness <john.ogness@linutronix.de>
To: Petr Mladek <pmladek@suse.com>
Cc: Joanne Koong <joannelkoong@gmail.com>,
syzbot <syzbot+3686758660f980b402dc@syzkaller.appspotmail.com>,
"amurray@thegoodpenguin.co.uk" <amurray@thegoodpenguin.co.uk>,
brauner@kernel.org, chao@kernel.org, djwong@kernel.org,
jaegeuk@kernel.org, linux-f2fs-devel@lists.sourceforge.net,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-xfs@vger.kernel.org, syzkaller-bugs@googlegroups.com
Subject: Re: [syzbot] [iomap?] kernel BUG in folio_end_read (2)
Date: Thu, 06 Nov 2025 20:04:03 +0106 [thread overview]
Message-ID: <87h5v73s5g.fsf@jogness.linutronix.de> (raw)
In-Reply-To: <aQzLX_y8PvBMiZ9f@pathway.suse.cz>
On 2025-11-06, Petr Mladek <pmladek@suse.com> wrote:
>> diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
>> index 839f504db6d30..8499ee642c31d 100644
>> --- a/kernel/printk/printk_ringbuffer.c
>> +++ b/kernel/printk/printk_ringbuffer.c
>> @@ -390,6 +390,17 @@ static unsigned int to_blk_size(unsigned int size)
>> return size;
>> }
>>
>> +/*
>> + * Check if @lpos1 is before @lpos2. This takes ringbuffer wrapping
>> + * into account. If @lpos1 is more than a full wrap before @lpos2,
>> + * it is considered to be after @lpos2.
>
> The 2nd sentence is a brain teaser ;-)
>
>> + */
>> +static bool lpos1_before_lpos2(struct prb_data_ring *data_ring,
>> + unsigned long lpos1, unsigned long lpos2)
>> +{
>> + return lpos2 - lpos1 - 1 < DATA_SIZE(data_ring);
>> +}
>
> It would be nice to describe the semantic a more clean way. Sigh,
> it is not easy. I tried several variants and ended up with:
>
> + using "lt" instead of "before" because "lower than" is
> a well known mathematical therm.
I explicitly chose a word other than "less" or "lower" because I was
concerned people might visualize values. The lpos does not necessarily
have a lesser or lower value. "Preceeds" would also be a choice of mine.
When I see "lt" I immediately think "less than" and "<". But I will not
fight it. I can handle "lt".
> + adding "_safe" suffix to make it clear that it is not
> a simple mathematical comparsion. It takes the wrap
> into account.
I find "_safe" confusing. Especially when you look at the implementation
you wonder, "what is safe about this?". Especially when comparing it to
all the complexity of the rest of the code. But I can handle "_safe" if
it is important for you.
> Something like:
>
> /*
> * Returns true when @lpos1 is lower than @lpos2 and both values
> * are comparable.
> *
> * It is safe when the compared values are read a lock less way.
> * One of them must be already overwritten when the difference
> * is bigger then the data ring buffer size.
This makes quite a bit of assumptions about the context and intention of
the call. I preferred my brain teaser version. But to me it is not worth
bike-shedding. If this explanation helps you, I am fine with it.
> */
> static bool lpos1_lt_lpos2_safe(struct prb_data_ring *data_ring,
> unsined long lpos1, unsigned long lpos2)
> {
> return lpos2 - lpos1 - 1 < DATA_SIZE(data_ring);
> }
>
>> /*
>> * Sanity checker for reserve size. The ringbuffer code assumes that a data
>> * block does not exceed the maximum possible size that could fit within the
>> @@ -577,7 +588,7 @@ static bool data_make_reusable(struct printk_ringbuffer *rb,
>> unsigned long id;
>>
>> /* Loop until @lpos_begin has advanced to or beyond @lpos_end. */
>> - while ((lpos_end - lpos_begin) - 1 < DATA_SIZE(data_ring)) {
>> + while (lpos1_before_lpos2(data_ring, lpos_begin, lpos_end)) {
>
> lpos1_lt_lpos2_safe() fits here.
>
>> blk = to_block(data_ring, lpos_begin);
>> /*
>> @@ -668,7 +679,7 @@ static bool data_push_tail(struct printk_ringbuffer *rb, unsigned long lpos)
>> * sees the new tail lpos, any descriptor states that transitioned to
>> * the reusable state must already be visible.
>> */
>> - while ((lpos - tail_lpos) - 1 < DATA_SIZE(data_ring)) {
>> + while (lpos1_before_lpos2(data_ring, tail_lpos, lpos)) {
>> /*
>> * Make all descriptors reusable that are associated with
>> * data blocks before @lpos.
>
> Same here.
>
>> @@ -1149,7 +1160,7 @@ static char *data_realloc(struct printk_ringbuffer *rb, unsigned int size,
>> next_lpos = get_next_lpos(data_ring, blk_lpos->begin, size);
>>
>> /* If the data block does not increase, there is nothing to do. */
>> - if (head_lpos - next_lpos < DATA_SIZE(data_ring)) {
>> + if (!lpos1_before_lpos2(data_ring, head_lpos, next_lpos)) {
>
> I think that the original code was correct. And using the "-1" is
> wrong here.
You have overlooked that I inverted the check. It is no longer checking:
next_pos <= head_pos
but is instead checking:
!(head_pos < next_pos)
IOW, if "next has not overtaken head".
> Both data_make_reusable() and data_push_tail() had to use "-1"
> because it was the "lower than" semantic. But in this case,
> we do not need to do anything even when "head_lpos == next_lpos"
>
> By other words, both data_make_reusable() and data_push_tail()
> needed to make a free space when the position was "lower than".
> There was enough space when the values were "equal".
>
> It means that "equal" should be OK in data_realloc(). By other
> words, data_realloc() should use "le" aka "less or equal"
> semantic.
>
> The helper function might be:
>
> /*
> * Returns true when @lpos1 is lower or equal than @lpos2 and both
> * values are comparable.
> *
> * It is safe when the compared values are read a lock less way.
> * One of them must be already overwritten when the difference
> * is bigger then the data ring buffer size.
> */
> static bool lpos1_le_lpos2_safe(struct prb_data_ring *data_ring,
> unsined long lpos1, unsigned long lpos2)
> {
> return lpos2 - lpos1 < DATA_SIZE(data_ring);
> }
If you negate lpos1_lt_lpos2_safe() and swap the parameters, there is no
need for a second helper. That is what I did.
>> @@ -1262,7 +1273,7 @@ static const char *get_data(struct prb_data_ring *data_ring,
>>
>> /* Regular data block: @begin less than @next and in same wrap. */
>> if (!is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next) &&
>> - blk_lpos->begin < blk_lpos->next) {
>> + lpos1_before_lpos2(data_ring, blk_lpos->begin, blk_lpos->next)) {
>
> Hmm, I think that it is more complicated here.
>
> The "lower than" semantic is weird here. One would expect that "equal"
> values, aka "zero size" is perfectly fine.
No, we would _not_ expect that zero size is OK, because we are detecting
"Regular data blocks", in which case they must _not_ be equal.
> It does not hurt because the "zero size" case is already handled
> earlier. But still, the "lower than" semantic does not fit here.
Currently we have 3 explicit checks:
1. data-less
2. regular
3. wrapping
But I agree the checks are "relaxed" because we are doing only minimal
sanity checks on the positions, rather than size validation.
> IMHO, the main motivation for this fix is to make sure that
> blk_lpos->begin and blk_lpos->next will produce a valid
> *data_size.
>
> From this POV, even lpos1_le_lpos2_safe() does not fit here
> because the data_size must be lower than half of the size
> of the ring buffer.
Currently we do not do size validation for reading, only for writing. If
you are arguing that we _should_ perform better size validation on read,
then I agree this is the place for it.
>> db = to_block(data_ring, blk_lpos->begin);
>> *data_size = blk_lpos->next - blk_lpos->begin;
>
> I think that we should do the following:
>
> diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
> index 839f504db6d3..78e02711872e 100644
> --- a/kernel/printk/printk_ringbuffer.c
> +++ b/kernel/printk/printk_ringbuffer.c
> @@ -1260,9 +1260,8 @@ static const char *get_data(struct prb_data_ring *data_ring,
> return NULL;
> }
>
> - /* Regular data block: @begin less than @next and in same wrap. */
> - if (!is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next) &&
> - blk_lpos->begin < blk_lpos->next) {
> + /* Regular data block: @begin and @next in same wrap. */
> + if (!is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next)) {
> db = to_block(data_ring, blk_lpos->begin);
> *data_size = blk_lpos->next - blk_lpos->begin;
>
> @@ -1279,6 +1278,10 @@ static const char *get_data(struct prb_data_ring *data_ring,
> return NULL;
> }
>
> + /* Double check that the data_size is reasonable. */
> + if (WARN_ON_ONCE(!data_check_size(data_ring, *data_size)))
> + return NULL;
> +
> /* A valid data block will always be aligned to the ID size. */
> if (WARN_ON_ONCE(blk_lpos->begin != ALIGN(blk_lpos->begin, sizeof(db->id))) ||
> WARN_ON_ONCE(blk_lpos->next != ALIGN(blk_lpos->next, sizeof(db->id)))) {
>
> 1. Just distinguish regular vs. wrapped. vs. invalid block.
>
> 2. Add sanity check for the "data_size". It might catch some wrong values
> in both code paths for "regular" and "wrapped" blocks. So, win win.
>
> How does that sound?
I think it can be made even more simple since we are adding size
validation:
diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
index b7ab4e75917f0..04bc863eae411 100644
--- a/kernel/printk/printk_ringbuffer.c
+++ b/kernel/printk/printk_ringbuffer.c
@@ -1271,23 +1271,15 @@ static const char *get_data(struct prb_data_ring *data_ring,
return NULL;
}
- /* Regular data block: @begin less than @next and in same wrap. */
- if (!is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next) &&
- blk_lpos->begin < blk_lpos->next) {
- db = to_block(data_ring, blk_lpos->begin);
- *data_size = blk_lpos->next - blk_lpos->begin;
-
- /* Wrapping data block: @begin is one wrap behind @next. */
- } else if (!is_blk_wrapped(data_ring,
- blk_lpos->begin + DATA_SIZE(data_ring),
- blk_lpos->next)) {
+ /* Wrapping data block description. */
+ if (is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next)) {
db = to_block(data_ring, 0);
*data_size = DATA_INDEX(data_ring, blk_lpos->next);
- /* Illegal block description. */
+ /* Regular data block description. */
} else {
- WARN_ON_ONCE(1);
- return NULL;
+ db = to_block(data_ring, blk_lpos->begin);
+ *data_size = blk_lpos->next - blk_lpos->begin;
}
/* A valid data block will always be aligned to the ID size. */
@@ -1300,6 +1292,10 @@ static const char *get_data(struct prb_data_ring *data_ring,
if (WARN_ON_ONCE(*data_size < sizeof(db->id)))
return NULL;
+ /* Check if the data size is at least legal. */
+ if (WARN_ON_ONCE(!data_check_size(data_ring, *data_size)))
+ return NULL;
+
/* Subtract block ID space from size to reflect data size. */
*data_size -= sizeof(db->id);
So it ends up looking like this:
/* Wrapping data block description. */
if (is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next)) {
db = to_block(data_ring, 0);
*data_size = DATA_INDEX(data_ring, blk_lpos->next);
/* Regular data block description. */
} else {
db = to_block(data_ring, blk_lpos->begin);
*data_size = blk_lpos->next - blk_lpos->begin;
}
...
/* Ensure the data size is at least legal. */
if (WARN_ON_ONCE(!data_check_size(data_ring, *data_size)))
return NULL;
(Note that there is already WARN_ON_ONCE() checks for misaligned lpos
values and sizes less than sizeof(id).)
How does this sound?
John
next prev parent reply other threads:[~2025-11-06 18:58 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-18 13:13 [syzbot] [f2fs?] kernel BUG in folio_end_read (2) syzbot
2025-11-01 2:11 ` [syzbot] [iomap?] " syzbot
2025-11-03 16:58 ` Joanne Koong
2025-11-04 2:43 ` syzbot
2025-11-04 17:45 ` Joanne Koong
2025-11-04 18:25 ` Petr Mladek
2025-11-05 14:54 ` John Ogness
2025-11-05 16:49 ` Petr Mladek
2025-11-05 19:58 ` John Ogness
2025-11-06 11:36 ` John Ogness
2025-11-06 16:22 ` Petr Mladek
2025-11-06 18:58 ` John Ogness [this message]
2025-11-06 19:36 ` John Ogness
2025-11-07 11:48 ` Petr Mladek
2025-11-07 13:41 ` John Ogness
2025-11-02 5:39 ` syzbot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87h5v73s5g.fsf@jogness.linutronix.de \
--to=john.ogness@linutronix.de \
--cc=amurray@thegoodpenguin.co.uk \
--cc=brauner@kernel.org \
--cc=chao@kernel.org \
--cc=djwong@kernel.org \
--cc=jaegeuk@kernel.org \
--cc=joannelkoong@gmail.com \
--cc=linux-f2fs-devel@lists.sourceforge.net \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=pmladek@suse.com \
--cc=syzbot+3686758660f980b402dc@syzkaller.appspotmail.com \
--cc=syzkaller-bugs@googlegroups.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox