[PATCH 0/2] printk_ringbuffer: Fix regression in get_data() and clean up data size checks

linux-fsdevel.vger.kernel.org archive mirror

* [PATCH 0/2] printk_ringbuffer: Fix regression in get_data() and clean up data size checks
@ 2025-11-07 19:47 Petr Mladek
  2025-11-07 19:47 ` [PATCH 1/2] printk_ringbuffer: Fix check of valid data size when blk_lpos overflows Petr Mladek
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Petr Mladek @ 2025-11-07 19:47 UTC
  To: John Ogness
  Cc: Joanne Koong, amurray @ thegoodpenguin . co . uk, brauner, chao,
	djwong, jaegeuk, linux-f2fs-devel, linux-fsdevel, linux-kernel,
	linux-xfs, syzkaller-bugs, Petr Mladek

This is the outcome of the long discussion about the regression caused
by 67e1b0052f6bb82 ("printk_ringbuffer: don't needlessly wrap data blocks around"),
see https://lore.kernel.org/all/69096836.a70a0220.88fb8.0006.GAE@google.com/

The 1st patch fixes the regression as agreed, see
https://lore.kernel.org/all/87ecqb3qd0.fsf@jogness.linutronix.de/

The 2nd patch adds a helper function to unify the checks whether
more space is needed. I did my best to address all the concerns
about the various proposed variants.

Note that I called the new helper function "need_more_space()" in the end.
It avoids all the problems with "before" vs. "lt" vs. "le",
and "_safe" vs. "_sane" vs. "_bounded".

IMHO, the name "need_more_space()" fits very well in all three
locations, surprisingly even in data_realloc(). But it is possible
that you disagree. Let me know if you hate it ;-)

The patchset applies on top of printk/linux.git, branch for-6.19.
It should apply on top of linux-next as well.

Petr Mladek (2):
  printk_ringbuffer: Fix check of valid data size when blk_lpos
    overflows
  printk_ringbuffer: Create a helper function to decide whether a more
    space is needed

 kernel/printk/printk_ringbuffer.c | 40 +++++++++++++++++++++++++------
 1 file changed, 33 insertions(+), 7 deletions(-)

-- 
2.51.1


* [PATCH 1/2] printk_ringbuffer: Fix check of valid data size when blk_lpos overflows
  2025-11-07 19:47 [PATCH 0/2] printk_ringbuffer: Fix regression in get_data() and clean up data size checks Petr Mladek
@ 2025-11-07 19:47 ` Petr Mladek
  2025-11-10  9:13   ` John Ogness
  2025-11-07 19:47 ` [PATCH 2/2] printk_ringbuffer: Create a helper function to decide whether a more space is needed Petr Mladek
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 7+ messages in thread
From: Petr Mladek @ 2025-11-07 19:47 UTC
  To: John Ogness
  Cc: Joanne Koong, amurray @ thegoodpenguin . co . uk, brauner, chao,
	djwong, jaegeuk, linux-f2fs-devel, linux-fsdevel, linux-kernel,
	linux-xfs, syzkaller-bugs, Petr Mladek

Commit 67e1b0052f6bb8 ("printk_ringbuffer: don't needlessly wrap
data blocks around") makes it possible to use the last 4 bytes of the
ring buffer.

But the check for the @data_size was not properly updated in get_data().
It fails when "blk_lpos->next" overflows to "0". In this case:

  + is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next)
    returns "false" because it checks "blk_lpos->next - 1".

  + "blk_lpos->begin < blk_lpos->next" fails because "blk_lpos->next"
    is already 0.

  + is_blk_wrapped(data_ring, blk_lpos->begin + DATA_SIZE(data_ring),
    blk_lpos->next) returns "false" because "begin_lpos" is from
    the next wrap but "next_lpos - 1" is from the previous one.

As a result, get_data() triggers the WARN_ON_ONCE() for "Illegal
block description", for example:

[  216.317316][ T7652] loop0: detected capacity change from 0 to 16
** 1 printk messages dropped **
[  216.327750][ T7652] ------------[ cut here ]------------
[  216.327789][ T7652] WARNING: kernel/printk/printk_ringbuffer.c:1278 at get_data+0x48a/0x840, CPU#1: syz.0.585/7652
[  216.327848][ T7652] Modules linked in:
[  216.327907][ T7652] CPU: 1 UID: 0 PID: 7652 Comm: syz.0.585 Not tainted syzkaller #0 PREEMPT(full)
[  216.327933][ T7652] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/02/2025
[  216.327953][ T7652] RIP: 0010:get_data+0x48a/0x840
[  216.327986][ T7652] Code: 83 c4 f8 48 b8 00 00 00 00 00 fc ff df 41 0f b6 04 07 84 c0 0f 85 ee 01 00 00 44 89 65 00 49 83 c5 08 eb 13 e8 a7 19 1f 00 90 <0f> 0b 90 eb 05 e8 9c 19 1f 00 45 31 ed 4c 89 e8 48 83 c4 28 5b 41
[  216.328007][ T7652] RSP: 0018:ffffc900035170e0 EFLAGS: 00010293
[  216.328029][ T7652] RAX: ffffffff81a1eee9 RBX: 00003fffffffffff RCX: ffff888033255b80
[  216.328048][ T7652] RDX: 0000000000000000 RSI: 00003fffffffffff RDI: 0000000000000000
[  216.328063][ T7652] RBP: 0000000000000012 R08: 0000000000000e55 R09: 000000325e213cc7
[  216.328079][ T7652] R10: 000000325e213cc7 R11: 00001de4c2000037 R12: 0000000000000012
[  216.328095][ T7652] R13: 0000000000000000 R14: ffffc90003517228 R15: 1ffffffff1bca646
[  216.328111][ T7652] FS:  00007f44eb8da6c0(0000) GS:ffff888125fda000(0000) knlGS:0000000000000000
[  216.328131][ T7652] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  216.328147][ T7652] CR2: 00007f44ea9722e0 CR3: 0000000066344000 CR4: 00000000003526f0
[  216.328168][ T7652] Call Trace:
[  216.328178][ T7652]  <TASK>
[  216.328199][ T7652]  _prb_read_valid+0x672/0xa90
[  216.328328][ T7652]  ? desc_read+0x1b8/0x3f0
[  216.328381][ T7652]  ? __pfx__prb_read_valid+0x10/0x10
[  216.328422][ T7652]  ? panic_on_this_cpu+0x32/0x40
[  216.328450][ T7652]  prb_read_valid+0x3c/0x60
[  216.328482][ T7652]  printk_get_next_message+0x15c/0x7b0
[  216.328526][ T7652]  ? __pfx_printk_get_next_message+0x10/0x10
[  216.328561][ T7652]  ? __lock_acquire+0xab9/0xd20
[  216.328595][ T7652]  ? console_flush_all+0x131/0xb10
[  216.328621][ T7652]  ? console_flush_all+0x478/0xb10
[  216.328648][ T7652]  console_flush_all+0x4cc/0xb10
[  216.328673][ T7652]  ? console_flush_all+0x131/0xb10
[  216.328704][ T7652]  ? __pfx_console_flush_all+0x10/0x10
[  216.328748][ T7652]  ? is_printk_cpu_sync_owner+0x32/0x40
[  216.328781][ T7652]  console_unlock+0xbb/0x190
[  216.328815][ T7652]  ? __pfx___down_trylock_console_sem+0x10/0x10
[  216.328853][ T7652]  ? __pfx_console_unlock+0x10/0x10
[  216.328899][ T7652]  vprintk_emit+0x4c5/0x590
[  216.328935][ T7652]  ? __pfx_vprintk_emit+0x10/0x10
[  216.328993][ T7652]  _printk+0xcf/0x120
[  216.329028][ T7652]  ? __pfx__printk+0x10/0x10
[  216.329051][ T7652]  ? kernfs_get+0x5a/0x90
[  216.329090][ T7652]  _erofs_printk+0x349/0x410
[  216.329130][ T7652]  ? __pfx__erofs_printk+0x10/0x10
[  216.329161][ T7652]  ? __raw_spin_lock_init+0x45/0x100
[  216.329186][ T7652]  ? __init_swait_queue_head+0xa9/0x150
[  216.329231][ T7652]  erofs_fc_fill_super+0x1591/0x1b20
[  216.329285][ T7652]  ? __pfx_erofs_fc_fill_super+0x10/0x10
[  216.329324][ T7652]  ? sb_set_blocksize+0x104/0x180
[  216.329356][ T7652]  ? setup_bdev_super+0x4c1/0x5b0
[  216.329385][ T7652]  get_tree_bdev_flags+0x40e/0x4d0
[  216.329410][ T7652]  ? __pfx_erofs_fc_fill_super+0x10/0x10
[  216.329444][ T7652]  ? __pfx_get_tree_bdev_flags+0x10/0x10
[  216.329483][ T7652]  vfs_get_tree+0x92/0x2b0
[  216.329512][ T7652]  do_new_mount+0x302/0xa10
[  216.329537][ T7652]  ? apparmor_capable+0x137/0x1b0
[  216.329576][ T7652]  ? __pfx_do_new_mount+0x10/0x10
[  216.329605][ T7652]  ? ns_capable+0x8a/0xf0
[  216.329637][ T7652]  ? kmem_cache_free+0x19b/0x690
[  216.329682][ T7652]  __se_sys_mount+0x313/0x410
[  216.329717][ T7652]  ? __pfx___se_sys_mount+0x10/0x10
[  216.329836][ T7652]  ? do_syscall_64+0xbe/0xfa0
[  216.329869][ T7652]  ? __x64_sys_mount+0x20/0xc0
[  216.329901][ T7652]  do_syscall_64+0xfa/0xfa0
[  216.329932][ T7652]  ? lockdep_hardirqs_on+0x9c/0x150
[  216.329964][ T7652]  ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  216.329988][ T7652]  ? clear_bhb_loop+0x60/0xb0
[  216.330017][ T7652]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  216.330040][ T7652] RIP: 0033:0x7f44ea99076a
[  216.330080][ T7652] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb a6 e8 de 1a 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
[  216.330100][ T7652] RSP: 002b:00007f44eb8d9e68 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
[  216.330128][ T7652] RAX: ffffffffffffffda RBX: 00007f44eb8d9ef0 RCX: 00007f44ea99076a
[  216.330146][ T7652] RDX: 0000200000000180 RSI: 00002000000001c0 RDI: 00007f44eb8d9eb0
[  216.330164][ T7652] RBP: 0000200000000180 R08: 00007f44eb8d9ef0 R09: 0000000000000000
[  216.330181][ T7652] R10: 0000000000000000 R11: 0000000000000246 R12: 00002000000001c0
[  216.330196][ T7652] R13: 00007f44eb8d9eb0 R14: 00000000000001a1 R15: 0000200000000080
[  216.330233][ T7652]  </TASK>

Solve the problem by moving and fixing the sanity check. The problematic
if-else-if-else code will just distinguish three basic scenarios:
"regular" vs. "wrapped" vs. "too many times wrapped" block.
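
The overflow case can be illustrated with a small user-space model.
This is only a sketch under stated assumptions, not the kernel code:
the ring size RING_BITS, the macro DATA_WRAPS_MODEL() and the function
names model_is_blk_wrapped(), old_regular_check() and
new_regular_check() are hypothetical. The only details taken from the
patch are that is_blk_wrapped() looks at "next - 1" and that the old
condition additionally required "begin < next".

```c
#include <stdbool.h>

/* Hypothetical ring size for this model: 2^RING_BITS bytes. */
#define RING_BITS 12

/* Assumed model: the wrap count is the lpos without the index bits. */
#define DATA_WRAPS_MODEL(lpos) ((lpos) >> RING_BITS)

/* Models is_blk_wrapped(): compares the wraps of @begin and @next - 1. */
static bool model_is_blk_wrapped(unsigned long begin, unsigned long next)
{
	return DATA_WRAPS_MODEL(begin) != DATA_WRAPS_MODEL(next - 1);
}

/* Condition for a "regular" block before the fix. */
static bool old_regular_check(unsigned long begin, unsigned long next)
{
	return !model_is_blk_wrapped(begin, next) && begin < next;
}

/* Condition for a "regular" block after the fix. */
static bool new_regular_check(unsigned long begin, unsigned long next)
{
	return !model_is_blk_wrapped(begin, next);
}
```

In this model, a 16-byte block with begin = 0UL - 16 and next = 0 is
rejected by old_regular_check() only because of "begin < next", while
new_regular_check() accepts it. The unsigned subtraction
"next - begin" still gives 16 for the data size.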

The new sanity check is more precise. A valid "data_size" must be
lower than half of the data buffer size. Also it must not be zero at
this stage. This makes it possible to catch a problematic "data_size"
even for wrapped blocks.
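
The sanity check can be sketched in user space as follows. This is
a simplified model, not the kernel implementation: MODEL_RING_SIZE and
all model_*() names are hypothetical, and model_to_blk_size() is only
an assumed stand-in for to_blk_size(), which is not shown in this
patch. The comparison against half of the ring size mirrors the
data_check_size() context line in the diff.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ring size for this model. */
#define MODEL_RING_SIZE 4096UL

/*
 * Assumption: a block stores an unsigned long ID in front of the data
 * and the total is rounded up to a multiple of the ID size.
 */
static size_t model_to_blk_size(size_t size)
{
	size_t id_size = sizeof(unsigned long);

	return (size + id_size + id_size - 1) / id_size * id_size;
}

/* Mirrors the expression in data_check_size(). */
static bool model_data_check_size(size_t size)
{
	return model_to_blk_size(size) <= MODEL_RING_SIZE / 2;
}

/* The new check: the size must be non-zero and pass the size check. */
static bool model_data_size_is_sane(size_t size)
{
	return size != 0 && model_data_check_size(size);
}
```

With these assumptions, a size of 0 and a size of 4096 are both
rejected, while a small size such as 16 is accepted.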

Closes: https://lore.kernel.org/all/69096836.a70a0220.88fb8.0006.GAE@google.com/
Closes: https://lore.kernel.org/all/69078fb6.050a0220.29fc44.0029.GAE@google.com/
Fixes: 67e1b0052f6bb82 ("printk_ringbuffer: don't needlessly wrap data blocks around")
Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 kernel/printk/printk_ringbuffer.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
index 839f504db6d3..3e6fd8d6fa9f 100644
--- a/kernel/printk/printk_ringbuffer.c
+++ b/kernel/printk/printk_ringbuffer.c
@@ -1260,9 +1260,8 @@ static const char *get_data(struct prb_data_ring *data_ring,
 		return NULL;
 	}
 
-	/* Regular data block: @begin less than @next and in same wrap. */
-	if (!is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next) &&
-	    blk_lpos->begin < blk_lpos->next) {
+	/* Regular data block: @begin and @next in the same wrap. */
+	if (!is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next)) {
 		db = to_block(data_ring, blk_lpos->begin);
 		*data_size = blk_lpos->next - blk_lpos->begin;
 
@@ -1279,6 +1278,10 @@ static const char *get_data(struct prb_data_ring *data_ring,
 		return NULL;
 	}
 
+	/* Sanity check. Data-less blocks were handled earlier. */
+	if (WARN_ON_ONCE(!data_check_size(data_ring, *data_size) || !*data_size))
+		return NULL;
+
 	/* A valid data block will always be aligned to the ID size. */
 	if (WARN_ON_ONCE(blk_lpos->begin != ALIGN(blk_lpos->begin, sizeof(db->id))) ||
 	    WARN_ON_ONCE(blk_lpos->next != ALIGN(blk_lpos->next, sizeof(db->id)))) {
-- 
2.51.1



* [PATCH 2/2] printk_ringbuffer: Create a helper function to decide whether a more space is needed
  2025-11-07 19:47 [PATCH 0/2] printk_ringbuffer: Fix regression in get_data() and clean up data size checks Petr Mladek
  2025-11-07 19:47 ` [PATCH 1/2] printk_ringbuffer: Fix check of valid data size when blk_lpos overflows Petr Mladek
@ 2025-11-07 19:47 ` Petr Mladek
  2025-11-10  9:21   ` John Ogness
  2025-11-10 12:25 ` [PATCH 0/2] printk_ringbuffer: Fix regression in get_data() and clean up data size checks Petr Mladek
  2025-12-09 17:18 ` [f2fs-dev] " patchwork-bot+f2fs
  3 siblings, 1 reply; 7+ messages in thread
From: Petr Mladek @ 2025-11-07 19:47 UTC
  To: John Ogness
  Cc: Joanne Koong, amurray @ thegoodpenguin . co . uk, brauner, chao,
	djwong, jaegeuk, linux-f2fs-devel, linux-fsdevel, linux-kernel,
	linux-xfs, syzkaller-bugs, Petr Mladek

Deciding whether more space is needed is tricky in the printk
ring buffer code:

  1. The given lpos values might overflow. A subtraction must be used
     instead of a simple "lower than" check.

  2. Another CPU might reuse the space in the meantime. It can be
     detected when the result of the subtraction is bigger than
     DATA_SIZE(data_ring).

  3. There is exactly enough space when the result of the subtraction
     is zero. But more space is needed when the result is exactly
     DATA_SIZE(data_ring).

Add a helper function to make sure that the check is done correctly
in all situations. It also helps to make the code consistent and
better documented.
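
The three points above can be tried out in user space with the
following sketch. The expression is the same as in the new helper,
but the model is an assumption in one respect: it uses a hypothetical
constant MODEL_DATA_SIZE instead of DATA_SIZE(data_ring), and the name
model_need_more_space() is made up for this illustration.

```c
#include <stdbool.h>

/* Hypothetical ring size; the kernel uses DATA_SIZE(data_ring). */
#define MODEL_DATA_SIZE 4096UL

/*
 * Same expression as in the patch. The unsigned subtraction handles
 * lpos overflow, and the "- 1" turns a difference of zero into
 * ULONG_MAX so that "exactly enough space" yields false.
 */
static bool model_need_more_space(unsigned long lpos_current,
				  unsigned long lpos_target)
{
	return lpos_target - lpos_current - 1 < MODEL_DATA_SIZE;
}
```

For example, model_need_more_space(0UL - 8, 8) is true even though
a plain "lpos_current < lpos_target" comparison would be false after
the overflow. The result is false when both positions are equal, and
also when the target is more than MODEL_DATA_SIZE ahead.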

Suggested-by: John Ogness <john.ogness@linutronix.de>
Link: https://lore.kernel.org/r/87tsz7iea2.fsf@jogness.linutronix.de
Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 kernel/printk/printk_ringbuffer.c | 31 +++++++++++++++++++++++++++----
 1 file changed, 27 insertions(+), 4 deletions(-)

diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
index 3e6fd8d6fa9f..ede3039dd041 100644
--- a/kernel/printk/printk_ringbuffer.c
+++ b/kernel/printk/printk_ringbuffer.c
@@ -411,6 +411,23 @@ static bool data_check_size(struct prb_data_ring *data_ring, unsigned int size)
 	return to_blk_size(size) <= DATA_SIZE(data_ring) / 2;
 }
 
+/*
+ * Compare the current and requested logical position and decide
+ * whether more space needed.
+ *
+ * Return false when @lpos_current is already at or beyond @lpos_target.
+ *
+ * Also return false when the difference between the positions is bigger
+ * than the size of the data buffer. It might happen only when the caller
+ * raced with another CPU(s) which already made and used the space.
+ */
+static bool need_more_space(struct prb_data_ring *data_ring,
+			    unsigned long lpos_current,
+			    unsigned long lpos_target)
+{
+	return lpos_target - lpos_current - 1 < DATA_SIZE(data_ring);
+}
+
 /* Query the state of a descriptor. */
 static enum desc_state get_desc_state(unsigned long id,
 				      unsigned long state_val)
@@ -577,7 +594,7 @@ static bool data_make_reusable(struct printk_ringbuffer *rb,
 	unsigned long id;
 
 	/* Loop until @lpos_begin has advanced to or beyond @lpos_end. */
-	while ((lpos_end - lpos_begin) - 1 < DATA_SIZE(data_ring)) {
+	while (need_more_space(data_ring, lpos_begin, lpos_end)) {
 		blk = to_block(data_ring, lpos_begin);
 
 		/*
@@ -668,7 +685,7 @@ static bool data_push_tail(struct printk_ringbuffer *rb, unsigned long lpos)
 	 * sees the new tail lpos, any descriptor states that transitioned to
 	 * the reusable state must already be visible.
 	 */
-	while ((lpos - tail_lpos) - 1 < DATA_SIZE(data_ring)) {
+	while (need_more_space(data_ring, tail_lpos, lpos)) {
 		/*
 		 * Make all descriptors reusable that are associated with
 		 * data blocks before @lpos.
@@ -1148,8 +1165,14 @@ static char *data_realloc(struct printk_ringbuffer *rb, unsigned int size,
 
 	next_lpos = get_next_lpos(data_ring, blk_lpos->begin, size);
 
-	/* If the data block does not increase, there is nothing to do. */
-	if (head_lpos - next_lpos < DATA_SIZE(data_ring)) {
+	/*
+	 * Use the current data block when the size does not increase.
+	 *
+	 * Note that need_more_space() could never return false here because
+	 * the difference between the positions was bigger than the data
+	 * buffer size. The data block is reopened and can't get reused.
+	 */
+	if (!need_more_space(data_ring, head_lpos, next_lpos)) {
 		if (wrapped)
 			blk = to_block(data_ring, 0);
 		else
-- 
2.51.1



* Re: [PATCH 1/2] printk_ringbuffer: Fix check of valid data size when blk_lpos overflows
  2025-11-07 19:47 ` [PATCH 1/2] printk_ringbuffer: Fix check of valid data size when blk_lpos overflows Petr Mladek
@ 2025-11-10  9:13   ` John Ogness
  0 siblings, 0 replies; 7+ messages in thread
From: John Ogness @ 2025-11-10  9:13 UTC
  To: Petr Mladek
  Cc: Joanne Koong, amurray @ thegoodpenguin . co . uk, brauner, chao,
	djwong, jaegeuk, linux-f2fs-devel, linux-fsdevel, linux-kernel,
	linux-xfs, syzkaller-bugs, Petr Mladek

On 2025-11-07, Petr Mladek <pmladek@suse.com> wrote:
> Commit 67e1b0052f6bb8 ("printk_ringbuffer: don't needlessly wrap
> data blocks around") makes it possible to use the last 4 bytes of the
> ring buffer.
>
> But the check for the @data_size was not properly updated in get_data().
> It fails when "blk_lpos->next" overflows to "0". In this case:
>
>   + is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next)
>     returns "false" because it checks "blk_lpos->next - 1".
>
>   + "blk_lpos->begin < blk_lpos->next" fails because "blk_lpos->next"
>     is already 0.
>
>   + is_blk_wrapped(data_ring, blk_lpos->begin + DATA_SIZE(data_ring),
>     blk_lpos->next) returns "false" because "begin_lpos" is from
>     the next wrap but "next_lpos - 1" is from the previous one.
>
> As a result, get_data() triggers the WARN_ON_ONCE() for "Illegal
> block description", for example:
>
> [  216.317316][ T7652] loop0: detected capacity change from 0 to 16
> ** 1 printk messages dropped **
> [  216.327750][ T7652] ------------[ cut here ]------------
> [  216.327789][ T7652] WARNING: kernel/printk/printk_ringbuffer.c:1278 at get_data+0x48a/0x840, CPU#1: syz.0.585/7652
> [  216.327848][ T7652] Modules linked in:
> [  216.327907][ T7652] CPU: 1 UID: 0 PID: 7652 Comm: syz.0.585 Not tainted syzkaller #0 PREEMPT(full)
> [  216.327933][ T7652] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/02/2025
> [  216.327953][ T7652] RIP: 0010:get_data+0x48a/0x840
> [  216.327986][ T7652] Code: 83 c4 f8 48 b8 00 00 00 00 00 fc ff df 41 0f b6 04 07 84 c0 0f 85 ee 01 00 00 44 89 65 00 49 83 c5 08 eb 13 e8 a7 19 1f 00 90 <0f> 0b 90 eb 05 e8 9c 19 1f 00 45 31 ed 4c 89 e8 48 83 c4 28 5b 41
> [  216.328007][ T7652] RSP: 0018:ffffc900035170e0 EFLAGS: 00010293
> [  216.328029][ T7652] RAX: ffffffff81a1eee9 RBX: 00003fffffffffff RCX: ffff888033255b80
> [  216.328048][ T7652] RDX: 0000000000000000 RSI: 00003fffffffffff RDI: 0000000000000000
> [  216.328063][ T7652] RBP: 0000000000000012 R08: 0000000000000e55 R09: 000000325e213cc7
> [  216.328079][ T7652] R10: 000000325e213cc7 R11: 00001de4c2000037 R12: 0000000000000012
> [  216.328095][ T7652] R13: 0000000000000000 R14: ffffc90003517228 R15: 1ffffffff1bca646
> [  216.328111][ T7652] FS:  00007f44eb8da6c0(0000) GS:ffff888125fda000(0000) knlGS:0000000000000000
> [  216.328131][ T7652] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  216.328147][ T7652] CR2: 00007f44ea9722e0 CR3: 0000000066344000 CR4: 00000000003526f0
> [  216.328168][ T7652] Call Trace:
> [  216.328178][ T7652]  <TASK>
> [  216.328199][ T7652]  _prb_read_valid+0x672/0xa90
> [  216.328328][ T7652]  ? desc_read+0x1b8/0x3f0
> [  216.328381][ T7652]  ? __pfx__prb_read_valid+0x10/0x10
> [  216.328422][ T7652]  ? panic_on_this_cpu+0x32/0x40
> [  216.328450][ T7652]  prb_read_valid+0x3c/0x60
> [  216.328482][ T7652]  printk_get_next_message+0x15c/0x7b0
> [  216.328526][ T7652]  ? __pfx_printk_get_next_message+0x10/0x10
> [  216.328561][ T7652]  ? __lock_acquire+0xab9/0xd20
> [  216.328595][ T7652]  ? console_flush_all+0x131/0xb10
> [  216.328621][ T7652]  ? console_flush_all+0x478/0xb10
> [  216.328648][ T7652]  console_flush_all+0x4cc/0xb10
> [  216.328673][ T7652]  ? console_flush_all+0x131/0xb10
> [  216.328704][ T7652]  ? __pfx_console_flush_all+0x10/0x10
> [  216.328748][ T7652]  ? is_printk_cpu_sync_owner+0x32/0x40
> [  216.328781][ T7652]  console_unlock+0xbb/0x190
> [  216.328815][ T7652]  ? __pfx___down_trylock_console_sem+0x10/0x10
> [  216.328853][ T7652]  ? __pfx_console_unlock+0x10/0x10
> [  216.328899][ T7652]  vprintk_emit+0x4c5/0x590
> [  216.328935][ T7652]  ? __pfx_vprintk_emit+0x10/0x10
> [  216.328993][ T7652]  _printk+0xcf/0x120
> [  216.329028][ T7652]  ? __pfx__printk+0x10/0x10
> [  216.329051][ T7652]  ? kernfs_get+0x5a/0x90
> [  216.329090][ T7652]  _erofs_printk+0x349/0x410
> [  216.329130][ T7652]  ? __pfx__erofs_printk+0x10/0x10
> [  216.329161][ T7652]  ? __raw_spin_lock_init+0x45/0x100
> [  216.329186][ T7652]  ? __init_swait_queue_head+0xa9/0x150
> [  216.329231][ T7652]  erofs_fc_fill_super+0x1591/0x1b20
> [  216.329285][ T7652]  ? __pfx_erofs_fc_fill_super+0x10/0x10
> [  216.329324][ T7652]  ? sb_set_blocksize+0x104/0x180
> [  216.329356][ T7652]  ? setup_bdev_super+0x4c1/0x5b0
> [  216.329385][ T7652]  get_tree_bdev_flags+0x40e/0x4d0
> [  216.329410][ T7652]  ? __pfx_erofs_fc_fill_super+0x10/0x10
> [  216.329444][ T7652]  ? __pfx_get_tree_bdev_flags+0x10/0x10
> [  216.329483][ T7652]  vfs_get_tree+0x92/0x2b0
> [  216.329512][ T7652]  do_new_mount+0x302/0xa10
> [  216.329537][ T7652]  ? apparmor_capable+0x137/0x1b0
> [  216.329576][ T7652]  ? __pfx_do_new_mount+0x10/0x10
> [  216.329605][ T7652]  ? ns_capable+0x8a/0xf0
> [  216.329637][ T7652]  ? kmem_cache_free+0x19b/0x690
> [  216.329682][ T7652]  __se_sys_mount+0x313/0x410
> [  216.329717][ T7652]  ? __pfx___se_sys_mount+0x10/0x10
> [  216.329836][ T7652]  ? do_syscall_64+0xbe/0xfa0
> [  216.329869][ T7652]  ? __x64_sys_mount+0x20/0xc0
> [  216.329901][ T7652]  do_syscall_64+0xfa/0xfa0
> [  216.329932][ T7652]  ? lockdep_hardirqs_on+0x9c/0x150
> [  216.329964][ T7652]  ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
> [  216.329988][ T7652]  ? clear_bhb_loop+0x60/0xb0
> [  216.330017][ T7652]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> [  216.330040][ T7652] RIP: 0033:0x7f44ea99076a
> [  216.330080][ T7652] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb a6 e8 de 1a 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
> [  216.330100][ T7652] RSP: 002b:00007f44eb8d9e68 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
> [  216.330128][ T7652] RAX: ffffffffffffffda RBX: 00007f44eb8d9ef0 RCX: 00007f44ea99076a
> [  216.330146][ T7652] RDX: 0000200000000180 RSI: 00002000000001c0 RDI: 00007f44eb8d9eb0
> [  216.330164][ T7652] RBP: 0000200000000180 R08: 00007f44eb8d9ef0 R09: 0000000000000000
> [  216.330181][ T7652] R10: 0000000000000000 R11: 0000000000000246 R12: 00002000000001c0
> [  216.330196][ T7652] R13: 00007f44eb8d9eb0 R14: 00000000000001a1 R15: 0000200000000080
> [  216.330233][ T7652]  </TASK>
>
> Solve the problem by moving and fixing the sanity check. The problematic
> if-else-if-else code will just distinguish three basic scenarios:
> "regular" vs. "wrapped" vs. "too many times wrapped" block.
>
> The new sanity check is more precise. A valid "data_size" must be
> lower than half of the data buffer size. Also it must not be zero at
> this stage. This makes it possible to catch a problematic "data_size"
> even for wrapped blocks.
>
> Closes: https://lore.kernel.org/all/69096836.a70a0220.88fb8.0006.GAE@google.com/
> Closes: https://lore.kernel.org/all/69078fb6.050a0220.29fc44.0029.GAE@google.com/
> Fixes: 67e1b0052f6bb82 ("printk_ringbuffer: don't needlessly wrap data blocks around")
> Signed-off-by: Petr Mladek <pmladek@suse.com>

Reviewed-by: John Ogness <john.ogness@linutronix.de>
Tested-by: John Ogness <john.ogness@linutronix.de>


* Re: [PATCH 2/2] printk_ringbuffer: Create a helper function to decide whether a more space is needed
  2025-11-07 19:47 ` [PATCH 2/2] printk_ringbuffer: Create a helper function to decide whether a more space is needed Petr Mladek
@ 2025-11-10  9:21   ` John Ogness
  0 siblings, 0 replies; 7+ messages in thread
From: John Ogness @ 2025-11-10  9:21 UTC
  To: Petr Mladek
  Cc: Joanne Koong, amurray @ thegoodpenguin . co . uk, brauner, chao,
	djwong, jaegeuk, linux-f2fs-devel, linux-fsdevel, linux-kernel,
	linux-xfs, syzkaller-bugs, Petr Mladek

Hi Petr,

Nit: For the patch subject, remove the word "a":

"Create a helper function to decide whether more space is needed"

More below...

On 2025-11-07, Petr Mladek <pmladek@suse.com> wrote:
> Deciding whether more space is needed is tricky in the printk
> ring buffer code:
>
>   1. The given lpos values might overflow. A subtraction must be used
>      instead of a simple "lower than" check.
>
>   2. Another CPU might reuse the space in the meantime. It can be
>      detected when the result of the subtraction is bigger than
>      DATA_SIZE(data_ring).
>
>   3. There is exactly enough space when the result of the subtraction
>      is zero. But more space is needed when the result is exactly
>      DATA_SIZE(data_ring).
>
> Add a helper function to make sure that the check is done correctly
> in all situations. It also helps to make the code consistent and
> better documented.
>
> Suggested-by: John Ogness <john.ogness@linutronix.de>
> Link: https://lore.kernel.org/r/87tsz7iea2.fsf@jogness.linutronix.de
> Signed-off-by: Petr Mladek <pmladek@suse.com>
> ---
>  kernel/printk/printk_ringbuffer.c | 31 +++++++++++++++++++++++++++----
>  1 file changed, 27 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
> index 3e6fd8d6fa9f..ede3039dd041 100644
> --- a/kernel/printk/printk_ringbuffer.c
> +++ b/kernel/printk/printk_ringbuffer.c
> @@ -411,6 +411,23 @@ static bool data_check_size(struct prb_data_ring *data_ring, unsigned int size)
>  	return to_blk_size(size) <= DATA_SIZE(data_ring) / 2;
>  }
>  
> +/*
> + * Compare the current and requested logical position and decide
> + * whether more space needed.
> + *
> + * Return false when @lpos_current is already at or beyond @lpos_target.
> + *
> + * Also return false when the difference between the positions is bigger
> + * than the size of the data buffer. It might happen only when the caller
> + * raced with another CPU(s) which already made and used the space.
> + */
> +static bool need_more_space(struct prb_data_ring *data_ring,
> +			    unsigned long lpos_current,
> +			    unsigned long lpos_target)
> +{
> +	return lpos_target - lpos_current - 1 < DATA_SIZE(data_ring);
> +}
> +
>  /* Query the state of a descriptor. */
>  static enum desc_state get_desc_state(unsigned long id,
>  				      unsigned long state_val)
> @@ -577,7 +594,7 @@ static bool data_make_reusable(struct printk_ringbuffer *rb,
>  	unsigned long id;
>  
>  	/* Loop until @lpos_begin has advanced to or beyond @lpos_end. */
> -	while ((lpos_end - lpos_begin) - 1 < DATA_SIZE(data_ring)) {
> +	while (need_more_space(data_ring, lpos_begin, lpos_end)) {
>  		blk = to_block(data_ring, lpos_begin);
>  
>  		/*
> @@ -668,7 +685,7 @@ static bool data_push_tail(struct printk_ringbuffer *rb, unsigned long lpos)
>  	 * sees the new tail lpos, any descriptor states that transitioned to
>  	 * the reusable state must already be visible.
>  	 */
> -	while ((lpos - tail_lpos) - 1 < DATA_SIZE(data_ring)) {
> +	while (need_more_space(data_ring, tail_lpos, lpos)) {
>  		/*
>  		 * Make all descriptors reusable that are associated with
>  		 * data blocks before @lpos.
> @@ -1148,8 +1165,14 @@ static char *data_realloc(struct printk_ringbuffer *rb, unsigned int size,
>  
>  	next_lpos = get_next_lpos(data_ring, blk_lpos->begin, size);
>  
> -	/* If the data block does not increase, there is nothing to do. */
> -	if (head_lpos - next_lpos < DATA_SIZE(data_ring)) {
> +	/*
> +	 * Use the current data block when the size does not increase.

I would like to expand the above sentence so that it is a bit clearer
how it relates to the new check. Perhaps:

	 * Use the current data block when the size does not increase, i.e.
	 * when @head_lpos is already able to accommodate the new @next_lpos.

> +	 *
> +	 * Note that need_more_space() could never return false here because
> +	 * the difference between the positions was bigger than the data
> +	 * buffer size. The data block is reopened and can't get reused.
> +	 */
> +	if (!need_more_space(data_ring, head_lpos, next_lpos)) {
>  		if (wrapped)
>  			blk = to_block(data_ring, 0);
>  		else
> -- 
> 2.51.1

Otherwise, LGTM. Thanks for choosing a name that presents contextual
purpose rather than simply function.

Reviewed-by: John Ogness <john.ogness@linutronix.de>


* Re: [PATCH 0/2] printk_ringbuffer: Fix regression in get_data() and clean up data size checks
  2025-11-07 19:47 [PATCH 0/2] printk_ringbuffer: Fix regression in get_data() and clean up data size checks Petr Mladek
  2025-11-07 19:47 ` [PATCH 1/2] printk_ringbuffer: Fix check of valid data size when blk_lpos overflows Petr Mladek
  2025-11-07 19:47 ` [PATCH 2/2] printk_ringbuffer: Create a helper function to decide whether a more space is needed Petr Mladek
@ 2025-11-10 12:25 ` Petr Mladek
  2025-12-09 17:18 ` [f2fs-dev] " patchwork-bot+f2fs
  3 siblings, 0 replies; 7+ messages in thread
From: Petr Mladek @ 2025-11-10 12:25 UTC
  To: John Ogness
  Cc: Joanne Koong, amurray @ thegoodpenguin . co . uk, brauner, chao,
	djwong, jaegeuk, linux-f2fs-devel, linux-fsdevel, linux-kernel,
	linux-xfs, syzkaller-bugs

On Fri 2025-11-07 20:47:18, Petr Mladek wrote:
> This is the outcome of the long discussion about the regression caused
> by 67e1b0052f6bb82 ("printk_ringbuffer: don't needlessly wrap data blocks around"),
> see https://lore.kernel.org/all/69096836.a70a0220.88fb8.0006.GAE@google.com/
> 
> The 1st patch fixes the regression as agreed, see
> https://lore.kernel.org/all/87ecqb3qd0.fsf@jogness.linutronix.de/
> 
> The 2nd patch adds a helper function to unify the checks whether
> more space is needed. I did my best to address all the concerns
> about the various proposed variants.
> 
> Note that I called the new helper function "need_more_space()" in the end.
> It avoids all the problems with "before" vs. "lt" vs. "le",
> and "_safe" vs. "_sane" vs. "_bounded".
> 
> IMHO, the name "need_more_space()" fits very well in all three
> locations, surprisingly even in data_realloc(). But it is possible
> that you disagree. Let me know if you hate it ;-)
> 
> 
> The patchset applies on top of printk/linux.git, branch for-6.19.
> It should apply on top of linux-next as well.
> 
> Petr Mladek (2):
>   printk_ringbuffer: Fix check of valid data size when blk_lpos
>     overflows
>   printk_ringbuffer: Create a helper function to decide whether a more
>     space is needed
> 
>  kernel/printk/printk_ringbuffer.c | 40 +++++++++++++++++++++++++------
>  1 file changed, 33 insertions(+), 7 deletions(-)

JFYI, the patchset has been committed into printk/linux.git,
branch for-6.19.

Note that I have updated the Subject and a comment in the 2nd patch
as suggested by John, see
https://git.kernel.org/pub/scm/linux/kernel/git/printk/linux.git/commit/?h=for-6.19&id=394aa576c0b783ae728d87ed98fe4f1831dfd720

Best Regards,
Petr


* Re: [f2fs-dev] [PATCH 0/2] printk_ringbuffer: Fix regression in get_data() and clean up data size checks
  2025-11-07 19:47 [PATCH 0/2] printk_ringbuffer: Fix regression in get_data() and clean up data size checks Petr Mladek
                   ` (2 preceding siblings ...)
  2025-11-10 12:25 ` [PATCH 0/2] printk_ringbuffer: Fix regression in get_data() and clean up data size checks Petr Mladek
@ 2025-12-09 17:18 ` patchwork-bot+f2fs
  3 siblings, 0 replies; 7+ messages in thread
From: patchwork-bot+f2fs @ 2025-12-09 17:18 UTC
  To: Petr Mladek
  Cc: john.ogness, brauner, djwong, syzkaller-bugs, linux-kernel,
	linux-f2fs-devel, linux-xfs, linux-fsdevel, jaegeuk, joannelkoong,
	amurray

Hello:

This series was applied to jaegeuk/f2fs.git (dev)
by Petr Mladek <pmladek@suse.com>:

On Fri,  7 Nov 2025 20:47:18 +0100 you wrote:
> This is the outcome of the long discussion about the regression caused
> by 67e1b0052f6bb82 ("printk_ringbuffer: don't needlessly wrap data blocks around"),
> see https://lore.kernel.org/all/69096836.a70a0220.88fb8.0006.GAE@google.com/
> 
> The 1st patch fixes the regression as agreed, see
> https://lore.kernel.org/all/87ecqb3qd0.fsf@jogness.linutronix.de/
> 
> [...]

Here is the summary with links:
  - [f2fs-dev,1/2] printk_ringbuffer: Fix check of valid data size when blk_lpos overflows
    https://git.kernel.org/jaegeuk/f2fs/c/cc3bad11de6e
  - [f2fs-dev,2/2] printk_ringbuffer: Create a helper function to decide whether a more space is needed
    https://git.kernel.org/jaegeuk/f2fs/c/394aa576c0b7

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


