From: Petr Mladek <pmladek@suse.com>
To: Jianzhou Zhao <luckd0g@163.com>
Cc: linux-kernel@vger.kernel.org, senozhatsky@chromium.org,
	rostedt@goodmis.org, andriy.shevchenko@linux.intel.com,
	linux@rasmusvillemoes.dk, akpm@linux-foundation.org
Subject: Re: KCSAN: data-race in data_push_tail / symbol_string
Date: Thu, 19 Mar 2026 17:22:06 +0100
Message-ID: <abwirtrQbqBSubG8@pathway.suse.cz>
In-Reply-To: <4a692712.648a.19cdbd3a995.Coremail.luckd0g@163.com>

Hi Jianzhou,

First, thanks a lot for the report.

On Wed 2026-03-11 15:36:47, Jianzhou Zhao wrote:
> 
> 
> Subject: [BUG] printk: KCSAN: data-race in data_push_tail / symbol_string
> 
> Dear Maintainers,
> 
> We are writing to report a KCSAN-detected data-race vulnerability in the Linux kernel. This bug was found by our custom fuzzing tool, RacePilot. The bug occurs during ringbuffer tail advancement where a reader speculatively reads the `blk->id` from a physical address that has concurrently been overwritten by a writer formatting a string. We observed this on the Linux kernel version 6.18.0-08691-g2061f18ad76e-dirty.
> 
> Call Trace & Context
> ==================================================================
> BUG: KCSAN: data-race in data_push_tail.part.0 / symbol_string
> 
> write to 0xffffffff88f194a8 of 1 bytes by task 38579 on cpu 0:
>  string_nocheck lib/vsprintf.c:658 [inline]
>  symbol_string+0x129/0x2c0 lib/vsprintf.c:1020
>  pointer+0x24c/0x920 lib/vsprintf.c:2565
>  vsnprintf+0x5d0/0xb80 lib/vsprintf.c:2982
>  vscnprintf+0x41/0x90 lib/vsprintf.c:3042
>  printk_sprint+0x31/0x1c0 kernel/printk/printk.c:2199
>  vprintk_store+0x3f6/0x980 kernel/printk/printk.c:2321
>  vprintk_emit+0xfd/0x540 kernel/printk/printk.c:2412
>  vprintk_default+0x26/0x30 kernel/printk/printk.c:2451
>  vprintk+0x1d/0x30 kernel/printk/printk_safe.c:82
>  _printk+0x63/0x90 kernel/printk/printk.c:2461
>  printk_stack_address arch/x86/kernel/dumpstack.c:70 [inline]
> 
> read to 0xffffffff88f194a8 of 8 bytes by task 38521 on cpu 1:
>  data_make_reusable kernel/printk/printk_ringbuffer.c:606 [inline]
>  data_push_tail.part.0+0xe6/0x350 kernel/printk/printk_ringbuffer.c:692
>  data_push_tail kernel/printk/printk_ringbuffer.c:656 [inline]
>  data_alloc+0x157/0x330 kernel/printk/printk_ringbuffer.c:1096
>  prb_reserve+0x44d/0x7d0 kernel/printk/printk_ringbuffer.c:1742
>  vprintk_store+0x3b4/0x980 kernel/printk/printk.c:2311
>  vprintk_emit+0xfd/0x540 kernel/printk/printk.c:2412
>  vprintk_default+0x26/0x30 kernel/printk/printk.c:2451
>  vprintk+0x1d/0x30 kernel/printk/printk_safe.c:82
>  _printk+0x63/0x90 kernel/printk/printk.c:2461
> 
> value changed: 0x00000000fffff47a -> 0x302f303978302b6c
> 
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 1 UID: 0 PID: 38521 Comm: syz.7.1998 Not tainted 6.18.0-08691-g2061f18ad76e-dirty #42 PREEMPT(voluntary) 
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> ==================================================================
> 
> Execution Flow & Code Context
> On CPU 0, a printing task formats an output string containing a symbol address through `vsprintf.c` which recursively formats data and writes to a buffer natively byte-by-byte:
> ```c
> // lib/vsprintf.c
> static char *string_nocheck(char *buf, char *end, const char *s,
> 			    struct printf_spec spec)
> {
> 	...
> 	while (lim--) {
> 		char c = *s++;
> 		...
> 		if (buf < end)
> 			*buf = c; // <-- Plain Write
> 		++buf;
> 		...
> 	}
> 	return widen_string(buf, len, end, spec);
> }
> ```
> 
> This destination buffer represents the text block inside the physical `printk_ringbuffer` array, historically mapped out by `data_alloc()`. Concurrently, CPU 1 calls `prb_reserve()` advancing `data_make_reusable()` along the same space to check if it's safe to clear descriptors. The reader uses `blk->id` unannotated to see if a particular logical block was recycled:
> ```c
> // kernel/printk/printk_ringbuffer.c
> static bool data_make_reusable(struct printk_ringbuffer *rb, ...)
> {
> 	...
> 	while (need_more_space(data_ring, lpos_begin, lpos_end)) {
> 		blk = to_block(data_ring, lpos_begin);
> 
> 		/*
> 		 * Load the block ID from the data block. This is a data race
> 		 * against a writer that may have newly reserved this data
> 		 * area. If the loaded value matches a valid descriptor ID,
> 		...
> 		 */
> 		id = blk->id; /* LMM(data_make_reusable:A) */  // <-- Plain Lockless Read
> 		...
> ```
> 
> Root Cause Analysis
> A data race occurs because the reader speculatively accesses `blk->id` using a plain memory access (`id = blk->id`). Another concurrent task on CPU 0, running `vsprintf`, has already reserved this region of the data array and is linearly formatting a string into the same physical memory, so CPU 1 reads data that is being overwritten character by character. This is an intentional heuristic documented by the comment: "This is a data race against a writer that may have newly reserved this data area". Reading garbage here is handled gracefully out-of-band by looking up the `sys_desc` ring ID and concluding that it does not match. However, it still trips the KCSAN checks.
> Unfortunately, we were unable to generate a reproducer for this bug.
> 
> Potential Impact
> This data race is functionally benign. If `data_make_reusable` reads formatted text characters rather than a proper `unsigned long id`, it safely skips it and verifies limits via `blk_lpos` logic. However, tripping the KCSAN sanitizer adds unnecessary debugging noise and may hide actual vulnerabilities under prolonged workloads.
> 
> Proposed Fix
> To silence the sanitizer and explicitly annotate for the memory model that this deliberate race is expected, the `data_race()` macro should wrap the read of `blk->id`.
> 
> ```diff
> --- a/kernel/printk/printk_ringbuffer.c
> +++ b/kernel/printk/printk_ringbuffer.c
> @@ -616,7 +616,7 @@ static bool data_make_reusable(struct printk_ringbuffer *rb,
>  		 * sure it points back to this data block. If the check fails,
>  		 * the data area has been recycled by another writer.
>  		 */
> -		id = blk->id; /* LMM(data_make_reusable:A) */
> +		id = data_race(blk->id); /* LMM(data_make_reusable:A) */
>  
>  		d_state = desc_read(desc_ring, id, &desc, NULL,
>  				    NULL); /* LMM(data_make_reusable:B) */
> ```
> 
> We would be highly honored if this could be of any help.

The proposed change makes perfect sense. Would you like to send it as
a proper patch? Or should I prepare it myself (giving you credits)?

The proper patch might be similar to this report. I would just:

  + keep only the printk-related part of the backtraces
  + reformat the other text so that lines are <= 75 characters long
  + add the tags (Reported-by:, Closes:, Fixes:, Signed-off-by:).
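
For readers following along, the pattern that the one-line change
annotates can be sketched in userspace C. This is only an illustration
under assumptions, not the printk_ringbuffer code: struct ring,
struct desc, block_is_owned(), reserve_block() and
overwrite_with_text() are hypothetical names, and data_race() is
defined here as a plain pass-through, whereas the kernel macro also
tells KCSAN that the race is intentional. The point is that the
speculatively loaded ID is never trusted on its own; it is accepted
only if a descriptor points back at the block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in. The kernel's data_race() additionally
 * marks the access as an intentional race for KCSAN; here it only
 * documents the spot.
 */
#define data_race(expr) (expr)

#define DESC_COUNT 8u
#define DATA_SIZE  64u

struct desc {
	unsigned long id;     /* full descriptor ID, not just the ring index */
	unsigned long begin;  /* byte offset of the owned block in the data area */
};

struct ring {
	struct desc descs[DESC_COUNT];
	unsigned long data[DATA_SIZE / sizeof(unsigned long)];
};

/*
 * Return true only if the block at byte offset lpos still belongs to a
 * descriptor that points back at it.
 */
static bool block_is_owned(const struct ring *r, unsigned long lpos)
{
	const struct desc *d;
	unsigned long id;

	assert(lpos % sizeof(unsigned long) == 0 && lpos < DATA_SIZE);

	/* Speculative load: may observe text bytes from a newer writer. */
	id = data_race(r->data[lpos / sizeof(unsigned long)]);

	/* Validate the possibly bogus value against the descriptor ring. */
	d = &r->descs[id % DESC_COUNT];
	return d->id == id && d->begin == lpos;
}

/* Reserve a block: record ownership in the descriptor, then store the ID. */
static void reserve_block(struct ring *r, unsigned long id, unsigned long lpos)
{
	r->descs[id % DESC_COUNT] = (struct desc){ .id = id, .begin = lpos };
	r->data[lpos / sizeof(unsigned long)] = id;
}

/* Simulate a newer writer formatting text over the old block. */
static void overwrite_with_text(struct ring *r, unsigned long lpos,
				const char *s)
{
	size_t n = strlen(s);

	if (n > DATA_SIZE - lpos)
		n = DATA_SIZE - lpos;
	memcpy((char *)r->data + lpos, s, n);
}
```

In this sketch, a block written by reserve_block() passes the check,
while the same location overwritten with formatted text (as in the
"value changed" line of the report) fails it, because no descriptor
carries the garbage value as its ID.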

Best Regards,
Petr
