From: Shuhao Fu <sfual@cse.ust.hk>
To: Mauro Carvalho Chehab <mchehab@kernel.org>, linux-media@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Subject: Re: [PATCH] media: rzg2l-cru: serialize state transitions with qlock
Date: Tue, 21 Apr 2026 14:27:08 +0800 [thread overview]
Message-ID: <20260421062708.GA2548544@chcpu16> (raw)
In-Reply-To: <20260421060307.GA2522920@chcpu16>
Hi,
Here is the best reproduction detail I could put together locally.
From source review, I think there are two windows where process-context
state updates can overlap the IRQ handler's reads of `cru->state`:
- on streamoff, `rzg2l_cru_stop_streaming()` stores `STOPPING` before
it calls `rzg2l_cru_set_stream(cru, 0)`, while interrupts are only
disabled later in `rzg2l_cru_stop_image_processing()`
- on streamon, `rzg2l_cru_start_image_processing()` enables interrupts
before `rzg2l_cru_set_stream(cru, 1)` returns, while
`rzg2l_cru_start_streaming_vq()` stores `STARTING` only after that
I do not have an RZ/G2L board or an arm64/QEMU model for this CRU
block, so I could not reproduce either path from a real userspace V4L2
stream on actual hardware. The setup below is only the best local
reference proof I could produce in this environment. It is not a claim
of a natural hardware-backed repro.
Locally, I targeted the streamoff-side `STOPPING` vs IRQ overlap.
1. Build a KCSAN/KUnit kernel with the dedicated config fragment:
./tools/testing/kunit/kunit.py build \
--arch=x86_64 \
--kunitconfig=kernel/kcsan/rzg2l_cru.kunitconfig \
--build_dir=../out-rzg2l-kunit-red2 \
--make_options CC=clang-20 \
--make_options LD=ld.bfd \
--make_options AR=llvm-ar-20 \
--make_options NM=llvm-nm-20 \
--make_options OBJCOPY=llvm-objcopy-20 \
--make_options READELF=llvm-readelf-20 \
--make_options LLVM_IAS=1 \
--jobs 8
2. Boot that kernel under QEMU:
timeout 90 qemu-system-x86_64 \
-m 1024 \
-kernel out-rzg2l-kunit-red2/arch/x86/boot/bzImage \
-append 'kunit.filter_glob=kcsan.test_rzg2l_cru_state_stop_vs_irq* kunit.enable=1 console=ttyS0 kunit_shutdown=reboot' \
-no-reboot \
-nographic \
-accel tcg \
-smp 4
3. The KUnit/KCSAN test creates a fake `rzg2l_cru_dev`, records the
address of `cru->state`, and then runs two worker sides concurrently:
- writer side: `test_rzg2l_cru_kunit_stop()`, which just calls the
real `rzg2l_cru_stop_streaming()`
- reader side: `test_rzg2l_cru_kunit_irq()`, which seeds minimal
fake MMIO state and then calls the real `rzg2l_cru_irq()`
So the harness does not invent a fake state variable or a fake reader.
It only provides enough fake object/MMIO state for the real driver code
to run on x86 and reproduce the stop-side overlap in a controlled way.
With that setup I got repeated KCSAN reports of:
BUG: KCSAN: data-race in rzg2l_cru_irq / test_rzg2l_cru_kunit_stop
The first hit in my local log was:
write to 0xffff9bd4c1c03cf4 of 4 bytes by task 54 on cpu 0:
test_rzg2l_cru_kunit_stop+0x14/0x30
test_kernel_rzg2l_cru_stop+0x20/0x30
access_thread+0x93/0xe0
read to 0xffff9bd4c1c03cf4 of 4 bytes by task 53 on cpu 3:
rzg2l_cru_irq+0x110/0x2d0
test_rzg2l_cru_kunit_irq+0x4d/0x60
test_kernel_rzg2l_cru_irq+0x20/0x30
The same run then hit the same race again in the 3-thread and 4-thread
variants, still on the same 4-byte `state` address and still with the
same writer/reader pair.
Thanks,
Shuhao
On Tue, Apr 21, 2026 at 02:03:26PM +0800, Shuhao Fu wrote:
> struct rzg2l_cru_dev.state is documented as protected by qlock, and the
> IRQ path already reads and updates it under that lock. However,
> rzg2l_cru_stop_streaming() writes STOPPING and
> rzg2l_cru_start_streaming_vq() writes STARTING without taking qlock.
>
> That lets process-context stream control race with rzg2l_cru_irq().
> If the IRQ handler misses a concurrent STOPPING update, it can continue
> normal frame completion and slot refill after streamoff has begun. A
> similar race around STARTING can make the IRQ path observe the wrong
> phase during startup synchronization.
>
> Fix both state transitions by serializing the writes with qlock, while
> still keeping rzg2l_cru_set_stream() outside the locked region.
prev parent reply other threads:[~2026-04-21 6:27 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-21 6:03 [PATCH] media: rzg2l-cru: serialize state transitions with qlock Shuhao Fu
2026-04-21 6:27 ` Shuhao Fu [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260421062708.GA2548544@chcpu16 \
--to=sfual@cse.ust.hk \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-media@vger.kernel.org \
--cc=mchehab@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox