* [PATCH] media: rzg2l-cru: serialize state transitions with qlock
@ 2026-04-21 6:03 Shuhao Fu
2026-04-21 6:27 ` Shuhao Fu
0 siblings, 1 reply; 2+ messages in thread
From: Shuhao Fu @ 2026-04-21 6:03 UTC (permalink / raw)
To: Mauro Carvalho Chehab, linux-media; +Cc: linux-kernel
struct rzg2l_cru_dev.state is documented as protected by qlock, and the
IRQ path already reads and updates it under that lock. However,
rzg2l_cru_stop_streaming() writes STOPPING and
rzg2l_cru_start_streaming_vq() writes STARTING without taking qlock.
That lets process-context stream control race with rzg2l_cru_irq().
If the IRQ handler misses a concurrent STOPPING update, it can continue
normal frame completion and slot refill after streamoff has begun. A
similar race around STARTING can make the IRQ path observe the wrong
phase during startup synchronization.
Fix both state transitions by serializing the writes with qlock, while
still keeping rzg2l_cru_set_stream() outside the locked region.
Fixes: 07fc05bd0a79 ("media: platform: Add Renesas RZ/G2L CRU driver")
Signed-off-by: Shuhao Fu <sfual@cse.ust.hk>
---
drivers/media/platform/renesas/rzg2l-cru/rzg2l-video.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/media/platform/renesas/rzg2l-cru/rzg2l-video.c b/drivers/media/platform/renesas/rzg2l-cru/rzg2l-video.c
index 162e2ace693184..434754fd155a8e 100644
--- a/drivers/media/platform/renesas/rzg2l-cru/rzg2l-video.c
+++ b/drivers/media/platform/renesas/rzg2l-cru/rzg2l-video.c
@@ -560,7 +560,11 @@ pipe_line_stop:
static void rzg2l_cru_stop_streaming(struct rzg2l_cru_dev *cru)
{
+ unsigned long flags;
+
+ spin_lock_irqsave(&cru->qlock, flags);
cru->state = RZG2L_CRU_DMA_STOPPING;
+ spin_unlock_irqrestore(&cru->qlock, flags);
rzg2l_cru_set_stream(cru, 0);
}
@@ -749,6 +753,7 @@ irqreturn_t rzg3e_cru_irq(int irq, void *data)
static int rzg2l_cru_start_streaming_vq(struct vb2_queue *vq, unsigned int count)
{
struct rzg2l_cru_dev *cru = vb2_get_drv_priv(vq);
+ unsigned long flags;
int ret;
ret = pm_runtime_resume_and_get(cru->dev);
@@ -791,7 +796,9 @@ static int rzg2l_cru_start_streaming_vq(struct vb2_queue *vq, unsigned int count
goto out;
}
+ spin_lock_irqsave(&cru->qlock, flags);
cru->state = RZG2L_CRU_DMA_STARTING;
+ spin_unlock_irqrestore(&cru->qlock, flags);
dev_dbg(cru->dev, "Starting to capture\n");
return 0;
--
2.25.1
^ permalink raw reply related [flat|nested] 2+ messages in thread
* Re: [PATCH] media: rzg2l-cru: serialize state transitions with qlock
2026-04-21 6:03 [PATCH] media: rzg2l-cru: serialize state transitions with qlock Shuhao Fu
@ 2026-04-21 6:27 ` Shuhao Fu
0 siblings, 0 replies; 2+ messages in thread
From: Shuhao Fu @ 2026-04-21 6:27 UTC (permalink / raw)
To: Mauro Carvalho Chehab, linux-media; +Cc: linux-kernel
Hi,
Here is the best reproduction detail I could put together locally.
From source review, I think there are two windows where process-context
state updates can overlap the IRQ handler's reads of `cru->state`:
- on streamoff, `rzg2l_cru_stop_streaming()` stores `STOPPING` before
it calls `rzg2l_cru_set_stream(cru, 0)`, while interrupts are only
disabled later in `rzg2l_cru_stop_image_processing()`
- on streamon, `rzg2l_cru_start_image_processing()` enables interrupts
before `rzg2l_cru_set_stream(cru, 1)` returns, while
`rzg2l_cru_start_streaming_vq()` stores `STARTING` only after that
I do not have an RZ/G2L board or an arm64/QEMU model for this CRU
block, so I could not reproduce either path from a real userspace V4L2
stream on actual hardware. The setup below is only the best local
reference proof I could produce in this environment. It is not a claim
of a natural hardware-backed repro.
Locally, I targeted the streamoff-side `STOPPING` vs IRQ overlap.
1. Build a KCSAN/KUnit kernel with the dedicated config fragment:
./tools/testing/kunit/kunit.py build \
--arch=x86_64 \
--kunitconfig=kernel/kcsan/rzg2l_cru.kunitconfig \
--build_dir=../out-rzg2l-kunit-red2 \
--make_options CC=clang-20 \
--make_options LD=ld.bfd \
--make_options AR=llvm-ar-20 \
--make_options NM=llvm-nm-20 \
--make_options OBJCOPY=llvm-objcopy-20 \
--make_options READELF=llvm-readelf-20 \
--make_options LLVM_IAS=1 \
--jobs 8
2. Boot that kernel under QEMU:
timeout 90 qemu-system-x86_64 \
-m 1024 \
-kernel out-rzg2l-kunit-red2/arch/x86/boot/bzImage \
-append 'kunit.filter_glob=kcsan.test_rzg2l_cru_state_stop_vs_irq* kunit.enable=1 console=ttyS0 kunit_shutdown=reboot' \
-no-reboot \
-nographic \
-accel tcg \
-smp 4
3. The KUnit/KCSAN test creates a fake `rzg2l_cru_dev`, records the
address of `cru->state`, and then runs two worker sides concurrently:
- writer side: `test_rzg2l_cru_kunit_stop()`, which just calls the
real `rzg2l_cru_stop_streaming()`
- reader side: `test_rzg2l_cru_kunit_irq()`, which seeds minimal
fake MMIO state and then calls the real `rzg2l_cru_irq()`
So the harness does not invent a fake state variable or a fake reader.
It only provides enough fake object/MMIO state for the real driver code
to run on x86 and reproduce the stop-side overlap in a controlled way.
With that setup I got repeated KCSAN reports of:
BUG: KCSAN: data-race in rzg2l_cru_irq / test_rzg2l_cru_kunit_stop
The first hit in my local log was:
write to 0xffff9bd4c1c03cf4 of 4 bytes by task 54 on cpu 0:
test_rzg2l_cru_kunit_stop+0x14/0x30
test_kernel_rzg2l_cru_stop+0x20/0x30
access_thread+0x93/0xe0
read to 0xffff9bd4c1c03cf4 of 4 bytes by task 53 on cpu 3:
rzg2l_cru_irq+0x110/0x2d0
test_rzg2l_cru_kunit_irq+0x4d/0x60
test_kernel_rzg2l_cru_irq+0x20/0x30
The same run then hit the same race again in the 3-thread and 4-thread
variants, still on the same 4-byte `state` address and still with the
same writer/reader pair.
Thanks,
Shuhao
On Tue, Apr 21, 2026 at 02:03:26PM +0800, Shuhao Fu wrote:
> struct rzg2l_cru_dev.state is documented as protected by qlock, and the
> IRQ path already reads and updates it under that lock. However,
> rzg2l_cru_stop_streaming() writes STOPPING and
> rzg2l_cru_start_streaming_vq() writes STARTING without taking qlock.
>
> That lets process-context stream control race with rzg2l_cru_irq().
> If the IRQ handler misses a concurrent STOPPING update, it can continue
> normal frame completion and slot refill after streamoff has begun. A
> similar race around STARTING can make the IRQ path observe the wrong
> phase during startup synchronization.
>
> Fix both state transitions by serializing the writes with qlock, while
> still keeping rzg2l_cru_set_stream() outside the locked region.
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2026-04-21 6:27 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-21 6:03 [PATCH] media: rzg2l-cru: serialize state transitions with qlock Shuhao Fu
2026-04-21 6:27 ` Shuhao Fu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox