public inbox for linux-kernel@vger.kernel.org
* [PATCH] media: rzg2l-cru: serialize state transitions with qlock
From: Shuhao Fu @ 2026-04-21  6:03 UTC
  To: Mauro Carvalho Chehab, linux-media; +Cc: linux-kernel

struct rzg2l_cru_dev.state is documented as protected by qlock, and the
IRQ path already reads and updates it under that lock. However,
rzg2l_cru_stop_streaming() writes STOPPING and
rzg2l_cru_start_streaming_vq() writes STARTING without taking qlock.

That lets process-context stream control race with rzg2l_cru_irq().
If the IRQ handler misses a concurrent STOPPING update, it can continue
normal frame completion and slot refill after streamoff has begun. A
similar race around STARTING can make the IRQ path observe the wrong
phase during startup synchronization.

Fix both state transitions by serializing the writes with qlock, while
still keeping rzg2l_cru_set_stream() outside the locked region.

Fixes: 07fc05bd0a79 ("media: platform: Add Renesas RZ/G2L CRU driver")
Signed-off-by: Shuhao Fu <sfual@cse.ust.hk>
---
 drivers/media/platform/renesas/rzg2l-cru/rzg2l-video.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/media/platform/renesas/rzg2l-cru/rzg2l-video.c b/drivers/media/platform/renesas/rzg2l-cru/rzg2l-video.c
index 162e2ace693184..434754fd155a8e 100644
--- a/drivers/media/platform/renesas/rzg2l-cru/rzg2l-video.c
+++ b/drivers/media/platform/renesas/rzg2l-cru/rzg2l-video.c
@@ -560,7 +560,11 @@ pipe_line_stop:
 
 static void rzg2l_cru_stop_streaming(struct rzg2l_cru_dev *cru)
 {
+	unsigned long flags;
+
+	spin_lock_irqsave(&cru->qlock, flags);
 	cru->state = RZG2L_CRU_DMA_STOPPING;
+	spin_unlock_irqrestore(&cru->qlock, flags);
 
 	rzg2l_cru_set_stream(cru, 0);
 }
@@ -749,6 +753,7 @@ irqreturn_t rzg3e_cru_irq(int irq, void *data)
 static int rzg2l_cru_start_streaming_vq(struct vb2_queue *vq, unsigned int count)
 {
 	struct rzg2l_cru_dev *cru = vb2_get_drv_priv(vq);
+	unsigned long flags;
 	int ret;
 
 	ret = pm_runtime_resume_and_get(cru->dev);
@@ -791,7 +796,9 @@ static int rzg2l_cru_start_streaming_vq(struct vb2_queue *vq, unsigned int count
 		goto out;
 	}
 
+	spin_lock_irqsave(&cru->qlock, flags);
 	cru->state = RZG2L_CRU_DMA_STARTING;
+	spin_unlock_irqrestore(&cru->qlock, flags);
 	dev_dbg(cru->dev, "Starting to capture\n");
 	return 0;
 
-- 
2.25.1


* Re: [PATCH] media: rzg2l-cru: serialize state transitions with qlock
From: Shuhao Fu @ 2026-04-21  6:27 UTC
  To: Mauro Carvalho Chehab, linux-media; +Cc: linux-kernel

Hi,

Here is the best reproduction detail I could put together locally.

From source review, I think there are two windows where process-context
state updates can overlap the IRQ handler's reads of `cru->state`:

- on streamoff, `rzg2l_cru_stop_streaming()` stores `STOPPING` before
  it calls `rzg2l_cru_set_stream(cru, 0)`, while interrupts are only
  disabled later in `rzg2l_cru_stop_image_processing()`
- on streamon, `rzg2l_cru_start_image_processing()` enables interrupts
  before `rzg2l_cru_set_stream(cru, 1)` returns, while
  `rzg2l_cru_start_streaming_vq()` stores `STARTING` only after that

I do not have an RZ/G2L board or an arm64/QEMU model for this CRU
block, so I could not reproduce either path with a real userspace V4L2
stream on actual hardware. The setup below is only the closest local
evidence I could produce in this environment; it is not a claim of a
natural, hardware-backed reproduction.

Locally, I targeted the streamoff-side `STOPPING` vs IRQ overlap.

1. Build a KCSAN/KUnit kernel with the dedicated config fragment:

   ./tools/testing/kunit/kunit.py build \
     --arch=x86_64 \
     --kunitconfig=kernel/kcsan/rzg2l_cru.kunitconfig \
     --build_dir=../out-rzg2l-kunit-red2 \
     --make_options CC=clang-20 \
     --make_options LD=ld.bfd \
     --make_options AR=llvm-ar-20 \
     --make_options NM=llvm-nm-20 \
     --make_options OBJCOPY=llvm-objcopy-20 \
     --make_options READELF=llvm-readelf-20 \
     --make_options LLVM_IAS=1 \
     --jobs 8

2. Boot that kernel under QEMU:

   timeout 90 qemu-system-x86_64 \
     -m 1024 \
     -kernel out-rzg2l-kunit-red2/arch/x86/boot/bzImage \
     -append 'kunit.filter_glob=kcsan.test_rzg2l_cru_state_stop_vs_irq* kunit.enable=1 console=ttyS0 kunit_shutdown=reboot' \
     -no-reboot \
     -nographic \
     -accel tcg \
     -smp 4

3. The KUnit/KCSAN test creates a fake `rzg2l_cru_dev`, records the
   address of `cru->state`, and then runs two worker sides concurrently:

   - writer side: `test_rzg2l_cru_kunit_stop()`, which just calls the
     real `rzg2l_cru_stop_streaming()`
   - reader side: `test_rzg2l_cru_kunit_irq()`, which seeds minimal
     fake MMIO state and then calls the real `rzg2l_cru_irq()`

So the harness does not invent a fake state variable or a fake reader.
It only provides enough fake object/MMIO state for the real driver code
to run on x86 and reproduce the stop-side overlap in a controlled way.

With that setup I got repeated KCSAN reports of:

   BUG: KCSAN: data-race in rzg2l_cru_irq / test_rzg2l_cru_kunit_stop

The first hit in my local log was:

   write to 0xffff9bd4c1c03cf4 of 4 bytes by task 54 on cpu 0:
    test_rzg2l_cru_kunit_stop+0x14/0x30
    test_kernel_rzg2l_cru_stop+0x20/0x30
    access_thread+0x93/0xe0

   read to 0xffff9bd4c1c03cf4 of 4 bytes by task 53 on cpu 3:
    rzg2l_cru_irq+0x110/0x2d0
    test_rzg2l_cru_kunit_irq+0x4d/0x60
    test_kernel_rzg2l_cru_irq+0x20/0x30

The same run then hit the same race again in the 3-thread and 4-thread
variants, still on the same 4-byte `state` address and still with the
same writer/reader pair.

Thanks,
Shuhao


On Tue, Apr 21, 2026 at 02:03:26PM +0800, Shuhao Fu wrote:
> struct rzg2l_cru_dev.state is documented as protected by qlock, and the
> IRQ path already reads and updates it under that lock. However,
> rzg2l_cru_stop_streaming() writes STOPPING and
> rzg2l_cru_start_streaming_vq() writes STARTING without taking qlock.
> 
> That lets process-context stream control race with rzg2l_cru_irq().
> If the IRQ handler misses a concurrent STOPPING update, it can continue
> normal frame completion and slot refill after streamoff has begun. A
> similar race around STARTING can make the IRQ path observe the wrong
> phase during startup synchronization.
> 
> Fix both state transitions by serializing the writes with qlock, while
> still keeping rzg2l_cru_set_stream() outside the locked region.
