From: Shuhao Fu <sfual@cse.ust.hk>
To: Mauro Carvalho Chehab <mchehab@kernel.org>, linux-media@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Subject: Re: [PATCH] media: rzg2l-cru: serialize state transitions with qlock
Date: Tue, 21 Apr 2026 14:27:08 +0800 [thread overview]
Message-ID: <20260421062708.GA2548544@chcpu16> (raw)
In-Reply-To: <20260421060307.GA2522920@chcpu16>
Hi,
Here is the best reproduction detail I could put together locally.
From source review, I think there are two windows where process-context
state updates can overlap the IRQ handler's reads of `cru->state`:
- on streamoff, `rzg2l_cru_stop_streaming()` stores `STOPPING` before
it calls `rzg2l_cru_set_stream(cru, 0)`, while interrupts are only
disabled later in `rzg2l_cru_stop_image_processing()`
- on streamon, `rzg2l_cru_start_image_processing()` enables interrupts
before `rzg2l_cru_set_stream(cru, 1)` returns, while
`rzg2l_cru_start_streaming_vq()` stores `STARTING` only after that
I do not have an RZ/G2L board or an arm64/QEMU model for this CRU
block, so I could not reproduce either path from a real userspace V4L2
stream on actual hardware. The setup below is only the best local
reference proof I could produce in this environment. It is not a claim
of a natural hardware-backed repro.
Locally, I targeted the streamoff-side `STOPPING` vs IRQ overlap.
1. Build a KCSAN/KUnit kernel with the dedicated config fragment:
./tools/testing/kunit/kunit.py build \
--arch=x86_64 \
--kunitconfig=kernel/kcsan/rzg2l_cru.kunitconfig \
--build_dir=../out-rzg2l-kunit-red2 \
--make_options CC=clang-20 \
--make_options LD=ld.bfd \
--make_options AR=llvm-ar-20 \
--make_options NM=llvm-nm-20 \
--make_options OBJCOPY=llvm-objcopy-20 \
--make_options READELF=llvm-readelf-20 \
--make_options LLVM_IAS=1 \
--jobs 8
2. Boot that kernel under QEMU:
timeout 90 qemu-system-x86_64 \
-m 1024 \
-kernel out-rzg2l-kunit-red2/arch/x86/boot/bzImage \
-append 'kunit.filter_glob=kcsan.test_rzg2l_cru_state_stop_vs_irq* kunit.enable=1 console=ttyS0 kunit_shutdown=reboot' \
-no-reboot \
-nographic \
-accel tcg \
-smp 4
3. The KUnit/KCSAN test creates a fake `rzg2l_cru_dev`, records the
address of `cru->state`, and then runs two worker sides concurrently:
- writer side: `test_rzg2l_cru_kunit_stop()`, which just calls the
real `rzg2l_cru_stop_streaming()`
- reader side: `test_rzg2l_cru_kunit_irq()`, which seeds minimal
fake MMIO state and then calls the real `rzg2l_cru_irq()`
So the harness does not invent a fake state variable or a fake reader.
It only provides enough fake object/MMIO state for the real driver code
to run on x86 and reproduce the stop-side overlap in a controlled way.
With that setup I got repeated KCSAN reports of:
BUG: KCSAN: data-race in rzg2l_cru_irq / test_rzg2l_cru_kunit_stop
The first hit in my local log was:
write to 0xffff9bd4c1c03cf4 of 4 bytes by task 54 on cpu 0:
test_rzg2l_cru_kunit_stop+0x14/0x30
test_kernel_rzg2l_cru_stop+0x20/0x30
access_thread+0x93/0xe0
read to 0xffff9bd4c1c03cf4 of 4 bytes by task 53 on cpu 3:
rzg2l_cru_irq+0x110/0x2d0
test_rzg2l_cru_kunit_irq+0x4d/0x60
test_kernel_rzg2l_cru_irq+0x20/0x30
The same run then hit the same race again in the 3-thread and 4-thread
variants, still on the same 4-byte `state` address and still with the
same writer/reader pair.
Thanks,
Shuhao
On Tue, Apr 21, 2026 at 02:03:26PM +0800, Shuhao Fu wrote:
> struct rzg2l_cru_dev.state is documented as protected by qlock, and the
> IRQ path already reads and updates it under that lock. However,
> rzg2l_cru_stop_streaming() writes STOPPING and
> rzg2l_cru_start_streaming_vq() writes STARTING without taking qlock.
>
> That lets process-context stream control race with rzg2l_cru_irq().
> If the IRQ handler misses a concurrent STOPPING update, it can continue
> normal frame completion and slot refill after streamoff has begun. A
> similar race around STARTING can make the IRQ path observe the wrong
> phase during startup synchronization.
>
> Fix both state transitions by serializing the writes with qlock, while
> still keeping rzg2l_cru_set_stream() outside the locked region.
next prev parent reply other threads:[~2026-04-21 6:27 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-21 6:03 [PATCH] media: rzg2l-cru: serialize state transitions with qlock Shuhao Fu
2026-04-21 6:27 ` Shuhao Fu [this message]
2026-04-22 6:46 ` Jacopo Mondi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260421062708.GA2548544@chcpu16 \
--to=sfual@cse.ust.hk \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-media@vger.kernel.org \
--cc=mchehab@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.