From: sashiko-bot@kernel.org
To: "Nas Chung" <nas.chung@chipsnmedia.com>
Cc: devicetree@vger.kernel.org, robh@kernel.org, conor+dt@kernel.org
Subject: Re: [PATCH v6 4/9] media: chips-media: wave6: Add v4l2 m2m driver support
Date: Wed, 24 Jun 2026 07:40:30 +0000 [thread overview]
Message-ID: <20260624074031.617F21F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260624072043.238-5-nas.chung@chipsnmedia.com>
Thank you for your contribution! Sashiko AI review found 10 potential issue(s) to consider:
- [High] On encoder/decoder start failure, the driver removes the wrong source buffer from the M2M queue, leading to premature buffer completion and Use-After-Free.
- [High] The decoder `VIDIOC_S_FMT` handlers lack `vb2_is_busy()` checks, allowing userspace to dynamically change the pixel format during streaming.
- [High] Missing `vb2_is_busy()` check in encoder's `s_selection` allows changing crop dimensions during streaming, leading to hardware DMA out-of-bounds writes.
- [High] The `video_device` structure is allocated with `devm_kzalloc`, leading to a Use-After-Free on driver unbind.
- [High] A race condition in the global `task_timer` timeout handler leads to Use-After-Free and M2M state corruption.
- [High] Unprotected traversal of `v4l2_m2m` ready queues causes linked list corruption.
- [High] Out-of-bounds array reads and writes due to unvalidated frame buffer and motion vector counts returned directly from hardware registers.
- [High] Missing hardware reset and completion reinitialization on VPU timeout leaves a stray completion that corrupts future hardware syncs.
- [Medium] Conditionally dropping the PM usage counter based on `pm_runtime_suspended()` leaks the usage counter.
- [Medium] Decoder permanently enables the hardware scaler once scaling is requested.
--
commit 85193b153b6b5079480d70d95e569973a1a2732e
Author: Nas Chung <nas.chung@chipsnmedia.com>
media: chips-media: wave6: Add v4l2 m2m driver support
This commit adds V4L2 M2M stateful driver support for the Chips&Media
Wave6 video codec IP, covering both encoder and decoder. It includes
multiplanar format support, scaling, and extensive V4L2 encoder controls
for AVC and HEVC formats.
> diff --git a/drivers/media/platform/chips-media/wave6/wave6-vpu-dec.c b/drivers/media/platform/chips-media/wave6/wave6-vpu-dec.c
> index 0000000000000..9545264c19b7f 100644
> --- a/drivers/media/platform/chips-media/wave6/wave6-vpu-dec.c
> +++ b/drivers/media/platform/chips-media/wave6/wave6-vpu-dec.c
[ ... ]
> @@ -152,6 +152,8 @@ static void wave6_vpu_dec_destroy_instance(struct vpu_instance *inst, bool remov
> wave6_vpu_dec_release_fb(inst);
> wave6_vpu_set_instance_state(inst, VPU_INST_STATE_NONE);
>
> if (!pm_runtime_suspended(inst->dev->dev))
> pm_runtime_put_sync(inst->dev->dev);
> }
[Severity: Medium]
Will this skip dropping the PM usage counter if the device is runtime suspended?
If the device enters runtime suspend (e.g., via system suspend) and
pm_runtime_suspended() returns true, pm_runtime_put_sync() is skipped. This
seems to leave the usage counter elevated, which could permanently prevent the
VPU from entering runtime suspend again.
[ ... ]
> @@ -219,6 +219,8 @@ static void wave6_vpu_dec_handle_dst_buffer(struct vpu_instance *inst)
> int consumed_num = wave6_vpu_get_consumed_fb_num(inst);
> int ret;
>
> v4l2_m2m_for_each_dst_buf(inst->v4l2_fh.m2m_ctx, v4l2_m2m_buf) {
> dst_buf = &v4l2_m2m_buf->vb;
> vpu_buf = wave6_to_vpu_buf(dst_buf);
[Severity: High]
Does traversing the V4L2 M2M ready queue here require holding the rdy_spinlock?
The v4l2_m2m_for_each_dst_buf() macro iterates over the rdy_queue, which can be
modified concurrently by VIDIOC_QBUF in process context. Traversing this queue
without the rdy_spinlock might result in following stale list pointers and
causing linked list corruption.
[ ... ]
> @@ -598,6 +598,8 @@ static int wave6_vpu_dec_start_decode(struct vpu_instance *inst)
> dev_err(inst->dev->dev, "[%d] %s: fail %d\n", inst->id, __func__, ret);
> wave6_vpu_set_instance_state(inst, VPU_INST_STATE_STOP);
>
> src_buf = v4l2_m2m_src_buf_remove(inst->v4l2_fh.m2m_ctx);
> if (src_buf) {
> v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_ERROR);
[Severity: High]
If start_one_frame fails, could removing the first buffer from the source queue
complete the wrong buffer?
Since the VPU driver allows multiple source buffers to be in flight (such as
for B-frame reordering), the first buffer in the M2M rdy_queue might be
actively processing. Unconditionally popping it via v4l2_m2m_src_buf_remove()
and completing it with VB2_BUF_STATE_ERROR would return an active buffer to
userspace. If userspace then frees or unmaps it, this could lead to a
Use-After-Free or DMA memory corruption.
[ ... ]
> @@ -682,6 +682,8 @@ static int wave6_vpu_dec_s_fmt_cap(struct file *file, void *fh, struct v4l2_form
> __func__, pix_mp->pixelformat, pix_mp->width, pix_mp->height,
> pix_mp->num_planes, pix_mp->colorspace);
>
> ret = wave6_vpu_dec_try_fmt_cap(file, fh, f);
> if (ret)
> return ret;
>
> vpu_fmt = wave6_find_vpu_fmt(pix_mp->pixelformat, VPU_FMT_TYPE_RAW);
[Severity: High]
Should there be a vb2_is_busy() check here to prevent userspace from changing
the format mid-stream?
Without this check, the format and inst->dst_fmt properties can be modified
during streaming without reallocating buffers. This could cause the driver to
calculate out-of-bounds DMA addresses and corrupt memory.
[ ... ]
> @@ -766,6 +766,8 @@ static int wave6_vpu_dec_s_fmt_out(struct file *file, void *fh, struct v4l2_form
> __func__, pix_mp->pixelformat, pix_mp->width, pix_mp->height,
> pix_mp->num_planes, pix_mp->colorspace);
>
> ret = wave6_vpu_dec_try_fmt_out(file, fh, f);
> if (ret)
> return ret;
>
> pix_mp->colorspace = in_pix_mp.colorspace;
[Severity: High]
Does this also need a vb2_is_busy() check to prevent mid-stream format changes
on the output queue?
Altering inst->src_fmt while streaming could lead to similar out-of-bounds
DMA operations.
[ ... ]
> @@ -954,6 +954,8 @@ static int wave6_vpu_dec_s_selection(struct file *file, void *fh, struct v4l2_se
> scale_height = round_down(scale_height, step);
> }
>
> if (scale_width < inst->src_fmt.width ||
> scale_height < inst->src_fmt.height)
> inst->scaler_info.enable = true;
>
> if (inst->scaler_info.enable) {
[Severity: Medium]
If a subsequent call requests a 1:1 scale, does inst->scaler_info.enable ever
get reset to false?
It appears the scaler is permanently enabled once requested. If userspace
reverts to a 1:1 scale later, the hardware scaler remains active, which might
introduce unnecessary overhead or logic errors.
[ ... ]
> @@ -1323,6 +1323,8 @@ static int wave6_vpu_dec_prepare_fb(struct vpu_instance *inst)
> luma_size = fb_stride * fb_height;
> chroma_size = ALIGN(fb_stride / 2, W6_FBC_BUF_ALIGNMENT) * fb_height;
>
> for (i = 0; i < fb_num; i++) {
> struct frame_buffer *frame = &inst->frame_buf[i];
> struct vpu_buf *vframe = &inst->frame_vbuf[i];
>
> vframe->size = luma_size + chroma_size;
[Severity: High]
Can fb_num (derived from the hardware register W6_RET_DEC_MIN_FB_CNT) exceed
the WAVE6_MAX_FBS array bounds?
The values min_frame_buffer_count and req_mv_buffer_count are read directly
from hardware in wave6_get_dec_seq_result() without validation. If the
hardware or firmware returns a value larger than the fixed-size array bounds,
this loop could cause out-of-bounds writes to the inst->frame_buf and
inst->frame_vbuf arrays.
[ ... ]
> @@ -1803,6 +1803,8 @@ int wave6_vpu_dec_register_device(struct vpu_core_device *core)
> struct video_device *vdev_dec;
> int ret;
>
> vdev_dec = devm_kzalloc(core->v4l2_dev.dev, sizeof(*vdev_dec), GFP_KERNEL);
> if (!vdev_dec)
> return -ENOMEM;
>
> core->video_dev_dec = vdev_dec;
[Severity: High]
Is it safe to allocate the video_device with devm_kzalloc() here?
The V4L2 video_device structure must be freed dynamically in its release
callback (currently assigned video_device_release_empty) when the last file
descriptor is closed. If the device is unbound via sysfs while userspace holds
an open file descriptor, devm_kzalloc will free the memory, but the V4L2 core
will still hold pointers to it. This could result in a Use-After-Free during
subsequent ioctl or close operations.
> diff --git a/drivers/media/platform/chips-media/wave6/wave6-vpu-enc.c b/drivers/media/platform/chips-media/wave6/wave6-vpu-enc.c
> index 0000000000000..825229f6fb051 100644
> --- a/drivers/media/platform/chips-media/wave6/wave6-vpu-enc.c
> +++ b/drivers/media/platform/chips-media/wave6/wave6-vpu-enc.c
[ ... ]
> @@ -422,6 +422,8 @@ static int wave6_allocate_aux_buffer(struct vpu_instance *inst,
> return ret;
> }
>
> for (i = 0; i < num; i++) {
> inst->aux_vbuf[type][i].size = size;
> ret = wave6_vdi_alloc_dma(inst->dev->dev, &inst->aux_vbuf[type][i]);
> if (ret) {
[Severity: High]
Is the num parameter validated against the array size before this loop?
Unlike the decoder's allocate function which clamps num to WAVE6_MAX_FBS,
the encoder seems to use the unvalidated num (derived from hardware registers)
directly. This could lead to out-of-bounds writes into the aux_vbuf array if
the hardware returns an unexpectedly large value.
[ ... ]
> @@ -783,6 +783,8 @@ static int wave6_vpu_enc_s_selection(struct file *file, void *fh, struct v4l2_se
> s->r.width = clamp(s->r.width, W6_MIN_ENC_PIC_WIDTH, max_crop_w);
> s->r.height = clamp(s->r.height, W6_MIN_ENC_PIC_HEIGHT, max_crop_h);
>
> wave6_update_pix_fmt(&inst->dst_fmt, s->r.width, s->r.height);
> wave6_update_crop_info(inst, s->r.left, s->r.top, s->r.width, s->r.height);
>
> dev_dbg(inst->dev->dev, "V4L2_SEL_TGT_CROP %dx%dx%dx%d\n",
[Severity: High]
Should there be a vb2_is_busy() check before updating the crop dimensions?
Without checking if the queues are busy, userspace can change the crop
dimensions during streaming. If a larger crop selection is set,
inst->codec_rect is updated, and the hardware will be programmed with larger
dimensions than the previously allocated internal FBC backing buffers. This
could result in hardware DMA memory corruption.
> diff --git a/drivers/media/platform/chips-media/wave6/wave6-vpu-v4l2.c b/drivers/media/platform/chips-media/wave6/wave6-vpu-v4l2.c
> index 0000000000000..a9a168b8bf269 100644
> --- a/drivers/media/platform/chips-media/wave6/wave6-vpu-v4l2.c
> +++ b/drivers/media/platform/chips-media/wave6/wave6-vpu-v4l2.c
[ ... ]
> @@ -72,6 +72,8 @@ struct vb2_v4l2_buffer *wave6_get_dst_buf_by_addr(struct vpu_instance *inst,
> struct v4l2_m2m_buffer *v4l2_m2m_buf;
> struct vb2_v4l2_buffer *dst_buf = NULL;
>
> v4l2_m2m_for_each_dst_buf(inst->v4l2_fh.m2m_ctx, v4l2_m2m_buf) {
> vb2_v4l2_buf = &v4l2_m2m_buf->vb;
> if (addr == wave6_get_dma_addr(vb2_v4l2_buf, 0)) {
> dst_buf = vb2_v4l2_buf;
[Severity: High]
Is it safe to iterate over the M2M destination queue without holding the
rdy_spinlock?
The rdy_queue is shared and can be concurrently modified by VIDIOC_QBUF.
Traversing it unprotected could result in following stale list pointers.
[ ... ]
> @@ -312,6 +312,8 @@ int wave6_vpu_wait_interrupt(struct vpu_instance *inst, unsigned int timeout)
>
> ret = wait_for_completion_timeout(&inst->irq_done,
> msecs_to_jiffies(timeout));
> if (!ret)
> return -ETIMEDOUT;
>
> reinit_completion(&inst->irq_done);
[Severity: High]
Does exiting early on a timeout leave a stray completion signal?
If the timeout expires and the driver returns -ETIMEDOUT without resetting
the hardware or calling reinit_completion(), the VPU might finish the command
later and signal the completion. This stray completion would cause a subsequent
wait_for_completion_timeout() to return immediately, potentially reading result
registers while the hardware is still executing a new command.
[ ... ]
> @@ -419,6 +419,8 @@ static void wave6_vpu_device_run_timeout(struct work_struct *work)
> vb2_queue_error(v4l2_m2m_get_src_vq(inst->v4l2_fh.m2m_ctx));
> vb2_queue_error(v4l2_m2m_get_dst_vq(inst->v4l2_fh.m2m_ctx));
>
> v4l2_m2m_job_finish(inst->dev->m2m_dev, inst->v4l2_fh.m2m_ctx);
> }
[Severity: High]
Should the hardware be reset or aborted before finishing the M2M job?
If v4l2_m2m_job_finish() is called while the VPU is actively executing a
timed-out command, an eventual hardware interrupt might corrupt the next job
or trigger a Use-After-Free if the instance has been freed.
[ ... ]
> @@ -442,6 +442,8 @@ static void wave6_vpu_device_run(void *priv)
>
> void wave6_vpu_finish_job(struct vpu_instance *inst)
> {
> cancel_delayed_work(&inst->dev->task_timer);
> v4l2_m2m_job_finish(inst->dev->m2m_dev, inst->v4l2_fh.m2m_ctx);
> }
[Severity: High]
Can cancel_delayed_work() leave the timeout handler running concurrently?
Using cancel_delayed_work() instead of cancel_delayed_work_sync() allows the
timeout handler to potentially race with instance destruction or subsequent
jobs.
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260624072043.238-1-nas.chung@chipsnmedia.com?part=4
next prev parent reply other threads:[~2026-06-24 7:40 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-24 7:20 [PATCH v6 0/9] Add support for Wave6 video codec driver Nas Chung
2026-06-24 7:20 ` [PATCH v6 1/9] media: v4l2-common: Fix P010 format info Nas Chung
2026-06-24 7:20 ` [PATCH v6 2/9] dt-bindings: media: nxp: Add Wave6 video codec device Nas Chung
2026-06-24 7:20 ` [PATCH v6 3/9] media: chips-media: wave6: Add Wave6 VPU interface Nas Chung
2026-06-24 7:43 ` sashiko-bot
2026-06-24 7:20 ` [PATCH v6 4/9] media: chips-media: wave6: Add v4l2 m2m driver support Nas Chung
2026-06-24 7:40 ` sashiko-bot [this message]
2026-06-24 7:20 ` [PATCH v6 5/9] media: chips-media: wave6: Add Wave6 core driver Nas Chung
2026-06-24 7:38 ` sashiko-bot
2026-06-24 7:20 ` [PATCH v6 6/9] media: chips-media: wave6: Improve debugging capabilities Nas Chung
2026-06-24 7:36 ` sashiko-bot
2026-06-24 7:20 ` [PATCH v6 7/9] media: chips-media: wave6: Add Wave6 thermal cooling device Nas Chung
2026-06-24 7:36 ` sashiko-bot
2026-06-24 7:20 ` [PATCH v6 8/9] media: chips-media: wave6: Add Wave6 control driver Nas Chung
2026-06-24 7:41 ` sashiko-bot
2026-06-24 7:20 ` [PATCH v6 9/9] arm64: dts: freescale: imx95: Add video codec node Nas Chung
2026-06-24 11:50 ` Francesco Dolcini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260624074031.617F21F000E9@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=conor+dt@kernel.org \
--cc=devicetree@vger.kernel.org \
--cc=nas.chung@chipsnmedia.com \
--cc=robh@kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox