Re: [PATCH v6 4/9] media: chips-media: wave6: Add v4l2 m2m driver support

All of lore.kernel.org
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Nas Chung" <nas.chung@chipsnmedia.com>
Cc: devicetree@vger.kernel.org, robh@kernel.org, conor+dt@kernel.org
Subject: Re: [PATCH v6 4/9] media: chips-media: wave6: Add v4l2 m2m driver support
Date: Wed, 24 Jun 2026 07:40:30 +0000	[thread overview]
Message-ID: <20260624074031.617F21F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260624072043.238-5-nas.chung@chipsnmedia.com>

Thank you for your contribution! Sashiko AI review found 10 potential issue(s) to consider:
- [High] On encoder/decoder start failure, the driver removes the wrong source buffer from the M2M queue, leading to premature buffer completion and Use-After-Free.
- [High] The decoder `VIDIOC_S_FMT` handlers lack `vb2_is_busy()` checks, allowing userspace to dynamically change the pixel format during streaming.
- [High] Missing `vb2_is_busy()` check in encoder's `s_selection` allows changing crop dimensions during streaming, leading to hardware DMA out-of-bounds writes.
- [High] The `video_device` structure is allocated with `devm_kzalloc`, leading to a Use-After-Free on driver unbind.
- [High] A race condition in the global `task_timer` timeout handler leads to Use-After-Free and M2M state corruption.
- [High] Unprotected traversal of `v4l2_m2m` ready queues causes linked list corruption.
- [High] Out-of-bounds array reads and writes due to unvalidated frame buffer and motion vector counts returned directly from hardware registers.
- [High] Missing hardware reset and completion reinitialization on VPU timeout leaves a stray completion that corrupts future hardware syncs.
- [Medium] Conditionally dropping the PM usage counter based on `pm_runtime_suspended()` leaks the usage counter.
- [Medium] Decoder permanently enables the hardware scaler once scaling is requested.
--

commit 85193b153b6b5079480d70d95e569973a1a2732e
Author: Nas Chung <nas.chung@chipsnmedia.com>

media: chips-media: wave6: Add v4l2 m2m driver support

This commit adds V4L2 M2M stateful driver support for the Chips&Media
Wave6 video codec IP, covering both encoder and decoder. It includes
multiplanar format support, scaling, and extensive V4L2 encoder controls
for AVC and HEVC formats.

> diff --git a/drivers/media/platform/chips-media/wave6/wave6-vpu-dec.c b/drivers/media/platform/chips-media/wave6/wave6-vpu-dec.c
> index 0000000000000..9545264c19b7f 100644
> --- a/drivers/media/platform/chips-media/wave6/wave6-vpu-dec.c
> +++ b/drivers/media/platform/chips-media/wave6/wave6-vpu-dec.c

[ ... ]

> @@ -152,6 +152,8 @@ static void wave6_vpu_dec_destroy_instance(struct vpu_instance *inst, bool remov
>  	wave6_vpu_dec_release_fb(inst);
>  	wave6_vpu_set_instance_state(inst, VPU_INST_STATE_NONE);
>  
>  	if (!pm_runtime_suspended(inst->dev->dev))
>  		pm_runtime_put_sync(inst->dev->dev);
>  }

[Severity: Medium]
Will this skip dropping the PM usage counter if the device is runtime suspended?

If the device enters runtime suspend (e.g., via system suspend) and
pm_runtime_suspended() returns true, pm_runtime_put_sync() is skipped. This
seems to leave the usage counter elevated, which could permanently prevent the
VPU from entering runtime suspend again.

[ ... ]

> @@ -219,6 +219,8 @@ static void wave6_vpu_dec_handle_dst_buffer(struct vpu_instance *inst)
>  	int consumed_num = wave6_vpu_get_consumed_fb_num(inst);
>  	int ret;
>  
>  	v4l2_m2m_for_each_dst_buf(inst->v4l2_fh.m2m_ctx, v4l2_m2m_buf) {
>  		dst_buf = &v4l2_m2m_buf->vb;
>  		vpu_buf = wave6_to_vpu_buf(dst_buf);

[Severity: High]
Does traversing the V4L2 M2M ready queue here require holding the rdy_spinlock?

The v4l2_m2m_for_each_dst_buf() macro iterates over the rdy_queue, which can be
modified concurrently by VIDIOC_QBUF in process context. Traversing this queue
without the rdy_spinlock might result in following stale list pointers and
causing linked list corruption.

[ ... ]

> @@ -598,6 +598,8 @@ static int wave6_vpu_dec_start_decode(struct vpu_instance *inst)
>  		dev_err(inst->dev->dev, "[%d] %s: fail %d\n", inst->id, __func__, ret);
>  		wave6_vpu_set_instance_state(inst, VPU_INST_STATE_STOP);
>  
>  		src_buf = v4l2_m2m_src_buf_remove(inst->v4l2_fh.m2m_ctx);
>  		if (src_buf) {
>  			v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_ERROR);

[Severity: High]
If start_one_frame fails, could removing the first buffer from the source queue
complete the wrong buffer?

Since the VPU driver allows multiple source buffers to be in flight (such as
for B-frame reordering), the first buffer in the M2M rdy_queue might be
actively processing. Unconditionally popping it via v4l2_m2m_src_buf_remove()
and completing it with VB2_BUF_STATE_ERROR would return an active buffer to
userspace. If userspace then frees or unmaps it, this could lead to a
Use-After-Free or DMA memory corruption.

[ ... ]

> @@ -682,6 +682,8 @@ static int wave6_vpu_dec_s_fmt_cap(struct file *file, void *fh, struct v4l2_form
>  		__func__, pix_mp->pixelformat, pix_mp->width, pix_mp->height,
>  		pix_mp->num_planes, pix_mp->colorspace);
>  
>  	ret = wave6_vpu_dec_try_fmt_cap(file, fh, f);
>  	if (ret)
>  		return ret;
>  
>  	vpu_fmt = wave6_find_vpu_fmt(pix_mp->pixelformat, VPU_FMT_TYPE_RAW);

[Severity: High]
Should there be a vb2_is_busy() check here to prevent userspace from changing
the format mid-stream?

Without this check, the format and inst->dst_fmt properties can be modified
during streaming without reallocating buffers. This could cause the driver to
calculate out-of-bounds DMA addresses and corrupt memory.

[ ... ]

> @@ -766,6 +766,8 @@ static int wave6_vpu_dec_s_fmt_out(struct file *file, void *fh, struct v4l2_form
>  		__func__, pix_mp->pixelformat, pix_mp->width, pix_mp->height,
>  		pix_mp->num_planes, pix_mp->colorspace);
>  
>  	ret = wave6_vpu_dec_try_fmt_out(file, fh, f);
>  	if (ret)
>  		return ret;
>  
>  	pix_mp->colorspace = in_pix_mp.colorspace;

[Severity: High]
Does this also need a vb2_is_busy() check to prevent mid-stream format changes
on the output queue?

Altering inst->src_fmt while streaming could lead to similar out-of-bounds
DMA operations.

[ ... ]

> @@ -954,6 +954,8 @@ static int wave6_vpu_dec_s_selection(struct file *file, void *fh, struct v4l2_se
>  		scale_height = round_down(scale_height, step);
>  	}
>  
>  	if (scale_width < inst->src_fmt.width ||
>  	    scale_height < inst->src_fmt.height)
>  		inst->scaler_info.enable = true;
>  
>  	if (inst->scaler_info.enable) {

[Severity: Medium]
If a subsequent call requests a 1:1 scale, does inst->scaler_info.enable ever
get reset to false?

It appears the scaler is permanently enabled once requested. If userspace
reverts to a 1:1 scale later, the hardware scaler remains active, which might
introduce unnecessary overhead or logic errors.

[ ... ]

> @@ -1323,6 +1323,8 @@ static int wave6_vpu_dec_prepare_fb(struct vpu_instance *inst)
>  	luma_size = fb_stride * fb_height;
>  	chroma_size = ALIGN(fb_stride / 2, W6_FBC_BUF_ALIGNMENT) * fb_height;
>  
>  	for (i = 0; i < fb_num; i++) {
>  		struct frame_buffer *frame = &inst->frame_buf[i];
>  		struct vpu_buf *vframe = &inst->frame_vbuf[i];
>  
>  		vframe->size = luma_size + chroma_size;

[Severity: High]
Can fb_num (derived from the hardware register W6_RET_DEC_MIN_FB_CNT) exceed
the WAVE6_MAX_FBS array bounds?

The values min_frame_buffer_count and req_mv_buffer_count are read directly
from hardware in wave6_get_dec_seq_result() without validation. If the
hardware or firmware returns a value larger than the fixed-size array bounds,
this loop could cause out-of-bounds writes to the inst->frame_buf and
inst->frame_vbuf arrays.

[ ... ]

> @@ -1803,6 +1803,8 @@ int wave6_vpu_dec_register_device(struct vpu_core_device *core)
>  	struct video_device *vdev_dec;
>  	int ret;
>  
>  	vdev_dec = devm_kzalloc(core->v4l2_dev.dev, sizeof(*vdev_dec), GFP_KERNEL);
>  	if (!vdev_dec)
>  		return -ENOMEM;
>  
>  	core->video_dev_dec = vdev_dec;

[Severity: High]
Is it safe to allocate the video_device with devm_kzalloc() here?

The V4L2 video_device structure must be freed dynamically in its release
callback (currently assigned video_device_release_empty) when the last file
descriptor is closed. If the device is unbound via sysfs while userspace holds
an open file descriptor, devm_kzalloc will free the memory, but the V4L2 core
will still hold pointers to it. This could result in a Use-After-Free during
subsequent ioctl or close operations.

> diff --git a/drivers/media/platform/chips-media/wave6/wave6-vpu-enc.c b/drivers/media/platform/chips-media/wave6/wave6-vpu-enc.c
> index 0000000000000..825229f6fb051 100644
> --- a/drivers/media/platform/chips-media/wave6/wave6-vpu-enc.c
> +++ b/drivers/media/platform/chips-media/wave6/wave6-vpu-enc.c

[ ... ]

> @@ -422,6 +422,8 @@ static int wave6_allocate_aux_buffer(struct vpu_instance *inst,
>  		return ret;
>  	}
>  
>  	for (i = 0; i < num; i++) {
>  		inst->aux_vbuf[type][i].size = size;
>  		ret = wave6_vdi_alloc_dma(inst->dev->dev, &inst->aux_vbuf[type][i]);
>  		if (ret) {

[Severity: High]
Is the num parameter validated against the array size before this loop?

Unlike the decoder's allocate function which clamps num to WAVE6_MAX_FBS,
the encoder seems to use the unvalidated num (derived from hardware registers)
directly. This could lead to out-of-bounds writes into the aux_vbuf array if
the hardware returns an unexpectedly large value.

[ ... ]

> @@ -783,6 +783,8 @@ static int wave6_vpu_enc_s_selection(struct file *file, void *fh, struct v4l2_se
>  	s->r.width = clamp(s->r.width, W6_MIN_ENC_PIC_WIDTH, max_crop_w);
>  	s->r.height = clamp(s->r.height, W6_MIN_ENC_PIC_HEIGHT, max_crop_h);
>  
>  	wave6_update_pix_fmt(&inst->dst_fmt, s->r.width, s->r.height);
>  	wave6_update_crop_info(inst, s->r.left, s->r.top, s->r.width, s->r.height);
>  
>  	dev_dbg(inst->dev->dev, "V4L2_SEL_TGT_CROP %dx%dx%dx%d\n",

[Severity: High]
Should there be a vb2_is_busy() check before updating the crop dimensions?

Without checking if the queues are busy, userspace can change the crop
dimensions during streaming. If a larger crop selection is set,
inst->codec_rect is updated, and the hardware will be programmed with larger
dimensions than the previously allocated internal FBC backing buffers. This
could result in hardware DMA memory corruption.

> diff --git a/drivers/media/platform/chips-media/wave6/wave6-vpu-v4l2.c b/drivers/media/platform/chips-media/wave6/wave6-vpu-v4l2.c
> index 0000000000000..a9a168b8bf269 100644
> --- a/drivers/media/platform/chips-media/wave6/wave6-vpu-v4l2.c
> +++ b/drivers/media/platform/chips-media/wave6/wave6-vpu-v4l2.c

[ ... ]

> @@ -72,6 +72,8 @@ struct vb2_v4l2_buffer *wave6_get_dst_buf_by_addr(struct vpu_instance *inst,
>  	struct v4l2_m2m_buffer *v4l2_m2m_buf;
>  	struct vb2_v4l2_buffer *dst_buf = NULL;
>  
>  	v4l2_m2m_for_each_dst_buf(inst->v4l2_fh.m2m_ctx, v4l2_m2m_buf) {
>  		vb2_v4l2_buf = &v4l2_m2m_buf->vb;
>  		if (addr == wave6_get_dma_addr(vb2_v4l2_buf, 0)) {
>  			dst_buf = vb2_v4l2_buf;

[Severity: High]
Is it safe to iterate over the M2M destination queue without holding the
rdy_spinlock?

The rdy_queue is shared and can be concurrently modified by VIDIOC_QBUF.
Traversing it unprotected could result in following stale list pointers.

[ ... ]

> @@ -312,6 +312,8 @@ int wave6_vpu_wait_interrupt(struct vpu_instance *inst, unsigned int timeout)
>  
>  	ret = wait_for_completion_timeout(&inst->irq_done,
>  					  msecs_to_jiffies(timeout));
>  	if (!ret)
>  		return -ETIMEDOUT;
>  
>  	reinit_completion(&inst->irq_done);

[Severity: High]
Does exiting early on a timeout leave a stray completion signal?

If the timeout expires and the driver returns -ETIMEDOUT without resetting
the hardware or calling reinit_completion(), the VPU might finish the command
later and signal the completion. This stray completion would cause a subsequent
wait_for_completion_timeout() to return immediately, potentially reading result
registers while the hardware is still executing a new command.

[ ... ]

> @@ -419,6 +419,8 @@ static void wave6_vpu_device_run_timeout(struct work_struct *work)
>  	vb2_queue_error(v4l2_m2m_get_src_vq(inst->v4l2_fh.m2m_ctx));
>  	vb2_queue_error(v4l2_m2m_get_dst_vq(inst->v4l2_fh.m2m_ctx));
>  
>  	v4l2_m2m_job_finish(inst->dev->m2m_dev, inst->v4l2_fh.m2m_ctx);
>  }

[Severity: High]
Should the hardware be reset or aborted before finishing the M2M job?

If v4l2_m2m_job_finish() is called while the VPU is actively executing a
timed-out command, an eventual hardware interrupt might corrupt the next job
or trigger a Use-After-Free if the instance has been freed.

[ ... ]

> @@ -442,6 +442,8 @@ static void wave6_vpu_device_run(void *priv)
>  
>  void wave6_vpu_finish_job(struct vpu_instance *inst)
>  {
>  	cancel_delayed_work(&inst->dev->task_timer);
>  	v4l2_m2m_job_finish(inst->dev->m2m_dev, inst->v4l2_fh.m2m_ctx);
>  }

[Severity: High]
Can cancel_delayed_work() leave the timeout handler running concurrently?

Using cancel_delayed_work() instead of cancel_delayed_work_sync() allows the
timeout handler to potentially race with instance destruction or subsequent
jobs.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260624072043.238-1-nas.chung@chipsnmedia.com?part=4

next prev parent reply	other threads:[~2026-06-24  7:40 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-24  7:20 [PATCH v6 0/9] Add support for Wave6 video codec driver Nas Chung
2026-06-24  7:20 ` [PATCH v6 1/9] media: v4l2-common: Fix P010 format info Nas Chung
2026-06-24  7:20 ` [PATCH v6 2/9] dt-bindings: media: nxp: Add Wave6 video codec device Nas Chung
2026-06-24  7:20 ` [PATCH v6 3/9] media: chips-media: wave6: Add Wave6 VPU interface Nas Chung
2026-06-24  7:43   ` sashiko-bot
2026-06-24  7:20 ` [PATCH v6 4/9] media: chips-media: wave6: Add v4l2 m2m driver support Nas Chung
2026-06-24  7:40   ` sashiko-bot [this message]
2026-06-24  7:20 ` [PATCH v6 5/9] media: chips-media: wave6: Add Wave6 core driver Nas Chung
2026-06-24  7:38   ` sashiko-bot
2026-06-24  7:20 ` [PATCH v6 6/9] media: chips-media: wave6: Improve debugging capabilities Nas Chung
2026-06-24  7:36   ` sashiko-bot
2026-06-24  7:20 ` [PATCH v6 7/9] media: chips-media: wave6: Add Wave6 thermal cooling device Nas Chung
2026-06-24  7:36   ` sashiko-bot
2026-06-24  7:20 ` [PATCH v6 8/9] media: chips-media: wave6: Add Wave6 control driver Nas Chung
2026-06-24  7:41   ` sashiko-bot
2026-06-24  7:20 ` [PATCH v6 9/9] arm64: dts: freescale: imx95: Add video codec node Nas Chung
2026-06-24 11:50   ` Francesco Dolcini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260624074031.617F21F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=conor+dt@kernel.org \
    --cc=devicetree@vger.kernel.org \
    --cc=nas.chung@chipsnmedia.com \
    --cc=robh@kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.