public inbox for linux-media@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] media: chips-media: wave5: Move src_buf Removal to finish_encode
@ 2026-03-13 16:54 Brandon Brnich
  2026-03-19  1:38 ` jackson.lee
  0 siblings, 1 reply; 2+ messages in thread
From: Brandon Brnich @ 2026-03-13 16:54 UTC (permalink / raw)
  To: linux-media, linux-kernel
  Cc: detheridge, mchehab, nas.chung, jackson.lee, nicolas.dufresne,
	Brandon Brnich

During encoder processing, there is a case where the IRQ response could
return the buffer back to userspace via v4l2_m2m_buf_done call. In this
time, userspace could queue up this same buffer before start_encode removes
the index from the ready queue. This would then lead to a case where the
buffer in the ready queue could be a self loop due to the
WRITE_ONCE(prev->next, new) call in __list_add.

When __list_del is finally called, the loop is already made so nothing
points back to ready queue list head and pointers are poisoned.

A buffer should not be marked as DONE before the buffer is removed from
m2m ready queue. Move removal entirely to finish_encode.

Signed-off-by: Brandon Brnich <b-brnich@ti.com>
---

This bug is very hard to reproduce in simple encode environments. It
primarily occurs during long run cases where CPU is strained doing other
forms of computation. Crash log shared below.

I see other drivers removing buffer in their device_run process, but
they do it before any chance of DONE state transition. I can move this
removal to there as well if that is the correct location, but I can't
find anywhere saying this is required.

Kernel tested on: 7.0.0-rc3-00167-g66dbdc5b5d2d
Gstreamer version: 1.26.9

[ 609.879961] pc : v4l2_m2m_buf_remove_by_idx+0x84/0xe8 [v4l2_mem2mem]
[ 609.886313] lr : v4l2_m2m_buf_remove_by_idx+0x28/0xe8 [v4l2_mem2mem]
[ 609.892663] sp : ffff800081a4bbd0
[ 609.895968] x29: ffff800081a4bbd0 x28: 0000000000000000 x27: 000000000007f800
[ 609.903096] x26: 00000000aefd2800 x25: 0000000000000780 x24: ffff0000014efdc8
[ 609.910224] x23: ffff0000014efc28 x22: ffff0000014efc00 x21: ffff0000014eff60
[ 609.917351] x20: ffff0000014efdc8 x19: ffff000001daf800 x18: 0000000000000000
[ 609.924478] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000002
[ 609.931605] x14: 0000000000000800 x13: 00000000000001e6 x12: 0000000000000000
[ 609.938732] x11: 0000000000000000 x10: 00000000000009d0 x9 : ffff800081a4bcc0
[ 609.945859] x8 : ffff800081a4bc88 x7 : 0000000000000000 x6 : ffff0000014eff50
[ 609.952986] x5 : dead000000000100 x4 : dead000000000100 x3 : dead000000000122
[ 609.960113] x2 : 0000000000000100 x1 : 0000000000000000 x0 : ffff000001dafc00
[ 609.967242] Call trace:
[ 609.969678] v4l2_m2m_buf_remove_by_idx+0x84/0xe8 [v4l2_mem2mem]
[ 609.975685] start_encode+0x28c/0x554 [wave5]
[ 609.980063] wave5_vpu_enc_device_run+0x10c/0x230 [wave5]
[ 609.985466] v4l2_m2m_try_run+0x84/0x140 [v4l2_mem2mem]
[ 609.990692] v4l2_m2m_device_run_work+0x14/0x20 [v4l2_mem2mem]
[ 609.996522] process_one_work+0x148/0x28c
[ 610.000532] worker_thread+0x2d0/0x3d8
[ 610.004277] kthread+0x110/0x114
[ 610.007500] ret_from_fork+0x10/0x20

 .../chips-media/wave5/wave5-vpu-enc.c         | 32 +++----------------
 1 file changed, 5 insertions(+), 27 deletions(-)

diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c b/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c
index 7613fcdbafed..3e198a7cefb1 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c
+++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c
@@ -223,17 +223,9 @@ static int start_encode(struct vpu_instance *inst, u32 *fail_res)
 		dst_buf->vb2_buf.timestamp = src_buf->vb2_buf.timestamp;
 		v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_ERROR);
 		v4l2_m2m_buf_done(dst_buf, VB2_BUF_STATE_ERROR);
-	} else {
+	} else
 		dev_dbg(inst->dev->dev, "%s: wave5_vpu_enc_start_one_frame success\n",
 			__func__);
-		/*
-		 * Remove the source buffer from the ready-queue now and finish
-		 * it in the videobuf2 framework once the index is returned by the
-		 * firmware in finish_encode
-		 */
-		if (src_buf)
-			v4l2_m2m_src_buf_remove_by_idx(m2m_ctx, src_buf->vb2_buf.index);
-	}
 
 	return 0;
 }
@@ -259,27 +251,13 @@ static void wave5_vpu_enc_finish_encode(struct vpu_instance *inst)
 		__func__,  enc_output_info.pic_type, enc_output_info.recon_frame_index,
 		enc_output_info.enc_src_idx, enc_output_info.enc_pic_byte, enc_output_info.pts);
 
-	/*
-	 * The source buffer will not be found in the ready-queue as it has been
-	 * dropped after sending of the encode firmware command, locate it in
-	 * the videobuf2 queue directly
-	 */
 	if (enc_output_info.enc_src_idx >= 0) {
-		struct vb2_buffer *vb = vb2_get_buffer(v4l2_m2m_get_src_vq(m2m_ctx),
-						       enc_output_info.enc_src_idx);
-		if (vb->state != VB2_BUF_STATE_ACTIVE)
-			dev_warn(inst->dev->dev,
-				 "%s: encoded buffer (%d) was not in ready queue %i.",
-				 __func__, enc_output_info.enc_src_idx, vb->state);
-		else
-			src_buf = to_vb2_v4l2_buffer(vb);
-
-		if (src_buf) {
+		src_buf = v4l2_m2m_src_buf_remove(m2m_ctx);
+		if (!src_buf)
+			dev_warn(inst->dev->dev, "%s: no source buffer found\n", __func__);
+		else {
 			inst->timestamp = src_buf->vb2_buf.timestamp;
 			v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_DONE);
-		} else {
-			dev_warn(inst->dev->dev, "%s: no source buffer with index: %d found\n",
-				 __func__, enc_output_info.enc_src_idx);
 		}
 	}
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* RE: [PATCH] media: chips-media: wave5: Move src_buf Removal to finish_encode
  2026-03-13 16:54 [PATCH] media: chips-media: wave5: Move src_buf Removal to finish_encode Brandon Brnich
@ 2026-03-19  1:38 ` jackson.lee
  0 siblings, 0 replies; 2+ messages in thread
From: jackson.lee @ 2026-03-19  1:38 UTC (permalink / raw)
  To: Brandon Brnich, linux-media@vger.kernel.org,
	linux-kernel@vger.kernel.org
  Cc: detheridge@ti.com, mchehab@kernel.org, Nas Chung,
	nicolas.dufresne@collabora.com

Hi Brandon


> -----Original Message-----
> From: Brandon Brnich <b-brnich@ti.com>
> Sent: Saturday, March 14, 2026 1:55 AM
> To: linux-media@vger.kernel.org; linux-kernel@vger.kernel.org
> Cc: detheridge@ti.com; mchehab@kernel.org; Nas Chung
> <nas.chung@chipsnmedia.com>; jackson.lee <jackson.lee@chipsnmedia.com>;
> nicolas.dufresne@collabora.com; Brandon Brnich <b-brnich@ti.com>
> Subject: [PATCH] media: chips-media: wave5: Move src_buf Removal to
> finish_encode
> 
> During encoder processing, there is a case where the IRQ response could
> return the buffer back to userspace via v4l2_m2m_buf_done call. In this
> time, userspace could queue up this same buffer before start_encode
> removes the index from the ready queue. This would then lead to a case
> where the buffer in the ready queue could be a self loop due to the
> WRITE_ONCE(prev->next, new) call in __list_add.
> 
> When __list_del is finally called, the loop is already made so nothing
> points back to ready queue list head and pointers are poisoned.
> 
> A buffer should not be marked as DONE before the buffer is removed from
> m2m ready queue. Move removal entirely to finish_encode.
> 
> Signed-off-by: Brandon Brnich <b-brnich@ti.com>
> ---
> 
> This bug is very hard to reproduce in simple encode environments. It
> primarily occurs during long run cases where CPU is strained doing other
> forms of computation. Crash log shared below.
> 
> I see other drivers removing buffer in their device_run process, but they
> do it before any chance of DONE state transition. I can move this removal
> to there as well if that is the correct location, but I can't find
> anywhere saying this is required.
> 
> Kernel tested on: 7.0.0-rc3-00167-g66dbdc5b5d2d Gstreamer version: 1.26.9
> 
> [ 609.879961] pc : v4l2_m2m_buf_remove_by_idx+0x84/0xe8 [v4l2_mem2mem]
> [ 609.886313] lr : v4l2_m2m_buf_remove_by_idx+0x28/0xe8 [v4l2_mem2mem]
> [ 609.892663] sp : ffff800081a4bbd0 [ 609.895968] x29: ffff800081a4bbd0
> x28: 0000000000000000 x27: 000000000007f800 [ 609.903096] x26:
> 00000000aefd2800 x25: 0000000000000780 x24: ffff0000014efdc8 [ 609.910224]
> x23: ffff0000014efc28 x22: ffff0000014efc00 x21: ffff0000014eff60
> [ 609.917351] x20: ffff0000014efdc8 x19: ffff000001daf800 x18:
> 0000000000000000 [ 609.924478] x17: 0000000000000000 x16: 0000000000000000
> x15: 0000000000000002 [ 609.931605] x14: 0000000000000800 x13:
> 00000000000001e6 x12: 0000000000000000 [ 609.938732] x11: 0000000000000000
> x10: 00000000000009d0 x9 : ffff800081a4bcc0 [ 609.945859] x8 :
> ffff800081a4bc88 x7 : 0000000000000000 x6 : ffff0000014eff50 [ 609.952986]
> x5 : dead000000000100 x4 : dead000000000100 x3 : dead000000000122
> [ 609.960113] x2 : 0000000000000100 x1 : 0000000000000000 x0 :
> ffff000001dafc00 [ 609.967242] Call trace:
> [ 609.969678] v4l2_m2m_buf_remove_by_idx+0x84/0xe8 [v4l2_mem2mem]
> [ 609.975685] start_encode+0x28c/0x554 [wave5] [ 609.980063]
> wave5_vpu_enc_device_run+0x10c/0x230 [wave5] [ 609.985466]
> v4l2_m2m_try_run+0x84/0x140 [v4l2_mem2mem] [ 609.990692]
> v4l2_m2m_device_run_work+0x14/0x20 [v4l2_mem2mem] [ 609.996522]
> process_one_work+0x148/0x28c [ 610.000532] worker_thread+0x2d0/0x3d8
> [ 610.004277] kthread+0x110/0x114 [ 610.007500] ret_from_fork+0x10/0x20
> 
>  .../chips-media/wave5/wave5-vpu-enc.c         | 32 +++----------------
>  1 file changed, 5 insertions(+), 27 deletions(-)
> 
> diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c
> b/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c
> index 7613fcdbafed..3e198a7cefb1 100644
> --- a/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c
> +++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c
> @@ -223,17 +223,9 @@ static int start_encode(struct vpu_instance *inst,
> u32 *fail_res)
>  		dst_buf->vb2_buf.timestamp = src_buf->vb2_buf.timestamp;
>  		v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_ERROR);
>  		v4l2_m2m_buf_done(dst_buf, VB2_BUF_STATE_ERROR);
> -	} else {
> +	} else

The else without braces violates kernel coding style when the if branch uses braces:

>  		dev_dbg(inst->dev->dev, "%s: wave5_vpu_enc_start_one_frame
> success\n",
>  			__func__);
> -		/*
> -		 * Remove the source buffer from the ready-queue now and
> finish
> -		 * it in the videobuf2 framework once the index is returned
> by the
> -		 * firmware in finish_encode
> -		 */
> -		if (src_buf)
> -			v4l2_m2m_src_buf_remove_by_idx(m2m_ctx, src_buf-
> >vb2_buf.index);
> -	}
> 
>  	return 0;
>  }
> @@ -259,27 +251,13 @@ static void wave5_vpu_enc_finish_encode(struct
> vpu_instance *inst)
>  		__func__,  enc_output_info.pic_type,
> enc_output_info.recon_frame_index,
>  		enc_output_info.enc_src_idx, enc_output_info.enc_pic_byte,
> enc_output_info.pts);
> 
> -	/*
> -	 * The source buffer will not be found in the ready-queue as it has
> been
> -	 * dropped after sending of the encode firmware command, locate it
> in
> -	 * the videobuf2 queue directly
> -	 */
>  	if (enc_output_info.enc_src_idx >= 0) {
> -		struct vb2_buffer *vb =
> vb2_get_buffer(v4l2_m2m_get_src_vq(m2m_ctx),
> -						       enc_output_info.enc_src_idx);
> -		if (vb->state != VB2_BUF_STATE_ACTIVE)
> -			dev_warn(inst->dev->dev,
> -				 "%s: encoded buffer (%d) was not in ready
> queue %i.",
> -				 __func__, enc_output_info.enc_src_idx, vb-
> >state);
> -		else
> -			src_buf = to_vb2_v4l2_buffer(vb);
> -
> -		if (src_buf) {
> +		src_buf = v4l2_m2m_src_buf_remove(m2m_ctx);


v4l2_m2m_src_buf_remove() is too weak

The new finish_encode uses v4l2_m2m_src_buf_remove() which blindly removes the head of the ready queue.
It ignores enc_output_info.enc_src_idx entirely.
This works under the assumption that the m2m framework is strictly FIFO (which it is for single-job-at-a-time), but a safer approach would be:

src_buf = v4l2_m2m_src_buf_remove_by_idx(m2m_ctx, enc_output_info.enc_src_idx);


thanks
Jackson

> +		if (!src_buf)
> +			dev_warn(inst->dev->dev, "%s: no source buffer
> found\n", __func__);
> +		else {
>  			inst->timestamp = src_buf->vb2_buf.timestamp;
>  			v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_DONE);
> -		} else {
> -			dev_warn(inst->dev->dev, "%s: no source buffer with
> index: %d found\n",
> -				 __func__, enc_output_info.enc_src_idx);
>  		}
>  	}
> 
> --
> 2.43.0


^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2026-03-19  1:38 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-13 16:54 [PATCH] media: chips-media: wave5: Move src_buf Removal to finish_encode Brandon Brnich
2026-03-19  1:38 ` jackson.lee

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox