public inbox for linux-kernel@vger.kernel.org
* [RESEND PATCH v1 0/7] Performance improvement of decoder
@ 2025-04-10  3:39 Jackson.lee
  2025-04-10  3:39 ` [RESEND PATCH v1 1/7] media: chips-media: wave5: Fix Null reference while testing fluster Jackson.lee
                   ` (8 more replies)
  0 siblings, 9 replies; 18+ messages in thread
From: Jackson.lee @ 2025-04-10  3:39 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, sebastian.fricke, nicolas.dufresne,
	bob.beckett, dafna.hirschfeld
  Cc: linux-media, linux-kernel, jackson.lee, lafley.kim, b-brnich,
	hverkuil, nas.chung

From: Jackson Lee <jackson.lee@chipsnmedia.com>

v4l2-compliance results:
========================

v4l2-compliance 1.28.1-5233, 64 bits, 64-bit time_t

Buffer ioctls:
                warn: v4l2-test-buffers.cpp(693): VIDIOC_CREATE_BUFS not supported
                warn: v4l2-test-buffers.cpp(693): VIDIOC_CREATE_BUFS not supported
        test VIDIOC_REQBUFS/CREATE_BUFS/QUERYBUF: OK
        test CREATE_BUFS maximum buffers: OK
        test VIDIOC_EXPBUF: OK
        test Requests: OK (Not Supported)

Total for wave5-dec device /dev/video0: 46, Succeeded: 46, Failed: 0, Warnings: 2
Total for wave5-enc device /dev/video1: 46, Succeeded: 46, Failed: 0, Warnings: 0

Fluster test results:
=====================

Running test suite JCT-VC-HEVC_V1 with decoder GStreamer-H.265-V4L2-Gst1.0
Using 3 parallel job(s)
Ran 133/147 tests successfully               in 41.629 secs

(1 test fails because parsing multiple frames is not supported, 1 test fails because of a missing frame and slight corruption,
 2 tests fail because their sizes are incompatible with the IP, 11 tests fail because the 10-bit format is unsupported)


Running test suite JVT-AVC_V1 with decoder GStreamer-H.264-V4L2-Gst1.0
Using 3 parallel job(s)
Ran 78/135 tests successfully               in 44.578 secs

(57 tests fail because the hardware is unable to decode MBAFF / FMO / Field / Extended profile streams.)

Seek test
=====================
1. Run gst-play-1.0 seek.264.
2. This uses waylandsink, since gst-play-1.0 uses playbin.
   If you don't want to hook up a display,
   you can run gst-play-1.0 seek.264 --videosink=fakevideosink instead.
3. Let the pipeline run for 2-3 seconds.
4. Press the SPACE key to pause.
5. Press 0 to reset.
6. Press SPACE to start again.

gst-play-1.0 seek.264 --videosink=fakevideosink
Press 'k' to see a list of keyboard shortcuts.
Now playing /root/seek.264
Redistribute latency...
Redistribute latency...
Redistribute latency...
Redistribute latency...
Redistribute latency...aused
0:00:09.9 / 0:00:09.7
Reached end of play list.

Sequence Change test
=====================
gst-launch-1.0 filesrc location=./switch_1080p_720p_240frames.h264 ! h264parse ! v4l2h264dec ! filesink location=./h264_output_420.yuv
Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
Redistribute latency...
Redistribute latency...
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
Redistribute latency...
New clock: GstSystemClock
Got EOS from element "pipeline0".
Execution ended after 0:00:05.772414400
Setting pipeline to NULL ...
Freeing pipeline ...

Changes since v0:
===================
* For [PATCH v1 2/7] media: chips-media: wave5: Improve performance of decoder
 - split the previous patch into several patches

* For [PATCH v1 3/7] media: chips-media: wave5: Fix not to be closed
 - separated from the previous patch of performance improvement of
   decoder

* For [PATCH v1 4/7] media: chips-media: wave5: Use spinlock whenever state is changed
 - separated from the previous patch of performance improvement of
   decoder

* For [PATCH v1 5/7] media: chips-media: wave5: Fix not to free resources normally when
    instance was destroyed
 - separated from the previous patch of performance improvement of
   decoder

* For [PATCH v1 7/7] media: chips-media: wave5: Fix SError of kernel panic when closed
 - separated from the previous patch of performance improvement of
   decoder

Jackson Lee (7):
  media: chips-media: wave5: Fix Null reference while testing fluster
  media: chips-media: wave5: Improve performance of decoder
  media: chips-media: wave5: Fix not to be closed
  media: chips-media: wave5: Use spinlock whenever state is changed
  media: chips-media: wave5: Fix not to free resources normally when
    instance was destroyed
  media: chips-media: wave5: Reduce high CPU load
  media: chips-media: wave5: Fix SError of kernel panic when closed

 .../platform/chips-media/wave5/wave5-helper.c |  10 +-
 .../chips-media/wave5/wave5-vpu-dec.c         | 116 +++++++++++-------
 .../chips-media/wave5/wave5-vpu-enc.c         |   8 +-
 .../platform/chips-media/wave5/wave5-vpu.c    |  70 +++++++++--
 .../platform/chips-media/wave5/wave5-vpuapi.c |  36 +++---
 .../platform/chips-media/wave5/wave5-vpuapi.h |  10 ++
 .../chips-media/wave5/wave5-vpuconfig.h       |   1 +
 7 files changed, 179 insertions(+), 72 deletions(-)

-- 
2.43.0



* [RESEND PATCH v1 1/7] media: chips-media: wave5: Fix Null reference while testing fluster
  2025-04-10  3:39 [RESEND PATCH v1 0/7] Performance improvement of decoder Jackson.lee
@ 2025-04-10  3:39 ` Jackson.lee
  2025-04-10  3:39 ` [RESEND PATCH v1 2/7] media: chips-media: wave5: Improve performance of decoder Jackson.lee
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Jackson.lee @ 2025-04-10  3:39 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, sebastian.fricke, nicolas.dufresne,
	bob.beckett, dafna.hirschfeld
  Cc: linux-media, linux-kernel, jackson.lee, lafley.kim, b-brnich,
	hverkuil, nas.chung

From: Jackson Lee <jackson.lee@chipsnmedia.com>

When multiple instances are created and destroyed, many interrupts fire
while decoder structures are being removed.
struct vpu_instance is shared across the whole decoder flow, so if it is
not protected by a lock, a NULL pointer dereference can occasionally
occur.
Split the IRQ handler into two phases and add locking.
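
For illustration only, the two-phase split can be sketched in user space
as follows. This is a simplified model, not the driver code: struct
status_fifo, struct instance, irq_top_half() and irq_bottom_half() are
hypothetical names, the FIFO is a plain ring buffer standing in for the
per-instance kfifo, and the locking is left out so that only the hand-off
between the two phases is visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_DEPTH 16 /* power of two; room for 16 status words per instance */

/* Hypothetical stand-in for the per-instance FIFO of pending IRQ statuses. */
struct status_fifo {
	uint32_t slot[FIFO_DEPTH];
	unsigned int head; /* free-running write counter */
	unsigned int tail; /* free-running read counter */
};

/* Hypothetical, reduced stand-in for struct vpu_instance. */
struct instance {
	int id;
	struct status_fifo irq_status;
	unsigned int finished; /* how many times the slow path ran */
};

static int fifo_put(struct status_fifo *f, uint32_t v)
{
	if (f->head - f->tail == FIFO_DEPTH)
		return 0; /* full: the status is dropped */
	f->slot[f->head++ % FIFO_DEPTH] = v;
	return 1;
}

static int fifo_get(struct status_fifo *f, uint32_t *v)
{
	assert(f->head - f->tail <= FIFO_DEPTH);
	if (f->head == f->tail)
		return 0; /* empty */
	*v = f->slot[f->tail++ % FIFO_DEPTH];
	return 1;
}

/* Phase 1: only record which instances completed a command. */
static void irq_top_half(struct instance *insts, size_t n,
			 uint32_t cmd_done, uint32_t reason)
{
	for (size_t i = 0; i < n; i++)
		if (cmd_done & (1u << insts[i].id))
			fifo_put(&insts[i].irq_status, reason);
}

/* Phase 2: drain every FIFO and run the slow per-instance work. */
static unsigned int irq_bottom_half(struct instance *insts, size_t n)
{
	unsigned int handled = 0;
	uint32_t status;

	for (size_t i = 0; i < n; i++) {
		while (fifo_get(&insts[i].irq_status, &status)) {
			insts[i].finished++; /* stands in for finish_process() */
			handled++;
		}
	}
	return handled;
}
```

In this model the first phase never runs the slow work itself; it only
queues a status word, and the second phase processes whatever has
accumulated, possibly several entries per instance in one pass.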

Fixes: 9707a6254a8a ("media: chips-media: wave5: Add the v4l2 layer")
Signed-off-by: Jackson Lee <jackson.lee@chipsnmedia.com>
Signed-off-by: Nas Chung <nas.chung@chipsnmedia.com>
---
 .../platform/chips-media/wave5/wave5-helper.c | 10 ++-
 .../chips-media/wave5/wave5-vpu-dec.c         |  5 ++
 .../chips-media/wave5/wave5-vpu-enc.c         |  5 ++
 .../platform/chips-media/wave5/wave5-vpu.c    | 68 ++++++++++++++++---
 .../platform/chips-media/wave5/wave5-vpuapi.h |  6 ++
 5 files changed, 85 insertions(+), 9 deletions(-)

diff --git a/drivers/media/platform/chips-media/wave5/wave5-helper.c b/drivers/media/platform/chips-media/wave5/wave5-helper.c
index 2c9d8cbca6e4..5d9969bb7ada 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-helper.c
+++ b/drivers/media/platform/chips-media/wave5/wave5-helper.c
@@ -49,7 +49,7 @@ void wave5_cleanup_instance(struct vpu_instance *inst)
 		v4l2_fh_del(&inst->v4l2_fh);
 		v4l2_fh_exit(&inst->v4l2_fh);
 	}
-	list_del_init(&inst->list);
+	kfifo_free(&inst->irq_status);
 	ida_free(&inst->dev->inst_ida, inst->id);
 	kfree(inst->codec_info);
 	kfree(inst);
@@ -61,8 +61,16 @@ int wave5_vpu_release_device(struct file *filp,
 {
 	struct vpu_instance *inst = wave5_to_vpu_inst(filp->private_data);
 	int ret = 0;
+	unsigned long flags;
 
 	v4l2_m2m_ctx_release(inst->v4l2_fh.m2m_ctx);
+	ret = mutex_lock_interruptible(&inst->dev->irq_lock);
+	if (ret)
+		return ret;
+	spin_lock_irqsave(&inst->dev->irq_spinlock, flags);
+	list_del_init(&inst->list);
+	spin_unlock_irqrestore(&inst->dev->irq_spinlock, flags);
+	mutex_unlock(&inst->dev->irq_lock);
 	if (inst->state != VPU_INST_STATE_NONE) {
 		u32 fail_res;
 
diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
index fd71f0c43ac3..32de43de1870 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
+++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
@@ -1811,6 +1811,11 @@ static int wave5_vpu_open_dec(struct file *filp)
 	inst->xfer_func = V4L2_XFER_FUNC_DEFAULT;
 
 	init_completion(&inst->irq_done);
+	ret = kfifo_alloc(&inst->irq_status, 16 * sizeof(int), GFP_KERNEL);
+	if (ret) {
+		dev_err(inst->dev->dev, "failed to allocate fifo\n");
+		goto cleanup_inst;
+	}
 
 	inst->id = ida_alloc(&inst->dev->inst_ida, GFP_KERNEL);
 	if (inst->id < 0) {
diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c b/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c
index 1e5fc5f8b856..52a1a00fd9bb 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c
+++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c
@@ -1760,6 +1760,11 @@ static int wave5_vpu_open_enc(struct file *filp)
 	inst->frame_rate = 30;
 
 	init_completion(&inst->irq_done);
+	ret = kfifo_alloc(&inst->irq_status, 16 * sizeof(int), GFP_KERNEL);
+	if (ret) {
+		dev_err(inst->dev->dev, "failed to allocate fifo\n");
+		goto cleanup_inst;
+	}
 
 	inst->id = ida_alloc(&inst->dev->inst_ida, GFP_KERNEL);
 	if (inst->id < 0) {
diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu.c b/drivers/media/platform/chips-media/wave5/wave5-vpu.c
index e1715d3f43b0..c53ac9a5c1c2 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-vpu.c
+++ b/drivers/media/platform/chips-media/wave5/wave5-vpu.c
@@ -51,8 +51,11 @@ static void wave5_vpu_handle_irq(void *dev_id)
 	u32 seq_done;
 	u32 cmd_done;
 	u32 irq_reason;
-	struct vpu_instance *inst;
+	u32 irq_subreason;
+	struct vpu_instance *inst, *tmp;
 	struct vpu_device *dev = dev_id;
+	int val;
+	unsigned long flags;
 
 	irq_reason = wave5_vdi_read_register(dev, W5_VPU_VINT_REASON);
 	seq_done = wave5_vdi_read_register(dev, W5_RET_SEQ_DONE_INSTANCE_INFO);
@@ -60,7 +63,8 @@ static void wave5_vpu_handle_irq(void *dev_id)
 	wave5_vdi_write_register(dev, W5_VPU_VINT_REASON_CLR, irq_reason);
 	wave5_vdi_write_register(dev, W5_VPU_VINT_CLEAR, 0x1);
 
-	list_for_each_entry(inst, &dev->instances, list) {
+	spin_lock_irqsave(&dev->irq_spinlock, flags);
+	list_for_each_entry_safe(inst, tmp, &dev->instances, list) {
 
 		if (irq_reason & BIT(INT_WAVE5_INIT_SEQ) ||
 		    irq_reason & BIT(INT_WAVE5_ENC_SET_PARAM)) {
@@ -82,14 +86,22 @@ static void wave5_vpu_handle_irq(void *dev_id)
 		    irq_reason & BIT(INT_WAVE5_ENC_PIC)) {
 			if (cmd_done & BIT(inst->id)) {
 				cmd_done &= ~BIT(inst->id);
-				wave5_vdi_write_register(dev, W5_RET_QUEUE_CMD_DONE_INST,
-							 cmd_done);
-				inst->ops->finish_process(inst);
+				if (dev->irq >= 0) {
+					irq_subreason =
+						wave5_vdi_read_register(dev, W5_VPU_VINT_REASON);
+					if (!(irq_subreason & BIT(INT_WAVE5_DEC_PIC)))
+						wave5_vdi_write_register(dev,
+									 W5_RET_QUEUE_CMD_DONE_INST,
+									 cmd_done);
+				}
+				val = BIT(INT_WAVE5_DEC_PIC);
+				kfifo_in(&inst->irq_status, &val, sizeof(int));
 			}
 		}
-
-		wave5_vpu_clear_interrupt(inst, irq_reason);
 	}
+	spin_unlock_irqrestore(&dev->irq_spinlock, flags);
+
+	up(&dev->irq_sem);
 }
 
 static irqreturn_t wave5_vpu_irq_thread(int irq, void *dev_id)
@@ -121,6 +133,35 @@ static enum hrtimer_restart wave5_vpu_timer_callback(struct hrtimer *timer)
 	return HRTIMER_RESTART;
 }
 
+static int irq_thread(void *data)
+{
+	struct vpu_device *dev = (struct vpu_device *)data;
+	struct vpu_instance *inst, *tmp;
+	int irq_status, ret;
+
+	while (!kthread_should_stop()) {
+		if (down_interruptible(&dev->irq_sem))
+			continue;
+
+		if (kthread_should_stop())
+			break;
+
+		mutex_lock(&dev->irq_lock);
+		list_for_each_entry_safe(inst, tmp, &dev->instances, list) {
+			while (kfifo_len(&inst->irq_status)) {
+				ret = kfifo_out(&inst->irq_status, &irq_status, sizeof(int));
+				if (!ret)
+					break;
+
+				inst->ops->finish_process(inst);
+			}
+		}
+		mutex_unlock(&dev->irq_lock);
+	}
+
+	return 0;
+}
+
 static int wave5_vpu_load_firmware(struct device *dev, const char *fw_name,
 				   u32 *revision)
 {
@@ -224,6 +265,8 @@ static int wave5_vpu_probe(struct platform_device *pdev)
 
 	mutex_init(&dev->dev_lock);
 	mutex_init(&dev->hw_lock);
+	mutex_init(&dev->irq_lock);
+	spin_lock_init(&dev->irq_spinlock);
 	dev_set_drvdata(&pdev->dev, dev);
 	dev->dev = &pdev->dev;
 
@@ -265,6 +308,9 @@ static int wave5_vpu_probe(struct platform_device *pdev)
 		goto err_clk_dis;
 	}
 	dev->product = wave5_vpu_get_product_id(dev);
+	sema_init(&dev->irq_sem, 1);
+	INIT_LIST_HEAD(&dev->instances);
+	dev->irq_thread = kthread_run(irq_thread, dev, "irq thread");
 
 	dev->irq = platform_get_irq(pdev, 0);
 	if (dev->irq < 0) {
@@ -288,7 +334,6 @@ static int wave5_vpu_probe(struct platform_device *pdev)
 		}
 	}
 
-	INIT_LIST_HEAD(&dev->instances);
 	ret = v4l2_device_register(&pdev->dev, &dev->v4l2_dev);
 	if (ret) {
 		dev_err(&pdev->dev, "v4l2_device_register, fail: %d\n", ret);
@@ -351,6 +396,12 @@ static void wave5_vpu_remove(struct platform_device *pdev)
 {
 	struct vpu_device *dev = dev_get_drvdata(&pdev->dev);
 
+	if (dev->irq_thread) {
+		kthread_stop(dev->irq_thread);
+		up(&dev->irq_sem);
+		dev->irq_thread = NULL;
+	}
+
 	if (dev->irq < 0) {
 		kthread_destroy_worker(dev->worker);
 		hrtimer_cancel(&dev->hrtimer);
@@ -361,6 +412,7 @@ static void wave5_vpu_remove(struct platform_device *pdev)
 
 	mutex_destroy(&dev->dev_lock);
 	mutex_destroy(&dev->hw_lock);
+	mutex_destroy(&dev->irq_lock);
 	reset_control_assert(dev->resets);
 	clk_bulk_disable_unprepare(dev->num_clks, dev->clks);
 	wave5_vpu_enc_unregister_device(dev);
diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h
index 45615c15beca..f3c1ad6fb3be 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h
+++ b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h
@@ -8,6 +8,7 @@
 #ifndef VPUAPI_H_INCLUDED
 #define VPUAPI_H_INCLUDED
 
+#include <linux/kfifo.h>
 #include <linux/idr.h>
 #include <linux/genalloc.h>
 #include <media/v4l2-device.h>
@@ -747,6 +748,7 @@ struct vpu_device {
 	struct video_device *video_dev_enc;
 	struct mutex dev_lock; /* lock for the src, dst v4l2 queues */
 	struct mutex hw_lock; /* lock hw configurations */
+	struct mutex irq_lock;
 	int irq;
 	enum product_id product;
 	struct vpu_attr attr;
@@ -764,7 +766,10 @@ struct vpu_device {
 	struct kthread_worker *worker;
 	int vpu_poll_interval;
 	int num_clks;
+	struct task_struct *irq_thread;
+	struct semaphore irq_sem;
 	struct reset_control *resets;
+	spinlock_t irq_spinlock; /* protect instances list */
 };
 
 struct vpu_instance;
@@ -788,6 +793,7 @@ struct vpu_instance {
 	enum v4l2_ycbcr_encoding ycbcr_enc;
 	enum v4l2_quantization quantization;
 
+	struct kfifo irq_status;
 	enum vpu_instance_state state;
 	enum vpu_instance_type type;
 	const struct vpu_instance_ops *ops;
-- 
2.43.0



* [RESEND PATCH v1 2/7] media: chips-media: wave5: Improve performance of decoder
  2025-04-10  3:39 [RESEND PATCH v1 0/7] Performance improvement of decoder Jackson.lee
  2025-04-10  3:39 ` [RESEND PATCH v1 1/7] media: chips-media: wave5: Fix Null reference while testing fluster Jackson.lee
@ 2025-04-10  3:39 ` Jackson.lee
  2025-04-10  3:39 ` [RESEND PATCH v1 3/7] media: chips-media: wave5: Fix not to be closed Jackson.lee
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Jackson.lee @ 2025-04-10  3:39 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, sebastian.fricke, nicolas.dufresne,
	bob.beckett, dafna.hirschfeld
  Cc: linux-media, linux-kernel, jackson.lee, lafley.kim, b-brnich,
	hverkuil, nas.chung

From: Jackson Lee <jackson.lee@chipsnmedia.com>

The current decoding method waits until each frame is decoded after
feeding a bitstream. As a result, performance is low and Wave5 cannot
achieve its maximum pixel processing rate.

Update the driver to use an asynchronous approach for decoding and
feeding a bitstream in order to achieve the full capabilities of the
device.

WAVE5 supports command queueing to maximize performance by pipelining
internal commands and by hiding the wait cycles taken to receive a
command from the host processor.

Instead of waiting for each command to be executed before sending the
next command, the host processor just places all the commands in the
command queue and goes on doing other things while the commands in the
queue are processed by the VPU.

While the host processor handles its own tasks, it can receive a VPU
interrupt request (IRQ).
In this case, the host processor can simply exit the interrupt service
routine (ISR) without accessing the host interface to read the result of
the command reported by the VPU.
After the host processor has completed its tasks, it can read the
command result when it needs the reports and do the response processing.

To achieve this goal, device_run() calls v4l2_m2m_job_finish() so that
the next command can be sent to the VPU continuously. If there is any
result, an IRQ is triggered, and the handler gets the decoded frames and
returns them to the upper layer.
These processes work independently of each other without waiting for
a decoded frame.
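
As a rough user-space model of this flow (a sketch under assumptions,
not the driver code): struct vpu_sim, struct feeder, QUEUE_DEPTH and the
helper names below are all hypothetical, and the queue depth of 2 is
chosen only to make the "queue full, retry later" path easy to reach. The
retry flag is modeled on the one this patch adds for
WAVE5_SYSERR_QUEUEING_FAIL.

```c
#include <assert.h>
#include <stdbool.h>

#define QUEUE_DEPTH 2 /* hypothetical depth of the command queue */

/* Hypothetical model of the VPU side of the command queue. */
struct vpu_sim {
	int pending; /* commands accepted but not yet executed */
	int reports; /* executed commands whose results are unread */
};

/* Hypothetical model of the host-side feeding state. */
struct feeder {
	int avail_bufs; /* source buffers not yet fed */
	bool retry;     /* last submit hit a full queue */
	int submitted;  /* commands accepted by the VPU so far */
};

/* Queue one command; fails when the queue has no free slot. */
static bool vpu_submit(struct vpu_sim *vpu)
{
	assert(vpu->pending >= 0 && vpu->reports >= 0);
	if (vpu->pending + vpu->reports >= QUEUE_DEPTH)
		return false;
	vpu->pending++;
	return true;
}

/* The VPU finishes one queued command and posts a report. */
static void vpu_complete_one(struct vpu_sim *vpu)
{
	if (vpu->pending > 0) {
		vpu->pending--;
		vpu->reports++;
	}
}

/* The host reads all outstanding reports when it gets around to it. */
static int host_read_results(struct vpu_sim *vpu)
{
	int n = vpu->reports;

	vpu->reports = 0;
	return n;
}

/* One job: feed a buffer (unless retrying), submit, return at once. */
static void device_run(struct feeder *f, struct vpu_sim *vpu)
{
	if (!f->retry) {
		if (f->avail_bufs == 0)
			return; /* nothing to feed, skip this job */
		f->avail_bufs--;
	}

	if (vpu_submit(vpu)) {
		f->retry = false;
		f->submitted++;
	} else {
		f->retry = true; /* resubmit on the next job, feed nothing new */
	}
}
```

The point of the model is that device_run() never waits for a result:
it either queues a command or notes that it must retry, and results are
collected separately through host_read_results().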

Signed-off-by: Jackson Lee <jackson.lee@chipsnmedia.com>
Signed-off-by: Nas Chung <nas.chung@chipsnmedia.com>
---
 .../chips-media/wave5/wave5-vpu-dec.c         | 82 +++++++++++--------
 .../platform/chips-media/wave5/wave5-vpuapi.c |  2 +
 .../platform/chips-media/wave5/wave5-vpuapi.h |  3 +
 3 files changed, 55 insertions(+), 32 deletions(-)

diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
index 32de43de1870..192556b91aaa 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
+++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
@@ -347,7 +347,6 @@ static void wave5_vpu_dec_finish_decode(struct vpu_instance *inst)
 	struct vb2_v4l2_buffer *dec_buf = NULL;
 	struct vb2_v4l2_buffer *disp_buf = NULL;
 	struct vb2_queue *dst_vq = v4l2_m2m_get_dst_vq(m2m_ctx);
-	struct queue_status_info q_status;
 
 	dev_dbg(inst->dev->dev, "%s: Fetch output info from firmware.", __func__);
 
@@ -441,20 +440,6 @@ static void wave5_vpu_dec_finish_decode(struct vpu_instance *inst)
 		}
 		spin_unlock_irqrestore(&inst->state_spinlock, flags);
 	}
-
-	/*
-	 * During a resolution change and while draining, the firmware may flush
-	 * the reorder queue regardless of having a matching decoding operation
-	 * pending. Only terminate the job if there are no more IRQ coming.
-	 */
-	wave5_vpu_dec_give_command(inst, DEC_GET_QUEUE_STATUS, &q_status);
-	if (q_status.report_queue_count == 0 &&
-	    (q_status.instance_queue_count == 0 || dec_info.sequence_changed)) {
-		dev_dbg(inst->dev->dev, "%s: finishing job.\n", __func__);
-		pm_runtime_mark_last_busy(inst->dev->dev);
-		pm_runtime_put_autosuspend(inst->dev->dev);
-		v4l2_m2m_job_finish(inst->v4l2_m2m_dev, m2m_ctx);
-	}
 }
 
 static int wave5_vpu_dec_querycap(struct file *file, void *fh, struct v4l2_capability *cap)
@@ -1146,8 +1131,8 @@ static int write_to_ringbuffer(struct vpu_instance *inst, void *buffer, size_t b
 static int fill_ringbuffer(struct vpu_instance *inst)
 {
 	struct v4l2_m2m_ctx *m2m_ctx = inst->v4l2_fh.m2m_ctx;
-	struct v4l2_m2m_buffer *buf, *n;
-	int ret;
+	struct vpu_src_buffer *vpu_buf;
+	int ret = 0;
 
 	if (m2m_ctx->last_src_buf)  {
 		struct vpu_src_buffer *vpu_buf = wave5_to_vpu_src_buf(m2m_ctx->last_src_buf);
@@ -1158,9 +1143,8 @@ static int fill_ringbuffer(struct vpu_instance *inst)
 		}
 	}
 
-	v4l2_m2m_for_each_src_buf_safe(m2m_ctx, buf, n) {
-		struct vb2_v4l2_buffer *vbuf = &buf->vb;
-		struct vpu_src_buffer *vpu_buf = wave5_to_vpu_src_buf(vbuf);
+	list_for_each_entry(vpu_buf, &inst->avail_src_bufs, list) {
+		struct vb2_v4l2_buffer *vbuf = &vpu_buf->v4l2_m2m_buf.vb;
 		struct vpu_buf *ring_buffer = &inst->bitstream_vbuf;
 		size_t src_size = vb2_get_plane_payload(&vbuf->vb2_buf, 0);
 		void *src_buf = vb2_plane_vaddr(&vbuf->vb2_buf, 0);
@@ -1220,9 +1204,13 @@ static int fill_ringbuffer(struct vpu_instance *inst)
 			dev_dbg(inst->dev->dev, "last src buffer written to the ring buffer\n");
 			break;
 		}
+
+		inst->queuing_num++;
+		list_del_init(&vpu_buf->list);
+		break;
 	}
 
-	return 0;
+	return ret;
 }
 
 static void wave5_vpu_dec_buf_queue_src(struct vb2_buffer *vb)
@@ -1236,6 +1224,11 @@ static void wave5_vpu_dec_buf_queue_src(struct vb2_buffer *vb)
 	vbuf->sequence = inst->queued_src_buf_num++;
 
 	v4l2_m2m_buf_queue(m2m_ctx, vbuf);
+
+	INIT_LIST_HEAD(&vpu_buf->list);
+	mutex_lock(&inst->feed_lock);
+	list_add_tail(&vpu_buf->list, &inst->avail_src_bufs);
+	mutex_unlock(&inst->feed_lock);
 }
 
 static void wave5_vpu_dec_buf_queue_dst(struct vb2_buffer *vb)
@@ -1385,6 +1378,13 @@ static int streamoff_output(struct vb2_queue *q)
 	dma_addr_t new_rd_ptr;
 	struct dec_output_info dec_info;
 	unsigned int i;
+	struct vpu_src_buffer *vpu_buf, *tmp;
+
+	inst->retry = false;
+	inst->queuing_num = 0;
+
+	list_for_each_entry_safe(vpu_buf, tmp, &inst->avail_src_bufs, list)
+		list_del_init(&vpu_buf->list);
 
 	for (i = 0; i < v4l2_m2m_num_dst_bufs_ready(m2m_ctx); i++) {
 		ret = wave5_vpu_dec_set_disp_flag(inst, i);
@@ -1580,10 +1580,19 @@ static void wave5_vpu_dec_device_run(void *priv)
 
 	dev_dbg(inst->dev->dev, "%s: Fill the ring buffer with new bitstream data", __func__);
 	pm_runtime_resume_and_get(inst->dev->dev);
-	ret = fill_ringbuffer(inst);
-	if (ret) {
-		dev_warn(inst->dev->dev, "Filling ring buffer failed\n");
-		goto finish_job_and_return;
+	if (!inst->retry) {
+		mutex_lock(&inst->feed_lock);
+		ret = fill_ringbuffer(inst);
+		mutex_unlock(&inst->feed_lock);
+		if (ret < 0) {
+			dev_warn(inst->dev->dev, "Filling ring buffer failed\n");
+			goto finish_job_and_return;
+		} else if (!inst->eos &&
+				inst->queuing_num == 0 &&
+				inst->state == VPU_INST_STATE_PIC_RUN) {
+			dev_dbg(inst->dev->dev, "%s: no bitstream for feeding, so skip ", __func__);
+			goto finish_job_and_return;
+		}
 	}
 
 	switch (inst->state) {
@@ -1639,7 +1648,7 @@ static void wave5_vpu_dec_device_run(void *priv)
 		}
 
 		if (q_status.instance_queue_count) {
-			dev_dbg(inst->dev->dev, "%s: leave with active job", __func__);
+			v4l2_m2m_job_finish(inst->v4l2_m2m_dev, m2m_ctx);
 			return;
 		}
 
@@ -1650,14 +1659,21 @@ static void wave5_vpu_dec_device_run(void *priv)
 			dev_err(inst->dev->dev,
 				"Frame decoding on m2m context (%p), fail: %d (result: %d)\n",
 				m2m_ctx, ret, fail_res);
-			break;
+			goto finish_job_and_return;
+		}
+
+		if (fail_res == WAVE5_SYSERR_QUEUEING_FAIL) {
+			inst->retry = true;
+		} else {
+			inst->retry = false;
+			if (!inst->eos)
+				inst->queuing_num--;
 		}
-		/* Return so that we leave this job active */
-		dev_dbg(inst->dev->dev, "%s: leave with active job", __func__);
-		return;
-	default:
-		WARN(1, "Execution of a job in state %s illegal.\n", state_to_str(inst->state));
 		break;
+	default:
+		if (!v4l2_m2m_has_stopped(m2m_ctx))
+			WARN(1, "Execution of a job in state %s illegal.\n",
+			     state_to_str(inst->state));
 	}
 
 finish_job_and_return:
@@ -1755,6 +1771,8 @@ static int wave5_vpu_open_dec(struct file *filp)
 	inst->ops = &wave5_vpu_dec_inst_ops;
 
 	spin_lock_init(&inst->state_spinlock);
+	mutex_init(&inst->feed_lock);
+	INIT_LIST_HEAD(&inst->avail_src_bufs);
 
 	inst->codec_info = kzalloc(sizeof(*inst->codec_info), GFP_KERNEL);
 	if (!inst->codec_info)
diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c
index e5e879a13e8b..68d86625538f 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c
+++ b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c
@@ -255,6 +255,8 @@ int wave5_vpu_dec_close(struct vpu_instance *inst, u32 *fail_res)
 	if (inst_count == 1)
 		pm_runtime_dont_use_autosuspend(vpu_dev->dev);
 
+	mutex_destroy(&inst->feed_lock);
+
 unlock_and_return:
 	mutex_unlock(&vpu_dev->hw_lock);
 	pm_runtime_put_sync(inst->dev->dev);
diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h
index f3c1ad6fb3be..fd0aef0bac4e 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h
+++ b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h
@@ -818,6 +818,9 @@ struct vpu_instance {
 	bool cbcr_interleave;
 	bool nv21;
 	bool eos;
+	bool retry; /* retry to feed bitstream if failure reason is WAVE5_SYSERR_QUEUEING_FAIL*/
+	int queuing_num; /* check if there is input buffer or not */
+	struct mutex feed_lock; /* lock for feeding bitstream buffers */
 	struct vpu_buf bitstream_vbuf;
 	dma_addr_t last_rd_ptr;
 	size_t remaining_consumed_bytes;
-- 
2.43.0



* [RESEND PATCH v1 3/7] media: chips-media: wave5: Fix not to be closed
  2025-04-10  3:39 [RESEND PATCH v1 0/7] Performance improvement of decoder Jackson.lee
  2025-04-10  3:39 ` [RESEND PATCH v1 1/7] media: chips-media: wave5: Fix Null reference while testing fluster Jackson.lee
  2025-04-10  3:39 ` [RESEND PATCH v1 2/7] media: chips-media: wave5: Improve performance of decoder Jackson.lee
@ 2025-04-10  3:39 ` Jackson.lee
  2025-04-10  3:39 ` [RESEND PATCH v1 4/7] media: chips-media: wave5: Use spinlock whenever state is changed Jackson.lee
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Jackson.lee @ 2025-04-10  3:39 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, sebastian.fricke, nicolas.dufresne,
	bob.beckett, dafna.hirschfeld
  Cc: linux-media, linux-kernel, jackson.lee, lafley.kim, b-brnich,
	hverkuil, nas.chung

From: Jackson Lee <jackson.lee@chipsnmedia.com>

The current code waits for an interrupt if the instance queue or report
queue count is not 0, but since the performance patch was applied, input
and output are no longer synchronized.
So even if the queue count is not 0, an interrupt might not be triggered.

Signed-off-by: Jackson Lee <jackson.lee@chipsnmedia.com>
Signed-off-by: Nas Chung <nas.chung@chipsnmedia.com>
---
 drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c   | 6 +++---
 drivers/media/platform/chips-media/wave5/wave5-vpuconfig.h | 1 +
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
index 192556b91aaa..e0f4b11ee464 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
+++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
@@ -1481,11 +1481,11 @@ static void wave5_vpu_dec_stop_streaming(struct vb2_queue *q)
 
 		wave5_vpu_dec_give_command(inst, DEC_GET_QUEUE_STATUS, &q_status);
 
-		if (q_status.report_queue_count == 0)
+		if ((inst->state == VPU_INST_STATE_STOP || q_status.instance_queue_count == 0) &&
+		    q_status.report_queue_count == 0)
 			break;
 
-		if (wave5_vpu_wait_interrupt(inst, VPU_DEC_TIMEOUT) < 0)
-			break;
+		wave5_vpu_wait_interrupt(inst, VPU_DEC_STOP_TIMEOUT);
 
 		if (wave5_vpu_dec_get_output_info(inst, &dec_output_info))
 			dev_dbg(inst->dev->dev, "there is no output info\n");
diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpuconfig.h b/drivers/media/platform/chips-media/wave5/wave5-vpuconfig.h
index 1ea9f5f31499..4ebd48d5550e 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-vpuconfig.h
+++ b/drivers/media/platform/chips-media/wave5/wave5-vpuconfig.h
@@ -59,6 +59,7 @@
 //  application specific configuration
 #define VPU_ENC_TIMEOUT                 60000
 #define VPU_DEC_TIMEOUT                 60000
+#define VPU_DEC_STOP_TIMEOUT            10
 
 // for WAVE encoder
 #define USE_SRC_PRP_AXI         0
-- 
2.43.0



* [RESEND PATCH v1 4/7] media: chips-media: wave5: Use spinlock whenever state is changed
  2025-04-10  3:39 [RESEND PATCH v1 0/7] Performance improvement of decoder Jackson.lee
                   ` (2 preceding siblings ...)
  2025-04-10  3:39 ` [RESEND PATCH v1 3/7] media: chips-media: wave5: Fix not to be closed Jackson.lee
@ 2025-04-10  3:39 ` Jackson.lee
  2025-04-10  3:40 ` [RESEND PATCH v1 5/7] media: chips-media: wave5: Fix not to free resources normally when instance was destroyed Jackson.lee
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Jackson.lee @ 2025-04-10  3:39 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, sebastian.fricke, nicolas.dufresne,
	bob.beckett, dafna.hirschfeld
  Cc: linux-media, linux-kernel, jackson.lee, lafley.kim, b-brnich,
	hverkuil, nas.chung

From: Jackson Lee <jackson.lee@chipsnmedia.com>

device_run and finish_decode are no longer synchronized, so a lock is
needed in device_run whenever the state is changed.

Signed-off-by: Jackson Lee <jackson.lee@chipsnmedia.com>
Signed-off-by: Nas Chung <nas.chung@chipsnmedia.com>
---
 drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
index e0f4b11ee464..485320db9bdc 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
+++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
@@ -1577,6 +1577,7 @@ static void wave5_vpu_dec_device_run(void *priv)
 	struct queue_status_info q_status;
 	u32 fail_res = 0;
 	int ret = 0;
+	unsigned long flags;
 
 	dev_dbg(inst->dev->dev, "%s: Fill the ring buffer with new bitstream data", __func__);
 	pm_runtime_resume_and_get(inst->dev->dev);
@@ -1617,7 +1618,9 @@ static void wave5_vpu_dec_device_run(void *priv)
 			}
 			spin_unlock_irqrestore(&inst->state_spinlock, flags);
 		} else {
+			spin_lock_irqsave(&inst->state_spinlock, flags);
 			switch_state(inst, VPU_INST_STATE_INIT_SEQ);
+			spin_unlock_irqrestore(&inst->state_spinlock, flags);
 		}
 
 		break;
@@ -1628,8 +1631,9 @@ static void wave5_vpu_dec_device_run(void *priv)
 		 * we had a chance to switch, which leads to an invalid state
 		 * change.
 		 */
+		spin_lock_irqsave(&inst->state_spinlock, flags);
 		switch_state(inst, VPU_INST_STATE_PIC_RUN);
-
+		spin_unlock_irqrestore(&inst->state_spinlock, flags);
 		/*
 		 * During DRC, the picture decoding remains pending, so just leave the job
 		 * active until this decode operation completes.
@@ -1643,7 +1647,9 @@ static void wave5_vpu_dec_device_run(void *priv)
 		ret = wave5_prepare_fb(inst);
 		if (ret) {
 			dev_warn(inst->dev->dev, "Framebuffer preparation, fail: %d\n", ret);
+			spin_lock_irqsave(&inst->state_spinlock, flags);
 			switch_state(inst, VPU_INST_STATE_STOP);
+			spin_unlock_irqrestore(&inst->state_spinlock, flags);
 			break;
 		}
 
-- 
2.43.0



* [RESEND PATCH v1 5/7] media: chips-media: wave5: Fix not to free resources normally when instance was destroyed
  2025-04-10  3:39 [RESEND PATCH v1 0/7] Performance improvement of decoder Jackson.lee
                   ` (3 preceding siblings ...)
  2025-04-10  3:39 ` [RESEND PATCH v1 4/7] media: chips-media: wave5: Use spinlock whenever state is changed Jackson.lee
@ 2025-04-10  3:40 ` Jackson.lee
  2025-04-10  3:40 ` [RESEND PATCH v1 6/7] media: chips-media: wave5: Reduce high CPU load Jackson.lee
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Jackson.lee @ 2025-04-10  3:40 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, sebastian.fricke, nicolas.dufresne,
	bob.beckett, dafna.hirschfeld
  Cc: linux-media, linux-kernel, jackson.lee, lafley.kim, b-brnich,
	hverkuil, nas.chung

From: Jackson Lee <jackson.lee@chipsnmedia.com>

Since the performance patch was applied, resources were sometimes not
freed. The root cause was that a timeout sometimes happened after
calling wave5_vpu_dec_finish_seq() when the application was closed
forcibly. So if the failure reason is WAVE5_SYSERR_VPU_STILL_RUNNING,
wave5_vpu_dec_get_output_info() should be called to flush the decoded
frames before closing.

Signed-off-by: Jackson Lee <jackson.lee@chipsnmedia.com>
Signed-off-by: Nas Chung <nas.chung@chipsnmedia.com>
---
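
The retry flow described in the commit message above can be sketched in
userspace C. This is a minimal model, not the driver code: the
sketch_* names, the SKETCH_STILL_RUNNING value and the retry limit are
hypothetical stand-ins, and the "firmware" is assumed to report
"still running" for as long as it holds decoded frames that nobody has
fetched yet.

```c
#include <errno.h>

/* Hypothetical stand-ins; the values are illustrative, not the driver's. */
#define SKETCH_STILL_RUNNING 0x1000u
#define SKETCH_MAX_RETRY 10

struct sketch_vpu {
	int pending_frames; /* decoded frames the modelled firmware still holds */
};

/* Model: finishing the sequence fails while frames are still pending. */
static int sketch_finish_seq(struct sketch_vpu *vpu, unsigned int *fail_res)
{
	if (vpu->pending_frames > 0) {
		*fail_res = SKETCH_STILL_RUNNING;
		return -1;
	}
	*fail_res = 0;
	return 0;
}

/* Model: fetching output info retires one pending frame. */
static void sketch_get_output_info(struct sketch_vpu *vpu)
{
	if (vpu->pending_frames > 0)
		vpu->pending_frames--;
}

/* Retry finish_seq, flushing one decoded frame between attempts. */
static int sketch_dec_close(struct sketch_vpu *vpu, unsigned int *fail_res)
{
	int retry = 0;
	int ret;

	for (;;) {
		ret = sketch_finish_seq(vpu, fail_res);
		if (ret == 0)
			return 0;		/* sequence finished, safe to free */
		if (*fail_res != SKETCH_STILL_RUNNING)
			return ret;		/* unrelated failure, give up */
		if (retry++ >= SKETCH_MAX_RETRY)
			return -ETIMEDOUT;	/* bounded number of attempts */
		sketch_get_output_info(vpu);	/* flush before retrying */
	}
}
```

In this model an instance with three pending frames closes successfully
after three flushes, while one that never drains within the retry limit
returns -ETIMEDOUT. The sketch leaves out the locking and runtime PM
handling that the real loop has to deal with.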
 .../platform/chips-media/wave5/wave5-vpuapi.c | 20 +++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c
index 68d86625538f..d7318d596b73 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c
+++ b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c
@@ -209,6 +209,7 @@ int wave5_vpu_dec_close(struct vpu_instance *inst, u32 *fail_res)
 	int i;
 	int inst_count = 0;
 	struct vpu_instance *inst_elm;
+	struct dec_output_info dec_info;
 
 	*fail_res = 0;
 	if (!inst->codec_info)
@@ -229,11 +230,26 @@ int wave5_vpu_dec_close(struct vpu_instance *inst, u32 *fail_res)
 			goto unlock_and_return;
 		}
 
-		if (*fail_res == WAVE5_SYSERR_VPU_STILL_RUNNING &&
-		    retry++ >= MAX_FIRMWARE_CALL_RETRY) {
+		if (ret == 0)
+			break;
+
+		if (*fail_res != WAVE5_SYSERR_VPU_STILL_RUNNING) {
+			dev_warn(inst->dev->dev, "dec_finish_seq timed out\n");
+			goto unlock_and_return;
+		}
+
+		if (retry++ >= MAX_FIRMWARE_CALL_RETRY) {
 			ret = -ETIMEDOUT;
 			goto unlock_and_return;
 		}
+
+		mutex_unlock(&vpu_dev->hw_lock);
+		wave5_vpu_dec_get_output_info(inst, &dec_info);
+		ret = mutex_lock_interruptible(&vpu_dev->hw_lock);
+		if (ret) {
+			pm_runtime_put_sync(inst->dev->dev);
+			return ret;
+		}
 	} while (ret != 0);
 
 	dev_dbg(inst->dev->dev, "%s: dec_finish_seq complete\n", __func__);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RESEND PATCH v1 6/7] media: chips-media: wave5: Reduce high CPU load
  2025-04-10  3:39 [RESEND PATCH v1 0/7] Performance improvement of decoder Jackson.lee
                   ` (4 preceding siblings ...)
  2025-04-10  3:40 ` [RESEND PATCH v1 5/7] media: chips-media: wave5: Fix not to free resources normally when instance was destroyed Jackson.lee
@ 2025-04-10  3:40 ` Jackson.lee
  2025-04-10  3:40 ` [RESEND PATCH v1 7/7] media: chips-media: wave5: Fix SError of kernel panic when closed Jackson.lee
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Jackson.lee @ 2025-04-10  3:40 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, sebastian.fricke, nicolas.dufresne,
	bob.beckett, dafna.hirschfeld
  Cc: linux-media, linux-kernel, jackson.lee, lafley.kim, b-brnich,
	hverkuil, nas.chung

From: Jackson Lee <jackson.lee@chipsnmedia.com>

After the changes for the decoder performance improvement were applied,
CPU load became a problem: it was more than 4 times the load measured
before those changes.
The root cause was that device_run was called many times even when
there was no bitstream left to queue.

Signed-off-by: Jackson Lee <jackson.lee@chipsnmedia.com>
Signed-off-by: Nas Chung <nas.chung@chipsnmedia.com>
---
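
The idea described in the commit message above can be sketched in
userspace C. This is a simplified model under stated assumptions, not
the driver: the sketch_* names and the scheduler loop are hypothetical,
and source buffers are assumed to stay on the ready list after they have
been fed, which is why the ready count alone cannot stop the rescheduling.

```c
#include <stdbool.h>

struct sketch_inst {
	int src_bufs_ready;	/* bitstream buffers on the ready list */
	int queuing_num;	/* buffers not yet fed to the hardware */
	bool empty_queue;	/* set when a run found nothing to feed */
};

/* Job is ready only if there is input and the last run was not empty. */
static bool sketch_job_ready(const struct sketch_inst *inst)
{
	return inst->src_bufs_ready > 0 && !inst->empty_queue;
}

/* One run: feed a buffer if possible, otherwise flag the empty queue. */
static void sketch_device_run(struct sketch_inst *inst)
{
	if (inst->queuing_num == 0)
		inst->empty_queue = true;	/* stop further rescheduling */
	else
		inst->queuing_num--;
}

/* Queuing a new bitstream buffer clears the flag again. */
static void sketch_buf_queue_src(struct sketch_inst *inst)
{
	inst->empty_queue = false;
	inst->src_bufs_ready++;
	inst->queuing_num++;
}

/* Hypothetical scheduler: run while the job reports ready (bounded). */
static int sketch_schedule(struct sketch_inst *inst, int max_runs)
{
	int runs = 0;

	while (runs < max_runs && sketch_job_ready(inst)) {
		sketch_device_run(inst);
		runs++;
	}
	return runs;
}
```

With one buffer queued, the model runs twice (one run feeds the buffer,
the next finds nothing and sets the flag) and then stays idle until
another buffer is queued. Without the flag, the same loop would spin up
to max_runs times.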
 .../media/platform/chips-media/wave5/wave5-vpu-dec.c | 12 +++++++++---
 .../media/platform/chips-media/wave5/wave5-vpuapi.h  |  1 +
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
index 485320db9bdc..58340fddcd04 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
+++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
@@ -1280,10 +1280,13 @@ static void wave5_vpu_dec_buf_queue(struct vb2_buffer *vb)
 		__func__, vb->type, vb->index, vb2_plane_size(&vbuf->vb2_buf, 0),
 		vb2_plane_size(&vbuf->vb2_buf, 1), vb2_plane_size(&vbuf->vb2_buf, 2));
 
-	if (vb->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE)
+	if (vb->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) {
+		if (inst->empty_queue)
+			inst->empty_queue = false;
 		wave5_vpu_dec_buf_queue_src(vb);
-	else if (vb->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)
+	} else if (vb->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE) {
 		wave5_vpu_dec_buf_queue_dst(vb);
+	}
 }
 
 static int wave5_vpu_dec_allocate_ring_buffer(struct vpu_instance *inst)
@@ -1474,6 +1477,7 @@ static void wave5_vpu_dec_stop_streaming(struct vb2_queue *q)
 
 	dev_dbg(inst->dev->dev, "%s: type: %u\n", __func__, q->type);
 	pm_runtime_resume_and_get(inst->dev->dev);
+	inst->empty_queue = false;
 
 	while (check_cmd) {
 		struct queue_status_info q_status;
@@ -1592,6 +1596,7 @@ static void wave5_vpu_dec_device_run(void *priv)
 				inst->queuing_num == 0 &&
 				inst->state == VPU_INST_STATE_PIC_RUN) {
 			dev_dbg(inst->dev->dev, "%s: no bitstream for feeding, so skip ", __func__);
+			inst->empty_queue = true;
 			goto finish_job_and_return;
 		}
 	}
@@ -1737,7 +1742,8 @@ static int wave5_vpu_dec_job_ready(void *priv)
 				"No capture buffer ready to decode!\n");
 			break;
 		} else if (!wave5_is_draining_or_eos(inst) &&
-			   !v4l2_m2m_num_src_bufs_ready(m2m_ctx)) {
+			   (!v4l2_m2m_num_src_bufs_ready(m2m_ctx) ||
+			    inst->empty_queue)) {
 			dev_dbg(inst->dev->dev,
 				"No bitstream data to decode!\n");
 			break;
diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h
index fd0aef0bac4e..f2596af08cdf 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h
+++ b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h
@@ -821,6 +821,7 @@ struct vpu_instance {
 	bool retry; /* retry to feed bitstream if failure reason is WAVE5_SYSERR_QUEUEING_FAIL*/
 	int queuing_num; /* check if there is input buffer or not */
 	struct mutex feed_lock; /* lock for feeding bitstream buffers */
+	bool empty_queue;
 	struct vpu_buf bitstream_vbuf;
 	dma_addr_t last_rd_ptr;
 	size_t remaining_consumed_bytes;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RESEND PATCH v1 7/7] media: chips-media: wave5: Fix SError of kernel panic when closed
  2025-04-10  3:39 [RESEND PATCH v1 0/7] Performance improvement of decoder Jackson.lee
                   ` (5 preceding siblings ...)
  2025-04-10  3:40 ` [RESEND PATCH v1 6/7] media: chips-media: wave5: Reduce high CPU load Jackson.lee
@ 2025-04-10  3:40 ` Jackson.lee
  2025-04-10 12:39 ` [RESEND PATCH v1 0/7] Performance improvement of decoder Nicolas Dufresne
  2025-04-11 16:57 ` Nicolas Dufresne
  8 siblings, 0 replies; 18+ messages in thread
From: Jackson.lee @ 2025-04-10  3:40 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, sebastian.fricke, nicolas.dufresne,
	bob.beckett, dafna.hirschfeld
  Cc: linux-media, linux-kernel, jackson.lee, lafley.kim, b-brnich,
	hverkuil, nas.chung

From: Jackson Lee <jackson.lee@chipsnmedia.com>

Since the "Reduce high CPU load" patch was applied, a kernel panic
caused by an SError occurred on rare occasions while testing fluster.
The root cause was that the device entered suspend mode because the
autosuspend delay had expired.

[   48.834439] SError Interrupt on CPU0, code 0x00000000bf000000 -- SError
[   48.834455] CPU: 0 UID: 0 PID: 1067 Comm: v4l2h265dec0:sr Not tainted 6.12.9-gc9e21a1ebd75-dirty #7
[   48.834461] Hardware name: ti Texas Instruments J721S2 EVM/Texas Instruments J721S2 EVM, BIOS 2025.01-00345-gbaf3aaa8ecfa 01/01/2025
[   48.834464] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   48.834468] pc : wave5_dec_clr_disp_flag+0x40/0x80 [wave5]
[   48.834488] lr : wave5_dec_clr_disp_flag+0x40/0x80 [wave5]
[   48.834495] sp : ffff8000856e3a30
[   48.834497] x29: ffff8000856e3a30 x28: ffff0008093f6010 x27: ffff000809158130
[   48.834504] x26: 0000000000000000 x25: ffff00080b625000 x24: ffff000804a9ba80
[   48.834509] x23: ffff000802343028 x22: ffff000809158150 x21: ffff000802218000
[   48.834513] x20: ffff0008093f6000 x19: ffff0008093f6000 x18: 0000000000000000
[   48.834518] x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffff74009618
[   48.834523] x14: 000000010000000c x13: 0000000000000000 x12: 0000000000000000
[   48.834527] x11: ffffffffffffffff x10: ffffffffffffffff x9 : ffff000802343028
[   48.834532] x8 : ffff00080b6252a0 x7 : 0000000000000038 x6 : 0000000000000000
[   48.834536] x5 : ffff00080b625060 x4 : 0000000000000000 x3 : 0000000000000000
[   48.834541] x2 : 0000000000000000 x1 : ffff800084bf0118 x0 : ffff800084bf0000
[   48.834547] Kernel panic - not syncing: Asynchronous SError Interrupt
[   48.834549] CPU: 0 UID: 0 PID: 1067 Comm: v4l2h265dec0:sr Not tainted 6.12.9-gc9e21a1ebd75-dirty #7
[   48.834554] Hardware name: ti Texas Instruments J721S2 EVM/Texas Instruments J721S2 EVM, BIOS 2025.01-00345-gbaf3aaa8ecfa 01/01/2025
[   48.834556] Call trace:
[   48.834559]  dump_backtrace+0x94/0xec
[   48.834574]  show_stack+0x18/0x24
[   48.834579]  dump_stack_lvl+0x38/0x90
[   48.834585]  dump_stack+0x18/0x24
[   48.834588]  panic+0x35c/0x3e0
[   48.834592]  nmi_panic+0x40/0x8c
[   48.834595]  arm64_serror_panic+0x64/0x70
[   48.834598]  do_serror+0x3c/0x78
[   48.834601]  el1h_64_error_handler+0x34/0x4c
[   48.834605]  el1h_64_error+0x64/0x68
[   48.834608]  wave5_dec_clr_disp_flag+0x40/0x80 [wave5]
[   48.834615]  wave5_vpu_dec_clr_disp_flag+0x54/0x80 [wave5]
[   48.834622]  wave5_vpu_dec_buf_queue+0x19c/0x1a0 [wave5]
[   48.834628]  __enqueue_in_driver+0x3c/0x74 [videobuf2_common]
[   48.834639]  vb2_core_qbuf+0x508/0x61c [videobuf2_common]
[   48.834646]  vb2_qbuf+0xa4/0x168 [videobuf2_v4l2]
[   48.834656]  v4l2_m2m_qbuf+0x80/0x238 [v4l2_mem2mem]
[   48.834666]  v4l2_m2m_ioctl_qbuf+0x18/0x24 [v4l2_mem2mem]
[   48.834673]  v4l_qbuf+0x48/0x5c [videodev]
[   48.834704]  __video_do_ioctl+0x180/0x3f0 [videodev]
[   48.834725]  video_usercopy+0x2ec/0x68c [videodev]
[   48.834745]  video_ioctl2+0x18/0x24 [videodev]
[   48.834766]  v4l2_ioctl+0x40/0x60 [videodev]
[   48.834786]  __arm64_sys_ioctl+0xa8/0xec
[   48.834793]  invoke_syscall+0x44/0x100
[   48.834800]  el0_svc_common.constprop.0+0xc0/0xe0
[   48.834804]  do_el0_svc+0x1c/0x28
[   48.834809]  el0_svc+0x30/0xd0
[   48.834813]  el0t_64_sync_handler+0xc0/0xc4
[   48.834816]  el0t_64_sync+0x190/0x194
[   48.834820] SMP: stopping secondary CPUs
[   48.834831] Kernel Offset: disabled
[   48.834833] CPU features: 0x08,00002002,80200000,4200421b
[   48.834837] Memory Limit: none
[   49.161404] ---[ end Kernel panic - not syncing: Asynchronous SError Interrupt ]---

Signed-off-by: Jackson Lee <jackson.lee@chipsnmedia.com>
Signed-off-by: Nas Chung <nas.chung@chipsnmedia.com>
---
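
The root cause named in the commit message above can be illustrated
with a small userspace model of an autosuspend timer. This is a sketch
under assumptions, not the kernel's runtime PM implementation: the
sketch_* names and all timestamps are hypothetical, and the device is
simply treated as suspended once no reference is held and the delay
has elapsed since the last activity.

```c
#include <stdbool.h>

struct sketch_pm {
	int usage_count;	/* outstanding "get" references */
	long last_busy_ms;	/* time of the last recorded activity */
	long delay_ms;		/* autosuspend delay */
};

/* Take a reference and record activity. */
static void sketch_pm_get(struct sketch_pm *pm, long now_ms)
{
	pm->usage_count++;
	pm->last_busy_ms = now_ms;
}

/* Drop a reference and record activity. */
static void sketch_pm_put(struct sketch_pm *pm, long now_ms)
{
	pm->usage_count--;
	pm->last_busy_ms = now_ms;
}

/* Suspended when idle, unreferenced, and the delay has expired. */
static bool sketch_is_suspended(const struct sketch_pm *pm, long now_ms)
{
	return pm->usage_count == 0 &&
	       now_ms - pm->last_busy_ms >= pm->delay_ms;
}

/* A register access is only safe while the device is not suspended. */
static bool sketch_reg_access_ok(const struct sketch_pm *pm, long now_ms)
{
	return !sketch_is_suspended(pm, now_ms);
}
```

In this model, a register access 300 ms after the last activity hits a
suspended device with a 100 ms delay but not with a 500 ms delay, and
an access made while a reference is held is never affected by the
delay at all.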
 .../platform/chips-media/wave5/wave5-vpu-dec.c   |  3 ---
 .../platform/chips-media/wave5/wave5-vpu-enc.c   |  3 ---
 .../media/platform/chips-media/wave5/wave5-vpu.c |  2 +-
 .../platform/chips-media/wave5/wave5-vpuapi.c    | 16 ----------------
 4 files changed, 1 insertion(+), 23 deletions(-)

diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
index 58340fddcd04..d069cb955a7c 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
+++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
@@ -1865,9 +1865,6 @@ static int wave5_vpu_open_dec(struct file *filp)
 	if (ret)
 		goto cleanup_inst;
 
-	if (list_empty(&dev->instances))
-		pm_runtime_use_autosuspend(inst->dev->dev);
-
 	list_add_tail(&inst->list, &dev->instances);
 
 	mutex_unlock(&dev->dev_lock);
diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c b/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c
index 52a1a00fd9bb..7f1aa392805f 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c
+++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c
@@ -1779,9 +1779,6 @@ static int wave5_vpu_open_enc(struct file *filp)
 	if (ret)
 		goto cleanup_inst;
 
-	if (list_empty(&dev->instances))
-		pm_runtime_use_autosuspend(inst->dev->dev);
-
 	list_add_tail(&inst->list, &dev->instances);
 
 	mutex_unlock(&dev->dev_lock);
diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu.c b/drivers/media/platform/chips-media/wave5/wave5-vpu.c
index c53ac9a5c1c2..ae2c3eefd00a 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-vpu.c
+++ b/drivers/media/platform/chips-media/wave5/wave5-vpu.c
@@ -367,7 +367,7 @@ static int wave5_vpu_probe(struct platform_device *pdev)
 	dev_info(&pdev->dev, "Product Code:      0x%x\n", dev->product_code);
 	dev_info(&pdev->dev, "Firmware Revision: %u\n", fw_revision);
 
-	pm_runtime_set_autosuspend_delay(&pdev->dev, 100);
+	pm_runtime_set_autosuspend_delay(&pdev->dev, 500);
 	pm_runtime_use_autosuspend(&pdev->dev);
 	pm_runtime_enable(&pdev->dev);
 	wave5_vpu_sleep_wake(&pdev->dev, true, NULL, 0);
diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c
index d7318d596b73..1f7f4d214b3c 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c
+++ b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c
@@ -207,8 +207,6 @@ int wave5_vpu_dec_close(struct vpu_instance *inst, u32 *fail_res)
 	int retry = 0;
 	struct vpu_device *vpu_dev = inst->dev;
 	int i;
-	int inst_count = 0;
-	struct vpu_instance *inst_elm;
 	struct dec_output_info dec_info;
 
 	*fail_res = 0;
@@ -265,12 +263,6 @@ int wave5_vpu_dec_close(struct vpu_instance *inst, u32 *fail_res)
 	}
 
 	wave5_vdi_free_dma_memory(vpu_dev, &p_dec_info->vb_task);
-
-	list_for_each_entry(inst_elm, &vpu_dev->instances, list)
-		inst_count++;
-	if (inst_count == 1)
-		pm_runtime_dont_use_autosuspend(vpu_dev->dev);
-
 	mutex_destroy(&inst->feed_lock);
 
 unlock_and_return:
@@ -738,8 +730,6 @@ int wave5_vpu_enc_close(struct vpu_instance *inst, u32 *fail_res)
 	int ret;
 	int retry = 0;
 	struct vpu_device *vpu_dev = inst->dev;
-	int inst_count = 0;
-	struct vpu_instance *inst_elm;
 
 	*fail_res = 0;
 	if (!inst->codec_info)
@@ -782,12 +772,6 @@ int wave5_vpu_enc_close(struct vpu_instance *inst, u32 *fail_res)
 	}
 
 	wave5_vdi_free_dma_memory(vpu_dev, &p_enc_info->vb_task);
-
-	list_for_each_entry(inst_elm, &vpu_dev->instances, list)
-		inst_count++;
-	if (inst_count == 1)
-		pm_runtime_dont_use_autosuspend(vpu_dev->dev);
-
 	mutex_unlock(&vpu_dev->hw_lock);
 	pm_runtime_put_sync(inst->dev->dev);
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [RESEND PATCH v1 0/7] Performance improvement of decoder
  2025-04-10  3:39 [RESEND PATCH v1 0/7] Performance improvement of decoder Jackson.lee
                   ` (6 preceding siblings ...)
  2025-04-10  3:40 ` [RESEND PATCH v1 7/7] media: chips-media: wave5: Fix SError of kernel panic when closed Jackson.lee
@ 2025-04-10 12:39 ` Nicolas Dufresne
  2025-04-11  0:49   ` jackson.lee
  2025-04-11 16:57 ` Nicolas Dufresne
  8 siblings, 1 reply; 18+ messages in thread
From: Nicolas Dufresne @ 2025-04-10 12:39 UTC (permalink / raw)
  To: Jackson.lee, mchehab, hverkuil-cisco, sebastian.fricke,
	bob.beckett, dafna.hirschfeld
  Cc: linux-media, linux-kernel, lafley.kim, b-brnich, hverkuil,
	nas.chung

Hi,

Le jeudi 10 avril 2025 à 12:39 +0900, Jackson.lee a écrit :
> From: Jackson Lee <jackson.lee@chipsnmedia.com>
> 
> v4l2-compliance results:
> ========================

What's the reason for a resend within 3h?

Nicolas

> 
> v4l2-compliance 1.28.1-5233, 64 bits, 64-bit time_t
> 
> Buffer ioctls:
>                 warn: v4l2-test-buffers.cpp(693): VIDIOC_CREATE_BUFS
> not supported
>                 warn: v4l2-test-buffers.cpp(693): VIDIOC_CREATE_BUFS
> not supported
>         test VIDIOC_REQBUFS/CREATE_BUFS/QUERYBUF: OK
>         test CREATE_BUFS maximum buffers: OK
>         test VIDIOC_EXPBUF: OK
>         test Requests: OK (Not Supported)
> 
> Total for wave5-dec device /dev/video0: 46, Succeeded: 46, Failed: 0,
> Warnings: 2
> Total for wave5-enc device /dev/video1: 46, Succeeded: 46, Failed: 0,
> Warnings: 0
> 
> Fluster test results:
> =====================
> 
> Running test suite JCT-VC-HEVC_V1 with decoder GStreamer-H.265-V4L2-
> Gst1.0
> Using 3 parallel job(s)
> Ran 133/147 tests successfully               in 41.629 secs
> 
> (1 test fails because of not supporting to parse multi frames, 1 test
> fails because of a missing frame and slight corruption,
>  2 tests fail because of sizes which are incompatible with the IP, 11
> tests fail because of unsupported 10 bit format)
> 
> 
> Running test suite JVT-AVC_V1 with decoder GStreamer-H.264-V4L2-
> Gst1.0
> Using 3 parallel job(s)
> Ran 78/135 tests successfully               in 44.578 secs
> 
> (57 fail because the hardware is unable to decode  MBAFF / FMO /
> Field / Extended profile streams.)
> 
> Seek test
> =====================
> 1. gst-play-1.0 seek.264
> 2. this will use waylandsink since gst-play-1.0 uses playbin.
>    if you don't want to hook up display,
>    you can run gst-play-1.0 seek.264 --videosink=fakevideosink
> instead
> 3. Let pipeline run for 2-3 seconds
> 4. press SPACE key to pause
> 5. press 0 to reset
> press SPACE to start again
> 
> gst-play-1.0 seek.264 --videosink=fakevideosink
> Press 'k' to see a list of keyboard shortcuts.
> Now playing /root/seek.264
> Redistribute latency...
> Redistribute latency...
> Redistribute latency...
> Redistribute latency...
> Redistribute latency...aused
> 0:00:09.9 / 0:00:09.7
> Reached end of play list.
> 
> Sequence Change test
> =====================
> gst-launch-1.0 filesrc location=./switch_1080p_720p_240frames.h264 !
> h264parse ! v4l2h264dec ! filesink location=./h264_output_420.yuv
> Setting pipeline to PAUSED ...
> Pipeline is PREROLLING ...
> Redistribute latency...
> Redistribute latency...
> Pipeline is PREROLLED ...
> Setting pipeline to PLAYING ...
> Redistribute latency...
> New clock: GstSystemClock
> Got EOS from element "pipeline0".
> Execution ended after 0:00:05.772414400
> Setting pipeline to NULL ...
> Freeing pipeline ...
> 
> Change since v0:
> ===================
> * For [PATCH v1 2/7] media: chips-media: wave5: Improve performance
> of decoder
>  - separates the previous patch to a few patches
> 
> * For [PATCH v1 3/7] media: chips-media: wave5: Fix not to be closed
>  - separated from the previous patch of performance improvement of
>    decoder
> 
> * For [PATCH v1 4/7] media: chips-media: wave5: Use spinlock whenever
> state is changed
>  - separated from the previous patch of performance improvement of
>    decoder
> 
> * For [PATCH v1 5/7] media: chips-media: wave5: Fix not to free
> resources normally when
>     instance was destroyed
>  - separated from the previous patch of performance improvement of
>    decoder
> 
> * For [PATCH v1 7/7] media: chips-media: wave5: Fix SError of kernel
> panic when closed
>  - separated from the previous patch of performance improvement of
>    decoder
> 
> Jackson Lee (7):
>   media: chips-media: wave5: Fix Null reference while testing fluster
>   media: chips-media: wave5: Improve performance of decoder
>   media: chips-media: wave5: Fix not to be closed
>   media: chips-media: wave5: Use spinlock whenever state is changed
>   media: chips-media: wave5: Fix not to free resources normally when
>     instance was destroyed
>   media: chips-media: wave5: Reduce high CPU load
>   media: chips-media: wave5: Fix SError of kernel panic when closed
> 
>  .../platform/chips-media/wave5/wave5-helper.c |  10 +-
>  .../chips-media/wave5/wave5-vpu-dec.c         | 116 +++++++++++-----
> --
>  .../chips-media/wave5/wave5-vpu-enc.c         |   8 +-
>  .../platform/chips-media/wave5/wave5-vpu.c    |  70 +++++++++--
>  .../platform/chips-media/wave5/wave5-vpuapi.c |  36 +++---
>  .../platform/chips-media/wave5/wave5-vpuapi.h |  10 ++
>  .../chips-media/wave5/wave5-vpuconfig.h       |   1 +
>  7 files changed, 179 insertions(+), 72 deletions(-)

-- 
Nicolas Dufresne
Principal Engineer at Collabora

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [RESEND PATCH v1 0/7] Performance improvement of decoder
  2025-04-10 12:39 ` [RESEND PATCH v1 0/7] Performance improvement of decoder Nicolas Dufresne
@ 2025-04-11  0:49   ` jackson.lee
  2025-04-11 15:47     ` Nicolas Dufresne
  0 siblings, 1 reply; 18+ messages in thread
From: jackson.lee @ 2025-04-11  0:49 UTC (permalink / raw)
  To: Nicolas Dufresne, mchehab@kernel.org, hverkuil-cisco@xs4all.nl,
	sebastian.fricke@collabora.com, bob.beckett@collabora.com,
	dafna.hirschfeld@collabora.com
  Cc: linux-media@vger.kernel.org, linux-kernel@vger.kernel.org,
	lafley.kim, b-brnich@ti.com, hverkuil@xs4all.nl, Nas Chung

Hi Nicolas

There was a warning message from patchwork.
The hash of "Improve performance of decoder", which is part of this patch
series, was used in the Fixes tags, so I removed the Fixes tags from the
commit messages.

thanks




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RESEND PATCH v1 0/7] Performance improvement of decoder
  2025-04-11  0:49   ` jackson.lee
@ 2025-04-11 15:47     ` Nicolas Dufresne
  2025-04-14  1:44       ` jackson.lee
  0 siblings, 1 reply; 18+ messages in thread
From: Nicolas Dufresne @ 2025-04-11 15:47 UTC (permalink / raw)
  To: jackson.lee, mchehab@kernel.org, hverkuil-cisco@xs4all.nl,
	sebastian.fricke@collabora.com, bob.beckett@collabora.com,
	dafna.hirschfeld@collabora.com
  Cc: linux-media@vger.kernel.org, linux-kernel@vger.kernel.org,
	lafley.kim, b-brnich@ti.com, hverkuil@xs4all.nl, Nas Chung

Hi,

Le vendredi 11 avril 2025 à 00:49 +0000, jackson.lee a écrit :
> Hi Nicolas
> 
> There was warning message of patch work.
> Hash value of "Improve performance of decoder" in the patch series
> were used for FIXS tag, I removed the FIXS tags in the commit
> message.

I'd prefer if you simply comment back that this needs to be removed
before merging. If you get enough comments to warrant a new version, you
can do it then; if it's the only thing preventing the merge, I will fix it.

Resends are mostly used when patches have had no comments or feedback
for a long period of time, that is, when the patches have been forgotten
and are not expected to contain any edits.

regards,
Nicolas


-- 
Nicolas Dufresne
Principal Engineer at Collabora

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RESEND PATCH v1 0/7] Performance improvement of decoder
  2025-04-10  3:39 [RESEND PATCH v1 0/7] Performance improvement of decoder Jackson.lee
                   ` (7 preceding siblings ...)
  2025-04-10 12:39 ` [RESEND PATCH v1 0/7] Performance improvement of decoder Nicolas Dufresne
@ 2025-04-11 16:57 ` Nicolas Dufresne
  2025-04-14  2:04   ` jackson.lee
  2025-05-07  4:57   ` jackson.lee
  8 siblings, 2 replies; 18+ messages in thread
From: Nicolas Dufresne @ 2025-04-11 16:57 UTC (permalink / raw)
  To: Jackson.lee, mchehab, hverkuil-cisco, sebastian.fricke,
	bob.beckett, dafna.hirschfeld
  Cc: linux-media, linux-kernel, lafley.kim, b-brnich, hverkuil,
	nas.chung

Hi Jackson,

Le jeudi 10 avril 2025 à 12:39 +0900, Jackson.lee a écrit :
> From: Jackson Lee <jackson.lee@chipsnmedia.com>
> 
> v4l2-compliance results:
> ========================
> 
> v4l2-compliance 1.28.1-5233, 64 bits, 64-bit time_t
> 
> Buffer ioctls:
>                 warn: v4l2-test-buffers.cpp(693): VIDIOC_CREATE_BUFS
> not supported
>                 warn: v4l2-test-buffers.cpp(693): VIDIOC_CREATE_BUFS
> not supported
>         test VIDIOC_REQBUFS/CREATE_BUFS/QUERYBUF: OK
>         test CREATE_BUFS maximum buffers: OK
>         test VIDIOC_EXPBUF: OK
>         test Requests: OK (Not Supported)
> 
> Total for wave5-dec device /dev/video0: 46, Succeeded: 46, Failed: 0,
> Warnings: 2
> Total for wave5-enc device /dev/video1: 46, Succeeded: 46, Failed: 0,
> Warnings: 0
> 
> Fluster test results:
> =====================
> 
> Running test suite JCT-VC-HEVC_V1 with decoder GStreamer-H.265-V4L2-
> Gst1.0
> Using 3 parallel job(s)
> Ran 133/147 tests successfully               in 41.629 secs

Same results here.

> 
> (1 test fails because of not supporting to parse multi frames, 1 test
> fails because of a missing frame and slight corruption,
>  2 tests fail because of sizes which are incompatible with the IP, 11
> tests fail because of unsupported 10 bit format)
> 
> 
> Running test suite JVT-AVC_V1 with decoder GStreamer-H.264-V4L2-
> Gst1.0
> Using 3 parallel job(s)
> Ran 78/135 tests successfully               in 44.578 secs
> 
> (57 fail because the hardware is unable to decode  MBAFF / FMO /
> Field / Extended profile streams.)

Same results here. There is also the JVT-FR-EXT test suite now, which you
should include. I get 23/69 here, without incident, but I did not analyze
the individual results; I'll leave that to you. A quick look shows, though,
that YUV422 does not work anymore.

> 
> Seek test
> =====================
> 1. gst-play-1.0 seek.264
> 2. this will use waylandsink since gst-play-1.0 uses playbin.
>    if you don't want to hook up display,
>    you can run gst-play-1.0 seek.264 --videosink=fakevideosink
> instead
> 3. Let pipeline run for 2-3 seconds
> 4. press SPACE key to pause
> 5. press 0 to reset
> press SPACE to start again
> 
> gst-play-1.0 seek.264 --videosink=fakevideosink
> Press 'k' to see a list of keyboard shortcuts.
> Now playing /root/seek.264
> Redistribute latency...
> Redistribute latency...
> Redistribute latency...
> Redistribute latency...
> Redistribute latency...aused
> 0:00:09.9 / 0:00:09.7
> Reached end of play list.

So, I managed to resurrect my device. One side effect of this series
is that the driver is no longer silent in normal cases. Pressing 'q'
while playing, or seeking, seems to fill the kernel logs with these two
errors.

[ 5037.457307] vdec 4210000.video-codec: wave5_vpu_dec_finish_decode:
could not get output info.
[ 5037.457436] vdec 4210000.video-codec:
wave5_vpu_firmware_command_queue_error_check: result not ready: 0x800

This needs fixing for the next version. The condition that makes these
non-errors needs to be tested so that we don't get spammed anymore. They
also occur while running fluster.
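
One way to picture such a check is to classify the failure before choosing
a log level. This is a minimal user-space sketch, not driver code: the
struct, the helper name and the `stopping` flag are hypothetical, and only
the 0x800 value comes from the log above. It assumes the benign case is a
"result not ready" reply while the stream is being stopped or flushed.

```c
#include <assert.h>
#include <stdbool.h>

/* Value taken from the log line above; the macro name is illustrative. */
#define REASON_RESULT_NOT_READY 0x800u

enum log_level { LOG_DEBUG, LOG_ERROR };

/* Hypothetical summary of a failed "get output info" query. */
struct query_result {
	unsigned int fail_reason;	/* reason code reported by the firmware */
	bool stopping;			/* stream is being stopped or flushed */
};

/*
 * Decide how loudly to report the failure. Assumption: "result not ready"
 * during a stop or flush (seek, quit) is expected and only worth a debug
 * message; everything else stays an error.
 */
static enum log_level classify_output_info_failure(const struct query_result *res)
{
	assert(res);

	if (res->fail_reason == REASON_RESULT_NOT_READY && res->stopping)
		return LOG_DEBUG;

	return LOG_ERROR;
}
```

In a driver, the debug branch would map to dev_dbg() and the error branch
to dev_err(), so normal playback and seeking stay quiet.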

> 
> Sequence Change test
> =====================
> gst-launch-1.0 filesrc location=./switch_1080p_720p_240frames.h264 !
> h264parse ! v4l2h264dec ! filesink location=./h264_output_420.yuv
> Setting pipeline to PAUSED ...
> Pipeline is PREROLLING ...
> Redistribute latency...
> Redistribute latency...
> Pipeline is PREROLLED ...
> Setting pipeline to PLAYING ...
> Redistribute latency...
> New clock: GstSystemClock
> Got EOS from element "pipeline0".
> Execution ended after 0:00:05.772414400
> Setting pipeline to NULL ...
> Freeing pipeline ...

I did a test of my own here, and did get a kernel splat. The warning
indicates that the state machine is no longer respected. This needs to
be addressed in v2; we added these checks because the locking is bound to
legal use of the state machine.

[  401.018648] Execution of a job in state STOP illegal.
[  401.023761] WARNING: CPU: 0 PID: 635 at
drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c:1715
wave5_vpu_dec_device_run+0x448/0x83c [wave5]
[  401.036890] Modules linked in: rfkill ip6table_filter ip6_tables
iptable_filter ip_tables x_tables rpmsg_ctrl rpmsg_char
phy_cadence_torrent rtc_tps6594 tps6594_esm tps6594_regulator
tps6594_pfsm pinctrl_tps6594 gpio_regmap ti_am335x_adc cdns3 kfifo_buf
cdns_usb_common qrtr mux_gpio omap_mailbox phy_j721e_wiz wave5
phy_can_transceiver ti_k3_r5_remoteproc v4l2_mem2mem
videobuf2_dma_contig videobuf2_memops tps6594_i2c videobuf2_v4l2
tps6594_core at24 k3_j72xx_bandgap sa2ul videodev m_can_platform
videobuf2_common authenc m_can ti_k3_dsp_remoteproc mc cdns3_ti
ti_am335x_tscadc can_dev rti_wdt fuse drm backlight dm_mod ipv6
[  401.091795] CPU: 0 UID: 1000 PID: 635 Comm: h264parse0:sink Tainted:
G        W           6.15.0-rc1-jacinto+ #1 PREEMPT 
[  401.102731] Tainted: [W]=WARN
[  401.105687] Hardware name: Texas Instruments J721S2 EVM (DT)
[  401.111330] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS
BTYPE=--)
[  401.118277] pc : wave5_vpu_dec_device_run+0x448/0x83c [wave5]
[  401.124015] lr : wave5_vpu_dec_device_run+0x448/0x83c [wave5]
[  401.129749] sp : ffff800084a0ba40
[  401.133051] x29: ffff800084a0baf0 x28: ffff00080a64c130 x27:
ffff800084a0bc08
[  401.140175] x26: 00000000c058560f x25: 0000000000000000 x24:
ffff00081331c000
[  401.147297] x23: ffff00081331c010 x22: ffff00080ed059a8 x21:
ffff00081331dbc0
[  401.154418] x20: ffff00081331d000 x19: 0000000000000000 x18:
0000000000000006
[  401.161540] x17: 0000000000000000 x16: 0000000000000000 x15:
072e076c07610767
[  401.168662] x14: ffff00080ee0b500 x13: 072e076c07610767 x12:
ffff800082107128
[  401.175783] x11: 0000000000000058 x10: 0000000000000018 x9 :
ffff00080ee0b500
[  401.182907] x8 : 00000000000004c7 x7 : ffff00080ee0b500 x6 :
ffff80008215f128
[  401.190028] x5 : 0000000000000000 x4 : 0000000000000000 x3 :
0000000000000001
[  401.197149] x2 : 0000000000000000 x1 : 0000000000000000 x0 :
ffff00080ee0b480
[  401.204272] Call trace:
[  401.206709]  wave5_vpu_dec_device_run+0x448/0x83c [wave5] (P)
[  401.212448]  v4l2_m2m_try_run+0x84/0x134 [v4l2_mem2mem]
[  401.217667]  v4l2_m2m_qbuf+0x184/0x240 [v4l2_mem2mem]
[  401.222709]  v4l2_m2m_ioctl_qbuf+0x18/0x4e0 [v4l2_mem2mem]
[  401.228184]  v4l_qbuf+0x48/0x70 [videodev]
[  401.232292]  __video_do_ioctl+0x40c/0x4a0 [videodev]
[  401.237260]  video_usercopy+0x1e0/0x688 [videodev]
[  401.242054]  video_ioctl2+0x18/0x38 [videodev]
[  401.246500]  v4l2_ioctl+0x40/0x60 [videodev]
[  401.250774]  __arm64_sys_ioctl+0xb4/0xf4
[  401.254690]  invoke_syscall+0x48/0x104
[  401.258433]  el0_svc_common.constprop.0+0x40/0xe0
[  401.263125]  do_el0_svc+0x1c/0x28
[  401.266431]  el0_svc+0x30/0xcc
[  401.269480]  el0t_64_sync_handler+0x10c/0x138
[  401.273827]  el0t_64_sync+0x198/0x19c
[  401.277480] ---[ end trace 0000000000000000 ]---
[  401.316876] ------------[ cut here ]------------
[  401.321523] Execution of a job in state STOP illegal.
[  401.326992] WARNING: CPU: 0 PID: 635 at
drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c:1715
wave5_vpu_dec_device_run+0x448/0x83c [wave5]


To test:
gst-launch-1.0 videotestsrc num-buffers=3 ! video/x-raw,format=NV12,width=320,height=240 ! v4l2h264enc ! filesink location=340x240.h264
gst-launch-1.0 videotestsrc num-buffers=3 ! video/x-raw,format=NV12,width=640,height=480 ! v4l2h264enc ! filesink location=640x480.h264
cat 340x240.h264 640x480.h264 340x240.h264 640x480.h264  > drc.h264
gst-launch-1.0 filesrc location=drc.h264  ! parsebin ! v4l2h264dec ! fakevideosink -v


> 
> Change since v0:
> ===================
> * For [PATCH v1 2/7] media: chips-media: wave5: Improve performance
> of decoder
>  - separates the previous patch to a few patches
> 
> * For [PATCH v1 3/7] media: chips-media: wave5: Fix not to be closed
>  - separated from the previous patch of performance improvement of
>    decoder
> 
> * For [PATCH v1 4/7] media: chips-media: wave5: Use spinlock whenever
> state is changed
>  - separated from the previous patch of performance improvement of
>    decoder
> 
> * For [PATCH v1 5/7] media: chips-media: wave5: Fix not to free
> resources normally when
>     instance was destroyed
>  - separated from the previous patch of performance improvement of
>    decoder
> 
> * For [PATCH v1 7/7] media: chips-media: wave5: Fix SError of kernel
> panic when closed
>  - separated from the previous patch of performance improvement of
>    decoder
> 
> Jackson Lee (7):
>   media: chips-media: wave5: Fix Null reference while testing fluster
>   media: chips-media: wave5: Improve performance of decoder
>   media: chips-media: wave5: Fix not to be closed
>   media: chips-media: wave5: Use spinlock whenever state is changed
>   media: chips-media: wave5: Fix not to free resources normally when
>     instance was destroyed
>   media: chips-media: wave5: Reduce high CPU load
>   media: chips-media: wave5: Fix SError of kernel panic when closed
> 
>  .../platform/chips-media/wave5/wave5-helper.c |  10 +-
>  .../chips-media/wave5/wave5-vpu-dec.c         | 116 +++++++++++-----
> --
>  .../chips-media/wave5/wave5-vpu-enc.c         |   8 +-
>  .../platform/chips-media/wave5/wave5-vpu.c    |  70 +++++++++--
>  .../platform/chips-media/wave5/wave5-vpuapi.c |  36 +++---
>  .../platform/chips-media/wave5/wave5-vpuapi.h |  10 ++
>  .../chips-media/wave5/wave5-vpuconfig.h       |   1 +
>  7 files changed, 179 insertions(+), 72 deletions(-)

-- 
Nicolas Dufresne
Principal Engineer at Collabora

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [RESEND PATCH v1 0/7] Performance improvement of decoder
  2025-04-11 15:47     ` Nicolas Dufresne
@ 2025-04-14  1:44       ` jackson.lee
  0 siblings, 0 replies; 18+ messages in thread
From: jackson.lee @ 2025-04-14  1:44 UTC (permalink / raw)
  To: Nicolas Dufresne, mchehab@kernel.org, hverkuil-cisco@xs4all.nl,
	sebastian.fricke@collabora.com, bob.beckett@collabora.com,
	dafna.hirschfeld@collabora.com
  Cc: linux-media@vger.kernel.org, linux-kernel@vger.kernel.org,
	lafley.kim, b-brnich@ti.com, hverkuil@xs4all.nl, Nas Chung

Hi Nicolas

Thanks for your advice.

> -----Original Message-----
> From: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> Sent: Saturday, April 12, 2025 12:48 AM
> To: jackson.lee <jackson.lee@chipsnmedia.com>; mchehab@kernel.org;
> hverkuil-cisco@xs4all.nl; sebastian.fricke@collabora.com;
> bob.beckett@collabora.com; dafna.hirschfeld@collabora.com
> Cc: linux-media@vger.kernel.org; linux-kernel@vger.kernel.org; lafley.kim
> <lafley.kim@chipsnmedia.com>; b-brnich@ti.com; hverkuil@xs4all.nl; Nas
> Chung <nas.chung@chipsnmedia.com>
> Subject: Re: [RESEND PATCH v1 0/7] Performance improvement of decoder
> 
> Hi,
> 
> Le vendredi 11 avril 2025 à 00:49 +0000, jackson.lee a écrit :
> > Hi Nicolas
> >
> > There was a warning message from patchwork.
> > The hash of "Improve performance of decoder" from this patch series
> > was used in Fixes tags, so I removed the Fixes tags from the commit messages.
> 
> I'd prefer if you simply commented back that this needs to be removed before
> merging. If you get enough comments for a new version, you can do it; if
> it's the only thing preventing the merge, I will fix it.
> 
> Resends are mostly used when patches have had no comments or feedback
> for a long period of time: when the patches have been forgotten and are
> not expected to hold any edits.
> 
> regards,
> Nicolas
> 
> >
> > thanks
> >
> >
> >
> > > -----Original Message-----
> > > From: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> > > Sent: Thursday, April 10, 2025 9:39 PM
> > > To: jackson.lee <jackson.lee@chipsnmedia.com>; mchehab@kernel.org;
> > > hverkuil-cisco@xs4all.nl; sebastian.fricke@collabora.com;
> > > bob.beckett@collabora.com; dafna.hirschfeld@collabora.com
> > > Cc: linux-media@vger.kernel.org; linux-kernel@vger.kernel.org;
> > > lafley.kim <lafley.kim@chipsnmedia.com>; b-brnich@ti.com;
> > > hverkuil@xs4all.nl; Nas Chung <nas.chung@chipsnmedia.com>
> > > Subject: Re: [RESEND PATCH v1 0/7] Performance improvement of
> > > decoder
> > >
> > > Hi,
> > >
> > > Le jeudi 10 avril 2025 à 12:39 +0900, Jackson.lee a écrit :
> > > > From: Jackson Lee <jackson.lee@chipsnmedia.com>
> > > >
> > > > v4l2-compliance results:
> > > > ========================
> > >
> > > What's the reason for a resend within 3h?
> > >
> > > Nicolas
> > >
> > > >
> > > > v4l2-compliance 1.28.1-5233, 64 bits, 64-bit time_t
> > > >
> > > > Buffer ioctls:
> > > >                 warn: v4l2-test-buffers.cpp(693):
> > > > VIDIOC_CREATE_BUFS
> > > > not supported
> > > >                 warn: v4l2-test-buffers.cpp(693):
> > > > VIDIOC_CREATE_BUFS
> > > > not supported
> > > >         test VIDIOC_REQBUFS/CREATE_BUFS/QUERYBUF: OK
> > > >         test CREATE_BUFS maximum buffers: OK
> > > >         test VIDIOC_EXPBUF: OK
> > > >         test Requests: OK (Not Supported)
> > > >
> > > > Total for wave5-dec device /dev/video0: 46, Succeeded: 46,
> > > > Failed: 0,
> > > > Warnings: 2
> > > > Total for wave5-enc device /dev/video1: 46, Succeeded: 46,
> > > > Failed: 0,
> > > > Warnings: 0
> > > >
> > > > Fluster test results:
> > > > =====================
> > > >
> > > > Running test suite JCT-VC-HEVC_V1 with decoder GStreamer-H.265-
> > > > V4L2-
> > > > Gst1.0
> > > > Using 3 parallel job(s)
> > > > Ran 133/147 tests successfully               in 41.629 secs
> > > >
> > > > (1 test fails because of not supporting to parse multi frames, 1
> > > > test fails because of a missing frame and slight corruption,
> > > >  2 tests fail because of sizes which are incompatible with the IP,
> > > > 11 tests fail because of unsupported 10 bit format)
> > > >
> > > >
> > > > Running test suite JVT-AVC_V1 with decoder GStreamer-H.264-V4L2-
> > > > Gst1.0
> > > > Using 3 parallel job(s)
> > > > Ran 78/135 tests successfully               in 44.578 secs
> > > >
> > > > (57 fail because the hardware is unable to decode  MBAFF / FMO /
> > > > Field / Extended profile streams.)
> > > >
> > > > Seek test
> > > > =====================
> > > > 1. gst-play-1.0 seek.264
> > > > 2. this will use waylandsink since gst-play-1.0 uses playbin.
> > > >    if you don't want to hook up display,
> > > >    you can run gst-play-1.0 seek.264 --videosink=fakevideosink
> > > > instead
> > > > 3. Let pipeline run for 2-3 seconds
> > > > 4. press SPACE key to pause
> > > > 5. press 0 to reset
> > > > press SPACE to start again
> > > >
> > > > gst-play-1.0 seek.264 --videosink=fakevideosink
> > > > Press 'k' to see a list of keyboard shortcuts.
> > > > Now playing /root/seek.264
> > > > Redistribute latency...
> > > > Redistribute latency...
> > > > Redistribute latency...
> > > > Redistribute latency...
> > > > Redistribute latency...aused
> > > > 0:00:09.9 / 0:00:09.7
> > > > Reached end of play list.
> > > >
> > > > Sequence Change test
> > > > =====================
> > > > gst-launch-1.0 filesrc
> > > > location=./switch_1080p_720p_240frames.h264 !
> > > > h264parse ! v4l2h264dec ! filesink location=./h264_output_420.yuv
> > > > Setting pipeline to PAUSED ...
> > > > Pipeline is PREROLLING ...
> > > > Redistribute latency...
> > > > Redistribute latency...
> > > > Pipeline is PREROLLED ...
> > > > Setting pipeline to PLAYING ...
> > > > Redistribute latency...
> > > > New clock: GstSystemClock
> > > > Got EOS from element "pipeline0".
> > > > Execution ended after 0:00:05.772414400
> > > > Setting pipeline to NULL ...
> > > > Freeing pipeline ...
> > > >
> > > > Change since v0:
> > > > ===================
> > > > * For [PATCH v1 2/7] media: chips-media: wave5: Improve
> > > > performance of decoder
> > > >  - separates the previous patch to a few patches
> > > >
> > > > * For [PATCH v1 3/7] media: chips-media: wave5: Fix not to be
> > > > closed
> > > >  - separated from the previous patch of performance improvement of
> > > >    decoder
> > > >
> > > > * For [PATCH v1 4/7] media: chips-media: wave5: Use spinlock
> > > > whenever state is changed
> > > >  - separated from the previous patch of performance improvement of
> > > >    decoder
> > > >
> > > > * For [PATCH v1 5/7] media: chips-media: wave5: Fix not to free
> > > > resources normally when
> > > >     instance was destroyed
> > > >  - separated from the previous patch of performance improvement of
> > > >    decoder
> > > >
> > > > * For [PATCH v1 7/7] media: chips-media: wave5: Fix SError of
> > > > kernel panic when closed
> > > >  - separated from the previous patch of performance improvement of
> > > >    decoder
> > > >
> > > > Jackson Lee (7):
> > > >   media: chips-media: wave5: Fix Null reference while testing
> > > > fluster
> > > >   media: chips-media: wave5: Improve performance of decoder
> > > >   media: chips-media: wave5: Fix not to be closed
> > > >   media: chips-media: wave5: Use spinlock whenever state is
> > > > changed
> > > >   media: chips-media: wave5: Fix not to free resources normally
> > > > when
> > > >     instance was destroyed
> > > >   media: chips-media: wave5: Reduce high CPU load
> > > >   media: chips-media: wave5: Fix SError of kernel panic when
> > > > closed
> > > >
> > > >  .../platform/chips-media/wave5/wave5-helper.c |  10 +-
> > > >  .../chips-media/wave5/wave5-vpu-dec.c         | 116 +++++++++++-
> > > > ----
> > > > --
> > > >  .../chips-media/wave5/wave5-vpu-enc.c         |   8 +-
> > > >  .../platform/chips-media/wave5/wave5-vpu.c    |  70 +++++++++--
> > > >  .../platform/chips-media/wave5/wave5-vpuapi.c |  36 +++---
> > > >  .../platform/chips-media/wave5/wave5-vpuapi.h |  10 ++
> > > >  .../chips-media/wave5/wave5-vpuconfig.h       |   1 +
> > > >  7 files changed, 179 insertions(+), 72 deletions(-)
> > >
> > > --
> > > Nicolas Dufresne
> > > Principal Engineer at Collabora
> 
> --
> Nicolas Dufresne
> Principal Engineer at Collabora

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [RESEND PATCH v1 0/7] Performance improvement of decoder
  2025-04-11 16:57 ` Nicolas Dufresne
@ 2025-04-14  2:04   ` jackson.lee
  2025-04-14 13:28     ` Nicolas Dufresne
  2025-05-07  4:57   ` jackson.lee
  1 sibling, 1 reply; 18+ messages in thread
From: jackson.lee @ 2025-04-14  2:04 UTC (permalink / raw)
  To: Nicolas Dufresne, mchehab@kernel.org, hverkuil-cisco@xs4all.nl,
	bob.beckett@collabora.com
  Cc: linux-media@vger.kernel.org, linux-kernel@vger.kernel.org,
	lafley.kim, b-brnich@ti.com, hverkuil@xs4all.nl, Nas Chung

Hi Nicolas

> -----Original Message-----
> From: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> Sent: Saturday, April 12, 2025 1:58 AM
> To: jackson.lee <jackson.lee@chipsnmedia.com>; mchehab@kernel.org;
> hverkuil-cisco@xs4all.nl; sebastian.fricke@collabora.com;
> bob.beckett@collabora.com; dafna.hirschfeld@collabora.com
> Cc: linux-media@vger.kernel.org; linux-kernel@vger.kernel.org; lafley.kim
> <lafley.kim@chipsnmedia.com>; b-brnich@ti.com; hverkuil@xs4all.nl; Nas
> Chung <nas.chung@chipsnmedia.com>
> Subject: Re: [RESEND PATCH v1 0/7] Performance improvement of decoder
> 
> Hi Jackson,
> 
> Le jeudi 10 avril 2025 à 12:39 +0900, Jackson.lee a écrit :
> > From: Jackson Lee <jackson.lee@chipsnmedia.com>
> >
> > v4l2-compliance results:
> > ========================
> >
> > v4l2-compliance 1.28.1-5233, 64 bits, 64-bit time_t
> >
> > Buffer ioctls:
> >                 warn: v4l2-test-buffers.cpp(693): VIDIOC_CREATE_BUFS
> > not supported
> >                 warn: v4l2-test-buffers.cpp(693): VIDIOC_CREATE_BUFS
> > not supported
> >         test VIDIOC_REQBUFS/CREATE_BUFS/QUERYBUF: OK
> >         test CREATE_BUFS maximum buffers: OK
> >         test VIDIOC_EXPBUF: OK
> >         test Requests: OK (Not Supported)
> >
> > Total for wave5-dec device /dev/video0: 46, Succeeded: 46, Failed: 0,
> > Warnings: 2
> > Total for wave5-enc device /dev/video1: 46, Succeeded: 46, Failed: 0,
> > Warnings: 0
> >
> > Fluster test results:
> > =====================
> >
> > Running test suite JCT-VC-HEVC_V1 with decoder GStreamer-H.265-V4L2-
> > Gst1.0
> > Using 3 parallel job(s)
> > Ran 133/147 tests successfully               in 41.629 secs
> 
> Same results here.
> 
> >
> > (1 test fails because of not supporting to parse multi frames, 1 test
> > fails because of a missing frame and slight corruption,
> >  2 tests fail because of sizes which are incompatible with the IP, 11
> > tests fail because of unsupported 10 bit format)
> >
> >
> > Running test suite JVT-AVC_V1 with decoder GStreamer-H.264-V4L2-
> > Gst1.0
> > Using 3 parallel job(s)
> > Ran 78/135 tests successfully               in 44.578 secs
> >
> > (57 fail because the hardware is unable to decode  MBAFF / FMO / Field
> > / Extended profile streams.)
> 
> Same results here. There is also the JVT-FR-EXT test suite now, which you
> should include. I get 23/69 here, without incident, but I did not analyze
> the individual results; I'll leave that to you. A quick look shows, though,
> that YUV422 does not work anymore.
> 
> >
> > Seek test
> > =====================
> > 1. gst-play-1.0 seek.264
> > 2. this will use waylandsink since gst-play-1.0 uses playbin.
> >    if you don't want to hook up display,
> >    you can run gst-play-1.0 seek.264 --videosink=fakevideosink instead
> > 3. Let pipeline run for 2-3 seconds
> > 4. press SPACE key to pause
> > 5. press 0 to reset
> > press SPACE to start again
> >
> > gst-play-1.0 seek.264 --videosink=fakevideosink
> > Press 'k' to see a list of keyboard shortcuts.
> > Now playing /root/seek.264
> > Redistribute latency...
> > Redistribute latency...
> > Redistribute latency...
> > Redistribute latency...
> > Redistribute latency...aused
> > 0:00:09.9 / 0:00:09.7
> > Reached end of play list.
> 
> So, I managed to resurrect my device. One side effect of this series is
> that the driver is no longer silent in normal cases. Pressing 'q'
> while playing, or seeking, seems to fill the kernel logs with these two
> errors.
> 
> [ 5037.457307] vdec 4210000.video-codec: wave5_vpu_dec_finish_decode:
> could not get output info.
> [ 5037.457436] vdec 4210000.video-codec:
> wave5_vpu_firmware_command_queue_error_check: result not ready: 0x800
> 
> This needs fixing for the next version. The condition that makes these
> non-errors needs to be tested so that we don't get spammed anymore. They also
> occur while running fluster.
> 
> >
> > Sequence Change test
> > =====================
> > gst-launch-1.0 filesrc location=./switch_1080p_720p_240frames.h264 !
> > h264parse ! v4l2h264dec ! filesink location=./h264_output_420.yuv
> > Setting pipeline to PAUSED ...
> > Pipeline is PREROLLING ...
> > Redistribute latency...
> > Redistribute latency...
> > Pipeline is PREROLLED ...
> > Setting pipeline to PLAYING ...
> > Redistribute latency...
> > New clock: GstSystemClock
> > Got EOS from element "pipeline0".
> > Execution ended after 0:00:05.772414400
> > Setting pipeline to NULL ...
> > Freeing pipeline ...
> 
> I did a test of my own here, and did get a kernel splat. The warning
> indicates that the state machine is no longer respected. This needs to be
> addressed in v2; we added these checks because the locking is bound to legal
> use of the state machine.


Since applying the performance patch, device_run and wave5_vpu_dec_finish_decode are no longer synchronized.
How about removing this warning message?

> 
> [  401.018648] Execution of a job in state STOP illegal.
> [  401.023761] WARNING: CPU: 0 PID: 635 at
> drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c:1715
> wave5_vpu_dec_device_run+0x448/0x83c [wave5]
> [  401.036890] Modules linked in: rfkill ip6table_filter ip6_tables
> iptable_filter ip_tables x_tables rpmsg_ctrl rpmsg_char
> phy_cadence_torrent rtc_tps6594 tps6594_esm tps6594_regulator
> tps6594_pfsm pinctrl_tps6594 gpio_regmap ti_am335x_adc cdns3 kfifo_buf
> cdns_usb_common qrtr mux_gpio omap_mailbox phy_j721e_wiz wave5
> phy_can_transceiver ti_k3_r5_remoteproc v4l2_mem2mem
> videobuf2_dma_contig videobuf2_memops tps6594_i2c videobuf2_v4l2
> tps6594_core at24 k3_j72xx_bandgap sa2ul videodev m_can_platform
> videobuf2_common authenc m_can ti_k3_dsp_remoteproc mc cdns3_ti
> ti_am335x_tscadc can_dev rti_wdt fuse drm backlight dm_mod ipv6
> [  401.091795] CPU: 0 UID: 1000 PID: 635 Comm: h264parse0:sink Tainted:
> G        W           6.15.0-rc1-jacinto+ #1 PREEMPT
> [  401.102731] Tainted: [W]=WARN
> [  401.105687] Hardware name: Texas Instruments J721S2 EVM (DT)
> [  401.111330] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS
> BTYPE=--)
> [  401.118277] pc : wave5_vpu_dec_device_run+0x448/0x83c [wave5]
> [  401.124015] lr : wave5_vpu_dec_device_run+0x448/0x83c [wave5]
> [  401.129749] sp : ffff800084a0ba40
> [  401.133051] x29: ffff800084a0baf0 x28: ffff00080a64c130 x27:
> ffff800084a0bc08
> [  401.140175] x26: 00000000c058560f x25: 0000000000000000 x24:
> ffff00081331c000
> [  401.147297] x23: ffff00081331c010 x22: ffff00080ed059a8 x21:
> ffff00081331dbc0
> [  401.154418] x20: ffff00081331d000 x19: 0000000000000000 x18:
> 0000000000000006
> [  401.161540] x17: 0000000000000000 x16: 0000000000000000 x15:
> 072e076c07610767
> [  401.168662] x14: ffff00080ee0b500 x13: 072e076c07610767 x12:
> ffff800082107128
> [  401.175783] x11: 0000000000000058 x10: 0000000000000018 x9 :
> ffff00080ee0b500
> [  401.182907] x8 : 00000000000004c7 x7 : ffff00080ee0b500 x6 :
> ffff80008215f128
> [  401.190028] x5 : 0000000000000000 x4 : 0000000000000000 x3 :
> 0000000000000001
> [  401.197149] x2 : 0000000000000000 x1 : 0000000000000000 x0 :
> ffff00080ee0b480
> [  401.204272] Call trace:
> [  401.206709]  wave5_vpu_dec_device_run+0x448/0x83c [wave5] (P)
> [  401.212448]  v4l2_m2m_try_run+0x84/0x134 [v4l2_mem2mem]
> [  401.217667]  v4l2_m2m_qbuf+0x184/0x240 [v4l2_mem2mem]
> [  401.222709]  v4l2_m2m_ioctl_qbuf+0x18/0x4e0 [v4l2_mem2mem]
> [  401.228184]  v4l_qbuf+0x48/0x70 [videodev]
> [  401.232292]  __video_do_ioctl+0x40c/0x4a0 [videodev]
> [  401.237260]  video_usercopy+0x1e0/0x688 [videodev]
> [  401.242054]  video_ioctl2+0x18/0x38 [videodev]
> [  401.246500]  v4l2_ioctl+0x40/0x60 [videodev]
> [  401.250774]  __arm64_sys_ioctl+0xb4/0xf4
> [  401.254690]  invoke_syscall+0x48/0x104
> [  401.258433]  el0_svc_common.constprop.0+0x40/0xe0
> [  401.263125]  do_el0_svc+0x1c/0x28
> [  401.266431]  el0_svc+0x30/0xcc
> [  401.269480]  el0t_64_sync_handler+0x10c/0x138
> [  401.273827]  el0t_64_sync+0x198/0x19c
> [  401.277480] ---[ end trace 0000000000000000 ]---
> [  401.316876] ------------[ cut here ]------------
> [  401.321523] Execution of a job in state STOP illegal.
> [  401.326992] WARNING: CPU: 0 PID: 635 at
> drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c:1715
> wave5_vpu_dec_device_run+0x448/0x83c [wave5]
> 
> 
> To test:
> gst-launch-1.0 videotestsrc num-buffers=3 ! video/x-raw,format=NV12,width=320,height=240 ! v4l2h264enc ! filesink location=340x240.h264
> gst-launch-1.0 videotestsrc num-buffers=3 ! video/x-raw,format=NV12,width=640,height=480 ! v4l2h264enc ! filesink location=640x480.h264
> cat 340x240.h264 640x480.h264 340x240.h264 640x480.h264  > drc.h264
> gst-launch-1.0 filesrc location=drc.h264  ! parsebin ! v4l2h264dec ! fakevideosink -v
> 
> 
> >
> > Change since v0:
> > ===================
> > * For [PATCH v1 2/7] media: chips-media: wave5: Improve performance of
> > decoder
> >  - separates the previous patch to a few patches
> >
> > * For [PATCH v1 3/7] media: chips-media: wave5: Fix not to be closed
> >  - separated from the previous patch of performance improvement of
> >    decoder
> >
> > * For [PATCH v1 4/7] media: chips-media: wave5: Use spinlock whenever
> > state is changed
> >  - separated from the previous patch of performance improvement of
> >    decoder
> >
> > * For [PATCH v1 5/7] media: chips-media: wave5: Fix not to free
> > resources normally when
> >     instance was destroyed
> >  - separated from the previous patch of performance improvement of
> >    decoder
> >
> > * For [PATCH v1 7/7] media: chips-media: wave5: Fix SError of kernel
> > panic when closed
> >  - separated from the previous patch of performance improvement of
> >    decoder
> >
> > Jackson Lee (7):
> >   media: chips-media: wave5: Fix Null reference while testing fluster
> >   media: chips-media: wave5: Improve performance of decoder
> >   media: chips-media: wave5: Fix not to be closed
> >   media: chips-media: wave5: Use spinlock whenever state is changed
> >   media: chips-media: wave5: Fix not to free resources normally when
> >     instance was destroyed
> >   media: chips-media: wave5: Reduce high CPU load
> >   media: chips-media: wave5: Fix SError of kernel panic when closed
> >
> >  .../platform/chips-media/wave5/wave5-helper.c |  10 +-
> >  .../chips-media/wave5/wave5-vpu-dec.c         | 116 +++++++++++-----
> > --
> >  .../chips-media/wave5/wave5-vpu-enc.c         |   8 +-
> >  .../platform/chips-media/wave5/wave5-vpu.c    |  70 +++++++++--
> >  .../platform/chips-media/wave5/wave5-vpuapi.c |  36 +++---
> >  .../platform/chips-media/wave5/wave5-vpuapi.h |  10 ++
> >  .../chips-media/wave5/wave5-vpuconfig.h       |   1 +
> >  7 files changed, 179 insertions(+), 72 deletions(-)
> 
> --
> Nicolas Dufresne
> Principal Engineer at Collabora

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RESEND PATCH v1 0/7] Performance improvement of decoder
  2025-04-14  2:04   ` jackson.lee
@ 2025-04-14 13:28     ` Nicolas Dufresne
  2025-04-29  5:59       ` jackson.lee
  0 siblings, 1 reply; 18+ messages in thread
From: Nicolas Dufresne @ 2025-04-14 13:28 UTC (permalink / raw)
  To: jackson.lee, mchehab@kernel.org, hverkuil-cisco@xs4all.nl,
	bob.beckett@collabora.com
  Cc: linux-media@vger.kernel.org, linux-kernel@vger.kernel.org,
	lafley.kim, b-brnich@ti.com, hverkuil@xs4all.nl, Nas Chung

Jackson,

Le lundi 14 avril 2025 à 02:04 +0000, jackson.lee a écrit :
> > > Sequence Change test
> > > =====================
> > > gst-launch-1.0 filesrc location=./switch_1080p_720p_240frames.h264 !
> > > h264parse ! v4l2h264dec ! filesink location=./h264_output_420.yuv
> > > Setting pipeline to PAUSED ...
> > > Pipeline is PREROLLING ...
> > > Redistribute latency...
> > > Redistribute latency...
> > > Pipeline is PREROLLED ...
> > > Setting pipeline to PLAYING ...
> > > Redistribute latency...
> > > New clock: GstSystemClock
> > > Got EOS from element "pipeline0".
> > > Execution ended after 0:00:05.772414400 Setting pipeline to NULL ...
> > > Freeing pipeline ...
> > 
> > I did a test of my own here, and did get kernel splat. The warning
> > indicate that the state machine is no longer respected. This needs to be
> > address in v2, we added these check, since the locking is bound to legal
> > use of the state machine.
> 
> 
> Since applying the performance patch, device_run and wave5_vpu_dec_finish_decode is not synchronized any more.
> How about removing this warning message ?

I would like to understand how it makes sense that we've reached the STOP
state while the HW is still actively decoding. It does not feel robust
to me; perhaps you want to introduce a new state? Also, I'm concerned
about overall thread-safety now.
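
To make the "new state" idea concrete, here is a minimal userspace sketch. It
is not the wave5 driver's code: every identifier below is hypothetical, and
only the STOP state is actually named in this thread. It assumes an extra
draining state that sits between decoding and STOP, so a job that is still
in flight after EOS remains a legal use of the state machine.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical states; only STOP is named in this thread. */
enum sketch_state {
	SKETCH_DECODING,
	SKETCH_DRAINING,	/* assumed new state: EOS seen, HW still has work */
	SKETCH_STOP,
	SKETCH_STATE_COUNT,
};

/* allowed[from][to] is true when the transition is legal. */
static const bool allowed[SKETCH_STATE_COUNT][SKETCH_STATE_COUNT] = {
	[SKETCH_DECODING] = { [SKETCH_DRAINING] = true },
	[SKETCH_DRAINING] = { [SKETCH_STOP] = true },
	[SKETCH_STOP]     = { [SKETCH_DECODING] = true },
};

/* Check a transition against the table above. */
static bool sketch_transition_ok(enum sketch_state from, enum sketch_state to)
{
	assert(from < SKETCH_STATE_COUNT && to < SKETCH_STATE_COUNT);
	return allowed[from][to];
}

/* A job may run while decoding or draining, never once stopped. */
static bool sketch_job_allowed(enum sketch_state s)
{
	return s == SKETCH_DECODING || s == SKETCH_DRAINING;
}
```

With a table like this, going straight from decoding to STOP is rejected, so
STOP can only be reached once draining has completed.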

regards,

-- 
Nicolas Dufresne
Principal Engineer at Collabora

* RE: [RESEND PATCH v1 0/7] Performance improvement of decoder
  2025-04-14 13:28     ` Nicolas Dufresne
@ 2025-04-29  5:59       ` jackson.lee
  0 siblings, 0 replies; 18+ messages in thread
From: jackson.lee @ 2025-04-29  5:59 UTC (permalink / raw)
  To: Nicolas Dufresne, mchehab@kernel.org, hverkuil-cisco@xs4all.nl,
	bob.beckett@collabora.com
  Cc: linux-media@vger.kernel.org, linux-kernel@vger.kernel.org,
	lafley.kim, b-brnich@ti.com, hverkuil@xs4all.nl, Nas Chung

Hi Nicolas


> -----Original Message-----
> From: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> Sent: Monday, April 14, 2025 10:29 PM
> To: jackson.lee <jackson.lee@chipsnmedia.com>; mchehab@kernel.org;
> hverkuil-cisco@xs4all.nl; bob.beckett@collabora.com
> Cc: linux-media@vger.kernel.org; linux-kernel@vger.kernel.org; lafley.kim
> <lafley.kim@chipsnmedia.com>; b-brnich@ti.com; hverkuil@xs4all.nl; Nas
> Chung <nas.chung@chipsnmedia.com>
> Subject: Re: [RESEND PATCH v1 0/7] Performance improvement of decoder
> 
> Jackson,
> 
> Le lundi 14 avril 2025 à 02:04 +0000, jackson.lee a écrit :
> > > > Sequence Change test
> > > > =====================
> > > > gst-launch-1.0 filesrc location=./switch_1080p_720p_240frames.h264 !
> > > > h264parse ! v4l2h264dec ! filesink location=./h264_output_420.yuv
> > > > Setting pipeline to PAUSED ...
> > > > Pipeline is PREROLLING ...
> > > > Redistribute latency...
> > > > Redistribute latency...
> > > > Pipeline is PREROLLED ...
> > > > Setting pipeline to PLAYING ...
> > > > Redistribute latency...
> > > > New clock: GstSystemClock
> > > > Got EOS from element "pipeline0".
> > > > Execution ended after 0:00:05.772414400 Setting pipeline to NULL ...
> > > > Freeing pipeline ...
> > >
> > > I did a test of my own here, and did get kernel splat. The warning
> > > indicate that the state machine is no longer respected. This needs
> > > to be address in v2, we added these check, since the locking is
> > > bound to legal use of the state machine.
> >
> >
> > Since applying the performance patch, device_run and
> wave5_vpu_dec_finish_decode is not synchronized any more.
> > How about removing this warning message ?
> 
> I would like to understand how it makes sense the we've reached STOP state
> while the HW is still actively decoding. It does not feel robust to me,
> perhaps you want to introduce a new state ? Also, I'm concern with overall
> thread-safety now.


finish_decode and device_run used to be synchronized with each other but, as you know, they no longer are since the performance patch was applied.
So once finish_decode sets the device state to STOP, device_run can still be called before job_ready gets a chance to prevent it.

I think this is not a problem, because EOS has already been sent in finish_decode and the state has already been changed.
I don't think it breaks overall thread-safety.

So I can remove the warning message. What do you think?
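
The ordering described above can be modelled in userspace as follows. This is
only a sketch under stated assumptions, not the driver's implementation: all
names are hypothetical, and a C11 atomic_flag spinlock stands in for whatever
lock protects the instance state. It shows a submission path that re-checks
the state under the lock and skips a late job quietly instead of warning.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum demo_state { DEMO_RUNNING, DEMO_STOP };

struct demo_inst {
	atomic_flag lock;		/* stand-in for the state spinlock */
	enum demo_state state;
	unsigned int jobs_run;
	unsigned int jobs_skipped;
};

static void demo_lock(struct demo_inst *inst)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&inst->lock,
						 memory_order_acquire))
		;
}

static void demo_unlock(struct demo_inst *inst)
{
	atomic_flag_clear_explicit(&inst->lock, memory_order_release);
}

static void demo_init(struct demo_inst *inst)
{
	atomic_flag_clear(&inst->lock);
	inst->state = DEMO_RUNNING;
	inst->jobs_run = 0;
	inst->jobs_skipped = 0;
}

/* Completion path: the last frame is out, so mark the instance stopped. */
static void demo_finish_decode(struct demo_inst *inst)
{
	demo_lock(inst);
	inst->state = DEMO_STOP;
	demo_unlock(inst);
}

/* Submission path: returns true if a job was issued, false if skipped. */
static bool demo_device_run(struct demo_inst *inst)
{
	bool issued;

	demo_lock(inst);
	issued = inst->state != DEMO_STOP;
	if (issued)
		inst->jobs_run++;
	else
		inst->jobs_skipped++;	/* late call after STOP: no warning */
	demo_unlock(inst);
	return issued;
}
```

Calling demo_device_run() after demo_finish_decode() returns false and only
bumps the skip counter, which is the behaviour that dropping the warning
would amount to in this model.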


> 
> regards,
> 
> --
> Nicolas Dufresne
> Principal Engineer at Collabora

* RE: [RESEND PATCH v1 0/7] Performance improvement of decoder
  2025-04-11 16:57 ` Nicolas Dufresne
  2025-04-14  2:04   ` jackson.lee
@ 2025-05-07  4:57   ` jackson.lee
  2025-05-07 12:56     ` Nicolas Dufresne
  1 sibling, 1 reply; 18+ messages in thread
From: jackson.lee @ 2025-05-07  4:57 UTC (permalink / raw)
  To: Nicolas Dufresne, mchehab@kernel.org, hverkuil-cisco@xs4all.nl,
	sebastian.fricke@collabora.com, bob.beckett@collabora.com,
	dafna.hirschfeld@collabora.com
  Cc: linux-media@vger.kernel.org, linux-kernel@vger.kernel.org,
	lafley.kim, b-brnich@ti.com, hverkuil@xs4all.nl, Nas Chung

Hi Nicolas

> -----Original Message-----
> From: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> Sent: Saturday, April 12, 2025 1:58 AM
> To: jackson.lee <jackson.lee@chipsnmedia.com>; mchehab@kernel.org;
> hverkuil-cisco@xs4all.nl; sebastian.fricke@collabora.com;
> bob.beckett@collabora.com; dafna.hirschfeld@collabora.com
> Cc: linux-media@vger.kernel.org; linux-kernel@vger.kernel.org; lafley.kim
> <lafley.kim@chipsnmedia.com>; b-brnich@ti.com; hverkuil@xs4all.nl; Nas
> Chung <nas.chung@chipsnmedia.com>
> Subject: Re: [RESEND PATCH v1 0/7] Performance improvement of decoder
> 
> Hi Jackson,
> 
> Le jeudi 10 avril 2025 à 12:39 +0900, Jackson.lee a écrit :
> > From: Jackson Lee <jackson.lee@chipsnmedia.com>
> >
> > v4l2-compliance results:
> > ========================
> >
> > v4l2-compliance 1.28.1-5233, 64 bits, 64-bit time_t
> >
> > Buffer ioctls:
> >                 warn: v4l2-test-buffers.cpp(693): VIDIOC_CREATE_BUFS
> > not supported
> >                 warn: v4l2-test-buffers.cpp(693): VIDIOC_CREATE_BUFS
> > not supported
> >         test VIDIOC_REQBUFS/CREATE_BUFS/QUERYBUF: OK
> >         test CREATE_BUFS maximum buffers: OK
> >         test VIDIOC_EXPBUF: OK
> >         test Requests: OK (Not Supported)
> >
> > Total for wave5-dec device /dev/video0: 46, Succeeded: 46, Failed: 0,
> > Warnings: 2
> > Total for wave5-enc device /dev/video1: 46, Succeeded: 46, Failed: 0,
> > Warnings: 0
> >
> > Fluster test results:
> > =====================
> >
> > Running test suite JCT-VC-HEVC_V1 with decoder GStreamer-H.265-V4L2-
> > Gst1.0
> > Using 3 parallel job(s)
> > Ran 133/147 tests successfully               in 41.629 secs
> 
> Same results here.
> 
> >
> > (1 test fails because of not supporting to parse multi frames, 1 test
> > fails because of a missing frame and slight corruption,
> >  2 tests fail because of sizes which are incompatible with the IP, 11
> > tests fail because of unsupported 10 bit format)
> >
> >
> > Running test suite JVT-AVC_V1 with decoder GStreamer-H.264-V4L2-
> > Gst1.0
> > Using 3 parallel job(s)
> > Ran 78/135 tests successfully               in 44.578 secs
> >
> > (57 fail because the hardware is unable to decode  MBAFF / FMO / Field
> > / Extended profile streams.)
> 
> Same results here. There is also JVT-FR-EXT test suite now that you should
> include. 23/69 here, without incident, but I did not analyze the inner
> results, I'll leave that to you. Though, a quick looks shows that
> YUV422 does not work anymore.
> 
> >

- HPCAMOLQ_BRCM_B, HPCVMOLQ_BRCM_B: our firmware was modified so that these two streams pass.
  We will upstream the new firmware.

So our result is 25/69. The 44 failures are because we do not support field-encoded streams or 422-encoded streams.


Thanks.


> > Seek test
> > =====================
> > 1. gst-play-1.0 seek.264
> > 2. this will use waylandsink since gst-play-1.0 uses playbin.
> >    if you don't want to hook up display,
> >    you can run gst-play-1.0 seek.264 --videosink=fakevideosink instead
> > 3. Let pipeline run for 2-3 seconds 4. press SPACE key to pause 5.
> > press 0 to reset press SPACE to start again
> >
> > gst-play-1.0 seek.264 --videosink=fakevideosink Press 'k' to see a
> > list of keyboard shortcuts.
> > Now playing /root/seek.264
> > Redistribute latency...
> > Redistribute latency...
> > Redistribute latency...
> > Redistribute latency...
> > Redistribute latency...aused
> > 0:00:09.9 / 0:00:09.7
> > Reached end of play list.
> 
> So, I managed to resurrect my device. Once side effect of this series is
> that the driver is no longer silent in normal cases. Pretty 'q'
> while playing, or seeking seems to fill the kernel logs with these two
> error.
> 
> [ 5037.457307] vdec 4210000.video-codec: wave5_vpu_dec_finish_decode:
> could not get output info.
> [ 5037.457436] vdec 4210000.video-codec:
> wave5_vpu_firmware_command_queue_error_check: result not ready: 0x800
> 
> This needs fixing for the next version. The condition that makes these non
> error needs to be tested so that we don't get spammed anymore. They also
> occur while running fluster.
> 
> >
> > Sequence Change test
> > =====================
> > gst-launch-1.0 filesrc location=./switch_1080p_720p_240frames.h264 !
> > h264parse ! v4l2h264dec ! filesink location=./h264_output_420.yuv
> > Setting pipeline to PAUSED ...
> > Pipeline is PREROLLING ...
> > Redistribute latency...
> > Redistribute latency...
> > Pipeline is PREROLLED ...
> > Setting pipeline to PLAYING ...
> > Redistribute latency...
> > New clock: GstSystemClock
> > Got EOS from element "pipeline0".
> > Execution ended after 0:00:05.772414400 Setting pipeline to NULL ...
> > Freeing pipeline ...
> 
> I did a test of my own here, and did get kernel splat. The warning
> indicate that the state machine is no longer respected. This needs to be
> address in v2, we added these check, since the locking is bound to legal
> use of the state machine.
> 
> [  401.018648] Execution of a job in state STOP illegal.
> [  401.023761] WARNING: CPU: 0 PID: 635 at
> drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c:1715
> wave5_vpu_dec_device_run+0x448/0x83c [wave5] [  401.036890] Modules linked
> in: rfkill ip6table_filter ip6_tables iptable_filter ip_tables x_tables
> rpmsg_ctrl rpmsg_char phy_cadence_torrent rtc_tps6594 tps6594_esm
> tps6594_regulator tps6594_pfsm pinctrl_tps6594 gpio_regmap ti_am335x_adc
> cdns3 kfifo_buf cdns_usb_common qrtr mux_gpio omap_mailbox phy_j721e_wiz
> wave5 phy_can_transceiver ti_k3_r5_remoteproc v4l2_mem2mem
> videobuf2_dma_contig videobuf2_memops tps6594_i2c videobuf2_v4l2
> tps6594_core at24 k3_j72xx_bandgap sa2ul videodev m_can_platform
> videobuf2_common authenc m_can ti_k3_dsp_remoteproc mc cdns3_ti
> ti_am335x_tscadc can_dev rti_wdt fuse drm backlight dm_mod ipv6
> [  401.091795] CPU: 0 UID: 1000 PID: 635 Comm: h264parse0:sink Tainted:
> G        W           6.15.0-rc1-jacinto+ #1 PREEMPT
> [  401.102731] Tainted: [W]=WARN
> [  401.105687] Hardware name: Texas Instruments J721S2 EVM (DT)
> [  401.111330] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS
> BTYPE=--)
> [  401.118277] pc : wave5_vpu_dec_device_run+0x448/0x83c [wave5]
> [  401.124015] lr : wave5_vpu_dec_device_run+0x448/0x83c [wave5]
> [  401.129749] sp : ffff800084a0ba40 [  401.133051] x29: ffff800084a0baf0
> x28: ffff00080a64c130 x27:
> ffff800084a0bc08
> [  401.140175] x26: 00000000c058560f x25: 0000000000000000 x24:
> ffff00081331c000
> [  401.147297] x23: ffff00081331c010 x22: ffff00080ed059a8 x21:
> ffff00081331dbc0
> [  401.154418] x20: ffff00081331d000 x19: 0000000000000000 x18:
> 0000000000000006
> [  401.161540] x17: 0000000000000000 x16: 0000000000000000 x15:
> 072e076c07610767
> [  401.168662] x14: ffff00080ee0b500 x13: 072e076c07610767 x12:
> ffff800082107128
> [  401.175783] x11: 0000000000000058 x10: 0000000000000018 x9 :
> ffff00080ee0b500
> [  401.182907] x8 : 00000000000004c7 x7 : ffff00080ee0b500 x6 :
> ffff80008215f128
> [  401.190028] x5 : 0000000000000000 x4 : 0000000000000000 x3 :
> 0000000000000001
> [  401.197149] x2 : 0000000000000000 x1 : 0000000000000000 x0 :
> ffff00080ee0b480
> [  401.204272] Call trace:
> [  401.206709]  wave5_vpu_dec_device_run+0x448/0x83c [wave5] (P)
> [  401.212448]  v4l2_m2m_try_run+0x84/0x134 [v4l2_mem2mem] [  401.217667]
> v4l2_m2m_qbuf+0x184/0x240 [v4l2_mem2mem] [  401.222709]
> v4l2_m2m_ioctl_qbuf+0x18/0x4e0 [v4l2_mem2mem] [  401.228184]
> v4l_qbuf+0x48/0x70 [videodev] [  401.232292]  __video_do_ioctl+0x40c/0x4a0
> [videodev] [  401.237260]  video_usercopy+0x1e0/0x688 [videodev]
> [  401.242054]  video_ioctl2+0x18/0x38 [videodev] [  401.246500]
> v4l2_ioctl+0x40/0x60 [videodev] [  401.250774]  __arm64_sys_ioctl+0xb4/0xf4
> [  401.254690]  invoke_syscall+0x48/0x104 [  401.258433]
> el0_svc_common.constprop.0+0x40/0xe0
> [  401.263125]  do_el0_svc+0x1c/0x28
> [  401.266431]  el0_svc+0x30/0xcc
> [  401.269480]  el0t_64_sync_handler+0x10c/0x138 [  401.273827]
> el0t_64_sync+0x198/0x19c [  401.277480] ---[ end trace 0000000000000000 ]-
> -- [  401.316876] ------------[ cut here ]------------ [  401.321523]
> Execution of a job in state STOP illegal.
> [  401.326992] WARNING: CPU: 0 PID: 635 at
> drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c:1715
> wave5_vpu_dec_device_run+0x448/0x83c [wave5]
> 
> 
> To test:
> gst-launch-1.0 videotestsrc num-buffers=3 ! video/x-
> raw,format=NV12,width=320,height=240 ! v4l2h264enc ! filesink
> location=340x240.h264
> gst-launch-1.0 videotestsrc num-buffers=3 ! video/x-
> raw,format=NV12,width=640,height=480 ! v4l2h264enc ! filesink
> location=640x480.h264 cat 340x240.h264 640x480.h264 340x240.h264
> 640x480.h264  > drc.h264
> gst-launch-1.0 filesrc location=drc.h264  ! parsebin ! v4l2h264dec !
> fakevideosink -v
> 
> 
> >
> > Change since v0:
> > ===================
> > * For [PATCH v1 2/7] media: chips-media: wave5: Improve performance of
> > decoder
> >  - separates the previous patch to a few patches
> >
> > * For [PATCH v1 3/7] media: chips-media: wave5: Fix not to be closed
> >  - separated from the previous patch of performance improvement of
> >    decoder
> >
> > * For [PATCH v1 4/7] media: chips-media: wave5: Use spinlock whenever
> > state is changed
> >  - separated from the previous patch of performance improvement of
> >    decoder
> >
> > * For [PATCH v1 5/7] media: chips-media: wave5: Fix not to free
> > resources normally when
> >     instance was destroyed
> >  - separated from the previous patch of performance improvement of
> >    decoder
> >
> > * For [PATCH v1 7/7] media: chips-media: wave5: Fix SError of kernel
> > panic when closed
> >  - separated from the previous patch of performance improvement of
> >    decoder
> >
> > Jackson Lee (7):
> >   media: chips-media: wave5: Fix Null reference while testing fluster
> >   media: chips-media: wave5: Improve performance of decoder
> >   media: chips-media: wave5: Fix not to be closed
> >   media: chips-media: wave5: Use spinlock whenever state is changed
> >   media: chips-media: wave5: Fix not to free resources normally when
> >     instance was destroyed
> >   media: chips-media: wave5: Reduce high CPU load
> >   media: chips-media: wave5: Fix SError of kernel panic when closed
> >
> >  .../platform/chips-media/wave5/wave5-helper.c |  10 +-
> >  .../chips-media/wave5/wave5-vpu-dec.c         | 116 +++++++++++-----
> > --
> >  .../chips-media/wave5/wave5-vpu-enc.c         |   8 +-
> >  .../platform/chips-media/wave5/wave5-vpu.c    |  70 +++++++++--
> >  .../platform/chips-media/wave5/wave5-vpuapi.c |  36 +++---
> >  .../platform/chips-media/wave5/wave5-vpuapi.h |  10 ++
> >  .../chips-media/wave5/wave5-vpuconfig.h       |   1 +
> >  7 files changed, 179 insertions(+), 72 deletions(-)
> 
> --
> Nicolas Dufresne
> Principal Engineer at Collabora

* Re: [RESEND PATCH v1 0/7] Performance improvement of decoder
  2025-05-07  4:57   ` jackson.lee
@ 2025-05-07 12:56     ` Nicolas Dufresne
  0 siblings, 0 replies; 18+ messages in thread
From: Nicolas Dufresne @ 2025-05-07 12:56 UTC (permalink / raw)
  To: jackson.lee, mchehab@kernel.org, hverkuil-cisco@xs4all.nl,
	sebastian.fricke@collabora.com, bob.beckett@collabora.com,
	dafna.hirschfeld@collabora.com
  Cc: linux-media@vger.kernel.org, linux-kernel@vger.kernel.org,
	lafley.kim, b-brnich@ti.com, hverkuil@xs4all.nl, Nas Chung

Hi Jackson,

Le mercredi 07 mai 2025 à 04:57 +0000, jackson.lee a écrit :
> Hi Nicolas
> 
> > -----Original Message-----
> > From: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> > Sent: Saturday, April 12, 2025 1:58 AM
> > To: jackson.lee <jackson.lee@chipsnmedia.com>; mchehab@kernel.org;
> > hverkuil-cisco@xs4all.nl; sebastian.fricke@collabora.com;
> > bob.beckett@collabora.com; dafna.hirschfeld@collabora.com
> > Cc: linux-media@vger.kernel.org; linux-kernel@vger.kernel.org; lafley.kim
> > <lafley.kim@chipsnmedia.com>; b-brnich@ti.com; hverkuil@xs4all.nl; Nas
> > Chung <nas.chung@chipsnmedia.com>
> > Subject: Re: [RESEND PATCH v1 0/7] Performance improvement of decoder
> > 
> > Hi Jackson,
> > 
> > Le jeudi 10 avril 2025 à 12:39 +0900, Jackson.lee a écrit :
> > > From: Jackson Lee <jackson.lee@chipsnmedia.com>
> > > 
> > > v4l2-compliance results:
> > > ========================
> > > 
> > > v4l2-compliance 1.28.1-5233, 64 bits, 64-bit time_t
> > > 
> > > Buffer ioctls:
> > >                 warn: v4l2-test-buffers.cpp(693): VIDIOC_CREATE_BUFS
> > > not supported
> > >                 warn: v4l2-test-buffers.cpp(693): VIDIOC_CREATE_BUFS
> > > not supported
> > >         test VIDIOC_REQBUFS/CREATE_BUFS/QUERYBUF: OK
> > >         test CREATE_BUFS maximum buffers: OK
> > >         test VIDIOC_EXPBUF: OK
> > >         test Requests: OK (Not Supported)
> > > 
> > > Total for wave5-dec device /dev/video0: 46, Succeeded: 46, Failed: 0,
> > > Warnings: 2
> > > Total for wave5-enc device /dev/video1: 46, Succeeded: 46, Failed: 0,
> > > Warnings: 0
> > > 
> > > Fluster test results:
> > > =====================
> > > 
> > > Running test suite JCT-VC-HEVC_V1 with decoder GStreamer-H.265-V4L2-
> > > Gst1.0
> > > Using 3 parallel job(s)
> > > Ran 133/147 tests successfully               in 41.629 secs
> > 
> > Same results here.
> > 
> > > 
> > > (1 test fails because of not supporting to parse multi frames, 1 test
> > > fails because of a missing frame and slight corruption,
> > >  2 tests fail because of sizes which are incompatible with the IP, 11
> > > tests fail because of unsupported 10 bit format)
> > > 
> > > 
> > > Running test suite JVT-AVC_V1 with decoder GStreamer-H.264-V4L2-
> > > Gst1.0
> > > Using 3 parallel job(s)
> > > Ran 78/135 tests successfully               in 44.578 secs
> > > 
> > > (57 fail because the hardware is unable to decode  MBAFF / FMO / Field
> > > / Extended profile streams.)
> > 
> > Same results here. There is also JVT-FR-EXT test suite now that you should
> > include. 23/69 here, without incident, but I did not analyze the inner
> > results, I'll leave that to you. Though, a quick looks shows that
> > YUV422 does not work anymore.
> > 
> > > 
> 
> - HPCAMOLQ_BRCM_B , HPCVMOLQ_BRCM_B for these two streams, our firmware was modified to pass them.
>   We will do upstream for new firmware.
> 
> So our result is 25/69. Failure reason(44) is we do not support field encoded streams and 422 encoded streams.

Wonderful!

Nicolas


Thread overview: 18+ messages
2025-04-10  3:39 [RESEND PATCH v1 0/7] Performance improvement of decoder Jackson.lee
2025-04-10  3:39 ` [RESEND PATCH v1 1/7] media: chips-media: wave5: Fix Null reference while testing fluster Jackson.lee
2025-04-10  3:39 ` [RESEND PATCH v1 2/7] media: chips-media: wave5: Improve performance of decoder Jackson.lee
2025-04-10  3:39 ` [RESEND PATCH v1 3/7] media: chips-media: wave5: Fix not to be closed Jackson.lee
2025-04-10  3:39 ` [RESEND PATCH v1 4/7] media: chips-media: wave5: Use spinlock whenever state is changed Jackson.lee
2025-04-10  3:40 ` [RESEND PATCH v1 5/7] media: chips-media: wave5: Fix not to free resources normally when instance was destroyed Jackson.lee
2025-04-10  3:40 ` [RESEND PATCH v1 6/7] media: chips-media: wave5: Reduce high CPU load Jackson.lee
2025-04-10  3:40 ` [RESEND PATCH v1 7/7] media: chips-media: wave5: Fix SError of kernel panic when closed Jackson.lee
2025-04-10 12:39 ` [RESEND PATCH v1 0/7] Performance improvement of decoder Nicolas Dufresne
2025-04-11  0:49   ` jackson.lee
2025-04-11 15:47     ` Nicolas Dufresne
2025-04-14  1:44       ` jackson.lee
2025-04-11 16:57 ` Nicolas Dufresne
2025-04-14  2:04   ` jackson.lee
2025-04-14 13:28     ` Nicolas Dufresne
2025-04-29  5:59       ` jackson.lee
2025-05-07  4:57   ` jackson.lee
2025-05-07 12:56     ` Nicolas Dufresne
