linux-media.vger.kernel.org archive mirror
* [PATCH v7 0/4] media: pisp-be: Split jobs creation and scheduling
@ 2025-06-06 10:29 Jacopo Mondi
  2025-06-06 10:29 ` [PATCH v7 1/4] media: pisp_be: Drop reference to non-existing function Jacopo Mondi
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Jacopo Mondi @ 2025-06-06 10:29 UTC (permalink / raw)
  To: Naushir Patuck, Nick Hollinghurst, David Plowman, Dave Stevenson,
	Laurent Pinchart, Raspberry Pi Kernel Maintenance,
	Mauro Carvalho Chehab, Florian Fainelli,
	Broadcom internal kernel review list, Sakari Ailus, Hans Verkuil
  Cc: linux-media, linux-rpi-kernel, linux-arm-kernel, linux-kernel,
	Jacopo Mondi, stable

Currently the 'pispbe_schedule()' function does two things:

1) Tries to assemble a job by inspecting all the video node queues
   to make sure all the required buffers are available
2) Submit the job to the hardware

The pispbe_schedule() function is called at:

- video device start_streaming() time
- video device qbuf() time
- irq handler

As assembling a job requires inspecting all queues, it is a rather
time consuming operation which is better not run in IRQ context.

To avoid executing the time consuming job creation in interrupt
context, split the job creation and job scheduling in two distinct
operations. When a well-formed job is created, append it to the
newly introduced 'pispbe->job_queue' where it will be dequeued from
by the scheduling routine.

At start_streaming() and qbuf() time, immediately try to schedule a job
if one has been created, as the irq handler routine is only called when
a job has completed and we can't rely on it alone to schedule new
jobs.
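
In other words, the time consuming part (assembling a job) runs in
process context and only appends to a list, while the scheduling part
only dequeues and submits. A rough, self-contained sketch of that shape,
using placeholder names (my_dev, my_job) rather than the actual pisp_be
symbols (see patch 3/4 for the real implementation):

#include <linux/list.h>
#include <linux/slab.h>
#include <linux/spinlock.h>

struct my_dev {
	spinlock_t hw_lock;		/* protects job_queue and hw_busy */
	struct list_head job_queue;	/* fully formed jobs, ready to run */
	bool hw_busy;
};

struct my_job {
	struct list_head queue;
};

/* Process context only: allocate, assemble and enqueue a job. */
static int my_prepare_job(struct my_dev *dev)
{
	struct my_job *job = kzalloc(sizeof(*job), GFP_KERNEL);

	if (!job)
		return -ENOMEM;

	/* ... inspect the video node queues and collect the buffers ... */

	spin_lock_irq(&dev->hw_lock);
	list_add_tail(&job->queue, &dev->job_queue);
	spin_unlock_irq(&dev->hw_lock);

	return 0;
}

/* Safe to call from the IRQ handler: dequeue and submit only. */
static void my_schedule(struct my_dev *dev)
{
	struct my_job *job = NULL;
	unsigned long flags;

	spin_lock_irqsave(&dev->hw_lock, flags);
	if (!dev->hw_busy) {
		job = list_first_entry_or_null(&dev->job_queue,
					       struct my_job, queue);
		if (job) {
			list_del(&job->queue);
			dev->hw_busy = true;
		}
	}
	spin_unlock_irqrestore(&dev->hw_lock, flags);

	if (!job)
		return;

	/* ... program the hardware from the job descriptor ... */
	kfree(job);
}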

Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
---
Changes in v7:
- Rebased on media-committers/next
- Fix lockdep warning by using the proper spinlock_irq() primitive in
  pispbe_prepare_job() which can race with the IRQ handler
- Link to v6: https://lore.kernel.org/r/20240930-pispbe-mainline-split-jobs-handling-v6-v6-0-63d60f9dd10f@ideasonboard.com

v5->v6:
- Make the driver depend on PM
  - Simplify the probe() routine by using pm_runtime_
  - Remove suspend call from remove()

v4->v5:
- Use appropriate locking constructs:
  - spin_lock_irq() for pispbe_prepare_job() called from non irq context
  - spin_lock_irqsave() for pispbe_schedule() called from irq context
  - Remove hw_lock from ready_queue accesses in stop_streaming and
    start_streaming
  - Fix trivial indentation mistake in 4/4

v3->v4:
- Expand commit message in 2/4 to explain why removing validation in schedule()
  is safe
- Drop ready_lock spinlock
- Use the non-_irqsave variant of scoped_guard()
- Support !CONFIG_PM in 4/4 by calling the enable/disable routines directly
  and adjust pm_runtime usage as suggested by Laurent

v2->v3:
- Mark pispbe_runtime_resume() as __maybe_unused
- Add fixes tags where appropriate

v1->v2:
- Add two patches to address Laurent's comments separately
- use scoped_guard() when possible
- Add patch to fix runtime_pm imbalance

---
Jacopo Mondi (4):
      media: pisp_be: Drop reference to non-existing function
      media: pisp_be: Remove config validation from schedule()
      media: pisp_be: Split jobs creation and scheduling
      media: pisp_be: Fix pm_runtime underrun in probe

 drivers/media/platform/raspberrypi/pisp_be/Kconfig |   1 +
 .../media/platform/raspberrypi/pisp_be/pisp_be.c   | 187 ++++++++++-----------
 2 files changed, 90 insertions(+), 98 deletions(-)
---
base-commit: 5e1ff2314797bf53636468a97719a8222deca9ae
change-id: 20240930-pispbe-mainline-split-jobs-handling-v6-15dc16e11e3a

Best regards,
-- 
Jacopo Mondi <jacopo.mondi@ideasonboard.com>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v7 1/4] media: pisp_be: Drop reference to non-existing function
  2025-06-06 10:29 [PATCH v7 0/4] media: pisp-be: Split jobs creation and scheduling Jacopo Mondi
@ 2025-06-06 10:29 ` Jacopo Mondi
  2025-06-13  8:00   ` Naushir Patuck
  2025-06-06 10:29 ` [PATCH v7 2/4] media: pisp_be: Remove config validation from schedule() Jacopo Mondi
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 12+ messages in thread
From: Jacopo Mondi @ 2025-06-06 10:29 UTC (permalink / raw)
  To: Naushir Patuck, Nick Hollinghurst, David Plowman, Dave Stevenson,
	Laurent Pinchart, Raspberry Pi Kernel Maintenance,
	Mauro Carvalho Chehab, Florian Fainelli,
	Broadcom internal kernel review list, Sakari Ailus, Hans Verkuil
  Cc: linux-media, linux-rpi-kernel, linux-arm-kernel, linux-kernel,
	Jacopo Mondi

A comment in the pisp_be driver references the
pispbe_schedule_internal() function which doesn't exist.

Drop it.

Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
 drivers/media/platform/raspberrypi/pisp_be/pisp_be.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c b/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
index 7596ae1f7de6..b1449245f394 100644
--- a/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
+++ b/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
@@ -368,10 +368,7 @@ static void pispbe_xlate_addrs(struct pispbe_dev *pispbe,
 	ret = pispbe_get_planes_addr(addrs, buf[MAIN_INPUT_NODE],
 				     &pispbe->node[MAIN_INPUT_NODE]);
 	if (ret <= 0) {
-		/*
-		 * This shouldn't happen; pispbe_schedule_internal should insist
-		 * on an input.
-		 */
+		/* Shouldn't happen, we have validated an input is available. */
 		dev_warn(pispbe->dev, "ISP-BE missing input\n");
 		hw_en->bayer_enables = 0;
 		hw_en->rgb_enables = 0;

-- 
2.49.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v7 2/4] media: pisp_be: Remove config validation from schedule()
  2025-06-06 10:29 [PATCH v7 0/4] media: pisp-be: Split jobs creation and scheduling Jacopo Mondi
  2025-06-06 10:29 ` [PATCH v7 1/4] media: pisp_be: Drop reference to non-existing function Jacopo Mondi
@ 2025-06-06 10:29 ` Jacopo Mondi
  2025-06-13  8:31   ` Naushir Patuck
  2025-06-06 10:29 ` [PATCH v7 3/4] media: pisp_be: Split jobs creation and scheduling Jacopo Mondi
  2025-06-06 10:29 ` [PATCH v7 4/4] media: pisp_be: Fix pm_runtime underrun in probe Jacopo Mondi
  3 siblings, 1 reply; 12+ messages in thread
From: Jacopo Mondi @ 2025-06-06 10:29 UTC (permalink / raw)
  To: Naushir Patuck, Nick Hollinghurst, David Plowman, Dave Stevenson,
	Laurent Pinchart, Raspberry Pi Kernel Maintenance,
	Mauro Carvalho Chehab, Florian Fainelli,
	Broadcom internal kernel review list, Sakari Ailus, Hans Verkuil
  Cc: linux-media, linux-rpi-kernel, linux-arm-kernel, linux-kernel,
	Jacopo Mondi

The config parameters buffer is already validated in
pisp_be_validate_config() at .buf_prepare() time.

However some of the same validations are also performed at
pispbe_schedule() time. In particular the function checks that:

1) config.num_tiles is valid
2) At least one of the BAYER or RGB inputs is enabled

The input config validation is already performed in
pisp_be_validate_config() and while job.hw_enables is modified by
pispbe_xlate_addrs(), the function only resets the input masks if

- there is no input buffer available, but pispbe_prepare_job() fails
  before calling pispbe_xlate_addrs() in this case
- bayer_enable is 0, but in this case rgb_enable is valid as guaranteed
  by pisp_be_validate_config()
- only outputs are reset in rgb_enable

For these reasons there is no need to repeat the check at
pispbe_schedule() time.

The num_tiles validation can be moved to pisp_be_validate_config() as
well. As num_tiles is a u32 it can't be < 0, so change the sanity
check accordingly.

Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
 .../media/platform/raspberrypi/pisp_be/pisp_be.c   | 25 ++++++----------------
 1 file changed, 7 insertions(+), 18 deletions(-)

diff --git a/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c b/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
index b1449245f394..92c452891d6c 100644
--- a/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
+++ b/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
@@ -588,24 +588,6 @@ static void pispbe_schedule(struct pispbe_dev *pispbe, bool clear_hw_busy)
 	pispbe->hw_busy = true;
 	spin_unlock_irqrestore(&pispbe->hw_lock, flags);
 
-	if (job.config->num_tiles <= 0 ||
-	    job.config->num_tiles > PISP_BACK_END_NUM_TILES ||
-	    !((job.hw_enables.bayer_enables | job.hw_enables.rgb_enables) &
-	      PISP_BE_BAYER_ENABLE_INPUT)) {
-		/*
-		 * Bad job. We can't let it proceed as it could lock up
-		 * the hardware, or worse!
-		 *
-		 * For now, just force num_tiles to 0, which causes the
-		 * H/W to do something bizarre but survivable. It
-		 * increments (started,done) counters by more than 1,
-		 * but we seem to survive...
-		 */
-		dev_dbg(pispbe->dev, "Bad job: invalid number of tiles: %u\n",
-			job.config->num_tiles);
-		job.config->num_tiles = 0;
-	}
-
 	pispbe_queue_job(pispbe, &job);
 
 	return;
@@ -703,6 +685,13 @@ static int pisp_be_validate_config(struct pispbe_dev *pispbe,
 		return -EIO;
 	}
 
+	if (config->num_tiles == 0 ||
+	    config->num_tiles > PISP_BACK_END_NUM_TILES) {
+		dev_dbg(dev, "%s: Invalid number of tiles: %d\n", __func__,
+			config->num_tiles);
+		return -EINVAL;
+	}
+
 	/* Ensure output config strides and buffer sizes match the V4L2 formats. */
 	fmt = &pispbe->node[TDN_OUTPUT_NODE].format;
 	if (bayer_enables & PISP_BE_BAYER_ENABLE_TDN_OUTPUT) {

-- 
2.49.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v7 3/4] media: pisp_be: Split jobs creation and scheduling
  2025-06-06 10:29 [PATCH v7 0/4] media: pisp-be: Split jobs creation and scheduling Jacopo Mondi
  2025-06-06 10:29 ` [PATCH v7 1/4] media: pisp_be: Drop reference to non-existing function Jacopo Mondi
  2025-06-06 10:29 ` [PATCH v7 2/4] media: pisp_be: Remove config validation from schedule() Jacopo Mondi
@ 2025-06-06 10:29 ` Jacopo Mondi
  2025-06-16 14:40   ` Laurent Pinchart
  2025-06-06 10:29 ` [PATCH v7 4/4] media: pisp_be: Fix pm_runtime underrun in probe Jacopo Mondi
  3 siblings, 1 reply; 12+ messages in thread
From: Jacopo Mondi @ 2025-06-06 10:29 UTC (permalink / raw)
  To: Naushir Patuck, Nick Hollinghurst, David Plowman, Dave Stevenson,
	Laurent Pinchart, Raspberry Pi Kernel Maintenance,
	Mauro Carvalho Chehab, Florian Fainelli,
	Broadcom internal kernel review list, Sakari Ailus, Hans Verkuil
  Cc: linux-media, linux-rpi-kernel, linux-arm-kernel, linux-kernel,
	Jacopo Mondi

Currently the 'pispbe_schedule()' function does two things:

1) Tries to assemble a job by inspecting all the video node queues
   to make sure all the required buffers are available
2) Submit the job to the hardware

The pispbe_schedule() function is called at:

- video device start_streaming() time
- video device qbuf() time
- irq handler

As assembling a job requires inspecting all queues, it is a rather
time consuming operation which is better not run in IRQ context.

To avoid the executing the time consuming job creation in interrupt
context split the job creation and job scheduling in two distinct
operations. When a well-formed job is created, append it to the
newly introduced 'pispbe->job_queue' where it will be dequeued from
by the scheduling routine.

As the per-node 'ready_queue' buffer list is only accessed in vb2
ops callbacks, protected by a mutex, it is not necessary to guard it
with a dedicated spinlock so drop it. Also use the spin_lock_irq()
variant in all functions not called from an IRQ context where the
spin_lock_irqsave() version was used.

Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
---
 .../media/platform/raspberrypi/pisp_be/pisp_be.c   | 152 +++++++++++----------
 1 file changed, 79 insertions(+), 73 deletions(-)

diff --git a/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c b/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
index 92c452891d6c..c25f7d9b404c 100644
--- a/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
+++ b/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
@@ -161,8 +161,6 @@ struct pispbe_node {
 	struct mutex node_lock;
 	/* vb2_queue lock */
 	struct mutex queue_lock;
-	/* Protect pispbe_node->ready_queue and pispbe_buffer->ready_list */
-	spinlock_t ready_lock;
 	struct list_head ready_queue;
 	struct vb2_queue queue;
 	struct v4l2_format format;
@@ -190,6 +188,8 @@ struct pispbe_hw_enables {
 
 /* Records a job configuration and memory addresses. */
 struct pispbe_job_descriptor {
+	struct list_head queue;
+	struct pispbe_buffer *buffers[PISPBE_NUM_NODES];
 	dma_addr_t hw_dma_addrs[N_HW_ADDRESSES];
 	struct pisp_be_tiles_config *config;
 	struct pispbe_hw_enables hw_enables;
@@ -215,8 +215,10 @@ struct pispbe_dev {
 	unsigned int sequence;
 	u32 streaming_map;
 	struct pispbe_job queued_job, running_job;
-	spinlock_t hw_lock; /* protects "hw_busy" flag and streaming_map */
+	/* protects "hw_busy" flag, streaming_map and job_queue */
+	spinlock_t hw_lock;
 	bool hw_busy; /* non-zero if a job is queued or is being started */
+	struct list_head job_queue;
 	int irq;
 	u32 hw_version;
 	u8 done, started;
@@ -440,41 +442,47 @@ static void pispbe_xlate_addrs(struct pispbe_dev *pispbe,
  * For Output0, Output1, Tdn and Stitch, a buffer only needs to be
  * available if the blocks are enabled in the config.
  *
- * Needs to be called with hw_lock held.
+ * If all the buffers required to form a job are available, append the
+ * job descriptor to the job queue to be later queued to the HW.
  *
  * Returns 0 if a job has been successfully prepared, < 0 otherwise.
  */
-static int pispbe_prepare_job(struct pispbe_dev *pispbe,
-			      struct pispbe_job_descriptor *job)
+static int pispbe_prepare_job(struct pispbe_dev *pispbe)
 {
 	struct pispbe_buffer *buf[PISPBE_NUM_NODES] = {};
+	struct pispbe_job_descriptor *job;
+	unsigned int streaming_map;
 	unsigned int config_index;
 	struct pispbe_node *node;
-	unsigned long flags;
 
-	lockdep_assert_held(&pispbe->hw_lock);
+	scoped_guard(spinlock_irq, &pispbe->hw_lock) {
+		static const u32 mask = BIT(CONFIG_NODE) | BIT(MAIN_INPUT_NODE);
 
-	memset(job, 0, sizeof(struct pispbe_job_descriptor));
+		if ((pispbe->streaming_map & mask) != mask)
+			return -ENODEV;
 
-	if (((BIT(CONFIG_NODE) | BIT(MAIN_INPUT_NODE)) &
-		pispbe->streaming_map) !=
-			(BIT(CONFIG_NODE) | BIT(MAIN_INPUT_NODE)))
-		return -ENODEV;
+		/*
+		 * Take a copy of streaming_map: nodes activated after this
+		 * point are ignored when preparing this job.
+		 */
+		streaming_map = pispbe->streaming_map;
+	}
+
+	job = kzalloc(sizeof(*job), GFP_KERNEL);
+	if (!job)
+		return -ENOMEM;
 
 	node = &pispbe->node[CONFIG_NODE];
-	spin_lock_irqsave(&node->ready_lock, flags);
 	buf[CONFIG_NODE] = list_first_entry_or_null(&node->ready_queue,
 						    struct pispbe_buffer,
 						    ready_list);
-	if (buf[CONFIG_NODE]) {
-		list_del(&buf[CONFIG_NODE]->ready_list);
-		pispbe->queued_job.buf[CONFIG_NODE] = buf[CONFIG_NODE];
+	if (!buf[CONFIG_NODE]) {
+		kfree(job);
+		return -ENODEV;
 	}
-	spin_unlock_irqrestore(&node->ready_lock, flags);
 
-	/* Exit early if no config buffer has been queued. */
-	if (!buf[CONFIG_NODE])
-		return -ENODEV;
+	list_del(&buf[CONFIG_NODE]->ready_list);
+	job->buffers[CONFIG_NODE] = buf[CONFIG_NODE];
 
 	config_index = buf[CONFIG_NODE]->vb.vb2_buf.index;
 	job->config = &pispbe->config[config_index];
@@ -495,7 +503,7 @@ static int pispbe_prepare_job(struct pispbe_dev *pispbe,
 			continue;
 
 		buf[i] = NULL;
-		if (!(pispbe->streaming_map & BIT(i)))
+		if (!(streaming_map & BIT(i)))
 			continue;
 
 		if ((!(rgb_en & PISP_BE_RGB_ENABLE_OUTPUT0) &&
@@ -522,25 +530,25 @@ static int pispbe_prepare_job(struct pispbe_dev *pispbe,
 		node = &pispbe->node[i];
 
 		/* Pull a buffer from each V4L2 queue to form the queued job */
-		spin_lock_irqsave(&node->ready_lock, flags);
 		buf[i] = list_first_entry_or_null(&node->ready_queue,
 						  struct pispbe_buffer,
 						  ready_list);
 		if (buf[i]) {
 			list_del(&buf[i]->ready_list);
-			pispbe->queued_job.buf[i] = buf[i];
+			job->buffers[i] = buf[i];
 		}
-		spin_unlock_irqrestore(&node->ready_lock, flags);
 
 		if (!buf[i] && !ignore_buffers)
 			goto err_return_buffers;
 	}
 
-	pispbe->queued_job.valid = true;
-
 	/* Convert buffers to DMA addresses for the hardware */
 	pispbe_xlate_addrs(pispbe, job, buf);
 
+	scoped_guard(spinlock_irq, &pispbe->hw_lock) {
+		list_add_tail(&job->queue, &pispbe->job_queue);
+	}
+
 	return 0;
 
 err_return_buffers:
@@ -551,33 +559,39 @@ static int pispbe_prepare_job(struct pispbe_dev *pispbe,
 			continue;
 
 		/* Return the buffer to the ready_list queue */
-		spin_lock_irqsave(&n->ready_lock, flags);
 		list_add(&buf[i]->ready_list, &n->ready_queue);
-		spin_unlock_irqrestore(&n->ready_lock, flags);
 	}
 
-	memset(&pispbe->queued_job, 0, sizeof(pispbe->queued_job));
+	kfree(job);
 
 	return -ENODEV;
 }
 
 static void pispbe_schedule(struct pispbe_dev *pispbe, bool clear_hw_busy)
 {
-	struct pispbe_job_descriptor job;
-	unsigned long flags;
-	int ret;
+	struct pispbe_job_descriptor *job;
+
+	scoped_guard(spinlock_irqsave, &pispbe->hw_lock) {
+		if (clear_hw_busy)
+			pispbe->hw_busy = false;
+
+		if (pispbe->hw_busy)
+			return;
 
-	spin_lock_irqsave(&pispbe->hw_lock, flags);
+		job = list_first_entry_or_null(&pispbe->job_queue,
+					       struct pispbe_job_descriptor,
+					       queue);
+		if (!job)
+			return;
 
-	if (clear_hw_busy)
-		pispbe->hw_busy = false;
+		list_del(&job->queue);
 
-	if (pispbe->hw_busy)
-		goto unlock_and_return;
+		for (unsigned int i = 0; i < PISPBE_NUM_NODES; i++)
+			pispbe->queued_job.buf[i] = job->buffers[i];
+		pispbe->queued_job.valid = true;
 
-	ret = pispbe_prepare_job(pispbe, &job);
-	if (ret)
-		goto unlock_and_return;
+		pispbe->hw_busy = true;
+	}
 
 	/*
 	 * We can kick the job off without the hw_lock, as this can
@@ -585,16 +599,8 @@ static void pispbe_schedule(struct pispbe_dev *pispbe, bool clear_hw_busy)
 	 * only when the following job has been queued and an interrupt
 	 * is rised.
 	 */
-	pispbe->hw_busy = true;
-	spin_unlock_irqrestore(&pispbe->hw_lock, flags);
-
-	pispbe_queue_job(pispbe, &job);
-
-	return;
-
-unlock_and_return:
-	/* No job has been queued, just release the lock and return. */
-	spin_unlock_irqrestore(&pispbe->hw_lock, flags);
+	pispbe_queue_job(pispbe, job);
+	kfree(job);
 }
 
 static void pispbe_isr_jobdone(struct pispbe_dev *pispbe,
@@ -846,18 +852,16 @@ static void pispbe_node_buffer_queue(struct vb2_buffer *buf)
 		container_of(vbuf, struct pispbe_buffer, vb);
 	struct pispbe_node *node = vb2_get_drv_priv(buf->vb2_queue);
 	struct pispbe_dev *pispbe = node->pispbe;
-	unsigned long flags;
 
 	dev_dbg(pispbe->dev, "%s: for node %s\n", __func__, NODE_NAME(node));
-	spin_lock_irqsave(&node->ready_lock, flags);
 	list_add_tail(&buffer->ready_list, &node->ready_queue);
-	spin_unlock_irqrestore(&node->ready_lock, flags);
 
 	/*
 	 * Every time we add a buffer, check if there's now some work for the hw
 	 * to do.
 	 */
-	pispbe_schedule(pispbe, false);
+	if (!pispbe_prepare_job(pispbe))
+		pispbe_schedule(pispbe, false);
 }
 
 static int pispbe_node_start_streaming(struct vb2_queue *q, unsigned int count)
@@ -865,17 +869,16 @@ static int pispbe_node_start_streaming(struct vb2_queue *q, unsigned int count)
 	struct pispbe_node *node = vb2_get_drv_priv(q);
 	struct pispbe_dev *pispbe = node->pispbe;
 	struct pispbe_buffer *buf, *tmp;
-	unsigned long flags;
 	int ret;
 
 	ret = pm_runtime_resume_and_get(pispbe->dev);
 	if (ret < 0)
 		goto err_return_buffers;
 
-	spin_lock_irqsave(&pispbe->hw_lock, flags);
-	node->pispbe->streaming_map |=  BIT(node->id);
-	node->pispbe->sequence = 0;
-	spin_unlock_irqrestore(&pispbe->hw_lock, flags);
+	scoped_guard(spinlock_irq, &pispbe->hw_lock) {
+		node->pispbe->streaming_map |=  BIT(node->id);
+		node->pispbe->sequence = 0;
+	}
 
 	dev_dbg(pispbe->dev, "%s: for node %s (count %u)\n",
 		__func__, NODE_NAME(node), count);
@@ -883,17 +886,16 @@ static int pispbe_node_start_streaming(struct vb2_queue *q, unsigned int count)
 		node->pispbe->streaming_map);
 
 	/* Maybe we're ready to run. */
-	pispbe_schedule(pispbe, false);
+	if (!pispbe_prepare_job(pispbe))
+		pispbe_schedule(pispbe, false);
 
 	return 0;
 
 err_return_buffers:
-	spin_lock_irqsave(&pispbe->hw_lock, flags);
 	list_for_each_entry_safe(buf, tmp, &node->ready_queue, ready_list) {
 		list_del(&buf->ready_list);
 		vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_QUEUED);
 	}
-	spin_unlock_irqrestore(&pispbe->hw_lock, flags);
 
 	return ret;
 }
@@ -903,7 +905,6 @@ static void pispbe_node_stop_streaming(struct vb2_queue *q)
 	struct pispbe_node *node = vb2_get_drv_priv(q);
 	struct pispbe_dev *pispbe = node->pispbe;
 	struct pispbe_buffer *buf;
-	unsigned long flags;
 
 	/*
 	 * Now this is a bit awkward. In a simple M2M device we could just wait
@@ -915,11 +916,7 @@ static void pispbe_node_stop_streaming(struct vb2_queue *q)
 	 * This may return buffers out of order.
 	 */
 	dev_dbg(pispbe->dev, "%s: for node %s\n", __func__, NODE_NAME(node));
-	spin_lock_irqsave(&pispbe->hw_lock, flags);
 	do {
-		unsigned long flags1;
-
-		spin_lock_irqsave(&node->ready_lock, flags1);
 		buf = list_first_entry_or_null(&node->ready_queue,
 					       struct pispbe_buffer,
 					       ready_list);
@@ -927,15 +924,23 @@ static void pispbe_node_stop_streaming(struct vb2_queue *q)
 			list_del(&buf->ready_list);
 			vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_ERROR);
 		}
-		spin_unlock_irqrestore(&node->ready_lock, flags1);
 	} while (buf);
-	spin_unlock_irqrestore(&pispbe->hw_lock, flags);
 
 	vb2_wait_for_all_buffers(&node->queue);
 
-	spin_lock_irqsave(&pispbe->hw_lock, flags);
+	spin_lock_irq(&pispbe->hw_lock);
 	pispbe->streaming_map &= ~BIT(node->id);
-	spin_unlock_irqrestore(&pispbe->hw_lock, flags);
+
+	/* Release all jobs once all nodes have stopped streaming. */
+	if (pispbe->streaming_map == 0) {
+		struct pispbe_job_descriptor *job, *temp;
+
+		list_for_each_entry_safe(job, temp, &pispbe->job_queue, queue) {
+			list_del(&job->queue);
+			kfree(job);
+		}
+	}
+	spin_unlock_irq(&pispbe->hw_lock);
 
 	pm_runtime_mark_last_busy(pispbe->dev);
 	pm_runtime_put_autosuspend(pispbe->dev);
@@ -1393,7 +1398,6 @@ static int pispbe_init_node(struct pispbe_dev *pispbe, unsigned int id)
 	mutex_init(&node->node_lock);
 	mutex_init(&node->queue_lock);
 	INIT_LIST_HEAD(&node->ready_queue);
-	spin_lock_init(&node->ready_lock);
 
 	node->format.type = node->buf_type;
 	pispbe_node_def_fmt(node);
@@ -1677,6 +1681,8 @@ static int pispbe_probe(struct platform_device *pdev)
 	if (!pispbe)
 		return -ENOMEM;
 
+	INIT_LIST_HEAD(&pispbe->job_queue);
+
 	dev_set_drvdata(&pdev->dev, pispbe);
 	pispbe->dev = &pdev->dev;
 	platform_set_drvdata(pdev, pispbe);

-- 
2.49.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v7 4/4] media: pisp_be: Fix pm_runtime underrun in probe
  2025-06-06 10:29 [PATCH v7 0/4] media: pisp-be: Split jobs creation and scheduling Jacopo Mondi
                   ` (2 preceding siblings ...)
  2025-06-06 10:29 ` [PATCH v7 3/4] media: pisp_be: Split jobs creation and scheduling Jacopo Mondi
@ 2025-06-06 10:29 ` Jacopo Mondi
  2025-06-16 14:17   ` Laurent Pinchart
  3 siblings, 1 reply; 12+ messages in thread
From: Jacopo Mondi @ 2025-06-06 10:29 UTC (permalink / raw)
  To: Naushir Patuck, Nick Hollinghurst, David Plowman, Dave Stevenson,
	Laurent Pinchart, Raspberry Pi Kernel Maintenance,
	Mauro Carvalho Chehab, Florian Fainelli,
	Broadcom internal kernel review list, Sakari Ailus, Hans Verkuil
  Cc: linux-media, linux-rpi-kernel, linux-arm-kernel, linux-kernel,
	Jacopo Mondi, stable

During the probe() routine, the PiSP BE driver needs to power up the
interface in order to identify and initialize the hardware.

The driver resumes the interface by calling the
pispbe_runtime_resume() function directly, without going
through the pm_runtime helpers, but later suspends it by calling
pm_runtime_put_autosuspend().

This causes a PM usage count imbalance at probe time, which the runtime
PM framework reports with the below message in the system log:

 pispbe 1000880000.pisp_be: Runtime PM usage count underflow!

Fix this by resuming the interface using the pm runtime helpers instead
of calling the resume function directly and use the pm_runtime framework
in the probe() error path. While at it, remove manual suspend of the
interface in the remove() function. The driver cannot be unloaded if in
use, so simply disable runtime pm.

To simplify the implementation, make the driver depend on PM, as the
RPI5 platform where the ISP is integrated uses the PM framework by
default.

Fixes: 12187bd5d4f8 ("media: raspberrypi: Add support for PiSP BE")
Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>

--

Cc: stable@vger.kernel.org
---
 drivers/media/platform/raspberrypi/pisp_be/Kconfig   | 1 +
 drivers/media/platform/raspberrypi/pisp_be/pisp_be.c | 5 ++---
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/media/platform/raspberrypi/pisp_be/Kconfig b/drivers/media/platform/raspberrypi/pisp_be/Kconfig
index 46765a2e4c4d..a9e51fd94aad 100644
--- a/drivers/media/platform/raspberrypi/pisp_be/Kconfig
+++ b/drivers/media/platform/raspberrypi/pisp_be/Kconfig
@@ -3,6 +3,7 @@ config VIDEO_RASPBERRYPI_PISP_BE
 	depends on V4L_PLATFORM_DRIVERS
 	depends on VIDEO_DEV
 	depends on ARCH_BCM2835 || COMPILE_TEST
+	depends on PM
 	select VIDEO_V4L2_SUBDEV_API
 	select MEDIA_CONTROLLER
 	select VIDEOBUF2_DMA_CONTIG
diff --git a/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c b/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
index c25f7d9b404c..e49e4cc322db 100644
--- a/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
+++ b/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
@@ -1718,7 +1718,7 @@ static int pispbe_probe(struct platform_device *pdev)
 	pm_runtime_use_autosuspend(pispbe->dev);
 	pm_runtime_enable(pispbe->dev);
 
-	ret = pispbe_runtime_resume(pispbe->dev);
+	ret = pm_runtime_resume_and_get(pispbe->dev);
 	if (ret)
 		goto pm_runtime_disable_err;
 
@@ -1740,7 +1740,7 @@ static int pispbe_probe(struct platform_device *pdev)
 disable_devs_err:
 	pispbe_destroy_devices(pispbe);
 pm_runtime_suspend_err:
-	pispbe_runtime_suspend(pispbe->dev);
+	pm_runtime_put(pispbe->dev);
 pm_runtime_disable_err:
 	pm_runtime_dont_use_autosuspend(pispbe->dev);
 	pm_runtime_disable(pispbe->dev);
@@ -1754,7 +1754,6 @@ static void pispbe_remove(struct platform_device *pdev)
 
 	pispbe_destroy_devices(pispbe);
 
-	pispbe_runtime_suspend(pispbe->dev);
 	pm_runtime_dont_use_autosuspend(pispbe->dev);
 	pm_runtime_disable(pispbe->dev);
 }

-- 
2.49.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v7 1/4] media: pisp_be: Drop reference to non-existing function
  2025-06-06 10:29 ` [PATCH v7 1/4] media: pisp_be: Drop reference to non-existing function Jacopo Mondi
@ 2025-06-13  8:00   ` Naushir Patuck
  0 siblings, 0 replies; 12+ messages in thread
From: Naushir Patuck @ 2025-06-13  8:00 UTC (permalink / raw)
  To: Jacopo Mondi
  Cc: Nick Hollinghurst, David Plowman, Dave Stevenson,
	Laurent Pinchart, Raspberry Pi Kernel Maintenance,
	Mauro Carvalho Chehab, Florian Fainelli,
	Broadcom internal kernel review list, Sakari Ailus, Hans Verkuil,
	linux-media, linux-rpi-kernel, linux-arm-kernel, linux-kernel

Hi Jacopo,

Thank you for tidying this up!

On Fri, 6 Jun 2025 at 11:29, Jacopo Mondi <jacopo.mondi@ideasonboard.com> wrote:
>
> A comment in the pisp_be driver references the
> pispbe_schedule_internal() function which doesn't exist.
>
> Drop it.
>
> Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>

Reviewed-by: Naushir Patuck <naush@raspberrypi.com>

> ---
>  drivers/media/platform/raspberrypi/pisp_be/pisp_be.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c b/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
> index 7596ae1f7de6..b1449245f394 100644
> --- a/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
> +++ b/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
> @@ -368,10 +368,7 @@ static void pispbe_xlate_addrs(struct pispbe_dev *pispbe,
>         ret = pispbe_get_planes_addr(addrs, buf[MAIN_INPUT_NODE],
>                                      &pispbe->node[MAIN_INPUT_NODE]);
>         if (ret <= 0) {
> -               /*
> -                * This shouldn't happen; pispbe_schedule_internal should insist
> -                * on an input.
> -                */
> +               /* Shouldn't happen, we have validated an input is available. */
>                 dev_warn(pispbe->dev, "ISP-BE missing input\n");
>                 hw_en->bayer_enables = 0;
>                 hw_en->rgb_enables = 0;
>
> --
> 2.49.0
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v7 2/4] media: pisp_be: Remove config validation from schedule()
  2025-06-06 10:29 ` [PATCH v7 2/4] media: pisp_be: Remove config validation from schedule() Jacopo Mondi
@ 2025-06-13  8:31   ` Naushir Patuck
  0 siblings, 0 replies; 12+ messages in thread
From: Naushir Patuck @ 2025-06-13  8:31 UTC (permalink / raw)
  To: Jacopo Mondi
  Cc: Nick Hollinghurst, David Plowman, Dave Stevenson,
	Laurent Pinchart, Raspberry Pi Kernel Maintenance,
	Mauro Carvalho Chehab, Florian Fainelli,
	Broadcom internal kernel review list, Sakari Ailus, Hans Verkuil,
	linux-media, linux-rpi-kernel, linux-arm-kernel, linux-kernel

Hi Jacopo,

Thank you for this patch.

On Fri, 6 Jun 2025 at 11:29, Jacopo Mondi <jacopo.mondi@ideasonboard.com> wrote:
>
> The config parameters buffer is already validated in
> pisp_be_validate_config() at .buf_prepare() time.
>
> However some of the same validations are also performed at
> pispbe_schedule() time. In particular the function checks that:
>
> 1) config.num_tiles is valid
> 2) At least one of the BAYER or RGB inputs is enabled
>
> The input config validation is already performed in
> pisp_be_validate_config() and while job.hw_enables is modified by
> pispbe_xlate_addrs(), the function only resets the input masks if
>
> - there is no input buffer available, but pispbe_prepare_job() fails
>   before calling pispbe_xlate_addrs() in this case
> - bayer_enable is 0, but in this case rgb_enable is valid as guaranteed
>   by pisp_be_validate_config()
> - only outputs are reset in rgb_enable
>
> For these reasons there is no need to repeat the check at
> pispbe_schedule() time.
>
> The num_tiles validation can be moved to pisp_be_validate_config() as
> well. As num_tiles is a u32 it can't be < 0, so change the sanity
> check accordingly.
>
> Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>

Reviewed-by: Naushir Patuck <naush@raspberrypi.com>

> ---
>  .../media/platform/raspberrypi/pisp_be/pisp_be.c   | 25 ++++++----------------
>  1 file changed, 7 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c b/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
> index b1449245f394..92c452891d6c 100644
> --- a/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
> +++ b/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
> @@ -588,24 +588,6 @@ static void pispbe_schedule(struct pispbe_dev *pispbe, bool clear_hw_busy)
>         pispbe->hw_busy = true;
>         spin_unlock_irqrestore(&pispbe->hw_lock, flags);
>
> -       if (job.config->num_tiles <= 0 ||
> -           job.config->num_tiles > PISP_BACK_END_NUM_TILES ||
> -           !((job.hw_enables.bayer_enables | job.hw_enables.rgb_enables) &
> -             PISP_BE_BAYER_ENABLE_INPUT)) {
> -               /*
> -                * Bad job. We can't let it proceed as it could lock up
> -                * the hardware, or worse!
> -                *
> -                * For now, just force num_tiles to 0, which causes the
> -                * H/W to do something bizarre but survivable. It
> -                * increments (started,done) counters by more than 1,
> -                * but we seem to survive...
> -                */
> -               dev_dbg(pispbe->dev, "Bad job: invalid number of tiles: %u\n",
> -                       job.config->num_tiles);
> -               job.config->num_tiles = 0;
> -       }
> -
>         pispbe_queue_job(pispbe, &job);
>
>         return;
> @@ -703,6 +685,13 @@ static int pisp_be_validate_config(struct pispbe_dev *pispbe,
>                 return -EIO;
>         }
>
> +       if (config->num_tiles == 0 ||
> +           config->num_tiles > PISP_BACK_END_NUM_TILES) {
> +               dev_dbg(dev, "%s: Invalid number of tiles: %d\n", __func__,
> +                       config->num_tiles);
> +               return -EINVAL;
> +       }
> +
>         /* Ensure output config strides and buffer sizes match the V4L2 formats. */
>         fmt = &pispbe->node[TDN_OUTPUT_NODE].format;
>         if (bayer_enables & PISP_BE_BAYER_ENABLE_TDN_OUTPUT) {
>
> --
> 2.49.0
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v7 4/4] media: pisp_be: Fix pm_runtime underrun in probe
  2025-06-06 10:29 ` [PATCH v7 4/4] media: pisp_be: Fix pm_runtime underrun in probe Jacopo Mondi
@ 2025-06-16 14:17   ` Laurent Pinchart
  0 siblings, 0 replies; 12+ messages in thread
From: Laurent Pinchart @ 2025-06-16 14:17 UTC (permalink / raw)
  To: Jacopo Mondi
  Cc: Naushir Patuck, Nick Hollinghurst, David Plowman, Dave Stevenson,
	Raspberry Pi Kernel Maintenance, Mauro Carvalho Chehab,
	Florian Fainelli, Broadcom internal kernel review list,
	Sakari Ailus, Hans Verkuil, linux-media, linux-rpi-kernel,
	linux-arm-kernel, linux-kernel, stable

Hi Jacopo,

Thank you for the patch.

On Fri, Jun 06, 2025 at 12:29:24PM +0200, Jacopo Mondi wrote:
> During the probe() routine, the PiSP BE driver needs to power up the
> interface in order to identify and initialize the hardware.
> 
> The driver resumes the interface by calling the
> pispbe_runtime_resume() function directly, without going
> through the pm_runtime helpers, but later suspends it by calling
> pm_runtime_put_autosuspend().
> 
> This causes a PM usage count imbalance at probe time, which the runtime
> PM framework reports with the below message in the system log:
> 
>  pispbe 1000880000.pisp_be: Runtime PM usage count underflow!
> 
> Fix this by resuming the interface using the pm runtime helpers instead
> of calling the resume function directly and use the pm_runtime framework
> in the probe() error path. While at it, remove manual suspend of the
> interface in the remove() function. The driver cannot be unloaded if in
> use, so simply disable runtime pm.
> 
> To simplify the implementation, make the driver depend on PM, as the
> RPI5 platform where the ISP is integrated uses the PM framework by
> default.
> 
> Fixes: 12187bd5d4f8 ("media: raspberrypi: Add support for PiSP BE")
> Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
> 
> --
> 
> Cc: stable@vger.kernel.org

This should go just below the Fixes: tag.

Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>

> ---
>  drivers/media/platform/raspberrypi/pisp_be/Kconfig   | 1 +
>  drivers/media/platform/raspberrypi/pisp_be/pisp_be.c | 5 ++---
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/media/platform/raspberrypi/pisp_be/Kconfig b/drivers/media/platform/raspberrypi/pisp_be/Kconfig
> index 46765a2e4c4d..a9e51fd94aad 100644
> --- a/drivers/media/platform/raspberrypi/pisp_be/Kconfig
> +++ b/drivers/media/platform/raspberrypi/pisp_be/Kconfig
> @@ -3,6 +3,7 @@ config VIDEO_RASPBERRYPI_PISP_BE
>  	depends on V4L_PLATFORM_DRIVERS
>  	depends on VIDEO_DEV
>  	depends on ARCH_BCM2835 || COMPILE_TEST
> +	depends on PM
>  	select VIDEO_V4L2_SUBDEV_API
>  	select MEDIA_CONTROLLER
>  	select VIDEOBUF2_DMA_CONTIG
> diff --git a/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c b/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
> index c25f7d9b404c..e49e4cc322db 100644
> --- a/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
> +++ b/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
> @@ -1718,7 +1718,7 @@ static int pispbe_probe(struct platform_device *pdev)
>  	pm_runtime_use_autosuspend(pispbe->dev);
>  	pm_runtime_enable(pispbe->dev);
>  
> -	ret = pispbe_runtime_resume(pispbe->dev);
> +	ret = pm_runtime_resume_and_get(pispbe->dev);
>  	if (ret)
>  		goto pm_runtime_disable_err;
>  
> @@ -1740,7 +1740,7 @@ static int pispbe_probe(struct platform_device *pdev)
>  disable_devs_err:
>  	pispbe_destroy_devices(pispbe);
>  pm_runtime_suspend_err:
> -	pispbe_runtime_suspend(pispbe->dev);
> +	pm_runtime_put(pispbe->dev);
>  pm_runtime_disable_err:
>  	pm_runtime_dont_use_autosuspend(pispbe->dev);
>  	pm_runtime_disable(pispbe->dev);
> @@ -1754,7 +1754,6 @@ static void pispbe_remove(struct platform_device *pdev)
>  
>  	pispbe_destroy_devices(pispbe);
>  
> -	pispbe_runtime_suspend(pispbe->dev);
>  	pm_runtime_dont_use_autosuspend(pispbe->dev);
>  	pm_runtime_disable(pispbe->dev);
>  }

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v7 3/4] media: pisp_be: Split jobs creation and scheduling
  2025-06-06 10:29 ` [PATCH v7 3/4] media: pisp_be: Split jobs creation and scheduling Jacopo Mondi
@ 2025-06-16 14:40   ` Laurent Pinchart
  2025-06-17 12:32     ` Jacopo Mondi
  0 siblings, 1 reply; 12+ messages in thread
From: Laurent Pinchart @ 2025-06-16 14:40 UTC (permalink / raw)
  To: Jacopo Mondi
  Cc: Naushir Patuck, Nick Hollinghurst, David Plowman, Dave Stevenson,
	Raspberry Pi Kernel Maintenance, Mauro Carvalho Chehab,
	Florian Fainelli, Broadcom internal kernel review list,
	Sakari Ailus, Hans Verkuil, linux-media, linux-rpi-kernel,
	linux-arm-kernel, linux-kernel

Hi Jacopo,

On Fri, Jun 06, 2025 at 12:29:23PM +0200, Jacopo Mondi wrote:
> Currently the 'pispbe_schedule()' function does two things:
> 
> 1) Tries to assemble a job by inspecting all the video node queues
>    to make sure all the required buffers are available
> 2) Submit the job to the hardware
> 
> The pispbe_schedule() function is called at:
> 
> - video device start_streaming() time
> - video device qbuf() time
> - irq handler
> 
> As assembling a job requires inspecting all queues, it is a rather
> time consuming operation which is better not run in IRQ context.
> 
> To avoid the executing the time consuming job creation in interrupt

s/the executing/executing/

> context split the job creation and job scheduling in two distinct
> operations. When a well-formed job is created, append it to the
> newly introduced 'pispbe->job_queue' where it will be dequeued from
> by the scheduling routine.
> 
> As the per-node 'ready_queue' buffer list is only accessed in vb2
> ops callbacks, protected by a mutex, it is not necessary to guard it

"by the node->queue_lock mutex"

> with a dedicated spinlock so drop it. Also use the spin_lock_irq()
> variant in all functions not called from an IRQ context where the
> spin_lock_irqsave() version was used.
> 
> Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
> ---
>  .../media/platform/raspberrypi/pisp_be/pisp_be.c   | 152 +++++++++++----------
>  1 file changed, 79 insertions(+), 73 deletions(-)
> 
> diff --git a/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c b/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
> index 92c452891d6c..c25f7d9b404c 100644
> --- a/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
> +++ b/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
> @@ -161,8 +161,6 @@ struct pispbe_node {
>  	struct mutex node_lock;
>  	/* vb2_queue lock */
>  	struct mutex queue_lock;
> -	/* Protect pispbe_node->ready_queue and pispbe_buffer->ready_list */
> -	spinlock_t ready_lock;
>  	struct list_head ready_queue;
>  	struct vb2_queue queue;
>  	struct v4l2_format format;
> @@ -190,6 +188,8 @@ struct pispbe_hw_enables {
>  
>  /* Records a job configuration and memory addresses. */
>  struct pispbe_job_descriptor {
> +	struct list_head queue;
> +	struct pispbe_buffer *buffers[PISPBE_NUM_NODES];
>  	dma_addr_t hw_dma_addrs[N_HW_ADDRESSES];
>  	struct pisp_be_tiles_config *config;
>  	struct pispbe_hw_enables hw_enables;
> @@ -215,8 +215,10 @@ struct pispbe_dev {
>  	unsigned int sequence;
>  	u32 streaming_map;
>  	struct pispbe_job queued_job, running_job;
> -	spinlock_t hw_lock; /* protects "hw_busy" flag and streaming_map */
> +	/* protects "hw_busy" flag, streaming_map and job_queue */
> +	spinlock_t hw_lock;
>  	bool hw_busy; /* non-zero if a job is queued or is being started */
> +	struct list_head job_queue;
>  	int irq;
>  	u32 hw_version;
>  	u8 done, started;
> @@ -440,41 +442,47 @@ static void pispbe_xlate_addrs(struct pispbe_dev *pispbe,
>   * For Output0, Output1, Tdn and Stitch, a buffer only needs to be
>   * available if the blocks are enabled in the config.
>   *
> - * Needs to be called with hw_lock held.
> + * If all the buffers required to form a job are available, append the
> + * job descriptor to the job queue to be later queued to the HW.
>   *
>   * Returns 0 if a job has been successfully prepared, < 0 otherwise.
>   */
> -static int pispbe_prepare_job(struct pispbe_dev *pispbe,
> -			      struct pispbe_job_descriptor *job)
> +static int pispbe_prepare_job(struct pispbe_dev *pispbe)
>  {
>  	struct pispbe_buffer *buf[PISPBE_NUM_NODES] = {};
> +	struct pispbe_job_descriptor *job;

You could use

	struct pispbe_job_descriptor __free(kfree) *job = NULL;

and drop the kfree() in the error paths to simplify error handling and
make it more robust. Don't forget to set job to NULL just after adding
it to the job_queue.

> +	unsigned int streaming_map;
>  	unsigned int config_index;
>  	struct pispbe_node *node;
> -	unsigned long flags;
>  
> -	lockdep_assert_held(&pispbe->hw_lock);

You could replace this with

	lockdep_assert_irqs_enabled();

Up to you.

> +	scoped_guard(spinlock_irq, &pispbe->hw_lock) {
> +		static const u32 mask = BIT(CONFIG_NODE) | BIT(MAIN_INPUT_NODE);
>  
> -	memset(job, 0, sizeof(struct pispbe_job_descriptor));
> +		if ((pispbe->streaming_map & mask) != mask)
> +			return -ENODEV;
>  
> -	if (((BIT(CONFIG_NODE) | BIT(MAIN_INPUT_NODE)) &
> -		pispbe->streaming_map) !=
> -			(BIT(CONFIG_NODE) | BIT(MAIN_INPUT_NODE)))
> -		return -ENODEV;
> +		/*
> +		 * Take a copy of streaming_map: nodes activated after this
> +		 * point are ignored when preparing this job.
> +		 */
> +		streaming_map = pispbe->streaming_map;
> +	}
> +
> +	job = kzalloc(sizeof(*job), GFP_KERNEL);
> +	if (!job)
> +		return -ENOMEM;
>  
>  	node = &pispbe->node[CONFIG_NODE];
> -	spin_lock_irqsave(&node->ready_lock, flags);
>  	buf[CONFIG_NODE] = list_first_entry_or_null(&node->ready_queue,
>  						    struct pispbe_buffer,
>  						    ready_list);
> -	if (buf[CONFIG_NODE]) {
> -		list_del(&buf[CONFIG_NODE]->ready_list);
> -		pispbe->queued_job.buf[CONFIG_NODE] = buf[CONFIG_NODE];
> +	if (!buf[CONFIG_NODE]) {
> +		kfree(job);
> +		return -ENODEV;
>  	}
> -	spin_unlock_irqrestore(&node->ready_lock, flags);
>  
> -	/* Exit early if no config buffer has been queued. */
> -	if (!buf[CONFIG_NODE])
> -		return -ENODEV;
> +	list_del(&buf[CONFIG_NODE]->ready_list);
> +	job->buffers[CONFIG_NODE] = buf[CONFIG_NODE];
>  
>  	config_index = buf[CONFIG_NODE]->vb.vb2_buf.index;
>  	job->config = &pispbe->config[config_index];
> @@ -495,7 +503,7 @@ static int pispbe_prepare_job(struct pispbe_dev *pispbe,
>  			continue;
>  
>  		buf[i] = NULL;
> -		if (!(pispbe->streaming_map & BIT(i)))
> +		if (!(streaming_map & BIT(i)))
>  			continue;
>  
>  		if ((!(rgb_en & PISP_BE_RGB_ENABLE_OUTPUT0) &&
> @@ -522,25 +530,25 @@ static int pispbe_prepare_job(struct pispbe_dev *pispbe,
>  		node = &pispbe->node[i];
>  
>  		/* Pull a buffer from each V4L2 queue to form the queued job */
> -		spin_lock_irqsave(&node->ready_lock, flags);
>  		buf[i] = list_first_entry_or_null(&node->ready_queue,
>  						  struct pispbe_buffer,
>  						  ready_list);
>  		if (buf[i]) {
>  			list_del(&buf[i]->ready_list);
> -			pispbe->queued_job.buf[i] = buf[i];
> +			job->buffers[i] = buf[i];
>  		}
> -		spin_unlock_irqrestore(&node->ready_lock, flags);
>  
>  		if (!buf[i] && !ignore_buffers)
>  			goto err_return_buffers;
>  	}
>  
> -	pispbe->queued_job.valid = true;
> -
>  	/* Convert buffers to DMA addresses for the hardware */
>  	pispbe_xlate_addrs(pispbe, job, buf);
>  
> +	scoped_guard(spinlock_irq, &pispbe->hw_lock) {
> +		list_add_tail(&job->queue, &pispbe->job_queue);
> +	}
> +
>  	return 0;
>  
>  err_return_buffers:
> @@ -551,33 +559,39 @@ static int pispbe_prepare_job(struct pispbe_dev *pispbe,
>  			continue;
>  
>  		/* Return the buffer to the ready_list queue */
> -		spin_lock_irqsave(&n->ready_lock, flags);
>  		list_add(&buf[i]->ready_list, &n->ready_queue);
> -		spin_unlock_irqrestore(&n->ready_lock, flags);
>  	}
>  
> -	memset(&pispbe->queued_job, 0, sizeof(pispbe->queued_job));
> +	kfree(job);
>  
>  	return -ENODEV;
>  }
>  
>  static void pispbe_schedule(struct pispbe_dev *pispbe, bool clear_hw_busy)
>  {
> -	struct pispbe_job_descriptor job;
> -	unsigned long flags;
> -	int ret;
> +	struct pispbe_job_descriptor *job;
> +
> +	scoped_guard(spinlock_irqsave, &pispbe->hw_lock) {
> +		if (clear_hw_busy)
> +			pispbe->hw_busy = false;
> +
> +		if (pispbe->hw_busy)
> +			return;
>  
> -	spin_lock_irqsave(&pispbe->hw_lock, flags);
> +		job = list_first_entry_or_null(&pispbe->job_queue,
> +					       struct pispbe_job_descriptor,
> +					       queue);
> +		if (!job)
> +			return;
>  
> -	if (clear_hw_busy)
> -		pispbe->hw_busy = false;
> +		list_del(&job->queue);
>  
> -	if (pispbe->hw_busy)
> -		goto unlock_and_return;
> +		for (unsigned int i = 0; i < PISPBE_NUM_NODES; i++)
> +			pispbe->queued_job.buf[i] = job->buffers[i];
> +		pispbe->queued_job.valid = true;
>  
> -	ret = pispbe_prepare_job(pispbe, &job);
> -	if (ret)
> -		goto unlock_and_return;
> +		pispbe->hw_busy = true;
> +	}
>  
>  	/*
>  	 * We can kick the job off without the hw_lock, as this can
> @@ -585,16 +599,8 @@ static void pispbe_schedule(struct pispbe_dev *pispbe, bool clear_hw_busy)
>  	 * only when the following job has been queued and an interrupt
>  	 * is rised.
>  	 */
> -	pispbe->hw_busy = true;
> -	spin_unlock_irqrestore(&pispbe->hw_lock, flags);
> -
> -	pispbe_queue_job(pispbe, &job);
> -
> -	return;
> -
> -unlock_and_return:
> -	/* No job has been queued, just release the lock and return. */
> -	spin_unlock_irqrestore(&pispbe->hw_lock, flags);
> +	pispbe_queue_job(pispbe, job);
> +	kfree(job);
>  }
>  
>  static void pispbe_isr_jobdone(struct pispbe_dev *pispbe,
> @@ -846,18 +852,16 @@ static void pispbe_node_buffer_queue(struct vb2_buffer *buf)
>  		container_of(vbuf, struct pispbe_buffer, vb);
>  	struct pispbe_node *node = vb2_get_drv_priv(buf->vb2_queue);
>  	struct pispbe_dev *pispbe = node->pispbe;
> -	unsigned long flags;
>  
>  	dev_dbg(pispbe->dev, "%s: for node %s\n", __func__, NODE_NAME(node));
> -	spin_lock_irqsave(&node->ready_lock, flags);
>  	list_add_tail(&buffer->ready_list, &node->ready_queue);
> -	spin_unlock_irqrestore(&node->ready_lock, flags);
>  
>  	/*
>  	 * Every time we add a buffer, check if there's now some work for the hw
>  	 * to do.
>  	 */
> -	pispbe_schedule(pispbe, false);
> +	if (!pispbe_prepare_job(pispbe))
> +		pispbe_schedule(pispbe, false);
>  }
>  
>  static int pispbe_node_start_streaming(struct vb2_queue *q, unsigned int count)
> @@ -865,17 +869,16 @@ static int pispbe_node_start_streaming(struct vb2_queue *q, unsigned int count)
>  	struct pispbe_node *node = vb2_get_drv_priv(q);
>  	struct pispbe_dev *pispbe = node->pispbe;
>  	struct pispbe_buffer *buf, *tmp;
> -	unsigned long flags;
>  	int ret;
>  
>  	ret = pm_runtime_resume_and_get(pispbe->dev);
>  	if (ret < 0)
>  		goto err_return_buffers;
>  
> -	spin_lock_irqsave(&pispbe->hw_lock, flags);
> -	node->pispbe->streaming_map |=  BIT(node->id);
> -	node->pispbe->sequence = 0;
> -	spin_unlock_irqrestore(&pispbe->hw_lock, flags);
> +	scoped_guard(spinlock_irq, &pispbe->hw_lock) {
> +		node->pispbe->streaming_map |=  BIT(node->id);
> +		node->pispbe->sequence = 0;
> +	}
>  
>  	dev_dbg(pispbe->dev, "%s: for node %s (count %u)\n",
>  		__func__, NODE_NAME(node), count);
> @@ -883,17 +886,16 @@ static int pispbe_node_start_streaming(struct vb2_queue *q, unsigned int count)
>  		node->pispbe->streaming_map);
>  
>  	/* Maybe we're ready to run. */
> -	pispbe_schedule(pispbe, false);
> +	if (!pispbe_prepare_job(pispbe))
> +		pispbe_schedule(pispbe, false);
>  
>  	return 0;
>  
>  err_return_buffers:
> -	spin_lock_irqsave(&pispbe->hw_lock, flags);
>  	list_for_each_entry_safe(buf, tmp, &node->ready_queue, ready_list) {
>  		list_del(&buf->ready_list);
>  		vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_QUEUED);
>  	}
> -	spin_unlock_irqrestore(&pispbe->hw_lock, flags);
>  
>  	return ret;
>  }
> @@ -903,7 +905,6 @@ static void pispbe_node_stop_streaming(struct vb2_queue *q)
>  	struct pispbe_node *node = vb2_get_drv_priv(q);
>  	struct pispbe_dev *pispbe = node->pispbe;
>  	struct pispbe_buffer *buf;
> -	unsigned long flags;
>  
>  	/*
>  	 * Now this is a bit awkward. In a simple M2M device we could just wait
> @@ -915,11 +916,7 @@ static void pispbe_node_stop_streaming(struct vb2_queue *q)
>  	 * This may return buffers out of order.
>  	 */
>  	dev_dbg(pispbe->dev, "%s: for node %s\n", __func__, NODE_NAME(node));
> -	spin_lock_irqsave(&pispbe->hw_lock, flags);
>  	do {
> -		unsigned long flags1;
> -
> -		spin_lock_irqsave(&node->ready_lock, flags1);
>  		buf = list_first_entry_or_null(&node->ready_queue,
>  					       struct pispbe_buffer,
>  					       ready_list);
> @@ -927,15 +924,23 @@ static void pispbe_node_stop_streaming(struct vb2_queue *q)
>  			list_del(&buf->ready_list);
>  			vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_ERROR);
>  		}
> -		spin_unlock_irqrestore(&node->ready_lock, flags1);
>  	} while (buf);
> -	spin_unlock_irqrestore(&pispbe->hw_lock, flags);
>  
>  	vb2_wait_for_all_buffers(&node->queue);
>  
> -	spin_lock_irqsave(&pispbe->hw_lock, flags);
> +	spin_lock_irq(&pispbe->hw_lock);
>  	pispbe->streaming_map &= ~BIT(node->id);
> -	spin_unlock_irqrestore(&pispbe->hw_lock, flags);
> +
> +	/* Release all jobs once all nodes have stopped streaming. */
> +	if (pispbe->streaming_map == 0) {
> +		struct pispbe_job_descriptor *job, *temp;
> +
> +		list_for_each_entry_safe(job, temp, &pispbe->job_queue, queue) {
> +			list_del(&job->queue);
> +			kfree(job);
> +		}
> +	}

Please splice pispbe->job_queue to a local list with the lock held, and
then iterate over the local list without the lock held to free the jobs.
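
Something along these lines, as a rough untested sketch (local_queue is
just an illustrative name):

	struct pispbe_job_descriptor *job, *temp;
	LIST_HEAD(local_queue);

	spin_lock_irq(&pispbe->hw_lock);
	pispbe->streaming_map &= ~BIT(node->id);

	/* Steal all pending jobs while holding the lock... */
	if (pispbe->streaming_map == 0)
		list_splice_init(&pispbe->job_queue, &local_queue);
	spin_unlock_irq(&pispbe->hw_lock);

	/* ...and free them with the lock released. */
	list_for_each_entry_safe(job, temp, &local_queue, queue) {
		list_del(&job->queue);
		kfree(job);
	}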

Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>

> +	spin_unlock_irq(&pispbe->hw_lock);
>  
>  	pm_runtime_mark_last_busy(pispbe->dev);
>  	pm_runtime_put_autosuspend(pispbe->dev);
> @@ -1393,7 +1398,6 @@ static int pispbe_init_node(struct pispbe_dev *pispbe, unsigned int id)
>  	mutex_init(&node->node_lock);
>  	mutex_init(&node->queue_lock);
>  	INIT_LIST_HEAD(&node->ready_queue);
> -	spin_lock_init(&node->ready_lock);
>  
>  	node->format.type = node->buf_type;
>  	pispbe_node_def_fmt(node);
> @@ -1677,6 +1681,8 @@ static int pispbe_probe(struct platform_device *pdev)
>  	if (!pispbe)
>  		return -ENOMEM;
>  
> +	INIT_LIST_HEAD(&pispbe->job_queue);
> +
>  	dev_set_drvdata(&pdev->dev, pispbe);
>  	pispbe->dev = &pdev->dev;
>  	platform_set_drvdata(pdev, pispbe);

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v7 3/4] media: pisp_be: Split jobs creation and scheduling
  2025-06-16 14:40   ` Laurent Pinchart
@ 2025-06-17 12:32     ` Jacopo Mondi
  2025-06-17 13:53       ` Laurent Pinchart
  0 siblings, 1 reply; 12+ messages in thread
From: Jacopo Mondi @ 2025-06-17 12:32 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Jacopo Mondi, Naushir Patuck, Nick Hollinghurst, David Plowman,
	Dave Stevenson, Raspberry Pi Kernel Maintenance,
	Mauro Carvalho Chehab, Florian Fainelli,
	Broadcom internal kernel review list, Sakari Ailus, Hans Verkuil,
	linux-media, linux-rpi-kernel, linux-arm-kernel, linux-kernel

Hi Laurent,

On Mon, Jun 16, 2025 at 05:40:09PM +0300, Laurent Pinchart wrote:
> Hi Jacopo,
>
> On Fri, Jun 06, 2025 at 12:29:23PM +0200, Jacopo Mondi wrote:
> > Currently the 'pispbe_schedule()' function does two things:
> >
> > 1) Tries to assemble a job by inspecting all the video node queues
> >    to make sure all the required buffers are available
> > 2) Submit the job to the hardware
> >
> > The pispbe_schedule() function is called at:
> >
> > - video device start_streaming() time
> > - video device qbuf() time
> > - irq handler
> >
> > As assembling a job requires inspecting all queues, it is a rather
> > time consuming operation which is better not run in IRQ context.
> >
> > To avoid the executing the time consuming job creation in interrupt
>
> s/the executing/executing/
>
> > context split the job creation and job scheduling in two distinct
> > operations. When a well-formed job is created, append it to the
> > newly introduced 'pispbe->job_queue' where it will be dequeued from
> > by the scheduling routine.
> >
> > As the per-node 'ready_queue' buffer list is only accessed in vb2
> > ops callbacks, protected by a mutex, it is not necessary to guard it
>
> "by the node->queue_lock mutex"
>
> > with a dedicated spinlock so drop it. Also use the spin_lock_irq()
> > variant in all functions not called from an IRQ context where the
> > spin_lock_irqsave() version was used.
> >
> > Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
> > ---
> >  .../media/platform/raspberrypi/pisp_be/pisp_be.c   | 152 +++++++++++----------
> >  1 file changed, 79 insertions(+), 73 deletions(-)
> >
> > diff --git a/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c b/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
> > index 92c452891d6c..c25f7d9b404c 100644
> > --- a/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
> > +++ b/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
> > @@ -161,8 +161,6 @@ struct pispbe_node {
> >  	struct mutex node_lock;
> >  	/* vb2_queue lock */
> >  	struct mutex queue_lock;
> > -	/* Protect pispbe_node->ready_queue and pispbe_buffer->ready_list */
> > -	spinlock_t ready_lock;
> >  	struct list_head ready_queue;
> >  	struct vb2_queue queue;
> >  	struct v4l2_format format;
> > @@ -190,6 +188,8 @@ struct pispbe_hw_enables {
> >
> >  /* Records a job configuration and memory addresses. */
> >  struct pispbe_job_descriptor {
> > +	struct list_head queue;
> > +	struct pispbe_buffer *buffers[PISPBE_NUM_NODES];
> >  	dma_addr_t hw_dma_addrs[N_HW_ADDRESSES];
> >  	struct pisp_be_tiles_config *config;
> >  	struct pispbe_hw_enables hw_enables;
> > @@ -215,8 +215,10 @@ struct pispbe_dev {
> >  	unsigned int sequence;
> >  	u32 streaming_map;
> >  	struct pispbe_job queued_job, running_job;
> > -	spinlock_t hw_lock; /* protects "hw_busy" flag and streaming_map */
> > +	/* protects "hw_busy" flag, streaming_map and job_queue */
> > +	spinlock_t hw_lock;
> >  	bool hw_busy; /* non-zero if a job is queued or is being started */
> > +	struct list_head job_queue;
> >  	int irq;
> >  	u32 hw_version;
> >  	u8 done, started;
> > @@ -440,41 +442,47 @@ static void pispbe_xlate_addrs(struct pispbe_dev *pispbe,
> >   * For Output0, Output1, Tdn and Stitch, a buffer only needs to be
> >   * available if the blocks are enabled in the config.
> >   *
> > - * Needs to be called with hw_lock held.
> > + * If all the buffers required to form a job are available, append the
> > + * job descriptor to the job queue to be later queued to the HW.
> >   *
> >   * Returns 0 if a job has been successfully prepared, < 0 otherwise.
> >   */
> > -static int pispbe_prepare_job(struct pispbe_dev *pispbe,
> > -			      struct pispbe_job_descriptor *job)
> > +static int pispbe_prepare_job(struct pispbe_dev *pispbe)
> >  {
> >  	struct pispbe_buffer *buf[PISPBE_NUM_NODES] = {};
> > +	struct pispbe_job_descriptor *job;
>
> You could use
>
> 	struct pispbe_job_descriptor __free(kfree) *job = NULL;
>
> and drop the kfree() in the error paths to simplify error handling and
> make it more robust. Don't forget to set job to NULL just after adding
> it to the job_queue.
>

Only if I

	no_free_ptr(job);

before returning, as job has to stay valid until it gets consumed.
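
i.e. something along these lines (untested, just to show the shape):

	struct pispbe_job_descriptor *job __free(kfree) = NULL;

	...

	job = kzalloc(sizeof(*job), GFP_KERNEL);
	if (!job)
		return -ENOMEM;

	/* Error paths from here on can simply 'return -ENODEV;'. */

	...

	scoped_guard(spinlock_irq, &pispbe->hw_lock)
		list_add_tail(&no_free_ptr(job)->queue, &pispbe->job_queue);

	return 0;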

I'm not sure it's worth it just to save two "kfree(job);" calls in the
error paths.

> > +	unsigned int streaming_map;
> >  	unsigned int config_index;
> >  	struct pispbe_node *node;
> > -	unsigned long flags;
> >
> > -	lockdep_assert_held(&pispbe->hw_lock);
>
> You could replace this with
>
> 	lockdep_assert_irqs_enabled();
>
> Up to you.
>
> > +	scoped_guard(spinlock_irq, &pispbe->hw_lock) {
> > +		static const u32 mask = BIT(CONFIG_NODE) | BIT(MAIN_INPUT_NODE);
> >
> > -	memset(job, 0, sizeof(struct pispbe_job_descriptor));
> > +		if ((pispbe->streaming_map & mask) != mask)
> > +			return -ENODEV;
> >
> > -	if (((BIT(CONFIG_NODE) | BIT(MAIN_INPUT_NODE)) &
> > -		pispbe->streaming_map) !=
> > -			(BIT(CONFIG_NODE) | BIT(MAIN_INPUT_NODE)))
> > -		return -ENODEV;
> > +		/*
> > +		 * Take a copy of streaming_map: nodes activated after this
> > +		 * point are ignored when preparing this job.
> > +		 */
> > +		streaming_map = pispbe->streaming_map;
> > +	}
> > +
> > +	job = kzalloc(sizeof(*job), GFP_KERNEL);
> > +	if (!job)
> > +		return -ENOMEM;
> >
> >  	node = &pispbe->node[CONFIG_NODE];
> > -	spin_lock_irqsave(&node->ready_lock, flags);
> >  	buf[CONFIG_NODE] = list_first_entry_or_null(&node->ready_queue,
> >  						    struct pispbe_buffer,
> >  						    ready_list);
> > -	if (buf[CONFIG_NODE]) {
> > -		list_del(&buf[CONFIG_NODE]->ready_list);
> > -		pispbe->queued_job.buf[CONFIG_NODE] = buf[CONFIG_NODE];
> > +	if (!buf[CONFIG_NODE]) {
> > +		kfree(job);
> > +		return -ENODEV;
> >  	}
> > -	spin_unlock_irqrestore(&node->ready_lock, flags);
> >
> > -	/* Exit early if no config buffer has been queued. */
> > -	if (!buf[CONFIG_NODE])
> > -		return -ENODEV;
> > +	list_del(&buf[CONFIG_NODE]->ready_list);
> > +	job->buffers[CONFIG_NODE] = buf[CONFIG_NODE];
> >
> >  	config_index = buf[CONFIG_NODE]->vb.vb2_buf.index;
> >  	job->config = &pispbe->config[config_index];
> > @@ -495,7 +503,7 @@ static int pispbe_prepare_job(struct pispbe_dev *pispbe,
> >  			continue;
> >
> >  		buf[i] = NULL;
> > -		if (!(pispbe->streaming_map & BIT(i)))
> > +		if (!(streaming_map & BIT(i)))
> >  			continue;
> >
> >  		if ((!(rgb_en & PISP_BE_RGB_ENABLE_OUTPUT0) &&
> > @@ -522,25 +530,25 @@ static int pispbe_prepare_job(struct pispbe_dev *pispbe,
> >  		node = &pispbe->node[i];
> >
> >  		/* Pull a buffer from each V4L2 queue to form the queued job */
> > -		spin_lock_irqsave(&node->ready_lock, flags);
> >  		buf[i] = list_first_entry_or_null(&node->ready_queue,
> >  						  struct pispbe_buffer,
> >  						  ready_list);
> >  		if (buf[i]) {
> >  			list_del(&buf[i]->ready_list);
> > -			pispbe->queued_job.buf[i] = buf[i];
> > +			job->buffers[i] = buf[i];
> >  		}
> > -		spin_unlock_irqrestore(&node->ready_lock, flags);
> >
> >  		if (!buf[i] && !ignore_buffers)
> >  			goto err_return_buffers;
> >  	}
> >
> > -	pispbe->queued_job.valid = true;
> > -
> >  	/* Convert buffers to DMA addresses for the hardware */
> >  	pispbe_xlate_addrs(pispbe, job, buf);
> >
> > +	scoped_guard(spinlock_irq, &pispbe->hw_lock) {
> > +		list_add_tail(&job->queue, &pispbe->job_queue);
> > +	}
> > +
> >  	return 0;
> >
> >  err_return_buffers:
> > @@ -551,33 +559,39 @@ static int pispbe_prepare_job(struct pispbe_dev *pispbe,
> >  			continue;
> >
> >  		/* Return the buffer to the ready_list queue */
> > -		spin_lock_irqsave(&n->ready_lock, flags);
> >  		list_add(&buf[i]->ready_list, &n->ready_queue);
> > -		spin_unlock_irqrestore(&n->ready_lock, flags);
> >  	}
> >
> > -	memset(&pispbe->queued_job, 0, sizeof(pispbe->queued_job));
> > +	kfree(job);
> >
> >  	return -ENODEV;
> >  }
> >
> >  static void pispbe_schedule(struct pispbe_dev *pispbe, bool clear_hw_busy)
> >  {
> > -	struct pispbe_job_descriptor job;
> > -	unsigned long flags;
> > -	int ret;
> > +	struct pispbe_job_descriptor *job;
> > +
> > +	scoped_guard(spinlock_irqsave, &pispbe->hw_lock) {
> > +		if (clear_hw_busy)
> > +			pispbe->hw_busy = false;
> > +
> > +		if (pispbe->hw_busy)
> > +			return;
> >
> > -	spin_lock_irqsave(&pispbe->hw_lock, flags);
> > +		job = list_first_entry_or_null(&pispbe->job_queue,
> > +					       struct pispbe_job_descriptor,
> > +					       queue);
> > +		if (!job)
> > +			return;
> >
> > -	if (clear_hw_busy)
> > -		pispbe->hw_busy = false;
> > +		list_del(&job->queue);
> >
> > -	if (pispbe->hw_busy)
> > -		goto unlock_and_return;
> > +		for (unsigned int i = 0; i < PISPBE_NUM_NODES; i++)
> > +			pispbe->queued_job.buf[i] = job->buffers[i];
> > +		pispbe->queued_job.valid = true;
> >
> > -	ret = pispbe_prepare_job(pispbe, &job);
> > -	if (ret)
> > -		goto unlock_and_return;
> > +		pispbe->hw_busy = true;
> > +	}
> >
> >  	/*
> >  	 * We can kick the job off without the hw_lock, as this can
> > @@ -585,16 +599,8 @@ static void pispbe_schedule(struct pispbe_dev *pispbe, bool clear_hw_busy)
> >  	 * only when the following job has been queued and an interrupt
> >  	 * is rised.
> >  	 */
> > -	pispbe->hw_busy = true;
> > -	spin_unlock_irqrestore(&pispbe->hw_lock, flags);
> > -
> > -	pispbe_queue_job(pispbe, &job);
> > -
> > -	return;
> > -
> > -unlock_and_return:
> > -	/* No job has been queued, just release the lock and return. */
> > -	spin_unlock_irqrestore(&pispbe->hw_lock, flags);
> > +	pispbe_queue_job(pispbe, job);
> > +	kfree(job);
> >  }
> >
> >  static void pispbe_isr_jobdone(struct pispbe_dev *pispbe,
> > @@ -846,18 +852,16 @@ static void pispbe_node_buffer_queue(struct vb2_buffer *buf)
> >  		container_of(vbuf, struct pispbe_buffer, vb);
> >  	struct pispbe_node *node = vb2_get_drv_priv(buf->vb2_queue);
> >  	struct pispbe_dev *pispbe = node->pispbe;
> > -	unsigned long flags;
> >
> >  	dev_dbg(pispbe->dev, "%s: for node %s\n", __func__, NODE_NAME(node));
> > -	spin_lock_irqsave(&node->ready_lock, flags);
> >  	list_add_tail(&buffer->ready_list, &node->ready_queue);
> > -	spin_unlock_irqrestore(&node->ready_lock, flags);
> >
> >  	/*
> >  	 * Every time we add a buffer, check if there's now some work for the hw
> >  	 * to do.
> >  	 */
> > -	pispbe_schedule(pispbe, false);
> > +	if (!pispbe_prepare_job(pispbe))
> > +		pispbe_schedule(pispbe, false);
> >  }
> >
> >  static int pispbe_node_start_streaming(struct vb2_queue *q, unsigned int count)
> > @@ -865,17 +869,16 @@ static int pispbe_node_start_streaming(struct vb2_queue *q, unsigned int count)
> >  	struct pispbe_node *node = vb2_get_drv_priv(q);
> >  	struct pispbe_dev *pispbe = node->pispbe;
> >  	struct pispbe_buffer *buf, *tmp;
> > -	unsigned long flags;
> >  	int ret;
> >
> >  	ret = pm_runtime_resume_and_get(pispbe->dev);
> >  	if (ret < 0)
> >  		goto err_return_buffers;
> >
> > -	spin_lock_irqsave(&pispbe->hw_lock, flags);
> > -	node->pispbe->streaming_map |=  BIT(node->id);
> > -	node->pispbe->sequence = 0;
> > -	spin_unlock_irqrestore(&pispbe->hw_lock, flags);
> > +	scoped_guard(spinlock_irq, &pispbe->hw_lock) {
> > +		node->pispbe->streaming_map |=  BIT(node->id);
> > +		node->pispbe->sequence = 0;
> > +	}
> >
> >  	dev_dbg(pispbe->dev, "%s: for node %s (count %u)\n",
> >  		__func__, NODE_NAME(node), count);
> > @@ -883,17 +886,16 @@ static int pispbe_node_start_streaming(struct vb2_queue *q, unsigned int count)
> >  		node->pispbe->streaming_map);
> >
> >  	/* Maybe we're ready to run. */
> > -	pispbe_schedule(pispbe, false);
> > +	if (!pispbe_prepare_job(pispbe))
> > +		pispbe_schedule(pispbe, false);
> >
> >  	return 0;
> >
> >  err_return_buffers:
> > -	spin_lock_irqsave(&pispbe->hw_lock, flags);
> >  	list_for_each_entry_safe(buf, tmp, &node->ready_queue, ready_list) {
> >  		list_del(&buf->ready_list);
> >  		vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_QUEUED);
> >  	}
> > -	spin_unlock_irqrestore(&pispbe->hw_lock, flags);
> >
> >  	return ret;
> >  }
> > @@ -903,7 +905,6 @@ static void pispbe_node_stop_streaming(struct vb2_queue *q)
> >  	struct pispbe_node *node = vb2_get_drv_priv(q);
> >  	struct pispbe_dev *pispbe = node->pispbe;
> >  	struct pispbe_buffer *buf;
> > -	unsigned long flags;
> >
> >  	/*
> >  	 * Now this is a bit awkward. In a simple M2M device we could just wait
> > @@ -915,11 +916,7 @@ static void pispbe_node_stop_streaming(struct vb2_queue *q)
> >  	 * This may return buffers out of order.
> >  	 */
> >  	dev_dbg(pispbe->dev, "%s: for node %s\n", __func__, NODE_NAME(node));
> > -	spin_lock_irqsave(&pispbe->hw_lock, flags);
> >  	do {
> > -		unsigned long flags1;
> > -
> > -		spin_lock_irqsave(&node->ready_lock, flags1);
> >  		buf = list_first_entry_or_null(&node->ready_queue,
> >  					       struct pispbe_buffer,
> >  					       ready_list);
> > @@ -927,15 +924,23 @@ static void pispbe_node_stop_streaming(struct vb2_queue *q)
> >  			list_del(&buf->ready_list);
> >  			vb2_buffer_done(&buf->vb.vb2_buf, VB2_BUF_STATE_ERROR);
> >  		}
> > -		spin_unlock_irqrestore(&node->ready_lock, flags1);
> >  	} while (buf);
> > -	spin_unlock_irqrestore(&pispbe->hw_lock, flags);
> >
> >  	vb2_wait_for_all_buffers(&node->queue);
> >
> > -	spin_lock_irqsave(&pispbe->hw_lock, flags);
> > +	spin_lock_irq(&pispbe->hw_lock);
> >  	pispbe->streaming_map &= ~BIT(node->id);
> > -	spin_unlock_irqrestore(&pispbe->hw_lock, flags);
> > +
> > +	/* Release all jobs once all nodes have stopped streaming. */
> > +	if (pispbe->streaming_map == 0) {
> > +		struct pispbe_job_descriptor *job, *temp;
> > +
> > +		list_for_each_entry_safe(job, temp, &pispbe->job_queue, queue) {
> > +			list_del(&job->queue);
> > +			kfree(job);
> > +		}
> > +	}
>
> Please splice pispbe->job_queue to a local list with the lock held, and
> then iterate over the local list without the lock held to free the jobs.
>
> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
>
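
(For illustration, the splice-then-free pattern suggested above could look
roughly like this; a sketch only, with an arbitrary local list name, not
the actual patch:)

	struct pispbe_job_descriptor *job, *temp;
	LIST_HEAD(tmp_list);

	spin_lock_irq(&pispbe->hw_lock);
	pispbe->streaming_map &= ~BIT(node->id);

	/* Collect all pending jobs once all nodes have stopped streaming. */
	if (pispbe->streaming_map == 0)
		list_splice_init(&pispbe->job_queue, &tmp_list);
	spin_unlock_irq(&pispbe->hw_lock);

	/* Free the collected jobs with the lock released. */
	list_for_each_entry_safe(job, temp, &tmp_list, queue) {
		list_del(&job->queue);
		kfree(job);
	}
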
> > +	spin_unlock_irq(&pispbe->hw_lock);
> >
> >  	pm_runtime_mark_last_busy(pispbe->dev);
> >  	pm_runtime_put_autosuspend(pispbe->dev);
> > @@ -1393,7 +1398,6 @@ static int pispbe_init_node(struct pispbe_dev *pispbe, unsigned int id)
> >  	mutex_init(&node->node_lock);
> >  	mutex_init(&node->queue_lock);
> >  	INIT_LIST_HEAD(&node->ready_queue);
> > -	spin_lock_init(&node->ready_lock);
> >
> >  	node->format.type = node->buf_type;
> >  	pispbe_node_def_fmt(node);
> > @@ -1677,6 +1681,8 @@ static int pispbe_probe(struct platform_device *pdev)
> >  	if (!pispbe)
> >  		return -ENOMEM;
> >
> > +	INIT_LIST_HEAD(&pispbe->job_queue);
> > +
> >  	dev_set_drvdata(&pdev->dev, pispbe);
> >  	pispbe->dev = &pdev->dev;
> >  	platform_set_drvdata(pdev, pispbe);
>
> --
> Regards,
>
> Laurent Pinchart


* Re: [PATCH v7 3/4] media: pisp_be: Split jobs creation and scheduling
  2025-06-17 12:32     ` Jacopo Mondi
@ 2025-06-17 13:53       ` Laurent Pinchart
  2025-06-17 14:06         ` Jacopo Mondi
  0 siblings, 1 reply; 12+ messages in thread
From: Laurent Pinchart @ 2025-06-17 13:53 UTC (permalink / raw)
  To: Jacopo Mondi
  Cc: Naushir Patuck, Nick Hollinghurst, David Plowman, Dave Stevenson,
	Raspberry Pi Kernel Maintenance, Mauro Carvalho Chehab,
	Florian Fainelli, Broadcom internal kernel review list,
	Sakari Ailus, Hans Verkuil, linux-media, linux-rpi-kernel,
	linux-arm-kernel, linux-kernel

On Tue, Jun 17, 2025 at 02:32:19PM +0200, Jacopo Mondi wrote:
> On Mon, Jun 16, 2025 at 05:40:09PM +0300, Laurent Pinchart wrote:
> > On Fri, Jun 06, 2025 at 12:29:23PM +0200, Jacopo Mondi wrote:
> > > Currently the 'pispbe_schedule()' function does two things:
> > >
> > > 1) Tries to assemble a job by inspecting all the video node queues
> > >    to make sure all the required buffers are available
> > > 2) Submit the job to the hardware
> > >
> > > The pispbe_schedule() function is called at:
> > >
> > > - video device start_streaming() time
> > > - video device qbuf() time
> > > - irq handler
> > >
> > > As assembling a job requires inspecting all queues, it is a rather
> > > time consuming operation which is better not run in IRQ context.
> > >
> > > To avoid the executing the time consuming job creation in interrupt
> >
> > s/the executing/executing/
> >
> > > context split the job creation and job scheduling in two distinct
> > > operations. When a well-formed job is created, append it to the
> > > newly introduced 'pispbe->job_queue' where it will be dequeued from
> > > by the scheduling routine.
> > >
> > > As the per-node 'ready_queue' buffer list is only accessed in vb2
> > > ops callbacks, protected by a mutex, it is not necessary to guard it
> >
> > "by the node->queue_lock mutex"
> >
> > > with a dedicated spinlock so drop it. Also use the spin_lock_irq()
> > > variant in all functions not called from an IRQ context where the
> > > spin_lock_irqsave() version was used.
> > >
> > > Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
> > > ---
> > >  .../media/platform/raspberrypi/pisp_be/pisp_be.c   | 152 +++++++++++----------
> > >  1 file changed, 79 insertions(+), 73 deletions(-)
> > >
> > > diff --git a/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c b/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
> > > index 92c452891d6c..c25f7d9b404c 100644
> > > --- a/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
> > > +++ b/drivers/media/platform/raspberrypi/pisp_be/pisp_be.c
> > > @@ -161,8 +161,6 @@ struct pispbe_node {
> > >  	struct mutex node_lock;
> > >  	/* vb2_queue lock */
> > >  	struct mutex queue_lock;
> > > -	/* Protect pispbe_node->ready_queue and pispbe_buffer->ready_list */
> > > -	spinlock_t ready_lock;
> > >  	struct list_head ready_queue;
> > >  	struct vb2_queue queue;
> > >  	struct v4l2_format format;
> > > @@ -190,6 +188,8 @@ struct pispbe_hw_enables {
> > >
> > >  /* Records a job configuration and memory addresses. */
> > >  struct pispbe_job_descriptor {
> > > +	struct list_head queue;
> > > +	struct pispbe_buffer *buffers[PISPBE_NUM_NODES];
> > >  	dma_addr_t hw_dma_addrs[N_HW_ADDRESSES];
> > >  	struct pisp_be_tiles_config *config;
> > >  	struct pispbe_hw_enables hw_enables;
> > > @@ -215,8 +215,10 @@ struct pispbe_dev {
> > >  	unsigned int sequence;
> > >  	u32 streaming_map;
> > >  	struct pispbe_job queued_job, running_job;
> > > -	spinlock_t hw_lock; /* protects "hw_busy" flag and streaming_map */
> > > +	/* protects "hw_busy" flag, streaming_map and job_queue */
> > > +	spinlock_t hw_lock;
> > >  	bool hw_busy; /* non-zero if a job is queued or is being started */
> > > +	struct list_head job_queue;
> > >  	int irq;
> > >  	u32 hw_version;
> > >  	u8 done, started;
> > > @@ -440,41 +442,47 @@ static void pispbe_xlate_addrs(struct pispbe_dev *pispbe,
> > >   * For Output0, Output1, Tdn and Stitch, a buffer only needs to be
> > >   * available if the blocks are enabled in the config.
> > >   *
> > > - * Needs to be called with hw_lock held.
> > > + * If all the buffers required to form a job are available, append the
> > > + * job descriptor to the job queue to be later queued to the HW.
> > >   *
> > >   * Returns 0 if a job has been successfully prepared, < 0 otherwise.
> > >   */
> > > -static int pispbe_prepare_job(struct pispbe_dev *pispbe,
> > > -			      struct pispbe_job_descriptor *job)
> > > +static int pispbe_prepare_job(struct pispbe_dev *pispbe)
> > >  {
> > >  	struct pispbe_buffer *buf[PISPBE_NUM_NODES] = {};
> > > +	struct pispbe_job_descriptor *job;
> >
> > You could use
> >
> > 	struct pispbe_job_descriptor __free(kfree) *job = NULL;
> >
> > and drop the kfree() in the error paths to simplify error handling and
> > make it more robust. Don't forget to set job to NULL just after adding
> > it to the job_queue.
> >
> 
> Only if I
> 
> 	no_free_ptr(job);

That's setting it to NULL, yes.

> before returning, as job has to stay valid until it gets consumed.
> 
> I'm not sure it's worth it just to save two "kfree(job);" in error
> paths

Up to you.

-- 
Regards,

Laurent Pinchart


* Re: [PATCH v7 3/4] media: pisp_be: Split jobs creation and scheduling
  2025-06-17 13:53       ` Laurent Pinchart
@ 2025-06-17 14:06         ` Jacopo Mondi
  0 siblings, 0 replies; 12+ messages in thread
From: Jacopo Mondi @ 2025-06-17 14:06 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Jacopo Mondi, Naushir Patuck, Nick Hollinghurst, David Plowman,
	Dave Stevenson, Raspberry Pi Kernel Maintenance,
	Mauro Carvalho Chehab, Florian Fainelli,
	Broadcom internal kernel review list, Sakari Ailus, Hans Verkuil,
	linux-media, linux-rpi-kernel, linux-arm-kernel, linux-kernel

On Tue, Jun 17, 2025 at 04:53:04PM +0300, Laurent Pinchart wrote:
> > > > -static int pispbe_prepare_job(struct pispbe_dev *pispbe,
> > > > -			      struct pispbe_job_descriptor *job)
> > > > +static int pispbe_prepare_job(struct pispbe_dev *pispbe)
> > > >  {
> > > >  	struct pispbe_buffer *buf[PISPBE_NUM_NODES] = {};
> > > > +	struct pispbe_job_descriptor *job;
> > >
> > > You could use
> > >
> > > 	struct pispbe_job_descriptor __free(kfree) *job = NULL;
> > >
> > > and drop the kfree() in the error paths to simplify error handling and
> > > make it more robust. Don't forget to set job to NULL just after adding
> > > it to the job_queue.
> > >
> >
> > Only if I
> >
> > 	no_free_ptr(job);
>
> That's setting it to NULL, yes.
>

I realized my comment was unparsable, sorry. I meant that I wanted to use
no_free_ptr(job), which is equivalent to job = NULL but more explicit.
However, media-ci reported that I'm not meant to ignore its return value,
so I went for job = NULL in the end.

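To make that concrete, the resulting shape is roughly the following (just
an illustrative, trimmed sketch of the pattern, not the literal v8 diff):

static int pispbe_prepare_job(struct pispbe_dev *pispbe)
{
	struct pispbe_job_descriptor __free(kfree) *job = NULL;

	/* ... check streaming_map and take a copy under hw_lock ... */

	job = kzalloc(sizeof(*job), GFP_KERNEL);
	if (!job)
		return -ENOMEM;

	/*
	 * ... pull buffers and fill in the descriptor; on error simply
	 * return -ENODEV and __free(kfree) releases the allocation ...
	 */

	scoped_guard(spinlock_irq, &pispbe->hw_lock) {
		list_add_tail(&job->queue, &pispbe->job_queue);
	}

	/* Ownership has moved to job_queue, don't auto-free it. */
	job = NULL;

	return 0;
}
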
> > before returning, as job has to stay valid until it gets consumed.
> >
> > I'm not sure it's worth it just to save two "kfree(job);" in error
> > paths
>
> Up to you.
>

I'm in two minds here. It makes the cleanup paths easier but requires
ad-hoc handling before returning. Oh well, let's use these new fancy
features and be done with it. That's what I've done in v8.

Thanks
  j


Thread overview: 12+ messages
2025-06-06 10:29 [PATCH v7 0/4] media: pisp-be: Split jobs creation and scheduling Jacopo Mondi
2025-06-06 10:29 ` [PATCH v7 1/4] media: pisp_be: Drop reference to non-existing function Jacopo Mondi
2025-06-13  8:00   ` Naushir Patuck
2025-06-06 10:29 ` [PATCH v7 2/4] media: pisp_be: Remove config validation from schedule() Jacopo Mondi
2025-06-13  8:31   ` Naushir Patuck
2025-06-06 10:29 ` [PATCH v7 3/4] media: pisp_be: Split jobs creation and scheduling Jacopo Mondi
2025-06-16 14:40   ` Laurent Pinchart
2025-06-17 12:32     ` Jacopo Mondi
2025-06-17 13:53       ` Laurent Pinchart
2025-06-17 14:06         ` Jacopo Mondi
2025-06-06 10:29 ` [PATCH v7 4/4] media: pisp_be: Fix pm_runtime underrun in probe Jacopo Mondi
2025-06-16 14:17   ` Laurent Pinchart
