stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/2] media: qcom: camss: Fix two CAMSS bugs found by dogfooding with SoftISP
@ 2024-07-16 22:13 Bryan O'Donoghue
  2024-07-16 22:13 ` [PATCH v2 1/2] media: qcom: camss: Remove use_count guard in stop_streaming Bryan O'Donoghue
  2024-07-16 22:13 ` [PATCH v2 2/2] media: qcom: camss: Fix ordering of pm_runtime_enable Bryan O'Donoghue
  0 siblings, 2 replies; 6+ messages in thread
From: Bryan O'Donoghue @ 2024-07-16 22:13 UTC (permalink / raw)
  To: Robert Foss, Todor Tomov, Mauro Carvalho Chehab, Hans Verkuil,
	Hans Verkuil, Milen Mitkov
  Cc: linux-media, linux-arm-msm, linux-kernel, Mauro Carvalho Chehab,
	Johan Hovold, Bryan O'Donoghue, stable

v2:
- Updates commits with Johan's Review/Reported tags
- Adds Closes: https://lore.kernel.org/lkml/ZoVNHOTI0PKMNt4_@hovoldconsulting.com
- Cc's stable
- Adds in suggested kernel log to allow others to more easily match kernel
  log to fixes
- Link to v1: https://lore.kernel.org/r/20240714-linux-next-24-07-13-camss-fixes-v1-0-8f8954bc8c85@linaro.org

V1:
Dogfooding with SoftISP has uncovered two bugs in this series which I'm
posting fixes for.

- The first error:
  A simple race condition which to be honest I'm surprised I haven't found
  earlier nor has anybody else. Simply stated the order we typically
  end up loading CAMSS on boot has masked out the pm_runtime_enable() race
  condition that has been present in CAMSS for a long time.

  If you blacklist qcom-camss in modules.d and then modprobe after boot,
  the race condition shows up easily.

  Moving the pm_runtime_enable prior to subdevice registration fixes the
  problem.

The second error:
  Nomenclature:
    - CSIPHY: CSI Physical layer analogue to digital domain serialiser
    - CSID: CSI Decoder
    - VFE: Video Front End
    - RDI: Raw Data Interface
    - VC: Virtual Channel

  In order to support streaming multiple virtual-channels on the same RDI a
  V4L2 provided use_count variable is used to decide whether or not to actually
  terminate streaming and release buffers for 'msm_vfe_rdiX'.

  Unfortunately use_count indicates the number of times msm_vfe_rdiX has
  been opened by user-space not the number of concurrent streams on
  msm_vfe_rdiX.

  Simply stated use_count and stream_count are two different things.

  The silicon enabling code to select between VCs is valid but, a different
  solution needs to be found to support _concurrent_ VC streams.

  Right now the upstream use_count as-is is breaking the non concurrent VC
  case and I don't believe there are upstream users of concurrent VCs on
  CAMSS.

  This series implements a revert for the invalid use_count check,
  retaining the ability to select which VC is active on the RDI.

  Dogfooding with libcamera's SoftISP in Hangouts, Zoom and multiple runs
  of libcamera's "qcam" application is a very different test-case to the
  simple capture of frames we previously did when validating the
  'use_count' change.

  A partial revert in expectation of a renewed push to fixup that
  concurrent VC issue is included.

Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
---
Bryan O'Donoghue (2):
      media: qcom: camss: Remove use_count guard in stop_streaming
      media: qcom: camss: Fix ordering of pm_runtime_enable

 drivers/media/platform/qcom/camss/camss-video.c | 6 ------
 drivers/media/platform/qcom/camss/camss.c       | 5 +++--
 2 files changed, 3 insertions(+), 8 deletions(-)
---
base-commit: c6ce8f9ab92edc9726996a0130bfc1c408132d47
change-id: 20240713-linux-next-24-07-13-camss-fixes-fa98c0965a5d

Best regards,
-- 
Bryan O'Donoghue <bryan.odonoghue@linaro.org>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2 1/2] media: qcom: camss: Remove use_count guard in stop_streaming
  2024-07-16 22:13 [PATCH v2 0/2] media: qcom: camss: Fix two CAMSS bugs found by dogfooding with SoftISP Bryan O'Donoghue
@ 2024-07-16 22:13 ` Bryan O'Donoghue
  2024-07-17  9:06   ` Johan Hovold
  2024-07-16 22:13 ` [PATCH v2 2/2] media: qcom: camss: Fix ordering of pm_runtime_enable Bryan O'Donoghue
  1 sibling, 1 reply; 6+ messages in thread
From: Bryan O'Donoghue @ 2024-07-16 22:13 UTC (permalink / raw)
  To: Robert Foss, Todor Tomov, Mauro Carvalho Chehab, Hans Verkuil,
	Hans Verkuil, Milen Mitkov
  Cc: linux-media, linux-arm-msm, linux-kernel, Mauro Carvalho Chehab,
	Johan Hovold, Bryan O'Donoghue, stable

The use_count check was introduced so that multiple concurrent Raw Data
Interfaces RDIs could be driven by different virtual channels VCs on the
CSIPHY input driving the video pipeline.

This is an invalid use of use_count though as use_count pertains to the
number of times a video entity has been opened by user-space not the number
of active streams.

If use_count and stream-on count don't agree then stop_streaming() will
break as is currently the case and has become apparent when using CAMSS
with libcamera's released softisp 0.3.

The use of use_count like this is a bit hacky and right now breaks regular
usage of CAMSS for a single stream case. As an example the "qcam"
application in libcamera will fail with an -EBUSY result on stream stop and
cannot then subsequently be restarted.

The kernel log for this fault looks like this:

[ 1265.509831] WARNING: CPU: 5 PID: 919 at drivers/media/common/videobuf2/videobuf2-core.c:2183 __vb2_queue_cancel+0x230/0x2c8 [videobuf2_common]
...
[ 1265.510630] Call trace:
[ 1265.510636]  __vb2_queue_cancel+0x230/0x2c8 [videobuf2_common]
[ 1265.510648]  vb2_core_streamoff+0x24/0xcc [videobuf2_common]
[ 1265.510660]  vb2_ioctl_streamoff+0x5c/0xa8 [videobuf2_v4l2]
[ 1265.510673]  v4l_streamoff+0x24/0x30 [videodev]
[ 1265.510707]  __video_do_ioctl+0x190/0x3f4 [videodev]
[ 1265.510732]  video_usercopy+0x304/0x8c4 [videodev]
[ 1265.510757]  video_ioctl2+0x18/0x34 [videodev]
[ 1265.510782]  v4l2_ioctl+0x40/0x60 [videodev]
...
[ 1265.510944] videobuf2_common: driver bug: stop_streaming operation is leaving buffer 0 in active state
[ 1265.511175] videobuf2_common: driver bug: stop_streaming operation is leaving buffer 1 in active state
[ 1265.511398] videobuf2_common: driver bug: stop_streaming operation is leaving buffer 2 in active st

One CAMSS specific way to handle multiple VCs on the same RDI might be:

- Reference count each pipeline enable for CSIPHY, CSID, VFE and RDIx.
- The video buffers are already associated with msm_vfeN_rdiX so
  release video buffers when told to do so by stop_streaming.
- Only release the power-domains for the CSIPHY, CSID and VFE when
  their internal refcounts drop.

Either way refusing to release video buffers based on use_count is
erroneous and should be reverted. The silicon enabling code for selecting
VCs is perfectly fine. Its a "known missing feature" that concurrent VCs
won't work with CAMSS right now.

Initial testing with this code didn't show an error but, SoftISP and "real"
usage with Google Hangouts breaks the upstream code pretty quickly, we need
to do a partial revert and take another pass at VCs.

This commit partially reverts commit 89013969e232 ("media: camss: sm8250:
Pipeline starting and stopping for multiple virtual channels")

Fixes: 89013969e232 ("media: camss: sm8250: Pipeline starting and stopping for multiple virtual channels")
Reported-by: Johan Hovold <johan+linaro@kernel.org>
Closes: https://lore.kernel.org/lkml/ZoVNHOTI0PKMNt4_@hovoldconsulting.com/
Tested-by: Johan Hovold <johan+linaro@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
---
 drivers/media/platform/qcom/camss/camss-video.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/media/platform/qcom/camss/camss-video.c b/drivers/media/platform/qcom/camss/camss-video.c
index cd72feca618c..3b8fc31d957c 100644
--- a/drivers/media/platform/qcom/camss/camss-video.c
+++ b/drivers/media/platform/qcom/camss/camss-video.c
@@ -297,12 +297,6 @@ static void video_stop_streaming(struct vb2_queue *q)
 
 		ret = v4l2_subdev_call(subdev, video, s_stream, 0);
 
-		if (entity->use_count > 1) {
-			/* Don't stop if other instances of the pipeline are still running */
-			dev_dbg(video->camss->dev, "Video pipeline still used, don't stop streaming.\n");
-			return;
-		}
-
 		if (ret) {
 			dev_err(video->camss->dev, "Video pipeline stop failed: %d\n", ret);
 			return;

-- 
2.45.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 2/2] media: qcom: camss: Fix ordering of pm_runtime_enable
  2024-07-16 22:13 [PATCH v2 0/2] media: qcom: camss: Fix two CAMSS bugs found by dogfooding with SoftISP Bryan O'Donoghue
  2024-07-16 22:13 ` [PATCH v2 1/2] media: qcom: camss: Remove use_count guard in stop_streaming Bryan O'Donoghue
@ 2024-07-16 22:13 ` Bryan O'Donoghue
  1 sibling, 0 replies; 6+ messages in thread
From: Bryan O'Donoghue @ 2024-07-16 22:13 UTC (permalink / raw)
  To: Robert Foss, Todor Tomov, Mauro Carvalho Chehab, Hans Verkuil,
	Hans Verkuil, Milen Mitkov
  Cc: linux-media, linux-arm-msm, linux-kernel, Mauro Carvalho Chehab,
	Johan Hovold, Bryan O'Donoghue, stable

pm_runtime_enable() should happen prior to vfe_get() since vfe_get() calls
pm_runtime_resume_and_get().

This is a basic race condition that doesn't show up for most users so is
not widely reported. If you blacklist qcom-camss in modules.d and then
subsequently modprobe the module post-boot it is possible to reliably show
this error up.

The kernel log for this error looks like this:

qcom-camss ac5a000.camss: Failed to power up pipeline: -13

Fixes: 02afa816dbbf ("media: camss: Add basic runtime PM support")
Reported-by: Johan Hovold <johan+linaro@kernel.org>
Closes: https://lore.kernel.org/lkml/ZoVNHOTI0PKMNt4_@hovoldconsulting.com/
Tested-by: Johan Hovold <johan+linaro@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
---
 drivers/media/platform/qcom/camss/camss.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/media/platform/qcom/camss/camss.c b/drivers/media/platform/qcom/camss/camss.c
index 51b1d3550421..d64985ca6e88 100644
--- a/drivers/media/platform/qcom/camss/camss.c
+++ b/drivers/media/platform/qcom/camss/camss.c
@@ -2283,6 +2283,8 @@ static int camss_probe(struct platform_device *pdev)
 
 	v4l2_async_nf_init(&camss->notifier, &camss->v4l2_dev);
 
+	pm_runtime_enable(dev);
+
 	num_subdevs = camss_of_parse_ports(camss);
 	if (num_subdevs < 0) {
 		ret = num_subdevs;
@@ -2323,8 +2325,6 @@ static int camss_probe(struct platform_device *pdev)
 		}
 	}
 
-	pm_runtime_enable(dev);
-
 	return 0;
 
 err_register_subdevs:
@@ -2332,6 +2332,7 @@ static int camss_probe(struct platform_device *pdev)
 err_v4l2_device_unregister:
 	v4l2_device_unregister(&camss->v4l2_dev);
 	v4l2_async_nf_cleanup(&camss->notifier);
+	pm_runtime_disable(dev);
 err_genpd_cleanup:
 	camss_genpd_cleanup(camss);
 

-- 
2.45.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 1/2] media: qcom: camss: Remove use_count guard in stop_streaming
  2024-07-16 22:13 ` [PATCH v2 1/2] media: qcom: camss: Remove use_count guard in stop_streaming Bryan O'Donoghue
@ 2024-07-17  9:06   ` Johan Hovold
  2024-07-17 10:43     ` Bryan O'Donoghue
  0 siblings, 1 reply; 6+ messages in thread
From: Johan Hovold @ 2024-07-17  9:06 UTC (permalink / raw)
  To: Bryan O'Donoghue
  Cc: Robert Foss, Todor Tomov, Mauro Carvalho Chehab, Hans Verkuil,
	Hans Verkuil, Milen Mitkov, linux-media, linux-arm-msm,
	linux-kernel, Mauro Carvalho Chehab, Johan Hovold, stable

On Tue, Jul 16, 2024 at 11:13:24PM +0100, Bryan O'Donoghue wrote:
> The use_count check was introduced so that multiple concurrent Raw Data
> Interfaces RDIs could be driven by different virtual channels VCs on the
> CSIPHY input driving the video pipeline.
> 
> This is an invalid use of use_count though as use_count pertains to the
> number of times a video entity has been opened by user-space not the number
> of active streams.
> 
> If use_count and stream-on count don't agree then stop_streaming() will
> break as is currently the case and has become apparent when using CAMSS
> with libcamera's released softisp 0.3.
> 
> The use of use_count like this is a bit hacky and right now breaks regular
> usage of CAMSS for a single stream case. As an example the "qcam"
> application in libcamera will fail with an -EBUSY result on stream stop and
> cannot then subsequently be restarted.

No, stopping qcam results in the splat below, and then it cannot be
started again and any attempts to do so fails with -EBUSY.

> The kernel log for this fault looks like this:
> 
> [ 1265.509831] WARNING: CPU: 5 PID: 919 at drivers/media/common/videobuf2/videobuf2-core.c:2183 __vb2_queue_cancel+0x230/0x2c8 [videobuf2_common]
> ...
> [ 1265.510630] Call trace:
> [ 1265.510636]  __vb2_queue_cancel+0x230/0x2c8 [videobuf2_common]
> [ 1265.510648]  vb2_core_streamoff+0x24/0xcc [videobuf2_common]
> [ 1265.510660]  vb2_ioctl_streamoff+0x5c/0xa8 [videobuf2_v4l2]
> [ 1265.510673]  v4l_streamoff+0x24/0x30 [videodev]
> [ 1265.510707]  __video_do_ioctl+0x190/0x3f4 [videodev]
> [ 1265.510732]  video_usercopy+0x304/0x8c4 [videodev]
> [ 1265.510757]  video_ioctl2+0x18/0x34 [videodev]
> [ 1265.510782]  v4l2_ioctl+0x40/0x60 [videodev]
> ...
> [ 1265.510944] videobuf2_common: driver bug: stop_streaming operation is leaving buffer 0 in active state
> [ 1265.511175] videobuf2_common: driver bug: stop_streaming operation is leaving buffer 1 in active state
> [ 1265.511398] videobuf2_common: driver bug: stop_streaming operation is leaving buffer 2 in active st

Johan

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 1/2] media: qcom: camss: Remove use_count guard in stop_streaming
  2024-07-17  9:06   ` Johan Hovold
@ 2024-07-17 10:43     ` Bryan O'Donoghue
  2024-07-17 11:04       ` Johan Hovold
  0 siblings, 1 reply; 6+ messages in thread
From: Bryan O'Donoghue @ 2024-07-17 10:43 UTC (permalink / raw)
  To: Johan Hovold
  Cc: Robert Foss, Todor Tomov, Mauro Carvalho Chehab, Hans Verkuil,
	Hans Verkuil, Milen Mitkov, linux-media, linux-arm-msm,
	linux-kernel, Mauro Carvalho Chehab, Johan Hovold, stable

On 17/07/2024 10:06, Johan Hovold wrote:
>> The use of use_count like this is a bit hacky and right now breaks regular
>> usage of CAMSS for a single stream case. As an example the "qcam"
>> application in libcamera will fail with an -EBUSY result on stream stop and
>> cannot then subsequently be restarted.
> No, stopping qcam results in the splat below, and then it cannot be
> started again and any attempts to do so fails with -EBUSY.

I thought that's what I said.

Let me reword the commit log with your sentence included directly :)

---
bod

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 1/2] media: qcom: camss: Remove use_count guard in stop_streaming
  2024-07-17 10:43     ` Bryan O'Donoghue
@ 2024-07-17 11:04       ` Johan Hovold
  0 siblings, 0 replies; 6+ messages in thread
From: Johan Hovold @ 2024-07-17 11:04 UTC (permalink / raw)
  To: Bryan O'Donoghue
  Cc: Robert Foss, Todor Tomov, Mauro Carvalho Chehab, Hans Verkuil,
	Hans Verkuil, Milen Mitkov, linux-media, linux-arm-msm,
	linux-kernel, Mauro Carvalho Chehab, Johan Hovold, stable

On Wed, Jul 17, 2024 at 11:43:02AM +0100, Bryan O'Donoghue wrote:
> On 17/07/2024 10:06, Johan Hovold wrote:
> >> The use of use_count like this is a bit hacky and right now breaks regular
> >> usage of CAMSS for a single stream case. As an example the "qcam"
> >> application in libcamera will fail with an -EBUSY result on stream stop and
> >> cannot then subsequently be restarted.
> > No, stopping qcam results in the splat below, and then it cannot be
> > started again and any attempts to do so fails with -EBUSY.
> 
> I thought that's what I said.

I read the above as if stopping the stream fails with -EBUSY, when it's
really restarting the stream that fails that way after the first stop.

> Let me reword the commit log with your sentence included directly :)

Sounds good.

Johan

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-07-17 11:04 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-16 22:13 [PATCH v2 0/2] media: qcom: camss: Fix two CAMSS bugs found by dogfooding with SoftISP Bryan O'Donoghue
2024-07-16 22:13 ` [PATCH v2 1/2] media: qcom: camss: Remove use_count guard in stop_streaming Bryan O'Donoghue
2024-07-17  9:06   ` Johan Hovold
2024-07-17 10:43     ` Bryan O'Donoghue
2024-07-17 11:04       ` Johan Hovold
2024-07-16 22:13 ` [PATCH v2 2/2] media: qcom: camss: Fix ordering of pm_runtime_enable Bryan O'Donoghue

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).