[PATCH 0/2] media: qcom: camss: Fix two bugs in mainline

* [PATCH 0/2] media: qcom: camss: Fix two bugs in mainline
@ 2025-06-12  8:07 Bryan O'Donoghue
  2025-06-12  8:07 ` [PATCH 1/2] media: qcom: camss: csiphy-3ph: Fix inadvertent dropping of SDM660/SDM670 phy init Bryan O'Donoghue
  2025-06-12  8:07 ` [PATCH 2/2] media: qcom: camss: vfe: Fix registration sequencing bug Bryan O'Donoghue
  0 siblings, 2 replies; 9+ messages in thread
From: Bryan O'Donoghue @ 2025-06-12  8:07 UTC (permalink / raw)
  To: Robert Foss, Todor Tomov, Mauro Carvalho Chehab, Hans Verkuil,
	Depeng Shao, Vladimir Zapolskiy, Hans Verkuil
  Cc: linux-media, linux-arm-msm, linux-kernel, Bryan O'Donoghue,
	stable, Johan Hovold

Two bug fixes here.

First up, SDM630/SDM660 hasn't been probing because moving the CSIPHY gen2
init sequence into a common location also moved the default case of the
switch statement, which rejects non-gen2 devices.

Second is a fix for a long-standing race condition between fully
enumerating /dev/videoX devices, along with all of their dependent data
structures, and gating user-space access to those devices.

Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
---
Bryan O'Donoghue (2):
      media: qcom: camss: csiphy-3ph: Fix inadvertent dropping of SDM660/SDM670 phy init
      media: qcom: camss: vfe: Fix registration sequencing bug

 drivers/media/platform/qcom/camss/camss-csiphy-3ph-1-0.c | 3 +--
 drivers/media/platform/qcom/camss/camss-vfe.c            | 8 ++++++++
 drivers/media/platform/qcom/camss/camss-vfe.h            | 1 +
 3 files changed, 10 insertions(+), 2 deletions(-)
---
base-commit: 8666245114d979b963dc23894a03c74ecab8a7a6
change-id: 20250610-linux-next-25-05-30-daily-reviews-47ef54eee7ea

Best regards,
-- 
Bryan O'Donoghue <bryan.odonoghue@linaro.org>



* [PATCH 1/2] media: qcom: camss: csiphy-3ph: Fix inadvertent dropping of SDM660/SDM670 phy init
  2025-06-12  8:07 [PATCH 0/2] media: qcom: camss: Fix two bugs in mainline Bryan O'Donoghue
@ 2025-06-12  8:07 ` Bryan O'Donoghue
  2025-06-17  9:02   ` Vladimir Zapolskiy
  2025-06-12  8:07 ` [PATCH 2/2] media: qcom: camss: vfe: Fix registration sequencing bug Bryan O'Donoghue
  1 sibling, 1 reply; 9+ messages in thread
From: Bryan O'Donoghue @ 2025-06-12  8:07 UTC (permalink / raw)
  To: Robert Foss, Todor Tomov, Mauro Carvalho Chehab, Hans Verkuil,
	Depeng Shao, Vladimir Zapolskiy, Hans Verkuil
  Cc: linux-media, linux-arm-msm, linux-kernel, Bryan O'Donoghue,
	stable

The moving of init sequence hook from gen2() to subdev_init() doesn't
account for gen1 devices such as SDM660 and SDM670. The switch should find
the right offset for gen2 PHYs only, not reject gen1. Remove the default
error case to restore gen1 CSIPHY support.

Cc: stable@vger.kernel.org
Fixes: fbce0ca24c3a ("media: qcom: camss: csiphy-3ph: Move CSIPHY variables to data field inside csiphy struct")
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
---
 drivers/media/platform/qcom/camss/camss-csiphy-3ph-1-0.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/media/platform/qcom/camss/camss-csiphy-3ph-1-0.c b/drivers/media/platform/qcom/camss/camss-csiphy-3ph-1-0.c
index f732a76de93e3e7b787d9553bf7f31e6c0596c58..88c0ba495c3271f3d09d2c48f07b03d2a4949061 100644
--- a/drivers/media/platform/qcom/camss/camss-csiphy-3ph-1-0.c
+++ b/drivers/media/platform/qcom/camss/camss-csiphy-3ph-1-0.c
@@ -849,8 +849,7 @@ static int csiphy_init(struct csiphy_device *csiphy)
 		regs->offset = 0x1000;
 		break;
 	default:
-		WARN(1, "unknown csiphy version\n");
-		return -ENODEV;
+		break;
 	}
 
 	return 0;

-- 
2.49.0
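
As a rough illustration of the behaviour change in this patch, here is a
minimal user-space sketch. Everything in it is hypothetical apart from the
0x1000 offset and the -ENODEV return visible in the hunk above: the enum,
the struct and the function names are stand-ins, not the driver's own.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's SoC version enum and register data. */
enum soc_version { SOC_GEN1_A, SOC_GEN1_B, SOC_GEN2_A };

struct phy_regs {
	unsigned int offset;	/* only looked up for gen2 PHYs */
};

/* Models the code before the patch: anything without a gen2 offset is rejected. */
int phy_init_strict(enum soc_version v, struct phy_regs *regs)
{
	assert(regs != NULL);

	switch (v) {
	case SOC_GEN2_A:
		regs->offset = 0x1000;
		break;
	default:
		return -ENODEV;	/* gen1 parts never get past this point */
	}

	return 0;
}

/* Models the code after the patch: gen1 simply has no gen2 offset to set. */
int phy_init_relaxed(enum soc_version v, struct phy_regs *regs)
{
	assert(regs != NULL);

	switch (v) {
	case SOC_GEN2_A:
		regs->offset = 0x1000;
		break;
	default:
		break;		/* not an error, just nothing to look up */
	}

	return 0;
}
```

With a zeroed struct phy_regs, the strict variant fails for SOC_GEN1_A while
the relaxed variant succeeds and leaves the offset untouched.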



* [PATCH 2/2] media: qcom: camss: vfe: Fix registration sequencing bug
  2025-06-12  8:07 [PATCH 0/2] media: qcom: camss: Fix two bugs in mainline Bryan O'Donoghue
  2025-06-12  8:07 ` [PATCH 1/2] media: qcom: camss: csiphy-3ph: Fix inadvertent dropping of SDM660/SDM670 phy init Bryan O'Donoghue
@ 2025-06-12  8:07 ` Bryan O'Donoghue
  2025-06-13  9:13   ` Vladimir Zapolskiy
  2025-06-16  9:19   ` Johan Hovold
  1 sibling, 2 replies; 9+ messages in thread
From: Bryan O'Donoghue @ 2025-06-12  8:07 UTC (permalink / raw)
  To: Robert Foss, Todor Tomov, Mauro Carvalho Chehab, Hans Verkuil,
	Depeng Shao, Vladimir Zapolskiy, Hans Verkuil
  Cc: linux-media, linux-arm-msm, linux-kernel, Bryan O'Donoghue,
	stable, Johan Hovold

msm_vfe_register_entities loops through each Raw Data Interface input line.
For each loop we add a video device with its associated pads.

Once a single /dev/video0 node has been populated it is possible for
camss_find_sensor_pad to run. This routine scans through a list of media
entities taking a pointer pad = media_entity->pad[0] and assuming that
pointer is always valid.

It is possible for both the enumeration loop in msm_vfe_register_entities()
and a call from user-space to run concurrently.

Adding some deliberate sleep code into the loop in
msm_vfe_register_entities() and constructing a user-space program to open
every /dev/videoX node in a tight continuous loop, quickly shows the
following error.

[  691.074558] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000030
[  691.074933] Call trace:
[  691.074935]  camss_find_sensor_pad+0x74/0x114 [qcom_camss] (P)
[  691.074946]  camss_get_pixel_clock+0x18/0x64 [qcom_camss]
[  691.074956]  vfe_get+0xc0/0x54c [qcom_camss]
[  691.074968]  vfe_set_power+0x58/0xf4c [qcom_camss]
[  691.074978]  pipeline_pm_power_one+0x124/0x140 [videodev]
[  691.074986]  pipeline_pm_power+0x70/0x100 [videodev]
[  691.074992]  v4l2_pipeline_pm_use+0x54/0x90 [videodev]
[  691.074998]  v4l2_pipeline_pm_get+0x14/0x20 [videodev]
[  691.075005]  video_open+0x74/0xe0 [qcom_camss]
[  691.075014]  v4l2_open+0xa8/0x124 [videodev]
[  691.075021]  chrdev_open+0xb0/0x21c
[  691.075031]  do_dentry_open+0x138/0x4c4
[  691.075040]  vfs_open+0x2c/0xe8
[  691.075044]  path_openat+0x6f0/0x10a0
[  691.075050]  do_filp_open+0xa8/0x164
[  691.075054]  do_sys_openat2+0x94/0x104
[  691.075058]  __arm64_sys_openat+0x64/0xc0
[  691.075061]  invoke_syscall+0x48/0x104
[  691.075069]  el0_svc_common.constprop.0+0x40/0xe0
[  691.075075]  do_el0_svc+0x1c/0x28
[  691.075080]  el0_svc+0x30/0xcc
[  691.075085]  el0t_64_sync_handler+0x10c/0x138
[  691.075088]  el0t_64_sync+0x198/0x19c
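
For illustration only, the user-space side of such a reproducer might look
like the sketch below. It is not the program used to produce the trace above
(that program is not included in this thread); the function name
hammer_video_nodes() and its parameters are hypothetical, and the deliberate
sleep in the kernel loop is still needed to widen the window.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Open and immediately close /dev/video0 .. /dev/video<max_nodes - 1>,
 * repeating the whole sweep `rounds` times.
 * Returns how many open() calls succeeded.
 */
long hammer_video_nodes(int max_nodes, long rounds)
{
	char path[32];
	long opened = 0;

	assert(max_nodes > 0 && rounds >= 0);

	for (long r = 0; r < rounds; r++) {
		for (int i = 0; i < max_nodes; i++) {
			snprintf(path, sizeof(path), "/dev/video%d", i);

			int fd = open(path, O_RDWR);
			if (fd < 0)
				continue;	/* node missing or not usable yet */

			opened++;
			close(fd);
		}
	}

	return opened;
}
```

Calling it with a large `rounds` value while the driver is probing gives the
tight open loop described above; failed opens are skipped so the sweep keeps
running.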

Taking the vfe->power_lock is not possible since
v4l2_device_register_subdev takes the mdev->graph_lock. Later on fops->open
takes the mdev->graph_lock followed by vfe_get() -> taking vfe->power_lock.

Introduce a simple enumeration_complete bool which is false initially and
only set true once in our init routine after we complete enumeration.

If user-space tries to interact with the VFE before complete enumeration it
will receive -EAGAIN.

Cc: stable@vger.kernel.org
Fixes: 4c98a5f57f90 ("media: camss: Add VFE files")
Reported-by: Johan Hovold <johan+linaro@kernel.org>
Closes: https://lore.kernel.org/all/Zwjw6XfVWcufMlqM@hovoldconsulting.com
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
---
 drivers/media/platform/qcom/camss/camss-vfe.c | 8 ++++++++
 drivers/media/platform/qcom/camss/camss-vfe.h | 1 +
 2 files changed, 9 insertions(+)

diff --git a/drivers/media/platform/qcom/camss/camss-vfe.c b/drivers/media/platform/qcom/camss/camss-vfe.c
index ac3a9579e3e6910eee8c1ec11c4fff6e1bc94443..3fccc83ebbcc84c20e17da72c359de3dfd18fb40 100644
--- a/drivers/media/platform/qcom/camss/camss-vfe.c
+++ b/drivers/media/platform/qcom/camss/camss-vfe.c
@@ -1062,6 +1062,9 @@ int vfe_get(struct vfe_device *vfe)
 {
 	int ret;
 
+	if (!vfe->enumeration_complete)
+		return -EAGAIN;
+
 	mutex_lock(&vfe->power_lock);
 
 	if (vfe->power_count == 0) {
@@ -1122,6 +1125,9 @@ int vfe_get(struct vfe_device *vfe)
  */
 void vfe_put(struct vfe_device *vfe)
 {
+	if (!vfe->enumeration_complete)
+		return;
+
 	mutex_lock(&vfe->power_lock);
 
 	if (vfe->power_count == 0) {
@@ -2091,6 +2097,8 @@ int msm_vfe_register_entities(struct vfe_device *vfe,
 		}
 	}
 
+	vfe->enumeration_complete = true;
+
 	return 0;
 
 error_link:
diff --git a/drivers/media/platform/qcom/camss/camss-vfe.h b/drivers/media/platform/qcom/camss/camss-vfe.h
index 614e932c33da78e02e0800ce6534af7b14822f83..33b59dcfc8b2b81e896336b079a41eba603ec400 100644
--- a/drivers/media/platform/qcom/camss/camss-vfe.h
+++ b/drivers/media/platform/qcom/camss/camss-vfe.h
@@ -169,6 +169,7 @@ struct vfe_device {
 	struct camss_video_ops video_ops;
 	struct device *genpd;
 	struct device_link *genpd_link;
+	bool enumeration_complete;
 };
 
 struct camss_subdev_resources;

-- 
2.49.0
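
The gating logic added by this patch can be modelled in user space roughly
as follows. This is a single-threaded sketch under stated assumptions:
struct vfe_model and the vfe_model_*() helpers are hypothetical reductions
of the driver's structures, and the power_lock, clocks and all hardware
handling are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much-reduced model of struct vfe_device. */
struct vfe_model {
	bool enumeration_complete;	/* set once registration has finished */
	int power_count;		/* number of successful gets outstanding */
};

/* Models vfe_get(): refuse to power up until enumeration has completed. */
int vfe_model_get(struct vfe_model *vfe)
{
	assert(vfe != NULL);

	if (!vfe->enumeration_complete)
		return -EAGAIN;

	vfe->power_count++;
	return 0;
}

/* Models vfe_put(): a get rejected above must not be balanced by a put. */
void vfe_model_put(struct vfe_model *vfe)
{
	assert(vfe != NULL);

	if (!vfe->enumeration_complete)
		return;

	if (vfe->power_count > 0)
		vfe->power_count--;
}

/* Models the end of msm_vfe_register_entities(). */
void vfe_model_register_done(struct vfe_model *vfe)
{
	assert(vfe != NULL);
	vfe->enumeration_complete = true;
}
```

Before vfe_model_register_done() runs, a get returns -EAGAIN and leaves the
count at zero; afterwards get and put pair up as usual.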



* Re: [PATCH 2/2] media: qcom: camss: vfe: Fix registration sequencing bug
  2025-06-12  8:07 ` [PATCH 2/2] media: qcom: camss: vfe: Fix registration sequencing bug Bryan O'Donoghue
@ 2025-06-13  9:13   ` Vladimir Zapolskiy
  2025-06-16 14:09     ` Bryan O'Donoghue
  2025-06-16  9:19   ` Johan Hovold
  1 sibling, 1 reply; 9+ messages in thread
From: Vladimir Zapolskiy @ 2025-06-13  9:13 UTC (permalink / raw)
  To: Bryan O'Donoghue, Robert Foss, Todor Tomov,
	Mauro Carvalho Chehab, Hans Verkuil, Depeng Shao, Hans Verkuil
  Cc: linux-media, linux-arm-msm, linux-kernel, stable, Johan Hovold

Hi Bryan.

On 6/12/25 11:07, Bryan O'Donoghue wrote:
> msm_vfe_register_entities loops through each Raw Data Interface input line.
> For each loop we add a video device with its associated pads.
> 
> Once a single /dev/video0 node has been populated it is possible for

Here is a typo, /dev/video0 should be replaced by something like /dev/videoX.

> camss_find_sensor_pad to run. This routine scans through a list of media
> entities taking a pointer pad = media_entity->pad[0] and assuming that
> pointer is always valid.
> 
> It is possible for both the enumeration loop in msm_vfe_register_entities()
> and a call from user-space to run concurrently.

This is where my understanding falls short; please explain further.

Per se this concurrent execution shall not lead to the encountered bug,
both an initialization of media entity pads by media_entity_pads_init()
and a registration of a v4l2 devnode inside msm_video_register() are
done in a proper sequence, aren't they?

From what I read there is no bug stated.

> Adding some deliberate sleep code into the loop in
> msm_vfe_register_entities() and constructing a user-space program to open
> every /dev/videoX node in a tight continuous loop, quickly shows the
> following error.
> 
> [  691.074558] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000030
> [  691.074933] Call trace:
> [  691.074935]  camss_find_sensor_pad+0x74/0x114 [qcom_camss] (P)
> [  691.074946]  camss_get_pixel_clock+0x18/0x64 [qcom_camss]
> [  691.074956]  vfe_get+0xc0/0x54c [qcom_camss]
> [  691.074968]  vfe_set_power+0x58/0xf4c [qcom_camss]
> [  691.074978]  pipeline_pm_power_one+0x124/0x140 [videodev]
> [  691.074986]  pipeline_pm_power+0x70/0x100 [videodev]
> [  691.074992]  v4l2_pipeline_pm_use+0x54/0x90 [videodev]
> [  691.074998]  v4l2_pipeline_pm_get+0x14/0x20 [videodev]
> [  691.075005]  video_open+0x74/0xe0 [qcom_camss]
> [  691.075014]  v4l2_open+0xa8/0x124 [videodev]
> [  691.075021]  chrdev_open+0xb0/0x21c
> [  691.075031]  do_dentry_open+0x138/0x4c4
> [  691.075040]  vfs_open+0x2c/0xe8
> [  691.075044]  path_openat+0x6f0/0x10a0
> [  691.075050]  do_filp_open+0xa8/0x164
> [  691.075054]  do_sys_openat2+0x94/0x104
> [  691.075058]  __arm64_sys_openat+0x64/0xc0
> [  691.075061]  invoke_syscall+0x48/0x104
> [  691.075069]  el0_svc_common.constprop.0+0x40/0xe0
> [  691.075075]  do_el0_svc+0x1c/0x28
> [  691.075080]  el0_svc+0x30/0xcc
> [  691.075085]  el0t_64_sync_handler+0x10c/0x138
> [  691.075088]  el0t_64_sync+0x198/0x19c
> 
> Taking the vfe->power_lock is not possible since
> v4l2_device_register_subdev takes the mdev->graph_lock. Later on fops->open
> takes the mdev->graph_lock followed by vfe_get() -> taking vfe->power_lock.

It's unclear what the connection is between the issue and a call to
v4l2_device_register_subdev(). The latter is related to /dev/v4l-subdevX
devnodes, but everything above was about /dev/videoX devnodes, no?

> Introduce a simple enumeration_complete bool which is false initially and
> only set true once in our init routine after we complete enumeration.

It might be a fix (what is the bug actually? it's still left unexplained)
at the price of complicating the state machine; a much better fix would
be not to create and expose a non-ready /dev/videoX devnode by calling
video_register_device() too early.

> 
> If user-space tries to interact with the VFE before complete enumeration it
> will receive -EAGAIN.

Unfortunately, this sounds like a critical change in the kernel-to-userspace
ABI of the open(2) syscall for CAMSS V4L2 devnodes. EAGAIN may be returned
if open() is called with the O_NONBLOCK flag; otherwise the syscall should
block.
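
To make the user-space cost concrete: if open() could fail with EAGAIN on
these devnodes, every caller would need something like the retry wrapper
sketched below. The open_fn callback type, open_with_retry() and the
fake_opener() stand-in are all hypothetical names used only for this
illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical opener callback: returns an fd >= 0, or -1 with errno set. */
typedef int (*open_fn)(const char *path);

/* Retry while the opener reports EAGAIN, for at most max_tries attempts. */
int open_with_retry(open_fn opener, const char *path, int max_tries)
{
	assert(opener != NULL && max_tries > 0);

	for (int i = 0; i < max_tries; i++) {
		int fd = opener(path);

		if (fd >= 0 || errno != EAGAIN)
			return fd;	/* success, or a failure that is not transient */
	}

	return -1;			/* still EAGAIN after max_tries attempts */
}

/* Stand-in opener for demonstration: fails twice with EAGAIN, then succeeds. */
int fake_calls;

int fake_opener(const char *path)
{
	(void)path;

	if (fake_calls++ < 2) {
		errno = EAGAIN;
		return -1;
	}

	return 3;			/* pretend file descriptor */
}
```

A real caller would pass a thin wrapper around open(2) and would probably
sleep between attempts; the point is only that existing applications do not
contain such a loop today.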

I believe completing the registration of media device entities/pads before
creating a devnode should solve all the issues properly.

> Cc: stable@vger.kernel.org
> Fixes: 4c98a5f57f90 ("media: camss: Add VFE files")
> Reported-by: Johan Hovold <johan+linaro@kernel.org>
> Closes: https://lore.kernel.org/all/Zwjw6XfVWcufMlqM@hovoldconsulting.com
> Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>

--
Best wishes,
Vladimir


* Re: [PATCH 2/2] media: qcom: camss: vfe: Fix registration sequencing bug
  2025-06-12  8:07 ` [PATCH 2/2] media: qcom: camss: vfe: Fix registration sequencing bug Bryan O'Donoghue
  2025-06-13  9:13   ` Vladimir Zapolskiy
@ 2025-06-16  9:19   ` Johan Hovold
  1 sibling, 0 replies; 9+ messages in thread
From: Johan Hovold @ 2025-06-16  9:19 UTC (permalink / raw)
  To: Bryan O'Donoghue
  Cc: Robert Foss, Todor Tomov, Mauro Carvalho Chehab, Hans Verkuil,
	Depeng Shao, Vladimir Zapolskiy, Hans Verkuil, linux-media,
	linux-arm-msm, linux-kernel, stable, Johan Hovold

On Thu, Jun 12, 2025 at 09:07:16AM +0100, Bryan O'Donoghue wrote:
> msm_vfe_register_entities loops through each Raw Data Interface input line.
> For each loop we add a video device with its associated pads.
> 
> Once a single /dev/video0 node has been populated it is possible for
> camss_find_sensor_pad to run. This routine scans through a list of media
> entities taking a pointer pad = media_entity->pad[0] and assuming that
> pointer is always valid.
> 
> It is possible for both the enumeration loop in msm_vfe_register_entities()
> and a call from user-space to run concurrently.
> 
> Adding some deliberate sleep code into the loop in
> msm_vfe_register_entities() and constructing a user-space program to open
> every /dev/videoX node in a tight continuous loop, quickly shows the
> following error.
> 
> [  691.074558] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000030
> [  691.074933] Call trace:
> [  691.074935]  camss_find_sensor_pad+0x74/0x114 [qcom_camss] (P)
> [  691.074946]  camss_get_pixel_clock+0x18/0x64 [qcom_camss]
> [  691.074956]  vfe_get+0xc0/0x54c [qcom_camss]
> [  691.074968]  vfe_set_power+0x58/0xf4c [qcom_camss]
> [  691.074978]  pipeline_pm_power_one+0x124/0x140 [videodev]
> [  691.074986]  pipeline_pm_power+0x70/0x100 [videodev]
> [  691.074992]  v4l2_pipeline_pm_use+0x54/0x90 [videodev]
> [  691.074998]  v4l2_pipeline_pm_get+0x14/0x20 [videodev]
> [  691.075005]  video_open+0x74/0xe0 [qcom_camss]
> [  691.075014]  v4l2_open+0xa8/0x124 [videodev]
> [  691.075021]  chrdev_open+0xb0/0x21c
> [  691.075031]  do_dentry_open+0x138/0x4c4
> [  691.075040]  vfs_open+0x2c/0xe8

> Taking the vfe->power_lock is not possible since
> v4l2_device_register_subdev takes the mdev->graph_lock. Later on fops->open
> takes the mdev->graph_lock followed by vfe_get() -> taking vfe->power_lock.
> 
> Introduce a simple enumeration_complete bool which is false initially and
> only set true once in our init routine after we complete enumeration.
> 
> If user-space tries to interact with the VFE before complete enumeration it
> will receive -EAGAIN.

As Vladimir also pointed out, this is at best just papering over the
issue.

You need to make sure the video device is not registered until it's
ready to be used. That is the bug that needs fixing.

Johan


* Re: [PATCH 2/2] media: qcom: camss: vfe: Fix registration sequencing bug
  2025-06-13  9:13   ` Vladimir Zapolskiy
@ 2025-06-16 14:09     ` Bryan O'Donoghue
  2025-06-16 15:00       ` Vladimir Zapolskiy
  0 siblings, 1 reply; 9+ messages in thread
From: Bryan O'Donoghue @ 2025-06-16 14:09 UTC (permalink / raw)
  To: Vladimir Zapolskiy, Robert Foss, Todor Tomov,
	Mauro Carvalho Chehab, Hans Verkuil, Depeng Shao, Hans Verkuil
  Cc: linux-media, linux-arm-msm, linux-kernel, stable, Johan Hovold

On 13/06/2025 10:13, Vladimir Zapolskiy wrote:
> 
> Per se this concurrent execution shall not lead to the encountered bug,

What does that mean ? Please re-read the commit log, the analysis is all 
there.

> both an initialization of media entity pads by media_entity_pads_init()
> and a registration of a v4l2 devnode inside msm_video_register() are
> done in a proper sequence, aren't they?

No, I clearly haven't explained this clearly enough in the commit log.

vfe0_rdi0 == /dev/video0 is complete. vfe0_rdi1 is not complete; there is
no /dev/video1 in user-space.

vfe_get() is called for an RDI in a VFE, camss_find_sensor_pad() assumes 
all RDIs are populated.

We can't use any VFE mutex to synchronise this because

lock(vfe->mutex);
lock(media->mutex);

and
lock(media->mutex);
lock(vfe->mutex);

happen.
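
The two orderings above form a classic lock-order inversion. A minimal
user-space sketch of how such an inversion can be detected follows; it is
loosely in the spirit of a lock-order validator but far simpler, and the
lock IDs, struct order_tracker and the tracker_*() helpers are hypothetical
names that do not exist in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum { LOCK_VFE, LOCK_MEDIA, NR_LOCKS };

struct order_tracker {
	bool held[NR_LOCKS];			/* locks currently held */
	bool before[NR_LOCKS][NR_LOCKS];	/* before[a][b]: a was held when b was taken */
};

void tracker_init(struct order_tracker *t)
{
	memset(t, 0, sizeof(*t));
}

/* Record taking `lock`; returns false if this inverts an order seen earlier. */
bool tracker_acquire(struct order_tracker *t, int lock)
{
	bool ok = true;

	assert(lock >= 0 && lock < NR_LOCKS);

	for (int h = 0; h < NR_LOCKS; h++) {
		if (!t->held[h] || h == lock)
			continue;
		if (t->before[lock][h])
			ok = false;	/* an earlier path took `lock` before `h` */
		t->before[h][lock] = true;
	}

	t->held[lock] = true;
	return ok;
}

void tracker_release(struct order_tracker *t, int lock)
{
	assert(lock >= 0 && lock < NR_LOCKS);
	t->held[lock] = false;
}
```

Feeding it the first sequence (VFE, then media) and later the second (media,
then VFE) makes the final tracker_acquire() call for the VFE lock return
false.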

So we can educate vfe_get() about the RDI it is operating on, or we can
flag that a VFE, meaning all of its subordinate RDIs, is available.

I didn't much like teaching vfe_get() about which RDI index, because the
code looked ugly: for 8916 you have to assume on one of the code paths
that it always operates on RDI0, which is an invalid assumption.

The other way to fix this is:

+++ b/drivers/media/platform/qcom/camss/camss.c
@@ -2988,7 +2988,7 @@ struct media_pad *camss_find_sensor_pad(struct media_entity *entity)

         while (1) {
                 pad = &entity->pads[0];
-               if (!(pad->flags & MEDIA_PAD_FL_SINK))
+               if (!pad || !(pad->flags & MEDIA_PAD_FL_SINK))

But then you see that every other driver treats pad = &entity->pads[0] 
as always non-NULL.

---
bod


* Re: [PATCH 2/2] media: qcom: camss: vfe: Fix registration sequencing bug
  2025-06-16 14:09     ` Bryan O'Donoghue
@ 2025-06-16 15:00       ` Vladimir Zapolskiy
  2025-06-16 16:08         ` Bryan O'Donoghue
  0 siblings, 1 reply; 9+ messages in thread
From: Vladimir Zapolskiy @ 2025-06-16 15:00 UTC (permalink / raw)
  To: Bryan O'Donoghue, Johan Hovold
  Cc: Robert Foss, Todor Tomov, Mauro Carvalho Chehab, Hans Verkuil,
	Depeng Shao, linux-media, linux-arm-msm, linux-kernel, stable,
	Johan Hovold

Hi Bryan.

On 6/16/25 17:09, Bryan O'Donoghue wrote:
> On 13/06/2025 10:13, Vladimir Zapolskiy wrote:
>>
>> Per se this concurrent execution shall not lead to the encountered bug,
> 
> What does that mean ? Please re-read the commit log, the analysis is all
> there.

Concurrent execution does not in itself constitute a problem; moreover,
it's a feature of operating systems.

>> both an initialization of media entity pads by media_entity_pads_init()
>> and a registration of a v4l2 devnode inside msm_video_register() are
>> done in a proper sequence, aren't they?
> 
> No, I clearly haven't explained this clearly enough in the commit log.
> 
> vfe0_rdi0 == /dev/video0 is complete. vfe0_rdi1 is not complete there is
> no /dev/video1 in user-space.

Please let me ask for a few improvements to the commit message of the next
version of the fix.

Information like "vfe0_rdi0 == /dev/video0" etc. above vaguely assumes
so much context that the statements become wrong; let's remove ambiguity
instead of amplifying it.

> vfe_get() is called for an RDI in a VFE, camss_find_sensor_pad() assumes
> all RDIs are populated.
> 

This is a good and almost sufficient one-line problem description.

Still, there is an issue: you mention the vfe_get() and
camss_find_sensor_pad() functions, but both of them are fine. The problem
lies within the vfe_set_clock_rates() function; that is the exact place in
the driver code that iterates over all VFE lines as if all of them were
initialized.

> We can't use any VFE mutex to synchronise this because
> 
> lock(vfe->mutex);
> lock(media->mutex);
> 
> and
> lock(media->mutex);
> lock(vfe->mutex);
> 
> happen.
> 
> So we can educate vfe_get() about the RDI it is operating on or we can
> flag that a VFE - all of its subordinate RDIs are available.
> 
> I didn't much like teaching vfe_get() about which RDI index because the
> code looked ugly for 8916 you have to assume on one of the code paths
> that it always operates on RDI0, which is an invalid assumption.

vfe_get() and the mutexes are all a red herring: there is no problem with
vfe_get(), there is no problem with camss_find_sensor_pad(), and there
is no expectation of finding a proper fix in either of these two functions.

Johan and I pointed out how to fix the encountered issue properly. Once
again, please don't hesitate to ask questions if my short explanations
are unclear to you.

The fix is to expose any of the VFE line devnodes to userspace strictly
after all media entity pads have been initialized. Do you have an
idea how to implement it, or should I help with it? It'd be totally okay.
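
One way to picture the ordering being asked for is the user-space model
below. It is only a sketch under stated assumptions: no implementation is
posted in this thread, and struct line_model, NR_LINES and the
register_*() helpers are hypothetical names rather than driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_LINES 4	/* hypothetical number of VFE lines */

struct line_model {
	bool pads_ready;	/* models media entity pad initialisation being done */
	bool node_visible;	/* models the devnode being visible to user space */
};

/* True if code that walks every line would find all of them initialised. */
bool safe_to_open(const struct line_model *l)
{
	assert(l != NULL);

	for (int i = 0; i < NR_LINES; i++)
		if (!l[i].pads_ready)
			return false;

	return true;
}

/*
 * Interleaved: each line is initialised and exposed in turn. Returns the
 * number of exposure points at which an open() would have been unsafe.
 */
int register_interleaved(struct line_model *l)
{
	int unsafe = 0;

	for (int i = 0; i < NR_LINES; i++) {
		l[i].pads_ready = true;
		l[i].node_visible = true;
		if (!safe_to_open(l))
			unsafe++;
	}

	return unsafe;
}

/* Two-phase: initialise every line first, expose the nodes afterwards. */
int register_two_phase(struct line_model *l)
{
	int unsafe = 0;

	for (int i = 0; i < NR_LINES; i++)
		l[i].pads_ready = true;

	for (int i = 0; i < NR_LINES; i++) {
		l[i].node_visible = true;
		if (!safe_to_open(l))
			unsafe++;
	}

	return unsafe;
}
```

In this model the interleaved order leaves NR_LINES - 1 windows in which a
visible node coexists with an uninitialised line, while the two-phase order
leaves none.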

> The other way to fix this is:
> 
> +++ b/drivers/media/platform/qcom/camss/camss.c
> @@ -2988,7 +2988,7 @@ struct media_pad *camss_find_sensor_pad(struct media_entity *entity)
> 
>           while (1) {
>                   pad = &entity->pads[0];
> -               if (!(pad->flags & MEDIA_PAD_FL_SINK))
> +               if (!pad || !(pad->flags & MEDIA_PAD_FL_SINK))
> 
> 
> But then you see that every other driver treats pad = &entity->pads[0]
> as always non-NULL.

There is another, expected way with zero problems; see the comment above.

There is no proven problem with the camss_find_sensor_pad() function, and
it should be left unmodified.

--
Best wishes,
Vladimir


* Re: [PATCH 2/2] media: qcom: camss: vfe: Fix registration sequencing bug
  2025-06-16 15:00       ` Vladimir Zapolskiy
@ 2025-06-16 16:08         ` Bryan O'Donoghue
  0 siblings, 0 replies; 9+ messages in thread
From: Bryan O'Donoghue @ 2025-06-16 16:08 UTC (permalink / raw)
  To: Vladimir Zapolskiy, Johan Hovold
  Cc: Robert Foss, Todor Tomov, Mauro Carvalho Chehab, Hans Verkuil,
	Depeng Shao, linux-media, linux-arm-msm, linux-kernel, stable,
	Johan Hovold

On 16/06/2025 16:00, Vladimir Zapolskiy wrote:
> Hi Bryan.
> 
> On 6/16/25 17:09, Bryan O'Donoghue wrote:
>> On 13/06/2025 10:13, Vladimir Zapolskiy wrote:
>>>
>>> Per se this concurrent execution shall not lead to the encountered bug,
>>
>> What does that mean ? Please re-read the commit log, the analysis is all
>> there.
> 
> The concurrent execution does not state a problem, moreover it's a feature
> of operating systems.

I don't quite understand what your objection is.

I'm informing the reader of the commit log that one function may execute
in parallel with another function. This is not so with every function, and
it is the root cause of the error.


>>> both an initialization of media entity pads by media_entity_pads_init()
>>> and a registration of a v4l2 devnode inside msm_video_register() are
>>> done in a proper sequence, aren't they?
>>
>> No, I clearly haven't explained this clearly enough in the commit log.
>>
>> vfe0_rdi0 == /dev/video0 is complete. vfe0_rdi1 is not complete there is
>> no /dev/video1 in user-space.
> 
> Please let me ask for a few improvements to the commit message of the next
> version of the fix.
> 
> The information like "vfe0_rdi0 == /dev/video0" etc. above vaguely assumes
> so much of the context

Sure, but this is a _response_ email to you, and you know what vfe0_rdi0 is.

The statement doesn't appear in the commit log.

---
bod


* Re: [PATCH 1/2] media: qcom: camss: csiphy-3ph: Fix inadvertent dropping of SDM660/SDM670 phy init
  2025-06-12  8:07 ` [PATCH 1/2] media: qcom: camss: csiphy-3ph: Fix inadvertent dropping of SDM660/SDM670 phy init Bryan O'Donoghue
@ 2025-06-17  9:02   ` Vladimir Zapolskiy
  0 siblings, 0 replies; 9+ messages in thread
From: Vladimir Zapolskiy @ 2025-06-17  9:02 UTC (permalink / raw)
  To: Bryan O'Donoghue, Robert Foss, Todor Tomov,
	Mauro Carvalho Chehab, Hans Verkuil, Depeng Shao, Hans Verkuil
  Cc: linux-media, linux-arm-msm, linux-kernel, stable

On 6/12/25 11:07, Bryan O'Donoghue wrote:
> The moving of init sequence hook from gen2() to subdev_init() doesn't
> account for gen1 devices such as SDM660 and SDM670. The switch should find
> the right offset for gen2 PHYs only, not reject gen1. Remove the default
> error case to restore gen1 CSIPHY support.
> 
> Cc: stable@vger.kernel.org
> Fixes: fbce0ca24c3a ("media: qcom: camss: csiphy-3ph: Move CSIPHY variables to data field inside csiphy struct")
> Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>

Reviewed-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>

--
Best wishes,
Vladimir

