* camss NULL-deref on power on with 6.12-rc2
@ 2024-10-11 9:33 Johan Hovold
2024-10-11 9:41 ` Bryan O'Donoghue
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Johan Hovold @ 2024-10-11 9:33 UTC (permalink / raw)
To: Robert Foss, Todor Tomov, Bryan O'Donoghue
Cc: Vladimir Zapolskiy, linux-media, linux-arm-msm, linux-kernel
Hi,
This morning I hit the below NULL-deref in camss when booting a 6.12-rc2
kernel on the Lenovo ThinkPad X13s.
I booted the same kernel another 50 times without hitting it again, so
it may not be a regression, but simply an older, hard to hit bug.
Hopefully you can figure out what went wrong from just staring at the
oops and code.
Johan
[ 5.657860] ov5675 24-0010: failed to get HW configuration: -517
[ 5.676183] vreg_l6q: Bringing 2800000uV into 1800000-1800000uV
[ 6.517689] qcom-camss ac5a000.camss: Adding to iommu group 22
[ 6.589201] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000030
[ 6.589625] Mem abort info:
[ 6.589960] ESR = 0x0000000096000004
[ 6.590293] EC = 0x25: DABT (current EL), IL = 32 bits
[ 6.590630] SET = 0, FnV = 0
[ 6.591619] EA = 0, S1PTW = 0
[ 6.591968] FSC = 0x04: level 0 translation fault
[ 6.592298] Data abort info:
[ 6.592621] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 6.593112] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 6.593450] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 6.593783] user pgtable: 4k pages, 48-bit VAs, pgdp=000000010daef000
[ 6.594139] [0000000000000030] pgd=0000000000000000, p4d=0000000000000000
[ 6.594214] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[ 6.594753] Modules linked in: qrtr_mhi cbc des_generic libdes algif_skcipher md5 algif_hash af_alg ip6_tables xt_LOG nf_log_syslog r8152 ipt_REJECT mii nf_reject_ipv4 libphy xt_tcpudp xt_conntrack nf_conntrack libcrc32c nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter qcom_pm8008_regulator ov5675 snd_q6apm(+) hci_uart btqca venus_enc venus_dec bluetooth videobuf2_dma_contig qcom_pm8008 pmic_glink_altmode qcom_spmi_adc5 leds_qcom_lpg qcom_spmi_adc_tm5 mfd_core snd_soc_sc8280xp qcom_spmi_temp_alarm qcom_pon rpmsg_ctrl ecdh_generic fastrpc apr rpmsg_char qrtr_smd qcom_pd_mapper rtc_pm8xxx qcom_battmgr ecc aux_hpd_bridge reboot_mode qcom_vadc_common industrialio nvmem_qcom_spmi_sdam led_class_multicolor regmap_i2c i2c_hid_of_elan snd_soc_qcom_common snd_soc_qcom_sdw pwrseq_qcom_wcn ath11k_pci qcom_camss venus_core videobuf2_dma_sg videobuf2_memops v4l2_mem2mem v4l2_fwnode videobuf2_v4l2 msm v4l2_async videobuf2_common qcom_stats gpio_sbu_mux ath11k videodev drm_exec dispcc_sc8280xp snd_soc_wcd938x phy_qcom_edp gpu_sched
[ 6.594814] snd_soc_wcd_classh snd_soc_wcd938x_sdw mac80211 drm_display_helper mc snd_soc_lpass_rx_macro snd_soc_lpass_wsa_macro drm_dp_aux_bus snd_soc_lpass_tx_macro snd_soc_lpass_va_macro camcc_sc8280xp regmap_sdw videocc_sm8350 i2c_qcom_cci soundwire_qcom snd_soc_wcd_mbhc libarc4 snd_soc_lpass_macro_common phy_qcom_qmp_combo cfg80211 qcom_q6v5_pas llcc_qcom aux_bridge snd_soc_core snd_compress qcom_pil_info rfkill qcom_common snd_pcm qcom_glink_smem pci_pwrctl_pwrseq drm_kms_helper pci_pwrctl_core mhi typec qcom_glink pwrseq_core icc_bwmon snd_timer phy_qcom_qmp_usb qrtr phy_qcom_snps_femto_v2 qcom_q6v5 gpucc_sc8280xp pinctrl_sc8280xp_lpass_lpi snd qcom_sysmon pinctrl_lpass_lpi lpasscc_sc8280xp pmic_glink soundcore mdt_loader pdr_interface soundwire_bus qcom_rng rpmsg_core leds_gpio input_leds qcom_pdr_msg socinfo qmi_helpers rng_core qcom_wdt pwm_bl icc_osm_l3 led_class fuse dm_mod ip_tables x_tables ipv6 autofs4 pcie_qcom crc8 phy_qcom_qmp_pcie nvme nvme_core hid_multitouch i2c_qcom_geni i2c_hid_of i2c_hid drm
[ 6.594866] i2c_core
[ 6.594868] CPU: 0 UID: 0 PID: 557 Comm: v4l_id Not tainted 6.12.0-rc2 #165
[ 6.594871] Hardware name: LENOVO 21BYZ9SRUS/21BYZ9SRUS, BIOS N3HET87W (1.59 ) 12/05/2023
[ 6.594872] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 6.594874] pc : camss_find_sensor+0x20/0x74 [qcom_camss]
[ 6.594885] lr : camss_get_pixel_clock+0x18/0x60 [qcom_camss]
[ 6.594889] sp : ffff800082d538f0
[ 6.594890] x29: ffff800082d538f0 x28: ffff800082d53c70 x27: ffff670cc0404618
[ 6.594893] x26: 0000000000000000 x25: 0000000000000000 x24: ffff670cd33173d0
[ 6.594895] x23: ffff800082d539a8 x22: ffff670cd33192c8 x21: ffff800082d539b8
[ 6.594898] x20: 0000000000000002 x19: 0000000000020001 x18: 0000000000000000
[ 6.594900] x17: 0000000000000000 x16: ffffbf0bffbecdd0 x15: 0000000000000001
[ 6.594902] x14: ffff670cc5c95300 x13: ffff670cc0b38980 x12: ffff670cc5c95ba8
[ 6.594905] x11: ffffbf0c00f73000 x10: 0000000000000000 x9 : 0000000000000000
[ 6.594907] x8 : ffffbf0c0085d000 x7 : 0000000000000000 x6 : 0000000000000078
[ 6.594910] x5 : 0000000000000000 x4 : ffff670cd3318598 x3 : ffff670cd3318468
[ 6.594912] x2 : ffff670cd3317728 x1 : ffff800082d539b8 x0 : 0000000000000000
[ 6.594915] Call trace:
[ 6.594915] camss_find_sensor+0x20/0x74 [qcom_camss]
[ 6.594920] camss_get_pixel_clock+0x18/0x60 [qcom_camss]
[ 6.594924] vfe_get+0xb8/0x504 [qcom_camss]
[ 6.594931] vfe_set_power+0x30/0x58 [qcom_camss]
[ 6.594936] pipeline_pm_power_one+0x13c/0x150 [videodev]
[ 6.594951] pipeline_pm_power.part.0+0x58/0xf4 [videodev]
[ 6.594960] v4l2_pipeline_pm_use+0x58/0x94 [videodev]
[ 6.594969] v4l2_pipeline_pm_get+0x14/0x20 [videodev]
[ 6.594978] video_open+0x78/0xf4 [qcom_camss]
[ 6.594982] v4l2_open+0x80/0x120 [videodev]
[ 6.594991] chrdev_open+0xb4/0x204
[ 6.594996] do_dentry_open+0x138/0x4d0
[ 6.595000] vfs_open+0x2c/0xe4
[ 6.595003] path_openat+0x2b4/0x9fc
[ 6.595005] do_filp_open+0x80/0x130
[ 6.595007] do_sys_openat2+0xb4/0xe8
[ 6.595010] __arm64_sys_openat+0x64/0xac
[ 6.595012] invoke_syscall+0x48/0x110
[ 6.595016] el0_svc_common.constprop.0+0xc0/0xe0
[ 6.595018] do_el0_svc+0x1c/0x28
[ 6.595021] el0_svc+0x48/0x114
[ 6.595023] el0t_64_sync_handler+0xc0/0xc4
[ 6.595025] el0t_64_sync+0x190/0x194
[ 6.595028] Code: 52800033 72a00053 d503201f f9402400 (f9401801)
[ 6.595029] ---[ end trace 0000000000000000 ]---
* Re: camss NULL-deref on power on with 6.12-rc2
2024-10-11 9:33 camss NULL-deref on power on with 6.12-rc2 Johan Hovold
@ 2024-10-11 9:41 ` Bryan O'Donoghue
2024-10-11 9:54 ` Johan Hovold
2025-04-07 9:12 ` Johan Hovold
2025-08-24 20:42 ` Vladimir Zapolskiy
2 siblings, 1 reply; 10+ messages in thread
From: Bryan O'Donoghue @ 2024-10-11 9:41 UTC (permalink / raw)
To: Johan Hovold, Robert Foss, Todor Tomov
Cc: Vladimir Zapolskiy, linux-media, linux-arm-msm, linux-kernel
On 11/10/2024 10:33, Johan Hovold wrote:
> Hi,
>
> This morning I hit the below NULL-deref in camss when booting a 6.12-rc2
> kernel on the Lenovo ThinkPad X13s.
>
> I booted the same kernel another 50 times without hitting it again, so
> it may not be a regression, but simply an older, hard to hit bug.
>
> Hopefully you can figure out what went wrong from just staring at the
> oops and code.
>
> Johan
>
>
> [ 5.657860] ov5675 24-0010: failed to get HW configuration: -517
So this caused it, I guess the sensor failed to power up.
You've booted 50 times in a row and hit a corner case where the sensor
didn't power up, leading to a NULL dereference.
So, two bugs I'd say.
- What is the circumstance where the sensor doesn't power up?
- What is NULL: either the entity pointer or entity->pads, I'd say.
<snip>
> [ 6.594915] Call trace:
> [ 6.594915] camss_find_sensor+0x20/0x74 [qcom_camss]
Hmm, not sure looking at what we have.
pad = &entity->pads[0];
if (!(pad->flags & MEDIA_PAD_FL_SINK))
return NULL;
Is pad guaranteed to be valid after pad = &entity->pads[0]?
We dereference it as if it's guaranteed.
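One way to narrow this down from the oops alone is to decode the faulting instruction, which is the bracketed word on the Code: line (f9401801). Below is a minimal sketch of that decoding. decode_ldr_uimm64() is a hypothetical helper written for this note, not something from the kernel tree, and it only handles the one A64 encoding that matters here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Decode an AArch64 "LDR Xt, [Xn, #imm]" (64-bit, unsigned immediate offset).
 * Returns 1 and fills rt/rn/offset on a match, 0 for any other encoding.
 * Hypothetical helper for illustration only.
 */
static int decode_ldr_uimm64(uint32_t insn, unsigned *rt, unsigned *rn,
			     unsigned *offset)
{
	assert(rt && rn && offset);

	/* Bits 31:22 are fixed at 0b1111100101 for this form. */
	if ((insn & 0xffc00000u) != 0xf9400000u)
		return 0;

	*rt = insn & 0x1f;                      /* destination register */
	*rn = (insn >> 5) & 0x1f;               /* base register */
	*offset = ((insn >> 10) & 0xfff) * 8;   /* imm12 is scaled by 8 bytes */
	return 1;
}
```

For f9401801 this gives ldr x1, [x0, #0x30]. With x0 = 0 in the register dump, that matches the reported fault address 0x30. The preceding word, f9402400, decodes to ldr x0, [x0, #0x48], so the NULL base was itself loaded from offset 0x48 of some structure. Which fields sit at those offsets depends on the struct layout of this particular build, so reading them as entity->pads and pad->flags is an assumption.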
Anyway, thanks for the report; that should be enough to start digging.
---
bod
* Re: camss NULL-deref on power on with 6.12-rc2
2024-10-11 9:41 ` Bryan O'Donoghue
@ 2024-10-11 9:54 ` Johan Hovold
0 siblings, 0 replies; 10+ messages in thread
From: Johan Hovold @ 2024-10-11 9:54 UTC (permalink / raw)
To: Bryan O'Donoghue
Cc: Robert Foss, Todor Tomov, Vladimir Zapolskiy, linux-media,
linux-arm-msm, linux-kernel
On Fri, Oct 11, 2024 at 10:41:30AM +0100, Bryan O'Donoghue wrote:
> On 11/10/2024 10:33, Johan Hovold wrote:
> > This morning I hit the below NULL-deref in camss when booting a 6.12-rc2
> > kernel on the Lenovo ThinkPad X13s.
> >
> > I booted the same kernel another 50 times without hitting it again, so
> > it may not be a regression, but simply an older, hard to hit bug.
> >
> > Hopefully you can figure out what went wrong from just staring at the
> > oops and code.
> > [ 5.657860] ov5675 24-0010: failed to get HW configuration: -517
>
> So this caused it, I guess the sensor failed to power up.
The probe deferral may be involved, but we see this deferral all the
time without things blowing up (and the driver should be able to handle
that).
> You've booted 50 times in a row and hit a corner case where the sensor
> didn't power up, leading to a NULL dereference.
>
> So, two bugs I'd say.
>
> - What is the circumstance where the sensor doesn't power up?
Not sure what is causing it, but I have seen boots where this message
shows up 5-6 times, which may indeed indicate that something is off. If
this was just a provider not having probed yet, driver core should
generally prevent the sensor from probing until the resources (e.g.
clocks) are available.
> - What is NULL: either the entity pointer or entity->pads, I'd say.
>
> <snip>
> > [ 6.594915] Call trace:
> > [ 6.594915] camss_find_sensor+0x20/0x74 [qcom_camss]
> Hmm, not sure looking at what we have.
>
> pad = &entity->pads[0];
> if (!(pad->flags & MEDIA_PAD_FL_SINK))
> return NULL;
>
> Is pad guaranteed to be valid after pad = &entity->pads[0]?
> We dereference it as if it's guaranteed.
>
> Anyway, thanks for the report; that should be enough to start digging.
Thanks.
Johan
* Re: camss NULL-deref on power on with 6.12-rc2
2024-10-11 9:33 camss NULL-deref on power on with 6.12-rc2 Johan Hovold
2024-10-11 9:41 ` Bryan O'Donoghue
@ 2025-04-07 9:12 ` Johan Hovold
2025-04-07 9:58 ` Bryan O'Donoghue
2025-08-24 20:42 ` Vladimir Zapolskiy
2 siblings, 1 reply; 10+ messages in thread
From: Johan Hovold @ 2025-04-07 9:12 UTC (permalink / raw)
To: Robert Foss, Todor Tomov, Bryan O'Donoghue
Cc: Vladimir Zapolskiy, linux-media, linux-arm-msm, linux-kernel
On Fri, Oct 11, 2024 at 11:33:30AM +0200, Johan Hovold wrote:
> This morning I hit the below NULL-deref in camss when booting a 6.12-rc2
> kernel on the Lenovo ThinkPad X13s.
>
> I booted the same kernel another 50 times without hitting it again, so
> it may not be a regression, but simply an older, hard to hit bug.
>
> Hopefully you can figure out what went wrong from just staring at the
> oops and code.
Hit the NULL-pointer dereference during boot that I reported back in
October again today with 6.15-rc1.
The function was renamed to camss_find_sensor_pad() in 6.15-rc1, but
otherwise it looks identical.
Johan
[ 5.740833] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000030
[ 5.741162] Mem abort info:
[ 5.741435] ESR = 0x0000000096000004
[ 5.741707] EC = 0x25: DABT (current EL), IL = 32 bits
[ 5.741980] SET = 0, FnV = 0
[ 5.742249] EA = 0, S1PTW = 0
[ 5.742253] FSC = 0x04: level 0 translation fault
[ 5.742255] Data abort info:
[ 5.742257] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 5.743264] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 5.743267] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 5.743269] user pgtable: 4k pages, 48-bit VAs, pgdp=000000010fb98000
[ 5.743272] [0000000000000030] pgd=0000000000000000, p4d=0000000000000000
[ 5.744064] Internal error: Oops: 0000000096000004 [#1] SMP
[ 5.744645] CPU: 3 UID: 0 PID: 442 Comm: v4l_id Not tainted 6.15.0-rc1 #106 PREEMPT
[ 5.744647] Hardware name: LENOVO 21BYZ9SRUS/21BYZ9SRUS, BIOS N3HET87W (1.59 ) 12/05/2023
[ 5.744649] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 5.744651] pc : camss_find_sensor_pad+0x20/0x74 [qcom_camss]
[ 5.744661] lr : camss_get_pixel_clock+0x18/0x64 [qcom_camss]
[ 5.744666] sp : ffff800082dfb8e0
[ 5.744667] x29: ffff800082dfb8e0 x28: ffff800082dfbc68 x27: ffff143e80404618
[ 5.744671] x26: 0000000000000000 x25: 0000000000000000 x24: ffff143e9398baa8
[ 5.744675] x23: ffff800082dfb998 x22: ffff143e9398d9a0 x21: ffff800082dfb9a8
[ 5.744678] x20: 0000000000000002 x19: 0000000000020001 x18: 0000000000000020
[ 5.744682] x17: 3030613563613a33 x16: ffffac4db3ccf814 x15: 706e65672f6b6e69
[ 5.744686] x14: 0000000000000000 x13: ffff143e80b39180 x12: 30613563613a333a
[ 5.744690] x11: ffffac4db50a8920 x10: 0000000000000000 x9 : 0000000000000000
[ 5.744693] x8 : ffffac4db4992000 x7 : ffff800082dfb8e0 x6 : ffff800082dfb870
[ 5.744697] x5 : ffff800082dfc000 x4 : ffff143e9398cc70 x3 : ffff143e9398cb40
[ 5.744701] x2 : ffff143e9398be00 x1 : ffff143e9398d9a0 x0 : 0000000000000000
[ 5.744704] Call trace:
[ 5.744706] camss_find_sensor_pad+0x20/0x74 [qcom_camss] (P)
[ 5.744711] camss_get_pixel_clock+0x18/0x64 [qcom_camss]
[ 5.744716] vfe_get+0xb8/0x504 [qcom_camss]
[ 5.744724] vfe_set_power+0x30/0x58 [qcom_camss]
[ 5.744731] pipeline_pm_power_one+0x13c/0x150 [videodev]
[ 5.744745] pipeline_pm_power.part.0+0x58/0xf4 [videodev]
[ 5.744754] v4l2_pipeline_pm_use+0x58/0x94 [videodev]
[ 5.744762] v4l2_pipeline_pm_get+0x14/0x20 [videodev]
[ 5.744771] video_open+0x78/0xf4 [qcom_camss]
[ 5.744776] v4l2_open+0x80/0x120 [videodev]
[ 5.755711] chrdev_open+0xb4/0x204
[ 5.755716] do_dentry_open+0x138/0x4d0
[ 5.756271] vfs_open+0x2c/0xe8
[ 5.756274] path_openat+0x2b8/0x9fc
[ 5.756276] do_filp_open+0x8c/0x144
[ 5.756277] do_sys_openat2+0x80/0xdc
[ 5.756279] __arm64_sys_openat+0x60/0xb0
[ 5.757830] invoke_syscall+0x48/0x110
[ 5.757834] el0_svc_common.constprop.0+0xc0/0xe0
[ 5.758369] do_el0_svc+0x1c/0x28
[ 5.758372] el0_svc+0x48/0x114
[ 5.758889] el0t_64_sync_handler+0xc8/0xcc
[ 5.759184] el0t_64_sync+0x198/0x19c
[ 5.759475] Code: f9000bf3 52800033 72a00053 f9402420 (f9401801)
* Re: camss NULL-deref on power on with 6.12-rc2
2025-04-07 9:12 ` Johan Hovold
@ 2025-04-07 9:58 ` Bryan O'Donoghue
2025-04-07 10:38 ` Johan Hovold
0 siblings, 1 reply; 10+ messages in thread
From: Bryan O'Donoghue @ 2025-04-07 9:58 UTC (permalink / raw)
To: Johan Hovold, Robert Foss, Todor Tomov
Cc: Vladimir Zapolskiy, linux-media, linux-arm-msm, linux-kernel
On 07/04/2025 10:12, Johan Hovold wrote:
> On Fri, Oct 11, 2024 at 11:33:30AM +0200, Johan Hovold wrote:
>
>> This morning I hit the below NULL-deref in camss when booting a 6.12-rc2
>> kernel on the Lenovo ThinkPad X13s.
>>
>> I booted the same kernel another 50 times without hitting it again, so
>> it may not be a regression, but simply an older, hard to hit bug.
>>
>> Hopefully you can figure out what went wrong from just staring at the
>> oops and code.
>
> Hit the NULL-pointer dereference during boot that I reported back in
> October again today with 6.15-rc1.
>
> The function was renamed to camss_find_sensor_pad() in 6.15-rc1, but
> otherwise it looks identical.
>
> Johan
I've never seen this myself.
I wonder, are you building camcc, camss and the sensor driver into your
initrd ?
---
bod
* Re: camss NULL-deref on power on with 6.12-rc2
2025-04-07 9:58 ` Bryan O'Donoghue
@ 2025-04-07 10:38 ` Johan Hovold
2025-04-07 11:01 ` Johan Hovold
0 siblings, 1 reply; 10+ messages in thread
From: Johan Hovold @ 2025-04-07 10:38 UTC (permalink / raw)
To: Bryan O'Donoghue
Cc: Robert Foss, Todor Tomov, Vladimir Zapolskiy, linux-media,
linux-arm-msm, linux-kernel
On Mon, Apr 07, 2025 at 10:58:52AM +0100, Bryan O'Donoghue wrote:
> On 07/04/2025 10:12, Johan Hovold wrote:
> > On Fri, Oct 11, 2024 at 11:33:30AM +0200, Johan Hovold wrote:
> >
> >> This morning I hit the below NULL-deref in camss when booting a 6.12-rc2
> >> kernel on the Lenovo ThinkPad X13s.
> >>
> >> I booted the same kernel another 50 times without hitting it again, so
> >> it may not be a regression, but simply an older, hard to hit bug.
> >>
> >> Hopefully you can figure out what went wrong from just staring at the
> >> oops and code.
> >
> > Hit the NULL-pointer dereference during boot that I reported back in
> > October again today with 6.15-rc1.
> >
> > The function was renamed to camss_find_sensor_pad() in 6.15-rc1, but
> > otherwise it looks identical.
> > [ 5.740833] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000030
> > [ 5.744704] Call trace:
> > [ 5.744706] camss_find_sensor_pad+0x20/0x74 [qcom_camss] (P)
> > [ 5.744711] camss_get_pixel_clock+0x18/0x64 [qcom_camss]
> > [ 5.744716] vfe_get+0xb8/0x504 [qcom_camss]
> > [ 5.744724] vfe_set_power+0x30/0x58 [qcom_camss]
> > [ 5.744731] pipeline_pm_power_one+0x13c/0x150 [videodev]
> > [ 5.744745] pipeline_pm_power.part.0+0x58/0xf4 [videodev]
> > [ 5.744754] v4l2_pipeline_pm_use+0x58/0x94 [videodev]
> > [ 5.744762] v4l2_pipeline_pm_get+0x14/0x20 [videodev]
> > [ 5.744771] video_open+0x78/0xf4 [qcom_camss]
> > [ 5.744776] v4l2_open+0x80/0x120 [videodev]
> I've never seen this myself.
>
> I wonder, are you building camcc, camss and the sensor driver into your
> initrd ?
No, there's nothing camera related in my initramfs.
I've only seen it twice myself (that I've noticed, at least this time it
prevented the display from probing so I knew something was wrong).
Since it's obviously a race condition I think you'll need to analyse the
code to try to figure out where the bug is. With a hypothesis you may
be able to instrument a reliable reproducer (e.g. by adding appropriate
delays to extend the race window).
The fact that the sensor driver is probe deferring may also be relevant
here.
Johan
* Re: camss NULL-deref on power on with 6.12-rc2
2025-04-07 10:38 ` Johan Hovold
@ 2025-04-07 11:01 ` Johan Hovold
2025-04-07 13:49 ` Johan Hovold
0 siblings, 1 reply; 10+ messages in thread
From: Johan Hovold @ 2025-04-07 11:01 UTC (permalink / raw)
To: Bryan O'Donoghue
Cc: Robert Foss, Todor Tomov, Vladimir Zapolskiy, linux-media,
linux-arm-msm, linux-kernel
On Mon, Apr 07, 2025 at 12:38:56PM +0200, Johan Hovold wrote:
> On Mon, Apr 07, 2025 at 10:58:52AM +0100, Bryan O'Donoghue wrote:
> > On 07/04/2025 10:12, Johan Hovold wrote:
> > > [ 5.740833] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000030
>
> > > [ 5.744704] Call trace:
> > > [ 5.744706] camss_find_sensor_pad+0x20/0x74 [qcom_camss] (P)
> > > [ 5.744711] camss_get_pixel_clock+0x18/0x64 [qcom_camss]
> > > [ 5.744716] vfe_get+0xb8/0x504 [qcom_camss]
> > > [ 5.744724] vfe_set_power+0x30/0x58 [qcom_camss]
> > > [ 5.744731] pipeline_pm_power_one+0x13c/0x150 [videodev]
> > > [ 5.744745] pipeline_pm_power.part.0+0x58/0xf4 [videodev]
> > > [ 5.744754] v4l2_pipeline_pm_use+0x58/0x94 [videodev]
> > > [ 5.744762] v4l2_pipeline_pm_get+0x14/0x20 [videodev]
> > > [ 5.744771] video_open+0x78/0xf4 [qcom_camss]
> > > [ 5.744776] v4l2_open+0x80/0x120 [videodev]
> I've only seen it twice myself (that I've noticed, at least this time it
> prevented the display from probing so I knew something was wrong).
Just hit this again with 6.15-rc1 after the third reboot, so timing has
likely changed slightly, which now makes it easier to hit.
> Since it's obviously a race condition I think you'll need to analyse the
> code to try to figure out where the bug is. With a hypothesis you may
> be able to instrument a reliable reproducer (e.g. by adding appropriate
> delays to extend the race window).
It's apparently udev which powers up the camera when running v4l_id:
[ 5.859741] CPU: 4 UID: 0 PID: 420 Comm: v4l_id Not tainted 6.15.0-rc1 #106 PREEMPT
So this looks like the classic bug of drivers registering their devices
before they have been fully set up.
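To spell out that ordering problem in general terms, here is a minimal userspace sketch. It is not the camss code: every demo_* name is made up for illustration, and the interleaving with udev is simulated by calling demo_open() between the two steps rather than by real concurrency.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver object; not a kernel structure. */
struct demo_dev {
	const int *pads;    /* NULL until setup has run */
	int registered;     /* non-zero once userspace can open the node */
};

static const int demo_pads[1] = { 1 };

enum { DEMO_OK = 0, DEMO_NO_NODE = -1, DEMO_WOULD_OOPS = -2 };

/* Finish internal initialisation (stands in for setting up pads etc.). */
static void demo_setup(struct demo_dev *dev)
{
	assert(dev);
	dev->pads = demo_pads;
}

/* Make the device visible to userspace. */
static void demo_register(struct demo_dev *dev)
{
	assert(dev);
	dev->registered = 1;
}

/* What an open() from userspace (e.g. v4l_id run by udev) would run into. */
static int demo_open(const struct demo_dev *dev)
{
	assert(dev);
	if (!dev->registered)
		return DEMO_NO_NODE;     /* nothing to open yet, harmless */
	if (!dev->pads)
		return DEMO_WOULD_OOPS;  /* visible but half initialised */
	return DEMO_OK;
}
```

If demo_register() runs before demo_setup(), an open that lands in between sees DEMO_WOULD_OOPS. With the order reversed, an early open can only see DEMO_NO_NODE, which is why registration is normally the last step.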
> The fact that the sensor driver is probe deferring may also be relevant
> here.
Johan
* Re: camss NULL-deref on power on with 6.12-rc2
2025-04-07 11:01 ` Johan Hovold
@ 2025-04-07 13:49 ` Johan Hovold
0 siblings, 0 replies; 10+ messages in thread
From: Johan Hovold @ 2025-04-07 13:49 UTC (permalink / raw)
To: Bryan O'Donoghue
Cc: Robert Foss, Todor Tomov, Vladimir Zapolskiy, linux-media,
linux-arm-msm, linux-kernel
On Mon, Apr 07, 2025 at 01:01:09PM +0200, Johan Hovold wrote:
> On Mon, Apr 07, 2025 at 12:38:56PM +0200, Johan Hovold wrote:
> > On Mon, Apr 07, 2025 at 10:58:52AM +0100, Bryan O'Donoghue wrote:
> > > On 07/04/2025 10:12, Johan Hovold wrote:
>
> > > > [ 5.740833] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000030
> >
> > > > [ 5.744704] Call trace:
> > > > [ 5.744706] camss_find_sensor_pad+0x20/0x74 [qcom_camss] (P)
> > > > [ 5.744711] camss_get_pixel_clock+0x18/0x64 [qcom_camss]
> > > > [ 5.744716] vfe_get+0xb8/0x504 [qcom_camss]
> > > > [ 5.744724] vfe_set_power+0x30/0x58 [qcom_camss]
> > > > [ 5.744731] pipeline_pm_power_one+0x13c/0x150 [videodev]
> > > > [ 5.744745] pipeline_pm_power.part.0+0x58/0xf4 [videodev]
> > > > [ 5.744754] v4l2_pipeline_pm_use+0x58/0x94 [videodev]
> > > > [ 5.744762] v4l2_pipeline_pm_get+0x14/0x20 [videodev]
> > > > [ 5.744771] video_open+0x78/0xf4 [qcom_camss]
> > > > [ 5.744776] v4l2_open+0x80/0x120 [videodev]
>
> > I've only seen it twice myself (that I've noticed, at least this time it
> > prevented the display from probing so I knew something was wrong).
>
> Just hit this again with 6.15-rc1 after the third reboot so timing has
> likely changed slightly, which now makes it easier to hit.
>
> > Since it's obviously a race condition I think you'll need to analyse the
> > code to try to figure out where the bug is. With a hypothesis you may
> > be able to instrument a reliable reproducer (e.g. by adding appropriate
> > delays to extend the race window).
>
> It's apparently udev which powers up the camera when running v4l_id:
>
> [ 5.859741] CPU: 4 UID: 0 PID: 420 Comm: v4l_id Not tainted 6.15.0-rc1 #106 PREEMPT
>
> So this looks like the classic bug of drivers registering their devices
> before they have been fully set up.
It's entity->pad which is being dereferenced while NULL in
camss_find_sensor_pad() and when this happens entity->name is also NULL.
Bailing out when entity->pad is NULL allows the machine to boot, but we
should figure out why this function is called before things have been
properly initialised.
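
For illustration only, the bail-out described above can be modelled in user space roughly as follows. This is a minimal sketch, not the camss code: the mock_* types, MOCK_PAD_FL_SINK and mock_find_sensor_pad() are hypothetical stand-ins, and the only point is the early return when the entity's pad pointer has not been set up yet.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the media controller types (not kernel definitions). */
#define MOCK_PAD_FL_SINK 0x1UL

struct mock_entity;

struct mock_pad {
	unsigned long flags;
	struct mock_entity *entity;	/* entity that owns this pad */
	struct mock_pad *remote;	/* pad at the other end of the link, or NULL */
};

struct mock_entity {
	const char *name;
	int is_sensor;
	struct mock_pad *pad;		/* NULL until the entity has been set up */
};

/* Walk upstream from the entity's first pad until a sensor entity is found. */
static struct mock_pad *mock_find_sensor_pad(struct mock_entity *entity)
{
	struct mock_pad *pad;

	while (1) {
		/* Bail out instead of dereferencing an uninitialised entity. */
		if (!entity || !entity->pad)
			return NULL;

		pad = &entity->pad[0];
		if (!(pad->flags & MOCK_PAD_FL_SINK))
			return NULL;

		pad = pad->remote;
		if (!pad || !pad->entity)
			return NULL;

		entity = pad->entity;
		if (entity->is_sensor)
			return pad;
	}
}
```

A guard like this only hides the symptom; the caller still runs before initialisation has finished.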
Johan
* Re: camss NULL-deref on power on with 6.12-rc2
2024-10-11 9:33 camss NULL-deref on power on with 6.12-rc2 Johan Hovold
2024-10-11 9:41 ` Bryan O'Donoghue
2025-04-07 9:12 ` Johan Hovold
@ 2025-08-24 20:42 ` Vladimir Zapolskiy
2025-08-29 9:15 ` Johan Hovold
2 siblings, 1 reply; 10+ messages in thread
From: Vladimir Zapolskiy @ 2025-08-24 20:42 UTC (permalink / raw)
To: Johan Hovold
Cc: Robert Foss, Todor Tomov, Bryan O'Donoghue, linux-media,
linux-arm-msm, linux-kernel
Hi Johan.
On 10/11/24 12:33, Johan Hovold wrote:
> Hi,
>
> This morning I hit the below NULL-deref in camss when booting a 6.12-rc2
> kernel on the Lenovo ThinkPad X13s.
>
> I booted the same kernel another 50 times without hitting it again it so
> it may not be a regression, but simply an older, hard to hit bug.
>
> Hopefully you can figure out what went wrong from just staring at the
> oops and code.
>
> Johan
>
>
> [ 5.657860] ov5675 24-0010: failed to get HW configuration: -517
> [ 5.676183] vreg_l6q: Bringing 2800000uV into 1800000-1800000uV
>
> [ 6.517689] qcom-camss ac5a000.camss: Adding to iommu group 22
>
> [ 6.589201] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000030
> [ 6.589625] Mem abort info:
> [ 6.589960] ESR = 0x0000000096000004
> [ 6.590293] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 6.590630] SET = 0, FnV = 0
> [ 6.591619] EA = 0, S1PTW = 0
> [ 6.591968] FSC = 0x04: level 0 translation fault
> [ 6.592298] Data abort info:
> [ 6.592621] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
> [ 6.593112] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> [ 6.593450] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> [ 6.593783] user pgtable: 4k pages, 48-bit VAs, pgdp=000000010daef000
> [ 6.594139] [0000000000000030] pgd=0000000000000000, p4d=0000000000000000
> [ 6.594214] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
> [ 6.594753] Modules linked in: qrtr_mhi cbc des_generic libdes algif_skcipher md5 algif_hash af_alg ip6_tables xt_LOG nf_log_syslog r8152 ipt_REJECT mii nf_reject_ipv4 libphy xt_tcpudp xt_conntrack nf_conntrack libcrc32c nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter qcom_pm8008_regulator ov5675 snd_q6apm(+) hci_uart btqca venus_enc venus_dec bluetooth videobuf2_dma_contig qcom_pm8008 pmic_glink_altmode qcom_spmi_adc5 leds_qcom_lpg qcom_spmi_adc_tm5 mfd_core snd_soc_sc8280xp qcom_spmi_temp_alarm qcom_pon rpmsg_ctrl ecdh_generic fastrpc apr rpmsg_char qrtr_smd qcom_pd_mapper rtc_pm8xxx qcom_battmgr ecc aux_hpd_bridge reboot_mode qcom_vadc_common industrialio nvmem_qcom_spmi_sdam led_class_multicolor regmap_i2c i2c_hid_of_elan snd_soc_qcom_common snd_soc_qcom_sdw pwrseq_qcom_wcn ath11k_pci qcom_camss venus_core videobuf2_dma_sg videobuf2_memops v4l2_mem2mem v4l2_fwnode videobuf2_v4l2 msm v4l2_async videobuf2_common qcom_stats gpio_sbu_mux ath11k videodev drm_exec dispcc_sc8280xp snd_soc_wcd938x phy_qcom_edp gpu_sched
> [ 6.594814] snd_soc_wcd_classh snd_soc_wcd938x_sdw mac80211 drm_display_helper mc snd_soc_lpass_rx_macro snd_soc_lpass_wsa_macro drm_dp_aux_bus snd_soc_lpass_tx_macro snd_soc_lpass_va_macro camcc_sc8280xp regmap_sdw videocc_sm8350 i2c_qcom_cci soundwire_qcom snd_soc_wcd_mbhc libarc4 snd_soc_lpass_macro_common phy_qcom_qmp_combo cfg80211 qcom_q6v5_pas llcc_qcom aux_bridge snd_soc_core snd_compress qcom_pil_info rfkill qcom_common snd_pcm qcom_glink_smem pci_pwrctl_pwrseq drm_kms_helper pci_pwrctl_core mhi typec qcom_glink pwrseq_core icc_bwmon snd_timer phy_qcom_qmp_usb qrtr phy_qcom_snps_femto_v2 qcom_q6v5 gpucc_sc8280xp pinctrl_sc8280xp_lpass_lpi snd qcom_sysmon pinctrl_lpass_lpi lpasscc_sc8280xp pmic_glink soundcore mdt_loader pdr_interface soundwire_bus qcom_rng rpmsg_core leds_gpio input_leds qcom_pdr_msg socinfo qmi_helpers rng_core qcom_wdt pwm_bl icc_osm_l3 led_class fuse dm_mod ip_tables x_tables ipv6 autofs4 pcie_qcom crc8 phy_qcom_qmp_pcie nvme nvme_core hid_multitouch i2c_qcom_geni i2c_hid_of i2c_hid drm
> [ 6.594866] i2c_core
> [ 6.594868] CPU: 0 UID: 0 PID: 557 Comm: v4l_id Not tainted 6.12.0-rc2 #165
> [ 6.594871] Hardware name: LENOVO 21BYZ9SRUS/21BYZ9SRUS, BIOS N3HET87W (1.59 ) 12/05/2023
> [ 6.594872] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 6.594874] pc : camss_find_sensor+0x20/0x74 [qcom_camss]
> [ 6.594885] lr : camss_get_pixel_clock+0x18/0x60 [qcom_camss]
> [ 6.594889] sp : ffff800082d538f0
> [ 6.594890] x29: ffff800082d538f0 x28: ffff800082d53c70 x27: ffff670cc0404618
> [ 6.594893] x26: 0000000000000000 x25: 0000000000000000 x24: ffff670cd33173d0
> [ 6.594895] x23: ffff800082d539a8 x22: ffff670cd33192c8 x21: ffff800082d539b8
> [ 6.594898] x20: 0000000000000002 x19: 0000000000020001 x18: 0000000000000000
> [ 6.594900] x17: 0000000000000000 x16: ffffbf0bffbecdd0 x15: 0000000000000001
> [ 6.594902] x14: ffff670cc5c95300 x13: ffff670cc0b38980 x12: ffff670cc5c95ba8
> [ 6.594905] x11: ffffbf0c00f73000 x10: 0000000000000000 x9 : 0000000000000000
> [ 6.594907] x8 : ffffbf0c0085d000 x7 : 0000000000000000 x6 : 0000000000000078
> [ 6.594910] x5 : 0000000000000000 x4 : ffff670cd3318598 x3 : ffff670cd3318468
> [ 6.594912] x2 : ffff670cd3317728 x1 : ffff800082d539b8 x0 : 0000000000000000
> [ 6.594915] Call trace:
> [ 6.594915] camss_find_sensor+0x20/0x74 [qcom_camss]
> [ 6.594920] camss_get_pixel_clock+0x18/0x60 [qcom_camss]
> [ 6.594924] vfe_get+0xb8/0x504 [qcom_camss]
> [ 6.594931] vfe_set_power+0x30/0x58 [qcom_camss]
> [ 6.594936] pipeline_pm_power_one+0x13c/0x150 [videodev]
> [ 6.594951] pipeline_pm_power.part.0+0x58/0xf4 [videodev]
> [ 6.594960] v4l2_pipeline_pm_use+0x58/0x94 [videodev]
> [ 6.594969] v4l2_pipeline_pm_get+0x14/0x20 [videodev]
> [ 6.594978] video_open+0x78/0xf4 [qcom_camss]
> [ 6.594982] v4l2_open+0x80/0x120 [videodev]
As you may remember, the problem has been discussed in the past [1]. For
your information, the issue has been indirectly fixed in v6.17-rc1 by
removing v4l2_pipeline_pm_get() from .open; see commit
164202f68203 ("media: qcom: camss: Power pipeline only when streaming").
The old race is still unresolved and could lead to a NULL pointer
dereference, but in practice it would be close to impossible to
reproduce, since the additional step of starting a video stream is needed.
> [ 6.594991] chrdev_open+0xb4/0x204
> [ 6.594996] do_dentry_open+0x138/0x4d0
> [ 6.595000] vfs_open+0x2c/0xe4
> [ 6.595003] path_openat+0x2b4/0x9fc
> [ 6.595005] do_filp_open+0x80/0x130
> [ 6.595007] do_sys_openat2+0xb4/0xe8
> [ 6.595010] __arm64_sys_openat+0x64/0xac
> [ 6.595012] invoke_syscall+0x48/0x110
> [ 6.595016] el0_svc_common.constprop.0+0xc0/0xe0
> [ 6.595018] do_el0_svc+0x1c/0x28
> [ 6.595021] el0_svc+0x48/0x114
> [ 6.595023] el0t_64_sync_handler+0xc0/0xc4
> [ 6.595025] el0t_64_sync+0x190/0x194
> [ 6.595028] Code: 52800033 72a00053 d503201f f9402400 (f9401801)
> [ 6.595029] ---[ end trace 0000000000000000 ]---
[1] https://lore.kernel.org/all/aE_hlGHkRZqFFacR@hovoldconsulting.com/
--
Best wishes,
Vladimir
* Re: camss NULL-deref on power on with 6.12-rc2
2025-08-24 20:42 ` Vladimir Zapolskiy
@ 2025-08-29 9:15 ` Johan Hovold
0 siblings, 0 replies; 10+ messages in thread
From: Johan Hovold @ 2025-08-29 9:15 UTC (permalink / raw)
To: Vladimir Zapolskiy
Cc: Robert Foss, Todor Tomov, Bryan O'Donoghue, linux-media,
linux-arm-msm, linux-kernel
Hi Vladimir,
On Sun, Aug 24, 2025 at 11:42:26PM +0300, Vladimir Zapolskiy wrote:
> On 10/11/24 12:33, Johan Hovold wrote:
> > This morning I hit the below NULL-deref in camss when booting a 6.12-rc2
> > kernel on the Lenovo ThinkPad X13s.
> >
> > I booted the same kernel another 50 times without hitting it again it so
> > it may not be a regression, but simply an older, hard to hit bug.
> >
> > Hopefully you can figure out what went wrong from just staring at the
> > oops and code.
> > [ 5.657860] ov5675 24-0010: failed to get HW configuration: -517
> > [ 5.676183] vreg_l6q: Bringing 2800000uV into 1800000-1800000uV
> >
> > [ 6.517689] qcom-camss ac5a000.camss: Adding to iommu group 22
> >
> > [ 6.589201] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000030
> > [ 6.594868] CPU: 0 UID: 0 PID: 557 Comm: v4l_id Not tainted 6.12.0-rc2 #165
> > [ 6.594871] Hardware name: LENOVO 21BYZ9SRUS/21BYZ9SRUS, BIOS N3HET87W (1.59 ) 12/05/2023
> > [ 6.594915] Call trace:
> > [ 6.594915] camss_find_sensor+0x20/0x74 [qcom_camss]
> > [ 6.594920] camss_get_pixel_clock+0x18/0x60 [qcom_camss]
> > [ 6.594924] vfe_get+0xb8/0x504 [qcom_camss]
> > [ 6.594931] vfe_set_power+0x30/0x58 [qcom_camss]
> > [ 6.594936] pipeline_pm_power_one+0x13c/0x150 [videodev]
> > [ 6.594951] pipeline_pm_power.part.0+0x58/0xf4 [videodev]
> > [ 6.594960] v4l2_pipeline_pm_use+0x58/0x94 [videodev]
> > [ 6.594969] v4l2_pipeline_pm_get+0x14/0x20 [videodev]
> > [ 6.594978] video_open+0x78/0xf4 [qcom_camss]
> > [ 6.594982] v4l2_open+0x80/0x120 [videodev]
>
> As you may remember, the problem has been discussed in the past [1]. For
> your information, the issue has been indirectly fixed in v6.17-rc1 by
> removing v4l2_pipeline_pm_get() from .open; see commit
> 164202f68203 ("media: qcom: camss: Power pipeline only when streaming").
>
> The old race is still unresolved and could lead to a NULL pointer
> dereference, but in practice it would be close to impossible to
> reproduce, since the additional step of starting a video stream is needed.
Thanks for the update. It would still be good to fix the underlying issue
properly, especially since I don't think we know exactly how big that
race window is, and also to prevent further hard-to-track-down issues
from being introduced later.
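
As a rough illustration of what fixing the underlying issue means, here is a user-space sketch of the ordering rule: finish all setup that open() depends on, and only then make the device visible. The mock_dev structure and the mock_probe() and mock_open() helpers are hypothetical and single-threaded; a real driver would also have to consider locking and the async notifier paths, which this does not model.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a driver instance (not camss code). */
struct mock_dev {
	int pads_ready;		/* set once the media pads have been initialised */
	int registered;		/* set once the video node is visible to user space */
};

/* Initialise everything open() relies on first, register the node last. */
static int mock_probe(struct mock_dev *dev)
{
	dev->pads_ready = 1;	/* set up internal state ... */
	dev->registered = 1;	/* ... and only then expose the device */
	return 0;
}

/* What user space (e.g. v4l_id run by udev) can trigger once the node exists. */
static int mock_open(const struct mock_dev *dev)
{
	if (!dev->registered)
		return -ENODEV;	/* node not visible yet, so open cannot happen */
	if (!dev->pads_ready)
		return -EAGAIN;	/* the window that exists if registration comes first */
	return 0;
}
```

With the order in mock_probe() reversed, mock_open() could observe registered set while pads_ready is still clear, which corresponds to the window hit in the oops.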
Johan
> [1] https://lore.kernel.org/all/aE_hlGHkRZqFFacR@hovoldconsulting.com/
end of thread, other threads:[~2025-08-29 9:15 UTC | newest]
Thread overview: 10+ messages
2024-10-11 9:33 camss NULL-deref on power on with 6.12-rc2 Johan Hovold
2024-10-11 9:41 ` Bryan O'Donoghue
2024-10-11 9:54 ` Johan Hovold
2025-04-07 9:12 ` Johan Hovold
2025-04-07 9:58 ` Bryan O'Donoghue
2025-04-07 10:38 ` Johan Hovold
2025-04-07 11:01 ` Johan Hovold
2025-04-07 13:49 ` Johan Hovold
2025-08-24 20:42 ` Vladimir Zapolskiy
2025-08-29 9:15 ` Johan Hovold