linux-media.vger.kernel.org archive mirror
* arm64 dragonboard 410c Internal error Oops dev_pm_opp_put core_clks_enable
@ 2025-07-18 11:13 Naresh Kamboju
  2025-07-18 11:28 ` Arnd Bergmann
  0 siblings, 1 reply; 4+ messages in thread
From: Naresh Kamboju @ 2025-07-18 11:13 UTC (permalink / raw)
  To: open list, lkft-triage, Linux Regressions, linux-clk,
	linux-arm-msm, Linux Media Mailing List
  Cc: quic_vgarodia, quic_dikshita, Bryan O'Donoghue,
	Mauro Carvalho Chehab, Arnd Bergmann, Anders Roxell,
	Dan Carpenter, Benjamin Copeland

The following boot regression was noticed on Linux next-20250708
with the gcc-13 and clang-20 toolchains on the Dragonboard 410c
device.

First seen on the tag next-20250708.
Good: next-20250704
Bad:  next-20250708

Regression Analysis:
- New regression? Yes
- Reproducibility? Yes

Boot regression: arm64 dragonboard 410c Internal error Oops
dev_pm_opp_put core_clks_enable

Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>

## Test log
[   12.512749] Internal error: Oops: 0000000096000004 [#1]  SMP
[   12.518471] Modules linked in: drm_dp_aux_bus qcom_vadc_common
venus_core(+) qcom_pon(+) qmi_helpers videobuf2_dma_sg qnoc_msm8916
qcom_stats drm_display_helper v4l2_mem2mem videobuf2_memops qcom_rng
mdt_loader videobuf2_v4l2 cec videobuf2_common drm_client_lib
display_connector rpmsg_ctrl rpmsg_char ramoops drm_kms_helper socinfo
reed_solomon rmtfs_mem fuse drm backlight ip_tables x_tables
[   12.527390] input: pm8941_resin as
/devices/platform/soc@0/200f000.spmi/spmi-0/0-00/200f000.spmi:pmic@0:pon@800/200f000.spmi:pmic@0:pon@800:resin/input/input2
[   12.536414] CPU: 1 UID: 0 PID: 245 Comm: (udev-worker) Not tainted
6.16.0-rc6-next-20250717 #1 PREEMPT
[   12.536428] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
[   12.536435] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   12.536445] pc : dev_pm_opp_put (/builds/linux/drivers/opp/core.c:1685)
[   12.595660] lr : core_clks_enable+0x54/0x148 venus_core
[   12.595754] sp : ffff80008492b600
[   12.595760] x29: ffff80008492b600 x28: ffff80008492bba0 x27: ffff0000047a6138
[   12.595778] x26: 0000000000000000 x25: ffff800082c4fe38 x24: ffff80007b3b8ba0
[   12.595795] x23: ffff00000b2e00c8 x22: ffff00000b2e0080 x21: 0000000000000000
[   12.595811] x20: 0000000000000000 x19: ffffffffffffffee x18: 0000000000000000
[   12.595827] x17: 0000000000000000 x16: 1fffe000006c0ae1 x15: 0000000000000000
[   12.629871] x14: 0000000000000000 x13: 007473696c5f7974 x12: 696e696666615f65
[   12.629890] x11: ffff00003fa551c0 x10: 0000000000000020 x9 : ffff80007b3a5684
[   12.629908] x8 : ffffffffffffffde x7 : ffff000009ade040 x6 : 0000000000000000
[   12.629924] x5 : 0000000000000002 x4 : 00000000c0000000 x3 : 0000000000000001
[   12.629939] x2 : 0000000000000002 x1 : ffffffffffffffde x0 : ffffffffffffffee
[   12.629956] Call trace:
[   12.629962]  dev_pm_opp_put+0x24/0x58 (P)
[   12.629981]  core_clks_enable+0x54/0x148 venus_core
[   12.630064]  core_power_v1+0x78/0x90 venus_core
[   12.691130]  venus_runtime_resume+0x6c/0x98 venus_core
[   12.691214]  pm_generic_runtime_resume+0x34/0x58
[   12.691233]  __genpd_runtime_resume+0x38/0x90
[   12.691247]  genpd_runtime_resume+0xe0/0x2f0
[   12.691261]  __rpm_callback+0x50/0x1f0
[   12.691272]  rpm_callback+0x7c/0x90
[   12.691281]  rpm_resume+0x46c/0x650
[   12.721332]  __pm_runtime_resume+0x58/0xa8
[   12.721345]  venus_probe+0x2d8/0x588 venus_core
[   12.721409]  platform_probe+0x64/0xa8
[   12.721423]  really_probe+0xc8/0x3a0
[   12.721433]  __driver_probe_device+0x84/0x170
[   12.721443]  driver_probe_device+0x44/0x120
[   12.721453]  __driver_attach+0xf8/0x208
[   12.749283]  bus_for_each_dev+0x90/0xf8
[   12.749299]  driver_attach+0x2c/0x40
[   12.749314]  bus_add_driver+0x118/0x248
[   12.749328]  driver_register+0x64/0x138
[   12.749339]  __platform_driver_register+0x2c/0x40
[   12.749350]  qcom_venus_driver_init+0x28/0xfb8 venus_core
[   12.772990]  do_one_initcall+0x60/0x290
[   12.773012]  do_init_module+0x60/0x268
[   12.773028]  load_module+0x1e00/0x2060
[   12.773042]  init_module_from_file+0x90/0xe0
[   12.773057]  __arm64_sys_finit_module+0x270/0x370
[   12.773070]  invoke_syscall+0x50/0x120
[   12.773081]  el0_svc_common.constprop.0+0xc8/0xf0
[   12.773091]  do_el0_svc+0x24/0x38
[   12.773100]  el0_svc+0x3c/0x138
[   12.773116]  el0t_64_sync_handler+0xa0/0xe8
[   12.773130]  el0t_64_sync+0x198/0x1a0
[   12.817608] Code: 910003fd f9000bf3 91004013 aa1303e0 (f9402821)
All code
========
   0: 910003fd mov x29, sp
   4: f9000bf3 str x19, [sp, #16]
   8: 91004013 add x19, x0, #0x10
   c: aa1303e0 mov x0, x19
  10:* f9402821 ldr x1, [x1, #80] <-- trapping instruction

Code starting with the faulting instruction
===========================================
   0: f9402821 ldr x1, [x1, #80]
[   12.817618] ---[ end trace 0000000000000000 ]---

...
[   38.070603] Internal error: Oops: 0000000096000004 [#2]  SMP
[   38.077336] Modules linked in: pm8916_wdt snd_soc_lpass_apq8016
snd_soc_msm8916_digital snd_soc_lpass_cpu snd_soc_msm8916_analog
snd_soc_apq8016_sbc msm qcom_wcnss_pil snd_soc_lpass_platform
snd_soc_qcom_common snd_soc_core snd_compress qrtr coresight_stm
ubwc_config coresight_cpu_debug snd_pcm_dmaengine qcom_q6v5_mss
llcc_qcom stm_core snd_pcm coresight_cti qcom_pil_info ocmem snd_timer
qcom_q6v5 drm_gpuvm adv7511 snd qcom_sysmon drm_exec soundcore
gpu_sched qcom_common qcom_spmi_temp_alarm rtc_pm8xxx qcom_spmi_vadc
qcom_glink_smem qcom_camss drm_dp_aux_bus qcom_vadc_common
venus_core(+) qcom_pon qmi_helpers videobuf2_dma_sg qnoc_msm8916
qcom_stats drm_display_helper v4l2_mem2mem videobuf2_memops qcom_rng
mdt_loader videobuf2_v4l2 cec videobuf2_common drm_client_lib
display_connector rpmsg_ctrl rpmsg_char ramoops drm_kms_helper socinfo
reed_solomon rmtfs_mem fuse drm backlight ip_tables x_tables
[   38.140171] CPU: 0 UID: 0 PID: 1202 Comm: irq/55-3-0039 Tainted: G
    D             6.16.0-rc6-next-20250717 #1 PREEMPT
[   38.162330] Tainted: [D]=DIE
[   38.173246] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
[   38.176294] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   38.182980] pc : adv7511_cec_register_volatile+0xc/0x40 adv7511
[   38.189659] lr : regmap_volatile
(/builds/linux/drivers/base/regmap/regmap.c:153)
[   38.195904] sp : ffff8000869e3bf0
[   38.199896] x29: ffff8000869e3bf0 x28: ffff00000b38eeac x27: ffff8000801e1700
[   38.203289] x26: ffff8000801de1b0 x25: ffff8000829a0210 x24: ffff00000a091d80
[   38.210408] x23: 0000000000000070 x22: 0000000000000096 x21: ffff8000869e3cb4
[   38.217525] x20: 0000000000000096 x19: ffff00000366f400 x18: 0000000000000098
[   38.224643] x17: 00000000000000c7 x16: 00000000000000c4 x15: 00000000000000bf
[   38.231761] x14: 00000000000000be x13: 00000000000000ff x12: 00000000000000f6
[   38.238881] x11: 00000000000000f4 x10: 00000000000000cb x9 : ffff800080c83384
[   38.245997] x8 : ffff8000869e3a78 x7 : 0000000000000000 x6 : 0000000000000001
[   38.253115] x5 : ffff8000829a0000 x4 : 0000000000000000 x3 : ffff00000366f400
[   38.260233] x2 : ffff80007b4b74d8 x1 : 0000000000000096 x0 : 0000000000000000
[   38.267353] Call trace:
[   38.274460]  adv7511_cec_register_volatile+0xc/0x40 adv7511 (P)
[   38.276726]  regcache_read+0x3c/0x100
[   38.282973]  _regmap_read+0x90/0x190
[   38.286615]  regmap_read+0x54/0x88
[   38.290260]  adv7511_cec_irq_process+0xb4/0x310 adv7511
[   38.293477]  adv7511_irq_process+0xc4/0x158 adv7511
[   38.298946]  adv7511_irq_handler+0x20/0x40 adv7511
[   38.303979]  irq_thread_fn+0x34/0xb8
[   38.309010]  irq_thread+0x198/0x3b0
[   38.312570]  kthread+0x138/0x228
[   38.315781]  ret_from_fork+0x10/0x20
[   38.319260] Code: ffff8000 aa1e03e9 d503201f f9403c00 (f941cc00)
All code
========
   0: ffff8000 .inst 0xffff8000 ; undefined
   4: aa1e03e9 mov x9, x30
   8: d503201f nop
   c: f9403c00 ldr x0, [x0, #120]
  10:* f941cc00 ldr x0, [x0, #920] <-- trapping instruction

Code starting with the faulting instruction
===========================================
   0: f941cc00 ldr x0, [x0, #920]
[   38.322823] ---[ end trace 0000000000000000 ]---


## Source
* Git tree: https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next.git
* Project: https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250717/
* Git sha: 024e09e444bd2b06aee9d1f3fe7b313c7a2df1bb
* Git describe: 6.16.0-rc6-next-20250717
* kernel version: next-20250717
* Architectures: arm64 (Dragonboard 410c)
* Toolchains: clang-20 gcc-13
* Kconfigs: defconfig+lkftconfigs

## Test
* Test log: https://qa-reports.linaro.org/api/testruns/29169813/log_file/
* Test LAVA: https://lkft.validation.linaro.org/scheduler/job/8361760#L3403
* Test run: https://regressions.linaro.org/lkft/linux-next-master/next-20250717/testruns/1662734/
* Test history:
https://regressions.linaro.org/lkft/linux-next-master/next-20250717/log-parser-test/internal-error-oops-oops-smp/history/
* Test plan: https://regressions.linaro.org/lkft/linux-next-master/next-20250717/log-parser-test/internal-error-oops-oops-smp/
* Build link: https://storage.tuxsuite.com/public/linaro/lkft/builds/2zzwEWZON9hQQK9VfaE276a89yt/
* Kernel config:
https://storage.tuxsuite.com/public/linaro/lkft/builds/2zzwEWZON9hQQK9VfaE276a89yt/config


--
Linaro LKFT
https://lkft.linaro.org


* Re: arm64 dragonboard 410c Internal error Oops dev_pm_opp_put core_clks_enable
  2025-07-18 11:13 arm64 dragonboard 410c Internal error Oops dev_pm_opp_put core_clks_enable Naresh Kamboju
@ 2025-07-18 11:28 ` Arnd Bergmann
  2025-07-23 13:28   ` Renjiang Han
  0 siblings, 1 reply; 4+ messages in thread
From: Arnd Bergmann @ 2025-07-18 11:28 UTC (permalink / raw)
  To: Naresh Kamboju, open list, lkft-triage, Linux Regressions,
	linux-clk, linux-arm-msm, Linux Media Mailing List
  Cc: quic_vgarodia, quic_dikshita, Bryan O'Donoghue,
	Mauro Carvalho Chehab, Anders Roxell, Dan Carpenter,
	Benjamin Copeland, Renjiang Han

On Fri, Jul 18, 2025, at 13:13, Naresh Kamboju wrote:
> The following Boot regressions are noticed on the Linux
> next-20250708 with gcc-13 and clang-20 toolchains for the dragonboard
> 410c device.

> [   12.629924] x5 : 0000000000000002 x4 : 00000000c0000000 x3 : 0000000000000001
> [   12.629939] x2 : 0000000000000002 x1 : ffffffffffffffde x0 : ffffffffffffffee
> [   12.629956] Call trace:
> [   12.629962]  dev_pm_opp_put+0x24/0x58 (P)
> [   12.629981]  core_clks_enable+0x54/0x148 venus_core
> [   12.630064]  core_power_v1+0x78/0x90 venus_core
> [   12.691130]  venus_runtime_resume+0x6c/0x98 venus_core

> [   12.817608] Code: 910003fd f9000bf3 91004013 aa1303e0 (f9402821)
> All code
> ========
>    0: 910003fd mov x29, sp
>    4: f9000bf3 str x19, [sp, #16]
>    8: 91004013 add x19, x0, #0x10
>    c: aa1303e0 mov x0, x19
>   10:* f9402821 ldr x1, [x1, #80] <-- trapping instruction

It's loading from 'x1', which is an error pointer ffffffffffffffde
(-34, i.e. -ERANGE).  The caller was modified by Renjiang Han (added to Cc)
in commit b179234b5e59 ("media: venus: pm_helpers: use opp-table
for the frequency").

The new version of the code is now:

static int core_clks_enable(struct venus_core *core)
 {
        const struct venus_resources *res = core->res;
+       struct device *dev = core->dev;
+       unsigned long freq = 0;
+       struct dev_pm_opp *opp;
        unsigned int i;
        int ret;
 
+       opp = dev_pm_opp_find_freq_ceil(dev, &freq);
+       dev_pm_opp_put(opp);
 
Where the 'opp' pointer is the error code and gets passed
into dev_pm_opp_put() without checking for the error condition.

    Arnd


* Re: arm64 dragonboard 410c Internal error Oops dev_pm_opp_put core_clks_enable
  2025-07-18 11:28 ` Arnd Bergmann
@ 2025-07-23 13:28   ` Renjiang Han
  2025-07-24  9:59     ` Naresh Kamboju
  0 siblings, 1 reply; 4+ messages in thread
From: Renjiang Han @ 2025-07-23 13:28 UTC (permalink / raw)
  To: Arnd Bergmann, Naresh Kamboju, open list, lkft-triage,
	Linux Regressions, linux-clk, linux-arm-msm,
	Linux Media Mailing List
  Cc: quic_vgarodia, quic_dikshita, Bryan O'Donoghue,
	Mauro Carvalho Chehab, Anders Roxell, Dan Carpenter,
	Benjamin Copeland

On 7/18/2025 7:28 PM, Arnd Bergmann wrote:
> On Fri, Jul 18, 2025, at 13:13, Naresh Kamboju wrote:
>> The following Boot regressions are noticed on the Linux
>> next-20250708 with gcc-13 and clang-20 toolchains for the dragonboard
>> 410c device.
>> [   12.629924] x5 : 0000000000000002 x4 : 00000000c0000000 x3 : 0000000000000001
>> [   12.629939] x2 : 0000000000000002 x1 : ffffffffffffffde x0 : ffffffffffffffee
>> [   12.629956] Call trace:
>> [   12.629962]  dev_pm_opp_put+0x24/0x58 (P)
>> [   12.629981]  core_clks_enable+0x54/0x148 venus_core
>> [   12.630064]  core_power_v1+0x78/0x90 venus_core
>> [   12.691130]  venus_runtime_resume+0x6c/0x98 venus_core
>> [   12.817608] Code: 910003fd f9000bf3 91004013 aa1303e0 (f9402821)
>> All code
>> ========
>>     0: 910003fd mov x29, sp
>>     4: f9000bf3 str x19, [sp, #16]
>>     8: 91004013 add x19, x0, #0x10
>>     c: aa1303e0 mov x0, x19
>>    10:* f9402821 ldr x1, [x1, #80] <-- trapping instruction
> It's loading from 'x1', which is an error pointer ffffffffffffffde
> (-34, i.e. -ERANGE).  The caller was modified by Renjiang Han (added to Cc)
> in commit b179234b5e59 ("media: venus: pm_helpers: use opp-table
> for the frequency").
>
> The new version of the code is now
>
> static int core_clks_enable(struct venus_core *core)
>   {
>          const struct venus_resources *res = core->res;
> +       struct device *dev = core->dev;
> +       unsigned long freq = 0;
> +       struct dev_pm_opp *opp;
>          unsigned int i;
>          int ret;
>   
> +       opp = dev_pm_opp_find_freq_ceil(dev, &freq);
> +       dev_pm_opp_put(opp);
>   
> Where the 'opp' pointer is the error code and gets passed
> into dev_pm_opp_put() without checking for the error condition.
Thank you for pointing it out.
I have submitted the following patch to fix this issue.
https://lore.kernel.org/linux-arm-msm/20250723-fallback_of_opp_table-v1-1-20a6277fdded@quicinc.com
>
>      Arnd

-- 
Best Regards,
Renjiang



* Re: arm64 dragonboard 410c Internal error Oops dev_pm_opp_put core_clks_enable
  2025-07-23 13:28   ` Renjiang Han
@ 2025-07-24  9:59     ` Naresh Kamboju
  0 siblings, 0 replies; 4+ messages in thread
From: Naresh Kamboju @ 2025-07-24  9:59 UTC (permalink / raw)
  To: Renjiang Han
  Cc: Arnd Bergmann, open list, lkft-triage, Linux Regressions,
	linux-clk, linux-arm-msm, Linux Media Mailing List, quic_vgarodia,
	quic_dikshita, Bryan O'Donoghue, Mauro Carvalho Chehab,
	Anders Roxell, Dan Carpenter, Benjamin Copeland

On Wed, 23 Jul 2025 at 18:58, Renjiang Han <quic_renjiang@quicinc.com> wrote:
>
> On 7/18/2025 7:28 PM, Arnd Bergmann wrote:
> > On Fri, Jul 18, 2025, at 13:13, Naresh Kamboju wrote:
> >> The following Boot regressions are noticed on the Linux
> >> next-20250708 with gcc-13 and clang-20 toolchains for the dragonboard
> >> 410c device.
> >> [   12.629924] x5 : 0000000000000002 x4 : 00000000c0000000 x3 : 0000000000000001
> >> [   12.629939] x2 : 0000000000000002 x1 : ffffffffffffffde x0 : ffffffffffffffee
> >> [   12.629956] Call trace:
> >> [   12.629962]  dev_pm_opp_put+0x24/0x58 (P)
> >> [   12.629981]  core_clks_enable+0x54/0x148 venus_core
> >> [   12.630064]  core_power_v1+0x78/0x90 venus_core
> >> [   12.691130]  venus_runtime_resume+0x6c/0x98 venus_core
> >> [   12.817608] Code: 910003fd f9000bf3 91004013 aa1303e0 (f9402821)
> >> All code
> >> ========
> >>     0: 910003fd mov x29, sp
> >>     4: f9000bf3 str x19, [sp, #16]
> >>     8: 91004013 add x19, x0, #0x10
> >>     c: aa1303e0 mov x0, x19
> >>    10:* f9402821 ldr x1, [x1, #80] <-- trapping instruction
> > It's loading from 'x1', which is an error pointer ffffffffffffffde
> > (-34, i.e. -ERANGE).  The caller was modified by Renjiang Han (added to Cc)
> > in commit b179234b5e59 ("media: venus: pm_helpers: use opp-table
> > for the frequency").
> >
> > The new version of the code is now
> >
> > static int core_clks_enable(struct venus_core *core)
> >   {
> >          const struct venus_resources *res = core->res;
> > +       struct device *dev = core->dev;
> > +       unsigned long freq = 0;
> > +       struct dev_pm_opp *opp;
> >          unsigned int i;
> >          int ret;
> >
> > +       opp = dev_pm_opp_find_freq_ceil(dev, &freq);
> > +       dev_pm_opp_put(opp);
> >
> > Where the 'opp' pointer is the error code and gets passed
> > into dev_pm_opp_put() without checking for the error condition.
> Thank you for pointing it out.
> I have submitted the following patch to fix this issue.

I have applied this [1] patch set on top of the Linux next tree and
performed testing. The previously reported regressions [a] are no
longer observed.

Thank you for providing the fix.

Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>

[1] https://lore.kernel.org/linux-arm-msm/20250723-fallback_of_opp_table-v1-1-20a6277fdded@quicinc.com

Reference link:
[a] https://lore.kernel.org/all/CA+G9fYu5=3n84VY+vTbCAcfFKOq7Us5vgBZgpypY4MveM=eVwg@mail.gmail.com/

LAVA test job link:
 - https://lkft.validation.linaro.org/scheduler/job/8366971#L2573

--
Linaro LKFT
https://lkft.linaro.org

