From: Johan Hovold <johan@kernel.org>
To: Jon Hunter <jonathanh@nvidia.com>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Linux PM <linux-pm@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Alan Stern <stern@rowland.harvard.edu>,
	Bjorn Helgaas <helgaas@kernel.org>,
	Linux PCI <linux-pci@vger.kernel.org>,
	Ulf Hansson <ulf.hansson@linaro.org>,
	Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>,
	Kevin Xie <kevin.xie@starfivetech.com>,
	"linux-tegra@vger.kernel.org" <linux-tegra@vger.kernel.org>
Subject: Re: [PATCH v1] PM: sleep: core: Synchronize runtime PM status of parents and children
Date: Fri, 7 Feb 2025 14:50:29 +0100
Message-ID: <Z6YPpbRF_U0TxAbf@hovoldconsulting.com>
In-Reply-To: <1c2433d4-7e0f-4395-b841-b8eac7c25651@nvidia.com>

On Fri, Feb 07, 2025 at 01:38:58PM +0000, Jon Hunter wrote:
> On 28/01/2025 19:24, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > Commit 6e176bf8d461 ("PM: sleep: core: Do not skip callbacks in the
> > resume phase") overlooked the case in which the parent of a device with
> > DPM_FLAG_SMART_SUSPEND set did not use that flag and could be runtime-
> > suspended before a transition into a system-wide sleep state.  In that
> > case, if the child is resumed during the subsequent transition from
> > that state into the working state, its runtime PM status will be set to
> > RPM_ACTIVE, but the runtime PM status of the parent will not be updated
> > accordingly, even though the parent will be resumed too, because of the
> > dev_pm_skip_suspend() check in device_resume_noirq().
> > 
> > Address this problem by tracking the need to set the runtime PM status
> > to RPM_ACTIVE during system-wide resume transitions for devices with
> > DPM_FLAG_SMART_SUSPEND set and all of the devices depended on by them.
> > 
> > Fixes: 6e176bf8d461 ("PM: sleep: core: Do not skip callbacks in the resume phase")
> > Closes: https://lore.kernel.org/linux-pm/Z30p2Etwf3F2AUvD@hovoldconsulting.com/
> > Reported-by: Johan Hovold <johan@kernel.org>
> > Tested-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
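
For readers skimming the thread, the idea in the quoted commit message can be illustrated with a toy model. This is only a sketch under stated assumptions, not the kernel implementation: the `Device` class, the `set_active` attribute and the helper names are hypothetical, and only parent links are modelled (the commit message also covers other devices that a flagged device depends on).

```python
# Toy model of the behaviour described in the commit message.
# All names here are illustrative assumptions, not kernel APIs.
RPM_ACTIVE = "active"
RPM_SUSPENDED = "suspended"


class Device:
    def __init__(self, name, parent=None, smart_suspend=False,
                 status=RPM_ACTIVE):
        self.name = name
        self.parent = parent
        self.smart_suspend = smart_suspend  # stands in for DPM_FLAG_SMART_SUSPEND
        self.status = status                # runtime PM status
        self.set_active = False             # hypothetical tracking flag


def mark_set_active(dev):
    """Flag a device and every ancestor it depends on."""
    while dev is not None and not dev.set_active:
        dev.set_active = True
        dev = dev.parent


def system_resume(devices):
    """Resume all devices, then sync the status of the flagged ones."""
    for dev in devices:
        if dev.smart_suspend:
            mark_set_active(dev)
    for dev in devices:
        if dev.set_active:
            dev.status = RPM_ACTIVE
            dev.set_active = False


bus = Device("bus", status=RPM_SUSPENDED)
ctrl = Device("ctrl", parent=bus, smart_suspend=True, status=RPM_SUSPENDED)
other = Device("other", status=RPM_SUSPENDED)
system_resume([bus, ctrl, other])
print(bus.status, ctrl.status, other.status)
```

In this model the parent `bus` ends up active together with its child even though it does not carry the flag itself, while the unrelated `other` device keeps its runtime-suspended status.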

> I am seeing the following crash during suspend on a couple of our boards (with mainline/next) and bisect is pointing to this commit ...
> 
> [  216.311009] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000

> [  216.468986] Call trace:
> [  216.471179]  simple_pm_bus_runtime_suspend+0x14/0x48 (P)
> [  216.476775]  pm_generic_runtime_suspend+0x2c/0x44
> [  216.481499]  pm_runtime_force_suspend+0x54/0x14c
> [  216.486049]  device_suspend_noirq+0x6c/0x278
> [  216.490253]  dpm_suspend_noirq+0xc0/0x198
> [  216.494278]  suspend_devices_and_enter+0x210/0x4c0
> [  216.499348]  pm_suspend+0x164/0x1c8
> [  216.503023]  state_store+0x8c/0xfc
> [  216.506260]  kobj_attr_store+0x18/0x2c
> [  216.509940]  sysfs_kf_write+0x44/0x54
> [  216.513699]  kernfs_fop_write_iter+0x118/0x1a8
> [  216.518163]  vfs_write+0x2b0/0x35c
> [  216.521399]  ksys_write+0x68/0xfc
> [  216.524810]  __arm64_sys_write+0x1c/0x28
> [  216.528574]  invoke_syscall+0x48/0x110
> [  216.532253]  el0_svc_common.constprop.0+0x40/0xe8
> [  216.536628]  do_el0_svc+0x20/0x2c
> [  216.540299]  el0_svc+0x30/0xd0
> [  216.543016]  el0t_64_sync_handler+0x144/0x168
> [  216.547736]  el0t_64_sync+0x198/0x19c
> [  216.551327] Code: a9be7bfd 910003fd a90153f3 f9403c00 (f9400014)
> [  216.557197] ---[ end trace 0000000000000000 ]---
> 
> I have not looked any further, but if you have any thoughts, let me know.

Yeah, I hit something like this yesterday as well and did confirm that
reverting this commit makes the problem go away. Haven't had time to dig
much further.

[  110.522368] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[  110.531751] Mem abort info:
[  110.534785]   ESR = 0x0000000096000004
[  110.538799]   EC = 0x25: DABT (current EL), IL = 32 bits
[  110.544421]   SET = 0, FnV = 0
[  110.547716]   EA = 0, S1PTW = 0
[  110.551097]   FSC = 0x04: level 0 translation fault
[  110.556274] Data abort info:
[  110.559385]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[  110.565188]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[  110.570536]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[  110.576157] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000892256000
[  110.582946] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
[  110.590348] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
...
[  110.742358] CPU: 2 UID: 0 PID: 420 Comm: suspend-test.sh Not tainted 6.13.0 #118
[  110.750067] Hardware name: Qualcomm CRD, BIOS 6.0.231221.BOOT.MXF.2.4-00348.1-HAMOA-1 12/21/2023
[  110.759198] pstate: 81400005 (Nzcv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[  110.766462] pc : simple_pm_bus_runtime_suspend+0x14/0x48
[  110.772048] lr : pm_generic_runtime_suspend+0x2c/0x44
[  110.777352] sp : ffff8000839d3a20
[  110.780866] x29: ffff8000839d3a20 x28: ffff0edf80fff810 x27: ffffa2dc2d1f1d30
[  110.788303] x26: ffffa2dc2dd82000 x25: 0000000000000002 x24: ffffa2dc2cc7aca0
[  110.795741] x23: ffffa2dc2dd1e000 x22: 0000000000000000 x21: ffffa2dc2d090e50
[  110.803177] x20: ffffa2dc2c612498 x19: ffff0edf80fff810 x18: 0000000000000030
[  110.810615] x17: 0005002c00000000 x16: ffffa2dc2c614ce4 x15: ffffffffffffffff
[  110.818052] x14: 0000000000000000 x13: ffff0edf80fff980 x12: 705f64706e65675f
[  110.825490] x11: ffffa2dc2d9c5890 x10: 0000000000000000 x9 : 0000000000000000
[  110.832927] x8 : ffffa2dc2d2af000 x7 : ffff8000839d3a10 x6 : ffff8000839d39b0
[  110.840364] x5 : ffff8000839d4000 x4 : 0000000000000004 x3 : ffff0edf953e0000
[  110.847801] x2 : ffffa2dc2c4e5784 x1 : 0000000000000000 x0 : 0000000000000000
[  110.855238] Call trace:
[  110.857861]  simple_pm_bus_runtime_suspend+0x14/0x48 (P)
[  110.863425]  pm_generic_runtime_suspend+0x2c/0x44
[  110.868362]  pm_runtime_force_suspend+0x54/0x100
[  110.873217]  dpm_run_callback+0xb4/0x228
[  110.877347]  device_suspend_noirq+0x70/0x2a8
[  110.881844]  dpm_noirq_suspend_devices+0xe0/0x230
[  110.886778]  dpm_suspend_noirq+0x24/0x98
[  110.890904]  suspend_devices_and_enter+0x368/0x678
[  110.895941]  pm_suspend+0x1b4/0x348
[  110.899627]  state_store+0x8c/0xfc
[  110.903228]  kobj_attr_store+0x18/0x2c
[  110.907195]  sysfs_kf_write+0x4c/0x78
[  110.911074]  kernfs_fop_write_iter+0x120/0x1b4
[  110.915735]  vfs_write+0x2ac/0x358
[  110.919352]  ksys_write+0x68/0xfc
[  110.922873]  __arm64_sys_write+0x1c/0x28
[  110.927002]  invoke_syscall+0x48/0x110
[  110.930969]  el0_svc_common.constprop.0+0x40/0xe0
[  110.935907]  do_el0_svc+0x1c/0x28
[  110.939427]  el0_svc+0x48/0x114
[  110.942769]  el0t_64_sync_handler+0xc8/0xcc
[  110.947180]  el0t_64_sync+0x198/0x19c
[  110.951059] Code: a9be7bfd 910003fd a90153f3 f9403c00 (f9400014)
[  110.957428] ---[ end trace 0000000000000000 ]---
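
As an aid for reading the oops above, here is a minimal sketch of decoding the ESR value and the two load instructions at the end of the "Code:" line by hand. The helper names are made up for illustration. The field positions assume the AArch64 ESR_ELx layout for data aborts and the A64 encoding of the 64-bit LDR (immediate, unsigned offset) instruction.

```python
# Illustrative helpers; the names are hypothetical, not kernel APIs.

def decode_esr(esr):
    """Split an AArch64 ESR_ELx value for a data abort into named fields."""
    iss = esr & 0x1FFFFFF              # ISS, bits [24:0]
    return {
        "EC": (esr >> 26) & 0x3F,      # exception class, bits [31:26]
        "IL": (esr >> 25) & 0x1,       # 1 means a 32-bit instruction
        "ISS": iss,
        "ISV": (iss >> 24) & 0x1,      # instruction syndrome valid
        "WnR": (iss >> 6) & 0x1,       # 0 = read access, 1 = write access
        "FSC": iss & 0x3F,             # fault status code
    }


def decode_ldr_uimm64(insn):
    """Decode a 64-bit LDR (immediate, unsigned offset); None if not one."""
    if (insn & 0xFFC00000) != 0xF9400000:
        return None
    imm12 = (insn >> 10) & 0xFFF
    return {
        "rt": insn & 0x1F,             # destination register number
        "rn": (insn >> 5) & 0x1F,      # base register number (31 would be sp)
        "offset": imm12 * 8,           # immediate is scaled by the access size
    }


esr = decode_esr(0x96000004)
print({name: hex(value) for name, value in esr.items()})
for word in (0xF9403C00, 0xF9400014):
    print(hex(word), decode_ldr_uimm64(word))
```

The decoded EC (0x25), FSC (0x04) and WnR (0) match the "Mem abort info" and "Data abort info" lines. The word in parentheses, f9400014, decodes to a 64-bit load into x20 from the address held in x0 with offset 0, and x0 is zero in the register dump, which is consistent with the reported NULL pointer dereference at address 0. The preceding word, f9403c00, loads x0 itself from offset 120 of the previous x0.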

Johan
