Linux Power Management development
From: Bjorn Helgaas <helgaas@kernel.org>
To: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
Cc: "Jingoo Han" <jingoohan1@gmail.com>,
	"Manivannan Sadhasivam" <mani@kernel.org>,
	"Lorenzo Pieralisi" <lpieralisi@kernel.org>,
	"Krzysztof Wilczyński" <kwilczynski@kernel.org>,
	"Rob Herring" <robh@kernel.org>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Will Deacon" <will@kernel.org>,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-arm-msm@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, jonathanh@nvidia.com,
	bjorn.andersson@oss.qualcomm.com, "Frank Li" <Frank.Li@nxp.com>,
	linux-pm@vger.kernel.org
Subject: Re: [PATCH v5 4/5] PCI: dwc: Use common D3cold eligibility helper in suspend path
Date: Tue, 19 May 2026 19:01:53 -0500
Message-ID: <20260520000153.GA14400@bhelgaas>
In-Reply-To: <20260429-d3cold-v5-4-89e9735b9df6@oss.qualcomm.com>

[+cc Frank, linux-pm]

On Wed, Apr 29, 2026 at 12:12:26PM +0530, Krishna Chaitanya Chundru wrote:
> Previously, the driver skipped putting the link into L2/device state in
> D3cold whenever L1 ASPM was enabled, since some devices (e.g. NVMe) expect
> low resume latency and may not tolerate deeper power states.

I think "some devices expect low resume latency and may not tolerate
deeper power states" conveys the wrong message.  It's not that NVMe
has a mysterious acceptable resume latency number that we have to meet
or that NVMe has some inherent aversion to D3cold or L1SS or whatever
"deeper power states" refers to.

It could be that ASPM L1 was configured incorrectly (e.g., an L1->L0
transition didn't happen within the advertised exit latency, leading
to some device access failure) or a device lost internal context when
the driver didn't expect it (e.g., the Qcom problem where L1SS exit
takes too long and results in a link-down and device reset [1]).

It sounds to me like the ASPM L1 check was a way to avoid problems
like that, but I don't think we ever really had a root cause.

[1] https://lore.kernel.org/linux-pci/20260519-l1ss-fix-v2-0-b2c3a4bdeb15@oss.qualcomm.com/

> However, such devices typically remain in D0 and are already covered
> by the new helper's requirement that all endpoints be in D3hot
> before the devices under host bridge may enter D3cold.

If we put the host bridge in D3cold, I assume the hierarchy below is
either put in D3cold as well, or at least every device in the
hierarchy will be reset as a consequence of the Root Port link going
down.

If the driver doesn't manage the device power state itself, I assume
we have the freedom to put the hierarchy in D3cold or reset it.

Do we have the same freedom if the driver *does* manage the power
state itself?  What if the driver put the device in D3hot, expecting
it to *stay* in D3hot?

I think pci_host_common_d3cold_possible() will see the device in D3hot
and decide that D3cold is possible.

(I'm looking at https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/power/pci.rst?id=v7.0#n746)
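
To make the question concrete, here is a minimal user-space sketch in C.
It is a toy model and not the kernel helper: toy_dev,
toy_d3cold_possible() and the driver_manages_pm flag are hypothetical
names, and the check encodes only the "all endpoints in D3hot" rule
described in the commit log.  Under that assumption, a device whose
driver put it in D3hot on purpose looks the same as any other D3hot
device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical power states for the toy model. */
enum toy_state { TOY_D0, TOY_D3HOT, TOY_D3COLD };

/* Hypothetical device record; not struct pci_dev. */
struct toy_dev {
	const char *name;
	enum toy_state state;
	bool driver_manages_pm;	/* driver chose the power state itself */
};

/*
 * Toy eligibility check: D3cold is allowed only if every device is
 * already in D3hot.  It deliberately ignores driver_manages_pm, which
 * is the assumption being questioned.
 */
bool toy_d3cold_possible(const struct toy_dev *devs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (devs[i].state != TOY_D3HOT)
			return false;
	return true;
}

/* Exercise the three cases discussed in the thread. */
void toy_d3cold_demo(void)
{
	struct toy_dev core_managed = { "ep0", TOY_D3HOT, false };
	struct toy_dev drv_managed  = { "ep1", TOY_D3HOT, true };
	struct toy_dev active       = { "nvme", TOY_D0, true };

	/* A device left in D0 blocks D3cold, as the commit log says. */
	assert(!toy_d3cold_possible(&active, 1));

	/* Both D3hot devices pass, whoever put them there. */
	assert(toy_d3cold_possible(&core_managed, 1));
	assert(toy_d3cold_possible(&drv_managed, 1));
}
```

In this model the only way to tell the two D3hot cases apart would be
for the check to look at something like driver_manages_pm, which is
exactly the information the sketch assumes is not consulted.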

> So, replace the local L1/L1SS-based check in dw_pcie_suspend_noirq() with
> the shared pci_host_common_d3cold_possible() helper to decide whether the
> devices under host bridge can safely transition to D3cold.
> 
> In addition, propagate PME-from-D3cold capability information from the
> helper and record it in skip_pwrctrl_off. Some devices (e.g. M.2 cards
> without auxiliary power) may lose PME detection when main power is
> removed, even if they advertise PME-from-D3cold support. This allows
> controller power-off to be skipped when required to preserve wakeup
> functionality.
> 
> Update the suspended flag in dw_pcie_resume_noirq() only after the PCIe
> link resumes successfully, to avoid marking the controller active when
> link resume fails.
> 
> Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
> ---
>  drivers/pci/controller/dwc/pcie-designware-host.c | 15 +++++++--------
>  drivers/pci/controller/dwc/pcie-designware.h      |  1 +
>  2 files changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> index c9517a348836..9e409a1909e6 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> @@ -16,9 +16,11 @@
>  #include <linux/msi.h>
>  #include <linux/of_address.h>
>  #include <linux/of_pci.h>
> +#include <linux/pci.h>
>  #include <linux/pci_regs.h>
>  #include <linux/platform_device.h>
>  
> +#include "../pci-host-common.h"
>  #include "../../pci.h"
>  #include "pcie-designware.h"
>  
> @@ -1218,18 +1220,14 @@ static int dw_pcie_pme_turn_off(struct dw_pcie *pci)
>  
>  int dw_pcie_suspend_noirq(struct dw_pcie *pci)
>  {
> -	u8 offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
> +	bool pme_capable = false;
>  	int ret = 0;
>  	u32 val;
>  
>  	if (!dw_pcie_link_up(pci))
>  		goto stop_link;
>  
> -	/*
> -	 * If L1SS is supported, then do not put the link into L2 as some
> -	 * devices such as NVMe expect low resume latency.
> -	 */
> -	if (dw_pcie_readw_dbi(pci, offset + PCI_EXP_LNKCTL) & PCI_EXP_LNKCTL_ASPM_L1)
> +	if (!pci_host_common_d3cold_possible(pci->pp.bridge, &pme_capable))
>  		return 0;
>  
>  	if (pci->pp.ops->pme_turn_off) {
> @@ -1273,6 +1271,7 @@ int dw_pcie_suspend_noirq(struct dw_pcie *pci)
>  	udelay(1);
>  
>  stop_link:
> +	pci->pp.skip_pwrctrl_off = pme_capable;
>  	dw_pcie_stop_link(pci);
>  	if (pci->pp.ops->deinit)
>  		pci->pp.ops->deinit(&pci->pp);
> @@ -1290,8 +1289,6 @@ int dw_pcie_resume_noirq(struct dw_pcie *pci)
>  	if (!pci->suspended)
>  		return 0;
>  
> -	pci->suspended = false;
> -
>  	if (pci->pp.ops->init) {
>  		ret = pci->pp.ops->init(&pci->pp);
>  		if (ret) {
> @@ -1313,6 +1310,8 @@ int dw_pcie_resume_noirq(struct dw_pcie *pci)
>  	if (pci->pp.ops->post_init)
>  		pci->pp.ops->post_init(&pci->pp);
>  
> +	pci->suspended = false;
> +
>  	return 0;
>  
>  err_stop_link:
> diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
> index 3e69ef60165b..e759c5c7257e 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.h
> +++ b/drivers/pci/controller/dwc/pcie-designware.h
> @@ -450,6 +450,7 @@ struct dw_pcie_rp {
>  	bool			ecam_enabled;
>  	bool			native_ecam;
>  	bool                    skip_l23_ready;
> +	bool			skip_pwrctrl_off;
>  };
>  
>  struct dw_pcie_ep_ops {
> 
> -- 
> 2.34.1
> 
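
The effect of moving the "suspended" update in dw_pcie_resume_noirq()
can be sketched with a small user-space model in C.  Everything here is
a hypothetical stand-in (toy_ctrl, toy_link_up(), toy_resume_old(),
toy_resume_new()), not the DWC code; it only assumes that a second
resume attempt may be made after a failed one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the controller state; not struct dw_pcie. */
struct toy_ctrl {
	bool suspended;
	bool link_ok;		/* whether the simulated link bring-up succeeds */
	int init_calls;		/* how many times resume tried to bring the link up */
};

/* Simulated link bring-up: returns 0 on success, -1 on failure. */
int toy_link_up(struct toy_ctrl *c)
{
	c->init_calls++;
	return c->link_ok ? 0 : -1;
}

/* Old ordering: flag cleared before the link is known to be up. */
int toy_resume_old(struct toy_ctrl *c)
{
	if (!c->suspended)
		return 0;
	c->suspended = false;
	return toy_link_up(c);
}

/* New ordering: flag cleared only after the link resumes successfully. */
int toy_resume_new(struct toy_ctrl *c)
{
	int ret;

	if (!c->suspended)
		return 0;
	ret = toy_link_up(c);
	if (ret)
		return ret;
	c->suspended = false;
	return 0;
}

/* A failed resume followed by a retry once the link works. */
void toy_resume_demo(void)
{
	struct toy_ctrl a = { .suspended = true, .link_ok = false };
	struct toy_ctrl b = { .suspended = true, .link_ok = false };

	/* Old ordering: the failure leaves the controller marked active. */
	assert(toy_resume_old(&a) == -1);
	assert(!a.suspended);
	a.link_ok = true;
	assert(toy_resume_old(&a) == 0);	/* early return, no retry */
	assert(a.init_calls == 1);

	/* New ordering: the failure keeps the flag, so a retry does the work. */
	assert(toy_resume_new(&b) == -1);
	assert(b.suspended);
	b.link_ok = true;
	assert(toy_resume_new(&b) == 0);
	assert(b.init_calls == 2 && !b.suspended);
}
```

With the old ordering the second call returns 0 without touching the
link, which is the "marked active although resume failed" case the
commit log describes.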
