Date: Fri, 25 Aug 2023 16:25:07 -0500
From: Bjorn Helgaas
To: Feiyang Chen
Cc: Feiyang Chen, bhelgaas@google.com, rafael.j.wysocki@intel.com,
	mika.westerberg@linux.intel.com, anders.roxell@linaro.org,
	linux-pci@vger.kernel.org, linux-pm@vger.kernel.org,
	guyinggang@loongson.cn, siyanteng@loongson.cn, chenhuacai@loongson.cn,
	loongson-kernel@lists.loongnix.cn, "Rafael J . Wysocki"
Subject: Re: [PATCH v3] PCI/PM: Only read PCI_PM_CTRL register when available
Message-ID: <20230825212507.GA627427@bhelgaas>
X-Mailing-List: linux-pm@vger.kernel.org

On Fri, Aug 25, 2023 at 11:57:00AM +0800, Feiyang Chen wrote:
> On Fri, Aug 25, 2023 at 5:59 AM Bjorn Helgaas wrote:
> > On Thu, Aug 24, 2023 at 09:37:38AM +0800, Feiyang Chen wrote:
> > > When the current state is already PCI_D0, pci_power_up() will return
> > > 0 even though dev->pm_cap is not set. In that case, we should not
> > > read the PCI_PM_CTRL register in pci_set_full_power_state().
> > >
> > > There is nothing more needs to be done below in that case.
> > > Additionally, pci_power_up() has two callers only and the other one
> > > ignores the return value, so we can safely move the current state
> > > check from pci_power_up() to pci_set_full_power_state().
> >
> > Does this fix a bug?  I guess it does, because previously
> > pci_set_full_power_state() did a config read at 0 + PCI_PM_CTRL, i.e.,
> > offset 4, which is actually PCI_COMMAND, and set dev->current_state
> > based on that.  So dev->current_state is now junk, right?
>
> Yes.
>
> > This might account for some "Refused to change power state from %s to D0"
> > messages.
> >
> > How did you find this?  It's nice if we can mention a symptom so
> > people can connect the problem with this fix.
>
> We are attempting to add MSI support for our stmmac driver, but the
> pci_alloc_irq_vectors() function always fails.
> After looking into it more, we came across the message "Refused to
> change power state from D3hot to D0" :)

So I guess this device doesn't have a PM Capability at all?  Can you
collect the "sudo lspci -vv" output?  The PM Capability is required
for all PCIe devices, so maybe this is a conventional PCI device?

> > This sounds like something that probably should have a stable tag?
>
> Do I need to include the symptom and Cc in the commit message and
> then send v4?
>
> > > Fixes: e200904b275c ("PCI/PM: Split pci_power_up()")
> > > Signed-off-by: Feiyang Chen
> > > Reviewed-by: Rafael J. Wysocki
> > > ---
> > >  drivers/pci/pci.c | 9 +++++----
> > >  1 file changed, 5 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > index 60230da957e0..7e90ab7b47a1 100644
> > > --- a/drivers/pci/pci.c
> > > +++ b/drivers/pci/pci.c
> > > @@ -1242,9 +1242,6 @@ int pci_power_up(struct pci_dev *dev)
> > >  		else
> > >  			dev->current_state = state;
> > >
> > > -		if (state == PCI_D0)
> > > -			return 0;
> > > -
> > >  		return -EIO;
> > >  	}
> > >
> > > @@ -1302,8 +1299,12 @@ static int pci_set_full_power_state(struct pci_dev *dev)
> > >  	int ret;
> > >
> > >  	ret = pci_power_up(dev);
> > > -	if (ret < 0)
> > > +	if (ret < 0) {
> > > +		if (dev->current_state == PCI_D0)
> > > +			return 0;
> > > +
> > >  		return ret;
> > > +	}
> > >  	pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
> > >  	dev->current_state = pmcsr & PCI_PM_CTRL_STATE_MASK;

One thing that makes me hesitate a little bit is that we rely on the
failure return from pci_power_up() to guard the dev->pm_cap usage.
That's slightly obscure, and I liked the way the v1 patch made it
explicit.  And it seems slightly weird that when there's no PM cap,
pci_power_up() always returns failure even if the platform was able to
put the device in D0.
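
To make the register aliasing discussed above concrete, here is a
minimal user-space sketch (not kernel code).  The register constants
mirror include/uapi/linux/pci_regs.h; the helper names pmcsr_offset()
and decode_state() are hypothetical, and the command register value
used in the note below is an assumption chosen for illustration, not
something read from the reporter's device.

```c
#include <stdint.h>

/* Register offsets and bits, mirroring include/uapi/linux/pci_regs.h */
#define PCI_COMMAND		0x04	/* 16-bit command register */
#define PCI_COMMAND_IO		0x1
#define PCI_COMMAND_MEMORY	0x2
#define PCI_COMMAND_MASTER	0x4
#define PCI_PM_CTRL		4	/* PMCSR offset within the PM Capability */
#define PCI_PM_CTRL_STATE_MASK	0x0003	/* D-state field: 0 = D0 ... 3 = D3hot */

/*
 * Hypothetical helper: the config-space offset that the PMCSR read
 * uses for a given PM Capability offset.  With pm_cap == 0 (no PM
 * Capability) this is offset 4, i.e. PCI_COMMAND.
 */
static unsigned int pmcsr_offset(uint8_t pm_cap)
{
	return pm_cap + PCI_PM_CTRL;
}

/*
 * Hypothetical helper: interpret a 16-bit register value as if it
 * were PMCSR and extract the D-state field.
 */
static unsigned int decode_state(uint16_t reg)
{
	return reg & PCI_PM_CTRL_STATE_MASK;
}
```

Under the assumption that the command register holds
PCI_COMMAND_IO | PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER (0x0007),
decode_state() yields 3, which would be reported as D3hot.  That is
one plausible way to arrive at the "from D3hot to D0" message; other
command register values would decode to other states.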

Anyway, here's a proposal for commit log and updated comment for
pci_power_up():


commit 5694ba13b004 ("PCI/PM: Only read PCI_PM_CTRL register when available")
Author: Feiyang Chen
Date:   Thu Aug 24 09:37:38 2023 +0800

    PCI/PM: Only read PCI_PM_CTRL register when available

    For a device with no Power Management Capability, pci_power_up()
    previously returned 0 (success) if the platform was able to put the device
    in D0, which led to pci_set_full_power_state() trying to read PCI_PM_CTRL,
    even though it doesn't exist.

    Since dev->pm_cap == 0 in this case, pci_set_full_power_state() actually
    read the wrong register, interpreted it as PCI_PM_CTRL, and corrupted
    dev->current_state.  This led to messages like this in some cases:

      pci 0000:01:00.0: Refused to change power state from D3hot to D0

    To prevent this, make pci_power_up() always return a negative failure code
    if the device lacks a Power Management Capability, even if non-PCI platform
    power management has been able to put the device in D0.  The failure will
    prevent pci_set_full_power_state() from trying to access PCI_PM_CTRL.

    Fixes: e200904b275c ("PCI/PM: Split pci_power_up()")
    Link: https://lore.kernel.org/r/20230824013738.1894965-1-chenfeiyang@loongson.cn
    Signed-off-by: Feiyang Chen
    Signed-off-by: Bjorn Helgaas
    Reviewed-by: "Rafael J. Wysocki"
    Cc: stable@vger.kernel.org	# v5.19+

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 60230da957e0..39728196e295 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1226,6 +1226,10 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout)
  *
  * On success, return 0 or 1, depending on whether or not it is necessary to
  * restore the device's BARs subsequently (1 is returned in that case).
+ *
+ * On failure, return a negative error code.  Always return failure if @dev
+ * lacks a Power Management Capability, even if the platform was able to
+ * put the device in D0 via non-PCI means.
  */
 int pci_power_up(struct pci_dev *dev)
 {
@@ -1242,9 +1246,6 @@ int pci_power_up(struct pci_dev *dev)
 		else
 			dev->current_state = state;
 
-		if (state == PCI_D0)
-			return 0;
-
 		return -EIO;
 	}
 
@@ -1302,8 +1303,12 @@ static int pci_set_full_power_state(struct pci_dev *dev)
 	int ret;
 
 	ret = pci_power_up(dev);
-	if (ret < 0)
+	if (ret < 0) {
+		if (dev->current_state == PCI_D0)
+			return 0;
+
 		return ret;
+	}
 
 	pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
 	dev->current_state = pmcsr & PCI_PM_CTRL_STATE_MASK;
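
The control flow after this patch can be modelled in user space
roughly as follows.  This is only a sketch under stated assumptions:
struct model_dev, model_power_up() and model_set_full_power_state()
are hypothetical stand-ins for the kernel structures and functions,
the platform query is reduced to a plain field, and the PM Capability
path is simplified to "the transition to D0 succeeds".

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical D-state values for the model (0 = D0, 3 = D3hot). */
enum model_state { MODEL_D0 = 0, MODEL_D3HOT = 3 };

/* Hypothetical stand-in for the struct pci_dev fields the patch touches. */
struct model_dev {
	unsigned int pm_cap;	/* 0 means no PM Capability */
	int platform_state;	/* assumed result of the platform query */
	int current_state;
	bool pmcsr_read;	/* set if the model "reads" PCI_PM_CTRL */
};

/* Model of pci_power_up() after the patch. */
static int model_power_up(struct model_dev *dev)
{
	if (!dev->pm_cap) {
		/* Record what the platform reports, but always fail. */
		dev->current_state = dev->platform_state;
		return -EIO;
	}

	/* Simplification: assume the PMCSR write to D0 succeeds. */
	dev->current_state = MODEL_D0;
	return 0;
}

/* Model of pci_set_full_power_state() after the patch. */
static int model_set_full_power_state(struct model_dev *dev)
{
	int ret = model_power_up(dev);

	if (ret < 0) {
		/* No PM Capability, but the platform got us to D0. */
		if (dev->current_state == MODEL_D0)
			return 0;

		return ret;
	}

	/* Only reached when a PM Capability exists. */
	dev->pmcsr_read = true;
	return 0;
}
```

In this model a device with pm_cap == 0 never reaches the PMCSR read,
whether or not the platform reports D0, which is the property the
failure return is relied on to provide.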