From: "Yan, Zheng" <zheng.z.yan@intel.com>
To: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: bhelgaas@google.com, linux-kernel@vger.kernel.org,
linux-pci@vger.kernel.org, linux-pm@vger.kernel.org,
Lin Ming <ming.m.lin@intel.com>, Zhang Rui <rui.zhang@intel.com>,
huang ying <huang.ying.caritas@gmail.com>,
ACPI Devel Mailing List <linux-acpi@vger.kernel.org>
Subject: Re: [RFC PATCH] PCIe: Add PCIe runtime D3cold support
Date: Mon, 16 Apr 2012 10:23:10 +0800 [thread overview]
Message-ID: <4F8B828E.9060606@intel.com> (raw)
In-Reply-To: <201204132141.58063.rjw@sisk.pl>
On 04/14/2012 03:41 AM, Rafael J. Wysocki wrote:
> Hi,
>
> On Friday, April 13, 2012, Yan, Zheng wrote:
>> Hi all,
>>
>> This patch adds PCIe runtime D3cold support, namely cut power supply for functions
>> beneath a PCIe port when they all have entered D3. A device in D3cold can only
>> generate wake event through the WAKE# pin. Because we can not access to a device's
>> configure space while it's in D3cold, pme_poll is disabled for devices in D3cold.
>>
>> Any comment will be appreciated.
>>
>> Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
>> ---
>> diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
>> index 0f150f2..e210e8cb 100644
>> --- a/drivers/pci/pci-acpi.c
>> +++ b/drivers/pci/pci-acpi.c
>> @@ -224,7 +224,7 @@ static int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
>> [PCI_D1] = ACPI_STATE_D1,
>> [PCI_D2] = ACPI_STATE_D2,
>> [PCI_D3hot] = ACPI_STATE_D3,
>> - [PCI_D3cold] = ACPI_STATE_D3
>> + [PCI_D3cold] = ACPI_STATE_D3_COLD
>> };
>> int error = -EINVAL;
>>
>
> Please don't use that ACPI_STATE_D3_COLD thing, it's not defined correctly.
>
> We should define ACPI_STATE_D3_COLD == ACPI_STATE_D3 and add ACPI_STATE_D3_HOT
> instead. I'll prepare a patch for that over the weekend if no one has done
> that already.
>
>> @@ -296,7 +296,8 @@ static void acpi_pci_propagate_run_wake(struct pci_bus *bus, bool enable)
>>
>> static int acpi_pci_run_wake(struct pci_dev *dev, bool enable)
>> {
>> - if (dev->pme_interrupt)
>> + /* PME interrupt isn't available in the D3cold case */
>> + if (dev->pme_interrupt && !dev->runtime_d3cold)
>
> This whole thing is wrong. First off, I don't think that the runtime_d3cold
> flag makes any sense. We already cover that in dev->pme_support.
>
> Second, pme_interrupt means that the _root_ _port_, not the device itself will
> trigger an interrupt whenever the device sends the PME message to it (which
> very well may happen for a device in D3_cold woken up by an external signal).
>
The reason I introduced the runtime_d3cold flag is I can't make the PME interrupt
work in the D3cold case in our test platform. Document for our test platform says
A device in D3cold can only generate wake event through the WAKE# pin.
>> return 0;
>>
>> if (!acpi_pm_device_run_wake(&dev->dev, enable))
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index 8156744..bc16869 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -731,8 +731,8 @@ int pci_set_power_state(struct pci_dev *dev, pci_power_t state)
>> int error;
>>
>
> Guys, please. Never, _ever_, touch pci_set_power_state() without discussing
> your ideas with someone who knows how it works and _why_ it works this way.
>
> The problem here is that you can't program a PCI device into D3_cold, so it
> doesn't even make sense to have a helper for that.
>
>> /* bound the state we're entering */
>> - if (state > PCI_D3hot)
>> - state = PCI_D3hot;
>> + if (state > PCI_D3cold)
>> + state = PCI_D3cold;
>> else if (state < PCI_D0)
>> state = PCI_D0;
>> else if ((state == PCI_D1 || state == PCI_D2) && pci_no_d1d2(dev))
>> @@ -750,7 +750,8 @@ int pci_set_power_state(struct pci_dev *dev, pci_power_t state)
>> if (state == PCI_D3hot && (dev->dev_flags & PCI_DEV_FLAGS_NO_D3))
>> return 0;
>>
>> - error = pci_raw_set_power_state(dev, state);
>> + error = pci_raw_set_power_state(dev, state > PCI_D3hot ?
>> + PCI_D3hot : state);
>>
>> if (!__pci_complete_power_transition(dev, state))
>> error = 0;
>> @@ -1482,6 +1483,17 @@ bool pci_pme_capable(struct pci_dev *dev, pci_power_t state)
>> return !!(dev->pme_support & (1 << state));
>> }
>>
>> +static void pci_pme_poll_wakeup(struct pci_dev *dev)
>> +{
>> + struct pci_dev *bridge = dev->bus->self;
>> +
>> + /* don't poll the pme bit if parent is in low power state */
>> + if (bridge && bridge->current_state != PCI_D0)
>> + return;
>> +
>> + pci_pme_wakeup(dev, NULL);
>> +}
>
> This one actually makes some sense, although it might be better to put the
> test into pci_pme_wakeup() itself.
put it into pci_pme_wakeup will break ACPI wakeup, because ACPI wakeup also
uses pci_pme_wakeup. When a device exits from D3cold by asserting the WAKE#
signal, device power is restored automatically by ACPI, then pci_pme_wakeup
is called to check device's PME bit. pci_dev->current_state is PCI_D3hot
during the checking.
>
>> +
>> static void pci_pme_list_scan(struct work_struct *work)
>> {
>> struct pci_pme_device *pme_dev, *n;
>> @@ -1490,7 +1502,7 @@ static void pci_pme_list_scan(struct work_struct *work)
>> if (!list_empty(&pci_pme_list)) {
>> list_for_each_entry_safe(pme_dev, n, &pci_pme_list, list) {
>> if (pme_dev->dev->pme_poll) {
>> - pci_pme_wakeup(pme_dev->dev, NULL);
>> + pci_pme_poll_wakeup(pme_dev->dev);
>> } else {
>> list_del(&pme_dev->list);
>> kfree(pme_dev);
>> @@ -1608,6 +1620,10 @@ int __pci_enable_wake(struct pci_dev *dev, pci_power_t state,
>> if (enable) {
>> int error;
>>
>> + if (runtime && state >= PCI_D3cold)
>> + dev->runtime_d3cold = true;
>> + else
>> + dev->runtime_d3cold = false;
>> if (pci_pme_capable(dev, state))
>> pci_pme_active(dev, true);
>> else
>> diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
>> index e0610bd..d66b7e9 100644
>> --- a/drivers/pci/pcie/portdrv_pci.c
>> +++ b/drivers/pci/pcie/portdrv_pci.c
>> @@ -11,11 +11,13 @@
>> #include <linux/kernel.h>
>> #include <linux/errno.h>
>> #include <linux/pm.h>
>> +#include <linux/pm_runtime.h>
>> #include <linux/init.h>
>> #include <linux/pcieport_if.h>
>> #include <linux/aer.h>
>> #include <linux/dmi.h>
>> #include <linux/pci-aspm.h>
>> +#include <linux/delay.h>
>>
>> #include "portdrv.h"
>> #include "aer/aerdrv.h"
>> @@ -99,6 +101,25 @@ static int pcie_port_resume_noirq(struct device *dev)
>> return 0;
>> }
>>
>> +static int pcie_port_runtime_suspend(struct device *dev)
>> +{
>> + struct pci_dev *pdev = to_pci_dev(dev);
>> +
>> + pci_save_state(pdev);
>
> Are you sure this is sufficient?
What else should I do?
>
>> + return 0;
>> +}
>> +
>> +static int pcie_port_runtime_resume(struct device *dev)
>> +{
>> + struct pci_dev *pdev = to_pci_dev(dev);
>> +
>> + pci_restore_state(pdev);
>> + if (pdev->runtime_d3cold)
>> + msleep(100);
>
> What's _that_ supposed to do?
Bad thing happens if device is accessed immediately after restoring power.
Document for our test platform states 100ms PERST delay is required.
>
>> +
>> + return 0;
>> +}
>> +
>> static const struct dev_pm_ops pcie_portdrv_pm_ops = {
>> .suspend = pcie_port_device_suspend,
>> .resume = pcie_port_device_resume,
>> @@ -107,6 +128,8 @@ static const struct dev_pm_ops pcie_portdrv_pm_ops = {
>> .poweroff = pcie_port_device_suspend,
>> .restore = pcie_port_device_resume,
>> .resume_noirq = pcie_port_resume_noirq,
>> + .runtime_suspend = pcie_port_runtime_suspend,
>> + .runtime_resume = pcie_port_runtime_resume,
>> };
>>
>> #define PCIE_PORTDRV_PM_OPS (&pcie_portdrv_pm_ops)
>> @@ -144,12 +167,14 @@ static int __devinit pcie_portdrv_probe(struct pci_dev *dev,
>> return status;
>>
>> pci_save_state(dev);
>> + pm_runtime_put_noidle(&dev->dev);
>
> What's the purpose of this?
>
>> return 0;
>> }
>>
>> static void pcie_portdrv_remove(struct pci_dev *dev)
>> {
>> + pm_runtime_get_noresume(&dev->dev);
>> pcie_port_device_remove(dev);
>> pci_disable_device(dev);
>> }
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index e444f5b..b41d9a1 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -281,6 +281,7 @@ struct pci_dev {
>> unsigned int no_d1d2:1; /* Only allow D0 and D3 */
>> unsigned int mmio_always_on:1; /* disallow turning off io/mem
>> decoding during bar sizing */
>> + unsigned int runtime_d3cold:1;
>> unsigned int wakeup_prepared:1;
>> unsigned int d3_delay; /* D3->D0 transition time in ms */
>
> OK
>
> So now please tell me what exactly you want to achieve and why you want to do
> that in the first place.
>
> Thanks,
> Rafael
next prev parent reply other threads:[~2012-04-16 2:23 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-04-13 2:35 [RFC PATCH] PCIe: Add PCIe runtime D3cold support Yan, Zheng
2012-04-13 6:06 ` Alex He
2012-04-13 6:28 ` Yan, Zheng
2012-04-16 8:15 ` huang ying
2012-04-13 19:41 ` Rafael J. Wysocki
2012-04-16 0:48 ` Lin Ming
2012-04-16 16:26 ` Rafael J. Wysocki
2012-04-16 2:23 ` Yan, Zheng [this message]
2012-04-16 17:07 ` Rafael J. Wysocki
2012-04-17 2:07 ` huang ying
2012-04-17 2:07 ` huang ying
2012-04-17 20:20 ` Rafael J. Wysocki
2012-04-18 1:19 ` huang ying
2012-04-18 19:51 ` Rafael J. Wysocki
2012-04-17 2:12 ` Yan, Zheng
2012-04-17 5:32 ` huang ying
2012-04-17 20:43 ` Rafael J. Wysocki
2012-04-18 1:22 ` huang ying
2012-04-18 19:52 ` Rafael J. Wysocki
2012-04-17 20:35 ` Rafael J. Wysocki
2012-04-16 7:49 ` Yan, Zheng
2012-04-16 21:11 ` Rafael J. Wysocki
2012-04-16 8:58 ` huang ying
2012-04-16 8:58 ` huang ying
2012-04-16 21:30 ` Rafael J. Wysocki
2012-04-17 2:02 ` huang ying
2012-04-17 2:02 ` huang ying
2012-04-17 21:03 ` Rafael J. Wysocki
2012-04-18 1:45 ` huang ying
2012-04-18 21:00 ` Rafael J. Wysocki
2012-04-19 2:47 ` huang ying
2012-04-19 2:47 ` huang ying
2012-04-19 12:31 ` Rafael J. Wysocki
2012-04-20 0:48 ` huang ying
2012-04-20 0:48 ` huang ying
2012-04-17 5:13 ` huang ying
2012-04-17 5:13 ` huang ying
2012-04-17 21:10 ` Rafael J. Wysocki
2012-04-18 2:01 ` huang ying
2012-04-18 20:51 ` Rafael J. Wysocki
2012-04-19 2:08 ` huang ying
2012-04-19 2:08 ` huang ying
2012-04-19 12:36 ` Rafael J. Wysocki
2012-04-20 0:53 ` huang ying
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F8B828E.9060606@intel.com \
--to=zheng.z.yan@intel.com \
--cc=bhelgaas@google.com \
--cc=huang.ying.caritas@gmail.com \
--cc=linux-acpi@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=linux-pm@vger.kernel.org \
--cc=ming.m.lin@intel.com \
--cc=rjw@sisk.pl \
--cc=rui.zhang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.