linux-pci.vger.kernel.org archive mirror
From: Alex Williamson <alex.williamson@redhat.com>
To: Abhishek Sahu <abhsahu@nvidia.com>
Cc: Cornelia Huck <cohuck@redhat.com>,
	Yishai Hadas <yishaih@nvidia.com>,
	Jason Gunthorpe <jgg@nvidia.com>,
	Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>,
	Kevin Tian <kevin.tian@intel.com>,
	"Rafael J . Wysocki" <rafael@kernel.org>,
	Max Gurtovoy <mgurtovoy@nvidia.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	<linux-kernel@vger.kernel.org>, <kvm@vger.kernel.org>,
	<linux-pm@vger.kernel.org>, <linux-pci@vger.kernel.org>
Subject: Re: [PATCH v5 0/4] vfio/pci: power management changes
Date: Wed, 18 May 2022 11:51:29 -0600
Message-ID: <20220518115129.72beddcd.alex.williamson@redhat.com>
In-Reply-To: <20220518111612.16985-1-abhsahu@nvidia.com>

On Wed, 18 May 2022 16:46:08 +0530
Abhishek Sahu <abhsahu@nvidia.com> wrote:

> Currently, there is very limited power management support available
> in the upstream vfio-pci driver. If a vfio-pci device has no user,
> it is moved into the D3hot state. Similarly, if runtime power
> management is enabled for a vfio-pci device in the guest OS, the
> device is runtime suspended (for a Linux guest OS) and the PCI
> device is put into the D3hot state (in function
> vfio_pm_config_write()). Using the D3cold state instead of D3hot
> would save the most power. The D3cold state cannot be reached with
> native PCI PM alone; it requires interaction with platform firmware,
> which is system-specific. To enter low power states (including
> D3cold), the runtime PM framework can be used; it interacts
> internally with PCI and platform firmware and puts the device into
> the lowest possible D-state.
> 
> This patch series registers the vfio-pci driver with the runtime
> PM framework and uses it to move the physical PCI device into a
> low power state when the device is unused and idle. A separate
> patch series will add support for using the runtime PM framework
> for devices that are in use but idle.
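
As a rough illustration of the "unused idle device" behaviour described above, here is a userspace toy model of usage counting. It is not code from this series and does not call any kernel API; the names toy_dev, toy_get() and toy_put() are hypothetical. It assumes the device resumes to D0 when a user appears and drops to the deepest permitted state when the last user goes away. On real hardware, whether D3cold is actually reached also depends on the platform firmware.

```c
#include <stdbool.h>

/* Toy model only: these names are hypothetical, not kernel APIs. */
enum toy_dstate { TOY_D0, TOY_D3HOT, TOY_D3COLD };

struct toy_dev {
	int usage_count;        /* number of active users */
	bool d3cold_allowed;    /* stands in for the per-device policy knob */
	enum toy_dstate state;  /* current modelled power state */
};

/* A user appears: take a reference and resume the device to D0. */
static void toy_get(struct toy_dev *dev)
{
	dev->usage_count++;
	dev->state = TOY_D0;
}

/* A user goes away: on the last put, pick the deepest permitted state. */
static void toy_put(struct toy_dev *dev)
{
	if (dev->usage_count > 0 && --dev->usage_count == 0)
		dev->state = dev->d3cold_allowed ? TOY_D3COLD : TOY_D3HOT;
}
```

In this sketch a device with two users stays in D0 after the first toy_put() and only leaves D0 after the second.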
> 
> The current PM support was added with commit 6eb7018705de ("vfio-pci:
> Move idle devices to D3hot power state"), which made the following
> point regarding the D3cold state.
> 
>  "It's tempting to try to use D3cold, but we have no reason to inhibit
>   hotplug of idle devices and we might get into a loop of having the
>   device disappear before we have a chance to try to use it."
> 
> With runtime PM, if the user wants to prevent a device from going
> into D3cold, /sys/bus/pci/devices/.../d3cold_allowed can be set to 0
> for the devices where the above behavior is required, instead of
> disallowing the D3cold state in all cases.
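
For illustration, a minimal userspace sketch of flipping that attribute from C. The helper name set_d3cold_allowed() is hypothetical, and the caller has to supply the full path to the device's d3cold_allowed file; the device directory is left elided above and is not filled in here. Writing the attribute normally requires root.

```c
#include <stdio.h>

/*
 * Write 0 or 1 to a d3cold_allowed attribute file.
 * attr_path is supplied by the caller (the d3cold_allowed file in the
 * device's sysfs directory). Returns 0 on success, -1 on error.
 */
static int set_d3cold_allowed(const char *attr_path, int allowed)
{
	FILE *f = fopen(attr_path, "w");
	int ok;

	if (!f)
		return -1;
	/* Any non-zero value of 'allowed' is written as 1. */
	ok = fprintf(f, "%d\n", allowed ? 1 : 0) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Calling set_d3cold_allowed(path, 0) corresponds to the "set to 0" case in the quoted paragraph; passing 1 allows D3cold again.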
> 
> BAR access needs to be disabled if the device is in the D3hot state.
> Also, there should not be any config access if the device is in the
> D3cold state. For SR-IOV, the PF power state should be higher than
> the VF's power state.
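
The three constraints in that paragraph can be written down as simple predicates. This is a toy model for discussion, not the checks used in the patches; the model_* names are hypothetical. It assumes that BAR access is blocked from D3hot downwards (so in D3cold as well) and reads "higher" as "the PF is in at least as high a power state as the VF".

```c
#include <stdbool.h>

/* Illustrative D-state numbering: a larger value means lower power. */
enum model_dstate {
	MODEL_D0, MODEL_D1, MODEL_D2, MODEL_D3HOT, MODEL_D3COLD
};

/* BAR (memory/IO region) access is blocked from D3hot downwards. */
static bool model_bar_access_ok(enum model_dstate s)
{
	return s < MODEL_D3HOT;
}

/* Config space is off limits only once the device is in D3cold. */
static bool model_config_access_ok(enum model_dstate s)
{
	return s < MODEL_D3COLD;
}

/* Assumption: the PF must be in at least as high a power state as a VF. */
static bool model_pf_vf_ok(enum model_dstate pf, enum model_dstate vf)
{
	return pf <= vf;
}
```

Under this model a device in D3hot still permits config access but not BAR access, and a PF in D3hot with a VF in D0 is rejected.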
> 
> * Changes in v5
> 
> - Rebased over https://github.com/awilliam/linux-vfio/tree/next.
> - Renamed vfio_pci_lock_and_set_power_state() to
>   vfio_lock_and_set_power_state() and made it static.
> - Inside vfio_pci_core_sriov_configure(), protected setting of
>   power state and sriov enablement with 'memory_lock'.
> - Removed CONFIG_PM macro use since it is not needed with current
>   code.

Applied to vfio next branch for v5.19.  Thanks!

Alex

> * Changes in v4
>   (https://lore.kernel.org/lkml/20220517100219.15146-1-abhsahu@nvidia.com)
> 
> - Rebased over https://github.com/awilliam/linux-vfio/tree/next.
> - Split the patch series into 2 parts. This part contains the patches
>   for using runtime PM for unused idle device.
> - Used 'pdev->current_state' to check whether the device is in D3
>   state.
> - Added the check in the __vfio_pci_memory_enabled() function itself
>   instead of adding a power state check at each caller.
> - Made vfio_pci_lock_and_set_power_state() global since it is needed
>   in different files.
> - Used vfio_pci_lock_and_set_power_state() instead of
>   vfio_pci_set_power_state() before pci_enable_sriov().
> - Inside vfio_pci_core_sriov_configure(), handled both the cases
>   (the device is in low power state with and without user).
> - Used list_for_each_entry_continue_reverse() in
>   vfio_pci_dev_set_pm_runtime_get().
> 
> * Changes in v3
>   (https://lore.kernel.org/lkml/20220425092615.10133-1-abhsahu@nvidia.com)
> 
> - Rebased patches on v5.18-rc3.
> - Marked this series as PATCH instead of RFC.
> - Addressed the review comments given in v2.
> - Removed the limitation to keep the device in D0 state if there is
>   any access from the host side. This is specific to the NVIDIA use
>   case and will be handled separately.
> - Used the existing DEVICE_FEATURE IOCTL itself instead of adding a
>   new IOCTL for power management.
> - Removed all custom code related to power management in the runtime
>   suspend/resume callbacks and IOCTL handling. Now, the callbacks
>   contain code related to INTx handling and a few other things, and
>   all the PCI state and platform PM handling is done by the PCI core
>   functions themselves.
> - Added support for wake-up in the main vfio layer itself since we now
>   have more vfio/pci based drivers.
> - Instead of assigning the 'struct dev_pm_ops' in each individual
>   parent driver, vfio_pci_core itself now assigns the
>   'struct dev_pm_ops'.
> - Added handling of power management around SR-IOV handling.
> - Moved the setting of drvdata in a separate patch.
> - Masked INTx during the runtime suspended state.
> - Changed the order of patches so that fix-related patches are at the
>   beginning of this patch series.
> - Removed storing the power state locally and used one new boolean to
>   track the D3 (D3cold and D3hot) power state.
> - Removed check for IO access in D3 power state.
> - Used another helper function vfio_lock_and_set_power_state() instead
>   of touching vfio_pci_set_power_state().
> - Considered the fixes made in
>   https://lore.kernel.org/lkml/20220217122107.22434-1-abhsahu@nvidia.com
>   and updated the patches accordingly.
> 
> * Changes in v2
>   (https://lore.kernel.org/lkml/20220124181726.19174-1-abhsahu@nvidia.com)
> 
> - Rebased patches on v5.17-rc1.
> - Included the patch to handle BAR access in D3cold.
> - Included the patch to fix memory leak.
> - Made a separate IOCTL that can be used to change the power state from
>   D3hot to D3cold and D3cold to D0.
> - Addressed the review comments given in v1.
> 
> * v1
>   https://lore.kernel.org/lkml/20211115133640.2231-1-abhsahu@nvidia.com/
> 
> Abhishek Sahu (4):
>   vfio/pci: Invalidate mmaps and block the access in D3hot power state
>   vfio/pci: Change the PF power state to D0 before enabling VFs
>   vfio/pci: Virtualize PME related registers bits and initialize to zero
>   vfio/pci: Move the unused device into low power state with runtime PM
> 
>  drivers/vfio/pci/vfio_pci_config.c |  56 ++++++++-
>  drivers/vfio/pci/vfio_pci_core.c   | 178 ++++++++++++++++++++---------
>  2 files changed, 178 insertions(+), 56 deletions(-)
> 

