From: Alex Williamson <alex.williamson@redhat.com>
To: Abhishek Sahu <abhsahu@nvidia.com>
Cc: Cornelia Huck <cohuck@redhat.com>,
Yishai Hadas <yishaih@nvidia.com>,
Jason Gunthorpe <jgg@nvidia.com>,
Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>,
Kevin Tian <kevin.tian@intel.com>,
"Rafael J . Wysocki" <rafael@kernel.org>,
Max Gurtovoy <mgurtovoy@nvidia.com>,
Bjorn Helgaas <bhelgaas@google.com>,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
linux-pm@vger.kernel.org, linux-pci@vger.kernel.org
Subject: Re: [PATCH v4 1/6] vfio/pci: Mask INTx during runtime suspend
Date: Fri, 8 Jul 2022 09:45:08 -0600 [thread overview]
Message-ID: <20220708094508.49ba647f.alex.williamson@redhat.com> (raw)
In-Reply-To: <3b143762-d6ce-ac70-59ae-a0c2e66ffc1b@nvidia.com>
On Fri, 8 Jul 2022 14:51:30 +0530
Abhishek Sahu <abhsahu@nvidia.com> wrote:
> On 7/6/2022 9:09 PM, Alex Williamson wrote:
> > On Fri, 1 Jul 2022 16:38:09 +0530
> > Abhishek Sahu <abhsahu@nvidia.com> wrote:
> >
> >> This patch adds INTx handling during runtime suspend/resume.
> >> All the suspend/resume related code for the user to put the device
> >> into the low power state will be added in subsequent patches.
> >>
> >> The INTx are shared among devices. Whenever any INTx interrupt comes
> >
> > "The INTx lines may be shared..."
> >
> >> for the VFIO devices, then vfio_intx_handler() will be called for each
> >> device. Inside vfio_intx_handler(), it calls pci_check_and_mask_intx()
> >
> > "...device sharing the interrupt."
> >
> >> and checks if the interrupt has been generated for the current device.
> >> Now, if the device is already in the D3cold state, then the config space
> >> can not be read. Attempt to read config space in D3cold state can
> >> cause system unresponsiveness in a few systems. To prevent this, mask
> >> INTx in runtime suspend callback and unmask the same in runtime resume
> >> callback. If INTx has been already masked, then no handling is needed
> >> in runtime suspend/resume callbacks. 'pm_intx_masked' tracks this, and
> >> vfio_pci_intx_mask() has been updated to return true if INTx has been
> >> masked inside this function.
> >>
> >> For the runtime suspend which is triggered for the no user of VFIO
> >> device, the is_intx() will return false and these callbacks won't do
> >> anything.
> >>
> >> The MSI/MSI-X are not shared so similar handling should not be
> >> needed for MSI/MSI-X. vfio_msihandler() triggers eventfd_signal()
> >> without doing any device-specific config access. When the user performs
> >> any config access or IOCTL after receiving the eventfd notification,
> >> then the device will be moved to the D0 state first before
> >> servicing any request.
> >>
> >> Signed-off-by: Abhishek Sahu <abhsahu@nvidia.com>
> >> ---
> >> drivers/vfio/pci/vfio_pci_core.c | 37 +++++++++++++++++++++++++++----
> >> drivers/vfio/pci/vfio_pci_intrs.c | 6 ++++-
> >> include/linux/vfio_pci_core.h | 3 ++-
> >> 3 files changed, 40 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> >> index a0d69ddaf90d..5948d930449b 100644
> >> --- a/drivers/vfio/pci/vfio_pci_core.c
> >> +++ b/drivers/vfio/pci/vfio_pci_core.c
> >> @@ -259,16 +259,45 @@ int vfio_pci_set_power_state(struct vfio_pci_core_device *vdev, pci_power_t stat
> >> return ret;
> >> }
> >>
> >> +#ifdef CONFIG_PM
> >> +static int vfio_pci_core_runtime_suspend(struct device *dev)
> >> +{
> >> + struct vfio_pci_core_device *vdev = dev_get_drvdata(dev);
> >> +
> >> + /*
> >> + * If INTx is enabled, then mask INTx before going into the runtime
> >> + * suspended state and unmask the same in the runtime resume.
> >> + * If INTx has already been masked by the user, then
> >> + * vfio_pci_intx_mask() will return false and in that case, INTx
> >> + * should not be unmasked in the runtime resume.
> >> + */
> >> + vdev->pm_intx_masked = (is_intx(vdev) && vfio_pci_intx_mask(vdev));
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +static int vfio_pci_core_runtime_resume(struct device *dev)
> >> +{
> >> + struct vfio_pci_core_device *vdev = dev_get_drvdata(dev);
> >> +
> >> + if (vdev->pm_intx_masked)
> >> + vfio_pci_intx_unmask(vdev);
> >> +
> >> + return 0;
> >> +}
> >> +#endif /* CONFIG_PM */
> >> +
> >> /*
> >> - * The dev_pm_ops needs to be provided to make pci-driver runtime PM working,
> >> - * so use structure without any callbacks.
> >> - *
> >> * The pci-driver core runtime PM routines always save the device state
> >> * before going into suspended state. If the device is going into low power
> >> * state with only with runtime PM ops, then no explicit handling is needed
> >> * for the devices which have NoSoftRst-.
> >> */
> >> -static const struct dev_pm_ops vfio_pci_core_pm_ops = { };
> >> +static const struct dev_pm_ops vfio_pci_core_pm_ops = {
> >> + SET_RUNTIME_PM_OPS(vfio_pci_core_runtime_suspend,
> >> + vfio_pci_core_runtime_resume,
> >> + NULL)
> >> +};
> >>
> >> int vfio_pci_core_enable(struct vfio_pci_core_device *vdev)
> >> {
> >> diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
> >> index 6069a11fb51a..1a37db99df48 100644
> >> --- a/drivers/vfio/pci/vfio_pci_intrs.c
> >> +++ b/drivers/vfio/pci/vfio_pci_intrs.c
> >> @@ -33,10 +33,12 @@ static void vfio_send_intx_eventfd(void *opaque, void *unused)
> >> eventfd_signal(vdev->ctx[0].trigger, 1);
> >> }
> >>
> >> -void vfio_pci_intx_mask(struct vfio_pci_core_device *vdev)
> >> +/* Returns true if INTx has been masked by this function. */
> >> +bool vfio_pci_intx_mask(struct vfio_pci_core_device *vdev)
> >> {
> >> struct pci_dev *pdev = vdev->pdev;
> >> unsigned long flags;
> >> + bool intx_masked = false;
> >>
> >> spin_lock_irqsave(&vdev->irqlock, flags);
> >>
> >> @@ -60,9 +62,11 @@ void vfio_pci_intx_mask(struct vfio_pci_core_device *vdev)
> >> disable_irq_nosync(pdev->irq);
> >>
> >> vdev->ctx[0].masked = true;
> >> + intx_masked = true;
> >> }
> >>
> >> spin_unlock_irqrestore(&vdev->irqlock, flags);
> >> + return intx_masked;
> >> }
> >
> >
> > There's certainly another path through this function that masks the
> > interrupt, which makes the definition of this return value a bit
> > confusing.
>
> For our case we should not hit that path. But we can return the
> intx_masked true from that path as well to align return value.
>
> > Wouldn't it be simpler not to overload the masked flag on
> > the interrupt context like this and instead set a new flag on the vdev
> > under irqlock to indicate the device is unable to generate interrupts.
> > The irq handler would add a test of this flag before any tests that
> > would access the device. Thanks,
> >
> > Alex
> >
>
> We will set this flag inside runtime_suspend callback but the
> device can be in non-D3cold state (For example, if user has
> disabled d3cold explicitly by sysfs, the D3cold is not supported in
> the platform, etc.). Also, in D3cold supported case, the device will
> be in D0 till the PCI core moves the device into D3cold. In this case,
> there is possibility that the device can generate an interrupt.
> If we add check in the IRQ handler, then we won’t check and clear
> the IRQ status, but the interrupt line will still be asserted
> which can cause interrupt flooding.
>
> This was the reason for disabling interrupt itself instead of
> checking flag in the IRQ handler.
Ok, maybe this is largely a clarification of the return value of
vfio_pci_intx_mask(). I think what you're looking for is whether the
context mask was changed, rather than whether the interrupt is masked,
which I think avoids the confusion regarding whether the first branch
should return true or false. So the variable should be something like
"masked_changed" and the comment changed to "Returns true if the INTx
vfio_pci_irq_ctx.masked value is changed".
Testing is_intx() outside of the irqlock is potentially racy, so do we
need to add the pm-get/put wrappers on ioctls first to avoid the
possibility that pm-suspend can race a SET_IRQS ioctl? Thanks,
Alex
next prev parent reply other threads:[~2022-07-08 15:45 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-07-01 11:08 [PATCH v4 0/6] vfio/pci: power management changes Abhishek Sahu
2022-07-01 11:08 ` [PATCH v4 1/6] vfio/pci: Mask INTx during runtime suspend Abhishek Sahu
2022-07-06 15:39 ` Alex Williamson
2022-07-08 9:21 ` Abhishek Sahu
2022-07-08 15:45 ` Alex Williamson [this message]
2022-07-11 9:18 ` Abhishek Sahu
2022-07-11 12:57 ` Alex Williamson
2022-07-01 11:08 ` [PATCH v4 2/6] vfio: Add a new device feature for the power management Abhishek Sahu
2022-07-06 15:39 ` Alex Williamson
2022-07-08 9:39 ` Abhishek Sahu
2022-07-08 16:36 ` Alex Williamson
2022-07-11 9:43 ` Abhishek Sahu
2022-07-11 13:04 ` Alex Williamson
2022-07-11 17:30 ` Abhishek Sahu
2022-07-01 11:08 ` [PATCH v4 3/6] vfio: Increment the runtime PM usage count during IOCTL call Abhishek Sahu
2022-07-06 15:40 ` Alex Williamson
2022-07-08 9:43 ` Abhishek Sahu
2022-07-08 16:49 ` Alex Williamson
2022-07-11 9:50 ` Abhishek Sahu
2022-07-01 11:08 ` [PATCH v4 4/6] vfio/pci: Add the support for PCI D3cold state Abhishek Sahu
2022-07-06 15:40 ` Alex Williamson
2022-07-01 11:08 ` [PATCH v4 5/6] vfio/pci: Prevent low power re-entry without guest driver Abhishek Sahu
2022-07-06 15:40 ` Alex Williamson
2022-07-01 11:08 ` [PATCH v4 6/6] vfio/pci: Add support for virtual PME Abhishek Sahu
2022-07-06 15:40 ` Alex Williamson
2022-07-08 9:45 ` Abhishek Sahu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220708094508.49ba647f.alex.williamson@redhat.com \
--to=alex.williamson@redhat.com \
--cc=abhsahu@nvidia.com \
--cc=bhelgaas@google.com \
--cc=cohuck@redhat.com \
--cc=jgg@nvidia.com \
--cc=kevin.tian@intel.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=linux-pm@vger.kernel.org \
--cc=mgurtovoy@nvidia.com \
--cc=rafael@kernel.org \
--cc=shameerali.kolothum.thodi@huawei.com \
--cc=yishaih@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).