* [virtio-dev] [PATCH v4 0/1] Define a low power mode for devices @ 2024-02-16 8:24 David Stevens 2024-02-16 8:24 ` [virtio-dev] [PATCH v4 1/1] Add suspend support for virtio PCI devices David Stevens 0 siblings, 1 reply; 9+ messages in thread From: David Stevens @ 2024-02-16 8:24 UTC (permalink / raw) To: Michael S . Tsirkin, Jason Wang, virtio-comment, virtio-dev Cc: parav, David Stevens The Linux patch [1] added support for suspending virtio devices using native PCI power management. However, Linux does PCI power management during the noirq phase of suspend and resume. Without a mechanism to suspend/resume virtio devices when the driver is suspended/resumed in the early phase of suspend/late phase of resume, there is a window where interrupts can be lost. [1] https://lore.kernel.org/lkml/20231208070754.3132339-1-stevensd@chromium.org/ v3 -> v4: - Define virtio-pci specific power management that can be used in conjunction with PCI power management. v2 -> v3: - Use different words for some concepts to avoid conflicts with other parts of the spec. - Rewrite various sentences to improve clarity. v1 -> v2: - Define virtio-pci support on top of PCI power management. - Add more conformance requirements. David Stevens (1): Add suspend support for virtio PCI devices transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 51 insertions(+) -- 2.44.0.rc0.258.g7320e95886-goog --------------------------------------------------------------------- To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org ^ permalink raw reply [flat|nested] 9+ messages in thread
* [virtio-dev] [PATCH v4 1/1] Add suspend support for virtio PCI devices 2024-02-16 8:24 [virtio-dev] [PATCH v4 0/1] Define a low power mode for devices David Stevens @ 2024-02-16 8:24 ` David Stevens 2024-02-16 8:56 ` [virtio-dev] " Michael S. Tsirkin 2024-02-26 6:42 ` Jason Wang 0 siblings, 2 replies; 9+ messages in thread From: David Stevens @ 2024-02-16 8:24 UTC (permalink / raw) To: Michael S . Tsirkin, Jason Wang, virtio-comment, virtio-dev Cc: parav, David Stevens Add a virtio power management PCI capability to allow drivers to suspend virtio PCI devices. This allows drivers to suspend devices at the virtio level before suspending them at the PCI transport layer. This allows drivers to do a two phase suspend, which prevents notifications from being ignored or lost if interrupts are reconfigured at the PCI transport layer immediately before or after the device is put into the PCI PM D3 low power state. --- transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 51 insertions(+) diff --git a/transport-pci.tex b/transport-pci.tex index a5c6719ea871..ce77708a9b69 100644 --- a/transport-pci.tex +++ b/transport-pci.tex @@ -188,6 +188,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8 /* Vendor-specific data */ #define VIRTIO_PCI_CAP_VENDOR_CFG 9 +/* Power management configuration */ +#define VIRTIO_PCI_CAP_PM_CFG 10 \end{lstlisting} Any other value is reserved for future use. @@ -804,6 +806,55 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O specified by some other Virtio Structure PCI Capability of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}. +\subsubsection{Power management capability}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability} + +The VIRTIO_PCI_CAP_PM_CFG capability refers to a single byte. The +driver can write to the byte to set the power state of the device, +and it can read from the byte to get the current power state of the +device. + +The valid power states are: + +\begin{lstlisting} +/* Device is operating normally */ +#define VIRTIO_PM_STATE_ACTIVE 0 +/* Device operation is suspended */ +#define VIRTIO_PM_STATE_SUSPENDED 1 +\end{lstlisting} + +The device power state has no effect when \field{device status} does +not have the DRIVER_OK bit set or does have the DEVICE_NEEDS_RESET bit +set. + +\devicenormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability} + +A device MUST maintain its state while suspended such that all driver +visible state after resuming exactly matches driver visible state +before suspending. + +A device MUST NOT send notifications while suspended. + +A device MAY operate on any buffers in its virtqueue while suspended. + +A device MUST set its power state to VIRTIO_PM_STATE_ACTIVE on reset. + +A device SHOULD take steps to minimize its resource consumption while +suspended, although what this involves is specific to the particular +device implementation. + +\drivernormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability} + +A driver MUST NOT access a suspended device's BARs corresponding to +any virtio structures, except for the power management byte. + +A driver MAY suspend a device that has buffers in one of its +virtqueues, but it MUST NOT modify any such buffers while the device +is suspended. + +A driver MUST read from the power management byte after writing to the +byte to verify that the device successfully entered the target power +state. + \subsubsection{Legacy Interfaces: A Note on PCI Device Layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Legacy Interfaces: A Note on PCI Device Layout} Transitional devices MUST present part of configuration -- 2.44.0.rc0.258.g7320e95886-goog --------------------------------------------------------------------- To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [virtio-dev] Re: [PATCH v4 1/1] Add suspend support for virtio PCI devices 2024-02-16 8:24 ` [virtio-dev] [PATCH v4 1/1] Add suspend support for virtio PCI devices David Stevens @ 2024-02-16 8:56 ` Michael S. Tsirkin 2024-02-18 10:59 ` Zhu, Lingshan 2024-02-19 6:46 ` David Stevens 2024-02-26 6:42 ` Jason Wang 1 sibling, 2 replies; 9+ messages in thread From: Michael S. Tsirkin @ 2024-02-16 8:56 UTC (permalink / raw) To: David Stevens; +Cc: Jason Wang, virtio-comment, virtio-dev, parav On Fri, Feb 16, 2024 at 05:24:32PM +0900, David Stevens wrote: > Add a virtio power management PCI capability to allow drivers to suspend > virtio PCI devices. This allows drivers to suspend devices at the virtio > level before suspending them at the PCI transport layer. This allows > drivers to do a two phase suspend, which prevents notifications from > being ignored or lost if interrupts are reconfigured at the PCI > transport layer immediately before or after the device is put into the > PCI PM D3 low power state. > --- > transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 51 insertions(+) > > diff --git a/transport-pci.tex b/transport-pci.tex > index a5c6719ea871..ce77708a9b69 100644 > --- a/transport-pci.tex > +++ b/transport-pci.tex > @@ -188,6 +188,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option > #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8 > /* Vendor-specific data */ > #define VIRTIO_PCI_CAP_VENDOR_CFG 9 > +/* Power management configuration */ > +#define VIRTIO_PCI_CAP_PM_CFG 10 > \end{lstlisting} > > Any other value is reserved for future use. > @@ -804,6 +806,55 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O > specified by some other Virtio Structure PCI Capability > of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}. > > +\subsubsection{Power management capability}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability} > + > +The VIRTIO_PCI_CAP_PM_CFG capability refers to a single byte. The > +driver can write to the byte to set the power state of the device, > +and it can read from the byte to get the current power state of the > +device. > + > +The valid power states are: > + > +\begin{lstlisting} > +/* Device is operating normally */ > +#define VIRTIO_PM_STATE_ACTIVE 0 > +/* Device operation is suspended */ > +#define VIRTIO_PM_STATE_SUSPENDED 1 > +\end{lstlisting} > + > +The device power state has no effect when \field{device status} does > +not have the DRIVER_OK bit set or does have the DEVICE_NEEDS_RESET bit > +set. Given this is only after DRIVER_OK, wouldn't a feature bit + status bit make more sense? This will make it transport-independent and simplify discovery. > +\devicenormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability} > + > +A device MUST maintain its state while suspended such that all driver > +visible state after resuming exactly matches driver visible state > +before suspending. > + > +A device MUST NOT send notifications while suspended. > + > +A device MAY operate on any buffers in its virtqueue while suspended. How is this reconsiled with state matching exactly? buffers are driver-visible ... > +A device MUST set its power state to VIRTIO_PM_STATE_ACTIVE on reset. > + > +A device SHOULD take steps to minimize its resource consumption while > +suspended, although what this involves is specific to the particular > +device implementation. > + > +\drivernormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability} > + > +A driver MUST NOT access a suspended device's BARs corresponding to > +any virtio structures, except for the power management byte. > + > +A driver MAY suspend a device that has buffers in one of its > +virtqueues, but it MUST NOT modify any such buffers while the device > +is suspended. > + > +A driver MUST read from the power management byte after writing to the > +byte to verify that the device successfully entered the target power > +state. Verify how? By checking the value returned? And what should it do with the value does not match? > + > \subsubsection{Legacy Interfaces: A Note on PCI Device Layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Legacy Interfaces: A Note on PCI Device Layout} > > Transitional devices MUST present part of configuration > -- > 2.44.0.rc0.258.g7320e95886-goog --------------------------------------------------------------------- To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [virtio-dev] Re: [PATCH v4 1/1] Add suspend support for virtio PCI devices 2024-02-16 8:56 ` [virtio-dev] " Michael S. Tsirkin @ 2024-02-18 10:59 ` Zhu, Lingshan 2024-02-19 6:46 ` David Stevens 1 sibling, 0 replies; 9+ messages in thread From: Zhu, Lingshan @ 2024-02-18 10:59 UTC (permalink / raw) To: Michael S. Tsirkin, David Stevens Cc: Jason Wang, virtio-comment, virtio-dev, parav On 2/16/2024 4:56 PM, Michael S. Tsirkin wrote: > On Fri, Feb 16, 2024 at 05:24:32PM +0900, David Stevens wrote: >> Add a virtio power management PCI capability to allow drivers to suspend >> virtio PCI devices. This allows drivers to suspend devices at the virtio >> level before suspending them at the PCI transport layer. This allows >> drivers to do a two phase suspend, which prevents notifications from >> being ignored or lost if interrupts are reconfigured at the PCI >> transport layer immediately before or after the device is put into the >> PCI PM D3 low power state. >> --- >> transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++ >> 1 file changed, 51 insertions(+) >> >> diff --git a/transport-pci.tex b/transport-pci.tex >> index a5c6719ea871..ce77708a9b69 100644 >> --- a/transport-pci.tex >> +++ b/transport-pci.tex >> @@ -188,6 +188,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option >> #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8 >> /* Vendor-specific data */ >> #define VIRTIO_PCI_CAP_VENDOR_CFG 9 >> +/* Power management configuration */ >> +#define VIRTIO_PCI_CAP_PM_CFG 10 >> \end{lstlisting} >> >> Any other value is reserved for future use. >> @@ -804,6 +806,55 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O >> specified by some other Virtio Structure PCI Capability >> of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}. >> >> +\subsubsection{Power management capability}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability} >> + >> +The VIRTIO_PCI_CAP_PM_CFG capability refers to a single byte. The >> +driver can write to the byte to set the power state of the device, >> +and it can read from the byte to get the current power state of the >> +device. >> + >> +The valid power states are: >> + >> +\begin{lstlisting} >> +/* Device is operating normally */ >> +#define VIRTIO_PM_STATE_ACTIVE 0 >> +/* Device operation is suspended */ >> +#define VIRTIO_PM_STATE_SUSPENDED 1 >> +\end{lstlisting} >> + >> +The device power state has no effect when \field{device status} does >> +not have the DRIVER_OK bit set or does have the DEVICE_NEEDS_RESET bit >> +set. > > Given this is only after DRIVER_OK, wouldn't a feature bit + status bit > make more sense? This will make it transport-independent and simplify > discovery. I agree with Michael, I have ever proposed a solution suspending the device by setting _SUSPEND in the device_status field. I will send this suspending patch again(will cc you) in a few days. Thanks Zhu Lingshan > >> +\devicenormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability} >> + >> +A device MUST maintain its state while suspended such that all driver >> +visible state after resuming exactly matches driver visible state >> +before suspending. >> + >> +A device MUST NOT send notifications while suspended. >> + >> +A device MAY operate on any buffers in its virtqueue while suspended. > How is this reconsiled with state matching exactly? buffers are > driver-visible ... > >> +A device MUST set its power state to VIRTIO_PM_STATE_ACTIVE on reset. >> + >> +A device SHOULD take steps to minimize its resource consumption while >> +suspended, although what this involves is specific to the particular >> +device implementation. >> + >> +\drivernormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability} >> + >> +A driver MUST NOT access a suspended device's BARs corresponding to >> +any virtio structures, except for the power management byte. >> + >> +A driver MAY suspend a device that has buffers in one of its >> +virtqueues, but it MUST NOT modify any such buffers while the device >> +is suspended. >> + >> +A driver MUST read from the power management byte after writing to the >> +byte to verify that the device successfully entered the target power >> +state. > Verify how? By checking the value returned? And what should it do with the value > does not match? > > >> + >> \subsubsection{Legacy Interfaces: A Note on PCI Device Layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Legacy Interfaces: A Note on PCI Device Layout} >> >> Transitional devices MUST present part of configuration >> -- >> 2.44.0.rc0.258.g7320e95886-goog > > --------------------------------------------------------------------- > To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org > For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org > --------------------------------------------------------------------- To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org ^ permalink raw reply [flat|nested] 9+ messages in thread
* [virtio-dev] Re: [PATCH v4 1/1] Add suspend support for virtio PCI devices 2024-02-16 8:56 ` [virtio-dev] " Michael S. Tsirkin 2024-02-18 10:59 ` Zhu, Lingshan @ 2024-02-19 6:46 ` David Stevens 2024-02-19 7:42 ` Michael S. Tsirkin 1 sibling, 1 reply; 9+ messages in thread From: David Stevens @ 2024-02-19 6:46 UTC (permalink / raw) To: Michael S. Tsirkin; +Cc: Jason Wang, virtio-comment, virtio-dev, parav On Fri, Feb 16, 2024 at 5:56 PM Michael S. Tsirkin <mst@redhat.com> wrote: > > On Fri, Feb 16, 2024 at 05:24:32PM +0900, David Stevens wrote: > > Add a virtio power management PCI capability to allow drivers to suspend > > virtio PCI devices. This allows drivers to suspend devices at the virtio > > level before suspending them at the PCI transport layer. This allows > > drivers to do a two phase suspend, which prevents notifications from > > being ignored or lost if interrupts are reconfigured at the PCI > > transport layer immediately before or after the device is put into the > > PCI PM D3 low power state. > > --- > > transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++ > > 1 file changed, 51 insertions(+) > > > > diff --git a/transport-pci.tex b/transport-pci.tex > > index a5c6719ea871..ce77708a9b69 100644 > > --- a/transport-pci.tex > > +++ b/transport-pci.tex > > @@ -188,6 +188,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option > > #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8 > > /* Vendor-specific data */ > > #define VIRTIO_PCI_CAP_VENDOR_CFG 9 > > +/* Power management configuration */ > > +#define VIRTIO_PCI_CAP_PM_CFG 10 > > \end{lstlisting} > > > > Any other value is reserved for future use. > > @@ -804,6 +806,55 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O > > specified by some other Virtio Structure PCI Capability > > of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}. > > > > +\subsubsection{Power management capability}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability} > > + > > +The VIRTIO_PCI_CAP_PM_CFG capability refers to a single byte. The > > +driver can write to the byte to set the power state of the device, > > +and it can read from the byte to get the current power state of the > > +device. > > + > > +The valid power states are: > > + > > +\begin{lstlisting} > > +/* Device is operating normally */ > > +#define VIRTIO_PM_STATE_ACTIVE 0 > > +/* Device operation is suspended */ > > +#define VIRTIO_PM_STATE_SUSPENDED 1 > > +\end{lstlisting} > > + > > +The device power state has no effect when \field{device status} does > > +not have the DRIVER_OK bit set or does have the DEVICE_NEEDS_RESET bit > > +set. > > > Given this is only after DRIVER_OK, wouldn't a feature bit + status bit > make more sense? This will make it transport-independent and simplify > discovery. A feature bit + status bit would work as well. I've posted some questions on Zhu Lingshan's patch. > > +\devicenormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability} > > + > > +A device MUST maintain its state while suspended such that all driver > > +visible state after resuming exactly matches driver visible state > > +before suspending. > > + > > +A device MUST NOT send notifications while suspended. > > + > > +A device MAY operate on any buffers in its virtqueue while suspended. > > How is this reconsiled with state matching exactly? buffers are > driver-visible ... Drivers can't atomically access buffers and the PM byte, so the device modifying a buffer while suspended is indistinguishable from the device modifying a buffer in the window between when it is resumed by the driver and when the driver accesses the buffer. But you are right that the wording is contradictory. Maybe the earlier requirement could be better phrased as: A device MUST maintain its state while suspended such that after the driver resumes the device, the driver can use the device as if it had never been suspended in the first place. > > +A device MUST set its power state to VIRTIO_PM_STATE_ACTIVE on reset. > > + > > +A device SHOULD take steps to minimize its resource consumption while > > +suspended, although what this involves is specific to the particular > > +device implementation. > > + > > +\drivernormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability} > > + > > +A driver MUST NOT access a suspended device's BARs corresponding to > > +any virtio structures, except for the power management byte. > > + > > +A driver MAY suspend a device that has buffers in one of its > > +virtqueues, but it MUST NOT modify any such buffers while the device > > +is suspended. > > + > > +A driver MUST read from the power management byte after writing to the > > +byte to verify that the device successfully entered the target power > > +state. > > Verify how? By checking the value returned? And what should it do with the value > does not match? Yes, comparing the value returned to the one it just wrote. The three options I can think of would be to abort the suspend, reset the device, or retry. Retry only makes sense if suspend/resume might succeed in the future. An API is really only retry-friendly if it has a way to set a timeout, since the consumer is what should be deciding how long to wait but can't make that decision without a timeout. Personally, I would lean towards not allowing timeouts here, since it's simpler. So maybe something like this: After the driver writes a new value to the PM byte, if the device can transition to the requested state, then subsequent reads of the PM state byte MUST block until the transition completes. If the device cannot transition to the requested state, it MUST immediately return the prior value of the PM state byte for subsequent reads of the PM state byte. In that case, the only options are to abort the suspension or to reset the device. That's really a policy decision of the driver, so I don't know how it would fit into this spec. -David -David --------------------------------------------------------------------- To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org ^ permalink raw reply [flat|nested] 9+ messages in thread
* [virtio-dev] Re: [PATCH v4 1/1] Add suspend support for virtio PCI devices 2024-02-19 6:46 ` David Stevens @ 2024-02-19 7:42 ` Michael S. Tsirkin 0 siblings, 0 replies; 9+ messages in thread From: Michael S. Tsirkin @ 2024-02-19 7:42 UTC (permalink / raw) To: David Stevens; +Cc: Jason Wang, virtio-comment, virtio-dev, parav On Mon, Feb 19, 2024 at 03:46:37PM +0900, David Stevens wrote: > On Fri, Feb 16, 2024 at 5:56 PM Michael S. Tsirkin <mst@redhat.com> wrote: > > > > On Fri, Feb 16, 2024 at 05:24:32PM +0900, David Stevens wrote: > > > Add a virtio power management PCI capability to allow drivers to suspend > > > virtio PCI devices. This allows drivers to suspend devices at the virtio > > > level before suspending them at the PCI transport layer. This allows > > > drivers to do a two phase suspend, which prevents notifications from > > > being ignored or lost if interrupts are reconfigured at the PCI > > > transport layer immediately before or after the device is put into the > > > PCI PM D3 low power state. > > > --- > > > transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++ > > > 1 file changed, 51 insertions(+) > > > > > > diff --git a/transport-pci.tex b/transport-pci.tex > > > index a5c6719ea871..ce77708a9b69 100644 > > > --- a/transport-pci.tex > > > +++ b/transport-pci.tex > > > @@ -188,6 +188,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option > > > #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8 > > > /* Vendor-specific data */ > > > #define VIRTIO_PCI_CAP_VENDOR_CFG 9 > > > +/* Power management configuration */ > > > +#define VIRTIO_PCI_CAP_PM_CFG 10 > > > \end{lstlisting} > > > > > > Any other value is reserved for future use. > > > @@ -804,6 +806,55 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O > > > specified by some other Virtio Structure PCI Capability > > > of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}. > > > > > > +\subsubsection{Power management capability}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability} > > > + > > > +The VIRTIO_PCI_CAP_PM_CFG capability refers to a single byte. The > > > +driver can write to the byte to set the power state of the device, > > > +and it can read from the byte to get the current power state of the > > > +device. > > > + > > > +The valid power states are: > > > + > > > +\begin{lstlisting} > > > +/* Device is operating normally */ > > > +#define VIRTIO_PM_STATE_ACTIVE 0 > > > +/* Device operation is suspended */ > > > +#define VIRTIO_PM_STATE_SUSPENDED 1 > > > +\end{lstlisting} > > > + > > > +The device power state has no effect when \field{device status} does > > > +not have the DRIVER_OK bit set or does have the DEVICE_NEEDS_RESET bit > > > +set. > > > > > > Given this is only after DRIVER_OK, wouldn't a feature bit + status bit > > make more sense? This will make it transport-independent and simplify > > discovery. > > A feature bit + status bit would work as well. I've posted some > questions on Zhu Lingshan's patch. Thanks! I like it that your patch is more specific but maybe something can be figure out to get the best of both worlds. > > > +\devicenormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability} > > > + > > > +A device MUST maintain its state while suspended such that all driver > > > +visible state after resuming exactly matches driver visible state > > > +before suspending. > > > + > > > +A device MUST NOT send notifications while suspended. > > > + > > > +A device MAY operate on any buffers in its virtqueue while suspended. > > > > How is this reconsiled with state matching exactly? buffers are > > driver-visible ... > > Drivers can't atomically access buffers and the PM byte, so the device > modifying a buffer while suspended is indistinguishable from the > device modifying a buffer in the window between when it is resumed by > the driver and when the driver accesses the buffer. But you are right > that the wording is contradictory. Maybe the earlier requirement could > be better phrased as: > > A device MUST maintain its state while suspended such that after the > driver resumes the device, the driver can use the device as if it had > never been suspended in the first place. I think you wording is fine, just make it clear how is the contradiction resolved. E.g. exactly matches ... with the exception of using buffers available in one of the virtqueues. > > > +A device MUST set its power state to VIRTIO_PM_STATE_ACTIVE on reset. > > > + > > > +A device SHOULD take steps to minimize its resource consumption while > > > +suspended, although what this involves is specific to the particular > > > +device implementation. > > > + > > > +\drivernormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability} > > > + > > > +A driver MUST NOT access a suspended device's BARs corresponding to > > > +any virtio structures, except for the power management byte. > > > + > > > +A driver MAY suspend a device that has buffers in one of its > > > +virtqueues, but it MUST NOT modify any such buffers while the device > > > +is suspended. > > > + > > > +A driver MUST read from the power management byte after writing to the > > > +byte to verify that the device successfully entered the target power > > > +state. > > > > Verify how? By checking the value returned? And what should it do with the value > > does not match? > > Yes, comparing the value returned to the one it just wrote. The three > options I can think of would be to abort the suspend, reset the > device, or retry. Retry only makes sense if suspend/resume might > succeed in the future. Well actually there is a reason to retry even without a timeout - has to do with pci rules which limit how quickly device has to respond to a read. So if you want to implement suspend in firmware and not be bound to any timing limits you need retry in software. > An API is really only retry-friendly if it has > a way to set a timeout, since the consumer is what should be deciding > how long to wait but can't make that decision without a timeout. > Personally, I would lean towards not allowing timeouts here, since > it's simpler. So maybe something like this: > > After the driver writes a new value to the PM byte, if the device can > transition to the requested state, then subsequent reads of the PM > state byte MUST block until the transition completes. If the device > cannot transition to the requested state, it MUST immediately return > the prior value of the PM state byte for subsequent reads of the PM > state byte. PCI has timing limitations on how long reads can block though. So that could be a reason to retry. > In that case, the only options are to abort the suspension or to reset > the device. That's really a policy decision of the driver, so I don't > know how it would fit into this spec. > > -David > > > -David --------------------------------------------------------------------- To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org ^ permalink raw reply [flat|nested] 9+ messages in thread
* [virtio-dev] Re: [PATCH v4 1/1] Add suspend support for virtio PCI devices 2024-02-16 8:24 ` [virtio-dev] [PATCH v4 1/1] Add suspend support for virtio PCI devices David Stevens 2024-02-16 8:56 ` [virtio-dev] " Michael S. Tsirkin @ 2024-02-26 6:42 ` Jason Wang 2024-02-26 8:46 ` David Stevens 1 sibling, 1 reply; 9+ messages in thread From: Jason Wang @ 2024-02-26 6:42 UTC (permalink / raw) To: David Stevens; +Cc: Michael S . Tsirkin, virtio-comment, virtio-dev, parav On Fri, Feb 16, 2024 at 4:25 PM David Stevens <stevensd@chromium.org> wrote: > > Add a virtio power management PCI capability to allow drivers to suspend > virtio PCI devices. This allows drivers to suspend devices at the virtio > level before suspending them at the PCI transport layer. This allows > drivers to do a two phase suspend, which prevents notifications from > being ignored or lost if interrupts are reconfigured at the PCI > transport layer immediately before or after the device is put into the > PCI PM D3 low power state. > --- > transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 51 insertions(+) > I would like to know how the two phase suspend can prevent the loss of notification or interrupt. Or for example, if it can be workaround at the driver level. Thanks --------------------------------------------------------------------- To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org ^ permalink raw reply [flat|nested] 9+ messages in thread
* [virtio-dev] Re: [PATCH v4 1/1] Add suspend support for virtio PCI devices 2024-02-26 6:42 ` Jason Wang @ 2024-02-26 8:46 ` David Stevens 2024-02-26 10:23 ` [virtio-dev] Re: [virtio-comment] " Zhu, Lingshan 0 siblings, 1 reply; 9+ messages in thread From: David Stevens @ 2024-02-26 8:46 UTC (permalink / raw) To: Jason Wang; +Cc: Michael S . Tsirkin, virtio-comment, virtio-dev, parav On Mon, Feb 26, 2024 at 3:42 PM Jason Wang <jasowang@redhat.com> wrote: > > On Fri, Feb 16, 2024 at 4:25 PM David Stevens <stevensd@chromium.org> wrote: > > > > Add a virtio power management PCI capability to allow drivers to suspend > > virtio PCI devices. This allows drivers to suspend devices at the virtio > > level before suspending them at the PCI transport layer. This allows > > drivers to do a two phase suspend, which prevents notifications from > > being ignored or lost if interrupts are reconfigured at the PCI > > transport layer immediately before or after the device is put into the > > PCI PM D3 low power state. > > --- > > transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++ > > 1 file changed, 51 insertions(+) > > > > I would like to know how the two phase suspend can prevent the loss of > notification or interrupt. Or for example, if it can be workaround at > the driver level. I'll preface this by saying I am not an expert in how Linux handles interrupts or in PCI, so I might be missing some important information. Actually, after thinking about this some more, I'm not sure that the two phase suspend is necessary per-se. Rather, I think relying only on PCI power management is simply insufficient. According to the kernel's system suspend and device interrupts documentation [1], "... after the 'late' phase of device suspend there is no legitimate reason why any interrupts from suspended devices should trigger and if any devices have not been suspended properly yet, it is better to block interrupts from them anyway." PCI power management doesn't occur until the noirq phase of suspend, so that's too late to be useful here. Other than that, as of right now, there's nothing the driver can do to tell devices they need to stop sending interrupts. Adding suspend support to the virtio spec would address this deficiency. More concretely, I ran into two problems when trying to get shallow suspend working with only PCI power management. If there are workarounds for this at the driver level, then that would be great. But I can't think of any, at least. First, there is a window between when Linux suspends interrupts with suspend_device_irqs and when PCI power management occurs. If the device triggers an interrupt in that window, then it won't be lost since IRQS_PENDING gets added to the descriptor state. However, if the originating event should have woken the guest up from sleep, then that doesn't happen, since there is no way for the device to know it actually needed to generate a PCI PME. I might be missing something, but this seems like something that can't be worked around at the driver level. Second, the core interrupt code can migrate interrupts when CPUs are offline'ed entering shallow suspend. Since PCI devices are already in D3 at this point, __pci_write_msi_msg does not update the MSI routing. When bringing the device back up, there is a window between when the device is put into D0 and when the updated routing tables are written by __pci_restore_msix_state. If the device sends an interrupt in this window, then it will be sent to the completely wrong handler in the guest (or dropped if there is no handler installed). I guess this could be worked around in the core PCI code by masking MSIX interrupts before suspending the device. However, the fact that Linux doesn't already do that today suggests that we're wrong, rather than everybody else happens to be wrong in just the right way. [1] https://www.kernel.org/doc/Documentation/power/suspend-and-interrupts.rst -David --------------------------------------------------------------------- To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org ^ permalink raw reply [flat|nested] 9+ messages in thread
* [virtio-dev] Re: [virtio-comment] Re: [PATCH v4 1/1] Add suspend support for virtio PCI devices 2024-02-26 8:46 ` David Stevens @ 2024-02-26 10:23 ` Zhu, Lingshan 0 siblings, 0 replies; 9+ messages in thread From: Zhu, Lingshan @ 2024-02-26 10:23 UTC (permalink / raw) To: David Stevens, Jason Wang Cc: Michael S . Tsirkin, virtio-comment, virtio-dev, parav On 2/26/2024 4:46 PM, David Stevens wrote: > On Mon, Feb 26, 2024 at 3:42 PM Jason Wang <jasowang@redhat.com> wrote: >> On Fri, Feb 16, 2024 at 4:25 PM David Stevens <stevensd@chromium.org> wrote: >>> Add a virtio power management PCI capability to allow drivers to suspend >>> virtio PCI devices. This allows drivers to suspend devices at the virtio >>> level before suspending them at the PCI transport layer. This allows >>> drivers to do a two phase suspend, which prevents notifications from >>> being ignored or lost if interrupts are reconfigured at the PCI >>> transport layer immediately before or after the device is put into the >>> PCI PM D3 low power state. >>> --- >>> transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++ >>> 1 file changed, 51 insertions(+) >>> >> I would like to know how the two phase suspend can prevent the loss of >> notification or interrupt. Or for example, if it can be workaround at >> the driver level. > I'll preface this by saying I am not an expert in how Linux handles > interrupts or in PCI, so I might be missing some important > information. > > Actually, after thinking about this some more, I'm not sure that the > two phase suspend is necessary per-se. Rather, I think relying only on > PCI power management is simply insufficient. > > According to the kernel's system suspend and device interrupts > documentation [1], "... after the 'late' phase of device suspend there > is no legitimate reason why any interrupts from suspended devices > should trigger and if any devices have not been suspended properly > yet, it is better to block interrupts from them anyway." PCI power > management doesn't occur until the noirq phase of suspend, so that's > too late to be useful here. Other than that, as of right now, there's > nothing the driver can do to tell devices they need to stop sending > interrupts. Adding suspend support to the virtio spec would address > this deficiency. I think the SUSPEND bit which is a virtio level suspending can help here. > > More concretely, I ran into two problems when trying to get shallow > suspend working with only PCI power management. If there are > workarounds for this at the driver level, then that would be great. > But I can't think of any, at least. > > First, there is a window between when Linux suspends interrupts with > suspend_device_irqs and when PCI power management occurs. If the > device triggers an interrupt in that window, then it won't be lost > since IRQS_PENDING gets added to the descriptor state. However, if the > originating event should have woken the guest up from sleep, then that > doesn't happen, since there is no way for the device to know it > actually needed to generate a PCI PME. I might be missing something, > but this seems like something that can't be worked around at the > driver level. > > Second, the core interrupt code can migrate interrupts when CPUs are > offline'ed entering shallow suspend. Since PCI devices are already in > D3 at this point, __pci_write_msi_msg does not update the MSI routing. > When bringing the device back up, there is a window between when the > device is put into D0 and when the updated routing tables are written > by __pci_restore_msix_state. If the device sends an interrupt in this > window, then it will be sent to the completely wrong handler in the > guest (or dropped if there is no handler installed). I guess this > could be worked around in the core PCI code by masking MSIX interrupts > before suspending the device. However, the fact that Linux doesn't > already do that today suggests that we're wrong, rather than everybody > else happens to be wrong in just the right way. I am not a PCI expert, please correct me if I misunderstand anything. MSI interrupt are in-band DMA writing, so the data is always secure, and CPU should process them anyway even after waking up, the handler is always there since the driver register it. The device should not send any interrupts(both vq and config) once it presents SUSPEND. For PM, I think we should not SUSPEND a PCI device by PM, PM should be supplementary steps for SUSPENDING. While we are working on this patch, I will address the comments for the SUSPENDING bit patch and work our a V2 to wider review and more comments. Finally we will merge into a series. Thanks > > [1] https://www.kernel.org/doc/Documentation/power/suspend-and-interrupts.rst > > -David > > This publicly archived list offers a means to provide input to the > > OASIS Virtual I/O Device (VIRTIO) TC. > > > > In order to verify user consent to the Feedback License terms and > > to minimize spam in the list archive, subscription is required > > before posting. > > > > Subscribe: virtio-comment-subscribe@lists.oasis-open.org > > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org > > List help: virtio-comment-help@lists.oasis-open.org > > List archive: https://lists.oasis-open.org/archives/virtio-comment/ > > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf > > List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists > > Committee: https://www.oasis-open.org/committees/virtio/ > > Join OASIS: https://www.oasis-open.org/join/ > --------------------------------------------------------------------- To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2024-02-26 10:23 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-02-16 8:24 [virtio-dev] [PATCH v4 0/1] Define a low power mode for devices David Stevens 2024-02-16 8:24 ` [virtio-dev] [PATCH v4 1/1] Add suspend support for virtio PCI devices David Stevens 2024-02-16 8:56 ` [virtio-dev] " Michael S. Tsirkin 2024-02-18 10:59 ` Zhu, Lingshan 2024-02-19 6:46 ` David Stevens 2024-02-19 7:42 ` Michael S. Tsirkin 2024-02-26 6:42 ` Jason Wang 2024-02-26 8:46 ` David Stevens 2024-02-26 10:23 ` [virtio-dev] Re: [virtio-comment] " Zhu, Lingshan
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).