public inbox for virtio-dev@lists.linux.dev
From: "Zhu, Lingshan" <lingshan.zhu@intel.com>
To: Stefan Hajnoczi <stefanha@redhat.com>,
	Eugenio Perez Martin <eperezma@redhat.com>
Cc: jasowang@redhat.com, mst@redhat.com, cohuck@redhat.com,
	virtio-comment@lists.oasis-open.org,
	virtio-dev@lists.oasis-open.org
Subject: [virtio-dev] Re: [virtio-comment] [RFC PATCH 1/5] virtio: introduce SUSPEND bit in device status
Date: Fri, 18 Aug 2023 17:55:42 +0800
Message-ID: <cc054fd2-7262-32ef-8512-0b801fbc2334@intel.com>
In-Reply-To: <20230817160440.GA3605166@fedora>



On 8/18/2023 12:04 AM, Stefan Hajnoczi wrote:
> On Thu, Aug 17, 2023 at 05:15:16PM +0200, Eugenio Perez Martin wrote:
>> On Tue, Aug 15, 2023 at 2:29 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
>>> On Tue, Aug 15, 2023 at 06:31:23PM +0800, Zhu, Lingshan wrote:
>>>>
>>>> On 8/14/2023 10:30 PM, Stefan Hajnoczi wrote:
>>>>> On Tue, Aug 15, 2023 at 03:29:00AM +0800, Zhu Lingshan wrote:
>>>>>> This patch introduces a new status bit in the device status: SUSPEND.
>>>>>>
>>>>>> This SUSPEND bit can be used by the driver to suspend a device,
>>>>>> in order to stabilize the device states and virtqueue states.
>>>>>>
>>>>>> Its main use case is live migration.
>>>>>>
>>>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>>>> Signed-off-by: Eugenio PÃrez <eperezma@redhat.com>
>>>>> There is a character encoding issue in Eugenio's surname.
>>>> Oh, I copied his SOB from his email. I will copy it from git log to fix this,
>>>> thanks for pointing it out.
>>>>>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>>>>>> ---
>>>>>>    content.tex | 18 ++++++++++++++++++
>>>>>>    1 file changed, 18 insertions(+)
>>>>> This patch hints at the asynchronous nature of the SUSPEND bit (the
>>>>> driver must re-read the Device Status Field) but doesn't explain the
>>>>> rationale or any limits.
>>>>>
>>>>> For example, is there a timeout or should the driver re-read the Device
>>>>> Status Field forever?
>>>> It depends on the driver. Normally we expect this operation to complete
>>>> successfully, like how the driver/device handles FEATURES_OK.
>>>>
>>>> If it fails due to:
>>>> 1) driver timeout, the driver can reset the device
>>>> 2) device failure, the device can set NEEDS_RESET.
>>> I mention this because SUSPEND involves quiescing the device so that no
>>> requests are in flight and that can take an unbounded amount of time on
>>> a virtio-blk, virtio-scsi, or virtiofs device. If the driver is busy
>>> waiting for the device to report the SUSPEND bit, then that could take a
>>> long time/forever.
>>>
>>> Imagine a virtiofs PCI device implemented in hardware that forwards I/O
>>> to a distributed storage system. If the distributed storage system has a
>>> request in flight then SUSPEND needs to wait for it to complete. The
>>> device has no control over how long that will take, but if it does not
>>> then corruption could occur later on.
>>>
>>> There are two issues with long SUSPEND times:
>>> 1. Busy wait CPU consumption. Since there is no interrupt that signals
>>>     when the bit is set, the best a driver can do is to back off
>>>     gradually and use timers to avoid hogging the CPU.
>>> 2. Synchronous blocking. If the call stack that led the driver to set
>>>     SUSPEND is blocked until the device reports the SUSPEND bit, then
>>>     other parts of the system could experience blocking. For example, the
>>>     VMM might be blocked in a vhost ioctl() call, which makes the guest
>>>     unresponsive.
>>>
>> I think all of this already happens with ring reset or even a plain
>> device reset, doesn't it?
> Yes, but keep in mind that the driver has typically already drained
> requests when it invokes reset. If SUSPEND is used transparently during
> live migration then there really will be requests in flight because the
> guest driver is unaware.
I agree, and as discussed before, I think in this series we should say:
the device MUST wait until all descriptors that are being processed have
finished, and mark them as used.

Then, in the following series, we should implement an in-flight IO tracker.
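
To illustrate the rule above, here is a minimal sketch of a device model that
only reports SUSPEND once every in-flight descriptor has been marked as used.
It is not spec text or an existing implementation: struct sim_device and the
sim_* helpers are hypothetical names, and the value 16 for SUSPEND is the one
proposed in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define STATUS_DRIVER_OK 4u
#define STATUS_SUSPEND   16u   /* value proposed in this RFC, not final */

/* Hypothetical device model, for illustration only. */
struct sim_device {
    uint8_t status;            /* what the driver reads back */
    bool suspend_requested;    /* driver wrote SUSPEND, not yet reported */
    unsigned int inflight;     /* descriptors fetched but not yet used */
};

/* Driver writes SUSPEND: the device reports it only when nothing is in flight. */
static void sim_request_suspend(struct sim_device *dev)
{
    dev->suspend_requested = true;
    if (dev->inflight == 0)
        dev->status |= STATUS_SUSPEND;
}

/* One in-flight request finishes: mark it used, then report SUSPEND if drained. */
static void sim_complete_one(struct sim_device *dev)
{
    if (dev->inflight > 0)
        dev->inflight--;       /* stands in for adding the descriptor to the used ring */
    if (dev->suspend_requested && dev->inflight == 0)
        dev->status |= STATUS_SUSPEND;
}
```

With two requests in flight, the driver would read back SUSPEND only after the
second call to sim_complete_one().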
>
>> In my opinion the best thing the device can do here is to fail the
>> request after a certain time, the same way it would fail if the
>> backend distributed storage system gets disconnected or latency gets
>> out of bounds.
> In order to prevent corruption there needs to be a fence in addition to
> a timeout. In other words, the storage backend needs to guarantee that
> any requests sent before the fence will be ignored if they are still
> encountered.
Please correct me if I misunderstand anything.

I am not sure a remote target is aware of SUSPEND, but the device can
certainly fail SUSPEND by setting NEEDS_RESET. In this series,
maybe it is best to flush and wait for the IO requests.
>
>>> Making SUSPEND asynchronous is more complicated but would allow long
>>> SUSPEND times to be handled gracefully.
>>>
>> Maybe that should be the direction of the transport vq, so transport
>> commands are asynchronous and we get rid of all the similar problems
>> in one shot?
> Yes, a transport virtqueue would make this operation asynchronous.
I agree

Thanks
Zhu Lingshan
>
> Stefan
>
>> Thanks!
>>
>>>>> Does the driver need to re-read the Device Status Field after clearing
>>>>> the SUSPEND bit?
>>>> I think the driver should re-read, I will add this in the next version.
>>>>>> diff --git a/content.tex b/content.tex
>>>>>> index 0a62dce..1bb4401 100644
>>>>>> --- a/content.tex
>>>>>> +++ b/content.tex
>>>>>> @@ -47,6 +47,9 @@ \section{\field{Device Status} Field}\label{sec:Basic Facilities of a Virtio Dev
>>>>>>    \item[DRIVER_OK (4)] Indicates that the driver is set up and ready to
>>>>>>      drive the device.
>>>>>> +\item[SUSPEND (16)] When VIRTIO_F_SUSPEND is negotiated, indicates that the
>>>>>> +  device has been suspended by the driver.
>>>>>> +
>>>>>>    \item[DEVICE_NEEDS_RESET (64)] Indicates that the device has experienced
>>>>>>      an error from which it can't recover.
>>>>>>    \end{description}
>>>>>> @@ -73,6 +76,10 @@ \section{\field{Device Status} Field}\label{sec:Basic Facilities of a Virtio Dev
>>>>>>    recover by issuing a reset.
>>>>>>    \end{note}
>>>>>> +The driver MUST NOT set SUSPEND if FEATURES_OK is not set.
>>>>>> +
>>>>>> +When set SUSPEND, the driver MUST re-read \field{device status} to ensure the SUSPEND bit is set.
>>>>> "When setting SUSPEND, ..." would be grammatically correct. Another
>>>>> option is "After setting the SUSPEND bit, ...".
>>>> Will fix in the next version.
>>>>>> +
>>>>>>    \devicenormative{\subsection}{Device Status Field}{Basic Facilities of a Virtio Device / Device Status Field}
>>>>>>    The device MUST NOT consume buffers or send any used buffer
>>>>>> @@ -82,6 +89,13 @@ \section{\field{Device Status} Field}\label{sec:Basic Facilities of a Virtio Dev
>>>>>>    that a reset is needed.  If DRIVER_OK is set, after it sets DEVICE_NEEDS_RESET, the device
>>>>>>    MUST send a device configuration change notification to the driver.
>>>>>> +The device MUST ignore SUSPEND if FEATURES_OK is not set.
>>>>>> +
>>>>>> +The deivce MUST ignore SUSPEND if VIRTIO_F_SUSPEND is not negotiated.
>>> I noticed a typo:
>>> "device MUST"
>>>
>>>>>> +
>>>>>> +If VIRTIO_F_SUSPEND is negotiated and SUSPEND is set, the device MUST clear SUSPEND
>>>>>> +and resumes operation upon DRIVER_OK.
>>>>> I can't parse this sentence. If the driver writes SUSPEND | DRIVER_OK |
>>>>> ... to the Device Status Field, then the device accepts DRIVER_OK and
>>>>> clears SUSPEND?
>>>>>
>>>>> Why?
>>>> I expect DRIVER_OK can clear SUSPEND, so that the device can resume running
>>>> in case of a failed live migration.
>>>>
>>>> Maybe I should say: DRIVER_OK clears SUSPEND, and if DRIVER_OK is set to
>>>> a suspended device, the device should resume operation
>>> It's confusing because there are other Device Status Field bits aside
>>> from DRIVER_OK. I wasn't sure what you meant.
>>>
>>> I think this is really saying that devices must support the SUSPEND ->
>>> !SUSPEND transition. It's not really about DRIVER_OK because that bit
>>> will be set the entire time (!SUSPEND -> SUSPEND -> !SUSPEND).
>>>
>>> Can you rephrase it? For example:
>>>
>>>    If VIRTIO_F_SUSPEND is negotiated and SUSPEND is set, the device MUST
>>>    resume operation when the driver clears the SUSPEND bit.
>>>
>>> Stefan



