From: "Zhu, Lingshan" <lingshan.zhu@intel.com>
To: Stefan Hajnoczi <stefanha@redhat.com>,
Eugenio Perez Martin <eperezma@redhat.com>
Cc: jasowang@redhat.com, mst@redhat.com, cohuck@redhat.com,
virtio-comment@lists.oasis-open.org,
virtio-dev@lists.oasis-open.org
Subject: [virtio-dev] Re: [virtio-comment] [RFC PATCH 1/5] virtio: introduce SUSPEND bit in device status
Date: Fri, 18 Aug 2023 17:55:42 +0800
Message-ID: <cc054fd2-7262-32ef-8512-0b801fbc2334@intel.com>
In-Reply-To: <20230817160440.GA3605166@fedora>
On 8/18/2023 12:04 AM, Stefan Hajnoczi wrote:
> On Thu, Aug 17, 2023 at 05:15:16PM +0200, Eugenio Perez Martin wrote:
>> On Tue, Aug 15, 2023 at 2:29 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
>>> On Tue, Aug 15, 2023 at 06:31:23PM +0800, Zhu, Lingshan wrote:
>>>>
>>>> On 8/14/2023 10:30 PM, Stefan Hajnoczi wrote:
>>>>> On Tue, Aug 15, 2023 at 03:29:00AM +0800, Zhu Lingshan wrote:
>>>>>> This patch introduces a new status bit in the device status: SUSPEND.
>>>>>>
>>>>>> This SUSPEND bit can be used by the driver to suspend a device,
>>>>>> in order to stabilize the device states and virtqueue states.
>>>>>>
>>>>>> Its main use case is live migration.
>>>>>>
>>>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>>>> Signed-off-by: Eugenio PÃrez <eperezma@redhat.com>
>>>>> There is a character encoding issue in Eugenio's surname.
>>>> Oh, I copied his SOB from his email. I will copy it from git log to fix this,
>>>> thanks for pointing it out.
>>>>>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>>>>>> ---
>>>>>> content.tex | 18 ++++++++++++++++++
>>>>>> 1 file changed, 18 insertions(+)
>>>>> This patch hints at the asynchronous nature of the SUSPEND bit (the
>>>>> driver must re-read the Device Status Field) but doesn't explain the
>>>>> rationale or any limits.
>>>>>
>>>>> For example, is there a timeout or should the driver re-read the Device
>>>>> Status Field forever?
>>>> It depends on the driver. Normally we expect this operation to complete
>>>> successfully, like how the driver/device handles FEATURES_OK.
>>>>
>>>> If it fails due to:
>>>> 1) driver timeout, the driver can reset the device
>>>> 2) device failure, the device can set NEEDS_RESET.
>>> I mention this because SUSPEND involves quiescing the device so that no
>>> requests are in flight and that can take an unbounded amount of time on
>>> a virtio-blk, virtio-scsi, or virtiofs device. If the driver is busy
>>> waiting for the device to report the SUSPEND bit, then that could take a
>>> long time/forever.
>>>
>>> Imagine a virtiofs PCI device implemented in hardware that forwards I/O
>>> to a distributed storage system. If the distributed storage system has a
>>> request in flight then SUSPEND needs to wait for it to complete. The
>>> device has no control over how long that will take, but if it does not
>>> wait then corruption could occur later on.
>>>
>>> There are two issues with long SUSPEND times:
>>> 1. Busy wait CPU consumption. Since there is no interrupt that signals
>>> when the bit is set, the best a driver can do is to back off
>>> gradually and use timers to avoid hogging the CPU.
>>> 2. Synchronous blocking. If the call stack that led the driver to set
>>> SUSPEND is blocked until the device reports the SUSPEND bit, then
>>> other parts of the system could experience blocking. For example, the
>>> VMM might be blocked in a vhost ioctl() call, which makes the guest
>>> unresponsive.
>>>
>> I think all of this already happens with ring reset or even a plain
>> device reset, doesn't it?
> Yes, but keep in mind that the driver has typically already drained
> requests when it invokes reset. If SUSPEND is used transparently during
> live migration then there really will be requests in flight because the
> guest driver is unaware.
I agree. As discussed before, I think in this series we should say:
the device MUST wait until all descriptors that are being processed have
finished, and mark them as used.
Then, in a following series, we should implement an in-flight I/O tracker.
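
A minimal sketch of what the driver side could look like under this
wording, written as a Python model rather than real driver code. It sets
SUSPEND, then re-reads the status with a growing delay until the device
reports SUSPEND, reports DEVICE_NEEDS_RESET, or a driver-chosen timeout
expires. FakeDevice, suspend_device and all timing values are hypothetical;
only the status bit values come from the spec and from this patch
(SUSPEND = 16).

```python
import time

# Device status bits; SUSPEND (16) is the value proposed in this patch.
DRIVER_OK = 4
FEATURES_OK = 8
SUSPEND = 16
DEVICE_NEEDS_RESET = 64


class FakeDevice:
    """Hypothetical device model: SUSPEND shows up only after a number of
    status reads, standing in for in-flight requests that need to drain."""

    def __init__(self, polls_until_drained):
        self.status = FEATURES_OK | DRIVER_OK
        self._pending = False
        self._polls_left = polls_until_drained

    def write_status(self, value):
        # The device does not report SUSPEND right away; it only notes it.
        if value & SUSPEND and self.status & FEATURES_OK:
            self._pending = True

    def read_status(self):
        if self._pending:
            if self._polls_left == 0:
                self.status |= SUSPEND
                self._pending = False
            else:
                self._polls_left -= 1
        return self.status


def suspend_device(dev, timeout_s=1.0, first_delay_s=0.001, max_delay_s=0.05):
    """Return 'suspended', 'needs_reset' or 'timeout' (assumed outcomes)."""
    dev.write_status(dev.read_status() | SUSPEND)
    deadline = time.monotonic() + timeout_s
    delay = first_delay_s
    while True:
        status = dev.read_status()
        if status & DEVICE_NEEDS_RESET:
            return "needs_reset"   # device failure: driver should reset
        if status & SUSPEND:
            return "suspended"
        if time.monotonic() >= deadline:
            return "timeout"       # driver timeout: driver may reset
        time.sleep(delay)          # back off instead of busy waiting
        delay = min(delay * 2, max_delay_s)
```

For example, suspend_device(FakeDevice(3)) returns "suspended" after a few
polls, while a device that never drains ends in "timeout". The backoff only
addresses the CPU consumption concern; the caller is still blocked for the
whole wait.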
>
>> In my opinion the best thing the device can do here is to fail the
>> request after a certain time, the same way it would fail if the
>> backend distributed storage system gets disconnected or latency gets
>> out of bounds.
> In order to prevent corruption there needs to be a fence in addition to
> a timeout. In other words, the storage backend needs to guarantee that
> any requests sent before the fence will be ignored if they are still
> encountered.
Please correct me if I have misunderstood anything.
I am not sure a remote target is aware of SUSPEND, but the device can
certainly fail SUSPEND by setting NEEDS_RESET. In this series,
it is probably best to flush and wait for the I/O requests.
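
A rough model of the device side of "flush and wait", again in Python and
only as an illustration. After SUSPEND is requested the device waits for
every in-flight request, marks it as used, and only then reports SUSPEND.
If the backend fails a request, the device sets DEVICE_NEEDS_RESET instead.
ModelDevice and its methods are hypothetical names, and the fencing problem
Stefan describes is not modelled here.

```python
from collections import deque

SUSPEND = 16             # value proposed in this patch
DEVICE_NEEDS_RESET = 64


class ModelDevice:
    """Hypothetical device that drains in-flight requests before SUSPEND."""

    def __init__(self, in_flight):
        self.status = 0
        self.in_flight = deque(in_flight)  # descriptor heads being processed
        self.used = []                     # heads already marked as used
        self.suspending = False

    def _maybe_finish_suspend(self):
        drained = not self.in_flight
        failed = self.status & DEVICE_NEEDS_RESET
        if self.suspending and drained and not failed:
            self.status |= SUSPEND

    def request_suspend(self):
        # From here on the device would stop consuming new buffers.
        self.suspending = True
        self._maybe_finish_suspend()

    def complete_one(self, ok=True):
        """The backend finished (or failed) the oldest in-flight request."""
        if not self.in_flight:
            return
        head = self.in_flight.popleft()
        if not ok:
            self.status |= DEVICE_NEEDS_RESET  # fail the suspend
            return
        self.used.append(head)
        self._maybe_finish_suspend()
```

With two requests in flight, request_suspend() leaves SUSPEND clear until
complete_one() has been called twice; a failed completion leaves the device
with DEVICE_NEEDS_RESET set and SUSPEND clear.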
>
>>> Making SUSPEND asynchronous is more complicated but would allow long
>>> SUSPEND times to be handled gracefully.
>>>
>> Maybe that should be the direction of the transport vq, so transport
>> commands are asynchronous and we get rid of all the similar problems
>> in one shot?
> Yes, a transport virtqueue would make this operation asynchronous.
I agree
Thanks
Zhu Lingshan
>
> Stefan
>
>> Thanks!
>>
>>>>> Does the driver need to re-read the Device Status Field after clearing
>>>>> the SUSPEND bit?
>>>> I think the driver should re-read, I will add this in the next version.
>>>>>> diff --git a/content.tex b/content.tex
>>>>>> index 0a62dce..1bb4401 100644
>>>>>> --- a/content.tex
>>>>>> +++ b/content.tex
>>>>>> @@ -47,6 +47,9 @@ \section{\field{Device Status} Field}\label{sec:Basic Facilities of a Virtio Dev
>>>>>> \item[DRIVER_OK (4)] Indicates that the driver is set up and ready to
>>>>>> drive the device.
>>>>>> +\item[SUSPEND (16)] When VIRTIO_F_SUSPEND is negotiated, indicates that the
>>>>>> + device has been suspended by the driver.
>>>>>> +
>>>>>> \item[DEVICE_NEEDS_RESET (64)] Indicates that the device has experienced
>>>>>> an error from which it can't recover.
>>>>>> \end{description}
>>>>>> @@ -73,6 +76,10 @@ \section{\field{Device Status} Field}\label{sec:Basic Facilities of a Virtio Dev
>>>>>> recover by issuing a reset.
>>>>>> \end{note}
>>>>>> +The driver MUST NOT set SUSPEND if FEATURES_OK is not set.
>>>>>> +
>>>>>> +When set SUSPEND, the driver MUST re-read \field{device status} to ensure the SUSPEND bit is set.
>>>>> "When setting SUSPEND, ..." would be grammatically correct. Another
>>>>> option is "After setting the SUSPEND bit, ...".
>>>> Will fix in the next version.
>>>>>> +
>>>>>> \devicenormative{\subsection}{Device Status Field}{Basic Facilities of a Virtio Device / Device Status Field}
>>>>>> The device MUST NOT consume buffers or send any used buffer
>>>>>> @@ -82,6 +89,13 @@ \section{\field{Device Status} Field}\label{sec:Basic Facilities of a Virtio Dev
>>>>>> that a reset is needed. If DRIVER_OK is set, after it sets DEVICE_NEEDS_RESET, the device
>>>>>> MUST send a device configuration change notification to the driver.
>>>>>> +The device MUST ignore SUSPEND if FEATURES_OK is not set.
>>>>>> +
>>>>>> +The deivce MUST ignore SUSPEND if VIRTIO_F_SUSPEND is not negotiated.
>>> I noticed a typo:
>>> "device MUST"
>>>
>>>>>> +
>>>>>> +If VIRTIO_F_SUSPEND is negotiated and SUSPEND is set, the device MUST clear SUSPEND
>>>>>> +and resumes operation upon DRIVER_OK.
>>>>> I can't parse this sentence. If the driver writes SUSPEND | DRIVER_OK |
>>>>> ... to the Device Status Field, then the device accepts DRIVER_OK and
>>>>> clears SUSPEND?
>>>>>
>>>>> Why?
>>>> I expect DRIVER_OK can clear SUSPEND, so that the device can resume running
>>>> in case of a failed live migration.
>>>>
>>>> Maybe I should say: DRIVER_OK clears SUSPEND, and if DRIVER_OK is set on
>>>> a suspended device, the device should resume operation.
>>> It's confusing because there are other Device Status Field bits aside
>>> from DRIVER_OK. I wasn't sure what you meant.
>>>
>>> I think this is really saying that devices must support the SUSPEND ->
>>> !SUSPEND transition. It's not really about DRIVER_OK because that bit
>>> will be set the entire time (!SUSPEND -> SUSPEND -> !SUSPEND).
>>>
>>> Can you rephrase it? For example:
>>>
>>> If VIRTIO_F_SUSPEND is negotiated and SUSPEND is set, the device MUST
>>> resume operation when the driver clears the SUSPEND bit.
>>>
>>> Stefan