All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@redhat.com>
To: "Zhu, Lingshan" <lingshan.zhu@intel.com>
Cc: Eugenio Perez Martin <eperezma@redhat.com>,
	jasowang@redhat.com, mst@redhat.com, cohuck@redhat.com,
	virtio-comment@lists.oasis-open.org,
	virtio-dev@lists.oasis-open.org
Subject: Re: [virtio-comment] [RFC PATCH 1/5] virtio: introduce SUSPEND bit in device status
Date: Mon, 21 Aug 2023 09:45:57 -0400	[thread overview]
Message-ID: <20230821134557.GA45012@fedora> (raw)
In-Reply-To: <cc054fd2-7262-32ef-8512-0b801fbc2334@intel.com>

[-- Attachment #1: Type: text/plain, Size: 5063 bytes --]

On Fri, Aug 18, 2023 at 05:55:42PM +0800, Zhu, Lingshan wrote:
> 
> 
> On 8/18/2023 12:04 AM, Stefan Hajnoczi wrote:
> > On Thu, Aug 17, 2023 at 05:15:16PM +0200, Eugenio Perez Martin wrote:
> > > On Tue, Aug 15, 2023 at 2:29 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > > > On Tue, Aug 15, 2023 at 06:31:23PM +0800, Zhu, Lingshan wrote:
> > > > > 
> > > > > On 8/14/2023 10:30 PM, Stefan Hajnoczi wrote:
> > > > > > On Tue, Aug 15, 2023 at 03:29:00AM +0800, Zhu Lingshan wrote:
> > > > > > > This patch introudces a new status bit in the device status: SUSPEND.
> > > > > > > 
> > > > > > > This SUSPEND bit can be used by the driver to suspend a device,
> > > > > > > in order to stablize the device states and virtqueue states.
> > > > > > > 
> > > > > > > Its main use case is live migration.
> > > > > > > 
> > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > Signed-off-by: Eugenio PÃrez <eperezma@redhat.com>
> > > > > > There is an character encoding issue in Eugenio's surname.
> > > > > Oh, I copied his SOB form his email, I will copy from git log to fix this,
> > > > > thanks for point out it.
> > > > > > > Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> > > > > > > ---
> > > > > > >    content.tex | 18 ++++++++++++++++++
> > > > > > >    1 file changed, 18 insertions(+)
> > > > > > This patch hints at the asynchronous nature of the SUSPEND bit (the
> > > > > > driver must re-read the Device Status Field) but doesn't explain the
> > > > > > rationale or any limits.
> > > > > > 
> > > > > > For example, is there a timeout or should the driver re-read the Device
> > > > > > Status Field forever?
> > > > > It depends on the driver, normally we expect this operation can be done
> > > > > successfully
> > > > > like how the driver/device handles FEATURES_OK.
> > > > > 
> > > > > Once failed due to:
> > > > > 1) driver timeout, the driver can reset the device
> > > > > 2) device failure, the device can set NEEDS_RESET.
> > > > I mention this because SUSPEND involves quiescing the device so that no
> > > > requests are in flight and that can take an unbounded amount of time on
> > > > a virtio-blk, virtio-scsi, or virtiofs device. If the driver is busy
> > > > waiting for the device to report the SUSPEND bit, then that could take a
> > > > long time/forever.
> > > > 
> > > > Imagine a virtiofs PCI device implemented in hardware that forwards I/O
> > > > to a distributed storage system. If the distributed storage system has a
> > > > request in flight then SUSPEND needs to wait for it to complete. The
> > > > device has no control over how long that will take, but if it does not
> > > > then corruption could occur later on.
> > > > 
> > > > There are two issues with long SUSPEND times:
> > > > 1. Busy wait CPU consumption. Since there is no interrupt that signals
> > > >     when the bit is set, the best a driver can do is to back off
> > > >     gradually and use timers to avoid hogging the CPU.
> > > > 2. Synchronous blocking. If the call stack that led the driver to set
> > > >     SUSPEND is blocked until the device reports the SUSPEND bit, then
> > > >     other parts of the system could experience blocking. For example, the
> > > >     VMM might be blocked in a vhost ioctl() call, which makes the guest
> > > >     unresponsive.
> > > > 
> > > I think all of this already happens with ring reset or even a plain
> > > device reset, doesn't it?
> > Yes, but keep in mind that the driver has typically already drained
> > requests when it invokes reset. If SUSPEND is used transparently during
> > live migration then there really will be requests in flight because the
> > guest driver is unaware.
> I agree, and as discussed before, I think in this series we should say:
> the device MUST wait untilall descriptors that being processed to finish and
> mark them as used.

Sounds good.

> Then in the following series, we should implement in-flight IO tracker.
> > 
> > > In my opinion the best thing the device can do here is to fail the
> > > request after a certain time, the same way it would fail if the
> > > backend distributed storage system gets disconnected or latency gets
> > > out of bounds.
> > In order to prevent corruption there needs to be a fence in addition to
> > a timeout. In other words, the storage backend needs to guarantee that
> > any requests sent before the fence will be ignored if they are still
> > encountered.
> Please correct me if I misunderstand anything.
> 
> I am not sure a remote target is aware of SUSPEND, but the device can
> fail SUSPEND by setting NEEDS_RESET for sure. IN this series,
> maybe it is best to flush and wait for the IO requests.

Yes, it is necessary to wait for pending I/O.

I was pointing out that timeouts implemented inside the storage
initiator (client) are unsafe. The storage target (server) needs to
participate in stopping in-flight I/O one way or another (timeout,
cancellation, fence, etc).

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

WARNING: multiple messages have this Message-ID (diff)
From: Stefan Hajnoczi <stefanha@redhat.com>
To: "Zhu, Lingshan" <lingshan.zhu@intel.com>
Cc: Eugenio Perez Martin <eperezma@redhat.com>,
	jasowang@redhat.com, mst@redhat.com, cohuck@redhat.com,
	virtio-comment@lists.oasis-open.org,
	virtio-dev@lists.oasis-open.org
Subject: [virtio-dev] Re: [virtio-comment] [RFC PATCH 1/5] virtio: introduce SUSPEND bit in device status
Date: Mon, 21 Aug 2023 09:45:57 -0400	[thread overview]
Message-ID: <20230821134557.GA45012@fedora> (raw)
In-Reply-To: <cc054fd2-7262-32ef-8512-0b801fbc2334@intel.com>

[-- Attachment #1: Type: text/plain, Size: 5063 bytes --]

On Fri, Aug 18, 2023 at 05:55:42PM +0800, Zhu, Lingshan wrote:
> 
> 
> On 8/18/2023 12:04 AM, Stefan Hajnoczi wrote:
> > On Thu, Aug 17, 2023 at 05:15:16PM +0200, Eugenio Perez Martin wrote:
> > > On Tue, Aug 15, 2023 at 2:29 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > > > On Tue, Aug 15, 2023 at 06:31:23PM +0800, Zhu, Lingshan wrote:
> > > > > 
> > > > > On 8/14/2023 10:30 PM, Stefan Hajnoczi wrote:
> > > > > > On Tue, Aug 15, 2023 at 03:29:00AM +0800, Zhu Lingshan wrote:
> > > > > > > This patch introudces a new status bit in the device status: SUSPEND.
> > > > > > > 
> > > > > > > This SUSPEND bit can be used by the driver to suspend a device,
> > > > > > > in order to stablize the device states and virtqueue states.
> > > > > > > 
> > > > > > > Its main use case is live migration.
> > > > > > > 
> > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > Signed-off-by: Eugenio PÃrez <eperezma@redhat.com>
> > > > > > There is an character encoding issue in Eugenio's surname.
> > > > > Oh, I copied his SOB form his email, I will copy from git log to fix this,
> > > > > thanks for point out it.
> > > > > > > Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> > > > > > > ---
> > > > > > >    content.tex | 18 ++++++++++++++++++
> > > > > > >    1 file changed, 18 insertions(+)
> > > > > > This patch hints at the asynchronous nature of the SUSPEND bit (the
> > > > > > driver must re-read the Device Status Field) but doesn't explain the
> > > > > > rationale or any limits.
> > > > > > 
> > > > > > For example, is there a timeout or should the driver re-read the Device
> > > > > > Status Field forever?
> > > > > It depends on the driver, normally we expect this operation can be done
> > > > > successfully
> > > > > like how the driver/device handles FEATURES_OK.
> > > > > 
> > > > > Once failed due to:
> > > > > 1) driver timeout, the driver can reset the device
> > > > > 2) device failure, the device can set NEEDS_RESET.
> > > > I mention this because SUSPEND involves quiescing the device so that no
> > > > requests are in flight and that can take an unbounded amount of time on
> > > > a virtio-blk, virtio-scsi, or virtiofs device. If the driver is busy
> > > > waiting for the device to report the SUSPEND bit, then that could take a
> > > > long time/forever.
> > > > 
> > > > Imagine a virtiofs PCI device implemented in hardware that forwards I/O
> > > > to a distributed storage system. If the distributed storage system has a
> > > > request in flight then SUSPEND needs to wait for it to complete. The
> > > > device has no control over how long that will take, but if it does not
> > > > then corruption could occur later on.
> > > > 
> > > > There are two issues with long SUSPEND times:
> > > > 1. Busy wait CPU consumption. Since there is no interrupt that signals
> > > >     when the bit is set, the best a driver can do is to back off
> > > >     gradually and use timers to avoid hogging the CPU.
> > > > 2. Synchronous blocking. If the call stack that led the driver to set
> > > >     SUSPEND is blocked until the device reports the SUSPEND bit, then
> > > >     other parts of the system could experience blocking. For example, the
> > > >     VMM might be blocked in a vhost ioctl() call, which makes the guest
> > > >     unresponsive.
> > > > 
> > > I think all of this already happens with ring reset or even a plain
> > > device reset, doesn't it?
> > Yes, but keep in mind that the driver has typically already drained
> > requests when it invokes reset. If SUSPEND is used transparently during
> > live migration then there really will be requests in flight because the
> > guest driver is unaware.
> I agree, and as discussed before, I think in this series we should say:
> the device MUST wait untilall descriptors that being processed to finish and
> mark them as used.

Sounds good.

> Then in the following series, we should implement in-flight IO tracker.
> > 
> > > In my opinion the best thing the device can do here is to fail the
> > > request after a certain time, the same way it would fail if the
> > > backend distributed storage system gets disconnected or latency gets
> > > out of bounds.
> > In order to prevent corruption there needs to be a fence in addition to
> > a timeout. In other words, the storage backend needs to guarantee that
> > any requests sent before the fence will be ignored if they are still
> > encountered.
> Please correct me if I misunderstand anything.
> 
> I am not sure a remote target is aware of SUSPEND, but the device can
> fail SUSPEND by setting NEEDS_RESET for sure. IN this series,
> maybe it is best to flush and wait for the IO requests.

Yes, it is necessary to wait for pending I/O.

I was pointing out that timeouts implemented inside the storage
initiator (client) are unsafe. The storage target (server) needs to
participate in stopping in-flight I/O one way or another (timeout,
cancellation, fence, etc).

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  reply	other threads:[~2023-08-21 13:46 UTC|newest]

Thread overview: 117+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-14 19:28 [virtio-comment] [RFC PATCH 0/5] virtio: introduce SUSPEND bit and vq state Zhu Lingshan
2023-08-14 19:28 ` [virtio-dev] " Zhu Lingshan
2023-08-14 14:20 ` [virtio-comment] " Stefan Hajnoczi
2023-08-14 14:20   ` [virtio-dev] " Stefan Hajnoczi
2023-08-14 15:47 ` Stefan Hajnoczi
2023-08-14 15:47   ` [virtio-dev] " Stefan Hajnoczi
2023-08-15  1:38   ` Jason Wang
2023-08-15  1:38     ` [virtio-dev] " Jason Wang
2023-08-15 10:14     ` Zhu, Lingshan
2023-08-15 10:14       ` [virtio-dev] " Zhu, Lingshan
2023-08-14 19:29 ` [virtio-comment] [RFC PATCH 1/5] virtio: introduce SUSPEND bit in device status Zhu Lingshan
2023-08-14 19:29   ` [virtio-dev] " Zhu Lingshan
2023-08-14 14:30   ` [virtio-comment] " Stefan Hajnoczi
2023-08-14 14:30     ` [virtio-dev] " Stefan Hajnoczi
2023-08-15 10:31     ` Zhu, Lingshan
2023-08-15 10:31       ` [virtio-dev] " Zhu, Lingshan
2023-08-15 12:29       ` Stefan Hajnoczi
2023-08-15 12:29         ` [virtio-dev] " Stefan Hajnoczi
2023-08-17 15:15         ` Eugenio Perez Martin
2023-08-17 15:15           ` [virtio-dev] " Eugenio Perez Martin
2023-08-17 16:04           ` Stefan Hajnoczi
2023-08-17 16:04             ` [virtio-dev] " Stefan Hajnoczi
2023-08-18  9:55             ` Zhu, Lingshan
2023-08-18  9:55               ` [virtio-dev] " Zhu, Lingshan
2023-08-21 13:45               ` Stefan Hajnoczi [this message]
2023-08-21 13:45                 ` Stefan Hajnoczi
2023-08-15  0:26   ` [virtio-comment] " Jason Wang
2023-08-15  0:26     ` [virtio-dev] " Jason Wang
2023-08-15  0:37     ` [virtio-comment] " Jason Wang
2023-08-15  0:37       ` [virtio-dev] " Jason Wang
2023-08-15 10:48       ` [virtio-comment] " Zhu, Lingshan
2023-08-15 10:48         ` [virtio-dev] " Zhu, Lingshan
2023-08-16  1:58         ` [virtio-comment] " Jason Wang
2023-08-16  1:58           ` [virtio-dev] " Jason Wang
2023-08-16  2:17           ` [virtio-comment] " Zhu, Lingshan
2023-08-16  2:17             ` [virtio-dev] " Zhu, Lingshan
2023-08-15 10:50       ` [virtio-comment] " Zhu, Lingshan
2023-08-15 10:50         ` [virtio-dev] " Zhu, Lingshan
2023-08-16  2:05         ` [virtio-comment] " Jason Wang
2023-08-16  2:05           ` [virtio-dev] " Jason Wang
2023-08-16  2:20           ` Zhu, Lingshan
2023-08-16  2:20             ` [virtio-dev] " Zhu, Lingshan
2023-08-14 19:29 ` [virtio-comment] [RFC PATCH 2/5] virtio: introduce vq state as basic facility Zhu Lingshan
2023-08-14 19:29   ` [virtio-dev] " Zhu Lingshan
2023-08-14 14:49   ` [virtio-comment] " Stefan Hajnoczi
2023-08-14 14:49     ` Stefan Hajnoczi
2023-08-15 10:53     ` [virtio-comment] " Zhu, Lingshan
2023-08-15 10:53       ` Zhu, Lingshan
2023-08-14 19:29 ` [virtio-comment] [RFC PATCH 3/5] virtio: The actions by the device upon SUSPEND Zhu Lingshan
2023-08-14 19:29   ` [virtio-dev] " Zhu Lingshan
2023-08-14 15:00   ` [virtio-comment] " Stefan Hajnoczi
2023-08-14 15:00     ` [virtio-dev] " Stefan Hajnoczi
2023-08-15 11:07     ` Zhu, Lingshan
2023-08-15 11:07       ` [virtio-dev] " Zhu, Lingshan
2023-08-15 12:33       ` Stefan Hajnoczi
2023-08-15 12:33         ` [virtio-dev] " Stefan Hajnoczi
2023-08-16  4:25         ` Zhu, Lingshan
2023-08-16  4:25           ` [virtio-dev] " Zhu, Lingshan
2023-08-16 12:33           ` Stefan Hajnoczi
2023-08-16 12:33             ` [virtio-dev] " Stefan Hajnoczi
2023-08-15  0:29   ` [virtio-comment] " Jason Wang
2023-08-15  0:29     ` [virtio-dev] " Jason Wang
2023-08-15 11:16     ` [virtio-comment] " Zhu, Lingshan
2023-08-15 11:16       ` Zhu, Lingshan
2023-08-16  2:10       ` [virtio-comment] " Jason Wang
2023-08-16  2:10         ` Jason Wang
2023-08-16  4:53         ` [virtio-comment] " Zhu, Lingshan
2023-08-16  4:53           ` Zhu, Lingshan
2023-08-14 19:29 ` [virtio-comment] [RFC PATCH 4/5] virtqueue: constraints for virtqueue state Zhu Lingshan
2023-08-14 19:29   ` [virtio-dev] " Zhu Lingshan
2023-08-14 15:15   ` [virtio-comment] " Stefan Hajnoczi
2023-08-14 15:15     ` Stefan Hajnoczi
2023-08-15 11:18     ` [virtio-comment] " Zhu, Lingshan
2023-08-15 11:18       ` Zhu, Lingshan
2023-08-15  0:34   ` [virtio-comment] " Jason Wang
2023-08-15  0:34     ` [virtio-dev] " Jason Wang
2023-08-15 11:30     ` [virtio-comment] " Zhu, Lingshan
2023-08-15 11:30       ` [virtio-dev] " Zhu, Lingshan
2023-08-16  2:11       ` [virtio-comment] " Jason Wang
2023-08-16  2:11         ` [virtio-dev] " Jason Wang
2023-08-16  5:07         ` [virtio-comment] " Zhu, Lingshan
2023-08-16  5:07           ` Zhu, Lingshan
2023-08-17  8:31       ` [virtio-comment] " Uminski, Piotr
2023-08-17  8:42         ` Zhu, Lingshan
2023-08-17  8:42           ` [virtio-dev] " Zhu, Lingshan
2023-08-21  4:03           ` Jason Wang
2023-08-21  4:03             ` [virtio-dev] " Jason Wang
2023-08-17 15:19       ` Eugenio Perez Martin
2023-08-17 15:19         ` [virtio-dev] " Eugenio Perez Martin
2023-08-18  9:44         ` [virtio-comment] " Zhu, Lingshan
2023-08-18  9:44           ` [virtio-dev] " Zhu, Lingshan
2023-08-21  9:26           ` [virtio-comment] " Eugenio Perez Martin
2023-08-21  9:26             ` [virtio-dev] " Eugenio Perez Martin
2023-08-21 10:32             ` [virtio-comment] " Zhu, Lingshan
2023-08-21 10:32               ` [virtio-dev] " Zhu, Lingshan
2023-09-05  9:08             ` Zhu, Lingshan
2023-09-05  9:08               ` [virtio-dev] " Zhu, Lingshan
2023-09-07  8:09               ` Eugenio Perez Martin
2023-09-07  8:09                 ` [virtio-dev] " Eugenio Perez Martin
2023-09-07  9:34                 ` Zhu, Lingshan
2023-09-07  9:34                   ` [virtio-dev] " Zhu, Lingshan
2023-09-08  6:23                   ` Si-Wei Liu
2023-09-08  6:23                     ` [virtio-dev] " Si-Wei Liu
2023-09-08  8:41                     ` Zhu, Lingshan
2023-09-08  8:41                       ` [virtio-dev] " Zhu, Lingshan
2023-08-14 19:29 ` [virtio-comment] [RFC PATCH 5/5] virtio-pci: implement VIRTIO_F_QUEUE_STATE Zhu Lingshan
2023-08-14 19:29   ` [virtio-dev] " Zhu Lingshan
2023-08-14 15:18   ` [virtio-comment] " Stefan Hajnoczi
2023-08-14 15:18     ` Stefan Hajnoczi
2023-08-15 11:31     ` [virtio-comment] " Zhu, Lingshan
2023-08-15 11:31       ` [virtio-dev] " Zhu, Lingshan
2023-08-15  0:35   ` [virtio-comment] " Jason Wang
2023-08-15  0:35     ` [virtio-dev] " Jason Wang
2023-08-15 11:31     ` [virtio-comment] " Zhu, Lingshan
2023-08-15 11:31       ` [virtio-dev] " Zhu, Lingshan
2023-08-17  3:04 ` [virtio-comment] Re: [RFC PATCH 0/5] virtio: introduce SUSPEND bit and vq state Zhu, Lingshan
2023-08-17  3:04   ` [virtio-dev] " Zhu, Lingshan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230821134557.GA45012@fedora \
    --to=stefanha@redhat.com \
    --cc=cohuck@redhat.com \
    --cc=eperezma@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=lingshan.zhu@intel.com \
    --cc=mst@redhat.com \
    --cc=virtio-comment@lists.oasis-open.org \
    --cc=virtio-dev@lists.oasis-open.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.