From: Cornelia Huck <cohuck@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: Halil Pasic <pasic@linux.ibm.com>,
stefanha@redhat.com, virtio-comment@lists.oasis-open.org,
mst@redhat.com, eperezma@redhat.com, sgarzare@redhat.com
Subject: Re: [virtio-comment] [PATCH RFC] virtio: introduce VIRTIO_F_DEVICE_STOP
Date: Tue, 22 Dec 2020 16:54:31 +0100 [thread overview]
Message-ID: <20201222165431.3f49de29.cohuck@redhat.com> (raw)
In-Reply-To: <7da54d5b-0787-c78f-4b35-6a4f7ed2f5bf@redhat.com>
On Tue, 22 Dec 2020 20:51:04 +0800
Jason Wang <jasowang@redhat.com> wrote:
> On 2020/12/22 下午8:14, Cornelia Huck wrote:
> > On Tue, 22 Dec 2020 15:30:31 +0800
> > Jason Wang <jasowang@redhat.com> wrote:
> >
> >> On 2020/12/22 下午2:50, Halil Pasic wrote:
> >>> On Tue, 22 Dec 2020 10:36:41 +0800
> >>> Jason Wang <jasowang@redhat.com> wrote:
> >>>
> >>>> On 2020/12/22 上午5:33, Halil Pasic wrote:
> >>>>> On Fri, 18 Dec 2020 12:23:02 +0800
> >>>>> Jason Wang <jasowang@redhat.com> wrote:
> >>>>>
> >>>>>> This patch introduces a new status bit DEVICE_STOPPED. This will be
> >>>>>> used by the driver to stop and resume a device. The main user will be
> >>>>>> live migration support for virtio device.
> >>>>>>
> >>>>> Can you please provide some more background information, or point
> >>>>> me to the appropriate discussion?
> >>>>>
> >>>>> I mean AFAIK migration already works without this driver initiated
> >>>>> drain. What is the exact motivation? What about the big picture? I
> >>>>> guess some agent in the guest would have to make the driver issue
> >>>>> the DEVICE_STOP.
> >>>> This is not necessary if the datapath is done inside qemu and when
> >>>> migration is initiated by qemu itself.
> >>>>
> >>>> But it's a must for using virtio-device as a backend for emulated virtio
> >>>> devices (e.g vhost-vDPA). In this case, qemu needs to stop the device
> >>>> then it can safely synchronize the state from them.
> >>>>
> >>> You say, in this case qemu needs to stop the device, which makes sense
> >>> (it also has to do this when the datapath is implemented in qemu), but
> >>> AFAIU DEVICE_STOPPED is initiated by the guest and not by qemu. I'm
> >>> confused.
> >>
> >> It's initiated by Qemu. Guest is unware of live migration.
> > But isn't setting DEVICE_STOPPED a _driver_ initiated process? That
> > sounds like "guest" to me.
>
>
> I think we need clarification on the "driver".
Looking through the spec, I think we have never properly defined what
we mean by "device" or "driver". Not sure what a good definition would
look like. I would consider a "device" an entity that is present and is
configured/used by a "driver". The "driver" is the means to actually
configure/use a "device", i.e. the "initiating" part (even though some
actions are initiated by the device once the driver has started
interacting with it).
> Think the following setup:
>
> Guest driver <-> Qemu <-> vhost-vdpa <-> vDPA parent (virtio-pci vDPA
> driver)
>
> It's the Qemu that initiates the stop, and the request is forwarded to
> virtio-pci vDPA driver. So the process is transparent to the guest (driver).
So it's
Guest driver <-> QEMU <-> vhost-vdpa <-> vDPA parent
|<-- guest -->|<-------------- host -------------->|
but the vDPA parent is itself acting as a "driver" if you take my
attempt at a definition above?
(Wouldn't that mean that the device in QEMU is configured/used from two
entities?)
>
>
> >
> >>
> >>> I'm still curious about how the different components in the stack
> >>> (guest OS, qemu, vdpa-vhost in host kernel, the PCI function) are
> >>> supposed to interact.
> >>
> >> It works like:
> >>
> >> From Qemu point of view, vhost-vDPA is just another type of vhost
> >> backend. Qemu needs to stop virtio (vhost) before it can do migration.
> >> So we require vDPA devices to have the ability of stopping or pausing
> >> its datapath. If the vDPA device is by chance the virtio-PCI device, it
> >> needs an interface for receiving stop/resume command from the driver.
> >>
> >> So the devce stop/resume command was sent from Qemu to vhost-VDPA, then
> >> to vDPA parent which could be a virtio-PCI device in this case.
> > But QEMU implements the _device_, not the driver, doesn't it?
>
>
> The device is implemented with the co-operation between Qemu and
> vhost-vDPA. During migration qemu need to stop the virtio-net device,
> then vhost must be stopped as well.
Yes, it makes sense that any vhost parts need to be stopped as well.
>
>
> > And IIUC,
> > vhost-VDPA and the vDPA parent are also on the device side. I feel like
> > I'm missing something essential here.
>
>
> Virtio-PCI driver could be a vDPA parent as well in this case. So we
> need stop the virtio-pci device.
Who is the "driver" in that case?
>
>
> >
> >>
> >>>
> >>>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
> >>>>>> ---
> >>>>>> content.tex | 26 ++++++++++++++++++++++++--
> >>>>>> 1 file changed, 24 insertions(+), 2 deletions(-)
> >>>>>>
> >>>>>> diff --git a/content.tex b/content.tex
> >>>>>> index 61eab41..4392b60 100644
> >>>>>> --- a/content.tex
> >>>>>> +++ b/content.tex
> >>>>>> @@ -47,6 +47,9 @@ \section{\field{Device Status} Field}\label{sec:Basic Facilities of a Virtio Dev
> >>>>>> \item[DRIVER_OK (4)] Indicates that the driver is set up and ready to
> >>>>>> drive the device.
> >>>>>>
> >>>>>> +\item[DEVICE_STOPPED (32)] When VIRTIO_F_DEVICE_STOPPED is negotiated,
> >>>>>> + indicates that the device has been stopped by the driver.
> >>>>>> +
> >>>>> AFAIU it is not only about indicating stopped, but also requesting to be
> >>>>> stopped.
> >>>>>
> >>>>> More importantly, that must not be set immediately, in a sense that the
> >>>>> one side initiates some action by requesting the bit to be set, and the
> >>>>> other side must not set the bit before the action is performed.
> >>>> Yes.
> >>>>
> >>>>
> >>>>> We also
> >>>>> seem to assume that every device implementation is capable of performing
> >>>>> this trick.
> >>>> A dedicated feature bit is introduced for this.
> >>>>
> >>> This is not about the feature bit, but about the mechanism. But your
> >>> subsequent answers explain, that this is nothing unusual, and then
> >>> we should be fine.
> >>>
> >>>>> Is it for hardware devices (e.g. PCI) standard to request an
> >>>>> operation by writing some value into a register, and get feedback bout
> >>>>> a non-completion by reading different value that written,
> >>>> This is not ununsal in other devices. And in fact, the FEATURES_OK works
> >>>> like this:
> >>>>
> >>>> """
> >>>>
> >>>> The device MUST NOT offer a feature which requires another feature which
> >>>> was not offered. The device SHOULD accept any valid subset of features
> >>>> the driver accepts, otherwise it MUST fail to set the FEATURES_OK device
> >>>> status bit when the driver writes it.
> >>>>
> >>>> """
> >>>>
> >>> Thanks for the pointer. I intend to have another look at how FEATURES_OK
> >>> works, and how similar this is to DEVICE_STOPPED.
> >>
> >> My understanding is that for both of them, driver can try to set the bit
> >> by writing to the status register but it's the device that decide when
> >> to set the bit.
> > I think there's a difference: For FEATURES_OK, the driver can read back
> > the status and knows that the feature combination is not accepted if it
> > is not set. For DEVICE_STOPPED, it does not really know whether the
> > device has started to stop yet. It only knows that the device is
> > actually done stopping when DEVICE_STOPPED is set.
>
>
> The only difference is that device may need sometime to be stopped.
> FEATURES_OK might not be the best example. Let's see how reset work
> which is more similar to DEVICE_STOPPED:
>
> """
>
> After writing 0 to device_status, the driver MUST wait for a read of
> device_status to return 0 before reinitializing the device.
>
> """
Hm, but that is PCI-specific writing of fields in the configuration
structure, right? CCW uses a different method for resetting (an
asynchronous channel command; channel commands create an interrupt when
done); "write 0 to status for reset" is not universal.
Which makes me think: is using the device status a good universal
method for stopping the device? If I look at it only from the CCW
perspective, I'd have used a new channel command with a payload of
stop/resume and only reflected the stopped state in the device status
(i.e. read-only from the driver, and only changed by the device when it
transitions to/from the stopped state.) Could PCI use a new field for
that?
>
> >
> > Do we need a different mechanism for the device to signal the driver
> > that the device has actually been stopped? If the driver sees the
> > STOPPED status after reading back, it knows that the device has
> > acknowledged the request and is now stopping down. If it is not set, it
> > means that the device has not honoured the request. Same for clearing
> > it to resume.
>
>
> I don't see much difference. Device reset doesn't have such notification
> and driver may simply poll or use a timer.
See above; CCW does not really use the device status in the same way. I
think a notification would be useful.
This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.
In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.
Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/
next prev parent reply other threads:[~2020-12-22 15:54 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-12-18 4:23 [virtio-comment] [PATCH RFC] virtio: introduce VIRTIO_F_DEVICE_STOP Jason Wang
2020-12-18 10:15 ` [virtio-comment] " Stefano Garzarella
2020-12-21 3:08 ` Jason Wang
2020-12-21 11:06 ` Stefano Garzarella
2020-12-22 2:38 ` Jason Wang
2020-12-21 21:33 ` [virtio-comment] " Halil Pasic
2020-12-22 2:36 ` Jason Wang
2020-12-22 6:50 ` Halil Pasic
2020-12-22 7:30 ` Jason Wang
2020-12-22 12:14 ` Cornelia Huck
2020-12-22 12:51 ` Jason Wang
2020-12-22 15:54 ` Cornelia Huck [this message]
2020-12-23 2:48 ` Jason Wang
2020-12-25 7:38 ` Halil Pasic
2020-12-27 10:00 ` Michael S. Tsirkin
2020-12-28 6:21 ` Halil Pasic
2020-12-28 7:01 ` Jason Wang
2020-12-28 12:30 ` Michael S. Tsirkin
2020-12-29 9:04 ` Jason Wang
2021-01-12 10:54 ` Michael S. Tsirkin
2021-01-13 3:35 ` Jason Wang
2020-12-29 13:35 ` Halil Pasic
2020-12-30 8:15 ` Jason Wang
2021-01-11 18:16 ` Cornelia Huck
2021-01-12 3:27 ` Jason Wang
2021-01-12 12:13 ` Cornelia Huck
2021-01-13 2:52 ` Jason Wang
2021-01-14 12:00 ` Cornelia Huck
2020-12-28 6:47 ` Jason Wang
2020-12-29 13:20 ` Halil Pasic
2020-12-30 8:03 ` Jason Wang
2020-12-24 4:52 ` Halil Pasic
2020-12-24 5:51 ` Jason Wang
2020-12-25 3:18 ` Halil Pasic
2020-12-25 6:45 ` Jason Wang
2020-12-27 11:12 ` Michael S. Tsirkin
2020-12-28 7:05 ` Jason Wang
2020-12-28 12:27 ` Michael S. Tsirkin
2020-12-29 8:57 ` Jason Wang
2021-05-03 9:02 ` [virtio-comment] " Eugenio Perez Martin
2021-05-06 2:51 ` Jason Wang
2021-05-05 13:16 ` Michael S. Tsirkin
2021-05-06 7:26 ` Jason Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20201222165431.3f49de29.cohuck@redhat.com \
--to=cohuck@redhat.com \
--cc=eperezma@redhat.com \
--cc=jasowang@redhat.com \
--cc=mst@redhat.com \
--cc=pasic@linux.ibm.com \
--cc=sgarzare@redhat.com \
--cc=stefanha@redhat.com \
--cc=virtio-comment@lists.oasis-open.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.