From: Cornelia Huck <cohuck@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: Halil Pasic <pasic@linux.ibm.com>,
stefanha@redhat.com, virtio-comment@lists.oasis-open.org,
mst@redhat.com, eperezma@redhat.com, sgarzare@redhat.com
Subject: Re: [virtio-comment] [PATCH RFC] virtio: introduce VIRTIO_F_DEVICE_STOP
Date: Tue, 22 Dec 2020 16:54:31 +0100 [thread overview]
Message-ID: <20201222165431.3f49de29.cohuck@redhat.com> (raw)
In-Reply-To: <7da54d5b-0787-c78f-4b35-6a4f7ed2f5bf@redhat.com>
On Tue, 22 Dec 2020 20:51:04 +0800
Jason Wang <jasowang@redhat.com> wrote:
> On 2020/12/22 下午8:14, Cornelia Huck wrote:
> > On Tue, 22 Dec 2020 15:30:31 +0800
> > Jason Wang <jasowang@redhat.com> wrote:
> >
> >> On 2020/12/22 下午2:50, Halil Pasic wrote:
> >>> On Tue, 22 Dec 2020 10:36:41 +0800
> >>> Jason Wang <jasowang@redhat.com> wrote:
> >>>
> >>>> On 2020/12/22 上午5:33, Halil Pasic wrote:
> >>>>> On Fri, 18 Dec 2020 12:23:02 +0800
> >>>>> Jason Wang <jasowang@redhat.com> wrote:
> >>>>>
> >>>>>> This patch introduces a new status bit DEVICE_STOPPED. This will be
> >>>>>> used by the driver to stop and resume a device. The main user will be
> >>>>>> live migration support for virtio device.
> >>>>>>
> >>>>> Can you please provide some more background information, or point
> >>>>> me to the appropriate discussion?
> >>>>>
> >>>>> I mean AFAIK migration already works without this driver initiated
> >>>>> drain. What is the exact motivation? What about the big picture? I
> >>>>> guess some agent in the guest would have to make the driver issue
> >>>>> the DEVICE_STOP.
> >>>> This is not necessary if the datapath is done inside qemu and when
> >>>> migration is initiated by qemu itself.
> >>>>
> >>>> But it's a must for using virtio-device as a backend for emulated virtio
> >>>> devices (e.g vhost-vDPA). In this case, qemu needs to stop the device
> >>>> then it can safely synchronize the state from them.
> >>>>
> >>> You say, in this case qemu needs to stop the device, which makes sense
> >>> (it also has to do this when the datapath is implemented in qemu), but
> >>> AFAIU DEVICE_STOPPED is initiated by the guest and not by qemu. I'm
> >>> confused.
> >>
> >> It's initiated by Qemu. Guest is unware of live migration.
> > But isn't setting DEVICE_STOPPED a _driver_ initiated process? That
> > sounds like "guest" to me.
>
>
> I think we need clarification on the "driver".
Looking through the spec, I think we have never properly defined what
we mean by "device" or "driver". Not sure what a good definition would
look like. I would consider a "device" an entity that is present and is
configured/used by a "driver". The "driver" is the means to actually
configure/use a "device", i.e. the "initiating" part (even though some
actions are initiated by the device once the driver has started
interacting with it).
> Think the following setup:
>
> Guest driver <-> Qemu <-> vhost-vdpa <-> vDPA parent (virtio-pci vDPA
> driver)
>
> It's the Qemu that initiates the stop, and the request is forwarded to
> virtio-pci vDPA driver. So the process is transparent to the guest (driver).
So it's
Guest driver <-> QEMU <-> vhost-vdpa <-> vDPA parent
|<-- guest -->|<-------------- host -------------->|
but the vDPA parent is itself acting as a "driver" if you take my
attempt at a definition above?
(Wouldn't that mean that the device in QEMU is configured/used from two
entities?)
>
>
> >
> >>
> >>> I'm still curious about how the different components in the stack
> >>> (guest OS, qemu, vdpa-vhost in host kernel, the PCI function) are
> >>> supposed to interact.
> >>
> >> It works like:
> >>
> >> From Qemu point of view, vhost-vDPA is just another type of vhost
> >> backend. Qemu needs to stop virtio (vhost) before it can do migration.
> >> So we require vDPA devices to have the ability of stopping or pausing
> >> its datapath. If the vDPA device is by chance the virtio-PCI device, it
> >> needs an interface for receiving stop/resume command from the driver.
> >>
> >> So the devce stop/resume command was sent from Qemu to vhost-VDPA, then
> >> to vDPA parent which could be a virtio-PCI device in this case.
> > But QEMU implements the _device_, not the driver, doesn't it?
>
>
> The device is implemented with the co-operation between Qemu and
> vhost-vDPA. During migration qemu need to stop the virtio-net device,
> then vhost must be stopped as well.
Yes, it makes sense that any vhost parts need to be stopped as well.
>
>
> > And IIUC,
> > vhost-VDPA and the vDPA parent are also on the device side. I feel like
> > I'm missing something essential here.
>
>
> Virtio-PCI driver could be a vDPA parent as well in this case. So we
> need stop the virtio-pci device.
Who is the "driver" in that case?
>
>
> >
> >>
> >>>
> >>>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
> >>>>>> ---
> >>>>>> content.tex | 26 ++++++++++++++++++++++++--
> >>>>>> 1 file changed, 24 insertions(+), 2 deletions(-)
> >>>>>>
> >>>>>> diff --git a/content.tex b/content.tex
> >>>>>> index 61eab41..4392b60 100644
> >>>>>> --- a/content.tex
> >>>>>> +++ b/content.tex
> >>>>>> @@ -47,6 +47,9 @@ \section{\field{Device Status} Field}\label{sec:Basic Facilities of a Virtio Dev
> >>>>>> \item[DRIVER_OK (4)] Indicates that the driver is set up and ready to
> >>>>>> drive the device.
> >>>>>>
> >>>>>> +\item[DEVICE_STOPPED (32)] When VIRTIO_F_DEVICE_STOPPED is negotiated,
> >>>>>> + indicates that the device has been stopped by the driver.
> >>>>>> +
> >>>>> AFAIU it is not only about indicating stopped, but also requesting to be
> >>>>> stopped.
> >>>>>
> >>>>> More importantly, that must not be set immediately, in a sense that the
> >>>>> one side initiates some action by requesting the bit to be set, and the
> >>>>> other side must not set the bit before the action is performed.
> >>>> Yes.
> >>>>
> >>>>
> >>>>> We also
> >>>>> seem to assume that every device implementation is capable of performing
> >>>>> this trick.
> >>>> A dedicated feature bit is introduced for this.
> >>>>
> >>> This is not about the feature bit, but about the mechanism. But your
> >>> subsequent answers explain, that this is nothing unusual, and then
> >>> we should be fine.
> >>>
> >>>>> Is it for hardware devices (e.g. PCI) standard to request an
> >>>>> operation by writing some value into a register, and get feedback bout
> >>>>> a non-completion by reading different value that written,
> >>>> This is not ununsal in other devices. And in fact, the FEATURES_OK works
> >>>> like this:
> >>>>
> >>>> """
> >>>>
> >>>> The device MUST NOT offer a feature which requires another feature which
> >>>> was not offered. The device SHOULD accept any valid subset of features
> >>>> the driver accepts, otherwise it MUST fail to set the FEATURES_OK device
> >>>> status bit when the driver writes it.
> >>>>
> >>>> """
> >>>>
> >>> Thanks for the pointer. I intend to have another look at how FEATURES_OK
> >>> works, and how similar this is to DEVICE_STOPPED.
> >>
> >> My understanding is that for both of them, driver can try to set the bit
> >> by writing to the status register but it's the device that decide when
> >> to set the bit.
> > I think there's a difference: For FEATURES_OK, the driver can read back
> > the status and knows that the feature combination is not accepted if it
> > is not set. For DEVICE_STOPPED, it does not really know whether the
> > device has started to stop yet. It only knows that the device is
> > actually done stopping when DEVICE_STOPPED is set.
>
>
> The only difference is that device may need sometime to be stopped.
> FEATURES_OK might not be the best example. Let's see how reset work
> which is more similar to DEVICE_STOPPED:
>
> """
>
> After writing 0 to device_status, the driver MUST wait for a read of
> device_status to return 0 before reinitializing the device.
>
> """
Hm, but that is PCI-specific writing of fields in the configuration
structure, right? CCW uses a different method for resetting (an
asynchronous channel command; channel commands create an interrupt when
done); "write 0 to status for reset" is not universal.
Which makes me think: is using the device status a good universal
method for stopping the device? If I look at it only from the CCW
perspective, I'd have used a new channel command with a payload of
stop/resume and only reflected the stopped state in the device status
(i.e. read-only from the driver, and only changed by the device when it
transitions to/from the stopped state.) Could PCI use a new field for
that?
>
> >
> > Do we need a different mechanism for the device to signal the driver
> > that the device has actually been stopped? If the driver sees the
> > STOPPED status after reading back, it knows that the device has
> > acknowledged the request and is now stopping down. If it is not set, it
> > means that the device has not honoured the request. Same for clearing
> > it to resume.
>
>
> I don't see much difference. Device reset doesn't have such notification
> and driver may simply poll or use a timer.
See above; CCW does not really use the device status in the same way. I
think a notification would be useful.
This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.
In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.
Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/
next prev parent reply other threads:[~2020-12-22 15:54 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-12-18 4:23 [virtio-comment] [PATCH RFC] virtio: introduce VIRTIO_F_DEVICE_STOP Jason Wang
2020-12-18 10:15 ` [virtio-comment] " Stefano Garzarella
2020-12-21 3:08 ` Jason Wang
2020-12-21 11:06 ` Stefano Garzarella
2020-12-22 2:38 ` Jason Wang
2020-12-21 21:33 ` [virtio-comment] " Halil Pasic
2020-12-22 2:36 ` Jason Wang
2020-12-22 6:50 ` Halil Pasic
2020-12-22 7:30 ` Jason Wang
2020-12-22 12:14 ` Cornelia Huck
2020-12-22 12:51 ` Jason Wang
2020-12-22 15:54 ` Cornelia Huck [this message]
2020-12-23 2:48 ` Jason Wang
2020-12-25 7:38 ` Halil Pasic
2020-12-27 10:00 ` Michael S. Tsirkin
2020-12-28 6:21 ` Halil Pasic
2020-12-28 7:01 ` Jason Wang
2020-12-28 12:30 ` Michael S. Tsirkin
2020-12-29 9:04 ` Jason Wang
2021-01-12 10:54 ` Michael S. Tsirkin
2021-01-13 3:35 ` Jason Wang
2020-12-29 13:35 ` Halil Pasic
2020-12-30 8:15 ` Jason Wang
2021-01-11 18:16 ` Cornelia Huck
2021-01-12 3:27 ` Jason Wang
2021-01-12 12:13 ` Cornelia Huck
2021-01-13 2:52 ` Jason Wang
2021-01-14 12:00 ` Cornelia Huck
2020-12-28 6:47 ` Jason Wang
2020-12-29 13:20 ` Halil Pasic
2020-12-30 8:03 ` Jason Wang
2020-12-24 4:52 ` Halil Pasic
2020-12-24 5:51 ` Jason Wang
2020-12-25 3:18 ` Halil Pasic
2020-12-25 6:45 ` Jason Wang
2020-12-27 11:12 ` Michael S. Tsirkin
2020-12-28 7:05 ` Jason Wang
2020-12-28 12:27 ` Michael S. Tsirkin
2020-12-29 8:57 ` Jason Wang
2021-05-03 9:02 ` [virtio-comment] " Eugenio Perez Martin
2021-05-06 2:51 ` Jason Wang
2021-05-05 13:16 ` Michael S. Tsirkin
2021-05-06 7:26 ` Jason Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20201222165431.3f49de29.cohuck@redhat.com \
--to=cohuck@redhat.com \
--cc=eperezma@redhat.com \
--cc=jasowang@redhat.com \
--cc=mst@redhat.com \
--cc=pasic@linux.ibm.com \
--cc=sgarzare@redhat.com \
--cc=stefanha@redhat.com \
--cc=virtio-comment@lists.oasis-open.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox