From: "Michael S. Tsirkin" <mst@redhat.com>
To: Eugenio Perez Martin <eperezma@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>,
Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
virtio-comment@lists.oasis-open.org,
Laurent Vivier <lvivier@redhat.com>, Cindy Lu <lulu@redhat.com>,
cohuck@redhat.com, alvaro.karsz@solid-run.com,
Liuxiangdong <liuxiangdong5@huawei.com>,
Gautam Dawar <gdawar@xilinx.com>,
longpeng2@huawei.com, Dragos Tatulea <dtatulea@nvidia.com>,
parav@nvidia.com, stefanha@redhat.com,
Harpreet Singh Anand <hanand@xilinx.com>,
Stefano Garzarella <sgarzare@redhat.com>,
Heng Qi <hengqi@linux.alibaba.com>,
Zhu Lingshan <lingshan.zhu@intel.com>,
Shannon Nelson <snelson@pensando.io>,
mgurtovoy@nvidia.com, si-wei.liu@oracle.com
Subject: Re: [virtio-comment] Re: [PATCH 0/2] Selective queue enabling
Date: Thu, 8 Jun 2023 18:08:39 -0400
Message-ID: <20230608180617-mutt-send-email-mst@kernel.org>
In-Reply-To: <CAJaqyWf+vn06NHcQY208dwyB=WH3PQ3A7z53o0BTy102XrT+3Q@mail.gmail.com>
On Thu, Jun 08, 2023 at 10:36:19AM +0200, Eugenio Perez Martin wrote:
> On Thu, Jun 8, 2023 at 9:19 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Thu, Jun 08, 2023 at 08:43:18AM +0200, Eugenio Perez Martin wrote:
> > > On Thu, Jun 8, 2023 at 8:04 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Thu, Jun 08, 2023 at 08:44:41AM +0800, Jason Wang wrote:
> > > > > On Thu, Jun 8, 2023 at 4:27 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Wed, Jun 07, 2023 at 11:41:39AM +0200, Eugenio Perez Martin wrote:
> > > > > > > On Wed, Jun 7, 2023 at 10:59 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Wed, Jun 07, 2023 at 10:47:12AM +0200, Eugenio Perez Martin wrote:
> > > > > > > > > On Wed, Jun 7, 2023 at 10:23 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Wed, 7 Jun 2023 07:35:58 +0200, Eugenio Perez Martin <eperezma@redhat.com> wrote:
> > > > > > > > > > > On Tue, Jun 6, 2023 at 9:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, Jun 06, 2023 at 07:55:09PM +0200, Eugenio Pérez wrote:
> > > > > > > > > > > > > > This series allows the driver to start the device (that is, set DRIVER_OK) with only
> > > > > > > > > > > > > > some queues enabled, and then enable other queues later.
> > > > > > > > > > > > >
> > > > > > > > > > > > > This is the current way to migrate net device state through control
> > > > > > > > > > > > > virtqueue, in a software assisted framework with vDPA:
> > > > > > > > > > > > > * First, only net CVQ is enabled at DRIVER_OK
> > > > > > > > > > > > > * All the control commands (mac address, mq, etc) needed for the device
> > > > > > > > > > > > > to behave the same as the source of migration are sent
> > > > > > > > > > > > > * Finally all the dataplane queues are enabled.
> > > > > > > > > > > >
> > > > > > > > > > > > In my opinion, this is somewhat problematic. Specifically, currently
> > > > > > > > > > > > devices tend to deduce how many queues are needed by looking
> > > > > > > > > > > > at the state at DRIVER_OK time.
> > > > > > > > > > > >
> > > > > > > > > > > > Question: what is wrong with enabling queues initially and then
> > > > > > > > > > > > doing a reset right after DRIVER_OK? You can even allocate
> > > > > > > > > > > > memory for just one queue (zeroing it out).
> > > > > > > > > > > >
> > > > > > > > > > > > Granted this looks kind of ugly but side-steps this problem with
> > > > > > > > > > > > no need for spec changes.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > The problem is that the rx queues can start receiving, as the guest
> > > > > > > > > > > already has buffers there.
> > > > > > > > > >
> > > > > > > > > > Can we reset the vq before filling buffers to it?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > They are passthrough from the guest so there is a window where the
> > > > > > > > > device can process rx descriptors.
> > > > > > > >
> > > > > > > > Not if there are no descriptors there.
> > > > > > > >
> > > > > > >
> > > > > > > But the migration is driven by the hypervisor, and it cannot control
> > > > > > > that. The guest will likely have rx descriptors available.
> > > > > >
> > > > > > Maybe I misunderstand. Is hypervisor driving cvq while guest is driving
> > > > > > rx queues?
> > > > >
> > > > > No, hypervisor tries to restore virtqueue states via cvq before guest can drive.
> > > >
> > > > So cvq maps to hypervisor memory?
> > > >
> > >
> > > > From the device POV, yes, the CVQ vring is in hypervisor memory, not
> > > > in the guest's. That allows the hypervisor to send the CVQ
> > > commands to restore the device state in the destination, without the
> > > guest intervention.
> > >
> > > Data vqs, on the other hand, are passthrough. The device talks
> > > directly to the guest's vrings.
> > >
> > > Currently, this is done by emulating CVQ in the host's kernel and then
> > > translating the commands in the way that best suits the vdpa device,
> > > using its vendor vdpa driver in the host. Other methods like PASID are
> > > possible too.
> > >
> > > Thanks!
> >
> > OK. So my suggestion is simple: map data vrings to a zero page in
> > hypervisor memory initially. Later reset and map to guest.
> >
>
> The idea is interesting, but we lose the net configuration in a device
> reset, so we need to send it again.
Ring reset, not device reset.
> In the case of qemu+vDPA maybe it is possible with queue_reset, like:
> * Map all the guest pages as usual.
> * Map a new zero page, forbid the guest to write on that page. Even in
> vIOMMU case, we can send all the CVQ commands before allowing the
> guest to modify mappings. The guest has no way to write on that page
> through the device as, well, dataplane is not initialized. It's the
> way DPDK shadow virtqueues worked, so it should be valid.
> * Reset the queues to the guest passthrough.
>
> I don't like the complexity of it, but I like that it requires even fewer
> changes to the device / spec.
This is exactly what I meant.
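
To make the sequence concrete, here is a rough toy model of it in Python. Everything in it (Ring, ToyDevice, restore, the queue names) is made up for illustration and is not a real vDPA or QEMU interface. It only assumes that an all-zero page reads as a ring with no available buffers, and that a ring reset leaves the configuration set through CVQ untouched while a device reset does not.

```python
from dataclasses import dataclass, field


@dataclass
class Ring:
    """Toy vring: only tracks how many buffers the driver made available."""
    owner: str          # "hypervisor" or "guest" (labels for this sketch only)
    avail_idx: int = 0  # an all-zero page reads as avail_idx == 0


@dataclass
class ToyDevice:
    """Hypothetical model of a virtio-net device; not a real API."""
    driver_ok: bool = False
    net_config: list = field(default_factory=list)  # state set through CVQ
    rings: dict = field(default_factory=dict)       # queue name -> Ring

    def enable_queue(self, name, ring):
        self.rings[name] = ring

    def reset_queue(self, name):
        # Ring reset: drops only this queue; net_config is left untouched.
        self.rings.pop(name, None)

    def reset_device(self):
        # Device reset: everything is lost, including net_config.
        self.driver_ok = False
        self.net_config.clear()
        self.rings.clear()

    def send_cvq(self, command):
        assert self.driver_ok and "cvq" in self.rings
        self.net_config.append(command)

    def rx_can_fill(self, name):
        ring = self.rings.get(name)
        return self.driver_ok and ring is not None and ring.avail_idx > 0


def restore(dev, guest_rings, commands):
    # 1. CVQ lives in hypervisor memory; data vrings point at a zero page.
    dev.enable_queue("cvq", Ring(owner="hypervisor"))
    for name in guest_rings:
        dev.enable_queue(name, Ring(owner="hypervisor", avail_idx=0))
    dev.driver_ok = True
    # With empty rings, no data queue can consume a buffer at this point.
    idle = not any(dev.rx_can_fill(name) for name in guest_rings)
    # 2. Replay the net configuration while the dataplane is idle.
    for command in commands:
        dev.send_cvq(command)
    # 3. Ring-reset each data queue and re-enable it on the guest's vring.
    for name, ring in guest_rings.items():
        dev.reset_queue(name)
        dev.enable_queue(name, ring)
    return idle


dev = ToyDevice()
guest = {"rx0": Ring("guest", avail_idx=256), "tx0": Ring("guest")}
idle_during_restore = restore(dev, guest, ["mac", "mq"])
```

In this model the rx queue stays idle until step 3 even though the guest has already posted buffers, and the CVQ state survives because only the rings are reset.
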
> >
> > If that does not work, then I am not sure this proposal is enough
> > since I think devices want to have a specific point in time
> > where they know which queues are going to be used.
>
> In the case of net, this should not be a problem since the spec
> mandates 2 if !cvq, 3 if cvq but !mq, and max_virtqueue_pairs*2+1 if
> cvq and mq. virtio-blk also has num_queues.
That is the maximum, not how many are actually used in practice.
> Is it even valid to not enable some of the queues?
Yes, and Linux will do that if max_virtqueue_pairs > #CPUs.
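
As a small illustration of the difference between the two numbers, here is a sketch in Python. The function names are hypothetical, and taking the minimum of the pair count and the CPU count is only a simplified reading of the Linux behaviour mentioned above, not a description of the actual driver code.

```python
def max_virtqueues(has_cvq, has_mq, max_virtqueue_pairs=1):
    """Upper bound on virtio-net virtqueues, following the rule quoted above."""
    if not has_cvq:
        return 2
    if not has_mq:
        return 3
    return max_virtqueue_pairs * 2 + 1


def pairs_in_use(max_virtqueue_pairs, num_cpus):
    """Assumed driver policy: never use more queue pairs than there are CPUs."""
    return min(max_virtqueue_pairs, num_cpus)


# A device offering 8 pairs to a 4-CPU guest: 17 queues exist, but only
# 4 pairs (plus the CVQ) would be used under the assumed policy.
total = max_virtqueues(has_cvq=True, has_mq=True, max_virtqueue_pairs=8)
used = pairs_in_use(max_virtqueue_pairs=8, num_cpus=4)
```

So the device cannot infer from the maximum alone which queues the driver is going to use.
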
> I've always felt that queue_enable has been redundant before
> queue_reset for this reason actually.
>
> > Maybe we could use e.g. bit 1 in queue_enable to signal that?
> >
>
> I'm totally ok to go in that direction.
>
> Thanks!
>
> >
> > > >
> > > >
> > > > > So in this case if RX queues are enabled at the same time, the device
> > > > > may try to queue packets to queue 0.
> > > > >
> > > > > Thanks
> > > > >
> > > > > > How do you do this - they are DMA from same VF no?
> > > > > >
> > > > > >
> > > > > > > >
> > > > > > > > > > > Apart from that, the back and forth
> > > > > > > > > > > introduces latencies.
> > > > > > > > > > >
> > > > > > > > > > > Maybe a better angle is to start all the queues as if they're reset,
> > > > > > > > > > > write 1 just to CVQ, configure the device, and then write 1 to all
> > > > > > > > > > > dataplane vqs?
> > > > > > > > > >
> > > > > > > > > > write to what?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Sorry I was unclear, I mean to enable the vqs writing 1 to queue_enable.
> > > > > > > > >
> > > > > > > > > Thanks!
> > > > > > > > >
> > > > > > > > > > Thanks.
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Thanks!
> > > > > > > > > > >
> > > > > > > > > > > > > Eugenio Pérez (2):
> > > > > > > > > > > > > virtio: introduce selective queue enabling
> > > > > > > > > > > > > virtio: pci support virtqueue selective enabling
> > > > > > > > > > > > >
> > > > > > > > > > > > > content.tex | 15 +++++++++++++--
> > > > > > > > > > > > > transport-pci.tex | 4 ++++
> > > > > > > > > > > > > 2 files changed, 17 insertions(+), 2 deletions(-)
> > > > > > > > > > > > >
> > > > > > > > > > > > > --
> > > > > > > > > > > > > 2.31.1
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > This publicly archived list offers a means to provide input to the
> > > > > > > > > > OASIS Virtual I/O Device (VIRTIO) TC.
> > > > > > > > > >
> > > > > > > > > > In order to verify user consent to the Feedback License terms and
> > > > > > > > > > to minimize spam in the list archive, subscription is required
> > > > > > > > > > before posting.
> > > > > > > > > >
> > > > > > > > > > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > > > > > > > > > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > > > > > > > > > List help: virtio-comment-help@lists.oasis-open.org
> > > > > > > > > > List archive: https://lists.oasis-open.org/archives/virtio-comment/
> > > > > > > > > > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > > > > > > > > > List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> > > > > > > > > > Committee: https://www.oasis-open.org/committees/virtio/
> > > > > > > > > > Join OASIS: https://www.oasis-open.org/join/
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> >