From: "Michael S. Tsirkin" <mst@redhat.com>
To: Tiwei Bie <tiwei.bie@intel.com>
Cc: Jason Wang <jasowang@redhat.com>,
wexu@redhat.com, virtualization@lists.linux-foundation.org,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
jfreimann@redhat.com
Subject: Re: [RFC v2] virtio: support packed ring
Date: Tue, 24 Apr 2018 04:43:22 +0300 [thread overview]
Message-ID: <20180424044250-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <20180424013747.hmr4ytqe7gdqfhdt@debian>
On Tue, Apr 24, 2018 at 09:37:47AM +0800, Tiwei Bie wrote:
> On Tue, Apr 24, 2018 at 04:29:51AM +0300, Michael S. Tsirkin wrote:
> > On Tue, Apr 24, 2018 at 09:16:38AM +0800, Tiwei Bie wrote:
> > > On Tue, Apr 24, 2018 at 04:05:07AM +0300, Michael S. Tsirkin wrote:
> > > > On Tue, Apr 24, 2018 at 08:54:52AM +0800, Jason Wang wrote:
> > > > >
> > > > >
> > > > > On 2018年04月23日 17:29, Tiwei Bie wrote:
> > > > > > On Mon, Apr 23, 2018 at 01:42:14PM +0800, Jason Wang wrote:
> > > > > > > On 2018年04月01日 22:12, Tiwei Bie wrote:
> > > > > > > > Hello everyone,
> > > > > > > >
> > > > > > > > This RFC implements packed ring support for virtio driver.
> > > > > > > >
> > > > > > > > The code was tested with DPDK vhost (testpmd/vhost-PMD) implemented
> > > > > > > > by Jens at http://dpdk.org/ml/archives/dev/2018-January/089417.html
> > > > > > > > Minor changes are needed for the vhost code, e.g. to kick the guest.
> > > > > > > >
> > > > > > > > TODO:
> > > > > > > > - Refinements and bug fixes;
> > > > > > > > - Split into small patches;
> > > > > > > > - Test indirect descriptor support;
> > > > > > > > - Test/fix event suppression support;
> > > > > > > > - Test devices other than net;
> > > > > > > >
> > > > > > > > RFC v1 -> RFC v2:
> > > > > > > > - Add indirect descriptor support - compile test only;
> > > > > > > > - Add event suppression supprt - compile test only;
> > > > > > > > - Move vring_packed_init() out of uapi (Jason, MST);
> > > > > > > > - Merge two loops into one in virtqueue_add_packed() (Jason);
> > > > > > > > - Split vring_unmap_one() for packed ring and split ring (Jason);
> > > > > > > > - Avoid using '%' operator (Jason);
> > > > > > > > - Rename free_head -> next_avail_idx (Jason);
> > > > > > > > - Add comments for virtio_wmb() in virtqueue_add_packed() (Jason);
> > > > > > > > - Some other refinements and bug fixes;
> > > > > > > >
> > > > > > > > Thanks!
> > > > > > > >
> > > > > > > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > > > > > > > ---
> > > > > > > > drivers/virtio/virtio_ring.c | 1094 +++++++++++++++++++++++++++++-------
> > > > > > > > include/linux/virtio_ring.h | 8 +-
> > > > > > > > include/uapi/linux/virtio_config.h | 12 +-
> > > > > > > > include/uapi/linux/virtio_ring.h | 61 ++
> > > > > > > > 4 files changed, 980 insertions(+), 195 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > > index 71458f493cf8..0515dca34d77 100644
> > > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > > @@ -58,14 +58,15 @@
> > > > > > > [...]
> > > > > > >
> > > > > > > > +
> > > > > > > > + if (vq->indirect) {
> > > > > > > > + u32 len;
> > > > > > > > +
> > > > > > > > + desc = vq->desc_state[head].indir_desc;
> > > > > > > > + /* Free the indirect table, if any, now that it's unmapped. */
> > > > > > > > + if (!desc)
> > > > > > > > + goto out;
> > > > > > > > +
> > > > > > > > + len = virtio32_to_cpu(vq->vq.vdev,
> > > > > > > > + vq->vring_packed.desc[head].len);
> > > > > > > > +
> > > > > > > > + BUG_ON(!(vq->vring_packed.desc[head].flags &
> > > > > > > > + cpu_to_virtio16(vq->vq.vdev, VRING_DESC_F_INDIRECT)));
> > > > > > > It looks to me spec does not force to keep VRING_DESC_F_INDIRECT here. So we
> > > > > > > can safely remove this BUG_ON() here.
> > > > > > >
> > > > > > > > + BUG_ON(len == 0 || len % sizeof(struct vring_packed_desc));
> > > > > > > Len could be ignored for used descriptor according to the spec, so we need
> > > > > > > remove this BUG_ON() too.
> > > > > > Yeah, you're right! The BUG_ON() isn't right. I'll remove it.
> > > > > > And I think something related to this in the spec isn't very
> > > > > > clear currently.
> > > > > >
> > > > > > In the spec, there are below words:
> > > > > >
> > > > > > https://github.com/oasis-tcs/virtio-spec/blob/d4fec517dfcf/packed-ring.tex#L272
> > > > > > """
> > > > > > In descriptors with VIRTQ_DESC_F_INDIRECT set VIRTQ_DESC_F_WRITE
> > > > > > is reserved and is ignored by the device.
> > > > > > """
> > > > > >
> > > > > > So when device writes back an used descriptor in this case,
> > > > > > device may not set the VIRTQ_DESC_F_WRITE flag as the flag
> > > > > > is reserved and should be ignored.
> > > > > >
> > > > > > https://github.com/oasis-tcs/virtio-spec/blob/d4fec517dfcf/packed-ring.tex#L170
> > > > > > """
> > > > > > Element Length is reserved for used descriptors without the
> > > > > > VIRTQ_DESC_F_WRITE flag, and is ignored by drivers.
> > > > > > """
> > > > > >
> > > > > > And this is the way how driver ignores the `len` in an used
> > > > > > descriptor.
> > > > > >
> > > > > > https://github.com/oasis-tcs/virtio-spec/blob/d4fec517dfcf/packed-ring.tex#L241
> > > > > > """
> > > > > > To increase ring capacity the driver can store a (read-only
> > > > > > by the device) table of indirect descriptors anywhere in memory,
> > > > > > and insert a descriptor in the main virtqueue (with \field{Flags}
> > > > > > bit VIRTQ_DESC_F_INDIRECT on) that refers to a buffer element
> > > > > > containing this indirect descriptor table;
> > > > > > """
> > > > > >
> > > > > > So the indirect descriptors in the table are read-only by
> > > > > > the device. And the only descriptor which is writeable by
> > > > > > the device is the descriptor in the main virtqueue (with
> > > > > > Flags bit VIRTQ_DESC_F_INDIRECT on). So if we ignore the
> > > > > > `len` in this descriptor, we won't be able to get the
> > > > > > length of the data written by the device.
> > > > > >
> > > > > > So I think the `len` in this descriptor will carry the
> > > > > > length of the data written by the device (if the buffers
> > > > > > are writable to the device) even if the VIRTQ_DESC_F_WRITE
> > > > > > isn't set by the device. How do you think?
> > > > >
> > > > > Yes I think so. But we'd better need clarification from Michael.
> > > >
> > > > I think if you use a descriptor, and you want to supply len
> > > > to guest, you set VIRTQ_DESC_F_WRITE in the used descriptor.
> > > > Spec also says you must not set VIRTQ_DESC_F_INDIRECT then.
> > > > If that's a problem we could look at relaxing that last requirement -
> > > > does driver want INDIRECT in used descriptor to match
> > > > the value in the avail descriptor for some reason?
> > >
> > > For indirect, driver needs some way to get the length
> > > of the data written by the driver. And the descriptors
> > > in the indirect table is read-only, so the only place
> > > device could put this value is the descriptor with the
> > > VIRTQ_DESC_F_INDIRECT flag set.
> >
> > when writing out used descriptor, device should set VIRTQ_DESC_F_WRITE
> > (and clear VIRTQ_DESC_F_INDIRECT).
>
> So the spec allows device to set VIRTQ_DESC_F_WRITE bit
> when writing out an used descriptor even if the corresponding
> descriptors it just parsed don't have the VIRTQ_DESC_F_WRITE
> bit set?
>
> Best regards,
> Tiwei Bie
I think so. In a used descriptor, VIRTQ_DESC_F_WRITE just means length
is valid.
--
MST
next prev parent reply other threads:[~2018-04-24 1:43 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-04-01 14:12 [RFC v2] virtio: support packed ring Tiwei Bie
2018-04-10 2:55 ` Jason Wang
2018-04-10 3:21 ` Tiwei Bie
2018-04-10 3:21 ` Tiwei Bie
2018-04-10 2:55 ` Jason Wang
2018-04-13 4:30 ` Jason Wang
2018-04-13 7:15 ` Tiwei Bie
2018-04-13 7:15 ` Tiwei Bie
2018-04-17 2:11 ` Jason Wang
2018-04-17 2:17 ` Michael S. Tsirkin
2018-04-17 2:17 ` Michael S. Tsirkin
2018-04-17 2:24 ` Jason Wang
2018-04-17 2:37 ` Michael S. Tsirkin
2018-04-17 2:37 ` Michael S. Tsirkin
2018-04-17 2:24 ` Jason Wang
2018-04-17 2:51 ` Tiwei Bie
2018-04-17 2:51 ` Tiwei Bie
2018-04-17 12:17 ` Michael S. Tsirkin
2018-04-17 12:17 ` Michael S. Tsirkin
2018-04-17 12:47 ` Tiwei Bie
2018-04-17 12:47 ` Tiwei Bie
2018-04-17 14:04 ` Michael S. Tsirkin
2018-04-17 14:56 ` Tiwei Bie
2018-04-17 14:56 ` Tiwei Bie
2018-04-17 15:54 ` Michael S. Tsirkin
2018-04-17 15:54 ` Michael S. Tsirkin
2018-04-18 1:17 ` Tiwei Bie
2018-04-18 1:17 ` Tiwei Bie
2018-04-17 14:04 ` Michael S. Tsirkin
2018-04-17 2:11 ` Jason Wang
2018-04-13 4:30 ` Jason Wang
2018-04-13 15:22 ` Michael S. Tsirkin
2018-04-13 15:22 ` Michael S. Tsirkin
2018-04-14 11:22 ` Tiwei Bie
2018-04-14 11:22 ` Tiwei Bie
2018-04-23 5:42 ` Jason Wang
2018-04-23 9:29 ` Tiwei Bie
2018-04-24 0:54 ` Jason Wang
2018-04-24 0:54 ` Jason Wang
2018-04-24 1:05 ` Michael S. Tsirkin
2018-04-24 1:14 ` Jason Wang
2018-04-24 1:14 ` Jason Wang
2018-04-24 1:16 ` Tiwei Bie
2018-04-24 1:29 ` Michael S. Tsirkin
2018-04-24 1:37 ` Tiwei Bie
2018-04-24 1:37 ` Tiwei Bie
2018-04-24 1:43 ` Michael S. Tsirkin
2018-04-24 1:43 ` Michael S. Tsirkin [this message]
2018-04-24 1:49 ` Tiwei Bie
2018-04-24 1:49 ` Tiwei Bie
2018-04-24 1:29 ` Michael S. Tsirkin
2018-04-24 1:16 ` Tiwei Bie
2018-04-24 1:05 ` Michael S. Tsirkin
2018-04-23 9:29 ` Tiwei Bie
2018-04-23 5:42 ` Jason Wang
-- strict thread matches above, loose matches on Subject: below --
2018-04-01 14:12 Tiwei Bie
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180424044250-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=jasowang@redhat.com \
--cc=jfreimann@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=tiwei.bie@intel.com \
--cc=virtualization@lists.linux-foundation.org \
--cc=wexu@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.