From: "Michael S. Tsirkin" <mst@redhat.com>
To: Christian Schoenebeck <qemu_oss@crudebyte.com>
Cc: virtio-comment@lists.oasis-open.org,
Cornelia Huck <cohuck@redhat.com>,
Stefan Hajnoczi <stefanha@redhat.com>, Greg Kurz <groug@kaod.org>,
Dominique Martinet <asmadeus@codewreck.org>,
Halil Pasic <pasic@linux.ibm.com>
Subject: Re: [virtio-comment] [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE
Date: Wed, 23 Mar 2022 08:35:00 -0400 [thread overview]
Message-ID: <20220323081402-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <2085227.9aJWpeLndy@silver>
On Wed, Mar 23, 2022 at 11:20:07AM +0100, Christian Schoenebeck wrote:
> On Montag, 21. März 2022 23:13:55 CET Michael S. Tsirkin wrote:
> > On Mon, Mar 21, 2022 at 10:23:07AM +0100, Christian Schoenebeck wrote:
> > > On Sonntag, 20. März 2022 22:52:16 CET Michael S. Tsirkin wrote:
> > > > On Sun, Mar 20, 2022 at 06:43:53PM +0100, Christian Schoenebeck wrote:
> > > > > To be honest, I don't feel like discussing precise wordings at this
> > > > > point
> > > > > when you are questioning the proposal on design level already.
> > > > >
> > > > > Maybe you make some more thorough thoughts on what you actually want
> > > > > this
> > > > > to be on design level before continueing to argue about precise
> > > > > terminology, which you are not using either BTW when you articulating
> > > > > your criticism.
> > > > >
> > > > > Or even better: come up with your own proposol with the precise
> > > > > wording
> > > > > you
> > > > > feel appropriate.
> > > >
> > > > OK let's go back and agree on what we are trying to achieve. The github
> > > > issue and the cover letter imply that while indirect descriptors would
> > > > normally allow huge tables, we artificially limit them to queue size,
> > > > and you want to be able to relax that.
> > >
> > > Correct, that's my motivation for all of this.
> >
> > Okay. So I think that given this, we can limit the total number
> > of non-indirect descriptors, including non-indirect ones
> > in a chain + all the ones in indirect pointer table if any,
> > and excluding the indirect descriptor itself, and this
> > will address the issue you are describing here, right?
>
> As far as I understand your suggestion, yes, it would cover:
>
> A. My use case [1]: allowing indirect table length > queue size.
> B. Stefan's use case [2]: forcing indirect table length < queue size.
> C. Your use case: forcing chain (within FIFOs) length < queue size.
>
> [1] https://lore.kernel.org/netdev/cover.1640870037.git.linux_oss@crudebyte.com/
> [2] https://lore.kernel.org/all/YXpmwP6RtvY0BmSM@stefanha-x1.localdomain/
>
> However it would not cover:
>
> D. Your other use case: blocking indirect tables.
Well, the practical use-case I have in mind is actually for when using
mergeable buffers with the network device, in practice there's never a
s/g but there's no way for device to know that. So maybe we can declare
that indirect must not be used when there's a single descriptor in the
table (esp when the new feature is accepted), and then just limiting s/g
to 1 will do the trick in this specific case.
> E. Potential other people's need: max. chain length (within FIFOs) != max.
> indirect table length.
I don't understand this one, sorry.
> It's not clear to me though why you would want to exclude the descriptor
> pointing to a table from counting towards that limit.
Because it's a transient thing. It does not need to be processed
with a packet, it's just a way of supplying a longer s/g.
In other words what devices care about is the s/g list length,
bot how the s/g is specified.
I think what QEMU and Linux both do is very typical:
the indirect descriptor is used to fetch the table,
but then what QEMU cares about is whether it will overflow 1024
entries when preparing the iovec for linux.
> Instead of yet again mixing different limits again with each other, what about
> using 3 distinct fields for those limits instead:
>
> 1. max. indirect table length (any descriptor in the table)
> 2. max. chain length (currently: descriptors within FIFOs)
> 3. max. total descriptors per buffer (sum of all descriptors, both from FIFOs
> and indirect tables, including descriptor pointing to a table).
So this doesn't actually address at least what you called "your
use-case" above. Well it almost does, but not exactly because
of the extra indirect descriptor.
It's not that I'm against this kind of thing but personally
I'd like a bit more than a theoretical argument for
why all the extra complexity is helpful - remember that
all that needs to be coded in the driver, and any mistakes
often tend to paint us in the corner because of the need to
support existing drivers.
> As a side note: the spec currently refers to "table of indirect descriptors",
> hence my previous assumption that descriptors in such a table should be called
> "indirect descriptors", whereas you are apparently assuming that an "indirect
> descriptor" is only that single one pointing to a table and you see the
> descriptors in the table as "direct" ones I guess.
Oh. You are right. There's no special name for what I mistakenly
called an indirect descriptor. So what I called
"direct descriptors" are really "read/write descriptors".
It might make sense to add some text calling these something.
Scatter/gather descriptors maybe?
And what I called "indirect descriptor" is really just
descriptor with INDIRECT flag set. I am not sure we need
to call these anything unless we really start counting it.
> > Question:
> > - I am thinking about bi-directional descriptors such as
> > block and scsi have. it looks like they have a separate limit for
> > read and write parts of the chain. Should we have two limits
> > then? Or should we just make driver use the lower of the two,
> > i.e. the per vq limit applies to the total # of elements
> > in a buffer, and read/write sgs are controlled separately
> > by the per device control?
> >
> > Stefan, any comments?
>
> Leaving that question to Stefan. No opinion from my side at this point.
>
> > > > Fair enough.
> > > >
> > > > However, I feel trying to talk about indirect descriptor is too narrow a
> > > > use-case, simply because the issue is not indirect at all. Why do we
> > > > limit number of segments? I think it's really because of backend
> > > > limitations. And indirect is only used by the frontend. So limiting
> > > > that is really going about it wrong.
> > >
> > > I am only aware about current implementation situation in QEMU and Linux
> > > kernel. As for those two: yes, it is not a limitation on Linux kernel
> > > side,
> > > but on QEMU side.
> > >
> > > As for other implementations: no idea.
> > >
> > > > So block for example has seg_max already. What should happen
> > > > if that exceeds queue size is not defined.
> > > >
> > > > So maybe we can generalize that making it device independent?
> > > > The litmus paper for this is the block and scsi devices,
> > > > we should be able to use the new feature as a super-set.
> > > >
> > > > Before we discuss solutions, did I formulate the problem correctly?
> > >
> > > Keep in mind that I never worked on virtio code or virtio spec before. I
> > > just started to review virtio implementation of QEMU and Linux kernel and
> > > the virtio spec in November, specifically in context of 9p. I definitely
> > > don't know all the other virtioo device classes out there.
> >
> > I think it's great that we have someone taking care of 9p btw!
>
> Thanks!
>
> > > In other words: I can't help you on fitting this appropriately into a
> > > superset picture.
> > >
> > > Best regards,
> > > Christian Schoenebeck
> >
> > I think we need some cleanups in the spec to make what you are trying to
> > do possible to do cleanly, specifically move this description:
> >
> > A buffer consists of zero or more device-readable physically-contiguous
> > elements followed by zero or more physically-contiguous
> > device-writable elements (each buffer has at least one element).
> >
> > out to the generic part from packed ring part, drop corresponding
> > text from split ring and make it refer to the term "element".
> >
> > After this, the new field will just be "a number of elements per
> > buffer" (note that indirect descriptors do not themselves
> > describe elements and so won't be included in the math).
> >
> > Christian, you mentioned you don't like the term buffer generally,
> > changing that can be done before or after this feature but IMHO
> > best not as part of it.
>
> For pragmatic reasons I will refrain from questioning any virtio terms in
> foreseeable future and will just use the ones suggested by people.
It's not that the terminology is so great, but we do have used it for a
while and by now it's quite entrenched.
> > I think it's good in that it will fit better in the superset picture
> > addressing in addition to your requirement also the requirement to have
> > huge rings while limiting descriptors.
> >
> > Does above sound like it addresses your requirements of having
> > a longer descriptor chain than queue size? If not what is not
> > addressed?
> >
> > Thanks,
>
next prev parent reply other threads:[~2022-03-23 12:35 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-16 13:44 [PATCH v3 0/4] Add VIRTIO_RING_F_INDIRECT_SIZE and queue_indirect_size Christian Schoenebeck
2022-03-16 13:47 ` [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE Christian Schoenebeck
2022-03-17 13:40 ` [virtio-comment] " Cornelia Huck
2022-03-18 10:45 ` Christian Schoenebeck
2022-03-18 16:03 ` [virtio-comment] " Cornelia Huck
2022-03-19 9:33 ` [virtio-comment] " Michael S. Tsirkin
2022-03-19 12:00 ` Christian Schoenebeck
2022-03-20 12:31 ` Michael S. Tsirkin
2022-03-20 13:32 ` Christian Schoenebeck
2022-03-20 13:55 ` Michael S. Tsirkin
2022-03-20 15:17 ` Christian Schoenebeck
2022-03-20 16:06 ` Michael S. Tsirkin
2022-03-20 16:07 ` Michael S. Tsirkin
2022-03-20 17:43 ` Christian Schoenebeck
2022-03-20 21:52 ` Michael S. Tsirkin
2022-03-21 9:23 ` Christian Schoenebeck
2022-03-21 22:13 ` Michael S. Tsirkin
2022-03-23 10:20 ` Christian Schoenebeck
2022-03-23 12:35 ` Michael S. Tsirkin [this message]
2022-03-24 9:16 ` Christian Schoenebeck
2022-03-24 10:36 ` Michael S. Tsirkin
2022-03-24 11:11 ` Christian Schoenebeck
2022-03-24 11:16 ` Michael S. Tsirkin
2022-03-24 11:52 ` Christian Schoenebeck
2022-03-16 13:50 ` [PATCH v3 2/4] Add PCI configuration field "queue_indirect_size" Christian Schoenebeck
2022-03-16 14:41 ` [virtio-comment] " Christian Schoenebeck
2022-03-16 13:52 ` [PATCH v3 3/4] Add MMIO configuration register "QueueIndirectNum" Christian Schoenebeck
2022-03-16 13:55 ` [PATCH v3 4/4] Add CCW configuration field "indirect_num" Christian Schoenebeck
2022-03-17 14:12 ` [virtio-comment] " Cornelia Huck
2022-03-18 11:02 ` Christian Schoenebeck
2022-03-18 16:06 ` Halil Pasic
2022-03-19 10:12 ` Christian Schoenebeck
2022-03-21 16:36 ` Cornelia Huck
2022-03-22 1:56 ` Halil Pasic
2022-03-22 9:57 ` Cornelia Huck
2022-03-22 11:21 ` Halil Pasic
2022-03-18 16:10 ` Cornelia Huck
2022-03-19 10:23 ` Christian Schoenebeck
2022-03-21 16:25 ` Cornelia Huck
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220323081402-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=asmadeus@codewreck.org \
--cc=cohuck@redhat.com \
--cc=groug@kaod.org \
--cc=pasic@linux.ibm.com \
--cc=qemu_oss@crudebyte.com \
--cc=stefanha@redhat.com \
--cc=virtio-comment@lists.oasis-open.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox