All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Christian Schoenebeck <qemu_oss@crudebyte.com>
Cc: virtio-comment@lists.oasis-open.org,
	Cornelia Huck <cohuck@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>, Greg Kurz <groug@kaod.org>,
	Dominique Martinet <asmadeus@codewreck.org>,
	Halil Pasic <pasic@linux.ibm.com>
Subject: Re: [virtio-comment] [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE
Date: Wed, 23 Mar 2022 08:35:00 -0400	[thread overview]
Message-ID: <20220323081402-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <2085227.9aJWpeLndy@silver>

On Wed, Mar 23, 2022 at 11:20:07AM +0100, Christian Schoenebeck wrote:
> On Montag, 21. März 2022 23:13:55 CET Michael S. Tsirkin wrote:
> > On Mon, Mar 21, 2022 at 10:23:07AM +0100, Christian Schoenebeck wrote:
> > > On Sonntag, 20. März 2022 22:52:16 CET Michael S. Tsirkin wrote:
> > > > On Sun, Mar 20, 2022 at 06:43:53PM +0100, Christian Schoenebeck wrote:
> > > > > To be honest, I don't feel like discussing precise wordings at this
> > > > > point
> > > > > when you are questioning the proposal on design level already.
> > > > > 
> > > > > Maybe you make some more thorough thoughts on what you actually want
> > > > > this
> > > > > to be on design level before continueing to argue about precise
> > > > > terminology, which you are not using either BTW when you articulating
> > > > > your criticism.
> > > > > 
> > > > > Or even better: come up with your own proposol with the precise
> > > > > wording
> > > > > you
> > > > > feel appropriate.
> > > > 
> > > > OK let's go back and agree on what we are trying to achieve.  The github
> > > > issue and the cover letter imply that while indirect descriptors would
> > > > normally allow huge tables, we artificially limit them to queue size,
> > > > and you want to be able to relax that.
> > > 
> > > Correct, that's my motivation for all of this.
> > 
> > Okay. So I think that given this, we can limit the total number
> > of non-indirect descriptors, including non-indirect ones
> > in a chain + all the ones in indirect pointer table if any,
> > and excluding the indirect descriptor itself, and this
> > will address the issue you are describing here, right?
> 
> As far as I understand your suggestion, yes, it would cover:
> 
> A. My use case [1]: allowing indirect table length > queue size.
> B. Stefan's use case [2]: forcing indirect table length < queue size.
> C. Your use case: forcing chain (within FIFOs) length < queue size.
> 
> [1] https://lore.kernel.org/netdev/cover.1640870037.git.linux_oss@crudebyte.com/
> [2] https://lore.kernel.org/all/YXpmwP6RtvY0BmSM@stefanha-x1.localdomain/
> 
> However it would not cover:
> 
> D. Your other use case: blocking indirect tables.

Well, the practical use-case I have in mind is actually for when using
mergeable buffers with the network device, in practice there's never a
s/g but there's no way for device to know that. So maybe we can declare
that indirect must not be used when there's a single descriptor in the
table (esp when the new feature is accepted), and then just limiting s/g
to 1 will do the trick in this specific case.

> E. Potential other people's need: max. chain length (within FIFOs) != max.
>    indirect table length.

I don't understand this one, sorry.

> It's not clear to me though why you would want to exclude the descriptor
> pointing to a table from counting towards that limit.

Because it's a transient thing. It does not need to be processed
with a packet, it's just a way of supplying a longer s/g.
In other words what devices care about is the s/g list length,
bot how the s/g is specified.

I think what QEMU and Linux both do is very typical:
the indirect descriptor is used to fetch the table,
but then what QEMU cares about is whether it will overflow 1024
entries when preparing the iovec for linux.



> Instead of yet again mixing different limits again with each other, what about
> using 3 distinct fields for those limits instead:
> 
> 1. max. indirect table length (any descriptor in the table)
> 2. max. chain length (currently: descriptors within FIFOs)
> 3. max. total descriptors per buffer (sum of all descriptors, both from FIFOs
>    and indirect tables, including descriptor pointing to a table).

So this doesn't actually address at least what you called "your
use-case" above. Well it almost does, but not exactly because
of the extra indirect descriptor.

It's not that I'm against this kind of thing but personally
I'd like a bit more than a theoretical argument for
why all the extra complexity is helpful - remember that
all that needs to be coded in the driver, and any mistakes
often tend to paint us in the corner because of the need to
support existing drivers.


> As a side note: the spec currently refers to "table of indirect descriptors",
> hence my previous assumption that descriptors in such a table should be called
> "indirect descriptors", whereas you are apparently assuming that an "indirect
> descriptor" is only that single one pointing to a table and you see the
> descriptors in the table as "direct" ones I guess.

Oh. You are right. There's no special name for what I mistakenly
called an indirect descriptor. So what I called
"direct descriptors" are really "read/write descriptors".
It might make sense to add some text calling these something.
Scatter/gather descriptors maybe?

And what I called "indirect descriptor" is really just
descriptor with INDIRECT flag set. I am not sure we need
to call these anything unless we really start counting it.




> > Question:
> > - I am thinking about bi-directional descriptors such as
> >   block and scsi have. it looks like they have a separate limit for
> >   read and write parts of the chain. Should we have two limits
> >   then? Or should we just make driver use the lower of the two,
> >   i.e. the per vq limit applies to the total # of elements
> >   in a buffer, and read/write sgs are controlled separately
> >   by the per device control?
> > 
> > Stefan, any comments?
> 
> Leaving that question to Stefan. No opinion from my side at this point.
> 
> > > > Fair enough.
> > > > 
> > > > However, I feel trying to talk about indirect descriptor is too narrow a
> > > > use-case, simply because the issue is not indirect at all.  Why do we
> > > > limit number of segments? I think it's really because of backend
> > > > limitations. And indirect is only used by the frontend.  So limiting
> > > > that is really going about it wrong.
> > > 
> > > I am only aware about current implementation situation in QEMU and Linux
> > > kernel. As for those two: yes, it is not a limitation on Linux kernel
> > > side,
> > > but on QEMU side.
> > > 
> > > As for other implementations: no idea.
> > > 
> > > > So block for example has seg_max already. What should happen
> > > > if that exceeds queue size is not defined.
> > > > 
> > > > So maybe we can generalize that making it device independent?
> > > > The litmus paper for this is the block and scsi devices,
> > > > we should be able to use the new feature as a super-set.
> > > > 
> > > > Before we discuss solutions, did I formulate the problem correctly?
> > > 
> > > Keep in mind that I never worked on virtio code or virtio spec before. I
> > > just started to review virtio implementation of QEMU and Linux kernel and
> > > the virtio spec in November, specifically in context of 9p. I definitely
> > > don't know all the other virtioo device classes out there.
> > 
> > I think it's great that we have someone taking care of 9p btw!
> 
> Thanks!
> 
> > > In other words: I can't help you on fitting this appropriately into a
> > > superset picture.
> > > 
> > > Best regards,
> > > Christian Schoenebeck
> > 
> > I think we need some cleanups in the spec to make what you are trying to
> > do possible to do cleanly, specifically move this description:
> > 
> > 	A buffer consists of zero or more device-readable physically-contiguous
> > 	elements followed by zero or more physically-contiguous
> > 	device-writable elements (each buffer has at least one element).
> > 
> > out to the generic part from packed ring part, drop corresponding
> > text from split ring and make it refer to the term "element".
> > 
> > After this, the new field will just be "a number of elements per
> > buffer" (note that indirect descriptors do not themselves
> > describe elements and so won't be included in the math).
> > 
> > Christian, you mentioned you don't like the term buffer generally,
> > changing that can be done before or after this feature but IMHO
> > best not as part of it.
> 
> For pragmatic reasons I will refrain from questioning any virtio terms in
> foreseeable future and will just use the ones suggested by people.

It's not that the terminology is so great, but we do have used it for a
while and by now it's quite entrenched.

> > I think it's good in that it will fit better in the superset picture
> > addressing in addition to your requirement also the requirement to have
> > huge rings while limiting descriptors.
> > 
> > Does above sound like it addresses your requirements of having
> > a longer descriptor chain than queue size? If not what is not
> > addressed?
> > 
> > Thanks,
> 


  reply	other threads:[~2022-03-23 12:35 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-16 13:44 [PATCH v3 0/4] Add VIRTIO_RING_F_INDIRECT_SIZE and queue_indirect_size Christian Schoenebeck
2022-03-16 13:47 ` [PATCH v3 1/4] Add VIRTIO_RING_F_INDIRECT_SIZE Christian Schoenebeck
2022-03-17 13:40   ` [virtio-comment] " Cornelia Huck
2022-03-18 10:45     ` Christian Schoenebeck
2022-03-18 16:03       ` [virtio-comment] " Cornelia Huck
2022-03-19  9:33   ` [virtio-comment] " Michael S. Tsirkin
2022-03-19 12:00     ` Christian Schoenebeck
2022-03-20 12:31       ` Michael S. Tsirkin
2022-03-20 13:32         ` Christian Schoenebeck
2022-03-20 13:55           ` Michael S. Tsirkin
2022-03-20 15:17             ` Christian Schoenebeck
2022-03-20 16:06               ` Michael S. Tsirkin
2022-03-20 16:07                 ` Michael S. Tsirkin
2022-03-20 17:43                 ` Christian Schoenebeck
2022-03-20 21:52                   ` Michael S. Tsirkin
2022-03-21  9:23                     ` Christian Schoenebeck
2022-03-21 22:13                       ` Michael S. Tsirkin
2022-03-23 10:20                         ` Christian Schoenebeck
2022-03-23 12:35                           ` Michael S. Tsirkin [this message]
2022-03-24  9:16                             ` Christian Schoenebeck
2022-03-24 10:36                               ` Michael S. Tsirkin
2022-03-24 11:11                                 ` Christian Schoenebeck
2022-03-24 11:16                                   ` Michael S. Tsirkin
2022-03-24 11:52                                     ` Christian Schoenebeck
2022-03-16 13:50 ` [PATCH v3 2/4] Add PCI configuration field "queue_indirect_size" Christian Schoenebeck
2022-03-16 14:41   ` [virtio-comment] " Christian Schoenebeck
2022-03-16 13:52 ` [PATCH v3 3/4] Add MMIO configuration register "QueueIndirectNum" Christian Schoenebeck
2022-03-16 13:55 ` [PATCH v3 4/4] Add CCW configuration field "indirect_num" Christian Schoenebeck
2022-03-17 14:12   ` [virtio-comment] " Cornelia Huck
2022-03-18 11:02     ` Christian Schoenebeck
2022-03-18 16:06       ` Halil Pasic
2022-03-19 10:12         ` Christian Schoenebeck
2022-03-21 16:36           ` Cornelia Huck
2022-03-22  1:56             ` Halil Pasic
2022-03-22  9:57               ` Cornelia Huck
2022-03-22 11:21                 ` Halil Pasic
2022-03-18 16:10       ` Cornelia Huck
2022-03-19 10:23         ` Christian Schoenebeck
2022-03-21 16:25           ` Cornelia Huck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220323081402-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=asmadeus@codewreck.org \
    --cc=cohuck@redhat.com \
    --cc=groug@kaod.org \
    --cc=pasic@linux.ibm.com \
    --cc=qemu_oss@crudebyte.com \
    --cc=stefanha@redhat.com \
    --cc=virtio-comment@lists.oasis-open.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.