From: "Michael S. Tsirkin" <mst@redhat.com>
To: Wei Wang <wei.w.wang@intel.com>
Cc: Jason Wang <jasowang@redhat.com>,
	"virtio-dev@lists.oasis-open.org"
	<virtio-dev@lists.oasis-open.org>,
	"stefanha@gmail.com" <stefanha@gmail.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"jan.scheurich@ericsson.com" <jan.scheurich@ericsson.com>,
	"armbru@redhat.com" <armbru@redhat.com>,
	"marcandre.lureau@gmail.com" <marcandre.lureau@gmail.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [virtio-dev] Re: [virtio-dev] Re: [virtio-dev] Re: [PATCH v1] virtio-net: enable configurable tx queue size
Date: Fri, 16 Jun 2017 06:22:57 +0300
Message-ID: <20170616061949-mutt-send-email-mst@kernel.org>
In-Reply-To: <59422E91.7080407@intel.com>

On Thu, Jun 15, 2017 at 02:52:01PM +0800, Wei Wang wrote:
> On 06/15/2017 12:16 PM, Jason Wang wrote:
> > 
> > 
> > > On 2017-06-14 23:22, Michael S. Tsirkin wrote:
> > > On Wed, Jun 14, 2017 at 07:26:54PM +0800, Jason Wang wrote:
> > > > 
> > > > > On 2017-06-13 18:46, Jason Wang wrote:
> > > > > 
> > > > > > On 2017-06-13 17:50, Wei Wang wrote:
> > > > > > On 06/13/2017 05:04 PM, Jason Wang wrote:
> > > > > > > 
> > > > > > > > On 2017-06-13 15:17, Wei Wang wrote:
> > > > > > > > On 06/13/2017 02:29 PM, Jason Wang wrote:
> > > > > > > > > The issue is what if there's a mismatch of
> > > > > > > > > max #sgs between qemu and vhost?
> > > > > > > > > > > > When the vhost backend is used, QEMU is not
> > > > > > > > > > > > involved in the data path.
> > > > > > > > > > > > The vhost backend
> > > > > > > > > > > > directly gets what is offered by
> > > > > > > > > > > > the guest from the vq. Why would
> > > > > > > > > > > > there be a mismatch of
> > > > > > > > > > > > max #sgs between QEMU and vhost, and what is
> > > > > > > > > > > > the QEMU side max #sgs
> > > > > > > > > > > > used for? Thanks.
> > > > > > > > > > > You need to query the backend's max #sgs in this
> > > > > > > > > > > case at least, no? If not,
> > > > > > > > > > > how do you know the value is supported by the backend?
> > > > > > > > > > > 
> > > > > > > > > > > Thanks
> > > > > > > > > > > 
> > > > > > > > > > Here is my thought: the vhost backend already
> > > > > > > > > > supports 1024 sgs,
> > > > > > > > > > so I think it might not be necessary to query the
> > > > > > > > > > max sgs that the vhost
> > > > > > > > > > backend supports. In the setup phase, when QEMU
> > > > > > > > > > detects the backend is
> > > > > > > > > > vhost, it assumes 1024 max sgs are supported,
> > > > > > > > > > instead of making an extra
> > > > > > > > > > call to query it.
> > > > > > > > > We can probably assume the vhost kernel supports up to
> > > > > > > > > 1024 sgs. But what about other vhost-user backends?
> > > > > > > > > 
> > > > > > > > So far, I haven't seen any vhost backend implementation
> > > > > > > > supporting less than 1024 sgs.
> > > > > > > Since vhost-user is an open protocol, we cannot check each
> > > > > > > implementation (some may even be closed source). For safety, we
> > > > > > > need an explicit clarification on this.
> > > > > > > 
> > > > > > > > 
> > > > > > > > > And what you said here makes me ask one of
> > > > > > > > > my questions from the past:
> > > > > > > > > 
> > > > > > > > > Do we plan to extend 1024 to a larger value, or does 1024
> > > > > > > > > look good for the coming years? If we only care about
> > > > > > > > > 1024, there's no need for a new config field; a
> > > > > > > > > feature flag is more than enough. If we want to extend
> > > > > > > > > it to e.g. 2048, we definitely need to query the vhost
> > > > > > > > > backend's limit (even for vhost-kernel).
> > > > > > > > > 
> > > > > > > > According to the virtio spec (e.g. 2.4.4), the guest is
> > > > > > > > discouraged from using unreasonably large descriptors.
> > > > > > > > If possible, I would suggest using 1024 as the largest
> > > > > > > > number of descriptors that the guest can chain, even when
> > > > > > > > we have a larger queue size in the future. That is,
> > > > > > > > if (backend == QEMU backend)
> > > > > > > >      config.max_chain_size = 1023 (defined by the qemu
> > > > > > > > backend implementation);
> > > > > > > > else if (backend == vhost)
> > > > > > > >      config.max_chain_size = 1024;
> > > > > > > > 
> > > > > > > > It is transparent to the guest. From the guest's point of
> > > > > > > > view, all it knows is the value
> > > > > > > > it reads from config.max_chain_size.
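
In code terms, Wei's backend-dependent assignment above amounts to something
like this minimal sketch (the constant and helper names here are invented for
illustration, not actual QEMU code):

    /* Illustrative sketch only -- names are invented, not real QEMU APIs. */
    #include <stdint.h>
    #include <stdbool.h>

    #define CHAIN_MAX_QEMU_BACKEND  1023  /* limit of QEMU's own (userspace) backend */
    #define CHAIN_MAX_VHOST         1024  /* limit assumed for vhost backends */

    /* Pick the chain-size limit to expose in the device config space,
     * depending on which backend actually processes the virtqueue. */
    static uint16_t choose_max_chain_size(bool backend_is_vhost)
    {
        return backend_is_vhost ? CHAIN_MAX_VHOST : CHAIN_MAX_QEMU_BACKEND;
    }

    /* e.g.: config->max_chain_size = choose_max_chain_size(vhost_enabled); */
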
> > > > > > > So it is not actually transparent; at the least, the guest
> > > > > > > needs to see and check this. So the question still stands,
> > > > > > > since you only care about two cases in fact:
> > > > > > > 
> > > > > > > - backend supports 1024
> > > > > > > - backend supports <1024 (qemu or whatever other backends)
> > > > > > > 
> > > > > > > So it looks like a new feature flag is more than enough. If the
> > > > > > > device (backend) supports this feature, it can guarantee that
> > > > > > > 1024 sgs are supported?
> > > > > > > 
> > > > > > That wouldn't be enough. For example, suppose the QEMU 3.0
> > > > > > backend supports max_chain_size=1023,
> > > > > > while the QEMU 4.0 backend supports max_chain_size=1021. How
> > > > > > would the guest know
> > > > > > the max size with the same feature flag? Would it still chain 1023
> > > > > > descriptors with QEMU 4.0?
> > > > > > 
> > > > > > Best,
> > > > > > Wei
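
To put that in concrete terms: with only a feature flag the driver can merely
assume some fixed limit, whereas with a config field it reads the exact value.
A rough driver-side sketch (purely illustrative; the field and flag follow the
proposal in this thread, not a ratified spec):

    /* Illustrative guest/driver-side sketch only. */
    #include <stdint.h>
    #include <stdbool.h>

    /* Returns the maximum descriptor chain length the driver should use. */
    static uint16_t driver_max_chain(bool has_max_chain_feature,
                                     uint16_t cfg_max_chain_size,
                                     uint16_t queue_size)
    {
        /* Without the feature, the spec (2.4.5) implies the queue size
         * is the maximum possible chain length. */
        uint16_t limit = queue_size;

        if (has_max_chain_feature) {
            /* With the feature, use exactly what the device advertises. */
            limit = cfg_max_chain_size;
        }
        return limit < queue_size ? limit : queue_size;
    }
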
> > > > > I believe we won't go back to less than 1024 in the future. It may
> > > > > be worth adding a unit test for this to catch regressions early.
> > > > > 
> > > > > Thanks
> > > I think I disagree with that. Smaller pipes are better (e.g. less cache
> > > pressure), and you only need huge pipes because the host thread gets
> > > scheduled out for too long. With more CPUs there's less chance of
> > > overcommit, so we'll be able to get by with smaller pipes in the
> > > future.
> > 
> > Agreed, but we are talking about the upper limit. Even if 1024 is
> > supported, a small number of #sgs is still encouraged.
> > 
> > > 
> > > > Considering the queue size is 256 now, I think maybe we can first
> > > > make the tx queue
> > > > size configurable up to 1024, and then do the #sg work on top.
> > > > 
> > > > What's your opinion, Michael?
> > > > 
> > > > Thanks
> > > With a kernel backend, 1024 is problematic since we are then unable
> > > to add any entries or handle cases where an entry crosses an MR region
> > > boundary. We could support up to 512 with a kernel backend but no one
> > > seems to want that :)
> > 
> > Then I see issues with indirect descriptors.
> > 
> > We have tried to allow up to 1024 chained descriptors implicitly since
> > e0e9b406470b ("vhost: max s/g to match qemu"). If the guest can submit
> > MR-crossing descs, I'm afraid we've already had this bug since that
> > commit. And actually this seems to conflict with what the spec says in 2.4.5:
> > 
> > """
> > The number of descriptors in the table is defined by the queue size for
> > this virtqueue: this is the maximum possible descriptor chain length.
> > """
> > 
> > Technically, we have the same issue for rx since we allow a 1024 queue
> > size now.
> > 
> > So actually, allowing a size of 1024 does not introduce any new
> > trouble?
> > > 
> > > With vhost-user the backend might be able to handle that. So an
> > > acceptable option would be to allow 1K with vhost-user backends
> > > only, and trim it back with other backends.
> > > 
> > 
> > I believe the idea is to clarify the maximum chain size instead of
> > relying on any assumptions.
> > 
> > 
> 
> I think the issues can be solved by VIRTIO_F_MAX_CHAIN_SIZE.
> 
> For now, how about splitting it into two series of patches:
> 1) enable a 1024 tx queue size for vhost-user, to let users of vhost-user
> easily use a 1024 queue size.

Fine with me. 1) will get the property from the user but override it on
!vhost-user. Do we need a protocol flag? It seems prudent, but we get back
to the cross-version migration issues that are still pending a solution.
Marc Andre, what's the status of that work?
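
A rough sketch of how 1) could look (helper and constant names below are
illustrative only, not the actual patch):

    /* Illustrative sketch only -- not the actual patch. */
    #include <stdint.h>
    #include <stdbool.h>

    #define VIRTQUEUE_MAX_SIZE      1024  /* upper bound for any backend */
    #define TX_QUEUE_SIZE_DEFAULT    256  /* legacy limit kept for other backends */

    /* Clamp the user-requested tx queue size: only vhost-user gets the
     * full 1024; any other backend falls back to the old default. */
    static uint16_t effective_tx_queue_size(uint16_t requested, bool is_vhost_user)
    {
        if (requested > VIRTQUEUE_MAX_SIZE) {
            requested = VIRTQUEUE_MAX_SIZE;
        }
        if (!is_vhost_user && requested > TX_QUEUE_SIZE_DEFAULT) {
            requested = TX_QUEUE_SIZE_DEFAULT;
        }
        return requested;
    }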

> 2) enable VIRTIO_F_MAX_CHAIN_SIZE, to enhance robustness.

Rather, to support it for more backends.

> Best,
> Wei

-- 
MST

Thread overview: 39+ messages
2017-06-05  8:57 [Qemu-devel] [PATCH v1] virtio-net: enable configurable tx queue size Wei Wang
2017-06-05 15:38 ` Michael S. Tsirkin
2017-06-05 15:41   ` Eric Blake
2017-06-05 15:45     ` Michael S. Tsirkin
2017-06-06  3:32   ` Wei Wang
2017-06-07  1:04   ` Wei Wang
2017-06-08 19:01     ` Michael S. Tsirkin
2017-06-09  3:00       ` Wei Wang
2017-06-12  9:30         ` [Qemu-devel] [virtio-dev] " Wei Wang
2017-06-12 20:43           ` Michael S. Tsirkin
2017-06-13  3:10             ` Wei Wang
2017-06-13  3:19               ` Jason Wang
2017-06-13  3:51                 ` Wei Wang
2017-06-13  3:55                   ` Jason Wang
2017-06-13  3:59                     ` Jason Wang
2017-06-13  6:13                       ` Wei Wang
2017-06-13  6:31                         ` Jason Wang
2017-06-13  7:49                           ` [Qemu-devel] [virtio-dev] " Wei Wang
2017-06-13  6:08                     ` [Qemu-devel] " Wei Wang
2017-06-13  6:29                       ` [Qemu-devel] [virtio-dev] " Jason Wang
2017-06-13  7:17                         ` Wei Wang
2017-06-13  9:04                           ` Jason Wang
2017-06-13  9:50                             ` Wei Wang
2017-06-13 10:46                               ` Jason Wang
2017-06-14 11:26                                 ` Jason Wang
2017-06-14 15:22                                   ` Michael S. Tsirkin
2017-06-15  4:16                                     ` Jason Wang
2017-06-15  6:52                                       ` [Qemu-devel] [virtio-dev] " Wei Wang
2017-06-16  3:22                                         ` Michael S. Tsirkin [this message]
2017-06-16  8:57                                           ` Jason Wang
2017-06-16 10:10                                             ` Wei Wang
2017-06-16 15:15                                               ` Michael S. Tsirkin
2017-06-17  8:37                                                 ` Wei Wang
2017-06-18 19:46                                                   ` Michael S. Tsirkin
2017-06-19  7:40                                                     ` Wei Wang
2017-06-16 15:19                                             ` Michael S. Tsirkin
2017-06-16 17:04                                               ` Maxime Coquelin
2017-06-16 20:33                                                 ` Michael S. Tsirkin
2017-06-05 15:47 ` [Qemu-devel] " Michael S. Tsirkin
