From: Jason Wang <jasowang@redhat.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: "virtio-dev@lists.oasis-open.org"
<virtio-dev@lists.oasis-open.org>,
"stefanha@gmail.com" <stefanha@gmail.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"jan.scheurich@ericsson.com" <jan.scheurich@ericsson.com>,
"armbru@redhat.com" <armbru@redhat.com>,
Wei Wang <wei.w.wang@intel.com>,
"marcandre.lureau@gmail.com" <marcandre.lureau@gmail.com>,
"pbonzini@redhat.com" <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [virtio-dev] Re: [virtio-dev] Re: [PATCH v1] virtio-net: enable configurable tx queue size
Date: Thu, 15 Jun 2017 12:16:09 +0800
Message-ID: <b59be2f1-ee8c-e660-4f82-bbe63e22b603@redhat.com>
In-Reply-To: <20170614180459-mutt-send-email-mst@kernel.org>
On 2017-06-14 23:22, Michael S. Tsirkin wrote:
> On Wed, Jun 14, 2017 at 07:26:54PM +0800, Jason Wang wrote:
>>
>> On 2017-06-13 18:46, Jason Wang wrote:
>>>
>>>> On 2017-06-13 17:50, Wei Wang wrote:
>>>> On 06/13/2017 05:04 PM, Jason Wang wrote:
>>>>>
>>>>>> On 2017-06-13 15:17, Wei Wang wrote:
>>>>>> On 06/13/2017 02:29 PM, Jason Wang wrote:
>>>>>>> The issue is what if there's a mismatch of max #sgs between qemu and
>>>>>>> vhost?
>>>>>>>>>> When the vhost backend is used, QEMU is not involved in the data
>>>>>>>>>> path. The vhost backend directly gets what is offered by the guest
>>>>>>>>>> from the vq. Why would there be a mismatch of max #sgs between QEMU
>>>>>>>>>> and vhost, and what is the QEMU side max #sgs used for? Thanks.
>>>>>>>>> You need to query the backend's max #sgs in this case at least, no?
>>>>>>>>> If not, how do you know the value is supported by the backend?
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>> Here is my thought: the vhost backend already supports 1024 sgs, so I
>>>>>>>> think it might not be necessary to query the max sgs that the vhost
>>>>>>>> backend supports. In the setup phase, when QEMU detects that the
>>>>>>>> backend is vhost, it assumes 1024 max sgs are supported, instead of
>>>>>>>> making an extra call to query.
>>>>>>> We can probably assume the vhost kernel backend supports up to 1024
>>>>>>> sgs. But what about other vhost-user backends?
>>>>>>>
>>>>>> So far, I haven't seen any vhost backend implementation
>>>>>> supporting less than 1024 sgs.
>>>>> Since vhost-user is an open protocol, we cannot check each
>>>>> implementation (some may even be closed source). For safety, we
>>>>> need an explicit clarification of this.
>>>>>
>>>>>>
>>>>>>> And what you said here makes me ask one of my past questions again:
>>>>>>>
>>>>>>> Do we have a plan to extend 1024 to a larger value, or does 1024 look
>>>>>>> good for the coming years? If we only care about 1024, there's no need
>>>>>>> for a new config field; a feature flag is more than enough. If we want
>>>>>>> to extend it to e.g. 2048, we definitely need to query the vhost
>>>>>>> backend's limit (even for vhost-kernel).
>>>>>>>
>>>>>> According to the virtio spec (e.g. 2.4.4), the guest is discouraged from
>>>>>> using unreasonably large descriptors. If possible, I would suggest using
>>>>>> 1024 as the largest number of descriptors that the guest can chain, even
>>>>>> when we have larger queue sizes in the future. That is,
>>>>>>
>>>>>>     if (backend == QEMU backend)
>>>>>>         config.max_chain_size = 1023; /* defined by the qemu backend implementation */
>>>>>>     else if (backend == vhost)
>>>>>>         config.max_chain_size = 1024;
>>>>>>
>>>>>> It is transparent to the guest. From the guest's point of view, all it
>>>>>> knows is a value given to it via reading config.max_chain_size.
>>>>> So it is not actually transparent; at the very least the guest needs to
>>>>> see and check this value. So the question remains, since in fact you only
>>>>> care about two cases:
>>>>>
>>>>> - the backend supports 1024
>>>>> - the backend supports <1024 (qemu or whatever other backends)
>>>>>
>>>>> So it looks like a new feature flag is more than enough. If the device
>>>>> (backend) supports this feature, it can guarantee that 1024 sgs are
>>>>> supported?
>>>>>
>>>> That wouldn't be enough. For example, the QEMU 3.0 backend supports
>>>> max_chain_size=1023, while the QEMU 4.0 backend supports max_chain_size=1021.
>>>> How would the guest know the max size with the same feature flag? Would it
>>>> still chain 1023 descriptors with QEMU 4.0?
>>>>
>>>> Best,
>>>> Wei
>>> I believe we won't go back to less than 1024 in the future. It may be
>>> worth adding a unit test for this to catch regressions early.
>>>
>>> Thanks
> I think I disagree with that. Smaller pipes are better (e.g. less cache
> pressure), and you only need huge pipes because the host thread gets
> scheduled out for too long. With more CPUs there's less of a chance of
> overcommit, so we'll be able to get by with smaller pipes in the future.
Agreed, but we are talking about the upper limit. Even if 1024 is
supported, a small number of #sgs is still encouraged.
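
For illustration only, here is a minimal sketch of what that would mean on
the guest side; the helper name and the max_chain_size parameter are
assumptions for the sketch, not an existing API:

    /* Clamp the number of chained descriptors used for one packet: never
     * more than the virtqueue can hold, and never more than the backend's
     * advertised limit, even if the queue itself is larger. */
    static unsigned int tx_chain_limit(unsigned int queue_size,
                                       unsigned int max_chain_size)
    {
        return queue_size < max_chain_size ? queue_size : max_chain_size;
    }

A driver would still prefer much shorter chains in practice; the value is
only an upper bound.
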
>
>> Considering the queue size is 256 now, I think maybe we can first make the
>> tx queue size configurable up to 1024, and then do the #sg stuff on top.
>>
>> What's your opinion, Michael?
>>
>> Thanks
> With a kernel backend, 1024 is problematic since we are then unable
> to add any entries or handle cases where an entry crosses an MR region
> boundary. We could support up to 512 with a kernel backend but no one
> seems to want that :)
Then I see issues with indirect descriptors.

We have implicitly allowed up to 1024 chained descriptors since
e0e9b406470b ("vhost: max s/g to match qemu"). If the guest can submit
descriptors crossing an MR boundary, I'm afraid we've already had this bug
since that commit. And this actually seems to conflict with what the spec
says in 2.4.5:

"""
The number of descriptors in the table is defined by the queue size for
this virtqueue: this is the maximum possible descriptor chain length.
"""

Technically, we have had the same issue for rx, since we already allow a
1024 queue size there. So allowing the tx size to go up to 1024 does not
actually introduce any new trouble?
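
To make the limit concrete, a rough sketch (not the actual vhost code) of
how a backend bounded by a UIO_MAXIOV-style limit of 1024 has to walk a
chain and reject anything longer; the struct below is a minimal stand-in
for the split-ring descriptor layout from the spec (2.4):

    #include <stdint.h>

    #define BACKEND_MAX_SG     1024   /* e.g. a UIO_MAXIOV-bounded backend */
    #define VRING_DESC_F_NEXT  1

    struct vring_desc {
        uint64_t addr;
        uint32_t len;
        uint16_t flags;
        uint16_t next;
    };

    /* Returns the chain length starting at 'head', or -1 if the chain
     * exceeds either the backend's sg limit or the queue size (the bound
     * the spec text quoted above implies). */
    static int chain_length(const struct vring_desc *desc, uint16_t head,
                            uint16_t queue_size)
    {
        uint16_t i = head;
        int count = 0;

        for (;;) {
            if (++count > BACKEND_MAX_SG || count > queue_size)
                return -1;
            if (!(desc[i].flags & VRING_DESC_F_NEXT))
                return count;
            i = desc[i].next;
        }
    }
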
>
> With vhost-user the backend might be able to handle that. So an
> acceptable option would be to allow 1K with vhost-user backends
> only, trim it back with other backends.
>
I believe the idea is to clarify the maximum chain size explicitly instead
of relying on any assumption.
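
One possible shape for such a clarification, sketched purely as an
illustration (the feature bit number, struct and field names below are all
made up, not from the spec or from any patch): a feature bit gates a config
field that reports the backend's chain limit, and the guest falls back to a
conservative default when the bit is absent.

    #include <stdint.h>

    #define VIRTIO_NET_F_MAX_CHAIN_SIZE  25   /* hypothetical feature bit */

    struct virtio_net_config_chain {
        uint16_t max_chain_size;   /* valid only when the feature is offered */
    };

    /* Guest side: trust the advertised limit once the feature is negotiated,
     * otherwise fall back to a conservative default (1023, the QEMU backend
     * limit mentioned earlier in this thread). */
    static uint16_t guest_chain_limit(uint64_t features,
                                      const struct virtio_net_config_chain *cfg)
    {
        if (features & (1ULL << VIRTIO_NET_F_MAX_CHAIN_SIZE))
            return cfg->max_chain_size;
        return 1023;
    }
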
Thanks
Thread overview: 39+ messages
2017-06-05 8:57 [Qemu-devel] [PATCH v1] virtio-net: enable configurable tx queue size Wei Wang
2017-06-05 15:38 ` Michael S. Tsirkin
2017-06-05 15:41 ` Eric Blake
2017-06-05 15:45 ` Michael S. Tsirkin
2017-06-06 3:32 ` Wei Wang
2017-06-07 1:04 ` Wei Wang
2017-06-08 19:01 ` Michael S. Tsirkin
2017-06-09 3:00 ` Wei Wang
2017-06-12 9:30 ` [Qemu-devel] [virtio-dev] " Wei Wang
2017-06-12 20:43 ` Michael S. Tsirkin
2017-06-13 3:10 ` Wei Wang
2017-06-13 3:19 ` Jason Wang
2017-06-13 3:51 ` Wei Wang
2017-06-13 3:55 ` Jason Wang
2017-06-13 3:59 ` Jason Wang
2017-06-13 6:13 ` Wei Wang
2017-06-13 6:31 ` Jason Wang
2017-06-13 7:49 ` [Qemu-devel] [virtio-dev] " Wei Wang
2017-06-13 6:08 ` [Qemu-devel] " Wei Wang
2017-06-13 6:29 ` [Qemu-devel] [virtio-dev] " Jason Wang
2017-06-13 7:17 ` Wei Wang
2017-06-13 9:04 ` Jason Wang
2017-06-13 9:50 ` Wei Wang
2017-06-13 10:46 ` Jason Wang
2017-06-14 11:26 ` Jason Wang
2017-06-14 15:22 ` Michael S. Tsirkin
2017-06-15 4:16 ` Jason Wang [this message]
2017-06-15 6:52 ` [Qemu-devel] [virtio-dev] " Wei Wang
2017-06-16 3:22 ` Michael S. Tsirkin
2017-06-16 8:57 ` Jason Wang
2017-06-16 10:10 ` Wei Wang
2017-06-16 15:15 ` Michael S. Tsirkin
2017-06-17 8:37 ` Wei Wang
2017-06-18 19:46 ` Michael S. Tsirkin
2017-06-19 7:40 ` Wei Wang
2017-06-16 15:19 ` Michael S. Tsirkin
2017-06-16 17:04 ` Maxime Coquelin
2017-06-16 20:33 ` Michael S. Tsirkin
2017-06-05 15:47 ` [Qemu-devel] " Michael S. Tsirkin