From: "Zhu, Lingshan"
Date: Wed, 13 Sep 2023 12:01:15 +0800
To: Parav Pandit, Jason Wang
Cc: "Michael S. Tsirkin", "eperezma@redhat.com", "cohuck@redhat.com", "stefanha@redhat.com", "virtio-comment@lists.oasis-open.org", "virtio-dev@lists.oasis-open.org"
Subject: [virtio-dev] Re: [virtio-comment] [PATCH 5/5] virtio-pci: implement VIRTIO_F_QUEUE_STATE
Message-ID: <7424d2ae-2366-882f-bd84-04ee5714764b@intel.com>

On 9/12/2023 9:43 PM, Parav Pandit wrote:
>> From: Zhu, Lingshan
>> Sent: Tuesday, September 12, 2023 6:33 PM
>>
>> On 9/12/2023 5:21 PM, Parav Pandit wrote:
>>>> From: Zhu, Lingshan
>>>> Sent: Tuesday, September 12, 2023 2:33 PM
>>>> admin vq require fixed and dedicated resource to serve the VMs, the question still remains, does it scale to serve big amount of devices migration? how many admin vqs do you need to serve 10 VMs, how many for 100? and so on? How to scale?
>>>>
>>> Yes, it scales within the AQ and across multiple AQs.
>>> Please consult your board designers to know such limits for your device.
>> scales require multiple AQs, then how many should a vendor provide for the worst case?
>>
>> I am bored by the same repeating questions.
> I said it scales, within the AQ. (and across AQs).
> I have answered enough times, so I will stop on same repeated question.
> Your repeated question is not helping anyone as it is not in the scope of virtio.
>
> If you think it is, please get it written first for RSS and MQ in net section and post for review.

You missed the point of the question, and I agree there is no need to discuss this anymore.

>
>>>> If one admin vq can serve 100 VMs, can it migrate 1000VMs in reasonable time?
>>>> If not, how many exactly.
>>>>
>>> Yes, it can serve both 100 and 1000 VMs in reasonable time.
>> I am not sure, the aq is limitless? Can serve thousands of VMs in a reasonable time? Like in 300ms?
>>
> Yes.

Really? Limitless?

>
>> If you say, that require multiple AQ, then how many should a vendor provide?
>>
> I didn’t say multiple AQs must be used.
> It is same as NIC RQs.

Don't you agree that a single vq has its own performance limitations?

>
>> Don't say the board designer own the risks.
>>>> And register does not need to scale, it resides on the VF and only serve the VF.
>>>>
>>> Since its per VF, by nature it is linearly growing entity that the board design needs to support read and write with guaranteed timing.
>>> It clearly scaled poor than queue.
>> Please read my series. For example, we introduce a new bit SUSPEND in the \field{device status}, any scalability issues here?
> That must behave like queue_reset, (it must get acknowledged from the device) that it is suspended.
> And that brings the scale issue.

In this series, it says:
+When setting SUSPEND, the driver MUST re-read \field{device status} to ensure the SUSPEND bit is set.

And this has nothing to do with scale.

> On top of that once the device is SUSPENDED, it cannot accept some other RESET_VQ command.

So, as SiWei suggested, a new feature bit for vq reset will be introduced in V2.

>
>>>> It does not reside on the PF to migrate the VFs.
>>> Hence it does not scale and cannot do parallel operation within the VF, unless each register is replicated.
>> Why its not scale? It is a per device facility.
> Because the device needs to answer per device through some large scale memory to fit in a response time.

Again, it is a per-device facility, and it is register based, serving only the device itself.
And we do not plan to log the dirty pages in the BAR.

>
>> Why do you need parallel operation against the LM facility?
> Because your downtime was 300msec for 1000 VMs.

The LM facility in this series is per-device; it only serves itself.

>
>> That doesn't make a lot of sense.
>>> Using register of a queue for bulk data transfer is solved question when the virtio spec was born.
>>> I don’t see a point to discuss it.
>>> Snippet from spec: " As a device can have zero or more virtqueues for bulk data transport"
>> Where do you see the series intends to transfer bulk data through registers?
>>>> VFs config space can use the device dedicated resource like the bandwidth.
>>>>
>>>> for AQ, still you need to reserve resource and how much?
>>> It depends on your board, please consult your board designer to know depending on the implementation.
>>> From spec point of view, it should not be same as any other virtqueue.
>> so the vendor own the risk to implement AQ LM? Why they have to?
>>>>> No. I do not agree. It can fail and very hard for board designers.
>>>>> AQs are more reliable way to transport bulk data in scalable manner for tens of member devices.
>>>> Really? How often do you observe virtio config space fail?
>>> On Intel Icelake server we have seen it failing with 128 VFs.
>>> And device needs to do very weird things to support 1000+ VFs forever expanding config space, which is not the topic of this discussion anyway.
>> That is your setup problem.
>>>
>>>>>> Please allow me to provide an extreme example, is one single admin vq limitless, that can serve hundreds to thousands of VMs migration?
>>>>> It is left to the device implementation. Just like RSS and multi queue support?
>>>>> Is one Q enough for 800Gbps to 10Mbps link?
>>>>> Answer is: Not the scope of specification, spec provide the framework to scale this way, but not impose on the device.
>>>> Even if not support RSS or MQ, the device still can work with performance overhead, not fail.
>>>>
>>> _work_ is subjective.
>>> The financial transaction (application) failed. Packets worked.
>>> LM commands were successful, but it was not timely.
>>>
>>> Same same..
>>>
>>>> Insufficient bandwidth & resource caused live migration fail is totally different.
>>> Very abstract point and unrelated to administration commands.
>> It is your design facing the problem.
>>>>>> If not, two or three or what number?
>>>>> It really does not matter. Its wrong point to discuss here.
>>>>> Number of queues and command execution depends on the device implementation.
>>>>> A financial transaction application can timeout when a device queuing delay for virtio net rx queue is long.
>>>>> And we don’t put details about such things in specification.
>>>>> Spec takes the requirements and provides driver device interface to implement and scale.
>>>>> I still don’t follow the motivation behind the question.
>>>>> Is your question: How many admin queues are needed to migrate N member devices? If so, it is implementation specific.
>>>>> It is similar to how such things depend on implementation for 30 virtio device types.
>>>>> And if you are implying that because it is implementation specific, that is why administration queue should not be used, but some configuration register should be used.
>>>>> Then you should propose a config register interface to post virtqueue descriptors that way for 30 device types!
>>>> if so, leave it as undefined? A potential risk for device implementation?
>>>> Then why must the admin vq?
>>> Because administration commands and admin vq does not impose devices to implement thousands of registers which must have time bound completion guarantee.
>>> The large part of industry including SIOV devices led by Intel and others are moving away from register access mode.
>>> To summarize, administration commands and queue offer following benefits.
>>>
>>> 1. Ability to do bulk data transfer between driver and device
>>>
>>> 2. Ability to parallelize the work within driver and within device within single or multiple virtqueues
>>>
>>> 3. Eliminates implementing PCI read/write MMIO registers which demand low latency response interval
>>>
>>> 4. Better utilize host cpu as no one needs to poll on the device register for completion
>>>
>>> 5. Ability to handle variability in command completion by device and ability to notify the driver
>>>
>>> If this does not satisfy you, please refer to some of the past email discussions during administration virtqueue time.
>> I think you mixed up the facility and the implementation in my series, please read.
> I don’t know what you refer to. You asked "why AQ is must?" I answered above what AQ has to offer than some synchronous register.

Again, we are implementing facilities; V2 will include in-flight descriptors and dirty page tracking. That works for LM.