Date: Fri, 13 Oct 2023 17:44:27 +0800
From: "Zhu, Lingshan"
To: Parav Pandit, "Michael S. Tsirkin", Jason Wang
Cc: virtio-comment@lists.oasis-open.org, cohuck@redhat.com,
 sburla@marvell.com, Shahaf Shuler, Maor Gottlieb, Yishai Hadas
References: <20231008112555.473895-1-parav@nvidia.com>
 <20231008112555.473895-4-parav@nvidia.com>
 <20231008073912-mutt-send-email-mst@kernel.org>
 <2fa89e37-a097-d785-e1ee-cda151b0d872@intel.com>
 <85c59856-b68e-940c-08ed-a14e5a02554d@intel.com>
 <6fc4af28-67d9-781b-a243-a6c2ebf0244c@intel.com>
Subject: Re: [virtio-comment] Re: [PATCH v1 3/8] device-context: Define
 the device context fields for device migration

On 10/12/2023 7:37 PM, Parav Pandit wrote:
>> From: Zhu, Lingshan
>> Sent: Thursday, October 12, 2023 4:40 PM
>> On 10/12/2023 6:09 PM, Parav Pandit wrote:
>>>> From: Zhu, Lingshan
>>>> Sent: Thursday, October 12, 2023 3:30 PM
>>>>
>>>> On 10/11/2023 6:54 PM, Parav Pandit wrote:
>>>>>> From: Zhu, Lingshan
>>>>>> Sent: Wednesday, October 11, 2023 3:38 PM
>>>>>>
>>>>>>>> The system admin can choose only passthrough some of the devices
>>>>>>>> for nested guests, so passthrough the PF to L1 guest is not a
>>>>>>>> good idea, because there can be many devices still work for the
>>>>>>>> host or L1.
>>>>>>> Possible. One size does not fit all.
>>>>>>> What I expressed is most common scenarios that user care about.
>>>>>> don't block existing usecases, don't break the userspace, nested is
>>>>>> common.
>>>>> Nothing is broken as virtio spec do not have any single construct to
>>>>> support migration.
>>>>> If nested is common, can you share the performance number with real
>>>>> virtio device with/without 2 level nesting?
>>>>> I frankly don’t know how they look like.
>>>> virtio devices support nested, I mean don't break this usecase And
>>>> end user accept performance overhead in nested, this is not related
>>>> to this topic.
>>> Can you show an example of virtio device nesting and live migration
>>> already supported where the device has _done_ the live migration.
>>> Due to which you claim that new feature of admin command-based owner
>>> and member device breaks something?
>> current virtio/kvm/qemu support nested.
> Sure, two of the 3 components are not part of the virtio spec.
> Hence, they are not broken.
you want virtio to work for them, right? don't break this.
>
>>> Please don’t use the verb "break".
>>> Your proposal is the first of its kind that supports migrating nested
>>> device.
>>> This is why new patches of config register or admin command does not
>>> break anything existing.
>> if your proposal doesn't support nested, you break nested use cases.
>>>>>>>>> In second use case, where one want to bind only one member
>>>>>>>>> device to one VM, I think same plumbing can be extended to have
>>>>>>>>> another VF, to take the role of migration device instead of
>>>>>>>>> owner device.
>>>>>>>>> I don’t see a good way to passthrough and also do in-band
>>>>>>>>> migration without lot of device specific trap and emulation.
>>>>>>>>> I also don’t know the cpu performance numbers with 3 levels of
>>>>>>>>> nested page table translation which to my understanding cannot
>>>>>>>>> be accelerated by the current cpu.
>>>>>>>> host_PA->L1_QEMU_VA->L1_Guest_PA->L1_QEMU_VA->L2_Guest_PA and so
>>>>>>>> on, there can be performance overhead, but can be done.
>>>>>>>>
>>>>>>>> So admin vq migration still don't work for nested, this is surely
>>>>>>>> a blocker.
>>>>>>> In specific case of member devices are located at different nest
>>>>>>> level, it does not.
>>>>>> so you got the point, so this series should not be merged.
>>>>>>> Why prevents you have a peer VF do the role of migration driver?
>>>>>>> Basically, what I am proposing is, connect two VFs to the L1 guest.
>>>>>>> One VF is migration driver, one VF is passthrough to L2 guest.
>>>>>>> And same scheme works.
>>>>>> A peer VF? A management VF? still break the existing usecase. and
>>>>>> how do you transfer ownership of L2 VF from PF to L1 VF?
>>>>> A peer management VF which services admin command (like PF).
>>>>> Ownership of admin command is delegated to the management VF.
>>>> interesting, do you plan to cook a patch implementing this?
>>> No. I am hoping that you can help to draft those patches for nested
>>> case to work when one wants to hand off single VM to single nested
>>> guest VM.
>>> I will not be able to test any of nested things and show its
>>> performance value either, as I don’t see how rest of the eco system
>>> can match up for the nested.
>>> Hence, your expertise in drafting extension for nested is desired.
> Answer to your below question of patch drafting is here. If you can
> help to extend it will be good.
where is the draft patch?
>
>>>> Really make sense?
>>>>
>>>> How do you transfer the ownership?
>>> An additional ownership delegation by a new admin command.
>> if you think this can work, do you want to cook a patch to implement
>> this before you submitting this live migration series?
> I answered this already above.
talk is cheap, show me your patch
>
>>>> How do you maintain a different group?
>>> One to one assignment.
>> same as above
>>>> How do you isolate the groups?
>>> Not sure, what it means. The explicit group is created and VFs are
>>> placed in this group.
>> VF resource are on PF, right?
> Which resource?
> Before jumping to resource, may be you want to answer "group isolation"?
>
>>>> How do you keep the guest or host secure?
>>> Please be specific. Its very broad question when it comes to defining
>>> the interface.
>> without isolation, can be attacked?
> What isolation are you talking about?
> I am suggesting that one VF as dummy PF is given the role of admin
> commands.
>
>>>> How do you manage the overlaps?
>>> Overlaps between?
>> host pf and L1 VF
> L1 VF works at its own level.
> Host PF works at its own level.
> This is the true nesting.
>
>>>> How do you implement the hardware support that?
>>> Please consult your board designers. Hard to say how to implement
>>> something in generic.
>> so you don't have an idea
> :)
> Right, I do not have idea for Intel boards.
> I was suggesting a management VF that can service the admin commands.
>
>>>> How do you change the PCI routing?
>>> Why anything to be changed in PCI routing?
>> do you place PF and management VF in an ACL group?
> ACL group at which layer?
>
>> Does L1 management VF's member device belong to the PF physically?
> Yes.
Answer all the questions above. If you think a management VF can work,
please show me your patch.
>>>>> It does not break any existing deployments.
>>>> we are talking about nested, don't break nested
>>> Virtio spec for nested is not defined yet. Hence nothing is broken.
>>> Please avoid using the verb, _break_.
>> virtio nested works for many years
> I replied: your break comment is not applicable to virtio_spec, nor
> does it apply to any existing software you listed.
>
> As Michael said, software based nesting is used..
> See if actual hw based devices can implement it or not. Many components
> of cpu cannot do N level nesting either, but may be virtio can.
> I don’t know how yet.
two facts:
1. virtio has worked for nested for years
2. your admin vq lm solution does not work for nested