Discussion of the implementations of VIRTIO specification
From: Max Gurtovoy <mgurtovoy@nvidia.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: virtio-comment@lists.oasis-open.org, cohuck@redhat.com,
	virtio-dev@lists.oasis-open.org, jasowang@redhat.com,
	parav@nvidia.com, shahafs@nvidia.com, oren@nvidia.com,
	stefanha@redhat.com, Tziporet Koren <tziporet@nvidia.com>
Subject: Re: [virtio-comment] Re: [PATCH v1 0/5] Introduce virtio subsystem and Admin virtqueue
Date: Sun, 27 Mar 2022 18:40:15 +0300
Message-ID: <f5b9414b-cbb5-1f61-3528-62c901317841@nvidia.com>
In-Reply-To: <20220320172826-mutt-send-email-mst@kernel.org>


On 3/20/2022 11:41 PM, Michael S. Tsirkin wrote:
> On Thu, Mar 10, 2022 at 03:08:54PM +0200, Max Gurtovoy wrote:
>> On 3/10/2022 2:49 PM, Michael S. Tsirkin wrote:
>>> On Thu, Mar 10, 2022 at 12:38:38PM +0200, Max Gurtovoy wrote:
>>>> On 3/9/2022 9:42 AM, Michael S. Tsirkin wrote:
>>>>> On Wed, Mar 02, 2022 at 05:56:03PM +0200, Max Gurtovoy wrote:
>>>>>> Hi,
>>>>>> A virtio subsystem definition will help extend the virtio specification for
>>>>>> various future features that require a notion of grouping devices together or
>>>>>> managing devices inside a group. It might also be used for splitting or sharing a
>>>>>> single virtio backend between multiple devices (e.g. Multipath IO for virtio-blk
>>>>>> devices). A virtio subsystem includes one or more virtio devices.
>>>>> A large patch, need a bit more time for review. Meanwhile,
>>>>> how about adding migration related capabilities?
>>>>> I would very much like that to make progress before
>>>>> people start using high overhead solutions like
>>>>> VQ shadowing.
>>>> Sure, I can start working on rebasing the old LM proposal onto the virtio
>>>> subsystem framework.
>>>>
>>>> But can you be precise about what you mean by capabilities? Only caps, without
>>>> the commands and LM logic?
>>> There are at least four distinct bits, and they can be worked on mostly
>>> separately:
>>>
>>>
>>> 1.  We need a bunch of stuff to migrate a device to a different host right?
>>> - device specific state
>>> - transport state
>>> - vq ring state
>>> and of course we need
>>> - ability to stop/resume device
>>> This is useful by itself e.g. for snapshotting.
>>>
>>> Then to reduce downtime we also need to run device during memory
>>> migration, which requires support for
>>>
>>> 2. page faults (postcopy) and optionally
>>> 3. dirty tracking (precopy) - though dirty tracking can be done
>>> with faults too, so maybe just faults.
>>> Faults are definitely useful for a bunch of stuff like memory migration.
>>> Dirty tracking is more of a boutique feature, but I guess uses
>>> beyond memory migration can still be found.
>>>
>>> 4.  Finally, feature compatibility is a problem: not every configuration of a
>>> device can be migrated to any other device. The simplest example is a
>>> device feature not present on destination. Can be solved by not exposing
>>> the feature to the guest. Another example is layout of pci configuration
>>> space. Spec allows a lot of flexibility here, however things like
>>> # of VQs will affect the memory bar size.
>>> I am not exactly sure what we want to do in this space, maybe for
>>> starters enumerating what are the things that need to match on source
>>> and destination?
>>> We can start with a non-normative section describing the issues
>>> generally at least.
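
A minimal sketch of the compatibility check described in point 4, assuming a
deliberately simplified model: a device is reduced to the set of feature bits it
offers and its number of VQs. The names `DeviceConfig`, `migration_blockers`
and `common_features` are hypothetical and come from no spec or driver. A real
check would have to cover whatever else ends up on the "must match" list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceConfig:
    """Hypothetical summary of what the guest can observe about a device."""
    features: frozenset  # feature bit numbers offered to the guest
    num_vqs: int         # number of virtqueues (affects memory BAR size)


def migration_blockers(src: DeviceConfig, dst: DeviceConfig) -> list:
    """Return the reasons why a guest using src cannot move to dst."""
    blockers = []
    # A feature offered on the source but absent on the destination.
    missing = src.features - dst.features
    if missing:
        blockers.append(f"features missing on destination: {sorted(missing)}")
    # The VQ count changes the PCI layout the guest has already seen.
    if src.num_vqs != dst.num_vqs:
        blockers.append(f"VQ count differs: {src.num_vqs} != {dst.num_vqs}")
    return blockers


def common_features(src: DeviceConfig, dst: DeviceConfig) -> frozenset:
    """Features that are safe to expose if the guest may move between the two."""
    return src.features & dst.features
```

Exposing only `common_features(src, dst)` to the guest corresponds to the
"not exposing the feature" remedy mentioned above.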
>> MST,
>>
>> I would really like us to push these 5 patches before we dive deep into LM stuff.
>>
>> This was the plan we agreed on together - push infrastructure with a relatively
>> small feature (we chose MSIX management) to the spec.
>>
>> This infrastructure should fit future features such as: VQ management,
>> LM management and more.
>>
>> I think it does. Now the TG needs to review and agree.
>>
>> If we start talking about LM during this series review we will end up
>> again with nothing merged to the spec and waste more precious time.
> I am not sure that last sentence is true. Or to be more precise,
> yes it's possible that the fastest way to merge admin queue
> proposal is to avoid making sure it solves live migration, but
> admin queue is not an end by itself.

Not by itself, of course. We use it for other features such as MSIX
configuration (which I posted to the TG mailing list several months ago),
remember?

And it can be extended to other features by other members as well, as
soon as we merge it.

>
>> So I'm taking the bits above into account for the internal LM work that I'm
>> preparing for the future (after we merge the current series).
>>
>> Agreed?
> My advice is always to do work in the open and publish drafts of
> the work even if it's not ready, but be very clear and open about
> what is and what is not ready, including a TODO list in the
> commit log. You can tag it RFC in subject and make it PATCH 6/5
> so it's clear to people that it's a POC and not a final patch.
> In particular it will be helpful to show that admin queue is
> actually a good fit for this purpose.

We already agreed that admin queue is a good fit. You said it in your
own words.

I would like us to continue with the initial plan that you proposed and
that we built a plan of record around.

Changing strategy in V5 is not something we should do.

Let's stick to the original plan, please.

Any comments on this series, Cornelia/Jason? Or can we merge it as-is?

>
>
>>>
>>>
>>>> Initial feedback will be great for this series since every rebase costs a
>>>> lot... and it grows if we add more caps and logic.
>>>>
>>>>>> Also introduce the admin facility to allow manipulating features and configurations
>>>>>> in a generic manner. Using the admin command set, one can manipulate the device itself
>>>>>> and/or, if possible, another device within the same virtio subsystem (the
>>>>>> following patch set).
>>>>>>
>>>>>> The admin virtqueue is the first management interface to issue Admin commands from
>>>>>> the admin command set.
>>>>>>
>>>>>> The admin virtqueue interface will be extended in the future with more and more
>>>>>> features, some of which are already under discussion. Some of these features don't
>>>>>> fit MMIO/config_space characteristics, therefore a queue is selected to address
>>>>>> admin commands.
>>>>>>
>>>>>> Motivation for choosing admin queue:
>>>>>> 1. It is anticipated that admin queue will be used for managing and configuring
>>>>>>       many different types of resources. For example,
>>>>>>       a. PCI PF configuring PCI VF attributes.
>>>>>>       b. virtio device creating/destroying/configuring subfunctions discussed in [1]
>>>>>>       c. composing device config space of VF or SF such as mac address, number of VQs, virtio features
>>>>>>
>>>>>>       Mapping all of them as configuration registers to MMIO will require large MMIO space,
>>>>>>       if done for each VF/SF. Such MMIO implementation in physical devices such as PCI PF and VF
>>>>>>       requires on-chip resources to complete within MMIO access latencies. Such resources are very
>>>>>>       expensive.
>>>>>>
>>>>>> 2. Such limitation can be overcome by having a smaller MMIO register set to build
>>>>>>       a command request/response interface. However, such an MMIO based command interface
>>>>>>       will be limited to serving a single outstanding command. Such limitation can
>>>>>>       result in high device creation and composing time, which can affect VM startup time.
>>>>>>       Often a device can queue and service multiple commands in parallel; such a command interface
>>>>>>       cannot use the parallelism offered by the device.
>>>>>>
>>>>>> 3. A command may want to DMA data from one or more physical addresses. For example, in the future a
>>>>>>       live migration command may need to fetch device state consisting of config space, tens of
>>>>>>       VQs state, VLAN and MAC table, per VQ partial outstanding block IO list database and more.
>>>>>>       Packing one or more DMA addresses over a new command interface will be burdensome and continue
>>>>>>       to suffer single outstanding command execution latencies. Such limitation is not good for time
>>>>>>       sensitive live migration use cases.
>>>>>>
>>>>>> 4. A virtio queue overcomes all the above limitations. It also supports DMA and multiple outstanding
>>>>>>       descriptors. A similar mechanism exists today for device specific configuration - the control VQ.
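
The latency argument in points 2 and 4 can be illustrated with a toy model.
This is only a sketch under strong assumptions: every command takes the same
fixed time, and the device services up to `max_outstanding` commands fully in
parallel. The function name and the numbers in the usage note are made up for
illustration and are not measurements of any device.

```python
import math


def total_time_us(num_commands: int, latency_us: int, max_outstanding: int) -> int:
    """Toy model: commands complete in batches of max_outstanding,
    and each batch takes latency_us to finish."""
    if max_outstanding < 1:
        raise ValueError("max_outstanding must be at least 1")
    batches = math.ceil(num_commands / max_outstanding)
    return batches * latency_us


# An MMIO-style mailbox allows one outstanding command at a time.
mmio_time = total_time_us(256, 100, 1)
# A virtqueue allows many outstanding descriptors (64 assumed here).
vq_time = total_time_us(256, 100, 64)
```

Under these assumptions, configuring 256 VFs one command at a time takes 256
command latencies, while a queue with 64 outstanding commands takes 4.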
>>>>>>
>>>>>> [1] https://lists.oasis-open.org/archives/virtio-comment/202108/msg00025.html
>>>>>>
>>>>>> This series was extended and split from the V3 of the "VIRTIO: Provision maximum MSI-X vectors for a VF".
>>>>>> This series includes the comments and fixes from V1-V3 of the initial patch set from above.
>>>>>> The following series introduces the management devices and MSI-X configuration of virtio devices.
>>>>>>
>>>>>> Open issues:
>>>>>> 1. CCW and MMIO specification for admin_queue_index register
>>>>>>
>>>>>> Max Gurtovoy (5):
>>>>>>      virtio: Introduce virtio subsystem
>>>>>>      Introduce Admin Command Set
>>>>>>      Introduce DEVICE INFO Admin command
>>>>>>      Add virtio Admin virtqueue
>>>>>>      Add miscellaneous configuration structure for PCI
>>>>>>
>>>>>>     admin.tex        | 177 +++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>     conformance.tex  |   3 +
>>>>>>     content.tex      |  33 ++++++++-
>>>>>>     introduction.tex |  20 ++++++
>>>>>>     4 files changed, 231 insertions(+), 2 deletions(-)
>>>>>>     create mode 100644 admin.tex
>>>>>>
>>>>>> -- 
>>>>>> 2.21.0
>> This publicly archived list offers a means to provide input to the
>> OASIS Virtual I/O Device (VIRTIO) TC.
>>
>> In order to verify user consent to the Feedback License terms and
>> to minimize spam in the list archive, subscription is required
>> before posting.
>>
>> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
>> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
>> List help: virtio-comment-help@lists.oasis-open.org
>> List archive: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.oasis-open.org%2Farchives%2Fvirtio-comment%2F&amp;data=04%7C01%7Cmgurtovoy%40nvidia.com%7Cf061ce52d05b4d41f26508da0aba6bfa%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637834093644643841%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=HaslbICc4GrV%2FesN2KO%2BsA12Gif%2Fbmi0%2BSzkLXPJvFU%3D&amp;reserved=0
>> Feedback License: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.oasis-open.org%2Fwho%2Fipr%2Ffeedback_license.pdf&amp;data=04%7C01%7Cmgurtovoy%40nvidia.com%7Cf061ce52d05b4d41f26508da0aba6bfa%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637834093644643841%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=R2lnpEmQ3nfDXQOYkRzkdzitviwOLEuhYDMQKkchaOc%3D&amp;reserved=0
>> List Guidelines: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.oasis-open.org%2Fpolicies-guidelines%2Fmailing-lists&amp;data=04%7C01%7Cmgurtovoy%40nvidia.com%7Cf061ce52d05b4d41f26508da0aba6bfa%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637834093644643841%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=Uc8pPeEvC7m2Afui8b8%2Fxk0J5bUlFSsjZ3Jsr%2BQsBY0%3D&amp;reserved=0
>> Committee: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.oasis-open.org%2Fcommittees%2Fvirtio%2F&amp;data=04%7C01%7Cmgurtovoy%40nvidia.com%7Cf061ce52d05b4d41f26508da0aba6bfa%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637834093644643841%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=TCrF3rJ2I94Zgevi4DaTeI2mO%2FL69CarkYs11UgaNRg%3D&amp;reserved=0
>> Join OASIS: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.oasis-open.org%2Fjoin%2F&amp;data=04%7C01%7Cmgurtovoy%40nvidia.com%7Cf061ce52d05b4d41f26508da0aba6bfa%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637834093644643841%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=Jkph2NA2cZKAvPmj0zcgypu%2BtwI0yVH%2Bpo4G8ndmAKc%3D&amp;reserved=0



Thread overview: 36+ messages
2022-03-02 15:56 [PATCH v1 0/5] Introduce virtio subsystem and Admin virtqueue Max Gurtovoy
2022-03-02 15:56 ` [PATCH v1 1/5] virtio: Introduce virtio subsystem Max Gurtovoy
2022-04-04 12:03   ` [virtio-dev] " Michael S. Tsirkin
2022-04-04 15:06     ` [virtio-comment] " Max Gurtovoy
2022-03-02 15:56 ` [virtio-comment] [PATCH v1 2/5] Introduce Admin Command Set Max Gurtovoy
2022-04-04 12:50   ` Michael S. Tsirkin
2022-04-04 15:35     ` Max Gurtovoy
2022-04-04 16:26       ` Michael S. Tsirkin
2022-04-05 10:58         ` [virtio-comment] " Max Gurtovoy
2022-04-05 12:28           ` [virtio-dev] " Michael S. Tsirkin
2022-04-06 17:03             ` [virtio-comment] " Max Gurtovoy
2022-03-02 15:56 ` [PATCH v1 3/5] Introduce DEVICE INFO Admin command Max Gurtovoy
2022-04-04 12:57   ` Michael S. Tsirkin
2022-04-04 15:44     ` Max Gurtovoy
2022-04-04 16:09       ` Michael S. Tsirkin
2022-04-05 11:27         ` [virtio-comment] " Max Gurtovoy
2022-04-05 12:20           ` Michael S. Tsirkin
2022-04-06 17:17             ` [virtio-comment] " Max Gurtovoy
2022-03-02 15:56 ` [PATCH v1 4/5] Add virtio Admin virtqueue Max Gurtovoy
2022-04-04 13:02   ` Michael S. Tsirkin
2022-04-04 15:49     ` Max Gurtovoy
2022-04-04 16:13       ` Michael S. Tsirkin
2022-04-05 11:13         ` [virtio-comment] " Max Gurtovoy
2022-04-05 12:32           ` [virtio-dev] " Michael S. Tsirkin
2022-03-02 15:56 ` [PATCH v1 5/5] Add miscellaneous configuration structure for PCI Max Gurtovoy
2022-04-04 13:04   ` Michael S. Tsirkin
2022-04-04 15:52     ` Max Gurtovoy
2022-04-04 16:16       ` Michael S. Tsirkin
2022-04-05 11:20         ` [virtio-comment] " Max Gurtovoy
2022-04-05 12:12           ` Michael S. Tsirkin
2022-03-09  7:42 ` [PATCH v1 0/5] Introduce virtio subsystem and Admin virtqueue Michael S. Tsirkin
2022-03-10 10:38   ` Max Gurtovoy
2022-03-10 12:49     ` Michael S. Tsirkin
2022-03-10 13:08       ` Max Gurtovoy
2022-03-20 21:41         ` [virtio-comment] " Michael S. Tsirkin
2022-03-27 15:40           ` Max Gurtovoy [this message]
