From: Max Gurtovoy <mgurtovoy@nvidia.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Cornelia Huck <cohuck@redhat.com>,
virtio-comment@lists.oasis-open.org, mst@redhat.com,
jasowang@redhat.com, oren@nvidia.com, parav@nvidia.com,
shahafs@nvidia.com, eperezma@redhat.com, aadam@redhat.com,
bodong@nvidia.com, amikheev@nvidia.com
Subject: Re: [RFC PATCH v2 1/2] Add virtio Admin Virtqueue specification
Date: Wed, 28 Jul 2021 17:20:29 +0300
Message-ID: <eedd595d-77b9-2921-bbcc-ced2618bccc9@nvidia.com>
In-Reply-To: <YQFev1vXVFLlvW0w@stefanha-x1.localdomain>
On 7/28/2021 4:42 PM, Stefan Hajnoczi wrote:
> On Wed, Jul 28, 2021 at 01:59:26PM +0300, Max Gurtovoy wrote:
>> On 7/28/2021 11:52 AM, Stefan Hajnoczi wrote:
>>> On Tue, Jul 27, 2021 at 06:29:49PM +0300, Max Gurtovoy wrote:
>>>> On 7/27/2021 5:28 PM, Cornelia Huck wrote:
>>>>> On Tue, Jul 27 2021, Stefan Hajnoczi <stefanha@redhat.com> wrote:
>>>>>
>>>>>> On Mon, Jul 26, 2021 at 07:52:53PM +0300, Max Gurtovoy wrote:
>>>>>>> Admin virtqueues will be used to send administrative commands to
>>>>>>> manipulate various features of the device which would not easily map
>>>>>>> into the configuration space.
>>>>>>>
>>>>>>> The same Admin command format will be used for all virtio devices. The
>>>>>>> Admin command set will include 4 types of command classes:
>>>>>>> 1. The generic common class
>>>>>>> 2. The transport specific class
>>>>>>> 3. The device specific class
>>>>>>> 4. The vendor specific class
>>>>>>>
>>>>>>> The above mechanism will enable adding various features to the virtio
>>>>>>> specification, e.g.:
>>>>>>> 1. Format virtio-blk devices in various configurations (512B block size,
>>>>>>> 512B + 8B T10-DIF, 4K block size, 4k + 8B T10-DIF, etc..).
>>>>>>> 2. Live migration management.
>>>>>>> 3. Encrypt/Decrypt descriptors.
>>>>>>> 4. Virtualization management.
>>>>>>> 5. Get device error logs.
>>>>>>> 6. Implement advanced vendor/device/transport specific features.
>>>>>>> 7. Run device health test.
>>>>>>> 8. More.
>>>>>>>
>>>>>>> As virtio evolves beyond the para-virt/sw-emulated world, it's mandatory
>>>>>>> for the specification to become flexible and allow a wider feature set.
>>>>>>> The current ctrl virtq that is defined for some of the virtio devices is
>>>>>>> device specific and wasn't designed to be a generic virtq for
>>>>>>> administration.
>>>>>>>
>>>>>>> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>>>>>>> ---
>>>>>>> admin-virtq.tex | 241 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>> content.tex | 4 +
>>>>>>> 2 files changed, 245 insertions(+)
>>>>>>> create mode 100644 admin-virtq.tex
>>>>>>>
>>>>>>> diff --git a/admin-virtq.tex b/admin-virtq.tex
>>>>>>> new file mode 100644
>>>>>>> index 0000000..ccec2ca
>>>>>>> --- /dev/null
>>>>>>> +++ b/admin-virtq.tex
>>>>>>> @@ -0,0 +1,241 @@
>>>>>>> +\section{Admin Virtqueues}\label{sec:Basic Facilities of a Virtio Device / Admin Virtqueues}
>>>>>>> +
>>>>>>> +Admin virtqueues are used to send administrative commands to manipulate
>>>>>>> +various features of the device which would not easily map into the
>>>>>>> +configuration space.
>>>>>>> +
>>>>>>> +Use of Admin virtqueues is negotiated by the VIRTIO_F_ADMIN_VQ
>>>>>>> +feature bit.
>>>>>>> +
>>>>>>> +Admin virtqueue index may vary among different device types.
>>>>>>> +
>>>>>>> +All commands are of the following form:
>>>>>>> +
>>>>>>> +\begin{lstlisting}
>>>>>>> +struct virtio_admin_cmd {
>>>>>>> + /* Device-readable part */
>>>>>>> + u8 class;
>>>>>>> + u8 command;
>>>>>>> + u8 command-specific-data[];
>>>>>>> +
>>>>>>> + /* Device-writable part */
>>>>>>> + u8 command-specific-result[];
>>>>>>> + u8 status_type : 4;
>>>>>>> + u8 reserved : 4;
>>>>>>> + u8 status;
>>>>>>> +};
>>>>>>> +
>>>>>>> +/* Status type values */
>>>>>>> +#define VIRTIO_ADMIN_STATUS_TYPE_GENERIC 0
>>>>>>> +#define VIRTIO_ADMIN_STATUS_TYPE_CLASS_SPECIFIC 1
>>>>>>> +#define VIRTIO_ADMIN_STATUS_TYPE_COMMAND_SPECIFIC 2
>>>>>>> +#define VIRTIO_ADMIN_STATUS_TYPE_TRANSPORT_SPECIFIC 3
>>>>>>> +#define VIRTIO_ADMIN_STATUS_TYPE_DEVICE_SPECIFIC 4
>>>>>>> +#define VIRTIO_ADMIN_STATUS_TYPE_VENDOR_SPECIFIC 5
>>>>>>> +
>>>>>>> +/* Generic status values */
>>>>>>> +#define VIRTIO_ADMIN_STATUS_GENERIC_OK 0
>>>>>>> +#define VIRTIO_ADMIN_STATUS_GENERIC_ERR 1
>>>>>>> +#define VIRTIO_ADMIN_STATUS_GENERIC_INVALID_CLASS 2
>>>>>>> +#define VIRTIO_ADMIN_STATUS_GENERIC_INVALID_COMMAND 3
>>>>>>> +#define VIRTIO_ADMIN_STATUS_GENERIC_DATA_TRANSFER_ERR 4
>>>>>>> +#define VIRTIO_ADMIN_STATUS_GENERIC_DEVICE_INTERNAL_ERR 5
>>>>>>> +\end{lstlisting}
>>>>> This is very complex, and it feels like we're overengineering this.
>>>> Do you mean the status type and the status ?
>>>>
>>>>>>> +
>>>>>>> +The \field{class}, \field{command} and \field{command-specific-data} are
>>>>>>> +set by the driver, and the device sets the \field{status_type}, the
>>>>>>> +\field{status} and the \field{command-specific-result}, if needed.
>>>>>>> +
>>>>>>> +The virtio Admin command class codes are divided in the following form:
>>>>>>> +
>>>>>>> +\begin{lstlisting}
>>>>>>> +/* class values that are transport, device and vendor independent */
>>>>>>> +#define VIRTIO_ADMIN_COMMON_CLASS_START 0
>>>>>>> +#define VIRTIO_ADMIN_COMMON_CLASS_END 63
>>>>>>> +
>>>>>>> +/* class values that are transport specific */
>>>>>>> +#define VIRTIO_ADMIN_TRANSPORT_CLASS_START 64
>>>>>>> +#define VIRTIO_ADMIN_TRANSPORT_CLASS_END 127
>>>>>>> +
>>>>>>> +/* class values that are device specific */
>>>>>>> +#define VIRTIO_ADMIN_DEVICE_CLASS_START 128
>>>>>>> +#define VIRTIO_ADMIN_DEVICE_CLASS_END 191
>>>>>>> +
>>>>>>> +/* class values that are vendor specific */
>>>>>>> +#define VIRTIO_ADMIN_VENDOR_CLASS_START 192
>>>>>>> +#define VIRTIO_ADMIN_VENDOR_CLASS_END 255
>>>>>>> +\end{lstlisting}
>>>>>>> +
>>>>>>> +\subsection{Admin command set}\label{sec:Basic Facilities of a Virtio Device / Admin Virtqueues / Admin command set}
>>>>>>> +
>>>>>>> +Each virtio device that advertise VIRTIO_F_ADMIN_VQ feature, MUST
>>>>>> "advertises the VIRTIO_F_ADMIN_VQ feature"
>>>>>>
>>>>>>> +support all the mandatory admin commands. A device MAY support also
>>>>>>> +one or more optional admin commands.
>>>>>>> +
>>>>>>> +\subsubsection{Common command set}\label{sec:Basic Facilities of a Virtio Device / Admin Virtqueues / Admin command set / Common command set}
>>>>>>> +
>>>>>>> +The Common command set is a group of classes and commands within each
>>>>>>> +of these classes which are transport, device and vendor independent.
>>>>>>> +A mandatory class is a class that has at least one mandatory command.
>>>>>>> +The Common command set is summarized in following table:
>>>>>>> +
>>>>>>> +\begin{tabular}{|l|l|l|}
>>>>>>> +\hline
>>>>>>> +Class & Description & M/O \\
>>>>>>> +\hline \hline
>>>>>>> +0 & VIRTIO_ADMIN_DISCOVER_DEVICE & M \\
>>>>>>> +\hline
>>>>>>> +1 & VIRTIO_ADMIN_DISCOVER_DEVICE_CLASS_COMMANDS & M \\
>>>>>>> +\hline
>>>>>>> +2-63 & reserved & - \\
>>>>>>> +\hline
>>>>>>> +\end{tabular}
>>>>>>> +
>>>>>>> +\paragraph{Discover device class}\label{sec:Basic Facilities of a Virtio Device / Admin Virtqueues / Admin command set / Common command set / Discover device class}
>>>>>>> +
>>>>>>> +This class (opcode: 0) of commands is used to query generic device
>>>>>>> +information. The following table describes the commands supported for
>>>>>>> +this class:
>>>>>>> +
>>>>>>> +\begin{tabular}{|l|l|l|}
>>>>>>> +\hline
>>>>>>> +Command & Description & M/O \\
>>>>>>> +\hline \hline
>>>>>>> +0 & VIRTIO_ADMIN_DISCOVER_DEVICE_IDENTITY & M \\
>>>>>>> +\hline
>>>>>>> +1 & VIRTIO_ADMIN_DISCOVER_DEVICE_SUPPORTED_CLASSES & M \\
>>>>>>> +\hline
>>>>>>> +2-255 & reserved & - \\
>>>>>>> +\hline
>>>>>>> +\end{tabular}
>>>>>>> +
>>>>>>> +\subparagraph{Device identity command}\label{sec:Basic Facilities of a Virtio Device / Admin Virtqueues / Admin command set / Common command set / Discover device class / Device identity command}
>>>>>>> +
>>>>>>> +This mandatory command should return device identity in the following
>>>>>>> +structure:
>>>>>>> +
>>>>>>> +\begin{tabular}{|l|l|l|}
>>>>>>> +\hline
>>>>>>> +Bytes & Description & M/O \\
>>>>>>> +\hline \hline
>>>>>>> +03:00 & VIRTIO DEVICE ID & M \\
>>>>>>> +\hline
>>>>>>> +05:04 & VIRTIO TRANSPORT ID & M \\
>>>>>> These fields are not defined. I wonder why they are necessary - the
>>>>>> driver should already have this information.
>>>>> Agreed.
>>>> These are initial fields.
>>>>
>>>> We can also add model, serial_number and more in the future.
>>>>
>>>>
>>>>>> In general, I'm a little concerned that this whole infrastructure will
>>>>>> increase the complexity of VIRTIO significantly with little benefit. I
>>>>>> do think an admin virtqueue makes sense, e.g. for migration, but would
>>>>>> prefer it if we focus on actual commands first instead of
>>>>>> infrastructure. That way it will be clear what infrastructure is needed.
>>>> The admin virtq is not only for migration.
>>>>
>>>> You'll be able to configure virtio device properties using user space tools
>>>> like: virtio-cli.
>>>>
>>>> For example: format a block device, manage virtual function resources using
>>>> its PF, query for error logs, device health and more.
>>> That sounds good.
>>>
>>>> In the SW world maybe all the above were redundant, but now that you have
>>>> more and more HW virtio devices the protocol should be more flexible and
>>>> adjust.
>>> HW is not special in this regard, I think this will be useful for
>>> software too. In-band admin commands are necessary for nested
>>> virtualization, for example. They also provide a standard admin
>>> interface for out-of-process devices (vhost-user, etc).
>>>
>>>> A few weeks ago I sent concrete commands for live migration, but then I
>>>> was told that new infrastructure (the admin virtq) should be developed first,
>>>> and this is what I did in this RFC.
>>>>
>>>> If you combine the two RFCs you can imagine what is needed here for adding
>>>> live migration support.
>>>>
>>>> But I want to add it step by step.
>>>>
>>>> We need to agree on the infrastructure.
>>>>
>>>>> A concrete example would be good, but I think we can come up with a
>>>>> bare-bones spec to start with.
>>>>>
>>>>> - feature bit for the admin vq, as defined here
>>>>> - location of the admin vq is device specific
>>>>> - I think we can get away with two classes, as for feature bits (not
>>>>> device specific and device specific); I don't think we need separate
>>>>> classes for transport or vendor specific
>>>> We need it for live migration probably. It will be a transport class.
>>>>
>>>> Vendor specific is also important to allow vendors to develop their special
>>>> sauce.
>>>>
>>>>> - make the format for the request simple (command + length + payload?)
>>>> I used almost the same format as virtio net ctrl queue.
>>> The virtio_net_ctrl packet format looks good to me, it's close to
>>> Cornelia's command + length + payload suggestion:
>> I guess I didn't understand Cornelia's suggestion.
>>
>>
>>> struct virtio_net_ctrl {
>>> u8 class;
>>> u8 command;
>>> u8 command-specific-data[];
>>> u8 ack;
>>> };
>>> /* ack values */
>>> #define VIRTIO_NET_OK 0
>>> #define VIRTIO_NET_ERR 1
>>>
>>> I'm not sure how vendor commands will be allocated though. Will each
>>> vendor get a unique class id to prevent collisions? If we want to
>>> support cross-implementation migration then it may be necessary to allow
>>> vendor command availability to change while the device is running.
>> vendor specific commands can collide.
>>
>> Vendor A can implement class 192 to do X and Vendor B can implement class
>> 192 to do Y.
>>
>> What do you mean by "support cross-implementation migration"?
> Migrating from vhost_net to vDPA virtio-net, for example. Or migrating
> between two different vDPA virtio-net implementations.
>
> If vendor commands are all in a single namespace then the guest cannot
> use them without the risk of the command accidentally executing on the
> migration destination (where it has a different effect because the
> vendor has changed!).
>
>>> I prefer the simpler struct virtio_net_ctrl format to the more
>>> complicated one proposed in this patch series.
>> This is the same besides adding status type
>>
>> u8 status_type : 4;
>> u8 reserved : 4;
> I'm not sure why it's needed.
If we can live with 256 status codes, I guess we can drop it and divide
the status space into groups:
/* status values that are transport, device and vendor independent */
#define VIRTIO_ADMIN_STATUS_GENERIC_START 0
#define VIRTIO_ADMIN_STATUS_GENERIC_END 63
/* status values that are transport specific */
#define VIRTIO_ADMIN_STATUS_TRANSPORT_START 64
#define VIRTIO_ADMIN_STATUS_TRANSPORT_END 127
/* status values that are device specific */
#define VIRTIO_ADMIN_STATUS_DEVICE_START 128
#define VIRTIO_ADMIN_STATUS_DEVICE_END 191
/* status values that are vendor specific */
#define VIRTIO_ADMIN_STATUS_VENDOR_START 192
#define VIRTIO_ADMIN_STATUS_VENDOR_END 255
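
As a rough illustration of the grouping above, a driver could map a raw
status byte to its group with a simple range check. This is only a sketch:
the enum and the function virtio_admin_status_to_group() are hypothetical
names that are not part of the proposal; only the START/END macros come
from the list above.

```c
#include <assert.h>
#include <stdint.h>

/* Status ranges as proposed in this message. */
#define VIRTIO_ADMIN_STATUS_GENERIC_START   0
#define VIRTIO_ADMIN_STATUS_GENERIC_END     63
#define VIRTIO_ADMIN_STATUS_TRANSPORT_START 64
#define VIRTIO_ADMIN_STATUS_TRANSPORT_END   127
#define VIRTIO_ADMIN_STATUS_DEVICE_START    128
#define VIRTIO_ADMIN_STATUS_DEVICE_END      191
#define VIRTIO_ADMIN_STATUS_VENDOR_START    192
#define VIRTIO_ADMIN_STATUS_VENDOR_END      255

/* Hypothetical enum for illustration, not defined by the proposal. */
enum virtio_admin_status_group {
    VIRTIO_ADMIN_STATUS_GROUP_GENERIC,
    VIRTIO_ADMIN_STATUS_GROUP_TRANSPORT,
    VIRTIO_ADMIN_STATUS_GROUP_DEVICE,
    VIRTIO_ADMIN_STATUS_GROUP_VENDOR,
};

/* The four ranges are meant to cover the whole u8 status space. */
static_assert(VIRTIO_ADMIN_STATUS_VENDOR_END == UINT8_MAX,
              "status groups must cover all 256 u8 values");

/* Hypothetical helper: classify a status byte by the range it falls in. */
static inline enum virtio_admin_status_group
virtio_admin_status_to_group(uint8_t status)
{
    if (status <= VIRTIO_ADMIN_STATUS_GENERIC_END)
        return VIRTIO_ADMIN_STATUS_GROUP_GENERIC;
    if (status <= VIRTIO_ADMIN_STATUS_TRANSPORT_END)
        return VIRTIO_ADMIN_STATUS_GROUP_TRANSPORT;
    if (status <= VIRTIO_ADMIN_STATUS_DEVICE_END)
        return VIRTIO_ADMIN_STATUS_GROUP_DEVICE;
    return VIRTIO_ADMIN_STATUS_GROUP_VENDOR;
}
```

With this layout the single u8 status carries what status_type was meant
to convey, at the cost of limiting each group to 64 codes.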
>
>> I split "u8 command-specific-data[];"
>> to
>> "u8 command-specific-data[];
>> u8 command-specific-result[];"
>>
>> to emphasize that there is some data that can be written by the device and some data written by the driver in the same command.
>> And this is also the case in virtio-net-ctrl, right ?
> The split makes sense to me.
>
>>>>> How many different (groups of) commands can we reasonably expect? Do we
>>>>> need a generic discovery command, or can we get away with a feature bit
>>>>> covering each new group of commands?
>>>> I can't predict the future but IMO we need a discovery command.
>>>>
>>>> We have many devices and more can be added in the future.
>>> A <u8 class, u8 command> space is 65536 bits or 8KB. I think admin
>>> commands would not be included in VIRTIO Feature Bits but instead
>>> reported via a separate admin command that returns up to 8KB of data:
>>>
>>> struct virtio_admin_report_cmds {
>>> /* Bitmap of available admin commands [Device->Driver]
>>> * bool command_present =
>>> * command_bits[class * 32 + command / 8] & (command % 8);
>>> */
>>> u8 command_bits[8192];
>>> };
>> Yes, I divided it into multiple commands per class to cover the case where
>> we will need more than 1 bit to describe a command.
>>
>> But I guess we can add it later on.
>>
>> I think the above should be:
>>
>> bool command_present = command_bits[class * 32 + command / 8] & (1 << (command % 8));
>>
>> isn't it ?
> You're right. I forgot to shift the bit :D.
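
For reference, the corrected lookup can be sketched in C as follows. The
struct layout and the 8192-byte bitmap come from the proposal quoted above;
the helper names virtio_admin_cmd_present() and virtio_admin_cmd_set(), and
the parameter name class_id (used instead of class so the code also compiles
as C++), are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* 256 classes * 256 commands = 65536 bits = 8192 bytes. */
#define VIRTIO_ADMIN_CMD_BITMAP_BYTES 8192

struct virtio_admin_report_cmds {
    /* Bitmap of available admin commands [Device->Driver] */
    uint8_t command_bits[VIRTIO_ADMIN_CMD_BITMAP_BYTES];
};

/*
 * Hypothetical driver-side helper. Each class owns 32 consecutive bytes;
 * within a byte, command N uses bit (N % 8), counted from the least
 * significant bit, as in the corrected formula in this thread.
 */
static inline bool
virtio_admin_cmd_present(const struct virtio_admin_report_cmds *r,
                         uint8_t class_id, uint8_t command)
{
    assert(r != NULL);
    size_t byte = (size_t)class_id * 32 + command / 8;
    return (r->command_bits[byte] >> (command % 8)) & 1u;
}

/* Hypothetical device-side counterpart that marks a command as available. */
static inline void
virtio_admin_cmd_set(struct virtio_admin_report_cmds *r,
                     uint8_t class_id, uint8_t command)
{
    assert(r != NULL);
    size_t byte = (size_t)class_id * 32 + command / 8;
    r->command_bits[byte] |= (uint8_t)(1u << (command % 8));
}
```

Without the shift, the original expression masks the byte with the bit
index (0 to 7) rather than with a single-bit mask, so it reports wrong
results for most commands.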
>
>> Also what do you think about renaming <class, command> to <opcode, opmod> ?
> I need to understand how opcode and opmod values are used. I'm not sure.
Same as class and command just with different naming.
>
> Stefan