public inbox for virtio-comment@lists.linux.dev
 help / color / mirror / Atom feed
From: Bertrand Marquis <Bertrand.Marquis@arm.com>
To: Parav Pandit <parav@nvidia.com>
Cc: "Bill Mills (bill.mills@linaro.org)" <bill.mills@linaro.org>,
	"virtio-comment@lists.linux.dev" <virtio-comment@lists.linux.dev>,
	"Edgar E . Iglesias" <edgar.iglesias@amd.com>,
	Arnaud Pouliquen <arnaud.pouliquen@foss.st.com>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Alex Bennee <alex.bennee@linaro.org>,
	Armelle Laine <armellel@google.com>
Subject: Re: [PATCH v1 0/4] virtio-msg transport layer
Date: Wed, 25 Feb 2026 15:17:47 +0000	[thread overview]
Message-ID: <2E2ECAE2-F7ED-4FC4-84AA-CB456F568DBA@arm.com> (raw)
In-Reply-To: <62E0DF94-3401-42E4-9574-02C557849D44@arm.com>

Hi Parav,

> On 25 Feb 2026, at 08:52, Bertrand Marquis <Bertrand.Marquis@arm.com> wrote:
>
> Hi Parav,
>
>> On 25 Feb 2026, at 05:58, Parav Pandit <parav@nvidia.com> wrote:
>>
>>
>>> From: Bertrand Marquis <Bertrand.Marquis@arm.com>
>>> Sent: 20 February 2026 02:10 PM
>>>
>>> Hi Parav,
>>>
>>>> On 13 Feb 2026, at 14:52, Parav Pandit <parav@nvidia.com> wrote:
>>>>
>>>> Hi Bill,
>>>>
>>>>> From: Bill Mills <bill.mills@linaro.org>
>>>>> Sent: 26 January 2026 10:02 PM
>>>>>
>>>>> This series adds the virtio-msg transport layer.
>>>>>
>>>>> The individuals and organizations involved in this effort have had difficulty in
>>>>> using the existing virtio-transports in various situations and desire to add one
>>>>> more transport that performs its transport layer operations by sending and
>>>>> receiving messages.
>>>>>
>>>>> Implementations of virtio-msg will normally be done in multiple layers:
>>>>> * common / device level
>>>>> * bus level
>>>>>
>>>>> The common / device level defines the messages exchanged between the driver
>>>>> and a device. This common part should lead to a common driver holding most
>>>>> of the virtio specifics and can be shared by all virtio-msg bus implementations.
>>>>> The kernel implementation in [3] shows this separation. As with other transport
>>>>> layers, virtio-msg should not require modifications to existing virtio device
>>>>> implementations (virtio-net, virtio-blk etc). The common / device level is the
>>>>> main focus of this version of the patch series.
>>>>>
>>>>> The virtio-msg bus level implements the normal things a bus defines
>>>>> (enumeration, dma operations, etc) but also implements the message send and
>>>>> receive operations. A number of bus implementations are envisioned,
>>>>> some of which will be reusable and general purpose. Other bus implementations
>>>>> might be unique to a given situation, for example only used by a PCIe card
>>>>> and its driver.
>>>>>
>>>>> The standard bus messages are an effort to avoid different bus implementations
>>>>> doing the same thing in different ways for no good reason. However the
>>>>> different environments will require different things. Instead of trying to
>>>>> anticipate all needs and provide something very abstract, we think
>>>>> implementation specific messages will be needed at the bus level. Over time,
>>>>> if we see similar messages across multiple bus implementations, we will move to
>>>>> standardize a bus level message for that.
>>>>>
>>>>
>>>> I would review more, had first round of sparse review.
>>>> Please find few comments/questions below.
>>>>
>>>> 1. device number should be 32-bit in struct virtio_msg_header.
>>>> From SIOV_R2 experiences, we learnt that some uses have use case for more than 64k devices.
>>>> Also mapping PCI BDF wont be enough in 16-bits considering domain field.
>>>
>>> That is a very interesting feedback, we will definitely take this into account.
>>> We will have to decide how big and I will answer that in Demi's mail as there might be some drawbacks with
>>> having very big sizes for the device ID.
>>>
>> I am slowly catching up on the thread.
>> There are two types of device id needed.
>> One is UUID style to uniquely identify the device that may show up using two transports to the driver.
>> With that a driver can create single virtio_device object, which is reachable via two different transports.
>> This offers performance, resiliency.
>> This is likely a bigger string which is not efficient to use during every message transaction.
>>
>> Second one is: within a transport, a device id to identify the communication.
>> I was referring to this device id to be u32, so that transport can support more than 64K devices.
>
> We will increase the device number size to support more than 64k devices.
> For the UUID part, I feel it should be something provided as device information per device, so
> we could add a non-mandatory field (nil-UUID when no ID available) and transfer that information
> as part of GET_DEVICE_INFO.
>
>>
>>
>>
>>> In any case this point was raised by you, Peter and Demi and we will definitely handle it in v2.
>>>
>> Ok.
>>
>>>>
>>>> 2. msg_size of 16-bits for 64KB-8 bytes is too less for data transfer.
>>>> For example, a TCP stream wants to send 64KB of data + payload, needs more than 64KB data.
>>>> Needs 32-bits.
>>>
>>> The point of the transport is not to transfer data, this should be done using virtqueues.
>> How comes virtqueues live outside of the transport?
>> I don't understand this at all.
>> Transport is supposed to transport the descriptors and data also _of_ the virtqueue.
>
> virtqueues do not really live outside of the transport but are defined and used in the same way
> as they are in PCI or MMIO. The transport provide ways to configure the virtqueues but relies on
> the fact that virtqueue content is available to both sides through DMA, shared memory or any other
> custom mean.
>
>>
>> For example you can see that proposal [1] is nearly complete that enables virtio msg layer over various transports.
>> This includes control operations and data operations both.
>> Patch-4 defines the control operations (similar to this proposal)
>> Patch-5 defines the binding of data operation with the specific underlying transports such as tcp, rdma etc.
>>
>> In your case, patch-5 would be for system specific bus such as FF-A or others that others have acked for.
>> But it must be documented somewhere.
>
> Correct me if I'm wrong but this is centered on transfering virqueue content over messages and having
> something that is network friendly. Even though some of the principles could be reused, I think the use
> case is a bit different.
>
> In our use cases we have remote memory access but what we cannot afford is trap-and-emulate to
> access registers (MMIO or PCI) or a hardware specific solution (CCW) (for scheduling issues and
> performance reasons).
>
> Now what could be done easily is extend the virtio-msg proposal with a solution to transfer virtqueues
> over messages to and answer to this use case.
>
>>
>>
>> [1] https://yhbt.net/lore/virtio-comment/20231023104647.290759-2-pizhenwei@bytedance.com/
>>
>>> Transport is only there to allow access to registers, features and configuration, the main transfers are to
>>> be done using virtqueues and i would not expect a network driver to use MMIO registers to write network
>>> packet payloads but to use virtqueues for that (as vsock or net is doing).
>>>
>> Very strange.
>> If that is your intention, this transport thing must be renamed to control-transport to succinctly indicate its objective is for 'control operation'.
>
> The proposal is following what PCI, MMIO and CCW transports are defining as they do not mention how virtqueues data is transmitted.
>
>>
>>> So the msg_size here is just to fit transport specific requests.
>>>
>>>>
>>>> 3. BUS_MSG_EVENT_DEVICE to have symmetric name as ADDED and REMOVED (instead of READY)
>>>> But more below.
>>>
>>> Good point, we will check that to make this coherent with other transports.
>>>
>> Ok.
>>>>
>>>> 4. I dont find the transport messages to read and write to the driver memory supplied in VIRTIO_MSG_SET_VQUEUE addresses to operate
>>> the virtqueues.
>>>> Dont we need VIRTIO_MEM_READ, VIRTIO_MEM_WRITE request and response?
>>>
>>> Equivalent of virtio_mem_read/write is provided by GET/SET CONFIG which allows you to access the configuration area.
>>> So those are the equivalent of mem_read/write.
>>> Does that answer your question ?
>>>
>> No. but I understood that what is proposed here is a partial proposal to have transport only for control operation.
>> And data transfer implementation is undefined.
>
> But configuration register read/write which is not going through virtqueues is still covered.
> Can you confirm and ack we do not need any new message added for mem_read/write ?
>
>>
>>>> Also, the queue notification message is missing at bus level.
>>>
>>> Queue notifications are provided by event_avail and event_used messages.
>>>
>> This does not make sense either.
>> The operations that related towards the virtqueues should be part of the virtqueue operations.
>>
>> By not defining specific bus specific details, we don't seem to gain anything significant.
>> So while transport message are good, it needs to cover virtqueue data exchanges too as previously proposed in [1].
>
> Please see my previous answer and discussions with Demi for notifications handling.
>
>>
>>>> But I dont think queue notification via message good idea anyway.
>>>
>>> As said in the spec, a bus can handle this using MSI or interrupts and generate the messages in the interface
>>> between the bus and the transport. It is not an hard requirement to have those done using messages transfered and
>>> the bus can do what it wants, but it must inform the transport using a message.
>>>
>>>> We rather need a more higher-level message for virtqueues.
>>>> Such as posting the descriptors to device. And this should translate into a VIRTIO_MSG_DESC_SEND or bus specific binding.
>>>> Without this msg transport is incomplete.
>>>
>>> I am not completely understanding what you mean here, are your concerns covered by the event messages ?
>>>
>> We need transport messages agnostic of the transport like patch-4 in [1].
>> And transport binding defined for each bus of fabric like [2].
>> So that message interface can support [1] as well by only extending the virtqueue bindings.
>
> I do think that we could easily support your use case of transfering the messages over network.
> What is not covered by our proposal is definitely virtqueue data transfer but this could be easily
> done as part of a bus implementation without impacting the transport itself.
>
>>
>>
>>>>
>>>> 5. msg_id to be renamed to msg_opcode, and token to be renamed to msg_id as it identifies the msg (as written in the description) in
>>>
>>> We went back and forth on the naming, msg_id could be misunderstood as message opcode.
>>> I guess we will have to find the best consensus here.
>>>
>> Ok.
>>
>>>>
>>>> 6. Bus requirements ordering is too strict for implementing any performant data path as data response may not be in same order as request
>>> for reads.
>>>
>>> This was pointed out in an other mail and i agree here.
>>> I will investigate how we could make this a feature bit which i think could be a good solution.
>>>
>> Ok.
>>>>
>>>> 7. VIRTIO_MSG_SET_VQUEUE does not have bit field for individual addresses.
>>>
>>> Set vqueue has an index to specify the virtqueue index but you have to specific all fields in one go that is true.
>>> Do you need a solution where you could set some fields to a specific value to say "keep current" and only update part of the vqueue
>>> configuration ?
>>>
>> I believe so, otherwise it cannot work with existing drivers without driver side caching them.
>
> I will investigate that possibility.
>
>>
>>>> This requires caching all the values on the driver side before sending the transport request.
>>>> I think it is time for virtio spec to shift to virt queue create and destroy model using the admin queue interface.
>>>> and no need to bring this VIRTIO_MSG_SET_VQUEUE legacy to new transport bindings.
>>>> It may require more plumbing, but it is cleaner interface when a new transport binding is created.
>>>
>>> Admin queue useable with the message transport but I would be interested to understand exactly
>>> the model you are referring to with create/destroy model.
>>> Could you elaborate a bit so that i could understand what messages you would expect and how this would work ?
>>>
>> The suggestion is to not use SET_VQUEUE legacy.
>> The suggestion is to use admin virtqueue to create queues and destroy queues, as they are merely an object.
>> And new transport like your proposal can adapt to the modern style.
>> So only admin queue configuration would be the only message.
>> Rest of the other queues can be created directly using the admin queue.
>
> In a way, set vqueue message could be seen like that as it is a one time operation.
> In most cases, drivers are configuring a virtqueue in one go which is optimized here
> as we need only one message transfer.
> If we include the proposal to also have an enable/disable directly in the message this
> could allow for even less messages.
>
> Using the admin solution on top of virtio message is something possible and not prevented
> by this solution.
>
> In my mind we have an object here as you create/destroy queues in one go and the fact
> that a bus can transfer those requests asynchronously gives a solution equivalent to what
> would be provided by admin vqueues.
>
>>
>> One can say it is orthogonal feature and I agree with that reasoning.
>> The part that bothers me is that the new transport ends up adding some legacy bits like set vqueue.
>
> We still need to support legacy to have existing implementation working and here we can optimize
> them a bit by not transferring one message per field.
>
>>
>>>>
>>>> 8. We should have the message response header for req-resp method with status code.
>>>> (even though they are plain MMIO writes for pci).
>>>> As opposed to that, proposed messages are doing composite tasks, hence we need the status response.
>>>
>>> You want an error code in the header, is that what you mean ?
>> Yes, in the response header.
>
>>
>>> We went into this discussion while designing this, the main issue is that most drivers could not handle an
>>> error as this is something not possible in MMIO or PCI so we decided instead to say that in case of error
>>> a valid answer must be generated (all 0 when doing get_config for example).
>>>
>> Well, what is proposed here is a generic transport for future to come.
>> So even if driver does not handle error for MMIO/PCI register writes, transport would have commands for future to come.
>> So creating V1 structure in future just for the status wouldn't be useful.
>
> We do handle that but the error transport is handled by the bus and is returned as an error code to
> the transport (msg send returning an error number) instead of having an error field in the message.
>
> We felt this design was making things simpler and more compliant to existing designs.
>
>>
>> For LOAD/STORE type of messages, driver can always ignore the error.
>> The messages are used for bus discovery too which is outside of existing drivers scope.
>> So status is useful there too.
>
> I will keep that solution in mind to replace the current one but please consider what I said upper and
> confirm.
>
>>
>>
>>>>
>>>> 9. virtio stack cannot handle hotplug devices today, at least Linux.
>>>> So if the intention is to fit into such a system, DEVICE_BUS_STATE_REMOVED will not be enough.
>>>> A graceful method is needed.
>>>
>>> Hotplug could be handled, remove is more problematic in Linux (you could declare a new device instance
>>> at any time and this should work).
>>>
>>> Could you give more details on what a graceful method would need ?
>>>
>> Graceful would need a flow something like below.
>> 1. Remove device request (sent by device side bus)
>> Device continue to operate while the request is serviced
>>
>> 2. Driver acts on the request and unloads the driver/reset the device etc.
>>
>> 3. Driver (and device) eventually stops using the device by stopping all the bus activities.
>>
>> 4. Bus driver notifies to the device, it is safe to remove and remove request is completed.
>
> This is a transport side handling possible depending on what the operating system provides as solutions to
> handle things this way.
>
> Here we have a generic message saying "this device is out" which leaves space to handle this gracefully on
> top and this can happen in lots of cases where the device is offline without it doing it knowingly (crash, power
> down, you name it).
>
> In a graceful scenario i would expect the driver to ask the device to power down through some device specific
> means which would result in the DEVICE_BUS_STATE_REMOVED messages being sent once it has turn
> itself off ending in your scenario.
>
> Do you agree or did i miss something ?

Coming back on this subject, we have one message defined for a backend to signal that a device is removed.

This is a standard bus message but a specific bus wanting to design a more graceful way to handle device remove
could do so without entering in contradiction with the spec.

I am not sure that adding more standard bus messages to handle this case would really be needed here as any
bus can do it using its own way (with probably a strong dependency on the upper OS) but without any link or
consequence on the generic transport part.

Are you ok with this and letting the message simple in the spec ?
I am ok to add some kind of mention that this is possible or recommended to do but not specified here for example.

Cheers
Bertrand

IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

  parent reply	other threads:[~2026-02-25 15:19 UTC|newest]

Thread overview: 105+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-26 16:32 [PATCH v1 0/4] virtio-msg transport layer Bill Mills
2026-01-26 16:32 ` [PATCH v1 1/4] virtio-msg: add new command for bus normative Bill Mills
2026-02-03 19:42   ` Edgar E. Iglesias
2026-01-26 16:32 ` [PATCH v1 2/4] virtio-msg: Add virtio-msg, a message based virtio transport layer Bill Mills
2026-02-06 16:28   ` Peter Hilber
2026-02-10  9:39     ` Bertrand Marquis
2026-02-12 11:16       ` Peter Hilber
2026-02-20  8:23         ` Bertrand Marquis
2026-02-26 13:53           ` Peter Hilber
2026-02-13 19:09   ` Demi Marie Obenour
2026-02-20  8:52     ` Bertrand Marquis
2026-02-21  2:04       ` Demi Marie Obenour
2026-02-23  7:44         ` Bertrand Marquis
2026-02-24 15:41   ` Demi Marie Obenour
2026-02-24 16:14     ` Bertrand Marquis
2026-02-24 17:36     ` Edgar E. Iglesias
2026-02-24 17:14   ` Demi Marie Obenour
2026-02-24 17:20     ` Bertrand Marquis
2026-02-24 17:46       ` Demi Marie Obenour
2026-02-25  7:26         ` Bertrand Marquis
2026-02-25 12:36   ` Demi Marie Obenour
2026-02-25 12:46     ` Bertrand Marquis
2026-01-26 16:32 ` [PATCH v1 3/4] virtio-msg: link virtio-msg content Bill Mills
2026-02-03 19:43   ` Edgar E. Iglesias
2026-01-26 16:32 ` [PATCH v1 4/4] virtio-msg: add conformance entries in conformance chapter Bill Mills
2026-02-03 19:43   ` Edgar E. Iglesias
2026-01-26 21:47 ` [PATCH v1 0/4] virtio-msg transport layer Demi Marie Obenour
2026-02-03 13:21 ` Michael S. Tsirkin
2026-02-03 19:48   ` Edgar E. Iglesias
2026-02-03 19:55     ` Michael S. Tsirkin
2026-02-04  8:33   ` Bertrand Marquis
2026-02-04 13:50   ` Arnaud POULIQUEN
2026-02-04  3:29 ` Viresh Kumar
2026-02-04  5:34 ` Manivannan Sadhasivam
2026-02-13 13:52 ` Parav Pandit
2026-02-13 19:45   ` Demi Marie Obenour
2026-02-19 17:31     ` Armelle Laine
2026-02-20  8:55     ` Bertrand Marquis
2026-02-19 23:54   ` Michael S. Tsirkin
2026-02-20  6:13     ` Parav Pandit
2026-02-20  9:02       ` Bertrand Marquis
2026-02-25  7:45         ` Manivannan Sadhasivam
2026-02-25  8:03           ` Bertrand Marquis
2026-02-25  8:12             ` Michael S. Tsirkin
2026-02-25 10:06             ` Manivannan Sadhasivam
2026-02-25 10:10               ` Michael S. Tsirkin
2026-02-25 10:14               ` Bertrand Marquis
2026-02-25 10:22               ` Michael S. Tsirkin
2026-02-25 10:53                 ` Manivannan Sadhasivam
2026-02-25 10:24               ` Parav Pandit
2026-02-25 10:35                 ` Bertrand Marquis
2026-02-25 10:52                   ` Michael S. Tsirkin
2026-02-25 10:55                     ` Bertrand Marquis
2026-02-25 10:58                       ` Michael S. Tsirkin
2026-02-25 14:45                   ` Parav Pandit
2026-02-25 14:49                     ` Michael S. Tsirkin
2026-02-25 14:53                       ` Bertrand Marquis
2026-02-25 15:00                         ` Parav Pandit
2026-02-25 15:07                           ` Parav Pandit
2026-02-25 15:12                             ` Bertrand Marquis
2026-02-25 15:15                               ` Michael S. Tsirkin
2026-02-25 15:36                               ` Demi Marie Obenour
2026-02-25 15:40                                 ` Bertrand Marquis
2026-02-25 15:48                                   ` Demi Marie Obenour
2026-02-25 15:51                                     ` Bertrand Marquis
2026-02-25 16:15                                       ` Demi Marie Obenour
2026-02-26  5:40                               ` Manivannan Sadhasivam
2026-02-26  7:05                                 ` Bertrand Marquis
2026-02-25 15:11                           ` Manivannan Sadhasivam
2026-02-25 15:15                             ` Parav Pandit
2026-02-26  5:36                               ` Manivannan Sadhasivam
2026-02-26  5:59                                 ` Parav Pandit
2026-02-26  6:19                                   ` Manivannan Sadhasivam
2026-02-26  7:01                                 ` Bertrand Marquis
2026-02-26  7:28                                   ` Manivannan Sadhasivam
2026-02-26 19:20                                 ` Demi Marie Obenour
2026-02-26 22:08                                   ` Edgar E. Iglesias
2026-02-25 15:23                           ` Demi Marie Obenour
2026-02-25 16:42                         ` Edgar E. Iglesias
2026-02-25 12:53           ` Demi Marie Obenour
2026-02-25 13:09             ` Manivannan Sadhasivam
2026-02-25 13:12               ` Demi Marie Obenour
2026-02-25 13:29                 ` Bertrand Marquis
2026-02-25 15:19                   ` Demi Marie Obenour
2026-02-25 15:27                     ` Bertrand Marquis
2026-02-20 10:03       ` Michael S. Tsirkin
2026-02-25  5:09         ` Parav Pandit
2026-02-25  7:25           ` Michael S. Tsirkin
2026-02-25  9:18             ` Parav Pandit
2026-02-25  9:22               ` Michael S. Tsirkin
2026-02-25  9:35                 ` Bertrand Marquis
2026-02-25  9:54                   ` Michael S. Tsirkin
2026-02-25 10:01                     ` Bertrand Marquis
2026-02-25 10:08                       ` Michael S. Tsirkin
2026-02-20  8:58     ` Bertrand Marquis
2026-02-20  8:40   ` Bertrand Marquis
2026-02-25  4:58     ` Parav Pandit
2026-02-25  7:52       ` Bertrand Marquis
2026-02-25 12:46         ` Demi Marie Obenour
2026-02-25 13:05           ` Bertrand Marquis
2026-02-25 13:09             ` Demi Marie Obenour
2026-02-25 15:17         ` Bertrand Marquis [this message]
2026-02-24 17:57 ` Demi Marie Obenour
2026-02-25 15:21   ` Alex Bennée
2026-02-25 15:46     ` Demi Marie Obenour

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2E2ECAE2-F7ED-4FC4-84AA-CB456F568DBA@arm.com \
    --to=bertrand.marquis@arm.com \
    --cc=alex.bennee@linaro.org \
    --cc=armellel@google.com \
    --cc=arnaud.pouliquen@foss.st.com \
    --cc=bill.mills@linaro.org \
    --cc=edgar.iglesias@amd.com \
    --cc=parav@nvidia.com \
    --cc=viresh.kumar@linaro.org \
    --cc=virtio-comment@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox