* [virtio-comment] Next VirtIO device for Project Stratos?
@ 2022-05-31 8:07 Alex Bennée
2022-06-06 9:35 ` Bradford, Robert
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Alex Bennée @ 2022-05-31 8:07 UTC (permalink / raw)
To: stratos-dev, virtio-dev, virtio-comment, virtio-comment
Cc: Viresh Kumar, Mathieu Poirier, Mike Holmes, Matt Spencer,
Peter Griffin, Dan Milea, Bill Mills, Francois Ozog,
Johannes Berg, Gerd Hoffmann, Arnd Bergmann, Christian Pinto,
Namhyung Kim, Petre Eftime, Peter Hilber, Marcel Holtmann,
Michael S. Tsirkin, Stefan Hajnoczi
Hi,
This email is driven by a brainstorming session at a recent sprint
where we considered what VirtIO devices we should look at implementing
next. I ended up going through all the assigned device IDs hunting for
missing spec discussion and existing drivers so I'd welcome feedback
from anybody actively using them - especially as my suppositions about
device types I'm not familiar with may be way off!
Work so far
===========
The devices we've tackled so far have been relatively simple ones and
more focused on the embedded workloads. Both the i2c and gpio virtio
devices allow for a fairly simple backend which can multiplex multiple
client VM requests onto a set of real HW presented via the host OS.
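As a rough illustration of what "multiplexing onto real HW" means, here
is a minimal Python sketch. It is not the vhost-device code: FakeI2cBus,
I2cBackend and the per-guest allow list are all hypothetical names made
up for this example, and a real backend would talk to the host adapter
rather than a dictionary.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class FakeI2cBus:
    """Stand-in for a real host adapter: one byte per (address, register)."""
    regs: dict = field(default_factory=dict)

    def write(self, addr, reg, value):
        self.regs[(addr, reg)] = value & 0xFF

    def read(self, addr, reg):
        return self.regs.get((addr, reg), 0)


class I2cBackend:
    """Serialises requests from several guests onto a single bus."""

    def __init__(self, bus, allowed):
        self._bus = bus
        # Assumption: guest id -> set of I2C addresses that guest may touch.
        self._allowed = allowed
        self._lock = threading.Lock()

    def transfer(self, guest, addr, reg, value=None):
        """Read a register (value is None) or write one on behalf of a guest."""
        if addr not in self._allowed.get(guest, set()):
            raise PermissionError(f"{guest} may not access 0x{addr:02x}")
        with self._lock:  # only one transaction on the wire at a time
            if value is None:
                return self._bus.read(addr, reg)
            self._bus.write(addr, reg, value)
            return None
```

The lock is what does the multiplexing here; the allow list is one
possible way of keeping guests away from each other's devices.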
We have also done some work on a vhost-user backend for virtio-video and
have a working PoC although it is a couple of iterations behind the
latest submission to the virtio spec. Continuing work on this is
currently paused while Peter works on libcamera related things (although
more on that later).
Upstream first
==============
We've been pretty clear about the need to do things in an upstream
compatible way which means devices should be:
- properly specified in the OASIS spec
- have at least one driver up-streamed (probably in Linux)
- have a working public backend
For Stratos I think we are pretty happy to implement all new backends in
Rust under the auspices of the rust-vmm project and the vhost-device
repository.
We obviously also need a reasonable use case for why abstracting a HW
type is useful. For example i2c was envisioned as useful on mobile
devices where a lot of disparate auxiliary HW is often hanging off an i2c
bus.
Current reserved IDs
====================
Looking at the spec there are currently 42 listed device types in the
reserved ID table. While there are quite a lot that have Linux driver
implementations a number are nothing more than reserved numbers:
ioMemory / 6
------------
No idea what this was meant to be.
rpmsg / 7
---------
Not formalised in the specification but there is a driver in the Linux
kernel. AFAIUI it's a fairly simple wrapper around the existing
rpmsg bus. I think this has also been used for OpenAMP's hypervisor-less
VirtIO experiments to communicate between processor domains.
mac80211 wlan / 10
mac80211 hwsim wireless simulation device / 29
----------------------------------------------
When discussion about a virtio-wifi device comes up there is inevitably a
debate about what the use case is. There are usually two potential use
cases:
- simulation environment
Here the desire is to have something that looks like a real WiFi
device in simulation so the rest of the stack (up from the driver)
can be the same as when running on real HW.
- abstraction environment
Devices with WiFi are different from fixed networking as they need
to deal with portability events like changing networks and reporting
connection status and quality. If the guest VM is responsible for
the UI it needs to gather this information and generally wants its
userspace components to use the same kernel APIs to get it as it
would with real HW.
Neither of these has a specification up-streamed to OASIS but there
is an implementation of the mac80211_hwsim driver in the Linux kernel. I found
evidence of a plain 80211 virtio_wifi.c existing in the Android kernel
trees. So far I've been unable to find backends for these devices but I
assume they must exist if the drivers do!
Debates about what sort of features and control channels need to be
supported often run into questions about why existing specifications
can't be expanded (for example expand virtio-net with a control channel
to report additional wifi related metadata) or use pass through sockets
for talking to the host netlink channel.
rproc serial / 11
-----------------
Again this isn't documented in the standard. I'm not sure if this is
related to rpmsg but there is an implementation as part of the kernel
virtio_console code.
virtio CAIF / 12
----------------
Not documented in the specification although there is a driver in the
kernel as part of the orphaned CAIF networking subsystem. From the
kernel documentation this was a sub-system for talking to modem parts.
memory balloon / 13
-------------------
This seems like an abandoned attempt at a next generation version of the
memory ballooning interface.
Timer/Clock device / 17
-----------------------
This looks like a simple reservation with no proposed implementation.
I don't know if there is a case for this on most modern architectures
which usually have virtualised architected timers anyway.
Access to RTC information may be something that is mediated by
firmware/system control buses. For emulation there are a fair number of
industry standard RTC chips modelled and RTC access tends not to be
performance critical.
Signal Distribution Module / 21
-------------------------------
This appears to be an intra-domain communication channel for which an RFC
was posted:
https://lists.oasis-open.org/archives/virtio-dev/201606/msg00030.html
It came with references to kernel and QEMU implementations. I don't know
if this approach has been obviated by other communication channels like
vsock or scmi.
pstore device / 22
------------------
This appears to be a persistent storage device that was intended to
allow guests to dump information like crash dumps. There was a proposed
kernel driver:
https://lwn.net/Articles/698744/
and a proposed QEMU backend:
https://lore.kernel.org/all/1469632111-23260-1-git-send-email-namhyung@kernel.org/
which were never merged. As far as I can tell there was no proposal for
the virtio spec itself.
Video encoder device / 30
Video decoder device / 31
-------------------------
This is an ongoing development which has iterated several versions of
the spec and the kernel side driver.
NitroSecureModule / 33
----------------------
This is a stripped down Trusted Platform Module (TPM) intended to expose
TPM functionality such as cryptographic functions and attestation to
guests. This looks like it is closely tied with AWS's Nitro Enclaves.
I haven't been able to find any public definition of the spec or
implementation details. How would this interact with other TPM
functionality solutions?
Watchdog / 35
-------------
Discussion about this is usually conflated with reset functionality as
the two are intimately related.
An early interest in this was for providing well specified reset
functionality to firmware running on the -M virt machine model in QEMU. The
need has been reduced somewhat with the provision of the sbsa-ref model
which does have a defined reset pin.
Other questions that would need to be answered include how the
functionality would interact with the hypervisor given a vCPU could
easily not be scheduled by it and therefore miss its kick window.
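To make the scheduling concern concrete, here is a small Python
simulation. It does not follow any virtio watchdog design (none is
specified yet): the Watchdog class, the 15 second timeout and the
5 second kick interval are assumptions picked purely for illustration.

```python
class Watchdog:
    """Fires if the guest has not kicked it within timeout_s seconds."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_kick = None  # not armed until the first kick

    def kick(self, now):
        self.last_kick = now

    def expired(self, now):
        return self.last_kick is not None and now - self.last_kick > self.timeout_s


def first_expiry(timeout_s, kick_times, horizon_s):
    """Step through whole seconds; return when the watchdog fires, or None."""
    wd = Watchdog(timeout_s)
    kicks = set(kick_times)
    for now in range(horizon_s + 1):
        if now in kicks:
            wd.kick(now)
        if wd.expired(now):
            return now
    return None


# Guest kicks every 5 s and is scheduled normally: never fires.
healthy = first_expiry(15, range(0, 61, 5), 60)

# Same guest, but its vCPU is not scheduled between t=10 and t=30,
# so the kicks due at 15, 20 and 25 never happen.
stalled = first_expiry(15, [0, 5, 10, 30, 35, 40], 60)
```

With these numbers healthy is None and stalled is 26: the guest did
nothing wrong, yet the watchdog fires because the host never ran it. A
real design would have to decide whether time the vCPU spent descheduled
should count against the timeout at all.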
Currently there have been no proposals for the spec or implementations.
CAN / 36
--------
This is a device of interest to the Automotive industry as it looks to
consolidate numerous ECUs into VM based workloads. There was a proposed
RFC last year:
https://markmail.org/message/hdxj35fsthypllkt?q=virtio-can+list:org%2Eoasis-open%2Elists%2Evirtio-comment
and it is presumed there are frontend and backend drivers in vendor
trees. At the last AGL virtualization expert meeting the OpenSynergy
guys said they hoped to post new versions of the spec and kernel driver
soon:
https://confluence.automotivelinux.org/pages/viewpage.action?spaceKey=VE&title=Meeting+Agenda#MeetingAgenda-May252022
During our discussion it became clear that while the message bus itself
was fairly simple, real HW often has a vendor specific control plane to
enable specific features. Being able to present this flexibility via the
virtio interface without baking in a direct mapping of the HW would be
the challenge.
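For a sense of how simple the message side is, here is a Python sketch
that packs and unpacks a classic CAN frame. The layout is modelled on
Linux SocketCAN's struct can_frame (fixed to little-endian here), not on
the virtio-can wire format, which was still under discussion;
pack_frame and unpack_frame are names invented for this example.

```python
import struct

CAN_EFF_FLAG = 0x80000000  # set in can_id for extended (29-bit) identifiers
CAN_SFF_MASK = 0x000007FF  # standard 11-bit identifier
CAN_EFF_MASK = 0x1FFFFFFF  # extended 29-bit identifier

# 32-bit id, 8-bit payload length, 3 pad bytes, 8 data bytes = 16 bytes.
CAN_FRAME_FMT = "<IB3x8s"


def pack_frame(can_id, data, extended=False):
    """Encode one classic CAN frame (at most 8 payload bytes)."""
    if len(data) > 8:
        raise ValueError("classic CAN carries at most 8 data bytes")
    limit = CAN_EFF_MASK if extended else CAN_SFF_MASK
    if not 0 <= can_id <= limit:
        raise ValueError("identifier out of range")
    if extended:
        can_id |= CAN_EFF_FLAG
    return struct.pack(CAN_FRAME_FMT, can_id, len(data), data.ljust(8, b"\x00"))


def unpack_frame(raw):
    """Decode a frame into (identifier, payload, is_extended)."""
    can_id, length, data = struct.unpack(CAN_FRAME_FMT, raw)
    extended = bool(can_id & CAN_EFF_FLAG)
    return can_id & CAN_EFF_MASK, data[:length], extended
```

Everything that makes real controllers differ (bit timing, filters,
error handling, vendor features) lives outside this frame, which is
where the control plane problem comes from.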
Parameter Server / 38
---------------------
This is a proposal for a key-value parameter store over virtio. The
exact use case is unclear but I suspect for Arm at least there is
overlap with what is already supported by DT and UEFI variables.
The proposal only seems to have been partially archived on the lists:
https://www.mail-archive.com/virtio-dev@lists.oasis-open.org/msg07201.html
It may be Android related?
Audio policy device / 39
------------------------
Again I think this stems from the Android world and provides a policy
and control device to work in concert with the virtio-sound device. The
initial proposal to the list is here:
https://www.mail-archive.com/virtio-dev@lists.oasis-open.org/msg07255.html
The idea seems to be to have a control layer for dealing with routing
and priority of multiple audio streams.
Bluetooth device / 40
---------------------
Bluetooth suffers from similar complexity problems as 802.11 WiFi.
However the virtio_bt driver in the kernel concentrates on providing a
pipe for a standardised Host Controller Interface (HCI) albeit with support
for a selection of vendor specific commands.
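To show how little the pipe itself has to understand, here is a Python
sketch of HCI packet framing. The command and event header layouts come
from the Bluetooth Core specification; prefixing each packet with a
one-byte packet type (as the UART H4 transport does) is an assumption
about what the pipe carries, and frame_command and parse_frame are
hypothetical helper names.

```python
import struct

# H4-style packet type indicators.
HCI_COMMAND_PKT = 0x01
HCI_EVENT_PKT = 0x04


def hci_opcode(ogf, ocf):
    """Combine a 6-bit opcode group and a 10-bit command field."""
    if not 0 <= ogf <= 0x3F or not 0 <= ocf <= 0x3FF:
        raise ValueError("OGF is 6 bits, OCF is 10 bits")
    return (ogf << 10) | ocf


def frame_command(ogf, ocf, params=b""):
    """Build type byte + little-endian opcode + length + parameters."""
    if len(params) > 255:
        raise ValueError("command parameters are limited to 255 bytes")
    header = struct.pack("<BHB", HCI_COMMAND_PKT, hci_opcode(ogf, ocf), len(params))
    return header + params


def parse_frame(frame):
    """Split a framed command or event into its header fields and payload."""
    pkt_type = frame[0]
    if pkt_type == HCI_COMMAND_PKT:
        opcode, plen = struct.unpack_from("<HB", frame, 1)
        return {"type": "command", "ogf": opcode >> 10,
                "ocf": opcode & 0x3FF, "params": frame[4:4 + plen]}
    if pkt_type == HCI_EVENT_PKT:
        code, plen = struct.unpack_from("<BB", frame, 1)
        return {"type": "event", "code": code, "params": frame[3:3 + plen]}
    raise ValueError(f"unhandled packet type 0x{pkt_type:02x}")


# HCI_Reset lives in the Controller & Baseband group: OGF 0x03, OCF 0x0003.
reset = frame_command(0x03, 0x0003)
```

Vendor specific commands use OGF 0x3F, so they pass through the same
framing; only their parameters are opaque to the transport.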
I could not find any submission of the specification for standardisation.
Specified but missing backends?
===============================
GPU device / 16
---------------
This is now a fairly mature part of the spec and has implementations in
the kernel, QEMU and a vhost-user backend. However as is commensurate
with the complexity of GPUs there is ongoing development moving from the
VirGL OpenGL encapsulation to a thing called GFXSTREAM which is meant to
make some things easier.
A potential area of interest here is working out what the differences
are in use cases between virtio-gpu and virtio-wayland. virtio-wayland
is currently a ChromeOS only invention so hasn't seen any upstreaming or
specification work but may make more sense where multiple VMs are
drawing only elements of a final display which is composited by a master
program. For further reading see Alyssa's write-up:
https://alyssa.is/using-virtio-wl/
I'm not sure how widely used the existing vhost-user backend is for
virtio-gpu but it could present an opportunity for a more beefy rust-vmm
backend implementation?
Audio device / 25
-----------------
This has a specification and a working kernel driver. However there
isn't a working backend for QEMU although one has been proposed:
Subject: [RFC PATCH 00/27] Virtio sound card implementation
Date: Thu, 29 Apr 2021 17:34:18 +0530
Message-Id: <20210429120445.694420-1-chouhan.shreyansh2702@gmail.com>
this could be a candidate for a rust-vmm version?
Other suggestions
=================
When we started Project Stratos there was a survey amongst members on
where there was interest.
virtio-spi/virtio-greybus
-------------------------
Yet another serial bus. We chose to do i2c but doing another similar bus
wouldn't be pushing the state of the art. We could certainly
mentor/guide someone else who wants to get involved in rust-vmm though.
virtio-tuner/virtio-radio
-------------------------
These were early automotive requests. I don't know where these would sit
in relation to the existing virtio-sound and audio policy devices.
virtio-camera
-------------
We have a prototype of virtio-video but as the libcamera project shows
interfacing with modern cameras is quite a complex task these days.
Modern cameras have all sorts of features powered by complex IP blocks
including various amounts of AI. Perhaps it makes more sense to leave
this to see how the libcamera project progresses before seeing what
common features could be exposed.
Conclusion
==========
Considering the progress we've made so far and our growing confidence
with rust-vmm I think the next device we implement a backend for should
be a more complex device. Discussing this with Viresh and Mathieu
earlier today we thought it would be nice if the device was more demo
friendly as CLIs don't often excite.
My initial thought is that a rust-vmm backend for virtio-gpu would fit
the bill because:
- already up-streamed in specification and kernel
- known working implementations in QEMU and C based vhost-user daemon
- ongoing development would be a good test of Rust's flexibility
I think virtio-can would also be a useful target for the automotive use
case. Given there will be a new release of the spec soon we should
certainly keep an eye on it.
Anyway I welcome people's thoughts.
--
Alex Bennée
See also:
Remaining Xen enabling work for rust-vmm - 87pmk472ii.fsf@linaro.org
vhost-device outstanding tasks - 87zgj87alq.fsf@linaro.org
This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.
In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.
Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [virtio-comment] Next VirtIO device for Project Stratos?
2022-05-31 8:07 [virtio-comment] Next VirtIO device for Project Stratos? Alex Bennée
@ 2022-06-06 9:35 ` Bradford, Robert
2022-06-08 12:38 ` Alex Bennée
[not found] ` <80aace95-6c39-c7b9-61ba-70d60bcd08b2@quicinc.com>
2022-09-03 7:43 ` [virtio-dev] " Alyssa Ross
2 siblings, 1 reply; 12+ messages in thread
From: Bradford, Robert @ 2022-06-06 9:35 UTC (permalink / raw)
To: virtio-comment@lists.oasis-open.org
On Tue, 2022-05-31 at 09:07 +0100, Alex Bennée wrote:
> Watchdog / 35
> -------------
>
> Discussion about this is usually conflated with reset functionality
> as
> the two are intimately related.
>
> An early interest in this was for providing a well specified reset
> functionality firmware running on the -M virt machine model in QEMU.
> The
> need has been reduced somewhat with the provision of the sbsa-ref
> model
> which does have a defined reset pin.
>
> Other questions that would need to be answered include how the
> functionality would interact with the hypervisor given a vCPU could
> easily not be scheduled by it and therefore miss its kick window.
>
I guess this is a risk with any use of a watchdog whether it be emulating
a physical device or a paravirtualised one. In practice we have never
seen this.
> Currently there have been no proposals for the spec or
> implementations.
I was the one who requested the ID be reserved. We have an
implementation in Cloud Hypervisor:
https://github.com/cloud-hypervisor/cloud-hypervisor/blob/main/virtio-devices/src/watchdog.rs
As well as the kernel side:
https://github.com/cloud-hypervisor/linux/commit/cc8f7579faad79cdf02f9b6a742510cd1b1cf340
I admit writing up a patch for the specification fell through the
cracks. That's something I will try to rectify in the near future as it
would also be nice to be able to upstream the kernel patch.
Cheers,
Rob
^ permalink raw reply [flat|nested] 12+ messages in thread
* [virtio-comment] Re: [Stratos-dev] Next VirtIO device for Project Stratos?
[not found] ` <a3856ec8-90d6-df19-2b5f-bc42700b09db@quicinc.com>
@ 2022-06-08 12:28 ` Alex Bennée
0 siblings, 0 replies; 12+ messages in thread
From: Alex Bennée @ 2022-06-08 12:28 UTC (permalink / raw)
To: Trilok Soni
Cc: Johannes Berg, stratos-dev, virtio-dev,
virtio-comment@lists.oasis-open.org, Matt Spencer, Gerd Hoffmann,
Arnd Bergmann, Christian Pinto, Namhyung Kim, Petre Eftime,
Peter Hilber, Marcel Holtmann, Michael S. Tsirkin,
Stefan Hajnoczi, quic_pdaly, pdaly, svaddagi
Trilok Soni <quic_tsoni@quicinc.com> writes:
> On 6/1/2022 1:06 PM, Johannes Berg wrote:
>> Hi,
>> Not sure if there was anything you wanted me to comment on, but
>> since
>> I'm "the wifi guy" ... :)
>>
>>>> mac80211 wlan / 10
>> FWIW, even though I'm the mac80211 maintainer, I'm not aware of a
>> specification or implementation of this ... I don't know what this is at
>> all.
>>
>>>> mac80211 hwsim wireless simulation device / 29
>> This I implemented (both a driver in mac80211-hwsim in the kernel,
>> as
>> well as a device in wmediumd), but I wouldn't really necessarily
>> recommend using it for anything but testing.
I assume the use-case for this is something like a virtualised Android
OS. For cloud native testing I guess a simulation device provides enough of
what you need to exercise the guest's network stack. However for real
deployments you need something to allow selection of networks and
reporting of network quality.
I'm not super familiar with the wifi stack but is this all usually
handled in one place or do multiple userspace daemons interrogate the
kernel APIs for this information?
If it all comes through one place perhaps it's enough for it to be given
a pipe to the host to make those queries - effectively creating a proxy
to the real host kernel interface?
>>> I am not sure if this is related but virtio-ethernet keeps coming to us as a
>>> requirement; I am not sure what support is available in
>>> the various projects including Xen. This is a non-Mobile requirement
>>> particularly from the IOT or Auto segments. It will be nice to do adb
>>> over ethernet in the guest VM from the host shell.
>> For ethernet you have normal virtio-net.
>
> Thanks. Virtio-net is available, but I think e2e usecase w/ Type-1
> Hypervisor is what I am looking for. I believe CrosVM also supports
> Virtio-net but I am not sure if it works w/ Xen or not.
In normal Xen you would have a Dom0 with a traditional kernel driver to
service the backend. In a more modular setup you might want to have a
driver domain that combines the backend with the real HW driver running
as a unikernel?
>
> ---Trilok Soni
--
Alex Bennée
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [virtio-comment] Next VirtIO device for Project Stratos?
2022-06-06 9:35 ` Bradford, Robert
@ 2022-06-08 12:38 ` Alex Bennée
0 siblings, 0 replies; 12+ messages in thread
From: Alex Bennée @ 2022-06-08 12:38 UTC (permalink / raw)
To: Bradford, Robert; +Cc: virtio-comment
"Bradford, Robert" <robert.bradford@intel.com> writes:
> On Tue, 2022-05-31 at 09:07 +0100, Alex Bennée wrote:
>> Watchdog / 35
>> -------------
>>
>> Discussion about this is usually conflated with reset functionality
>> as
>> the two are intimately related.
>>
>> An early interest in this was for providing a well specified reset
>> functionality firmware running on the -M virt machine model in QEMU.
>> The
>> need has been reduced somewhat with the provision of the sbsa-ref
>> model
>> which does have a defined reset pin.
>>
>> Other questions that would need to be answered include how the
>> functionality would interact with the hypervisor given a vCPU could
>> easily not be scheduled by it and therefore miss its kick window.
>>
>
> I guess this is a risk with any use of a watchdog whether it be emulating
> a physical device or a paravirtualised one. In practice we have never
> seen this.
>
>> Currently there have been no proposals for the spec or
>> implementations.
>
> I was the one who requested the ID be reserved. We have an
> implementation in Cloud Hypervisor:
> https://github.com/cloud-hypervisor/cloud-hypervisor/blob/main/virtio-devices/src/watchdog.rs
>
Ahh good. I guess it could be another cloud hypervisor component that
makes the transition to rust-vmm if there is interest?
> As well as the kernel side:
>
> https://github.com/cloud-hypervisor/linux/commit/cc8f7579faad79cdf02f9b6a742510cd1b1cf340
>
> I admit writing up a patch for the specification fell through the
> cracks. That's something I will try to rectify in the near future as it
> would also be nice to be able to upstream the kernel patch.
Thanks, please feel free to Cc me on any such patches and we can take a
look.
--
Alex Bennée
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [virtio-dev] Next VirtIO device for Project Stratos?
2022-05-31 8:07 [virtio-comment] Next VirtIO device for Project Stratos? Alex Bennée
2022-06-06 9:35 ` Bradford, Robert
[not found] ` <80aace95-6c39-c7b9-61ba-70d60bcd08b2@quicinc.com>
@ 2022-09-03 7:43 ` Alyssa Ross
2022-09-05 15:22 ` [virtio-comment] " Alex Bennée
2022-09-05 20:27 ` [virtio-comment] " Stefan Hajnoczi
2 siblings, 2 replies; 12+ messages in thread
From: Alyssa Ross @ 2022-09-03 7:43 UTC (permalink / raw)
To: Alex Bennée, stratos-dev, virtio-dev,
virtio-comment@lists.oasis-open.org
Cc: Viresh Kumar, Mathieu Poirier, Mike Holmes, Matt Spencer,
Peter Griffin, Dan Milea, Bill Mills, Francois Ozog,
Johannes Berg, Gerd Hoffmann, Arnd Bergmann, Christian Pinto,
Namhyung Kim, Petre Eftime, Peter Hilber, Marcel Holtmann,
Michael S. Tsirkin, Stefan Hajnoczi, Puck Meerburg
Hi Alex and everyone else, just catching up on some mail and wanted to
clarify some things:
Alex Bennée <alex.bennee@linaro.org> writes:
> This email is driven by a brainstorming session at a recent sprint
> where we considered what VirtIO devices we should look at implementing
> next. I ended up going through all the assigned device IDs hunting for
> missing spec discussion and existing drivers so I'd welcome feedback
> from anybody actively using them - especially as my suppositions about
> device types I'm not familiar with may be way off!
>
> [...snip...]
>
> GPU device / 16
> ---------------
>
> This is now a fairly mature part of the spec and has implementations in
> the kernel, QEMU and a vhost-user backend. However as is commensurate
> with the complexity of GPUs there is ongoing development moving from the
> VirGL OpenGL encapsulation to a thing called GFXSTREAM which is meant to
> make some things easier.
>
> A potential area of interest here is working out what the differences
> are in use cases between virtio-gpu and virtio-wayland. virtio-wayland
> is currently a ChromeOS only invention so hasn't seen any upstreaming or
> specification work but may make more sense where multiple VMs are
> drawing only elements of a final display which is composited by a master
> program. For further reading see Alyssa's write-up:
>
> https://alyssa.is/using-virtio-wl/
>
> I'm not sure how widely used the existing vhost-user backend is for
> virtio-gpu but it could present an opportunity for a more beefy rust-vmm
> backend implementation?
As I understand it, virtio-wayland is effectively deprecated in favour
of sending Wayland messages over cross-domain virtio-gpu contexts. It's
possible to do this now with an upstream kernel, whereas virtio-wayland
always required a custom driver in the Chromium kernel.
But crosvm is still the only implementation of a virtio-gpu device that
supports Wayland over cross-domain contexts, so it would be great to see
a more generic implementation. Especially because, while crosvm can
share its virtio-gpu device over vhost-user, it does so in a way that's
incompatible with the standardised vhost-user-gpu as implemented by
QEMU. When I asked the crosvm developers in their Matrix channel what
it would take to use the standard vhost-user-gpu variant, they said that
the standard variant was lacking functionality they needed, like mapping
and unmapping GPU buffers into the guest.
So if we wanted to push forward with making Wayland over
virtio-gpu less crosvm specific, I suppose the first step would be to
figure out with the crosvm developers what functionality is missing in
the vhost-user-gpu protocol. That would then make it possible to use
crosvm's device (with the Wayland support) with other VMMs like QEMU.
(CCing my colleague Puck, who has also been working with me on getting
Wayland over virtio-gpu up and running outside of Chrome OS.)
^ permalink raw reply [flat|nested] 12+ messages in thread
* [virtio-comment] Re: [virtio-dev] Next VirtIO device for Project Stratos?
2022-09-03 7:43 ` [virtio-dev] " Alyssa Ross
@ 2022-09-05 15:22 ` Alex Bennée
2022-09-06 7:47 ` [virtio-dev] Re: [Stratos-dev] " Alyssa Ross
2022-09-05 20:27 ` [virtio-comment] " Stefan Hajnoczi
1 sibling, 1 reply; 12+ messages in thread
From: Alex Bennée @ 2022-09-05 15:22 UTC (permalink / raw)
To: Alyssa Ross
Cc: stratos-dev, virtio-dev, virtio-comment@lists.oasis-open.org,
Viresh Kumar, Mathieu Poirier, Mike Holmes, Matt Spencer,
Peter Griffin, Gerd Hoffmann, Arnd Bergmann, Peter Hilber,
Michael S. Tsirkin, Stefan Hajnoczi, Puck Meerburg,
Mikhail Golubev, Andriy Tryshnivskyy, Vasyl Vavrychuk
Alyssa Ross <hi@alyssa.is> writes:
> [[PGP Signed Part:Undecided]]
> Hi Alex and everyone else, just catching up on some mail and wanted to
> clarify some things:
>
> Alex Bennée <alex.bennee@linaro.org> writes:
>
>> This email is driven by a brainstorming session at a recent sprint
>> where we considered what VirtIO devices we should look at implementing
>> next. I ended up going through all the assigned device IDs hunting for
>> missing spec discussion and existing drivers so I'd welcome feedback
>> from anybody actively using them - especially as my suppositions about
>> device types I'm not familiar with may be way off!
>>
>> [...snip...]
>>
>> GPU device / 16
>> ---------------
>>
>> This is now a fairly mature part of the spec and has implementations in
>> the kernel, QEMU and a vhost-user backend. However as is commensurate
>> with the complexity of GPUs there is ongoing development moving from the
>> VirGL OpenGL encapsulation to a thing called GFXSTREAM which is meant to
>> make some things easier.
>>
>> A potential area of interest here is working out what the differences
>> are in use cases between virtio-gpu and virtio-wayland. virtio-wayland
>> is currently a ChromeOS only invention so hasn't seen any upstreaming or
>> specification work but may make more sense where multiple VMs are
>> drawing only elements of a final display which is composited by a master
>> program. For further reading see Alyssa's write-up:
>>
>> https://alyssa.is/using-virtio-wl/
>>
>> I'm not sure how widely used the existing vhost-user backend is for
>> virtio-gpu but it could present an opportunity for a more beefy rust-vmm
>> backend implementation?
>
> As I understand it, virtio-wayland is effectively deprecated in favour
> of sending Wayland messages over cross-domain virtio-gpu contexts. It's
> possible to do this now with an upstream kernel, whereas virtio-wayland
> always required a custom driver in the Chromium kernel.
That's good to know. I guess there is nothing that prevents the final
display of virtual GPUs from multiple guests being mapped onto the
final presentation? The automotive use case seems to treat each
individual VM with a UI as presenting a surface which the final console
manager composites up together depending on safety rules to display to
the user.
> But crosvm is still the only implementation of a virtio-gpu device that
> supports Wayland over cross-domain contexts, so it would be great to see
> a more generic implementation. Especially because, while crosvm can
> share its virtio-gpu device over vhost-user, it does so in a way that's
> incompatible with the standardised vhost-user-gpu as implemented by
> QEMU. When I asked the crosvm developers in their Matrix channel what
> it would take to use the standard vhost-user-gpu variant, they said that
> the standard variant was lacking functionality they needed, like mapping
> and unmapping GPU buffers into the guest.
Is this related to ensuring allocated buffers are properly aligned in
the host address space so the HW can use them without needing to copy
them again? I seem to recall this was one of the topics that came up in
one of the AGL VirtIO GPU workshops with the OpenSynergy people:
https://confluence.automotivelinux.org/pages/viewpage.action?spaceKey=VE&title=Meeting+Agenda#MeetingAgenda-Jan20,2021
Zero-copy is a goal everyone seems to want, to make the mapping from
virtual to real hardware as efficient as possible. Of course zero-copy
is very much in opposition to more memory isolation between guests and
hosts (e.g. Xen/pKVM). Everyone it seems wants the moon on a stick.
> So if we wanted to push forward with making Wayland over
> virtio-gpu less crosvm specific, I suppose the first step would be to
> figure out with the crosvm developers what functionality is missing in
> the vhost-user-gpu protocol. That would then make it possible to use
> crosvm's device (with the Wayland support) with other VMMs like QEMU.
>
> (CCing my colleague Puck, who has also been working with me on getting
> Wayland over virtio-gpu up and running outside of Chrome OS.)
Thanks. I'm very much an outside observer when it comes to GPUs so
welcome the expert input ;-)
>
> [[End of PGP Signed Part]]
--
Alex Bennée
^ permalink raw reply [flat|nested] 12+ messages in thread
* [virtio-comment] Re: [virtio-dev] Next VirtIO device for Project Stratos?
2022-09-03 7:43 ` [virtio-dev] " Alyssa Ross
2022-09-05 15:22 ` [virtio-comment] " Alex Bennée
@ 2022-09-05 20:27 ` Stefan Hajnoczi
2022-09-06 17:33 ` Dr. David Alan Gilbert
1 sibling, 1 reply; 12+ messages in thread
From: Stefan Hajnoczi @ 2022-09-05 20:27 UTC (permalink / raw)
To: Alyssa Ross
Cc: Alex Bennée, stratos-dev, virtio-dev,
virtio-comment@lists.oasis-open.org, Viresh Kumar,
Mathieu Poirier, Mike Holmes, Matt Spencer, Peter Griffin,
Dan Milea, Bill Mills, Francois Ozog, Johannes Berg,
Gerd Hoffmann, Arnd Bergmann, Christian Pinto, Namhyung Kim,
Petre Eftime, Peter Hilber, Marcel Holtmann, Michael S. Tsirkin,
Puck Meerburg, Gurchetan Singh, Dr. David Alan Gilbert
[-- Attachment #1: Type: text/plain, Size: 4753 bytes --]
On Sat, Sep 03, 2022 at 07:43:08AM +0000, Alyssa Ross wrote:
> Hi Alex and everyone else, just catching up on some mail and wanted to
> clarify some things:
>
> Alex Bennée <alex.bennee@linaro.org> writes:
>
> > This email is driven by a brain storming session at a recent sprint
> > where we considered what VirtIO devices we should look at implementing
> > next. I ended up going through all the assigned device IDs hunting for
> > missing spec discussion and existing drivers so I'd welcome feedback
> > from anybody actively using them - especially as my suppositions about
> > device types I'm not familiar with may be way off!
> >
> > [...snip...]
> >
> > GPU device / 16
> > ---------------
> >
> > This is now a fairly mature part of the spec and has implementations in
> > the kernel, QEMU and a vhost-user backend. However as is commensurate
> > with the complexity of GPUs there is ongoing development moving from the
> > VirGL OpenGL encapsulation to a thing called GFXSTREAM which is meant to
> > make some things easier.
> >
> > A potential area of interest here is working out what the differences
> > are in use cases between virtio-gpu and virtio-wayland. virtio-wayland
> > is currently a ChromeOS only invention so hasn't seen any upstreaming or
> > specification work but may make more sense where multiple VMs are
> > drawing only elements of a final display which is composited by a master
> > program. For further reading see Alyssa's write-up:
> >
> > https://alyssa.is/using-virtio-wl/
> >
> > I'm not sure how widely used the existing vhost-user backend is for
> > virtio-gpu but it could present an opportunity for a more beefy rust-vmm
> > backend implementation?
>
> As I understand it, virtio-wayland is effectively deprecated in favour
> of sending Wayland messages over cross-domain virtio-gpu contexts. It's
> possible to do this now with an upstream kernel, whereas virtio-wayland
> always required a custom driver in the Chromium kernel.
>
> But crosvm is still the only implementation of a virtio-gpu device that
> supports Wayland over cross-domain contexts, so it would be great to see
> a more generic implementation. Especially because, while crosvm can
> share its virtio-gpu device over vhost-user, it does so in a way that's
> incompatible with the standardised vhost-user-gpu as implemented by
> QEMU. When I asked the crosvm developers in their Matrix channel what
> it would take to use the standard vhost-user-gpu variant, they said that
> the standard variant was lacking functionality they needed, like mapping
> and unmapping GPU buffers into the guest.
That sounds somewhat similar to virtiofs and its DAX Window, which needs
vhost-user protocol extensions because of how memory is handled. David
Gilbert wrote the QEMU virtiofs DAX patches, which are under
development.
I took a quick look at the virtio-gpu specs. If the crosvm behavior you
mentioned is covered in the VIRTIO spec then I guess it's the "host
visible memory region"?
(If it's not in the VIRTIO spec then a spec change needs to be proposed
first and a vhost-user protocol spec change can then support that new
virtio-gpu feature.)
The VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB command maps the device's resource
into the host visible memory region so that the driver can see it.
The virtiofs DAX window uses vhost-user slave channel messages to
provide file descriptors and offsets for QEMU to mmap. QEMU mmaps the
file pages into the shared memory region seen by the guest driver.
Maybe an equivalent mechanism is needed for virtio-gpu so a device
resource file descriptor can be passed to QEMU and then mmapped so the
guest driver can see the pages?
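To make the fd-plus-offset idea concrete, here is a minimal Python sketch
(not QEMU's code) that maps one page of a file at a given offset. The
helper name map_file_range and the temporary file standing in for the
backend's fd are assumptions for illustration only; a real VMM would
place the mapping inside the guest-visible shared memory region rather
than just in its own address space.

```python
import mmap
import tempfile

PAGE = mmap.ALLOCATIONGRANULARITY  # mmap offsets must be a multiple of this


def map_file_range(fd, offset, length):
    """Map `length` bytes of `fd` starting at `offset` (hypothetical helper)."""
    if offset % PAGE != 0:
        raise ValueError("offset must be aligned to the mapping granularity")
    # Read-only view of the requested file range.
    return mmap.mmap(fd, length, access=mmap.ACCESS_READ, offset=offset)


def demo():
    # A temporary file stands in for the fd sent by the device backend.
    with tempfile.TemporaryFile() as f:
        f.write(b"A" * PAGE + b"B" * PAGE)
        f.flush()
        with map_file_range(f.fileno(), PAGE, PAGE) as view:
            return view[:4]  # first bytes of the second page of the file
```

Calling demo() maps only the second page, so the view starts at the
"B" bytes rather than at the beginning of the file.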
I think it's possible to unify the virtiofs and virtio-gpu extensions to
the vhost-user protocol. Two new slave channel messages are needed: "map
<fd, offset, len> to shared memory resource <n>" and "unmap <offset,
len> from shared memory resource <n>". Both devices could use these
messages to implement their respective DAX Window and Blob Resource
functionality.
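A rough sketch of what such a pair of requests could carry, in Python.
The field layout, the request codes and the names (ShmemRequest,
MSG_SHMEM_MAP, MSG_SHMEM_UNMAP) are all made up for illustration and are
not the vhost-user wire format. I have also assumed the map request
needs both an offset into the fd and an offset into the shared memory
region; the fd itself would travel as ancillary data rather than in the
payload.

```python
import struct
from dataclasses import dataclass

# Hypothetical payload layout: four little-endian 64-bit fields.
# This is NOT the vhost-user wire format, just an illustration.
_PAYLOAD = struct.Struct("<QQQQ")

MSG_SHMEM_MAP = 1    # made-up request codes for the sketch
MSG_SHMEM_UNMAP = 2


@dataclass(frozen=True)
class ShmemRequest:
    kind: int           # MSG_SHMEM_MAP or MSG_SHMEM_UNMAP
    shmid: int          # which shared memory resource <n>
    shm_offset: int     # where in that region the range starts
    length: int         # size of the range in bytes
    fd_offset: int = 0  # offset into the fd (map only; fd sent out of band)

    def pack(self):
        """Serialise as a 4-byte kind followed by the four 64-bit fields."""
        header = struct.pack("<I", self.kind)
        return header + _PAYLOAD.pack(self.shmid, self.shm_offset,
                                      self.length, self.fd_offset)

    @classmethod
    def unpack(cls, data):
        """Inverse of pack()."""
        (kind,) = struct.unpack_from("<I", data, 0)
        shmid, shm_offset, length, fd_offset = _PAYLOAD.unpack_from(data, 4)
        return cls(kind, shmid, shm_offset, length, fd_offset)
```

The point of the sketch is only that one message shape can serve both
devices: the shared memory resource id selects the DAX Window or the
host visible memory region, and nothing else is device specific.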
>
> So if we wanted to push forward with making Wayland over
> virtio-gpu less crosvm specific, I suppose the first step would be to
> figure out with the crosvm developers what functionality is missing in
> the vhost-user-gpu protocol. That would then make it possible to use
> crosvm's device (with the Wayland support) with other VMMs like QEMU.
>
> (CCing my colleague Puck, who has also been working with me on getting
> Wayland over virtio-gpu up and running outside of Chrome OS.)
I have CCed David Gilbert (virtiofs DAX Window) and Gurchetan Singh
(virtio-gpu shared memory region).
Stefan
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
* [virtio-dev] Re: [Stratos-dev] Re: [virtio-dev] Next VirtIO device for Project Stratos?
2022-09-05 15:22 ` [virtio-comment] " Alex Bennée
@ 2022-09-06 7:47 ` Alyssa Ross
0 siblings, 0 replies; 12+ messages in thread
From: Alyssa Ross @ 2022-09-06 7:47 UTC (permalink / raw)
To: Alex Bennée
Cc: stratos-dev, virtio-dev, virtio-comment@lists.oasis-open.org,
Matt Spencer, Gerd Hoffmann, Arnd Bergmann, Peter Hilber,
Michael S. Tsirkin, Stefan Hajnoczi, Puck Meerburg,
Mikhail Golubev, Andriy Tryshnivskyy, Vasyl Vavrychuk
[-- Attachment #1: Type: text/plain, Size: 1406 bytes --]
Alex Bennée via Stratos-dev <stratos-dev@op-lists.linaro.org> writes:
> Alyssa Ross <hi@alyssa.is> writes:
>
>> As I understand it, virtio-wayland is effectively deprecated in favour
>> of sending Wayland messages over cross-domain virtio-gpu contexts. It's
>> possible to do this now with an upstream kernel, whereas virtio-wayland
>> always required a custom driver in the Chromium kernel.
>
> That's good to know. I guess there is nothing that prevents the final
> display of virtual GPUs from multiple guests being mapped onto the
> final presentation? The automotive use case seems to treat each
> individual VM with a UI as presenting a surface which the final console
> manager composites up together depending on safety rules to display to
> the user.
Well, in the Wayland use case, AIUI virtio-gpu is just a transport for
the Wayland protocol + shared memory resources. The simplest case is
just sending shared CPU memory buffers around (wl_shm), so there's not
really any GPU involved in anything but name. Alternatively, it's
possible to use dma-bufs, and graphics acceleration through the
virtio-gpu devices, and yes, when doing that it's still possible for the
host Wayland compositor to combine everything into one presentation — I
think they're all just dma-bufs to it.
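To illustrate the wl_shm style case, here is a small Python sketch of
two independent mappings of the same file seeing the same pixel data.
It is only an analogy: the temporary file stands in for the shared fd,
the buffer dimensions are arbitrary, and there is no Wayland or
virtio-gpu code involved.

```python
import mmap
import tempfile

WIDTH, HEIGHT, BPP = 4, 2, 4   # tiny buffer, 4 bytes per pixel (arbitrary)
SIZE = WIDTH * HEIGHT * BPP


def share_buffer():
    """Write pixels through one mapping and read them through another."""
    with tempfile.TemporaryFile() as f:   # stands in for the shared fd
        f.truncate(SIZE)                  # size the backing file
        client = mmap.mmap(f.fileno(), SIZE)      # "guest client" view
        compositor = mmap.mmap(f.fileno(), SIZE)  # "host compositor" view
        try:
            client[0:BPP] = b"\xff\x00\x00\x00"   # client draws first pixel
            return bytes(compositor[0:BPP])       # compositor sees it
        finally:
            client.close()
            compositor.close()
```

No pixel data is copied between the two views; both are windows onto
the same pages, which is all the simplest case needs.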
Does that make sense? I'm also no expert on this but hopefully I'm not
too far off.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [virtio-dev] Next VirtIO device for Project Stratos?
2022-09-05 20:27 ` [virtio-comment] " Stefan Hajnoczi
@ 2022-09-06 17:33 ` Dr. David Alan Gilbert
2022-09-06 18:12 ` Stefan Hajnoczi
0 siblings, 1 reply; 12+ messages in thread
From: Dr. David Alan Gilbert @ 2022-09-06 17:33 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: Alyssa Ross, Alex Bennée, stratos-dev, virtio-dev,
virtio-comment@lists.oasis-open.org, Viresh Kumar,
Mathieu Poirier, Mike Holmes, Matt Spencer, Peter Griffin,
Dan Milea, Bill Mills, Francois Ozog, Johannes Berg,
Gerd Hoffmann, Arnd Bergmann, Christian Pinto, Namhyung Kim,
Petre Eftime, Peter Hilber, Marcel Holtmann, Michael S. Tsirkin,
Puck Meerburg, Gurchetan Singh
* Stefan Hajnoczi (stefanha@redhat.com) wrote:
> On Sat, Sep 03, 2022 at 07:43:08AM +0000, Alyssa Ross wrote:
> > Hi Alex and everyone else, just catching up on some mail and wanted to
> > clarify some things:
> >
> > Alex Bennée <alex.bennee@linaro.org> writes:
> >
> > > This email is driven by a brain storming session at a recent sprint
> > > where we considered what VirtIO devices we should look at implementing
> > > next. I ended up going through all the assigned device IDs hunting for
> > > missing spec discussion and existing drivers so I'd welcome feedback
> > > from anybody actively using them - especially as my suppositions about
> > > device types I'm not familiar with may be way off!
> > >
> > > [...snip...]
> > >
> > > GPU device / 16
> > > ---------------
> > >
> > > This is now a fairly mature part of the spec and has implementations in
> > > the kernel, QEMU and a vhost-user backend. However as is commensurate
> > > with the complexity of GPUs there is ongoing development moving from the
> > > VirGL OpenGL encapsulation to a thing called GFXSTREAM which is meant to
> > > make some things easier.
> > >
> > > A potential area of interest here is working out what the differences
> > > are in use cases between virtio-gpu and virtio-wayland. virtio-wayland
> > > is currently a ChromeOS only invention so hasn't seen any upstreaming or
> > > specification work but may make more sense where multiple VMs are
> > > drawing only elements of a final display which is composited by a master
> > > program. For further reading see Alyssa's write-up:
> > >
> > > https://alyssa.is/using-virtio-wl/
> > >
> > > I'm not sure how widely used the existing vhost-user backend is for
> > > virtio-gpu but it could present an opportunity for a more beefy rust-vmm
> > > backend implementation?
> >
> > As I understand it, virtio-wayland is effectively deprecated in favour
> > of sending Wayland messages over cross-domain virtio-gpu contexts. It's
> > possible to do this now with an upstream kernel, whereas virtio-wayland
> > always required a custom driver in the Chromium kernel.
> >
> > But crosvm is still the only implementation of a virtio-gpu device that
> > supports Wayland over cross-domain contexts, so it would be great to see
> > a more generic implementation. Especially because, while crosvm can
> > share its virtio-gpu device over vhost-user, it does so in a way that's
> > incompatible with the standardised vhost-user-gpu as implemented by
> > QEMU. When I asked the crosvm developers in their Matrix channel what
> > it would take to use the standard vhost-user-gpu variant, they said that
> > the standard variant was lacking functionality they needed, like mapping
> > and unmapping GPU buffers into the guest.
>
> That sounds somewhat similar to virtiofs and its DAX Window, which needs
> vhost-user protocol extensions because of how memory is handled. David
> Gilbert wrote the QEMU virtiofs DAX patches, which are under
> development.
>
> I took a quick look at the virtio-gpu specs. If the crosvm behavior you
> mentioned is covered in the VIRTIO spec then I guess it's the "host
> visible memory region"?
>
> (If it's not in the VIRTIO spec then a spec change needs to be proposed
> first and a vhost-user protocol spec change can then support that new
> virtio-gpu feature.)
>
> The VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB command maps the device's resource
> into the host visible memory region so that the driver can see it.
>
> The virtiofs DAX window uses vhost-user slave channel messages to
> provide file descriptors and offsets for QEMU to mmap. QEMU mmaps the
> file pages into the shared memory region seen by the guest driver.
>
> Maybe an equivalent mechanism is needed for virtio-gpu so a device
> resource file descriptor can be passed to QEMU and then mmapped so the
> guest driver can see the pages?
>
> I think it's possible to unify the virtiofs and virtio-gpu extensions to
> the vhost-user protocol. Two new slave channel messages are needed: "map
> <fd, offset, len> to shared memory resource <n>" and "unmap <offset,
> len> from shared memory resource <n>". Both devices could use these
> messages to implement their respective DAX Window and Blob Resource
> functionality.
It might be possible; but there's a bunch of lifetime/alignment/etc
questions to be answered.
For virtiofs DAX we carve out a chunk of a BAR as a 'cache' (unfortunate
name) that we can then do mappings into.
The VHOST_USER_SLAVE_FS_MAP/UNMAP commands can do the mapping:
https://gitlab.com/virtio-fs/qemu/-/commit/7c29854da484afd7ca95acbd2e4acfc2c75ef491
https://gitlab.com/virtio-fs/qemu/-/commit/f32bc2524035931856aa218ce18efa029b9eed02
Those might do what you want if you can figure out a way to generalise
the BAR to map them into.
There are some problems; KVM gets really really upset if you try and
access an area that doesn't have a mapping or is mapped to a truncated
file; do you want the guest to be able to crash like that?
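As a way of pinning down the kind of lifetime/alignment questions I
mean, here is a toy Python model of a fixed-size window that tracks
which ranges currently hold a mapping. It is not the virtiofs code; the
Window class, the 4096-byte granularity and the rule that an unmap must
match an earlier map exactly are all assumptions made for the sketch.

```python
PAGE_SIZE = 4096  # assumed mapping granularity


class Window:
    """Tracks which ranges of a fixed-size window currently hold a mapping."""

    def __init__(self, size):
        self.size = size
        self.mappings = {}  # start offset -> length

    def _check(self, offset, length):
        if offset % PAGE_SIZE or length % PAGE_SIZE or length <= 0:
            raise ValueError("offset/length must be page aligned, length > 0")
        if offset < 0 or offset + length > self.size:
            raise ValueError("range falls outside the window")

    def map(self, offset, length):
        self._check(offset, length)
        for start, size in self.mappings.items():
            if offset < start + size and start < offset + length:
                raise ValueError("range overlaps an existing mapping")
        self.mappings[offset] = length

    def unmap(self, offset, length):
        self._check(offset, length)
        if self.mappings.get(offset) != length:
            raise ValueError("no mapping with exactly this offset and length")
        del self.mappings[offset]

    def is_mapped(self, offset):
        return any(start <= offset < start + size
                   for start, size in self.mappings.items())
```

Every offset for which is_mapped() returns False is a place where a
guest access has nothing behind it, which is exactly the case that
causes trouble.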
Dave
> >
> > So if we wanted to push forward with making Wayland over
> > virtio-gpu less crosvm specific, I suppose the first step would be to
> > figure out with the crosvm developers what functionality is missing in
> > the vhost-user-gpu protocol. That would then make it possible to use
> > crosvm's device (with the Wayland support) with other VMMs like QEMU.
> >
> > (CCing my colleague Puck, who has also been working with me on getting
> > Wayland over virtio-gpu up and running outside of Chrome OS.)
>
> I have CCed David Gilbert (virtiofs DAX Window) and Gurchetan Singh
> (virtio-gpu shared memory region).
>
> Stefan
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [virtio-dev] Next VirtIO device for Project Stratos?
2022-09-06 17:33 ` Dr. David Alan Gilbert
@ 2022-09-06 18:12 ` Stefan Hajnoczi
2022-09-07 14:09 ` Dr. David Alan Gilbert
0 siblings, 1 reply; 12+ messages in thread
From: Stefan Hajnoczi @ 2022-09-06 18:12 UTC (permalink / raw)
To: Dr. David Alan Gilbert
Cc: Alyssa Ross, Alex Bennée, stratos-dev, virtio-dev,
virtio-comment@lists.oasis-open.org, Viresh Kumar,
Mathieu Poirier, Mike Holmes, Matt Spencer, Peter Griffin,
Dan Milea, Bill Mills, Francois Ozog, Johannes Berg,
Gerd Hoffmann, Arnd Bergmann, Christian Pinto, Namhyung Kim,
Petre Eftime, Peter Hilber, Marcel Holtmann, Michael S. Tsirkin,
Puck Meerburg, Gurchetan Singh
[-- Attachment #1: Type: text/plain, Size: 6644 bytes --]
On Tue, Sep 06, 2022 at 06:33:36PM +0100, Dr. David Alan Gilbert wrote:
> * Stefan Hajnoczi (stefanha@redhat.com) wrote:
> > On Sat, Sep 03, 2022 at 07:43:08AM +0000, Alyssa Ross wrote:
> > > Hi Alex and everyone else, just catching up on some mail and wanted to
> > > clarify some things:
> > >
> > > Alex Bennée <alex.bennee@linaro.org> writes:
> > >
> > > > This email is driven by a brain storming session at a recent sprint
> > > > where we considered what VirtIO devices we should look at implementing
> > > > next. I ended up going through all the assigned device IDs hunting for
> > > > missing spec discussion and existing drivers so I'd welcome feedback
> > > > from anybody actively using them - especially as my suppositions about
> > > > device types I'm not familiar with may be way off!
> > > >
> > > > [...snip...]
> > > >
> > > > GPU device / 16
> > > > ---------------
> > > >
> > > > This is now a fairly mature part of the spec and has implementations in
> > > > the kernel, QEMU and a vhost-user backend. However as is commensurate
> > > > with the complexity of GPUs there is ongoing development moving from the
> > > > VirGL OpenGL encapsulation to a thing called GFXSTREAM which is meant to
> > > > make some things easier.
> > > >
> > > > A potential area of interest here is working out what the differences
> > > > are in use cases between virtio-gpu and virtio-wayland. virtio-wayland
> > > > is currently a ChromeOS only invention so hasn't seen any upstreaming or
> > > > specification work but may make more sense where multiple VMs are
> > > > drawing only elements of a final display which is composited by a master
> > > > program. For further reading see Alyssa's write-up:
> > > >
> > > > https://alyssa.is/using-virtio-wl/
> > > >
> > > > I'm not sure how widely used the existing vhost-user backend is for
> > > > virtio-gpu but it could present an opportunity for a more beefy rust-vmm
> > > > backend implementation?
> > >
> > > As I understand it, virtio-wayland is effectively deprecated in favour
> > > of sending Wayland messages over cross-domain virtio-gpu contexts. It's
> > > possible to do this now with an upstream kernel, whereas virtio-wayland
> > > always required a custom driver in the Chromium kernel.
> > >
> > > But crosvm is still the only implementation of a virtio-gpu device that
> > > supports Wayland over cross-domain contexts, so it would be great to see
> > > a more generic implementation. Especially because, while crosvm can
> > > share its virtio-gpu device over vhost-user, it does so in a way that's
> > > incompatible with the standardised vhost-user-gpu as implemented by
> > > QEMU. When I asked the crosvm developers in their Matrix channel what
> > > it would take to use the standard vhost-user-gpu variant, they said that
> > > the standard variant was lacking functionality they needed, like mapping
> > > and unmapping GPU buffers into the guest.
> >
> > That sounds somewhat similar to virtiofs and its DAX Window, which needs
> > vhost-user protocol extensions because of how memory is handled. David
> > Gilbert wrote the QEMU virtiofs DAX patches, which are under
> > development.
> >
> > I took a quick look at the virtio-gpu specs. If the crosvm behavior you
> > mentioned is covered in the VIRTIO spec then I guess it's the "host
> > visible memory region"?
> >
> > (If it's not in the VIRTIO spec then a spec change needs to be proposed
> > first and a vhost-user protocol spec change can then support that new
> > virtio-gpu feature.)
> >
> > The VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB command maps the device's resource
> > into the host visible memory region so that the driver can see it.
> >
> > The virtiofs DAX window uses vhost-user slave channel messages to
> > provide file descriptors and offsets for QEMU to mmap. QEMU mmaps the
> > file pages into the shared memory region seen by the guest driver.
> >
> > Maybe an equivalent mechanism is needed for virtio-gpu so a device
> > resource file descriptor can be passed to QEMU and then mmapped so the
> > guest driver can see the pages?
> >
> > I think it's possible to unify the virtiofs and virtio-gpu extensions to
> > the vhost-user protocol. Two new slave channel messages are needed: "map
> > <fd, offset, len> to shared memory resource <n>" and "unmap <offset,
> > len> from shared memory resource <n>". Both devices could use these
> > messages to implement their respective DAX Window and Blob Resource
> > functionality.
>
> It might be possible; but there's a bunch of lifetime/alignment/etc
> questions to be answered.
>
> For virtiofs DAX we carve out a chunk of a BAR as a 'cache' (unfortunate
> name) that we can then do mappings into.
>
> The VHOST_USER_SLAVE_FS_MAP/UNMAP commands can do the mapping:
> https://gitlab.com/virtio-fs/qemu/-/commit/7c29854da484afd7ca95acbd2e4acfc2c75ef491
> https://gitlab.com/virtio-fs/qemu/-/commit/f32bc2524035931856aa218ce18efa029b9eed02
>
> those might do what you want if you can figure out a way to generalise
> the bar to map them into.
>
> There are some problems; KVM gets really really upset if you try and
> access an area that doesn't have a mapping or is mapped to a truncated
> file; do you want the guest to be able to crash like that?
I think you are pointing out the existing problems with virtiofs
map/unmap and not new issues related to virtio-gpu or generalizing the
vhost-user messages?
There are a few possibilities for dealing with unmapped ranges in Shared
Memory Regions:
1. Reserve the unused Shared Memory Region ranges with mmap(PROT_NONE)
so that accesses to unmapped pages result in faults.
2. Map zero pages that are either:
a. read-only
b. read-write but discard stores
c. private/anonymous memory
virtiofs does #1 and has trouble with accesses to unmapped areas because
KVM's MMIO dispatch loop gets upset. On top of that virtiofs also needs
a way to inject the fault into the guest so that the truncated mmap case
can be detected in the guest.
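The difference between these options is easier to see in a toy model.
The Python sketch below is purely illustrative: Region, Policy and
UnmappedAccess are invented names, pages are plain bytearrays, and the
exception stands in for whatever fault would have to reach the guest.
It says nothing about how KVM or QEMU would actually implement any of
the options.

```python
from enum import Enum

PAGE_SIZE = 4096


class Policy(Enum):
    FAULT = "1"          # option 1: unmapped pages fault
    ZERO_RO = "2a"       # option 2a: read-only zero page
    ZERO_DISCARD = "2b"  # option 2b: stores accepted but dropped
    ZERO_PRIVATE = "2c"  # option 2c: private anonymous memory


class UnmappedAccess(Exception):
    """Stands in for the fault that would need to reach the guest."""


class Region:
    def __init__(self, policy):
        self.policy = policy
        self.mapped = {}   # page number -> bytearray with a real mapping
        self.private = {}  # page number -> bytearray used for option 2c

    def _page(self, offset, for_write):
        page = offset // PAGE_SIZE
        if page in self.mapped:
            return self.mapped[page]
        if self.policy is Policy.FAULT:
            raise UnmappedAccess(hex(offset))
        if self.policy is Policy.ZERO_RO and for_write:
            raise UnmappedAccess(hex(offset))  # store to a read-only page
        if self.policy is Policy.ZERO_PRIVATE:
            return self.private.setdefault(page, bytearray(PAGE_SIZE))
        return bytearray(PAGE_SIZE)  # throwaway zero page: stores are lost

    def load(self, offset):
        return self._page(offset, False)[offset % PAGE_SIZE]

    def store(self, offset, value):
        self._page(offset, True)[offset % PAGE_SIZE] = value
```

With ZERO_DISCARD a store followed by a load of the same unmapped
offset reads back zero, with ZERO_PRIVATE it reads back the stored
value, and with FAULT neither access completes.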
The situation is probably easier for virtio-gpu than for virtiofs. I
think the underlying host files won't be truncated and guest userspace
processes cannot access unmapped pages. So virtio-gpu is less
susceptible to unmapped accesses.
But we still need to implement unmapped access semantics. I don't know
enough about CPU memory to suggest a solution for injecting unmapped
access faults. Maybe you can find someone who can help. I wonder if pmem
or CXL devices have similar requirements?
Stefan
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [virtio-dev] Next VirtIO device for Project Stratos?
2022-09-06 18:12 ` Stefan Hajnoczi
@ 2022-09-07 14:09 ` Dr. David Alan Gilbert
2022-09-07 17:15 ` Stefan Hajnoczi
0 siblings, 1 reply; 12+ messages in thread
From: Dr. David Alan Gilbert @ 2022-09-07 14:09 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: Alyssa Ross, Alex Bennée, stratos-dev, virtio-dev,
virtio-comment@lists.oasis-open.org, Viresh Kumar,
Mathieu Poirier, Mike Holmes, Matt Spencer, Peter Griffin,
Dan Milea, Bill Mills, Francois Ozog, Johannes Berg,
Gerd Hoffmann, Arnd Bergmann, Christian Pinto, Namhyung Kim,
Petre Eftime, Peter Hilber, Marcel Holtmann, Michael S. Tsirkin,
Puck Meerburg, Gurchetan Singh
* Stefan Hajnoczi (stefanha@redhat.com) wrote:
> On Tue, Sep 06, 2022 at 06:33:36PM +0100, Dr. David Alan Gilbert wrote:
> > * Stefan Hajnoczi (stefanha@redhat.com) wrote:
> > > On Sat, Sep 03, 2022 at 07:43:08AM +0000, Alyssa Ross wrote:
> > > > Hi Alex and everyone else, just catching up on some mail and wanted to
> > > > clarify some things:
> > > >
> > > > Alex Bennée <alex.bennee@linaro.org> writes:
> > > >
> > > > > This email is driven by a brain storming session at a recent sprint
> > > > > where we considered what VirtIO devices we should look at implementing
> > > > > next. I ended up going through all the assigned device IDs hunting for
> > > > > missing spec discussion and existing drivers so I'd welcome feedback
> > > > > from anybody actively using them - especially as my suppositions about
> > > > > device types I'm not familiar with may be way off!
> > > > >
> > > > > [...snip...]
> > > > >
> > > > > GPU device / 16
> > > > > ---------------
> > > > >
> > > > > This is now a fairly mature part of the spec and has implementations in
> > > > > the kernel, QEMU and a vhost-user backend. However as is commensurate
> > > > > with the complexity of GPUs there is ongoing development moving from the
> > > > > VirGL OpenGL encapsulation to a thing called GFXSTREAM which is meant to
> > > > > make some things easier.
> > > > >
> > > > > A potential area of interest here is working out what the differences
> > > > > are in use cases between virtio-gpu and virtio-wayland. virtio-wayland
> > > > > is currently a ChromeOS only invention so hasn't seen any upstreaming or
> > > > > specification work but may make more sense where multiple VMs are
> > > > > drawing only elements of a final display which is composited by a master
> > > > > program. For further reading see Alyssa's write-up:
> > > > >
> > > > > https://alyssa.is/using-virtio-wl/
> > > > >
> > > > > I'm not sure how widely used the existing vhost-user backend is for
> > > > > virtio-gpu but it could present an opportunity for a more beefy rust-vmm
> > > > > backend implementation?
> > > >
> > > > As I understand it, virtio-wayland is effectively deprecated in favour
> > > > of sending Wayland messages over cross-domain virtio-gpu contexts. It's
> > > > possible to do this now with an upstream kernel, whereas virtio-wayland
> > > > always required a custom driver in the Chromium kernel.
> > > >
> > > > But crosvm is still the only implementation of a virtio-gpu device that
> > > > supports Wayland over cross-domain contexts, so it would be great to see
> > > > a more generic implementation. Especially because, while crosvm can
> > > > share its virtio-gpu device over vhost-user, it does so in a way that's
> > > > incompatible with the standardised vhost-user-gpu as implemented by
> > > > QEMU. When I asked the crosvm developers in their Matrix channel what
> > > > it would take to use the standard vhost-user-gpu variant, they said that
> > > > the standard variant was lacking functionality they needed, like mapping
> > > > and unmapping GPU buffers into the guest.
> > >
> > > That sounds somewhat similar to virtiofs and its DAX Window, which needs
> > > vhost-user protocol extensions because of how memory is handled. David
> > > Gilbert wrote the QEMU virtiofs DAX patches, which are under
> > > development.
> > >
> > > I took a quick look at the virtio-gpu specs. If the crosvm behavior you
> > > mentioned is covered in the VIRTIO spec then I guess it's the "host
> > > visible memory region"?
> > >
> > > (If it's not in the VIRTIO spec then a spec change needs to be proposed
> > > first and a vhost-user protocol spec change can then support that new
> > > virtio-gpu feature.)
> > >
> > > The VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB command maps the device's resource
> > > into the host visible memory region so that the driver can see it.
> > >
> > > The virtiofs DAX window uses vhost-user slave channel messages to
> > > provide file descriptors and offsets for QEMU to mmap. QEMU mmaps the
> > > file pages into the shared memory region seen by the guest driver.
> > >
> > > Maybe an equivalent mechanism is needed for virtio-gpu so a device
> > > resource file descriptor can be passed to QEMU and then mmapped so the
> > > guest driver can see the pages?
> > >
> > > I think it's possible to unify the virtiofs and virtio-gpu extensions to
> > > the vhost-user protocol. Two new slave channel messages are needed: "map
> > > <fd, offset, len> to shared memory resource <n>" and "unmap <offset,
> > > len> from shared memory resource <n>". Both devices could use these
> > > messages to implement their respective DAX Window and Blob Resource
> > > functionality.
> >
> > It might be possible; but there's a bunch of lifetime/alignment/etc
> > questions to be answered.
> >
> > For virtiofs DAX we carve out a chunk of a BAR as a 'cache' (unfortunate
> > name) that we can then do mappings into.
> >
> > The VHOST_USER_SLAVE_FS_MAP/UNMAP commands can do the mapping:
> > https://gitlab.com/virtio-fs/qemu/-/commit/7c29854da484afd7ca95acbd2e4acfc2c75ef491
> > https://gitlab.com/virtio-fs/qemu/-/commit/f32bc2524035931856aa218ce18efa029b9eed02
> >
> > those might do what you want if you can figure out a way to generalise
> > the bar to map them into.
> >
> > There are some problems; KVM gets really really upset if you try and
> > access an area that doesn't have a mapping or is mapped to a truncated
> > file; do you want the guest to be able to crash like that?
>
> I think you are pointing out the existing problems with virtiofs
> map/unmap and not new issues related to virtio-gpu or generalizing the
> vhost-user messages?
>
Right, although what I don't have a feel for here is the semantics of the
things that are being mapped in the GPU case, and how likely it is that
the driver mapping them picks some bad offset.
Dave
> There are a few possibilities for dealing with unmapped ranges in Shared
> Memory Regions:
>
> 1. Reserve the unused Shared Memory Region ranges with mmap(PROT_NONE)
> so that accesses to unmapped pages result in faults.
> 2. Map zero pages that are either:
> a. read-only
> b. read-write but discard stores
> c. private/anonymous memory
>
> virtiofs does #1 and has trouble with accesses to unmapped areas because
> KVM's MMIO dispatch loop gets upset. On top of that virtiofs also needs
> a way to inject the fault into the guest so that the truncated mmap case
> can be detected in the guest.
>
> The situation is probably easier for virtio-gpu than for virtiofs. I
> think the underlying host files won't be truncated and guest userspace
> processes cannot access unmapped pages. So virtio-gpu is less
> susceptible to unmapped accesses.
>
> But we still need to implement unmapped access semantics. I don't know
> enough about CPU memory to suggest a solution for injecting unmapped
> access faults. Maybe you can find someone who can help. I wonder if pmem
> or CXL devices have similar requirements?
>
> Stefan
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [virtio-dev] Next VirtIO device for Project Stratos?
2022-09-07 14:09 ` Dr. David Alan Gilbert
@ 2022-09-07 17:15 ` Stefan Hajnoczi
0 siblings, 0 replies; 12+ messages in thread
From: Stefan Hajnoczi @ 2022-09-07 17:15 UTC (permalink / raw)
To: Dr. David Alan Gilbert
Cc: Alyssa Ross, Alex Bennée, stratos-dev, virtio-dev,
virtio-comment@lists.oasis-open.org, Viresh Kumar,
Mathieu Poirier, Mike Holmes, Matt Spencer, Peter Griffin,
Dan Milea, Bill Mills, Francois Ozog, Johannes Berg,
Gerd Hoffmann, Arnd Bergmann, Christian Pinto, Namhyung Kim,
Petre Eftime, Peter Hilber, Marcel Holtmann, Michael S. Tsirkin,
Puck Meerburg, Gurchetan Singh
[-- Attachment #1: Type: text/plain, Size: 6482 bytes --]
On Wed, Sep 07, 2022 at 03:09:27PM +0100, Dr. David Alan Gilbert wrote:
> * Stefan Hajnoczi (stefanha@redhat.com) wrote:
> > On Tue, Sep 06, 2022 at 06:33:36PM +0100, Dr. David Alan Gilbert wrote:
> > > * Stefan Hajnoczi (stefanha@redhat.com) wrote:
> > > > On Sat, Sep 03, 2022 at 07:43:08AM +0000, Alyssa Ross wrote:
> > > > > Hi Alex and everyone else, just catching up on some mail and wanted to
> > > > > clarify some things:
> > > > >
> > > > > Alex Bennée <alex.bennee@linaro.org> writes:
> > > > >
> > > > > > This email is driven by a brain storming session at a recent sprint
> > > > > > where we considered what VirtIO devices we should look at implementing
> > > > > > next. I ended up going through all the assigned device IDs hunting for
> > > > > > missing spec discussion and existing drivers so I'd welcome feedback
> > > > > > from anybody actively using them - especially as my suppositions about
> > > > > > device types I'm not familiar with may be way off!
> > > > > >
> > > > > > [...snip...]
> > > > > >
> > > > > > GPU device / 16
> > > > > > ---------------
> > > > > >
> > > > > > This is now a fairly mature part of the spec and has implementations in
> > > > > > the kernel, QEMU and a vhost-user backend. However as is commensurate
> > > > > > with the complexity of GPUs there is ongoing development moving from the
> > > > > > VirGL OpenGL encapsulation to a thing called GFXSTREAM which is meant to
> > > > > > make some things easier.
> > > > > >
> > > > > > A potential area of interest here is working out what the differences
> > > > > > are in use cases between virtio-gpu and virtio-wayland. virtio-wayland
> > > > > > is currently a ChromeOS only invention so hasn't seen any upstreaming or
> > > > > > specification work but may make more sense where multiple VMs are
> > > > > > drawing only elements of a final display which is composited by a master
> > > > > > program. For further reading see Alyssa's write-up:
> > > > > >
> > > > > > https://alyssa.is/using-virtio-wl/
> > > > > >
> > > > > > I'm not sure how widely used the existing vhost-user backend is for
> > > > > > virtio-gpu but it could present an opportunity for a more beefy rust-vmm
> > > > > > backend implementation?
> > > > >
> > > > > As I understand it, virtio-wayland is effectively deprecated in favour
> > > > > of sending Wayland messages over cross-domain virtio-gpu contexts. It's
> > > > > possible to do this now with an upstream kernel, whereas virtio-wayland
> > > > > always required a custom driver in the Chromium kernel.
> > > > >
> > > > > But crosvm is still the only implementation of a virtio-gpu device that
> > > > > supports Wayland over cross-domain contexts, so it would be great to see
> > > > > a more generic implementation. Especially because, while crosvm can
> > > > > share its virtio-gpu device over vhost-user, it does so in a way that's
> > > > > incompatible with the standardised vhost-user-gpu as implemented by
> > > > > QEMU. When I asked the crosvm developers in their Matrix channel what
> > > > > it would take to use the standard vhost-user-gpu variant, they said that
> > > > > the standard variant was lacking functionality they needed, like mapping
> > > > > and unmapping GPU buffers into the guest.
> > > >
> > > > That sounds somewhat similar to virtiofs and its DAX Window, which needs
> > > > vhost-user protocol extensions because of how memory is handled. David
> > > > Gilbert wrote the QEMU virtiofs DAX patches, which are under
> > > > development.
> > > >
> > > > I took a quick look at the virtio-gpu specs. If the crosvm behavior you
> > > > mentioned is covered in the VIRTIO spec then I guess it's the "host
> > > > visible memory region"?
> > > >
> > > > (If it's not in the VIRTIO spec then a spec change needs to be proposed
> > > > first and a vhost-user protocol spec change can then support that new
> > > > virtio-gpu feature.)
> > > >
> > > > The VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB command maps the device's resource
> > > > into the host visible memory region so that the driver can see it.
> > > >
> > > > The virtiofs DAX window uses vhost-user slave channel messages to
> > > > provide file descriptors and offsets for QEMU to mmap. QEMU mmaps the
> > > > file pages into the shared memory region seen by the guest driver.
> > > >
> > > > Maybe an equivalent mechanism is needed for virtio-gpu so a device
> > > > resource file descriptor can be passed to QEMU and then mmapped so the
> > > > guest driver can see the pages?
> > > >
> > > > I think it's possible to unify the virtiofs and virtio-gpu extensions to
> > > > the vhost-user protocol. Two new slave channel messages are needed: "map
> > > > <fd, offset, len> to shared memory resource <n>" and "unmap <offset,
> > > > len> from shared memory resource <n>". Both devices could use these
> > > > messages to implement their respective DAX Window and Blob Resource
> > > > functionality.
> > >
> > > It might be possible; but there's a bunch of lifetime/alignment/etc
> > > questions to be answered.
> > >
> > > For virtiofs DAX we carve out a chunk of a BAR as a 'cache' (unfortunate
> > > name) that we can then do mappings into.
> > >
> > > The VHOST_USER_SLAVE_FS_MAP/UNMAP commands can do the mapping:
> > > https://gitlab.com/virtio-fs/qemu/-/commit/7c29854da484afd7ca95acbd2e4acfc2c75ef491
> > > https://gitlab.com/virtio-fs/qemu/-/commit/f32bc2524035931856aa218ce18efa029b9eed02
> > >
> > > those might do what you want if you can figure out a way to generalise
> > > the bar to map them into.
> > >
> > > There are some problems; KVM gets really really upset if you try and
> > > access an area that doesn't have a mapping or is mapped to a truncated
> > > file; do you want the guest to be able to crash like that?
> >
> > I think you are pointing out the existing problems with virtiofs
> > map/unmap and not new issues related to virtio-gpu or generalizing the
> > vhost-user messages?
> >
>
> Right, although what I don't have a feel for here is the semantics of the
> things that are being mapped in the GPU case, and whether the driver
> mapping them could pick some bad offset.

I don't know either. I hope Gurchetan or Gerd can explain how the
virtio-gpu Shared Memory Region is used and whether accesses to unmapped
portions of the region are expected.

Stefan
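
To make the open question concrete, here is a minimal sketch of the
generic "map <fd, offset, len> to shared memory resource <n>" and
"unmap" messages discussed above, plus a toy model of which parts of a
shared memory region are backed at any moment. Everything in it is an
assumption for illustration: the ShmemMap name, its fields, the wire
layout in pack() and the SharedRegion class are hypothetical and come
from neither the VIRTIO spec nor the vhost-user protocol. In a real
implementation the file descriptor would travel as ancillary data on the
vhost-user socket rather than inside the payload.

```python
import struct
from dataclasses import dataclass

PAGE = 4096  # assumed page size for this sketch


@dataclass(frozen=True)
class ShmemMap:
    """Hypothetical payload for 'map <fd, offset, len> to shmem <n>'."""
    shmid: int       # shared memory resource <n>
    fd_offset: int   # offset into the file descriptor sent alongside
    shm_offset: int  # offset inside the shared memory region
    length: int      # number of bytes to map

    def pack(self) -> bytes:
        # Illustrative layout only: u8 id, 7 padding bytes, three
        # little-endian u64 fields. Not a layout from any spec.
        return struct.pack("<B7xQQQ", self.shmid, self.fd_offset,
                           self.shm_offset, self.length)


class SharedRegion:
    """Toy model that tracks which pages of a region are mapped."""

    def __init__(self, size: int):
        self.size = size
        self.mapped = set()  # indices of pages that currently have backing

    def _pages(self, offset: int, length: int) -> range:
        if offset < 0 or length <= 0 or offset % PAGE or length % PAGE:
            raise ValueError("offset/length must be page-aligned, non-empty")
        if offset + length > self.size:
            raise ValueError("range exceeds the shared memory region")
        return range(offset // PAGE, (offset + length) // PAGE)

    def map(self, msg: ShmemMap) -> None:
        if msg.fd_offset % PAGE:
            raise ValueError("fd_offset must be page-aligned")
        self.mapped.update(self._pages(msg.shm_offset, msg.length))

    def unmap(self, shm_offset: int, length: int) -> None:
        self.mapped.difference_update(self._pages(shm_offset, length))

    def access_ok(self, offset: int, length: int) -> bool:
        # True only if every page touched by the access is mapped; a
        # False here is the "hole" case that upsets KVM.
        if offset < 0 or length <= 0 or offset + length > self.size:
            return False
        first = offset // PAGE
        last = (offset + length - 1) // PAGE
        return all(p in self.mapped for p in range(first, last + 1))


region = SharedRegion(16 * PAGE)
msg = ShmemMap(shmid=0, fd_offset=0, shm_offset=4 * PAGE, length=2 * PAGE)
region.map(msg)
inside = region.access_ok(4 * PAGE, 100)        # fully inside the mapping
straddle = region.access_ok(6 * PAGE - 1, 2)    # spills into a hole
region.unmap(4 * PAGE, PAGE)
after_unmap = region.access_ok(4 * PAGE, 1)     # page was just unmapped
```

The model only shows where holes can appear; whether a guest driver is
ever expected to touch such a hole in the virtio-gpu case is exactly the
question that still needs an answer.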
Thread overview: 12+ messages (newest: 2022-09-07 17:15 UTC)
2022-05-31 8:07 [virtio-comment] Next VirtIO device for Project Stratos? Alex Bennée
2022-06-06 9:35 ` Bradford, Robert
2022-06-08 12:38 ` Alex Bennée
[not found] ` <80aace95-6c39-c7b9-61ba-70d60bcd08b2@quicinc.com>
[not found] ` <c642058f36321cb7dfdfaa4664f5323841b65450.camel@sipsolutions.net>
[not found] ` <a3856ec8-90d6-df19-2b5f-bc42700b09db@quicinc.com>
2022-06-08 12:28 ` [virtio-comment] Re: [Stratos-dev] " Alex Bennée
2022-09-03 7:43 ` [virtio-dev] " Alyssa Ross
2022-09-05 15:22 ` [virtio-comment] " Alex Bennée
2022-09-06 7:47 ` [virtio-dev] Re: [Stratos-dev] " Alyssa Ross
2022-09-05 20:27 ` [virtio-comment] " Stefan Hajnoczi
2022-09-06 17:33 ` Dr. David Alan Gilbert
2022-09-06 18:12 ` Stefan Hajnoczi
2022-09-07 14:09 ` Dr. David Alan Gilbert
2022-09-07 17:15 ` Stefan Hajnoczi