From: Stefan Hajnoczi <stefanha@redhat.com>
To: Albert Esteve <aesteve@redhat.com>
Cc: "David Stevens" <stevensd@chromium.org>,
"Michael S. Tsirkin" <mst@redhat.com>,
"Alyssa Ross" <hi@alyssa.is>,
qemu-devel@nongnu.org, jasowang@redhat.com, david@redhat.com,
slp@redhat.com, "Alex Bennée" <alex.bennee@linaro.org>
Subject: Re: [RFC PATCH v2 0/5] vhost-user: Add SHMEM_MAP/UNMAP requests
Date: Thu, 5 Sep 2024 12:39:37 -0400 [thread overview]
Message-ID: <20240905163937.GE1922502@fedora> (raw)
In-Reply-To: <CADSE00K=8SCghVxbP+7Awy6tGHtP3JyYy-5MAAMjrpv+bVC=6Q@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 6280 bytes --]
On Tue, Sep 03, 2024 at 10:42:34AM +0200, Albert Esteve wrote:
> Hello all,
>
> Sorry, I have been a bit disconnected from this thread as I was on
> vacations and then had to switch tasks for a while.
>
> I will try to go through all comments and address them for the first
> non-RFC drop of this patch series.
>
> But I was discussing with some colleagues on this. So turns out rust-vmm's
> vhost-user-gpu will potentially use
> this soon, and a rust-vmm/vhost patch have been already posted:
> https://github.com/rust-vmm/vhost/pull/251.
> So I think it may make sense to:
> 1. Split the vhost-user documentation patch once settled. Since it is taken
> as the official spec,
> having it upstreamed independently of the implementation will benefit
> other projects to
> work/integrate their own code.
> 2. Split READ_/WRITE_MEM messages from SHMEM_MAP/_UNMAP patches.
> If I remember correctly, this addresses a virtio-fs specific issue,
> that will not
> impact either virtio-gpu nor virtio-media, or any other.
This is an architectural issue that arises from exposing VIRTIO Shared
Memory Regions in vhost-user. It was first seen with Linux virtiofs but
it could happen with other devices and/or guest operating systems.
Any VIRTIO Shared Memory Region that can be mmapped into Linux userspace
may trigger this issue. Userspace may write(2) to an O_DIRECT file with
the mmap as the source. The vhost-user-blk device will not be able to
access the source device's VIRTIO Shared Memory Region and will fail.
> So it may make
> sense
> to separate them so that one does not stall the other. I will try to
> have both
> integrated in the mid term.
If READ_/WRITE_MEM is a pain to implement (I think it is in the
vhost-user back-end, even though I've been a proponent of it), then
another way to deal with this issue is to specify that upon receiving
MAP/UNMAP messages, the vhost-user front-end must update the vhost-user
memory tables of all other vhost-user devices. That way vhost-user
devices will be able to access VIRTIO Shared Memory Regions mapped by
other devices.
Implementing this in QEMU should be much easier than implementing
READ_/WRITE_MEM support in device back-ends.
This will be slow and scale poorly but performance is only a problem for
devices that frequently MAP/UNMAP like virtiofs. Will virtio-gpu and
virtio-media use MAP/UNMAP often at runtime? They might be able to get
away with this simple solution.
I'd be happy with that. If someone wants to make virtiofs DAX faster,
they can implement READ/WRITE_MEM or another solution later, but let's
at least make things correct from the start.
Stefan
>
> WDYT?
>
> BR,
> Albert.
>
> On Tue, Jul 16, 2024 at 3:21 AM David Stevens <stevensd@chromium.org> wrote:
>
> > On Fri, Jul 12, 2024 at 2:47 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Fri, Jul 12, 2024 at 11:06:49AM +0900, David Stevens wrote:
> > > > On Thu, Jul 11, 2024 at 7:56 PM Alyssa Ross <hi@alyssa.is> wrote:
> > > > >
> > > > > Adding David Stevens, who implemented SHMEM_MAP and SHMEM_UNMAP in
> > > > > crosvm a couple of years ago.
> > > > >
> > > > > David, I'd be particularly interested for your thoughts on the
> > MEM_READ
> > > > > and MEM_WRITE commands, since as far as I know crosvm doesn't
> > implement
> > > > > anything like that. The discussion leading to those being added
> > starts
> > > > > here:
> > > > >
> > > > >
> > https://lore.kernel.org/qemu-devel/20240604185416.GB90471@fedora.redhat.com/
> > > > >
> > > > > It would be great if this could be standardised between QEMU and
> > crosvm
> > > > > (and therefore have a clearer path toward being implemented in other
> > VMMs)!
> > > >
> > > > Setting aside vhost-user for a moment, the DAX example given by Stefan
> > > > won't work in crosvm today.
> > > >
> > > > Is universal access to virtio shared memory regions actually mandated
> > > > by the virtio spec? Copying from virtiofs DAX to virtiofs sharing
> > > > seems reasonable enough, but what about virtio-pmem to virtio-blk?
> > > > What about screenshotting a framebuffer in virtio-gpu shared memory to
> > > > virtio-scsi? I guess with some plumbing in the VMM, it's solvable in a
> > > > virtualized environment. But what about when you have real hardware
> > > > that speaks virtio involved? That's outside my wheelhouse, but it
> > > > doesn't seem like that would be easy to solve.
> > >
> > > Yes, it can work for physical devices if allowed by host configuration.
> > > E.g. VFIO supports that I think. Don't think VDPA does.
> >
> > I'm sure it can work, but that sounds more like a SHOULD (MAY?),
> > rather than a MUST.
> >
> > > > For what it's worth, my interpretation of the target scenario:
> > > >
> > > > > Other backends don't see these mappings. If the guest submits a vring
> > > > > descriptor referencing a mapping to another backend, then that
> > backend
> > > > > won't be able to access this memory
> > > >
> > > > is that it's omitting how the implementation is reconciled with
> > > > section 2.10.1 of v1.3 of the virtio spec, which states that:
> > > >
> > > > > References into shared memory regions are represented as offsets from
> > > > > the beginning of the region instead of absolute memory addresses.
> > Offsets
> > > > > are used both for references between structures stored within shared
> > > > > memory and for requests placed in virtqueues that refer to shared
> > memory.
> > > >
> > > > My interpretation of that statement is that putting raw guest physical
> > > > addresses corresponding to virtio shared memory regions into a vring
> > > > is a driver spec violation.
> > > >
> > > > -David
> > >
> > > This really applies within device I think. Should be clarified ...
> >
> > You mean that a virtio device can use absolute memory addresses for
> > other devices' shared memory regions, but it can't use absolute memory
> > addresses for its own shared memory regions? That's a rather strange
> > requirement. Or is the statement simply giving an addressing strategy
> > that device type specifications are free to ignore?
> >
> > -David
> >
> >
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
next prev parent reply other threads:[~2024-09-05 16:40 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-28 14:57 [RFC PATCH v2 0/5] vhost-user: Add SHMEM_MAP/UNMAP requests Albert Esteve
2024-06-28 14:57 ` [RFC PATCH v2 1/5] vhost-user: Add VIRTIO Shared Memory map request Albert Esteve
2024-07-11 7:45 ` Stefan Hajnoczi
2024-09-03 9:54 ` Albert Esteve
2024-09-03 11:54 ` Albert Esteve
2024-09-05 16:45 ` Stefan Hajnoczi
2024-09-11 11:57 ` Albert Esteve
2024-09-11 14:54 ` Stefan Hajnoczi
2024-09-04 7:28 ` Albert Esteve
2024-06-28 14:57 ` [RFC PATCH v2 2/5] vhost_user: Add frontend command for shmem config Albert Esteve
2024-07-11 8:10 ` Stefan Hajnoczi
2024-09-04 9:05 ` Albert Esteve
2024-07-11 8:15 ` Stefan Hajnoczi
2024-06-28 14:57 ` [RFC PATCH v2 3/5] vhost-user-dev: Add cache BAR Albert Esteve
2024-07-11 8:25 ` Stefan Hajnoczi
2024-09-04 11:20 ` Albert Esteve
2024-06-28 14:57 ` [RFC PATCH v2 4/5] vhost_user: Add MEM_READ/WRITE backend requests Albert Esteve
2024-07-11 8:53 ` Stefan Hajnoczi
2024-06-28 14:57 ` [RFC PATCH v2 5/5] vhost_user: Implement mem_read/mem_write handlers Albert Esteve
2024-07-11 8:55 ` Stefan Hajnoczi
2024-09-04 13:01 ` Albert Esteve
2024-09-05 19:18 ` Stefan Hajnoczi
2024-09-10 7:14 ` Albert Esteve
2024-07-11 9:01 ` [RFC PATCH v2 0/5] vhost-user: Add SHMEM_MAP/UNMAP requests Stefan Hajnoczi
2024-07-11 10:56 ` Alyssa Ross
2024-07-12 2:06 ` David Stevens
2024-07-12 5:47 ` Michael S. Tsirkin
2024-07-15 2:30 ` Jason Wang
2024-07-16 1:21 ` David Stevens
2024-09-03 8:42 ` Albert Esteve
2024-09-05 16:39 ` Stefan Hajnoczi [this message]
2024-09-06 7:03 ` Albert Esteve
2024-09-06 13:15 ` Stefan Hajnoczi
2024-09-05 15:56 ` Stefan Hajnoczi
2024-09-06 4:18 ` David Stevens
2024-09-06 13:00 ` Stefan Hajnoczi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240905163937.GE1922502@fedora \
--to=stefanha@redhat.com \
--cc=aesteve@redhat.com \
--cc=alex.bennee@linaro.org \
--cc=david@redhat.com \
--cc=hi@alyssa.is \
--cc=jasowang@redhat.com \
--cc=mst@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=slp@redhat.com \
--cc=stevensd@chromium.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.