From: "Michael S. Tsirkin" <mst@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Peter Xu <peterx@redhat.com>,
Elena Afanasova <eafanasova@gmail.com>,
kvm@vger.kernel.org, john.g.johnson@oracle.com,
dinechin@redhat.com, cohuck@redhat.com, jasowang@redhat.com,
felipe@nutanix.com, elena.ufimtseva@oracle.com,
jag.raman@oracle.com
Subject: Re: MMIO/PIO dispatch file descriptors (ioregionfd) design discussion
Date: Thu, 3 Dec 2020 06:34:00 -0500
Message-ID: <20201203062357-mutt-send-email-mst@kernel.org>
In-Reply-To: <20201203111036.GD689053@stefanha-x1.localdomain>
On Thu, Dec 03, 2020 at 11:10:36AM +0000, Stefan Hajnoczi wrote:
> On Wed, Dec 02, 2020 at 01:06:28PM -0500, Peter Xu wrote:
> > On Wed, Nov 25, 2020 at 12:44:07PM -0800, Elena Afanasova wrote:
> >
> > [...]
> >
> > > Wire protocol
> > > -------------
> > > The protocol spoken over the file descriptor is as follows. The device reads
> > > commands from the file descriptor with the following layout::
> > >
> > > struct ioregionfd_cmd {
> > >     __u32 info;
> > >     __u32 padding;
> > >     __u64 user_data;
> > >     __u64 offset;
> > >     __u64 data;
> > > };
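As an illustration of the device side of this layout, here is a minimal sketch of reading one command from the fd. It assumes a stream-style fd where short reads are possible; read_cmd() and roundtrip_ok() are hypothetical helper names, fixed-width C types stand in for __u32/__u64, and the value placed in info is arbitrary because the encoding of that field is not shown in the quoted excerpt.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Layout copied from the proposal; uint32_t/uint64_t stand in for __u32/__u64. */
struct ioregionfd_cmd {
    uint32_t info;
    uint32_t padding;
    uint64_t user_data;
    uint64_t offset;
    uint64_t data;
};

/* The fields add up to 32 bytes with no implicit padding. */
_Static_assert(sizeof(struct ioregionfd_cmd) == 32, "unexpected command size");

/*
 * Hypothetical helper: read exactly one command, retrying on short reads
 * and EINTR. Returns 0 on success, -1 on EOF or error.
 */
static int read_cmd(int fd, struct ioregionfd_cmd *cmd)
{
    unsigned char *p = (unsigned char *)cmd;
    size_t left = sizeof(*cmd);

    assert(cmd != NULL);
    while (left > 0) {
        ssize_t n = read(fd, p, left);

        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return -1; /* peer closed the fd */
        p += n;
        left -= (size_t)n;
    }
    return 0;
}

/* Self-check: push one command through a pipe and read it back. */
static int roundtrip_ok(void)
{
    int fds[2];
    struct ioregionfd_cmd in = {
        .info = 1, /* arbitrary value, not a real encoding */
        .user_data = 42,
        .offset = 0x10,
        .data = 0xabcd,
    };
    struct ioregionfd_cmd out;
    int ok = 0;

    if (pipe(fds) != 0)
        return 0;
    if (write(fds[1], &in, sizeof(in)) == (ssize_t)sizeof(in))
        ok = read_cmd(fds[0], &out) == 0 &&
             memcmp(&in, &out, sizeof(in)) == 0;
    close(fds[0]);
    close(fds[1]);
    return ok;
}
```

A pipe stands in for the ioregionfd here only so the sketch can be exercised without KVM; the response path is left out because its layout is not part of the quoted text.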
> >
> > I'm thinking whether it would be nice to have a handshake on the wire protocol
> > before starting the cmd/resp sequence.
> >
> > I was thinking about migration - we have had a hard time trying to be
> > compatible between old/new qemus. Now we fixed those by applying the same
> > migration capabilities on both sides always so we do the handshake "manually"
> > from libvirt, but it really should be done with a real handshake on the
> > channel, imho.. That's another story, for sure.
> >
> > My understanding is that the wire protocol is kind of a standalone (but tiny)
> > protocol between kvm and the emulation process. So I'm thinking the handshake
> > could also help when e.g. kvm can fall back to an old version of the wire protocol
> > if it knows the emulation binary is old. Ideally, I think this could even
> > happen without VMM's awareness.
> >
> > [...]
>
> I imagined that would happen in the control plane (KVM ioctls) instead
> of the data plane (the fd). There is a flags field in
> ioctl(KVM_SET_IOREGION):
>
> struct kvm_ioregion {
>     __u64 guest_paddr; /* guest physical address */
>     __u64 memory_size; /* bytes */
>     __u64 user_data;
>     __s32 fd; /* previously created with KVM_CREATE_IOREGIONFD */
>     __u32 flags;
>     __u8 pad[32];
> };
>
> When userspace sets up the ioregionfd it can tell the kernel which
> features to enable.
>
> Feature availability can be checked through ioctl(KVM_CHECK_EXTENSION).
>
> Do you think this existing mechanism is enough? It's not clear to me
> what kind of additional negotiation would be necessary between the
> device emulation process and KVM after the ioregionfd has been
> registered?
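To make the control-plane idea concrete, here is a rough sketch of how userspace might fill in the structure while refusing feature bits the kernel did not advertise. The flag names IOREGION_FEAT_A/IOREGION_FEAT_B and the helper ioregion_prepare() are hypothetical, and the supported mask is passed in as a plain argument; in a real VMM it would presumably be derived from ioctl(KVM_CHECK_EXTENSION) before calling ioctl(KVM_SET_IOREGION).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Layout copied from the quoted proposal, with fixed-width C types. */
struct kvm_ioregion {
    uint64_t guest_paddr; /* guest physical address */
    uint64_t memory_size; /* bytes */
    uint64_t user_data;
    int32_t  fd;          /* previously created with KVM_CREATE_IOREGIONFD */
    uint32_t flags;
    uint8_t  pad[32];
};

_Static_assert(sizeof(struct kvm_ioregion) == 64, "unexpected struct size");

/* Hypothetical feature bits, for illustration only. */
#define IOREGION_FEAT_A (1u << 0)
#define IOREGION_FEAT_B (1u << 1)

/*
 * Hypothetical helper: fill in the request, failing if userspace asks for
 * a feature outside the mask the kernel reported as supported.
 * Returns 0 on success, -1 if an unsupported flag was requested.
 */
static int ioregion_prepare(struct kvm_ioregion *r, uint64_t gpa,
                            uint64_t size, uint64_t user_data, int32_t fd,
                            uint32_t wanted, uint32_t supported)
{
    if (wanted & ~supported)
        return -1;

    memset(r, 0, sizeof(*r)); /* also clears the reserved pad bytes */
    r->guest_paddr = gpa;
    r->memory_size = size;
    r->user_data = user_data;
    r->fd = fd;
    r->flags = wanted;
    return 0;
}
```

The point of the sketch is only that the negotiation result ends up in flags at registration time, so nothing extra needs to travel over the fd itself.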
>
> > > Ordering
> > > --------
> > > Guest accesses are delivered in order, including posted writes.
> >
> > I'm wondering whether it should prepare for out-of-order commands, given
> > that without a handshake the protocol is harder to extend. There could be
> > some slow commands, and we would still want a chance to reply to a very
> > trivial command while handling the slow one (each command may then require
> > a command ID, too). But it won't be a problem at all if we can easily extend
> > the wire protocol, so that the ordering constraint can be extended too when
> > really needed, and we can always start with in-order-only requests.
>
> Elena and I brainstormed out-of-order commands but didn't include them
> in the proposal because it's not clear that they are needed. For
> multi-queue devices the per-queue registers can be assigned different
> ioregionfds that are handled by dedicated threads.
I think the difficulty is the reverse: reading any register from a PCI
device is normally enough to flush any writes and interrupts in progress.
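A toy model of that flush-on-read behaviour, purely as a sketch: writes are queued as "posted" and only applied when a read arrives, so the read can never observe state older than a preceding write. struct toy_dev and the toy_* functions are hypothetical names and do not correspond to any real device model or to the ioregionfd proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 8

/* Hypothetical single-register device with a queue of posted writes. */
struct toy_dev {
    uint64_t reg;                  /* value visible to reads */
    uint64_t pending[MAX_PENDING]; /* posted writes not yet applied */
    size_t npending;
};

/* Queue a posted write; returns -1 if the queue is full. */
static int toy_post_write(struct toy_dev *d, uint64_t val)
{
    if (d->npending == MAX_PENDING)
        return -1;
    d->pending[d->npending++] = val;
    return 0;
}

/* Apply all queued writes in the order they were posted. */
static void toy_flush(struct toy_dev *d)
{
    for (size_t i = 0; i < d->npending; i++)
        d->reg = d->pending[i];
    d->npending = 0;
}

/* A read first flushes every earlier write, then returns the register. */
static uint64_t toy_read(struct toy_dev *d)
{
    toy_flush(d);
    return d->reg;
}
```

In this model a read is a synchronization point for the whole device, which is why spreading registers across independently serviced fds makes this guarantee harder to keep.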
> Out-of-order commands are only necessary if a device needs to
> concurrently process register accesses to the *same* set of registers. I
> think it's rare for hardware register interfaces to be designed like
> that.
>
> This could be a mistake, of course. If someone knows a device that needs
> multiple in-flight register accesses, please let us know.
>
> Stefan