public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Peter Xu <peterx@redhat.com>,
	Elena Afanasova <eafanasova@gmail.com>,
	kvm@vger.kernel.org, john.g.johnson@oracle.com,
	dinechin@redhat.com, cohuck@redhat.com, jasowang@redhat.com,
	felipe@nutanix.com, elena.ufimtseva@oracle.com,
	jag.raman@oracle.com
Subject: Re: MMIO/PIO dispatch file descriptors (ioregionfd) design discussion
Date: Thu, 3 Dec 2020 06:34:00 -0500	[thread overview]
Message-ID: <20201203062357-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <20201203111036.GD689053@stefanha-x1.localdomain>

On Thu, Dec 03, 2020 at 11:10:36AM +0000, Stefan Hajnoczi wrote:
> On Wed, Dec 02, 2020 at 01:06:28PM -0500, Peter Xu wrote:
> > On Wed, Nov 25, 2020 at 12:44:07PM -0800, Elena Afanasova wrote:
> > 
> > [...]
> > 
> > > Wire protocol
> > > -------------
> > > The protocol spoken over the file descriptor is as follows. The device reads
> > > commands from the file descriptor with the following layout::
> > > 
> > >   struct ioregionfd_cmd {
> > >       __u32 info;
> > >       __u32 padding;
> > >       __u64 user_data;
> > >       __u64 offset;
> > >       __u64 data;
> > >   };
> > 
> > I'm thinking whether it would be nice to have a handshake on the wire protocol
> > before starting the cmd/resp sequence.
> > 
> > I was thinking about migration - we have had a hard time trying to be
> > compatible between old/new qemus.  Now we fixed those by applying the same
> > migration capabilities on both sides always so we do the handshake "manually"
> > from libvirt, but it really should be done with a real handshake on the
> > channel, imho..  That's another story, for sure.
> > 
> > My understanding is that the wire protocol is kind of a standalone (but tiny)
> > protocol between kvm and the emulation process.  So I'm thinking the handshake
> > could also help when e.g. kvm can fallback to an old version of wire protocol
> > if it knows the emulation binary is old.  Ideally, I think this could even
> > happen without VMM's awareness.
> > 
> > [...]
> 
> I imagined that would happen in the control plane (KVM ioctls) instead
> of the data plane (the fd). There is a flags field in
> ioctl(KVM_SET_IOREGION):
> 
>   struct kvm_ioregion {
>       __u64 guest_paddr; /* guest physical address */
>       __u64 memory_size; /* bytes */
>       __u64 user_data;
>       __s32 fd; /* previously created with KVM_CREATE_IOREGIONFD */
>       __u32 flags;
>       __u8  pad[32];
>   };
> 
> When userspace sets up the ioregionfd it can tell the kernel which
> features to enable.
> 
> Feature availability can be checked through ioctl(KVM_CHECK_EXTENSION).
> 
> Do you think this existing mechanism is enough? It's not clear to me
> what kind of additional negotiation would be necessary between the
> device emulation process and KVM after the ioregionfd has been
> registered?
> 
> > > Ordering
> > > --------
> > > Guest accesses are delivered in order, including posted writes.
> > 
> > I'm wondering whether it should prepare for out-of-order commands assuming if
> > there's no handshake so harder to extend, just in case there could be some slow
> > commands so we still have chance to reply to a very trivial command during
> > handling the slow one (then each command may require a command ID, too).  But
> > it won't be a problem at all if we can easily extend the wire protocol so the
> > ordering constraint can be extended too when really needed, and we can always
> > start with in-order-only requests.
> 
> Elena and I brainstormed out-of-order commands but didn't include them
> in the proposal because it's not clear that they are needed. For
> multi-queue devices the per-queue registers can be assigned different
> ioregionfds that are handled by dedicated threads.

The difficulty is I think the reverse: reading
any register from a PCI device is normally enough to flush any
writes and interrupts in progress.



> Out-of-order commands are only necessary if a device needs to
> concurrently process register accesses to the *same* set of registers. I
> think it's rare for hardware register interfaces to be designed like
> that.
> 
> This could be a mistake, of course. If someone knows a device that needs
> multiple in-flight register accesses, please let us know.
> 
> Stefan



  reply	other threads:[~2020-12-03 11:35 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-25 20:44 MMIO/PIO dispatch file descriptors (ioregionfd) design discussion Elena Afanasova
2020-12-02 18:06 ` Peter Xu
2020-12-03 11:10   ` Stefan Hajnoczi
2020-12-03 11:34     ` Michael S. Tsirkin [this message]
2020-12-04 13:23       ` Stefan Hajnoczi
2020-12-03 14:40     ` Peter Xu
2020-12-07 14:58       ` Stefan Hajnoczi
2021-10-12  5:34 ` elena
2021-10-25 12:42   ` Stefan Hajnoczi
2021-10-25 15:21     ` Elena
2021-10-25 16:56       ` Stefan Hajnoczi
2021-10-26 19:01       ` John Levon
2021-10-27 10:15         ` Stefan Hajnoczi
2021-10-27 12:22           ` John Levon
2021-10-28  8:14             ` Stefan Hajnoczi
     [not found] <CAFO2pHzmVf7g3z0RikQbYnejwcWRtHKV=npALs49eRDJdt4mJQ@mail.gmail.com>
2020-11-26  3:37 ` Jason Wang
2020-11-26 12:36   ` Stefan Hajnoczi
2020-11-27  3:39     ` Jason Wang
2020-11-27 13:44       ` Stefan Hajnoczi
2020-11-30  2:14         ` Jason Wang
2020-11-30 12:47           ` Stefan Hajnoczi
2020-12-01  4:05             ` Jason Wang
2020-12-01 10:35               ` Stefan Hajnoczi
2020-12-02  2:53                 ` Jason Wang
2020-12-02 14:17                 ` Elena Afanasova

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201203062357-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=cohuck@redhat.com \
    --cc=dinechin@redhat.com \
    --cc=eafanasova@gmail.com \
    --cc=elena.ufimtseva@oracle.com \
    --cc=felipe@nutanix.com \
    --cc=jag.raman@oracle.com \
    --cc=jasowang@redhat.com \
    --cc=john.g.johnson@oracle.com \
    --cc=kvm@vger.kernel.org \
    --cc=peterx@redhat.com \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox