From: Sean Christopherson <seanjc@google.com>
To: Ilias Stamatis <ilstam@amazon.com>
Cc: kvm@vger.kernel.org, pbonzini@redhat.com, pdurrant@amazon.co.uk,
dwmw@amazon.co.uk, nh-open-source@amazon.com,
Paul Durrant <paul@xen.org>
Subject: Re: [PATCH v3 2/6] KVM: Add KVM_CREATE_COALESCED_MMIO_BUFFER ioctl
Date: Wed, 28 Aug 2024 07:25:26 -0700 [thread overview]
Message-ID: <Zs8yV4M0e4nZSdni@google.com> (raw)
In-Reply-To: <20240820133333.1724191-3-ilstam@amazon.com>
On Tue, Aug 20, 2024, Ilias Stamatis wrote:
> The current MMIO coalescing design has a few drawbacks which limit its
> usefulness. Currently all coalesced MMIO zones use the same ring buffer.
> That means that upon a userspace exit we have to handle potentially
> unrelated MMIO writes synchronously. And a VM-wide lock needs to be
> taken in the kernel when an MMIO exit occurs.
>
> Additionally, there is no direct way for userspace to be notified about
> coalesced MMIO writes. If the next MMIO exit to userspace is when the
> ring buffer has filled then a substantial (and unbounded) amount of time
> may have passed since the first coalesced MMIO.
>
> Add a KVM_CREATE_COALESCED_MMIO_BUFFER ioctl to KVM. This ioctl simply
Why allocate the buffer in KVM? It allows reusing coalesced_mmio_write() without
needing {get,put}_user() or any form of kmap(), but it adds complexity (quite a
bit in fact) to KVM and inherits many of the restrictions and warts of the existing
coalesced I/O implementation.
E.g. the size of the ring buffer is "fixed", yet variable based on the host kernel
page size.
Why not go with a BYOB model? E.g. so that userspace can explicitly define the
buffer size to best fit the properties of the I/O range, and to avoid additional
complexity in KVM.
> returns a file descriptor to the caller but does not allocate a ring
> buffer. Userspace can then pass this fd to mmap() to actually allocate a
> buffer and map it to its address space.
>
> Subsequent patches will allow userspace to:
>
> - Associate the fd with a coalescing zone when registering it so that
> writes to that zone are accumulated in that specific ring buffer
> rather than the VM-wide one.
> - Poll for MMIO writes using this fd.
Why? I get the desire for a doorbell, but KVM already supports "fast" I/O for
doorbells. The only use case I can think of is where the doorbell sits in the
same region as the data payload, but that essentially just saves an entry in
KVM's MMIO bus (or I suppose two entries if the doorbell is in the middle of
the region).
>
> Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
> Reviewed-by: Paul Durrant <paul@xen.org>
> ---
...
> +static void coalesced_mmio_buffer_vma_close(struct vm_area_struct *vma)
> +{
> + struct kvm_coalesced_mmio_buffer_dev *dev = vma->vm_private_data;
> +
> + spin_lock(&dev->ring_lock);
> +
> + vfree(dev->ring);
> + dev->ring = NULL;
> +
> + spin_unlock(&dev->ring_lock);
I doubt this is correct. I don't see VM_DONTCOPY, and so userspace can create a
second (or more) VMA by forking, and then KVM will hit a UAF if any of the VMAs
is closed.
> +}
> +
> +static const struct vm_operations_struct coalesced_mmio_buffer_vm_ops = {
> + .close = coalesced_mmio_buffer_vma_close,
> +};
> +
> +static int coalesced_mmio_buffer_mmap(struct file *file, struct vm_area_struct *vma)
> +{
> + struct kvm_coalesced_mmio_buffer_dev *dev = file->private_data;
> + unsigned long pfn;
> + int ret = 0;
> +
> + spin_lock(&dev->ring_lock);
> +
> + if (dev->ring) {
Restricting userspace to a single mmap() is a very bizarre restriction.
> + ret = -EBUSY;
> + goto out_unlock;
> + }
next prev parent reply other threads:[~2024-08-28 14:25 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-20 13:33 [PATCH v3 0/6] KVM: Improve MMIO Coalescing API Ilias Stamatis
2024-08-20 13:33 ` [PATCH v3 1/6] KVM: Fix coalesced_mmio_has_room() Ilias Stamatis
2024-08-24 0:04 ` Sean Christopherson
2024-08-20 13:33 ` [PATCH v3 2/6] KVM: Add KVM_CREATE_COALESCED_MMIO_BUFFER ioctl Ilias Stamatis
2024-08-28 14:25 ` Sean Christopherson [this message]
2024-08-29 11:20 ` Stamatis, Ilias
2024-08-29 14:55 ` Sean Christopherson
2024-08-29 17:42 ` Stamatis, Ilias
2024-08-20 13:33 ` [PATCH v3 3/6] KVM: Support poll() on coalesced mmio buffer fds Ilias Stamatis
2024-08-20 13:33 ` [PATCH v3 4/6] KVM: Add KVM_(UN)REGISTER_COALESCED_MMIO2 ioctls Ilias Stamatis
2024-08-20 13:33 ` [PATCH v3 5/6] KVM: Documentation: Document v2 of coalesced MMIO API Ilias Stamatis
2024-08-20 13:33 ` [PATCH v3 6/6] KVM: selftests: Add coalesced_mmio_test Ilias Stamatis
2024-08-28 17:20 ` Sean Christopherson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Zs8yV4M0e4nZSdni@google.com \
--to=seanjc@google.com \
--cc=dwmw@amazon.co.uk \
--cc=ilstam@amazon.com \
--cc=kvm@vger.kernel.org \
--cc=nh-open-source@amazon.com \
--cc=paul@xen.org \
--cc=pbonzini@redhat.com \
--cc=pdurrant@amazon.co.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox