All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Ilias Stamatis <ilstam@amazon.com>
Cc: kvm@vger.kernel.org, pbonzini@redhat.com, pdurrant@amazon.co.uk,
	 dwmw@amazon.co.uk, nh-open-source@amazon.com,
	Paul Durrant <paul@xen.org>
Subject: Re: [PATCH v3 2/6] KVM: Add KVM_CREATE_COALESCED_MMIO_BUFFER ioctl
Date: Wed, 28 Aug 2024 07:25:26 -0700	[thread overview]
Message-ID: <Zs8yV4M0e4nZSdni@google.com> (raw)
In-Reply-To: <20240820133333.1724191-3-ilstam@amazon.com>

On Tue, Aug 20, 2024, Ilias Stamatis wrote:
> The current MMIO coalescing design has a few drawbacks which limit its
> usefulness. Currently all coalesced MMIO zones use the same ring buffer.
> That means that upon a userspace exit we have to handle potentially
> unrelated MMIO writes synchronously. And a VM-wide lock needs to be
> taken in the kernel when an MMIO exit occurs.
> 
> Additionally, there is no direct way for userspace to be notified about
> coalesced MMIO writes. If the next MMIO exit to userspace is when the
> ring buffer has filled then a substantial (and unbounded) amount of time
> may have passed since the first coalesced MMIO.
> 
> Add a KVM_CREATE_COALESCED_MMIO_BUFFER ioctl to KVM. This ioctl simply

Why allocate the buffer in KVM?  It allows reusing coalesced_mmio_write() without
needing {get,put}_user() or any form of kmap(), but it adds complexity (quite a
bit in fact) to KVM and inherits many of the restrictions and warts of the existing
coalesced I/O implementation.

E.g. the size of the ring buffer is "fixed", yet variable based on the host kernel
page size.

Why not go with a BYOB model?  E.g. so that userspace can explicitly define the
buffer size to best fit the properties of the I/O range, and to avoid additional
complexity in KVM.

> returns a file descriptor to the caller but does not allocate a ring
> buffer. Userspace can then pass this fd to mmap() to actually allocate a
> buffer and map it to its address space.
> 
> Subsequent patches will allow userspace to:
> 
> - Associate the fd with a coalescing zone when registering it so that
>   writes to that zone are accumulated in that specific ring buffer
>   rather than the VM-wide one.
> - Poll for MMIO writes using this fd.

Why?  I get the desire for a doorbell, but KVM already supports "fast" I/O for
doorbells.  The only use case I can think of is where the doorbell sits in the
same region as the data payload, but that essentially just saves an entry in
KVM's MMIO bus (or I suppose two entries if the doorbell is in the middle of
the region).

> 
> Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
> Reviewed-by: Paul Durrant <paul@xen.org>
> ---

...


> +static void coalesced_mmio_buffer_vma_close(struct vm_area_struct *vma)
> +{
> +	struct kvm_coalesced_mmio_buffer_dev *dev = vma->vm_private_data;
> +
> +	spin_lock(&dev->ring_lock);
> +
> +	vfree(dev->ring);
> +	dev->ring = NULL;
> +
> +	spin_unlock(&dev->ring_lock);

I doubt this is correct.  I don't see VM_DONTCOPY, and so userspace can create a
second (or more) VMA by forking, and then KVM will hit a UAF if any of the VMAs
is closed.

> +}
> +
> +static const struct vm_operations_struct coalesced_mmio_buffer_vm_ops = {
> +	.close = coalesced_mmio_buffer_vma_close,
> +};
> +
> +static int coalesced_mmio_buffer_mmap(struct file *file, struct vm_area_struct *vma)
> +{
> +	struct kvm_coalesced_mmio_buffer_dev *dev = file->private_data;
> +	unsigned long pfn;
> +	int ret = 0;
> +
> +	spin_lock(&dev->ring_lock);
> +
> +	if (dev->ring) {

Restricting userspace to a single mmap() is a very bizarre restriction.

> +		ret = -EBUSY;
> +		goto out_unlock;
> +	}

  reply	other threads:[~2024-08-28 14:25 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-20 13:33 [PATCH v3 0/6] KVM: Improve MMIO Coalescing API Ilias Stamatis
2024-08-20 13:33 ` [PATCH v3 1/6] KVM: Fix coalesced_mmio_has_room() Ilias Stamatis
2024-08-24  0:04   ` Sean Christopherson
2024-08-20 13:33 ` [PATCH v3 2/6] KVM: Add KVM_CREATE_COALESCED_MMIO_BUFFER ioctl Ilias Stamatis
2024-08-28 14:25   ` Sean Christopherson [this message]
2024-08-29 11:20     ` Stamatis, Ilias
2024-08-29 14:55       ` Sean Christopherson
2024-08-29 17:42         ` Stamatis, Ilias
2024-08-20 13:33 ` [PATCH v3 3/6] KVM: Support poll() on coalesced mmio buffer fds Ilias Stamatis
2024-08-20 13:33 ` [PATCH v3 4/6] KVM: Add KVM_(UN)REGISTER_COALESCED_MMIO2 ioctls Ilias Stamatis
2024-08-20 13:33 ` [PATCH v3 5/6] KVM: Documentation: Document v2 of coalesced MMIO API Ilias Stamatis
2024-08-20 13:33 ` [PATCH v3 6/6] KVM: selftests: Add coalesced_mmio_test Ilias Stamatis
2024-08-28 17:20   ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zs8yV4M0e4nZSdni@google.com \
    --to=seanjc@google.com \
    --cc=dwmw@amazon.co.uk \
    --cc=ilstam@amazon.com \
    --cc=kvm@vger.kernel.org \
    --cc=nh-open-source@amazon.com \
    --cc=paul@xen.org \
    --cc=pbonzini@redhat.com \
    --cc=pdurrant@amazon.co.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.