From: Sasha Levin <levinsasha928@gmail.com>
To: Avi Kivity <avi@redhat.com>
Cc: kvm@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
Marcelo Tosatti <mtosatti@redhat.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Pekka Enberg <penberg@kernel.org>
Subject: Re: [PATCH] ioeventfd: Introduce KVM_IOEVENTFD_FLAG_PIPE
Date: Sun, 03 Jul 2011 20:44:51 +0300
Message-ID: <1309715091.4117.16.camel@sasha>
In-Reply-To: <4E10A3E6.1070606@redhat.com>

On Sun, 2011-07-03 at 20:16 +0300, Avi Kivity wrote:
> On 07/03/2011 08:04 PM, Sasha Levin wrote:
> > The new flag allows passing a write side of a pipe instead of an
> > eventfd to be notified of writes to the specified memory region.
> >
> > Instead of signaling an event, the value written to the memory region
> > is written to the pipe.
> >
> > Using a pipe instead of an eventfd is useful when any value can be
> > written to the memory region but we're interested in receiving the
> > actual value instead of just a notification.
> >
> > A simple example for practical use is the serial port. We are not
> > interested in an exit every time a char is written to the port, but
> > we do need to know what was written so we could handle it on the guest.
>
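
To illustrate the difference described above, here is a minimal userspace
sketch. It is not part of the patch and does not touch KVM; the function
names notify_via_eventfd() and forward_via_pipe() are hypothetical. It only
uses eventfd(2) and pipe(2) to show that an eventfd reports how many writes
happened, while a pipe carries the written bytes themselves.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Model of the eventfd path: one signal per guest write. The reader learns
 * how many writes happened, but the written bytes never reach it.
 * Returns the counter read back from the eventfd, or 0 on error.
 */
uint64_t notify_via_eventfd(const uint8_t *vals, size_t n)
{
    uint64_t count = 0;
    /* Non-blocking, so reading an unsignaled eventfd fails instead of hanging. */
    int efd = eventfd(0, EFD_NONBLOCK);

    (void)vals; /* deliberately unused: an eventfd carries no payload */
    if (efd < 0)
        return 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t one = 1;
        if (write(efd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
            close(efd);
            return 0;
        }
    }
    if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count))
        count = 0;
    close(efd);
    return count;
}

/*
 * Model of the proposed pipe path: each written value is forwarded.
 * Assumes n is far below the pipe capacity, so write() cannot block.
 * Returns the number of bytes recovered into out, or -1 on error.
 */
ssize_t forward_via_pipe(const uint8_t *vals, size_t n,
                         uint8_t *out, size_t out_len)
{
    int fds[2];
    ssize_t got;

    if (pipe(fds) < 0)
        return -1;
    for (size_t i = 0; i < n; i++) {
        if (write(fds[1], &vals[i], 1) != 1) {
            close(fds[0]);
            close(fds[1]);
            return -1;
        }
    }
    /* Close the write side first so read() sees EOF once the data is drained. */
    close(fds[1]);
    got = read(fds[0], out, out_len);
    close(fds[0]);
    return got;
}
```

Feeding the three characters 'h', 'i', '!' through both functions yields a
count of 3 from the eventfd and the three original bytes from the pipe, which
is the serial port case in miniature.
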
> > ---
> > include/linux/kvm.h | 2 +
> > virt/kvm/eventfd.c | 65 +++++++++++++++++++++++++++++++++++---------------
>
> Documentation/virtual/kvm/api.txt +++++++++++++++++

I couldn't find the ioeventfd docs in there and forgot that they're not in
mainline yet. I'll rebase on the kvm git tree :)

> >
> > @@ -424,6 +425,7 @@ struct _ioeventfd {
> > struct list_head list;
> > u64 addr;
> > int length;
> > + struct file *pipe;
> > struct eventfd_ctx *eventfd;
>
> In a union with eventfd please.
>
> > @@ -481,6 +487,21 @@ ioeventfd_in_range(struct _ioeventfd *p, gpa_t addr, int len, const void *val)
> > return _val == p->datamatch ? true : false;
> > }
> >
> > +static ssize_t kernel_write(struct file *file, const char *buf, size_t count,
> > + loff_t pos)
> > +{
> > + mm_segment_t old_fs;
> > + ssize_t res;
> > +
> > + old_fs = get_fs();
> > + set_fs(get_ds());
> > + /* The cast to a user pointer is valid due to the set_fs() */
> > + res = vfs_write(file, (const char __user *)buf, count, &pos);
> > + set_fs(old_fs);
> > +
> > + return res;
> > +}
> > +
>
> Is there no generic helper for this? Should there be?
>
I couldn't find one; I took the code above from fs/splice.c.
There should probably be a generic version of it, as this snippet is
repeated in several places throughout the kernel.

> > /* MMIO/PIO writes trigger an event if the addr/val match */
> > static int
> > ioeventfd_write(struct kvm_io_device *this, gpa_t addr, int len,
> > @@ -491,7 +512,11 @@ ioeventfd_write(struct kvm_io_device *this, gpa_t addr, int len,
> > if (!ioeventfd_in_range(p, addr, len, val))
> > return -EOPNOTSUPP;
> >
> > - eventfd_signal(p->eventfd, 1);
> > + if (p->pipe)
> > + kernel_write(p->pipe, val, len, 0);
>
> You're writing potentially variable length data.
>
> We need a protocol containing address, data, length, and supporting read
> accesses as well.
>
This can't be variable length.

The user defines an ioeventfd as an address plus a length (with the length
being up to 8 bytes). An ioeventfd is signaled only when the write to
guest memory is exactly at the specified address and has exactly the
specified length.

ioeventfds can be extended to handle more than 8 bytes, variable address
offsets and reads now that pipe support is added, but I'd rather do that in
follow-up patches once basic pipe support is in.
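
The matching rule described above can be sketched as a small userspace
model. This is an illustration only, loosely following the shape of
ioeventfd_in_range() from the quoted hunk; struct ioeventfd_model and
ioeventfd_model_matches() are hypothetical names, and the restriction to
lengths of 1, 2, 4 and 8 bytes in the datamatch path is an assumption of
this sketch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-in for the fields of struct _ioeventfd used when matching. */
struct ioeventfd_model {
    uint64_t addr;      /* guest address the ioeventfd is registered at */
    int length;         /* exact access length it was registered with */
    uint64_t datamatch; /* value to compare against when !wildcard */
    bool wildcard;      /* true: any written value matches */
};

/*
 * Returns true only if the write hits exactly the registered address with
 * exactly the registered length and, unless wildcard is set, carries the
 * registered value.
 */
bool ioeventfd_model_matches(const struct ioeventfd_model *p,
                             uint64_t addr, int len, const void *val)
{
    uint64_t v;

    if (addr != p->addr || len != p->length)
        return false;
    if (p->wildcard)
        return true;

    /* Load through a temporary of the right width to stay endian-neutral. */
    switch (len) {
    case 1: { uint8_t t;  memcpy(&t, val, sizeof(t)); v = t; break; }
    case 2: { uint16_t t; memcpy(&t, val, sizeof(t)); v = t; break; }
    case 4: { uint32_t t; memcpy(&t, val, sizeof(t)); v = t; break; }
    case 8: { uint64_t t; memcpy(&t, val, sizeof(t)); v = t; break; }
    default:
        return false;
    }
    return v == p->datamatch;
}
```

Since a match requires len == p->length, every record produced for a given
ioeventfd has the same size, which is the point made above about the data
not being variable length.
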
> Is the write guaranteed atomic? We probably need serialization here.

AFAIK vfs_write() is just a wrapper around the write() function of the
underlying fs, so it should be atomic, no?

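
For reference, the userspace guarantee this relies on can be sketched as
follows. POSIX specifies that a write() of at most PIPE_BUF bytes to a pipe
is not interleaved with data from other writers. Whether the in-kernel
vfs_write() path behaves the same way is an assumption here, not something
this sketch verifies. pipe_put_record() and pipe_get_record() are
hypothetical helper names.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef PIPE_BUF
#define PIPE_BUF 512 /* fallback: POSIX guarantees at least 512 bytes */
#endif

/*
 * Write one record with a single write() call. Records larger than
 * PIPE_BUF are rejected because POSIX no longer guarantees that they
 * arrive without interleaving. Returns 0 on success, -1 otherwise.
 */
int pipe_put_record(int wfd, const void *val, size_t len)
{
    ssize_t r;

    if (len == 0 || len > PIPE_BUF)
        return -1;
    r = write(wfd, val, len);
    return r == (ssize_t)len ? 0 : -1;
}

/*
 * Read one fixed-size record. A short read is treated as an error.
 * Returns 0 on success, -1 otherwise.
 */
int pipe_get_record(int rfd, void *buf, size_t len)
{
    ssize_t r = read(rfd, buf, len);

    return r == (ssize_t)len ? 0 : -1;
}
```

An 8-byte value is far below PIPE_BUF, so each record goes out in one piece
as long as every writer uses a single write() per record. This says nothing
about ordering between concurrent writers, so the serialization question
remains a separate concern.
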
>
> > + else
> > + eventfd_signal(p->eventfd, 1);
> > +
> > return 0;
> > }
> >
> > @@ -555,9 +580,11 @@ kvm_assign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
> > if (args->flags & ~KVM_IOEVENTFD_VALID_FLAG_MASK)
> > return -EINVAL;
> >
> > - eventfd = eventfd_ctx_fdget(args->fd);
> > - if (IS_ERR(eventfd))
> > - return PTR_ERR(eventfd);
> > + if (!(args->flags & KVM_IOEVENTFD_FLAG_PIPE)) {
> > + eventfd = eventfd_ctx_fdget(args->fd);
> > + if (IS_ERR(eventfd))
> > + return PTR_ERR(eventfd);
> > + }
> >
> > p = kzalloc(sizeof(*p), GFP_KERNEL);
> > if (!p) {
> > @@ -568,7 +595,11 @@ kvm_assign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
> > INIT_LIST_HEAD(&p->list);
> > p->addr = args->addr;
> > p->length = args->len;
> > - p->eventfd = eventfd;
> > +
> > + if (args->flags & KVM_IOEVENTFD_FLAG_PIPE)
> > + p->pipe = fget(args->fd);
> > + else
> > + p->eventfd = eventfd;
>
> The split logic with the previous hunk isn't nice. Suggest moving the
> 'else' there, and assigning the whole union here.
>
> > list_for_each_entry_safe(p, tmp, &kvm->ioeventfds, list) {
> > bool wildcard = !(args->flags & KVM_IOEVENTFD_FLAG_DATAMATCH);
> >
> > - if (p->eventfd != eventfd ||
> > - p->addr != args->addr ||
> > + if (p->addr != args->addr ||
> > p->length != args->len ||
> > p->wildcard != wildcard)
> > continue;
>
> Why?

I didn't think that assigning two different events with exactly the same
address, length and data could happen. Why would it?

>
>
--
Sasha.