From: Amos Kong <akong@redhat.com>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: aliguori@us.ibm.com, stefanha@linux.vnet.ibm.com,
kvm@vger.kernel.org, mtosatti@redhat.com, qemu-devel@nongnu.org,
Avi Kivity <avi@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 0/2] virtio-pci: fix abort when fail to allocate ioeventfd
Date: Fri, 16 Mar 2012 16:59:35 +0800
Message-ID: <4F6300F7.6080806@redhat.com>
In-Reply-To: <CAJSP0QXHu9_Yahrdv5eu+EbOhSAAZtf2gUWRc6iZrYmtdqd4dA@mail.gmail.com>
On 14/03/12 19:46, Stefan Hajnoczi wrote:
> On Wed, Mar 14, 2012 at 10:46 AM, Avi Kivity<avi@redhat.com> wrote:
>> On 03/14/2012 12:39 PM, Stefan Hajnoczi wrote:
>>> On Wed, Mar 14, 2012 at 10:05 AM, Avi Kivity<avi@redhat.com> wrote:
>>>> On 03/14/2012 11:59 AM, Stefan Hajnoczi wrote:
>>>>> On Wed, Mar 14, 2012 at 9:22 AM, Avi Kivity<avi@redhat.com> wrote:
>>>>>> On 03/13/2012 12:42 PM, Amos Kong wrote:
>>>>>>> When a guest boots with 232 virtio-blk disks, qemu aborts because it
>>>>>>> fails to allocate an ioeventfd. This patchset changes
>>>>>>> kvm_has_many_ioeventfds() to check whether an ioeventfd is still
>>>>>>> available. If not, virtio-pci falls back to userspace and does not
>>>>>>> use ioeventfd for I/O notification.
>>>>>>
>>>>>> How about an alternative way of solving this, within the memory core:
>>>>>> trap those writes in qemu and write to the ioeventfd yourself. This way
>>>>>> ioeventfds work even without kvm:
>>>>>>
>>>>>>
>>>>>> core: create eventfd
>>>>>> core: install handler for memory address that writes to ioeventfd
>>>>>> kvm (optional): install kernel handler for ioeventfd
Can you give some more detail about this? I'm not familiar with the
Memory API.

Btw, can we fix this problem by replacing abort() with an error message?
virtio-pci will automatically fall back to userspace.

diff --git a/kvm-all.c b/kvm-all.c
index 3c6b4f0..cf23dbf 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -749,7 +749,8 @@ static void kvm_mem_ioeventfd_add(MemoryRegionSection *section,
     r = kvm_set_ioeventfd_mmio_long(fd, section->offset_within_address_space,
                                     data, true);
     if (r < 0) {
-        abort();
+        fprintf(stderr, "%s: unable to map ioeventfd: %s.\nFallback to "
+                "userspace (slower).\n", __func__, strerror(-r));
     }
 }
@@ -775,7 +776,8 @@ static void kvm_io_ioeventfd_add(MemoryRegionSection *section,
     r = kvm_set_ioeventfd_pio_word(fd, section->offset_within_address_space,
                                    data, true);
     if (r < 0) {
-        abort();
+        fprintf(stderr, "%s: unable to map ioeventfd: %s.\nFallback to "
+                "userspace (slower).\n", __func__, strerror(-r));
     }
 }
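
To make the idea in the diff easier to discuss, here is a small
standalone sketch of the same pattern: report the error and tell the
caller to use the slow path instead of calling abort(). It is not QEMU
code. fake_assign_ioeventfd(), FAKE_SLOT_LIMIT and try_kernel_notify()
are made-up names, and the limit of 2 is arbitrary. Whether virtio-pci
really copes with the failure on its own is exactly the open question
above, so the sketch returns a flag the caller has to check.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for kvm_set_ioeventfd_*(): succeeds until a fixed
 * number of slots is used up, then returns a negative errno value. */
#define FAKE_SLOT_LIMIT 2
static int fake_slots_used;

static int fake_assign_ioeventfd(void)
{
    if (fake_slots_used >= FAKE_SLOT_LIMIT) {
        return -ENOSPC;
    }
    fake_slots_used++;
    return 0;
}

/* Returns true if the (fake) kernel fast path is in place, false if the
 * caller must handle notifications in userspace. Never aborts. */
static bool try_kernel_notify(const char *dev)
{
    int r;

    assert(dev != NULL);
    r = fake_assign_ioeventfd();
    if (r < 0) {
        fprintf(stderr, "%s: unable to map ioeventfd: %s.\n"
                "Fallback to userspace (slower).\n", dev, strerror(-r));
        return false;
    }
    return true;
}
```

With the limit of 2 assumed here, the first two calls return true and
the third prints the warning and returns false.
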
>>>>>> even if the third step fails, the ioeventfd still works, it's just slower.
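
The three steps quoted above could look roughly like this in plain C on
Linux. This is only a sketch under my own assumptions, not QEMU's Memory
API: struct notifier, notifier_init(), on_guest_write(),
notifier_drain() and install_kernel_handler() are made-up names, and the
kernel step is hard-wired to fail so that the slow path is the one being
exercised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

struct notifier {
    int fd;              /* step 1: eventfd shared with the consumer */
    bool kernel_handler; /* step 3: true if a kernel handler is installed */
};

/* Hypothetical placeholder for the optional kvm step. It always fails
 * here to model a kernel that has run out of ioeventfd slots. */
static int install_kernel_handler(int fd)
{
    (void)fd;
    return -1;
}

static int notifier_init(struct notifier *n)
{
    assert(n != NULL);
    n->fd = eventfd(0, EFD_NONBLOCK);   /* step 1: create the eventfd */
    if (n->fd < 0) {
        return -1;
    }
    /* step 3 (optional): failure is not fatal, only slower */
    n->kernel_handler = (install_kernel_handler(n->fd) == 0);
    return 0;
}

/* Step 2: userspace handler for the trapped guest write. It signals the
 * eventfd itself, so the consumer does not care who wrote to it. */
static int on_guest_write(struct notifier *n)
{
    uint64_t one = 1;

    return write(n->fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Consumer side: returns the number of pending kicks and resets the
 * counter, or 0 if nothing is pending. */
static uint64_t notifier_drain(struct notifier *n)
{
    uint64_t count = 0;

    if (read(n->fd, &count, sizeof(count)) != (ssize_t)sizeof(count)) {
        return 0;
    }
    return count;
}
```

The consumer only ever reads the eventfd, so it behaves the same whether
the kick came from a kernel handler or from on_guest_write(). The cost
of the slow path is the extra trip through the vcpu thread.
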
>>>>>
>>>>> That approach will penalize guests with large numbers of disks - they
>>>>> see an extra switch to vcpu thread instead of kvm.ko -> iothread.
>>>>
>>>> It's only a failure path. The normal path is expected to have a kvm
>>>> ioeventfd installed.
>>>
>>> It's the normal path when you attach >232 virtio-blk devices to a
>>> guest (or 300 in the future).
>>
>> Well, there's nothing we can do about it.
>>
>> We'll increase the limit of course, but old kernels will remain out
>> there. The right fix is virtio-scsi anyway.
>>
>>>>> It
>>>>> seems okay provided we can solve the limit in the kernel once and for
>>>>> all by introducing a more dynamic data structure for in-kernel
>>>>> devices. That way future kernels will never hit an arbitrary limit
>>>>> below their file descriptor rlimit.
>>>>>
>>>>> Is there some reason why kvm.ko must use a fixed size array? Would it
>>>>> be possible to use a tree (maybe with a cache for recent lookups)?
>>>>
>>>> It does use bsearch today IIRC. We'll expand the limit, but there must
>>>> be a limit, and qemu must be prepared to deal with it.
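
For illustration only, a fixed-size table searched with bsearch() and
capped at a hard limit might look like the sketch below. This is not the
kvm.ko code, which I have not checked. struct dev_table, MAX_DEVS (4
here, purely for the example), dev_table_add() and dev_table_has() are
made-up names.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_DEVS 4   /* hypothetical hard limit */

struct dev_table {
    uint64_t addr[MAX_DEVS];   /* kept sorted in ascending order */
    size_t count;
};

static int cmp_addr(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;

    return (x > y) - (x < y);
}

/* Insert addr while keeping the array sorted. Returns -ENOSPC once the
 * fixed limit is reached; the caller has to cope with that. */
static int dev_table_add(struct dev_table *t, uint64_t addr)
{
    size_t i;

    assert(t != NULL);
    if (t->count == MAX_DEVS) {
        return -ENOSPC;
    }
    for (i = t->count; i > 0 && t->addr[i - 1] > addr; i--) {
        t->addr[i] = t->addr[i - 1];
    }
    t->addr[i] = addr;
    t->count++;
    return 0;
}

/* Binary search over the sorted array: non-zero if addr is registered. */
static int dev_table_has(const struct dev_table *t, uint64_t addr)
{
    return bsearch(&addr, t->addr, t->count, sizeof(addr), cmp_addr) != NULL;
}
```

Lookups stay O(log n) however large the limit is, but an add beyond
MAX_DEVS always fails, which is the error path userspace has to handle.
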
>>>
>>> Shouldn't the limit be the file descriptor rlimit? If userspace
>>> cannot create more eventfds then it cannot set up more ioeventfds.
>>
>> You can use the same eventfd for multiple ioeventfds. If you mean to
>> slave kvm's ioeventfd limit to the number of files the process can have,
>> that's a good idea. Surely an ioeventfd occupies less resources than an
>> open file.
>
> Yes.
>
> Ultimately I guess you're right in that we still need to have an error
> path and virtio-scsi will reduce the pressure on I/O eventfds for
> storage.
>
> Stefan
--
Amos.