From: Stefan Hajnoczi
Date: Wed, 14 Mar 2012 09:59:09 +0000
Subject: Re: [Qemu-devel] [PATCH 0/2] virtio-pci: fix abort when fail to allocate ioeventfd
In-Reply-To: <4F606356.9080003@redhat.com>
To: Avi Kivity
Cc: aliguori@us.ibm.com, stefanha@linux.vnet.ibm.com, kvm@vger.kernel.org, mtosatti@redhat.com, qemu-devel@nongnu.org, Amos Kong

On Wed, Mar 14, 2012 at 9:22 AM, Avi Kivity wrote:
> On 03/13/2012 12:42 PM, Amos Kong wrote:
>> When booting a guest with 232 virtio-blk disks, qemu aborts because it
>> fails to allocate an ioeventfd. This patchset changes
>> kvm_has_many_ioeventfds() to check whether a free ioeventfd is
>> available. If not, virtio-pci falls back to userspace and does not use
>> an ioeventfd for I/O notification.
>
> How about an alternative way of solving this, within the memory core:
> trap those writes in qemu and write to the ioeventfd yourself. This way
> ioeventfds work even without kvm:
>
>   core: create eventfd
>   core: install handler for memory address that writes to ioeventfd
>   kvm (optional): install kernel handler for ioeventfd
>
> even if the third step fails, the ioeventfd still works, it's just slower.

That approach would penalize guests with large numbers of disks: they
would see an extra switch to the vcpu thread instead of the direct
kvm.ko -> iothread path.

It seems okay provided we can solve the limit in the kernel once and for
all by introducing a more dynamic data structure for in-kernel devices.
That way future kernels would never hit an arbitrary limit below their
file descriptor rlimit.

Is there some reason why kvm.ko must use a fixed-size array? Would it be
possible to use a tree (maybe with a cache for recent lookups)?

Stefan