From mboxrd@z Thu Jan 1 00:00:00 1970
From: Anthony Liguori
Subject: Re: [PATCH v2] Shared memory device with interrupt support
Date: Wed, 20 May 2009 08:45:14 -0500
Message-ID: <4A14096A.8090500@codemonkey.ws>
References: <1241712992-17004-1-git-send-email-cam@cs.ualberta.ca> <4A11AECB.70908@codemonkey.ws> <4A123639.9000901@redhat.com> <4A12FAEC.9070601@codemonkey.ws> <4A13C6D5.5060705@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Cam Macdonell , kvm@vger.kernel.org
To: Avi Kivity
Return-path:
Received: from yw-out-2324.google.com ([74.125.46.30]:36271 "EHLO yw-out-2324.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754508AbZETNpQ (ORCPT ); Wed, 20 May 2009 09:45:16 -0400
Received: by yw-out-2324.google.com with SMTP id 5so279272ywb.1 for ; Wed, 20 May 2009 06:45:17 -0700 (PDT)
In-Reply-To: <4A13C6D5.5060705@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

Avi Kivity wrote:
> Anthony Liguori wrote:
>> Avi Kivity wrote:
>>> Anthony Liguori wrote:
>>>> I'd strongly recommend working these patches on qemu-devel and
>>>> lkml. I suspect Avi may disagree with me, but in order for this to
>>>> be eventually merged in either place, you're going to have
>>>> additional requirements put on you.
>>>
>>> I don't disagree with the fact that there will be additional
>>> requirements, but I might disagree with some of those additional
>>> requirements themselves.
>>
>> It actually works out better than I think you expect it to...
>
> Can you explain why? You haven't addressed my concerns the last time
> around.

Because of the qemu_ram_alloc() patches. We no longer have a contiguous phys_ram_base so we don't have to deal with mmap(MAP_FIXED). We can also more practically do memory hot-add which is more or less a requirement of this work.
It also means we could do shared memory through more traditional means too like sys v ipc or whatever is the native mechanism on the underlying platform. That means we could even support Win32 (although I wouldn't make that an initial requirement).

>>
>> We can't use mmap() directly. With the new RAM allocation scheme, I
>> think it's pretty reasonable to now allow portions of ram to come
>> from files that get mmap() (sort of like -mem-path).
>>
>> This RAM area could be setup as a BAR.
>
> That's what Cam's patch does, and what you objected to.

I'm flexible. BARs are pretty unattractive because of the size requirements. The actual transport implementation is the least important part though IMHO. The guest interface and how it's implemented within QEMU is much more important to get right the first time.

>> Why is that unimplementable?
>
> Bad choice of words - it's implementable, just not very usable. You
> can't share 1GB in a 256MB guest, will fragment host vmas, no
> guarantee the guest can actually allocate all that memory, doesn't
> work with large pages, what happens on freeing, etc.

You can share 1GB with a PCI BAR today. You're limited to 32-bit addresses which admittedly we could fix.

Any reason to bother with BARs instead of just picking unused physical addresses? Does Windows do anything special with BAR addresses?

>>
>> The QEMU bits and the device model bits are actually relatively
>> simple. The part that I think needs more deep thought is the
>> guest-visible interface.
>>
>> A char device is probably not the best interface. I think you want
>> something like tmpfs/hugetlbfs.
>
> Yes those are so wonderful to work with.

qemu -ivshmem file=/dev/shm/ring.shared,name=shared-ring,size=1G,notify=/path/to/socket

/path/to/socket is used to pass an eventfd

Within the guest, you'd have:

/dev/ivshmemfs/shared-ring

An app would mmap() that file, and then could do something like an ioctl() to get an eventfd.
Alternatively, you could have something like:

/dev/ivshmemfs/mem/shared-ring
/dev/ivshmemfs/notify/shared-ring

Where notify/shared-ring behaves like an eventfd().

Regards,

Anthony Liguori