From: Anthony Liguori
Date: Mon, 10 May 2010 12:52:11 -0500
Message-ID: <4BE847CB.7050503@codemonkey.ws>
Subject: [Qemu-devel] Re: [PATCH v5 4/5] Inter-VM shared memory PCI device
To: Cam Macdonell
Cc: Avi Kivity, kvm@vger.kernel.org, qemu-devel@nongnu.org

On 05/10/2010 12:43 PM, Cam Macdonell wrote:
> On Mon, May 10, 2010 at 11:25 AM, Anthony Liguori wrote:
>> On 05/10/2010 11:59 AM, Avi Kivity wrote:
>>> On 05/10/2010 06:38 PM, Anthony Liguori wrote:
>>>>>> Otherwise, if the BAR is allocated during initialization, I would
>>>>>> have to use MAP_FIXED to mmap the memory. This is what I did
>>>>>> before the qemu_ram_mmap() function was added.
>>>>> What would happen to any data written to the BAR before the
>>>>> handshake completed? I think it would disappear.
>>>>
>>>> You don't have to do MAP_FIXED. You can allocate a ram area and map
>>>> that in when disconnected. When you connect, you create another ram
>>>> area and memcpy() the previous ram area to the new one. You then
>>>> map the second ram area in.
>>>
>>> But it's a shared memory area. Other peers could have connected and
>>> written some data in. The memcpy() would destroy their data.
>>
>> Why attempt to support multi-master shared memory? What's the
>> use case?
>
> I don't see it as multi-master; rather, the latest guest to join
> shouldn't have its contents take precedence. In developing this
> patch, my motivation has been to let the guests decide. If the
> memcpy is always done, even when no data is written, a guest cannot
> join without overwriting everything.
>
> One use case we're looking at is VMs running a map-reduce framework
> like Hadoop or Phoenix. However, if a workqueue is stored in shared
> memory, or a data transfer passes through it, the system can't scale
> up the number of workers, because each new guest will erase the
> shared memory (and with it the workqueue or the in-progress data
> transfer).

(Replying again to list)

What data structure would you use? For a lockless ring queue, you can
only support a single producer and a single consumer. To achieve
bidirectional communication in virtio, we always use two queues.

If you're adding additional queues to support other levels of
communication, you can always use different areas of shared memory. I
guess this is the point behind the doorbell mechanism?

Regards,

Anthony Liguori

> In cases where the latest guest to join wants to clear the memory, it
> can do so without the automatic memcpy. The guest can do a memset
> once it knows the memory is attached.
> My opinion is to leave it to the guests and the application that is
> using the shared memory to decide what to do on guest joins.
>
> Cam
>
>> Regards,
>>
>> Anthony Liguori
>>
>>>> From the guest's perspective, it's totally transparent. For the
>>>> backend, I'd suggest having an explicit "initialized" ack or
>>>> something so that it knows that the data is now mapped to the
>>>> guest.
>>>
>>> From the peers' perspective, it's non-transparent :(
>>>
>>> Also it doubles the transient memory requirement.
>>>
>>>> If you're doing just a ring queue in shared memory, it should
>>>> allow disconnect/reconnect during live migration asynchronously
>>>> to the actual qemu live migration.
>>>
>>> Live migration of guests using shared memory is interesting. You'd
>>> need to freeze all peers on one node, disconnect, reconnect, and
>>> restart them on the other node.