Subject: [Qemu-devel] Re: [PATCH v5 4/5] Inter-VM shared memory PCI device
From: Cam Macdonell
Date: Mon, 10 May 2010 11:43:44 -0600
To: Anthony Liguori
Cc: Avi Kivity, kvm@vger.kernel.org, qemu-devel@nongnu.org

On Mon, May 10, 2010 at 11:25 AM, Anthony Liguori wrote:
> On 05/10/2010 11:59 AM, Avi Kivity wrote:
>>
>> On 05/10/2010 06:38 PM, Anthony Liguori wrote:
>>>
>>>>> Otherwise, if the BAR is allocated during initialization, I would have
>>>>> to use MAP_FIXED to mmap the memory.  This is what I did before the
>>>>> qemu_ram_mmap() function was added.
>>>>
>>>> What would happen to any data written to the BAR before the
>>>> handshake completed?  I think it would disappear.
>>>
>>> You don't have to do MAP_FIXED.  You can allocate a ram area and map
>>> that in when disconnected.  When you connect, you create another ram
>>> area and memcpy() the previous ram area to the new one.  You then map
>>> the second ram area in.
>>
>> But it's a shared memory area.  Other peers could have connected and
>> written some data in.  The memcpy() would destroy their data.
>
> Why try to support multi-master shared memory?  What's the use-case?

I don't see it as multi-master; my concern is that the latest guest to
join shouldn't have its contents take precedence.  In developing this
patch, my motivation has been to let the guests decide.  If the memcpy
is always done, even when the joining guest has written nothing, a
guest cannot join without overwriting everything.

One use case we're looking at is running a map-reduce framework such as
Hadoop or Phoenix across VMs.  If the workqueue is stored in shared
memory, or data transfers pass through it, the system can't scale up
the number of workers, because each new guest would erase the shared
memory (and with it the workqueue or any in-progress data transfer).

In cases where the latest guest to join does want to clear the memory,
it can do so without the automatic memcpy: the guest can simply memset
the region once it knows the memory is attached.

My opinion is to leave it to the guests and the application using the
shared memory to decide what to do when a guest joins.
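A minimal sketch of that guest-side choice (names are illustrative, not
from the patch; it assumes the guest has already mapped the ivshmem BAR
through its driver):

    /* Hypothetical guest-side policy: the joining guest, not the host,
     * decides whether the shared region starts fresh.  shm_base and
     * shm_size describe the already-mapped ivshmem BAR. */
    #include <stddef.h>
    #include <string.h>

    void on_shmem_attached(void *shm_base, size_t shm_size,
                           int want_fresh_state)
    {
        if (want_fresh_state) {
            /* This guest explicitly chooses to reset the region. */
            memset(shm_base, 0, shm_size);
        }
        /* Otherwise it leaves the peers' data untouched and simply
         * starts participating, e.g. by registering as a worker. */
    }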
Cam

> Regards,
>
> Anthony Liguori
>
>>> From the guest's perspective, it's totally transparent.  For the
>>> backend, I'd suggest having an explicit "initialized" ack or
>>> something so that it knows that the data is now mapped to the guest.
>>
>> From the peers' perspective, it's non-transparent :(
>>
>> Also it doubles the transient memory requirement.
>>
>>> If you're doing just a ring queue in shared memory, it should allow
>>> disconnect/reconnect during live migration asynchronously to the
>>> actual qemu live migration.
>>
>> Live migration of guests using shared memory is interesting.  You'd
>> need to freeze all peers on one node, disconnect, reconnect, and
>> restart them on the other node.
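For concreteness, a sketch of the kind of ring queue discussed above: a
single-producer/single-consumer ring whose entire state (indices and
slots) lives inside the shared region, so a peer that detaches and
reattaches, e.g. across a live migration, picks up exactly where the
indices say it left off.  Layout and names are illustrative only, not
taken from the patch.

    /* Illustrative SPSC ring kept entirely in the shared region.
     * Indices survive a peer's disconnect/reconnect because they are
     * stored in the region itself, not in the peer's private memory. */
    #include <stdint.h>
    #include <string.h>

    #define RING_SLOTS 256          /* must be a power of two */
    #define SLOT_SIZE  64

    struct shm_ring {
        volatile uint32_t head;     /* next slot the producer fills */
        volatile uint32_t tail;     /* next slot the consumer reads */
        uint8_t slots[RING_SLOTS][SLOT_SIZE];
    };

    /* Returns 0 on success, -1 if the ring is full or msg too large. */
    static int ring_put(struct shm_ring *r, const void *msg, size_t len)
    {
        uint32_t head = r->head;
        if (head - r->tail == RING_SLOTS || len > SLOT_SIZE)
            return -1;
        memcpy(r->slots[head & (RING_SLOTS - 1)], msg, len);
        __sync_synchronize();       /* publish data before the index */
        r->head = head + 1;
        return 0;
    }

    /* Returns 0 on success, -1 if the ring is empty. */
    static int ring_get(struct shm_ring *r, void *msg, size_t len)
    {
        uint32_t tail = r->tail;
        if (tail == r->head || len > SLOT_SIZE)
            return -1;
        memcpy(msg, r->slots[tail & (RING_SLOTS - 1)], len);
        __sync_synchronize();       /* finish reading before freeing slot */
        r->tail = tail + 1;
        return 0;
    }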