From mboxrd@z Thu Jan 1 00:00:00 1970
From: Rusty Russell
Subject: Re: [PATCH v2] Driver for Inter-VM shared memory device for KVM supporting interrupts.
Date: Mon, 25 May 2009 15:48:59 +0930
Message-ID: <200905251549.00283.rusty@rustcorp.com.au>
References: <1241713567-17256-1-git-send-email-cam@cs.ualberta.ca>
 <200905201228.38718.rusty@rustcorp.com.au>
 <200905200933.01736.borntraeger@de.ibm.com>
Mime-Version: 1.0
Content-Type: Text/Plain; charset=utf-8
Cc: Cam Macdonell, Avi Kivity, kvm@vger.kernel.org, Christian Ehrhardt, Anthony Liguori
To: Christian Bornträger
Received: from ozlabs.org ([203.10.76.45]:43803 "EHLO ozlabs.org"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1752589AbZEYK2P; Mon, 25 May 2009 06:28:15 -0400
In-Reply-To: <200905200933.01736.borntraeger@de.ibm.com>
Content-Disposition: inline
Sender: kvm-owner@vger.kernel.org

On Wed, 20 May 2009 05:03:01 pm Christian Bornträger wrote:
> On Wednesday 20 May 2009 04:58:38, Rusty Russell wrote:
> > But what's it *for*, this shared memory?
...
> z/VM uses these segments for several purposes:
> o The monitoring subsystem uses a DCSS to get data from several components

In KVM this probably doesn't require inter-guest access; presumably monitoring
is done on the host.

> o shared guest kernels: The CMS operating system is built as a bootable
> DCSS (called named-saved-segments NSS). All guests have the same host pages
> for the read-only parts of the CMS kernel. The local data is stored in
> exclusive-write parts of the same NSS. Linux on System z is also capable of
> using this feature (CONFIG_SHARED_KERNEL). The kernel linkage is changed in
> a way to separate the read-only text segment from the other parts with
> segment size alignment

This is unlikely for x86 at least, and as you point out, not good for
distributions either.

> o execute-in-place: This is a Linux feature to exploit the DCSS technology.
> The goal is to share identical guest pages without the additional
> overhead of KSM etc. We have a block device driver for DCSS. This block
> device driver supports the direct_access function and therefore allows
> using the xip option of ext2. The idea is to put binaries into a read-only
> ext2 filesystem. Whenever an mmap is made on this file system, the page is
> not mapped into the page cache. The ptes point into the DCSS memory
> instead. Since the DCSS is demand-paged by the host no memory is wasted for
> unused parts of the binaries. In case of COW the page is copied as usual.
> It turned out that installations with many similar guests (let's say 400
> guests) will profit in terms of memory saving and quicker application
> startups (not the first guest of course). There is a downside: this
> requires a skilled administrator to set up.

We're better off doing opportunistic KSM in virtio_blk I'd say. Anyway, it's
not really "inter-guest" in this sense; the host controls it, though it lets
multiple guests read from it.

> We have also experimented with network, Posix shared memory, and shared
> caches via DCSS. Most of these ideas turned out to be not very useful or
> hard to implement properly.

Indeed, and this is what I suspect these patches are aiming for...

Thanks,
Rusty.