From: "Christian Bornträger" <borntraeger@de.ibm.com>
To: Rusty Russell <rusty@rustcorp.com.au>
Cc: Cam Macdonell <cam@cs.ualberta.ca>, Avi Kivity <avi@redhat.com>,
	kvm@vger.kernel.org,
	Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>,
	Anthony Liguori <anthony@codemonkey.ws>
Subject: Re: [PATCH v2] Driver for Inter-VM shared memory device for KVM supporting interrupts.
Date: Wed, 20 May 2009 09:33:01 +0200
Message-ID: <200905200933.01736.borntraeger@de.ibm.com>
In-Reply-To: <200905201228.38718.rusty@rustcorp.com.au>

On Wednesday 20 May 2009 04:58:38, Rusty Russell wrote:
> On Wed, 20 May 2009 02:21:08 am Cam Macdonell wrote:
> > Avi Kivity wrote:
> > > Christian Bornträger wrote:
> > >>> To summarize, Anthony thinks it should use virtio, while I believe
> > >>> virtio is useful for exporting guest memory, not for importing host
> > >>> memory.
>
> Yes, precisely.
>
> But what's it *for*, this shared memory?  Implementing shared memory is
> trivial.  Using it is harder.  For example, inter-guest networking: you'd
> have to copy packets in and out, making it slow as well as losing
> abstraction.
>
> The only interesting idea I can think of is exposing it to userspace, and
> having that run some protocol across it for fast app <-> app comms.  But if
> that's your plan, you still have a lot of code to write!
>
> So I guess I'm missing the big picture here?

I can give some insight into shared memory usage in z/VM. z/VM uses
so-called discontiguous saved segments (DCSS) to share memory between guests.
	(naming side note:
	o discontiguous because these segments can have holes and different access
	  rights, e.g. you can build a DCSS that goes from 800M-801M read-only and
	  900M-910M exclusive-write.
	o segments because the 2nd level of our page tables is called the segment
	  table.
	)
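
To make the naming note concrete, here is a minimal sketch of how such a
segment could be described: a set of address ranges with holes in between,
each range carrying its own access class. This is illustrative C only, not
z/VM or Linux code; the type names, the enum values, the "EXAMPLE" segment
name and dcss_lookup() are all hypothetical. Only the two ranges come from
the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MiB (UINT64_C(1) << 20)

/* Hypothetical access classes; the names are illustrative, not z/VM's. */
enum dcss_access { DCSS_NONE, DCSS_READ_ONLY, DCSS_EXCLUSIVE_WRITE };

/* One contiguous piece of a segment: [start, end) in bytes. */
struct dcss_range {
    uint64_t start;
    uint64_t end;
    enum dcss_access access;
};

/* A "discontiguous" segment is a set of ranges with holes in between. */
struct dcss {
    const char *name;
    const struct dcss_range *ranges;
    size_t nr_ranges;
};

/* The example from the text: 800M-801M read-only, 900M-910M exclusive-write. */
static const struct dcss_range example_ranges[] = {
    { 800 * MiB, 801 * MiB, DCSS_READ_ONLY },
    { 900 * MiB, 910 * MiB, DCSS_EXCLUSIVE_WRITE },
};

static const struct dcss example_dcss = {
    "EXAMPLE", /* hypothetical segment name */
    example_ranges,
    sizeof(example_ranges) / sizeof(example_ranges[0]),
};

/* Return the access class covering addr, or DCSS_NONE if addr is in a hole. */
static enum dcss_access dcss_lookup(const struct dcss *seg, uint64_t addr)
{
    assert(seg != NULL);
    for (size_t i = 0; i < seg->nr_ranges; i++) {
        const struct dcss_range *r = &seg->ranges[i];
        if (addr >= r->start && addr < r->end)
            return r->access;
    }
    return DCSS_NONE;
}
```

With this layout, an address at 905M resolves to the exclusive-write range,
while anything between 801M and 900M falls into a hole and yields DCSS_NONE.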

z/VM uses these segments for several purposes:
o The monitoring subsystem uses a DCSS to get data from several components.
o Shared guest kernels: The CMS operating system is built as a bootable DCSS
  (called a named saved segment, NSS). All guests share the same host pages
  for the read-only parts of the CMS kernel. The local data is stored in
  exclusive-write parts of the same NSS. Linux on System z is also capable of
  using this feature (CONFIG_SHARED_KERNEL). The kernel linkage is changed so
  that the read-only text segment is separated from the other parts with
  segment size alignment.
o Execute-in-place: This is a Linux feature that exploits the DCSS technology.
  The goal is to share identical guest pages without the additional overhead
  of KSM etc. We have a block device driver for DCSS. This block device driver
  supports the direct_access function and therefore allows the use of the xip
  option of ext2. The idea is to put binaries into a read-only ext2
  filesystem. Whenever an mmap is made on this file system, the page is not
  mapped into the page cache. The ptes point into the DCSS memory instead.
  Since the DCSS is demand-paged by the host, no memory is wasted for unused
  parts of the binaries. In case of COW the page is copied as usual. It turned
  out that installations with many similar guests (let's say 400 guests)
  profit in terms of memory savings and quicker application startup (not the
  first guest, of course). There is a downside: this requires a skilled
  administrator to set up.
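
The "COW the page is copied as usual" part can be observed from userspace
with an ordinary private file mapping. The following is only a sketch of
that semantic on a regular temporary file; it does not involve DCSS, the
DCSS block device or the xip mount option. The function name and the /tmp
path template are assumptions made for the illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Write a small file, map it MAP_PRIVATE, store into the mapping and read
 * the file back. Returns 1 if the file still holds the original bytes while
 * the mapping shows the modified byte (the store went to a private copy),
 * 0 if that is not the case, -1 on error.
 */
static int private_write_leaves_file_intact(void)
{
    char path[] = "/tmp/cow-demo-XXXXXX";   /* hypothetical scratch file */
    const char original[] = "shared text";
    char readback[sizeof(original)] = { 0 };
    char *map = MAP_FAILED;
    int result = -1;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                           /* file disappears on close */

    if (write(fd, original, sizeof(original)) != (ssize_t)sizeof(original))
        goto out;

    map = mmap(NULL, sizeof(original), PROT_READ | PROT_WRITE,
               MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED)
        goto out;

    map[0] = 'S';   /* the store triggers the copy of the backing page */

    if (pread(fd, readback, sizeof(original), 0) == (ssize_t)sizeof(original))
        result = memcmp(readback, original, sizeof(original)) == 0 &&
                 map[0] == 'S';

    munmap(map, sizeof(original));
out:
    close(fd);
    return result;
}
```

In the xip case described above the untouched pages would be backed by DCSS
memory rather than the page cache, but the private-copy behaviour on write
is the same idea.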

We have also experimented with networking, POSIX shared memory, and shared
caches via DCSS. Most of these ideas turned out to be either not very useful
or hard to implement properly.
