xen-devel.lists.xenproject.org archive mirror
* V4V
@ 2012-05-24 17:23 Jean Guyader
  2012-05-25  9:48 ` V4V Stefano Stabellini
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Jean Guyader @ 2012-05-24 17:23 UTC (permalink / raw)
  To: xen-devel; +Cc: Ross Philipson, Jean Guyader, James McKenzie



As I'm going through the code to clean up XenClient's inter-VM
communication (V4V), I thought it would be a good idea to start a thread to
talk about the fundamental differences between V4V and libvchan. I believe
the two systems are not clones of each other and they serve different
purposes.


Disclaimer: I'm not an expert in libvchan; most of the assertions I'm making
about libvchan come from my reading of the code. If some of the facts
are wrong it's only due to my ignorance about the subject.

1. Why V4V?

About the time when we started XenClient (3 years ago) we were looking for a
lightweight inter-VM communication scheme. We started working on a system
based on netchannel2, at the time called V2V (VM to VM). The system
was very similar to what libvchan is today, and we started to hit some
roadblocks:

    - The setup relied on a broker in dom0 to prepare the xenstore node
      permissions when a guest wanted to create a new connection. The code
      to do this setup was a single point of failure. If the
      broker was down you couldn't create any more connections.
    - Symmetric communications were a nightmare. Take the case where A is a
      backend for B and B is a backend for A. If one of the domains crashed,
      the other one couldn't be destroyed because it had some pages mapped
      from the dead domain. This specific issue is probably fixed today.

Some of the downsides of using the shared memory grant method:
    - This method imposes an implicit ordering on domain destruction.
      When this ordering is not honored the grantor domain cannot shut down
      while the grantee still holds references. In the extreme case where
      the grantee domain hangs or crashes without releasing its granted
      pages, both domains can end up hung and unstoppable (the DEADBEEF
      issue).
    - You can't trust any ring structures because the entire set of pages
      that are granted is available to be written by either guest.
    - The PV connect/disconnect state-machine is poorly implemented.
      There's no trivial mechanism to synchronize disconnecting/reconnecting
      and dom0 must also allow the two domains to see parts of xenstore
      belonging to the other domain in the process.
    - Using the grant-ref model and having to map grant pages on each
      transfer causes updates to V->P memory mappings and thus leads to
      TLB misses and flushes (TLB flushes being expensive operations).
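
To illustrate the ring-trust point above, here is a minimal sketch of the
kind of defensive check a consumer needs when the ring lives in pages that
the peer can also write. The layout and the names (struct shared_ring,
ring_read_checked, RING_SIZE) are made up for the example and are not taken
from V2V, libvchan or any Xen header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE 4096u  /* hypothetical payload size, power of two */

/* Hypothetical ring layout inside a page writable by BOTH guests. */
struct shared_ring {
    volatile uint32_t prod;  /* free-running producer index (peer writes) */
    volatile uint32_t cons;  /* free-running consumer index (we write) */
    uint8_t data[RING_SIZE];
};

/*
 * Copy up to len pending bytes from the ring into out.
 * Returns the number of bytes copied, or -1 if the indices found in the
 * shared page are inconsistent (the peer may have scribbled on them).
 */
static long ring_read_checked(struct shared_ring *r, uint8_t *out, size_t len)
{
    assert(r != NULL && out != NULL);

    /* Snapshot once: the peer can change the shared fields at any time. */
    uint32_t prod = r->prod;
    uint32_t cons = r->cons;
    uint32_t avail = prod - cons;  /* wraps correctly modulo 2^32 */

    if (avail > RING_SIZE)
        return -1;  /* more "pending" data than the ring can hold */
    if (len > avail)
        len = avail;

    for (size_t i = 0; i < len; i++)
        out[i] = r->data[(uint32_t)(cons + i) % RING_SIZE];

    r->cons = cons + (uint32_t)len;
    return (long)len;
}
```

Even with this check the payload itself can still change underneath the
reader, and a careful implementation would keep its own private copy of the
consumer index rather than re-reading it from the shared page. The sketch
only shows why every field has to be treated as untrusted input.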

After a lot of time spent trying to make the V2V solution work the way we
wanted, we decided that we should look at a new design that wouldn't have
the issues mentioned above. At this point we started to work on V4V (V2V
version 2).

2. What is V4V?

One of the fundamental problems with V2V was that it didn't implement a
connection mechanism. If one end of the ring disappeared you had to hope
that you would receive the xenstore watch that would sort everything out.

V4V is an inter-domain communication mechanism that supports 1-to-many
connections. All communication from one domain (even dom0) to another goes
through Xen, and Xen forwards the packets with memory copies.
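
As a rough illustration of the copy-through-Xen model, here is a toy,
single-process sketch. It is not the V4V hypercall interface: toy_ring,
toy_msg_hdr, toy_send and toy_recv are hypothetical names, and toy_send
stands in for the work the hypervisor would do. The idea it shows is that
only the trusted side writes into the receiver's ring and stamps each
message with the sender's domain ID, so one ring can take messages from many
senders and its structure can be trusted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RING_BYTES 256u  /* hypothetical ring size, power of two */

/* Header written by the trusted side in front of every payload. */
struct toy_msg_hdr {
    uint16_t src_dom;  /* filled in by the "hypervisor", not by the sender */
    uint16_t len;      /* payload length in bytes */
};

/* Ring owned by the receiving domain; senders never map it. */
struct toy_ring {
    uint32_t tx;  /* free-running write offset, advanced by toy_send */
    uint32_t rx;  /* free-running read offset, advanced by toy_recv */
    uint8_t buf[RING_BYTES];
};

static void ring_put(struct toy_ring *r, const void *src, size_t n)
{
    const uint8_t *p = src;
    for (size_t i = 0; i < n; i++)
        r->buf[(r->tx + (uint32_t)i) % RING_BYTES] = p[i];
    r->tx += (uint32_t)n;
}

static void ring_peek(const struct toy_ring *r, uint32_t off, void *dst, size_t n)
{
    uint8_t *p = dst;
    for (size_t i = 0; i < n; i++)
        p[i] = r->buf[(r->rx + off + (uint32_t)i) % RING_BYTES];
}

/* "Hypervisor" side: copy one message from a sender into the ring. */
static int toy_send(struct toy_ring *dst, uint16_t src_dom,
                    const void *payload, uint16_t len)
{
    assert(dst != NULL && payload != NULL);
    size_t need = sizeof(struct toy_msg_hdr) + len;
    size_t used = dst->tx - dst->rx;
    if (need > RING_BYTES - used)
        return -1;  /* ring full: the sender has to retry later */
    struct toy_msg_hdr h = { src_dom, len };
    ring_put(dst, &h, sizeof h);
    ring_put(dst, payload, len);
    return 0;
}

/* Receiver side: returns the payload length, or -1 if nothing fits. */
static int toy_recv(struct toy_ring *r, uint16_t *src_dom, void *out, size_t cap)
{
    struct toy_msg_hdr h;
    if (r->tx - r->rx < sizeof h)
        return -1;  /* no complete header pending */
    ring_peek(r, 0, &h, sizeof h);
    if (h.len > cap)
        return -1;  /* caller's buffer is too small; message stays queued */
    ring_peek(r, (uint32_t)sizeof h, out, h.len);
    r->rx += (uint32_t)(sizeof h + h.len);
    *src_dom = h.src_dom;
    return h.len;
}
```

Two senders (say domains 3 and 7) calling toy_send on the same ring end up
queued one after the other, and the receiver learns who sent each message
from the header rather than from anything the sender controls.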

Here are some of the reasons why we think V4V is a good solution for
inter-domain communication, even though it does memory copies:
    - Memory transfer speeds through the FSB in modern chipsets are quite
      fast. Speeds on the order of 10-12 Gb/s (over say 2 DRAM channels)
      can be realized.
    - Transfers on a single clock cycle using SSE(2)(3) instructions allow
      moving up to 128 bits at a time.
    - Locality of reference arguments with respect to processor caches
      imply even more speed-up due to likely cache hits (this may in fact
      make the most difference in mem copy speed).
    - V4V provides much better domain isolation since one domain's memory
      is never seen by another and the hypervisor (a trusted component)
      brokers all interactions. This also implies that the structure of
      the ring can be trusted.
    - Use of V4V obviates the event channel depletion issue since
      it doesn't consume individual channel bits when using VIRQs.
    - The projected overhead of VMEXITs (that was originally cited as a
      major limiting factor) did not manifest itself as an issue. In
      fact, it can be seen that in the worst case V4V is not causing
      many more VMEXITs than the shared memory grant method and in
      general is at parity with the existing method.
    - The implementation specifics of V4V make its use in both Windows
      and Unix/Linux type OSes very simple and natural (ReadFile/WriteFile
      and sockets respectively). In addition, V4V uses TCP/IP protocol
      semantics, which are widely understood, and it does not introduce an
      entirely new protocol set that must be learned.
    - V4V comes with a userspace library that can be used to interpose
      the standard userspace socket layer. That means that *any* network
      program can be "V4Ved" *without* being recompiled.
      In fact we tried it on many programs such as ssh, midori,
      dbus (TCP-IP), X11.
      This is possible because the underlying V4V protocol implements
      a V4V semantic and supports connections. Such a feature would be
      really hard to implement on top of the current
      libvchan implementation.
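
For readers who haven't seen socket interposition before, here is a minimal
sketch of the general technique, assuming a glibc-style dynamic linker and
LD_PRELOAD. It is not the actual V4V library: TOY_AF_V4V is a placeholder
address family number, and the TOY_V4V_INTERPOSE environment variable and
remap_family() helper are invented for the example.

```c
#define _GNU_SOURCE  /* for RTLD_NEXT on glibc */
#include <assert.h>
#include <dlfcn.h>
#include <stdlib.h>
#include <sys/socket.h>

#define TOY_AF_V4V 12345  /* placeholder value, NOT a real address family */

/* Decide which address family the real socket() call should see. */
static int remap_family(int domain, int enabled)
{
    return (enabled && domain == AF_INET) ? TOY_AF_V4V : domain;
}

/*
 * Same signature as libc's socket(). When this object is preloaded, the
 * dynamic linker resolves the program's socket() calls to this wrapper,
 * which then forwards to the next definition in lookup order (libc).
 */
int socket(int domain, int type, int protocol)
{
    static int (*real_socket)(int, int, int);

    if (!real_socket)
        real_socket = (int (*)(int, int, int))dlsym(RTLD_NEXT, "socket");
    assert(real_socket != NULL);

    /* Hypothetical opt-in switch so unrelated programs are untouched. */
    int enabled = getenv("TOY_V4V_INTERPOSE") != NULL;
    return real_socket(remap_family(domain, enabled), type, protocol);
}
```

Built as a shared object and loaded with LD_PRELOAD, a wrapper like this is
picked up by an unmodified binary at start-up, which is why no recompilation
is needed. A complete interposer would wrap connect(), bind(), accept() and
friends in the same way and translate the addresses as well.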

3. V4V compared to libvchan

I've done some benchmarks on V4V and libvchan and the results were
pretty close between the two if you use the same buffer size in both
cases.


In conclusion, this is not an attempt to demonstrate that V4V is superior to
libvchan. Rather it is an attempt to illustrate that they can coexist in the
Xen ecosystem, helping to solve different sets of problems.

Thanks,
Jean




Thread overview: 10+ messages
2012-05-24 17:23 V4V Jean Guyader
2012-05-25  9:48 ` V4V Stefano Stabellini
2012-05-25 10:11   ` V4V Jean Guyader
2012-05-25 10:16     ` V4V Stefano Stabellini
2012-05-25 10:19 ` V4V Pasi Kärkkäinen
2012-05-29 22:22 ` V4V Daniel De Graaf
2012-05-30 11:41   ` V4V Stefano Stabellini
2012-05-30 14:19     ` V4V Daniel De Graaf
2012-05-31 17:20       ` V4V Stefano Stabellini
2012-05-31 18:18         ` V4V Daniel De Graaf
