From: Mike Wray <mike.wray@hp.com>
To: andrew.warfield@cl.cam.ac.uk
Cc: Eric Van Hensbergen <ericvh@gmail.com>,
	Harry Butterworth <harry@hebutterworth.freeserve.co.uk>,
	"Ronald G. Minnich" <rminnich@lanl.gov>,
	xen-devel@lists.xensource.com
Subject: Re: Re: Interdomain comms
Date: Tue, 10 May 2005 15:30:59 +0100
Message-ID: <4280C5A3.2050502@hp.com>
In-Reply-To: <eacc82a405051003094d84c011@mail.gmail.com>

Andrew Warfield wrote:
>>It should be possible to still use the page mapping in the i/o transport.
>>The issue right now is that the i/o interface is very low-level and
>>intimately tangled up with the structs being transported.
> 
> I don't doubt that it is possible.  The point I was making is that the
> current i/o interfaces are low level for a reason, and that
> generalizing this to a higher-level communications primitive is a
> non-trivial thing.  Just considering the disk and net interfaces, the
> current device channels each make particular decisions regarding (a)
> what to copy and what to map, (b) when to send notification to get
> efficient batching through the scheduler, and most recently (c) which
> grant mechanism to use to pass pages securely across domains.

It should be relatively easy to provide these kinds of facilities in a
higher-level api.

> Having a higher-level API to make all this easier, and especially to
> reduce the code/complexity required to build new drivers etc is
> something that will be fantastic to have.  I think though that at
> least some of these underlying issues will need to be exposed for it
> to be useful.  I'm not convinced that reimplementing the sockets API
> for interdomain communication is a very good solution...

I wasn't suggesting exactly the sockets api, but something more like
the connect/send and listen/recv logic. Harry's API is quite like that,
with additional higher-level facilities.
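
To make the connect/send and listen/recv shape concrete, here is a
minimal in-process sketch in Python. All of the names in it (Transport,
Listener, Channel, the string addresses) are hypothetical and made up
for illustration; this is not Harry's API or any existing Xen interface,
and it uses plain queues where a real transport might use mapped pages
or a network.

```python
import queue


class Channel:
    """One end of a bidirectional message channel (hypothetical name)."""

    def __init__(self, inbox, outbox):
        self._inbox = inbox
        self._outbox = outbox

    def send(self, msg):
        # Hand the message to the peer; the caller never sees the transport.
        self._outbox.put(msg)

    def recv(self, timeout=1.0):
        # Block until the peer has sent something (raises queue.Empty on timeout).
        return self._inbox.get(timeout=timeout)


class Listener:
    """Returned by listen(); accept() yields one Channel per connect()."""

    def __init__(self):
        self._pending = queue.Queue()

    def accept(self, timeout=1.0):
        return self._pending.get(timeout=timeout)


class Transport:
    """Assumed stand-in for the transport layer: address -> listener."""

    def __init__(self):
        self._listeners = {}

    def listen(self, address):
        if address in self._listeners:
            raise ValueError("address already in use: " + address)
        listener = Listener()
        self._listeners[address] = listener
        return listener

    def connect(self, address):
        listener = self._listeners[address]  # KeyError if nobody listens
        to_server, to_client = queue.Queue(), queue.Queue()
        # Queue the server-side end for accept(), return the client-side end.
        listener._pending.put(Channel(inbox=to_server, outbox=to_client))
        return Channel(inbox=to_client, outbox=to_server)
```

A back-end would call listen() and accept(), a front-end would call
connect(), and after that both sides only use send() and recv(). The
point of the sketch is that nothing in the calling code depends on how
the bytes are moved.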

> The
> buffer_reference struct that Harry mentioned looks quite interesting
> as a start though in terms of describing a variety of underlying
> transports.  Do you have a paper reference on that work, Harry?
> 
> With regards forwarding device channels across a network, I think we
> can expect application-level involvement for shifting device messages
> across a cluster.  If this is down the road, and it's certainly
> something that has been discussed, a device channel is potentially two
> local shared memory device channels between VMs on local hosts, and a
> network connection between the physical hosts.  Beyond the more
> complicated error cases that this obviously involves, we can then make
> this as arbitrarily more complex by discussing HA or security
> concerns... for the moment though, I think it would be interesting to
> see how well the existing local host cases can be generalized.  ;)
>  
> 
>>And with the domain control channel there's an implicit assumption
>>that 'there can be only one'. This means for example, that domain A
>>using a device with backend in domain B can't connect directly to domain B,
>>but has to be 'introduced' by xend. It'd be better if it could connect
>>directly.
> 
> This is not a flaw with the current implementation -- it's completely
> intentional.  By forcing control through xend we ensure that there is
> a single point for control logic, and for managing state.  Why do you
> feel it would be better to provide arbitrary point-to-point comms in a
> VMM environment that is specifically trying to provide isolation
> between guests?

OK, so it's an intentional flaw ;-).

One reason is that front-end drivers have to connect to their backends.
If they can find out who to connect to and then do it, it simplifies things.
Especially when that info is available from a store or registry service
as proposed for 3.0.

At the moment xend has to exchange messages with the domain to get the
device front-end handle and shared page address, and then exchange messages
with the back-end so it can create the device and map the page.
Telling the front-end which back-end to connect to would be much simpler.
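
As a rough sketch of the direct case, assuming a store that maps a
device to the domain hosting its back-end. The Store class, the
"<device>/backend" key layout and the connect callback are all
hypothetical, chosen only to show the shape; they are not the 3.0 store
design.

```python
class Store:
    """Hypothetical key/value registry standing in for the proposed store."""

    def __init__(self):
        self._entries = {}

    def write(self, key, value):
        self._entries[key] = value

    def read(self, key):
        return self._entries[key]  # KeyError if the key was never written


def frontend_connect(store, device, connect):
    """Look up which back-end serves `device`, then connect to it directly.

    `connect` is an assumed callable (backend, device) -> channel supplied
    by whatever transport is in use.
    """
    backend = store.read(device + "/backend")
    return connect(backend, device)
```

The control tool only has to write one entry; the front-end does the
lookup and the connect itself.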

>>Something like what Harry proposes should still be able to use
>>page mapping for efficient local comms, but without _requiring_
>>it. This opens the way for alternative transports, such as network.
>>
>>Rather than going straight for something very high-level, I'd prefer
>>to build up gradually, starting with a more general message transport
>>api that includes analogues to listen/connect/recv/send.
> 
> 
> As I said, I'm unconvinced that trying to mimic the sockets API is the
> right way to go -- I think the communicating parties often want to see
> and work with batches of messages without having to do extra copies or
> have event notification made implicit.  

Like I said, I wasn't suggesting _exactly_ the sockets api, more the
spirit of it. There is an analogue of batching for sockets though: flush.
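
A small sketch of what I mean by flush as the batching analogue:
send() only queues, and flush() hands over the whole batch with a single
notification. BatchingSender and its deliver/notify callbacks are
hypothetical names for illustration, not an existing interface.

```python
class BatchingSender:
    """Queue messages on send(); deliver and notify once on flush()."""

    def __init__(self, deliver, notify):
        self._deliver = deliver  # assumed callable taking a list of messages
        self._notify = notify    # assumed callable that signals the peer
        self._pending = []

    def send(self, msg):
        # No delivery and no notification here, so sends stay cheap.
        self._pending.append(msg)

    def flush(self):
        # Nothing queued: skip the notification entirely.
        if not self._pending:
            return 0
        batch, self._pending = self._pending, []
        self._deliver(batch)
        self._notify()  # exactly one notification per batch
        return len(batch)
```

The sender decides when to flush, so event notification is explicit
rather than implied by each send, and the receiver gets to see the
messages as a batch.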

> I think you are completely
> right about a gradual approach though -- having a generalized
> host-local device channel would be very interesting to see...
> especially if it could be shown to apply to the existing block, net,
> usb, and control channels in a simplifying fashion.
> 

Just a small matter of programming then :-).

Mike
