xen-devel.lists.xenproject.org archive mirror
* Driver domains communication protocol proposal
@ 2012-04-04 15:46 Ian Jackson
  2012-04-10 15:06 ` Paul Durrant
  2012-04-12  9:33 ` George Dunlap
  0 siblings, 2 replies; 9+ messages in thread
From: Ian Jackson @ 2012-04-04 15:46 UTC (permalink / raw)
  To: xen-devel

During some discussions and handwaving, including discussions with
some experts on the Xenserver/XCP storage architecture, we came up
with what we think might be a plausible proposal for an architecture
for communication between toolstack and driver domain, for storage at
least.

I offered to write it up.  The abstract proposal is as I understand
the consensus from our conversation.  The concrete protocol is my own
invention.

Please comment.  After a round of review here we should consider
whether some of the assumptions need review from the communities
involved in "other" backends (particularly, the BSDs).

(FAOD the implementation of something like this is not 4.3 material,
but it may inform some API decisions etc. we take in 4.2.)

Ian.


Components

 toolstack

 guest
    Might be the toolstack domain, or an (intended) guest vm.

 driver domain
    Responsible for providing the disk service to guests.
    Consists, internally, of (at least):
       control plane
       backend
    but we avoid exposing this internal implementation detail.

    We permit different driver domains on a single host, serving
    different guests or the same guests.  

    The toolstack is expected to know the domid of the driver domain.

 driver domain kind
    We permit different "kinds" of driver domain, perhaps implemented
    by completely different code, which support different facilities.

    Each driver domain kind needs to document what targets (see
    below) are valid and how they are specified, and what preparatory
    steps may need to be taken eg at system boot.

    Driver domain kinds do not have a formal presence in the API.

Objects

 target
     A kind of name.

     A combination of a physical location and a data format, plus all
     other information, needed by the underlying mechanisms or
     relating to the data format, that is required to access it.

     These names are assigned by the driver domain kind; the names may
     be an open class; no facility is provided via this API to
     enumerate them.

     Syntactically, these are key/value pairs, mapping short string
     keys to shortish string values, suitable for storage in a
     xenstore directory.

 vdi
     This host's intent to access a specific target.
     Non-persistent, created on request by toolstack, enumerable.
     Possible states: inactive/active.
     Abstract operations: prepare, activate, deactivate, unprepare.

     (We call the "create" operation for this object "prepare" to
     avoid confusion with other kinds of "create".)

     The toolstack promises that no two vdis for the same target
     will simultaneously be active, even if the two vdis are on
     different hosts.

 vbd
     Provision of a facility for a guest to access a particular target
     via a particular vdi.  There may be zero or more of these at any
     point for a particular vdi.

     Non-persistent, created on request by toolstack, enumerable.
     Abstract operations: plug, unplug.

     (We call the "create" operation for this object "plug" to avoid
     confusion with other kinds of "create".)

     vbds may be created/destroyed, and the underlying vdi
     activated/deactivated, in any order.  However IO is only possible
     to a vbd when the corresponding vdi is active.  The reason for
     requiring activation as a separate step is to allow as much of
     the setup for an incoming migration domain's storage to be done
     before committing to the migration and entering the "domain is
     down" stage, during which access is switched from the old to the
     new host.

     We will consider here the case of a vbd which provides
     service as a Xen vbd backend.  Other cases (eg, the driver domain
     is the same as the toolstack domain and the vbd provides a block
     device in the toolstack domain) can be regarded as
     optimisations/shortcuts.
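
(Illustrative sketch, not part of the proposal.)  The relationships
between the three objects above can be modelled in a few lines of
Python.  The class and function names, and the example target keys
shown in the comment, are hypothetical; the proposal defines no such
API.

```python
from dataclasses import dataclass, field


@dataclass
class Vdi:
    """This host's intent to access one target (non-persistent)."""
    # Key/value pairs naming the target, e.g. {"path": ..., "format": ...}.
    # These key names are invented; real ones are defined by each
    # driver domain kind.
    target: dict
    state: str = "inactive"                 # "inactive" or "active"
    vbds: set = field(default_factory=set)  # ids of vbds plugged via this vdi


def io_possible(vdi: Vdi, vbd: str) -> bool:
    """IO to a vbd is only possible while the corresponding vdi is active."""
    return vbd in vdi.vbds and vdi.state == "active"


def may_activate(target: dict, all_vdis: list) -> bool:
    """Toolstack-side check of its promise that no two vdis for the
    same target are active at once (across all hosts it knows about)."""
    return not any(v.target == target and v.state == "active"
                   for v in all_vdis)
```

A vbd can be added to an inactive vdi; io_possible() only becomes
true once the vdi's state is "active", which mirrors the migration
use case described above.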

Concrete protocol

 The toolstack gives instructions to the driver domain, and receives
 results, via xenstore, in the path:
   /local/domain/<driverdomid>/backendctrl/vdi
 Both driver domain and toolstack have write access to the whole of
 this area.

 Each vdi which has been requested and/or exists, corresponds to a
 path .../backendctrl/vdi/<vdi> where <vdi> is a string (of
 alphanumerics, hyphens and underscores) chosen by the toolstack.
 Inside this, there are the following nodes:

 /local/domain/<driverdomid>/backendctrl/vdi/<vdi>/
   state       The current state.  Values are "inactive", "active",
               or ENOENT meaning the vdi does not exist.
               Set by the driver domain in response to requests.

   request     Operation requested by the toolstack and currently
               being performed.  Created by the toolstack, which may
               not then modify it.  Deleted by the driver domain
               when the operation has completed.

               The values of "request" are:
                 prepare
                 activate
                 deactivate
                 unprepare
                 plug <vbd>
                 unplug <vbd>
               <vbd> is an id chosen by the toolstack like <vdi>

   result      errno value (in decimal, Xen error number) best
               describing the results of the most recently completed
               operation; 0 means success.  Created or set by the
               driver domain in the same transaction as it deletes
               request.  The toolstack may delete this.

   result_msg  Optional UTF-8 string explaining any error; does not
               exist when result is "0".  Created or deleted by the
               driver domain whenever the driver domain sets result.
               The toolstack may delete this.

   t/*         The target name.  Must be written by the toolstack,
               and may not be removed or changed while either
               state or request exists.

   vbd/<vbd>/state
               The state of a vbd, "ok" or ENOENT.
               Set or deleted by the driver domain in response to
               requests.

   vbd/<vbd>/frontend
               The frontend path (complete path in xenstore) which the
               xen vbd should be servicing.  Set by the toolstack
               with the plug request and not modified until after
               completion of unplug.

   vbd/<vbd>/backend
               The backend path (complete path in xenstore) which the
               driver domain has chosen for the vbd.  Set by the
               driver domain in response to a plug request.

   vbd/<vbd>/b-copy/*
               The driver domain may request, in response to plug,
               that the toolstack copy these values to the specified
               backend directory, in the same transaction as it
               creates the frontend.  Set by the driver domain in
               response to a plug request; may be deleted by the
               toolstack.  DEPRECATED, see below.
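
(Illustrative sketch, not part of the proposal.)  The node layout
above can be captured by a couple of path-building helpers.  The
function names are hypothetical, and restricting ids to non-empty
ASCII strings is an assumption; the text only says "alphanumerics,
hyphens and underscores".

```python
import re

# Assumption: ids are non-empty and ASCII-only.
_ID_RE = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_id(name: str) -> bool:
    """True if name is acceptable as a <vdi> or <vbd> id."""
    return _ID_RE.fullmatch(name) is not None


def vdi_dir(driverdomid: int, vdi: str) -> str:
    """Xenstore directory holding the control nodes for one vdi."""
    if not is_valid_id(vdi):
        raise ValueError(f"bad vdi id: {vdi!r}")
    return f"/local/domain/{driverdomid}/backendctrl/vdi/{vdi}"


def vbd_node(driverdomid: int, vdi: str, vbd: str, leaf: str) -> str:
    """Path of vbd/<vbd>/<leaf>, e.g. leaf "state" or "frontend"."""
    if not is_valid_id(vbd):
        raise ValueError(f"bad vbd id: {vbd!r}")
    return f"{vdi_dir(driverdomid, vdi)}/vbd/{vbd}/{leaf}"
```

Rejecting ids that contain "/" matters here because the id is spliced
directly into a xenstore path.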

The operations:

 prepare
        Creates a vdi from a target.
        Preconditions:
            state ENOENT
            request ENOENT
        Request (xenstore writes by toolstack):
            request = "prepare"
            t/* as appropriate
        Results on success (xenstore writes by driver domain):
            request ENOENT    } applies to success from all operations,
            result = "0"      }  will not be restated below
            state = "inactive"
        Results on error (applies to all operations):  }
            request ENOENT                             }  applies
            result = some decimal integer errno value  }   to all
            result_msg = ENOENT or a string            }    failures

 activate
        Preconditions:
            state = "inactive"
            request ENOENT
        Request:
            request = "activate"
        Results on success:
            state = "active"

 deactivate
        Preconditions:
            state = "active"
            request ENOENT
        Request:
            request = "deactivate"
        Results on success:
            state = "inactive"

 unprepare
        Preconditions:
            state != ENOENT
            request ENOENT
        Request:
            request = "unprepare"
        Results on success:
            state = ENOENT

 removal, modification, etc. of an unprepared vdi:
        Preconditions:
            state ENOENT
            request ENOENT
        Request:
            any changes to <vdi> directory which do
             not create "state" or "request"
        Results:
            ignored - no response from driver domain

 plug <vbd>
        Preconditions:
            state != ENOENT
            request ENOENT
            vbd/<vbd>/state ENOENT
            <frontend> ENOENT
        Request:
            request = "plug <vbd>"
            vbd/<vbd>/frontend = <frontend> ("/local/domain/<guest>/...")
        Results on success:
            vbd/<vbd>/state = "ok"
            vbd/<vbd>/backend = <rel-backend>
                (<rel-backend> is the backend path relative to the
                 driver domain's home directory in xenstore)
            vbd/<vbd>/b-copy/*  may be created    } at least one of these
            <backend>/*  may come into existence  }  must happen
        Next step (xenstore write) by toolstack:
            <frontend>  created and populated, specifically
            <frontend>/backend = <backend>
                ("/local/domain/<driverdomid>/<rel-backend>")
            <backend>    created if necessary
            <backend>/*  copied from  vbd/<vbd>/b-copy/*  if any
            <backend>/frontend = <frontend>  unless already set

 unplug <vbd>
        Preconditions:
            state ENOENT
            request ENOENT
            vbd/<vbd>/state "ok"
        Request:
            request = "unplug <vbd>"
            <frontend> ENOENT
        Results on success:
            vbd/<vbd>/state ENOENT
            <backend> ENOENT
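
(Illustrative sketch, not part of the proposal.)  The four vdi
lifecycle operations can be simulated with a plain dict standing in
for the <vdi> xenstore directory.  Only prepare, activate, deactivate
and unprepare are covered.  Reporting a violated precondition as an
error result is an assumption, since the text above does not say what
happens in that case, and the error number used is a placeholder.

```python
# For each request: the vdi states allowed beforehand, and the state
# on success.  None stands for ENOENT (the node does not exist).
TRANSITIONS = {
    "prepare":    ((None,),                "inactive"),
    "activate":   (("inactive",),          "active"),
    "deactivate": (("active",),            "inactive"),
    "unprepare":  (("inactive", "active"), None),
}

EINVAL = 22  # placeholder error number chosen for this sketch


def handle_request(vdi: dict) -> None:
    """Driver domain side: complete the pending request in vdi.

    In a real implementation all of these writes would be made in a
    single xenstore transaction.
    """
    op = vdi.pop("request")        # request is deleted on completion
    vdi.pop("result_msg", None)    # result_msg is absent on success
    allowed, new_state = TRANSITIONS.get(op, ((), None))
    if vdi.get("state") not in allowed:
        vdi["result"] = str(EINVAL)
        vdi["result_msg"] = f"cannot {op} in state {vdi.get('state')}"
        return
    if new_state is None:
        vdi.pop("state", None)     # state becomes ENOENT
    else:
        vdi["state"] = new_state
    vdi["result"] = "0"


def outcome(vdi: dict):
    """Toolstack side: None while a request is pending, otherwise a
    (errno, message-or-None) pair for the last completed operation."""
    if "request" in vdi:
        return None
    return int(vdi["result"]), vdi.get("result_msg")
```

Because request is removed and result written together, a toolstack
that sees request disappear can read result immediately.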

 The toolstack and driver domains should not store state of their own,
 not required for these communication purposes, in the backendctrl/
 directory in xenstore.  If the driver domain wishes to make records
 for its own use in xenstore, it should do so in a different directory
 of its choice (eg, /local/domain/<driverdomid>/private/<something>).


Notes regarding driver domains whose block backend implementation is
controlled from the actual xenstore backend directory:

 The b-copy/* feature exists for compatibility with some of these.  If
 such a backend cannot cope with the backend directory coming into
 existence before the corresponding frontend directory, then it is
 necessary to create and populate the backend in the same xenstore
 transaction as the creation of the frontend.  However, such backends
 should be fixed; the b-copy/* feature is deprecated and will be
 withdrawn at some point.
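
(Illustrative sketch, not part of the proposal.)  The toolstack's
"next step" after a successful plug, including the b-copy/* copy, can
be written as a function that computes the set of xenstore writes to
make in one transaction.  The function name and its parameters are
hypothetical, and populating the rest of the frontend directory is
left out.

```python
def plug_followup_writes(driverdomid: int, frontend: str,
                         rel_backend: str, b_copy: dict,
                         existing_backend: dict) -> dict:
    """Return {path: value} for the toolstack's writes after plug.

    b_copy holds the entries found under vbd/<vbd>/b-copy/, and
    existing_backend the entries already present under <backend>.
    """
    # <rel-backend> is relative to the driver domain's home directory.
    backend = f"/local/domain/{driverdomid}/{rel_backend}"
    writes = {f"{frontend}/backend": backend}
    # Copy any b-copy/* values into the backend directory.
    for key, value in b_copy.items():
        writes[f"{backend}/{key}"] = value
    # <backend>/frontend is written unless it is already set.
    if "frontend" not in existing_backend and "frontend" not in b_copy:
        writes[f"{backend}/frontend"] = frontend
    return writes
```

Returning the writes as one dict keeps them together, which is the
point of b-copy/*: the backend entries and the frontend must appear
in the same transaction.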

 Note that a vbd may be created with the vdi inactive.  In this case
 the frontend and backend directories will exist, but the information
 needed to start up the backend properly may be lacking until the vdi
 is activated.  For example, if the existence of a suitable block
 device in the driver domain depends on vdi activation, the block
 device id cannot be made known to the backend until after the backend
 directory has already been created and perhaps has existed for some
 time.  It is believed that existing backends cope with this, because
 they use a "hotplug script" approach - where the backend directory is
 created without specifying the device node, and this backend directory
 creation causes the invocation of machinery which establishes the
 device node, which is subsequently written to xenstore.


Question

 What about network interfaces and other kinds of backend ?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Driver domains communication protocol proposal
  2012-04-04 15:46 Ian Jackson
@ 2012-04-10 15:06 ` Paul Durrant
  2012-04-11 10:11   ` Ian Jackson
  2012-04-12  9:33 ` George Dunlap
  1 sibling, 1 reply; 9+ messages in thread
From: Paul Durrant @ 2012-04-10 15:06 UTC (permalink / raw)
  To: Ian Jackson, xen-devel@lists.xen.org

I'm wondering how we should deal with driver domain re-starts
(possibly because of a crash). One of the compelling reasons for
using driver domains is the ability to re-start them, possibly
transparently to the frontend.

If a driver domain were to crash, I guess it would be the
responsibility of the tools to notice this and build a new one as
quickly as possible. A frontend could presumably notice the loss of a
driver domain backend by a backend state watch firing followed by an
inability to read the backend state key, since a clean unplug should
go through the usual closing->closed dance first. The frontend could
then, perhaps, stall I/O while the tools build a new driver domain,
and re-build communications when it notices the <frontend>/backend
key being updated by the tools. Does that sequence sound plausible?

  Paul


* Re: Driver domains communication protocol proposal
  2012-04-10 15:06 ` Paul Durrant
@ 2012-04-11 10:11   ` Ian Jackson
  0 siblings, 0 replies; 9+ messages in thread
From: Ian Jackson @ 2012-04-11 10:11 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel@lists.xen.org

Paul Durrant writes ("RE: [Xen-devel] Driver domains communication protocol proposal"):
> I'm wondering how we should deal with driver domain re-starts
> (possibly because of a crash). One of the compelling reasons for
> using driver domains is the ability to re-start them, possibly
> transparently to the frontend.

Right.

> If a driver domain were to crash, I guess it would be the
> responsibility of the tools to notice this and build a new one as
> quickly as possible. A frontend could notice the loss of a driver
> domain backend by, presumably a backend state watch firing followed
> by an inability to read the backend state key,

No, I don't think anything would necessarily remove the backend from
xenstore.  So the frontend shouldn't notice anything (other than a
stall, obviously) until the <frontend>/backend node was updated to
point to the replacement.

Ian.


* Re: Driver domains communication protocol proposal
  2012-04-04 15:46 Ian Jackson
  2012-04-10 15:06 ` Paul Durrant
@ 2012-04-12  9:33 ` George Dunlap
  2012-04-24 18:00   ` Ian Jackson
  1 sibling, 1 reply; 9+ messages in thread
From: George Dunlap @ 2012-04-12  9:33 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel

On Wed, Apr 4, 2012 at 4:46 PM, Ian Jackson <Ian.Jackson@eu.citrix.com> wrote:
>  vdi
>     This host's intent to access a specific target.
>     Non-persistent, created on request by toolstack, enumerable.
>     Possible states: inactive/active.
>     Abstract operations: prepare, activate, deactivate, unprepare.

VDI as used by XenServer seems to mean "virtual disk instance", and as
such is actually persistent.  I don't quite understand what it's
supposed to mean here, and how it differs from VBD (which in XenServer
terminology means "virtual block device").

 -George


* Re: Driver domains communication protocol proposal
  2012-04-12  9:33 ` George Dunlap
@ 2012-04-24 18:00   ` Ian Jackson
  2012-04-25 10:00     ` Ian Campbell
  2012-04-25 10:27     ` George Dunlap
  0 siblings, 2 replies; 9+ messages in thread
From: Ian Jackson @ 2012-04-24 18:00 UTC (permalink / raw)
  To: George Dunlap; +Cc: Ian Jackson, xen-devel

George Dunlap writes ("Re: [Xen-devel] Driver domains communication protocol proposal"):
> On Wed, Apr 4, 2012 at 4:46 PM, Ian Jackson <Ian.Jackson@eu.citrix.com> wrote:
> >  vdi
> >     This host's intent to access a specific target.
> >     Non-persistent, created on request by toolstack, enumerable.
> >     Possible states: inactive/active.
> >     Abstract operations: prepare, activate, deactivate, unprepare.
> 
> VDI as used by XenServer seems to mean "virtual disk instance", and as
> such is actually persistent.  I don't quite understand what it's
> supposed to mean here, and how it differs from VBD (which in XenServer
> terminology means "virtual block device").

One "vdi" in this sense can support multiple "vbd"s.  A "vbd"
represents an attachment to a domain (or some other kind of provision
for use) whereas a "vdi" is a preparatory thing.

Feel free to suggest different terminology.

Ian.


* Re: Driver domains communication protocol proposal
  2012-04-24 18:00   ` Ian Jackson
@ 2012-04-25 10:00     ` Ian Campbell
  2012-04-25 10:27     ` George Dunlap
  1 sibling, 0 replies; 9+ messages in thread
From: Ian Campbell @ 2012-04-25 10:00 UTC (permalink / raw)
  To: Ian Jackson; +Cc: George Dunlap, Jon Ludlam, xen-devel@lists.xen.org

On Tue, 2012-04-24 at 19:00 +0100, Ian Jackson wrote:
> George Dunlap writes ("Re: [Xen-devel] Driver domains communication protocol proposal"):
> > On Wed, Apr 4, 2012 at 4:46 PM, Ian Jackson <Ian.Jackson@eu.citrix.com> wrote:
> > >  vdi
> > >     This host's intent to access a specific target.
> > >     Non-persistent, created on request by toolstack, enumerable.
> > >     Possible states: inactive/active.
> > >     Abstract operations: prepare, activate, deactivate, unprepare.
> > 
> > VDI as used by XenServer seems to mean "virtual disk instance", and as
> > such is actually persistent.  I don't quite understand what it's
> > supposed to mean here, and how it differs from VBD (which in XenServer
> > terminology means "virtual block device").
> 
> One "vdi" in this sense can support multiple "vbd"s.  A "vbd"
> represents an attachment to a domain (or some other kind of provision
> for use) whereas a "vdi" is a preparatory thing.
> 
> Feel free to suggest different terminology.

What does the XCP SMAPI call these things? (Jon CCd)

Ian.


* Re: Driver domains communication protocol proposal
  2012-04-24 18:00   ` Ian Jackson
  2012-04-25 10:00     ` Ian Campbell
@ 2012-04-25 10:27     ` George Dunlap
  2012-05-11 15:00       ` Ian Jackson
  1 sibling, 1 reply; 9+ messages in thread
From: George Dunlap @ 2012-04-25 10:27 UTC (permalink / raw)
  To: Ian Jackson, Jonathan Ludlam; +Cc: xen-devel

On Tue, Apr 24, 2012 at 7:00 PM, Ian Jackson <Ian.Jackson@eu.citrix.com> wrote:
> George Dunlap writes ("Re: [Xen-devel] Driver domains communication protocol proposal"):
>> On Wed, Apr 4, 2012 at 4:46 PM, Ian Jackson <Ian.Jackson@eu.citrix.com> wrote:
>> >  vdi
>> >     This host's intent to access a specific target.
>> >     Non-persistent, created on request by toolstack, enumerable.
>> >     Possible states: inactive/active.
>> >     Abstract operations: prepare, activate, deactivate, unprepare.
>>
>> VDI as used by XenServer seems to mean "virtual disk instance", and as
>> such is actually persistent.  I don't quite understand what it's
>> supposed to mean here, and how it differs from VBD (which in XenServer
>> terminology means "virtual block device").
>
> One "vdi" in this sense can support multiple "vbd"s.  A "vbd"
> represents an attachment to a domain (or some other kind of provision
> for use) whereas a "vdi" is a preparatory thing.

Ah, so what you're calling "vdi" in this case is a thing into which
vbd's can plug -- what we might call the backend "node" for a
particular disk image?

So we have:

[A] <--> [B] <--> { [C], [D], [E] }

Where:
* A is the actual disk image on stable storage somewhere
* B is the instance of the code that can access A and provide access
to VMs which connect to it (not persistent)
* C D and E are instances of code running inside the guest which
connect to B and provide a block device to the guest OS which looks
like A (again not persistent)

Is that correct?
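
The relationship in the diagram can be sketched as a small data model.
This is only an illustration in Python: the class names DiskImage,
BackendNode and Vbd, the attach() helper, and the "example-image"
location are all made up here and are not part of the proposal.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DiskImage:
    """A: the actual disk image on stable storage (persistent)."""
    location: str


@dataclass
class Vbd:
    """C, D, E: one attachment that shows the image to a guest (not persistent)."""
    guest_domid: int


@dataclass
class BackendNode:
    """B: the running code that accesses A and serves attachments (not persistent)."""
    image: DiskImage
    vbds: List[Vbd] = field(default_factory=list)

    def attach(self, guest_domid: int) -> Vbd:
        # Hypothetical helper: one B can serve several C/D/E at once.
        vbd = Vbd(guest_domid)
        self.vbds.append(vbd)
        return vbd


node = BackendNode(DiskImage("example-image"))
for domid in (1, 2, 3):
    node.attach(domid)
```

Here one BackendNode (B) fronts one DiskImage (A) and ends up with
three Vbd objects (C, D and E).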

I think calling A a "virtual disk image" makes the most sense; reusing
that name for B is a bad idea given that it's already used for A in
XenServer terminology.  (Jonathan, correct me if I'm wrong here.)

I think that calling C D and E "vbd"s also makes sense.

So we just need to have a good name for the running instance of a
blockback process / thread / whatever that accesses a particular VDI.
Virtual disk provider (VDP)? Block back instance (BBI)?  Virtual block
backend (VBB)?

 -George


* Re: Driver domains communication protocol proposal
  2012-04-25 10:27     ` George Dunlap
@ 2012-05-11 15:00       ` Ian Jackson
  0 siblings, 0 replies; 9+ messages in thread
From: Ian Jackson @ 2012-05-11 15:00 UTC (permalink / raw)
  To: George Dunlap; +Cc: Jonathan Ludlam, xen-devel

George Dunlap writes ("Re: [Xen-devel] Driver domains communication protocol proposal"):
> Ah, so what you're calling "vdi" in this case is a thing into which
> vbd's can plug -- what we might call the backend "node" for a
> particular disk image?

Yes.

> So we have:
> 
> [A] <--> [B] <--> { [C], [D], [E] }
> 
> Where:
> * A is the actual disk image on stable storage somewhere
> * B is the instance of the code that can access A and provide access
> to VMs which connect to it (not persistent)
> * C D and E are instances of code running inside the guest which
> connect to B and provide a block device to the guest OS which looks
> like A (again not persistent)
> 
> Is that correct?

Yes.

> I think calling A a "virtual disk image" makes the most sense; reusing
> that name for B is a bad idea given that it's already used for A in
> XenServer terminology.  (Jonathan, correct me if I'm wrong here.)

Right.

> I think that calling C D and E "vbd"s also makes sense.
> 
> So we just need to have a good name for the running instance of a
> blockback process / thread / whatever that accesses a particular VDI.
> Virtual disk provider (VDP)? Block back instance (BBI)?  Virtual block
> backend (VBB)?

Anything with "backend" in it is probably wrong because in general C,
D and E are backend/frontend pairs.

The thing that B has that A (the vdi) hasn't is that B has done all
the preparatory work necessary for accessing the vdi apart from
anything that involves exclusivity.

"nonexclusive image context" aka "nic" ? :-)
"nonexclusive image handle" aka "nih" :-)
"preparatory exclusive (not) image session" ?

Ian.


* Re: Driver domains communication protocol proposal
@ 2012-12-03 10:34 Roger Pau Monné
  0 siblings, 0 replies; 9+ messages in thread
From: Roger Pau Monné @ 2012-12-03 10:34 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson

Hello,

I no longer have the original message, so I'm replying to a copy-paste
from the Xen mailing list archive. Sorry for the inconvenience.

> During some discussions and handwaving, including discussions with
> some experts on the Xenserver/XCP storage architecture, we came up
> with what we think might be a plausible proposal for an architecture
> for communication between toolstack and driver domain, for storage at
> least.
>
> I offered to write it up.  The abstract proposal is as I understand
> the consensus from our conversation.  The concrete protocol is my own
> invention.
>
> Please comment.  After a round of review here we should consider
> whether some of the assumptions need review from the communities
> involved in "other" backends (particularly, the BSDs).
>
> (FAOD the implementation of something like this is not 4.3 material,
> but it may inform some API decisions etc. we take in 4.2.)
>
> Ian.
>
>
> Components
>
>  toolstack
>
>  guest
>     Might be the toolstack domain, or an (intended) guest vm.
>
>  driver domain
>     Responsible for providing the disk service to guests.
>     Consists, internally, of (at least):
>        control plane
>        backend
>     but we avoid exposing this internal implementation detail.
>
>     We permit different driver domains on a single host, serving
>     different guests or the same guests.
>
>     The toolstack is expected to know the domid of the driver domain.
>
>  driver domain kind
>     We permit different "kinds" of driver domain, perhaps implemented
>     by completely different code, which support different facilities.
>
>     Each driver domain kind needs to document what targets (see
>     below) are valid and how they are specified, and what preparatory
>     steps may need to be taken eg at system boot.
>
>     Driver domain kinds do not have a formal presence in the API.
>
> Objects
>
>  target
>      A kind of name.
>
>      Combination of a physical location and data format plus all other
>      information needed by the underlying mechanisms, or relating to
>      the data format, needed to access it.
>
>      These names are assigned by the driver domain kind; the names may
>      be an open class; no facility provided via this API to enumerate
>      these.
>
>      Syntactically, these are key/value pairs, mapping short string
>      keys to shortish string values, suitable for storage in a
>      xenstore directory.
>
>  vdi
>      This host's intent to access a specific target.
>      Non-persistent, created on request by toolstack, enumerable.
>      Possible states: inactive/active.
>      Abstract operations: prepare, activate, deactivate, unprepare.
>
>      (We call the "create" operation for this object "prepare" to
>      avoid confusion with other kinds of "create".)
>
>      The toolstack promises that no two vdis for the same target
>      will simultaneously be active, even if the two vdis are on
>      different hosts.
>
>  vbd
>      Provision of a facility for a guest to access a particular target
>      via a particular vdi.  There may be zero or more of these at any
>      point for a particular vdi.
>
>      Non-persistent, created on request by toolstack, enumerable.
>      Abstract operations: plug, unplug.
>
>      (We call the "create" operation for this object "plug" to avoid
>      confusion with other kinds of "create".)
>
>      vbds may be created/destroyed, and the underlying vdi
>      activated/deactivated, in any order.  However IO is only possible
>      to a vbd when the corresponding vdi is active.  The reason for
>      requiring activation as a separate step is to allow as much of
>      the setup for an incoming migration domain's storage to be done
>      before committing to the migration and entering the "domain is
>      down" stage, during which access is switched from the old to the
>      new host.
>
>      We will consider here the case of a vbd which provides
>      service as a Xen vbd backend.  Other cases (eg, the driver domain
>      is the same as the toolstack domain and the vbd provides a block
>      device in the toolstack domain) can be regarded as
>      optimisations/shortcuts.
>
> Concrete protocol
>
>  The toolstack gives instructions to the driver domain, and receives
>  results, via xenstore, in the path:
>    /local/domain/<driverdomid>/backendctrl/vdi
>  Both driver domain and toolstack have write access to the whole of
>  this area.
>
>  Each vdi which has been requested and/or exists, corresponds to a
>  path .../backendctrl/vdi/<vdi> where <vdi> is a string (of
>  alphanumerics, hyphens and underscores) chosen by the toolstack.
>  Inside this, there are the following nodes:
>
>  /local/domain/<driverdomid>/backendctrl/vdi/<vdi>/
>    state       The current state.  Values are "inactive", "active",
>                or ENOENT meaning the vdi does not exist.
>                Set by the driver domain in response to requests.
>
>    request     Operation requested by the toolstack and currently
>                being performed.  Created by the toolstack, but may
>                then not be modified by the toolstack.  Deleted
>                by the driver domain when the operation has completed.
>
>                The values of "request" are:
>                  prepare
>                  activate
>                  deactivate
>                  unprepare
>                  plug <vbd>
>                  unplug <vbd>
>                <vbd> is an id chosen by the toolstack like <vdi>
>
>    result      errno value (in decimal, Xen error number) best
>                describing the results of the most recently completed
>                operation; 0 means success.  Created or set by the
>                driver domain in the same transaction as it deletes
>                request.  The toolstack may delete this.
>
>    result_msg  Optional UTF-8 string explaining any error; does not
>                exist when result is "0".  Created or deleted by the
>                driver domain whenever the driver domain sets result.
>                The toolstack may delete this.
>
>    t/*         The target name.  Must be written by the toolstack.
>                But may not be removed or changed while either of
>                state or request exist.
>
>    vbd/<vbd>/state
>                The state of a vbd, "ok" or ENOENT.
>                Set or deleted by the driver domain in response to
>                requests.
>
>    vbd/<vbd>/frontend
>                The frontend path (complete path in xenstore) which the
>                xen vbd should be servicing.  Set by the toolstack
>                with the plug request and not modified until after
>                completion of unplug.
>
>    vbd/<vbd>/backend
>                The backend path (complete path in xenstore) which the
>                driver domain has chosen for the vbd.  Set by the
>                driver domain in response to a plug request.
>
>    vbd/<vbd>/b-copy/*
>                The driver domain may request, in response to plug,
>                that the toolstack copy these values to the specified
>                backend directory, in the same transaction as it
>                creates the frontend.  Set by the driver domain in
>                response to a plug request; may be deleted by the
>                toolstack.  DEPRECATED, see below.
>
> The operations:
>
>  prepare
>         Creates a vdi from a target.
>         Preconditions:
>             state ENOENT
>             request ENOENT
>         Request (xenstore writes by toolstack):
>             request = "prepare"
>             t/* as appropriate
>         Results on success (xenstore writes by driver domain):
>             request ENOENT    } applies to success from all operations,
>             result = "0"      }  will not be restated below
>             state = "inactive"
>         Results on error (applies to all operations):  }
>             request ENOENT                             }  applies
>             result = some decimal integer errno value  }   to all
>             result_msg = ENOENT or a string            }    failures
>
>  activate
>         Preconditions:
>             state = "inactive"
>             request ENOENT
>         Request:
>             request = "activate"
>         Results on success:
>             state = "active"
>
>  deactivate
>         Preconditions:
>             state = "active"
>             request ENOENT
>         Request:
>             request = "deactivate"
>         Results on success:
>             state = "inactive"
>
>  unprepare
>         Preconditions:
>             state != ENOENT
>             request ENOENT
>         Request:
>             request = "unprepare"
>         Results on success:
>             state = ENOENT
>
>  removal, modification, etc. of an unprepared vdi:
>         Preconditions:
>             state ENOENT
>             request ENOENT
>         Request:
>             any changes to <vdi> directory which do
>              not create "state" or "request"
>         Results:
>             ignored - no response from driver domain
>
>  plug <vbd>
>         Preconditions:
>             state ENOENT

I'm not sure about this, but shouldn't state be "active", or at least
"inactive" (i.e. prepared)? Maybe I don't understand the protocol
correctly, but to be able to plug a vbd, shouldn't the underlying vdi be
prepared first?

Also, as far as I understand, each vdi only has one vbd, so why is the
<vbd> parameter needed in both the plug and unplug operations?
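
On the <vbd> question: the quoted text says there may be zero or more
vbds at any point for a particular vdi, so the <vbd> id would be what
tells them apart. A minimal sketch of that reading, in Python. The
helper names and the ids disk0, vbd-a and vbd-b are made up for
illustration; only the path layout and the "plug <vbd>" request format
come from the proposal.

```python
BASE = "/local/domain/{driverdomid}/backendctrl/vdi/{vdi}"


def vdi_path(driverdomid: int, vdi: str) -> str:
    """Control directory for one vdi, as laid out in the proposal."""
    return BASE.format(driverdomid=driverdomid, vdi=vdi)


def vbd_state_path(driverdomid: int, vdi: str, vbd: str) -> str:
    """State node of one vbd underneath its vdi."""
    return vdi_path(driverdomid, vdi) + "/vbd/" + vbd + "/state"


def plug_request(vbd: str) -> str:
    """Value the toolstack would write to the vdi's "request" node."""
    return "plug " + vbd


# Two vbds on the same vdi get distinct nodes and distinct requests.
paths = [vbd_state_path(5, "disk0", v) for v in ("vbd-a", "vbd-b")]
requests = [plug_request(v) for v in ("vbd-a", "vbd-b")]
```

Without the id, the single "request" node of the vdi could not say
which of its vbds an unplug refers to.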

>             request ENOENT
>             vbd/<vbd>/state ENOENT
>             <frontend> ENOENT
>         Request:
>             request = "plug <vbd>"
>             vbd/<vbd>/frontend = <frontend> ("/local/domain/<guest>/...")
>         Results on success:
>             vbd/<vbd>/state = "ok"
>             vbd/<vbd>/backend = <rel-backend>
>                 (<rel-backend> is the backend path relative to the
>                  driver domain's home directory in xenstore)
>             vbd/<vbd>/b-copy/*  may be created    } at least one of these
>             <backend>/*  may come into existence  }  must happen
>         Next step (xenstore write) by toolstack:
>             <frontend>  created and populated, specifically
>             <frontend>/backend = <backend>
>                 ("/local/domain/<driverdomid>/<rel-backend>")
>             <backend>    created if necessary
>             <backend>/*  copied from  vbd/<vbd>/b-copy/*  if any
>             <backend>/frontend = <frontend>  unless already set
>
>  unplug <vbd>
>         Preconditions:
>             state ENOENT
>             request ENOENT
>             vbd/<vbd>/state "ok"
>         Request:
>             request = "unplug <vbd>"
>             <frontend> ENOENT
>         Results on success:
>             vbd/<vbd>/state ENOENT
>             <backend> ENOENT

So the flow of the protocol is (if everything returns success):

connection: prepare -> activate -> plug
disconnection: unplug -> deactivate -> unprepare
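
The two sequences can be walked through with a toy model in which a
Python dict stands in for one vdi directory in xenstore. This is a
sketch under stated assumptions, not an implementation: the function
names are invented, the error number is only illustrative, the driver
domain is called directly instead of being woken by a xenstore watch,
and plug is assumed to require a prepared vdi (the open question
above) rather than "state ENOENT" as written.

```python
EINVAL = 22  # illustrative error number for a failed precondition

# operation -> (states it may start from, state afterwards); None models ENOENT
VDI_OPS = {
    "prepare": ({None}, "inactive"),
    "activate": ({"inactive"}, "active"),
    "deactivate": ({"active"}, "inactive"),
    "unprepare": ({"inactive", "active"}, None),
}


def handle_request(node):
    """Hypothetical driver-domain side: complete the pending request."""
    words = node.pop("request").split()  # request is deleted on completion
    op = words[0]
    node.pop("result_msg", None)
    state = node.get("state")
    if op in VDI_OPS:
        allowed, new_state = VDI_OPS[op]
        ok = state in allowed
        if ok and new_state is None:
            del node["state"]
        elif ok:
            node["state"] = new_state
    elif op == "plug":
        # Assumption: plug needs a prepared vdi and an unused vbd id.
        key = "vbd/" + words[1] + "/state"
        ok = state is not None and key not in node
        if ok:
            node[key] = "ok"
    elif op == "unplug":
        key = "vbd/" + words[1] + "/state"
        ok = node.get(key) == "ok"
        if ok:
            del node[key]
    else:
        ok = False
    node["result"] = "0" if ok else str(EINVAL)
    if not ok:
        node["result_msg"] = "precondition failed for " + op


def toolstack_request(node, request):
    """Hypothetical toolstack side: issue one request, return the result."""
    assert "request" not in node  # precondition: request ENOENT
    node["request"] = request
    handle_request(node)  # stands in for waiting on a xenstore watch
    return node["result"]


vdi = {}
connect = ["prepare", "activate", "plug vbd0"]
disconnect = ["unplug vbd0", "deactivate", "unprepare"]
results = [toolstack_request(vdi, r) for r in connect + disconnect]
```

In this model every step returns "0" and the directory ends up holding
only the last result, i.e. the vdi is back to "state ENOENT".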

>
>  The toolstack and driver domains should not store state of their own,
>  not required for these communication purposes, in the backendctrl/
>  directory in xenstore.  If the driver domain wishes to make records
>  for its own use in xenstore, it should do so in a different directory
>  of its choice (eg, /local/domain/<driverdomid>/private/<something>).
>
>
> Notes regarding driver domains whose block backend implementation is
> controlled from the actual xenstore backend directory:
>
>  The b-copy/* feature exists for compatibility with some of these.  If
>  such a backend cannot cope with the backend directory coming into
>  existence before the corresponding frontend directory, then it is
>  necessary to create and populate the backend in the same xenstore
>  transaction as the creation of the frontend.  However, such backends
>  should be fixed; the b-copy/* feature is deprecated and will be
>  withdrawn at some point.
>
>  Note that a vbd may be created with the vdi inactive.  In this case

So in this case, the connection may happen with:

connection: prepare -> plug -> activate?

I frankly find this vbd/vdi naming very confusing.

>  the frontend and backend directories will exist, but the information
>  needed to start up the backend properly may be lacking until the vdi
>  is activated.  For example, if the existence of a suitable block
>  device in the driver domain depends on vdi activation, the block
>  device id cannot be made known to the backend until after the backend
>  directory has already been created and perhaps has existed for some
>  time.  It is believed that existing backends cope with this, because
>  they use a "hotplug script" approach - where the backend directory is
>  created without specifying the device node, and this backend directory
>  creation causes the invocation of machinery which establishes the
>  device node, which is subsequently written to xenstore.
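
The quoted paragraph, together with the earlier statement that IO is
only possible to a vbd when the corresponding vdi is active, suggests
the following picture of the prepare -> plug -> activate ordering. A
tiny Python sketch; io_possible() and the step table are invented for
illustration and do not appear in the proposal.

```python
def io_possible(vdi_state, vbd_plugged):
    """IO needs both a plugged vbd and an active vdi."""
    return vbd_plugged and vdi_state == "active"


# (operation just completed, vdi state afterwards, is the vbd plugged?)
steps = [
    ("prepare", "inactive", False),
    ("plug", "inactive", True),
    ("activate", "active", True),
]

timeline = [(name, io_possible(state, plugged))
            for name, state, plugged in steps]
```

With this ordering the vbd exists before activation, which is what
lets the storage for an incoming migration be set up early, but no IO
can happen until the final activate step.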
>
>
> Question
>
>  What about network interfaces and other kinds of backend ?


Thread overview: 9+ messages
2012-12-03 10:34 Driver domains communication protocol proposal Roger Pau Monné
  -- strict thread matches above, loose matches on Subject: below --
2012-04-04 15:46 Ian Jackson
2012-04-10 15:06 ` Paul Durrant
2012-04-11 10:11   ` Ian Jackson
2012-04-12  9:33 ` George Dunlap
2012-04-24 18:00   ` Ian Jackson
2012-04-25 10:00     ` Ian Campbell
2012-04-25 10:27     ` George Dunlap
2012-05-11 15:00       ` Ian Jackson
