All of lore.kernel.org
 help / color / mirror / Atom feed
From: Olaf Hering <olaf@aepfle.de>
To: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Ondrej Holecek <oholecek@suse.com>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>,
	xen-devel@lists.xen.org
Subject: Re: [PATCH] libbxl: add support for pvscsi, iteration 1
Date: Fri, 2 May 2014 17:54:53 +0200	[thread overview]
Message-ID: <20140502155452.GA21983@aepfle.de> (raw)
In-Reply-To: <1399043505.18944.3.camel@kazak.uk.xensource.com>

On Fri, May 02, Ian Campbell wrote:

> > The backend driver uses a single_host:many_devices notation to manage
> > domU devices. The xenstore layout looks like this:
> 
> Please can you add ~/device/vscsi/$DEVID/* and
> ~/backend/vif/$DOMID/$DEVID/* to docs/misc/xenstore-paths.markdown.
> 
> If you were inclined to improve xen/include/public/io/vscsiif.h by using
> some of the content from here then that would be awesome too.

I think its a good idea to document what the current backend/frontend
code does, as this will also aid with review of the libxl changes.

> > This is a challenge for libxl because libxl currently handles only
> > single_host:single_device in its device helper functions.
> 
> This is a bit like pci I suppose?

Maybe. But today Ondrej found a way to attach additional devices,
detaching them will most likely work as well. Basically one creates the
new vscsi-devs/dev-N/ subdir with state==1, then write state==7 to the
backend. Similar with detach, state==5 to the dev-N, then state==7 to
the backend.

> > Due to this
> > limitation in libxl, scsi-detach will remove the whole virtual scsi host
> > with all its virtual devices, instead of just the requested single
> > virtual device.
> 
> That's rather unfortunate!
> 
> Are multiple <vhost> allowed/supported? e.g. could I (or better, libxl
> by default) setup a device-per-host situation to alleviate this?
> 
> For pcifront/back we support bus reconfiguration (via
> XenbusStateReconfigur{ing,ed}) which is how this works there. Does vscsi
> not support anything like that?

Many vhosts are supported, I just did not realize that reconfigure was
used here as well. So maybe libxl does not need further changes. I will
hack some code to show how attach/detach works in theory.

> > To support the pdev notation '/dev/$device' it is required to parse
> > sysfs to translate the path to HST:CHN:TGT:LUN notation. In its current
> > variant /sys/dev/ is used, which is available since at least Linux 3.0.
> > Support for dom0 kernels which lack this interface will be added later.
> 
> Are there any such kernels which anyone cares about any more?

I have to check when /sys/dev was introduced. In the end it would be
nice to support also older dom0 kernels.

> >  - xl scsi-detach will remove entire scsi hosts, unlike its xm counter part.
> 
> What did xm do instead? Fail?

No, it will use the xenstore reconfigure state to indicate a change.

> >  Note that the latter format is unreliable because
> > +the HOST value can change across dom0 reboots.
> 
> /dev/sgN would be unreliable too I think? As might block devices not
> using the /dev/disk/by-* links you give in the example. You might be
> best saying that only /dev/disk/by-* block device references are
> reliable because all of the others might change over a reboot.

I will reword this part.

One customer reported that his SCSI device was not working with xend
because this device was accessible only via a char device. And I think
udev was able to create some by-id path for the sgN node. Have to dig
into that old report. 

> > +
> > +=item C<vdev>
> > +
> > +Specifies how the SCSI device is mapped into the guest. The notation is in
> > +SCSI notation HOST:CHANNEL:TARGET:LUN. HOST in this case means a virthal
> 
> Typo "virtual"
> 
> > +SCSI host within the guest.
> 
> All of HOST:CHANNEL:TARGET:LUN are numbers I think, are they
> hex/dec/oct? Are there limits?

I have to find out, its most likely all decimal.

> > +Right now only one option is recognized: feature-host.
> What does it do?

I have to find out ;-) 
Last time I checked the command was passed as is from domU to dom0, but
I dont really know as it is not documented.

> > +=item B<scsi-attach> I<domain-id> I<pdev> I<vdev> I<[,feature-host]>
> 
> Is the comma in I<[,feature-host]> literally required, as in
>     xl scsi-attach mydom 0:0:0:0 0:0:0:0 ,feature-host
> ?

This is a typo I noticed after submission.
Not sure how to specify the option. Either a '0:0:0:0,feature-host' or
as two strings.

> > +Removes the vscsi device from domain specified by I<domain-id>.
> > +Note that the whole virtual SCSI host with all its devices is removed.
> > +This is a BUG!
> 
> I have a feeling it would be preferable in the first instance to either
> just not provide the detach portion of this interface, or to require
> that vdev is only a host and not a specific device (... unless the host
> only has a single device, but maybe that's getting too far)

Attach/detach can probably be implemented without much work.

> > +++ b/tools/libxl/libxl.c
> > +
> > +void libxl__device_vscsi_add(libxl__egc *egc, uint32_t domid,
> > +                           libxl_device_vscsi *vscsi,
> > +                           libxl__ao_device *aodev)
> > +{
> > +    STATE_AO_GC(aodev->ao);
> > +    flexarray_t *front;
> > +    flexarray_t *back;
> > +    libxl__device *device;
> > +    unsigned int rc, i;
> > +
> > +    i = 2 * (4 + (3 * vscsi->num_vscsi_devs));
> 
> That's quite exciting ;-)
> 
> A flexarray will grow as neccessary so some sort of static best guest of
> a starting point would be fine here I think (or a comment about 2 4 and
> 3)!

Yes, that was just an attempt to avoid realloc. As usual, optimization
of non-critical code paths...

> I suppose the num_vscsi_devs here is an artefact of the short coming you
> mentioned before. The way PCI handles this is to create the "bus" (aka
> vhost here) either when the first device is attached or in the higher
> level code before it calls the individual device adds (i.e. this
> function), or something (I must confess my memory is a bit fuzzy,
> libxl__create_pci_backend might be a place to look). Having done that
> then libxl_device_pci is just a single device.
> 
> I think you should look to do something similar here, so that each
> libxl_device_vscsi is actually a single device. The alternative is some
> sort of libxl_bus_vscsi thing, which would be unconventional compared
> with everything else

The thing with host:dev[,dev[,dev]] is that within the guest something
may rely on the way SCSI devices are presented (ordering, naming, no
idea). Surely the vscsi= and scsi-attach/detach parsing code could do
some sort of translation from a group of devices-per-vhost into a single
vhost:dev. But now that we have found that reconfiguration is used in
xenstore to attach/detach devices it may be possible to reuse existing
libxl helpers while preserving existing domU.cfg files and backend
drivers.


> > +libxl_device_vscsi *libxl_device_vscsi_list(libxl_ctx *ctx, uint32_t domid, int *num)
> > +{
> > +    GC_INIT(ctx);
> > +
> > +    libxl_device_vscsi *vscsi_hosts = NULL;
> > +    char *fe_path;
> > +    char **dir;
> > +    unsigned int ndirs = 0;
> > +
> > +    fe_path = libxl__sprintf(gc, "%s/device/vscsi", libxl__xs_get_dompath(gc, domid));
> 
> Need to take care trusting this too much...

Not sure what you mean with this sentence?

> > @@ -3520,6 +3713,10 @@ DEFINE_DEVICE_REMOVE(vfb, destroy, 1)
> >  DEFINE_DEVICE_REMOVE(vtpm, remove, 0)
> >  DEFINE_DEVICE_REMOVE(vtpm, destroy, 1)
> >  
> > +/* vscsi */
> > +DEFINE_DEVICE_REMOVE(vscsi, remove, 0)
> > +DEFINE_DEVICE_REMOVE(vscsi, destroy, 0)
> 
> Should the second one not be (vscsi, destroy, 1) ?

Right. 

> > +libxl_device_vscsi = Struct("device_vscsi", [
> > +    ("backend_domid",    libxl_domid),
> > +    ("devid",            libxl_devid),
> > +    ("v_hst",            uint32),
> > +    ("vscsi_devs",       Array(libxl_vscsi_dev, "num_vscsi_devs")),
> > +    ("feature_host",     bool),
> 
> Not defbool (I don't know what this does...)

This is just flag for "feature-host" yes/no. Have to find out what this
backend option actually does.

> > @@ -508,6 +528,28 @@ libxl_vtpminfo = Struct("vtpminfo", [
> >      ("uuid", libxl_uuid),
> >      ], dir=DIR_OUT)
> >  
> > +libxl_vscsiinfo = Struct("vscsiinfo", [
> > +    ("backend", string),
> > +    ("backend_id", uint32),
> > +    ("frontend", string),
> > +    ("frontend_id", uint32),
> > +    ("devid", libxl_devid),
> > +    ("p_hst", uint16),
> > +    ("p_chn", uint16),
> > +    ("p_tgt", uint16),
> > +    ("p_lun", uint16),
> 
> Type, also the widths here differ from the device...
> 
> > +    ("vscsi_dev_id", libxl_devid),
> > +    ("v_hst", uint16),
> > +    ("v_chn", uint16),
> > +    ("v_tgt", uint16),
> > +    ("v_lun", uint16),
> 
> and again

Some format code was unhappy with u16, cant remember. I was just trying
to save space. Again, optimization of non-critical code paths.

> > +static char *vscsi_trim_string(char *s)
> > +{
> > +    unsigned int len;
> > +
> > +    while (isspace(*s))
> > +        s++;
> > +    len = strlen(s);
> > +    while (len-- > 1 && isspace(s[len]))
> > +        s[len] = '\0';
> > +    return s;
> 
> Doesn't seem to be anything vscsi specific here (and it seems like a
> generally useful thing. I expect at least thevif parsing would benefit
> from deploying this too (in so much as all this wouldn't be better with
> a proper parser using flex).

I figured that could be a generic helper. Have to check existing code if
it can make use of it.

> > +}
> > +
> > +static void parse_vscsi_config(libxl_device_vscsi *vscsi_host,
> > +                              libxl_vscsi_dev *vscsi_dev,
> > +                              char *buf)
> > +{
> > +    char *pdev, *vdev, *fhost;
> > +    unsigned int hst, chn, tgt, lun;
> > +
> > +    libxl_device_vscsi_init(vscsi_host);
> > +    pdev = strtok(buf, ",");
> > +    vdev = strtok(NULL, ",");
> > +    fhost = strtok(NULL, ",");
> 
> I guess we are single threaded here so this is ok.
> 
> I wonder if this might be useful to put in libxlu alongside the disk
> parser? The question is whether anything else wants to parse xm style
> vscsi strings, like libvirt perhaps. (arguably a bunch of these parsers
> here have that property and belong in libxlu).

I was wondering how libvirt support for vscsi would look like, looks
like it has to reimplement all that parsing to fill struct
libxl_domain_config. Maybe that parser could be generic, depends how
external callers deal with such parsing today.

> > +    pdev = vscsi_trim_string(pdev);
> > +    vdev = vscsi_trim_string(vdev);
> > +
> > +    if (strncmp(pdev, "/dev/", 5) == 0) {
> > +#ifdef __linux__
> 
> Urk. Normally we do this by splitting by files, e.g. into a linux
> version and a nop version and compiling the appropriate one.
> 
> This seems like a candidate for libxlu too. We probably need to have a
> think about how much of this stuff is done in libxl too -- e.g. for
> disks we pass the pdev in as a string (path) and interpret it there.
> That allows us to choose between alternative backend providers etc. Also
> libxl is a slightly more palatable place to deal with Linux vs non-Linux
> bits. I think that might be the right answer here too.

There is a libxl_linux.c file, but thats for libxl not xl. So we went
with the ifdef for the time being.

> > @@ -1242,6 +1360,56 @@ static void parse_config_data(const char *config_source,
> >          }
> >      }
> >  
> > +    if (!xlu_cfg_get_list(config, "vscsi", &vscsis, 0, 0)) {
> > +        int cnt_vscsi_devs = 0;
> > +        d_config->num_vscsis = 0;
> > +        d_config->vscsis = NULL;
> > +        while ((buf = xlu_cfg_get_listitem (vscsis, cnt_vscsi_devs)) != NULL) {
> > +            libxl_vscsi_dev vscsi_dev = { };
> > +            libxl_device_vscsi vscsi_host = { };
> > +            libxl_device_vscsi *host;
> > +            char *tmp_buf;
> > +            int num_vscsis, host_found = 0;
> > +
> > +            /*
> > +             * #1: parse the devspec and place it in temporary host+dev part
> > +             * #2: find existing vscsi_host with number v_hst
> > +             *     if found, append the vscsi_dev to this vscsi_host
> > +             * #3: otherwise, create new vscsi_host and append vscsi_dev
> > +             * Note: v_hst does not represent the index named "num_vscsis",
> > +             *       it is a private index used just in the config file
> 
> Would all be a lot simpler if a libxl_device_vscsi was a single device
> and all the worrying about which bus was which was in libxl

As said above, domU.cfg code like this may serve a purpose. Devices are
grouped per vhost (0: and 1: in this example):

### vscsi = [
###     '/dev/sdm, 0:0:0:0',
###     '/dev/sdn, 0:0:0:1',
###     '/dev/sdo, 0:0:1:0',
###     '/dev/sdp, 0:2:0:1 ',
###     '/dev/sdq, 1:0:0:0, feature-host  ',
###     '/dev/sdr, 1:1:1:0, feature-host',
### ]


Thanks for the quick review!

Olaf

  reply	other threads:[~2014-05-02 15:54 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-30 11:03 [PATCH] libbxl: add support for pvscsi, iteration 1 Olaf Hering
2014-05-02 14:36 ` Olaf Hering
2014-05-02 15:30   ` Ian Campbell
2014-05-02 15:11 ` Ian Campbell
2014-05-02 15:54   ` Olaf Hering [this message]
2014-05-02 16:35     ` Ian Campbell
2014-05-02 15:54 ` Ian Jackson
2014-05-02 16:00   ` Olaf Hering

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140502155452.GA21983@aepfle.de \
    --to=olaf@aepfle.de \
    --cc=Ian.Campbell@citrix.com \
    --cc=Ian.Jackson@eu.citrix.com \
    --cc=oholecek@suse.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.