All of lore.kernel.org
 help / color / mirror / Atom feed
* libvirt, libxl and QDISKs
@ 2013-04-24 14:22 David Scott
  2013-04-24 15:55 ` Stefano Stabellini
  2013-04-25  8:55 ` Ian Campbell
  0 siblings, 2 replies; 32+ messages in thread
From: David Scott @ 2013-04-24 14:22 UTC (permalink / raw)
  To: xen-devel

Hi,

Now that libxl + qemu's built-in disk backend for ceph/rbd is working 
nicely, I'm trying to get it all working through libvirt.

When libvirt's libxl driver creates a libxl_device_disk, it applies a 
few simple rules[1] covering specific file format types (qcow, qcow2, 
vhd). If none of these rules apply then it defers to libxl's best guess:

             * If driverName is not specified, default to raw as per
             * xl-disk-configuration.txt in the xen documentation and let
             * libxl pick a suitable backend.
             */
             x_disk->format = LIBXL_DISK_FORMAT_RAW;
             x_disk->backend = LIBXL_DISK_BACKEND_UNKNOWN;

The libxl code in libxl__device_disk_set_backend[2] tries to resolve 
'UNKNOWN' into something concrete by calling 'disk_try_backend':

             disk_try_backend(&a, LIBXL_DISK_BACKEND_PHY) ?:
             disk_try_backend(&a, LIBXL_DISK_BACKEND_TAP) ?:
             disk_try_backend(&a, LIBXL_DISK_BACKEND_QDISK);

Unfortunately for me and my quest, the case for LIBXL_DISK_BACKEND_TAP 
just checks for
  * lack of hotplug script
  * libxl__blktap_enabled
  * DISK_FORMAT_{RAW,VHD}
and so it selects TAP and then fails, because tapdisk doesn't know 
anything about this disk access protocol (yet at least). Unbeknownst to 
libxl, qemu would be able to handle the disk in this case.

I'm not sure what the best way to address this is. I could disable 
blktap on my system but it would be a shame to drop support for the 
stuff tapdisk does well (like .vhd handling)

On the other hand, having both tapdisk and qemu means having to choose
between them for disk protocols they could both handle. A choice would
either have to be coded in libxl (or libvirt's libxl driver), which 
seems like a bit of an annoying maintenance burden (i.e. update the list 
every time someone adds a driver somewhere); or it could be left to the 
sysadmin to choose via some kind of resolver script, which seems a bit 
complex.

A third possibility is to be able to ask tapdisk, "do you actually 
support this disk" and if yes, use it in preference to qemu, if no fall 
back to qemu.

Anyone got any thoughts? (Or perhaps I've missed something obvious! :-)

Cheers,
Dave

[1] 
http://libvirt.org/git/?p=libvirt.git;a=blob;f=src/libxl/libxl_conf.c;h=7e0753a69ed4a0867d14843cbad20d1430a67f0c;hb=HEAD#l462

[2] 
http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/libxl/libxl_device.c;h=eeea9d9a023877a69ff2832c973e753629333356;hb=HEAD#l219

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-24 14:22 libvirt, libxl and QDISKs David Scott
@ 2013-04-24 15:55 ` Stefano Stabellini
  2013-04-25  8:55 ` Ian Campbell
  1 sibling, 0 replies; 32+ messages in thread
From: Stefano Stabellini @ 2013-04-24 15:55 UTC (permalink / raw)
  To: David Scott; +Cc: xen-devel

On Wed, 24 Apr 2013, David Scott wrote:
> Hi,
> 
> Now that libxl + qemu's built-in disk backend for ceph/rbd is working 
> nicely, I'm trying to get it all working through libvirt.
> 
> When libvirt's libxl driver creates a libxl_device_disk, it applies a 
> few simple rules[1] covering specific file format types (qcow, qcow2, 
> vhd). If none of these rules apply then it defers to libxl's best guess:
> 
>              * If driverName is not specified, default to raw as per
>              * xl-disk-configuration.txt in the xen documentation and let
>              * libxl pick a suitable backend.
>              */
>              x_disk->format = LIBXL_DISK_FORMAT_RAW;
>              x_disk->backend = LIBXL_DISK_BACKEND_UNKNOWN;
> 
> The libxl code in libxl__device_disk_set_backend[2] tries to resolve 
> 'UNKNOWN' into something concrete by calling 'disk_try_backend':
> 
>              disk_try_backend(&a, LIBXL_DISK_BACKEND_PHY) ?:
>              disk_try_backend(&a, LIBXL_DISK_BACKEND_TAP) ?:
>              disk_try_backend(&a, LIBXL_DISK_BACKEND_QDISK);
> 
> Unfortunately for me and my quest, the case for LIBXL_DISK_BACKEND_TAP 
> just checks for
>   * lack of hotplug script
>   * libxl__blktap_enabled
>   * DISK_FORMAT_{RAW,VHD}
> and so it selects TAP and then fails, because tapdisk doesn't know 
> anything about this disk access protocol (yet at least). Unbeknownst to 
> libxl, qemu would be able to handle the disk in this case.
> 
> I'm not sure what the best way to address this is. I could disable 
> blktap on my system but it would be a shame to drop support for the 
> stuff tapdisk does well (like .vhd handling)
> 
> On the other hand, having both tapdisk and qemu means having to choose
> between them for disk protocols they could both handle. A choice would
> either have to be coded in libxl (or libvirt's libxl driver), which 
> seems like a bit of an annoying maintenance burden (i.e. update the list 
> every time someone adds a driver somewhere); or it could be left to the 
> sysadmin to choose via some kind of resolver script, which seems a bit 
> complex.
> 
> A third possibility is to be able to ask tapdisk, "do you actually 
> support this disk" and if yes, use it in preference to qemu, if no fall 
> back to qemu.
> 
> Anyone got any thoughts? (Or perhaps I've missed something obvious! :-)

I vote for using blktap exclusively for VHD files.
In fact we do know that blktap support is not great for anything else
anyway (qcow2 I am thinking about you).

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-24 14:22 libvirt, libxl and QDISKs David Scott
  2013-04-24 15:55 ` Stefano Stabellini
@ 2013-04-25  8:55 ` Ian Campbell
  2013-04-25 10:36   ` George Dunlap
  1 sibling, 1 reply; 32+ messages in thread
From: Ian Campbell @ 2013-04-25  8:55 UTC (permalink / raw)
  To: David Scott; +Cc: xen-devel

On Wed, 2013-04-24 at 15:22 +0100, David Scott wrote:
> Hi,
> 
> Now that libxl + qemu's built-in disk backend for ceph/rbd is working 
> nicely, I'm trying to get it all working through libvirt.
> 
> When libvirt's libxl driver creates a libxl_device_disk, it applies a 
> few simple rules[1] covering specific file format types (qcow, qcow2, 
> vhd). If none of these rules apply then it defers to libxl's best guess:
> 
>              * If driverName is not specified, default to raw as per
>              * xl-disk-configuration.txt in the xen documentation and let
>              * libxl pick a suitable backend.

Can you specify driverName in your config somehow? I confess I don't
know if this is ~= format or backend...

> Anyone got any thoughts? (Or perhaps I've missed something obvious! :-)

This is not the first time that the issue of hardcoded defaults in libxl
has come up recently (in fact I think you mentioned one of the others to
me...).

It seems to me that there is a need arising for a system wide libxl
configuration file which sets these sorts of defaults for all users of
libxl on that system and which can be modified according to admin
preference. As well as the disk backend selection (which is actually
probably pretty complex to express in a simple configuration file) this
has come up in the context of stubdomain usage and autoballooning.

Ian.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-25  8:55 ` Ian Campbell
@ 2013-04-25 10:36   ` George Dunlap
  2013-04-25 11:33     ` Ian Campbell
  0 siblings, 1 reply; 32+ messages in thread
From: George Dunlap @ 2013-04-25 10:36 UTC (permalink / raw)
  To: Ian Campbell; +Cc: David Scott, xen-devel

On Thu, Apr 25, 2013 at 9:55 AM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Wed, 2013-04-24 at 15:22 +0100, David Scott wrote:
>> Hi,
>>
>> Now that libxl + qemu's built-in disk backend for ceph/rbd is working
>> nicely, I'm trying to get it all working through libvirt.
>>
>> When libvirt's libxl driver creates a libxl_device_disk, it applies a
>> few simple rules[1] covering specific file format types (qcow, qcow2,
>> vhd). If none of these rules apply then it defers to libxl's best guess:
>>
>>              * If driverName is not specified, default to raw as per
>>              * xl-disk-configuration.txt in the xen documentation and let
>>              * libxl pick a suitable backend.
>
> Can you specify driverName in your config somehow? I confess I don't
> know if this is ~= format or backend...
>
>> Anyone got any thoughts? (Or perhaps I've missed something obvious! :-)
>
> This is not the first time that the issue of hardcoded defaults in libxl
> has come up recently (in fact I think you mentioned one of the others to
> me...).
>
> It seems to me that there is a need arising for a system wide libxl
> configuration file which sets these sorts of defaults for all users of
> libxl on that system and which can be modified according to admin
> preference. As well as the disk backend selection (which is actually
> probably pretty complex to express in a simple configuration file) this
> has come up in the context of stubdomain usage and autoballooning.

Something like this was actually on my "to-implement" feature list for
4.3, but no one got around to it, so I dropped it.

FWIW, I think having this kind of thing would make 4.3 betterer enough
that it would be worth slipping the release for a few weeks to get it
in.

On the other hand, experience has shown that converging on what the
"right" interface should be take a heck of a lot longer than you
think; so maybe we should just plan on doing this for 4.4 at this
point.

 -George

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-25 10:36   ` George Dunlap
@ 2013-04-25 11:33     ` Ian Campbell
  2013-04-25 11:55       ` Stefano Stabellini
  0 siblings, 1 reply; 32+ messages in thread
From: Ian Campbell @ 2013-04-25 11:33 UTC (permalink / raw)
  To: George Dunlap; +Cc: Dave Scott, xen-devel

On Thu, 2013-04-25 at 11:36 +0100, George Dunlap wrote:
> On Thu, Apr 25, 2013 at 9:55 AM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > On Wed, 2013-04-24 at 15:22 +0100, David Scott wrote:
> >> Hi,
> >>
> >> Now that libxl + qemu's built-in disk backend for ceph/rbd is working
> >> nicely, I'm trying to get it all working through libvirt.
> >>
> >> When libvirt's libxl driver creates a libxl_device_disk, it applies a
> >> few simple rules[1] covering specific file format types (qcow, qcow2,
> >> vhd). If none of these rules apply then it defers to libxl's best guess:
> >>
> >>              * If driverName is not specified, default to raw as per
> >>              * xl-disk-configuration.txt in the xen documentation and let
> >>              * libxl pick a suitable backend.
> >
> > Can you specify driverName in your config somehow? I confess I don't
> > know if this is ~= format or backend...
> >
> >> Anyone got any thoughts? (Or perhaps I've missed something obvious! :-)
> >
> > This is not the first time that the issue of hardcoded defaults in libxl
> > has come up recently (in fact I think you mentioned one of the others to
> > me...).
> >
> > It seems to me that there is a need arising for a system wide libxl
> > configuration file which sets these sorts of defaults for all users of
> > libxl on that system and which can be modified according to admin
> > preference. As well as the disk backend selection (which is actually
> > probably pretty complex to express in a simple configuration file) this
> > has come up in the context of stubdomain usage and autoballooning.
> 
> Something like this was actually on my "to-implement" feature list for
> 4.3, but no one got around to it, so I dropped it.
> 
> FWIW, I think having this kind of thing would make 4.3 betterer enough
> that it would be worth slipping the release for a few weeks to get it
> in.
> 
> On the other hand, experience has shown that converging on what the
> "right" interface should be take a heck of a lot longer than you
> think; so maybe we should just plan on doing this for 4.4 at this
> point.

I don't think this can/should be rushed at this point.

Ian.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-25 11:33     ` Ian Campbell
@ 2013-04-25 11:55       ` Stefano Stabellini
  2013-04-25 11:57         ` Ian Campbell
  2013-04-25 18:26         ` Sylvain Munaut
  0 siblings, 2 replies; 32+ messages in thread
From: Stefano Stabellini @ 2013-04-25 11:55 UTC (permalink / raw)
  To: Ian Campbell; +Cc: George Dunlap, Dave Scott, xen-devel

On Thu, 25 Apr 2013, Ian Campbell wrote:
> On Thu, 2013-04-25 at 11:36 +0100, George Dunlap wrote:
> > On Thu, Apr 25, 2013 at 9:55 AM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > > On Wed, 2013-04-24 at 15:22 +0100, David Scott wrote:
> > >> Hi,
> > >>
> > >> Now that libxl + qemu's built-in disk backend for ceph/rbd is working
> > >> nicely, I'm trying to get it all working through libvirt.
> > >>
> > >> When libvirt's libxl driver creates a libxl_device_disk, it applies a
> > >> few simple rules[1] covering specific file format types (qcow, qcow2,
> > >> vhd). If none of these rules apply then it defers to libxl's best guess:
> > >>
> > >>              * If driverName is not specified, default to raw as per
> > >>              * xl-disk-configuration.txt in the xen documentation and let
> > >>              * libxl pick a suitable backend.
> > >
> > > Can you specify driverName in your config somehow? I confess I don't
> > > know if this is ~= format or backend...
> > >
> > >> Anyone got any thoughts? (Or perhaps I've missed something obvious! :-)
> > >
> > > This is not the first time that the issue of hardcoded defaults in libxl
> > > has come up recently (in fact I think you mentioned one of the others to
> > > me...).
> > >
> > > It seems to me that there is a need arising for a system wide libxl
> > > configuration file which sets these sorts of defaults for all users of
> > > libxl on that system and which can be modified according to admin
> > > preference. As well as the disk backend selection (which is actually
> > > probably pretty complex to express in a simple configuration file) this
> > > has come up in the context of stubdomain usage and autoballooning.
> > 
> > Something like this was actually on my "to-implement" feature list for
> > 4.3, but no one got around to it, so I dropped it.
> > 
> > FWIW, I think having this kind of thing would make 4.3 betterer enough
> > that it would be worth slipping the release for a few weeks to get it
> > in.
> > 
> > On the other hand, experience has shown that converging on what the
> > "right" interface should be take a heck of a lot longer than you
> > think; so maybe we should just plan on doing this for 4.4 at this
> > point.
> 
> I don't think this can/should be rushed at this point.

Right, but going back to the original question: if we have in our hands
something that is not vhd, raw or qcow and blktap2 is available,
should we really try to pass it to it?

We do know that blktap2 only handles qcow, qcow2, raw and vhd (and the
implementation of the first two formats is too buggy to be used).

Thus I think that the answer is pretty obvious here: we should try with
QEMU, whose supported format list is way wider.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-25 11:55       ` Stefano Stabellini
@ 2013-04-25 11:57         ` Ian Campbell
  2013-04-25 12:12           ` David Scott
  2013-04-25 18:26         ` Sylvain Munaut
  1 sibling, 1 reply; 32+ messages in thread
From: Ian Campbell @ 2013-04-25 11:57 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: George Dunlap, Dave Scott, xen-devel

On Thu, 2013-04-25 at 12:55 +0100, Stefano Stabellini wrote:

> Right, but going back to the original question: if we have in our hands
> something that is not vhd, raw or qcow and blktap2 is available,
> should we really try to pass it to it?

That's not quite the original question because in that case it was raw,
at least as far as the libxl interface is concerned.

> We do know that blktap2 only handles qcow, qcow2, raw and vhd (and the
> implementation of the first two formats is too buggy to be used).
> 
> Thus I think that the answer is pretty obvious here: we should try with
> QEMU, whose supported format list is way wider.

Which I think we do, we only try and use blktap for raw and vhd.

Ian.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-25 11:57         ` Ian Campbell
@ 2013-04-25 12:12           ` David Scott
  2013-04-25 12:43             ` Ian Campbell
  0 siblings, 1 reply; 32+ messages in thread
From: David Scott @ 2013-04-25 12:12 UTC (permalink / raw)
  To: Ian Campbell; +Cc: George Dunlap, xen-devel, Stefano Stabellini

On 25/04/13 12:57, Ian Campbell wrote:
> On Thu, 2013-04-25 at 12:55 +0100, Stefano Stabellini wrote:
>
>> Right, but going back to the original question: if we have in our hands
>> something that is not vhd, raw or qcow and blktap2 is available,
>> should we really try to pass it to it?
>
> That's not quite the original question because in that case it was raw,
> at least as far as the libxl interface is concerned.
>
>> We do know that blktap2 only handles qcow, qcow2, raw and vhd (and the
>> implementation of the first two formats is too buggy to be used).
>>
>> Thus I think that the answer is pretty obvious here: we should try with
>> QEMU, whose supported format list is way wider.
>
> Which I think we do, we only try and use blktap for raw and vhd.

As well as the disk format dimension (vhd, qcow2 etc) there's also the 
network disk access protocol (iSCSI, ceph/RBD, sheepdog?). Although both 
blktap/tapdisk can handle raw, alas all the interesting (perhaps I'm 
showing my bias here ;-) disk access protocols are in qemu. So if we 
default to blktap/tapdisk we can only handle raw *files*, whereas if we 
default to qemu we can also do these new fancy things as well.

Cheers,
Dave

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-25 12:12           ` David Scott
@ 2013-04-25 12:43             ` Ian Campbell
  2013-04-25 13:22               ` David Scott
  0 siblings, 1 reply; 32+ messages in thread
From: Ian Campbell @ 2013-04-25 12:43 UTC (permalink / raw)
  To: Dave Scott; +Cc: George Dunlap, xen-devel, Stefano Stabellini

On Thu, 2013-04-25 at 13:12 +0100, Dave Scott wrote:
> On 25/04/13 12:57, Ian Campbell wrote:
> > On Thu, 2013-04-25 at 12:55 +0100, Stefano Stabellini wrote:
> >
> >> Right, but going back to the original question: if we have in our hands
> >> something that is not vhd, raw or qcow and blktap2 is available,
> >> should we really try to pass it to it?
> >
> > That's not quite the original question because in that case it was raw,
> > at least as far as the libxl interface is concerned.
> >
> >> We do know that blktap2 only handles qcow, qcow2, raw and vhd (and the
> >> implementation of the first two formats is too buggy to be used).
> >>
> >> Thus I think that the answer is pretty obvious here: we should try with
> >> QEMU, whose supported format list is way wider.
> >
> > Which I think we do, we only try and use blktap for raw and vhd.
> 
> As well as the disk format dimension (vhd, qcow2 etc) there's also the 
> network disk access protocol (iSCSI, ceph/RBD, sheepdog?). Although both 
> blktap/tapdisk can handle raw, alas all the interesting (perhaps I'm 
> showing my bias here ;-) disk access protocols are in qemu. So if we 
> default to blktap/tapdisk we can only handle raw *files*, whereas if we 
> default to qemu we can also do these new fancy things as well.

Remind me why you can't specify backend=qdisk explicitly? Does libvirt
not propagate anything like that?

Ian

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-25 12:43             ` Ian Campbell
@ 2013-04-25 13:22               ` David Scott
  2013-04-25 13:56                 ` Ian Campbell
  0 siblings, 1 reply; 32+ messages in thread
From: David Scott @ 2013-04-25 13:22 UTC (permalink / raw)
  To: Ian Campbell; +Cc: George Dunlap, xen-devel, Stefano Stabellini

On 25/04/13 13:43, Ian Campbell wrote:
> On Thu, 2013-04-25 at 13:12 +0100, Dave Scott wrote:
>> On 25/04/13 12:57, Ian Campbell wrote:
>>> On Thu, 2013-04-25 at 12:55 +0100, Stefano Stabellini wrote:
>>>
>>>> Right, but going back to the original question: if we have in our hands
>>>> something that is not vhd, raw or qcow and blktap2 is available,
>>>> should we really try to pass it to it?
>>>
>>> That's not quite the original question because in that case it was raw,
>>> at least as far as the libxl interface is concerned.
>>>
>>>> We do know that blktap2 only handles qcow, qcow2, raw and vhd (and the
>>>> implementation of the first two formats is too buggy to be used).
>>>>
>>>> Thus I think that the answer is pretty obvious here: we should try with
>>>> QEMU, whose supported format list is way wider.
>>>
>>> Which I think we do, we only try and use blktap for raw and vhd.
>>
>> As well as the disk format dimension (vhd, qcow2 etc) there's also the
>> network disk access protocol (iSCSI, ceph/RBD, sheepdog?). Although both
>> blktap/tapdisk can handle raw, alas all the interesting (perhaps I'm
>> showing my bias here ;-) disk access protocols are in qemu. So if we
>> default to blktap/tapdisk we can only handle raw *files*, whereas if we
>> default to qemu we can also do these new fancy things as well.
>
> Remind me why you can't specify backend=qdisk explicitly? Does libvirt
> not propagate anything like that?

I'll give this a go tomorrow when I'm not at the mercy of dodgy wifi :-) 
 From my reading of the libvirt code it does allow the selection of a 
driverName from the set "tap" "tap2" "file" and "phy". It looks like we 
could define a "qdisk" driverName.

Actually this reminds me of a general concern I had: I worry that people 
who choose to use libvirt will likely use all the default options, and 
not customise their code too much for a specific hypervisor. After all, 
if they wanted to customise their code for xen they would just use libxl 
and use the richer interface. I think we ought to make sure as much 
stuff just works by default as possible. In my ideal world it would be 
possible to take a simple libvirt domain configuration (perhaps one 
which used to run on kvm) and run it on xen, transparently benefiting 
from xen features such as driver domains (and automatic blktap/qemu 
selection?). I think if the domain XML requires too much tweaking then 
people will just not use the features or may give up altogether.

Cheers,
Dave

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-25 13:22               ` David Scott
@ 2013-04-25 13:56                 ` Ian Campbell
  2013-04-26  1:27                   ` Jim Fehlig
  0 siblings, 1 reply; 32+ messages in thread
From: Ian Campbell @ 2013-04-25 13:56 UTC (permalink / raw)
  To: Dave Scott; +Cc: George Dunlap, xen-devel, Stefano Stabellini

On Thu, 2013-04-25 at 14:22 +0100, Dave Scott wrote:
> On 25/04/13 13:43, Ian Campbell wrote:
> > On Thu, 2013-04-25 at 13:12 +0100, Dave Scott wrote:
> >> On 25/04/13 12:57, Ian Campbell wrote:
> >>> On Thu, 2013-04-25 at 12:55 +0100, Stefano Stabellini wrote:
> >>>
> >>>> Right, but going back to the original question: if we have in our hands
> >>>> something that is not vhd, raw or qcow and blktap2 is available,
> >>>> should we really try to pass it to it?
> >>>
> >>> That's not quite the original question because in that case it was raw,
> >>> at least as far as the libxl interface is concerned.
> >>>
> >>>> We do know that blktap2 only handles qcow, qcow2, raw and vhd (and the
> >>>> implementation of the first two formats is too buggy to be used).
> >>>>
> >>>> Thus I think that the answer is pretty obvious here: we should try with
> >>>> QEMU, whose supported format list is way wider.
> >>>
> >>> Which I think we do, we only try and use blktap for raw and vhd.
> >>
> >> As well as the disk format dimension (vhd, qcow2 etc) there's also the
> >> network disk access protocol (iSCSI, ceph/RBD, sheepdog?). Although both
> >> blktap/tapdisk can handle raw, alas all the interesting (perhaps I'm
> >> showing my bias here ;-) disk access protocols are in qemu. So if we
> >> default to blktap/tapdisk we can only handle raw *files*, whereas if we
> >> default to qemu we can also do these new fancy things as well.
> >
> > Remind me why you can't specify backend=qdisk explicitly? Does libvirt
> > not propagate anything like that?
> 
> I'll give this a go tomorrow when I'm not at the mercy of dodgy wifi :-) 
>  From my reading of the libvirt code it does allow the selection of a 
> driverName from the set "tap" "tap2" "file" and "phy".

In libxl the choices are phy, tap and qdisk. There's no file nor any
distinction between tap and tap2. I expect the libxl libvirt driver is
trying to make things easier for people migrating from xend (which is
to be applauded).

> It looks like we could define a "qdisk" driverName.

Yes, I think this would work. WRT to your comment below it looks like
other libvirt drivers call this driver "qemu".

> Actually this reminds me of a general concern I had: I worry that people 
> who choose to use libvirt will likely use all the default options, and 
> not customise their code too much for a specific hypervisor. After all, 
> if they wanted to customise their code for xen they would just use libxl 
> and use the richer interface. I think we ought to make sure as much 
> stuff just works by default as possible. In my ideal world it would be 
> possible to take a simple libvirt domain configuration (perhaps one 
> which used to run on kvm) and run it on xen, transparently benefiting 
> from xen features such as driver domains (and automatic blktap/qemu 
> selection?). I think if the domain XML requires too much tweaking then 
> people will just not use the features or may give up altogether.

This would be ideal, I seem to remember hearing that disk configuration
was something which libvirt didn't abstract, tending to just expose
underlying configuration strings of the underlying toolstack. So this is
one area where you actually have to change things when you migrate
between hypervisors or they won't work at all.

I may be talking rubbish or be years out of date with that though.

Ian.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-25 11:55       ` Stefano Stabellini
  2013-04-25 11:57         ` Ian Campbell
@ 2013-04-25 18:26         ` Sylvain Munaut
  1 sibling, 0 replies; 32+ messages in thread
From: Sylvain Munaut @ 2013-04-25 18:26 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: George Dunlap, xen-devel, Ian Campbell, Dave Scott

Hi,

> We do know that blktap2 only handles qcow, qcow2, raw and vhd (and the
> implementation of the first two formats is too buggy to be used).

Well, jfyi I have a working RBD driver for blktap2 ...

Cheers,

    Sylvain

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-25 13:56                 ` Ian Campbell
@ 2013-04-26  1:27                   ` Jim Fehlig
  2013-04-26  4:50                     ` Jim Fehlig
                                       ` (2 more replies)
  0 siblings, 3 replies; 32+ messages in thread
From: Jim Fehlig @ 2013-04-26  1:27 UTC (permalink / raw)
  To: Ian Campbell; +Cc: George Dunlap, Stefano Stabellini, Dave Scott, xen-devel

Ian Campbell wrote:
> On Thu, 2013-04-25 at 14:22 +0100, Dave Scott wrote:
>   
>> On 25/04/13 13:43, Ian Campbell wrote:
>>     
>>> On Thu, 2013-04-25 at 13:12 +0100, Dave Scott wrote:
>>>       
>>>> On 25/04/13 12:57, Ian Campbell wrote:
>>>>         
>>>>> On Thu, 2013-04-25 at 12:55 +0100, Stefano Stabellini wrote:
>>>>>
>>>>>           
>>>>>> Right, but going back to the original question: if we have in our hands
>>>>>> something that is not vhd, raw or qcow and blktap2 is available,
>>>>>> should we really try to pass it to it?
>>>>>>             
>>>>> That's not quite the original question because in that case it was raw,
>>>>> at least as far as the libxl interface is concerned.
>>>>>
>>>>>           
>>>>>> We do know that blktap2 only handles qcow, qcow2, raw and vhd (and the
>>>>>> implementation of the first two formats is too buggy to be used).
>>>>>>
>>>>>> Thus I think that the answer is pretty obvious here: we should try with
>>>>>> QEMU, whose supported format list is way wider.
>>>>>>             
>>>>> Which I think we do, we only try and use blktap for raw and vhd.
>>>>>           
>>>> As well as the disk format dimension (vhd, qcow2 etc) there's also the
>>>> network disk access protocol (iSCSI, ceph/RBD, sheepdog?). Although both
>>>> blktap/tapdisk can handle raw, alas all the interesting (perhaps I'm
>>>> showing my bias here ;-) disk access protocols are in qemu. So if we
>>>> default to blktap/tapdisk we can only handle raw *files*, whereas if we
>>>> default to qemu we can also do these new fancy things as well.
>>>>         
>>> Remind me why you can't specify backend=qdisk explicitly? Does libvirt
>>> not propagate anything like that?
>>>       
>> I'll give this a go tomorrow when I'm not at the mercy of dodgy wifi :-) 
>>  From my reading of the libvirt code it does allow the selection of a 
>> driverName from the set "tap" "tap2" "file" and "phy".
>>     
>
> In libxl the choices are phy, tap and qdisk. There's no file nor any
> distinction between tap and tap2. I expect the libxl libvirt driver is
> trying to make things easier for people migrating from xend (which is
> to be applauded).
>   

Right.  'file' is mapped to backend 'tap' and format 'raw'.  In fact, I
should just post the code here for review :).  For a bit of context,
l_disk is libvirt's definition, and x_disk is libxl's.  Comments on this
mapping welcome, keeping in mind Ian's point about migrating from xend.

    if (l_disk->driverName) {
        if (STREQ(l_disk->driverName, "tap") ||
            STREQ(l_disk->driverName, "tap2")) {
            switch (l_disk->format) {
            case VIR_STORAGE_FILE_QCOW:
                x_disk->format = LIBXL_DISK_FORMAT_QCOW;
                x_disk->backend = LIBXL_DISK_BACKEND_QDISK;
                break;
            case VIR_STORAGE_FILE_QCOW2:
                x_disk->format = LIBXL_DISK_FORMAT_QCOW2;
                x_disk->backend = LIBXL_DISK_BACKEND_QDISK;
                break;
            case VIR_STORAGE_FILE_VHD:
                x_disk->format = LIBXL_DISK_FORMAT_VHD;
                x_disk->backend = LIBXL_DISK_BACKEND_TAP;
                break;
            case VIR_STORAGE_FILE_NONE:
                /* No subtype specified, default to raw/tap */
            case VIR_STORAGE_FILE_RAW:
                x_disk->format = LIBXL_DISK_FORMAT_RAW;
                x_disk->backend = LIBXL_DISK_BACKEND_TAP;
                break;
            default:
                virReportError(VIR_ERR_INTERNAL_ERROR,
                               _("libxenlight does not support disk
driver %s"),
                              
virStorageFileFormatTypeToString(l_disk->format));
                return -1;
            }
        } else if (STREQ(l_disk->driverName, "file")) {
            x_disk->format = LIBXL_DISK_FORMAT_RAW;
            x_disk->backend = LIBXL_DISK_BACKEND_TAP;
        } else if (STREQ(l_disk->driverName, "phy")) {
            x_disk->format = LIBXL_DISK_FORMAT_RAW;
            x_disk->backend = LIBXL_DISK_BACKEND_PHY;
        } else {
            virReportError(VIR_ERR_INTERNAL_ERROR,
                           _("libxenlight does not support disk driver %s"),
                           l_disk->driverName);
            return -1;
        }
    } else {
        /*
         * If driverName is not specified, default to raw as per
         * xl-disk-configuration.txt in the xen documentation and let
         * libxl pick a suitable backend.
         */
        x_disk->format = LIBXL_DISK_FORMAT_RAW;
        x_disk->backend = LIBXL_DISK_BACKEND_UNKNOWN;
    }

>   
>> It looks like we could define a "qdisk" driverName.
>>     
>
> Yes, I think this would work. WRT to your comment below it looks like
> other libvirt drivers call this driver "qemu".
>   

IMO, we should just extend the above to map driverName 'qemu' to backend
'qdisk'.  But what about the formats?  I though qdisk could handle all
of them, particularly with qemu-upstream, even vmdk?

>   
>> Actually this reminds me of a general concern I had: I worry that people 
>> who choose to use libvirt will likely use all the default options, and 
>> not customise their code too much for a specific hypervisor. After all, 
>> if they wanted to customise their code for xen they would just use libxl 
>> and use the richer interface. I think we ought to make sure as much 
>> stuff just works by default as possible. In my ideal world it would be 
>> possible to take a simple libvirt domain configuration (perhaps one 
>> which used to run on kvm) and run it on xen, transparently benefiting 
>> from xen features such as driver domains (and automatic blktap/qemu 
>> selection?). I think if the domain XML requires too much tweaking then 
>> people will just not use the features or may give up altogether.
>>     
>
> This would be ideal, I seem to remember hearing that disk configuration
> was something which libvirt didn't abstract, tending to just expose
> underlying configuration strings of the underlying toolstack. So this is
> one area where you actually have to change things when you migrate
> between hypervisors or they won't work at all.
>   

For the most part, that is true, unless you were using a minimal
configuration and letting the individual hypervisor driver choose
defaults.  It would probably work with raw images and no driver element
defined, e.g.

  <disk type='file' device='disk'>
    <source file='/path/to/disk.raw'/>
    <target dev='hda' bus='ide/>
  </disk>

Regards,
Jim

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-26  1:27                   ` Jim Fehlig
@ 2013-04-26  4:50                     ` Jim Fehlig
  2013-04-26 10:37                       ` David Scott
  2013-04-26  8:49                     ` Ian Campbell
  2013-04-26 10:10                     ` Stefano Stabellini
  2 siblings, 1 reply; 32+ messages in thread
From: Jim Fehlig @ 2013-04-26  4:50 UTC (permalink / raw)
  To: Ian Campbell; +Cc: George Dunlap, Stefano Stabellini, Dave Scott, xen-devel

[-- Attachment #1: Type: text/plain, Size: 913 bytes --]

Jim Fehlig wrote:
> Ian Campbell wrote:
>   
>> On Thu, 2013-04-25 at 14:22 +0100, Dave Scott wrote:
>>   
>>     
>>> On 25/04/13 13:43, Ian Campbell wrote:
>>>     
>>>       
>>   
>>     
>>> It looks like we could define a "qdisk" driverName.
>>>     
>>>       
>> Yes, I think this would work. WRT to your comment below it looks like
>> other libvirt drivers call this driver "qemu".
>>   
>>     
>
> IMO, we should just extend the above to map driverName 'qemu' to backend
> 'qdisk'.  But what about the formats?  I though qdisk could handle all
> of them, particularly with qemu-upstream, even vmdk?
>   

Something like the attached, which seems to work well for me when
specifying driverName = qemu, e.g.

    <disk type='file' device='disk'>
      <driver name='qemu'/>
      <source file='/var/lib/xen/images/sles11sp2-pv/disk0.raw'/>
      <target dev='xvda' bus='xen'/>
    </disk>

Regards,
Jim


[-- Attachment #2: libvirt-libxl-qdisk.patch --]
[-- Type: text/x-patch, Size: 1581 bytes --]

diff --git a/src/libxl/libxl_conf.c b/src/libxl/libxl_conf.c
index 7e0753a..f549a5d 100644
--- a/src/libxl/libxl_conf.c
+++ b/src/libxl/libxl_conf.c
@@ -505,6 +505,29 @@ libxlMakeDisk(virDomainDiskDefPtr l_disk, libxl_device_disk *x_disk)
         } else if (STREQ(l_disk->driverName, "phy")) {
             x_disk->format = LIBXL_DISK_FORMAT_RAW;
             x_disk->backend = LIBXL_DISK_BACKEND_PHY;
+        } else if (STREQ(l_disk->driverName, "qemu")) {
+            x_disk->backend = LIBXL_DISK_BACKEND_QDISK;
+            switch (l_disk->format) {
+            case VIR_STORAGE_FILE_QCOW:
+                x_disk->format = LIBXL_DISK_FORMAT_QCOW;
+                break;
+            case VIR_STORAGE_FILE_QCOW2:
+                x_disk->format = LIBXL_DISK_FORMAT_QCOW2;
+                break;
+            case VIR_STORAGE_FILE_VHD:
+                x_disk->format = LIBXL_DISK_FORMAT_VHD;
+                break;
+            case VIR_STORAGE_FILE_NONE:
+                /* No subtype specified, default to raw */
+            case VIR_STORAGE_FILE_RAW:
+                x_disk->format = LIBXL_DISK_FORMAT_RAW;
+                break;
+            default:
+                virReportError(VIR_ERR_INTERNAL_ERROR,
+                               _("libxenlight does not support disk format %s"),
+                               virStorageFileFormatTypeToString(l_disk->format));
+                return -1;
+            }
         } else {
             virReportError(VIR_ERR_INTERNAL_ERROR,
                            _("libxenlight does not support disk driver %s"),

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-26  1:27                   ` Jim Fehlig
  2013-04-26  4:50                     ` Jim Fehlig
@ 2013-04-26  8:49                     ` Ian Campbell
  2013-04-26 10:10                     ` Stefano Stabellini
  2 siblings, 0 replies; 32+ messages in thread
From: Ian Campbell @ 2013-04-26  8:49 UTC (permalink / raw)
  To: Jim Fehlig; +Cc: George Dunlap, Stefano Stabellini, Dave Scott, xen-devel

On Fri, 2013-04-26 at 02:27 +0100, Jim Fehlig wrote:
> Ian Campbell wrote:
> > On Thu, 2013-04-25 at 14:22 +0100, Dave Scott wrote:
> >   
> >> On 25/04/13 13:43, Ian Campbell wrote:
> >>     
> >>> On Thu, 2013-04-25 at 13:12 +0100, Dave Scott wrote:
> >>>       
> >>>> On 25/04/13 12:57, Ian Campbell wrote:
> >>>>         
> >>>>> On Thu, 2013-04-25 at 12:55 +0100, Stefano Stabellini wrote:
> >>>>>
> >>>>>           
> >>>>>> Right, but going back to the original question: if we have in our hands
> >>>>>> something that is not vhd, raw or qcow and blktap2 is available,
> >>>>>> should we really try to pass it to it?
> >>>>>>             
> >>>>> That's not quite the original question because in that case it was raw,
> >>>>> at least as far as the libxl interface is concerned.
> >>>>>
> >>>>>           
> >>>>>> We do know that blktap2 only handles qcow, qcow2, raw and vhd (and the
> >>>>>> implementation of the first two formats is too buggy to be used).
> >>>>>>
> >>>>>> Thus I think that the answer is pretty obvious here: we should try with
> >>>>>> QEMU, whose supported format list is way wider.
> >>>>>>             
> >>>>> Which I think we do, we only try and use blktap for raw and vhd.
> >>>>>           
> >>>> As well as the disk format dimension (vhd, qcow2 etc) there's also the
> >>>> network disk access protocol (iSCSI, ceph/RBD, sheepdog?). Although both
> >>>> blktap/tapdisk can handle raw, alas all the interesting (perhaps I'm
> >>>> showing my bias here ;-) disk access protocols are in qemu. So if we
> >>>> default to blktap/tapdisk we can only handle raw *files*, whereas if we
> >>>> default to qemu we can also do these new fancy things as well.
> >>>>         
> >>> Remind me why you can't specify backend=qdisk explicitly? Does libvirt
> >>> not propagate anything like that?
> >>>       
> >> I'll give this a go tomorrow when I'm not at the mercy of dodgy wifi :-) 
> >>  From my reading of the libvirt code it does allow the selection of a 
> >> driverName from the set "tap" "tap2" "file" and "phy".
> >>     
> >
> > In libxl the choices are phy, tap and qdisk. There's no file nor any
> > distinction between tap and tap2. I expect the libxl libvirt driver is
> > trying to make things easier for people migrating from xend (which is
> > to be applauded).
> >   
> 
> Right.  'file' is mapped to backend 'tap' and format 'raw'.  In fact, I
> should just post the code here for review :).  For a bit of context,
> l_disk is libvirt's definition, and x_disk is libxl's.  Comments on this
> mapping welcome, keeping in mind Ian's point about migrating from xend.
> 
>     if (l_disk->driverName) {
>         if (STREQ(l_disk->driverName, "tap") ||
>             STREQ(l_disk->driverName, "tap2")) {
>             switch (l_disk->format) {
>             case VIR_STORAGE_FILE_QCOW:
>                 x_disk->format = LIBXL_DISK_FORMAT_QCOW;
>                 x_disk->backend = LIBXL_DISK_BACKEND_QDISK;
>                 break;
>             case VIR_STORAGE_FILE_QCOW2:
>                 x_disk->format = LIBXL_DISK_FORMAT_QCOW2;
>                 x_disk->backend = LIBXL_DISK_BACKEND_QDISK;
>                 break;
>             case VIR_STORAGE_FILE_VHD:
>                 x_disk->format = LIBXL_DISK_FORMAT_VHD;
>                 x_disk->backend = LIBXL_DISK_BACKEND_TAP;
>                 break;
>             case VIR_STORAGE_FILE_NONE:
>                 /* No subtype specified, default to raw/tap */
>             case VIR_STORAGE_FILE_RAW:
>                 x_disk->format = LIBXL_DISK_FORMAT_RAW;
>                 x_disk->backend = LIBXL_DISK_BACKEND_TAP;
[...]
> >> It looks like we could define a "qdisk" driverName.
> >>     
> >
> > Yes, I think this would work. WRT to your comment below it looks like
> > other libvirt drivers call this driver "qemu".
> >   
> 
> IMO, we should just extend the above to map driverName 'qemu' to backend
> 'qdisk'.  But what about the formats?  I though qdisk could handle all
> of them, particularly with qemu-upstream, even vmdk?

We could certainly add new LIBXL_DISK_FORMAT_FOO to expose other
formats, such as VMDK and the presumably these would map naturally from
the VIR_STORAGE namespace.

I think for qdisk you want a switch much like the above, but the backend
always qdisk.

I could also see an argument for handling the case of an unknown (or
NIL) driver name by defaulting backend to UNKNOWN and setting the format
as best you are able, like you do for tap (and soon qdisk ;-)).

In addition are we at liberty to define driverName="xl" which explicitly
tells us that source file == the full xl spec (which I suppose is really
the xlu spec or something), using the libxlu_disk parser?

Perhaps having LIBXL_DISK_FORMAT as an enum was a mistake and really it
should have been a string with a mapping to backends in a configuration
file, so it could be extensible without needing to recompile the
library. This could be something to consider for a future libxl.

> >> Actually this reminds me of a general concern I had: I worry that people 
> >> who choose to use libvirt will likely use all the default options, and 
> >> not customise their code too much for a specific hypervisor. After all, 
> >> if they wanted to customise their code for xen they would just use libxl 
> >> and use the richer interface. I think we ought to make sure as much 
> >> stuff just works by default as possible. In my ideal world it would be 
> >> possible to take a simple libvirt domain configuration (perhaps one 
> >> which used to run on kvm) and run it on xen, transparently benefiting 
> >> from xen features such as driver domains (and automatic blktap/qemu 
> >> selection?). I think if the domain XML requires too much tweaking then 
> >> people will just not use the features or may give up altogether.
> >>     
> >
> > This would be ideal, I seem to remember hearing that disk configuration
> > was something which libvirt didn't abstract, tending to just expose
> > underlying configuration strings of the underlying toolstack. So this is
> > one area where you actually have to change things when you migrate
> > between hypervisors or they won't work at all.
> >   
> 
> For the most part, that is true,

Glad to hear I'm not telling too blatant lies ;-)

Ian.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-26  1:27                   ` Jim Fehlig
  2013-04-26  4:50                     ` Jim Fehlig
  2013-04-26  8:49                     ` Ian Campbell
@ 2013-04-26 10:10                     ` Stefano Stabellini
  2013-04-26 10:13                       ` Ian Campbell
                                         ` (2 more replies)
  2 siblings, 3 replies; 32+ messages in thread
From: Stefano Stabellini @ 2013-04-26 10:10 UTC (permalink / raw)
  To: Jim Fehlig
  Cc: George Dunlap, xen-devel, Ian Campbell, Stefano Stabellini,
	Dave Scott

On Fri, 26 Apr 2013, Jim Fehlig wrote:
>     if (l_disk->driverName) {
>         if (STREQ(l_disk->driverName, "tap") ||
>             STREQ(l_disk->driverName, "tap2")) {
>             switch (l_disk->format) {
>             case VIR_STORAGE_FILE_QCOW:
>                 x_disk->format = LIBXL_DISK_FORMAT_QCOW;
>                 x_disk->backend = LIBXL_DISK_BACKEND_QDISK;
>                 break;
>             case VIR_STORAGE_FILE_QCOW2:
>                 x_disk->format = LIBXL_DISK_FORMAT_QCOW2;
>                 x_disk->backend = LIBXL_DISK_BACKEND_QDISK;
>                 break;
>             case VIR_STORAGE_FILE_VHD:
>                 x_disk->format = LIBXL_DISK_FORMAT_VHD;
>                 x_disk->backend = LIBXL_DISK_BACKEND_TAP;
>                 break;
>             case VIR_STORAGE_FILE_NONE:
>                 /* No subtype specified, default to raw/tap */
>             case VIR_STORAGE_FILE_RAW:
>                 x_disk->format = LIBXL_DISK_FORMAT_RAW;
>                 x_disk->backend = LIBXL_DISK_BACKEND_TAP;
>                 break;
>             default:
>                 virReportError(VIR_ERR_INTERNAL_ERROR,
>                                _("libxenlight does not support disk
> driver %s"),
>                               
> virStorageFileFormatTypeToString(l_disk->format));
>                 return -1;
>             }
>         } else if (STREQ(l_disk->driverName, "file")) {
>             x_disk->format = LIBXL_DISK_FORMAT_RAW;
>             x_disk->backend = LIBXL_DISK_BACKEND_TAP;
>         } else if (STREQ(l_disk->driverName, "phy")) {
>             x_disk->format = LIBXL_DISK_FORMAT_RAW;
>             x_disk->backend = LIBXL_DISK_BACKEND_PHY;
>         } else {
>             virReportError(VIR_ERR_INTERNAL_ERROR,
>                            _("libxenlight does not support disk driver %s"),
>                            l_disk->driverName);
>             return -1;
>         }
>     } else {
>         /*
>          * If driverName is not specified, default to raw as per
>          * xl-disk-configuration.txt in the xen documentation and let
>          * libxl pick a suitable backend.
>          */
>         x_disk->format = LIBXL_DISK_FORMAT_RAW;
>         x_disk->backend = LIBXL_DISK_BACKEND_UNKNOWN;
>     }

It looks like the defaults are the same of libxl.

However the mapping of RAW to TAP (libxl does the same) has always been
a bit dubious to me: now that upstream QEMU is used with HVM guests too
by libxl, there is no reason to use blktap over QEMU for raw files any
more.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-26 10:10                     ` Stefano Stabellini
@ 2013-04-26 10:13                       ` Ian Campbell
  2013-04-26 10:28                         ` Stefano Stabellini
  2013-04-26 11:31                       ` Marek Marczykowski
  2013-04-26 14:36                       ` Roger Pau Monné
  2 siblings, 1 reply; 32+ messages in thread
From: Ian Campbell @ 2013-04-26 10:13 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: George Dunlap, Jim Fehlig, Dave Scott, xen-devel

On Fri, 2013-04-26 at 11:10 +0100, Stefano Stabellini wrote:
> On Fri, 26 Apr 2013, Jim Fehlig wrote:
> >     if (l_disk->driverName) {
> >         if (STREQ(l_disk->driverName, "tap") ||
> >             STREQ(l_disk->driverName, "tap2")) {
> >             switch (l_disk->format) {
> >             case VIR_STORAGE_FILE_QCOW:
> >                 x_disk->format = LIBXL_DISK_FORMAT_QCOW;
> >                 x_disk->backend = LIBXL_DISK_BACKEND_QDISK;
> >                 break;
> >             case VIR_STORAGE_FILE_QCOW2:
> >                 x_disk->format = LIBXL_DISK_FORMAT_QCOW2;
> >                 x_disk->backend = LIBXL_DISK_BACKEND_QDISK;
> >                 break;
> >             case VIR_STORAGE_FILE_VHD:
> >                 x_disk->format = LIBXL_DISK_FORMAT_VHD;
> >                 x_disk->backend = LIBXL_DISK_BACKEND_TAP;
> >                 break;
> >             case VIR_STORAGE_FILE_NONE:
> >                 /* No subtype specified, default to raw/tap */
> >             case VIR_STORAGE_FILE_RAW:
> >                 x_disk->format = LIBXL_DISK_FORMAT_RAW;
> >                 x_disk->backend = LIBXL_DISK_BACKEND_TAP;
> >                 break;
> >             default:
> >                 virReportError(VIR_ERR_INTERNAL_ERROR,
> >                                _("libxenlight does not support disk
> > driver %s"),
> >                               
> > virStorageFileFormatTypeToString(l_disk->format));
> >                 return -1;
> >             }
> >         } else if (STREQ(l_disk->driverName, "file")) {
> >             x_disk->format = LIBXL_DISK_FORMAT_RAW;
> >             x_disk->backend = LIBXL_DISK_BACKEND_TAP;
> >         } else if (STREQ(l_disk->driverName, "phy")) {
> >             x_disk->format = LIBXL_DISK_FORMAT_RAW;
> >             x_disk->backend = LIBXL_DISK_BACKEND_PHY;
> >         } else {
> >             virReportError(VIR_ERR_INTERNAL_ERROR,
> >                            _("libxenlight does not support disk driver %s"),
> >                            l_disk->driverName);
> >             return -1;
> >         }
> >     } else {
> >         /*
> >          * If driverName is not specified, default to raw as per
> >          * xl-disk-configuration.txt in the xen documentation and let
> >          * libxl pick a suitable backend.
> >          */
> >         x_disk->format = LIBXL_DISK_FORMAT_RAW;
> >         x_disk->backend = LIBXL_DISK_BACKEND_UNKNOWN;
> >     }
> 
> It looks like the defaults are the same of libxl.
> 
> However the mapping of RAW to TAP (libxl does the same) has always been
> a bit dubious to me: now that upstream QEMU is used with HVM guests too
> by libxl, there is no reason to use blktap over QEMU for raw files any
> more.

There are two TAP+RAW in the above, one is inside an if (driver="tap")
which seems reasonable, are you talking about the one in the
driver="file"?

I think it would be better for libvirt in the file and phy cases to just
say format=raw and leave libxl to pick a backend capable of providing
this.

Changing what libxl does is a separate question, but I think you are
right that a case could be made for preferring qdisk in the default
case.

Ian.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-26 10:13                       ` Ian Campbell
@ 2013-04-26 10:28                         ` Stefano Stabellini
  0 siblings, 0 replies; 32+ messages in thread
From: Stefano Stabellini @ 2013-04-26 10:28 UTC (permalink / raw)
  To: Ian Campbell
  Cc: George Dunlap, Jim Fehlig, xen-devel, Dave Scott,
	Stefano Stabellini

On Fri, 26 Apr 2013, Ian Campbell wrote:
> On Fri, 2013-04-26 at 11:10 +0100, Stefano Stabellini wrote:
> > On Fri, 26 Apr 2013, Jim Fehlig wrote:
> > >     if (l_disk->driverName) {
> > >         if (STREQ(l_disk->driverName, "tap") ||
> > >             STREQ(l_disk->driverName, "tap2")) {
> > >             switch (l_disk->format) {
> > >             case VIR_STORAGE_FILE_QCOW:
> > >                 x_disk->format = LIBXL_DISK_FORMAT_QCOW;
> > >                 x_disk->backend = LIBXL_DISK_BACKEND_QDISK;
> > >                 break;
> > >             case VIR_STORAGE_FILE_QCOW2:
> > >                 x_disk->format = LIBXL_DISK_FORMAT_QCOW2;
> > >                 x_disk->backend = LIBXL_DISK_BACKEND_QDISK;
> > >                 break;
> > >             case VIR_STORAGE_FILE_VHD:
> > >                 x_disk->format = LIBXL_DISK_FORMAT_VHD;
> > >                 x_disk->backend = LIBXL_DISK_BACKEND_TAP;
> > >                 break;
> > >             case VIR_STORAGE_FILE_NONE:
> > >                 /* No subtype specified, default to raw/tap */
> > >             case VIR_STORAGE_FILE_RAW:
> > >                 x_disk->format = LIBXL_DISK_FORMAT_RAW;
> > >                 x_disk->backend = LIBXL_DISK_BACKEND_TAP;
> > >                 break;
> > >             default:
> > >                 virReportError(VIR_ERR_INTERNAL_ERROR,
> > >                                _("libxenlight does not support disk
> > > driver %s"),
> > >                               
> > > virStorageFileFormatTypeToString(l_disk->format));
> > >                 return -1;
> > >             }
> > >         } else if (STREQ(l_disk->driverName, "file")) {
> > >             x_disk->format = LIBXL_DISK_FORMAT_RAW;
> > >             x_disk->backend = LIBXL_DISK_BACKEND_TAP;
> > >         } else if (STREQ(l_disk->driverName, "phy")) {
> > >             x_disk->format = LIBXL_DISK_FORMAT_RAW;
> > >             x_disk->backend = LIBXL_DISK_BACKEND_PHY;
> > >         } else {
> > >             virReportError(VIR_ERR_INTERNAL_ERROR,
> > >                            _("libxenlight does not support disk driver %s"),
> > >                            l_disk->driverName);
> > >             return -1;
> > >         }
> > >     } else {
> > >         /*
> > >          * If driverName is not specified, default to raw as per
> > >          * xl-disk-configuration.txt in the xen documentation and let
> > >          * libxl pick a suitable backend.
> > >          */
> > >         x_disk->format = LIBXL_DISK_FORMAT_RAW;
> > >         x_disk->backend = LIBXL_DISK_BACKEND_UNKNOWN;
> > >     }
> > 
> > It looks like the defaults are the same of libxl.
> > 
> > However the mapping of RAW to TAP (libxl does the same) has always been
> > a bit dubious to me: now that upstream QEMU is used with HVM guests too
> > by libxl, there is no reason to use blktap over QEMU for raw files any
> > more.
> 
> There are two TAP+RAW in the above, one is inside an if (driver="tap")
> which seems reasonable, are you talking about the one in the
> driver="file"?

Yes

> I think it would be better for libvirt in the file and phy cases to just
> say format=raw and leave libxl to pick a backend capable of providing
> this.

I agree

> Changing what libxl does is a separate question, but I think you are
> right that a case could be made for preferring qdisk in the default
> case.

Indeed

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-26  4:50                     ` Jim Fehlig
@ 2013-04-26 10:37                       ` David Scott
  2013-04-26 23:44                         ` Jim Fehlig
       [not found]                         ` <517B1170.4020300@suse.com>
  0 siblings, 2 replies; 32+ messages in thread
From: David Scott @ 2013-04-26 10:37 UTC (permalink / raw)
  To: Jim Fehlig; +Cc: George Dunlap, Stefano Stabellini, Ian Campbell, xen-devel

[-- Attachment #1: Type: text/plain, Size: 1798 bytes --]

On 26/04/13 05:50, Jim Fehlig wrote:
> Jim Fehlig wrote:
>> Ian Campbell wrote:
>>
>>> On Thu, 2013-04-25 at 14:22 +0100, Dave Scott wrote:
>>>
>>>
>>>> On 25/04/13 13:43, Ian Campbell wrote:
>>>>
>>>>
>>>
>>>
>>>> It looks like we could define a "qdisk" driverName.
>>>>
>>>>
>>> Yes, I think this would work. WRT to your comment below it looks like
>>> other libvirt drivers call this driver "qemu".
>>>
>>>
>>
>> IMO, we should just extend the above to map driverName 'qemu' to backend
>> 'qdisk'.  But what about the formats?  I though qdisk could handle all
>> of them, particularly with qemu-upstream, even vmdk?
>>
>
> Something like the attached, which seems to work well for me when
> specifying driverName = qemu, e.g.
>
>      <disk type='file' device='disk'>
>        <driver name='qemu'/>
>        <source file='/var/lib/xen/images/sles11sp2-pv/disk0.raw'/>
>        <target dev='xvda' bus='xen'/>
>      </disk>

This also works for me!

On a related note, what do you think about the attached patch? It allows 
the user to select a non-default qemu via the <emulator> element. My 
domain XML looked like this:

   <devices>
     <emulator>/usr/lib/xen/bin/qemu-system-i386</emulator>
     <disk device="disk" type="network">
       <driver name='qemu'/>
       <source protocol="rbd" name="rbd:rbd/ubuntu1204.img"/>
       <target dev="hda"/>
     </disk>
     <graphics type="vnc" port="-1" autoport="yes" listen="0.0.0.0"/>
   </devices>

Now that upstream qemu is the default in xen-4.3, it seems useful to be 
able to select the traditional qemu for older VMs.

Also I backported this to my xen-4.2 system and used this patch + your 
patch + the previous 'stat()' fix to successfully start a VM on ceph 
storage via libvirt + libxl (my quest is almost complete!)

Cheers,
Dave



[-- Attachment #2: libxl-make-emulator.patch --]
[-- Type: text/x-patch, Size: 1378 bytes --]

diff --git a/src/libxl/libxl_conf.c b/src/libxl/libxl_conf.c
index f549a5d..abbd3c0 100644
--- a/src/libxl/libxl_conf.c
+++ b/src/libxl/libxl_conf.c
@@ -811,6 +811,30 @@ libxlMakeCapabilities(libxl_ctx *ctx)
 }
 
 int
+libxlMakeEmulator(virDomainDefPtr def, libxl_domain_config *d_config)
+{
+    /* No explicit override means use the default */
+    if (!def->emulator) {
+        return 0;
+    }
+    if (strstr(def->emulator, "/qemu-system-")) {
+        d_config->b_info.device_model_version =
+            LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN;
+        return 0;
+    }
+    if (strstr(def->emulator, "/qemu-dm")) {
+        d_config->b_info.device_model_version =
+            LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL;
+        return 0;
+    }
+    virReportError(VIR_ERR_INTERNAL_ERROR,
+                   _("libxenlight doesn't support emulator '%s'"),
+                   def->emulator);
+    return -1;
+}
+
+
+int
 libxlBuildDomainConfig(libxlDriverPrivatePtr driver,
                        virDomainDefPtr def, libxl_domain_config *d_config)
 {
@@ -834,6 +858,10 @@ libxlBuildDomainConfig(libxlDriverPrivatePtr driver,
         goto error;
     }
 
+    if (libxlMakeEmulator(def, d_config) < 0) {
+        goto error;
+    }
+
     d_config->on_reboot = def->onReboot;
     d_config->on_poweroff = def->onPoweroff;
     d_config->on_crash = def->onCrash;

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-26 10:10                     ` Stefano Stabellini
  2013-04-26 10:13                       ` Ian Campbell
@ 2013-04-26 11:31                       ` Marek Marczykowski
  2013-04-26 11:40                         ` Ian Campbell
  2013-04-26 11:45                         ` Stefano Stabellini
  2013-04-26 14:36                       ` Roger Pau Monné
  2 siblings, 2 replies; 32+ messages in thread
From: Marek Marczykowski @ 2013-04-26 11:31 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: George Dunlap, Jim Fehlig, Dave Scott, Ian Campbell, xen-devel


[-- Attachment #1.1.1: Type: text/plain, Size: 3371 bytes --]

On 26.04.2013 12:10, Stefano Stabellini wrote:
> On Fri, 26 Apr 2013, Jim Fehlig wrote:
>>     if (l_disk->driverName) {
>>         if (STREQ(l_disk->driverName, "tap") ||
>>             STREQ(l_disk->driverName, "tap2")) {
>>             switch (l_disk->format) {
>>             case VIR_STORAGE_FILE_QCOW:
>>                 x_disk->format = LIBXL_DISK_FORMAT_QCOW;
>>                 x_disk->backend = LIBXL_DISK_BACKEND_QDISK;
>>                 break;
>>             case VIR_STORAGE_FILE_QCOW2:
>>                 x_disk->format = LIBXL_DISK_FORMAT_QCOW2;
>>                 x_disk->backend = LIBXL_DISK_BACKEND_QDISK;
>>                 break;
>>             case VIR_STORAGE_FILE_VHD:
>>                 x_disk->format = LIBXL_DISK_FORMAT_VHD;
>>                 x_disk->backend = LIBXL_DISK_BACKEND_TAP;
>>                 break;
>>             case VIR_STORAGE_FILE_NONE:
>>                 /* No subtype specified, default to raw/tap */
>>             case VIR_STORAGE_FILE_RAW:
>>                 x_disk->format = LIBXL_DISK_FORMAT_RAW;
>>                 x_disk->backend = LIBXL_DISK_BACKEND_TAP;
>>                 break;
>>             default:
>>                 virReportError(VIR_ERR_INTERNAL_ERROR,
>>                                _("libxenlight does not support disk
>> driver %s"),
>>                               
>> virStorageFileFormatTypeToString(l_disk->format));
>>                 return -1;
>>             }
>>         } else if (STREQ(l_disk->driverName, "file")) {
>>             x_disk->format = LIBXL_DISK_FORMAT_RAW;
>>             x_disk->backend = LIBXL_DISK_BACKEND_TAP;
>>         } else if (STREQ(l_disk->driverName, "phy")) {
>>             x_disk->format = LIBXL_DISK_FORMAT_RAW;
>>             x_disk->backend = LIBXL_DISK_BACKEND_PHY;
>>         } else {
>>             virReportError(VIR_ERR_INTERNAL_ERROR,
>>                            _("libxenlight does not support disk driver %s"),
>>                            l_disk->driverName);
>>             return -1;
>>         }
>>     } else {
>>         /*
>>          * If driverName is not specified, default to raw as per
>>          * xl-disk-configuration.txt in the xen documentation and let
>>          * libxl pick a suitable backend.
>>          */
>>         x_disk->format = LIBXL_DISK_FORMAT_RAW;
>>         x_disk->backend = LIBXL_DISK_BACKEND_UNKNOWN;
>>     }
> 
> It looks like the defaults are the same of libxl.
> 
> However the mapping of RAW to TAP (libxl does the same) has always been
> a bit dubious to me: now that upstream QEMU is used with HVM guests too
> by libxl, there is no reason to use blktap over QEMU for raw files any
> more.

What about old good loop+phy based backend for file disk images? I don't want
whole qemu in dom0 for PV domains, only for handling simple disk backend.
Additionally sparse images + loop + phy + mount -o discard in domU makes the
images "auto shrinking". Don't know if qemu is able to do this.

Attached patch, which I currently use for that. If it is close to something
that would be accepted, I will send it in new thread.

BTW I don't want qemu in dom0 for HVM also, but this is another story
(modified stubdom etc) and probably not acceptable in vanilla xen.

-- 
Best Regards / Pozdrawiam,
Marek Marczykowski
Invisible Things Lab

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.1.2: 0001-libxl-allow-PHY-backend-for-files-allocate-loop-devi.patch --]
[-- Type: text/x-patch; name="0001-libxl-allow-PHY-backend-for-files-allocate-loop-devi.patch", Size: 8660 bytes --]

From cbd5831fd5a6a00429957acb6b1d58d7c645f219 Mon Sep 17 00:00:00 2001
From: Marek Marczykowski <marmarek@invisiblethingslab.com>
Date: Tue, 16 Apr 2013 22:20:25 +0200
Subject: [PATCH] libxl: allow PHY backend for files - allocate loop device
Organization: Invisible Things Lab
Cc: Marek Marczykowski <marmarek@invisiblethingslab.com>

This is implementation of /etc/xen/scripts/block behaviour, but without
calling the script, which should be much faster.

Signed-off-by: Marek Marczykowski <marmarek@invisiblethingslab.com>
---
 tools/libxl/libxl.c          |  13 ++++-
 tools/libxl/libxl_device.c   |   8 +++
 tools/libxl/libxl_internal.h |   9 +++
 tools/libxl/libxl_linux.c    | 129 ++++++++++++++++++++++++++++++++++++++++++-
 tools/libxl/libxl_netbsd.c   |  11 ++++
 5 files changed, 168 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index a813a88..4af5591 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -2086,7 +2086,18 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid,
                  */
                 if (!disk->script) {
                     int major, minor;
-                    libxl__device_physdisk_major_minor(dev, &major, &minor);
+                    char *node;
+
+                    node = libxl__loopdev_setup(gc, dev);
+                    if (!node) {
+                        /* error already reported by libxl__setup_loopdev */
+                        goto out;
+                    }
+                    if (node != dev) {
+                        flexarray_append_pair(back, "node", node);
+                    }
+
+                    libxl__device_physdisk_major_minor(node, &major, &minor);
                     flexarray_append_pair(back, "physical-device",
                             libxl__sprintf(gc, "%x:%x", major, minor));
                 }
diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c
index d622826..936961e 100644
--- a/tools/libxl/libxl_device.c
+++ b/tools/libxl/libxl_device.c
@@ -883,6 +883,7 @@ static void device_hotplug(libxl__egc *egc, libxl__ao_device *aodev)
 {
     STATE_AO_GC(aodev->ao);
     char *be_path = libxl__device_backend_path(gc, aodev->dev);
+    char *node;
     char **args = NULL, **env = NULL;
     int rc = 0;
     int hotplug;
@@ -906,6 +907,13 @@ static void device_hotplug(libxl__egc *egc, libxl__ao_device *aodev)
     switch (hotplug) {
     case 0:
         /* no hotplug script to execute */
+        /* check if we have loop device allocated - if so, try to release it */
+        node = libxl__xs_read(gc, XBT_NULL, GCSPRINTF("%s/node", be_path));
+        if (node) {
+            /* ignore errors: if device is already released - no problem, if
+             * EBUSY - can be shared */
+            libxl__loopdev_cleanup(gc, node);
+        }
         goto out;
     case 1:
         /* execute hotplug script */
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 3ba3a21..573e3d1 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1050,6 +1050,15 @@ static inline void libxl__domaindeathcheck_stop(libxl__gc *gc,
  */
 _hidden int libxl__try_phy_backend(mode_t st_mode);
 
+/*
+ * libxl__setup_loopdev - Setup loop device if needed.
+ *
+ * Return its path on success or NULL in case of failure. If no loop device is
+ * needed, return original filename.
+ */
+_hidden char *libxl__loopdev_setup(libxl__gc *gc, char *filename);
+
+_hidden void libxl__loopdev_cleanup(libxl__gc *gc, char *devname);
 
 _hidden char *libxl__devid_to_localdev(libxl__gc *gc, int devid);
 
diff --git a/tools/libxl/libxl_linux.c b/tools/libxl/libxl_linux.c
index 37815eb..115332a 100644
--- a/tools/libxl/libxl_linux.c
+++ b/tools/libxl/libxl_linux.c
@@ -16,16 +16,143 @@
 #include "libxl_osdeps.h" /* must come before any other headers */
 
 #include "libxl_internal.h"
+#include <linux/loop.h>
  
 int libxl__try_phy_backend(mode_t st_mode)
 {
-    if (!S_ISBLK(st_mode)) {
+    if (!S_ISBLK(st_mode) && !S_ISREG(st_mode)) {
         return 0;
     }
 
     return 1;
 }
 
+char *libxl__loopdev_setup(libxl__gc *gc, char *filename) {
+    struct stat st;
+    DIR *sysblock;
+    int dir_fd;
+    int control_fd = -1;
+    int loop_fd = -1;
+    int file_fd = -1;
+    int devnr;
+    struct dirent *d;
+    struct loop_info64  info;
+    char *loop_name = NULL;
+    char *ret = NULL;
+
+    if (stat(filename, &st) < 0) {
+        LOG(ERROR, "unable to stat %s", filename);
+        return NULL;
+    }
+
+    /* do not do anything if filename already is block device */
+    if (S_ISBLK(st.st_mode))
+        return filename;
+
+    /* first check if file has already loop device assigned according to
+     *
+     * losetup implementation readdir on /sys/block + stat on
+     * loop/backing_file is faster than open each loop dev
+     */
+    sysblock = opendir("/sys/block");
+    if (!sysblock) {
+        LOG(ERROR, "unable to open /sys/block");
+        return NULL;
+    }
+
+    dir_fd = dirfd(sysblock);
+
+    while ((d = readdir(sysblock))) {
+        char name[256];
+        struct stat buf;
+
+        if (strcmp(d->d_name, ".") == 0
+                || strcmp(d->d_name, "..") == 0
+                || strncmp(d->d_name, "loop", 4) != 0)
+            continue;
+        snprintf(name, sizeof(name), "%s/loop/backing_file", d->d_name);
+        if (fstatat(dir_fd, name, &buf, 0) == 0) {
+            snprintf(name, sizeof(name), "/dev/%s", d->d_name);
+            loop_fd = open(name, O_RDWR);
+            if (loop_fd < 0) {
+                LOG(ERROR, "unable to open %s", name);
+                goto out;
+            }
+            if (ioctl(loop_fd, LOOP_GET_STATUS64, &info) < 0) {
+                LOG(ERROR, "unable to get %s info: %s", name, strerror(errno));
+                goto out;
+            }
+            if (info.lo_inode == st.st_ino && info.lo_device == st.st_dev) {
+                /* found */
+                ret = libxl__strdup(gc, name);
+                goto out;
+            }
+        }
+    }
+
+    /* not found, so allocate new one */
+    file_fd = open(filename, O_RDWR);
+    if (file_fd < 0) {
+        LOG(ERROR, "unable to open %s: %s", filename, strerror(errno));
+        goto out;
+    }
+
+    control_fd = open("/dev/loop-control", O_RDWR);
+    if (control_fd < 0) {
+        LOG(ERROR, "unable to open /dev/loop-control: %s", strerror(errno));
+        goto out;
+    }
+    devnr = ioctl(control_fd, LOOP_CTL_GET_FREE);
+    if (devnr < 0) {
+        LOG(ERROR, "unable to get free loop device: %s", strerror(errno));
+        goto out;
+    }
+    loop_name = libxl__sprintf(gc, "/dev/loop%d", devnr);
+    loop_fd = open(loop_name, O_RDWR);
+    if (loop_fd < 0) {
+        LOG(ERROR, "unable to open %s: %s", loop_name, strerror(errno));
+        goto out;
+    }
+    if (ioctl(loop_fd, LOOP_SET_FD, file_fd) < 0) {
+        LOG(ERROR, "unable to set loop backing store: %s", strerror(errno));
+        goto out;
+    }
+    ret = loop_name;
+
+out:
+    if (file_fd >= 0)
+        close(file_fd);
+    if (control_fd >= 0)
+        close(control_fd);
+    if (loop_fd >= 0)
+        close(loop_fd);
+    if (sysblock)
+        closedir(sysblock);
+
+    return ret;
+}
+
+void libxl__loopdev_cleanup(libxl__gc *gc, char *devname)
+{
+    int loop_fd;
+
+    loop_fd = open(devname, O_RDWR);
+    if (loop_fd < 0) {
+        LOG(ERROR, "unable to open device %s: %s", devname, strerror(errno));
+        goto out;
+    }
+
+    if (ioctl(loop_fd, LOOP_CLR_FD) < 0) {
+        LOG(ERROR, "unable to release device %s: %s", devname, strerror(errno));
+        goto out;
+    }
+
+out:
+    if (loop_fd >= 0)
+        close(loop_fd);
+}
+
+
 #define EXT_SHIFT 28
 #define EXTENDED (1<<EXT_SHIFT)
 #define VDEV_IS_EXTENDED(dev) ((dev)&(EXTENDED))
diff --git a/tools/libxl/libxl_netbsd.c b/tools/libxl/libxl_netbsd.c
index 898e160..1e71572 100644
--- a/tools/libxl/libxl_netbsd.c
+++ b/tools/libxl/libxl_netbsd.c
@@ -25,6 +25,17 @@ int libxl__try_phy_backend(mode_t st_mode)
     return 0;
 }
 
+char *libxl__loopdev_setup(libxl__gc *gc, char *filename)
+{
+    /* files supported natively */
+    return filename;
+}
+
+void libxl__loopdev_cleanup(libxl__gc *gc, char *devname)
+{
+    /* nothing to do */
+}
+
 char *libxl__devid_to_localdev(libxl__gc *gc, int devid)
 {
     /* TODO */
-- 
1.8.1.4


[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 553 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-26 11:31                       ` Marek Marczykowski
@ 2013-04-26 11:40                         ` Ian Campbell
  2013-04-26 11:48                           ` Stefano Stabellini
  2013-04-26 11:50                           ` Marek Marczykowski
  2013-04-26 11:45                         ` Stefano Stabellini
  1 sibling, 2 replies; 32+ messages in thread
From: Ian Campbell @ 2013-04-26 11:40 UTC (permalink / raw)
  To: Marek Marczykowski
  Cc: George Dunlap, Jim Fehlig, xen-devel, Dave Scott,
	Stefano Stabellini

On Fri, 2013-04-26 at 12:31 +0100, Marek Marczykowski wrote:
> What about old good loop+phy based backend for file disk images? I don't want
> whole qemu in dom0 for PV domains, only for handling simple disk backend.
> Additionally sparse images + loop + phy + mount -o discard in domU makes the
> images "auto shrinking". Don't know if qemu is able to do this.

IIRC the problem with loop+phy is that loop doesn't do O_DIRECT and
therefore your data isn't actually on the disk when you might think it
is, which can lead to filesystem corruption even if the f/s is doing
correct barriers.

> Attached patch, which I currently use for that. If it is close to something
> that would be accepted, I will send it in new thread.

I think you can use a block script for this (i.e. it does the loop
mount) and avoid patching libxl at all. That's what xend did at least...

Ian.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-26 11:31                       ` Marek Marczykowski
  2013-04-26 11:40                         ` Ian Campbell
@ 2013-04-26 11:45                         ` Stefano Stabellini
  1 sibling, 0 replies; 32+ messages in thread
From: Stefano Stabellini @ 2013-04-26 11:45 UTC (permalink / raw)
  To: Marek Marczykowski
  Cc: Dave Scott, Stefano Stabellini, George Dunlap, xen-devel,
	Jim Fehlig, Ian Campbell

On Fri, 26 Apr 2013, Marek Marczykowski wrote:
> On 26.04.2013 12:10, Stefano Stabellini wrote:
> > On Fri, 26 Apr 2013, Jim Fehlig wrote:
> >>     if (l_disk->driverName) {
> >>         if (STREQ(l_disk->driverName, "tap") ||
> >>             STREQ(l_disk->driverName, "tap2")) {
> >>             switch (l_disk->format) {
> >>             case VIR_STORAGE_FILE_QCOW:
> >>                 x_disk->format = LIBXL_DISK_FORMAT_QCOW;
> >>                 x_disk->backend = LIBXL_DISK_BACKEND_QDISK;
> >>                 break;
> >>             case VIR_STORAGE_FILE_QCOW2:
> >>                 x_disk->format = LIBXL_DISK_FORMAT_QCOW2;
> >>                 x_disk->backend = LIBXL_DISK_BACKEND_QDISK;
> >>                 break;
> >>             case VIR_STORAGE_FILE_VHD:
> >>                 x_disk->format = LIBXL_DISK_FORMAT_VHD;
> >>                 x_disk->backend = LIBXL_DISK_BACKEND_TAP;
> >>                 break;
> >>             case VIR_STORAGE_FILE_NONE:
> >>                 /* No subtype specified, default to raw/tap */
> >>             case VIR_STORAGE_FILE_RAW:
> >>                 x_disk->format = LIBXL_DISK_FORMAT_RAW;
> >>                 x_disk->backend = LIBXL_DISK_BACKEND_TAP;
> >>                 break;
> >>             default:
> >>                 virReportError(VIR_ERR_INTERNAL_ERROR,
> >>                                _("libxenlight does not support disk
> >> driver %s"),
> >>                               
> >> virStorageFileFormatTypeToString(l_disk->format));
> >>                 return -1;
> >>             }
> >>         } else if (STREQ(l_disk->driverName, "file")) {
> >>             x_disk->format = LIBXL_DISK_FORMAT_RAW;
> >>             x_disk->backend = LIBXL_DISK_BACKEND_TAP;
> >>         } else if (STREQ(l_disk->driverName, "phy")) {
> >>             x_disk->format = LIBXL_DISK_FORMAT_RAW;
> >>             x_disk->backend = LIBXL_DISK_BACKEND_PHY;
> >>         } else {
> >>             virReportError(VIR_ERR_INTERNAL_ERROR,
> >>                            _("libxenlight does not support disk driver %s"),
> >>                            l_disk->driverName);
> >>             return -1;
> >>         }
> >>     } else {
> >>         /*
> >>          * If driverName is not specified, default to raw as per
> >>          * xl-disk-configuration.txt in the xen documentation and let
> >>          * libxl pick a suitable backend.
> >>          */
> >>         x_disk->format = LIBXL_DISK_FORMAT_RAW;
> >>         x_disk->backend = LIBXL_DISK_BACKEND_UNKNOWN;
> >>     }
> > 
> > It looks like the defaults are the same of libxl.
> > 
> > However the mapping of RAW to TAP (libxl does the same) has always been
> > a bit dubious to me: now that upstream QEMU is used with HVM guests too
> > by libxl, there is no reason to use blktap over QEMU for raw files any
> > more.
> 
> What about old good loop+phy based backend for file disk images? I don't want
> whole qemu in dom0 for PV domains, only for handling simple disk backend.
> Additionally sparse images + loop + phy + mount -o discard in domU makes the
> images "auto shrinking". Don't know if qemu is able to do this.
> 
> Attached patch, which I currently use for that. If it is close to something
> that would be accepted, I will send it in new thread.
> 
> BTW I don't want qemu in dom0 for HVM also, but this is another story
> (modified stubdom etc) and probably not acceptable in vanilla xen.

The original reason to drop loop+phy from libxl was that it is not safe
as it doesn't support O_DIRECT. It always goes through the Linux page
cache.
However after a lengthy discussion regarding QEMU and O_DIRECT with
Alex, it turns out that we don't actually need O_DIRECT for data
consistency so loop+phy should be safe too after all.

Time to bring back loop support?

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-26 11:40                         ` Ian Campbell
@ 2013-04-26 11:48                           ` Stefano Stabellini
  2013-04-26 13:27                             ` Ian Campbell
  2013-04-26 11:50                           ` Marek Marczykowski
  1 sibling, 1 reply; 32+ messages in thread
From: Stefano Stabellini @ 2013-04-26 11:48 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Dave Scott, Stefano Stabellini, George Dunlap, Marek Marczykowski,
	xen-devel, Jim Fehlig

On Fri, 26 Apr 2013, Ian Campbell wrote:
> On Fri, 2013-04-26 at 12:31 +0100, Marek Marczykowski wrote:
> > What about old good loop+phy based backend for file disk images? I don't want
> > whole qemu in dom0 for PV domains, only for handling simple disk backend.
> > Additionally sparse images + loop + phy + mount -o discard in domU makes the
> > images "auto shrinking". Don't know if qemu is able to do this.
> 
> IIRC the problem with loop+phy is that loop doesn't do O_DIRECT and
> therefore your data isn't actually on the disk when you might think it
> is, which can lead to filesystem corruption even if the f/s is doing
> correct barriers.

If it is safe for QEMU, it should be safe for loop and blkback too.


> > Attached patch, which I currently use for that. If it is close to something
> > that would be accepted, I will send it in new thread.
> 
> I think you can use a block script for this (i.e. it does the loop
> mount) and avoid patching libxl at all. That's what xend did at least...

Yes, libxl would need to arrage the script to be called when "phy" is
used on a file, right?

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-26 11:40                         ` Ian Campbell
  2013-04-26 11:48                           ` Stefano Stabellini
@ 2013-04-26 11:50                           ` Marek Marczykowski
  2013-04-29 21:00                             ` Jim Fehlig
  1 sibling, 1 reply; 32+ messages in thread
From: Marek Marczykowski @ 2013-04-26 11:50 UTC (permalink / raw)
  To: Ian Campbell
  Cc: George Dunlap, Jim Fehlig, xen-devel, Dave Scott,
	Stefano Stabellini


[-- Attachment #1.1: Type: text/plain, Size: 1398 bytes --]

On 26.04.2013 13:40, Ian Campbell wrote:
> On Fri, 2013-04-26 at 12:31 +0100, Marek Marczykowski wrote:
>> What about old good loop+phy based backend for file disk images? I don't want
>> whole qemu in dom0 for PV domains, only for handling simple disk backend.
>> Additionally sparse images + loop + phy + mount -o discard in domU makes the
>> images "auto shrinking". Don't know if qemu is able to do this.
> 
> IIRC the problem with loop+phy is that loop doesn't do O_DIRECT and
> therefore your data isn't actually on the disk when you might think it
> is, which can lead to filesystem corruption even if the f/s is doing
> correct barriers.
> 
>> Attached patch, which I currently use for that. If it is close to something
>> that would be accepted, I will send it in new thread.
> 
> I think you can use a block script for this (i.e. it does the loop
> mount) and avoid patching libxl at all. That's what xend did at least...

This also was solution I've evaluated, but libvirt developers don't want to
hear about disk->script setting in official libvirt, so I still need to patch
some library. Choosing libxl seems to be more universal. Setting loop in libxl
is also much faster than calling script for that.
Perhaps I can rewrite this patch to allow the choice (LIBXL_DISK_BACKEND_LOOP)?

-- 
Best Regards / Pozdrawiam,
Marek Marczykowski
Invisible Things Lab


[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 553 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-26 11:48                           ` Stefano Stabellini
@ 2013-04-26 13:27                             ` Ian Campbell
  2013-04-26 17:10                               ` Stefano Stabellini
  0 siblings, 1 reply; 32+ messages in thread
From: Ian Campbell @ 2013-04-26 13:27 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: George Dunlap, Jim Fehlig, Dave Scott, Marek Marczykowski,
	xen-devel

On Fri, 2013-04-26 at 12:48 +0100, Stefano Stabellini wrote:
> On Fri, 26 Apr 2013, Ian Campbell wrote:
> > On Fri, 2013-04-26 at 12:31 +0100, Marek Marczykowski wrote:
> > > What about old good loop+phy based backend for file disk images? I don't want
> > > whole qemu in dom0 for PV domains, only for handling simple disk backend.
> > > Additionally sparse images + loop + phy + mount -o discard in domU makes the
> > > images "auto shrinking". Don't know if qemu is able to do this.
> > 
> > IIRC the problem with loop+phy is that loop doesn't do O_DIRECT and
> > therefore your data isn't actually on the disk when you might think it
> > is, which can lead to filesystem corruption even if the f/s is doing
> > correct barriers.
> 
> If it is safe for QEMU, it should be safe for loop and blkback too.

qemu (and blkback) will issue correct barriers/syncs to the underlying
storage. AFAIK loop does not.

> > > Attached patch, which I currently use for that. If it is close to something
> > > that would be accepted, I will send it in new thread.
> > 
> > I think you can use a block script for this (i.e. it does the loop
> > mount) and avoid patching libxl at all. That's what xend did at least...
> 
> Yes, libxl would need to arrage the script to be called when "phy" is
> used on a file, right?

I meant the user can pass "script=block-loop". In the full knowledge of
what that means for their data integrity.

Ian.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-26 10:10                     ` Stefano Stabellini
  2013-04-26 10:13                       ` Ian Campbell
  2013-04-26 11:31                       ` Marek Marczykowski
@ 2013-04-26 14:36                       ` Roger Pau Monné
  2 siblings, 0 replies; 32+ messages in thread
From: Roger Pau Monné @ 2013-04-26 14:36 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: George Dunlap, Jim Fehlig, Dave Scott, Ian Campbell, xen-devel

On 26/04/13 12:10, Stefano Stabellini wrote:
> On Fri, 26 Apr 2013, Jim Fehlig wrote:
>>     if (l_disk->driverName) {
>>         if (STREQ(l_disk->driverName, "tap") ||
>>             STREQ(l_disk->driverName, "tap2")) {
>>             switch (l_disk->format) {
>>             case VIR_STORAGE_FILE_QCOW:
>>                 x_disk->format = LIBXL_DISK_FORMAT_QCOW;
>>                 x_disk->backend = LIBXL_DISK_BACKEND_QDISK;
>>                 break;
>>             case VIR_STORAGE_FILE_QCOW2:
>>                 x_disk->format = LIBXL_DISK_FORMAT_QCOW2;
>>                 x_disk->backend = LIBXL_DISK_BACKEND_QDISK;
>>                 break;
>>             case VIR_STORAGE_FILE_VHD:
>>                 x_disk->format = LIBXL_DISK_FORMAT_VHD;
>>                 x_disk->backend = LIBXL_DISK_BACKEND_TAP;
>>                 break;
>>             case VIR_STORAGE_FILE_NONE:
>>                 /* No subtype specified, default to raw/tap */
>>             case VIR_STORAGE_FILE_RAW:
>>                 x_disk->format = LIBXL_DISK_FORMAT_RAW;
>>                 x_disk->backend = LIBXL_DISK_BACKEND_TAP;
>>                 break;
>>             default:
>>                 virReportError(VIR_ERR_INTERNAL_ERROR,
>>                                _("libxenlight does not support disk
>> driver %s"),
>>                               
>> virStorageFileFormatTypeToString(l_disk->format));
>>                 return -1;
>>             }
>>         } else if (STREQ(l_disk->driverName, "file")) {
>>             x_disk->format = LIBXL_DISK_FORMAT_RAW;
>>             x_disk->backend = LIBXL_DISK_BACKEND_TAP;
>>         } else if (STREQ(l_disk->driverName, "phy")) {
>>             x_disk->format = LIBXL_DISK_FORMAT_RAW;
>>             x_disk->backend = LIBXL_DISK_BACKEND_PHY;
>>         } else {
>>             virReportError(VIR_ERR_INTERNAL_ERROR,
>>                            _("libxenlight does not support disk driver %s"),
>>                            l_disk->driverName);
>>             return -1;
>>         }
>>     } else {
>>         /*
>>          * If driverName is not specified, default to raw as per
>>          * xl-disk-configuration.txt in the xen documentation and let
>>          * libxl pick a suitable backend.
>>          */
>>         x_disk->format = LIBXL_DISK_FORMAT_RAW;
>>         x_disk->backend = LIBXL_DISK_BACKEND_UNKNOWN;
>>     }
> 
> It looks like the defaults are the same of libxl.
> 
> However the mapping of RAW to TAP (libxl does the same) has always been
> a bit dubious to me: now that upstream QEMU is used with HVM guests too
> by libxl, there is no reason to use blktap over QEMU for raw files any
> more.

Qemu should indeed be much faster than blktap when using recent Linux
kernels in the DomU due to persistent grants, I don't think blktap can
provide the same performance, although I have not tested it. With Qemu
we can get 900k IOPS.

http://xenbits.xen.org/people/royger/persistent_read_qemu.png

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-26 13:27                             ` Ian Campbell
@ 2013-04-26 17:10                               ` Stefano Stabellini
  2013-04-29  8:23                                 ` Ian Campbell
  0 siblings, 1 reply; 32+ messages in thread
From: Stefano Stabellini @ 2013-04-26 17:10 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Dave Scott, Stefano Stabellini, George Dunlap, Marek Marczykowski,
	xen-devel, Jim Fehlig

On Fri, 26 Apr 2013, Ian Campbell wrote:
> On Fri, 2013-04-26 at 12:48 +0100, Stefano Stabellini wrote:
> > On Fri, 26 Apr 2013, Ian Campbell wrote:
> > > On Fri, 2013-04-26 at 12:31 +0100, Marek Marczykowski wrote:
> > > > What about old good loop+phy based backend for file disk images? I don't want
> > > > whole qemu in dom0 for PV domains, only for handling simple disk backend.
> > > > Additionally sparse images + loop + phy + mount -o discard in domU makes the
> > > > images "auto shrinking". Don't know if qemu is able to do this.
> > > 
> > > IIRC the problem with loop+phy is that loop doesn't do O_DIRECT and
> > > therefore your data isn't actually on the disk when you might think it
> > > is, which can lead to filesystem corruption even if the f/s is doing
> > > correct barriers.
> > 
> > If it is safe for QEMU, it should be safe for loop and blkback too.
> 
> qemu (and blkback) will issue correct barriers/syncs to the underlying
> storage. AFAIK loop does not.

Looking at drivers/block/loop.c, it seems to me that REQ_FUA and
REQ_FLUSH both handled correctly by issuing a vfs_fsync.


> > > > Attached patch, which I currently use for that. If it is close to something
> > > > that would be accepted, I will send it in new thread.
> > > 
> > > I think you can use a block script for this (i.e. it does the loop
> > > mount) and avoid patching libxl at all. That's what xend did at least...
> > 
> > Yes, libxl would need to arrage the script to be called when "phy" is
> > used on a file, right?
> 
> I meant the user can pass "script=block-loop". In the full knowledge of
> what that means for their data integrity.
 
Yeah, but there is no such block-loop script at the moment, the script
is called block and its behaviour depends on the xenstore "type" node:
it must be "file" and libxl never writes "type" "file" to xenstore (see
device_disk_add and libxl__device_disk_string_of_backend).

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-26 10:37                       ` David Scott
@ 2013-04-26 23:44                         ` Jim Fehlig
       [not found]                         ` <517B1170.4020300@suse.com>
  1 sibling, 0 replies; 32+ messages in thread
From: Jim Fehlig @ 2013-04-26 23:44 UTC (permalink / raw)
  To: David Scott
  Cc: George Dunlap, LibVir, xen-devel, Ian Campbell,
	Stefano Stabellini

David Scott wrote:
>> Something like the attached, which seems to work well for me when
>> specifying driverName = qemu, e.g.
>>
>>      <disk type='file' device='disk'>
>>        <driver name='qemu'/>
>>        <source file='/var/lib/xen/images/sles11sp2-pv/disk0.raw'/>
>>        <target dev='xvda' bus='xen'/>
>>      </disk>
>
> This also works for me!

Good to hear.  I'll send the patch to the libvirt list.

>
> On a related note, what do you think about the attached patch? It
> allows the user to select a non-default qemu via the <emulator>
> element. My domain XML looked like this:
>
>   <devices>
>     <emulator>/usr/lib/xen/bin/qemu-system-i386</emulator>

IMO, the possible emulators should be exposed in the capabilities.  E.g.
on a machine with both kvm and qemu, both emulators are shown as
possibilities for an hvm, x86_64 guest

# virsh capabilities
...
<guest>
  <os_type>hvm</os_type>
  <arch name='x86_64'>
  <wordsize>64</wordsize>
  <domain type='qemu'>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <machine>pc-1.1</machine>
    <machine canonical='pc-1.1'>pc</machine>
    <machine>pc-1.0</machine>
    <machine>pc-0.15</machine>
    <machine>pc-0.14</machine>
    <machine>pc-0.13</machine>
    <machine>pc-0.12</machine>
    <machine>pc-0.11</machine>
    <machine>pc-0.10</machine>
    <machine>isapc</machine>
  </domain>
  <domain type='kvm'>
    <emulator>/usr/bin/qemu-kvm</emulator>
    <machine>pc-1.1</machine>
    <machine canonical='pc-1.1'>pc</machine>
    <machine>pc-1.0</machine>
    <machine>pc-0.15</machine>
    <machine>pc-0.14</machine>
    <machine>pc-0.13</machine>
    <machine>pc-0.12</machine>
    <machine>pc-0.11</machine>
    <machine>pc-0.10</machine>
    <machine>isapc</machine>
  </domain>
   ....
</guest>
...

>     <disk device="disk" type="network">
>       <driver name='qemu'/>
>       <source protocol="rbd" name="rbd:rbd/ubuntu1204.img"/>
>       <target dev="hda"/>
>     </disk>
>     <graphics type="vnc" port="-1" autoport="yes" listen="0.0.0.0"/>
>   </devices>
>
> Now that upstream qemu is the default in xen-4.3, it seems useful to
> be able to select the traditional qemu for older VMs.

Agreed.  And reporting that both emulators exist via the capabilities
would be helpful for users.

>
> Also I backported this to my xen-4.2 system and used this patch + your
> patch + the previous 'stat()' fix to successfully start a VM on ceph
> storage via libvirt + libxl (my quest is almost complete!)

Nice!

> diff --git a/src/libxl/libxl_conf.c b/src/libxl/libxl_conf.c
> index f549a5d..abbd3c0 100644
> --- a/src/libxl/libxl_conf.c
> +++ b/src/libxl/libxl_conf.c
> @@ -811,6 +811,30 @@ libxlMakeCapabilities(libxl_ctx *ctx)
>  }
>  
>  int
> +libxlMakeEmulator(virDomainDefPtr def, libxl_domain_config *d_config)
> +{
> +    /* No explicit override means use the default */
> +    if (!def->emulator) {
> +        return 0;
> +    }
> +    if (strstr(def->emulator, "/qemu-system-")) {
> +        d_config->b_info.device_model_version =
> +            LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN;
> +        return 0;
> +    }
>   

Here we could check the requested emulator against the capabilities, and
then do the proper mapping for device_model_version.

Do you have time for an upstream libvirt patch to expose the possible
emulators in the capabilities, along with this patch allowing the user
to specify one?

Regards,
Jim

> +    if (strstr(def->emulator, "/qemu-dm")) {
> +        d_config->b_info.device_model_version =
> +            LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL;
> +        return 0;
> +    }
> +    virReportError(VIR_ERR_INTERNAL_ERROR,
> +                   _("libxenlight doesn't support emulator '%s'"),
> +                   def->emulator);
> +    return -1;
> +}
> +
> +
> +int
>  libxlBuildDomainConfig(libxlDriverPrivatePtr driver,
>                         virDomainDefPtr def, libxl_domain_config *d_config)
>  {
> @@ -834,6 +858,10 @@ libxlBuildDomainConfig(libxlDriverPrivatePtr driver,
>          goto error;
>      }
>  
> +    if (libxlMakeEmulator(def, d_config) < 0) {
> +        goto error;
> +    }
> +
>      d_config->on_reboot = def->onReboot;
>      d_config->on_poweroff = def->onPoweroff;
>      d_config->on_crash = def->onCrash;
>   
> ------------------------------------------------------------------------

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-26 17:10                               ` Stefano Stabellini
@ 2013-04-29  8:23                                 ` Ian Campbell
  2013-04-29 10:07                                   ` Stefano Stabellini
  0 siblings, 1 reply; 32+ messages in thread
From: Ian Campbell @ 2013-04-29  8:23 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: George Dunlap, Jim Fehlig, Dave Scott, Marek Marczykowski,
	xen-devel

On Fri, 2013-04-26 at 18:10 +0100, Stefano Stabellini wrote:
> On Fri, 26 Apr 2013, Ian Campbell wrote:
> > On Fri, 2013-04-26 at 12:48 +0100, Stefano Stabellini wrote:
> > > On Fri, 26 Apr 2013, Ian Campbell wrote:
> > > > On Fri, 2013-04-26 at 12:31 +0100, Marek Marczykowski wrote:
> > > > > What about old good loop+phy based backend for file disk images? I don't want
> > > > > whole qemu in dom0 for PV domains, only for handling simple disk backend.
> > > > > Additionally sparse images + loop + phy + mount -o discard in domU makes the
> > > > > images "auto shrinking". Don't know if qemu is able to do this.
> > > > 
> > > > IIRC the problem with loop+phy is that loop doesn't do O_DIRECT and
> > > > therefore your data isn't actually on the disk when you might think it
> > > > is, which can lead to filesystem corruption even if the f/s is doing
> > > > correct barriers.
> > > 
> > > If it is safe for QEMU, it should be safe for loop and blkback too.
> > 
> > qemu (and blkback) will issue correct barriers/syncs to the underlying
> > storage. AFAIK loop does not.
> 
> Looking at drivers/block/loop.c, it seems to me that REQ_FUA and
> REQ_FLUSH both handled correctly by issuing a vfs_fsync.

Oh, good, either it's been fixed (do we know when?) or I was confusing
things with O_DIRECT.

> > > > > Attached patch, which I currently use for that. If it is close to something
> > > > > that would be accepted, I will send it in new thread.
> > > > 
> > > > I think you can use a block script for this (i.e. it does the loop
> > > > mount) and avoid patching libxl at all. That's what xend did at least...
> > > 
> > > Yes, libxl would need to arrage the script to be called when "phy" is
> > > used on a file, right?
> > 
> > I meant the user can pass "script=block-loop". In the full knowledge of
> > what that means for their data integrity.
>  
> Yeah, but there is no such block-loop script at the moment, the script
> is called block and its behaviour depends on the xenstore "type" node:
> it must be "file" and libxl never writes "type" "file" to xenstore (see
> device_disk_add and libxl__device_disk_string_of_backend).

A block-loop script would be a matter of a few minutes work based on the
current block script, or the current block script could be modified to
stat the path and DTRT when handling a phy type device.

Ian.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
       [not found]                         ` <517B1170.4020300@suse.com>
@ 2013-04-29  9:41                           ` David Scott
  0 siblings, 0 replies; 32+ messages in thread
From: David Scott @ 2013-04-29  9:41 UTC (permalink / raw)
  To: Jim Fehlig
  Cc: George Dunlap, LibVir, xen-devel, Ian Campbell,
	Stefano Stabellini

Hi Jim,

Thanks for the explanation about the capabilities.

On 27/04/13 00:44, Jim Fehlig wrote:
> Do you have time for an upstream libvirt patch to expose the possible
> emulators in the capabilities, along with this patch allowing the user
> to specify one?

I'll have a go and send you (cc'ing the libvirt list) what I manage to 
come up with.

Cheers,
Dave

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-29  8:23                                 ` Ian Campbell
@ 2013-04-29 10:07                                   ` Stefano Stabellini
  0 siblings, 0 replies; 32+ messages in thread
From: Stefano Stabellini @ 2013-04-29 10:07 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Dave Scott, Stefano Stabellini, George Dunlap, Marek Marczykowski,
	xen-devel, Jim Fehlig

On Mon, 29 Apr 2013, Ian Campbell wrote:
> On Fri, 2013-04-26 at 18:10 +0100, Stefano Stabellini wrote:
> > On Fri, 26 Apr 2013, Ian Campbell wrote:
> > > On Fri, 2013-04-26 at 12:48 +0100, Stefano Stabellini wrote:
> > > > On Fri, 26 Apr 2013, Ian Campbell wrote:
> > > > > On Fri, 2013-04-26 at 12:31 +0100, Marek Marczykowski wrote:
> > > > > > What about old good loop+phy based backend for file disk images? I don't want
> > > > > > whole qemu in dom0 for PV domains, only for handling simple disk backend.
> > > > > > Additionally sparse images + loop + phy + mount -o discard in domU makes the
> > > > > > images "auto shrinking". Don't know if qemu is able to do this.
> > > > > 
> > > > > IIRC the problem with loop+phy is that loop doesn't do O_DIRECT and
> > > > > therefore your data isn't actually on the disk when you might think it
> > > > > is, which can lead to filesystem corruption even if the f/s is doing
> > > > > correct barriers.
> > > > 
> > > > If it is safe for QEMU, it should be safe for loop and blkback too.
> > > 
> > > qemu (and blkback) will issue correct barriers/syncs to the underlying
> > > storage. AFAIK loop does not.
> > 
> > Looking at drivers/block/loop.c, it seems to me that REQ_FUA and
> > REQ_FLUSH both handled correctly by issuing a vfs_fsync.
> 
> Oh, good, either it's been fixed (do we know when?) or I was confusing
> things with O_DIRECT.
> 
> > > > > > Attached patch, which I currently use for that. If it is close to something
> > > > > > that would be accepted, I will send it in new thread.
> > > > > 
> > > > > I think you can use a block script for this (i.e. it does the loop
> > > > > mount) and avoid patching libxl at all. That's what xend did at least...
> > > > 
> > > > Yes, libxl would need to arrage the script to be called when "phy" is
> > > > used on a file, right?
> > > 
> > > I meant the user can pass "script=block-loop". In the full knowledge of
> > > what that means for their data integrity.
> >  
> > Yeah, but there is no such block-loop script at the moment, the script
> > is called block and its behaviour depends on the xenstore "type" node:
> > it must be "file" and libxl never writes "type" "file" to xenstore (see
> > device_disk_add and libxl__device_disk_string_of_backend).
> 
> A block-loop script would be a matter of a few minutes work based on the
> current block script, or the current block script could be modified to
> stat the path and DTRT when handling a phy type device.

I vote for having this in 4.3

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: libvirt, libxl and QDISKs
  2013-04-26 11:50                           ` Marek Marczykowski
@ 2013-04-29 21:00                             ` Jim Fehlig
  0 siblings, 0 replies; 32+ messages in thread
From: Jim Fehlig @ 2013-04-29 21:00 UTC (permalink / raw)
  To: Marek Marczykowski
  Cc: George Dunlap, Stefano Stabellini, Dave Scott, Ian Campbell,
	xen-devel

Marek Marczykowski wrote:
> On 26.04.2013 13:40, Ian Campbell wrote:
>   
>> On Fri, 2013-04-26 at 12:31 +0100, Marek Marczykowski wrote:
>>     
>>> What about old good loop+phy based backend for file disk images? I don't want
>>> whole qemu in dom0 for PV domains, only for handling simple disk backend.
>>> Additionally sparse images + loop + phy + mount -o discard in domU makes the
>>> images "auto shrinking". Don't know if qemu is able to do this.
>>>       
>> IIRC the problem with loop+phy is that loop doesn't do O_DIRECT and
>> therefore your data isn't actually on the disk when you might think it
>> is, which can lead to filesystem corruption even if the f/s is doing
>> correct barriers.
>>
>>     
>>> Attached patch, which I currently use for that. If it is close to something
>>> that would be accepted, I will send it in new thread.
>>>       
>> I think you can use a block script for this (i.e. it does the loop
>> mount) and avoid patching libxl at all. That's what xend did at least...
>>     
>
> This also was solution I've evaluated, but libvirt developers don't want to
> hear about disk->script setting in official libvirt, so I still need to patch
> some library.

Good old loop+phy is supported by the legacy libvirt xen driver without
libvirt needing <disk><script /></disk>, generally by using <driver
name='file' />.

>  Choosing libxl seems to be more universal. Setting loop in libxl
> is also much faster than calling script for that.
> Perhaps I can rewrite this patch to allow the choice (LIBXL_DISK_BACKEND_LOOP)?
>   

With such a patch, libvirt could set the backend to
LIBXL_DISK_BACKEND_LOOP when <driver name='file'>, preserving loop+phy
backend setup done by the old toolstack.

Regards,
Jim

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2013-04-29 21:00 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-04-24 14:22 libvirt, libxl and QDISKs David Scott
2013-04-24 15:55 ` Stefano Stabellini
2013-04-25  8:55 ` Ian Campbell
2013-04-25 10:36   ` George Dunlap
2013-04-25 11:33     ` Ian Campbell
2013-04-25 11:55       ` Stefano Stabellini
2013-04-25 11:57         ` Ian Campbell
2013-04-25 12:12           ` David Scott
2013-04-25 12:43             ` Ian Campbell
2013-04-25 13:22               ` David Scott
2013-04-25 13:56                 ` Ian Campbell
2013-04-26  1:27                   ` Jim Fehlig
2013-04-26  4:50                     ` Jim Fehlig
2013-04-26 10:37                       ` David Scott
2013-04-26 23:44                         ` Jim Fehlig
     [not found]                         ` <517B1170.4020300@suse.com>
2013-04-29  9:41                           ` David Scott
2013-04-26  8:49                     ` Ian Campbell
2013-04-26 10:10                     ` Stefano Stabellini
2013-04-26 10:13                       ` Ian Campbell
2013-04-26 10:28                         ` Stefano Stabellini
2013-04-26 11:31                       ` Marek Marczykowski
2013-04-26 11:40                         ` Ian Campbell
2013-04-26 11:48                           ` Stefano Stabellini
2013-04-26 13:27                             ` Ian Campbell
2013-04-26 17:10                               ` Stefano Stabellini
2013-04-29  8:23                                 ` Ian Campbell
2013-04-29 10:07                                   ` Stefano Stabellini
2013-04-26 11:50                           ` Marek Marczykowski
2013-04-29 21:00                             ` Jim Fehlig
2013-04-26 11:45                         ` Stefano Stabellini
2013-04-26 14:36                       ` Roger Pau Monné
2013-04-25 18:26         ` Sylvain Munaut

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.