* Getting xen to recognise large disks
@ 2006-11-20 22:40 Robin Bowes
  2006-11-20 23:31 ` Ian Pratt
  0 siblings, 1 reply; 20+ messages in thread
From: Robin Bowes @ 2006-11-20 22:40 UTC (permalink / raw)
  To: xen-devel

Hi,

I've got a large ext3 file system, created on the Dom0 host, that I'd
like to make available to a domU guest.

The filesystem is built like this:

 - 8x500GB SATA drives combined as /dev/md2 (RAID6)
 - /dev/md2 designated an LVM PV
 - Volume Group vg_media created using /dev/md2
 - Logical Volume lv_media created in vg_media
 - ext3 filesystem created on lv_media


I'm using the following disk config in my xen config file:

disk = [ 'phy:vg_host/lv_slim,xvda,w',
         'phy:vg_media/lv_media,xvdb,w', ]


However, /dev/xvdb is not appearing as the correct size in the DomU guest:

From /proc/partitions:

major minor  #blocks  name
 202    16  782819328 xvdb

When I look at the same partition in the host, I see this:
major minor  #blocks  name
   9     2 2930303616 md2

There appears to be a problem in passing the size of the device to the
DomU guest.

Can anyone shed any light on this?

Thanks,

R.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: Getting xen to recognise large disks
  2006-11-20 22:40 Getting xen to recognise large disks Robin Bowes
@ 2006-11-20 23:31 ` Ian Pratt
  2006-11-21  1:01   ` Robin Bowes
  0 siblings, 1 reply; 20+ messages in thread
From: Ian Pratt @ 2006-11-20 23:31 UTC (permalink / raw)
  To: Robin Bowes, xen-devel

>  - 8x500GB SATA drives combined as /dev/md2 (RAID6)
>  - /dev/md2 designated an LVM PV
>  - Volume Group vg_media created using /dev/md2
>  - Logical Volume lv_media created in vg_media
>  - ext3 filesystem created on lv_media
> 
> 
> I'm using the following disk config in my xen config file:
> 
> disk = [ 'phy:vg_host/lv_slim,xvda,w',
>          'phy:vg_media/lv_media,xvdb,w', ]
> 
> 
> However, /dev/xvdb is not appearing as the correct size in the DomU guest:
> 
> From /proc/partitions:
> 
> major minor  #blocks  name
>  202    16  782819328 xvdb
> 
> When I look at the same partition in the host, I see this:
> major minor  #blocks  name
>    9     2 2930303616 md2
> 
> There appears to be a problem in passing the size of the device to the
> DomU guest.

Hmm, 2930303616 - 2^31 = 782819968

Argh -- I can see the problem: see the connect function in blkfront.c.

Fortunately, it can be fixed without an interface change. Just change
'sectors' from an unsigned long to a blkif_sector_t and update the
xenbus_gather to use:  "sectors", "%llu", &sectors 

You'll also need to edit the xenbus_printf (to %llu) in the connect
function in blkback.c too.

Please post a patch!

Thanks,
Ian

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Getting xen to recognise large disks
  2006-11-20 23:31 ` Ian Pratt
@ 2006-11-21  1:01   ` Robin Bowes
  2006-11-21  1:41     ` Robin Bowes
  0 siblings, 1 reply; 20+ messages in thread
From: Robin Bowes @ 2006-11-21  1:01 UTC (permalink / raw)
  To: xen-devel

Ian Pratt wrote:
>> major minor  #blocks  name
>>  202    16  782819328 xvdb
>>
>> When I look at the same partition in the host, I see this:
>> major minor  #blocks  name
>>    9     2 2930303616 md2
>>
>> There appears to be a problem in passing the size of the device to the
>> DomU guest.
> 
> Hmm, 2930303616 - 2^31 = 782819968
> 
> Argh -- I can see the problem: see the connect function in blkfront.c.
> 
> Fortunately, it can be fixed without an interface change. Just change
> 'sectors' from an unsigned long to a blkif_sector_t and update the
> xenbus_gather to use:  "sectors", "%llu", &sectors 
> 
> You'll also need to edit the xenbus_printf (to %llu) in the connect
> function in blkback.c too.
> 
> Please post a patch!

Ian,

I'd love to post a patch, but I'm afraid I'm not a coder.

I'm downloading the SRPMS as I type and I'll give it a go, but it might
be an idea if someone with more coding skills fixes this.

I'll post an update when I've had a go.

R.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Getting xen to recognise large disks
  2006-11-21  1:01   ` Robin Bowes
@ 2006-11-21  1:41     ` Robin Bowes
  2006-11-21  1:47       ` Ian Pratt
  2006-11-21  1:51       ` Daniel P. Berrange
  0 siblings, 2 replies; 20+ messages in thread
From: Robin Bowes @ 2006-11-21  1:41 UTC (permalink / raw)
  To: xen-devel

Robin Bowes wrote:
> I'd love to post a patch, but I'm afraid I'm not a coder.
> 
> I'm downloading the SRPMS as I type and I'll give it a go, but it might
> be an idea if someone with more coding skills fixes this.
> 
> I'll post an update when I've had a go.

OK, I've made the change to blkfront.c but there is no xenbus_printf in
blkback.c so I didn't make that change. (I'm using xen-3.0.3 from the
FC6 SRPM)

I've rebuilt the xen RPM with this patch:

diff -ur xen-3.0.3-rc3/linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c xen-3.0.3-rc3.patched/linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c
--- xen-3.0.3-rc3/linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c	2006-10-10 15:23:43.000000000 +0100
+++ xen-3.0.3-rc3.patched/linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c	2006-11-21 01:10:54.000000000 +0000
@@ -294,7 +294,8 @@
  */
 static void connect(struct blkfront_info *info)
 {
-       unsigned long sectors, sector_size;
+       blkif_sector_t sectors;
+    unsigned long sector_size;
        unsigned int binfo;
        int err;

@@ -305,7 +306,7 @@
        DPRINTK("blkfront.c:connect:%s.\n", info->xbdev->otherend);

        err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
-                           "sectors", "%lu", &sectors,
+                           "sectors", "%llu", &sectors,
                            "info", "%u", &binfo,
                            "sector-size", "%lu", &sector_size,
                            NULL);


I installed the resulting RPMs (xen and xen-libs) and rebooted the dom0
host.

However, the xvdb device still only shows up like this:

major minor  #blocks  name
 202    16  782819328 xvdb

Did I not do it right?

R.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: Re: Getting xen to recognise large disks
  2006-11-21  1:41     ` Robin Bowes
@ 2006-11-21  1:47       ` Ian Pratt
  2006-11-21  1:51       ` Daniel P. Berrange
  1 sibling, 0 replies; 20+ messages in thread
From: Ian Pratt @ 2006-11-21  1:47 UTC (permalink / raw)
  To: Robin Bowes, xen-devel

> Robin Bowes wrote:
> > I'd love to post a patch, but I'm afraid I'm not a coder.
> >
> > I'm downloading the SRPMS as I type and I'll give it a go, but it might
> > be an idea if someone with more coding skills fixes this.
> >
> > I'll post an update when I've had a go.
> 
> OK, I've made the change to blkfront.c but there is no xenbus_printf in
> blkback.c so I didn't make that change. (I'm using xen-3.0.3 from the
> FC6 SRPM)

I meant blkback/xenbus.c

It would also be good to change tools/blktap/lib/xenbus.c for good
measure.

The rest of the patch looks OK, modulo use of tab vs spaces.

Thanks,
Ian

 
> I've rebuilt the xen RPM with this patch:
> 
> diff -ur xen-3.0.3-rc3/linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c xen-3.0.3-rc3.patched/linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c
> --- xen-3.0.3-rc3/linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c	2006-10-10 15:23:43.000000000 +0100
> +++ xen-3.0.3-rc3.patched/linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c	2006-11-21 01:10:54.000000000 +0000
> @@ -294,7 +294,8 @@
>   */
>  static void connect(struct blkfront_info *info)
>  {
> -       unsigned long sectors, sector_size;
> +       blkif_sector_t sectors;
> +    unsigned long sector_size;
>         unsigned int binfo;
>         int err;
> 
> @@ -305,7 +306,7 @@
>         DPRINTK("blkfront.c:connect:%s.\n", info->xbdev->otherend);
> 
>         err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
> -                           "sectors", "%lu", &sectors,
> +                           "sectors", "%llu", &sectors,
>                             "info", "%u", &binfo,
>                             "sector-size", "%lu", &sector_size,
>                             NULL);
> 
> 
> I installed the resulting RPMs (xen and xen-libs) and rebooted the dom0
> host.
> 
> However, the xvdb device still only shows up like this:
> 
> major minor  #blocks  name
>  202    16  782819328 xvdb
> 
> Did I not do it right?
> 
> R.
> 
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Re: Getting xen to recognise large disks
  2006-11-21  1:41     ` Robin Bowes
  2006-11-21  1:47       ` Ian Pratt
@ 2006-11-21  1:51       ` Daniel P. Berrange
  2006-11-21  2:13         ` Robin Bowes
  1 sibling, 1 reply; 20+ messages in thread
From: Daniel P. Berrange @ 2006-11-21  1:51 UTC (permalink / raw)
  To: Robin Bowes; +Cc: xen-devel

On Tue, Nov 21, 2006 at 01:41:07AM +0000, Robin Bowes wrote:
> Robin Bowes wrote:
> > I'd love to post a patch, but I'm afraid I'm not a coder.
> > 
> > I'm downloading the SRPMS as I type and I'll give it a go, but it might
> > be an idea if someone with more coding skills fixes this.
> > 
> > I'll post an update when I've had a go.
> 
> OK, I've made the change to blkfront.c but there is no xenbus_printf in
> blkback.c so I didn't make that change. (I'm using xen-3.0.3 from the
> FC6 SRPM)
> 
> I've rebuilt the xen RPM with this patch:
> 
> diff -ur xen-3.0.3-rc3/linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c xen-3.0.3-rc3.patched/linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c
> --- xen-3.0.3-rc3/linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c	2006-10-10 15:23:43.000000000 +0100
> +++ xen-3.0.3-rc3.patched/linux-2.6-xen-sparse/drivers/xen/blkfront/blkfront.c	2006-11-21 01:10:54.000000000 +0000
> @@ -294,7 +294,8 @@
>   */
>  static void connect(struct blkfront_info *info)
>  {
> -       unsigned long sectors, sector_size;
> +       blkif_sector_t sectors;
> +    unsigned long sector_size;
>         unsigned int binfo;
>         int err;
> 
> @@ -305,7 +306,7 @@
>         DPRINTK("blkfront.c:connect:%s.\n", info->xbdev->otherend);
> 
>         err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
> -                           "sectors", "%lu", &sectors,
> +                           "sectors", "%llu", &sectors,
>                             "info", "%u", &binfo,
>                             "sector-size", "%lu", &sector_size,
>                             NULL);
> 
> 
> I installed the resulting RPMs (xen and xen-libs) and rebooted the dom0
> host.
> However, the xvdb device still only shows up like this:
> 
> major minor  #blocks  name
>  202    16  782819328 xvdb
> 
> Did I not do it right?

I'm afraid not.

blkfront is the device driver for the DomU guest kernel, rather than Dom0.
Also in Fedora, the 'xen' RPM only contains the userspace parts of xen
for Dom0. The hypervisor & kernel itself are in the kernel-xen RPM (which
is one of many built from the kernel SRPM).

FYI, I opened a bugzilla against Fedora to track this problem since I can
also trivially reproduce it by creating a (sparse) 5 TB block device and
exporting it to a guest.

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=216555

On x86_64 meanwhile everything is sizing up correctly - I successfully
exported a 15 PB (yes, PB) block device to a PV guest.

Regards,
Dan.
-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston.  +1 978 392 2496 -=|
|=-           Perl modules: http://search.cpan.org/~danberr/              -=|
|=-               Projects: http://freshmeat.net/~danielpb/               -=|
|=-  GnuPG: 7D3B9505   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505  -=| 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Getting xen to recognise large disks
  2006-11-21  1:51       ` Daniel P. Berrange
@ 2006-11-21  2:13         ` Robin Bowes
  2006-11-21  7:46           ` Keir Fraser
  0 siblings, 1 reply; 20+ messages in thread
From: Robin Bowes @ 2006-11-21  2:13 UTC (permalink / raw)
  To: xen-devel

Daniel P. Berrange wrote:
> blkfront is the device driver for the DomU guest kernel, rather than Dom0.
> Also in Fedora, the 'xen' RPM only contains the userspace parts of xen
> for Dom0. The hypervisor & kernel itself are in the kernel-xen RPM (which
> is one of many built from the kernel SRPM).
> 
> FYI, I opened a bugzilla against Fedora to track this problem since I can
> also trivially reproduce it by creating a (sparse) 5 TB block device and
> exporting it to a guest.
> 
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=216555
> 
> On x86_64 meanwhile everything is sizing up correctly - I successfully
> exported a 15 PB (yes, PB) block device to a PV guest.

OK, I'll track that bug.

This is going to be one of the longest server builds on record! I've got
2x250GB drives mirrored as the "system" drive, plus 8x500GB drives as
the "data" drive, configured in RAID6.

After getting over all the hardware problems, I found that grub only
supports up to 8 disks, so I had to patch it to support 16 disks.

Now I'm finding that xen guests have problems with big disks.

<sigh>

I'll get there eventually!

R.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Re: Getting xen to recognise large disks
  2006-11-21  2:13         ` Robin Bowes
@ 2006-11-21  7:46           ` Keir Fraser
  2006-11-21 11:21             ` Robin Bowes
  0 siblings, 1 reply; 20+ messages in thread
From: Keir Fraser @ 2006-11-21  7:46 UTC (permalink / raw)
  To: Robin Bowes, xen-devel


I'll make a patch today.

 -- Keir

On 21/11/06 2:13 am, "Robin Bowes" <robin-lists@robinbowes.com> wrote:

>> FYI, I opened a bugzilla against Fedora to track this problem since I can
>> also trivially reproduce it by creating a (sparse) 5 TB block device and
>> exporting it to a guest.
>> 
>> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=216555
>> 
>> On x86_64 meanwhile everything is sizing up correctly - I successfully
>> exported a 15 PB (yes, PB) block device to a PV guest.
> 
> OK, I'll track that bug.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Getting xen to recognise large disks
  2006-11-21  7:46           ` Keir Fraser
@ 2006-11-21 11:21             ` Robin Bowes
  2006-11-21 11:34               ` Keir Fraser
  0 siblings, 1 reply; 20+ messages in thread
From: Robin Bowes @ 2006-11-21 11:21 UTC (permalink / raw)
  To: xen-devel

Keir Fraser wrote:
> On 21/11/06 2:13 am, "Robin Bowes" <robin-lists@robinbowes.com> wrote:
> 
>>> FYI, I opened a bugzilla against Fedora to track this problem since I can
>>> also trivially reproduce it by creating a (sparse) 5 TB block device and
>>> exporting it to a guest.
>>>
>>> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=216555
>>>
>>> On x86_64 meanwhile everything is sizing up correctly - I successfully
>>> exported a 15 PB (yes, PB) block device to a PV guest.
>> OK, I'll track that bug.
>
> I'll make a patch today.
>

Thanks Keir, looking forward to testing it.

R.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Re: Getting xen to recognise large disks
  2006-11-21 11:21             ` Robin Bowes
@ 2006-11-21 11:34               ` Keir Fraser
  2006-11-21 11:53                 ` Robin Bowes
  2006-11-21 21:11                 ` Re: Getting xen to recognise large disks Daniel P. Berrange
  0 siblings, 2 replies; 20+ messages in thread
From: Keir Fraser @ 2006-11-21 11:34 UTC (permalink / raw)
  To: Robin Bowes, xen-devel

On 21/11/06 11:21, "Robin Bowes" <robin-lists@robinbowes.com> wrote:

> Keir Fraser wrote:
>> On 21/11/06 2:13 am, "Robin Bowes" <robin-lists@robinbowes.com> wrote:
>> 
>> I'll make a patch today.
>> 
> 
> Thanks Keir, looking forward to testing it.

If you don't mind using the xen-unstable source repository, it's changeset
12496:0c0ef61de06b. It probably hasn't reached the public repository just
yet (should very shortly though).

 -- Keir

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Getting xen to recognise large disks
  2006-11-21 11:34               ` Keir Fraser
@ 2006-11-21 11:53                 ` Robin Bowes
  2006-11-21 11:58                   ` Keir Fraser
  2006-11-21 21:11                 ` Re: Getting xen to recognise large disks Daniel P. Berrange
  1 sibling, 1 reply; 20+ messages in thread
From: Robin Bowes @ 2006-11-21 11:53 UTC (permalink / raw)
  To: xen-devel

Keir Fraser wrote:
> On 21/11/06 11:21, "Robin Bowes" <robin-lists@robinbowes.com> wrote:
> 
>> Keir Fraser wrote:
>>> On 21/11/06 2:13 am, "Robin Bowes" <robin-lists@robinbowes.com> wrote:
>>>
>>> I'll make a patch today.
>>>
>> Thanks Keir, looking forward to testing it.
> 
> If you don't mind using the xen-unstable source repository, it's changeset
> 12496:0c0ef61de06b. It probably hasn't reached the public repository just
> yet (should very shortly though).

Erm, I'm hoping for a modified RPM :)

Is this a kernel-level change, i.e. will I need to rebuild and install a
new kernel? Or is it just the userland tools that have changed?

For example, on my Dom0 host I have the following RPMs installed with
"xen" in their name:

# rpm -qa | grep xen
kernel-xen-2.6.18-1.2798.fc6
xen-3.0.3-0.1.rc3bigdisk
kernel-xen-2.6.18-1.2849.fc6
xen-libs-3.0.3-0.1.rc3bigdisk

(The packages suffixed "bigdisk" are the result of my own attempt to
patch for this problem)

On the guests, I just have the kernel.

So, I'm rather suspecting that I'll need to build a new kernel and
update both the host and the guests?

R.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Re: Getting xen to recognise large disks
  2006-11-21 11:53                 ` Robin Bowes
@ 2006-11-21 11:58                   ` Keir Fraser
  2006-11-21 12:04                     ` Robin Bowes
  2006-11-21 12:14                     ` Xen 3.0.3 Kernel 2.6.18 Claes Lindblom
  0 siblings, 2 replies; 20+ messages in thread
From: Keir Fraser @ 2006-11-21 11:58 UTC (permalink / raw)
  To: Robin Bowes, xen-devel




On 21/11/06 11:53, "Robin Bowes" <robin-lists@robinbowes.com> wrote:

> So, I'm rather suspecting that I'll need to build a new kernel and
> update both the host and the guests?

I'm afraid so!

 -- Keir

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Getting xen to recognise large disks
  2006-11-21 11:58                   ` Keir Fraser
@ 2006-11-21 12:04                     ` Robin Bowes
  2006-11-21 12:36                       ` Daniel P. Berrange
  2006-11-21 12:14                     ` Xen 3.0.3 Kernel 2.6.18 Claes Lindblom
  1 sibling, 1 reply; 20+ messages in thread
From: Robin Bowes @ 2006-11-21 12:04 UTC (permalink / raw)
  To: xen-devel

Keir Fraser wrote:
> 
> 
> On 21/11/06 11:53, "Robin Bowes" <robin-lists@robinbowes.com> wrote:
> 
>> So, I'm rather suspecting that I'll need to build a new kernel and
>> update both the host and the guests?
> 
> I'm afraid so!

Bugger!

I don't suppose you happen to know of a guide to rebuilding the FC6
kernel from SRPMs do you?

R.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Xen 3.0.3 Kernel 2.6.18
  2006-11-21 11:58                   ` Keir Fraser
  2006-11-21 12:04                     ` Robin Bowes
@ 2006-11-21 12:14                     ` Claes Lindblom
  2006-11-21 12:25                       ` Atsushi SAKAI
  1 sibling, 1 reply; 20+ messages in thread
From: Claes Lindblom @ 2006-11-21 12:14 UTC (permalink / raw)
  To: xen-devel

Hi,
Where can I get working patches for the latest kernel 2.6.18.x?

Regards
Claes Lindblom

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Xen 3.0.3 Kernel 2.6.18
  2006-11-21 12:14                     ` Xen 3.0.3 Kernel 2.6.18 Claes Lindblom
@ 2006-11-21 12:25                       ` Atsushi SAKAI
  0 siblings, 0 replies; 20+ messages in thread
From: Atsushi SAKAI @ 2006-11-21 12:25 UTC (permalink / raw)
  To: Claes Lindblom, xen-devel

Hi,

It would be best to look at the FC6 SRPM.
http://download.fedora.redhat.com/pub/fedora/linux/core/development/source/SRPMS/

Thanks
Atushi Sakai

>Hi,
>Where can I get working patches for the latest kernel 2.6.18.x?
>
>Regards
>Claes Lindblom
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Re: Getting xen to recognise large disks
  2006-11-21 12:04                     ` Robin Bowes
@ 2006-11-21 12:36                       ` Daniel P. Berrange
  0 siblings, 0 replies; 20+ messages in thread
From: Daniel P. Berrange @ 2006-11-21 12:36 UTC (permalink / raw)
  To: Robin Bowes; +Cc: xen-devel

On Tue, Nov 21, 2006 at 12:04:32PM +0000, Robin Bowes wrote:
> Keir Fraser wrote:
> > 
> > 
> > On 21/11/06 11:53, "Robin Bowes" <robin-lists@robinbowes.com> wrote:
> > 
> >> So, I'm rather suspecting that I'll need to build a new kernel and
> >> update both the host and the guests?
> > 
> > I'm afraid so!
> 
> Bugger!
> 
> I don't suppose you happen to know of a guide to rebuilding the FC6
> kernel from SRPMs do you?

I'll post some notes to the fedora-xen mailing list, so we don't bother
xen-devel with Fedora specific noise that most people don't need to read.

Regards,
Dan.
-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston.  +1 978 392 2496 -=|
|=-           Perl modules: http://search.cpan.org/~danberr/              -=|
|=-               Projects: http://freshmeat.net/~danielpb/               -=|
|=-  GnuPG: 7D3B9505   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505  -=| 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Re: Getting xen to recognise large disks
  2006-11-21 11:34               ` Keir Fraser
  2006-11-21 11:53                 ` Robin Bowes
@ 2006-11-21 21:11                 ` Daniel P. Berrange
  2006-11-21 22:41                   ` Daniel P. Berrange
  1 sibling, 1 reply; 20+ messages in thread
From: Daniel P. Berrange @ 2006-11-21 21:11 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel, Robin Bowes

On Tue, Nov 21, 2006 at 11:34:45AM +0000, Keir Fraser wrote:
> On 21/11/06 11:21, "Robin Bowes" <robin-lists@robinbowes.com> wrote:
> 
> > Keir Fraser wrote:
> >> On 21/11/06 2:13 am, "Robin Bowes" <robin-lists@robinbowes.com> wrote:
> >> 
> >> I'll make a patch today.
> >> 
> > 
> > Thanks Keir, looking forward to testing it.
> 
> If you don't mind using the xen-unstable source repository, it's changeset
> 12496:0c0ef61de06b. It probably hasn't reached the public repository just
> yet (should very shortly though).

I've tested that changeset with the following

 - phy:  against a 5 TB partition
 - file: against a 7.3 TB file

In both cases the # of sectors matches in Dom0 vs DomU. For good measure
I also ran Stephen Tweedie's verify-data tool [1] in the DomU to verify no
data I/O wraparound issues elsewhere in the code & it passed without
trouble.

Blktap, however, is a different story - it is showing wraparound for disk
size at the 2 TB size mark still. The userspace blktap tools have totally
inconsistent data types. Sometimes using int, sometimes long, sometimes
unsigned long & sometimes uint64. I'm working on a patch which makes it 

 - 'unsigned long long'  for # sectors
 - 'unsigned long'       for sector size
 - 'unsigned int'        for info

This makes it match the data types used in blkfront/blkback exactly.
With this patch applied, the DomU sees correct disk size, however,
the verify-data tool is showing nasty data consistency issues when
writing/reading to such a disk. So I think there is 32-bit wrap
around somewhere in the I/O codepath for blktap. I'll get back when
I've found out more info...

Regards,
Dan.

[1] http://people.redhat.com/sct/src/verify-data/
-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston.  +1 978 392 2496 -=|
|=-           Perl modules: http://search.cpan.org/~danberr/              -=|
|=-               Projects: http://freshmeat.net/~danielpb/               -=|
|=-  GnuPG: 7D3B9505   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505  -=| 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Re: Getting xen to recognise large disks
  2006-11-21 21:11                 ` Re: Getting xen to recognise large disks Daniel P. Berrange
@ 2006-11-21 22:41                   ` Daniel P. Berrange
  2006-11-22 10:42                     ` Robin Bowes
  2006-11-28 21:52                     ` Daniel P. Berrange
  0 siblings, 2 replies; 20+ messages in thread
From: Daniel P. Berrange @ 2006-11-21 22:41 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel, Robin Bowes

[-- Attachment #1: Type: text/plain, Size: 2685 bytes --]

On Tue, Nov 21, 2006 at 09:11:18PM +0000, Daniel P. Berrange wrote:
> On Tue, Nov 21, 2006 at 11:34:45AM +0000, Keir Fraser wrote:
> > On 21/11/06 11:21, "Robin Bowes" <robin-lists@robinbowes.com> wrote:
> > 
> > > Keir Fraser wrote:
> > >> On 21/11/06 2:13 am, "Robin Bowes" <robin-lists@robinbowes.com> wrote:
> > >> 
> > >> I'll make a patch today.
> > >> 
> > > 
> > > Thanks Keir, looking forward to testing it.
> > 
> > If you don't mind using the xen-unstable source repository, it's changeset
> > 12496:0c0ef61de06b. It probably hasn't reached the public repository just
> > yet (should very shortly though).
> 
> I've tested that changeset with the following
> 
>  - phy:  against a 5 TB partition
>  - file: against a 7.3 TB file
> 
> In both cases the # of sectors matches in Dom0 vs DomU. For good measure
> I also ran Stephen Tweedie's verify-data tool in the DomU to verify no
> data I/O wraparound issues elsewhere in the code & it passed without
> trouble.
> 
> Blktap, however, is a different story - it is showing wraparound for disk
> size at the 2 TB size mark still. The userspace blktap tools have totally
> inconsistent data types. Sometimes using int, sometimes long, sometimes
> unsigned long & sometimes uint64. I'm working on a patch which makes it 
> 
>  - 'unsigned long long'  for # sectors
>  - 'unsigned long'       for sector size
>  - 'unsigned int'        for info
> 
> This makes it match the data types used in blkfront/blkback exactly.
> With this patch applied, the DomU sees correct disk size, however,
> the verify-data tool is showing nasty data consistency issues when
> writing/reading to such a disk. So I think there is 32-bit wrap
> around somewhere in the I/O codepath for blktap. I'll get back when
> I've found out more info...

It turns out that blktap wasn't (directly) at fault here. I was storing my
file based disk images on an XFS formatted partition in the host. Well it
appears that XFS doesn't play nice with the async I/O + O_DIRECT options
that blktap likes so all your data goes to /dev/null :-)

I re-tested blktap + large file backed disks on ext3 & GFS and everything
is working as expected. So stay away from a XFS+blktap combo if you like 
your data :-)


Attaching the patch to blktap to fix 32-bit wraparound of sector counts.

 Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

Regards,
Dan.
-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston.  +1 978 392 2496 -=|
|=-           Perl modules: http://search.cpan.org/~danberr/              -=|
|=-               Projects: http://freshmeat.net/~danielpb/               -=|
|=-  GnuPG: 7D3B9505   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505  -=| 

[-- Attachment #2: blktap-2tb-2.patch --]
[-- Type: text/plain, Size: 3206 bytes --]

diff -r 00ed59a6f043 tools/blktap/drivers/blktapctrl.c
--- a/tools/blktap/drivers/blktapctrl.c	Tue Nov 21 10:22:19 2006 +0000
+++ b/tools/blktap/drivers/blktapctrl.c	Tue Nov 21 14:53:39 2006 -0500
@@ -420,7 +420,7 @@ static int read_msg(int fd, int msgtype,
 			image->secsize = img->secsize;
 			image->info = img->info;
 
-			DPRINTF("Received CTLMSG_IMG: %lu, %lu, %lu\n",
+			DPRINTF("Received CTLMSG_IMG: %llu, %lu, %u\n",
 				image->size, image->secsize, image->info);
 			if(msgtype != CTLMSG_IMG) ret = 0;
 			break;
diff -r 00ed59a6f043 tools/blktap/drivers/blktapctrl.h
--- a/tools/blktap/drivers/blktapctrl.h	Tue Nov 21 10:22:19 2006 +0000
+++ b/tools/blktap/drivers/blktapctrl.h	Tue Nov 21 14:53:39 2006 -0500
@@ -30,19 +30,19 @@
  */
 
 
-static inline long int tapdisk_get_size(blkif_t *blkif)
+static inline unsigned long long tapdisk_get_size(blkif_t *blkif)
 {
 	image_t *img = (image_t *)blkif->prv;
 	return img->size;
 }
 
-static inline long int tapdisk_get_secsize(blkif_t *blkif)
+static inline unsigned long tapdisk_get_secsize(blkif_t *blkif)
 {
 	image_t *img = (image_t *)blkif->prv;
 	return img->secsize;
 }
 
-static inline unsigned tapdisk_get_info(blkif_t *blkif)
+static inline unsigned int tapdisk_get_info(blkif_t *blkif)
 {
 	image_t *img = (image_t *)blkif->prv;
 	return img->info;
diff -r 00ed59a6f043 tools/blktap/drivers/tapdisk.h
--- a/tools/blktap/drivers/tapdisk.h	Tue Nov 21 10:22:19 2006 +0000
+++ b/tools/blktap/drivers/tapdisk.h	Tue Nov 21 14:53:39 2006 -0500
@@ -74,9 +74,9 @@ struct td_state {
 	void *ring_info;
 	void *fd_entry;
 	char backing_file[1024]; /*Used by differencing disks, e.g. qcow*/
-	long int   sector_size;
-	uint64_t   size;
-	long int   info;
+	unsigned long      sector_size;
+	unsigned long long size;
+	unsigned int       info;
 };
 
 /* Prototype of the callback to activate as requests complete.              */
diff -r 00ed59a6f043 tools/blktap/lib/blktaplib.h
--- a/tools/blktap/lib/blktaplib.h	Tue Nov 21 10:22:19 2006 +0000
+++ b/tools/blktap/lib/blktaplib.h	Tue Nov 21 14:54:21 2006 -0500
@@ -97,9 +97,9 @@ typedef struct {
 } pending_req_t;
 
 struct blkif_ops {
-	long int (*get_size)(struct blkif *blkif);
-	long int (*get_secsize)(struct blkif *blkif);
-	unsigned (*get_info)(struct blkif *blkif);
+	unsigned long long (*get_size)(struct blkif *blkif);
+	unsigned long (*get_secsize)(struct blkif *blkif);
+	unsigned int (*get_info)(struct blkif *blkif);
 };
 
 typedef struct blkif {
@@ -156,9 +156,9 @@ typedef struct domid_translate {
 } domid_translate_t ;
 
 typedef struct image {
-	long int size;
-	long int secsize;
-	long int info;
+	unsigned long long size;
+	unsigned long secsize;
+	unsigned int info;
 } image_t;
 
 typedef struct msg_hdr {
diff -r 00ed59a6f043 tools/blktap/lib/xenbus.c
--- a/tools/blktap/lib/xenbus.c	Tue Nov 21 10:22:19 2006 +0000
+++ b/tools/blktap/lib/xenbus.c	Tue Nov 21 14:53:58 2006 -0500
@@ -219,7 +219,7 @@ static void ueblktap_setup(struct xs_han
 	}
 
 	/* Supply the information about the device to xenstore */
-	er = xs_printf(h, be->backpath, "sectors", "%lu",
+	er = xs_printf(h, be->backpath, "sectors", "%llu",
 			be->blkif->ops->get_size(be->blkif));
 
 	if (er == 0) {

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Getting xen to recognise large disks
  2006-11-21 22:41                   ` Daniel P. Berrange
@ 2006-11-22 10:42                     ` Robin Bowes
  2006-11-28 21:52                     ` Daniel P. Berrange
  1 sibling, 0 replies; 20+ messages in thread
From: Robin Bowes @ 2006-11-22 10:42 UTC (permalink / raw)
  To: xen-devel

I installed the patched kernel and xen RPMS this morning and can confirm
that this fixed my problem.

I can now see my 2.7TB partition correctly in the DomU guest.

Thanks to all for the quick fix, and to Dan for help with building the
modified kernel RPM.

R.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Re: Getting xen to recognise large disks
  2006-11-21 22:41                   ` Daniel P. Berrange
  2006-11-22 10:42                     ` Robin Bowes
@ 2006-11-28 21:52                     ` Daniel P. Berrange
  1 sibling, 0 replies; 20+ messages in thread
From: Daniel P. Berrange @ 2006-11-28 21:52 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel, Robin Bowes

On Tue, Nov 21, 2006 at 10:41:41PM +0000, Daniel P. Berrange wrote:
> On Tue, Nov 21, 2006 at 09:11:18PM +0000, Daniel P. Berrange wrote:
> > On Tue, Nov 21, 2006 at 11:34:45AM +0000, Keir Fraser wrote:
> > > On 21/11/06 11:21, "Robin Bowes" <robin-lists@robinbowes.com> wrote:
> > > 
> > > > Keir Fraser wrote:
> > > >> On 21/11/06 2:13 am, "Robin Bowes" <robin-lists@robinbowes.com> wrote:
> > > >> 
> > > >> I'll make a patch today.
> > > >> 
> > > > 
> > > > Thanks Keir, looking forward to testing it.
> > > 
> > > If you don't mind using the xen-unstable source repository, it's changeset
> > > 12496:0c0ef61de06b. It probably hasn't reached the public repository just
> > > yet (should very shortly though).
> > 
> > I've tested that changeset with the following
> > 
> >  - phy:  against a 5 TB partition
> >  - file: against a 7.3 TB file
> > 
> > In both cases the # of sectors matches in Dom0 vs DomU. For good measure
> > I also ran Stephen Tweedie's verify-data tool in the DomU to verify no
> > data I/O wraparound issues elsewhere in the code & it passed without
> > trouble.
> > 
> > Blktap, however, is a different story - it is showing wraparound for disk
> > size at the 2 TB size mark still. The userspace blktap tools have totally
> > inconsistent data types. Sometimes using int, sometimes long, sometimes
> > unsigned long & sometimes uint64. I'm working on a patch which makes it 
> > 
> >  - 'unsigned long long'  for # sectors
> >  - 'unsigned long'       for sector size
> >  - 'unsigned int'        for info
> > 
> > This makes it match the data types used in blkfront/blkback exactly.
> > With this patch applied, the DomU sees correct disk size, however,
> > the verify-data tool is showing nasty data consistency issues when
> > writing/reading to such a disk. So I think there is 32-bit wrap
> > around somewhere in the I/O codepath for blktap. I'll get back when
> > I've found out more info...
> 
> It turns out that blktap wasn't (directly) at fault here. I was storing my
> file based disk images on an XFS formatted partition in the host. Well it
> appears that XFS doesn't play nice with the async I/O + O_DIRECT options
> that blktap likes so all your data goes to /dev/null :-)
> 
> I re-tested blktap + large file backed disks on ext3 & GFS and everything
> is working as expected. So stay away from an XFS+blktap combo if you like 
> your data :-)
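
For anyone reading this in the archive: the 2 TB mark in the quoted messages is what you would expect when a count of 512-byte sectors is held in a 32-bit type, since 2^32 sectors x 512 bytes = 2 TiB. The following is only a minimal sketch of that truncation, not the blktap or blkback code. The helper names are hypothetical, and fixed-width types are used so the effect shows even on platforms where 'unsigned long' is 64 bits wide.

```c
#include <stdint.h>

/* Hypothetical helper: convert a size in 1 KiB blocks (the unit used in
 * /proc/partitions) to a count of 512-byte sectors. */
static uint64_t kib_to_sectors(uint64_t kib)
{
    return kib * 2;
}

/* Hypothetical helper: the value that survives when the sector count is
 * stored in a 32-bit variable. Only the low 32 bits are kept. */
static uint32_t sectors_as_u32(uint64_t sectors)
{
    return (uint32_t)sectors;
}
```

Feeding in the md2 size from the original report (2930303616 KiB) gives 5860607232 sectors, which truncates to 1565639936 sectors, or 782819968 KiB. That is in the same range as the size the guest reported, which is consistent with a 32-bit wrap on the sector count. Holding the count in 'unsigned long long' and printing it with "%llu", as the patch does, avoids the truncation.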

FYI, in case anyone else out there is reading the archives: it turns out
there is a kernel bug which caused the data corruption problems with XFS
in this case; it wasn't a Xen or blktap issue. The root cause was that
if you used O_DIRECT + async-IO on a sparse file, XFS ended up writing
data into the wrong region of the file! So if you're using XFS for storing
file backed images, make sure they're not sparse images, or use the old
loopback driver which avoids the O_DIRECT+AIO codepaths. Gory details in

  https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=217098
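
One rough way to check whether an existing image file is sparse is to compare its apparent size with the space actually allocated to it. The sketch below assumes st_blocks is counted in 512-byte units, which holds on Linux but is not guaranteed by POSIX. The function names are hypothetical, and the result is a heuristic only: compression or inline storage of small files can also make the allocation look short.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Pure helper: true when fewer bytes are allocated than the apparent
 * size, i.e. the file probably contains holes.
 * Assumption: blocks are 512-byte units. */
static int allocation_is_short(long long size_bytes, long long blocks_512)
{
    return blocks_512 * 512LL < size_bytes;
}

/* Returns 1 if the file looks sparse, 0 if it does not,
 * and -1 if stat() fails. */
static int file_looks_sparse(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return allocation_is_short((long long)st.st_size,
                               (long long)st.st_blocks);
}
```

If file_looks_sparse() returns 1 for an image stored on XFS, the advice above applies: either fill in the holes before using it with blktap, or fall back to the loopback driver.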

Regards,
Dan.
-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston.  +1 978 392 2496 -=|
|=-           Perl modules: http://search.cpan.org/~danberr/              -=|
|=-               Projects: http://freshmeat.net/~danielpb/               -=|
|=-  GnuPG: 7D3B9505   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505  -=| 

end of thread, other threads:[~2006-11-28 21:52 UTC | newest]

Thread overview: 20+ messages
2006-11-20 22:40 Getting xen to recognise large disks Robin Bowes
2006-11-20 23:31 ` Ian Pratt
2006-11-21  1:01   ` Robin Bowes
2006-11-21  1:41     ` Robin Bowes
2006-11-21  1:47       ` Ian Pratt
2006-11-21  1:51       ` Daniel P. Berrange
2006-11-21  2:13         ` Robin Bowes
2006-11-21  7:46           ` Keir Fraser
2006-11-21 11:21             ` Robin Bowes
2006-11-21 11:34               ` Keir Fraser
2006-11-21 11:53                 ` Robin Bowes
2006-11-21 11:58                   ` Keir Fraser
2006-11-21 12:04                     ` Robin Bowes
2006-11-21 12:36                       ` Daniel P. Berrange
2006-11-21 12:14                     ` Xen 3.0.3 Kernel 2.6.18 Claes Lindblom
2006-11-21 12:25                       ` Atsushi SAKAI
2006-11-21 21:11                 ` Re: Getting xen to recognise large disks Daniel P. Berrange
2006-11-21 22:41                   ` Daniel P. Berrange
2006-11-22 10:42                     ` Robin Bowes
2006-11-28 21:52                     ` Daniel P. Berrange
