linux-raid.vger.kernel.org archive mirror
* Resizing RAID-1 arrays - some possible bugs and problems
@ 2006-07-07 13:40 Reuben Farrelly
  2006-07-07 18:52 ` Justin Piszcz
  2006-07-07 22:12 ` Neil Brown
  0 siblings, 2 replies; 5+ messages in thread
From: Reuben Farrelly @ 2006-07-07 13:40 UTC (permalink / raw)
  To: linux-raid; +Cc: Neil Brown

I'm just in the process of upgrading the RAID-1 disks in my server, and have 
started to experiment with the RAID-1 --grow command.  The first phase of the 
change went well, I added the new disks to the old arrays and then increased the 
size of the arrays to include both the new and old disks.  This meant that I had 
a full and clean transfer of all the data.  Then took the old disks out...it all 
worked nicely.
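
(For the record, the sequence was along these lines; a rough sketch only, with 
illustrative device names and the old partition shown as a hypothetical 
/dev/sdb2:

# add the new partition, then widen the mirror so it becomes an active member
mdadm /dev/md0 --add /dev/sdc2
mdadm --grow /dev/md0 --raid-devices=3
# once the resync has finished, drop the old disk and shrink back to two members
mdadm /dev/md0 --fail /dev/sdb2 --remove /dev/sdb2
mdadm --grow /dev/md0 --raid-devices=2

repeated for each array and each new disk.)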

However, I've had two problems with the next phase, which was resizing the 
arrays.

Firstly, after moving the array, the kernel still seems to think that the raid 
array is only as big as the older disks.  This is to be expected, but consider 
the following output:

[root@tornado /]# mdadm --detail /dev/md0
/dev/md0:
         Version : 00.90.03
   Creation Time : Sat Nov  5 14:02:50 2005
      Raid Level : raid1
      Array Size : 24410688 (23.28 GiB 25.00 GB)
     Device Size : 24410688 (23.28 GiB 25.00 GB)
    Raid Devices : 2
   Total Devices : 2
Preferred Minor : 0
     Persistence : Superblock is persistent

   Intent Bitmap : Internal

     Update Time : Sat Jul  8 01:23:54 2006
           State : active
  Active Devices : 2
Working Devices : 2
  Failed Devices : 0
   Spare Devices : 0

            UUID : 24de08b7:e256a424:cca64cdd:638a1428
          Events : 0.5139442

     Number   Major   Minor   RaidDevice State
        0       8       34        0      active sync   /dev/sdc2
        1       8        2        1      active sync   /dev/sda2
[root@tornado /]#

We note that the "Device Size" according to the system is still 25.0 GB.  The 
device size is REALLY 40 GB, however, as shown by the output of fdisk -l:

/dev/sda2               8        4871    39070080   fd  Linux raid autodetect

and

/dev/sdc2               8        4871    39070080   fd  Linux raid autodetect

Is that a bug?  My expectation is that this field should now reflect the size of 
the device/partition, with the *Array Size* still being the original, unresized 
size.

Secondly, I understand that I need to use the --grow command to bring the array 
up to the size of the device.
How do I know what size I should specify?  On my old disk, the size of the 
partition as read by fdisk was slightly larger than the array and device size as 
shown by mdadm.
How much difference should there be?
(Hint:  maybe this could be documented in the manpage (please), NeilB?)


And lastly, I felt brave and decided to plunge ahead and resize to 128 blocks 
smaller than the device size:  mdadm --grow /dev/md1 --size=

The kernel then went like this:

md: couldn't update array info. -28
VFS: busy inodes on changed media.
md1: invalid bitmap page request: 150 (> 149)
md1: invalid bitmap page request: 150 (> 149)
md1: invalid bitmap page request: 150 (> 149)
md1: invalid bitmap page request: 150 (> 149)
md1: invalid bitmap page request: 150 (> 149)
md1: invalid bitmap page request: 150 (> 149)
md1: invalid bitmap page request: 150 (> 149)
md1: invalid bitmap page request: 150 (> 149)
md1: invalid bitmap page request: 150 (> 149)
md1: invalid bitmap page request: 150 (> 149)
md1: invalid bitmap page request: 150 (> 149)

...and kept going and going; every now and then the count incremented, up to 
about 155, at which point I shut the box down.
The array then refused to come up on boot, and after forcing it to reassemble it 
did a full dirty resync:

md: bind<sda3>
md: md1 stopped.
md: unbind<sda3>
md: export_rdev(sda3)
md: bind<sda3>
md: bind<sdc3>
md: md1: raid array is not clean -- starting background reconstruction
raid1: raid set md1 active with 2 out of 2 mirrors
attempt to access beyond end of device
sdc3: rw=16, want=39086152, limit=39086145
attempt to access beyond end of device
sda3: rw=16, want=39086152, limit=39086145
md1: bitmap initialized from disk: read 23/38 pages, set 183740 bits, status: -5
md1: failed to create bitmap (-5)
md: pers->run() failed ...
md: array md1 already has disks!
raid1: raid set md1 active with 2 out of 2 mirrors
md1: bitmap file is out of date (0 < 4258299) -- forcing full recovery
md1: bitmap file is out of date, doing full recovery
md1: bitmap initialized from disk: read 10/10 pages, set 305359 bits, status: 0
created bitmap (150 pages) for device md1
md: syncing RAID array md1
md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reconstruction.
md: using 128k window, over a total of 19542944 blocks.
kjournald starting.  Commit interval 5 seconds
EXT3 FS on md1, internal journal
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
md: md1: sync done.
RAID1 conf printout:
  --- wd:2 rd:2
  disk 0, wo:0, o:1, dev:sdc3
  disk 1, wo:0, o:1, dev:sda3

That was not really what I expected to happen.

I am running mdadm-2.3.1, which is the version currently shipped with Fedora 
Core, but I'm about to file a bug report to get this upgraded.  A cursory look 
through the Changelog didn't suggest that any of this behaviour has changed.

I get the feeling I am treading uncharted waters here; has anyone else done 
this sort of thing and/or seen this sort of problem before?

Reuben



* Re: Resizing RAID-1 arrays - some possible bugs and problems
  2006-07-07 13:40 Resizing RAID-1 arrays - some possible bugs and problems Reuben Farrelly
@ 2006-07-07 18:52 ` Justin Piszcz
  2006-07-07 21:10   ` Reuben Farrelly
  2006-07-07 22:12 ` Neil Brown
  1 sibling, 1 reply; 5+ messages in thread
From: Justin Piszcz @ 2006-07-07 18:52 UTC (permalink / raw)
  To: Reuben Farrelly; +Cc: linux-raid, Neil Brown



On Sat, 8 Jul 2006, Reuben Farrelly wrote:

> I'm just in the process of upgrading the RAID-1 disks in my server, and have 
> started to experiment with the RAID-1 --grow command.
> [...]
> I get the feeling I am treading uncharted waters here; has anyone else done 
> this sort of thing and/or seen this sort of problem before?
>
> Reuben

Reuben,

What chunk size did you use?

I can't even get mine to get past this part:

p34:~# mdadm /dev/md3 --grow --raid-disks=7
mdadm: Need to backup 15360K of critical section..
mdadm: Cannot set device size/shape for /dev/md3: No space left on device

Justin.



* Re: Resizing RAID-1 arrays - some possible bugs and problems
  2006-07-07 18:52 ` Justin Piszcz
@ 2006-07-07 21:10   ` Reuben Farrelly
  0 siblings, 0 replies; 5+ messages in thread
From: Reuben Farrelly @ 2006-07-07 21:10 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: linux-raid, Neil Brown



On 8/07/2006 6:52 a.m., Justin Piszcz wrote:
> Reuben,
> 
> What chunk size did you use?
> 
> I can't even get mine to get past this part:
> 
> p34:~# mdadm /dev/md3 --grow --raid-disks=7
> mdadm: Need to backup 15360K of critical section..
> mdadm: Cannot set device size/shape for /dev/md3: No space left on device
> 
> Justin.
> 

Just whatever the system selected for the chunk size, i.e. I didn't specify it 
myself ;)

[root@tornado cisco]# cat /proc/mdstat
Personalities : [raid1]

md0 : active raid1 sdc2[0] sda2[1]
       24410688 blocks [2/2] [UU]
       bitmap: 0/187 pages [0KB], 64KB chunk

md1 : active raid1 sdc3[0] sda3[1]
       19542944 blocks [2/2] [UU]
       bitmap: 0/150 pages [0KB], 64KB chunk

md2 : active raid1 sdc5[0] sda5[1]
       4891648 blocks [2/2] [UU]
       bitmap: 2/150 pages [8KB], 16KB chunk

I was working on md1 when I sent the earlier email.

I wonder whether the chunk size is left as-is after a --grow, and whether that 
is optimal or could lead to issues...
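
(I suppose one quick way to check would be to dump the bitmap superblock 
straight off one of the members, something like:

mdadm --examine-bitmap /dev/sda3

which should report the bitmap chunk size among other things, though I haven't 
dug into that yet.)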

reuben



* Re: Resizing RAID-1 arrays - some possible bugs and problems
  2006-07-07 13:40 Resizing RAID-1 arrays - some possible bugs and problems Reuben Farrelly
  2006-07-07 18:52 ` Justin Piszcz
@ 2006-07-07 22:12 ` Neil Brown
  2006-07-07 22:58   ` Reuben Farrelly
  1 sibling, 1 reply; 5+ messages in thread
From: Neil Brown @ 2006-07-07 22:12 UTC (permalink / raw)
  To: Reuben Farrelly; +Cc: linux-raid

On Saturday July 8, reuben-lkml@reub.net wrote:
> I'm just in the process of upgrading the RAID-1 disks in my server, and have 
> started to experiment with the RAID-1 --grow command.  The first phase of the 
> change went well, I added the new disks to the old arrays and then increased the 
> size of the arrays to include both the new and old disks.  This meant that I had 
> a full and clean transfer of all the data.  Then took the old disks out...it all 
> worked nicely.
> 
> However, I've had two problems with the next phase, which was resizing the 
> arrays.
> 
> Firstly, after moving the array, the kernel still seems to think that the raid 
> array is only as big as the older disks.  This is to be expected, but consider 
> the following output:
> 
> [root@tornado /]# mdadm --detail /dev/md0
> /dev/md0:
>          Version : 00.90.03
>    Creation Time : Sat Nov  5 14:02:50 2005
>       Raid Level : raid1
>       Array Size : 24410688 (23.28 GiB 25.00 GB)
>      Device Size : 24410688 (23.28 GiB 25.00 GB)
> 
> We note that the "Device Size" according to the system is still 25.0 GB.  The 
> device size is REALLY 40 GB, however, as shown by the output of fdisk -l:

"Device Size" is a slight misnomer.  It actually means "the amount of
this device that will be used in the array".   Maybe I should make it
"Used Device Size".
> 
> Secondly, I understand that I need to use the --grow command to bring the array 
> up to the size of the device.
> How do I know what size I should specify? 

 --size=max

              This value can be set with --grow for RAID level 1/4/5/6.  If the
              array was created with a size smaller than the currently active
              drives, the extra space can be accessed using --grow.  The size
              can be given as max, which means to choose the largest size that
              fits on all current drives.

> How much difference should there be?
> (Hint:  maybe this could be documented in the manpage (please), NeilB?)

man 4 md
       The common format - known as version 0.90 - has a superblock that is 4K
       long and is written into a 64K aligned block that starts at least 64K and
       less than 128K from the end of the device (i.e. to get the address of the
       superblock, round the size of the device down to a multiple of 64K and
       then subtract 64K).  The available size of each device is the amount of
       space before the superblock, so between 64K and 128K is lost when a
       device is incorporated into an MD array.  This superblock stores
       multi-byte fields in a processor-dependent manner, so arrays cannot
       easily be moved between computers with different processors.
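
In shell terms, the usable size of a 0.90 component works out to roughly the 
following; just a back-of-the-envelope sketch, since --size=max does this 
calculation for you:

# size of the partition in 1K blocks
kb=$(( $(blockdev --getsize64 /dev/sda2) / 1024 ))
# round down to a multiple of 64K, then subtract 64K for the superblock
echo $(( kb / 64 * 64 - 64 ))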


> 
> 
> And lastly, I felt brave and decided to plunge ahead and resize to 128 blocks 
> smaller than the device size:  mdadm --grow /dev/md1 --size=
> 
> The kernel then went like this:
> 
> md: couldn't update array info. -28
> VFS: busy inodes on changed media.
> md1: invalid bitmap page request: 150 (> 149)
> md1: invalid bitmap page request: 150 (> 149)
> md1: invalid bitmap page request: 150 (> 149)

Oh dear, that's bad.

I guess I didn't think through resizing of an array with an active
bitmap properly... :-(
That won't be fixed in a hurry, I'm afraid.
You'll need to remove the bitmap before the grow and re-add it
afterwards, which isn't really ideal.  
I'll look at making this more robust when I return from vacation in a
week or so.
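
Something along these lines ought to do as a workaround in the meantime 
(untested, off the top of my head):

mdadm --grow /dev/md1 --bitmap=none      # drop the internal bitmap first
mdadm --grow /dev/md1 --size=max         # grow to fill the component devices
mdadm --grow /dev/md1 --bitmap=internal  # put the bitmap back afterwards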

NeilBrown


* Re: Resizing RAID-1 arrays - some possible bugs and problems
  2006-07-07 22:12 ` Neil Brown
@ 2006-07-07 22:58   ` Reuben Farrelly
  0 siblings, 0 replies; 5+ messages in thread
From: Reuben Farrelly @ 2006-07-07 22:58 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid



On 8/07/2006 10:12 a.m., Neil Brown wrote:
> On Saturday July 8, reuben-lkml@reub.net wrote:

>> And lastly, I felt brave and decided to plunge ahead and resize to 128 blocks 
>> smaller than the device size:  mdadm --grow /dev/md1 --size=
>>
>> The kernel then went like this:
>>
>> md: couldn't update array info. -28
>> VFS: busy inodes on changed media.
>> md1: invalid bitmap page request: 150 (> 149)
>> md1: invalid bitmap page request: 150 (> 149)
>> md1: invalid bitmap page request: 150 (> 149)
> 
> Oh dear, that's bad.
> 
> I guess I didn't think through resizing of an array with an active
> bitmap properly... :-(
> That won't be fixed in a hurry I'm afraid.
> You'll need to remove the bitmap before the grow and re-add it
> afterwards, which isn't really ideal.  
> I'll look at making this more robust when I return from vacation in a
> week or so.
> 
> NeilBrown

Thanks for the response and references to the manpage, Neil.  I had misread the 
reference to 'max' and not realised it was a keyword/option that could be passed 
to --size.

I disabled bitmaps and the machine went into a bit of a spin.  The command 
returned to the prompt immediately when I removed the bitmap (--bitmap=none), 
but every session then locked up as soon as I attempted any sort of disk I/O, 
e.g. running 'df' or 'less', both of which live on the root filesystem on md0, 
the very array I was disabling bitmaps on.  I had to power cycle the box to get 
a response from it.

Nothing was logged in /var/log/messages, so unfortunately I don't have anything 
to work with.  But I guess that further suggests it was the disk I/O that fell 
over.

Then I rebooted into single user mode, noticed that the system had in fact 
removed the bitmaps, and did my resize using --size=max.  It worked!  Now I can 
move on to resizing the filesystems themselves.
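
(The filesystem step should just be the usual ext3 resize; a sketch, assuming 
e2fsprogs is recent enough:

resize2fs /dev/md1

run from single user mode after an e2fsck -f, or online if the kernel and tools 
support it.)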

In other news, last night I requested an upgrade of mdadm in Fedora Core/Devel 
and this has since been done, so 2.5.2 should come through tonight in the 
nightly FC build (and of course be in FC6 when it comes out).

reuben



