* RAID5 Shrinking array-size nearly killed the system
From: Rory Jaffe @ 2011-03-12  4:58 UTC
  To: linux-raid

Using Ubuntu 10.10, mdadm v3.2, ext4 filesystem. I wanted to shrink
from 6 disks to 4. I have about 2TB of files on the disks. So I ran
$ sudo mdadm -G -n 4 /dev/md0
which gave the message:

mdadm: this change will reduce the size of the array.
       use --grow --array-size first to truncate array.
       e.g. mdadm --grow /dev/md0 --array-size 5857612608

then ran
$ sudo mdadm --grow /dev/md0 --array-size 5857612608
and started testing the filesystem prior to reducing the array. I
quickly found out that the filesystem was broken, badly enough that I
couldn't even run commands, including sudo, mdadm, reboot, and ls. I
had to power down the system and restart it. There were a number of
disk errors, but I managed to restart the system and it looks like
there wasn't much damage. fstab is listed after my questions.
Questions:
1. Is there a safer way to shrink the file system prior to reducing
the number of disks in the array?
2. Is there a way of rearranging the files and directories to make
shrinking the file system safer?
3. Is there something I did that caused the crash?
Thanks-Rory


# /etc/fstab: static file system information.
#
# Use 'blkid -o value -s UUID' to print the universally unique identifier
# for a device; this may be used with UUID= as a more robust way to name
# devices that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
proc            /proc           proc    nodev,noexec,nosuid 0       0
# / was on /dev/md0 during installation
UUID=9a978b70-e034-4d79-9e1d-237a67b553d5 /     ext4    commit=60,errors=remount-ro,usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv1,barrier=0 0 1
# /boot was on /dev/sdb1 during installation
UUID=5beb5144-6d2f-4a73-b9e9-442355d8f529 /boot ext2    defaults 0 2
# swap was on /dev/sda1 during installation
UUID=31660a21-3f99-4ffb-81cc-501dc6ce5de7 none  swap    sw 0 0
# swap was on /dev/sdc1 during installation
UUID=f168588d-8b9e-45c3-b9ae-f90f66906616 none  swap    sw 0 0
# swap was on /dev/sdd1 during installation
UUID=05cadc48-59df-479e-b5ed-b9e9322cb905 none  swap    sw 0 0
# swap was on /dev/sde1 during installation
UUID=61fba94d-e6c5-4a58-b0cd-9d878b55b65c none  swap    sw 0 0
# swap was on /dev/sdf1 during installation
UUID=47737641-7555-4cbc-9bf6-508c9f2035bc none  swap    sw 0 0
# swap was on /dev/sdg1 during installation
UUID=ad06f3d6-a6ec-445a-bcfb-427fec72725b none  swap    sw 0 0


* Re: RAID5 Shrinking array-size nearly killed the system
From: Mikael Abrahamsson @ 2011-03-12  5:56 UTC
  To: Rory Jaffe; +Cc: linux-raid

On Fri, 11 Mar 2011, Rory Jaffe wrote:

> 3. Is there something I did that caused the crash?

Mdadm has no concept of a filesystem; it works on the block device.

From your text you give no indication that you shrunk the filesystem 
before shrinking the block device. Did you?
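
If not, the safe order is filesystem first, then array. A minimal
sketch, assuming an unmounted ext4 filesystem on /dev/md0 (the device
name and sizes here are illustrative, not computed for your setup):

$ sudo umount /dev/md0                  # the fs must be offline to shrink
$ sudo e2fsck -f /dev/md0               # resize2fs insists on a fresh check
$ sudo resize2fs /dev/md0 3800G         # shrink the fs below the new array size
$ sudo mdadm --grow /dev/md0 --array-size 5857612608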

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se


* Re: RAID5 Shrinking array-size nearly killed the system
From: Phil Turmel @ 2011-03-12 14:40 UTC
  To: Rory Jaffe; +Cc: Mikael Abrahamsson, linux-raid

On 03/12/2011 12:56 AM, Mikael Abrahamsson wrote:
> On Fri, 11 Mar 2011, Rory Jaffe wrote:
> 
>> 3. Is there something I did that caused the crash?
> 
> Mdadm has no concept of a filesystem; it works on the block device.
> 
> From your text you give no indication that you shrunk the filesystem before shrinking the block device. Did you?
> 

Also, you didn't say what kind of filesystem it is.  Some don't support shrinking at all (XFS is one), or don't support shrinking while running (most of them).
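
If the type is in doubt, blkid will report it without mounting the
device (a quick check; /dev/md0 is assumed from your earlier mail):

$ sudo blkid /dev/md0     # prints TYPE="ext4", TYPE="xfs", and so on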

You will very likely need to do this from a LiveCD of one form or another.  If your filesystem cannot be shrunk, you'll need to make a fresh backup, reformat, and restore.

(You do have a backup, don't you?)

Phil


* [PATCH] Add more warnings to --grow documentation (was: RAID5 Shrinking array-size nearly killed the system)
From: Phil Turmel @ 2011-03-12 17:47 UTC
  To: Rory Jaffe; +Cc: Mikael Abrahamsson, linux-raid

[CC restored.  Always use reply-to-all on kernel.org lists.]

On 03/12/2011 11:07 AM, Rory Jaffe wrote:
> It's ext4. I've backed up the critical data, so a complete loss of the
> files won't be a catastrophe, just a major pain. I was planning to
> unmount the filesystem, shrink it, remount it, then proceed with
> shrinking the block device.
> 
> Mikael also asked about what I would change in the documentation. I'd
> change the following two paragraphs (my changes are in all caps).
> 
> Note that when an array changes size, any filesystem that may be stored
>      in the array will  not  automatically  grow  OR SHRINK to  use
> the  space.   The
>      filesystem will need to be explicitly told to use the extra space
> AFTER GROWING THE ARRAY, OR TO REDUCE ITS SIZE PRIOR TO SHRINKING THE
> ARRAY.
> 
> When  decreasing the number of devices, the size of the array will also
>      decrease.  If there was data in the array, it could get  destroyed  and
>      this is not reversible. FIRST, SHRINK THE FILESYSTEMS ON THE
> ARRAY TO ACCOMMODATE THE NEW SIZE. To help prevent accidents, mdadm
> requires that
>      the size of the array be decreased first  with  mdadm  --grow  --array-
>      size.   This  is  a reversible change which simply makes the end of the
>      array inaccessible.  The integrity of any  data  can  then  be  checked
>      before  the  non-reversible  reduction  in  the  number  of  devices is
>      requested.

I've incorporated some of the above suggestions in the following patch,
paraphrased somewhat, with additional text in the summary section.

8<======================
From 027600b0db6bb9043fc5141a5ad6db32f5ba5ab5 Mon Sep 17 00:00:00 2001
From: Philip J. Turmel <philip@turmel.org>
Date: Sat, 12 Mar 2011 12:30:23 -0500
Subject: [PATCH] Grow: Improve the documentation of shrinking

Users often report difficulties caused by shrinking an array
without having shrunk the contained filesystem.  Add a note
warning about this in the summary description for --grow, and
elaborate in the SIZE CHANGES subsection of the GROW mode
detailed description.

Inspired-by: Rory Jaffe <rsjaffe@gmail.com>
Signed-off-by: Philip J. Turmel <philip@turmel.org>
---
 mdadm.8.in |   15 +++++++++++++--
 1 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/mdadm.8.in b/mdadm.8.in
index 08e4255..43c2022 100644
--- a/mdadm.8.in
+++ b/mdadm.8.in
@@ -126,6 +126,13 @@ of component devices and changing the number of active devices in RAID
 levels 1/4/5/6, changing the RAID level between 1, 5, and 6, changing
 the chunk size and layout for RAID5 and RAID5, as well as adding or
 removing a write-intent bitmap.
+.B "Note:"
+The contained filesystem, if any, is
+.B NOT
+adjusted automatically.
+Shrinking without prior filesystem adjustment
+.B WILL
+do irreversible damage.
 
 .TP
 .B "Incremental Assembly"
@@ -2158,8 +2165,12 @@ space to start being used.  If the size is increased in this way, a
 are synchronised.
 
 Note that when an array changes size, any filesystem that may be
-stored in the array will not automatically grow to use the space.  The
-filesystem will need to be explicitly told to use the extra space.
+stored in the array will not automatically grow to use the space.
+Nor will it automatically shrink to fit a smaller size.  The
+filesystem will need to be explicitly told to use the new size.
+.B "Note:"
+Some filesystems can not be shrunk at all.  Most filesystems can not
+be shrunk while mounted.
 
 Also the size of an array cannot be changed while it has an active
 bitmap.  If an array has a bitmap, it must be removed before the size
-- 
1.7.4.1




* Re: RAID5 Shrinking array-size nearly killed the system
From: Phil Turmel @ 2011-03-12 17:58 UTC
  To: Rory Jaffe; +Cc: Mikael Abrahamsson, linux-raid

[CC restored]

On 03/12/2011 12:37 PM, Rory Jaffe wrote:
> This is my plan now--did I get this right? -- thanks --
> 
> shutdown -r now # go to live cd
> umount /dev/md0 #just to make sure
> e2fsck /dev/md0
> resize2fs /dev/md0 3800G #3.2T currently in use
> shutdown -r now # go back to main system
> mdadm --grow /dev/md0 --array-size 4000000000
> mdadm -G -n 4 -x 2 --backup-file=/path/to/file.bak /dev/md0
> resize2fs /dev/md0

I would do everything in the LiveCD environment, and I would add an fsck after the resize, and again at the end.

In the LiveCD, there's a good chance the array will be assembled for you, but under a different device number.  That shouldn't cause any problems, but it affects the commands you'll type.  "cat /proc/mdstat" will give you a quick summary of where you stand.

I can't comment on the size figures you've chosen, as you haven't shared the output of "mdadm -D /dev/md0" and "mdadm -E" for each of the component devices.

Also note that the backup file needed by mdadm cannot be *inside* the array you are resizing.  You *must* have another storage device for it.  I use a thumb drive with my LiveCD for this kind of task.
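
Putting that together, a sketch of the whole session from the LiveCD,
assuming it assembles the array as /dev/md127 and that the backup file
lives on a separate device mounted at /mnt/usb (both names are
assumptions, so substitute your own):

$ cat /proc/mdstat                           # confirm the array's device name
$ sudo e2fsck -f /dev/md127                  # check before touching anything
$ sudo resize2fs /dev/md127 3800G            # shrink the filesystem first
$ sudo e2fsck -f /dev/md127                  # fsck after the resize
$ sudo mdadm --grow /dev/md127 --array-size 5857612608
$ sudo e2fsck -f /dev/md127                  # verify data survived the truncation
$ sudo mdadm -G -n 4 --backup-file=/mnt/usb/md0.bak /dev/md127
$ sudo resize2fs /dev/md127                  # after the reshape, grow the fs to fill
$ sudo e2fsck -f /dev/md127                  # and check once more at the end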

Phil


* Re: RAID5 Shrinking array-size nearly killed the system
From: Rory Jaffe @ 2011-03-12 18:31 UTC
  To: linux-raid

On Sat, Mar 12, 2011 at 5:58 PM, Phil Turmel <philip@turmel.org> wrote:
> I can't comment on the size figures you've chosen, as you haven't
> shared the output of "mdadm -D /dev/md0" and "mdadm -E" for each of
> the component devices.
[...]
Here's the data on array sizes
sudo mdadm -D /dev/md/0_0
/dev/md/0_0:
        Version : 0.90
  Creation Time : Thu Jan  6 06:13:08 2011
     Raid Level : raid5
     Array Size : 9762687680 (9310.42 GiB 9996.99 GB)
  Used Dev Size : 1952537536 (1862.08 GiB 1999.40 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 127
    Persistence : Superblock is persistent

    Update Time : Sat Mar 12 17:56:34 2011
          State : clean
 Active Devices : 6
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 7e946e9d:b6a3395c:b57e8a13:68af0467
         Events : 0.72

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       8       50        1      active sync   /dev/sdd2
       2       8       66        2      active sync   /dev/sde2
       3       8       82        3      active sync   /dev/sdf2
       4       8       98        4      active sync   /dev/sdg2
       5       8      114        5      active sync   /dev/sdh2


* Re: RAID5 Shrinking array-size nearly killed the system
From: Phil Turmel @ 2011-03-12 20:10 UTC
  To: Rory Jaffe; +Cc: Mikael Abrahamsson, linux-raid

On 03/12/2011 01:28 PM, Rory Jaffe wrote:
> Here's the data on array sizes
> sudo mdadm -D /dev/md/0_0
[...]
>      Array Size : 9762687680 (9310.42 GiB 9996.99 GB)
>   Used Dev Size : 1952537536 (1862.08 GiB 1999.40 GB)
>    Raid Devices : 6
[...]

OK, so your new array size will be 5857612608 KiB (1952537536 * 3), about 5586 GiB.

You can use an initial resize2fs to 5.4T to speed things up, as you don't really need to move items that are currently located between the 3.8T and 5.4T mark.  Then use the exact "mdadm --grow /dev/md0 --array-size=5857612608" afterwards before you fsck it.
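
Spelled out, with sizes in the 1 KiB blocks that "mdadm -D" reports:

    new array size = Used Dev Size * (devices - 1)   # RAID5 spends one disk on parity
                   = 1952537536 KiB * (4 - 1)
                   = 5857612608 KiB = ~5586.25 GiB (~5998.20 GB)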

If that passes, do the rest.  The final resize2fs should be very quick.

Phil


* Re: RAID5 Shrinking array-size nearly killed the system
From: Rory Jaffe @ 2011-03-13  6:56 UTC
  To: Phil Turmel; +Cc: Mikael Abrahamsson, linux-raid

> OK, so your new array size will be 5857612608 KiB (1952537536 * 3), about 5586 GiB.
>
> You can use an initial resize2fs to 5.4T to speed things up, as you don't really need to move items that are currently located between the 3.8T and 5.4T mark.  Then use the exact "mdadm --grow /dev/md0 --array-size=5857612608" afterwards before you fsck it.
>
> If that passes, do the rest.  The final resize2fs should be very quick.
>
> Phil
>

One more glitch? I ran the following command, trying several different
locations for the backup file, all of which have plenty of space and
are not on the array.

sudo mdadm -G /dev/md/0_0 -n 4 --backup-file=/tmp/backmd

mdadm gives the message "mdadm: Need to backup 960K of critical
section.." and it immediately returns to the command prompt without
shrinking the array.

Following is the current array info

ubuntu@ubuntu:~/mdadm-3.2$ sudo mdadm -D /dev/md/0_0
/dev/md/0_0:
        Version : 0.90
  Creation Time : Thu Jan  6 06:13:08 2011
     Raid Level : raid5
     Array Size : 5857612608 (5586.25 GiB 5998.20 GB)
  Used Dev Size : 1952537536 (1862.08 GiB 1999.40 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 127
    Persistence : Superblock is persistent

    Update Time : Sun Mar 13 06:46:37 2011
          State : clean
 Active Devices : 6
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 7e946e9d:b6a3395c:b57e8a13:68af0467
         Events : 0.73

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       8       50        1      active sync   /dev/sdd2
       2       8       66        2      active sync   /dev/sde2
       3       8       82        3      active sync   /dev/sdf2
       4       8       98        4      active sync   /dev/sdg2
       5       8      114        5      active sync   /dev/sdh2


* Re: RAID5 Shrinking array-size nearly killed the system
From: Phil Turmel @ 2011-03-13 13:33 UTC
  To: Rory Jaffe; +Cc: Mikael Abrahamsson, linux-raid

Hi Rory,

On 03/13/2011 01:56 AM, Rory Jaffe wrote:
> One more glitch? I ran the following command, trying several different
> locations for the backup file, all of which have plenty of space and
> are not on the array.
> 
> sudo mdadm -G /dev/md/0_0 -n 4 --backup-file=/tmp/backmd
> 
> mdadm gives the message "mdadm: Need to backup 960K of critical
> section.." and it immediately returns to the command prompt without
> shrinking the array.

Are you sure it's not doing the reshape?  "cat /proc/mdstat" will show what's happening in the background.

Also, check your dmesg to see if there are any explanatory messages.
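
For example (both are read-only checks; md127 is the name from your
earlier output):

$ cat /proc/mdstat        # an active reshape shows a "reshape = ...%" progress line
$ dmesg | tail -n 30      # recent md/raid messages often explain an aborted grow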

Phil


* Re: RAID5 Shrinking array-size nearly killed the system
From: Rory Jaffe @ 2011-03-15  5:26 UTC
  To: Phil Turmel; +Cc: Mikael Abrahamsson, linux-raid

>> One more glitch? I ran the following command, trying several different
>> locations for the backup file, all of which have plenty of space and
>> are not on the array.
>>
>> sudo mdadm -G /dev/md/0_0 -n 4 --backup-file=/tmp/backmd
>>
>> mdadm gives the message "mdadm: Need to backup 960K of critical
>> section.." and it immediately returns to the command prompt without
>> shrinking the array.
>
> Are you sure it's not doing the reshape?  "cat /proc/mdstat" will show what's happening in the background.
>
> Also, check your dmesg to see if there are any explanatory messages.
>
> Phil
>
I tried again, with the same results. Details follow:

To assemble the array, I used
ubuntu@ubuntu:~/mdadm-3.2$ sudo mdadm --assemble --scan
then
I resynced the array.
then
ubuntu@ubuntu:~/mdadm-3.2$ sudo mdadm --grow /dev/md127 --array-size 5857612608
then
ubuntu@ubuntu:~/mdadm-3.2$ sudo mdadm -G -n 4 --backup-file=mdbak /dev/md127
and again received the messages:
ubuntu@ubuntu:~/mdadm-3.2$ sudo mdadm -G -n 4 --backup-file=mdback /dev/md127
mdadm: Need to backup 960K of critical section..
ubuntu@ubuntu:~/mdadm-3.2$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md127 : active raid5 sda2[0] sdh2[5] sdg2[4] sdf2[3] sde2[2] sdd2[1]
      5857612608 blocks level 5, 64k chunk, algorithm 2 [6/6] [UUUUUU]

unused devices: <none>
ubuntu@ubuntu:~/mdadm-3.2$ mdadm -V
mdadm - v3.2 DEVELOPER_ONLY - 1st February 2011 (USE WITH CARE)


The following appear to be the relevant parts of dmesg--

[  758.516860] md: md127 stopped.
[  758.522499] md: bind<sdd2>
[  758.523731] md: bind<sde2>
[  758.525170] md: bind<sdf2>
[  758.525588] md: bind<sdg2>
[  758.526003] md: bind<sdh2>
[  758.526748] md: bind<sda2>
[  758.567380] async_tx: api initialized (async)
[  758.740173] raid6: int64x1    335 MB/s
[  758.910051] raid6: int64x2    559 MB/s
[  759.080062] raid6: int64x4    593 MB/s
[  759.250058] raid6: int64x8    717 MB/s
[  759.420148] raid6: sse2x1     437 MB/s
[  759.590013] raid6: sse2x2     599 MB/s
[  759.760037] raid6: sse2x4     634 MB/s
[  759.760044] raid6: using algorithm sse2x4 (634 MB/s)
[  759.793413] md: raid6 personality registered for level 6
[  759.793423] md: raid5 personality registered for level 5
[  759.793429] md: raid4 personality registered for level 4
[  759.798708] md/raid:md127: device sda2 operational as raid disk 0
[  759.798720] md/raid:md127: device sdh2 operational as raid disk 5
[  759.798729] md/raid:md127: device sdg2 operational as raid disk 4
[  759.798739] md/raid:md127: device sdf2 operational as raid disk 3
[  759.798747] md/raid:md127: device sde2 operational as raid disk 2
[  759.798756] md/raid:md127: device sdd2 operational as raid disk 1
[  759.800722] md/raid:md127: allocated 6386kB
[  759.810239] md/raid:md127: raid level 5 active with 6 out of 6
devices, algorithm 2
[  759.810249] RAID conf printout:
[  759.810255]  --- level:5 rd:6 wd:6
[  759.810263]  disk 0, o:1, dev:sda2
[  759.810271]  disk 1, o:1, dev:sdd2
[  759.810278]  disk 2, o:1, dev:sde2
[  759.810285]  disk 3, o:1, dev:sdf2
[  759.810293]  disk 4, o:1, dev:sdg2
[  759.810300]  disk 5, o:1, dev:sdh2
[  759.810416] md127: detected capacity change from 0 to 9996992184320
[  759.825149]  md127: unknown partition table
[  810.381494] md127: detected capacity change from 9996992184320 to
5998195310592
[  810.384868]  md127: unknown partition table

and here is the information about the array.
sudo mdadm -D /dev/md127
/dev/md127:
        Version : 0.90
  Creation Time : Thu Jan  6 06:13:08 2011
     Raid Level : raid5
     Array Size : 5857612608 (5586.25 GiB 5998.20 GB)
  Used Dev Size : 1952537536 (1862.08 GiB 1999.40 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 127
    Persistence : Superblock is persistent

    Update Time : Tue Mar 15 00:45:28 2011
          State : clean
 Active Devices : 6
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 7e946e9d:b6a3395c:b57e8a13:68af0467
         Events : 0.76

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       8       50        1      active sync   /dev/sdd2
       2       8       66        2      active sync   /dev/sde2
       3       8       82        3      active sync   /dev/sdf2
       4       8       98        4      active sync   /dev/sdg2
       5       8      114        5      active sync   /dev/sdh2


* Re: RAID5 Shrinking array-size nearly killed the system
From: NeilBrown @ 2011-03-15  5:44 UTC
  To: Rory Jaffe; +Cc: Phil Turmel, Mikael Abrahamsson, linux-raid

On Tue, 15 Mar 2011 05:26:44 +0000 Rory Jaffe <rsjaffe@gmail.com> wrote:

> ubuntu@ubuntu:~/mdadm-3.2$ sudo mdadm -G -n 4 --backup-file=mdback /dev/md127
> mdadm: Need to backup 960K of critical section..
[...]
> ubuntu@ubuntu:~/mdadm-3.2$ mdadm -V
> mdadm - v3.2 DEVELOPER_ONLY - 1st February 2011 (USE WITH CARE)
               ^^^^^^^^^^^^^^                      ^^^^^^^^^^^^^

I guess you must be a developer, so probably don't need any help....

But may I suggest trying mdadm-3.1.4 instead??
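
A sketch of doing that from source; the tarball path is my assumption
about the kernel.org layout, so adjust as needed:

$ wget http://www.kernel.org/pub/linux/utils/raid/mdadm/mdadm-3.1.4.tar.gz
$ tar xzf mdadm-3.1.4.tar.gz && cd mdadm-3.1.4
$ make
$ sudo ./mdadm -V         # run the freshly built binary in place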

NeilBrown






* Re: RAID5 Shrinking array-size nearly killed the system
From: Rory Jaffe @ 2011-03-15  5:53 UTC
  To: NeilBrown; +Cc: Phil Turmel, Mikael Abrahamsson, linux-raid

>> ubuntu@ubuntu:~/mdadm-3.2$ mdadm -V
>> mdadm - v3.2 DEVELOPER_ONLY - 1st February 2011 (USE WITH CARE)
>               ^^^^^^^^^^^^^^                      ^^^^^^^^^^^^^
>
> I guess you must be a developer, so probably don't need any help....
>
> But may I suggest trying mdadm-3.1.4 instead??
>
> NeilBrown
Well, I was developing a headache. Does that qualify me as a developer?

Seriously, point well-taken. I've now installed 3.1.4, and it seems
to be doing the job correctly.

It's now reshaping.
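
To keep an eye on it (optional; a plain "cat /proc/mdstat" works too):

$ watch -n 60 cat /proc/mdstat    # shows percent done, speed, and a finish estimate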

