* Please Help! RAID5 -> 6 reshape gone bad
@ 2012-02-07  1:34 Richard Herd
  2012-02-07  2:15 ` Phil Turmel
  2012-02-07  2:39 ` NeilBrown
  0 siblings, 2 replies; 27+ messages in thread
From: Richard Herd @ 2012-02-07  1:34 UTC (permalink / raw)
  To: linux-raid

Hey guys,

I'm in a bit of a pickle here and if any mdadm kings could step in and
throw some advice my way I'd be very grateful :-)

Quick bit of background - little NAS based on an AMD E350 running Ubuntu
10.04, with a software RAID 5 across 5x2TB disks.  Every few months one
of the drives would fail a request and get kicked from the array (as is
becoming common for these larger multi-TB drives, they tolerate the
occasional bad sector by reallocating from a pool of spares - but that's
a whole other story).  This happened across a variety of brands and two
different controllers.  I'd simply add the disk that got popped back in
and let it re-sync.  SMART tests were always in good health.

It did make me nervous though, so I decided I'd add a second disk for a
bit of extra redundancy, making the array a RAID 6 - the thinking was
that the occasional disk getting kicked and re-added from a RAID 6 array
wouldn't present as much risk as a single disk getting kicked from a
RAID 5.

So first off, I added the 6th disk as a hot spare to the RAID 5 array.
I now had my 5-disk RAID 5 + hot spare.

I then found that mdadm 2.6.7 (in the repositories) isn't actually
capable of a 5->6 reshape, so I pulled the latest 3.2.3 sources and
compiled myself a new version of mdadm.

The newer version of mdadm was happy to do the reshape, so I set it off
on its merry way, using an eSATA HD (mounted at /usb :-P) for the
backup file:

root@raven:/# mdadm --grow /dev/md0 --level=6 --raid-devices=6 --backup-file=/usb/md0.backup

It would take a week to reshape, but it was on a UPS and happily ticking
along, and the array would stay online the whole time, so I was in no
rush.  Content, I went to get some shut-eye.

I got up this morning, took a quick look at /proc/mdstat to see how
things were going, and saw things had failed spectacularly.  At least
two disks had been kicked from the array and the whole thing had
crumbled.  Ouch.

I tried to assemble the array, to see if it would continue the reshape:

root@raven:/# mdadm -Avv --backup-file=/usb/md0.backup /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sdf1 /dev/sdg1

Unfortunately mdadm decided that the backup file was out of date
(timestamps didn't match) and errored with: "Failed to restore critical
section for reshape, sorry.".

Chances are things were in such a mess that the backup file wasn't going
to be used anyway, so I bypassed the timestamp check with:

export MDADM_GROW_ALLOW_OLD=1

That allowed me to assemble the array, but not run it, as there were not
enough disks to start it.
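Pulled together, the assemble attempt with the timestamp check bypassed
looks roughly like this (same device list as above; the --examine
capture is only a read-only snapshot of the superblocks, written to an
arbitrary file name, taken before retrying anything):

  # read-only snapshot of the member superblocks, for the record
  mdadm --examine /dev/sd[a-h]1 > /root/md0-examine-before.txt

  # bypass the backup-file timestamp check, then retry the assembly
  export MDADM_GROW_ALLOW_OLD=1
  mdadm -Avv --backup-file=/usb/md0.backup /dev/md0 \
      /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sdf1 /dev/sdg1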
This is the current state of the array:

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : inactive sdb1[1] sdd1[5] sdf1[4] sda1[2]
      7814047744 blocks super 0.91

unused devices: <none>

root@raven:/# mdadm --detail /dev/md0
/dev/md0:
        Version : 0.91
  Creation Time : Tue Jul 12 23:05:01 2011
     Raid Level : raid6
  Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
   Raid Devices : 6
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Tue Feb  7 09:32:29 2012
          State : active, FAILED, Not Started
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric-6
     Chunk Size : 64K

     New Layout : left-symmetric

           UUID : 9a76d1bd:2aabd685:1fc5fe0e:7751cfd7 (local to host raven)
         Events : 0.1848341

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1       8       17        1      active sync   /dev/sdb1
       2       8        1        2      active sync   /dev/sda1
       3       0        0        3      removed
       4       8       81        4      active sync   /dev/sdf1
       5       8       49        5      spare rebuilding   /dev/sdd1

The two removed disks:

[ 3020.998529] md: kicking non-fresh sdc1 from array!
[ 3021.012672] md: kicking non-fresh sdg1 from array!

Attempted to re-add the disks (same for both):

root@raven:/# mdadm /dev/md0 --add /dev/sdg1
mdadm: /dev/sdg1 reports being an active member for /dev/md0, but a --re-add fails.
mdadm: not performing --add as that would convert /dev/sdg1 in to a spare.
mdadm: To make this a spare, use "mdadm --zero-superblock /dev/sdg1" first.

With a failed array the last thing we want to do is add spares and
trigger a resync so obviously I haven't zeroed the superblocks and
added yet.

Checked and two disks really are out of sync:

root@raven:/# mdadm --examine /dev/sd[a-h]1 | grep Event
         Events : 1848341
         Events : 1848341
         Events : 1848333
         Events : 1848341
         Events : 1848341
         Events : 1772921

I'll post the output of --examine on all the disks below - if anyone has
any advice I'd really appreciate it (Neil Brown doesn't read these
forums does he?!?).  I would usually move next to recreating the array
and using assume-clean but since it's right in the middle of a reshape
I'm not inclined to try.

Critical stuff is of course backed up, but there is some user data not
covered by backups that I'd like to try and restore if at all possible.
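One way to line up the fields that matter for re-assembly across all the
members (a rough sketch against the full --examine output pasted below,
using the same shell glob as the grep above):

  # device name, last update, event count and reshape position per member
  mdadm --examine /dev/sd[a-h]1 | egrep '^/dev|Update Time|Events|Reshape'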
Thanks

root@raven:/# mdadm --examine /dev/sd[a-h]1
/dev/sda1:
          Magic : a92b4efc
        Version : 0.91.00
           UUID : 9a76d1bd:2aabd685:1fc5fe0e:7751cfd7 (local to host raven)
  Creation Time : Tue Jul 12 23:05:01 2011
     Raid Level : raid6
  Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
     Array Size : 7814047744 (7452.06 GiB 8001.58 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 0

  Reshape pos'n : 307740672 (293.48 GiB 315.13 GB)
     New Layout : left-symmetric

    Update Time : Tue Feb  7 09:32:29 2012
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 1
  Spare Devices : 0
       Checksum : 3c0c8563 - correct
         Events : 1848341

         Layout : left-symmetric-6
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     2       8       17        2      active sync   /dev/sdb1

   0     0       0        0        0      removed
   1     1       8       33        1      active sync   /dev/sdc1
   2     2       8       17        2      active sync   /dev/sdb1
   3     3       0        0        3      faulty removed
   4     4       8       81        4      active sync   /dev/sdf1
   5     5       8       65        5      active   /dev/sde1

/dev/sdb1:
          Magic : a92b4efc
        Version : 0.91.00
           UUID : 9a76d1bd:2aabd685:1fc5fe0e:7751cfd7 (local to host raven)
  Creation Time : Tue Jul 12 23:05:01 2011
     Raid Level : raid6
  Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
     Array Size : 7814047744 (7452.06 GiB 8001.58 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 0

  Reshape pos'n : 307740672 (293.48 GiB 315.13 GB)
     New Layout : left-symmetric

    Update Time : Tue Feb  7 09:32:29 2012
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 1
  Spare Devices : 0
       Checksum : 3c0c8571 - correct
         Events : 1848341

         Layout : left-symmetric-6
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8       33        1      active sync   /dev/sdc1

   0     0       0        0        0      removed
   1     1       8       33        1      active sync   /dev/sdc1
   2     2       8       17        2      active sync   /dev/sdb1
   3     3       0        0        3      faulty removed
   4     4       8       81        4      active sync   /dev/sdf1
   5     5       8       65        5      active   /dev/sde1

/dev/sdc1:
          Magic : a92b4efc
        Version : 0.91.00
           UUID : 9a76d1bd:2aabd685:1fc5fe0e:7751cfd7 (local to host raven)
  Creation Time : Tue Jul 12 23:05:01 2011
     Raid Level : raid6
  Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
     Array Size : 7814047744 (7452.06 GiB 8001.58 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 0

  Reshape pos'n : 307740672 (293.48 GiB 315.13 GB)
     New Layout : left-symmetric

    Update Time : Tue Feb  7 07:12:01 2012
          State : clean
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 3c0c6478 - correct
         Events : 1848333

         Layout : left-symmetric-6
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     3       8       49        3      active sync   /dev/sdd1

   0     0       0        0        0      removed
   1     1       8       33        1      active sync   /dev/sdc1
   2     2       8       17        2      active sync   /dev/sdb1
   3     3       8       49        3      active sync   /dev/sdd1
   4     4       8       81        4      active sync   /dev/sdf1
   5     5       8       65        5      active   /dev/sde1

/dev/sdd1:
          Magic : a92b4efc
        Version : 0.91.00
           UUID : 9a76d1bd:2aabd685:1fc5fe0e:7751cfd7 (local to host raven)
  Creation Time : Tue Jul 12 23:05:01 2011
     Raid Level : raid6
  Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
     Array Size : 7814047744 (7452.06 GiB 8001.58 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 0

  Reshape pos'n : 307740672 (293.48 GiB 315.13 GB)
     New Layout : left-symmetric

    Update Time : Tue Feb  7 09:32:29 2012
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 1
  Spare Devices : 0
       Checksum : 3c0c8595 - correct
         Events : 1848341

         Layout : left-symmetric-6
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     5       8       65        5      active   /dev/sde1

   0     0       0        0        0      removed
   1     1       8       33        1      active sync   /dev/sdc1
   2     2       8       17        2      active sync   /dev/sdb1
   3     3       0        0        3      faulty removed
   4     4       8       81        4      active sync   /dev/sdf1
   5     5       8       65        5      active   /dev/sde1

/dev/sdf1:
          Magic : a92b4efc
        Version : 0.91.00
           UUID : 9a76d1bd:2aabd685:1fc5fe0e:7751cfd7 (local to host raven)
  Creation Time : Tue Jul 12 23:05:01 2011
     Raid Level : raid6
  Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
     Array Size : 7814047744 (7452.06 GiB 8001.58 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 0

  Reshape pos'n : 307740672 (293.48 GiB 315.13 GB)
     New Layout : left-symmetric

    Update Time : Tue Feb  7 09:32:29 2012
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 1
  Spare Devices : 0
       Checksum : 3c0c85a7 - correct
         Events : 1848341

         Layout : left-symmetric-6
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     4       8       81        4      active sync   /dev/sdf1

   0     0       0        0        0      removed
   1     1       8       33        1      active sync   /dev/sdc1
   2     2       8       17        2      active sync   /dev/sdb1
   3     3       0        0        3      faulty removed
   4     4       8       81        4      active sync   /dev/sdf1
   5     5       8       65        5      active   /dev/sde1

/dev/sdg1:
          Magic : a92b4efc
        Version : 0.91.00
           UUID : 9a76d1bd:2aabd685:1fc5fe0e:7751cfd7 (local to host raven)
  Creation Time : Tue Jul 12 23:05:01 2011
     Raid Level : raid6
  Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
     Array Size : 7814047744 (7452.06 GiB 8001.58 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 0

  Reshape pos'n : 307740672 (293.48 GiB 315.13 GB)
     New Layout : left-symmetric

    Update Time : Tue Feb  7 01:06:46 2012
          State : clean
 Active Devices : 6
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 3c09c1d2 - correct
         Events : 1772921

         Layout : left-symmetric-6
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8       97        0      active sync   /dev/sdg1

   0     0       8       97        0      active sync   /dev/sdg1
   1     1       8       33        1      active sync   /dev/sdc1
   2     2       8       17        2      active sync   /dev/sdb1
   3     3       8       49        3      active sync   /dev/sdd1
   4     4       8       81        4      active sync   /dev/sdf1
   5     5       8       65        5      active   /dev/sde1

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread
* Re: Please Help! RAID5 -> 6 reshapre gone bad 2012-02-07 1:34 Please Help! RAID5 -> 6 reshapre gone bad Richard Herd @ 2012-02-07 2:15 ` Phil Turmel [not found] ` <CAOANJV955ZdLexRTjVkQzTMapAaMitq5eqxP0rUvDjjLh4Wgzw@mail.gmail.com> 2012-02-07 2:39 ` NeilBrown 1 sibling, 1 reply; 27+ messages in thread From: Phil Turmel @ 2012-02-07 2:15 UTC (permalink / raw) To: Richard Herd; +Cc: linux-raid Hi Richard, On 02/06/2012 08:34 PM, Richard Herd wrote: > Hey guys, > > I'm in a bit of a pickle here and if any mdadm kings could step in and > throw some advice my way I'd be very grateful :-) > > Quick bit of background - little NAS based on an AMD E350 running > Ubuntu 10.04. Running a software RAID 5 from 5x2TB disks. Every few > months one of the drives would fail a request and get kicked from the > array (as is becoming common for these larger multi TB drives they > tolerate the occasional bad sector by reallocating from a pool of > spares (but that's a whole other story)). This happened across a > variety of brands and two different controllers. I'd simply add the > disk that got popped back in and let it re-sync. SMART tests always > in good health. Some more detail on the actual devices would help, especially the output of lsdrv [1] to document what device serial numbers are which, for future reference. I also suspect you have problems with your drive's error recovery control, also known as time-limited error recovery. Simple sector errors should *not* be kicking out your drives. Mdadm knows to reconstruct from parity and rewrite when a read error is encountered. That either succeeds directly, or causes the drive to remap. You say that the SMART tests are good, so read errors are probably escalating into link timeouts, and the drive ignores the attempt to reconstruct. *That* kicks the drive out. "smartctl -x" reports for all of your drives would help identify if you have this problem. You *cannot* safely run raid arrays with drives that don't (or won't) report errors in a timely fashion (a few seconds). > It did make me nervous though. So I decided I'd add a second disk for > a bit of extra redundancy, making the array a RAID 6 - the thinking > was the occasional disk getting kicked and re-added from a RAID 6 > array wouldn't present as much risk as a single disk getting kicked > from a RAID 5. > > So first off, I added the 6th disk as a hotspare to the RAID5 array. > So I now had my 5 disk RAID 5 + hotspare. > > I then found that mdadm 2.6.7 (in the repositories) isn't actually > capable of a 5->6 reshape. So I pulled the latest 3.2.3 sources and > compiled myself a new version of mdadm. > > With the newer version of mdadm, it was happy to do the reshape - so I > set it off on it's merry way, using an esata HD (mounted at /usb :-P) > for the backupfile: > > root@raven:/# mdadm --grow /dev/md0 --level=6 --raid-devices=6 > --backup-file=/usb/md0.backup > > It would take a week to reshape, but it was ona UPS & happily ticking > along. The array would be online the whole time so I was in no rush. > Content, I went to get some shut-eye. > > I got up this morning and took a quick look in /proc/mdstat to see how > things were going and saw things had failed spectacularly. At least > two disks had been kicked from the array and the whole thing had > crumbled. Do you still have the dmesg for this? > Ouch. 
> > I tried to assembe the array, to see if it would continue the reshape: > > root@raven:/# mdadm -Avv --backup-file=/usb/md0.backup /dev/md0 > /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sdf1 /dev/sdg1 > > Unfortunately mdadm had decided that the backup-file was out of date > (timestamps didn't match) and was erroring with: Failed to restore > critical section for reshape, sorry.. > > Chances are things were in such a mess that backup file wasn't going > to be used anyway, so I blocked the timestamp check with: export > MDADM_GROW_ALLOW_OLD=1 > > That allowed me to assemble the array, but not run it as there were > not enough disks to start it. > > This is the current state of the array: > > Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] > [raid4] [raid10] > md0 : inactive sdb1[1] sdd1[5] sdf1[4] sda1[2] > 7814047744 blocks super 0.91 > > unused devices: <none> > > root@raven:/# mdadm --detail /dev/md0 > /dev/md0: > Version : 0.91 > Creation Time : Tue Jul 12 23:05:01 2011 > Raid Level : raid6 > Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB) > Raid Devices : 6 > Total Devices : 4 > Preferred Minor : 0 > Persistence : Superblock is persistent > > Update Time : Tue Feb 7 09:32:29 2012 > State : active, FAILED, Not Started > Active Devices : 3 > Working Devices : 4 > Failed Devices : 0 > Spare Devices : 1 > > Layout : left-symmetric-6 > Chunk Size : 64K > > New Layout : left-symmetric > > UUID : 9a76d1bd:2aabd685:1fc5fe0e:7751cfd7 (local to host raven) > Events : 0.1848341 > > Number Major Minor RaidDevice State > 0 0 0 0 removed > 1 8 17 1 active sync /dev/sdb1 > 2 8 1 2 active sync /dev/sda1 > 3 0 0 3 removed > 4 8 81 4 active sync /dev/sdf1 > 5 8 49 5 spare rebuilding /dev/sdd1 > > The two removed disks: > [ 3020.998529] md: kicking non-fresh sdc1 from array! > [ 3021.012672] md: kicking non-fresh sdg1 from array! > > Attempted to re-add the disks (same for both): > root@raven:/# mdadm /dev/md0 --add /dev/sdg1 > mdadm: /dev/sdg1 reports being an active member for /dev/md0, but a > --re-add fails. > mdadm: not performing --add as that would convert /dev/sdg1 in to a spare. > mdadm: To make this a spare, use "mdadm --zero-superblock /dev/sdg1" first. > > With a failed array the last thing we want to do is add spares and > trigger a resync so obviously I haven't zeroed the superblocks and > added yet. That would be catastrophic. > Checked and two disks really are out of sync: > root@raven:/# mdadm --examine /dev/sd[a-h]1 | grep Event > Events : 1848341 > Events : 1848341 > Events : 1848333 > Events : 1848341 > Events : 1848341 > Events : 1772921 So /dev/sdg1 dropped out first, and /dev/sdc1 followed and killed the array. > I'll post the output of --examine on all the disks below - if anyone > has any advice I'd really appreciate it (Neil Brown doesn't read these > forums does he?!?). I would usually move next to recreating the array > and using assume-clean but since it's right in the middle of a reshape > I'm not inclined to try. Neil absolutely reads this mailing list, and is likely to pitch in if I don't offer precisely correct advice :-) He's in an Australian time zone though, so latency might vary. I'm on the U.S. east coast, fwiw. In any case, with a re-shape in progress, "--create --assume-clean" is not an option. > Critical stuff is of course backed up, but there is some user data not > covered by backups that I'd like to try and restore if at all > possible. Hope is not all lost. 
If we can get your ERC adjusted, the next step would be to disconnect
/dev/sdg from the system, and assemble with --force and
MDADM_GROW_ALLOW_OLD=1.

That'll let the reshape finish, leaving you with a single-degraded
raid6.  Then you fsck and make critical backups.  Then you
--zero-superblock and --add /dev/sdg.

If your drives don't support ERC, I can't recommend you continue until
you've ddrescue'd your drives onto new ones that do support ERC.

HTH,

Phil

[1] http://github.com/pturmel/lsdrv

^ permalink raw reply	[flat|nested] 27+ messages in thread
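For reference, the ERC check/adjustment Phil mentions is usually done
with smartctl along these lines (sdX is a placeholder for each member
disk; whether the scterc commands are accepted at all depends on the
drive, and the values are in tenths of a second):

  # query the drive's SCT error recovery control settings, if supported
  smartctl -l scterc /dev/sdX

  # limit read and write error recovery to 7 seconds each
  smartctl -l scterc,70,70 /dev/sdX

  # if the drive refuses ERC, raising the kernel's command timeout
  # (default 30s) is the usual fallback so the drive isn't reset
  # mid-recovery
  echo 180 > /sys/block/sdX/device/timeout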
[parent not found: <CAOANJV955ZdLexRTjVkQzTMapAaMitq5eqxP0rUvDjjLh4Wgzw@mail.gmail.com>]
* Re: Please Help! RAID5 -> 6 reshapre gone bad [not found] ` <CAOANJV955ZdLexRTjVkQzTMapAaMitq5eqxP0rUvDjjLh4Wgzw@mail.gmail.com> @ 2012-02-07 2:57 ` Phil Turmel 2012-02-07 3:10 ` Richard Herd 2012-02-07 3:24 ` Keith Keller 2012-02-07 3:04 ` Fwd: " Richard Herd 1 sibling, 2 replies; 27+ messages in thread From: Phil Turmel @ 2012-02-07 2:57 UTC (permalink / raw) To: Richard Herd, linux-raid@vger.kernel.org Hi Richard, [restored CC list... please use reply-to-all on kernel.org lists] On 02/06/2012 09:40 PM, Richard Herd wrote: > Hi Phil, > > Thanks for the swift response :-) Also I'm in (what I'd like to say > but can't - sunny) Sydney... > > OK, without slathering this thread is smart reports I can quite > definitely say you are exactly nail-on-the-head with regard to the > read errors escalating into link timeouts. This is exactly what is > happening. I had thought this was actually a pretty common setup for > home users (eg mdadm and drives such as WD20EARS/ST2000s) - I have the > luxury of budgets for Netapp kit at work - unfortunately my personal > finances only stretch to an ITX case and a bunch of cheap HDs! I understand the constraints, as I pinch pennies at home and at the office (I own my engineering firm). I've made do with cheap desktop drives that do support ERC. I got burned when Seagate dropped ERC on their latest desktop drives. Hitachi Deskstar is the only affordable model on the market that still support ERC. > I understand it's the ERC causing disks to get kicked, and fully > understand if you can't help further. Not that I won't help, as there's no risk to me :-) > Assembling without sdg I'm not sure will do it, as what we have is 4 > disks with the same events counter (3 active sync (sda/sdb/sdf), 1 > spare rebuilding (sdd)), and 2 (sdg/sdc) removed with older event > counters. Leaving out sdg leaves us with sdc which has an event > counter of 1848333. As the 3 active sync (sda/sdb/sdf) + 1 spare > (sdd) have an event counter of 1848341, mdadm doesn't want to let me > use sdc in the array even with --force. This surprises me. The purpose of "--force" with assemble is to ignore the event count. Have you tried this with the newer mdadm you compiled? > As you say as it's in the middle of a reshape so a recreate is out. > > I'm considering data loss is a given at this point, but even being > able to bring the array online degraded and pull out whatever is still > intact would help. > > If you have any further suggestions that would be great, but I do > understand your position on ERC and thank you for your input :-) Please do retry the --assemble --force with /dev/sdg left out? I'll leave the balance of your response untrimmed for the list to see. 
Phil > Feb 7 01:07:16 raven kernel: [18891.989330] ata8: hard resetting link > Feb 7 01:07:22 raven kernel: [18897.356104] ata8: link is slow to > respond, please be patient (ready=0) > Feb 7 01:07:26 raven kernel: [18902.004280] ata8: hard resetting link > Feb 7 01:07:32 raven kernel: [18907.372104] ata8: link is slow to > respond, please be patient (ready=0) > Feb 7 01:07:36 raven kernel: [18912.020097] ata8: SATA link up 6.0 > Gbps (SStatus 133 SControl 300) > Feb 7 01:07:41 raven kernel: [18917.020093] ata8.00: qc timeout (cmd 0xec) > Feb 7 01:07:41 raven kernel: [18917.028074] ata8.00: failed to > IDENTIFY (I/O error, err_mask=0x4) > Feb 7 01:07:41 raven kernel: [18917.028310] ata8: hard resetting link > Feb 7 01:07:47 raven kernel: [18922.396089] ata8: link is slow to > respond, please be patient (ready=0) > Feb 7 01:07:51 raven kernel: [18927.044313] ata8: hard resetting link > Feb 7 01:07:56 raven kernel: [18932.020099] ata8: SATA link up 6.0 > Gbps (SStatus 133 SControl 300) > Feb 7 01:08:06 raven kernel: [18942.020048] ata8.00: qc timeout (cmd 0xec) > Feb 7 01:08:06 raven kernel: [18942.028075] ata8.00: failed to > IDENTIFY (I/O error, err_mask=0x4) > Feb 7 01:08:06 raven kernel: [18942.028307] ata8: limiting SATA link > speed to 3.0 Gbps > Feb 7 01:08:06 raven kernel: [18942.028321] ata8: hard resetting link > Feb 7 01:08:12 raven kernel: [18947.396108] ata8: link is slow to > respond, please be patient (ready=0) > Feb 7 01:08:16 raven kernel: [18951.988069] ata8: SATA link up 6.0 > Gbps (SStatus 133 SControl 320) > Feb 7 01:08:46 raven kernel: [18981.988104] ata8.00: qc timeout (cmd 0xec) > Feb 7 01:08:46 raven kernel: [18981.996070] ata8.00: failed to > IDENTIFY (I/O error, err_mask=0x4) > Feb 7 01:08:46 raven kernel: [18981.996302] ata8.00: disabled > Feb 7 01:08:46 raven kernel: [18981.996324] ata8.00: device reported > invalid CHS sector 0 > Feb 7 01:08:46 raven kernel: [18981.996348] ata8: hard resetting link > Feb 7 01:08:52 raven kernel: [18987.364104] ata8: link is slow to > respond, please be patient (ready=0) > Feb 7 01:08:56 raven kernel: [18992.012050] ata8: SATA link up 6.0 > Gbps (SStatus 133 SControl 320) > Feb 7 01:08:56 raven kernel: [18992.012114] ata8: EH complete > Feb 7 01:08:56 raven kernel: [18992.012158] sd 8:0:0:0: [sdg] > Unhandled error code > Feb 7 01:08:56 raven kernel: [18992.012165] sd 8:0:0:0: [sdg] Result: > hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK > Feb 7 01:08:56 raven kernel: [18992.012176] sd 8:0:0:0: [sdg] CDB: > Write(10): 2a 00 e8 e0 74 3f 00 00 08 00 > Feb 7 01:08:56 raven kernel: [18992.012696] md: super_written gets > error=-5, uptodate=0 > Feb 7 01:08:56 raven kernel: [18992.013169] sd 8:0:0:0: [sdg] > Unhandled error code > Feb 7 01:08:56 raven kernel: [18992.013176] sd 8:0:0:0: [sdg] Result: > hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK > Feb 7 01:08:56 raven kernel: [18992.013186] sd 8:0:0:0: [sdg] CDB: > Read(10): 28 00 04 9d bd bf 00 00 80 00 > Feb 7 01:08:56 raven kernel: [18992.276986] sd 8:0:0:0: [sdg] > Unhandled error code > Feb 7 01:08:56 raven kernel: [18992.276999] sd 8:0:0:0: [sdg] Result: > hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK > Feb 7 01:08:56 raven kernel: [18992.277012] sd 8:0:0:0: [sdg] CDB: > Read(10): 28 00 04 9d be 3f 00 00 80 00 > Feb 7 01:08:56 raven kernel: [18992.316919] sd 8:0:0:0: [sdg] > Unhandled error code > Feb 7 01:08:56 raven kernel: [18992.316930] sd 8:0:0:0: [sdg] Result: > hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK > Feb 7 01:08:56 raven kernel: [18992.316942] sd 8:0:0:0: [sdg] CDB: > 
Read(10): 28 00 04 9d be bf 00 00 80 00 > Feb 7 01:08:56 raven kernel: [18992.326906] sd 8:0:0:0: [sdg] > Unhandled error code > Feb 7 01:08:56 raven kernel: [18992.326920] sd 8:0:0:0: [sdg] Result: > hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK > Feb 7 01:08:56 raven kernel: [18992.326932] sd 8:0:0:0: [sdg] CDB: > Read(10): 28 00 04 9d bf 3f 00 00 80 00 > Feb 7 01:08:56 raven kernel: [18992.327944] sd 8:0:0:0: [sdg] > Unhandled error code > Feb 7 01:08:56 raven kernel: [18992.327956] sd 8:0:0:0: [sdg] Result: > hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK > Feb 7 01:08:56 raven kernel: [18992.327968] sd 8:0:0:0: [sdg] CDB: > Read(10): 28 00 04 9d bf bf 00 00 80 00 > Feb 7 01:08:57 raven kernel: [18992.555093] md: md0: reshape done. > Feb 7 01:08:57 raven kernel: [18992.607595] md: reshape of RAID array md0 > Feb 7 01:08:57 raven kernel: [18992.607606] md: minimum _guaranteed_ > speed: 200000 KB/sec/disk. > Feb 7 01:08:57 raven kernel: [18992.607614] md: using maximum > available idle IO bandwidth (but not more than 200000 KB/sec) for > reshape. > Feb 7 01:08:57 raven kernel: [18992.607628] md: using 128k window, > over a total of 1953511936 blocks. > Feb 7 06:41:02 raven rsyslogd: [origin software="rsyslogd" > swVersion="4.2.0" x-pid="911" x-info="http://www.rsyslog.com"] > rsyslogd was HUPed, type 'lightweight'. > Feb 7 07:12:32 raven kernel: [40807.989092] ata5: hard resetting link > Feb 7 07:12:38 raven kernel: [40813.524074] ata5: SATA link up 6.0 > Gbps (SStatus 133 SControl 300) > Feb 7 07:12:43 raven kernel: [40818.524106] ata5.00: qc timeout (cmd 0xec) > Feb 7 07:12:43 raven kernel: [40818.524126] ata5.00: failed to > IDENTIFY (I/O error, err_mask=0x4) > Feb 7 07:12:43 raven kernel: [40818.532788] ata5: hard resetting link > Feb 7 07:12:48 raven kernel: [40824.058039] ata5: SATA link up 6.0 > Gbps (SStatus 133 SControl 300) > Feb 7 07:12:58 raven kernel: [40834.056101] ata5.00: qc timeout (cmd 0xec) > Feb 7 07:12:58 raven kernel: [40834.056121] ata5.00: failed to > IDENTIFY (I/O error, err_mask=0x4) > Feb 7 07:12:58 raven kernel: [40834.064203] ata5: limiting SATA link > speed to 3.0 Gbps > Feb 7 07:12:58 raven kernel: [40834.064217] ata5: hard resetting link > Feb 7 07:13:04 raven kernel: [40839.592095] ata5: SATA link up 3.0 > Gbps (SStatus 123 SControl 320) > Feb 7 07:13:34 raven kernel: [40869.592088] ata5.00: qc timeout (cmd 0xec) > Feb 7 07:13:34 raven kernel: [40869.592110] ata5.00: failed to > IDENTIFY (I/O error, err_mask=0x4) > Feb 7 07:13:34 raven kernel: [40869.599676] ata5.00: disabled > Feb 7 07:13:34 raven kernel: [40869.599700] ata5.00: device reported > invalid CHS sector 0 > Feb 7 07:13:34 raven kernel: [40869.599724] ata5: hard resetting link > Feb 7 07:13:39 raven kernel: [40875.124128] ata5: SATA link up 3.0 > Gbps (SStatus 123 SControl 320) > Feb 7 07:13:39 raven kernel: [40875.124201] ata5: EH complete > Feb 7 07:13:39 raven kernel: [40875.124243] sd 4:0:0:0: [sdd] > Unhandled error code > Feb 7 07:13:39 raven kernel: [40875.124251] sd 4:0:0:0: [sdd] Result: > hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK > Feb 7 07:13:39 raven kernel: [40875.124262] sd 4:0:0:0: [sdd] CDB: > Write(10): 2a 00 e8 e0 74 3f 00 00 08 00 > Feb 7 07:13:39 raven kernel: [40875.135544] md: super_written gets > error=-5, uptodate=0 > Feb 7 07:13:39 raven kernel: [40875.152171] sd 4:0:0:0: [sdd] > Unhandled error code > Feb 7 07:13:39 raven kernel: [40875.152179] sd 4:0:0:0: [sdd] Result: > hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK > Feb 7 07:13:39 raven kernel: [40875.152189] sd 
4:0:0:0: [sdd] CDB: > Read(10): 28 00 09 2b f2 3f 00 00 80 00 > Feb 7 07:13:41 raven kernel: [40876.734504] md: md0: reshape done. > Feb 7 07:13:41 raven kernel: [40876.736298] lost page write due to > I/O error on md0 > Feb 7 07:13:41 raven kernel: [40876.743529] lost page write due to > I/O error on md0 > Feb 7 07:13:41 raven kernel: [40876.750009] lost page write due to > I/O error on md0 > Feb 7 07:13:41 raven kernel: [40876.755143] lost page write due to > I/O error on md0 > Feb 7 07:13:41 raven kernel: [40876.760126] lost page write due to > I/O error on md0 > Feb 7 07:13:41 raven kernel: [40876.765070] lost page write due to > I/O error on md0 > Feb 7 07:13:41 raven kernel: [40876.769890] lost page write due to > I/O error on md0 > Feb 7 07:13:41 raven kernel: [40876.774759] lost page write due to > I/O error on md0 > Feb 7 07:13:41 raven kernel: [40876.779456] lost page write due to > I/O error on md0 > Feb 7 07:13:41 raven kernel: [40876.784166] lost page write due to > I/O error on md0 > Feb 7 07:13:41 raven kernel: [40876.788773] JBD: Detected IO errors > while flushing file data on md0 > Feb 7 07:13:41 raven kernel: [40876.796386] JBD: Detected IO errors > while flushing file data on md0 > > On Tue, Feb 7, 2012 at 1:15 PM, Phil Turmel <philip@turmel.org> wrote: >> Hi Richard, >> >> On 02/06/2012 08:34 PM, Richard Herd wrote: >>> Hey guys, >>> >>> I'm in a bit of a pickle here and if any mdadm kings could step in and >>> throw some advice my way I'd be very grateful :-) >>> >>> Quick bit of background - little NAS based on an AMD E350 running >>> Ubuntu 10.04. Running a software RAID 5 from 5x2TB disks. Every few >>> months one of the drives would fail a request and get kicked from the >>> array (as is becoming common for these larger multi TB drives they >>> tolerate the occasional bad sector by reallocating from a pool of >>> spares (but that's a whole other story)). This happened across a >>> variety of brands and two different controllers. I'd simply add the >>> disk that got popped back in and let it re-sync. SMART tests always >>> in good health. >> >> Some more detail on the actual devices would help, especially the >> output of lsdrv [1] to document what device serial numbers are which, >> for future reference. >> >> I also suspect you have problems with your drive's error recovery >> control, also known as time-limited error recovery. Simple sector >> errors should *not* be kicking out your drives. Mdadm knows to >> reconstruct from parity and rewrite when a read error is encountered. >> That either succeeds directly, or causes the drive to remap. >> >> You say that the SMART tests are good, so read errors are probably >> escalating into link timeouts, and the drive ignores the attempt to >> reconstruct. *That* kicks the drive out. >> >> "smartctl -x" reports for all of your drives would help identify if >> you have this problem. You *cannot* safely run raid arrays with drives >> that don't (or won't) report errors in a timely fashion (a few seconds). >> >>> It did make me nervous though. So I decided I'd add a second disk for >>> a bit of extra redundancy, making the array a RAID 6 - the thinking >>> was the occasional disk getting kicked and re-added from a RAID 6 >>> array wouldn't present as much risk as a single disk getting kicked >>> from a RAID 5. >>> >>> So first off, I added the 6th disk as a hotspare to the RAID5 array. >>> So I now had my 5 disk RAID 5 + hotspare. 
>>> >>> I then found that mdadm 2.6.7 (in the repositories) isn't actually >>> capable of a 5->6 reshape. So I pulled the latest 3.2.3 sources and >>> compiled myself a new version of mdadm. >>> >>> With the newer version of mdadm, it was happy to do the reshape - so I >>> set it off on it's merry way, using an esata HD (mounted at /usb :-P) >>> for the backupfile: >>> >>> root@raven:/# mdadm --grow /dev/md0 --level=6 --raid-devices=6 >>> --backup-file=/usb/md0.backup >>> >>> It would take a week to reshape, but it was ona UPS & happily ticking >>> along. The array would be online the whole time so I was in no rush. >>> Content, I went to get some shut-eye. >>> >>> I got up this morning and took a quick look in /proc/mdstat to see how >>> things were going and saw things had failed spectacularly. At least >>> two disks had been kicked from the array and the whole thing had >>> crumbled. >> >> Do you still have the dmesg for this? >> >>> Ouch. >>> >>> I tried to assembe the array, to see if it would continue the reshape: >>> >>> root@raven:/# mdadm -Avv --backup-file=/usb/md0.backup /dev/md0 >>> /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sdf1 /dev/sdg1 >>> >>> Unfortunately mdadm had decided that the backup-file was out of date >>> (timestamps didn't match) and was erroring with: Failed to restore >>> critical section for reshape, sorry.. >>> >>> Chances are things were in such a mess that backup file wasn't going >>> to be used anyway, so I blocked the timestamp check with: export >>> MDADM_GROW_ALLOW_OLD=1 >>> >>> That allowed me to assemble the array, but not run it as there were >>> not enough disks to start it. >>> >>> This is the current state of the array: >>> >>> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] >>> [raid4] [raid10] >>> md0 : inactive sdb1[1] sdd1[5] sdf1[4] sda1[2] >>> 7814047744 blocks super 0.91 >>> >>> unused devices: <none> >>> >>> root@raven:/# mdadm --detail /dev/md0 >>> /dev/md0: >>> Version : 0.91 >>> Creation Time : Tue Jul 12 23:05:01 2011 >>> Raid Level : raid6 >>> Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB) >>> Raid Devices : 6 >>> Total Devices : 4 >>> Preferred Minor : 0 >>> Persistence : Superblock is persistent >>> >>> Update Time : Tue Feb 7 09:32:29 2012 >>> State : active, FAILED, Not Started >>> Active Devices : 3 >>> Working Devices : 4 >>> Failed Devices : 0 >>> Spare Devices : 1 >>> >>> Layout : left-symmetric-6 >>> Chunk Size : 64K >>> >>> New Layout : left-symmetric >>> >>> UUID : 9a76d1bd:2aabd685:1fc5fe0e:7751cfd7 (local to host raven) >>> Events : 0.1848341 >>> >>> Number Major Minor RaidDevice State >>> 0 0 0 0 removed >>> 1 8 17 1 active sync /dev/sdb1 >>> 2 8 1 2 active sync /dev/sda1 >>> 3 0 0 3 removed >>> 4 8 81 4 active sync /dev/sdf1 >>> 5 8 49 5 spare rebuilding /dev/sdd1 >>> >>> The two removed disks: >>> [ 3020.998529] md: kicking non-fresh sdc1 from array! >>> [ 3021.012672] md: kicking non-fresh sdg1 from array! >>> >>> Attempted to re-add the disks (same for both): >>> root@raven:/# mdadm /dev/md0 --add /dev/sdg1 >>> mdadm: /dev/sdg1 reports being an active member for /dev/md0, but a >>> --re-add fails. >>> mdadm: not performing --add as that would convert /dev/sdg1 in to a spare. >>> mdadm: To make this a spare, use "mdadm --zero-superblock /dev/sdg1" first. >>> >>> With a failed array the last thing we want to do is add spares and >>> trigger a resync so obviously I haven't zeroed the superblocks and >>> added yet. >> >> That would be catastrophic. 
>> >>> Checked and two disks really are out of sync: >>> root@raven:/# mdadm --examine /dev/sd[a-h]1 | grep Event >>> Events : 1848341 >>> Events : 1848341 >>> Events : 1848333 >>> Events : 1848341 >>> Events : 1848341 >>> Events : 1772921 >> >> So /dev/sdg1 dropped out first, and /dev/sdc1 followed and killed the >> array. >> >>> I'll post the output of --examine on all the disks below - if anyone >>> has any advice I'd really appreciate it (Neil Brown doesn't read these >>> forums does he?!?). I would usually move next to recreating the array >>> and using assume-clean but since it's right in the middle of a reshape >>> I'm not inclined to try. >> >> Neil absolutely reads this mailing list, and is likely to pitch in if >> I don't offer precisely correct advice :-) >> >> He's in an Australian time zone though, so latency might vary. I'm on the >> U.S. east coast, fwiw. >> >> In any case, with a re-shape in progress, "--create --assume-clean" is >> not an option. >> >>> Critical stuff is of course backed up, but there is some user data not >>> covered by backups that I'd like to try and restore if at all >>> possible. >> >> Hope is not all lost. If we can get your ERC adjusted, the next step >> would be to disconnect /dev/sdg from the system, and assemble with >> --force and MDADM_GROW_ALLOW_OLD=1 >> >> That'll let the reshape finish, leaving you with a single-degraded >> raid6. Then you fsck and make critical backups. Then you --zero- and >> --add /dev/sdg. >> >> If your drives don't support ERC, I can't recommend you continue until >> you've ddrescue'd your drives onto new ones that do support ERC. >> >> HTH, >> >> Phil >> >> [1] http://github.com/pturmel/lsdrv ^ permalink raw reply [flat|nested] 27+ messages in thread
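Spelled out, the sequence Phil is suggesting would look something like
the following (a sketch only: the initial --stop to clear the earlier
partial assembly is an assumption rather than something Phil stated,
and sdg is deliberately left off the device list):

  # clear the inactive, partially-assembled array from the earlier attempt
  mdadm --stop /dev/md0

  # assemble degraded without sdg, forcing the slightly-stale sdc back in
  export MDADM_GROW_ALLOW_OLD=1
  mdadm --assemble --force --verbose --backup-file=/usb/md0.backup /dev/md0 \
      /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sdf1

  # if it starts, let the reshape run to completion before anything else
  cat /proc/mdstat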
* Re: Please Help! RAID5 -> 6 reshape gone bad
  2012-02-07  2:57 ` Phil Turmel
@ 2012-02-07  3:10   ` Richard Herd
  2012-02-07  3:24   ` Keith Keller
  1 sibling, 0 replies; 27+ messages in thread
From: Richard Herd @ 2012-02-07  3:10 UTC (permalink / raw)
  To: Phil Turmel; +Cc: linux-raid@vger.kernel.org

Thanks again Phil.

To confirm:

root@raven:/# mdadm -Avv --force --backup-file=/usb/md0.backup /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sdf1

Results in the below, so even with --force it doesn't want to accept
'non-fresh' sdc.

mdadm: looking for devices for /dev/md0
mdadm: /dev/sda1 is identified as a member of /dev/md0, slot 2.
mdadm: /dev/sdb1 is identified as a member of /dev/md0, slot 1.
mdadm: /dev/sdc1 is identified as a member of /dev/md0, slot 3.
mdadm: /dev/sdd1 is identified as a member of /dev/md0, slot 5.
mdadm: /dev/sdf1 is identified as a member of /dev/md0, slot 4.
mdadm: /dev/md0 has an active reshape - checking if critical section needs to be restored
mdadm: accepting backup with timestamp 1328559119 for array with timestamp 1328567549
mdadm: restoring critical section
mdadm: no uptodate device for slot 0 of /dev/md0
mdadm: added /dev/sda1 to /dev/md0 as 2
mdadm: added /dev/sdc1 to /dev/md0 as 3
mdadm: added /dev/sdf1 to /dev/md0 as 4
mdadm: added /dev/sdd1 to /dev/md0 as 5
mdadm: added /dev/sdb1 to /dev/md0 as 1
mdadm: failed to RUN_ARRAY /dev/md0: Input/output error

And dmesg shows:

[11595.863451] md: bind<sda1>
[11595.863972] md: bind<sdc1>
[11595.865341] md: bind<sdf1>
[11595.869893] md: bind<sdd1>
[11595.870891] md: bind<sdb1>
[11595.871357] md: kicking non-fresh sdc1 from array!
[11595.871370] md: unbind<sdc1>
[11595.880072] md: export_rdev(sdc1)
[11595.882513] raid5: reshape will continue
[11595.882538] raid5: device sdb1 operational as raid disk 1
[11595.882542] raid5: device sdf1 operational as raid disk 4
[11595.882546] raid5: device sda1 operational as raid disk 2
[11595.883544] raid5: allocated 6308kB for md0
[11595.883627] 1: w=1 pa=18 pr=6 m=2 a=2 r=6 op1=0 op2=0
[11595.883633] 5: w=1 pa=18 pr=6 m=2 a=2 r=6 op1=1 op2=0
[11595.883637] 4: w=2 pa=18 pr=6 m=2 a=2 r=6 op1=0 op2=0
[11595.883642] 2: w=3 pa=18 pr=6 m=2 a=2 r=6 op1=0 op2=0
[11595.883645] raid5: not enough operational devices for md0 (3/6 failed)
[11595.891968] RAID5 conf printout:
[11595.891971]  --- rd:6 wd:3
[11595.891976]  disk 1, o:1, dev:sdb1
[11595.891979]  disk 2, o:1, dev:sda1
[11595.891983]  disk 4, o:1, dev:sdf1
[11595.891986]  disk 5, o:1, dev:sdd1
[11595.892520] raid5: failed to run raid set md0
[11595.900726] md: pers->run() failed ...

Cheers

On Tue, Feb 7, 2012 at 1:57 PM, Phil Turmel <philip@turmel.org> wrote:
> Hi Richard,
>
> [restored CC list... please use reply-to-all on kernel.org lists]
>
> On 02/06/2012 09:40 PM, Richard Herd wrote:
>> Hi Phil,
>>
>> Thanks for the swift response :-)  Also I'm in (what I'd like to say
>> but can't - sunny) Sydney...
>>
>> OK, without slathering this thread is smart reports I can quite
>> definitely say you are exactly nail-on-the-head with regard to the
>> read errors escalating into link timeouts.  This is exactly what is
>> happening.  I had thought this was actually a pretty common setup for
>> home users (eg mdadm and drives such as WD20EARS/ST2000s) - I have the
>> luxury of budgets for Netapp kit at work - unfortunately my personal
>> finances only stretch to an ITX case and a bunch of cheap HDs!
>
> I understand the constraints, as I pinch pennies at home and at the
> office (I own my engineering firm).
I've made do with cheap desktop > drives that do support ERC. I got burned when Seagate dropped ERC on > their latest desktop drives. Hitachi Deskstar is the only affordable > model on the market that still support ERC. > >> I understand it's the ERC causing disks to get kicked, and fully >> understand if you can't help further. > > Not that I won't help, as there's no risk to me :-) > >> Assembling without sdg I'm not sure will do it, as what we have is 4 >> disks with the same events counter (3 active sync (sda/sdb/sdf), 1 >> spare rebuilding (sdd)), and 2 (sdg/sdc) removed with older event >> counters. Leaving out sdg leaves us with sdc which has an event >> counter of 1848333. As the 3 active sync (sda/sdb/sdf) + 1 spare >> (sdd) have an event counter of 1848341, mdadm doesn't want to let me >> use sdc in the array even with --force. > > This surprises me. The purpose of "--force" with assemble is to > ignore the event count. Have you tried this with the newer mdadm > you compiled? > >> As you say as it's in the middle of a reshape so a recreate is out. >> >> I'm considering data loss is a given at this point, but even being >> able to bring the array online degraded and pull out whatever is still >> intact would help. >> >> If you have any further suggestions that would be great, but I do >> understand your position on ERC and thank you for your input :-) > > Please do retry the --assemble --force with /dev/sdg left out? > > I'll leave the balance of your response untrimmed for the list to see. > > Phil > > >> Feb 7 01:07:16 raven kernel: [18891.989330] ata8: hard resetting link >> Feb 7 01:07:22 raven kernel: [18897.356104] ata8: link is slow to >> respond, please be patient (ready=0) >> Feb 7 01:07:26 raven kernel: [18902.004280] ata8: hard resetting link >> Feb 7 01:07:32 raven kernel: [18907.372104] ata8: link is slow to >> respond, please be patient (ready=0) >> Feb 7 01:07:36 raven kernel: [18912.020097] ata8: SATA link up 6.0 >> Gbps (SStatus 133 SControl 300) >> Feb 7 01:07:41 raven kernel: [18917.020093] ata8.00: qc timeout (cmd 0xec) >> Feb 7 01:07:41 raven kernel: [18917.028074] ata8.00: failed to >> IDENTIFY (I/O error, err_mask=0x4) >> Feb 7 01:07:41 raven kernel: [18917.028310] ata8: hard resetting link >> Feb 7 01:07:47 raven kernel: [18922.396089] ata8: link is slow to >> respond, please be patient (ready=0) >> Feb 7 01:07:51 raven kernel: [18927.044313] ata8: hard resetting link >> Feb 7 01:07:56 raven kernel: [18932.020099] ata8: SATA link up 6.0 >> Gbps (SStatus 133 SControl 300) >> Feb 7 01:08:06 raven kernel: [18942.020048] ata8.00: qc timeout (cmd 0xec) >> Feb 7 01:08:06 raven kernel: [18942.028075] ata8.00: failed to >> IDENTIFY (I/O error, err_mask=0x4) >> Feb 7 01:08:06 raven kernel: [18942.028307] ata8: limiting SATA link >> speed to 3.0 Gbps >> Feb 7 01:08:06 raven kernel: [18942.028321] ata8: hard resetting link >> Feb 7 01:08:12 raven kernel: [18947.396108] ata8: link is slow to >> respond, please be patient (ready=0) >> Feb 7 01:08:16 raven kernel: [18951.988069] ata8: SATA link up 6.0 >> Gbps (SStatus 133 SControl 320) >> Feb 7 01:08:46 raven kernel: [18981.988104] ata8.00: qc timeout (cmd 0xec) >> Feb 7 01:08:46 raven kernel: [18981.996070] ata8.00: failed to >> IDENTIFY (I/O error, err_mask=0x4) >> Feb 7 01:08:46 raven kernel: [18981.996302] ata8.00: disabled >> Feb 7 01:08:46 raven kernel: [18981.996324] ata8.00: device reported >> invalid CHS sector 0 >> Feb 7 01:08:46 raven kernel: [18981.996348] ata8: hard resetting link >> Feb 7 01:08:52 raven 
kernel: [18987.364104] ata8: link is slow to >> respond, please be patient (ready=0) >> Feb 7 01:08:56 raven kernel: [18992.012050] ata8: SATA link up 6.0 >> Gbps (SStatus 133 SControl 320) >> Feb 7 01:08:56 raven kernel: [18992.012114] ata8: EH complete >> Feb 7 01:08:56 raven kernel: [18992.012158] sd 8:0:0:0: [sdg] >> Unhandled error code >> Feb 7 01:08:56 raven kernel: [18992.012165] sd 8:0:0:0: [sdg] Result: >> hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK >> Feb 7 01:08:56 raven kernel: [18992.012176] sd 8:0:0:0: [sdg] CDB: >> Write(10): 2a 00 e8 e0 74 3f 00 00 08 00 >> Feb 7 01:08:56 raven kernel: [18992.012696] md: super_written gets >> error=-5, uptodate=0 >> Feb 7 01:08:56 raven kernel: [18992.013169] sd 8:0:0:0: [sdg] >> Unhandled error code >> Feb 7 01:08:56 raven kernel: [18992.013176] sd 8:0:0:0: [sdg] Result: >> hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK >> Feb 7 01:08:56 raven kernel: [18992.013186] sd 8:0:0:0: [sdg] CDB: >> Read(10): 28 00 04 9d bd bf 00 00 80 00 >> Feb 7 01:08:56 raven kernel: [18992.276986] sd 8:0:0:0: [sdg] >> Unhandled error code >> Feb 7 01:08:56 raven kernel: [18992.276999] sd 8:0:0:0: [sdg] Result: >> hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK >> Feb 7 01:08:56 raven kernel: [18992.277012] sd 8:0:0:0: [sdg] CDB: >> Read(10): 28 00 04 9d be 3f 00 00 80 00 >> Feb 7 01:08:56 raven kernel: [18992.316919] sd 8:0:0:0: [sdg] >> Unhandled error code >> Feb 7 01:08:56 raven kernel: [18992.316930] sd 8:0:0:0: [sdg] Result: >> hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK >> Feb 7 01:08:56 raven kernel: [18992.316942] sd 8:0:0:0: [sdg] CDB: >> Read(10): 28 00 04 9d be bf 00 00 80 00 >> Feb 7 01:08:56 raven kernel: [18992.326906] sd 8:0:0:0: [sdg] >> Unhandled error code >> Feb 7 01:08:56 raven kernel: [18992.326920] sd 8:0:0:0: [sdg] Result: >> hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK >> Feb 7 01:08:56 raven kernel: [18992.326932] sd 8:0:0:0: [sdg] CDB: >> Read(10): 28 00 04 9d bf 3f 00 00 80 00 >> Feb 7 01:08:56 raven kernel: [18992.327944] sd 8:0:0:0: [sdg] >> Unhandled error code >> Feb 7 01:08:56 raven kernel: [18992.327956] sd 8:0:0:0: [sdg] Result: >> hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK >> Feb 7 01:08:56 raven kernel: [18992.327968] sd 8:0:0:0: [sdg] CDB: >> Read(10): 28 00 04 9d bf bf 00 00 80 00 >> Feb 7 01:08:57 raven kernel: [18992.555093] md: md0: reshape done. >> Feb 7 01:08:57 raven kernel: [18992.607595] md: reshape of RAID array md0 >> Feb 7 01:08:57 raven kernel: [18992.607606] md: minimum _guaranteed_ >> speed: 200000 KB/sec/disk. >> Feb 7 01:08:57 raven kernel: [18992.607614] md: using maximum >> available idle IO bandwidth (but not more than 200000 KB/sec) for >> reshape. >> Feb 7 01:08:57 raven kernel: [18992.607628] md: using 128k window, >> over a total of 1953511936 blocks. >> Feb 7 06:41:02 raven rsyslogd: [origin software="rsyslogd" >> swVersion="4.2.0" x-pid="911" x-info="http://www.rsyslog.com"] >> rsyslogd was HUPed, type 'lightweight'. 
>> Feb 7 07:12:32 raven kernel: [40807.989092] ata5: hard resetting link >> Feb 7 07:12:38 raven kernel: [40813.524074] ata5: SATA link up 6.0 >> Gbps (SStatus 133 SControl 300) >> Feb 7 07:12:43 raven kernel: [40818.524106] ata5.00: qc timeout (cmd 0xec) >> Feb 7 07:12:43 raven kernel: [40818.524126] ata5.00: failed to >> IDENTIFY (I/O error, err_mask=0x4) >> Feb 7 07:12:43 raven kernel: [40818.532788] ata5: hard resetting link >> Feb 7 07:12:48 raven kernel: [40824.058039] ata5: SATA link up 6.0 >> Gbps (SStatus 133 SControl 300) >> Feb 7 07:12:58 raven kernel: [40834.056101] ata5.00: qc timeout (cmd 0xec) >> Feb 7 07:12:58 raven kernel: [40834.056121] ata5.00: failed to >> IDENTIFY (I/O error, err_mask=0x4) >> Feb 7 07:12:58 raven kernel: [40834.064203] ata5: limiting SATA link >> speed to 3.0 Gbps >> Feb 7 07:12:58 raven kernel: [40834.064217] ata5: hard resetting link >> Feb 7 07:13:04 raven kernel: [40839.592095] ata5: SATA link up 3.0 >> Gbps (SStatus 123 SControl 320) >> Feb 7 07:13:34 raven kernel: [40869.592088] ata5.00: qc timeout (cmd 0xec) >> Feb 7 07:13:34 raven kernel: [40869.592110] ata5.00: failed to >> IDENTIFY (I/O error, err_mask=0x4) >> Feb 7 07:13:34 raven kernel: [40869.599676] ata5.00: disabled >> Feb 7 07:13:34 raven kernel: [40869.599700] ata5.00: device reported >> invalid CHS sector 0 >> Feb 7 07:13:34 raven kernel: [40869.599724] ata5: hard resetting link >> Feb 7 07:13:39 raven kernel: [40875.124128] ata5: SATA link up 3.0 >> Gbps (SStatus 123 SControl 320) >> Feb 7 07:13:39 raven kernel: [40875.124201] ata5: EH complete >> Feb 7 07:13:39 raven kernel: [40875.124243] sd 4:0:0:0: [sdd] >> Unhandled error code >> Feb 7 07:13:39 raven kernel: [40875.124251] sd 4:0:0:0: [sdd] Result: >> hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK >> Feb 7 07:13:39 raven kernel: [40875.124262] sd 4:0:0:0: [sdd] CDB: >> Write(10): 2a 00 e8 e0 74 3f 00 00 08 00 >> Feb 7 07:13:39 raven kernel: [40875.135544] md: super_written gets >> error=-5, uptodate=0 >> Feb 7 07:13:39 raven kernel: [40875.152171] sd 4:0:0:0: [sdd] >> Unhandled error code >> Feb 7 07:13:39 raven kernel: [40875.152179] sd 4:0:0:0: [sdd] Result: >> hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK >> Feb 7 07:13:39 raven kernel: [40875.152189] sd 4:0:0:0: [sdd] CDB: >> Read(10): 28 00 09 2b f2 3f 00 00 80 00 >> Feb 7 07:13:41 raven kernel: [40876.734504] md: md0: reshape done. 
>> Feb 7 07:13:41 raven kernel: [40876.736298] lost page write due to >> I/O error on md0 >> Feb 7 07:13:41 raven kernel: [40876.743529] lost page write due to >> I/O error on md0 >> Feb 7 07:13:41 raven kernel: [40876.750009] lost page write due to >> I/O error on md0 >> Feb 7 07:13:41 raven kernel: [40876.755143] lost page write due to >> I/O error on md0 >> Feb 7 07:13:41 raven kernel: [40876.760126] lost page write due to >> I/O error on md0 >> Feb 7 07:13:41 raven kernel: [40876.765070] lost page write due to >> I/O error on md0 >> Feb 7 07:13:41 raven kernel: [40876.769890] lost page write due to >> I/O error on md0 >> Feb 7 07:13:41 raven kernel: [40876.774759] lost page write due to >> I/O error on md0 >> Feb 7 07:13:41 raven kernel: [40876.779456] lost page write due to >> I/O error on md0 >> Feb 7 07:13:41 raven kernel: [40876.784166] lost page write due to >> I/O error on md0 >> Feb 7 07:13:41 raven kernel: [40876.788773] JBD: Detected IO errors >> while flushing file data on md0 >> Feb 7 07:13:41 raven kernel: [40876.796386] JBD: Detected IO errors >> while flushing file data on md0 >> >> On Tue, Feb 7, 2012 at 1:15 PM, Phil Turmel <philip@turmel.org> wrote: >>> Hi Richard, >>> >>> On 02/06/2012 08:34 PM, Richard Herd wrote: >>>> Hey guys, >>>> >>>> I'm in a bit of a pickle here and if any mdadm kings could step in and >>>> throw some advice my way I'd be very grateful :-) >>>> >>>> Quick bit of background - little NAS based on an AMD E350 running >>>> Ubuntu 10.04. Running a software RAID 5 from 5x2TB disks. Every few >>>> months one of the drives would fail a request and get kicked from the >>>> array (as is becoming common for these larger multi TB drives they >>>> tolerate the occasional bad sector by reallocating from a pool of >>>> spares (but that's a whole other story)). This happened across a >>>> variety of brands and two different controllers. I'd simply add the >>>> disk that got popped back in and let it re-sync. SMART tests always >>>> in good health. >>> >>> Some more detail on the actual devices would help, especially the >>> output of lsdrv [1] to document what device serial numbers are which, >>> for future reference. >>> >>> I also suspect you have problems with your drive's error recovery >>> control, also known as time-limited error recovery. Simple sector >>> errors should *not* be kicking out your drives. Mdadm knows to >>> reconstruct from parity and rewrite when a read error is encountered. >>> That either succeeds directly, or causes the drive to remap. >>> >>> You say that the SMART tests are good, so read errors are probably >>> escalating into link timeouts, and the drive ignores the attempt to >>> reconstruct. *That* kicks the drive out. >>> >>> "smartctl -x" reports for all of your drives would help identify if >>> you have this problem. You *cannot* safely run raid arrays with drives >>> that don't (or won't) report errors in a timely fashion (a few seconds). >>> >>>> It did make me nervous though. So I decided I'd add a second disk for >>>> a bit of extra redundancy, making the array a RAID 6 - the thinking >>>> was the occasional disk getting kicked and re-added from a RAID 6 >>>> array wouldn't present as much risk as a single disk getting kicked >>>> from a RAID 5. >>>> >>>> So first off, I added the 6th disk as a hotspare to the RAID5 array. >>>> So I now had my 5 disk RAID 5 + hotspare. >>>> >>>> I then found that mdadm 2.6.7 (in the repositories) isn't actually >>>> capable of a 5->6 reshape. 
So I pulled the latest 3.2.3 sources and >>>> compiled myself a new version of mdadm. >>>> >>>> With the newer version of mdadm, it was happy to do the reshape - so I >>>> set it off on it's merry way, using an esata HD (mounted at /usb :-P) >>>> for the backupfile: >>>> >>>> root@raven:/# mdadm --grow /dev/md0 --level=6 --raid-devices=6 >>>> --backup-file=/usb/md0.backup >>>> >>>> It would take a week to reshape, but it was ona UPS & happily ticking >>>> along. The array would be online the whole time so I was in no rush. >>>> Content, I went to get some shut-eye. >>>> >>>> I got up this morning and took a quick look in /proc/mdstat to see how >>>> things were going and saw things had failed spectacularly. At least >>>> two disks had been kicked from the array and the whole thing had >>>> crumbled. >>> >>> Do you still have the dmesg for this? >>> >>>> Ouch. >>>> >>>> I tried to assembe the array, to see if it would continue the reshape: >>>> >>>> root@raven:/# mdadm -Avv --backup-file=/usb/md0.backup /dev/md0 >>>> /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sdf1 /dev/sdg1 >>>> >>>> Unfortunately mdadm had decided that the backup-file was out of date >>>> (timestamps didn't match) and was erroring with: Failed to restore >>>> critical section for reshape, sorry.. >>>> >>>> Chances are things were in such a mess that backup file wasn't going >>>> to be used anyway, so I blocked the timestamp check with: export >>>> MDADM_GROW_ALLOW_OLD=1 >>>> >>>> That allowed me to assemble the array, but not run it as there were >>>> not enough disks to start it. >>>> >>>> This is the current state of the array: >>>> >>>> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] >>>> [raid4] [raid10] >>>> md0 : inactive sdb1[1] sdd1[5] sdf1[4] sda1[2] >>>> 7814047744 blocks super 0.91 >>>> >>>> unused devices: <none> >>>> >>>> root@raven:/# mdadm --detail /dev/md0 >>>> /dev/md0: >>>> Version : 0.91 >>>> Creation Time : Tue Jul 12 23:05:01 2011 >>>> Raid Level : raid6 >>>> Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB) >>>> Raid Devices : 6 >>>> Total Devices : 4 >>>> Preferred Minor : 0 >>>> Persistence : Superblock is persistent >>>> >>>> Update Time : Tue Feb 7 09:32:29 2012 >>>> State : active, FAILED, Not Started >>>> Active Devices : 3 >>>> Working Devices : 4 >>>> Failed Devices : 0 >>>> Spare Devices : 1 >>>> >>>> Layout : left-symmetric-6 >>>> Chunk Size : 64K >>>> >>>> New Layout : left-symmetric >>>> >>>> UUID : 9a76d1bd:2aabd685:1fc5fe0e:7751cfd7 (local to host raven) >>>> Events : 0.1848341 >>>> >>>> Number Major Minor RaidDevice State >>>> 0 0 0 0 removed >>>> 1 8 17 1 active sync /dev/sdb1 >>>> 2 8 1 2 active sync /dev/sda1 >>>> 3 0 0 3 removed >>>> 4 8 81 4 active sync /dev/sdf1 >>>> 5 8 49 5 spare rebuilding /dev/sdd1 >>>> >>>> The two removed disks: >>>> [ 3020.998529] md: kicking non-fresh sdc1 from array! >>>> [ 3021.012672] md: kicking non-fresh sdg1 from array! >>>> >>>> Attempted to re-add the disks (same for both): >>>> root@raven:/# mdadm /dev/md0 --add /dev/sdg1 >>>> mdadm: /dev/sdg1 reports being an active member for /dev/md0, but a >>>> --re-add fails. >>>> mdadm: not performing --add as that would convert /dev/sdg1 in to a spare. >>>> mdadm: To make this a spare, use "mdadm --zero-superblock /dev/sdg1" first. >>>> >>>> With a failed array the last thing we want to do is add spares and >>>> trigger a resync so obviously I haven't zeroed the superblocks and >>>> added yet. >>> >>> That would be catastrophic. 
>>> >>>> Checked and two disks really are out of sync: >>>> root@raven:/# mdadm --examine /dev/sd[a-h]1 | grep Event >>>> Events : 1848341 >>>> Events : 1848341 >>>> Events : 1848333 >>>> Events : 1848341 >>>> Events : 1848341 >>>> Events : 1772921 >>> >>> So /dev/sdg1 dropped out first, and /dev/sdc1 followed and killed the >>> array. >>> >>>> I'll post the output of --examine on all the disks below - if anyone >>>> has any advice I'd really appreciate it (Neil Brown doesn't read these >>>> forums does he?!?). I would usually move next to recreating the array >>>> and using assume-clean but since it's right in the middle of a reshape >>>> I'm not inclined to try. >>> >>> Neil absolutely reads this mailing list, and is likely to pitch in if >>> I don't offer precisely correct advice :-) >>> >>> He's in an Australian time zone though, so latency might vary. I'm on the >>> U.S. east coast, fwiw. >>> >>> In any case, with a re-shape in progress, "--create --assume-clean" is >>> not an option. >>> >>>> Critical stuff is of course backed up, but there is some user data not >>>> covered by backups that I'd like to try and restore if at all >>>> possible. >>> >>> Hope is not all lost. If we can get your ERC adjusted, the next step >>> would be to disconnect /dev/sdg from the system, and assemble with >>> --force and MDADM_GROW_ALLOW_OLD=1 >>> >>> That'll let the reshape finish, leaving you with a single-degraded >>> raid6. Then you fsck and make critical backups. Then you --zero- and >>> --add /dev/sdg. >>> >>> If your drives don't support ERC, I can't recommend you continue until >>> you've ddrescue'd your drives onto new ones that do support ERC. >>> >>> HTH, >>> >>> Phil >>> >>> [1] http://github.com/pturmel/lsdrv > -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Please Help! RAID5 -> 6 reshapre gone bad 2012-02-07 2:57 ` Phil Turmel 2012-02-07 3:10 ` Richard Herd @ 2012-02-07 3:24 ` Keith Keller 2012-02-07 3:38 ` Phil Turmel 2012-02-08 7:13 ` Please Help! RAID5 -> 6 reshapre gone bad Stan Hoeppner 1 sibling, 2 replies; 27+ messages in thread From: Keith Keller @ 2012-02-07 3:24 UTC (permalink / raw) To: linux-raid On 2012-02-07, Phil Turmel <philip@turmel.org> wrote: > > I understand the constraints, as I pinch pennies at home and at the > office (I own my engineering firm). I've made do with cheap desktop > drives that do support ERC. I got burned when Seagate dropped ERC on > their latest desktop drives. Hitachi Deskstar is the only affordable > model on the market that still support ERC. I can testify that the EARS/EADS drives can be troublesome (see my recent threads on the list). I also found out that apparently the flooding in Thailand is delaying all drive vendors' enterprise drives-- they seem to be one of the few factories that make an essential part, and their factories are all underwater. Have others had success with mdraid and the Deskstar drives? I wouldn't mind saving a little money if the drives will actually work, especially if I can get them in before April (the earliest one vendor thinks they might be able to start building drives again). --keith -- kkeller@wombat.san-francisco.ca.us ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Please Help! RAID5 -> 6 reshapre gone bad 2012-02-07 3:24 ` Keith Keller @ 2012-02-07 3:38 ` Phil Turmel 2012-01-31 6:31 ` rebuild raid6 after two failures Keith Keller 2012-02-08 7:13 ` Please Help! RAID5 -> 6 reshapre gone bad Stan Hoeppner 1 sibling, 1 reply; 27+ messages in thread From: Phil Turmel @ 2012-02-07 3:38 UTC (permalink / raw) To: Keith Keller; +Cc: linux-raid On 02/06/2012 10:24 PM, Keith Keller wrote: > On 2012-02-07, Phil Turmel <philip@turmel.org> wrote: >> >> I understand the constraints, as I pinch pennies at home and at the >> office (I own my engineering firm). I've made do with cheap desktop >> drives that do support ERC. I got burned when Seagate dropped ERC on >> their latest desktop drives. Hitachi Deskstar is the only affordable >> model on the market that still support ERC. > > I can testify that the EARS/EADS drives can be troublesome (see my > recent threads on the list). I also found out that apparently the > flooding in Thailand is delaying all drive vendors' enterprise drives-- > they seem to be one of the few factories that make an essential part, > and their factories are all underwater. > > Have others had success with mdraid and the Deskstar drives? I wouldn't > mind saving a little money if the drives will actually work, especially > if I can get them in before April (the earliest one vendor thinks > they might be able to start building drives again). Ow. I haven't actually bought any yet... I was hoping the prices would come down before I needed to. Sounds like I'll be waiting longer than I expected. But, I reviewed the OEM documentation for the 7K3000 family, and they clearly document support for the SCT ERC commands (para 9.18.1.2). Phil ^ permalink raw reply [flat|nested] 27+ messages in thread
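For anyone vetting a candidate drive, a quick way to query and set SCT ERC with smartctl looks like the following (sdX is a placeholder; on most drives the setting is volatile and has to be reapplied after every power-on, e.g. from a boot script):

smartctl -l scterc /dev/sdX          # show the current read/write ERC timers, or report that SCT ERC is unsupported
smartctl -l scterc,70,70 /dev/sdX    # set both timers to 7.0 seconds (values are in units of 100 ms)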
* rebuild raid6 after two failures @ 2012-01-31 6:31 ` Keith Keller 2012-02-01 4:42 ` Keith Keller 0 siblings, 1 reply; 27+ messages in thread From: Keith Keller @ 2012-01-31 6:31 UTC (permalink / raw) To: linux-raid Hello list, I recently had a RAID6 lose two drives in quick succession, with one spare already in place. The rebuild started fine with the spare, but now that I've replaced the failed disks, should I expect the current rebuild to finish, then rebuild on another spare? Or do I need to do something special to kick off the rebuilding on another spare? I tried looking for the answer using various web search permutations with no success. My mdadm and uname output is below. (I did not remember to use a newer mdadm to add the spares, so I originally used 2.6.9, but I do have 3.2.3 available on this box.) Thanks for any pointers. --keith # uname -a Linux xxxxxxxxxx 2.6.39-4.1.el5.elrepo #1 SMP PREEMPT Wed Jan 18 13:16:25 EST 2012 x86_64 x86_64 x86_64 GNU/Linux # mdadm -D /dev/md0 /dev/md0: Version : 1.01 Creation Time : Thu Sep 29 21:26:35 2011 Raid Level : raid6 Array Size : 15624911360 (14901.08 GiB 15999.91 GB) Used Dev Size : 1953113920 (1862.63 GiB 1999.99 GB) Raid Devices : 10 Total Devices : 12 Preferred Minor : 0 Persistence : Superblock is persistent Update Time : Mon Jan 30 22:07:26 2012 State : clean, degraded, recovering Active Devices : 8 Working Devices : 12 Failed Devices : 0 Spare Devices : 4 Chunk Size : 64K Rebuild Status : 18% complete Name : 0 UUID : 24363b01:90deb9b5:4b51e5df:68b8b6ea Events : 164419 Number Major Minor RaidDevice State 0 8 17 0 active sync /dev/sdb1 13 8 33 1 active sync /dev/sdc1 11 8 145 2 active sync /dev/sdj1 12 8 161 3 active sync /dev/sdk1 4 8 65 4 active sync /dev/sde1 9 8 113 5 active sync /dev/sdh1 10 8 81 6 active sync /dev/sdf1 3 8 49 7 spare rebuilding /dev/sdd1 8 8 129 8 active sync /dev/sdi1 9 0 0 9 removed 14 8 177 - spare /dev/sdl1 15 8 209 - spare /dev/sdn1 16 8 225 - spare /dev/sdo1 -- kkeller@wombat.san-francisco.ca.us (try just my userid to email me) AOLSFAQ=http://www.therockgarden.ca/aolsfaq.txt see X- headers for PGP signature information ^ permalink raw reply [flat|nested] 27+ messages in thread
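Not part of the original post, but the usual way to see whether md has actually started rebuilding onto a spare, and how far it has got, is via /proc/mdstat and sysfs:

cat /proc/mdstat                        # shows the running recovery and its progress
cat /sys/block/md0/md/sync_action       # reads "recover" while a spare is being rebuilt, "idle" when nothing is running
mdadm --detail /dev/md0 | grep -E 'State|Rebuild Status'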
* Re: rebuild raid6 after two failures 2012-01-31 6:31 ` rebuild raid6 after two failures Keith Keller @ 2012-02-01 4:42 ` Keith Keller 2012-02-01 5:31 ` NeilBrown 2012-02-03 16:08 ` using dd (or dd_rescue) to salvage array Keith Keller 0 siblings, 2 replies; 27+ messages in thread From: Keith Keller @ 2012-02-01 4:42 UTC (permalink / raw) To: linux-raid On 2012-01-31, Keith Keller <kkeller@wombat.san-francisco.ca.us> wrote: > > I recently had a RAID6 lose two drives in quick succession, with one > spare already in place. The rebuild started fine with the spare, but > now that I've replaced the failed disks, should I expect the current > rebuild to finish, then rebuild on another spare? [snip] Well, for better or worse, this is now a moot question--I had another drive kicked out of the array, I believe prematurely by the controller. I was able to --assemble --force the array, and it is now rebuilding two spares instead of one. AFAIR there was no activity on the filesystem at the time, so I am optimistic that the filesystem should be fine after an fsck. Thanks to the advice from last time which suggested --assemble --force instead of --assume-clean in this situation. Could it have been the older version of mdadm that didn't tell the kernel to start rebuilding the added spare? I have made 3.2.3 my default mdadm, which I hope alleviates some of the issues I've had with rebuilds not starting. (As an aside, I've also bitten the bullet and decided to swap out all the WD-EARS drives for real RAID drives; ideally I'd replace the controller, but I don't want to invest the time needed to replace and test all the components properly.) --keith -- kkeller@wombat.san-francisco.ca.us ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: rebuild raid6 after two failures 2012-02-01 4:42 ` Keith Keller @ 2012-02-01 5:31 ` NeilBrown 2012-02-01 5:48 ` Keith Keller 2012-02-03 16:08 ` using dd (or dd_rescue) to salvage array Keith Keller 1 sibling, 1 reply; 27+ messages in thread From: NeilBrown @ 2012-02-01 5:31 UTC (permalink / raw) To: Keith Keller; +Cc: linux-raid [-- Attachment #1: Type: text/plain, Size: 1704 bytes --] On Tue, 31 Jan 2012 20:42:28 -0800 Keith Keller <kkeller@wombat.san-francisco.ca.us> wrote: > On 2012-01-31, Keith Keller <kkeller@wombat.san-francisco.ca.us> wrote: > > > > I recently had a RAID6 lose two drives in quick succession, with one > > spare already in place. The rebuild started fine with the spare, but > > now that I've replaced the failed disks, should I expect the current > > rebuild to finish, then rebuild on another spare? > > [snip] > > Well, for better or worse, this is now a moot question--I had another > drive kicked out of the array, I believe prematurely by the controller. > I was able to --assemble --force the array, and it is now rebuilding > two spares instead of one. AFAIR there was no activity on the > filesystem at the time, so I am optimistic that the filesystem should be > fine after an fsck. Thanks to the advice from last time which suggested > --assemble --force instead of --assume-clean in this situation. > > Could it have been the older version of mdadm that didn't tell the > kernel to start rebuilding the added spare? I have made 3.2.3 my > default mdadm, which I hope alleviates some of the issues I've had with > rebuilds not starting. (As an aside, I've also bitten the bullet and > decided to swap out all the WD-EARS drives for real RAID drives; ideally > I'd replace the controller, but I don't want to invest the time needed > to replace and test all the components properly.) If a spare is being rebuild when another spare is added, it keeps with the first rebuild rather than restarting from the beginning. This means that you get some redundancy sooner, which is probably a good thing. NeilBrown [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 828 bytes --] ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: rebuild raid6 after two failures 2012-02-01 5:31 ` NeilBrown @ 2012-02-01 5:48 ` Keith Keller 0 siblings, 0 replies; 27+ messages in thread From: Keith Keller @ 2012-02-01 5:48 UTC (permalink / raw) To: linux-raid On 2012-02-01, NeilBrown <neilb@suse.de> wrote: > > If a spare is being rebuild when another spare is added, it keeps with the > first rebuild rather than restarting from the beginning. > > This means that you get some redundancy sooner, which is probably a good > thing. Great, thanks for the info. I just wanted to check that the behavior I saw earlier was expected. (Yes, it's a good thing!) --keith -- kkeller@wombat.san-francisco.ca.us ^ permalink raw reply [flat|nested] 27+ messages in thread
* using dd (or dd_rescue) to salvage array 2012-02-01 4:42 ` Keith Keller 2012-02-01 5:31 ` NeilBrown @ 2012-02-03 16:08 ` Keith Keller 2012-02-04 18:01 ` Stefan /*St0fF*/ Hübner 1 sibling, 1 reply; 27+ messages in thread From: Keith Keller @ 2012-02-03 16:08 UTC (permalink / raw) To: linux-raid On 2012-02-01, Keith Keller <kkeller@wombat.san-francisco.ca.us> wrote: > > Well, for better or worse, this is now a moot question--I had another > drive kicked out of the array, I believe prematurely by the controller. It turns out to be worse--the drive does in fact appear to be failing, which would be the third failure on this RAID6 array. I had what might be a crazy thought--would it be worth the trouble to attempt to use dd (or dd_rescue, a tool I found that claims to continue on bad blocks) to write the disk image to another disk, and attempt a rebuild with the new disk? Or am I just wasting my time? (The array is hosting an rsnapshot backup set, so I can recreate the latest snapshot, but it'll take a while. So it'd be nice to save the array if it's possible and not time- consuming.) Thanks for your help! --keith -- kkeller@wombat.san-francisco.ca.us ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: using dd (or dd_rescue) to salvage array 2012-02-03 16:08 ` using dd (or dd_rescue) to salvage array Keith Keller @ 2012-02-04 18:01 ` Stefan /*St0fF*/ Hübner 2012-02-05 19:10 ` Keith Keller 0 siblings, 1 reply; 27+ messages in thread From: Stefan /*St0fF*/ Hübner @ 2012-02-04 18:01 UTC (permalink / raw) To: Keith Keller; +Cc: linux-raid Am 03.02.2012 17:08, schrieb Keith Keller: > On 2012-02-01, Keith Keller <kkeller@wombat.san-francisco.ca.us> wrote: >> >> Well, for better or worse, this is now a moot question--I had another >> drive kicked out of the array, I believe prematurely by the controller. > > It turns out to be worse--the drive does in fact appear to be failing, > which would be the third failure on this RAID6 array. I had what might > be a crazy thought--would it be worth the trouble to attempt to use dd > (or dd_rescue, a tool I found that claims to continue on bad blocks) to > write the disk image to another disk, and attempt a rebuild with the new > disk? Or am I just wasting my time? (The array is hosting an rsnapshot > backup set, so I can recreate the latest snapshot, but it'll take a > while. So it'd be nice to save the array if it's possible and not time- > consuming.) > > Thanks for your help! > > --keith > Hi Keith, actually, ddrescue is THE WAY TO GO in this case. Don't use the old ddrescue, but the GNU version. Some distros call it gddrescue, on gentoo the old one is called dd-rescue and the gnu-one ddrescue. Just check it out: http://www.gnu.org/software/ddrescue/ddrescue.html Good luck, Stefan ^ permalink raw reply [flat|nested] 27+ messages in thread
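A minimal GNU ddrescue run for this kind of disk-to-disk rescue might look like the two passes below. The device names and mapfile path are placeholders; the mapfile (third argument) is what lets an interrupted copy resume where it left off:

ddrescue -f -n /dev/sdX /dev/sdY /root/rescue.map     # quick first pass, skip the difficult areas
ddrescue -f -r3 /dev/sdX /dev/sdY /root/rescue.map    # second pass, retry the bad areas up to 3 times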
* Re: using dd (or dd_rescue) to salvage array 2012-02-04 18:01 ` Stefan /*St0fF*/ Hübner @ 2012-02-05 19:10 ` Keith Keller 2012-02-06 21:37 ` Stefan *St0fF* Huebner 0 siblings, 1 reply; 27+ messages in thread From: Keith Keller @ 2012-02-05 19:10 UTC (permalink / raw) To: linux-raid On 2012-02-04, Stefan /*St0fF*/ Hübner <stefan.huebner@stud.tu-ilmenau.de> wrote: > > actually, ddrescue is THE WAY TO GO in this case. Don't use the old > ddrescue, but the GNU version. Some distros call it gddrescue, on > gentoo the old one is called dd-rescue and the gnu-one ddrescue. Just > check it out: http://www.gnu.org/software/ddrescue/ddrescue.html Thanks for the advice, Stefan. Frustratingly enough, I will get a chance to try GNU ddrescue despite my impatience--I originally used dd_rescue to try to get an image of the failing drive, and while that succeeded just fine (only lost 8k), the target ended up reporting ECC errors during the rebuild! So I've taken a new image with ddrescue to a tested drive (again, losing 8k), and am hoping that it goes better. (At the moment I'm just attempting a one-spare rebuild, which I'm hoping will go faster than a two-disk build, and therefore report any problems sooner.) I realized after reading my initial post that I wasn't 100% clear what I was asking. I knew that some sort of dd would work, but I'd only done it before in a filesystem context, and didn't know how mdraid would react. So I am curious, does anyone know what I might expect when the rebuild gets to the part on the new image where the data was lost? Will it just create a problem on the filesystem, or might something worse happen? Should I run a check if the rebuild completes successfully? And will mismatch_cnt get populated by the rebuild, or would I need a check to expose mismatches? --keith -- kkeller-usenet@wombat.san-francisco.ca.us -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 27+ messages in thread
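As far as I know a plain rebuild never updates mismatch_cnt; only a check/repair scrub does. Assuming the array is md0, that scrub would look like this once the rebuild has finished:

echo check > /sys/block/md0/md/sync_action    # read-only scrub of every stripe
cat /proc/mdstat                              # watch its progress
cat /sys/block/md0/md/mismatch_cnt            # non-zero afterwards means inconsistent stripes were found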
* Re: using dd (or dd_rescue) to salvage array 2012-02-05 19:10 ` Keith Keller @ 2012-02-06 21:37 ` Stefan *St0fF* Huebner 2012-02-07 3:44 ` Keith Keller 2012-02-07 4:24 ` Keith Keller 0 siblings, 2 replies; 27+ messages in thread From: Stefan *St0fF* Huebner @ 2012-02-06 21:37 UTC (permalink / raw) To: Keith Keller; +Cc: linux-raid On 05.02.2012 20:10, Keith Keller wrote: > On 2012-02-04, Stefan /*St0fF*/ Hübner<stefan.huebner@stud.tu-ilmenau.de> wrote: >> actually, ddrescue is THE WAY TO GO in this case. Don't use the old >> ddrescue, but the GNU version. Some distros call it gddrescue, on >> gentoo the old one is called dd-rescue and the gnu-one ddrescue. Just >> check it out: http://www.gnu.org/software/ddrescue/ddrescue.html > Thanks for the advice, Stefan. Frustratingly enough, I will get a > chance to try GNU ddrescue despite my impatience--I originally used > dd_rescue to try to get an image of the failing drive, and while that > succeeded just fine (only lost 8k), the target ended up reporting ECC > errors during the rebuild! So I've taken a new image with ddrescue > to a tested drive (again, losing 8k), and am hoping that it goes better. > (At the moment I'm just attempting a one-spare rebuild, which I'm hoping > will go faster than a two-disk build, and therefore report any problems > sooner.) > > I realized after reading my initial post that I wasn't 100% clear what I > was asking. I knew that some sort of dd would work, but I'd only done > it before in a filesystem context, and didn't know how mdraid would > react. So I am curious, does anyone know what I might expect when the > rebuild gets to the part on the new image where the data was lost? Will > it just create a problem on the filesystem, or might something worse > happen? Should I run a check if the rebuild completes successfully? > And will mismatch_cnt get populated by the rebuild, or would I need a > check to expose mismatches? > > --keith > From the logical point of view those lost 8k would create bad data - i.e. a filesystem problem OR simply corrupted data. That depends on which blocks exactly are bad. If you were using lvm it could even be worse, like broken metadata. It would be good if those 8k were "in a row" - that way at max 3 fs-blocks (when using 4k fs-blocksize) would be corrupted. If you're lucky, you won't even notice - like me: my system SSD broke down lately. I ddrescued as much as I could, but around 250k are gone. I'm dual-booting windows and gentoo and I have not yet encountered a problem from the missing data. Lucky me... Cheers, Stefan -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 27+ messages in thread
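To spell out the arithmetic (assuming the lost 8 KiB really is one contiguous run and the filesystem uses 4 KiB blocks): a run that starts on a 4 KiB boundary covers exactly 2 blocks, while one that starts mid-block clips the tail of one block and the head of another, so at most 3 blocks are affected.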
* Re: using dd (or dd_rescue) to salvage array 2012-02-06 21:37 ` Stefan *St0fF* Huebner @ 2012-02-07 3:44 ` Keith Keller 2012-02-07 4:24 ` Keith Keller 1 sibling, 0 replies; 27+ messages in thread From: Keith Keller @ 2012-02-07 3:44 UTC (permalink / raw) To: linux-raid On 2012-02-06, Stefan *St0fF* Huebner <st0ff@gmx.net> wrote: > > From the logical point of view those lost 8k would create bad data - > i.e. a filesystem problem OR simply corrupted data. That depends on > which blocks exactly are bad. If you were using lvm it could even be > worse, like broken metadata. I am using LVM, so I'll just have to hope for the best. I haven't yet done an xfs_repair, but I will do that soon. I just made my volume active, and vgchange didn't complain, so I'm guessing that's a good sign. > It would be good if those 8k were "in a row" - that way at max 3 > fs-blocks (when using 4k fs-blocksize) would be corrupted. It was--it looks like it was really just that one spot on the drive. So I am hopeful that any errors that are a result of the lost 8k will be reparable by xfs_repair. --keith -- kkeller@wombat.san-francisco.ca.us ^ permalink raw reply [flat|nested] 27+ messages in thread
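A cautious ordering for that check, with vg0/backup standing in for the real volume group and logical volume names (xfs_repair needs the filesystem unmounted, and mounting it once first lets XFS replay its log):

vgchange -ay vg0
lvs                                          # confirm the LV came up
mount /dev/vg0/backup /mnt && umount /mnt    # replay the XFS log
xfs_repair -n /dev/vg0/backup                # dry run: report problems, change nothing
xfs_repair /dev/vg0/backup                   # the real repair, once the dry run looks sane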
* Re: using dd (or dd_rescue) to salvage array 2012-02-06 21:37 ` Stefan *St0fF* Huebner 2012-02-07 3:44 ` Keith Keller @ 2012-02-07 4:24 ` Keith Keller 2012-02-07 20:01 ` Stefan *St0fF* Huebner 1 sibling, 1 reply; 27+ messages in thread From: Keith Keller @ 2012-02-07 4:24 UTC (permalink / raw) To: linux-raid [-- Attachment #1: Type: text/plain, Size: 1499 bytes --] On Mon, Feb 06, 2012 at 10:37:38PM +0100, Stefan *St0fF* Huebner wrote: > From the logical point of view those lost 8k would create bad data - > i.e. a filesystem problem OR simply corrupted data. That depends on > which blocks exactly are bad. If you were using lvm it could even be > worse, like broken metadata. FWIW, xfs_repair has spit out over 100k lines on stderr, but when I mounted before the repair (this is suggested if you need to replay the log), everything seemed intact. So I'm not yet sure what to make of things; perhaps it'll be fine, or perhaps I need to start over. (Alternatively, perhaps the next rsnapshot run will expose problems.) On Mon, Feb 06, 2012 at 10:38:27PM -0500, Phil Turmel wrote: > > But, I reviewed the OEM documentation for the 7K3000 family, and they > clearly document support for the SCT ERC commands (para 9.18.1.2). If I'm reading the docs right, then the 5K3000 also supports them (if you're really cheap and can tolerate the slower speeds). At this point, if we're waiting till April for real ''enterprise'' drives, I can't see anything too bad about getting one or two of these and testing them out in my environment--the EARS drives are bad enough with my configuration that it's hard to imagine being any worse. (To be fair, I have to blame the 3ware 9550 controller a bit too; I have EARS drives on other 3ware controllers without all these issues.) --keith -- kkeller@wombat.san-francisco.ca.us [-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --] ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: using dd (or dd_rescue) to salvage array 2012-02-07 4:24 ` Keith Keller @ 2012-02-07 20:01 ` Stefan *St0fF* Huebner 0 siblings, 0 replies; 27+ messages in thread From: Stefan *St0fF* Huebner @ 2012-02-07 20:01 UTC (permalink / raw) To: Keith Keller; +Cc: linux-raid On 07.02.2012 05:24, Keith Keller wrote: > On Mon, Feb 06, 2012 at 10:37:38PM +0100, Stefan *St0fF* Huebner wrote: >> From the logical point of view those lost 8k would create bad data - >> i.e. a filesystem problem OR simply corrupted data. That depends on >> which blocks exactly are bad. If you were using lvm it could even be >> worse, like broken metadata. > FWIW, xfs_repair has spit out over 100k lines on stderr, but when I > mounted before the repair (this is suggested if you need to replay the > log), everything seemed intact. So I'm not yet sure what to make of > things; perhaps it'll be fine, or perhaps I need to start over. > (Alternatively, perhaps the next rsnapshot run will expose problems.) > > > On Mon, Feb 06, 2012 at 10:38:27PM -0500, Phil Turmel wrote: >> But, I reviewed the OEM documentation for the 7K3000 family, and they >> clearly document support for the SCT ERC commands (para 9.18.1.2). > If I'm reading the docs right, then the 5K3000 also supports them (if > you're really cheap and can tolerate the slower speeds). At this point, > if we're waiting till April for real ''enterprise'' drives, I can't see > anything too bad about getting one or two of these and testing them out > in my environment--the EARS drives are bad enough with my configuration > that it's hard to imagine being any worse. (To be fair, I have to blame > the 3ware 9550 controller a bit too; I have EARS drives on other 3ware > controllers without all these issues.) > > --keith > > > Sounds promising. If you're really lucky, the blocks were freed by the fs and nothing has gone... Stefan ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Please Help! RAID5 -> 6 reshapre gone bad 2012-02-07 3:24 ` Keith Keller 2012-02-07 3:38 ` Phil Turmel @ 2012-02-08 7:13 ` Stan Hoeppner 1 sibling, 0 replies; 27+ messages in thread From: Stan Hoeppner @ 2012-02-08 7:13 UTC (permalink / raw) To: Keith Keller; +Cc: linux-raid On 2/6/2012 9:24 PM, Keith Keller wrote: > Have others had success with mdraid and the Deskstar drives? I wouldn't > mind saving a little money if the drives will actually work, especially > if I can get them in before April (the earliest one vendor thinks > they might be able to start building drives again). Newegg seems to have nine 7.2k Deskstar models in stock, cond new, from 500GB to 3TB: http://www.newegg.com/Product/ProductList.aspx?Submit=ENE&N=100007603+50001984+600003340&QksAutoSuggestion=&ShowDeactivatedMark=False&Configurator=&IsNodeId=1&Subcategory=14&description=&hisInDesc=&Ntk=&CFG=&SpeTabStoreType=&AdvancedSearch=1&srchInDesc= As of the date/time of this email, here's a spattering of the 9 available and their order qty limitations: H3IK30003272SW (0S03208) 3TB qty: 20 HUA723020ALA640 (0F12455) 2TB qty: 5 H3IK20003272SP (0S02861) 2TB qty: 20 HDS723020BLA642 (0f12115) 2TB qty: 20 HDS721010DLE630 (0F13180) 1TB qty: 100 7K1000.C 0F10383 1TB qty: 20 The HDS721010DLE630 1TB model looks pretty attractive if one needs spindle count IOPS more than total capacity. And needs more than 20 drives. -- Stan ^ permalink raw reply [flat|nested] 27+ messages in thread
* Fwd: Please Help! RAID5 -> 6 reshapre gone bad [not found] ` <CAOANJV955ZdLexRTjVkQzTMapAaMitq5eqxP0rUvDjjLh4Wgzw@mail.gmail.com> 2012-02-07 2:57 ` Phil Turmel @ 2012-02-07 3:04 ` Richard Herd 1 sibling, 0 replies; 27+ messages in thread From: Richard Herd @ 2012-02-07 3:04 UTC (permalink / raw) To: linux-raid Sorry, sent that directly to Phil instead of back to the list. FYI the below email. Thanks for the response Neil :-) Also, just as a bit of clarification, it may help to understand what was going on 'real-world': # Last night reshape kicked off to go from RAID 5 to 6. # This morning at 1 a disk (sdg) was kicked out of the array basically for timing out on ERC. mdadm stops reshape, then continues the reshape without sdg. # This morning at 7 a second disk (sdc) was kicked out of the array (again ERC). mdadm stops reshape, and does not continue _however_ md0 itself is NOT stopped. As I have vlc recording streams from my security cameras to md0 24/7, I think what happened at 7 this morning was that the array got into a bad state with the two failed disks and stopped the reshape, but didn't stop md0. md0 stayed mounted and vlc will have been doing writes of the cam footage to md0 for a couple of hours until about 9 when I noticed this and manually did mdadm --stop /dev/md0. I would hazard a guess as that's why sdc has an older event counter than the rest of the array - it was kicked out at 7 but the array stayed up without enough disks for another couple of hours until 9 when manually stopped. Hopefully that makes sense and adds a bit of context :-) Cheers ---------- Forwarded message ---------- From: Richard Herd <2001oddity@gmail.com> Date: Tue, Feb 7, 2012 at 1:40 PM Subject: Re: Please Help! RAID5 -> 6 reshapre gone bad To: Phil Turmel Hi Phil, Thanks for the swift response :-) Also I'm in (what I'd like to say but can't - sunny) Sydney... OK, without slathering this thread is smart reports I can quite definitely say you are exactly nail-on-the-head with regard to the read errors escalating into link timeouts. This is exactly what is happening. I had thought this was actually a pretty common setup for home users (eg mdadm and drives such as WD20EARS/ST2000s) - I have the luxury of budgets for Netapp kit at work - unfortunately my personal finances only stretch to an ITX case and a bunch of cheap HDs! I understand it's the ERC causing disks to get kicked, and fully understand if you can't help further. Assembling without sdg I'm not sure will do it, as what we have is 4 disks with the same events counter (3 active sync (sda/sdb/sdf), 1 spare rebuilding (sdd)), and 2 (sdg/sdc) removed with older event counters. Leaving out sdg leaves us with sdc which has an event counter of 1848333. As the 3 active sync (sda/sdb/sdf) + 1 spare (sdd) have an event counter of 1848341, mdadm doesn't want to let me use sdc in the array even with --force. As you say as it's in the middle of a reshape so a recreate is out. I'm considering data loss is a given at this point, but even being able to bring the array online degraded and pull out whatever is still intact would help. 
If you have any further suggestions that would be great, but I do understand your position on ERC and thank you for your input :-) Cheers Feb 7 01:07:16 raven kernel: [18891.989330] ata8: hard resetting link Feb 7 01:07:22 raven kernel: [18897.356104] ata8: link is slow to respond, please be patient (ready=0) Feb 7 01:07:26 raven kernel: [18902.004280] ata8: hard resetting link Feb 7 01:07:32 raven kernel: [18907.372104] ata8: link is slow to respond, please be patient (ready=0) Feb 7 01:07:36 raven kernel: [18912.020097] ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 300) Feb 7 01:07:41 raven kernel: [18917.020093] ata8.00: qc timeout (cmd 0xec) Feb 7 01:07:41 raven kernel: [18917.028074] ata8.00: failed to IDENTIFY (I/O error, err_mask=0x4) Feb 7 01:07:41 raven kernel: [18917.028310] ata8: hard resetting link Feb 7 01:07:47 raven kernel: [18922.396089] ata8: link is slow to respond, please be patient (ready=0) Feb 7 01:07:51 raven kernel: [18927.044313] ata8: hard resetting link Feb 7 01:07:56 raven kernel: [18932.020099] ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 300) Feb 7 01:08:06 raven kernel: [18942.020048] ata8.00: qc timeout (cmd 0xec) Feb 7 01:08:06 raven kernel: [18942.028075] ata8.00: failed to IDENTIFY (I/O error, err_mask=0x4) Feb 7 01:08:06 raven kernel: [18942.028307] ata8: limiting SATA link speed to 3.0 Gbps Feb 7 01:08:06 raven kernel: [18942.028321] ata8: hard resetting link Feb 7 01:08:12 raven kernel: [18947.396108] ata8: link is slow to respond, please be patient (ready=0) Feb 7 01:08:16 raven kernel: [18951.988069] ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 320) Feb 7 01:08:46 raven kernel: [18981.988104] ata8.00: qc timeout (cmd 0xec) Feb 7 01:08:46 raven kernel: [18981.996070] ata8.00: failed to IDENTIFY (I/O error, err_mask=0x4) Feb 7 01:08:46 raven kernel: [18981.996302] ata8.00: disabled Feb 7 01:08:46 raven kernel: [18981.996324] ata8.00: device reported invalid CHS sector 0 Feb 7 01:08:46 raven kernel: [18981.996348] ata8: hard resetting link Feb 7 01:08:52 raven kernel: [18987.364104] ata8: link is slow to respond, please be patient (ready=0) Feb 7 01:08:56 raven kernel: [18992.012050] ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 320) Feb 7 01:08:56 raven kernel: [18992.012114] ata8: EH complete Feb 7 01:08:56 raven kernel: [18992.012158] sd 8:0:0:0: [sdg] Unhandled error code Feb 7 01:08:56 raven kernel: [18992.012165] sd 8:0:0:0: [sdg] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK Feb 7 01:08:56 raven kernel: [18992.012176] sd 8:0:0:0: [sdg] CDB: Write(10): 2a 00 e8 e0 74 3f 00 00 08 00 Feb 7 01:08:56 raven kernel: [18992.012696] md: super_written gets error=-5, uptodate=0 Feb 7 01:08:56 raven kernel: [18992.013169] sd 8:0:0:0: [sdg] Unhandled error code Feb 7 01:08:56 raven kernel: [18992.013176] sd 8:0:0:0: [sdg] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK Feb 7 01:08:56 raven kernel: [18992.013186] sd 8:0:0:0: [sdg] CDB: Read(10): 28 00 04 9d bd bf 00 00 80 00 Feb 7 01:08:56 raven kernel: [18992.276986] sd 8:0:0:0: [sdg] Unhandled error code Feb 7 01:08:56 raven kernel: [18992.276999] sd 8:0:0:0: [sdg] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK Feb 7 01:08:56 raven kernel: [18992.277012] sd 8:0:0:0: [sdg] CDB: Read(10): 28 00 04 9d be 3f 00 00 80 00 Feb 7 01:08:56 raven kernel: [18992.316919] sd 8:0:0:0: [sdg] Unhandled error code Feb 7 01:08:56 raven kernel: [18992.316930] sd 8:0:0:0: [sdg] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK Feb 7 01:08:56 raven kernel: [18992.316942] sd 8:0:0:0: [sdg] 
CDB: Read(10): 28 00 04 9d be bf 00 00 80 00 Feb 7 01:08:56 raven kernel: [18992.326906] sd 8:0:0:0: [sdg] Unhandled error code Feb 7 01:08:56 raven kernel: [18992.326920] sd 8:0:0:0: [sdg] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK Feb 7 01:08:56 raven kernel: [18992.326932] sd 8:0:0:0: [sdg] CDB: Read(10): 28 00 04 9d bf 3f 00 00 80 00 Feb 7 01:08:56 raven kernel: [18992.327944] sd 8:0:0:0: [sdg] Unhandled error code Feb 7 01:08:56 raven kernel: [18992.327956] sd 8:0:0:0: [sdg] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK Feb 7 01:08:56 raven kernel: [18992.327968] sd 8:0:0:0: [sdg] CDB: Read(10): 28 00 04 9d bf bf 00 00 80 00 Feb 7 01:08:57 raven kernel: [18992.555093] md: md0: reshape done. Feb 7 01:08:57 raven kernel: [18992.607595] md: reshape of RAID array md0 Feb 7 01:08:57 raven kernel: [18992.607606] md: minimum _guaranteed_ speed: 200000 KB/sec/disk. Feb 7 01:08:57 raven kernel: [18992.607614] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape. Feb 7 01:08:57 raven kernel: [18992.607628] md: using 128k window, over a total of 1953511936 blocks. Feb 7 06:41:02 raven rsyslogd: [origin software="rsyslogd" swVersion="4.2.0" x-pid="911" x-info="http://www.rsyslog.com"] rsyslogd was HUPed, type 'lightweight'. Feb 7 07:12:32 raven kernel: [40807.989092] ata5: hard resetting link Feb 7 07:12:38 raven kernel: [40813.524074] ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300) Feb 7 07:12:43 raven kernel: [40818.524106] ata5.00: qc timeout (cmd 0xec) Feb 7 07:12:43 raven kernel: [40818.524126] ata5.00: failed to IDENTIFY (I/O error, err_mask=0x4) Feb 7 07:12:43 raven kernel: [40818.532788] ata5: hard resetting link Feb 7 07:12:48 raven kernel: [40824.058039] ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300) Feb 7 07:12:58 raven kernel: [40834.056101] ata5.00: qc timeout (cmd 0xec) Feb 7 07:12:58 raven kernel: [40834.056121] ata5.00: failed to IDENTIFY (I/O error, err_mask=0x4) Feb 7 07:12:58 raven kernel: [40834.064203] ata5: limiting SATA link speed to 3.0 Gbps Feb 7 07:12:58 raven kernel: [40834.064217] ata5: hard resetting link Feb 7 07:13:04 raven kernel: [40839.592095] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 320) Feb 7 07:13:34 raven kernel: [40869.592088] ata5.00: qc timeout (cmd 0xec) Feb 7 07:13:34 raven kernel: [40869.592110] ata5.00: failed to IDENTIFY (I/O error, err_mask=0x4) Feb 7 07:13:34 raven kernel: [40869.599676] ata5.00: disabled Feb 7 07:13:34 raven kernel: [40869.599700] ata5.00: device reported invalid CHS sector 0 Feb 7 07:13:34 raven kernel: [40869.599724] ata5: hard resetting link Feb 7 07:13:39 raven kernel: [40875.124128] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 320) Feb 7 07:13:39 raven kernel: [40875.124201] ata5: EH complete Feb 7 07:13:39 raven kernel: [40875.124243] sd 4:0:0:0: [sdd] Unhandled error code Feb 7 07:13:39 raven kernel: [40875.124251] sd 4:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK Feb 7 07:13:39 raven kernel: [40875.124262] sd 4:0:0:0: [sdd] CDB: Write(10): 2a 00 e8 e0 74 3f 00 00 08 00 Feb 7 07:13:39 raven kernel: [40875.135544] md: super_written gets error=-5, uptodate=0 Feb 7 07:13:39 raven kernel: [40875.152171] sd 4:0:0:0: [sdd] Unhandled error code Feb 7 07:13:39 raven kernel: [40875.152179] sd 4:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK Feb 7 07:13:39 raven kernel: [40875.152189] sd 4:0:0:0: [sdd] CDB: Read(10): 28 00 09 2b f2 3f 00 00 80 00 Feb 7 07:13:41 raven kernel: [40876.734504] md: md0: reshape done. 
Feb 7 07:13:41 raven kernel: [40876.736298] lost page write due to I/O error on md0 Feb 7 07:13:41 raven kernel: [40876.743529] lost page write due to I/O error on md0 Feb 7 07:13:41 raven kernel: [40876.750009] lost page write due to I/O error on md0 Feb 7 07:13:41 raven kernel: [40876.755143] lost page write due to I/O error on md0 Feb 7 07:13:41 raven kernel: [40876.760126] lost page write due to I/O error on md0 Feb 7 07:13:41 raven kernel: [40876.765070] lost page write due to I/O error on md0 Feb 7 07:13:41 raven kernel: [40876.769890] lost page write due to I/O error on md0 Feb 7 07:13:41 raven kernel: [40876.774759] lost page write due to I/O error on md0 Feb 7 07:13:41 raven kernel: [40876.779456] lost page write due to I/O error on md0 Feb 7 07:13:41 raven kernel: [40876.784166] lost page write due to I/O error on md0 Feb 7 07:13:41 raven kernel: [40876.788773] JBD: Detected IO errors while flushing file data on md0 Feb 7 07:13:41 raven kernel: [40876.796386] JBD: Detected IO errors while flushing file data on md0 On Tue, Feb 7, 2012 at 1:15 PM, Phil Turmel <philip@turmel.org> wrote: > Hi Richard, > > On 02/06/2012 08:34 PM, Richard Herd wrote: >> Hey guys, >> >> I'm in a bit of a pickle here and if any mdadm kings could step in and >> throw some advice my way I'd be very grateful :-) >> >> Quick bit of background - little NAS based on an AMD E350 running >> Ubuntu 10.04. Running a software RAID 5 from 5x2TB disks. Every few >> months one of the drives would fail a request and get kicked from the >> array (as is becoming common for these larger multi TB drives they >> tolerate the occasional bad sector by reallocating from a pool of >> spares (but that's a whole other story)). This happened across a >> variety of brands and two different controllers. I'd simply add the >> disk that got popped back in and let it re-sync. SMART tests always >> in good health. > > Some more detail on the actual devices would help, especially the > output of lsdrv [1] to document what device serial numbers are which, > for future reference. > > I also suspect you have problems with your drive's error recovery > control, also known as time-limited error recovery. Simple sector > errors should *not* be kicking out your drives. Mdadm knows to > reconstruct from parity and rewrite when a read error is encountered. > That either succeeds directly, or causes the drive to remap. > > You say that the SMART tests are good, so read errors are probably > escalating into link timeouts, and the drive ignores the attempt to > reconstruct. *That* kicks the drive out. > > "smartctl -x" reports for all of your drives would help identify if > you have this problem. You *cannot* safely run raid arrays with drives > that don't (or won't) report errors in a timely fashion (a few seconds). > >> It did make me nervous though. So I decided I'd add a second disk for >> a bit of extra redundancy, making the array a RAID 6 - the thinking >> was the occasional disk getting kicked and re-added from a RAID 6 >> array wouldn't present as much risk as a single disk getting kicked >> from a RAID 5. >> >> So first off, I added the 6th disk as a hotspare to the RAID5 array. >> So I now had my 5 disk RAID 5 + hotspare. >> >> I then found that mdadm 2.6.7 (in the repositories) isn't actually >> capable of a 5->6 reshape. So I pulled the latest 3.2.3 sources and >> compiled myself a new version of mdadm. 
>> >> With the newer version of mdadm, it was happy to do the reshape - so I >> set it off on it's merry way, using an esata HD (mounted at /usb :-P) >> for the backupfile: >> >> root@raven:/# mdadm --grow /dev/md0 --level=6 --raid-devices=6 >> --backup-file=/usb/md0.backup >> >> It would take a week to reshape, but it was ona UPS & happily ticking >> along. The array would be online the whole time so I was in no rush. >> Content, I went to get some shut-eye. >> >> I got up this morning and took a quick look in /proc/mdstat to see how >> things were going and saw things had failed spectacularly. At least >> two disks had been kicked from the array and the whole thing had >> crumbled. > > Do you still have the dmesg for this? > >> Ouch. >> >> I tried to assembe the array, to see if it would continue the reshape: >> >> root@raven:/# mdadm -Avv --backup-file=/usb/md0.backup /dev/md0 >> /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sdf1 /dev/sdg1 >> >> Unfortunately mdadm had decided that the backup-file was out of date >> (timestamps didn't match) and was erroring with: Failed to restore >> critical section for reshape, sorry.. >> >> Chances are things were in such a mess that backup file wasn't going >> to be used anyway, so I blocked the timestamp check with: export >> MDADM_GROW_ALLOW_OLD=1 >> >> That allowed me to assemble the array, but not run it as there were >> not enough disks to start it. >> >> This is the current state of the array: >> >> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] >> [raid4] [raid10] >> md0 : inactive sdb1[1] sdd1[5] sdf1[4] sda1[2] >> 7814047744 blocks super 0.91 >> >> unused devices: <none> >> >> root@raven:/# mdadm --detail /dev/md0 >> /dev/md0: >> Version : 0.91 >> Creation Time : Tue Jul 12 23:05:01 2011 >> Raid Level : raid6 >> Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB) >> Raid Devices : 6 >> Total Devices : 4 >> Preferred Minor : 0 >> Persistence : Superblock is persistent >> >> Update Time : Tue Feb 7 09:32:29 2012 >> State : active, FAILED, Not Started >> Active Devices : 3 >> Working Devices : 4 >> Failed Devices : 0 >> Spare Devices : 1 >> >> Layout : left-symmetric-6 >> Chunk Size : 64K >> >> New Layout : left-symmetric >> >> UUID : 9a76d1bd:2aabd685:1fc5fe0e:7751cfd7 (local to host raven) >> Events : 0.1848341 >> >> Number Major Minor RaidDevice State >> 0 0 0 0 removed >> 1 8 17 1 active sync /dev/sdb1 >> 2 8 1 2 active sync /dev/sda1 >> 3 0 0 3 removed >> 4 8 81 4 active sync /dev/sdf1 >> 5 8 49 5 spare rebuilding /dev/sdd1 >> >> The two removed disks: >> [ 3020.998529] md: kicking non-fresh sdc1 from array! >> [ 3021.012672] md: kicking non-fresh sdg1 from array! >> >> Attempted to re-add the disks (same for both): >> root@raven:/# mdadm /dev/md0 --add /dev/sdg1 >> mdadm: /dev/sdg1 reports being an active member for /dev/md0, but a >> --re-add fails. >> mdadm: not performing --add as that would convert /dev/sdg1 in to a spare. >> mdadm: To make this a spare, use "mdadm --zero-superblock /dev/sdg1" first. >> >> With a failed array the last thing we want to do is add spares and >> trigger a resync so obviously I haven't zeroed the superblocks and >> added yet. > > That would be catastrophic. > >> Checked and two disks really are out of sync: >> root@raven:/# mdadm --examine /dev/sd[a-h]1 | grep Event >> Events : 1848341 >> Events : 1848341 >> Events : 1848333 >> Events : 1848341 >> Events : 1848341 >> Events : 1772921 > > So /dev/sdg1 dropped out first, and /dev/sdc1 followed and killed the > array. 
> >> I'll post the output of --examine on all the disks below - if anyone >> has any advice I'd really appreciate it (Neil Brown doesn't read these >> forums does he?!?). I would usually move next to recreating the array >> and using assume-clean but since it's right in the middle of a reshape >> I'm not inclined to try. > > Neil absolutely reads this mailing list, and is likely to pitch in if > I don't offer precisely correct advice :-) > > He's in an Australian time zone though, so latency might vary. I'm on the > U.S. east coast, fwiw. > > In any case, with a re-shape in progress, "--create --assume-clean" is > not an option. > >> Critical stuff is of course backed up, but there is some user data not >> covered by backups that I'd like to try and restore if at all >> possible. > > Hope is not all lost. If we can get your ERC adjusted, the next step > would be to disconnect /dev/sdg from the system, and assemble with > --force and MDADM_GROW_ALLOW_OLD=1 > > That'll let the reshape finish, leaving you with a single-degraded > raid6. Then you fsck and make critical backups. Then you --zero- and > --add /dev/sdg. > > If your drives don't support ERC, I can't recommend you continue until > you've ddrescue'd your drives onto new ones that do support ERC. > > HTH, > > Phil > > [1] http://github.com/pturmel/lsdrv -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 27+ messages in thread
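Nothing in this thread actually changed it, but the common stop-gap when desktop drives without working ERC sit under md is to raise the kernel's per-device command timeout well above the drive's internal retry time, so a slow sector surfaces as a plain read error instead of the link resets seen in the log above (adjust the glob to the array's member disks, and repeat at each boot):

for t in /sys/block/sd[abcdfg]/device/timeout; do echo 180 > "$t"; done    # default is 30 seconds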
* Re: Please Help! RAID5 -> 6 reshapre gone bad 2012-02-07 1:34 Please Help! RAID5 -> 6 reshapre gone bad Richard Herd 2012-02-07 2:15 ` Phil Turmel @ 2012-02-07 2:39 ` NeilBrown 2012-02-07 3:10 ` NeilBrown 1 sibling, 1 reply; 27+ messages in thread From: NeilBrown @ 2012-02-07 2:39 UTC (permalink / raw) To: Richard Herd; +Cc: linux-raid [-- Attachment #1: Type: text/plain, Size: 7103 bytes --] On Tue, 7 Feb 2012 12:34:48 +1100 Richard Herd <2001oddity@gmail.com> wrote: > Hey guys, > > I'm in a bit of a pickle here and if any mdadm kings could step in and > throw some advice my way I'd be very grateful :-) > > Quick bit of background - little NAS based on an AMD E350 running > Ubuntu 10.04. Running a software RAID 5 from 5x2TB disks. Every few > months one of the drives would fail a request and get kicked from the > array (as is becoming common for these larger multi TB drives they > tolerate the occasional bad sector by reallocating from a pool of > spares (but that's a whole other story)). This happened across a > variety of brands and two different controllers. I'd simply add the > disk that got popped back in and let it re-sync. SMART tests always > in good health. > > It did make me nervous though. So I decided I'd add a second disk for > a bit of extra redundancy, making the array a RAID 6 - the thinking > was the occasional disk getting kicked and re-added from a RAID 6 > array wouldn't present as much risk as a single disk getting kicked > from a RAID 5. > > So first off, I added the 6th disk as a hotspare to the RAID5 array. > So I now had my 5 disk RAID 5 + hotspare. > > I then found that mdadm 2.6.7 (in the repositories) isn't actually > capable of a 5->6 reshape. So I pulled the latest 3.2.3 sources and > compiled myself a new version of mdadm. > > With the newer version of mdadm, it was happy to do the reshape - so I > set it off on it's merry way, using an esata HD (mounted at /usb :-P) > for the backupfile: > > root@raven:/# mdadm --grow /dev/md0 --level=6 --raid-devices=6 > --backup-file=/usb/md0.backup > > It would take a week to reshape, but it was ona UPS & happily ticking > along. The array would be online the whole time so I was in no rush. > Content, I went to get some shut-eye. > > I got up this morning and took a quick look in /proc/mdstat to see how > things were going and saw things had failed spectacularly. At least > two disks had been kicked from the array and the whole thing had > crumbled. > > Ouch. > > I tried to assembe the array, to see if it would continue the reshape: > > root@raven:/# mdadm -Avv --backup-file=/usb/md0.backup /dev/md0 > /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sdf1 /dev/sdg1 > > Unfortunately mdadm had decided that the backup-file was out of date > (timestamps didn't match) and was erroring with: Failed to restore > critical section for reshape, sorry.. > > Chances are things were in such a mess that backup file wasn't going > to be used anyway, so I blocked the timestamp check with: export > MDADM_GROW_ALLOW_OLD=1 > > That allowed me to assemble the array, but not run it as there were > not enough disks to start it. You probably just need to add "--force" to the assemble line. So stop the array (mdamd -S /dev/md0) and assemble again with --force as well as the other options.... or maybe don't. I just tested that and I didn't do what it should. I've hacked the code a bit and can see what the problem is and think I can fix it. So leave it a bit. I'll let you know when you should grab my latest code and try that. 
> > This is the current state of the array: > > Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] > [raid4] [raid10] > md0 : inactive sdb1[1] sdd1[5] sdf1[4] sda1[2] > 7814047744 blocks super 0.91 > > unused devices: <none> > > root@raven:/# mdadm --detail /dev/md0 > /dev/md0: > Version : 0.91 > Creation Time : Tue Jul 12 23:05:01 2011 > Raid Level : raid6 > Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB) > Raid Devices : 6 > Total Devices : 4 > Preferred Minor : 0 > Persistence : Superblock is persistent > > Update Time : Tue Feb 7 09:32:29 2012 > State : active, FAILED, Not Started > Active Devices : 3 > Working Devices : 4 > Failed Devices : 0 > Spare Devices : 1 > > Layout : left-symmetric-6 > Chunk Size : 64K > > New Layout : left-symmetric > > UUID : 9a76d1bd:2aabd685:1fc5fe0e:7751cfd7 (local to host raven) > Events : 0.1848341 > > Number Major Minor RaidDevice State > 0 0 0 0 removed > 1 8 17 1 active sync /dev/sdb1 > 2 8 1 2 active sync /dev/sda1 > 3 0 0 3 removed > 4 8 81 4 active sync /dev/sdf1 > 5 8 49 5 spare rebuilding /dev/sdd1 > > The two removed disks: > [ 3020.998529] md: kicking non-fresh sdc1 from array! > [ 3021.012672] md: kicking non-fresh sdg1 from array! > > Attempted to re-add the disks (same for both): > root@raven:/# mdadm /dev/md0 --add /dev/sdg1 > mdadm: /dev/sdg1 reports being an active member for /dev/md0, but a > --re-add fails. > mdadm: not performing --add as that would convert /dev/sdg1 in to a spare. Gee I'm glad I put that check in! > mdadm: To make this a spare, use "mdadm --zero-superblock /dev/sdg1" first. > > With a failed array the last thing we want to do is add spares and > trigger a resync so obviously I haven't zeroed the superblocks and > added yet. Excellent! > > Checked and two disks really are out of sync: > root@raven:/# mdadm --examine /dev/sd[a-h]1 | grep Event > Events : 1848341 > Events : 1848341 > Events : 1848333 > Events : 1848341 > Events : 1848341 > Events : 1772921 sdg1 failed first shortly after 01:06:46. The reshape should have just continued. However every device has the same: > Reshape pos'n : 307740672 (293.48 GiB 315.13 GB) including sdg1. That implied that it didn't continue. Confused. Anyway, around 07:12:01, sdc1 failed. This will definitely have stopped the reshape and everything else. > > I'll post the output of --examine on all the disks below - if anyone > has any advice I'd really appreciate it (Neil Brown doesn't read these > forums does he?!?). I would usually move next to recreating the array > and using assume-clean but since it's right in the middle of a reshape > I'm not inclined to try. Me? No, I don't hang out here much... > > Critical stuff is of course backed up, but there is some user data not > covered by backups that I'd like to try and restore if at all > possible. "backups" - music to my ears. I definitely recommend an 'fsck' after we get it going again and there could be minor corruption, but you will probably have everything back. Of course I cannot promise that it won't just happen again when it hits another read error. Not sure what you can do about that. So - stay tuned. NeilBrown [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 828 bytes --] ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Please Help! RAID5 -> 6 reshapre gone bad 2012-02-07 2:39 ` NeilBrown @ 2012-02-07 3:10 ` NeilBrown 2012-02-07 3:19 ` Richard Herd 0 siblings, 1 reply; 27+ messages in thread From: NeilBrown @ 2012-02-07 3:10 UTC (permalink / raw) To: NeilBrown; +Cc: Richard Herd, linux-raid [-- Attachment #1: Type: text/plain, Size: 1760 bytes --] On Tue, 7 Feb 2012 13:39:47 +1100 NeilBrown <neilb@suse.de> wrote: > > I tried to assembe the array, to see if it would continue the reshape: > > > > root@raven:/# mdadm -Avv --backup-file=/usb/md0.backup /dev/md0 > > /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sdf1 /dev/sdg1 > > > > Unfortunately mdadm had decided that the backup-file was out of date > > (timestamps didn't match) and was erroring with: Failed to restore > > critical section for reshape, sorry.. > > > > Chances are things were in such a mess that backup file wasn't going > > to be used anyway, so I blocked the timestamp check with: export > > MDADM_GROW_ALLOW_OLD=1 > > > > That allowed me to assemble the array, but not run it as there were > > not enough disks to start it. > > You probably just need to add "--force" to the assemble line. > So stop the array (mdamd -S /dev/md0) and assemble again with --force as well > as the other options.... or maybe don't. > > I just tested that and I didn't do what it should. I've hacked the code a > bit and can see what the problem is and think I can fix it. > > So leave it a bit. I'll let you know when you should grab my latest code > and try that. Ok, that should work.. If you: git clone git://neil.brown.name/mdadm cd mdadm make export MDADM_GROW_ALLOW_OLD=1 ./mdadm -Avv --backup-file=/usb/md0.backup /dev/md0 ..list.of.devices.. --force it should restart the grow. Once device will be left failed. If you think it is usable then when the grow completes you can add it back in. If you get another failure it will die again and you'll have to restart it. If you get a persistent failure, you might be out of luck. Please let me know how it goes. NeilBrown [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 828 bytes --] ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Please Help! RAID5 -> 6 reshapre gone bad 2012-02-07 3:10 ` NeilBrown @ 2012-02-07 3:19 ` Richard Herd 2012-02-07 3:39 ` NeilBrown 0 siblings, 1 reply; 27+ messages in thread From: Richard Herd @ 2012-02-07 3:19 UTC (permalink / raw) To: NeilBrown; +Cc: linux-raid Hi Neil, Thanks. FYI, I've cloned your git repo and compiled and tried using your code. Unfortunately everything looks the same as below (exactly same output, exactly same dmesg - still wants to kick non-fresh sdc from the array at assemble). Cheers Rich On Tue, Feb 7, 2012 at 2:10 PM, NeilBrown <neilb@suse.de> wrote: > On Tue, 7 Feb 2012 13:39:47 +1100 NeilBrown <neilb@suse.de> wrote: > >> > I tried to assembe the array, to see if it would continue the reshape: >> > >> > root@raven:/# mdadm -Avv --backup-file=/usb/md0.backup /dev/md0 >> > /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sdf1 /dev/sdg1 >> > >> > Unfortunately mdadm had decided that the backup-file was out of date >> > (timestamps didn't match) and was erroring with: Failed to restore >> > critical section for reshape, sorry.. >> > >> > Chances are things were in such a mess that backup file wasn't going >> > to be used anyway, so I blocked the timestamp check with: export >> > MDADM_GROW_ALLOW_OLD=1 >> > >> > That allowed me to assemble the array, but not run it as there were >> > not enough disks to start it. >> >> You probably just need to add "--force" to the assemble line. >> So stop the array (mdamd -S /dev/md0) and assemble again with --force as well >> as the other options.... or maybe don't. >> >> I just tested that and I didn't do what it should. I've hacked the code a >> bit and can see what the problem is and think I can fix it. >> >> So leave it a bit. I'll let you know when you should grab my latest code >> and try that. > > Ok, that should work.. > If you: > > git clone git://neil.brown.name/mdadm > cd mdadm > make > export MDADM_GROW_ALLOW_OLD=1 > ./mdadm -Avv --backup-file=/usb/md0.backup /dev/md0 ..list.of.devices.. --force > > > it should restart the grow. Once device will be left failed. If you think > it is usable then when the grow completes you can add it back in. > > If you get another failure it will die again and you'll have to restart it. > > If you get a persistent failure, you might be out of luck. > > Please let me know how it goes. > > NeilBrown > -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: Please Help! RAID5 -> 6 reshapre gone bad 2012-02-07 3:19 ` Richard Herd @ 2012-02-07 3:39 ` NeilBrown 2012-02-07 3:50 ` Richard Herd 0 siblings, 1 reply; 27+ messages in thread From: NeilBrown @ 2012-02-07 3:39 UTC (permalink / raw) To: Richard Herd; +Cc: linux-raid [-- Attachment #1: Type: text/plain, Size: 745 bytes --] On Tue, 7 Feb 2012 14:19:06 +1100 Richard Herd <2001oddity@gmail.com> wrote: > Hi Neil, > > Thanks. > > FYI, I've cloned your git repo and compiled and tried using your code. > Unfortunately everything looks the same as below (exactly same > output, exactly same dmesg - still wants to kick non-fresh sdc from > the array at assemble). Strange. Please report output of git describe HEAD and also run the 'mdadm --assemble --force ....' with -vvv as well, and report all of the output. Also I think some of your devices have changed names a bit. Make sure you list exactly the 6 devices that were recently in the array. i.e. exactly those that report something sensible to "mdadm -E /dev/WHATEVER" NeilBrown [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 828 bytes --] ^ permalink raw reply [flat|nested] 27+ messages in thread
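One way to answer that last question before retrying the assemble is a quick survey of which partitions still carry a superblock for this array (the glob is a guess based on the drive letters seen earlier; sde and sdh are the esata and usb-boot disks and are skipped):

for d in /dev/sd[abcdfg]1; do echo "== $d"; mdadm -E "$d" | grep -E 'UUID|Events|Reshape|Update Time'; done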
* Re: Please Help! RAID5 -> 6 reshapre gone bad 2012-02-07 3:39 ` NeilBrown @ 2012-02-07 3:50 ` Richard Herd 2012-02-07 4:25 ` NeilBrown 0 siblings, 1 reply; 27+ messages in thread From: Richard Herd @ 2012-02-07 3:50 UTC (permalink / raw) To: NeilBrown; +Cc: linux-raid Hi Neil, OK, git head is: mdadm-3.2.3-21-gda8fe5a I have 8 disks. They get muddled about each boot (an issue I have never addressed). Ignore sde (esata HD) and sdh (usb boot). It seems even with --force, dmesg always reports 'kicking non-fresh sdc/g1 from array!'. Leaving sdg out as suggested by Phil doesn't help unfortunately. root@raven:/neil/mdadm# ./mdadm -Avvv --force --backup-file=/usb/md0.backup /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sdf1 /dev/sdg1 mdadm: looking for devices for /dev/md0 mdadm: /dev/sda1 is identified as a member of /dev/md0, slot 2. mdadm: /dev/sdb1 is identified as a member of /dev/md0, slot 1. mdadm: /dev/sdc1 is identified as a member of /dev/md0, slot 3. mdadm: /dev/sdd1 is identified as a member of /dev/md0, slot 5. mdadm: /dev/sdf1 is identified as a member of /dev/md0, slot 4. mdadm: /dev/sdg1 is identified as a member of /dev/md0, slot 0. mdadm:/dev/md0 has an active reshape - checking if critical section needs to be restored mdadm: accepting backup with timestamp 1328559119 for array with timestamp 1328567549 mdadm: restoring critical section mdadm: added /dev/sdg1 to /dev/md0 as 0 mdadm: added /dev/sda1 to /dev/md0 as 2 mdadm: added /dev/sdc1 to /dev/md0 as 3 mdadm: added /dev/sdf1 to /dev/md0 as 4 mdadm: added /dev/sdd1 to /dev/md0 as 5 mdadm: added /dev/sdb1 to /dev/md0 as 1 mdadm: failed to RUN_ARRAY /dev/md0: Input/output error and dmesg: [13964.591801] md: bind<sdg1> [13964.595371] md: bind<sda1> [13964.595668] md: bind<sdc1> [13964.595900] md: bind<sdf1> [13964.599084] md: bind<sdd1> [13964.599652] md: bind<sdb1> [13964.600478] md: kicking non-fresh sdc1 from array! [13964.600493] md: unbind<sdc1> [13964.612138] md: export_rdev(sdc1) [13964.612163] md: kicking non-fresh sdg1 from array! [13964.612183] md: unbind<sdg1> [13964.624077] md: export_rdev(sdg1) [13964.628203] raid5: reshape will continue [13964.628243] raid5: device sdb1 operational as raid disk 1 [13964.628252] raid5: device sdf1 operational as raid disk 4 [13964.628260] raid5: device sda1 operational as raid disk 2 [13964.629614] raid5: allocated 6308kB for md0 [13964.629731] 1: w=1 pa=18 pr=6 m=2 a=2 r=6 op1=0 op2=0 [13964.629742] 5: w=1 pa=18 pr=6 m=2 a=2 r=6 op1=1 op2=0 [13964.629751] 4: w=2 pa=18 pr=6 m=2 a=2 r=6 op1=0 op2=0 [13964.629760] 2: w=3 pa=18 pr=6 m=2 a=2 r=6 op1=0 op2=0 [13964.629767] raid5: not enough operational devices for md0 (3/6 failed) [13964.640403] RAID5 conf printout: [13964.640409] --- rd:6 wd:3 [13964.640416] disk 1, o:1, dev:sdb1 [13964.640423] disk 2, o:1, dev:sda1 [13964.640429] disk 4, o:1, dev:sdf1 [13964.640436] disk 5, o:1, dev:sdd1 [13964.641621] raid5: failed to run raid set md0 [13964.649886] md: pers->run() failed ... 
root@raven:/neil/mdadm# mdadm --detail /dev/md0
/dev/md0:
        Version : 0.91
  Creation Time : Tue Jul 12 23:05:01 2011
     Raid Level : raid6
  Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
   Raid Devices : 6
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Tue Feb  7 09:32:29 2012
          State : active, FAILED, Not Started
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric-6
     Chunk Size : 64K

     New Layout : left-symmetric

           UUID : 9a76d1bd:2aabd685:1fc5fe0e:7751cfd7 (local to host raven)
         Events : 0.1848341

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1       8       17        1      active sync   /dev/sdb1
       2       8        1        2      active sync   /dev/sda1
       3       0        0        3      removed
       4       8       81        4      active sync   /dev/sdf1
       5       8       49        5      spare rebuilding   /dev/sdd1
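Before forcing an assemble in a state like this, it helps to know in advance which members the kernel will treat as "non-fresh".  A minimal sketch (the device list is the one used in this thread; adjust it for however the disks have been renamed on the current boot):

    # Compare event counters and update times across the candidate members.
    # Devices whose Events value lags behind the rest are the ones md will
    # kick as "non-fresh" when the array is assembled.
    for d in /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sdf1 /dev/sdg1; do
        echo "== $d"
        mdadm --examine "$d" | grep -E 'Update Time|Events'
    done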
* Re: Please Help! RAID5 -> 6 reshapre gone bad
From: NeilBrown @ 2012-02-07  4:25 UTC
To: Richard Herd; +Cc: linux-raid

On Tue, 7 Feb 2012 14:50:57 +1100 Richard Herd <2001oddity@gmail.com> wrote:

> Hi Neil,
>
> OK, git head is: mdadm-3.2.3-21-gda8fe5a
>
> I have 8 disks.  They get muddled about each boot (an issue I have
> never addressed).  Ignore sde (esata HD) and sdh (usb boot).
>
> It seems even with --force, dmesg always reports 'kicking non-fresh
> sdc/g1 from array!'.  Leaving sdg out as suggested by Phil doesn't
> help unfortunately.
>
> root@raven:/neil/mdadm# ./mdadm -Avvv --force
> --backup-file=/usb/md0.backup /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1
> /dev/sdd1 /dev/sdf1 /dev/sdg1
> mdadm: looking for devices for /dev/md0
> mdadm: /dev/sda1 is identified as a member of /dev/md0, slot 2.
> mdadm: /dev/sdb1 is identified as a member of /dev/md0, slot 1.
> mdadm: /dev/sdc1 is identified as a member of /dev/md0, slot 3.
> mdadm: /dev/sdd1 is identified as a member of /dev/md0, slot 5.
> mdadm: /dev/sdf1 is identified as a member of /dev/md0, slot 4.
> mdadm: /dev/sdg1 is identified as a member of /dev/md0, slot 0.
> mdadm: /dev/md0 has an active reshape - checking if critical section
> needs to be restored
> mdadm: accepting backup with timestamp 1328559119 for array with
> timestamp 1328567549
> mdadm: restoring critical section
> mdadm: added /dev/sdg1 to /dev/md0 as 0
> mdadm: added /dev/sda1 to /dev/md0 as 2
> mdadm: added /dev/sdc1 to /dev/md0 as 3
> mdadm: added /dev/sdf1 to /dev/md0 as 4
> mdadm: added /dev/sdd1 to /dev/md0 as 5
> mdadm: added /dev/sdb1 to /dev/md0 as 1
> mdadm: failed to RUN_ARRAY /dev/md0: Input/output error

Hmmm.... maybe your kernel isn't quite doing the right thing.
commit 674806d62fb02a22eea948c9f1b5e58e0947b728 is important.
It is in 2.6.35.  What kernel are you running?
Definitely something older given the "1: w=1 pa=18...." messages.  They
disappear in 2.6.34.

So I'm afraid you're going to need a new kernel.

NeilBrown
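To check whether a given kernel has the fix Neil is pointing at, one approach (a sketch, assuming a clone of the mainline kernel git tree is available) is:

    # Which kernel is currently running?  Anything older than 2.6.35
    # predates the commit Neil mentions.
    uname -r

    # In a mainline kernel checkout: list the release tags that contain the
    # commit.  If the running version is not among them, it lacks the fix.
    git tag --contains 674806d62fb02a22eea948c9f1b5e58e0947b728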
* Re: Please Help! RAID5 -> 6 reshapre gone bad
From: Richard Herd @ 2012-02-07  5:02 UTC
To: NeilBrown; +Cc: linux-raid

Hi Neil,

Hmm - I see your point about the kernel...

Kernel updated.  I'm now running 2.6.38.

I went to work on it a bit more under 2.6.38.  I'm not sure why, but
while it still wouldn't take all the disks as before, this time it
assembled (with --force) using 4 of the disks.

Trying to re-add the 5th and 6th didn't throw the same warning as
before (failed to re-add and not adding as spare); it said 're-added
/dev/xxx to /dev/md0', but checking --detail shows they were added as
spares, not as part of the array.

Anyway, with the array assembled and running, I have got the
filesystem mounted and am quickly smashing an rsync to mirror what I
can (8TB, how long could it take? lol).

Thanks so much for your help guys - once I got the hint on the kernel
it wasn't too hard to get the array assembled again.  Now it's just a
waiting game I guess to see how much of the data is intact.  Also, at
what point would those two disks now marked as spare be re-synced into
the array?  After the reshape completes?

Really appreciate your help :-)

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid6 sde1[6](S) sdg1[7](S) sdc1[1] sdf1[4] sdd1[3] sdb1[2]
      7814047744 blocks super 0.91 level 6, 64k chunk, algorithm 18 [6/4] [_UUUU_]
      [>....................]  reshape =  3.9% (78086144/1953511936) finish=11710.7min speed=2668K/sec

unused devices: <none>

root@raven:~# mdadm --detail /dev/md0
/dev/md0:
        Version : 0.91
  Creation Time : Tue Jul 12 23:05:01 2011
     Raid Level : raid6
     Array Size : 7814047744 (7452.06 GiB 8001.58 GB)
  Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Tue Feb  7 15:52:10 2012
          State : clean, degraded, reshaping
 Active Devices : 4
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 2

         Layout : left-symmetric-6
     Chunk Size : 64K

 Reshape Status : 3% complete
     New Layout : left-symmetric

           UUID : 9a76d1bd:2aabd685:1fc5fe0e:7751cfd7 (local to host raven)
         Events : 0.1850269

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1       8       33        1      active sync   /dev/sdc1
       2       8       17        2      active sync   /dev/sdb1
       3       8       49        3      active sync   /dev/sdd1
       4       8       81        4      active sync   /dev/sdf1
       5       0        0        5      removed

       6       8       65        -      spare   /dev/sde1
       7       8       97        -      spare   /dev/sdg1
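The recovery Richard describes amounts to roughly the sequence below - a hedged reconstruction, not the exact commands from the thread; the mount point and rsync destination are invented for illustration, and the device names follow the --detail output above:

    # Force-assemble from the members that still carry current superblocks,
    # using the locally built mdadm.  MDADM_GROW_ALLOW_OLD is only needed
    # if mdadm rejects the backup file as out of date.
    export MDADM_GROW_ALLOW_OLD=1
    ./mdadm -A --force --backup-file=/usb/md0.backup /dev/md0 /dev/sd[bcdf]1

    # Re-add the two kicked disks; on this kernel they come back as spares
    # and will only be rebuilt once the reshape finishes.
    mdadm /dev/md0 --re-add /dev/sde1
    mdadm /dev/md0 --re-add /dev/sdg1

    # Mount (read-only is the cautious choice) and copy off whatever matters
    # before touching anything else.
    mount -o ro /dev/md0 /mnt/md0        # hypothetical mount point
    rsync -a /mnt/md0/ /mnt/backup/      # hypothetical destination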
* Re: Please Help! RAID5 -> 6 reshapre gone bad
From: NeilBrown @ 2012-02-07  5:16 UTC
To: Richard Herd; +Cc: linux-raid

On Tue, 7 Feb 2012 16:02:27 +1100 Richard Herd <2001oddity@gmail.com> wrote:

> Hi Neil,
>
> Hmm - I see your point about the kernel...
>
> Kernel updated.  I'm now running 2.6.38.
>
> I went to work on it a bit more under 2.6.38.  I'm not sure why, but
> while it still wouldn't take all the disks as before, this time it
> assembled (with --force) using 4 of the disks.
>
> Trying to re-add the 5th and 6th didn't throw the same warning as
> before (failed to re-add and not adding as spare); it said 're-added
> /dev/xxx to /dev/md0', but checking --detail shows they were added as
> spares, not as part of the array.

That is expected.  "--force" just gets you enough to keep going and that
is what you have.

Hopefully no more errors (keep the air-con?  or maybe just keep the doors
open, depending where you are :-)

> Anyway, with the array assembled and running, I have got the
> filesystem mounted and am quickly smashing an rsync to mirror what I
> can (8TB, how long could it take? lol).

Good news.

> Thanks so much for your help guys - once I got the hint on the kernel
> it wasn't too hard to get the array assembled again.  Now it's just a
> waiting game I guess to see how much of the data is intact.  Also, at
> what point would those two disks now marked as spare be re-synced into
> the array?  After the reshape completes?

Yes.  When the reshape completes, both the spares will get included into
the array and recovered together.

> Really appreciate your help :-)

And I appreciate nice detailed bug reports - they tend to get more
attention.  Thanks!

NeilBrown
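With the array reshaping again, the remaining work is watching it finish and waiting for the two spares to be pulled back in.  A few commands for keeping an eye on it (a sketch; the sysfs paths assume the standard md layout of kernels of that era, and the speed values are arbitrary examples):

    # Progress, ETA and speed, same information as the mdstat snippet above.
    watch -n 60 cat /proc/mdstat

    # What md is currently doing (reshape, recover, idle) and how far it has got.
    cat /sys/block/md0/md/sync_action
    cat /sys/block/md0/md/reshape_position

    # If the reshape crawls, the global md speed limits can be raised
    # (values are in KB/s per device).
    echo  50000 > /proc/sys/dev/raid/speed_limit_min
    echo 200000 > /proc/sys/dev/raid/speed_limit_max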
Thread overview: 27+ messages (newest: 2012-02-08  7:13 UTC)

2012-02-07  1:34  Please Help! RAID5 -> 6 reshapre gone bad - Richard Herd
2012-02-07  2:15  ` Phil Turmel
[not found]       ` <CAOANJV955ZdLexRTjVkQzTMapAaMitq5eqxP0rUvDjjLh4Wgzw@mail.gmail.com>
2012-02-07  2:57  ` Phil Turmel
2012-02-07  3:10  ` Richard Herd
2012-02-07  3:24  ` Keith Keller
2012-02-07  3:38  ` Phil Turmel
2012-01-31  6:31  ` rebuild raid6 after two failures - Keith Keller
2012-02-01  4:42  ` Keith Keller
2012-02-01  5:31  ` NeilBrown
2012-02-01  5:48  ` Keith Keller
2012-02-03 16:08  ` using dd (or dd_rescue) to salvage array - Keith Keller
2012-02-04 18:01  ` Stefan /*St0fF*/ Hübner
2012-02-05 19:10  ` Keith Keller
2012-02-06 21:37  ` Stefan *St0fF* Huebner
2012-02-07  3:44  ` Keith Keller
2012-02-07  4:24  ` Keith Keller
2012-02-07 20:01  ` Stefan *St0fF* Huebner
2012-02-08  7:13  ` Please Help! RAID5 -> 6 reshapre gone bad - Stan Hoeppner
2012-02-07  3:04  ` Fwd: Please Help! RAID5 -> 6 reshapre gone bad - Richard Herd
2012-02-07  2:39  ` NeilBrown
2012-02-07  3:10  ` NeilBrown
2012-02-07  3:19  ` Richard Herd
2012-02-07  3:39  ` NeilBrown
2012-02-07  3:50  ` Richard Herd
2012-02-07  4:25  ` NeilBrown
2012-02-07  5:02  ` Richard Herd
2012-02-07  5:16  ` NeilBrown