Re: Please Help! RAID5 -> 6 reshapre gone bad

From: NeilBrown <neilb@suse.de>
To: Richard Herd <2001oddity@gmail.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: Please Help! RAID5 -> 6 reshapre gone bad
Date: Tue, 7 Feb 2012 16:16:16 +1100
Message-ID: <20120207161616.1951a682@notabene.brown>
In-Reply-To: <CAOANJV8kQ2U3LaZiH0QXtaPGeEh5wQ110B9d_CU_77E5Kwit7Q@mail.gmail.com>

On Tue, 7 Feb 2012 16:02:27 +1100 Richard Herd <2001oddity@gmail.com> wrote:

> Hi Neil,
> 
> Hmm - see your point about the kernel...
> 
> Kernel updated.  I'm now running 2.6.38.
> 
> I went to work on it a bit more under 2.6.38 - I'm not sure here, it
> wouldn't take all the disks as before, but this time seems to have
> assembled (with --force) using 4 of the disks.
> 
> Trying to re-add the 5th and 6th didn't throw the same warning as
> before (failed to re-add and not adding as spare); it said 're-added
> /dev/xxx to /dev/md0', but when checking the detail we can see they were
> added as spares, not as part of the array.

That is expected.  "--force" just gets you enough to keep going, and that is
what you have.  Hopefully there are no more errors (keep the air-con on, or
maybe just keep the doors open, depending on where you are :-)
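
For anyone scripting that check, here is a minimal Python sketch that groups
the rows of the `mdadm --detail` device table by state, so spares and removed
slots stand out.  The function name and the regular expression are
illustrative assumptions based only on the output format quoted further down;
they are not part of mdadm itself.

```python
import re
from collections import defaultdict

# One row of the device table printed by `mdadm --detail`:
#   Number  Major  Minor  RaidDevice  State [device]
# RaidDevice is "-" for spares. The pattern is an assumption drawn
# from the sample output in this thread.
ROW = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+|-)\s+(.*?)\s*$")


def devices_by_state(detail_text):
    """Return {state: [device or slot, ...]} for each device-table row."""
    groups = defaultdict(list)
    for line in detail_text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header lines, "Key : value" lines, blanks
        raid_slot, rest = match.group(4), match.group(5)
        state, sep, dev = rest.partition("/dev/")
        if sep:
            groups[state.strip()].append("/dev/" + dev)
        else:
            # Rows such as "removed" carry no device path.
            groups[rest.strip()].append("slot " + raid_slot)
    return dict(groups)
```

Fed the `mdadm --detail /dev/md0` text from this message, it should list four
"active sync" members, two "removed" slots, and the two re-added partitions
under "spare".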

> 
> Anyway, with the array assembled and running, I have got the
> filesystem mounted and am quickly smashing an rsync to mirror what I
> can (8TB, how long could it take? lol).

Good news.

> 
> Thanks so much for your help guys - once I got the hint on the kernel
> it wasn't too hard to get the array assembled again.  Now it's just a
> waiting game I guess to see how much of the data is intact.  Also, at
> what point would those two disks now marked as spare be re-synced into
> the array?  After the reshape completes?

Yes.  When the reshape completes, both the spares will get included into the
array and recovered together.
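
To keep an eye on how far away that point is, a small Python sketch like the
following can pull the progress figures out of /proc/mdstat text and
recompute the remaining time.  The function name and the regular expression
are hypothetical, modelled only on the status line quoted below, and it
assumes the progress line is unwrapped as the kernel prints it.

```python
import re

# Matches the progress line md prints in /proc/mdstat, e.g.
#   reshape =  3.9% (78086144/1953511936) finish=11710.7min speed=2668K/sec
# The pattern is an assumption based on the sample in this thread.
PROGRESS = re.compile(
    r"(?P<op>reshape|recovery|resync)\s*=\s*(?P<pct>[\d.]+)%\s+"
    r"\((?P<done>\d+)/(?P<total>\d+)\)\s+"
    r"finish=(?P<finish>[\d.]+)min\s+speed=(?P<speed>\d+)K/sec"
)


def parse_progress(mdstat_text):
    """Return a dict describing the first progress line found, or None."""
    match = PROGRESS.search(mdstat_text)
    if match is None:
        return None
    done = int(match.group("done"))
    total = int(match.group("total"))
    speed = int(match.group("speed"))  # K/sec
    # done/total are in 1K blocks, so blocks / (K/sec) gives seconds.
    remaining_min = (total - done) / speed / 60 if speed else float("inf")
    return {
        "op": match.group("op"),
        "percent": float(match.group("pct")),
        "done": done,
        "total": total,
        "speed_k_per_sec": speed,
        "reported_finish_min": float(match.group("finish")),
        "remaining_min": remaining_min,
    }
```

Calling `parse_progress(open("/proc/mdstat").read())` in a loop and waiting
for it to return None is one way to notice that the reshape has finished; at
the speed quoted below, the estimate works out to roughly eight days.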


> 
> Really appreciate your help :-)

And I appreciate nice detailed bug reports - they tend to get more
attention.  Thanks!

NeilBrown



> 
> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
> md0 : active raid6 sde1[6](S) sdg1[7](S) sdc1[1] sdf1[4] sdd1[3] sdb1[2]
>       7814047744 blocks super 0.91 level 6, 64k chunk, algorithm 18 [6/4] [_UUUU_]
>       [>....................]  reshape =  3.9% (78086144/1953511936) finish=11710.7min speed=2668K/sec
> 
> unused devices: <none>
> 
> 
> root@raven:~# mdadm --detail /dev/md0
> /dev/md0:
>         Version : 0.91
>   Creation Time : Tue Jul 12 23:05:01 2011
>      Raid Level : raid6
>      Array Size : 7814047744 (7452.06 GiB 8001.58 GB)
>   Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
>    Raid Devices : 6
>   Total Devices : 6
> Preferred Minor : 0
>     Persistence : Superblock is persistent
> 
>     Update Time : Tue Feb  7 15:52:10 2012
>           State : clean, degraded, reshaping
>  Active Devices : 4
> Working Devices : 6
>  Failed Devices : 0
>   Spare Devices : 2
> 
>          Layout : left-symmetric-6
>      Chunk Size : 64K
> 
>  Reshape Status : 3% complete
>      New Layout : left-symmetric
> 
>            UUID : 9a76d1bd:2aabd685:1fc5fe0e:7751cfd7 (local to host raven)
>          Events : 0.1850269
> 
>     Number   Major   Minor   RaidDevice State
>        0       0        0        0      removed
>        1       8       33        1      active sync   /dev/sdc1
>        2       8       17        2      active sync   /dev/sdb1
>        3       8       49        3      active sync   /dev/sdd1
>        4       8       81        4      active sync   /dev/sdf1
>        5       0        0        5      removed
> 
>        6       8       65        -      spare   /dev/sde1
>        7       8       97        -      spare   /dev/sdg1
> 
> 
> 
> 
> On Tue, Feb 7, 2012 at 3:25 PM, NeilBrown <neilb@suse.de> wrote:
> > On Tue, 7 Feb 2012 14:50:57 +1100 Richard Herd <2001oddity@gmail.com> wrote:
> >
> >> Hi Neil,
> >>
> >> OK, git head is: mdadm-3.2.3-21-gda8fe5a
> >>
> >> I have 8 disks.  They get muddled about each boot (an issue I have
> >> never addressed).   Ignore sde (esata HD) and sdh (usb boot).
> >>
> >> It seems even with --force, dmesg always reports 'kicking non-fresh
> >> sdc/g1 from array!'.  Leaving sdg out as suggested by Phil doesn't
> >> help unfortunately.
> >>
> >> root@raven:/neil/mdadm# ./mdadm -Avvv --force --backup-file=/usb/md0.backup /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sdf1 /dev/sdg1
> >> mdadm: looking for devices for /dev/md0
> >> mdadm: /dev/sda1 is identified as a member of /dev/md0, slot 2.
> >> mdadm: /dev/sdb1 is identified as a member of /dev/md0, slot 1.
> >> mdadm: /dev/sdc1 is identified as a member of /dev/md0, slot 3.
> >> mdadm: /dev/sdd1 is identified as a member of /dev/md0, slot 5.
> >> mdadm: /dev/sdf1 is identified as a member of /dev/md0, slot 4.
> >> mdadm: /dev/sdg1 is identified as a member of /dev/md0, slot 0.
> >> mdadm:/dev/md0 has an active reshape - checking if critical section needs to be restored
> >> mdadm: accepting backup with timestamp 1328559119 for array with timestamp 1328567549
> >> mdadm: restoring critical section
> >> mdadm: added /dev/sdg1 to /dev/md0 as 0
> >> mdadm: added /dev/sda1 to /dev/md0 as 2
> >> mdadm: added /dev/sdc1 to /dev/md0 as 3
> >> mdadm: added /dev/sdf1 to /dev/md0 as 4
> >> mdadm: added /dev/sdd1 to /dev/md0 as 5
> >> mdadm: added /dev/sdb1 to /dev/md0 as 1
> >> mdadm: failed to RUN_ARRAY /dev/md0: Input/output error
> >
> >
> > Hmmm.... maybe your kernel isn't quite doing the right thing.
> >  commit 674806d62fb02a22eea948c9f1b5e58e0947b728 is important.
> > It is in 2.6.35.  What kernel are you running?
> > Definitely something older given the "1: w=1 pa=18...." messages.  They
> > disappear in 2.6.34.
> >
> > So I'm afraid you're going to need a new kernel.
> >
> > NeilBrown
> >
> >
> >
> >
> >>
> >> and dmesg:
> >> [13964.591801] md: bind<sdg1>
> >> [13964.595371] md: bind<sda1>
> >> [13964.595668] md: bind<sdc1>
> >> [13964.595900] md: bind<sdf1>
> >> [13964.599084] md: bind<sdd1>
> >> [13964.599652] md: bind<sdb1>
> >> [13964.600478] md: kicking non-fresh sdc1 from array!
> >> [13964.600493] md: unbind<sdc1>
> >> [13964.612138] md: export_rdev(sdc1)
> >> [13964.612163] md: kicking non-fresh sdg1 from array!
> >> [13964.612183] md: unbind<sdg1>
> >> [13964.624077] md: export_rdev(sdg1)
> >> [13964.628203] raid5: reshape will continue
> >> [13964.628243] raid5: device sdb1 operational as raid disk 1
> >> [13964.628252] raid5: device sdf1 operational as raid disk 4
> >> [13964.628260] raid5: device sda1 operational as raid disk 2
> >> [13964.629614] raid5: allocated 6308kB for md0
> >> [13964.629731] 1: w=1 pa=18 pr=6 m=2 a=2 r=6 op1=0 op2=0
> >> [13964.629742] 5: w=1 pa=18 pr=6 m=2 a=2 r=6 op1=1 op2=0
> >> [13964.629751] 4: w=2 pa=18 pr=6 m=2 a=2 r=6 op1=0 op2=0
> >> [13964.629760] 2: w=3 pa=18 pr=6 m=2 a=2 r=6 op1=0 op2=0
> >> [13964.629767] raid5: not enough operational devices for md0 (3/6 failed)
> >> [13964.640403] RAID5 conf printout:
> >> [13964.640409]  --- rd:6 wd:3
> >> [13964.640416]  disk 1, o:1, dev:sdb1
> >> [13964.640423]  disk 2, o:1, dev:sda1
> >> [13964.640429]  disk 4, o:1, dev:sdf1
> >> [13964.640436]  disk 5, o:1, dev:sdd1
> >> [13964.641621] raid5: failed to run raid set md0
> >> [13964.649886] md: pers->run() failed ...

