* and again: broken RAID5
From: Lars Schimmer @ 2012-05-10  8:07 UTC
  To: linux-raid

Hi!

I'm looking for some tips on how to get two broken RAID5 arrays
running again.

Each array consists of 4 drives. One drive was thrown out on Saturday
evening, and a second drive threw read/write errors before I could
replace the first one.

Now I have all four drives running again - it looks like it was a
controller/cable problem, so I replaced those. But mdadm tells me only
2 of 4 devices are available and it cannot start the arrays:

md2 : inactive sdh2[0](S) sdi2[4](S) sdf2[6](S) sdj2[5](S)
      4251770144 blocks super 1.2

md1 : inactive sdh1[0](S) sdi1[4](S) sdf1[6](S) sdj1[5](S)
      2097147904 blocks super 1.2

mdadm -E tells me e.g. for md1:

/dev/sdh1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : a503cea9:eb613db1:c8909233:fd5415ce
Name : debian:1  (local to host debian)
Creation Time : Sat Feb 11 13:39:46 2012
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 1048573952 (500.00 GiB 536.87 GB)
Array Size : 3145720320 (1500.00 GiB 1610.61 GB)
Used Dev Size : 1048573440 (500.00 GiB 536.87 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 710a2926:640bc675:7b5fe308:861d1883
Update Time : Sun May  6 13:57:22 2012
Checksum : e387e748 - correct
Events : 206286
Layout : left-symmetric
Chunk Size : 128K
Device Role : Active device 0
Array State : AA.. ('A' == active, '.' == missing)

/dev/sdf1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : a503cea9:eb613db1:c8909233:fd5415ce
Name : debian:1  (local to host debian)
Creation Time : Sat Feb 11 13:39:46 2012
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 1048573952 (500.00 GiB 536.87 GB)
Array Size : 3145720320 (1500.00 GiB 1610.61 GB)
Used Dev Size : 1048573440 (500.00 GiB 536.87 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : f53b097c:9762f2a4:f2af9011:3fa7ced7
Update Time : Sun May  6 13:38:26 2012
Checksum : c5f745bd - correct
Events : 206273
Layout : left-symmetric
Chunk Size : 128K
Device Role : Active device 2
Array State : AAA. ('A' == active, '.' == missing)

/dev/sdi1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : a503cea9:eb613db1:c8909233:fd5415ce
Name : debian:1  (local to host debian)
Creation Time : Sat Feb 11 13:39:46 2012
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 1048573952 (500.00 GiB 536.87 GB)
Array Size : 3145720320 (1500.00 GiB 1610.61 GB)
Used Dev Size : 1048573440 (500.00 GiB 536.87 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 79eb5c0c:db73e2f8:21d9b7f8:97936da4
Update Time : Sun May  6 01:32:35 2012
Checksum : 5dff4e2a - correct
Events : 197114
Layout : left-symmetric
Chunk Size : 128K
Device Role : Active device 3
Array State : AAAA ('A' == active, '.' == missing)

/dev/sdj1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : a503cea9:eb613db1:c8909233:fd5415ce
Name : debian:1  (local to host debian)
Creation Time : Sat Feb 11 13:39:46 2012
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 1048573952 (500.00 GiB 536.87 GB)
Array Size : 3145720320 (1500.00 GiB 1610.61 GB)
Used Dev Size : 1048573440 (500.00 GiB 536.87 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : e8055bca:0f3c9b54:421bc2c2:4a72359b
Update Time : Sun May  6 13:57:22 2012
Checksum : 388b23fc - correct
Events : 206286
Layout : left-symmetric
Chunk Size : 128K
Device Role : Active device 1
Array State : AA.. ('A' == active, '.' == missing)


So all 4 disks show a different state. Is there any chance of getting
the RAID5 arrays running again and reading some data off them?

Would the mdadm -C --assume-clean option help me in this case at all?


Thank you!

MfG,
Lars Schimmer
-- 
-------------------------------------------------------------
TU Graz, Institut für ComputerGraphik & WissensVisualisierung
Tel: +43 316 873-5405       E-Mail: l.schimmer@cgv.tugraz.at
Fax: +43 316 873-5402       PGP-Key-ID: 0x4A9B1723



* Re: and again: broken RAID5
From: NeilBrown @ 2012-05-10  9:20 UTC
  To: Lars Schimmer; +Cc: linux-raid

On Thu, 10 May 2012 10:07:21 +0200 Lars Schimmer <l.schimmer@cgv.tugraz.at>
wrote:

> Hi!
> 
> I'm looking for some tips on how to get two broken RAID5 arrays
> running again.
[...]
> So all 4 disks show a different state. Is there any chance of getting
> the RAID5 arrays running again and reading some data off them?
> 
> Would the mdadm -C --assume-clean option help me in this case at all?

Just add --force to the --assemble command.
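
For example (a sketch only - device names taken from the -E output
above; the inactive array has to be stopped before re-assembling):

  mdadm --stop /dev/md1
  mdadm --assemble --force /dev/md1 /dev/sdh1 /dev/sdj1 /dev/sdf1 /dev/sdi1

(and the same again for md2 with the sd?2 partitions).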

NeilBrown


* Re: and again: broken RAID5
From: John Robinson @ 2012-05-10 10:19 UTC
  To: NeilBrown; +Cc: Lars Schimmer, linux-raid

On 10/05/2012 10:20, NeilBrown wrote:
> On Thu, 10 May 2012 10:07:21 +0200 Lars Schimmer<l.schimmer@cgv.tugraz.at>
> wrote:
[...]
>> mdadm -E tells me e.g. for md1:
>>
>> /dev/sdh1:
>> Events : 206286
>>
>> /dev/sdf1:
>> Events : 206273
>>
>> /dev/sdi1:
>> Events : 197114
>>
>> /dev/sdj1:
>> Events : 206286
[...]
>>
>>
>> So all 4 disks show a different state. Is there any chance of getting
>> the RAID5 arrays running again and reading some data off them?
>>
>> Would the mdadm -C --assume-clean option help me in this case at all?
>
> Just add --force to the --assemble command.

I think I'd assemble from only 3 drives, sdh, sdf and sdj, because sdi's 
event count is so much lower it must be the one that was knocked out 
first and you probably want to resync onto it (or a fresh drive if it's 
actually faulty).
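
Something like this, perhaps (untested - partition names taken from the
-E report above; sdi1 is left out of the assemble and then re-added so
that it resyncs):

  mdadm --assemble --force /dev/md1 /dev/sdh1 /dev/sdf1 /dev/sdj1
  mdadm /dev/md1 --add /dev/sdi1

(likewise for md2 with the second partitions).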

Cheers,

John.



* Re: and again: broken RAID5
From: NeilBrown @ 2012-05-10 10:45 UTC
  To: John Robinson; +Cc: Lars Schimmer, linux-raid

On Thu, 10 May 2012 11:19:30 +0100 John Robinson
<john.robinson@anonymous.org.uk> wrote:

> On 10/05/2012 10:20, NeilBrown wrote:
> > On Thu, 10 May 2012 10:07:21 +0200 Lars Schimmer<l.schimmer@cgv.tugraz.at>
> > wrote:
> [...]
> >
> > Just add --force to the --assemble command.
> 
> I think I'd assemble from only 3 drives, sdh, sdf and sdj, because sdi's 
> event count is so much lower it must be the one that was knocked out 
> first and you probably want to resync onto it (or a fresh drive if it's 
> actually faulty).

Correct. And that is exactly what "mdadm --assemble --force" will decide
too. :-)
It will assemble from h, f, j and exclude i.
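
A quick way to check the result afterwards (just a generic suggestion,
nothing here is specific to this box):

  cat /proc/mdstat
  mdadm --detail /dev/md1

This should show the array active but degraded with 3 of 4 devices,
until the stale disk is re-added and resynced.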

NeilBrown


* Re: and again: broken RAID5
From: Lars Schimmer @ 2012-05-11  7:27 UTC
  Cc: linux-raid

On 2012-05-10 11:20, NeilBrown wrote:
> On Thu, 10 May 2012 10:07:21 +0200 Lars Schimmer
> <l.schimmer@cgv.tugraz.at> wrote:

>> Would the mdadm -C --assume-clean option help me in this case at
>> all?
> 
> Just add --force to the --assemble command.

Thanks to all - mdadm worked like a charm and I could read the data
off again. A fine solution.

> NeilBrown
> 

MfG,
Lars Schimmer
-- 
-------------------------------------------------------------
TU Graz, Institut für ComputerGraphik & WissensVisualisierung
Tel: +43 316 873-5405       E-Mail: l.schimmer@cgv.tugraz.at
Fax: +43 316 873-5402       PGP-Key-ID: 0x4A9B1723



* Re: and again: broken RAID5
From: John Robinson @ 2012-05-11 10:28 UTC
  To: NeilBrown; +Cc: linux-raid

On 10/05/2012 11:45, NeilBrown wrote:
> On Thu, 10 May 2012 11:19:30 +0100 John Robinson
> <john.robinson@anonymous.org.uk>  wrote:
>
>> On 10/05/2012 10:20, NeilBrown wrote:
[...]
>>> Just add --force to the --assemble command.
>>
>> I think I'd assemble from only 3 drives, sdh, sdf and sdj, because sdi's
>> event count is so much lower it must be the one that was knocked out
>> first and you probably want to resync onto it (or a fresh drive if it's
>> actually faulty).
>
> Correct. And that is exactly what "mdadm --assemble --force" will decide
> too. :-)
> It will assemble from h, f, j and exclude i.

Ah - I didn't know it was clever enough to do that; I had assumed that
if you forced all the drives they'd all be marked up-to-date.

What should I do if I had 3 different event counts (like this case), all
fairly close to each other (unlike this case), and I really did want to
force them all, avoiding a resync?

Cheers,

John.

