linux-raid.vger.kernel.org archive mirror
* RAID-5 and mdadm --assemble troubleshooting
@ 2011-03-21 22:08 A J Wyborny
  2011-03-21 22:28 ` NeilBrown
  0 siblings, 1 reply; 4+ messages in thread
From: A J Wyborny @ 2011-03-21 22:08 UTC (permalink / raw)
  To: linux-raid

Hi all,

After exhausting my efforts with Google searches and linux-raid IRC
chats, I'm reaching out to you all for some help with why I can't
assemble a broken RAID-5 configuration.  My initial problem, I've
determined, was caused by a faulty PCI-E SATA controller card.  I
would constantly lose access to my mounted RAID volume (/home) at
random times and increasingly during high write accesses.  In the past
a reboot and running "mdadm --assemble --force --scan" would solve the
issue.  This time, no such luck.  In the process of troubleshooting I
also fat-fingered an "mdadm --assemble" command and lost the
superblock of my /dev/sda1 partition, which isn't helping things
either.

The SMART status is clean on all disks.

I really appreciate any thoughts/input you might have.  -Adam

Here's my setup:

RAID-5 array with four 1.5TB disks (/dev/sda1, /dev/sdb1, /dev/sdd1, /dev/sde1)
/dev/sdc holds my root and swap partitions
/dev/md0 should be mounted to /home

Results:
root@focalor:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
unused devices: <none>
root@focalor:~# mdadm -vv --assemble --force --scan
mdadm: looking for devices for /dev/md0
mdadm: cannot open device /dev/sdc1: Device or resource busy
mdadm: /dev/sdc1 has wrong uuid.
mdadm: no RAID superblock on /dev/sda1
mdadm: /dev/sda1 has wrong uuid.
mdadm: /dev/sde1 is identified as a member of /dev/md0, slot 3.
mdadm: /dev/sdd1 is identified as a member of /dev/md0, slot 2.
mdadm: /dev/sdb1 is identified as a member of /dev/md0, slot 1.
mdadm: no uptodate device for slot 0 of /dev/md0
mdadm: added /dev/sdd1 to /dev/md0 as 2
mdadm: added /dev/sde1 to /dev/md0 as 3
mdadm: added /dev/sdb1 to /dev/md0 as 1
mdadm: failed to RUN_ARRAY /dev/md0: Input/output error
root@focalor:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : inactive sdb1[1] sde1[3] sdd1[2]
     4395407808 blocks

unused devices: <none>
root@focalor:~# mdadm --stop /dev/md0
mdadm: stopped /dev/md0
root@focalor:~# mdadm -vv --assemble --force /dev/md0 /dev/sdb1 /dev/sdd1 /dev/sde1
mdadm: looking for devices for /dev/md0
mdadm: /dev/sdb1 is identified as a member of /dev/md0, slot 1.
mdadm: /dev/sdd1 is identified as a member of /dev/md0, slot 2.
mdadm: /dev/sde1 is identified as a member of /dev/md0, slot 3.
mdadm: no uptodate device for slot 0 of /dev/md0
mdadm: added /dev/sdd1 to /dev/md0 as 2
mdadm: added /dev/sde1 to /dev/md0 as 3
mdadm: added /dev/sdb1 to /dev/md0 as 1
mdadm: failed to RUN_ARRAY /dev/md0: Input/output error
root@focalor:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : inactive sdb1[1] sde1[3] sdd1[2]
     4395407808 blocks

unused devices: <none>
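
The "inactive" state shown above can also be picked out of /proc/mdstat
mechanically. The following is only a minimal sketch, assuming the
"mdN : inactive ..." line layout seen in this transcript;
inactive_arrays is a hypothetical helper name, not an mdadm command.

```shell
#!/bin/sh
# List md arrays that /proc/mdstat reports as inactive.
# inactive_arrays is a hypothetical helper, not part of mdadm.
inactive_arrays() {
    # Reads mdstat-format text on stdin; prints one array name per line.
    # A status line looks like: "md0 : inactive sdb1[1] sde1[3] sdd1[2]"
    awk '$2 == ":" && $3 == "inactive" { print $1 }'
}

# Only read the real file when it exists (i.e. on a Linux host with md).
if [ -r /proc/mdstat ]; then
    inactive_arrays < /proc/mdstat
fi
```

Fed the mdstat text from this message, it would print the single name md0.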

dmesg output:
http://pastebin.com/usrzvmpn

mdadm -E output:
http://pastebin.com/vnaamC75
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: RAID-5 and mdadm --assemble troubleshooting
  2011-03-21 22:08 RAID-5 and mdadm --assemble troubleshooting A J Wyborny
@ 2011-03-21 22:28 ` NeilBrown
  2011-03-22 16:42   ` A J Wyborny
  0 siblings, 1 reply; 4+ messages in thread
From: NeilBrown @ 2011-03-21 22:28 UTC (permalink / raw)
  To: A J Wyborny; +Cc: linux-raid

On Mon, 21 Mar 2011 23:08:40 +0100 A J Wyborny <ajwyborny@gmail.com> wrote:

> root@focalor:~# mdadm -vv --assemble --force --scan
> mdadm: looking for devices for /dev/md0
> mdadm: cannot open device /dev/sdc1: Device or resource busy
> mdadm: /dev/sdc1 has wrong uuid.
> mdadm: no RAID superblock on /dev/sda1
> mdadm: /dev/sda1 has wrong uuid.
> mdadm: /dev/sde1 is identified as a member of /dev/md0, slot 3.
> mdadm: /dev/sdd1 is identified as a member of /dev/md0, slot 2.
> mdadm: /dev/sdb1 is identified as a member of /dev/md0, slot 1.
> mdadm: no uptodate device for slot 0 of /dev/md0
> mdadm: added /dev/sdd1 to /dev/md0 as 2
> mdadm: added /dev/sde1 to /dev/md0 as 3
> mdadm: added /dev/sdb1 to /dev/md0 as 1
> mdadm: failed to RUN_ARRAY /dev/md0: Input/output error

You are hitting an mdadm bug fixed in 2.6.9 by 

http://neil.brown.name/git?p=mdadm;a=commitdiff;h=4e9a6ff778cdc58dcc6897e74cf5ee1d3f73e1f7

What version of mdadm are you running?

You can work around it by
  echo 1 > /sys/module/md_mod/parameters/start_dirty_degraded

before running the 'mdadm -A' command.
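
Put together, the workaround might look like the sketch below. The array
and member names are the ones from the report; DRY_RUN and
assemble_degraded are hypothetical names added for this example, and by
default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch of the workaround sequence; needs root when DRY_RUN=0.
# PARAM comes from the thread; DRY_RUN and assemble_degraded are
# hypothetical names introduced for this example.
PARAM=/sys/module/md_mod/parameters/start_dirty_degraded
DRY_RUN=${DRY_RUN:-1}

assemble_degraded() {
    # $1 = array device, remaining arguments = surviving member devices.
    md=$1
    shift
    if [ "$DRY_RUN" = 0 ]; then
        # Allow a dirty, degraded array to start, then force assembly.
        echo 1 > "$PARAM" || return 1
        mdadm --assemble --force "$md" "$@"
    else
        # Dry run: show the commands instead of running them.
        echo "echo 1 > $PARAM"
        echo "mdadm --assemble --force $md $*"
    fi
}

assemble_degraded /dev/md0 /dev/sdb1 /dev/sdd1 /dev/sde1
```

Running it with DRY_RUN=0 would perform the two steps for real, so check
the device list against your own 'mdadm -E' output first.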


> dmesg output:
> http://pastebin.com/usrzvmpn
> 
> mdadm -E output:
> http://pastebin.com/vnaamC75

It is OK - even encouraged - to include this content directly in the email.
That makes it easier to reference in a reply, should that be helpful.

NeilBrown


* Re: RAID-5 and mdadm --assemble troubleshooting
  2011-03-21 22:28 ` NeilBrown
@ 2011-03-22 16:42   ` A J Wyborny
  2011-03-23  0:04     ` NeilBrown
  0 siblings, 1 reply; 4+ messages in thread
From: A J Wyborny @ 2011-03-22 16:42 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

Wow, thank you so much.  I forgot to mention that I AM running mdadm
v3.1.4, compiled from source, after initially running the Ubuntu
default of 2.6.7.1.  I'm not sure why the "start_dirty_degraded" file
wasn't updated, but I'm glad I know about it now.

My attempts to re-add /dev/sda1 to the array right away failed, but a
reboot took care of that problem and it's now recovering.
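
For anyone wanting to watch a rebuild like this one, here is a rough
sketch that pulls the progress figure out of /proc/mdstat. It assumes
the usual "recovery = N%" (or "resync = N%") progress line;
rebuild_progress is a hypothetical helper name.

```shell
#!/bin/sh
# rebuild_progress is a hypothetical helper: it prints "<array> <percent>"
# for every array whose mdstat entry shows a recovery or resync running.
rebuild_progress() {
    # Reads mdstat-format text on stdin.
    awk '
        # Remember the array name from its status line ("md0 : active ...").
        $2 == ":" && $1 ~ /^md/ { md = $1 }
        # On a progress line, print the field that ends in a percent sign.
        /recovery =|resync =/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) print md, $i
        }'
}

# Only read the real file when it exists (i.e. on a Linux host with md).
if [ -r /proc/mdstat ]; then
    rebuild_progress < /proc/mdstat
fi
```

Wrapping the last call in a loop with 'sleep 60' gives a crude monitor.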

Again, I really appreciate it.

Adam


* Re: RAID-5 and mdadm --assemble troubleshooting
  2011-03-22 16:42   ` A J Wyborny
@ 2011-03-23  0:04     ` NeilBrown
  0 siblings, 0 replies; 4+ messages in thread
From: NeilBrown @ 2011-03-23  0:04 UTC (permalink / raw)
  To: A J Wyborny; +Cc: linux-raid

On Tue, 22 Mar 2011 17:42:33 +0100 A J Wyborny <ajwyborny@gmail.com> wrote:

> Wow, thank you so much.  I forgot to mention that I AM running mdadm
> v3.1.4, compiled from source, after initially running the Ubuntu
> default of 2.6.7.1.  I'm not sure why the "start_dirty_degraded" file
> wasn't updated, but I'm glad I know about it now.

I thought it was fixed in 3.1.4... apparently not quite.  Your particular
case was a bit unusual and slipped through the tests.  I've added a fix which
will be in 3.1.5 and 3.2.1.

'start_dirty_degraded' is only intended to be used when arrays are
auto-assembled by the kernel (a practice that I don't encourage).
I suggested its use here as a workaround for the bug in mdadm rather than
the appropriate way to generally deal with your situation. It won't be needed
in future mdadm releases.
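
Since the parameter is a workaround rather than a permanent setting, it
can be useful to see whether it is currently switched on. This is a
minimal sketch, assuming the file reads back as 0 or 1;
dirty_degraded_state is a hypothetical helper name.

```shell
#!/bin/sh
# Report whether start_dirty_degraded is currently enabled.
# dirty_degraded_state is a hypothetical helper; it assumes the
# parameter file contains a plain 0 or 1.
PARAM=/sys/module/md_mod/parameters/start_dirty_degraded

dirty_degraded_state() {
    # $1 = path to the parameter file (defaults to PARAM).
    f=${1:-$PARAM}
    if [ ! -r "$f" ]; then
        echo unknown
        return 0
    fi
    case "$(cat "$f")" in
        1) echo enabled ;;
        0) echo disabled ;;
        *) echo unknown ;;
    esac
}

dirty_degraded_state
```

If it reports "enabled" after the array is running again, writing 0 to
the same file as root should switch it back off.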

Thanks,
NeilBrown


