From: Alexander Lyakas <alex.bolshoy@gmail.com>
To: NeilBrown <neilb@suse.de>
Cc: linux-raid@vger.kernel.org
Subject: Re: RAID5: failing an active component during spare rebuild - arrays hangs
Date: Sun, 26 Jun 2011 21:13:17 +0300 [thread overview]
Message-ID: <BANLkTinijekK8Y6TTk6TYtYhv4Dn2YGQbA@mail.gmail.com> (raw)
In-Reply-To: <20110622125409.14428883@notabene.brown>
Hello Neil,
thank you for your response. Meanwhile I have moved to stock Ubuntu
Natty 11.04, but the issue still happens. I have a simple script that
reproduces it for me in less than a minute.
System details:
Linux ubuntu 2.6.38-8-server #42-Ubuntu SMP Mon Apr 11 03:49:04 UTC
2011 x86_64 x86_64 x86_64 GNU/Linux
Here is the script:
##################################
#!/bin/bash
while true
do
mdadm --create /dev/md1123 --raid-devices=3 --level=5 \
    --bitmap=internal --name=1123 --run --auto=md --metadata=1.2 \
    --homehost=alex --verbose /dev/sda /dev/sdb /dev/sdc
sleep 6
mdadm --manage /dev/md1123 --fail /dev/sda
sleep 1
if mdadm --stop /dev/md1123
then
true
else
break
fi
done
#####################################
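In case it is useful, here is the sort of helper I could run in a second shell
while the loop above is going, to snapshot the array state once a second (just
a sketch, not part of the repro itself):
##################################
#!/bin/bash
# Log /proc/mdstat and mdadm --detail once a second, so the state at the
# moment of the --fail can be lined up with the repro output afterwards.
while true
do
date --rfc-3339=seconds >> mdstat.log
cat /proc/mdstat >> mdstat.log
mdadm --detail /dev/md1123 >> mdstat.log 2>&1
echo "----" >> mdstat.log
sleep 1
done
#####################################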
And here is the output of one run. At the end of the output, the
--stop command fails and from that point I am unable to do anything
with the array, other than rebooting the machine.
root@ubuntu:/mnt/work/alex# ./repro.sh
mdadm: layout defaults to left-symmetric
mdadm: chunk size defaults to 512K
mdadm: layout defaults to left-symmetric
mdadm: /dev/sda appears to be part of a raid array:
level=raid5 devices=3 ctime=Sun Jun 26 20:55:54 2011
mdadm: layout defaults to left-symmetric
mdadm: /dev/sdb appears to be part of a raid array:
level=raid5 devices=3 ctime=Sun Jun 26 20:55:54 2011
mdadm: layout defaults to left-symmetric
mdadm: /dev/sdc appears to be part of a raid array:
level=raid5 devices=3 ctime=Sun Jun 26 20:55:54 2011
mdadm: size set to 20969984K
mdadm: creation continuing despite oddities due to --run
mdadm: array /dev/md1123 started.
mdadm: set /dev/sda faulty in /dev/md1123
mdadm: stopped /dev/md1123
mdadm: layout defaults to left-symmetric
mdadm: chunk size defaults to 512K
mdadm: layout defaults to left-symmetric
mdadm: /dev/sda appears to be part of a raid array:
level=raid5 devices=3 ctime=Sun Jun 26 20:57:45 2011
mdadm: layout defaults to left-symmetric
mdadm: /dev/sdb appears to be part of a raid array:
level=raid5 devices=3 ctime=Sun Jun 26 20:57:45 2011
mdadm: layout defaults to left-symmetric
mdadm: /dev/sdc appears to be part of a raid array:
level=raid5 devices=3 ctime=Sun Jun 26 20:57:45 2011
mdadm: size set to 20969984K
mdadm: creation continuing despite oddities due to --run
mdadm: array /dev/md1123 started.
mdadm: set /dev/sda faulty in /dev/md1123
mdadm: stopped /dev/md1123
mdadm: layout defaults to left-symmetric
mdadm: chunk size defaults to 512K
mdadm: layout defaults to left-symmetric
mdadm: /dev/sda appears to be part of a raid array:
level=raid5 devices=3 ctime=Sun Jun 26 20:57:52 2011
mdadm: layout defaults to left-symmetric
mdadm: /dev/sdb appears to be part of a raid array:
level=raid5 devices=3 ctime=Sun Jun 26 20:57:52 2011
mdadm: layout defaults to left-symmetric
mdadm: /dev/sdc appears to be part of a raid array:
level=raid5 devices=3 ctime=Sun Jun 26 20:57:52 2011
mdadm: size set to 20969984K
mdadm: creation continuing despite oddities due to --run
mdadm: array /dev/md1123 started.
mdadm: set /dev/sda faulty in /dev/md1123
mdadm: failed to stop array /dev/md1123: Device or resource busy
Perhaps a running process, mounted filesystem or active volume group?
At this point mdadm --detail produces:
/dev/md1123:
Version : 1.2
Creation Time : Sun Jun 26 20:57:59 2011
Raid Level : raid5
Array Size : 41939968 (40.00 GiB 42.95 GB)
Used Dev Size : 20969984 (20.00 GiB 21.47 GB)
Raid Devices : 3
Total Devices : 3
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Sun Jun 26 20:58:23 2011
State : active, FAILED
Active Devices : 1
Working Devices : 2
Failed Devices : 1
Spare Devices : 1
Layout : left-symmetric
Chunk Size : 512K
Name : alex:1123
UUID : cd564563:94fecf52:5b3492d4:4530ecbc
Events : 4
Number Major Minor RaidDevice State
0 8 0 0 faulty spare rebuilding /dev/sda
1 8 16 1 active sync /dev/sdb
3 8 32 2 spare rebuilding /dev/sdc
and the faulty device is not kicked out of the array, as I would have expected it to be.
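If more state would help, this is roughly what I plan to capture the next time
the array wedges (a sketch; the sysrq dump assumes sysrq is enabled on the box):
##################################
cat /proc/mdstat                 # rebuild/resync position, if any
mdadm --detail /dev/md1123       # array and per-device state
dmesg | tail -n 200              # md error messages so far
echo w > /proc/sysrq-trigger     # dump blocked tasks (needs sysrq enabled)
dmesg | tail -n 200              # the blocked-task stack traces land here
#####################################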
Thanks,
Alex.
On Wed, Jun 22, 2011 at 5:54 AM, NeilBrown <neilb@suse.de> wrote:
>
> On Sun, 5 Jun 2011 22:41:55 +0300 Alexander Lyakas <alex.bolshoy@gmail.com>
> wrote:
>
> > Hello everybody,
> > I am testing a scenario, in which I create a RAID5 with three devices:
> > /dev/sd{a,b,c}. Since I don't supply --force to mdadm during creation,
> > it treats the array as degraded and starts rebuilding the sdc as a
> > spare. This is as documented.
> >
> > Then I do --fail on /dev/sda. I understand that at this point my data
> > is gone, but I think should still be able to tear down the array.
> >
> > Sometimes I see that /dev/sda is kicked from the array as faulty, and
> > /dev/sdc is also removed and marked as a spare. Then I am able to tear
> > down the array.
> >
> > But sometimes, it looks like the system hits some kind of a deadlock.
>
> I cannot reproduce this, either on current mainline or 2.6.38. I didn't try
> the particular Ubuntu kernel that you mentioned as I don't have any Ubuntu
> machines.
> It is unlikely that Ubuntu have broken something, but not impossible... are
> you able to compile a kernel.org kernel (preferably 2.6.39) and see if you
> can reproduce it?
>
> Also, can you provide a simple script that will trigger the bug reliably for
> you?
>
> I did:
>
> while : ; do mdadm -CR /dev/md0 -l5 -n3 /dev/sd[abc] ; sleep 5; mdadm /dev/md0 -f /dev/sda ; mdadm -Ss ; echo ; echo; done
>
> and it has no problems at all.
>
> Certainly a deadlock shouldn't be happening...
>
> From the stack trace you get it looks like it is probably hanging at
>
> wait_event(mddev->recovery_wait, !atomic_read(&mddev->recovery_active));
>
> which suggests that some resync request started and didn't complete. I've
> never seen a hang there before.
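If it is stuck in that wait_event, then next time it happens the md sysfs files
should show the recovery still marked active but not making progress; I can
check something like this (a sketch, using the files under /sys/block/<md>/md/):
##################################
cat /sys/block/md1123/md/sync_action      # should still read "recover" if the rebuild thread is alive
cat /sys/block/md1123/md/sync_completed   # a value that never advances would fit resync I/O that never finished
cat /sys/block/md1123/md/array_state
#####################################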
>
> NeilBrown
>