From: Glen Dragon <glen.dragon@gmail.com>
To: NeilBrown <neilb@suse.de>
Cc: linux-raid@vger.kernel.org
Subject: Re: raid5 reshape failure - restart?
Date: Sun, 15 May 2011 17:45:34 -0400
Message-ID: <BANLkTi=7oHaQYbOA+qWHHzUzP5fp6UB=9A@mail.gmail.com>
In-Reply-To: <20110516073702.6b6b9bb2@notabene.brown>

On Sun, May 15, 2011 at 5:37 PM, NeilBrown <neilb@suse.de> wrote:
> On Sun, 15 May 2011 13:33:28 -0400 Glen Dragon <glen.dragon@gmail.com> wrote:
>
>> In trying to reshape a raid5 array, I encountered some problems.
>> I was trying to reshape from raid5 3->4 devices. The reshape process
>> started with seemingly no problems; however, I noticed in the kernel log
>> a number of ata3.00: failed command: WRITE FPDMA QUEUED errors.
>> In trying to determine if this was going to be bad for me, I disabled
>> ncq on this device. Looking at the log, I noticed that around the same
>> time /dev/sdd reported problems and took itself offline.
>> At this point the reshape seemed to be continuing without issue, even
>> though one of the drives was offline. I wasn't sure that this made
>> sense.
>>
>> Shortly after, I noticed that the progress on the reshape had stalled.
>> I tried changing the stripe_cache_size from 256 to [1024|2048|4096],
>> but the reshape did not resume. top reported that the reshape process
>> was using 100% of one core, and the load average was climbing into the
>> 50s.
>>
>> At this point I rebooted. The array does not start.
>>
>> Can the reshape be restarted? I cannot figure out where the backup
>> file ended up. It does not seem to be where I thought I saved it.
>
> When a reshape is increasing the size of the array the backup file is only
> needed for the first few stripes. After that it is irrelevant and is removed.
>
> You should be able to simply reassemble the array and it should continue the
> reshape.
>
> What happens when you try:
>
> mdadm -S /dev/md_d2
> mdadm -A /dev/md_d2 /dev/sd[abc]5 -vv
>
> Please report both the messages from mdadm and any new messages in "dmesg" at
> the time.
>
> NeilBrown
>
# mdadm -S /dev/md_d2
mdadm: stopped /dev/md_d2
# mdadm -A /dev/md_d2 /dev/sd[abcd]5 -vv
mdadm: looking for devices for /dev/md_d2
mdadm: /dev/sda5 is identified as a member of /dev/md_d2, slot 0.
mdadm: /dev/sdb5 is identified as a member of /dev/md_d2, slot 1.
mdadm: /dev/sdc5 is identified as a member of /dev/md_d2, slot 3.
mdadm: /dev/sdd5 is identified as a member of /dev/md_d2, slot 2.
mdadm:/dev/md_d2 has an active reshape - checking if critical section needs to be restored
mdadm: No backup metadata on device-3
mdadm: added /dev/sdb5 to /dev/md_d2 as 1
mdadm: added /dev/sdd5 to /dev/md_d2 as 2
mdadm: added /dev/sdc5 to /dev/md_d2 as 3
mdadm: added /dev/sda5 to /dev/md_d2 as 0
mdadm: /dev/md_d2 assembled from 3 drives - not enough to start the array while not clean - consider --force.
# mdadm -D /dev/md_d2
mdadm: md device /dev/md_d2 does not appear to be active.
# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [multipath] [raid1]
md_d2 : inactive sda5[0](S) sdc5[3](S) sdd5[2](S) sdb5[1](S)
2799357952 blocks super 0.91
md8 : active raid5 sdh1[0] sdg1[4] sdf1[1] sdi1[3] sde1[2]
5860542464 blocks level 5, 512k chunk, algorithm 2 [5/5] [UUUUU]
md1 : active raid5 sdd3[2] sdb3[1] sda3[0]
62926336 blocks level 5, 256k chunk, algorithm 2 [3/3] [UUU]
md0 : active raid1 sdb1[1] sda1[0] sdd1[2]
208704 blocks [3/3] [UUU]
kernel log:
md: md_d2 stopped.
md: unbind<sda5>
md: export_rdev(sda5)
md: unbind<sdc5>
md: export_rdev(sdc5)
md: unbind<sdd5>
md: export_rdev(sdd5)
md: unbind<sdb5>
md: export_rdev(sdb5)
md: md_d2 stopped.
md: bind<sdb5>
md: bind<sdd5>
md: bind<sdc5>
md: bind<sda5>
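
For anyone who wants to check the same condition from a script, here is a minimal sketch that reads /proc/mdstat-style text and lists arrays the kernel reports as inactive, together with their member devices. It assumes the layout matches the output pasted above; the function name inactive_arrays and the embedded sample text are illustrative only and are not part of mdadm or the kernel.

```python
import re

# Sample text copied from the /proc/mdstat output in this message.
SAMPLE_MDSTAT = """\
Personalities : [raid6] [raid5] [raid4] [multipath] [raid1]
md_d2 : inactive sda5[0](S) sdc5[3](S) sdd5[2](S) sdb5[1](S)
2799357952 blocks super 0.91
md8 : active raid5 sdh1[0] sdg1[4] sdf1[1] sdi1[3] sde1[2]
5860542464 blocks level 5, 512k chunk, algorithm 2 [5/5] [UUUUU]
md1 : active raid5 sdd3[2] sdb3[1] sda3[0]
62926336 blocks level 5, 256k chunk, algorithm 2 [3/3] [UUU]
md0 : active raid1 sdb1[1] sda1[0] sdd1[2]
208704 blocks [3/3] [UUU]
"""

# Matches an array header line: "<name> : <active|inactive> <rest>".
ARRAY_LINE = re.compile(r"^(\S+) : (active|inactive)\s*(.*)$")
# Matches a member entry such as "sda5[0](S)": device, bracketed number, optional flag.
MEMBER = re.compile(r"(\w+)\[(\d+)\](\(\w\))?")


def inactive_arrays(mdstat_text):
    """Return {array_name: {bracketed_number: device}} for inactive arrays only."""
    result = {}
    for line in mdstat_text.splitlines():
        match = ARRAY_LINE.match(line.strip())
        if not match or match.group(2) != "inactive":
            continue
        members = {
            int(number): device
            for device, number, _flag in MEMBER.findall(match.group(3))
        }
        result[match.group(1)] = members
    return result


if __name__ == "__main__":
    # On a live system, replace SAMPLE_MDSTAT with open("/proc/mdstat").read().
    for name, members in inactive_arrays(SAMPLE_MDSTAT).items():
        print(name, [members[key] for key in sorted(members)])
```

Run against the sample, this reports only md_d2, with all four partitions attached to it, which matches the assemble output: every member was bound, but the array was not started.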