From: Roberto Spadim <roberto@spadim.com.br>
To: NeilBrown <neilb@suse.de>
Cc: Brad Campbell <lists2009@fnarfbargle.com>, linux-raid@vger.kernel.org
Subject: Re: What the heck happened to my array?
Date: Fri, 8 Apr 2011 12:27:55 -0300
Message-ID: <BANLkTinJ+VqEf0Se8Wj_8J08W0ofsShVWQ@mail.gmail.com>
In-Reply-To: <20110408195238.1e051c40@notabene.brown>
Hi Neil, given time I could help change the sysfs information (add unit
information to the output, and remove the unit text from the input).
What's the current development kernel version?
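
As an illustration of why units matter here: the numbers quoted below line up
if /proc/mdstat and component_size are read as 1K blocks while sync_completed
and sync_max are read as 512-byte sectors. That reading is an assumption drawn
from the quoted output, not something checked against the md source. A minimal
shell sketch of the arithmetic (the helper names k_to_sectors and can_progress
are made up for this example and are not part of mdadm or the kernel):

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, nothing here touches /sys.

# Convert a size in 1K blocks to 512-byte sectors.
k_to_sectors() {
    echo $(( $1 * 2 ))
}

# Say whether a reshape has room to advance, assuming it pauses once it
# reaches sync_max.
# Arguments: current position in sectors, sync_max in sectors (or "max").
can_progress() {
    pos=$1
    limit=$2
    if [ "$limit" = "max" ] || [ "$limit" -gt "$pos" ]; then
        echo yes
    else
        echo no
    fi
}

# Values taken from the quoted output.
mdstat_done_k=861698048     # reshape position shown by /proc/mdstat
component_k=976759808       # value that ended up in sync_max

done_sectors=$(k_to_sectors "$mdstat_done_k")

echo "completed (sectors):      $done_sectors"
echo "component size (sectors): $(k_to_sectors "$component_k")"
echo "sync_max=$component_k can progress: $(can_progress "$done_sectors" "$component_k")"
echo "sync_max=max can progress: $(can_progress "$done_sectors" max)"
```

With these values the completed position comes out as 1723396096 sectors,
matching sync_completed, and a sync_max of 976759808 leaves no room to
advance, which would fit the stalled reshape in the second listing.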
2011/4/8 NeilBrown <neilb@suse.de>:
> On Fri, 08 Apr 2011 09:19:01 +0800 Brad Campbell <lists2009@fnarfbargle.com>
> wrote:
>
>> On 05/04/11 14:10, NeilBrown wrote:
>>
>> > I would suggest:
>> > copy anything that you need off, just in case - if you can.
>> >
>> > Kill the mdadm that is running in the back ground. This will mean that
>> > if the machine crashes your array will be corrupted, but you are thinking
>> > of rebuilding it anyway, so that isn't the end of the world.
>> > In /sys/block/md0/md
>> > cat suspend_hi > suspend_lo
>> > cat component_size > sync_max
>> >
>>
>> root@srv:/sys/block/md0/md# cat /proc/mdstat
>> Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
>> md0 : active raid6 sdc[0] sdd[6](S) sdl[1](S) sdh[9] sda[8] sde[7]
>> sdg[5] sdb[4] sdf[3] sdm[2]
>> 7814078464 blocks super 1.2 level 6, 512k chunk, algorithm 2
>> [10/8] [U_UUUU_UUU]
>> [=================>...] reshape = 88.2% (861696000/976759808)
>> finish=3713.3min speed=516K/sec
>>
>> md2 : active raid5 sdi[0] sdk[3] sdj[1]
>> 1465146368 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/3]
>> [UUU]
>>
>> md6 : active raid1 sdp6[0] sdo6[1]
>> 821539904 blocks [2/2] [UU]
>>
>> md5 : active raid1 sdp5[0] sdo5[1]
>> 104864192 blocks [2/2] [UU]
>>
>> md4 : active raid1 sdp3[0] sdo3[1]
>> 20980800 blocks [2/2] [UU]
>>
>> md3 : active raid1 sdp2[0] sdo2[1]
>> 8393856 blocks [2/2] [UU]
>>
>> md1 : active raid1 sdp1[0] sdo1[1]
>> 20980736 blocks [2/2] [UU]
>>
>> unused devices: <none>
>> root@srv:/sys/block/md0/md# cat component_size > sync_max
>> cat: write error: Device or resource busy
>
> Sorry, I should have checked the source code.
>
>
> echo max > sync_max
>
> is what you want.
> Or just a much bigger number.
>
>>
>> root@srv:/sys/block/md0/md# cat suspend_hi suspend_lo
>> 13788774400
>> 13788774400
>
> They are the same so that is good - nothing will be suspended.
>
>>
>> root@srv:/sys/block/md0/md# grep . sync_*
>> sync_action:reshape
>> sync_completed:1723392000 / 1953519616
>> sync_force_parallel:0
>> sync_max:1723392000
>> sync_min:0
>> sync_speed:281
>> sync_speed_max:200000 (system)
>> sync_speed_min:200000 (local)
>>
>> So I killed mdadm, then did the cat suspend_hi > suspend_lo.. but as you
>> can see it won't let me change sync_max. The array above reports
>> 516K/sec, but that was just on its way down to 0 on a time based
>> average. It was not moving at all.
>>
>> I then tried stopping the array, restarting it with mdadm 3.1.4 which
>> immediately segfaulted and left the array in state resync=DELAYED.
>>
>> I issued the above commands again, which succeeded this time but while
>> the array looked good, it was not resyncing :
>> root@srv:/sys/block/md0/md# cat /proc/mdstat
>> Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
>> md0 : active raid6 sdc[0] sdd[6](S) sdl[1](S) sdh[9] sda[8] sde[7]
>> sdg[5] sdb[4] sdf[3] sdm[2]
>> 7814078464 blocks super 1.2 level 6, 512k chunk, algorithm 2
>> [10/8] [U_UUUU_UUU]
>> [=================>...] reshape = 88.2% (861698048/976759808)
>> finish=30203712.0min speed=0K/sec
>>
>> md2 : active raid5 sdi[0] sdk[3] sdj[1]
>> 1465146368 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/3]
>> [UUU]
>>
>> md6 : active raid1 sdp6[0] sdo6[1]
>> 821539904 blocks [2/2] [UU]
>>
>> md5 : active raid1 sdp5[0] sdo5[1]
>> 104864192 blocks [2/2] [UU]
>>
>> md4 : active raid1 sdp3[0] sdo3[1]
>> 20980800 blocks [2/2] [UU]
>>
>> md3 : active raid1 sdp2[0] sdo2[1]
>> 8393856 blocks [2/2] [UU]
>>
>> md1 : active raid1 sdp1[0] sdo1[1]
>> 20980736 blocks [2/2] [UU]
>>
>> unused devices: <none>
>>
>> root@srv:/sys/block/md0/md# grep . sync*
>> sync_action:reshape
>> sync_completed:1723396096 / 1953519616
>> sync_force_parallel:0
>> sync_max:976759808
>> sync_min:0
>> sync_speed:0
>> sync_speed_max:200000 (system)
>> sync_speed_min:200000 (local)
>>
>> I stopped the array and restarted it with mdadm 3.2.1 and it continued
>> along its merry way.
>>
>> Not an issue, and I don't much care if it blew something up, but I
>> thought it worthy of a follow up.
>>
>> If there is anything you need tested while it's in this state I've got ~
>> 1000 minutes of resync time left and I'm happy to damage it if requested.
>
> No thanks - I think I know what happened. Main problem is that there is
> confusion between 'k' and 'sectors' and there are random other values that
> sometimes work (like 'max') and I never remember which is which. sysfs in md
> is a bit of a mess.... one day I hope to completely replace it (with back
> compatibility of course...)
>
> Thanks for the feedback.
>
> NeilBrown
>
>
>>
>> Regards,
>> Brad
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
--
Roberto Spadim
Spadim Technology / SPAEmpresarial
Thread overview:
2011-04-03 13:32 What the heck happened to my array? (No apparent data loss) Brad Campbell
2011-04-03 15:47 ` Roberto Spadim
2011-04-04 5:59 ` Brad Campbell
2011-04-04 16:49 ` Roberto Spadim
2011-04-05 0:47 ` What the heck happened to my array? Brad Campbell
2011-04-05 6:10 ` NeilBrown
2011-04-05 9:02 ` Brad Campbell
2011-04-05 11:31 ` NeilBrown
2011-04-05 11:47 ` Brad Campbell
2011-04-08 1:19 ` Brad Campbell
2011-04-08 9:52 ` NeilBrown
2011-04-08 15:27 ` Roberto Spadim [this message]