From mboxrd@z Thu Jan 1 00:00:00 1970
From: NeilBrown
Subject: Re: Likely forced assembly with wrong disk during raid5 grow. Recoverable?
Date: Mon, 21 Feb 2011 11:53:03 +1100
Message-ID: <20110221115303.4862e093@notabene.brown>
References: <20110220162509.2eb85a03@notabene.brown>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: QUOTED-PRINTABLE
Return-path:
In-Reply-To:
Sender: linux-raid-owner@vger.kernel.org
To: Claude Nobs
Cc: linux-raid@vger.kernel.org
List-Id: linux-raid.ids

On Sun, 20 Feb 2011 15:44:35 +0100 Claude Nobs wrote:

> > They are the 'Number' column in the --detail output below.  This is /dev/md1
> > - I can tell from the --examine outputs, but it is a bit confusing.  Newer
> > versions of mdadm make this a little less confusing.  If you look for
> > patterns of U and u in the 'Array State' line, the U is 'this device', the
> > 'u' is some other devices.
>
> Actually this is running a stock Ubuntu 10.10 server kernel. But as
> it is from my memory it could very well have been :
>
>     2930281920 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/5] [U_UUU]
>

I'm quite sure it would have been '[U_UUU]' as you say.
When I say "Newer versions" I mean of mdadm, not the kernel.
What does mdadm -V show?
Version 3.0 or later gives less confusing output for "mdadm --examine"
on 1.x metadata.

> > Just to go through some of the numbers...
> >
> > Chunk size is 64K.  Reshape was 4->5, so 3 -> 4 data disks.
> > So old stripes have 192K, new stripes have 256K.
> >
> > The 'good' disks think reshape has reached 502815488K which is
> > 1964123 new stripes. (2618830.66 old stripes)
> > md1 thinks reshape has only reached 489510400K which is 1912150
> > new stripes (2549533.33 old stripes).
>
> i think you mixed up sdd1 with md1 here? (the numbers above for md1
> are for sdd1. md1 would be: reshape has reached 502809856K, which
> would be 1964101 new stripes,
> so the difference between the good disks
> and md1 would be 22 stripes.)

Yes, I got them mixed up.  But the net result is the same - the 'new' stripe
numbers haven't got close to overwriting the 'old' stripe numbers.

>
> > So of the 51973 stripes that have been reshaped since the last metadata
> > update on sdd1, some will have been done on sdd1, but some not, and we don't
> > really know how many.  But it is perfectly safe to repeat those stripes
> > as all writes to that region will have been suspended (and you probably
> > weren't writing anyway).
>
> jep there was nothing writing to the array. so now i am a little
> confused. if you meant sdd1 (which failed first and is 51973 stripes
> behind) this would imply that at least so many stripes of data are
> kept of the old (3 data disks) configuration as well as the new one?
> if continuing from there is possible then the array would no longer be
> degraded right? so i think you meant md1 (22 stripes behind), as
> keeping 5.5M of data from the old and new config seems more
> reasonable. however this is just a guess :-)

Yes, it probably is possible to re-assemble the array to include sdd1 and not
have a degraded array, and still have all your data safe - providing you are
sure that nothing at all changed on the array (e.g. maybe it was unmounted?).

I'm not sure I'd recommend it though....  I cannot see anything that would go
wrong, but it is somewhat unknown territory.  Up to you...

If you:

  % git clone git://neil.brown.name/mdadm master
  % cd mdadm
  % make
  % sudo bash
  # ./mdadm -S /dev/md2
  # ./mdadm -Afvv /dev/md2 /dev/sda1 /dev/md0 /dev/md1 /dev/sdc1

it should restart your array - degraded - and repeat the last stages of
reshape, just in case.
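The stripe arithmetic quoted earlier in this message can be sanity-checked with a few lines of Python. This is a standalone sketch, not part of mdadm; the chunk size and reshape positions are the ones reported in this thread:

```python
# Stripe counts from the thread: 64K chunks, reshape 4->5 devices,
# i.e. 3 -> 4 data disks per stripe.
CHUNK_K = 64
OLD_STRIPE_K = 3 * CHUNK_K   # 192K of data per old stripe
NEW_STRIPE_K = 4 * CHUNK_K   # 256K of data per new stripe

good = 502815488   # reshape position recorded on the 'good' disks (in K)
md1  = 502809856   # reshape position recorded on md1 (in K)
sdd1 = 489510400   # reshape position recorded on sdd1 (in K)

print(good // NEW_STRIPE_K)   # 1964123 new stripes
print(md1 // NEW_STRIPE_K)    # 1964101 new stripes
print(sdd1 // NEW_STRIPE_K)   # 1912150 new stripes

# How far behind the good disks each lagging device is:
print(good // NEW_STRIPE_K - md1 // NEW_STRIPE_K)    # 22 stripes
print(good // NEW_STRIPE_K - sdd1 // NEW_STRIPE_K)   # 51973 stripes

# 22 stripes * 256K = 5632K, i.e. the ~5.5M Claude mentions.
print(22 * NEW_STRIPE_K)      # 5632
```
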
Alternately, before you run 'make' you could edit Assemble.c, find:

	while (force && !enough(content->array.level, content->array.raid_disks,
				content->array.layout, 1,
				avail, okcnt)) {

around line 818, and change the '1,' to '0,', then run make, mdadm -S, and then

  # ./mdadm -Afvv /dev/md2 /dev/sda1 /dev/md0 /dev/md1 /dev/sdc1 /dev/sdd1

It should assemble the array non-degraded and repeat all of the reshape since
sdd1 fell out of the array.

As you have a backup, this is probably safe, because even if it goes bad you
can restore from backups - not that I expect it to go bad, but ....

> >
> > Thanks for the excellent problem report.
> >
> > NeilBrown
>
> Well i thank you for providing such an elaborate and friendly answer!
> this is actually my first mailing list post and considering how many
> questions get ignored (don't know about this list though) i just hoped
> someone would at least answer with a one liner... i never expected
> this. so thanks again.

All part of the service... :-)

NeilBrown
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html