From: Nagilum
Subject: Re: Accidental grow before add
Date: Tue, 28 Sep 2010 17:14:51 +0200
To: Mike Hartman
Cc: Jon@ehardcastle.com, linux-raid@vger.kernel.org

----- Message from mike@hartmanipulation.com ---------

>> I am more interested to know why it kicked off a reshape that would
>> leave the array in a degraded state without a warning and without
>> needing a '--force'. Are you sure there wasn't capacity to 'grow'
>> anyway?
>
> Positive. I had no spare of any kind and mdstat was showing all disks
> were in use.

Yep, a warning/safety net would be good. At the moment mdadm assumes
you know what you're doing.

> Now I've got the new drive in there as a spare, but it was added
> after the reshape started and mdadm doesn't seem to be trying to use
> it yet. I'm thinking it's going through the original reshape I kicked
> off (transforming it from an intact 7-disk RAID 6 to a degraded
> 8-disk RAID 6) and then when it gets to the end it will run another
> reshape to pick up the new spare.

Yes, that's what's going to happen.

>> Also, when I first ran my reshape, from RAID 5 to 6, it was
>> incredibly slow; it literally took days.

> I did a RAID 5 -> RAID 6 conversion the other week and it was also
> slower than a normal resizing, but only 2-2.5 times as slow. Adding a
> new disk usually takes a bit less than 2 days on this array and that
> conversion took closer to 4. However, at the slowest rate I reported
> above it would have taken something like 11 months - definitely a
> whole different ballpark.

Yeah, that was due to the disk errors. I find "iostat -d 2 -kx"
helpful to understand what's going on.

> At any rate, apparently one of my other drives in the array was
> throwing some read errors. Eventually it did something unrecoverable
> and was dropped from the array. Once that happened the speed returned
> to a more normal level, but I stopped the arrays to run a complete
> read test on every drive before continuing. With an already degraded
> array, losing that drive killed any failure buffer I had left. I want
> to make quite sure all the other drives will finish the reshape
> properly before risking it. Then I guess it's just a matter of
> waiting 3 or 4 days for both reshapes to complete.

Yep, I once got bitten by a Linux kernel bug that caused the RAID 5 to
corrupt when a drive failed during reshape. I managed to recover,
though. Since then I always do a raid-check before starting any
changes.

Good luck and thanks for the story so far.

Alex.

========================================================================
#    _  __          _ __     http://www.nagilum.org/ \n icq://69646724 #
#   / |/ /__ ____ _(_) /_ ____ _  nagilum@nagilum.org \n +491776461165 #
#  /    / _ `/ _ `/ / / // /  ' \   Amiga (68k/PPC): AOS/NetBSD/Linux  #
# /_/|_/\_,_/\_, /_/_/\_,_/_/_/_/   Mac (PPC): MacOS-X / NetBSD /Linux #
#           /___/     x86: FreeBSD/Linux/Solaris/Win2k  ARM9: EPOC EV6 #
========================================================================

----------------------------------------------------------------
cakebox.homeunix.net - all the machine one needs..
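
For anyone wanting to reproduce the workflow discussed in this thread,
here is a minimal sketch of the commands involved. The device names
(/dev/md0 for the array, /dev/sdh for the new disk) are placeholders,
not taken from the thread; adjust them to your setup, and note that
all of this needs root.

  # Scrub the array first (this is what distro "raid-check" cron
  # scripts do under the hood); watch /proc/mdstat until it finishes.
  echo check > /sys/block/md0/md/sync_action
  cat /proc/mdstat

  # Add the new disk as a spare *before* growing, then reshape onto it.
  mdadm --add /dev/md0 /dev/sdh
  mdadm --grow /dev/md0 --raid-devices=8

  # Watch per-disk throughput and utilization while the reshape runs;
  # one slow or erroring disk will drag the whole array down.
  iostat -d 2 -kx

Adding the spare before issuing the grow avoids the situation that
started this thread, where the grow was run with no spare present and
mdadm reshaped straight into a degraded 8-disk layout.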