From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Steinar H. Gunderson"
Subject: Re: [PATCH] Online RAID-5 resizing
Date: Tue, 20 Sep 2005 17:36:22 +0200
Message-ID: <20050920153622.GA14287@uio.no>
References: <20050920143346.GA5777@uio.no> <17200.9302.242957.23189@cse.unsw.edu.au>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: QUOTED-PRINTABLE
Return-path:
Content-Disposition: inline
In-Reply-To: <17200.9302.242957.23189@cse.unsw.edu.au>
Sender: linux-raid-owner@vger.kernel.org
To: Neil Brown
Cc: linux-raid@vger.kernel.org
List-Id: linux-raid.ids

On Wed, Sep 21, 2005 at 01:01:42AM +1000, Neil Brown wrote:
> Shrinking certainly adds a lot of complications, and you would have to
> start at the 'top' and work backwards. Probably not worth the effort,
> except that people might want to be able to back-out a change...

I worked on EVMS' resizing code prior to doing this, and it seems like a
resize was simply doing it the other way without any further complications...
I don't know how the underlying block layer in Linux would like it, though.

>> - It leaks memory; it doesn't properly free up the old stripes etc. at the
>>   end of the resize. (This also makes it impossible to do a grow and then
>>   another grow without stopping and starting the volumes.)
> I'm sure that can be fixed.

Yes, of course; it's mostly about not having gotten around to doing it yet.
A good start would be doing shrink_stripes(), but the “finish up the expanding”
code is currently called from __release_stripe() when the last stripe from
the old array is freed, and thus is done under the device_lock, and I had
problems doing memory management under the spinlock. The correct solution
would probably be moving it into raid5d, outside the spinlock.

> Crash recovery is essential I think. There are some awkward cases,
> particularly while growing the first few stripes. I'm sure we can
> work it out together.

Mm, or at least the very first stripe. I'm not really sure if it's worth it,
though; perfect crash recovery is pretty hard (for one, you'd have to disable
all write caching on the destination disks), and I'm not sure how probable a
power loss 20ms into the resizing is.

>> - It's quite slow; on my test system with old IDE disks, it achieves about
>>   1MB/sec. One could probably make a speed/memory tradeoff here, and move
>>   more chunks at a time instead of just one by one; I'm a bit concerned
>>   about the implications of the kernel allocating something like 64MB in one
>>   go, though :-)
> I doubt speed is a top priority.

Well, with multi-terabyte arrays, restriping at those speeds will take
_weeks_, so more speed is always good. I agree that we don't need to be
pushing it very hard, though.

> I'll try to have a read through your code over the next week or so and
> give you more detailed feedback.

OK, thanks. :-) There's a lot of unneeded junk in the patch, BTW (some
reindenting here and there that I don't know where is coming from, plus lots
of temporary added printks), but I guess we can sort out the cleanness after
a while. :-)

/* Steinar */
-- 
Homepage: http://www.sesse.net/
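
As a rough check on the "weeks" figure in the message, here is a
back-of-the-envelope sketch in Python. The array sizes are illustrative
assumptions (the message only says "multi-terabyte"), the 1 MB/s rate is the
one reported for the test system, MB and TB are taken as decimal units, and
restripe_days is a hypothetical helper name, not anything from the patch.

```python
def restripe_days(array_bytes, rate_bytes_per_sec):
    """Days needed to move array_bytes at a constant rate_bytes_per_sec."""
    seconds = array_bytes / rate_bytes_per_sec
    return seconds / 86400  # 86400 seconds per day


MB = 10**6   # assumption: decimal megabyte
TB = 10**12  # assumption: decimal terabyte

# Illustrative array sizes; the reported restripe rate is about 1 MB/s.
for size_tb in (1, 2, 4):
    days = restripe_days(size_tb * TB, 1 * MB)
    print(f"{size_tb} TB at 1 MB/s: {days:.1f} days ({days / 7:.1f} weeks)")
```

Under these assumptions each terabyte costs a bit under twelve days, so a
2 TB array already takes more than three weeks, which is consistent with the
estimate in the message. The model ignores any slowdown from concurrent I/O
on the array.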