From: NeilBrown
Subject: Re: What the heck happened to my array?
Date: Tue, 5 Apr 2011 16:10:43 +1000
Message-ID: <20110405161043.00d54901@notabene.brown>
References: <4D9876E4.6080501@fnarfbargle.com> <4D995E27.3060800@fnarfbargle.com> <4D9A6694.4040606@fnarfbargle.com>
In-Reply-To: <4D9A6694.4040606@fnarfbargle.com>
To: Brad Campbell
Cc: linux-raid@vger.kernel.org

On Tue, 05 Apr 2011 08:47:16 +0800 Brad Campbell wrote:

> On 05/04/11 00:49, Roberto Spadim wrote:
> > i don´t know but this happened with me on a hp server, with linux
> > 2,6,37 i changed kernel to a older release and the problem ended,
> > check with neil and others md guys what´s the real problem
> > maybe realtime module and others changes inside kernel are the
> > problem, maybe not...
> > just a quick solution idea: try a older kernel
> >
>
> Quick precis:
> - Started reshape 512k to 64k chunk size.
> - sdd got bad sector and was kicked.
> - Array froze all IO.

That .... shouldn't happen. But I know why it did.

mdadm forks and runs in the background monitoring the reshape.
It suspends IO to a region of the array, backs up the data, then lets the
reshape progress over that region, then invalidates the backup and allows IO
to resume, then moves on to the next region (it actually has two regions in
different states at the same time, but you get the idea).

When the device failed, the reshape in the kernel aborted and then restarted.
It is meant to do this - restore to a known state, then decide if there is
anything useful to do. It restarts exactly where it left off so all should
be fine.

mdadm periodically checks the value in 'sync_completed' to see how far the
reshape has progressed to know if it can move on.
If it checks while the reshape is temporarily aborted it sees 'none', which
is not a number, so it aborts. That should be fixed.
It aborts with IO to a region still suspended, so it is very possible for IO
to freeze if anything is destined for that region.

> - Reboot required to get system back.
> - Restarted reshape with 9 drives.
> - sdl suffered IO error and was kicked

Very sad.

> - Array froze all IO.

Same thing...

> - Reboot required to get system back.
> - Array will no longer mount with 8/10 drives.
> - Mdadm 3.1.5 segfaults when trying to start reshape.

Don't know why it would have done that... I cannot reproduce it easily.

>   Naively tried to run it under gdb to get a backtrace but was unable
>   to stop it forking

Yes, tricky .... an "strace -o /tmp/file -f mdadm ...." might have been
enough, but too late to worry about that now.

> - Got array started with mdadm 3.2.1
> - Attempted to re-add sdd/sdl (now marked as spares)

Hmm... it isn't meant to do that any more. I thought I fixed it so that
if a device looked like part of the array it wouldn't add it as a spare...
Obviously that didn't work. I'd better look into it again.

> [  304.393245] mdadm[5940]: segfault at 7f2000 ip 00000000004480d2 sp
> 00007fffa04777b8 error 4 in mdadm[400000+64000]
>

If you have the exact mdadm binary that caused this segfault we should be
able to figure out what instruction was at 0004480d2. If you don't feel up
to it, could you please email me the file privately and I'll have a look.

> root@srv:~/mdadm-3.1.5# uname -a
> Linux srv 2.6.38 #19 SMP Wed Mar 23 09:57:05 WST 2011 x86_64 GNU/Linux
>
> Now. The array restarted with mdadm 3.2.1, but of course its now
> reshaping 8 out of 10 disks, has no redundancy and is going at 600k/s
> which will take over 10 days. Is there anything I can do to give it some
> redundancy while it completes or am I better to copy the data off, blow
> it away and start again?
> All the important stuff is backed up anyway, I
> just wanted to avoid restoring 8TB from backup if I could.

No, you cannot give it extra redundancy.
I would suggest:
  copy anything that you need off, just in case - if you can.

  Kill the mdadm that is running in the background. This will mean that
  if the machine crashes your array will be corrupted, but you are thinking
  of rebuilding it anyway, so that isn't the end of the world.
  In /sys/block/md0/md
     cat suspend_hi > suspend_lo
     cat component_size > sync_max

  That will allow the reshape to continue without any backup. It will be
  much faster (but less safe, as I said).

  If the reshape completes without incident, it will start recovering to the
  two 'spares' - and then you will have a happy array again.

  If something goes wrong, you will need to scrap the array, recreate it, and
  copy data back from wherever you copied it to (or backups).

If anything there doesn't make sense, or doesn't seem to work - please ask.

Thanks for the report. I'll try to get those mdadm issues addressed -
particularly if you can get me the mdadm file which caused the segfault.

NeilBrown
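
For illustration only, here is a minimal sketch of a 'sync_completed' check
that tolerates the 'none' value described above instead of aborting. It is
written in Python, not taken from mdadm's C source, and it assumes the file
holds either a non-numeric word such as 'none' or two sector counts written
as "done / total". The names parse_sync_completed and next_position are
hypothetical.

```python
from typing import Optional, Tuple


def parse_sync_completed(text: str) -> Optional[Tuple[int, int]]:
    """Return (done, total) in sectors, or None if no position is reported."""
    fields = text.strip().split("/")
    if len(fields) != 2:
        # Assumed case: 'none' while the reshape is idle or being restarted.
        return None
    try:
        # int() ignores the spaces around each number.
        return int(fields[0]), int(fields[1])
    except ValueError:
        return None


def next_position(text: str, last_done: int) -> int:
    """Pick the position a monitor should act on after one poll."""
    parsed = parse_sync_completed(text)
    if parsed is None:
        # Not a number: keep the last known position and poll again later,
        # rather than exiting while IO to a region is still suspended.
        return last_done
    return parsed[0]
```

With this shape, a poll that lands while the reshape is temporarily aborted
just repeats the previous position, so the caller can retry on the next poll.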