From: "Janos Haar" <janos.haar@netcenter.hu>
To: MRK <mrk@shiftmail.org>
Cc: linux-raid@vger.kernel.org
Subject: Re: Suggestion needed for fixing RAID6
Date: Thu, 29 Apr 2010 23:07:40 +0200 [thread overview]
Message-ID: <0c1201cae7e0$01f9a930$0400a8c0@dcccs> (raw)
In-Reply-To: 4BD9A41E.9050009@shiftmail.org
----- Original Message -----
From: "MRK" <mrk@shiftmail.org>
To: "Janos Haar" <janos.haar@netcenter.hu>
Cc: <linux-raid@vger.kernel.org>
Sent: Thursday, April 29, 2010 5:22 PM
Subject: Re: Suggestion needed for fixing RAID6
> On 04/29/2010 09:55 AM, Janos Haar wrote:
>>
>> md3 : active raid6 sdd4[12] sdl4[11] sdk4[10] sdj4[9] sdi4[8] dm-1[13](F)
>> sdg4[6
>> ] sdf4[5] dm-0[4] sdc4[2] sdb4[1] sda4[0]
>> 14626538880 blocks level 6, 16k chunk, algorithm 2 [12/10]
>> [UUU_UUU_UUUU]
>> [===========>.........] recovery = 56.8% (831095108/1462653888)
>> finish=50
>> 19.8min speed=2096K/sec
>>
>> Drive dropped again with this patch!
>> + the kernel freezed.
>> (I will try to get more info...)
>>
>> Janos
>
> Hmm too bad :-( it seems it still doesn't work, sorry for that
>
> I suppose the kernel didn't freeze immediately after disabling the drive
> or you wouldn't have had the chance to cat /proc/mdstat...
this was this command in putty.exe window:
watch "cat /proc/mdstat ; du -h /snap*"
I think it have crashed soon.
I had no time to recognize what happened and exit from the watch.
>
> Hence dmesg messages might have gone to /var/log/messages or something.
> Can you look there to see if there is any interesting message to post
> here?
Yes, i know that.
The crash was not written up unfortunately.
But there is some info:
(some UNC reported from sdh)
....
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: res
51/40:00:27:c0:5e/40:00:63:00:00/e0 Emask 0x9 (media error)
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: ata8.00: status: { DRDY ERR }
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: ata8.00: error: { UNC }
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: ata8.00: configured for UDMA/133
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: sd 7:0:0:0: [sdh] Result:
hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: sd 7:0:0:0: [sdh] Sense Key : Medium
Error [current] [descriptor]
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: Descriptor sense data with sense
descriptors (in hex):
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: 72 03 11 04 00 00 00 0c 00
0a 80 00 00 00 00 00
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: 63 5e c0 27
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: sd 7:0:0:0: [sdh] Add. Sense:
Unrecovered read error - auto reallocate failed
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: end_request: I/O error, dev sdh,
sector 1667153959
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: raid5:md3: read error not
correctable (sector 1662189872 on dm-1).
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: raid5:md3: read error not
correctable (sector 1662189880 on dm-1).
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: raid5:md3: read error not
correctable (sector 1662189888 on dm-1).
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: raid5:md3: read error not
correctable (sector 1662189896 on dm-1).
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: raid5:md3: read error not
correctable (sector 1662189904 on dm-1).
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: raid5:md3: read error not
correctable (sector 1662189912 on dm-1).
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: raid5:md3: read error not
correctable (sector 1662189920 on dm-1).
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: raid5:md3: read error not
correctable (sector 1662189928 on dm-1).
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: raid5:md3: read error not
correctable (sector 1662189936 on dm-1).
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: raid5:md3: read error not
correctable (sector 1662189944 on dm-1).
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: ata8: EH complete
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: sd 7:0:0:0: [sdh] Write Protect is
off
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: sd 7:0:0:0: [sdh] Write cache:
enabled, read cache: enabled, doesn't support DPO or FUA
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: sd 7:0:0:0: [sdh] 2930277168
512-byte hardware sectors: (1.50 TB/1.36 TiB)
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: sd 7:0:0:0: [sdh] Write Protect is
off
Apr 29 09:50:29 Clarus-gl2k10-2 kernel: sd 7:0:0:0: [sdh] Write cache:
enabled, read cache: enabled, doesn't support DPO or FUA
Apr 29 13:07:39 Clarus-gl2k10-2 syslogd 1.4.1: restart.
> Did the COW device fill up at least a bit?
The initial size is 1.1MB, and what we wants to see is only some kbytes...
I don't know exactly.
Next time i will try to reduce the initial size to 16KByte.
>
> Also: you know that if you disable graphics on the server
> ("/etc/init.d/gdm stop" or something like that) you usually can see the
> stack trace of the kernel panic on screen when it hangs (unless terminal
> was blank for powersaving, which you can disable too). You can take a
> photo of that one (or write it down but it will be long) to so maybe
> somebody can understand why it hanged. You might be even obtain the stack
> trace through a serial port but that will take more effort.
This pc based server have no graphic card at all. :-) (this is one of my
freak ideas)
And the terminal is redirected to the com1.
If i really want, i can catch this with serial cable, but i think the log
should be enough from the messages file.
Thanks,
Janos
next prev parent reply other threads:[~2010-04-29 21:07 UTC|newest]
Thread overview: 48+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-04-22 10:09 Suggestion needed for fixing RAID6 Janos Haar
2010-04-22 15:00 ` Mikael Abrahamsson
2010-04-22 15:12 ` Janos Haar
2010-04-22 15:18 ` Mikael Abrahamsson
2010-04-22 16:25 ` Janos Haar
2010-04-22 16:32 ` Peter Rabbitson
[not found] ` <4BD0AF2D.90207@stud.tu-ilmenau.de>
2010-04-22 20:48 ` Janos Haar
2010-04-23 6:51 ` Luca Berra
2010-04-23 8:47 ` Janos Haar
2010-04-23 12:34 ` MRK
2010-04-24 19:36 ` Janos Haar
2010-04-24 22:47 ` MRK
2010-04-25 10:00 ` Janos Haar
2010-04-26 10:24 ` MRK
2010-04-26 12:52 ` Janos Haar
2010-04-26 16:53 ` MRK
2010-04-26 22:39 ` Janos Haar
2010-04-26 23:06 ` Michael Evans
[not found] ` <7cfd01cae598$419e8d20$0400a8c0@dcccs>
2010-04-27 0:04 ` Michael Evans
2010-04-27 15:50 ` Janos Haar
2010-04-27 23:02 ` MRK
2010-04-28 1:37 ` Neil Brown
2010-04-28 2:02 ` Mikael Abrahamsson
2010-04-28 2:12 ` Neil Brown
2010-04-28 2:30 ` Mikael Abrahamsson
2010-05-03 2:29 ` Neil Brown
2010-04-28 12:57 ` MRK
2010-04-28 13:32 ` Janos Haar
2010-04-28 14:19 ` MRK
2010-04-28 14:51 ` Janos Haar
2010-04-29 7:55 ` Janos Haar
2010-04-29 15:22 ` MRK
2010-04-29 21:07 ` Janos Haar [this message]
2010-04-29 23:00 ` MRK
2010-04-30 6:17 ` Janos Haar
2010-04-30 23:54 ` MRK
[not found] ` <4BDB6DB6.5020306@sh iftmail.org>
2010-05-01 9:37 ` Janos Haar
2010-05-01 17:17 ` MRK
2010-05-01 21:44 ` Janos Haar
2010-05-02 23:05 ` MRK
2010-05-03 2:17 ` Neil Brown
2010-05-03 10:04 ` MRK
2010-05-03 10:21 ` MRK
2010-05-03 21:04 ` Neil Brown
2010-05-03 21:02 ` Neil Brown
[not found] ` <4BDE9FB6.80309@shiftmai! l.org>
2010-05-03 10:20 ` Janos Haar
2010-05-05 15:24 ` Suggestion needed for fixing RAID6 [SOLVED] Janos Haar
2010-05-05 19:27 ` MRK
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='0c1201cae7e0$01f9a930$0400a8c0@dcccs' \
--to=janos.haar@netcenter.hu \
--cc=linux-raid@vger.kernel.org \
--cc=mrk@shiftmail.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.