linux-raid.vger.kernel.org archive mirror
From: "Janos Haar" <janos.haar@netcenter.hu>
To: MRK <mrk@shiftmail.org>
Cc: linux-raid@vger.kernel.org
Subject: Re: Suggestion needed for fixing RAID6
Date: Tue, 27 Apr 2010 17:50:43 +0200
Message-ID: <80a201cae621$684daa30$0400a8c0@dcccs>
In-Reply-To: 4BD5C51E.9040207@shiftmail.org


----- Original Message ----- 
From: "MRK" <mrk@shiftmail.org>
To: "Janos Haar" <janos.haar@netcenter.hu>
Cc: <linux-raid@vger.kernel.org>
Sent: Monday, April 26, 2010 6:53 PM
Subject: Re: Suggestion needed for fixing RAID6


> On 04/26/2010 02:52 PM, Janos Haar wrote:
>>
>> Oops, you are right!
>> It was my mistake.
>> Sorry, i will try it again, to support 2 drives with dm-cow.
>> I will try it.
>
> Great! post here the results... the dmesg in particular.
> The dmesg should contain multiple lines like this "raid5:md3: read error 
> corrected ....."
> then you know it worked.

I am afraid I am still right about that...

...
end_request: I/O error, dev sdh, sector 1667152256
raid5:md3: read error not correctable (sector 1662188168 on dm-1).
raid5: Disk failure on dm-1, disabling device.
raid5: Operation continuing on 10 devices.
raid5:md3: read error not correctable (sector 1662188176 on dm-1).
raid5:md3: read error not correctable (sector 1662188184 on dm-1).
raid5:md3: read error not correctable (sector 1662188192 on dm-1).
raid5:md3: read error not correctable (sector 1662188200 on dm-1).
raid5:md3: read error not correctable (sector 1662188208 on dm-1).
raid5:md3: read error not correctable (sector 1662188216 on dm-1).
raid5:md3: read error not correctable (sector 1662188224 on dm-1).
raid5:md3: read error not correctable (sector 1662188232 on dm-1).
raid5:md3: read error not correctable (sector 1662188240 on dm-1).
ata8: EH complete
sd 7:0:0:0: [sdh] 2930277168 512-byte hardware sectors: (1.50 TB/1.36 TiB)
ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata8.00: port_status 0x20200000
ata8.00: cmd 25/00:f8:f5:ba:5e/00:03:63:00:00/e0 tag 0 dma 520192 in
         res 51/40:00:ef:bb:5e/40:00:63:00:00/e0 Emask 0x9 (media error)
ata8.00: status: { DRDY ERR }
ata8.00: error: { UNC }
ata8.00: configured for UDMA/133
ata8: EH complete
....
....
sd 7:0:0:0: [sdh] Add. Sense: Unrecovered read error - auto reallocate failed
end_request: I/O error, dev sdh, sector 1667152879
__ratelimit: 36 callbacks suppressed
raid5:md3: read error not correctable (sector 1662188792 on dm-1).
raid5:md3: read error not correctable (sector 1662188800 on dm-1).
md: md3: recovery done.
raid5:md3: read error not correctable (sector 1662188808 on dm-1).
raid5:md3: read error not correctable (sector 1662188816 on dm-1).
raid5:md3: read error not correctable (sector 1662188824 on dm-1).
raid5:md3: read error not correctable (sector 1662188832 on dm-1).
raid5:md3: read error not correctable (sector 1662188840 on dm-1).
raid5:md3: read error not correctable (sector 1662188848 on dm-1).
raid5:md3: read error not correctable (sector 1662188856 on dm-1).
raid5:md3: read error not correctable (sector 1662188864 on dm-1).
ata8: EH complete
sd 7:0:0:0: [sdh] Write Protect is off
sd 7:0:0:0: [sdh] Mode Sense: 00 3a 00 00
sd 7:0:0:0: [sdh] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata8.00: port_status 0x20200000
....
....
res 51/40:00:27:c0:5e/40:00:63:00:00/e0 Emask 0x9 (media error)
ata8.00: status: { DRDY ERR }
ata8.00: error: { UNC }
ata8.00: configured for UDMA/133
sd 7:0:0:0: [sdh] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
sd 7:0:0:0: [sdh] Sense Key : Medium Error [current] [descriptor]
Descriptor sense data with sense descriptors (in hex):
        72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
        63 5e c0 27
sd 7:0:0:0: [sdh] Add. Sense: Unrecovered read error - auto reallocate failed
end_request: I/O error, dev sdh, sector 1667153959
__ratelimit: 86 callbacks suppressed
raid5:md3: read error not correctable (sector 1662189872 on dm-1).
raid5:md3: read error not correctable (sector 1662189880 on dm-1).
raid5:md3: read error not correctable (sector 1662189888 on dm-1).
raid5:md3: read error not correctable (sector 1662189896 on dm-1).
raid5:md3: read error not correctable (sector 1662189904 on dm-1).
raid5:md3: read error not correctable (sector 1662189912 on dm-1).
raid5:md3: read error not correctable (sector 1662189920 on dm-1).
raid5:md3: read error not correctable (sector 1662189928 on dm-1).
raid5:md3: read error not correctable (sector 1662189936 on dm-1).
raid5:md3: read error not correctable (sector 1662189944 on dm-1).
ata8: EH complete
sd 7:0:0:0: [sdh] 2930277168 512-byte hardware sectors: (1.50 TB/1.36 TiB)
sd 7:0:0:0: [sdh] Write Protect is off
sd 7:0:0:0: [sdh] Mode Sense: 00 3a 00 00
sd 7:0:0:0: [sdh] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 7:0:0:0: [sdh] 2930277168 512-byte hardware sectors: (1.50 TB/1.36 TiB)
sd 7:0:0:0: [sdh] Write Protect is off
sd 7:0:0:0: [sdh] Mode Sense: 00 3a 00 00
sd 7:0:0:0: [sdh] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
RAID5 conf printout:
 --- rd:12 wd:10
 disk 0, o:1, dev:sda4
 disk 1, o:1, dev:sdb4
 disk 2, o:1, dev:sdc4
 disk 3, o:1, dev:sdd4
 disk 4, o:1, dev:dm-0
 disk 5, o:1, dev:sdf4
 disk 6, o:1, dev:sdg4
 disk 7, o:0, dev:dm-1
 disk 8, o:1, dev:sdi4
 disk 9, o:1, dev:sdj4
 disk 10, o:1, dev:sdk4
 disk 11, o:1, dev:sdl4
RAID5 conf printout:
 --- rd:12 wd:10
 disk 0, o:1, dev:sda4
 disk 1, o:1, dev:sdb4
 disk 2, o:1, dev:sdc4
 disk 3, o:1, dev:sdd4
 disk 4, o:1, dev:dm-0
 disk 5, o:1, dev:sdf4
 disk 6, o:1, dev:sdg4
 disk 7, o:0, dev:dm-1
 disk 8, o:1, dev:sdi4
 disk 9, o:1, dev:sdj4
 disk 10, o:1, dev:sdk4
 disk 11, o:1, dev:sdl4
RAID5 conf printout:
 --- rd:12 wd:10
 disk 0, o:1, dev:sda4
 disk 1, o:1, dev:sdb4
 disk 2, o:1, dev:sdc4
 disk 3, o:1, dev:sdd4
 disk 4, o:1, dev:dm-0
 disk 5, o:1, dev:sdf4
 disk 6, o:1, dev:sdg4
 disk 7, o:0, dev:dm-1
 disk 8, o:1, dev:sdi4
 disk 9, o:1, dev:sdj4
 disk 10, o:1, dev:sdk4
 disk 11, o:1, dev:sdl4
RAID5 conf printout:
 --- rd:12 wd:10
 disk 0, o:1, dev:sda4
 disk 1, o:1, dev:sdb4
 disk 2, o:1, dev:sdc4
 disk 3, o:1, dev:sdd4
 disk 4, o:1, dev:dm-0
 disk 5, o:1, dev:sdf4
 disk 6, o:1, dev:sdg4
 disk 8, o:1, dev:sdi4
 disk 9, o:1, dev:sdj4
 disk 10, o:1, dev:sdk4
 disk 11, o:1, dev:sdl4
md: recovery of RAID array md3
md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
md: using 128k window, over a total of 1462653888 blocks.
md: resuming recovery of md3 from checkpoint.

md3 : active raid6 sdd4[12] sdl4[11] sdk4[10] sdj4[9] sdi4[8] dm-1[13](F) sdg4[6] sdf4[5] dm-0[4] sdc4[2] sdb4[1] sda4[0]
      14626538880 blocks level 6, 16k chunk, algorithm 2 [12/10] [UUU_UUU_UUUU]
      [===============>.....]  recovery = 75.3% (1101853312/1462653888) finish=292.3min speed=20565K/sec

du -h /sna*
1.1M    /snapshot2.bin
1.1M    /snapshot.bin

df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/md1               19G   16G  3.5G  82% /
/dev/md0               99M   34M   60M  36% /boot
tmpfs                 2.0G     0  2.0G   0% /dev/shm

This is the current state. :-(
This way, the sync will stop again at 97.9%.

Any other ideas?
Or how can I solve this dm-snapshot problem?

I think I know how this can happen:
If I am right, the sync uses the normal block size, which is usually 4 KB
on Linux.
But the bad blocks are 512 bytes.
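
As a rough check of the 4 KB assumption, the md error lines in the dmesg excerpt above can be compared with each other. The sketch below copies four of those lines and computes the distance between consecutive reported sectors. The helper names (md_error_sectors, strides) are made up for illustration, and the result is only consistent with 4 KB handling, it does not prove how md rewrites the data.

```python
import re

# Four consecutive lines copied from the dmesg excerpt in this message.
LOG = """\
raid5:md3: read error not correctable (sector 1662188168 on dm-1).
raid5:md3: read error not correctable (sector 1662188176 on dm-1).
raid5:md3: read error not correctable (sector 1662188184 on dm-1).
raid5:md3: read error not correctable (sector 1662188192 on dm-1).
"""

SECTOR_BYTES = 512  # kernel log sector numbers are in 512-byte units


def md_error_sectors(log):
    """Extract the sector numbers from 'read error not correctable' lines."""
    pattern = r"read error not correctable \(sector (\d+) on"
    return [int(m.group(1)) for m in re.finditer(pattern, log)]


def strides(sectors):
    """Distance, in sectors, between each pair of consecutive entries."""
    return [b - a for a, b in zip(sectors, sectors[1:])]


steps = strides(md_error_sectors(LOG))
print(steps)                    # [8, 8, 8]
print(steps[0] * SECTOR_BYTES)  # 4096
```

md reports one error per 8 sectors (4096 bytes), while the drive itself reports single failing sectors such as 1667152879.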
Let's take one 4K window as an example:
[BGBGBBGG] B: bad G: good sector
The sync reads the block, and the reported state is UNC because the drive
reported UNC for some sector in this area.
md recalculates the first 512-byte bad block, because its address is the
same as that of the 4K block, then rewrites it.
Then it re-reads the 4K block, which is still UNC because the 3rd sector
is bad.
Can this be the issue?
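
The scenario above can be modelled with a small toy simulation. It is only a sketch of the hypothesis, not of what md actually does: it assumes that a read of the 4K block fails if any of its eight 512-byte sectors is bad, and that a successful write clears the bad state of exactly the sectors written. The function names are hypothetical.

```python
SECTORS_PER_BLOCK = 8  # 4 KiB block / 512-byte sectors


def read_block(bad):
    """A whole-block read reports UNC if any sector in it is bad."""
    return "UNC" if any(bad) else "OK"


def rewrite(bad, first, count):
    """Assumed model: writing a sector makes it readable again."""
    fixed = list(bad)
    for i in range(first, first + count):
        fixed[i] = False
    return fixed


# The example window from the text: B = bad, G = good.
window = [c == "B" for c in "BGBGBBGG"]

after_one = rewrite(window, 0, 1)                  # only the first sector
after_all = rewrite(window, 0, SECTORS_PER_BLOCK)  # the full 4K block

print(read_block(window), read_block(after_one), read_block(after_all))
# UNC UNC OK
```

Under these assumptions, rewriting only the first 512 bytes leaves the re-read failing, while rewriting the whole 4K block would clear it.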

Thanks,
Janos



