From: Ethan Wilson <ethan.wilson@shiftmail.org>
To: "Pedro Teixeira" <finas@aeiou.pt>, "Lars Täuber" <taeuber@bbaw.de>
Cc: linux-raid@vger.kernel.org
Subject: Re: strange problem with raid6 read errors on active non-degraded array
Date: Wed, 02 Jul 2014 18:35:09 +0200
Message-ID: <53B434BD.30301@shiftmail.org>
In-Reply-To: <20140702151406.Horde.HZoGSPYRo99TtBOu1q6B-GA@webmail.aeiou.pt>

You have multiple bad-block lists (an MD feature) that are already full
of sectors. These are earlier disk errors that were recorded in the MD
headers (one list per drive).
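
A minimal sketch of how these per-drive lists could be inspected. It
assumes the array is md0 and one member is sdb (both names are
placeholders; a partition member would appear as dev-sdb1), and that
each line of the sysfs file holds a start sector and a length in
sectors. The helper name count_bad_sectors is made up for illustration.

```shell
# Sum the entries of one md bad-block list.
# Each input line is expected to be "<start_sector> <length_in_sectors>".
count_bad_sectors() {
    awk '{ ranges++; total += $2 }
         END { printf "%d ranges, %d sectors\n", ranges, total }' "$1"
}

# Example invocations (device names are assumptions, adjust to your array):
# count_bad_sectors /sys/block/md0/md/dev-sdb/bad_blocks
# mdadm --examine-badblocks /dev/sdb
```

Running it once per member shows which drives carry a list and how
large each one is.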

MD will no longer try to read from such sectors. During a read, MD
returns an error to the upper layers immediately if the stripe does not
have enough good components left after excluding the bad blocks. For
example, raid5 can tolerate at most 1 disk with bad blocks in a stripe
(raid6 at most 2), so with 2 bad blocks on 2 different disks in the same
raid5 stripe MD returns a read error immediately, without trying.
That's why in dmesg you are seeing read errors from MD but not from the 
component devices.
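
The rule above can be sketched as a small function. This is only an
illustration of the counting argument, not MD's actual code; the name
stripe_readable and its arguments are hypothetical.

```shell
# Decide whether a stripe can still be read, given:
#   $1 = number of parity devices (1 for raid5, 2 for raid6)
#   $2 = number of member devices with a recorded bad block in that stripe
stripe_readable() {
    parity=$1
    bad_members=$2
    if [ "$bad_members" -le "$parity" ]; then
        echo "readable"      # enough good components remain to reconstruct
    else
        echo "read error"    # MD fails the read without touching the disks
    fi
}

stripe_readable 1 2   # raid5, bad blocks on 2 disks in the stripe
stripe_readable 2 2   # raid6, bad blocks on 2 disks in the stripe
```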

Now the question is how so many bad blocks could have been recorded on
your array. It seems very unlikely that so many disks of your array are
in such bad shape. This might indicate an MD bug in the bad-blocks code.
I suspect some form of erroneous propagation of bad blocks: for example,
a write to an area where an MD bad block exists, instead of clearing the
bad block, could have propagated it to the other disks in the same
stripe. Something like that.

See if you can check that writing to a bad block clears it. It will be
difficult to compute the correct offset to write to, though. You might
want to do some trial and error with dd together with blktrace. If you
can do that, you might also want to check that it behaves correctly when
the write is not aligned to 512 bytes or 4k. Obviously this test is
destructive with respect to your data in that location.

Another, easier test is to read with dd directly from a component
device. If MD has recorded a bad block there (even if that happened a
long time ago), the direct read with dd should also hit it, return an
error and stop, because bad blocks on the surface of a disk do not heal
by themselves with time.
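
A rough sketch of such a read test. The helper read_range is
hypothetical, as are the device name and sector range in the example.
Whether a start sector from the bad-block list maps directly onto the
component device or needs the data offset applied is an assumption to
verify first (mdadm --examine prints the Data Offset).

```shell
# Read <count> 512-byte sectors starting at sector <start> from a device
# and report whether the read succeeded. The read is non-destructive.
read_range() {
    dev=$1
    start=$2
    count=$3
    if dd if="$dev" of=/dev/null bs=512 skip="$start" count="$count" 2>/dev/null; then
        echo "read ok"
    else
        echo "read failed"
    fi
}

# Example (hypothetical device and range taken from a bad-block entry):
# read_range /dev/sdb 1000 8
```

On a real disk, adding iflag=direct to the dd call (GNU dd) bypasses the
page cache, so the read is actually sent to the device.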

Another test is to read with dd from md0, from an area where you see
that only 1 disk has bad blocks (this probably requires some trial and
error with blktrace, because the offsets on md0 are not equal to the
offsets on the component devices). If MD works correctly, such a read
should "heal" the bad block: MD computes the data from parity on the
other disks, then writes it over the bad block. The MD bad block should
then disappear.
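
One way to check whether the entry disappeared is to save the list
before and after the read and compare the two copies. This is a sketch
only: compare_lists is a made-up helper, md0 and sdb are placeholder
names, and MD_SECTOR stands for an md0 offset you have worked out
yourself.

```shell
# Report whether a bad-block list changed between two saved copies.
compare_lists() {
    if cmp -s "$1" "$2"; then
        echo "unchanged"
    else
        echo "changed"
    fi
}

# Example (all names and the offset are assumptions):
# cat /sys/block/md0/md/dev-sdb/bad_blocks > /tmp/bb.before
# dd if=/dev/md0 of=/dev/null bs=512 skip="$MD_SECTOR" count=8 iflag=direct
# cat /sys/block/md0/md/dev-sdb/bad_blocks > /tmp/bb.after
# compare_lists /tmp/bb.before /tmp/bb.after
```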

The last 2 tests I described should not be destructive except in case of 
MD bugs.

EW


On 02/07/2014 16:14, Pedro Teixeira wrote:
> Hi Lars,
>
> the output of those commands:
>
> root@nas3:/# cat /sys/block/sdb/queue/physical_block_size
> 4096
> root@nas3:/# cat /sys/block/md0/queue/physical_block_size
> 4096
> root@nas3:/#
>
> The strange thing here is that dmesg is not polluted with SATA errors,
> as is usual when a hard disk has bad sectors or some other hardware
> problem. The only messages in dmesg that hint at why reading the md
> volume fails are from md itself.
>
> Cheers
> Pedro
>
>
> Quoting Lars Täuber:
>> Hi Pedro,
>>
>> maybe an issue with the logical/physical blocksize?
>> What do these commands report?
>>
>> cat /sys/block/sdb/queue/physical_block_size
>> cat /sys/block/md0/queue/physical_block_size
>>
>> Seagate says these devices have 4096 bytes per sector.
>>
>> Lars
>
>
>
> -- 
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


