linux-raid.vger.kernel.org archive mirror
From: Dark Penguin <darkpenguin@yandex.ru>
To: linux-raid@vger.kernel.org
Subject: Re: md failing mechanism
Date: Sat, 23 Jan 2016 00:44:35 +0300
Message-ID: <56A2A2C3.9000801@yandex.ru>
In-Reply-To: <56A28309.9080806@turmel.org>

Oh! Thank you! I really wanted to see a reliable "what's supposed to 
happen" sequence!

As for my case, those were indeed, um, "cheap desktop drives" - to be 
precise, some 80 GB IDE drives in a Pentium 4 machine; "it works well 
for a small file server", I thought, oblivious to the finer details 
of failure handling... But I also have "big" file 
servers, so that timeout mismatch issue is worth paying attention to!

And also, now I understand why I probably "should have been scrubbing". 
=/ Do I understand correctly that "scrubbing" means those "monthly 
redundancy checks" that mdadm suggests? And I suppose what it does is 
just the same - read every sector and attempt to write it back upon 
failure, otherwise kicking the device?
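
For reference, a minimal sketch of how such a check can be started by hand, assuming the array is called md0 (a hypothetical name) and the script runs as root. The only kernel interface it relies on is the md sync_action file in sysfs; the function name and the sysfs_root parameter are made up for illustration, the latter so the code can be tried against a scratch directory.

```python
from pathlib import Path


def start_md_check(md_name: str, sysfs_root: str = "/sys") -> Path:
    """Request a redundancy check ("scrub") on one md array.

    md_name is an array name such as "md0" (hypothetical here).
    sysfs_root is an illustration-only parameter for testing.
    """
    action = Path(sysfs_root) / "block" / md_name / "md" / "sync_action"
    # Writing "check" asks md to read the whole array and verify redundancy.
    # On a real system this needs root privileges.
    action.write_text("check\n")
    return action
```

Progress of a running check shows up in /proc/mdstat. The monthly job that distributions ship is usually a wrapper around the same sysfs write.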


So, I understand a common problem now: the read timeout on the "desktop" 
drives is too long, which makes sense for the desktops, but not for 
RAIDs, because the "write back attempt" fails and leads to "BOOM" and 
kick. Enterprise-grade drives, however, offer an option to change their 
timeout, which is called "TL;DR technology" (yes, that's how I'm going 
to call it! Because I can't remember the acronym no matter how many times 
I read it, and the meaning kinda fits!). And what about drives that do 
not support it?.. Do they even have some kind of huge timeout or 
something?.. Yesterday I was checking one drive for bad blocks 
(badblocks read-only test), and it took no more than two seconds per 
block to confirm its... badness!
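
The drive-side timeout is usually called SCT ERC (TLER in Western Digital's naming), and smartctl can query and set it with its "-l scterc" option, where the values are in units of 100 ms. Below is a small sketch that only builds the command lines. The helper names and /dev/sdX are placeholders, and 70 (7 seconds) is an assumed value, not something taken from this thread.

```python
def scterc_query_command(device: str) -> list[str]:
    """Command line that reports the drive's current SCT ERC settings."""
    return ["smartctl", "-l", "scterc", device]


def scterc_set_command(device: str, deciseconds: int = 70) -> list[str]:
    """Command line that sets read and write ERC to the same limit.

    deciseconds is in units of 100 ms, so 70 means 7.0 seconds (assumed value).
    """
    if deciseconds < 0:
        raise ValueError("ERC time must not be negative")
    return ["smartctl", "-l", f"scterc,{deciseconds},{deciseconds}", device]


if __name__ == "__main__":
    # Print the commands instead of running them; /dev/sdX is a placeholder.
    print(" ".join(scterc_query_command("/dev/sdX")))
    print(" ".join(scterc_set_command("/dev/sdX")))
```

A drive without this feature should report that in the query output. Many drives also forget the setting on power cycle, so it may need reapplying at boot.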

As I understand, one way around this problem is to change the kernel 
timeout to exceed the drive timeout by changing 
/sys/block/sd?/device/timeout to something larger than the default 30, 
but I'd have to do that after every reboot. Is all that correct?
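
A rough sketch of that workaround, to be run as root from whatever boot-time hook is available. The function name, the sysfs_root parameter and the 180-second value are assumptions for illustration; only the /sys/block/sd?/device/timeout path comes from the text above.

```python
from pathlib import Path


def set_scsi_timeouts(seconds: int = 180, sysfs_root: str = "/sys") -> list[Path]:
    """Write a new command timeout for every sd* block device.

    seconds is an assumed value; pick one longer than the drive's own
    internal error recovery time. sysfs_root exists only for testing.
    """
    updated = []
    pattern = "block/sd*/device/timeout"
    for timeout_file in sorted(Path(sysfs_root).glob(pattern)):
        # The kernel does not keep this across reboots, so it has to be
        # rewritten on every boot.
        timeout_file.write_text(f"{seconds}\n")
        updated.append(timeout_file)
    return updated


if __name__ == "__main__":
    for path in set_scsi_timeouts():
        print(f"{path}: {path.read_text().strip()}")
```

This treats every sd* device the same way. A more careful version would raise the timeout only for drives where the ERC setting could not be applied.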


Still, I don't think it has anything to do with what has happened to my 
"small file server"... It was the opposite; for some reason, it was not 
kicked from the array. But, it happened a while ago, and I've destroyed 
the array afterwards, so I can't get any more data about that incident. 
But, I've got what I wanted: I now know what is supposed to happen 
when a drive in a RAID fails, and it's not what happened that time. And 
I know I should set up proper TL;DR timeouts and scrubbing...


-- 
darkpenguin
