Re: [ LR] Kernel 4.8.4: INFO: task kworker/u16:8:289 blocked for more than 120 seconds.

linux-raid.vger.kernel.org archive mirror

From: Andreas Klauer <Andreas.Klauer@metamorpher.de>
To: TomK <tk@mdevsys.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: [ LR] Kernel 4.8.4: INFO: task kworker/u16:8:289 blocked for more than 120 seconds.
Date: Sun, 30 Oct 2016 21:13:02 +0100
Message-ID: <20161030201302.GB6727@metamorpher.de>
In-Reply-To: <73e35e17-80aa-c7e6-535c-3665d9789e16@mdevsys.com>

On Sun, Oct 30, 2016 at 02:56:58PM -0400, TomK wrote:
> So the question is how come the mdadm RAID did not catch this disk as a 
> failed disk and pull it out of the array?

RAID doesn't know about SMART. It's that simple.

If SMART already knows about errors - too bad, RAID doesn't care.
It doesn't know about anything else that happens outside its own I/O
either. If you ddrescue the member disk directly and it finds tons of
errors, RAID isn't involved.

RAID will only kick a drive when it stumbles, by itself, over an error
that does not go away when rewriting the data, or when the drive just
doesn't respond anymore for an extended period of time. And that timeout
is per request, so a bad disk can grind the entire system to a halt
without ever being kicked.
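
To put a number on the per-request point, here is a rough sketch of the
arithmetic. It assumes the requests to the bad disk are served one after
another and that each one returns just before the timeout, so none of
them ever counts as a failure. The 30 second default is an assumption
(a common value of /sys/block/<dev>/device/timeout); the function name
is made up for illustration.

```python
def worst_case_stall_seconds(slow_requests: int,
                             per_request_timeout_s: float = 30.0) -> float:
    """Upper bound on the time spent waiting on a disk that answers every
    request just before the timeout, so that no request ever fails.

    Assumes the slow requests are served strictly one after another.
    """
    if slow_requests < 0 or per_request_timeout_s < 0:
        raise ValueError("counts and timeouts must not be negative")
    return slow_requests * per_request_timeout_s


if __name__ == "__main__":
    stall = worst_case_stall_seconds(1000)
    print(f"{stall:.0f} s, about {stall / 3600:.1f} hours")
```

Under those assumptions, a thousand slow requests already add up to more
than eight hours of waiting, and the drive is still a member.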

ddrescue has this nice --min-read-rate option: any zone that yields data
slower than that rate is considered a hopeless case. RAID does not have
such magic. If your drive always responds and always claims to have
written successfully even when it hasn't, then RAID will never kick it.

If you never run array checks or SMART self-tests, errors won't show.
RAID will show the drives as healthy, SMART will show them as healthy,
and that doesn't mean diddly-squat until you actually test them. Regularly.
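
For the array check part, a minimal sketch of what "actually test it"
can look like, using the md sysfs files sync_action and mismatch_cnt.
The array name md0 and the helper names are placeholders, and the
sysfs_root parameter exists only so the functions can be tried against
a scratch directory. Writing to the real files needs root.

```python
from pathlib import Path


def md_dir(md_name: str, sysfs_root: str = "/sys/block") -> Path:
    """Directory holding the md attributes of one array, e.g. md0."""
    return Path(sysfs_root) / md_name / "md"


def start_check(md_name: str, sysfs_root: str = "/sys/block") -> None:
    """Ask md to read the whole array and compare data against redundancy."""
    (md_dir(md_name, sysfs_root) / "sync_action").write_text("check\n")


def check_status(md_name: str, sysfs_root: str = "/sys/block") -> dict:
    """Return the current sync action and the mismatch counter."""
    md = md_dir(md_name, sysfs_root)
    return {
        "sync_action": (md / "sync_action").read_text().strip(),
        "mismatch_cnt": int((md / "mismatch_cnt").read_text().strip()),
    }


if __name__ == "__main__":
    start_check("md0")          # placeholder array name
    print(check_status("md0"))  # poll again later until sync_action is idle
```

Run something like this from cron or a timer, and pair it with a long
SMART self-test (smartctl -t long) on each member disk.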

Kicking drives yourself is quite normal. RAID only does so much.
This is why we have mdadm --replace: that way even a semi-broken disk
can help with the rebuild effort, and bad sectors on other disks won't
result in an even bigger problem, or at least, not right away.
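
A sketch of the --replace sequence, written as a small helper that only
builds the command lines so you can look at them before running
anything. The device names are placeholders and the helper itself is
made up for illustration; it assumes the new disk is first added as a
spare and then named with --with.

```python
def replace_commands(array: str, failing: str, spare: str) -> list:
    """Build the two mdadm invocations for an in-place member replacement.

    The failing member stays in the array until the copy to the spare
    has finished, so it can still contribute readable sectors.
    """
    return [
        # 1. Make the new disk available to the array as a spare.
        ["mdadm", array, "--add", spare],
        # 2. Mark the failing member for replacement, preferring this spare.
        ["mdadm", array, "--replace", failing, "--with", spare],
    ]


if __name__ == "__main__":
    # Placeholder names; substitute your own array and partitions.
    for argv in replace_commands("/dev/md0", "/dev/sdb1", "/dev/sdc1"):
        print(" ".join(argv))
```

Pass each list to subprocess.run(argv, check=True) as root once you are
sure the device names are right.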

If you leave RAID to its own devices, it has a much higher chance of dying 
than if you run tests, and actually decide to do something once *you're* 
aware that there are problems that RAID itself isn't aware of.

> On a separate topic, if I eventually expand the array to 6 2TB disks, 
> will the array be smart enough to allow me to expand it to the new size? 

Yes. Perhaps after an additional --grow --size=max.
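
Spelled out, that would be roughly the following. The array name is a
placeholder and the helper is hypothetical; it only builds the command.
Any filesystem on top of the array has to be resized separately.

```python
def grow_to_max_command(array: str) -> list:
    """Build the mdadm call that lets the array use all the space the
    member devices now offer."""
    return ["mdadm", "--grow", array, "--size=max"]


if __name__ == "__main__":
    print(" ".join(grow_to_max_command("/dev/md0")))  # placeholder array name
```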

Regards
Andreas Klauer

