Linux RAID subsystem development
From: Nix <nix@esperi.org.uk>
To: Wols Lists <antlists@youngman.org.uk>
Cc: Jeremy Graham <jeremy@doghouse.agency>, linux-raid@vger.kernel.org
Subject: Re: mdadm stuck at 0% reshape after grow
Date: Tue, 05 Dec 2017 15:49:30 +0000
Message-ID: <87fu8pdvjp.fsf@esperi.org.uk>
In-Reply-To: <5A267B54.7070408@youngman.org.uk> (Wols Lists's message of "Tue, 5 Dec 2017 10:56:20 +0000")

On 5 Dec 2017, Wols Lists told this:

> On 05/12/17 09:41, Jeremy Graham wrote:
>> $ mdadm --version
>> mdadm - v3.4 - 28th January 2016
>
> Won't do any harm to try the latest version, but this could well be the
> problem.
>
> https://raid.wiki.kernel.org/index.php/Linux_Raid
>
> That'll tell you where to download the latest mdadm from. This sounds a
> typical problem that people have had, and iirc upgrading mdadm often
> fixes it.

This suggests otherwise:

[69979.933007] md0: detected capacity change from 0 to 12002359508992
[69979.933130] md: reshape of RAID array md0
[69979.933132] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
[69979.933134] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape.
[69979.933139] md: using 128k window, over a total of 2930263552k.
[70197.635112] INFO: task md0_reshape:30529 blocked for more than 120 seconds.
[70197.635142]       Not tainted 4.4.0-101-generic #124-Ubuntu
[70197.635161] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[70197.635187] md0_reshape     D ffff88011da37aa8     0 30529      2 0x00000000
[70197.635191]  ffff88011da37aa8 ffff88011da37a78 ffff880214a40e00 ffff880210577000
[70197.635193]  ffff88011da38000 ffff8800d49de424 ffff8800d49de658 ffff8800d49de638
[70197.635194]  ffff8800d49de670 ffff88011da37ac0 ffffffff818406d5 ffff8800d49de400
[70197.635196] Call Trace:
[70197.635202]  [<ffffffff818406d5>] schedule+0x35/0x80
[70197.635206]  [<ffffffffc034045f>] raid5_get_active_stripe+0x31f/0x700 [raid456]
[70197.635210]  [<ffffffff810c4420>] ? wake_atomic_t_function+0x60/0x60
[70197.635212]  [<ffffffffc0344da4>] reshape_request+0x584/0x950 [raid456]
[70197.635215]  [<ffffffff810a9c6a>] ? finish_task_switch+0x7a/0x220
[70197.635218]  [<ffffffffc034548c>] sync_request+0x31c/0x3a0 [raid456]
[70197.635219]  [<ffffffff81840026>] ? __schedule+0x3b6/0xa30
[70197.635222]  [<ffffffff814102b5>] ? find_next_bit+0x15/0x20
[70197.635225]  [<ffffffff81710bb1>] ? is_mddev_idle+0x9c/0xfa
[70197.635227]  [<ffffffff816adbbc>] md_do_sync+0x89c/0xe60
[70197.635229]  [<ffffffff810c4420>] ? wake_atomic_t_function+0x60/0x60
[70197.635231]  [<ffffffff816aa319>] md_thread+0x139/0x150
[70197.635233]  [<ffffffff810c4420>] ? wake_atomic_t_function+0x60/0x60
[70197.635234]  [<ffffffff816aa1e0>] ? find_pers+0x70/0x70
[70197.635236]  [<ffffffff810a0c75>] kthread+0xe5/0x100
[70197.635237]  [<ffffffff810a0b90>] ? kthread_create_on_node+0x1e0/0x1e0
[70197.635239]  [<ffffffff81844b8f>] ret_from_fork+0x3f/0x70
[70197.635241]  [<ffffffff810a0b90>] ? kthread_create_on_node+0x1e0/0x1e0
[70317.630767] INFO: task md0_reshape:30529 blocked for more than 120 seconds.
[70317.630796]       Not tainted 4.4.0-101-generic #124-Ubuntu
[70317.630815] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

That's a kernel bug, probably a deadlock. *Definitely* try a newer
kernel, 4.14.3 (the latest) if possible. I bet this is fixed by

6ab2a4b806ae21b6c3e47c5ff1285ec06d505325
RAID5: revert e9e4c377e2f563 to fix a livelock

which fixes a bug that looks exactly like this: the faulty patch was
present from v4.2 to v4.6. You're in the middle of that range... it might
also be worth checking whether the distro kernel you're running has
already applied that revert.
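
For what it's worth, the "is this kernel in the suspect range" check can be
sketched in a few lines of Python. This is only a rough sketch: the range
bounds are simply the v4.2 to v4.6 figures quoted above, treated as
inclusive (an assumption), and the function names are made up for
illustration. It looks at the base version only, so it cannot tell whether a
distro kernel has backported the revert.

```python
import re

# Assumption: bounds taken from the message above (v4.2 to v4.6),
# treated as inclusive. Distro backports are NOT accounted for.
AFFECTED_FIRST = (4, 2)
AFFECTED_LAST = (4, 6)


def kernel_major_minor(release):
    """Extract (major, minor) from a string such as '4.4.0-101-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def in_affected_range(release):
    """True if the base version lies within the assumed affected range."""
    return AFFECTED_FIRST <= kernel_major_minor(release) <= AFFECTED_LAST


# Release string copied from the "Not tainted" line in the log above.
print(in_affected_range("4.4.0-101-generic"))
```

Feed it the output of `uname -r` on the affected machine. A True result only
means the base version is in the range; the distro changelog still has to be
checked for the revert itself.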
