From: NeilBrown <neilb@suse.de>
To: Justin Piszcz <jpiszcz@lucidpixels.com>
Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org
Subject: Re: 3.10.1: echo repair > sync_action causes hang on RAID-1 (2 x SSD)
Date: Mon, 22 Jul 2013 09:02:57 +1000
Message-ID: <20130722090257.2faa0874@notabene.brown>
In-Reply-To: <000501ce85fc$d3a60a10$7af21e30$@lucidpixels.com>
On Sun, 21 Jul 2013 06:26:55 -0400 "Justin Piszcz" <jpiszcz@lucidpixels.com>
wrote:
> Hi,
>
> When I run repair on an MD-RAID1 sync_action, the speed slows down and it
> stays like this (below) for hours.
>
> The system is then completely unresponsive to user input. I have replaced a
> failing SSD; however, after a check, mismatch_cnt seems to increase over
> time. When I run repair, the system freezes to user-input. Has anyone else
> run into this issue with a RAID-1 volume (2 x SSD) using 0.90 metadata?
> Long ago I used to use this same configuration with two physical disks and
> there was never a problem.
>
> Even though I left a root shell open, writing idle to sync_action does not
> interrupt the resync:
> # echo idle > /sys/devices/virtual/block/md1/md/sync_action
>
> Every 1.0s: cat /proc/mdstat Sun Jul 21 06:15:38 2013
>
> Personalities : [raid1]
> md1 : active raid1 sdc2[0] sdb2[1]
> 233381376 blocks [2/2] [UU]
> [>....................] resync = 0.0% (151616/233381376)
> finish=36171.5min speed=107K/sec
>
> md0 : active raid1 sdc1[0] sdb1[1]
> 1048512 blocks [2/2] [UU]
>
> unused devices: <none>
>
> 10 minutes later:
>
> 233381376 blocks [2/2] [UU]
> [>....................] resync = 0.0% (151616/233381376)
> finish=52219.3min speed=74K/sec
>
> The position where it hangs (151616 here) has been different each time I
> watched it; it does not appear to hang at the same block each time.
>
Hi Justin,
This is a known bug. A fix has been accepted into mainline for 3.11-rc2.
Hopefully it will also get into 3.10.3 (it is too late for 3.10.2).
NeilBrown
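
The stall described in the report (the block counter stuck at 151616 while the reported speed decays) can be detected by comparing two /proc/mdstat snapshots. What follows is a minimal Python sketch, assuming the progress line has the layout shown in the quoted output. The names parse_progress and is_stalled are hypothetical helpers for illustration, not part of mdadm or the kernel.

```python
import re

# Matches the progress line as it appears in the quoted /proc/mdstat output,
# e.g. "resync = 0.0% (151616/233381376) finish=36171.5min speed=107K/sec".
# Whitespace (including a line wrap) between the fields is tolerated.
PROGRESS_RE = re.compile(
    r"(?P<action>resync|recovery|check|repair)\s*=\s*(?P<pct>[\d.]+)%\s*"
    r"\((?P<done>\d+)/(?P<total>\d+)\)\s*"
    r"finish=(?P<finish>[\d.]+)min\s+speed=(?P<speed>\d+)K/sec"
)


def parse_progress(mdstat_text):
    """Return the first progress record found in mdstat_text, or None."""
    m = PROGRESS_RE.search(mdstat_text)
    if m is None:
        return None
    return {
        "action": m.group("action"),
        "done": int(m.group("done")),          # blocks processed so far
        "total": int(m.group("total")),        # blocks in the array
        "finish_min": float(m.group("finish")),
        "speed_kps": int(m.group("speed")),    # K/sec as reported by md
    }


def is_stalled(earlier, later):
    """True if both snapshots show a sync and the block counter did not advance."""
    if earlier is None or later is None:
        return False
    return later["done"] <= earlier["done"]


if __name__ == "__main__":
    # The two snapshots quoted in the report, taken 10 minutes apart.
    first = parse_progress(
        "[>....................] resync = 0.0% (151616/233381376)\n"
        "finish=36171.5min speed=107K/sec"
    )
    second = parse_progress(
        "[>....................] resync = 0.0% (151616/233381376)\n"
        "finish=52219.3min speed=74K/sec"
    )
    print("stalled:", is_stalled(first, second))
```

On a live system the two inputs would come from reading /proc/mdstat twice with a delay in between. Only the block counter is compared, because the reported speed is an average that keeps decaying even when nothing moves.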
Thread overview:
2013-07-21 10:26 3.10.1: echo repair > sync_action causes hang on RAID-1 (2 x SSD) Justin Piszcz
2013-07-21 23:02 ` NeilBrown [this message]
2013-07-25 23:10 ` Justin Piszcz
2013-07-26 0:35 ` NeilBrown
2013-07-26 9:56 ` Justin Piszcz
2013-07-29 5:56 ` NeilBrown
2013-07-29 7:33 ` Justin Piszcz