From: "Justin Piszcz" <jpiszcz@lucidpixels.com>
To: 'joystick' <joystick@shiftmail.org>
Cc: 'Bernd Schubert' <bernd.schubert@fastmail.fm>,
'linux-raid' <linux-raid@vger.kernel.org>
Subject: RE: 3.12: raid-1 mismatch_cnt question
Date: Thu, 14 Nov 2013 12:22:26 -0500
Message-ID: <007301cee15e$178969e0$469c3da0$@lucidpixels.com>
In-Reply-To: <5284F5B2.3040307@shiftmail.org>
-----Original Message-----
From: joystick [mailto:joystick@shiftmail.org]
Sent: Thursday, November 14, 2013 11:09 AM
To: Justin Piszcz
Cc: 'Bernd Schubert'; 'linux-raid'
Subject: Re: 3.12: raid-1 mismatch_cnt question
[ .. ]
>> At the end of the procedure (like now, if you didn't resync or repair in
>> the meanwhile) is mismatch_cnt still so high?
After a reboot, I ran the check and yes it was still high.
[ .. ]
>> no, not that one...
>> it would be helpful to know the kernel version that *creates*
>> mismatches, the one that you have running normally on the live system.
Version: 3.12.0 (and typically always use the latest)
That's the "bugged" one, supposing this is really a bug (until we find
where the mismatches are, it's difficult to say whether this is data
loss or not)
>> Maybe the mismatches are located in ext4 metadata areas, which are not files
>> and so can't be seen with md5sums... That would be just as
>> worrisome, unless some ext4 expert can tell us that it's OK (it can be
>> OK if the region with mismatches is an old metadata area, currently
>> unused; the mechanism that can create harmless mismatches in this case
>> has been described by Neil)
If that is what is occurring, is it possible to exclude them from mismatch_cnt?
[ .. ]
- First confirm that mismatch_cnt is still high..
It was 0 after reboot.
[ .. ]
- Then, if this does not disrupt your system operation too much, I would
suggest filling 95% of the free space with a file of zeroes like you did in
earlier tests. Otherwise, for a mismatch happening in a non-file area, we
won't be sure what kind of area that is. Maybe recompute mismatch_cnt
after this.
Create file up to 95% utilization on /root:
/dev/root 219G 205G 12G 95% /
Re-check:
# echo check > /sys/devices/virtual/block/md1/md/sync_action
# cat /sys/devices/virtual/block/md1/md/mismatch_cnt
27520
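The two commands above can be wrapped so that mismatch_cnt is only read once the check has really finished. This is a minimal sketch, assuming the check runs over the whole array (sync_max left at its default) so that sync_action reads "idle" again at the end. The function name md_check_and_count and the poll interval are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: start an md "check" and print mismatch_cnt once it has finished.
# md_check_and_count is a hypothetical helper, not part of mdadm.
# Usage: md_check_and_count /sys/block/md1/md [poll_seconds]
md_check_and_count() {
    local md_dir="$1" poll="${2:-10}"

    # Request a check of the whole array (needs root on a real system).
    echo check > "$md_dir/sync_action" || return 1

    # sync_action reads back "idle" once the check has completed.
    while [ "$(cat "$md_dir/sync_action")" != "idle" ]; do
        sleep "$poll"
    done

    # Value left behind by the check that just finished.
    cat "$md_dir/mismatch_cnt"
}
```

As far as the kernel's md documentation describes it, the count is in sectors and can overstate the damaged area, because most RAID levels compare page-sized units rather than single sectors.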
then, copypasting the procedure with some modifications:
----
... to determine the location of mismatches (...)
Unfortunately I don't think MD tells you the location of mismatches
directly. Do you want to try the following:
/sys/block/md1/md/sync_min and /sys/block/md1/md/sync_max should allow
you to narrow the region of the next check.
Set them, then perform check, then cat mismatch_cnt.
Narrow progressively sync_min and sync_max so that you identify the most
dense areas of mismatches, or a few single blocks that mismatch.
When you have identified some regions or isolated blocks, invoke "sync"
from bash and then check the same region again a couple of times, so as to
be sure that it stays mismatched and is not just a transient situation.
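The narrowing step described above could be scripted roughly as follows. This is only a sketch under several assumptions taken from the kernel's md documentation rather than from this thread: sync_min and sync_max are given in 512-byte sectors, a check pauses when it reaches sync_max instead of completing, and sync_completed reads as "<done> / <total>". The helper name md_check_range is hypothetical, and the ordering of the writes assumes the kernel rejects a sync_min that is above the current sync_max.

```shell
#!/usr/bin/env bash
# Sketch: run an md "check" over sectors [min, max) and print mismatch_cnt.
# md_check_range is a hypothetical helper; min/max are 512-byte sectors.
# Usage: md_check_range /sys/block/md1/md 1000 9000 [poll_seconds]
md_check_range() {
    local md_dir="$1" min="$2" max="$3" poll="${4:-5}" pos

    echo idle > "$md_dir/sync_action"      # stop any check in progress
    # Widen the window first so that min <= max holds after every write.
    echo 0      > "$md_dir/sync_min"
    echo "$max" > "$md_dir/sync_max"
    echo "$min" > "$md_dir/sync_min"
    echo check  > "$md_dir/sync_action"

    # sync_completed reads "<done> / <total>" in sectors, or "none".
    while :; do
        read -r pos _ < "$md_dir/sync_completed"
        case "$pos" in
            ''|*[!0-9]*) ;;                          # not numeric yet
            *) [ "$pos" -ge "$max" ] && break ;;     # paused at sync_max
        esac
        # Stop waiting if the check ended for some other reason.
        [ "$(cat "$md_dir/sync_action")" = "idle" ] && break
        sleep "$poll"
    done

    cat "$md_dir/mismatch_cnt"             # read while still paused

    echo idle > "$md_dir/sync_action"      # release the paused check
    echo 0    > "$md_dir/sync_min"         # restore the defaults
    echo max  > "$md_dir/sync_max"
}
```

Calling it on the two halves of a suspicious region and recursing into the half with the larger count gives the progressive narrowing. Writing "idle" at the end matters: without it the check stays parked at sync_max.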
Then try with debugfs (in read-only mode it can be used with the fs mounted):
there should be an option to get the inode number from a block number of
the device... I hope that block numbers are not offset by MD... I think
it's icheck, and after that you might need "find -inum <inode_number>"
launched on the same filesystem to find the corresponding filename from
the inode number. That should be the file that contains the mismatch.
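The debugfs step above might look like the sketch below. It assumes the filesystem sits directly on the array device, so that an array sector maps to a filesystem block with no extra offset (the same hope expressed above, not something verified here), and that icheck prints a header line followed by "<block> <inode>". Both helper names are hypothetical. debugfs opens the filesystem read-only unless -w is given, and its ncheck command is used here in place of "find -inum".

```shell
#!/usr/bin/env bash
# Sketch: map an md sector to an ext4 block, then to inode and path names.
# Both helpers are hypothetical; DEV is the array device, e.g. /dev/md1.

# Convert a 512-byte sector number into a filesystem block number.
sector_to_fsblock() {
    local sector="$1" fs_block_size="$2"
    echo $(( sector * 512 / fs_block_size ))
}

# Print the inode owning a block, then the path names of that inode.
block_to_paths() {
    local dev="$1" block="$2" inode
    # icheck prints a header line, then "<block> <inode>" per block.
    inode=$(debugfs -R "icheck $block" "$dev" 2>/dev/null |
            awk 'NR > 1 && $2 ~ /^[0-9]+$/ { print $2 }')
    if [ -z "$inode" ]; then
        echo "block $block: no owning inode found" >&2
        return 1
    fi
    echo "inode $inode"
    # ncheck lists the path names that link to the inode.
    debugfs -R "ncheck $inode" "$dev" 2>/dev/null
}
```

The filesystem block size can be read from the "Block size" line of "dumpe2fs -h /dev/md1". A block with no owning inode would point at free space or metadata rather than file contents.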
[ .. ]
When I do this, the speed of check thereafter is very slow:
Personalities : [raid1]
md1 : active raid1 sdc2[0] sdb2[1]
233381376 blocks [2/2] [UU]
[>....................] check = 0.0% (4500/233381376) finish=80387.9min speed=48K/sec (55 days)
With sync_min set to 1000 and sync_max set to 9000, the speed continues to decrease (this won't work).
A few minutes later:
Personalities : [raid1]
md1 : active raid1 sdc2[0] sdb2[1]
233381376 blocks [2/2] [UU]
[>....................] check = 0.0% (4500/233381376) finish=200485.5min speed=19K/sec
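One possible reading of these numbers, not confirmed anywhere in this thread: /proc/mdstat counts 1 KiB blocks, while the kernel's md documentation gives sync_min and sync_max in 512-byte sectors, so 4500 blocks is sector 9000, exactly the sync_max that was set. The same documentation says a check pauses at sync_max rather than completing, which would be consistent with a position stuck at 4500 and a reported speed that keeps falling. The sketch below only illustrates that arithmetic and a way to test for the paused state; both helper names are made up.

```shell
#!/usr/bin/env bash
# Sketch: relate /proc/mdstat progress to sync_max. Helper names are made up.

# /proc/mdstat counts 1 KiB blocks; sync_min/sync_max use 512-byte sectors.
mdstat_blocks_to_sectors() {
    echo $(( $1 * 2 ))
}

# Succeed if a check is active but has stopped at the sync_max boundary.
# Usage: md_paused_at_max /sys/block/md1/md
md_paused_at_max() {
    local md_dir="$1" pos max
    read -r pos _ < "$md_dir/sync_completed"
    read -r max   < "$md_dir/sync_max"
    [ "$(cat "$md_dir/sync_action")" = "check" ] || return 1
    # "none" (no position) or "max" (no limit set) cannot be a pause.
    case "$pos" in ''|*[!0-9]*) return 1 ;; esac
    case "$max" in ''|*[!0-9]*) return 1 ;; esac
    [ "$pos" -ge "$max" ]
}
```

If md_paused_at_max succeeds, writing "idle" to sync_action (or raising sync_max) should let the array move on, and mismatch_cnt can be read for the range that was covered.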
It would be interesting to know whether anyone else on this list running ext4 on SSDs sees similar results (mismatch_cnt), compared with another FS (XFS/etc).
Justin.