linux-raid.vger.kernel.org archive mirror
From: joystick <joystick@shiftmail.org>
To: Justin Piszcz <jpiszcz@lucidpixels.com>
Cc: 'Bernd Schubert' <bernd.schubert@fastmail.fm>,
	'linux-raid' <linux-raid@vger.kernel.org>
Subject: Re: 3.12: raid-1 mismatch_cnt question
Date: Thu, 14 Nov 2013 17:09:22 +0100
Message-ID: <5284F5B2.3040307@shiftmail.org>
In-Reply-To: <002201cee126$5c390290$14ab07b0$@lucidpixels.com>

On 14/11/2013 11:43, Justin Piszcz wrote:
> $ cat /sys/devices/virtual/block/md1/md/mismatch_cnt
> 303232
>
> Ready to test again, mismatch_cnt very high..
>
> [ .. ]
>
> Please see the following per your new instructions:
> http://home.comcast.net/~jpiszcz/20131114/joystick_cmds2.txt
>
> Summary: No diffs found.

mmh that's strange...

At the end of the procedure (i.e. now, if you didn't resync or repair in 
the meantime), is mismatch_cnt still that high?
I'm wondering whether a resync happened anyway somehow, even though the 
procedure seems correct to me this time.


>>> What kernel version is yours?
> Was using system rescue cd 3.7.0, appears to be 3.4.47.

no, not that one...
it would be helpful to know the kernel version that *creates* the 
mismatches, the one you normally run on the live system.
That's the "bugged" one, supposing this is really a bug (until we find 
where the mismatches are, it's difficult to say whether this is data 
loss or not)


> Will re-try with the latest system rescue cd, 3.8.1, appears to be 3.4.66.
no, that's not needed...


> On a side note, I have not seen any corruption on any of my files; debsums also confirms no issues with any of the system files, so I am wondering if mismatch_cnt is accurate based on the diff above and not seeing any corruption?

yep, the problem now is in fact to understand WHERE these mismatches are 
hiding...

Ubuntu files are mostly executables and config files which do not get 
changed often. Mismatches are less likely there than in files which 
do change.

Maybe the mismatches are located in ext4 metadata areas, which are not 
files and so can't be seen with md5sums... That would still be just as 
worrisome, unless some ext4 expert can tell us that it's OK (it can be 
OK if the region with mismatches is an old metadata area, currently 
unused; the mechanism that can create harmless mismatches in this case 
has been described by Neil)

It seems you will need to perform the other test I described previously. 
A bit more complex, but it should find something. This can be done live, 
or at least the beginning of it:

- First confirm that mismatch_cnt is still high.

- Then, if this does not disrupt your system operation too much, I would 
suggest filling 95% of the free space with a file of zeroes like you did 
in earlier tests. Otherwise, for a mismatch in a non-file area, we 
won't be sure what kind of area it is. Maybe recompute mismatch_cnt 
after this.
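
A rough sketch of the fill step, assuming GNU coreutils (df, dd, awk) and 
root privileges. The function names and the file name "zerofill.bin" are 
placeholders made up for this example, not something used earlier in the 
thread.

```shell
# pct95_mib AVAIL_KIB: 95% of the given free space, converted from KiB to MiB.
pct95_mib() {
    echo $(( $1 * 95 / 100 / 1024 ))
}

# fill_with_zeros MOUNTPOINT: write a zero-filled file covering about 95% of
# the space df reports as available. "zerofill.bin" is a placeholder name.
fill_with_zeros() {
    local mnt="$1" avail_kib
    # df -Pk: POSIX output format, 1 KiB units; column 4 of line 2 is "Available"
    avail_kib=$(df -Pk "$mnt" | awk 'NR == 2 { print $4 }')
    dd if=/dev/zero of="$mnt/zerofill.bin" bs=1M \
       count="$(pct95_mib "$avail_kib")" conv=fsync
}
```

For example "fill_with_zeros /" on the filesystem that sits on md1; the 
file can be deleted once the checks are done.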

Then, copy-pasting the procedure with some modifications:
----
... to determine the location of mismatches (...)
Unfortunately I don't think MD tells you the location of mismatches 
directly. Do you want to try the following:
/sys/block/md1/md/sync_min and /sys/block/md1/md/sync_max (both given in 
sectors) should allow you to narrow the region of the next check.
Set them, then perform a check (write "check" to sync_action), then cat 
mismatch_cnt.
Progressively narrow sync_min and sync_max so that you identify the 
densest areas of mismatches, or a few single blocks that mismatch.
When you have identified some regions or isolated blocks, invoke "sync" 
from bash and then check the same region again a couple of times, to 
be sure that it stays mismatched and it's not just a transient situation.
Then try with debugfs (in read-only mode it can be used with the fs 
mounted): there should be an option to get the inode number from a block 
number of the device... I hope that block numbers are not offset by MD... 
I think it's icheck (which takes filesystem block numbers, not sectors), 
and after that you might need "find -inum <inode_number>" 
launched on the same filesystem to find the corresponding filename from 
the inode number. That should be the file that contains the mismatch.
----
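
The narrowing steps above could be scripted roughly as follows. This is 
only a sketch: the function names are invented here, and it assumes that 
a check restricted by sync_max pauses when sync_completed reaches that 
value (rather than returning to "idle"), and that mismatch_cnt is only 
reset when the next check starts. Please verify both on your kernel 
before trusting the numbers.

```shell
# Sketch only; run as root on a real array.
# check_region MD_DIR START END
#   MD_DIR: md sysfs directory, e.g. /sys/block/md1/md
#   START, END: window to check, in sectors (numeric)
# Prints the mismatch_cnt left by a "check" restricted to that window.
check_region() {
    local md="$1" start="$2" end="$3"
    local done_sectors rest count
    echo idle     > "$md/sync_action"   # stop a running or paused check first
    echo max      > "$md/sync_max"      # widen first so sync_min <= sync_max holds
    echo "$start" > "$md/sync_min"
    echo "$end"   > "$md/sync_max"
    echo check    > "$md/sync_action"
    # Assumption: the check pauses at sync_max instead of going back to
    # "idle", so watch sync_completed as well as sync_action.
    while [ "$(cat "$md/sync_action")" = "check" ]; do
        read -r done_sectors rest < "$md/sync_completed"
        case "$done_sectors" in
            ''|*[!0-9]*) ;;   # "none": not started yet
            *) if [ "$done_sectors" -ge "$end" ]; then break; fi ;;
        esac
        sleep 1
    done
    count=$(cat "$md/mismatch_cnt")     # read before releasing the paused check
    echo idle > "$md/sync_action"
    echo "$count"
}

# scan_windows MD_DIR START END STEP
# Checks consecutive STEP-sector windows and prints "start end mismatch_cnt"
# for each one, to spot the densest areas.
scan_windows() {
    local md="$1" lo="$2" end="$3" step="$4" hi
    [ "$step" -gt 0 ] || return 1
    while [ "$lo" -lt "$end" ]; do
        hi=$(( lo + step ))
        if [ "$hi" -gt "$end" ]; then hi=$end; fi
        echo "$lo $hi $(check_region "$md" "$lo" "$hi")"
        lo=$hi
    done
}
```

A first pass could be "scan_windows /sys/block/md1/md 0 TOTAL STEP", with 
TOTAL taken from /sys/block/md1/size (in 512-byte sectors) and STEP at, 
say, 1/16 of that; then rerun on the windows with the highest counts 
using a smaller STEP, invoking "sync" between repeated runs.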

Try to report here what you find.
If the mismatching regions do not correspond to files (that would agree 
with your previous test), an ext4 expert might be able to tell 
what they correspond to.
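
For the debugfs part, a possible sketch follows. It assumes the 
filesystem sits directly on the md device with no extra offset (the 
thing I was hoping for above), and it uses debugfs "ncheck" in place of 
"find -inum" to go from inode to path. The function names are made up 
for this example.

```shell
# sector_to_fsblock SECTOR FS_BLOCK_SIZE
# Converts a 512-byte sector offset into a filesystem block number.
# Assumption: the filesystem starts at sector 0 of the md device.
sector_to_fsblock() {
    echo $(( $1 * 512 / $2 ))
}

# lookup_block DEVICE FS_BLOCK
# Asks debugfs (which opens the fs read-only unless -w is given) which inode
# owns the block, then which path name refers to that inode.
lookup_block() {
    local dev="$1" blk="$2" inode
    # icheck prints a header line, then "<block> <inode>" or a "not found" note
    inode=$(debugfs -R "icheck $blk" "$dev" 2>/dev/null |
            awk 'NR > 1 && $2 ~ /^[0-9]+$/ { print $2 }')
    if [ -n "$inode" ]; then
        debugfs -R "ncheck $inode" "$dev" 2>/dev/null
    else
        echo "block $blk: no owning inode found (free space or metadata?)"
    fi
}
```

The block size can be read from "tune2fs -l /dev/md1" (the "Block size" 
line); then something like 
lookup_block /dev/md1 "$(sector_to_fsblock SECTOR 4096)" for each 
mismatching sector.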

Regards
J.




Thread overview: 22+ messages
2013-11-04 10:25 3.12: raid-1 mismatch_cnt question Justin Piszcz
2013-11-07 10:54 ` Justin Piszcz
2013-11-12  0:39   ` Brad Campbell
2013-11-12  9:14     ` Justin Piszcz
     [not found] ` <527E8B74.70301@shiftmail.org>
2013-11-09 22:49   ` Justin Piszcz
2013-11-10 12:45     ` joystick
2013-11-11  9:26       ` Justin Piszcz
2013-11-11 11:06         ` joystick
2013-11-11 18:52           ` Justin Piszcz
2013-11-11 21:23             ` John Stoffel
2013-11-11 21:55               ` NeilBrown
2013-11-12  2:49                 ` John Stoffel
2013-11-11 21:58             ` NeilBrown
2013-11-11 22:18               ` Justin Piszcz
2013-11-12  9:30             ` joystick
2013-11-12 10:29               ` Bernd Schubert
2013-11-13 22:10                 ` Justin Piszcz
2013-11-14  8:44                   ` joystick
2013-11-14 10:43                     ` Justin Piszcz
2013-11-14 16:09                       ` joystick [this message]
2013-11-14 17:22                         ` Justin Piszcz
2013-11-15  8:51                           ` joystick
