linux-raid.vger.kernel.org archive mirror
From: David Niklas <simd@vfemail.net>
To: "linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>
Subject: And then they sent my defective CPU back to me...
Date: Mon, 29 Jan 2024 00:21:24 -0500
Message-ID: <20240129002124.36e1a1b5@firefly>

Hello,

### Background (for the curious.)

Much like the rest of you, I have a RAID array on my PC.

I've just come out of the tail end of a PC repair nightmare where you
send your parts in for warranty and they're still broken when you get
them back.

During this trying process of testing, my array "decided" to resync itself.
I tried to idle it, but that failed. Granted, I could have copied the
data I wanted off of it first, but I didn't think of that at the time, and
it was the array which seemed to really trigger the bug.

The problem I was seeing was silent data corruption. I've done some
tests, and the system seems stable now that I've replaced my CPU, but I
can't prove it as this bug has a history of being hard to detect.

And how did I find it? I was running a check of my array before backing
up my data to cold storage... My PC kept crashing. I then replaced,
unplugged, or warrantied out each and every part. But the bug was still
there. Thus I purchased a new, lower-performance CPU and an
identical MB, which were the only parts I didn't have spares for.

And then ata12 decided to act up... so that's one drive I probably have
to warranty...

And then, when booting, fsck "decided" to fix the file system... which I
would ordinarily have mounted ro, as I knew I had problems.
I killed fsck as fast as I could.
### end background



As things stand, I have an array which has some errors in it: roughly 100
reported mismatches.

It's a RAID 60 array. Why 60? Because I had read in my CompTIA book that
only 4 drives could be used in a RAID 6 array, and my drives are over
6TB, so RAID 5 won't work right. I'm beginning to suspect that limitation
doesn't exist for md raid.

Now last time I had a problem with mismatched sectors, I used this
article here to help me find the affected files:
https://unix.stackexchange.com/questions/730307/find-files-contained-in-sector-of-a-raid-array

Last time, I could easily replace the affected files. So I just resynced.
Then, out of curiosity, I compared the originals to the damaged ones and
they were identical. So, is the above answer the way to find damaged
files?
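
For reference, the arithmetic behind that kind of sector-to-file lookup can
be sketched as below. This is only a rough sketch under assumptions that are
not confirmed anywhere in this mail: that the mismatch positions are counted
in 512-byte sectors, that the file system uses 4096-byte blocks, and that the
file system starts at a known sector offset on the array device (0 if it sits
directly on the md device). The function name and its parameters are made up
for illustration.

```python
def sector_to_fs_block(sector, fs_start_sector=0,
                       sector_size=512, fs_block_size=4096):
    """Map a sector on the array device to a file system block number.

    All defaults are assumptions: 512-byte sectors, 4096-byte file system
    blocks, and a file system that begins at sector 0 of the device.
    """
    if sector < fs_start_sector:
        raise ValueError("sector lies before the start of the file system")
    # Byte offset of the sector, measured from the start of the file system.
    byte_offset = (sector - fs_start_sector) * sector_size
    # Integer division gives the block that contains that byte offset.
    return byte_offset // fs_block_size


if __name__ == "__main__":
    # Hypothetical sector number, purely for illustration.
    print(sector_to_fs_block(123456))
```

If the file system were ext4, the resulting block number could then be given
to debugfs (icheck to get an inode, ncheck to get a path). Other file systems
need their own tools.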

Is there a way to isolate the array so that I can see the different
"versions", if you will, of the affected files/file system?

Thanks,
David
