From: james harvey <jamespharvey20@gmail.com>
To: linux-btrfs@vger.kernel.org
Subject: Expected behavior of bad sectors on one drive in a RAID1
Date: Tue, 20 Oct 2015 00:16:15 -0400
Message-ID: <CA+X5Wn6Wu3RfX6rM6fpp9b4ZQQNm5WCbSOzxyOnE=+jDb0xDoA@mail.gmail.com>

Background -----

My fileserver had a "bad event" last week.  I shut it down normally to
add a new hard drive, and it would no longer POST.  I tried about 50
times, doing the typical routine: everything non-essential unplugged,
1 of 4 memory modules at a time, and 1 of 2 processors at a time.  Got
nowhere.

It's an inexpensive HP workstation, so I purchased a used identical
model (complete other than hard drives) on eBay.  The replacement
arrived today.  It POSTs fine.  I moved the hard drives over (again,
identical model, and Arch Linux not Windows) and it started giving
"Watchdog detected hard LOCKUP" type errors I've never seen before.

I decided I'd diagnose which part in the original server was bad.
After sitting powered off for a week, it suddenly started POSTing just
fine.  But, with the hard drives back in it, I'm getting the same hard
lockup errors.

Booted from an Arch ISO DVD, stress testing runs perfectly.

Btrfs-specific -----

The current problem I'm having must be a bad hard drive or corrupted data.

3-drive btrfs RAID1 (data and metadata).  sda has 1GB of the 3GB of
data, and 1GB of the 1GB of metadata.

sda appears to be going bad, with my low threshold of "going bad", and
will be replaced ASAP.  It just developed 16 reallocated sectors, and
has 40 current pending sectors.
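
For reference, the two counters above can be read from "smartctl -A".
A minimal sketch, assuming smartmontools is installed and the usual ATA
attribute table layout (attribute name in column 2, raw value in column
10); the helper name smart_raw is made up for illustration:

```shell
# Print the raw value of one SMART attribute from `smartctl -A` output
# read on stdin. Assumes the common ATA table layout: column 2 is the
# attribute name and column 10 is RAW_VALUE.
smart_raw() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Usage (needs root; not executed by this snippet):
#   smartctl -A /dev/sda | smart_raw Reallocated_Sector_Ct
#   smartctl -A /dev/sda | smart_raw Current_Pending_Sector
```

Some drives format the raw value differently, so treat the column
number as an assumption to check against the actual output.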

I'm currently running "btrfs scrub start -B -d -r /terra".  According
to the scrub status in another terminal, it has found 32 errors after
running for an hour.
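
Spelled out, the scrub and the status check from the second terminal
look roughly like this.  A sketch only: the function names are
hypothetical wrappers, and /terra is the mount point from this message.

```shell
# Read-only scrub as described in the message:
#   -B  stay in the foreground until the scrub finishes
#   -d  print statistics per device
#   -r  read-only: report errors but do not repair them
scrub_readonly() {
    btrfs scrub start -B -d -r "$1"
}

# Progress and error counts of a running (or the last) scrub, per device.
scrub_progress() {
    btrfs scrub status -d "$1"
}

# Usage (needs root; not executed by this snippet):
#   scrub_readonly /terra      # terminal 1
#   scrub_progress /terra      # terminal 2
```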

Question 1 - I expect that if I re-run the scrub without the read-only
option, it will use the checksums to determine which copy is correct,
and rewrite that data to a new sector on the drive with bad sectors.
Correct?
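
The read-write run that Question 1 refers to would be invoked along
these lines.  This only shows the invocation, followed by the per-device
error counters; whether the repair behaves as expected is the question
being asked.  The wrapper name scrub_repair is hypothetical.

```shell
# Scrub without -r (so repairs are allowed), then print the per-device
# error counters. The scrub's exit status is kept, because a scrub that
# found errors may exit non-zero and the counters are still of interest.
scrub_repair() {
    btrfs scrub start -B -d "$1"
    rc=$?
    btrfs device stats "$1"
    return "$rc"
}

# Usage (needs root; not executed by this snippet):
#   scrub_repair /terra
```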

Question 2 - Before running the scrub, when booting off the RAID with
bad sectors, would btrfs recognize "on the fly" that it was getting bad
sector data because the checksum was off, and check the other drives?
Or is it expected that I could get a bad sector read in a critical
piece of the operating system and/or kernel, which could be causing my
lockup issues?

Question 3 - Probably doesn't matter, but how can I see which files
(or metadata to files) the 40 current bad sectors are in?  (On extX,
I'd use tune2fs and debugfs to be able to see this information.)
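
One possible starting point for Question 3, as a sketch: scrub errors
in file data are usually reported in the kernel log with the affected
path, so those paths can be pulled out of dmesg.  The assumption here is
that the log lines end in "(path: some/file)"; the exact wording varies
between kernel versions, and errors in metadata typically carry no path.
The helper name scrub_error_paths is made up.

```shell
# Print the unique file paths named in btrfs error lines read on stdin.
# Assumes each relevant line ends in "(path: <relative path>)".
scrub_error_paths() {
    sed -n 's/.*(path: \(.*\))$/\1/p' | sort -u
}

# Usage (not executed by this snippet):
#   dmesg | scrub_error_paths
# A logical address taken from such a line can also be resolved by hand:
#   btrfs inspect-internal logical-resolve <logical> /terra
```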

I do have hourly snapshots, from when it was properly running, so once
I'm that far in the process, I can also compare the most recent
snapshots, and see if there are any changes that happened to files
that shouldn't have.
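
A simple way to do that comparison is a recursive content diff between
two snapshot directories.  A minimal sketch; the snapshot paths in the
usage line are hypothetical, and a plain diff follows symlinks and reads
every file, so it can be slow on a large tree.

```shell
# List files that differ between two snapshot directories.
# diff exits 1 when it finds differences, which is not a failure here;
# only exit codes above 1 (real trouble) are passed on as an error.
compare_snapshots() {
    diff -rq "$1" "$2"
    [ $? -le 1 ]
}

# Usage (paths are hypothetical; not executed by this snippet):
#   compare_snapshots /terra/.snapshots/hourly-1 /terra/.snapshots/hourly-2
```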

