From: Tom Arild Naess <tanaess@gmail.com>
To: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>,
	linux-btrfs@vger.kernel.org
Subject: Re: btrfs scrub with unexpected results
Date: Wed, 9 Nov 2016 18:30:15 +0100
Message-ID: <48d461f9-0455-5da2-651a-39d4e59cd217@gmail.com>
In-Reply-To: <479c9899-f073-5791-0693-1c9daef3f92d@gmail.com>

On 09. nov. 2016 14:04, Austin S. Hemmelgarn wrote:
> On 2016-11-09 07:40, Tom Arild Naess wrote:
>> Thanks for your lengthy answer. Just after posting my question I
>> realized that the last reboot I did resulted in the filesystem being
>> mounted RO. I started a "btrfs check --repair" but terminated it after
>> six days, since I really need to get the backup up and running again. I
>> have decided to start with a fresh btrfs to rule out any errors created
>> by old kernels.
> Even with other filesystems, doing this on occasion is generally a 
> good idea.  It goes double for BTRFS though; right now I'd say you 
> should be re-creating the filesystem every year or so if you're using BTRFS.
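
For reference, the fresh start I have planned looks roughly like the 
following (a sketch only; the final mkfs options are still undecided, 
the source path is just an example, and the device names are the ones 
from the "fi show" output further down):

  $ mkfs.btrfs -f -d raid10 -m raid10 /dev/sda /dev/sdb /dev/sdc /dev/sdd
  $ mount /dev/sda /backup
  $ rsync -aHAX /mnt/old-backup/ /backup/
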
>>
>> I find it unlikely that my problems are caused by any hardware faults,
>> as the server has been running 24/7 for six months with nightly backups
>> and no problems. The system has also been scrubbed once a month without
>> issues in the same timespan. Whenever there have been scrubbing errors,
>> they have all occurred in the same old snapshots
>> that I created from my hard link backups. These were the first snapshots
>> I ever took, and back then I ran a quite old kernel.
> Just to clarify, most of the reason I'm thinking it's a hardware issue 
> is that a reboot fixed things.  In most cases I've seen, that 
> generally means you either have hardware problems (even failing 
> hardware usually works correctly for a little while after being power 
> cycled), or that you got hit with a memory error somewhere (not 
> everything in a server system has ECC memory; the on-device caches on 
> most disks and some storage controllers often don't, for example).  It 
> could just as easily be the result of a bug somewhere as well, but I 
> usually tend to blame the hardware first because I find that it's a 
> lot easier to debug most of the time (I might also be a bit biased 
> because BTRFS has helped me ID a whole lot of marginal hardware in the 
> past 2 years).

OK, I will keep this in mind if the server starts acting strange again.
>>
>> If a fresh btrfs does not solve my problems, I will go through the list
>> you provided. Some items have already been handled, like memtest (I did
>> a long run before the system was put into service). I am also running
>> smartctl as a service, and nothing is reported there either.
>>
>> One last thing: the CPU on the server is a really low-end AMD C-70, and
>> I wonder if it is a little too weak for a storage server? Not for the
>> day-to-day work, but when a repair is needed. More than six days for a
>> repair on a 4x 3TB system seems way too long?
> For something like a storage server, what you really want to look at 
> is memory bandwidth, as that tends to directly impact pretty much 
> everything the system is supposed to be doing.  In your case, the 
> limiting factor probably is the CPU, as a C-70 runs at 1GHz and only 
> supports up to DDR3-1066 RAM.  This works fine for just serving files 
> of course, but it gets problematic when you have to move lots of data 
> around or process a filesystem for repairs.  As a general rule for a 
> file-server, I wouldn't use anything running at less than 2GHz with at 
> least 2 (preferably 4) cores which supports at minimum DDR3-1333 
> (preferably DDR3-1600) RAM.
>
> In fact, with some very specific exceptions, memory bandwidth is 
> actually one of the most important metrics for almost any computer 
> (provided the CPU isn't running slower than the RAM or limiting its 
> max operating speed).  I'd upgrade RAM before upgrading the CPU most 
> of the time for most systems.

Sorry, but I will have to disagree with your point about memory! The 
memory controllers on modern computers are quite well matched to the 
CPU, and the difference between DDR3-1066 and DDR3-1600 will often be 
minuscule in the real world. I found this article on DDR3 from the 
reputable anandtech.com showing the real-world effect that differently 
spec'ed DDR3 has on system performance: http://www.anandtech.com/show/2792
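
If anyone wants a quick number for their own box, sysbench gives a rough 
(admittedly crude) estimate of memory throughput; a sketch, assuming the 
sysbench 1.0 option names:

  $ sysbench memory --memory-block-size=1M --memory-total-size=10G run

The MiB/sec figure on the "transferred" line is the one to compare 
between machines.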

About multi-core systems: I noticed that "btrfs check" only utilized a 
single core, and maxed it out at 100%. It seems like it would benefit 
from using more cores. Has this been considered?
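
(For what it's worth, the way I watched it was nothing fancy, roughly:

  $ btrfs check /dev/sda          # against the unmounted filesystem
  $ top -H -p "$(pidof btrfs)"    # in a second terminal

and top showed a single thread pegged at around 100% the whole time. 
The device name is just an example.)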


-- 
Tom Arild Naess


>>
>>
>> -- 
>> Tom Arild Naess
>>
>> On 03. nov. 2016 12:51, Austin S. Hemmelgarn wrote:
>>> On 2016-11-02 17:55, Tom Arild Naess wrote:
>>>> Hello,
>>>>
>>>> I have been running btrfs on a file server and backup server for a
>>>> couple of years now, both set up as RAID 10. The file server has been
>>>> running along without any problems since day one. My problems have been
>>>> with the backup server.
>>>>
>>>> A little background about the backup server before I dive into the
>>>> problems. The server was a new build that was set to replace an aging
>>>> machine, and my intention was to start using btrfs send/receive
>>>> instead of hard links for the backups. Since I had 8x the space on
>>>> the new server, I just rsynced the whole lot of old backups over. I
>>>> then made some scripts that created snapshots from the old file
>>>> hierarchy. As I started rewriting my backup scripts (on both the file
>>>> server and the backup server) to use send/receive, I also tested
>>>> scrubbing to see that everything was OK. After doing this a few
>>>> times, scrub found unrecoverable files. This, I thought, should not
>>>> be possible on new disks. I tried to get some help on this list, but
>>>> no answers were found, and since I was unable to find what triggered
>>>> this, I just stopped using send/receive and let my old backup regime
>>>> continue on this new backup server as well. I don't remember how I
>>>> fixed the errors, but I guess I just replaced the offending files
>>>> with fresh ones, and scrub ran without any more problems. I decided
>>>> to let things just run like this, and set up scrubbing on a monthly
>>>> schedule.
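
(The monthly schedule is just a root cron job, roughly like the entry 
below; the mount point is the real one, but the timing is only an example:

  0 3 1 * * /usr/bin/btrfs scrub start -Bd /backup

and cron mails whatever the job prints, which is how I get notified.)
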
>>>>
>>>> Last night I got the unpleasant mail from cron telling me that scrub
>>>> had failed (for the first time in over a year). Since I was running
>>>> an older kernel (4.2.x), I decided to upgrade, and went for the
>>>> latest of the longterm branches, namely 4.4.30. After rebooting I did
>>>> (for whatever reason) check one of the offending files, and I could
>>>> read the file just fine! I checked the rest of the bunch, and all
>>>> files read fine and had the same md5 sums as the originals! All these
>>>> files were located in those old snapshots. I thought that maybe this
>>>> was because of a bug resolved since my last kernel. Then I ran a new
>>>> scrub, and this one also reported unrecoverable errors, this time on
>>>> two other files, but again in some of the old snapshots. I tried
>>>> reading the files, and got the expected I/O errors. One reboot later,
>>>> these files read just fine again!
>>> So, based on what you're saying, this sounds like you have hardware
>>> problems.  The fact that a reboot is fixing I/O errors caused by
>>> checksum mismatches tells me that either (in relative order of
>>> likelihood):
>>> 1. You have some bad RAM (probably not much given the small number of
>>> errors).
>>> 2. You have some bad hardware in the storage path other than the
>>> physical media in your storage devices.  Any of the storage
>>> controller, the cabling/back-plane, or the on-disk cache having issues
>>> can cause things like this to happen.
>>> 3. Some other component is having issues.  A PSU that's not providing
>>> clean power could cause this also, but is not likely unless you've got
>>> a really cheap PSU.
>>> 4. You've found an odd corner case in BTRFS that nobody's reported
>>> before (this is pretty much certain if you rule out the hardware).
>>>
>>> Based on this, what I would suggest doing (in order):
>>> 1. Run self-tests on the storage devices using smartctl (and see if
>>> they think they're healthy or not).  I doubt that this will show
>>> anything, but it's quick and easy to test and doesn't require taking
>>> the system off-line, so it's one of the first things to check.
>>> 2. Check your cabling.  This is really easy to verify, just disconnect
>>> and reconnect everything and see if you still have problems. If you
>>> do still have problems, try switching out one data (SATA/SAS/whatever
>>> you use) cable at a time and see if you still have problems (it takes
>>> longer than using a cable tester, but finding a working cable tester
>>> for internal computer cables is hard).
>>> 3. Check your RAM.  Memtest86 and Memtest86+ are the best options for
>>> general testing, but I doubt that those will turn up anything.  If you
>>> have spare RAM, I'd actually suggest just swapping out one DIMM at a
>>> time and seeing if you still get the behavior you're seeing.
>>> 4. Check your PSU.  I list this before the storage controller and
>>> disks because it's pretty easy to test (you just need a PSU tester,
>>> which are about 15 USD on Amazon, or a good multi-meter, some wire,
>>> and some basic knowledge of the wiring), but after the RAM because
>>> it's significantly less likely to be the problem than your RAM unless
>>> you've got a really cheap PSU.
>>> 5. Check your storage controller.  This is _hard_ to do unless you
>>> have a spare known working storage controller.
>>> 6. If you have any extra expansion cards you're not using (NICs, HBAs,
>>> etc), try pulling them out.  This sounds odd, but I've seen cases
>>> where the driver for something I wasn't using at all was causing
>>> problems elsewhere.
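
(Regarding step 1: the smartctl self-tests I already run look roughly 
like this, one disk at a time, with the device name just an example:

  $ smartctl -t long /dev/sda      # kick off the extended self-test
  $ smartctl -l selftest /dev/sda  # check the log once it finishes
  $ smartctl -H -A /dev/sda        # overall health plus attribute table

and so far nothing has been flagged.)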
>>>
>>> Now, assuming none of that turns anything up, then you probably have
>>> found a bug in BTRFS, but I have no idea in this case how we would go
>>> about debugging it as it seems to be some kind of in-memory data
>>> corruption (maybe a buffer overflow?).
>>>
>>>>
>>>> Some system info:
>>>>
>>>> $ uname -a
>>>> Linux backup 4.4.30-1-lts #1 SMP Tue Nov 1 22:09:20 CET 2016 x86_64
>>>> GNU/Linux
>>>>
>>>> $ btrfs --version
>>>> btrfs-progs v4.8.2
>>>>
>>>> $ btrfs fi show /backup
>>>> Label: none  uuid: 8825ce78-d620-48f5-9f03-8c4568d3719d
>>>>     Total devices 4 FS bytes used 2.81TiB
>>>>     devid    1 size 2.73TiB used 1.41TiB path /dev/sdb
>>>>     devid    2 size 2.73TiB used 1.41TiB path /dev/sda
>>>>     devid    3 size 2.73TiB used 1.41TiB path /dev/sdd
>>>>     devid    4 size 2.73TiB used 1.41TiB path /dev/sdc
>>>
>>
>


