Re: Unocorrectable errors with RAID1

From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: Christoph Groth <christoph@grothesque.org>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Unocorrectable errors with RAID1
Date: Tue, 17 Jan 2017 07:32:27 -0500
Message-ID: <0cedce7a-5641-cbf2-d3d7-f0773fcc14c7@gmail.com>
In-Reply-To: <87pojmavts.fsf@grothesque.org>

On 2017-01-17 04:18, Christoph Groth wrote:
> Austin S. Hemmelgarn wrote:
>
>> There's not really much in the way of great documentation that I know
>> of.  I can however cover the basics here:
>>
>> (...)
>
> Thanks for this explanation.  I'm sure it will be also useful to others.
Glad I could help.
>
>> If the chunk to be allocated was a data chunk, you get -ENOSPC
>> (usually, sometimes you might get other odd results) in the userspace
>> application that triggered the allocation.
>
> It seems that the available space reported by the system df command
> corresponds roughly to the size of the block device minus all the "used"
> space as reported by "btrfs fi df".
That's correct.
>
> If I understand what you wrote correctly this means that when writing a
> huge file it may happen that the system df will report enough free
> space, but btrfs will raise ENOSPC.  However, it should be possible to
> keep writing small files even at this point (assuming that there's
> enough space for the metadata).  Or will btrfs split the huge file into
> small pieces to fit it into the fragmented free space in the chunks?
OK, so the first thing to understand here is that a single extent in a 
file can't be larger than a chunk.  This means that if you have space 
for 3 1GB data chunks located in 3 different places on the storage 
device, you can still write a 3GB file to the filesystem; it will just 
end up with 3 1GB extents.  The issues with ENOSPC come in when almost 
all of your space is allocated to chunks and one type of chunk gets 
full.  In such a situation, if you still have metadata space, you can 
keep writing to the FS, but big writes may fail, and you'll eventually 
end up in a situation where you need to delete things to free up space.
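
To make the extent-per-chunk point concrete, here is a toy model in 
Python.  It is only a sketch and not how the kernel allocator actually 
works: the function name split_into_extents and the greedy fill order 
are assumptions made purely for illustration.

```python
import errno

GiB = 1024 ** 3

def split_into_extents(file_size, free_per_chunk):
    """Toy model: fill each data chunk's free space in turn.

    No extent may span two chunks, so each chunk contributes at most
    one extent here.  Hypothetical helper, not a btrfs API.
    """
    extents = []
    remaining = file_size
    for chunk_free in free_per_chunk:
        if remaining == 0:
            break
        length = min(remaining, chunk_free)
        if length > 0:
            extents.append(length)
            remaining -= length
    if remaining > 0:
        # Nothing left to allocate from: the writer sees ENOSPC.
        raise OSError(errno.ENOSPC, "No space left on device")
    return extents

# Room for three 1GB data chunks, and a 3GB file to write.
extents = split_into_extents(3 * GiB, [GiB, GiB, GiB])
print([e // GiB for e in extents])
```

With the numbers from the example above this gives three extents of 
1GB each, and asking for 4GB against the same three chunks raises 
ENOSPC in the model.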
>
> Such a situation should be avoided of course.  I'm asking out of curiosity.
>
>>>>> * So scrubbing is not enough to check the health of a btrfs file
>>>>> system?  It’s also necessary to read all the files?
>>>
>>>> Scrubbing checks data integrity, but not the state of the data. IOW,
>>>> you're checking that the data and metadata match with the checksums,
>>>> but not necessarily that the filesystem itself is valid.
>>>
>>> I see, but what should one then do to detect problems such as mine as
>>> soon as possible?  Periodically calculate hashes for all files? I’ve
>>> never seen a recommendation to do that for btrfs.
>
>> Scrub will verify that the data is the same as when the kernel
>> calculated the block checksum.  That's really the best that can be
>> done. In your case, it couldn't correct the errors because both copies
>> of the corrupted blocks were bad (this points at an issue with either
>> RAM or the storage controller BTW, not the disks themselves).  Had one
>> of the copies been valid, it would have intelligently detected which
>> one was bad and fixed things.
>
> I think I understand the problem with the three corrupted blocks that I
> was able to fix by replacing the files.
>
> But there is also the strange "Stale file handle" error with some other
> files that was not found by scrubbing, and also does not seem to appear
> in the output of "btrfs dev stats", which is BTW
>
> [/dev/sda2].write_io_errs   0
> [/dev/sda2].read_io_errs    0
> [/dev/sda2].flush_io_errs   0
> [/dev/sda2].corruption_errs 3
> [/dev/sda2].generation_errs 0
> [/dev/sdb2].write_io_errs   0
> [/dev/sdb2].read_io_errs    0
> [/dev/sdb2].flush_io_errs   0
> [/dev/sdb2].corruption_errs 3
> [/dev/sdb2].generation_errs 0
>
> (The 2 times 3 corruption errors seem to be the uncorrectable errors
> that I could fix by replacing the files.)
Yep, those correspond directly to the uncorrectable errors you mentioned 
in your original post.
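
If you want to keep an eye on those counters automatically, something 
along these lines might do.  It is only a sketch: it assumes the output 
format is exactly what you pasted above, and nonzero_counters, STAT_LINE 
and SAMPLE are names made up for the example.  In practice you would 
feed it the real output of "btrfs dev stats" instead of the sample text.

```python
import re

# Matches lines such as "[/dev/sda2].corruption_errs 3".
STAT_LINE = re.compile(r"^\[(?P<dev>[^\]]+)\]\.(?P<name>\w+)\s+(?P<value>\d+)$")

def nonzero_counters(stats_text):
    """Return {(device, counter): value} for every counter that is not 0."""
    found = {}
    for line in stats_text.splitlines():
        m = STAT_LINE.match(line.strip())
        if m and int(m.group("value")) != 0:
            found[(m.group("dev"), m.group("name"))] = int(m.group("value"))
    return found

# Sample input copied from the output quoted above.
SAMPLE = """\
[/dev/sda2].write_io_errs   0
[/dev/sda2].read_io_errs    0
[/dev/sda2].flush_io_errs   0
[/dev/sda2].corruption_errs 3
[/dev/sda2].generation_errs 0
[/dev/sdb2].write_io_errs   0
[/dev/sdb2].read_io_errs    0
[/dev/sdb2].flush_io_errs   0
[/dev/sdb2].corruption_errs 3
[/dev/sdb2].generation_errs 0
"""

for (dev, name), value in sorted(nonzero_counters(SAMPLE).items()):
    print(f"{dev}: {name} = {value}")
```

Note that this only catches what the device stats record, so it would 
not have flagged the ESTALE problem discussed below.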
>
> To get the "stale file handle" error I need to try to read the affected
> file.  That's why I was wondering whether reading all the files
> periodically is indeed a useful maintenance procedure with btrfs.
In the cases I've seen, no, it isn't all that useful.  As for the whole 
ESTALE thing, that's almost certainly a bug: either you shouldn't be 
getting an error there at all, or you shouldn't be getting that error 
code.
>
> "btrfs check" does find the problem, but it can be only run on an
> unmounted file system.
