From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: Christoph Groth <christoph@grothesque.org>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Unocorrectable errors with RAID1
Date: Tue, 17 Jan 2017 07:32:27 -0500
Message-ID: <0cedce7a-5641-cbf2-d3d7-f0773fcc14c7@gmail.com>
In-Reply-To: <87pojmavts.fsf@grothesque.org>

On 2017-01-17 04:18, Christoph Groth wrote:
> Austin S. Hemmelgarn wrote:
>
>> There's not really much in the way of great documentation that I know
>> of. I can however cover the basics here:
>>
>> (...)
>
> Thanks for this explanation. I'm sure it will be also useful to others.
Glad I could help.
>
>> If the chunk to be allocated was a data chunk, you get -ENOSPC
>> (usually, sometimes you might get other odd results) in the userspace
>> application that triggered the allocation.
>
> It seems that the available space reported by the system df command
> corresponds roughly to the size of the block device minus all the "used"
> space as reported by "btrfs fi df".
That's correct.
>
> If I understand what you wrote correctly this means that when writing a
> huge file it may happen that the system df will report enough free
> space, but btrfs will raise ENOSPC. However, it should be possible to
> keep writing small files even at this point (assuming that there's
> enough space for the metadata). Or will btrfs split the huge file into
> small pieces to fit it into the fragmented free space in the chunks?
OK, the first thing to understand is that an extent in a file can't be
larger than a chunk. This means that if you have space for three 1GB
data chunks located in three different places on the storage device, you
can still write a 3GB file to the filesystem; it will just end up as
three 1GB extents. The issues with ENOSPC come in when almost all of
your space is allocated to chunks and one type gets full. In that
situation, if you still have metadata space, you can keep writing to the
FS, but big writes may fail, and you'll eventually reach a point where
you need to delete things to free up space.
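As a rough illustration of the point about extents and chunks, here is a
toy model in Python. It is only a sketch under simplifying assumptions:
the function name, the greedy placement, and the idea of passing in a
list of free chunk sizes are all made up for illustration and do not
reflect how the real allocator works.

```python
import errno

GIB = 1024 ** 3  # one gibibyte, standing in for the "1GB" chunks above


def split_into_extents(file_size, free_chunk_sizes):
    """Toy model: place a file into free data chunks, one extent per chunk.

    Hypothetical helper, not btrfs code. Returns the list of extent sizes,
    or raises OSError(ENOSPC) if the free chunks cannot hold the file.
    """
    extents = []
    remaining = file_size
    for chunk_size in free_chunk_sizes:
        if remaining == 0:
            break
        # An extent can never be larger than the chunk that holds it.
        extent = min(chunk_size, remaining)
        extents.append(extent)
        remaining -= extent
    if remaining > 0:
        raise OSError(errno.ENOSPC, "No space left on device")
    return extents


# Three separate 1 GiB chunks can still hold one 3 GiB file.
print(split_into_extents(3 * GIB, [GIB, GIB, GIB]))
```

In this model a 3 GiB file lands in three extents of 1 GiB each, while a
4 GiB file against the same three chunks fails with ENOSPC.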
>
> Such a situation should be avoided of course. I'm asking out of curiosity.
>
>>>>> * So scrubbing is not enough to check the health of a btrfs file
>>>>> system? It’s also necessary to read all the files?
>>>
>>>> Scrubbing checks data integrity, but not the state of the data. IOW,
>>>> you're checking that the data and metadata match with the checksums,
>>>> but not necessarily that the filesystem itself is valid.
>>>
>>> I see, but what should one then do to detect problems such as mine as
>>> soon as possible? Periodically calculate hashes for all files? I’ve
>>> never seen a recommendation to do that for btrfs.
>
>> Scrub will verify that the data is the same as when the kernel
>> calculated the block checksum. That's really the best that can be
>> done. In your case, it couldn't correct the errors because both copies
>> of the corrupted blocks were bad (this points at an issue with either
>> RAM or the storage controller BTW, not the disks themselves). Had one
>> of the copies been valid, it would have intelligently detected which
>> one was bad and fixed things.
>
> I think I understand the problem with the three corrupted blocks that I
> was able to fix by replacing the files.
>
> But there is also the strange "Stale file handle" error with some other
> files that was not found by scrubbing, and also does not seem to appear
> in the output of "btrfs dev stats", which is BTW
>
> [/dev/sda2].write_io_errs 0
> [/dev/sda2].read_io_errs 0
> [/dev/sda2].flush_io_errs 0
> [/dev/sda2].corruption_errs 3
> [/dev/sda2].generation_errs 0
> [/dev/sdb2].write_io_errs 0
> [/dev/sdb2].read_io_errs 0
> [/dev/sdb2].flush_io_errs 0
> [/dev/sdb2].corruption_errs 3
> [/dev/sdb2].generation_errs 0
>
> (The 2 times 3 corruption errors seem to be the uncorrectable errors
> that I could fix by replacing the files.)
Yep, those correspond directly to the uncorrectable errors you mentioned
in your original post.
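If you want to keep an eye on those counters automatically, something
like the following Python sketch could parse output in the format you
pasted. The function names are hypothetical, and it assumes every line
has the shape "[device].counter value" as shown above; how you capture
the output of "btrfs dev stats" is left out.

```python
import re
from collections import defaultdict

# Matches lines such as "[/dev/sda2].corruption_errs 3".
STAT_LINE = re.compile(r"^\[(?P<dev>[^\]]+)\]\.(?P<name>\w+)\s+(?P<value>\d+)$")


def parse_dev_stats(text):
    """Return {device: {counter_name: value}} from dev-stats-style text."""
    stats = defaultdict(dict)
    for line in text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match:  # silently skip lines that do not look like counters
            stats[match["dev"]][match["name"]] = int(match["value"])
    return dict(stats)


def nonzero_counters(stats):
    """Keep only the counters that are greater than zero, per device."""
    result = {}
    for dev, counters in stats.items():
        bad = {name: value for name, value in counters.items() if value}
        if bad:
            result[dev] = bad
    return result


# Sample taken from the output quoted in this thread.
SAMPLE = """\
[/dev/sda2].write_io_errs 0
[/dev/sda2].read_io_errs 0
[/dev/sda2].flush_io_errs 0
[/dev/sda2].corruption_errs 3
[/dev/sda2].generation_errs 0
[/dev/sdb2].write_io_errs 0
[/dev/sdb2].read_io_errs 0
[/dev/sdb2].flush_io_errs 0
[/dev/sdb2].corruption_errs 3
[/dev/sdb2].generation_errs 0
"""

print(nonzero_counters(parse_dev_stats(SAMPLE)))
```

On the sample above this reports only the corruption_errs counter, with
a value of 3 for each of the two devices.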
>
> To get the "stale file handle" error I need to try to read the affected
> file. That's why I was wondering whether reading all the files
> periodically is indeed a useful maintenance procedure with btrfs.
In the cases I've seen, no, it isn't all that useful. As for the ESTALE
error, that's almost certainly a bug: either you shouldn't be getting an
error there at all, or you shouldn't be getting that particular error
code.
>
> "btrfs check" does find the problem, but it can be only run on an
> unmounted file system.