From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: John Petrini <jpetrini@coredial.com>
Cc: Chris Murphy <lists@colorremedies.com>,
Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: Volume appears full but TB's of space available
Date: Fri, 7 Apr 2017 07:41:22 -0400
Message-ID: <d531fd79-c08a-0ba1-ae51-5503bc56b121@gmail.com>
In-Reply-To: <CAD4AmV68i-nUqzYMxc-QHuDAx8P=K4Jr4JrZdW=vFFyC0PK1+A@mail.gmail.com>
On 2017-04-06 23:25, John Petrini wrote:
> Interesting. That's the first time I'm hearing this. If that's the
> case, I feel like it's a stretch to call it RAID10 at all. It sounds a
> lot more like basic replication, similar to Ceph, only Ceph understands
> failure domains and can therefore be configured to handle device
> failure (albeit at a higher level).
Yeah, the stacking is a bit odd, and there are some rather annoying
caveats that make most of the names other than raid5/raid6 misleading.
In fact, when run on more than 2 disks, BTRFS raid1 mode is closer to
what most people think of as RAID10 than BTRFS raid10 mode is, although
it stripes at a much higher level (per allocation chunk).
>
> I do of course keep backups, but I chose RAID10 for the mix of
> performance and reliability. It doesn't seem worth losing 50% of
> my usable space for the performance gain alone.
>
> Thank you for letting me know about this. Knowing that, I think I may
> have to reconsider my choice here. I've really been enjoying the
> flexibility of BTRFS, which is why I switched to it in the first place,
> but with experimental RAID5/6 and what you've just told me I'm
> beginning to doubt that it's the right choice.
There are some other options in how you configure it. Most of the more
useful operational modes actually require stacking BTRFS on top of LVM
or MD. I'm rather fond of running BTRFS raid1 on top of LVM RAID0
volumes, which, while it provides no better data safety than BTRFS
raid10 mode, gets noticeably better performance. You can also reverse
that stacking to get something closer to traditional RAID10, but then
you lose the self-correcting aspect of BTRFS.
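As a rough illustration, here's a minimal sketch of that first layout
(the disk names, VG/LV names, and mount point are placeholders; the
important part is that the two RAID0 LVs share no disks, otherwise a
single disk failure could take out both copies):

    # Two striped volumes on disjoint pairs of disks
    vgcreate vg0 /dev/sda /dev/sdb
    vgcreate vg1 /dev/sdc /dev/sdd
    lvcreate --type raid0 --stripes 2 -l 100%FREE -n r0 vg0
    lvcreate --type raid0 --stripes 2 -l 100%FREE -n r0 vg1

    # BTRFS then keeps one copy of all data and metadata on each
    # striped volume
    mkfs.btrfs -d raid1 -m raid1 /dev/vg0/r0 /dev/vg1/r0
    mount /dev/vg0/r0 /mnt/data

The reversed stacking would use mirrored LVs ('--type raid1') with
BTRFS raid0 on top; at that point LVM owns the redundancy, which is why
BTRFS can no longer pick the good copy when a checksum fails.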
>
> What's more concerning is that I haven't found a good way to monitor
> BTRFS. I might be able to accept that the array can only handle a
> single drive failure if I were confident that I could detect it, but
> so far I haven't found a good solution for this.
This I can actually give some advice on. There are a couple of options,
but the easiest is to find a piece of generic monitoring software that
can check the return code of external programs, and then write some
simple scripts to perform the checks on BTRFS. The things you want to
keep an eye on are listed below; minimal example scripts for each
follow the list:
1. Output of 'btrfs dev stats'. If you've got a new enough copy of
btrfs-progs, you can pass '--check' and the return code will be
non-zero if any of the error counters is non-zero. If you're stuck with
an older version, you'll instead have to write a script to parse the
output (which is much easier in a language like Perl or Python than in
bash). You want to watch for steady increases in error counts or sudden
large jumps. Single intermittent errors are worth tracking too, but
they tend to happen more frequently the larger the array is.
2. Results from 'btrfs scrub'. This is somewhat tricky because a scrub
either runs asynchronously or blocks for a _long_ time. The simplest
option I've found is to fire off an asynchronous scrub to run during
down-time, and then schedule recurring checks with 'btrfs scrub
status'. On the plus side, 'btrfs scrub status' already returns
non-zero if the scrub found errors.
3. Watch the filesystem flags. Some monitoring software can do this for
you easily (Monit, for example, can watch for changes in the flags).
The general idea here is that BTRFS will remount read-only if it hits
certain serious errors, so you can watch for that transition and send a
notification when it happens. This is worth watching in any case, since
mount flags should not change during normal operation of any
filesystem.
4. Watch SMART status on the drives and run regular self-tests. Most of
the time, issues will show up here before they show up in the FS, so by
watching this, you may have an opportunity to replace devices before the
filesystem ends up completely broken.
5. If you're feeling really ambitious, watch the kernel logs for errors
from BTRFS and whatever storage drivers you use. This is the hardest
item on this list to automate reliably, so I'd not suggest relying on
it by itself.
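To make these concrete, here are minimal sketches, assuming a
filesystem mounted at /mnt/data (a placeholder) and small shell scripts
whose exit codes the monitoring software watches; the script names are
hypothetical. For item 1:

    # dev-stats-check.sh: with a new enough btrfs-progs, --check makes
    # the exit status non-zero if any error counter is non-zero
    btrfs device stats --check /mnt/data

    # on older btrfs-progs, parse the counter column yourself instead:
    # btrfs device stats /mnt/data | awk '$2 != 0 { bad=1 } END { exit bad }'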
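For item 2, two pieces: one job that cron fires during an idle window,
and one recurring check:

    # scrub-start.sh: returns immediately because the scrub runs in
    # the background by default
    btrfs scrub start /mnt/data

    # scrub-check.sh: 'btrfs scrub status' already exits non-zero if
    # the last scrub found errors
    btrfs scrub status /mnt/data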
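For item 3, the read-only transition shows up in the mount options:

    # ro-check.sh: exit non-zero if the filesystem has gone read-only
    findmnt -no OPTIONS /mnt/data | tr ',' '\n' | grep -qx ro && exit 1
    exit 0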
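For item 4, a sketch with an assumed device list (adjust for your
hardware); smartd can schedule the regular self-tests themselves via
smartd.conf:

    # smart-check.sh: smartctl exits non-zero (a bit mask) when the
    # health check fails or other problems are detected
    rc=0
    for dev in /dev/sda /dev/sdb /dev/sdc /dev/sdd; do
        smartctl -H "$dev" >/dev/null || rc=1
    done
    exit $rc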
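And for item 5, a deliberately crude sketch; matching on message text
is fragile, which is part of why I'd not rely on this check alone:

    # log-check.sh: exit non-zero if the kernel has logged any errors
    # mentioning btrfs since boot ('dmesg -l err' works without journald)
    journalctl -k -p err | grep -qi btrfs && exit 1
    exit 0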
The first two items are BTRFS-specific. The rest, however, are standard
things you should be monitoring regardless of what type of storage
stack you have. Of these, item 3 will trigger immediately in the event
of a catastrophic device failure, items 1, 2, and 5 provide better
coverage of slow failures, and item 4 covers both cases.
As far as what to use to actually track these, that really depends on
your use case. For tracking on an individual-system basis, I'd suggest
Monit: it's efficient, easy to configure, provides some degree of error
resilience, and can cover a lot of monitoring tasks beyond stuff like
this. If you want some kind of centralized monitoring, I'd probably go
with Nagios, but that's more because it's the standard for that type of
thing than because I've used it myself (I much prefer per-system
decentralized monitoring, with only the checks that systems are online
centralized).
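To tie it together, wiring scripts like the ones above into Monit looks
roughly like this (a sketch; the paths and check names are
placeholders):

    check program btrfs_dev_stats with path "/usr/local/sbin/dev-stats-check.sh"
        every 30 cycles
        if status != 0 then alert

    check filesystem data with path /mnt/data
        if changed fsflags then alert

The second stanza is the flags check from item 3, done natively by
Monit rather than through a script.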