From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: John Petrini <jpetrini@coredial.com>
Cc: Chris Murphy <lists@colorremedies.com>,
	Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: Volume appears full but TB's of space available
Date: Fri, 7 Apr 2017 07:41:22 -0400	[thread overview]
Message-ID: <d531fd79-c08a-0ba1-ae51-5503bc56b121@gmail.com> (raw)
In-Reply-To: <CAD4AmV68i-nUqzYMxc-QHuDAx8P=K4Jr4JrZdW=vFFyC0PK1+A@mail.gmail.com>

On 2017-04-06 23:25, John Petrini wrote:
> Interesting. That's the first time I'm hearing this. If that's the
> case, I feel like it's a stretch to call it RAID10 at all. It sounds a
> lot more like basic replication, similar to Ceph, except that Ceph
> understands failure domains and can therefore be configured to handle
> device failure (albeit at a higher level).
Yeah, the stacking is a bit odd, and there are some rather annoying 
caveats that make most of the names other than raid5/raid6 misleading. 
In fact, when run on more than 2 disks, BTRFS raid1 mode is closer to 
what most people think of as RAID10 than BTRFS raid10 mode is, although 
it distributes copies at a much higher level (whole chunks rather than 
block-sized stripes).
>
> I do of course keep backups, but I chose RAID10 for the mix of
> performance and reliability. It doesn't seem worth losing 50% of
> my usable space for the performance gain alone.
>
> Thank you for letting me know about this. Knowing that, I think I may
> have to reconsider my choice here. I've really been enjoying the
> flexibility of BTRFS, which is why I switched to it in the first
> place, but with experimental RAID5/6 and what you've just told me I'm
> beginning to doubt that it's the right choice.
There are some other options in how you configure it.  Most of the more 
useful operational modes actually require stacking BTRFS on top of LVM 
or MD.  I'm rather fond of running BTRFS raid1 on top of LVM RAID0 
volumes, which, while it provides no better data safety than BTRFS 
raid10 mode, gets noticeably better performance.  You can also reverse 
that stacking to get something more like traditional RAID10, but then 
you lose the self-correcting aspect of BTRFS.
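
If it helps, here's roughly what that first layout looks like to set 
up.  The device names, sizes, and volume group name are of course 
placeholders for whatever your actual hardware is:

    # Pool four disks into one volume group:
    pvcreate /dev/sda /dev/sdb /dev/sdc /dev/sdd
    vgcreate vg0 /dev/sda /dev/sdb /dev/sdc /dev/sdd
    # Two RAID0 LVs, each striped across a different pair of disks:
    lvcreate --type raid0 --stripes 2 -L 1T -n stripe0 vg0 /dev/sda /dev/sdb
    lvcreate --type raid0 --stripes 2 -L 1T -n stripe1 vg0 /dev/sdc /dev/sdd
    # BTRFS raid1 for both data and metadata across the striped LVs:
    mkfs.btrfs -d raid1 -m raid1 /dev/vg0/stripe0 /dev/vg0/stripe1

BTRFS then keeps one copy of everything on each striped volume, and can 
use its checksums to pick the good copy if one side returns bad data.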
>
> What's more concerning is that I haven't found a good way to monitor
> BTRFS. I might be able to accept that the array can only handle a
> single drive failure if I was confident that I could detect it but so
> far I haven't found a good solution for this.
This I can actually give some advice on.  There are a couple of options, 
but the easiest is to find a piece of generic monitoring software that 
can check the return code of external programs, and then write some 
simple scripts to perform the checks on BTRFS.  The things you want to 
keep an eye on are:

1. Output of 'btrfs dev stats'.  If you've got a new enough copy of 
btrfs-progs, you can pass '--check' and the return code will be 
non-zero if any of the error counters is non-zero.  If you've got to 
use an older version, you'll instead have to write a script to parse 
the output (I will comment that this is much easier in a language like 
Perl or Python than it is in bash).  You want to watch for steady 
increases in error counts or sudden large jumps.  Single intermittent 
errors are worth tracking, but they tend to happen more frequently the 
larger the array is.
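
For reference, a check script for this might look something like the 
following (the mount point is a placeholder, and this assumes your 
btrfs-progs is new enough to have '--check'):

    #!/bin/sh
    # Non-zero exit means at least one error counter is non-zero.
    MNT="${1:-/mnt/data}"
    if btrfs device stats --check "$MNT" > /dev/null; then
        exit 0                          # all counters are zero
    else
        btrfs device stats "$MNT" >&2   # dump counters for the alert
        exit 1
    fi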

2. Results from 'btrfs scrub'.  This is somewhat tricky because a scrub 
either runs asynchronously or blocks for a _long_ time.  The simplest 
option I've found is to fire off an asynchronous scrub to run during 
down-time, and then schedule recurring checks with 'btrfs scrub 
status'.  On the plus side, 'btrfs scrub status' already returns 
non-zero if the scrub found errors.
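
As a rough sketch, the cron entries for that approach might look like 
this (times, mount point, and alert address are all placeholders):

    # /etc/crontab format: start a scrub Sunday at 02:00, then check
    # the result Monday morning once it has had time to finish.
    0 2 * * 0  root  btrfs scrub start /mnt/data
    0 8 * * 1  root  btrfs scrub status /mnt/data || echo "scrub found errors on /mnt/data" | mail -s "btrfs scrub alert" admin@example.com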

3. Watch the filesystem flags.  Some monitoring software can easily do 
this for you (Monit, for example, can watch for changes in the flags). 
The general idea here is that BTRFS will go read-only if it hits 
certain serious errors, so you can watch for that transition and send a 
notification when it happens.  This is worth watching in general, since 
mount flags should not change during normal operation of any filesystem.
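
A minimal check for the read-only transition, assuming util-linux's 
findmnt is available (the mount point is again a placeholder):

    #!/bin/sh
    # Non-zero exit means the filesystem has gone read-only.
    MNT="${1:-/mnt/data}"
    if findmnt -n -o OPTIONS "$MNT" | grep -qw ro; then
        echo "$MNT is mounted read-only" >&2
        exit 1
    fi
    exit 0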

4. Watch SMART status on the drives and run regular self-tests.  Most of 
the time, issues will show up here before they show up in the FS, so by 
watching this, you may have an opportunity to replace devices before the 
filesystem ends up completely broken.
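
The usual tool here is smartd from smartmontools, but if you'd rather 
wire it into the same script-based checks, something like this works 
(the device list is a placeholder):

    #!/bin/sh
    # Non-zero exit if any array member fails its SMART health check.
    rc=0
    for d in /dev/sd[a-d]; do
        smartctl -H "$d" > /dev/null || { echo "SMART failure on $d" >&2; rc=1; }
    done
    exit $rc

You'd pair that with a scheduled 'smartctl -t long <device>' run (or an 
'-s' test schedule in smartd.conf) so the self-tests actually happen.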

5. If you're feeling really ambitious, watch the kernel logs for errors 
from BTRFS and whatever storage drivers you use.  This is the least 
reliable item on this list to automate, so I wouldn't suggest relying 
on it by itself.
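
If you do want to try it, a crude version on a systemd machine might 
look like this (the patterns and time window are guesses you'd need to 
tune, which is exactly why this method is unreliable):

    #!/bin/sh
    # Non-zero exit if BTRFS logged anything alarming in the last hour.
    if journalctl -k --since "-1h" | grep -iE 'btrfs.*(error|corrupt)'; then
        exit 1
    fi
    exit 0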

The first two items are BTRFS-specific.  The rest, however, are 
standard things you should be monitoring regardless of what type of 
storage stack you have.  Of these, item 3 will trigger immediately in 
the event of a catastrophic device failure, while 1, 2, and 5 provide 
better coverage of slow failures, and 4 covers both aspects.

As far as what to use to actually track these, that really depends on 
your use case.  For tracking on an individual-system basis, I'd suggest 
Monit; it's efficient, easy to configure, provides some degree of error 
resilience, and can cover a lot of monitoring tasks beyond stuff like 
this.  If you want some kind of centralized monitoring, I'd probably go 
with Nagios, but that's more because it's the standard for that type of 
thing than because I've used it myself (I much prefer per-system 
decentralized monitoring, with only the online-status checks 
centralized).
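
As an example of the Monit side of this, hooking in one of the scripts 
above is just a few lines (the script path and name are whatever you 
saved it as):

    # /etc/monit/conf.d/btrfs - run the dev-stats check periodically
    check program btrfs_dev_stats with path "/usr/local/bin/btrfs-stats-check.sh"
        every 6 cycles
        if status != 0 then alert

Monit then handles the scheduling and the notification for you.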
