Re: Monitoring Btrfs - Austin S. Hemmelgarn

linux-btrfs.vger.kernel.org archive mirror

From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: Anand Jain <anand.jain@oracle.com>,
	Stefan Malte Schumacher <stefan.m.schumacher@gmail.com>,
	linux-btrfs@vger.kernel.org, David Sterba <dsterba@suse.cz>
Subject: Re: Monitoring Btrfs
Date: Wed, 19 Oct 2016 09:33:21 -0400
Message-ID: <dd66793c-709a-1b7e-09d0-99f1b745f908@gmail.com>
In-Reply-To: <bf0b049c-cd7a-00f4-cc20-68d50a7be039@oracle.com>

On 2016-10-19 09:06, Anand Jain wrote:
>
>
> On 10/19/16 19:15, Austin S. Hemmelgarn wrote:
>> On 2016-10-18 17:36, Anand Jain wrote:
>>>
>>>
>>>>>>> I would like to monitor my btrfs-filesystem for missing drives.
>>>
>>>
>>>>>> This is actually correct behavior, the filesystem reports that it
>>>>>> should
>>>>>> have 6 devices, which is how it knows a device is missing.
>>>
>>>
>>>>>  Missing - means missing at the time of mount. So how are you planning
>>>>> to monitor a disk which has failed while in production?
>>>
>>>> No, in `btrfs fi show` it means that it can't find the device.
>>>
>>>  'btrfs fi show' is misleading compared to 'btrfs fi show -m'.
>>>  -m tells the btrfs-kernel perspective of the devices; as of now
>>>  there is no code in the kernel which changes the device status
>>>  while it's mounted (except for readonly, which is irrelevant in
>>>  raid1 with 1 disk failed).
>
>> Actually, that's exactly how I would expect each of them to behave.  We
>> need some way to get both the state the kernel thinks the FS is in, and
>> the state it's actually in (according to the tools, not the kernel), and
>> '-m' reporting kernel state while no '-m' reports actual state is
>> exactly what I would expect in this case.
>
>
>> That leads also to another way I hadn't thought of to monitor a
>> filesystem.  The output of 'fi show' with and without '-m' should match
>> if the filesystem was healthy when mounted and is still healthy, if they
>> don't, then something is wrong.
>
>
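A rough sketch of that comparison, in Python, in case it's useful.
Assumptions, none of which come from the thread: the output of
'btrfs filesystem show' contains a 'Total devices N' line, one
'devid N ... path ...' line per device it can see, and a
'devices missing' marker when it can't find one.  The names
parse_show(), run_show() and views_agree() are made up for this
example, and running the command normally needs root.  It compares
devids rather than paths, since the same device can show up under
more than one path name.

```python
import re
import subprocess


def parse_show(text):
    """Pull the expected device count and the listed devids out of
    'btrfs filesystem show' output (format assumed, see above)."""
    total = None
    devids = set()
    for line in text.splitlines():
        m = re.search(r"Total devices\s+(\d+)", line)
        if m:
            total = int(m.group(1))
            continue
        m = re.match(r"\s*devid\s+(\d+)\s", line)
        if m:
            devids.add(int(m.group(1)))
    return {
        "total": total,
        "devids": devids,
        "missing_marker": "devices missing" in text,
    }


def run_show(path, kernel_view):
    """Run 'btrfs filesystem show' on path, with '-m' for the kernel's view."""
    cmd = ["btrfs", "filesystem", "show"]
    if kernel_view:
        cmd.append("-m")
    cmd.append(path)
    done = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_show(done.stdout)


def views_agree(tool_view, kernel_view):
    """True only if both views list every expected device and the same devids."""
    for view in (tool_view, kernel_view):
        if view["missing_marker"] or view["total"] != len(view["devids"]):
            return False
    return tool_view["devids"] == kernel_view["devids"]
```

A monitoring job would call something like
views_agree(run_show("/mnt", False), run_show("/mnt", True)) on a
timer and alert when it returns False; "/mnt" is just a placeholder
mount point.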
>>>> 1. Filesystem flags.  These will change when the filesystem goes
>>>> degraded,
>>>
>>>   Which flag is in question here?
>> I should clarify here, I mean the mount options, I'm just used to the
>> monit terminology (which was not well picked in this case).  The big one
>> to watch is the read-only flag, as BTRFS will force a filesystem
>> read-only (which updates the mount options).  Any change to the mount
>> options though without manual intervention is generally a sign that
>> _something_ is wrong.
>
>
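For anyone not using monit, watching the mount options can be
sketched in a few lines of Python.  This assumes the usual
/proc/mounts layout (device, mount point, filesystem type, then a
comma-separated option list); btrfs_mount_options() and
changed_options() are names invented for this example, not part of
any existing tool.

```python
def btrfs_mount_options(mounts_text):
    """Map each btrfs mount point to its set of mount options,
    given the text of /proc/mounts."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[2] == "btrfs":
            result[fields[1]] = set(fields[3].split(","))
    return result


def changed_options(baseline, current):
    """Report, per mount point, the options that differ from the baseline.

    A mount point that has disappeared is reported as ["<unmounted>"].
    """
    report = {}
    for mountpoint, old in baseline.items():
        new = current.get(mountpoint)
        if new is None:
            report[mountpoint] = ["<unmounted>"]
        elif new != old:
            # Symmetric difference: options added or dropped since baseline.
            report[mountpoint] = sorted(new ^ old)
    return report
```

Take the baseline once while the filesystem is known to be good
(read /proc/mounts and pass its contents to btrfs_mount_options()),
then repeat periodically and alert on any non-empty result from
changed_options().  A forced read-only remount shows up as both
'ro' and 'rw' in the difference.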
>  btrfs-progs shouldn't add its own intelligence in determining the
>  device state, it should be a transparent tool to report status from
>  the btrfs-kernel. So I am opposed to patches such as
>
>     commit 206efb60cbe3049e0d44c6da3c1909aeee18f813
>     btrfs-progs: Add missing devices check for mounted btrfs.
>
>  There are many ways a device can fail/recover in the SAN environment;
>  this device-state-managing intelligence should be in one place, and
>  in the kernel. The volume manager part of the code in the kernel
>  is incomplete.
>
I don't agree that the management should be completely unified or that
the tools should just report kernel state.  The tools need some way to
check device state themselves, both because they have to operate on
unmounted filesystems, and because until the kernel gets smart enough
to actually handle device state properly, some method is needed to
check the actual state of the devices.  Even once the kernel is smart
enough, it's still helpful to see without mounting a filesystem whether
or not all the devices are there, and if we ever switch to a real mount
helper (which I am in favor of for multiple reasons), we'll need device
state checking in userspace for that too.

Take a look at LVM.  The separation of responsibilities there is
ideally what we should be looking at long term for BTRFS.  The userspace
components tell the kernel what to do, and list both kernel state _and_
physical state in a readable manner.  The kernel tracks limited parts of
the state: only for active LV's (the equivalent of mounted filesystems),
and even then only what it needs to track (Is this RAID volume in sync?
Is that snapshot or thin storage pool getting close to full?).  It sends
notifications to a userspace component, which then acts on those
conditions (possibly then telling the kernel what to do in response to
them).  On top of that, the userspace components don't require a kernel
which supports them for any off-line operations, and the kernel works
fine with older userspace.  Both userspace and the kernel handle missing
devices (userspace tools report them, the kernel refuses to activate
LV's that require them).
