Re: [RFC] Filesystem error notifications proposal

linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Gabriel Krisman Bertazi <krisman@collabora.com>
To: Jan Kara <jack@suse.cz>
Cc: tytso@mit.edu, jack@suse.com, viro@zeniv.linux.org.uk,
	amir73il@gmail.com, dhowells@redhat.com, david@fromorbit.com,
	darrick.wong@oracle.com, khazhy@google.com,
	linux-fsdevel@vger.kernel.org, kernel@collabora.com
Subject: Re: [RFC] Filesystem error notifications proposal
Date: Thu, 21 Jan 2021 15:56:12 -0300	[thread overview]
Message-ID: <87a6t2dsqb.fsf@collabora.com> (raw)
In-Reply-To: <20210121114453.GD24063@quack2.suse.cz> (Jan Kara's message of "Thu, 21 Jan 2021 12:44:53 +0100")

Jan Kara <jack@suse.cz> writes:
> On Wed 20-01-21 17:13:15, Gabriel Krisman Bertazi wrote:
>>   fanotify is based on fsnotify, and the infrastructure for the
>>   visibility semantics is mostly implemented by fsnotify itself.  It
>>   would be possible to make error notifications a new mechanism on top
>>   of fsnotify, without modifying fanotify, but that would require
>>   exposing a very similar interface to userspace, new system calls, and
>>   that doesn't seem justifiable since we can extend fanotify syscalls
>>   without ABI breakage.
>
> I agree fanotify could be used for propagating the error information from
> kernel to userspace. But before we go to design how exactly the events
> should look like, I'd like to hear a usecase (or usecases) for this
> feature. Because the intended use very much determines which information we
> need to propagate to userspace and what flexibility we need there.

Hi Jan,

Thanks for the review.

My user is a cloud provider who wants to monitor several machines,
potentially with several volumes each, for metadata and data corruption
that might require attention from a sysadmin.  They would run a watcher
for each filesystem of each machine and consolidate the information.
Right now, they have a solution in-house to monitor ext4, which
basically exports to userspace any kind of condition that would trigger
one of the ext4_error_* functions.

We need to monitor a bit more than just writeback errors, but that is a
big part of the functionality we want.  We would be interested in
finding out about inode corruption, checksum failures, and read errors
as well.

Khazhy, in Cc, is more familiar with the specific requirements, and
might be able to provide more information on their usage of the existing
solution.

>> 3 Proposal
>> ==========
>> 
>>   The error notification framework is based on fanotify instead of
>>   watch_queue.  It is exposed through a new set of marks FAN_ERROR_*,
>>   exposed through the fanotify_mark(2) API.
>> 
>>   fanotify (fsnotify-based) has the infrastructure in-place to link
>>   notifications happening at filesystem objects to mountpoints and to
>>   filesystems, and already exposes an interface with well defined
>>   semantics of how those are exposed to watchers in different
>>   mountpoints or different subtrees.
>> 
>>   A new message format is exposed, if the user passed
>>   FAN_REPORT_DETAILED_ERROR fanotify_init(2) flag.  FAN_ERROR messages
>>   don't have FID/DFID records.
>
> So this depends on the use case. I can imagine that we could introduce new
> event type FAN_FS_ERROR (like we currently have FAN_MODIFY, FAN_ACCESS,
> FAN_CREATE, ...). This event type would be generated when error happens on
> fs object - we can generate it associated with struct file, struct dentry,
> or only superblock, depending on what information is available at the place
> where fs spots the error. Similarly to how we generate other fsnotify
> events. But here's the usecase question - is there really need for fine
> granularity of error reporting (i.e., someone watching errors only on a
> particular file, or directory)?

My use-case requires only watching an entire filesystem, and not
specific files.  I can imagine a use-case of someone watching a specific
mount point, like an application running inside a container is notified
only of errors in files/directories within that container.

>
> Then we can have FAN_REPORT_DETAILED_ERROR which would add to event more
> information about the error like you write below - although identification
> of the inode / fsid is IMO redundant - you can get all that information
> (and more) from FID / DFID event info entries if you want it.

When FID provides a handle, I'd need to get information on that handle
without opening it, as the file might be corrupted.  Would we need an
interface to retrieve inode number by handle without opening the file,
i.e. stat_by_handle?

Thanks,

-- 
Gabriel Krisman Bertazi

next prev parent reply	other threads:[~2021-01-21 19:32 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-20 20:13 [RFC] Filesystem error notifications proposal Gabriel Krisman Bertazi
2021-01-21  4:01 ` Viacheslav Dubeyko
2021-01-21 11:44 ` Jan Kara
2021-01-21 13:27   ` Amir Goldstein
2021-01-21 18:56   ` Gabriel Krisman Bertazi [this message]
2021-01-21 22:44 ` Theodore Ts'o
2021-01-22  0:44   ` Gabriel Krisman Bertazi
2021-01-22  7:36     ` Amir Goldstein
2021-02-02 20:51       ` Gabriel Krisman Bertazi
2021-01-28 22:28     ` Theodore Ts'o
2021-02-02 20:26       ` Gabriel Krisman Bertazi
2021-02-02 22:34         ` Theodore Ts'o
2021-02-08 18:49           ` Gabriel Krisman Bertazi
2021-02-08 22:19             ` Dave Chinner
2021-02-09  1:08               ` Theodore Ts'o
2021-02-09  5:12                 ` Khazhismel Kumykov
2021-02-09  8:55                 ` Dave Chinner
2021-02-09 17:57                   ` Theodore Ts'o
2021-02-10  0:52                     ` Darrick J. Wong
2021-02-10  2:21                       ` Theodore Ts'o
2021-02-10  2:32                         ` Darrick J. Wong
2021-02-09 17:35               ` Jan Kara
2021-02-10  0:22                 ` Darrick J. Wong
2021-02-10  7:46                 ` Dave Chinner
2021-02-10  0:49               ` Darrick J. Wong
2021-02-10  0:09 ` Darrick J. Wong
2021-02-10  7:23   ` Amir Goldstein
2021-02-10 23:29   ` Gabriel Krisman Bertazi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87a6t2dsqb.fsf@collabora.com \
    --to=krisman@collabora.com \
    --cc=amir73il@gmail.com \
    --cc=darrick.wong@oracle.com \
    --cc=david@fromorbit.com \
    --cc=dhowells@redhat.com \
    --cc=jack@suse.com \
    --cc=jack@suse.cz \
    --cc=kernel@collabora.com \
    --cc=khazhy@google.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=tytso@mit.edu \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).