linux-api.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Beata Michalska <b.michalska@samsung.com>
To: Jan Kara <jack@suse.cz>
Cc: linux-kernel@vger.kernel.org, tytso@mit.edu,
	adilger.kernel@dilger.ca, hughd@google.com, lczerner@redhat.com,
	hch@infradead.org, linux-ext4@vger.kernel.org,
	linux-mm@kvack.org, kyungmin.park@samsung.com,
	kmpark@infradead.org,
	Linux Filesystem Mailing List <linux-fsdevel@vger.kernel.org>,
	linux-api@vger.kernel.org
Subject: Re: [RFC 1/4] fs: Add generic file system event notifications
Date: Fri, 17 Apr 2015 15:15:34 +0200	[thread overview]
Message-ID: <55310776.20200@samsung.com> (raw)
In-Reply-To: <553104E5.2040704@samsung.com>

On 04/17/2015 03:04 PM, Beata Michalska wrote:
> On 04/17/2015 01:31 PM, Jan Kara wrote:
>> On Wed 15-04-15 09:15:44, Beata Michalska wrote:
>>> Introduce configurable generic interface for file
>>> system-wide event notifications to provide file
>>> systems with a common way of reporting any potential
>>> issues as they emerge.
>>>
>>> The notifications are to be issued through generic
>>> netlink interface, by a dedicated, for file system
>>> events, multicast group. The file systems might as
>>> well use this group to send their own custom messages.
>>>
>>> The events have been split into four base categories:
>>> information, warnings, errors and threshold notifications,
>>> with some very basic event types like running out of space
>>> or file system being remounted as read-only.
>>>
>>> Threshold notifications have been included to allow
>>> triggering an event whenever the amount of free space
>>> drops below a certain level - or levels to be more precise
>>> as two of them are being supported: the lower and the upper
>>> range. The notifications work both ways: once the threshold
>>> level has been reached, an event shall be generated whenever
>>> the number of available blocks goes up again re-activating
>>> the threshold.
>>>
>>> The interface has been exposed through a vfs. Once mounted,
>>> it serves as an entry point for the set-up where one can
>>> register for particular file system events.
>>>
>>> Signed-off-by: Beata Michalska <b.michalska@samsung.com>
>>   Thanks for the patches! Some comments are below.
>>
>>> ---
>>>  Documentation/filesystems/events.txt |  254 +++++++++++
>>>  fs/Makefile                          |    1 +
>>>  fs/events/Makefile                   |    6 +
>>>  fs/events/fs_event.c                 |  775 ++++++++++++++++++++++++++++++++++
>>>  fs/events/fs_event.h                 |   27 ++
>>>  fs/events/fs_event_netlink.c         |   94 +++++
>>>  fs/namespace.c                       |    1 +
>>>  include/linux/fs.h                   |    6 +-
>>>  include/linux/fs_event.h             |   69 +++
>>>  include/uapi/linux/fs_event.h        |   62 +++
>>>  include/uapi/linux/genetlink.h       |    1 +
>>>  net/netlink/genetlink.c              |    7 +-
>>>  12 files changed, 1301 insertions(+), 2 deletions(-)
>>>  create mode 100644 Documentation/filesystems/events.txt
>>>  create mode 100644 fs/events/Makefile
>>>  create mode 100644 fs/events/fs_event.c
>>>  create mode 100644 fs/events/fs_event.h
>>>  create mode 100644 fs/events/fs_event_netlink.c
>>>  create mode 100644 include/linux/fs_event.h
>>>  create mode 100644 include/uapi/linux/fs_event.h
>>>
>>> diff --git a/Documentation/filesystems/events.txt b/Documentation/filesystems/events.txt
>>> new file mode 100644
>>> index 0000000..c85dd88
>>> --- /dev/null
>>> +++ b/Documentation/filesystems/events.txt
>>> @@ -0,0 +1,254 @@
>>> +
>>> +	Generic file system event notification interface
>>> +
>>> +Document created 09 April 2015 by Beata Michalska <b.michalska@samsung.com>
>>> +
>>> +1. The reason behind:
>>> +=====================
>>> +
>>> +There are many corner cases when things might get messy with the filesystems.
>>> +And it is not always obvious what and when went wrong. Sometimes you might
>>> +get some subtle hints that there is something going on - but by the time
>>> +you realise it, it might be too late as you are already out-of-space
>>> +or the filesystem has been remounted as read-only (i.e.). The generic
>>> +interface for the filesystem events fills the gap by providing a rather
>>> +easy way of real-time notifications triggered whenever something intreseting
>>> +happens, allowing filesystems to report events in a common way, as they occur.
>>> +
>>> +2. How does it work:
>>> +====================
>>> +
>>> +The interface itself has been exposed as fstrace-type Virtual File System,
>>> +primarily to ease the process of setting up the configuration for the file
>>> +system notifications. So for starters it needs to get mounted (obviously):
>>> +
>>> +	mount -t fstrace none /sys/fs/events
>>> +
>>> +This will unveil the single fstrace filesystem entry - the 'config' file,
>>> +through which the notification are being set-up.
>>> +
>>> +Activating notifications for particular filesystem is as straightforward
>>> +as writing into the 'config' file. Note that by default all events despite
>>> +the actual filesystem type are being disregarded.
>>   Is there a reason to have a special filesystem for this? Do you expect
>> extending it by (many) more files? Why not just creating a file in sysfs or
>> something like that?
> 
> No particular reason here - just for possible future extension if needed.
> I'm totally fine with having a single sysfs entry.
> 

On the other hand .... sysfs entries are mostly single-valued or are sets
of values of a single type, so not sure if we would fit in here -
with the current configuration for the interface.

>>
>>> +Synopsis of config:
>>> +------------------
>>> +
>>> +	MOUNT EVENT_TYPE [L1] [L2]
>>> +
>>> + MOUNT      : the filesystem's mount point
>>   I'm not quite decided but is mountpoint really the right thing to pass
>> via the interface? They aren't unique (filesystem can be mounted in
>> multiple places) and more importantly can change over time. So won't it be
>> better to pass major:minor over the interface? These are stable, unique to
>> the filesystem, and userspace can easily get them by calling stat(2) on the
>> desired path (or directly from /proc/self/mountinfo). That could be also
>> used as an fs identifier instead of assigned ID (and thus we won't need
>> those events about creation of new trace which look somewhat strange to
>> me).
>>
> Even if a given filesystem is being mounted in many places this will come
> down to single super_block - the interface will add trace for the first mount
> point. This is just to ease the usage: internally a particular trace is 
> associated with a super_block. 
> 
>> OTOH using major:minor may have issues in container world where processes
>> could watch events from filesystems inaccessible to the container if they
>> guess the device number. So maybe we could use 'path' when creating new
>> trace but I'd still like to use the device number internally and for all
>> outgoing communication because of above mentioned problems with
>> mountpoints.
> 
> Alright then, so dropping the idea of announcing new trace (with assigned id)
> and switching to using the major:minor numbers. Sounds OK to me.
> 
>>
>>> + EVENT_TYPE : type of events to be enabled: info,warn,err,thr;
>>> +              at least one type needs to be specified;
>>> +              note the comma delimiter and lack of spaces between
>>> +	      those options
>>> + L1         : the threshold limit - lower range
>>> + L2         : the threshold limit - upper range
>>> + 	      case enabling threshold notifications the lower level is
>>> +	      mandatory, whereas the upper one remains optional;
>>> +	      note though, that as those refer to the number of available
>>> +	      blocks, the lower level needs to be higher than the upper one
>>> +
>>> +Sample request could look like the follwoing:
>>> +
>>> + echo /sample/mount/point warn,err,thr 710000 500000 > /sys/fs/events/config
>>> +
>>> +Multiple request might be specified provided they are separated with semicolon.
>>   Is this necessary? It somewhat complicates syntax and parsing in kernel
>> and I don't see a need for that. I'd prefer to keep the interface as simple
>> as possible.
>>
> 
> This is not necessary but could ease the usage - i.e. through scripting: to specify
> multiple traces and register them in one go. 
> 
>> Also I think that we should make it clear that each event type has
>> different set of arguments. For threshold events they'll be L1 & L2, for
>> other events there may be no arguments, for other events maybe something
>> else...
>>
> 
> Currently only the threshold events use arguments -  not sure what arguments
> could be used for the remaining notifications. But any suggestions are welcomed.
> 
>> ...
>>> +static const match_table_t fs_etypes = {
>>> +	{ FS_EVENT_INFO,    "info"  },
>>> +	{ FS_EVENT_WARN,    "warn"  },
>>> +	{ FS_EVENT_THRESH,  "thr"   },
>>> +	{ FS_EVENT_ERR,     "err"   },
>>> +	{ 0, NULL },
>>> +};
>>   Why are there these generic message types? Threshold messages make good
>> sense to me. But not so much the rest. If they don't have a clear meaning,
>> it will be a mess. So I also agree with a message like - "filesystem has
>> trouble, you should probably unmount and run fsck" - that's fine. But
>> generic "info" or "warning" doesn't really carry any meaning on its own and
>> thus seems pretty useless to me. To explain a bit more, AFAIU this
>> shouldn't be a generic logging interface where something like severity
>> makes sense but rather a relatively specific interface notifying about
>> events in filesystem userspace should know about so I expect relatively low
>> number of types of events, not tens or even hundreds...
>>
>> 								Honza
> 
> Getting rid of those would simplify the configuration part, indeed.
> So we would be left with 'generic' and threshold events.
> I guess I've overdone this part.
> 
> Thanks for Your comments so far.
> 
> BR
> Beata
> 
> 
> 
> 
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2015-04-17 13:15 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <1429082147-4151-1-git-send-email-b.michalska@samsung.com>
     [not found] ` <20150417081727.GB3116@quack.suse.cz>
2015-04-17  9:10   ` [RFC 0/4] Generic file system events interface Beata Michalska
     [not found] ` <1429082147-4151-2-git-send-email-b.michalska@samsung.com>
     [not found]   ` <552F308F.1050505@redhat.com>
     [not found]     ` <552F75D6.4030902@samsung.com>
     [not found]       ` <alpine.LSU.2.11.1504161229450.17935@eggly.anvils>
2015-04-17  9:10         ` [RFC 1/4] fs: Add generic file system event notifications Beata Michalska
     [not found]   ` <55302FFB.4010108@gmx.de>
     [not found]     ` <55302FFB.4010108-Mmb7MZpHnFY@public.gmane.org>
2015-04-17  9:46       ` Beata Michalska
     [not found]   ` <20150417113110.GD3116@quack.suse.cz>
2015-04-17 13:04     ` Beata Michalska
2015-04-17 13:15       ` Beata Michalska [this message]
     [not found]       ` <553104E5.2040704-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
2015-04-17 13:16         ` Jan Kara
2015-04-17 13:23       ` Austin S Hemmelgarn
2015-04-17 13:41         ` Jan Kara
2015-04-17 14:51         ` John Spray
2015-04-17 15:43           ` Jan Kara
     [not found]             ` <20150417154351.GA26736-+0h/O2h83AeN3ZZ/Hiejyg@public.gmane.org>
2015-04-17 16:08               ` John Spray
     [not found]                 ` <55312FEA.3030905-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-04-17 16:22                   ` Jan Kara
2015-04-17 16:29                     ` Austin S Hemmelgarn
     [not found]                       ` <553134D3.9040001-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-04-17 16:39                         ` Jan Kara
2015-04-17 17:37                     ` John Spray
2015-04-17 22:37                       ` Andreas Dilger
2015-04-17 16:25                   ` Beata Michalska

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55310776.20200@samsung.com \
    --to=b.michalska@samsung.com \
    --cc=adilger.kernel@dilger.ca \
    --cc=hch@infradead.org \
    --cc=hughd@google.com \
    --cc=jack@suse.cz \
    --cc=kmpark@infradead.org \
    --cc=kyungmin.park@samsung.com \
    --cc=lczerner@redhat.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).