From: Eric Paris <eparis@redhat.com>
To: Andreas Dilger <adilger@sun.com>
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	malware-list@dmesg.printk.net, Valdis.Kletnieks@vt.edu,
	greg@kroah.com, jcm@redhat.com, douglas.leeder@sophos.com,
	tytso@mit.edu, arjan@infradead.org, david@lang.hm,
	jengelh@medozas.de, aviro@redhat.com, mrkafk@gmail.com,
	alexl@redhat.com, jack@suse.cz, tvrtko.ursulin@sophos.com,
	a.p.zijlstra@chello.nl, hch@infradead.org,
	alan@lxorguk.ukuu.org.uk, mmorley@hcl.in, pavel@suse.cz
Subject: Re: fanotify - overall design before I start sending patches
Date: Fri, 24 Jul 2009 17:21:25 -0400
Message-ID: <1248470485.3567.106.camel@localhost>
In-Reply-To: <20090724210008.GE4231@webber.adilger.int>

On Fri, 2009-07-24 at 15:00 -0600, Andreas Dilger wrote:
> On Jul 24, 2009  16:13 -0400, Eric Paris wrote:
> > fanotify kernel/userspace interaction is over a new socket protocol.  A
> > listener opens a new socket in the new PF_FANOTIFY family.  The socket
> > is then bound to an address.  Using the following struct:
> 
> Would it make sense to use existing netlink?

I looked at netlink, but because fd creation has to be done in the
listener context I couldn't figure out how to make it suitable.

> > struct fanotify_addr {
> >         sa_family_t family;
> >         __u32 priority;
> >         __u32 group_num;
> >         __u32 mask;
> >         __u32 f_flags;
> >         __u32 unused[16];
> > }  __attribute__((packed));
> > 
> > The mask is the indication of the events this group is interested in.
> > It is the set of events of interest if FAN_GLOBAL_LISTENER is set at bind
> > time.  If FAN_GLOBAL_LISTENER is not set, this field is meaningless as
> > the registration of events on individual inodes will dictate the
> > reception of events.
> > 
> > * FAN_ACCESS: every file access.
> > * FAN_MODIFY: file modifications.
> > * FAN_CLOSE: files are closed.
> > * FAN_OPEN: open() calls.
> > * FAN_ACCESS_PERM: like FAN_ACCESS, except that the process trying to
> > access the file is put on hold while the fanotify client decides whether
> > to allow the operation.
> > * FAN_OPEN_PERM: like FAN_OPEN, but with the permission check.
> > * FAN_EVENT_ON_CHILD: receive notification of events on inodes inside
> > this subdirectory. (this is not a full recursive notification of all
> > descendants, only direct children)
> > * FAN_GLOBAL_LISTENER: notify for events on all files in the system.
> > * FAN_SURVIVE_MODIFY: special flag indicating that ignores should
> > survive inode modification.  Discussed below.
> 
> It seems like a 32-bit mask might not be enough; it wouldn't be hard
> at this stage to add a 64-bit mask.  Lustre has a similar mechanism
> (changelog) that allows tracking all different kinds of filesystem
> events (create/unlink/symlink/link/rename/mkdir/setxattr/etc), instead
> of just open/close.  It is also used by HSM, enhanced rsync, etc.

I had a 64-bit mask, but Al Viro asked me to go back to a 32-bit mask
because of i386 register pressure.  The bitmask operations are on VERY
hot paths inside the kernel.
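
To make the bind-time mask concrete, here is a minimal userspace sketch
of filling in the address struct quoted above and testing an event
against the 32-bit mask.  The numeric flag values and the two helper
functions are hypothetical (the proposal names the flags but gives no
numbers), and uint32_t stands in for __u32 so it builds outside the
kernel tree.  It does not open a socket, since PF_FANOTIFY is only a
proposal here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>   /* sa_family_t */

/* Hypothetical bit values, chosen only for illustration. */
#define FAN_ACCESS          0x00000001u
#define FAN_MODIFY          0x00000002u
#define FAN_CLOSE           0x00000004u
#define FAN_OPEN            0x00000008u
#define FAN_GLOBAL_LISTENER 0x00000010u

/* Layout as quoted in the proposal, with uint32_t in place of __u32. */
struct fanotify_addr {
	sa_family_t family;
	uint32_t priority;
	uint32_t group_num;
	uint32_t mask;
	uint32_t f_flags;
	uint32_t unused[16];
} __attribute__((packed));

/* Hypothetical helper: build a zeroed address with the given fields. */
static struct fanotify_addr fan_make_addr(sa_family_t family,
					  uint32_t priority,
					  uint32_t group_num,
					  uint32_t mask)
{
	struct fanotify_addr a;

	memset(&a, 0, sizeof(a));
	a.family = family;
	a.priority = priority;
	a.group_num = group_num;
	a.mask = mask;
	return a;
}

/*
 * Hypothetical helper: would the bind-time mask alone select this event?
 * Per the description, the mask only matters for global listeners;
 * otherwise per-inode registration decides, which is not modelled here.
 */
static int fan_wants_event(const struct fanotify_addr *a, uint32_t event)
{
	assert(a != NULL);
	if (!(a->mask & FAN_GLOBAL_LISTENER))
		return 0;
	return (a->mask & event) != 0;
}
```

The test is a single 32-bit AND, which is the kind of operation the
register-pressure concern above is about.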

> > struct fanotify_event_metadata {
> >         __u32 event_len;
> >         __s32 fd;
> >         __u32 mask;
> >         __u32 f_flags;
> >         __s32 pid;
> >         __s32 tgid;
> >         __u64 cookie;
> > }  __attribute__((packed));
> 
> Getting the attributes that have changed into this message is also
> useful, as it avoids a continual stream of "stat" calls on the inodes.

Hmmm, I'll take a look.  Do you have a good example of what you would
want to see?  I don't think we know in the notification hooks what
actually is being changed  :(
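
For reference, a rough sketch of how a listener might walk a buffer of
these metadata records.  It assumes records are packed back to back and
that event_len is the total length of each record; neither is spelled
out above, so treat both as assumptions.  fan_count_events() is a
made-up name, and uint32_t/int32_t/uint64_t stand in for the kernel
types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout as quoted in the proposal, with stdint types. */
struct fanotify_event_metadata {
	uint32_t event_len;
	int32_t fd;
	uint32_t mask;
	uint32_t f_flags;
	int32_t pid;
	int32_t tgid;
	uint64_t cookie;
} __attribute__((packed));

/*
 * Hypothetical helper: count the records in buf whose mask intersects
 * `wanted`.  Returns -1 if the buffer is truncated or a record carries
 * an implausible event_len.
 */
static int fan_count_events(const unsigned char *buf, size_t len,
			    uint32_t wanted)
{
	size_t off = 0;
	int count = 0;

	assert(buf != NULL || len == 0);
	while (off < len) {
		struct fanotify_event_metadata ev;

		if (len - off < sizeof(ev))
			return -1;
		/* Copy out so we never read through a misaligned pointer. */
		memcpy(&ev, buf + off, sizeof(ev));
		if (ev.event_len < sizeof(ev) || ev.event_len > len - off)
			return -1;
		if (ev.mask & wanted)
			count++;
		off += ev.event_len;
	}
	return count;
}
```

Stepping by event_len rather than sizeof() would let extra per-event
data (such as changed attributes) be appended later without breaking
older listeners, provided the assumption about event_len holds.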

> The other thing that is important for HSM is that this log is atomic
> and persistent, otherwise there may be files that are missed if the
> node crashes.  This involves creating atomic update records as part
> of the filesystem operation, and then userspace consumes them and
> tells the kernel that it is finished with records up to X.  Otherwise
> you risk inconsistencies between rsync/HSM/updatedb for files that
> are updated just before a crash.

Uhhh, persistent across a crash?  Nope, don't have that.  Notification
is all in memory.  Can't I just put the onus on userspace to recheck
things maybe?  Sounds like a user for i_version....

> > If a FAN_ACCESS_PERM or FAN_OPEN_PERM event is received the listener
> > must send a response before the 5 second timeout.  If no response is
> > sent before the 5 second timeout the original operation is allowed.  If
> > this happens too many times (10 in a row) the fanotify group is evicted
> > from the kernel and will not get any new events.
> 
> This should be a tunable, since if the intent is to monitor PERM checks
> it would be possible for users to DoS the machine, delay the userspace
> programs, and access files they shouldn't be able to.

At the moment I cheat and say root only to bind.  I do plan to open it
up to non-root users after it's in and working, but I'm seriously
considering leaving _PERM events as root only.  It's hard to map the
original to listener security implications.  So making sure the listener
is always root is easy   :)

Userspace would never be able to access a file it shouldn't be allowed
to (the new fd is created in the context of the listener and EPERM is
possible.)

