* [PATCH 2/9] vfs: introduce FMODE_NONOTIFY
2009-08-28 18:55 [PATCH 1/9] task_struct: add PF_NONOTIFY for fanotify to use Eric Paris
@ 2009-08-28 18:55 ` Eric Paris
2009-08-28 18:55 ` [PATCH 3/9] networking/fanotify: declare fanotify socket numbers Eric Paris
` (7 subsequent siblings)
8 siblings, 0 replies; 13+ messages in thread
From: Eric Paris @ 2009-08-28 18:55 UTC (permalink / raw)
To: linux-kernel, linux-fsdevel, netdev; +Cc: davem, viro, alan, hch
This is a new f_mode which can only be set by the kernel. It indicates
that the fd was opened by fanotify and should not cause future fanotify
events. This is needed to prevent fanotify livelock. An obvious example
of livelock comes from fanotify close events:
Process A closes file1
This creates a close event for file1.
fanotify opens file1 for Listener X
Listener X deals with the event and closes its fd for file1.
This creates a close event for file1.
fanotify opens file1 for Listener X
Listener X deals with the event and closes its fd for file1.
This creates a close event for file1.
fanotify opens file1 for Listener X
Listener X deals with the event and closes its fd for file1.
Notice a pattern?
The fix is to add the FMODE_NONOTIFY bit to the struct file (filp) that the
kernel opens for fanotify. Thus when that file is used it will not generate
future events.
This patch simply defines the bit.
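For illustration only (this is not part of the patch), the loop-breaking idea can be sketched in plain C. The sketch assumes a simplified fmode_t and a stand-in struct; the names sketch_file, sketch_mark_listener_fd and sketch_may_notify are hypothetical, and only the FMODE_* values come from the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values copied from the patch; fmode_t is simplified to unsigned int. */
typedef unsigned int fmode_t;
#define FMODE_NOCMTIME ((fmode_t)2048)
#define FMODE_NONOTIFY ((fmode_t)4096)

/* Hypothetical stand-in for struct file: only the mode bits matter here. */
struct sketch_file {
	fmode_t f_mode;
};

/* Hypothetical helper: tag a file that was opened on behalf of a listener. */
static void sketch_mark_listener_fd(struct sketch_file *f)
{
	assert(f != NULL);
	f->f_mode |= FMODE_NONOTIFY;
}

/* Returns true when an operation on this file may generate an event. */
static bool sketch_may_notify(const struct sketch_file *f)
{
	assert(f != NULL);
	return !(f->f_mode & FMODE_NONOTIFY);
}
```

With this check in place, the listener's close of its own fd no longer produces a close event, so the cycle described above stops after the first notification.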
Signed-off-by: Eric Paris <eparis@redhat.com>
---
include/linux/fs.h | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 6e3a32d..c3d7b8b 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -87,6 +87,9 @@ struct inodes_stat_t {
*/
#define FMODE_NOCMTIME ((__force fmode_t)2048)
+/* File was opened by fanotify and shouldn't generate fanotify events */
+#define FMODE_NONOTIFY ((__force fmode_t)4096)
+
/*
* The below are the various read and write types that we support. Some of
* them include behavioral modifiers that send information down to the
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 3/9] networking/fanotify: declare fanotify socket numbers
2009-08-28 18:55 [PATCH 1/9] task_struct: add PF_NONOTIFY for fanotify to use Eric Paris
2009-08-28 18:55 ` [PATCH 2/9] vfs: introduce FMODE_NONOTIFY Eric Paris
@ 2009-08-28 18:55 ` Eric Paris
2009-08-28 18:56 ` [PATCH 4/9] fanotify: fscking all notification system Eric Paris
` (6 subsequent siblings)
8 siblings, 0 replies; 13+ messages in thread
From: Eric Paris @ 2009-08-28 18:55 UTC (permalink / raw)
To: linux-kernel, linux-fsdevel, netdev; +Cc: davem, viro, alan, hch
fanotify's user interface uses a custom socket (it doesn't use netlink
since work must be done in the context of the receive side of the socket).
This patch simply defines the fanotify socket number declarations. The
actual implementation of the socket is in a later patch.
Signed-off-by: Eric Paris <eparis@redhat.com>
---
include/linux/socket.h | 5 ++++-
net/core/sock.c | 6 +++---
2 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/include/linux/socket.h b/include/linux/socket.h
index 3b461df..e03f47b 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -195,7 +195,8 @@ struct ucred {
#define AF_ISDN 34 /* mISDN sockets */
#define AF_PHONET 35 /* Phonet sockets */
#define AF_IEEE802154 36 /* IEEE802154 sockets */
-#define AF_MAX 37 /* For now.. */
+#define AF_FANOTIFY 37 /* fscking all access sockets */
+#define AF_MAX 38 /* For now.. */
/* Protocol families, same as address families. */
#define PF_UNSPEC AF_UNSPEC
@@ -235,6 +236,7 @@ struct ucred {
#define PF_ISDN AF_ISDN
#define PF_PHONET AF_PHONET
#define PF_IEEE802154 AF_IEEE802154
+#define PF_FANOTIFY AF_FANOTIFY
#define PF_MAX AF_MAX
/* Maximum queue length specifiable by listen. */
@@ -306,6 +308,7 @@ struct ucred {
#define SOL_PNPIPE 275
#define SOL_RDS 276
#define SOL_IUCV 277
+#define SOL_FANOTIFY 278
/* IPX options */
#define IPX_TYPE 1
diff --git a/net/core/sock.c b/net/core/sock.c
index 3ac34ea..1259525 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -155,7 +155,7 @@ static const char *const af_family_key_strings[AF_MAX+1] = {
"sk_lock-27" , "sk_lock-28" , "sk_lock-AF_CAN" ,
"sk_lock-AF_TIPC" , "sk_lock-AF_BLUETOOTH", "sk_lock-IUCV" ,
"sk_lock-AF_RXRPC" , "sk_lock-AF_ISDN" , "sk_lock-AF_PHONET" ,
- "sk_lock-AF_IEEE802154",
+ "sk_lock-AF_IEEE802154", "sk_lock-AF_FANOTIFY",
"sk_lock-AF_MAX"
};
static const char *const af_family_slock_key_strings[AF_MAX+1] = {
@@ -171,7 +171,7 @@ static const char *const af_family_slock_key_strings[AF_MAX+1] = {
"slock-27" , "slock-28" , "slock-AF_CAN" ,
"slock-AF_TIPC" , "slock-AF_BLUETOOTH", "slock-AF_IUCV" ,
"slock-AF_RXRPC" , "slock-AF_ISDN" , "slock-AF_PHONET" ,
- "slock-AF_IEEE802154",
+ "slock-AF_IEEE802154", "slock-AF_FANOTIFY",
"slock-AF_MAX"
};
static const char *const af_family_clock_key_strings[AF_MAX+1] = {
@@ -187,7 +187,7 @@ static const char *const af_family_clock_key_strings[AF_MAX+1] = {
"clock-27" , "clock-28" , "clock-AF_CAN" ,
"clock-AF_TIPC" , "clock-AF_BLUETOOTH", "clock-AF_IUCV" ,
"clock-AF_RXRPC" , "clock-AF_ISDN" , "clock-AF_PHONET" ,
- "clock-AF_IEEE802154",
+ "clock-AF_IEEE802154", "clock-AF_FANOTIFY",
"clock-AF_MAX"
};
* [PATCH 4/9] fanotify: fscking all notification system
2009-08-28 18:55 [PATCH 1/9] task_struct: add PF_NONOTIFY for fanotify to use Eric Paris
2009-08-28 18:55 ` [PATCH 2/9] vfs: introduce FMODE_NONOTIFY Eric Paris
2009-08-28 18:55 ` [PATCH 3/9] networking/fanotify: declare fanotify socket numbers Eric Paris
@ 2009-08-28 18:56 ` Eric Paris
2009-08-28 18:56 ` [PATCH 5/9] fanotify:drop notification if they exist in the outgoing queue Eric Paris
` (5 subsequent siblings)
8 siblings, 0 replies; 13+ messages in thread
From: Eric Paris @ 2009-08-28 18:56 UTC (permalink / raw)
To: linux-kernel, linux-fsdevel, netdev; +Cc: davem, viro, alan, hch
fanotify is a novel file notification system which bases notification on
giving userspace both an event type (open, close, read, write) and an open
file descriptor to the object in question. This should address a number of
races and problems with other notification systems like inotify and dnotify
and should allow the future implementation of blocking or access controlled
notification. These are useful for on-access scanners or hierarchical storage
management schemes.
This patch just implements the basics of the fsnotify functions.
Signed-off-by: Eric Paris <eparis@redhat.com>
---
fs/notify/Kconfig | 1
fs/notify/Makefile | 1
fs/notify/fanotify/Kconfig | 11 +++++
fs/notify/fanotify/Makefile | 1
fs/notify/fanotify/fanotify.c | 90 +++++++++++++++++++++++++++++++++++++++++
fs/notify/fanotify/fanotify.h | 12 +++++
include/linux/Kbuild | 1
include/linux/fanotify.h | 40 ++++++++++++++++++
8 files changed, 157 insertions(+), 0 deletions(-)
create mode 100644 fs/notify/fanotify/Kconfig
create mode 100644 fs/notify/fanotify/Makefile
create mode 100644 fs/notify/fanotify/fanotify.c
create mode 100644 fs/notify/fanotify/fanotify.h
create mode 100644 include/linux/fanotify.h
diff --git a/fs/notify/Kconfig b/fs/notify/Kconfig
index dffbb09..22c629e 100644
--- a/fs/notify/Kconfig
+++ b/fs/notify/Kconfig
@@ -3,3 +3,4 @@ config FSNOTIFY
source "fs/notify/dnotify/Kconfig"
source "fs/notify/inotify/Kconfig"
+source "fs/notify/fanotify/Kconfig"
diff --git a/fs/notify/Makefile b/fs/notify/Makefile
index 0922cc8..396a387 100644
--- a/fs/notify/Makefile
+++ b/fs/notify/Makefile
@@ -2,3 +2,4 @@ obj-$(CONFIG_FSNOTIFY) += fsnotify.o notification.o group.o inode_mark.o
obj-y += dnotify/
obj-y += inotify/
+obj-y += fanotify/
diff --git a/fs/notify/fanotify/Kconfig b/fs/notify/fanotify/Kconfig
new file mode 100644
index 0000000..70631ed
--- /dev/null
+++ b/fs/notify/fanotify/Kconfig
@@ -0,0 +1,11 @@
+config FANOTIFY
+ bool "Filesystem wide access notification"
+ select FSNOTIFY
+ default y
+ ---help---
+ Say Y here to enable fanotify support. fanotify is a system wide
+ file access notification interface. Events are read from a
+ socket and in doing so an fd is created in the reading process
+ which points to the same data as the one on which the event occurred.
+
+ If unsure, say Y.
diff --git a/fs/notify/fanotify/Makefile b/fs/notify/fanotify/Makefile
new file mode 100644
index 0000000..e7d39c0
--- /dev/null
+++ b/fs/notify/fanotify/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_FANOTIFY) += fanotify.o
diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
new file mode 100644
index 0000000..59bc883
--- /dev/null
+++ b/fs/notify/fanotify/fanotify.c
@@ -0,0 +1,90 @@
+#include <linux/fdtable.h>
+#include <linux/fsnotify_backend.h>
+#include <linux/init.h>
+#include <linux/kernel.h> /* UINT_MAX */
+#include <linux/net.h> /* struct socket */
+#include <linux/sched.h> /* task_struct */
+#include <linux/types.h>
+
+#include "fanotify.h"
+
+static int fanotify_handle_event(struct fsnotify_group *group, struct fsnotify_event *event)
+{
+ int ret;
+
+ BUILD_BUG_ON(FAN_ACCESS != FS_ACCESS);
+ BUILD_BUG_ON(FAN_MODIFY != FS_MODIFY);
+ BUILD_BUG_ON(FAN_CLOSE_NOWRITE != FS_CLOSE_NOWRITE);
+ BUILD_BUG_ON(FAN_CLOSE_WRITE != FS_CLOSE_WRITE);
+ BUILD_BUG_ON(FAN_OPEN != FS_OPEN);
+ BUILD_BUG_ON(FAN_EVENT_ON_CHILD != FS_EVENT_ON_CHILD);
+ BUILD_BUG_ON(FAN_Q_OVERFLOW != FS_Q_OVERFLOW);
+
+ ret = fsnotify_add_notify_event(group, event, NULL, NULL);
+
+ return ret;
+}
+
+static bool fanotify_should_send_event(struct fsnotify_group *group, struct inode *inode,
+ __u32 mask, void *data, int data_type)
+{
+ struct fsnotify_mark_entry *entry;
+ bool send;
+
+ /* if we are in an open operation do not send events to fanotify */
+ if (current->flags & PF_NONOTIFY)
+ return false;
+
+ /* sorry, fanotify only gives a damn about files and dirs */
+ if (!S_ISREG(inode->i_mode) &&
+ !S_ISDIR(inode->i_mode))
+ return false;
+
+ /* if we don't have enough info to send an event to userspace say no */
+ if ((data_type != FSNOTIFY_EVENT_FILE) &&
+ (data_type != FSNOTIFY_EVENT_PATH))
+ return false;
+
+ /* if this file was opened by fanotify don't send events about it */
+ if (data_type == FSNOTIFY_EVENT_FILE) {
+ struct file *file;
+
+ file = (struct file *)data;
+ if (file->f_mode & FMODE_NONOTIFY)
+ return false;
+ }
+
+ spin_lock(&inode->i_lock);
+ entry = fsnotify_find_mark_entry(group, inode);
+ spin_unlock(&inode->i_lock);
+ if (!entry)
+ return false;
+
+ /* if the event is for a child and this inode doesn't care about
+ * events on the child, don't send it! */
+ if ((mask & FS_EVENT_ON_CHILD) &&
+ !(entry->mask & FS_EVENT_ON_CHILD))
+ send = false;
+ else {
+ if (!(entry->mask & FS_EVENT_ON_CHILD) &&
+ (mask & FS_EVENT_ON_CHILD))
+ send = false;
+ else {
+ mask = (mask & ~FS_EVENT_ON_CHILD);
+ send = (entry->mask & mask);
+ }
+ }
+
+ /* find took a reference */
+ fsnotify_put_mark(entry);
+
+ return send;
+}
+
+const struct fsnotify_ops fanotify_ops = {
+ .handle_event = fanotify_handle_event,
+ .should_send_event = fanotify_should_send_event,
+ .free_group_priv = NULL,
+ .free_event_priv = NULL,
+ .freeing_mark = NULL,
+};
diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
new file mode 100644
index 0000000..a8785c1
--- /dev/null
+++ b/fs/notify/fanotify/fanotify.h
@@ -0,0 +1,12 @@
+#include <linux/fanotify.h>
+#include <linux/fsnotify_backend.h>
+#include <linux/net.h>
+#include <linux/kernel.h>
+#include <linux/types.h>
+
+static inline bool fanotify_is_mask_valid(__u32 mask)
+{
+ if (mask & ~(FAN_ALL_INCOMING_EVENTS))
+ return false;
+ return true;
+}
diff --git a/include/linux/Kbuild b/include/linux/Kbuild
index e7d84ff..b298c0e 100644
--- a/include/linux/Kbuild
+++ b/include/linux/Kbuild
@@ -206,6 +206,7 @@ unifdef-y += ethtool.h
unifdef-y += eventpoll.h
unifdef-y += signalfd.h
unifdef-y += ext2_fs.h
+unifdef-y += fanotify.h
unifdef-y += fb.h
unifdef-y += fcntl.h
unifdef-y += filter.h
diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
new file mode 100644
index 0000000..b560f86
--- /dev/null
+++ b/include/linux/fanotify.h
@@ -0,0 +1,40 @@
+#ifndef _LINUX_FANOTIFY_H
+#define _LINUX_FANOTIFY_H
+
+#include <linux/types.h>
+
+/* the following events that user-space can register for */
+#define FAN_ACCESS 0x00000001 /* File was accessed */
+#define FAN_MODIFY 0x00000002 /* File was modified */
+#define FAN_CLOSE_WRITE 0x00000008 /* Writable file closed */
+#define FAN_CLOSE_NOWRITE 0x00000010 /* Non-writable file closed */
+#define FAN_OPEN 0x00000020 /* File was opened */
+
+#define FAN_EVENT_ON_CHILD 0x08000000 /* interested in child events */
+
+/* FIXME currently Q's have no limit.... */
+#define FAN_Q_OVERFLOW 0x00004000 /* Event queue overflowed */
+
+/* helper events */
+#define FAN_CLOSE (FAN_CLOSE_WRITE | FAN_CLOSE_NOWRITE) /* close */
+
+/*
+ * All of the events - we build the list by hand so that we can add flags in
+ * the future and not break backward compatibility. Apps will get only the
+ * events that they originally wanted. Be sure to add new events here!
+ */
+#define FAN_ALL_EVENTS (FAN_ACCESS |\
+ FAN_MODIFY |\
+ FAN_CLOSE |\
+ FAN_OPEN)
+
+/*
+ * All legal FAN bits userspace can request (although possibly not all
+ * at the same time).
+ */
+#define FAN_ALL_INCOMING_EVENTS (FAN_ALL_EVENTS |\
+ FAN_EVENT_ON_CHILD)
+#ifdef __KERNEL__
+
+#endif /* __KERNEL__ */
+#endif /* _LINUX_FANOTIFY_H */
* [PATCH 5/9] fanotify:drop notification if they exist in the outgoing queue
2009-08-28 18:55 [PATCH 1/9] task_struct: add PF_NONOTIFY for fanotify to use Eric Paris
` (2 preceding siblings ...)
2009-08-28 18:56 ` [PATCH 4/9] fanotify: fscking all notification system Eric Paris
@ 2009-08-28 18:56 ` Eric Paris
2009-08-28 18:56 ` [PATCH 6/9] fanotify: merge notification events with different masks Eric Paris
` (4 subsequent siblings)
8 siblings, 0 replies; 13+ messages in thread
From: Eric Paris @ 2009-08-28 18:56 UTC (permalink / raw)
To: linux-kernel, linux-fsdevel, netdev; +Cc: davem, viro, alan, hch
fanotify listeners get an open file descriptor to the object in question so
the ordering of operations is not as important as in other notification
systems. inotify will drop events if the last event in the event FIFO is
the same as the current event. This patch will drop fanotify events if
they are the same as another event anywhere in the event FIFO.
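As a rough sketch of the difference from inotify (not the kernel code itself), the following C fragment drops an incoming event when an identical one is pending anywhere in the queue, not only at the tail. The array-backed queue, the object_id field and all sketch_* names are assumptions made for the example; the real code walks a list of fsnotify_event_holder entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_QUEUE_MAX 8

/* Hypothetical, simplified event: which object it refers to, and the mask. */
struct sketch_event {
	int object_id;		/* stands in for the inode/path identity */
	unsigned int mask;
};

struct sketch_queue {
	struct sketch_event ev[SKETCH_QUEUE_MAX];
	size_t len;
};

/*
 * Queue an event unless an identical one is already pending anywhere in
 * the queue. Returns true if queued, false if dropped (duplicate or full).
 */
static bool sketch_enqueue(struct sketch_queue *q, struct sketch_event e)
{
	assert(q != NULL);

	/* Scan newest to oldest, as the patch does with a reverse list walk. */
	for (size_t i = q->len; i > 0; i--) {
		const struct sketch_event *old = &q->ev[i - 1];

		if (old->object_id == e.object_id && old->mask == e.mask)
			return false;
	}
	if (q->len == SKETCH_QUEUE_MAX)
		return false;
	q->ev[q->len++] = e;
	return true;
}
```

Scanning the whole queue is acceptable here because, as noted above, listeners receive an open fd and so do not depend on strict event ordering.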
Signed-off-by: Eric Paris <eparis@redhat.com>
---
fs/notify/fanotify/fanotify.c | 40 ++++++++++++++++++++++++++++++++++++++--
1 files changed, 38 insertions(+), 2 deletions(-)
diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 59bc883..caf34bb 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -8,6 +8,40 @@
#include "fanotify.h"
+static bool try_merge(struct fsnotify_event *old, struct fsnotify_event *new)
+{
+ if ((old->mask == new->mask) &&
+ (old->to_tell == new->to_tell) &&
+ (old->data_type == new->data_type)) {
+ switch (old->data_type) {
+ case (FSNOTIFY_EVENT_PATH):
+ /* decide here; a mismatch must not fall through to the next case */
+ return ((old->path.mnt == new->path.mnt) &&
+ (old->path.dentry == new->path.dentry));
+ case (FSNOTIFY_EVENT_NONE):
+ return true;
+ default:
+ BUG();
+ };
+ }
+ return false;
+}
+
+static int fanotify_merge(struct list_head *list, struct fsnotify_event *event)
+{
+ struct fsnotify_event_holder *holder;
+ struct fsnotify_event *test_event;
+
+ /* and the list better be locked by something too! */
+
+ list_for_each_entry_reverse(holder, list, event_list) {
+ test_event = holder->event;
+ if (try_merge(test_event, event))
+ return -EEXIST;
+ }
+
+ return 0;
+}
static int fanotify_handle_event(struct fsnotify_group *group, struct fsnotify_event *event)
{
int ret;
@@ -20,8 +54,10 @@ static int fanotify_handle_event(struct fsnotify_group *group, struct fsnotify_e
BUILD_BUG_ON(FAN_EVENT_ON_CHILD != FS_EVENT_ON_CHILD);
BUILD_BUG_ON(FAN_Q_OVERFLOW != FS_Q_OVERFLOW);
- ret = fsnotify_add_notify_event(group, event, NULL, NULL);
-
+ ret = fsnotify_add_notify_event(group, event, NULL, fanotify_merge);
+ /* -EEXIST means this event was merged with another, not that it was an error */
+ if (ret == -EEXIST)
+ ret = 0;
return ret;
}
* [PATCH 6/9] fanotify: merge notification events with different masks
2009-08-28 18:55 [PATCH 1/9] task_struct: add PF_NONOTIFY for fanotify to use Eric Paris
` (3 preceding siblings ...)
2009-08-28 18:56 ` [PATCH 5/9] fanotify:drop notification if they exist in the outgoing queue Eric Paris
@ 2009-08-28 18:56 ` Eric Paris
2009-08-28 18:56 ` [PATCH 7/9] fanotify: userspace socket Eric Paris
` (3 subsequent siblings)
8 siblings, 0 replies; 13+ messages in thread
From: Eric Paris @ 2009-08-28 18:56 UTC (permalink / raw)
To: linux-kernel, linux-fsdevel, netdev; +Cc: davem, viro, alan, hch
Instead of just merging fanotify events if they are exactly the same, merge
notification events with different masks. To do this we have to clone the
old event, update the mask in the new event with the new merged mask, and
put the new event in place of the old event.
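The merge rule can be sketched as follows. This is a simplification under stated assumptions: the sketch updates the pending event in place, whereas the patch clones the old event and swaps the clone into the queue. The sketch_event type and sketch_merge function are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified event: which object it refers to, and the mask. */
struct sketch_event {
	int object_id;		/* stands in for the inode/path identity */
	unsigned int mask;
};

/*
 * If an event for the same object is already pending, fold the incoming
 * mask into it and return true. Return false when the caller should
 * append the incoming event as a new queue entry.
 */
static bool sketch_merge(struct sketch_event *pending, size_t len,
			 const struct sketch_event *incoming)
{
	assert(pending != NULL && incoming != NULL);

	/* Walk newest to oldest, mirroring the reverse list walk. */
	for (size_t i = len; i > 0; i--) {
		struct sketch_event *old = &pending[i - 1];

		if (old->object_id != incoming->object_id)
			continue;
		old->mask |= incoming->mask;	/* union of both event types */
		return true;
	}
	return false;
}
```

For example, a pending FAN_OPEN (0x20) for an object followed by FAN_CLOSE_NOWRITE (0x10) for the same object would leave a single pending event with mask 0x30.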
Signed-off-by: Eric Paris <eparis@redhat.com>
---
fs/notify/fanotify/fanotify.c | 24 ++++++++++++++++++------
1 files changed, 18 insertions(+), 6 deletions(-)
diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index caf34bb..e8e56cb 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -10,8 +10,7 @@
static bool try_merge(struct fsnotify_event *old, struct fsnotify_event *new)
{
- if ((old->mask == new->mask) &&
- (old->to_tell == new->to_tell) &&
+ if ((old->to_tell == new->to_tell) &&
(old->data_type == new->data_type)) {
switch (old->data_type) {
case (FSNOTIFY_EVENT_PATH):
@@ -29,15 +28,28 @@ static bool try_merge(struct fsnotify_event *old, struct fsnotify_event *new)
static int fanotify_merge(struct list_head *list, struct fsnotify_event *event)
{
- struct fsnotify_event_holder *holder;
+ struct fsnotify_event_holder *test_holder, *prev;
struct fsnotify_event *test_event;
+ struct fsnotify_event *new_event;
+ int ret;
/* and the list better be locked by something too! */
- list_for_each_entry_reverse(holder, list, event_list) {
- test_event = holder->event;
- if (try_merge(test_event, event))
+ list_for_each_entry_safe_reverse(test_holder, prev, list, event_list) {
+ test_event = test_holder->event;
+ if (try_merge(test_event, event)) {
+ if (test_event->mask == event->mask)
+ return -EEXIST;
+ new_event = fsnotify_clone_event(test_event);
+ if (!new_event)
+ return 0;
+ new_event->mask = (test_event->mask | event->mask);
+ ret = fsnotify_replace_event(test_holder, new_event);
+ fsnotify_put_event(new_event); /* matches the ref from clone */
+ if (ret)
+ return ret;
return -EEXIST;
+ }
}
return 0;
* [PATCH 7/9] fanotify: userspace socket
2009-08-28 18:55 [PATCH 1/9] task_struct: add PF_NONOTIFY for fanotify to use Eric Paris
` (4 preceding siblings ...)
2009-08-28 18:56 ` [PATCH 6/9] fanotify: merge notification events with different masks Eric Paris
@ 2009-08-28 18:56 ` Eric Paris
2009-08-28 18:56 ` [PATCH 8/9] fanotify: userspace can add and remove fsnotify inode marks Eric Paris
` (2 subsequent siblings)
8 siblings, 0 replies; 13+ messages in thread
From: Eric Paris @ 2009-08-28 18:56 UTC (permalink / raw)
To: linux-kernel, linux-fsdevel, netdev; +Cc: davem, viro, alan, hch
This patch implements a userspace interface for the fanotify notification
system. An fanotify socket is created in userspace and is 'bound' to an
address. That bind call actually creates the new fanotify listener, much like
inotify_init() creates an inotify instance.
Requests for notification of events on certain fs objects are made using a
setsockopt() call (not implemented in this patch). This setsockopt() call is
largely analogous to inotify_add_watch().
Events are retrieved from the kernel by calling read() on the bound socket.
This interface is designed to be forward looking, as the kernel/userspace
interaction can be changed simply by implementing a new getsockopt option.
Macros are provided, much like the netlink macros, to allow the messages
from the kernel to userspace to change in length in the future while
maintaining backwards compatibility.
This patch only implements the socket registration and the bind call. The
getsockopt() calls and data read call are implemented in later patches.
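A listener might set itself up roughly as sketched below. This is an assumption-laden illustration: the constants and the address layout are copied from this series and are prefixed with SKETCH_/sketch_ because they are not a stable ABI, and the number 37 may mean a different address family on other kernels. sketch_fill_addr and sketch_open_listener are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Values taken from this series; illustrative only, not a stable ABI. */
#define SKETCH_AF_FANOTIFY	37
#define SKETCH_FAN_CLOSE_WRITE	0x00000008
#define SKETCH_FAN_OPEN		0x00000020

/* Mirrors struct fanotify_addr from the patch, with fixed-width types. */
struct sketch_fanotify_addr {
	sa_family_t family;
	uint32_t priority;	/* unused */
	uint32_t mask_hi;	/* unused */
	uint32_t mask;
	uint32_t f_flags;	/* unused */
	uint32_t unused[16];
} __attribute__((packed));

/* Zero the address and fill in the two fields the bind call looks at. */
static void sketch_fill_addr(struct sketch_fanotify_addr *addr, uint32_t mask)
{
	assert(addr != NULL);
	memset(addr, 0, sizeof(*addr));
	addr->family = SKETCH_AF_FANOTIFY;
	addr->mask = mask;
}

/* Create and bind a listener socket; returns the fd, or -1 on failure. */
int sketch_open_listener(uint32_t mask)
{
	struct sketch_fanotify_addr addr;
	int fd;

	sketch_fill_addr(&addr, mask);
	fd = socket(SKETCH_AF_FANOTIFY, SOCK_RAW, 0);
	if (fd < 0)
		return -1;
	/* bind() is what creates the fanotify group in this design */
	if (bind(fd, (const struct sockaddr *)(const void *)&addr,
		 sizeof(addr)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Note that fan_bind() below rejects any address whose length differs from sizeof(struct fanotify_addr), so the whole structure has to be passed even though most fields are unused.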
Signed-off-by: Eric Paris <eparis@redhat.com>
---
fs/notify/fanotify/Makefile | 2 -
fs/notify/fanotify/af_fanotify.c | 152 ++++++++++++++++++++++++++++++++++++++
fs/notify/fanotify/af_fanotify.h | 21 +++++
fs/notify/fanotify/fanotify.h | 2 +
include/linux/fanotify.h | 19 +++++
5 files changed, 195 insertions(+), 1 deletions(-)
create mode 100644 fs/notify/fanotify/af_fanotify.c
create mode 100644 fs/notify/fanotify/af_fanotify.h
diff --git a/fs/notify/fanotify/Makefile b/fs/notify/fanotify/Makefile
index e7d39c0..1196005 100644
--- a/fs/notify/fanotify/Makefile
+++ b/fs/notify/fanotify/Makefile
@@ -1 +1 @@
-obj-$(CONFIG_FANOTIFY) += fanotify.o
+obj-$(CONFIG_FANOTIFY) += fanotify.o af_fanotify.o
diff --git a/fs/notify/fanotify/af_fanotify.c b/fs/notify/fanotify/af_fanotify.c
new file mode 100644
index 0000000..d7bf658
--- /dev/null
+++ b/fs/notify/fanotify/af_fanotify.c
@@ -0,0 +1,152 @@
+#include <linux/errno.h>
+#include <linux/fdtable.h>
+#include <linux/file.h>
+#include <linux/fsnotify_backend.h>
+#include <linux/init.h>
+#include <linux/kernel.h> /* UINT_MAX */
+#include <linux/mount.h> /* mntget() */
+#include <linux/net.h>
+#include <linux/skbuff.h>
+#include <linux/socket.h>
+#include <linux/types.h>
+
+#include <net/net_namespace.h>
+#include <net/sock.h>
+
+#include "fanotify.h"
+#include "af_fanotify.h"
+
+static const struct proto_ops fanotify_proto_ops;
+
+static struct proto fanotify_proto = {
+ .name = "FANOTIFY",
+ .owner = THIS_MODULE,
+ .obj_size = sizeof(struct fanotify_sock),
+};
+
+static int fan_sock_create(struct net *net, struct socket *sock, int protocol)
+{
+ struct sock *sk;
+ struct fanotify_sock *fan_sock;
+
+ /* FIXME maybe a new LSM hook? */
+ if (!capable(CAP_NET_RAW))
+ return -EPERM;
+
+ if (protocol != 0)
+ return -ESOCKTNOSUPPORT;
+
+ if (sock->type != SOCK_RAW)
+ return -ESOCKTNOSUPPORT;
+
+ sock->state = SS_UNCONNECTED;
+
+ sk = sk_alloc(net, PF_FANOTIFY, GFP_KERNEL, &fanotify_proto);
+ if (sk == NULL)
+ return -ENOBUFS;
+
+ sock->ops = &fanotify_proto_ops;
+
+ sock_init_data(sock, sk);
+
+ sk->sk_family = PF_FANOTIFY;
+ sk_refcnt_debug_inc(sk);
+
+ fan_sock = fan_sk(sk);
+ fan_sock->group = NULL;
+
+ return 0;
+}
+
+static int fan_release(struct socket *sock)
+{
+ struct sock *sk;
+ struct fanotify_sock *fan_sock;
+
+ sk = sock->sk;
+ if (!sk)
+ return 0;
+
+ fan_sock = fan_sk(sk);
+
+ if (sock->state == SS_CONNECTED) {
+ sock->state = SS_UNCONNECTED;
+ fsnotify_put_group(fan_sock->group);
+ }
+
+ fan_sock->group = NULL;
+
+ sock_orphan(sk);
+ sock->sk = NULL;
+
+ sk_refcnt_debug_release(sk);
+
+ sock_put(sk);
+
+ return 0;
+}
+
+static int fan_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
+{
+ struct fanotify_addr *fan_addr = (struct fanotify_addr *)addr;
+ struct fanotify_sock *fan_sock;
+
+ if (addr_len != sizeof(struct fanotify_addr))
+ return -EINVAL;
+
+ if (sock->state != SS_UNCONNECTED)
+ return -EINVAL;
+
+ if (!fanotify_is_mask_valid(fan_addr->mask))
+ return -EINVAL;
+
+ fan_sock = fan_sk(sock->sk);
+ fan_sock->group = fsnotify_obtain_group(fan_addr->mask, &fanotify_ops);
+
+ if (IS_ERR(fan_sock->group))
+ return PTR_ERR(fan_sock->group);
+
+ fan_sock->group->max_events = 16383;
+
+ sock->state = SS_CONNECTED;
+
+ return 0;
+}
+
+static const struct net_proto_family fanotify_family_ops = {
+ .family = PF_FANOTIFY,
+ .create = fan_sock_create,
+ .owner = THIS_MODULE,
+};
+
+static const struct proto_ops fanotify_proto_ops = {
+ .family = PF_FANOTIFY,
+ .owner = THIS_MODULE,
+ .release = fan_release,
+ .bind = fan_bind,
+ .connect = sock_no_connect,
+ .socketpair = sock_no_socketpair,
+ .accept = sock_no_accept,
+ .getname = sock_no_getname,
+ .poll = sock_no_poll,
+ .ioctl = sock_no_ioctl,
+ .listen = sock_no_listen,
+ .shutdown = sock_no_shutdown,
+ .setsockopt = sock_no_setsockopt,
+ .getsockopt = sock_no_getsockopt,
+ .sendmsg = sock_no_sendmsg,
+ .recvmsg = sock_no_recvmsg,
+ .mmap = sock_no_mmap,
+ .sendpage = sock_no_sendpage,
+};
+
+static int __init fanotify_init(void)
+{
+ if (proto_register(&fanotify_proto, 0))
+ panic("unable to register fanotify protocol with network stack\n");
+
+ sock_register(&fanotify_family_ops);
+
+ return 0;
+}
+device_initcall(fanotify_init);
diff --git a/fs/notify/fanotify/af_fanotify.h b/fs/notify/fanotify/af_fanotify.h
new file mode 100644
index 0000000..fff0e66
--- /dev/null
+++ b/fs/notify/fanotify/af_fanotify.h
@@ -0,0 +1,21 @@
+#ifndef _LINUX_AF_FANOTIFY_H
+#define _LINUX_AF_FANOTIFY_H
+
+#include <linux/fanotify.h>
+#include <net/sock.h>
+
+struct fanotify_sock {
+ struct sock sock;
+ struct fsnotify_group *group;
+};
+
+static inline struct fanotify_sock *fan_sk(struct sock *sock)
+{
+ struct fanotify_sock *fan_sock;
+
+ fan_sock = container_of(sock, struct fanotify_sock, sock);
+
+ return fan_sock;
+}
+
+#endif /* _LINUX_AF_FANOTIFY_H */
diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
index a8785c1..6c7bf06 100644
--- a/fs/notify/fanotify/fanotify.h
+++ b/fs/notify/fanotify/fanotify.h
@@ -4,6 +4,8 @@
#include <linux/kernel.h>
#include <linux/types.h>
+extern const struct fsnotify_ops fanotify_ops;
+
static inline bool fanotify_is_mask_valid(__u32 mask)
{
if (mask & ~(FAN_ALL_INCOMING_EVENTS))
diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
index b560f86..31fa74d 100644
--- a/include/linux/fanotify.h
+++ b/include/linux/fanotify.h
@@ -1,6 +1,7 @@
#ifndef _LINUX_FANOTIFY_H
#define _LINUX_FANOTIFY_H
+#include <linux/socket.h>
#include <linux/types.h>
/* the following events that user-space can register for */
@@ -34,6 +35,24 @@
*/
#define FAN_ALL_INCOMING_EVENTS (FAN_ALL_EVENTS |\
FAN_EVENT_ON_CHILD)
+#ifndef SOL_FANOTIFY
+#define SOL_FANOTIFY 278
+#endif
+
+#ifndef AF_FANOTIFY
+#define AF_FANOTIFY 37
+#define PF_FANOTIFY AF_FANOTIFY
+#endif
+
+struct fanotify_addr {
+ sa_family_t family;
+ __u32 priority; /* unused */
+ __u32 mask_hi; /* unused */
+ __u32 mask;
+ __u32 f_flags; /* unused */
+ __u32 unused[16];
+} __attribute__((packed));
+
#ifdef __KERNEL__
#endif /* __KERNEL__ */
* [PATCH 8/9] fanotify: userspace can add and remove fsnotify inode marks
2009-08-28 18:55 [PATCH 1/9] task_struct: add PF_NONOTIFY for fanotify to use Eric Paris
` (5 preceding siblings ...)
2009-08-28 18:56 ` [PATCH 7/9] fanotify: userspace socket Eric Paris
@ 2009-08-28 18:56 ` Eric Paris
2009-08-28 18:56 ` [PATCH 9/9] fanotify: send events to userspace over socket reads Eric Paris
2009-08-28 22:36 ` [PATCH 1/9] task_struct: add PF_NONOTIFY for fanotify to use Evgeniy Polyakov
8 siblings, 0 replies; 13+ messages in thread
From: Eric Paris @ 2009-08-28 18:56 UTC (permalink / raw)
To: linux-kernel, linux-fsdevel, netdev; +Cc: davem, viro, alan, hch
Using setsockopt(), a user can add or remove fsnotify marks on inodes. These
marks determine which events for which inodes are sent to userspace. The
operations are very similar in nature to inotify_add_watch() and
inotify_rm_watch().
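The mask bookkeeping done when a mark is added can be sketched as below. The types and the function sketch_add_mark are hypothetical; the inode and group masks are passed in as plain integers instead of being read from kernel structures. The sketch omits the patch's "dropped" term, since OR-ing bits into a mask can only grow it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a mark: only its event mask is modelled. */
struct sketch_mark {
	unsigned int mask;
};

/* Which cached masks would need recalculating after the update. */
struct sketch_update {
	bool recalc_inode;
	bool recalc_group;
};

/*
 * OR the requested bits into the mark, then report whether the mark now
 * carries bits that the cached inode or group masks do not yet include.
 */
static struct sketch_update sketch_add_mark(struct sketch_mark *mark,
					    unsigned int requested,
					    unsigned int inode_mask,
					    unsigned int group_mask)
{
	struct sketch_update up = { false, false };
	unsigned int old_mask, new_mask;

	assert(mark != NULL);
	old_mask = mark->mask;
	mark->mask |= requested;
	new_mask = mark->mask;

	if (old_mask != new_mask) {
		up.recalc_inode = (new_mask & ~inode_mask) != 0;
		up.recalc_group = (new_mask & ~group_mask) != 0;
	}
	return up;
}
```

Adding a bit the mark already carries changes nothing, so neither recalculation is requested in that case.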
Signed-off-by: Eric Paris <eparis@redhat.com>
---
fs/notify/fanotify/af_fanotify.c | 169 ++++++++++++++++++++++++++++++++++++++
include/linux/fanotify.h | 10 ++
2 files changed, 178 insertions(+), 1 deletions(-)
diff --git a/fs/notify/fanotify/af_fanotify.c b/fs/notify/fanotify/af_fanotify.c
index d7bf658..ac6aee1 100644
--- a/fs/notify/fanotify/af_fanotify.c
+++ b/fs/notify/fanotify/af_fanotify.c
@@ -17,6 +17,7 @@
#include "af_fanotify.h"
static const struct proto_ops fanotify_proto_ops;
+static struct kmem_cache *fanotify_mark_cache __read_mostly;
static struct proto fanotify_proto = {
.name = "FANOTIFY",
@@ -113,6 +114,170 @@ static int fan_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
return 0;
}
+static void fanotify_free_mark(struct fsnotify_mark_entry *entry)
+{
+ kmem_cache_free(fanotify_mark_cache, entry);
+}
+
+static int fanotify_remove_inode_mark(struct fsnotify_group *group,
+ struct fanotify_so_inode_mark *so_inode_mark)
+{
+ struct fsnotify_mark_entry *entry;
+ struct file *file;
+ struct inode *inode;
+ int fput_needed, ret = 0;
+
+ ret = -EBADF;
+ file = fget_light(so_inode_mark->fd, &fput_needed);
+ if (!file)
+ goto out;
+
+ inode = file->f_path.dentry->d_inode;
+
+ spin_lock(&inode->i_lock);
+ entry = fsnotify_find_mark_entry(group, inode);
+ spin_unlock(&inode->i_lock);
+
+ ret = -ENOENT;
+ if (!entry)
+ goto out_fput;
+
+ ret = 0;
+
+ fsnotify_destroy_mark_by_entry(entry);
+
+ /* matches the fsnotify_find_mark_entry() */
+ fsnotify_put_mark(entry);
+
+ fsnotify_recalc_group_mask(group);
+out_fput:
+ fput_light(file, fput_needed);
+out:
+ return ret;
+}
+
+static int fanotify_add_inode_mark(struct fsnotify_group *group,
+ struct fanotify_so_inode_mark *so_inode_mark)
+{
+ struct fsnotify_mark_entry *entry;
+ struct file *file;
+ struct inode *inode;
+ __u32 old_mask, new_mask;
+ int fput_needed, ret;
+
+ ret = -EINVAL;
+ if (!fanotify_is_mask_valid(so_inode_mark->mask))
+ goto out;
+
+ ret = -EBADF;
+ file = fget_light(so_inode_mark->fd, &fput_needed);
+ if (!file)
+ goto out;
+
+ inode = file->f_path.dentry->d_inode;
+
+ spin_lock(&inode->i_lock);
+ entry = fsnotify_find_mark_entry(group, inode);
+ spin_unlock(&inode->i_lock);
+
+ if (!entry) {
+ struct fsnotify_mark_entry *new_entry;
+
+ ret = -ENOMEM;
+ new_entry = kmem_cache_alloc(fanotify_mark_cache, GFP_KERNEL);
+ if (!new_entry)
+ goto out_fput;
+
+ fsnotify_init_mark(new_entry, fanotify_free_mark);
+ ret = fsnotify_add_mark(new_entry, group, inode, 0);
+ if (ret) {
+ fanotify_free_mark(new_entry);
+ goto out_fput;
+ }
+
+ entry = new_entry;
+ }
+
+ ret = 0;
+
+ spin_lock(&entry->lock);
+ old_mask = entry->mask;
+ entry->mask |= so_inode_mark->mask;
+ new_mask = entry->mask;
+ spin_unlock(&entry->lock);
+
+ /* we made changes to a mask, update the group mask and the inode mask
+ * so things happen quickly. */
+ if (old_mask != new_mask) {
+ /* more bits in old than in new? */
+ int dropped = (old_mask & ~new_mask);
+ /* more bits in this entry than the inode's mask? */
+ int do_inode = (new_mask & ~inode->i_fsnotify_mask);
+ /* more bits in this entry than the group? */
+ int do_group = (new_mask & ~group->mask);
+
+ /* update the inode with this new entry */
+ if (dropped || do_inode)
+ fsnotify_recalc_inode_mask(inode);
+
+ /* update the group mask with the new mask */
+ if (dropped || do_group)
+ fsnotify_recalc_group_mask(group);
+ }
+
+ /* match the init or the find.... */
+ fsnotify_put_mark(entry);
+
+out_fput:
+ fput_light(file, fput_needed);
+out:
+ return ret;
+}
+
+static int fan_setsockopt(struct socket *sock, int level, int optname,
+ char __user *optval, int optlen)
+{
+ struct fanotify_sock *fan_sock;
+ struct fsnotify_group *group;
+ size_t copy_len;
+
+ union {
+ struct fanotify_so_inode_mark inode_mark;
+ } data;
+ int ret = 0;
+
+ if (sock->state != SS_CONNECTED)
+ return -EBADF;
+
+ if (level != SOL_FANOTIFY)
+ return -ENOPROTOOPT;
+
+ fan_sock = fan_sk(sock->sk);
+ group = fan_sock->group;
+
+ copy_len = min(optlen, (int)sizeof(data));
+ ret = copy_from_user(&data, optval, copy_len);
+ if (ret)
+ return -EFAULT;
+
+ switch (optname) {
+ case FANOTIFY_SET_MARK:
+ case FANOTIFY_REMOVE_MARK:
+ if (optlen < sizeof(struct fanotify_so_inode_mark))
+ return -EINVAL;
+
+ if (optname == FANOTIFY_SET_MARK)
+ ret = fanotify_add_inode_mark(group, &data.inode_mark);
+ else if (optname == FANOTIFY_REMOVE_MARK)
+ ret = fanotify_remove_inode_mark(group, &data.inode_mark);
+ break;
+ default:
+ return -ENOPROTOOPT;
+ }
+
+ return ret;
+}
+
static const struct net_proto_family fanotify_family_ops = {
.family = PF_FANOTIFY,
.create = fan_sock_create,
@@ -132,7 +297,7 @@ static const struct proto_ops fanotify_proto_ops = {
.ioctl = sock_no_ioctl,
.listen = sock_no_listen,
.shutdown = sock_no_shutdown,
- .setsockopt = sock_no_setsockopt,
+ .setsockopt = fan_setsockopt,
.getsockopt = sock_no_getsockopt,
.sendmsg = sock_no_sendmsg,
.recvmsg = sock_no_recvmsg,
@@ -142,6 +307,8 @@ static const struct proto_ops fanotify_proto_ops = {
static int __init fanotify_init(void)
{
+ fanotify_mark_cache = KMEM_CACHE(fsnotify_mark_entry, SLAB_PANIC);
+
if (proto_register(&fanotify_proto, 0))
panic("unable to register fanotify protocol with network stack\n");
diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
index 31fa74d..db96dd8 100644
--- a/include/linux/fanotify.h
+++ b/include/linux/fanotify.h
@@ -53,6 +53,16 @@ struct fanotify_addr {
__u32 unused[16];
} __attribute__((packed));
+/* struct used for FANOTIFY_SET_MARK and FANOTIFY_REMOVE_MARK */
+struct fanotify_so_inode_mark {
+ __s32 fd;
+ __u32 mask;
+} __attribute__((packed));
+
+/* fanotify setsockopt optnames */
+#define FANOTIFY_SET_MARK 1
+#define FANOTIFY_REMOVE_MARK 2
+
#ifdef __KERNEL__
#endif /* __KERNEL__ */
^ permalink raw reply related [flat|nested] 13+ messages in thread
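A note on the mask update at the end of fanotify_add_inode_mark() above: the decision about which masks to recalculate is plain bit arithmetic, and it can be tried outside the kernel. The sketch below is an illustration only. struct recalc and mark_recalc() are hypothetical names that do not appear in the patch, and the inode and group masks are passed in as plain integers instead of being read from inode->i_fsnotify_mask and group->mask.

```c
#include <stdint.h>

/* Hypothetical result type: which recalculations the patch would trigger. */
struct recalc {
	int inode;	/* would call fsnotify_recalc_inode_mask() */
	int group;	/* would call fsnotify_recalc_group_mask() */
};

/* Hypothetical helper mirroring the "if (old_mask != new_mask)" block. */
static struct recalc mark_recalc(uint32_t old_mask, uint32_t new_mask,
				 uint32_t inode_mask, uint32_t group_mask)
{
	struct recalc r = { 0, 0 };

	if (old_mask != new_mask) {
		/* bits present in old but missing from new */
		uint32_t dropped = old_mask & ~new_mask;
		/* bits in this mark that the inode mask does not have yet */
		uint32_t do_inode = new_mask & ~inode_mask;
		/* bits in this mark that the group mask does not have yet */
		uint32_t do_group = new_mask & ~group_mask;

		r.inode = (dropped || do_inode);
		r.group = (dropped || do_group);
	}
	return r;
}
```

In the add path new_mask is old_mask with bits OR-ed in, so dropped is always zero there and only do_inode and do_group can trigger a recalculation. The dropped term only matters when bits are cleared from a mark.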
* [PATCH 9/9] fanotify: send events to userspace over socket reads
2009-08-28 18:55 [PATCH 1/9] task_struct: add PF_NONOTIFY for fanotify to use Eric Paris
` (6 preceding siblings ...)
2009-08-28 18:56 ` [PATCH 8/9] fanotify: userspace can add and remove fsnotify inode marks Eric Paris
@ 2009-08-28 18:56 ` Eric Paris
2009-08-28 22:36 ` [PATCH 1/9] task_struct: add PF_NONOTIFY for fanotify to use Evgeniy Polyakov
8 siblings, 0 replies; 13+ messages in thread
From: Eric Paris @ 2009-08-28 18:56 UTC (permalink / raw)
To: linux-kernel, linux-fsdevel, netdev; +Cc: davem, viro, alan, hch
fanotify sends event notifications to userspace when userspace reads from the
fanotify socket. This patch implements the operations that happen at read
time: opening a file descriptor to the original object and then filling the
userspace buffer. The socket fd should be pollable to indicate when it has
data present, and the FIONREAD ioctl should return how much data is queued
to send.
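For listener authors, a rough sketch of how a buffer returned by one read could be walked with the FAN_EVENT_OK and FAN_EVENT_NEXT helpers this patch adds. This is an illustration under assumptions, not part of the patch: the struct and macros are copied from the include/linux/fanotify.h hunk below so the sketch builds without the patched header, walk_events() is a hypothetical name, and the buffer is expected to come from the caller rather than from a real fanotify socket.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Copied from this patch's include/linux/fanotify.h hunk. */
struct fanotify_event_metadata {
	uint32_t event_len;
	int32_t fd;
	uint32_t mask;
} __attribute__((packed));

#define FAN_EVENT_METADATA_LEN (sizeof(struct fanotify_event_metadata))

#define FAN_EVENT_NEXT(meta, len) ((len) -= (meta)->event_len, \
	(struct fanotify_event_metadata *)(((char *)(meta)) + \
	(meta)->event_len))

#define FAN_EVENT_OK(meta, len) ((long)(len) >= (long)FAN_EVENT_METADATA_LEN && \
	(long)(meta)->event_len >= (long)FAN_EVENT_METADATA_LEN && \
	(long)(meta)->event_len <= (long)(len))

/*
 * Hypothetical helper: walk len bytes of event metadata.  Returns the
 * number of complete events seen and stores in *usable_fds how many of
 * them carried an open file descriptor.
 */
static int walk_events(void *buf, long len, int *usable_fds)
{
	struct fanotify_event_metadata *meta = buf;
	int seen = 0;

	assert(buf != NULL && usable_fds != NULL);
	*usable_fds = 0;

	while (FAN_EVENT_OK(meta, len)) {
		if (meta->fd >= 0) {
			/* a real listener would use and then close(meta->fd) */
			(*usable_fds)++;
		} else {
			/* the kernel could not open the object; fd is -errno */
			fprintf(stderr, "event without fd: %s\n",
				strerror(-meta->fd));
		}
		seen++;
		meta = FAN_EVENT_NEXT(meta, len);
	}
	return seen;
}
```

A listener would pass the buffer and byte count from its read or recvmsg call. Since create_and_fill_fd() stores a negative errno in the fd field when the open fails, the fd has to be checked before use, and every non-negative fd needs a close() or the listener will eventually hit its descriptor limit.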
Signed-off-by: Eric Paris <eparis@redhat.com>
---
fs/notify/fanotify/af_fanotify.c | 230 ++++++++++++++++++++++++++++++++++++++
fs/notify/fanotify/fanotify.h | 5 +
include/linux/fanotify.h | 22 ++++
3 files changed, 255 insertions(+), 2 deletions(-)
diff --git a/fs/notify/fanotify/af_fanotify.c b/fs/notify/fanotify/af_fanotify.c
index ac6aee1..cefd108 100644
--- a/fs/notify/fanotify/af_fanotify.c
+++ b/fs/notify/fanotify/af_fanotify.c
@@ -2,6 +2,7 @@
#include <linux/fdtable.h>
#include <linux/file.h>
#include <linux/fsnotify_backend.h>
+#include <linux/ima.h> /* ima_path_check */
#include <linux/init.h>
#include <linux/kernel.h> /* UINT_MAX */
#include <linux/mount.h> /* mntget() */
@@ -16,6 +17,8 @@
#include "fanotify.h"
#include "af_fanotify.h"
+#include <asm/ioctls.h>
+
static const struct proto_ops fanotify_proto_ops;
static struct kmem_cache *fanotify_mark_cache __read_mostly;
@@ -114,6 +117,36 @@ static int fan_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
return 0;
}
+static int fan_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
+{
+ struct fanotify_sock *fan_sock;
+ struct fsnotify_group *group;
+ struct fsnotify_event_holder *holder;
+ void __user *p;
+ int ret = -ENOTTY;
+ size_t send_len = 0;
+
+ if (sock->state != SS_CONNECTED)
+ return -EBADF;
+
+ fan_sock = fan_sk(sock->sk);
+ group = fan_sock->group;
+
+ p = (void __user *) arg;
+
+ switch (cmd) {
+ case FIONREAD:
+ mutex_lock(&group->notification_mutex);
+ list_for_each_entry(holder, &group->notification_list, event_list)
+ send_len += FAN_EVENT_METADATA_LEN;
+ mutex_unlock(&group->notification_mutex);
+ ret = put_user(send_len, (int __user *) p);
+ break;
+ }
+
+ return ret;
+}
+
static void fanotify_free_mark(struct fsnotify_mark_entry *entry)
{
kmem_cache_free(fanotify_mark_cache, entry);
@@ -278,6 +311,199 @@ static int fan_setsockopt(struct socket *sock, int level, int optname,
return ret;
}
+/*
+ * Get an fsnotify notification event if one exists and is small
+ * enough to fit in "count". Return an error pointer if the count
+ * is not large enough.
+ *
+ * Called with the group->notification_mutex held.
+ */
+static struct fsnotify_event *get_one_event(struct fsnotify_group *group,
+ size_t count)
+{
+ BUG_ON(!mutex_is_locked(&group->notification_mutex));
+
+ if (fsnotify_notify_queue_is_empty(group))
+ return NULL;
+
+ if (FAN_EVENT_METADATA_LEN > count)
+ return ERR_PTR(-EINVAL);
+
+ /* held the notification_mutex the whole time, so this is the
+ * same event we peeked above */
+ return fsnotify_remove_notify_event(group);
+}
+
+static int create_and_fill_fd(struct fsnotify_group *group,
+ struct fanotify_event_metadata *metadata,
+ struct fsnotify_event *event)
+{
+ int client_fd, err;
+ struct dentry *dentry;
+ struct vfsmount *mnt;
+ struct file *new_file;
+
+ client_fd = get_unused_fd();
+ if (client_fd < 0)
+ return client_fd;
+
+ if (event->data_type != FSNOTIFY_EVENT_PATH) {
+ WARN_ON(1);
+ put_unused_fd(client_fd);
+ return -EINVAL;
+ }
+
+ /*
+ * we need a new file handle for the userspace program so it can read even if it was
+ * originally opened O_WRONLY.
+ */
+ dentry = dget(event->path.dentry);
+ mnt = mntget(event->path.mnt);
+ /* it's possible this event was an overflow event. in that case dentry and mnt
+ * are NULL; That's fine, just don't call dentry open */
+ if (dentry && mnt) {
+ err = ima_path_check(&event->path, MAY_READ, IMA_COUNT_UPDATE);
+ if (err)
+ new_file = ERR_PTR(err);
+ else {
+ current->flags |= PF_NONOTIFY;
+ new_file = dentry_open(dentry, mnt, O_RDONLY | O_LARGEFILE,
+ current_cred());
+ current->flags &= ~PF_NONOTIFY;
+ }
+ } else
+ new_file = ERR_PTR(-EOVERFLOW);
+ if (IS_ERR(new_file)) {
+ /*
+ * we still send an event even if we can't open the file. this
+ * can happen when, say, tasks are gone and we try to open their
+ * /proc entries, or we try to open a WRONLY file like in sysfs.
+ * we just send the errno to userspace since there isn't much
+ * else we can do.
+ */
+ put_unused_fd(client_fd);
+ client_fd = PTR_ERR(new_file);
+ } else {
+ new_file->f_mode |= FMODE_NONOTIFY;
+ fd_install(client_fd, new_file);
+ }
+
+ metadata->fd = client_fd;
+
+ return 0;
+}
+
+static ssize_t fill_event_metadata(struct fsnotify_group *group,
+ struct fanotify_event_metadata *metadata,
+ struct fsnotify_event *event)
+{
+ pr_debug("%s: \n", __func__);
+
+ metadata->event_len = FAN_EVENT_METADATA_LEN;
+ metadata->mask = fanotify_outgoing_mask(event->mask);
+
+ return create_and_fill_fd(group, metadata, event);
+
+}
+
+static ssize_t copy_event_to_iov(struct fsnotify_group *group,
+ struct fsnotify_event *event,
+ struct iovec *iov)
+{
+ struct fanotify_event_metadata fanotify_event_metadata;
+ int ret;
+
+ pr_debug("%s: \n", __func__);
+
+ ret = fill_event_metadata(group, &fanotify_event_metadata, event);
+ if (ret)
+ return ret;
+
+ /* send the main event */
+ ret = memcpy_toiovec(iov, (unsigned char *)&fanotify_event_metadata,
+ FAN_EVENT_METADATA_LEN);
+ if (ret < 0)
+ return ret;
+
+ return FAN_EVENT_METADATA_LEN;
+}
+
+static ssize_t fan_recv_events(struct fsnotify_group *group, struct msghdr *msg,
+ int count, int nonblock)
+{
+ struct fsnotify_event *event;
+ int ret, len_sent = 0;
+ DEFINE_WAIT(wait);
+
+ pr_debug("%s: \n", __func__);
+
+ while (1) {
+ prepare_to_wait(&group->notification_waitq, &wait, TASK_INTERRUPTIBLE);
+
+ mutex_lock(&group->notification_mutex);
+ event = get_one_event(group, count);
+ mutex_unlock(&group->notification_mutex);
+
+ if (event) {
+ ret = PTR_ERR(event);
+ if (IS_ERR(event))
+ break;
+
+ ret = copy_event_to_iov(group, event, msg->msg_iov);
+ fsnotify_put_event(event);
+ if (ret < 0)
+ break;
+ len_sent += ret;
+ count -= ret;
+ continue;
+ }
+
+ ret = -EAGAIN;
+ if (nonblock)
+ break;
+ ret = -EINTR;
+ if (signal_pending(current))
+ break;
+
+ if (len_sent)
+ break;
+
+ schedule();
+ }
+
+ finish_wait(&group->notification_waitq, &wait);
+ if (len_sent && ret != -EFAULT)
+ ret = len_sent;
+ return ret;
+}
+
+static int fan_recvmsg(struct kiocb *iocb, struct socket *sock,
+ struct msghdr *msg, size_t size, int flags)
+{
+ struct fanotify_sock *fan_sock;
+ struct fsnotify_group *group;
+ int nonblock;
+
+ pr_debug("%s: \n", __func__);
+
+ if (sock->state != SS_CONNECTED)
+ return -EBADF;
+
+ if (size < FAN_EVENT_METADATA_LEN)
+ return -ENOMEM;
+
+ fan_sock = fan_sk(sock->sk);
+ group = fan_sock->group;
+
+ /* hey, nonblock no matter how they ask */
+ nonblock = !!(sock->file->f_flags & O_NONBLOCK);
+ nonblock |= !!(flags & MSG_DONTWAIT);
+
+ size = fan_recv_events(group, msg, size, nonblock);
+
+ return size;
+}
+
static const struct net_proto_family fanotify_family_ops = {
.family = PF_FANOTIFY,
.create = fan_sock_create,
@@ -294,13 +520,13 @@ static const struct proto_ops fanotify_proto_ops = {
.accept = sock_no_accept,
.getname = sock_no_getname,
.poll = sock_no_poll,
- .ioctl = sock_no_ioctl,
+ .ioctl = fan_ioctl,
.listen = sock_no_listen,
.shutdown = sock_no_shutdown,
.setsockopt = fan_setsockopt,
.getsockopt = sock_no_getsockopt,
.sendmsg = sock_no_sendmsg,
- .recvmsg = sock_no_recvmsg,
+ .recvmsg = fan_recvmsg,
.mmap = sock_no_mmap,
.sendpage = sock_no_sendpage,
};
diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
index 6c7bf06..4a5c785 100644
--- a/fs/notify/fanotify/fanotify.h
+++ b/fs/notify/fanotify/fanotify.h
@@ -12,3 +12,8 @@ static inline bool fanotify_is_mask_valid(__u32 mask)
return false;
return true;
}
+
+static inline __u32 fanotify_outgoing_mask(__u32 mask)
+{
+ return mask & FAN_ALL_OUTGOING_EVENTS;
+}
diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
index db96dd8..f44a668 100644
--- a/include/linux/fanotify.h
+++ b/include/linux/fanotify.h
@@ -35,6 +35,10 @@
*/
#define FAN_ALL_INCOMING_EVENTS (FAN_ALL_EVENTS |\
FAN_EVENT_ON_CHILD)
+
+#define FAN_ALL_OUTGOING_EVENTS (FAN_ALL_EVENTS |\
+ FAN_Q_OVERFLOW)
+
#ifndef SOL_FANOTIFY
#define SOL_FANOTIFY 278
#endif
@@ -63,6 +67,24 @@ struct fanotify_so_inode_mark {
#define FANOTIFY_SET_MARK 1
#define FANOTIFY_REMOVE_MARK 2
+struct fanotify_event_metadata {
+ __u32 event_len;
+ __s32 fd;
+ __u32 mask;
+} __attribute__((packed));
+
+
+/* Helper functions to deal with fanotify_event_metadata buffers */
+#define FAN_EVENT_METADATA_LEN (sizeof(struct fanotify_event_metadata))
+
+#define FAN_EVENT_NEXT(meta, len) ((len) -= (meta)->event_len, \
+ (struct fanotify_event_metadata*)(((char *)(meta)) + \
+ (meta)->event_len))
+
+#define FAN_EVENT_OK(meta, len) ((long)(len) >= (long)FAN_EVENT_METADATA_LEN && \
+ (long)(meta)->event_len >= (long)FAN_EVENT_METADATA_LEN && \
+ (long)(meta)->event_len <= (long)(len))
+
#ifdef __KERNEL__
#endif /* __KERNEL__ */
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 1/9] task_struct: add PF_NONOTIFY for fanotify to use
2009-08-28 18:55 [PATCH 1/9] task_struct: add PF_NONOTIFY for fanotify to use Eric Paris
` (7 preceding siblings ...)
2009-08-28 18:56 ` [PATCH 9/9] fanotify: send events to userspace over socket reads Eric Paris
@ 2009-08-28 22:36 ` Evgeniy Polyakov
2009-08-28 22:39 ` Eric Paris
2009-09-03 20:25 ` Eric Paris
8 siblings, 2 replies; 13+ messages in thread
From: Evgeniy Polyakov @ 2009-08-28 22:36 UTC (permalink / raw)
To: Eric Paris; +Cc: linux-kernel, linux-fsdevel, netdev, davem, viro, alan, hch
Hi.
On Fri, Aug 28, 2009 at 02:55:42PM -0400, Eric Paris (eparis@redhat.com) wrote:
> Since fanotify opens file descriptors inside the kernel for its listeners
> it needs a way to make sure that 2 fanotify listeners, both of which listen to
> open events, do not continuously see each other's open events (and get into a
> livelock reporting on each other's activity). The fix is to create a new
> task_struct flag called PF_NONOTIFY. If this flag is set in a task no
> fanotify events will be generated for that task. fanotify will set the
> flag before an open call and will clear it immediately after.
Is there a way to get old-school notifications with the object
information instead of an opened file descriptor, which may suffer rlimit
problems and scalability issues with too many opened/closed descriptors?
--
Evgeniy Polyakov
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/9] task_struct: add PF_NONOTIFY for fanotify to use
2009-08-28 22:36 ` [PATCH 1/9] task_struct: add PF_NONOTIFY for fanotify to use Evgeniy Polyakov
@ 2009-08-28 22:39 ` Eric Paris
2009-08-28 22:50 ` Evgeniy Polyakov
2009-09-03 20:25 ` Eric Paris
1 sibling, 1 reply; 13+ messages in thread
From: Eric Paris @ 2009-08-28 22:39 UTC (permalink / raw)
To: Evgeniy Polyakov
Cc: linux-kernel, linux-fsdevel, netdev, davem, viro, alan, hch
On Sat, 2009-08-29 at 02:36 +0400, Evgeniy Polyakov wrote:
> Hi.
>
> On Fri, Aug 28, 2009 at 02:55:42PM -0400, Eric Paris (eparis@redhat.com) wrote:
> > Since fanotify opens file descriptors inside the kernel for its listeners
> > it needs a way to make sure that 2 fanotify listeners, both of which listen to
> > open events, do not continuously see each other's open events (and get into a
> > livelock reporting on each other's activity). The fix is to create a new
> > task_struct flag called PF_NONOTIFY. If this flag is set in a task no
> > fanotify events will be generated for that task. fanotify will set the
> > flag before an open call and will clear it immediately after.
>
> Is there a way to get old-school notifications with the object
> information instead of an opened file descriptor, which may suffer rlimit
> problems and scalability issues with too many opened/closed descriptors?
Use inotify.
-Eric
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/9] task_struct: add PF_NONOTIFY for fanotify to use
2009-08-28 22:39 ` Eric Paris
@ 2009-08-28 22:50 ` Evgeniy Polyakov
0 siblings, 0 replies; 13+ messages in thread
From: Evgeniy Polyakov @ 2009-08-28 22:50 UTC (permalink / raw)
To: Eric Paris; +Cc: linux-kernel, linux-fsdevel, netdev, davem, viro, alan, hch
On Fri, Aug 28, 2009 at 06:39:28PM -0400, Eric Paris (eparis@redhat.com) wrote:
> > Is there a way to get old-school notifications with the object
> > information instead of an opened file descriptor, which may suffer rlimit
> > problems and scalability issues with too many opened/closed descriptors?
>
> Use inotify.
I need pids.
--
Evgeniy Polyakov
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/9] task_struct: add PF_NONOTIFY for fanotify to use
2009-08-28 22:36 ` [PATCH 1/9] task_struct: add PF_NONOTIFY for fanotify to use Evgeniy Polyakov
2009-08-28 22:39 ` Eric Paris
@ 2009-09-03 20:25 ` Eric Paris
1 sibling, 0 replies; 13+ messages in thread
From: Eric Paris @ 2009-09-03 20:25 UTC (permalink / raw)
Cc: linux-kernel, linux-fsdevel, netdev, davem, viro, alan, hch
> On Fri, Aug 28, 2009 at 02:55:42PM -0400, Eric Paris (eparis@redhat.com) wrote:
> > Since fanotify opens file descriptors inside the kernel for its listeners
> > it needs a way to make sure that 2 fanotify listeners, both of which listen to
> > open events, do not continuously see each other's open events (and get into a
> > livelock reporting on each other's activity). The fix is to create a new
> > task_struct flag called PF_NONOTIFY. If this flag is set in a task no
> > fanotify events will be generated for that task. fanotify will set the
> > flag before an open call and will clear it immediately after.
So this patch isn't actually needed; I thought the fsnotify_open() call
was deeper than it is. I'm going to drop this particular patch.
Can I take silence on the list as a lack of disagreement? I'd like to
start putting these into linux-next starting tomorrow. Obviously I'd
like to see the networking definitions approved and taken by davem, I'd
like to see the FMODE_ change approved and taken by viro, but if you two
would like me to push it directly I'd be happy to.
Davem, maybe you'd rather my af_fanotify was somewhere inside net/
instead of in fs/notify/fanotify? I'd love to get some review here, but
no one on the list is excited enough to step up. (I have numerous people who
have privately claimed they intend to use this stuff, so I promise it
won't be dead code.)
I'm ready to start committing and adding features, anyone telling me no?
-Eric
^ permalink raw reply [flat|nested] 13+ messages in thread