* [PATCH 1/2] Revert "fsnotify: destroy marks with call_srcu instead of dedicated thread"
@ 2016-02-15 0:15 Jeff Layton
2016-02-15 0:15 ` [PATCH 2/2] fsnotify: turn fsnotify reaper thread into a workqueue job Jeff Layton
0 siblings, 1 reply; 5+ messages in thread
From: Jeff Layton @ 2016-02-15 0:15 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-fsdevel, linux-kernel, Paul E. McKenney, Jan Kara,
Eric Paris, Eryu Guan
This reverts commit c510eff6bebaa244e577b8f499e86606b5e5d4c7.
Eryu reported that he was seeing some OOM kills kick in when running a
testcase that adds and removes inotify marks on a file in a tight loop.
The above commit changed the code to use call_srcu to clean up the
marks. While that does (in principle) work, the srcu callback job is
limited to cleaning up entries in small batches and only once per jiffy.
It's easily possible to overwhelm that machinery with too many call_srcu
callbacks, and Eryu's reproduer did just that.
There's also another potential problem with using call_srcu here. While
you can obviously sleep while holding the srcu_read_lock, the callbacks
run under local_bh_disable, so you can't sleep there.
It's possible when putting the last reference to the fsnotify_mark that
we'll end up putting a chain of references including the fsnotify_group,
uid, and associated keys. While I don't see any obvious ways that that
could occurs, it's probably still best to avoid using call_srcu here
after all.
This patch reverts the above patch. A later patch will take a different
approach to eliminated the dedicated thread here.
Cc: Jan Kara <jack@suse.com>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
---
fs/notify/mark.c | 66 +++++++++++++++++++++++++++++++---------
include/linux/fsnotify_backend.h | 5 +--
2 files changed, 53 insertions(+), 18 deletions(-)
diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index cfcbf114676e..fc0df4442f7b 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -92,6 +92,9 @@
#include "fsnotify.h"
struct srcu_struct fsnotify_mark_srcu;
+static DEFINE_SPINLOCK(destroy_lock);
+static LIST_HEAD(destroy_list);
+static DECLARE_WAIT_QUEUE_HEAD(destroy_waitq);
void fsnotify_get_mark(struct fsnotify_mark *mark)
{
@@ -165,19 +168,10 @@ void fsnotify_detach_mark(struct fsnotify_mark *mark)
atomic_dec(&group->num_marks);
}
-static void
-fsnotify_mark_free_rcu(struct rcu_head *rcu)
-{
- struct fsnotify_mark *mark;
-
- mark = container_of(rcu, struct fsnotify_mark, g_rcu);
- fsnotify_put_mark(mark);
-}
-
/*
- * Free fsnotify mark. The freeing is actually happening from a call_srcu
- * callback. Caller must have a reference to the mark or be protected by
- * fsnotify_mark_srcu.
+ * Free fsnotify mark. The freeing is actually happening from a kthread which
+ * first waits for srcu period end. Caller must have a reference to the mark
+ * or be protected by fsnotify_mark_srcu.
*/
void fsnotify_free_mark(struct fsnotify_mark *mark)
{
@@ -192,7 +186,10 @@ void fsnotify_free_mark(struct fsnotify_mark *mark)
mark->flags &= ~FSNOTIFY_MARK_FLAG_ALIVE;
spin_unlock(&mark->lock);
- call_srcu(&fsnotify_mark_srcu, &mark->g_rcu, fsnotify_mark_free_rcu);
+ spin_lock(&destroy_lock);
+ list_add(&mark->g_list, &destroy_list);
+ spin_unlock(&destroy_lock);
+ wake_up(&destroy_waitq);
/*
* Some groups like to know that marks are being freed. This is a
@@ -388,7 +385,11 @@ err:
spin_unlock(&mark->lock);
- call_srcu(&fsnotify_mark_srcu, &mark->g_rcu, fsnotify_mark_free_rcu);
+ spin_lock(&destroy_lock);
+ list_add(&mark->g_list, &destroy_list);
+ spin_unlock(&destroy_lock);
+ wake_up(&destroy_waitq);
+
return ret;
}
@@ -491,3 +492,40 @@ void fsnotify_init_mark(struct fsnotify_mark *mark,
atomic_set(&mark->refcnt, 1);
mark->free_mark = free_mark;
}
+
+static int fsnotify_mark_destroy(void *ignored)
+{
+ struct fsnotify_mark *mark, *next;
+ struct list_head private_destroy_list;
+
+ for (;;) {
+ spin_lock(&destroy_lock);
+ /* exchange the list head */
+ list_replace_init(&destroy_list, &private_destroy_list);
+ spin_unlock(&destroy_lock);
+
+ synchronize_srcu(&fsnotify_mark_srcu);
+
+ list_for_each_entry_safe(mark, next, &private_destroy_list, g_list) {
+ list_del_init(&mark->g_list);
+ fsnotify_put_mark(mark);
+ }
+
+ wait_event_interruptible(destroy_waitq, !list_empty(&destroy_list));
+ }
+
+ return 0;
+}
+
+static int __init fsnotify_mark_init(void)
+{
+ struct task_struct *thread;
+
+ thread = kthread_run(fsnotify_mark_destroy, NULL,
+ "fsnotify_mark");
+ if (IS_ERR(thread))
+ panic("unable to start fsnotify mark destruction thread.");
+
+ return 0;
+}
+device_initcall(fsnotify_mark_init);
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 6b7e89f45aa4..533c4408529a 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -220,10 +220,7 @@ struct fsnotify_mark {
/* List of marks by group->i_fsnotify_marks. Also reused for queueing
* mark into destroy_list when it's waiting for the end of SRCU period
* before it can be freed. [group->mark_mutex] */
- union {
- struct list_head g_list;
- struct rcu_head g_rcu;
- };
+ struct list_head g_list;
/* Protects inode / mnt pointers, flags, masks */
spinlock_t lock;
/* List of marks for inode / vfsmount [obj_lock] */
--
2.5.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH 2/2] fsnotify: turn fsnotify reaper thread into a workqueue job
2016-02-15 0:15 [PATCH 1/2] Revert "fsnotify: destroy marks with call_srcu instead of dedicated thread" Jeff Layton
@ 2016-02-15 0:15 ` Jeff Layton
2016-02-15 6:03 ` Eryu Guan
2016-02-17 20:01 ` Jan Kara
0 siblings, 2 replies; 5+ messages in thread
From: Jeff Layton @ 2016-02-15 0:15 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-fsdevel, linux-kernel, Paul E. McKenney, Jan Kara,
Eric Paris, Eryu Guan
We don't require a dedicated thread for fsnotify cleanup. Switch it over
to a workqueue job instead that runs on the system_unbound_wq.
In the interest of not thrashing the queued job too often when there are
a lot of marks being removed, we delay the reaper job slightly when
queueing it, to allow several to gather on the list.
Cc: Jan Kara <jack@suse.com>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
---
fs/notify/mark.c | 49 ++++++++++++++++++-------------------------------
1 file changed, 18 insertions(+), 31 deletions(-)
diff --git a/fs/notify/mark.c b/fs/notify/mark.c
index fc0df4442f7b..7115c5d7d373 100644
--- a/fs/notify/mark.c
+++ b/fs/notify/mark.c
@@ -91,10 +91,14 @@
#include <linux/fsnotify_backend.h>
#include "fsnotify.h"
+#define FSNOTIFY_REAPER_DELAY (1) /* 1 jiffy */
+
struct srcu_struct fsnotify_mark_srcu;
static DEFINE_SPINLOCK(destroy_lock);
static LIST_HEAD(destroy_list);
-static DECLARE_WAIT_QUEUE_HEAD(destroy_waitq);
+
+static void fsnotify_mark_destroy(struct work_struct *work);
+static DECLARE_DELAYED_WORK(reaper_work, fsnotify_mark_destroy);
void fsnotify_get_mark(struct fsnotify_mark *mark)
{
@@ -189,7 +193,8 @@ void fsnotify_free_mark(struct fsnotify_mark *mark)
spin_lock(&destroy_lock);
list_add(&mark->g_list, &destroy_list);
spin_unlock(&destroy_lock);
- wake_up(&destroy_waitq);
+ queue_delayed_work(system_unbound_wq, &reaper_work,
+ FSNOTIFY_REAPER_DELAY);
/*
* Some groups like to know that marks are being freed. This is a
@@ -388,7 +393,8 @@ err:
spin_lock(&destroy_lock);
list_add(&mark->g_list, &destroy_list);
spin_unlock(&destroy_lock);
- wake_up(&destroy_waitq);
+ queue_delayed_work(system_unbound_wq, &reaper_work,
+ FSNOTIFY_REAPER_DELAY);
return ret;
}
@@ -493,39 +499,20 @@ void fsnotify_init_mark(struct fsnotify_mark *mark,
mark->free_mark = free_mark;
}
-static int fsnotify_mark_destroy(void *ignored)
+static void fsnotify_mark_destroy(struct work_struct *work)
{
struct fsnotify_mark *mark, *next;
struct list_head private_destroy_list;
- for (;;) {
- spin_lock(&destroy_lock);
- /* exchange the list head */
- list_replace_init(&destroy_list, &private_destroy_list);
- spin_unlock(&destroy_lock);
-
- synchronize_srcu(&fsnotify_mark_srcu);
+ spin_lock(&destroy_lock);
+ /* exchange the list head */
+ list_replace_init(&destroy_list, &private_destroy_list);
+ spin_unlock(&destroy_lock);
- list_for_each_entry_safe(mark, next, &private_destroy_list, g_list) {
- list_del_init(&mark->g_list);
- fsnotify_put_mark(mark);
- }
+ synchronize_srcu(&fsnotify_mark_srcu);
- wait_event_interruptible(destroy_waitq, !list_empty(&destroy_list));
+ list_for_each_entry_safe(mark, next, &private_destroy_list, g_list) {
+ list_del_init(&mark->g_list);
+ fsnotify_put_mark(mark);
}
-
- return 0;
-}
-
-static int __init fsnotify_mark_init(void)
-{
- struct task_struct *thread;
-
- thread = kthread_run(fsnotify_mark_destroy, NULL,
- "fsnotify_mark");
- if (IS_ERR(thread))
- panic("unable to start fsnotify mark destruction thread.");
-
- return 0;
}
-device_initcall(fsnotify_mark_init);
--
2.5.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH 2/2] fsnotify: turn fsnotify reaper thread into a workqueue job
2016-02-15 0:15 ` [PATCH 2/2] fsnotify: turn fsnotify reaper thread into a workqueue job Jeff Layton
@ 2016-02-15 6:03 ` Eryu Guan
2016-02-15 17:55 ` Jeff Layton
2016-02-17 20:01 ` Jan Kara
1 sibling, 1 reply; 5+ messages in thread
From: Eryu Guan @ 2016-02-15 6:03 UTC (permalink / raw)
To: Jeff Layton
Cc: Andrew Morton, linux-fsdevel, linux-kernel, Paul E. McKenney,
Jan Kara, Eric Paris
On Sun, Feb 14, 2016 at 07:15:23PM -0500, Jeff Layton wrote:
> We don't require a dedicated thread for fsnotify cleanup. Switch it over
> to a workqueue job instead that runs on the system_unbound_wq.
>
> In the interest of not thrashing the queued job too often when there are
> a lot of marks being removed, we delay the reaper job slightly when
> queueing it, to allow several to gather on the list.
>
> Cc: Jan Kara <jack@suse.com>
> Cc: Eric Paris <eparis@parisplace.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Eryu Guan <guaneryu@gmail.com>
> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
With these two patches applied on top of r.5-rc3, the same test passed
2-hour stress run, also survived stress test that forks 5 processes
running the same test program for 30 minutes.
Thanks for looking into this!
Eryu
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 2/2] fsnotify: turn fsnotify reaper thread into a workqueue job
2016-02-15 6:03 ` Eryu Guan
@ 2016-02-15 17:55 ` Jeff Layton
0 siblings, 0 replies; 5+ messages in thread
From: Jeff Layton @ 2016-02-15 17:55 UTC (permalink / raw)
To: Eryu Guan
Cc: Andrew Morton, linux-fsdevel, linux-kernel, Paul E. McKenney,
Jan Kara, Eric Paris
On Mon, 15 Feb 2016 14:03:10 +0800
Eryu Guan <guaneryu@gmail.com> wrote:
> On Sun, Feb 14, 2016 at 07:15:23PM -0500, Jeff Layton wrote:
> > We don't require a dedicated thread for fsnotify cleanup. Switch it over
> > to a workqueue job instead that runs on the system_unbound_wq.
> >
> > In the interest of not thrashing the queued job too often when there are
> > a lot of marks being removed, we delay the reaper job slightly when
> > queueing it, to allow several to gather on the list.
> >
> > Cc: Jan Kara <jack@suse.com>
> > Cc: Eric Paris <eparis@parisplace.org>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Eryu Guan <guaneryu@gmail.com>
> > Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
>
> With these two patches applied on top of r.5-rc3, the same test passed
> 2-hour stress run, also survived stress test that forks 5 processes
> running the same test program for 30 minutes.
>
> Thanks for looking into this!
>
> Eryu
Thanks for reporting and testing the patches. Yes, I tested them too
while running your reproducer and watched the size of the
inotify_inode_mark slab. It seemed to stay pretty stable with these
patches as well.
Andrew, would you mind picking up these patches since you merged the
original one?
Thanks,
--
Jeff Layton <jlayton@poochiereds.net>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 2/2] fsnotify: turn fsnotify reaper thread into a workqueue job
2016-02-15 0:15 ` [PATCH 2/2] fsnotify: turn fsnotify reaper thread into a workqueue job Jeff Layton
2016-02-15 6:03 ` Eryu Guan
@ 2016-02-17 20:01 ` Jan Kara
1 sibling, 0 replies; 5+ messages in thread
From: Jan Kara @ 2016-02-17 20:01 UTC (permalink / raw)
To: Jeff Layton
Cc: Andrew Morton, linux-fsdevel, linux-kernel, Paul E. McKenney,
Jan Kara, Eric Paris, Eryu Guan
On Sun 14-02-16 19:15:23, Jeff Layton wrote:
> We don't require a dedicated thread for fsnotify cleanup. Switch it over
> to a workqueue job instead that runs on the system_unbound_wq.
>
> In the interest of not thrashing the queued job too often when there are
> a lot of marks being removed, we delay the reaper job slightly when
> queueing it, to allow several to gather on the list.
>
> Cc: Jan Kara <jack@suse.com>
> Cc: Eric Paris <eparis@parisplace.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Eryu Guan <guaneryu@gmail.com>
> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
The patch looks correct to me. You can add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/notify/mark.c | 49 ++++++++++++++++++-------------------------------
> 1 file changed, 18 insertions(+), 31 deletions(-)
>
> diff --git a/fs/notify/mark.c b/fs/notify/mark.c
> index fc0df4442f7b..7115c5d7d373 100644
> --- a/fs/notify/mark.c
> +++ b/fs/notify/mark.c
> @@ -91,10 +91,14 @@
> #include <linux/fsnotify_backend.h>
> #include "fsnotify.h"
>
> +#define FSNOTIFY_REAPER_DELAY (1) /* 1 jiffy */
> +
> struct srcu_struct fsnotify_mark_srcu;
> static DEFINE_SPINLOCK(destroy_lock);
> static LIST_HEAD(destroy_list);
> -static DECLARE_WAIT_QUEUE_HEAD(destroy_waitq);
> +
> +static void fsnotify_mark_destroy(struct work_struct *work);
> +static DECLARE_DELAYED_WORK(reaper_work, fsnotify_mark_destroy);
>
> void fsnotify_get_mark(struct fsnotify_mark *mark)
> {
> @@ -189,7 +193,8 @@ void fsnotify_free_mark(struct fsnotify_mark *mark)
> spin_lock(&destroy_lock);
> list_add(&mark->g_list, &destroy_list);
> spin_unlock(&destroy_lock);
> - wake_up(&destroy_waitq);
> + queue_delayed_work(system_unbound_wq, &reaper_work,
> + FSNOTIFY_REAPER_DELAY);
>
> /*
> * Some groups like to know that marks are being freed. This is a
> @@ -388,7 +393,8 @@ err:
> spin_lock(&destroy_lock);
> list_add(&mark->g_list, &destroy_list);
> spin_unlock(&destroy_lock);
> - wake_up(&destroy_waitq);
> + queue_delayed_work(system_unbound_wq, &reaper_work,
> + FSNOTIFY_REAPER_DELAY);
>
> return ret;
> }
> @@ -493,39 +499,20 @@ void fsnotify_init_mark(struct fsnotify_mark *mark,
> mark->free_mark = free_mark;
> }
>
> -static int fsnotify_mark_destroy(void *ignored)
> +static void fsnotify_mark_destroy(struct work_struct *work)
> {
> struct fsnotify_mark *mark, *next;
> struct list_head private_destroy_list;
>
> - for (;;) {
> - spin_lock(&destroy_lock);
> - /* exchange the list head */
> - list_replace_init(&destroy_list, &private_destroy_list);
> - spin_unlock(&destroy_lock);
> -
> - synchronize_srcu(&fsnotify_mark_srcu);
> + spin_lock(&destroy_lock);
> + /* exchange the list head */
> + list_replace_init(&destroy_list, &private_destroy_list);
> + spin_unlock(&destroy_lock);
>
> - list_for_each_entry_safe(mark, next, &private_destroy_list, g_list) {
> - list_del_init(&mark->g_list);
> - fsnotify_put_mark(mark);
> - }
> + synchronize_srcu(&fsnotify_mark_srcu);
>
> - wait_event_interruptible(destroy_waitq, !list_empty(&destroy_list));
> + list_for_each_entry_safe(mark, next, &private_destroy_list, g_list) {
> + list_del_init(&mark->g_list);
> + fsnotify_put_mark(mark);
> }
> -
> - return 0;
> -}
> -
> -static int __init fsnotify_mark_init(void)
> -{
> - struct task_struct *thread;
> -
> - thread = kthread_run(fsnotify_mark_destroy, NULL,
> - "fsnotify_mark");
> - if (IS_ERR(thread))
> - panic("unable to start fsnotify mark destruction thread.");
> -
> - return 0;
> }
> -device_initcall(fsnotify_mark_init);
> --
> 2.5.0
>
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2016-02-17 20:01 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-02-15 0:15 [PATCH 1/2] Revert "fsnotify: destroy marks with call_srcu instead of dedicated thread" Jeff Layton
2016-02-15 0:15 ` [PATCH 2/2] fsnotify: turn fsnotify reaper thread into a workqueue job Jeff Layton
2016-02-15 6:03 ` Eryu Guan
2016-02-15 17:55 ` Jeff Layton
2016-02-17 20:01 ` Jan Kara
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).