All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] fsnotify: Rearrange fast path to minimise overhead when there is no watcher
@ 2020-06-08 14:05 Mel Gorman
  2020-06-08 15:03 ` Amir Goldstein
  2020-06-08 15:19 ` Jan Kara
  0 siblings, 2 replies; 15+ messages in thread
From: Mel Gorman @ 2020-06-08 14:05 UTC (permalink / raw)
  To: Jan Kara; +Cc: Amir Goldstein, linux-fsdevel, linux-kernel

The fsnotify paths are trivial to hit even when there are no watchers and
they are surprisingly expensive. For example, every successful vfs_write()
hits fsnotify_modify which calls both fsnotify_parent and fsnotify unless
FMODE_NONOTIFY is set which is an internal flag invisible to userspace.
As it stands, fsnotify_parent is a guaranteed functional call even if there
are no watchers and fsnotify() does a substantial amount of unnecessary
work before it checks if there are any watchers. A perf profile showed
that applying mnt->mnt_fsnotify_mask in fnotify() was almost half of the
total samples taken in that function during a test. This patch rearranges
the fast paths to reduce the amount of work done when there are no watchers.

The test motivating this was "perf bench sched messaging --pipe". Despite
the fact the pipes are anonymous, fsnotify is still called a lot and
the overhead is noticable even though it's completely pointless. It's
likely the overhead is negligible for real IO so this is an extreme
example. This is a comparison of hackbench using processes and pipes on
a 1-socket machine with 8 CPU threads without fanotify watchers.

                              5.7.0                  5.7.0
                            vanilla      fastfsnotify-v1r1
Amean     1       0.4837 (   0.00%)      0.4630 *   4.27%*
Amean     3       1.5447 (   0.00%)      1.4557 (   5.76%)
Amean     5       2.6037 (   0.00%)      2.4363 (   6.43%)
Amean     7       3.5987 (   0.00%)      3.4757 (   3.42%)
Amean     12      5.8267 (   0.00%)      5.6983 (   2.20%)
Amean     18      8.4400 (   0.00%)      8.1327 (   3.64%)
Amean     24     11.0187 (   0.00%)     10.0290 *   8.98%*
Amean     30     13.1013 (   0.00%)     12.8510 (   1.91%)
Amean     32     13.9190 (   0.00%)     13.2410 (   4.87%)

                       5.7.0       5.7.0
                     vanilla fastfsnotify-v1r1
Duration User         157.05      152.79
Duration System      1279.98     1219.32
Duration Elapsed      182.81      174.52

This is showing that the latencies are improved by roughly 2-9%. The
variability is not shown but some of these results are within the noise
as this workload heavily overloads the machine. That said, the system CPU
usage is reduced by quite a bit so it makes sense to avoid the overhead
even if it is a bit tricky to detect at times. A perf profile of just 1
group of tasks showed that 5.14% of samples taken were in either fsnotify()
or fsnotify_parent(). With the patch, 2.8% of samples were in fsnotify,
mostly function entry and the initial check for watchers.  The check for
watchers is complicated enough that inlining it may be controversial.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 fs/notify/fsnotify.c             | 25 +++++++++++++++----------
 include/linux/fsnotify.h         | 10 ++++++++++
 include/linux/fsnotify_backend.h |  4 ++--
 3 files changed, 27 insertions(+), 12 deletions(-)

diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index 72d332ce8e12..de7bbfd973c0 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -143,7 +143,7 @@ void __fsnotify_update_child_dentry_flags(struct inode *inode)
 }
 
 /* Notify this dentry's parent about a child's events. */
-int fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
+int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
 		    int data_type)
 {
 	struct dentry *parent;
@@ -174,7 +174,7 @@ int fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
 
 	return ret;
 }
-EXPORT_SYMBOL_GPL(fsnotify_parent);
+EXPORT_SYMBOL_GPL(__fsnotify_parent);
 
 static int send_to_group(struct inode *to_tell,
 			 __u32 mask, const void *data,
@@ -315,17 +315,12 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is,
 	struct fsnotify_iter_info iter_info = {};
 	struct super_block *sb = to_tell->i_sb;
 	struct mount *mnt = NULL;
-	__u32 mnt_or_sb_mask = sb->s_fsnotify_mask;
+	__u32 mnt_or_sb_mask;
 	int ret = 0;
-	__u32 test_mask = (mask & ALL_FSNOTIFY_EVENTS);
+	__u32 test_mask;
 
-	if (path) {
+	if (path)
 		mnt = real_mount(path->mnt);
-		mnt_or_sb_mask |= mnt->mnt_fsnotify_mask;
-	}
-	/* An event "on child" is not intended for a mount/sb mark */
-	if (mask & FS_EVENT_ON_CHILD)
-		mnt_or_sb_mask = 0;
 
 	/*
 	 * Optimization: srcu_read_lock() has a memory barrier which can
@@ -337,11 +332,21 @@ int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is,
 	if (!to_tell->i_fsnotify_marks && !sb->s_fsnotify_marks &&
 	    (!mnt || !mnt->mnt_fsnotify_marks))
 		return 0;
+
+	/* An event "on child" is not intended for a mount/sb mark */
+	mnt_or_sb_mask = 0;
+	if (!(mask & FS_EVENT_ON_CHILD)) {
+		mnt_or_sb_mask = sb->s_fsnotify_mask;
+		if (path)
+			mnt_or_sb_mask |= mnt->mnt_fsnotify_mask;
+	}
+
 	/*
 	 * if this is a modify event we may need to clear the ignored masks
 	 * otherwise return if neither the inode nor the vfsmount/sb care about
 	 * this type of event.
 	 */
+	test_mask = (mask & ALL_FSNOTIFY_EVENTS);
 	if (!(mask & FS_MODIFY) &&
 	    !(test_mask & (to_tell->i_fsnotify_mask | mnt_or_sb_mask)))
 		return 0;
diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index 5ab28f6c7d26..508f6bb0b06b 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -44,6 +44,16 @@ static inline void fsnotify_dirent(struct inode *dir, struct dentry *dentry,
 	fsnotify_name(dir, mask, d_inode(dentry), &dentry->d_name, 0);
 }
 
+/* Notify this dentry's parent about a child's events. */
+static inline int fsnotify_parent(struct dentry *dentry, __u32 mask,
+				  const void *data, int data_type)
+{
+	if (!(dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED))
+		return 0;
+
+	return __fsnotify_parent(dentry, mask, data, data_type);
+}
+
 /*
  * Simple wrappers to consolidate calls fsnotify_parent()/fsnotify() when
  * an event is on a file/dentry.
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index f0c506405b54..1626fa7d10ff 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -379,7 +379,7 @@ struct fsnotify_mark {
 /* main fsnotify call to send events */
 extern int fsnotify(struct inode *to_tell, __u32 mask, const void *data,
 		    int data_type, const struct qstr *name, u32 cookie);
-extern int fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
+extern int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
 			   int data_type);
 extern void __fsnotify_inode_delete(struct inode *inode);
 extern void __fsnotify_vfsmount_delete(struct vfsmount *mnt);
@@ -541,7 +541,7 @@ static inline int fsnotify(struct inode *to_tell, __u32 mask, const void *data,
 	return 0;
 }
 
-static inline int fsnotify_parent(struct dentry *dentry, __u32 mask,
+static inline int __fsnotify_parent(struct dentry *dentry, __u32 mask,
 				  const void *data, int data_type)
 {
 	return 0;

^ permalink raw reply related	[flat|nested] 15+ messages in thread
* Re: [PATCH] fsnotify: Rearrange fast path to minimise overhead when there is no watcher
@ 2020-06-08 17:52 kernel test robot
  0 siblings, 0 replies; 15+ messages in thread
From: kernel test robot @ 2020-06-08 17:52 UTC (permalink / raw)
  To: kbuild

[-- Attachment #1: Type: text/plain, Size: 9082 bytes --]

CC: kbuild-all(a)lists.01.org
In-Reply-To: <20200608140557.GG3127@techsingularity.net>
References: <20200608140557.GG3127@techsingularity.net>
TO: Mel Gorman <mgorman@suse.de>

Hi Mel,

I love your patch! Perhaps something to improve:

[auto build test WARNING on ext3/fsnotify]
[also build test WARNING on v5.7 next-20200608]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Mel-Gorman/fsnotify-Rearrange-fast-path-to-minimise-overhead-when-there-is-no-watcher/20200608-220745
base:   https://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git fsnotify
:::::: branch date: 4 hours ago
:::::: commit date: 4 hours ago
config: i386-randconfig-m021-20200607 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-13) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

New smatch warnings:
fs/notify/fsnotify.c:341 fsnotify() error: we previously assumed 'mnt' could be null (see line 333)

Old smatch warnings:
fs/notify/fsnotify.c:140 __fsnotify_update_child_dentry_flags() error: double unlocked 'alias->d_lockref.lock' (orig line 138)

# https://github.com/0day-ci/linux/commit/3c0b72381918767551923e484bc915c26a258e2a
git remote add linux-review https://github.com/0day-ci/linux
git remote update linux-review
git checkout 3c0b72381918767551923e484bc915c26a258e2a
vim +/mnt +341 fs/notify/fsnotify.c

d9a6f30bb89309 Amir Goldstein         2018-04-20  304  
90586523eb4b34 Eric Paris             2009-05-21  305  /*
90586523eb4b34 Eric Paris             2009-05-21  306   * This is the main call to fsnotify.  The VFS calls into hook specific functions
90586523eb4b34 Eric Paris             2009-05-21  307   * in linux/fsnotify.h.  Those functions then in turn call here.  Here will call
90586523eb4b34 Eric Paris             2009-05-21  308   * out to all of the registered fsnotify_group.  Those groups can then use the
90586523eb4b34 Eric Paris             2009-05-21  309   * notification event in whatever means they feel necessary.
90586523eb4b34 Eric Paris             2009-05-21  310   */
e637835eccc8b9 Al Viro                2016-11-20  311  int fsnotify(struct inode *to_tell, __u32 mask, const void *data, int data_is,
25b229dff4ffff Al Viro                2019-04-26  312  	     const struct qstr *file_name, u32 cookie)
90586523eb4b34 Eric Paris             2009-05-21  313  {
aa93bdc5500cc9 Amir Goldstein         2020-03-19  314  	const struct path *path = fsnotify_data_path(data, data_is);
3427ce71554123 Miklos Szeredi         2017-10-30  315  	struct fsnotify_iter_info iter_info = {};
45a9fb3725d886 Amir Goldstein         2019-01-10  316  	struct super_block *sb = to_tell->i_sb;
60f7ed8c7c4d06 Amir Goldstein         2018-09-01  317  	struct mount *mnt = NULL;
3c0b7238191876 Mel Gorman             2020-06-08  318  	__u32 mnt_or_sb_mask;
9385a84d7e1f65 Jan Kara               2016-11-10  319  	int ret = 0;
3c0b7238191876 Mel Gorman             2020-06-08  320  	__u32 test_mask;
90586523eb4b34 Eric Paris             2009-05-21  321  
3c0b7238191876 Mel Gorman             2020-06-08  322  	if (path)
aa93bdc5500cc9 Amir Goldstein         2020-03-19  323  		mnt = real_mount(path->mnt);
3a9fb89f4cd04c Eric Paris             2009-12-17  324  
7c49b8616460eb Dave Hansen            2015-09-04  325  	/*
7c49b8616460eb Dave Hansen            2015-09-04  326  	 * Optimization: srcu_read_lock() has a memory barrier which can
7c49b8616460eb Dave Hansen            2015-09-04  327  	 * be expensive.  It protects walking the *_fsnotify_marks lists.
7c49b8616460eb Dave Hansen            2015-09-04  328  	 * However, if we do not walk the lists, we do not have to do
7c49b8616460eb Dave Hansen            2015-09-04  329  	 * SRCU because we have no references to any objects and do not
7c49b8616460eb Dave Hansen            2015-09-04  330  	 * need SRCU to keep them "alive".
7c49b8616460eb Dave Hansen            2015-09-04  331  	 */
45a9fb3725d886 Amir Goldstein         2019-01-10  332  	if (!to_tell->i_fsnotify_marks && !sb->s_fsnotify_marks &&
45a9fb3725d886 Amir Goldstein         2019-01-10 @333  	    (!mnt || !mnt->mnt_fsnotify_marks))
7c49b8616460eb Dave Hansen            2015-09-04  334  		return 0;
3c0b7238191876 Mel Gorman             2020-06-08  335  
3c0b7238191876 Mel Gorman             2020-06-08  336  	/* An event "on child" is not intended for a mount/sb mark */
3c0b7238191876 Mel Gorman             2020-06-08  337  	mnt_or_sb_mask = 0;
3c0b7238191876 Mel Gorman             2020-06-08  338  	if (!(mask & FS_EVENT_ON_CHILD)) {
3c0b7238191876 Mel Gorman             2020-06-08  339  		mnt_or_sb_mask = sb->s_fsnotify_mask;
3c0b7238191876 Mel Gorman             2020-06-08  340  		if (path)
3c0b7238191876 Mel Gorman             2020-06-08 @341  			mnt_or_sb_mask |= mnt->mnt_fsnotify_mask;
3c0b7238191876 Mel Gorman             2020-06-08  342  	}
3c0b7238191876 Mel Gorman             2020-06-08  343  
613a807fe7c793 Eric Paris             2010-07-28  344  	/*
613a807fe7c793 Eric Paris             2010-07-28  345  	 * if this is a modify event we may need to clear the ignored masks
60f7ed8c7c4d06 Amir Goldstein         2018-09-01  346  	 * otherwise return if neither the inode nor the vfsmount/sb care about
613a807fe7c793 Eric Paris             2010-07-28  347  	 * this type of event.
613a807fe7c793 Eric Paris             2010-07-28  348  	 */
3c0b7238191876 Mel Gorman             2020-06-08  349  	test_mask = (mask & ALL_FSNOTIFY_EVENTS);
613a807fe7c793 Eric Paris             2010-07-28  350  	if (!(mask & FS_MODIFY) &&
60f7ed8c7c4d06 Amir Goldstein         2018-09-01  351  	    !(test_mask & (to_tell->i_fsnotify_mask | mnt_or_sb_mask)))
613a807fe7c793 Eric Paris             2010-07-28  352  		return 0;
75c1be487a690d Eric Paris             2010-07-28  353  
9385a84d7e1f65 Jan Kara               2016-11-10  354  	iter_info.srcu_idx = srcu_read_lock(&fsnotify_mark_srcu);
75c1be487a690d Eric Paris             2010-07-28  355  
47d9c7cc457adc Amir Goldstein         2018-04-20  356  	iter_info.marks[FSNOTIFY_OBJ_TYPE_INODE] =
3427ce71554123 Miklos Szeredi         2017-10-30  357  		fsnotify_first_mark(&to_tell->i_fsnotify_marks);
45a9fb3725d886 Amir Goldstein         2019-01-10  358  	iter_info.marks[FSNOTIFY_OBJ_TYPE_SB] =
45a9fb3725d886 Amir Goldstein         2019-01-10  359  		fsnotify_first_mark(&sb->s_fsnotify_marks);
9bdda4e9cf2dce Amir Goldstein         2018-09-01  360  	if (mnt) {
47d9c7cc457adc Amir Goldstein         2018-04-20  361  		iter_info.marks[FSNOTIFY_OBJ_TYPE_VFSMOUNT] =
3427ce71554123 Miklos Szeredi         2017-10-30  362  			fsnotify_first_mark(&mnt->mnt_fsnotify_marks);
7131485a93679f Eric Paris             2009-12-17  363  	}
75c1be487a690d Eric Paris             2010-07-28  364  
8edc6e1688fc8f Jan Kara               2014-11-13  365  	/*
60f7ed8c7c4d06 Amir Goldstein         2018-09-01  366  	 * We need to merge inode/vfsmount/sb mark lists so that e.g. inode mark
60f7ed8c7c4d06 Amir Goldstein         2018-09-01  367  	 * ignore masks are properly reflected for mount/sb mark notifications.
8edc6e1688fc8f Jan Kara               2014-11-13  368  	 * That's why this traversal is so complicated...
8edc6e1688fc8f Jan Kara               2014-11-13  369  	 */
d9a6f30bb89309 Amir Goldstein         2018-04-20  370  	while (fsnotify_iter_select_report_types(&iter_info)) {
5b0457ad021f3f Amir Goldstein         2018-04-20  371  		ret = send_to_group(to_tell, mask, data, data_is, cookie,
e43e9c339a78a0 Al Viro                2019-04-26  372  				    file_name, &iter_info);
613a807fe7c793 Eric Paris             2010-07-28  373  
ff8bcbd03da881 Eric Paris             2010-10-28  374  		if (ret && (mask & ALL_FSNOTIFY_PERM_EVENTS))
ff8bcbd03da881 Eric Paris             2010-10-28  375  			goto out;
ff8bcbd03da881 Eric Paris             2010-10-28  376  
d9a6f30bb89309 Amir Goldstein         2018-04-20  377  		fsnotify_iter_next(&iter_info);
90586523eb4b34 Eric Paris             2009-05-21  378  	}
ff8bcbd03da881 Eric Paris             2010-10-28  379  	ret = 0;
ff8bcbd03da881 Eric Paris             2010-10-28  380  out:
9385a84d7e1f65 Jan Kara               2016-11-10  381  	srcu_read_unlock(&fsnotify_mark_srcu, iter_info.srcu_idx);
c4ec54b40d33f8 Eric Paris             2009-12-17  382  
98b5c10d320adf Jean-Christophe Dubois 2010-03-23  383  	return ret;
90586523eb4b34 Eric Paris             2009-05-21  384  }
90586523eb4b34 Eric Paris             2009-05-21  385  EXPORT_SYMBOL_GPL(fsnotify);
90586523eb4b34 Eric Paris             2009-05-21  386  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 32658 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2020-06-12 12:42 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-06-08 14:05 [PATCH] fsnotify: Rearrange fast path to minimise overhead when there is no watcher Mel Gorman
2020-06-08 15:03 ` Amir Goldstein
2020-06-08 16:06   ` Mel Gorman
2020-06-08 16:26     ` Amir Goldstein
2020-06-08 18:01       ` Mel Gorman
2020-06-08 18:12         ` Amir Goldstein
2020-06-08 20:39           ` Amir Goldstein
2020-06-10 12:59             ` Mel Gorman
2020-06-10 13:24               ` Amir Goldstein
2020-06-12  9:38                 ` Amir Goldstein
2020-06-12 12:42                   ` Mel Gorman
2020-06-08 15:19 ` Jan Kara
2020-06-08 16:50   ` Mel Gorman
2020-06-08 17:45     ` Amir Goldstein
  -- strict thread matches above, loose matches on Subject: below --
2020-06-08 17:52 kernel test robot

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.