From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave@sr71.net>,
	dave.hansen@linux.intel.com, akpm@linux-foundation.org,
	jack@suse.cz, viro@zeniv.linux.org.uk, eparis@redhat.com,
	john@johnmccutchan.com, rlove@rlove.org,
	tim.c.chen@linux.intel.com, linux-kernel@vger.kernel.org
Subject: Re: [RFC][PATCH] fs: optimize inotify/fsnotify code for unwatched files
Date: Fri, 19 Jun 2015 17:29:37 -0700
Message-ID: <20150620002937.GE3913@linux.vnet.ibm.com>
In-Reply-To: <20150619233306.GT25760@tassilo.jf.intel.com>

On Fri, Jun 19, 2015 at 04:33:06PM -0700, Andi Kleen wrote:
> On Fri, Jun 19, 2015 at 02:50:25PM -0700, Dave Hansen wrote:
> > 
> > From: Dave Hansen <dave.hansen@linux.intel.com>
> > 
> > I have a _tiny_ microbenchmark that sits in a loop and writes
> > single bytes to a file.  Writing one byte to a tmpfs file is
> > around 2x slower than reading one byte from a file, which is a
> > _bit_ more than I expected.  This is a dumb benchmark, but I think
> > it's hard to deny that write() is a hot path and we should avoid
> > unnecessary overhead there.
> > 
> > I did a 'perf record' of 30-second samples of read and write.
> > The top item in a diffprofile is srcu_read_lock() from
> > fsnotify().  There are active inotify fd's from systemd, but
> > nothing is actually listening to the file or its part of
> > the filesystem.
> > 
> > I *think* we can avoid taking the srcu_read_lock() for the
> > common case where there are no actual marks on the file
> > being modified *or* the vfsmount.
> 
> What is so expensive in it? Just the memory barrier in it?
> 
> Perhaps the function can be tuned in general.

The memory barrier we are pretty much stuck with unless we want
synchronize_srcu() to be quite a bit more expensive (and for SRCU to be
unusable from offline and idle CPUs) -- and that synchronize_srcu() expense
drove the rewrite from the earlier version to this one.  It is possible to
cut the two instances of __this_cpu_inc() down to one, however.

It of course would be possible to have two types of SRCU, one for fast
grace periods and the other for memory-barrier-free read-side critical
sections, but obviously a very clear case would need to be made for this.
At least judging from the reactions the last time I introduced a new
flavor of RCU.  ;-)

So, echoing Andi, what exactly is expensive?

							Thanx, Paul

> -Andi
> 
> int __srcu_read_lock(struct srcu_struct *sp)
> {
>         int idx;
> 
>         idx = ACCESS_ONCE(sp->completed) & 0x1;
>         preempt_disable();
>         __this_cpu_inc(sp->per_cpu_ref->c[idx]);
>         smp_mb(); /* B */  /* Avoid leaking the critical section. */
>         __this_cpu_inc(sp->per_cpu_ref->seq[idx]);
>         preempt_enable();
>         return idx;
> }
> 
> 
> > 
> > The *_fsnotify_mask is an aggregate of each of the masks from
> > each mark.  If we have nothing set in the masks at all then there
> > are no marks and no need to do anything with 'ignored masks'
> > since none exist.  This keeps us from having to do the costly
> > srcu_read_lock() for a check which is very cheap.
> > 
> > This patch gave a 10.8% speedup in writes/second on my test.
> > 
> > Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Jan Kara <jack@suse.cz>
> > Cc: Al Viro <viro@zeniv.linux.org.uk>
> > Cc: Eric Paris <eparis@redhat.com>
> > Cc: John McCutchan <john@johnmccutchan.com>
> > Cc: Robert Love <rlove@rlove.org>
> > Cc: Tim Chen <tim.c.chen@linux.intel.com>
> > Cc: Andi Kleen <ak@linux.intel.com>
> > Cc: linux-kernel@vger.kernel.org
> > ---
> > 
> >  b/fs/notify/fsnotify.c |   10 ++++++++++
> >  1 file changed, 10 insertions(+)
> > 
> > diff -puN fs/notify/fsnotify.c~optimize-fsnotify fs/notify/fsnotify.c
> > --- a/fs/notify/fsnotify.c~optimize-fsnotify	2015-06-19 13:29:53.117283581 -0700
> > +++ b/fs/notify/fsnotify.c	2015-06-19 13:29:53.123283853 -0700
> > @@ -213,6 +213,16 @@ int fsnotify(struct inode *to_tell, __u3
> >  	    !(test_mask & to_tell->i_fsnotify_mask) &&
> >  	    !(mnt && test_mask & mnt->mnt_fsnotify_mask))
> >  		return 0;
> > +	/*
> > +	 * Optimization: The *_fsnotify_mask is an aggregate of each of the
> > +	 * masks from each mark.  If we have nothing set in the masks at
> > +	 * all then there are no marks and no need to do anything with
> > +	 * 'ignored masks' since none exist.  This keeps us from having to
> > +	 * do the costly srcu_read_lock() for a check which is very cheap.
> > +	 */
> > +	if (!to_tell->i_fsnotify_mask &&
> > +	    (!mnt || !mnt->mnt_fsnotify_mask))
> > +		return 0;
> >  
> >  	idx = srcu_read_lock(&fsnotify_mark_srcu);
> >  
> > _
> 
> -- 
> ak@linux.intel.com -- Speaking for myself only
> 
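
For readers following the quoted patch, the early-return check it adds can
be modelled in user space roughly as below.  This is only a sketch under
stated assumptions: struct demo_inode, struct demo_mount, demo_fsnotify()
and demo_slow_path_entries are hypothetical stand-ins, not kernel code, and
the counter merely marks the spot where the kernel would call
srcu_read_lock().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct inode and struct mount; only the
 * aggregate mask fields discussed in the patch are modelled. */
struct demo_inode {
	uint32_t i_fsnotify_mask;	/* OR of the masks of all marks on the inode */
};

struct demo_mount {
	uint32_t mnt_fsnotify_mask;	/* OR of the masks of all marks on the mount */
};

/* Counts entries to the path that would take srcu_read_lock() in the kernel. */
static unsigned long demo_slow_path_entries;

/*
 * Returns 0 when both aggregate masks are empty (no marks, nothing to do),
 * and 1 after entering the slow path otherwise.  mnt may be NULL.
 */
static int demo_fsnotify(const struct demo_inode *to_tell,
			 const struct demo_mount *mnt)
{
	/* Same condition as the check added by the patch. */
	if (!to_tell->i_fsnotify_mask &&
	    (!mnt || !mnt->mnt_fsnotify_mask))
		return 0;

	demo_slow_path_entries++;	/* srcu_read_lock() would go here */
	return 1;
}
```

With both masks zero (or a NULL mount), demo_fsnotify() returns 0 without
touching the counter; a nonzero mask on either the inode or the mount sends
the call down the slow path.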

