From: Dave Chinner <david@fromorbit.com>
To: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>, Jan Kara <jack@suse.cz>,
	LKML <linux-kernel@vger.kernel.org>,
	xfs@oss.sgi.com, Andy Lutomirski <luto@amacapital.net>,
	linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
	Tim Chen <tim.c.chen@linux.intel.com>
Subject: Re: page fault scalability (ext3, ext4, xfs)
Date: Thu, 15 Aug 2013 10:24:36 +1000
Message-ID: <20130815002436.GI6023@dastard>
In-Reply-To: <520BB9EF.5020308@linux.intel.com>

On Wed, Aug 14, 2013 at 10:10:07AM -0700, Dave Hansen wrote:
> We talked a little about this issue in this thread:
> 
> 	http://marc.info/?l=linux-mm&m=137573185419275&w=2
> 
> but I figured I'd follow up with a full comparison.  ext4 is about 20%
> slower in handling write page faults than ext3.  xfs is about 30% slower
> than ext3.  I'm running on an 8-socket / 80-core / 160-thread system.
> Test case is this:
> 
> 	https://github.com/antonblanchard/will-it-scale/blob/master/tests/page_fault3.c

So, it writes a 128MB file sequentially via mmap page faults. This
isn't a page fault benchmark, as such...
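
A minimal sketch of that access pattern, for anyone who has not read
the test. It is not copied from page_fault3.c: the helper name
touch_pages_via_mmap(), the temporary file path and the error handling
are assumptions made for illustration. It maps a file MAP_SHARED and
writes one byte per page, so the first store to each page takes a
write fault that the filesystem has to service.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from will-it-scale): create an unlinked
 * temporary file of `size` bytes, map it shared, and dirty every page
 * in order. Returns the number of pages written, or -1 on error.
 */
long touch_pages_via_mmap(size_t size)
{
	char path[] = "/tmp/mmap-write-sketch.XXXXXX";	/* assumed location */
	long pgsize = sysconf(_SC_PAGESIZE);
	long touched = 0;
	char *map;
	int fd;

	assert(pgsize > 0);

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);		/* file disappears once fd is closed */

	if (ftruncate(fd, (off_t)size) != 0) {
		close(fd);
		return -1;
	}

	map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED) {
		close(fd);
		return -1;
	}

	/* One store per page: each first touch is a write page fault. */
	for (size_t off = 0; off < size; off += (size_t)pgsize) {
		map[off] = 0;
		touched++;
	}

	munmap(map, size);
	close(fd);
	return touched;
}
```

Calling touch_pages_via_mmap(128UL * 1024 * 1024) would cover the
128MB case described above. The real test repeats the map/write/unmap
cycle in a loop across many processes, which this sketch leaves out.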

> 
> It's a little easier to look at the trends as you grow the number of
> processes:
> 
> 	http://www.sr71.net/~dave/intel/page-fault-exts/cmp.html?1=ext3&2=ext4&3=xfs&hide=linear,threads,threads_idle,processes_idle&rollPeriod=16
> 
> I recorded and diff'd some perf data (I've still got the raw data if
> anyone wants it), and the main culprit of the ext4/xfs delta looks to be
> spinlock contention (or at least bouncing) in xfs_log_commit_cil().
> This looks to be a known problem:
> 
> 	http://oss.sgi.com/archives/xfs/2013-07/msg00110.html

Yup, apparently the fixes have been pulled into the xfsdev tree, but I
haven't seen the tree updated since they were pulled in, so the
linux-next builds aren't picking them up yet.

> Here's a brief snippet of the ext4->xfs 'perf diff'.  Note that things
> like page_fault() go down in the profile because we are doing _fewer_ of
> them, not because it got faster:
> 
> > # Baseline    Delta          Shared Object                                          Symbol
> > # ........  .......  .....................  ..............................................
> > #
> >     22.04%   -4.07%  [kernel.kallsyms]      [k] page_fault                                
> >      2.93%  +12.49%  [kernel.kallsyms]      [k] _raw_spin_lock                            
> >      8.21%   -0.58%  page_fault3_processes  [.] testcase                                  
> >      4.87%   -0.34%  [kernel.kallsyms]      [k] __set_page_dirty_buffers                  
> >      4.07%   -0.58%  [kernel.kallsyms]      [k] mem_cgroup_update_page_stat               
> >      4.10%   -0.61%  [kernel.kallsyms]      [k] __block_write_begin                       
> >      3.69%   -0.57%  [kernel.kallsyms]      [k] find_get_page                             
> 
> It's a bit of a bummer that things are so much less scalable on the
> newer filesystems.

Sorry, what? What filesystems are you comparing here? XFS is
anything but new...

> I expected xfs to do a _lot_ better than it did.

perf diff doesn't tell me anything about how you should expect the
workload to scale.

This workload appears to be a concurrent write workload using
mmap(), so performance is going to be determined by filesystem
configuration, storage capability and the CPU overhead of the
page_mkwrite() path through the filesystem. It's not a page fault
benchmark at all - it's simply a filesystem write bandwidth
benchmark.

So, perhaps you could describe the storage you are using, as that
would shed more light on your results. A good summary of what
information is useful to us is here:

http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F

And FWIW, it's no secret that XFS has more per-operation overhead
than ext4 through the write path when it comes to allocation, so
it's no surprise that ext4 is a bit faster on a workload that is
highly dependent on allocation overhead....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com



