From: Dave Hansen <dave.hansen@linux.intel.com>
To: linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com,
	linux-ext4@vger.kernel.org, Jan Kara <jack@suse.cz>,
	LKML <linux-kernel@vger.kernel.org>,
	david@fromorbit.com, Tim Chen <tim.c.chen@linux.intel.com>,
	Andi Kleen <ak@linux.intel.com>,
	Andy Lutomirski <luto@amacapital.net>
Subject: page fault scalability (ext3, ext4, xfs)
Date: Wed, 14 Aug 2013 10:10:07 -0700
Message-ID: <520BB9EF.5020308@linux.intel.com>

We talked a little about this issue in this thread:

	http://marc.info/?l=linux-mm&m=137573185419275&w=2

but I figured I'd follow up with a full comparison.  ext4 is about 20%
slower in handling write page faults than ext3.  xfs is about 30% slower
than ext3.  I'm running on an 8-socket / 80-core / 160-thread system.
Test case is this:

	https://github.com/antonblanchard/will-it-scale/blob/master/tests/page_fault3.c
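
For anyone who doesn't want to open the link, here is a minimal sketch of
the kind of workload involved: map a file MAP_SHARED, write one byte to
every page, unmap, repeat.  It is an illustration and not a copy of the
linked test; the helper name write_fault_passes() and its parameters are
made up for this sketch, and it is single-process, while the real test
runs many copies in parallel.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from the linked test): create a temporary
 * file in `dir`, then `passes` times map it MAP_SHARED, write one byte to
 * every page, and unmap it.  The first write to each page of a fresh shared
 * mapping should take a write fault into the filesystem.
 * `dir` should live on the filesystem being measured.
 * Returns the number of pages written, or -1 on error.
 */
long write_fault_passes(const char *dir, size_t map_size, int passes)
{
    char path[4096];
    long page_size = sysconf(_SC_PAGESIZE);
    long touched = 0;
    int fd;

    if (page_size <= 0)
        return -1;
    if (snprintf(path, sizeof(path), "%s/wfault-XXXXXX", dir) >= (int)sizeof(path))
        return -1;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                       /* file goes away when fd is closed */

    if (ftruncate(fd, (off_t)map_size) != 0) {
        close(fd);
        return -1;
    }

    for (int pass = 0; pass < passes; pass++) {
        char *p = mmap(NULL, map_size, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) {
            close(fd);
            return -1;
        }
        /* One store per page: each store is one counted iteration. */
        for (size_t off = 0; off < map_size; off += (size_t)page_size) {
            p[off] = 1;
            touched++;
        }
        munmap(p, map_size);            /* drop the mappings so the next pass faults again */
    }

    close(fd);
    return touched;
}
```

Calling write_fault_passes("/mnt/test", 128UL << 20, 100) from several
processes at once, with /mnt/test standing in for a mount of the
filesystem under test, would approximate the shape of the load; the path
and sizes are placeholders, not the values I ran with.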

It's a little easier to look at the trends as you grow the number of
processes:

	http://www.sr71.net/~dave/intel/page-fault-exts/cmp.html?1=ext3&2=ext4&3=xfs&hide=linear,threads,threads_idle,processes_idle&rollPeriod=16

I recorded and diff'd some perf data (I've still got the raw data if
anyone wants it), and the main culprit of the ext4/xfs delta looks to be
spinlock contention (or at least bouncing) in xfs_log_commit_cil().
This looks to be a known problem:

	http://oss.sgi.com/archives/xfs/2013-07/msg00110.html

Here's a brief snippet of the ext4->xfs 'perf diff' (ext4 is the
baseline).  Note that things like page_fault() go down in the profile
because we are doing _fewer_ of them, not because they got faster:

> # Baseline    Delta          Shared Object                                          Symbol
> # ........  .......  .....................  ..............................................
> #
>     22.04%   -4.07%  [kernel.kallsyms]      [k] page_fault                                
>      2.93%  +12.49%  [kernel.kallsyms]      [k] _raw_spin_lock                            
>      8.21%   -0.58%  page_fault3_processes  [.] testcase                                  
>      4.87%   -0.34%  [kernel.kallsyms]      [k] __set_page_dirty_buffers                  
>      4.07%   -0.58%  [kernel.kallsyms]      [k] mem_cgroup_update_page_stat               
>      4.10%   -0.61%  [kernel.kallsyms]      [k] __block_write_begin                       
>      3.69%   -0.57%  [kernel.kallsyms]      [k] find_get_page                             
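
To read the table: perf diff's default Delta column is the comparison
profile's share minus the baseline's share, so the xfs-side number is
Baseline + Delta.  A small sketch of that arithmetic follows; the struct
and function names are made up for illustration and are not part of perf.

```c
#include <stdio.h>

/* One row of the 'perf diff' snippet above; names are illustrative only. */
struct perf_diff_row {
    const char *symbol;
    double baseline_pct;   /* share in the baseline (ext4) profile */
    double delta_pct;      /* comparison share minus baseline share */
};

/* Share of the symbol in the comparison (xfs) profile. */
double comparison_pct(const struct perf_diff_row *row)
{
    return row->baseline_pct + row->delta_pct;
}

/* Print baseline and comparison shares side by side. */
void print_both_sides(const struct perf_diff_row *rows, size_t n)
{
    for (size_t i = 0; i < n; i++)
        printf("%-28s ext4 %6.2f%%  xfs %6.2f%%\n", rows[i].symbol,
               rows[i].baseline_pct, comparison_pct(&rows[i]));
}
```

With the rows above, _raw_spin_lock works out to 2.93 + 12.49 = 15.42% of
the xfs profile, and page_fault to 22.04 - 4.07 = 17.97%.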

It's a bit of a bummer that things are so much less scalable on the
newer filesystems.  I expected xfs to do a _lot_ better than it did.

