From: Dave Hansen <dave.hansen@linux.intel.com>
To: linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com,
linux-ext4@vger.kernel.org, Jan Kara <jack@suse.cz>,
LKML <linux-kernel@vger.kernel.org>,
david@fromorbit.com, Tim Chen <tim.c.chen@linux.intel.com>,
Andi Kleen <ak@linux.intel.com>,
Andy Lutomirski <luto@amacapital.net>
Subject: page fault scalability (ext3, ext4, xfs)
Date: Wed, 14 Aug 2013 10:10:07 -0700
Message-ID: <520BB9EF.5020308@linux.intel.com>
We talked a little about this issue in this thread:
http://marc.info/?l=linux-mm&m=137573185419275&w=2
but I figured I'd follow up with a full comparison. ext4 is about 20%
slower in handling write page faults than ext3. xfs is about 30% slower
than ext3. I'm running on an 8-socket / 80-core / 160-thread system.
Test case is this:
https://github.com/antonblanchard/will-it-scale/blob/master/tests/page_fault3.c
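For readers who don't want to open the link, here is a rough sketch of the kind of work a shared-file write-fault test does on each iteration. It is not the page_fault3.c source. The function name, the /tmp path, and the mapping size passed in are all made up for illustration, and the temporary file would have to live on the filesystem under test for the numbers to mean anything.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map a temporary file MAP_SHARED, store one byte to every page, unmap.
 * Returns the number of pages touched, or -1 on error.
 *
 * Assumption: this only mirrors the general shape of a shared-file
 * write-fault test. The path template below is hypothetical.
 */
static long touch_shared_file_pages(size_t len)
{
	char path[] = "/tmp/pf_sketch_XXXXXX";
	long page = sysconf(_SC_PAGESIZE);
	long touched = 0;
	char *map;
	size_t off;
	int fd;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);	/* the file goes away once fd is closed */

	/* Give the file a size so the whole mapping is backed by it. */
	if (ftruncate(fd, (off_t)len) < 0) {
		close(fd);
		return -1;
	}

	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED) {
		close(fd);
		return -1;
	}

	/* The first store to each page of the fresh mapping takes a fault. */
	for (off = 0; off < len; off += (size_t)page) {
		map[off] = 1;
		touched++;
	}

	munmap(map, len);
	close(fd);
	return touched;
}
```

A benchmark would call something like this in a loop from many processes at once and count pages touched per second. The filesystem gets involved on each of those first writes, which is where the three filesystems can differ.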
It's a little easier to look at the trends as you grow the number of
processes:
http://www.sr71.net/~dave/intel/page-fault-exts/cmp.html?1=ext3&2=ext4&3=xfs&hide=linear,threads,threads_idle,processes_idle&rollPeriod=16
I recorded and diff'd some perf data (I've still got the raw data if
anyone wants it), and the main culprit of the ext4/xfs delta looks to be
spinlock contention (or at least cacheline bouncing) in xfs_log_commit_cil().
This looks to be a known problem:
http://oss.sgi.com/archives/xfs/2013-07/msg00110.html
Here's a brief snippet of the ext4->xfs 'perf diff'. Note that things
like page_fault() go down in the profile because we are doing _fewer_ of
them, not because it got faster:
> # Baseline    Delta  Shared Object          Symbol
> # ........  .......  .....................  ..............................................
> #
>     22.04%   -4.07%  [kernel.kallsyms]      [k] page_fault
>      2.93%  +12.49%  [kernel.kallsyms]      [k] _raw_spin_lock
>      8.21%   -0.58%  page_fault3_processes  [.] testcase
>      4.87%   -0.34%  [kernel.kallsyms]      [k] __set_page_dirty_buffers
>      4.07%   -0.58%  [kernel.kallsyms]      [k] mem_cgroup_update_page_stat
>      4.10%   -0.61%  [kernel.kallsyms]      [k] __block_write_begin
>      3.69%   -0.57%  [kernel.kallsyms]      [k] find_get_page
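A note on reading the table: the Delta column is the change in a symbol's share of samples between the two profiles, so the share in the second profile is baseline plus delta. Below is a small sketch of that arithmetic for two of the rows, assuming ext4 is the baseline and xfs is the second profile. The struct and function names are made up for illustration.

```c
#include <stdio.h>

/* One row of the 'perf diff' output quoted above. */
struct diff_row {
	const char *symbol;
	double baseline_pct;	/* share of samples in the baseline profile */
	double delta_pct;	/* change in share in the second profile */
};

/* Values copied from the table above. */
static const struct diff_row rows[] = {
	{ "page_fault",     22.04, -4.07 },
	{ "_raw_spin_lock",  2.93, 12.49 },
};

/* Share of samples in the second profile: baseline plus delta. */
static double second_profile_pct(const struct diff_row *row)
{
	return row->baseline_pct + row->delta_pct;
}

/* Print both shares side by side for each row. */
static void print_rows(void)
{
	size_t i;

	for (i = 0; i < sizeof(rows) / sizeof(rows[0]); i++)
		printf("%-16s %6.2f%% -> %6.2f%%\n", rows[i].symbol,
		       rows[i].baseline_pct, second_profile_pct(&rows[i]));
}
```

With these two rows, page_fault works out to 17.97% of the second profile and _raw_spin_lock to 15.42%. These are shares of each run's own samples, not absolute times, so a symbol's share can shrink without that code getting any faster.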
It's a bit of a bummer that things are so much less scalable on the
newer filesystems. I expected xfs to do a _lot_ better than it did.