From: Mel Gorman <mgorman@suse.de>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
Vlastimil Babka <vbabka@suse.cz>, Jan Kara <jack@suse.cz>,
Michal Hocko <mhocko@suse.cz>, Hugh Dickins <hughd@google.com>,
Peter Zijlstra <peterz@infradead.org>,
Dave Hansen <dave.hansen@intel.com>, Mel Gorman <mgorman@suse.de>,
Linux Kernel <linux-kernel@vger.kernel.org>,
Linux-MM <linux-mm@kvack.org>,
Linux-FSDevel <linux-fsdevel@vger.kernel.org>
Subject: [PATCH 00/19] Misc page alloc, shmem, mark_page_accessed and page_waitqueue optimisations v3r33
Date: Tue, 13 May 2014 10:45:31 +0100
Message-ID: <1399974350-11089-1-git-send-email-mgorman@suse.de>

Changelog since V2
o Fewer atomic operations in buffer discards			(mgorman)
o Remove number_of_cpusets and use ref count in jump labels	(peterz)
o Optimise set loop for pageblock flags further			(peterz)
o Remove unnecessary parameters when setting pageblock flags	(vbabka)
o Rework how PG_waiters are set/cleared to avoid changing wait.c	(mgorman)

I was investigating a performance bug that looked like dd to tmpfs
had regressed. The bulk of the problem turned out to be a difference
in Kconfig, but it got me looking at the unnecessary overhead in tmpfs,
mark_page_accessed and parts of the allocator. This series is the result.

The patches themselves have details of the performance results, but here
are a few showing the impact of the whole series. This is the result of
dd'ing to a file multiple times on tmpfs:

sync DD to tmpfs
Throughput                 3.15.0-rc4             3.15.0-rc4
                              vanilla          fullseries-v3
Min               4096.0000 (  0.00%)    4300.8000 (  5.00%)
Mean              4785.4933 (  0.00%)    5003.9467 (  4.56%)
TrimMean          4812.8000 (  0.00%)    5028.5714 (  4.48%)
Stddev             147.0509 (  0.00%)     191.9981 ( 30.57%)
Max               5017.6000 (  0.00%)    5324.8000 (  6.12%)

sync DD to tmpfs
Elapsed Time               3.15.0-rc4             3.15.0-rc4
                              vanilla          fullseries-v3
Min elapsed          0.4200 (  0.00%)       0.3900 (  7.14%)
Mean elapsed         0.4947 (  0.00%)       0.4527 (  8.49%)
TrimMean elapsed     0.4968 (  0.00%)       0.4539 (  8.63%)
Stddev elapsed       0.0255 (  0.00%)       0.0340 (-33.02%)
Max elapsed          0.5200 (  0.00%)       0.4800 (  7.69%)

TrimMean elapsed     0.4796 (  0.00%)       0.4179 ( 12.88%)
Stddev elapsed       0.0353 (  0.00%)       0.0379 ( -7.23%)
Max elapsed          0.5100 (  0.00%)       0.4800 (  5.88%)

sync DD to ext4
Throughput                 3.15.0-rc4             3.15.0-rc4
                              vanilla          fullseries-v3
Min                113.0000 (  0.00%)     117.0000 (  3.54%)
Mean               116.3000 (  0.00%)     119.6667 (  2.89%)
TrimMean           116.2857 (  0.00%)     119.5714 (  2.83%)
Stddev               1.6961 (  0.00%)       1.1643 (-31.35%)
Max                120.0000 (  0.00%)     122.0000 (  1.67%)

sync DD to ext4
Elapsed time               3.15.0-rc4             3.15.0-rc4
                              vanilla          fullseries-v3
Min elapsed         13.9500 (  0.00%)      13.6900 (  1.86%)
Mean elapsed        14.4253 (  0.00%)      14.0010 (  2.94%)
TrimMean elapsed    14.4321 (  0.00%)      14.0161 (  2.88%)
Stddev elapsed       0.2047 (  0.00%)       0.1423 ( 30.46%)
Max elapsed         14.8300 (  0.00%)      14.3100 (  3.51%)

async DD to ext4
Elapsed time               3.15.0-rc4             3.15.0-rc4
                              vanilla          fullseries-v3
Min elapsed          0.7900 (  0.00%)       0.7800 (  1.27%)
Mean elapsed        12.4023 (  0.00%)      12.2957 (  0.86%)
TrimMean elapsed    13.2036 (  0.00%)      13.0918 (  0.85%)
Stddev elapsed       3.3286 (  0.00%)       2.9842 ( 10.35%)
Max elapsed         18.6000 (  0.00%)      13.4300 ( 27.80%)
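
The dd tables above summarise repeated runs of the same command. A minimal
sketch of such a harness follows, in Python for brevity. It is not the
tooling that produced these tables: the dd arguments, the file size, the
iteration count, the function names and the use of a population standard
deviation are all assumptions made for illustration. TrimMean is left out
because the trimming rule is not stated here.

```python
import statistics
import subprocess
import time


def time_command(argv):
    """Run argv once and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def summarise(samples):
    """Reduce a list of samples to the Min/Mean/Stddev/Max rows of a table."""
    return {
        "Min": min(samples),
        "Mean": statistics.mean(samples),
        # Assumption: population stddev; the tables do not say which is used.
        "Stddev": statistics.pstdev(samples),
        "Max": max(samples),
    }


def run_dd_benchmark(target, iterations=30, block_size="1M", count=2048):
    """Time repeated dd writes to target and summarise the elapsed times.

    Assumption: conv=fsync stands in for the "sync" variant. The flags,
    size and iteration count behind the tables above are not given.
    """
    argv = ["dd", "if=/dev/zero", f"of={target}",
            f"bs={block_size}", f"count={count}", "conv=fsync"]
    return summarise([time_command(argv) for _ in range(iterations)])
```

Pointing run_dd_benchmark() at a file on a tmpfs or ext4 mount (the path is
up to the caller) returns one column of an elapsed-time table. Dropping
conv=fsync would approximate the async case.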

This table shows the latency in usecs of accessing ext4-backed
mappings of various sizes:

lat_mmap
                           3.15.0-rc4             3.15.0-rc4
                              vanilla          fullseries-v3
Procs 107M         564.0000 (  0.00%)     546.0000 (  3.19%)
Procs 214M        1123.0000 (  0.00%)    1090.0000 (  2.94%)
Procs 322M        1636.0000 (  0.00%)    1395.0000 ( 14.73%)
Procs 429M        2076.0000 (  0.00%)    2051.0000 (  1.20%)
Procs 536M        2518.0000 (  0.00%)    2482.0000 (  1.43%)
Procs 644M        3008.0000 (  0.00%)    2978.0000 (  1.00%)
Procs 751M        3506.0000 (  0.00%)    3450.0000 (  1.60%)
Procs 859M        3988.0000 (  0.00%)    3756.0000 (  5.82%)
Procs 966M        4544.0000 (  0.00%)    4310.0000 (  5.15%)
Procs 1073M       4960.0000 (  0.00%)    4928.0000 (  0.65%)
Procs 1181M       5342.0000 (  0.00%)    5144.0000 (  3.71%)
Procs 1288M       5573.0000 (  0.00%)    5427.0000 (  2.62%)
Procs 1395M       5777.0000 (  0.00%)    6056.0000 ( -4.83%)
Procs 1503M       6141.0000 (  0.00%)    5963.0000 (  2.90%)
Procs 1610M       6689.0000 (  0.00%)    6331.0000 (  5.35%)
Procs 1717M       8839.0000 (  0.00%)    6807.0000 ( 22.99%)
Procs 1825M       8399.0000 (  0.00%)    9062.0000 ( -7.89%)
Procs 1932M       7871.0000 (  0.00%)    8778.0000 (-11.52%)
Procs 2040M       8235.0000 (  0.00%)    8081.0000 (  1.87%)
Procs 2147M       8861.0000 (  0.00%)    8337.0000 (  5.91%)
In general the system CPU overhead is lower.
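
The bracketed percentages are relative to the vanilla kernel. The sign
convention can be reproduced in a few lines. This is a sketch inferred from
the figures above, not the reporting tool's own code, and gain_percent is a
name made up for illustration.

```python
def gain_percent(baseline, patched, higher_is_better):
    """Relative change versus baseline, signed so that positive means better.

    Throughput tables are read as higher-is-better; elapsed-time and
    latency tables as lower-is-better.
    """
    if higher_is_better:
        return (patched - baseline) / baseline * 100.0
    return (baseline - patched) / baseline * 100.0


if __name__ == "__main__":
    # Mean throughput, sync DD to tmpfs
    print(f"{gain_percent(4785.4933, 5003.9467, True):6.2f}%")
    # Mean elapsed, sync DD to tmpfs
    print(f"{gain_percent(0.4947, 0.4527, False):6.2f}%")
    # lat_mmap, 1932M mapping (a regression, so the result is negative)
    print(f"{gain_percent(7871.0, 8778.0, False):6.2f}%")
```

Differences in the last digit are possible where a table entry was computed
from unrounded samples. The Stddev rows appear to follow the same rule as the
table they sit in, so a larger Stddev shows up as a positive figure in the
throughput tables and a negative one in the elapsed-time tables.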

 arch/tile/mm/homecache.c        |   2 +-
 fs/btrfs/extent_io.c            |  11 +-
 fs/btrfs/file.c                 |   5 +-
 fs/buffer.c                     |  21 ++-
 fs/ext4/mballoc.c               |  14 +-
 fs/f2fs/checkpoint.c            |   3 -
 fs/f2fs/node.c                  |   2 -
 fs/fuse/dev.c                   |   2 +-
 fs/fuse/file.c                  |   2 -
 fs/gfs2/aops.c                  |   1 -
 fs/gfs2/meta_io.c               |   4 +-
 fs/ntfs/attrib.c                |   1 -
 fs/ntfs/file.c                  |   1 -
 include/linux/buffer_head.h     |   5 +
 include/linux/cpuset.h          |  46 +++++
 include/linux/gfp.h             |   4 +-
 include/linux/jump_label.h      |  20 ++-
 include/linux/mmzone.h          |  21 ++-
 include/linux/page-flags.h      |  20 +++
 include/linux/pageblock-flags.h |  30 +++-
 include/linux/pagemap.h         | 115 +++++++++++-
 include/linux/swap.h            |   9 +-
 kernel/cpuset.c                 |  10 +-
 mm/filemap.c                    | 380 +++++++++++++++++++++++++---------------
 mm/page_alloc.c                 | 229 ++++++++++++++----------
 mm/shmem.c                      |   8 +-
 mm/swap.c                       |  27 ++-
 mm/swap_state.c                 |   2 +-
 mm/vmscan.c                     |   9 +-
 29 files changed, 686 insertions(+), 318 deletions(-)

--
1.8.4.5
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .

Thread overview: 103+ messages
2014-05-13 9:45 Mel Gorman [this message]
2014-05-13 9:45 ` [PATCH 01/19] mm: page_alloc: Do not update zlc unless the zlc is active Mel Gorman
2014-05-13 9:45 ` [PATCH 02/19] mm: page_alloc: Do not treat a zone that cannot be used for dirty pages as "full" Mel Gorman
2014-05-13 9:45 ` [PATCH 03/19] jump_label: Expose the reference count Mel Gorman
2014-05-13 9:45 ` [PATCH 04/19] mm: page_alloc: Use jump labels to avoid checking number_of_cpusets Mel Gorman
2014-05-13 10:58 ` Peter Zijlstra
2014-05-13 12:28 ` Mel Gorman
2014-05-13 9:45 ` [PATCH 05/19] mm: page_alloc: Calculate classzone_idx once from the zonelist ref Mel Gorman
2014-05-13 22:25 ` Andrew Morton
2014-05-14 6:32 ` Mel Gorman
2014-05-14 20:29 ` Mel Gorman
2014-05-13 9:45 ` [PATCH 06/19] mm: page_alloc: Only check the zone id check if pages are buddies Mel Gorman
2014-05-13 9:45 ` [PATCH 07/19] mm: page_alloc: Only check the alloc flags and gfp_mask for dirty once Mel Gorman
2014-05-13 9:45 ` [PATCH 08/19] mm: page_alloc: Take the ALLOC_NO_WATERMARK check out of the fast path Mel Gorman
2014-05-13 9:45 ` [PATCH 09/19] mm: page_alloc: Use word-based accesses for get/set pageblock bitmaps Mel Gorman
2014-05-22 9:24 ` Vlastimil Babka
2014-05-22 18:23 ` Andrew Morton
2014-05-22 18:45 ` Vlastimil Babka
2014-05-13 9:45 ` [PATCH 10/19] mm: page_alloc: Reduce number of times page_to_pfn is called Mel Gorman
2014-05-13 13:27 ` Vlastimil Babka
2014-05-13 14:09 ` Mel Gorman
2014-05-13 9:45 ` [PATCH 11/19] mm: page_alloc: Lookup pageblock migratetype with IRQs enabled during free Mel Gorman
2014-05-13 13:36 ` Vlastimil Babka
2014-05-13 14:23 ` Mel Gorman
2014-05-13 9:45 ` [PATCH 12/19] mm: page_alloc: Use unsigned int for order in more places Mel Gorman
2014-05-13 9:45 ` [PATCH 13/19] mm: page_alloc: Convert hot/cold parameter and immediate callers to bool Mel Gorman
2014-05-13 9:45 ` [PATCH 14/19] mm: shmem: Avoid atomic operation during shmem_getpage_gfp Mel Gorman
2014-05-13 9:45 ` [PATCH 15/19] mm: Do not use atomic operations when releasing pages Mel Gorman
2014-05-13 9:45 ` [PATCH 16/19] mm: Do not use unnecessary atomic operations when adding pages to the LRU Mel Gorman
2014-05-13 9:45 ` [PATCH 17/19] fs: buffer: Do not use unnecessary atomic operations when discarding buffers Mel Gorman
2014-05-13 11:09 ` Peter Zijlstra
2014-05-13 12:50 ` Mel Gorman
2014-05-13 13:49 ` Jan Kara
2014-05-13 14:30 ` Mel Gorman
2014-05-13 14:01 ` Peter Zijlstra
2014-05-13 14:46 ` Mel Gorman
2014-05-13 13:50 ` Jan Kara
2014-05-13 22:29 ` Andrew Morton
2014-05-14 6:12 ` Mel Gorman
2014-05-13 9:45 ` [PATCH 18/19] mm: Non-atomically mark page accessed during page cache allocation where possible Mel Gorman
2014-05-13 14:29 ` Theodore Ts'o
2014-05-20 15:49 ` [PATCH] mm: non-atomically mark page accessed during page cache allocation where possible -fix Mel Gorman
2014-05-20 19:34 ` Andrew Morton
2014-05-21 12:09 ` Mel Gorman
2014-05-21 22:11 ` Andrew Morton
2014-05-22 0:07 ` Mel Gorman
2014-05-22 5:35 ` Prabhakar Lad
2014-05-13 9:45 ` [PATCH 19/19] mm: filemap: Avoid unnecessary barries and waitqueue lookups in unlock_page fastpath Mel Gorman
2014-05-13 12:53 ` Mel Gorman
2014-05-13 14:17 ` Peter Zijlstra
2014-05-13 15:27 ` Paul E. McKenney
2014-05-13 15:44 ` Peter Zijlstra
2014-05-13 16:14 ` Paul E. McKenney
2014-05-13 18:57 ` Oleg Nesterov
2014-05-13 20:24 ` Paul E. McKenney
2014-05-14 14:25 ` Oleg Nesterov
2014-05-13 18:22 ` Oleg Nesterov
2014-05-13 18:18 ` Oleg Nesterov
2014-05-13 18:24 ` Peter Zijlstra
2014-05-13 18:52 ` Paul E. McKenney
2014-05-13 19:31 ` Oleg Nesterov
2014-05-13 20:32 ` Paul E. McKenney
2014-05-14 16:11 ` Oleg Nesterov
2014-05-14 16:17 ` Peter Zijlstra
2014-05-16 13:51 ` [PATCH 0/1] ptrace: task_clear_jobctl_trapping()->wake_up_bit() needs mb() Oleg Nesterov
2014-05-16 13:51 ` [PATCH 1/1] " Oleg Nesterov
2014-05-21 9:29 ` Peter Zijlstra
2014-05-21 19:19 ` Andrew Morton
2014-05-21 19:18 ` [PATCH 0/1] " Andrew Morton
2014-05-14 19:29 ` [PATCH 19/19] mm: filemap: Avoid unnecessary barries and waitqueue lookups in unlock_page fastpath Oleg Nesterov
2014-05-14 20:53 ` Mel Gorman
2014-05-15 10:48 ` [PATCH] mm: filemap: Avoid unnecessary barries and waitqueue lookups in unlock_page fastpath v4 Mel Gorman
2014-05-15 13:20 ` Peter Zijlstra
2014-05-15 13:29 ` Peter Zijlstra
2014-05-15 15:34 ` Oleg Nesterov
2014-05-15 15:45 ` Peter Zijlstra
2014-05-15 16:18 ` Mel Gorman
2014-05-15 15:03 ` Oleg Nesterov
2014-05-15 21:24 ` Andrew Morton
2014-05-21 12:15 ` [PATCH] mm: filemap: Avoid unnecessary barries and waitqueue lookups in unlock_page fastpath v5 Mel Gorman
2014-05-21 13:02 ` Peter Zijlstra
2014-05-21 15:33 ` Mel Gorman
2014-05-21 16:08 ` Peter Zijlstra
2014-05-21 21:26 ` Andrew Morton
2014-05-21 21:33 ` Peter Zijlstra
2014-05-21 21:50 ` Andrew Morton
2014-05-22 0:07 ` Mel Gorman
2014-05-22 7:20 ` Peter Zijlstra
2014-05-22 10:40 ` [PATCH] mm: filemap: Avoid unnecessary barriers and waitqueue lookups in unlock_page fastpath v7 Mel Gorman
2014-05-22 10:56 ` Peter Zijlstra
2014-05-22 13:00 ` Mel Gorman
2014-05-22 14:40 ` Mel Gorman
2014-05-22 15:04 ` Peter Zijlstra
2014-05-22 15:36 ` Mel Gorman
2014-05-22 16:58 ` [PATCH] mm: filemap: Avoid unnecessary barriers and waitqueue lookups in unlock_page fastpath v8 Mel Gorman
2014-05-22 6:45 ` [PATCH] mm: filemap: Avoid unnecessary barries and waitqueue lookups in unlock_page fastpath v5 Peter Zijlstra
2014-05-22 8:46 ` Mel Gorman
2014-05-22 17:47 ` Andrew Morton
2014-05-22 19:53 ` Mel Gorman
2014-05-21 23:35 ` Mel Gorman
2014-05-13 16:52 ` [PATCH 19/19] mm: filemap: Avoid unnecessary barries and waitqueue lookups in unlock_page fastpath Peter Zijlstra
2014-05-14 7:31 ` Mel Gorman
2014-05-19 8:57 ` [PATCH] mm: Avoid unnecessary atomic operations during end_page_writeback Mel Gorman