linux-mm.kvack.org archive mirror
From: Mel Gorman <mgorman@techsingularity.net>
To: LKML <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	Mel Gorman <mgorman@techsingularity.net>
Cc: Dave Chinner <david@fromorbit.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Ying Huang <ying.huang@intel.com>,
	Michal Hocko <mhocko@kernel.org>
Subject: [PATCH 4/4] mm, vmscan: Potentially stall direct reclaimers on tree_lock contention
Date: Fri,  9 Sep 2016 10:59:35 +0100
Message-ID: <1473415175-20807-5-git-send-email-mgorman@techsingularity.net>
In-Reply-To: <1473415175-20807-1-git-send-email-mgorman@techsingularity.net>

If a heavy writer of a single file is forcing contention on the tree_lock
then it may be necessary to temporarily stall the direct writer to allow
kswapd to make progress. This patch marks a pgdat congested if tree_lock
is being contended on the tail of the LRU.

On a swap-intensive workload to ramdisk, the following is observed:

usemem
                              4.8.0-rc5             4.8.0-rc5
                           waitqueue-v1      directcongest-v1
Amean    System-1      179.61 (  0.00%)      202.21 (-12.58%)
Amean    System-3       68.91 (  0.00%)      105.14 (-52.59%)
Amean    System-5       93.09 (  0.00%)       80.98 ( 13.01%)
Amean    System-7       90.98 (  0.00%)       81.07 ( 10.90%)
Amean    System-8      299.81 (  0.00%)      227.08 ( 24.26%)
Amean    Elapsd-1      210.41 (  0.00%)      236.56 (-12.43%)
Amean    Elapsd-3       33.89 (  0.00%)       46.78 (-38.06%)
Amean    Elapsd-5       25.19 (  0.00%)       23.33 (  7.38%)
Amean    Elapsd-7       18.45 (  0.00%)       17.18 (  6.91%)
Amean    Elapsd-8       48.80 (  0.00%)       38.09 ( 21.93%)
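
The bracketed percentages are not defined in this mail. The sketch below
shows one formula that matches the printed rows to within rounding of the
displayed means: the gain of the patched kernel relative to the baseline
column. The helper name relative_gain_pct and the sign convention are
assumptions inferred from the numbers, not taken from the benchmark tooling.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of the patch or of the benchmark tooling.
 * Assumed convention: a positive result means the patched kernel used
 * less time than the baseline, a negative result means it used more.
 */
static double relative_gain_pct(double baseline, double patched)
{
	assert(baseline > 0.0);	/* a zero baseline has no meaningful ratio */
	return (baseline - patched) / baseline * 100.0;
}
```

For example, relative_gain_pct(179.61, 202.21) is roughly -12.58, which is
the System-1 regression shown in the table.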

Note that system CPU usage is reduced at the higher thread counts (5, 7
and 8) but increases at 1 and 3, so it is not a universal win, and the
results are known to be highly variable. The overall time stats look like:

           4.8.0-rc5   4.8.0-rc5
        waitqueue-v1 directcongest-v1
User          462.40      468.18
System       5127.32     4875.92
Elapsed      2364.08     2539.77

It takes longer to complete but uses less system CPU. The benefit
is more noticeable with xfs_io rewriting a file backed by ramdisk:

                                                        4.8.0-rc5             4.8.0-rc5
                                                  waitqueue-v1r24   directcongest-v1r24
Amean    pwrite-single-rewrite-async-System        3.23 (  0.00%)        3.21 (  0.80%)
Amean    pwrite-single-rewrite-async-Elapsd        3.33 (  0.00%)        3.31 (  0.67%)

           4.8.0-rc5   4.8.0-rc5
        waitqueue-v1 directcongest-v1
User            8.76        9.25
System        392.31      389.10
Elapsed       406.29      403.74

As with the previous patch, a test from Dave Chinner would be necessary
to decide whether this patch is worthwhile. It seems reasonable to favour
workloads that are heavily writing files over workloads that are heavily
swapping, as the former situation is normal and reasonable while the
latter situation will never be optimal.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 mm/vmscan.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 936070b0790e..953df97abe0c 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -771,6 +771,15 @@ static unsigned long remove_mapping_list(struct list_head *mapping_list,
 					/* Stall kswapd once for 10ms on contention */
 					if (cmpxchg(&kswapd_exclusive, NUMA_NO_NODE, pgdat->node_id) != NUMA_NO_NODE) {
 						DEFINE_WAIT(wait);
+
+						/*
+						 * Tag the pgdat as congested as it may
+						 * indicate contention with a heavy
+						 * writer that should stall on
+						 * wait_iff_congested.
+						 */
+						set_bit(PGDAT_CONGESTED, &pgdat->flags);
+
 						prepare_to_wait(&kswapd_contended_wait,
 							&wait, TASK_INTERRUPTIBLE);
 						io_schedule_timeout(HZ/100);
-- 
2.6.4
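
The hunk above can be read as a two-step handshake: a reclaimer that loses
the race for the exclusive slot tags its node as congested, and a writer
that later checks the tag backs off. What follows is a minimal userspace
sketch of that idea using C11 atomics. It is a model under stated
assumptions, not kernel code: struct node_model, NO_NODE, NODE_CONGESTED,
reclaim_on_contention() and writer_should_stall() are hypothetical names
standing in for pgdat, NUMA_NO_NODE, PGDAT_CONGESTED and the kernel paths,
and the 10ms sleep done by the real code is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NO_NODE        (-1)        /* stands in for NUMA_NO_NODE */
#define NODE_CONGESTED (1u << 0)   /* stands in for PGDAT_CONGESTED */

/* Simplified stand-in for the per-node structure (pgdat in the kernel). */
struct node_model {
	int node_id;
	atomic_uint flags;
};

/* Single global slot recording which node currently holds the exclusive path. */
static atomic_int exclusive_owner = NO_NODE;

/*
 * Models the contended branch: try to claim the slot with a
 * compare-and-swap. If another node already holds it, tag this node as
 * congested and report that the caller would stall.
 */
static bool reclaim_on_contention(struct node_model *node)
{
	int expected = NO_NODE;

	if (atomic_compare_exchange_strong(&exclusive_owner, &expected,
					   node->node_id))
		return false;	/* slot claimed, no stall */

	atomic_fetch_or(&node->flags, NODE_CONGESTED);
	return true;		/* lost the race, caller would sleep here */
}

/* Models the writer-side check: back off only if the node is tagged. */
static bool writer_should_stall(struct node_model *node)
{
	return (atomic_load(&node->flags) & NODE_CONGESTED) != 0;
}
```

In this model the first node to hit contention claims the slot and is left
untagged, while a second node that arrives before the slot is released is
tagged, so only writers targeting the second node would stall. The model
never clears the tag or releases the slot; how and when the kernel does so
is outside what this patch shows.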


