From: Yuanhan Liu <yuanhan.liu@linux.intel.com>
To: Mel Gorman <mgorman@suse.de>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: performance regression due to commit e82e0561("mm: vmscan: obey proportional scanning requirements for kswapd")
Date: Tue, 18 Feb 2014 16:01:22 +0800
Message-ID: <20140218080122.GO26593@yliu-dev.sh.intel.com>
Hi,
Commit e82e0561 ("mm: vmscan: obey proportional scanning requirements for
kswapd") caused a big performance regression (73%) in the
vm-scalability/lru-file-readonce testcase on a system with 256G of memory
and no swap.

The testcase looks roughly like this:
    truncate -s 1T /tmp/vm-scalability.img
    mkfs.xfs -q /tmp/vm-scalability.img
    mount -o loop /tmp/vm-scalability.img /tmp/vm-scalability

    SPARESE_FILE="/tmp/vm-scalability/sparse-lru-file-readonce"

    for i in `seq 1 120`; do
        truncate $SPARESE_FILE-$i -s 36G
        timeout --foreground -s INT 300 dd bs=4k if=$SPARESE_FILE-$i of=/dev/null
    done

    wait
Actually, it's not the newly added code (obeying proportional scanning)
in that commit that caused the regression. Instead, it's the following
change:
    +
    +		if (nr_reclaimed < nr_to_reclaim || scan_adjusted)
    +			continue;
    +
    -		if (nr_reclaimed >= nr_to_reclaim &&
    -		    sc->priority < DEF_PRIORITY)
    +		if (global_reclaim(sc) && !current_is_kswapd())
     			break;
The difference is that, before this change, we might reclaim more than
requested in the first round of reclaim (sc->priority == DEF_PRIORITY).

So, for a testcase like lru-file-readonce, the dirty rate is high, and
reclaiming SWAP_CLUSTER_MAX (32 pages) each time is not enough to keep up
with it. Page allocations therefore stall, and performance drops:
    [plot: proc-vmstat.allocstall; parent commit flat at about 200000,
     e82e0561 at about 1.8e+06]

    [plot: vm-scalability.throughput; parent commit flat at about 2e+07,
     e82e0561 at about 7e+06 to 8e+06]
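To make the change in loop termination described above concrete, here is a minimal sketch in plain C. It is a toy model, not kernel code: model_reclaim(), its parameters, and the assumption that every scanned page gets reclaimed are all made up for illustration. Only the two exit conditions are taken from the diff quoted earlier, and the caller is assumed to be global direct reclaim.

```c
#include <assert.h>
#include <stdbool.h>

#define SWAP_CLUSTER_MAX 32UL
#define DEF_PRIORITY 12

/*
 * Toy model of the loop exit in shrink_lruvec(); not kernel code.
 * Assumptions: a single LRU list with nr_scan pages left to scan,
 * every scanned page is reclaimed, and the caller is global direct
 * reclaim (so the new check always reaches its break).
 */
unsigned long model_reclaim(unsigned long nr_scan,
			    unsigned long nr_to_reclaim,
			    int priority, bool after_e82e0561,
			    bool scan_adjusted)
{
	unsigned long nr_reclaimed = 0;

	assert(nr_to_reclaim > 0);

	while (nr_scan > 0) {
		/* Scan at most SWAP_CLUSTER_MAX pages per iteration. */
		unsigned long batch = nr_scan < SWAP_CLUSTER_MAX ?
				      nr_scan : SWAP_CLUSTER_MAX;

		nr_scan -= batch;
		nr_reclaimed += batch;

		if (after_e82e0561) {
			/* New check: loop on until the target is met. */
			if (nr_reclaimed < nr_to_reclaim || scan_adjusted)
				continue;
			/* Global direct reclaim stops at the target. */
			break;
		}

		/* Old check: never stop early in the first round. */
		if (nr_reclaimed >= nr_to_reclaim && priority < DEF_PRIORITY)
			break;
	}

	return nr_reclaimed;
}
```

In this model, with nr_scan = 1024 and nr_to_reclaim = 32 at DEF_PRIORITY, the old check reclaims all 1024 pages, while the new check stops after the first 32. Passing scan_adjusted = true makes the new check run to the end of the list again.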
I made a patch which simply keeps reclaiming more if sc->priority == DEF_PRIORITY.
I'm not sure whether it's the right way to go. Anyway, I've pasted it here for comments.
---
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 26ad67f..37004a8 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1828,7 +1828,16 @@ static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
 	unsigned long nr_reclaimed = 0;
 	unsigned long nr_to_reclaim = sc->nr_to_reclaim;
 	struct blk_plug plug;
-	bool scan_adjusted = false;
+	/*
+	 * On large memory systems, direct reclaim of SWAP_CLUSTER_MAX
+	 * pages each time may not keep up with the dirty rate in some
+	 * cases (say, vm-scalability/lru-file-readonce), which may
+	 * increase the page allocation stall latency in the end.
+	 *
+	 * Here we try to reclaim more than requested for the first round
+	 * (sc->priority == DEF_PRIORITY) to reduce such latency.
+	 */
+	bool scan_adjusted = sc->priority == DEF_PRIORITY;
 
 	get_scan_count(lruvec, sc, nr);
 
--
1.7.7.6
--yliu