From: Simon Jeons <simon.jeons@gmail.com>
To: Minchan Kim <minchan@kernel.org>
Cc: Shaohua Li <shli@kernel.org>,
lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org,
Rik van Riel <riel@redhat.com>
Subject: Re: [LSF/MM TOPIC]swap improvements for fast SSD
Date: Fri, 05 Apr 2013 08:17:00 +0800
Message-ID: <515E17FC.9050008@gmail.com>
In-Reply-To: <20130123075808.GH2723@blaptop>
Hi Minchan,
On 01/23/2013 03:58 PM, Minchan Kim wrote:
> On Tue, Jan 22, 2013 at 02:53:41PM +0800, Shaohua Li wrote:
>> Hi,
>>
>> Because of high density, low power and low price, flash storage (SSD) is a good
>> candidate to partially replace DRAM. A quick answer for this is using SSD as
>> swap. But Linux swap is designed for slow hard disk storage. There are a lot of
>> challenges to efficiently use SSD for swap:
> Many of the items below also apply to in-memory swap such as zram and zcache.
>
>> 1. Lock contentions (swap_lock, anon_vma mutex, swap address space lock)
>> 2. TLB flush overhead. To reclaim one page, we need at least 2 TLB flushes. This
>> overhead is very high even in a normal 2-socket machine.
>> 3. Better swap IO pattern. Both direct and kswapd page reclaim can do swap,
>> which makes the swap IO pattern interleaved. The block layer isn't always
>> efficient at merging requests. Such an IO pattern also makes swap prefetch hard.
> Agreed.
>
>> 4. Swap map scan overhead. Swap in-memory map scan scans an array, which is
>> very inefficient, especially if swap storage is fast.
> Agreed.
>
>> 5. SSD related optimization, mainly discard support
>> 6. Better swap prefetch algorithm. Besides item 3, sequentially accessed pages
>> aren't always in LRU list adjacently, so page reclaim will not swap such pages
>> in adjacent storage sectors. This makes swap prefetch hard.
> One of the problems is LRU churning, and I wanted to try to fix it.
> http://marc.info/?l=linux-mm&m=130978831028952&w=4
I'm interested in this feature. Why wasn't it merged? Was there a fatal
issue with your patchset?
http://lwn.net/Articles/449866/
You mentioned a test script and an all-at-once patch, but I can't get them
from the URL. Could you tell me where to find them?
>
>> 7. Alternative page reclaim policy to bias reclaiming anonymous pages.
>> Currently reclaiming anonymous pages is considered harder than reclaiming file
>> pages, so we bias toward reclaiming file pages. If there is high-speed swap
>> storage, we are considering swapping more aggressively.
> Yes, we need it. I tried it by extending vm_swappiness to 200.
>
> From: Minchan Kim <minchan@kernel.org>
> Date: Mon, 3 Dec 2012 16:21:00 +0900
> Subject: [PATCH] mm: increase swappiness to 200
>
> We have thought swap out cost is very high but it's not true
> if we use fast device like swap-over-zram. Nonetheless, we can
> swap out 1:1 ratio of anon and page cache at most.
> It's not enough to use swap device fully so we encounter OOM kill
> while there are many free space in zram swap device. It's never
> what we want.
>
> This patch makes swap out aggressively.
>
> Cc: Luigi Semenzato <semenzato@google.com>
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
> kernel/sysctl.c | 3 ++-
> mm/vmscan.c | 6 ++++--
> 2 files changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 693e0ed..f1dbd9d 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -130,6 +130,7 @@ static int __maybe_unused two = 2;
> static int __maybe_unused three = 3;
> static unsigned long one_ul = 1;
> static int one_hundred = 100;
> +extern int max_swappiness;
> #ifdef CONFIG_PRINTK
> static int ten_thousand = 10000;
> #endif
> @@ -1157,7 +1158,7 @@ static struct ctl_table vm_table[] = {
> .mode = 0644,
> .proc_handler = proc_dointvec_minmax,
> .extra1 = &zero,
> - .extra2 = &one_hundred,
> + .extra2 = &max_swappiness,
> },
> #ifdef CONFIG_HUGETLB_PAGE
> {
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 53dcde9..64f3c21 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -53,6 +53,8 @@
> #define CREATE_TRACE_POINTS
> #include <trace/events/vmscan.h>
>
> +int max_swappiness = 200;
> +
> struct scan_control {
> /* Incremented by the number of inactive pages that were scanned */
> unsigned long nr_scanned;
> @@ -1626,6 +1628,7 @@ static int vmscan_swappiness(struct scan_control *sc)
> return mem_cgroup_swappiness(sc->target_mem_cgroup);
> }
>
> +
> /*
> * Determine how aggressively the anon and file LRU lists should be
> * scanned. The relative value of each set of LRU lists is determined
> @@ -1701,11 +1704,10 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
> }
>
> /*
> - * With swappiness at 100, anonymous and file have the same priority.
> * This scanning priority is essentially the inverse of IO cost.
> */
> anon_prio = vmscan_swappiness(sc);
> - file_prio = 200 - anon_prio;
> + file_prio = max_swappiness - anon_prio;
>
> /*
> * OK, so we have swap space and a fair amount of page cache
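
To check my understanding of the get_scan_count() hunk above, here is a
minimal userspace sketch of just the priority arithmetic. It is not kernel
code: struct scan_prio and scan_prio_for() are hypothetical names I made up,
the clamp only stands in for the sysctl range check (extra1/extra2), and the
weighting by recent scan statistics that the real get_scan_count() applies
afterwards is left out.

```c
#include <assert.h>

/* Stand-in for the global the patch adds in mm/vmscan.c. */
static const int max_swappiness = 200;

/* Hypothetical container for the two scan priorities. */
struct scan_prio {
	int anon;	/* priority given to the anonymous LRU lists */
	int file;	/* priority given to the file LRU lists */
};

/*
 * Mirror the two assignments from the patched get_scan_count():
 *   anon_prio = swappiness;
 *   file_prio = max_swappiness - anon_prio;
 * The clamp is an assumption standing in for the sysctl bounds.
 */
static struct scan_prio scan_prio_for(int swappiness)
{
	struct scan_prio p;

	if (swappiness < 0)
		swappiness = 0;
	if (swappiness > max_swappiness)
		swappiness = max_swappiness;

	p.anon = swappiness;
	p.file = max_swappiness - swappiness;

	/* The two priorities always split max_swappiness between them. */
	assert(p.anon + p.file == max_swappiness);
	return p;
}
```

If I read it correctly, swappiness 100 gives anon 100 and file 100, which is
the 1:1 limit the commit message describes for the old cap, while swappiness
200 gives anon 200 and file 0, so only the anon lists carry any priority.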