From: Satoru Moriya <smoriya@redhat.com>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux.com>,
Balbir Singh <balbir@linux.vnet.ibm.com>,
linux-mm@kvack.org, akpm@linux-foundation.org, npiggin@kernel.dk,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
kamezawa.hiroyu@jp.fujitsu.com, Mel Gorman <mel@csn.ul.ie>,
Minchan Kim <minchan.kim@gmail.com>
Subject: Re: [PATCH 0/3] Unmapped page cache control (v5)
Date: Fri, 01 Apr 2011 19:10:51 -0400
Message-ID: <4D965B7B.9070208@redhat.com>
In-Reply-To: <20110401221921.A890.A69D9226@jp.fujitsu.com>
On 04/01/2011 09:17 AM, KOSAKI Motohiro wrote:
> Hi Christoph,
>
> Thanks for the long explanation.
>
>
>> On Thu, 31 Mar 2011, KOSAKI Motohiro wrote:
>>
>>> 1) Zone reclaim doesn't work if the system has multiple nodes and the
>>> workload is file-cache oriented (e.g. file server, web server, mail server, etc.),
>>> because zone reclaim frees many more pages than zone->pages_min, then
>>> new page cache requests consume the nearest node's memory, which in turn
>>> triggers the next zone reclaim. As a result, memory utilization is reduced and
>>> unnecessary LRU discards increase dramatically.
>>
>> That is only true if the webserver only allocates from a single node. If
>> the allocation load is balanced then it will be fine. It is useful to
>> reclaim pages from the node where we allocate memory since that keeps the
>> dataset node local.
>
> Why?
> Scheduler load balancing only considers CPU load, so memory
> pressure is usually not completely symmetric. That's why we get
> bug reports periodically.
Agreed. As Christoph said, if the allocation load is balanced it will be fine.
But I don't think the allocation load is always balanced.
>>> But I agree that we now have to consider a slightly larger VM change, perhaps
>>> (or perhaps not). OK, it's a good opportunity to lay some things out.
>>> Historically, Linux MM has had a "free memory is wasted memory" policy, and it
>>> worked completely fine. But now we have a few exceptions.
>>>
>>> 1) RT, embedded and finance systems. They really hope to avoid reclaim
>>> latency (i.e. avoid foreground reclaim completely) and they can accept
>>> keeping slightly more free pages before a memory shortage.
>>
>> In general we need a mechanism to ensure we can avoid reclaim during
>> critical sections of an application. So some way to give some hints to the
>> machine to free up lots of memory (/proc/sys/vm/drop_caches is far too
>> drastic) may be useful.
>
> Exactly.
> I've heard this request multiple times from finance people. And I've also
> heard the same request from bullet train control software people recently.
I completely agree with you. I have customers in both areas and they really need it
to make their critical sections deterministic.
Thanks,
Satoru