* Re: [PATCH 1/2] vmscan: promote shared file mapped pages
@ 2011-11-02 16:30 ` Johannes Weiner
0 siblings, 0 replies; 64+ messages in thread
From: Johannes Weiner @ 2011-11-02 16:30 UTC (permalink / raw)
To: Konstantin Khlebnikov
Cc: Pekka Enberg, linux-mm@kvack.org, Andrew Morton,
linux-kernel@vger.kernel.org, Wu Fengguang, KAMEZAWA Hiroyuki,
Johannes Weiner, Rik van Riel, Mel Gorman, Minchan Kim,
Gene Heskett
On Mon, Aug 08, 2011 at 04:18:11PM +0400, Konstantin Khlebnikov wrote:
> Pekka Enberg wrote:
> >Hi Konstantin,
> >
> >On Mon, Aug 8, 2011 at 2:06 PM, Konstantin Khlebnikov
> ><khlebnikov@openvz.org> wrote:
> >>Commit v2.6.33-5448-g6457474 (vmscan: detect mapped file pages used only once)
> >>greatly decreases lifetime of single-used mapped file pages.
> >>Unfortunately it also decreases life time of all shared mapped file pages.
> >>Because after commit v2.6.28-6130-gbf3f3bc (mm: don't mark_page_accessed in fault path)
> >>page-fault handler does not mark page active or even referenced.
> >>
> >>Thus page_check_references() activates file page only if it was used twice while
> >>it stays in inactive list, meanwhile it activates anon pages after first access.
> >>Inactive list can be small enough, this way reclaimer can accidentally
> >>throw away any widely used page if it wasn't used twice in short period.
> >>
> >>After this patch page_check_references() also activate file mapped page at first
> >>inactive list scan if this page is already used multiple times via several ptes.
> >>
> >>Signed-off-by: Konstantin Khlebnikov<khlebnikov@openvz.org>
> >
> >Both patches seem reasonable but the changelogs don't really explain
> >why you're doing the changes. How did you find out about the problem?
> >Is there some workload that's affected? How did you test your changes?
> >
>
> I found this while trying to fix a degradation in rhel6 (~2.6.32) relative to rhel5 (~2.6.18).
> It is a complete mess with >100 web/mail/spam/ftp containers;
> they share all their files, but there are a lot of anonymous pages:
> ~500MB of shared file-mapped memory and 15-20GB of non-shared anonymous memory.
> In this situation major page faults are very costly, because all containers share the same page.
> Under my load the kernel put disproportionate pressure on file memory compared with anonymous memory;
> they equalized only when I raised swappiness up to 150 =)
>
> These patches actually didn't help much with my problem,
> but I saw a noticeable (10-20 times) reduction in the count and average time of major page faults in file-mapped areas.
>
> Actually both patches are fixes for commit v2.6.33-5448-g6457474,
> because it was aimed at one scenario (singly used pages),
> but it breaks the logic in other scenarios (shared and/or executable pages)

I suspect that while saving shared/executable mapped file pages more
aggressively helps to some extent, the underlying problem is that we
tip the lru balance (comparing the recent_scanned/recent_rotated
ratios) in favor of file pages too much and in unexpected places.

For mapped file, we do:

    add to lru:     recent_scanned++
    cycle:          recent_scanned++
    [ activate:     recent_scanned++, recent_rotated++ ]
    [ deactivate:   recent_scanned++, recent_rotated++ ]
    reclaim:        recent_scanned++

while for anon:

    add to lru:     recent_scanned++, recent_rotated++
    reactivate:     recent_scanned++, recent_rotated++
    deactivate:     recent_scanned++, recent_rotated++
    [ activate:     recent_scanned++, recent_rotated++ ]
    [ deactivate:   recent_scanned++, recent_rotated++ ]
    reclaim:        recent_scanned++

As you can see, even a long-lived file page tips the balance to the
file list twice: on creation and during the used-once detection.  A
thrashing file working set as in Konstantin's case will actually be
seen as a lucrative source of reclaimable pages.
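
To make the effect of the recent_scanned/recent_rotated ratios concrete, here is a minimal user-space sketch of how such a ratio comparison can split scan pressure between the anon and file lists. It is loosely modelled on the kind of calculation get_scan_count() performs, but the struct, the helper name file_scan_share() and the exact weighting are assumptions made for illustration, not code quoted from mm/vmscan.c.

```c
#include <assert.h>

/* Per-LRU-type counters, named after the fields mentioned in the thread. */
struct reclaim_counters {
	unsigned long recent_scanned;
	unsigned long recent_rotated;
};

/*
 * Hypothetical helper: percentage of scan pressure aimed at the file
 * lists.  A high scanned/rotated ratio makes a list look cheap to
 * reclaim, so it attracts a larger share of the scanning.
 */
static unsigned long file_scan_share(struct reclaim_counters anon,
				     struct reclaim_counters file,
				     unsigned long swappiness)
{
	unsigned long ap, fp;

	assert(swappiness <= 200);

	/* The +1 terms avoid division by zero on fresh counters. */
	ap = (swappiness + 1) * (anon.recent_scanned + 1);
	ap /= anon.recent_rotated + 1;

	fp = (200 - swappiness + 1) * (file.recent_scanned + 1);
	fp /= file.recent_rotated + 1;

	return 100 * fp / (ap + fp + 1);
}
```

With equal counters on both lists (scanned 100, rotated 100) and swappiness 60, this model sends 69% of the pressure to the file lists. Raising the file list's recent_scanned to 300 without any extra rotations, which is what add-to-lru and inactive cycling do in the tables above, pushes that to 87%.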

Tipping the balance with each new file LRU page was meant to steer the
reclaim focus towards streaming IO pages and away from anonymous pages
but wouldn't it be easier to just not swap above a certain priority to
have the same effect?  With enough used-once file pages, we should not
reach that priority threshold.

Tipping the balance for inactive list rotation has been there from the
beginning, but I don't quite understand why.  It probably was not a
problem as the conditions for inactive cycling applied to both file
and anon equally, but with used-once detection for file and deferred
file writeback from direct reclaim, we tend to cycle more file pages
on the inactive list than anonymous ones.  Those rotated pages should
be a signal to favor file reclaim, though.

Here are three (currently under testing) RFC patches that 1. prevent
swapping above DEF_PRIORITY-2, 2. treat inactive list rotations to be
neutral wrt. the inter-LRU balance, and 3. revert the file list boost
on lru addition.

The result looks like this:

file:

    add to lru:
    [ activate:     recent_scanned++, recent_rotated++ ]
    [ deactivate:   recent_scanned++, recent_rotated++ ]
    reclaim:        recent_scanned++

mapped file:

    add to lru:
    cycle:          recent_scanned++, recent_rotated++
    [ activate:     recent_scanned++, recent_rotated++ ]
    [ deactivate:   recent_scanned++, recent_rotated++ ]
    reclaim:        recent_scanned++

anon:

    add to lru:     recent_scanned++, recent_rotated++
    reactivate:     recent_scanned++, recent_rotated++
    deactivate:     recent_scanned++, recent_rotated++
    [ activate:     recent_scanned++, recent_rotated++ ]
    [ deactivate:   recent_scanned++, recent_rotated++ ]
    reclaim:        recent_scanned++

As you can see, this still behaves under the assumption that refaults
from swap are more costly than from the fs, but we keep considering
anonymous pages when the file working set is thrashing.
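
The before/after tables can be tallied mechanically. The sketch below is a toy model only: the event enum, the account() helper and the lifecycle arrays are hypothetical names, and the optional bracketed activate/deactivate steps are left out. It counts what one page contributes to recent_scanned and recent_rotated over its mandatory lifecycle steps.

```c
#include <assert.h>
#include <stddef.h>

struct lru_stat {
	unsigned long scanned;
	unsigned long rotated;
};

enum lru_event {
	EV_NONE,		/* step leaves both counters alone */
	EV_SCANNED,		/* recent_scanned++ */
	EV_SCANNED_ROTATED,	/* recent_scanned++, recent_rotated++ */
};

/* Sum the counter contributions of a sequence of lifecycle steps. */
static struct lru_stat account(const enum lru_event *events, size_t n)
{
	struct lru_stat st = { 0, 0 };

	assert(events != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (events[i] == EV_NONE)
			continue;
		st.scanned++;
		if (events[i] == EV_SCANNED_ROTATED)
			st.rotated++;
	}
	return st;
}

/* Mapped file page: add to lru, cycle, reclaim. */
static const enum lru_event mapped_file_before[] = {
	EV_SCANNED, EV_SCANNED, EV_SCANNED,
};
static const enum lru_event mapped_file_after[] = {
	EV_NONE, EV_SCANNED_ROTATED, EV_SCANNED,
};
/* Anon page: add to lru, reactivate, deactivate, reclaim (unchanged). */
static const enum lru_event anon_lifecycle[] = {
	EV_SCANNED_ROTATED, EV_SCANNED_ROTATED, EV_SCANNED_ROTATED, EV_SCANNED,
};
```

In this model a mapped file page goes from scanned=3, rotated=0 to scanned=2, rotated=1, so it no longer looks like a pure source of cheap reclaim, while an anon page stays at scanned=4, rotated=3 in both cases.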

What do reclaim people think about this?

Konstantin, would you have the chance to try this set directly with
your affected workload if nobody spots any obvious problems?

Thanks!
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 64+ messages in thread
* [rfc 1/3] mm: vmscan: never swap under low memory pressure
2011-11-02 16:30 ` Johannes Weiner
@ 2011-11-02 16:31 ` Johannes Weiner
-1 siblings, 0 replies; 64+ messages in thread
From: Johannes Weiner @ 2011-11-02 16:31 UTC (permalink / raw)
To: Konstantin Khlebnikov
Cc: Pekka Enberg, linux-mm@kvack.org, Andrew Morton,
linux-kernel@vger.kernel.org, Wu Fengguang, KAMEZAWA Hiroyuki,
Johannes Weiner, Rik van Riel, Mel Gorman, Minchan Kim,
Gene Heskett
We want to prevent floods of used-once file cache pushing us to swap
out anonymous pages.  Never swap under a certain priority level.  The
availability of used-once cache pages should prevent us from reaching
that threshold.

This is needed because subsequent patches will revert some of the
mechanisms that tried to prefer file over anon, and this should not
result in more eager swapping again.

It might also be better to keep the aging machinery going and just not
swap, rather than staying away from anonymous pages in the first place
and having less useful age information at the time of swapout.

Signed-off-by: Johannes Weiner <jweiner@redhat.com>
---
 mm/vmscan.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index a90c603..39d3da3 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -831,6 +831,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		 * Try to allocate it some swap space here.
 		 */
 		if (PageAnon(page) && !PageSwapCache(page)) {
+			if (priority >= DEF_PRIORITY - 2)
+				goto keep_locked;
 			if (!(sc->gfp_mask & __GFP_IO))
 				goto keep_locked;
 			if (!add_to_swap(page))
--
1.7.6.4
^ permalink raw reply related [flat|nested] 64+ messages in thread
* Re: [rfc 1/3] mm: vmscan: never swap under low memory pressure
2011-11-02 16:31 ` Johannes Weiner
@ 2011-11-02 17:54 ` KOSAKI Motohiro
-1 siblings, 0 replies; 64+ messages in thread
From: KOSAKI Motohiro @ 2011-11-02 17:54 UTC (permalink / raw)
To: jweiner
Cc: khlebnikov, penberg, linux-mm, akpm, linux-kernel, fengguang.wu,
kamezawa.hiroyu, hannes, riel, mel, minchan.kim, gene.heskett
> ---
> mm/vmscan.c | 2 ++
> 1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index a90c603..39d3da3 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -831,6 +831,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> * Try to allocate it some swap space here.
> */
> if (PageAnon(page) && !PageSwapCache(page)) {
> + if (priority >= DEF_PRIORITY - 2)
> + goto keep_locked;
> if (!(sc->gfp_mask & __GFP_IO))
> goto keep_locked;
> if (!add_to_swap(page))

Hehe, I tried a very similar approach a long time ago. Unfortunately, it doesn't work.
"DEF_PRIORITY - 2" is a really poor indicator of reclaim pressure. For example, if the
machine has 1TB of memory, DEF_PRIORITY-2 means 1TB>>10 = 1GB. That's too big.
^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [rfc 1/3] mm: vmscan: never swap under low memory pressure
2011-11-02 17:54 ` KOSAKI Motohiro
@ 2011-11-03 15:51 ` Johannes Weiner
-1 siblings, 0 replies; 64+ messages in thread
From: Johannes Weiner @ 2011-11-03 15:51 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: khlebnikov, penberg, linux-mm, akpm, linux-kernel, fengguang.wu,
kamezawa.hiroyu, hannes, riel, mel, minchan.kim, gene.heskett
On Wed, Nov 02, 2011 at 10:54:23AM -0700, KOSAKI Motohiro wrote:
> > ---
> > mm/vmscan.c | 2 ++
> > 1 files changed, 2 insertions(+), 0 deletions(-)
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index a90c603..39d3da3 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -831,6 +831,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> > * Try to allocate it some swap space here.
> > */
> > if (PageAnon(page) && !PageSwapCache(page)) {
> > + if (priority >= DEF_PRIORITY - 2)
> > + goto keep_locked;
> > if (!(sc->gfp_mask & __GFP_IO))
> > goto keep_locked;
> > if (!add_to_swap(page))
>
> Hehe, i tried very similar way very long time ago. Unfortunately, it doesn't work.
> "DEF_PRIORITY - 2" is really poor indicator for reclaim pressure. example, if the
> machine have 1TB memory, DEF_PRIORITY-2 mean 1TB>>10 = 1GB. It't too big.

Do you remember what kind of tests you ran that demonstrated
misbehaviour?

We cannot reclaim anonymous pages without swapping, so the priority
cutoff applies only to inactive file pages.  If you had 1TB of
inactive file pages, the scanner would have to go through

    ((1 << (40 - 12)) >> 12) +
    ((1 << (40 - 12)) >> 11) +
    ((1 << (40 - 12)) >> 10) = 1792MB

without reclaiming SWAP_CLUSTER_MAX before it considers swapping.
That's a lot of scanning but how likely is it that you have a TB of
unreclaimable inactive cache pages?

Put into proportion, with a priority threshold of 10 a reclaimer will
look at 0.17% ((n >> 12) + (n >> 11) + (n >> 10), excluding the list
balance bias) of inactive file pages without reclaiming
SWAP_CLUSTER_MAX before it considers swapping.
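
The arithmetic above can be checked with a small user-space sketch. The helper names are made up for illustration, and the DEF_PRIORITY value of 12 and the 4KB page size are taken from the shifts used in the calculation rather than from any header.

```c
#include <assert.h>
#include <stdint.h>

#define DEF_PRIORITY	12	/* value implied by the shifts above */
#define PAGE_SHIFT	12	/* assumes 4KB pages */

/*
 * Pages scanned over all priority levels from DEF_PRIORITY down to
 * and including `threshold`, i.e. the levels at which the proposed
 * check would still refuse to swap.
 */
static uint64_t pages_scanned_before_swap(uint64_t lru_pages, int threshold)
{
	uint64_t total = 0;

	assert(threshold >= 0 && threshold <= DEF_PRIORITY);
	for (int prio = DEF_PRIORITY; prio >= threshold; prio--)
		total += lru_pages >> prio;
	return total;
}

/* Convert a page count to megabytes. */
static uint64_t pages_to_mb(uint64_t pages)
{
	return (pages << PAGE_SHIFT) >> 20;
}
```

For 1TB of inactive file pages (1 << 28 pages) and a threshold of DEF_PRIORITY - 2, this gives 458752 pages, or 1792MB, which is 7/4096 (about 0.17%) of the list.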
Currently, the list balance biasing with each newly-added file page
has much higher resistance to scan anonymous pages initially. But
once it shifted toward anon pages, all reclaimers will start swapping,
unlike the priority threshold that each reclaimer has to reach
individually. Could this have been what was causing problems for you?
^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [rfc 1/3] mm: vmscan: never swap under low memory pressure
2011-11-03 15:51 ` Johannes Weiner
@ 2011-11-08 0:16 ` KOSAKI Motohiro
-1 siblings, 0 replies; 64+ messages in thread
From: KOSAKI Motohiro @ 2011-11-08 0:16 UTC (permalink / raw)
To: jweiner
Cc: khlebnikov, penberg, linux-mm, akpm, linux-kernel, fengguang.wu,
kamezawa.hiroyu, hannes, riel, mel, minchan.kim, gene.heskett
Hi,
Sorry for the delay. I was on a trip to San Jose last week.
> On Wed, Nov 02, 2011 at 10:54:23AM -0700, KOSAKI Motohiro wrote:
>>> ---
>>> mm/vmscan.c | 2 ++
>>> 1 files changed, 2 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>> index a90c603..39d3da3 100644
>>> --- a/mm/vmscan.c
>>> +++ b/mm/vmscan.c
>>> @@ -831,6 +831,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>>> * Try to allocate it some swap space here.
>>> */
>>> if (PageAnon(page) && !PageSwapCache(page)) {
>>> + if (priority >= DEF_PRIORITY - 2)
>>> + goto keep_locked;
>>> if (!(sc->gfp_mask & __GFP_IO))
>>> goto keep_locked;
>>> if (!add_to_swap(page))
>>
>> Hehe, i tried very similar way very long time ago. Unfortunately, it doesn't work.
>> "DEF_PRIORITY - 2" is really poor indicator for reclaim pressure. example, if the
>> machine have 1TB memory, DEF_PRIORITY-2 mean 1TB>>10 = 1GB. It't too big.
>
> Do you remember what kind of tests you ran that demonstrated
> misbehaviour?
>
> We can not reclaim anonymous pages without swapping, so the priority
> cutoff applies only to inactive file pages. If you had 1TB of
> inactive file pages, the scanner would have to go through
>
> ((1 << (40 - 12)) >> 12) +
> ((1 << (40 - 12)) >> 11) +
> ((1 << (40 - 12)) >> 10) = 1792MB
>
> without reclaiming SWAP_CLUSTER_MAX before it considers swapping.
> That's a lot of scanning but how likely is it that you have a TB of
> unreclaimable inactive cache pages?

I meant that the effect of this protection depends strongly on the amount of system memory.

- system memory is plentiful:
  the protection effectively disables swap-out completely.
- system memory is not plentiful:
  the protection gives only a slight bonus against swap-out.

If people buy a new machine and move their legacy workload onto it, they
might be surprised by a big change in behavior. I'm worried about that.
That's why I dislike a DEF_PRIORITY-based heuristic.

> Put into proportion, with a priority threshold of 10 a reclaimer will
> look at 0.17% ((n >> 12) + (n >> 11) + (n >> 10) (excluding the list
> balance bias) of inactive file pages without reclaiming
> SWAP_CLUSTER_MAX before it considers swapping.

Moreover, I think we need a more precise analysis of why unnecessary swapout
happened: which factor is dominant, and when it occurs.

> Currently, the list balance biasing with each newly-added file page
> has much higher resistance to scan anonymous pages initially. But
> once it shifted toward anon pages, all reclaimers will start swapping,
> unlike the priority threshold that each reclaimer has to reach
> individually. Could this have been what was causing problems for you?

Um. Currently the number of flusher threads is controlled by the kernel. But
the number of swap-out threads isn't limited at all. So our swapout often
works too aggressively. I think we need to fix it.
^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [rfc 1/3] mm: vmscan: never swap under low memory pressure
2011-11-02 16:31 ` Johannes Weiner
@ 2011-11-07 2:29 ` KAMEZAWA Hiroyuki
-1 siblings, 0 replies; 64+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-11-07 2:29 UTC (permalink / raw)
To: Johannes Weiner
Cc: Konstantin Khlebnikov, Pekka Enberg, linux-mm@kvack.org,
Andrew Morton, linux-kernel@vger.kernel.org, Wu Fengguang,
Johannes Weiner, Rik van Riel, Mel Gorman, Minchan Kim,
Gene Heskett
On Wed, 2 Nov 2011 17:31:41 +0100
Johannes Weiner <jweiner@redhat.com> wrote:
> We want to prevent floods of used-once file cache pushing us to swap
> out anonymous pages. Never swap under a certain priority level. The
> availability of used-once cache pages should prevent us from reaching
> that threshold.
>
> This is needed because subsequent patches will revert some of the
> mechanisms that tried to prefer file over anon, and this should not
> result in more eager swapping again.
>
> It might also be better to keep the aging machinery going and just not
> swap, rather than staying away from anonymous pages in the first place
> and having less useful age information at the time of swapout.
>
> Signed-off-by: Johannes Weiner <jweiner@redhat.com>
> ---
> mm/vmscan.c | 2 ++
> 1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index a90c603..39d3da3 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -831,6 +831,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> * Try to allocate it some swap space here.
> */
> if (PageAnon(page) && !PageSwapCache(page)) {
> + if (priority >= DEF_PRIORITY - 2)
> + goto keep_locked;
> if (!(sc->gfp_mask & __GFP_IO))
> goto keep_locked;
> if (!add_to_swap(page))

Hm, how about not scanning LRU_ANON at all rather than checking here?
Add some bias to get_scan_count() or something like that.
If you think LRU rotation is needed, only kswapd should do that.
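
One way to read this suggestion is sketched below. It is purely hypothetical: the helper anon_scan_target(), its parameters and the choice to reuse the DEF_PRIORITY - 2 cutoff are assumptions for illustration, not taken from any posted patch or from get_scan_count() itself. Direct reclaimers skip the anon lists entirely under low pressure, while kswapd keeps scanning them so that aging information stays fresh.

```c
#include <assert.h>
#include <stdbool.h>

#define DEF_PRIORITY	12	/* value implied by the thread's arithmetic */

/*
 * Hypothetical bias: number of anon pages to scan at this priority.
 * Under low pressure (priority >= DEF_PRIORITY - 2) only kswapd
 * scans, and thereby rotates, the anon lists.
 */
static unsigned long anon_scan_target(unsigned long anon_lru_pages,
				      int priority, bool is_kswapd)
{
	assert(priority >= 0 && priority <= DEF_PRIORITY);

	if (!is_kswapd && priority >= DEF_PRIORITY - 2)
		return 0;

	return anon_lru_pages >> priority;
}
```

With 1 << 20 anon pages, a direct reclaimer at priority 12 would scan nothing, kswapd at the same priority would scan 256 pages, and a direct reclaimer that has dropped to priority 9 would scan 2048.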
Thanks,
-Kame
^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [rfc 1/3] mm: vmscan: never swap under low memory pressure
2011-11-07 2:29 ` KAMEZAWA Hiroyuki
@ 2011-11-10 15:29 ` Johannes Weiner
-1 siblings, 0 replies; 64+ messages in thread
From: Johannes Weiner @ 2011-11-10 15:29 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: Konstantin Khlebnikov, Pekka Enberg, linux-mm@kvack.org,
Andrew Morton, linux-kernel@vger.kernel.org, Wu Fengguang,
Johannes Weiner, Rik van Riel, Mel Gorman, Minchan Kim,
Gene Heskett
On Mon, Nov 07, 2011 at 11:29:41AM +0900, KAMEZAWA Hiroyuki wrote:
> On Wed, 2 Nov 2011 17:31:41 +0100
> Johannes Weiner <jweiner@redhat.com> wrote:
>
> > We want to prevent floods of used-once file cache pushing us to swap
> > out anonymous pages. Never swap under a certain priority level. The
> > availability of used-once cache pages should prevent us from reaching
> > that threshold.
> >
> > This is needed because subsequent patches will revert some of the
> > mechanisms that tried to prefer file over anon, and this should not
> > result in more eager swapping again.
> >
> > It might also be better to keep the aging machinery going and just not
> > swap, rather than staying away from anonymous pages in the first place
> > and having less useful age information at the time of swapout.
> >
> > Signed-off-by: Johannes Weiner <jweiner@redhat.com>
> > ---
> > mm/vmscan.c | 2 ++
> > 1 files changed, 2 insertions(+), 0 deletions(-)
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index a90c603..39d3da3 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -831,6 +831,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> > * Try to allocate it some swap space here.
> > */
> > if (PageAnon(page) && !PageSwapCache(page)) {
> > + if (priority >= DEF_PRIORITY - 2)
> > + goto keep_locked;
> > if (!(sc->gfp_mask & __GFP_IO))
> > goto keep_locked;
> > if (!add_to_swap(page))
>
> Hm, how about not scanning LRU_ANON rather than checking here ?
> Add some bias to get_scan_count() or some..
> If you think to need rotation of LRU, only kswapd should do that..

Absolutely, it would require more tuning.  This patch was really a
'hey, how about we do something like this?  anyone tried that before?'

I'll keep those things in mind if I pursue this further, thanks.
^ permalink raw reply [flat|nested] 64+ messages in thread
* [rfc 2/3] mm: vmscan: treat inactive cycling as neutral
2011-11-02 16:30 ` Johannes Weiner
@ 2011-11-02 16:32 ` Johannes Weiner
-1 siblings, 0 replies; 64+ messages in thread
From: Johannes Weiner @ 2011-11-02 16:32 UTC (permalink / raw)
To: Konstantin Khlebnikov
Cc: Pekka Enberg, linux-mm@kvack.org, Andrew Morton,
linux-kernel@vger.kernel.org, Wu Fengguang, KAMEZAWA Hiroyuki,
Johannes Weiner, Rik van Riel, Mel Gorman, Minchan Kim,
Gene Heskett
Each page that is scanned but put back to the inactive list is counted
as a successful reclaim, which tips the balance between file and anon
lists more towards the cycling list.
This does - in my opinion - not make too much sense, but at the same
time it was not much of a problem, as the conditions that lead to an
inactive list cycle were mostly temporary - locked page, concurrent
page table changes, backing device congested - or at least limited to
a single reclaimer that was not allowed to unmap or meddle with IO.
More important than being moderately rare, those conditions should
apply to both anon and mapped file pages equally and balance out in
the end.
Recently, we started cycling file pages in particular on the inactive
list much more aggressively, for used-once detection of mapped pages,
and when avoiding writeback from direct reclaim.
Those rotated pages do not exactly speak for the reclaimability of the
list they sit on and we risk putting immense pressure on file list for
no good reason.
Instead, count each page not reclaimed and put back to any list,
active or inactive, as rotated, so they are neutral with respect to
the scan/rotate ratio of the list class, as they should be.
Signed-off-by: Johannes Weiner <jweiner@redhat.com>
---
mm/vmscan.c | 9 ++++-----
1 files changed, 4 insertions(+), 5 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 39d3da3..6da66a7 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1360,7 +1360,9 @@ putback_lru_pages(struct zone *zone, struct scan_control *sc,
 	 */
 	spin_lock(&zone->lru_lock);
 	while (!list_empty(page_list)) {
+		int file;
 		int lru;
+
 		page = lru_to_page(page_list);
 		VM_BUG_ON(PageLRU(page));
 		list_del(&page->lru);
@@ -1373,11 +1375,8 @@ putback_lru_pages(struct zone *zone, struct scan_control *sc,
 		SetPageLRU(page);
 		lru = page_lru(page);
 		add_page_to_lru_list(zone, page, lru);
-		if (is_active_lru(lru)) {
-			int file = is_file_lru(lru);
-			int numpages = hpage_nr_pages(page);
-			reclaim_stat->recent_rotated[file] += numpages;
-		}
+		file = is_file_lru(lru);
+		reclaim_stat->recent_rotated[file] += hpage_nr_pages(page);
 		if (!pagevec_add(&pvec, page)) {
 			spin_unlock_irq(&zone->lru_lock);
 			__pagevec_release(&pvec);
--
1.7.6.4
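To make the scan/rotate argument above concrete, here is a small worked sketch. It is not part of the patch. It assumes the per-list pressure is computed roughly as in get_scan_count() of that era, (prio + 1) * (recent_scanned + 1) / (recent_rotated + 1), and that file_prio is 140 (200 minus the default swappiness of 60). The function name scan_pressure() is hypothetical.

```c
#include <assert.h>

/*
 * Hypothetical model of the per-list pressure term, assumed to follow
 * get_scan_count(): a list whose scans are rarely offset by rotations
 * looks cheap to reclaim from and attracts more scanning.
 */
static unsigned long long scan_pressure(unsigned long long prio,
					unsigned long long recent_scanned,
					unsigned long long recent_rotated)
{
	unsigned long long pressure = (prio + 1) * (recent_scanned + 1);

	/* The +1 terms avoid division by zero on a fresh zone. */
	return pressure / (recent_rotated + 1);
}
```

With 1000 file pages scanned and all of them cycled back to the inactive list, the old accounting leaves recent_rotated at 0 and the term evaluates to 141141. Counting each put-back page as rotated gives 141 * 1001 / 1001 = 141, so the cycling no longer inflates pressure on the file list.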
^ permalink raw reply related [flat|nested] 64+ messages in thread
* Re: [rfc 2/3] mm: vmscan: treat inactive cycling as neutral
2011-11-02 16:32 ` Johannes Weiner
@ 2011-11-02 18:04 ` KOSAKI Motohiro
-1 siblings, 0 replies; 64+ messages in thread
From: KOSAKI Motohiro @ 2011-11-02 18:04 UTC (permalink / raw)
To: jweiner
Cc: khlebnikov, penberg, linux-mm, akpm, linux-kernel, fengguang.wu,
kamezawa.hiroyu, hannes, riel, mel, minchan.kim, gene.heskett
(11/2/2011 9:32 AM), Johannes Weiner wrote:
> Each page that is scanned but put back to the inactive list is counted
> as a successful reclaim, which tips the balance between file and anon
> lists more towards the cycling list.
>
> This does - in my opinion - not make too much sense, but at the same
> time it was not much of a problem, as the conditions that lead to an
> inactive list cycle were mostly temporary - locked page, concurrent
> page table changes, backing device congested - or at least limited to
> a single reclaimer that was not allowed to unmap or meddle with IO.
> More important than being moderately rare, those conditions should
> apply to both anon and mapped file pages equally and balance out in
> the end.
>
> Recently, we started cycling file pages in particular on the inactive
> list much more aggressively, for used-once detection of mapped pages,
> and when avoiding writeback from direct reclaim.
>
> Those rotated pages do not exactly speak for the reclaimability of the
> list they sit on and we risk putting immense pressure on file list for
> no good reason.
>
> Instead, count each page not reclaimed and put back to any list,
> active or inactive, as rotated, so they are neutral with respect to
> the scan/rotate ratio of the list class, as they should be.
>
> Signed-off-by: Johannes Weiner <jweiner@redhat.com>
> ---
> mm/vmscan.c | 9 ++++-----
> 1 files changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 39d3da3..6da66a7 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1360,7 +1360,9 @@ putback_lru_pages(struct zone *zone, struct scan_control *sc,
> */
> spin_lock(&zone->lru_lock);
> while (!list_empty(page_list)) {
> + int file;
> int lru;
> +
> page = lru_to_page(page_list);
> VM_BUG_ON(PageLRU(page));
> list_del(&page->lru);
> @@ -1373,11 +1375,8 @@ putback_lru_pages(struct zone *zone, struct scan_control *sc,
> SetPageLRU(page);
> lru = page_lru(page);
> add_page_to_lru_list(zone, page, lru);
> - if (is_active_lru(lru)) {
> - int file = is_file_lru(lru);
> - int numpages = hpage_nr_pages(page);
> - reclaim_stat->recent_rotated[file] += numpages;
> - }
> + file = is_file_lru(lru);
> + reclaim_stat->recent_rotated[file] += hpage_nr_pages(page);
> if (!pagevec_add(&pvec, page)) {
> spin_unlock_irq(&zone->lru_lock);
> __pagevec_release(&pvec);
In the case of avoiding writeback from direct reclaim, I think we shouldn't
increase recent_rotated, because the VM decided "the page should be evicted,
but eviction should be delayed". I'm not sure whether it's a minor factor or not.
^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [rfc 2/3] mm: vmscan: treat inactive cycling as neutral
2011-11-02 18:04 ` KOSAKI Motohiro
@ 2011-11-03 12:49 ` Johannes Weiner
-1 siblings, 0 replies; 64+ messages in thread
From: Johannes Weiner @ 2011-11-03 12:49 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: khlebnikov, penberg, linux-mm, akpm, linux-kernel, fengguang.wu,
kamezawa.hiroyu, hannes, riel, mel, minchan.kim, gene.heskett
On Wed, Nov 02, 2011 at 11:04:30AM -0700, KOSAKI Motohiro wrote:
> (11/2/2011 9:32 AM), Johannes Weiner wrote:
> > Each page that is scanned but put back to the inactive list is counted
> > as a successful reclaim, which tips the balance between file and anon
> > lists more towards the cycling list.
> >
> > This does - in my opinion - not make too much sense, but at the same
> > time it was not much of a problem, as the conditions that lead to an
> > inactive list cycle were mostly temporary - locked page, concurrent
> > page table changes, backing device congested - or at least limited to
> > a single reclaimer that was not allowed to unmap or meddle with IO.
> > More important than being moderately rare, those conditions should
> > apply to both anon and mapped file pages equally and balance out in
> > the end.
> >
> > Recently, we started cycling file pages in particular on the inactive
> > list much more aggressively, for used-once detection of mapped pages,
> > and when avoiding writeback from direct reclaim.
> >
> > Those rotated pages do not exactly speak for the reclaimability of the
> > list they sit on and we risk putting immense pressure on file list for
> > no good reason.
> >
> > Instead, count each page not reclaimed and put back to any list,
> > active or inactive, as rotated, so they are neutral with respect to
> > the scan/rotate ratio of the list class, as they should be.
> >
> > Signed-off-by: Johannes Weiner <jweiner@redhat.com>
> > ---
> > mm/vmscan.c | 9 ++++-----
> > 1 files changed, 4 insertions(+), 5 deletions(-)
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 39d3da3..6da66a7 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -1360,7 +1360,9 @@ putback_lru_pages(struct zone *zone, struct scan_control *sc,
> > */
> > spin_lock(&zone->lru_lock);
> > while (!list_empty(page_list)) {
> > + int file;
> > int lru;
> > +
> > page = lru_to_page(page_list);
> > VM_BUG_ON(PageLRU(page));
> > list_del(&page->lru);
> > @@ -1373,11 +1375,8 @@ putback_lru_pages(struct zone *zone, struct scan_control *sc,
> > SetPageLRU(page);
> > lru = page_lru(page);
> > add_page_to_lru_list(zone, page, lru);
> > - if (is_active_lru(lru)) {
> > - int file = is_file_lru(lru);
> > - int numpages = hpage_nr_pages(page);
> > - reclaim_stat->recent_rotated[file] += numpages;
> > - }
> > + file = is_file_lru(lru);
> > + reclaim_stat->recent_rotated[file] += hpage_nr_pages(page);
> > if (!pagevec_add(&pvec, page)) {
> > spin_unlock_irq(&zone->lru_lock);
> > __pagevec_release(&pvec);
>
> In the case of avoiding writeback from direct reclaim, I think we shouldn't
> increase recent_rotated, because the VM decided "the page should be evicted,
> but eviction should be delayed". I'm not sure whether it's a minor factor or not.
But we DO increase recent_scanned another time when the page is
reclaimed on the next round.
If we don't increase recent_rotated for deferred reclaims, they are
counted as success twice and so considered more valuable than
immediate reclaims. I don't think that makes sense.
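The double counting can be spelled out with a toy model of one page's passes through the inactive list. This is an illustration only, not kernel code: struct counts and both helpers are hypothetical, and "apparent successes" is shorthand for scans that are not offset by a rotation.

```c
#include <assert.h>
#include <stdbool.h>

struct counts {
	unsigned long scanned;
	unsigned long rotated;
};

/* Hypothetical model: the scans it takes to reclaim a single page. */
static struct counts reclaim_one_page(bool deferred,
				      bool count_putback_as_rotated)
{
	struct counts c = { 0, 0 };

	if (deferred) {
		c.scanned++;		/* first pass: page is put back */
		if (count_putback_as_rotated)
			c.rotated++;
	}
	c.scanned++;			/* pass on which the page is reclaimed */
	return c;
}

/* Scans not offset by a rotation, i.e. what the balancing sees as success. */
static unsigned long apparent_successes(struct counts c)
{
	return c.scanned - c.rotated;
}
```

An immediately reclaimed page yields one apparent success. A deferred page yields two if the put-back is not counted as rotated, and one if it is, which matches the immediate case.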
^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [rfc 2/3] mm: vmscan: treat inactive cycling as neutral
2011-11-02 16:32 ` Johannes Weiner
@ 2011-11-07 2:34 ` KAMEZAWA Hiroyuki
-1 siblings, 0 replies; 64+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-11-07 2:34 UTC (permalink / raw)
To: Johannes Weiner
Cc: Konstantin Khlebnikov, Pekka Enberg, linux-mm@kvack.org,
Andrew Morton, linux-kernel@vger.kernel.org, Wu Fengguang,
Johannes Weiner, Rik van Riel, Mel Gorman, Minchan Kim,
Gene Heskett
On Wed, 2 Nov 2011 17:32:13 +0100
Johannes Weiner <jweiner@redhat.com> wrote:
> Each page that is scanned but put back to the inactive list is counted
> as a successful reclaim, which tips the balance between file and anon
> lists more towards the cycling list.
>
> This does - in my opinion - not make too much sense, but at the same
> time it was not much of a problem, as the conditions that lead to an
> inactive list cycle were mostly temporary - locked page, concurrent
> page table changes, backing device congested - or at least limited to
> a single reclaimer that was not allowed to unmap or meddle with IO.
> More important than being moderately rare, those conditions should
> apply to both anon and mapped file pages equally and balance out in
> the end.
>
> Recently, we started cycling file pages in particular on the inactive
> list much more aggressively, for used-once detection of mapped pages,
> and when avoiding writeback from direct reclaim.
>
> Those rotated pages do not exactly speak for the reclaimability of the
> list they sit on and we risk putting immense pressure on file list for
> no good reason.
>
> Instead, count each page not reclaimed and put back to any list,
> active or inactive, as rotated, so they are neutral with respect to
> the scan/rotate ratio of the list class, as they should be.
>
> Signed-off-by: Johannes Weiner <jweiner@redhat.com>
I think this makes sense.
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
I wonder if it may be better to have a victim list for written-back pages.
^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [rfc 2/3] mm: vmscan: treat inactive cycling as neutral
2011-11-07 2:34 ` KAMEZAWA Hiroyuki
@ 2011-11-10 16:06 ` Johannes Weiner
-1 siblings, 0 replies; 64+ messages in thread
From: Johannes Weiner @ 2011-11-10 16:06 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: Konstantin Khlebnikov, Pekka Enberg, linux-mm@kvack.org,
Andrew Morton, linux-kernel@vger.kernel.org, Wu Fengguang,
Johannes Weiner, Rik van Riel, Mel Gorman, Minchan Kim,
Gene Heskett
On Mon, Nov 07, 2011 at 11:34:17AM +0900, KAMEZAWA Hiroyuki wrote:
> On Wed, 2 Nov 2011 17:32:13 +0100
> Johannes Weiner <jweiner@redhat.com> wrote:
>
> > Each page that is scanned but put back to the inactive list is counted
> > as a successful reclaim, which tips the balance between file and anon
> > lists more towards the cycling list.
> >
> > This does - in my opinion - not make too much sense, but at the same
> > time it was not much of a problem, as the conditions that lead to an
> > inactive list cycle were mostly temporary - locked page, concurrent
> > page table changes, backing device congested - or at least limited to
> > a single reclaimer that was not allowed to unmap or meddle with IO.
> > More important than being moderately rare, those conditions should
> > apply to both anon and mapped file pages equally and balance out in
> > the end.
> >
> > Recently, we started cycling file pages in particular on the inactive
> > list much more aggressively, for used-once detection of mapped pages,
> > and when avoiding writeback from direct reclaim.
> >
> > Those rotated pages do not exactly speak for the reclaimability of the
> > list they sit on and we risk putting immense pressure on file list for
> > no good reason.
> >
> > Instead, count each page not reclaimed and put back to any list,
> > active or inactive, as rotated, so they are neutral with respect to
> > the scan/rotate ratio of the list class, as they should be.
> >
> > Signed-off-by: Johannes Weiner <jweiner@redhat.com>
>
> I think this makes sense.
>
> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>
> I wonder it may be better to have victim list for written-backed pages..
Do you mean an extra LRU list that holds dirty pages?
^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [rfc 2/3] mm: vmscan: treat inactive cycling as neutral
2011-11-10 16:06 ` Johannes Weiner
@ 2011-11-11 0:05 ` KAMEZAWA Hiroyuki
-1 siblings, 0 replies; 64+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-11-11 0:05 UTC (permalink / raw)
To: Johannes Weiner
Cc: Konstantin Khlebnikov, Pekka Enberg, linux-mm@kvack.org,
Andrew Morton, linux-kernel@vger.kernel.org, Wu Fengguang,
Johannes Weiner, Rik van Riel, Mel Gorman, Minchan Kim,
Gene Heskett
On Thu, 10 Nov 2011 17:06:28 +0100
Johannes Weiner <jweiner@redhat.com> wrote:
> On Mon, Nov 07, 2011 at 11:34:17AM +0900, KAMEZAWA Hiroyuki wrote:
> > On Wed, 2 Nov 2011 17:32:13 +0100
> > Johannes Weiner <jweiner@redhat.com> wrote:
> >
> > > Each page that is scanned but put back to the inactive list is counted
> > > as a successful reclaim, which tips the balance between file and anon
> > > lists more towards the cycling list.
> > >
> > > This does - in my opinion - not make too much sense, but at the same
> > > time it was not much of a problem, as the conditions that lead to an
> > > inactive list cycle were mostly temporary - locked page, concurrent
> > > page table changes, backing device congested - or at least limited to
> > > a single reclaimer that was not allowed to unmap or meddle with IO.
> > > More important than being moderately rare, those conditions should
> > > apply to both anon and mapped file pages equally and balance out in
> > > the end.
> > >
> > > Recently, we started cycling file pages in particular on the inactive
> > > list much more aggressively, for used-once detection of mapped pages,
> > > and when avoiding writeback from direct reclaim.
> > >
> > > Those rotated pages do not exactly speak for the reclaimability of the
> > > list they sit on and we risk putting immense pressure on file list for
> > > no good reason.
> > >
> > > Instead, count each page not reclaimed and put back to any list,
> > > active or inactive, as rotated, so they are neutral with respect to
> > > the scan/rotate ratio of the list class, as they should be.
> > >
> > > Signed-off-by: Johannes Weiner <jweiner@redhat.com>
> >
> > I think this makes sense.
> >
> > Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> >
> > I wonder it may be better to have victim list for written-backed pages..
>
> Do you mean an extra LRU list that holds dirty pages?
An extra LRU for pages that have PG_reclaim set?
Thanks,
-Kame
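
The accounting change described in the quoted changelog can be sketched in userspace as follows. This is only a model under stated assumptions, not the kernel code: the names lru_counters, scan_batch, putback_policy and account_batch are hypothetical, and the actual patch body is not shown in this part of the thread. The model assumes each scanned page bumps a "scanned" counter, and the only difference between the two policies is whether pages put back on the inactive list also bump the "rotated" counter.

```c
/* Hypothetical userspace model of the per-list-class reclaim counters. */
struct lru_counters {
	unsigned long recent_scanned;
	unsigned long recent_rotated;
};

/* Outcome of scanning one batch taken off an inactive list (hypothetical). */
struct scan_batch {
	unsigned long nr_reclaimed;        /* pages actually freed */
	unsigned long nr_activated;        /* put back on the active list */
	unsigned long nr_putback_inactive; /* put back on the inactive list */
};

enum putback_policy {
	PUTBACK_COUNTS_AS_RECLAIM, /* behaviour the changelog criticises */
	PUTBACK_COUNTS_AS_ROTATED, /* behaviour the changelog proposes */
};

/* Account one scanned batch under the given policy. */
static void account_batch(struct lru_counters *c, const struct scan_batch *b,
			  enum putback_policy policy)
{
	/* Every page taken off the list was scanned, whatever happened next. */
	c->recent_scanned += b->nr_reclaimed + b->nr_activated +
			     b->nr_putback_inactive;

	/* Activated pages count as rotated under both policies. */
	c->recent_rotated += b->nr_activated;

	/* Proposed: pages cycled back to the inactive list are rotated too. */
	if (policy == PUTBACK_COUNTS_AS_ROTATED)
		c->recent_rotated += b->nr_putback_inactive;
}
```

For a batch with 4 pages reclaimed, 8 activated and 20 put back on the inactive list, the first policy leaves 24 of the 32 scanned pages looking like successful reclaim, while the second leaves only the 4 that were really freed.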
^ permalink raw reply [flat|nested] 64+ messages in thread
* [rfc 3/3] mm: vmscan: revert file list boost on lru addition
2011-11-02 16:30 ` Johannes Weiner
@ 2011-11-02 16:32 ` Johannes Weiner
-1 siblings, 0 replies; 64+ messages in thread
From: Johannes Weiner @ 2011-11-02 16:32 UTC (permalink / raw)
To: Konstantin Khlebnikov
Cc: Pekka Enberg, linux-mm@kvack.org, Andrew Morton,
linux-kernel@vger.kernel.org, Wu Fengguang, KAMEZAWA Hiroyuki,
Johannes Weiner, Rik van Riel, Mel Gorman, Minchan Kim,
Gene Heskett
The idea in 9ff473b 'vmscan: evict streaming IO first' was to steer
reclaim focus onto file pages with every new file page that hits the
lru list, so that an influx of used-once file pages does not lead to
swapping of anonymous pages.

The problem is that nobody is fixing up the balance if the pages in
fact become part of the resident set.

Anonymous page creation is neutral to the inter-lru balance, so even a
comparably tiny number of heavily used file pages tip the balance in
favor of the file list.

In addition, there is no refault detection, and every refault will
bias the balance even more. A thrashing file working set will be
mistaken for a very lucrative source of reclaimable pages.

As anonymous pages are no longer swapped above a certain priority
level, this mechanism is no longer needed. Used-once file pages
should get reclaimed before the VM even considers swapping.

Signed-off-by: Johannes Weiner <jweiner@redhat.com>
---
mm/swap.c | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)
diff --git a/mm/swap.c b/mm/swap.c
index 3a442f1..33e5387 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -683,7 +683,6 @@ static void ____pagevec_lru_add_fn(struct page *page, void *arg)
 	SetPageLRU(page);
 	if (active)
 		SetPageActive(page);
-	update_page_reclaim_stat(zone, page, file, active);
 	add_page_to_lru_list(zone, page, lru);
 }
--
1.7.6.4
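
To illustrate why the removed call steers reclaim toward the file lists, here is a minimal userspace sketch. It assumes, as a simplification, that each LRU addition is counted as one scanned page, that it also counts as rotated only when the page goes to an active list, and that the pressure on a list class grows with (scanned + 1) / (rotated + 1) scaled by a priority weight. The names reclaim_stat_model, model_lru_add_boost and model_pressure are hypothetical and exist only for this example.

```c
/* Hypothetical userspace model; index 0 = anon, index 1 = file. */
struct reclaim_stat_model {
	unsigned long recent_scanned[2];
	unsigned long recent_rotated[2];
};

/*
 * Models the bookkeeping done at LRU-add time before this patch:
 * the new page counts as scanned, and as rotated only if it is
 * added to an active list.
 */
static void model_lru_add_boost(struct reclaim_stat_model *rs,
				int file, int active)
{
	rs->recent_scanned[file]++;
	if (active)
		rs->recent_rotated[file]++;
}

/*
 * Relative reclaim pressure on one list class: the more pages were
 * scanned without being rotated, the more attractive the class looks.
 * Integer division, so the result is coarse.
 */
static unsigned long model_pressure(const struct reclaim_stat_model *rs,
				    int file, unsigned long prio)
{
	return (prio + 1) * (rs->recent_scanned[file] + 1) /
	       (rs->recent_rotated[file] + 1);
}
```

In this model, adding file pages to the inactive list raises only the scanned counter, so the file pressure keeps climbing whether or not those pages later turn out to be part of the resident set. Pages added straight to an active list bump both counters at once.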
^ permalink raw reply related [flat|nested] 64+ messages in thread
* Re: [rfc 3/3] mm: vmscan: revert file list boost on lru addition
2011-11-02 16:32 ` Johannes Weiner
@ 2011-11-07 2:45 ` KAMEZAWA Hiroyuki
-1 siblings, 0 replies; 64+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-11-07 2:45 UTC (permalink / raw)
To: Johannes Weiner
Cc: Konstantin Khlebnikov, Pekka Enberg, linux-mm@kvack.org,
Andrew Morton, linux-kernel@vger.kernel.org, Wu Fengguang,
Johannes Weiner, Rik van Riel, Mel Gorman, Minchan Kim,
Gene Heskett
On Wed, 2 Nov 2011 17:32:47 +0100
Johannes Weiner <jweiner@redhat.com> wrote:
> The idea in 9ff473b 'vmscan: evict streaming IO first' was to steer
> reclaim focus onto file pages with every new file page that hits the
> lru list, so that an influx of used-once file pages does not lead to
> swapping of anonymous pages.
>
> The problem is that nobody is fixing up the balance if the pages in
> fact become part of the resident set.
>
> Anonymous page creation is neutral to the inter-lru balance, so even a
> comparably tiny number of heavily used file pages tip the balance in
> favor of the file list.
>
> In addition, there is no refault detection, and every refault will
> bias the balance even more. A thrashing file working set will be
> mistaken for a very lucrative source of reclaimable pages.
>
> As anonymous pages are no longer swapped above a certain priority
> level, this mechanism is no longer needed. Used-once file pages
> should get reclaimed before the VM even considers swapping.
>
> Signed-off-by: Johannes Weiner <jweiner@redhat.com>
Do you have some results ?
Thanks,
-Kame
^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [rfc 3/3] mm: vmscan: revert file list boost on lru addition
2011-11-07 2:45 ` KAMEZAWA Hiroyuki
@ 2011-11-10 16:12 ` Johannes Weiner
-1 siblings, 0 replies; 64+ messages in thread
From: Johannes Weiner @ 2011-11-10 16:12 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: Konstantin Khlebnikov, Pekka Enberg, linux-mm@kvack.org,
Andrew Morton, linux-kernel@vger.kernel.org, Wu Fengguang,
Johannes Weiner, Rik van Riel, Mel Gorman, Minchan Kim,
Gene Heskett
On Mon, Nov 07, 2011 at 11:45:20AM +0900, KAMEZAWA Hiroyuki wrote:
> On Wed, 2 Nov 2011 17:32:47 +0100
> Johannes Weiner <jweiner@redhat.com> wrote:
>
> > The idea in 9ff473b 'vmscan: evict streaming IO first' was to steer
> > reclaim focus onto file pages with every new file page that hits the
> > lru list, so that an influx of used-once file pages does not lead to
> > swapping of anonymous pages.
> >
> > The problem is that nobody is fixing up the balance if the pages in
> > fact become part of the resident set.
> >
> > Anonymous page creation is neutral to the inter-lru balance, so even a
> > comparably tiny number of heavily used file pages tip the balance in
> > favor of the file list.
> >
> > In addition, there is no refault detection, and every refault will
> > bias the balance even more. A thrashing file working set will be
> > mistaken for a very lucrative source of reclaimable pages.
> >
> > As anonymous pages are no longer swapped above a certain priority
> > level, this mechanism is no longer needed. Used-once file pages
> > should get reclaimed before the VM even considers swapping.
> >
> > Signed-off-by: Johannes Weiner <jweiner@redhat.com>
>
> Do you have some results ?
Not yet, sorry, I had to drop it all and do something else.
This change relies on the VM having a different mechanism to go for
one-shot file cache first, so I need to address Kosaki-san's concerns
about 1/3 before pursuing this patch.
^ permalink raw reply [flat|nested] 64+ messages in thread
* Re: [PATCH 1/2] vmscan: promote shared file mapped pages
2011-11-02 16:30 ` Johannes Weiner
@ 2011-11-02 16:35 ` Johannes Weiner
-1 siblings, 0 replies; 64+ messages in thread
From: Johannes Weiner @ 2011-11-02 16:35 UTC (permalink / raw)
To: Johannes Weiner
Cc: Konstantin Khlebnikov, Pekka Enberg, linux-mm@kvack.org,
Andrew Morton, linux-kernel@vger.kernel.org, Wu Fengguang,
KAMEZAWA Hiroyuki, Rik van Riel, Mel Gorman, Minchan Kim,
Gene Heskett
On Wed, Nov 02, 2011 at 05:30:56PM +0100, Johannes Weiner wrote:
> Tipping the balance for inactive list rotation has been there from the
> beginning, but I don't quite understand why. It probably was not a
> problem as the conditions for inactive cycling applied to both file
> and anon equally, but with used-once detection for file and deferred
> file writeback from direct reclaim, we tend to cycle more file pages
> on the inactive list than anonymous ones. Those rotated pages should
> be a signal to favor file reclaim, though.
[...] should NOT be a signal [...]
obviously. Sorry.
^ permalink raw reply [flat|nested] 64+ messages in thread