From: Rik van Riel <riel@redhat.com>
To: Ying Han <yinghan@google.com>
Cc: linux-mm@kvack.org, hannes@cmpxchg.org, mhocko@suse.cz,
Mel Gorman <mel@csn.ul.ie>
Subject: Re: [RFC][PATCH -mm 1/3] mm,vmscan: track recent pressure on each LRU set
Date: Fri, 10 Aug 2012 11:48:21 -0400 [thread overview]
Message-ID: <50252D45.50308@redhat.com> (raw)
In-Reply-To: <CALWz4ixBJNu8s9irH9G8O=vMQo1JAzG-jLhOfH4Zbod2EWM-6g@mail.gmail.com>
On 08/09/2012 09:22 PM, Ying Han wrote:
>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> index f222e06..b03be69 100644
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -189,12 +189,20 @@ struct zone_reclaim_stat {
>> * The pageout code in vmscan.c keeps track of how many of the
>> * mem/swap backed and file backed pages are referenced.
>> * The higher the rotated/scanned ratio, the more valuable
>> - * that cache is.
>> + * that cache is. These numbers are aged separately for each LRU.
>> *
>> * The anon LRU stats live in [0], file LRU stats in [1]
>> */
>> unsigned long recent_rotated[2];
>> unsigned long recent_scanned[2];
>> + /*
>> + * This number is incremented together with recent_rotated,
>
> s/recent_rotated/recent/scanned.
>
> I assume the idea here is to associate the scanned to the amount of
> pressure applied on the list.
Indeed. Pageout scanning equals pressure :)
>> +/*
>> + * Ensure that the ->recent_pressure statistics for this lruvec are
>> + * aged to the same degree as those elsewhere in the system, before
>> + * we do reclaim on this lruvec or evaluate its reclaim priority.
>> + */
>> +static DEFINE_SPINLOCK(recent_pressure_lock);
>> +static int recent_pressure_seq;
>> +static void age_recent_pressure(struct lruvec *lruvec, struct zone *zone)
>> +{
>> + struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
>> + unsigned long anon = get_lru_size(lruvec, LRU_ACTIVE_ANON) +
>> + get_lru_size(lruvec, LRU_INACTIVE_ANON);
>> + unsigned long file = get_lru_size(lruvec, LRU_ACTIVE_FILE) +
>> + get_lru_size(lruvec, LRU_INACTIVE_FILE);
>> + int shift;
>> +
>> + /*
>> + * Do not bother recalculating unless we are behind with the
>> + * system wide statistics, or our local recent_pressure numbers
>> + * have grown too large. We have to keep the number somewhat
>> + * small, to ensure that reclaim_score returns non-zero.
>> + */
>> + if (reclaim_stat->recent_pressure_seq != recent_pressure_seq &&
>> + reclaim_stat->recent_pressure[0] < anon / 4 &&
>> + reclaim_stat->recent_pressure[1] < file / 4)
>> + return;
>
> Let's see if I understand the logic here:
>
> If updating the reclaim_stat->recent_pressure for this lruvec is
> falling behind, don't bother to update it unless the scan count grows
> fast enough. When that happens, recent_pressure is adjusted based on
> the gap between the global pressure level (recent_pressure_seq) and
> local pressure level (reclaim_stat->recent_pressure_seq). The lager
> the gap, the more pressure applied on the lruvec.
>
> 1. if the usage activity(scan_count) is always low on a lruvec, the
> pressure will be low.
The scan count being low, indicates that the lruvec has seen little
pageout pressure recently.
This is essentially unrelated to how actively programs are using
(touching) the pages that are sitting on the lists in this lruvec.
> 2. if the usage activity is low for a while, and then when the
> scan_count jumps suddenly, it will cause the pressure to jump as well
> 3. if the usage activity is always high, the pressure will be high .
>
> So, the mechanism here is a way to balance the system pressure across
> lruvec over time?
Yes.
>> +
>> + spin_lock(&recent_pressure_lock);
>> + /*
>> + * If we are aging due to local activity, increment the global
>> + * sequence counter. Leave the global counter alone if we are
>> + * merely playing catchup.
>> + */
>> + if (reclaim_stat->recent_pressure_seq == recent_pressure_seq)
>> + recent_pressure_seq++;
>> + shift = recent_pressure_seq - reclaim_stat->recent_pressure_seq;
>> + shift = min(shift, (BITS_PER_LONG-1));
>> + reclaim_stat->recent_pressure_seq = recent_pressure_seq;
>> + spin_unlock(&recent_pressure_lock);
>> +
>> + /* For every aging interval, do one division by two. */
>> + spin_lock_irq(&zone->lru_lock);
>> + reclaim_stat->recent_pressure[0] >>= shift;
>> + reclaim_stat->recent_pressure[1] >>= shift;
>
> This is a bit confusing. I would assume the bigger the shift, the less
> pressure it causes. However, the end result is the other way around.
The longer ago it has been since this lruvec was last scanned by
the page reclaim code, the more the pressure is aged.
The less recent pressure an lruvec has recently seen, the more
likely it is that it will be scanned in the future (see patch 2/3).
Is there anything that I could explain better?
--
All rights reversed
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2012-08-10 15:48 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-08-08 21:45 [RFC][PATCH -mm 0/3] mm,vmscan: reclaim from highest score cgroup Rik van Riel
2012-08-08 21:47 ` [RFC][PATCH -mm 1/3] mm,vmscan: track recent pressure on each LRU set Rik van Riel
2012-08-10 1:22 ` Ying Han
2012-08-10 1:23 ` Ying Han
2012-08-10 15:48 ` Rik van Riel [this message]
2012-08-08 21:48 ` [RFC][PATCH -mm 2/3] mm,vmscan: reclaim from highest score cgroups Rik van Riel
2012-08-14 19:19 ` Rafael Aquini
2012-08-08 21:49 ` [RFC][PATCH -mm 3/3] mm,vmscan: evict inactive file pages first Rik van Riel
2012-08-12 23:56 ` Rafael Aquini
2012-08-13 2:13 ` [RFC][PATCH -mm -v2 " Rik van Riel
2012-08-14 19:11 ` Rafael Aquini
2012-08-09 1:02 ` [RFC][PATCH -mm 0/3] mm,vmscan: reclaim from highest score cgroup Ying Han
2012-08-09 22:23 ` Ying Han
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=50252D45.50308@redhat.com \
--to=riel@redhat.com \
--cc=hannes@cmpxchg.org \
--cc=linux-mm@kvack.org \
--cc=mel@csn.ul.ie \
--cc=mhocko@suse.cz \
--cc=yinghan@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).