From: Mel Gorman <mel@csn.ul.ie>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Shaohua Li <shaohua.li@intel.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Christoph Lameter <cl@linux.com>,
David Rientjes <rientjes@google.com>,
LKML <linux-kernel@vger.kernel.org>,
Linux-MM <linux-mm@kvack.org>
Subject: Re: [PATCH 1/2] mm: page allocator: Adjust the per-cpu counter threshold when memory is low
Date: Thu, 28 Oct 2010 10:49:03 +0100
Message-ID: <20101028094903.GC4896@csn.ul.ie>
In-Reply-To: <20101028100920.5d4ce413.kamezawa.hiroyu@jp.fujitsu.com>

On Thu, Oct 28, 2010 at 10:09:20AM +0900, KAMEZAWA Hiroyuki wrote:
> On Wed, 27 Oct 2010 09:47:35 +0100
> Mel Gorman <mel@csn.ul.ie> wrote:
>
> > Commit [aa45484: calculate a better estimate of NR_FREE_PAGES when
> > memory is low] noted that watermarks were based on the vmstat
> > NR_FREE_PAGES. To avoid synchronization overhead, these counters are
> > maintained on a per-cpu basis and drained both periodically and when a
> > per-cpu delta is above a threshold. On large CPU systems, the difference
> > between the estimate and real value of NR_FREE_PAGES can be very high.
> > The system can get into a case where pages are allocated far below the
> > min watermark potentially causing livelock issues. The commit solved the
> > problem by taking a better reading of NR_FREE_PAGES when memory was low.
> >
> > <SNIP>
> >
> > diff --git a/mm/vmstat.c b/mm/vmstat.c
> > index 355a9e6..cafcc2d 100644
> > --- a/mm/vmstat.c
> > +++ b/mm/vmstat.c
> > @@ -81,6 +81,12 @@ EXPORT_SYMBOL(vm_stat);
> >
> > #ifdef CONFIG_SMP
> >
> > +static int calculate_pressure_threshold(struct zone *zone)
> > +{
> > + return max(1, (int)((high_wmark_pages(zone) - low_wmark_pages(zone) /
> > + num_online_cpus())));
> > +}
> > +
>
> Could you add background theory of this calculation as a comment to
> show the difference with calculate_threshold() ?
>
Sure. When writing it, I realised that the calculations here differ from
what percpu_drift_mark does. This is what I currently have

int calculate_pressure_threshold(struct zone *zone)
{
	int threshold;
	int watermark_distance;

	/*
	 * As vmstats are not up to date, there is drift between the estimated
	 * and real values. For high thresholds and a high number of CPUs, it
	 * is possible for the min watermark to be breached while the estimated
	 * value looks fine. The pressure threshold is a reduced value such
	 * that even the maximum amount of drift will not accidentally breach
	 * the min watermark
	 */
	watermark_distance = low_wmark_pages(zone) - min_wmark_pages(zone);
	threshold = max(1, watermark_distance / num_online_cpus());

	/*
	 * Maximum threshold is 125
	 */
	threshold = min(125, threshold);

	return threshold;
}

Is this better?

> And don't we need to have "max=125" thresh here ?
>
Yes.
>
> > static int calculate_threshold(struct zone *zone)
> > {
> > int threshold;
> > @@ -159,6 +165,44 @@ static void refresh_zone_stat_thresholds(void)
> > }
> > }
> >
> > +void reduce_pgdat_percpu_threshold(pg_data_t *pgdat)
> > +{
> > + struct zone *zone;
> > + int cpu;
> > + int threshold;
> > + int i;
> > +
>
> get_online_cpus();
>
Also correct.
Thanks very much. I'm revising the series.
--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab