From: Andi Kleen <andi@firstfloor.org>
To: Rik van Riel <riel@redhat.com>
Cc: lwoodman@redhat.com, kosaki.motohiro@jp.fujitsu.com,
akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, aarcange@redhat.com
Subject: Re: [PATCH] vmscan: limit concurrent reclaimers in shrink_zone
Date: Mon, 14 Dec 2009 14:08:09 +0100
Message-ID: <87pr6hya86.fsf@basil.nowhere.org>
In-Reply-To: <20091210185626.26f9828a@cuia.bos.redhat.com> (Rik van Riel's message of "Thu, 10 Dec 2009 18:56:26 -0500")
Rik van Riel <riel@redhat.com> writes:
> +max_zone_concurrent_reclaim:
> +
> +The number of processes that are allowed to simultaneously reclaim
> +memory from a particular memory zone.
> +
> +With certain workloads, hundreds of processes end up in the page
> +reclaim code simultaneously. This can cause large slowdowns due
> +to lock contention, freeing of way too much memory and occasionally
> +false OOM kills.
> +
> +To avoid these problems, only allow a smaller number of processes
> +to reclaim pages from each memory zone simultaneously.
> +
> +The default value is 8.
I don't like the hardcoded number. Is the same number good for a 128MB
embedded system as for a 1TB server? Seems doubtful.
Perhaps this should be scaled with memory size and number of CPUs?
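A rough userspace sketch of what such scaling could look like. Everything here is an assumption for illustration and not part of the patch: the function name, the "one reclaimer per GiB of zone memory" rule, the CPU cap, and the clamp bounds are all made up.

```c
#include <assert.h>

/* Hypothetical clamp bounds; the patch only has the fixed value 8. */
#define RECLAIMERS_MIN 2
#define RECLAIMERS_MAX 64

/*
 * Hypothetical heuristic: allow one reclaimer per GiB of zone memory,
 * never more than the number of online CPUs, clamped to
 * [RECLAIMERS_MIN, RECLAIMERS_MAX].
 */
static int scaled_reclaim_limit(unsigned long long zone_bytes, int online_cpus)
{
	unsigned long long gib = zone_bytes >> 30;
	int limit;

	assert(online_cpus > 0);

	/* Cap before narrowing to int so huge zones cannot overflow. */
	limit = gib > RECLAIMERS_MAX ? RECLAIMERS_MAX : (int)gib;
	if (limit > online_cpus)
		limit = online_cpus;
	if (limit < RECLAIMERS_MIN)
		limit = RECLAIMERS_MIN;
	return limit;
}
```

With these made-up constants a 128MB single-CPU box ends up at the floor, while a 1TB machine with many CPUs ends up at the ceiling. Whether any such formula beats a tunable default would need measurements.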
> +/*
> + * Maximum number of processes concurrently running the page
> + * reclaim code in a memory zone. Having too many processes
> + * just results in them burning CPU time waiting for locks,
> + * so we're better off limiting page reclaim to a sane number
> + * of processes at a time. We do this per zone so local node
> + * reclaim on one NUMA node will not block other nodes from
> + * making progress.
> + */
> +int max_zone_concurrent_reclaimers = 8;
__read_mostly
> +
> static LIST_HEAD(shrinker_list);
> static DECLARE_RWSEM(shrinker_rwsem);
>
> @@ -1600,6 +1612,29 @@ static void shrink_zone(int priority, struct zone *zone,
> struct zone_reclaim_stat *reclaim_stat = get_reclaim_stat(zone, sc);
> int noswap = 0;
>
> + if (!current_is_kswapd() && atomic_read(&zone->concurrent_reclaimers) >
> + max_zone_concurrent_reclaimers) {
> + /*
> + * Do not add to the lock contention if this zone has
> + * enough processes doing page reclaim already, since
> + * we would just make things slower.
> + */
> + sleep_on(&zone->reclaim_wait);
wait_event()? sleep_on() is a deprecated, racy interface.
This would still badly thunder the herd if not enough memory is freed,
won't it? It would be better to wake up only a single process when memory gets freed.
How about doing a wake-up of one thread for each page freed?
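To illustrate the two points above, here is a userspace analogy using POSIX threads, not kernel code. The struct and function names are hypothetical. The waiter re-checks its condition in a loop while holding the lock, which is the property sleep_on() lacks, and the waker signals one waiter instead of broadcasting to all of them. For simplicity it wakes one waiter per reclaimer that leaves rather than one per page freed.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the per-zone reclaim state. */
struct reclaim_gate {
	pthread_mutex_t lock;
	pthread_cond_t  wait;
	int active;	/* reclaimers currently inside */
	int limit;	/* maximum allowed at once */
};

static void reclaim_gate_init(struct reclaim_gate *g, int limit)
{
	assert(limit > 0);
	pthread_mutex_init(&g->lock, NULL);
	pthread_cond_init(&g->wait, NULL);
	g->active = 0;
	g->limit = limit;
}

/* Block until there is room, then count ourselves in. */
static void reclaim_enter(struct reclaim_gate *g)
{
	pthread_mutex_lock(&g->lock);
	/*
	 * The condition is tested under the lock and re-tested after
	 * every wakeup, so a wakeup that arrives between the check and
	 * the sleep cannot be lost.
	 */
	while (g->active >= g->limit)
		pthread_cond_wait(&g->wait, &g->lock);
	g->active++;
	pthread_mutex_unlock(&g->lock);
}

/* Leave and hand the free slot to a single waiter. */
static void reclaim_exit(struct reclaim_gate *g)
{
	pthread_mutex_lock(&g->lock);
	g->active--;
	/* Signal, not broadcast: avoid waking the whole herd. */
	pthread_cond_signal(&g->wait);
	pthread_mutex_unlock(&g->lock);
}
```

A reclaimer would bracket its work with reclaim_enter() and reclaim_exit(). In the kernel the same shape would presumably be wait_event() on the zone's wait queue with an exclusive wait, so that wake_up() wakes only one sleeper; that mapping is an assumption here, not something taken from the patch.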
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
Thread overview:
2009-12-10 23:56 [PATCH] vmscan: limit concurrent reclaimers in shrink_zone Rik van Riel
2009-12-11 2:03 ` Minchan Kim
2009-12-11 3:19 ` Rik van Riel
2009-12-11 3:43 ` Minchan Kim
2009-12-11 12:07 ` Larry Woodman
2009-12-11 13:41 ` Minchan Kim
2009-12-11 13:51 ` Rik van Riel
2009-12-11 14:08 ` Minchan Kim
2009-12-11 13:48 ` Rik van Riel
2009-12-11 21:24 ` Rik van Riel
2009-12-11 11:49 ` Larry Woodman
2009-12-14 13:08 ` Andi Kleen [this message]
2009-12-14 14:23 ` Larry Woodman
2009-12-14 16:19 ` Andi Kleen
2009-12-14 14:40 ` Rik van Riel
2009-12-14 13:14 ` Christoph Hellwig
2009-12-14 14:22 ` Larry Woodman
2009-12-14 14:52 ` Rik van Riel