From: Vladimir Davydov <vdavydov@parallels.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
devel@openvz.org, Mel Gorman <mgorman@suse.de>,
Michal Hocko <mhocko@suse.cz>,
Johannes Weiner <hannes@cmpxchg.org>,
Rik van Riel <riel@redhat.com>,
Dave Chinner <dchinner@redhat.com>,
Glauber Costa <glommer@gmail.com>
Subject: Re: [PATCH 1/5] mm: vmscan: shrink all slab objects if tight on memory
Date: Thu, 16 Jan 2014 12:50:51 +0400 [thread overview]
Message-ID: <52D79D6B.10304@parallels.com> (raw)
In-Reply-To: <20140115145327.6aae2e13a9a8bba619923ac9@linux-foundation.org>
On 01/16/2014 02:53 AM, Andrew Morton wrote:
> On Wed, 15 Jan 2014 19:55:11 +0400 Vladimir Davydov <vdavydov@parallels.com> wrote:
>
>>> We could avoid the "scan 32 then scan just 1" issue with something like
>>>
>>> if (total_scan > batch_size)
>>> total_scan %= batch_size;
>>>
>>> before the loop. But I expect the effects of that will be unmeasurable
>>> - on average the number of objects which are scanned in the final pass
>>> of the loop will be batch_size/2, yes? That's still a decent amount.
>> Let me try to summarize. We want to scan batch_size objects in one pass,
>> not more (to keep latency low) and not less (to avoid cpu cache
>> pollution due to too frequent calls); if the calculated value of
>> nr_to_scan is less than the batch_size we should accumulate it in
>> nr_deferred instead of calling ->scan() and add nr_deferred to
>> nr_to_scan on the next pass, i.e. in pseudo-code:
>>
>> /* calculate current nr_to_scan */
>> max_pass = shrinker->count();
>> delta = max_pass * nr_user_pages_scanned / nr_user_pages;
>>
>> /* add nr_deferred */
>> total_scan = delta + nr_deferred;
>>
>> while (total_scan >= batch_size) {
>> shrinker->scan(batch_size);
>> total_scan -= batch_size;
>> }
>>
>> /* save the remainder to nr_deferred */
>> nr_deferred = total_scan;
>>
>> That would work, but if max_pass is < batch_size, it would not scan the
>> objects immediately even if prio is high (we want to scan all objects).
> Yes, that's a problem.
>
>> For example, dropping caches would not work on the first attempt - the
>> user would have to call it batch_size / max_pass times.
> And we do want drop_caches to work immediately.
>
>> This could be
>> fixed by making the code proceed to ->scan() not only if total_scan is
>>> = batch_size, but also if max_pass is < batch_size and total_scan is >=
>> max_pass, i.e.
>>
>> while (total_scan >= batch_size ||
>> (max_pass < batch_size && total_scan >= max_pass)) ...
>>
>> which is equivalent to
>>
>> while (total_scan >= batch_size ||
>> total_scan >= max_pass) ...
>>
>> The latter is the loop condition from the current patch, i.e. this patch
>> would make the trick if shrink_slab() followed the pseudo-code above. In
>> real life, it does not actually - we have to bias total_scan before the
>> while loop in order to avoid dropping fs meta caches on light memory
>> pressure due to a large number being built in nr_deferred:
>>
>> if (delta < max_pass / 4)
>> total_scan = min(total_scan, max_pass / 2);
> Oh, is that what's it's for. Where did you discover this gem?
WARNING: multiple messages have this Message-ID (diff)
From: Vladimir Davydov <vdavydov@parallels.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: <linux-kernel@vger.kernel.org>, <linux-mm@kvack.org>,
<devel@openvz.org>, Mel Gorman <mgorman@suse.de>,
Michal Hocko <mhocko@suse.cz>,
Johannes Weiner <hannes@cmpxchg.org>,
Rik van Riel <riel@redhat.com>,
Dave Chinner <dchinner@redhat.com>,
Glauber Costa <glommer@gmail.com>
Subject: Re: [PATCH 1/5] mm: vmscan: shrink all slab objects if tight on memory
Date: Thu, 16 Jan 2014 12:50:51 +0400 [thread overview]
Message-ID: <52D79D6B.10304@parallels.com> (raw)
In-Reply-To: <20140115145327.6aae2e13a9a8bba619923ac9@linux-foundation.org>
On 01/16/2014 02:53 AM, Andrew Morton wrote:
> On Wed, 15 Jan 2014 19:55:11 +0400 Vladimir Davydov <vdavydov@parallels.com> wrote:
>
>>> We could avoid the "scan 32 then scan just 1" issue with something like
>>>
>>> if (total_scan > batch_size)
>>> total_scan %= batch_size;
>>>
>>> before the loop. But I expect the effects of that will be unmeasurable
>>> - on average the number of objects which are scanned in the final pass
>>> of the loop will be batch_size/2, yes? That's still a decent amount.
>> Let me try to summarize. We want to scan batch_size objects in one pass,
>> not more (to keep latency low) and not less (to avoid cpu cache
>> pollution due to too frequent calls); if the calculated value of
>> nr_to_scan is less than the batch_size we should accumulate it in
>> nr_deferred instead of calling ->scan() and add nr_deferred to
>> nr_to_scan on the next pass, i.e. in pseudo-code:
>>
>> /* calculate current nr_to_scan */
>> max_pass = shrinker->count();
>> delta = max_pass * nr_user_pages_scanned / nr_user_pages;
>>
>> /* add nr_deferred */
>> total_scan = delta + nr_deferred;
>>
>> while (total_scan >= batch_size) {
>> shrinker->scan(batch_size);
>> total_scan -= batch_size;
>> }
>>
>> /* save the remainder to nr_deferred */
>> nr_deferred = total_scan;
>>
>> That would work, but if max_pass is < batch_size, it would not scan the
>> objects immediately even if prio is high (we want to scan all objects).
> Yes, that's a problem.
>
>> For example, dropping caches would not work on the first attempt - the
>> user would have to call it batch_size / max_pass times.
> And we do want drop_caches to work immediately.
>
>> This could be
>> fixed by making the code proceed to ->scan() not only if total_scan is
>>> = batch_size, but also if max_pass is < batch_size and total_scan is >=
>> max_pass, i.e.
>>
>> while (total_scan >= batch_size ||
>> (max_pass < batch_size && total_scan >= max_pass)) ...
>>
>> which is equivalent to
>>
>> while (total_scan >= batch_size ||
>> total_scan >= max_pass) ...
>>
>> The latter is the loop condition from the current patch, i.e. this patch
>> would make the trick if shrink_slab() followed the pseudo-code above. In
>> real life, it does not actually - we have to bias total_scan before the
>> while loop in order to avoid dropping fs meta caches on light memory
>> pressure due to a large number being built in nr_deferred:
>>
>> if (delta < max_pass / 4)
>> total_scan = min(total_scan, max_pass / 2);
> Oh, is that what's it's for. Where did you discover this gem?
>From 3567b59aa80ac4417002bf58e35dce5c777d4164 Mon Sep 17 00:00:00 2001
From: Dave Chinner <dchinner@redhat.com>
Date: Fri, 8 Jul 2011 14:14:36 +1000
Subject: [PATCH] vmscan: reduce wind up shrinker->nr when shrinker can't do
work
>> while (total_scan >= batch_size) ...
>>
>> With this biasing, it is impossible to achieve the ideal behavior I've
>> described above, because we will never accumulate max_pass objects in
>> nr_deferred if memory pressure is low. So, if applied to the real code,
>> this patch takes on a slightly different sense, which I tried to reflect
>> in the comment to the code: it will call ->scan() with nr_to_scan <
>> batch_size only if:
>>
>> 1) max_pass < batch_size && total_scan >= max_pass
>>
>> and
>>
>> 2) we're tight on memory, i.e. the current delta is high (otherwise
>> total_scan will be biased as max_pass / 2 and condition 1 won't be
>> satisfied).
> (is max_pass misnamed?)
Yes, the name is misleading. I guess, it should be called freeable_cnt,
because we actually scan up to 2*max_pass objects in one pass.
>> >From our discussion it seems condition 2 is not necessary at all, but it
>> follows directly from the biasing rule. So I propose to tweak the
>> biasing a bit so that total_scan won't be lowered < batch_size:
>>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index eea668d..78ddd5e 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -267,7 +267,7 @@ shrink_slab_node(struct shrink_control *shrinkctl,
>> struct shrinker *shrinker,
>> * a large delta change is calculated directly.
>> */
>> if (delta < max_pass / 4)
>> - total_scan = min(total_scan, max_pass / 2);
>> + total_scan = min(total_scan, max(max_pass / 2, batch_size));
>>
>> /*
>> * Avoid risking looping forever due to too large nr value:
>> @@ -281,7 +281,7 @@ shrink_slab_node(struct shrink_control *shrinkctl,
>> struct shrinker *shrinker,
>> nr_pages_scanned, lru_pages,
>> max_pass, delta, total_scan);
>>
>> - while (total_scan >= batch_size) {
>> + while (total_scan >= batch_size || total_scan >= max_pass) {
>> unsigned long ret;
>>
>> shrinkctl->nr_to_scan = batch_size;
>>
>> The first hunk guarantees that total_scan will always accumulate at
>> least batch_size objects no matter how small max_pass is. That means
>> that when max_pass is < batch_size we will eventually get >= max_pass
>> objects to scan and shrink the slab to 0 as we need. What do you think
>> about that?
> I'm a bit lost :(
I'll try to clean up shrink_slab_node() and resend the patch then.
Thanks.
next prev parent reply other threads:[~2014-01-16 8:51 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-01-11 12:36 [PATCH 1/5] mm: vmscan: shrink all slab objects if tight on memory Vladimir Davydov
2014-01-11 12:36 ` Vladimir Davydov
2014-01-11 12:36 ` [PATCH 2/5] mm: vmscan: call NUMA-unaware shrinkers irrespective of nodemask Vladimir Davydov
2014-01-11 12:36 ` Vladimir Davydov
2014-01-11 12:36 ` [PATCH 3/5] mm: vmscan: respect NUMA policy mask when shrinking slab on direct reclaim Vladimir Davydov
2014-01-11 12:36 ` Vladimir Davydov
2014-01-13 23:11 ` Andrew Morton
2014-01-13 23:11 ` Andrew Morton
2014-01-14 6:56 ` Vladimir Davydov
2014-01-14 6:56 ` Vladimir Davydov
2014-01-11 12:36 ` [PATCH 4/5] mm: vmscan: move call to shrink_slab() to shrink_zones() Vladimir Davydov
2014-01-11 12:36 ` Vladimir Davydov
2014-01-13 23:13 ` Andrew Morton
2014-01-13 23:13 ` Andrew Morton
2014-01-14 6:53 ` Vladimir Davydov
2014-01-14 6:53 ` Vladimir Davydov
2014-01-11 12:36 ` [PATCH 5/5] mm: vmscan: remove shrink_control arg from do_try_to_free_pages() Vladimir Davydov
2014-01-11 12:36 ` Vladimir Davydov
2014-01-13 23:05 ` [PATCH 1/5] mm: vmscan: shrink all slab objects if tight on memory Andrew Morton
2014-01-13 23:05 ` Andrew Morton
2014-01-14 7:23 ` Vladimir Davydov
2014-01-14 7:23 ` Vladimir Davydov
2014-01-14 22:14 ` Andrew Morton
2014-01-14 22:14 ` Andrew Morton
2014-01-15 8:47 ` Vladimir Davydov
2014-01-15 8:47 ` Vladimir Davydov
2014-01-15 9:25 ` Andrew Morton
2014-01-15 9:25 ` Andrew Morton
2014-01-15 15:55 ` Vladimir Davydov
2014-01-15 15:55 ` Vladimir Davydov
2014-01-15 22:53 ` Andrew Morton
2014-01-15 22:53 ` Andrew Morton
2014-01-16 8:50 ` Vladimir Davydov [this message]
2014-01-16 8:50 ` Vladimir Davydov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=52D79D6B.10304@parallels.com \
--to=vdavydov@parallels.com \
--cc=akpm@linux-foundation.org \
--cc=dchinner@redhat.com \
--cc=devel@openvz.org \
--cc=glommer@gmail.com \
--cc=hannes@cmpxchg.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=mhocko@suse.cz \
--cc=riel@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.