All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vladimir Davydov <vdavydov@parallels.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	devel@openvz.org, Mel Gorman <mgorman@suse.de>,
	Michal Hocko <mhocko@suse.cz>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Rik van Riel <riel@redhat.com>,
	Dave Chinner <dchinner@redhat.com>,
	Glauber Costa <glommer@gmail.com>
Subject: Re: [PATCH 2/3] mm: vmscan: get rid of DEFAULT_SEEKS and document shrink_slab logic
Date: Thu, 6 Feb 2014 22:51:21 +0400	[thread overview]
Message-ID: <52F3D9A9.1070107@parallels.com> (raw)
In-Reply-To: <20140205125230.e1705369abcb634ddf141008@linux-foundation.org>

On 02/06/2014 12:52 AM, Andrew Morton wrote:
> On Wed, 5 Feb 2014 11:16:49 +0400 Vladimir Davydov <vdavydov@parallels.com> wrote:
>
>>> So why did I originally make DEFAULT_SEEKS=2?  Because I figured that to
>>> recreate (say) an inode would require a seek to the inode data then a
>>> seek back.  Is it legitimate to include the
>>> seek-back-to-what-you-were-doing-before seek in the cost of an inode
>>> reclaim?  I guess so...
>> Hmm, that explains this 2. Since we typically don't need to "seek back"
>> when recreating a cache page, as they are usually read in bunches by
>> readahead, the number of seeks to bring back a user page is 1, while the
>> number of seeks to recreate an average inode is 2, right?
> Sounds right to me.
>
>> Then to scan inodes and user pages so that they would generate
>> approximately the same number of seeks, we should calculate the number
>> of objects to scan as follows:
>>
>> nr_objects_to_scan = nr_pages_scanned / lru_pages *
>>                                         nr_freeable_objects /
>> shrinker->seeks
>>
>> where shrinker->seeks = DEFAULT_SEEKS = 2 for inodes.
> hm, I wonder if we should take the size of the object into account. 
> Should we be maximizing (memory-reclaimed / seeks-to-reestablish-it).

I'm not sure I understand you quite right. You mean that if two slab
caches have obj sizes 1k and 2k and both of them need 2 seeks to
recreate an object, we should scan the 1k (or 2k?) slab cache more
aggressively than the 2k one? Hmm... I don't know. It depends on what we
want to achieve. But this won't balance the seeks, which is our goal for
now, IIUC.

>> But currently we
>> have four times that. I can explain why we should multiply this by 2 -
>> we do not count pages moving from active to inactive lrus in
>> nr_pages_scanned, and 2*nr_pages_scanned can be a good approximation for
>> that - but I have no idea why we multiply it by 4...
> I don't understand this code at all:
>
> 	total_scan = nr;
> 	delta = (4 * nr_pages_scanned) / shrinker->seeks;
> 	delta *= freeable;
> 	do_div(delta, lru_pages + 1);
> 	total_scan += delta;
>
> If it actually makes any sense, it sorely sorely needs documentation.

To find its roots I had to checkout the linux history tree:

commit c3f4656118a78c1c294e0b4d338ac946265a822b
Author: Andrew Morton <akpm@osdl.org>
Date:   Mon Dec 29 23:48:44 2003 -0800

    [PATCH] shrink_slab acounts for seeks incorrectly
   
    wli points out that shrink_slab inverts the sense of
shrinker->seeks: those
    caches which require more seeks to reestablish an object are shrunk
harder.
    That's wrong - they should be shrunk less.
   
    So fix that up, but scaling the result so that the patch is actually
a no-op
    at this time, because all caches use DEFAULT_SEEKS (2).

diff --git a/mm/vmscan.c b/mm/vmscan.c
index b8594827bbac..f2da3c9fb346 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -154,7 +154,7 @@ static int shrink_slab(long scanned, unsigned int
gfp_mask)
        list_for_each_entry(shrinker, &shrinker_list, list) {
                unsigned long long delta;
 
-               delta = scanned * shrinker->seeks;
+               delta = 4 * (scanned / shrinker->seeks);
                delta *= (*shrinker->shrinker)(0, gfp_mask);
                do_div(delta, pages + 1);
                shrinker->nr += delta;


So the idea seemed to be fixing a bug without introducing any functional
changes. Since then we have been living with this "4", which makes no
sense (?). Nobody complained though.

Thanks.

> David, you touched it last.  Any hints?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Vladimir Davydov <vdavydov@parallels.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: <linux-kernel@vger.kernel.org>, <linux-mm@kvack.org>,
	<devel@openvz.org>, Mel Gorman <mgorman@suse.de>,
	Michal Hocko <mhocko@suse.cz>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Rik van Riel <riel@redhat.com>,
	Dave Chinner <dchinner@redhat.com>,
	Glauber Costa <glommer@gmail.com>
Subject: Re: [PATCH 2/3] mm: vmscan: get rid of DEFAULT_SEEKS and document shrink_slab logic
Date: Thu, 6 Feb 2014 22:51:21 +0400	[thread overview]
Message-ID: <52F3D9A9.1070107@parallels.com> (raw)
In-Reply-To: <20140205125230.e1705369abcb634ddf141008@linux-foundation.org>

On 02/06/2014 12:52 AM, Andrew Morton wrote:
> On Wed, 5 Feb 2014 11:16:49 +0400 Vladimir Davydov <vdavydov@parallels.com> wrote:
>
>>> So why did I originally make DEFAULT_SEEKS=2?  Because I figured that to
>>> recreate (say) an inode would require a seek to the inode data then a
>>> seek back.  Is it legitimate to include the
>>> seek-back-to-what-you-were-doing-before seek in the cost of an inode
>>> reclaim?  I guess so...
>> Hmm, that explains this 2. Since we typically don't need to "seek back"
>> when recreating a cache page, as they are usually read in bunches by
>> readahead, the number of seeks to bring back a user page is 1, while the
>> number of seeks to recreate an average inode is 2, right?
> Sounds right to me.
>
>> Then to scan inodes and user pages so that they would generate
>> approximately the same number of seeks, we should calculate the number
>> of objects to scan as follows:
>>
>> nr_objects_to_scan = nr_pages_scanned / lru_pages *
>>                                         nr_freeable_objects /
>> shrinker->seeks
>>
>> where shrinker->seeks = DEFAULT_SEEKS = 2 for inodes.
> hm, I wonder if we should take the size of the object into account. 
> Should we be maximizing (memory-reclaimed / seeks-to-reestablish-it).

I'm not sure I understand you quite right. You mean that if two slab
caches have obj sizes 1k and 2k and both of them need 2 seeks to
recreate an object, we should scan the 1k (or 2k?) slab cache more
aggressively than the 2k one? Hmm... I don't know. It depends on what we
want to achieve. But this won't balance the seeks, which is our goal for
now, IIUC.

>> But currently we
>> have four times that. I can explain why we should multiply this by 2 -
>> we do not count pages moving from active to inactive lrus in
>> nr_pages_scanned, and 2*nr_pages_scanned can be a good approximation for
>> that - but I have no idea why we multiply it by 4...
> I don't understand this code at all:
>
> 	total_scan = nr;
> 	delta = (4 * nr_pages_scanned) / shrinker->seeks;
> 	delta *= freeable;
> 	do_div(delta, lru_pages + 1);
> 	total_scan += delta;
>
> If it actually makes any sense, it sorely sorely needs documentation.

To find its roots I had to checkout the linux history tree:

commit c3f4656118a78c1c294e0b4d338ac946265a822b
Author: Andrew Morton <akpm@osdl.org>
Date:   Mon Dec 29 23:48:44 2003 -0800

    [PATCH] shrink_slab acounts for seeks incorrectly
   
    wli points out that shrink_slab inverts the sense of
shrinker->seeks: those
    caches which require more seeks to reestablish an object are shrunk
harder.
    That's wrong - they should be shrunk less.
   
    So fix that up, but scaling the result so that the patch is actually
a no-op
    at this time, because all caches use DEFAULT_SEEKS (2).

diff --git a/mm/vmscan.c b/mm/vmscan.c
index b8594827bbac..f2da3c9fb346 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -154,7 +154,7 @@ static int shrink_slab(long scanned, unsigned int
gfp_mask)
        list_for_each_entry(shrinker, &shrinker_list, list) {
                unsigned long long delta;
 
-               delta = scanned * shrinker->seeks;
+               delta = 4 * (scanned / shrinker->seeks);
                delta *= (*shrinker->shrinker)(0, gfp_mask);
                do_div(delta, pages + 1);
                shrinker->nr += delta;


So the idea seemed to be fixing a bug without introducing any functional
changes. Since then we have been living with this "4", which makes no
sense (?). Nobody complained though.

Thanks.

> David, you touched it last.  Any hints?


  reply	other threads:[~2014-02-06 23:10 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-01-17 19:25 [PATCH 1/3] mm: vmscan: shrink_slab: rename max_pass -> freeable Vladimir Davydov
2014-01-17 19:25 ` Vladimir Davydov
2014-01-17 19:25 ` [PATCH 2/3] mm: vmscan: get rid of DEFAULT_SEEKS and document shrink_slab logic Vladimir Davydov
2014-01-17 19:25   ` Vladimir Davydov
2014-02-04 21:58   ` Andrew Morton
2014-02-04 21:58     ` Andrew Morton
2014-02-05  7:16     ` Vladimir Davydov
2014-02-05  7:16       ` Vladimir Davydov
2014-02-05 20:52       ` Andrew Morton
2014-02-05 20:52         ` Andrew Morton
2014-02-06 18:51         ` Vladimir Davydov [this message]
2014-02-06 18:51           ` Vladimir Davydov
2014-01-17 19:25 ` [PATCH 3/3] mm: vmscan: shrink_slab: do not skip caches with < batch_size objects Vladimir Davydov
2014-01-17 19:25   ` Vladimir Davydov
2014-01-21 22:22 ` [PATCH 1/3] mm: vmscan: shrink_slab: rename max_pass -> freeable David Rientjes
2014-01-21 22:22   ` David Rientjes
2014-01-22  6:11   ` Vladimir Davydov
2014-01-22  6:11     ` Vladimir Davydov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=52F3D9A9.1070107@parallels.com \
    --to=vdavydov@parallels.com \
    --cc=akpm@linux-foundation.org \
    --cc=dchinner@redhat.com \
    --cc=devel@openvz.org \
    --cc=glommer@gmail.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=mhocko@suse.cz \
    --cc=riel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.