* [PATCH] vmscan: fix slab vs lru balance
From: Vladimir Davydov @ 2015-11-24 12:47 UTC
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Vlastimil Babka, Mel Gorman,
Dave Chinner, linux-mm, linux-kernel
The comment above shrink_slab states that the portion of kmem objects
scanned by it equals the portion of lru pages scanned by shrink_zone,
divided by shrinker->seeks.
shrinker->seeks is supposed to be equal to the number of disk seeks
required to recreate an object. It is usually set to DEFAULT_SEEKS (2),
which is quite logical, because most kmem objects (e.g. dentry or inode)
require random IO to reread (seek to read and seek back).
Given that, one would expect dcache to be scanned half as intensively
as page cache, which sounds sane, as dentries are generally more costly
to recreate.
However, the formula for distributing memory pressure between slab and
lru actually looks as follows (see do_shrink_slab):
                              lru_scanned
objs_to_scan = objs_total * --------------- * 4 / shrinker->seeks
                            lru_reclaimable
That is, dcache, as well as most other slab caches, is scanned twice as
aggressively as page cache.
Fix this by dropping '4' from the equation above.
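To illustrate the imbalance, a small standalone userspace sketch (made-up
numbers, not kernel code) of how the current and patched formulas
distribute pressure for a shrinker using DEFAULT_SEEKS:

/*
 * Illustration with made-up numbers: if shrink_zone scans 10% of the
 * reclaimable lru pages, the current formula asks the shrinker to scan
 * ~20% of its objects, while the patched one asks for ~5%.
 */
#include <stdio.h>

int main(void)
{
	unsigned long long nr_scanned = 1000;	/* lru pages scanned */
	unsigned long long nr_eligible = 10000;	/* reclaimable lru pages */
	unsigned long long freeable = 5000;	/* objects in the slab cache */
	unsigned int seeks = 2;			/* DEFAULT_SEEKS */

	/* current: delta = (4 * nr_scanned) / seeks, i.e. 2 * nr_scanned */
	unsigned long long cur = 4 * nr_scanned / seeks * freeable / (nr_eligible + 1);
	/* patched: delta = nr_scanned / seeks, i.e. nr_scanned / 2 */
	unsigned long long fix = nr_scanned / seeks * freeable / (nr_eligible + 1);

	printf("lru scanned:        %5.1f%%\n", 100.0 * nr_scanned / nr_eligible);
	printf("slab scanned (cur): %5.1f%% (%llu objects)\n",
	       100.0 * cur / freeable, cur);
	printf("slab scanned (fix): %5.1f%% (%llu objects)\n",
	       100.0 * fix / freeable, fix);
	return 0;
}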
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
---
mm/vmscan.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 97ba9e1cde09..9d553b07bb86 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -290,7 +290,7 @@ static unsigned long do_shrink_slab(struct shrink_control *shrinkctl,
nr = atomic_long_xchg(&shrinker->nr_deferred[nid], 0);
total_scan = nr;
- delta = (4 * nr_scanned) / shrinker->seeks;
+ delta = nr_scanned / shrinker->seeks;
delta *= freeable;
do_div(delta, nr_eligible + 1);
total_scan += delta;
--
2.1.4
* Re: [PATCH] vmscan: fix slab vs lru balance
From: Andrew Morton @ 2015-11-24 23:02 UTC
To: Vladimir Davydov
Cc: Johannes Weiner, Michal Hocko, Vlastimil Babka, Mel Gorman,
Dave Chinner, linux-mm, linux-kernel
On Tue, 24 Nov 2015 15:47:21 +0300 Vladimir Davydov <vdavydov@virtuozzo.com> wrote:
> The comment above shrink_slab states that the portion of kmem objects
> scanned by it equals the portion of lru pages scanned by shrink_zone,
> divided by shrinker->seeks.
>
> shrinker->seeks is supposed to be equal to the number of disk seeks
> required to recreate an object. It is usually set to DEFAULT_SEEKS (2),
> which is quite logical, because most kmem objects (e.g. dentry or inode)
> require random IO to reread (seek to read and seek back).
>
> Given that, one would expect dcache to be scanned half as intensively
> as page cache, which sounds sane, as dentries are generally more costly
> to recreate.
>
> However, the formula for distributing memory pressure between slab and
> lru actually looks as follows (see do_shrink_slab):
>
>                               lru_scanned
> objs_to_scan = objs_total * --------------- * 4 / shrinker->seeks
>                             lru_reclaimable
>
> That is, dcache, as well as most other slab caches, is scanned twice as
> aggressively as page cache.
>
> Fix this by dropping '4' from the equation above.
>
oh geeze. Who wrote that crap?
commit c3f4656118a78c1c294e0b4d338ac946265a822b
Author: Andrew Morton <akpm@osdl.org>
Date: Mon Dec 29 23:48:44 2003 -0800
[PATCH] shrink_slab acounts for seeks incorrectly
wli points out that shrink_slab inverts the sense of shrinker->seeks: those
caches which require more seeks to reestablish an object are shrunk harder.
That's wrong - they should be shrunk less.
So fix that up, but scaling the result so that the patch is actually a no-op
at this time, because all caches use DEFAULT_SEEKS (2).
diff --git a/mm/vmscan.c b/mm/vmscan.c
index b859482..f2da3c9 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -154,7 +154,7 @@ static int shrink_slab(long scanned, unsigned int gfp_mask)
list_for_each_entry(shrinker, &shrinker_list, list) {
unsigned long long delta;
- delta = scanned * shrinker->seeks;
+ delta = 4 * (scanned / shrinker->seeks);
delta *= (*shrinker->shrinker)(0, gfp_mask);
do_div(delta, pages + 1);
shrinker->nr += delta;
What a pathetic changelog.
The current code may be good, it may be bad, but I'm reluctant to
change it without a solid demonstration that the result is overall
superior.
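For reference, a minimal arithmetic sketch (with an assumed scanned
count) of why that 2003 change was a no-op for shrinkers using
DEFAULT_SEEKS while still fixing the direction of the seeks scaling:

#include <stdio.h>

int main(void)
{
	unsigned long scanned = 1000;	/* assumed lru scan count */
	unsigned int seeks = 2;		/* DEFAULT_SEEKS */

	/* pre-2003: more seeks => scanned harder (the inverted behaviour) */
	unsigned long before = scanned * seeks;		/* 2000 */
	/* post-2003: more seeks => scanned less, rescaled by 4 */
	unsigned long after = 4 * (scanned / seeks);	/* 2000, hence a no-op */

	printf("before: %lu, after: %lu\n", before, after);
	return 0;
}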
* Re: [PATCH] vmscan: fix slab vs lru balance
From: Vladimir Davydov @ 2015-11-26 13:26 UTC
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Vlastimil Babka, Mel Gorman,
Dave Chinner, linux-mm, linux-kernel
On Tue, Nov 24, 2015 at 03:02:27PM -0800, Andrew Morton wrote:
> On Tue, 24 Nov 2015 15:47:21 +0300 Vladimir Davydov <vdavydov@virtuozzo.com> wrote:
>
> > The comment above shrink_slab states that the portion of kmem objects
> > scanned by it equals the portion of lru pages scanned by shrink_zone,
> > divided by shrinker->seeks.
> >
> > shrinker->seeks is supposed to be equal to the number of disk seeks
> > required to recreate an object. It is usually set to DEFAULT_SEEKS (2),
> > which is quite logical, because most kmem objects (e.g. dentry or inode)
> > require random IO to reread (seek to read and seek back).
> >
> > Given that, one would expect dcache to be scanned half as intensively
> > as page cache, which sounds sane, as dentries are generally more costly
> > to recreate.
> >
> > However, the formula for distributing memory pressure between slab and
> > lru actually looks as follows (see do_shrink_slab):
> >
> >                               lru_scanned
> > objs_to_scan = objs_total * --------------- * 4 / shrinker->seeks
> >                             lru_reclaimable
> >
> > That is, dcache, as well as most other slab caches, is scanned twice as
> > aggressively as page cache.
> >
> > Fix this by dropping '4' from the equation above.
> >
>
> oh geeze. Who wrote that crap?
>
>
> commit c3f4656118a78c1c294e0b4d338ac946265a822b
> Author: Andrew Morton <akpm@osdl.org>
> Date: Mon Dec 29 23:48:44 2003 -0800
>
> [PATCH] shrink_slab acounts for seeks incorrectly
>
> wli points out that shrink_slab inverts the sense of shrinker->seeks: those
> caches which require more seeks to reestablish an object are shrunk harder.
> That's wrong - they should be shrunk less.
>
> So fix that up, but scaling the result so that the patch is actually a no-op
> at this time, because all caches use DEFAULT_SEEKS (2).
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index b859482..f2da3c9 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -154,7 +154,7 @@ static int shrink_slab(long scanned, unsigned int gfp_mask)
> list_for_each_entry(shrinker, &shrinker_list, list) {
> unsigned long long delta;
>
> - delta = scanned * shrinker->seeks;
> + delta = 4 * (scanned / shrinker->seeks);
> delta *= (*shrinker->shrinker)(0, gfp_mask);
> do_div(delta, pages + 1);
> shrinker->nr += delta;
>
>
> What a pathetic changelog.
>
> The current code may be good, it may be bad, but I'm reluctant to
> change it without a solid demonstration that the result is overall
> superior.
>
Yep, that's understandable - we've been living with this (mis)behavior
for more than 10 years already and nobody seems to have complained.
I don't have solid proof at hand right now that the patch makes things
substantially better in most cases - it just comes from the speculation
that dropping dcache is really expensive, because (a) rereading it
requires random IO and (b) dropping an inode automatically results in
dropping page cache attached to it, so it shouldn't be scanned more
aggressively than unmapped page cache.
Anyway, I'll try to run various workloads with and w/o this patch and
report back if I find any that benefit from it.
Thanks,
Vladimir