All of lore.kernel.org
 help / color / mirror / Atom feed
From: Josef Bacik <josef@toxicpanda.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Josef Bacik <josef@toxicpanda.com>,
	Minchan Kim <minchan@kernel.org>,
	hannes@cmpxchg.org, riel@redhat.com, akpm@linux-foundation.org,
	linux-mm@kvack.org, kernel-team@fb.com,
	Josef Bacik <jbacik@fb.com>,
	mhocko@kernel.org, cl@linux.com
Subject: Re: [PATCH 1/2] mm: use slab size in the slab shrinking ratio calculation
Date: Mon, 3 Jul 2017 09:52:37 -0400	[thread overview]
Message-ID: <20170703135236.GD27097@destiny> (raw)
In-Reply-To: <20170702015843.GA17762@dastard>

On Sun, Jul 02, 2017 at 11:58:43AM +1000, Dave Chinner wrote:
> [I don't have the full context, haven't seen the proposed patches,
> etc so I'm only commenting on small bits of what I was cc'd on]
> 
> On Fri, Jun 30, 2017 at 11:03:24AM -0400, Josef Bacik wrote:
> > On Fri, Jun 30, 2017 at 11:17:13AM +0900, Minchan Kim wrote:
> > 
> > <snip>
> > 
> > > > 
> > > > Because this static step down wastes cycles.  Why loop 10 times when you could
> > > > set the target at actual usage and try to get everything in one go?  Most
> > > > shrinkable slabs adhere to this default of in use first model, which means that
> > > > we have to hit an object in the lru twice before it is freed.  So in order to
> > > 
> > > I didn't know that.
> > > 
> > > > reclaim anything we have to scan a slab cache's entire lru at least once before
> > > > any reclaim starts happening.  If we're doing this static step down thing we
> > > 
> > > If it's really true, I think that shrinker should be fixed first.
> > > 
> > 
> > Easier said than done.  I've fixed this for the super shrinkers, but like I said
> > below, all it takes is some asshole doing find / -exec stat {} \; twice to put
> > us back in the same situation again.  There's no aging mechanism other than
> > memory reclaim, so we get into this shitty situation of aging+reclaiming at the
> > same time.
> 
> To me, this sounds like the problem that needs fixing. i.e.
> threshold detection to trigger demand-based aging of shrinkable
> caches at allocation time rather than waiting for memory pressure to
> trigger aging.
> 
> > > I guess at the first time, all of objects in shrinker could be INUSE state
> > > as you said, however, on steady state, they would work like real LRU
> > > to reflect recency, otherwise, I want to call it broken and we shouldn't
> > > design general slab aging model for those specific one.
> > 
> > Yeah that's totally a valid argument to make, but the idea of coming up with
> > something completely different hurts my head, and I'm trying to fix this problem
> > right now, not in 6 cycles when we all finally agree on the new mechanism.
> 
> Add a shrinker callback to the slab allocation code that checks the
> rate of growth of the cache. If the cache is growing fast and
> getting large, run the shrinker every so often to slow down the rate
> of growth. The larger the cache, the slower the allowed rate of
> growth. It's the feedback loop that we are missing between slab
> allocation and slab reclaim that prevents use from controlling slab
> cache growthi and aging cleanly.
> 
> This would also allow us to cap the size of caching slabs easily
> - something people have been asking us to do for years....
> 
> Note: we need to make a clear distinction between slabs used as
> "heap" memory and slabs used for allocating large numbers of objects
> that are cached for performance reasons. The problem here is cache
> management - the fact we are using slabs to allocate the memory for
> the objects is largely irrelevant...
> 
> > > > I expanded on this above, but I'll give a more concrete example.  Consider xfs
> > > > metadata, we allocate a slab object and read in our page, use it, and then free
> > > > the buffer which put's it on the lru list.  XFS marks this with a INUSE flag,
> > > > which must be cleared before it is free'd from the LRU.  We scan through the
> > > > LRU, clearing the INUSE flag and moving the object to the back of the LRU, but
> > > > not actually free'ing it.  This happens for all (well most) objects that end up
> > > > on the LRU, and this design pattern is used _everywhere_.  Until recently it was
> > > > used for the super shrinkers, but I changed that so thankfully the bulk of the
> > > > problem is gone.  However if you do a find / -exec stat {} \;, and then do it
> > > > again, you'll end up with the same scenario for the super shrinker.   There's no
> > > > aging except via pressure on the slabs, so worst case we always have to scan the
> > > > entire slab object lru once before we start reclaiming anything.  Being
> > > > agressive here I think is ok, we have things in place to make sure we don't over
> > > > reclaim.
> > > 
> > > Thanks for the detail example. Now I understood but the question is it is
> > > always true? I mean at the first stage(ie, first population of objects), it
> > > seems to be but at the steady stage, I guess some of objects have INUSE,
> > > others not by access pattern so it emulates LRU model. No?
> > > 
> > 
> > Sort of.  Lets take icache/dcache.  We open a file and close a file, this get's
> > added to the LRU.  We open the file and close the file again, it's already on
> > the LRU so it stays where it was and gets the INUSE flag set.  Once an object is
> > on the LRU it doesn't move unless we hit it via the shrinker.  Even if we open
> > and close the file, and then open and keep it open, the file stays on the LRU,
> > and is only removed once the shrinker hits it, sees it's refcount is > 1, and
> > removes it from the list.
> 
> Yes, but  the lazy LRU removal is a performance feature - the cost
> in terms of lock contention of removing dentries inodes from the LRU
> on first reference is prohibitive, especially for short term usage
> like 'find / -exec stat {} \;' workloads. Git does this sort of
> traverse/stat a lot, so making this path any slower would make lots
> of people unhappy...
>

Sure we want it when it's useful, but often times it ends up in latencies due to
reclaim.  I like your aging thing, it would definitely help in our worst case,
I'll give that a whirl after my vacation.  Thanks,

Josef 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2017-07-03 13:52 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-08 19:19 [PATCH 1/2] mm: use slab size in the slab shrinking ratio calculation josef
2017-06-08 19:19 ` [PATCH 2/2] mm: make kswapd try harder to keep active pages in cache josef
2017-06-13  5:28 ` [PATCH 1/2] mm: use slab size in the slab shrinking ratio calculation Minchan Kim
2017-06-13 12:01   ` Josef Bacik
2017-06-14  6:40     ` Minchan Kim
2017-06-19 15:11       ` Josef Bacik
2017-06-20  2:46         ` Minchan Kim
2017-06-27 13:59           ` Josef Bacik
2017-06-30  2:17             ` Minchan Kim
2017-06-30 15:03               ` Josef Bacik
2017-07-02  1:58                 ` Dave Chinner
2017-07-03 13:52                   ` Josef Bacik [this message]
2017-07-03  1:33                 ` Minchan Kim
2017-07-03 13:50                   ` Josef Bacik
2017-07-04  3:01                     ` Minchan Kim
2017-07-04 13:21                       ` Josef Bacik
2017-07-04 22:57                         ` Dave Chinner
2017-07-05  4:59                           ` Minchan Kim
2017-07-05 23:58                             ` Dave Chinner
2017-07-06  3:56                               ` Minchan Kim
2017-07-05 13:33                           ` Josef Bacik
2017-07-05 23:30                             ` Dave Chinner
2017-07-05  4:43                         ` Minchan Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170703135236.GD27097@destiny \
    --to=josef@toxicpanda.com \
    --cc=akpm@linux-foundation.org \
    --cc=cl@linux.com \
    --cc=david@fromorbit.com \
    --cc=hannes@cmpxchg.org \
    --cc=jbacik@fb.com \
    --cc=kernel-team@fb.com \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=minchan@kernel.org \
    --cc=riel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.