From: Johannes Weiner <hannes@cmpxchg.org>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: mhocko@kernel.org, minchan@kernel.org, ying.huang@intel.com,
mgorman@techsingularity.net, vdavydov.dev@gmail.com,
akpm@linux-foundation.org, shakeelb@google.com,
gthelen@google.com, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] mm,vmscan: Kill global shrinker lock.
Date: Wed, 15 Nov 2017 08:28:36 -0500
Message-ID: <20171115132836.GA6524@cmpxchg.org>
In-Reply-To: <201711151958.CBI60413.FHQMtFLFOOSOJV@I-love.SAKURA.ne.jp>
On Wed, Nov 15, 2017 at 07:58:09PM +0900, Tetsuo Handa wrote:
> I think that Minchan's approach depends on how
>
> In our production, we have observed that the job loader gets stuck for
> 10s of seconds while doing mount operation. It turns out that it was
> stuck in register_shrinker() and some unrelated job was under memory
> pressure and spending time in shrink_slab(). Our machines have a lot
> of shrinkers registered and jobs under memory pressure has to traverse
> all of those memcg-aware shrinkers and do affect unrelated jobs which
> want to register their own shrinkers.
>
> is interpreted. If there were 100000 shrinkers and each do_shrink_slab() call
> took 1 millisecond, aborting the iteration as soon as rwsem_is_contended() would
> help a lot. But if there were 10 shrinkers and each do_shrink_slab() call took
> 10 seconds, aborting the iteration as soon as rwsem_is_contended() would help
> less. Or, there might be some specific shrinker where its do_shrink_slab() call
> takes 100 seconds. In that case, checking rwsem_is_contended() is too lazy.
In your patch, unregister() waits on shrinker->nr_active instead of
the lock, and that counter is decremented at the same point where
Minchan drops the lock. How does that behave any differently for
long-running shrinkers?
Anyway, I suspect it's many shrinkers and many concurrent invocations,
so the lockbreak granularity you both chose should be fine.
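To put rough numbers on the scenarios quoted above, here is a minimal back-of-the-envelope sketch. It is not kernel code and not taken from either patch. It assumes a single reclaimer walking the shrinker list, ignores lock fairness and wakeup latency, and uses a hypothetical helper, worst_case_wait_ms(). The third scenario's nine 1 ms shrinkers are made up for illustration. Since both approaches only let a waiter through between do_shrink_slab() calls, the same bound applies to the rwsem_is_contended() check and to the nr_active wait.

```python
def worst_case_wait_ms(call_durations_ms, lockbreak):
    """Rough upper bound on how long register/unregister waits behind one
    shrink_slab() pass, given the duration of each do_shrink_slab() call.

    Without a lockbreak the waiter may sit behind the whole traversal.
    With a per-shrinker lockbreak it waits for at most the longest single
    call, because the lock (or nr_active) is only released between calls.
    """
    if not call_durations_ms:
        return 0
    return max(call_durations_ms) if lockbreak else sum(call_durations_ms)


# Scenarios from the quoted mail; the mix in the third one is an assumption.
scenarios = {
    "100000 shrinkers x 1 ms": [1] * 100_000,
    "10 shrinkers x 10 s": [10_000] * 10,
    "one 100 s shrinker among nine 1 ms ones": [1] * 9 + [100_000],
}

for name, durations in scenarios.items():
    full = worst_case_wait_ms(durations, lockbreak=False)
    broken = worst_case_wait_ms(durations, lockbreak=True)
    print(f"{name}: full traversal {full} ms, per-shrinker lockbreak {broken} ms")
```

Under these assumptions, a per-shrinker lockbreak cuts the wait sharply when there are many short shrinkers, and neither approach helps with a single call that itself runs for a long time.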