From: Chris Mason <clm@fb.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH RFC] xfs: drop SYNC_WAIT from xfs_reclaim_inodes_ag during slab reclaim
Date: Wed, 16 Nov 2016 20:07:28 -0500
Message-ID: <20161117010727.GB4811@clm-mbp.masoncoding.com>
In-Reply-To: <20161117004745.GB19783@dastard>

On Thu, Nov 17, 2016 at 11:47:45AM +1100, Dave Chinner wrote:
>On Tue, Nov 15, 2016 at 10:03:52PM -0500, Chris Mason wrote:
>> Moving forward, I think I can manage to carry the one line patch in
>> code that hasn't measurably changed in years. We'll get it tested
>> in a variety of workloads and come back with more benchmarks for the
>> great slab rework coming soon to a v5.x kernel near you.
>
>FWIW, I just tested your one-liner against my simoops config here,
>and by comparing the behaviour to my patchset that still allows
>direct reclaim to block on dirty inodes, it would appear that all
>the allocation latency I'm seeing here is from direct reclaim.

Meaning that your allocation latencies are constant regardless of
whether we're waiting in the xfs shrinker?

>
>So I went looking at the direct reclaim throttle with the intent to
>hack it to throttle earlier. It throttles based on watermarks, so
>I figured I'd just hack them to be larger to trigger direct reclaim
>throttling earlier. And then I found this recent addition:
>
>https://patchwork.kernel.org/patch/8426381/
>
>+=============================================================
>+
>+watermark_scale_factor:
>+
>+This factor controls the aggressiveness of kswapd. It defines the
>+amount of memory left in a node/system before kswapd is woken up and
>+how much memory needs to be free before kswapd goes back to sleep.
>+
>+The unit is in fractions of 10,000. The default value of 10 means the
>+distances between watermarks are 0.1% of the available memory in the
>+node/system. The maximum value is 1000, or 10% of memory.
>+
>+A high rate of threads entering direct reclaim (allocstall) or kswapd
>+going to sleep prematurely (kswapd_low_wmark_hit_quickly) can indicate
>+that the number of free pages kswapd maintains for latency reasons is
>+too small for the allocation bursts occurring in the system. This knob
>+can then be used to tune kswapd aggressiveness accordingly.
>+
>
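For reference, the arithmetic in the quoted documentation can be sketched as below. This is only an illustration of the "fractions of 10,000" rule as the text states it, not the kernel's implementation: the function name and the 16 GiB machine size are made up for the example, the lower bound of 1 is an assumption (the text only gives the maximum of 1000), and any other floor the kernel applies to the watermarks is not modeled.

```python
GIB = 1 << 30


def watermark_gap_bytes(managed_bytes: int, scale_factor: int = 10) -> int:
    """Distance between adjacent watermarks, per the quoted text.

    The unit is fractions of 10,000, so the default of 10 gives 0.1%
    of memory and the maximum of 1000 gives 10%.
    """
    # Upper bound comes from the quoted documentation; the lower bound
    # of 1 is an assumption made for this sketch.
    if not 1 <= scale_factor <= 1000:
        raise ValueError("watermark_scale_factor must be in 1..1000")
    return managed_bytes * scale_factor // 10_000


if __name__ == "__main__":
    mem = 16 * GIB  # hypothetical machine size, not from the thread
    for factor in (10, 100, 1000):
        gap = watermark_gap_bytes(mem, factor)
        print(f"factor={factor:4d}: gap ~ {gap / GIB:.3f} GiB")
```

On a real system the knob is /proc/sys/vm/watermark_scale_factor, and the resulting gap depends on per-zone memory rather than one total, so treat the numbers as rough.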
>The /exact hack/ I was thinking of was committed about 6 months
>ago and added "support for ever more" /proc file:

Yeah, Johannes spent a bunch of time looking at kswapd in a few places
where it was causing trouble here.

>
>commit 795ae7a0de6b834a0cc202aa55c190ef81496665
>Author: Johannes Weiner <hannes@cmpxchg.org>
>Date: Thu Mar 17 14:19:14 2016 -0700
>
> mm: scale kswapd watermarks in proportion to memory
>
>
>What's painfully obvious, though, is that even when I wind it up to
>its full threshold (10% memory), it does not prevent direct reclaim
>from being entered and causing excessive latencies when it blocks.
>This is despite the fact that simoops is now running with a big free
>memory reserve (3-3.5GB of free memory on my machine as the page
>cache now only consumes ~4GB instead of 7-8GB).

Huh, I'll try to reproduce that. It might be NUMA imbalance or just
that simoop is so bursty that we're blowing past that 3.5GB.

>
>And, while harder to trigger, kswapd still goes on the "free fucking
>everything" rampages that trigger page writeback from kswapd and
>empty both the page cache and the slab caches. The only difference
>now is that it does this /without triggering the allocstall
>counter/....
>
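One way to watch the counters mentioned above (allocstall and kswapd_low_wmark_hit_quickly) is to sample /proc/vmstat before and after a run and diff the values. The sketch below is hedged: the helper names are hypothetical, and summing any allocstall_* entries assumes that some kernels split the counter per zone, which should be checked against the running kernel.

```python
def reclaim_counters(vmstat_text: str) -> dict:
    """Extract direct-reclaim and kswapd counters from /proc/vmstat text."""
    counters = {"allocstall": 0, "kswapd_low_wmark_hit_quickly": 0}
    for line in vmstat_text.splitlines():
        parts = line.split()
        # Each /proc/vmstat line is "name value"; skip anything else.
        if len(parts) != 2 or not parts[1].isdigit():
            continue
        name, value = parts[0], int(parts[1])
        # Assumption: per-zone variants (allocstall_*) are summed together.
        if name == "allocstall" or name.startswith("allocstall_"):
            counters["allocstall"] += value
        elif name == "kswapd_low_wmark_hit_quickly":
            counters[name] = value
    return counters


def counter_delta(before: dict, after: dict) -> dict:
    """Change in each counter between two samples."""
    return {key: after[key] - before[key] for key in before}


if __name__ == "__main__":
    import time

    try:
        with open("/proc/vmstat") as f:
            first = reclaim_counters(f.read())
        time.sleep(10)  # arbitrary sampling interval
        with open("/proc/vmstat") as f:
            second = reclaim_counters(f.read())
        print(counter_delta(first, second))
    except OSError:
        print("/proc/vmstat is not available on this system")
```

A zero allocstall delta only says the counter did not move; as noted above, kswapd can still write back pages and empty the caches without it.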
>So it seems that just upping the direct reclaim throttle point
>isn't a sufficient workaround for the "too much direct reclaim"
>problem here...
>
>Cheers,
>
>Dave.
>--
>Dave Chinner
>david@fromorbit.com