Re: [PATCH RFC] xfs: drop SYNC_WAIT from xfs_reclaim_inodes_ag during slab reclaim


From: Dave Chinner <david@fromorbit.com>
To: Chris Mason <clm@fb.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH RFC] xfs: drop SYNC_WAIT from xfs_reclaim_inodes_ag during slab reclaim
Date: Thu, 17 Nov 2016 14:39:55 +1100
Message-ID: <20161117033955.GD19783@dastard>
In-Reply-To: <20161117010727.GB4811@clm-mbp.masoncoding.com>

On Wed, Nov 16, 2016 at 08:07:28PM -0500, Chris Mason wrote:
> On Thu, Nov 17, 2016 at 11:47:45AM +1100, Dave Chinner wrote:
> >On Tue, Nov 15, 2016 at 10:03:52PM -0500, Chris Mason wrote:
> >>Moving forward, I think I can manage to carry the one line patch in
> >>code that hasn't measurably changed in years.  We'll get it tested
> >>in a variety of workloads and come back with more benchmarks for the
> >>great slab rework coming soon to a v5.x kernel near you.
> >
> >FWIW, I just tested your one-liner against my simoops config here,
> >and by comparing the behaviour to my patchset that still allows
> >direct reclaim to block on dirty inodes, it would appear that all
> >the allocation latency I'm seeing here is from direct reclaim.
> 
> Meaning that your allocation latencies are constant regardless of if
> we're waiting in the xfs shrinker?

No, what I mean is that all the big p99 latencies are a result of
blocking in direct reclaim, not blocking in kswapd. i.e. fully
non-blocking kswapd + blocking direct reclaim == big bad p99
latencies vs non-blocking kswapd + non-blocking direct reclaim ==
no big latencies.
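
As a rough illustration of why a small fraction of blocking direct
reclaim calls is enough to dominate p99, here is a minimal sketch. The
latency values and the 2% stall rate are made up for the example, not
measurements from this test, and percentile() is a hypothetical helper
using the nearest-rank method:

```python
def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    ordered = sorted(samples)
    # Ceiling division without floats: smallest rank covering pct% of samples.
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[rank - 1]

# Hypothetical allocation latencies in microseconds (not measured data).
FAST_US = 50
STALL_US = 200_000

# Case 1: 2% of allocations block in direct reclaim.
blocking = [FAST_US] * 980 + [STALL_US] * 20
# Case 2: direct reclaim never blocks.
nonblocking = [FAST_US] * 1000

p50_blocking = percentile(blocking, 50)
p99_blocking = percentile(blocking, 99)
p99_nonblocking = percentile(nonblocking, 99)

print(p50_blocking, p99_blocking, p99_nonblocking)
```

The median is unaffected in both cases; only the tail moves, which is
why p99 is the number to watch here.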

It also /appears/ that the bad FFE kswapd behaviour is closely
correlated to the long blocking latencies in direct reclaim, though
I haven't been able to confirm this hypothesis yet.

> >commit 795ae7a0de6b834a0cc202aa55c190ef81496665
> >Author: Johannes Weiner <hannes@cmpxchg.org>
> >Date:   Thu Mar 17 14:19:14 2016 -0700
> >
> >   mm: scale kswapd watermarks in proportion to memory
> >
> >
> >What's painfully obvious, though, is that even when I wind it up to
> >it's full threshold (10% memory), it does not prevent direct reclaim
> >from being entered and causing excessive latencies when it blocks.
> >This is despite the fact that simoops is now running with a big free
> >memory reserve (3-3.5GB of free memory on my machine as the page
> >cache now only consumes ~4GB instead of 7-8GB).
> 
> Huh, I'll try to reproduce that.  It might be NUMA imbalance or just
> that simoop is so bursty that we're blowing past that 3.5GB.

It's probably blowing through it, but regardless of this there are
more serious problems with this approach. I originally
turned up the watermarks a few seconds after starting simoops and
everything was fine. However, when I stopped and tried to restart
simoops, it *always* failed within a few seconds with either:

....
Creating working files
done creating working files
du thread is running /mnt/scratch
du thread is done /mnt/scratch
error 11 from pthread_create
$

or

....
Creating working files
done creating working files
du thread is running /mnt/scratch
du thread is done /mnt/scratch
mmap: Cannot allocate memory
$

I couldn't start simoops again until I backed out the watermark
tuning, and then it started straight away.
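
For anyone decoding the two failures above: a small sketch, assuming
Linux errno numbering, where "error 11" is EAGAIN and "Cannot allocate
memory" is the strerror() text for ENOMEM. As far as I know glibc
reports EAGAIN from pthread_create() when it cannot get the resources
for a new thread (for example, when mapping the thread stack fails), so
both look like the same kind of allocation failure. classify() is a
hypothetical helper, not part of simoops:

```python
import errno
import os

def classify(err):
    """Map an errno value to a short label plus the libc message text."""
    if err == errno.EAGAIN:
        name = "EAGAIN"
    elif err == errno.ENOMEM:
        name = "ENOMEM"
    else:
        name = "other"
    return name, os.strerror(err)

# pthread_create() returns the error number directly; mmap() sets errno.
thread_failure = classify(errno.EAGAIN)
mmap_failure = classify(errno.ENOMEM)

print(thread_failure)
print(mmap_failure)
```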

IOWs, screwing with the watermarks to try to avoid direct reclaim
appears to make userspace randomly fail with ENOMEM problems when
there are large reserves still available. So AFAICT this doesn't fix
the problems that I've been seeing and instead creates a bunch of
new ones.
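
For reference, a back-of-the-envelope sketch of the watermark tuning
discussed above. It assumes the vm.watermark_scale_factor sysctl added
by the commit quoted earlier, expressed in units of 1/10000 of memory,
with a default of 10 and a maximum of 1000 (10%) at the time of writing.
The kernel does this per zone; the sketch treats the machine as a single
zone, and the 16GiB size and 64MiB min watermark are hypothetical
inputs, not the configuration of my test machine:

```python
PAGE_SIZE = 4096  # assumes 4KiB pages

def watermark_gap_pages(managed_pages, min_wmark_pages, scale_factor):
    """Approximate distance between min->low and low->high watermarks."""
    if not 1 <= scale_factor <= 1000:
        raise ValueError("scale_factor assumed to be in 1..1000")
    # The gap is the larger of min/4 and the scaled fraction of memory.
    return max(min_wmark_pages // 4, managed_pages * scale_factor // 10000)

def watermarks_bytes(managed_bytes, min_wmark_bytes, scale_factor):
    """Return (min, low, high) watermarks in bytes for one hypothetical zone."""
    managed_pages = managed_bytes // PAGE_SIZE
    min_pages = min_wmark_bytes // PAGE_SIZE
    gap = watermark_gap_pages(managed_pages, min_pages, scale_factor)
    low = min_pages + gap       # kswapd is woken below this
    high = min_pages + 2 * gap  # kswapd stops reclaiming above this
    return tuple(p * PAGE_SIZE for p in (min_pages, low, high))

GIB = 2**30
MIB = 2**20

default_wmarks = watermarks_bytes(16 * GIB, 64 * MIB, 10)
max_wmarks = watermarks_bytes(16 * GIB, 64 * MIB, 1000)

for label, (lo_min, low, high) in (("default", default_wmarks),
                                   ("max", max_wmarks)):
    print(f"{label}: min={lo_min / MIB:.0f}MiB "
          f"low={low / MIB:.0f}MiB high={high / MIB:.0f}MiB")
```

With the factor at its maximum, kswapd tries to keep roughly 20% of
memory above the min watermark free, and direct reclaim is still being
entered.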

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
