Re: [PATCH RFC] xfs: drop SYNC_WAIT from xfs_reclaim_inodes_ag during slab reclaim

From: Dave Chinner <david@fromorbit.com>
To: Chris Mason <clm@fb.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH RFC] xfs: drop SYNC_WAIT from xfs_reclaim_inodes_ag during slab reclaim
Date: Thu, 17 Nov 2016 12:00:08 +1100
Message-ID: <20161117010008.GC19783@dastard>
In-Reply-To: <20161117002726.GA4811@clm-mbp.masoncoding.com>

On Wed, Nov 16, 2016 at 07:27:27PM -0500, Chris Mason wrote:
> On Thu, Nov 17, 2016 at 10:31:36AM +1100, Dave Chinner wrote:
> >>>So I'm running on 16GB RAM and have 100-150MB of XFS slab.
> >>>Percentage wise, the inode cache is a larger portion of memory than
> >>>in your machines. I can increase the number of files to increase it
> >>>further, but I don't think that will change anything.
> >>
> >>I think the way to see what I'm seeing would be to drop the number
> >>of IO threads (-T) and bump both -m and -M.  Basically less inode
> >>working set and more memory working set.
> >
> >If I increase m/M by any non-trivial amount, the test OOMs within a
> >couple of minutes of starting even after cutting the number of IO
> >threads in half. I've managed to increase -m by 10% without OOM -
> >I'll keep trying to increase this part of the load as much as I
> >can as I refine the patchset I have.
> 
> Gotcha.  -m is long lasting, allocated once at the start of the run
> and stays around for ever.  It basically soaks up ram.  -M is
> allocated once per work loop, and it should be where the stalls
> really hit.  I'll peel off a flash machine tomorrow and find a
> command line that matches my results so far.
> 
> What kind of flash are you using?  I can choose between modern nvme
> or something more crusty.

Crusty old stuff - a pair of EVO 840s in HW-RAID0 behind 512MB of
BBWC. Read rates peak at ~150MB/s, write rates sustain at about
75MB/s.

I'm testing on a 200GB filesystem, configured as:

mkfs.xfs -f -dagcount=8,size=200g /dev/vdc

The backing file for /dev/vdc is fully preallocated and linear,
accessed via virtio/direct IO, so it's no different to accessing the
real block device....

> >>>That's what removing the blocking from the shrinker causes the
> >>>overall work rate to go down - it results in the cache not
> >>>maintaining a working set of inodes and so increases the IO load and
> >>>that then slows everything down.
> >>
> >>At least on my machines, it made the overall work rate go up.  Both
> >>simoop and prod are 10-15% faster.
> >
> >Ok, I'll see if I can tune the workload here to behave more like
> >this....
> 
> What direction do you have in mind for your current patches?  Many
> tiers have shadows where we can put experimental code without
> feeling bad if machines crash or data is lost.  I'm happy to line up
> tests if you want data from specific workloads.

Right now I have kswapd as fully non-blocking - even more so than
your one-line patch, because reclaim can (and does) still block on
inode locks with SYNC_TRYLOCK set. I don't see any problems with
doing this.
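
As a rough userspace illustration of the distinction (this is not the
XFS code, and every name in it is made up for the example), a reclaim
pass can either wait for each object's lock or try the lock and skip
whatever is busy. The sketch assumes one lock per cached object and
models only the locking decision, nothing about IO or writeback:

```python
import threading


class Inode:
    """Toy stand-in for a cached inode; not an XFS structure."""

    def __init__(self, ino):
        self.ino = ino
        self.lock = threading.Lock()


def reclaim_pass(inodes, wait):
    """Walk the cache once and return (reclaimed, skipped) inode numbers.

    wait=True  models a blocking pass: sleep until each lock is free.
    wait=False models a non-blocking pass: try the lock, skip if busy.
    """
    reclaimed, skipped = [], []
    for inode in inodes:
        if inode.lock.acquire(blocking=wait):
            try:
                reclaimed.append(inode.ino)
            finally:
                inode.lock.release()
        else:
            skipped.append(inode.ino)
    return reclaimed, skipped


cache = [Inode(n) for n in range(4)]
cache[2].lock.acquire()            # pretend another task holds inode 2
reclaimed, skipped = reclaim_pass(cache, wait=False)
cache[2].lock.release()
print(reclaimed, skipped)
```

The non-blocking pass never sleeps, but it also leaves the busy object
in the cache for a later pass to deal with.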

I'm still trying to work out what to do with direct reclaim - it's
clearly the source of the worst allocation latency problems, and it
also seems to be the contributing factor to the obnoxious kswapd FFE
behaviour.  There are a couple more variations I want to try to see if
I can make it block less severely, but what I do in the short term
here is largely dependent on the effect on other benchmarks and
loads....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

Thread overview: 24+ messages
2016-10-14 12:27 [PATCH RFC] xfs: drop SYNC_WAIT from xfs_reclaim_inodes_ag during slab reclaim Chris Mason
2016-10-15 22:34 ` Dave Chinner
2016-10-17  0:24   ` Chris Mason
2016-10-17  1:52     ` Dave Chinner
2016-10-17 13:30       ` Chris Mason
2016-10-17 22:30         ` Dave Chinner
2016-10-17 23:20           ` Chris Mason
2016-10-18  2:03             ` Dave Chinner
2016-11-14  1:00               ` Chris Mason
2016-11-14  7:27                 ` Dave Chinner
2016-11-14 20:56                   ` Chris Mason
2016-11-14 23:58                     ` Dave Chinner
2016-11-15  3:09                       ` Chris Mason
2016-11-15  5:54                       ` Dave Chinner
2016-11-15 19:00                         ` Chris Mason
2016-11-16  1:30                           ` Dave Chinner
2016-11-16  3:03                             ` Chris Mason
2016-11-16 23:31                               ` Dave Chinner
2016-11-17  0:27                                 ` Chris Mason
2016-11-17  1:00                                   ` Dave Chinner [this message]
2016-11-17  0:47                               ` Dave Chinner
2016-11-17  1:07                                 ` Chris Mason
2016-11-17  3:39                                   ` Dave Chinner
2019-06-14 12:58 ` Amir Goldstein
