From: Dave Chinner <david@fromorbit.com>
To: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>,
linux-mm@kvack.org, linux-xfs@vger.kernel.org,
Vlastimil Babka <vbabka@suse.cz>
Subject: Re: [PATCH] [Regression, v5.0] mm: boosted kswapd reclaim b0rks system cache balance
Date: Thu, 8 Aug 2019 10:26:11 +1000 [thread overview]
Message-ID: <20190808002611.GT7777@dread.disaster.area> (raw)
In-Reply-To: <20190807234815.GJ2739@techsingularity.net>
On Thu, Aug 08, 2019 at 12:48:15AM +0100, Mel Gorman wrote:
> On Thu, Aug 08, 2019 at 08:32:41AM +1000, Dave Chinner wrote:
> > On Wed, Aug 07, 2019 at 09:56:15PM +0100, Mel Gorman wrote:
> > > On Wed, Aug 07, 2019 at 04:03:16PM +0100, Mel Gorman wrote:
> > > > <SNIP>
> > > >
> > > > On that basis, it may justify ripping out the may_shrinkslab logic
> > > > everywhere. The downside is that some microbenchmarks will notice.
> > > > Specifically IO benchmarks that fill memory and reread (particularly
> > > > rereading the metadata via any inode operation) may show reduced
> > > > results. Such benchmarks can be strongly affected by whether the inode
> > > > information is still memory resident and watermark boosting reduces
> > > > the chances the data is still resident in memory. Technically still a
> > > > regression but a tunable one.
> > > >
> > > > Hence the following "it builds" patch that has zero supporting data on
> > > > whether it's a good idea or not.
> > > >
> > >
> > > This is a more complete version of the same patch that summarises the
> > > problem and includes data from my own testing
> > ....
> > > A fsmark benchmark configuration was constructed similar to
> > > what Dave reported and is codified by the mmtest configuration
> > > config-io-fsmark-small-file-stream. It was evaluated on a 1-socket machine
> > > to avoid dealing with NUMA-related issues and the timing of reclaim. The
> > > storage was an SSD Samsung Evo and a fresh XFS filesystem was used for
> > > the test data.
> >
> > Have you run fstrim on that drive recently? I'm running these tests
> > on a 960 EVO ssd, and when I started looking at shrinkers 3 weeks
> > ago I had all sorts of whacky performance problems and inconsistent
> > results. Turned out there were all sorts of random long IO latencies
> > occurring (in the hundreds of milliseconds) because the drive was
> > constantly running garbage collection to free up space. As a result
> > it was both blocking on GC and thermal throttling under these fsmark
> > workloads.
> >
>
> No, I was under the impression that making a new filesystem typically
> trimmed it as well. Maybe that's just some filesystems (e.g. ext4) or
> just completely wrong.
Depends. IIRC, some have turned that off by default because of the
number of poor implementations that take minutes to trim a whole
device. XFS discards by default, but that doesn't mean it actually
gets done, e.g. it might be on a block device that does not support
or pass down discard requests.
FWIW, I run these in a VM on a sparse filesystem image (500TB) held
in a file on the host XFS filesystem and:
$ cat /sys/block/vdc/queue/discard_max_bytes
0
Discard requests don't pass down through the virtio block device
(nor do I really want them to). Hence I have to punch the image file
and fstrim on the host side before launching the VM that runs the
tests...
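For reference, the host-side prep described above can be sketched roughly as
follows. The image path and mount point here are assumptions for illustration,
not taken from this thread:

```shell
# Host-side prep before launching the test VM, as described above.
# IMG and its mount point are hypothetical paths.
IMG=/images/test.img

# Punch out zero-filled blocks so the backing file is sparse again
fallocate --dig-holes "$IMG"

# Tell the SSD which blocks on the host filesystem are now free
# (needs root; -v reports how many bytes were trimmed)
fstrim -v /images
```

Inside the guest, `cat /sys/block/vdc/queue/discard_max_bytes` returning 0 (as
shown above) confirms the virtio block device is not passing discard through,
which is why the trimming has to happen on the host side.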
> > then ran
> > fstrim on it to tell the drive all the space is free. Drive temps
> > dropped 30C immediately, and all of the whacky performance anomalies
> > went away. I now fstrim the drive in my vm startup scripts before
> > each test run, and it's giving consistent results again.
> >
>
> I'll replicate that if making a new filesystem is not guaranteed to
> trim. It'll muck up historical data but that happens to me every so
> often anyway.
mkfs.xfs should be doing it if you're directly on top of the SSD.
Just wanted to check seeing as I've recently been bitten by this.
> > That looks a lot better. Patch looks reasonable, though I'm
> > interested to know what impact it has on tests you ran in the
> > original commit for the boosting.
> >
>
> I'll find out soon enough but I'm leaning on the side that kswapd reclaim
> should be predictable and that even if there are some performance problems
> as a result of it, there will be others that see a gain. It'll be a case
> of "no matter what way you jump, someone shouts" but kswapd having spiky
> unpredictable behaviour is a recipe for "sometimes my machine is crap
> and I've no idea why".
Yeah, and that's precisely the motivation for getting XFS inode
reclaim to avoid blocking altogether and relying on memory reclaim
to back off when appropriate. I expect there will be other problems
I find with reclaim backoff and balance as I kick the tyres more...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
Thread overview: 14+ messages
2019-08-07 9:18 [PATCH] [Regression, v5.0] mm: boosted kswapd reclaim b0rks system cache balance Dave Chinner
2019-08-07 9:30 ` Michal Hocko
2019-08-07 15:03 ` Mel Gorman
2019-08-07 20:56 ` Mel Gorman
2019-08-07 22:32 ` Dave Chinner
2019-08-07 23:48 ` Mel Gorman
2019-08-08 0:26 ` Dave Chinner [this message]
2019-08-08 15:36 ` Christoph Hellwig
2019-08-08 17:04 ` Mel Gorman
2019-08-07 22:08 ` Dave Chinner
2019-08-07 22:33 ` Dave Chinner
2019-08-07 23:55 ` Mel Gorman
2019-08-08 0:30 ` Dave Chinner
2019-08-08 5:51 ` Dave Chinner