From: Dave Chinner <david@fromorbit.com>
To: Jan Kara <jack@suse.cz>
Cc: linux-xfs@vger.kernel.org, "Darrick J. Wong" <djwong@kernel.org>,
Dave Chinner <dchinner@redhat.com>
Subject: Re: Regression in XFS for fsync heavy workload
Date: Thu, 17 Mar 2022 10:26:36 +1100 [thread overview]
Message-ID: <20220316232636.GT3927073@dread.disaster.area> (raw)
In-Reply-To: <20220316193840.3t2ahjxnkvmk6okz@quack3.lan>
On Wed, Mar 16, 2022 at 08:38:40PM +0100, Jan Kara wrote:
> On Wed 16-03-22 11:09:34, Jan Kara wrote:
> > On Wed 16-03-22 18:44:59, Dave Chinner wrote:
> > > On Wed, Mar 16, 2022 at 12:06:27PM +1100, Dave Chinner wrote:
> > > > On Tue, Mar 15, 2022 at 01:49:43PM +0100, Jan Kara wrote:
> > > > > Hello,
> > > > >
> > > > > I was tracking down a regression in the dbench workload on XFS that we
> > > > > identified during our performance testing. These are results from one
> > > > > of our test machines (a server with 64GB of RAM, 48 CPUs, and a SATA
> > > > > SSD for the test disk):
> > > > >
> > > > >                          good                   bad
> > > > > Amean     1        64.29 (   0.00%)       73.11 * -13.70%*
> > > > > Amean     2        84.71 (   0.00%)       98.05 * -15.75%*
> > > > > Amean     4       146.97 (   0.00%)      148.29 *  -0.90%*
> > > > > Amean     8       252.94 (   0.00%)      254.91 *  -0.78%*
> > > > > Amean    16       454.79 (   0.00%)      456.70 *  -0.42%*
> > > > > Amean    32       858.84 (   0.00%)      857.74 (   0.13%)
> > > > > Amean    64      1828.72 (   0.00%)     1865.99 *  -2.04%*
> > > > >
> > > > > Note that the numbers are actually times to complete the workload, not
> > > > > traditional dbench throughput numbers, so lower is better.
> > > ....
> > >
> > > > > This should still
> > > > > submit it rather early to provide the latency advantage. Otherwise postpone
> > > > > the flush to the moment we know we are going to flush the iclog to save
> > > > > pointless flushes. But we would have to record whether the flush happened
> > > > > or not in the iclog and it would all get a bit hairy...
> > > >
> > > > I think we can just set the NEED_FLUSH flag appropriately.
> > > >
> > > > However, given all this, I'm wondering if the async cache flush was
> > > > really a case of premature optimisation. That is, we don't really
> > > > gain anything by reducing the flush latency of the first iclog write
> > > > when we are writing 100-1000 iclogs before the commit record, and it
> > > > can be harmful to some workloads by issuing more flushes than we
> > > > need to.
> > > >
> > > > So perhaps the right thing to do is just get rid of it and always
> > > > mark the first iclog in a checkpoint as NEED_FLUSH....
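To make that concrete, here's a toy userspace model of the idea - tag the
first iclog of a checkpoint so its own write carries the cache flush,
instead of issuing a separate early async flush. All structures and names
below are invented purely for illustration; this is not the actual XFS
code, nor the patch mentioned further down:

/*
 * Toy userspace model - NOT the real XFS code. All names are invented.
 */
#include <stdbool.h>
#include <stdio.h>

struct toy_iclog {
	bool need_flush;	/* this iclog's write carries a preflush */
	bool need_fua;		/* commit record forced to stable storage */
};

/*
 * Mark the first iclog of a checkpoint so its write flushes the device
 * cache, making earlier metadata writeback stable before the checkpoint
 * is ordered against it. Later iclogs in the same checkpoint carry no
 * flush of their own.
 */
static void toy_start_iclog(struct toy_iclog *ic, bool first_in_checkpoint,
			    bool commit_record)
{
	if (first_in_checkpoint)
		ic->need_flush = true;
	if (commit_record)
		ic->need_fua = true;
}

int main(void)
{
	struct toy_iclog first = {0}, middle = {0}, commit = {0};

	toy_start_iclog(&first, true, false);
	toy_start_iclog(&middle, false, false);
	toy_start_iclog(&commit, false, true);

	printf("first:  flush=%d fua=%d\n", first.need_flush, first.need_fua);
	printf("middle: flush=%d fua=%d\n", middle.need_flush, middle.need_fua);
	printf("commit: flush=%d fua=%d\n", commit.need_flush, commit.need_fua);
	return 0;
}

The point is just that the flush decision rides on the iclog's own state at
write time, so there is no separate asynchronous flush to issue and track.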
> > >
> > > So I've run some tests on code that does this, and the storage I've
> > > tested it on shows largely no difference in streaming CIL commit and
> > > fsync-heavy workloads when comparing sync vs async cache flushes. One
> > > set of tests was against high-speed NVMe SSDs, the other against
> > > old, slower SATA SSDs.
> > >
> > > Jan, can you run the patch below (against 5.17-rc8) and see what
> > > results you get on your modified dbench test?
> >
> > Sure, I'll run the test. I forgot to mention that in the vanilla upstream
> > kernel I could see the difference in the number of cache flushes caused by
> > the XFS changes but no actual change in dbench numbers (they were still
> > comparable to the bad ones from my test). The XFS change made a material
> > difference to dbench performance only together with the scheduler / cpuidle /
> > frequency scaling fixes we have in our SLE kernel (I didn't try to pin down
> > which exactly - I guess I can try working around that by using the
> > performance cpufreq governor and disabling low C-states so that I can test
> > stock vanilla kernels). Thanks for the patch!
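As an aside, for anyone wanting to reproduce that pinned-down setup on a
stock kernel, something along these lines should do it: set every CPU's
cpufreq governor to "performance" and hold /dev/cpu_dma_latency open with a
0us request so deep C-states stay disabled while it runs. This is a rough,
untested sketch using the standard sysfs and PM QoS interfaces, not part of
the test harness used in this thread, and it needs root:

#include <fcntl.h>
#include <glob.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	glob_t g;
	int fd;

	/* 1. Set every CPU's scaling governor to "performance". */
	if (glob("/sys/devices/system/cpu/cpu*/cpufreq/scaling_governor",
		 0, NULL, &g) == 0) {
		for (size_t i = 0; i < g.gl_pathc; i++) {
			fd = open(g.gl_pathv[i], O_WRONLY);
			if (fd < 0)
				continue;
			if (write(fd, "performance", 11) != 11)
				perror(g.gl_pathv[i]);
			close(fd);
		}
		globfree(&g);
	}

	/*
	 * 2. Request 0us wakeup latency via PM QoS; the request stays
	 * active for as long as this file descriptor is held open, which
	 * keeps the CPUs out of deep C-states.
	 */
	fd = open("/dev/cpu_dma_latency", O_WRONLY);
	if (fd >= 0) {
		int32_t latency_us = 0;

		if (write(fd, &latency_us, sizeof(latency_us)) < 0)
			perror("/dev/cpu_dma_latency");
		pause();	/* keep the request alive until killed */
	}
	return 0;
}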
>
> Yup, so with C-states limited and the performance cpufreq governor in use, I
> can see your patch significantly helps dbench performance:
>
>                       5.17-rc8-vanilla       5.17-rc8-patched
> Amean     1        71.22 (   0.00%)       64.94 *   8.81%*
> Amean     2        93.03 (   0.00%)       84.80 *   8.85%*
> Amean     4       150.54 (   0.00%)      137.51 *   8.66%*
> Amean     8       252.53 (   0.00%)      242.24 *   4.08%*
> Amean    16       454.13 (   0.00%)      439.08 *   3.31%*
> Amean    32       835.24 (   0.00%)      829.74 *   0.66%*
> Amean    64      1740.59 (   0.00%)     1686.73 *   3.09%*
>
> Performance is restored to the values from before commit bad77c375e8d ("xfs:
> CIL checkpoint flushes caches unconditionally"), and so is the number of
> cache flushes.
OK, good to know, thanks for testing quickly. I'll spin this up into
a proper patch that removes the async flush functionality and
support infrastructure.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com