* [LSF/MM/BPF TOPIC] automating file system benchmarks
@ 2019-12-13  1:47 Theodore Y. Ts'o
  2019-12-13  5:12 ` [Lsf-pc] " Amir Goldstein
  0 siblings, 1 reply; 3+ messages in thread
From: Theodore Y. Ts'o @ 2019-12-13  1:47 UTC
  To: lsf-pc, linux-fsdevel

I'd like to have a discussion at LSF/MM about making it easier and
more accessible for file system developers to run benchmarks as part
of their development processes.

My interest in this was sparked a few weeks ago, when there was a
click-bait article published on Phoronix, "The Disappointing Direction
Of Linux Performance From 4.16 To 5.4 Kernels"[1], wherein the author
published results which seemed to indicate a radical decrease in
performance in a pre-5.4 kernel, with the 5.4(-ish) kernel performing
four times worse on a SQLite test.

[1] https://www.phoronix.com/scan.php?page=article&item=linux-416-54&num=1

To reproduce this with the exact same benchmark, I decided to try
the Phoronix Test Suite (PTS).  Somewhat to my surprise, it was well
documented[2], straightforward to set up, and a lot of care was put
into being able to get repeatable results from running a large set of
benchmarks.  And so I added support[3] for running it to my
gce-xfstests test automation framework.

[2] https://www.phoronix-test-suite.com/documentation/phoronix-test-suite.html
[3] https://github.com/tytso/xfstests-bld/commit/b8236c94caf0686b1cfacb1348b5a46fa1f52f48

Fortunately, using a controlled set of kernel configs, I could find no
evidence of a massive performance regression a few days before 5.4 was
released by Linus.  These results were reproduced by Jan Kara using mmtests.

Josef Bacik added a fio benchmark to xfstests in late 2017[4], and
this was discussed at the 2018 LSF/MM.  Unfortunately, there doesn't
seem to have been any additional work to add benchmarking
functionality to xfstests.

[4] https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git/commit/?id=e0d95552fdb2948c63b29af4a8169a2027f84a1d

In addition to using xfstests, I have started using PTS as a way to
sanity check patch submissions to ext4.  I've also started
investigating mmtests; it isn't quite as polished and well
documented, but has better support for running monitoring scripts
(e.g., iostat, perf, systemtap, etc.) in parallel with running
benchmarks as workloads.

I'd like to share what I've learned, and also hopefully learn what
other file system developers have been using to automate measuring
file system performance as a part of their development workflow,
especially if it has been packaged up so other people can more easily
replicate their findings.

Cheers,

							- Ted

* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] automating file system benchmarks
  2019-12-13  1:47 [LSF/MM/BPF TOPIC] automating file system benchmarks Theodore Y. Ts'o
@ 2019-12-13  5:12 ` Amir Goldstein
  2019-12-13 15:35   ` Theodore Y. Ts'o
  0 siblings, 1 reply; 3+ messages in thread
From: Amir Goldstein @ 2019-12-13  5:12 UTC
  To: Theodore Y. Ts'o; +Cc: lsf-pc, linux-fsdevel, Jan Kara, Dave Chinner

On Fri, Dec 13, 2019 at 3:47 AM Theodore Y. Ts'o <tytso@mit.edu> wrote:
>
> I'd like to have a discussion at LSF/MM about making it easier and
> more accessible for file system developers to run benchmarks as part
> of their development processes.
>
> My interest in this was sparked a few weeks ago, when there was a
> click-bait article published on Phoronix, "The Disappointing Direction
> Of Linux Performance From 4.16 To 5.4 Kernels"[1], wherein the author
> published results which seemed to indicate a radical decrease in
> performance in a pre-5.4 kernel, with the 5.4(-ish) kernel performing
> four times worse on a SQLite test.
>
> [1] https://www.phoronix.com/scan.php?page=article&item=linux-416-54&num=1
>
> To reproduce this with the exact same benchmark, I decided to try
> the Phoronix Test Suite (PTS).  Somewhat to my surprise, it was well
> documented[2], straightforward to set up, and a lot of care was put
> into being able to get repeatable results from running a large set of
> benchmarks.  And so I added support[3] for running it to my
> gce-xfstests test automation framework.
>

Very nice :)
You should post an [ANNOUNCE] every now and then.
I rarely check upstream of xfstests-bld, because it just-works ;-)

> [2] https://www.phoronix-test-suite.com/documentation/phoronix-test-suite.html
> [3] https://github.com/tytso/xfstests-bld/commit/b8236c94caf0686b1cfacb1348b5a46fa1f52f48
>
> Fortunately, using a controlled set of kernel configs, I could find no
> evidence of a massive performance regression a few days before 5.4 was
> released by Linus.  These results were reproduced by Jan Kara using mmtests.
>
> Josef Bacik added a fio benchmark to xfstests in late 2017[4], and
> this was discussed at the 2018 LSF/MM.  Unfortunately, there doesn't
> seem to have been any additional work to add benchmarking
> functionality to xfstests.
>
> [4] https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git/commit/?id=e0d95552fdb2948c63b29af4a8169a2027f84a1d
>
> In addition to using xfstests, I have started using PTS as a way to
> sanity check patch submissions to ext4.  I've also started

I suppose you have access to a dedicated metal in the cloud for running
your performance regression tests? Or at least a dedicated metal per execution.
I have not looked into GCE, so don't know how easy it is and how expensive
to use GCE this way.
Is there any chance of Google donating this sort of resource for a performance
regression test bot?

> investigating using mmtests as well; mmtests isn't quite as polished
> and well documented, but has better support for running
> monitoring scripts (e.g., iostat, perf, systemtap, etc.) in parallel
> with running benchmarks as workloads.
>
> I'd like to share what I've learned, and also hopefully learn what
> other file system developers have been using to automate measuring
> file system performance as a part of their development workflow,
> especially if it has been packaged up so other people can more easily
> replicate their findings.
>

Trying to say this carefully, hopefully without starting a mud-tossing war:
it is sometimes useful to compare benchmark results across different
filesystems on the same benchmark/hardware. Not in order to prove that
one filesystem is "better" than the other (we are all for diversity),
but because it can sometimes draw our attention to core issues.

This simple question [1] that I posted about a huge difference in the
fio randomrw benchmark on xfs vs. ext4 led to patches being posted to
address issues in both xfs and ext4 [2][3][4], and to the discovery of
bugs in other filesystems as well.

Thanks,
Amir.

[1] https://lore.kernel.org/linux-xfs/CAOQ4uxi0pGczXBX7GRAFs88Uw0n1ERJZno3JSeZR71S1dXg+2w@mail.gmail.com/
[2] https://lore.kernel.org/linux-xfs/20190404165737.30889-1-amir73il@gmail.com/
[3] https://lore.kernel.org/linux-xfs/20190829131034.10563-1-jack@suse.cz/
[4] https://lore.kernel.org/linux-ext4/20190603132155.20600-1-jack@suse.cz/

* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] automating file system benchmarks
  2019-12-13  5:12 ` [Lsf-pc] " Amir Goldstein
@ 2019-12-13 15:35   ` Theodore Y. Ts'o
  0 siblings, 0 replies; 3+ messages in thread
From: Theodore Y. Ts'o @ 2019-12-13 15:35 UTC
  To: Amir Goldstein; +Cc: lsf-pc, linux-fsdevel, Jan Kara, Dave Chinner

On Fri, Dec 13, 2019 at 07:12:03AM +0200, Amir Goldstein wrote:
> 
> Very nice :)
> You should post an [ANNOUNCE] every now and then.
> I rarely check upstream of xfstests-bld, because it just-works ;-)

Right now, the PTS support in gce-xfstests is very manual.  The VM is
launched via "gce-xfstests pts"; after a few minutes you have to log
into the VM with "gce-xfstests ssh pts", run
"phoronix-test-suite pts/disk", answer a few questions, and then
afterwards run "pts-save --results" and kill off the pts VM.
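
As a rough sketch, the manual steps above can be written down as a
checklist script.  It only prints the steps and launches nothing.  The
commands are quoted from the message; the "where" column (workstation
vs. pts VM) is an assumption added for illustration, and the command
for killing the VM is not given in the message, so it is left as a
description.

```shell
#!/bin/sh
# Checklist for the manual PTS run described above.
# Prints the steps only; nothing is executed.
# Commands are quoted from the message; the location labels are assumed.
pts_manual_steps() {
    cat <<'EOF'
1. workstation: gce-xfstests pts
2. workstation: gce-xfstests ssh pts          (after a few minutes)
3. pts VM:      phoronix-test-suite pts/disk  (answer the prompts)
4. pts VM:      pts-save --results
5. workstation: kill off the pts VM           (command not given in the message)
EOF
}

pts_manual_steps
```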

I want to get it to the point where "gce-xfstests pts" is sufficient,
where the benchmarks are run and the VM is automatically shut down
afterwards.  Also still to be done is to add support for kvm-xfstests.
That'll hopefully be done in the next month or so, as I have some free
time.

> I suppose you have access to a dedicated metal in the cloud for running
> your performance regression tests? Or at least a dedicated metal per execution.

I'm not currently using a dedicated VM.  I've been primarily
using a 1TB PD-SSD as the storage medium and an n1-standard-16 as the
VM type.  That's been fairly reliable.

Using GCE Local SSD is a little tricky because there is more than one
type of underlying hardware, and that can lead to differing results
across different VMs.  What you *can* do is to just use the same VM,
and then kexec into a different kernel each time.  This can be done
manually, by copying a different kernel into /root/bzImage,
running /root/do_kexec, and then running the next benchmark.
Eventually my plan is to support this with a command like

gce-xfstests --kernel gs://$B/bzImage-4.19,gs://bz/$B/bzImage-5.3 \
	--local-ssd pts
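
Until such an option exists, the manual kexec procedure can be sketched
as a small plan generator.  This is only an illustration: it prints the
per-kernel steps for a comma-separated kernel list and executes
nothing.  /root/bzImage and /root/do_kexec come from the message; the
function name kexec_plan and the kernel names in the example call are
hypothetical, and how the image is copied onto the VM is not specified
in the message, so that step is printed as a description.

```shell
#!/bin/sh
# Print the manual per-kernel steps for a comma-separated kernel list.
# usage: kexec_plan <kernel1,kernel2,...> <benchmark command>
# Dry run only: the steps are printed, never executed.
kexec_plan() {
    kernels=$1
    benchmark=$2
    old_ifs=$IFS
    IFS=,                       # split the kernel list on commas
    for k in $kernels; do
        echo "copy $k to /root/bzImage on the VM"
        echo "run /root/do_kexec on the VM"
        echo "run the benchmark: $benchmark"
    done
    IFS=$old_ifs                # restore normal word splitting
}

# Example with placeholder kernel names.
kexec_plan "bzImage-4.19,bzImage-5.3" "phoronix-test-suite pts/disk"
```

Because every kernel is booted on the same VM, the Local SSD hardware
stays constant across runs, which is the point of the kexec approach.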

The reason why Local SSD is interesting is that GCE's Persistent Disk
has a very different performance profile than HDDs or SSDs --- it
acts much more like a battery-backed enterprise storage array, in that
CACHE FLUSHes are super fast, as are random writes.  GCE Local SSD
acts like, well, a real high performance SSD, and it's good to
benchmark both.

> I have not looked into GCE, so don't know how easy it is and how expensive
> to use GCE this way.

A benchmark run does take longer than "gce-xfstests -g auto", since
you generally use a larger VM and a larger amount of storage.  A 1TB
PD-SSD plus an n1-standard-16 VM is about a dollar an hour, and it
takes 3-4 hours to run the pts/disk benchmark suite.  So call it $3-4
for a single performance test run.
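
The estimate above is just hourly rate times run time.  A minimal
sketch, using the approximate figures from the message (about $1/hour,
3 to 4 hours); the helper name run_cost is hypothetical, and actual GCE
pricing will differ:

```shell
#!/bin/sh
# Estimated cost of one benchmark run.
# usage: run_cost <dollars_per_hour> <hours>
run_cost() {
    awk -v rate="$1" -v hours="$2" 'BEGIN { printf "%.2f\n", rate * hours }'
}

run_cost 1 3    # low end of the estimate:  prints 3.00
run_cost 1 4    # high end of the estimate: prints 4.00
```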

> Is there any chance of Google donating this sort of resource for a performance
> regression test bot?

We're not at the point where we could run gce-xfstests (either for
functional or performance testing) as a bot.  There's still some
development work that needs to happen before that could be a reality.
For now, if there was a development team that wanted to use
gce-xfstests for performance and benchmarking, I'm happy to put them
in contact with the folks at Google who support open source
projects.

							- Ted
