public inbox for linux-fsdevel@vger.kernel.org
* [LSF/MM/BPF TOPIC] A common project for file system performance testing
@ 2026-02-12 13:42 Hans Holmberg
  2026-02-12 14:31 ` Daniel Wagner
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Hans Holmberg @ 2026-02-12 13:42 UTC (permalink / raw)
  To: lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org
  Cc: Damien Le Moal, hch, Johannes Thumshirn, Naohiro Aota,
	josef@toxicpanda.com, jack@suse.com, Shinichiro Kawasaki

Hi all,

I'd like to propose a topic on file system benchmarking:

Can we establish a common project (like xfstests, blktests) for
measuring file system performance? The idea is to share a common base
containing peer-reviewed workloads and scripts to run them, collect and
store results.

Benchmarking is hard hard hard, let's share the burden!

A shared project would remove the need for everyone to cook up their
own frameworks and help define a set of workloads that the community
cares about.

Myself, I want to ensure that any optimizations I work on:

1) Do not introduce regressions in performance elsewhere before I
   submit patches
2) Can be reliably reproduced, verified, and regression‑tested by the
   community

The focus, I think, would first be on synthetic workloads (e.g. fio)
but it could be expanded to running application and database workloads
(e.g. RocksDB).

The fsperf[1] project is a Python-based implementation for file system
benchmarking that we can use as a base for the discussion.
There are probably others out there as well.

[1] https://github.com/josefbacik/fsperf
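To make this concrete, a shared, peer-reviewed workload definition could be as
small as a single fio job file; the file name, mount point, and sizes below are
illustrative only, not taken from any existing suite:

```ini
; randwrite-4k.fio - illustrative example of a shared, reviewed workload
[global]
directory=/mnt/test     ; file system under test mounted here
size=4G                 ; working-set size per job
runtime=60
time_based=1
ioengine=io_uring
direct=1

[randwrite-4k]
rw=randwrite
bs=4k
iodepth=32
numjobs=4
group_reporting=1
```

Running the same reviewed job file before and after a patch would give
directly comparable numbers across developers and machines.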

Cheers,

Hans

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [LSF/MM/BPF TOPIC] A common project for file system performance testing
  2026-02-12 13:42 [LSF/MM/BPF TOPIC] A common project for file system performance testing Hans Holmberg
@ 2026-02-12 14:31 ` Daniel Wagner
  2026-02-13 11:50   ` Shinichiro Kawasaki
  2026-02-12 16:42 ` Johannes Thumshirn
  2026-02-18 15:31 ` Theodore Ts'o
  2 siblings, 1 reply; 14+ messages in thread
From: Daniel Wagner @ 2026-02-12 14:31 UTC (permalink / raw)
  To: Hans Holmberg
  Cc: lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
	Damien Le Moal, hch, Johannes Thumshirn, Naohiro Aota,
	josef@toxicpanda.com, jack@suse.com, Shinichiro Kawasaki

On Thu, Feb 12, 2026 at 01:42:35PM +0000, Hans Holmberg wrote:
> A shared project would remove the need for everyone to cook up their
> own frameworks and help define a set of workloads that the community
> cares about.
> 
> Myself, I want to ensure that any optimizations I work on:
> 
> 1) Do not introduce regressions in performance elsewhere before I
>    submit patches
> 2) Can be reliably reproduced, verified, and regression‑tested by the
>    community

Not that I use it very often but mmtests is pretty good for this:

https://github.com/gormanm/mmtests


* Re: [LSF/MM/BPF TOPIC] A common project for file system performance testing
  2026-02-12 13:42 [LSF/MM/BPF TOPIC] A common project for file system performance testing Hans Holmberg
  2026-02-12 14:31 ` Daniel Wagner
@ 2026-02-12 16:42 ` Johannes Thumshirn
  2026-02-12 17:32   ` Josef Bacik
  2026-02-18 15:31 ` Theodore Ts'o
  2 siblings, 1 reply; 14+ messages in thread
From: Johannes Thumshirn @ 2026-02-12 16:42 UTC (permalink / raw)
  To: Hans Holmberg, lsf-pc@lists.linux-foundation.org,
	linux-fsdevel@vger.kernel.org
  Cc: Damien Le Moal, hch, Naohiro Aota, josef@toxicpanda.com,
	jack@suse.com, Shinichiro Kawasaki

On 2/12/26 2:42 PM, Hans Holmberg wrote:
> Hi all,
>
> I'd like to propose a topic on file system benchmarking:
>
> Can we establish a common project (like xfstests, blktests) for
> measuring file system performance? The idea is to share a common base
> containing peer-reviewed workloads and scripts to run them, collect and
> store results.
>
> Benchmarking is hard hard hard, let's share the burden!

Definitely I'm all in!

> A shared project would remove the need for everyone to cook up their
> own frameworks and help define a set of workloads that the community
> cares about.
>
> Myself, I want to ensure that any optimizations I work on:
>
> 1) Do not introduce regressions in performance elsewhere before I
>     submit patches
> 2) Can be reliably reproduced, verified, and regression‑tested by the
>     community
>
> The focus, I think, would first be on synthetic workloads (e.g. fio)
> but it could be expanded to running application and database workloads
> (e.g. RocksDB).
>
> The fsperf[1] project is a Python-based implementation for file system
> benchmarking that we can use as a base for the discussion.
> There are probably others out there as well.
>
> [1] https://github.com/josefbacik/fsperf

I was about to mention Josef's fsperf project. We also used to have some 
sort of a dashboard for fsperf results for BTRFS, but that vanished 
together with Josef.

A common dashboard with per workload statistics for different 
filesystems would be a great thing to have, but for that to work, we'd 
need different hardware and probably the vendors of said hardware to buy 
into it.

For developers it would be a benefit to spot potential regressions and 
overall weak points; for users it would be a nice tool to see what FS to 
pick for what workload.

BUT someone has to do the job setting everything up and maintaining it.


Byte,

     Johannes



* Re: [LSF/MM/BPF TOPIC] A common project for file system performance testing
  2026-02-12 16:42 ` Johannes Thumshirn
@ 2026-02-12 17:32   ` Josef Bacik
  2026-02-12 17:37     ` [Lsf-pc] " Amir Goldstein
  0 siblings, 1 reply; 14+ messages in thread
From: Josef Bacik @ 2026-02-12 17:32 UTC (permalink / raw)
  To: Johannes Thumshirn
  Cc: Hans Holmberg, lsf-pc@lists.linux-foundation.org,
	linux-fsdevel@vger.kernel.org, Damien Le Moal, hch, Naohiro Aota,
	jack@suse.com, Shinichiro Kawasaki

On Thu, Feb 12, 2026 at 11:42 AM Johannes Thumshirn
<Johannes.Thumshirn@wdc.com> wrote:
>
> On 2/12/26 2:42 PM, Hans Holmberg wrote:
> > Hi all,
> >
> > I'd like to propose a topic on file system benchmarking:
> >
> > Can we establish a common project (like xfstests, blktests) for
> > measuring file system performance? The idea is to share a common base
> > containing peer-reviewed workloads and scripts to run them, collect and
> > store results.
> >
> > Benchmarking is hard hard hard, let's share the burden!
>
> Definitely I'm all in!
>
> > A shared project would remove the need for everyone to cook up their
> > own frameworks and help define a set of workloads that the community
> > cares about.
> >
> > Myself, I want to ensure that any optimizations I work on:
> >
> > 1) Do not introduce regressions in performance elsewhere before I
> >     submit patches
> > 2) Can be reliably reproduced, verified, and regression‑tested by the
> >     community
> >
> > The focus, I think, would first be on synthetic workloads (e.g. fio)
> > but it could be expanded to running application and database workloads
> > (e.g. RocksDB).
> >
> > The fsperf[1] project is a Python-based implementation for file system
> > benchmarking that we can use as a base for the discussion.
> > There are probably others out there as well.
> >
> > [1] https://github.com/josefbacik/fsperf
>
> I was about to mention Josef's fsperf project. We also used to have some
> sort of a dashboard for fsperf results for BTRFS, but that vanished
> together with Josef.
>
> A common dashboard with per workload statistics for different
> filesystems would be a great thing to have, but for that to work, we'd
> need different hardware and probably the vendors of said hardware to buy
> into it.
>
> For developers it would be a benefit to spot potential regressions and
> overall weak points; for users it would be a nice tool to see what FS to
> pick for what workload.
>
> BUT someone has to do the job setting everything up and maintaining it.
>

I'm still here, the dashboard disappeared because the drives died, and
although the history is interesting it didn't seem like we were using
it much. The A/B testing part of fsperf still is being used regularly
as far as I can tell.

But yeah maintaining a dashboard is always the hardest part, because
it means setting up a website somewhere and a way to sync the pages.
What I had for fsperf was quite janky, basically I'd run it every
night, generate the new report pages, and scp them to the VPS I had.
With Claude we could probably come up with a better way to do this
quickly, since I'm clearly not a web developer. That being said we
still have to have someplace to put it, and have some sort of hardware
that runs stuff consistently.
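The nightly flow described above (run the benchmarks, generate report pages,
copy them to a web host) can be sketched in a few lines. Paths are temporary
stand-ins, the fsperf invocation is only indicated in a comment, and a local
copy takes the place of the scp step:

```python
# Sketch of a nightly publish flow (hypothetical paths; shutil.copytree
# stands in for the real scp to a VPS).
import shutil
import tempfile
from pathlib import Path

results = Path(tempfile.mkdtemp())   # where the nightly run drops reports
webroot = Path(tempfile.mkdtemp())   # stand-in for the VPS document root

# 1) The nightly benchmark run would go here, e.g. invoking fsperf.
# 2) Generate the report page from the run's results.
(results / "index.html").write_text(
    "<html><body><h1>fsperf nightly report</h1></body></html>")

# 3) Publish: in the real setup this was an scp to a VPS.
shutil.copytree(results, webroot, dirs_exist_ok=True)

print((webroot / "index.html").exists())
```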

I think A/B testing just makes more sense in the general use case.
Trends are interesting, but nobody pays attention to them. Thanks,

Josef


* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] A common project for file system performance testing
  2026-02-12 17:32   ` Josef Bacik
@ 2026-02-12 17:37     ` Amir Goldstein
  2026-02-12 19:03       ` Josef Bacik
                         ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Amir Goldstein @ 2026-02-12 17:37 UTC (permalink / raw)
  To: Josef Bacik
  Cc: Johannes Thumshirn, Hans Holmberg,
	lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
	Damien Le Moal, hch, Naohiro Aota, jack@suse.com,
	Shinichiro Kawasaki

On Thu, Feb 12, 2026 at 7:32 PM Josef Bacik <josef@toxicpanda.com> wrote:
>
> On Thu, Feb 12, 2026 at 11:42 AM Johannes Thumshirn
> <Johannes.Thumshirn@wdc.com> wrote:
> >
> > On 2/12/26 2:42 PM, Hans Holmberg wrote:
> > > Hi all,
> > >
> > > I'd like to propose a topic on file system benchmarking:
> > >
> > > Can we establish a common project (like xfstests, blktests) for
> > > measuring file system performance? The idea is to share a common base
> > > containing peer-reviewed workloads and scripts to run them, collect and
> > > store results.
> > >
> > > Benchmarking is hard hard hard, let's share the burden!
> >
> > Definitely I'm all in!
> >
> > > A shared project would remove the need for everyone to cook up their
> > > own frameworks and help define a set of workloads that the community
> > > cares about.
> > >
> > > Myself, I want to ensure that any optimizations I work on:
> > >
> > > 1) Do not introduce regressions in performance elsewhere before I
> > >     submit patches
> > > 2) Can be reliably reproduced, verified, and regression‑tested by the
> > >     community
> > >
> > > The focus, I think, would first be on synthetic workloads (e.g. fio)
> > > but it could be expanded to running application and database workloads
> > > (e.g. RocksDB).
> > >
> > > The fsperf[1] project is a Python-based implementation for file system
> > > benchmarking that we can use as a base for the discussion.
> > > There are probably others out there as well.
> > >
> > > [1] https://github.com/josefbacik/fsperf
> >
> > I was about to mention Josef's fsperf project. We also used to have some
> > sort of a dashboard for fsperf results for BTRFS, but that vanished
> > together with Josef.
> >
> > A common dashboard with per workload statistics for different
> > filesystems would be a great thing to have, but for that to work, we'd
> > need different hardware and probably the vendors of said hardware to buy
> > into it.
> >
> > For developers it would be a benefit to spot potential regressions and
> > overall weak points; for users it would be a nice tool to see what FS to
> > pick for what workload.
> >
> > BUT someone has to do the job setting everything up and maintaining it.
> >
>
> I'm still here, the dashboard disappeared because the drives died, and
> although the history is interesting it didn't seem like we were using
> it much. The A/B testing part of fsperf still is being used regularly
> as far as I can tell.
>
> But yeah maintaining a dashboard is always the hardest part, because
> it means setting up a website somewhere and a way to sync the pages.
> What I had for fsperf was quite janky, basically I'd run it every
> night, generate the new report pages, and scp them to the VPS I had.
> With Claude we could probably come up with a better way to do this
> quickly, since I'm clearly not a web developer. That being said we
> still have to have someplace to put it, and have some sort of hardware
> that runs stuff consistently.
>

That's the main point IMO.

Perf regression tests must rely on consistent hardware setups.
If we do not have organizations to fund/donate this hardware and put in
the engineering effort to drive it, talking about WHAT to run in LSFMM
is useless IMO.

The fact that there is still a single test in fstests/tests/perf since 2017
says it all - it's not about lack of tests to run, it is about lack of resources
and this is not the sort of thing that gets resolved in LSFMM discussion IMO.

Thanks,
Amir.


* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] A common project for file system performance testing
  2026-02-12 17:37     ` [Lsf-pc] " Amir Goldstein
@ 2026-02-12 19:03       ` Josef Bacik
  2026-02-13  9:13         ` Hans Holmberg
  2026-02-13  6:59       ` Johannes Thumshirn
  2026-02-16 10:10       ` Jan Kara
  2 siblings, 1 reply; 14+ messages in thread
From: Josef Bacik @ 2026-02-12 19:03 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Johannes Thumshirn, Hans Holmberg,
	lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
	Damien Le Moal, hch, Naohiro Aota, jack@suse.com,
	Shinichiro Kawasaki

On Thu, Feb 12, 2026 at 12:37 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Thu, Feb 12, 2026 at 7:32 PM Josef Bacik <josef@toxicpanda.com> wrote:
> >
> > On Thu, Feb 12, 2026 at 11:42 AM Johannes Thumshirn
> > <Johannes.Thumshirn@wdc.com> wrote:
> > >
> > > On 2/12/26 2:42 PM, Hans Holmberg wrote:
> > > > Hi all,
> > > >
> > > > I'd like to propose a topic on file system benchmarking:
> > > >
> > > > Can we establish a common project (like xfstests, blktests) for
> > > > measuring file system performance? The idea is to share a common base
> > > > containing peer-reviewed workloads and scripts to run them, collect and
> > > > store results.
> > > >
> > > > Benchmarking is hard hard hard, let's share the burden!
> > >
> > > Definitely I'm all in!
> > >
> > > > A shared project would remove the need for everyone to cook up their
> > > > own frameworks and help define a set of workloads that the community
> > > > cares about.
> > > >
> > > > Myself, I want to ensure that any optimizations I work on:
> > > >
> > > > 1) Do not introduce regressions in performance elsewhere before I
> > > >     submit patches
> > > > 2) Can be reliably reproduced, verified, and regression‑tested by the
> > > >     community
> > > >
> > > > The focus, I think, would first be on synthetic workloads (e.g. fio)
> > > > but it could be expanded to running application and database workloads
> > > > (e.g. RocksDB).
> > > >
> > > > The fsperf[1] project is a Python-based implementation for file system
> > > > benchmarking that we can use as a base for the discussion.
> > > > There are probably others out there as well.
> > > >
> > > > [1] https://github.com/josefbacik/fsperf
> > >
> > > I was about to mention Josef's fsperf project. We also used to have some
> > > sort of a dashboard for fsperf results for BTRFS, but that vanished
> > > together with Josef.
> > >
> > > A common dashboard with per workload statistics for different
> > > filesystems would be a great thing to have, but for that to work, we'd
> > > need different hardware and probably the vendors of said hardware to buy
> > > into it.
> > >
> > > For developers it would be a benefit to spot potential regressions and
> > > overall weak points; for users it would be a nice tool to see what FS to
> > > pick for what workload.
> > >
> > > BUT someone has to do the job setting everything up and maintaining it.
> > >
> >
> > I'm still here, the dashboard disappeared because the drives died, and
> > although the history is interesting it didn't seem like we were using
> > it much. The A/B testing part of fsperf still is being used regularly
> > as far as I can tell.
> >
> > But yeah maintaining a dashboard is always the hardest part, because
> > it means setting up a website somewhere and a way to sync the pages.
> > What I had for fsperf was quite janky, basically I'd run it every
> > night, generate the new report pages, and scp them to the VPS I had.
> > With Claude we could probably come up with a better way to do this
> > quickly, since I'm clearly not a web developer. That being said we
> > still have to have someplace to put it, and have some sort of hardware
> > that runs stuff consistently.
> >
>
> That's the main point IMO.
>
> Perf regression tests must rely on consistent hardware setups.
> If we do not have organizations to fund/donate this hardware and put in
> the engineering effort to drive it, talking about WHAT to run in LSFMM
> is useless IMO.
>
> The fact that there is still a single test in fstests/tests/perf since 2017
> says it all - it's not about lack of tests to run, it is about lack of resources
> and this is not the sort of thing that gets resolved in LSFMM discussion IMO.
>

Well that's because getting code into fstests is miserable, so we just
worked on fsperf outside of fstests. We've added a fair number of
tests, a good bit of infrastructure to do different things and collect
different metrics, even run bpf scripts to get additional metrics.

Hardware is always a problem, that's why I think it's better to just
use this stuff as A/B testing. Making it easy to run means we can have
a consistent tool to run on different machines that may have different
characteristics. I can validate my fix works well on my NVMe drive,
but then Johannes can run on their ZNS drives and see it regresses,
and then we have a consistent test to go back and forth and work with
to get to a fix.  Thanks,

Josef
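The A/B workflow described here can be reduced to a small comparison helper:
run the same workload on the baseline and the patched side, then flag any
delta that exceeds a noise threshold. The function name, threshold, and
numbers below are illustrative, not fsperf's actual interface:

```python
# Sketch of an A/B comparison (illustrative only, not fsperf's code).
from statistics import mean

def compare(baseline: list[float], patched: list[float],
            noise_pct: float = 5.0) -> str:
    """Compare throughput samples (higher is better) from two runs."""
    a, b = mean(baseline), mean(patched)
    delta_pct = (b - a) / a * 100.0
    if delta_pct < -noise_pct:
        return f"REGRESSION: {delta_pct:+.1f}%"
    if delta_pct > noise_pct:
        return f"IMPROVEMENT: {delta_pct:+.1f}%"
    return f"within noise: {delta_pct:+.1f}%"

# e.g. MiB/s from five fio runs on each side (made-up numbers)
print(compare([512, 520, 515, 518, 511], [470, 468, 475, 472, 469]))
```

The same helper run on two different machines (say, a conventional NVMe drive
and a ZNS drive) is what makes the back-and-forth Josef describes possible.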


* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] A common project for file system performance testing
  2026-02-12 17:37     ` [Lsf-pc] " Amir Goldstein
  2026-02-12 19:03       ` Josef Bacik
@ 2026-02-13  6:59       ` Johannes Thumshirn
  2026-02-16 10:10       ` Jan Kara
  2 siblings, 0 replies; 14+ messages in thread
From: Johannes Thumshirn @ 2026-02-13  6:59 UTC (permalink / raw)
  To: Amir Goldstein, Josef Bacik
  Cc: Hans Holmberg, lsf-pc@lists.linux-foundation.org,
	linux-fsdevel@vger.kernel.org, Damien Le Moal, hch, Naohiro Aota,
	jack@suse.com, Shinichiro Kawasaki

On 2/12/26 6:37 PM, Amir Goldstein wrote:
> The fact that there is still a single test in fstests/tests/perf since 2017
> says it all - it's not about lack of tests to run, it is about lack of resources
> and this is not the sort of thing that gets resolved in LSFMM discussion IMO.

As sad as it might be, I'm afraid you're 100% correct on this.



* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] A common project for file system performance testing
  2026-02-12 19:03       ` Josef Bacik
@ 2026-02-13  9:13         ` Hans Holmberg
  0 siblings, 0 replies; 14+ messages in thread
From: Hans Holmberg @ 2026-02-13  9:13 UTC (permalink / raw)
  To: Josef Bacik, Amir Goldstein
  Cc: Johannes Thumshirn, lsf-pc@lists.linux-foundation.org,
	linux-fsdevel@vger.kernel.org, Damien Le Moal, hch, Naohiro Aota,
	jack@suse.com, Shinichiro Kawasaki

On 12/02/2026 20:03, Josef Bacik wrote:
> On Thu, Feb 12, 2026 at 12:37 PM Amir Goldstein <amir73il@gmail.com> wrote:
>>
>> On Thu, Feb 12, 2026 at 7:32 PM Josef Bacik <josef@toxicpanda.com> wrote:
>>>
>>> On Thu, Feb 12, 2026 at 11:42 AM Johannes Thumshirn
>>> <Johannes.Thumshirn@wdc.com> wrote:
>>>>
>>>> On 2/12/26 2:42 PM, Hans Holmberg wrote:
>>>>> Hi all,
>>>>>
>>>>> I'd like to propose a topic on file system benchmarking:
>>>>>
>>>>> Can we establish a common project (like xfstests, blktests) for
>>>>> measuring file system performance? The idea is to share a common base
>>>>> containing peer-reviewed workloads and scripts to run them, collect and
>>>>> store results.
>>>>>
>>>>> Benchmarking is hard hard hard, let's share the burden!
>>>>
>>>> Definitely I'm all in!
>>>>
>>>>> A shared project would remove the need for everyone to cook up their
>>>>> own frameworks and help define a set of workloads that the community
>>>>> cares about.
>>>>>
>>>>> Myself, I want to ensure that any optimizations I work on:
>>>>>
>>>>> 1) Do not introduce regressions in performance elsewhere before I
>>>>>     submit patches
>>>>> 2) Can be reliably reproduced, verified, and regression‑tested by the
>>>>>     community
>>>>>
>>>>> The focus, I think, would first be on synthetic workloads (e.g. fio)
>>>>> but it could be expanded to running application and database workloads
>>>>> (e.g. RocksDB).
>>>>>
>>>>> The fsperf[1] project is a Python-based implementation for file system
>>>>> benchmarking that we can use as a base for the discussion.
>>>>> There are probably others out there as well.
>>>>>
>>>>> [1] https://github.com/josefbacik/fsperf
>>>>
>>>> I was about to mention Josef's fsperf project. We also used to have some
>>>> sort of a dashboard for fsperf results for BTRFS, but that vanished
>>>> together with Josef.
>>>>
>>>> A common dashboard with per workload statistics for different
>>>> filesystems would be a great thing to have, but for that to work, we'd
>>>> need different hardware and probably the vendors of said hardware to buy
>>>> into it.
>>>>
>>>> For developers it would be a benefit to spot potential regressions and
>>>> overall weak points; for users it would be a nice tool to see what FS to
>>>> pick for what workload.
>>>>
>>>> BUT someone has to do the job setting everything up and maintaining it.
>>>>
>>>
>>> I'm still here, the dashboard disappeared because the drives died, and
>>> although the history is interesting it didn't seem like we were using
>>> it much. The A/B testing part of fsperf still is being used regularly
>>> as far as I can tell.
>>>
>>> But yeah maintaining a dashboard is always the hardest part, because
>>> it means setting up a website somewhere and a way to sync the pages.
>>> What I had for fsperf was quite janky, basically I'd run it every
>>> night, generate the new report pages, and scp them to the VPS I had.
>>> With Claude we could probably come up with a better way to do this
>>> quickly, since I'm clearly not a web developer. That being said we
>>> still have to have someplace to put it, and have some sort of hardware
>>> that runs stuff consistently.
>>>
>>
>> That's the main point IMO.
>>
>> Perf regression tests must rely on consistent hardware setups.
>> If we do not have organizations to fund/donate this hardware and put in
>> the engineering effort to drive it, talking about WHAT to run in LSFMM
>> is useless IMO.
>>
>> The fact that there is still a single test in fstests/tests/perf since 2017
>> says it all - it's not about lack of tests to run, it is about lack of resources
>> and this is not the sort of thing that gets resolved in LSFMM discussion IMO.

Amir, I'm not sure that I follow you here; that single test is certainly not
enough to motivate investment in resources to run it :)

But maybe you refer to available fs benchmarking elsewhere?
Where would you point people to for adding new benchmarks? mmtests?

>>
> 
> Well that's because getting code into fstests is miserable, so we just
> worked on fsperf outside of fstests. We've added a fair number of
> tests, a good bit of infrastructure to do different things and collect
> different metrics, even run bpf scripts to get additional metrics.

Josef, did you consider extending mmtests instead of creating fsperf?

> 
> Hardware is always a problem, that's why I think it's better to just
> use this stuff as A/B testing. Making it easy to run means we can have
> a consistent tool to run on different machines that may have different
>>> characteristics. I can validate my fix works well on my NVMe drive,
> but then Johannes can run on their ZNS drives and see it regresses,
> and then we have a consistent test to go back and forth and work with
> to get to a fix.  Thanks,
> 

Yeah, I too find fsperf really useful for A/B testing. It is also relatively
easy to extend and focused solely on file system performance.



* Re: [LSF/MM/BPF TOPIC] A common project for file system performance testing
  2026-02-12 14:31 ` Daniel Wagner
@ 2026-02-13 11:50   ` Shinichiro Kawasaki
  0 siblings, 0 replies; 14+ messages in thread
From: Shinichiro Kawasaki @ 2026-02-13 11:50 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: Hans Holmberg, lsf-pc@lists.linux-foundation.org,
	linux-fsdevel@vger.kernel.org, Damien Le Moal, hch,
	Johannes Thumshirn, Naohiro Aota, josef@toxicpanda.com,
	jack@suse.com

On Feb 12, 2026 / 15:31, Daniel Wagner wrote:
> On Thu, Feb 12, 2026 at 01:42:35PM +0000, Hans Holmberg wrote:
> > A shared project would remove the need for everyone to cook up their
> > own frameworks and help define a set of workloads that the community
> > cares about.
> > 
> > Myself, I want to ensure that any optimizations I work on:
> > 
> > 1) Do not introduce regressions in performance elsewhere before I
> >    submit patches
> > 2) Can be reliably reproduced, verified, and regression‑tested by the
> >    community
> 
> Not that I use it very often but mmtests is pretty good for this:
> 
> https://github.com/gormanm/mmtests

Just FYI, I remember that the last session of the "Kernel Testing &
Dependability MC" [1] at the Linux Plumbers Conf 2025 in Tokyo was related to
this topic. It was titled "A fast path to benchmarking", and it discussed the
new OSS tool named "fastpath" [2], quote,

   "Fastpath is a command-line tool specifically designed for monitoring the
    performance of the Linux kernel by executing structured performance
    benchmarks on a diverse range of hardware platforms."

[1] https://lpc.events/event/19/sessions/228/#20251212
[2] https://fastpath.docs.arm.com/en/latest/introduction.html#overview-of-fastpath


* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] A common project for file system performance testing
  2026-02-12 17:37     ` [Lsf-pc] " Amir Goldstein
  2026-02-12 19:03       ` Josef Bacik
  2026-02-13  6:59       ` Johannes Thumshirn
@ 2026-02-16 10:10       ` Jan Kara
  2026-02-17  8:13         ` Hans Holmberg
  2 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2026-02-16 10:10 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Josef Bacik, Johannes Thumshirn, Hans Holmberg,
	lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
	Damien Le Moal, hch, Naohiro Aota, jack@suse.com,
	Shinichiro Kawasaki

On Thu 12-02-26 18:37:22, Amir Goldstein wrote:
> On Thu, Feb 12, 2026 at 7:32 PM Josef Bacik <josef@toxicpanda.com> wrote:
> > On Thu, Feb 12, 2026 at 11:42 AM Johannes Thumshirn
> > <Johannes.Thumshirn@wdc.com> wrote:
> > >
> > > On 2/12/26 2:42 PM, Hans Holmberg wrote:
> > > > Hi all,
> > > >
> > > > I'd like to propose a topic on file system benchmarking:
> > > >
> > > > Can we establish a common project (like xfstests, blktests) for
> > > > measuring file system performance? The idea is to share a common base
> > > > containing peer-reviewed workloads and scripts to run them, collect and
> > > > store results.
> > > >
> > > > Benchmarking is hard hard hard, let's share the burden!
> > >
> > > Definitely I'm all in!
> > >
> > > > A shared project would remove the need for everyone to cook up their
> > > > own frameworks and help define a set of workloads that the community
> > > > cares about.
> > > >
> > > > Myself, I want to ensure that any optimizations I work on:
> > > >
> > > > 1) Do not introduce regressions in performance elsewhere before I
> > > >     submit patches
> > > > 2) Can be reliably reproduced, verified, and regression‑tested by the
> > > >     community
> > > >
> > > > The focus, I think, would first be on synthetic workloads (e.g. fio)
> > > > but it could be expanded to running application and database workloads
> > > > (e.g. RocksDB).
> > > >
> > > > The fsperf[1] project is a Python-based implementation for file system
> > > > benchmarking that we can use as a base for the discussion.
> > > > There are probably others out there as well.
> > > >
> > > > [1] https://github.com/josefbacik/fsperf
> > >
> > > I was about to mention Josef's fsperf project. We also used to have some
> > > sort of a dashboard for fsperf results for BTRFS, but that vanished
> > > together with Josef.
> > >
> > > A common dashboard with per workload statistics for different
> > > filesystems would be a great thing to have, but for that to work, we'd
> > > need different hardware and probably the vendors of said hardware to buy
> > > into it.
> > >
> > > For developers it would be a benefit to spot potential regressions and
> > > overall weak points; for users it would be a nice tool to see what FS to
> > > pick for what workload.
> > >
> > > BUT someone has to do the job setting everything up and maintaining it.
> > >
> >
> > I'm still here, the dashboard disappeared because the drives died, and
> > although the history is interesting it didn't seem like we were using
> > it much. The A/B testing part of fsperf still is being used regularly
> > as far as I can tell.
> >
> > But yeah maintaining a dashboard is always the hardest part, because
> > it means setting up a website somewhere and a way to sync the pages.
> > What I had for fsperf was quite janky, basically I'd run it every
> > night, generate the new report pages, and scp them to the VPS I had.
> > With Claude we could probably come up with a better way to do this
> > quickly, since I'm clearly not a web developer. That being said we
> > still have to have someplace to put it, and have some sort of hardware
> > that runs stuff consistently.
> >
> 
> That's the main point IMO.
> 
> Perf regression tests must rely on consistent hardware setups.
> If we do not have organizations to fund/donate this hardware and put in
> the engineering effort to drive it, talking about WHAT to run in LSFMM
> is useless IMO.

My dayjob is watching kernel performance for our distro so I feel a bit
obliged to share my view :) I agree the problem here isn't the lack of
tools. We use mmtests as a suite for benchmarking - since that is a rather
generic suite for running benchmarks, with quite a few benchmarks
integrated, the learning curve is relatively steep, but once you get the
hang of it it isn't difficult to use. Fsperf is IMO also fine to use if you
are fine with the limited benchmarking it can do. This isn't the hard part
- anyone can download these suites and run them.

The hard part on benchmarking is having sensible hardware to run the test
on, selecting a benchmark and setup that's actually exercising the code
you're interested in, and getting statistically significant results (i.e.,
discerning random noise from real differences). And these are things that
are difficult to share or solve by discussion.
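The noise-versus-real-difference problem can be made concrete with repeated
runs and a simple statistic. This stdlib-only sketch (made-up run times,
illustrative threshold) computes Welch's t-statistic; a serious framework
would use a proper t-distribution lookup rather than a rule of thumb:

```python
# Welch's t-statistic over two sets of repeated benchmark runs.
# A |t| well above ~2 (for a handful of runs) hints at a real difference.
from math import sqrt
from statistics import mean, variance

def welch_t(a: list[float], b: list[float]) -> float:
    """Welch's t-statistic for two independent samples."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    return (mean(a) - mean(b)) / sqrt(va + vb)

# Made-up run times (seconds, lower is better) for the same workload:
before = [101.2, 98.7, 100.4, 99.9, 100.1]
after = [109.8, 110.5, 110.1, 109.6, 110.3]

print(f"t = {welch_t(before, after):.2f}")
```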

As others wrote, one solution to this is for someone to dedicate the
hardware and engineers with the know-how for this. But realistically I
don't see that happening in the near term. There might be other ways to
share more - like sharing a VM with a preconfigured set of benchmarks
covering the basics (similarly to some people sharing VM images with
preconfigured fstests runs). But I don't have a clear picture of how much
such a thing would help.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] A common project for file system performance testing
  2026-02-16 10:10       ` Jan Kara
@ 2026-02-17  8:13         ` Hans Holmberg
  0 siblings, 0 replies; 14+ messages in thread
From: Hans Holmberg @ 2026-02-17  8:13 UTC (permalink / raw)
  To: Jan Kara, Amir Goldstein, Shinichiro Kawasaki
  Cc: Josef Bacik, Johannes Thumshirn,
	lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
	Damien Le Moal, hch, Naohiro Aota, jack@suse.com

On 16/02/2026 11:10, Jan Kara wrote:
> On Thu 12-02-26 18:37:22, Amir Goldstein wrote:
>> On Thu, Feb 12, 2026 at 7:32 PM Josef Bacik <josef@toxicpanda.com> wrote:
>>> On Thu, Feb 12, 2026 at 11:42 AM Johannes Thumshirn
>>> <Johannes.Thumshirn@wdc.com> wrote:
>>>>
>>>> On 2/12/26 2:42 PM, Hans Holmberg wrote:
>>>>> Hi all,
>>>>>
>>>>> I'd like to propose a topic on file system benchmarking:
>>>>>
>>>>> Can we establish a common project (like xfstests, blktests) for
>>>>> measuring file system performance? The idea is to share a common base
>>>>> containing peer-reviewed workloads and scripts to run these, collect and
>>>>> store results.
>>>>>
>>>>> Benchmarking is hard hard hard, let's share the burden!
>>>>
>>>> Definitely I'm all in!
>>>>
>>>>> A shared project would remove the need for everyone to cook up their
>>>>> own frameworks and help define a set of workloads that the community
>>>>> cares about.
>>>>>
>>>>> Myself, I want to ensure that any optimizations I work on:
>>>>>
>>>>> 1) Do not introduce regressions in performance elsewhere before I
>>>>>     submit patches
>>>>> 2) Can be reliably reproduced, verified, and regression‑tested by the
>>>>>     community
>>>>>
>>>>> The focus, I think, would first be on synthetic workloads (e.g. fio)
>>>>> but it could be expanded to running application and database workloads
>>>>> (e.g. RocksDB).
>>>>>
>>>>> The fsperf[1] project is a python-based implementation for file system
>>>>> benchmarking that we can use as a base for the discussion.
>>>>> There are probably others out there as well.
>>>>>
>>>>> [1] https://github.com/josefbacik/fsperf
>>>>
>>>> I was about to mention Josef's fsperf project. We also used to have some
>>>> sort of a dashboard for fsperf results for BTRFS, but that vanished
>>>> together with Josef.
>>>>
>>>> A common dashboard with per workload statistics for different
>>>> filesystems would be a great thing to have, but for that to work, we'd
>>>> need different hardware and probably the vendors of said hardware to buy
>>>> into it.
>>>>
>>>> For developers it would be a benefit to see eventual regressions and
>>>> overall weak points, for users it would be a nice tool to see what FS to
>>>> pick for what workload.
>>>>
>>>> BUT someone has to do the job setting everything up and maintaining it.
>>>>
>>>
>>> I'm still here, the dashboard disappeared because the drives died, and
>>> although the history is interesting it didn't seem like we were using
>>> it much. The A/B testing part of fsperf still is being used regularly
>>> as far as I can tell.
>>>
>>> But yeah maintaining a dashboard is always the hardest part, because
>>> it means setting up a website somewhere and a way to sync the pages.
>>> What I had for fsperf was quite janky, basically I'd run it every
>>> night, generate the new report pages, and scp them to the VPS I had.
>>> With Claude we could probably come up with a better way to do this
>>> quickly, since I'm clearly not a web developer. That being said we
>>> still have to have someplace to put it, and have some sort of hardware
>>> that runs stuff consistently.
>>>
>>
>> That's the main point IMO.
>>
>> Perf regression tests must rely on consistent hardware setups.
>> If we do not have organizations to fund/donate this hardware and put in
>> the engineering effort to drive it, talking about WHAT to run in LSFMM
>> is useless IMO.
> 
> My day job is watching kernel performance for our distro so I feel a bit
> obliged to share my view :) I agree the problem here isn't the lack of
> tools. We use mmtests as our benchmarking suite - since it is a rather
> generic framework with quite a few benchmarks integrated, the learning
> curve is relatively steep, but once you get the hang of it, it isn't
> difficult to use. Fsperf is IMO also fine if you are happy with the
> limited set of benchmarks it covers. This isn't the hard part - anyone
> can download these suites and run them.

Yeah, there seems to be no lack of projects implementing benchmarking
frameworks - I just wish we had something as widely used as
blktests/fstests for file system benchmarking purposes.

mmtests seems to be a great option for regression testing, so I'll have
a go at running it and adding tests to it.

> 
> The hard part of benchmarking is having sensible hardware to run the test
> on, selecting a benchmark and setup that's actually exercising the code
> you're interested in, and getting statistically significant results (i.e.,
> discerning random noise from real differences). And these are things that
> are difficult to share or solve by discussion.

I think designing good benchmarks is a really tricky part as well, but maybe
that's just me :)

> 
> As others wrote, one solution is for someone to dedicate the hardware
> and engineers with the know-how for this. But realistically I don't see
> that happening in the near term. There might be other ways to share more
> - like sharing a VM with a preconfigured set of benchmarks covering the
> basics (similar to how some people share VM images with preconfigured
> fstests runs). But I don't have a clear picture of how much such a thing
> would help.
> 
> 								Honza

Thanks for the feedback everyone!




* Re: [LSF/MM/BPF TOPIC] A common project for file system performance testing
  2026-02-12 13:42 [LSF/MM/BPF TOPIC] A common project for file system performance testing Hans Holmberg
  2026-02-12 14:31 ` Daniel Wagner
  2026-02-12 16:42 ` Johannes Thumshirn
@ 2026-02-18 15:31 ` Theodore Tso
  2026-02-20  8:59   ` Hans Holmberg
  2 siblings, 1 reply; 14+ messages in thread
From: Theodore Tso @ 2026-02-18 15:31 UTC (permalink / raw)
  To: Hans Holmberg
  Cc: lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
	Damien Le Moal, hch, Johannes Thumshirn, Naohiro Aota,
	josef@toxicpanda.com, jack@suse.com, Shinichiro Kawasaki

I think this is definitely an interesting topic.  One thing that I
think we should consider is requirements (or feature requests, if
we're talking about an existing code base) to make it easier to run
performance testing.

A) Separate out the building of the benchmarks from the running of
    said benchmarks.  My pattern is to build a test appliance which
    can be uploaded to the system under test (SUT) which has all of
    the necessary dependencies that might be used (e.g., fio, dbench,
    etc.) precompiled.  The SUT might be a VM, or a device where
    running a compiler is prohibited by security policy (e.g., a
    machine in a data center), or a device which doesn't have a
    compiler installed, and/or where running the compiler would be
    slow and painful (e.g., an Android device).

B) Separate out fetching the benchmark components from the building.
   This might be because an enterprise might have local changes, so
   they want to use a version of these tools from a local repo.  It
   also could be that security policy prohibits downloading software
   from the network in an automated process, and there is a
   requirement that any software to be built in the build environment
   has to be reviewed by one or more human beings.

C) A modular way of storing the results.  I like to run my file system
   tests in a VM, which is deleted as soon as the test run is
   completed.  This significantly reduces cost, since the VM is only
   paid for while a test is active.  But that means the results of
   performance runs should not be assumed to live on the local file
   system where the benchmarks are run; instead, the results should
   ideally be stored in some kind of flat file (a la JUnit and KUnit
   files) which can then be collated in some kind of centralized
   store.
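
To make point C concrete, here is one possible shape for such a flat
result record - a sketch only, with a hypothetical schema and made-up
metric names; a real project would have to agree on the fields:

```python
import json
import platform
import time

def result_record(suite, test, metrics, config):
    """Build one self-describing, flat result record (hypothetical schema)."""
    return {
        "suite": suite,
        "test": test,
        "timestamp": int(time.time()),
        "kernel": platform.release(),   # kernel the SUT booted
        "arch": platform.machine(),
        "config": config,               # fs, mount options, device class, ...
        "metrics": metrics,             # metric name -> value, units in the name
    }

# Example record for a hypothetical 4k random-write job.
rec = result_record(
    suite="fsperf-like",
    test="randwrite-4k",
    metrics={"iops": 183000, "lat_p99_us": 412.0},
    config={"fs": "xfs", "mount_opts": "noatime"},
)
line = json.dumps(rec)  # one JSON object per line; trivial to collate centrally
print(line)
```

Emitting one such line per test before the VM is torn down would let any
collector (a git repo, an object store, a database loader) aggregate runs
from different machines without caring where they ran.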

D) A standardized way of specifying the hardware configuration of the
   SUT.  This might include using VM's hosted at a hyperscaler because
   of the cost advantage, and because very often, the software-defined
   storage in cloud VM's doesn't necessarily act like traditional HDD's
   or flash devices.

I'll note that one of the concerns of running performance tests using
a VM is the noisy neighbor problem.  That is, what if the behavior of
other VM's on the host affects the performance of the test VM?  This
may vary depending on whether CPU or memory is subject to
overprovisioning (which may vary depending on the VM type).  There are
also VM types where all of the resources are dedicated to a single VM.

One thing that would be useful would be to have people running
benchmarks run the exact same configuration (kernel version,
benchmark software versions, etc.) multiple times at different times
on the same VM type, so the variability of the benchmark results can
be measured.
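
A minimal way to quantify that variability, with made-up throughput
numbers standing in for five identical runs on one VM type:

```python
from statistics import mean, stdev

def cv_percent(samples):
    """Coefficient of variation (run-to-run noise) as a percentage."""
    return 100.0 * stdev(samples) / mean(samples)

# Hypothetical throughput (MB/s) from five identical runs on one VM type.
runs = [101.2, 98.7, 103.5, 99.9, 100.4]
noise = cv_percent(runs)
print(f"run-to-run variation: {noise:.1f}%")
# A change smaller than a few multiples of this noise figure cannot be
# distinguished from the noisy-neighbor effect on that VM type.
```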

Yes, this is a bit more work, but the benefits of using VM's, where
you don't have to maintain hardware or deal with hard drive
failures, etc., mean that some people might find the cost/benefit
tradeoffs appealing.

Cheers,

					- Ted



* Re: [LSF/MM/BPF TOPIC] A common project for file system performance testing
  2026-02-18 15:31 ` Theodore Tso
@ 2026-02-20  8:59   ` Hans Holmberg
  2026-02-23 13:26     ` Johannes Thumshirn
  0 siblings, 1 reply; 14+ messages in thread
From: Hans Holmberg @ 2026-02-20  8:59 UTC (permalink / raw)
  To: Theodore Tso
  Cc: lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
	Damien Le Moal, hch, Johannes Thumshirn, Naohiro Aota,
	josef@toxicpanda.com, jack@suse.com, Shinichiro Kawasaki

On 18/02/2026 16:31, Theodore Tso wrote:
> I think this is definitely an interesting topic.  One thing that I
> think we should consider is requirements (or feature requests, if
> we're talking about an existing code base) to make it easier to run
> performance testing.
> 
> A) Separate out the building of the benchmarks from the running of
>     said benchmarks.  My pattern is to build a test appliance which
>     can be uploaded to the system under test (SUT) which has all of
>     the necessary dependencies that might be used (e.g., fio, dbench,
>     etc.) precompiled.  The SUT might be a VM, or a device where
>     running a compiler is prohibited by security policy (e.g., a
>     machine in a data center), or a device which doesn't have a
>     compiler installed, and/or where running the compiler would be
>     slow and painful (e.g., an Android device).
> 
> B) Separate out fetching the benchmark components from the building.
>    This might be because an enterprise might have local changes, so
>    they want to use a version of these tools from a local repo.  It
>    also could be that security policy prohibits downloading software
>    from the network in an automated process, and there is a
>    requirement that any software to be built in the build environment
>    has to be reviewed by one or more human beings.
> 
> C) A modular way of storing the results.  I like to run my file system
>    tests in a VM, which is deleted as soon as the test run is
>    completed.  This significantly reduces cost, since the VM is only
>    paid for while a test is active.  But that means the results of
>    performance runs should not be assumed to live on the local file
>    system where the benchmarks are run; instead, the results should
>    ideally be stored in some kind of flat file (a la JUnit and KUnit
>    files) which can then be collated in some kind of centralized
>    store.


Yeah, I think splitting things up into modules/parts would be really
beneficial.

Just like blktests and fstests mainly focus on providing good tests
(with well defined ways for starting test runs and providing results),
we could have a module or a project that just defines a bunch of useful
workloads.

Setting up benchmarking runs and analyzing & presenting results could
be done by other modules, as there'll be different preferences and needs
for those depending on use case, CI system, etc.


> 
> D) A standardized way of specifying the hardware configuration of the
>    SUT.  This might include using VM's hosted at a hyperscaler because
>    of the cost advantage, and because very often, the software-defined
>    storage in cloud VM's doesn't necessarily act like traditional HDD's
>    or flash devices.
> 
> I'll note that one of the concerns of running performance tests using
> a VM is the noisy neighbor problem.  That is, what if the behavior of
> other VM's on the host affects the performance of the test VM?  This
> may vary depending on whether CPU or memory is subject to
> overprovisioning (which may vary depending on the VM type).  There are
> also VM types where all of the resources are dedicated to a single VM.
> 
> One thing that would be useful would be to have people running
> benchmarks run the exact same configuration (kernel version,
> benchmark software versions, etc.) multiple times at different times
> on the same VM type, so the variability of the benchmark results can
> be measured.
> 
> Yes, this is a bit more work, but the benefits of using VM's, where
> you don't have to maintain hardware or deal with hard drive
> failures, etc., mean that some people might find the cost/benefit
> tradeoffs appealing.
> 
> Cheers,
> 
> 					- Ted
> 
> 



* Re: [LSF/MM/BPF TOPIC] A common project for file system performance testing
  2026-02-20  8:59   ` Hans Holmberg
@ 2026-02-23 13:26     ` Johannes Thumshirn
  0 siblings, 0 replies; 14+ messages in thread
From: Johannes Thumshirn @ 2026-02-23 13:26 UTC (permalink / raw)
  To: Hans Holmberg, Theodore Tso
  Cc: lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
	Damien Le Moal, hch, Naohiro Aota, josef@toxicpanda.com,
	jack@suse.com, Shinichiro Kawasaki

On 2/20/26 9:59 AM, Hans Holmberg wrote:
> Yeah, I think splitting things up in modules/parts would be really
> beneficial.
>
> Just like blktests and fstests mainly focus on providing good tests
> (with well defined ways for starting test runs and providing results),
> we could have a module or a project that just defines a bunch of useful
> workloads.
>
> Setting up benchmarking runs and analyzing & presenting results could
> be done by other modules as there'll be different preferences and needs
> for those, depending on use case, ci-system etc.

I know workload classification can be hard, but a set of fio scripts we
can all run would be a good starting point.
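
Something like the following fio job could be one entry in such a set -
a sketch only; the directory, size, runtime and queue depths are
placeholder values that would need tuning per target device:

```ini
; randwrite-4k.fio - one candidate workload definition (placeholder values)
[global]
directory=/mnt/test      ; mount point of the filesystem under test
size=4g
runtime=60
time_based=1
direct=1                 ; bypass the page cache to exercise the fs/block path
ioengine=io_uring
group_reporting=1

[randwrite-4k]
rw=randwrite
bs=4k
iodepth=32
numjobs=4
```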



end of thread, other threads:[~2026-02-23 13:26 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-02-12 13:42 [LSF/MM/BPF TOPIC] A common project for file system performance testing Hans Holmberg
2026-02-12 14:31 ` Daniel Wagner
2026-02-13 11:50   ` Shinichiro Kawasaki
2026-02-12 16:42 ` Johannes Thumshirn
2026-02-12 17:32   ` Josef Bacik
2026-02-12 17:37     ` [Lsf-pc] " Amir Goldstein
2026-02-12 19:03       ` Josef Bacik
2026-02-13  9:13         ` Hans Holmberg
2026-02-13  6:59       ` Johannes Thumshirn
2026-02-16 10:10       ` Jan Kara
2026-02-17  8:13         ` Hans Holmberg
2026-02-18 15:31 ` Theodore Tso
2026-02-20  8:59   ` Hans Holmberg
2026-02-23 13:26     ` Johannes Thumshirn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox