linux-block.vger.kernel.org archive mirror
* [LSF/MM/BPF TOPIC] Rust block layer abstractions and benchmark strategies
@ 2025-01-21 11:13 Andreas Hindborg
  2025-01-21 12:04 ` [Lsf-pc] " Jan Kara
  0 siblings, 1 reply; 6+ messages in thread
From: Andreas Hindborg @ 2025-01-21 11:13 UTC
  To: lsf-pc
  Cc: linux-block, Jens Axboe, Matthew Wilcox, Luis Chamberlain,
	Miguel Ojeda

Hi All,

I would like to propose that we have a session on Rust in the block
layer again this year. Specifically I would like to discuss some rather
puzzling results I observe when I benchmark the C and Rust null block
drivers. I did a write up of the challenges I face at [1]. The
observations are not tied to Rust, they also manifest in the C driver.

Being able to consistently benchmark performance of the null block
driver is rather important in terms of validating Rust for use in the
block layer. I would hope to be able to collect some feedback on these
issues during a session.

If time permits, I would like to give a status update on the efforts of
building a feature complete null block driver in Rust. I will send
additional patches that enable memory backed devices and timer
completions before the Summit. Most of the patches have been ready for a
while, but they are pending merge of dependencies (xarray, hrtimer,
module_params).

If anyone is interested, I would make myself available for deep dives on
the Rust block layer code base, 1:1 or tutorial style. We (the Rust
kernel hackers) have had some good experiences with these kinds of
sessions in other subsystems.


Best regards,
Andreas Hindborg


[1] https://metaspace.github.io/2024/12/02/problems-in-benchmark-land.html



* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Rust block layer abstractions and benchmark strategies
  2025-01-21 11:13 [LSF/MM/BPF TOPIC] Rust block layer abstractions and benchmark strategies Andreas Hindborg
@ 2025-01-21 12:04 ` Jan Kara
  2025-01-21 12:51   ` Andreas Hindborg
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Kara @ 2025-01-21 12:04 UTC
  To: Andreas Hindborg
  Cc: lsf-pc, linux-block, Jens Axboe, Matthew Wilcox, Luis Chamberlain,
	Miguel Ojeda

Hi!

On Tue 21-01-25 12:13:48, Andreas Hindborg via Lsf-pc wrote:
> I would like to propose that we have a session on Rust in the block
> layer again this year. Specifically I would like to discuss some rather
> puzzling results I observe when I benchmark the C and Rust null block
> drivers. I did a write up of the challenges I face at [1]. The
> observations are not tied to Rust, they also manifest in the C driver.

The results are indeed somewhat curious. One factor I didn't see addressed
in your blog is CPU scheduling. I've seen in the past cases where IO tasks
were getting migrated across cores leading to jumps in performance. Did you
try binding fio jobs to one CPU each?

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Rust block layer abstractions and benchmark strategies
  2025-01-21 12:04 ` [Lsf-pc] " Jan Kara
@ 2025-01-21 12:51   ` Andreas Hindborg
  2025-01-21 13:18     ` Jan Kara
  2025-01-22  9:52     ` Niklas Cassel
  0 siblings, 2 replies; 6+ messages in thread
From: Andreas Hindborg @ 2025-01-21 12:51 UTC
  To: Jan Kara
  Cc: lsf-pc, linux-block, Jens Axboe, Matthew Wilcox, Luis Chamberlain,
	Miguel Ojeda

Hi Jan,

"Jan Kara" <jack@suse.cz> writes:

> Hi!
>
> On Tue 21-01-25 12:13:48, Andreas Hindborg via Lsf-pc wrote:
>> I would like to propose that we have a session on Rust in the block
>> layer again this year. Specifically I would like to discuss some rather
>> puzzling results I observe when I benchmark the C and Rust null block
>> drivers. I did a write up of the challenges I face at [1]. The
>> observations are not tied to Rust, they also manifest in the C driver.
>
> The results are indeed somewhat curious. One factor I didn't see addressed
> in your blog is CPU scheduling. I've seen in the past cases where IO tasks
> were getting migrated across cores leading to jumps in performance. Did you
> try binding fio jobs to one CPU each?

Yes, I am pinning the io jobs to cores with fio options `cpus_allowed=0-<jobs>`
and `--cpus_allowed_policy=split` so I get 1 job per core.

The kernel is configured with PREEMPT_NONE=y.
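
A pinned invocation along these lines can be sketched as follows. This is
only a minimal sketch, not the exact command behind the blog post: the
device path `/dev/nullb0`, the job name, and the I/O parameters (io_uring,
4k random reads, 30 s runtime) are assumptions, and the CPU range is
written as 0 to jobs-1 so that the number of CPUs matches the number of
jobs. Only `cpus_allowed` and `cpus_allowed_policy=split` are taken from
the description above.

```python
import shlex


def fio_pinned_cmd(jobs, device="/dev/nullb0", runtime_s=30):
    """Build a fio command line that gives each job its own CPU.

    Assumptions (not from the thread): device path, job name, ioengine,
    block size, and runtime are illustrative placeholders.
    """
    if jobs < 1:
        raise ValueError("jobs must be >= 1")
    # CPU list covering exactly `jobs` cores, starting at core 0.
    cpus = "0" if jobs == 1 else f"0-{jobs - 1}"
    return [
        "fio",
        "--name=nullb-pinned",          # hypothetical job name
        f"--filename={device}",
        "--ioengine=io_uring",
        "--direct=1",
        "--rw=randread",
        "--bs=4k",
        f"--numjobs={jobs}",
        f"--cpus_allowed={cpus}",       # set of CPUs the jobs may use
        "--cpus_allowed_policy=split",  # hand each job a distinct CPU
        "--time_based",
        f"--runtime={runtime_s}",
        "--group_reporting",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(fio_pinned_cmd(4)))
```

With `split`, fio distributes the listed CPUs across the jobs rather than
letting every job float over the whole set, which is what gives one job
per core.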


Best regards,
Andreas Hindborg




* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Rust block layer abstractions and benchmark strategies
  2025-01-21 12:51   ` Andreas Hindborg
@ 2025-01-21 13:18     ` Jan Kara
  2025-01-22  9:52     ` Niklas Cassel
  1 sibling, 0 replies; 6+ messages in thread
From: Jan Kara @ 2025-01-21 13:18 UTC
  To: Andreas Hindborg
  Cc: Jan Kara, lsf-pc, linux-block, Jens Axboe, Matthew Wilcox,
	Luis Chamberlain, Miguel Ojeda

On Tue 21-01-25 13:51:11, Andreas Hindborg wrote:
> "Jan Kara" <jack@suse.cz> writes:
> > On Tue 21-01-25 12:13:48, Andreas Hindborg via Lsf-pc wrote:
> >> I would like to propose that we have a session on Rust in the block
> >> layer again this year. Specifically I would like to discuss some rather
> >> puzzling results I observe when I benchmark the C and Rust null block
> >> drivers. I did a write up of the challenges I face at [1]. The
> >> observations are not tied to Rust, they also manifest in the C driver.
> >
> > The results are indeed somewhat curious. One factor I didn't see addressed
> > in your blog is CPU scheduling. I've seen in the past cases where IO tasks
> > were getting migrated across cores leading to jumps in performance. Did you
> > try binding fio jobs to one CPU each?
> 
> Yes, I am pinning the io jobs to cores with fio options `cpus_allowed=0-<jobs>`
> and `--cpus_allowed_policy=split` so I get 1 job per core.
> 
> The kernel is configured with PREEMPT_NONE=y.

Ah, OK. In that case no great ideas from me. Since you've mentioned that
when you get to slow / fast case the performance tends to stay there,
perhaps you could use perf to profile the slow / fast case and see where's
the difference?
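
One way to set up such a comparison is sketched below. It is a sketch
under assumptions rather than a recipe: the output file names, the 10 s
sampling window, and system-wide sampling with call graphs are
illustrative choices, and the helper only builds the `perf record` and
`perf diff` command lines without running them.

```python
def perf_compare_cmds(duration_s=10,
                      fast="perf.fast.data",
                      slow="perf.slow.data"):
    """Return perf command lines for profiling a fast and a slow run.

    Assumptions (not from the thread): file names and sampling duration
    are hypothetical placeholders.
    """
    def record(out):
        # -a: sample all CPUs, -g: collect call graphs, -o: output file.
        # `sleep` bounds the sampling window while the benchmark runs.
        return ["perf", "record", "-a", "-g", "-o", out,
                "--", "sleep", str(duration_s)]

    return {
        "record_fast": record(fast),
        "record_slow": record(slow),
        # perf diff takes the baseline file first, then the file to compare.
        "diff": ["perf", "diff", fast, slow],
    }


if __name__ == "__main__":
    for name, cmd in perf_compare_cmds().items():
        print(name, " ".join(cmd))
```

The idea would be to run `record_fast` while the benchmark sits in the
fast state, `record_slow` while it sits in the slow state, and then look
at the `diff` output for symbols whose share of samples changes.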

									Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Rust block layer abstractions and benchmark strategies
  2025-01-21 12:51   ` Andreas Hindborg
  2025-01-21 13:18     ` Jan Kara
@ 2025-01-22  9:52     ` Niklas Cassel
  2025-01-23  8:56       ` Andreas Hindborg
  1 sibling, 1 reply; 6+ messages in thread
From: Niklas Cassel @ 2025-01-22  9:52 UTC
  To: Andreas Hindborg
  Cc: Jan Kara, lsf-pc, linux-block, Jens Axboe, Matthew Wilcox,
	Luis Chamberlain, Miguel Ojeda

Hello Andreas,

On Tue, Jan 21, 2025 at 01:51:11PM +0100, Andreas Hindborg wrote:
> Hi Jan,
> 
> "Jan Kara" <jack@suse.cz> writes:
> 
> > Hi!
> >
> > On Tue 21-01-25 12:13:48, Andreas Hindborg via Lsf-pc wrote:
> >> I would like to propose that we have a session on Rust in the block
> >> layer again this year. Specifically I would like to discuss some rather
> >> puzzling results I observe when I benchmark the C and Rust null block
> >> drivers. I did a write up of the challenges I face at [1]. The
> >> observations are not tied to Rust, they also manifest in the C driver.
> >
> > The results are indeed somewhat curious. One factor I didn't see addressed
> > in your blog is CPU scheduling. I've seen in the past cases where IO tasks
> > were getting migrated across cores leading to jumps in performance. Did you
> > try binding fio jobs to one CPU each?
> 
> Yes, I am pinning the io jobs to cores with fio options `cpus_allowed=0-<jobs>`
> and `--cpus_allowed_policy=split` so I get 1 job per core.
> 
> The kernel is configured with PREEMPT_NONE=y.

"I also cover a problem with the benchmark results that manifested during
testing for v6.12-rc2."

I assume that all the results on:
https://metaspace.github.io/2024/12/02/problems-in-benchmark-land.html

are with kernel v6.12-rc2 ?

It would be interesting to test an older kernel version, and see if it
is e.g. a scheduler bug.


You might also want to test with this series applied (which landed last
minute before v6.13 was tagged):
https://lore.kernel.org/lkml/20250119110410.GAZ4zcKkx5sCjD5XvH@fat_crate.local/T/#u


It fixes bugs that were introduced in v6.12-rc1 and v6.7-rc2 respectively.


Kind regards,
Niklas


* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Rust block layer abstractions and benchmark strategies
  2025-01-22  9:52     ` Niklas Cassel
@ 2025-01-23  8:56       ` Andreas Hindborg
  0 siblings, 0 replies; 6+ messages in thread
From: Andreas Hindborg @ 2025-01-23  8:56 UTC
  To: Niklas Cassel
  Cc: Jan Kara, lsf-pc, linux-block, Jens Axboe, Matthew Wilcox,
	Luis Chamberlain, Miguel Ojeda

"Niklas Cassel" <cassel@kernel.org> writes:

> Hello Andreas,
>
> On Tue, Jan 21, 2025 at 01:51:11PM +0100, Andreas Hindborg wrote:
>> Hi Jan,
>>
>> "Jan Kara" <jack@suse.cz> writes:
>>
>> > Hi!
>> >
>> > On Tue 21-01-25 12:13:48, Andreas Hindborg via Lsf-pc wrote:
>> >> I would like to propose that we have a session on Rust in the block
>> >> layer again this year. Specifically I would like to discuss some rather
>> >> puzzling results I observe when I benchmark the C and Rust null block
>> >> drivers. I did a write up of the challenges I face at [1]. The
>> >> observations are not tied to Rust, they also manifest in the C driver.
>> >
>> > The results are indeed somewhat curious. One factor I didn't see addressed
>> > in your blog is CPU scheduling. I've seen in the past cases where IO tasks
>> > were getting migrated across cores leading to jumps in performance. Did you
>> > try binding fio jobs to one CPU each?
>>
>> Yes, I am pinning the io jobs to cores with fio options `cpus_allowed=0-<jobs>`
>> and `--cpus_allowed_policy=split` so I get 1 job per core.
>>
>> The kernel is configured with PREEMPT_NONE=y.
>
> "I also cover a problem with the benchmark results that manifested during
> testing for v6.12-rc2."
>
> I assume that all the results on:
> https://metaspace.github.io/2024/12/02/problems-in-benchmark-land.html
>
> are with kernel v6.12-rc2 ?

Yes.

>
> It would be interesting to test an older kernel version, and see if it
> is e.g. a scheduler bug.

Yes, I did not do that. I should collect some more detailed data for
past kernels.

> You might also want to test with this series applied (which landed last
> minute before v6.13 was tagged):
> https://lore.kernel.org/lkml/20250119110410.GAZ4zcKkx5sCjD5XvH@fat_crate.local/T/#u
>
>
> It fixes bugs that were introduced in v6.12-rc1 and v6.7-rc2 respectively.
>

I'll try that, thanks.


Best regards,
Andreas Hindborg



