Linux Btrfs filesystem development
* BTRFS Performance page
@ 2008-10-21 22:20 Steven Pratt
  2008-10-22  0:14 ` Chris Mason
  2008-10-22 15:00 ` Steven Pratt
  0 siblings, 2 replies; 10+ messages in thread
From: Steven Pratt @ 2008-10-21 22:20 UTC (permalink / raw)
  To: linux-btrfs

As discussed on the BTRFS conference call, Kevin Corry and I have 
set up some test machines for performance testing on BTRFS.  The 
intent is to have a semi-permanent setup that we can use to test new 
features and code drops in BTRFS, as well as to compare against other 
file systems.  The systems are almost fully automated for execution, 
so we should be able to run large numbers of different benchmarks and 
keep up with Git changes.

The data is hosted at http://btrfs.boxacle.net/. So far we have the data 
for the single disk tests uploaded. We should be able to upload results 
from the larger RAID config tomorrow.

Initial tests were done with the FFSB benchmark, and we picked 5 common 
workloads: create, random read, sequential read, random write, and a 
mail server emulation.  We plan to expand this based on feedback to 
include more FFSB tests and/or other workloads.

All runs have complete analysis data with them (iostat, mpstat, 
oprofile, sar), as well as the FFSB profiles that can be used to 
recreate any test we ran. We have also collected blktrace data but have 
not uploaded it due to its size.

Please follow the results link at the bottom of the main page to get to 
the current results.  Let me know what you like or don't like.  I will 
post again when we get the RAID data uploaded.


Steve


* Re: BTRFS Performance page
  2008-10-21 22:20 BTRFS Performance page Steven Pratt
@ 2008-10-22  0:14 ` Chris Mason
  2008-10-22 13:53   ` Steven Pratt
  2008-10-22 15:00 ` Steven Pratt
  1 sibling, 1 reply; 10+ messages in thread
From: Chris Mason @ 2008-10-22  0:14 UTC (permalink / raw)
  To: Steven Pratt; +Cc: linux-btrfs

On Tue, Oct 21, 2008 at 05:20:03PM -0500, Steven Pratt wrote:
> As discussed on the BTRFS conference call, myself and Kevin Corry have  
> set up some test machines for the purpose of doing performance testing  
> on BTRFS.  The intent is to have a semi permanent setup that we can use  
> to test new features and code drops in BTRFS as well as to do  
> comparisons to other file systems.  The systems are pretty much fully  
> automated for execution, so we should be able to crank out large numbers  
> of different benchmarks as well as keep up with GIT changes.
>
> The data is hosted at http://btrfs.boxacle.net/. So far we have the data  
> for the single disk tests uploaded. We should be able to upload results  
> from the larger RAID config tomorrow.
>
> Initial tests were done with the FFSB benchmark and we picked 5 common  
> workloads; create, random and sequential read, random write, and a mail  
> server emulation.  We plan to expand this based on feedback to include  
> more FFSB tests and/or other workloads.
>
> All runs have complete analysis data with them (iostat, mpstat,  
> oprofile, sar), as well as the FFSB profiles that can be used to  
> recreate any test we ran. We also have collected blktrace data but not  
> uploaded due to size.
>
> Please follow the results link on the bottom of the main page to get to  
> the current results.  Let me know what you like or don't like.   I will  
> post again when we get the RAID data uploaded.

Very interesting data, thank you for posting this.  The first comment
I'll make is that -o nodatacow requires -o nodatasum.  The sums aren't
valid without the cow.
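
A minimal sketch of that pairing rule, for a test harness that builds
mount option strings.  The helper names are hypothetical and are not
part of btrfs or FFSB; only the option names `nodatacow` and
`nodatasum` come from the comment above.

```python
def normalize_btrfs_options(options):
    """Return a copy of the option list with nodatasum added
    whenever nodatacow is present (checksums are not valid
    without copy-on-write)."""
    opts = list(options)
    if "nodatacow" in opts and "nodatasum" not in opts:
        opts.append("nodatasum")
    return opts


def mount_option_string(options):
    """Join the normalized options into a string suitable for `mount -o`."""
    return ",".join(normalize_btrfs_options(options))


if __name__ == "__main__":
    print(mount_option_string(["nodatacow"]))
```

A harness that routes every option set through such a helper cannot
produce the nodatacow-with-checksums variation by accident.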

The FFSB mail server workload, does it do fsync writes?

For the sequential read workload, I'm guessing (hoping) the files are
created in parallel?

-chris


* Re: BTRFS Performance page
  2008-10-22  0:14 ` Chris Mason
@ 2008-10-22 13:53   ` Steven Pratt
  2008-10-22 14:05     ` Chris Mason
  0 siblings, 1 reply; 10+ messages in thread
From: Steven Pratt @ 2008-10-22 13:53 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-btrfs

Chris Mason wrote:
> On Tue, Oct 21, 2008 at 05:20:03PM -0500, Steven Pratt wrote:
>   
>> As discussed on the BTRFS conference call, myself and Kevin Corry have  
>> set up some test machines for the purpose of doing performance testing  
>> on BTRFS.  The intent is to have a semi permanent setup that we can use  
>> to test new features and code drops in BTRFS as well as to do  
>> comparisons to other file systems.  The systems are pretty much fully  
>> automated for execution, so we should be able to crank out large numbers  
>> of different benchmarks as well as keep up with GIT changes.
>>
>> The data is hosted at http://btrfs.boxacle.net/. So far we have the data  
>> for the single disk tests uploaded. We should be able to upload results  
>> from the larger RAID config tomorrow.
>>
>> Initial tests were done with the FFSB benchmark and we picked 5 common  
>> workloads; create, random and sequential read, random write, and a mail  
>> server emulation.  We plan to expand this based on feedback to include  
>> more FFSB tests and/or other workloads.
>>
>> All runs have complete analysis data with them (iostat, mpstat,  
>> oprofile, sar), as well as the FFSB profiles that can be used to  
>> recreate any test we ran. We also have collected blktrace data but not  
>> uploaded due to size.
>>
>> Please follow the results link on the bottom of the main page to get to  
>> the current results.  Let me know what you like or don't like.   I will  
>> post again when we get the RAID data uploaded.
>>     
>
> Very interesting data, thank you for posting this.  The first comment
> I'll make is that -o nodatacow requires -o nodatasum.  The sums aren't
> valid without the cow.
>   
Thought that might be the case.  Ok, we will drop this variation.

> The FFSB mail server workload, does it do fsync writes?
>   
No, but we have the ability to add that if we choose.

> For the sequential read workload, I'm guessing (hoping) the files are
> created in parallel?
>   
Sorry, setup is still single threaded.

Steve

> -chris
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>   



* Re: BTRFS Performance page
  2008-10-22 13:53   ` Steven Pratt
@ 2008-10-22 14:05     ` Chris Mason
  0 siblings, 0 replies; 10+ messages in thread
From: Chris Mason @ 2008-10-22 14:05 UTC (permalink / raw)
  To: Steven Pratt; +Cc: linux-btrfs

On Wed, 2008-10-22 at 08:53 -0500, Steven Pratt wrote:
> Chris Mason wrote:
> > On Tue, Oct 21, 2008 at 05:20:03PM -0500, Steven Pratt wrote:
> >   
> >> As discussed on the BTRFS conference call, myself and Kevin Corry have  
> >> set up some test machines for the purpose of doing performance testing  
> >> on BTRFS.  The intent is to have a semi permanent setup that we can use  
> >> to test new features and code drops in BTRFS as well as to do  
> >> comparisons to other file systems.  The systems are pretty much fully  
> >> automated for execution, so we should be able to crank out large numbers  
> >> of different benchmarks as well as keep up with GIT changes.
> >>
> >> The data is hosted at http://btrfs.boxacle.net/. So far we have the data  
> >> for the single disk tests uploaded. We should be able to upload results  
> >> from the larger RAID config tomorrow.
> >>
> >> Initial tests were done with the FFSB benchmark and we picked 5 common  
> >> workloads; create, random and sequential read, random write, and a mail  
> >> server emulation.  We plan to expand this based on feedback to include  
> >> more FFSB tests and/or other workloads.
> >>
> >> All runs have complete analysis data with them (iostat, mpstat,  
> >> oprofile, sar), as well as the FFSB profiles that can be used to  
> >> recreate any test we ran. We also have collected blktrace data but not  
> >> uploaded due to size.
> >>
> >> Please follow the results link on the bottom of the main page to get to  
> >> the current results.  Let me know what you like or don't like.   I will  
> >> post again when we get the RAID data uploaded.
> >>     
> >
> > Very interesting data, thank you for posting this.  The first comment
> > I'll make is that -o nodatacow requires -o nodatasum.  The sums aren't
> > valid without the cow.
> >   
> Thought that might be the case.  Ok, we will drop this variation.
> 
> > The FFSB mail server workload, does it do fsync writes?
> >   
> No, but we have the ability to add that if we choose.
> 
I'd be interested in it at least.

> > For the sequential read workload, I'm guessing (hoping) the files are
> > created in parallel?
> >   
> Sorry, setup is still single threaded.

Ok, I'll try to reproduce these results.  Thanks.

-chris




* Re: BTRFS Performance page
  2008-10-21 22:20 BTRFS Performance page Steven Pratt
  2008-10-22  0:14 ` Chris Mason
@ 2008-10-22 15:00 ` Steven Pratt
  2008-10-22 15:19   ` Chris Mason
  2008-10-22 19:25   ` Paul P Komkoff Jr
  1 sibling, 2 replies; 10+ messages in thread
From: Steven Pratt @ 2008-10-22 15:00 UTC (permalink / raw)
  To: linux-btrfs

Steven Pratt wrote:
> As discussed on the BTRFS conference call, myself and Kevin Corry have 
> set up some test machines for the purpose of doing performance testing 
> on BTRFS.  The intent is to have a semi permanent setup that we can 
> use to test new features and code drops in BTRFS as well as to do 
> comparisons to other file systems.  The systems are pretty much fully 
> automated for execution, so we should be able to crank out large 
> numbers of different benchmarks as well as keep up with GIT changes.
>
> The data is hosted at http://btrfs.boxacle.net/. So far we have the 
> data for the single disk tests uploaded. We should be able to upload 
> results from the larger RAID config tomorrow.
>
> Initial tests were done with the FFSB benchmark and we picked 5 common 
> workloads; create, random and sequential read, random write, and a 
> mail server emulation.  We plan to expand this based on feedback to 
> include more FFSB tests and/or other workloads.
>
> All runs have complete analysis data with them (iostat, mpstat, 
> oprofile, sar), as well as the FFSB profiles that can be used to 
> recreate any test we ran. We also have collected blktrace data but not 
> uploaded due to size.
>
> Please follow the results link on the bottom of the main page to get 
> to the current results.  Let me know what you like or don't like.   I 
> will post again when we get the RAID data uploaded.
RAID data is now uploaded.  The config used is 136 15k rpm fiber disks 
in 8 arrays all striped together with DM.  These results are not as 
favorable to BTRFS, as there seem to be some major issues with random 
write and mail server workloads.

http://btrfs.boxacle.net/repository/raid/Initial-compare/Initial-Compare-RAID0.html

Steve

>
>
> Steve



* Re: BTRFS Performance page
  2008-10-22 15:00 ` Steven Pratt
@ 2008-10-22 15:19   ` Chris Mason
  2008-10-22 15:45     ` Steven Pratt
  2008-10-22 19:25   ` Paul P Komkoff Jr
  1 sibling, 1 reply; 10+ messages in thread
From: Chris Mason @ 2008-10-22 15:19 UTC (permalink / raw)
  To: Steven Pratt; +Cc: linux-btrfs

On Wed, 2008-10-22 at 10:00 -0500, Steven Pratt wrote:
> Steven Pratt wrote:
> > As discussed on the BTRFS conference call, myself and Kevin Corry have 
> > set up some test machines for the purpose of doing performance testing 
> > on BTRFS.  The intent is to have a semi permanent setup that we can 
> > use to test new features and code drops in BTRFS as well as to do 
> > comparisons to other file systems.  The systems are pretty much fully 
> > automated for execution, so we should be able to crank out large 
> > numbers of different benchmarks as well as keep up with GIT changes.
> >
> > The data is hosted at http://btrfs.boxacle.net/. So far we have the 
> > data for the single disk tests uploaded. We should be able to upload 
> > results from the larger RAID config tomorrow.
> >
> > Initial tests were done with the FFSB benchmark and we picked 5 common 
> > workloads; create, random and sequential read, random write, and a 
> > mail server emulation.  We plan to expand this based on feedback to 
> > include more FFSB tests and/or other workloads.
> >
> > All runs have complete analysis data with them (iostat, mpstat, 
> > oprofile, sar), as well as the FFSB profiles that can be used to 
> > recreate any test we ran. We also have collected blktrace data but not 
> > uploaded due to size.
> >

I'll try to reproduce things here, but I might end up asking for some of
the blktrace data.

> > Please follow the results link on the bottom of the main page to get 
> > to the current results.  Let me know what you like or don't like.   I 
> > will post again when we get the RAID data uploaded.
> RAID data is now uploaded.  The config used is 136 15k rpm fiber disks 
> in 8 arrays all striped together with DM.  These results are not as 
> favorable to BTRFS, as there seem to be some major issues with random 
> write and mail server workloads.
> 
> http://btrfs.boxacle.net/repository/raid/Initial-compare/Initial-Compare-RAID0.html
> 

I need to look harder at the mail server workload; my initial guess is
that I'm doing too much metadata readahead in these effectively random
operations.

If I'm reading the config correctly, the random write workload does
this:

1) create a file sequentially
2) do buffered random writes to the file

Since buffered writeback happens via pdflush, the IO isn't actually as
random as you would expect.  Pages are written in file offset order,
which actually corresponds to disk order.

When btrfs is doing COW, file offset order maps to random order on disk,
leading to much lower tput.  The nocow results should be better than
they are, and I'll see what I can do about the cow results too.
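
The ordering effect described above can be illustrated with a toy
model.  This is only a sketch under stated assumptions: it is not
pdflush or btrfs code, the page counts and the 5% dirty fraction are
made-up parameters, and it models an in-place (non-COW) file whose
disk order equals its file offset order.

```python
import random


def backward_seeks(offsets):
    """Count steps where the next page lies before the previous one."""
    return sum(1 for a, b in zip(offsets, offsets[1:]) if b < a)


def adjacent_pairs(offsets):
    """Count steps where the next page directly follows the previous one."""
    return sum(1 for a, b in zip(offsets, offsets[1:]) if b == a + 1)


def simulate(file_pages=100_000, dirty_fraction=0.05, seed=0):
    """Dirty random pages, then write them back in file offset order."""
    rng = random.Random(seed)
    dirty_count = int(file_pages * dirty_fraction)
    # Order in which the application dirties pages (random).
    issue_order = rng.sample(range(file_pages), dirty_count)
    # Order in which writeback submits them (sorted by file offset).
    writeback_order = sorted(issue_order)
    return {
        "app_backward_seeks": backward_seeks(issue_order),
        "writeback_backward_seeks": backward_seeks(writeback_order),
        "mergeable_fraction": adjacent_pairs(writeback_order) / dirty_count,
    }


if __name__ == "__main__":
    print(simulate())
```

In this model the sorted writeback never moves backwards on disk, even
though the application's write pattern is random, while only a small
fraction of the dirty pages end up adjacent enough to merge.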

-chris




* Re: BTRFS Performance page
  2008-10-22 15:19   ` Chris Mason
@ 2008-10-22 15:45     ` Steven Pratt
  2008-10-22 15:55       ` Chris Mason
  0 siblings, 1 reply; 10+ messages in thread
From: Steven Pratt @ 2008-10-22 15:45 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-btrfs

Chris Mason wrote:
> On Wed, 2008-10-22 at 10:00 -0500, Steven Pratt wrote:
>   
>> Steven Pratt wrote:
>>     
>>> As discussed on the BTRFS conference call, myself and Kevin Corry have 
>>> set up some test machines for the purpose of doing performance testing 
>>> on BTRFS.  The intent is to have a semi permanent setup that we can 
>>> use to test new features and code drops in BTRFS as well as to do 
>>> comparisons to other file systems.  The systems are pretty much fully 
>>> automated for execution, so we should be able to crank out large 
>>> numbers of different benchmarks as well as keep up with GIT changes.
>>>
>>> The data is hosted at http://btrfs.boxacle.net/. So far we have the 
>>> data for the single disk tests uploaded. We should be able to upload 
>>> results from the larger RAID config tomorrow.
>>>
>>> Initial tests were done with the FFSB benchmark and we picked 5 common 
>>> workloads; create, random and sequential read, random write, and a 
>>> mail server emulation.  We plan to expand this based on feedback to 
>>> include more FFSB tests and/or other workloads.
>>>
>>> All runs have complete analysis data with them (iostat, mpstat, 
>>> oprofile, sar), as well as the FFSB profiles that can be used to 
>>> recreate any test we ran. We also have collected blktrace data but not 
>>> uploaded due to size.
>>>
>>>       
>
> I'll try to reproduce things here, but I might end up asking for some of
> the blktrace data.
>
>   
Sure, not a problem for select workloads, but it was just too much data 
to upload for every run.  Just let me know which ones you need.

>>> Please follow the results link on the bottom of the main page to get 
>>> to the current results.  Let me know what you like or don't like.   I 
>>> will post again when we get the RAID data uploaded.
>>>       
>> RAID data is now uploaded.  The config used is 136 15k rpm fiber disks 
>> in 8 arrays all striped together with DM.  These results are not as 
>> favorable to BTRFS, as there seem to be some major issues with random 
>> write and mail server workloads.
>>
>> http://btrfs.boxacle.net/repository/raid/Initial-compare/Initial-Compare-RAID0.html
>>
>>     
>
> I need to look harder at the mail server workload, my initial guess is
> that I'm doing too much metadata readahead in these effectively random
> operations.
>
> If I'm reading the config correctly, the random write workload does
> this:
>
> 1) create a file sequentially
> 2) do buffered random writes to the file
>   
Correct, although there are multiple files (created serially) and 
multiple threads writing to different files at the same time.  We also 
only write 5% of a file before moving on to a new file.  So while there 
can be some ordering, the merging should be minimal.  In fact, we see 
that from iostat (this is from the 16-thread ext3 run):

sdf               0.00   177.05   14.17 3665.07    56.69 15374.05     8.39    67.88   18.48   0.24  86.53

177 merges out of 3665 ios, with an average request size of 4.2k.
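
The arithmetic behind those two figures can be sketched as follows.
The column layout is an assumption (extended iostat output in
kilobytes, as in `iostat -x -k`), since the header row is not quoted
here; it is consistent with the numbers, because avgrq-sz in 512-byte
sectors matches the kB/s totals divided by the request rate.

```python
# Assumed column names; the header row is not shown in the message.
IOSTAT_COLUMNS = [
    "device", "rrqm_s", "wrqm_s", "r_s", "w_s", "rkB_s", "wkB_s",
    "avgrq_sz", "avgqu_sz", "await", "svctm", "util",
]

LINE = ("sdf               0.00   177.05   14.17 3665.07    56.69 "
        "15374.05     8.39    67.88   18.48   0.24  86.53")


def parse_iostat_line(line):
    """Split one device line into a dict keyed by the assumed columns."""
    fields = line.split()
    row = {"device": fields[0]}
    row.update(zip(IOSTAT_COLUMNS[1:], (float(f) for f in fields[1:])))
    return row


def summarize(row):
    """Compute the write merge ratio and the average request size in KiB."""
    requests = row["r_s"] + row["w_s"]
    return {
        "write_merge_ratio": row["wrqm_s"] / row["w_s"],
        "avg_request_kib": (row["rkB_s"] + row["wkB_s"]) / requests,
        # avgrq-sz is reported in 512-byte sectors.
        "avgrq_sz_kib": row["avgrq_sz"] * 512 / 1024,
    }


if __name__ == "__main__":
    print(summarize(parse_iostat_line(LINE)))
```

Under that assumption, write merges amount to roughly 5% of the write
requests issued per second, and both ways of computing the request
size agree on about 4.2 KiB.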



> Since buffered writeback happens via pdflush, the IO isn't actually as
> random as you would expect.  Pages are written in file offset order,
> which actually corresponds to disk order.
>
>   
Right, there will be a fair amount of locality to the random writes.

> When btrfs is doing COW, file offset order maps to random order on disk,
> leading to much lower tput.  The nocow results should be better than
> they are, and I'll see what I can do about the cow results too.
>
>   
Not sure I understand this point, doesn't the COW code allocate new 
space sequentially?

Steve

> -chris
>
>
>   



* Re: BTRFS Performance page
  2008-10-22 15:45     ` Steven Pratt
@ 2008-10-22 15:55       ` Chris Mason
  0 siblings, 0 replies; 10+ messages in thread
From: Chris Mason @ 2008-10-22 15:55 UTC (permalink / raw)
  To: Steven Pratt; +Cc: linux-btrfs

On Wed, 2008-10-22 at 10:45 -0500, Steven Pratt wrote:
> Chris Mason wrote:
> > On Wed, 2008-10-22 at 10:00 -0500, Steven Pratt wrote:
> >   
> >> Steven Pratt wrote:
> >>     
> >>> As discussed on the BTRFS conference call, myself and Kevin Corry have 
> >>> set up some test machines for the purpose of doing performance testing 
> >>> on BTRFS.  The intent is to have a semi permanent setup that we can 
> >>> use to test new features and code drops in BTRFS as well as to do 
> >>> comparisons to other file systems.  The systems are pretty much fully 
> >>> automated for execution, so we should be able to crank out large 
> >>> numbers of different benchmarks as well as keep up with GIT changes.
> >>>
> >>> The data is hosted at http://btrfs.boxacle.net/. So far we have the 
> >>> data for the single disk tests uploaded. We should be able to upload 
> >>> results from the larger RAID config tomorrow.
> >>>
> >>> Initial tests were done with the FFSB benchmark and we picked 5 common 
> >>> workloads; create, random and sequential read, random write, and a 
> >>> mail server emulation.  We plan to expand this based on feedback to 
> >>> include more FFSB tests and/or other workloads.
> >>>
> >>> All runs have complete analysis data with them (iostat, mpstat, 
> >>> oprofile, sar), as well as the FFSB profiles that can be used to 
> >>> recreate any test we ran. We also have collected blktrace data but not 
> >>> uploaded due to size.
> >>>
> >>>       
> >
> > I'll try to reproduce things here, but I might end up asking for some of
> > the blktrace data.
> >
> >   
> Sure, not a problem for select workloads, but it was just too much data 
> to upload for every run.  Just let me know which ones you need.
> 

Hopefully I'll get similar results to yours; I'll give it a shot later this week.

> > If I'm reading the config correctly, the random write workload does
> > this:
> >
> > 1) create a file sequentially
> > 2) do buffered random writes to the file
> >   
> Correct, although there are multiple files (created serially) and 
> multiple threads writing to different files at the same time.  We also 
> only write 5% of a file before moving on to a new file.  So while there 
> can be some ordering, the merging should be minimal.  In fact, we see 
> that from iostat (this is from the 16-thread ext3 run):
> 
> sdf               0.00   177.05   14.17 3665.07    56.69 15374.05     8.39    67.88   18.48   0.24  86.53
> 
> 177 merges out of 3665 ios, with an average request size of 4.2k.

> 
> 
> > Since buffered writeback happens via pdflush, the IO isn't actually as
> > random as you would expect.  Pages are written in file offset order,
> > which actually corresponds to disk order.
> >
> >   
> Right, there will be a fair amount of locality to the random writes.
> 
> > When btrfs is doing COW, file offset order maps to random order on disk,
> > leading to much lower tput.  The nocow results should be better than
> > they are, and I'll see what I can do about the cow results too.
> >
> >   
> Not sure I understand this point, doesn't the COW code allocate new 
> space sequentially?

Yes, COW allocates new space sequentially and via delayed allocation,
and based on the config the extents should be about 5MB in size.  But
based on the numbers, we're getting something much more random.  pdflush
is really tricky here, and when it does the wrong thing the COW mode
will show it most.

I'd be curious to see the difference in performance between this run and
the same run with 5MB O_SYNC (or O_DIRECT) writes.  Btrfs can do both;
the O_DIRECT write just does the normal page cache write plus an
invalidate.
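
A rough sketch of what 5MB O_SYNC writes look like from user space,
assuming a Unix system.  The function name and defaults are
hypothetical; this is not an FFSB profile, and it covers only O_SYNC,
because O_DIRECT additionally needs aligned buffers.

```python
import os
import tempfile

CHUNK_SIZE = 5 * 1024 * 1024  # 5 MiB per write, as suggested above


def write_sync_chunks(path, chunks, chunk_size=CHUNK_SIZE):
    """Write `chunks` buffers of `chunk_size` bytes with O_SYNC.

    Each write returns only after the data has reached stable storage.
    Returns the total number of bytes written.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o644)
    total = 0
    try:
        buf = b"\0" * chunk_size
        for _ in range(chunks):
            view = memoryview(buf)
            # os.write may write fewer bytes than requested; loop until done.
            while len(view) > 0:
                written = os.write(fd, view)
                view = view[written:]
                total += written
    finally:
        os.close(fd)
    return total


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "osync_test.bin")
    print(write_sync_chunks(target, chunks=4))
```

Because every chunk is forced out as one large synchronous write, a
run like this takes the background writeback ordering out of the
picture, which is what makes the comparison interesting.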

-chris




* Re: BTRFS Performance page
  2008-10-22 15:00 ` Steven Pratt
  2008-10-22 15:19   ` Chris Mason
@ 2008-10-22 19:25   ` Paul P Komkoff Jr
  2008-10-22 19:46     ` Steven Pratt
  1 sibling, 1 reply; 10+ messages in thread
From: Paul P Komkoff Jr @ 2008-10-22 19:25 UTC (permalink / raw)
  To: Steven Pratt; +Cc: linux-btrfs

Replying to Steven Pratt:
> Steven Pratt wrote:
> RAID data is now uploaded.  The config used is 136 15k rpm fiber disks  
> in 8 arrays all striped together with DM.  These results are not as  
> favorable to BTRFS, as there seem to be some major issues with random  
> write and mail server workloads.

Why not use btrfs' own RAID capabilities instead?
Honestly, I will never use md again once I get btrfs working
:)

> http://btrfs.boxacle.net/repository/raid/Initial-compare/Initial-Compare-RAID0.html

-- 
Paul P 'Stingray' Komkoff Jr // http://stingr.net/key <- my pgp key
 This message represents the official view of the voices in my head


* Re: BTRFS Performance page
  2008-10-22 19:25   ` Paul P Komkoff Jr
@ 2008-10-22 19:46     ` Steven Pratt
  0 siblings, 0 replies; 10+ messages in thread
From: Steven Pratt @ 2008-10-22 19:46 UTC (permalink / raw)
  To: Paul P Komkoff Jr; +Cc: linux-btrfs

Paul P Komkoff Jr wrote:
> Replying to Steven Pratt:
>   
>> Steven Pratt wrote:
>> RAID data is now uploaded.  The config used is 136 15k rpm fiber disks  
>> in 8 arrays all striped together with DM.  These results are not as  
>> favorable to BTRFS, as there seem to be some major issues with random  
>> write and mail server workloads.
>>     
>
> Why not use btrfs' own RAID capabilities instead?
> Honestly, I will never use md again once I get btrfs working
> :)
>
>   
That is on the list of things to try.  The main reason was that we 
wanted to be able to compare against other file systems, and they lack 
that feature.

Steve

>> http://btrfs.boxacle.net/repository/raid/Initial-compare/Initial-Compare-RAID0.html
>>     
>
>   



Thread overview: 10+ messages
2008-10-21 22:20 BTRFS Performance page Steven Pratt
2008-10-22  0:14 ` Chris Mason
2008-10-22 13:53   ` Steven Pratt
2008-10-22 14:05     ` Chris Mason
2008-10-22 15:00 ` Steven Pratt
2008-10-22 15:19   ` Chris Mason
2008-10-22 15:45     ` Steven Pratt
2008-10-22 15:55       ` Chris Mason
2008-10-22 19:25   ` Paul P Komkoff Jr
2008-10-22 19:46     ` Steven Pratt
