public inbox for linux-kernel@vger.kernel.org
* 2.6.9 reporting 1 Gigabyte/second throughput on bio's, timer skew possible?
@ 2005-11-11 22:58 Jeff V. Merkey
  2005-11-12  9:51 ` Jens Axboe
  0 siblings, 1 reply; 5+ messages in thread
From: Jeff V. Merkey @ 2005-11-11 22:58 UTC (permalink / raw)
  To: LKML


I am running one of our 3U appliances with dual 9500 Series 3Ware 
Controllers.  The unit is an online demo system, accessible to the 
public over the internet via SSH, for Solera Networks Linux appliance 
demos running the DSFS file system:

(ncurses)
demo.soleranetworks.com
Account:  demo
password:  demo

(text ncurses)
demo.soleranetworks.com
Account: demo-text
password: demo

I have allocated 393,216 bio buffers that I statically maintain in a 
chain, and I am running the DSFS file system with 3 gigabit links fully 
saturated.  Meta-data increases the write sizes to 720 MB/Second on 
dual 9500 controllers with 8 drives each (16 total 7200 RPM drives).  I 
am seeing some congestion and bursting on the bio chains as they are 
submitted.  I am not aware of anyone pushing 2.6 to these limits at 
present with this type of architecture.  I have split the kernel 
address space 3GB/1GB (3GB kernel, 1GB user) in order to create enough 
memory to run this file system with 2GB of cache.

DSFS dynamically generates html status files from within the file 
system.  When the system gets somewhat behind, I am seeing bursts > 1 
GB/Second, which exceeds the theoretical limit of the bus.  I have a 
timer function that runs every second and profiles the I/O throughput 
created by DSFS with bio submissions and captured packets.  I am asking 
whether there is clock skew at these data rates with use of the timer 
functions.  The system appears to be sustaining 1GB/Second throughput on 
dual controllers.  I have verified through data rates that the system is 
sustaining 800 megabytes/second with these 1GB/S bursts.  I am curious 
whether there is potentially timer skew at these higher rates, since I 
am having a hard time accepting that I can push 1GB/S through a bus 
rated at only 850 MB/S for DMA based transfers.  The unit is accessible 
by the general public; since it's a demo unit, we are unconcerned about 
folks getting on the system.  Folks are welcome to look, and if anyone 
has any thoughts on this, please let me know.  I am concerned that the 
timer functions are not always ending on second boundaries, which would 
explain the higher reported numbers.  Windows 2003 does not approach 
these performance numbers, BTW, so Linux appears to win on raw 
performance for vertical file system apps.

dsfs file system mounted at /var/ftp can be viewed:
ftp://demo.soleranetworks.com/

Stats pages generated from dsfs:

capture stats:
ftp://demo.soleranetworks.com/stats/capture.html
storage stats:
ftp://demo.soleranetworks.com/stats/storage.html
dsfs cache stats:
ftp://demo.soleranetworks.com/stats/cache.html
network interface stats:
ftp://demo.soleranetworks.com/stats/network.html
virtual network interface maps:
ftp://demo.soleranetworks.com/stats/virtual.html

Jeff


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: 2.6.9 reporting 1 Gigabyte/second throughput on bio's, timer skew possible?
  2005-11-11 22:58 2.6.9 reporting 1 Gigabyte/second throughput on bio's, timer skew possible? Jeff V. Merkey
@ 2005-11-12  9:51 ` Jens Axboe
  2005-11-12 10:51   ` Jeff V. Merkey
  0 siblings, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2005-11-12  9:51 UTC (permalink / raw)
  To: Jeff V. Merkey; +Cc: LKML

On Fri, Nov 11 2005, Jeff V. Merkey wrote:
> I have allocated 393,216 bio buffers I statically maintain in a chain 
> and am running the dsfs file system with 3 x gigabit links fully 
> saturated.  meta-data
> increases the write sizes to 720 MB/Second on dual 9500 controllers with 
> 8 drives each (total of 16) 7200 RPM Drives.  I am seeing some 
> congestion and bursting on the bio chains as they are submitted.  I am 
> not aware of anyone pushing 2.6 to these limits at present with this 
> type of architecture.  I have split

16 disks on 2 controllers - I'm 100% sure there are lots of people
pushing 2.6 much further than that! I wouldn't even call that a big
setup.

> DSFS dynamically generates html status files form within the file 
> system.  When the system gets somewhat behind, I am seeing bursts > 1 
> GB/Second which exceeds the theoretical limit of the bus.   I have a 
> timer function that runs every second and profiles the I/O throughput 
> created by DSFS with bio submissions and captured packets.  I am asking 
> if there is clock skew at these data rates with use of the timer 
> functions.  The system appears to be sustaining 1GB/Second throughput on 
> dual controllers.  I have verified through data rates the system is 
> sustaining 800 megabytes/second with these 1GB/S bursts.  I am curious 
> if there is potentially timer skew at these higher rates since I am 
> having a hard time accepting that I can push 1GB/S through a bus rated 
> at only 850 MB/S for DMA based transfers.   The unit is accessible by 

Note that the linux io stats accounting in 2.6.9 accounts queued io, not
io completions. So it's quite possible to have burst rates > bus speeds
for async io. 2.6.15-rc1 changes this.

-- 
Jens Axboe



* Re: 2.6.9 reporting 1 Gigabyte/second throughput on bio's, timer skew possible?
  2005-11-12  9:51 ` Jens Axboe
@ 2005-11-12 10:51   ` Jeff V. Merkey
  2005-11-13 19:36     ` Jens Axboe
  0 siblings, 1 reply; 5+ messages in thread
From: Jeff V. Merkey @ 2005-11-12 10:51 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Jeff V. Merkey, LKML

Jens Axboe wrote:

>On Fri, Nov 11 2005, Jeff V. Merkey wrote:
>>I have allocated 393,216 bio buffers I statically maintain in a chain 
>>and am running the dsfs file system with 3 x gigabit links fully 
>>saturated.  meta-data
>>increases the write sizes to 720 MB/Second on dual 9500 controllers with 
>>8 drives each (total of 16) 7200 RPM Drives.  I am seeing some 
>>congestion and bursting on the bio chains as they are submitted.  
>
>16 disks on 2 controllers - I'm 100% sure there are lots of people
>pushing 2.6 much further than that! I wouldn't even call that a big
>setup.

Probably not for this type of application.

>>DSFS dynamically generates html status files from within the file 
>>system.  When the system gets somewhat behind, I am seeing bursts > 1 
>>GB/Second which exceeds the theoretical limit of the bus.  I have a 
>>timer function that runs every second and profiles the I/O throughput 
>>created by DSFS with bio submissions and captured packets.  I am asking 
>>if there is clock skew at these data rates with use of the timer 
>>functions.  The system appears to be sustaining 1GB/Second throughput on 
>>dual controllers.  I have verified through data rates the system is 
>>sustaining 800 megabytes/second with these 1GB/S bursts.  I am curious 
>>if there is potentially timer skew at these higher rates since I am 
>>having a hard time accepting that I can push 1GB/S through a bus rated 
>>at only 850 MB/S for DMA based transfers.  The unit is accessible by 
>
>Note that the linux io stats accounting in 2.6.9 accounts queued io, not
>io completions. So it's quite possible to have burst rates > bus speeds
>for async io. 2.6.15-rc1 changes this.

Are you willing to log into the unit and validate these numbers? I 
would like someone other than me to confirm that I am seeing these rates.

Jeff




* Re: 2.6.9 reporting 1 Gigabyte/second throughput on bio's, timer skew possible?
  2005-11-12 10:51   ` Jeff V. Merkey
@ 2005-11-13 19:36     ` Jens Axboe
  2005-11-13 21:00       ` jmerkey
  0 siblings, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2005-11-13 19:36 UTC (permalink / raw)
  To: Jeff V. Merkey; +Cc: Jeff V. Merkey, LKML

On Sat, Nov 12 2005, Jeff V. Merkey wrote:
> Jens Axboe wrote:
> 
> >On Fri, Nov 11 2005, Jeff V. Merkey wrote:
> >>I have allocated 393,216 bio buffers I statically maintain in a chain 
> >>and am running the dsfs file system with 3 x gigabit links fully 
> >>saturated.  meta-data
> >>increases the write sizes to 720 MB/Second on dual 9500 controllers with 
> >>8 drives each (total of 16) 7200 RPM Drives.  I am seeing some 
> >>congestion and bursting on the bio chains as they are submitted.  
> >
> >16 disks on 2 controllers - I'm 100% sure there are lots of people
> >pushing 2.6 much further than that! I wouldn't even call that a big
> >setup.
> 
> Probably not for this type of application.
> 
> >>DSFS dynamically generates html status files from within the file 
> >>system.  When the system gets somewhat behind, I am seeing bursts > 1 
> >>GB/Second which exceeds the theoretical limit of the bus.  I have a 
> >>timer function that runs every second and profiles the I/O throughput 
> >>created by DSFS with bio submissions and captured packets.  I am asking 
> >>if there is clock skew at these data rates with use of the timer 
> >>functions.  The system appears to be sustaining 1GB/Second throughput on 
> >>dual controllers.  I have verified through data rates the system is 
> >>sustaining 800 megabytes/second with these 1GB/S bursts.  I am curious 
> >>if there is potentially timer skew at these higher rates since I am 
> >>having a hard time accepting that I can push 1GB/S through a bus rated 
> >>at only 850 MB/S for DMA based transfers.  The unit is accessible by 
> >
> >Note that the linux io stats accounting in 2.6.9 accounts queued io, not
> >io completions. So it's quite possible to have burst rates > bus speeds
> >for async io. 2.6.15-rc1 changes this.
> 
> Are you willing to log into the unit and validate these numbers? I 
> would like someone other than me to confirm that I am seeing these rates.

If you average the bandwidth over a time long enough to eliminate the
bursty queueing rates, your average rate should drop to what the
hardware can actually do. Or dig out the patch from 2.6.15-rc1 for
ll_rw_blk.c and apply it to 2.6.9; find it here:

http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=d72d904a5367ad4ca3f2c9a2ce8c3a68f0b28bf0;hp=d83c671fb7023f69a9582e622d01525054f23b66

-- 
Jens Axboe



* Re: 2.6.9 reporting 1 Gigabyte/second throughput on bio's, timer skew possible?
  2005-11-13 19:36     ` Jens Axboe
@ 2005-11-13 21:00       ` jmerkey
  0 siblings, 0 replies; 5+ messages in thread
From: jmerkey @ 2005-11-13 21:00 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Jeff V. Merkey, LKML

Jens Axboe wrote:

>On Sat, Nov 12 2005, Jeff V. Merkey wrote:
>>Jens Axboe wrote:
>>>On Fri, Nov 11 2005, Jeff V. Merkey wrote:
>>>>I have allocated 393,216 bio buffers I statically maintain in a chain 
>>>>and am running the dsfs file system with 3 x gigabit links fully 
>>>>saturated.  meta-data
>>>>increases the write sizes to 720 MB/Second on dual 9500 controllers with 
>>>>8 drives each (total of 16) 7200 RPM Drives.  I am seeing some 
>>>>congestion and bursting on the bio chains as they are submitted.  
>>>
>>>16 disks on 2 controllers - I'm 100% sure there are lots of people
>>>pushing 2.6 much further than that! I wouldn't even call that a big
>>>setup.
>>
>>Probably not for this type of application.
>>
>>>>DSFS dynamically generates html status files from within the file 
>>>>system.  When the system gets somewhat behind, I am seeing bursts > 1 
>>>>GB/Second which exceeds the theoretical limit of the bus.  I have a 
>>>>timer function that runs every second and profiles the I/O throughput 
>>>>created by DSFS with bio submissions and captured packets.  I am asking 
>>>>if there is clock skew at these data rates with use of the timer 
>>>>functions.  The system appears to be sustaining 1GB/Second throughput on 
>>>>dual controllers.  I have verified through data rates the system is 
>>>>sustaining 800 megabytes/second with these 1GB/S bursts.  I am curious 
>>>>if there is potentially timer skew at these higher rates since I am 
>>>>having a hard time accepting that I can push 1GB/S through a bus rated 
>>>>at only 850 MB/S for DMA based transfers.  The unit is accessible by 
>>>
>>>Note that the linux io stats accounting in 2.6.9 accounts queued io, not
>>>io completions. So it's quite possible to have burst rates > bus speeds
>>>for async io. 2.6.15-rc1 changes this.
>>
>>Are you willing to log into the unit and validate these numbers? I 
>>would like someone other than me to confirm that I am seeing these rates.
>
>If you average the bandwidth over a time long enough to eliminate the
>bursty queueing rates, your average rate should drop to what the
>hardware can actually do. Or dig out the patch from 2.6.15-rc1 for
>ll_rw_blk.c and apply it to 2.6.9; find it here:
>
>http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=d72d904a5367ad4ca3f2c9a2ce8c3a68f0b28bf0;hp=d83c671fb7023f69a9582e622d01525054f23b66
>
Jens,

Thanks. I'll dig out the patch. I am measuring the rates on the back 
end, and they are running at 720-800 MB/S, apart from what's being 
reported from the bio submission. At any rate, I have to say the bio 
performance is stunning in comparison to Windows 2003.

Jeff


