public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
* Large Sequential Reads Being Broken Up. Why?
@ 2006-01-30 15:20 Martin W. Schlining III
  2006-01-30 15:51 ` David.Egolf
  2006-01-30 16:17 ` James Smart
  0 siblings, 2 replies; 5+ messages in thread
From: Martin W. Schlining III @ 2006-01-30 15:20 UTC (permalink / raw)
  To: linux-scsi

I am running a program on my Linux box which is asking for 2M IO (reads 
and writes) with the file handle being opened with the O_DIRECT flag. 
However, the IO being put out on the wire is no larger than 512K.  My 
target device is the SCSI block device (/dev/sdb in this case). What is 
preventing me from getting large IO through the SCSI block layer? How 
can I fix it?

The sg device can achieve the 2M IO size, so I know it's at least 
possible. How can I improve the I/O size for the SCSI block layer?

Details:

Dell 2850 server with dual Xeons, 1G RAM
OS: Linux racerx 2.6.11.4-21.10-smp #1 SMP Tue Nov 29 14:32:49 UTC 2005 
x86_64 x86_64 x86_64 GNU/Linux
Emulex LP11000 Fibre Channel HBA using driver version 8.0.13 (changing 
the driver hasn't helped, so far)
I set the read-ahead value fairly large to improve read performance 
(hdparm -a)
The scheduler for this device is anticipatory.

Any ideas?

Thanks,
Martin Schlining





^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Large Sequential Reads Being Broken Up. Why?
  2006-01-30 15:20 Large Sequential Reads Being Broken Up. Why? Martin W. Schlining III
@ 2006-01-30 15:51 ` David.Egolf
  2006-01-30 16:17 ` James Smart
  1 sibling, 0 replies; 5+ messages in thread
From: David.Egolf @ 2006-01-30 15:51 UTC (permalink / raw)
  To: Martin W. Schlining III; +Cc: linux-scsi, linux-scsi-owner

Martin, 

I was just doing some informal performance testing from a Red Hat Linux box 
to two commercial disk arrays. We have an Emulex 8000 card on the box. 

I found that exceeding 256 blocks (i.e., 128K I/O) actually decreased 
performance. Not only was the elapsed time for a 200 MB file longer using 
sg_dd with 512MB, it consumed considerably more system time as reported 
by the 'time' command. 

When I investigated the configuration, I found that many distributions 
will limit you to 1M transfers by default. 

Certainly, the gains to be made by changing from 512K to 2MB are marginal 
and will probably add to your maintenance costs if you ever change 
anything in the I/O layer. 

-- David Egolf 

linux-scsi-owner@vger.kernel.org wrote on 01/30/2006 08:20:16 AM:

> I am running a program on my Linux box which is asking for 2M IO (reads 
> and writes) with the file handle being opened with the O_DIRECT flag. 
> However, the IO being put out on the wire is no larger than 512K.  My 
> target device is the SCSI block device (/dev/sdb in this case). What is 
> preventing me from getting large IO through the SCSI block layer? How 
> can I fix it?


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Large Sequential Reads Being Broken Up. Why?
  2006-01-30 15:20 Large Sequential Reads Being Broken Up. Why? Martin W. Schlining III
  2006-01-30 15:51 ` David.Egolf
@ 2006-01-30 16:17 ` James Smart
  2006-01-30 16:58   ` James Bottomley
  1 sibling, 1 reply; 5+ messages in thread
From: James Smart @ 2006-01-30 16:17 UTC (permalink / raw)
  To: Martin W. Schlining III; +Cc: linux-scsi

Here's one thing that will help you:
   in the lpfc driver (in rev 8.0.13 - the file is lpfc_fcp.c), in the
   scsi_driver_template structure, add the field:
       .max_sectors = 0xFFFF,

As of 2.6.10, the kernel started paying attention to this field, which the
Emulex driver, as of that time, didn't set. The result was that the kernel
dropped back to a default max_sectors of 1024, which results in a 512K max.
The lpfc driver was updated with this change in rev 8.0.29.

The caveat is: even with this change, you must be using O_DIRECT to get high
bandwidth. Otherwise, the upper layers will segment the requests (if I
remember right, we had a hard time making a "normal" config exceed 256k).

-- james s

Martin W. Schlining III wrote:
> I am running a program on my Linux box which is asking for 2M IO (reads 
> and writes) with the file handle being opened with the O_DIRECT flag. 
> However, the IO being put out on the wire is no larger than 512K.  My 
> target device is the SCSI block device (/dev/sdb in this case). What is 
> preventing me from getting large IO through the SCSI block layer? How 
> can I fix it?

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: Large Sequential Reads Being Broken Up. Why?
@ 2006-01-30 16:51 egoggin
  0 siblings, 0 replies; 5+ messages in thread
From: egoggin @ 2006-01-30 16:51 UTC (permalink / raw)
  To: James.Smart, mschlining; +Cc: linux-scsi

Isn't there also a chance that, given high enough memory contention, the
O_DIRECT user buffer will yield over 255 hardware segments for a 1MB or
greater I/O request?  While sg O_DIRECT would have the same issue, sg
indirect I/O would at least try to allocate multi-page segments.

-----Original Message-----
From: linux-scsi-owner@vger.kernel.org
[mailto:linux-scsi-owner@vger.kernel.org] On Behalf Of James Smart
Sent: Monday, January 30, 2006 11:17 AM
To: Martin W. Schlining III
Cc: linux-scsi@vger.kernel.org
Subject: Re: Large Sequential Reads Being Broken Up. Why?

Here's one thing that will help you:
   in the lpfc driver (in rev 8.0.13 - the file is lpfc_fcp.c), in the
   scsi_driver_template structure, add the field:
       .max_sectors = 0xFFFF,

As of 2.6.10, the kernel started paying attention to this field, which the
Emulex driver, as of that time, didn't set. The result was that the kernel
dropped back to a default max_sectors of 1024, which results in a 512K max.
The lpfc driver was updated with this change in rev 8.0.29.

The caveat is: even with this change, you must be using O_DIRECT to get high
bandwidth. Otherwise, the upper layers will segment the requests (if I
remember right, we had a hard time making a "normal" config exceed 256k).

-- james s

-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Large Sequential Reads Being Broken Up. Why?
  2006-01-30 16:17 ` James Smart
@ 2006-01-30 16:58   ` James Bottomley
  0 siblings, 0 replies; 5+ messages in thread
From: James Bottomley @ 2006-01-30 16:58 UTC (permalink / raw)
  To: James.Smart; +Cc: Martin W. Schlining III, linux-scsi

On Mon, 2006-01-30 at 11:17 -0500, James Smart wrote:
> As of 2.6.10, the kernel started paying attention to this field, which the
> Emulex driver, as of that time, didn't set. The result was that the kernel
> dropped back to a default max_sectors of 1024, which results in a 512K max.
> The lpfc driver was updated with this change in rev 8.0.29.
> 
> The caveat is: even with this change, you must be using O_DIRECT to get high
> bandwidth. Otherwise, the upper layers will segment the requests (if I
> remember right, we had a hard time making a "normal" config exceed 256k).

Actually, please also remember that the maximum SG element list size is
128 in a normal kernel (depending on the driver ... some drivers set
lower limits as well), so on a very fragmented 4k page machine, you're
unlikely to get above 512k just because you run out of SG table entries
(obviously on 16k page machines, this goes up to 2MB, and if you're
lucky enough to have a fully functional IOMMU, this limitation won't
affect you at all).

James



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2006-01-30 16:58 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-01-30 15:20 Large Sequential Reads Being Broken Up. Why? Martin W. Schlining III
2006-01-30 15:51 ` David.Egolf
2006-01-30 16:17 ` James Smart
2006-01-30 16:58   ` James Bottomley
  -- strict thread matches above, loose matches on Subject: below --
2006-01-30 16:51 egoggin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox