From: Boaz Harrosh <bharrosh@panasas.com>
To: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: bhalevy@panasas.com, James.Bottomley@SteelEye.com,
	jens.axboe@oracle.com, linux-scsi@vger.kernel.org
Subject: Re: [PATCHSET 0/5] Peaceful co-existence of scsi_sgtable and Large IO sg-chaining
Date: Tue, 31 Jul 2007 23:12:26 +0300	[thread overview]
Message-ID: <46AF97AA.5000908@panasas.com> (raw)
In-Reply-To: <46A7A2EC.6040400@panasas.com>

[-- Attachment #1: Type: text/plain, Size: 4383 bytes --]

Boaz Harrosh wrote:
> FUJITA Tomonori wrote:
>> From: Benny Halevy <bhalevy@panasas.com>
>> Subject: Re: [PATCHSET 0/5] Peaceful co-existence of scsi_sgtable and Large IO sg-chaining
>> Date: Wed, 25 Jul 2007 11:26:44 +0300
>>
>>>> However, I'm perfectly happy to go with whatever the empirical evidence
>>>> says is best .. and hopefully, now we don't have to pick this once and
>>>> for all time ... we can alter it if whatever is chosen proves to be
>>>> suboptimal.
>>> I agree.  This isn't a catholic marriage :)
>>> We'll run some performance experiments comparing the sgtable chaining
>>> implementation vs. a scsi_data_buff implementation pointing
>>> at a possibly chained sglist and let's see if we can measure
>>> any difference.  We'll send results as soon as we have them.
>> I did some tests with your sgtable patchset and the approach to use
>> separate buffer for sglists. As expected, there was no performance
>> difference with small I/Os. I've not tried very large I/Os, which
>> might give some difference.
>>
> 
> Next week I will try to mount lots of scsi_debug devices and
> use large parallel IO to try and find a difference. I will
> test Jens's sglist-arch tree against above sglist-arch+scsi_sgtable
> 


I was able to run some tests; here are my results.

The results:
PPT = Pages Per Transfer (sg_count)

The numbers are the accumulated time of 20 transfers of 32GB each,
averaged over 4 such runs. (Lower time is better.)
Transfers are done with sg_dd into a scsi_debug device.

Kernel         | total time (s), 128-PPT | total time (s), 2048-PPT
---------------|-------------------------|--------------------------
sglist-arch    |          47.26          | Test Failed
scsi_data_buff |          41.68          | 35.05
scsi_sgtable   |          42.42          | 36.45
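To put these deltas in scale, here is my back-of-envelope arithmetic
(not a new measurement, and it assumes the totals above are in
seconds): at 128 PPT each command carries 128 * 4K = 512K, so the
accumulated 20 x 32GB works out to about 1.3M commands:

  # rough per-command overhead for the 128-PPT column
  cmds=$((20 * 32 * 1024 * 1024 / 512))  # 20 x 32GB in 512K commands = 1310720
  for k in "sglist-arch 47.26" "scsi_data_buff 41.68" "scsi_sgtable 42.42"; do
    set -- $k
    echo "$1: $(echo "scale=1; $2 * 1000000 / $cmds" | bc) usec per command"
  done

That is roughly 32-36 usec of request queuing plus sglist
allocation/free per command, with sglist-arch about 13% behind the
other two.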


The test:
1. scsi_debug
  I loaded the scsi_debug module, which was converted and fixed for
  chaining, with the following options:
  $ modprobe scsi_debug virtual_gb=32 delay=0 dev_size_mb=32 fake_rw=1

  This gives a 32GB virtual drive backed by 32MB of memory, with zero
  delay, and with fake_rw=1 reads and writes do nothing.
  After that I just enabled chained IO on the device (see the snippet
  after this list).

  So what I'm actually testing is only sg + scsi-ml request queuing
  and sglist allocation/deallocation, which is exactly what I want
  to test.

2. sg_dd
  In the test script (see prof_test_scsi_debug attached)
  I use sg_dd in direct-io mode to send direct scsi commands
  to the above device.
  I ran 2 tests; in both I transfer 32GB of data.
  The 1st test uses an IO size of 128 (4K) pages.
  The 2nd test uses an IO size of 2048 pages.
  The second test will run successfully only if chaining is enabled
  and working; otherwise it will fail.
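
For reference, here are the effective commands, condensed from the
two attached scripts (sdb is just where scsi_debug landed on my
machine; bpt = pages * 8 blocks-per-page):

  # enable chained IO on the scsi_debug disk (from install_sdebug_chaining)
  cd /sys/block/sdb/queue
  echo 4096 > max_segments
  cat max_hw_sectors_kb > max_sectors_kb

  # 1st test: 128 pages per command -> bpt = 128 * 8 = 1024 blocks (512K)
  sg_dd blk_sgio=1 dio=1 if=/dev/zero of=/dev/sdb bs=512 bpt=1024 count=64M

  # 2nd test: 2048 pages per command -> bpt = 2048 * 8 = 16384 blocks (8M)
  sg_dd blk_sgio=1 dio=1 if=/dev/zero of=/dev/sdb bs=512 bpt=16384 count=64M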

The tested Kernels:

1. Jens's sglist-arch
  I was not able to pass all tests with this Kernel. For some reason,
  when commands bigger than 256 pages are queued, the machine runs
  out of memory and the test gets killed. After the test is killed
  the system is left with 10M of free memory and can hardly reboot.
  I added some prints at the queuecommand entry in scsi_debug.c
  and I can see that I receive the expected large sg_count and
  bufflen, but unlike in the other tests I get a different pointer
  from scsi_sglist() on every command. In the other tests, since
  nothing else is happening on this machine during the test, the
  sglist pointer is always the same: a command comes in, allocates
  memory, does nothing in scsi_debug, is freed, and returns.
  I suspect an sglist leak or an allocation bug.

2. scsi_data_buff
  This tree is what I posted last. It is basically:
  0. sglist-arch
  1. Revert of scsi-ml support for chaining.
  2. sg-pools cleanup [PATCH AB1]
  3. scsi-ml sglist-arch [PATCH B1]
  4. scsi_data_buff patch for scsi_lib.c (last patch sent)
  5. scsi_data_buff patch for sr.c, sd.c & scsi_error.c
  6. Converted libata and ide-scsi so the Kernel can compile.
  7. Conversion of scsi_debug.c and fix for chaining.
  (see http://www.bhalevy.com/open-osd/download/scsi_data_buff)

  All tests run successfully.

3. scsi_sgtable
  This tree is what I posted as the patches that opened this mailing
  thread.
  0. sglist-arch
  1. Revert of scsi-ml support for chaining.
  2. sg-pools cleanup [PATCH AB1]
  3. sgtable [PATCH A2]
  4. chaining [PATCH A3]
  5. scsi_sgtable for sd, sr and scsi_error
  6. Converted libata and ide-scsi so the Kernel can compile.
  7. Conversion of scsi_debug.c and fix for chaining.
  (see http://www.bhalevy.com/open-osd/download/scsi_sgtable/linux-block/)

  All tests run successfully.


[-- Attachment #2: install_sdebug_chaining --]
[-- Type: text/plain, Size: 473 bytes --]

#!/bin/sh
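# usage: run as root after loading the kernel under test;
# sdx below is the scsi_debug disk (sdb on my machine)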
sdx=sdb
#load the device with these params
modprobe scsi_debug virtual_gb=32 delay=0 dev_size_mb=32 fake_rw=1

# go set some live params
# $ cd /sys/bus/pseudo/drivers/scsi_debug
# $ echo 1 > fake_rw

# mess with sglist chaining
cd /sys/block/$sdx/queue
echo 4096 > max_segments
cat max_hw_sectors_kb  > max_sectors_kb
echo "max_hw_sectors_kb="$(cat max_hw_sectors_kb) 
echo "max_hw_sectors_kb="$(cat max_sectors_kb) 
echo "max_hw_sectors_kb="$(cat max_segments)

[-- Attachment #3: prof_test_scsi_debug --]
[-- Type: text/plain, Size: 1265 bytes --]

#!/bin/sh
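# usage: ./prof_test_scsi_debug <tree-name>
# timing results are appended to <tree-name>.txt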

#load the device with these params
#$ modprobe scsi_debug virtual_gb=32 delay=0 dev_size_mb=32 fake_rw=1

# go set some live params
# $ cd /sys/bus/pseudo/drivers/scsi_debug
# $ echo 1 > fake_rw

# mess with sglist chaining
# $ cd /sys/block/sdb/queue
# $ echo 4096 > max_segments
# $ cat max_hw_sectors_kb  > max_sectors_kb
# $ cat max_hw_sectors_kb 


if=/dev/zero
of=/dev/sdb

outputfile=$1.txt
echo "Testing $1"

# transfer 32GB in commands of $1 pages (sg elements) each
do_dd()
{
# block size of one sector
bs=512
# one memory page = 8 blocks (4K)
page=8
# number of scatterlist elements in a transfer
sgs=$1
# bpt (blocks per transfer) = sg elements * blocks per page
bpt=$(($sgs*$page))
# total blocks to transfer 32 gigabytes (64M blocks * 512 bytes)
count=64M


echo $3: "bpt=$bpt"

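# the backslash skips bash's time keyword so the external time(1) runs;
# its stderr (the timing line) is appended to the file passed as $2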
\time bash -c \
	"sg_dd blk_sgio=1 dio=1 if=$if of=$of bpt=$bpt bs=$bs count=$count 2>/dev/null" \
	2>> $2
}

echo "BEGIN RUN $1" >> $outputfile

# warm-up runs (timings discarded to /dev/null)
for i in {1..5}; do
do_dd 2048 /dev/null $i;
done

# one-sglist transfers: 128 pages per command fit in a single sglist page
echo "one page transfers"
echo "one page transfers" >> $outputfile
for i in {1..20}; do
do_dd 128 $outputfile $i;
done

# chained transfers:
# bpt = 16K blocks / 8 blocks-per-page = 2K pages
# 2K pages / 128 per sglist = 16 chained sglists
echo "16 chained sglists"
echo "16 chained sglists" >> $outputfile
for i in {1..20}; do
do_dd 2048 $outputfile $i;
done

echo "END RUN" >> $outputfile


Thread overview: 27+ messages
2007-07-24  8:47 [PATCHSET 0/5] Peaceful co-existence of scsi_sgtable and Large IO sg-chaining Boaz Harrosh
2007-07-24  8:52 ` [PATCH AB1/5] SCSI: SG pools allocation cleanup Boaz Harrosh
2007-07-24 13:08   ` Boaz Harrosh
2007-07-25  8:08   ` Boaz Harrosh
2007-07-25  9:05     ` [PATCH AB1/5 ver2] " Boaz Harrosh
2007-07-25  9:06     ` [PATCH A2/5 ver2] SCSI: scsi_sgtable implementation Boaz Harrosh
2007-07-24  8:56 ` [PATCH A2/5] " Boaz Harrosh
2007-07-24  8:59 ` [PATCH A3/5] SCSI: sg-chaining over scsi_sgtable Boaz Harrosh
2007-07-24  9:01 ` [PATCH B2/5] SCSI: support for allocating large scatterlists Boaz Harrosh
2007-07-24  9:03 ` [PATCH B3/5] SCSI: scsi_sgtable over sg-chainning Boaz Harrosh
2007-07-24  9:16 ` [PATCHSET 0/5] Peaceful co-existence of scsi_sgtable and Large IO sg-chaining FUJITA Tomonori
2007-07-24 10:01   ` Boaz Harrosh
2007-07-24 11:12     ` FUJITA Tomonori
2007-07-24 13:41       ` FUJITA Tomonori
2007-07-24 14:01         ` Benny Halevy
2007-07-24 16:10           ` James Bottomley
2007-07-25  8:26             ` Benny Halevy
2007-07-25  8:42               ` FUJITA Tomonori
2007-07-25 19:22                 ` Boaz Harrosh
2007-07-26 11:33                   ` FUJITA Tomonori
2007-07-31 20:12                   ` Boaz Harrosh [this message]
2007-08-05 16:03                     ` FUJITA Tomonori
2007-08-06  7:22                     ` FUJITA Tomonori
2007-08-07  6:55                       ` Jens Axboe
2007-08-07  8:36                         ` FUJITA Tomonori
2007-08-08  7:16                           ` Jens Axboe
2007-07-25 19:50                 ` Boaz Harrosh
