From: Jens Axboe <jaxboe@fusionio.com>
To: Bart Van Assche <bvanassche@acm.org>
Cc: Vasu Dev <vasu.dev@intel.com>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>
Subject: Re: [PATCH] scsi, fcoe, libfc: drop scsi host_lock use from fc_queuecommand
Date: Sun, 26 Sep 2010 12:19:37 +0900
Message-ID: <4C9EBBC9.9070709@fusionio.com>
In-Reply-To: <AANLkTi=Y3WnG8rRf4PAaTev+ZS0SifeZLSoKekNMg2L6@mail.gmail.com>

On 2010-09-26 01:55, Bart Van Assche wrote:
> On Fri, Sep 24, 2010 at 8:41 AM, Jens Axboe <jaxboe@fusionio.com> wrote:
>>
>> [ ... ]
>>
>> Bart, can you try with this patchset added:
>>
>> git://git.kernel.dk/linux-2.6-block.git blk-alloc-optimize
>>
>> It's a work in progress and not suitable for general consumption yet,
>> but it's tested working at least. There will be more built on top of
>> this, but at least even this simple stuff is making a big difference
>> for IOPS testing for me.
> 
> Hello Jens,
> 
> Thanks for the feedback. I see a nice 10% speedup after having applied
> the four block layer optimization patches from the blk-alloc-optimize
> branch on an already patched 2.6.35.5 SRP initiator.

Great! Not too bad for something that's still a WIP.

> Note: according to the output of perf record -g, most spinlock calls
> still originate from the block layer. This is what the perf tool
> reported for a fio run using libaio with small blocks (512 bytes):
> 
> Event: cycles
> -      7.06%    fio  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
>    - _raw_spin_lock_irqsave
>       + 19.51% blk_run_queue
>       + 13.71% blk_end_bidi_request
>       + 10.04% mlx4_ib_poll_cq
>       + 4.68% lock_timer_base
>       + 4.22% aio_complete
>       + 3.97% srp_send_completion
>       + 3.71% srp_queuecommand
>       + 3.55% dio_bio_end_aio
>       + 3.37% __srp_get_tx_iu
>       + 3.14% srp_recv_completion
>       + 3.00% scsi_device_unbusy
>       + 2.87% __scsi_put_command
>       + 2.82% __blockdev_direct_IO_newtrunc
>       + 2.76% scsi_put_command
>       + 2.69% scsi_run_queue
>       + 2.65% dio_bio_submit
>       + 2.54% srp_remove_req
>       + 2.46% mlx4_ib_post_send
>       + 2.33% scsi_get_command
>       + 1.95% mlx4_ib_post_recv

One piece of low-hanging fruit is reducing the number of queue runs.
SCSI does this for every completed command to keep the device queue
full. I bet if you try an experiment where you only run the queue when
a certain number of requests have completed, you would greatly reduce
scsi_run_queue and blk_run_queue in the above profile.
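
To make the experiment concrete, here is a rough userspace sketch of the
counting logic only. It is not kernel code and not a patch: struct
batch_state, complete_command(), run_queue() and simulate() are all
hypothetical names, and run_queue() merely counts calls where the real
thing would take the queue lock. Running the queue when nothing is left
in flight is an added assumption, so a partial batch does not leave the
queue stalled.

```c
#include <assert.h>

/* Hypothetical per-device state for batching queue runs. */
struct batch_state {
	unsigned int batch;       /* run the queue once per this many completions */
	unsigned int pending;     /* completions seen since the last queue run */
	unsigned long queue_runs; /* how many times the queue was actually run */
};

static void batch_init(struct batch_state *s, unsigned int batch)
{
	assert(batch > 0);
	s->batch = batch;
	s->pending = 0;
	s->queue_runs = 0;
}

/* Stand-in for the expensive, lock-taking queue run. */
static void run_queue(struct batch_state *s)
{
	s->queue_runs++;
	s->pending = 0;
}

/*
 * Called once per completed command. in_flight is the number of
 * commands still outstanding after this completion. The queue is run
 * when a full batch has completed, or (assumption) when the device has
 * gone idle, so that a partial batch cannot stall the queue.
 */
static void complete_command(struct batch_state *s, unsigned int in_flight)
{
	s->pending++;
	if (s->pending >= s->batch || in_flight == 0)
		run_queue(s);
}

/* Complete n commands back to back and return the number of queue runs. */
static unsigned long simulate(unsigned int n, unsigned int batch)
{
	struct batch_state s;
	unsigned int i;

	batch_init(&s, batch);
	for (i = 0; i < n; i++)
		complete_command(&s, n - 1 - i);
	return s.queue_runs;
}
```

With a batch of 1 this degenerates to today's behaviour of one queue run
per completion. Larger batches trade fewer lock acquisitions against the
risk of letting the device queue drain, so the right value would have to
come from measurement.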

-- 
Jens Axboe