linux-scsi.vger.kernel.org archive mirror
* [RFC 0/8] IB/srp: scaling and bug fixes for 2.6.38
@ 2010-12-23 21:31 David Dillow
From: David Dillow @ 2010-12-23 21:31 UTC
  To: linux-rdma; +Cc: linux-scsi, Bart Van Assche

[-- Attachment #1: Type: text/plain, Size: 3014 bytes --]

The first patch in this series fixes a longstanding issue where we crash if we
use sg_reset to perform a bus reset, but haven't sent enough commands to
initialize all of our request structures. The remaining patches break up Bart
Van Assche's lock scaling work, and add a few optimizations on top.

The scaling work looks to have paid off pretty well. All tests were conducted
over a QDR link between two Dell R410s with 2.6GHz Xeons. To push any possible
bottlenecks to the initiator, the test target was stripped down to not transfer
the request's data -- it simply responds to the command as though it had.

For fio driving one LUN using the SG engine, refactoring the locking using
patches 2 through 6 gives a 30% increase in command throughput from 16 to 64
threads, while allowing similar (within the noise) or slight improvements for 1
to 8 threads and 128 threads and above. Unsharing the lock (patch 7) with the
SCSI mid-layer hurts a bit for the single thread case (~2%) but gives an
additional 1 to 6% with more than one thread. Cache optimization (patch 8)
returns the single thread case back to par, and gives a modest increase as
threads increase.
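
To illustrate the idea behind unsharing the lock (patch 7), here is a minimal
user-space sketch. It is not code from the series: struct demo_host,
struct demo_target, demo_get_req() and demo_put_req() are hypothetical names,
and a C11 atomic_flag spin lock stands in for a kernel spinlock. The point is
only that the driver's request accounting takes a lock private to the target,
so it never contends with whoever holds the host-wide lock.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for illustration; not the ib_srp structures. */
struct demo_host {
	atomic_flag host_lock;		/* lock owned by the mid-layer */
};

struct demo_target {
	struct demo_host *host;
	atomic_flag lock;		/* driver-private lock */
	int free_reqs;			/* request slots still available */
	int max_reqs;
};

static void demo_lock(atomic_flag *l)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;
}

static void demo_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

static void demo_init(struct demo_host *h, struct demo_target *t, int nreq)
{
	atomic_flag_clear(&h->host_lock);
	atomic_flag_clear(&t->lock);
	t->host = h;
	t->free_reqs = nreq;
	t->max_reqs = nreq;
}

/* Take one request slot under the target lock only; -1 if none are free. */
static int demo_get_req(struct demo_target *t)
{
	int ret = -1;

	demo_lock(&t->lock);
	if (t->free_reqs > 0) {
		t->free_reqs--;
		ret = 0;
	}
	demo_unlock(&t->lock);
	return ret;
}

/* Return one request slot, again without touching the host lock. */
static void demo_put_req(struct demo_target *t)
{
	demo_lock(&t->lock);
	assert(t->free_reqs < t->max_reqs);
	t->free_reqs++;
	demo_unlock(&t->lock);
}
```

In this sketch demo_get_req() succeeds even while another context holds
host_lock, which is the contention the separate lock is meant to avoid.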

For fio driving multiple LUNs using the AIO engine, patches 2 through 6 give
slightly smaller increases at low thread counts with a single drive (20% over
baseline), but the improvement increases as drives are added and/or iodepth
increases, reaching 50% in many cases. Removing the shared lock typically
brings 5-10% improvement over the lock reduction work, and optimizing the cache
usage also gives a modest improvement, though more than in the SG case.
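
As a rough illustration of what grouping hot-path variables into cache lines
(patch 8) can look like, here is a stand-alone C11 sketch. The struct and field
names are hypothetical and the 64-byte line size is an assumption; the actual
layout is in the patch itself. Kernel code would typically use an annotation
such as ____cacheline_aligned_in_smp rather than _Alignas.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CACHE_LINE 64	/* assumed cache line size in bytes */

/* Hypothetical layout for illustration; not the real ib_srp structure. */
struct demo_target {
	/* Fields touched when submitting a command share one line. */
	_Alignas(DEMO_CACHE_LINE) int req_lim;
	unsigned int tx_head;
	void *free_tx;

	/* Fields touched on completion start on a separate line. */
	_Alignas(DEMO_CACHE_LINE) unsigned int rx_head;
	void *rx_ring;

	/* Rarely used fields are kept away from both hot groups. */
	_Alignas(DEMO_CACHE_LINE) char name[32];
	int state;
};

/* Index of the cache line, relative to the struct start, holding an offset. */
static size_t demo_line_of(size_t offset)
{
	return offset / DEMO_CACHE_LINE;
}

/* Compile-time checks that each group begins on a line boundary. */
_Static_assert(offsetof(struct demo_target, rx_head) % DEMO_CACHE_LINE == 0,
	       "completion fields must start a cache line");
_Static_assert(offsetof(struct demo_target, name) % DEMO_CACHE_LINE == 0,
	       "cold fields must start a cache line");
```

With this layout the submission and completion paths do not write to the same
line, at the cost of some padding in the structure.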

There is more investigation to be done -- for example, AIO peaked at 296k IOPS
from a single drive at an iodepth of 32 and a thread count of 32. SG peaked at
183k IOPS at 64 threads (iodepth was 1, but I did not try a survey for this
engine). I have some completion batching and blk-iopoll conversion patches as
well, but they have some interesting performance anomalies at the moment that
prevent them being a win.

I'd appreciate people's review and comments, as while the patches have over 10
billion commands on them from the performance testing and real hardware, they
involve locking and race conditions, which have a habit of not showing up until
the most inopportune time.

Once 2.6.37 is out, I'll add sign offs and push these to my repo for 2.6.38.



David Dillow (8):
  IB/srp: allow task management without a previous request
  IB/srp: consolidate state change code
  IB/srp: allow lockless work posting
  IB/srp: don't move active requests to their own list
  IB/srp: reduce local coverage for command submission and EH
  IB/srp: reduce lock coverage of command completion
  IB/srp: stop sharing the host lock with SCSI
  IB/srp: consolidate hot-path variables into cache lines

 drivers/infiniband/ulp/srp/ib_srp.c |  390 ++++++++++++++++-------------------
 drivers/infiniband/ulp/srp/ib_srp.h |   46 +++--
 2 files changed, 204 insertions(+), 232 deletions(-)

-- 
1.7.2.3


[-- Attachment #2: srp-scaling.ods --]
[-- Type: application/vnd.oasis.opendocument.spreadsheet, Size: 115310 bytes --]

Thread overview: 20+ messages
2010-12-23 21:31 [RFC 0/8] IB/srp: scaling and bug fixes for 2.6.38 David Dillow
     [not found] ` <1293139893-11678-1-git-send-email-dillowda-1Heg1YXhbW8@public.gmane.org>
2010-12-23 21:31   ` [RFC 1/8] IB/srp: allow task management without a previous request David Dillow
2010-12-23 21:31   ` [RFC 2/8] IB/srp: consolidate state change code David Dillow
2010-12-23 21:31   ` [RFC 3/8] IB/srp: allow lockless work posting David Dillow
2010-12-23 21:31   ` [RFC 4/8] IB/srp: don't move active requests to their own list David Dillow
     [not found]     ` <1293139893-11678-5-git-send-email-dillowda-1Heg1YXhbW8@public.gmane.org>
2010-12-27 15:50       ` Bart Van Assche
     [not found]         ` <AANLkTinuOD919iV+ixcWhZpSkJP1QP=+OyphzVSq4_aO-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-12-27 17:46           ` David Dillow
2010-12-23 21:31   ` [RFC 5/8] IB/srp: reduce local coverage for command submission and EH David Dillow
2010-12-23 21:31   ` [RFC 6/8] IB/srp: reduce lock coverage of command completion David Dillow
2011-01-02 17:27     ` Bart Van Assche
     [not found]       ` <AANLkTikBkdn6XWgJELaBBFi1ZHen+UwQuumDn+yEPDPg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-01-03  8:44         ` David Dillow
2010-12-23 21:31   ` [RFC 7/8] IB/srp: stop sharing the host lock with SCSI David Dillow
2010-12-27 17:49     ` David Dillow
2010-12-27 17:56       ` Boaz Harrosh
     [not found]         ` <4D18D355.30308-C4P08NqkoRlBDgjK7y7TUQ@public.gmane.org>
2010-12-27 18:01           ` David Dillow
     [not found]             ` <1293472913.2896.36.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
2010-12-27 18:47               ` Boaz Harrosh
     [not found]                 ` <4D18DF4D.6010607-C4P08NqkoRlBDgjK7y7TUQ@public.gmane.org>
2010-12-27 19:52                   ` Dave Dillow
2010-12-28 17:38                   ` Bart Van Assche
2010-12-23 21:31   ` [RFC 8/8] IB/srp: consolidate hot-path variables into cache lines David Dillow
2011-01-05 18:13     ` Roland Dreier
