* [RFC 0/8] IB/srp: scaling and bug fixes for 2.6.38
@ 2010-12-23 21:31 David Dillow
From: David Dillow @ 2010-12-23 21:31 UTC (permalink / raw)
  To: linux-rdma; +Cc: linux-scsi, Bart Van Assche

[-- Attachment #1: Type: text/plain, Size: 3014 bytes --]

The first patch in this series fixes a longstanding issue where we crash if we
use sg_reset to perform a bus reset, but haven't sent enough commands to
initialize all of our request structures. The remaining patches break up Bart
Van Assche's lock scaling work, and add a few optimizations on top.

The scaling work looks to have paid off pretty well. All tests were conducted
over a QDR link between two Dell R410s with 2.6GHz Xeons. To push any possible
bottlenecks to the initiator, the test target was stripped down to not transfer
the request data -- it simply responds to the command as though it had.

For fio driving one LUN using the SG engine, refactoring the locking via
patches 2 through 6 gives a 30% increase in command throughput from 16 to 64
threads, while allowing similar (within the noise) or slightly improved results
for 1 to 8 threads and for 128 threads and above. Unsharing the lock with the
SCSI mid-layer (patch 7) hurts a bit for the single-thread case (~2%) but gives
an additional 1 to 6% with more than one thread. Cache optimization (patch 8)
returns the single-thread case to par, and gives a modest increase as threads
increase.

For fio driving multiple LUNs using the AIO engine, patches 2 through 6 give
slightly smaller increases at low thread counts with a single drive (20% over
baseline), but the improvement grows as drives are added and/or iodepth
increases, reaching 50% in many cases. Removing the shared lock typically
brings a 5-10% improvement over the lock reduction work, and optimizing the
cache usage also gives a modest improvement -- a larger one than in the SG case.

There is more investigation to be done -- for example, AIO peaked at 296k IOPS
from a single drive at an iodepth of 32 and a thread count of 32, while SG
peaked at 183k IOPS at 64 threads (iodepth was 1, but I did not try a survey
for this engine). I have some completion batching and blk-iopoll conversion
patches as well, but they have some interesting performance anomalies at the
moment that prevent them from being a win.

I'd appreciate people's review and comments; while the patches have seen over
10 billion commands during performance testing on real hardware, they involve
locking and race conditions, which have a habit of not showing up until the
most inopportune time.

Once 2.6.37 is out, I'll add sign-offs and push these to my repo for 2.6.38.



David Dillow (8):
  IB/srp: allow task management without a previous request
  IB/srp: consolidate state change code
  IB/srp: allow lockless work posting
  IB/srp: don't move active requests to their own list
  IB/srp: reduce lock coverage for command submission and EH
  IB/srp: reduce lock coverage of command completion
  IB/srp: stop sharing the host lock with SCSI
  IB/srp: consolidate hot-path variables into cache lines

 drivers/infiniband/ulp/srp/ib_srp.c |  390 ++++++++++++++++-------------------
 drivers/infiniband/ulp/srp/ib_srp.h |   46 +++--
 2 files changed, 204 insertions(+), 232 deletions(-)

-- 
1.7.2.3


[-- Attachment #2: srp-scaling.ods --]
[-- Type: application/vnd.oasis.opendocument.spreadsheet, Size: 115310 bytes --]


* [RFC 1/8] IB/srp: allow task management without a previous request
@ 2010-12-23 21:31 David Dillow
From: David Dillow @ 2010-12-23 21:31 UTC (permalink / raw)
  To: linux-rdma
  Cc: linux-scsi, Bart Van Assche

We can only have one task management command outstanding at a time, so
move the completion and status to the target port. This allows us to
handle a reset of a LUN without a corresponding request having been sent.
We also no longer need to play games with host_scribble; we can simply
use it as the pointer it is.

This fixes a crash when we issue a bus reset using sg_reset.

Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=13893
Signed-off-by: David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>
---
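As an aside, the new host_scribble handling can be modeled in plain userspace
C. This is a hypothetical sketch with stand-in types (cmnd, request), not the
driver's structures; it only illustrates the "store the request pointer
directly, NULL means no active request" convention the patch adopts:

```c
#include <stddef.h>

/* Hypothetical stand-ins for scsi_cmnd and srp_request. */
struct cmnd {
	char *host_scribble;	/* opaque per-command storage, as in scsi_cmnd */
};

struct request {
	struct cmnd *scmnd;	/* non-NULL while the request is active */
	short index;
};

/* After the patch, host_scribble holds the request pointer directly;
 * NULL (rather than the old -1 index cookie) means "no active request". */
static void attach(struct cmnd *c, struct request *req)
{
	req->scmnd = c;
	c->host_scribble = (char *) req;
}

static struct request *lookup(struct cmnd *c)
{
	return (struct request *) c->host_scribble;
}

static void complete_cmd(struct cmnd *c)
{
	struct request *req = lookup(c);

	if (req)
		req->scmnd = NULL;	/* stands in for srp_remove_req() */
	c->host_scribble = NULL;
}
```

This is why srp_abort() above can simply cast host_scribble and bail out with
FAILED when it is NULL.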
 drivers/infiniband/ulp/srp/ib_srp.c |   90 ++++++++++++----------------------
 drivers/infiniband/ulp/srp/ib_srp.h |   10 ++--
 2 files changed, 37 insertions(+), 63 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 1e1e347..95fb9f0 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -542,6 +542,7 @@ static void srp_unmap_data(struct scsi_cmnd *scmnd,
 static void srp_remove_req(struct srp_target_port *target, struct srp_request *req)
 {
 	srp_unmap_data(req->scmnd, target, req);
+	req->scmnd = NULL;
 	list_move_tail(&req->list, &target->free_reqs);
 }
 
@@ -925,15 +926,13 @@ static void srp_process_rsp(struct srp_target_port *target, struct srp_rsp *rsp)
 
 	target->req_lim += delta;
 
-	req = &target->req_ring[rsp->tag & ~SRP_TAG_TSK_MGMT];
-
 	if (unlikely(rsp->tag & SRP_TAG_TSK_MGMT)) {
-		if (be32_to_cpu(rsp->resp_data_len) < 4)
-			req->tsk_status = -1;
-		else
-			req->tsk_status = rsp->data[3];
-		complete(&req->done);
+		target->tsk_mgmt_status = -1;
+		if (be32_to_cpu(rsp->resp_data_len) >= 4)
+			target->tsk_mgmt_status = rsp->data[3];
+		complete(&target->tsk_mgmt_done);
 	} else {
+		req = &target->req_ring[rsp->tag];
 		scmnd = req->scmnd;
 		if (!scmnd)
 			shost_printk(KERN_ERR, target->scsi_host,
@@ -953,13 +952,9 @@ static void srp_process_rsp(struct srp_target_port *target, struct srp_rsp *rsp)
 		else if (rsp->flags & (SRP_RSP_FLAG_DIOVER | SRP_RSP_FLAG_DIUNDER))
 			scsi_set_resid(scmnd, be32_to_cpu(rsp->data_in_res_cnt));
 
-		if (!req->tsk_mgmt) {
-			scmnd->host_scribble = (void *) -1L;
-			scmnd->scsi_done(scmnd);
-
-			srp_remove_req(target, req);
-		} else
-			req->cmd_done = 1;
+		scmnd->host_scribble = NULL;
+		scmnd->scsi_done(scmnd);
+		srp_remove_req(target, req);
 	}
 
 	spin_unlock_irqrestore(target->scsi_host->host_lock, flags);
@@ -1155,7 +1150,7 @@ static int srp_queuecommand_lck(struct scsi_cmnd *scmnd,
 
 	scmnd->scsi_done     = done;
 	scmnd->result        = 0;
-	scmnd->host_scribble = (void *) (long) req->index;
+	scmnd->host_scribble = (void *) req;
 
 	cmd = iu->buf;
 	memset(cmd, 0, sizeof *cmd);
@@ -1167,8 +1162,6 @@ static int srp_queuecommand_lck(struct scsi_cmnd *scmnd,
 
 	req->scmnd    = scmnd;
 	req->cmd      = iu;
-	req->cmd_done = 0;
-	req->tsk_mgmt = NULL;
 
 	len = srp_map_data(scmnd, target, req);
 	if (len < 0) {
@@ -1442,7 +1435,7 @@ static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event)
 }
 
 static int srp_send_tsk_mgmt(struct srp_target_port *target,
-			     struct srp_request *req, u8 func)
+			     u64 req_tag, unsigned int lun, u8 func)
 {
 	struct ib_device *dev = target->srp_host->srp_dev->dev;
 	struct srp_iu *iu;
@@ -1451,12 +1444,10 @@ static int srp_send_tsk_mgmt(struct srp_target_port *target,
 	spin_lock_irq(target->scsi_host->host_lock);
 
 	if (target->state == SRP_TARGET_DEAD ||
-	    target->state == SRP_TARGET_REMOVED) {
-		req->scmnd->result = DID_BAD_TARGET << 16;
+	    target->state == SRP_TARGET_REMOVED)
 		goto out;
-	}
 
-	init_completion(&req->done);
+	init_completion(&target->tsk_mgmt_done);
 
 	iu = __srp_get_tx_iu(target, SRP_IU_TSK_MGMT);
 	if (!iu)
@@ -1468,21 +1459,19 @@ static int srp_send_tsk_mgmt(struct srp_target_port *target,
 	memset(tsk_mgmt, 0, sizeof *tsk_mgmt);
 
 	tsk_mgmt->opcode 	= SRP_TSK_MGMT;
-	tsk_mgmt->lun 		= cpu_to_be64((u64) req->scmnd->device->lun << 48);
-	tsk_mgmt->tag 		= req->index | SRP_TAG_TSK_MGMT;
+	tsk_mgmt->lun 		= cpu_to_be64((u64) lun << 48);
+	tsk_mgmt->tag 		= req_tag | SRP_TAG_TSK_MGMT;
 	tsk_mgmt->tsk_mgmt_func = func;
-	tsk_mgmt->task_tag 	= req->index;
+	tsk_mgmt->task_tag 	= req_tag;
 
 	ib_dma_sync_single_for_device(dev, iu->dma, sizeof *tsk_mgmt,
 				      DMA_TO_DEVICE);
 	if (__srp_post_send(target, iu, sizeof *tsk_mgmt))
 		goto out;
 
-	req->tsk_mgmt = iu;
-
 	spin_unlock_irq(target->scsi_host->host_lock);
 
-	if (!wait_for_completion_timeout(&req->done,
+	if (!wait_for_completion_timeout(&target->tsk_mgmt_done,
 					 msecs_to_jiffies(SRP_ABORT_TIMEOUT_MS)))
 		return -1;
 
@@ -1493,43 +1482,29 @@ out:
 	return -1;
 }
 
-static int srp_find_req(struct srp_target_port *target,
-			struct scsi_cmnd *scmnd,
-			struct srp_request **req)
-{
-	if (scmnd->host_scribble == (void *) -1L)
-		return -1;
-
-	*req = &target->req_ring[(long) scmnd->host_scribble];
-
-	return 0;
-}
-
 static int srp_abort(struct scsi_cmnd *scmnd)
 {
 	struct srp_target_port *target = host_to_target(scmnd->device->host);
-	struct srp_request *req;
+	struct srp_request *req = (struct srp_request *) scmnd->host_scribble;
 	int ret = SUCCESS;
 
 	shost_printk(KERN_ERR, target->scsi_host, "SRP abort called\n");
 
-	if (target->qp_in_error)
-		return FAILED;
-	if (srp_find_req(target, scmnd, &req))
+	if (!req || target->qp_in_error)
 		return FAILED;
-	if (srp_send_tsk_mgmt(target, req, SRP_TSK_ABORT_TASK))
+	if (srp_send_tsk_mgmt(target, req->index, scmnd->device->lun,
+			      SRP_TSK_ABORT_TASK))
 		return FAILED;
 
 	spin_lock_irq(target->scsi_host->host_lock);
 
-	if (req->cmd_done) {
-		srp_remove_req(target, req);
-		scmnd->scsi_done(scmnd);
-	} else if (!req->tsk_status) {
-		srp_remove_req(target, req);
-		scmnd->result = DID_ABORT << 16;
-	} else
-		ret = FAILED;
+	if (req->scmnd) {
+		if (!target->tsk_mgmt_status) {
+			srp_remove_req(target, req);
+			scmnd->result = DID_ABORT << 16;
+		} else
+			ret = FAILED;
+	}
 
 	spin_unlock_irq(target->scsi_host->host_lock);
 
@@ -1545,17 +1520,16 @@ static int srp_reset_device(struct scsi_cmnd *scmnd)
 
 	if (target->qp_in_error)
 		return FAILED;
-	if (srp_find_req(target, scmnd, &req))
-		return FAILED;
-	if (srp_send_tsk_mgmt(target, req, SRP_TSK_LUN_RESET))
+	if (srp_send_tsk_mgmt(target, SRP_TAG_NO_REQ, scmnd->device->lun,
+			      SRP_TSK_LUN_RESET))
 		return FAILED;
-	if (req->tsk_status)
+	if (target->tsk_mgmt_status)
 		return FAILED;
 
 	spin_lock_irq(target->scsi_host->host_lock);
 
 	list_for_each_entry_safe(req, tmp, &target->req_queue, list)
-		if (req->scmnd->device == scmnd->device)
+		if (req->scmnd && req->scmnd->device == scmnd->device)
 			srp_reset_req(target, req);
 
 	spin_unlock_irq(target->scsi_host->host_lock);
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index ed0dce9..c923731 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -68,7 +68,8 @@ enum {
 	SRP_TSK_MGMT_SQ_SIZE	= 1,
 	SRP_CMD_SQ_SIZE		= SRP_REQ_SQ_SIZE - SRP_TSK_MGMT_SQ_SIZE,
 
-	SRP_TAG_TSK_MGMT	= 1 << (SRP_RQ_SHIFT + 1),
+	SRP_TAG_NO_REQ		= ~0UL,
+	SRP_TAG_TSK_MGMT	= 1UL << 31,
 
 	SRP_FMR_SIZE		= 256,
 	SRP_FMR_POOL_SIZE	= 1024,
@@ -113,12 +114,8 @@ struct srp_request {
 	struct list_head	list;
 	struct scsi_cmnd       *scmnd;
 	struct srp_iu	       *cmd;
-	struct srp_iu	       *tsk_mgmt;
 	struct ib_pool_fmr     *fmr;
-	struct completion	done;
 	short			index;
-	u8			cmd_done;
-	u8			tsk_status;
 };
 
 struct srp_target_port {
@@ -165,6 +162,9 @@ struct srp_target_port {
 	int			status;
 	enum srp_target_state	state;
 	int			qp_in_error;
+
+	struct completion	tsk_mgmt_done;
+	u8			tsk_mgmt_status;
 };
 
 struct srp_iu {
-- 
1.7.2.3

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [RFC 2/8] IB/srp: consolidate state change code
@ 2010-12-23 21:31 David Dillow
From: David Dillow @ 2010-12-23 21:31 UTC (permalink / raw)
  To: linux-rdma
  Cc: linux-scsi, Bart Van Assche

This is a break-out and slight cleanup of Bart Van Assche's work.
---
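The shape of the new helper can be sketched in userspace C; a pthread mutex
stands in for the SCSI host lock, and the types here are hypothetical
reductions of the driver's, not its actual code:

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the target port's state machine. */
enum state { ST_LIVE, ST_CONNECTING, ST_DEAD, ST_REMOVED };

struct port {
	pthread_mutex_t lock;	/* plays the role of the SCSI host lock */
	enum state state;
};

/* Move from old_state to new_state only if no other transition beat us
 * to it; report whether the change happened. This is the compare-and-set
 * shape of srp_change_state() in the patch. */
static bool change_state(struct port *p, enum state old_state,
			 enum state new_state)
{
	bool changed = false;

	pthread_mutex_lock(&p->lock);
	if (p->state == old_state) {
		p->state = new_state;
		changed = true;
	}
	pthread_mutex_unlock(&p->lock);
	return changed;
}
```

Collapsing the three open-coded lock/check/set/unlock sequences into one
helper is the whole patch; callers just branch on the return value.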
 drivers/infiniband/ulp/srp/ib_srp.c |   45 ++++++++++++++++++----------------
 1 files changed, 24 insertions(+), 21 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 95fb9f0..a14975b 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -441,18 +441,28 @@ static void srp_disconnect_target(struct srp_target_port *target)
 	wait_for_completion(&target->done);
 }
 
+static bool srp_change_state(struct srp_target_port *target,
+			    enum srp_target_state old,
+			    enum srp_target_state new)
+{
+	int changed = false;
+
+	spin_lock_irq(target->scsi_host->host_lock);
+	if (target->state == old) {
+		target->state = new;
+		changed = true;
+	}
+	spin_unlock_irq(target->scsi_host->host_lock);
+	return changed;
+}
+
 static void srp_remove_work(struct work_struct *work)
 {
 	struct srp_target_port *target =
 		container_of(work, struct srp_target_port, work);
 
-	spin_lock_irq(target->scsi_host->host_lock);
-	if (target->state != SRP_TARGET_DEAD) {
-		spin_unlock_irq(target->scsi_host->host_lock);
+	if (!srp_change_state(target, SRP_TARGET_DEAD, SRP_TARGET_REMOVED))
 		return;
-	}
-	target->state = SRP_TARGET_REMOVED;
-	spin_unlock_irq(target->scsi_host->host_lock);
 
 	spin_lock(&target->srp_host->target_lock);
 	list_del(&target->list);
@@ -560,13 +570,8 @@ static int srp_reconnect_target(struct srp_target_port *target)
 	struct ib_wc wc;
 	int ret;
 
-	spin_lock_irq(target->scsi_host->host_lock);
-	if (target->state != SRP_TARGET_LIVE) {
-		spin_unlock_irq(target->scsi_host->host_lock);
+	if (!srp_change_state(target, SRP_TARGET_LIVE, SRP_TARGET_CONNECTING))
 		return -EAGAIN;
-	}
-	target->state = SRP_TARGET_CONNECTING;
-	spin_unlock_irq(target->scsi_host->host_lock);
 
 	srp_disconnect_target(target);
 	/*
@@ -605,13 +610,8 @@ static int srp_reconnect_target(struct srp_target_port *target)
 	if (ret)
 		goto err;
 
-	spin_lock_irq(target->scsi_host->host_lock);
-	if (target->state == SRP_TARGET_CONNECTING) {
-		ret = 0;
-		target->state = SRP_TARGET_LIVE;
-	} else
+	if (!srp_change_state(target, SRP_TARGET_CONNECTING, SRP_TARGET_LIVE))
 		ret = -EAGAIN;
-	spin_unlock_irq(target->scsi_host->host_lock);
 
 	return ret;
 
@@ -621,9 +621,12 @@ err:
 
 	/*
 	 * We couldn't reconnect, so kill our target port off.
-	 * However, we have to defer the real removal because we might
-	 * be in the context of the SCSI error handler now, which
-	 * would deadlock if we call scsi_remove_host().
+	 * However, we have to defer the real removal because we
+	 * are in the context of the SCSI error handler now, which
+	 * will deadlock if we call scsi_remove_host().
+	 *
+	 * Schedule our work inside the lock to avoid a race with
+	 * the flush_scheduled_work() in srp_remove_one().
 	 */
 	spin_lock_irq(target->scsi_host->host_lock);
 	if (target->state == SRP_TARGET_CONNECTING) {
-- 
1.7.2.3



* [RFC 3/8] IB/srp: allow lockless work posting
@ 2010-12-23 21:31 David Dillow
From: David Dillow @ 2010-12-23 21:31 UTC (permalink / raw)
  To: linux-rdma
  Cc: linux-scsi, Bart Van Assche

Only one CPU at a time will own an RX IU, so using the address of the IU
as the work request cookie allows us to avoid taking a lock. We can
similarly prepare the TX path for lockless posting by moving the free TX
IUs to a list. This also removes the requirement that the queue sizes be
a power of 2.

This is a break-out of Bart Van Assche's work, modified to avoid needing
an extra field in the IU.
---
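The cookie round-trip at the heart of this can be sketched in userspace;
struct iu below is a hypothetical stand-in holding only the fields the sketch
needs, and the two helpers model what the driver does inline with casts:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-in for struct srp_iu. */
struct iu {
	uint64_t dma;
	size_t size;
};

/* ib_send_wr.wr_id and ib_wc.wr_id are opaque 64-bit cookies. Storing
 * the IU's address there lets the completion handler recover the IU
 * directly -- no index table, no ring head/tail counters, no lock. */
static uint64_t iu_to_wr_id(struct iu *iu)
{
	return (uintptr_t) iu;
}

static struct iu *wr_id_to_iu(uint64_t wr_id)
{
	return (struct iu *) (uintptr_t) wr_id;
}
```

The BUILD_BUG_ON() added to srp_init_module() below guards exactly the
assumption this sketch makes: a pointer must fit in the wr_id field.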
 drivers/infiniband/ulp/srp/ib_srp.c |   66 ++++++++++++++--------------------
 drivers/infiniband/ulp/srp/ib_srp.h |    7 +---
 2 files changed, 29 insertions(+), 44 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index a14975b..6d98217 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -568,7 +568,7 @@ static int srp_reconnect_target(struct srp_target_port *target)
 	struct ib_qp_attr qp_attr;
 	struct srp_request *req, *tmp;
 	struct ib_wc wc;
-	int ret;
+	int i, ret;
 
 	if (!srp_change_state(target, SRP_TARGET_LIVE, SRP_TARGET_CONNECTING))
 		return -EAGAIN;
@@ -601,9 +601,9 @@ static int srp_reconnect_target(struct srp_target_port *target)
 		srp_reset_req(target, req);
 	spin_unlock_irq(target->scsi_host->host_lock);
 
-	target->rx_head	 = 0;
-	target->tx_head	 = 0;
-	target->tx_tail  = 0;
+	list_del_init(&target->free_tx);
+	for (i = 0; i < SRP_SQ_SIZE; ++i)
+		list_move(&target->tx_ring[i]->list, &target->free_tx);
 
 	target->qp_in_error = 0;
 	ret = srp_connect_target(target);
@@ -817,7 +817,7 @@ static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_target_port *target,
 
 /*
  * Must be called with target->scsi_host->host_lock held to protect
- * req_lim and tx_head.  Lock cannot be dropped between call here and
+ * req_lim and free_tx.  Lock cannot be dropped between call here and
  * call to __srp_post_send().
  *
  * Note:
@@ -837,7 +837,7 @@ static struct srp_iu *__srp_get_tx_iu(struct srp_target_port *target,
 
 	srp_send_completion(target->send_cq, target);
 
-	if (target->tx_head - target->tx_tail >= SRP_SQ_SIZE)
+	if (list_empty(&target->free_tx))
 		return NULL;
 
 	/* Initiator responses to target requests do not consume credits */
@@ -846,14 +846,14 @@ static struct srp_iu *__srp_get_tx_iu(struct srp_target_port *target,
 		return NULL;
 	}
 
-	iu = target->tx_ring[target->tx_head & SRP_SQ_MASK];
+	iu = list_first_entry(&target->free_tx, struct srp_iu, list);
 	iu->type = iu_type;
 	return iu;
 }
 
 /*
  * Must be called with target->scsi_host->host_lock held to protect
- * req_lim and tx_head.
+ * req_lim and free_tx.
  */
 static int __srp_post_send(struct srp_target_port *target,
 			   struct srp_iu *iu, int len)
@@ -867,7 +867,7 @@ static int __srp_post_send(struct srp_target_port *target,
 	list.lkey   = target->srp_host->srp_dev->mr->lkey;
 
 	wr.next       = NULL;
-	wr.wr_id      = target->tx_head & SRP_SQ_MASK;
+	wr.wr_id      = (uintptr_t) iu;
 	wr.sg_list    = &list;
 	wr.num_sge    = 1;
 	wr.opcode     = IB_WR_SEND;
@@ -876,7 +876,7 @@ static int __srp_post_send(struct srp_target_port *target,
 	ret = ib_post_send(target->qp, &wr, &bad_wr);
 
 	if (!ret) {
-		++target->tx_head;
+		list_del_init(&iu->list);
 		if (iu->type != SRP_IU_RSP)
 			--target->req_lim;
 	}
@@ -884,36 +884,21 @@ static int __srp_post_send(struct srp_target_port *target,
 	return ret;
 }
 
-static int srp_post_recv(struct srp_target_port *target)
+static int srp_post_recv(struct srp_target_port *target, struct srp_iu *iu)
 {
-	unsigned long flags;
-	struct srp_iu *iu;
-	struct ib_sge list;
 	struct ib_recv_wr wr, *bad_wr;
-	unsigned int next;
-	int ret;
-
-	spin_lock_irqsave(target->scsi_host->host_lock, flags);
-
-	next	 = target->rx_head & SRP_RQ_MASK;
-	wr.wr_id = next;
-	iu	 = target->rx_ring[next];
+	struct ib_sge list;
 
 	list.addr   = iu->dma;
 	list.length = iu->size;
 	list.lkey   = target->srp_host->srp_dev->mr->lkey;
 
 	wr.next     = NULL;
+	wr.wr_id    = (uintptr_t) iu;
 	wr.sg_list  = &list;
 	wr.num_sge  = 1;
 
-	ret = ib_post_recv(target->qp, &wr, &bad_wr);
-	if (!ret)
-		++target->rx_head;
-
-	spin_unlock_irqrestore(target->scsi_host->host_lock, flags);
-
-	return ret;
+	return ib_post_recv(target->qp, &wr, &bad_wr);
 }
 
 static void srp_process_rsp(struct srp_target_port *target, struct srp_rsp *rsp)
@@ -1030,14 +1015,11 @@ static void srp_process_aer_req(struct srp_target_port *target,
 
 static void srp_handle_recv(struct srp_target_port *target, struct ib_wc *wc)
 {
-	struct ib_device *dev;
-	struct srp_iu *iu;
+	struct ib_device *dev = target->srp_host->srp_dev->dev;
+	struct srp_iu *iu = (struct srp_iu *) wc->wr_id;
 	int res;
 	u8 opcode;
 
-	iu = target->rx_ring[wc->wr_id];
-
-	dev = target->srp_host->srp_dev->dev;
 	ib_dma_sync_single_for_cpu(dev, iu->dma, target->max_ti_iu_len,
 				   DMA_FROM_DEVICE);
 
@@ -1078,7 +1060,7 @@ static void srp_handle_recv(struct srp_target_port *target, struct ib_wc *wc)
 	ib_dma_sync_single_for_device(dev, iu->dma, target->max_ti_iu_len,
 				      DMA_FROM_DEVICE);
 
-	res = srp_post_recv(target);
+	res = srp_post_recv(target, iu);
 	if (res != 0)
 		shost_printk(KERN_ERR, target->scsi_host,
 			     PFX "Recv failed with error code %d\n", res);
@@ -1107,6 +1089,7 @@ static void srp_send_completion(struct ib_cq *cq, void *target_ptr)
 {
 	struct srp_target_port *target = target_ptr;
 	struct ib_wc wc;
+	struct srp_iu *iu;
 
 	while (ib_poll_cq(cq, 1, &wc) > 0) {
 		if (wc.status) {
@@ -1117,7 +1100,8 @@ static void srp_send_completion(struct ib_cq *cq, void *target_ptr)
 			break;
 		}
 
-		++target->tx_tail;
+		iu = (struct srp_iu *) wc.wr_id;
+		list_add(&iu->list, &target->free_tx);
 	}
 }
 
@@ -1212,6 +1196,9 @@ static int srp_alloc_iu_bufs(struct srp_target_port *target)
 						  GFP_KERNEL, DMA_TO_DEVICE);
 		if (!target->tx_ring[i])
 			goto err;
+
+		INIT_LIST_HEAD(&target->tx_ring[i]->list);
+		list_add(&target->tx_ring[i]->list, &target->free_tx);
 	}
 
 	return 0;
@@ -1373,7 +1360,8 @@ static int srp_cm_handler(struct ib_cm_id *cm_id, struct ib_cm_event *event)
 			break;
 
 		for (i = 0; i < SRP_RQ_SIZE; i++) {
-			target->status = srp_post_recv(target);
+			struct srp_iu *iu = target->rx_ring[i];
+			target->status = srp_post_recv(target, iu);
 			if (target->status)
 				break;
 		}
@@ -1965,6 +1953,7 @@ static ssize_t srp_create_target(struct device *dev,
 	target->scsi_host  = target_host;
 	target->srp_host   = host;
 
+	INIT_LIST_HEAD(&target->free_tx);
 	INIT_LIST_HEAD(&target->free_reqs);
 	INIT_LIST_HEAD(&target->req_queue);
 	for (i = 0; i < SRP_CMD_SQ_SIZE; ++i) {
@@ -2235,8 +2224,7 @@ static int __init srp_init_module(void)
 {
 	int ret;
 
-	BUILD_BUG_ON_NOT_POWER_OF_2(SRP_SQ_SIZE);
-	BUILD_BUG_ON_NOT_POWER_OF_2(SRP_RQ_SIZE);
+	BUILD_BUG_ON(FIELD_SIZEOF(struct ib_wc, wr_id) < sizeof(void *));
 
 	if (srp_sg_tablesize > 255) {
 		printk(KERN_WARNING PFX "Clamping srp_sg_tablesize to 255\n");
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index c923731..0db525e 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -59,10 +59,8 @@ enum {
 
 	SRP_RQ_SHIFT    	= 6,
 	SRP_RQ_SIZE		= 1 << SRP_RQ_SHIFT,
-	SRP_RQ_MASK		= SRP_RQ_SIZE - 1,
 
 	SRP_SQ_SIZE		= SRP_RQ_SIZE,
-	SRP_SQ_MASK		= SRP_SQ_SIZE - 1,
 	SRP_RSP_SQ_SIZE		= 1,
 	SRP_REQ_SQ_SIZE		= SRP_SQ_SIZE - SRP_RSP_SQ_SIZE,
 	SRP_TSK_MGMT_SQ_SIZE	= 1,
@@ -144,11 +142,9 @@ struct srp_target_port {
 
 	int			zero_req_lim;
 
-	unsigned		rx_head;
 	struct srp_iu	       *rx_ring[SRP_RQ_SIZE];
 
-	unsigned		tx_head;
-	unsigned		tx_tail;
+	struct list_head	free_tx;
 	struct srp_iu	       *tx_ring[SRP_SQ_SIZE];
 
 	struct list_head	free_reqs;
@@ -168,6 +164,7 @@ struct srp_target_port {
 };
 
 struct srp_iu {
+	struct list_head	list;
 	u64			dma;
 	void		       *buf;
 	size_t			size;
-- 
1.7.2.3



* [RFC 4/8] IB/srp: don't move active requests to their own list
@ 2010-12-23 21:31 David Dillow
From: David Dillow @ 2010-12-23 21:31 UTC (permalink / raw)
  To: linux-rdma
  Cc: linux-scsi, Bart Van Assche

We use req->scmnd != NULL to indicate an active request, so there's no
need to keep a separate list for them. We can afford the array iteration
during error handling, and dropping it gives us one less item that needs
lock protection.

This is a break-out and slight cleanup of Bart Van Assche's work.
---
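The array-scan approach can be sketched in userspace with hypothetical
stand-ins for the request ring and srp_reset_req(); the point is that
"active" is derivable from the slot itself, so no list needs maintaining:

```c
#include <stddef.h>

enum { CMD_SQ_SIZE = 8 };	/* hypothetical stand-in for SRP_CMD_SQ_SIZE */

struct request {
	void *scmnd;	/* non-NULL exactly while the request is active */
};

/* Error handling walks the whole fixed-size ring and resets any slot
 * whose scmnd is set, instead of keeping active requests on their own
 * locked list. Returns how many requests were reset. */
static int reset_active(struct request ring[CMD_SQ_SIZE])
{
	int resets = 0;
	size_t i;

	for (i = 0; i < CMD_SQ_SIZE; ++i) {
		if (ring[i].scmnd) {
			ring[i].scmnd = NULL;	/* stands in for srp_reset_req() */
			++resets;
		}
	}
	return resets;
}
```

The scan is O(ring size), but it only runs on the error-handling slow path,
which is the trade the commit message describes.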
 drivers/infiniband/ulp/srp/ib_srp.c |   17 ++++++++++-------
 drivers/infiniband/ulp/srp/ib_srp.h |    1 -
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 6d98217..fcc8702 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -566,7 +566,6 @@ static void srp_reset_req(struct srp_target_port *target, struct srp_request *re
 static int srp_reconnect_target(struct srp_target_port *target)
 {
 	struct ib_qp_attr qp_attr;
-	struct srp_request *req, *tmp;
 	struct ib_wc wc;
 	int i, ret;
 
@@ -597,8 +596,11 @@ static int srp_reconnect_target(struct srp_target_port *target)
 		; /* nothing */
 
 	spin_lock_irq(target->scsi_host->host_lock);
-	list_for_each_entry_safe(req, tmp, &target->req_queue, list)
-		srp_reset_req(target, req);
+	for (i = 0; i < SRP_CMD_SQ_SIZE; ++i) {
+		struct srp_request *req = &target->req_ring[i];
+		if (req->scmnd)
+			srp_reset_req(target, req);
+	}
 	spin_unlock_irq(target->scsi_host->host_lock);
 
 	list_del_init(&target->free_tx);
@@ -1165,7 +1167,7 @@ static int srp_queuecommand_lck(struct scsi_cmnd *scmnd,
 		goto err_unmap;
 	}
 
-	list_move_tail(&req->list, &target->req_queue);
+	list_del_init(&req->list);
 
 	return 0;
 
@@ -1505,7 +1507,7 @@ static int srp_abort(struct scsi_cmnd *scmnd)
 static int srp_reset_device(struct scsi_cmnd *scmnd)
 {
 	struct srp_target_port *target = host_to_target(scmnd->device->host);
-	struct srp_request *req, *tmp;
+	int i;
 
 	shost_printk(KERN_ERR, target->scsi_host, "SRP reset_device called\n");
 
@@ -1519,9 +1521,11 @@ static int srp_reset_device(struct scsi_cmnd *scmnd)
 
 	spin_lock_irq(target->scsi_host->host_lock);
 
-	list_for_each_entry_safe(req, tmp, &target->req_queue, list)
+	for (i = 0; i < SRP_CMD_SQ_SIZE; ++i) {
+		struct srp_request *req = &target->req_ring[i];
 		if (req->scmnd && req->scmnd->device == scmnd->device)
 			srp_reset_req(target, req);
+	}
 
 	spin_unlock_irq(target->scsi_host->host_lock);
 
@@ -1955,7 +1959,6 @@ static ssize_t srp_create_target(struct device *dev,
 
 	INIT_LIST_HEAD(&target->free_tx);
 	INIT_LIST_HEAD(&target->free_reqs);
-	INIT_LIST_HEAD(&target->req_queue);
 	for (i = 0; i < SRP_CMD_SQ_SIZE; ++i) {
 		target->req_ring[i].index = i;
 		list_add_tail(&target->req_ring[i].list, &target->free_reqs);
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index 0db525e..b3808dd 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -148,7 +148,6 @@ struct srp_target_port {
 	struct srp_iu	       *tx_ring[SRP_SQ_SIZE];
 
 	struct list_head	free_reqs;
-	struct list_head	req_queue;
 	struct srp_request	req_ring[SRP_CMD_SQ_SIZE];
 
 	struct work_struct	work;
-- 
1.7.2.3



* [RFC 5/8] IB/srp: reduce lock coverage for command submission and EH
@ 2010-12-23 21:31 David Dillow
From: David Dillow @ 2010-12-23 21:31 UTC (permalink / raw)
  To: linux-rdma
  Cc: linux-scsi, Bart Van Assche

We only need locks to protect our lists and the number of credits available.
By pre-consuming the credit for the request, we can reduce our lock
coverage to just those areas. If we don't actually send the request,
we'll need to put the credit back into the pool.

This is a break-out and slight cleanup of Bart Van Assche's work.
---
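The credit pre-consumption can be sketched in userspace; a pthread mutex
stands in for the host lock, and struct port is a hypothetical reduction of
srp_target_port. Taking the credit under the lock means the post itself can
run unlocked, with the failure path handing the credit back:

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical reduction of srp_target_port. */
struct port {
	pthread_mutex_t lock;	/* plays the role of the SCSI host lock */
	int req_lim;		/* credits granted by the target */
	int zero_req_lim;	/* starvation counter */
};

/* Consume a credit while holding the lock, mirroring the shape of
 * __srp_get_tx_iu() after this patch. */
static bool get_credit(struct port *p)
{
	bool ok = false;

	pthread_mutex_lock(&p->lock);
	if (p->req_lim > 0) {
		--p->req_lim;
		ok = true;
	} else {
		++p->zero_req_lim;
	}
	pthread_mutex_unlock(&p->lock);
	return ok;
}

/* If the send is never posted, the credit goes back into the pool, as
 * srp_put_tx_iu() does. */
static void put_credit(struct port *p)
{
	pthread_mutex_lock(&p->lock);
	++p->req_lim;
	pthread_mutex_unlock(&p->lock);
}
```

The real helpers also move the IU between the free list and the caller; only
the credit accounting is modeled here.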
 drivers/infiniband/ulp/srp/ib_srp.c |  126 +++++++++++++++++++----------------
 drivers/infiniband/ulp/srp/ib_srp.h |    1 -
 2 files changed, 69 insertions(+), 58 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index fcc8702..96d301e 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -818,9 +818,24 @@ static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_target_port *target,
 }
 
 /*
+ * Return an IU and possible credit to the free pool
+ */
+static void srp_put_tx_iu(struct srp_target_port *target, struct srp_iu *iu,
+			  enum srp_iu_type iu_type)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(target->scsi_host->host_lock, flags);
+	list_add(&iu->list, &target->free_tx);
+	if (iu_type != SRP_IU_RSP)
+		++target->req_lim;
+	spin_unlock_irqrestore(target->scsi_host->host_lock, flags);
+}
+
+/*
  * Must be called with target->scsi_host->host_lock held to protect
- * req_lim and free_tx.  Lock cannot be dropped between call here and
- * call to __srp_post_send().
+ * req_lim and free_tx.  If IU is not sent, it must be returned using
+ * srp_put_tx_iu().
  *
  * Note:
  * An upper limit for the number of allocated information units for each
@@ -843,26 +858,25 @@ static struct srp_iu *__srp_get_tx_iu(struct srp_target_port *target,
 		return NULL;
 
 	/* Initiator responses to target requests do not consume credits */
-	if (target->req_lim <= rsv && iu_type != SRP_IU_RSP) {
-		++target->zero_req_lim;
-		return NULL;
+	if (iu_type != SRP_IU_RSP) {
+		if (target->req_lim <= rsv) {
+			++target->zero_req_lim;
+			return NULL;
+		}
+
+		--target->req_lim;
 	}
 
 	iu = list_first_entry(&target->free_tx, struct srp_iu, list);
-	iu->type = iu_type;
+	list_del_init(&iu->list);
 	return iu;
 }
 
-/*
- * Must be called with target->scsi_host->host_lock held to protect
- * req_lim and free_tx.
- */
-static int __srp_post_send(struct srp_target_port *target,
-			   struct srp_iu *iu, int len)
+static int srp_post_send(struct srp_target_port *target,
+			 struct srp_iu *iu, int len)
 {
 	struct ib_sge list;
 	struct ib_send_wr wr, *bad_wr;
-	int ret = 0;
 
 	list.addr   = iu->dma;
 	list.length = len;
@@ -875,15 +889,7 @@ static int __srp_post_send(struct srp_target_port *target,
 	wr.opcode     = IB_WR_SEND;
 	wr.send_flags = IB_SEND_SIGNALED;
 
-	ret = ib_post_send(target->qp, &wr, &bad_wr);
-
-	if (!ret) {
-		list_del_init(&iu->list);
-		if (iu->type != SRP_IU_RSP)
-			--target->req_lim;
-	}
-
-	return ret;
+	return ib_post_send(target->qp, &wr, &bad_wr);
 }
 
 static int srp_post_recv(struct srp_target_port *target, struct srp_iu *iu)
@@ -953,34 +959,33 @@ static void srp_process_rsp(struct srp_target_port *target, struct srp_rsp *rsp)
 static int srp_response_common(struct srp_target_port *target, s32 req_delta,
 			       void *rsp, int len)
 {
-	struct ib_device *dev;
+	struct ib_device *dev = target->srp_host->srp_dev->dev;
 	unsigned long flags;
 	struct srp_iu *iu;
-	int err = 1;
-
-	dev = target->srp_host->srp_dev->dev;
+	int err;
 
 	spin_lock_irqsave(target->scsi_host->host_lock, flags);
 	target->req_lim += req_delta;
-
 	iu = __srp_get_tx_iu(target, SRP_IU_RSP);
+	spin_unlock_irqrestore(target->scsi_host->host_lock, flags);
+
 	if (!iu) {
 		shost_printk(KERN_ERR, target->scsi_host, PFX
 			     "no IU available to send response\n");
-		goto out;
+		return 1;
 	}
 
 	ib_dma_sync_single_for_cpu(dev, iu->dma, len, DMA_TO_DEVICE);
 	memcpy(iu->buf, rsp, len);
 	ib_dma_sync_single_for_device(dev, iu->dma, len, DMA_TO_DEVICE);
 
-	err = __srp_post_send(target, iu, len);
-	if (err)
+	err = srp_post_send(target, iu, len);
+	if (err) {
 		shost_printk(KERN_ERR, target->scsi_host, PFX
 			     "unable to post response: %d\n", err);
+		srp_put_tx_iu(target, iu, SRP_IU_RSP);
+	}
 
-out:
-	spin_unlock_irqrestore(target->scsi_host->host_lock, flags);
 	return err;
 }
 
@@ -1107,14 +1112,14 @@ static void srp_send_completion(struct ib_cq *cq, void *target_ptr)
 	}
 }
 
-static int srp_queuecommand_lck(struct scsi_cmnd *scmnd,
-			    void (*done)(struct scsi_cmnd *))
+static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
 {
-	struct srp_target_port *target = host_to_target(scmnd->device->host);
+	struct srp_target_port *target = host_to_target(shost);
 	struct srp_request *req;
 	struct srp_iu *iu;
 	struct srp_cmd *cmd;
 	struct ib_device *dev;
+	unsigned long flags;
 	int len;
 
 	if (target->state == SRP_TARGET_CONNECTING)
@@ -1123,11 +1128,21 @@ static int srp_queuecommand_lck(struct scsi_cmnd *scmnd,
 	if (target->state == SRP_TARGET_DEAD ||
 	    target->state == SRP_TARGET_REMOVED) {
 		scmnd->result = DID_BAD_TARGET << 16;
-		done(scmnd);
+		scmnd->scsi_done(scmnd);
 		return 0;
 	}
 
+	spin_lock_irqsave(shost->host_lock, flags);
+	/* This goes away once the scsi_eh routines stop testing it. */
+	scsi_cmd_get_serial(shost, scmnd);
 	iu = __srp_get_tx_iu(target, SRP_IU_CMD);
+	if (iu) {
+		req = list_first_entry(&target->free_reqs, struct srp_request,
+				      list);
+		list_del_init(&req->list);
+	}
+	spin_unlock_irqrestore(shost->host_lock, flags);
+
 	if (!iu)
 		goto err;
 
@@ -1135,9 +1150,6 @@ static int srp_queuecommand_lck(struct scsi_cmnd *scmnd,
 	ib_dma_sync_single_for_cpu(dev, iu->dma, srp_max_iu_len,
 				   DMA_TO_DEVICE);
 
-	req = list_first_entry(&target->free_reqs, struct srp_request, list);
-
-	scmnd->scsi_done     = done;
 	scmnd->result        = 0;
 	scmnd->host_scribble = (void *) req;
 
@@ -1156,30 +1168,33 @@ static int srp_queuecommand_lck(struct scsi_cmnd *scmnd,
 	if (len < 0) {
 		shost_printk(KERN_ERR, target->scsi_host,
 			     PFX "Failed to map data\n");
-		goto err;
+		goto err_iu;
 	}
 
 	ib_dma_sync_single_for_device(dev, iu->dma, srp_max_iu_len,
 				      DMA_TO_DEVICE);
 
-	if (__srp_post_send(target, iu, len)) {
+	if (srp_post_send(target, iu, len)) {
 		shost_printk(KERN_ERR, target->scsi_host, PFX "Send failed\n");
 		goto err_unmap;
 	}
 
-	list_del_init(&req->list);
-
 	return 0;
 
 err_unmap:
 	srp_unmap_data(scmnd, target, req);
 
+err_iu:
+	srp_put_tx_iu(target, iu, SRP_IU_CMD);
+
+	spin_lock_irqsave(shost->host_lock, flags);
+	list_add(&req->list, &target->free_reqs);
+	spin_unlock_irqrestore(shost->host_lock, flags);
+
 err:
 	return SCSI_MLQUEUE_HOST_BUSY;
 }
 
-static DEF_SCSI_QCMD(srp_queuecommand)
-
 static int srp_alloc_iu_bufs(struct srp_target_port *target)
 {
 	int i;
@@ -1434,17 +1449,18 @@ static int srp_send_tsk_mgmt(struct srp_target_port *target,
 	struct srp_iu *iu;
 	struct srp_tsk_mgmt *tsk_mgmt;
 
-	spin_lock_irq(target->scsi_host->host_lock);
-
 	if (target->state == SRP_TARGET_DEAD ||
 	    target->state == SRP_TARGET_REMOVED)
-		goto out;
+		return -1;
 
 	init_completion(&target->tsk_mgmt_done);
 
+	spin_lock_irq(target->scsi_host->host_lock);
 	iu = __srp_get_tx_iu(target, SRP_IU_TSK_MGMT);
+	spin_unlock_irq(target->scsi_host->host_lock);
+
 	if (!iu)
-		goto out;
+		return -1;
 
 	ib_dma_sync_single_for_cpu(dev, iu->dma, sizeof *tsk_mgmt,
 				   DMA_TO_DEVICE);
@@ -1459,20 +1475,16 @@ static int srp_send_tsk_mgmt(struct srp_target_port *target,
 
 	ib_dma_sync_single_for_device(dev, iu->dma, sizeof *tsk_mgmt,
 				      DMA_TO_DEVICE);
-	if (__srp_post_send(target, iu, sizeof *tsk_mgmt))
-		goto out;
-
-	spin_unlock_irq(target->scsi_host->host_lock);
+	if (srp_post_send(target, iu, sizeof *tsk_mgmt)) {
+		srp_put_tx_iu(target, iu, SRP_IU_TSK_MGMT);
+		return -1;
+	}
 
 	if (!wait_for_completion_timeout(&target->tsk_mgmt_done,
 					 msecs_to_jiffies(SRP_ABORT_TIMEOUT_MS)))
 		return -1;
 
 	return 0;
-
-out:
-	spin_unlock_irq(target->scsi_host->host_lock);
-	return -1;
 }
 
 static int srp_abort(struct scsi_cmnd *scmnd)
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index b3808dd..e2b1719 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -168,7 +168,6 @@ struct srp_iu {
 	void		       *buf;
 	size_t			size;
 	enum dma_data_direction	direction;
-	enum srp_iu_type	type;
 };
 
 #endif /* IB_SRP_H */
-- 
1.7.2.3

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC 6/8] IB/srp: reduce lock coverage of command completion
       [not found] ` <1293139893-11678-1-git-send-email-dillowda-1Heg1YXhbW8@public.gmane.org>
                     ` (4 preceding siblings ...)
  2010-12-23 21:31   ` [RFC 5/8] IB/srp: reduce lock coverage for command submission and EH David Dillow
@ 2010-12-23 21:31   ` David Dillow
  2011-01-02 17:27     ` Bart Van Assche
  2010-12-23 21:31   ` [RFC 7/8] IB/srp: stop sharing the host lock with SCSI David Dillow
  2010-12-23 21:31   ` [RFC 8/8] IB/srp: consolidate hot-path variables into cache lines David Dillow
  7 siblings, 1 reply; 20+ messages in thread
From: David Dillow @ 2010-12-23 21:31 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-scsi-u79uwXL29TY76Z2rM5mHXA, Bart Van Assche

We only need the lock to cover list and credit manipulations, so push
those into srp_remove_req() and update the call chains.

We reorder the request removal and command completion in
srp_process_rsp() to avoid the SCSI mid-layer sending another command
before we've released our request and added any credits returned by the
target. This prevents us from returning HOST_BUSY unnecessarily.

This is a break-out of Bart Van Assche's work, modified to avoid
HOST_BUSY.
---
 drivers/infiniband/ulp/srp/ib_srp.c |   34 ++++++++++------------------------
 1 files changed, 10 insertions(+), 24 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 96d301e..4e8ae59 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -549,18 +549,24 @@ static void srp_unmap_data(struct scsi_cmnd *scmnd,
 			scsi_sg_count(scmnd), scmnd->sc_data_direction);
 }
 
-static void srp_remove_req(struct srp_target_port *target, struct srp_request *req)
+static void srp_remove_req(struct srp_target_port *target,
+			   struct srp_request *req, s32 req_lim_delta)
 {
+	unsigned long flags;
+
 	srp_unmap_data(req->scmnd, target, req);
+	spin_lock_irqsave(target->scsi_host->host_lock, flags);
+	target->req_lim += req_lim_delta;
 	req->scmnd = NULL;
 	list_move_tail(&req->list, &target->free_reqs);
+	spin_unlock_irqrestore(target->scsi_host->host_lock, flags);
 }
 
 static void srp_reset_req(struct srp_target_port *target, struct srp_request *req)
 {
 	req->scmnd->result = DID_RESET << 16;
 	req->scmnd->scsi_done(req->scmnd);
-	srp_remove_req(target, req);
+	srp_remove_req(target, req, 0);
 }
 
 static int srp_reconnect_target(struct srp_target_port *target)
@@ -595,13 +601,11 @@ static int srp_reconnect_target(struct srp_target_port *target)
 	while (ib_poll_cq(target->send_cq, 1, &wc) > 0)
 		; /* nothing */
 
-	spin_lock_irq(target->scsi_host->host_lock);
 	for (i = 0; i < SRP_CMD_SQ_SIZE; ++i) {
 		struct srp_request *req = &target->req_ring[i];
 		if (req->scmnd)
 			srp_reset_req(target, req);
 	}
-	spin_unlock_irq(target->scsi_host->host_lock);
 
 	list_del_init(&target->free_tx);
 	for (i = 0; i < SRP_SQ_SIZE; ++i)
@@ -913,14 +917,6 @@ static void srp_process_rsp(struct srp_target_port *target, struct srp_rsp *rsp)
 {
 	struct srp_request *req;
 	struct scsi_cmnd *scmnd;
-	unsigned long flags;
-	s32 delta;
-
-	delta = (s32) be32_to_cpu(rsp->req_lim_delta);
-
-	spin_lock_irqsave(target->scsi_host->host_lock, flags);
-
-	target->req_lim += delta;
 
 	if (unlikely(rsp->tag & SRP_TAG_TSK_MGMT)) {
 		target->tsk_mgmt_status = -1;
@@ -948,12 +944,10 @@ static void srp_process_rsp(struct srp_target_port *target, struct srp_rsp *rsp)
 		else if (rsp->flags & (SRP_RSP_FLAG_DIOVER | SRP_RSP_FLAG_DIUNDER))
 			scsi_set_resid(scmnd, be32_to_cpu(rsp->data_in_res_cnt));
 
+		srp_remove_req(target, req, be32_to_cpu(rsp->req_lim_delta));
 		scmnd->host_scribble = NULL;
 		scmnd->scsi_done(scmnd);
-		srp_remove_req(target, req);
 	}
-
-	spin_unlock_irqrestore(target->scsi_host->host_lock, flags);
 }
 
 static int srp_response_common(struct srp_target_port *target, s32 req_delta,
@@ -1501,18 +1495,14 @@ static int srp_abort(struct scsi_cmnd *scmnd)
 			      SRP_TSK_ABORT_TASK))
 		return FAILED;
 
-	spin_lock_irq(target->scsi_host->host_lock);
-
 	if (req->scmnd) {
 		if (!target->tsk_mgmt_status) {
-			srp_remove_req(target, req);
+			srp_remove_req(target, req, 0);
 			scmnd->result = DID_ABORT << 16;
 		} else
 			ret = FAILED;
 	}
 
-	spin_unlock_irq(target->scsi_host->host_lock);
-
 	return ret;
 }
 
@@ -1531,16 +1521,12 @@ static int srp_reset_device(struct scsi_cmnd *scmnd)
 	if (target->tsk_mgmt_status)
 		return FAILED;
 
-	spin_lock_irq(target->scsi_host->host_lock);
-
 	for (i = 0; i < SRP_CMD_SQ_SIZE; ++i) {
 		struct srp_request *req = &target->req_ring[i];
 		if (req->scmnd && req->scmnd->device == scmnd->device)
 			srp_reset_req(target, req);
 	}
 
-	spin_unlock_irq(target->scsi_host->host_lock);
-
 	return SUCCESS;
 }
 
-- 
1.7.2.3


* [RFC 7/8] IB/srp: stop sharing the host lock with SCSI
       [not found] ` <1293139893-11678-1-git-send-email-dillowda-1Heg1YXhbW8@public.gmane.org>
                     ` (5 preceding siblings ...)
  2010-12-23 21:31   ` [RFC 6/8] IB/srp: reduce lock coverage of command completion David Dillow
@ 2010-12-23 21:31   ` David Dillow
  2010-12-27 17:49     ` David Dillow
  2010-12-23 21:31   ` [RFC 8/8] IB/srp: consolidate hot-path variables into cache lines David Dillow
  7 siblings, 1 reply; 20+ messages in thread
From: David Dillow @ 2010-12-23 21:31 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-scsi-u79uwXL29TY76Z2rM5mHXA, Bart Van Assche

We don't need protection against the SCSI stack, so use our own lock to
allow parallel progress on separate CPUs.

This is a break-out of Bart Van Assche's work.
---
 drivers/infiniband/ulp/srp/ib_srp.c |   42 +++++++++++++++++-----------------
 drivers/infiniband/ulp/srp/ib_srp.h |    2 +
 2 files changed, 23 insertions(+), 21 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 4e8ae59..fff72c8 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -447,12 +447,12 @@ static bool srp_change_state(struct srp_target_port *target,
 {
 	int changed = false;
 
-	spin_lock_irq(target->scsi_host->host_lock);
+	spin_lock_irq(&target->lock);
 	if (target->state == old) {
 		target->state = new;
 		changed = true;
 	}
-	spin_unlock_irq(target->scsi_host->host_lock);
+	spin_unlock_irq(&target->lock);
 	return changed;
 }
 
@@ -555,11 +555,11 @@ static void srp_remove_req(struct srp_target_port *target,
 	unsigned long flags;
 
 	srp_unmap_data(req->scmnd, target, req);
-	spin_lock_irqsave(target->scsi_host->host_lock, flags);
+	spin_lock_irqsave(&target->lock, flags);
 	target->req_lim += req_lim_delta;
 	req->scmnd = NULL;
 	list_move_tail(&req->list, &target->free_reqs);
-	spin_unlock_irqrestore(target->scsi_host->host_lock, flags);
+	spin_unlock_irqrestore(&target->lock, flags);
 }
 
 static void srp_reset_req(struct srp_target_port *target, struct srp_request *req)
@@ -634,13 +634,13 @@ err:
 	 * Schedule our work inside the lock to avoid a race with
 	 * the flush_scheduled_work() in srp_remove_one().
 	 */
-	spin_lock_irq(target->scsi_host->host_lock);
+	spin_lock_irq(&target->lock);
 	if (target->state == SRP_TARGET_CONNECTING) {
 		target->state = SRP_TARGET_DEAD;
 		INIT_WORK(&target->work, srp_remove_work);
 		schedule_work(&target->work);
 	}
-	spin_unlock_irq(target->scsi_host->host_lock);
+	spin_unlock_irq(&target->lock);
 
 	return ret;
 }
@@ -829,17 +829,16 @@ static void srp_put_tx_iu(struct srp_target_port *target, struct srp_iu *iu,
 {
 	unsigned long flags;
 
-	spin_lock_irqsave(target->scsi_host->host_lock, flags);
+	spin_lock_irqsave(&target->lock, flags);
 	list_add(&iu->list, &target->free_tx);
 	if (iu_type != SRP_IU_RSP)
 		++target->req_lim;
-	spin_unlock_irqrestore(target->scsi_host->host_lock, flags);
+	spin_unlock_irqrestore(&target->lock, flags);
 }
 
 /*
- * Must be called with target->scsi_host->host_lock held to protect
- * req_lim and free_tx.  If IU is not sent, it must be returned using
- * srp_put_tx_iu().
+ * Must be called with target->lock held to protect req_lim and free_tx.
+ * If IU is not sent, it must be returned using srp_put_tx_iu().
  *
  * Note:
  * An upper limit for the number of allocated information units for each
@@ -958,10 +957,10 @@ static int srp_response_common(struct srp_target_port *target, s32 req_delta,
 	struct srp_iu *iu;
 	int err;
 
-	spin_lock_irqsave(target->scsi_host->host_lock, flags);
+	spin_lock_irqsave(&target->lock, flags);
 	target->req_lim += req_delta;
 	iu = __srp_get_tx_iu(target, SRP_IU_RSP);
-	spin_unlock_irqrestore(target->scsi_host->host_lock, flags);
+	spin_unlock_irqrestore(&target->lock, flags);
 
 	if (!iu) {
 		shost_printk(KERN_ERR, target->scsi_host, PFX
@@ -1126,7 +1125,7 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
 		return 0;
 	}
 
-	spin_lock_irqsave(shost->host_lock, flags);
+	spin_lock_irqsave(&target->lock, flags);
 	/* This goes away once the scsi_eh routines stop testing it. */
 	scsi_cmd_get_serial(shost, scmnd);
 	iu = __srp_get_tx_iu(target, SRP_IU_CMD);
@@ -1135,7 +1134,7 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
 				      list);
 		list_del_init(&req->list);
 	}
-	spin_unlock_irqrestore(shost->host_lock, flags);
+	spin_unlock_irqrestore(&target->lock, flags);
 
 	if (!iu)
 		goto err;
@@ -1181,9 +1180,9 @@ err_unmap:
 err_iu:
 	srp_put_tx_iu(target, iu, SRP_IU_CMD);
 
-	spin_lock_irqsave(shost->host_lock, flags);
+	spin_lock_irqsave(&target->lock, flags);
 	list_add(&req->list, &target->free_reqs);
-	spin_unlock_irqrestore(shost->host_lock, flags);
+	spin_unlock_irqrestore(&target->lock, flags);
 
 err:
 	return SCSI_MLQUEUE_HOST_BUSY;
@@ -1449,9 +1448,9 @@ static int srp_send_tsk_mgmt(struct srp_target_port *target,
 
 	init_completion(&target->tsk_mgmt_done);
 
-	spin_lock_irq(target->scsi_host->host_lock);
+	spin_lock_irq(&target->lock);
 	iu = __srp_get_tx_iu(target, SRP_IU_TSK_MGMT);
-	spin_unlock_irq(target->scsi_host->host_lock);
+	spin_unlock_irq(&target->lock);
 
 	if (!iu)
 		return -1;
@@ -1955,6 +1954,7 @@ static ssize_t srp_create_target(struct device *dev,
 	target->scsi_host  = target_host;
 	target->srp_host   = host;
 
+	spin_lock_init(&target->lock);
 	INIT_LIST_HEAD(&target->free_tx);
 	INIT_LIST_HEAD(&target->free_reqs);
 	for (i = 0; i < SRP_CMD_SQ_SIZE; ++i) {
@@ -2184,9 +2184,9 @@ static void srp_remove_one(struct ib_device *device)
 		 */
 		spin_lock(&host->target_lock);
 		list_for_each_entry(target, &host->target_list, list) {
-			spin_lock_irq(target->scsi_host->host_lock);
+			spin_lock_irq(&target->lock);
 			target->state = SRP_TARGET_REMOVED;
-			spin_unlock_irq(target->scsi_host->host_lock);
+			spin_unlock_irq(&target->lock);
 		}
 		spin_unlock(&host->target_lock);
 
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index e2b1719..c4699ea 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -144,6 +144,8 @@ struct srp_target_port {
 
 	struct srp_iu	       *rx_ring[SRP_RQ_SIZE];
 
+	spinlock_t		lock;
+
 	struct list_head	free_tx;
 	struct srp_iu	       *tx_ring[SRP_SQ_SIZE];
 
-- 
1.7.2.3


* [RFC 8/8] IB/srp: consolidate hot-path variables into cache lines
       [not found] ` <1293139893-11678-1-git-send-email-dillowda-1Heg1YXhbW8@public.gmane.org>
                     ` (6 preceding siblings ...)
  2010-12-23 21:31   ` [RFC 7/8] IB/srp: stop sharing the host lock with SCSI David Dillow
@ 2010-12-23 21:31   ` David Dillow
  2011-01-05 18:13     ` Roland Dreier
  7 siblings, 1 reply; 20+ messages in thread
From: David Dillow @ 2010-12-23 21:31 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-scsi-u79uwXL29TY76Z2rM5mHXA, Bart Van Assche

Put the variables accessed together in the hot-path into common
cachelines, and separate them by RW vs RO to avoid false dirtying.
We keep a local copy of the lkey and rkey in the target to avoid
traversing pointers (and associated cache lines) to find them.
---
 drivers/infiniband/ulp/srp/ib_srp.c |   12 +++++++-----
 drivers/infiniband/ulp/srp/ib_srp.h |   31 +++++++++++++++++++------------
 2 files changed, 26 insertions(+), 17 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index fff72c8..cef6191 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -768,7 +768,7 @@ static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_target_port *target,
 		struct srp_direct_buf *buf = (void *) cmd->add_data;
 
 		buf->va  = cpu_to_be64(ib_sg_dma_address(ibdev, scat));
-		buf->key = cpu_to_be32(dev->mr->rkey);
+		buf->key = cpu_to_be32(target->rkey);
 		buf->len = cpu_to_be32(ib_sg_dma_len(ibdev, scat));
 	} else if (srp_map_fmr(target, scat, count, req,
 			       (void *) cmd->add_data)) {
@@ -793,7 +793,7 @@ static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_target_port *target,
 			buf->desc_list[i].va  =
 				cpu_to_be64(ib_sg_dma_address(ibdev, sg));
 			buf->desc_list[i].key =
-				cpu_to_be32(dev->mr->rkey);
+				cpu_to_be32(target->rkey);
 			buf->desc_list[i].len = cpu_to_be32(dma_len);
 			datalen += dma_len;
 		}
@@ -806,7 +806,7 @@ static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_target_port *target,
 		buf->table_desc.va  =
 			cpu_to_be64(req->cmd->dma + sizeof *cmd + sizeof *buf);
 		buf->table_desc.key =
-			cpu_to_be32(target->srp_host->srp_dev->mr->rkey);
+			cpu_to_be32(target->rkey);
 		buf->table_desc.len =
 			cpu_to_be32(count * sizeof (struct srp_direct_buf));
 
@@ -883,7 +883,7 @@ static int srp_post_send(struct srp_target_port *target,
 
 	list.addr   = iu->dma;
 	list.length = len;
-	list.lkey   = target->srp_host->srp_dev->mr->lkey;
+	list.lkey   = target->lkey;
 
 	wr.next       = NULL;
 	wr.wr_id      = (uintptr_t) iu;
@@ -902,7 +902,7 @@ static int srp_post_recv(struct srp_target_port *target, struct srp_iu *iu)
 
 	list.addr   = iu->dma;
 	list.length = iu->size;
-	list.lkey   = target->srp_host->srp_dev->mr->lkey;
+	list.lkey   = target->lkey;
 
 	wr.next     = NULL;
 	wr.wr_id    = (uintptr_t) iu;
@@ -1953,6 +1953,8 @@ static ssize_t srp_create_target(struct device *dev,
 	target->io_class   = SRP_REV16A_IB_IO_CLASS;
 	target->scsi_host  = target_host;
 	target->srp_host   = host;
+	target->lkey	   = host->srp_dev->mr->lkey;
+	target->rkey	   = host->srp_dev->mr->rkey;
 
 	spin_lock_init(&target->lock);
 	INIT_LIST_HEAD(&target->free_tx);
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index c4699ea..43f9129 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -117,6 +117,24 @@ struct srp_request {
 };
 
 struct srp_target_port {
+	/* These are RW in the hot path, and commonly used together */
+	struct list_head	free_tx;
+	struct list_head	free_reqs;
+	spinlock_t		lock;
+	s32			req_lim;
+
+	/* These are read-only in the hot path */
+	struct ib_cq	       *send_cq ____cacheline_aligned;
+	struct ib_cq	       *recv_cq;
+	struct ib_qp	       *qp;
+	u32			lkey;
+	u32			rkey;
+	enum srp_target_state	state;
+
+	/* Everything above this point is used in the hot path of
+	 * command processing. Try to keep them packed into cachelines.
+	 */
+
 	__be64			id_ext;
 	__be64			ioc_guid;
 	__be64			service_id;
@@ -133,23 +151,13 @@ struct srp_target_port {
 	int			path_query_id;
 
 	struct ib_cm_id	       *cm_id;
-	struct ib_cq	       *recv_cq;
-	struct ib_cq	       *send_cq;
-	struct ib_qp	       *qp;
 
 	int			max_ti_iu_len;
-	s32			req_lim;
 
 	int			zero_req_lim;
 
-	struct srp_iu	       *rx_ring[SRP_RQ_SIZE];
-
-	spinlock_t		lock;
-
-	struct list_head	free_tx;
 	struct srp_iu	       *tx_ring[SRP_SQ_SIZE];
-
-	struct list_head	free_reqs;
+	struct srp_iu	       *rx_ring[SRP_RQ_SIZE];
 	struct srp_request	req_ring[SRP_CMD_SQ_SIZE];
 
 	struct work_struct	work;
@@ -157,7 +165,6 @@ struct srp_target_port {
 	struct list_head	list;
 	struct completion	done;
 	int			status;
-	enum srp_target_state	state;
 	int			qp_in_error;
 
 	struct completion	tsk_mgmt_done;
-- 
1.7.2.3


* Re: [RFC 4/8] IB/srp: don't move active requests to their own list
       [not found]     ` <1293139893-11678-5-git-send-email-dillowda-1Heg1YXhbW8@public.gmane.org>
@ 2010-12-27 15:50       ` Bart Van Assche
       [not found]         ` <AANLkTinuOD919iV+ixcWhZpSkJP1QP=+OyphzVSq4_aO-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Bart Van Assche @ 2010-12-27 15:50 UTC (permalink / raw)
  To: David Dillow
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-scsi-u79uwXL29TY76Z2rM5mHXA

On Thu, Dec 23, 2010 at 10:31 PM, David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org> wrote:
> We use req->scmnd != NULL to indicate an active request, so there's no
> need to keep a separate list for them. We can afford the array iteration
> during error handling, and dropping it gives us one less item that needs
> lock protection.
> [ ... ]

Before this patch a request was either on the free_reqs list or on the
req_queue list. After this patch has been applied a request is either
on the free_reqs list or not on any list. So it doesn't make sense
anymore to invoke list_move_tail() from srp_remove_req() - that
function should be modified such that it invokes list_add_tail(). And
once that is done the list_del_init() call added by this patch can be
replaced by a list_del() call.

Bart.

* Re: [RFC 4/8] IB/srp: don't move active requests to their own list
       [not found]         ` <AANLkTinuOD919iV+ixcWhZpSkJP1QP=+OyphzVSq4_aO-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2010-12-27 17:46           ` David Dillow
  0 siblings, 0 replies; 20+ messages in thread
From: David Dillow @ 2010-12-27 17:46 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-scsi-u79uwXL29TY76Z2rM5mHXA

On Mon, 2010-12-27 at 16:50 +0100, Bart Van Assche wrote:
> On Thu, Dec 23, 2010 at 10:31 PM, David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org> wrote:
> > We use req->scmnd != NULL to indicate an active request, so there's no
> > need to keep a separate list for them. We can afford the array iteration
> > during error handling, and dropping it gives us one less item that needs
> > lock protection.
> > [ ... ]
> 
> Before this patch a request was either on the free_reqs list or on the
> req_queue list. After this patch has been applied a request is either
> on the free_reqs list or not on any list. So it doesn't make sense
> anymore to invoke list_move_tail() from srp_remove_req() - that
> function should be modified such that it invokes list_add_tail(). And
> once that is done the list_del_init() call added by this patch can be
> replaced by a list_del() call.

Good suggestion.


* Re: [RFC 7/8] IB/srp: stop sharing the host lock with SCSI
  2010-12-23 21:31   ` [RFC 7/8] IB/srp: stop sharing the host lock with SCSI David Dillow
@ 2010-12-27 17:49     ` David Dillow
  2010-12-27 17:56       ` Boaz Harrosh
  0 siblings, 1 reply; 20+ messages in thread
From: David Dillow @ 2010-12-27 17:49 UTC (permalink / raw)
  To: linux-rdma; +Cc: linux-scsi, Bart Van Assche

On Thu, 2010-12-23 at 16:31 -0500, David Dillow wrote:
> We don't need protection against the SCSI stack, so use our own lock to
> allow parallel progress on separate CPUs.

> @@ -1126,7 +1125,7 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
>  		return 0;
>  	}
>  
> -	spin_lock_irqsave(shost->host_lock, flags);
> +	spin_lock_irqsave(&target->lock, flags);
>  	/* This goes away once the scsi_eh routines stop testing it. */
>  	scsi_cmd_get_serial(shost, scmnd);
>  	iu = __srp_get_tx_iu(target, SRP_IU_CMD);

This one may be a bit problematic -- I need to look at the required
locking to avoid a race with the serial number. Of course, there's an
easy fix -- if this patch lands after the patches to remove the serial
number check from the error handler, then there's no reason to get a
serial number in the initiator.



* Re: [RFC 7/8] IB/srp: stop sharing the host lock with SCSI
  2010-12-27 17:49     ` David Dillow
@ 2010-12-27 17:56       ` Boaz Harrosh
       [not found]         ` <4D18D355.30308-C4P08NqkoRlBDgjK7y7TUQ@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Boaz Harrosh @ 2010-12-27 17:56 UTC (permalink / raw)
  To: David Dillow; +Cc: linux-rdma, linux-scsi, Bart Van Assche

On 12/27/2010 07:49 PM, David Dillow wrote:
> On Thu, 2010-12-23 at 16:31 -0500, David Dillow wrote:
>> We don't need protection against the SCSI stack, so use our own lock to
>> allow parallel progress on separate CPUs.
> 
>> @@ -1126,7 +1125,7 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
>>  		return 0;
>>  	}
>>  
>> -	spin_lock_irqsave(shost->host_lock, flags);
>> +	spin_lock_irqsave(&target->lock, flags);
>>  	/* This goes away once the scsi_eh routines stop testing it. */
>>  	scsi_cmd_get_serial(shost, scmnd);
>>  	iu = __srp_get_tx_iu(target, SRP_IU_CMD);
> 
> This one may be a bit problematic -- I need to look at the required
> locking to avoid a race with the serial number. Of course, there's an
> easy fix -- if this patch lands after the patches to remove the serial
> number check from the error handler, then there's no reason to get a
> serial number in the initiator.
> 

$ git log --oneline -1 linus/master -- drivers/scsi/scsi_error.c
459dbf7 [SCSI] Eliminate error handler overload of the SCSI serial number

Cheers


* Re: [RFC 7/8] IB/srp: stop sharing the host lock with SCSI
       [not found]         ` <4D18D355.30308-C4P08NqkoRlBDgjK7y7TUQ@public.gmane.org>
@ 2010-12-27 18:01           ` David Dillow
       [not found]             ` <1293472913.2896.36.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: David Dillow @ 2010-12-27 18:01 UTC (permalink / raw)
  To: Boaz Harrosh
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-scsi-u79uwXL29TY76Z2rM5mHXA, Bart Van Assche

On Mon, 2010-12-27 at 19:56 +0200, Boaz Harrosh wrote:
> On 12/27/2010 07:49 PM, David Dillow wrote:
> > On Thu, 2010-12-23 at 16:31 -0500, David Dillow wrote:
> >> We don't need protection against the SCSI stack, so use our own lock to
> >> allow parallel progress on separate CPUs.
> > 
> >> @@ -1126,7 +1125,7 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
> >>  		return 0;
> >>  	}
> >>  
> >> -	spin_lock_irqsave(shost->host_lock, flags);
> >> +	spin_lock_irqsave(&target->lock, flags);
> >>  	/* This goes away once the scsi_eh routines stop testing it. */
> >>  	scsi_cmd_get_serial(shost, scmnd);
> >>  	iu = __srp_get_tx_iu(target, SRP_IU_CMD);
> > 
> > This one may be a bit problematic -- I need to look at the required
> > locking to avoid a race with the serial number. Of course, there's an
> > easy fix -- if this patch lands after the patches to remove the serial
> > number check from the error handler, then there's no reason to get a
> > serial number in the initiator.
> > 
> 
> $ git log --oneline -1 linus/master -- drivers/scsi/scsi_error.c
> 459dbf7 [SCSI] Eliminate error handler overload of the SCSI serial number

Oh goodie, I hadn't followed up to see if it was in one of James's later
pulls. Thanks for the heads up -- I'll dump scsi_cmd_get_serial() in
this patch then.



* Re: [RFC 7/8] IB/srp: stop sharing the host lock with SCSI
       [not found]             ` <1293472913.2896.36.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
@ 2010-12-27 18:47               ` Boaz Harrosh
       [not found]                 ` <4D18DF4D.6010607-C4P08NqkoRlBDgjK7y7TUQ@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Boaz Harrosh @ 2010-12-27 18:47 UTC (permalink / raw)
  To: David Dillow
  Cc: linux-rdma, linux-scsi, Bart Van Assche, Nicholas A. Bellinger

On 12/27/2010 08:01 PM, David Dillow wrote:
> On Mon, 2010-12-27 at 19:56 +0200, Boaz Harrosh wrote:
>> On 12/27/2010 07:49 PM, David Dillow wrote:
>>> On Thu, 2010-12-23 at 16:31 -0500, David Dillow wrote:
>>>> We don't need protection against the SCSI stack, so use our own lock to
>>>> allow parallel progress on separate CPUs.
>>>
>>>> @@ -1126,7 +1125,7 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
>>>>  		return 0;
>>>>  	}
>>>>  
>>>> -	spin_lock_irqsave(shost->host_lock, flags);
>>>> +	spin_lock_irqsave(&target->lock, flags);
>>>>  	/* This goes away once the scsi_eh routines stop testing it. */
>>>>  	scsi_cmd_get_serial(shost, scmnd);
>>>>  	iu = __srp_get_tx_iu(target, SRP_IU_CMD);
>>>
>>> This one may be a bit problematic -- I need to look at the required
>>> locking to avoid a race with the serial number. Of course, there's an
>>> easy fix -- if this patch lands after the patches to remove the serial
>>> number check from the error handler, then there's no reason to get a
>>> serial number in the initiator.
>>>
>>
>> $ git log --oneline -1 linus/master -- drivers/scsi/scsi_error.c
>> 459dbf7 [SCSI] Eliminate error handler overload of the SCSI serial number
> 
> Oh goodie, I hadn't followed up to see if it was in one of James's later
> pulls. Thanks for the heads up -- I'll dump scsi_cmd_get_serial() in
> this patch then.
> 

Are these based on James's trees? Or are you maintaining these SCSI bits yourself?

Also CC: Nicholas A. Bellinger just in case he has a matching patch in
his luckless tree.

Thanks
Boaz


* Re: [RFC 7/8] IB/srp: stop sharing the host lock with SCSI
       [not found]                 ` <4D18DF4D.6010607-C4P08NqkoRlBDgjK7y7TUQ@public.gmane.org>
@ 2010-12-27 19:52                   ` Dave Dillow
  2010-12-28 17:38                   ` Bart Van Assche
  1 sibling, 0 replies; 20+ messages in thread
From: Dave Dillow @ 2010-12-27 19:52 UTC (permalink / raw)
  To: Boaz Harrosh
  Cc: linux-rdma, linux-scsi, Bart Van Assche, Nicholas A. Bellinger

On Mon, Dec 27, 2010 at 08:47:41PM +0200, Boaz Harrosh wrote:
> On 12/27/2010 08:01 PM, David Dillow wrote:
> > Oh goodie, I hadn't followed up to see if it was in one of James's later
> > pulls. Thanks for the heads up -- I'll dump scsi_cmd_get_serial() in
> > this patch then.
> > 
> 
> Are these based on James's trees? Or are you maintaining these SCSI bits yourself?

The SRP initiator has historically gone through Roland's InfiniBand tree. I
try to base my work on Linus's tree so that Roland doesn't have to deal with
conflicts when he merges my pull request, but sometimes I may need to work
off other trees and will sort that out during the merge window. It's early yet
in our working relationship, so that hasn't come up.

In this case, these are based on a Linus rc just after the lock pushdown.


* Re: [RFC 7/8] IB/srp: stop sharing the host lock with SCSI
       [not found]                 ` <4D18DF4D.6010607-C4P08NqkoRlBDgjK7y7TUQ@public.gmane.org>
  2010-12-27 19:52                   ` Dave Dillow
@ 2010-12-28 17:38                   ` Bart Van Assche
  1 sibling, 0 replies; 20+ messages in thread
From: Bart Van Assche @ 2010-12-28 17:38 UTC (permalink / raw)
  To: Boaz Harrosh
  Cc: David Dillow, linux-rdma, linux-scsi

On Mon, Dec 27, 2010 at 7:47 PM, Boaz Harrosh wrote:
> [ ... ]
> Are these based on James's trees? Or are you maintaining these SCSI bits yourself?
> [ ... ]

Hi Boaz,

In case you are not reading the linux-rdma mailing list, this message
will be interesting for you:
http://www.mail-archive.com/linux-rdma@vger.kernel.org/msg05950.html

Bart.


* Re: [RFC 6/8] IB/srp: reduce lock coverage of command completion
  2010-12-23 21:31   ` [RFC 6/8] IB/srp: reduce lock coverage of command completion David Dillow
@ 2011-01-02 17:27     ` Bart Van Assche
       [not found]       ` <AANLkTikBkdn6XWgJELaBBFi1ZHen+UwQuumDn+yEPDPg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 20+ messages in thread
From: Bart Van Assche @ 2011-01-02 17:27 UTC (permalink / raw)
  To: David Dillow; +Cc: linux-rdma, linux-scsi

On Thu, Dec 23, 2010 at 10:31 PM, David Dillow <dillowda@ornl.gov> wrote:
> [ ... ]
> @@ -913,14 +917,6 @@ static void srp_process_rsp(struct srp_target_port *target, struct srp_rsp *rsp)
>  {
>        struct srp_request *req;
>        struct scsi_cmnd *scmnd;
> -       unsigned long flags;
> -       s32 delta;
> -
> -       delta = (s32) be32_to_cpu(rsp->req_lim_delta);
> -
> -       spin_lock_irqsave(target->scsi_host->host_lock, flags);
> -
> -       target->req_lim += delta;
>
>        if (unlikely(rsp->tag & SRP_TAG_TSK_MGMT)) {
>                target->tsk_mgmt_status = -1;
> @@ -948,12 +944,10 @@ static void srp_process_rsp(struct srp_target_port *target, struct srp_rsp *rsp)
>                else if (rsp->flags & (SRP_RSP_FLAG_DIOVER | SRP_RSP_FLAG_DIUNDER))
>                        scsi_set_resid(scmnd, be32_to_cpu(rsp->data_in_res_cnt));
>
> +               srp_remove_req(target, req, be32_to_cpu(rsp->req_lim_delta));
>                scmnd->host_scribble = NULL;
>                scmnd->scsi_done(scmnd);
> -               srp_remove_req(target, req);
>        }
> -
> -       spin_unlock_irqrestore(target->scsi_host->host_lock, flags);
>  }
>
>  [ ... ]

Hello Dave,

Before this patch, target->req_lim was updated both when receiving a
response for a regular SRP command and when receiving a response for
an SRP task management command. After this patch, target->req_lim is
only updated in the former case. Are you sure that's correct?

Bart.


* Re: [RFC 6/8] IB/srp: reduce lock coverage of command completion
       [not found]       ` <AANLkTikBkdn6XWgJELaBBFi1ZHen+UwQuumDn+yEPDPg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-01-03  8:44         ` David Dillow
  0 siblings, 0 replies; 20+ messages in thread
From: David Dillow @ 2011-01-03  8:44 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-rdma, linux-scsi

On Sun, 2011-01-02 at 18:27 +0100, Bart Van Assche wrote:
> Before this patch, target->req_lim was updated both when receiving a
> response for a regular SRP command and when receiving a response for
> an SRP task management command. After this patch, target->req_lim is
> only updated in the former case. Are you sure that's correct?

No, that's a bug. Good catch, thanks.
-- 
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office



* Re: [RFC 8/8] IB/srp: consolidate hot-path variables into cache lines
  2010-12-23 21:31   ` [RFC 8/8] IB/srp: consolidate hot-path variables into cache lines David Dillow
@ 2011-01-05 18:13     ` Roland Dreier
  0 siblings, 0 replies; 20+ messages in thread
From: Roland Dreier @ 2011-01-05 18:13 UTC (permalink / raw)
  To: David Dillow; +Cc: linux-rdma, linux-scsi, Bart Van Assche

Oh, I did have one minor comment I forgot about:

 > +	struct ib_cq	       *send_cq ____cacheline_aligned;

This could actually be ____cacheline_aligned_in_smp I think, not that it
matters at all in practice (although maybe someone will build a 1-core
embedded initiator on a memory-constrained system someday??)

 - R.


end of thread, other threads:[~2011-01-05 18:13 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-12-23 21:31 [RFC 0/8] IB/srp: scaling and bug fixes for 2.6.38 David Dillow
     [not found] ` <1293139893-11678-1-git-send-email-dillowda-1Heg1YXhbW8@public.gmane.org>
2010-12-23 21:31   ` [RFC 1/8] IB/srp: allow task management without a previous request David Dillow
2010-12-23 21:31   ` [RFC 2/8] IB/srp: consolidate state change code David Dillow
2010-12-23 21:31   ` [RFC 3/8] IB/srp: allow lockless work posting David Dillow
2010-12-23 21:31   ` [RFC 4/8] IB/srp: don't move active requests to their own list David Dillow
     [not found]     ` <1293139893-11678-5-git-send-email-dillowda-1Heg1YXhbW8@public.gmane.org>
2010-12-27 15:50       ` Bart Van Assche
     [not found]         ` <AANLkTinuOD919iV+ixcWhZpSkJP1QP=+OyphzVSq4_aO-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-12-27 17:46           ` David Dillow
2010-12-23 21:31   ` [RFC 5/8] IB/srp: reduce local coverage for command submission and EH David Dillow
2010-12-23 21:31   ` [RFC 6/8] IB/srp: reduce lock coverage of command completion David Dillow
2011-01-02 17:27     ` Bart Van Assche
     [not found]       ` <AANLkTikBkdn6XWgJELaBBFi1ZHen+UwQuumDn+yEPDPg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-01-03  8:44         ` David Dillow
2010-12-23 21:31   ` [RFC 7/8] IB/srp: stop sharing the host lock with SCSI David Dillow
2010-12-27 17:49     ` David Dillow
2010-12-27 17:56       ` Boaz Harrosh
     [not found]         ` <4D18D355.30308-C4P08NqkoRlBDgjK7y7TUQ@public.gmane.org>
2010-12-27 18:01           ` David Dillow
     [not found]             ` <1293472913.2896.36.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
2010-12-27 18:47               ` Boaz Harrosh
     [not found]                 ` <4D18DF4D.6010607-C4P08NqkoRlBDgjK7y7TUQ@public.gmane.org>
2010-12-27 19:52                   ` Dave Dillow
2010-12-28 17:38                   ` Bart Van Assche
2010-12-23 21:31   ` [RFC 8/8] IB/srp: consolidate hot-path variables into cache lines David Dillow
2011-01-05 18:13     ` Roland Dreier
