linux-scsi.vger.kernel.org archive mirror
From: Boaz Harrosh <bharrosh@panasas.com>
To: James Bottomley <James.Bottomley@SteelEye.com>,
	Jens Axboe <jens.axboe@oracle.com>,
	FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>,
	linux-scsi <linux-scsi@vger.kernel.org>
Subject: [PATCH A3/5] SCSI: sg-chaining over scsi_sgtable
Date: Tue, 24 Jul 2007 11:59:13 +0300
Message-ID: <46A5BF61.20509@panasas.com>
In-Reply-To: <46A5BCB6.9050102@panasas.com>


   Based on Jens' code for sg-chaining, but layered over the scsi_sgtable
   implementation:
   - The previous scsi_{alloc,free}_sgtable() are renamed to
     scsi_{alloc,free}_sgtable_page().
   - The new scsi_{alloc,free}_sgtable() use the above and now support
     sg-chaining across multiple sgtable allocations.
   - Report an arbitrary default of 2048 segments to the block layer.

    From Jens:
      This is what enables large commands. If we need to allocate an
      sgtable that doesn't fit in a single page, allocate several
      SCSI_MAX_SG_SEGMENTS sized tables and chain them together.
      SCSI defaults to large chained sg tables, if the arch supports it.

 Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
---
 drivers/scsi/scsi_lib.c |   89 +++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 87 insertions(+), 2 deletions(-)
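
A rough userspace sketch of the sizing arithmetic behind the chained
allocation loop in this patch, in case it helps review. It is not kernel
code: the sketch_* names are made up for illustration, and the value 128
for the per-table limit and the 4096-byte segment size used in the example
below are assumptions, not values taken from this patch.

```c
#include <assert.h>

/* Assumed per-table entry limit, standing in for SCSI_MAX_SG_SEGMENTS. */
#define SKETCH_MAX_SG_SEGMENTS 128

/*
 * Count how many per-page tables a chained allocation would request.
 * Every table except the last gives up one entry to hold the chain
 * pointer, so it carries only SKETCH_MAX_SG_SEGMENTS - 1 data entries.
 */
static unsigned int sketch_tables_needed(unsigned int sg_count)
{
	unsigned int tables = 0;

	/* A table needs at least one data entry besides the chain entry. */
	assert(SKETCH_MAX_SG_SEGMENTS > 1);

	while (sg_count) {
		unsigned int this;

		if (sg_count > SKETCH_MAX_SG_SEGMENTS)
			this = SKETCH_MAX_SG_SEGMENTS - 1; /* room for chain */
		else
			this = sg_count;

		sg_count -= this;
		tables++;
	}
	return tables;
}

/* Smallest I/O size reachable if every segment maps seg_bytes bytes. */
static unsigned long sketch_min_io_bytes(unsigned int segments,
					 unsigned long seg_bytes)
{
	return segments * seg_bytes;
}
```

Under those assumptions, sketch_tables_needed(2048) walks sixteen tables of
127 data entries plus a final table of 16, and
sketch_min_io_bytes(2048, 4096) gives the 8MB figure quoted in the comment
above SCSI_MAX_SG_CHAIN_SEGMENTS.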

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 262128c..13870b5 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -59,6 +59,12 @@ static inline unsigned scsi_pool_size(int pool)
 	return scsi_sg_pools[pool].size;
 }
 
+/*
+ * I/O limit for archs that have sg chaining. This limit is totally arbitrary;
+ * a setting of 2048 will get you at least 8MB I/Os.
+ */
+#define SCSI_MAX_SG_CHAIN_SEGMENTS	2048
+
 static void scsi_run_queue(struct request_queue *q);
 
 /*
@@ -713,7 +719,7 @@ static unsigned scsi_sgtable_index(unsigned nents)
 	return -1;
 }
 
-struct scsi_sgtable *scsi_alloc_sgtable(int sg_count, gfp_t gfp_mask)
+static struct scsi_sgtable *scsi_alloc_sgtable_page(int sg_count, gfp_t gfp_mask)
 {
 	unsigned int pool = scsi_sgtable_index(sg_count);
 	struct scsi_sgtable *sgt;
@@ -727,12 +733,77 @@ struct scsi_sgtable *scsi_alloc_sgtable(int sg_count, gfp_t gfp_mask)
 	sgt->sg_pool = pool;
 	return sgt;
 }
+
+struct scsi_sgtable *scsi_alloc_sgtable(int sg_count, gfp_t gfp_mask)
+{
+	struct scsi_sgtable *sgt, *prev, *ret;
+
+	if (sg_count <= SCSI_MAX_SG_SEGMENTS)
+		return scsi_alloc_sgtable_page(sg_count, gfp_mask);
+
+	ret = prev = NULL;
+	do {
+		int this;
+
+		if (sg_count > SCSI_MAX_SG_SEGMENTS) {
+			this = SCSI_MAX_SG_SEGMENTS - 1; /* room for chain */
+		} else {
+			this = sg_count;
+		}
+
+		sgt = scsi_alloc_sgtable_page(this, gfp_mask);
+		/*
+		 * FIXME: since the second and later allocations are done
+		 * with ~__GFP_WAIT, they can fail more easily, but nothing
+		 * prevents us from trying smaller pools and chaining
+		 * more arrays. The last patch in the series does just
+		 * that.
+		 */
+		if (unlikely(!sgt))
+			goto enomem;
+
+		/* first loop through, set return value */
+		if (!ret)
+			ret = sgt;
+
+		/* chain previous sglist, if any */
+		if (prev)
+			sg_chain(prev->sglist, scsi_pool_size(prev->sg_pool),
+			                                           sgt->sglist);
+
+		/*
+		 * don't allow subsequent mempool allocs to sleep, it would
+		 * violate the mempool principle.
+		 */
+		gfp_mask &= ~__GFP_WAIT;
+		gfp_mask |= __GFP_HIGH;
+		sg_count -= this;
+		prev = sgt;
+	} while (sg_count);
+
+	return ret;
+enomem:
+	if (ret)
+		scsi_free_sgtable(ret);
+	return NULL;
+}
 EXPORT_SYMBOL(scsi_alloc_sgtable);
 
-void scsi_free_sgtable(struct scsi_sgtable *sgt)
+static void scsi_free_sgtable_page(struct scsi_sgtable *sgt)
 {
 	mempool_free(sgt, scsi_sg_pools[sgt->sg_pool].pool);
 }
+
+void scsi_free_sgtable(struct scsi_sgtable *sgt)
+{
+	do {
+		struct scatterlist *next, *here_last;
+		here_last = &sgt->sglist[scsi_pool_size(sgt->sg_pool) - 1];
+		next = sg_is_chain(here_last) ? sg_chain_ptr(here_last) : NULL;
+		scsi_free_sgtable_page(sgt);
+		sgt = next ? ((struct scsi_sgtable *)next) - 1 : NULL;
+	} while (sgt);
+}
 EXPORT_SYMBOL(scsi_free_sgtable);
 
 /*
@@ -1550,8 +1621,22 @@ struct request_queue *__scsi_alloc_queue(struct Scsi_Host *shost,
 	if (!q)
 		return NULL;
 
+	/*
+	 * this limit is imposed by hardware restrictions
+	 */
 	blk_queue_max_hw_segments(q, shost->sg_tablesize);
+
+	/*
+	 * In the future, sg chaining support will be mandatory and this
+	 * ifdef can then go away. Right now we don't have all archs
+	 * converted, so better keep it safe.
+	 */
+#ifdef ARCH_HAS_SG_CHAIN
+	blk_queue_max_phys_segments(q, SCSI_MAX_SG_CHAIN_SEGMENTS);
+#else
 	blk_queue_max_phys_segments(q, SCSI_MAX_SG_SEGMENTS);
+#endif
+
 	blk_queue_max_sectors(q, shost->max_sectors);
 	blk_queue_bounce_limit(q, scsi_calculate_bounce_limit(shost));
 	blk_queue_segment_boundary(q, shost->dma_boundary);
-- 
1.5.2.2.249.g45fd
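
For readers following the free path in scsi_free_sgtable(), here is a
minimal userspace model of the same walk: read the last entry of the
current table, free the table, then step back from the chained sglist
pointer to the header that owns it. Everything named sketch_* is
hypothetical, calloc()/free() stand in for the mempools, and a plain
pointer field stands in for sg_is_chain()/sg_chain_ptr(). Where the patch
subtracts one struct scsi_sgtable from the sglist pointer (which assumes
the array starts exactly at the end of the header), the sketch uses
offsetof() instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct sketch_sg {
	void *data;              /* payload, unused in this sketch */
	struct sketch_sg *chain; /* non-NULL only on a chain entry */
};

struct sketch_table {
	unsigned int size;         /* capacity of sglist[] */
	struct sketch_sg sglist[]; /* entries follow the header */
};

static int sketch_live_tables; /* tables allocated and not yet freed */

/* Allocate one zeroed table with room for 'size' entries. */
static struct sketch_table *sketch_alloc_page(unsigned int size)
{
	struct sketch_table *t;

	t = calloc(1, sizeof(*t) + size * sizeof(t->sglist[0]));
	if (t) {
		t->size = size;
		sketch_live_tables++;
	}
	return t;
}

/* Recover the owning header from a pointer to its sglist[] array. */
static struct sketch_table *sketch_table_of(struct sketch_sg *sgl)
{
	return (struct sketch_table *)
		((char *)sgl - offsetof(struct sketch_table, sglist));
}

/* Point the last entry of 'prev' at the first entry of 'next'. */
static void sketch_chain(struct sketch_table *prev, struct sketch_table *next)
{
	assert(prev->size > 0);
	prev->sglist[prev->size - 1].chain = next->sglist;
}

/* Free every table in the chain, reading the link before each free. */
static void sketch_free_chain(struct sketch_table *t)
{
	while (t) {
		struct sketch_sg *next = t->sglist[t->size - 1].chain;

		free(t);
		sketch_live_tables--;
		t = next ? sketch_table_of(next) : NULL;
	}
}
```

The sketch zeroes each table on allocation so that the last entry of the
final table reads as "not a chain"; the model depends on that to terminate
the walk.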



Thread overview: 27+ messages
2007-07-24  8:47 [PATCHSET 0/5] Peaceful co-existence of scsi_sgtable and Large IO sg-chaining Boaz Harrosh
2007-07-24  8:52 ` [PATCH AB1/5] SCSI: SG pools allocation cleanup Boaz Harrosh
2007-07-24 13:08   ` Boaz Harrosh
2007-07-25  8:08   ` Boaz Harrosh
2007-07-25  9:05     ` [PATCH AB1/5 ver2] " Boaz Harrosh
2007-07-25  9:06     ` [PATCH A2/5 ver2] SCSI: scsi_sgtable implementation Boaz Harrosh
2007-07-24  8:56 ` [PATCH A2/5] " Boaz Harrosh
2007-07-24  8:59 ` Boaz Harrosh [this message]
2007-07-24  9:01 ` [PATCH B2/5] SCSI: support for allocating large scatterlists Boaz Harrosh
2007-07-24  9:03 ` [PATCH B3/5] SCSI: scsi_sgtable over sg-chainning Boaz Harrosh
2007-07-24  9:16 ` [PATCHSET 0/5] Peaceful co-existence of scsi_sgtable and Large IO sg-chaining FUJITA Tomonori
2007-07-24 10:01   ` Boaz Harrosh
2007-07-24 11:12     ` FUJITA Tomonori
2007-07-24 13:41       ` FUJITA Tomonori
2007-07-24 14:01         ` Benny Halevy
2007-07-24 16:10           ` James Bottomley
2007-07-25  8:26             ` Benny Halevy
2007-07-25  8:42               ` FUJITA Tomonori
2007-07-25 19:22                 ` Boaz Harrosh
2007-07-26 11:33                   ` FUJITA Tomonori
2007-07-31 20:12                   ` Boaz Harrosh
2007-08-05 16:03                     ` FUJITA Tomonori
2007-08-06  7:22                     ` FUJITA Tomonori
2007-08-07  6:55                       ` Jens Axboe
2007-08-07  8:36                         ` FUJITA Tomonori
2007-08-08  7:16                           ` Jens Axboe
2007-07-25 19:50                 ` Boaz Harrosh
