public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] block changes for request-based dm-multipath
@ 2008-09-18 14:43 Kiyoshi Ueda
  2008-09-18 14:45 ` [PATCH 1/3] block: add request update interface Kiyoshi Ueda
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Kiyoshi Ueda @ 2008-09-18 14:43 UTC (permalink / raw)
  To: jens.axboe; +Cc: j-nomura, k-ueda, dm-devel, linux-kernel, linux-scsi

Hi Jens,

The following patches are changes of the block layer for
request-based dm-multipath.
The patches are created on the following commit of for-2.6.28
in linux-2.6-block.
---------------------------------------------------------------
commit 1fd75f9b2e1c273ab145c9bf3e7a8afcbd8be87a
Author: Jens Axboe <jens.axboe@oracle.com>
Date:   Tue Sep 16 15:09:28 2008 -0700

    block: cleanup some of the integrity stuff in blkdev.h
---------------------------------------------------------------

No major changes since the last post (http://lkml.org/lkml/2008/9/12/100),
except for following up the differences from 2.6.27-rc6.
(As I said, I don't include patches for refactoring of similar
 interfaces at this time.  The refactoring patches will be posted
 after the request-based dm-multipath work is done.)

Summary of the patches:
  1/3: block: add a request data completion interface
  2/3: block: add a request submission interface
  3/3: block: add a queue flag for request stacking support

Please review and apply.

Thanks,
Kiyoshi Ueda

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/3] block: add request update interface
  2008-09-18 14:43 [PATCH 0/3] block changes for request-based dm-multipath Kiyoshi Ueda
@ 2008-09-18 14:45 ` Kiyoshi Ueda
  2008-09-18 14:45 ` [PATCH 2/3] block: add request submission interface Kiyoshi Ueda
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Kiyoshi Ueda @ 2008-09-18 14:45 UTC (permalink / raw)
  To: jens.axboe; +Cc: j-nomura, k-ueda, dm-devel, linux-kernel, linux-scsi

This patch adds blk_update_request(), which updates struct request
with completing its data part, but doesn't complete the struct
request itself.
Though it looks like end_that_request_first() of older kernels,
blk_update_request() should be used only by request stacking drivers.

Request-based dm will use it in bio->bi_end_io callback to update
the original request when a data part of a cloned request completes.
Followings are additional background information of why request-based
dm needs this interface.

  - Request stacking drivers can't use blk_end_request() directly from
    the lower driver's completion context (bio->bi_end_io or rq->end_io),
    because some device drivers (e.g. ide) may try to complete
    their request with queue lock held, and it may cause deadlock.
    See below for detailed description of possible deadlock:
    <http://marc.info/?l=linux-kernel&m=120311479108569&w=2>

  - To solve that, request-based dm offloads the completion of
    cloned struct request to softirq context (i.e. using
    blk_complete_request() from rq->end_io).

  - Though it is possible to use the same solution from bio->bi_end_io,
    it will delay the notification of bio completion to the original
    submitter.  Also, it will cause inefficient partial completion,
    because the lower driver can't perform the cloned request anymore
    and request-based dm needs to requeue and redispatch it to
    the lower driver again later.  That's not good.

  - So request-based dm needs blk_update_request() to perform the bio
    completion in the lower driver's completion context, which is more
    efficient.


Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
---
 block/blk-core.c       |   57 +++++++++++++++++++++++++++++++++++++++++--------
 include/linux/blkdev.h |    2 +
 2 files changed, 50 insertions(+), 9 deletions(-)

Index: linux-2.6-block/block/blk-core.c
===================================================================
--- linux-2.6-block.orig/block/blk-core.c
+++ linux-2.6-block/block/blk-core.c
@@ -1798,6 +1798,22 @@ void end_request(struct request *req, in
 }
 EXPORT_SYMBOL(end_request);
 
+static int end_that_request_data(struct request *rq, int error,
+				 unsigned int nr_bytes, unsigned int bidi_bytes)
+{
+	if (rq->bio) {
+		if (__end_that_request_first(rq, error, nr_bytes))
+			return 1;
+
+		/* Bidi request must be completed as a whole */
+		if (blk_bidi_rq(rq) &&
+		    __end_that_request_first(rq->next_rq, error, bidi_bytes))
+			return 1;
+	}
+
+	return 0;
+}
+
 /**
  * blk_end_io - Generic end_io function to complete a request.
  * @rq:           the request being processed
@@ -1824,15 +1840,8 @@ static int blk_end_io(struct request *rq
 	struct request_queue *q = rq->q;
 	unsigned long flags = 0UL;
 
-	if (rq->bio) {
-		if (__end_that_request_first(rq, error, nr_bytes))
-			return 1;
-
-		/* Bidi request must be completed as a whole */
-		if (blk_bidi_rq(rq) &&
-		    __end_that_request_first(rq->next_rq, error, bidi_bytes))
-			return 1;
-	}
+	if (end_that_request_data(rq, error, nr_bytes, bidi_bytes))
+		return 1;
 
 	/* Special feature for tricky drivers */
 	if (drv_callback && drv_callback(rq))
@@ -1915,6 +1924,36 @@ int blk_end_bidi_request(struct request 
 EXPORT_SYMBOL_GPL(blk_end_bidi_request);
 
 /**
+ * blk_update_request - Special helper function for request stacking drivers
+ * @rq:           the request being processed
+ * @error:        %0 for success, < %0 for error
+ * @nr_bytes:     number of bytes to complete @rq
+ *
+ * Description:
+ *     Ends I/O on a number of bytes attached to @rq, but doesn't complete
+ *     the request structure even if @rq doesn't have leftover.
+ *     If @rq has leftover, sets it up for the next range of segments.
+ *
+ *     This special helper function is only for request stacking drivers
+ *     (e.g. request-based dm) so that they can handle partial completion.
+ *     Actual device drivers should use blk_end_request instead.
+ */
+void blk_update_request(struct request *rq, int error, unsigned int nr_bytes)
+{
+	if (!end_that_request_data(rq, error, nr_bytes, 0)) {
+		/*
+		 * These members are not updated in end_that_request_data()
+		 * when all bios are completed.
+		 * Update them so that the request stacking driver can find
+		 * how many bytes remain in the request later.
+		 */
+		rq->nr_sectors = rq->hard_nr_sectors = 0;
+		rq->current_nr_sectors = rq->hard_cur_sectors = 0;
+	}
+}
+EXPORT_SYMBOL_GPL(blk_update_request);
+
+/**
  * blk_end_request_callback - Special helper function for tricky drivers
  * @rq:           the request being processed
  * @error:        %0 for success, < %0 for error
Index: linux-2.6-block/include/linux/blkdev.h
===================================================================
--- linux-2.6-block.orig/include/linux/blkdev.h
+++ linux-2.6-block/include/linux/blkdev.h
@@ -791,6 +791,8 @@ extern void blk_complete_request(struct 
 extern void __blk_complete_request(struct request *);
 extern void blk_abort_request(struct request *);
 extern void blk_abort_queue(struct request_queue *);
+extern void blk_update_request(struct request *rq, int error,
+			       unsigned int nr_bytes);
 
 /*
  * blk_end_request() takes bytes instead of sectors as a complete size.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 2/3] block: add request submission interface
  2008-09-18 14:43 [PATCH 0/3] block changes for request-based dm-multipath Kiyoshi Ueda
  2008-09-18 14:45 ` [PATCH 1/3] block: add request update interface Kiyoshi Ueda
@ 2008-09-18 14:45 ` Kiyoshi Ueda
  2008-09-18 14:46 ` [PATCH 3/3] block: add a queue flag for request stacking support Kiyoshi Ueda
  2008-09-18 16:25 ` [PATCH 0/3] block changes for request-based dm-multipath Jens Axboe
  3 siblings, 0 replies; 5+ messages in thread
From: Kiyoshi Ueda @ 2008-09-18 14:45 UTC (permalink / raw)
  To: jens.axboe; +Cc: j-nomura, k-ueda, dm-devel, linux-kernel, linux-scsi

This patch adds blk_insert_cloned_request(), a generic request
submission interface for request stacking drivers.
Request-based dm will use it to submit their clones to underlying
devices.

blk_rq_check_limits() is also added because it is possible that
the lower queue has stronger limitations than the upper queue
if multiple drivers are stacking at request-level.
Not only for blk_insert_cloned_request()'s internal use, the function
will be used by request-based dm when the queue limitation is
modified (e.g. by replacing dm's table).


Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
---
 block/blk-core.c       |   81 +++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/blkdev.h |    3 +
 2 files changed, 84 insertions(+)

Index: linux-2.6-block/block/blk-core.c
===================================================================
--- linux-2.6-block.orig/block/blk-core.c
+++ linux-2.6-block/block/blk-core.c
@@ -1522,6 +1522,87 @@ void submit_bio(int rw, struct bio *bio)
 EXPORT_SYMBOL(submit_bio);
 
 /**
+ * blk_rq_check_limits - Helper function to check a request for the queue limit
+ * @q:  the queue
+ * @rq: the request being checked
+ *
+ * Description:
+ *    @rq may have been made based on weaker limitations of upper-level queues
+ *    in request stacking drivers, and it may violate the limitation of @q.
+ *    Since the block layer and the underlying device driver trust @rq
+ *    after it is inserted to @q, it should be checked against @q before
+ *    the insertion using this generic function.
+ *
+ *    This function should also be useful for request stacking drivers
+ *    in some cases below, so export this fuction.
+ *    Request stacking drivers like request-based dm may change the queue
+ *    limits while requests are in the queue (e.g. dm's table swapping).
+ *    Such request stacking drivers should check those requests agaist
+ *    the new queue limits again when they dispatch those requests,
+ *    although such checkings are also done against the old queue limits
+ *    when submitting requests.
+ */
+int blk_rq_check_limits(struct request_queue *q, struct request *rq)
+{
+	if (rq->nr_sectors > q->max_sectors ||
+	    rq->data_len > q->max_hw_sectors << 9) {
+		printk(KERN_ERR "%s: over max size limit.\n", __func__);
+		return -EIO;
+	}
+
+	/*
+	 * queue's settings related to segment counting like q->bounce_pfn
+	 * may differ from that of other stacking queues.
+	 * Recalculate it to check the request correctly on this queue's
+	 * limitation.
+	 */
+	blk_recalc_rq_segments(rq);
+	if (rq->nr_phys_segments > q->max_phys_segments ||
+	    rq->nr_phys_segments > q->max_hw_segments) {
+		printk(KERN_ERR "%s: over max segments limit.\n", __func__);
+		return -EIO;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(blk_rq_check_limits);
+
+/**
+ * blk_insert_cloned_request - Helper for stacking drivers to submit a request
+ * @q:  the queue to submit the request
+ * @rq: the request being queued
+ */
+int blk_insert_cloned_request(struct request_queue *q, struct request *rq)
+{
+	unsigned long flags;
+
+	if (blk_rq_check_limits(q, rq))
+		return -EIO;
+
+#ifdef CONFIG_FAIL_MAKE_REQUEST
+	if (rq->rq_disk && rq->rq_disk->part0.make_it_fail &&
+	    should_fail(&fail_make_request, blk_rq_bytes(rq)))
+		return -EIO;
+#endif
+
+	spin_lock_irqsave(q->queue_lock, flags);
+
+	/*
+	 * Submitting request must be dequeued before calling this function
+	 * because it will be linked to another request_queue
+	 */
+	BUG_ON(blk_queued_rq(rq));
+
+	drive_stat_acct(rq, 1);
+	__elv_add_request(q, rq, ELEVATOR_INSERT_BACK, 0);
+
+	spin_unlock_irqrestore(q->queue_lock, flags);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(blk_insert_cloned_request);
+
+/**
  * __end_that_request_first - end I/O on a request
  * @req:      the request being processed
  * @error:    %0 for success, < %0 for error
Index: linux-2.6-block/include/linux/blkdev.h
===================================================================
--- linux-2.6-block.orig/include/linux/blkdev.h
+++ linux-2.6-block/include/linux/blkdev.h
@@ -693,6 +693,9 @@ extern void __blk_put_request(struct req
 extern struct request *blk_get_request(struct request_queue *, int, gfp_t);
 extern void blk_insert_request(struct request_queue *, struct request *, int, void *);
 extern void blk_requeue_request(struct request_queue *, struct request *);
+extern int blk_rq_check_limits(struct request_queue *q, struct request *rq);
+extern int blk_insert_cloned_request(struct request_queue *q,
+				     struct request *rq);
 extern void blk_plug_device(struct request_queue *);
 extern void blk_plug_device_unlocked(struct request_queue *);
 extern int blk_remove_plug(struct request_queue *);

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 3/3] block: add a queue flag for request stacking support
  2008-09-18 14:43 [PATCH 0/3] block changes for request-based dm-multipath Kiyoshi Ueda
  2008-09-18 14:45 ` [PATCH 1/3] block: add request update interface Kiyoshi Ueda
  2008-09-18 14:45 ` [PATCH 2/3] block: add request submission interface Kiyoshi Ueda
@ 2008-09-18 14:46 ` Kiyoshi Ueda
  2008-09-18 16:25 ` [PATCH 0/3] block changes for request-based dm-multipath Jens Axboe
  3 siblings, 0 replies; 5+ messages in thread
From: Kiyoshi Ueda @ 2008-09-18 14:46 UTC (permalink / raw)
  To: jens.axboe; +Cc: j-nomura, k-ueda, dm-devel, linux-kernel, linux-scsi

This patch adds a queue flag to indicate the block device can be
used for request stacking.

Request stacking drivers need to stack their devices on top of
only devices of which q->request_fn is functional.
Since bio stacking drivers (e.g. md, loop) basically initialize
their queue using blk_alloc_queue() and don't set q->request_fn,
the check of (q->request_fn == NULL) looks enough for that purpose.

However, dm will become both types of stacking driver (bio-based and
request-based).  And dm will always set q->request_fn even if the dm
device is bio-based of which q->request_fn is not functional actually.
So we need something else to distinguish the type of the device.
Adding a queue flag is a solution for that.


The reason why dm always sets q->request_fn is to keep
the compatibility of dm user-space tools.
Currently, all dm user-space tools are using bio-based dm without
specifying the type of the dm device they use.
To use request-based dm without changing such tools, the kernel
must decide the type of the dm device automatically.
The automatic type decision can't be done at the device creation time
and needs to be deferred until such tools load a mapping table,
since the actual type is decided by dm target type included in
the mapping table.

So a dm device has to be initialized using blk_init_queue()
so that we can load either type of table.
Then, all queue stuffs are set (e.g. q->request_fn) and we have
no element to distinguish that it is bio-based or request-based,
even after a table is loaded and the type of the device is decided.

By the way, some stuffs of the queue (e.g. request_list, elevator)
are needless when the dm device is used as bio-based.
But the memory size is not so large (about 20[KB] per queue on ia64),
so I hope the memory loss can be acceptable for bio-based dm users.


Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
---
 block/blk-core.c       |    3 ++-
 include/linux/blkdev.h |    3 +++
 2 files changed, 5 insertions(+), 1 deletion(-)

Index: linux-2.6-block/block/blk-core.c
===================================================================
--- linux-2.6-block.orig/block/blk-core.c
+++ linux-2.6-block/block/blk-core.c
@@ -566,7 +566,8 @@ blk_init_queue_node(request_fn_proc *rfn
 	q->request_fn		= rfn;
 	q->prep_rq_fn		= NULL;
 	q->unplug_fn		= generic_unplug_device;
-	q->queue_flags		= (1 << QUEUE_FLAG_CLUSTER);
+	q->queue_flags		= (1 << QUEUE_FLAG_CLUSTER |
+				   1 << QUEUE_FLAG_STACKABLE);
 	q->queue_lock		= lock;
 
 	blk_queue_segment_boundary(q, 0xffffffff);
Index: linux-2.6-block/include/linux/blkdev.h
===================================================================
--- linux-2.6-block.orig/include/linux/blkdev.h
+++ linux-2.6-block/include/linux/blkdev.h
@@ -441,6 +441,7 @@ struct request_queue
 #define QUEUE_FLAG_NOMERGES    10	/* disable merge attempts */
 #define QUEUE_FLAG_SAME_COMP   11	/* force complete on same CPU */
 #define QUEUE_FLAG_FAIL_IO     12	/* fake timeout */
+#define QUEUE_FLAG_STACKABLE   13	/* supports request stacking */
 
 static inline int queue_is_locked(struct request_queue *q)
 {
@@ -547,6 +548,8 @@ enum {
 #define blk_queue_stopped(q)	test_bit(QUEUE_FLAG_STOPPED, &(q)->queue_flags)
 #define blk_queue_nomerges(q)	test_bit(QUEUE_FLAG_NOMERGES, &(q)->queue_flags)
 #define blk_queue_flushing(q)	((q)->ordseq)
+#define blk_queue_stackable(q)	\
+	test_bit(QUEUE_FLAG_STACKABLE, &(q)->queue_flags)
 
 #define blk_fs_request(rq)	((rq)->cmd_type == REQ_TYPE_FS)
 #define blk_pc_request(rq)	((rq)->cmd_type == REQ_TYPE_BLOCK_PC)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/3] block changes for request-based dm-multipath
  2008-09-18 14:43 [PATCH 0/3] block changes for request-based dm-multipath Kiyoshi Ueda
                   ` (2 preceding siblings ...)
  2008-09-18 14:46 ` [PATCH 3/3] block: add a queue flag for request stacking support Kiyoshi Ueda
@ 2008-09-18 16:25 ` Jens Axboe
  3 siblings, 0 replies; 5+ messages in thread
From: Jens Axboe @ 2008-09-18 16:25 UTC (permalink / raw)
  To: Kiyoshi Ueda; +Cc: linux-kernel, linux-scsi, dm-devel, j-nomura

On Thu, Sep 18 2008, Kiyoshi Ueda wrote:
> Hi Jens,
> 
> The following patches are changes of the block layer for
> request-based dm-multipath.
> The patches are created on the following commit of for-2.6.28
> in linux-2.6-block.

Thank you, I've merged these three patches.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2008-09-18 16:25 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-09-18 14:43 [PATCH 0/3] block changes for request-based dm-multipath Kiyoshi Ueda
2008-09-18 14:45 ` [PATCH 1/3] block: add request update interface Kiyoshi Ueda
2008-09-18 14:45 ` [PATCH 2/3] block: add request submission interface Kiyoshi Ueda
2008-09-18 14:46 ` [PATCH 3/3] block: add a queue flag for request stacking support Kiyoshi Ueda
2008-09-18 16:25 ` [PATCH 0/3] block changes for request-based dm-multipath Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox