linux-scsi.vger.kernel.org archive mirror
* [PATCH 0/3] block: adds padding support to blk_rq_map_user_iov
@ 2008-04-10 14:17 FUJITA Tomonori
  2008-04-10 14:17 ` [PATCH 1/3] block: convert bio_copy_user to bio_copy_user_iov FUJITA Tomonori
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: FUJITA Tomonori @ 2008-04-10 14:17 UTC (permalink / raw)
  To: linux-scsi
  Cc: Tejun Heo, Mike Christie, James Bottomley, Jens Axboe,
	FUJITA Tomonori

As discussed [*1], the blk_rq_map_user_iov path is currently broken
with respect to padding. In 2.6.24, libata did its own padding, but
libata's padding code was removed and libata now expects the block
layer to do that.

blk_rq_map_user does padding but blk_rq_map_user_iov doesn't, so
blk_rq_map_user_iov doesn't work when libata needs padding (so far
nobody has complained; maybe nobody uses the blk_rq_map_user_iov
interface).

This patchset adds padding support to blk_rq_map_user_iov. I converted
bio_copy_user to bio_copy_user_iov, which uses temporary kernel
buffers. blk_rq_map_user_iov uses bio_copy_user_iov when a low
level driver needs padding or a buffer in sg_iovec isn't aligned. We
can safely do padding in blk_rq_map_sg.

In the long run, I want to integrate the several mapping APIs for PC
commands (and the new API should be useful for sg/st/osst), but I need
more time to finish that work.

This is against the latest Linus tree. Can we merge this after 2.6.25?


[*1]
http://marc.info/?t=120716961000004&r=1&w=2



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/3] block: convert bio_copy_user to bio_copy_user_iov
  2008-04-10 14:17 [PATCH 0/3] block: adds padding support to blk_rq_map_user_iov FUJITA Tomonori
@ 2008-04-10 14:17 ` FUJITA Tomonori
  2008-04-10 14:17   ` [PATCH 2/3] block: add bio_copy_user_iov support to blk_rq_map_user_iov FUJITA Tomonori
  2008-04-11  5:43 ` [PATCH 0/3] block: adds padding support to blk_rq_map_user_iov Tejun Heo
  2008-04-11 10:55 ` Jens Axboe
  2 siblings, 1 reply; 9+ messages in thread
From: FUJITA Tomonori @ 2008-04-10 14:17 UTC (permalink / raw)
  To: linux-scsi
  Cc: FUJITA Tomonori, Tejun Heo, Mike Christie, James Bottomley,
	Jens Axboe

This patch enables bio_copy_user to take a struct sg_iovec (renamed
bio_copy_user_iov). bio_copy_user now uses bio_copy_user_iov
internally, just as bio_map_user uses bio_map_user_iov.

The major changes are:

- adds sg_iovec array to struct bio_map_data

- adds __bio_copy_iov, which copies data between a bio and an
sg_iovec. bio_copy_user_iov and bio_uncopy_user use it.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
---
 fs/bio.c            |  158 +++++++++++++++++++++++++++++++++++++-------------
 include/linux/bio.h |    2 +
 2 files changed, 119 insertions(+), 41 deletions(-)

diff --git a/fs/bio.c b/fs/bio.c
index 553b5b7..6e0b6f6 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -444,22 +444,27 @@ int bio_add_page(struct bio *bio, struct page *page, unsigned int len,
 
 struct bio_map_data {
 	struct bio_vec *iovecs;
-	void __user *userptr;
+	int nr_sgvecs;
+	struct sg_iovec *sgvecs;
 };
 
-static void bio_set_map_data(struct bio_map_data *bmd, struct bio *bio)
+static void bio_set_map_data(struct bio_map_data *bmd, struct bio *bio,
+			     struct sg_iovec *iov, int iov_count)
 {
 	memcpy(bmd->iovecs, bio->bi_io_vec, sizeof(struct bio_vec) * bio->bi_vcnt);
+	memcpy(bmd->sgvecs, iov, sizeof(struct sg_iovec) * iov_count);
+	bmd->nr_sgvecs = iov_count;
 	bio->bi_private = bmd;
 }
 
 static void bio_free_map_data(struct bio_map_data *bmd)
 {
 	kfree(bmd->iovecs);
+	kfree(bmd->sgvecs);
 	kfree(bmd);
 }
 
-static struct bio_map_data *bio_alloc_map_data(int nr_segs)
+static struct bio_map_data *bio_alloc_map_data(int nr_segs, int iov_count)
 {
 	struct bio_map_data *bmd = kmalloc(sizeof(*bmd), GFP_KERNEL);
 
@@ -467,13 +472,71 @@ static struct bio_map_data *bio_alloc_map_data(int nr_segs)
 		return NULL;
 
 	bmd->iovecs = kmalloc(sizeof(struct bio_vec) * nr_segs, GFP_KERNEL);
-	if (bmd->iovecs)
+	if (!bmd->iovecs) {
+		kfree(bmd);
+		return NULL;
+	}
+
+	bmd->sgvecs = kmalloc(sizeof(struct sg_iovec) * iov_count, GFP_KERNEL);
+	if (bmd->sgvecs)
 		return bmd;
 
+	kfree(bmd->iovecs);
 	kfree(bmd);
 	return NULL;
 }
 
+static int __bio_copy_iov(struct bio *bio, struct sg_iovec *iov, int iov_count,
+			  int uncopy)
+{
+	int ret = 0, i;
+	struct bio_vec *bvec;
+	int iov_idx = 0;
+	unsigned int iov_off = 0;
+	int read = bio_data_dir(bio) == READ;
+
+	__bio_for_each_segment(bvec, bio, i, 0) {
+		char *bv_addr = page_address(bvec->bv_page);
+		unsigned int bv_len = bvec->bv_len;
+
+		while (bv_len && iov_idx < iov_count) {
+			unsigned int bytes;
+			char *iov_addr;
+
+			bytes = min_t(unsigned int,
+				      iov[iov_idx].iov_len - iov_off, bv_len);
+			iov_addr = iov[iov_idx].iov_base + iov_off;
+
+			if (!ret) {
+				if (!read && !uncopy)
+					ret = copy_from_user(bv_addr, iov_addr,
+							     bytes);
+				if (read && uncopy)
+					ret = copy_to_user(iov_addr, bv_addr,
+							   bytes);
+
+				if (ret)
+					ret = -EFAULT;
+			}
+
+			bv_len -= bytes;
+			bv_addr += bytes;
+			iov_addr += bytes;
+			iov_off += bytes;
+
+			if (iov[iov_idx].iov_len == iov_off) {
+				iov_idx++;
+				iov_off = 0;
+			}
+		}
+
+		if (uncopy)
+			__free_page(bvec->bv_page);
+	}
+
+	return ret;
+}
+
 /**
  *	bio_uncopy_user	-	finish previously mapped bio
  *	@bio: bio being terminated
@@ -484,55 +547,56 @@ static struct bio_map_data *bio_alloc_map_data(int nr_segs)
 int bio_uncopy_user(struct bio *bio)
 {
 	struct bio_map_data *bmd = bio->bi_private;
-	const int read = bio_data_dir(bio) == READ;
-	struct bio_vec *bvec;
-	int i, ret = 0;
+	int ret;
 
-	__bio_for_each_segment(bvec, bio, i, 0) {
-		char *addr = page_address(bvec->bv_page);
-		unsigned int len = bmd->iovecs[i].bv_len;
+	ret = __bio_copy_iov(bio, bmd->sgvecs, bmd->nr_sgvecs, 1);
 
-		if (read && !ret && copy_to_user(bmd->userptr, addr, len))
-			ret = -EFAULT;
-
-		__free_page(bvec->bv_page);
-		bmd->userptr += len;
-	}
 	bio_free_map_data(bmd);
 	bio_put(bio);
 	return ret;
 }
 
 /**
- *	bio_copy_user	-	copy user data to bio
+ *	bio_copy_user_iov	-	copy user data to bio
  *	@q: destination block queue
- *	@uaddr: start of user address
- *	@len: length in bytes
+ *	@iov:	the iovec.
+ *	@iov_count: number of elements in the iovec
  *	@write_to_vm: bool indicating writing to pages or not
  *
  *	Prepares and returns a bio for indirect user io, bouncing data
  *	to/from kernel pages as necessary. Must be paired with
  *	call bio_uncopy_user() on io completion.
  */
-struct bio *bio_copy_user(struct request_queue *q, unsigned long uaddr,
-			  unsigned int len, int write_to_vm)
+struct bio *bio_copy_user_iov(struct request_queue *q, struct sg_iovec *iov,
+			      int iov_count, int write_to_vm)
 {
-	unsigned long end = (uaddr + len + PAGE_SIZE - 1) >> PAGE_SHIFT;
-	unsigned long start = uaddr >> PAGE_SHIFT;
 	struct bio_map_data *bmd;
 	struct bio_vec *bvec;
 	struct page *page;
 	struct bio *bio;
 	int i, ret;
+	int nr_pages = 0;
+	unsigned int len = 0;
 
-	bmd = bio_alloc_map_data(end - start);
+	for (i = 0; i < iov_count; i++) {
+		unsigned long uaddr;
+		unsigned long end;
+		unsigned long start;
+
+		uaddr = (unsigned long)iov[i].iov_base;
+		end = (uaddr + iov[i].iov_len + PAGE_SIZE - 1) >> PAGE_SHIFT;
+		start = uaddr >> PAGE_SHIFT;
+
+		nr_pages += end - start;
+		len += iov[i].iov_len;
+	}
+
+	bmd = bio_alloc_map_data(nr_pages, iov_count);
 	if (!bmd)
 		return ERR_PTR(-ENOMEM);
 
-	bmd->userptr = (void __user *) uaddr;
-
 	ret = -ENOMEM;
-	bio = bio_alloc(GFP_KERNEL, end - start);
+	bio = bio_alloc(GFP_KERNEL, nr_pages);
 	if (!bio)
 		goto out_bmd;
 
@@ -564,22 +628,12 @@ struct bio *bio_copy_user(struct request_queue *q, unsigned long uaddr,
 	 * success
 	 */
 	if (!write_to_vm) {
-		char __user *p = (char __user *) uaddr;
-
-		/*
-		 * for a write, copy in data to kernel pages
-		 */
-		ret = -EFAULT;
-		bio_for_each_segment(bvec, bio, i) {
-			char *addr = page_address(bvec->bv_page);
-
-			if (copy_from_user(addr, p, bvec->bv_len))
-				goto cleanup;
-			p += bvec->bv_len;
-		}
+		ret = __bio_copy_iov(bio, iov, iov_count, 0);
+		if (ret)
+			goto cleanup;
 	}
 
-	bio_set_map_data(bmd, bio);
+	bio_set_map_data(bmd, bio, iov, iov_count);
 	return bio;
 cleanup:
 	bio_for_each_segment(bvec, bio, i)
@@ -591,6 +645,28 @@ out_bmd:
 	return ERR_PTR(ret);
 }
 
+/**
+ *	bio_copy_user	-	copy user data to bio
+ *	@q: destination block queue
+ *	@uaddr: start of user address
+ *	@len: length in bytes
+ *	@write_to_vm: bool indicating writing to pages or not
+ *
+ *	Prepares and returns a bio for indirect user io, bouncing data
+ *	to/from kernel pages as necessary. Must be paired with
+ *	call bio_uncopy_user() on io completion.
+ */
+struct bio *bio_copy_user(struct request_queue *q, unsigned long uaddr,
+			  unsigned int len, int write_to_vm)
+{
+	struct sg_iovec iov;
+
+	iov.iov_base = (void __user *)uaddr;
+	iov.iov_len = len;
+
+	return bio_copy_user_iov(q, &iov, 1, write_to_vm);
+}
+
 static struct bio *__bio_map_user_iov(struct request_queue *q,
 				      struct block_device *bdev,
 				      struct sg_iovec *iov, int iov_count,
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 4c59bdc..d259690 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -327,6 +327,8 @@ extern struct bio *bio_map_kern(struct request_queue *, void *, unsigned int,
 extern void bio_set_pages_dirty(struct bio *bio);
 extern void bio_check_pages_dirty(struct bio *bio);
 extern struct bio *bio_copy_user(struct request_queue *, unsigned long, unsigned int, int);
+extern struct bio *bio_copy_user_iov(struct request_queue *, struct sg_iovec *,
+				     int, int);
 extern int bio_uncopy_user(struct bio *);
 void zero_fill_bio(struct bio *bio);
 
-- 
1.5.4.2



* [PATCH 2/3] block: add bio_copy_user_iov support to blk_rq_map_user_iov
  2008-04-10 14:17 ` [PATCH 1/3] block: convert bio_copy_user to bio_copy_user_iov FUJITA Tomonori
@ 2008-04-10 14:17   ` FUJITA Tomonori
  2008-04-10 14:17     ` [PATCH 3/3] block: move the padding adjustment to blk_rq_map_sg FUJITA Tomonori
  0 siblings, 1 reply; 9+ messages in thread
From: FUJITA Tomonori @ 2008-04-10 14:17 UTC (permalink / raw)
  To: linux-scsi
  Cc: FUJITA Tomonori, Tejun Heo, Mike Christie, James Bottomley,
	Jens Axboe

With this patch, blk_rq_map_user_iov uses bio_copy_user_iov when a low
level driver needs padding or a buffer in sg_iovec isn't aligned. That
is, it uses temporary kernel buffers instead of mapping user pages
directly.

When an LLD needs padding, blk_rq_map_sg later needs to extend the
last entry of the scatter list. bio_copy_user_iov guarantees that
there is enough space for padding by using temporary kernel buffers
instead of user pages.

Currently, blk_rq_map_user_iov requires the buffers in sg_iovec to be
aligned. The comment in blk_rq_map_user_iov suggests that
drivers/scsi/sg.c also needs the buffers in sg_iovec to be aligned.
Actually, drivers/scsi/sg.c works fine with unaligned buffers in
sg_iovec (it always uses temporary kernel buffers).

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
---
 block/blk-map.c |   22 +++++++++++++++++-----
 1 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index c07d9c8..ab43533 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -5,6 +5,7 @@
 #include <linux/module.h>
 #include <linux/bio.h>
 #include <linux/blkdev.h>
+#include <scsi/sg.h>		/* for struct sg_iovec */
 
 #include "blk.h"
 
@@ -194,15 +195,26 @@ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
 			struct sg_iovec *iov, int iov_count, unsigned int len)
 {
 	struct bio *bio;
+	int i, read = rq_data_dir(rq) == READ;
+	int unaligned = 0;
 
 	if (!iov || iov_count <= 0)
 		return -EINVAL;
 
-	/* we don't allow misaligned data like bio_map_user() does.  If the
-	 * user is using sg, they're expected to know the alignment constraints
-	 * and respect them accordingly */
-	bio = bio_map_user_iov(q, NULL, iov, iov_count,
-				rq_data_dir(rq) == READ);
+	for (i = 0; i < iov_count; i++) {
+		unsigned long uaddr = (unsigned long)iov[i].iov_base;
+
+		if (uaddr & queue_dma_alignment(q)) {
+			unaligned = 1;
+			break;
+		}
+	}
+
+	if (unaligned || (q->dma_pad_mask & len))
+		bio = bio_copy_user_iov(q, iov, iov_count, read);
+	else
+		bio = bio_map_user_iov(q, NULL, iov, iov_count, read);
+
 	if (IS_ERR(bio))
 		return PTR_ERR(bio);
 
-- 
1.5.4.2



* [PATCH 3/3] block: move the padding adjustment to blk_rq_map_sg
  2008-04-10 14:17   ` [PATCH 2/3] block: add bio_copy_user_iov support to blk_rq_map_user_iov FUJITA Tomonori
@ 2008-04-10 14:17     ` FUJITA Tomonori
  2008-04-10 15:02       ` Boaz Harrosh
  0 siblings, 1 reply; 9+ messages in thread
From: FUJITA Tomonori @ 2008-04-10 14:17 UTC (permalink / raw)
  To: linux-scsi
  Cc: FUJITA Tomonori, Tejun Heo, Mike Christie, James Bottomley,
	Jens Axboe

blk_rq_map_user adjusts bi_size of the last bio. That breaks the rule
that req->data_len (the true data length) is equal to sum(bio), and it
broke the SCSI command completion code.

commit e97a294ef6938512b655b1abf17656cf2b26f709 was introduced to fix
the above issue. However, the partial completion code doesn't work
with it. The commit is also a layer violation (scsi mid-layer should
not know about the block layer's padding).

This patch moves the padding adjustment to blk_rq_map_sg (suggested by
James). The padding works like the drain buffer. This patch breaks the
rule that req->data_len is equal to sum(sg), however, the drain buffer
already broke it. So this patch just restores the rule that
req->data_len is equal to sum(bio) without breaking anything new.

Now when a low level driver needs padding, blk_rq_map_user and
blk_rq_map_user_iov guarantee there's enough room for padding.
blk_rq_map_sg can safely extend the last entry of a scatter list.

blk_rq_map_sg must extend the last entry of a scatter list only for a
request that got through bio_copy_user_iov. This patch introduces a
new REQ_COPY_USER flag.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
---
 block/blk-map.c        |   24 +++++-------------------
 block/blk-merge.c      |    9 +++++++++
 drivers/scsi/scsi.c    |    2 +-
 include/linux/blkdev.h |    2 ++
 4 files changed, 17 insertions(+), 20 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index ab43533..3c942bd 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -141,25 +141,8 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq,
 		ubuf += ret;
 	}
 
-	/*
-	 * __blk_rq_map_user() copies the buffers if starting address
-	 * or length isn't aligned to dma_pad_mask.  As the copied
-	 * buffer is always page aligned, we know that there's enough
-	 * room for padding.  Extend the last bio and update
-	 * rq->data_len accordingly.
-	 *
-	 * On unmap, bio_uncopy_user() will use unmodified
-	 * bio_map_data pointed to by bio->bi_private.
-	 */
-	if (len & q->dma_pad_mask) {
-		unsigned int pad_len = (q->dma_pad_mask & ~len) + 1;
-		struct bio *tail = rq->biotail;
-
-		tail->bi_io_vec[tail->bi_vcnt - 1].bv_len += pad_len;
-		tail->bi_size += pad_len;
-
-		rq->extra_len += pad_len;
-	}
+	if (!bio_flagged(bio, BIO_USER_MAPPED))
+		rq->cmd_flags |= REQ_COPY_USER;
 
 	rq->buffer = rq->data = NULL;
 	return 0;
@@ -224,6 +207,9 @@ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
 		return -EINVAL;
 	}
 
+	if (!bio_flagged(bio, BIO_USER_MAPPED))
+		rq->cmd_flags |= REQ_COPY_USER;
+
 	bio_get(bio);
 	blk_rq_bio_prep(q, rq, bio);
 	rq->buffer = rq->data = NULL;
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 0f58616..b5c5c4a 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -220,6 +220,15 @@ new_segment:
 		bvprv = bvec;
 	} /* segments in rq */
 
+
+	if (unlikely(rq->cmd_flags & REQ_COPY_USER) &&
+	    (rq->data_len & q->dma_pad_mask)) {
+		unsigned int pad_len = (q->dma_pad_mask & ~rq->data_len) + 1;
+
+		sg->length += pad_len;
+		rq->extra_len += pad_len;
+	}
+
 	if (q->dma_drain_size && q->dma_drain_needed(rq)) {
 		if (rq->cmd_flags & REQ_RW)
 			memset(q->dma_drain_buffer, 0, q->dma_drain_size);
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index c78b836..7a85cb3 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -759,7 +759,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
 				"Notifying upper driver of completion "
 				"(result %x)\n", cmd->result));
 
-	good_bytes = scsi_bufflen(cmd) + cmd->request->extra_len;
+	good_bytes = scsi_bufflen(cmd);
         if (cmd->request->cmd_type != REQ_TYPE_BLOCK_PC) {
 		drv = scsi_cmd_to_driver(cmd);
 		if (drv->done)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 6f79d40..b3a58ad 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -112,6 +112,7 @@ enum rq_flag_bits {
 	__REQ_RW_SYNC,		/* request is sync (O_DIRECT) */
 	__REQ_ALLOCED,		/* request came from our alloc pool */
 	__REQ_RW_META,		/* metadata io request */
+	__REQ_COPY_USER,	/* contains copies of user pages */
 	__REQ_NR_BITS,		/* stops here */
 };
 
@@ -133,6 +134,7 @@ enum rq_flag_bits {
 #define REQ_RW_SYNC	(1 << __REQ_RW_SYNC)
 #define REQ_ALLOCED	(1 << __REQ_ALLOCED)
 #define REQ_RW_META	(1 << __REQ_RW_META)
+#define REQ_COPY_USER	(1 << __REQ_COPY_USER)
 
 #define BLK_MAX_CDB	16
 
-- 
1.5.4.2



* Re: [PATCH 3/3] block: move the padding adjustment to blk_rq_map_sg
  2008-04-10 14:17     ` [PATCH 3/3] block: move the padding adjustment to blk_rq_map_sg FUJITA Tomonori
@ 2008-04-10 15:02       ` Boaz Harrosh
  0 siblings, 0 replies; 9+ messages in thread
From: Boaz Harrosh @ 2008-04-10 15:02 UTC (permalink / raw)
  To: FUJITA Tomonori, Tejun Heo
  Cc: linux-scsi, Mike Christie, James Bottomley, Jens Axboe

On Thu, Apr 10 2008 at 17:17 +0300, FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote:
> blk_rq_map_user adjusts bi_size of the last bio. It breaks the rule
> that req->data_len (the true data length) is equal to sum(bio). It
> broke the scsi command completion code.
> 
> commit e97a294ef6938512b655b1abf17656cf2b26f709 was introduced to fix
> the above issue. However, the partial completion code doesn't work
> with it. The commit is also a layer violation (scsi mid-layer should
> not know about the block layer's padding).
> 
> This patch moves the padding adjustment to blk_rq_map_sg (suggested by
> James). The padding works like the drain buffer. This patch breaks the
> rule that req->data_len is equal to sum(sg), however, the drain buffer
> already broke it. So this patch just restores the rule that
> req->data_len is equal to sum(bio) without breaking anything new.
> 
> Now when a low level driver needs padding, blk_rq_map_user and
> blk_rq_map_user_iov guarantee there's enough room for padding.
> blk_rq_map_sg can safely extend the last entry of a scatter list.
> 

This is a grave bug; this patchset must be submitted for the 2.6.25
stable releases after it has been tested.

> blk_rq_map_sg must extend the last entry of a scatter list only for a
> request that got through bio_copy_user_iov. This patch introduces a
> new REQ_COPY_USER flag.
> 

What about blk_rq_map_kern? Should we have a BUG_ON if a kernel driver
submits an unaligned mapping, according to what the driver has set?

> Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> Cc: Tejun Heo <htejun@gmail.com>
> Cc: Mike Christie <michaelc@cs.wisc.edu>
> Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
> Cc: Jens Axboe <jens.axboe@oracle.com>
> ---
>  block/blk-map.c        |   24 +++++-------------------
>  block/blk-merge.c      |    9 +++++++++
>  drivers/scsi/scsi.c    |    2 +-
>  include/linux/blkdev.h |    2 ++
>  4 files changed, 17 insertions(+), 20 deletions(-)
> 
> diff --git a/block/blk-map.c b/block/blk-map.c
> index ab43533..3c942bd 100644
> --- a/block/blk-map.c
> +++ b/block/blk-map.c
> @@ -141,25 +141,8 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq,
>  		ubuf += ret;
>  	}
>  
> -	/*
> -	 * __blk_rq_map_user() copies the buffers if starting address
> -	 * or length isn't aligned to dma_pad_mask.  As the copied
> -	 * buffer is always page aligned, we know that there's enough
> -	 * room for padding.  Extend the last bio and update
> -	 * rq->data_len accordingly.
> -	 *
> -	 * On unmap, bio_uncopy_user() will use unmodified
> -	 * bio_map_data pointed to by bio->bi_private.
> -	 */
> -	if (len & q->dma_pad_mask) {
> -		unsigned int pad_len = (q->dma_pad_mask & ~len) + 1;
> -		struct bio *tail = rq->biotail;
> -
> -		tail->bi_io_vec[tail->bi_vcnt - 1].bv_len += pad_len;
> -		tail->bi_size += pad_len;
> -
> -		rq->extra_len += pad_len;
> -	}
> +	if (!bio_flagged(bio, BIO_USER_MAPPED))
> +		rq->cmd_flags |= REQ_COPY_USER;
>  
>  	rq->buffer = rq->data = NULL;
>  	return 0;
> @@ -224,6 +207,9 @@ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
>  		return -EINVAL;
>  	}
>  
> +	if (!bio_flagged(bio, BIO_USER_MAPPED))
> +		rq->cmd_flags |= REQ_COPY_USER;
> +
>  	bio_get(bio);
>  	blk_rq_bio_prep(q, rq, bio);
>  	rq->buffer = rq->data = NULL;
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index 0f58616..b5c5c4a 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -220,6 +220,15 @@ new_segment:
>  		bvprv = bvec;
>  	} /* segments in rq */
>  
> +
> +	if (unlikely(rq->cmd_flags & REQ_COPY_USER) &&
> +	    (rq->data_len & q->dma_pad_mask)) {

+	if (rq->data_len & q->dma_pad_mask) {
+		BUG_ON(!(rq->cmd_flags & REQ_COPY_USER));

> +		unsigned int pad_len = (q->dma_pad_mask & ~rq->data_len) + 1;
> +
> +		sg->length += pad_len;
> +		rq->extra_len += pad_len;
> +	}
> +
>  	if (q->dma_drain_size && q->dma_drain_needed(rq)) {
>  		if (rq->cmd_flags & REQ_RW)
>  			memset(q->dma_drain_buffer, 0, q->dma_drain_size);
> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> index c78b836..7a85cb3 100644
> --- a/drivers/scsi/scsi.c
> +++ b/drivers/scsi/scsi.c
> @@ -759,7 +759,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
>  				"Notifying upper driver of completion "
>  				"(result %x)\n", cmd->result));
>  
> -	good_bytes = scsi_bufflen(cmd) + cmd->request->extra_len;

Good god, this is going into the 2.6.25 kernel?!
Why was it not put in blk_end_request?

> +	good_bytes = scsi_bufflen(cmd);
>          if (cmd->request->cmd_type != REQ_TYPE_BLOCK_PC) {
>  		drv = scsi_cmd_to_driver(cmd);
>  		if (drv->done)
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 6f79d40..b3a58ad 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -112,6 +112,7 @@ enum rq_flag_bits {
>  	__REQ_RW_SYNC,		/* request is sync (O_DIRECT) */
>  	__REQ_ALLOCED,		/* request came from our alloc pool */
>  	__REQ_RW_META,		/* metadata io request */
> +	__REQ_COPY_USER,	/* contains copies of user pages */
>  	__REQ_NR_BITS,		/* stops here */
>  };
>  
> @@ -133,6 +134,7 @@ enum rq_flag_bits {
>  #define REQ_RW_SYNC	(1 << __REQ_RW_SYNC)
>  #define REQ_ALLOCED	(1 << __REQ_ALLOCED)
>  #define REQ_RW_META	(1 << __REQ_RW_META)
> +#define REQ_COPY_USER	(1 << __REQ_COPY_USER)
>  
>  #define BLK_MAX_CDB	16
>  

Boaz is :'(



* Re: [PATCH 0/3] block: adds padding support to blk_rq_map_user_iov
  2008-04-10 14:17 [PATCH 0/3] block: adds padding support to blk_rq_map_user_iov FUJITA Tomonori
  2008-04-10 14:17 ` [PATCH 1/3] block: convert bio_copy_user to bio_copy_user_iov FUJITA Tomonori
@ 2008-04-11  5:43 ` Tejun Heo
  2008-04-11 10:55 ` Jens Axboe
  2 siblings, 0 replies; 9+ messages in thread
From: Tejun Heo @ 2008-04-11  5:43 UTC (permalink / raw)
  To: FUJITA Tomonori; +Cc: linux-scsi, Mike Christie, James Bottomley, Jens Axboe

FUJITA Tomonori wrote:
> As discussed [*1], blk_rq_map_user_iov path is broken regarding
> padding at the moment. In 2.6.24, libata did padding but libata's
> padding code was removed and now libata expects the block layer to do
> that.
> 
> blk_rq_map_user does padding but blk_rq_map_user_iov doesn't so
> blk_rq_map_user_iov doesn't work in case libata needs padding (so far
> nobody has complained, maybe nobody uses blk_rq_map_user_iov
> interface).
> 
> This patchset adds padding support to blk_rq_map_user_iov. I converted
> bio_copy_user to bio_copy_user_iov, which uses temporary kernel
> buffers. blk_rq_map_user_iov uses bio_copy_user_iov when a low
> level driver needs padding or a buffer in sg_iovec isn't aligned. We
> can safely do padding in blk_rq_map_sg.
> 
> In the long run, I want to integrate several mapping APIs for PC
> commands (and new API should be useful for sg/st/osst) but I need more
> time to finish that work.
> 
> This is against the latest Linus tree. Can we merge this after 2.6.25?

Haven't really reviewed line by line but generally looks good to me.

Thanks.

-- 
tejun


* Re: [PATCH 0/3] block: adds padding support to blk_rq_map_user_iov
  2008-04-10 14:17 [PATCH 0/3] block: adds padding support to blk_rq_map_user_iov FUJITA Tomonori
  2008-04-10 14:17 ` [PATCH 1/3] block: convert bio_copy_user to bio_copy_user_iov FUJITA Tomonori
  2008-04-11  5:43 ` [PATCH 0/3] block: adds padding support to blk_rq_map_user_iov Tejun Heo
@ 2008-04-11 10:55 ` Jens Axboe
  2008-04-11 14:57   ` FUJITA Tomonori
  2 siblings, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2008-04-11 10:55 UTC (permalink / raw)
  To: FUJITA Tomonori; +Cc: linux-scsi, Tejun Heo, Mike Christie, James Bottomley

On Thu, Apr 10 2008, FUJITA Tomonori wrote:
> As discussed [*1], blk_rq_map_user_iov path is broken regarding
> padding at the moment. In 2.6.24, libata did padding but libata's
> padding code was removed and now libata expects the block layer to do
> that.
> 
> blk_rq_map_user does padding but blk_rq_map_user_iov doesn't so
> blk_rq_map_user_iov doesn't work in case libata needs padding (so far
> nobody has complained, maybe nobody uses blk_rq_map_user_iov
> interface).
> 
> This patchset adds padding support to blk_rq_map_user_iov. I converted
> bio_copy_user to bio_copy_user_iov, which uses temporary kernel
> buffers. blk_rq_map_user_iov uses bio_copy_user_iov when a low
> level driver needs padding or a buffer in sg_iovec isn't aligned. We
> can safely do padding in blk_rq_map_sg.
> 
> In the long run, I want to integrate several mapping APIs for PC
> commands (and new API should be useful for sg/st/osst) but I need more
> time to finish that work.
> 
> This is against the latest Linus tree. Can we merge this after 2.6.25?

Thanks Tomo, this looks good to me now. I'll queue it up.

-- 
Jens Axboe



* Re: [PATCH 0/3] block: adds padding support to blk_rq_map_user_iov
  2008-04-11 10:55 ` Jens Axboe
@ 2008-04-11 14:57   ` FUJITA Tomonori
  2008-04-14 17:55     ` Jens Axboe
  0 siblings, 1 reply; 9+ messages in thread
From: FUJITA Tomonori @ 2008-04-11 14:57 UTC (permalink / raw)
  To: jens.axboe; +Cc: fujita.tomonori, linux-scsi, htejun, michaelc, James.Bottomley

On Fri, 11 Apr 2008 12:55:18 +0200
Jens Axboe <jens.axboe@oracle.com> wrote:

> On Thu, Apr 10 2008, FUJITA Tomonori wrote:
> > As discussed [*1], blk_rq_map_user_iov path is broken regarding
> > padding at the moment. In 2.6.24, libata did padding but libata's
> > padding code was removed and now libata expects the block layer to do
> > that.
> > 
> > blk_rq_map_user does padding but blk_rq_map_user_iov doesn't so
> > blk_rq_map_user_iov doesn't work in case libata needs padding (so far
> > nobody has complained, maybe nobody uses blk_rq_map_user_iov
> > interface).
> > 
> > This patchset adds padding support to blk_rq_map_user_iov. I converted
> > bio_copy_user to bio_copy_user_iov, which uses temporary kernel
> > buffers. blk_rq_map_user_iov uses bio_copy_user_iov when a low
> > level driver needs padding or a buffer in sg_iovec isn't aligned. We
> > can safely do padding in blk_rq_map_sg.
> > 
> > In the long run, I want to integrate several mapping APIs for PC
> > commands (and new API should be useful for sg/st/osst) but I need more
> > time to finish that work.
> > 
> > This is against the latest Linus tree. Can we merge this after 2.6.25?
> 
> Thanks Tomo, this looks good to me now. I'll queue it up.

My pleasure!

BTW, what do you think about a patch to change blk_rq_unmap_user to
take a request instead of a bio?

http://marc.info/?l=linux-scsi&m=120716475703416&w=2

I know that you are unwilling to add something to struct request (a
pointer to the bio in this case), but symmetrical mapping APIs are
handier (the patch also cleans up the users of the mapping APIs like
bsg) and less error prone...


* Re: [PATCH 0/3] block: adds padding support to blk_rq_map_user_iov
  2008-04-11 14:57   ` FUJITA Tomonori
@ 2008-04-14 17:55     ` Jens Axboe
  0 siblings, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2008-04-14 17:55 UTC (permalink / raw)
  To: FUJITA Tomonori; +Cc: linux-scsi, htejun, michaelc, James.Bottomley

On Fri, Apr 11 2008, FUJITA Tomonori wrote:
> On Fri, 11 Apr 2008 12:55:18 +0200
> Jens Axboe <jens.axboe@oracle.com> wrote:
> 
> > On Thu, Apr 10 2008, FUJITA Tomonori wrote:
> > > As discussed [*1], blk_rq_map_user_iov path is broken regarding
> > > padding at the moment. In 2.6.24, libata did padding but libata's
> > > padding code was removed and now libata expects the block layer to do
> > > that.
> > > 
> > > blk_rq_map_user does padding but blk_rq_map_user_iov doesn't so
> > > blk_rq_map_user_iov doesn't work in case libata needs padding (so far
> > > nobody has complained, maybe nobody uses blk_rq_map_user_iov
> > > interface).
> > > 
> > > This patchset adds padding support to blk_rq_map_user_iov. I converted
> > > bio_copy_user to bio_copy_user_iov, which uses temporary kernel
> > > buffers. blk_rq_map_user_iov uses bio_copy_user_iov when a low
> > > level driver needs padding or a buffer in sg_iovec isn't aligned. We
> > > can safely do padding in blk_rq_map_sg.
> > > 
> > > In the long run, I want to integrate several mapping APIs for PC
> > > commands (and new API should be useful for sg/st/osst) but I need more
> > > time to finish that work.
> > > 
> > > This is against the latest Linus tree. Can we merge this after 2.6.25?
> > 
> > Thanks Tomo, this looks good to me now. I'll queue it up.
> 
> My pleasure!
> 
> BTW, what do you think about a patch to change blk_rq_unmap_user to
> take a request instead of a bio?
> 
> http://marc.info/?l=linux-scsi&m=120716475703416&w=2
> 
> I know that you are unwilling to add somthing to struct request (a
> pointer to bio in this case) but symmetrical mapping APIs are more
> handy (the patch also cleans ups the users of the mapping APIs like
> bsg) and less error prone...

I like symmetric APIs as much as the next guy, but it's not really that
ugly in this case I think. As such I'd be willing to take such a patch
only if it cleans up drivers considerably - that is, they have to go
through lengths to hang on to the bio to pass in for unmapping. Your
patch is just borderline I'd say :-)

-- 
Jens Axboe


