* reserve mem for REQ_TYPE_BLOCK_PC reqs and convert sg to block/bio helpers
@ 2007-10-20 5:44 michaelc
2007-10-20 5:44 ` [PATCH 01/10] use separate bioset for REQ_TYPE_BLOCK_PC michaelc
0 siblings, 1 reply; 11+ messages in thread
From: michaelc @ 2007-10-20 5:44 UTC (permalink / raw)
To: linux-scsi, dm-devel, jens.axboe
This patchset has two goals. The first is to be able to reserve
memory (bio, bio_vec and pages) for REQ_TYPE_BLOCK_PC commands.
This is needed for multipath, where we need to make forward
progress when failing over a device, and it is needed when all
paths have failed and we need userspace or the scsi layer
to send IO. Userspace multipathd does path testing, and
userspace and the scsi layer send the commands needed to set up
new devices (if dev_loss_tmo fires and devices are removed, we need to
be able to re-add them).
The second goal is to convert sg, st and osst to the blk and bio
layer helpers for mapping and copying data.
I am not done testing the patchset. The first two patches are ok
and are pretty safe since they only add biosets to the bio layer.
0005-have-block-scsi_ioctl-user-GFP_NOIO.patch,
0006-use-GFP_NOIO-in-dm-rdac.patch and
0007-fix-blk_rq_map_user_iov-bounce-code.patch
are also pretty safe, but they are built on top of the other patches
because I noticed the problems while testing my patches; I can rediff
them if needed.
Since the last time I sent the patches, I killed the bio_reserve_buffer
junk and replaced it with biosets and a mempool. Before I finish
testing and start on st, I wanted to get comments, since I am pretty
happy with the patches overall. I want to figure out better names for
the functions, and maybe merge some other functions like blk_rq_map_iov
with the old blk_map_user, but I can at least do the latter in another patch.
My only major bug concern is whether blk_copy_user_iov only works for
requests that go to the scsi layer. In the original code, bio_copy_user
copied the bv_len into a second struct, but for requests going
to the scsi layer this is not needed because we always complete
the whole request in one call (__end_that_request_first will never
modify bv_len and do a recalc). I am not done grepping through
the other possible users like IDE.
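To make the intended usage concrete, here is a rough, untested sketch
of how a driver like dm-multipath could reserve its own bioset with a
page pool and use it for a REQ_TYPE_BLOCK_PC command, using the
interfaces added in patches 03 and 04. The function names and pool
sizes below are made up for illustration and are not part of the
patches; error handling is trimmed.

static struct bio_set *pc_bio_set;	/* hypothetical reserved pools */

static int __init reserve_pc_memory(void)
{
	/* 4 bios, 4 biovec pool entries, order-0 pages - sizes made up */
	pc_bio_set = bioset_pagepool_create(4, 4, 0);
	return pc_bio_set ? 0 : -ENOMEM;
}

/* send a SCSI command with a kernel buffer during failover */
static int send_pc_cmd(struct request_queue *q, unsigned char *cdb,
		       unsigned int cdb_len, void *buf, unsigned int len)
{
	struct request *rq;
	int err;

	rq = blk_get_request(q, READ, GFP_NOIO);
	if (!rq)
		return -ENOMEM;
	rq->cmd_type = REQ_TYPE_BLOCK_PC;
	rq->cmd_len = cdb_len;
	memcpy(rq->cmd, cdb, cdb_len);

	/* the bio is allocated from our reserved bioset, not fs_bio_set */
	err = blk_rq_map_kern(pc_bio_set, q, rq, buf, len, GFP_NOIO);
	if (!err)
		err = blk_execute_rq(q, NULL, rq, 0);

	blk_put_request(rq);
	return err;
}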
* [PATCH 01/10] use separate bioset for REQ_TYPE_BLOCK_PC
2007-10-20 5:44 reserve mem for REQ_TYPE_BLOCK_PC reqs and convert sg to block/bio helpers michaelc
@ 2007-10-20 5:44 ` michaelc
2007-10-20 5:44 ` [PATCH 02/10] rm block device arg from bio map user functions michaelc
0 siblings, 1 reply; 11+ messages in thread
From: michaelc @ 2007-10-20 5:44 UTC (permalink / raw)
To: linux-scsi, dm-devel, jens.axboe; +Cc: Mike Christie
From: Mike Christie <michaelc@cs.wisc.edu>
If we are failing over a device, or trying to do sg io to add a path to a
device where all paths have failed, we do not want to allocate bios from the
same bioset as the FS above it, because the device could have been internally
queueing IO while there were no paths and the fs bioset could be depleted.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
---
fs/bio.c | 22 +++++++++++++++++++---
1 files changed, 19 insertions(+), 3 deletions(-)
diff --git a/fs/bio.c b/fs/bio.c
index d59ddbf..0781e65 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -75,6 +75,10 @@ struct bio_set {
* IO code that does not need private memory pools.
*/
static struct bio_set *fs_bio_set;
+/*
+ * blk_bio_set is for block layer commands like REQ_TYPE_BLOCK_PC
+ */
+static struct bio_set *blk_bio_set;
static inline struct bio_vec *bvec_alloc_bs(gfp_t gfp_mask, int nr, unsigned long *idx, struct bio_set *bs)
{
@@ -128,6 +132,11 @@ static void bio_fs_destructor(struct bio *bio)
bio_free(bio, fs_bio_set);
}
+static void bio_blk_destructor(struct bio *bio)
+{
+ bio_free(bio, blk_bio_set);
+}
+
void bio_init(struct bio *bio)
{
memset(bio, 0, sizeof(*bio));
@@ -532,11 +541,12 @@ struct bio *bio_copy_user(struct request_queue *q, unsigned long uaddr,
bmd->userptr = (void __user *) uaddr;
ret = -ENOMEM;
- bio = bio_alloc(GFP_KERNEL, end - start);
+ bio = bio_alloc_bioset(GFP_KERNEL, end - start, blk_bio_set);
if (!bio)
goto out_bmd;
bio->bi_rw |= (!write_to_vm << BIO_RW);
+ bio->bi_destructor = bio_blk_destructor;
ret = 0;
while (len) {
@@ -620,9 +630,10 @@ static struct bio *__bio_map_user_iov(struct request_queue *q,
if (!nr_pages)
return ERR_PTR(-EINVAL);
- bio = bio_alloc(GFP_KERNEL, nr_pages);
+ bio = bio_alloc_bioset(GFP_KERNEL, nr_pages, blk_bio_set);
if (!bio)
return ERR_PTR(-ENOMEM);
+ bio->bi_destructor = bio_blk_destructor;
ret = -ENOMEM;
pages = kcalloc(nr_pages, sizeof(struct page *), GFP_KERNEL);
@@ -805,9 +816,10 @@ static struct bio *__bio_map_kern(struct request_queue *q, void *data,
int offset, i;
struct bio *bio;
- bio = bio_alloc(gfp_mask, nr_pages);
+ bio = bio_alloc_bioset(gfp_mask, nr_pages, blk_bio_set);
if (!bio)
return ERR_PTR(-ENOMEM);
+ bio->bi_destructor = bio_blk_destructor;
offset = offset_in_page(kaddr);
for (i = 0; i < nr_pages; i++) {
@@ -1172,6 +1184,10 @@ static int __init init_bio(void)
if (!fs_bio_set)
panic("bio: can't allocate bios\n");
+ blk_bio_set = bioset_create(BIO_POOL_SIZE, 2);
+ if (!blk_bio_set)
+ panic("Failed to create blk_bio_set");
+
bio_split_pool = mempool_create_kmalloc_pool(BIO_SPLIT_ENTRIES,
sizeof(struct bio_pair));
if (!bio_split_pool)
--
1.5.1.2
* [PATCH 02/10] rm block device arg from bio map user functions
2007-10-20 5:44 ` [PATCH 01/10] use separate bioset for REQ_TYPE_BLOCK_PC michaelc
@ 2007-10-20 5:44 ` michaelc
2007-10-20 5:44 ` [PATCH 03/10] Extend bio_sets to pool pages for bios in sets michaelc
0 siblings, 1 reply; 11+ messages in thread
From: michaelc @ 2007-10-20 5:44 UTC (permalink / raw)
To: linux-scsi, dm-devel, jens.axboe; +Cc: Mike Christie
From: Mike Christie <michaelc@cs.wisc.edu>
Everyone is passing in NULL, so let's just drop the
block device argument from the bio mapping functions.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
---
block/ll_rw_blk.c | 4 ++--
fs/bio.c | 17 ++++++-----------
include/linux/bio.h | 5 ++---
3 files changed, 10 insertions(+), 16 deletions(-)
diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index 3935469..7c90e9b 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -2383,7 +2383,7 @@ static int __blk_rq_map_user(struct request_queue *q, struct request *rq,
*/
uaddr = (unsigned long) ubuf;
if (!(uaddr & queue_dma_alignment(q)) && !(len & queue_dma_alignment(q)))
- bio = bio_map_user(q, NULL, uaddr, len, reading);
+ bio = bio_map_user(q, uaddr, len, reading);
else
bio = bio_copy_user(q, uaddr, len, reading);
@@ -2508,7 +2508,7 @@ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
/* we don't allow misaligned data like bio_map_user() does. If the
* user is using sg, they're expected to know the alignment constraints
* and respect them accordingly */
- bio = bio_map_user_iov(q, NULL, iov, iov_count, rq_data_dir(rq)== READ);
+ bio = bio_map_user_iov(q, iov, iov_count, rq_data_dir(rq)== READ);
if (IS_ERR(bio))
return PTR_ERR(bio);
diff --git a/fs/bio.c b/fs/bio.c
index 0781e65..f85139a 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -602,7 +602,6 @@ out_bmd:
}
static struct bio *__bio_map_user_iov(struct request_queue *q,
- struct block_device *bdev,
struct sg_iovec *iov, int iov_count,
int write_to_vm)
{
@@ -696,7 +695,6 @@ static struct bio *__bio_map_user_iov(struct request_queue *q,
if (!write_to_vm)
bio->bi_rw |= (1 << BIO_RW);
- bio->bi_bdev = bdev;
bio->bi_flags |= (1 << BIO_USER_MAPPED);
return bio;
@@ -715,7 +713,6 @@ static struct bio *__bio_map_user_iov(struct request_queue *q,
/**
* bio_map_user - map user address into bio
* @q: the struct request_queue for the bio
- * @bdev: destination block device
* @uaddr: start of user address
* @len: length in bytes
* @write_to_vm: bool indicating writing to pages or not
@@ -723,21 +720,20 @@ static struct bio *__bio_map_user_iov(struct request_queue *q,
* Map the user space address into a bio suitable for io to a block
* device. Returns an error pointer in case of error.
*/
-struct bio *bio_map_user(struct request_queue *q, struct block_device *bdev,
- unsigned long uaddr, unsigned int len, int write_to_vm)
+struct bio *bio_map_user(struct request_queue *q, unsigned long uaddr,
+ unsigned int len, int write_to_vm)
{
struct sg_iovec iov;
iov.iov_base = (void __user *)uaddr;
iov.iov_len = len;
- return bio_map_user_iov(q, bdev, &iov, 1, write_to_vm);
+ return bio_map_user_iov(q, &iov, 1, write_to_vm);
}
/**
* bio_map_user_iov - map user sg_iovec table into bio
* @q: the struct request_queue for the bio
- * @bdev: destination block device
* @iov: the iovec.
* @iov_count: number of elements in the iovec
* @write_to_vm: bool indicating writing to pages or not
@@ -745,13 +741,12 @@ struct bio *bio_map_user(struct request_queue *q, struct block_device *bdev,
* Map the user space address into a bio suitable for io to a block
* device. Returns an error pointer in case of error.
*/
-struct bio *bio_map_user_iov(struct request_queue *q, struct block_device *bdev,
- struct sg_iovec *iov, int iov_count,
- int write_to_vm)
+struct bio *bio_map_user_iov(struct request_queue *q, struct sg_iovec *iov,
+ int iov_count, int write_to_vm)
{
struct bio *bio;
- bio = __bio_map_user_iov(q, bdev, iov, iov_count, write_to_vm);
+ bio = __bio_map_user_iov(q, iov, iov_count, write_to_vm);
if (IS_ERR(bio))
return bio;
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 4da4413..b76eb77 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -315,11 +315,10 @@ extern int bio_add_page(struct bio *, struct page *, unsigned int,unsigned int);
extern int bio_add_pc_page(struct request_queue *, struct bio *, struct page *,
unsigned int, unsigned int);
extern int bio_get_nr_vecs(struct block_device *);
-extern struct bio *bio_map_user(struct request_queue *, struct block_device *,
- unsigned long, unsigned int, int);
+extern struct bio *bio_map_user(struct request_queue *, unsigned long,
+ unsigned int, int);
struct sg_iovec;
extern struct bio *bio_map_user_iov(struct request_queue *,
- struct block_device *,
struct sg_iovec *, int, int);
extern void bio_unmap_user(struct bio *);
extern struct bio *bio_map_kern(struct request_queue *, void *, unsigned int,
--
1.5.1.2
* [PATCH 03/10] Extend bio_sets to pool pages for bios in sets.
2007-10-20 5:44 ` [PATCH 02/10] rm block device arg from bio map user functions michaelc
@ 2007-10-20 5:44 ` michaelc
2007-10-20 5:44 ` [PATCH 04/10] convert blk_rq_map helpers to use bioset's page pool helper michaelc
0 siblings, 1 reply; 11+ messages in thread
From: michaelc @ 2007-10-20 5:44 UTC (permalink / raw)
To: linux-scsi, dm-devel, jens.axboe; +Cc: Mike Christie
From: Mike Christie <michaelc@cs.wisc.edu>
If we have REQ_BLOCK_PC commands that transfer data, like an inquiry,
a read of sector 0, or a mode page, then we need to make sure that we
can allocate memory for the data.
sg and st have implemented their own reserve buffers. This patch
adds a mempool of pages onto the bio_set which serves the same
purpose. In later patches the block layer sg code and sg/st will
be converted to use this.
This patch just adds the infrastructure. The next patches will
convert the code to the new functions.
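For reference, here is a rough sketch of how the new page-pool
interfaces are meant to be used (this caller is not part of the patch,
which keeps the new code behind #if 0 until the next patch switches
callers over; the pool sizes are made up and error handling is trimmed):

static int example_pc_read(struct request_queue *q, unsigned int len)
{
	struct bio_set *bs;
	struct bio *bio;

	/* 8 bios, 8 biovec pool entries, order-0 pages - example sizes */
	bs = bioset_pagepool_create(8, 8, 0);
	if (!bs)
		return -ENOMEM;

	/* bio plus pooled pages big enough for len bytes; write_to_vm = 1
	 * because the device will write into the pages (a read) */
	bio = bioset_add_pages(q, bs, len, 1, GFP_NOIO);
	if (IS_ERR(bio)) {
		bioset_pagepool_free(bs);
		return PTR_ERR(bio);
	}

	/* ... submit the bio, wait, copy the data out of its pages ... */

	bioset_free_pages(bs, bio);	/* return pages to the pool, put bio */
	bioset_pagepool_free(bs);
	return 0;
}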
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
---
fs/bio.c | 144 +++++++++++++++++++++++++++++++++++++++++++++++++++
include/linux/bio.h | 2 +
2 files changed, 146 insertions(+), 0 deletions(-)
diff --git a/fs/bio.c b/fs/bio.c
index f85139a..1e8db03 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -68,6 +68,8 @@ static struct biovec_slab bvec_slabs[BIOVEC_NR_POOLS] __read_mostly = {
struct bio_set {
mempool_t *bio_pool;
mempool_t *bvec_pools[BIOVEC_NR_POOLS];
+ mempool_t *page_pool;
+ int page_pool_order;
};
/*
@@ -184,6 +186,118 @@ out:
return bio;
}
+#if 0
+This #if is just to break up the patchset, make it easier to read
+and git bisectable.
+
+This patch extends biosets to have page pools. The next patch will replace
+bio_copy_user and friends with the the bioset version added below.
+
+struct bio_map_vec {
+ struct page *page;
+ unsigned int len;
+ void __user *userptr;
+};
+
+struct bio_map_data {
+ struct bio_map_vec *iovecs;
+ int nr_vecs;
+};
+
+static void bio_free_map_data(struct bio_map_data *bmd)
+{
+ kfree(bmd->iovecs);
+ kfree(bmd);
+}
+
+static struct bio_map_data *bio_alloc_map_data(int nr_segs)
+{
+ struct bio_map_data *bmd = kzalloc(sizeof(*bmd), GFP_KERNEL);
+
+ if (!bmd)
+ return NULL;
+
+ bmd->iovecs = kmalloc(sizeof(struct bio_map_vec) * nr_segs, GFP_KERNEL);
+ if (bmd->iovecs)
+ return bmd;
+
+ kfree(bmd);
+ return NULL;
+}
+
+
+void bioset_free_pages(struct bio_set *bs, struct bio *bio)
+{
+ struct bio_map_data *bmd = bio->bi_private;
+ int i;
+
+ for (i = 0; i < bmd->nr_vecs; i++)
+ mempool_free(bmd->iovecs[i].page, bs->page_pool);
+ bio_free_map_data(bmd);
+ bio_put(bio);
+}
+
+struct bio *bioset_add_pages(struct request_queue *q, struct bio_set *bs,
+ unsigned int len, int write_to_vm, gfp_t gfp_mask)
+{
+ int nr_pages = (len + PAGE_SIZE - 1) >> PAGE_SHIFT;
+ struct bio_map_data *bmd;
+ struct page *page;
+ struct bio *bio;
+ int i = 0, ret;
+
+ bmd = bio_alloc_map_data(nr_pages);
+ if (!bmd)
+ return ERR_PTR(-ENOMEM);
+
+ ret = -ENOMEM;
+ bio = bio_alloc_bioset(gfp_mask, nr_pages, bs);
+ if (!bio)
+ goto out_bmd;
+ bio->bi_rw |= (!write_to_vm << BIO_RW);
+
+ ret = 0;
+ while (len) {
+ unsigned add_len;
+
+ page = mempool_alloc(bs->page_pool, q->bounce_gfp | gfp_mask);
+ if (!page)
+ goto cleanup;
+
+ bmd->nr_vecs++;
+ bmd->iovecs[i].page = page;
+ bmd->iovecs[i].len = 0;
+
+ add_len = min_t(unsigned int,
+ (1 << bs->page_pool_order) << PAGE_SHIFT, len);
+ while (add_len) {
+ unsigned int added, bytes = PAGE_SIZE;
+
+ if (bytes > add_len)
+ bytes = add_len;
+
+ added = bio_add_pc_page(q, bio, page++, bytes, 0);
+ bmd->iovecs[i].len += added;
+ if (added < bytes)
+ break;
+ add_len -= bytes;
+ len -= bytes;
+ }
+ i++;
+ }
+
+ bio->bi_private = bmd;
+ return bio;
+
+cleanup:
+ bioset_free_pages(bs, bio);
+ bio_free(bio, bs);
+out_bmd:
+ bio_free_map_data(bmd);
+ return ERR_PTR(ret);
+}
+#endif
+
struct bio *bio_alloc(gfp_t gfp_mask, int nr_iovecs)
{
struct bio *bio = bio_alloc_bioset(gfp_mask, nr_iovecs, fs_bio_set);
@@ -1155,6 +1269,34 @@ bad:
return NULL;
}
+void bioset_pagepool_free(struct bio_set *bs)
+{
+ bioset_free(bs);
+
+ if (bs->page_pool)
+ mempool_destroy(bs->page_pool);
+}
+
+struct bio_set *bioset_pagepool_create(int bio_pool_size, int bvec_pool_size,
+ int order)
+{
+ struct bio_set *bs = bioset_create(bio_pool_size, bvec_pool_size);
+
+ if (!bs)
+ return NULL;
+
+ bs->page_pool = mempool_create_page_pool(bio_pool_size, order);
+ if (!bs->page_pool)
+ goto free_bioset;
+
+ bs->page_pool_order = order;
+ return bs;
+
+free_bioset:
+ bioset_free(bs);
+ return NULL;
+}
+
static void __init biovec_init_slabs(void)
{
int i;
@@ -1212,5 +1354,7 @@ EXPORT_SYMBOL(bio_split_pool);
EXPORT_SYMBOL(bio_copy_user);
EXPORT_SYMBOL(bio_uncopy_user);
EXPORT_SYMBOL(bioset_create);
+EXPORT_SYMBOL(bioset_pagepool_create);
EXPORT_SYMBOL(bioset_free);
+EXPORT_SYMBOL(bioset_pagepool_free);
EXPORT_SYMBOL(bio_alloc_bioset);
diff --git a/include/linux/bio.h b/include/linux/bio.h
index b76eb77..2d28c3b 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -294,7 +294,9 @@ extern mempool_t *bio_split_pool;
extern void bio_pair_release(struct bio_pair *dbio);
extern struct bio_set *bioset_create(int, int);
+extern struct bio_set *bioset_pagepool_create(int, int, int);
extern void bioset_free(struct bio_set *);
+extern void bioset_pagepool_free(struct bio_set *);
extern struct bio *bio_alloc(gfp_t, int);
extern struct bio *bio_alloc_bioset(gfp_t, int, struct bio_set *);
--
1.5.1.2
* [PATCH 04/10] convert blk_rq_map helpers to use bioset's page pool helper
2007-10-20 5:44 ` [PATCH 03/10] Extend bio_sets to pool pages for bios in sets michaelc
@ 2007-10-20 5:44 ` michaelc
2007-10-20 5:44 ` [PATCH 05/10] have block/scsi_ioctl use GFP_NOIO michaelc
0 siblings, 1 reply; 11+ messages in thread
From: michaelc @ 2007-10-20 5:44 UTC (permalink / raw)
To: linux-scsi, dm-devel, jens.axboe; +Cc: Mike Christie
From: Mike Christie <michaelc@cs.wisc.edu>
This patch converts the blk_rq map helpers to use the page pool
helpers instead of bio_copy_user. bio_copy/uncopy_user used to allocate
the bio, set up the biovecs, allocate pages and copy/uncopy the data.
Now the bioset page pool helper only takes care of allocating the bio
and pages and setting up the biovecs. The data transfer is done by a
new blk helper, blk_copy_user_iov. This separation of the bio
allocation/setup and the copy/uncopy of data will be useful for mmap
support later in the patchset.
Also with this patch, we rename blk_rq_map_user to blk_rq_setup_transfer
to make it clear that it does not necessarily map the data, and because
in the future sg will want to control whether it maps the data or
copies it.
This patch is a little larger because it also converts the users
of blk_rq_map_user to the new API.
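As a quick orientation before the diff, here is the shape of a
converted caller, modeled on the cdrom and scsi_ioctl changes below
but simplified and not itself part of the patch:

/* simplified, illustrative caller - see the real conversions in the diff */
static int do_pc_io(struct request_queue *q, void __user *ubuf,
		    unsigned long len)
{
	struct request *rq;
	struct bio *bio;
	int err;

	rq = blk_get_request(q, READ, GFP_NOIO);
	if (!rq)
		return -ENOMEM;
	rq->cmd_type = REQ_TYPE_BLOCK_PC;

	/* try to map the user buffer; fall back to copying via the
	 * bioset page pool if the mapping is not possible */
	err = blk_rq_setup_transfer(NULL, rq, ubuf, len, GFP_NOIO);
	if (err)
		goto out;

	bio = rq->bio;		/* save it: completion may change rq->bio */
	err = blk_execute_rq(q, NULL, rq, 0);

	/* unmaps a mapped buffer, or copies a bounced one back to ubuf */
	if (blk_rq_complete_transfer(bio, ubuf, len))
		err = -EFAULT;
out:
	blk_put_request(rq);
	return err;
}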
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
---
block/bsg.c | 23 ++-
block/ll_rw_blk.c | 370 ++++++++++++++++++++++++++++++++-----------
block/scsi_ioctl.c | 11 +-
drivers/block/pktcdvd.c | 3 +-
drivers/cdrom/cdrom.c | 4 +-
drivers/md/dm-mpath-rdac.c | 3 +-
drivers/scsi/scsi_lib.c | 4 +-
drivers/scsi/scsi_tgt_lib.c | 5 +-
fs/bio.c | 263 ++++++++-----------------------
include/linux/bio.h | 16 +-
include/linux/blkdev.h | 18 ++-
11 files changed, 396 insertions(+), 324 deletions(-)
diff --git a/block/bsg.c b/block/bsg.c
index 8e181ab..5ff02fa 100644
--- a/block/bsg.c
+++ b/block/bsg.c
@@ -281,7 +281,9 @@ bsg_map_hdr(struct bsg_device *bd, struct sg_io_v4 *hdr)
rq->next_rq = next_rq;
dxferp = (void*)(unsigned long)hdr->din_xferp;
- ret = blk_rq_map_user(q, next_rq, dxferp, hdr->din_xfer_len);
+ dxfer_len = hdr->din_xfer_len;
+ ret = blk_rq_setup_transfer(NULL, next_rq, dxferp, dxfer_len,
+ GFP_KERNEL);
if (ret)
goto out;
}
@@ -296,7 +298,8 @@ bsg_map_hdr(struct bsg_device *bd, struct sg_io_v4 *hdr)
dxfer_len = 0;
if (dxfer_len) {
- ret = blk_rq_map_user(q, rq, dxferp, dxfer_len);
+ ret = blk_rq_setup_transfer(NULL, rq, dxferp, dxfer_len,
+ GFP_KERNEL);
if (ret)
goto out;
}
@@ -304,7 +307,7 @@ bsg_map_hdr(struct bsg_device *bd, struct sg_io_v4 *hdr)
out:
blk_put_request(rq);
if (next_rq) {
- blk_rq_unmap_user(next_rq->bio);
+ blk_rq_complete_transfer(next_rq->bio, dxferp, dxfer_len);
blk_put_request(next_rq);
}
return ERR_PTR(ret);
@@ -409,6 +412,8 @@ static struct bsg_command *bsg_get_done_cmd(struct bsg_device *bd)
static int blk_complete_sgv4_hdr_rq(struct request *rq, struct sg_io_v4 *hdr,
struct bio *bio, struct bio *bidi_bio)
{
+ unsigned int dxfer_len = 0;
+ void *dxferp = NULL;
int ret = 0;
dprintk("rq %p bio %p %u\n", rq, bio, rq->errors);
@@ -438,14 +443,18 @@ static int blk_complete_sgv4_hdr_rq(struct request *rq, struct sg_io_v4 *hdr,
if (rq->next_rq) {
hdr->dout_resid = rq->data_len;
hdr->din_resid = rq->next_rq->data_len;
- blk_rq_unmap_user(bidi_bio);
+ blk_rq_complete_transfer(bidi_bio,
+ (void __user *)hdr->din_xferp,
+ hdr->din_xfer_len);
blk_put_request(rq->next_rq);
- } else if (rq_data_dir(rq) == READ)
+ } else if (rq_data_dir(rq) == READ) {
hdr->din_resid = rq->data_len;
- else
+ dxfer_len = hdr->din_xfer_len;
+ dxferp = (void*)(unsigned long)hdr->din_xferp;
+ } else
hdr->dout_resid = rq->data_len;
- blk_rq_unmap_user(bio);
+ blk_rq_complete_transfer(bio, dxferp, dxfer_len);
blk_put_request(rq);
return ret;
diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index 7c90e9b..fad17de 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -36,6 +36,10 @@
* for max sense size
*/
#include <scsi/scsi_cmnd.h>
+/*
+ * for struct sg_iovc
+ */
+#include <scsi/sg.h>
static void blk_unplug_work(struct work_struct *work);
static void blk_unplug_timeout(unsigned long data);
@@ -2337,20 +2341,6 @@ void blk_insert_request(struct request_queue *q, struct request *rq,
EXPORT_SYMBOL(blk_insert_request);
-static int __blk_rq_unmap_user(struct bio *bio)
-{
- int ret = 0;
-
- if (bio) {
- if (bio_flagged(bio, BIO_USER_MAPPED))
- bio_unmap_user(bio);
- else
- ret = bio_uncopy_user(bio);
- }
-
- return ret;
-}
-
int blk_rq_append_bio(struct request_queue *q, struct request *rq,
struct bio *bio)
{
@@ -2368,25 +2358,64 @@ int blk_rq_append_bio(struct request_queue *q, struct request *rq,
}
EXPORT_SYMBOL(blk_rq_append_bio);
-static int __blk_rq_map_user(struct request_queue *q, struct request *rq,
- void __user *ubuf, unsigned int len)
+static void __blk_rq_destroy_buffer(struct bio *bio)
{
- unsigned long uaddr;
+ if (!bio)
+ return;
+
+ if (bio_flagged(bio, BIO_USER_MAPPED))
+ bio_unmap_user(bio);
+ else
+ bioset_free_pages(bio);
+}
+
+void blk_rq_destroy_buffer(struct bio *bio)
+{
+ struct bio *mapped_bio;
+
+ while (bio) {
+ mapped_bio = bio;
+ if (unlikely(bio_flagged(bio, BIO_BOUNCED)))
+ mapped_bio = bio->bi_private;
+
+ __blk_rq_destroy_buffer(mapped_bio);
+ mapped_bio = bio;
+ bio = bio->bi_next;
+ bio_put(mapped_bio);
+ }
+}
+EXPORT_SYMBOL(blk_rq_destroy_buffer);
+
+static int __blk_rq_setup_buffer(struct bio_set *bs, struct request *rq,
+ void __user *ubuf, unsigned int len,
+ gfp_t gfp_mask)
+{
+ struct request_queue *q = rq->q;
struct bio *bio, *orig_bio;
int reading, ret;
reading = rq_data_dir(rq) == READ;
- /*
- * if alignment requirement is satisfied, map in user pages for
- * direct dma. else, set up kernel bounce buffers
- */
- uaddr = (unsigned long) ubuf;
- if (!(uaddr & queue_dma_alignment(q)) && !(len & queue_dma_alignment(q)))
- bio = bio_map_user(q, uaddr, len, reading);
- else
- bio = bio_copy_user(q, uaddr, len, reading);
+ if (ubuf) {
+ unsigned long map_len, end, start;
+ map_len = min_t(unsigned long, len, BIO_MAX_SIZE);
+ end = ((unsigned long)ubuf + map_len + PAGE_SIZE - 1)
+ >> PAGE_SHIFT;
+ start = (unsigned long)ubuf >> PAGE_SHIFT;
+ /*
+ * A bad offset could cause us to require BIO_MAX_PAGES + 1
+ * pages. If this happens we just lower the requested
+ * mapping len by a page so that we can fit
+ */
+ if (end - start > BIO_MAX_PAGES)
+ map_len -= PAGE_SIZE;
+
+ bio = bio_map_user(q, bs, (unsigned long)ubuf, map_len,
+ reading, gfp_mask);
+ } else
+ bio = bioset_add_pages(q, bs, len, reading,
+ gfp_mask);
if (IS_ERR(bio))
return PTR_ERR(bio);
@@ -2405,100 +2434,249 @@ static int __blk_rq_map_user(struct request_queue *q, struct request *rq,
/* if it was boucned we must call the end io function */
bio_endio(bio, 0);
- __blk_rq_unmap_user(orig_bio);
+ __blk_rq_destroy_buffer(orig_bio);
bio_put(bio);
return ret;
}
/**
- * blk_rq_map_user - map user data to a request, for REQ_BLOCK_PC usage
- * @q: request queue where request should be inserted
+ * blk_rq_setup_buffer - setup buffer to bio mappings
+ * @bs: optional bio set
* @rq: request structure to fill
- * @ubuf: the user buffer
+ * @ubuf: the user buffer (required for map)
* @len: length of user data
+ * @gfp_mask: gfp flags to use for bio allocations
*
* Description:
* Data will be mapped directly for zero copy io, if possible. Otherwise
- * a kernel bounce buffer is used.
+ * a kernel bounce buffer is used. Callers should only use this function
+ * if they know that the map will always be successful or they are
+ * prepared to handle the copy part of the operation.
*
- * A matching blk_rq_unmap_user() must be issued at the end of io, while
+ * A matching blk_complete_transfer must be issued at the end of io, while
* still in process context.
*
- * Note: The mapped bio may need to be bounced through blk_queue_bounce()
- * before being submitted to the device, as pages mapped may be out of
- * reach. It's the callers responsibility to make sure this happens. The
- * original bio must be passed back in to blk_rq_unmap_user() for proper
- * unmapping.
+ * Note: The original rq->bio must be passed back in to blk_complete_transfer
+ * for proper unmapping.
*/
-int blk_rq_map_user(struct request_queue *q, struct request *rq,
- void __user *ubuf, unsigned long len)
+int blk_rq_setup_buffer(struct bio_set *bs, struct request *rq,
+ void __user *ubuf, unsigned long len, gfp_t gfp_mask)
{
+ struct request_queue *q = rq->q;
unsigned long bytes_read = 0;
struct bio *bio = NULL;
int ret;
if (len > (q->max_hw_sectors << 9))
return -EINVAL;
- if (!len || !ubuf)
- return -EINVAL;
while (bytes_read != len) {
- unsigned long map_len, end, start;
-
- map_len = min_t(unsigned long, len - bytes_read, BIO_MAX_SIZE);
- end = ((unsigned long)ubuf + map_len + PAGE_SIZE - 1)
- >> PAGE_SHIFT;
- start = (unsigned long)ubuf >> PAGE_SHIFT;
-
- /*
- * A bad offset could cause us to require BIO_MAX_PAGES + 1
- * pages. If this happens we just lower the requested
- * mapping len by a page so that we can fit
- */
- if (end - start > BIO_MAX_PAGES)
- map_len -= PAGE_SIZE;
-
- ret = __blk_rq_map_user(q, rq, ubuf, map_len);
+ ret = __blk_rq_setup_buffer(bs, rq, ubuf, len - bytes_read,
+ gfp_mask);
if (ret < 0)
goto unmap_rq;
if (!bio)
bio = rq->bio;
bytes_read += ret;
- ubuf += ret;
+ if (ubuf)
+ ubuf += ret;
}
rq->buffer = rq->data = NULL;
return 0;
unmap_rq:
- blk_rq_unmap_user(bio);
+ blk_rq_destroy_buffer(bio);
+ rq->bio = NULL;
+ return ret;
+}
+EXPORT_SYMBOL(blk_rq_setup_buffer);
+
+static int blk_copy_user_iov(struct bio *head, struct sg_iovec *iov,
+ int iov_count)
+{
+ unsigned int iov_len = 0;
+ int ret, i = 0, iov_index = 0;
+ struct bio *bio;
+ struct bio_vec *bvec;
+ char __user *p = NULL;
+
+ if (!iov || !iov_count)
+ return 0;
+
+ for (bio = head; bio; bio = bio->bi_next) {
+ bio_for_each_segment(bvec, bio, i) {
+ unsigned int copy_bytes, bvec_offset = 0;
+ char *addr;
+
+continue_from_bvec:
+ addr = page_address(bvec->bv_page) + bvec_offset;
+ if (!p) {
+ if (iov_index == iov_count)
+ /*
+ * caller wanted a buffer larger
+ * than transfer
+ */
+ break;
+
+ p = iov[iov_index].iov_base;
+ iov_len = iov[iov_index].iov_len;
+ if (!p || !iov_len) {
+ iov_index++;
+ p = NULL;
+ /*
+ * got an invalid iov, so just try to
+ * complete what is valid
+ */
+ goto continue_from_bvec;
+ }
+ }
+
+ copy_bytes = min(iov_len, bvec->bv_len - bvec_offset);
+ if (bio_data_dir(head) == READ)
+ ret = copy_to_user(p, addr, copy_bytes);
+ else
+ ret = copy_from_user(addr, p, copy_bytes);
+ if (ret)
+ return -EFAULT;
+
+ bvec_offset += copy_bytes;
+ iov_len -= copy_bytes;
+ if (iov_len == 0) {
+ p = NULL;
+ iov_index++;
+ if (bvec_offset < bvec->bv_len)
+ goto continue_from_bvec;
+ } else
+ p += copy_bytes;
+ }
+ }
+ return 0;
+}
+
+/**
+ * blk_rq_copy_user_iov - copy user data to a request.
+ * @bs: optional bio set
+ * @rq: request structure to fill
+ * @iov: sg iovec
+ * @iov_count: number of elements in the iovec
+ * @len: max length of data (length of buffer)
+ * @gfp_mask: gfp flag for bio allocations
+ *
+ * Description:
+ * This function is for REQ_BLOCK_PC usage.
+ *
+ * A matching blk_rq_uncopy_user_iov() must be issued at the end of io,
+ * while still in process context.
+ *
+ * It's the callers responsibility to make sure this happens. The
+ * original bio must be passed back in to blk_rq_uncopy_user_iov() for
+ * proper unmapping.
+ */
+int blk_rq_copy_user_iov(struct bio_set *bs, struct request *rq,
+ struct sg_iovec *iov, int iov_count,
+ unsigned long len, gfp_t gfp_mask)
+{
+ int ret;
+
+ ret = blk_rq_setup_buffer(bs, rq, NULL, len, gfp_mask);
+ if (ret)
+ return ret;
+
+ if (rq_data_dir(rq) == READ)
+ return 0;
+
+ ret = blk_copy_user_iov(rq->bio, iov, iov_count);
+ if (ret)
+ goto fail;
+ return 0;
+fail:
+ blk_rq_destroy_buffer(rq->bio);
+ return -EFAULT;
+}
+EXPORT_SYMBOL(blk_rq_copy_user_iov);
+
+int blk_rq_uncopy_user_iov(struct bio *bio, struct sg_iovec *iov,
+ int iov_count)
+{
+ int ret = 0;
+
+ if (!bio)
+ return 0;
+
+ if (bio_data_dir(bio) == READ)
+ ret = blk_copy_user_iov(bio, iov, iov_count);
+ blk_rq_destroy_buffer(bio);
return ret;
}
+EXPORT_SYMBOL(blk_rq_uncopy_user_iov);
-EXPORT_SYMBOL(blk_rq_map_user);
+/**
+ * blk_rq_setup_transfer - map or copy user data to a request.
+ * @bs: optional bio set
+ * @rq: request structure to fill
+ * @ubuf: the user buffer
+ * @len: length of user data
+ * @gfp_mask: gfp flag for bio allocations
+ *
+ * Description:
+ * This function is for REQ_BLOCK_PC usage.
+ * Data will be mapped directly for zero copy io, if possible. Otherwise
+ * a kernel bounce buffer is used. This function will try to map data
+ * first and if that is not possible then it will try to setup buffers
+ * to copy the data.
+ *
+ * A matching blk_rq_complete_transfer() must be issued at the end of io,
+ * while still in process context.
+ *
+ * Note: The original bio must be passed back in to
+ * blk_rq_complete_transfer() for proper unmapping.
+ */
+int blk_rq_setup_transfer(struct bio_set *bs, struct request *rq,
+ void __user *ubuf, unsigned long len, gfp_t gfp_mask)
+{
+ int ret;
+
+ if (!ubuf)
+ return -EINVAL;
+
+ ret = blk_rq_setup_buffer(bs, rq, ubuf, len, gfp_mask);
+ if (ret) {
+ struct sg_iovec iov;
+
+ iov.iov_base = ubuf;
+ iov.iov_len = len;
+
+ ret = blk_rq_copy_user_iov(bs, rq, &iov, 1, len, gfp_mask);
+ }
+ return ret;
+}
+EXPORT_SYMBOL(blk_rq_setup_transfer);
/**
* blk_rq_map_user_iov - map user data to a request, for REQ_BLOCK_PC usage
- * @q: request queue where request should be inserted
+ * @bs: optional bio set
* @rq: request to map data to
* @iov: pointer to the iovec
* @iov_count: number of elements in the iovec
* @len: I/O byte count
+ * @gfp_mask: gfp flag for bio allocations
*
* Description:
* Data will be mapped directly for zero copy io, if possible. Otherwise
* a kernel bounce buffer is used.
*
- * A matching blk_rq_unmap_user() must be issued at the end of io, while
- * still in process context.
+ * A matching blk_rq_complete_transfer() must be issued at the end of io,
+ * while still in process context.
*
* Note: The mapped bio may need to be bounced through blk_queue_bounce()
* before being submitted to the device, as pages mapped may be out of
* reach. It's the callers responsibility to make sure this happens. The
- * original bio must be passed back in to blk_rq_unmap_user() for proper
- * unmapping.
+ * original bio must be passed back in to blk_rq_complete_transfer() for
+ * proper unmapping.
*/
-int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
- struct sg_iovec *iov, int iov_count, unsigned int len)
+int blk_rq_map_user_iov(struct bio_set *bs, struct request *rq,
+ struct sg_iovec *iov, int iov_count, unsigned int len,
+ gfp_t gfp_mask)
{
struct bio *bio;
@@ -2508,7 +2686,8 @@ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
/* we don't allow misaligned data like bio_map_user() does. If the
* user is using sg, they're expected to know the alignment constraints
* and respect them accordingly */
- bio = bio_map_user_iov(q, iov, iov_count, rq_data_dir(rq)== READ);
+ bio = bio_map_user_iov(rq->q, bs, iov, iov_count,
+ rq_data_dir(rq)== READ, gfp_mask);
if (IS_ERR(bio))
return PTR_ERR(bio);
@@ -2519,7 +2698,7 @@ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
}
bio_get(bio);
- blk_rq_bio_prep(q, rq, bio);
+ blk_rq_bio_prep(rq->q, rq, bio);
rq->buffer = rq->data = NULL;
return 0;
}
@@ -2527,48 +2706,49 @@ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
EXPORT_SYMBOL(blk_rq_map_user_iov);
/**
- * blk_rq_unmap_user - unmap a request with user data
- * @bio: start of bio list
+ * blk_rq_complete_transfer - unmap a request with user data
+ * @bio: start of bio list
+ * @ubuf: buffer to copy to if needed
+ * @len: number of bytes to copy if needed
*
* Description:
- * Unmap a rq previously mapped by blk_rq_map_user(). The caller must
- * supply the original rq->bio from the blk_rq_map_user() return, since
- * the io completion may have changed rq->bio.
+ * Unmap a rq mapped with blk_rq_init_transfer, blk_rq_map_user_iov,
+ * blk_rq_map_user or blk_rq_copy_user_iov (if copying back to single buf).
+ * The caller must supply the original rq->bio, since the io completion
+ * may have changed rq->bio.
*/
-int blk_rq_unmap_user(struct bio *bio)
+int blk_rq_complete_transfer(struct bio *bio, void __user *ubuf,
+ unsigned long len)
{
- struct bio *mapped_bio;
- int ret = 0, ret2;
-
- while (bio) {
- mapped_bio = bio;
- if (unlikely(bio_flagged(bio, BIO_BOUNCED)))
- mapped_bio = bio->bi_private;
+ struct sg_iovec iov;
+ int ret = 0;
- ret2 = __blk_rq_unmap_user(mapped_bio);
- if (ret2 && !ret)
- ret = ret2;
+ if (!bio)
+ return 0;
- mapped_bio = bio;
- bio = bio->bi_next;
- bio_put(mapped_bio);
+ if (bio_flagged(bio, BIO_USER_MAPPED))
+ blk_rq_destroy_buffer(bio);
+ else {
+ iov.iov_base = ubuf;
+ iov.iov_len = len;
+ ret = blk_rq_uncopy_user_iov(bio, &iov, 1);
}
-
return ret;
}
-
-EXPORT_SYMBOL(blk_rq_unmap_user);
+EXPORT_SYMBOL(blk_rq_complete_transfer);
/**
* blk_rq_map_kern - map kernel data to a request, for REQ_BLOCK_PC usage
+ * @bs: optional bio set
* @q: request queue where request should be inserted
* @rq: request to fill
* @kbuf: the kernel buffer
* @len: length of user data
* @gfp_mask: memory allocation flags
*/
-int blk_rq_map_kern(struct request_queue *q, struct request *rq, void *kbuf,
- unsigned int len, gfp_t gfp_mask)
+int blk_rq_map_kern(struct bio_set *bs, struct request_queue *q,
+ struct request *rq, void *kbuf, unsigned int len,
+ gfp_t gfp_mask)
{
struct bio *bio;
@@ -2577,7 +2757,7 @@ int blk_rq_map_kern(struct request_queue *q, struct request *rq, void *kbuf,
if (!len || !kbuf)
return -EINVAL;
- bio = bio_map_kern(q, kbuf, len, gfp_mask);
+ bio = bio_map_kern(q, bs, kbuf, len, gfp_mask);
if (IS_ERR(bio))
return PTR_ERR(bio);
@@ -2592,7 +2772,7 @@ int blk_rq_map_kern(struct request_queue *q, struct request *rq, void *kbuf,
EXPORT_SYMBOL(blk_rq_map_kern);
-/**
+/*
* blk_execute_rq_nowait - insert a request into queue for execution
* @q: queue to insert the request in
* @bd_disk: matching gendisk
diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index 91c7322..bf97b22 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -245,7 +245,7 @@ static int blk_fill_sghdr_rq(struct request_queue *q, struct request *rq,
*/
static int blk_unmap_sghdr_rq(struct request *rq, struct sg_io_hdr *hdr)
{
- blk_rq_unmap_user(rq->bio);
+ blk_rq_complete_transfer(rq->bio, hdr->dxferp, hdr->dxfer_len);
blk_put_request(rq);
return 0;
}
@@ -343,11 +343,12 @@ static int sg_io(struct file *file, struct request_queue *q,
goto out;
}
- ret = blk_rq_map_user_iov(q, rq, iov, hdr->iovec_count,
- hdr->dxfer_len);
+ ret = blk_rq_map_user_iov(NULL, rq, iov, hdr->iovec_count,
+ hdr->dxfer_len, GFP_KERNEL);
kfree(iov);
} else if (hdr->dxfer_len)
- ret = blk_rq_map_user(q, rq, hdr->dxferp, hdr->dxfer_len);
+ ret = blk_rq_setup_transfer(NULL, rq, hdr->dxferp,
+ hdr->dxfer_len, GFP_KERNEL);
if (ret)
goto out;
@@ -485,7 +486,7 @@ int sg_scsi_ioctl(struct file *file, struct request_queue *q,
break;
}
- if (bytes && blk_rq_map_kern(q, rq, buffer, bytes, __GFP_WAIT)) {
+ if (bytes && blk_rq_map_kern(NULL, q, rq, buffer, bytes, __GFP_WAIT)) {
err = DRIVER_ERROR << 24;
goto out;
}
diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c
index a8130a4..94c307b 100644
--- a/drivers/block/pktcdvd.c
+++ b/drivers/block/pktcdvd.c
@@ -760,7 +760,8 @@ static int pkt_generic_packet(struct pktcdvd_device *pd, struct packet_command *
WRITE : READ, __GFP_WAIT);
if (cgc->buflen) {
- if (blk_rq_map_kern(q, rq, cgc->buffer, cgc->buflen, __GFP_WAIT))
+ if (blk_rq_map_kern(NULL, q, rq, cgc->buffer, cgc->buflen,
+ __GFP_WAIT))
goto out;
}
diff --git a/drivers/cdrom/cdrom.c b/drivers/cdrom/cdrom.c
index 7924571..5a037ff 100644
--- a/drivers/cdrom/cdrom.c
+++ b/drivers/cdrom/cdrom.c
@@ -2122,7 +2122,7 @@ static int cdrom_read_cdda_bpc(struct cdrom_device_info *cdi, __u8 __user *ubuf,
len = nr * CD_FRAMESIZE_RAW;
- ret = blk_rq_map_user(q, rq, ubuf, len);
+ ret = blk_rq_setup_transfer(NULL, rq, ubuf, len, GFP_KERNEL);
if (ret)
break;
@@ -2149,7 +2149,7 @@ static int cdrom_read_cdda_bpc(struct cdrom_device_info *cdi, __u8 __user *ubuf,
cdi->last_sense = s->sense_key;
}
- if (blk_rq_unmap_user(bio))
+ if (blk_rq_complete_transfer(bio, ubuf, len))
ret = -EFAULT;
if (ret)
diff --git a/drivers/md/dm-mpath-rdac.c b/drivers/md/dm-mpath-rdac.c
index 16b1613..9e71e0e 100644
--- a/drivers/md/dm-mpath-rdac.c
+++ b/drivers/md/dm-mpath-rdac.c
@@ -278,7 +278,8 @@ static struct request *get_rdac_req(struct rdac_handler *h,
return NULL;
}
- if (buflen && blk_rq_map_kern(q, rq, buffer, buflen, GFP_KERNEL)) {
+ if (buflen && blk_rq_map_kern(NULL, q, rq, buffer, buflen,
+ GFP_KERNEL)) {
blk_put_request(rq);
DMINFO("get_rdac_req: blk_rq_map_kern failed");
return NULL;
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index aac8a02..c799e98 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -186,7 +186,7 @@ int scsi_execute(struct scsi_device *sdev, const unsigned char *cmd,
req = blk_get_request(sdev->request_queue, write, __GFP_WAIT);
- if (bufflen && blk_rq_map_kern(sdev->request_queue, req,
+ if (bufflen && blk_rq_map_kern(NULL, sdev->request_queue, req,
buffer, bufflen, __GFP_WAIT))
goto out;
@@ -396,7 +396,7 @@ int scsi_execute_async(struct scsi_device *sdev, const unsigned char *cmd,
if (use_sg)
err = scsi_req_map_sg(req, buffer, use_sg, bufflen, gfp);
else if (bufflen)
- err = blk_rq_map_kern(req->q, req, buffer, bufflen, gfp);
+ err = blk_rq_map_kern(NULL, req->q, req, buffer, bufflen, gfp);
if (err)
goto free_req;
diff --git a/drivers/scsi/scsi_tgt_lib.c b/drivers/scsi/scsi_tgt_lib.c
index a91761c..7e0189c 100644
--- a/drivers/scsi/scsi_tgt_lib.c
+++ b/drivers/scsi/scsi_tgt_lib.c
@@ -171,7 +171,7 @@ static void cmd_hashlist_del(struct scsi_cmnd *cmd)
static void scsi_unmap_user_pages(struct scsi_tgt_cmd *tcmd)
{
- blk_rq_unmap_user(tcmd->bio);
+ blk_rq_destroy_buffer(tcmd->bio);
}
static void scsi_tgt_cmd_destroy(struct work_struct *work)
@@ -381,12 +381,11 @@ static int scsi_tgt_init_cmd(struct scsi_cmnd *cmd, gfp_t gfp_mask)
static int scsi_map_user_pages(struct scsi_tgt_cmd *tcmd, struct scsi_cmnd *cmd,
unsigned long uaddr, unsigned int len, int rw)
{
- struct request_queue *q = cmd->request->q;
struct request *rq = cmd->request;
int err;
dprintk("%lx %u\n", uaddr, len);
- err = blk_rq_map_user(q, rq, (void *)uaddr, len);
+ err = blk_rq_setup_buffer(NULL, rq, (void *)uaddr, len, GFP_KERNEL);
if (err) {
/*
* TODO: need to fixup sg_tablesize, max_segment_size,
diff --git a/fs/bio.c b/fs/bio.c
index 1e8db03..df90896 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -136,7 +136,9 @@ static void bio_fs_destructor(struct bio *bio)
static void bio_blk_destructor(struct bio *bio)
{
- bio_free(bio, blk_bio_set);
+ struct bio_set *bs = bio->bi_private;
+
+ bio_free(bio, bs);
}
void bio_init(struct bio *bio)
@@ -186,20 +188,13 @@ out:
return bio;
}
-#if 0
-This #if is just to break up the patchset, make it easier to read
-and git bisectable.
-
-This patch extends biosets to have page pools. The next patch will replace
-bio_copy_user and friends with the the bioset version added below.
-
struct bio_map_vec {
struct page *page;
- unsigned int len;
void __user *userptr;
};
struct bio_map_data {
+ struct bio_set *bs;
struct bio_map_vec *iovecs;
int nr_vecs;
};
@@ -210,14 +205,14 @@ static void bio_free_map_data(struct bio_map_data *bmd)
kfree(bmd);
}
-static struct bio_map_data *bio_alloc_map_data(int nr_segs)
+static struct bio_map_data *bio_alloc_map_data(int nr_segs, gfp_t gfp_mask)
{
- struct bio_map_data *bmd = kzalloc(sizeof(*bmd), GFP_KERNEL);
+ struct bio_map_data *bmd = kzalloc(sizeof(*bmd), gfp_mask);
if (!bmd)
return NULL;
- bmd->iovecs = kmalloc(sizeof(struct bio_map_vec) * nr_segs, GFP_KERNEL);
+ bmd->iovecs = kzalloc(sizeof(struct bio_map_vec) * nr_segs, gfp_mask);
if (bmd->iovecs)
return bmd;
@@ -225,15 +220,28 @@ static struct bio_map_data *bio_alloc_map_data(int nr_segs)
return NULL;
}
+static void bio_bmd_destructor(struct bio *bio)
+{
+ struct bio_map_data *bmd = bio->bi_private;
+ struct bio_set *bs;
+
+ if (!bmd)
+ return;
+ bs = bmd->bs;
+ bio_free_map_data(bmd);
+ bio_free(bio, bs);
+}
-void bioset_free_pages(struct bio_set *bs, struct bio *bio)
+void bioset_free_pages(struct bio *bio)
{
struct bio_map_data *bmd = bio->bi_private;
+ struct bio_set *bs = bmd->bs;
int i;
- for (i = 0; i < bmd->nr_vecs; i++)
- mempool_free(bmd->iovecs[i].page, bs->page_pool);
- bio_free_map_data(bmd);
+ for (i = 0; i < bmd->nr_vecs; i++) {
+ if (bmd->iovecs[i].page)
+ mempool_free(bmd->iovecs[i].page, bs->page_pool);
+ }
bio_put(bio);
}
@@ -246,27 +254,33 @@ struct bio *bioset_add_pages(struct request_queue *q, struct bio_set *bs,
struct bio *bio;
int i = 0, ret;
- bmd = bio_alloc_map_data(nr_pages);
+ bmd = bio_alloc_map_data(nr_pages, gfp_mask);
if (!bmd)
return ERR_PTR(-ENOMEM);
ret = -ENOMEM;
+ if (!bs)
+ bs = blk_bio_set;
bio = bio_alloc_bioset(gfp_mask, nr_pages, bs);
if (!bio)
goto out_bmd;
bio->bi_rw |= (!write_to_vm << BIO_RW);
+ bio->bi_destructor = bio_bmd_destructor;
+ bio->bi_private = bmd;
+ bmd->bs = bs;
ret = 0;
while (len) {
unsigned add_len;
page = mempool_alloc(bs->page_pool, q->bounce_gfp | gfp_mask);
- if (!page)
- goto cleanup;
-
+ if (!page) {
+ ret = -ENOMEM;
+ bioset_free_pages(bio);
+ goto fail;
+ }
bmd->nr_vecs++;
bmd->iovecs[i].page = page;
- bmd->iovecs[i].len = 0;
add_len = min_t(unsigned int,
(1 << bs->page_pool_order) << PAGE_SHIFT, len);
@@ -277,7 +291,6 @@ struct bio *bioset_add_pages(struct request_queue *q, struct bio_set *bs,
bytes = add_len;
added = bio_add_pc_page(q, bio, page++, bytes, 0);
- bmd->iovecs[i].len += added;
if (added < bytes)
break;
add_len -= bytes;
@@ -286,17 +299,13 @@ struct bio *bioset_add_pages(struct request_queue *q, struct bio_set *bs,
i++;
}
- bio->bi_private = bmd;
return bio;
-cleanup:
- bioset_free_pages(bs, bio);
- bio_free(bio, bs);
out_bmd:
bio_free_map_data(bmd);
+fail:
return ERR_PTR(ret);
}
-#endif
struct bio *bio_alloc(gfp_t gfp_mask, int nr_iovecs)
{
@@ -565,159 +574,10 @@ int bio_add_page(struct bio *bio, struct page *page, unsigned int len,
return __bio_add_page(q, bio, page, len, offset, q->max_sectors);
}
-struct bio_map_data {
- struct bio_vec *iovecs;
- void __user *userptr;
-};
-
-static void bio_set_map_data(struct bio_map_data *bmd, struct bio *bio)
-{
- memcpy(bmd->iovecs, bio->bi_io_vec, sizeof(struct bio_vec) * bio->bi_vcnt);
- bio->bi_private = bmd;
-}
-
-static void bio_free_map_data(struct bio_map_data *bmd)
-{
- kfree(bmd->iovecs);
- kfree(bmd);
-}
-
-static struct bio_map_data *bio_alloc_map_data(int nr_segs)
-{
- struct bio_map_data *bmd = kmalloc(sizeof(*bmd), GFP_KERNEL);
-
- if (!bmd)
- return NULL;
-
- bmd->iovecs = kmalloc(sizeof(struct bio_vec) * nr_segs, GFP_KERNEL);
- if (bmd->iovecs)
- return bmd;
-
- kfree(bmd);
- return NULL;
-}
-
-/**
- * bio_uncopy_user - finish previously mapped bio
- * @bio: bio being terminated
- *
- * Free pages allocated from bio_copy_user() and write back data
- * to user space in case of a read.
- */
-int bio_uncopy_user(struct bio *bio)
-{
- struct bio_map_data *bmd = bio->bi_private;
- const int read = bio_data_dir(bio) == READ;
- struct bio_vec *bvec;
- int i, ret = 0;
-
- __bio_for_each_segment(bvec, bio, i, 0) {
- char *addr = page_address(bvec->bv_page);
- unsigned int len = bmd->iovecs[i].bv_len;
-
- if (read && !ret && copy_to_user(bmd->userptr, addr, len))
- ret = -EFAULT;
-
- __free_page(bvec->bv_page);
- bmd->userptr += len;
- }
- bio_free_map_data(bmd);
- bio_put(bio);
- return ret;
-}
-
-/**
- * bio_copy_user - copy user data to bio
- * @q: destination block queue
- * @uaddr: start of user address
- * @len: length in bytes
- * @write_to_vm: bool indicating writing to pages or not
- *
- * Prepares and returns a bio for indirect user io, bouncing data
- * to/from kernel pages as necessary. Must be paired with
- * call bio_uncopy_user() on io completion.
- */
-struct bio *bio_copy_user(struct request_queue *q, unsigned long uaddr,
- unsigned int len, int write_to_vm)
-{
- unsigned long end = (uaddr + len + PAGE_SIZE - 1) >> PAGE_SHIFT;
- unsigned long start = uaddr >> PAGE_SHIFT;
- struct bio_map_data *bmd;
- struct bio_vec *bvec;
- struct page *page;
- struct bio *bio;
- int i, ret;
-
- bmd = bio_alloc_map_data(end - start);
- if (!bmd)
- return ERR_PTR(-ENOMEM);
-
- bmd->userptr = (void __user *) uaddr;
-
- ret = -ENOMEM;
- bio = bio_alloc_bioset(GFP_KERNEL, end - start, blk_bio_set);
- if (!bio)
- goto out_bmd;
-
- bio->bi_rw |= (!write_to_vm << BIO_RW);
- bio->bi_destructor = bio_blk_destructor;
-
- ret = 0;
- while (len) {
- unsigned int bytes = PAGE_SIZE;
-
- if (bytes > len)
- bytes = len;
-
- page = alloc_page(q->bounce_gfp | GFP_KERNEL);
- if (!page) {
- ret = -ENOMEM;
- break;
- }
-
- if (bio_add_pc_page(q, bio, page, bytes, 0) < bytes)
- break;
-
- len -= bytes;
- }
-
- if (ret)
- goto cleanup;
-
- /*
- * success
- */
- if (!write_to_vm) {
- char __user *p = (char __user *) uaddr;
-
- /*
- * for a write, copy in data to kernel pages
- */
- ret = -EFAULT;
- bio_for_each_segment(bvec, bio, i) {
- char *addr = page_address(bvec->bv_page);
-
- if (copy_from_user(addr, p, bvec->bv_len))
- goto cleanup;
- p += bvec->bv_len;
- }
- }
-
- bio_set_map_data(bmd, bio);
- return bio;
-cleanup:
- bio_for_each_segment(bvec, bio, i)
- __free_page(bvec->bv_page);
-
- bio_put(bio);
-out_bmd:
- bio_free_map_data(bmd);
- return ERR_PTR(ret);
-}
-
static struct bio *__bio_map_user_iov(struct request_queue *q,
- struct sg_iovec *iov, int iov_count,
- int write_to_vm)
+ struct bio_set *bs, struct sg_iovec *iov,
+ int iov_count, int write_to_vm,
+ gfp_t gfp_mask)
{
int i, j;
int nr_pages = 0;
@@ -743,13 +603,16 @@ static struct bio *__bio_map_user_iov(struct request_queue *q,
if (!nr_pages)
return ERR_PTR(-EINVAL);
- bio = bio_alloc_bioset(GFP_KERNEL, nr_pages, blk_bio_set);
+ if (!bs)
+ bs = blk_bio_set;
+ bio = bio_alloc_bioset(gfp_mask, nr_pages, bs);
if (!bio)
return ERR_PTR(-ENOMEM);
bio->bi_destructor = bio_blk_destructor;
+ bio->bi_private = bs;
ret = -ENOMEM;
- pages = kcalloc(nr_pages, sizeof(struct page *), GFP_KERNEL);
+ pages = kcalloc(nr_pages, sizeof(struct page *), gfp_mask);
if (!pages)
goto out;
@@ -827,40 +690,46 @@ static struct bio *__bio_map_user_iov(struct request_queue *q,
/**
* bio_map_user - map user address into bio
* @q: the struct request_queue for the bio
+ * @bs: bio set
* @uaddr: start of user address
* @len: length in bytes
* @write_to_vm: bool indicating writing to pages or not
+ * @gfp_mask: gfp flag
*
* Map the user space address into a bio suitable for io to a block
* device. Returns an error pointer in case of error.
*/
-struct bio *bio_map_user(struct request_queue *q, unsigned long uaddr,
- unsigned int len, int write_to_vm)
+struct bio *bio_map_user(struct request_queue *q, struct bio_set *bs,
+ unsigned long uaddr, unsigned int len, int write_to_vm,
+ gfp_t gfp_mask)
{
struct sg_iovec iov;
iov.iov_base = (void __user *)uaddr;
iov.iov_len = len;
- return bio_map_user_iov(q, &iov, 1, write_to_vm);
+ return bio_map_user_iov(q, bs, &iov, 1, write_to_vm, gfp_mask);
}
/**
* bio_map_user_iov - map user sg_iovec table into bio
* @q: the struct request_queue for the bio
+ * @bs: bio set
* @iov: the iovec.
* @iov_count: number of elements in the iovec
* @write_to_vm: bool indicating writing to pages or not
+ * @gfp_mask: gfp flag
*
* Map the user space address into a bio suitable for io to a block
* device. Returns an error pointer in case of error.
*/
-struct bio *bio_map_user_iov(struct request_queue *q, struct sg_iovec *iov,
- int iov_count, int write_to_vm)
+struct bio *bio_map_user_iov(struct request_queue *q, struct bio_set *bs,
+ struct sg_iovec *iov, int iov_count,
+ int write_to_vm, gfp_t gfp_mask)
{
struct bio *bio;
- bio = __bio_map_user_iov(q, iov, iov_count, write_to_vm);
+ bio = __bio_map_user_iov(q, bs, iov, iov_count, write_to_vm, gfp_mask);
if (IS_ERR(bio))
return bio;
@@ -915,8 +784,8 @@ static void bio_map_kern_endio(struct bio *bio, int err)
}
-static struct bio *__bio_map_kern(struct request_queue *q, void *data,
- unsigned int len, gfp_t gfp_mask)
+static struct bio *__bio_map_kern(struct request_queue *q, struct bio_set *bs,
+ void *data, unsigned int len, gfp_t gfp_mask)
{
unsigned long kaddr = (unsigned long)data;
unsigned long end = (kaddr + len + PAGE_SIZE - 1) >> PAGE_SHIFT;
@@ -925,10 +794,13 @@ static struct bio *__bio_map_kern(struct request_queue *q, void *data,
int offset, i;
struct bio *bio;
- bio = bio_alloc_bioset(gfp_mask, nr_pages, blk_bio_set);
+ if (!bs)
+ bs = blk_bio_set;
+ bio = bio_alloc_bioset(gfp_mask, nr_pages, bs);
if (!bio)
return ERR_PTR(-ENOMEM);
bio->bi_destructor = bio_blk_destructor;
+ bio->bi_private = bs;
offset = offset_in_page(kaddr);
for (i = 0; i < nr_pages; i++) {
@@ -956,6 +828,7 @@ static struct bio *__bio_map_kern(struct request_queue *q, void *data,
/**
* bio_map_kern - map kernel address into bio
* @q: the struct request_queue for the bio
+ * @bs: bio set
* @data: pointer to buffer to map
* @len: length in bytes
* @gfp_mask: allocation flags for bio allocation
@@ -963,12 +836,12 @@ static struct bio *__bio_map_kern(struct request_queue *q, void *data,
* Map the kernel address into a bio suitable for io to a block
* device. Returns an error pointer in case of error.
*/
-struct bio *bio_map_kern(struct request_queue *q, void *data, unsigned int len,
- gfp_t gfp_mask)
+struct bio *bio_map_kern(struct request_queue *q, struct bio_set *bs,
+ void *data, unsigned int len, gfp_t gfp_mask)
{
struct bio *bio;
- bio = __bio_map_kern(q, data, len, gfp_mask);
+ bio = __bio_map_kern(q, bs, data, len, gfp_mask);
if (IS_ERR(bio))
return bio;
@@ -1321,7 +1194,7 @@ static int __init init_bio(void)
if (!fs_bio_set)
panic("bio: can't allocate bios\n");
- blk_bio_set = bioset_create(BIO_POOL_SIZE, 2);
+ blk_bio_set = bioset_pagepool_create(BIO_POOL_SIZE, 2, 0);
if (!blk_bio_set)
panic("Failed to create blk_bio_set");
@@ -1351,8 +1224,6 @@ EXPORT_SYMBOL(bio_map_kern);
EXPORT_SYMBOL(bio_pair_release);
EXPORT_SYMBOL(bio_split);
EXPORT_SYMBOL(bio_split_pool);
-EXPORT_SYMBOL(bio_copy_user);
-EXPORT_SYMBOL(bio_uncopy_user);
EXPORT_SYMBOL(bioset_create);
EXPORT_SYMBOL(bioset_pagepool_create);
EXPORT_SYMBOL(bioset_free);
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 2d28c3b..b860448 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -317,19 +317,21 @@ extern int bio_add_page(struct bio *, struct page *, unsigned int,unsigned int);
extern int bio_add_pc_page(struct request_queue *, struct bio *, struct page *,
unsigned int, unsigned int);
extern int bio_get_nr_vecs(struct block_device *);
-extern struct bio *bio_map_user(struct request_queue *, unsigned long,
- unsigned int, int);
+extern struct bio *bio_map_user(struct request_queue *, struct bio_set *,
+ unsigned long, unsigned int, int, gfp_t);
struct sg_iovec;
extern struct bio *bio_map_user_iov(struct request_queue *,
- struct sg_iovec *, int, int);
+ struct bio_set *, struct sg_iovec *,
+ int, int, gfp_t);
extern void bio_unmap_user(struct bio *);
-extern struct bio *bio_map_kern(struct request_queue *, void *, unsigned int,
- gfp_t);
+extern struct bio *bio_map_kern(struct request_queue *, struct bio_set *,
+ void *, unsigned int, gfp_t);
extern void bio_set_pages_dirty(struct bio *bio);
extern void bio_check_pages_dirty(struct bio *bio);
extern void bio_release_pages(struct bio *bio);
-extern struct bio *bio_copy_user(struct request_queue *, unsigned long, unsigned int, int);
-extern int bio_uncopy_user(struct bio *);
+extern void bioset_free_pages(struct bio *);
+extern struct bio *bioset_add_pages(struct request_queue *,
+ struct bio_set *, unsigned int, int, gfp_t);
void zero_fill_bio(struct bio *bio);
#ifdef CONFIG_HIGHMEM
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index bbf906a..75f92cb 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -687,11 +687,19 @@ extern void blk_sync_queue(struct request_queue *q);
extern void __blk_stop_queue(struct request_queue *q);
extern void blk_run_queue(struct request_queue *);
extern void blk_start_queueing(struct request_queue *);
-extern int blk_rq_map_user(struct request_queue *, struct request *, void __user *, unsigned long);
-extern int blk_rq_unmap_user(struct bio *);
-extern int blk_rq_map_kern(struct request_queue *, struct request *, void *, unsigned int, gfp_t);
-extern int blk_rq_map_user_iov(struct request_queue *, struct request *,
- struct sg_iovec *, int, unsigned int);
+extern int blk_rq_setup_transfer(struct bio_set *, struct request *,
+ void __user *, unsigned long, gfp_t);
+extern int blk_rq_complete_transfer(struct bio *, void __user *, unsigned long);
+extern int blk_rq_setup_buffer(struct bio_set *, struct request *,
+ void __user *, unsigned long, gfp_t);
+extern void blk_rq_destroy_buffer(struct bio *);
+extern int blk_rq_map_kern(struct bio_set *, struct request_queue *,
+ struct request *, void *, unsigned int, gfp_t);
+extern int blk_rq_map_user_iov(struct bio_set *, struct request *,
+ struct sg_iovec *, int, unsigned int, gfp_t);
+extern int blk_rq_copy_user_iov(struct bio_set *, struct request *,
+ struct sg_iovec *, int, unsigned long, gfp_t);
+extern int blk_rq_uncopy_user_iov(struct bio *, struct sg_iovec *, int);
extern int blk_execute_rq(struct request_queue *, struct gendisk *,
struct request *, int);
extern void blk_execute_rq_nowait(struct request_queue *, struct gendisk *,
--
1.5.1.2
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 05/10] have block/scsi_ioctl use GFP_NOIO
2007-10-20 5:44 ` [PATCH 04/10] convert blk_rq_map helpers to use bioset's page pool helper michaelc
@ 2007-10-20 5:44 ` michaelc
2007-10-20 5:44 ` [PATCH 06/10] use GFP_NOIO in dm rdac michaelc
0 siblings, 1 reply; 11+ messages in thread
From: michaelc @ 2007-10-20 5:44 UTC (permalink / raw)
To: linux-scsi, dm-devel, jens.axboe; +Cc: Mike Christie
From: Mike Christie <michaelc@cs.wisc.edu>
Because userspace uses block/scsi_ioctl for path testing and to
re-add devices, we cannot use GFP_KERNEL.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
---
block/scsi_ioctl.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index bf97b22..adb3fc9 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -315,7 +315,7 @@ static int sg_io(struct file *file, struct request_queue *q,
break;
}
- rq = blk_get_request(q, writing ? WRITE : READ, GFP_KERNEL);
+ rq = blk_get_request(q, writing ? WRITE : READ, GFP_NOIO);
if (!rq)
return -ENOMEM;
@@ -348,7 +348,7 @@ static int sg_io(struct file *file, struct request_queue *q,
kfree(iov);
} else if (hdr->dxfer_len)
ret = blk_rq_setup_transfer(NULL, rq, hdr->dxferp,
- hdr->dxfer_len, GFP_KERNEL);
+ hdr->dxfer_len, GFP_NOIO);
if (ret)
goto out;
--
1.5.1.2
* [PATCH 06/10] use GFP_NOIO in dm rdac
2007-10-20 5:44 ` [PATCH 05/10] have block/scsi_ioctl use GFP_NOIO michaelc
@ 2007-10-20 5:44 ` michaelc
2007-10-20 5:44 ` [PATCH 07/10] fix blk_rq_map_user_iov bounce code michaelc
0 siblings, 1 reply; 11+ messages in thread
From: michaelc @ 2007-10-20 5:44 UTC (permalink / raw)
To: linux-scsi, dm-devel, jens.axboe; +Cc: Mike Christie
From: Mike Christie <michaelc@cs.wisc.edu>
The pg_init path needs to use GFP_NOIO since it is used to fail over
a device and we cannot end up calling back into the device.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
---
drivers/md/dm-mpath-rdac.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/md/dm-mpath-rdac.c b/drivers/md/dm-mpath-rdac.c
index 9e71e0e..bb8ce6d 100644
--- a/drivers/md/dm-mpath-rdac.c
+++ b/drivers/md/dm-mpath-rdac.c
@@ -271,7 +271,7 @@ static struct request *get_rdac_req(struct rdac_handler *h,
struct request *rq;
struct request_queue *q = bdev_get_queue(h->path->dev->bdev);
- rq = blk_get_request(q, rw, GFP_KERNEL);
+ rq = blk_get_request(q, rw, GFP_NOIO);
if (!rq) {
DMINFO("get_rdac_req: blk_get_request failed");
@@ -279,7 +279,7 @@ static struct request *get_rdac_req(struct rdac_handler *h,
}
if (buflen && blk_rq_map_kern(NULL, q, rq, buffer, buflen,
- GFP_KERNEL)) {
+ GFP_NOIO)) {
blk_put_request(rq);
DMINFO("get_rdac_req: blk_rq_map_kern failed");
return NULL;
--
1.5.1.2
* [PATCH 07/10] fix blk_rq_map_user_iov bounce code
2007-10-20 5:44 ` [PATCH 06/10] use GFP_NOIO in dm rdac michaelc
@ 2007-10-20 5:44 ` michaelc
2007-10-20 5:44 ` [PATCH 08/10] split bioset_add_pages for sg mmap use michaelc
0 siblings, 1 reply; 11+ messages in thread
From: michaelc @ 2007-10-20 5:44 UTC (permalink / raw)
To: linux-scsi, dm-devel, jens.axboe; +Cc: Mike Christie
From: Mike Christie <michaelc@cs.wisc.edu>
bio_map_user_iov grabs an extra reference to the bio in case a function
that does not exist bounced the bio (the comments in bio_map_user_iov
reference __bio_map_user).
This patch has blk_rq_map_user_iov grab the extra reference instead of
the bio function (blk_rq_map_user_iov was actually already grabbing an
extra reference), and it adds a blk_queue_bounce call to bounce the bio.
It also removes the bio_endio call in the failure path. That call should
not be needed because the bio layer did not bounce the bio.
There was also an extra bio_put in bio_unmap_user to balance the extra
get in bio_map_user_iov. This patch removes that as well, since it is no
longer needed.
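For reference, the object lifetime the patch relies on looks roughly like
this (a sketch of the post-patch flow, not the literal code; see the hunks
below):

    bio = bio_map_user_iov(q, bs, iov, iov_count, write_to_vm, gfp_mask);
    blk_rq_bio_prep(rq->q, rq, bio);
    blk_queue_bounce(rq->q, &rq->bio);  /* may swap in a bounce bio, freed by its endio */
    bio_get(rq->bio);                   /* keep a reference for the later unmap */
    ...
    bio_unmap_user(rq->bio);            /* releases the pages and drops the final reference */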
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
---
block/ll_rw_blk.c | 9 +++++++--
fs/bio.c | 43 +++++++++++--------------------------------
2 files changed, 18 insertions(+), 34 deletions(-)
diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index fad17de..2e00bd2 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -2692,13 +2692,18 @@ int blk_rq_map_user_iov(struct bio_set *bs, struct request *rq,
return PTR_ERR(bio);
if (bio->bi_size != len) {
- bio_endio(bio, 0);
bio_unmap_user(bio);
return -EINVAL;
}
- bio_get(bio);
blk_rq_bio_prep(rq->q, rq, bio);
+ blk_queue_bounce(rq->q, &rq->bio);
+ /*
+ * If the bio was bounced then the bounced bio would be freed
+ * when its endio is called, so we must grab an extra reference
+ * for the unmap code.
+ */
+ bio_get(rq->bio);
rq->buffer = rq->data = NULL;
return 0;
}
diff --git a/fs/bio.c b/fs/bio.c
index df90896..05ffe68 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -727,25 +727,19 @@ struct bio *bio_map_user_iov(struct request_queue *q, struct bio_set *bs,
struct sg_iovec *iov, int iov_count,
int write_to_vm, gfp_t gfp_mask)
{
- struct bio *bio;
-
- bio = __bio_map_user_iov(q, bs, iov, iov_count, write_to_vm, gfp_mask);
-
- if (IS_ERR(bio))
- return bio;
-
- /*
- * subtle -- if __bio_map_user() ended up bouncing a bio,
- * it would normally disappear when its bi_end_io is run.
- * however, we need it for the unmap, so grab an extra
- * reference to it
- */
- bio_get(bio);
-
- return bio;
+ return __bio_map_user_iov(q, bs, iov, iov_count, write_to_vm, gfp_mask);
}
-static void __bio_unmap_user(struct bio *bio)
+/**
+ * bio_unmap_user - unmap a bio
+ * @bio: the bio being unmapped
+ *
+ * Unmap a bio previously mapped by bio_map_user(). Must be called with
+ * a process context.
+ *
+ * bio_unmap_user() may sleep.
+ */
+void bio_unmap_user(struct bio *bio)
{
struct bio_vec *bvec;
int i;
@@ -763,21 +757,6 @@ static void __bio_unmap_user(struct bio *bio)
bio_put(bio);
}
-/**
- * bio_unmap_user - unmap a bio
- * @bio: the bio being unmapped
- *
- * Unmap a bio previously mapped by bio_map_user(). Must be called with
- * a process context.
- *
- * bio_unmap_user() may sleep.
- */
-void bio_unmap_user(struct bio *bio)
-{
- __bio_unmap_user(bio);
- bio_put(bio);
-}
-
static void bio_map_kern_endio(struct bio *bio, int err)
{
bio_put(bio);
--
1.5.1.2
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 08/10] split bioset_add_pages for sg mmap use
2007-10-20 5:44 ` [PATCH 07/10] fix blk_rq_map_user_iov bounce code michaelc
@ 2007-10-20 5:44 ` michaelc
2007-10-20 5:44 ` [PATCH 09/10] Add REQ_TYPE_BLOCK_PC mmap helpers michaelc
0 siblings, 1 reply; 11+ messages in thread
From: michaelc @ 2007-10-20 5:44 UTC (permalink / raw)
To: linux-scsi, dm-devel, jens.axboe; +Cc: Mike Christie
From: Mike Christie <michaelc@cs.wisc.edu>
open(/dev/sg)
mmap()
write(/dev/sg)
read(/dev/sg)
Then the
write(/dev/sg)
read(/dev/sg)
pair can be repeated multiple times, followed by
close(/dev/sg)
The pages backing the mmap are reused for each sg request, so they cannot
be freed when the bio is freed, as they are with the copy or map operations.
This patch splits the bio page allocation from the page addition, so that
an mmap helper can allocate pages once and then reuse them for other requests.
The next patch contains the mmap helper, and the last patch has sg use all
of the helpers.
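A rough sketch of the resulting split (names taken from the diff below; the
actual reuse across requests is done by the mmap helper in the next patch):

    struct bio_map_data *bmd;
    struct bio *bio;

    /* grab pages from the bioset's page pool once */
    bmd = bioset_alloc_pages(q, bs, len, gfp_mask);
    /* build a bio over the already-held pages (can be repeated later) */
    bio = bioset_add_pages(q, bmd, len, write_to_vm, gfp_mask);
    /* one-shot users can call bioset_setup_pages(), which does both steps */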
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
---
block/ll_rw_blk.c | 6 +-
fs/bio.c | 120 ++++++++++++++++++++++++++++++++++----------------
include/linux/bio.h | 5 +-
3 files changed, 86 insertions(+), 45 deletions(-)
diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index 2e00bd2..7298289 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -2366,7 +2366,7 @@ static void __blk_rq_destroy_buffer(struct bio *bio)
if (bio_flagged(bio, BIO_USER_MAPPED))
bio_unmap_user(bio);
else
- bioset_free_pages(bio);
+ bio_put(bio);
}
void blk_rq_destroy_buffer(struct bio *bio)
@@ -2414,8 +2414,8 @@ static int __blk_rq_setup_buffer(struct bio_set *bs, struct request *rq,
bio = bio_map_user(q, bs, (unsigned long)ubuf, map_len,
reading, gfp_mask);
} else
- bio = bioset_add_pages(q, bs, len, reading,
- gfp_mask);
+ bio = bioset_setup_pages(q, bs, len, reading,
+ gfp_mask);
if (IS_ERR(bio))
return PTR_ERR(bio);
diff --git a/fs/bio.c b/fs/bio.c
index 05ffe68..2f115ab 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -220,21 +220,8 @@ static struct bio_map_data *bio_alloc_map_data(int nr_segs, gfp_t gfp_mask)
return NULL;
}
-static void bio_bmd_destructor(struct bio *bio)
+void bioset_free_pages(struct bio_map_data *bmd)
{
- struct bio_map_data *bmd = bio->bi_private;
- struct bio_set *bs;
-
- if (!bmd)
- return;
- bs = bmd->bs;
- bio_free_map_data(bmd);
- bio_free(bio, bs);
-}
-
-void bioset_free_pages(struct bio *bio)
-{
- struct bio_map_data *bmd = bio->bi_private;
struct bio_set *bs = bmd->bs;
int i;
@@ -242,71 +229,126 @@ void bioset_free_pages(struct bio *bio)
if (bmd->iovecs[i].page)
mempool_free(bmd->iovecs[i].page, bs->page_pool);
}
- bio_put(bio);
+ bio_free_map_data(bmd);
}
-struct bio *bioset_add_pages(struct request_queue *q, struct bio_set *bs,
- unsigned int len, int write_to_vm, gfp_t gfp_mask)
+struct bio_map_data *bioset_alloc_pages(struct request_queue *q,
+ struct bio_set *bs, unsigned int len,
+ gfp_t gfp_mask)
{
int nr_pages = (len + PAGE_SIZE - 1) >> PAGE_SHIFT;
struct bio_map_data *bmd;
struct page *page;
- struct bio *bio;
int i = 0, ret;
bmd = bio_alloc_map_data(nr_pages, gfp_mask);
if (!bmd)
return ERR_PTR(-ENOMEM);
+ bmd->bs = bs;
+ if (!bmd->bs)
+ bmd->bs = blk_bio_set;
- ret = -ENOMEM;
- if (!bs)
- bs = blk_bio_set;
- bio = bio_alloc_bioset(gfp_mask, nr_pages, bs);
+ ret = 0;
+ while (len) {
+ page = mempool_alloc(bmd->bs->page_pool,
+ q->bounce_gfp | gfp_mask);
+ if (!page) {
+ ret = -ENOMEM;
+ goto fail;
+ }
+ bmd->nr_vecs++;
+ bmd->iovecs[i].page = page;
+
+ len -= min_t(unsigned int,
+ (1 << bmd->bs->page_pool_order) << PAGE_SHIFT, len);
+ i++;
+ }
+ return bmd;
+fail:
+ bioset_free_pages(bmd);
+ return ERR_PTR(ret);
+}
+
+static void bio_bmd_destructor(struct bio *bio)
+{
+ struct bio_map_data *bmd = bio->bi_private;
+ struct bio_set *bs;
+
+ if (!bmd)
+ return;
+ bs = bmd->bs;
+ bioset_free_pages(bmd);
+ bio_free(bio, bs);
+}
+
+struct bio *bioset_add_pages(struct request_queue *q, struct bio_map_data *bmd,
+ unsigned int len, int write_to_vm, gfp_t gfp_mask)
+{
+ int nr_pages = (len + PAGE_SIZE - 1) >> PAGE_SHIFT;
+ struct page *page;
+ struct bio *bio;
+ int i = 0, ret;
+
+ bio = bio_alloc_bioset(gfp_mask, nr_pages, bmd->bs);
if (!bio)
- goto out_bmd;
+ return ERR_PTR(-ENOMEM);
bio->bi_rw |= (!write_to_vm << BIO_RW);
- bio->bi_destructor = bio_bmd_destructor;
bio->bi_private = bmd;
- bmd->bs = bs;
+ bio->bi_destructor = bio_bmd_destructor;
ret = 0;
while (len) {
unsigned add_len;
- page = mempool_alloc(bs->page_pool, q->bounce_gfp | gfp_mask);
+ page = bmd->iovecs[i].page;
if (!page) {
- ret = -ENOMEM;
- bioset_free_pages(bio);
- goto fail;
+ ret = -EINVAL;
+ printk(KERN_ERR "Invalid bio map data. Not enough "
+ "pages allocated to handle req of len %d\n",
+ len);
+ goto free_bio;
}
- bmd->nr_vecs++;
- bmd->iovecs[i].page = page;
add_len = min_t(unsigned int,
- (1 << bs->page_pool_order) << PAGE_SHIFT, len);
+ (1 << bmd->bs->page_pool_order) << PAGE_SHIFT,
+ len);
while (add_len) {
- unsigned int added, bytes = PAGE_SIZE;
+ unsigned int bytes = PAGE_SIZE;
if (bytes > add_len)
bytes = add_len;
- added = bio_add_pc_page(q, bio, page++, bytes, 0);
- if (added < bytes)
+ if (bio_add_pc_page(q, bio, page++, bytes, 0) < bytes)
break;
add_len -= bytes;
len -= bytes;
}
i++;
}
-
return bio;
-out_bmd:
- bio_free_map_data(bmd);
-fail:
+free_bio:
+ bio_free(bio, bmd->bs);
return ERR_PTR(ret);
}
+struct bio *bioset_setup_pages(struct request_queue *q, struct bio_set *bs,
+ unsigned int len, int write_to_vm,
+ gfp_t gfp_mask)
+{
+ struct bio_map_data *bmd;
+ struct bio *bio;
+
+ bmd = bioset_alloc_pages(q, bs, len, gfp_mask);
+ if (IS_ERR(bmd))
+ return ERR_PTR(-ENOMEM);
+
+ bio = bioset_add_pages(q, bmd, len, write_to_vm, gfp_mask);
+ if (IS_ERR(bio))
+ bioset_free_pages(bmd);
+ return bio;
+}
+
struct bio *bio_alloc(gfp_t gfp_mask, int nr_iovecs)
{
struct bio *bio = bio_alloc_bioset(gfp_mask, nr_iovecs, fs_bio_set);
diff --git a/include/linux/bio.h b/include/linux/bio.h
index b860448..6d0c6b7 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -329,9 +329,8 @@ extern struct bio *bio_map_kern(struct request_queue *, struct bio_set *,
extern void bio_set_pages_dirty(struct bio *bio);
extern void bio_check_pages_dirty(struct bio *bio);
extern void bio_release_pages(struct bio *bio);
-extern void bioset_free_pages(struct bio *);
-extern struct bio *bioset_add_pages(struct request_queue *,
- struct bio_set *, unsigned int, int, gfp_t);
+extern struct bio *bioset_setup_pages(struct request_queue *, struct bio_set *,
+ unsigned int, int, gfp_t);
void zero_fill_bio(struct bio *bio);
#ifdef CONFIG_HIGHMEM
--
1.5.1.2
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 09/10] Add REQ_TYPE_BLOCK_PC mmap helpers
2007-10-20 5:44 ` [PATCH 08/10] split bioset_add_pages for sg mmap use michaelc
@ 2007-10-20 5:44 ` michaelc
2007-10-20 5:44 ` [PATCH 10/10] convert sg.c to blk/bio helpers michaelc
0 siblings, 1 reply; 11+ messages in thread
From: michaelc @ 2007-10-20 5:44 UTC (permalink / raw)
To: linux-scsi, dm-devel, jens.axboe; +Cc: Mike Christie
From: Mike Christie <michaelc@cs.wisc.edu>
sg supports an sg io mmap operation. This patch adds mmap helpers
based on the existing bio and blk functions.
This patch also modifies bioset_pagepool_create so that it takes the
number of page pool entries. sg mmap (and other sg ops) needs this so
that it can allocate multiple large blocks of pages.
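Roughly, a driver is expected to wire the helpers up like this (a sketch
based on the sg conversion in the last patch; error handling dropped):

    /* ->mmap() */
    bmd = blk_rq_mmap_open(bs, q, vma);

    /* ->nopage() */
    page = blk_rq_vma_nopage(bmd, vma, addr, type);

    /* per command, before blk_execute_rq_nowait() */
    blk_rq_setup_mmap_buffer(rq, bmd, len, GFP_NOIO);

    /* vma ->close() */
    blk_rq_mmap_close(bmd);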
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
---
block/ll_rw_blk.c | 90 ++++++++++++++++++++++++++++++++++++++++++++++++
fs/bio.c | 81 +++++++++++++++++++++++++++++++++++++++----
include/linux/bio.h | 12 ++++++-
include/linux/blkdev.h | 9 +++++
4 files changed, 183 insertions(+), 9 deletions(-)
diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index 7298289..52b42d7 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -2710,6 +2710,96 @@ int blk_rq_map_user_iov(struct bio_set *bs, struct request *rq,
EXPORT_SYMBOL(blk_rq_map_user_iov);
+void blk_rq_mmap_close(struct bio_map_data *bmd)
+{
+ bioset_free_pages(bmd);
+}
+EXPORT_SYMBOL(blk_rq_mmap_close);
+
+/**
+ * blk_rq_mmap_open - alloc and setup buffers for REQ_BLOCK_PC mmap
+ * @bs: bio set
+ * @q: request queue
+ * @vma: vm struct
+ *
+ * Description:
+ * The caller must also call blk_rq_setup_mmap_buffer on the request to
+ * map the buffer to a bio.
+ *
+ * When the mmap operation is done, blk_rq_mmap_close must be called.
+ */
+struct bio_map_data *blk_rq_mmap_open(struct bio_set *bs,
+ struct request_queue *q,
+ struct vm_area_struct *vma)
+{
+ struct bio_map_data *bmd;
+
+ if (vma->vm_pgoff)
+ return NULL;
+
+ if (!bs)
+ return NULL;
+
+ bmd = bioset_alloc_pages(q, bs, vma->vm_end - vma->vm_start,
+ GFP_KERNEL);
+ if (!bmd)
+ return NULL;
+
+ vma->vm_flags |= VM_RESERVED;
+ return bmd;
+}
+EXPORT_SYMBOL(blk_rq_mmap_open);
+
+struct page *blk_rq_vma_nopage(struct bio_map_data *bmd,
+ struct vm_area_struct *vma,
+ unsigned long addr, int *type)
+{
+ struct page *p;
+
+ if (!bmd)
+ return NOPAGE_SIGBUS;
+
+ p = bio_map_data_get_page(bmd, addr - vma->vm_start);
+ if (p)
+ get_page(p);
+ else
+ p = NOPAGE_SIGBUS;
+ if (type)
+ *type = VM_FAULT_MINOR;
+ return p;
+}
+EXPORT_SYMBOL(blk_rq_vma_nopage);
+
+/**
+ * blk_rq_setup_mmap_buffer - setup request and bio page mappings
+ * @rq: request
+ * @bmd: bio_map_data returned from blk_rq_mmap_open
+ * @len: length of the transfer
+ *
+ * Note: there is no need to call a complete or transfer function.
+ * The bio's destructor function will handle the bio release.
+ */
+int blk_rq_setup_mmap_buffer(struct request *rq, struct bio_map_data *bmd,
+ unsigned int len, gfp_t gfp_mask)
+{
+ struct request_queue *q = rq->q;
+ struct bio *bio;
+
+ if (!len || len > (q->max_hw_sectors << 9))
+ return -EINVAL;
+
+ bio = bioset_add_mmap_pages(q, bmd, len, rq_data_dir(rq) == READ,
+ gfp_mask);
+ if (IS_ERR(bio))
+ return PTR_ERR(bio);
+
+ blk_rq_bio_prep(q, rq, bio);
+ blk_queue_bounce(q, &rq->bio);
+ rq->buffer = rq->data = NULL;
+ return 0;
+}
+EXPORT_SYMBOL(blk_rq_setup_mmap_buffer);
+
/**
* blk_rq_complete_transfer - unmap a request with user data
* @bio: start of bio list
diff --git a/fs/bio.c b/fs/bio.c
index 2f115ab..ccd4e3e 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -70,6 +70,7 @@ struct bio_set {
mempool_t *bvec_pools[BIOVEC_NR_POOLS];
mempool_t *page_pool;
int page_pool_order;
+ int page_pool_size;
};
/*
@@ -250,8 +251,15 @@ struct bio_map_data *bioset_alloc_pages(struct request_queue *q,
ret = 0;
while (len) {
+ /*
+ * __GFP_COMP is from sg. It is needed for higher order
+ * allocs when the pages are sent to something like the network
+ * layer which does get/put page.
+ *
+ * It is also needed for mmap.
+ */
page = mempool_alloc(bmd->bs->page_pool,
- q->bounce_gfp | gfp_mask);
+ __GFP_COMP | q->bounce_gfp | gfp_mask);
if (!page) {
ret = -ENOMEM;
goto fail;
@@ -281,8 +289,10 @@ static void bio_bmd_destructor(struct bio *bio)
bio_free(bio, bs);
}
-struct bio *bioset_add_pages(struct request_queue *q, struct bio_map_data *bmd,
- unsigned int len, int write_to_vm, gfp_t gfp_mask)
+static struct bio *bioset_add_pages(struct request_queue *q,
+ struct bio_map_data *bmd,
+ unsigned int len, int write_to_vm,
+ gfp_t gfp_mask)
{
int nr_pages = (len + PAGE_SIZE - 1) >> PAGE_SHIFT;
struct page *page;
@@ -293,8 +303,6 @@ struct bio *bioset_add_pages(struct request_queue *q, struct bio_map_data *bmd,
if (!bio)
return ERR_PTR(-ENOMEM);
bio->bi_rw |= (!write_to_vm << BIO_RW);
- bio->bi_private = bmd;
- bio->bi_destructor = bio_bmd_destructor;
ret = 0;
while (len) {
@@ -332,6 +340,52 @@ free_bio:
return ERR_PTR(ret);
}
+static void bio_mmap_endio(struct bio *bio, int err)
+{
+ bio_put(bio);
+}
+
+struct bio *bioset_add_mmap_pages(struct request_queue *q,
+ struct bio_map_data *bmd, unsigned int len,
+ int write_to_vm, gfp_t gfp_mask)
+{
+ struct bio *bio;
+
+ bio = bioset_add_pages(q, bmd, len, write_to_vm, gfp_mask);
+ if (IS_ERR(bio))
+ return bio;
+ /*
+ * The mmap operation may want to reuse the bmd so we just free
+ * the bio
+ */
+ bio->bi_private = bmd->bs;
+ bio->bi_destructor = bio_blk_destructor;
+ bio->bi_end_io = bio_mmap_endio;
+
+ if (bio->bi_size == len)
+ return bio;
+ /*
+ * Don't support partial mappings.
+ */
+ bio_put(bio);
+ return ERR_PTR(-EINVAL);
+}
+
+struct page *bio_map_data_get_page(struct bio_map_data *bmd,
+ unsigned long offset)
+{
+ unsigned long seg_size = (1 << bmd->bs->page_pool_order) << PAGE_SHIFT;
+ unsigned long seg_offset;
+ int iovec;
+
+ if (offset >= seg_size * bmd->nr_vecs)
+ return NULL;
+
+ iovec = offset / seg_size;
+ seg_offset = offset - (iovec * seg_size);
+ return bmd->iovecs[iovec].page + (seg_offset >> PAGE_SHIFT);
+}
+
struct bio *bioset_setup_pages(struct request_queue *q, struct bio_set *bs,
unsigned int len, int write_to_vm,
gfp_t gfp_mask)
@@ -346,6 +400,10 @@ struct bio *bioset_setup_pages(struct request_queue *q, struct bio_set *bs,
bio = bioset_add_pages(q, bmd, len, write_to_vm, gfp_mask);
if (IS_ERR(bio))
bioset_free_pages(bmd);
+ else {
+ bio->bi_private = bmd;
+ bio->bi_destructor = bio_bmd_destructor;
+ }
return bio;
}
@@ -1172,17 +1230,18 @@ void bioset_pagepool_free(struct bio_set *bs)
}
struct bio_set *bioset_pagepool_create(int bio_pool_size, int bvec_pool_size,
- int order)
+ int page_pool_size, int order)
{
struct bio_set *bs = bioset_create(bio_pool_size, bvec_pool_size);
if (!bs)
return NULL;
- bs->page_pool = mempool_create_page_pool(bio_pool_size, order);
+ bs->page_pool = mempool_create_page_pool(page_pool_size, order);
if (!bs->page_pool)
goto free_bioset;
+ bs->page_pool_size = page_pool_size;
bs->page_pool_order = order;
return bs;
@@ -1191,6 +1250,11 @@ free_bioset:
return NULL;
}
+unsigned bioset_pagepool_get_size(struct bio_set *bs)
+{
+ return bs->page_pool_size * (1 << bs->page_pool_order) << PAGE_SHIFT;
+}
+
static void __init biovec_init_slabs(void)
{
int i;
@@ -1215,7 +1279,7 @@ static int __init init_bio(void)
if (!fs_bio_set)
panic("bio: can't allocate bios\n");
- blk_bio_set = bioset_pagepool_create(BIO_POOL_SIZE, 2, 0);
+ blk_bio_set = bioset_pagepool_create(BIO_POOL_SIZE, 2, 1, 0);
if (!blk_bio_set)
panic("Failed to create blk_bio_set");
@@ -1249,4 +1313,5 @@ EXPORT_SYMBOL(bioset_create);
EXPORT_SYMBOL(bioset_pagepool_create);
EXPORT_SYMBOL(bioset_free);
EXPORT_SYMBOL(bioset_pagepool_free);
+EXPORT_SYMBOL(bioset_pagepool_get_size);
EXPORT_SYMBOL(bio_alloc_bioset);
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 6d0c6b7..bc7d244 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -62,6 +62,7 @@ struct bio_vec {
unsigned int bv_offset;
};
+struct bio_map_data;
struct bio_set;
struct bio;
typedef void (bio_end_io_t) (struct bio *, int);
@@ -294,9 +295,10 @@ extern mempool_t *bio_split_pool;
extern void bio_pair_release(struct bio_pair *dbio);
extern struct bio_set *bioset_create(int, int);
-extern struct bio_set *bioset_pagepool_create(int, int, int);
+extern struct bio_set *bioset_pagepool_create(int, int, int, int);
extern void bioset_free(struct bio_set *);
extern void bioset_pagepool_free(struct bio_set *);
+extern unsigned bioset_pagepool_get_size(struct bio_set *);
extern struct bio *bio_alloc(gfp_t, int);
extern struct bio *bio_alloc_bioset(gfp_t, int, struct bio_set *);
@@ -331,6 +333,14 @@ extern void bio_check_pages_dirty(struct bio *bio);
extern void bio_release_pages(struct bio *bio);
extern struct bio *bioset_setup_pages(struct request_queue *, struct bio_set *,
unsigned int, int, gfp_t);
+extern void bioset_free_pages(struct bio_map_data *);
+extern struct bio_map_data *bioset_alloc_pages(struct request_queue *,
+ struct bio_set *, unsigned int,
+ gfp_t);
+extern struct bio *bioset_add_mmap_pages(struct request_queue *,
+ struct bio_map_data *, unsigned int,
+ int, gfp_t gfp_mask);
+extern struct page *bio_map_data_get_page(struct bio_map_data *, unsigned long);
void zero_fill_bio(struct bio *bio);
#ifdef CONFIG_HIGHMEM
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 75f92cb..64bc2bc 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -700,6 +700,15 @@ extern int blk_rq_map_user_iov(struct bio_set *, struct request *,
extern int blk_rq_copy_user_iov(struct bio_set *, struct request *,
struct sg_iovec *, int, unsigned long, gfp_t);
extern int blk_rq_uncopy_user_iov(struct bio *, struct sg_iovec *, int);
+extern struct bio_map_data *blk_rq_mmap_open(struct bio_set *,
+ struct request_queue *,
+ struct vm_area_struct *);
+extern void blk_rq_mmap_close(struct bio_map_data *);
+extern struct page *blk_rq_vma_nopage(struct bio_map_data *,
+ struct vm_area_struct *,
+ unsigned long, int *);
+extern int blk_rq_setup_mmap_buffer(struct request *, struct bio_map_data *,
+ unsigned int, gfp_t);
extern int blk_execute_rq(struct request_queue *, struct gendisk *,
struct request *, int);
extern void blk_execute_rq_nowait(struct request_queue *, struct gendisk *,
--
1.5.1.2
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 10/10] convert sg.c to blk/bio helpers
2007-10-20 5:44 ` [PATCH 09/10] Add REQ_TYPE_BLOCK_PC mmap helpers michaelc
@ 2007-10-20 5:44 ` michaelc
0 siblings, 0 replies; 11+ messages in thread
From: michaelc @ 2007-10-20 5:44 UTC (permalink / raw)
To: linux-scsi, dm-devel, jens.axboe; +Cc: Mike Christie
From: Mike Christie <michaelc@cs.wisc.edu>
This patch converts sg to use the block/bio layer helpers
to map and copy data.
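For orientation, the rewritten sg_setup_req() below boils down to picking
one of the block layer setup helpers (sketch only; the real flag handling
is in the hunk):

    if (no data transfer)
            return 0;
    if (SG_FLAG_MMAP_IO)
            blk_rq_setup_mmap_buffer(rq, sfp->reserve.bmd, dxfer_len, GFP_NOIO);
    else if (SG_FLAG_DIRECT_IO && sg_allow_dio)
            blk_rq_setup_buffer(sfp->reserve.bs, rq, hp->dxferp, dxfer_len, GFP_NOIO);
    else    /* indirect copy */
            blk_rq_copy_user_iov(sfp->reserve.bs, rq, iov, iovec_count, dxfer_len, GFP_NOIO);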
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
---
drivers/scsi/sg.c | 1061 ++++++++++++++++-------------------------------------
1 files changed, 310 insertions(+), 751 deletions(-)
diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 7238b2d..8adf7ea 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -67,7 +67,6 @@ static void sg_proc_cleanup(void);
#endif
#define SG_ALLOW_DIO_DEF 0
-#define SG_ALLOW_DIO_CODE /* compile out by commenting this define */
#define SG_MAX_DEVS 32768
@@ -94,9 +93,6 @@ int sg_big_buff = SG_DEF_RESERVED_SIZE;
static int def_reserved_size = -1; /* picks up init parameter */
static int sg_allow_dio = SG_ALLOW_DIO_DEF;
-static int scatter_elem_sz = SG_SCATTER_SZ;
-static int scatter_elem_sz_prev = SG_SCATTER_SZ;
-
#define SG_SECTOR_SZ 512
#define SG_SECTOR_MSK (SG_SECTOR_SZ - 1)
@@ -114,12 +110,10 @@ static struct class_interface sg_interface = {
typedef struct sg_scatter_hold { /* holding area for scsi scatter gather info */
unsigned short k_use_sg; /* Count of kernel scatter-gather pieces */
- unsigned sglist_len; /* size of malloc'd scatter-gather list ++ */
unsigned bufflen; /* Size of (aggregate) data buffer */
- unsigned b_malloc_len; /* actual len malloc'ed in buffer */
- struct scatterlist *buffer;/* scatter list */
- char dio_in_use; /* 0->indirect IO (or mmap), 1->dio */
unsigned char cmd_opcode; /* first byte of command */
+ struct bio_map_data *bmd; /* reserve memory */
+ struct bio_set *bs; /* bio pool */
} Sg_scatter_hold;
struct sg_device; /* forward declarations */
@@ -131,6 +125,8 @@ typedef struct sg_request { /* SG_MAX_QUEUE requests outstanding per file */
Sg_scatter_hold data; /* hold buffer, perhaps scatter list */
sg_io_hdr_t header; /* scsi command+info, see <scsi/sg.h> */
unsigned char sense_b[SCSI_SENSE_BUFFERSIZE];
+ struct request *request;
+ struct bio *bio; /* ptr to bio for later unmapping */
char res_used; /* 1 -> using reserve buffer, 0 -> not ... */
char orphan; /* 1 -> drop on sight, 0 -> normal */
char sg_io_owned; /* 1 -> packet belongs to SG_IO */
@@ -145,7 +141,6 @@ typedef struct sg_fd { /* holds the state of a file descriptor */
int timeout; /* defaults to SG_DEFAULT_TIMEOUT */
int timeout_user; /* defaults to SG_DEFAULT_TIMEOUT_USER */
Sg_scatter_hold reserve; /* buffer held for this file descriptor */
- unsigned save_scat_len; /* original length of trunc. scat. element */
Sg_request *headrp; /* head of request slist, NULL->empty */
struct fasync_struct *async_qp; /* used by asynchronous notification */
Sg_request req_arr[SG_MAX_QUEUE]; /* used as singly-linked list */
@@ -173,38 +168,24 @@ typedef struct sg_device { /* holds the state of each scsi generic device */
static int sg_fasync(int fd, struct file *filp, int mode);
/* tasklet or soft irq callback */
-static void sg_cmd_done(void *data, char *sense, int result, int resid);
-static int sg_start_req(Sg_request * srp);
+static void sg_cmd_done(struct request *rq, int uptodate);
+static int sg_setup_req(Sg_request * srp);
static void sg_finish_rem_req(Sg_request * srp);
-static int sg_build_indirect(Sg_scatter_hold * schp, Sg_fd * sfp, int buff_size);
-static int sg_build_sgat(Sg_scatter_hold * schp, const Sg_fd * sfp,
- int tablesize);
static ssize_t sg_new_read(Sg_fd * sfp, char __user *buf, size_t count,
Sg_request * srp);
static ssize_t sg_new_write(Sg_fd * sfp, const char __user *buf, size_t count,
int blocking, int read_only, Sg_request ** o_srp);
static int sg_common_write(Sg_fd * sfp, Sg_request * srp,
unsigned char *cmnd, int timeout, int blocking);
-static int sg_u_iovec(sg_io_hdr_t * hp, int sg_num, int ind,
- int wr_xf, int *countp, unsigned char __user **up);
-static int sg_write_xfer(Sg_request * srp);
static int sg_read_xfer(Sg_request * srp);
-static int sg_read_oxfer(Sg_request * srp, char __user *outp, int num_read_xfer);
-static void sg_remove_scat(Sg_scatter_hold * schp);
-static void sg_build_reserve(Sg_fd * sfp, int req_size);
-static void sg_link_reserve(Sg_fd * sfp, Sg_request * srp, int size);
-static void sg_unlink_reserve(Sg_fd * sfp, Sg_request * srp);
-static struct page *sg_page_malloc(int rqSz, int lowDma, int *retSzp);
-static void sg_page_free(struct page *page, int size);
+static int sg_build_reserve(Sg_fd * sfp, int req_size);
static Sg_fd *sg_add_sfp(Sg_device * sdp, int dev);
static int sg_remove_sfp(Sg_device * sdp, Sg_fd * sfp);
static void __sg_remove_sfp(Sg_device * sdp, Sg_fd * sfp);
static Sg_request *sg_get_rq_mark(Sg_fd * sfp, int pack_id);
static Sg_request *sg_add_request(Sg_fd * sfp);
static int sg_remove_request(Sg_fd * sfp, Sg_request * srp);
-static int sg_res_in_use(Sg_fd * sfp);
static int sg_allow_access(unsigned char opcode, char dev_type);
-static int sg_build_direct(Sg_request * srp, Sg_fd * sfp, int dxfer_len);
static Sg_device *sg_get_dev(int dev);
#ifdef CONFIG_SCSI_PROC_FS
static int sg_last_dev(void);
@@ -301,6 +282,12 @@ sg_open(struct inode *inode, struct file *filp)
return retval;
}
+static void sg_cleanup_transfer(struct sg_request *srp)
+{
+ srp->bio = NULL;
+ srp->res_used = 0;
+}
+
/* Following function was formerly called 'sg_close' */
static int
sg_release(struct inode *inode, struct file *filp)
@@ -460,7 +447,9 @@ sg_read(struct file *filp, char __user *buf, size_t count, loff_t * ppos)
if (count > old_hdr->reply_len)
count = old_hdr->reply_len;
if (count > SZ_SG_HEADER) {
- if (sg_read_oxfer(srp, buf, count - SZ_SG_HEADER)) {
+ retval = blk_rq_complete_transfer(srp->bio, buf, count);
+ sg_cleanup_transfer(srp);
+ if (retval) {
retval = -EFAULT;
goto free_old_hdr;
}
@@ -646,18 +635,13 @@ sg_new_write(Sg_fd * sfp, const char __user *buf, size_t count,
return -ENOSYS;
}
if (hp->flags & SG_FLAG_MMAP_IO) {
- if (hp->dxfer_len > sfp->reserve.bufflen) {
- sg_remove_request(sfp, srp);
- return -ENOMEM; /* MMAP_IO size must fit in reserve buffer */
- }
+ /*
+ * the call to mmap will have claimed the reserve buffer
+ */
if (hp->flags & SG_FLAG_DIRECT_IO) {
sg_remove_request(sfp, srp);
return -EINVAL; /* either MMAP_IO or DIRECT_IO (not both) */
}
- if (sg_res_in_use(sfp)) {
- sg_remove_request(sfp, srp);
- return -EBUSY; /* reserve buffer already being used */
- }
}
ul_timeout = msecs_to_jiffies(srp->header.timeout);
timeout = (ul_timeout < INT_MAX) ? ul_timeout : INT_MAX;
@@ -690,9 +674,11 @@ static int
sg_common_write(Sg_fd * sfp, Sg_request * srp,
unsigned char *cmnd, int timeout, int blocking)
{
- int k, data_dir;
+ int k;
Sg_device *sdp = sfp->parentdp;
sg_io_hdr_t *hp = &srp->header;
+ struct request_queue *q = sdp->device->request_queue;
+ struct request *rq;
srp->data.cmd_opcode = cmnd[0]; /* hold opcode of command */
hp->status = 0;
@@ -702,54 +688,44 @@ sg_common_write(Sg_fd * sfp, Sg_request * srp,
hp->host_status = 0;
hp->driver_status = 0;
hp->resid = 0;
+
SCSI_LOG_TIMEOUT(4, printk("sg_common_write: scsi opcode=0x%02x, cmd_size=%d\n",
(int) cmnd[0], (int) hp->cmd_len));
- if ((k = sg_start_req(srp))) {
- SCSI_LOG_TIMEOUT(1, printk("sg_common_write: start_req err=%d\n", k));
- sg_finish_rem_req(srp);
- return k; /* probably out of space --> ENOMEM */
- }
- if ((k = sg_write_xfer(srp))) {
- SCSI_LOG_TIMEOUT(1, printk("sg_common_write: write_xfer, bad address\n"));
- sg_finish_rem_req(srp);
- return k;
+ rq = blk_get_request(q, hp->dxfer_direction == SG_DXFER_TO_DEV,
+ GFP_NOIO);
+ if (!rq) {
+ SCSI_LOG_TIMEOUT(1, printk("sg_common_write: Could "
+ "not allocate request\n"));
+ return -ENOMEM;
}
+ srp->request = rq;
+
+ memset(srp->sense_b, 0, SCSI_SENSE_BUFFERSIZE);
+ rq->sense = srp->sense_b;
+ rq->sense_len = 0;
+ rq->cmd_len = hp->cmd_len;
+ memcpy(rq->cmd, cmnd, rq->cmd_len);
+ rq->timeout = timeout;
+ rq->retries = SG_DEFAULT_RETRIES;
+ rq->cmd_type = REQ_TYPE_BLOCK_PC;
+ rq->cmd_flags |= REQ_QUIET;
+ rq->end_io_data = srp;
+
if (sdp->detached) {
sg_finish_rem_req(srp);
return -ENODEV;
}
- switch (hp->dxfer_direction) {
- case SG_DXFER_TO_FROM_DEV:
- case SG_DXFER_FROM_DEV:
- data_dir = DMA_FROM_DEVICE;
- break;
- case SG_DXFER_TO_DEV:
- data_dir = DMA_TO_DEVICE;
- break;
- case SG_DXFER_UNKNOWN:
- data_dir = DMA_BIDIRECTIONAL;
- break;
- default:
- data_dir = DMA_NONE;
- break;
+ if ((k = sg_setup_req(srp))) {
+ SCSI_LOG_TIMEOUT(1, printk("sg_common_write: start_req err=%d\n", k));
+ sg_finish_rem_req(srp);
+ return k; /* probably out of space --> ENOMEM */
}
+
hp->duration = jiffies_to_msecs(jiffies);
-/* Now send everything of to mid-level. The next time we hear about this
- packet is when sg_cmd_done() is called (i.e. a callback). */
- if (scsi_execute_async(sdp->device, cmnd, hp->cmd_len, data_dir, srp->data.buffer,
- hp->dxfer_len, srp->data.k_use_sg, timeout,
- SG_DEFAULT_RETRIES, srp, sg_cmd_done,
- GFP_ATOMIC)) {
- SCSI_LOG_TIMEOUT(1, printk("sg_common_write: scsi_execute_async failed\n"));
- /*
- * most likely out of mem, but could also be a bad map
- */
- sg_finish_rem_req(srp);
- return -ENOMEM;
- } else
- return 0;
+ blk_execute_rq_nowait(q, NULL, rq, 1, sg_cmd_done);
+ return 0;
}
static int
@@ -838,14 +814,13 @@ sg_ioctl(struct inode *inode, struct file *filp,
result = get_user(val, ip);
if (result)
return result;
- if (val) {
+ if (val)
+ /*
+ * We should always be allocated mem from the right
+ * limit, so maybe this should always be zero?.
+ */
sfp->low_dma = 1;
- if ((0 == sfp->low_dma) && (0 == sg_res_in_use(sfp))) {
- val = (int) sfp->reserve.bufflen;
- sg_remove_scat(&sfp->reserve);
- sg_build_reserve(sfp, val);
- }
- } else {
+ else {
if (sdp->detached)
return -ENODEV;
sfp->low_dma = sdp->device->host->unchecked_isa_dma;
@@ -914,14 +889,8 @@ sg_ioctl(struct inode *inode, struct file *filp,
if (val < 0)
return -EINVAL;
val = min_t(int, val,
- sdp->device->request_queue->max_sectors * 512);
- if (val != sfp->reserve.bufflen) {
- if (sg_res_in_use(sfp) || sfp->mmap_called)
- return -EBUSY;
- sg_remove_scat(&sfp->reserve);
- sg_build_reserve(sfp, val);
- }
- return 0;
+ sdp->device->request_queue->max_sectors * 512);
+ return sg_build_reserve(sfp, val);
case SG_GET_RESERVED_SIZE:
val = min_t(int, sfp->reserve.bufflen,
sdp->device->request_queue->max_sectors * 512);
@@ -1060,9 +1029,6 @@ sg_ioctl(struct inode *inode, struct file *filp,
if (sdp->detached)
return -ENODEV;
return scsi_ioctl(sdp->device, cmd_in, p);
- case BLKSECTGET:
- return put_user(sdp->device->request_queue->max_sectors * 512,
- ip);
default:
if (read_only)
return -EPERM; /* don't know so take safe approach */
@@ -1148,72 +1114,53 @@ static struct page *
sg_vma_nopage(struct vm_area_struct *vma, unsigned long addr, int *type)
{
Sg_fd *sfp;
- struct page *page = NOPAGE_SIGBUS;
- unsigned long offset, len, sa;
- Sg_scatter_hold *rsv_schp;
- struct scatterlist *sg;
- int k;
if ((NULL == vma) || (!(sfp = (Sg_fd *) vma->vm_private_data)))
- return page;
- rsv_schp = &sfp->reserve;
- offset = addr - vma->vm_start;
- if (offset >= rsv_schp->bufflen)
- return page;
- SCSI_LOG_TIMEOUT(3, printk("sg_vma_nopage: offset=%lu, scatg=%d\n",
- offset, rsv_schp->k_use_sg));
- sg = rsv_schp->buffer;
- sa = vma->vm_start;
- for (k = 0; (k < rsv_schp->k_use_sg) && (sa < vma->vm_end);
- ++k, sg = sg_next(sg)) {
- len = vma->vm_end - sa;
- len = (len < sg->length) ? len : sg->length;
- if (offset < len) {
- page = virt_to_page(page_address(sg->page) + offset);
- get_page(page); /* increment page count */
- break;
- }
- sa += len;
- offset -= len;
- }
+ return NOPAGE_SIGBUS;
- if (type)
- *type = VM_FAULT_MINOR;
- return page;
+ return blk_rq_vma_nopage(sfp->reserve.bmd, vma, addr, type);
+}
+
+static void
+sg_vma_close(struct vm_area_struct *vma)
+{
+ Sg_fd *sfp = vma->vm_private_data;
+
+ if (!sfp || !sfp->reserve.bmd)
+ return;
+ blk_rq_mmap_close(sfp->reserve.bmd);
+ sfp->reserve.bmd = NULL;
}
static struct vm_operations_struct sg_mmap_vm_ops = {
.nopage = sg_vma_nopage,
+ .close = sg_vma_close,
};
static int
sg_mmap(struct file *filp, struct vm_area_struct *vma)
{
Sg_fd *sfp;
- unsigned long req_sz, len, sa;
- Sg_scatter_hold *rsv_schp;
- int k;
- struct scatterlist *sg;
if ((!filp) || (!vma) || (!(sfp = (Sg_fd *) filp->private_data)))
return -ENXIO;
- req_sz = vma->vm_end - vma->vm_start;
- SCSI_LOG_TIMEOUT(3, printk("sg_mmap starting, vm_start=%p, len=%d\n",
- (void *) vma->vm_start, (int) req_sz));
- if (vma->vm_pgoff)
- return -EINVAL; /* want no offset */
- rsv_schp = &sfp->reserve;
- if (req_sz > rsv_schp->bufflen)
- return -ENOMEM; /* cannot map more than reserved buffer */
-
- sa = vma->vm_start;
- sg = rsv_schp->buffer;
- for (k = 0; (k < rsv_schp->k_use_sg) && (sa < vma->vm_end);
- ++k, sg = sg_next(sg)) {
- len = vma->vm_end - sa;
- len = (len < sg->length) ? len : sg->length;
- sa += len;
- }
+ if (sfp->reserve.bmd)
+ return -ENXIO;
+ if (!sfp->reserve.bs)
+ return -ENOMEM;
+
+ SCSI_LOG_TIMEOUT(3, printk("sg_mmap starting, vm_start=%p\n",
+ (void *) vma->vm_start));
+
+ /*
+ * This only allocates the buffer and checks we can execute the op.
+ * We do not build the request until it is sent down through the write.
+ */
+ sfp->reserve.bmd = blk_rq_mmap_open(sfp->reserve.bs,
+ sfp->parentdp->device->request_queue,
+ vma);
+ if (!sfp->reserve.bmd)
+ return -ENOMEM;
sfp->mmap_called = 1;
vma->vm_flags |= VM_RESERVED;
@@ -1223,18 +1170,18 @@ sg_mmap(struct file *filp, struct vm_area_struct *vma)
}
/* This function is a "bottom half" handler that is called by the
- * mid level when a command is completed (or has failed). */
+ * block level when a command is completed (or has failed). */
static void
-sg_cmd_done(void *data, char *sense, int result, int resid)
+sg_cmd_done(struct request *rq, int uptodate)
{
- Sg_request *srp = data;
+ Sg_request *srp = rq->end_io_data;
Sg_device *sdp = NULL;
Sg_fd *sfp;
unsigned long iflags;
unsigned int ms;
if (NULL == srp) {
- printk(KERN_ERR "sg_cmd_done: NULL request\n");
+ __blk_put_request(rq->q, rq);
return;
}
sfp = srp->parentfp;
@@ -1247,29 +1194,28 @@ sg_cmd_done(void *data, char *sense, int result, int resid)
SCSI_LOG_TIMEOUT(4, printk("sg_cmd_done: %s, pack_id=%d, res=0x%x\n",
- sdp->disk->disk_name, srp->header.pack_id, result));
- srp->header.resid = resid;
+ sdp->disk->disk_name, srp->header.pack_id, rq->errors));
+ srp->header.resid = rq->data_len;
ms = jiffies_to_msecs(jiffies);
srp->header.duration = (ms > srp->header.duration) ?
(ms - srp->header.duration) : 0;
- if (0 != result) {
+ if (0 != rq->errors) {
struct scsi_sense_hdr sshdr;
- memcpy(srp->sense_b, sense, sizeof (srp->sense_b));
- srp->header.status = 0xff & result;
- srp->header.masked_status = status_byte(result);
- srp->header.msg_status = msg_byte(result);
- srp->header.host_status = host_byte(result);
- srp->header.driver_status = driver_byte(result);
+ srp->header.status = 0xff & rq->errors;
+ srp->header.masked_status = status_byte(rq->errors);
+ srp->header.msg_status = msg_byte(rq->errors);
+ srp->header.host_status = host_byte(rq->errors);
+ srp->header.driver_status = driver_byte(rq->errors);
if ((sdp->sgdebug > 0) &&
((CHECK_CONDITION == srp->header.masked_status) ||
(COMMAND_TERMINATED == srp->header.masked_status)))
- __scsi_print_sense("sg_cmd_done", sense,
- SCSI_SENSE_BUFFERSIZE);
+ __scsi_print_sense("sg_cmd_done", rq->sense,
+ rq->sense_len);
/* Following if statement is a patch supplied by Eric Youngdale */
- if (driver_byte(result) != 0
- && scsi_normalize_sense(sense, SCSI_SENSE_BUFFERSIZE, &sshdr)
+ if (driver_byte(rq->errors) != 0
+ && scsi_normalize_sense(rq->sense, rq->sense_len, &sshdr)
&& !scsi_sense_is_deferred(&sshdr)
&& sshdr.sense_key == UNIT_ATTENTION
&& sdp->device->removable) {
@@ -1278,12 +1224,14 @@ sg_cmd_done(void *data, char *sense, int result, int resid)
sdp->device->changed = 1;
}
}
+
+ srp->request = NULL;
+ __blk_put_request(rq->q, rq);
/* Rely on write phase to clean out srp status values, so no "else" */
if (sfp->closed) { /* whoops this fd already released, cleanup */
SCSI_LOG_TIMEOUT(1, printk("sg_cmd_done: already closed, freeing ...\n"));
sg_finish_rem_req(srp);
- srp = NULL;
if (NULL == sfp->headrp) {
SCSI_LOG_TIMEOUT(1, printk("sg_cmd_done: already closed, final cleanup\n"));
if (0 == sg_remove_sfp(sdp, sfp)) { /* device still present */
@@ -1294,10 +1242,8 @@ sg_cmd_done(void *data, char *sense, int result, int resid)
} else if (srp && srp->orphan) {
if (sfp->keep_orphan)
srp->sg_io_owned = 0;
- else {
+ else
sg_finish_rem_req(srp);
- srp = NULL;
- }
}
if (sfp && srp) {
/* Now wake up any sg_read() that is waiting for this packet. */
@@ -1521,7 +1467,6 @@ sg_remove(struct class_device *cl_dev, struct class_interface *cl_intf)
msleep(10); /* dirty detach so delay device destruction */
}
-module_param_named(scatter_elem_sz, scatter_elem_sz, int, S_IRUGO | S_IWUSR);
module_param_named(def_reserved_size, def_reserved_size, int,
S_IRUGO | S_IWUSR);
module_param_named(allow_dio, sg_allow_dio, int, S_IRUGO | S_IWUSR);
@@ -1532,8 +1477,6 @@ MODULE_LICENSE("GPL");
MODULE_VERSION(SG_VERSION_STR);
MODULE_ALIAS_CHARDEV_MAJOR(SCSI_GENERIC_MAJOR);
-MODULE_PARM_DESC(scatter_elem_sz, "scatter gather element "
- "size (default: max(SG_SCATTER_SZ, PAGE_SIZE))");
MODULE_PARM_DESC(def_reserved_size, "size of buffer reserved for each fd");
MODULE_PARM_DESC(allow_dio, "allow direct I/O (default: 0 (disallow))");
@@ -1542,10 +1485,6 @@ init_sg(void)
{
int rc;
- if (scatter_elem_sz < PAGE_SIZE) {
- scatter_elem_sz = PAGE_SIZE;
- scatter_elem_sz_prev = scatter_elem_sz;
- }
if (def_reserved_size >= 0)
sg_big_buff = def_reserved_size;
else
@@ -1589,602 +1528,279 @@ exit_sg(void)
}
static int
-sg_start_req(Sg_request * srp)
+sg_setup_req(Sg_request * srp)
{
- int res;
+ struct request *rq = srp->request;
Sg_fd *sfp = srp->parentfp;
sg_io_hdr_t *hp = &srp->header;
+ struct sg_iovec *u_iov;
int dxfer_len = (int) hp->dxfer_len;
int dxfer_dir = hp->dxfer_direction;
- Sg_scatter_hold *req_schp = &srp->data;
- Sg_scatter_hold *rsv_schp = &sfp->reserve;
+ int new_interface = ('\0' == hp->interface_id) ? 0 : 1;
+ int res = 0, num_xfer = 0, size;
- SCSI_LOG_TIMEOUT(4, printk("sg_start_req: dxfer_len=%d\n", dxfer_len));
- if ((dxfer_len <= 0) || (dxfer_dir == SG_DXFER_NONE))
+ SCSI_LOG_TIMEOUT(4, printk("sg_setup_req: dxfer_len=%d\n", dxfer_len));
+
+ /* no transfer */
+ if ((dxfer_len <= 0) || (dxfer_dir == SG_DXFER_NONE) ||
+ (new_interface && (SG_FLAG_NO_DXFER & hp->flags)))
return 0;
+
+ /* mmap */
+ if (new_interface && (SG_FLAG_MMAP_IO & hp->flags)) {
+ res = blk_rq_setup_mmap_buffer(rq, sfp->reserve.bmd,
+ dxfer_len, GFP_NOIO);
+ if (res)
+ goto fail;
+ goto done;
+ }
+
+ /* dio */
if (sg_allow_dio && (hp->flags & SG_FLAG_DIRECT_IO) &&
(dxfer_dir != SG_DXFER_UNKNOWN) && (0 == hp->iovec_count) &&
(!sfp->parentdp->device->host->unchecked_isa_dma)) {
- res = sg_build_direct(srp, sfp, dxfer_len);
- if (res <= 0) /* -ve -> error, 0 -> done, 1 -> try indirect */
- return res;
- }
- if ((!sg_res_in_use(sfp)) && (dxfer_len <= rsv_schp->bufflen))
- sg_link_reserve(sfp, srp, dxfer_len);
- else {
- res = sg_build_indirect(req_schp, sfp, dxfer_len);
- if (res) {
- sg_remove_scat(req_schp);
- return res;
+ res = blk_rq_setup_buffer(sfp->reserve.bs, rq, hp->dxferp,
+ dxfer_len, GFP_NOIO);
+ if (!res) {
+ hp->info |= SG_INFO_DIRECT_IO;
+ goto done;
}
+ /* drop down to copy */
}
- return 0;
-}
-
-static void
-sg_finish_rem_req(Sg_request * srp)
-{
- Sg_fd *sfp = srp->parentfp;
- Sg_scatter_hold *req_schp = &srp->data;
-
- SCSI_LOG_TIMEOUT(4, printk("sg_finish_rem_req: res_used=%d\n", (int) srp->res_used));
- if (srp->res_used)
- sg_unlink_reserve(sfp, srp);
- else
- sg_remove_scat(req_schp);
- sg_remove_request(sfp, srp);
-}
-
-static int
-sg_build_sgat(Sg_scatter_hold * schp, const Sg_fd * sfp, int tablesize)
-{
- int sg_bufflen = tablesize * sizeof(struct scatterlist);
- gfp_t gfp_flags = GFP_ATOMIC | __GFP_NOWARN;
-
- /*
- * TODO: test without low_dma, we should not need it since
- * the block layer will bounce the buffer for us
- *
- * XXX(hch): we shouldn't need GFP_DMA for the actual S/G list.
- */
- if (sfp->low_dma)
- gfp_flags |= GFP_DMA;
- schp->buffer = kzalloc(sg_bufflen, gfp_flags);
- if (!schp->buffer)
- return -ENOMEM;
- schp->sglist_len = sg_bufflen;
- return tablesize; /* number of scat_gath elements allocated */
-}
-
-#ifdef SG_ALLOW_DIO_CODE
-/* vvvvvvvv following code borrowed from st driver's direct IO vvvvvvvvv */
- /* TODO: hopefully we can use the generic block layer code */
-
-/* Pin down user pages and put them into a scatter gather list. Returns <= 0 if
- - mapping of all pages not successful
- (i.e., either completely successful or fails)
-*/
-static int
-st_map_user_pages(struct scatterlist *sgl, const unsigned int max_pages,
- unsigned long uaddr, size_t count, int rw)
-{
- unsigned long end = (uaddr + count + PAGE_SIZE - 1) >> PAGE_SHIFT;
- unsigned long start = uaddr >> PAGE_SHIFT;
- const int nr_pages = end - start;
- int res, i, j;
- struct page **pages;
-
- /* User attempted Overflow! */
- if ((uaddr + count) < uaddr)
- return -EINVAL;
- /* Too big */
- if (nr_pages > max_pages)
- return -ENOMEM;
+ /* copy */
+ /* old interface put SG_DXFER_TO_DEV/SG_DXFER_TO_FROM_DEV in flags */
+ if ((SG_DXFER_UNKNOWN == dxfer_dir) || (SG_DXFER_TO_DEV == dxfer_dir) ||
+ (SG_DXFER_TO_FROM_DEV == dxfer_dir)) {
+ num_xfer = (int) (new_interface ? hp->dxfer_len : hp->flags);
+ if (num_xfer > dxfer_len)
+ num_xfer = dxfer_len;
+ }
- /* Hmm? */
- if (count == 0)
- return 0;
+ SCSI_LOG_TIMEOUT(4, printk("sg_setup_req: Try xfer num_xfer=%d, "
+ "iovec_count=%d\n", dxfer_len, hp->iovec_count));
- if ((pages = kmalloc(max_pages * sizeof(*pages), GFP_ATOMIC)) == NULL)
- return -ENOMEM;
+ if (!hp->iovec_count) {
+ struct sg_iovec iov;
- /* Try to fault in all of the necessary pages */
- down_read(¤t->mm->mmap_sem);
- /* rw==READ means read from drive, write into memory area */
- res = get_user_pages(
- current,
- current->mm,
- uaddr,
- nr_pages,
- rw == READ,
- 0, /* don't force */
- pages,
- NULL);
- up_read(¤t->mm->mmap_sem);
-
- /* Errors and no page mapped should return here */
- if (res < nr_pages)
- goto out_unmap;
-
- for (i=0; i < nr_pages; i++) {
- /* FIXME: flush superflous for rw==READ,
- * probably wrong function for rw==WRITE
- */
- flush_dcache_page(pages[i]);
- /* ?? Is locking needed? I don't think so */
- /* if (TestSetPageLocked(pages[i]))
- goto out_unlock; */
- }
+ iov.iov_base = hp->dxferp;
+ iov.iov_len = num_xfer;
- sgl[0].page = pages[0];
- sgl[0].offset = uaddr & ~PAGE_MASK;
- if (nr_pages > 1) {
- sgl[0].length = PAGE_SIZE - sgl[0].offset;
- count -= sgl[0].length;
- for (i=1; i < nr_pages ; i++) {
- sgl[i].page = pages[i];
- sgl[i].length = count < PAGE_SIZE ? count : PAGE_SIZE;
- count -= PAGE_SIZE;
- }
- }
- else {
- sgl[0].length = count;
+ res = blk_rq_copy_user_iov(sfp->reserve.bs, rq, &iov, 1,
+ dxfer_len, GFP_NOIO);
+ if (res)
+ goto fail;
+ goto done;
}
- kfree(pages);
- return nr_pages;
-
- out_unmap:
- if (res > 0) {
- for (j=0; j < res; j++)
- page_cache_release(pages[j]);
- res = 0;
+ if (!access_ok(VERIFY_READ, hp->dxferp,
+ SZ_SG_IOVEC * hp->iovec_count)) {
+ res = -EFAULT;
+ goto fail;
}
- kfree(pages);
- return res;
-}
-
-
-/* And unmap them... */
-static int
-st_unmap_user_pages(struct scatterlist *sgl, const unsigned int nr_pages,
- int dirtied)
-{
- int i;
-
- for (i=0; i < nr_pages; i++) {
- struct page *page = sgl[i].page;
- if (dirtied)
- SetPageDirty(page);
- /* unlock_page(page); */
- /* FIXME: cache flush missing for rw==READ
- * FIXME: call the correct reference counting function
- */
- page_cache_release(page);
+ size = SZ_SG_IOVEC * hp->iovec_count;
+ u_iov = kmalloc(size, GFP_KERNEL);
+ if (!u_iov) {
+ res = -ENOMEM;
+ goto fail;
}
- return 0;
-}
-
-/* ^^^^^^^^ above code borrowed from st driver's direct IO ^^^^^^^^^ */
-#endif
-
-
-/* Returns: -ve -> error, 0 -> done, 1 -> try indirect */
-static int
-sg_build_direct(Sg_request * srp, Sg_fd * sfp, int dxfer_len)
-{
-#ifdef SG_ALLOW_DIO_CODE
- sg_io_hdr_t *hp = &srp->header;
- Sg_scatter_hold *schp = &srp->data;
- int sg_tablesize = sfp->parentdp->sg_tablesize;
- int mx_sc_elems, res;
- struct scsi_device *sdev = sfp->parentdp->device;
+ if (copy_from_user(u_iov, hp->dxferp, size)) {
+ kfree(u_iov);
+ res = -EFAULT;
+ goto fail;
+ }
- if (((unsigned long)hp->dxferp &
- queue_dma_alignment(sdev->request_queue)) != 0)
- return 1;
+ res = blk_rq_copy_user_iov(sfp->reserve.bs, rq, u_iov, hp->iovec_count,
+ dxfer_len, GFP_NOIO);
+ kfree(u_iov);
+ if (res)
+ goto fail;
- mx_sc_elems = sg_build_sgat(schp, sfp, sg_tablesize);
- if (mx_sc_elems <= 0) {
- return 1;
- }
- res = st_map_user_pages(schp->buffer, mx_sc_elems,
- (unsigned long)hp->dxferp, dxfer_len,
- (SG_DXFER_TO_DEV == hp->dxfer_direction) ? 1 : 0);
- if (res <= 0) {
- sg_remove_scat(schp);
- return 1;
- }
- schp->k_use_sg = res;
- schp->dio_in_use = 1;
- hp->info |= SG_INFO_DIRECT_IO;
+done:
+ /* the blk/bio layer handles mmap cleanup */
+ if (!(new_interface && (SG_FLAG_MMAP_IO & hp->flags)))
+ /* must save for later unmapping */
+ srp->bio = rq->bio;
+ srp->res_used = 1;
return 0;
-#else
- return 1;
-#endif
+
+fail:
+ return res;
}
-static int
-sg_build_indirect(Sg_scatter_hold * schp, Sg_fd * sfp, int buff_size)
+static void
+sg_finish_rem_req(Sg_request * srp)
{
- struct scatterlist *sg;
- int ret_sz = 0, k, rem_sz, num, mx_sc_elems;
- int sg_tablesize = sfp->parentdp->sg_tablesize;
- int blk_size = buff_size;
- struct page *p = NULL;
-
- if (blk_size < 0)
- return -EFAULT;
- if (0 == blk_size)
- ++blk_size; /* don't know why */
-/* round request up to next highest SG_SECTOR_SZ byte boundary */
- blk_size = (blk_size + SG_SECTOR_MSK) & (~SG_SECTOR_MSK);
- SCSI_LOG_TIMEOUT(4, printk("sg_build_indirect: buff_size=%d, blk_size=%d\n",
- buff_size, blk_size));
-
- /* N.B. ret_sz carried into this block ... */
- mx_sc_elems = sg_build_sgat(schp, sfp, sg_tablesize);
- if (mx_sc_elems < 0)
- return mx_sc_elems; /* most likely -ENOMEM */
-
- num = scatter_elem_sz;
- if (unlikely(num != scatter_elem_sz_prev)) {
- if (num < PAGE_SIZE) {
- scatter_elem_sz = PAGE_SIZE;
- scatter_elem_sz_prev = PAGE_SIZE;
- } else
- scatter_elem_sz_prev = num;
- }
- for (k = 0, sg = schp->buffer, rem_sz = blk_size;
- (rem_sz > 0) && (k < mx_sc_elems);
- ++k, rem_sz -= ret_sz, sg = sg_next(sg)) {
-
- num = (rem_sz > scatter_elem_sz_prev) ?
- scatter_elem_sz_prev : rem_sz;
- p = sg_page_malloc(num, sfp->low_dma, &ret_sz);
- if (!p)
- return -ENOMEM;
-
- if (num == scatter_elem_sz_prev) {
- if (unlikely(ret_sz > scatter_elem_sz_prev)) {
- scatter_elem_sz = ret_sz;
- scatter_elem_sz_prev = ret_sz;
- }
- }
- sg->page = p;
- sg->length = (ret_sz > num) ? num : ret_sz;
-
- SCSI_LOG_TIMEOUT(5, printk("sg_build_indirect: k=%d, num=%d, "
- "ret_sz=%d\n", k, num, ret_sz));
- } /* end of for loop */
-
- schp->k_use_sg = k;
- SCSI_LOG_TIMEOUT(5, printk("sg_build_indirect: k_use_sg=%d, "
- "rem_sz=%d\n", k, rem_sz));
+ Sg_fd *sfp = srp->parentfp;
- schp->bufflen = blk_size;
- if (rem_sz > 0) /* must have failed */
- return -ENOMEM;
+ SCSI_LOG_TIMEOUT(4, printk("sg_finish_rem_req: res_used=%d\n", (int) srp->res_used));
- return 0;
+ if (srp->bio)
+ /*
+ * buffer is left from something like a signal or close
+ * which was being accessed at the time. We cannot copy
+ * back to userspace so just release buffers.
+ *
+ * BUG: the old sg.c and this code can get run from a softirq
+ * and if dio was used then we need process context.
+ * TODO: either document that you cannot use DIO and the feature
+ * which closes devices or interrupts IO while DIO is in
+ * progress, or do something like James' process context exec
+ */
+ blk_rq_destroy_buffer(srp->bio);
+ sg_cleanup_transfer(srp);
+ sg_remove_request(sfp, srp);
}
static int
-sg_write_xfer(Sg_request * srp)
+sg_read_xfer(Sg_request * srp)
{
sg_io_hdr_t *hp = &srp->header;
- Sg_scatter_hold *schp = &srp->data;
- struct scatterlist *sg = schp->buffer;
- int num_xfer = 0;
- int j, k, onum, usglen, ksglen, res;
int iovec_count = (int) hp->iovec_count;
- int dxfer_dir = hp->dxfer_direction;
- unsigned char *p;
- unsigned char __user *up;
int new_interface = ('\0' == hp->interface_id) ? 0 : 1;
+ int res = 0, num_xfer = 0;
+ int dxfer_dir = hp->dxfer_direction;
- if ((SG_DXFER_UNKNOWN == dxfer_dir) || (SG_DXFER_TO_DEV == dxfer_dir) ||
- (SG_DXFER_TO_FROM_DEV == dxfer_dir)) {
- num_xfer = (int) (new_interface ? hp->dxfer_len : hp->flags);
- if (schp->bufflen < num_xfer)
- num_xfer = schp->bufflen;
- }
- if ((num_xfer <= 0) || (schp->dio_in_use) ||
- (new_interface
- && ((SG_FLAG_NO_DXFER | SG_FLAG_MMAP_IO) & hp->flags)))
+ if (new_interface && (SG_FLAG_NO_DXFER & hp->flags))
return 0;
- SCSI_LOG_TIMEOUT(4, printk("sg_write_xfer: num_xfer=%d, iovec_count=%d, k_use_sg=%d\n",
- num_xfer, iovec_count, schp->k_use_sg));
+ SCSI_LOG_TIMEOUT(4, printk("sg_read_xfer\n"));
+
+ if (SG_DXFER_UNKNOWN == dxfer_dir ||
+ SG_DXFER_FROM_DEV == dxfer_dir ||
+ SG_DXFER_TO_FROM_DEV == dxfer_dir)
+ num_xfer = hp->dxfer_len;
if (iovec_count) {
- onum = iovec_count;
- if (!access_ok(VERIFY_READ, hp->dxferp, SZ_SG_IOVEC * onum))
+ int size;
+ struct sg_iovec *u_iov;
+
+ if (!access_ok(VERIFY_READ, hp->dxferp,
+ SZ_SG_IOVEC * iovec_count))
return -EFAULT;
- } else
- onum = 1;
- ksglen = sg->length;
- p = page_address(sg->page);
- for (j = 0, k = 0; j < onum; ++j) {
- res = sg_u_iovec(hp, iovec_count, j, 1, &usglen, &up);
- if (res)
- return res;
+ size = SZ_SG_IOVEC * iovec_count;
+ u_iov = kmalloc(size, GFP_KERNEL);
+ if (!u_iov)
+ return -ENOMEM;
- for (; p; sg = sg_next(sg), ksglen = sg->length,
- p = page_address(sg->page)) {
- if (usglen <= 0)
- break;
- if (ksglen > usglen) {
- if (usglen >= num_xfer) {
- if (__copy_from_user(p, up, num_xfer))
- return -EFAULT;
- return 0;
- }
- if (__copy_from_user(p, up, usglen))
- return -EFAULT;
- p += usglen;
- ksglen -= usglen;
- break;
- } else {
- if (ksglen >= num_xfer) {
- if (__copy_from_user(p, up, num_xfer))
- return -EFAULT;
- return 0;
- }
- if (__copy_from_user(p, up, ksglen))
- return -EFAULT;
- up += ksglen;
- usglen -= ksglen;
- }
- ++k;
- if (k >= schp->k_use_sg)
- return 0;
+ if (copy_from_user(u_iov, hp->dxferp, size)) {
+ kfree(u_iov);
+ return -EFAULT;
}
- }
- return 0;
+ res = blk_rq_uncopy_user_iov(srp->bio, u_iov, iovec_count);
+ kfree(u_iov);
+ } else if (!(new_interface && (SG_FLAG_MMAP_IO & hp->flags)))
+ /*
+ * dio or non iovec copy user. For mmap blk/bio layer
+ * handles the cleanup.
+ */
+ res = blk_rq_complete_transfer(srp->bio, hp->dxferp, num_xfer);
+ sg_cleanup_transfer(srp);
+ return res;
}
static int
-sg_u_iovec(sg_io_hdr_t * hp, int sg_num, int ind,
- int wr_xf, int *countp, unsigned char __user **up)
+sg_res_in_use(Sg_fd * sfp)
{
- int num_xfer = (int) hp->dxfer_len;
- unsigned char __user *p = hp->dxferp;
- int count;
+ const Sg_request *srp;
+ unsigned long iflags;
- if (0 == sg_num) {
- if (wr_xf && ('\0' == hp->interface_id))
- count = (int) hp->flags; /* holds "old" input_size */
- else
- count = num_xfer;
- } else {
- sg_iovec_t iovec;
- if (__copy_from_user(&iovec, p + ind*SZ_SG_IOVEC, SZ_SG_IOVEC))
- return -EFAULT;
- p = iovec.iov_base;
- count = (int) iovec.iov_len;
- }
- if (!access_ok(wr_xf ? VERIFY_READ : VERIFY_WRITE, p, count))
- return -EFAULT;
- if (up)
- *up = p;
- if (countp)
- *countp = count;
- return 0;
+ read_lock_irqsave(&sfp->rq_list_lock, iflags);
+ for (srp = sfp->headrp; srp; srp = srp->nextrp)
+ if (srp->res_used)
+ break;
+ read_unlock_irqrestore(&sfp->rq_list_lock, iflags);
+ return srp ? 1 : 0;
}
static void
-sg_remove_scat(Sg_scatter_hold * schp)
+sg_calc_reserve_settings(struct request_queue *q, int req_size, int *order,
+ int *nr_segs)
{
- SCSI_LOG_TIMEOUT(4, printk("sg_remove_scat: k_use_sg=%d\n", schp->k_use_sg));
- if (schp->buffer && (schp->sglist_len > 0)) {
- struct scatterlist *sg = schp->buffer;
+ unsigned int bytes;
- if (schp->dio_in_use) {
-#ifdef SG_ALLOW_DIO_CODE
- st_unmap_user_pages(sg, schp->k_use_sg, TRUE);
-#endif
- } else {
- int k;
-
- for (k = 0; (k < schp->k_use_sg) && sg->page;
- ++k, sg = sg_next(sg)) {
- SCSI_LOG_TIMEOUT(5, printk(
- "sg_remove_scat: k=%d, pg=0x%p, len=%d\n",
- k, sg->page, sg->length));
- sg_page_free(sg->page, sg->length);
- }
- }
- kfree(schp->buffer);
- }
- memset(schp, 0, sizeof (*schp));
-}
-
-static int
-sg_read_xfer(Sg_request * srp)
-{
- sg_io_hdr_t *hp = &srp->header;
- Sg_scatter_hold *schp = &srp->data;
- struct scatterlist *sg = schp->buffer;
- int num_xfer = 0;
- int j, k, onum, usglen, ksglen, res;
- int iovec_count = (int) hp->iovec_count;
- int dxfer_dir = hp->dxfer_direction;
- unsigned char *p;
- unsigned char __user *up;
- int new_interface = ('\0' == hp->interface_id) ? 0 : 1;
+ *order = 0;
+ *nr_segs = 0;
- if ((SG_DXFER_UNKNOWN == dxfer_dir) || (SG_DXFER_FROM_DEV == dxfer_dir)
- || (SG_DXFER_TO_FROM_DEV == dxfer_dir)) {
- num_xfer = hp->dxfer_len;
- if (schp->bufflen < num_xfer)
- num_xfer = schp->bufflen;
+ if (req_size <= PAGE_SIZE) {
+ bytes = PAGE_SIZE;
+ *order = 0;
+ goto calc_segs;
}
- if ((num_xfer <= 0) || (schp->dio_in_use) ||
- (new_interface
- && ((SG_FLAG_NO_DXFER | SG_FLAG_MMAP_IO) & hp->flags)))
- return 0;
-
- SCSI_LOG_TIMEOUT(4, printk("sg_read_xfer: num_xfer=%d, iovec_count=%d, k_use_sg=%d\n",
- num_xfer, iovec_count, schp->k_use_sg));
- if (iovec_count) {
- onum = iovec_count;
- if (!access_ok(VERIFY_READ, hp->dxferp, SZ_SG_IOVEC * onum))
- return -EFAULT;
- } else
- onum = 1;
- p = page_address(sg->page);
- ksglen = sg->length;
- for (j = 0, k = 0; j < onum; ++j) {
- res = sg_u_iovec(hp, iovec_count, j, 0, &usglen, &up);
- if (res)
- return res;
-
- for (; p; sg = sg_next(sg), ksglen = sg->length,
- p = page_address(sg->page)) {
- if (usglen <= 0)
- break;
- if (ksglen > usglen) {
- if (usglen >= num_xfer) {
- if (__copy_to_user(up, p, num_xfer))
- return -EFAULT;
- return 0;
- }
- if (__copy_to_user(up, p, usglen))
- return -EFAULT;
- p += usglen;
- ksglen -= usglen;
- break;
- } else {
- if (ksglen >= num_xfer) {
- if (__copy_to_user(up, p, num_xfer))
- return -EFAULT;
- return 0;
- }
- if (__copy_to_user(up, p, ksglen))
- return -EFAULT;
- up += ksglen;
- usglen -= ksglen;
- }
- ++k;
- if (k >= schp->k_use_sg)
- return 0;
- }
+ if (!(q->queue_flags & (1 << QUEUE_FLAG_CLUSTER))) {
+ *order = 0;
+ bytes = PAGE_SIZE;
+ goto calc_segs;
}
- return 0;
-}
-
-static int
-sg_read_oxfer(Sg_request * srp, char __user *outp, int num_read_xfer)
-{
- Sg_scatter_hold *schp = &srp->data;
- struct scatterlist *sg = schp->buffer;
- int k, num;
-
- SCSI_LOG_TIMEOUT(4, printk("sg_read_oxfer: num_read_xfer=%d\n",
- num_read_xfer));
- if ((!outp) || (num_read_xfer <= 0))
- return 0;
+ bytes = min(q->max_segment_size, q->max_hw_sectors << 9);
+ if (bytes > BIO_MAX_SIZE)
+ bytes = BIO_MAX_SIZE;
+ else if (bytes > req_size)
+ bytes = req_size;
+ *order = get_order(bytes);
- for (k = 0; (k < schp->k_use_sg) && sg->page; ++k, sg = sg_next(sg)) {
- num = sg->length;
- if (num > num_read_xfer) {
- if (__copy_to_user(outp, page_address(sg->page),
- num_read_xfer))
- return -EFAULT;
- break;
- } else {
- if (__copy_to_user(outp, page_address(sg->page),
- num))
- return -EFAULT;
- num_read_xfer -= num;
- if (num_read_xfer <= 0)
- break;
- outp += num;
- }
- }
-
- return 0;
+calc_segs:
+ *nr_segs = req_size / bytes;
+ if ((bytes * (*nr_segs)) < req_size)
+ *nr_segs = *nr_segs + 1;
}
-static void
+static int
sg_build_reserve(Sg_fd * sfp, int req_size)
{
- Sg_scatter_hold *schp = &sfp->reserve;
+ struct request_queue *q = sfp->parentdp->device->request_queue;
+ int order = 0, nr_segs = 0, old_nr_segs;
+ struct bio_set *old_bs = NULL;
+ unsigned old_bufflen;
SCSI_LOG_TIMEOUT(4, printk("sg_build_reserve: req_size=%d\n", req_size));
- do {
- if (req_size < PAGE_SIZE)
- req_size = PAGE_SIZE;
- if (0 == sg_build_indirect(schp, sfp, req_size))
- return;
- else
- sg_remove_scat(schp);
- req_size >>= 1; /* divide by 2 */
- } while (req_size > (PAGE_SIZE / 2));
-}
+ if (req_size < 0)
+ return -EINVAL;
-static void
-sg_link_reserve(Sg_fd * sfp, Sg_request * srp, int size)
-{
- Sg_scatter_hold *req_schp = &srp->data;
- Sg_scatter_hold *rsv_schp = &sfp->reserve;
- struct scatterlist *sg = rsv_schp->buffer;
- int k, num, rem;
+ if (req_size == 0)
+ return 0;
- srp->res_used = 1;
- SCSI_LOG_TIMEOUT(4, printk("sg_link_reserve: size=%d\n", size));
- rem = size;
-
- for (k = 0; k < rsv_schp->k_use_sg; ++k, sg = sg_next(sg)) {
- num = sg->length;
- if (rem <= num) {
- sfp->save_scat_len = num;
- sg->length = rem;
- req_schp->k_use_sg = k + 1;
- req_schp->sglist_len = rsv_schp->sglist_len;
- req_schp->buffer = rsv_schp->buffer;
-
- req_schp->bufflen = size;
- req_schp->b_malloc_len = rsv_schp->b_malloc_len;
- break;
- } else
- rem -= num;
+ if (sfp->reserve.bs &&
+ (bioset_pagepool_get_size(sfp->reserve.bs) == req_size))
+ return 0;
+
+ if (sfp->mmap_called || sg_res_in_use(sfp))
+ return -EBUSY;
+
+ if (sfp->reserve.bs) {
+ old_bs = sfp->reserve.bs;
+ sfp->reserve.bs = NULL;
}
- if (k >= rsv_schp->k_use_sg)
- SCSI_LOG_TIMEOUT(1, printk("sg_link_reserve: BAD size\n"));
-}
+	old_bufflen = sfp->reserve.bufflen;
+	old_nr_segs = sfp->reserve.k_use_sg;
+	sfp->reserve.bufflen = 0;
+	sfp->reserve.k_use_sg = 0;
-static void
-sg_unlink_reserve(Sg_fd * sfp, Sg_request * srp)
-{
- Sg_scatter_hold *req_schp = &srp->data;
- Sg_scatter_hold *rsv_schp = &sfp->reserve;
+ sg_calc_reserve_settings(q, req_size, &order, &nr_segs);
+ /*
+ * the max reserve size was limited to the q->max_sectors,
+ * which fits in one bio.
+ */
+ sfp->reserve.bs = bioset_pagepool_create(1, 1, nr_segs, order);
+ if (!sfp->reserve.bs) {
+ if (old_bs) {
+ sfp->reserve.bs = old_bs;
+ sfp->reserve.bufflen = old_bufflen;
+ sfp->reserve.k_use_sg = old_nr_segs;
+ }
+ return -ENOMEM;
+ }
- SCSI_LOG_TIMEOUT(4, printk("sg_unlink_reserve: req->k_use_sg=%d\n",
- (int) req_schp->k_use_sg));
- if ((rsv_schp->k_use_sg > 0) && (req_schp->k_use_sg > 0)) {
- struct scatterlist *sg = rsv_schp->buffer;
+ if (old_bs)
+ bioset_pagepool_free(old_bs);
- if (sfp->save_scat_len > 0)
- (sg + (req_schp->k_use_sg - 1))->length =
- (unsigned) sfp->save_scat_len;
- else
- SCSI_LOG_TIMEOUT(1, printk ("sg_unlink_reserve: BAD save_scat_len\n"));
- }
- req_schp->k_use_sg = 0;
- req_schp->bufflen = 0;
- req_schp->buffer = NULL;
- req_schp->sglist_len = 0;
- sfp->save_scat_len = 0;
- srp->res_used = 0;
+ sfp->reserve.bufflen = bioset_pagepool_get_size(sfp->reserve.bs);
+ sfp->reserve.k_use_sg = nr_segs;
+ return 0;
}
static Sg_request *
@@ -2375,12 +1991,16 @@ __sg_remove_sfp(Sg_device * sdp, Sg_fd * sfp)
prev_fp = fp;
}
}
+
if (sfp->reserve.bufflen > 0) {
SCSI_LOG_TIMEOUT(6,
printk("__sg_remove_sfp: bufflen=%d, k_use_sg=%d\n",
(int) sfp->reserve.bufflen, (int) sfp->reserve.k_use_sg));
- sg_remove_scat(&sfp->reserve);
+ bioset_pagepool_free(sfp->reserve.bs);
+ sfp->reserve.bs = NULL;
+ sfp->reserve.bufflen = 0;
}
+
sfp->parentdp = NULL;
SCSI_LOG_TIMEOUT(6, printk("__sg_remove_sfp: sfp=0x%p\n", sfp));
kfree(sfp);
@@ -2425,67 +2045,6 @@ sg_remove_sfp(Sg_device * sdp, Sg_fd * sfp)
return res;
}
-static int
-sg_res_in_use(Sg_fd * sfp)
-{
- const Sg_request *srp;
- unsigned long iflags;
-
- read_lock_irqsave(&sfp->rq_list_lock, iflags);
- for (srp = sfp->headrp; srp; srp = srp->nextrp)
- if (srp->res_used)
- break;
- read_unlock_irqrestore(&sfp->rq_list_lock, iflags);
- return srp ? 1 : 0;
-}
-
-/* The size fetched (value output via retSzp) set when non-NULL return */
-static struct page *
-sg_page_malloc(int rqSz, int lowDma, int *retSzp)
-{
- struct page *resp = NULL;
- gfp_t page_mask;
- int order, a_size;
- int resSz;
-
- if ((rqSz <= 0) || (NULL == retSzp))
- return resp;
-
- if (lowDma)
- page_mask = GFP_ATOMIC | GFP_DMA | __GFP_COMP | __GFP_NOWARN;
- else
- page_mask = GFP_ATOMIC | __GFP_COMP | __GFP_NOWARN;
-
- for (order = 0, a_size = PAGE_SIZE; a_size < rqSz;
- order++, a_size <<= 1) ;
- resSz = a_size; /* rounded up if necessary */
- resp = alloc_pages(page_mask, order);
- while ((!resp) && order) {
- --order;
- a_size >>= 1; /* divide by 2, until PAGE_SIZE */
- resp = alloc_pages(page_mask, order); /* try half */
- resSz = a_size;
- }
- if (resp) {
- if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
- memset(page_address(resp), 0, resSz);
- *retSzp = resSz;
- }
- return resp;
-}
-
-static void
-sg_page_free(struct page *page, int size)
-{
- int order, a_size;
-
- if (!page)
- return;
- for (order = 0, a_size = PAGE_SIZE; a_size < size;
- order++, a_size <<= 1) ;
- __free_pages(page, order);
-}
-
#ifndef MAINTENANCE_IN_CMD
#define MAINTENANCE_IN_CMD 0xa3
#endif
--
1.5.1.2
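
For anyone reviewing the reserve sizing change in this patch: the new code
replaces the old allocate-then-halve retry loop with one up-front calculation
of a per-segment allocation order plus a segment count, which sg_build_reserve()
then hands to bioset_pagepool_create(). The snippet below is a minimal
userspace sketch of that arithmetic only, assuming 4 KiB pages; PAGE_SIZE,
BIO_MAX_SIZE, the plain `clustering' flag and the queue limits passed in main()
are illustrative stand-ins for the real request_queue fields, and order_for()
mimics what the kernel's get_order() does in the actual hunk.

#include <stdio.h>

#define PAGE_SIZE	4096UL
#define PAGE_SHIFT	12
#define BIO_MAX_SIZE	(256 * PAGE_SIZE)	/* illustrative stand-in */

/* smallest order such that (PAGE_SIZE << order) >= size, for size > 0 */
static int order_for(unsigned long size)
{
	int order = 0;

	size = (size - 1) >> PAGE_SHIFT;
	while (size) {
		order++;
		size >>= 1;
	}
	return order;
}

static void calc_reserve_settings(int clustering,
				  unsigned long max_segment_size,
				  unsigned long max_hw_sectors,
				  unsigned long req_size,
				  int *order, unsigned long *nr_segs)
{
	unsigned long bytes;

	if (!clustering) {
		/* queue cannot merge pages into larger segments: one page each */
		*order = 0;
		bytes = PAGE_SIZE;
		goto calc_segs;
	}

	/* largest useful segment: bounded by segment size and hw transfer size */
	bytes = max_segment_size;
	if (bytes > (max_hw_sectors << 9))
		bytes = max_hw_sectors << 9;
	if (bytes > BIO_MAX_SIZE)
		bytes = BIO_MAX_SIZE;
	else if (bytes > req_size)
		bytes = req_size;
	*order = order_for(bytes);
calc_segs:
	/* round up: enough segments of 'bytes' to cover req_size */
	*nr_segs = req_size / bytes;
	if (bytes * (*nr_segs) < req_size)
		(*nr_segs)++;
}

int main(void)
{
	int order;
	unsigned long nr_segs;

	/* e.g. 128 KiB reserve buffer, 64 KiB max segment, 256-sector hw limit */
	calc_reserve_settings(1, 64 * 1024, 256, 128 * 1024, &order, &nr_segs);
	printf("order=%d nr_segs=%lu\n", order, nr_segs);
	return 0;
}

The resulting order/nr_segs pair is what sizes the single bioset page pool
above, replacing both the removed sg_build_reserve() halving loop and the
hand-rolled order computation in sg_page_malloc()/sg_page_free().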
^ permalink raw reply related [flat|nested] 11+ messages in thread
end of thread, other threads:[~2007-10-20 5:44 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-10-20 5:44 reserve mem for REQ_TYPE_BLOCK_PC reqs and convert sg to block/bio helpers michaelc
2007-10-20 5:44 ` [PATCH 01/10] use separate bioset for REQ_TYPE_BLOCK_PC michaelc
2007-10-20 5:44 ` [PATCH 02/10] rm block device arg from bio map user functions michaelc
2007-10-20 5:44 ` [PATCH 03/10] Extend bio_sets to pool pages for bios in sets michaelc
2007-10-20 5:44 ` [PATCH 04/10] convert blk_rq_map helpers to use bioset's page pool helper michaelc
2007-10-20 5:44 ` [PATCH 05/10] have block/scsi_ioctl use GFP_NOIO michaelc
2007-10-20 5:44 ` [PATCH 06/10] use GFP_NOIO in dm rdac michaelc
2007-10-20 5:44 ` [PATCH 07/10] fix blk_rq_map_user_iov bounce code michaelc
2007-10-20 5:44 ` [PATCH 08/10] split bioset_add_pages for sg mmap use michaelc
2007-10-20 5:44 ` [PATCH 09/10] Add REQ_TYPE_BLOCK_PC mmap helpers michaelc
2007-10-20 5:44 ` [PATCH 10/10] convert sg.c to blk/bio helpers michaelc