* [PATCH 0/5] convert sg to use the block layer
From: FUJITA Tomonori @ 2008-08-26 2:10 UTC (permalink / raw)
To: linux-scsi; +Cc: jens.axboe, dougg, michaelc, James.Bottomley, fujita.tomonori
This patchset converts sg to use the block layer functions. That is,
sg no longer uses scsi_execute_async(). This is part of the overdue
task to remove scsi_req_map_sg.
I tested this patchset with sg v3 and the old interface (struct
sg_header) via SG_IO (v3) and the vfs API.
Doug,
1. I don't demote the sg driver's GFP_ATOMIC to GFP_KERNEL. sg always
uses GFP_ATOMIC as before.
2. I don't remove GFP_DMA allocation. As before, sg allocates reserved
pages with GFP_DMA (sfp->low_dma case).
3. I keep the reserved buffer per struct sg_fd as before.
4. I use high-order page allocation for the reserved buffer as
before, so sg still works well with HBAs that limit the number of sg
entries (a sketch of the strategy follows this list).
5. I think that you were concerned about the overhead of the block
layer functions. But if you look at scsi_execute_async(), which sg
uses now, you can see that it uses the block layer functions
internally. So the current sg already incurs that overhead (if such
overhead exists).
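
To make point 4 concrete, here is a minimal sketch of the high-order
allocation strategy (a hypothetical helper for illustration only, not
code from these patches): the reserved buffer is built out of
2^order-page blocks, and the order is lowered only when allocation
fails, so the number of scatter-gather entries stays small for HBAs
with a tight sg_tablesize.

static int build_reserve_pages(struct page **pages, int max_elems,
                               int bufflen, gfp_t gfp_mask, int *page_order)
{
        int order = 0, k, rem;

        /* smallest order that covers bufflen with at most max_elems chunks */
        while ((max_elems << (PAGE_SHIFT + order)) < bufflen)
                order++;
retry:
        for (k = 0, rem = bufflen; rem > 0;
             k++, rem -= 1 << (PAGE_SHIFT + order)) {
                if (k == max_elems)
                        goto unwind;
                pages[k] = alloc_pages(gfp_mask | __GFP_COMP | __GFP_NOWARN,
                                       order);
                if (!pages[k])
                        goto unwind;
        }
        *page_order = order;
        return k;               /* number of sg entries actually used */
unwind:
        while (--k >= 0)
                __free_pages(pages[k], order);
        if (--order >= 0)
                goto retry;     /* halve the chunk size and try again */
        return -ENOMEM;
}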
Jens,
I keep the block API changes to a minimum (I might need more changes
for st/osst but I'd like to progress step by step). I did only two
things.
1. I add a gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov.
2. I introduce struct rq_map_data to hold pages. sg puts its
pre-allocated pages into it and passes it to bio_copy_user_iov(). The
current users of bio_copy_user_iov simply pass NULL. blk_rq_map_user
and blk_rq_map_user_iov take a pointer to struct rq_map_data and in
the end bio_copy_user_iov gets it (see the sketch after this list).
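
Here is a sketch of how a caller would use the two changes together
(the helper name is made up; the signatures are the ones patches 1 and
2 introduce, see the diffs below):

static int map_with_reserved_pages(struct request_queue *q,
                                   struct request *rq,
                                   struct page **pages, int nr_entries,
                                   int page_order, void __user *ubuf,
                                   unsigned long len)
{
        struct rq_map_data map_data = {
                .pages          = pages,        /* pre-allocated by the driver */
                .page_order     = page_order,   /* each entry is 2^order pages */
                .nr_entries     = nr_entries,
        };

        /*
         * With a non-NULL map_data, bio_copy_user_iov() bounces through
         * these pages instead of calling alloc_page(); GFP_ATOMIC here
         * mirrors what sg passes.
         */
        return blk_rq_map_user(q, rq, &map_data, ubuf, len, GFP_ATOMIC);
}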
This patchset is against the for-linus branch in Jens' tree plus the
two patches for 2.6.27:
http://marc.info/?l=linux-kernel&m=121964251911717&w=2
http://marc.info/?l=linux-kernel&m=121964241911611&w=2
After Jens rebases the for-2.6.28 branch, I'll update this patchset
too.
This patchset is also available at:
git://git.kernel.org/pub/scm/linux/kernel/git/tomo/linux-2.6-misc.git sg-block

* [PATCH 1/5] block: add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov
From: FUJITA Tomonori @ 2008-08-26 2:10 UTC (permalink / raw)
To: linux-scsi; +Cc: jens.axboe, dougg, michaelc, James.Bottomley, fujita.tomonori

Currently, blk_rq_map_user and blk_rq_map_user_iov always do
GFP_KERNEL allocation.

This adds a gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov
so sg can use it (sg always does GFP_ATOMIC allocation).

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Douglas Gilbert <dougg@torque.net>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
---
 block/blk-map.c             |   20 ++++++++++++--------
 block/bsg.c                 |    5 +++--
 block/scsi_ioctl.c          |    5 +++--
 drivers/cdrom/cdrom.c       |    2 +-
 drivers/scsi/scsi_tgt_lib.c |    2 +-
 fs/bio.c                    |   27 ++++++++++++++++-----------
 include/linux/bio.h         |    9 +++++----
 include/linux/blkdev.h      |    5 +++--
 8 files changed, 44 insertions(+), 31 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index af37e4a..c363b45 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -41,7 +41,8 @@ static int __blk_rq_unmap_user(struct bio *bio)
 }
 
 static int __blk_rq_map_user(struct request_queue *q, struct request *rq,
-                             void __user *ubuf, unsigned int len)
+                             void __user *ubuf, unsigned int len,
+                             gfp_t gfp_mask)
 {
         unsigned long uaddr;
         unsigned int alignment;
@@ -57,9 +58,9 @@ static int __blk_rq_map_user(struct request_queue *q, struct request *rq,
         uaddr = (unsigned long) ubuf;
         alignment = queue_dma_alignment(q) | q->dma_pad_mask;
         if (!(uaddr & alignment) && !(len & alignment))
-                bio = bio_map_user(q, NULL, uaddr, len, reading);
+                bio = bio_map_user(q, NULL, uaddr, len, reading, gfp_mask);
         else
-                bio = bio_copy_user(q, uaddr, len, reading);
+                bio = bio_copy_user(q, uaddr, len, reading, gfp_mask);
 
         if (IS_ERR(bio))
                 return PTR_ERR(bio);
@@ -90,6 +91,7 @@ static int __blk_rq_map_user(struct request_queue *q, struct request *rq,
  * @rq:         request structure to fill
  * @ubuf:       the user buffer
  * @len:        length of user data
+ * @gfp_mask:   memory allocation flags
  *
  * Description:
  *    Data will be mapped directly for zero copy io, if possible. Otherwise
@@ -105,7 +107,7 @@ static int __blk_rq_map_user(struct request_queue *q, struct request *rq,
  *    unmapping.
  */
 int blk_rq_map_user(struct request_queue *q, struct request *rq,
-                    void __user *ubuf, unsigned long len)
+                    void __user *ubuf, unsigned long len, gfp_t gfp_mask)
 {
         unsigned long bytes_read = 0;
         struct bio *bio = NULL;
@@ -132,7 +134,7 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq,
                 if (end - start > BIO_MAX_PAGES)
                         map_len -= PAGE_SIZE;
 
-                ret = __blk_rq_map_user(q, rq, ubuf, map_len);
+                ret = __blk_rq_map_user(q, rq, ubuf, map_len, gfp_mask);
                 if (ret < 0)
                         goto unmap_rq;
                 if (!bio)
@@ -160,6 +162,7 @@ EXPORT_SYMBOL(blk_rq_map_user);
  * @iov:        pointer to the iovec
  * @iov_count:  number of elements in the iovec
  * @len:        I/O byte count
+ * @gfp_mask:   memory allocation flags
  *
  * Description:
  *    Data will be mapped directly for zero copy io, if possible. Otherwise
@@ -175,7 +178,8 @@ EXPORT_SYMBOL(blk_rq_map_user);
  *    unmapping.
  */
 int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
-                        struct sg_iovec *iov, int iov_count, unsigned int len)
+                        struct sg_iovec *iov, int iov_count, unsigned int len,
+                        gfp_t gfp_mask)
 {
         struct bio *bio;
         int i, read = rq_data_dir(rq) == READ;
@@ -194,9 +198,9 @@ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
         }
 
         if (unaligned || (q->dma_pad_mask & len))
-                bio = bio_copy_user_iov(q, iov, iov_count, read);
+                bio = bio_copy_user_iov(q, iov, iov_count, read, gfp_mask);
         else
-                bio = bio_map_user_iov(q, NULL, iov, iov_count, read);
+                bio = bio_map_user_iov(q, NULL, iov, iov_count, read, gfp_mask);
 
         if (IS_ERR(bio))
                 return PTR_ERR(bio);
diff --git a/block/bsg.c b/block/bsg.c
index 0aae8d7..e7a142e 100644
--- a/block/bsg.c
+++ b/block/bsg.c
@@ -283,7 +283,8 @@ bsg_map_hdr(struct bsg_device *bd, struct sg_io_v4 *hdr, int has_write_perm)
                 next_rq->cmd_type = rq->cmd_type;
 
                 dxferp = (void*)(unsigned long)hdr->din_xferp;
-                ret = blk_rq_map_user(q, next_rq, dxferp, hdr->din_xfer_len);
+                ret = blk_rq_map_user(q, next_rq, dxferp, hdr->din_xfer_len,
+                                      GFP_KERNEL);
                 if (ret)
                         goto out;
         }
@@ -298,7 +299,7 @@ bsg_map_hdr(struct bsg_device *bd, struct sg_io_v4 *hdr, int has_write_perm)
                 dxfer_len = 0;
 
         if (dxfer_len) {
-                ret = blk_rq_map_user(q, rq, dxferp, dxfer_len);
+                ret = blk_rq_map_user(q, rq, dxferp, dxfer_len, GFP_KERNEL);
                 if (ret)
                         goto out;
         }
diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index 3aab80a..f49d6a1 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -315,10 +315,11 @@ static int sg_io(struct file *file, struct request_queue *q,
                 }
 
                 ret = blk_rq_map_user_iov(q, rq, iov, hdr->iovec_count,
-                                          hdr->dxfer_len);
+                                          hdr->dxfer_len, GFP_KERNEL);
                 kfree(iov);
         } else if (hdr->dxfer_len)
-                ret = blk_rq_map_user(q, rq, hdr->dxferp, hdr->dxfer_len);
+                ret = blk_rq_map_user(q, rq, hdr->dxferp, hdr->dxfer_len,
+                                      GFP_KERNEL);
 
         if (ret)
                 goto out;
diff --git a/drivers/cdrom/cdrom.c b/drivers/cdrom/cdrom.c
index 74031de..e861d24 100644
--- a/drivers/cdrom/cdrom.c
+++ b/drivers/cdrom/cdrom.c
@@ -2097,7 +2097,7 @@ static int cdrom_read_cdda_bpc(struct cdrom_device_info *cdi, __u8 __user *ubuf,
 
                 len = nr * CD_FRAMESIZE_RAW;
 
-                ret = blk_rq_map_user(q, rq, ubuf, len);
+                ret = blk_rq_map_user(q, rq, ubuf, len, GFP_KERNEL);
                 if (ret)
                         break;
diff --git a/drivers/scsi/scsi_tgt_lib.c b/drivers/scsi/scsi_tgt_lib.c
index 257e097..2a4fd82 100644
--- a/drivers/scsi/scsi_tgt_lib.c
+++ b/drivers/scsi/scsi_tgt_lib.c
@@ -362,7 +362,7 @@ static int scsi_map_user_pages(struct scsi_tgt_cmd *tcmd, struct scsi_cmnd *cmd,
         int err;
 
         dprintk("%lx %u\n", uaddr, len);
-        err = blk_rq_map_user(q, rq, (void *)uaddr, len);
+        err = blk_rq_map_user(q, rq, (void *)uaddr, len, GFP_KERNEL);
         if (err) {
                 /*
                  * TODO: need to fixup sg_tablesize, max_segment_size,
diff --git a/fs/bio.c b/fs/bio.c
index 3cba7ae..4da5439 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -568,13 +568,14 @@ int bio_uncopy_user(struct bio *bio)
  *      @iov:   the iovec.
  *      @iov_count: number of elements in the iovec
  *      @write_to_vm: bool indicating writing to pages or not
+ *      @gfp_mask: memory allocation flags
  *
  *      Prepares and returns a bio for indirect user io, bouncing data
  *      to/from kernel pages as necessary. Must be paired with
  *      call bio_uncopy_user() on io completion.
  */
 struct bio *bio_copy_user_iov(struct request_queue *q, struct sg_iovec *iov,
-                              int iov_count, int write_to_vm)
+                              int iov_count, int write_to_vm, gfp_t gfp_mask)
 {
         struct bio_map_data *bmd;
         struct bio_vec *bvec;
@@ -615,7 +616,7 @@ struct bio *bio_copy_user_iov(struct request_queue *q, struct sg_iovec *iov,
                 if (bytes > len)
                         bytes = len;
 
-                page = alloc_page(q->bounce_gfp | GFP_KERNEL);
+                page = alloc_page(q->bounce_gfp | gfp_mask);
                 if (!page) {
                         ret = -ENOMEM;
                         break;
@@ -657,26 +658,27 @@ out_bmd:
  *      @uaddr: start of user address
  *      @len: length in bytes
  *      @write_to_vm: bool indicating writing to pages or not
+ *      @gfp_mask: memory allocation flags
  *
  *      Prepares and returns a bio for indirect user io, bouncing data
  *      to/from kernel pages as necessary. Must be paired with
  *      call bio_uncopy_user() on io completion.
  */
 struct bio *bio_copy_user(struct request_queue *q, unsigned long uaddr,
-                          unsigned int len, int write_to_vm)
+                          unsigned int len, int write_to_vm, gfp_t gfp_mask)
 {
         struct sg_iovec iov;
 
         iov.iov_base = (void __user *)uaddr;
         iov.iov_len = len;
 
-        return bio_copy_user_iov(q, &iov, 1, write_to_vm);
+        return bio_copy_user_iov(q, &iov, 1, write_to_vm, gfp_mask);
 }
 
 static struct bio *__bio_map_user_iov(struct request_queue *q,
                                       struct block_device *bdev,
                                       struct sg_iovec *iov, int iov_count,
-                                      int write_to_vm)
+                                      int write_to_vm, gfp_t gfp_mask)
 {
         int i, j;
         int nr_pages = 0;
@@ -702,12 +704,12 @@ static struct bio *__bio_map_user_iov(struct request_queue *q,
         if (!nr_pages)
                 return ERR_PTR(-EINVAL);
 
-        bio = bio_alloc(GFP_KERNEL, nr_pages);
+        bio = bio_alloc(gfp_mask, nr_pages);
         if (!bio)
                 return ERR_PTR(-ENOMEM);
 
         ret = -ENOMEM;
-        pages = kcalloc(nr_pages, sizeof(struct page *), GFP_KERNEL);
+        pages = kcalloc(nr_pages, sizeof(struct page *), gfp_mask);
         if (!pages)
                 goto out;
 
@@ -786,19 +788,21 @@ static struct bio *__bio_map_user_iov(struct request_queue *q,
  *      @uaddr: start of user address
  *      @len: length in bytes
  *      @write_to_vm: bool indicating writing to pages or not
+ *      @gfp_mask: memory allocation flags
  *
  *      Map the user space address into a bio suitable for io to a block
  *      device. Returns an error pointer in case of error.
  */
 struct bio *bio_map_user(struct request_queue *q, struct block_device *bdev,
-                         unsigned long uaddr, unsigned int len, int write_to_vm)
+                         unsigned long uaddr, unsigned int len, int write_to_vm,
+                         gfp_t gfp_mask)
 {
         struct sg_iovec iov;
 
         iov.iov_base = (void __user *)uaddr;
         iov.iov_len = len;
 
-        return bio_map_user_iov(q, bdev, &iov, 1, write_to_vm);
+        return bio_map_user_iov(q, bdev, &iov, 1, write_to_vm, gfp_mask);
 }
 
 /**
@@ -808,17 +812,18 @@ struct bio *bio_map_user(struct request_queue *q, struct block_device *bdev,
  *      @iov:   the iovec.
  *      @iov_count: number of elements in the iovec
  *      @write_to_vm: bool indicating writing to pages or not
+ *      @gfp_mask: memory allocation flags
  *
  *      Map the user space address into a bio suitable for io to a block
  *      device. Returns an error pointer in case of error.
  */
 struct bio *bio_map_user_iov(struct request_queue *q, struct block_device *bdev,
                              struct sg_iovec *iov, int iov_count,
-                             int write_to_vm)
+                             int write_to_vm, gfp_t gfp_mask)
 {
         struct bio *bio;
 
-        bio = __bio_map_user_iov(q, bdev, iov, iov_count, write_to_vm);
+        bio = __bio_map_user_iov(q, bdev, iov, iov_count, write_to_vm, gfp_mask);
 
         if (IS_ERR(bio))
                 return bio;
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 0933a14..f4820ca 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -347,11 +347,11 @@ extern int bio_add_pc_page(struct request_queue *, struct bio *, struct page *,
                            unsigned int, unsigned int);
 extern int bio_get_nr_vecs(struct block_device *);
 extern struct bio *bio_map_user(struct request_queue *, struct block_device *,
-                                unsigned long, unsigned int, int);
+                                unsigned long, unsigned int, int, gfp_t);
 struct sg_iovec;
 extern struct bio *bio_map_user_iov(struct request_queue *,
                                     struct block_device *,
-                                    struct sg_iovec *, int, int);
+                                    struct sg_iovec *, int, int, gfp_t);
 extern void bio_unmap_user(struct bio *);
 extern struct bio *bio_map_kern(struct request_queue *, void *, unsigned int,
                                 gfp_t);
@@ -359,9 +359,10 @@ extern struct bio *bio_copy_kern(struct request_queue *, void *, unsigned int,
                                  gfp_t, int);
 extern void bio_set_pages_dirty(struct bio *bio);
 extern void bio_check_pages_dirty(struct bio *bio);
-extern struct bio *bio_copy_user(struct request_queue *, unsigned long, unsigned int, int);
+extern struct bio *bio_copy_user(struct request_queue *, unsigned long,
+                                 unsigned int, int, gfp_t);
 extern struct bio *bio_copy_user_iov(struct request_queue *, struct sg_iovec *,
-                                     int, int);
+                                     int, int, gfp_t);
 extern int bio_uncopy_user(struct bio *);
 void zero_fill_bio(struct bio *bio);
 extern struct bio_vec *bvec_alloc_bs(gfp_t, int, unsigned long *, struct bio_set *);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index ab247d5..e4cb266 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -705,11 +705,12 @@ extern void __blk_stop_queue(struct request_queue *q);
 extern void __blk_run_queue(struct request_queue *);
 extern void blk_run_queue(struct request_queue *);
 extern void blk_start_queueing(struct request_queue *);
-extern int blk_rq_map_user(struct request_queue *, struct request *, void __user *, unsigned long);
+extern int blk_rq_map_user(struct request_queue *, struct request *,
+                           void __user *, unsigned long, gfp_t);
 extern int blk_rq_unmap_user(struct bio *);
 extern int blk_rq_map_kern(struct request_queue *, struct request *, void *, unsigned int, gfp_t);
 extern int blk_rq_map_user_iov(struct request_queue *, struct request *,
-                               struct sg_iovec *, int, unsigned int);
+                               struct sg_iovec *, int, unsigned int, gfp_t);
 extern int blk_execute_rq(struct request_queue *, struct gendisk *,
                           struct request *, int);
 extern void blk_execute_rq_nowait(struct request_queue *, struct gendisk *,
-- 
1.5.5.GIT
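
As a usage note for this patch (a sketch with a made-up helper name,
reflecting the tree after patch 1 only; patch 2 changes the signature
again): the new argument simply lets each caller pick the allocation
context, so the existing in-tree callers keep GFP_KERNEL while sg can
pass GFP_ATOMIC.

static int sketch_map_buf(struct request_queue *q, struct request *rq,
                          void __user *ubuf, unsigned long len,
                          bool can_sleep)
{
        /* blk_rq_map_user() per patch 1: the gfp_mask argument is new */
        return blk_rq_map_user(q, rq, ubuf, len,
                               can_sleep ? GFP_KERNEL : GFP_ATOMIC);
}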

* [PATCH 2/5] block: introduce struct rq_map_data to use reserved pages
From: FUJITA Tomonori @ 2008-08-26 2:10 UTC (permalink / raw)
To: linux-scsi; +Cc: jens.axboe, dougg, michaelc, James.Bottomley, fujita.tomonori

This patch introduces struct rq_map_data to enable bio_copy_user_iov()
to use reserved pages.

Currently, bio_copy_user_iov allocates bounce pages but
drivers/scsi/sg.c wants to allocate pages by itself and use them.
struct rq_map_data can be used to pass allocated pages to
bio_copy_user_iov.

The current users of bio_copy_user_iov simply pass NULL (they don't
want to use pre-allocated pages).

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Douglas Gilbert <dougg@torque.net>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
---
 block/blk-map.c             |   26 ++++++++++++-------
 block/bsg.c                 |    7 +++--
 block/scsi_ioctl.c          |    4 +-
 drivers/cdrom/cdrom.c       |    2 +-
 drivers/scsi/scsi_tgt_lib.c |    2 +-
 fs/bio.c                    |   58 ++++++++++++++++++++++++++++++------------
 include/linux/bio.h         |    8 +++--
 include/linux/blkdev.h      |   12 +++++++-
 8 files changed, 80 insertions(+), 39 deletions(-)

diff --git a/block/blk-map.c b/block/blk-map.c
index c363b45..3f6ae02 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -41,8 +41,8 @@ static int __blk_rq_unmap_user(struct bio *bio)
 }
 
 static int __blk_rq_map_user(struct request_queue *q, struct request *rq,
-                             void __user *ubuf, unsigned int len,
-                             gfp_t gfp_mask)
+                             struct rq_map_data *map_data, void __user *ubuf,
+                             unsigned int len, gfp_t gfp_mask)
 {
         unsigned long uaddr;
         unsigned int alignment;
@@ -57,10 +57,10 @@ static int __blk_rq_map_user(struct request_queue *q, struct request *rq,
          */
         uaddr = (unsigned long) ubuf;
         alignment = queue_dma_alignment(q) | q->dma_pad_mask;
-        if (!(uaddr & alignment) && !(len & alignment))
+        if (!(uaddr & alignment) && !(len & alignment) && !map_data)
                 bio = bio_map_user(q, NULL, uaddr, len, reading, gfp_mask);
         else
-                bio = bio_copy_user(q, uaddr, len, reading, gfp_mask);
+                bio = bio_copy_user(q, map_data, uaddr, len, reading, gfp_mask);
 
         if (IS_ERR(bio))
                 return PTR_ERR(bio);
@@ -89,6 +89,7 @@ static int __blk_rq_map_user(struct request_queue *q, struct request *rq,
  * blk_rq_map_user - map user data to a request, for REQ_BLOCK_PC usage
  * @q:          request queue where request should be inserted
  * @rq:         request structure to fill
+ * @map_data:   pointer to the rq_map_data holding pages (if necessary)
  * @ubuf:       the user buffer
  * @len:        length of user data
  * @gfp_mask:   memory allocation flags
@@ -107,7 +108,8 @@ static int __blk_rq_map_user(struct request_queue *q, struct request *rq,
  *    unmapping.
  */
 int blk_rq_map_user(struct request_queue *q, struct request *rq,
-                    void __user *ubuf, unsigned long len, gfp_t gfp_mask)
+                    struct rq_map_data *map_data, void __user *ubuf,
+                    unsigned long len, gfp_t gfp_mask)
 {
         unsigned long bytes_read = 0;
         struct bio *bio = NULL;
@@ -134,7 +136,8 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq,
                 if (end - start > BIO_MAX_PAGES)
                         map_len -= PAGE_SIZE;
 
-                ret = __blk_rq_map_user(q, rq, ubuf, map_len, gfp_mask);
+                ret = __blk_rq_map_user(q, rq, map_data, ubuf, map_len,
+                                        gfp_mask);
                 if (ret < 0)
                         goto unmap_rq;
                 if (!bio)
@@ -159,6 +162,7 @@ EXPORT_SYMBOL(blk_rq_map_user);
  * blk_rq_map_user_iov - map user data to a request, for REQ_BLOCK_PC usage
  * @q:          request queue where request should be inserted
  * @rq:         request to map data to
+ * @map_data:   pointer to the rq_map_data holding pages (if necessary)
  * @iov:        pointer to the iovec
  * @iov_count:  number of elements in the iovec
  * @len:        I/O byte count
@@ -178,8 +182,8 @@ EXPORT_SYMBOL(blk_rq_map_user);
  *    unmapping.
  */
 int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
-                        struct sg_iovec *iov, int iov_count, unsigned int len,
-                        gfp_t gfp_mask)
+                        struct rq_map_data *map_data, struct sg_iovec *iov,
+                        int iov_count, unsigned int len, gfp_t gfp_mask)
 {
         struct bio *bio;
         int i, read = rq_data_dir(rq) == READ;
@@ -197,8 +201,9 @@ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
                 }
         }
 
-        if (unaligned || (q->dma_pad_mask & len))
-                bio = bio_copy_user_iov(q, iov, iov_count, read, gfp_mask);
+        if (unaligned || (q->dma_pad_mask & len) || map_data)
+                bio = bio_copy_user_iov(q, map_data, iov, iov_count, read,
+                                        gfp_mask);
         else
                 bio = bio_map_user_iov(q, NULL, iov, iov_count, read, gfp_mask);
 
@@ -220,6 +225,7 @@ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
         rq->buffer = rq->data = NULL;
         return 0;
 }
+EXPORT_SYMBOL(blk_rq_map_user_iov);
 
 /**
  * blk_rq_unmap_user - unmap a request with user data
diff --git a/block/bsg.c b/block/bsg.c
index e7a142e..56cb343 100644
--- a/block/bsg.c
+++ b/block/bsg.c
@@ -283,8 +283,8 @@ bsg_map_hdr(struct bsg_device *bd, struct sg_io_v4 *hdr, int has_write_perm)
                 next_rq->cmd_type = rq->cmd_type;
 
                 dxferp = (void*)(unsigned long)hdr->din_xferp;
-                ret = blk_rq_map_user(q, next_rq, dxferp, hdr->din_xfer_len,
-                                      GFP_KERNEL);
+                ret = blk_rq_map_user(q, next_rq, NULL, dxferp,
+                                      hdr->din_xfer_len, GFP_KERNEL);
                 if (ret)
                         goto out;
         }
@@ -299,7 +299,8 @@ bsg_map_hdr(struct bsg_device *bd, struct sg_io_v4 *hdr, int has_write_perm)
                 dxfer_len = 0;
 
         if (dxfer_len) {
-                ret = blk_rq_map_user(q, rq, dxferp, dxfer_len, GFP_KERNEL);
+                ret = blk_rq_map_user(q, rq, NULL, dxferp, dxfer_len,
+                                      GFP_KERNEL);
                 if (ret)
                         goto out;
         }
diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index f49d6a1..c34272a 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -314,11 +314,11 @@ static int sg_io(struct file *file, struct request_queue *q,
                         goto out;
                 }
 
-                ret = blk_rq_map_user_iov(q, rq, iov, hdr->iovec_count,
+                ret = blk_rq_map_user_iov(q, rq, NULL, iov, hdr->iovec_count,
                                           hdr->dxfer_len, GFP_KERNEL);
                 kfree(iov);
         } else if (hdr->dxfer_len)
-                ret = blk_rq_map_user(q, rq, hdr->dxferp, hdr->dxfer_len,
+                ret = blk_rq_map_user(q, rq, NULL, hdr->dxferp, hdr->dxfer_len,
                                       GFP_KERNEL);
 
         if (ret)
diff --git a/drivers/cdrom/cdrom.c b/drivers/cdrom/cdrom.c
index e861d24..d47f2f8 100644
--- a/drivers/cdrom/cdrom.c
+++ b/drivers/cdrom/cdrom.c
@@ -2097,7 +2097,7 @@ static int cdrom_read_cdda_bpc(struct cdrom_device_info *cdi, __u8 __user *ubuf,
 
                 len = nr * CD_FRAMESIZE_RAW;
 
-                ret = blk_rq_map_user(q, rq, ubuf, len, GFP_KERNEL);
+                ret = blk_rq_map_user(q, rq, NULL, ubuf, len, GFP_KERNEL);
                 if (ret)
                         break;
diff --git a/drivers/scsi/scsi_tgt_lib.c b/drivers/scsi/scsi_tgt_lib.c
index 2a4fd82..3117bb1 100644
--- a/drivers/scsi/scsi_tgt_lib.c
+++ b/drivers/scsi/scsi_tgt_lib.c
@@ -362,7 +362,7 @@ static int scsi_map_user_pages(struct scsi_tgt_cmd *tcmd, struct scsi_cmnd *cmd,
         int err;
 
         dprintk("%lx %u\n", uaddr, len);
-        err = blk_rq_map_user(q, rq, (void *)uaddr, len, GFP_KERNEL);
+        err = blk_rq_map_user(q, rq, NULL, (void *)uaddr, len, GFP_KERNEL);
         if (err) {
                 /*
                  * TODO: need to fixup sg_tablesize, max_segment_size,
diff --git a/fs/bio.c b/fs/bio.c
index 4da5439..6dc2045 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -449,16 +449,19 @@ int bio_add_page(struct bio *bio, struct page *page, unsigned int len,
 
 struct bio_map_data {
         struct bio_vec *iovecs;
-        int nr_sgvecs;
         struct sg_iovec *sgvecs;
+        int nr_sgvecs;
+        int is_our_pages;
 };
 
 static void bio_set_map_data(struct bio_map_data *bmd, struct bio *bio,
-                             struct sg_iovec *iov, int iov_count)
+                             struct sg_iovec *iov, int iov_count,
+                             int is_our_pages)
 {
         memcpy(bmd->iovecs, bio->bi_io_vec, sizeof(struct bio_vec) * bio->bi_vcnt);
         memcpy(bmd->sgvecs, iov, sizeof(struct sg_iovec) * iov_count);
         bmd->nr_sgvecs = iov_count;
+        bmd->is_our_pages = is_our_pages;
         bio->bi_private = bmd;
 }
 
@@ -493,7 +496,8 @@ static struct bio_map_data *bio_alloc_map_data(int nr_segs, int iov_count,
 }
 
 static int __bio_copy_iov(struct bio *bio, struct bio_vec *iovecs,
-                          struct sg_iovec *iov, int iov_count, int uncopy)
+                          struct sg_iovec *iov, int iov_count, int uncopy,
+                          int do_free_page)
 {
         int ret = 0, i;
         struct bio_vec *bvec;
@@ -536,7 +540,7 @@ static int __bio_copy_iov(struct bio *bio, struct bio_vec *iovecs,
                         }
                 }
 
-                if (uncopy)
+                if (do_free_page)
                         __free_page(bvec->bv_page);
         }
 
@@ -555,7 +559,8 @@ int bio_uncopy_user(struct bio *bio)
         struct bio_map_data *bmd = bio->bi_private;
         int ret;
 
-        ret = __bio_copy_iov(bio, bmd->iovecs, bmd->sgvecs, bmd->nr_sgvecs, 1);
+        ret = __bio_copy_iov(bio, bmd->iovecs, bmd->sgvecs, bmd->nr_sgvecs, 1,
+                             bmd->is_our_pages);
 
         bio_free_map_data(bmd);
         bio_put(bio);
@@ -565,6 +570,7 @@ int bio_uncopy_user(struct bio *bio)
 /**
  *      bio_copy_user_iov       -       copy user data to bio
  *      @q: destination block queue
+ *      @map_data: pointer to the rq_map_data holding pages (if necessary)
  *      @iov:   the iovec.
  *      @iov_count: number of elements in the iovec
  *      @write_to_vm: bool indicating writing to pages or not
@@ -574,8 +580,10 @@ int bio_uncopy_user(struct bio *bio)
  *      to/from kernel pages as necessary. Must be paired with
  *      call bio_uncopy_user() on io completion.
  */
-struct bio *bio_copy_user_iov(struct request_queue *q, struct sg_iovec *iov,
-                              int iov_count, int write_to_vm, gfp_t gfp_mask)
+struct bio *bio_copy_user_iov(struct request_queue *q,
+                              struct rq_map_data *map_data,
+                              struct sg_iovec *iov, int iov_count,
+                              int write_to_vm, gfp_t gfp_mask)
 {
         struct bio_map_data *bmd;
         struct bio_vec *bvec;
@@ -610,13 +618,26 @@ struct bio *bio_copy_user_iov(struct request_queue *q, struct sg_iovec *iov,
         bio->bi_rw |= (!write_to_vm << BIO_RW);
 
         ret = 0;
+        i = 0;
         while (len) {
-                unsigned int bytes = PAGE_SIZE;
+                unsigned int bytes;
+
+                if (map_data)
+                        bytes = 1U << (PAGE_SHIFT + map_data->page_order);
+                else
+                        bytes = PAGE_SIZE;
 
                 if (bytes > len)
                         bytes = len;
 
-                page = alloc_page(q->bounce_gfp | gfp_mask);
+                if (map_data) {
+                        if (i == map_data->nr_entries) {
+                                ret = -ENOMEM;
+                                break;
+                        }
+                        page = map_data->pages[i++];
+                } else
+                        page = alloc_page(q->bounce_gfp | gfp_mask);
                 if (!page) {
                         ret = -ENOMEM;
                         break;
@@ -635,16 +656,17 @@ struct bio *bio_copy_user_iov(struct request_queue *q, struct sg_iovec *iov,
          * success
          */
         if (!write_to_vm) {
-                ret = __bio_copy_iov(bio, bio->bi_io_vec, iov, iov_count, 0);
+                ret = __bio_copy_iov(bio, bio->bi_io_vec, iov, iov_count, 0, 0);
                 if (ret)
                         goto cleanup;
         }
 
-        bio_set_map_data(bmd, bio, iov, iov_count);
+        bio_set_map_data(bmd, bio, iov, iov_count, map_data ? 0 : 1);
         return bio;
 cleanup:
-        bio_for_each_segment(bvec, bio, i)
-                __free_page(bvec->bv_page);
+        if (!map_data)
+                bio_for_each_segment(bvec, bio, i)
+                        __free_page(bvec->bv_page);
 
         bio_put(bio);
 out_bmd:
@@ -655,6 +677,7 @@ out_bmd:
 /**
  *      bio_copy_user   -       copy user data to bio
  *      @q: destination block queue
+ *      @map_data: pointer to the rq_map_data holding pages (if necessary)
  *      @uaddr: start of user address
  *      @len: length in bytes
  *      @write_to_vm: bool indicating writing to pages or not
@@ -664,15 +687,16 @@ out_bmd:
  *      to/from kernel pages as necessary. Must be paired with
  *      call bio_uncopy_user() on io completion.
  */
-struct bio *bio_copy_user(struct request_queue *q, unsigned long uaddr,
-                          unsigned int len, int write_to_vm, gfp_t gfp_mask)
+struct bio *bio_copy_user(struct request_queue *q, struct rq_map_data *map_data,
+                          unsigned long uaddr, unsigned int len,
+                          int write_to_vm, gfp_t gfp_mask)
 {
         struct sg_iovec iov;
 
         iov.iov_base = (void __user *)uaddr;
         iov.iov_len = len;
 
-        return bio_copy_user_iov(q, &iov, 1, write_to_vm, gfp_mask);
+        return bio_copy_user_iov(q, map_data, &iov, 1, write_to_vm, gfp_mask);
 }
 
 static struct bio *__bio_map_user_iov(struct request_queue *q,
@@ -1038,7 +1062,7 @@ struct bio *bio_copy_kern(struct request_queue *q, void *data, unsigned int len,
         bio->bi_private = bmd;
         bio->bi_end_io = bio_copy_kern_endio;
 
-        bio_set_map_data(bmd, bio, &iov, 1);
+        bio_set_map_data(bmd, bio, &iov, 1, 1);
         return bio;
 cleanup:
         bio_for_each_segment(bvec, bio, i)
diff --git a/include/linux/bio.h b/include/linux/bio.h
index f4820ca..a68c617 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -349,6 +349,7 @@ extern int bio_get_nr_vecs(struct block_device *);
 extern struct bio *bio_map_user(struct request_queue *, struct block_device *,
                                 unsigned long, unsigned int, int, gfp_t);
 struct sg_iovec;
+struct rq_map_data;
 extern struct bio *bio_map_user_iov(struct request_queue *,
                                     struct block_device *,
                                     struct sg_iovec *, int, int, gfp_t);
@@ -359,9 +360,10 @@ extern struct bio *bio_copy_kern(struct request_queue *, void *, unsigned int,
                                  gfp_t, int);
 extern void bio_set_pages_dirty(struct bio *bio);
 extern void bio_check_pages_dirty(struct bio *bio);
-extern struct bio *bio_copy_user(struct request_queue *, unsigned long,
-                                 unsigned int, int, gfp_t);
-extern struct bio *bio_copy_user_iov(struct request_queue *, struct sg_iovec *,
+extern struct bio *bio_copy_user(struct request_queue *, struct rq_map_data *,
+                                 unsigned long, unsigned int, int, gfp_t);
+extern struct bio *bio_copy_user_iov(struct request_queue *,
+                                     struct rq_map_data *, struct sg_iovec *,
                                      int, int, gfp_t);
 extern int bio_uncopy_user(struct bio *);
 void zero_fill_bio(struct bio *bio);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index e4cb266..d5e76d1 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -637,6 +637,12 @@ static inline void blk_queue_bounce(struct request_queue *q, struct bio **bio)
 }
 #endif /* CONFIG_MMU */
 
+struct rq_map_data {
+        struct page **pages;
+        int page_order;
+        int nr_entries;
+};
+
 struct req_iterator {
         int i;
         struct bio *bio;
@@ -706,11 +712,13 @@ extern void __blk_run_queue(struct request_queue *);
 extern void blk_run_queue(struct request_queue *);
 extern void blk_start_queueing(struct request_queue *);
 extern int blk_rq_map_user(struct request_queue *, struct request *,
-                           void __user *, unsigned long, gfp_t);
+                           struct rq_map_data *, void __user *, unsigned long,
+                           gfp_t);
 extern int blk_rq_unmap_user(struct bio *);
 extern int blk_rq_map_kern(struct request_queue *, struct request *, void *, unsigned int, gfp_t);
 extern int blk_rq_map_user_iov(struct request_queue *, struct request *,
-                               struct sg_iovec *, int, unsigned int, gfp_t);
+                               struct rq_map_data *, struct sg_iovec *, int,
+                               unsigned int, gfp_t);
 extern int blk_execute_rq(struct request_queue *, struct gendisk *,
                           struct request *, int);
 extern void blk_execute_rq_nowait(struct request_queue *, struct gendisk *,
-- 
1.5.5.GIT
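
One sizing rule this patch implies, shown as a small sketch (a
hypothetical helper, not part of the patch): bio_copy_user_iov()
consumes one pages[] entry per 2^page_order pages and returns -ENOMEM
once nr_entries is exhausted, so the caller must size the array to
cover the whole transfer.

static bool rq_map_data_covers(const struct rq_map_data *md,
                               unsigned long len)
{
        unsigned long chunk = 1UL << (PAGE_SHIFT + md->page_order);

        /* nr_entries chunks must reach at least len bytes */
        return (unsigned long long)md->nr_entries * chunk >= len;
}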

* [PATCH 3/5] sg: convert the non-data path to use the block layer
From: FUJITA Tomonori @ 2008-08-26 2:10 UTC (permalink / raw)
To: linux-scsi; +Cc: jens.axboe, dougg, michaelc, James.Bottomley, fujita.tomonori

This patch converts the non-data path to use the block layer functions
(blk_get_request, blk_execute_rq_nowait, etc) instead of
scsi_execute_async().

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Douglas Gilbert <dougg@torque.net>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
---
 drivers/scsi/sg.c |   51 +++++++++++++++++++++++++++++++++++++++++++++++----
 1 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 661f9f2..d1b3de9 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -137,6 +137,7 @@ typedef struct sg_request {     /* SG_MAX_QUEUE requests outstanding per file */
         char orphan;            /* 1 -> drop on sight, 0 -> normal */
         char sg_io_owned;       /* 1 -> packet belongs to SG_IO */
         volatile char done;     /* 0->before bh, 1->before read, 2->read */
+        struct request *rq;
 } Sg_request;
 
 typedef struct sg_fd {          /* holds the state of a file descriptor */
@@ -176,7 +177,7 @@ typedef struct sg_device { /* holds the state of each scsi generic device */
 static int sg_fasync(int fd, struct file *filp, int mode);
 /* tasklet or soft irq callback */
 static void sg_cmd_done(void *data, char *sense, int result, int resid);
-static int sg_start_req(Sg_request * srp);
+static int sg_start_req(Sg_request * srp, unsigned char *cmd);
 static void sg_finish_rem_req(Sg_request * srp);
 static int sg_build_indirect(Sg_scatter_hold * schp, Sg_fd * sfp, int buff_size);
 static int sg_build_sgat(Sg_scatter_hold * schp, const Sg_fd * sfp,
@@ -229,6 +230,11 @@ static int sg_allow_access(struct file *filp, unsigned char *cmd)
                                       cmd, filp->f_mode & FMODE_WRITE);
 }
 
+static void sg_rq_end_io(struct request *rq, int uptodate)
+{
+        sg_cmd_done(rq->end_io_data, rq->sense, rq->errors, rq->data_len);
+}
+
 static int
 sg_open(struct inode *inode, struct file *filp)
 {
@@ -732,7 +738,7 @@ sg_common_write(Sg_fd * sfp, Sg_request * srp,
         SCSI_LOG_TIMEOUT(4, printk("sg_common_write:  scsi opcode=0x%02x, cmd_size=%d\n",
                           (int) cmnd[0], (int) hp->cmd_len));
 
-        if ((k = sg_start_req(srp))) {
+        if ((k = sg_start_req(srp, cmnd))) {
                 SCSI_LOG_TIMEOUT(1, printk("sg_common_write: start_req err=%d\n", k));
                 sg_finish_rem_req(srp);
                 return k;       /* probably out of space --> ENOMEM */
@@ -765,6 +771,12 @@ sg_common_write(Sg_fd * sfp, Sg_request * srp,
         hp->duration = jiffies_to_msecs(jiffies);
 /* Now send everything of to mid-level. The next time we hear about this
    packet is when sg_cmd_done() is called (i.e. a callback). */
+        if (srp->rq) {
+                srp->rq->timeout = timeout;
+                blk_execute_rq_nowait(sdp->device->request_queue, sdp->disk,
+                                      srp->rq, 1, sg_rq_end_io);
+                return 0;
+        }
         if (scsi_execute_async(sdp->device, cmnd, hp->cmd_len, data_dir, srp->data.buffer,
                                 hp->dxfer_len, srp->data.k_use_sg, timeout,
                                 SG_DEFAULT_RETRIES, srp, sg_cmd_done,
@@ -1634,8 +1646,33 @@ exit_sg(void)
         idr_destroy(&sg_index_idr);
 }
 
+static int __sg_start_req(struct sg_request *srp, struct sg_io_hdr *hp,
+                          unsigned char *cmd)
+{
+        struct sg_fd *sfp = srp->parentfp;
+        struct request_queue *q = sfp->parentdp->device->request_queue;
+        struct request *rq;
+        int rw = hp->dxfer_direction == SG_DXFER_TO_DEV ? WRITE : READ;
+
+        rq = blk_get_request(q, rw, GFP_ATOMIC);
+        if (!rq)
+                return -ENOMEM;
+
+        memcpy(rq->cmd, cmd, hp->cmd_len);
+
+        rq->cmd_len = hp->cmd_len;
+        rq->cmd_type = REQ_TYPE_BLOCK_PC;
+
+        srp->rq = rq;
+        rq->end_io_data = srp;
+        rq->sense = srp->sense_b;
+        rq->retries = SG_DEFAULT_RETRIES;
+
+        return 0;
+}
+
 static int
-sg_start_req(Sg_request * srp)
+sg_start_req(Sg_request * srp, unsigned char *cmd)
 {
         int res;
         Sg_fd *sfp = srp->parentfp;
@@ -1646,8 +1683,10 @@ sg_start_req(Sg_request * srp)
         Sg_scatter_hold *rsv_schp = &sfp->reserve;
 
         SCSI_LOG_TIMEOUT(4, printk("sg_start_req: dxfer_len=%d\n", dxfer_len));
+
         if ((dxfer_len <= 0) || (dxfer_dir == SG_DXFER_NONE))
-                return 0;
+                return __sg_start_req(srp, hp, cmd);
+
         if (sg_allow_dio && (hp->flags & SG_FLAG_DIRECT_IO) &&
             (dxfer_dir != SG_DXFER_UNKNOWN) && (0 == hp->iovec_count) &&
             (!sfp->parentdp->device->host->unchecked_isa_dma)) {
@@ -1678,6 +1717,10 @@ sg_finish_rem_req(Sg_request * srp)
                 sg_unlink_reserve(sfp, srp);
         else
                 sg_remove_scat(req_schp);
+
+        if (srp->rq)
+                blk_put_request(srp->rq);
+
         sg_remove_request(sfp, srp);
 }
-- 
1.5.5.GIT
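
For reference, the request setup and submission pattern that
__sg_start_req() and sg_common_write() now follow, condensed into one
hedged sketch (hypothetical helper; it assumes the 2.6.27-era block
API this series is written against, with 'sense' being a caller-owned
buffer of at least SCSI_SENSE_BUFFERSIZE bytes):

static int submit_pc_nodata(struct request_queue *q, struct gendisk *disk,
                            unsigned char *cmd, int cmd_len,
                            unsigned char *sense, void *priv,
                            unsigned int timeout, rq_end_io_fn *done)
{
        struct request *rq = blk_get_request(q, READ, GFP_ATOMIC);

        if (!rq)
                return -ENOMEM;

        memcpy(rq->cmd, cmd, cmd_len);
        rq->cmd_len = cmd_len;
        rq->cmd_type = REQ_TYPE_BLOCK_PC;
        rq->sense = sense;              /* filled in on completion */
        rq->retries = 0;
        rq->timeout = timeout;
        rq->end_io_data = priv;         /* handed back via rq->end_io_data */

        blk_execute_rq_nowait(q, disk, rq, 1, done);    /* 1 = at head */
        return 0;
}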

* [PATCH 4/5] sg: convert the direct IO path to use the block layer
From: FUJITA Tomonori @ 2008-08-26 2:10 UTC (permalink / raw)
To: linux-scsi; +Cc: jens.axboe, dougg, michaelc, James.Bottomley, fujita.tomonori

This patch converts the direct IO path (SG_FLAG_DIRECT_IO) to use the
block layer functions (blk_get_request, blk_execute_rq_nowait,
blk_rq_map_user, etc) instead of scsi_execute_async().

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Douglas Gilbert <dougg@torque.net>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
---
 drivers/scsi/sg.c |  173 ++++++++--------------------------------------------
 1 files changed, 27 insertions(+), 146 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index d1b3de9..10a285e 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -138,6 +138,7 @@ typedef struct sg_request {     /* SG_MAX_QUEUE requests outstanding per file */
         char sg_io_owned;       /* 1 -> packet belongs to SG_IO */
         volatile char done;     /* 0->before bh, 1->before read, 2->read */
         struct request *rq;
+        struct bio *bio;
 } Sg_request;
 
 typedef struct sg_fd {          /* holds the state of a file descriptor */
@@ -1679,21 +1680,29 @@ sg_start_req(Sg_request * srp, unsigned char *cmd)
         sg_io_hdr_t *hp = &srp->header;
         int dxfer_len = (int) hp->dxfer_len;
         int dxfer_dir = hp->dxfer_direction;
+        unsigned long uaddr = (unsigned long)hp->dxferp;
         Sg_scatter_hold *req_schp = &srp->data;
         Sg_scatter_hold *rsv_schp = &sfp->reserve;
+        struct request_queue *q = sfp->parentdp->device->request_queue;
+        unsigned long alignment = queue_dma_alignment(q) | q->dma_pad_mask;
 
         SCSI_LOG_TIMEOUT(4, printk("sg_start_req: dxfer_len=%d\n", dxfer_len));
 
         if ((dxfer_len <= 0) || (dxfer_dir == SG_DXFER_NONE))
                 return __sg_start_req(srp, hp, cmd);
 
+#ifdef SG_ALLOW_DIO_CODE
         if (sg_allow_dio && (hp->flags & SG_FLAG_DIRECT_IO) &&
             (dxfer_dir != SG_DXFER_UNKNOWN) && (0 == hp->iovec_count) &&
-            (!sfp->parentdp->device->host->unchecked_isa_dma)) {
-                res = sg_build_direct(srp, sfp, dxfer_len);
-                if (res <= 0)   /* -ve -> error, 0 -> done, 1 -> try indirect */
-                        return res;
+            (!sfp->parentdp->device->host->unchecked_isa_dma) &&
+            !(uaddr & alignment) && !(dxfer_len & alignment)) {
+                res = __sg_start_req(srp, hp, cmd);
+                if (!res)
+                        res = sg_build_direct(srp, sfp, dxfer_len);
+
+                return res;
         }
+#endif
         if ((!sg_res_in_use(sfp)) && (dxfer_len <= rsv_schp->bufflen))
                 sg_link_reserve(sfp, srp, dxfer_len);
         else {
@@ -1718,8 +1727,11 @@ sg_finish_rem_req(Sg_request * srp)
         else
                 sg_remove_scat(req_schp);
 
-        if (srp->rq)
+        if (srp->rq) {
+                if (srp->bio)
+                        blk_rq_unmap_user(srp->bio);
                 blk_put_request(srp->rq);
+        }
 
         sg_remove_request(sfp, srp);
 }
@@ -1746,151 +1758,23 @@ sg_build_sgat(Sg_scatter_hold * schp, const Sg_fd * sfp, int tablesize)
         return tablesize;       /* number of scat_gath elements allocated */
 }
 
-#ifdef SG_ALLOW_DIO_CODE
-/* vvvvvvvv  following code borrowed from st driver's direct IO vvvvvvvvv */
-        /* TODO: hopefully we can use the generic block layer code */
-
-/* Pin down user pages and put them into a scatter gather list. Returns <= 0 if
-   - mapping of all pages not successful
-   (i.e., either completely successful or fails)
-*/
-static int
-st_map_user_pages(struct scatterlist *sgl, const unsigned int max_pages,
-                  unsigned long uaddr, size_t count, int rw)
-{
-        unsigned long end = (uaddr + count + PAGE_SIZE - 1) >> PAGE_SHIFT;
-        unsigned long start = uaddr >> PAGE_SHIFT;
-        const int nr_pages = end - start;
-        int res, i, j;
-        struct page **pages;
-
-        /* User attempted Overflow! */
-        if ((uaddr + count) < uaddr)
-                return -EINVAL;
-
-        /* Too big */
-        if (nr_pages > max_pages)
-                return -ENOMEM;
-
-        /* Hmm? */
-        if (count == 0)
-                return 0;
-
-        if ((pages = kmalloc(max_pages * sizeof(*pages), GFP_ATOMIC)) == NULL)
-                return -ENOMEM;
-
-        /* Try to fault in all of the necessary pages */
-        down_read(&current->mm->mmap_sem);
-        /* rw==READ means read from drive, write into memory area */
-        res = get_user_pages(
-                current,
-                current->mm,
-                uaddr,
-                nr_pages,
-                rw == READ,
-                0, /* don't force */
-                pages,
-                NULL);
-        up_read(&current->mm->mmap_sem);
-
-        /* Errors and no page mapped should return here */
-        if (res < nr_pages)
-                goto out_unmap;
-
-        for (i=0; i < nr_pages; i++) {
-                /* FIXME: flush superflous for rw==READ,
-                 * probably wrong function for rw==WRITE
-                 */
-                flush_dcache_page(pages[i]);
-                /* ?? Is locking needed? I don't think so */
-                /* if (!trylock_page(pages[i]))
-                   goto out_unlock; */
-        }
-
-        sg_set_page(sgl, pages[0], 0, uaddr & ~PAGE_MASK);
-        if (nr_pages > 1) {
-                sgl[0].length = PAGE_SIZE - sgl[0].offset;
-                count -= sgl[0].length;
-                for (i=1; i < nr_pages ; i++)
-                        sg_set_page(&sgl[i], pages[i],
-                                    count < PAGE_SIZE ? count : PAGE_SIZE, 0);
-        }
-        else {
-                sgl[0].length = count;
-        }
-
-        kfree(pages);
-        return nr_pages;
-
- out_unmap:
-        if (res > 0) {
-                for (j=0; j < res; j++)
-                        page_cache_release(pages[j]);
-                res = 0;
-        }
-        kfree(pages);
-        return res;
-}
-
-
-/* And unmap them... */
-static int
-st_unmap_user_pages(struct scatterlist *sgl, const unsigned int nr_pages,
-                    int dirtied)
-{
-        int i;
-
-        for (i=0; i < nr_pages; i++) {
-                struct page *page = sg_page(&sgl[i]);
-
-                if (dirtied)
-                        SetPageDirty(page);
-                /* unlock_page(page); */
-                /* FIXME: cache flush missing for rw==READ
-                 * FIXME: call the correct reference counting function
-                 */
-                page_cache_release(page);
-        }
-
-        return 0;
-}
-
-/* ^^^^^^^^  above code borrowed from st driver's direct IO ^^^^^^^^^ */
-#endif
-
-
 /* Returns: -ve -> error, 0 -> done, 1 -> try indirect */
 static int
 sg_build_direct(Sg_request * srp, Sg_fd * sfp, int dxfer_len)
 {
-#ifdef SG_ALLOW_DIO_CODE
         sg_io_hdr_t *hp = &srp->header;
         Sg_scatter_hold *schp = &srp->data;
-        int sg_tablesize = sfp->parentdp->sg_tablesize;
-        int mx_sc_elems, res;
-        struct scsi_device *sdev = sfp->parentdp->device;
-
-        if (((unsigned long)hp->dxferp &
-             queue_dma_alignment(sdev->request_queue)) != 0)
-                return 1;
+        int res;
+        struct request *rq = srp->rq;
+        struct request_queue *q = sfp->parentdp->device->request_queue;
 
-        mx_sc_elems = sg_build_sgat(schp, sfp, sg_tablesize);
-        if (mx_sc_elems <= 0) {
-                return 1;
-        }
-        res = st_map_user_pages(schp->buffer, mx_sc_elems,
-                                (unsigned long)hp->dxferp, dxfer_len,
-                                (SG_DXFER_TO_DEV == hp->dxfer_direction) ? 1 : 0);
-        if (res <= 0) {
-                sg_remove_scat(schp);
-                return 1;
-        }
-        schp->k_use_sg = res;
+        res = blk_rq_map_user(q, rq, NULL, hp->dxferp, dxfer_len, GFP_ATOMIC);
+        if (res)
+                return res;
+
+        srp->bio = rq->bio;
         schp->dio_in_use = 1;
         hp->info |= SG_INFO_DIRECT_IO;
         return 0;
-#else
-        return 1;
-#endif
 }
 
 static int
@@ -2069,11 +1953,7 @@ sg_remove_scat(Sg_scatter_hold * schp)
         if (schp->buffer && (schp->sglist_len > 0)) {
                 struct scatterlist *sg = schp->buffer;
 
-                if (schp->dio_in_use) {
-#ifdef SG_ALLOW_DIO_CODE
-                        st_unmap_user_pages(sg, schp->k_use_sg, TRUE);
-#endif
-                } else {
+                if (!schp->dio_in_use) {
                         int k;
 
                         for (k = 0; (k < schp->k_use_sg) && sg_page(sg);
@@ -2083,8 +1963,9 @@ sg_remove_scat(Sg_scatter_hold * schp)
                                            k, sg_page(sg), sg->length));
                                 sg_page_free(sg_page(sg), sg->length);
                         }
+
+                        kfree(schp->buffer);
                 }
-                kfree(schp->buffer);
         }
         memset(schp, 0, sizeof (*schp));
 }
-- 
1.5.5.GIT
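
The gate this patch adds in front of the direct IO path boils down to
one alignment test; as a small sketch (hypothetical helper name, logic
taken from the diff above): only buffers whose address and length both
honor the queue's DMA alignment and padding mask may be mapped
zero-copy, everything else falls back to the copying path.

static bool sg_can_map_direct(struct request_queue *q, unsigned long uaddr,
                              int len)
{
        unsigned long align = queue_dma_alignment(q) | q->dma_pad_mask;

        /* both the address and the length must honor the DMA constraints */
        return !(uaddr & align) && !(len & align);
}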
* [PATCH 5/5] sg: convert the indirect IO path to use the block layer 2008-08-26 2:10 ` [PATCH 4/5] sg: convert the direct IO " FUJITA Tomonori @ 2008-08-26 2:10 ` FUJITA Tomonori 0 siblings, 0 replies; 23+ messages in thread From: FUJITA Tomonori @ 2008-08-26 2:10 UTC (permalink / raw) To: linux-scsi; +Cc: jens.axboe, dougg, michaelc, James.Bottomley, fujita.tomonori This patch converts the indirect IO path (including mmap IO and old struct sg_header) to use the block layer functions (blk_get_request, blk_execute_rq_nowait, blk_rq_map_user, etc) instead of scsi_execute_async(). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Douglas Gilbert <dougg@torque.net> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> --- drivers/scsi/sg.c | 393 ++++++++++++++--------------------------------------- 1 files changed, 103 insertions(+), 290 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 10a285e..66a2c31 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -47,7 +47,6 @@ static int sg_version_num = 30534; /* 2 digits for each component */ #include <linux/seq_file.h> #include <linux/blkdev.h> #include <linux/delay.h> -#include <linux/scatterlist.h> #include <linux/blktrace_api.h> #include <linux/smp_lock.h> @@ -119,7 +118,8 @@ typedef struct sg_scatter_hold { /* holding area for scsi scatter gather info */ unsigned sglist_len; /* size of malloc'd scatter-gather list ++ */ unsigned bufflen; /* Size of (aggregate) data buffer */ unsigned b_malloc_len; /* actual len malloc'ed in buffer */ - struct scatterlist *buffer;/* scatter list */ + struct page **pages; + int page_order; char dio_in_use; /* 0->indirect IO (or mmap), 1->dio */ unsigned char cmd_opcode; /* first byte of command */ } Sg_scatter_hold; @@ -190,8 +190,6 @@ static ssize_t sg_new_write(Sg_fd *sfp, struct file *file, int read_only, Sg_request **o_srp); static int sg_common_write(Sg_fd * sfp, Sg_request * srp, unsigned char *cmnd, int timeout, int blocking); -static int sg_u_iovec(sg_io_hdr_t * hp, int sg_num, int ind, - int wr_xf, int *countp, unsigned char __user **up); static int sg_write_xfer(Sg_request * srp); static int sg_read_xfer(Sg_request * srp); static int sg_read_oxfer(Sg_request * srp, char __user *outp, int num_read_xfer); @@ -199,8 +197,6 @@ static void sg_remove_scat(Sg_scatter_hold * schp); static void sg_build_reserve(Sg_fd * sfp, int req_size); static void sg_link_reserve(Sg_fd * sfp, Sg_request * srp, int size); static void sg_unlink_reserve(Sg_fd * sfp, Sg_request * srp); -static struct page *sg_page_malloc(int rqSz, int lowDma, int *retSzp); -static void sg_page_free(struct page *page, int size); static Sg_fd *sg_add_sfp(Sg_device * sdp, int dev); static int sg_remove_sfp(Sg_device * sdp, Sg_fd * sfp); static void __sg_remove_sfp(Sg_device * sdp, Sg_fd * sfp); @@ -770,26 +766,11 @@ sg_common_write(Sg_fd * sfp, Sg_request * srp, break; } hp->duration = jiffies_to_msecs(jiffies); -/* Now send everything of to mid-level. The next time we hear about this - packet is when sg_cmd_done() is called (i.e. a callback). 
*/ - if (srp->rq) { - srp->rq->timeout = timeout; - blk_execute_rq_nowait(sdp->device->request_queue, sdp->disk, - srp->rq, 1, sg_rq_end_io); - return 0; - } - if (scsi_execute_async(sdp->device, cmnd, hp->cmd_len, data_dir, srp->data.buffer, - hp->dxfer_len, srp->data.k_use_sg, timeout, - SG_DEFAULT_RETRIES, srp, sg_cmd_done, - GFP_ATOMIC)) { - SCSI_LOG_TIMEOUT(1, printk("sg_common_write: scsi_execute_async failed\n")); - /* - * most likely out of mem, but could also be a bad map - */ - sg_finish_rem_req(srp); - return -ENOMEM; - } else - return 0; + + srp->rq->timeout = timeout; + blk_execute_rq_nowait(sdp->device->request_queue, sdp->disk, + srp->rq, 1, sg_rq_end_io); + return 0; } static int @@ -1205,8 +1186,7 @@ sg_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf) Sg_fd *sfp; unsigned long offset, len, sa; Sg_scatter_hold *rsv_schp; - struct scatterlist *sg; - int k; + int k, length; if ((NULL == vma) || (!(sfp = (Sg_fd *) vma->vm_private_data))) return VM_FAULT_SIGBUS; @@ -1216,15 +1196,14 @@ sg_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf) return VM_FAULT_SIGBUS; SCSI_LOG_TIMEOUT(3, printk("sg_vma_fault: offset=%lu, scatg=%d\n", offset, rsv_schp->k_use_sg)); - sg = rsv_schp->buffer; sa = vma->vm_start; - for (k = 0; (k < rsv_schp->k_use_sg) && (sa < vma->vm_end); - ++k, sg = sg_next(sg)) { + length = 1 << (PAGE_SHIFT + rsv_schp->page_order); + for (k = 0; k < rsv_schp->k_use_sg && sa < vma->vm_end; k++) { len = vma->vm_end - sa; - len = (len < sg->length) ? len : sg->length; + len = (len < length) ? len : length; if (offset < len) { - struct page *page; - page = virt_to_page(page_address(sg_page(sg)) + offset); + struct page *page = nth_page(rsv_schp->pages[k], + offset >> PAGE_SHIFT); get_page(page); /* increment page count */ vmf->page = page; return 0; /* success */ @@ -1246,8 +1225,7 @@ sg_mmap(struct file *filp, struct vm_area_struct *vma) Sg_fd *sfp; unsigned long req_sz, len, sa; Sg_scatter_hold *rsv_schp; - int k; - struct scatterlist *sg; + int k, length; if ((!filp) || (!vma) || (!(sfp = (Sg_fd *) filp->private_data))) return -ENXIO; @@ -1261,11 +1239,10 @@ sg_mmap(struct file *filp, struct vm_area_struct *vma) return -ENOMEM; /* cannot map more than reserved buffer */ sa = vma->vm_start; - sg = rsv_schp->buffer; - for (k = 0; (k < rsv_schp->k_use_sg) && (sa < vma->vm_end); - ++k, sg = sg_next(sg)) { + length = 1 << (PAGE_SHIFT + rsv_schp->page_order); + for (k = 0; k < rsv_schp->k_use_sg && sa < vma->vm_end; k++) { len = vma->vm_end - sa; - len = (len < sg->length) ? len : sg->length; + len = (len < length) ? 
len : length; sa += len; } @@ -1309,7 +1286,6 @@ sg_cmd_done(void *data, char *sense, int result, int resid) if (0 != result) { struct scsi_sense_hdr sshdr; - memcpy(srp->sense_b, sense, sizeof (srp->sense_b)); srp->header.status = 0xff & result; srp->header.masked_status = status_byte(result); srp->header.msg_status = msg_byte(result); @@ -1685,34 +1661,51 @@ sg_start_req(Sg_request * srp, unsigned char *cmd) Sg_scatter_hold *rsv_schp = &sfp->reserve; struct request_queue *q = sfp->parentdp->device->request_queue; unsigned long alignment = queue_dma_alignment(q) | q->dma_pad_mask; + struct rq_map_data map_data; SCSI_LOG_TIMEOUT(4, printk("sg_start_req: dxfer_len=%d\n", dxfer_len)); + res = __sg_start_req(srp, hp, cmd); + if (res) + return res; + if ((dxfer_len <= 0) || (dxfer_dir == SG_DXFER_NONE)) - return __sg_start_req(srp, hp, cmd); + return 0; #ifdef SG_ALLOW_DIO_CODE if (sg_allow_dio && (hp->flags & SG_FLAG_DIRECT_IO) && (dxfer_dir != SG_DXFER_UNKNOWN) && (0 == hp->iovec_count) && (!sfp->parentdp->device->host->unchecked_isa_dma) && - !(uaddr & alignment) && !(dxfer_len & alignment)) { - res = __sg_start_req(srp, hp, cmd); - if (!res) - res = sg_build_direct(srp, sfp, dxfer_len); - - return res; - } + !(uaddr & alignment) && !(dxfer_len & alignment)) + return sg_build_direct(srp, sfp, dxfer_len); #endif if ((!sg_res_in_use(sfp)) && (dxfer_len <= rsv_schp->bufflen)) sg_link_reserve(sfp, srp, dxfer_len); - else { + else res = sg_build_indirect(req_schp, sfp, dxfer_len); - if (res) { - sg_remove_scat(req_schp); - return res; - } + + if (!res) { + struct request *rq = srp->rq; + Sg_scatter_hold *schp = &srp->data; + int iovec_count = (int) hp->iovec_count; + + map_data.pages = schp->pages; + map_data.page_order = schp->page_order; + map_data.nr_entries = schp->k_use_sg; + + if (iovec_count) + res = blk_rq_map_user_iov(q, rq, &map_data, hp->dxferp, + iovec_count, + hp->dxfer_len, GFP_ATOMIC); + else + res = blk_rq_map_user(q, rq, &map_data, hp->dxferp, hp->dxfer_len, + GFP_ATOMIC); + + if (!res) + srp->bio = rq->bio; } - return 0; + + return res; } static void @@ -1730,6 +1723,7 @@ sg_finish_rem_req(Sg_request * srp) if (srp->rq) { if (srp->bio) blk_rq_unmap_user(srp->bio); + blk_put_request(srp->rq); } @@ -1739,21 +1733,12 @@ sg_finish_rem_req(Sg_request * srp) static int sg_build_sgat(Sg_scatter_hold * schp, const Sg_fd * sfp, int tablesize) { - int sg_bufflen = tablesize * sizeof(struct scatterlist); + int sg_bufflen = tablesize * sizeof(struct page *); gfp_t gfp_flags = GFP_ATOMIC | __GFP_NOWARN; - /* - * TODO: test without low_dma, we should not need it since - * the block layer will bounce the buffer for us - * - * XXX(hch): we shouldn't need GFP_DMA for the actual S/G list. 
- */ - if (sfp->low_dma) - gfp_flags |= GFP_DMA; - schp->buffer = kzalloc(sg_bufflen, gfp_flags); - if (!schp->buffer) + schp->pages = kzalloc(sg_bufflen, gfp_flags); + if (!schp->pages) return -ENOMEM; - sg_init_table(schp->buffer, tablesize); schp->sglist_len = sg_bufflen; return tablesize; /* number of scat_gath elements allocated */ } @@ -1780,11 +1765,10 @@ sg_build_direct(Sg_request * srp, Sg_fd * sfp, int dxfer_len) static int sg_build_indirect(Sg_scatter_hold * schp, Sg_fd * sfp, int buff_size) { - struct scatterlist *sg; - int ret_sz = 0, k, rem_sz, num, mx_sc_elems; + int ret_sz = 0, i, k, rem_sz, num, mx_sc_elems; int sg_tablesize = sfp->parentdp->sg_tablesize; - int blk_size = buff_size; - struct page *p = NULL; + int blk_size = buff_size, order; + gfp_t gfp_mask = GFP_ATOMIC | __GFP_COMP | __GFP_NOWARN; if (blk_size < 0) return -EFAULT; @@ -1808,15 +1792,26 @@ sg_build_indirect(Sg_scatter_hold * schp, Sg_fd * sfp, int buff_size) } else scatter_elem_sz_prev = num; } - for (k = 0, sg = schp->buffer, rem_sz = blk_size; - (rem_sz > 0) && (k < mx_sc_elems); - ++k, rem_sz -= ret_sz, sg = sg_next(sg)) { - + + if (sfp->low_dma) + gfp_mask |= GFP_DMA; + + if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO)) + gfp_mask |= __GFP_ZERO; + + order = get_order(num); +retry: + ret_sz = 1 << (PAGE_SHIFT + order); + + for (k = 0, rem_sz = blk_size; rem_sz > 0 && k < mx_sc_elems; + k++, rem_sz -= ret_sz) { + num = (rem_sz > scatter_elem_sz_prev) ? - scatter_elem_sz_prev : rem_sz; - p = sg_page_malloc(num, sfp->low_dma, &ret_sz); - if (!p) - return -ENOMEM; + scatter_elem_sz_prev : rem_sz; + + schp->pages[k] = alloc_pages(gfp_mask, order); + if (!schp->pages[k]) + goto out; if (num == scatter_elem_sz_prev) { if (unlikely(ret_sz > scatter_elem_sz_prev)) { @@ -1824,12 +1819,12 @@ sg_build_indirect(Sg_scatter_hold * schp, Sg_fd * sfp, int buff_size) scatter_elem_sz_prev = ret_sz; } } - sg_set_page(sg, p, (ret_sz > num) ? num : ret_sz, 0); SCSI_LOG_TIMEOUT(5, printk("sg_build_indirect: k=%d, num=%d, " "ret_sz=%d\n", k, num, ret_sz)); } /* end of for loop */ + schp->page_order = order; schp->k_use_sg = k; SCSI_LOG_TIMEOUT(5, printk("sg_build_indirect: k_use_sg=%d, " "rem_sz=%d\n", k, rem_sz)); @@ -1837,8 +1832,15 @@ sg_build_indirect(Sg_scatter_hold * schp, Sg_fd * sfp, int buff_size) schp->bufflen = blk_size; if (rem_sz > 0) /* must have failed */ return -ENOMEM; - return 0; +out: + for (i = 0; i < k; i++) + __free_pages(schp->pages[k], order); + + if (--order >= 0) + goto retry; + + return -ENOMEM; } static int @@ -1846,13 +1848,8 @@ sg_write_xfer(Sg_request * srp) { sg_io_hdr_t *hp = &srp->header; Sg_scatter_hold *schp = &srp->data; - struct scatterlist *sg = schp->buffer; int num_xfer = 0; - int j, k, onum, usglen, ksglen, res; - int iovec_count = (int) hp->iovec_count; int dxfer_dir = hp->dxfer_direction; - unsigned char *p; - unsigned char __user *up; int new_interface = ('\0' == hp->interface_id) ? 
0 : 1; if ((SG_DXFER_UNKNOWN == dxfer_dir) || (SG_DXFER_TO_DEV == dxfer_dir) || @@ -1868,103 +1865,26 @@ sg_write_xfer(Sg_request * srp) SCSI_LOG_TIMEOUT(4, printk("sg_write_xfer: num_xfer=%d, iovec_count=%d, k_use_sg=%d\n", num_xfer, iovec_count, schp->k_use_sg)); - if (iovec_count) { - onum = iovec_count; - if (!access_ok(VERIFY_READ, hp->dxferp, SZ_SG_IOVEC * onum)) - return -EFAULT; - } else - onum = 1; - - ksglen = sg->length; - p = page_address(sg_page(sg)); - for (j = 0, k = 0; j < onum; ++j) { - res = sg_u_iovec(hp, iovec_count, j, 1, &usglen, &up); - if (res) - return res; - - for (; p; sg = sg_next(sg), ksglen = sg->length, - p = page_address(sg_page(sg))) { - if (usglen <= 0) - break; - if (ksglen > usglen) { - if (usglen >= num_xfer) { - if (__copy_from_user(p, up, num_xfer)) - return -EFAULT; - return 0; - } - if (__copy_from_user(p, up, usglen)) - return -EFAULT; - p += usglen; - ksglen -= usglen; - break; - } else { - if (ksglen >= num_xfer) { - if (__copy_from_user(p, up, num_xfer)) - return -EFAULT; - return 0; - } - if (__copy_from_user(p, up, ksglen)) - return -EFAULT; - up += ksglen; - usglen -= ksglen; - } - ++k; - if (k >= schp->k_use_sg) - return 0; - } - } return 0; } -static int -sg_u_iovec(sg_io_hdr_t * hp, int sg_num, int ind, - int wr_xf, int *countp, unsigned char __user **up) -{ - int num_xfer = (int) hp->dxfer_len; - unsigned char __user *p = hp->dxferp; - int count; - - if (0 == sg_num) { - if (wr_xf && ('\0' == hp->interface_id)) - count = (int) hp->flags; /* holds "old" input_size */ - else - count = num_xfer; - } else { - sg_iovec_t iovec; - if (__copy_from_user(&iovec, p + ind*SZ_SG_IOVEC, SZ_SG_IOVEC)) - return -EFAULT; - p = iovec.iov_base; - count = (int) iovec.iov_len; - } - if (!access_ok(wr_xf ? VERIFY_READ : VERIFY_WRITE, p, count)) - return -EFAULT; - if (up) - *up = p; - if (countp) - *countp = count; - return 0; -} - static void sg_remove_scat(Sg_scatter_hold * schp) { SCSI_LOG_TIMEOUT(4, printk("sg_remove_scat: k_use_sg=%d\n", schp->k_use_sg)); - if (schp->buffer && (schp->sglist_len > 0)) { - struct scatterlist *sg = schp->buffer; - + if (schp->pages && schp->sglist_len > 0) { if (!schp->dio_in_use) { int k; - for (k = 0; (k < schp->k_use_sg) && sg_page(sg); - ++k, sg = sg_next(sg)) { + for (k = 0; k < schp->k_use_sg && schp->pages[k]; k++) { SCSI_LOG_TIMEOUT(5, printk( - "sg_remove_scat: k=%d, pg=0x%p, len=%d\n", - k, sg_page(sg), sg->length)); - sg_page_free(sg_page(sg), sg->length); + "sg_remove_scat: k=%d, pg=0x%p\n", + k, page)); + __free_pages(schp->pages[k], schp->page_order); } - kfree(schp->buffer); + kfree(schp->pages); } } memset(schp, 0, sizeof (*schp)); @@ -1975,13 +1895,8 @@ sg_read_xfer(Sg_request * srp) { sg_io_hdr_t *hp = &srp->header; Sg_scatter_hold *schp = &srp->data; - struct scatterlist *sg = schp->buffer; int num_xfer = 0; - int j, k, onum, usglen, ksglen, res; - int iovec_count = (int) hp->iovec_count; int dxfer_dir = hp->dxfer_direction; - unsigned char *p; - unsigned char __user *up; int new_interface = ('\0' == hp->interface_id) ? 
0 : 1; if ((SG_DXFER_UNKNOWN == dxfer_dir) || (SG_DXFER_FROM_DEV == dxfer_dir) @@ -1996,53 +1911,7 @@ sg_read_xfer(Sg_request * srp) return 0; SCSI_LOG_TIMEOUT(4, printk("sg_read_xfer: num_xfer=%d, iovec_count=%d, k_use_sg=%d\n", - num_xfer, iovec_count, schp->k_use_sg)); - if (iovec_count) { - onum = iovec_count; - if (!access_ok(VERIFY_READ, hp->dxferp, SZ_SG_IOVEC * onum)) - return -EFAULT; - } else - onum = 1; - - p = page_address(sg_page(sg)); - ksglen = sg->length; - for (j = 0, k = 0; j < onum; ++j) { - res = sg_u_iovec(hp, iovec_count, j, 0, &usglen, &up); - if (res) - return res; - - for (; p; sg = sg_next(sg), ksglen = sg->length, - p = page_address(sg_page(sg))) { - if (usglen <= 0) - break; - if (ksglen > usglen) { - if (usglen >= num_xfer) { - if (__copy_to_user(up, p, num_xfer)) - return -EFAULT; - return 0; - } - if (__copy_to_user(up, p, usglen)) - return -EFAULT; - p += usglen; - ksglen -= usglen; - break; - } else { - if (ksglen >= num_xfer) { - if (__copy_to_user(up, p, num_xfer)) - return -EFAULT; - return 0; - } - if (__copy_to_user(up, p, ksglen)) - return -EFAULT; - up += ksglen; - usglen -= ksglen; - } - ++k; - if (k >= schp->k_use_sg) - return 0; - } - } - + num_xfer, (int)hp->iovec_count, schp->k_use_sg)); return 0; } @@ -2050,7 +1919,6 @@ static int sg_read_oxfer(Sg_request * srp, char __user *outp, int num_read_xfer) { Sg_scatter_hold *schp = &srp->data; - struct scatterlist *sg = schp->buffer; int k, num; SCSI_LOG_TIMEOUT(4, printk("sg_read_oxfer: num_read_xfer=%d\n", @@ -2058,15 +1926,18 @@ sg_read_oxfer(Sg_request * srp, char __user *outp, int num_read_xfer) if ((!outp) || (num_read_xfer <= 0)) return 0; - for (k = 0; (k < schp->k_use_sg) && sg_page(sg); ++k, sg = sg_next(sg)) { - num = sg->length; + blk_rq_unmap_user(srp->bio); + srp->bio = NULL; + + num = 1 << (PAGE_SHIFT + schp->page_order); + for (k = 0; k < schp->k_use_sg && schp->pages[k]; k++) { if (num > num_read_xfer) { - if (__copy_to_user(outp, page_address(sg_page(sg)), + if (__copy_to_user(outp, page_address(schp->pages[k]), num_read_xfer)) return -EFAULT; break; } else { - if (__copy_to_user(outp, page_address(sg_page(sg)), + if (__copy_to_user(outp, page_address(schp->pages[k]), num)) return -EFAULT; num_read_xfer -= num; @@ -2101,24 +1972,22 @@ sg_link_reserve(Sg_fd * sfp, Sg_request * srp, int size) { Sg_scatter_hold *req_schp = &srp->data; Sg_scatter_hold *rsv_schp = &sfp->reserve; - struct scatterlist *sg = rsv_schp->buffer; int k, num, rem; srp->res_used = 1; SCSI_LOG_TIMEOUT(4, printk("sg_link_reserve: size=%d\n", size)); rem = size; - for (k = 0; k < rsv_schp->k_use_sg; ++k, sg = sg_next(sg)) { - num = sg->length; + num = 1 << (PAGE_SHIFT + rsv_schp->page_order); + for (k = 0; k < rsv_schp->k_use_sg; k++) { if (rem <= num) { - sfp->save_scat_len = num; - sg->length = rem; req_schp->k_use_sg = k + 1; req_schp->sglist_len = rsv_schp->sglist_len; - req_schp->buffer = rsv_schp->buffer; + req_schp->pages = rsv_schp->pages; req_schp->bufflen = size; req_schp->b_malloc_len = rsv_schp->b_malloc_len; + req_schp->page_order = rsv_schp->page_order; break; } else rem -= num; @@ -2132,22 +2001,13 @@ static void sg_unlink_reserve(Sg_fd * sfp, Sg_request * srp) { Sg_scatter_hold *req_schp = &srp->data; - Sg_scatter_hold *rsv_schp = &sfp->reserve; SCSI_LOG_TIMEOUT(4, printk("sg_unlink_reserve: req->k_use_sg=%d\n", (int) req_schp->k_use_sg)); - if ((rsv_schp->k_use_sg > 0) && (req_schp->k_use_sg > 0)) { - struct scatterlist *sg = rsv_schp->buffer; - - if (sfp->save_scat_len > 0) - (sg + 
(req_schp->k_use_sg - 1))->length = - (unsigned) sfp->save_scat_len; - else - SCSI_LOG_TIMEOUT(1, printk ("sg_unlink_reserve: BAD save_scat_len\n")); - } req_schp->k_use_sg = 0; req_schp->bufflen = 0; - req_schp->buffer = NULL; + req_schp->pages = NULL; + req_schp->page_order = 0; req_schp->sglist_len = 0; sfp->save_scat_len = 0; srp->res_used = 0; @@ -2405,53 +2265,6 @@ sg_res_in_use(Sg_fd * sfp) return srp ? 1 : 0; } -/* The size fetched (value output via retSzp) set when non-NULL return */ -static struct page * -sg_page_malloc(int rqSz, int lowDma, int *retSzp) -{ - struct page *resp = NULL; - gfp_t page_mask; - int order, a_size; - int resSz; - - if ((rqSz <= 0) || (NULL == retSzp)) - return resp; - - if (lowDma) - page_mask = GFP_ATOMIC | GFP_DMA | __GFP_COMP | __GFP_NOWARN; - else - page_mask = GFP_ATOMIC | __GFP_COMP | __GFP_NOWARN; - - for (order = 0, a_size = PAGE_SIZE; a_size < rqSz; - order++, a_size <<= 1) ; - resSz = a_size; /* rounded up if necessary */ - resp = alloc_pages(page_mask, order); - while ((!resp) && order) { - --order; - a_size >>= 1; /* divide by 2, until PAGE_SIZE */ - resp = alloc_pages(page_mask, order); /* try half */ - resSz = a_size; - } - if (resp) { - if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO)) - memset(page_address(resp), 0, resSz); - *retSzp = resSz; - } - return resp; -} - -static void -sg_page_free(struct page *page, int size) -{ - int order, a_size; - - if (!page) - return; - for (order = 0, a_size = PAGE_SIZE; a_size < size; - order++, a_size <<= 1) ; - __free_pages(page, order); -} - #ifdef CONFIG_SCSI_PROC_FS static int sg_idr_max_id(int id, void *p, void *data) -- 1.5.5.GIT ^ permalink raw reply related [flat|nested] 23+ messages in thread
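The conversion above drops sg's scatterlist bookkeeping in favour of a bare page array plus a single page_order, so every pages[] entry carries 1 << (PAGE_SHIFT + page_order) bytes; that is the num/length computation now visible in sg_read_oxfer() and sg_link_reserve(). The round-up that the deleted sg_page_malloc() performed can be sketched as follows (a minimal user-space illustration; bytes_to_order() and the 4 KB page size are assumptions of the example, not driver code):

#include <stdio.h>

#define EX_PAGE_SHIFT 12			/* assume 4 KB pages */
#define EX_PAGE_SIZE  (1UL << EX_PAGE_SHIFT)

/* Smallest order such that (EX_PAGE_SIZE << order) >= len, mirroring
 * the round-up loop deleted from sg_page_malloc() above. */
static int bytes_to_order(unsigned long len)
{
	int order = 0;
	unsigned long size = EX_PAGE_SIZE;

	while (size < len) {
		order++;
		size <<= 1;
	}
	return order;
}

int main(void)
{
	/* 20 KB rounds up to 32 KB, i.e. order 3 with 4 KB pages; the
	 * per-entry byte count is then 1 << (EX_PAGE_SHIFT + order). */
	printf("order=%d\n", bytes_to_order(20 * 1024));
	return 0;
}

Recording only the order (instead of a per-entry length) is what lets sg_remove_scat() above free every entry with a single __free_pages(schp->pages[k], schp->page_order) call.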
* Re: [PATCH 1/5] block: add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov 2008-08-26 2:10 ` [PATCH 1/5] block: add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov FUJITA Tomonori 2008-08-26 2:10 ` [PATCH 2/5] block: introduce struct rq_map_data to use reserved pages FUJITA Tomonori @ 2008-08-26 16:35 ` Christoph Hellwig 2008-08-27 1:30 ` FUJITA Tomonori 1 sibling, 1 reply; 23+ messages in thread From: Christoph Hellwig @ 2008-08-26 16:35 UTC (permalink / raw) To: FUJITA Tomonori; +Cc: linux-scsi, jens.axboe, dougg, michaelc, James.Bottomley On Tue, Aug 26, 2008 at 11:10:50AM +0900, FUJITA Tomonori wrote: > Currently, blk_rq_map_user and blk_rq_map_user_iov always do > GFP_KERNEL allocation. > > This adds gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov > so sg can use it (sg always does GFP_ATOMIC allocation). Most of these GFP_ATOMIC uses look rather spurious to me, and are probably there for some historic reason. Do you have a caller that actually needs GFP_ATOMIC because it's under a spinlock or from irq context, or is this just to stay as close as possible to the existing sg code? ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/5] block: add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov 2008-08-26 16:35 ` [PATCH 1/5] block: add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov Christoph Hellwig @ 2008-08-27 1:30 ` FUJITA Tomonori 0 siblings, 0 replies; 23+ messages in thread From: FUJITA Tomonori @ 2008-08-27 1:30 UTC (permalink / raw) To: hch Cc: fujita.tomonori, linux-scsi, jens.axboe, dougg, michaelc, James.Bottomley On Tue, 26 Aug 2008 12:35:45 -0400 Christoph Hellwig <hch@infradead.org> wrote: > On Tue, Aug 26, 2008 at 11:10:50AM +0900, FUJITA Tomonori wrote: > > Currently, blk_rq_map_user and blk_rq_map_user_iov always do > > GFP_KERNEL allocation. > > > > This adds gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov > > so sg can use it (sg always does GFP_ATOMIC allocation). > > Most of these GFP_ATOMIC uses look rather spurious to me, and are > probably there for some historic reason. Do you have a caller that actually needs > GFP_ATOMIC because it's under a spinlock or from irq context, No, we don't have one. > or is this just to stay as close as possible to the existing sg > code? Yes; I don't change sg's behavior. GFP_NOWAIT would be more appropriate than GFP_ATOMIC for sg, I guess. But let me finish the conversion without changing sg's behavior. I know that changing the behavior would make it more difficult to get Doug's ACK on these patches. ^ permalink raw reply [flat|nested] 23+ messages in thread
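A brief aside on the flags discussed in this exchange: GFP_KERNEL may sleep, so it is illegal under a spinlock or in interrupt context; GFP_ATOMIC never sleeps and may dip into the emergency reserves; GFP_NOWAIT likewise never sleeps but leaves the reserves alone, so it fails sooner under memory pressure. A minimal kernel-style sketch of the constraint (example_lock and alloc_under_lock are illustrative names, not sg code):

#include <linux/slab.h>
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(example_lock);

static void *alloc_under_lock(size_t len)
{
	unsigned long flags;
	void *p;

	spin_lock_irqsave(&example_lock, flags);
	/*
	 * GFP_KERNEL here would be a bug: it may sleep while we hold
	 * the lock (and with interrupts off).  GFP_ATOMIC never
	 * sleeps, at the cost of eating into the emergency reserves.
	 */
	p = kmalloc(len, GFP_ATOMIC);
	spin_unlock_irqrestore(&example_lock, flags);
	return p;
}

As the answer above notes, sg has no caller that actually requires the atomic variant; GFP_ATOMIC is kept in this series only to avoid a behavior change.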
* Re: [PATCH 0/5] convert sg to use the block layer 2008-08-26 2:10 [PATCH 0/5] convert sg to use the block layer FUJITA Tomonori 2008-08-26 2:10 ` [PATCH 1/5] block: add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov FUJITA Tomonori @ 2008-08-26 7:56 ` Jens Axboe 2008-08-27 2:14 ` FUJITA Tomonori 2008-08-27 20:14 ` Douglas Gilbert 2 siblings, 1 reply; 23+ messages in thread From: Jens Axboe @ 2008-08-26 7:56 UTC (permalink / raw) To: FUJITA Tomonori; +Cc: linux-scsi, dougg, michaelc, James.Bottomley On Tue, Aug 26 2008, FUJITA Tomonori wrote: > This patchset converts sg to use the block layer functions. That is, > sg doesn't use scsi_execute_async() any more. This is a part of the > overdue task to remove scsi_req_map_sg. > > I tested this patchset with sg v3 and the old interface (struct > sg_header) via SG_IO (v3) and the vfs API. > > > Doug, > > 1. I don't demote the sg driver's GFP_ATOMIC to GFP_KERNEL. sg always > uses GFP_ATOMIC as before. > > 2. I don't remove GFP_DMA allocation. As before, sg allocates reserved > pages with GFP_DMA (sfp->low_dma case). > > 3. I keep the reserved buffer per struct sg_fd as before. > > 4. I use high-order page allocation for reserved buffer as before. sg > works well with HBAs that have the limitation of the number of sg > entries. > > 5. I think that you were concern about the overhead of the block layer > functions. But if you look at scsi_execute_async() that sg uses now, > you can find that scsi_execute_async() uses the block layer functions > internally. So the current sg incurs the overhead (if such overhead > exists). > > > Jens, > > I keep the block API changes to a minimum (I might need more changes > for st/osst but I'd like to progress step by step). I did only two > things. > > 1. I add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov. > > 2. I introduces struct rq_map_data holding pages. sg puts > pre-allocated pages to it and passes it to bio_copy_user_iov(). The > current users of bio_copy_user_iov simply passes NULL. blk_rq_map_user > and blk_rq_map_user_iov take a pointer to struct rq_map_data and in > the end bio_copy_user_iov gets it. > > > This patchset against the for-linus branch in Jens' tree + the two > patches for 2.6.27: > > http://marc.info/?l=linux-kernel&m=121964251911717&w=2 > http://marc.info/?l=linux-kernel&m=121964241911611&w=2 > > After Jens rebases the for-2.6.28 brach, I'll update this patchset > too. Thanks a lot for doing this work, it's been pending for a long time. I've rebased for-2.6.28 on top of for-linus to ease this integration, so if you could resend the patchset then I can add it. -- Jens Axboe ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 0/5] convert sg to use the block layer 2008-08-26 7:56 ` [PATCH 0/5] convert sg to use the block layer Jens Axboe @ 2008-08-27 2:14 ` FUJITA Tomonori 2008-08-27 7:10 ` Jens Axboe 0 siblings, 1 reply; 23+ messages in thread From: FUJITA Tomonori @ 2008-08-27 2:14 UTC (permalink / raw) To: jens.axboe; +Cc: fujita.tomonori, linux-scsi, dougg, michaelc, James.Bottomley On Tue, 26 Aug 2008 09:56:28 +0200 Jens Axboe <jens.axboe@oracle.com> wrote: > On Tue, Aug 26 2008, FUJITA Tomonori wrote: > > This patchset converts sg to use the block layer functions. That is, > > sg doesn't use scsi_execute_async() any more. This is a part of the > > overdue task to remove scsi_req_map_sg. > > > > I tested this patchset with sg v3 and the old interface (struct > > sg_header) via SG_IO (v3) and the vfs API. > > > > > > Doug, > > > > 1. I don't demote the sg driver's GFP_ATOMIC to GFP_KERNEL. sg always > > uses GFP_ATOMIC as before. > > > > 2. I don't remove GFP_DMA allocation. As before, sg allocates reserved > > pages with GFP_DMA (sfp->low_dma case). > > > > 3. I keep the reserved buffer per struct sg_fd as before. > > > > 4. I use high-order page allocation for reserved buffer as before. sg > > works well with HBAs that have the limitation of the number of sg > > entries. > > > > 5. I think that you were concern about the overhead of the block layer > > functions. But if you look at scsi_execute_async() that sg uses now, > > you can find that scsi_execute_async() uses the block layer functions > > internally. So the current sg incurs the overhead (if such overhead > > exists). > > > > > > Jens, > > > > I keep the block API changes to a minimum (I might need more changes > > for st/osst but I'd like to progress step by step). I did only two > > things. > > > > 1. I add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov. > > > > 2. I introduces struct rq_map_data holding pages. sg puts > > pre-allocated pages to it and passes it to bio_copy_user_iov(). The > > current users of bio_copy_user_iov simply passes NULL. blk_rq_map_user > > and blk_rq_map_user_iov take a pointer to struct rq_map_data and in > > the end bio_copy_user_iov gets it. > > > > > > This patchset against the for-linus branch in Jens' tree + the two > > patches for 2.6.27: > > > > http://marc.info/?l=linux-kernel&m=121964251911717&w=2 > > http://marc.info/?l=linux-kernel&m=121964241911611&w=2 > > > > After Jens rebases the for-2.6.28 brach, I'll update this patchset > > too. > > Thanks a lot for doing this work, it's been pending for a long time. > I've rebased for-2.6.28 on top of for-linus to ease this integration, so > if you could resend the patchset then I can add it. Thanks, I put an updated version against the for-2.6.28 branch in your tree: git://git.kernel.org/pub/scm/linux/kernel/git/tomo/linux-2.6-misc.git sg-block Can you wait for a little while before applying this to your for-2.6.28 branch? We need Doug's ACK on this. But I think that it's nice if you can send this to linux-next since testing this patchset on linux-next is good. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 0/5] convert sg to use the block layer 2008-08-27 2:14 ` FUJITA Tomonori @ 2008-08-27 7:10 ` Jens Axboe 2008-08-27 23:26 ` FUJITA Tomonori 0 siblings, 1 reply; 23+ messages in thread From: Jens Axboe @ 2008-08-27 7:10 UTC (permalink / raw) To: FUJITA Tomonori; +Cc: linux-scsi, dougg, michaelc, James.Bottomley On Wed, Aug 27 2008, FUJITA Tomonori wrote: > On Tue, 26 Aug 2008 09:56:28 +0200 > Jens Axboe <jens.axboe@oracle.com> wrote: > > > On Tue, Aug 26 2008, FUJITA Tomonori wrote: > > > This patchset converts sg to use the block layer functions. That is, > > > sg doesn't use scsi_execute_async() any more. This is a part of the > > > overdue task to remove scsi_req_map_sg. > > > > > > I tested this patchset with sg v3 and the old interface (struct > > > sg_header) via SG_IO (v3) and the vfs API. > > > > > > > > > Doug, > > > > > > 1. I don't demote the sg driver's GFP_ATOMIC to GFP_KERNEL. sg always > > > uses GFP_ATOMIC as before. > > > > > > 2. I don't remove GFP_DMA allocation. As before, sg allocates reserved > > > pages with GFP_DMA (sfp->low_dma case). > > > > > > 3. I keep the reserved buffer per struct sg_fd as before. > > > > > > 4. I use high-order page allocation for reserved buffer as before. sg > > > works well with HBAs that have the limitation of the number of sg > > > entries. > > > > > > 5. I think that you were concern about the overhead of the block layer > > > functions. But if you look at scsi_execute_async() that sg uses now, > > > you can find that scsi_execute_async() uses the block layer functions > > > internally. So the current sg incurs the overhead (if such overhead > > > exists). > > > > > > > > > Jens, > > > > > > I keep the block API changes to a minimum (I might need more changes > > > for st/osst but I'd like to progress step by step). I did only two > > > things. > > > > > > 1. I add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov. > > > > > > 2. I introduces struct rq_map_data holding pages. sg puts > > > pre-allocated pages to it and passes it to bio_copy_user_iov(). The > > > current users of bio_copy_user_iov simply passes NULL. blk_rq_map_user > > > and blk_rq_map_user_iov take a pointer to struct rq_map_data and in > > > the end bio_copy_user_iov gets it. > > > > > > > > > This patchset against the for-linus branch in Jens' tree + the two > > > patches for 2.6.27: > > > > > > http://marc.info/?l=linux-kernel&m=121964251911717&w=2 > > > http://marc.info/?l=linux-kernel&m=121964241911611&w=2 > > > > > > After Jens rebases the for-2.6.28 brach, I'll update this patchset > > > too. > > > > Thanks a lot for doing this work, it's been pending for a long time. > > I've rebased for-2.6.28 on top of for-linus to ease this integration, so > > if you could resend the patchset then I can add it. > > Thanks, I put an updated version against the for-2.6.28 branch in your > tree: > > git://git.kernel.org/pub/scm/linux/kernel/git/tomo/linux-2.6-misc.git sg-block > > > Can you wait for a little while before applying this to your > for-2.6.28 branch? We need Doug's ACK on this. Sure, just let me know... > But I think that it's nice if you can send this to linux-next since > testing this patchset on linux-next is good. for-2.6.28 is already in linux-next, so once this gets applied it'll get tested there as well. And in -mm. -- Jens Axboe ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 0/5] convert sg to use the block layer 2008-08-27 7:10 ` Jens Axboe @ 2008-08-27 23:26 ` FUJITA Tomonori 2008-08-28 6:51 ` Jens Axboe 0 siblings, 1 reply; 23+ messages in thread From: FUJITA Tomonori @ 2008-08-27 23:26 UTC (permalink / raw) To: jens.axboe; +Cc: fujita.tomonori, linux-scsi, dougg, michaelc, James.Bottomley On Wed, 27 Aug 2008 09:10:18 +0200 Jens Axboe <jens.axboe@oracle.com> wrote: > > Thanks, I put an updated version against the for-2.6.28 branch in your > > tree: > > > > git://git.kernel.org/pub/scm/linux/kernel/git/tomo/linux-2.6-misc.git sg-block > > > > > > Can you wait for a little while before applying this to your > > for-2.6.28 branch? We need Doug's ACK on this. > > Sure, just let me know... Now we got Doug's ACK on this (Doug, thanks!). Can you apply them to your for-2.6.28 branch? I'll resend them if necessary. > > But I think that it's nice if you can send this to linux-next since > > testing this patchset on linux-next is good. > > for-2.6.28 is already in linux-next, so once this gets applied it'll get > tested there as well. And in -mm. Thanks, I see. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 0/5] convert sg to use the block layer 2008-08-27 23:26 ` FUJITA Tomonori @ 2008-08-28 6:51 ` Jens Axboe 2008-08-28 7:07 ` Jens Axboe 0 siblings, 1 reply; 23+ messages in thread From: Jens Axboe @ 2008-08-28 6:51 UTC (permalink / raw) To: FUJITA Tomonori; +Cc: linux-scsi, dougg, michaelc, James.Bottomley On Thu, Aug 28 2008, FUJITA Tomonori wrote: > On Wed, 27 Aug 2008 09:10:18 +0200 > Jens Axboe <jens.axboe@oracle.com> wrote: > > > > Thanks, I put an updated version against the for-2.6.28 branch in your > > > tree: > > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/tomo/linux-2.6-misc.git sg-block > > > > > > > > > Can you wait for a little while before applying this to your > > > for-2.6.28 branch? We need Doug's ACK on this. > > > > Sure, just let me know... > > Now we got Doug's ACK on this (Doug, thanks!). Can you apply them to > your for-2.6.28 branch? I'll resend them if necessary. Yes certainly. No need to resend, I'll add Dougs ack and merge it. -- Jens Axboe ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 0/5] convert sg to use the block layer 2008-08-28 6:51 ` Jens Axboe @ 2008-08-28 7:07 ` Jens Axboe 2008-08-28 7:17 ` [PATCH 1/5] block: add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov FUJITA Tomonori 2008-08-28 7:32 ` [PATCH 0/5] convert sg to use the block layer FUJITA Tomonori 0 siblings, 2 replies; 23+ messages in thread From: Jens Axboe @ 2008-08-28 7:07 UTC (permalink / raw) To: FUJITA Tomonori; +Cc: linux-scsi, dougg, michaelc, James.Bottomley On Thu, Aug 28 2008, Jens Axboe wrote: > On Thu, Aug 28 2008, FUJITA Tomonori wrote: > > On Wed, 27 Aug 2008 09:10:18 +0200 > > Jens Axboe <jens.axboe@oracle.com> wrote: > > > > > > Thanks, I put an updated version against the for-2.6.28 branch in your > > > > tree: > > > > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/tomo/linux-2.6-misc.git sg-block > > > > > > > > > > > > Can you wait for a little while before applying this to your > > > > for-2.6.28 branch? We need Doug's ACK on this. > > > > > > Sure, just let me know... > > > > Now we got Doug's ACK on this (Doug, thanks!). Can you apply them to > > your for-2.6.28 branch? I'll resend them if necessary. > > Yes certainly. No need to resend, I'll add Dougs ack and merge it. Actually, the previous was against for-linus (not for-2.6.28), so if you could resend against 2.6.28 that would help a lot. Thanks! -- Jens Axboe ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 1/5] block: add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov 2008-08-28 7:07 ` Jens Axboe @ 2008-08-28 7:17 ` FUJITA Tomonori 2008-08-28 7:17 ` [PATCH 2/5] block: introduce struct rq_map_data to use reserved pages FUJITA Tomonori 2008-08-28 7:34 ` [PATCH 1/5] block: add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov Jens Axboe 2008-08-28 7:32 ` [PATCH 0/5] convert sg to use the block layer FUJITA Tomonori 1 sibling, 2 replies; 23+ messages in thread From: FUJITA Tomonori @ 2008-08-28 7:17 UTC (permalink / raw) To: jens.axboe; +Cc: linux-scsi, dougg, michaelc, James.Bottomley, FUJITA Tomonori Currently, blk_rq_map_user and blk_rq_map_user_iov always do GFP_KERNEL allocation. This adds gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov so sg can use it (sg always does GFP_ATOMIC allocation). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Douglas Gilbert <dougg@torque.net> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> --- block/blk-map.c | 20 ++++++++++++-------- block/bsg.c | 5 +++-- block/scsi_ioctl.c | 5 +++-- drivers/cdrom/cdrom.c | 2 +- drivers/scsi/scsi_tgt_lib.c | 2 +- fs/bio.c | 33 +++++++++++++++++++-------------- include/linux/bio.h | 9 +++++---- include/linux/blkdev.h | 5 +++-- 8 files changed, 47 insertions(+), 34 deletions(-) diff --git a/block/blk-map.c b/block/blk-map.c index ea1bf53..ac21b73 100644 --- a/block/blk-map.c +++ b/block/blk-map.c @@ -41,7 +41,8 @@ static int __blk_rq_unmap_user(struct bio *bio) } static int __blk_rq_map_user(struct request_queue *q, struct request *rq, - void __user *ubuf, unsigned int len) + void __user *ubuf, unsigned int len, + gfp_t gfp_mask) { unsigned long uaddr; unsigned int alignment; @@ -57,9 +58,9 @@ static int __blk_rq_map_user(struct request_queue *q, struct request *rq, uaddr = (unsigned long) ubuf; alignment = queue_dma_alignment(q) | q->dma_pad_mask; if (!(uaddr & alignment) && !(len & alignment)) - bio = bio_map_user(q, NULL, uaddr, len, reading); + bio = bio_map_user(q, NULL, uaddr, len, reading, gfp_mask); else - bio = bio_copy_user(q, uaddr, len, reading); + bio = bio_copy_user(q, uaddr, len, reading, gfp_mask); if (IS_ERR(bio)) return PTR_ERR(bio); @@ -90,6 +91,7 @@ static int __blk_rq_map_user(struct request_queue *q, struct request *rq, * @rq: request structure to fill * @ubuf: the user buffer * @len: length of user data + * @gfp_mask: memory allocation flags * * Description: * Data will be mapped directly for zero copy I/O, if possible. Otherwise @@ -105,7 +107,7 @@ static int __blk_rq_map_user(struct request_queue *q, struct request *rq, * unmapping. */ int blk_rq_map_user(struct request_queue *q, struct request *rq, - void __user *ubuf, unsigned long len) + void __user *ubuf, unsigned long len, gfp_t gfp_mask) { unsigned long bytes_read = 0; struct bio *bio = NULL; @@ -132,7 +134,7 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq, if (end - start > BIO_MAX_PAGES) map_len -= PAGE_SIZE; - ret = __blk_rq_map_user(q, rq, ubuf, map_len); + ret = __blk_rq_map_user(q, rq, ubuf, map_len, gfp_mask); if (ret < 0) goto unmap_rq; if (!bio) @@ -160,6 +162,7 @@ EXPORT_SYMBOL(blk_rq_map_user); * @iov: pointer to the iovec * @iov_count: number of elements in the iovec * @len: I/O byte count + * @gfp_mask: memory allocation flags * * Description: * Data will be mapped directly for zero copy I/O, if possible. 
Otherwise @@ -175,7 +178,8 @@ EXPORT_SYMBOL(blk_rq_map_user); * unmapping. */ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq, - struct sg_iovec *iov, int iov_count, unsigned int len) + struct sg_iovec *iov, int iov_count, unsigned int len, + gfp_t gfp_mask) { struct bio *bio; int i, read = rq_data_dir(rq) == READ; @@ -194,9 +198,9 @@ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq, } if (unaligned || (q->dma_pad_mask & len)) - bio = bio_copy_user_iov(q, iov, iov_count, read); + bio = bio_copy_user_iov(q, iov, iov_count, read, gfp_mask); else - bio = bio_map_user_iov(q, NULL, iov, iov_count, read); + bio = bio_map_user_iov(q, NULL, iov, iov_count, read, gfp_mask); if (IS_ERR(bio)) return PTR_ERR(bio); diff --git a/block/bsg.c b/block/bsg.c index 0aae8d7..e7a142e 100644 --- a/block/bsg.c +++ b/block/bsg.c @@ -283,7 +283,8 @@ bsg_map_hdr(struct bsg_device *bd, struct sg_io_v4 *hdr, int has_write_perm) next_rq->cmd_type = rq->cmd_type; dxferp = (void*)(unsigned long)hdr->din_xferp; - ret = blk_rq_map_user(q, next_rq, dxferp, hdr->din_xfer_len); + ret = blk_rq_map_user(q, next_rq, dxferp, hdr->din_xfer_len, + GFP_KERNEL); if (ret) goto out; } @@ -298,7 +299,7 @@ bsg_map_hdr(struct bsg_device *bd, struct sg_io_v4 *hdr, int has_write_perm) dxfer_len = 0; if (dxfer_len) { - ret = blk_rq_map_user(q, rq, dxferp, dxfer_len); + ret = blk_rq_map_user(q, rq, dxferp, dxfer_len, GFP_KERNEL); if (ret) goto out; } diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c index 3aab80a..f49d6a1 100644 --- a/block/scsi_ioctl.c +++ b/block/scsi_ioctl.c @@ -315,10 +315,11 @@ static int sg_io(struct file *file, struct request_queue *q, } ret = blk_rq_map_user_iov(q, rq, iov, hdr->iovec_count, - hdr->dxfer_len); + hdr->dxfer_len, GFP_KERNEL); kfree(iov); } else if (hdr->dxfer_len) - ret = blk_rq_map_user(q, rq, hdr->dxferp, hdr->dxfer_len); + ret = blk_rq_map_user(q, rq, hdr->dxferp, hdr->dxfer_len, + GFP_KERNEL); if (ret) goto out; diff --git a/drivers/cdrom/cdrom.c b/drivers/cdrom/cdrom.c index 74031de..e861d24 100644 --- a/drivers/cdrom/cdrom.c +++ b/drivers/cdrom/cdrom.c @@ -2097,7 +2097,7 @@ static int cdrom_read_cdda_bpc(struct cdrom_device_info *cdi, __u8 __user *ubuf, len = nr * CD_FRAMESIZE_RAW; - ret = blk_rq_map_user(q, rq, ubuf, len); + ret = blk_rq_map_user(q, rq, ubuf, len, GFP_KERNEL); if (ret) break; diff --git a/drivers/scsi/scsi_tgt_lib.c b/drivers/scsi/scsi_tgt_lib.c index 257e097..2a4fd82 100644 --- a/drivers/scsi/scsi_tgt_lib.c +++ b/drivers/scsi/scsi_tgt_lib.c @@ -362,7 +362,7 @@ static int scsi_map_user_pages(struct scsi_tgt_cmd *tcmd, struct scsi_cmnd *cmd, int err; dprintk("%lx %u\n", uaddr, len); - err = blk_rq_map_user(q, rq, (void *)uaddr, len); + err = blk_rq_map_user(q, rq, (void *)uaddr, len, GFP_KERNEL); if (err) { /* * TODO: need to fixup sg_tablesize, max_segment_size, diff --git a/fs/bio.c b/fs/bio.c index 6a637b5..3d2e9ad 100644 --- a/fs/bio.c +++ b/fs/bio.c @@ -558,13 +558,14 @@ int bio_uncopy_user(struct bio *bio) * @iov: the iovec. * @iov_count: number of elements in the iovec * @write_to_vm: bool indicating writing to pages or not + * @gfp_mask: memory allocation flags * * Prepares and returns a bio for indirect user io, bouncing data * to/from kernel pages as necessary. Must be paired with * call bio_uncopy_user() on io completion. 
*/ struct bio *bio_copy_user_iov(struct request_queue *q, struct sg_iovec *iov, - int iov_count, int write_to_vm) + int iov_count, int write_to_vm, gfp_t gfp_mask) { struct bio_map_data *bmd; struct bio_vec *bvec; @@ -587,12 +588,12 @@ struct bio *bio_copy_user_iov(struct request_queue *q, struct sg_iovec *iov, len += iov[i].iov_len; } - bmd = bio_alloc_map_data(nr_pages, iov_count, GFP_KERNEL); + bmd = bio_alloc_map_data(nr_pages, iov_count, gfp_mask); if (!bmd) return ERR_PTR(-ENOMEM); ret = -ENOMEM; - bio = bio_alloc(GFP_KERNEL, nr_pages); + bio = bio_alloc(gfp_mask, nr_pages); if (!bio) goto out_bmd; @@ -605,7 +606,7 @@ struct bio *bio_copy_user_iov(struct request_queue *q, struct sg_iovec *iov, if (bytes > len) bytes = len; - page = alloc_page(q->bounce_gfp | GFP_KERNEL); + page = alloc_page(q->bounce_gfp | gfp_mask); if (!page) { ret = -ENOMEM; break; @@ -647,26 +648,27 @@ out_bmd: * @uaddr: start of user address * @len: length in bytes * @write_to_vm: bool indicating writing to pages or not + * @gfp_mask: memory allocation flags * * Prepares and returns a bio for indirect user io, bouncing data * to/from kernel pages as necessary. Must be paired with * call bio_uncopy_user() on io completion. */ struct bio *bio_copy_user(struct request_queue *q, unsigned long uaddr, - unsigned int len, int write_to_vm) + unsigned int len, int write_to_vm, gfp_t gfp_mask) { struct sg_iovec iov; iov.iov_base = (void __user *)uaddr; iov.iov_len = len; - return bio_copy_user_iov(q, &iov, 1, write_to_vm); + return bio_copy_user_iov(q, &iov, 1, write_to_vm, gfp_mask); } static struct bio *__bio_map_user_iov(struct request_queue *q, struct block_device *bdev, struct sg_iovec *iov, int iov_count, - int write_to_vm) + int write_to_vm, gfp_t gfp_mask) { int i, j; int nr_pages = 0; @@ -692,12 +694,12 @@ static struct bio *__bio_map_user_iov(struct request_queue *q, if (!nr_pages) return ERR_PTR(-EINVAL); - bio = bio_alloc(GFP_KERNEL, nr_pages); + bio = bio_alloc(gfp_mask, nr_pages); if (!bio) return ERR_PTR(-ENOMEM); ret = -ENOMEM; - pages = kcalloc(nr_pages, sizeof(struct page *), GFP_KERNEL); + pages = kcalloc(nr_pages, sizeof(struct page *), gfp_mask); if (!pages) goto out; @@ -776,19 +778,21 @@ static struct bio *__bio_map_user_iov(struct request_queue *q, * @uaddr: start of user address * @len: length in bytes * @write_to_vm: bool indicating writing to pages or not + * @gfp_mask: memory allocation flags * * Map the user space address into a bio suitable for io to a block * device. Returns an error pointer in case of error. */ struct bio *bio_map_user(struct request_queue *q, struct block_device *bdev, - unsigned long uaddr, unsigned int len, int write_to_vm) + unsigned long uaddr, unsigned int len, int write_to_vm, + gfp_t gfp_mask) { struct sg_iovec iov; iov.iov_base = (void __user *)uaddr; iov.iov_len = len; - return bio_map_user_iov(q, bdev, &iov, 1, write_to_vm); + return bio_map_user_iov(q, bdev, &iov, 1, write_to_vm, gfp_mask); } /** @@ -798,18 +802,19 @@ struct bio *bio_map_user(struct request_queue *q, struct block_device *bdev, * @iov: the iovec. * @iov_count: number of elements in the iovec * @write_to_vm: bool indicating writing to pages or not + * @gfp_mask: memory allocation flags * * Map the user space address into a bio suitable for io to a block * device. Returns an error pointer in case of error. 
*/ struct bio *bio_map_user_iov(struct request_queue *q, struct block_device *bdev, struct sg_iovec *iov, int iov_count, - int write_to_vm) + int write_to_vm, gfp_t gfp_mask) { struct bio *bio; - bio = __bio_map_user_iov(q, bdev, iov, iov_count, write_to_vm); - + bio = __bio_map_user_iov(q, bdev, iov, iov_count, write_to_vm, + gfp_mask); if (IS_ERR(bio)) return bio; diff --git a/include/linux/bio.h b/include/linux/bio.h index 13aba20..200b185 100644 --- a/include/linux/bio.h +++ b/include/linux/bio.h @@ -325,11 +325,11 @@ extern int bio_add_pc_page(struct request_queue *, struct bio *, struct page *, unsigned int, unsigned int); extern int bio_get_nr_vecs(struct block_device *); extern struct bio *bio_map_user(struct request_queue *, struct block_device *, - unsigned long, unsigned int, int); + unsigned long, unsigned int, int, gfp_t); struct sg_iovec; extern struct bio *bio_map_user_iov(struct request_queue *, struct block_device *, - struct sg_iovec *, int, int); + struct sg_iovec *, int, int, gfp_t); extern void bio_unmap_user(struct bio *); extern struct bio *bio_map_kern(struct request_queue *, void *, unsigned int, gfp_t); @@ -337,9 +337,10 @@ extern struct bio *bio_copy_kern(struct request_queue *, void *, unsigned int, gfp_t, int); extern void bio_set_pages_dirty(struct bio *bio); extern void bio_check_pages_dirty(struct bio *bio); -extern struct bio *bio_copy_user(struct request_queue *, unsigned long, unsigned int, int); +extern struct bio *bio_copy_user(struct request_queue *, unsigned long, + unsigned int, int, gfp_t); extern struct bio *bio_copy_user_iov(struct request_queue *, struct sg_iovec *, - int, int); + int, int, gfp_t); extern int bio_uncopy_user(struct bio *); void zero_fill_bio(struct bio *bio); extern struct bio_vec *bvec_alloc_bs(gfp_t, int, unsigned long *, struct bio_set *); diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index ed9324f..9512c5b 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -710,11 +710,12 @@ extern void __blk_stop_queue(struct request_queue *q); extern void __blk_run_queue(struct request_queue *); extern void blk_run_queue(struct request_queue *); extern void blk_start_queueing(struct request_queue *); -extern int blk_rq_map_user(struct request_queue *, struct request *, void __user *, unsigned long); +extern int blk_rq_map_user(struct request_queue *, struct request *, + void __user *, unsigned long, gfp_t); extern int blk_rq_unmap_user(struct bio *); extern int blk_rq_map_kern(struct request_queue *, struct request *, void *, unsigned int, gfp_t); extern int blk_rq_map_user_iov(struct request_queue *, struct request *, - struct sg_iovec *, int, unsigned int); + struct sg_iovec *, int, unsigned int, gfp_t); extern int blk_execute_rq(struct request_queue *, struct gendisk *, struct request *, int); extern void blk_execute_rq_nowait(struct request_queue *, struct gendisk *, -- 1.5.5.GIT ^ permalink raw reply related [flat|nested] 23+ messages in thread
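To make the new calling convention concrete, here is a hedged sketch of a caller after this patch (map_one_buffer and its arguments are illustrative; note that patch 2/5 below extends the signature once more with a struct rq_map_data * parameter):

#include <linux/blkdev.h>

/*
 * Sketch: map a user buffer into a REQ_TYPE_BLOCK_PC request with an
 * explicit gfp_mask, as this patch now requires of every caller.
 */
static int map_one_buffer(struct request_queue *q, struct request *rq,
			  void __user *ubuf, unsigned long len)
{
	/*
	 * Sleepable process context keeps GFP_KERNEL, matching the
	 * bsg/scsi_ioctl/cdrom conversions above; sg passes
	 * GFP_ATOMIC instead to preserve its old behaviour.
	 */
	return blk_rq_map_user(q, rq, ubuf, len, GFP_KERNEL);
}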
* [PATCH 2/5] block: introduce struct rq_map_data to use reserved pages 2008-08-28 7:17 ` [PATCH 1/5] block: add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov FUJITA Tomonori @ 2008-08-28 7:17 ` FUJITA Tomonori 2008-08-28 7:17 ` [PATCH 3/5] sg: convert the non-data path to use the block layer FUJITA Tomonori 2008-08-28 7:34 ` [PATCH 1/5] block: add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov Jens Axboe 1 sibling, 1 reply; 23+ messages in thread From: FUJITA Tomonori @ 2008-08-28 7:17 UTC (permalink / raw) To: jens.axboe; +Cc: linux-scsi, dougg, michaelc, James.Bottomley, FUJITA Tomonori This patch introduces struct rq_map_data to enable bio_copy_user_iov() to use reserved pages. Currently, bio_copy_user_iov allocates bounce pages but drivers/scsi/sg.c wants to allocate pages by itself and use them. struct rq_map_data can be used to pass allocated pages to bio_copy_user_iov. The current users of bio_copy_user_iov simply pass NULL (they don't want to use pre-allocated pages). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Douglas Gilbert <dougg@torque.net> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> --- block/blk-map.c | 26 ++++++++++++------- block/bsg.c | 7 +++-- block/scsi_ioctl.c | 4 +- drivers/cdrom/cdrom.c | 2 +- drivers/scsi/scsi_tgt_lib.c | 2 +- fs/bio.c | 58 ++++++++++++++++++++++++++++++------------ include/linux/bio.h | 8 +++-- include/linux/blkdev.h | 12 +++++++- 8 files changed, 80 insertions(+), 39 deletions(-) diff --git a/block/blk-map.c b/block/blk-map.c index ac21b73..dad6a29 100644 --- a/block/blk-map.c +++ b/block/blk-map.c @@ -41,8 +41,8 @@ static int __blk_rq_unmap_user(struct bio *bio) } static int __blk_rq_map_user(struct request_queue *q, struct request *rq, - void __user *ubuf, unsigned int len, - gfp_t gfp_mask) + struct rq_map_data *map_data, void __user *ubuf, + unsigned int len, gfp_t gfp_mask) { unsigned long uaddr; unsigned int alignment; @@ -57,10 +57,10 @@ static int __blk_rq_map_user(struct request_queue *q, struct request *rq, */ uaddr = (unsigned long) ubuf; alignment = queue_dma_alignment(q) | q->dma_pad_mask; - if (!(uaddr & alignment) && !(len & alignment)) + if (!(uaddr & alignment) && !(len & alignment) && !map_data) bio = bio_map_user(q, NULL, uaddr, len, reading, gfp_mask); else - bio = bio_copy_user(q, uaddr, len, reading, gfp_mask); + bio = bio_copy_user(q, map_data, uaddr, len, reading, gfp_mask); if (IS_ERR(bio)) return PTR_ERR(bio); @@ -89,6 +89,7 @@ static int __blk_rq_map_user(struct request_queue *q, struct request *rq, * blk_rq_map_user - map user data to a request, for REQ_TYPE_BLOCK_PC usage * @q: request queue where request should be inserted * @rq: request structure to fill + * @map_data: pointer to the rq_map_data holding pages (if necessary) * @ubuf: the user buffer * @len: length of user data * @gfp_mask: memory allocation flags @@ -107,7 +108,8 @@ static int __blk_rq_map_user(struct request_queue *q, struct request *rq, * unmapping.
*/ int blk_rq_map_user(struct request_queue *q, struct request *rq, - void __user *ubuf, unsigned long len, gfp_t gfp_mask) + struct rq_map_data *map_data, void __user *ubuf, + unsigned long len, gfp_t gfp_mask) { unsigned long bytes_read = 0; struct bio *bio = NULL; @@ -134,7 +136,8 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq, if (end - start > BIO_MAX_PAGES) map_len -= PAGE_SIZE; - ret = __blk_rq_map_user(q, rq, ubuf, map_len, gfp_mask); + ret = __blk_rq_map_user(q, rq, map_data, ubuf, map_len, + gfp_mask); if (ret < 0) goto unmap_rq; if (!bio) @@ -159,6 +162,7 @@ EXPORT_SYMBOL(blk_rq_map_user); * blk_rq_map_user_iov - map user data to a request, for REQ_TYPE_BLOCK_PC usage * @q: request queue where request should be inserted * @rq: request to map data to + * @map_data: pointer to the rq_map_data holding pages (if necessary) * @iov: pointer to the iovec * @iov_count: number of elements in the iovec * @len: I/O byte count @@ -178,8 +182,8 @@ EXPORT_SYMBOL(blk_rq_map_user); * unmapping. */ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq, - struct sg_iovec *iov, int iov_count, unsigned int len, - gfp_t gfp_mask) + struct rq_map_data *map_data, struct sg_iovec *iov, + int iov_count, unsigned int len, gfp_t gfp_mask) { struct bio *bio; int i, read = rq_data_dir(rq) == READ; @@ -197,8 +201,9 @@ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq, } } - if (unaligned || (q->dma_pad_mask & len)) - bio = bio_copy_user_iov(q, iov, iov_count, read, gfp_mask); + if (unaligned || (q->dma_pad_mask & len) || map_data) + bio = bio_copy_user_iov(q, map_data, iov, iov_count, read, + gfp_mask); else bio = bio_map_user_iov(q, NULL, iov, iov_count, read, gfp_mask); @@ -220,6 +225,7 @@ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq, rq->buffer = rq->data = NULL; return 0; } +EXPORT_SYMBOL(blk_rq_map_user_iov); /** * blk_rq_unmap_user - unmap a request with user data diff --git a/block/bsg.c b/block/bsg.c index e7a142e..56cb343 100644 --- a/block/bsg.c +++ b/block/bsg.c @@ -283,8 +283,8 @@ bsg_map_hdr(struct bsg_device *bd, struct sg_io_v4 *hdr, int has_write_perm) next_rq->cmd_type = rq->cmd_type; dxferp = (void*)(unsigned long)hdr->din_xferp; - ret = blk_rq_map_user(q, next_rq, dxferp, hdr->din_xfer_len, - GFP_KERNEL); + ret = blk_rq_map_user(q, next_rq, NULL, dxferp, + hdr->din_xfer_len, GFP_KERNEL); if (ret) goto out; } @@ -299,7 +299,8 @@ bsg_map_hdr(struct bsg_device *bd, struct sg_io_v4 *hdr, int has_write_perm) dxfer_len = 0; if (dxfer_len) { - ret = blk_rq_map_user(q, rq, dxferp, dxfer_len, GFP_KERNEL); + ret = blk_rq_map_user(q, rq, NULL, dxferp, dxfer_len, + GFP_KERNEL); if (ret) goto out; } diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c index f49d6a1..c34272a 100644 --- a/block/scsi_ioctl.c +++ b/block/scsi_ioctl.c @@ -314,11 +314,11 @@ static int sg_io(struct file *file, struct request_queue *q, goto out; } - ret = blk_rq_map_user_iov(q, rq, iov, hdr->iovec_count, + ret = blk_rq_map_user_iov(q, rq, NULL, iov, hdr->iovec_count, hdr->dxfer_len, GFP_KERNEL); kfree(iov); } else if (hdr->dxfer_len) - ret = blk_rq_map_user(q, rq, hdr->dxferp, hdr->dxfer_len, + ret = blk_rq_map_user(q, rq, NULL, hdr->dxferp, hdr->dxfer_len, GFP_KERNEL); if (ret) diff --git a/drivers/cdrom/cdrom.c b/drivers/cdrom/cdrom.c index e861d24..d47f2f8 100644 --- a/drivers/cdrom/cdrom.c +++ b/drivers/cdrom/cdrom.c @@ -2097,7 +2097,7 @@ static int cdrom_read_cdda_bpc(struct cdrom_device_info *cdi, __u8 __user *ubuf, len = nr * CD_FRAMESIZE_RAW; - 
ret = blk_rq_map_user(q, rq, ubuf, len, GFP_KERNEL); + ret = blk_rq_map_user(q, rq, NULL, ubuf, len, GFP_KERNEL); if (ret) break; diff --git a/drivers/scsi/scsi_tgt_lib.c b/drivers/scsi/scsi_tgt_lib.c index 2a4fd82..3117bb1 100644 --- a/drivers/scsi/scsi_tgt_lib.c +++ b/drivers/scsi/scsi_tgt_lib.c @@ -362,7 +362,7 @@ static int scsi_map_user_pages(struct scsi_tgt_cmd *tcmd, struct scsi_cmnd *cmd, int err; dprintk("%lx %u\n", uaddr, len); - err = blk_rq_map_user(q, rq, (void *)uaddr, len, GFP_KERNEL); + err = blk_rq_map_user(q, rq, NULL, (void *)uaddr, len, GFP_KERNEL); if (err) { /* * TODO: need to fixup sg_tablesize, max_segment_size, diff --git a/fs/bio.c b/fs/bio.c index 3d2e9ad..a2f0726 100644 --- a/fs/bio.c +++ b/fs/bio.c @@ -439,16 +439,19 @@ int bio_add_page(struct bio *bio, struct page *page, unsigned int len, struct bio_map_data { struct bio_vec *iovecs; - int nr_sgvecs; struct sg_iovec *sgvecs; + int nr_sgvecs; + int is_our_pages; }; static void bio_set_map_data(struct bio_map_data *bmd, struct bio *bio, - struct sg_iovec *iov, int iov_count) + struct sg_iovec *iov, int iov_count, + int is_our_pages) { memcpy(bmd->iovecs, bio->bi_io_vec, sizeof(struct bio_vec) * bio->bi_vcnt); memcpy(bmd->sgvecs, iov, sizeof(struct sg_iovec) * iov_count); bmd->nr_sgvecs = iov_count; + bmd->is_our_pages = is_our_pages; bio->bi_private = bmd; } @@ -483,7 +486,8 @@ static struct bio_map_data *bio_alloc_map_data(int nr_segs, int iov_count, } static int __bio_copy_iov(struct bio *bio, struct bio_vec *iovecs, - struct sg_iovec *iov, int iov_count, int uncopy) + struct sg_iovec *iov, int iov_count, int uncopy, + int do_free_page) { int ret = 0, i; struct bio_vec *bvec; @@ -526,7 +530,7 @@ static int __bio_copy_iov(struct bio *bio, struct bio_vec *iovecs, } } - if (uncopy) + if (do_free_page) __free_page(bvec->bv_page); } @@ -545,7 +549,8 @@ int bio_uncopy_user(struct bio *bio) struct bio_map_data *bmd = bio->bi_private; int ret; - ret = __bio_copy_iov(bio, bmd->iovecs, bmd->sgvecs, bmd->nr_sgvecs, 1); + ret = __bio_copy_iov(bio, bmd->iovecs, bmd->sgvecs, bmd->nr_sgvecs, 1, + bmd->is_our_pages); bio_free_map_data(bmd); bio_put(bio); @@ -555,6 +560,7 @@ int bio_uncopy_user(struct bio *bio) /** * bio_copy_user_iov - copy user data to bio * @q: destination block queue + * @map_data: pointer to the rq_map_data holding pages (if necessary) * @iov: the iovec. * @iov_count: number of elements in the iovec * @write_to_vm: bool indicating writing to pages or not @@ -564,8 +570,10 @@ int bio_uncopy_user(struct bio *bio) * to/from kernel pages as necessary. Must be paired with * call bio_uncopy_user() on io completion. 
*/ -struct bio *bio_copy_user_iov(struct request_queue *q, struct sg_iovec *iov, - int iov_count, int write_to_vm, gfp_t gfp_mask) +struct bio *bio_copy_user_iov(struct request_queue *q, + struct rq_map_data *map_data, + struct sg_iovec *iov, int iov_count, + int write_to_vm, gfp_t gfp_mask) { struct bio_map_data *bmd; struct bio_vec *bvec; @@ -600,13 +608,26 @@ struct bio *bio_copy_user_iov(struct request_queue *q, struct sg_iovec *iov, bio->bi_rw |= (!write_to_vm << BIO_RW); ret = 0; + i = 0; while (len) { - unsigned int bytes = PAGE_SIZE; + unsigned int bytes; + + if (map_data) + bytes = 1U << (PAGE_SHIFT + map_data->page_order); + else + bytes = PAGE_SIZE; if (bytes > len) bytes = len; - page = alloc_page(q->bounce_gfp | gfp_mask); + if (map_data) { + if (i == map_data->nr_entries) { + ret = -ENOMEM; + break; + } + page = map_data->pages[i++]; + } else + page = alloc_page(q->bounce_gfp | gfp_mask); if (!page) { ret = -ENOMEM; break; @@ -625,16 +646,17 @@ struct bio *bio_copy_user_iov(struct request_queue *q, struct sg_iovec *iov, * success */ if (!write_to_vm) { - ret = __bio_copy_iov(bio, bio->bi_io_vec, iov, iov_count, 0); + ret = __bio_copy_iov(bio, bio->bi_io_vec, iov, iov_count, 0, 0); if (ret) goto cleanup; } - bio_set_map_data(bmd, bio, iov, iov_count); + bio_set_map_data(bmd, bio, iov, iov_count, map_data ? 0 : 1); return bio; cleanup: - bio_for_each_segment(bvec, bio, i) - __free_page(bvec->bv_page); + if (!map_data) + bio_for_each_segment(bvec, bio, i) + __free_page(bvec->bv_page); bio_put(bio); out_bmd: @@ -645,6 +667,7 @@ out_bmd: /** * bio_copy_user - copy user data to bio * @q: destination block queue + * @map_data: pointer to the rq_map_data holding pages (if necessary) * @uaddr: start of user address * @len: length in bytes * @write_to_vm: bool indicating writing to pages or not @@ -654,15 +677,16 @@ out_bmd: * to/from kernel pages as necessary. Must be paired with * call bio_uncopy_user() on io completion. 
*/ -struct bio *bio_copy_user(struct request_queue *q, unsigned long uaddr, - unsigned int len, int write_to_vm, gfp_t gfp_mask) +struct bio *bio_copy_user(struct request_queue *q, struct rq_map_data *map_data, + unsigned long uaddr, unsigned int len, + int write_to_vm, gfp_t gfp_mask) { struct sg_iovec iov; iov.iov_base = (void __user *)uaddr; iov.iov_len = len; - return bio_copy_user_iov(q, &iov, 1, write_to_vm, gfp_mask); + return bio_copy_user_iov(q, map_data, &iov, 1, write_to_vm, gfp_mask); } static struct bio *__bio_map_user_iov(struct request_queue *q, @@ -1028,7 +1052,7 @@ struct bio *bio_copy_kern(struct request_queue *q, void *data, unsigned int len, bio->bi_private = bmd; bio->bi_end_io = bio_copy_kern_endio; - bio_set_map_data(bmd, bio, &iov, 1); + bio_set_map_data(bmd, bio, &iov, 1, 1); return bio; cleanup: bio_for_each_segment(bvec, bio, i) diff --git a/include/linux/bio.h b/include/linux/bio.h index 200b185..bc386cd 100644 --- a/include/linux/bio.h +++ b/include/linux/bio.h @@ -327,6 +327,7 @@ extern int bio_get_nr_vecs(struct block_device *); extern struct bio *bio_map_user(struct request_queue *, struct block_device *, unsigned long, unsigned int, int, gfp_t); struct sg_iovec; +struct rq_map_data; extern struct bio *bio_map_user_iov(struct request_queue *, struct block_device *, struct sg_iovec *, int, int, gfp_t); @@ -337,9 +338,10 @@ extern struct bio *bio_copy_kern(struct request_queue *, void *, unsigned int, gfp_t, int); extern void bio_set_pages_dirty(struct bio *bio); extern void bio_check_pages_dirty(struct bio *bio); -extern struct bio *bio_copy_user(struct request_queue *, unsigned long, - unsigned int, int, gfp_t); -extern struct bio *bio_copy_user_iov(struct request_queue *, struct sg_iovec *, +extern struct bio *bio_copy_user(struct request_queue *, struct rq_map_data *, + unsigned long, unsigned int, int, gfp_t); +extern struct bio *bio_copy_user_iov(struct request_queue *, + struct rq_map_data *, struct sg_iovec *, int, int, gfp_t); extern int bio_uncopy_user(struct bio *); void zero_fill_bio(struct bio *bio); diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 9512c5b..d2faa72 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -642,6 +642,12 @@ static inline void blk_queue_bounce(struct request_queue *q, struct bio **bio) } #endif /* CONFIG_MMU */ +struct rq_map_data { + struct page **pages; + int page_order; + int nr_entries; +}; + struct req_iterator { int i; struct bio *bio; @@ -711,11 +717,13 @@ extern void __blk_run_queue(struct request_queue *); extern void blk_run_queue(struct request_queue *); extern void blk_start_queueing(struct request_queue *); extern int blk_rq_map_user(struct request_queue *, struct request *, - void __user *, unsigned long, gfp_t); + struct rq_map_data *, void __user *, unsigned long, + gfp_t); extern int blk_rq_unmap_user(struct bio *); extern int blk_rq_map_kern(struct request_queue *, struct request *, void *, unsigned int, gfp_t); extern int blk_rq_map_user_iov(struct request_queue *, struct request *, - struct sg_iovec *, int, unsigned int, gfp_t); + struct rq_map_data *, struct sg_iovec *, int, + unsigned int, gfp_t); extern int blk_execute_rq(struct request_queue *, struct gendisk *, struct request *, int); extern void blk_execute_rq_nowait(struct request_queue *, struct gendisk *, -- 1.5.5.GIT ^ permalink raw reply related [flat|nested] 23+ messages in thread
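A hedged sketch of how a driver can now hand pre-allocated pages through the new structure, roughly what the sg patches later in this series do with the reserved buffer (all names here are illustrative):

#include <linux/blkdev.h>

/*
 * Sketch: each pages[] entry holds a 2^order page allocation, as
 * described by page_order; a non-NULL map_data steers
 * blk_rq_map_user() into bio_copy_user_iov(), which then consumes
 * these pages instead of calling alloc_page().
 */
static int map_with_reserved_pages(struct request_queue *q,
				   struct request *rq,
				   struct page **pages, int nr, int order,
				   void __user *ubuf, unsigned long len)
{
	struct rq_map_data map_data = {
		.pages      = pages,
		.page_order = order,
		.nr_entries = nr,
	};

	return blk_rq_map_user(q, rq, &map_data, ubuf, len, GFP_ATOMIC);
}

Because bio_set_map_data() records is_our_pages = 0 in this case, bio_uncopy_user() copies the data back without freeing the caller's pages, which is exactly what a long-lived reserved buffer needs.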
* [PATCH 3/5] sg: convert the non-data path to use the block layer 2008-08-28 7:17 ` [PATCH 2/5] block: introduce struct rq_map_data to use reserved pages FUJITA Tomonori @ 2008-08-28 7:17 ` FUJITA Tomonori 2008-08-28 7:17 ` [PATCH 4/5] sg: convert the direct IO " FUJITA Tomonori 0 siblings, 1 reply; 23+ messages in thread From: FUJITA Tomonori @ 2008-08-28 7:17 UTC (permalink / raw) To: jens.axboe; +Cc: linux-scsi, dougg, michaelc, James.Bottomley, FUJITA Tomonori This patch converts the non-data path to use the block layer functions (blk_get_request, blk_execute_rq_nowait, etc) instead of using scsi_execute_async(). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Douglas Gilbert <dougg@torque.net> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> --- drivers/scsi/sg.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++----- 1 files changed, 48 insertions(+), 5 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 661f9f2..487c777 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -137,6 +137,7 @@ typedef struct sg_request { /* SG_MAX_QUEUE requests outstanding per file */ char orphan; /* 1 -> drop on sight, 0 -> normal */ char sg_io_owned; /* 1 -> packet belongs to SG_IO */ volatile char done; /* 0->before bh, 1->before read, 2->read */ + struct request *rq; } Sg_request; typedef struct sg_fd { /* holds the state of a file descriptor */ @@ -176,7 +177,7 @@ typedef struct sg_device { /* holds the state of each scsi generic device */ static int sg_fasync(int fd, struct file *filp, int mode); /* tasklet or soft irq callback */ static void sg_cmd_done(void *data, char *sense, int result, int resid); -static int sg_start_req(Sg_request * srp); +static int sg_start_req(Sg_request *srp, unsigned char *cmd); static void sg_finish_rem_req(Sg_request * srp); static int sg_build_indirect(Sg_scatter_hold * schp, Sg_fd * sfp, int buff_size); static int sg_build_sgat(Sg_scatter_hold * schp, const Sg_fd * sfp, @@ -229,6 +230,11 @@ static int sg_allow_access(struct file *filp, unsigned char *cmd) cmd, filp->f_mode & FMODE_WRITE); } +static void sg_rq_end_io(struct request *rq, int uptodate) +{ + sg_cmd_done(rq->end_io_data, rq->sense, rq->errors, rq->data_len); +} + static int sg_open(struct inode *inode, struct file *filp) { @@ -732,7 +738,8 @@ sg_common_write(Sg_fd * sfp, Sg_request * srp, SCSI_LOG_TIMEOUT(4, printk("sg_common_write: scsi opcode=0x%02x, cmd_size=%d\n", (int) cmnd[0], (int) hp->cmd_len)); - if ((k = sg_start_req(srp))) { + k = sg_start_req(srp, cmnd); + if (k) { SCSI_LOG_TIMEOUT(1, printk("sg_common_write: start_req err=%d\n", k)); sg_finish_rem_req(srp); return k; /* probably out of space --> ENOMEM */ @@ -765,6 +772,12 @@ sg_common_write(Sg_fd * sfp, Sg_request * srp, hp->duration = jiffies_to_msecs(jiffies); /* Now send everything of to mid-level. The next time we hear about this packet is when sg_cmd_done() is called (i.e. a callback).
*/ + if (srp->rq) { + srp->rq->timeout = timeout; + blk_execute_rq_nowait(sdp->device->request_queue, sdp->disk, + srp->rq, 1, sg_rq_end_io); + return 0; + } if (scsi_execute_async(sdp->device, cmnd, hp->cmd_len, data_dir, srp->data.buffer, hp->dxfer_len, srp->data.k_use_sg, timeout, SG_DEFAULT_RETRIES, srp, sg_cmd_done, @@ -1634,8 +1647,32 @@ exit_sg(void) idr_destroy(&sg_index_idr); } -static int -sg_start_req(Sg_request * srp) +static int __sg_start_req(struct sg_request *srp, struct sg_io_hdr *hp, + unsigned char *cmd) +{ + struct sg_fd *sfp = srp->parentfp; + struct request_queue *q = sfp->parentdp->device->request_queue; + struct request *rq; + int rw = hp->dxfer_direction == SG_DXFER_TO_DEV ? WRITE : READ; + + rq = blk_get_request(q, rw, GFP_ATOMIC); + if (!rq) + return -ENOMEM; + + memcpy(rq->cmd, cmd, hp->cmd_len); + + rq->cmd_len = hp->cmd_len; + rq->cmd_type = REQ_TYPE_BLOCK_PC; + + srp->rq = rq; + rq->end_io_data = srp; + rq->sense = srp->sense_b; + rq->retries = SG_DEFAULT_RETRIES; + + return 0; +} + +static int sg_start_req(Sg_request *srp, unsigned char *cmd) { int res; Sg_fd *sfp = srp->parentfp; @@ -1646,8 +1683,10 @@ sg_start_req(Sg_request * srp) Sg_scatter_hold *rsv_schp = &sfp->reserve; SCSI_LOG_TIMEOUT(4, printk("sg_start_req: dxfer_len=%d\n", dxfer_len)); + if ((dxfer_len <= 0) || (dxfer_dir == SG_DXFER_NONE)) - return 0; + return __sg_start_req(srp, hp, cmd); + if (sg_allow_dio && (hp->flags & SG_FLAG_DIRECT_IO) && (dxfer_dir != SG_DXFER_UNKNOWN) && (0 == hp->iovec_count) && (!sfp->parentdp->device->host->unchecked_isa_dma)) { @@ -1678,6 +1717,10 @@ sg_finish_rem_req(Sg_request * srp) sg_unlink_reserve(sfp, srp); else sg_remove_scat(req_schp); + + if (srp->rq) + blk_put_request(srp->rq); + sg_remove_request(sfp, srp); } -- 1.5.5.GIT ^ permalink raw reply related [flat|nested] 23+ messages in thread
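The heart of this patch is the standard block-layer submission pattern; below is a condensed, illustrative restatement of __sg_start_req() plus the blk_execute_rq_nowait() call from sg_common_write() (submit_pc_cmd and its parameters are mine, and sg additionally points rq->sense at its own sense buffer):

#include <linux/blkdev.h>
#include <linux/string.h>

/*
 * Sketch: allocate a packet-command request, copy in the CDB and
 * fire it asynchronously; done() runs at completion, with priv
 * available through rq->end_io_data.
 */
static int submit_pc_cmd(struct request_queue *q, struct gendisk *disk,
			 unsigned char *cmd, int cmd_len, int timeout,
			 void *priv, rq_end_io_fn *done)
{
	struct request *rq = blk_get_request(q, READ, GFP_ATOMIC);

	if (!rq)
		return -ENOMEM;

	memcpy(rq->cmd, cmd, cmd_len);
	rq->cmd_len = cmd_len;
	rq->cmd_type = REQ_TYPE_BLOCK_PC;
	rq->end_io_data = priv;		/* handed back in done() */
	rq->retries = 5;		/* sg uses SG_DEFAULT_RETRIES */
	rq->timeout = timeout;

	blk_execute_rq_nowait(q, disk, rq, 1, done);
	return 0;
}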
* [PATCH 4/5] sg: convert the direct IO path to use the block layer 2008-08-28 7:17 ` [PATCH 3/5] sg: convert the non-data path to use the block layer FUJITA Tomonori @ 2008-08-28 7:17 ` FUJITA Tomonori 2008-08-28 7:17 ` [PATCH 5/5] sg: convert the indirect " FUJITA Tomonori 0 siblings, 1 reply; 23+ messages in thread From: FUJITA Tomonori @ 2008-08-28 7:17 UTC (permalink / raw) To: jens.axboe; +Cc: linux-scsi, dougg, michaelc, James.Bottomley, FUJITA Tomonori This patch converts the direct IO path (SG_FLAG_DIRECT_IO) to use the block layer functions (blk_get_request, blk_execute_rq_nowait, blk_rq_map_user, etc) instead of scsi_execute_async(). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Douglas Gilbert <dougg@torque.net> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> --- drivers/scsi/sg.c | 173 ++++++++-------------------------------------------- 1 files changed, 27 insertions(+), 146 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 487c777..cb6de07 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -138,6 +138,7 @@ typedef struct sg_request { /* SG_MAX_QUEUE requests outstanding per file */ char sg_io_owned; /* 1 -> packet belongs to SG_IO */ volatile char done; /* 0->before bh, 1->before read, 2->read */ struct request *rq; + struct bio *bio; } Sg_request; typedef struct sg_fd { /* holds the state of a file descriptor */ @@ -1679,21 +1680,29 @@ static int sg_start_req(Sg_request *srp, unsigned char *cmd) sg_io_hdr_t *hp = &srp->header; int dxfer_len = (int) hp->dxfer_len; int dxfer_dir = hp->dxfer_direction; + unsigned long uaddr = (unsigned long)hp->dxferp; Sg_scatter_hold *req_schp = &srp->data; Sg_scatter_hold *rsv_schp = &sfp->reserve; + struct request_queue *q = sfp->parentdp->device->request_queue; + unsigned long alignment = queue_dma_alignment(q) | q->dma_pad_mask; SCSI_LOG_TIMEOUT(4, printk("sg_start_req: dxfer_len=%d\n", dxfer_len)); if ((dxfer_len <= 0) || (dxfer_dir == SG_DXFER_NONE)) return __sg_start_req(srp, hp, cmd); +#ifdef SG_ALLOW_DIO_CODE if (sg_allow_dio && (hp->flags & SG_FLAG_DIRECT_IO) && (dxfer_dir != SG_DXFER_UNKNOWN) && (0 == hp->iovec_count) && - (!sfp->parentdp->device->host->unchecked_isa_dma)) { - res = sg_build_direct(srp, sfp, dxfer_len); - if (res <= 0) /* -ve -> error, 0 -> done, 1 -> try indirect */ - return res; + (!sfp->parentdp->device->host->unchecked_isa_dma) && + !(uaddr & alignment) && !(dxfer_len & alignment)) { + res = __sg_start_req(srp, hp, cmd); + if (!res) + res = sg_build_direct(srp, sfp, dxfer_len); + + return res; } +#endif if ((!sg_res_in_use(sfp)) && (dxfer_len <= rsv_schp->bufflen)) sg_link_reserve(sfp, srp, dxfer_len); else { @@ -1718,8 +1727,11 @@ sg_finish_rem_req(Sg_request * srp) else sg_remove_scat(req_schp); - if (srp->rq) + if (srp->rq) { + if (srp->bio) + blk_rq_unmap_user(srp->bio); blk_put_request(srp->rq); + } sg_remove_request(sfp, srp); } @@ -1746,151 +1758,23 @@ sg_build_sgat(Sg_scatter_hold * schp, const Sg_fd * sfp, int tablesize) return tablesize; /* number of scat_gath elements allocated */ } -#ifdef SG_ALLOW_DIO_CODE -/* vvvvvvvv following code borrowed from st driver's direct IO vvvvvvvvv */ - /* TODO: hopefully we can use the generic block layer code */ - -/* Pin down user pages and put them into a scatter gather list. 
Returns <= 0 if - - mapping of all pages not successful - (i.e., either completely successful or fails) -*/ -static int -st_map_user_pages(struct scatterlist *sgl, const unsigned int max_pages, - unsigned long uaddr, size_t count, int rw) -{ - unsigned long end = (uaddr + count + PAGE_SIZE - 1) >> PAGE_SHIFT; - unsigned long start = uaddr >> PAGE_SHIFT; - const int nr_pages = end - start; - int res, i, j; - struct page **pages; - - /* User attempted Overflow! */ - if ((uaddr + count) < uaddr) - return -EINVAL; - - /* Too big */ - if (nr_pages > max_pages) - return -ENOMEM; - - /* Hmm? */ - if (count == 0) - return 0; - - if ((pages = kmalloc(max_pages * sizeof(*pages), GFP_ATOMIC)) == NULL) - return -ENOMEM; - - /* Try to fault in all of the necessary pages */ - down_read(¤t->mm->mmap_sem); - /* rw==READ means read from drive, write into memory area */ - res = get_user_pages( - current, - current->mm, - uaddr, - nr_pages, - rw == READ, - 0, /* don't force */ - pages, - NULL); - up_read(¤t->mm->mmap_sem); - - /* Errors and no page mapped should return here */ - if (res < nr_pages) - goto out_unmap; - - for (i=0; i < nr_pages; i++) { - /* FIXME: flush superflous for rw==READ, - * probably wrong function for rw==WRITE - */ - flush_dcache_page(pages[i]); - /* ?? Is locking needed? I don't think so */ - /* if (!trylock_page(pages[i])) - goto out_unlock; */ - } - - sg_set_page(sgl, pages[0], 0, uaddr & ~PAGE_MASK); - if (nr_pages > 1) { - sgl[0].length = PAGE_SIZE - sgl[0].offset; - count -= sgl[0].length; - for (i=1; i < nr_pages ; i++) - sg_set_page(&sgl[i], pages[i], count < PAGE_SIZE ? count : PAGE_SIZE, 0); - } - else { - sgl[0].length = count; - } - - kfree(pages); - return nr_pages; - - out_unmap: - if (res > 0) { - for (j=0; j < res; j++) - page_cache_release(pages[j]); - res = 0; - } - kfree(pages); - return res; -} - - -/* And unmap them... */ -static int -st_unmap_user_pages(struct scatterlist *sgl, const unsigned int nr_pages, - int dirtied) -{ - int i; - - for (i=0; i < nr_pages; i++) { - struct page *page = sg_page(&sgl[i]); - - if (dirtied) - SetPageDirty(page); - /* unlock_page(page); */ - /* FIXME: cache flush missing for rw==READ - * FIXME: call the correct reference counting function - */ - page_cache_release(page); - } - - return 0; -} - -/* ^^^^^^^^ above code borrowed from st driver's direct IO ^^^^^^^^^ */ -#endif - - /* Returns: -ve -> error, 0 -> done, 1 -> try indirect */ static int sg_build_direct(Sg_request * srp, Sg_fd * sfp, int dxfer_len) { -#ifdef SG_ALLOW_DIO_CODE sg_io_hdr_t *hp = &srp->header; Sg_scatter_hold *schp = &srp->data; - int sg_tablesize = sfp->parentdp->sg_tablesize; - int mx_sc_elems, res; - struct scsi_device *sdev = sfp->parentdp->device; - - if (((unsigned long)hp->dxferp & - queue_dma_alignment(sdev->request_queue)) != 0) - return 1; + int res; + struct request *rq = srp->rq; + struct request_queue *q = sfp->parentdp->device->request_queue; - mx_sc_elems = sg_build_sgat(schp, sfp, sg_tablesize); - if (mx_sc_elems <= 0) { - return 1; - } - res = st_map_user_pages(schp->buffer, mx_sc_elems, - (unsigned long)hp->dxferp, dxfer_len, - (SG_DXFER_TO_DEV == hp->dxfer_direction) ? 
1 : 0); - if (res <= 0) { - sg_remove_scat(schp); - return 1; - } - schp->k_use_sg = res; + res = blk_rq_map_user(q, rq, NULL, hp->dxferp, dxfer_len, GFP_ATOMIC); + if (res) + return res; + srp->bio = rq->bio; schp->dio_in_use = 1; hp->info |= SG_INFO_DIRECT_IO; return 0; -#else - return 1; -#endif } static int @@ -2069,11 +1953,7 @@ sg_remove_scat(Sg_scatter_hold * schp) if (schp->buffer && (schp->sglist_len > 0)) { struct scatterlist *sg = schp->buffer; - if (schp->dio_in_use) { -#ifdef SG_ALLOW_DIO_CODE - st_unmap_user_pages(sg, schp->k_use_sg, TRUE); -#endif - } else { + if (!schp->dio_in_use) { int k; for (k = 0; (k < schp->k_use_sg) && sg_page(sg); @@ -2083,8 +1963,9 @@ sg_remove_scat(Sg_scatter_hold * schp) k, sg_page(sg), sg->length)); sg_page_free(sg_page(sg), sg->length); } + + kfree(schp->buffer); } - kfree(schp->buffer); } memset(schp, 0, sizeof (*schp)); } -- 1.5.5.GIT ^ permalink raw reply related [flat|nested] 23+ messages in thread
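After this patch the zero-copy case reduces to the pair sketched below; the pinning, dirtying and release that the borrowed st_map_user_pages()/st_unmap_user_pages() code did by hand now happen inside bio_map_user() and blk_rq_unmap_user(). The helper names are illustrative:

#include <linux/blkdev.h>

/*
 * Sketch: with a NULL map_data and a buffer that satisfies the
 * queue's DMA alignment, blk_rq_map_user() takes the bio_map_user()
 * path and pins the user pages for zero-copy I/O.
 */
static int map_direct(struct request_queue *q, struct request *rq,
		      void __user *ubuf, unsigned long len,
		      struct bio **bio_out)
{
	int res = blk_rq_map_user(q, rq, NULL, ubuf, len, GFP_ATOMIC);

	if (res)
		return res;
	*bio_out = rq->bio;	/* saved like srp->bio for later unmap */
	return 0;
}

/* At completion time, as sg_finish_rem_req() now does: */
static void unmap_direct(struct bio *bio)
{
	if (bio)
		blk_rq_unmap_user(bio);
}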
* [PATCH 5/5] sg: convert the indirect IO path to use the block layer
  2008-08-28  7:17 ` [PATCH 4/5] sg: convert the direct IO " FUJITA Tomonori
@ 2008-08-28  7:17   ` FUJITA Tomonori
  0 siblings, 0 replies; 23+ messages in thread
From: FUJITA Tomonori @ 2008-08-28 7:17 UTC (permalink / raw)
  To: jens.axboe; +Cc: linux-scsi, dougg, michaelc, James.Bottomley, FUJITA Tomonori

This patch converts the indirect IO path (including mmap IO and the old
struct sg_header interface) to use the block layer functions
(blk_get_request, blk_execute_rq_nowait, blk_rq_map_user, etc.) instead
of scsi_execute_async().

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Douglas Gilbert <dougg@torque.net>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
---
 drivers/scsi/sg.c | 393 ++++++++++++++---------------------------------------
 1 files changed, 103 insertions(+), 290 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index cb6de07..56a5d96 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -47,7 +47,6 @@ static int sg_version_num = 30534;	/* 2 digits for each component */
 #include <linux/seq_file.h>
 #include <linux/blkdev.h>
 #include <linux/delay.h>
-#include <linux/scatterlist.h>
 #include <linux/blktrace_api.h>
 #include <linux/smp_lock.h>
 
@@ -119,7 +118,8 @@ typedef struct sg_scatter_hold { /* holding area for scsi scatter gather info */
 	unsigned sglist_len; /* size of malloc'd scatter-gather list ++ */
 	unsigned bufflen;	/* Size of (aggregate) data buffer */
 	unsigned b_malloc_len;	/* actual len malloc'ed in buffer */
-	struct scatterlist *buffer;/* scatter list */
+	struct page **pages;
+	int page_order;
 	char dio_in_use;	/* 0->indirect IO (or mmap), 1->dio */
 	unsigned char cmd_opcode;	/* first byte of command */
 } Sg_scatter_hold;
@@ -190,8 +190,6 @@ static ssize_t sg_new_write(Sg_fd *sfp, struct file *file,
 			int read_only, Sg_request **o_srp);
 static int sg_common_write(Sg_fd * sfp, Sg_request * srp,
 			   unsigned char *cmnd, int timeout, int blocking);
-static int sg_u_iovec(sg_io_hdr_t * hp, int sg_num, int ind,
-		      int wr_xf, int *countp, unsigned char __user **up);
 static int sg_write_xfer(Sg_request * srp);
 static int sg_read_xfer(Sg_request * srp);
 static int sg_read_oxfer(Sg_request * srp, char __user *outp, int num_read_xfer);
@@ -199,8 +197,6 @@ static void sg_remove_scat(Sg_scatter_hold * schp);
 static void sg_build_reserve(Sg_fd * sfp, int req_size);
 static void sg_link_reserve(Sg_fd * sfp, Sg_request * srp, int size);
 static void sg_unlink_reserve(Sg_fd * sfp, Sg_request * srp);
-static struct page *sg_page_malloc(int rqSz, int lowDma, int *retSzp);
-static void sg_page_free(struct page *page, int size);
 static Sg_fd *sg_add_sfp(Sg_device * sdp, int dev);
 static int sg_remove_sfp(Sg_device * sdp, Sg_fd * sfp);
 static void __sg_remove_sfp(Sg_device * sdp, Sg_fd * sfp);
@@ -771,26 +767,11 @@ sg_common_write(Sg_fd * sfp, Sg_request * srp,
 		break;
 	}
 	hp->duration = jiffies_to_msecs(jiffies);
-/* Now send everything of to mid-level. The next time we hear about this
-   packet is when sg_cmd_done() is called (i.e. a callback). */
-	if (srp->rq) {
-		srp->rq->timeout = timeout;
-		blk_execute_rq_nowait(sdp->device->request_queue, sdp->disk,
-				      srp->rq, 1, sg_rq_end_io);
-		return 0;
-	}
-	if (scsi_execute_async(sdp->device, cmnd, hp->cmd_len, data_dir, srp->data.buffer,
-			       hp->dxfer_len, srp->data.k_use_sg, timeout,
-			       SG_DEFAULT_RETRIES, srp, sg_cmd_done,
-			       GFP_ATOMIC)) {
-		SCSI_LOG_TIMEOUT(1, printk("sg_common_write: scsi_execute_async failed\n"));
-		/*
-		 * most likely out of mem, but could also be a bad map
-		 */
-		sg_finish_rem_req(srp);
-		return -ENOMEM;
-	} else
-		return 0;
+
+	srp->rq->timeout = timeout;
+	blk_execute_rq_nowait(sdp->device->request_queue, sdp->disk,
+			      srp->rq, 1, sg_rq_end_io);
+	return 0;
 }
 
 static int
@@ -1206,8 +1187,7 @@ sg_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 	Sg_fd *sfp;
 	unsigned long offset, len, sa;
 	Sg_scatter_hold *rsv_schp;
-	struct scatterlist *sg;
-	int k;
+	int k, length;
 
 	if ((NULL == vma) || (!(sfp = (Sg_fd *) vma->vm_private_data)))
 		return VM_FAULT_SIGBUS;
@@ -1217,15 +1197,14 @@ sg_vma_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 		return VM_FAULT_SIGBUS;
 	SCSI_LOG_TIMEOUT(3, printk("sg_vma_fault: offset=%lu, scatg=%d\n",
 				   offset, rsv_schp->k_use_sg));
-	sg = rsv_schp->buffer;
 	sa = vma->vm_start;
-	for (k = 0; (k < rsv_schp->k_use_sg) && (sa < vma->vm_end);
-	     ++k, sg = sg_next(sg)) {
+	length = 1 << (PAGE_SHIFT + rsv_schp->page_order);
+	for (k = 0; k < rsv_schp->k_use_sg && sa < vma->vm_end; k++) {
 		len = vma->vm_end - sa;
-		len = (len < sg->length) ? len : sg->length;
+		len = (len < length) ? len : length;
 		if (offset < len) {
-			struct page *page;
-			page = virt_to_page(page_address(sg_page(sg)) + offset);
+			struct page *page = nth_page(rsv_schp->pages[k],
+						     offset >> PAGE_SHIFT);
 			get_page(page);	/* increment page count */
 			vmf->page = page;
 			return 0; /* success */
@@ -1247,8 +1226,7 @@ sg_mmap(struct file *filp, struct vm_area_struct *vma)
 	Sg_fd *sfp;
 	unsigned long req_sz, len, sa;
 	Sg_scatter_hold *rsv_schp;
-	int k;
-	struct scatterlist *sg;
+	int k, length;
 
 	if ((!filp) || (!vma) || (!(sfp = (Sg_fd *) filp->private_data)))
 		return -ENXIO;
@@ -1262,11 +1240,10 @@ sg_mmap(struct file *filp, struct vm_area_struct *vma)
 		return -ENOMEM;	/* cannot map more than reserved buffer */
 
 	sa = vma->vm_start;
-	sg = rsv_schp->buffer;
-	for (k = 0; (k < rsv_schp->k_use_sg) && (sa < vma->vm_end);
-	     ++k, sg = sg_next(sg)) {
+	length = 1 << (PAGE_SHIFT + rsv_schp->page_order);
+	for (k = 0; k < rsv_schp->k_use_sg && sa < vma->vm_end; k++) {
 		len = vma->vm_end - sa;
-		len = (len < sg->length) ? len : sg->length;
+		len = (len < length) ? len : length;
 		sa += len;
 	}
 
@@ -1310,7 +1287,6 @@ sg_cmd_done(void *data, char *sense, int result, int resid)
 	if (0 != result) {
 		struct scsi_sense_hdr sshdr;
 
-		memcpy(srp->sense_b, sense, sizeof (srp->sense_b));
 		srp->header.status = 0xff & result;
 		srp->header.masked_status = status_byte(result);
 		srp->header.msg_status = msg_byte(result);
@@ -1685,34 +1661,51 @@ static int sg_start_req(Sg_request *srp, unsigned char *cmd)
 	Sg_scatter_hold *rsv_schp = &sfp->reserve;
 	struct request_queue *q = sfp->parentdp->device->request_queue;
 	unsigned long alignment = queue_dma_alignment(q) | q->dma_pad_mask;
+	struct rq_map_data map_data;
 
 	SCSI_LOG_TIMEOUT(4, printk("sg_start_req: dxfer_len=%d\n", dxfer_len));
 
+	res = __sg_start_req(srp, hp, cmd);
+	if (res)
+		return res;
+
 	if ((dxfer_len <= 0) || (dxfer_dir == SG_DXFER_NONE))
-		return __sg_start_req(srp, hp, cmd);
+		return 0;
 
 #ifdef SG_ALLOW_DIO_CODE
 	if (sg_allow_dio && (hp->flags & SG_FLAG_DIRECT_IO) &&
 	    (dxfer_dir != SG_DXFER_UNKNOWN) && (0 == hp->iovec_count) &&
 	    (!sfp->parentdp->device->host->unchecked_isa_dma) &&
-	    !(uaddr & alignment) && !(dxfer_len & alignment)) {
-		res = __sg_start_req(srp, hp, cmd);
-		if (!res)
-			res = sg_build_direct(srp, sfp, dxfer_len);
-
-		return res;
-	}
+	    !(uaddr & alignment) && !(dxfer_len & alignment))
+		return sg_build_direct(srp, sfp, dxfer_len);
 #endif
 	if ((!sg_res_in_use(sfp)) && (dxfer_len <= rsv_schp->bufflen))
 		sg_link_reserve(sfp, srp, dxfer_len);
-	else {
+	else
 		res = sg_build_indirect(req_schp, sfp, dxfer_len);
-		if (res) {
-			sg_remove_scat(req_schp);
-			return res;
-		}
+
+	if (!res) {
+		struct request *rq = srp->rq;
+		Sg_scatter_hold *schp = &srp->data;
+		int iovec_count = (int) hp->iovec_count;
+
+		map_data.pages = schp->pages;
+		map_data.page_order = schp->page_order;
+		map_data.nr_entries = schp->k_use_sg;
+
+		if (iovec_count)
+			res = blk_rq_map_user_iov(q, rq, &map_data, hp->dxferp,
+						  iovec_count,
+						  hp->dxfer_len, GFP_ATOMIC);
+		else
+			res = blk_rq_map_user(q, rq, &map_data, hp->dxferp,
+					      hp->dxfer_len, GFP_ATOMIC);
+
+		if (!res)
+			srp->bio = rq->bio;
 	}
-	return 0;
+
+	return res;
 }
 
 static void
@@ -1730,6 +1723,7 @@ sg_finish_rem_req(Sg_request * srp)
 	if (srp->rq) {
 		if (srp->bio)
 			blk_rq_unmap_user(srp->bio);
+
 		blk_put_request(srp->rq);
 	}
 
@@ -1739,21 +1733,12 @@ sg_finish_rem_req(Sg_request * srp)
 static int
 sg_build_sgat(Sg_scatter_hold * schp, const Sg_fd * sfp, int tablesize)
 {
-	int sg_bufflen = tablesize * sizeof(struct scatterlist);
+	int sg_bufflen = tablesize * sizeof(struct page *);
 	gfp_t gfp_flags = GFP_ATOMIC | __GFP_NOWARN;
 
-	/*
-	 * TODO: test without low_dma, we should not need it since
-	 * the block layer will bounce the buffer for us
-	 *
-	 * XXX(hch): we shouldn't need GFP_DMA for the actual S/G list.
-	 */
-	if (sfp->low_dma)
-		gfp_flags |= GFP_DMA;
-	schp->buffer = kzalloc(sg_bufflen, gfp_flags);
-	if (!schp->buffer)
+	schp->pages = kzalloc(sg_bufflen, gfp_flags);
+	if (!schp->pages)
 		return -ENOMEM;
-	sg_init_table(schp->buffer, tablesize);
 	schp->sglist_len = sg_bufflen;
 	return tablesize;	/* number of scat_gath elements allocated */
 }
@@ -1780,11 +1765,10 @@ sg_build_direct(Sg_request * srp, Sg_fd * sfp, int dxfer_len)
 static int
 sg_build_indirect(Sg_scatter_hold * schp, Sg_fd * sfp, int buff_size)
 {
-	struct scatterlist *sg;
-	int ret_sz = 0, k, rem_sz, num, mx_sc_elems;
+	int ret_sz = 0, i, k, rem_sz, num, mx_sc_elems;
 	int sg_tablesize = sfp->parentdp->sg_tablesize;
-	int blk_size = buff_size;
-	struct page *p = NULL;
+	int blk_size = buff_size, order;
+	gfp_t gfp_mask = GFP_ATOMIC | __GFP_COMP | __GFP_NOWARN;
 
 	if (blk_size < 0)
 		return -EFAULT;
@@ -1808,15 +1792,26 @@ sg_build_indirect(Sg_scatter_hold * schp, Sg_fd * sfp, int buff_size)
 		} else
 			scatter_elem_sz_prev = num;
 	}
-	for (k = 0, sg = schp->buffer, rem_sz = blk_size;
-	     (rem_sz > 0) && (k < mx_sc_elems);
-	     ++k, rem_sz -= ret_sz, sg = sg_next(sg)) {
-
+
+	if (sfp->low_dma)
+		gfp_mask |= GFP_DMA;
+
+	if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
+		gfp_mask |= __GFP_ZERO;
+
+	order = get_order(num);
+retry:
+	ret_sz = 1 << (PAGE_SHIFT + order);
+
+	for (k = 0, rem_sz = blk_size; rem_sz > 0 && k < mx_sc_elems;
+	     k++, rem_sz -= ret_sz) {
+
 		num = (rem_sz > scatter_elem_sz_prev) ?
-		      scatter_elem_sz_prev : rem_sz;
-		p = sg_page_malloc(num, sfp->low_dma, &ret_sz);
-		if (!p)
-			return -ENOMEM;
+			scatter_elem_sz_prev : rem_sz;
+
+		schp->pages[k] = alloc_pages(gfp_mask, order);
+		if (!schp->pages[k])
+			goto out;
 
 		if (num == scatter_elem_sz_prev) {
 			if (unlikely(ret_sz > scatter_elem_sz_prev)) {
@@ -1824,12 +1819,12 @@ sg_build_indirect(Sg_scatter_hold * schp, Sg_fd * sfp, int buff_size)
 				scatter_elem_sz_prev = ret_sz;
 			}
 		}
-		sg_set_page(sg, p, (ret_sz > num) ? num : ret_sz, 0);
 
 		SCSI_LOG_TIMEOUT(5, printk("sg_build_indirect: k=%d, num=%d, "
 				 "ret_sz=%d\n", k, num, ret_sz));
 	}		/* end of for loop */
 
+	schp->page_order = order;
 	schp->k_use_sg = k;
 	SCSI_LOG_TIMEOUT(5, printk("sg_build_indirect: k_use_sg=%d, "
 			 "rem_sz=%d\n", k, rem_sz));
@@ -1837,8 +1832,15 @@ sg_build_indirect(Sg_scatter_hold * schp, Sg_fd * sfp, int buff_size)
 	schp->bufflen = blk_size;
 	if (rem_sz > 0)	/* must have failed */
 		return -ENOMEM;
-
 	return 0;
+out:
+	for (i = 0; i < k; i++)
+		__free_pages(schp->pages[i], order);
+
+	if (--order >= 0)
+		goto retry;
+
+	return -ENOMEM;
 }
 
 static int
@@ -1846,13 +1848,8 @@ sg_write_xfer(Sg_request * srp)
 {
 	sg_io_hdr_t *hp = &srp->header;
 	Sg_scatter_hold *schp = &srp->data;
-	struct scatterlist *sg = schp->buffer;
 	int num_xfer = 0;
-	int j, k, onum, usglen, ksglen, res;
-	int iovec_count = (int) hp->iovec_count;
 	int dxfer_dir = hp->dxfer_direction;
-	unsigned char *p;
-	unsigned char __user *up;
 	int new_interface = ('\0' == hp->interface_id) ? 0 : 1;
 
 	if ((SG_DXFER_UNKNOWN == dxfer_dir) || (SG_DXFER_TO_DEV == dxfer_dir) ||
@@ -1868,103 +1865,26 @@ sg_write_xfer(Sg_request * srp)
 
 	SCSI_LOG_TIMEOUT(4, printk("sg_write_xfer: num_xfer=%d, iovec_count=%d, k_use_sg=%d\n",
-			  num_xfer, iovec_count, schp->k_use_sg));
-
-	if (iovec_count) {
-		onum = iovec_count;
-		if (!access_ok(VERIFY_READ, hp->dxferp, SZ_SG_IOVEC * onum))
-			return -EFAULT;
-	} else
-		onum = 1;
-
-	ksglen = sg->length;
-	p = page_address(sg_page(sg));
-	for (j = 0, k = 0; j < onum; ++j) {
-		res = sg_u_iovec(hp, iovec_count, j, 1, &usglen, &up);
-		if (res)
-			return res;
-
-		for (; p; sg = sg_next(sg), ksglen = sg->length,
-		     p = page_address(sg_page(sg))) {
-			if (usglen <= 0)
-				break;
-			if (ksglen > usglen) {
-				if (usglen >= num_xfer) {
-					if (__copy_from_user(p, up, num_xfer))
-						return -EFAULT;
-					return 0;
-				}
-				if (__copy_from_user(p, up, usglen))
-					return -EFAULT;
-				p += usglen;
-				ksglen -= usglen;
-				break;
-			} else {
-				if (ksglen >= num_xfer) {
-					if (__copy_from_user(p, up, num_xfer))
-						return -EFAULT;
-					return 0;
-				}
-				if (__copy_from_user(p, up, ksglen))
-					return -EFAULT;
-				up += ksglen;
-				usglen -= ksglen;
-			}
-			++k;
-			if (k >= schp->k_use_sg)
-				return 0;
-		}
-	}
-
+			  num_xfer, (int)hp->iovec_count, schp->k_use_sg));
 	return 0;
 }
 
-static int
-sg_u_iovec(sg_io_hdr_t * hp, int sg_num, int ind,
-	   int wr_xf, int *countp, unsigned char __user **up)
-{
-	int num_xfer = (int) hp->dxfer_len;
-	unsigned char __user *p = hp->dxferp;
-	int count;
-
-	if (0 == sg_num) {
-		if (wr_xf && ('\0' == hp->interface_id))
-			count = (int) hp->flags;	/* holds "old" input_size */
-		else
-			count = num_xfer;
-	} else {
-		sg_iovec_t iovec;
-		if (__copy_from_user(&iovec, p + ind*SZ_SG_IOVEC, SZ_SG_IOVEC))
-			return -EFAULT;
-		p = iovec.iov_base;
-		count = (int) iovec.iov_len;
-	}
-	if (!access_ok(wr_xf ? VERIFY_READ : VERIFY_WRITE, p, count))
-		return -EFAULT;
-	if (up)
-		*up = p;
-	if (countp)
-		*countp = count;
-	return 0;
-}
-
 static void
 sg_remove_scat(Sg_scatter_hold * schp)
 {
 	SCSI_LOG_TIMEOUT(4, printk("sg_remove_scat: k_use_sg=%d\n", schp->k_use_sg));
-	if (schp->buffer && (schp->sglist_len > 0)) {
-		struct scatterlist *sg = schp->buffer;
-
+	if (schp->pages && schp->sglist_len > 0) {
 		if (!schp->dio_in_use) {
 			int k;
 
-			for (k = 0; (k < schp->k_use_sg) && sg_page(sg);
-			     ++k, sg = sg_next(sg)) {
+			for (k = 0; k < schp->k_use_sg && schp->pages[k]; k++) {
 				SCSI_LOG_TIMEOUT(5, printk(
-				    "sg_remove_scat: k=%d, pg=0x%p, len=%d\n",
-				    k, sg_page(sg), sg->length));
-				sg_page_free(sg_page(sg), sg->length);
+				    "sg_remove_scat: k=%d, pg=0x%p\n",
+				    k, schp->pages[k]));
+				__free_pages(schp->pages[k], schp->page_order);
 			}
 
-			kfree(schp->buffer);
+			kfree(schp->pages);
 		}
 	}
 	memset(schp, 0, sizeof (*schp));
@@ -1975,13 +1895,8 @@ sg_read_xfer(Sg_request * srp)
 {
 	sg_io_hdr_t *hp = &srp->header;
 	Sg_scatter_hold *schp = &srp->data;
-	struct scatterlist *sg = schp->buffer;
 	int num_xfer = 0;
-	int j, k, onum, usglen, ksglen, res;
-	int iovec_count = (int) hp->iovec_count;
 	int dxfer_dir = hp->dxfer_direction;
-	unsigned char *p;
-	unsigned char __user *up;
 	int new_interface = ('\0' == hp->interface_id) ? 0 : 1;
 
 	if ((SG_DXFER_UNKNOWN == dxfer_dir) || (SG_DXFER_FROM_DEV == dxfer_dir)
@@ -1996,53 +1911,7 @@ sg_read_xfer(Sg_request * srp)
 		return 0;
 
 	SCSI_LOG_TIMEOUT(4, printk("sg_read_xfer: num_xfer=%d, iovec_count=%d, k_use_sg=%d\n",
-			  num_xfer, iovec_count, schp->k_use_sg));
-	if (iovec_count) {
-		onum = iovec_count;
-		if (!access_ok(VERIFY_READ, hp->dxferp, SZ_SG_IOVEC * onum))
-			return -EFAULT;
-	} else
-		onum = 1;
-
-	p = page_address(sg_page(sg));
-	ksglen = sg->length;
-	for (j = 0, k = 0; j < onum; ++j) {
-		res = sg_u_iovec(hp, iovec_count, j, 0, &usglen, &up);
-		if (res)
-			return res;
-
-		for (; p; sg = sg_next(sg), ksglen = sg->length,
-		     p = page_address(sg_page(sg))) {
-			if (usglen <= 0)
-				break;
-			if (ksglen > usglen) {
-				if (usglen >= num_xfer) {
-					if (__copy_to_user(up, p, num_xfer))
-						return -EFAULT;
-					return 0;
-				}
-				if (__copy_to_user(up, p, usglen))
-					return -EFAULT;
-				p += usglen;
-				ksglen -= usglen;
-				break;
-			} else {
-				if (ksglen >= num_xfer) {
-					if (__copy_to_user(up, p, num_xfer))
-						return -EFAULT;
-					return 0;
-				}
-				if (__copy_to_user(up, p, ksglen))
-					return -EFAULT;
-				up += ksglen;
-				usglen -= ksglen;
-			}
-			++k;
-			if (k >= schp->k_use_sg)
-				return 0;
-		}
-	}
-
+			  num_xfer, (int)hp->iovec_count, schp->k_use_sg));
 	return 0;
 }
 
@@ -2050,7 +1919,6 @@ static int
 sg_read_oxfer(Sg_request * srp, char __user *outp, int num_read_xfer)
 {
 	Sg_scatter_hold *schp = &srp->data;
-	struct scatterlist *sg = schp->buffer;
 	int k, num;
 
 	SCSI_LOG_TIMEOUT(4, printk("sg_read_oxfer: num_read_xfer=%d\n",
@@ -2058,15 +1926,18 @@ sg_read_oxfer(Sg_request * srp, char __user *outp, int num_read_xfer)
 	if ((!outp) || (num_read_xfer <= 0))
 		return 0;
 
-	for (k = 0; (k < schp->k_use_sg) && sg_page(sg); ++k, sg = sg_next(sg)) {
-		num = sg->length;
+	blk_rq_unmap_user(srp->bio);
+	srp->bio = NULL;
+
+	num = 1 << (PAGE_SHIFT + schp->page_order);
+	for (k = 0; k < schp->k_use_sg && schp->pages[k]; k++) {
 		if (num > num_read_xfer) {
-			if (__copy_to_user(outp, page_address(sg_page(sg)),
+			if (__copy_to_user(outp, page_address(schp->pages[k]),
 					   num_read_xfer))
 				return -EFAULT;
 			break;
 		} else {
-			if (__copy_to_user(outp, page_address(sg_page(sg)),
+			if (__copy_to_user(outp, page_address(schp->pages[k]),
 					   num))
 				return -EFAULT;
 			num_read_xfer -= num;
@@ -2101,24 +1972,22 @@ sg_link_reserve(Sg_fd * sfp, Sg_request * srp, int size)
 {
 	Sg_scatter_hold *req_schp = &srp->data;
 	Sg_scatter_hold *rsv_schp = &sfp->reserve;
-	struct scatterlist *sg = rsv_schp->buffer;
 	int k, num, rem;
 
 	srp->res_used = 1;
 	SCSI_LOG_TIMEOUT(4, printk("sg_link_reserve: size=%d\n", size));
 	rem = size;
 
-	for (k = 0; k < rsv_schp->k_use_sg; ++k, sg = sg_next(sg)) {
-		num = sg->length;
+	num = 1 << (PAGE_SHIFT + rsv_schp->page_order);
+	for (k = 0; k < rsv_schp->k_use_sg; k++) {
 		if (rem <= num) {
-			sfp->save_scat_len = num;
-			sg->length = rem;
 			req_schp->k_use_sg = k + 1;
 			req_schp->sglist_len = rsv_schp->sglist_len;
-			req_schp->buffer = rsv_schp->buffer;
+			req_schp->pages = rsv_schp->pages;
 
 			req_schp->bufflen = size;
 			req_schp->b_malloc_len = rsv_schp->b_malloc_len;
+			req_schp->page_order = rsv_schp->page_order;
 			break;
 		} else
 			rem -= num;
@@ -2132,22 +2001,13 @@ static void
 sg_unlink_reserve(Sg_fd * sfp, Sg_request * srp)
 {
 	Sg_scatter_hold *req_schp = &srp->data;
-	Sg_scatter_hold *rsv_schp = &sfp->reserve;
 
 	SCSI_LOG_TIMEOUT(4, printk("sg_unlink_reserve: req->k_use_sg=%d\n",
 				   (int) req_schp->k_use_sg));
-	if ((rsv_schp->k_use_sg > 0) && (req_schp->k_use_sg > 0)) {
-		struct scatterlist *sg = rsv_schp->buffer;
-
-		if (sfp->save_scat_len > 0)
-			(sg + (req_schp->k_use_sg - 1))->length =
-			    (unsigned) sfp->save_scat_len;
-		else
-			SCSI_LOG_TIMEOUT(1, printk ("sg_unlink_reserve: BAD save_scat_len\n"));
-	}
 	req_schp->k_use_sg = 0;
 	req_schp->bufflen = 0;
-	req_schp->buffer = NULL;
+	req_schp->pages = NULL;
+	req_schp->page_order = 0;
 	req_schp->sglist_len = 0;
 	sfp->save_scat_len = 0;
 	srp->res_used = 0;
@@ -2405,53 +2265,6 @@ sg_res_in_use(Sg_fd * sfp)
 	return srp ? 1 : 0;
 }
 
-/* The size fetched (value output via retSzp) set when non-NULL return */
-static struct page *
-sg_page_malloc(int rqSz, int lowDma, int *retSzp)
-{
-	struct page *resp = NULL;
-	gfp_t page_mask;
-	int order, a_size;
-	int resSz;
-
-	if ((rqSz <= 0) || (NULL == retSzp))
-		return resp;
-
-	if (lowDma)
-		page_mask = GFP_ATOMIC | GFP_DMA | __GFP_COMP | __GFP_NOWARN;
-	else
-		page_mask = GFP_ATOMIC | __GFP_COMP | __GFP_NOWARN;
-
-	for (order = 0, a_size = PAGE_SIZE; a_size < rqSz;
-	     order++, a_size <<= 1)
-		;
-	resSz = a_size;		/* rounded up if necessary */
-	resp = alloc_pages(page_mask, order);
-	while ((!resp) && order) {
-		--order;
-		a_size >>= 1;	/* divide by 2, until PAGE_SIZE */
-		resp = alloc_pages(page_mask, order);	/* try half */
-		resSz = a_size;
-	}
-	if (resp) {
-		if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
-			memset(page_address(resp), 0, resSz);
-		*retSzp = resSz;
-	}
-	return resp;
-}
-
-static void
-sg_page_free(struct page *page, int size)
-{
-	int order, a_size;
-
-	if (!page)
-		return;
-	for (order = 0, a_size = PAGE_SIZE; a_size < size;
-	     order++, a_size <<= 1)
-		;
-	__free_pages(page, order);
-}
-
 #ifdef CONFIG_SCSI_PROC_FS
 static int
 sg_idr_max_id(int id, void *p, void *data)
-- 
1.5.5.GIT

^ permalink raw reply related	[flat|nested] 23+ messages in thread
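[Editor's note -- a hedged sketch, not code from the patchset: the key move in
the indirect IO conversion above is that sg keeps ownership of its
pre-allocated (possibly high-order, possibly GFP_DMA) pages and lends them to
the block layer through struct rq_map_data, so the per-fd reserved buffer and
its mmap(2) semantics survive the switch to blk_rq_map_user(). The rq_map_data
fields (pages, page_order, nr_entries) are the ones patch 2/5 introduces; the
my_* names are invented for illustration.]

/* Hedged sketch: lend a driver-owned page pool to the block layer so
 * bio_copy_user_iov() copies through the caller's pages instead of
 * allocating fresh ones.  my_* names are hypothetical. */
struct my_page_pool {
	struct page **pages;	/* pre-allocated array of page pointers */
	int page_order;		/* each entry spans 2^page_order pages */
	int nr;			/* number of entries in pages[] */
};

static int my_map_with_pool(struct request_queue *q, struct request *rq,
			    struct my_page_pool *pool,
			    void __user *ubuf, unsigned int len)
{
	struct rq_map_data map_data;

	map_data.pages = pool->pages;
	map_data.page_order = pool->page_order;
	map_data.nr_entries = pool->nr;

	/* With a non-NULL map_data the block layer copies through the
	 * caller's pages, so the caller can mmap() them or read the data
	 * back later -- exactly what sg's reserved buffer needs. */
	return blk_rq_map_user(q, rq, &map_data, ubuf, len, GFP_ATOMIC);
}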
* Re: [PATCH 1/5] block: add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov
  2008-08-28  7:17 ` [PATCH 1/5] block: add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov FUJITA Tomonori
  2008-08-28  7:17   ` [PATCH 2/5] block: introduce struct rq_map_data to use reserved pages FUJITA Tomonori
@ 2008-08-28  7:34   ` Jens Axboe
  1 sibling, 0 replies; 23+ messages in thread
From: Jens Axboe @ 2008-08-28 7:34 UTC (permalink / raw)
  To: FUJITA Tomonori; +Cc: linux-scsi, dougg, michaelc, James.Bottomley

On Thu, Aug 28 2008, FUJITA Tomonori wrote:
> Currently, blk_rq_map_user and blk_rq_map_user_iov always do
> GFP_KERNEL allocation.
> 
> This adds gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov
> so sg can use it (sg always does GFP_ATOMIC allocation).

Thanks for the (quick!) resend, earns you another shining star to place
on the refrigerator! I've applied them, thanks again Tomo.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 23+ messages in thread
* Re: [PATCH 0/5] convert sg to use the block layer
  2008-08-28  7:07 ` Jens Axboe
  2008-08-28  7:17   ` [PATCH 1/5] block: add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov FUJITA Tomonori
@ 2008-08-28  7:32   ` FUJITA Tomonori
  1 sibling, 0 replies; 23+ messages in thread
From: FUJITA Tomonori @ 2008-08-28 7:32 UTC (permalink / raw)
  To: jens.axboe; +Cc: fujita.tomonori, linux-scsi, dougg, michaelc, James.Bottomley

On Thu, 28 Aug 2008 09:07:25 +0200
Jens Axboe <jens.axboe@oracle.com> wrote:

> On Thu, Aug 28 2008, Jens Axboe wrote:
> > On Thu, Aug 28 2008, FUJITA Tomonori wrote:
> > > On Wed, 27 Aug 2008 09:10:18 +0200
> > > Jens Axboe <jens.axboe@oracle.com> wrote:
> > > 
> > > > > Thanks, I put an updated version against the for-2.6.28 branch in your
> > > > > tree:
> > > > > 
> > > > > git://git.kernel.org/pub/scm/linux/kernel/git/tomo/linux-2.6-misc.git sg-block
> > > > > 
> > > > > 
> > > > > Can you wait for a little while before applying this to your
> > > > > for-2.6.28 branch? We need Doug's ACK on this.
> > > > 
> > > > Sure, just let me know...
> > > 
> > > Now we got Doug's ACK on this (Doug, thanks!). Can you apply them to
> > > your for-2.6.28 branch? I'll resend them if necessary.
> > 
> > Yes certainly. No need to resend, I'll add Dougs ack and merge it.
> 
> Actually, the previous was against for-linus (not for-2.6.28), so if you
> could resend against 2.6.28 that would help a lot. Thanks!

Done, though actually the patchset in my git tree was updated against
for-linus.

Thanks,

^ permalink raw reply	[flat|nested] 23+ messages in thread
* Re: [PATCH 0/5] convert sg to use the block layer
  2008-08-26  2:10 [PATCH 0/5] convert sg to use the block layer FUJITA Tomonori
  2008-08-26  2:10 ` [PATCH 1/5] block: add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov FUJITA Tomonori
  2008-08-26  7:56 ` [PATCH 0/5] convert sg to use the block layer Jens Axboe
@ 2008-08-27 20:14 ` Douglas Gilbert
  2008-08-27 23:26   ` FUJITA Tomonori
  2 siblings, 1 reply; 23+ messages in thread
From: Douglas Gilbert @ 2008-08-27 20:14 UTC (permalink / raw)
  To: FUJITA Tomonori; +Cc: linux-scsi, jens.axboe, michaelc, James.Bottomley

FUJITA Tomonori wrote:
> This patchset converts sg to use the block layer functions. That is,
> sg doesn't use scsi_execute_async() any more. This is a part of the
> overdue task to remove scsi_req_map_sg.
> 
> I tested this patchset with sg v3 and the old interface (struct
> sg_header) via SG_IO (v3) and the vfs API.
> 
> 
> Doug,
> 
> 1. I don't demote the sg driver's GFP_ATOMIC to GFP_KERNEL. sg always
> uses GFP_ATOMIC as before.
> 
> 2. I don't remove GFP_DMA allocation. As before, sg allocates reserved
> pages with GFP_DMA (sfp->low_dma case).
> 
> 3. I keep the reserved buffer per struct sg_fd as before.
> 
> 4. I use high-order page allocation for reserved buffer as before. sg
> works well with HBAs that have the limitation of the number of sg
> entries.
> 
> 5. I think that you were concern about the overhead of the block layer
> functions. But if you look at scsi_execute_async() that sg uses now,
> you can find that scsi_execute_async() uses the block layer functions
> internally. So the current sg incurs the overhead (if such overhead
> exists).

Tomo,
Thanks for doing the conversion.

Signed-off-by: Douglas Gilbert <dougg@torque.net>

> Jens,
> 
> I keep the block API changes to a minimum (I might need more changes
> for st/osst but I'd like to progress step by step). I did only two
> things.
> 
> 1. I add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov.
> 
> 2. I introduces struct rq_map_data holding pages. sg puts
> pre-allocated pages to it and passes it to bio_copy_user_iov(). The
> current users of bio_copy_user_iov simply passes NULL. blk_rq_map_user
> and blk_rq_map_user_iov take a pointer to struct rq_map_data and in
> the end bio_copy_user_iov gets it.
> 
> 
> This patchset against the for-linus branch in Jens' tree + the two
> patches for 2.6.27:
> 
> http://marc.info/?l=linux-kernel&m=121964251911717&w=2
> http://marc.info/?l=linux-kernel&m=121964241911611&w=2
> 
> After Jens rebases the for-2.6.28 brach, I'll update this patchset
> too.
> 
> This patchset also is available at:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/tomo/linux-2.6-misc.git sg-block
> 
> 
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread
* Re: [PATCH 0/5] convert sg to use the block layer
  2008-08-27 20:14 ` Douglas Gilbert
@ 2008-08-27 23:26   ` FUJITA Tomonori
  0 siblings, 0 replies; 23+ messages in thread
From: FUJITA Tomonori @ 2008-08-27 23:26 UTC (permalink / raw)
  To: dougg; +Cc: fujita.tomonori, linux-scsi, jens.axboe, michaelc, James.Bottomley

On Wed, 27 Aug 2008 16:14:24 -0400
Douglas Gilbert <dougg@torque.net> wrote:

> FUJITA Tomonori wrote:
> > This patchset converts sg to use the block layer functions. That is,
> > sg doesn't use scsi_execute_async() any more. This is a part of the
> > overdue task to remove scsi_req_map_sg.
> > 
> > I tested this patchset with sg v3 and the old interface (struct
> > sg_header) via SG_IO (v3) and the vfs API.
> > 
> > 
> > Doug,
> > 
> > 1. I don't demote the sg driver's GFP_ATOMIC to GFP_KERNEL. sg always
> > uses GFP_ATOMIC as before.
> > 
> > 2. I don't remove GFP_DMA allocation. As before, sg allocates reserved
> > pages with GFP_DMA (sfp->low_dma case).
> > 
> > 3. I keep the reserved buffer per struct sg_fd as before.
> > 
> > 4. I use high-order page allocation for reserved buffer as before. sg
> > works well with HBAs that have the limitation of the number of sg
> > entries.
> > 
> > 5. I think that you were concern about the overhead of the block layer
> > functions. But if you look at scsi_execute_async() that sg uses now,
> > you can find that scsi_execute_async() uses the block layer functions
> > internally. So the current sg incurs the overhead (if such overhead
> > exists).
> 
> Tomo,
> Thanks for doing the conversion.
> 
> Signed-off-by: Douglas Gilbert <dougg@torque.net>

Thanks a lot!

^ permalink raw reply	[flat|nested] 23+ messages in thread
end of thread, other threads:[~2008-08-28 7:34 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-08-26  2:10 [PATCH 0/5] convert sg to use the block layer FUJITA Tomonori
2008-08-26  2:10 ` [PATCH 1/5] block: add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov FUJITA Tomonori
2008-08-26  2:10   ` [PATCH 2/5] block: introduce struct rq_map_data to use reserved pages FUJITA Tomonori
2008-08-26  2:10     ` [PATCH 3/5] sg: convert the non-data path to use the block layer FUJITA Tomonori
2008-08-26  2:10       ` [PATCH 4/5] sg: convert the direct IO " FUJITA Tomonori
2008-08-26  2:10         ` [PATCH 5/5] sg: convert the indirect " FUJITA Tomonori
2008-08-26 16:35 ` [PATCH 1/5] block: add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov Christoph Hellwig
2008-08-27  1:30   ` FUJITA Tomonori
2008-08-26  7:56 ` [PATCH 0/5] convert sg to use the block layer Jens Axboe
2008-08-27  2:14   ` FUJITA Tomonori
2008-08-27  7:10     ` Jens Axboe
2008-08-27 23:26       ` FUJITA Tomonori
2008-08-28  6:51         ` Jens Axboe
2008-08-28  7:07           ` Jens Axboe
2008-08-28  7:17             ` [PATCH 1/5] block: add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov FUJITA Tomonori
2008-08-28  7:17               ` [PATCH 2/5] block: introduce struct rq_map_data to use reserved pages FUJITA Tomonori
2008-08-28  7:17                 ` [PATCH 3/5] sg: convert the non-data path to use the block layer FUJITA Tomonori
2008-08-28  7:17                   ` [PATCH 4/5] sg: convert the direct IO " FUJITA Tomonori
2008-08-28  7:17                     ` [PATCH 5/5] sg: convert the indirect " FUJITA Tomonori
2008-08-28  7:34               ` [PATCH 1/5] block: add gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov Jens Axboe
2008-08-28  7:32             ` [PATCH 0/5] convert sg to use the block layer FUJITA Tomonori
2008-08-27 20:14 ` Douglas Gilbert
2008-08-27 23:26   ` FUJITA Tomonori