* [PATCH for-next] dm: fix missing bi_remaining accounting
@ 2013-11-01 13:59 Mike Snitzer
  2013-11-01 15:03 ` Jens Axboe
  0 siblings, 1 reply; 16+ messages in thread

From: Mike Snitzer @ 2013-11-01 13:59 UTC (permalink / raw)
To: Jens Axboe
Cc: Kent Overstreet, dm-devel, linux-kernel, Alasdair Kergon, Mikulas Patocka

Add the missing bi_remaining increment, required by the block layer's
new bio-chaining code, to both the verity and old snapshot DM targets.

Otherwise users will hit the bi_remaining <= 0 BUG_ON in bio_endio().

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm-snap.c   | 1 +
 drivers/md/dm-verity.c | 1 +
 2 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c
index b7a5053..1b4d22d 100644
--- a/drivers/md/dm-snap.c
+++ b/drivers/md/dm-snap.c
@@ -1415,6 +1415,7 @@ out:
 	if (full_bio) {
 		full_bio->bi_end_io = pe->full_bio_end_io;
 		full_bio->bi_private = pe->full_bio_private;
+		atomic_inc(&full_bio->bi_remaining);
 	}

 	free_pending_exception(pe);

diff --git a/drivers/md/dm-verity.c b/drivers/md/dm-verity.c
index ac35e95..dc15857 100644
--- a/drivers/md/dm-verity.c
+++ b/drivers/md/dm-verity.c
@@ -384,6 +384,7 @@ static void verity_finish_io(struct dm_verity_io *io, int error)

 	bio->bi_end_io = io->orig_bi_end_io;
 	bio->bi_private = io->orig_bi_private;
+	atomic_inc(&bio->bi_remaining);

 	bio_endio(bio, error);
 }

^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH for-next] dm: fix missing bi_remaining accounting
  2013-11-01 13:59 [PATCH for-next] dm: fix missing bi_remaining accounting Mike Snitzer
@ 2013-11-01 15:03 ` Jens Axboe
  2013-11-01 15:43   ` stec skd block driver needs updating for immutable biovec Mike Snitzer
  2013-11-04 15:06   ` [PATCH for-next] dm: fix missing bi_remaining accounting Mikulas Patocka
  0 siblings, 2 replies; 16+ messages in thread

From: Jens Axboe @ 2013-11-01 15:03 UTC (permalink / raw)
To: Mike Snitzer
Cc: Kent Overstreet, dm-devel, linux-kernel, Alasdair Kergon, Mikulas Patocka

On 11/01/2013 07:59 AM, Mike Snitzer wrote:
> Add the missing bi_remaining increment, required by the block layer's
> new bio-chaining code, to both the verity and old snapshot DM targets.
>
> Otherwise users will hit the bi_remaining <= 0 BUG_ON in bio_endio().

Thanks Mike, added to the mix.

--
Jens Axboe

^ permalink raw reply [flat|nested] 16+ messages in thread
* stec skd block driver needs updating for immutable biovec
  2013-11-01 15:03 ` Jens Axboe
@ 2013-11-01 15:43 ` Mike Snitzer
  2013-11-01 15:50   ` Christoph Hellwig
  2013-11-04 15:06   ` [PATCH for-next] dm: fix missing bi_remaining accounting Mikulas Patocka
  1 sibling, 1 reply; 16+ messages in thread

From: Mike Snitzer @ 2013-11-01 15:43 UTC (permalink / raw)
To: Jens Axboe; +Cc: Kent Overstreet, linux-kernel

Hey Jens and Kent,

You're likely already aware of this but building the latest
linux-block.git 'for-next' I get:

  CC [M]  drivers/block/skd_main.o
drivers/block/skd_main.c: In function ‘skd_request_fn’:
drivers/block/skd_main.c:773: error: ‘struct bio’ has no member named ‘bi_sector’
drivers/block/skd_main.c: In function ‘skd_end_request_bio’:
drivers/block/skd_main.c:1142: error: ‘struct bio’ has no member named ‘bi_sector’
drivers/block/skd_main.c: In function ‘skd_preop_sg_list_bio’:
drivers/block/skd_main.c:1198: error: implicit declaration of function ‘bio_iovec_idx’
drivers/block/skd_main.c:1198: warning: assignment makes pointer from integer without a cast
drivers/block/skd_main.c: In function ‘skd_log_skreq’:
drivers/block/skd_main.c:5814: error: ‘struct bio’ has no member named ‘bi_sector’
make[2]: *** [drivers/block/skd_main.o] Error 1
make[1]: *** [drivers/block] Error 2
make: *** [drivers] Error 2

All the bi_sector ones are low hanging fruit, but the conversion for
skd_preop_sg_list_bio()'s bio_vec code is more involved.

Kent, any chance you could crank through it?

If not I can come back to trying to fix this later.. but I'm working
through a test merge of linux-dm.git's 'for-next' with linux-block.git's
'for-next'.

(I happen to have one of these stec cards so I'll be able to test)

Mike

^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: stec skd block driver needs updating for immutable biovec
  2013-11-01 15:43 ` stec skd block driver needs updating for immutable biovec Mike Snitzer
@ 2013-11-01 15:50 ` Christoph Hellwig
  2013-11-01 16:02   ` Jens Axboe
  0 siblings, 1 reply; 16+ messages in thread

From: Christoph Hellwig @ 2013-11-01 15:50 UTC (permalink / raw)
To: Mike Snitzer; +Cc: Jens Axboe, Kent Overstreet, linux-kernel

On Fri, Nov 01, 2013 at 11:43:39AM -0400, Mike Snitzer wrote:
> All the bi_sector ones are low hanging fruit, but the conversion for
> skd_preop_sg_list_bio()'s bio_vec code is more involved.
>
> Kent, any chance you could crank through it?
>
> If not I can come back to trying to fix this later.. but I'm working
> through a test merge of linux-dm.git's 'for-next' with linux-block.git's
> 'for-next'.

The right thing for 3.13 is to rip out the bio base code path, and
for 3.14 to convert it to blk-mq.

^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: stec skd block driver needs updating for immutable biovec
  2013-11-01 15:50 ` Christoph Hellwig
@ 2013-11-01 16:02 ` Jens Axboe
  2013-11-01 16:28   ` Mike Snitzer
  0 siblings, 1 reply; 16+ messages in thread

From: Jens Axboe @ 2013-11-01 16:02 UTC (permalink / raw)
To: Christoph Hellwig, Mike Snitzer; +Cc: Kent Overstreet, linux-kernel

On 11/01/2013 09:50 AM, Christoph Hellwig wrote:
> On Fri, Nov 01, 2013 at 11:43:39AM -0400, Mike Snitzer wrote:
>> All the bi_sector ones are low hanging fruit, but the conversion for
>> skd_preop_sg_list_bio()'s bio_vec code is more involved.
>>
>> Kent, any chance you could crank through it?
>>
>> If not I can come back to trying to fix this later.. but I'm working
>> through a test merge of linux-dm.git's 'for-next' with linux-block.git's
>> 'for-next'.
>
> The right thing for 3.13 is to rip out the bio base code path, and
> for 3.14 to convert it to blk-mq.

It is. I will kill it.

--
Jens Axboe

^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: stec skd block driver needs updating for immutable biovec
  2013-11-01 16:02 ` Jens Axboe
@ 2013-11-01 16:28 ` Mike Snitzer
  2013-11-01 16:34   ` Jens Axboe
  0 siblings, 1 reply; 16+ messages in thread

From: Mike Snitzer @ 2013-11-01 16:28 UTC (permalink / raw)
To: Jens Axboe; +Cc: Christoph Hellwig, Kent Overstreet, linux-kernel

On Fri, Nov 01 2013 at 12:02pm -0400,
Jens Axboe <axboe@kernel.dk> wrote:

> On 11/01/2013 09:50 AM, Christoph Hellwig wrote:
> > On Fri, Nov 01, 2013 at 11:43:39AM -0400, Mike Snitzer wrote:
> >> All the bi_sector ones are low hanging fruit, but the conversion for
> >> skd_preop_sg_list_bio()'s bio_vec code is more involved.
> >>
> >> Kent, any chance you could crank through it?
> >>
> >> If not I can come back to trying to fix this later.. but I'm working
> >> through a test merge of linux-dm.git's 'for-next' with linux-block.git's
> >> 'for-next'.
> >
> > The right thing for 3.13 is to rip out the bio base code path, and
> > for 3.14 to convert it to blk-mq.
>
> It is. I will kill it.

I just cranked through it.. hope this helps (think I got everything but
may have missed something):

From: Mike Snitzer <snitzer@redhat.com>
Subject: [PATCH] skd: remove all the bio-based code

We'll want to convert this driver to blk-mq in the near-term.
Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com> --- drivers/block/skd_main.c | 515 ++++++---------------------------------------- 1 files changed, 62 insertions(+), 453 deletions(-) diff --git a/drivers/block/skd_main.c b/drivers/block/skd_main.c index 1a8717f..d259b1a 100644 --- a/drivers/block/skd_main.c +++ b/drivers/block/skd_main.c @@ -214,8 +214,6 @@ struct skd_request_context { u32 fitmsg_id; struct request *req; - struct bio *bio; - unsigned long start_time; u8 flush_cmd; u8 discard_page; @@ -357,9 +355,6 @@ struct skd_device { struct work_struct completion_worker; - struct bio_list bio_queue; - int queue_stopped; - struct list_head flush_list; }; @@ -470,11 +465,6 @@ MODULE_PARM_DESC(skd_dbg_level, "s1120 debug level (0,1,2)"); module_param(skd_isr_comp_limit, int, 0444); MODULE_PARM_DESC(skd_isr_comp_limit, "s1120 isr comp limit (0=none) default=4"); -static int skd_bio; -module_param(skd_bio, int, 0444); -MODULE_PARM_DESC(skd_bio, - "Register as a bio device instead of block (0, 1) default=0"); - /* Major device number dynamically assigned. */ static u32 skd_major; @@ -512,11 +502,6 @@ static void skd_log_skmsg(struct skd_device *skdev, static void skd_log_skreq(struct skd_device *skdev, struct skd_request_context *skreq, const char *event); -/* FLUSH FUA flag handling. 
*/ -static int skd_flush_cmd_enqueue(struct skd_device *, void *); -static void *skd_flush_cmd_dequeue(struct skd_device *); - - /* ***************************************************************************** * READ/WRITE REQUESTS @@ -524,37 +509,22 @@ static void *skd_flush_cmd_dequeue(struct skd_device *); */ static void skd_stop_queue(struct skd_device *skdev) { - if (!skd_bio) - blk_stop_queue(skdev->queue); - else - skdev->queue_stopped = 1; + blk_stop_queue(skdev->queue); } static void skd_unstop_queue(struct skd_device *skdev) { - if (!skd_bio) - queue_flag_clear(QUEUE_FLAG_STOPPED, skdev->queue); - else - skdev->queue_stopped = 0; + queue_flag_clear(QUEUE_FLAG_STOPPED, skdev->queue); } static void skd_start_queue(struct skd_device *skdev) { - if (!skd_bio) { - blk_start_queue(skdev->queue); - } else { - pr_err("(%s): Starting queue\n", skd_name(skdev)); - skdev->queue_stopped = 0; - skd_request_fn(skdev->queue); - } + blk_start_queue(skdev->queue); } static int skd_queue_stopped(struct skd_device *skdev) { - if (!skd_bio) - return blk_queue_stopped(skdev->queue); - else - return skdev->queue_stopped; + return blk_queue_stopped(skdev->queue); } static void skd_fail_all_pending_blk(struct skd_device *skdev) @@ -571,40 +541,9 @@ static void skd_fail_all_pending_blk(struct skd_device *skdev) } } -static void skd_fail_all_pending_bio(struct skd_device *skdev) -{ - struct bio *bio; - int error = -EIO; - - for (;; ) { - bio = bio_list_pop(&skdev->bio_queue); - - if (bio == NULL) - break; - - bio_endio(bio, error); - } -} - static void skd_fail_all_pending(struct skd_device *skdev) { - if (!skd_bio) - skd_fail_all_pending_blk(skdev); - else - skd_fail_all_pending_bio(skdev); -} - -static void skd_make_request(struct request_queue *q, struct bio *bio) -{ - struct skd_device *skdev = q->queuedata; - unsigned long flags; - - spin_lock_irqsave(&skdev->lock, flags); - - bio_list_add(&skdev->bio_queue, bio); - skd_request_fn(skdev->queue); - - 
spin_unlock_irqrestore(&skdev->lock, flags); + skd_fail_all_pending_blk(skdev); } static void @@ -667,18 +606,9 @@ skd_prep_discard_cdb(struct skd_scsi_request *scsi_req, put_unaligned_be64(lba, &buf[8]); put_unaligned_be32(count, &buf[16]); - if (!skd_bio) { - req = skreq->req; - blk_add_request_payload(req, page, len); - req->buffer = buf; - } else { - skreq->bio->bi_io_vec->bv_page = page; - skreq->bio->bi_io_vec->bv_offset = 0; - skreq->bio->bi_io_vec->bv_len = len; - - skreq->bio->bi_vcnt = 1; - skreq->bio->bi_phys_segments = 1; - } + req = skreq->req; + blk_add_request_payload(req, page, len); + req->buffer = buf; } static void skd_request_fn_not_online(struct request_queue *q); @@ -690,7 +620,6 @@ static void skd_request_fn(struct request_queue *q) struct fit_msg_hdr *fmh = NULL; struct skd_request_context *skreq; struct request *req = NULL; - struct bio *bio = NULL; struct skd_scsi_request *scsi_req; struct page *page; unsigned long io_flags; @@ -732,60 +661,27 @@ static void skd_request_fn(struct request_queue *q) flush = fua = 0; - if (!skd_bio) { - req = blk_peek_request(q); - - /* Are there any native requests to start? */ - if (req == NULL) - break; - - lba = (u32)blk_rq_pos(req); - count = blk_rq_sectors(req); - data_dir = rq_data_dir(req); - io_flags = req->cmd_flags; - - if (io_flags & REQ_FLUSH) - flush++; - - if (io_flags & REQ_FUA) - fua++; - - pr_debug("%s:%s:%d new req=%p lba=%u(0x%x) " - "count=%u(0x%x) dir=%d\n", - skdev->name, __func__, __LINE__, - req, lba, lba, count, count, data_dir); - } else { - if (!list_empty(&skdev->flush_list)) { - /* Process data part of FLUSH request. */ - bio = (struct bio *)skd_flush_cmd_dequeue(skdev); - flush++; - pr_debug("%s:%s:%d processing FLUSH request with data.\n", - skdev->name, __func__, __LINE__); - } else { - /* peek at our bio queue */ - bio = bio_list_peek(&skdev->bio_queue); - } + req = blk_peek_request(q); - /* Are there any native requests to start? 
*/ - if (bio == NULL) - break; + /* Are there any native requests to start? */ + if (req == NULL) + break; - lba = (u32)bio->bi_sector; - count = bio_sectors(bio); - data_dir = bio_data_dir(bio); - io_flags = bio->bi_rw; + lba = (u32)blk_rq_pos(req); + count = blk_rq_sectors(req); + data_dir = rq_data_dir(req); + io_flags = req->cmd_flags; - pr_debug("%s:%s:%d new bio=%p lba=%u(0x%x) " - "count=%u(0x%x) dir=%d\n", - skdev->name, __func__, __LINE__, - bio, lba, lba, count, count, data_dir); + if (io_flags & REQ_FLUSH) + flush++; - if (io_flags & REQ_FLUSH) - flush++; + if (io_flags & REQ_FUA) + fua++; - if (io_flags & REQ_FUA) - fua++; - } + pr_debug("%s:%s:%d new req=%p lba=%u(0x%x) " + "count=%u(0x%x) dir=%d\n", + skdev->name, __func__, __LINE__, + req, lba, lba, count, count, data_dir); /* At this point we know there is a request * (from our bio q or req q depending on the way @@ -825,29 +721,15 @@ static void skd_request_fn(struct request_queue *q) skreq->discard_page = 0; /* - * OK to now dequeue request from either bio or q. + * OK to now dequeue request from q. * * At this point we are comitted to either start or reject * the native request. Note that skd_request_context is * available but is still at the head of the free list. */ - if (!skd_bio) { - blk_start_request(req); - skreq->req = req; - skreq->fitmsg_id = 0; - } else { - if (unlikely(flush == SKD_FLUSH_DATA_SECOND)) { - skreq->bio = bio; - } else { - skreq->bio = bio_list_pop(&skdev->bio_queue); - SKD_ASSERT(skreq->bio == bio); - skreq->start_time = jiffies; - part_inc_in_flight(&skdev->disk->part0, - bio_data_dir(bio)); - } - - skreq->fitmsg_id = 0; - } + blk_start_request(req); + skreq->req = req; + skreq->fitmsg_id = 0; /* Either a FIT msg is in progress or we have to start one. 
*/ if (skmsg == NULL) { @@ -923,8 +805,7 @@ static void skd_request_fn(struct request_queue *q) if (fua) scsi_req->cdb[1] |= SKD_FUA_NV; - if ((!skd_bio && !req->bio) || - (skd_bio && flush == SKD_FLUSH_ZERO_SIZE_FIRST)) + if (!req->bio) goto skip_sg; error = skd_preop_sg_list(skdev, skreq); @@ -1011,8 +892,7 @@ skip_sg: * If req is non-NULL it means there is something to do but * we are out of a resource. */ - if (((!skd_bio) && req) || - ((skd_bio) && bio_list_peek(&skdev->bio_queue))) + if (req) skd_stop_queue(skdev); } @@ -1124,184 +1004,22 @@ static void skd_postop_sg_list_blk(struct skd_device *skdev, pci_unmap_sg(skdev->pdev, &skreq->sg[0], skreq->n_sg, pci_dir); } -static void skd_end_request_bio(struct skd_device *skdev, - struct skd_request_context *skreq, int error) -{ - struct bio *bio = skreq->bio; - int rw = bio_data_dir(bio); - unsigned long io_flags = bio->bi_rw; - - if ((io_flags & REQ_DISCARD) && - (skreq->discard_page == 1)) { - pr_debug("%s:%s:%d biomode: skd_end_request: freeing DISCARD page.\n", - skdev->name, __func__, __LINE__); - free_page((unsigned long)page_address(bio->bi_io_vec->bv_page)); - } - - if (unlikely(error)) { - u32 lba = (u32)skreq->bio->bi_sector; - u32 count = bio_sectors(skreq->bio); - char *cmd = (rw == WRITE) ? 
"write" : "read"; - pr_err("(%s): Error cmd=%s sect=%u count=%u id=0x%x\n", - skd_name(skdev), cmd, lba, count, skreq->id); - } - { - int cpu = part_stat_lock(); - - if (likely(!error)) { - part_stat_inc(cpu, &skdev->disk->part0, ios[rw]); - part_stat_add(cpu, &skdev->disk->part0, sectors[rw], - bio_sectors(bio)); - } - part_stat_add(cpu, &skdev->disk->part0, ticks[rw], - jiffies - skreq->start_time); - part_dec_in_flight(&skdev->disk->part0, rw); - part_stat_unlock(); - } - - pr_debug("%s:%s:%d id=0x%x error=%d\n", - skdev->name, __func__, __LINE__, skreq->id, error); - - bio_endio(skreq->bio, error); -} - -static int skd_preop_sg_list_bio(struct skd_device *skdev, - struct skd_request_context *skreq) -{ - struct bio *bio = skreq->bio; - int writing = skreq->sg_data_dir == SKD_DATA_DIR_HOST_TO_CARD; - int pci_dir = writing ? PCI_DMA_TODEVICE : PCI_DMA_FROMDEVICE; - int n_sg; - int i; - struct bio_vec *vec; - struct fit_sg_descriptor *sgd; - u64 dma_addr; - u32 count; - int errs = 0; - unsigned int io_flags = 0; - io_flags |= bio->bi_rw; - - skreq->sg_byte_count = 0; - n_sg = skreq->n_sg = skreq->bio->bi_vcnt; - - if (n_sg <= 0) - return -EINVAL; - - if (n_sg > skdev->sgs_per_request) { - pr_err("(%s): sg overflow n=%d\n", - skd_name(skdev), n_sg); - skreq->n_sg = 0; - return -EIO; - } - - for (i = 0; i < skreq->n_sg; i++) { - vec = bio_iovec_idx(bio, i); - dma_addr = pci_map_page(skdev->pdev, - vec->bv_page, - vec->bv_offset, vec->bv_len, pci_dir); - count = vec->bv_len; - - if (count == 0 || count > 64u * 1024u || (count & 3) != 0 - || (dma_addr & 3) != 0) { - pr_err( - "(%s): Bad sg ix=%d count=%d addr=0x%llx\n", - skd_name(skdev), i, count, dma_addr); - errs++; - } - - sgd = &skreq->sksg_list[i]; - - sgd->control = FIT_SGD_CONTROL_NOT_LAST; - sgd->byte_count = vec->bv_len; - skreq->sg_byte_count += vec->bv_len; - sgd->host_side_addr = dma_addr; - sgd->dev_side_addr = 0; /* not used */ - } - - skreq->sksg_list[n_sg - 1].next_desc_ptr = 0LL; - 
skreq->sksg_list[n_sg - 1].control = FIT_SGD_CONTROL_LAST; - - - if (!(io_flags & REQ_DISCARD)) { - count = bio_sectors(bio) << 9u; - if (count != skreq->sg_byte_count) { - pr_err("(%s): mismatch count sg=%d req=%d\n", - skd_name(skdev), skreq->sg_byte_count, count); - errs++; - } - } - - if (unlikely(skdev->dbg_level > 1)) { - pr_debug("%s:%s:%d skreq=%x sksg_list=%p sksg_dma=%llx\n", - skdev->name, __func__, __LINE__, - skreq->id, skreq->sksg_list, skreq->sksg_dma_address); - for (i = 0; i < n_sg; i++) { - struct fit_sg_descriptor *sgd = &skreq->sksg_list[i]; - pr_debug("%s:%s:%d sg[%d] count=%u ctrl=0x%x " - "addr=0x%llx next=0x%llx\n", - skdev->name, __func__, __LINE__, - i, sgd->byte_count, sgd->control, - sgd->host_side_addr, sgd->next_desc_ptr); - } - } - - if (errs != 0) { - skd_postop_sg_list(skdev, skreq); - skreq->n_sg = 0; - return -EIO; - } - - return 0; -} - static int skd_preop_sg_list(struct skd_device *skdev, struct skd_request_context *skreq) { - if (!skd_bio) - return skd_preop_sg_list_blk(skdev, skreq); - else - return skd_preop_sg_list_bio(skdev, skreq); -} - -static void skd_postop_sg_list_bio(struct skd_device *skdev, - struct skd_request_context *skreq) -{ - int writing = skreq->sg_data_dir == SKD_DATA_DIR_HOST_TO_CARD; - int pci_dir = writing ? PCI_DMA_TODEVICE : PCI_DMA_FROMDEVICE; - int i; - struct fit_sg_descriptor *sgd; - - /* - * restore the next ptr for next IO request so we - * don't have to set it every time. 
- */ - skreq->sksg_list[skreq->n_sg - 1].next_desc_ptr = - skreq->sksg_dma_address + - ((skreq->n_sg) * sizeof(struct fit_sg_descriptor)); - - for (i = 0; i < skreq->n_sg; i++) { - sgd = &skreq->sksg_list[i]; - pci_unmap_page(skdev->pdev, sgd->host_side_addr, - sgd->byte_count, pci_dir); - } + return skd_preop_sg_list_blk(skdev, skreq); } static void skd_postop_sg_list(struct skd_device *skdev, struct skd_request_context *skreq) { - if (!skd_bio) - skd_postop_sg_list_blk(skdev, skreq); - else - skd_postop_sg_list_bio(skdev, skreq); + skd_postop_sg_list_blk(skdev, skreq); } static void skd_end_request(struct skd_device *skdev, struct skd_request_context *skreq, int error) { - if (likely(!skd_bio)) - skd_end_request_blk(skdev, skreq, error); - else - skd_end_request_bio(skdev, skreq, error); + skd_end_request_blk(skdev, skreq, error); } static void skd_request_fn_not_online(struct request_queue *q) @@ -2754,13 +2472,10 @@ static void skd_resolve_req_exception(struct skd_device *skdev, break; case SKD_CHECK_STATUS_REQUEUE_REQUEST: - if (!skd_bio) { - if ((unsigned long) ++skreq->req->special < - SKD_MAX_RETRIES) { - skd_log_skreq(skdev, skreq, "retry"); - skd_requeue_request(skdev, skreq); - break; - } + if ((unsigned long) ++skreq->req->special < SKD_MAX_RETRIES) { + skd_log_skreq(skdev, skreq, "retry"); + skd_requeue_request(skdev, skreq); + break; } /* fall through to report error */ @@ -2774,16 +2489,9 @@ static void skd_resolve_req_exception(struct skd_device *skdev, static void skd_requeue_request(struct skd_device *skdev, struct skd_request_context *skreq) { - if (!skd_bio) { - blk_requeue_request(skdev->queue, skreq->req); - } else { - bio_list_add_head(&skdev->bio_queue, skreq->bio); - skreq->bio = NULL; - } + blk_requeue_request(skdev->queue, skreq->req); } - - /* assume spinlock is already held */ static void skd_release_skreq(struct skd_device *skdev, struct skd_request_context *skreq) @@ -2840,11 +2548,7 @@ static void skd_release_skreq(struct skd_device 
*skdev, /* * Reset backpointer */ - if (likely(!skd_bio)) - skreq->req = NULL; - else - skreq->bio = NULL; - + skreq->req = NULL; /* * Reclaim the skd_request_context @@ -3084,8 +2788,6 @@ static int skd_isr_completion_posted(struct skd_device *skdev, u32 cmp_bytes = 0; int rc = 0; int processed = 0; - int ret; - for (;; ) { SKD_ASSERT(skdev->skcomp_ix < SKD_N_COMPLETION_ENTRY); @@ -3180,8 +2882,7 @@ static int skd_isr_completion_posted(struct skd_device *skdev, if (skreq->n_sg > 0) skd_postop_sg_list(skdev, skreq); - if (((!skd_bio) && !skreq->req) || - ((skd_bio) && !skreq->bio)) { + if (!skreq->req) { pr_debug("%s:%s:%d NULL backptr skdreq %p, " "req=0x%x req_id=0x%x\n", skdev->name, __func__, __LINE__, @@ -3191,30 +2892,10 @@ static int skd_isr_completion_posted(struct skd_device *skdev, * Capture the outcome and post it back to the * native request. */ - if (likely(cmp_status == SAM_STAT_GOOD)) { - if (unlikely(skreq->flush_cmd)) { - if (skd_bio) { - /* if empty size bio, we are all done */ - if (bio_sectors(skreq->bio) == 0) { - skd_end_request(skdev, skreq, 0); - } else { - ret = skd_flush_cmd_enqueue(skdev, (void *)skreq->bio); - if (ret != 0) { - pr_err("Failed to enqueue flush bio with Data. Err=%d.\n", ret); - skd_end_request(skdev, skreq, ret); - } else { - ((*enqueued)++); - } - } - } else { - skd_end_request(skdev, skreq, 0); - } - } else { - skd_end_request(skdev, skreq, 0); - } - } else { + if (likely(cmp_status == SAM_STAT_GOOD)) + skd_end_request(skdev, skreq, 0); + else skd_resolve_req_exception(skdev, skreq); - } } /* @@ -3645,34 +3326,22 @@ static void skd_recover_requests(struct skd_device *skdev, int requeue) skd_log_skreq(skdev, skreq, "recover"); SKD_ASSERT((skreq->id & SKD_ID_INCR) != 0); - if (!skd_bio) - SKD_ASSERT(skreq->req != NULL); - else - SKD_ASSERT(skreq->bio != NULL); + SKD_ASSERT(skreq->req != NULL); /* Release DMA resources for the request. 
*/ if (skreq->n_sg > 0) skd_postop_sg_list(skdev, skreq); - if (!skd_bio) { - if (requeue && - (unsigned long) ++skreq->req->special < - SKD_MAX_RETRIES) - skd_requeue_request(skdev, skreq); - else - skd_end_request(skdev, skreq, -EIO); - } else + if (requeue && + (unsigned long) ++skreq->req->special < SKD_MAX_RETRIES) + skd_requeue_request(skdev, skreq); + else skd_end_request(skdev, skreq, -EIO); - if (!skd_bio) - skreq->req = NULL; - else - skreq->bio = NULL; + skreq->req = NULL; skreq->state = SKD_REQ_STATE_IDLE; skreq->id += SKD_ID_INCR; - - } if (i > 0) skreq[-1].next = skreq; @@ -4580,10 +4249,6 @@ static struct skd_device *skd_construct(struct pci_dev *pdev) skdev->sgs_per_request = skd_sgs_per_request; skdev->dbg_level = skd_dbg_level; - if (skd_bio) - bio_list_init(&skdev->bio_queue); - - atomic_set(&skdev->device_count, 0); spin_lock_init(&skdev->lock); @@ -4941,13 +4606,7 @@ static int skd_cons_disk(struct skd_device *skdev) disk->fops = &skd_blockdev_ops; disk->private_data = skdev; - if (!skd_bio) { - q = blk_init_queue(skd_request_fn, &skdev->lock); - } else { - q = blk_alloc_queue(GFP_KERNEL); - q->queue_flags = QUEUE_FLAG_IO_STAT | QUEUE_FLAG_STACKABLE; - } - + q = blk_init_queue(skd_request_fn, &skdev->lock); if (!q) { rc = -ENOMEM; goto err_out; @@ -4957,11 +4616,6 @@ static int skd_cons_disk(struct skd_device *skdev) disk->queue = q; q->queuedata = skdev; - if (skd_bio) { - q->queue_lock = &skdev->lock; - blk_queue_make_request(q, skd_make_request); - } - blk_queue_flush(q, REQ_FLUSH | REQ_FUA); blk_queue_max_segments(q, skdev->sgs_per_request); blk_queue_max_hw_sectors(q, SKD_N_MAX_SECTORS); @@ -5794,35 +5448,19 @@ static void skd_log_skreq(struct skd_device *skdev, skdev->name, __func__, __LINE__, skreq->timeout_stamp, skreq->sg_data_dir, skreq->n_sg); - if (!skd_bio) { - if (skreq->req != NULL) { - struct request *req = skreq->req; - u32 lba = (u32)blk_rq_pos(req); - u32 count = blk_rq_sectors(req); - - pr_debug("%s:%s:%d " - "req=%p 
lba=%u(0x%x) count=%u(0x%x) dir=%d\n", - skdev->name, __func__, __LINE__, - req, lba, lba, count, count, - (int)rq_data_dir(req)); - } else - pr_debug("%s:%s:%d req=NULL\n", - skdev->name, __func__, __LINE__); - } else { - if (skreq->bio != NULL) { - struct bio *bio = skreq->bio; - u32 lba = (u32)bio->bi_sector; - u32 count = bio_sectors(bio); + if (skreq->req != NULL) { + struct request *req = skreq->req; + u32 lba = (u32)blk_rq_pos(req); + u32 count = blk_rq_sectors(req); - pr_debug("%s:%s:%d " - "bio=%p lba=%u(0x%x) count=%u(0x%x) dir=%d\n", - skdev->name, __func__, __LINE__, - bio, lba, lba, count, count, - (int)bio_data_dir(bio)); - } else - pr_debug("%s:%s:%d req=NULL\n", - skdev->name, __func__, __LINE__); - } + pr_debug("%s:%s:%d " + "req=%p lba=%u(0x%x) count=%u(0x%x) dir=%d\n", + skdev->name, __func__, __LINE__, + req, lba, lba, count, count, + (int)rq_data_dir(req)); + } else + pr_debug("%s:%s:%d req=NULL\n", + skdev->name, __func__, __LINE__); } /* @@ -5918,34 +5556,5 @@ static void __exit skd_exit(void) kmem_cache_destroy(skd_flush_slab); } -static int -skd_flush_cmd_enqueue(struct skd_device *skdev, void *cmd) -{ - struct skd_flush_cmd *item; - - item = kmem_cache_zalloc(skd_flush_slab, GFP_ATOMIC); - if (!item) { - pr_err("skd_flush_cmd_enqueue: Failed to allocated item.\n"); - return -ENOMEM; - } - - item->cmd = cmd; - list_add_tail(&item->flist, &skdev->flush_list); - return 0; -} - -static void * -skd_flush_cmd_dequeue(struct skd_device *skdev) -{ - void *cmd; - struct skd_flush_cmd *item; - - item = list_entry(skdev->flush_list.next, struct skd_flush_cmd, flist); - list_del_init(&item->flist); - cmd = item->cmd; - kmem_cache_free(skd_flush_slab, item); - return cmd; -} - module_init(skd_init); module_exit(skd_exit); ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: stec skd block driver needs updating for immutable biovec
  2013-11-01 16:28 ` Mike Snitzer
@ 2013-11-01 16:34 ` Jens Axboe
  2013-11-01 17:46   ` Mike Snitzer
  2013-11-04 11:25   ` Bartlomiej Zolnierkiewicz
  0 siblings, 2 replies; 16+ messages in thread

From: Jens Axboe @ 2013-11-01 16:34 UTC (permalink / raw)
To: Mike Snitzer; +Cc: Christoph Hellwig, Kent Overstreet, linux-kernel

On 11/01/2013 10:28 AM, Mike Snitzer wrote:
> On Fri, Nov 01 2013 at 12:02pm -0400,
> Jens Axboe <axboe@kernel.dk> wrote:
>
>> On 11/01/2013 09:50 AM, Christoph Hellwig wrote:
>>> On Fri, Nov 01, 2013 at 11:43:39AM -0400, Mike Snitzer wrote:
>>>> All the bi_sector ones are low hanging fruit, but the conversion for
>>>> skd_preop_sg_list_bio()'s bio_vec code is more involved.
>>>>
>>>> Kent, any chance you could crank through it?
>>>>
>>>> If not I can come back to trying to fix this later.. but I'm working
>>>> through a test merge of linux-dm.git's 'for-next' with linux-block.git's
>>>> 'for-next'.
>>>
>>> The right thing for 3.13 is to rip out the bio base code path, and
>>> for 3.14 to convert it to blk-mq.
>>
>> It is. I will kill it.
>
> I just cranked through it.. hope this helps (think I got everything but
> may have missed something):

You lost out, I committed it 20 min ago :-0

http://git.kernel.dk/?p=linux-block.git;a=commit;h=1d36f7a5fb577afaaead6c5e2fc8e01e0c95235d

Looks like you missed some of the skd_device removal, while I neglected
killing bio/start_time in the skd_request_context.

--
Jens Axboe

^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: stec skd block driver needs updating for immutable biovec
  2013-11-01 16:34 ` Jens Axboe
@ 2013-11-01 17:46 ` Mike Snitzer
  2013-11-04 11:25 ` Bartlomiej Zolnierkiewicz
  0 siblings, 0 replies; 16+ messages in thread

From: Mike Snitzer @ 2013-11-01 17:46 UTC (permalink / raw)
To: Jens Axboe; +Cc: Christoph Hellwig, Kent Overstreet, linux-kernel

On Fri, Nov 01 2013 at 12:34pm -0400,
Jens Axboe <axboe@kernel.dk> wrote:

> On 11/01/2013 10:28 AM, Mike Snitzer wrote:
> > On Fri, Nov 01 2013 at 12:02pm -0400,
> > Jens Axboe <axboe@kernel.dk> wrote:
> >
> >> On 11/01/2013 09:50 AM, Christoph Hellwig wrote:
> >>> On Fri, Nov 01, 2013 at 11:43:39AM -0400, Mike Snitzer wrote:
> >>>> All the bi_sector ones are low hanging fruit, but the conversion for
> >>>> skd_preop_sg_list_bio()'s bio_vec code is more involved.
> >>>>
> >>>> Kent, any chance you could crank through it?
> >>>>
> >>>> If not I can come back to trying to fix this later.. but I'm working
> >>>> through a test merge of linux-dm.git's 'for-next' with linux-block.git's
> >>>> 'for-next'.
> >>>
> >>> The right thing for 3.13 is to rip out the bio base code path, and
> >>> for 3.14 to convert it to blk-mq.
> >>
> >> It is. I will kill it.
> >
> > I just cranked through it.. hope this helps (think I got everything but
> > may have missed something):
>
> You lost out, I committed it 20 min ago :-0

Cool, it's OK, I was still able to get the fulfilling experience of
killing a bunch of code... made my day.

^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: stec skd block driver needs updating for immutable biovec
  2013-11-01 16:34 ` Jens Axboe
  2013-11-01 17:46 ` Mike Snitzer
@ 2013-11-04 11:25 ` Bartlomiej Zolnierkiewicz
  1 sibling, 0 replies; 16+ messages in thread

From: Bartlomiej Zolnierkiewicz @ 2013-11-04 11:25 UTC (permalink / raw)
To: Jens Axboe; +Cc: Mike Snitzer, Christoph Hellwig, Kent Overstreet, linux-kernel

Hi,

On Friday, November 01, 2013 10:34:23 AM Jens Axboe wrote:
> On 11/01/2013 10:28 AM, Mike Snitzer wrote:
> > On Fri, Nov 01 2013 at 12:02pm -0400,
> > Jens Axboe <axboe@kernel.dk> wrote:
> >
> >> On 11/01/2013 09:50 AM, Christoph Hellwig wrote:
> >>> On Fri, Nov 01, 2013 at 11:43:39AM -0400, Mike Snitzer wrote:
> >>>> All the bi_sector ones are low hanging fruit, but the conversion for
> >>>> skd_preop_sg_list_bio()'s bio_vec code is more involved.
> >>>>
> >>>> Kent, any chance you could crank through it?
> >>>>
> >>>> If not I can come back to trying to fix this later.. but I'm working
> >>>> through a test merge of linux-dm.git's 'for-next' with linux-block.git's
> >>>> 'for-next'.
> >>>
> >>> The right thing for 3.13 is to rip out the bio base code path, and
> >>> for 3.14 to convert it to blk-mq.
> >>
> >> It is. I will kill it.
> >
> > I just cranked through it.. hope this helps (think I got everything but
> > may have missed something):
>
> You lost out, I committed it 20 min ago :-0
>
> http://git.kernel.dk/?p=linux-block.git;a=commit;h=1d36f7a5fb577afaaead6c5e2fc8e01e0c95235d

I did the complete bio path removal a month ago in my skd patchset:

https://lkml.org/lkml/2013/9/30/279

and the change has been agreed on by Ramprasad C from STEC:

https://lkml.org/lkml/2013/10/3/339

> Looks like you missed some of the skd_device removal, while I neglected
> killing bio/start_time in the skd_request_context.

Could you please apply my patchset before doing new work on skd driver?

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung R&D Institute Poland
Samsung Electronics

^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH for-next] dm: fix missing bi_remaining accounting 2013-11-01 15:03 ` Jens Axboe 2013-11-01 15:43 ` stec skd block driver needs updating for immutable biovec Mike Snitzer @ 2013-11-04 15:06 ` Mikulas Patocka 2013-11-04 15:20 ` Mike Snitzer 2013-11-04 20:12 ` Kent Overstreet 1 sibling, 2 replies; 16+ messages in thread From: Mikulas Patocka @ 2013-11-04 15:06 UTC (permalink / raw) To: Jens Axboe Cc: Mike Snitzer, Kent Overstreet, dm-devel, linux-kernel, Alasdair Kergon

On Fri, 1 Nov 2013, Jens Axboe wrote:

> On 11/01/2013 07:59 AM, Mike Snitzer wrote:
> > Add the missing bi_remaining increment, required by the block layer's
> > new bio-chaining code, to both the verity and old snapshot DM targets.
> >
> > Otherwise users will hit the bi_remaining <= 0 BUG_ON in bio_endio().
>
> Thanks Mike, added to the mix.
>
> --
> Jens Axboe

Hi

This improves a little bit on the previous patch, by replacing costly
atomic_inc with cheap atomic_set.

From: Mikulas Patocka <mpatocka@redhat.com>

dm: change atomic_inc to atomic_set(1)

There are places in dm where we save bi_endio and bi_private, set them to
target's routine, submit the bio, from the target's bi_endio routine we
restore bi_endio and bi_private and end the bio with bi_endio.

This causes underflow of bi_remaining, so we must restore bi_remaining
before ending the bio from the target bi_endio routine.

The code uses atomic_inc for restoration of bi_remaining. This patch
changes it to atomic_set(1) to avoid an interlocked instruction. In the
target's bi_endio routine we are sure that bi_remaining is zero
(otherwise, the bi_endio routine wouldn't be called) and there are no
concurrent users of the bio, so we can replace atomic_inc with
atomic_set(1).

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/md/dm-cache-target.c | 2 +-
 drivers/md/dm-snap.c         | 2 +-
 drivers/md/dm-thin.c         | 4 ++--
 drivers/md/dm-verity.c       | 2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

Index: linux-block/drivers/md/dm-cache-target.c
===================================================================
--- linux-block.orig/drivers/md/dm-cache-target.c	2013-11-01 17:21:11.000000000 +0100
+++ linux-block/drivers/md/dm-cache-target.c	2013-11-04 13:59:13.000000000 +0100
@@ -670,7 +670,7 @@ static void writethrough_endio(struct bi
 	 * Must bump bi_remaining to allow bio to complete with
 	 * restored bi_end_io.
 	 */
-	atomic_inc(&bio->bi_remaining);
+	atomic_set(&bio->bi_remaining, 1);
 
 	if (err) {
 		bio_endio(bio, err);
Index: linux-block/drivers/md/dm-snap.c
===================================================================
--- linux-block.orig/drivers/md/dm-snap.c	2013-11-01 17:21:11.000000000 +0100
+++ linux-block/drivers/md/dm-snap.c	2013-11-04 13:59:56.000000000 +0100
@@ -1415,7 +1415,7 @@ out:
 	if (full_bio) {
 		full_bio->bi_end_io = pe->full_bio_end_io;
 		full_bio->bi_private = pe->full_bio_private;
-		atomic_inc(&full_bio->bi_remaining);
+		atomic_set(&full_bio->bi_remaining, 1);
 	}
 	free_pending_exception(pe);
Index: linux-block/drivers/md/dm-thin.c
===================================================================
--- linux-block.orig/drivers/md/dm-thin.c	2013-11-01 17:21:11.000000000 +0100
+++ linux-block/drivers/md/dm-thin.c	2013-11-04 14:00:19.000000000 +0100
@@ -613,7 +613,7 @@ static void process_prepared_mapping_fai
 {
 	if (m->bio) {
 		m->bio->bi_end_io = m->saved_bi_end_io;
-		atomic_inc(&m->bio->bi_remaining);
+		atomic_set(&m->bio->bi_remaining, 1);
 	}
 	cell_error(m->tc->pool, m->cell);
 	list_del(&m->list);
@@ -630,7 +630,7 @@ static void process_prepared_mapping(str
 	bio = m->bio;
 	if (bio) {
 		bio->bi_end_io = m->saved_bi_end_io;
-		atomic_inc(&bio->bi_remaining);
+		atomic_set(&bio->bi_remaining, 1);
 	}
 
 	if (m->err) {
Index: linux-block/drivers/md/dm-verity.c
===================================================================
--- linux-block.orig/drivers/md/dm-verity.c	2013-11-01 17:21:11.000000000 +0100
+++ linux-block/drivers/md/dm-verity.c	2013-11-04 13:59:28.000000000 +0100
@@ -384,7 +384,7 @@ static void verity_finish_io(struct dm_v
 	bio->bi_end_io = io->orig_bi_end_io;
 	bio->bi_private = io->orig_bi_private;
-	atomic_inc(&bio->bi_remaining);
+	atomic_set(&bio->bi_remaining, 1);
 
 	bio_endio(bio, error);
 }
* Re: [PATCH for-next] dm: fix missing bi_remaining accounting 2013-11-04 15:06 ` [PATCH for-next] dm: fix missing bi_remaining accounting Mikulas Patocka @ 2013-11-04 15:20 ` Mike Snitzer 2013-11-04 15:25 ` Mikulas Patocka 2013-11-04 20:12 ` Kent Overstreet 1 sibling, 1 reply; 16+ messages in thread From: Mike Snitzer @ 2013-11-04 15:20 UTC (permalink / raw) To: Mikulas Patocka Cc: Jens Axboe, Kent Overstreet, dm-devel, linux-kernel, Alasdair Kergon On Mon, Nov 04 2013 at 10:06am -0500, Mikulas Patocka <mpatocka@redhat.com> wrote: > > > On Fri, 1 Nov 2013, Jens Axboe wrote: > > > On 11/01/2013 07:59 AM, Mike Snitzer wrote: > > > Add the missing bi_remaining increment, required by the block layer's > > > new bio-chaining code, to both the verity and old snapshot DM targets. > > > > > > Otherwise users will hit the bi_remaining <= 0 BUG_ON in bio_endio(). > > > > Thanks Mike, added to the mix. > > > > -- > > Jens Axboe > > Hi > > This improves a little bit on the previous patch, by replacing costly > atomic_inc with cheap atomic_set. > > > From: Mikulas Patocka <mpatocka@redhat.com> > > dm: change atomic_inc to atomic_set(1) > > There are places in dm where we save bi_endio and bi_private, set them to > target's routine, submit the bio, from the target's bi_endio routine we > restore bi_endio and bi_private and end the bio with bi_endio. > > This causes underflow of bi_remaining, so we must restore bi_remaining > before ending the bio from the target bi_endio routine. > > The code uses atomic_inc for restoration of bi_remaining. This patch > changes it to atomic_set(1) to avoid an interlocked instruction. In the > target's bi_endio routine we are sure that bi_remaining is zero > (otherwise, the bi_endio routine wouldn't be called) and there are no > concurrent users of the bio, so we can replace atomic_inc with > atomic_set(1). This isn't DM-specific. 
Shouldn't the other places in the tree that use atomic_inc on bi_remaining really be converted at the same time?
* Re: [PATCH for-next] dm: fix missing bi_remaining accounting 2013-11-04 15:20 ` Mike Snitzer @ 2013-11-04 15:25 ` Mikulas Patocka 2013-11-04 16:04 ` Mike Snitzer 0 siblings, 1 reply; 16+ messages in thread From: Mikulas Patocka @ 2013-11-04 15:25 UTC (permalink / raw) To: Mike Snitzer Cc: Jens Axboe, Kent Overstreet, dm-devel, linux-kernel, Alasdair Kergon On Mon, 4 Nov 2013, Mike Snitzer wrote: > On Mon, Nov 04 2013 at 10:06am -0500, > Mikulas Patocka <mpatocka@redhat.com> wrote: > > > > > > > On Fri, 1 Nov 2013, Jens Axboe wrote: > > > > > On 11/01/2013 07:59 AM, Mike Snitzer wrote: > > > > Add the missing bi_remaining increment, required by the block layer's > > > > new bio-chaining code, to both the verity and old snapshot DM targets. > > > > > > > > Otherwise users will hit the bi_remaining <= 0 BUG_ON in bio_endio(). > > > > > > Thanks Mike, added to the mix. > > > > > > -- > > > Jens Axboe > > > > Hi > > > > This improves a little bit on the previous patch, by replacing costly > > atomic_inc with cheap atomic_set. > > > > > > From: Mikulas Patocka <mpatocka@redhat.com> > > > > dm: change atomic_inc to atomic_set(1) > > > > There are places in dm where we save bi_endio and bi_private, set them to > > target's routine, submit the bio, from the target's bi_endio routine we > > restore bi_endio and bi_private and end the bio with bi_endio. > > > > This causes underflow of bi_remaining, so we must restore bi_remaining > > before ending the bio from the target bi_endio routine. > > > > The code uses atomic_inc for restoration of bi_remaining. This patch > > changes it to atomic_set(1) to avoid an interlocked instruction. In the > > target's bi_endio routine we are sure that bi_remaining is zero > > (otherwise, the bi_endio routine wouldn't be called) and there are no > > concurrent users of the bio, so we can replace atomic_inc with > > atomic_set(1). > > This isn't DM-specific. 
Shouldn't the other places in the tree that use > atomic_inc on bi_remaining really be converted at the same time? There is no 'atomic_inc.*bi_remaining' in other drivers. It is just in fs/bio.c in bio_chain and bio_endio_nodec where it's probably needed. Mikulas
* Re: [PATCH for-next] dm: fix missing bi_remaining accounting 2013-11-04 15:25 ` Mikulas Patocka @ 2013-11-04 16:04 ` Mike Snitzer 2013-11-04 17:49 ` Mikulas Patocka 0 siblings, 1 reply; 16+ messages in thread From: Mike Snitzer @ 2013-11-04 16:04 UTC (permalink / raw) To: Mikulas Patocka; +Cc: axboe, kmo, dm-devel, agk, linux-kernel On Mon, Nov 04 2013 at 10:25am -0500, Mikulas Patocka <mpatocka@redhat.com> wrote: > > > On Mon, 4 Nov 2013, Mike Snitzer wrote: > > > On Mon, Nov 04 2013 at 10:06am -0500, > > Mikulas Patocka <mpatocka@redhat.com> wrote: > > > > > > The code uses atomic_inc for restoration of bi_remaining. This patch > > > changes it to atomic_set(1) to avoid an interlocked instruction. In the > > > target's bi_endio routine we are sure that bi_remaining is zero > > > (otherwise, the bi_endio routine wouldn't be called) and there are no > > > concurrent users of the bio, so we can replace atomic_inc with > > > atomic_set(1). > > > > This isn't DM-specific. Shouldn't the other places in the tree that use > > atomic_inc on bi_remaining should really be converted at the same time? > > There is no 'atomic_inc.*bi_remaining' in other drivers. Wrong. I know btrfs has at least one. As does bcache afaik. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH for-next] dm: fix missing bi_remaining accounting 2013-11-04 16:04 ` Mike Snitzer @ 2013-11-04 17:49 ` Mikulas Patocka 2013-11-05 0:56 ` Mike Snitzer 0 siblings, 1 reply; 16+ messages in thread From: Mikulas Patocka @ 2013-11-04 17:49 UTC (permalink / raw) To: Mike Snitzer; +Cc: axboe, kmo, dm-devel, agk, linux-kernel On Mon, 4 Nov 2013, Mike Snitzer wrote: > On Mon, Nov 04 2013 at 10:25am -0500, > Mikulas Patocka <mpatocka@redhat.com> wrote: > > > > > > > On Mon, 4 Nov 2013, Mike Snitzer wrote: > > > > > On Mon, Nov 04 2013 at 10:06am -0500, > > > > Mikulas Patocka <mpatocka@redhat.com> wrote: > > > > > > > > > > The code uses atomic_inc for restoration of bi_remaining. This patch > > > > > changes it to atomic_set(1) to avoid an interlocked instruction. In the > > > > > target's bi_endio routine we are sure that bi_remaining is zero > > > > > (otherwise, the bi_endio routine wouldn't be called) and there are no > > > > > concurrent users of the bio, so we can replace atomic_inc with > > > > > atomic_set(1). > > > > > > > > This isn't DM-specific. Shouldn't the other places in the tree that use > > > > atomic_inc on bi_remaining really be converted at the same time? > > > > > > There is no 'atomic_inc.*bi_remaining' in other drivers. > > > > Wrong. I know btrfs has at least one. As does bcache afaik. grep -r 'atomic_inc.*bi_remaining' * yields no hits in btrfs or bcache (on git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git , branch remotes/origin/for-next). It only finds drivers/md/dm-cache-target.c, drivers/md/dm-verity.c, drivers/md/dm-snap.c, drivers/md/dm-thin.c. Maybe in other git trees there are more cases of this. Mikulas
* Re: [PATCH for-next] dm: fix missing bi_remaining accounting 2013-11-04 17:49 ` Mikulas Patocka @ 2013-11-05 0:56 ` Mike Snitzer 0 siblings, 0 replies; 16+ messages in thread From: Mike Snitzer @ 2013-11-05 0:56 UTC (permalink / raw) To: Mikulas Patocka; +Cc: axboe, kmo, dm-devel, agk, linux-kernel On Mon, Nov 04 2013 at 12:49pm -0500, Mikulas Patocka <mpatocka@redhat.com> wrote: > > > On Mon, 4 Nov 2013, Mike Snitzer wrote: > > > On Mon, Nov 04 2013 at 10:25am -0500, > > Mikulas Patocka <mpatocka@redhat.com> wrote: > > > > > > > > > > > On Mon, 4 Nov 2013, Mike Snitzer wrote: > > > > > > > On Mon, Nov 04 2013 at 10:06am -0500, > > > > Mikulas Patocka <mpatocka@redhat.com> wrote: > > > > > > > > > > The code uses atomic_inc for restoration of bi_remaining. This patch > > > > > changes it to atomic_set(1) to avoid an interlocked instruction. In the > > > > > target's bi_endio routine we are sure that bi_remaining is zero > > > > > (otherwise, the bi_endio routine wouldn't be called) and there are no > > > > > concurrent users of the bio, so we can replace atomic_inc with > > > > > atomic_set(1). > > > > > > > > This isn't DM-specific. Shouldn't the other places in the tree that use > > > > atomic_inc on bi_remaining should really be converted at the same time? > > > > > > There is no 'atomic_inc.*bi_remaining' in other drivers. > > > > Wrong. I know btrfs has at least one. As does bcache afaik. > > grep -r 'atomic_inc.*bi_remaining' * yilds no hits in btrfs or bcache (on > git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git , > branch remotes/origin/for-next). It only finds > drivers/md/dm-cache-target.c, drivers/md/dm-verity.c, > drivers/md/dm-snap.c, drivers/md/dm-thin.c. Maybe in other git trees there > are more cases of this. Here is the btrfs one I was thinking of from Chris: https://patchwork.kernel.org/patch/3123121/ Should probably make its way into linux-block.git, Jens? ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH for-next] dm: fix missing bi_remaining accounting 2013-11-04 15:06 ` [PATCH for-next] dm: fix missing bi_remaining accounting Mikulas Patocka 2013-11-04 15:20 ` Mike Snitzer @ 2013-11-04 20:12 ` Kent Overstreet 1 sibling, 0 replies; 16+ messages in thread From: Kent Overstreet @ 2013-11-04 20:12 UTC (permalink / raw) To: Mikulas Patocka Cc: Jens Axboe, Mike Snitzer, dm-devel, linux-kernel, Alasdair Kergon On Mon, Nov 04, 2013 at 10:06:00AM -0500, Mikulas Patocka wrote: > > > On Fri, 1 Nov 2013, Jens Axboe wrote: > > > On 11/01/2013 07:59 AM, Mike Snitzer wrote: > > > Add the missing bi_remaining increment, required by the block layer's > > > new bio-chaining code, to both the verity and old snapshot DM targets. > > > > > > Otherwise users will hit the bi_remaining <= 0 BUG_ON in bio_endio(). > > > > Thanks Mike, added to the mix. > > > > -- > > Jens Axboe > > Hi > > This improves a little bit on the previous patch, by replacing costly > atomic_inc with cheap atomic_set. IMO, this is a bad idea; the behaviour with this patch does _not_ match the naming of bio_endio_nodec(), and the performance difference should be well in the noise anyways because we're touching a cacheline we already have in cache and won't be contended. The fact that it's currently safe is accidental, I could see this easily tripping people up and being a pain in the ass to debug in the future. > > > From: Mikulas Patocka <mpatocka@redhat.com> > > dm: change atomic_inc to atomic_set(1) > > There are places in dm where we save bi_endio and bi_private, set them to > target's routine, submit the bio, from the target's bi_endio routine we > restore bi_endio and bi_private and end the bio with bi_endio. > > This causes underflow of bi_remaining, so we must restore bi_remaining > before ending the bio from the target bi_endio routine. > > The code uses atomic_inc for restoration of bi_remaining. This patch > changes it to atomic_set(1) to avoid an interlocked instruction. 
In the > target's bi_endio routine we are sure that bi_remaining is zero > (otherwise, the bi_endio routine wouldn't be called) and there are no > concurrent users of the bio, so we can replace atomic_inc with > atomic_set(1).
end of thread, other threads:[~2013-11-05 0:56 UTC | newest]

Thread overview: 16+ messages -- links below jump to the message on this page:
2013-11-01 13:59 [PATCH for-next] dm: fix missing bi_remaining accounting Mike Snitzer
2013-11-01 15:03 ` Jens Axboe
2013-11-01 15:43   ` stec skd block driver needs updating for immutable biovec Mike Snitzer
2013-11-01 15:50     ` Christoph Hellwig
2013-11-01 16:02       ` Jens Axboe
2013-11-01 16:28         ` Mike Snitzer
2013-11-01 16:34           ` Jens Axboe
2013-11-01 17:46             ` Mike Snitzer
2013-11-04 11:25           ` Bartlomiej Zolnierkiewicz
2013-11-04 15:06   ` [PATCH for-next] dm: fix missing bi_remaining accounting Mikulas Patocka
2013-11-04 15:20     ` Mike Snitzer
2013-11-04 15:25       ` Mikulas Patocka
2013-11-04 16:04         ` Mike Snitzer
2013-11-04 17:49           ` Mikulas Patocka
2013-11-05  0:56             ` Mike Snitzer
2013-11-04 20:12     ` Kent Overstreet