* [PATCH for-next] dm: fix missing bi_remaining accounting
@ 2013-11-01 13:59 Mike Snitzer
  2013-11-01 15:03 ` Jens Axboe
  0 siblings, 1 reply; 16+ messages in thread

From: Mike Snitzer @ 2013-11-01 13:59 UTC (permalink / raw)
To: Jens Axboe
Cc: Kent Overstreet, dm-devel, linux-kernel, Alasdair Kergon, Mikulas Patocka

Add the missing bi_remaining increment, required by the block layer's
new bio-chaining code, to both the verity and old snapshot DM targets.

Otherwise users will hit the bi_remaining <= 0 BUG_ON in bio_endio().

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm-snap.c   | 1 +
 drivers/md/dm-verity.c | 1 +
 2 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c
index b7a5053..1b4d22d 100644
--- a/drivers/md/dm-snap.c
+++ b/drivers/md/dm-snap.c
@@ -1415,6 +1415,7 @@ out:
 	if (full_bio) {
 		full_bio->bi_end_io = pe->full_bio_end_io;
 		full_bio->bi_private = pe->full_bio_private;
+		atomic_inc(&full_bio->bi_remaining);
 	}

 	free_pending_exception(pe);

diff --git a/drivers/md/dm-verity.c b/drivers/md/dm-verity.c
index ac35e95..dc15857 100644
--- a/drivers/md/dm-verity.c
+++ b/drivers/md/dm-verity.c
@@ -384,6 +384,7 @@ static void verity_finish_io(struct dm_verity_io *io, int error)

 	bio->bi_end_io = io->orig_bi_end_io;
 	bio->bi_private = io->orig_bi_private;
+	atomic_inc(&bio->bi_remaining);

 	bio_endio(bio, error);
 }

^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH for-next] dm: fix missing bi_remaining accounting
  2013-11-01 13:59 [PATCH for-next] dm: fix missing bi_remaining accounting Mike Snitzer
@ 2013-11-01 15:03 ` Jens Axboe
  2013-11-01 15:43   ` stec skd block driver needs updating for immutable biovec Mike Snitzer
  2013-11-04 15:06   ` [PATCH for-next] dm: fix missing bi_remaining accounting Mikulas Patocka
  0 siblings, 2 replies; 16+ messages in thread

From: Jens Axboe @ 2013-11-01 15:03 UTC (permalink / raw)
To: Mike Snitzer
Cc: Kent Overstreet, dm-devel, linux-kernel, Alasdair Kergon, Mikulas Patocka

On 11/01/2013 07:59 AM, Mike Snitzer wrote:
> Add the missing bi_remaining increment, required by the block layer's
> new bio-chaining code, to both the verity and old snapshot DM targets.
>
> Otherwise users will hit the bi_remaining <= 0 BUG_ON in bio_endio().

Thanks Mike, added to the mix.

--
Jens Axboe

^ permalink raw reply [flat|nested] 16+ messages in thread
* stec skd block driver needs updating for immutable biovec
  2013-11-01 15:03 ` Jens Axboe
@ 2013-11-01 15:43 ` Mike Snitzer
  2013-11-01 15:50   ` Christoph Hellwig
  2013-11-04 15:06   ` [PATCH for-next] dm: fix missing bi_remaining accounting Mikulas Patocka
  1 sibling, 1 reply; 16+ messages in thread

From: Mike Snitzer @ 2013-11-01 15:43 UTC (permalink / raw)
To: Jens Axboe; +Cc: Kent Overstreet, linux-kernel

Hey Jens and Kent,

You're likely already aware of this but building the latest
linux-block.git 'for-next' I get:

  CC [M]  drivers/block/skd_main.o
drivers/block/skd_main.c: In function ‘skd_request_fn’:
drivers/block/skd_main.c:773: error: ‘struct bio’ has no member named ‘bi_sector’
drivers/block/skd_main.c: In function ‘skd_end_request_bio’:
drivers/block/skd_main.c:1142: error: ‘struct bio’ has no member named ‘bi_sector’
drivers/block/skd_main.c: In function ‘skd_preop_sg_list_bio’:
drivers/block/skd_main.c:1198: error: implicit declaration of function ‘bio_iovec_idx’
drivers/block/skd_main.c:1198: warning: assignment makes pointer from integer without a cast
drivers/block/skd_main.c: In function ‘skd_log_skreq’:
drivers/block/skd_main.c:5814: error: ‘struct bio’ has no member named ‘bi_sector’
make[2]: *** [drivers/block/skd_main.o] Error 1
make[1]: *** [drivers/block] Error 2
make: *** [drivers] Error 2

All the bi_sector ones are low hanging fruit, but the conversion for
skd_preop_sg_list_bio()'s bio_vec code is more involved.

Kent, any chance you could crank through it?

If not I can come back to trying to fix this later.. but I'm working
through a test merge of linux-dm.git's 'for-next' with linux-block.git's
'for-next'.

(I happen to have one of these stec cards so I'll be able to test)

Mike

^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: stec skd block driver needs updating for immutable biovec
  2013-11-01 15:43 ` stec skd block driver needs updating for immutable biovec Mike Snitzer
@ 2013-11-01 15:50 ` Christoph Hellwig
  2013-11-01 16:02   ` Jens Axboe
  0 siblings, 1 reply; 16+ messages in thread

From: Christoph Hellwig @ 2013-11-01 15:50 UTC (permalink / raw)
To: Mike Snitzer; +Cc: Jens Axboe, Kent Overstreet, linux-kernel

On Fri, Nov 01, 2013 at 11:43:39AM -0400, Mike Snitzer wrote:
> All the bi_sector ones are low hanging fruit, but the conversion for
> skd_preop_sg_list_bio()'s bio_vec code is more involved.
>
> Kent, any chance you could crank through it?
>
> If not I can come back to trying to fix this later.. but I'm working
> through a test merge of linux-dm.git's 'for-next' with linux-block.git's
> 'for-next'.

The right thing for 3.13 is to rip out the bio base code path, and
for 3.14 to convert it to blk-mq.

^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: stec skd block driver needs updating for immutable biovec
  2013-11-01 15:50 ` Christoph Hellwig
@ 2013-11-01 16:02 ` Jens Axboe
  2013-11-01 16:28   ` Mike Snitzer
  0 siblings, 1 reply; 16+ messages in thread

From: Jens Axboe @ 2013-11-01 16:02 UTC (permalink / raw)
To: Christoph Hellwig, Mike Snitzer; +Cc: Kent Overstreet, linux-kernel

On 11/01/2013 09:50 AM, Christoph Hellwig wrote:
> On Fri, Nov 01, 2013 at 11:43:39AM -0400, Mike Snitzer wrote:
>> All the bi_sector ones are low hanging fruit, but the conversion for
>> skd_preop_sg_list_bio()'s bio_vec code is more involved.
>>
>> Kent, any chance you could crank through it?
>>
>> If not I can come back to trying to fix this later.. but I'm working
>> through a test merge of linux-dm.git's 'for-next' with linux-block.git's
>> 'for-next'.
>
> The right thing for 3.13 is to rip out the bio base code path, and
> for 3.14 to convert it to blk-mq.

It is. I will kill it.

--
Jens Axboe

^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: stec skd block driver needs updating for immutable biovec
  2013-11-01 16:02 ` Jens Axboe
@ 2013-11-01 16:28 ` Mike Snitzer
  2013-11-01 16:34   ` Jens Axboe
  0 siblings, 1 reply; 16+ messages in thread

From: Mike Snitzer @ 2013-11-01 16:28 UTC (permalink / raw)
To: Jens Axboe; +Cc: Christoph Hellwig, Kent Overstreet, linux-kernel

On Fri, Nov 01 2013 at 12:02pm -0400,
Jens Axboe <axboe@kernel.dk> wrote:

> On 11/01/2013 09:50 AM, Christoph Hellwig wrote:
> > On Fri, Nov 01, 2013 at 11:43:39AM -0400, Mike Snitzer wrote:
> >> All the bi_sector ones are low hanging fruit, but the conversion for
> >> skd_preop_sg_list_bio()'s bio_vec code is more involved.
> >>
> >> Kent, any chance you could crank through it?
> >>
> >> If not I can come back to trying to fix this later.. but I'm working
> >> through a test merge of linux-dm.git's 'for-next' with linux-block.git's
> >> 'for-next'.
> >
> > The right thing for 3.13 is to rip out the bio base code path, and
> > for 3.14 to convert it to blk-mq.
>
> It is. I will kill it.

I just cranked through it.. hope this helps (think I got everything but
may have missed something):

From: Mike Snitzer <snitzer@redhat.com>
Subject: [PATCH] skd: remove all the bio-based code

We'll want to convert this driver to blk-mq in the near-term.
Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com> --- drivers/block/skd_main.c | 515 ++++++---------------------------------------- 1 files changed, 62 insertions(+), 453 deletions(-) diff --git a/drivers/block/skd_main.c b/drivers/block/skd_main.c index 1a8717f..d259b1a 100644 --- a/drivers/block/skd_main.c +++ b/drivers/block/skd_main.c @@ -214,8 +214,6 @@ struct skd_request_context { u32 fitmsg_id; struct request *req; - struct bio *bio; - unsigned long start_time; u8 flush_cmd; u8 discard_page; @@ -357,9 +355,6 @@ struct skd_device { struct work_struct completion_worker; - struct bio_list bio_queue; - int queue_stopped; - struct list_head flush_list; }; @@ -470,11 +465,6 @@ MODULE_PARM_DESC(skd_dbg_level, "s1120 debug level (0,1,2)"); module_param(skd_isr_comp_limit, int, 0444); MODULE_PARM_DESC(skd_isr_comp_limit, "s1120 isr comp limit (0=none) default=4"); -static int skd_bio; -module_param(skd_bio, int, 0444); -MODULE_PARM_DESC(skd_bio, - "Register as a bio device instead of block (0, 1) default=0"); - /* Major device number dynamically assigned. */ static u32 skd_major; @@ -512,11 +502,6 @@ static void skd_log_skmsg(struct skd_device *skdev, static void skd_log_skreq(struct skd_device *skdev, struct skd_request_context *skreq, const char *event); -/* FLUSH FUA flag handling. 
*/ -static int skd_flush_cmd_enqueue(struct skd_device *, void *); -static void *skd_flush_cmd_dequeue(struct skd_device *); - - /* ***************************************************************************** * READ/WRITE REQUESTS @@ -524,37 +509,22 @@ static void *skd_flush_cmd_dequeue(struct skd_device *); */ static void skd_stop_queue(struct skd_device *skdev) { - if (!skd_bio) - blk_stop_queue(skdev->queue); - else - skdev->queue_stopped = 1; + blk_stop_queue(skdev->queue); } static void skd_unstop_queue(struct skd_device *skdev) { - if (!skd_bio) - queue_flag_clear(QUEUE_FLAG_STOPPED, skdev->queue); - else - skdev->queue_stopped = 0; + queue_flag_clear(QUEUE_FLAG_STOPPED, skdev->queue); } static void skd_start_queue(struct skd_device *skdev) { - if (!skd_bio) { - blk_start_queue(skdev->queue); - } else { - pr_err("(%s): Starting queue\n", skd_name(skdev)); - skdev->queue_stopped = 0; - skd_request_fn(skdev->queue); - } + blk_start_queue(skdev->queue); } static int skd_queue_stopped(struct skd_device *skdev) { - if (!skd_bio) - return blk_queue_stopped(skdev->queue); - else - return skdev->queue_stopped; + return blk_queue_stopped(skdev->queue); } static void skd_fail_all_pending_blk(struct skd_device *skdev) @@ -571,40 +541,9 @@ static void skd_fail_all_pending_blk(struct skd_device *skdev) } } -static void skd_fail_all_pending_bio(struct skd_device *skdev) -{ - struct bio *bio; - int error = -EIO; - - for (;; ) { - bio = bio_list_pop(&skdev->bio_queue); - - if (bio == NULL) - break; - - bio_endio(bio, error); - } -} - static void skd_fail_all_pending(struct skd_device *skdev) { - if (!skd_bio) - skd_fail_all_pending_blk(skdev); - else - skd_fail_all_pending_bio(skdev); -} - -static void skd_make_request(struct request_queue *q, struct bio *bio) -{ - struct skd_device *skdev = q->queuedata; - unsigned long flags; - - spin_lock_irqsave(&skdev->lock, flags); - - bio_list_add(&skdev->bio_queue, bio); - skd_request_fn(skdev->queue); - - 
spin_unlock_irqrestore(&skdev->lock, flags); + skd_fail_all_pending_blk(skdev); } static void @@ -667,18 +606,9 @@ skd_prep_discard_cdb(struct skd_scsi_request *scsi_req, put_unaligned_be64(lba, &buf[8]); put_unaligned_be32(count, &buf[16]); - if (!skd_bio) { - req = skreq->req; - blk_add_request_payload(req, page, len); - req->buffer = buf; - } else { - skreq->bio->bi_io_vec->bv_page = page; - skreq->bio->bi_io_vec->bv_offset = 0; - skreq->bio->bi_io_vec->bv_len = len; - - skreq->bio->bi_vcnt = 1; - skreq->bio->bi_phys_segments = 1; - } + req = skreq->req; + blk_add_request_payload(req, page, len); + req->buffer = buf; } static void skd_request_fn_not_online(struct request_queue *q); @@ -690,7 +620,6 @@ static void skd_request_fn(struct request_queue *q) struct fit_msg_hdr *fmh = NULL; struct skd_request_context *skreq; struct request *req = NULL; - struct bio *bio = NULL; struct skd_scsi_request *scsi_req; struct page *page; unsigned long io_flags; @@ -732,60 +661,27 @@ static void skd_request_fn(struct request_queue *q) flush = fua = 0; - if (!skd_bio) { - req = blk_peek_request(q); - - /* Are there any native requests to start? */ - if (req == NULL) - break; - - lba = (u32)blk_rq_pos(req); - count = blk_rq_sectors(req); - data_dir = rq_data_dir(req); - io_flags = req->cmd_flags; - - if (io_flags & REQ_FLUSH) - flush++; - - if (io_flags & REQ_FUA) - fua++; - - pr_debug("%s:%s:%d new req=%p lba=%u(0x%x) " - "count=%u(0x%x) dir=%d\n", - skdev->name, __func__, __LINE__, - req, lba, lba, count, count, data_dir); - } else { - if (!list_empty(&skdev->flush_list)) { - /* Process data part of FLUSH request. */ - bio = (struct bio *)skd_flush_cmd_dequeue(skdev); - flush++; - pr_debug("%s:%s:%d processing FLUSH request with data.\n", - skdev->name, __func__, __LINE__); - } else { - /* peek at our bio queue */ - bio = bio_list_peek(&skdev->bio_queue); - } + req = blk_peek_request(q); - /* Are there any native requests to start? 
*/ - if (bio == NULL) - break; + /* Are there any native requests to start? */ + if (req == NULL) + break; - lba = (u32)bio->bi_sector; - count = bio_sectors(bio); - data_dir = bio_data_dir(bio); - io_flags = bio->bi_rw; + lba = (u32)blk_rq_pos(req); + count = blk_rq_sectors(req); + data_dir = rq_data_dir(req); + io_flags = req->cmd_flags; - pr_debug("%s:%s:%d new bio=%p lba=%u(0x%x) " - "count=%u(0x%x) dir=%d\n", - skdev->name, __func__, __LINE__, - bio, lba, lba, count, count, data_dir); + if (io_flags & REQ_FLUSH) + flush++; - if (io_flags & REQ_FLUSH) - flush++; + if (io_flags & REQ_FUA) + fua++; - if (io_flags & REQ_FUA) - fua++; - } + pr_debug("%s:%s:%d new req=%p lba=%u(0x%x) " + "count=%u(0x%x) dir=%d\n", + skdev->name, __func__, __LINE__, + req, lba, lba, count, count, data_dir); /* At this point we know there is a request * (from our bio q or req q depending on the way @@ -825,29 +721,15 @@ static void skd_request_fn(struct request_queue *q) skreq->discard_page = 0; /* - * OK to now dequeue request from either bio or q. + * OK to now dequeue request from q. * * At this point we are comitted to either start or reject * the native request. Note that skd_request_context is * available but is still at the head of the free list. */ - if (!skd_bio) { - blk_start_request(req); - skreq->req = req; - skreq->fitmsg_id = 0; - } else { - if (unlikely(flush == SKD_FLUSH_DATA_SECOND)) { - skreq->bio = bio; - } else { - skreq->bio = bio_list_pop(&skdev->bio_queue); - SKD_ASSERT(skreq->bio == bio); - skreq->start_time = jiffies; - part_inc_in_flight(&skdev->disk->part0, - bio_data_dir(bio)); - } - - skreq->fitmsg_id = 0; - } + blk_start_request(req); + skreq->req = req; + skreq->fitmsg_id = 0; /* Either a FIT msg is in progress or we have to start one. 
*/ if (skmsg == NULL) { @@ -923,8 +805,7 @@ static void skd_request_fn(struct request_queue *q) if (fua) scsi_req->cdb[1] |= SKD_FUA_NV; - if ((!skd_bio && !req->bio) || - (skd_bio && flush == SKD_FLUSH_ZERO_SIZE_FIRST)) + if (!req->bio) goto skip_sg; error = skd_preop_sg_list(skdev, skreq); @@ -1011,8 +892,7 @@ skip_sg: * If req is non-NULL it means there is something to do but * we are out of a resource. */ - if (((!skd_bio) && req) || - ((skd_bio) && bio_list_peek(&skdev->bio_queue))) + if (req) skd_stop_queue(skdev); } @@ -1124,184 +1004,22 @@ static void skd_postop_sg_list_blk(struct skd_device *skdev, pci_unmap_sg(skdev->pdev, &skreq->sg[0], skreq->n_sg, pci_dir); } -static void skd_end_request_bio(struct skd_device *skdev, - struct skd_request_context *skreq, int error) -{ - struct bio *bio = skreq->bio; - int rw = bio_data_dir(bio); - unsigned long io_flags = bio->bi_rw; - - if ((io_flags & REQ_DISCARD) && - (skreq->discard_page == 1)) { - pr_debug("%s:%s:%d biomode: skd_end_request: freeing DISCARD page.\n", - skdev->name, __func__, __LINE__); - free_page((unsigned long)page_address(bio->bi_io_vec->bv_page)); - } - - if (unlikely(error)) { - u32 lba = (u32)skreq->bio->bi_sector; - u32 count = bio_sectors(skreq->bio); - char *cmd = (rw == WRITE) ? 
"write" : "read"; - pr_err("(%s): Error cmd=%s sect=%u count=%u id=0x%x\n", - skd_name(skdev), cmd, lba, count, skreq->id); - } - { - int cpu = part_stat_lock(); - - if (likely(!error)) { - part_stat_inc(cpu, &skdev->disk->part0, ios[rw]); - part_stat_add(cpu, &skdev->disk->part0, sectors[rw], - bio_sectors(bio)); - } - part_stat_add(cpu, &skdev->disk->part0, ticks[rw], - jiffies - skreq->start_time); - part_dec_in_flight(&skdev->disk->part0, rw); - part_stat_unlock(); - } - - pr_debug("%s:%s:%d id=0x%x error=%d\n", - skdev->name, __func__, __LINE__, skreq->id, error); - - bio_endio(skreq->bio, error); -} - -static int skd_preop_sg_list_bio(struct skd_device *skdev, - struct skd_request_context *skreq) -{ - struct bio *bio = skreq->bio; - int writing = skreq->sg_data_dir == SKD_DATA_DIR_HOST_TO_CARD; - int pci_dir = writing ? PCI_DMA_TODEVICE : PCI_DMA_FROMDEVICE; - int n_sg; - int i; - struct bio_vec *vec; - struct fit_sg_descriptor *sgd; - u64 dma_addr; - u32 count; - int errs = 0; - unsigned int io_flags = 0; - io_flags |= bio->bi_rw; - - skreq->sg_byte_count = 0; - n_sg = skreq->n_sg = skreq->bio->bi_vcnt; - - if (n_sg <= 0) - return -EINVAL; - - if (n_sg > skdev->sgs_per_request) { - pr_err("(%s): sg overflow n=%d\n", - skd_name(skdev), n_sg); - skreq->n_sg = 0; - return -EIO; - } - - for (i = 0; i < skreq->n_sg; i++) { - vec = bio_iovec_idx(bio, i); - dma_addr = pci_map_page(skdev->pdev, - vec->bv_page, - vec->bv_offset, vec->bv_len, pci_dir); - count = vec->bv_len; - - if (count == 0 || count > 64u * 1024u || (count & 3) != 0 - || (dma_addr & 3) != 0) { - pr_err( - "(%s): Bad sg ix=%d count=%d addr=0x%llx\n", - skd_name(skdev), i, count, dma_addr); - errs++; - } - - sgd = &skreq->sksg_list[i]; - - sgd->control = FIT_SGD_CONTROL_NOT_LAST; - sgd->byte_count = vec->bv_len; - skreq->sg_byte_count += vec->bv_len; - sgd->host_side_addr = dma_addr; - sgd->dev_side_addr = 0; /* not used */ - } - - skreq->sksg_list[n_sg - 1].next_desc_ptr = 0LL; - 
skreq->sksg_list[n_sg - 1].control = FIT_SGD_CONTROL_LAST; - - - if (!(io_flags & REQ_DISCARD)) { - count = bio_sectors(bio) << 9u; - if (count != skreq->sg_byte_count) { - pr_err("(%s): mismatch count sg=%d req=%d\n", - skd_name(skdev), skreq->sg_byte_count, count); - errs++; - } - } - - if (unlikely(skdev->dbg_level > 1)) { - pr_debug("%s:%s:%d skreq=%x sksg_list=%p sksg_dma=%llx\n", - skdev->name, __func__, __LINE__, - skreq->id, skreq->sksg_list, skreq->sksg_dma_address); - for (i = 0; i < n_sg; i++) { - struct fit_sg_descriptor *sgd = &skreq->sksg_list[i]; - pr_debug("%s:%s:%d sg[%d] count=%u ctrl=0x%x " - "addr=0x%llx next=0x%llx\n", - skdev->name, __func__, __LINE__, - i, sgd->byte_count, sgd->control, - sgd->host_side_addr, sgd->next_desc_ptr); - } - } - - if (errs != 0) { - skd_postop_sg_list(skdev, skreq); - skreq->n_sg = 0; - return -EIO; - } - - return 0; -} - static int skd_preop_sg_list(struct skd_device *skdev, struct skd_request_context *skreq) { - if (!skd_bio) - return skd_preop_sg_list_blk(skdev, skreq); - else - return skd_preop_sg_list_bio(skdev, skreq); -} - -static void skd_postop_sg_list_bio(struct skd_device *skdev, - struct skd_request_context *skreq) -{ - int writing = skreq->sg_data_dir == SKD_DATA_DIR_HOST_TO_CARD; - int pci_dir = writing ? PCI_DMA_TODEVICE : PCI_DMA_FROMDEVICE; - int i; - struct fit_sg_descriptor *sgd; - - /* - * restore the next ptr for next IO request so we - * don't have to set it every time. 
- */ - skreq->sksg_list[skreq->n_sg - 1].next_desc_ptr = - skreq->sksg_dma_address + - ((skreq->n_sg) * sizeof(struct fit_sg_descriptor)); - - for (i = 0; i < skreq->n_sg; i++) { - sgd = &skreq->sksg_list[i]; - pci_unmap_page(skdev->pdev, sgd->host_side_addr, - sgd->byte_count, pci_dir); - } + return skd_preop_sg_list_blk(skdev, skreq); } static void skd_postop_sg_list(struct skd_device *skdev, struct skd_request_context *skreq) { - if (!skd_bio) - skd_postop_sg_list_blk(skdev, skreq); - else - skd_postop_sg_list_bio(skdev, skreq); + skd_postop_sg_list_blk(skdev, skreq); } static void skd_end_request(struct skd_device *skdev, struct skd_request_context *skreq, int error) { - if (likely(!skd_bio)) - skd_end_request_blk(skdev, skreq, error); - else - skd_end_request_bio(skdev, skreq, error); + skd_end_request_blk(skdev, skreq, error); } static void skd_request_fn_not_online(struct request_queue *q) @@ -2754,13 +2472,10 @@ static void skd_resolve_req_exception(struct skd_device *skdev, break; case SKD_CHECK_STATUS_REQUEUE_REQUEST: - if (!skd_bio) { - if ((unsigned long) ++skreq->req->special < - SKD_MAX_RETRIES) { - skd_log_skreq(skdev, skreq, "retry"); - skd_requeue_request(skdev, skreq); - break; - } + if ((unsigned long) ++skreq->req->special < SKD_MAX_RETRIES) { + skd_log_skreq(skdev, skreq, "retry"); + skd_requeue_request(skdev, skreq); + break; } /* fall through to report error */ @@ -2774,16 +2489,9 @@ static void skd_resolve_req_exception(struct skd_device *skdev, static void skd_requeue_request(struct skd_device *skdev, struct skd_request_context *skreq) { - if (!skd_bio) { - blk_requeue_request(skdev->queue, skreq->req); - } else { - bio_list_add_head(&skdev->bio_queue, skreq->bio); - skreq->bio = NULL; - } + blk_requeue_request(skdev->queue, skreq->req); } - - /* assume spinlock is already held */ static void skd_release_skreq(struct skd_device *skdev, struct skd_request_context *skreq) @@ -2840,11 +2548,7 @@ static void skd_release_skreq(struct skd_device 
*skdev, /* * Reset backpointer */ - if (likely(!skd_bio)) - skreq->req = NULL; - else - skreq->bio = NULL; - + skreq->req = NULL; /* * Reclaim the skd_request_context @@ -3084,8 +2788,6 @@ static int skd_isr_completion_posted(struct skd_device *skdev, u32 cmp_bytes = 0; int rc = 0; int processed = 0; - int ret; - for (;; ) { SKD_ASSERT(skdev->skcomp_ix < SKD_N_COMPLETION_ENTRY); @@ -3180,8 +2882,7 @@ static int skd_isr_completion_posted(struct skd_device *skdev, if (skreq->n_sg > 0) skd_postop_sg_list(skdev, skreq); - if (((!skd_bio) && !skreq->req) || - ((skd_bio) && !skreq->bio)) { + if (!skreq->req) { pr_debug("%s:%s:%d NULL backptr skdreq %p, " "req=0x%x req_id=0x%x\n", skdev->name, __func__, __LINE__, @@ -3191,30 +2892,10 @@ static int skd_isr_completion_posted(struct skd_device *skdev, * Capture the outcome and post it back to the * native request. */ - if (likely(cmp_status == SAM_STAT_GOOD)) { - if (unlikely(skreq->flush_cmd)) { - if (skd_bio) { - /* if empty size bio, we are all done */ - if (bio_sectors(skreq->bio) == 0) { - skd_end_request(skdev, skreq, 0); - } else { - ret = skd_flush_cmd_enqueue(skdev, (void *)skreq->bio); - if (ret != 0) { - pr_err("Failed to enqueue flush bio with Data. Err=%d.\n", ret); - skd_end_request(skdev, skreq, ret); - } else { - ((*enqueued)++); - } - } - } else { - skd_end_request(skdev, skreq, 0); - } - } else { - skd_end_request(skdev, skreq, 0); - } - } else { + if (likely(cmp_status == SAM_STAT_GOOD)) + skd_end_request(skdev, skreq, 0); + else skd_resolve_req_exception(skdev, skreq); - } } /* @@ -3645,34 +3326,22 @@ static void skd_recover_requests(struct skd_device *skdev, int requeue) skd_log_skreq(skdev, skreq, "recover"); SKD_ASSERT((skreq->id & SKD_ID_INCR) != 0); - if (!skd_bio) - SKD_ASSERT(skreq->req != NULL); - else - SKD_ASSERT(skreq->bio != NULL); + SKD_ASSERT(skreq->req != NULL); /* Release DMA resources for the request. 
*/ if (skreq->n_sg > 0) skd_postop_sg_list(skdev, skreq); - if (!skd_bio) { - if (requeue && - (unsigned long) ++skreq->req->special < - SKD_MAX_RETRIES) - skd_requeue_request(skdev, skreq); - else - skd_end_request(skdev, skreq, -EIO); - } else + if (requeue && + (unsigned long) ++skreq->req->special < SKD_MAX_RETRIES) + skd_requeue_request(skdev, skreq); + else skd_end_request(skdev, skreq, -EIO); - if (!skd_bio) - skreq->req = NULL; - else - skreq->bio = NULL; + skreq->req = NULL; skreq->state = SKD_REQ_STATE_IDLE; skreq->id += SKD_ID_INCR; - - } if (i > 0) skreq[-1].next = skreq; @@ -4580,10 +4249,6 @@ static struct skd_device *skd_construct(struct pci_dev *pdev) skdev->sgs_per_request = skd_sgs_per_request; skdev->dbg_level = skd_dbg_level; - if (skd_bio) - bio_list_init(&skdev->bio_queue); - - atomic_set(&skdev->device_count, 0); spin_lock_init(&skdev->lock); @@ -4941,13 +4606,7 @@ static int skd_cons_disk(struct skd_device *skdev) disk->fops = &skd_blockdev_ops; disk->private_data = skdev; - if (!skd_bio) { - q = blk_init_queue(skd_request_fn, &skdev->lock); - } else { - q = blk_alloc_queue(GFP_KERNEL); - q->queue_flags = QUEUE_FLAG_IO_STAT | QUEUE_FLAG_STACKABLE; - } - + q = blk_init_queue(skd_request_fn, &skdev->lock); if (!q) { rc = -ENOMEM; goto err_out; @@ -4957,11 +4616,6 @@ static int skd_cons_disk(struct skd_device *skdev) disk->queue = q; q->queuedata = skdev; - if (skd_bio) { - q->queue_lock = &skdev->lock; - blk_queue_make_request(q, skd_make_request); - } - blk_queue_flush(q, REQ_FLUSH | REQ_FUA); blk_queue_max_segments(q, skdev->sgs_per_request); blk_queue_max_hw_sectors(q, SKD_N_MAX_SECTORS); @@ -5794,35 +5448,19 @@ static void skd_log_skreq(struct skd_device *skdev, skdev->name, __func__, __LINE__, skreq->timeout_stamp, skreq->sg_data_dir, skreq->n_sg); - if (!skd_bio) { - if (skreq->req != NULL) { - struct request *req = skreq->req; - u32 lba = (u32)blk_rq_pos(req); - u32 count = blk_rq_sectors(req); - - pr_debug("%s:%s:%d " - "req=%p 
lba=%u(0x%x) count=%u(0x%x) dir=%d\n", - skdev->name, __func__, __LINE__, - req, lba, lba, count, count, - (int)rq_data_dir(req)); - } else - pr_debug("%s:%s:%d req=NULL\n", - skdev->name, __func__, __LINE__); - } else { - if (skreq->bio != NULL) { - struct bio *bio = skreq->bio; - u32 lba = (u32)bio->bi_sector; - u32 count = bio_sectors(bio); + if (skreq->req != NULL) { + struct request *req = skreq->req; + u32 lba = (u32)blk_rq_pos(req); + u32 count = blk_rq_sectors(req); - pr_debug("%s:%s:%d " - "bio=%p lba=%u(0x%x) count=%u(0x%x) dir=%d\n", - skdev->name, __func__, __LINE__, - bio, lba, lba, count, count, - (int)bio_data_dir(bio)); - } else - pr_debug("%s:%s:%d req=NULL\n", - skdev->name, __func__, __LINE__); - } + pr_debug("%s:%s:%d " + "req=%p lba=%u(0x%x) count=%u(0x%x) dir=%d\n", + skdev->name, __func__, __LINE__, + req, lba, lba, count, count, + (int)rq_data_dir(req)); + } else + pr_debug("%s:%s:%d req=NULL\n", + skdev->name, __func__, __LINE__); } /* @@ -5918,34 +5556,5 @@ static void __exit skd_exit(void) kmem_cache_destroy(skd_flush_slab); } -static int -skd_flush_cmd_enqueue(struct skd_device *skdev, void *cmd) -{ - struct skd_flush_cmd *item; - - item = kmem_cache_zalloc(skd_flush_slab, GFP_ATOMIC); - if (!item) { - pr_err("skd_flush_cmd_enqueue: Failed to allocated item.\n"); - return -ENOMEM; - } - - item->cmd = cmd; - list_add_tail(&item->flist, &skdev->flush_list); - return 0; -} - -static void * -skd_flush_cmd_dequeue(struct skd_device *skdev) -{ - void *cmd; - struct skd_flush_cmd *item; - - item = list_entry(skdev->flush_list.next, struct skd_flush_cmd, flist); - list_del_init(&item->flist); - cmd = item->cmd; - kmem_cache_free(skd_flush_slab, item); - return cmd; -} - module_init(skd_init); module_exit(skd_exit); ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: stec skd block driver needs updating for immutable biovec
  2013-11-01 16:28 ` Mike Snitzer
@ 2013-11-01 16:34 ` Jens Axboe
  2013-11-01 17:46   ` Mike Snitzer
  2013-11-04 11:25   ` Bartlomiej Zolnierkiewicz
  0 siblings, 2 replies; 16+ messages in thread

From: Jens Axboe @ 2013-11-01 16:34 UTC (permalink / raw)
To: Mike Snitzer; +Cc: Christoph Hellwig, Kent Overstreet, linux-kernel

On 11/01/2013 10:28 AM, Mike Snitzer wrote:
> On Fri, Nov 01 2013 at 12:02pm -0400,
> Jens Axboe <axboe@kernel.dk> wrote:
>
>> On 11/01/2013 09:50 AM, Christoph Hellwig wrote:
>>> On Fri, Nov 01, 2013 at 11:43:39AM -0400, Mike Snitzer wrote:
>>>> All the bi_sector ones are low hanging fruit, but the conversion for
>>>> skd_preop_sg_list_bio()'s bio_vec code is more involved.
>>>>
>>>> Kent, any chance you could crank through it?
>>>>
>>>> If not I can come back to trying to fix this later.. but I'm working
>>>> through a test merge of linux-dm.git's 'for-next' with linux-block.git's
>>>> 'for-next'.
>>>
>>> The right thing for 3.13 is to rip out the bio base code path, and
>>> for 3.14 to convert it to blk-mq.
>>
>> It is. I will kill it.
>
> I just cranked through it.. hope this helps (think I got everything but
> may have missed something):

You lost out, I committed it 20 min ago :-0

http://git.kernel.dk/?p=linux-block.git;a=commit;h=1d36f7a5fb577afaaead6c5e2fc8e01e0c95235d

Looks like you missed some of the skd_device removal, while I neglected
killing bio/start_time in the skd_request_context.

--
Jens Axboe

^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: stec skd block driver needs updating for immutable biovec
  2013-11-01 16:34 ` Jens Axboe
@ 2013-11-01 17:46 ` Mike Snitzer
  2013-11-04 11:25 ` Bartlomiej Zolnierkiewicz
  0 siblings, 0 replies; 16+ messages in thread

From: Mike Snitzer @ 2013-11-01 17:46 UTC (permalink / raw)
To: Jens Axboe; +Cc: Christoph Hellwig, Kent Overstreet, linux-kernel

On Fri, Nov 01 2013 at 12:34pm -0400,
Jens Axboe <axboe@kernel.dk> wrote:

> On 11/01/2013 10:28 AM, Mike Snitzer wrote:
> > On Fri, Nov 01 2013 at 12:02pm -0400,
> > Jens Axboe <axboe@kernel.dk> wrote:
> >
> >> On 11/01/2013 09:50 AM, Christoph Hellwig wrote:
> >>> On Fri, Nov 01, 2013 at 11:43:39AM -0400, Mike Snitzer wrote:
> >>>> All the bi_sector ones are low hanging fruit, but the conversion for
> >>>> skd_preop_sg_list_bio()'s bio_vec code is more involved.
> >>>>
> >>>> Kent, any chance you could crank through it?
> >>>>
> >>>> If not I can come back to trying to fix this later.. but I'm working
> >>>> through a test merge of linux-dm.git's 'for-next' with linux-block.git's
> >>>> 'for-next'.
> >>>
> >>> The right thing for 3.13 is to rip out the bio base code path, and
> >>> for 3.14 to convert it to blk-mq.
> >>
> >> It is. I will kill it.
> >
> > I just cranked through it.. hope this helps (think I got everything but
> > may have missed something):
>
> You lost out, I committed it 20 min ago :-0

Cool, it's OK, I was still able to get the fulfilling experience of
killing a bunch of code... made my day.

^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: stec skd block driver needs updating for immutable biovec
  2013-11-01 16:34 ` Jens Axboe
  2013-11-01 17:46 ` Mike Snitzer
@ 2013-11-04 11:25 ` Bartlomiej Zolnierkiewicz
  1 sibling, 0 replies; 16+ messages in thread

From: Bartlomiej Zolnierkiewicz @ 2013-11-04 11:25 UTC (permalink / raw)
To: Jens Axboe; +Cc: Mike Snitzer, Christoph Hellwig, Kent Overstreet, linux-kernel

Hi,

On Friday, November 01, 2013 10:34:23 AM Jens Axboe wrote:
> On 11/01/2013 10:28 AM, Mike Snitzer wrote:
> > On Fri, Nov 01 2013 at 12:02pm -0400,
> > Jens Axboe <axboe@kernel.dk> wrote:
> >
> >> On 11/01/2013 09:50 AM, Christoph Hellwig wrote:
> >>> On Fri, Nov 01, 2013 at 11:43:39AM -0400, Mike Snitzer wrote:
> >>>> All the bi_sector ones are low hanging fruit, but the conversion for
> >>>> skd_preop_sg_list_bio()'s bio_vec code is more involved.
> >>>>
> >>>> Kent, any chance you could crank through it?
> >>>>
> >>>> If not I can come back to trying to fix this later.. but I'm working
> >>>> through a test merge of linux-dm.git's 'for-next' with linux-block.git's
> >>>> 'for-next'.
> >>>
> >>> The right thing for 3.13 is to rip out the bio base code path, and
> >>> for 3.14 to convert it to blk-mq.
> >>
> >> It is. I will kill it.
> >
> > I just cranked through it.. hope this helps (think I got everything but
> > may have missed something):
>
> You lost out, I committed it 20 min ago :-0
>
> http://git.kernel.dk/?p=linux-block.git;a=commit;h=1d36f7a5fb577afaaead6c5e2fc8e01e0c95235d

I did the complete bio path removal a month ago in my skd patchset:

https://lkml.org/lkml/2013/9/30/279

and the change has been agreed on by Ramprasad C from STEC:

https://lkml.org/lkml/2013/10/3/339

> Looks like you missed some of the skd_device removal, while I neglected
> killing bio/start_time in the skd_request_context.

Could you please apply my patchset before doing new work on skd driver?

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung R&D Institute Poland
Samsung Electronics

^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH for-next] dm: fix missing bi_remaining accounting 2013-11-01 15:03 ` Jens Axboe 2013-11-01 15:43 ` stec skd block driver needs updating for immutable biovec Mike Snitzer @ 2013-11-04 15:06 ` Mikulas Patocka 2013-11-04 15:20 ` Mike Snitzer 2013-11-04 20:12 ` Kent Overstreet 1 sibling, 2 replies; 16+ messages in thread From: Mikulas Patocka @ 2013-11-04 15:06 UTC (permalink / raw) To: Jens Axboe Cc: Mike Snitzer, Kent Overstreet, dm-devel, linux-kernel, Alasdair Kergon

On Fri, 1 Nov 2013, Jens Axboe wrote:

> On 11/01/2013 07:59 AM, Mike Snitzer wrote:
> > Add the missing bi_remaining increment, required by the block layer's
> > new bio-chaining code, to both the verity and old snapshot DM targets.
> >
> > Otherwise users will hit the bi_remaining <= 0 BUG_ON in bio_endio().
>
> Thanks Mike, added to the mix.
>
> --
> Jens Axboe

Hi

This improves a little bit on the previous patch, by replacing costly
atomic_inc with cheap atomic_set.

From: Mikulas Patocka <mpatocka@redhat.com>

dm: change atomic_inc to atomic_set(1)

There are places in dm where we save bi_endio and bi_private, set them to
target's routine, submit the bio, from the target's bi_endio routine we
restore bi_endio and bi_private and end the bio with bi_endio.

This causes underflow of bi_remaining, so we must restore bi_remaining
before ending the bio from the target bi_endio routine.

The code uses atomic_inc for restoration of bi_remaining. This patch
changes it to atomic_set(1) to avoid an interlocked instruction. In the
target's bi_endio routine we are sure that bi_remaining is zero
(otherwise, the bi_endio routine wouldn't be called) and there are no
concurrent users of the bio, so we can replace atomic_inc with
atomic_set(1).

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/md/dm-cache-target.c | 2 +-
 drivers/md/dm-snap.c         | 2 +-
 drivers/md/dm-thin.c         | 4 ++--
 drivers/md/dm-verity.c       | 2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

Index: linux-block/drivers/md/dm-cache-target.c
===================================================================
--- linux-block.orig/drivers/md/dm-cache-target.c	2013-11-01 17:21:11.000000000 +0100
+++ linux-block/drivers/md/dm-cache-target.c	2013-11-04 13:59:13.000000000 +0100
@@ -670,7 +670,7 @@ static void writethrough_endio(struct bi
 	 * Must bump bi_remaining to allow bio to complete with
 	 * restored bi_end_io.
 	 */
-	atomic_inc(&bio->bi_remaining);
+	atomic_set(&bio->bi_remaining, 1);
 
 	if (err) {
 		bio_endio(bio, err);
Index: linux-block/drivers/md/dm-snap.c
===================================================================
--- linux-block.orig/drivers/md/dm-snap.c	2013-11-01 17:21:11.000000000 +0100
+++ linux-block/drivers/md/dm-snap.c	2013-11-04 13:59:56.000000000 +0100
@@ -1415,7 +1415,7 @@ out:
 	if (full_bio) {
 		full_bio->bi_end_io = pe->full_bio_end_io;
 		full_bio->bi_private = pe->full_bio_private;
-		atomic_inc(&full_bio->bi_remaining);
+		atomic_set(&full_bio->bi_remaining, 1);
 	}
 	free_pending_exception(pe);
Index: linux-block/drivers/md/dm-thin.c
===================================================================
--- linux-block.orig/drivers/md/dm-thin.c	2013-11-01 17:21:11.000000000 +0100
+++ linux-block/drivers/md/dm-thin.c	2013-11-04 14:00:19.000000000 +0100
@@ -613,7 +613,7 @@ static void process_prepared_mapping_fai
 {
 	if (m->bio) {
 		m->bio->bi_end_io = m->saved_bi_end_io;
-		atomic_inc(&m->bio->bi_remaining);
+		atomic_set(&m->bio->bi_remaining, 1);
 	}
 	cell_error(m->tc->pool, m->cell);
 	list_del(&m->list);
@@ -630,7 +630,7 @@ static void process_prepared_mapping(str
 	bio = m->bio;
 	if (bio) {
 		bio->bi_end_io = m->saved_bi_end_io;
-		atomic_inc(&bio->bi_remaining);
+		atomic_set(&bio->bi_remaining, 1);
 	}
 
 	if (m->err) {
Index: linux-block/drivers/md/dm-verity.c
===================================================================
--- linux-block.orig/drivers/md/dm-verity.c	2013-11-01 17:21:11.000000000 +0100
+++ linux-block/drivers/md/dm-verity.c	2013-11-04 13:59:28.000000000 +0100
@@ -384,7 +384,7 @@ static void verity_finish_io(struct dm_v
 	bio->bi_end_io = io->orig_bi_end_io;
 	bio->bi_private = io->orig_bi_private;
-	atomic_inc(&bio->bi_remaining);
+	atomic_set(&bio->bi_remaining, 1);
 
 	bio_endio(bio, error);
 }
* Re: [PATCH for-next] dm: fix missing bi_remaining accounting 2013-11-04 15:06 ` [PATCH for-next] dm: fix missing bi_remaining accounting Mikulas Patocka @ 2013-11-04 15:20 ` Mike Snitzer 2013-11-04 15:25 ` Mikulas Patocka 2013-11-04 20:12 ` Kent Overstreet 1 sibling, 1 reply; 16+ messages in thread From: Mike Snitzer @ 2013-11-04 15:20 UTC (permalink / raw) To: Mikulas Patocka Cc: Jens Axboe, Kent Overstreet, dm-devel, linux-kernel, Alasdair Kergon On Mon, Nov 04 2013 at 10:06am -0500, Mikulas Patocka <mpatocka@redhat.com> wrote: > > > On Fri, 1 Nov 2013, Jens Axboe wrote: > > > On 11/01/2013 07:59 AM, Mike Snitzer wrote: > > > Add the missing bi_remaining increment, required by the block layer's > > > new bio-chaining code, to both the verity and old snapshot DM targets. > > > > > > Otherwise users will hit the bi_remaining <= 0 BUG_ON in bio_endio(). > > > > Thanks Mike, added to the mix. > > > > -- > > Jens Axboe > > Hi > > This improves a little bit on the previous patch, by replacing costly > atomic_inc with cheap atomic_set. > > > From: Mikulas Patocka <mpatocka@redhat.com> > > dm: change atomic_inc to atomic_set(1) > > There are places in dm where we save bi_endio and bi_private, set them to > target's routine, submit the bio, from the target's bi_endio routine we > restore bi_endio and bi_private and end the bio with bi_endio. > > This causes underflow of bi_remaining, so we must restore bi_remaining > before ending the bio from the target bi_endio routine. > > The code uses atomic_inc for restoration of bi_remaining. This patch > changes it to atomic_set(1) to avoid an interlocked instruction. In the > target's bi_endio routine we are sure that bi_remaining is zero > (otherwise, the bi_endio routine wouldn't be called) and there are no > concurrent users of the bio, so we can replace atomic_inc with > atomic_set(1). This isn't DM-specific. 
Shouldn't the other places in the tree that use atomic_inc on bi_remaining really be converted at the same time?
* Re: [PATCH for-next] dm: fix missing bi_remaining accounting 2013-11-04 15:20 ` Mike Snitzer @ 2013-11-04 15:25 ` Mikulas Patocka 2013-11-04 16:04 ` Mike Snitzer 0 siblings, 1 reply; 16+ messages in thread From: Mikulas Patocka @ 2013-11-04 15:25 UTC (permalink / raw) To: Mike Snitzer Cc: Jens Axboe, Kent Overstreet, dm-devel, linux-kernel, Alasdair Kergon On Mon, 4 Nov 2013, Mike Snitzer wrote: > On Mon, Nov 04 2013 at 10:06am -0500, > Mikulas Patocka <mpatocka@redhat.com> wrote: > > > > > > > On Fri, 1 Nov 2013, Jens Axboe wrote: > > > > > On 11/01/2013 07:59 AM, Mike Snitzer wrote: > > > > Add the missing bi_remaining increment, required by the block layer's > > > > new bio-chaining code, to both the verity and old snapshot DM targets. > > > > > > > > Otherwise users will hit the bi_remaining <= 0 BUG_ON in bio_endio(). > > > > > > Thanks Mike, added to the mix. > > > > > > -- > > > Jens Axboe > > > > Hi > > > > This improves a little bit on the previous patch, by replacing costly > > atomic_inc with cheap atomic_set. > > > > > > From: Mikulas Patocka <mpatocka@redhat.com> > > > > dm: change atomic_inc to atomic_set(1) > > > > There are places in dm where we save bi_endio and bi_private, set them to > > target's routine, submit the bio, from the target's bi_endio routine we > > restore bi_endio and bi_private and end the bio with bi_endio. > > > > This causes underflow of bi_remaining, so we must restore bi_remaining > > before ending the bio from the target bi_endio routine. > > > > The code uses atomic_inc for restoration of bi_remaining. This patch > > changes it to atomic_set(1) to avoid an interlocked instruction. In the > > target's bi_endio routine we are sure that bi_remaining is zero > > (otherwise, the bi_endio routine wouldn't be called) and there are no > > concurrent users of the bio, so we can replace atomic_inc with > > atomic_set(1). > > This isn't DM-specific. 
Shouldn't the other places in the tree that use > atomic_inc on bi_remaining really be converted at the same time? There is no 'atomic_inc.*bi_remaining' in other drivers. It is just in fs/bio.c in bio_chain and bio_endio_nodec where it's probably needed. Mikulas
* Re: [PATCH for-next] dm: fix missing bi_remaining accounting 2013-11-04 15:25 ` Mikulas Patocka @ 2013-11-04 16:04 ` Mike Snitzer 2013-11-04 17:49 ` Mikulas Patocka 0 siblings, 1 reply; 16+ messages in thread From: Mike Snitzer @ 2013-11-04 16:04 UTC (permalink / raw) To: Mikulas Patocka; +Cc: axboe, kmo, dm-devel, agk, linux-kernel On Mon, Nov 04 2013 at 10:25am -0500, Mikulas Patocka <mpatocka@redhat.com> wrote: > > > On Mon, 4 Nov 2013, Mike Snitzer wrote: > > > On Mon, Nov 04 2013 at 10:06am -0500, > > Mikulas Patocka <mpatocka@redhat.com> wrote: > > > > > > The code uses atomic_inc for restoration of bi_remaining. This patch > > > changes it to atomic_set(1) to avoid an interlocked instruction. In the > > > target's bi_endio routine we are sure that bi_remaining is zero > > > (otherwise, the bi_endio routine wouldn't be called) and there are no > > > concurrent users of the bio, so we can replace atomic_inc with > > > atomic_set(1). > > > > This isn't DM-specific. Shouldn't the other places in the tree that use > > atomic_inc on bi_remaining should really be converted at the same time? > > There is no 'atomic_inc.*bi_remaining' in other drivers. Wrong. I know btrfs has at least one. As does bcache afaik. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH for-next] dm: fix missing bi_remaining accounting 2013-11-04 16:04 ` Mike Snitzer @ 2013-11-04 17:49 ` Mikulas Patocka 2013-11-05 0:56 ` Mike Snitzer 0 siblings, 1 reply; 16+ messages in thread From: Mikulas Patocka @ 2013-11-04 17:49 UTC (permalink / raw) To: Mike Snitzer; +Cc: axboe, kmo, dm-devel, agk, linux-kernel On Mon, 4 Nov 2013, Mike Snitzer wrote: > On Mon, Nov 04 2013 at 10:25am -0500, > Mikulas Patocka <mpatocka@redhat.com> wrote: > > > > > > > On Mon, 4 Nov 2013, Mike Snitzer wrote: > > > > > On Mon, Nov 04 2013 at 10:06am -0500, > > > > Mikulas Patocka <mpatocka@redhat.com> wrote: > > > > > > > > > > The code uses atomic_inc for restoration of bi_remaining. This patch > > > > > changes it to atomic_set(1) to avoid an interlocked instruction. In the > > > > > target's bi_endio routine we are sure that bi_remaining is zero > > > > > (otherwise, the bi_endio routine wouldn't be called) and there are no > > > > > concurrent users of the bio, so we can replace atomic_inc with > > > > > atomic_set(1). > > > > > > > > This isn't DM-specific. Shouldn't the other places in the tree that use > > > > atomic_inc on bi_remaining really be converted at the same time? > > > > > > There is no 'atomic_inc.*bi_remaining' in other drivers. > > > > Wrong. I know btrfs has at least one. As does bcache afaik. grep -r 'atomic_inc.*bi_remaining' * yields no hits in btrfs or bcache (on git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git , branch remotes/origin/for-next). It only finds drivers/md/dm-cache-target.c, drivers/md/dm-verity.c, drivers/md/dm-snap.c, drivers/md/dm-thin.c. Maybe in other git trees there are more cases of this. Mikulas
* Re: [PATCH for-next] dm: fix missing bi_remaining accounting 2013-11-04 17:49 ` Mikulas Patocka @ 2013-11-05 0:56 ` Mike Snitzer 0 siblings, 0 replies; 16+ messages in thread From: Mike Snitzer @ 2013-11-05 0:56 UTC (permalink / raw) To: Mikulas Patocka; +Cc: axboe, kmo, dm-devel, agk, linux-kernel On Mon, Nov 04 2013 at 12:49pm -0500, Mikulas Patocka <mpatocka@redhat.com> wrote: > > > On Mon, 4 Nov 2013, Mike Snitzer wrote: > > > On Mon, Nov 04 2013 at 10:25am -0500, > > Mikulas Patocka <mpatocka@redhat.com> wrote: > > > > > > > > > > > On Mon, 4 Nov 2013, Mike Snitzer wrote: > > > > > > > On Mon, Nov 04 2013 at 10:06am -0500, > > > > Mikulas Patocka <mpatocka@redhat.com> wrote: > > > > > > > > > > The code uses atomic_inc for restoration of bi_remaining. This patch > > > > > changes it to atomic_set(1) to avoid an interlocked instruction. In the > > > > > target's bi_endio routine we are sure that bi_remaining is zero > > > > > (otherwise, the bi_endio routine wouldn't be called) and there are no > > > > > concurrent users of the bio, so we can replace atomic_inc with > > > > > atomic_set(1). > > > > > > > > This isn't DM-specific. Shouldn't the other places in the tree that use > > > > atomic_inc on bi_remaining should really be converted at the same time? > > > > > > There is no 'atomic_inc.*bi_remaining' in other drivers. > > > > Wrong. I know btrfs has at least one. As does bcache afaik. > > grep -r 'atomic_inc.*bi_remaining' * yilds no hits in btrfs or bcache (on > git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git , > branch remotes/origin/for-next). It only finds > drivers/md/dm-cache-target.c, drivers/md/dm-verity.c, > drivers/md/dm-snap.c, drivers/md/dm-thin.c. Maybe in other git trees there > are more cases of this. Here is the btrfs one I was thinking of from Chris: https://patchwork.kernel.org/patch/3123121/ Should probably make its way into linux-block.git, Jens? ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH for-next] dm: fix missing bi_remaining accounting 2013-11-04 15:06 ` [PATCH for-next] dm: fix missing bi_remaining accounting Mikulas Patocka 2013-11-04 15:20 ` Mike Snitzer @ 2013-11-04 20:12 ` Kent Overstreet 1 sibling, 0 replies; 16+ messages in thread From: Kent Overstreet @ 2013-11-04 20:12 UTC (permalink / raw) To: Mikulas Patocka Cc: Jens Axboe, Mike Snitzer, dm-devel, linux-kernel, Alasdair Kergon On Mon, Nov 04, 2013 at 10:06:00AM -0500, Mikulas Patocka wrote: > > > On Fri, 1 Nov 2013, Jens Axboe wrote: > > > On 11/01/2013 07:59 AM, Mike Snitzer wrote: > > > Add the missing bi_remaining increment, required by the block layer's > > > new bio-chaining code, to both the verity and old snapshot DM targets. > > > > > > Otherwise users will hit the bi_remaining <= 0 BUG_ON in bio_endio(). > > > > Thanks Mike, added to the mix. > > > > -- > > Jens Axboe > > Hi > > This improves a little bit on the previous patch, by replacing costly > atomic_inc with cheap atomic_set. IMO, this is a bad idea; the behaviour with this patch does _not_ match the naming of bio_endio_nodec(), and the performance difference should be well in the noise anyways because we're touching a cacheline we already have in cache and won't be contended. The fact that it's currently safe is accidental, I could see this easily tripping people up and being a pain in the ass to debug in the future. > > > From: Mikulas Patocka <mpatocka@redhat.com> > > dm: change atomic_inc to atomic_set(1) > > There are places in dm where we save bi_endio and bi_private, set them to > target's routine, submit the bio, from the target's bi_endio routine we > restore bi_endio and bi_private and end the bio with bi_endio. > > This causes underflow of bi_remaining, so we must restore bi_remaining > before ending the bio from the target bi_endio routine. > > The code uses atomic_inc for restoration of bi_remaining. This patch > changes it to atomic_set(1) to avoid an interlocked instruction. 
In the > target's bi_endio routine we are sure that bi_remaining is zero > (otherwise, the bi_endio routine wouldn't be called) and there are no > concurrent users of the bio, so we can replace atomic_inc with > atomic_set(1).
end of thread, other threads:[~2013-11-05 0:56 UTC | newest]

Thread overview: 16+ messages -- links below jump to the message on this page:
2013-11-01 13:59 [PATCH for-next] dm: fix missing bi_remaining accounting Mike Snitzer
2013-11-01 15:03 ` Jens Axboe
2013-11-01 15:43   ` stec skd block driver needs updating for immutable biovec Mike Snitzer
2013-11-01 15:50     ` Christoph Hellwig
2013-11-01 16:02       ` Jens Axboe
2013-11-01 16:28         ` Mike Snitzer
2013-11-01 16:34           ` Jens Axboe
2013-11-01 17:46             ` Mike Snitzer
2013-11-04 11:25           ` Bartlomiej Zolnierkiewicz
2013-11-04 15:06   ` [PATCH for-next] dm: fix missing bi_remaining accounting Mikulas Patocka
2013-11-04 15:20     ` Mike Snitzer
2013-11-04 15:25       ` Mikulas Patocka
2013-11-04 16:04         ` Mike Snitzer
2013-11-04 17:49           ` Mikulas Patocka
2013-11-05  0:56             ` Mike Snitzer
2013-11-04 20:12     ` Kent Overstreet