public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Mike Christie <mchristie@fusionio.com>
To: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>,
	Jens Axboe <axboe@kernel.dk>,
	Alexander Gordeev <agordeev@redhat.com>,
	Tejun Heo <tj@kernel.org>, <linux-kernel@vger.kernel.org>,
	<linux-ide@vger.kernel.org>, Jeff Garzik <jgarzik@pobox.com>,
	linux-scsi <linux-scsi@vger.kernel.org>
Subject: Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
Date: Sat, 20 Jul 2013 09:48:28 -0500	[thread overview]
Message-ID: <51EAA33C.9010405@fusionio.com> (raw)
In-Reply-To: <1374296162.7397.1098.camel@haakon3.risingtidesystems.com>

[-- Attachment #1: Type: text/plain, Size: 4062 bytes --]

On 07/19/2013 11:56 PM, Nicholas A. Bellinger wrote:
> On Fri, 2013-07-19 at 14:01 -0700, Nicholas A. Bellinger wrote:
>> On Fri, 2013-07-19 at 08:33 -0700, James Bottomley wrote:
>>> On Thu, 2013-07-18 at 23:34 -0700, Nicholas A. Bellinger wrote:
>>>> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
>>>> index 0101af5..191bc15 100644
>>>> --- a/drivers/ata/libata-scsi.c
>>>> +++ b/drivers/ata/libata-scsi.c
>>>> @@ -1144,7 +1144,11 @@ static int ata_scsi_dev_config(struct scsi_device *sdev,
>>>>                         "sector_size=%u > PAGE_SIZE, PIO may malfunction\n",
>>>>                         sdev->sector_size);
>>>>  
>>>> -       blk_queue_update_dma_alignment(q, sdev->sector_size - 1);
>>>> +       if (!q->mq_ops) {
>>>> +               blk_queue_update_dma_alignment(q, sdev->sector_size - 1);
>>>> +       } else {
>>>> +               printk("Skipping dma_alignment for libata w/ scsi-mq\n");
>>>> +       }
>>>
>>> Amazingly enough there is a reason for the dma alignment, and it wasn't
>>> just to annoy you, so you can't blindly do this.
>>>
>>> The email thread is probably lost in the mists of time, but if I
>>> remember correctly the problem is that some ahci DMA controllers barf if
>>> the sector they're doing DMA on crosses a page boundary.  Some are
>>> annoying enough to actually cause silent data corruption.  You won't
>>> find every ahci DMA controller doing this, so the change will work for
>>> some, but it will be hard to identify those it won't work for until
>>> people start losing data.
>>
>> Thanks for the extra background.
>>
>> So at least from what I gather thus far this shouldn't be an issue for
>> initial testing with scsi-mq <-> libata w/ ata_piix.
>>
>>>
>>> The correct fix, obviously, is to do the bio copy on the kernel path for
>>> unaligned data.  It is OK to assume that REQ_TYPE_FS data is correctly
>>> aligned (because of the block to page alignment).
>>>
>>
>> Indeed.  Looking into the bio_copy_kern() breakage next..
>>
> 
> OK, after further investigation the root cause is a actually a missing
> bio->bio_end_io() -> bio_copy_kern_endio() -> bio_put() from the
> blk_end_sync_rq() callback path that scsi-mq REQ_TYPE_BLOCK_PC is
> currently using.
> 
> Including the following patch into the scsi-mq working branch now, and
> reverting the libata dma_alignment=0x03 hack.
> 
> Alexander, care to give this a try..?
> 
> --nab
> 
> diff --git a/block/blk-exec.c b/block/blk-exec.c
> index 0761c89..70303d2 100644
> --- a/block/blk-exec.c
> +++ b/block/blk-exec.c
> @@ -25,7 +25,10 @@ static void blk_end_sync_rq(struct request *rq, int error)
>         struct completion *waiting = rq->end_io_data;
>  
>         rq->end_io_data = NULL;
> -       if (!rq->q->mq_ops) {
> +       if (rq->q->mq_ops) {
> +               if (rq->bio)
> +                       bio_endio(rq->bio, error);
> +       } else {
>                 __blk_put_request(rq->q, rq);
>         }
> 


This does not handle requests with multiple bios, and for the mq stye
passthrough insertion completions you actually want to call
blk_mq_finish_request in scsi_execute. Same for all the other
passthrough code in your scsi mq tree. That is your root bug. Instead of
doing that though I think we want to have the block layer free the bios
like before.

For the non mq calls, blk_end_request type of calls will complete the
bios when blk_finish_request is called from that path. It will then call
the rq end_io callback.

I think the blk mq code assumes if the end_io callack is set that the
end_io function will do the bio cleanup. See __blk_mq_end_io. Also see
how blk_mq_execute_rq calls blk_mq_finish_request for an example of how
rq passthrough execution and cleanup is being done in the mq paths.

Now with the scsi mq changes, when blk_execute_rq_nowait calls
blk_mq_insert_request it calls it with a old non mq style of end io
function that does not complete the bios.

What about the attached only compile tested patch. The patch has the mq
block code work like the non mq code for bio cleanups.



[-- Attachment #2: blk-mq-free-bio.patch --]
[-- Type: text/plain, Size: 2513 bytes --]

blk-mq: blk-mq should free bios in pass through case

For non mq calls, the block layer will free the bios when
blk_finish_request is called.

For mq calls, the blk mq code wants the caller to do this.

This patch has the blk mq code work like the non mq code
and has the block layer free the bios.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>

diff --git a/block/blk-flush.c b/block/blk-flush.c
index c56c37d..3e4cc9c 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -231,7 +231,7 @@ static void flush_end_io(struct request *flush_rq, int error)
 	unsigned long flags = 0;
 
 	if (q->mq_ops) {
-		blk_mq_finish_request(flush_rq, error);
+		blk_mq_free_request(flush_rq);
 		spin_lock_irqsave(&q->mq_flush_lock, flags);
 	}
 	running = &q->flush_queue[q->flush_running_idx];
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 799d305..5489b5a 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -270,7 +270,7 @@ void blk_mq_free_request(struct request *rq)
 }
 EXPORT_SYMBOL(blk_mq_free_request);
 
-void blk_mq_finish_request(struct request *rq, int error)
+static void blk_mq_finish_request(struct request *rq, int error)
 {
 	struct bio *bio = rq->bio;
 	unsigned int bytes = 0;
@@ -286,22 +286,17 @@ void blk_mq_finish_request(struct request *rq, int error)
 
 	blk_account_io_completion(rq, bytes);
 	blk_account_io_done(rq);
-	blk_mq_free_request(rq);
 }
 
 void blk_mq_complete_request(struct request *rq, int error)
 {
 	trace_block_rq_complete(rq->q, rq);
+	blk_mq_finish_request(rq, error);
 
-	/*
-	 * If ->end_io is set, it's responsible for doing the rest of the
-	 * completion.
-	 */
 	if (rq->end_io)
 		rq->end_io(rq, error);
 	else
-		blk_mq_finish_request(rq, error);
-
+		blk_mq_free_request(rq);
 }
 
 void __blk_mq_end_io(struct request *rq, int error)
@@ -973,8 +968,7 @@ int blk_mq_execute_rq(struct request_queue *q, struct request *rq)
 	if (rq->errors)
 		err = -EIO;
 
-	blk_mq_finish_request(rq, rq->errors);
-
+	blk_mq_free_request(rq);
 	return err;
 }
 EXPORT_SYMBOL(blk_mq_execute_rq);
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 42d0110..52bf1f9 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -27,7 +27,6 @@ void blk_mq_complete_request(struct request *rq, int error);
 void blk_mq_run_request(struct request *rq, bool run_queue, bool async);
 void blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async);
 void blk_mq_init_flush(struct request_queue *q);
-void blk_mq_finish_request(struct request *rq, int error);
 
 /*
  * CPU hotplug helpers

  reply	other threads:[~2013-07-20 14:48 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-21 19:00 [PATCH RESEND 0/1] AHCI: Optimize interrupt processing Alexander Gordeev
2013-05-21 19:00 ` [PATCH RESEND 1/1] " Alexander Gordeev
2013-05-21 23:50 ` [PATCH RESEND 0/1] " Tejun Heo
2013-05-22 14:39   ` Alexander Gordeev
2013-05-22 17:03     ` Jens Axboe
2013-07-11 10:26       ` Alexander Gordeev
2013-07-11 23:00         ` Nicholas A. Bellinger
2013-07-12  7:46           ` Alexander Gordeev
2013-07-13  5:20             ` Nicholas A. Bellinger
2013-07-16 18:32               ` Alexander Gordeev
2013-07-16 21:38                 ` Nicholas A. Bellinger
2013-07-17 16:19                   ` Alexander Gordeev
2013-07-18 18:51                     ` Nicholas A. Bellinger
2013-07-18 19:12                       ` Mike Christie
2013-07-19  0:23                         ` Nicholas A. Bellinger
2013-07-19  0:30                           ` Jens Axboe
2013-07-19  1:03                             ` Nicholas A. Bellinger
2013-07-19  6:34                               ` Nicholas A. Bellinger
2013-07-19 15:33                                 ` James Bottomley
2013-07-19 21:01                                   ` Nicholas A. Bellinger
2013-07-20  4:56                                     ` Nicholas A. Bellinger
2013-07-20 14:48                                       ` Mike Christie [this message]
2013-07-20 22:14                                         ` Nicholas A. Bellinger
2013-07-20 23:57                                         ` Jens Axboe
2013-08-09 19:15                                         ` Alexander Gordeev
2013-08-09 20:17                                           ` Nicholas A. Bellinger
2013-08-15 16:23                                             ` Alexander Gordeev
2013-08-16  2:19                                               ` Nicholas A. Bellinger
2013-08-16 16:41                                                 ` Alexander Gordeev
2013-08-16 17:46                                                   ` Nicholas A. Bellinger
2013-08-28 15:56                                                 ` Alexander Gordeev
2013-09-20 15:19                                                 ` Alexander Gordeev
2013-09-20 20:41                                                   ` Nicholas A. Bellinger
2013-10-03 11:06                                         ` Christoph Hellwig
2013-10-07 14:44                                           ` Alexander Gordeev
2013-07-22 15:03                                       ` Alexander Gordeev
2013-07-22 21:10                                         ` Nicholas A. Bellinger
2013-07-25 10:16                                           ` Alexander Gordeev
2013-07-25 22:08                                             ` Nicholas A. Bellinger
2013-07-26  2:09                                               ` Jens Axboe
2013-07-26 21:14                                                 ` Nicholas A. Bellinger
2013-07-27  0:43                                                   ` Nicholas A. Bellinger
2013-07-29 11:18                                                     ` Alexander Gordeev
2013-07-29 14:08                                                       ` Jens Axboe
2013-07-29 19:19                                                       ` Nicholas A. Bellinger
2013-07-31  4:16                                                         ` Marc C
2013-07-31 10:23                                                           ` Tejun Heo
2013-07-29 11:50                                                     ` Tejun Heo
2013-07-29 19:11                                                       ` Nicholas A. Bellinger
2013-07-29 11:46                                                   ` Tejun Heo
2013-07-29 14:03                                                     ` Jens Axboe
2013-08-09  8:23                                                     ` Alexander Gordeev
2013-08-09 14:15                                                       ` Tejun Heo
2013-08-09 14:24                                                       ` Jens Axboe
2013-08-09 15:07                                                         ` Alexander Gordeev
2013-08-09 15:52                                                           ` Jens Axboe
2013-08-09 16:46                                                             ` Alexander Gordeev
2013-08-09 17:07                                                               ` Jens Axboe
2013-08-12 15:21                                                                 ` Alexander Gordeev
2013-07-29  7:28                                                 ` Hannes Reinecke
2013-07-31 17:11                                             ` Alexander Gordeev
2013-07-19 15:58                           ` Mike Christie
2013-07-19 21:05                             ` Nicholas A. Bellinger
2013-07-18 19:14                       ` Nicholas A. Bellinger
2013-07-18 21:21                         ` Jens Axboe
2014-09-11 12:42   ` Alexander Gordeev
2014-09-11 12:44     ` [PATCH v2] " Alexander Gordeev
2014-09-13  4:43       ` Tejun Heo
2014-09-11 13:36     ` [PATCH RESEND 0/1] " Bartlomiej Zolnierkiewicz
2014-09-13  4:46     ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51EAA33C.9010405@fusionio.com \
    --to=mchristie@fusionio.com \
    --cc=James.Bottomley@HansenPartnership.com \
    --cc=agordeev@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=jgarzik@pobox.com \
    --cc=linux-ide@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=nab@linux-iscsi.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox