From: Jens Axboe <jens.axboe@oracle.com>
To: Mike Anderson <andmike@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>,
James Bottomley <James.Bottomley@HansenPartnership.com>,
linux-scsi <linux-scsi@vger.kernel.org>,
Linux Kernel <linux-kernel@vger.kernel.org>,
IDE/ATA development list <linux-ide@vger.kernel.org>
Subject: Re: [PATCH] block: add timer on blkdev_dequeue_request() not elv_next_request()
Date: Thu, 30 Oct 2008 12:14:42 +0100
Message-ID: <20081030111441.GR31673@kernel.dk>
In-Reply-To: <20081030092741.GA9478@linux.vnet.ibm.com>
On Thu, Oct 30 2008, Mike Anderson wrote:
> Jens Axboe <jens.axboe@oracle.com> wrote:
> > On Thu, Oct 30 2008, Tejun Heo wrote:
> > > Jens Axboe wrote:
> > > > That's actually a pretty dumb error, I'm surprised it hasn't reared its
> > > > ugly face in more ways. Presumably because the timeout is usually so
> > > > long, that we'll get to actually issuing and completing it within the
> > > > normal timeout anyway.
> > >
> > > Heh... it showed its ugly face in many different ways while I was
> > > playing with PMP connected via a very long eSATA cable.
> >
> > Ah :-)
> >
> > If we had it wired up for eg the old IDE drivers, it would have shown up
> > quite quickly as well I think.
>
> I am getting errors now and my system will not boot up. The system is
> connected to storage with active / passive paths. If we are doing a
> BLKPREP_KILL we will call elv_dequeue_request which will add the
> timer for the request we are killing.
>
> The attached patch is a quick patch to work around my issue, but we
> probably need something better. I would like to run some short timeout
> testing on it for a while (though that previously did not catch Tejun's
> issue). I will look at this more tomorrow unless someone beats me to it.
>
> -andmike
> --
> Michael Anderson
> andmike@linux.vnet.ibm.com
>
> 1.) Dmesg
> [ 35.104483] Buffer I/O error on device sda, logical block 0
> [ 35.104493] ------------[ cut here ]------------
> [ 35.104496] WARNING: at /home/data/xbuild/source/linux-2.6_work/lib/list_debug.c:30 __list_add+0x6e/0x87()
> [ 35.104499] list_add corruption. prev->next should be next (ffff88007c5be568), but was ffff88007bd3ce88. (prev=ffff88007bd3ce88).
> [ 35.104501] Modules linked in: sd_mod(+) usbcore(+) dm_snapshot dm_mod edd ext3 jbd fan mptsas mptscsih mptbase scsi_transport_sas lpfc scsi_transport_fc scsi_dh_rdac scsi_dh scsi_mod thermal processor
> [ 35.104512] Pid: 1168, comm: modprobe Not tainted 2.6.28-rc2-00044-g9678cc9 #1
> [ 35.104514] Call Trace:
> [ 35.104521] [<ffffffff8023a368>] warn_slowpath+0xae/0xd5
> [ 35.104527] [<ffffffff80256daf>] ? trace_hardirqs_off+0xd/0xf
> [ 35.104531] [<ffffffff8025ad17>] ? __lock_acquire+0x1541/0x155d
> [ 35.104537] [<ffffffff80457e17>] ? printk+0x67/0x70
> [ 35.104542] [<ffffffff80335106>] ? __ratelimit+0xb6/0xc0
> [ 35.104546] [<ffffffff80387d9e>] ? mix_pool_bytes_extract+0x13e/0x14d
> [ 35.104549] [<ffffffff80256daf>] ? trace_hardirqs_off+0xd/0xf
> [ 35.104553] [<ffffffff8045acb9>] ? _spin_unlock_irqrestore+0x38/0x47
> [ 35.104556] [<ffffffff8033b96b>] __list_add+0x6e/0x87
> [ 35.104561] [<ffffffff80327c6a>] blk_add_timer+0x99/0xe4
> [ 35.104564] [<ffffffff803219d4>] elv_dequeue_request+0x53/0x55
> [ 35.104567] [<ffffffff80323395>] end_that_request_last+0x4b/0x1e5
> [ 35.104570] [<ffffffff80323610>] __blk_end_request+0x3c/0x45
> [ 35.104572] [<ffffffff80321b9c>] elv_next_request+0x1c6/0x234
> [ 35.104576] [<ffffffff80243777>] ? lock_timer_base+0x26/0x4a
> [ 35.104608] [<ffffffffa0028d0a>] scsi_request_fn+0x93/0x510 [scsi_mod]
> [ 35.104611] [<ffffffff80323e02>] __generic_unplug_device+0x27/0x2c
> [ 35.104614] [<ffffffff80323e2b>] blk_start_queueing+0x24/0x26
> [ 35.104617] [<ffffffff8032ec40>] cfq_insert_request+0x333/0x352
> [ 35.104620] [<ffffffff80321dff>] elv_insert+0x1f5/0x29e
> [ 35.104622] [<ffffffff80321f3e>] __elv_add_request+0x96/0x9f
> [ 35.104625] [<ffffffff803249d6>] __make_request+0x405/0x46e
> [ 35.104628] [<ffffffff80258eed>] ? trace_hardirqs_on_caller+0xf9/0x124
> [ 35.104631] [<ffffffff8032306b>] generic_make_request+0x3a6/0x3e9
> [ 35.104635] [<ffffffff8027c1e3>] ? mempool_alloc+0x5b/0x113
> [ 35.104638] [<ffffffff80323161>] submit_bio+0xb3/0xbc
> [ 35.104643] [<ffffffff802c650f>] submit_bh+0xea/0x10e
> [ 35.104647] [<ffffffff802c98a3>] block_read_full_page+0x273/0x292
> [ 35.104650] [<ffffffff802cb011>] ? blkdev_get_block+0x0/0x5d
> [ 35.104652] [<ffffffff80279dd3>] ? __lock_page+0x63/0x6a
> [ 35.104656] [<ffffffff802cc502>] blkdev_readpage+0x13/0x15
> [ 35.104658] [<ffffffff8027a454>] read_cache_page_async+0x118/0x143
> [ 35.104661] [<ffffffff802cc4ef>] ? blkdev_readpage+0x0/0x15
> [ 35.104663] [<ffffffff8027a48d>] read_cache_page+0xe/0x46
> [ 35.104668] [<ffffffff802f03a0>] read_dev_sector+0x2e/0x93
> [ 35.104671] [<ffffffff802f411f>] read_lba+0x5b/0xb3
> [ 35.104674] [<ffffffff80258f25>] ? trace_hardirqs_on+0xd/0xf
> [ 35.104677] [<ffffffff802f43cb>] efi_partition+0x9f/0x5f8
> [ 35.104679] [<ffffffff80258f25>] ? trace_hardirqs_on+0xd/0xf
> [ 35.104682] [<ffffffff80258eed>] ? trace_hardirqs_on_caller+0xf9/0x124
> [ 35.104686] [<ffffffff802f0bb2>] rescan_partitions+0x159/0x2e1
> [ 35.104693] [<ffffffffa01a047a>] ? sd_open+0xcd/0x18d [sd_mod]
> [ 35.104695] [<ffffffff802cbf4b>] __blkdev_get+0x1ff/0x2cf
> [ 35.104699] [<ffffffff80332697>] ? kobject_put+0x47/0x4b
> [ 35.104701] [<ffffffff802cc026>] blkdev_get+0xb/0xd
> [ 35.104704] [<ffffffff802f04e5>] register_disk+0xe0/0x145
> [ 35.104707] [<ffffffff80329280>] add_disk+0xc0/0x124
> [ 35.104711] [<ffffffffa01a1fa8>] sd_probe+0x2c6/0x39e [sd_mod]
> [ 35.104715] [<ffffffff802f6c1e>] ? sysfs_create_link+0xe/0x10
> [ 35.104719] [<ffffffff803ad9f2>] driver_probe_device+0xc0/0x16e
> [ 35.104721] [<ffffffff803adb02>] __driver_attach+0x62/0x8c
> [ 35.104724] [<ffffffff803adaa0>] ? __driver_attach+0x0/0x8c
> [ 35.104726] [<ffffffff803ad269>] bus_for_each_dev+0x52/0x8c
> [ 35.104729] [<ffffffff803ad83a>] driver_attach+0x1c/0x1e
> [ 35.104731] [<ffffffff803acb2a>] bus_add_driver+0xba/0x207
> [ 35.104734] [<ffffffff803adcf6>] driver_register+0xab/0x12b
> [ 35.104749] [<ffffffffa002dcbc>] scsi_register_driver+0x11/0x13 [scsi_mod]
> [ 35.104752] [<ffffffffa01ac0a4>] init_sd+0xa4/0xff [sd_mod]
> [ 35.104756] [<ffffffffa01ac000>] ? init_sd+0x0/0xff [sd_mod]
> [ 35.104760] [<ffffffff80209058>] _stext+0x58/0x138
> [ 35.104763] [<ffffffff80256daf>] ? trace_hardirqs_off+0xd/0xf
> [ 35.104766] [<ffffffff8045acb9>] ? _spin_unlock_irqrestore+0x38/0x47
> [ 35.104769] [<ffffffff80233a6c>] ? try_to_wake_up+0x186/0x198
> [ 35.104772] [<ffffffff80258f25>] ? trace_hardirqs_on+0xd/0xf
> [ 35.104775] [<ffffffff80258eed>] ? trace_hardirqs_on_caller+0xf9/0x124
> [ 35.104780] [<ffffffff802642a7>] sys_init_module+0xab/0x1bc
> [ 35.104783] [<ffffffff8020be9b>] system_call_fastpath+0x16/0x1b
> [ 35.104785] ---[ end trace 08ddd733955d992c ]---
> [
>
> 2.) Patch
>
> From 698916fcee612e84ecce89f27ea66dd5f21bc351 Mon Sep 17 00:00:00 2001
> From: Mike Anderson <andmike@linux.vnet.ibm.com>
> Date: Thu, 30 Oct 2008 02:16:20 -0700
> Subject: [PATCH 1/1] blk: move blk_delete_timer call in end_that_request_last
>
> Move the blk_delete_timer call to later in end_that_request_last to
> address an issue where blkdev_dequeue_request may have added a timer for
> the request.
>
> Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
> ---
> block/blk-core.c | 4 ++--
> 1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index c3df30c..10e8a64 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -1770,8 +1770,6 @@ static void end_that_request_last(struct request *req, int error)
>  {
>  	struct gendisk *disk = req->rq_disk;
>
> -	blk_delete_timer(req);
> -
>  	if (blk_rq_tagged(req))
>  		blk_queue_end_tag(req->q, req);
>
> @@ -1781,6 +1779,8 @@ static void end_that_request_last(struct request *req, int error)
>  	if (unlikely(laptop_mode) && blk_fs_request(req))
>  		laptop_io_completion();
>
> +	blk_delete_timer(req);
> +
>  	/*
>  	 * Account IO completion. bar_rq isn't accounted as a normal
>  	 * IO on queueing nor completion. Accounting the containing
> --
> 1.5.6.5
Good catch, I queued this up for 2.6.28 (and added Tejun's ack).
--
Jens Axboe
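
The ordering problem in the trace above (end_that_request_last calling elv_dequeue_request, which calls blk_add_timer) can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: struct toy_request and every toy_* function are hypothetical stand-ins, and the two booleans replace the real queue and timeout lists. It assumes that ending a still-queued request dequeues it, and that dequeueing arms the timeout.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct request: only the state that matters here. */
struct toy_request {
	bool queued;      /* still on the dispatch queue */
	bool timer_armed; /* linked on the queue's timeout list */
};

static void toy_add_timer(struct toy_request *rq)
{
	rq->timer_armed = true;
}

static void toy_delete_timer(struct toy_request *rq)
{
	rq->timer_armed = false;
}

/* Dequeueing arms the timeout (the behaviour assumed from the trace). */
static void toy_dequeue(struct toy_request *rq)
{
	assert(rq->queued);
	rq->queued = false;
	toy_add_timer(rq);
}

/*
 * Old order: the timer is deleted first, then a still-queued request
 * (the killed-request case) is dequeued, which arms the timer again.
 * Returns true if a timer is left armed on a request that is finished.
 */
bool toy_end_request_old(struct toy_request *rq)
{
	toy_delete_timer(rq);
	if (rq->queued)
		toy_dequeue(rq);
	return rq->timer_armed;
}

/* Patched order: dequeue first, delete the timer afterwards. */
bool toy_end_request_new(struct toy_request *rq)
{
	if (rq->queued)
		toy_dequeue(rq);
	toy_delete_timer(rq);
	return rq->timer_armed;
}
```

Calling toy_end_request_old() on a request with queued set reports a timer still armed after completion, while toy_end_request_new() on the same starting state does not. In the real kernel the stale entry would sit on the timeout list after the request is finished, which is consistent with the list_add corruption warning in the report.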
Thread overview: 9+ messages
2008-10-30 3:19 [PATCH] block: add timer on blkdev_dequeue_request() not elv_next_request() Tejun Heo
2008-10-30 3:19 ` Tejun Heo
2008-10-30 7:29 ` Jens Axboe
2008-10-30 7:55 ` Tejun Heo
2008-10-30 7:58 ` Jens Axboe
2008-10-30 9:27 ` Mike Anderson
2008-10-30 9:34 ` Tejun Heo
2008-10-30 9:51 ` Tejun Heo
2008-10-30 11:14 ` Jens Axboe [this message]