public inbox for linux-kernel@vger.kernel.org
From: Jens Axboe <axboe@suse.de>
To: Valdis.Kletnieks@vt.edu
Cc: Andrew Morton <akpm@osdl.org>, linux-kernel@vger.kernel.org
Subject: Re: 2.6.17-rc5-mm3 - crash in cfq_queue_empty() after iosched change
Date: Tue, 6 Jun 2006 09:15:37 +0200
Message-ID: <20060606071537.GP4400@suse.de>
In-Reply-To: <200606051442.k55EghgI004703@turing-police.cc.vt.edu>

On Mon, Jun 05 2006, Valdis.Kletnieks@vt.edu wrote:
> I've been hitting this about once every two weeks for a while now,
> probably back to a 2.6.16-rc or so.  It always bites at the same point,
> very late in my laptop's bootup. I finally caught
> one when I had pen, paper, *and* time to chase it a bit rather than
> reboot.  Sorry for the very partial traceback, it's not a good CTS day
> and I didn't have a digital camera handy.
> 
> BUG: Unable to handle kernel NULL pointer dereference at 0x0000005c
> EIP at cfq_queue_empty+0x9/0x15
> call trace:
> 	elv_queue_empty+0x20/0x22
> 	ide_do_request+0xa4/0x788
> 	ide_intr+0x1ec/0x236
> 	handle_IRQ_event+0x27/0x52
> 	handle_level_IRQ+0xb6
> 	do_IRQ+0x5d/0x78
> 	common_interrupt+0x1a/0x20
> 
> In my .config:
> 
> CONFIG_IOSCHED_NOOP=y
> CONFIG_IOSCHED_AS=y
> CONFIG_IOSCHED_DEADLINE=y
> CONFIG_IOSCHED_CFQ=y
> CONFIG_DEFAULT_IOSCHED="anticipatory"
> 
> This happened very soon (within a few milliseconds) after my /etc/rc.local did:
> 
> echo cfq >| /sys/block/hda/queue/scheduler
> 
> (The next executable statement in /etc/rc.local is this:
> echo noop >| /sys/block/hdb/queue/scheduler  and 'last sysfs file' still
> pointed at /dev/hda).
> 
> It *looks* like the problem is in elevator_switch() in block/elevator.c:
> 
>        while (q->rq.elvpriv) {
>                 blk_remove_plug(q);
>                 q->request_fn(q);
>                 spin_unlock_irq(q->queue_lock);
>                 msleep(10);
>                 spin_lock_irq(q->queue_lock);
>                 elv_drain_elevator(q);
>         }
> 
> this--> spin_unlock_irq(q->queue_lock);
> 
>         /*
>          * unregister old elevator data
>          */
>         elv_unregister_queue(q);
>         old_elevator = q->elevator;
> 
>         /*
>          * attach and start new elevator
>          */
>         if (elevator_attach(q, e))
>                 goto fail;
> 
> should be down here someplace, after elevator_attach(), I suspect?
> Looks like the disk popped an IRQ after we had installed the
> iosched_cfq.ops[] but q->elevator->elevator_data hadn't been
> initialized yet...
> 
> (I'd attach a patch, except I'm not positive I have the diagnosis
> right?)

I think your analysis is pretty good; there's definitely a window there
where we don't want the queueing invoked. Does this help?

diff --git a/block/elevator.c b/block/elevator.c
index 8768a36..429702a 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -806,8 +806,6 @@ static int elevator_switch(request_queue
 		elv_drain_elevator(q);
 	}
 
-	spin_unlock_irq(q->queue_lock);
-
 	/*
 	 * unregister old elevator data
 	 */
@@ -823,6 +821,8 @@ static int elevator_switch(request_queue
 	if (elv_register_queue(q))
 		goto fail_register;
 
+	spin_unlock_irq(q->queue_lock);
+
 	/*
 	 * finally exit old elevator and turn off BYPASS.
 	 */
@@ -831,16 +831,19 @@ static int elevator_switch(request_queue
 	return 1;
 
 fail_register:
+	spin_unlock_irq(q->queue_lock);
 	/*
 	 * switch failed, exit the new io scheduler and reattach the old
 	 * one again (along with re-adding the sysfs dir)
 	 */
 	elevator_exit(e);
 	e = NULL;
+	spin_lock_irq(q->queue_lock);
 fail:
 	q->elevator = old_elevator;
 	elv_register_queue(q);
 	clear_bit(QUEUE_FLAG_ELVSWITCH, &q->queue_flags);
+	spin_unlock_irq(q->queue_lock);
 	if (e)
 		kobject_put(&e->kobj);
 	return 0;
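
For illustration only, the window described in the quoted report can be
modelled in user space. This is a minimal sketch, not kernel code: every
toy_* name is hypothetical, the "lock" is a plain flag that defers a
simulated interrupt the way spin_lock_irq() would mask a real one, and the
model ignores everything else the real functions do (including whether
they may be called with the lock held). It assumes, as the report
suggests, that the interrupt lands after the new ops are visible but
before elevator_data is set.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; these are NOT the real definitions. */
struct toy_elevator {
    int (*queue_empty)(struct toy_elevator *e);
    int *elevator_data;          /* NULL until the scheduler's init has run */
};

struct toy_queue {
    struct toy_elevator *elevator;
    bool lock_held;              /* models spin_lock_irq(): IRQs masked while true */
    bool irq_pending;            /* an interrupt arrived while masked */
    bool saw_null_data;          /* handler found elevator_data == NULL */
};

/* Models a queue_empty hook that dereferences its private data. */
static int toy_queue_empty(struct toy_elevator *e)
{
    return *e->elevator_data == 0;
}

/* Models the interrupt path that ends in the queue_empty hook. */
static void toy_irq(struct toy_queue *q)
{
    if (q->lock_held) {          /* masked: deliver later, at unlock time */
        q->irq_pending = true;
        return;
    }
    if (q->elevator->elevator_data == NULL) {
        q->saw_null_data = true; /* the real kernel would oops here */
        return;
    }
    (void)q->elevator->queue_empty(q->elevator);
}

/* Models spin_unlock_irq(): drop the lock, then run any deferred interrupt. */
static void toy_unlock(struct toy_queue *q)
{
    q->lock_held = false;
    if (q->irq_pending) {
        q->irq_pending = false;
        toy_irq(q);
    }
}

/*
 * unlock_early == true  models the ordering before the patch,
 * unlock_early == false models the ordering after it.
 * Returns true if the simulated handler saw uninitialized data.
 */
static bool toy_switch_hits_null(bool unlock_early)
{
    static int pending_requests; /* zero-initialized private data */
    struct toy_elevator old_elv = { toy_queue_empty, &pending_requests };
    struct toy_elevator new_elv = { toy_queue_empty, NULL };
    struct toy_queue q = { &old_elv, true, false, false };

    if (unlock_early)
        toy_unlock(&q);

    q.elevator = &new_elv;                      /* ops installed, data not set */
    toy_irq(&q);                                /* interrupt arrives in the window */
    new_elv.elevator_data = &pending_requests;  /* init completes */

    if (!unlock_early)
        toy_unlock(&q);

    return q.saw_null_data;
}
```

Calling toy_switch_hits_null(true) reports that the handler saw NULL
data, while toy_switch_hits_null(false) does not, because the simulated
interrupt is deferred until the new elevator is fully set up.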

-- 
Jens Axboe



