public inbox for linux-kernel@vger.kernel.org
* 2.6.0-test8, DEBUG_SLAB, oops in as_latter_request()
From: Peter Osterlund @ 2003-10-19 19:54 UTC
  To: Jens Axboe, Nick Piggin; +Cc: Kernel Mailing List

I was running 2.6.0-test8 compiled with CONFIG_DEBUG_SLAB=y. When
testing the CDRW packet writing driver, I got an oops in
as_latter_request. (Full oops at the end of this message.) It is
repeatable and happens because arq->rb_node.rb_right is uninitialized.

Although I have only seen this when using the packet writing driver, I
don't think that driver is causing the problem. It isn't doing
anything fancy with the cdrom request queue. It is only submitting a
bunch of read/write requests using submit_bio() when the oops happens.

This patch appears to fix the problem, but I haven't tried to
understand the AS code, so I don't know if this is the correct
solution.

--- linux/drivers/block/as-iosched.c~	2003-10-09 18:54:36.000000000 +0200
+++ linux/drivers/block/as-iosched.c	2003-10-19 20:33:45.000000000 +0200
@@ -1718,6 +1718,7 @@
 	struct as_rq *arq = mempool_alloc(ad->arq_pool, gfp_mask);
 
 	if (arq) {
+		memset(&arq->rb_node, 0, sizeof(struct rb_node));
 		RB_CLEAR(&arq->rb_node);
 		arq->request = rq;
 		arq->state = AS_RQ_NEW;

Here is the oops:

Unable to handle kernel paging request at virtual address 5a5a5a66
 printing eip:
c02213ef
*pde = 00000000
Oops: 0000 [#1]
CPU:    0
EIP:    0060:[<c02213ef>]    Not tainted
EFLAGS: 00010006
EIP is at rb_next+0xf/0x60
eax: 5a5a5a5a   ebx: c5ec0510   ecx: c373597c   edx: 5a5a5a5a
esi: c373597c   edi: 00000292   ebp: c3375d98   esp: c3375d98
ds: 007b   es: 007b   ss: 0068
Process ld (pid: 1290, threadinfo=c3374000 task=c3656fc0)
Stack: c3375da4 c0266df4 c5e8277c c3375db4 c025d37e c5ec0510 c373597c c3375de4 
       c02609df c5ec0510 c373597c c373597c c3375de4 c027da4c 00000000 c5e59d2c 
       c373597c c041280c c5e59d2c c3375e04 c027dab2 c5ec0510 c373597c 00000000 
Call Trace:
 [<c0266df4>] as_latter_request+0x14/0x30
 [<c025d37e>] elv_latter_request+0x2e/0x30
 [<c02609df>] blk_attempt_remerge+0x8f/0x1a0
 [<c027da4c>] restore_request+0xcc/0xe0
 [<c027dab2>] cdrom_start_read+0x52/0xb0
 [<c027eab4>] ide_do_rw_cdrom+0x74/0x190
 [<c026b141>] start_request+0x181/0x290
 [<c026b4c3>] ide_do_request+0x243/0x4c0
 [<c026c1ab>] ide_intr+0x36b/0x5d0
 [<c027d220>] cdrom_read_intr+0x0/0x3f0
 [<c010c1bb>] handle_IRQ_event+0x3b/0x70
 [<c010c7d1>] do_IRQ+0x141/0x3b0
 [<c010a6f8>] common_interrupt+0x18/0x20
 [<c0127d1d>] do_softirq+0x4d/0xb0
 [<c010c8e5>] do_IRQ+0x255/0x3b0
 [<c015d9aa>] sys_brk+0xfa/0x130
 [<c010a6f8>] common_interrupt+0x18/0x20

Code: 8b 40 0c 85 c0 74 13 8d 76 00 8d bc 27 00 00 00 00 89 c2 8b 
 <0>Kernel panic: Fatal exception in interrupt
In interrupt handler - not syncing

-- 
Peter Osterlund - petero2@telia.com
http://w1.894.telia.com/~u89404340


* Re: 2.6.0-test8, DEBUG_SLAB, oops in as_latter_request()
From: Andrew Morton @ 2003-10-19 21:20 UTC
  To: Peter Osterlund; +Cc: axboe, piggin, linux-kernel

Peter Osterlund <petero2@telia.com> wrote:
>
> I was running 2.6.0-test8 compiled with CONFIG_DEBUG_SLAB=y. When
>  testing the CDRW packet writing driver, I got an oops in
>  as_latter_request. (Full oops at the end of this message.) It is
>  repeatable and happens because arq->rb_node.rb_right is uninitialized.

deadline seems to have the same problem.

We may as well squish this with the big hammer?

 drivers/block/as-iosched.c       |    1 +
 drivers/block/deadline-iosched.c |    1 +
 2 files changed, 2 insertions(+)

diff -puN drivers/block/as-iosched.c~iosched-oops-fixes drivers/block/as-iosched.c
--- 25/drivers/block/as-iosched.c~iosched-oops-fixes	2003-10-19 14:17:39.000000000 -0700
+++ 25-akpm/drivers/block/as-iosched.c	2003-10-19 14:18:09.000000000 -0700
@@ -1718,6 +1718,7 @@ static int as_set_request(request_queue_
 	struct as_rq *arq = mempool_alloc(ad->arq_pool, gfp_mask);
 
 	if (arq) {
+		memset(arq, 0, sizeof(*arq));
 		RB_CLEAR(&arq->rb_node);
 		arq->request = rq;
 		arq->state = AS_RQ_NEW;
diff -puN drivers/block/deadline-iosched.c~iosched-oops-fixes drivers/block/deadline-iosched.c
--- 25/drivers/block/deadline-iosched.c~iosched-oops-fixes	2003-10-19 14:17:39.000000000 -0700
+++ 25-akpm/drivers/block/deadline-iosched.c	2003-10-19 14:17:39.000000000 -0700
@@ -775,6 +775,7 @@ deadline_set_request(request_queue_t *q,
 
 	drq = mempool_alloc(dd->drq_pool, gfp_mask);
 	if (drq) {
+		memset(drq, 0, sizeof(*drq));
 		RB_CLEAR(&drq->rb_node);
 		drq->request = rq;
 

_



* Re: 2.6.0-test8, DEBUG_SLAB, oops in as_latter_request()
From: Nick Piggin @ 2003-10-20  0:25 UTC
  To: Andrew Morton; +Cc: Peter Osterlund, axboe, linux-kernel



Andrew Morton wrote:

>Peter Osterlund <petero2@telia.com> wrote:
>
>>I was running 2.6.0-test8 compiled with CONFIG_DEBUG_SLAB=y. When
>> testing the CDRW packet writing driver, I got an oops in
>> as_latter_request. (Full oops at the end of this message.) It is
>> repeatable and happens because arq->rb_node.rb_right is uninitialized.
>>
>
>deadline seems to have the same problem.
>
>We may as well squish this with the big hammer?
>

Thanks for the report, Peter.

The request is a special request, so either blk_attempt_remerge should
never be called on it, or blk_attempt_remerge (or as_latter_request) should
check for this. It's up to Jens.

I would say to stick something like
if (!rq_mergeable(rq))
    return;

into blk_attempt_remerge.

I'd say we shouldn't expect drivers to try to get this right.



* Re: 2.6.0-test8, DEBUG_SLAB, oops in as_latter_request()
From: Jens Axboe @ 2003-10-20  7:09 UTC
  To: Nick Piggin; +Cc: Andrew Morton, Peter Osterlund, linux-kernel

On Mon, Oct 20 2003, Nick Piggin wrote:
> 
> 
> Andrew Morton wrote:
> 
> >Peter Osterlund <petero2@telia.com> wrote:
> >
> >>I was running 2.6.0-test8 compiled with CONFIG_DEBUG_SLAB=y. When
> >>testing the CDRW packet writing driver, I got an oops in
> >>as_latter_request. (Full oops at the end of this message.) It is
> >>repeatable and happens because arq->rb_node.rb_right is uninitialized.
> >>
> >
> >deadline seems to have the same problem.
> >
> >We may as well squish this with the big hammer?
> >
> 
> Thanks for the report, Peter.
> 
> The request is a special request, so either blk_attempt_remerge should
> never be called on it, or blk_attempt_remerge (or as_latter_request) should
> check for this. It's up to Jens.
> 
> I would say to stick something like
> if (!rq_mergeable(rq))
>    return;
> 
> into blk_attempt_remerge.
> 
> I'd say we shouldn't expect drivers to try to get this right.

attempt_merge() already includes such a check. To me it looks really
buggy that elv_latter_request() cannot be called on non-fs requests; I'd
rather get that fixed like Peter suggests. elv_latter_request() should
work on all requests in the io sched queue, period.

-- 
Jens Axboe



* Re: 2.6.0-test8, DEBUG_SLAB, oops in as_latter_request()
From: Nick Piggin @ 2003-10-20  8:08 UTC
  To: Jens Axboe; +Cc: Andrew Morton, Peter Osterlund, linux-kernel



Jens Axboe wrote:

>On Mon, Oct 20 2003, Nick Piggin wrote:
>
>>
>>Andrew Morton wrote:
>>
>>
>>>Peter Osterlund <petero2@telia.com> wrote:
>>>
>>>
>>>>I was running 2.6.0-test8 compiled with CONFIG_DEBUG_SLAB=y. When
>>>>testing the CDRW packet writing driver, I got an oops in
>>>>as_latter_request. (Full oops at the end of this message.) It is
>>>>repeatable and happens because arq->rb_node.rb_right is uninitialized.
>>>>
>>>>
>>>deadline seems to have the same problem.
>>>
>>>We may as well squish this with the big hammer?
>>>
>>>
>>Thanks for the report, Peter.
>>
>>The request is a special request, so either blk_attempt_remerge should
>>never be called on it, or blk_attempt_remerge (or as_latter_request) should
>>check for this. It's up to Jens.
>>
>>I would say to stick something like
>>if (!rq_mergeable(rq))
>>   return;
>>
>>into blk_attempt_remerge.
>>
>>I'd say we shouldn't expect drivers to try to get this right.
>>
>
>attempt_merge() already includes such a check. To me it looks really
>buggy that elv_latter_request() cannot be called on non-fs requests; I'd
>rather get that fixed like Peter suggests. elv_latter_request() should
>work on all requests in the io sched queue, period.
>

I don't have a problem with that. Peter's patch it is.




* Re: 2.6.0-test8, DEBUG_SLAB, oops in as_latter_request()
From: Peter Osterlund @ 2003-10-20 19:37 UTC
  To: Nick Piggin; +Cc: Andrew Morton, axboe, linux-kernel

Nick Piggin <piggin@cyberone.com.au> writes:

> Andrew Morton wrote:
> 
> >Peter Osterlund <petero2@telia.com> wrote:
> >
> >>I was running 2.6.0-test8 compiled with CONFIG_DEBUG_SLAB=y. When
> >> testing the CDRW packet writing driver, I got an oops in
> >> as_latter_request. (Full oops at the end of this message.) It is
> >> repeatable and happens because arq->rb_node.rb_right is uninitialized.
> >>
> >
> >deadline seems to have the same problem.
> >
> >We may as well squish this with the big hammer?
> >
> 
> Thanks for the report, Peter.
> 
> The request is a special request, so either blk_attempt_remerge should
> never be called on it, or blk_attempt_remerge (or as_latter_request) should
> check for this. It's up to Jens.

I don't think it is a special request. I added this debug hack:

--- linux/drivers/block/as-iosched.c~   2003-10-19 20:33:45.000000000 +0200
+++ linux/drivers/block/as-iosched.c    2003-10-20 21:14:20.000000000 +0200
@@ -1501,9 +1501,18 @@
 static struct request *
 as_latter_request(request_queue_t *q, struct request *rq)
 {
-       struct as_rq *arq = RQ_DATA(rq);
-       struct rb_node *rbnext = rb_next(&arq->rb_node);
-       struct request *ret = NULL;
+       struct as_rq *arq;
+       struct rb_node *rbnext;
+       struct request *ret;
+
+       arq = RQ_DATA(rq);
+       if (arq->rb_node.rb_right == (void*)0x5a5a5a5a) {
+               printk("flags:%lx sector:%ld cmd:%02x %02x %02x %02x\n",
+                      rq->flags, rq->sector,
+                      rq->cmd[0], rq->cmd[1], rq->cmd[2], rq->cmd[3]);
+       }
+       rbnext = rb_next(&arq->rb_node);
+       ret = NULL;
 
        if (rbnext)
                ret = rb_entry_arq(rbnext)->request;

The result was:

        flags:50 sector:186920 cmd:28 00 00 00
        Unable to handle kernel paging request at virtual address 5a5a5a66
         printing eip:
        ...

Note that:

        0x50 == REQ_CMD | REQ_STARTED
        0x28 == GPCMD_READ_10

So this looks like a regular read request to me. I'm not sure if this
means that something else is wrong.

-- 
Peter Osterlund - petero2@telia.com
http://w1.894.telia.com/~u89404340
