public inbox for linux-kernel@vger.kernel.org
* 2.6.17-rc4-mm3 cfq oops->panic w. fs damage
@ 2006-05-28  5:12 Mike Galbraith
  2006-05-28  5:25 ` Al Viro
  0 siblings, 1 reply; 10+ messages in thread
From: Mike Galbraith @ 2006-05-28  5:12 UTC (permalink / raw)
  To: lkml; +Cc: Jens Axboe

Greetings,

I tried to boot 2.6.17-rc4-mm3 twice yesterday, and received the below
both times.  Both times, the oops->panic occurred while X/KDE was
starting.  KDE would not run thereafter, and had to be reinstalled.

Box is P4/HT/ICH5.

	-Mike

BUG: unable to handle kernel NULL pointer dereference at virtual address 00000054
 printing eip:
 b11c28f0
 *pde = 37e93067
 Oops: 0000 [#1]
 PREEMPT SMP
 last sysfs file: /devices/pci0000:00/0000:00:1f.3/class
 Modules linked in: snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device edd tda9887 ip6t_REJECT xt_tcpudp ipt_REJECT xt_state iptable_mangle iptable_nat ip_nat iptable_filter saa7134 ip6table_mangle snd_intel8x0 snd_ac97_codec snd_ac97_bus ir_kbd_i2c snd_pcm snd_timer snd ip_conntrack bt878 prism54 soundcore nfnetlink ohci1394 ieee1394 i2c_i801 ip_tables snd_page_alloc ip6table_filter ip6_tables x_tables tuner bttv video_buf firmware_class ir_common btcx_risc tveeprom nls_iso8859_1 nls_cp437 nls_utf8 sd_mod
 CPU:    0
 EIP:    0060:[<b11c28f0>]    Not tainted VLI
 EFLAGS: 00213046   (2.6.17-rc4-mm3-smp #157)
 EIP is at cfq_dispatch_requests+0xef/0x540
 eax: 00000000   ebx: ef547c30   ecx: fffc593e   edx: 00000000
 esi: ecbb4f98   edi: 00000001   ebp: b153def0   esp: b153deb4
 ds: 007b   es: 007b   ss: 0068
 Process X (pid: 6992, threadinfo=b153d000 task=eb757000)
 Stack: 00203046 ef551f80 e44c1ee4 ee678f00 ef547c00 00000004 00000000 ef547c44
        ef547c44 ef547c34 ef51f81c ef51f7f4 ef51f7e4 b1598400 ef5e5d80 b153df14
        b11b744e 00000001 ef51f7e4 b153df14 b11b92a4 b1598494 b1598400 ef5e5d80
Call Trace:
 <b1003cf3> show_stack_log_lvl+0x9e/0xc3  <b1003f00> show_registers+0x1ac/0x237
 <b10040bd> die+0x132/0x2fb  <b1019df3> do_page_fault+0x4f3/0x577
 <b1003827> error_code+0x4f/0x54  <b11b744e> elv_next_request+0x1b/0x12f
 <b12764a3> ide_do_request+0x1b7/0x841  <b1276e45> ide_intr+0x1dc/0x1e1
 <b104a4a1> handle_IRQ_event+0x35/0x65  <b104a55f> __do_IRQ+0x8e/0xff
 <b100562a> do_IRQ+0x3e/0x57
 =======================
 <b10036ce> common_interrupt+0x1a/0x20
Code: 00 00 75 32 8d 43 34 3b 43 34 74 2a 8b 43 34 8b 70 3c 8b 0d 00 34 4e b1 83 7b 10 01 19 c0 83 e0 fc 8b 84 10 00 01 00 00 8b 56 14 <03> 42 54 39 c8 0f 88 98 01 00 00 8b 73 20 89 f2 8b 4d d4 8b 01
EIP: [<b11c28f0>] cfq_dispatch_requests+0xef/0x540 SS:ESP 0068:b153deb4
Kernel panic - not syncing: Fatal exception in interrupt
BUG: warning at arch/i386/kernel/smp.c:537/smp_call_function()
 <b1003d52> show_trace+0xd/0xf  <b1004440> dump_stack+0x17/0x19
 <b10129d2> smp_call_function+0x124/0x129  <b10129f5> smp_send_stop+0x1e/0x27
 <b1022a2b> panic+0x60/0x1c5  <b1004277> die+0x2ec/0x2fb
 <b1019df3> do_page_fault+0x4f3/0x577  <b1003827> error_code+0x4f/0x54
 <b11b744e> elv_next_request+0x1b/0x12f  <b12764a3> ide_do_request+0x1b7/0x841
 <b1276e45> ide_intr+0x1dc/0x1e1  <b104a4a1> handle_IRQ_event+0x35/0x65
 <b104a55f> __do_IRQ+0x8e/0xff  <b100562a> do_IRQ+0x3e/0x57
 =======================
 <b10036ce> common_interrupt+0x1a/0x20
BUG: warning at kernel/panic.c:138/panic()
 <b1003d52> show_trace+0xd/0xf  <b1004440> dump_stack+0x17/0x19
 <b1022b5d> panic+0x192/0x1c5  <b1004277> die+0x2ec/0x2fb
 <b1019df3> do_page_fault+0x4f3/0x577  <b1003827> error_code+0x4f/0x54
 <b11b744e> elv_next_request+0x1b/0x12f  <b12764a3> ide_do_request+0x1b7/0x841
 <b1276e45> ide_intr+0x1dc/0x1e1  <b104a4a1> handle_IRQ_event+0x35/0x65
 <b104a55f> __do_IRQ+0x8e/0xff  <b100562a> do_IRQ+0x3e/0x57
 =======================
 <b10036ce> common_interrupt+0x1a/0x20

(gdb) list *cfq_dispatch_requests+0xef
0xb11c28f0 is in cfq_dispatch_requests (cfq-iosched.c:969).
964             if (!list_empty(&cfqq->fifo)) {
965                     int fifo = cfq_cfqq_class_sync(cfqq);
966
967                     crq = RQ_DATA(list_entry_fifo(cfqq->fifo.next));
968                     rq = crq->request;
969                     if (time_after(jiffies, rq->start_time + cfqd->cfq_fifo_expire[fifo])) {
970                             cfq_mark_cfqq_fifo_expire(cfqq);
971                             return crq;
972                     }
973             }
(gdb)
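
A note on the check at line 969: time_after() is the kernel's wraparound-safe comparison for jiffies. Below is a minimal sketch of the idea, assuming a 32-bit tick counter as on i386. The names time_after32_sketch and fifo_expired_sketch are made up for this illustration and are not kernel API.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Wraparound-safe "a is later than b" for a 32-bit tick counter.
 * Modeled on the idea behind the kernel's time_after(); the name and
 * the fixed 32-bit width are assumptions for this sketch.
 * The unsigned subtraction wraps, and the signed cast turns "b is
 * behind a by less than half the counter range" into a negative value.
 * (The cast relies on two's-complement conversion, as on common compilers.)
 */
static inline int time_after32_sketch(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/*
 * Same shape as the test on cfq-iosched.c:969:
 *     time_after(jiffies, rq->start_time + cfqd->cfq_fifo_expire[fifo])
 * but with plain integers instead of the kernel structures.
 */
static inline int fifo_expired_sketch(uint32_t now, uint32_t start_time,
                                      uint32_t expire)
{
    return time_after32_sketch(now, start_time + expire);
}

/* Small self-check covering the ordinary and the wrapped case. */
static inline void time_after_selfcheck(void)
{
    assert(time_after32_sketch(5u, 3u));
    assert(!time_after32_sketch(3u, 5u));
    /* 0x10 comes after 0xfffffff0 once the counter has wrapped. */
    assert(time_after32_sketch(0x10u, 0xfffffff0u));
}
```

The comparison itself only works on integers. The oops is raised earlier, while the start_time operand is being loaded from memory.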

0xb11c28f0 <cfq_dispatch_requests+239>: add    0x54(%edx),%eax
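
The faulting instruction reads memory at displacement 0x54 from %edx, and the register dump shows edx: 00000000, which matches the reported fault address 00000054. Given line 969, one plausible reading is that rq (crq->request) was NULL and 0x54 is the offset of start_time. The sketch below only illustrates that arithmetic. struct fake_request is a hypothetical stand-in, padded so that a field lands at offset 0x54; the real struct request layout is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in, NOT the kernel's struct request.
 * The padding exists only to place a 32-bit field at offset 0x54,
 * the displacement seen in "add 0x54(%edx),%eax".
 */
struct fake_request {
    unsigned char pad[0x54];
    uint32_t start_time;
};

/* Effective address of a base-plus-displacement operand such as 0x54(%edx). */
static inline uint32_t effective_address_sketch(uint32_t base, uint32_t disp)
{
    return base + disp;
}

/* Address the CPU would touch when reading start_time through a NULL base. */
static inline uint32_t null_field_address_sketch(void)
{
    return effective_address_sketch(
        0u, (uint32_t)offsetof(struct fake_request, start_time));
}

/* Self-check: a NULL base plus the field offset gives the oops address. */
static inline void fault_address_selfcheck(void)
{
    assert(offsetof(struct fake_request, start_time) == 0x54);
    assert(null_field_address_sketch() == 0x54u);
}
```

With base 0 the result is 0x54, the same value as the "virtual address 00000054" in the BUG line.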





* Re: 2.6.17-rc4-mm3 cfq oops->panic w. fs damage
  2006-05-28  5:12 2.6.17-rc4-mm3 cfq oops->panic w. fs damage Mike Galbraith
@ 2006-05-28  5:25 ` Al Viro
  2006-05-28  6:00   ` Mike Galbraith
  0 siblings, 1 reply; 10+ messages in thread
From: Al Viro @ 2006-05-28  5:25 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: lkml, Jens Axboe

On Sun, May 28, 2006 at 07:12:03AM +0200, Mike Galbraith wrote:
> Greetings,
> 
> I tried to boot 2.6.17-rc4-mm3 twice yesterday, and received the below
> both times.  Both times, the oops->panic occurred while X/KDE was
> starting.  KDE would not run thereafter, and had to be reinstalled.

Can you reproduce that with mainline?


* Re: 2.6.17-rc4-mm3 cfq oops->panic w. fs damage
  2006-05-28  5:25 ` Al Viro
@ 2006-05-28  6:00   ` Mike Galbraith
  2006-05-28  7:48     ` Mike Galbraith
  0 siblings, 1 reply; 10+ messages in thread
From: Mike Galbraith @ 2006-05-28  6:00 UTC (permalink / raw)
  To: Al Viro; +Cc: lkml, Jens Axboe

On Sun, 2006-05-28 at 06:25 +0100, Al Viro wrote:
> On Sun, May 28, 2006 at 07:12:03AM +0200, Mike Galbraith wrote:
> > Greetings,
> > 
> > I tried to boot 2.6.17-rc4-mm3 twice yesterday, and received the below
> > both times.  Both times, the oops->panic occurred while X/KDE was
> > starting.  KDE would not run thereafter, and had to be reinstalled.
> 
> Can you reproduce that with mainline?

Virgin rc4 has been working fine, but I've been using UP kernels.  I'll
try the same config as SMP.

I'm still picking up the pieces ATM, because (expletive) YAST and I had
a minor disagreement wrt what all wanted restoration (yup, I lost ;).

	-Mike



* Re: 2.6.17-rc4-mm3 cfq oops->panic w. fs damage
  2006-05-28  6:00   ` Mike Galbraith
@ 2006-05-28  7:48     ` Mike Galbraith
  2006-05-28  8:03       ` Mike Galbraith
  0 siblings, 1 reply; 10+ messages in thread
From: Mike Galbraith @ 2006-05-28  7:48 UTC (permalink / raw)
  To: Al Viro; +Cc: lkml, Jens Axboe

On Sun, 2006-05-28 at 08:00 +0200, Mike Galbraith wrote:
> On Sun, 2006-05-28 at 06:25 +0100, Al Viro wrote:
> > On Sun, May 28, 2006 at 07:12:03AM +0200, Mike Galbraith wrote:
> > > Greetings,
> > > 
> > > I tried to boot 2.6.17-rc4-mm3 twice yesterday, and received the below
> > > both times.  Both times, the oops->panic occurred while X/KDE was
> > > starting.  KDE would not run thereafter, and had to be reinstalled.
> > 
> > Can you reproduce that with mainline?
> 
> Virgin rc4 has been working fine, but I've been using UP kernels.  I'll
> try the same config as SMP.

She's running fine.  Guess I'll go prod mm3 again.

	-Mike



* Re: 2.6.17-rc4-mm3 cfq oops->panic w. fs damage
  2006-05-28  7:48     ` Mike Galbraith
@ 2006-05-28  8:03       ` Mike Galbraith
  2006-05-28  8:24         ` Mike Galbraith
  0 siblings, 1 reply; 10+ messages in thread
From: Mike Galbraith @ 2006-05-28  8:03 UTC (permalink / raw)
  To: Al Viro; +Cc: lkml, Jens Axboe

On Sun, 2006-05-28 at 09:48 +0200, Mike Galbraith wrote:
> On Sun, 2006-05-28 at 08:00 +0200, Mike Galbraith wrote:
> > On Sun, 2006-05-28 at 06:25 +0100, Al Viro wrote:
> > > On Sun, May 28, 2006 at 07:12:03AM +0200, Mike Galbraith wrote:
> > > > Greetings,
> > > > 
> > > > I tried to boot 2.6.17-rc4-mm3 twice yesterday, and received the below
> > > > both times.  Both times, the oops->panic occurred while X/KDE was
> > > > starting.  KDE would not run thereafter, and had to be reinstalled.
> > > 
> > > Can you reproduce that with mainline?
> > 
> > Virgin rc4 has been working fine, but I've been using UP kernels.  I'll
> > try the same config as SMP.
> 
> She's running fine.  Guess I'll go prod mm3 again.

Yup, mm3 makes reliable kaboom.

I suppose the first thing to do is see if it's cfq, and then maybe toss
a dart at the patch list.

	-Mike



* Re: 2.6.17-rc4-mm3 cfq oops->panic w. fs damage
  2006-05-28  8:03       ` Mike Galbraith
@ 2006-05-28  8:24         ` Mike Galbraith
  2006-05-29 11:00           ` Mike Galbraith
  0 siblings, 1 reply; 10+ messages in thread
From: Mike Galbraith @ 2006-05-28  8:24 UTC (permalink / raw)
  To: Al Viro; +Cc: lkml, Jens Axboe

On Sun, 2006-05-28 at 10:03 +0200, Mike Galbraith wrote:

> Yup, mm3 makes reliable kaboom.
> 
> I suppose the first thing to do is see if it's cfq, and then maybe toss
> a dart at the patch list.

That was too easy.  It's git-cfq.patch.

	-Mike



* Re: 2.6.17-rc4-mm3 cfq oops->panic w. fs damage
  2006-05-28  8:24         ` Mike Galbraith
@ 2006-05-29 11:00           ` Mike Galbraith
  2006-05-30 12:36             ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Mike Galbraith @ 2006-05-29 11:00 UTC (permalink / raw)
  To: Al Viro; +Cc: lkml, Jens Axboe

On Sun, 2006-05-28 at 10:24 +0200, Mike Galbraith wrote: 
> On Sun, 2006-05-28 at 10:03 +0200, Mike Galbraith wrote:
> 
> > Yup, mm3 makes reliable kaboom.
> > 
> > I suppose the first thing to do is see if it's cfq, and then maybe toss
> > a dart at the patch list.
> 
> That was too easy.  It's git-cfq.patch.

Too easy indeed.

After staring at these changes, and not having anything poke me in the
eye that looked like it might cause list corruption, I decided to try
them in a different kernel.  I put them into 2.6.16-rt25, and there they
work peachy.  A diff of 2.6.16-rt25+git-cfq.patch->2.6.17-rc4-mm3 shows
what I was expecting (locking changes), but it's embedded in ~1000 lines
of diff, and doesn't look particularly trivial.

Hi Jens <punt> :)

	-Mike



* Re: 2.6.17-rc4-mm3 cfq oops->panic w. fs damage
  2006-05-29 11:00           ` Mike Galbraith
@ 2006-05-30 12:36             ` Jens Axboe
  2006-05-30 13:27               ` Mike Galbraith
  0 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2006-05-30 12:36 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: Al Viro, lkml

On Mon, May 29 2006, Mike Galbraith wrote:
> On Sun, 2006-05-28 at 10:24 +0200, Mike Galbraith wrote: 
> > On Sun, 2006-05-28 at 10:03 +0200, Mike Galbraith wrote:
> > 
> > > Yup, mm3 makes reliable kaboom.
> > > 
> > > I suppose the first thing to do is see if it's cfq, and then maybe toss
> > > a dart at the patch list.
> > 
> > That was too easy.  It's git-cfq.patch.
> 
> Too easy indeed.
> 
> After staring at these changes, and not having anything poke me in the
> eye that looked like it might cause list corruption, I decided to try
> them in a different kernel.  I put them into 2.6.16-rt25, and there they
> work peachy.  A diff of 2.6.16-rt25+git-cfq.patch->2.6.17-rc4-mm3 shows
> what I was expecting (locking changes), but it's embedded in ~1000 lines
> of diff, and doesn't look particularly trivial.
> 
> Hi Jens <punt> :)

I'm suspecting a recent -mm change, since git-cfq hasn't changed in
quite a while and it used to work just fine. Can you pass me the diff
you generated?

-- 
Jens Axboe



* Re: 2.6.17-rc4-mm3 cfq oops->panic w. fs damage
  2006-05-30 12:36             ` Jens Axboe
@ 2006-05-30 13:27               ` Mike Galbraith
  2006-05-30 13:30                 ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Mike Galbraith @ 2006-05-30 13:27 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Al Viro, lkml

On Tue, 2006-05-30 at 14:36 +0200, Jens Axboe wrote:

> I'm suspecting a recent -mm change, since git-cfq hasn't changed in
> quite a while and it used to work just fine.

It's apparently not mm.  I just plugged it into 2.6.17-rc4, and get the
same explosion.  It doesn't seem to play well with the changes in
2.6.17-rc1.

	-Mike



* Re: 2.6.17-rc4-mm3 cfq oops->panic w. fs damage
  2006-05-30 13:27               ` Mike Galbraith
@ 2006-05-30 13:30                 ` Jens Axboe
  0 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2006-05-30 13:30 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: Al Viro, lkml

On Tue, May 30 2006, Mike Galbraith wrote:
> On Tue, 2006-05-30 at 14:36 +0200, Jens Axboe wrote:
> 
> > I'm suspecting a recent -mm change, since git-cfq hasn't changed in
> > quite a while and it used to work just fine.
> 
> It's apparently not mm.  I just plugged it into 2.6.17-rc4, and get the
> same explosion.  It doesn't seem to play well with the changes in
> 2.6.17-rc1.

Ah, ok that makes sense. I'll take a closer look at it, thanks for
reporting!

-- 
Jens Axboe


