public inbox for linux-kernel@vger.kernel.org
From: Andrew Morton <akpm@zip.com.au>
To: Manfred Spraul <manfred@colorfullife.com>
Cc: linux-kernel@vger.kernel.org
Subject: Re: [problem captured] Re: cerberus on 2.4.17-rc2 UP
Date: Wed, 09 Jan 2002 01:04:01 -0800
Message-ID: <3C3C077E.20A09A43@zip.com.au>
In-Reply-To: <3C3B6F65.F9226437@colorfullife.com>

Manfred Spraul wrote:
> 
> > Yes, I can generate it at will on two quite different IDE machines
> > with the run-bash-shared-mapping script from
> > http://www.zip.com.au/~akpm/ext3-tools.tar.gz
> 
> Could you apply the attached patch and try to reproduce it?

Nice patch.

> Enable CONFIG_DEBUG_SLAB.
> 
> The patch poisons all objects I could find that might have something
> to do with the bug. (all slab caches, struct request, struct page,
> struct filp, partially struct buffer_head).
> 
> My test box survives the run-bash-shared-mapping script (~30 min, 128
> MB memory).
> 

Mine survives only a few minutes.  Once it only lasted a second.
That's with mem=64m.  It lasts much, much longer with more memory.

The patch, alas, sheds no light.  I'll delve into it fairly soon,
I expect.

EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 212k freed
end_request: buffer-list destroyed
hda6: bad access: block=86256, count=-8
end_request: I/O error, dev 03:06 (hda), sector 86256
hda: timeout waiting for DMA
ide_dmaproc: chipset supported ide_dma_timeout func only: 14
hda: status error: status=0x58 { DriveReady SeekComplete DataRequest }
hda: drive not ready for command
hda: lost interrupt

and:

end_request: buffer-list destroyed
hda6: bad access: block=93608, count=-8
end_request: I/O error, dev 03:06 (hda), sector 93608
hda6: bad access: block=93616, count=-16
end_request: I/O error, dev 03:06 (hda), sector 93616
hda6: bad access: block=93624, count=-24
end_request: I/O error, dev 03:06 (hda), sector 93624
hda6: bad access: block=93632, count=-32
end_request: I/O error, dev 03:06 (hda), sector 93632
hda6: bad access: block=93640, count=-40
end_request: I/O error, dev 03:06 (hda), sector 93640
hda6: bad access: block=93648, count=-48
end_request: I/O error, dev 03:06 (hda), sector 93648
hda6: bad access: block=93656, count=-56
end_request: I/O error, dev 03:06 (hda), sector 93656
hda6: bad access: block=93664, count=-64
end_request: I/O error, dev 03:06 (hda), sector 93664
hda6: bad access: block=93672, count=-72
end_request: I/O error, dev 03:06 (hda), sector 93672
hda6: bad access: block=93680, count=-80
end_request: I/O error, dev 03:06 (hda), sector 93680
hda6: bad access: block=93688, count=-88
end_request: I/O error, dev 03:06 (hda), sector 93688
hda6: bad access: block=93696, count=-96
end_request: I/O error, dev 03:06 (hda), sector 93696
hda6: bad access: block=93704, count=-104
end_request: I/O error, dev 03:06 (hda), sector 93704
hda: timeout waiting for DMA
ide_dmaproc: chipset supported ide_dma_timeout func only: 14
hda: status error: status=0x58 { DriveReady SeekComplete DataRequest }
hda: drive not ready for command
hda: lost interrupt
hda: lost interrupt
hda: lost interrupt
hda: lost interrupt
hda: lost interrupt
hda: lost interrupt

Hmm.  hda6 is the root filesystem.  The test was hitting hda8
and hda5 (swap).  The only activity on hda6 would be a bit of
pagein and maybe syslog.

Always hda6:

end_request: buffer-list destroyed
hda6: bad access: block=90704, count=-8
end_request: I/O error, dev 03:06 (hda), sector 90704
hda6: bad access: block=90712, count=-16
end_request: I/O error, dev 03:06 (hda), sector 90712
hda6: bad access: block=90720, count=-24
end_request: I/O error, dev 03:06 (hda), sector 90720
hda6: bad access: block=90728, count=-32
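
The regularity in these "bad access" lines is easier to see once the
numbers are pulled out.  What follows is a minimal sketch, not part of
the original report: it parses the excerpt above and prints the change
in block and count between consecutive entries.  The helper names
(parse_bad_access, steps) are made up for illustration, and the sketch
only describes the pattern; it says nothing about the cause.

```python
import re

# Matches lines such as "hda6: bad access: block=90704, count=-8".
BAD_ACCESS = re.compile(r"bad access: block=(\d+), count=(-?\d+)")


def parse_bad_access(log_text):
    """Return a list of (block, count) pairs found in the log text."""
    return [(int(block), int(count))
            for block, count in BAD_ACCESS.findall(log_text)]


def steps(pairs):
    """Return (block delta, count delta) between consecutive entries."""
    return [(b2 - b1, c2 - c1)
            for (b1, c1), (b2, c2) in zip(pairs, pairs[1:])]


# Log excerpt copied from the message above.
sample = """\
end_request: buffer-list destroyed
hda6: bad access: block=90704, count=-8
end_request: I/O error, dev 03:06 (hda), sector 90704
hda6: bad access: block=90712, count=-16
end_request: I/O error, dev 03:06 (hda), sector 90712
hda6: bad access: block=90720, count=-24
end_request: I/O error, dev 03:06 (hda), sector 90720
hda6: bad access: block=90728, count=-32
"""

pairs = parse_bad_access(sample)
print(steps(pairs))  # [(8, -8), (8, -8), (8, -8)]
```

Every step moves the block forward by 8 while the count drops by a
further 8, which matches the longer excerpt earlier in the message.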

Interestingly, 2.4.13-ac8 doesn't fail this way.  Well, it eventually
oopses in do_IRQ()'s get_current(), with %cr2 = 0x4017a000.

That kernel has the new IDE drivers, but I've seen the problem with
Andre's latest patches on PIIX, on VIA, and there are reports of it
on SCSI.  And "buffer-list destroyed" is always the first message.
It doesn't feel like a driver problem.  I'll go do a binary search
through some kernel revs.
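
A binary search over kernel revisions can be sketched as below.  This
is an illustration only: the revision names ("rev-0" ... "rev-7") and
the is_bad() predicate are hypothetical stand-ins, not real test
results.  In practice is_bad() would mean building and booting that
kernel and running the reproducer.  The sketch assumes the oldest
revision in the list is good, the newest is bad, and the bug stays
present once it has been introduced.

```python
def first_bad(revisions, is_bad):
    """Return the earliest revision for which is_bad() is true.

    Assumes revisions[0] is good, revisions[-1] is bad, and that every
    revision after the first bad one is also bad.
    """
    lo, hi = 0, len(revisions) - 1  # invariant: lo is good, hi is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid
        else:
            lo = mid
    return revisions[hi]


# Hypothetical revision list; "rev-5" is arbitrarily chosen as the
# first bad one so the search has something to find.
revs = [f"rev-{i}" for i in range(8)]
tested = []


def is_bad(rev):
    """Stand-in for 'boot this kernel and run the reproducer'."""
    tested.append(rev)
    return int(rev.split("-")[1]) >= 5


culprit = first_bad(revs, is_bad)
print(culprit, "after", len(tested), "test runs")  # rev-5 after 3 test runs
```

With eight candidates the search needs three test runs rather than up
to six for a linear walk, which matters when one run can take many
minutes before it fails.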
