From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id ; Fri, 31 May 2002 20:47:28 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org id ; Fri, 31 May 2002 20:47:27 -0400
Received: from [195.63.194.11] ([195.63.194.11]:28940 "EHLO mail.stock-world.de") by vger.kernel.org with ESMTP id ; Fri, 31 May 2002 20:47:26 -0400
Message-ID: <3CF80B34.9080508@evision-ventures.com>
Date: Sat, 01 Jun 2002 01:45:56 +0200
From: Martin Dalecki
User-Agent: Mozilla/5.0 (X11; U; Linux i686; pl-PL; rv:1.0rc3) Gecko/20020523
X-Accept-Language: en-us, pl
MIME-Version: 1.0
To: David Brownell
CC: linux-kernel@vger.kernel.org
Subject: Re: 2.5.19 (and earlier) IDE (+EXT3+???) bugs
In-Reply-To: <3CF7DCEF.9050203@pacbell.net>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

David Brownell wrote:
> I'm trying to use a slightly elderly laptop for some testing
> on the 2.5 kernels. (As an ultralight, it's got some hardware
> that tweaks some interesting USB/APM codepaths that don't
> otherwise show up.) It's run Linux (mostly) since I got it,
> and I can install and use RH 7.3 on it, no troubles.
>
> But it doesn't seem to want to run any recent 2.5 kernels.
> I first tried with 2.5.15, and kernels up to and including
> the latest (2.5.19 as I write) have the same overall failure
> mode ... which does not happen with any of the 2.4 kernels
> I've tried. (No recent ones other than the RH 7.3 code,
> but many earlier ones.) Basically, I see:
>
> - kernel loads OK ... I attach "dmesg" output.
> - runs init, which runs init scripts.
> - everything's fine, disk fscks as right, UNTIL ...
> - ...it blows up when remounting the root filesystem r/w
>    * Takes a *long* time, if it even succeeds
>    * Most of that time is evidently used to scribble over
>      as much of the disk as it can!
>    * If I powerdown the system very quickly, "fsck" can mostly
>      recover. If not, then both root and /boot get trashed.
> - Next step is to re-install the OS again.
>
> As a stock RH 7.3 install, this root filesystem uses ext3.
>
> I was able to boot with "init=/bin/sh" and do some basic
> testing with a read-only root FS. Reading files works ok,
> "hdparm -I" gives the same info it did under the RH7.3 kernel,
> and I can use DD to read and write to the disk. (USB works OK;
> I can bring it up by hand using the "ohci-hcd" driver, which
> is how I could transfer the dmesg info off this system.)
>
> So far the only really suggestive thing I've come up with is
> that if I do much disk I/O, I start to see "hda: lost interrupt"
> and the operation seems to become timeout-driven. I first
> noticed that with DD, but then "fsck" of the root FS (5+MBytes)
> turned up the same failure. (The fsck took so much time I had
> to kill it; running on 2.4, it quickly reported no problems.)
>
> Does anyone know what might be going on here? Or better yet,
> have a fix to whatever it is that's wrong? :) Seems to me there
> is a clear IDE problem: lost interrupts were not an issue on
> the 2.4 kernels. Whether fixing that would make that "scribble
> on the disk" problem go away, I couldn't say.
>
> - Dave
>
> p.s. Hardware is a Toshiba Portege 3020ct, pci host bridge
> is a "Toshiba America Info Systems 601 (rev a2)"
> according to lspci.
>
>
> Linux version 2.5.19 (root@neon) (gcc version 2.96 20000731 (Red Hat
> Linux 7.3 2.96-110)) #1 Thu May 30 19:40:52 PDT 2002
> Video mode to be used for restore is f03
> BIOS-provided physical RAM map:
> BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
> BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
> BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
> BIOS-e820: 0000000000100000 - 0000000004010000 (usable)
> BIOS-e820: 0000000004010000 - 0000000004020000 (ACPI data)
> BIOS-e820: 0000000004020000 - 0000000004040000 (reserved)
> BIOS-e820: 00000000fef80000 - 00000000ff000000 (reserved)
> BIOS-e820: 00000000ffee0000 - 00000000ffee6e00 (reserved)
> BIOS-e820: 00000000ffee6e00 - 00000000ffee7000 (ACPI NVS)
> BIOS-e820: 00000000ffee7000 - 00000000ffef0000 (reserved)
> BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved)
> 64MB LOWMEM available.
> On node 0 totalpages: 16400
> zone(0): 4096 pages.
> zone(1): 12304 pages.
> zone(2): 0 pages.
> Kernel command line: init=/bin/sh ro root=/dev/hda2 vga=0x0f03
> Initializing CPU#0
> Detected 299.947 MHz processor.
> Console: colour VGA+ 80x28
> Calibrating delay loop... 598.01 BogoMIPS
> Memory: 62912k/65600k available (1004k kernel code, 2300k reserved, 244k
> data, 216k init, 0k highmem)
> Dentry-cache hash table entries: 16384 (order: 5, 131072 bytes)
> Inode-cache hash table entries: 8192 (order: 4, 65536 bytes)
> Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
> CPU: Before vendor init, caps: 008001bf 00000000 00000000, vendor = 0
> Intel Pentium with F0 0F bug - workaround enabled.
> CPU: After vendor init, caps: 008001bf 00000000 00000000 00000000
> CPU: After generic, caps: 008001bf 00000000 00000000 00000000
> CPU: Common caps: 008001bf 00000000 00000000 00000000
> CPU: Intel Mobile Pentium MMX stepping 02
> Checking 'hlt' instruction... OK.
> POSIX conformance testing by UNIFIX
> Linux NET4.0 for Linux 2.4
> Based upon Swansea University Computer Society NET3.039
> Initializing RT netlink socket
> PCI: PCI BIOS revision 2.10 entry at 0xfd84f, last bus=21
> PCI: Using configuration type 1
> isapnp: Scanning for PnP cards...
> isapnp: No Plug & Play device found
> PnPBIOS: Found PnP BIOS installation structure at 0xc00f9020
> PnPBIOS: PnP BIOS version 1.0, entry 0xf0000:0x9563, dseg 0x0
> PnPBIOS: 17 nodes reported by PnP BIOS; 17 recorded by driver
> PnPBIOS: PNP0c02: ioport range 0x1882-0x1885 has been reserved
> PCI: Probing PCI hardware
> PCI: Probing PCI hardware (bus 00)
> apm: BIOS version 1.2 Flags 0x02 (Driver version 1.16)
> Starting kswapd
> BIO: pool of 256 setup, 14Kb (56 bytes/bio)
> biovec: init pool 0, 1 entries, 12 bytes
> biovec: init pool 1, 4 entries, 48 bytes
> biovec: init pool 2, 16 entries, 192 bytes
> biovec: init pool 3, 64 entries, 768 bytes
> biovec: init pool 4, 128 entries, 1536 bytes
> biovec: init pool 5, 256 entries, 3072 bytes
> Journalled Block Device driver loaded
> pty: 512 Unix98 ptys configured
> Real Time Clock Driver v1.11
> block: 256 slots per queue, batch=32
> Floppy drive(s): fd0 is 1.44M
> FDC 0 is an 8272A
> ATA/ATAPI device driver v7.0.0
> ATA: PCI bus speed 33.3MHz
> hda: TOSHIBA MK6411MAT, ATA DISK drive
> ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
> hda: 12685680 sectors, CHS=13424/15/63
> hda: [PTBL] [789/255/63] hda1 hda2 hda3
> mice: PS/2 mouse device common for all mice
> NET4: Linux TCP/IP 1.0 for NET4.0
> IP Protocols: ICMP, UDP, TCP
> IP: routing cache hash table of 512 buckets, 4Kbytes
> TCP: Hash tables configured (established 4096 bind 4096)
> NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
> kjournald starting. Commit interval 5 seconds
> EXT3-fs: mounted filesystem with ordered data mode.
> VFS: Mounted root (ext3 filesystem) readonly.
> Freeing unused kernel memory: 216k freed

Those are indeed most probably symptoms of the multi mode write
problems that I have and know about :-(. The drives are apparently
coming up in PIO mode. Well, slowly it's time to do something about it.