* PROBLEM: sym53c8xx is broken on HP LH 4 after Linux 2.2
@ 2003-11-08 1:49 Russell Kroll
2003-11-09 16:37 ` Matthew Wilcox
0 siblings, 1 reply; 4+ messages in thread
From: Russell Kroll @ 2003-11-08 1:49 UTC
To: linux-scsi
On the HP NetServer LH4, the sym53c8xx driver only works in 2.2 kernels,
and fails with a "CACHE INCORRECTLY CONFIGURED" error on 2.4 and 2.6.
This also applies to the sym53c8xx_2 driver.

I searched through the 2.4 and 2.3 series, and finally found that it
broke after the pre4-1 patch from the 2.3.99 era. I split that patch up
into little parts and found the one piece that exposed the problem - it's
a small pair of changes to pci-i386.c. Backing that patch out of 2.4.22
with patch -R makes the sym driver work again.

I've captured lspci dumps before and after the change, along with the
output from the driver on a stock kernel vs. a kernel that has this patch
backed out. There's a lot of it, so I've put it out on my web server
at this URL:

http://www.exploits.org/~rkroll/sym/

I am not suggesting that the patch be reverted, since it's obviously a bug
fix. My guess is that the fix exposed a problem in the driver.

Some background on the hardware: the LH4 seems to have two 53c895
interfaces on board. In 2.2, you can't see the primary interface in
lspci, perhaps due to the bug that was fixed by the pre4-1 patch. This is
generally not a problem, since the NetRAID/MegaRAID typically covers that
interface.

On 2.4, both interfaces are visible, and the driver can't seem to latch
onto the second one. I don't care if the first one doesn't work, since
I don't intend to talk to the disks that way. All I need is the second
one, since that's where my tape drive is connected.

This system happens to be a dual Pentium III, but I've whittled the kernel
builds down to generic UP/Pentium settings without any positive effect. I
even threw out just about everything except what I need to boot at one
point. I've tried using "sym53c8xx=excl:0x1400", but it doesn't help.
It's not something in userspace which causes this, since it still happens
when I boot with 'init=/bin/sh'.

I intend to run 2.4.22 on this system, but tried 2.6.0-test9 in case the
problem had been discovered and fixed in that tree. Unfortunately, it
also failed. I haven't tried mangling 2.6's PCI code to duplicate the
effects of backing out that patch yet.

This box is not in production, so I can run tests and try different
things. My own attempts at making the driver work were not successful.

I'm stuck. Any advice or pointers would be appreciated.
* Re: PROBLEM: sym53c8xx is broken on HP LH 4 after Linux 2.2
From: Matthew Wilcox @ 2003-11-09 16:37 UTC
To: Russell Kroll; +Cc: linux-scsi
On Fri, Nov 07, 2003 at 07:49:37PM -0600, Russell Kroll wrote:
> On the HP NetServer LH4, the sym53c8xx driver only works in 2.2 kernels,
> and fails with a "CACHE INCORRECTLY CONFIGURED" error on 2.4 and 2.6.
> This also applies to the sym53c8xx_2 driver.
Hi Russell. I'm really interested in fixing this for the sym2 driver in 2.6.
Given this bit of the lspci log:
01:07.0 SCSI storage controller: LSI Logic / Symbios Logic 53c895 (rev 01)
        Subsystem: Hewlett-Packard Company: Unknown device 1000
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 247 (7500ns min, 16000ns max), cache line size 08
        Interrupt: pin A routed to IRQ 11
        Region 0: I/O ports at 1400 [size=256]
        Region 1: Memory at 40000000 (32-bit, non-prefetchable) [size=256]
        Region 2: Memory at 40001000 (32-bit, non-prefetchable) [size=4K]

01:07.0 SCSI storage controller: LSI Logic / Symbios Logic 53c895 (rev 01)
        Subsystem: Hewlett-Packard Company: Unknown device 1000
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 247 (7500ns min, 16000ns max), cache line size 08
        Interrupt: pin A routed to IRQ 11
        Region 0: I/O ports at a000 [size=256]
        Region 1: Memory at e8101000 (32-bit, non-prefetchable) [size=256]
        Region 2: Memory at e8100000 (32-bit, non-prefetchable) [size=4K]

I would say that this is a PCI problem, not a sym2 problem. I'm only
trying to pass the buck to another of my personalities though, since I also
hack PCI stuff. I really don't think we should have two devices with the
same domain:bus:slot.function since there's no way to do pci config
cycles to one or the other.
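
To illustrate why two entries at the same bus:slot.function cannot be addressed separately: under PCI configuration mechanism #1, the value written to the address port is built from the bus, slot, function and register numbers and nothing else. The sketch below assumes that mechanism is in use; the helper name `conf1_address` is made up for this example and is not a kernel function.

```c
#include <stdint.h>

/* Hypothetical helper (not from the kernel source): build the 32-bit
 * value that PCI configuration mechanism #1 writes to port 0xCF8.
 * bit 31 = enable, bits 23-16 = bus, bits 15-11 = slot,
 * bits 10-8 = function, bits 7-2 = dword-aligned register offset. */
static uint32_t conf1_address(uint8_t bus, uint8_t slot, uint8_t func,
                              uint8_t reg)
{
    return 0x80000000u
         | ((uint32_t)bus << 16)
         | ((uint32_t)(slot & 0x1f) << 11)
         | ((uint32_t)(func & 0x07) << 8)
         | (uint32_t)(reg & 0xfc);
}
```

For 01:07.0 and the first base address register at offset 0x10, `conf1_address(1, 7, 0, 0x10)` gives 0x80013810 for either lspci entry, so a config cycle has no field left with which to pick one entry over the other.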
> Some background on the hardware: the LH4 seems to have two 53c895
> interfaces on board. In 2.2, you can't see the primary interface in
> lspci, perhaps due to the bug that was fixed by the pre4-1 patch. This is
> generally not a problem, since the NetRAID/MegaRAID typically covers that
> interface.
Hmm.. is this what's meant by the comment:
        /*
         * Ignore Symbios chips controlled by various RAID controllers.
         * These controllers set value 0x52414944 at RAM end - 16.
         */
#if defined(__i386__)
        if (base_2_c) {
                ...
                ram_val = readl_raw(ram_ptr + ram_size - 16);
                iounmap(ram_ptr);
                if (ram_val == 0x52414944) {
                        printf_info("%s: not initializing, "
                                    "driven by RAID controller.\n",
                                    sym_name(device));
                        return -1;
                }

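
As a side note on the magic number in that excerpt: 0x52414944 is simply the ASCII string "RAID" when read most significant byte first. A small sketch that spells it out follows; `u32_to_ascii` is an illustrative name and not part of the sym driver.

```c
#include <stdint.h>

/* Illustrative only: spell out a 32-bit value as four ASCII bytes,
 * most significant byte first. out must hold at least 5 chars. */
static void u32_to_ascii(uint32_t val, char out[5])
{
    out[0] = (char)((val >> 24) & 0xff);
    out[1] = (char)((val >> 16) & 0xff);
    out[2] = (char)((val >> 8) & 0xff);
    out[3] = (char)(val & 0xff);
    out[4] = '\0';
}
```

Calling it with 0x52414944 fills the buffer with 'R', 'A', 'I', 'D'. How those bytes are laid out in the chip's on-board RAM depends on byte order and is not shown here.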
> On 2.4, both interfaces are visible, and the driver can't seem to latch
> onto the second one. I don't care if the first one doesn't work, since
> I don't intend to talk to the disks that way. All I need is the second
> one, since that's where my tape drive is connected.
>
> This system happens to be a dual Pentium III, but I've whittled the kernel
> builds down to generic UP/Pentium settings without any positive effect. I
> even threw out just about everything except what I need to boot at one
> point. I've tried using "sym53c8xx=excl:0x1400", but it doesn't help.
> It's not something in userspace which causes this, since it still happens
> when I boot with 'init=/bin/sh'.
>
> I intend to run 2.4.22 on this system, but tried 2.6.0-test9 in case the
> problem had been discovered and fixed in that tree. Unfortunately, it
> also failed. I haven't tried mangling 2.6's PCI code to duplicate the
> effects of backing out that patch yet.
>
> This box is not in production, so I can run tests and try different
> things. My own attempts at making the driver work were not successful.
>
> I'm stuck. Any advice or pointers would be appreciated.
I have a couple of suggestions.

First, I have a strong suspicion that updating your firmware will
fix this problem. But once we do that, we lose this opportunity for
debugging the pci code. So after we're done robustifying the pci code,
you can go to http://welcome.hp.com/country/us/en/support.html, select
"Download drivers and software", enter 'lh 4' in the box, submit, then
select "Cross operating system (BIOS, Firmware, etc)".

Second, I want to see what the PCI code is up to. So, can you change
the #undef DEBUG to #define DEBUG in linux-2.6.0-test9/drivers/pci/probe.c,
then boot that kernel and send me the dmesg output?

Thanks.
--
"It's not Hollywood. War is real, war is primarily not about defeat or
victory, it is about death. I've seen thousands and thousands of dead bodies.
Do you think I want to have an academic debate on this subject?" -- Robert Fisk
* Re: PROBLEM: sym53c8xx is broken on HP LH 4 after Linux 2.2
From: Doug Ledford @ 2003-11-09 16:51 UTC
To: Matthew Wilcox; +Cc: Russell Kroll, linux-scsi mailing list
On Sun, 2003-11-09 at 11:37, Matthew Wilcox wrote:
> On Fri, Nov 07, 2003 at 07:49:37PM -0600, Russell Kroll wrote:
> > On the HP NetServer LH4, the sym53c8xx driver only works in 2.2 kernels,
> > and fails with a "CACHE INCORRECTLY CONFIGURED" error on 2.4 and 2.6.
> > This also applies to the sym53c8xx_2 driver.
>
> Hi Russell. I'm really interested in fixing this for the sym2 driver in 2.6.
> Given this bit of the lspci log:
>
> 01:07.0 SCSI storage controller: LSI Logic / Symbios Logic 53c895 (rev 01)
> Subsystem: Hewlett-Packard Company: Unknown device 1000
> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
> Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
> Latency: 247 (7500ns min, 16000ns max), cache line size 08
> Interrupt: pin A routed to IRQ 11
> Region 0: I/O ports at 1400 [size=256]
> Region 1: Memory at 40000000 (32-bit, non-prefetchable) [size=256]
> Region 2: Memory at 40001000 (32-bit, non-prefetchable) [size=4K]
>
> 01:07.0 SCSI storage controller: LSI Logic / Symbios Logic 53c895 (rev 01)
> Subsystem: Hewlett-Packard Company: Unknown device 1000
> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
> Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
> Latency: 247 (7500ns min, 16000ns max), cache line size 08
> Interrupt: pin A routed to IRQ 11
> Region 0: I/O ports at a000 [size=256]
> Region 1: Memory at e8101000 (32-bit, non-prefetchable) [size=256]
> Region 2: Memory at e8100000 (32-bit, non-prefetchable) [size=4K]
>
> I would say that this is a PCI problem, not a sym2 problem. I'm only
> trying to pass the buck to another of my personalities though, since I also
> hack PCI stuff. I really don't think we should have two devices with the
> same domain:bus:slot.function since there's no way to do pci config
> cycles to one or the other.
I can tell you what's going on here. This is a 450NX based
motherboard. The 450NX chipset from Intel was the first chipset to have
peer PCI busses. For backwards compatibility, some machine makers
hacked their PCI BIOS to have a fake bridge device on PCI bus 0 that
points to the same bus number as the peer bus. This way if the OS
didn't know about the peer bus registers it would still find the devices
by scanning behind the bridge. In this case we are scanning behind this
fake bridge and then also scanning based upon the peer bus registers in
the chipset, and as a result we are finding the device twice. In order
to fix this problem you need to change the peer bus quirk code for the
450NX chipset to scan the list of bus 0 devices looking for a bridge
that has the same config as the peer bus registers and if so delete the
bridge from the list. That will avoid double scanning and will avoid
having the PCI code try and configure sub busses via a fake bridge when
it should do all configurations via the 450NX peer bus registers.
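
The approach described above could be sketched roughly as follows. This is only an outline under stated assumptions: `struct fake_dev` and `drop_shadow_bridges` are hypothetical stand-ins rather than kernel structures or functions, and "same config as the peer bus registers" is interpreted here as "same secondary/subordinate bus range", which may not be the exact comparison a real quirk would need.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified device record; the kernel's struct pci_dev
 * is much larger and is not reproduced here. */
struct fake_dev {
    uint8_t devfn;        /* (slot << 3) | function on bus 0 */
    int     is_bridge;    /* non-zero for a PCI-to-PCI bridge */
    uint8_t secondary;    /* first bus number behind the bridge */
    uint8_t subordinate;  /* highest bus number behind the bridge */
};

/* Drop every bridge whose bus range equals the peer bus range
 * (peer_first..peer_last) taken from the chipset registers.
 * Survivors are compacted to the front of devs[] in their original
 * order; the new element count is returned. */
static size_t drop_shadow_bridges(struct fake_dev *devs, size_t n,
                                  uint8_t peer_first, uint8_t peer_last)
{
    size_t i;
    size_t kept = 0;

    for (i = 0; i < n; i++) {
        int shadow = devs[i].is_bridge
                  && devs[i].secondary == peer_first
                  && devs[i].subordinate == peer_last;
        if (!shadow)
            devs[kept++] = devs[i];
    }
    return kept;
}
```

With the fake bridge gone from the bus 0 list, the peer bus would be scanned once, from the chipset registers only, which is the point of the suggestion.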
--
Doug Ledford <dledford@redhat.com> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606
* Re: PROBLEM: sym53c8xx is broken on HP LH 4 after Linux 2.2
From: Russell Kroll @ 2003-11-09 20:46 UTC
To: willy; +Cc: linux-scsi
Matthew Wilcox wrote:
> > lspci, perhaps due to the bug that was fixed by the pre4-1 patch. This is
> > generally not a problem, since the NetRAID/MegaRAID typically covers that
> > interface.
>
> Hmm.. is this what's meant by the comment:
>
> /*
> * Ignore Symbios chips controlled by various RAID controllers.
> * These controllers set value 0x52414944 at RAM end - 16.
> */
I don't know for sure, but my guess is that code would only matter if both
the "bottom" controller and the RAID controller were visible at the same
time. Given the other response about 450NX bus weirdness, I wonder if
this is actually a single device that's showing up twice.
Another data point: I checked in the Symbios BIOS (^C at boot) and it only
lists one controller:

            Port  Irq    ---------Status--------  NvRAM
            Num   Level  Current    Next-Boot     Found
SYM53C895   A000  11     On         On            Yes

Change Adapter Status

< snip >

As you saw, lspci has both 0x1400 and 0xa000. When it works (2.2 or
mangled 2.4), it's only at 0xa000.
> First, I have a strong suspicion that updating your firmware will
> fix this problem. But once we do that, we lose this opportunity for
The latest BIOS out there is 4.06.36PS, and the first line in the boot
process is "PhoenixBIOS 4.06.36 PS", so that may not be an option here.
> Second, I want to see what the PCI code is up to. So, can you change
> the #undef DEBUG to #define DEBUG in linux-2.6.0-test9/drivers/pci/probe.c
> then boot that kernel and send me the dmesg output.
OK, here it is:
---
Linux version 2.6.0-test9 (rkroll@webserv) (gcc version 3.2.2) #1 Sun Nov 9 13:10:48 MST 2003
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e8800 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000003fff0000 (usable)
BIOS-e820: 000000003fff0000 - 000000003ffffc00 (ACPI data)
BIOS-e820: 000000003ffffc00 - 0000000040000000 (ACPI NVS)
BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
BIOS-e820: 00000000fffe8800 - 0000000100000000 (reserved)
Warning only 896MB will be used.
Use a HIGHMEM enabled kernel.
896MB LOWMEM available.
On node 0 totalpages: 229376
DMA zone: 4096 pages, LIFO batch:1
Normal zone: 225280 pages, LIFO batch:16
HighMem zone: 0 pages, LIFO batch:1
DMI 2.1 present.
Building zonelist for node : 0
Kernel command line: auto BOOT_IMAGE=test ro root=801 panic=15
Initializing CPU#0
PID hash table entries: 4096 (order 12: 32768 bytes)
Detected 500.105 MHz processor.
Console: colour VGA+ 80x25
Memory: 905912k/917504k available (946k kernel code, 10808k reserved, 356k data, 252k init, 0k highmem)
Calibrating delay loop... 987.13 BogoMIPS
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
CPU: After generic identify, caps: 0387fbff 00000000 00000000 00000000
CPU: After vendor identify, caps: 0387fbff 00000000 00000000 00000000
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU serial number disabled.
CPU: After all inits, caps: 0383fbff 00000000 00000000 00000040
CPU: Intel Pentium III (Katmai) stepping 02
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
NET: Registered protocol family 16
PCI: PCI BIOS revision 2.10 entry at 0xfdab2, last bus=2
PCI: Using configuration type 1
SCSI subsystem initialized
PCI: Probing PCI hardware
PCI: Probing PCI hardware (bus 00)
Scanning bus 00
Found 00:10 [8086/0960] 000604 01
Found 00:11 [8086/1960] 000e00 00
Found 00:18 [1011/0024] 000604 01
Found 00:30 [103c/10c1] 000880 00
Found 00:40 [1013/00b8] 000300 00
Found 00:78 [8086/7110] 000601 00
Found 00:79 [8086/7111] 000101 00
Found 00:7a [8086/7112] 000c03 00
Found 00:7b [8086/7113] 000680 00
Found 00:80 [8086/84ca] 000600 00
PCI: Searching for i450NX host bridges on 0000:00:10.0
Scanning bus 01
Found 01:38 [1000/000c] 000100 00
Fixups for bus 01
Bus scan for 01 returning with max=01
Found 00:90 [8086/84cb] 000600 00
Fixups for bus 00
Scanning behind PCI bridge 0000:00:02.0, config 010100, pass 0
Scanning bus 01
Found 01:38 [1000/000c] 000100 00
Fixups for bus 01
Bus scan for 01 returning with max=01
Scanning behind PCI bridge 0000:00:03.0, config 020200, pass 0
Scanning bus 02
Found 02:10 [8086/1229] 000200 00
Fixups for bus 02
Bus scan for 02 returning with max=02
Scanning behind PCI bridge 0000:00:02.0, config 010100, pass 1
Scanning behind PCI bridge 0000:00:03.0, config 020200, pass 1
Bus scan for 00 returning with max=02
PCI: Using IRQ router PIIX [8086/7110] at 0000:00:0f.0
PCI: IRQ 0 for device 0000:00:0f.2 doesn't match PIRQ mask - try pci=usepirqmask
PCI: Cannot allocate resource region 0 of device 0000:01:07.0
PCI: Cannot allocate resource region 1 of device 0000:01:07.0
PCI: Cannot allocate resource region 2 of device 0000:01:07.0
SBF: ACPI BOOT descriptor is wrong length (39)
SBF: Simple Boot Flag extension found and enabled.
SBF: Setting boot flags 0x1
pty: 256 Unix98 ptys configured
Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing disabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
ttyS3 at I/O 0x2e8 (irq = 3) is a 16550A
Using anticipatory io scheduler
Floppy drive(s): fd0 is 1.44M
FDC 0 is a National Semiconductor PC87306
megaraid: v2.00.3 (Release Date: Wed Feb 19 08:51:30 EST 2003)
PCI: Assigned IRQ 10 for device 0000:00:02.1
megaraid: found 0x8086:0x1960:bus 0:slot 2:func 1
scsi0:Found MegaRAID controller at 0xf8800000, IRQ:10
megaraid: [\x03\x03D :\x03\x02B ] detected 1 logical drives.
megaraid: channel[0] is raid.
scsi0 : LSI Logic MegaRAID \x03\x03D 254 commands 16 targs 4 chans 7 luns
scsi0: scanning scsi channel 0 for logical drives.
Vendor: MegaRAID Model: LD0 RAID5 51834R Rev: D
Type: Direct-Access ANSI SCSI revision: 02
scsi0: scanning scsi channel 4 [P0] for physical devices.
st: Version 20030811, fixed bufsize 32768, s/g segs 256
SCSI device sda: 106156032 512-byte hdwr sectors (54352 MB)
sda: asking for cache data failed
sda: assuming drive cache: write through
sda: sda1 sda2
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
mice: PS/2 mouse device common for all mice
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
NET: Registered protocol family 1
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 252k freed
Adding 1662716k swap on /dev/sda2. Priority:-1 extents:1
EXT3 FS on sda1, internal journal
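
For reading the debug lines in this log: the "Found bus:devfn" entries print the combined device/function byte in hex, and the "config" value on the bridge lines appears to be the low 24 bits of the bridge's bus-number registers (primary, secondary, subordinate). The sketch below decodes both. The devfn split is corroborated by the log itself (the bridge found at 00:10 is later named 0000:00:02.0); the layout of the config value is an assumption based on the standard bridge header, and the names used here are illustrative.

```c
#include <stdint.h>

/* Split the devfn byte printed in "Found bus:devfn" lines:
 * upper five bits are the slot, lower three bits the function. */
static unsigned devfn_slot(uint8_t devfn) { return (devfn >> 3) & 0x1f; }
static unsigned devfn_func(uint8_t devfn) { return devfn & 0x07; }

/* Hypothetical container for the three bus-number bytes that the
 * "config %06x" value is assumed to carry. */
struct bridge_buses {
    uint8_t primary;
    uint8_t secondary;
    uint8_t subordinate;
};

/* Assumed layout: bits 7-0 primary, 15-8 secondary, 23-16 subordinate. */
static struct bridge_buses decode_bridge_config(uint32_t config)
{
    struct bridge_buses b;

    b.primary     = (uint8_t)(config & 0xff);
    b.secondary   = (uint8_t)((config >> 8) & 0xff);
    b.subordinate = (uint8_t)((config >> 16) & 0xff);
    return b;
}
```

Read this way, "Found 01:38 [1000/000c]" is device 01:07.0, and it shows up twice: once during the i450NX host bridge search and once behind the bridge at 0000:00:02.0, whose config 010100 decodes to primary bus 00, secondary bus 01, subordinate bus 01.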