* Re: The horrible hack from hell called A20
@ 2000-12-06 12:09 Miles Lane
2000-12-06 20:05 ` Linus Torvalds
0 siblings, 1 reply; 7+ messages in thread
From: Miles Lane @ 2000-12-06 12:09 UTC (permalink / raw)
To: Linus Torvalds, linux-kernel
I reported problems with using my two Cardbus cards simultaneously
with previous test12 releases. The behavior has changed with pre6.
#1
When I run "ifup eth0", I get an error message:
SIOCADDRT: File exists
SIOCADDRT: File exists
This happens even when my 3c575 Cardbus ethernet card is the
only Cardbus card inserted. This behavior existed in pre4, too,
though.
#2
If I insert both my 3c575 and Belkin BusPort Mobile USB host-controller
and then enable both of them, "modprobe usb-ohci" hangs. If I then
attempt "modprobe -r 3c59x", that process hangs, too. lsmod shows:
usb-ohci 15072 1 (initializing)
3c59x 0 0 (deleted)
usbcore 50384 1 (autoclean) [usb-ohci]
Then, when I try to shut the machine down, the shutdown process
hangs when trying to close down eth0.
I am including my entire dmesg output. I apologize for this, but
I am not sure what parts of the logfile are definitely irrelevant
to this report.
Linux version 2.4.0-test12 (root@agate) (gcc version egcs-2.91.66
19990314/Linux (egcs-1.1.2 release)) #5 Wed Dec 6 00:48:18 PST 2000
BIOS-provided physical RAM map:
BIOS-e820: 000000000009f800 @ 0000000000000000 (usable)
BIOS-e820: 0000000000000800 @ 000000000009f800 (reserved)
BIOS-e820: 0000000000010000 @ 00000000000f0000 (reserved)
BIOS-e820: 0000000004f00000 @ 0000000000100000 (usable)
BIOS-e820: 0000000000010000 @ 00000000ffff0000 (reserved)
Scan SMP from c0000000 for 1024 bytes.
Scan SMP from c009fc00 for 1024 bytes.
Scan SMP from c00f0000 for 65536 bytes.
Scan SMP from c009f800 for 4096 bytes.
On node 0 totalpages: 20480
zone(0): 4096 pages.
zone(1): 16384 pages.
zone(2): 0 pages.
mapped APIC to ffffe000 (01156000)
Kernel command line: auto BOOT_IMAGE=Serial-Debug ro root=305
pci=biosirq console=ttyS0,38400 console=tty0
Initializing CPU#0
Detected 232.112 MHz processor.
Console: colour VGA+ 80x43
Calibrating delay loop... 462.03 BogoMIPS
Memory: 78616k/81920k available (1032k kernel code, 2916k reserved, 82k
data, 204k init, 0k highmem)
Dentry-cache hash table entries: 16384 (order: 5, 131072 bytes)
Buffer-cache hash table entries: 4096 (order: 2, 16384 bytes)
Page-cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 8192 (order: 4, 65536 bytes)
VFS: Diskquotas version dquot_6.4.0 initialized
CPU: Before vendor init, caps: 0183f9ff 00000000 00000000, vendor = 0
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 512K
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: After vendor init, caps: 0183f9ff 00000000 00000000 00000000
CPU: After generic, caps: 0183f9ff 00000000 00000000 00000000
CPU: Common caps: 0183f9ff 00000000 00000000 00000000
CPU: Intel Pentium II (Deschutes) stepping 00
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
mtrr: v1.37 (20001109) Richard Gooch (rgooch@atnf.csiro.au)
mtrr: detected mtrr type: Intel
PCI: PCI BIOS revision 2.10 entry at 0xfda13, last bus=0
PCI: Using configuration type 1
PCI: Probing PCI hardware
PCI: Using IRQ router PIIX [8086/7110] at 00:07.0
got res[10000000:10000fff] for resource 0 of Texas Instruments PCI1131
got res[10001000:10001fff] for resource 0 of Texas Instruments
PCI1131 (#2)
Limiting direct PCI/PCI transfers.
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Starting kswapd v1.8
pty: 256 Unix98 ptys configured
Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
PIIX4: IDE controller on PCI bus 00 dev 39
PIIX4: chipset revision 1
PIIX4: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xfcf0-0xfcf7, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0xfcf8-0xfcff, BIOS settings: hdc:pio, hdd:pio
hda: TOSHIBA MK4006MAV, ATA DISK drive
hdc: TOSHIBA CD-ROM XM-1702BC, ATAPI CDROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
hda: 8007552 sectors (4100 MB), CHS=993/128/63, UDMA(33)
Partition check:
/dev/ide/host0/bus0/target0/lun0: p1 p2 < p5 p6 >
Serial driver version 5.02 (2000-08-09) with MANY_PORTS SHARE_IRQ
SERIAL_PCI enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
ttyS01 at 0x02f8 (irq = 3) is a 16550A
Linux PCMCIA Card Services 3.1.22
options: [pci] [cardbus]
PCI: Enabling device 00:04.0 (0000 -> 0002)
PCI: Assigned IRQ 11 for device 00:04.0
PCI: Enabling device 00:04.1 (0000 -> 0002)
PCI: Assigned IRQ 11 for device 00:04.1
Intel PCIC probe: not found.
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
IP: routing cache hash table of 512 buckets, 4Kbytes
TCP: Hash tables configured (established 8192 bind 8192)
Linux IP multicast router 0.06 plus PIM-SM
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
devfs: v0.102 (20000622) Richard Gooch (rgooch@atnf.csiro.au)
devfs: devfs_debug: 0x0
devfs: boot_options: 0x2
Yenta IRQ list 0698, PCI irq11
Socket status: 30000006
Yenta IRQ list 0698, PCI irq11
Socket status: 30000020
cs: cb_alloc(bus 1): vendor 0x10b7, device 0x5157
got res[1000:107f] for resource 0 of PCI device 10b7:5157
got res[10800000:1080007f] for resource 1 of PCI device 10b7:5157
got res[10800080:108000ff] for resource 2 of PCI device 10b7:5157
got res[10400000:1041ffff] for resource 6 of PCI device 10b7:5157
PCI: Enabling device 01:00.0 (0000 -> 0003)
PCI: Found IRQ 11 for device 01:00.0
PCI: The same IRQ used for device 00:04.0
call_usermodehelper[/sbin/hotplug]: no root fs
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 204k freed
Adding Swap: 108824k swap-space (priority -1)
usb.c: registered new driver usbdevfs
usb.c: registered new driver hub
3c59x.c:LK1.1.11 13 Nov 2000 Donald Becker and others.
http://www.scyld.com/network/vortex.html $Revision: 1.102.2.46 $
See Documentation/networking/vortex.txt
eth0: 3Com PCI 3CCFE575BT Cyclone CardBus at 0x1000, PCI: Found IRQ 11
for device 01:00.0
PCI: The same IRQ used for device 00:04.0
PCI: Setting latency timer of device 01:00.0 to 64
00:10:4b:7c:9d:9d, IRQ 11
eth0: CardBus functions mapped 10800080->c5840080
8K byte-wide RAM 5:3 Rx:Tx split, MII interface.
MII transceiver found at address 0, status 782d.
Enabling bus-master transmits and whole-frame receives.
eth0: using default media MII
isapnp: Scanning for Pnp cards...
isapnp: No Plug & Play device found
snd: cs4231: port = 0x530, id = 0xa
snd: CS4231: VERSION (I25) = 0x3
snd: CS4231: ext version; rev = 0xe8, id = 0xe8
snd: CS4236: [0xf00] C1 (version) = 0xe8, ext = 0xe8
cs: cb_alloc(bus 5): vendor 0x1045, device 0xc861
got res[11000000:11000fff] for resource 0 of PCI device 1045:c861
PCI: Enabling device 05:00.0 (0000 -> 0002)
PCI: Found IRQ 11 for device 05:00.0
PCI: The same IRQ used for device 00:04.0
PCI: The same IRQ used for device 01:00.0
PCI: Found IRQ 11 for device 05:00.0
PCI: The same IRQ used for device 00:04.0
PCI: The same IRQ used for device 01:00.0
PCI: Setting latency timer of device 05:00.0 to 64
usb-ohci.c: USB OHCI at membase 0xc586b000, IRQ 11
usb-ohci.c: usb-05:00.0, PCI device 1045:c861
usb.c: new USB bus registered, assigned bus number 1
usb.c: kmalloc IF c2a60720, numif 1
usb.c: new device strings: Mfr=0, Product=2, SerialNumber=1
usb.c: USB device number 1 default language ID 0x0
Product: USB OHCI Root Hub
SerialNumber: c586b000
hub.c: USB hub found
hub.c: 2 ports detected
hub.c: standalone hub
hub.c: ganged power switching
hub.c: global over-current protection
hub.c: power on to power good time: 2ms
hub.c: hub controller current requirement: 0mA
hub.c: port removable status: RR
hub.c: local power source is good
hub.c: no over-current condition exists
hub.c: enabling power on all ports
usb.c: hub driver claimed interface c2a60720
usb.c: kusbd: /sbin/hotplug add 1
hub.c: port 2 connection change
hub.c: port 2, portstatus 301, change 1, 1.5 Mb/s
hub.c: port 2, portstatus 303, change 10, 1.5 Mb/s
hub.c: USB new device connect on bus1/2, assigned device number 2
usb.c: kmalloc IF c2a603a0, numif 1
usb.c: skipped 1 class/vendor specific interface descriptors
usb.c: new device strings: Mfr=1, Product=2, SerialNumber=0
usb.c: USB device number 2 default language ID 0x409
Manufacturer: Microsoft
Product: Microsoft IntelliMouse® Optical
usb.c: unhandled interfaces on device
usb.c: USB device 2 (vend/prod 0x45e/0x29) is not claimed by any active
driver.
Length = 18
DescriptorType = 01
USB version = 1.10
Vendor:Product = 045e:0029
MaxPacketSize0 = 8
NumConfigurations = 1
Device version = 1.08
Device Class:SubClass:Protocol = 00:00:00
Per-interface classes
Configuration:
bLength = 9
bDescriptorType = 02
wTotalLength = 0022
bNumInterfaces = 01
bConfigurationValue = 01
iConfiguration = 00
bmAttributes = a0
MaxPower = 100mA
Interface: 0
Alternate Setting: 0
bLength = 9
bDescriptorType = 04
bInterfaceNumber = 00
bAlternateSetting = 00
bNumEndpoints = 01
bInterface Class:SubClass:Protocol = 03:01:02
iInterface = 00
Endpoint:
bLength = 7
bDescriptorType = 05
bEndpointAddress = 81 (in)
bmAttributes = 03 (Interrupt)
wMaxPacketSize = 0004
bInterval = 0a
usb.c: kusbd: /sbin/hotplug add 2
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
* Re: The horrible hack from hell called A20
2000-12-06 12:09 The horrible hack from hell called A20 Miles Lane
@ 2000-12-06 20:05 ` Linus Torvalds
2000-12-06 22:35 ` Miles Lane
0 siblings, 1 reply; 7+ messages in thread
From: Linus Torvalds @ 2000-12-06 20:05 UTC (permalink / raw)
To: Miles Lane; +Cc: linux-kernel

On Wed, 6 Dec 2000, Miles Lane wrote:
>
> If I insert both my 3c575 and Belkin BusPort Mobile USB host-controller
> and then enable both of them, "modprobe usb-ohci" hangs. If I then
> attempt "modprobe -r 3c59x", that process hangs, too. lsmod shows:
>
> usb-ohci 15072 1 (initializing)
> 3c59x 0 0 (deleted)
> usbcore 50384 1 (autoclean) [usb-ohci]

The only thing in common between the two will be the fact that they do
share the same irq, and I'm not at all sure that those two drivers are
always happy about irq sharing.

Your dmesg output looks sane and happy, though. Both the USB and the 3c59x
driver find their hardware, and claim to have successfully initialized
them. The USB driver even finds the stuff on the USB bus (microsoft
intellimouse), so it obviously works to a large degree. Similarly, the
ethernet driver happily finds everything etc.

In fact, everything looks so happy that I bet that the reason the module
is stuck initializing is some setup problem, possibly because kusbd ends
up waiting on /sbin/hotplug or similar. It does not look like the drivers
themselves would have trouble, it looks much more like a modprobe-related
issue (maybe deadlocking on some semaphore or other lock).

I'd suggest two things:

 - try not using modules. Does it "just work" for you then? (Both the OHCI
   and the 3c59x driver should happily work with hotplug compiled right
   into the kernel).

 - try "strace"ing the whole modprobe thing, to see where it hangs, in
   order to figure out what it is waiting for. I wonder if it's the
   keventd changes.

Basically, I think this is a completely different problem, and not really
driver-related any more.

		Linus
* Re: The horrible hack from hell called A20
2000-12-06 20:05 ` Linus Torvalds
@ 2000-12-06 22:35 ` Miles Lane
2000-12-07 2:27 ` Linus Torvalds
0 siblings, 1 reply; 7+ messages in thread
From: Miles Lane @ 2000-12-06 22:35 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel

Hi Linus,

Thanks for the reply. I agree with your analysis of the information
I reported in this message. However, in previous related bug reports
I mentioned actual functional conflicts between the drivers.

Here is what goes wrong:

Dec 6 04:21:32 agate kernel: eth0: Host error, FIFO diagnostic register 0000.
Dec 6 04:21:32 agate kernel: eth0: using default media MII
Dec 6 04:21:32 agate kernel: eth0: Host error, FIFO diagnostic register 0000.
Dec 6 04:21:32 agate kernel: eth0: using default media MII
Dec 6 04:21:32 agate kernel: eth0: Host error, FIFO diagnostic register 0000.
Dec 6 04:21:33 agate kernel: eth0: using default media MII
Dec 6 04:21:33 agate kernel: eth0: Host error, FIFO diagnostic register 0000.
Dec 6 04:21:33 agate kernel: eth0: using default media MII
Dec 6 04:21:33 agate kernel: eth0: Too much work in interrupt, status e003.
Dec 6 04:21:33 agate kernel: eth0: Host error, FIFO diagnostic register 0000.
Dec 6 04:21:33 agate kernel: eth0: using default media MII
Dec 6 04:21:33 agate kernel: eth0: Host error, FIFO diagnostic register 0000.
Dec 6 04:21:33 agate kernel: eth0: using default media MII
Dec 6 04:21:33 agate kernel: eth0: Host error, FIFO diagnostic register 0000.

The repro case is to simply get both drivers happily loaded,
insert my USB mouse and restart XFree86 so the USB mouse gets used,
then copy a file from an FTP site while moving the mouse.

Here's an strace of modprobe usb-ohci:

query_module(NULL, QM_SYMBOLS, 0x806afe0, 16384, 21890) = -1 ENOSPC (No space left on device)
brk(0x8076000) = 0x8076000
query_module(NULL, QM_SYMBOLS, { /* 930 entries */ }, 930) = 0
lseek(3, 0, SEEK_SET) = 0
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\1\0\3\0\1\0\0\0\0\0\0\0"..., 52) = 52
lseek(3, 15100, SEEK_SET) = 15100
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 560) = 560
lseek(3, 64, SEEK_SET) = 64
read(3, "WVS\213t$\0241\333\17\267V\0049\323}\34\215~\20\213\4\237"..., 13074) = 13074
lseek(3, 18756, SEEK_SET) = 18756
read(3, "\35\0\0\0\2:\0\0/\0\0\0\2:\0\0o\0\0\0\2;\0\0\213\0\0\0"..., 2248) = 2248
lseek(3, 13152, SEEK_SET) = 13152
read(3, "\0\0\0\0\254\377\377\377\271\377\377\377\254\377\377\377"..., 192) = 192
lseek(3, 21004, SEEK_SET) = 21004
read(3, "@\0\0\0\1\3\0\0D\0\0\0\1\3\0\0L\0\0\0\1\2\0\0P\0\0\0\1"..., 96) = 96
lseek(3, 13376, SEEK_SET) = 13376
read(3, "kernel_version=2.4.0-test12\0\0\0\0\0"..., 140) = 140
lseek(3, 13536, SEEK_SET) = 13536
read(3, "/usr/src/linux/include/linux/mou"..., 1393) = 1393
lseek(3, 21100, SEEK_SET) = 21100
read(3, "\350\2\0\0\1\2\0\0\354\2\0\0\1\2\0\0\360\2\0\0\1\2\0\0"..., 160) = 160
lseek(3, 14929, SEEK_SET) = 14929
read(3, "\0GCC: (GNU) egcs-2.91.66 1999031"..., 61) = 61
lseek(3, 14990, SEEK_SET) = 14990
read(3, "\0.symtab\0.strtab\0.shstrtab\0.text"..., 108) = 108
lseek(3, 15660, SEEK_SET) = 15660
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\0\0\0\0\0\0\0"..., 1664) = 1664
brk(0x8077000) = 0x8077000
lseek(3, 17324, SEEK_SET) = 17324
read(3, "\0usb-ohci.c\0gcc2_compiled.\0__mod"..., 1430) = 1430
brk(0x8078000) = 0x8078000
lstat("/lib", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/lib/modules", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/lib/modules/2.4.0-test12", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat("/lib/modules/2.4.0-test12/kernel", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/lib/modules/2.4.0-test12/kernel/drivers", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/lib/modules/2.4.0-test12/kernel/drivers/usb", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/lib/modules/2.4.0-test12/kernel/drivers/usb/usb-ohci.o", {st_mode=S_IFREG|0644, st_size=21260, ...}) = 0
stat("/lib/modules/2.4.0-test12/kernel/drivers/usb/usb-ohci.o", {st_mode=S_IFREG|0644, st_size=21260, ...}) = 0
create_module("usb-ohci", 15072) = 0xc5891000
brk(0x807c000) = 0x807c000
init_module("usb-ohci", 0x8077d10 <unfinished ...>

I will test again with usb-ohci and 3c59x built into the kernel.

As always, many thanks for your help!

	Miles

<snip>

> The only thing in common between the two will be the fact that they do
> share the same irq, and I'm not at all sure that those two drivers are
> always happy about irq sharing.
>
> Your dmesg output looks sane and happy, though. Both the USB and the 3c59x
> driver find their hardware, and claim to have successfully initialized
> them. The USB driver even finds the stuff on the USB bus (microsoft
> intellimouse), so it obviously works to a large degree. Similarly, the
> ethernet driver happily finds everything etc.
>
> In fact, everything looks so happy that I bet that the reason the module
> is stuck initializing is some setup problem, possibly because kusbd ends
> up waiting on /sbin/hotplug or similar. It does not look like the drivers
> themselves would have trouble, it looks much more like a modprobe-related
> issue (maybe deadlocking on some semaphore or other lock).
>
> I'd suggest two things:
>
> - try not using modules. Does it "just work" for you then? (Both the OHCI
>   and the 3c59x driver should happily work with hotplug compiled right
>   into the kernel).
>
> - try "strace"ing the whole modprobe thing, to see where it hangs, in
>   order to figure out what it is waiting for. I wonder if it's the
>   keventd changes.
* Re: The horrible hack from hell called A20
2000-12-06 22:35 ` Miles Lane
@ 2000-12-07 2:27 ` Linus Torvalds
2000-12-07 6:43 ` Miles Lane
2000-12-07 15:29 ` Andrew Morton
0 siblings, 2 replies; 7+ messages in thread
From: Linus Torvalds @ 2000-12-07 2:27 UTC (permalink / raw)
To: Miles Lane; +Cc: linux-kernel

On Wed, 6 Dec 2000, Miles Lane wrote:
>
> Here is what goes wrong:
>
> Dec 6 04:21:32 agate kernel: eth0: Host error, FIFO diagnostic register 0000.

But it continues to work, right?

I bet that your ethernet card is just unhappy that it couldn't get DMA in
time, because the bus was so busy. Many of the busmastering ethernet
devices will start the packet send early, happy in the knowledge that
they'll usually have plenty of time to DMA the data by the time they need
it.

This works fine most of the time, but if you have a busy PCI bus and
you're doing things over a (potentially slow) PCI bridge like the Cardbus
bridge, you're taking chances. And sometimes those chances do not work out
ok.. Especially if you have slow memory, which most laptops have.

I suspect that the worst result of this is just a noisy driver: both on
the network (runt packets) and on the console. And it obviously will cause
performance to suffer too, due to retransmitting packets that failed,
and/or losing packets.

There may be some rule for the threshold for sending packets or something
else to make this happen less, so this is probably tweakable. But it
doesn't sound deadly (unless the driver causes this to result in a dead
network - does it?)

		Linus
* Re: The horrible hack from hell called A20
2000-12-07 2:27 ` Linus Torvalds
@ 2000-12-07 6:43 ` Miles Lane
2000-12-07 7:10 ` Linus Torvalds
2000-12-07 15:29 ` Andrew Morton
1 sibling, 1 reply; 7+ messages in thread
From: Miles Lane @ 2000-12-07 6:43 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel

Linus Torvalds wrote:
>
> On Wed, 6 Dec 2000, Miles Lane wrote:
>
>> Here is what goes wrong:
>>
>> Dec 6 04:21:32 agate kernel: eth0: Host error, FIFO diagnostic register 0000.
>
>
> But it continues to work, right?

I'll check. My system only has 80MB RAM, and I run Mozilla,
which pushes a lot of information into the swap space.
When I encounter this "Host error" problem, tons of messages
start spewing into my logs. This bogs my entire system down
horribly.

<great educational material snipped>

I have reproduced this problem with all the drivers built into
the kernel.

I have also just tried a test pass with 3c59x built in and
USB built as modules. I booted with only the 3c575 inserted.
I got eth0 running and then loaded usb-ohci (with the enable
bus mastering change added). This resulted in modprobe hanging
again.

Now I'll try with all modules again and check to see whether
eth0 is still usable.

Thanks,

	Miles
* Re: The horrible hack from hell called A20
2000-12-07 6:43 ` Miles Lane
@ 2000-12-07 7:10 ` Linus Torvalds
0 siblings, 0 replies; 7+ messages in thread
From: Linus Torvalds @ 2000-12-07 7:10 UTC (permalink / raw)
To: Miles Lane; +Cc: linux-kernel

On Wed, 6 Dec 2000, Miles Lane wrote:
>
> I have also just tried a test pass with 3c59x built in and
> USB built as modules. I booted with only the 3c575 inserted.
> I got eth0 running and then loaded usb-ohci (with the enable
> bus mastering change added). This resulted in modprobe hanging
> again.

I bet you're hanging on the rtnl_semaphore due to having a /sbin/hotplug
policy.

Miles, mind trying out a really simple change in the
____call_usermodehelper() function in kernel/kmod.c?

Change: #if 0 out the whole block that says "if (retval >= 0)" and does
the waiting for the child.

We shouldn't wait for the user mode helper: that's just going to cause
nasty deadlocks. Deadlocks like the one you seem to be seeing, in fact.

Does your ifconfig problem go away with that fix?

		Linus
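
The difference between waiting and not waiting for a helper can be
illustrated in userspace. The following is only a rough analogue under
stated assumptions, not the kernel code from kernel/kmod.c:
spawn_helper() and its wait_for_child flag are hypothetical names
invented for this sketch. With wait_for_child set, the caller blocks
until the helper exits, which is where a hang can occur if the helper
itself is waiting on something the caller holds.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical userspace analogue of launching a helper program.
 * Returns the child's pid, or -1 if fork() failed.
 */
pid_t spawn_helper(const char *path, int wait_for_child)
{
    pid_t pid = fork();

    if (pid == 0) {
        /* Child: run the helper; exit with 127 if exec fails. */
        execl(path, path, (char *)NULL);
        _exit(127);
    }

    if (pid > 0 && wait_for_child) {
        /* Blocking variant: the caller stalls until the helper exits. */
        int status;
        waitpid(pid, &status, 0);
    }

    /* Non-blocking variant falls through and returns immediately. */
    return pid;
}
```

In the non-blocking case the caller is still responsible for reaping the
child later (for example with waitpid() or a SIGCHLD handler), otherwise
it lingers as a zombie.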
* Re: The horrible hack from hell called A20
2000-12-07 2:27 ` Linus Torvalds
2000-12-07 6:43 ` Miles Lane
@ 2000-12-07 15:29 ` Andrew Morton
1 sibling, 0 replies; 7+ messages in thread
From: Andrew Morton @ 2000-12-07 15:29 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Miles Lane, linux-kernel

Linus Torvalds wrote:
>
> On Wed, 6 Dec 2000, Miles Lane wrote:
> >
> > Here is what goes wrong:
> >
> > Dec 6 04:21:32 agate kernel: eth0: Host error, FIFO diagnostic register 0000.
>
> But it continues to work, right?
>
> I bet that your ethernet card is just unhappy that it couldn't get DMA in
> time, because the bus was so busy. Many of the busmastering ethernet
> devices will start the packet send early, happy in the knowledge that
> they'll usually have plenty of time to DMA the data by the time they need
> it.
>
> This works fine most of the time, but if you have a busy PCI bus and
> you're doing things over a (potentially slow) PCI bridge like the Cardbus
> bridge, you're taking chances. And sometimes those chances do not work out
> ok.. Especially if you have slow memory, which most laptops have.
>
> I suspect that the worst result of this is just a noisy driver: both on
> the network (runt packets) and on the console. And it obviously will cause
> performance to suffer too, due to retransmitting packets that failed,
> and/or losing packets.
>
> There may be some rule for the threshold for sending packets or something
> else to make this happen less, so this is probably tweakable. But it
> doesn't sound deadly (unless the driver causes this to result in a dead
> network - does it?)
>

We initialise the 3com NICs so that the DMA of Tx frames doesn't commence
until 1536 free bytes are available in the Tx FIFO. I assume this is to
make the most of the NIC's ability to bus-master-transfer an entire frame
in one slurp. But this is irrelevant.

We initialise the NIC so it starts putting data on the wire after 128
bytes are in the Tx FIFO.

So yes, there is an opportunity for another bus master to interrupt
the slurp and to hold the bus for so long that the NIC gets a TX
underrun. But surely not by just wiggling the mouse around?

I have seen just one report of a person getting Tx underruns. The
driver recovered OK.

But Miles is reporting "Host error". This is different. The 3com
datasheet says:

    This bit is set when a catastrophic error related to the bus
    interface occurs. The errors that set this bit are PCI target abort
    and PCI master abort. This bit is cleared by issuing the
    GlobalReset command...

This is a very rare problem. Trolling the vortex archives comes up with
a few comments from Das Nicmeisters:

> Donald Becker wrote:
> Another PCMCIA setup bug, except this one is much harder to track down.
> The CardBus bridge chip isn't configured correctly.
> This is a real bus problem, not a false report.

> David Hinds wrote:
> I've gotten a few reports of these PCI bus errors. They have indeed
> been very hard to track down, since they are specific to particular
> hardware combinations, and I've never been able to reproduce them.

> Donald Becker wrote:
> I've gotten this error on my Vaio 505TR, but I've never been able to
> reproduce it when I'm ready to observe it.

Miles, could you please apply the below patch? It'll give us a little
more info about the PCI error. Bit 31 of `bus status' is MasterAbort
and bit 30 is TargetAbort.

Also, you can disable the start-tx-after-128-bytes feature by
uncommenting

// wait_for_completion(dev, SetTxStart|0x07ff);

near the end of vortex_up(). With this change the NIC won't start
transmitting until it has the entire frame onboard. It shouldn't make
any difference (hah).

This does look like a Cardbus bridge problem.
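
For reading the `bus status' value once it shows up in the log, the two
bits named above can be tested with a small helper. This is a minimal
userspace sketch, not driver code: the macro and function names are
hypothetical, and only the bit positions (bit 31 MasterAbort, bit 30
TargetAbort) are taken from the message.

```c
#include <stdint.h>
#include <stdio.h>

/* Bit positions as described in the message; names are hypothetical. */
#define BUS_STATUS_MASTER_ABORT 0x80000000u /* bit 31 */
#define BUS_STATUS_TARGET_ABORT 0x40000000u /* bit 30 */

/* Returns 1 if the PCI master abort bit is set, else 0. */
int bus_master_abort(uint32_t bus_status)
{
    return (bus_status & BUS_STATUS_MASTER_ABORT) != 0;
}

/* Returns 1 if the PCI target abort bit is set, else 0. */
int bus_target_abort(uint32_t bus_status)
{
    return (bus_status & BUS_STATUS_TARGET_ABORT) != 0;
}

/* Prints a one-line summary of the two abort bits. */
void print_bus_status(uint32_t bus_status)
{
    printf("bus status %08x: master abort=%d, target abort=%d\n",
           (unsigned)bus_status,
           bus_master_abort(bus_status),
           bus_target_abort(bus_status));
}
```

Calling print_bus_status() with the hexadecimal value from the
"PCI bus error, bus status ..." log line shows which of the two aborts
the NIC recorded.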

--- linux-2.4.0-test12-pre7/drivers/net/3c59x.c	Tue Nov 21 20:11:20 2000
+++ linux-akpm/drivers/net/3c59x.c	Fri Dec 8 02:24:11 2000
@@ -203,7 +203,7 @@
 #include <linux/delay.h>

 static char version[] __devinitdata =
-"3c59x.c:LK1.1.11 13 Nov 2000 Donald Becker and others. http://www.scyld.com/network/vortex.html " "$Revision: 1.102.2.46 $\n";
+"3c59x.c:LK1.1.11 13 Nov 2000 Donald Becker and others. http://www.scyld.com/network/vortex.html " "$Revision: 1.102.2.40 $\n";

 MODULE_AUTHOR("Donald Becker <becker@scyld.com>");
 MODULE_DESCRIPTION("3Com 3c59x/3c90x/3c575 series Vortex/Boomerang/Cyclone driver");
@@ -843,10 +843,15 @@
 {
 	int rc;

-	rc = vortex_probe1 (pdev, pci_resource_start (pdev, 0), pdev->irq,
-			    ent->driver_data, vortex_cards_found);
-	if (rc == 0)
-		vortex_cards_found++;
+	/* wake up and enable device */
+	if (pci_enable_device (pdev)) {
+		rc = -EIO;
+	} else {
+		rc = vortex_probe1 (pdev, pci_resource_start (pdev, 0), pdev->irq,
+				    ent->driver_data, vortex_cards_found);
+		if (rc == 0)
+			vortex_cards_found++;
+	}
 	return rc;
 }

@@ -863,7 +868,7 @@
 	struct vortex_private *vp;
 	int option;
 	unsigned int eeprom[0x40], checksum = 0; /* EEPROM contents */
-	int i;
+	int i, step;
 	struct net_device *dev;
 	static int printed_version;
 	int retval;
@@ -912,12 +917,6 @@
 		vp->must_free_region = 1;
 	}

-	/* wake up and enable device */
-	if (pci_enable_device (pdev)) {
-		retval = -EIO;
-		goto free_region;
-	}
-
 	/* enable bus-mastering if necessary */
 	if (vci->flags & PCI_USES_MASTER)
 		pci_set_master (pdev);
@@ -1025,6 +1024,13 @@
 			dev->irq);
 #endif

+	EL3WINDOW(4);
+	step = (inb(ioaddr + Wn4_NetDiag) & 0x1e) >> 1;
+	printk(KERN_INFO " product code '%c%c' rev %02x.%d date %02d-"
+		"%02d-%02d\n", eeprom[6]&0xff, eeprom[6]>>8, eeprom[0x14],
+		step, (eeprom[4]>>5) & 15, eeprom[4] & 31, eeprom[4]>>9);
+
+
 	if (pdev && vci->drv_flags & HAS_CB_FNS) {
 		unsigned long fn_st_addr; /* Cardbus function status space */
 		unsigned short n;
@@ -1148,14 +1154,19 @@
 	return retval;
 }

-static void wait_for_completion(struct net_device *dev, int cmd)
+#define wait_for_completion(dev, cmd) _wait_for_completion(dev, cmd, __LINE__)
+
+static void _wait_for_completion(struct net_device *dev, int cmd, int line)
 {
-	int i = 4000;
+	int i;

 	outw(cmd, dev->base_addr + EL3_CMD);
-	while (--i > 0) {
-		if (!(inw(dev->base_addr + EL3_STATUS) & CmdInProgress))
+	for (i = 0; i < 4000000; i++) {
+		if (!(inw(dev->base_addr + EL3_STATUS) & CmdInProgress)) {
+			if (i > 1000)
+				printk("wait_for_completion: line=%d, count=%d\n", line, i);
 			return;
+		}
 	}
 	printk(KERN_ERR "%s: command 0x%04x did not complete! Status=0x%x\n",
 		dev->name, cmd, inw(dev->base_addr + EL3_STATUS));
@@ -1331,6 +1342,7 @@
 	set_rx_mode(dev);
 	outw(StatsEnable, ioaddr + EL3_CMD); /* Turn on statistics. */

+//	wait_for_completion(dev, SetTxStart|0x07ff);
 	outw(RxEnable, ioaddr + EL3_CMD); /* Enable the receiver. */
 	outw(TxEnable, ioaddr + EL3_CMD); /* Enable transmitter. */
 	/* Allow status bits to be seen. */
@@ -1663,6 +1675,12 @@
 			dev->name, fifo_diag);
 		/* Adapter failure requires Tx/Rx reset and reinit. */
 		if (vp->full_bus_master_tx) {
+			int bus_status = inl(ioaddr + PktStatus);
+			/* 0x80000000 PCI master abort. */
+			/* 0x40000000 PCI target abort. */
+			if (vortex_debug)
+				printk(KERN_ERR "%s: PCI bus error, bus status %8.8x\n", dev->name, bus_status);
+
 			/* In this case, blow the card away */
 			vortex_down(dev);
 			wait_for_completion(dev, TotalReset | 0xff);
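
To put the 128-byte Tx start threshold discussed above in perspective,
it helps to work out how long that much buffered data lasts on the wire
if DMA stalls right after transmission begins. The sketch below is plain
arithmetic, not driver code: fifo_drain_us() is a hypothetical helper,
the link speed is an assumption (the thread does not state whether eth0
ran at 10 or 100 Mb/s), and preamble and inter-frame gap are ignored.

```c
/*
 * Time, in microseconds, for `bytes` of buffered FIFO data to be
 * clocked out at `mbit_per_s` megabits per second.
 * 1 Mb/s is one bit per microsecond, so the result is bits / rate.
 */
double fifo_drain_us(unsigned int bytes, double mbit_per_s)
{
    return (bytes * 8.0) / mbit_per_s;
}
```

With these assumptions, fifo_drain_us(128, 100.0) gives 10.24
microseconds and fifo_drain_us(128, 10.0) gives 102.4 microseconds. That
is roughly the slack the NIC has before a Tx underrun if another bus
master holds the bus for the whole interval.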

end of thread, other threads:[~2000-12-07 15:56 UTC | newest]

Thread overview: 7+ messages
2000-12-06 12:09 The horrible hack from hell called A20 Miles Lane
2000-12-06 20:05 ` Linus Torvalds
2000-12-06 22:35   ` Miles Lane
2000-12-07  2:27     ` Linus Torvalds
2000-12-07  6:43       ` Miles Lane
2000-12-07  7:10         ` Linus Torvalds
2000-12-07 15:29       ` Andrew Morton