* [RFC/PATCH] Kdump: Disabling PCI interrupts in capture kernel
@ 2005-06-03 11:25 Vivek Goyal
2005-06-03 11:54 ` Richard B. Johnson
` (2 more replies)
0 siblings, 3 replies; 17+ messages in thread
From: Vivek Goyal @ 2005-06-03 11:25 UTC (permalink / raw)
To: linux kernel mailing list, greg
Cc: Fastboot mailing list, Morton Andrew Morton, Alan Stern,
Eric W. Biederman
[-- Attachment #1: Type: text/plain, Size: 5535 bytes --]
Hi,
In kdump, general driver initialization issues sometimes crop up in the
second kernel because devices are not shut down during the crash. These
devices keep sending interrupts while the second kernel is booting, when
drivers are not expecting any interrupts yet.
In some cases, we observe a storm of interrupts while the second kernel
is booting, and the kernel disables that irq line. This may be a stuck irq
line: the irq is shared and level triggered, and no driver is loaded for
the device asserting it.
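To make the "nobody cared" behaviour in the attached log concrete, here is a
rough model of how a line gets shut off during such a storm. It is only a
sketch: struct irq_line, note_irq() and storm() are hypothetical names, and the
window of 100000 interrupts with a limit of 99900 unhandled ones is an
assumption modelled on the kernel's unhandled-interrupt accounting, not a
copy of it.

```c
#include <stdbool.h>

/* Assumed thresholds, modelled on the kernel's unhandled-IRQ accounting. */
#define SAMPLE_WINDOW   100000
#define UNHANDLED_LIMIT  99900

/* Hypothetical per-line bookkeeping; not the kernel's irq_desc. */
struct irq_line {
	unsigned int count;      /* interrupts seen in the current window */
	unsigned int unhandled;  /* of those, how many no handler claimed */
	bool disabled;
};

/* Record one interrupt; returns true if this call disabled the line. */
static bool note_irq(struct irq_line *line, bool handled)
{
	if (line->disabled)
		return false;
	if (!handled)
		line->unhandled++;
	if (++line->count < SAMPLE_WINDOW)
		return false;
	/* Window complete: decide, then start a fresh window. */
	if (line->unhandled > UNHANDLED_LIMIT)
		line->disabled = true;
	line->count = 0;
	line->unhandled = 0;
	return line->disabled;
}

/* Feed n interrupts into the line; returns true if it ends up disabled. */
static bool storm(struct irq_line *line, unsigned int n, bool handled)
{
	for (unsigned int i = 0; i < n; i++)
		note_irq(line, handled);
	return line->disabled;
}
```

With a shared level-triggered line, the only registered handler belongs to a
different device, so it claims nothing and every interrupt counts as
unhandled. Once the line is disabled, the device that does have a driver stops
receiving its interrupts too.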
So, we need something generic which disables interrupt generation from a device
until a driver has been registered for that device and the driver is ready to
receive interrupts. The PCI specification (version 2.3 onwards) introduced an
interrupt disable bit in the command register to disable interrupt generation
from the device. This can come in handy here. In the capture kernel, traverse all
the PCI devices and disable interrupt generation. Re-enable interrupt generation
once the driver for that device registers, maybe after the probe handler
has run. In the probe handler, the driver can reset the device or register for
the irq so that it can handle any interrupt from the device after that.
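The intended life cycle can be sketched outside the kernel as follows. This is
a simulation under stated assumptions, not the patch itself: struct fake_dev,
mask_all(), probe_one() and intx_enabled() are hypothetical names, and the
command register is just a 16-bit field in memory. Only the bit position
(bit 10, 0x400) comes from the PCI command register layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CMD_INTX_DISABLE 0x400  /* bit 10 of the PCI command register */

/* Hypothetical stand-in for a PCI function; not the kernel's struct pci_dev. */
struct fake_dev {
	uint16_t command;   /* simulated command register */
	bool has_driver;    /* whether a driver will bind in this kernel */
};

/* Step 1: early in the capture kernel, silence every device. */
static void mask_all(struct fake_dev *devs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		devs[i].command |= CMD_INTX_DISABLE;
}

/* Step 2: bind a driver and unmask only on success. Returns 0 on success. */
static int probe_one(struct fake_dev *dev)
{
	if (!dev->has_driver)
		return -1;  /* no driver: the device stays masked */
	/* A real probe would reset the device and request its irq here. */
	dev->command &= (uint16_t)~CMD_INTX_DISABLE;
	return 0;
}

/* True if the device is currently allowed to assert INTx. */
static bool intx_enabled(const struct fake_dev *dev)
{
	return !(dev->command & CMD_INTX_DISABLE);
}
```

The point of the ordering is that a device nobody claims in the capture kernel
never gets unmasked, so it cannot hold a shared line asserted.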
Greg mentioned that there are reasons why we cannot disable all PCI
interrupts. Meanwhile I am going through the archives to find out more about it.
In previous conversations, Alan Stern had raised the issue of the console also
not working if interrupts are disabled on all the devices. I am not sure,
but this should work at least for serial consoles and VGA text consoles,
which may be sufficient to capture the dump.
Attached is a hack patch which is by no means complete. I have one machine
with some PCI 2.3 compliant hardware. After applying this hack, the
problem does not occur, at least on this machine. Also attached is the serial
console log, which shows one kind of problem caused by unwanted interrupts.
Bugme 4631 and 4573 are two more instances of driver initialization failure
in the second kernel.
---
o In kdump, devices are not shut down/reset after a crash, and this leads to
various driver initialization failures while the second kernel is booting.
Most of them seem to happen because the system/driver
receives interrupts from a device when it is not prepared to do so.
o This patch tries to solve the problem by disabling interrupts at the
PCI level for all devices. The interrupts are re-enabled once the
driver for that device registers. Currently this enabling and disabling
is done only in the dump capture kernel.
o This is for devices compliant with PCI specification v2.3 or higher.
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
---
linux-2.6.12-rc5-mm1-16M-root/drivers/pci/pci-driver.c | 3 +
linux-2.6.12-rc5-mm1-16M-root/drivers/pci/pci.c | 4 ++
linux-2.6.12-rc5-mm1-16M-root/include/linux/pci.h | 33 +++++++++++++++++
3 files changed, 39 insertions(+), 1 deletion(-)
diff -puN drivers/pci/pci.c~kdump-pci-interrupt-disable drivers/pci/pci.c
--- linux-2.6.12-rc5-mm1-16M/drivers/pci/pci.c~kdump-pci-interrupt-disable 2005-06-03 14:35:26.000000000 +0530
+++ linux-2.6.12-rc5-mm1-16M-root/drivers/pci/pci.c 2005-06-03 15:59:19.000000000 +0530
@@ -823,6 +823,10 @@ static int __devinit pci_init(void)
 	while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) {
 		pci_fixup_device(pci_fixup_final, dev);
+
+		/* A hack to disable interrupts from all PCI devices in
+		 * capture kernel. */
+		pci_disable_device_intx(dev);
 	}
 	return 0;
 }
diff -puN drivers/pci/pci-driver.c~kdump-pci-interrupt-disable drivers/pci/pci-driver.c
--- linux-2.6.12-rc5-mm1-16M/drivers/pci/pci-driver.c~kdump-pci-interrupt-disable 2005-06-03 14:35:26.000000000 +0530
+++ linux-2.6.12-rc5-mm1-16M-root/drivers/pci/pci-driver.c 2005-06-03 14:35:26.000000000 +0530
@@ -248,7 +248,8 @@ static int pci_device_probe(struct devic
 	error = __pci_device_probe(drv, pci_dev);
 	if (error)
 		pci_dev_put(pci_dev);
-
+	else
+		pci_enable_device_intx(pci_dev);
 	return error;
 }
diff -puN include/linux/pci.h~kdump-pci-interrupt-disable include/linux/pci.h
--- linux-2.6.12-rc5-mm1-16M/include/linux/pci.h~kdump-pci-interrupt-disable 2005-06-03 14:35:26.000000000 +0530
+++ linux-2.6.12-rc5-mm1-16M-root/include/linux/pci.h 2005-06-03 16:06:32.000000000 +0530
@@ -901,6 +901,39 @@ extern void pci_disable_msix(struct pci_
 extern void msi_remove_pci_irq_vectors(struct pci_dev *dev);
 #endif
+#ifdef CONFIG_CRASH_DUMP
+static inline int pci_enable_device_intx(struct pci_dev *dev)
+{
+	u16 pci_command;
+
+	/* Enable Interrupt generation if not already enabled */
+	pci_read_config_word(dev, PCI_COMMAND, &pci_command);
+	if (pci_command & PCI_COMMAND_INTX_DISABLE) {
+		pci_command &= ~PCI_COMMAND_INTX_DISABLE;
+		pci_write_config_word(dev, PCI_COMMAND, pci_command);
+	}
+	return 0;
+}
+
+static inline int pci_disable_device_intx(struct pci_dev *dev)
+{
+	u16 pci_command;
+
+	/* Disable Interrupt generation if not already disabled */
+	pci_read_config_word(dev, PCI_COMMAND, &pci_command);
+	if (!(pci_command & PCI_COMMAND_INTX_DISABLE)) {
+		pci_command |= PCI_COMMAND_INTX_DISABLE;
+		pci_write_config_word(dev, PCI_COMMAND, pci_command);
+	}
+	return 0;
+}
+#else
+static inline int pci_enable_device_intx(struct pci_dev *dev)
+{ return 0; }
+static inline int pci_disable_device_intx(struct pci_dev *dev)
+{ return 0; }
+#endif /* CONFIG_CRASH_DUMP */
+
 #endif /* CONFIG_PCI */
 /* Include architecture-dependent settings and functions */
 _
[-- Attachment #2: pci-logs.txt --]
[-- Type: text/plain, Size: 16715 bytes --]
[root@llm15p ~]# SysRq : Trigger a crashdump
Linux version 2.6.12-rc5-mm1-16M (root@llm15p.in.ibm.com) (gcc version 3.4.3 20041212 (Red Hat 3.4.3-9.EL4)) #16 Fri Jun 3 14:24:23 IST 2005
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000100 - 000000000009dc00 (usable)
BIOS-e820: 000000000009dc00 - 00000000000a0000 (reserved)
BIOS-e820: 0000000000100000 - 00000000c7fcb940 (usable)
BIOS-e820: 00000000c7fcb940 - 00000000c7fcf800 (ACPI data)
BIOS-e820: 00000000c7fcf800 - 00000000c8000000 (reserved)
BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
user-defined physical RAM map:
user: 0000000000000000 - 00000000000a0000 (usable)
user: 0000000001000000 - 0000000001486000 (usable)
user: 0000000001526400 - 0000000005000000 (usable)
0MB HIGHMEM available.
80MB LOWMEM available.
DMI 2.3 present.
Allocating PCI resources starting at 05000000 (gap: 05000000:fb000000)
Built 1 zonelists
Initializing CPU#0
Kernel command line: root=/dev/sda3 console=tty0 console=ttyS1,38400 rhgb memmap=exactmap memmap=640K@0K memmap=4632K@16384K memmap=60263K@21657K elfcorehdr=21656K
PID hash table entries: 512 (order: 9, 8192 bytes)
Detected 3600.765 MHz processor.
Using tsc for high-res timesource
Console: colour VGA+ 80x25
Unknown interrupt or fault at EIP 00000292 00000060 c140c6a3
Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
Memory: 59464k/81920k available (3030k kernel code, 5880k reserved, 1098k data, 196k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay using timer specific routine.. 7209.88 BogoMIPS (lpj=14419769)
Mount-cache hash table entries: 512
monitor/mwait feature present.
using mwait in idle threads.
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU0: Intel P4/Xeon Extended MCE MSRs (24) available
CPU: Intel(R) Xeon(TM) CPU 3.60GHz stepping 01
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
ACPI: setting ELCR to 0200 (from 0c20)
checking if image is initramfs... it is
Freeing initrd memory: 538k freed
softlockup thread 0 started up.
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: PCI BIOS revision 2.10 entry at 0xfd71e, last bus=9
PCI: Using MMCONFIG
mtrr: v2.0 (20020519)
ACPI: Subsystem revision 20050408
ACPI: Interpreter enabled
ACPI: Using PIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
ACPI: Assume root bridge [\_SB_.PCI0] segment is 0
PCI: Ignoring BAR0-3 of IDE controller 0000:00:1f.1
ACPI: PCI Interrupt Link [LP00] (IRQs *5)
ACPI: PCI Interrupt Link [LP01] (IRQs *11)
ACPI: PCI Interrupt Link [LP02] (IRQs *10)
ACPI: PCI Interrupt Link [LP03] (IRQs *11)
ACPI: PCI Interrupt Link [LP04] (IRQs *11)
ACPI: PCI Interrupt Link [LP05] (IRQs) *0, disabled.
ACPI: PCI Interrupt Link [LP06] (IRQs) *0, disabled.
ACPI: PCI Interrupt Link [LP07] (IRQs *5)
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
pnp: PnP ACPI: found 16 devices
SCSI subsystem initialized
usbcore: registered new driver usbfs
usbcore: registered new driver hub
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
pnp: 00:00: ioport range 0x510-0x517 could not be reserved
pnp: 00:00: ioport range 0x504-0x507 could not be reserved
pnp: 00:00: ioport range 0x500-0x503 could not be reserved
pnp: 00:00: ioport range 0x520-0x53f has been reserved
pnp: 00:00: ioport range 0x540-0x547 has been reserved
pnp: 00:00: ioport range 0x460-0x461 has been reserved
pnp: 00:0e: ioport range 0x400-0x43f has been reserved
Machine check exception polling timer started.
audit: initializing netlink socket (disabled)
audit(1117808719.712:1): initialized
inotify device minor=63
Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
ACPI: Power Button (FF) [PWRF]
ACPI: CPU0 (power states: C1[C1])
lp: driver loaded but no devices found
Linux agpgart interface v0.101 (c) Dave Jones
[drm] Initialized drm 1.0.0 20040925
cn_fork is registered
PNP: PS/2 controller has invalid data port 0x64; using default 0x60
PNP: PS/2 controller has invalid command port 0x60; using default 0x64
PNP: PS/2 Controller [PNP0303:PS2K,PNP0f13:PS2M] at 0x60,0x64 irq 1,12
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing disabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
parport: PnPBIOS parport detected.
parport0: PC-style at 0x378, irq 7 [PCSPP(,...)]
lp0: using parport0 (interrupt-driven).
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered
Floppy drive(s): fd0 is 1.44M
FDC 0 is a National Semiconductor PC87306
RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
tg3.c:v3.29 (May 23, 2005)
ACPI: PCI Interrupt Link [LP00] enabled at IRQ 5
PCI: setting IRQ 5 as level-triggered
ACPI: PCI Interrupt 0000:06:00.0[A] -> Link [LP00] -> GSI 5 (level, low) -> IRQ 5
eth0: Tigon3 [partno(BCM95721) rev 4101 PHY(5750)] (PCIX:100MHz:32-bit) 10/100/1000BaseT Ethernet 00:11:25:3f:6f:10
eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1] TSOcap[1]
eth0: dma_rwctrl[76180000]
ACPI: PCI Interrupt 0000:07:00.0[A] -> Link [LP00] -> GSI 5 (level, low) -> IRQ 5
eth1: Tigon3 [partno(BCM95721) rev 4101 PHY(5750)] (PCIX:100MHz:32-bit) 10/100/1000BaseT Ethernet 00:11:25:3f:6f:11
eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1] TSOcap[1]
eth1: dma_rwctrl[76180000]
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ICH5: IDE controller at PCI slot 0000:00:1f.1
ACPI: PCI Interrupt 0000:00:1f.1[A]: no GSI - using IRQ 0
ICH5: chipset revision 2
ICH5: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0x0480-0x0487, BIOS settings: hda:DMA, hdb:DMA
hda: BTC CD-ROM F523E, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide-disk: probe of 0.0 failed with error 1
hda: ATAPI 48X CD-ROM drive, 128kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
PCI: Enabling device 0000:03:07.0 (0156 -> 0157)
ACPI: PCI Interrupt Link [LP02] enabled at IRQ 10
PCI: setting IRQ 10 as level-triggered
ACPI: PCI Interrupt 0000:03:07.0[A] -> Link [LP02] -> GSI 10 (level, low) -> IRQ 10
PCI: Enabling device 0000:03:07.1 (0156 -> 0157)
ACPI: PCI Interrupt Link [LP03] enabled at IRQ 11
PCI: setting IRQ 11 as level-triggered
ACPI: PCI Interrupt 0000:03:07.1[B] -> Link [LP03] -> GSI 11 (level, low) -> IRQ 11
scsi0 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 1.3.11
<Adaptec AIC7902 Ultra320 SCSI adapter>
aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs
scsi1 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 1.3.11
<Adaptec AIC7902 Ultra320 SCSI adapter>
aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs
(scsi1:A:2): 320.000MB/s transfers (160.000MHz DT|IU|RTI, 16bit)
(scsi1:A:3): 320.000MB/s transfers (160.000MHz DT|IU|RTI, 16bit)
(scsi1:A:4): 320.000MB/s transfers (160.000MHz DT|IU|RTI, 16bit)
(scsi1:A:5): 320.000MB/s transfers (160.000MHz DT|IU|RTI, 16bit)
Vendor: IBM-ESXS Model: VPR073C3-ETS10FN Rev: S330
Type: Direct-Access ANSI SCSI revision: 04
scsi1:A:2:0: Tagged Queuing enabled. Depth 32
Vendor: IBM-ESXS Model: VPR073C3-ETS10FN Rev: S330
Type: Direct-Access ANSI SCSI revision: 04
scsi1:A:3:0: Tagged Queuing enabled. Depth 32
Vendor: IBM-ESXS Model: VPR073C3-ETS10FN Rev: S330
Type: Direct-Access ANSI SCSI revision: 04
scsi1:A:4:0: Tagged Queuing enabled. Depth 32
Vendor: IBM-ESXS Model: VPR073C3-ETS10FN Rev: S330
Type: Direct-Access ANSI SCSI revision: 04
scsi1:A:5:0: Tagged Queuing enabled. Depth 32
Vendor: IBM Model: 02R0962a S320 1 Rev: 1
Type: Processor ANSI SCSI revision: 02
SCSI device sda: 143374000 512-byte hdwr sectors (73407 MB)
SCSI device sda: drive cache: write through
SCSI device sda: 143374000 512-byte hdwr sectors (73407 MB)
SCSI device sda: drive cache: write through
sda: sda1 sda2 sda3 sda4 < sda5 >
Attached scsi disk sda at scsi1, channel 0, id 2, lun 0
SCSI device sdb: 143374000 512-byte hdwr sectors (73407 MB)
SCSI device sdb: drive cache: write through
SCSI device sdb: 143374000 512-byte hdwr sectors (73407 MB)
SCSI device sdb: drive cache: write through
sdb:
Attached scsi disk sdb at scsi1, channel 0, id 3, lun 0
SCSI device sdc: 143374000 512-byte hdwr sectors (73407 MB)
SCSI device sdc: drive cache: write through
SCSI device sdc: 143374000 512-byte hdwr sectors (73407 MB)
SCSI device sdc: drive cache: write through
sdc:
Attached scsi disk sdc at scsi1, channel 0, id 4, lun 0
SCSI device sdd: 143374000 512-byte hdwr sectors (73407 MB)
SCSI device sdd: drive cache: write through
SCSI device sdd: 143374000 512-byte hdwr sectors (73407 MB)
SCSI device sdd: drive cache: write through
sdd:
Attached scsi disk sdd at scsi1, channel 0, id 5, lun 0
Attached scsi generic sg0 at scsi1, channel 0, id 2, lun 0, type 0
Attached scsi generic sg1 at scsi1, channel 0, id 3, lun 0, type 0
Attached scsi generic sg2 at scsi1, channel 0, id 4, lun 0, type 0
Attached scsi generic sg3 at scsi1, channel 0, id 5, lun 0, type 0
Attached scsi generic sg4 at scsi1, channel 0, id 8, lun 0, type 3
ieee1394: raw1394: /dev/raw1394 device initialized
usbmon: debugs is not available
ACPI: PCI Interrupt Link [LP07] enabled at IRQ 5
ACPI: PCI Interrupt 0000:00:1d.7[D] -> Link [LP07] -> GSI 5 (level, low) -> IRQ 5
ehci_hcd 0000:00:1d.7: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB2 EHCI Controller
ehci_hcd 0000:00:1d.7: debug port 1
ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 1
ehci_hcd 0000:00:1d.7: irq 5, io mem 0xf0000000
ehci_hcd 0000:00:1d.7: USB 2.0 initialized, EHCI 1.00, driver 10 Dec 2004
hub 1-0:1.0: USB hub found
irq 11: nobody cared (try booting with the "irqpoll" option.
[<c103a3ef>] __report_bad_irq+0x2a/0x8d
[<c1039c28>] handle_IRQ_event+0x39/0x6d
[<c103a4eb>] note_interrupt+0x7f/0xe4
[<c1039dc5>] __do_IRQ+0x169/0x194
[<c1004cc6>] do_IRQ+0x1b/0x28
[<c1003256>] common_interrupt+0x1a/0x20
[<c101d61b>] __do_softirq+0x2b/0x88
[<c101d69e>] do_softirq+0x26/0x2a
[<c101d759>] irq_exit+0x33/0x35
[<c1004ccb>] do_IRQ+0x20/0x28
[<c1003256>] common_interrupt+0x1a/0x20
[<c1018916>] release_console_sem+0x43/0xf6
[<c101875f>] vprintk+0x16c/0x277
[<c1074dd8>] d_lookup+0x23/0x46
[<c10748ac>] d_alloc+0x136/0x1a6
[<c10185ef>] printk+0x17/0x1b
[<c120f33f>] hub_probe+0xa0/0x168
[<c1096dd1>] sysfs_new_dirent+0x1f/0x64
[<c120d37f>] usb_probe_interface+0x59/0x71
[<c116c9b2>] driver_probe_device+0x2f/0xa7
[<c116ca2a>] __device_attach+0x0/0x5
[<c116c259>] bus_for_each_drv+0x58/0x78
[<c116ca8f>] device_attach+0x60/0x64
[<c116ca2a>] __device_attach+0x0/0x5
[<c116c383>] bus_add_device+0x29/0xa6
[<c116f9c7>] device_pm_add+0x56/0x9c
[<c116b7c3>] device_add+0xc5/0x14a
[<c121557e>] usb_set_configuration+0x346/0x528
[<c120fa47>] usb_new_device+0xab/0x1be
[<c120007b>] ohci_iso_recv_set_channel_mask+0x3/0xc6
[<c12124e1>] register_root_hub+0xaf/0x162
[<c12133bf>] usb_add_hcd+0x172/0x396
[<c1217bce>] usb_hcd_pci_probe+0x26e/0x375
[<c10284e7>] __call_usermodehelper+0x0/0x61
[<c110cbe4>] pci_device_probe_static+0x40/0x54
[<c110cc27>] __pci_device_probe+0x2f/0x42
[<c110cc63>] pci_device_probe+0x29/0x47
[<c116c9b2>] driver_probe_device+0x2f/0xa7
[<c116ca93>] __driver_attach+0x0/0x43
[<c116cad4>] __driver_attach+0x41/0x43
[<c116c1c2>] bus_for_each_dev+0x5a/0x7a
[<c116cafc>] driver_attach+0x26/0x2a
[<c116ca93>] __driver_attach+0x0/0x43
[<c116c5cb>] bus_add_driver+0x6b/0xa5
[<c110ce97>] pci_register_driver+0x5e/0x7c
[<c14246f5>] init+0x1d/0x25
[<c140c7ed>] do_initcalls+0x54/0xb6
[<c100029c>] init+0x0/0x10c
[<c100029c>] init+0x0/0x10c
[<c10002c6>] init+0x2a/0x10c
[<c1001348>] kernel_thread_helper+0x0/0xb
[<c100134d>] kernel_thread_helper+0x5/0xb
handlers:
[<c11e3421>] (ahd_linux_isr+0x0/0x284)
Disabling IRQ #11
hub 1-0:1.0: 4 ports detected
USB Universal Host Controller Interface driver v2.3
ACPI: PCI Interrupt 0000:00:1d.0[A] -> Link [LP00] -> GSI 5 (level, low) -> IRQ 5
uhci_hcd 0000:00:1d.0: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #1
uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 2
uhci_hcd 0000:00:1d.0: irq 5, io base 0x00002200
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
ACPI: PCI Interrupt 0000:00:1d.1[B] -> Link [LP03] -> GSI 11 (level, low) -> IRQ 11
uhci_hcd 0000:00:1d.1: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #2
uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 3
uhci_hcd 0000:00:1d.1: irq 11, io base 0x00002600
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
usb 3-2: new full speed USB device using uhci_hcd and address 2
usbcore: registered new driver usblp
drivers/usb/class/usblp.c: v0.13: USB Printer Device Class driver
Initializing USB Mass Storage driver...
usbcore: registered new driver usb-storage
USB Mass Storage support registered.
input: USB HID v1.10 Keyboard [IBM PPC I/F] on usb-0000:00:1d.1-2
input: USB HID v1.10 Mouse [IBM PPC I/F] on usb-0000:00:1d.1-2
usbcore: registered new driver usbhid
drivers/usb/input/hid-core.c: v2.01:USB HID core driver
mice: PS/2 mouse device common for all mice
Advanced Linux Sound Architecture Driver Version 1.0.9rc3 (Thu Mar 24 10:33:39 2005 UTC).
ALSA device list:
No soundcards found.
oprofile: using timer interrupt.
NET: Registered protocol family 2
input: AT Translated Set 2 keyboard on isa0060/serio0
IP: routing cache hash table of 512 buckets, 4Kbytes
TCP established hash table entries: 4096 (order: 3, 32768 bytes)
TCP bind hash table entries: 4096 (order: 2, 16384 bytes)
TCP: Hash tables configured (established 4096 bind 4096)
ip_conntrack version 2.1 (640 buckets, 5120 max) - 212 bytes per conntrack
ip_tables: (C) 2000-2002 Netfilter core team
input: PS/2 Generic Mouse on isa0060/serio1
ipt_recent v0.3.2: Stephen Frost <sfrost@snowman.net>. http://snowman.net/projects/ipt_recent/
arp_tables: (C) 2002 David S. Miller
NET: Registered protocol family 1
NET: Registered protocol family 17
ACPI wakeup devices:
PCI0
ACPI: (supports S0 S4 S5)
Freeing unused kernel memory: 196k freed
Red Hat nash version 4.1.18 starting
Mounted /proc filesystem
Mounting sysfs
Creating /dev
Starting udev
Loading scsi_modscsi_mod: version magic '2.6.9-5.EL 686 REGPARM 4KSTACKS gcc-3.4' should be '2.6.12-rc5-mm1-16M preempt PENTIUM4 gcc-3.4'
.ko module
sd_mod: version magic '2.6.9-5.EL 686 REGPARM 4KSTACKS gcc-3.4' should be '2.6.12-rc5-mm1-16M preempt PENTIUM4 gcc-3.4'
insmod: error inaic79xx: version magic '2.6.9-5.EL 686 REGPARM 4KSTACKS gcc-3.4' should be '2.6.12-rc5-mm1-16M preempt PENTIUM4 gcc-3.4'
serting '/lib/scjbd: version magic '2.6.9-5.EL 686 REGPARM 4KSTACKS gcc-3.4' should be '2.6.12-rc5-mm1-16M preempt PENTIUM4 gcc-3.4'
si_mod.ko': -1 Iext3: version magic '2.6.9-5.EL 686 REGPARM 4KSTACKS gcc-3.4' should be '2.6.12-rc5-mm1-16M preempt PENTIUM4 gcc-3.4'
nvalid module format
ERROR: /bin/insmod exited abnormally!
Loading sd_mod.ko module
insmod: error inserting '/lib/sd_mod.ko': -1 Invalid module format
ERROR: /bin/insmod exited abnormally!
Loading aic79xx.ko module
insmod: error inserting '/lib/aic79xx.ko': -1 Invalid module format
ERROR: /bin/insmod exited abnormally!
Loading jbd.ko module
insmod: error inserting '/lib/jbd.ko': -1 Invalid module format
ERROR: /bin/insmod exited abnormally!
Loading ext3.ko module
insmod: error inserting '/lib/ext3.ko': -1 Invalid module format
ERROR: /bin/insmod exited abnormally!
Creating root device
Mounting root filesystem
scsi1:0:2:0: Attempting to abort cmd c4dbab00: 0x28 0x0 0x2 0x8f 0xf9 0x76 0x0 0x0 0x2 0x0
scsi1:0:2:0: Command already completed
scsi1:0:2:0: Attempting to abort cmd c4dbab00: 0x0 0x0 0x0 0x0 0x0 0x0
scsi1:0:2:0: Command already completed
Recovery code sleeping
Recovery code awake
Timer Expired
scsi1: Device reset returning 0x2003
Recovery SCB completes
scsi1:0:2:0: Attempting to abort cmd c4dbab00: 0x0 0x0 0x0 0x0 0x0 0x0
(scsi1:A:2:0): Task Management Func 0x0 Complete
Kernel panic - not syncing: Unexpected TaskMgmt Func
* Re: [RFC/PATCH] Kdump: Disabling PCI interrupts in capture kernel
2005-06-03 11:25 Vivek Goyal
@ 2005-06-03 11:54 ` Richard B. Johnson
2005-06-03 15:24 ` Alan Stern
2005-06-03 18:21 ` Greg KH
2 siblings, 0 replies; 17+ messages in thread
From: Richard B. Johnson @ 2005-06-03 11:54 UTC (permalink / raw)
To: Vivek Goyal
Cc: linux kernel mailing list, greg, Fastboot mailing list,
Morton Andrew Morton, Alan Stern, Eric W. Biederman
Yes. We need something like that since one must now 'enable' the
device to have its final IRQ routing established. This has created
dire consequences for existing drivers.
However, I have the following comments about the patch.
> +	if (!(pci_command & PCI_COMMAND_INTX_DISABLE)) {
         |
         |
         |___ This is never needed. Just read the register,
              do your thing, then write it back.
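A minimal sketch of that suggestion, written as a pure function on the command
word so it can be tried outside the kernel. intx_update() is a hypothetical
helper, not an existing kernel function; in the patch it would sit between the
pci_read_config_word() and pci_write_config_word() calls, with the write done
unconditionally.

```c
#include <stdbool.h>
#include <stdint.h>

#define CMD_INTX_DISABLE 0x400  /* bit 10 of the PCI command register */

/*
 * Return the command word with INTx generation enabled or disabled.
 * No test of the current state: the result is the same either way,
 * and the caller simply writes it back.
 */
static uint16_t intx_update(uint16_t command, bool enable)
{
	if (enable)
		command &= (uint16_t)~CMD_INTX_DISABLE;
	else
		command |= CMD_INTX_DISABLE;
	return command;
}
```

The trade-off is one redundant config-space write when the bit is already in
the desired state, in exchange for dropping the branch from both helpers.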
On Fri, 3 Jun 2005, Vivek Goyal wrote:
> Hi,
>
> In kdump, sometimes, general driver initialization issues seems to be cropping
> in second kernel due to devices not being shutdown during crash and these
> devices are sending interrupts while second kernel is booting and drivers are
> not expecting any interrupts yet.
>
> In some cases, we are observing a storm of interrupts while second kernel
> is booting and kernel disables that irq line. May be the case of stuck irq
> line because it is shared level triggered irq and there is no driver
> loaded for the device.
>
> So, we need something generic which disables interrupt generation from device
> until a driver has been registered for that device and driver is ready to
> receive the interrupts. PCI specifications (ver 2.3 onwards), have introduced
> interrupt disable bit in command register to disable interrupt generation
> from the device. This can become handy here. In capture kernel, traverse all
> the PCI devices, disable interrupt generation. Enable the interrupt generation
> back once the driver for that device registers. May be after the probe handler
> has run. In probe handler, driver can reset the device or register for irq so
> that it can handle any interrupt from the device after that.
>
> Greg mentioned that there are reasons that we can not disable all pci
> interrupts. Meanwhile I am going through archives to find more about it.
>
> In previous conversations, Alan Stern had raised the issue of console also
> not working if interrupts are disabled on all the devices. I am not sure
> but this should be working at least for serial consoles and vga text consoles.
> May be sufficient to capture the dump.
>
> Attached is a hack patch which is by no means complete. I have got one machine
> which has got some PCI 2.3 compliant hardware. After applying this hack this
> problem does not occur at least on this machine. Attached is the serial console
> log which shows one kind of problem due to unwanted interrupts.
>
> Bugme 4631 and 4573 are two more instances of driver initialization failure
> in second kernel.
>
>
>
>
>
>
> ---
> o In kdump, devices are not shutdown/reset after a crash and this leads to
> various driver initialization failures while second kernel is booting.
> Most of them seem to be happening due to the fact that system/driver
> is receiving the interrupts from device when it is not prepared to do so.
>
> o This patch tries to solve the problem by disabling the interrupts at
> PCI level for all the devices. These interrupts are enabled back once the
> driver for that device registers. Currently this enabling and disabling
> is done only in dump capture kernel.
>
> o This is for devices compliant to PCI specification v2.3 or higher.
>
>
>
> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
> ---
>
> linux-2.6.12-rc5-mm1-16M-root/drivers/pci/pci-driver.c | 3 +
> linux-2.6.12-rc5-mm1-16M-root/drivers/pci/pci.c | 4 ++
> linux-2.6.12-rc5-mm1-16M-root/include/linux/pci.h | 33 +++++++++++++++++
> 3 files changed, 39 insertions(+), 1 deletion(-)
>
> diff -puN drivers/pci/pci.c~kdump-pci-interrupt-disable drivers/pci/pci.c
> --- linux-2.6.12-rc5-mm1-16M/drivers/pci/pci.c~kdump-pci-interrupt-disable 2005-06-03 14:35:26.000000000 +0530
> +++ linux-2.6.12-rc5-mm1-16M-root/drivers/pci/pci.c 2005-06-03 15:59:19.000000000 +0530
> @@ -823,6 +823,10 @@ static int __devinit pci_init(void)
>
> while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) {
> pci_fixup_device(pci_fixup_final, dev);
> +
> + /* A hack to disable interrupts from all PCI devices in
> + * capture kernel. */
> + pci_disable_device_intx(dev);
> }
> return 0;
> }
> diff -puN drivers/pci/pci-driver.c~kdump-pci-interrupt-disable drivers/pci/pci-driver.c
> --- linux-2.6.12-rc5-mm1-16M/drivers/pci/pci-driver.c~kdump-pci-interrupt-disable 2005-06-03 14:35:26.000000000 +0530
> +++ linux-2.6.12-rc5-mm1-16M-root/drivers/pci/pci-driver.c 2005-06-03 14:35:26.000000000 +0530
> @@ -248,7 +248,8 @@ static int pci_device_probe(struct devic
> error = __pci_device_probe(drv, pci_dev);
> if (error)
> pci_dev_put(pci_dev);
> -
> + else
> + pci_enable_device_intx(pci_dev);
> return error;
> }
>
> diff -puN include/linux/pci.h~kdump-pci-interrupt-disable include/linux/pci.h
> --- linux-2.6.12-rc5-mm1-16M/include/linux/pci.h~kdump-pci-interrupt-disable 2005-06-03 14:35:26.000000000 +0530
> +++ linux-2.6.12-rc5-mm1-16M-root/include/linux/pci.h 2005-06-03 16:06:32.000000000 +0530
> @@ -901,6 +901,39 @@ extern void pci_disable_msix(struct pci_
> extern void msi_remove_pci_irq_vectors(struct pci_dev *dev);
> #endif
>
> +#ifdef CONFIG_CRASH_DUMP
> +static inline int pci_enable_device_intx(struct pci_dev *dev)
> +{
> + u16 pci_command;
> +
> + /* Enable Interrupt generation if not already enabled */
> + pci_read_config_word(dev, PCI_COMMAND, &pci_command);
> + if (pci_command & PCI_COMMAND_INTX_DISABLE) {
> + pci_command &= ~PCI_COMMAND_INTX_DISABLE;
> + pci_write_config_word(dev, PCI_COMMAND, pci_command);
> + }
> + return 0;
> +}
> +
> +static inline int pci_disable_device_intx(struct pci_dev *dev)
> +{
> + u16 pci_command;
> +
> + /* Disable Interrupt generation if not already disabled */
> + pci_read_config_word(dev, PCI_COMMAND, &pci_command);
> + if (!(pci_command & PCI_COMMAND_INTX_DISABLE)) {
> + pci_command |= PCI_COMMAND_INTX_DISABLE;
> + pci_write_config_word(dev, PCI_COMMAND, pci_command);
> + }
> + return 0;
> +}
> +#else
> +static inline int pci_enable_device_intx(struct pci_dev *dev)
> +{ return 0; }
> +static inline int pci_disable_device_intx(struct pci_dev *dev)
> +{ return 0; }
> +#endif /* CONFIG_CRASH_DUMP */
> +
> #endif /* CONFIG_PCI */
>
> /* Include architecture-dependent settings and functions */
> _
>
Cheers,
Dick Johnson
Penguin : Linux version 2.6.11.9 on an i686 machine (5537.79 BogoMips).
Notice : All mail here is now cached for review by Dictator Bush.
98.36% of all statistics are fiction.
* Re: [RFC/PATCH] Kdump: Disabling PCI interrupts in capture kernel
2005-06-03 11:25 Vivek Goyal
2005-06-03 11:54 ` Richard B. Johnson
@ 2005-06-03 15:24 ` Alan Stern
2005-06-03 18:26 ` Eric W. Biederman
2005-06-03 18:21 ` Greg KH
2 siblings, 1 reply; 17+ messages in thread
From: Alan Stern @ 2005-06-03 15:24 UTC (permalink / raw)
To: Vivek Goyal
Cc: linux kernel mailing list, greg, Fastboot mailing list,
Morton Andrew Morton, Eric W. Biederman
On Fri, 3 Jun 2005, Vivek Goyal wrote:
> In previous conversations, Alan Stern had raised the issue of console also
> not working if interrupts are disabled on all the devices. I am not sure
> but this should be working at least for serial consoles and vga text consoles.
> May be sufficient to capture the dump.
This isn't an issue for x86. It affects other architectures, in which the
system console is managed during the early stages of booting by the
platform firmware. I suppose serial consoles would always work.
Alan Stern
* Re: [RFC/PATCH] Kdump: Disabling PCI interrupts in capture kernel
2005-06-03 11:25 Vivek Goyal
2005-06-03 11:54 ` Richard B. Johnson
2005-06-03 15:24 ` Alan Stern
@ 2005-06-03 18:21 ` Greg KH
2005-06-03 18:36 ` Eric W. Biederman
2 siblings, 1 reply; 17+ messages in thread
From: Greg KH @ 2005-06-03 18:21 UTC (permalink / raw)
To: Vivek Goyal
Cc: linux kernel mailing list, Fastboot mailing list,
Morton Andrew Morton, Alan Stern, Eric W. Biederman
On Fri, Jun 03, 2005 at 04:55:24PM +0530, Vivek Goyal wrote:
> Hi,
>
> In kdump, sometimes, general driver initialization issues seems to be cropping
> in second kernel due to devices not being shutdown during crash and these
> devices are sending interrupts while second kernel is booting and drivers are
> not expecting any interrupts yet.
What are the errors you are seeing?
How would the drivers get interrupts delivered to them
if they haven't registered an irq handler yet?
> In some cases, we are observing a storm of interrupts while second kernel
> is booting and kernel disables that irq line. May be the case of stuck irq
> line because it is shared level triggered irq and there is no driver
> loaded for the device.
>
> So, we need something generic which disables interrupt generation from device
> until a driver has been registered for that device and driver is ready to
> receive the interrupts. PCI specifications (ver 2.3 onwards), have introduced
> interrupt disable bit in command register to disable interrupt generation
> from the device. This can become handy here. In capture kernel, traverse all
> the PCI devices, disable interrupt generation. Enable the interrupt generation
> back once the driver for that device registers. May be after the probe handler
> has run. In probe handler, driver can reset the device or register for irq so
> that it can handle any interrupt from the device after that.
>
> Greg mentioned that there are reasons that we can not disable all pci
> interrupts. Meanwhile I am going through archives to find more about it.
You recalled the conversation in the next paragraph:
> In previous conversations, Alan Stern had raised the issue of console also
> not working if interrupts are disabled on all the devices. I am not sure
> but this should be working at least for serial consoles and vga text consoles.
> May be sufficient to capture the dump.
That's the main objection a lot of people had to this kind of patch,
last time it was discussed.
thanks,
greg k-h
* Re: [RFC/PATCH] Kdump: Disabling PCI interrupts in capture kernel
2005-06-03 15:24 ` Alan Stern
@ 2005-06-03 18:26 ` Eric W. Biederman
0 siblings, 0 replies; 17+ messages in thread
From: Eric W. Biederman @ 2005-06-03 18:26 UTC (permalink / raw)
To: Alan Stern
Cc: Vivek Goyal, linux kernel mailing list, greg,
Fastboot mailing list, Morton Andrew Morton
Alan Stern <stern@rowland.harvard.edu> writes:
> On Fri, 3 Jun 2005, Vivek Goyal wrote:
>
> > In previous conversations, Alan Stern had raised the issue of console also
> > not working if interrupts are disabled on all the devices. I am not sure
> > but this should be working at least for serial consoles and vga text consoles.
> > May be sufficient to capture the dump.
>
> This isn't an issue for x86. It affects other architectures, in which the
> system console is managed during the early stages of booting by the
> platform firmware. I suppose serial consoles would always work.
In the plain kexec case that should be doable. I don't think
I have heard of a kdump case where we can work with the platform
firmware.
Eric
* Re: [RFC/PATCH] Kdump: Disabling PCI interrupts in capture kernel
2005-06-03 18:21 ` Greg KH
@ 2005-06-03 18:36 ` Eric W. Biederman
2005-06-04 13:18 ` Denis Vlasenko
0 siblings, 1 reply; 17+ messages in thread
From: Eric W. Biederman @ 2005-06-03 18:36 UTC (permalink / raw)
To: Greg KH
Cc: Vivek Goyal, linux kernel mailing list, Fastboot mailing list,
Morton Andrew Morton, Alan Stern
Greg KH <greg@kroah.com> writes:
> On Fri, Jun 03, 2005 at 04:55:24PM +0530, Vivek Goyal wrote:
> > Hi,
> >
> > In kdump, sometimes, general driver initialization issues seems to be cropping
> > in second kernel due to devices not being shutdown during crash and these
> > devices are sending interrupts while second kernel is booting and drivers are
> > not expecting any interrupts yet.
>
> What are the errors you are seeing?
> How would the drivers be able to be getting interrupts delivered to them
> if they haven't registered the irq handler yet?
As I recall the drivers were not getting the interrupts but the interrupts
were happening. To stop being spammed the kernel disables the irq line
at the interrupt controller. Then, when the driver later registered its
handler, it would never receive an interrupt.
Eric
* Re: [RFC/PATCH] Kdump: Disabling PCI interrupts in capture kernel
@ 2005-06-04 10:43 Vivek Goyal
0 siblings, 0 replies; 17+ messages in thread
From: Vivek Goyal @ 2005-06-04 10:43 UTC (permalink / raw)
To: Greg KH
Cc: linux kernel mailing list, Fastboot mailing list,
Morton Andrew Morton, Alan Stern, Eric W. Biederman
Quoting Greg KH <greg@kroah.com>:
> On Fri, Jun 03, 2005 at 04:55:24PM +0530, Vivek Goyal wrote:
> > Hi,
> >
> > In kdump, sometimes, general driver initialization issues seems to be cropping
> > in second kernel due to devices not being shutdown during crash and these
> > devices are sending interrupts while second kernel is booting and drivers are
> > not expecting any interrupts yet.
>
> What are the errors you are seeing?
We have observed mainly two kinds of problems.
1. Devices A and B are sharing the same irq line. Driver for device A is
initializing and requests the respective irq line (request_irq()).
Device B raises an interrupt. Interrupt handler for device A reports that it
is not its irq, and there is no interrupt handler for device B yet. Device B
keeps the irq line asserted, the kernel sees a flood of interrupts and
finally disables the irq line. Now device A's interrupts also stop
coming and driver initialization for device A fails.
We have observed this especially in case of aic7xxx driver and one sample
log is attached with the mail.
2. Second problem is that some drivers do not expect an interrupt the moment
they call request_irq(); if one arrives, initialization fails. We saw this
problem in an IPS driver initialization failure (Bugme 4573). Though this seems
to be more of a driver bug and can be fixed separately.
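To make the first failure mode concrete, here is a rough user-space model of
it. It is only a sketch under stated assumptions: every name in it (sim_device,
sim_irq_line, sim_deliver and so on) is made up for illustration, and
unhandled_limit is an arbitrary parameter, not the threshold the kernel
actually uses. Device B has no handler and keeps a level-triggered line
asserted, so every delivery goes unclaimed until the line is given up on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_MAX_HANDLERS 4

struct sim_device {
    bool asserting;     /* device holds the level-triggered line asserted */
    unsigned serviced;  /* times its own handler serviced it */
};

typedef bool (*sim_handler_fn)(struct sim_device *dev);

struct sim_irq_line {
    sim_handler_fn handler[SIM_MAX_HANDLERS];
    struct sim_device *dev[SIM_MAX_HANDLERS];
    size_t nr_handlers;
    unsigned unhandled;        /* deliveries that no handler claimed */
    unsigned unhandled_limit;  /* illustrative threshold (assumption) */
    bool disabled;             /* line masked at the interrupt controller */
};

/* A well-behaved handler: claim the interrupt only if our device raised it. */
static bool sim_generic_handler(struct sim_device *dev)
{
    if (!dev->asserting)
        return false;          /* "not my irq" */
    dev->asserting = false;    /* servicing the device deasserts the line */
    dev->serviced++;
    return true;
}

/* Hypothetical stand-in for request_irq(): attach a handler to the line. */
static void sim_request_irq(struct sim_irq_line *line, sim_handler_fn fn,
                            struct sim_device *dev)
{
    assert(line->nr_handlers < SIM_MAX_HANDLERS);
    line->handler[line->nr_handlers] = fn;
    line->dev[line->nr_handlers] = dev;
    line->nr_handlers++;
}

/* Deliver one interrupt; returns true if some handler claimed it. */
static bool sim_deliver(struct sim_irq_line *line)
{
    bool handled = false;

    if (line->disabled)
        return false;
    for (size_t i = 0; i < line->nr_handlers; i++)
        handled |= line->handler[i](line->dev[i]);
    if (!handled && ++line->unhandled >= line->unhandled_limit)
        line->disabled = true; /* too many unclaimed interrupts: mask the line */
    return handled;
}

/* Keep delivering while any device on the line asserts it. */
static void sim_run(struct sim_irq_line *line, struct sim_device **all, size_t n)
{
    for (;;) {
        bool asserted = false;

        for (size_t i = 0; i < n; i++)
            asserted |= all[i]->asserting;
        if (!asserted || line->disabled)
            return;
        sim_deliver(line);
    }
}
```

With only device A registered and device B asserting, sim_run() ends with the
line disabled, and a later interrupt from A is never delivered. If B is kept
quiet (which is what the interrupt disable bit is meant to achieve), the line
stays enabled and A is serviced normally.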
> How would the drivers be able to be getting interrupts delivered to them
> if they haven't registered the irq handler yet?
True that drivers will not get interrupts until they call request_irq().
In the above mentioned aic7xxx driver failure, the driver starts getting
interrupts the moment it calls request_irq(). And the fact is that these
interrupts are being generated by some other device sharing the same irq line.
>
> > In some cases, we are observing a storm of interrupts while second kernel
> > is booting and kernel disables that irq line. May be the case of stuck irq
> > line because it is shared level triggered irq and there is no driver
> > loaded for the device.
> >
> > So, we need something generic which disables interrupt generation from device
> > until a driver has been registered for that device and driver is ready to
> > receive the interrupts. PCI specifications (ver 2.3 onwards), have introduced
> > interrupt disable bit in command register to disable interrupt generation
> > from the device. This can become handy here. In capture kernel, traverse all
> > the PCI devices, disable interrupt generation. Enable the interrupt generation
> > back once the driver for that device registers. May be after the probe handler
> > has run. In probe handler, driver can reset the device or register for irq so
> > that it can handle any interrupt from the device after that.
> >
> > Greg mentioned that there are reasons that we can not disable all pci
> > interrupts. Meanwhile I am going through archives to find more about it.
>
> You recalled the conversation in the next paragraph:
>
> > In previous conversations, Alan Stern had raised the issue of console also
> > not working if interrupts are disabled on all the devices. I am not sure
> > but this should be working at least for serial consoles and vga text consoles.
> > May be sufficient to capture the dump.
>
> That's the main objection a lot of people had to this kind of patch,
> last time it was discussed.
It can very well be a big concern when it comes to booting a normal kernel but
I believe that for the capture kernel (kernel booting after a crash), it might
not be that big an issue. Anyway, the capture kernel is booting only to capture
and save the dump to some pre-configured storage media. Most probably that
will be done from an initrd or an init script, and the system will then boot
back into the production kernel, enabling all the consoles.
Alternatively, the pci device id of the console can be passed to the capture
kernel through the command line and disabling interrupts can be skipped on the
console. This is a little hackish though, but crash dump is a peculiar case.
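A possible shape for that command line exception, sketched in user-space C.
Nothing here is an existing kernel interface: the "bus:slot.func" hex format
for the parameter value and the names parse_pci_addr() and
should_disable_intx() are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical representation of a PCI address given on the command line. */
struct pci_addr {
    unsigned bus, slot, func;
};

/* Parse "bb:ss.f" (hex digits). Returns true only for a well-formed value. */
static bool parse_pci_addr(const char *s, struct pci_addr *out)
{
    unsigned bus, slot, func;
    char trailing;

    assert(s != NULL && out != NULL);
    /* Exactly three conversions must succeed; a fourth means trailing junk. */
    if (sscanf(s, "%x:%x.%x%c", &bus, &slot, &func, &trailing) != 3)
        return false;
    if (bus > 0xff || slot > 0x1f || func > 0x7)
        return false;
    out->bus = bus;
    out->slot = slot;
    out->func = func;
    return true;
}

/*
 * Decide whether interrupt generation should be disabled on a device.
 * console may be NULL when no console device was named on the command line.
 */
static bool should_disable_intx(const struct pci_addr *dev,
                                const struct pci_addr *console)
{
    if (console == NULL)
        return true;
    return !(dev->bus == console->bus &&
             dev->slot == console->slot &&
             dev->func == console->func);
}
```

The capture kernel would parse the value once at boot and consult
should_disable_intx() for each device while walking the PCI bus; a malformed
value is rejected, which leaves every device treated the same way.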
Thanks
Vivek
* Re: [RFC/PATCH] Kdump: Disabling PCI interrupts in capture kernel
@ 2005-06-04 10:57 Vivek Goyal
2005-06-04 15:35 ` Alan Stern
0 siblings, 1 reply; 17+ messages in thread
From: Vivek Goyal @ 2005-06-04 10:57 UTC (permalink / raw)
To: Alan Stern
Cc: linux kernel mailing list, greg, Fastboot mailing list,
Morton Andrew Morton, Eric W. Biederman
Quoting Alan Stern <stern@rowland.harvard.edu>:
> On Fri, 3 Jun 2005, Vivek Goyal wrote:
>
> > In previous conversations, Alan Stern had raised the issue of console also
> > not working if interrupts are disabled on all the devices. I am not sure
> > but this should be working at least for serial consoles and vga text consoles.
> > May be sufficient to capture the dump.
>
> This isn't an issue for x86. It affects other architectures, in which the
> system console is managed during the early stages of booting by the
> platform firmware. I suppose serial consoles would always work.
>
Hi Alan, I know very little about consoles and their working.
I had a question. Even if console is being managed by platform firmware, in
initial stages of booting, does it require interrupts to be enabled at the
VGA controller (at least for the simple text mode)? I was quickly browsing
through drivers/video/console/vgacon.c and it did not look like this
console driver needed interrupts to be enabled at the controller.
Anyway, looks like serial consoles will always work. So at least this can be
done for kdump case (CONFIG_CRASH_DUMP) and not generic kernel. Or, as I
mentioned in previous mail, while pre-loading capture kernel, pass a command
line parameter containing pci dev id of console so that capture kernel does not
disable interrupts on this console.
Thanks
Vivek
* Re: [RFC/PATCH] Kdump: Disabling PCI interrupts in capture kernel
[not found] <4bExX-3uT-11@gated-at.bofh.it>
@ 2005-06-04 12:38 ` Bodo Eggert
0 siblings, 0 replies; 17+ messages in thread
From: Bodo Eggert @ 2005-06-04 12:38 UTC (permalink / raw)
To: Vivek Goyal, Alan Stern, linux kernel mailing list, greg,
Fastboot mailing list, Morton Andrew Morton, Eric W. Biederman
Vivek Goyal <vgoyal@in.ibm.com> wrote:
> Hi Alan, I know very little about consoles and their working.
> I had a question. Even if console is being managed by platform firmware, in
> initial states of booting, does it require interrupts to be enabled at
> VGA contorller (at least for the simple text mode).
VGA does not use interrupts for normal operation, even in graphics mode.
It can generate them for synchronisation.
--
I thank GMX for sabotaging the use of my addresses by means of lies
spread via SPF.
* Re: [RFC/PATCH] Kdump: Disabling PCI interrupts in capture kernel
2005-06-03 18:36 ` Eric W. Biederman
@ 2005-06-04 13:18 ` Denis Vlasenko
0 siblings, 0 replies; 17+ messages in thread
From: Denis Vlasenko @ 2005-06-04 13:18 UTC (permalink / raw)
To: Eric W. Biederman, Greg KH
Cc: Vivek Goyal, linux kernel mailing list, Fastboot mailing list,
Andrew Morton, Alan Stern
On Friday 03 June 2005 21:36, Eric W. Biederman wrote:
> Greg KH <greg@kroah.com> writes:
>
> > On Fri, Jun 03, 2005 at 04:55:24PM +0530, Vivek Goyal wrote:
> > > Hi,
> > >
> > > In kdump, sometimes, general driver initialization issues seems to be cropping
> > > in second kernel due to devices not being shutdown during crash and these
> > > devices are sending interrupts while second kernel is booting and drivers are
> > > not expecting any interrupts yet.
> >
> > What are the errors you are seeing?
> > How would the drivers be able to be getting interrupts delivered to them
> > if they haven't registered the irq handler yet?
>
> As I recall the drivers were not getting the interrupts but the interrupts
> were happening. To stop being spammed the kernel disables the irq line,
> at the interrupt controller. Then when the driver registered the
> interrupt it would never receive the interrupt.
Shouldn't the kernel keep all interrupt lines initially disabled
(sans platform-specific magic), and enable each line only when
a device driver requests the IRQ? This sounds simpler to do...
--
vda
* Re: [RFC/PATCH] Kdump: Disabling PCI interrupts in capture kernel
2005-06-04 10:57 Vivek Goyal
@ 2005-06-04 15:35 ` Alan Stern
2005-06-04 18:26 ` Grant Grundler
0 siblings, 1 reply; 17+ messages in thread
From: Alan Stern @ 2005-06-04 15:35 UTC (permalink / raw)
To: Vivek Goyal
Cc: linux kernel mailing list, Greg KH, Fastboot mailing list,
Morton Andrew Morton, Eric W. Biederman, Bodo Eggert,
Dipankar Sarma, Grant Grundler
On Sat, 4 Jun 2005, Vivek Goyal wrote:
> Hi Alan, I know very little about consoles and their working.
> I had a question. Even if console is being managed by platform firmware, in
> initial states of booting, does it require interrupts to be enabled at
> VGA contorller (at least for the simple text mode). I was quickly browsing
> through drivers/video/console/vgacon.c and did not look like that this
> console driver needed interrupts to be enabled at the controller.
This isn't an issue for VGA, as far as I know. It applies to
architectures like PPC-64 and perhaps Alpha or PA-Risc. And I don't know
the details; ask Grant Grundler.
> Anyway, looks like serial consoles will always work. So at least this can be
> done for kdump case (CONFIG_CRASH_DUMP) and not generic kernel. Or, as I
> mentioned in previous mail, while pre-loading capture kernel, pass a command
> line parameter containing pci dev id of console and capture kernel does not
> disable interrupts on this console.
I suspect you're right that implementing this only in kdump kernels will
work okay.
For people interested in reading some old threads on the subject, here
are some pointers:
http://marc.theaimsgroup.com/?l=linux-usb-devel&m=111055702309788&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=98383052711171&w=2
Alan Stern
* Re: [RFC/PATCH] Kdump: Disabling PCI interrupts in capture kernel
2005-06-04 15:35 ` Alan Stern
@ 2005-06-04 18:26 ` Grant Grundler
0 siblings, 0 replies; 17+ messages in thread
From: Grant Grundler @ 2005-06-04 18:26 UTC (permalink / raw)
To: Alan Stern
Cc: Vivek Goyal, linux kernel mailing list, Greg KH,
Fastboot mailing list, Morton Andrew Morton, Eric W. Biederman,
Bodo Eggert, Dipankar Sarma, Grant Grundler, awilliam,
bjorn.helgaas
On Sat, Jun 04, 2005 at 11:35:59AM -0400, Alan Stern wrote:
> On Sat, 4 Jun 2005, Vivek Goyal wrote:
>
> > Hi Alan, I know very little about consoles and their working.
> > I had a question. Even if console is being managed by platform firmware, in
> > initial states of booting, does it require interrupts to be enabled at
> > VGA contorller (at least for the simple text mode). I was quickly browsing
> > through drivers/video/console/vgacon.c and did not look like that this
> > console driver needed interrupts to be enabled at the controller.
>
> This isn't an issue for VGA, as far as I know. It applies to
> architectures like PPC-64 and perhaps Alpha or PA-Risc. And I don't know
> the details; ask Grant Grundler.
I'm more familiar with the serial consoles and how PDC interacts with them.
From HP, both Alex Williamson and Bjorn Helgaas know more about
VGA support. I've cc'd both.
> > Anyway, looks like serial consoles will always work. So at least this can be
> > done for kdump case (CONFIG_CRASH_DUMP) and not generic kernel. Or, as I
> > mentioned in previous mail, while pre-loading capture kernel, pass a command
> > line parameter containing pci dev id of console and capture kernel does not
> > disable interrupts on this console.
parisc serial consoles don't need interrupts enabled. The serial device
does need its MMIO and/or IO Port range enabled (I forget which).
ISTR most serial consoles don't do DMA and thus don't need BusMaster
enabled in the PCI command register either.
> I suspect you're right that implementing this only in kdump kernels will
> work okay.
>
> For people interesting in reading some old threads on the subject, here
> are some pointers:
>
> http://marc.theaimsgroup.com/?l=linux-usb-devel&m=111055702309788&w=2
>
> http://marc.theaimsgroup.com/?l=linux-kernel&m=98383052711171&w=2
wow...from 2001.
That's when we first released a500 support with Debian 3.0.
thanks,
grant
* Re: [RFC/PATCH] Kdump: Disabling PCI interrupts in capture kernel
@ 2005-06-07 3:07 Vivek Goyal
2005-06-07 5:07 ` Grant Grundler
0 siblings, 1 reply; 17+ messages in thread
From: Vivek Goyal @ 2005-06-07 3:07 UTC (permalink / raw)
To: greg
Cc: linux kernel mailing list, Fastboot mailing list,
Morton Andrew Morton, Eric W. Biederman, Bodo Eggert,
Dipankar Sarma, Grant Grundler, stern, awilliam, bjorn.helgaas
Quoting Alan Stern <stern@rowland.harvard.edu>:
> On Sat, 4 Jun 2005, Vivek Goyal wrote:
>
> > Hi Alan, I know very little about consoles and their working.
> > I had a question. Even if console is being managed by platform firmware, in
> > initial states of booting, does it require interrupts to be enabled at
> > VGA contorller (at least for the simple text mode). I was quickly browsing
> > through drivers/video/console/vgacon.c and did not look like that this
> > console driver needed interrupts to be enabled at the controller.
>
> This isn't an issue for VGA, as far as I know. It applies to
> architectures like PPC-64 and perhaps Alpha or PA-Risc. And I don't know
> the details; ask Grant Grundler.
>
> > Anyway, looks like serial consoles will always work. So at least this can be
> > done for kdump case (CONFIG_CRASH_DUMP) and not generic kernel. Or, as I
> > mentioned in previous mail, while pre-loading capture kernel, pass a command
> > line parameter containing pci dev id of console and capture kernel does not
> > disable interrupts on this console.
>
> I suspect you're right that implementing this only in kdump kernels will
> work okay.
>
> For people interesting in reading some old threads on the subject, here
> are some pointers:
>
> http://marc.theaimsgroup.com/?l=linux-usb-devel&m=111055702309788&w=2
>
> http://marc.theaimsgroup.com/?l=linux-kernel&m=98383052711171&w=2
I browsed through the discussion threads quickly. The previous proposal included
disabling DMA from the devices as well. Currently, for kdump, we are not
looking at disabling DMA from the devices. So far we have not run into
any problems due to ongoing DMA (need to look into the IOMMU reprogramming aspect
though). Following is a snippet from one of the discussion threads.
|List: linux-usb-devel
|Subject: [linux-usb-devel] Re: PCI device initialization and USB early-handoff
|From: Grant Grundler <grundler () parisc-linux ! org>
|Date: 2005-03-11 18:12:57
|Message-ID: <20050311181257.GB15070 () colo ! lackof ! org>
|[Download message RAW]
.....
|> > Is it feasible to have the PCI device initialization sequence disable DMA
|> > and IRQs from the device? This could solve the problems we've been seeing
|> > with non-quiescent devices sharing an IRQ line at startup.
|two potential issues here:
|o ISTR VGA devices may not like disabling Bus Master bit in the command reg.
| But I'm blissfully ignorant of all the issues around VGA and someone
| else will have to comment on that.
|
|o platform devices (e.g. bridges) that don't have PCI drivers to re-enable
| them later. "transperent" Bridges are the only example I can come up with
| now but expect more to come out of the woodwork as this gets widely
| tested. Trolling through PCI quirks might flag some of the known ones.
| I would expect a few more to show up with this change.
|hth,
|grant
o Bus Master bit is not being disabled, only interrupt generation will be
disabled, and it looks like at least VGA and serial consoles are not impacted
by disabling of interrupts. Are there any other consoles which require
interrupts to be enabled for their working?
o As per the PCI-to-PCI bridge architecture specification revision 1.2,
interrupt disable bit in PCI-PCI bridge is optional and if implemented,
it will disable interrupt generation from bridge but will have no effect
on interrupts that the bridge forwards from the PCI devices on the
secondary bus.
So even if interrupts are disabled on PCI-PCI bridge, interrupts generated
by PCI devices on secondary bus are not blocked and I hope devices should
be working fine.
The whole idea is that currently this change is kdump specific. Of course there
will be issues which are not known yet, and more devices might not
work for kdump kernels. But at the same time kdump kernels are not supposed to
do a great deal except capture and save the dump. So this change might not
be of big concern even if some devices don't work, as long as the kdump kernel
can boot.
Disabling interrupts at PCI level should increase the reliability of capturing
the dump on newer machines with hardware compliant with PCI 2.3 or higher.
Booting a kdump kernel with reduced functionality should always be better than
not booting at all.
Thanks
Vivek
* Re: [RFC/PATCH] Kdump: Disabling PCI interrupts in capture kernel
2005-06-07 3:07 Vivek Goyal
@ 2005-06-07 5:07 ` Grant Grundler
2005-06-07 9:59 ` Eric W. Biederman
0 siblings, 1 reply; 17+ messages in thread
From: Grant Grundler @ 2005-06-07 5:07 UTC (permalink / raw)
To: Vivek Goyal
Cc: greg, linux kernel mailing list, Fastboot mailing list,
Morton Andrew Morton, Eric W. Biederman, Bodo Eggert,
Dipankar Sarma, Grant Grundler, stern, awilliam, bjorn.helgaas
On Mon, Jun 06, 2005 at 11:07:17PM -0400, Vivek Goyal wrote:
> So even if interrupts are disabled on PCI-PCI bridge, interrupts generated
> by PCI devices on secondary bus are not blocked and I hope device should
> be working fine.
How did you plan on disabling interrupts?
Did you see the MSI discussion that is going on now on the linux-pci mailing list?
> But at the same time kdump kernels are not supposed to
> do a great deal except capture and save the dump.
I'd think you want to stop DMA for all devices.
Just to prevent them from messing more with memory
that you want to dump - ie get a consistent snapshot.
Leaving VGA devices alone should be safe.
> Disabling interrupts at PCI level should increase the reliability of capturing
> the dump on newer machines with hardware compliant with PCI 2.3 or higher.
*lots* of PCI devices predate PCI2.3. Possibly even the majority.
hth,
grant
* Re: [RFC/PATCH] Kdump: Disabling PCI interrupts in capture kernel
2005-06-07 5:07 ` Grant Grundler
@ 2005-06-07 9:59 ` Eric W. Biederman
2005-06-07 16:21 ` Grant Grundler
0 siblings, 1 reply; 17+ messages in thread
From: Eric W. Biederman @ 2005-06-07 9:59 UTC (permalink / raw)
To: Grant Grundler
Cc: Vivek Goyal, greg, linux kernel mailing list,
Fastboot mailing list, Morton Andrew Morton, Bodo Eggert,
Dipankar Sarma, stern, awilliam, bjorn.helgaas
Grant Grundler <grundler@parisc-linux.org> writes:
> On Mon, Jun 06, 2005 at 11:07:17PM -0400, Vivek Goyal wrote:
> > So even if interrupts are disabled on PCI-PCI bridge, interrupts generated
> > by PCI devices on secondary bus are not blocked and I hope device should
> > be working fine.
>
> How did you plan on disabling interrupts?
> Did you see the MSI discussion that going on now in linux-pci mailing list?
>
> > But at the same time kdump kernels are not supposed to
> > do a great deal except capture and save the dump.
>
> I'd think you want to stop DMA for all devices.
> Just to prevent them from messing more with memory
> that you want to dump - ie get a consistent snapshot.
> Leaving VGA devices alone should be safe.
>
> > Disabling interrupts at PCI level should increase the reliability of capturing
> > the dump on newer machines with hardware compliant with PCI 2.3 or higher.
>
> *lots* of PCI devices predate PCI2.3. Possibly even the majority.
In general generic hardware bits for disabling DMA, disabling interrupts
and the like are all advisory. With the current architecture things
will work properly even if you don't manage to disable DMA (assuming
you don't reassign IOMMU entries at least).
Non-shared interrupts are not a problem as they can fairly safely
be disabled at the interrupt controller.
Shared interrupts are an interesting case. The simplest solution I can
think of for a crash dump capture kernel is to periodically poll
the hardware, as if all interrupts are shared. At that level
I think we could get away with ignoring all hardware interrupt sources.
Does anyone know of anything that would break by always polling
the hardware? I guess there could be a problem with drivers
that don't understand shared interrupts; are there enough of those
to be an issue?
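Roughly what I mean by polling, as a user-space sketch. All names here
(poll_device, poll_table, poll_tick) are made up for illustration and no
existing kernel interface is implied. The point is that a handler written for
shared interrupts already checks its own device and reports whether it found
work, so calling every registered handler from a periodic tick is
indistinguishable, from the handler's side, from a shared interrupt firing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POLL_MAX_HANDLERS 8

/* Hypothetical device state: work queued by hardware, work completed by us. */
struct poll_device {
    unsigned pending;
    unsigned completed;
};

typedef bool (*poll_handler_fn)(struct poll_device *dev);

struct poll_table {
    poll_handler_fn handler[POLL_MAX_HANDLERS];
    struct poll_device *dev[POLL_MAX_HANDLERS];
    size_t nr;
};

/* Shared-irq-safe handler: look at the device first, claim only real work. */
static bool poll_shared_safe_handler(struct poll_device *dev)
{
    if (dev->pending == 0)
        return false;           /* nothing to do, as for a foreign interrupt */
    dev->completed += dev->pending;
    dev->pending = 0;
    return true;
}

static void poll_register(struct poll_table *t, poll_handler_fn fn,
                          struct poll_device *dev)
{
    assert(t->nr < POLL_MAX_HANDLERS);
    t->handler[t->nr] = fn;
    t->dev[t->nr] = dev;
    t->nr++;
}

/*
 * One periodic tick: call every handler whether or not any interrupt
 * arrived. Returns how many handlers found work.
 */
static size_t poll_tick(struct poll_table *t)
{
    size_t busy = 0;

    for (size_t i = 0; i < t->nr; i++)
        if (t->handler[i](t->dev[i]))
            busy++;
    return busy;
}
```

A driver whose handler assumes that every call means its device interrupted
would misbehave under this scheme, which is the class of driver the question
above is about.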
Eric
* Re: [RFC/PATCH] Kdump: Disabling PCI interrupts in capture kernel
@ 2005-06-07 11:56 Vivek Goyal
0 siblings, 0 replies; 17+ messages in thread
From: Vivek Goyal @ 2005-06-07 11:56 UTC (permalink / raw)
To: Grant Grundler
Cc: greg, linux kernel mailing list, Fastboot mailing list,
Morton Andrew Morton, Eric W. Biederman, Bodo Eggert,
Dipankar Sarma, stern, awilliam, bjorn.helgaas
Quoting Grant Grundler <grundler@parisc-linux.org>:
> On Mon, Jun 06, 2005 at 11:07:17PM -0400, Vivek Goyal wrote:
> > So even if interrupts are disabled on PCI-PCI bridge, interrupts generated
> > by PCI devices on secondary bus are not blocked and I hope device should
> > be working fine.
>
> How did you plan on disabling interrupts?
> Did you see the MSI discussion that going on now in linux-pci mailing list?
I am following the discussion now. Thanks.
I am planning to disable only legacy shared interrupts (irq pin assertion/INTx
emulation) because shared interrupts are the problem. MSIs are
not shared, but I am not sure whether they can lead to any other problem.
>
> > But at the same time kdump kernels are not supposed to
> > do a great deal except capture and save the dump.
>
> I'd think you want to stop DMA for all devices.
> Just to prevent them from messing more with memory
> that you want to dump - ie get a consistent snapshot.
> Leaving VGA devices alone should be safe.
>
Maybe at some point in time.
> > Disabling interrupts at PCI level should increase the reliability of capturing
> > the dump on newer machines with hardware compliant with PCI 2.3 or higher.
>
> *lots* of PCI devices predate PCI2.3. Possibly even the majority.
Yes, some other solution is needed for hardware predating PCI 2.3. Maybe Eric's
suggestion of polling the hardware.
Thanks
Vivek
* Re: [RFC/PATCH] Kdump: Disabling PCI interrupts in capture kernel
2005-06-07 9:59 ` Eric W. Biederman
@ 2005-06-07 16:21 ` Grant Grundler
0 siblings, 0 replies; 17+ messages in thread
From: Grant Grundler @ 2005-06-07 16:21 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Grant Grundler, Vivek Goyal, greg, linux kernel mailing list,
Fastboot mailing list, Morton Andrew Morton, Bodo Eggert,
Dipankar Sarma, stern, awilliam, bjorn.helgaas
On Tue, Jun 07, 2005 at 03:59:18AM -0600, Eric W. Biederman wrote:
> > *lots* of PCI devices predate PCI2.3. Possibly even the majority.
>
> In general generic hardware bits for disabling DMA, disabling interrupts
> and the like are all advisory. With the current architecture things
> will work properly even if you don't manage to disable DMA (assuming
> you don't reassign IOMMU entries at least).
ISTR, pSeries (IBM), some alpha, some sparc64, and parisc (64-bit) require
use of the IOMMU for *any* DMA. ie IOMMU entries need to be programmed.
Probably want to make a choice to ignore those arches for now
or sort out how to deal with an IOMMU.
> Shared interrupts are an interesting case. The simplest solution I can
> think of for a crash dump capture kernel is to periodically poll
> the hardware, as if all interrupts are shared. At that level
> I think we could get away with ignoring all hardware interrupt sources.
Yes, that's perfectly ok. We are no longer in a multitasking env.
> Does anyone know of a anything that would break by always polling
> the hardware? I guess there could be a problem with drivers
> that don't understand shared interrupts, are there enough of those
> to be an issue.
PCI requires that drivers support shared IRQs.
A few oddballs might be broken but I expect networking/mass storage
drivers get this right.
grant