* [GIT PATCH] PCI patches for 2.6.15 - retry
From: Greg KH @ 2006-01-09 20:37 UTC
To: Linus Torvalds, Andrew Morton; +Cc: linux-kernel, linux-pci
Here are some PCI patches against your latest git tree. They have all
been in the -mm tree for a while with no problems. I've pulled out all
of the offending patches that people objected to, or ones that crashed
older machines from the last series I sent you.
The thing that touches so many different files is the change from
pci_module_init() to pci_register_driver() that was done by Richard
Knutsson. Other big stuff is the addition of the pci error recovery
framework, after many different revisions and reworks.
There are also some pci hotplug fixes, and quirks added.
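
For readers who have not seen the pci_module_init() conversion, the per-driver change amounts to a one-line rename in the module init function. The sketch below is a userspace mock and not kernel code: `struct pci_driver` and `pci_register_driver()` here are simplified stand-ins, and `example_driver` and `example_init()` are hypothetical names, all introduced only so the snippet builds on its own. A real driver would use the definitions from <linux/pci.h> instead.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct pci_driver. */
struct pci_driver {
	const char *name;
	int registered;
};

/* Mock registration: 0 on success, negative on error, following the
 * kernel convention of returning a negative value for failure. */
int pci_register_driver(struct pci_driver *drv)
{
	if (drv == NULL || drv->name == NULL)
		return -1;
	drv->registered = 1;
	return 0;
}

struct pci_driver example_driver = {
	.name = "example",
};

/* Before the conversion this body read:
 *	return pci_module_init(&example_driver);
 */
int example_init(void)
{
	return pci_register_driver(&example_driver);
}
```

Because only the function name changes, the conversion touches one line in each of many drivers, which would explain the long run of 2-line entries in the diffstat.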
Please pull from:
rsync://rsync.kernel.org/pub/scm/linux/kernel/git/gregkh/pci-2.6.git/
or if master.kernel.org hasn't synced up yet:
master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6.git/
The full patches will be sent to the linux-pci mailing list, if anyone
wants to see them.
thanks,
greg k-h
Documentation/filesystems/sysfs-pci.txt | 21 +-
Documentation/pci-error-recovery.txt | 246 +++++++++++++++++++++++++++
MAINTAINERS | 7
arch/alpha/kernel/sys_alcor.c | 3
arch/alpha/kernel/sys_sio.c | 6
arch/frv/mb93090-mb00/pci-frv.c | 8
arch/frv/mb93090-mb00/pci-irq.c | 4
arch/i386/kernel/scx200.c | 2
arch/i386/pci/acpi.c | 2
arch/i386/pci/fixup.c | 7
arch/i386/pci/irq.c | 42 ++--
arch/mips/vr41xx/common/vrc4173.c | 2
arch/ppc/kernel/pci.c | 21 +-
arch/ppc/platforms/85xx/mpc85xx_cds_common.c | 11 -
arch/sparc64/kernel/ebus.c | 15 -
drivers/acpi/pci_irq.c | 7
drivers/block/DAC960.c | 2
drivers/block/cciss.c | 2
drivers/block/sx8.c | 2
drivers/block/umem.c | 2
drivers/hwmon/vt8231.c | 2
drivers/media/radio/radio-gemtek-pci.c | 2
drivers/media/radio/radio-maxiradio.c | 2
drivers/media/video/bttv-driver.c | 2
drivers/media/video/saa7134/saa7134-core.c | 2
drivers/parport/parport_serial.c | 2
drivers/pci/hotplug/acpiphp_glue.c | 6
drivers/pci/hotplug/cpqphp.h | 8
drivers/pci/hotplug/cpqphp_core.c | 127 +++++++------
drivers/pci/hotplug/cpqphp_ctrl.c | 28 ---
drivers/pci/hotplug/cpqphp_sysfs.c | 138 ++++++++++++---
drivers/pci/hotplug/ibmphp_pci.c | 2
drivers/pci/hotplug/pciehp_core.c | 92 +++++-----
drivers/pci/hotplug/pciehp_hpc.c | 19 +-
drivers/pci/hotplug/pciehp_pci.c | 52 +++--
drivers/pci/hotplug/pciehprm_acpi.c | 13 -
drivers/pci/hotplug/rpadlpar_core.c | 27 --
drivers/pci/hotplug/rpaphp_pci.c | 47 -----
drivers/pci/hotplug/shpchp.h | 4
drivers/pci/hotplug/shpchp_core.c | 16 +
drivers/pci/hotplug/shpchp_ctrl.c | 37 ----
drivers/pci/hotplug/shpchp_hpc.c | 138 +++++++++------
drivers/pci/hotplug/shpchp_pci.c | 19 +-
drivers/pci/pci.c | 7
drivers/pci/pci.h | 5
drivers/pci/pcie/portdrv_core.c | 4
drivers/pci/probe.c | 49 ++++-
drivers/pci/proc.c | 3
drivers/pci/quirks.c | 26 ++
drivers/pci/remove.c | 3
drivers/pcmcia/vrc4173_cardu.c | 2
drivers/serial/serial_txx9.c | 2
drivers/video/cyblafb.c | 1
include/linux/pci.h | 69 +++++++
sound/oss/ad1889.c | 2
sound/oss/btaudio.c | 2
sound/oss/cmpci.c | 2
sound/oss/cs4281/cs4281m.c | 2
sound/oss/cs46xx.c | 2
sound/oss/emu10k1/main.c | 2
sound/oss/es1370.c | 2
sound/oss/es1371.c | 2
sound/oss/ite8172.c | 2
sound/oss/kahlua.c | 2
sound/oss/maestro.c | 2
sound/oss/nec_vrc5477.c | 2
sound/oss/nm256_audio.c | 2
sound/oss/rme96xx.c | 2
sound/oss/sonicvibes.c | 2
sound/oss/ymfpci.c | 2
70 files changed, 956 insertions(+), 444 deletions(-)
Adrian Bunk:
PCI Hotplug: cpqphp_ctrl.c: remove dead code
PCI: drivers/pci: some cleanups
Benjamin Herrenschmidt:
PCI: Export pci_cfg_space_size
Daniel Marjamäki:
PCI: irq.c: trivial printk and DBG updates
Daniel Yeisley:
PCI Quirk: 1K I/O space granularity on Intel P64H2
Dominik Brodowski:
PCI: use bus numbers sparsely, if necessary
Greg Kroah-Hartman:
PCI Hotplug: fix up the sysfs file in the compaq pci hotplug driver
drivers/sound/oss: Replace pci_module_init() with pci_register_driver()
Hanna Linder:
PCI: arch/i386/pci/acpi.c: use for_each_pci_dev
Jesper Juhl:
PCI: Reduce nr of ptr derefs in drivers/pci/hotplug/cpqphp_core.c
PCI: Reduce nr of ptr derefs in drivers/pci/hotplug/rpaphp_pci.c
PCI: Reduce nr of ptr derefs in drivers/pci/hotplug/pciehp_core.c
PCI: Reduce nr of ptr derefs in drivers/pci/hotplug/pciehprm_acpi.c
Jesse Barnes:
PCI: document sysfs rom file interface
PCI: update Toshiba ohci quirk DMI table
Jiri Slaby:
PCI: pci_find_device remove (ppc/platforms/85xx/mpc85xx_cds_common.c)
PCI: pci_find_device remove (alpha/kernel/sys_sio.c)
PCI: pci_find_device remove (alpha/kernel/sys_alcor.c)
PCI: pci_find_device remove (ppc/kernel/pci.c)
PCI: arch: pci_find_device remove (frv/mb93090-mb00/pci-irq.c)
PCI: pci_find_device remove (frv/mb93090-mb00/pci-frv.c)
PCI: pci_find_device remove (sparc64/kernel/ebus.c)
Jordan, William P:
PCI Hotplug: ibmphp_pci.c copy-n-paste fix
Kenji Kaneshige:
shpchp: fix improper reference to Slot Avail Regsister
shpchp: fix improper reference to Mode 1 ECC Capability" bit
shpchp: replace pci_find_slot() with pci_get_slot()
shpchp: fix improper mmio mapping
shpchp: fix improper wait for command completion
shpchp: fix improper write to Command Completion Detect bit
shpchp: Implement get_address callback
Kristen Accardi:
pci: use pin stored in pci_dev
apci: use pin stored in pci_dev
pci: store PCI_INTERRUPT_PIN in pci_dev
pci: call pci_read_irq for bridges
acpiphp: only size new bus
linas:
PCI Error Recovery: header file patch
linas@austin.ibm.com:
PCI Hotplug/powerpc: remove duplicated code
PCI Hotplug/powerpc: more removal of duplicated code
PCI Error Recovery: documentation
Rajesh Shah:
pciehp: allow bridged card hotplug
Richard Knutsson:
arch: Replace pci_module_init() with pci_register_driver()
drivers/block: Replace pci_module_init() with pci_register_driver()
drivers/*rest*: Replace pci_module_init() with pci_register_driver()
Sergey Vlasov:
PCIE: make bus_id for PCI Express devices unique
Thomas Schaefer:
pciehp: handle sticky power-fault status
* Re: [GIT PATCH] PCI patches for 2.6.15 - retry
From: Linus Torvalds @ 2006-01-10 0:00 UTC
To: Greg KH; +Cc: Andrew Morton, linux-kernel, linux-pci

On Mon, 9 Jan 2006, Greg KH wrote:
>
> Here are some PCI patches against your latest git tree. They have all
> been in the -mm tree for a while with no problems. I've pulled out all
> of the offending patches that people objected to, or ones that crashed
> older machines from the last series I sent you.

Before I pull this, I'd like to get some confirmation that some of the
other problems that seem to be PCI-related in the -mm tree are also
understood, or at least known to be part of the stuff that you're _not_
sending me..

[ There's at least a pci_call_probe() NULL ptr dereference report by
  Martin Bligh, I think Andrew has a few others he's tracked.. ]

Linus
* Re: [GIT PATCH] PCI patches for 2.6.15 - retry
From: Andrew Morton @ 2006-01-10 0:44 UTC
To: Linus Torvalds; +Cc: gregkh, linux-kernel, linux-pci

Linus Torvalds <torvalds@osdl.org> wrote:
>
> On Mon, 9 Jan 2006, Greg KH wrote:
> >
> > Here are some PCI patches against your latest git tree. They have all
> > been in the -mm tree for a while with no problems. I've pulled out all
> > of the offending patches that people objected to, or ones that crashed
> > older machines from the last series I sent you.
>
> Before I pull this, I'd like to get some confirmation that some of the
> other problems that seem to be PCI-related in the -mm tree are also
> understood, or at least known to be part of the stuff that you're _not_
> sending me..

It's really hard to keep track of all this, so it's likely that some things
will still sneak through.

- Reuben Farrelly's oops in make_class_name(). Could be libata, or scsi
  or driver core.

- A few problems with ehci. For example Grant Coady went oops loading
  the module. Probably USB, maybe solved now, but there are
  interactions...

- gregkh-pci-x86-pci-domain-support-the-meat.patch is a problem, but
  wasn't in this tree.

> [ There's at least a pci_call_probe() NULL ptr dereference report by
>   Martin Bligh, I think Andrew has a few others he's tracked.. ]

Yes, Martin is reporting failures on a few machines. Hopefully he's
working out whether gregkh-pci-x86-pci-domain-support-the-meat.patch was
the culprit here. If so, I'd say we're good to go. If that's _not_ the
source then we just don't know where the failure is coming from.

All very vague, sorry.
* Re: [GIT PATCH] PCI patches for 2.6.15 - retry
From: Alan Cox @ 2006-01-10 1:49 UTC
To: Andrew Morton; +Cc: Linus Torvalds, gregkh, linux-kernel, linux-pci

On Llu, 2006-01-09 at 16:44 -0800, Andrew Morton wrote:
> - Reuben Farrelly's oops in make_class_name(). Could be libata, or scsi
>   or driver core.

libata I think. I reproduced it on 2.6.14-mm2 by accident with a buggy
pata driver.
* Re: [GIT PATCH] PCI patches for 2.6.15 - retry
From: Andrew Morton @ 2006-01-10 1:49 UTC
To: Alan Cox; +Cc: torvalds, gregkh, linux-kernel, linux-pci, Reuben Farrelly

Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
>
> On Llu, 2006-01-09 at 16:44 -0800, Andrew Morton wrote:
> > - Reuben Farrelly's oops in make_class_name(). Could be libata, or scsi
> >   or driver core.
>
> libata I think. I reproduced it on 2.6.14-mm2 by accident with a buggy
> pata driver.

Well that's all merged up now. Reuben, could you please test 2.6.15git6
tomorrow?
* Re: [GIT PATCH] PCI patches for 2.6.15 - retry
From: Reuben Farrelly @ 2006-01-10 10:03 UTC
To: Andrew Morton; +Cc: Alan Cox, torvalds, gregkh, linux-kernel, linux-pci

On 10/01/2006 2:49 p.m., Andrew Morton wrote:
> Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
>> On Llu, 2006-01-09 at 16:44 -0800, Andrew Morton wrote:
>>> - Reuben Farrelly's oops in make_class_name(). Could be libata, or scsi
>>>   or driver core.
>> libata I think. I reproduced it on 2.6.14-mm2 by accident with a buggy
>> pata driver.
>
> Well that's all merged up now. Reuben, could you please test 2.6.15git6
> tomorrow?

A couple of reboots later with git6 and at this stage it seems all OK, no
oopses.

I'm still having 100% repeatable "soft" hangs when booting up though, both with
-mm2 (-mm1 seems OK in this regard) and git6. It's enough to make git6 and mm2
unusable because the machine never finishes booting userspace. I'll put more
details of that in another email following up to the original -mm2 thread, as
it's unrelated to the oops above (but probably equally as nasty).

But it means I can't test the git6 fixes much more because every time I boot it
I have to alt-sysrq S+U+B or uncleanly kill the box by hitting the reset button.

reuben
* Re: [GIT PATCH] PCI patches for 2.6.15 - retry
From: Reuben Farrelly @ 2006-01-12 3:55 UTC
To: Andrew Morton; +Cc: Alan Cox, torvalds, gregkh, linux-kernel, linux-pci

On 10/01/2006 2:49 p.m., Andrew Morton wrote:
> Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
>> On Llu, 2006-01-09 at 16:44 -0800, Andrew Morton wrote:
>>> - Reuben Farrelly's oops in make_class_name(). Could be libata, or scsi
>>>   or driver core.
>> libata I think. I reproduced it on 2.6.14-mm2 by accident with a buggy
>> pata driver.
>
> Well that's all merged up now. Reuben, could you please test 2.6.15git6
> tomorrow?

Seemingly not fixed after all. I've been doing many reboots lately getting to
the bottom of the barrier/md bug and just before I hit this with -mm3
(linus.patch -git7) which I believe is the same bug (the call trace looks very
similar).

Real Time Clock Driver v1.12ac
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ACPI: PCI Interrupt 0000:06:02.0[A] -> GSI 18 (level, low) -> IRQ 185
0000:06:02.0: ttyS1 at I/O 0xbc00 (irq = 185) is a 16550A
0000:06:02.0: ttyS2 at I/O 0xbc08 (irq = 185) is a 16550A
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ahci: probe of 0000:00:1f.2 failed with error -12
ata1: SATA max UDMA/133 cmd 0x0 ctl 0x2 bmdma 0x0 irq 0
ata2: SATA max UDMA/133 cmd 0x0 ctl 0x2 bmdma 0x8 irq 0
Unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
c023c873
*pde = 00000000
Oops: 0000 [#1]
SMP
last sysfs file:
Modules linked in:
CPU: 0
EIP: 0060:[<c023c873>] Not tainted VLI
EFLAGS: 00010206 (2.6.15-mm3)
EIP is at make_class_name+0x28/0x8d
eax: 00000000 ebx: ffffffff ecx: ffffffff edx: c1a12224
esi: 00000009 edi: 00000000 ebp: c1921d2c esp: c1921d1c
ds: 007b es: 007b ss: 0068
Process swapper (pid: 1, threadinfo=c1921000 task=c1920a90)
Stack: <0>c1a12224 c03913f8 c1a12224 c03913f8 c1921d54 c023cabd c1921d58 c0391380
       00000000 c1af39c0 c0391400 c1a12224 c1a12000 c1a12030 c1921d60 c023cb7b
       c1a120e4 c1921d74 c0255dbf c1a122c0 c1a43a40 00000000 c1921d80 c025e393
Call Trace:
 [<c0103c5d>] show_stack+0x9b/0xc0
 [<c0103de4>] show_registers+0x162/0x1e7
 [<c0103f8f>] die+0x126/0x231
 [<c01140db>] do_page_fault+0x271/0x5b9
 [<c01037df>] error_code+0x4f/0x54
 [<c023cabd>] class_device_del+0xa3/0x156
 [<c023cb7b>] class_device_unregister+0xb/0x15
 [<c0255dbf>] scsi_remove_host+0xb4/0xef
 [<c025e393>] ata_host_remove+0x11/0x1c
 [<c0260ec6>] ata_device_add+0x2e4/0xb7b
 [<c0261cd6>] ata_pci_init_one+0x322/0x387
 [<c0265b34>] piix_init_one+0x18c/0x338
 [<c01f4f4f>] pci_device_probe+0x44/0x5f
 [<c023bf62>] driver_probe_device+0x3e/0xb0
 [<c023c0df>] __driver_attach+0x8e/0x90
 [<c023b9f3>] bus_for_each_dev+0x44/0x62
 [<c023bece>] driver_attach+0x19/0x1b
 [<c023b687>] bus_add_driver+0x6d/0x126
 [<c023c350>] driver_register+0x6b/0x9b
 [<c01f50fb>] __pci_register_driver+0x6a/0x95
 [<c03e8ea8>] piix_init+0xf/0x22
 [<c01003cc>] init+0xff/0x325
 [<c0100d25>] kernel_thread_helper+0x5/0xb
Code: c8 5d c3 55 89 e5 57 56 53 83 ec 04 89 45 f0 89 c2 8b 40 48 8b 38 bb ff ff ff ff 89 d9 31 c0 f2 ae f7 d1 49 89 ce 8b 7a 08 89 d9 <f2> ae f7 d1 49 89 ca 8d 4e 02 8d 04 0a ba d0 00 00 00 e8 3b 72
<0>Kernel panic - not syncing: Attempted to kill init!

reuben
* Re: [GIT PATCH] PCI patches for 2.6.15 - retry
From: Andrew Morton @ 2006-01-12 4:29 UTC
To: Reuben Farrelly
Cc: alan, torvalds, gregkh, linux-kernel, linux-pci, Jeff Garzik

Reuben Farrelly <reuben-lkml@reub.net> wrote:
>
> On 10/01/2006 2:49 p.m., Andrew Morton wrote:
> > Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> >> On Llu, 2006-01-09 at 16:44 -0800, Andrew Morton wrote:
> >>> - Reuben Farrelly's oops in make_class_name(). Could be libata, or scsi
> >>>   or driver core.
> >> libata I think. I reproduced it on 2.6.14-mm2 by accident with a buggy
> >> pata driver.
> >
> > Well that's all merged up now. Reuben, could you please test 2.6.15git6
> > tomorrow?
>
> Seemingly not fixed after all. I've been doing many reboots lately getting to
> the bottom of the barrier/md bug and just before I hit this with -mm3
> (linus.patch -git7) which I believe is the same bug (the call trace looks very
> similar).
>
> ...

I'm getting my bugs confused now - there are so many. Were you the person
who reported this before?

> serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> ACPI: PCI Interrupt 0000:06:02.0[A] -> GSI 18 (level, low) -> IRQ 185
> 0000:06:02.0: ttyS1 at I/O 0xbc00 (irq = 185) is a 16550A
> 0000:06:02.0: ttyS2 at I/O 0xbc08 (irq = 185) is a 16550A
> Floppy drive(s): fd0 is 1.44M
> FDC 0 is a post-1991 82077
> Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
> ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
> ahci: probe of 0000:00:1f.2 failed with error -12
> ata1: SATA max UDMA/133 cmd 0x0 ctl 0x2 bmdma 0x0 irq 0
> ata2: SATA max UDMA/133 cmd 0x0 ctl 0x2 bmdma 0x8 irq 0
> Unable to handle kernel NULL pointer dereference at virtual address 00000000
>  printing eip:
> c023c873
> *pde = 00000000
> Oops: 0000 [#1]
> SMP
> last sysfs file:
> Modules linked in:
> CPU: 0
> EIP: 0060:[<c023c873>] Not tainted VLI
> EFLAGS: 00010206 (2.6.15-mm3)
> EIP is at make_class_name+0x28/0x8d
> eax: 00000000 ebx: ffffffff ecx: ffffffff edx: c1a12224
> esi: 00000009 edi: 00000000 ebp: c1921d2c esp: c1921d1c
> ds: 007b es: 007b ss: 0068
> Process swapper (pid: 1, threadinfo=c1921000 task=c1920a90)
> Stack: <0>c1a12224 c03913f8 c1a12224 c03913f8 c1921d54 c023cabd c1921d58 c0391380
>        00000000 c1af39c0 c0391400 c1a12224 c1a12000 c1a12030 c1921d60 c023cb7b
>        c1a120e4 c1921d74 c0255dbf c1a122c0 c1a43a40 00000000 c1921d80 c025e393
> Call Trace:
>  [<c0103c5d>] show_stack+0x9b/0xc0
>  [<c0103de4>] show_registers+0x162/0x1e7
>  [<c0103f8f>] die+0x126/0x231
>  [<c01140db>] do_page_fault+0x271/0x5b9
>  [<c01037df>] error_code+0x4f/0x54
>  [<c023cabd>] class_device_del+0xa3/0x156
>  [<c023cb7b>] class_device_unregister+0xb/0x15
>  [<c0255dbf>] scsi_remove_host+0xb4/0xef
>  [<c025e393>] ata_host_remove+0x11/0x1c
>  [<c0260ec6>] ata_device_add+0x2e4/0xb7b
>  [<c0261cd6>] ata_pci_init_one+0x322/0x387
>  [<c0265b34>] piix_init_one+0x18c/0x338
>  [<c01f4f4f>] pci_device_probe+0x44/0x5f
>  [<c023bf62>] driver_probe_device+0x3e/0xb0
>  [<c023c0df>] __driver_attach+0x8e/0x90
>  [<c023b9f3>] bus_for_each_dev+0x44/0x62
>  [<c023bece>] driver_attach+0x19/0x1b
>  [<c023b687>] bus_add_driver+0x6d/0x126
>  [<c023c350>] driver_register+0x6b/0x9b
>  [<c01f50fb>] __pci_register_driver+0x6a/0x95
>  [<c03e8ea8>] piix_init+0xf/0x22
>  [<c01003cc>] init+0xff/0x325
>  [<c0100d25>] kernel_thread_helper+0x5/0xb
> Code: c8 5d c3 55 89 e5 57 56 53 83 ec 04 89 45 f0 89 c2 8b 40 48 8b 38 bb ff ff
> ff ff 89 d9 31 c0 f2 ae f7 d1 49 89 ce 8b 7a 08 89 d9 <f2> ae f7 d1 49 89 ca 8d
> 4e 02 8d 04 0a ba d0 00 00 00 e8 3b 72

Jeff, I believe this is a sata bug. ata_device_add() called
ata_host_remove() and something under there is not yet sufficiently
initialised.
* Re: [GIT PATCH] PCI patches for 2.6.15 - retry
From: Reuben Farrelly @ 2006-01-12 4:55 UTC
To: Andrew Morton
Cc: alan, torvalds, gregkh, linux-kernel, linux-pci, Jeff Garzik

On 12/01/2006 5:29 p.m., Andrew Morton wrote:
> Reuben Farrelly <reuben-lkml@reub.net> wrote:
>>
>> On 10/01/2006 2:49 p.m., Andrew Morton wrote:
>>> Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
>>>> On Llu, 2006-01-09 at 16:44 -0800, Andrew Morton wrote:
>>>>> - Reuben Farrelly's oops in make_class_name(). Could be libata, or scsi
>>>>>   or driver core.
>>>> libata I think. I reproduced it on 2.6.14-mm2 by accident with a buggy
>>>> pata driver.
>>> Well that's all merged up now. Reuben, could you please test 2.6.15git6
>>> tomorrow?
>> Seemingly not fixed after all. I've been doing many reboots lately getting to
>> the bottom of the barrier/md bug and just before I hit this with -mm3
>> (linus.patch -git7) which I believe is the same bug (the call trace looks very
>> similar).
>>
>> ...
>
> I'm getting my bugs confused now - there are so many. Were you the person
> who reported this before?

Yes. It was suggested I try -git6. I reported that it seemed to be OK, but
clearly it isn't. Then again, I've done a hell of a lot of reboots in the last
couple of days.

I've updated my list at
http://www.reub.net/files/kernel/outstanding-kernel-bugs.txt and
http://www.reub.net/files/kernel/

This is a very basic text file which outlines the details of the various bugs
that I have on the go at any given point in time and where they're at. There
are various postings on LKML reporting almost all of them.

Thread starts http://www.ussg.iu.edu/hypermail/linux/kernel/0601.1/0619.html
Greg KH (Mon Jan 09 2006 - 15:36:39 EST)

>> serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
>> ACPI: PCI Interrupt 0000:06:02.0[A] -> GSI 18 (level, low) -> IRQ 185
>> 0000:06:02.0: ttyS1 at I/O 0xbc00 (irq = 185) is a 16550A
>> 0000:06:02.0: ttyS2 at I/O 0xbc08 (irq = 185) is a 16550A
>> Floppy drive(s): fd0 is 1.44M
>> FDC 0 is a post-1991 82077
>> Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
>> ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
>> ahci: probe of 0000:00:1f.2 failed with error -12
>> ata1: SATA max UDMA/133 cmd 0x0 ctl 0x2 bmdma 0x0 irq 0
>> ata2: SATA max UDMA/133 cmd 0x0 ctl 0x2 bmdma 0x8 irq 0
>> Unable to handle kernel NULL pointer dereference at virtual address 00000000
>>  printing eip:
>> c023c873
>> *pde = 00000000
>> Oops: 0000 [#1]
>> SMP
>> last sysfs file:
>> Modules linked in:
>> CPU: 0
>> EIP: 0060:[<c023c873>] Not tainted VLI
>> EFLAGS: 00010206 (2.6.15-mm3)
>> EIP is at make_class_name+0x28/0x8d
>> eax: 00000000 ebx: ffffffff ecx: ffffffff edx: c1a12224
>> esi: 00000009 edi: 00000000 ebp: c1921d2c esp: c1921d1c
>> ds: 007b es: 007b ss: 0068
>> Process swapper (pid: 1, threadinfo=c1921000 task=c1920a90)
>> Stack: <0>c1a12224 c03913f8 c1a12224 c03913f8 c1921d54 c023cabd c1921d58 c0391380
>>        00000000 c1af39c0 c0391400 c1a12224 c1a12000 c1a12030 c1921d60 c023cb7b
>>        c1a120e4 c1921d74 c0255dbf c1a122c0 c1a43a40 00000000 c1921d80 c025e393
>> Call Trace:
>>  [<c0103c5d>] show_stack+0x9b/0xc0
>>  [<c0103de4>] show_registers+0x162/0x1e7
>>  [<c0103f8f>] die+0x126/0x231
>>  [<c01140db>] do_page_fault+0x271/0x5b9
>>  [<c01037df>] error_code+0x4f/0x54
>>  [<c023cabd>] class_device_del+0xa3/0x156
>>  [<c023cb7b>] class_device_unregister+0xb/0x15
>>  [<c0255dbf>] scsi_remove_host+0xb4/0xef
>>  [<c025e393>] ata_host_remove+0x11/0x1c
>>  [<c0260ec6>] ata_device_add+0x2e4/0xb7b
>>  [<c0261cd6>] ata_pci_init_one+0x322/0x387
>>  [<c0265b34>] piix_init_one+0x18c/0x338
>>  [<c01f4f4f>] pci_device_probe+0x44/0x5f
>>  [<c023bf62>] driver_probe_device+0x3e/0xb0
>>  [<c023c0df>] __driver_attach+0x8e/0x90
>>  [<c023b9f3>] bus_for_each_dev+0x44/0x62
>>  [<c023bece>] driver_attach+0x19/0x1b
>>  [<c023b687>] bus_add_driver+0x6d/0x126
>>  [<c023c350>] driver_register+0x6b/0x9b
>>  [<c01f50fb>] __pci_register_driver+0x6a/0x95
>>  [<c03e8ea8>] piix_init+0xf/0x22
>>  [<c01003cc>] init+0xff/0x325
>>  [<c0100d25>] kernel_thread_helper+0x5/0xb
>> Code: c8 5d c3 55 89 e5 57 56 53 83 ec 04 89 45 f0 89 c2 8b 40 48 8b 38 bb ff ff
>> ff ff 89 d9 31 c0 f2 ae f7 d1 49 89 ce 8b 7a 08 89 d9 <f2> ae f7 d1 49 89 ca 8d
>> 4e 02 8d 04 0a ba d0 00 00 00 e8 3b 72
>
> Jeff, I believe this is a sata bug. ata_device_add() called
> ata_host_remove() and something under there is not yet sufficiently
> initialised.

reuben
* Re: [GIT PATCH] PCI patches for 2.6.15 - retry
From: Alan Cox @ 2006-01-12 11:42 UTC
To: Reuben Farrelly; +Cc: Andrew Morton, torvalds, gregkh, linux-kernel, linux-pci

On Iau, 2006-01-12 at 16:55 +1300, Reuben Farrelly wrote:
> ata1: SATA max UDMA/133 cmd 0x0 ctl 0x2 bmdma 0x0 irq 0
> ata2: SATA max UDMA/133 cmd 0x0 ctl 0x2 bmdma 0x8 irq 0
> Unable to handle kernel NULL pointer dereference at virtual address 00000000

That is the critical bit. The SATA ports have no PCI resources assigned
for bus mastering (BAR 4). libata should have driven the device PIO in
this case but the resource should have been assigned.
* Re: [GIT PATCH] PCI patches for 2.6.15 - retry
From: Jeff Garzik @ 2006-01-12 20:59 UTC
To: Alan Cox
Cc: Reuben Farrelly, Andrew Morton, torvalds, gregkh, linux-kernel, linux-pci

Alan Cox wrote:
> On Iau, 2006-01-12 at 16:55 +1300, Reuben Farrelly wrote:
>
>>ata1: SATA max UDMA/133 cmd 0x0 ctl 0x2 bmdma 0x0 irq 0
>>ata2: SATA max UDMA/133 cmd 0x0 ctl 0x2 bmdma 0x8 irq 0
>>Unable to handle kernel NULL pointer dereference at virtual address 00000000
>
> That is the critical bit. The SATA ports have no PCI resources assigned
> for bus mastering (BAR 4). libata should have driven the device PIO in
> this case but the resource should have been assigned.

Agreed. This appears to be BIOS assigning bad values to SATA hardware.

However, libata should recognize this and not attempt to iomap or drive
the hardware, in that case, rather than oops.

Jeff
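
The two suggestions above (fall back to PIO when BAR 4 is missing, and refuse to drive a port whose resources are plainly bogus) can be sketched as a small classification step run before any hardware access. This is an illustrative userspace sketch only, not libata code: `struct port_resources`, `enum port_mode` and `classify_port()` are hypothetical names, the fields simply mirror the labels in the boot log, and treating a zero command base or IRQ 0 as "unassigned" is an assumption made for the example.

```c
#include <stddef.h>

/* Hypothetical summary of what a port probe found; field names follow
 * the labels in the boot log, not any real libata structure. */
struct port_resources {
	unsigned long cmd;    /* command block base */
	unsigned long ctl;    /* control block base */
	unsigned long bmdma;  /* bus-master DMA base, derived from BAR 4 */
	unsigned int irq;
};

enum port_mode { PORT_UNUSABLE, PORT_PIO_ONLY, PORT_DMA };

/* Decide how far a port can safely be driven with what it was given.
 * Assumption: a zero command base or IRQ 0 means "never assigned". */
enum port_mode classify_port(const struct port_resources *r)
{
	if (r == NULL || r->cmd == 0 || r->irq == 0)
		return PORT_UNUSABLE;   /* do not map or touch the hardware */
	if (r->bmdma == 0)
		return PORT_PIO_ONLY;   /* no bus-master region: PIO fallback */
	return PORT_DMA;
}
```

Fed the values from the log (cmd 0x0, ctl 0x2, bmdma 0x0 or 0x8, irq 0), both ata1 and ata2 would come out as PORT_UNUSABLE under these assumptions, so the probe could fail cleanly before reaching the teardown path that oopsed.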
* Re: [GIT PATCH] PCI patches for 2.6.15 - retry
From: Reuben Farrelly @ 2006-01-16 13:11 UTC
To: Jeff Garzik
Cc: Alan Cox, Andrew Morton, torvalds, gregkh, linux-kernel, linux-acpi

On 13/01/2006 9:59 a.m., Jeff Garzik wrote:
> Alan Cox wrote:
>> On Iau, 2006-01-12 at 16:55 +1300, Reuben Farrelly wrote:
>>
>>> ata1: SATA max UDMA/133 cmd 0x0 ctl 0x2 bmdma 0x0 irq 0
>>> ata2: SATA max UDMA/133 cmd 0x0 ctl 0x2 bmdma 0x8 irq 0
>>> Unable to handle kernel NULL pointer dereference at virtual address
>>> 00000000
>>
>> That is the critical bit. The SATA ports have no PCI resources assigned
>> for bus mastering (BAR 4). libata should have driven the device PIO in
>> this case but the resource should have been assigned.
>
> Agreed. This appears to be BIOS assigning bad values to SATA hardware.
>
> However, libata should recognize this and not attempt to iomap or drive
> the hardware, in that case, rather than oops.
>
> Jeff

Some testing tonight has shown up a bit more about where this regression crept
in.

In the table below, the release is on the left hand side, and the result of
each reboot is on the right hand side after the dash. I had to do so many
reboots to be sure that the bug was there or not - as you can see from the
-mm1 test it doesn't always show up.

2.6.15            - OK OK OK OK OK
2.6.15-git1       - OK OK OK OK OK OK OK OK
2.6.15-git2       - OK
2.6.15-git6       - OK OK OK OK OK OK OK OK
2.6.15-git12      - OK OK OK OK OK OK OK

2.6.15-rc5-mm3    - OK OK OK(but oopsed in usb) OK OK(but oopsed in usb)
   Those oopses in USB were only seen in this release so looks likely whatever
   was causing them was fixed soon after.
2.6.15-mm1        - OK OK OOPSED OOPSED OOPSED all in ATA
2.6.15-mm2 + mm3  - [known to OOPS on this bug frequently but not tested in
                    this round]
2.6.15-mm4        - OOPSED OK OOPSED TIMEOUT TIMEOUT OOPS OK

2.6.15-mm1 with no git-acpi.patch - TIMEOUT TIMEOUT OOPSED TIMEOUT OK

OK means the system booted up to single user mode without issue, TIMEOUT means
that the controllers were assigned IRQ 50 and then failed to find any ATA disks
and OOPSED means that the SATA ports were not assigned IRQs at all and hence the
system oopsed out like this:

ahci: probe of 0000:00:1f.2 failed with error -12
ata1: SATA max UDMA/133 cmd 0x0 ctl 0x2 bmdma 0x0 irq 0
ata2: SATA max UDMA/133 cmd 0x0 ctl 0x2 bmdma 0x8 irq 0
Unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
c023c873
*pde = 00000000
Oops: 0000 [#1]
<plus a trace and a whole lot more>

Full output on http://www.reub.net/files/kernel/outstanding-kernel-bugs.txt (as
usual)

In summary the good news is that 2.6.15-git12 (which is the latest linus tree)
is GOOD and does not seem to exhibit this problem. Not a single -git release
crapped out. It seems some regression was introduced into 2.6.15-mm1 which has
been carried forward through to -mm4 so far though but never pushed to Linus.

I guess it also suggests that it's not a hardware or bios issue given the sheer
number of successful reboots in a row, and I think reverting the git-acpi.patch
suggests that ACPI is not the cause of it, at least in this instance.

But that's about as far as I have gotten. 45 reboots later I'm finishing for
tonight, but before I go back and hit it with git bisect to narrow it down any
further, Andrew/Jeff does this make it any easier to pinpoint, and/or do you
have any preliminary patches to test or ideas as to what other subsystems could
be involved?

Thanks,
Reuben
* Re: [GIT PATCH] PCI patches for 2.6.15 - retry
From: Jeff Garzik @ 2006-01-12 20:55 UTC
To: Alan Cox; +Cc: Andrew Morton, Linus Torvalds, gregkh, linux-kernel, linux-pci

Alan Cox wrote:
> On Llu, 2006-01-09 at 16:44 -0800, Andrew Morton wrote:
>
>>- Reuben Farrelly's oops in make_class_name(). Could be libata, or scsi
>>  or driver core.
>
> libata I think. I reproduced it on 2.6.14-mm2 by accident with a buggy
> pata driver.

Any additional info? How can I reproduce?

Jeff
* Re: [GIT PATCH] PCI patches for 2.6.15 - retry
From: Alan Cox @ 2006-01-13 0:16 UTC
To: Jeff Garzik
Cc: Andrew Morton, Linus Torvalds, gregkh, linux-kernel, linux-pci

On Iau, 2006-01-12 at 15:55 -0500, Jeff Garzik wrote:
> > libata I think. I reproduced it on 2.6.14-mm2 by accident with a buggy
> > pata driver.
>
> Any additional info? How can I reproduce?

In my case I'm fairly sure (waves arms frantically) that it was
registering a controller which then failed to add any drives so it got
cleaned back up early
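
The failure mode described here (a host that is torn down before it was ever fully set up) is a common error-path hazard, and the register dump earlier in the thread (edi: 00000000 at the faulting instruction in make_class_name()) looks consistent with a string scan on a name pointer that was never filled in, though that reading is an inference rather than something the thread confirms. A minimal userspace sketch of a teardown that tolerates a half-initialised object follows; `struct host`, `host_register()`, `host_remove()` and `class_name_len()` are hypothetical names and do not correspond to the real SCSI or libata structures.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical host object; not the real SCSI or libata structures. */
struct host {
	const char *class_name;   /* set only once registration succeeds */
	int registered;
};

/* Length of the name a teardown path would need; returns 0 for a host
 * whose name was never filled in instead of scanning a NULL string. */
size_t class_name_len(const struct host *h)
{
	if (h == NULL || h->class_name == NULL)
		return 0;
	return strlen(h->class_name);
}

/* Register the host under the given class name; 0 on success. */
int host_register(struct host *h, const char *name)
{
	if (h == NULL || name == NULL)
		return -1;
	h->class_name = name;
	h->registered = 1;
	return 0;
}

/* Undo registration. Returns 1 if something was torn down and 0 if the
 * host never got far enough for there to be anything to undo. */
int host_remove(struct host *h)
{
	if (h == NULL || !h->registered)
		return 0;
	h->registered = 0;
	h->class_name = NULL;
	return 1;
}
```

The design point is that the cleanup routine checks how far setup actually got, so an early "no drives found" exit can call it unconditionally.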
* Re: [GIT PATCH] PCI patches for 2.6.15 - retry
From: Greg KH @ 2006-01-10 2:28 UTC
To: Andrew Morton; +Cc: Linus Torvalds, linux-kernel, linux-pci

On Mon, Jan 09, 2006 at 04:44:10PM -0800, Andrew Morton wrote:
> Linus Torvalds <torvalds@osdl.org> wrote:
> >
> > On Mon, 9 Jan 2006, Greg KH wrote:
> > >
> > > Here are some PCI patches against your latest git tree. They have all
> > > been in the -mm tree for a while with no problems. I've pulled out all
> > > of the offending patches that people objected to, or ones that crashed
> > > older machines from the last series I sent you.
> >
> > Before I pull this, I'd like to get some confirmation that some of the
> > other problems that seem to be PCI-related in the -mm tree are also
> > understood, or at least known to be part of the stuff that you're _not_
> > sending me..
>
> It's really hard to keep track of all this, so it's likely that some things
> will still sneak through.
>
> - Reuben Farrelly's oops in make_class_name(). Could be libata, or scsi
>   or driver core.

Haven't heard of this one before, but it shouldn't be a pci issue.

> - A few problems with ehci. For example Grant Coady went oops loading
>   the module. Probably USB, maybe solved now, but there are
>   interactions...

People are still working on this one, but it shouldn't be a pci issue.

> - gregkh-pci-x86-pci-domain-support-the-meat.patch is a problem, but
>   wasn't in this tree.
>
> > [ There's at least a pci_call_probe() NULL ptr dereference report by
> >   Martin Bligh, I think Andrew has a few others he's tracked.. ]

Yes, that's the x86-* patches in my tree, from Jeff, and I'm not going
to be sending them to you until all of the breakage is fixed up (he
created them for machines that aren't public yet, so I don't think
there's a rush for them to get in anytime soon...)

> Yes, Martin is reporting failures on a few machines. Hopefully he's
> working out whether gregkh-pci-x86-pci-domain-support-the-meat.patch was
> the culprit here. If so, I'd say we're good to go. If that's _not_ the
> source then we just don't know where the failure is coming from.

It sure looks like it's the reason why, as we are suddenly failing with
a NULL pointer problem after we change an integer field into a pointer
:)

Linus, it should all be safe for you to pull this tree, as I took
everything that people objected to out of my last attempt :)

thanks,

greg k-h
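
The remark about changing an integer field into a pointer points at a general hazard: a zero-initialised integer is a usable value, while a zero-initialised pointer is NULL, so any code path that never set the field goes from harmless to fatal. The sketch below illustrates that with made-up types; `struct bus_data`, `struct domain_info` and `bus_domain_nr()` are hypothetical, the thread does not say which field was actually changed, and falling back to domain 0 is an assumption chosen for the example.

```c
#include <stddef.h>

/* Hypothetical per-domain data that the pointer now refers to. */
struct domain_info {
	int number;
};

/* Hypothetical per-bus data; "domain" stands in for a field that used
 * to be a plain integer. */
struct bus_data {
	struct domain_info *domain;   /* was: int domain; */
};

/* With an int field, a zeroed struct meant "domain 0". With a pointer
 * field, a zeroed struct means NULL, so readers must check before
 * dereferencing. Assumed fallback: treat a missing object as domain 0. */
int bus_domain_nr(const struct bus_data *bus)
{
	if (bus == NULL || bus->domain == NULL)
		return 0;
	return bus->domain->number;
}
```

Every setup path that previously relied on zero-initialisation has to be found and taught to allocate the new object, which is the kind of breakage that tends to show up only on some machines.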