linux-scsi.vger.kernel.org archive mirror
* git-acpi breakage, sym2
@ 2006-05-12  6:10 Andrew Morton
  2006-05-12 11:23 ` Matthew Wilcox
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2006-05-12  6:10 UTC
  To: linux-acpi, Brown, Len, linux-scsi, Matthew Wilcox


The latest
git+ssh://master.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6.git#test
kills my very vanilla P4 box.

Without git-acpi.patch:

Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
pnp: PnP ACPI: found 15 devices
PnPBIOS: Disabled by ACPI PNP
SCSI subsystem initialized
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
pnp: 00:02: ioport range 0x400-0x47f could not be reserved
pnp: 00:02: ioport range 0x500-0x53f has been reserved
pnp: 00:02: ioport range 0x800-0x87f has been reserved
PCI: Bridge: 0000:00:01.0
  IO window: disabled.
  MEM window: fc900000-fe9fffff
  PREFETCH window: e4500000-f46fffff
PCI: Bridge: 0000:00:1e.0
  IO window: d000-dfff
  MEM window: fea00000-feafffff
  PREFETCH window: f4700000-f47fffff


With git-acpi.patch:

Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI: disabled
PnPBIOS: Scanning system for PnP BIOS support...
PnPBIOS: Found PnP BIOS installation structure at 0xc00f2500
PnPBIOS: PnP BIOS version 1.0, entry 0xf0000:0x1e2a, dseg 0xf0000
PnPBIOS: 18 nodes reported by PnP BIOS; 18 recorded by driver
SCSI subsystem initialized
PCI: Probing PCI hardware
PCI quirk: region 0400-047f claimed by ICH4 ACPI/GPIO/TCO
PCI quirk: region 0500-053f claimed by ICH4 GPIO
PCI: Transparent bridge - 0000:00:1e.0
PCI: Using IRQ router PIIX/ICH [8086/2440] at 0000:00:1f.0
PCI BIOS passed nonexistent PCI bus 0!
PCI BIOS passed nonexistent PCI bus 0!
PCI BIOS passed nonexistent PCI bus 0!
PCI BIOS passed nonexistent PCI bus 1!
PCI BIOS passed nonexistent PCI bus 0!
PCI BIOS passed nonexistent PCI bus 2!
PCI BIOS passed nonexistent PCI bus 0!
PCI BIOS passed nonexistent PCI bus 2!
PCI BIOS passed nonexistent PCI bus 0!
PCI BIOS passed nonexistent PCI bus 2!
PCI BIOS passed nonexistent PCI bus 0!
PCI BIOS passed nonexistent PCI bus 2!
PCI BIOS passed nonexistent PCI bus 0!
PCI BIOS passed nonexistent PCI bus 2!
PCI BIOS passed nonexistent PCI bus 0!
PCI BIOS passed nonexistent PCI bus 2!
PCI BIOS passed nonexistent PCI bus 0!
PCI BIOS passed nonexistent PCI bus 2!
PCI BIOS passed nonexistent PCI bus 0!
pnp: 00:09: ioport range 0x4d0-0x4d1 has been reserved
pnp: 00:09: ioport range 0xcf8-0xcff has been reserved
pnp: 00:0b: ioport range 0x800-0x87f has been reserved
PCI: Bridge: 0000:00:01.0
  IO window: disabled.
  MEM window: fc900000-fe9fffff
  PREFETCH window: e4500000-f46fffff
PCI: Bridge: 0000:00:1e.0
  IO window: d000-dfff
  MEM window: fea00000-feafffff
  PREFETCH window: f4700000-f47fffff
NET: Registered protocol family 2


So I'll temporarily drop that patch, along with the enormous shower of
still-unmerged other acpi patches which are probably dependent upon it.  A
swift fix would be really appreciated...



The above bug appears to trigger a scsi or sym2 bug.  With git-acpi.patch
present I get

sym0: <895> rev 0x2 at pci 0000:02:0c.0 irq 9
sym0: Symbios NVRAM, ID 7, Fast-40, SE, parity checking
sym0: SCSI BUS has been reset.
scsi0 : sym-2.2.3
 target0:0:0: Multiple LUNs disabled in NVRAM
 0:0:0:0: ABORT operation started.
 0:0:0:0: ABORT operation timed-out.
 0:0:0:0: DEVICE RESET operation started.
 0:0:0:0: DEVICE RESET operation timed-out.
 0:0:0:0: BUS RESET operation started.
 0:0:0:0: BUS RESET operation timed-out.
 0:0:0:0: HOST RESET operation started.
sym0: SCSI BUS has been reset.
 0:0:0:0: HOST RESET operation timed-out.
 0:0:0:0: scsi: Device offlined - not ready after error recovery
 0:0:1:0: ABORT operation started.
 0:0:1:0: ABORT operation timed-out.
 0:0:1:0: DEVICE RESET operation started.
 0:0:1:0: DEVICE RESET operation timed-out.
 0:0:1:0: BUS RESET operation started.
 0:0:1:0: BUS RESET operation timed-out.
 0:0:1:0: HOST RESET operation started.
sym0: SCSI BUS has been reset.
 0:0:1:0: HOST RESET operation timed-out.
 0:0:1:0: scsi: Device offlined - not ready after error recovery
 0:0:2:0: ABORT operation started.
 0:0:2:0: ABORT operation timed-out.
 0:0:2:0: DEVICE RESET operation started.
 0:0:2:0: DEVICE RESET operation timed-out.
 0:0:2:0: BUS RESET operation started.
 0:0:2:0: BUS RESET operation timed-out.
 0:0:2:0: HOST RESET operation started.
sym0: SCSI BUS has been reset.

ad infinitum.  How come?


* RE: git-acpi breakage, sym2
@ 2006-05-12  7:26 Brown, Len
  2006-05-12  7:35 ` Andrew Morton
  0 siblings, 1 reply; 6+ messages in thread
From: Brown, Len @ 2006-05-12  7:26 UTC
  To: Andrew Morton, linux-acpi, linux-scsi, Matthew Wilcox

>The latest
>git+ssh://master.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6.git#test
>kills my very vanilla P4 box.

hmmm, killed my p4 box too in a similar way.

booting with "acpi=off" made it work,
so I expect the failures after ACPI refuses to start
are due to ACPI and not due to something outside ACPI.

-Len


* Re: git-acpi breakage, sym2
  2006-05-12  7:26 Brown, Len
@ 2006-05-12  7:35 ` Andrew Morton
  0 siblings, 0 replies; 6+ messages in thread
From: Andrew Morton @ 2006-05-12  7:35 UTC
  To: Brown, Len; +Cc: linux-acpi, linux-scsi, willy

"Brown, Len" <len.brown@intel.com> wrote:
>
> >The latest
> >git+ssh://master.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6.git#test
> >kills my very vanilla P4 box.
> 
> hmmm, killed my p4 box too in a similar way.
> 
> booting with "acpi=off" made it work,
> so I expect the failures after ACPI refuses to start
> are due to ACPI and not due to something outside ACPI.
> 

Yeah.  Turns out I'm basically unable to drop the acpi tree because I have
so much other stuff dependent upon it.  So it's debugging time.

By the time we get to IO_APIC_get_PCI_irq_vector(), mp_bus_id_to_pci_bus[]
is still all -1's.

Because MP_bus_info() hasn't been called yet.

get_smp_config() is being called, but bails out early because

	 if (acpi_lapic && acpi_ioapic) {

evaluates to true.

However that all appears to be normal.  Am still poking at it.

git-bisect came up with garbage.
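
The situation described above (a bus table still full of -1's at lookup time because the registration step never ran) can be sketched as a small user-space model. This is only an illustration under stated assumptions: bus_table, table_reset(), register_bus() and lookup_irq_for_bus() are hypothetical names standing in for mp_bus_id_to_pci_bus[], MP_bus_info() and IO_APIC_get_PCI_irq_vector(), and the real kernel code may well differ in its details.

```c
#include <stdio.h>

#define MAX_BUSSES 4  /* hypothetical table size for this sketch */

/* Toy stand-in for mp_bus_id_to_pci_bus[]; -1 means "bus never registered". */
int bus_table[MAX_BUSSES];

/* Put the table in the state described above: all -1's. */
void table_reset(void)
{
    for (int i = 0; i < MAX_BUSSES; i++)
        bus_table[i] = -1;
}

/* Hypothetical stand-in for the registration an MP_bus_info()-style call performs. */
void register_bus(int bus_id, int pci_bus)
{
    if (bus_id >= 0 && bus_id < MAX_BUSSES)
        bus_table[bus_id] = pci_bus;
}

/*
 * Hypothetical stand-in for the IRQ lookup: without a registered bus
 * there is nothing to route, so report failure with -1.
 */
int lookup_irq_for_bus(int bus_id)
{
    if (bus_id < 0 || bus_id >= MAX_BUSSES || bus_table[bus_id] == -1) {
        printf("nonexistent PCI bus %d\n", bus_id);
        return -1;
    }
    return 0;  /* a real lookup would return a routed IRQ here */
}
```

In this model, calling lookup_irq_for_bus(0) right after table_reset() fails, and only succeeds once register_bus(0, 0) has run, which mirrors the ordering problem described in the message.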


* Re: git-acpi breakage, sym2
  2006-05-12  6:10 git-acpi breakage, sym2 Andrew Morton
@ 2006-05-12 11:23 ` Matthew Wilcox
  2006-05-12 11:25   ` Andrew Morton
  0 siblings, 1 reply; 6+ messages in thread
From: Matthew Wilcox @ 2006-05-12 11:23 UTC
  To: Andrew Morton; +Cc: linux-acpi, Brown, Len, linux-scsi, Matthew Wilcox

On Thu, May 11, 2006 at 11:10:05PM -0700, Andrew Morton wrote:
> The above bug appears to trigger a scsi or sym2 bug.  With git-acpi.patch
> present I get
> 
> sym0: <895> rev 0x2 at pci 0000:02:0c.0 irq 9
> sym0: Symbios NVRAM, ID 7, Fast-40, SE, parity checking
> sym0: SCSI BUS has been reset.
> scsi0 : sym-2.2.3
>  target0:0:0: Multiple LUNs disabled in NVRAM
>  0:0:0:0: ABORT operation started.
>  0:0:0:0: ABORT operation timed-out.
>  0:0:0:0: DEVICE RESET operation started.
>  0:0:0:0: DEVICE RESET operation timed-out.
>  0:0:0:0: BUS RESET operation started.
>  0:0:0:0: BUS RESET operation timed-out.
>  0:0:0:0: HOST RESET operation started.
> sym0: SCSI BUS has been reset.
>  0:0:0:0: HOST RESET operation timed-out.
>  0:0:0:0: scsi: Device offlined - not ready after error recovery
>  0:0:1:0: ABORT operation started.
>  0:0:1:0: ABORT operation timed-out.
>  0:0:1:0: DEVICE RESET operation started.
>  0:0:1:0: DEVICE RESET operation timed-out.
>  0:0:1:0: BUS RESET operation started.
>  0:0:1:0: BUS RESET operation timed-out.
>  0:0:1:0: HOST RESET operation started.
> sym0: SCSI BUS has been reset.
>  0:0:1:0: HOST RESET operation timed-out.
>  0:0:1:0: scsi: Device offlined - not ready after error recovery
>  0:0:2:0: ABORT operation started.
>  0:0:2:0: ABORT operation timed-out.
>  0:0:2:0: DEVICE RESET operation started.
>  0:0:2:0: DEVICE RESET operation timed-out.
>  0:0:2:0: BUS RESET operation started.
>  0:0:2:0: BUS RESET operation timed-out.
>  0:0:2:0: HOST RESET operation started.
> sym0: SCSI BUS has been reset.
> 
> ad infinitum.  How come?

Are you sure it's ad infinitum or just once for every device?
Anyway, this looks like a fairly classic "sym2 isn't getting any
interrupts" scenario.
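
The pattern in the quoted log (ABORT, DEVICE RESET, BUS RESET, HOST RESET, each timing out, then the device being offlined) can be modelled roughly as follows. This is a toy sketch, not the SCSI midlayer or sym2 code: run_recovery() and its interrupts_delivered flag are hypothetical, and the only assumption is that each recovery step needs a completion interrupt to succeed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Recovery steps in the order they appear in the quoted log. */
static const char *const recovery_steps[] = {
    "ABORT", "DEVICE RESET", "BUS RESET", "HOST RESET",
};
#define NUM_STEPS (sizeof(recovery_steps) / sizeof(recovery_steps[0]))

/*
 * Hypothetical model: a step completes only if its completion interrupt
 * arrives. Returns the index of the first step that succeeds, or -1 if
 * every step times out (the "Device offlined" case). *timeouts receives
 * the number of steps that timed out.
 */
int run_recovery(bool interrupts_delivered, int *timeouts)
{
    *timeouts = 0;
    for (size_t i = 0; i < NUM_STEPS; i++) {
        printf("%s operation started.\n", recovery_steps[i]);
        if (interrupts_delivered)
            return (int)i;
        printf("%s operation timed-out.\n", recovery_steps[i]);
        (*timeouts)++;
    }
    return -1;
}
```

With interrupts_delivered set to false, all four steps time out and the function returns -1, which matches the shape of the log for each target.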



* Re: git-acpi breakage, sym2
  2006-05-12 11:23 ` Matthew Wilcox
@ 2006-05-12 11:25   ` Andrew Morton
  2006-05-12 11:40     ` Matthew Wilcox
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2006-05-12 11:25 UTC
  To: Matthew Wilcox; +Cc: linux-acpi, len.brown, linux-scsi, willy

Matthew Wilcox <matthew@wil.cx> wrote:
>
> On Thu, May 11, 2006 at 11:10:05PM -0700, Andrew Morton wrote:
> > The above bug appears to trigger a scsi or sym2 bug.  With git-acpi.patch
> > present I get
> > 
> > sym0: <895> rev 0x2 at pci 0000:02:0c.0 irq 9
> > sym0: Symbios NVRAM, ID 7, Fast-40, SE, parity checking
> > sym0: SCSI BUS has been reset.
> > scsi0 : sym-2.2.3
> >  target0:0:0: Multiple LUNs disabled in NVRAM
> >  0:0:0:0: ABORT operation started.
> >  0:0:0:0: ABORT operation timed-out.
> >  0:0:0:0: DEVICE RESET operation started.
> >  0:0:0:0: DEVICE RESET operation timed-out.
> >  0:0:0:0: BUS RESET operation started.
> >  0:0:0:0: BUS RESET operation timed-out.
> >  0:0:0:0: HOST RESET operation started.
> > sym0: SCSI BUS has been reset.
> >  0:0:0:0: HOST RESET operation timed-out.
> >  0:0:0:0: scsi: Device offlined - not ready after error recovery
> >  0:0:1:0: ABORT operation started.
> >  0:0:1:0: ABORT operation timed-out.
> >  0:0:1:0: DEVICE RESET operation started.
> >  0:0:1:0: DEVICE RESET operation timed-out.
> >  0:0:1:0: BUS RESET operation started.
> >  0:0:1:0: BUS RESET operation timed-out.
> >  0:0:1:0: HOST RESET operation started.
> > sym0: SCSI BUS has been reset.
> >  0:0:1:0: HOST RESET operation timed-out.
> >  0:0:1:0: scsi: Device offlined - not ready after error recovery
> >  0:0:2:0: ABORT operation started.
> >  0:0:2:0: ABORT operation timed-out.
> >  0:0:2:0: DEVICE RESET operation started.
> >  0:0:2:0: DEVICE RESET operation timed-out.
> >  0:0:2:0: BUS RESET operation started.
> >  0:0:2:0: BUS RESET operation timed-out.
> >  0:0:2:0: HOST RESET operation started.
> > sym0: SCSI BUS has been reset.
> > 
> > ad infinitum.  How come?
> 
> Are you sure it's ad infinitum or just once for every device?

I have a single card talking to a single disk.  I whacked it after a couple
of minutes.

> Anyway, this looks like a fairly classic "sym2 isn't getting any
> interrupts" scenario.

It is - Len found that one.


* Re: git-acpi breakage, sym2
  2006-05-12 11:25   ` Andrew Morton
@ 2006-05-12 11:40     ` Matthew Wilcox
  0 siblings, 0 replies; 6+ messages in thread
From: Matthew Wilcox @ 2006-05-12 11:40 UTC
  To: Andrew Morton; +Cc: linux-acpi, len.brown, linux-scsi, willy

On Fri, May 12, 2006 at 04:25:04AM -0700, Andrew Morton wrote:
> Matthew Wilcox <matthew@wil.cx> wrote:
> > > scsi0 : sym-2.2.3
> > >  0:0:0:0: ABORT operation started.
> > >  0:0:1:0: ABORT operation started.
> > >  0:0:2:0: ABORT operation started.
> > > 
> > > ad infinitum.  How come?
> > 
> > Are you sure it's ad infinitum or just once for every device?
> 
> I have a single card talking to a single disk.  I whacked it after a couple
> of minutes.

Sure, but it has to probe each device to find out it's not there.  See
the target number increasing in the snippet I left above?
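
The increasing target number can be checked mechanically. The sketch below assumes the log prefix follows the usual host:channel:target:lun layout; parse_target() and count_target_changes() are hypothetical helpers written for this illustration and are not part of any driver.

```c
#include <stdio.h>

/*
 * Parse a "host:channel:target:lun:" prefix as seen in the log lines.
 * Returns the target number, or -1 if the line has no such prefix.
 */
int parse_target(const char *line)
{
    int host, channel, target, lun;

    if (sscanf(line, " %d:%d:%d:%d:", &host, &channel, &target, &lun) == 4)
        return target;
    return -1;
}

/* Count how often the target number changes across consecutive log lines. */
int count_target_changes(const char *const lines[], int n)
{
    int count = 0;
    int last = -1;

    for (int i = 0; i < n; i++) {
        int t = parse_target(lines[i]);
        if (t >= 0 && t != last) {
            count++;
            last = t;
        }
    }
    return count;
}
```

Fed the three ABORT lines quoted above, this reports three distinct targets (0, 1 and 2), which is consistent with a per-target probe rather than an endless loop on a single device.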

> > Anyway, this looks like a fairly classic "sym2 isn't getting any
> > interrupts" scenario.
> 
> It is - Len found that one.

Cool.  I'd love sym2 to behave more sanely in the 'no interrupts' case
because it means I get a lot of bug reports directed my way from people
with bad setups.  I don't see a nice way to do it though ...

