public inbox for linux-scsi@vger.kernel.org
* [Bug 10374] sym53c8xx: weird behavior with udev
       [not found] <bug-10374-11613@http.bugzilla.kernel.org/>
@ 2008-04-01  8:12 ` bugme-daemon
  2008-04-01  8:15 ` bugme-daemon
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: bugme-daemon @ 2008-04-01  8:12 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=10374


bunk@kernel.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|drivers_other@kernel-       |scsi_drivers-
                   |bugs.osdl.org               |sym53c8xx@kernel-
                   |                            |bugs.osdl.org
          Component|Other                       |sym53c8xx
            Product|Drivers                     |SCSI Drivers
         Regression|0                           |1




-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.


* [Bug 10374] sym53c8xx: weird behavior with udev
       [not found] <bug-10374-11613@http.bugzilla.kernel.org/>
  2008-04-01  8:12 ` [Bug 10374] sym53c8xx: weird behavior with udev bugme-daemon
@ 2008-04-01  8:15 ` bugme-daemon
  2008-04-01  8:58 ` bugme-daemon
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: bugme-daemon @ 2008-04-01  8:15 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=10374





------- Comment #1 from anonymous@kernel-bugs.osdl.org  2008-04-01 01:15 -------
Reply-To: akpm@linux-foundation.org


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Tue,  1 Apr 2008 01:01:22 -0700 (PDT) bugme-daemon@bugzilla.kernel.org
wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=10374
> 
>            Summary: sym53c8xx: weird behavior with udev
>            Product: Drivers
>            Version: 2.5
>      KernelVersion: 2.6.24.4
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>         AssignedTo: drivers_other@kernel-bugs.osdl.org
>         ReportedBy: seraph@xs4all.nl
> 
> 
> Latest working kernel version: 2.6.22.9
> Earliest failing kernel version: 2.6.23
> Distribution: Gentoo
> Hardware Environment: sparc64 (Sun Blade 100)
> Software Environment:
> Problem Description:
> 
> Since kernel 2.6.23, I have been having problems getting the sungem network
> device working on one of my two Blade 100s, see bug #10273.
> 
> After debugging this, I found that this seems to be somehow related to
> sym53c8xx and udev.
> 
> If I allow udev to load sym53c8xx during boot, the attached disks work fine but
> the network does not work at all. While the device is up and mii-diag says
> there is link beat, no packets can be sent or received, and attempts to use the
> network result in "network unreachable" errors.
> 
> If I blacklist sym53c8xx in /etc/modprobe.d/blacklist, let the machine boot
> normally and then manually load sym53c8xx after everything is settled, both the
> scsi disks and the network appear to work fine.
> 
> None of this happened in kernel versions 2.6.22.9 and earlier, 2.6.23 was the
> first to start showing this behavior, and it still persists in 2.6.24.
> 
> I've tried playing with the option CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MODE as
> that is what seems to have changed between kernel versions 2.6.22 and 2.6.23.
> The recommended value for my machine is 0, but I have tried the other possible
> values without result. I have also tried toggling CONFIG_SCSI_SYM53C8XX_MMIO,
> also without result.
> 
> Steps to reproduce:
> 
> Let udev load sym53c8xx in kernel 2.6.23 or newer.

urgh.  Perhaps it's related to platform IRQ routing or something.

I'd suggest that the next step would be to send us the `dmesg -s 1000000'
output for both good and bad kernels.  A comparison might show where things
went bad.



* [Bug 10374] sym53c8xx: weird behavior with udev
       [not found] <bug-10374-11613@http.bugzilla.kernel.org/>
  2008-04-01  8:12 ` [Bug 10374] sym53c8xx: weird behavior with udev bugme-daemon
  2008-04-01  8:15 ` bugme-daemon
@ 2008-04-01  8:58 ` bugme-daemon
  2008-04-01 14:12 ` bugme-daemon
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: bugme-daemon @ 2008-04-01  8:58 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=10374





------- Comment #2 from seraph@xs4all.nl  2008-04-01 01:58 -------
On Tue, 1 Apr 2008 01:15:18 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:

> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> urgh.  Perhaps it's related to platform IRQ routing or something.
> 
> I'd suggest that the next step would be to send us the `dmesg -s 1000000'
> output for both good and bad kernels.  A comparison might show where things
> went bad.

Alright, here goes. Attached three files.

dmesg-2.6.22.9-good.txt is the dmesg output from 2.6.22.9 where everything
works as it should.
dmesg-2.6.24.4-bad.txt is the dmesg output from 2.6.24.4 with sym53c8xx loaded
by udev and the network not responding.
dmesg-2.6.24.4-good.txt is the dmesg output from 2.6.24.4 with sym53c8xx
blacklisted and the network working fine.
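
A minimal sketch of the suggested comparison, assuming the two 2.6.24.4 logs are
saved under the file names above. The helper names strip_ts and dmesg_diff are
hypothetical. The timestamp pattern only matters if the kernel was built with
CONFIG_PRINTK_TIME, which this thread does not state; without timestamps the
lines pass through unchanged.

```shell
# strip_ts: remove a leading "[  12.345678] " printk timestamp, if present,
# so that timing jitter does not show up as a difference.
strip_ts() {
    sed 's/^\[ *[0-9][0-9]*\.[0-9][0-9]*] //'
}

# dmesg_diff GOOD BAD: unified diff of two saved dmesg logs, timestamps removed.
# Returns the exit status of diff (0 when the logs are identical).
dmesg_diff() {
    tmp_good=$(mktemp) || return 2
    tmp_bad=$(mktemp) || return 2
    strip_ts < "$1" > "$tmp_good"
    strip_ts < "$2" > "$tmp_bad"
    diff -u "$tmp_good" "$tmp_bad"
    status=$?
    rm -f "$tmp_good" "$tmp_bad"
    return $status
}
```

Usage would be along the lines of
dmesg_diff dmesg-2.6.24.4-good.txt dmesg-2.6.24.4-bad.txt | less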



* [Bug 10374] sym53c8xx: weird behavior with udev
       [not found] <bug-10374-11613@http.bugzilla.kernel.org/>
                   ` (2 preceding siblings ...)
  2008-04-01  8:58 ` bugme-daemon
@ 2008-04-01 14:12 ` bugme-daemon
  2008-04-01 14:48 ` bugme-daemon
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: bugme-daemon @ 2008-04-01 14:12 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=10374





------- Comment #3 from anonymous@kernel-bugs.osdl.org  2008-04-01 07:12 -------
Reply-To: James.Bottomley@HansenPartnership.com

On Tue, 2008-04-01 at 01:15 -0700, Andrew Morton wrote:
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> > Steps to reproduce:
> > 
> > Let udev load sym53c8xx in kernel 2.6.23 or newer.
> 
> urgh.  Perhaps it's related to platform IRQ routing or something.
> 
> I'd suggest that the next step would be to send us the `dmesg -s 1000000'
> output for both good and bad kernels.  A comparison might show where things
> went bad.

Yes, that would be my guess too ... although I don't see anything amiss
in the dmesg.  I note you have two ethernet interfaces:

eth0: Sun GEM (PCI) 10/100/1000BaseT Ethernet 00:03:ba:08:61:7c
eth1394: eth1: IPv4 over IEEE 1394 (fw-host0)

I'm assuming eth0 is the problem?

Could you also send us the output of /proc/interrupts, /proc/iomem
and /proc/ioports just to see if we have a problem.  Also, if eth0 is on
its own interrupt line, does the interrupt count rise even while the
interface is non functional?

Thanks,

James



* [Bug 10374] sym53c8xx: weird behavior with udev
       [not found] <bug-10374-11613@http.bugzilla.kernel.org/>
                   ` (3 preceding siblings ...)
  2008-04-01 14:12 ` bugme-daemon
@ 2008-04-01 14:48 ` bugme-daemon
  2008-04-01 19:05 ` bugme-daemon
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: bugme-daemon @ 2008-04-01 14:48 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=10374





------- Comment #4 from seraph@xs4all.nl  2008-04-01 07:48 -------
On Tue, 01 Apr 2008 09:11:55 -0500
James Bottomley <James.Bottomley@HansenPartnership.com> wrote:

> On Tue, 2008-04-01 at 01:15 -0700, Andrew Morton wrote:
> > 
> > I'd suggest that the next step would be to send us the `dmesg -s 1000000'
> > output for both good and bad kernels.  A comparison might show where things
> > went bad.
> 
> Yes, that would be my guess too ... although I don't see anything amiss
> in the dmesg.  I note you have two ethernet interfaces:
> 
> eth0: Sun GEM (PCI) 10/100/1000BaseT Ethernet 00:03:ba:08:61:7c
> eth1394: eth1: IPv4 over IEEE 1394 (fw-host0)
> 
> I'm assuming eth0 is the problem?

Yes, the Sun GEM interface is the problem. Ethernet over FireWire (a direct
link between the webserver and database server) works just fine. In a minimal
config with no devices enabled except the SU serial line (needed for console),
the dual SCSI adapter and the Sun GEM nic, the problem still remained the same.

> Could you also send us the output of /proc/interrupts, /proc/iomem
> and /proc/ioports just to see if we have a problem.  Also, if eth0 is on
> its own interrupt line, does the interrupt count rise even while the
> interface is non functional?

Here goes:

seraphim ~ # cat /proc/interrupts
           CPU0
  0:       7259     <NULL>  timer
  8:          0      sun4u  power
  9:        405      sun4u  su(serial)
 11:          1      sun4u  eth0
 12:         95      sun4u  ohci1394
 13:          0      sun4u  ohci_hcd:usb1
 14:          0      sun4u  ALI 5451
 15:       2265      sun4u  ide0
 16:        293      sun4u  sym53c8xx
 17:         69      sun4u  sym53c8xx

seraphim ~ # cat /proc/iomem
1fe020002e8-1fe020002ef : su
1fe020003f8-1fe020003ff : su
1fe02000800-1fe02000803 : power
1ff00000000-1ffffffffff : /pci@1f,0
  1ff000a0000-1ff000bffff : Video RAM area
  1ff000c0000-1ff000c7fff : Video ROM
  1ff000f0000-1ff000fffff : System ROM
  1ff00400000-1ff0041ffff : sungem
  1ff00420000-1ff004207ff : ohci1394
  1ff00424000-1ff00425fff : ALI 5451
  1ff02000000-1ff02ffffff : ohci_hcd
  1ff03000000-1ff03001fff : sym53c8xx
  1ff03002000-1ff03003fff : sym53c8xx
  1ff03004000-1ff03005fff : sym53c8xx
  1ff03006000-1ff03007fff : sym53c8xx
  1ffc0000000-1ffdfffffff : IOMMU
  1fff1000000-1fff1001fff : clock

seraphim ~ # cat /proc/ioports
00000600-0000061f : ali1535_smbus
1fe02000000-1fe02ffffff : /pci@1f,0
  1fe02000600-1fe0200061f : 0000:00:03.0
  1fe02000800-1fe0200083f : 0000:00:03.0
  1fe02000900-1fe020009ff : ALI 5451
  1fe02000a00-1fe02000a07 : ide0
  1fe02000a1a-1fe02000a1a : ide0
  1fe02000a20-1fe02000a27 : ide0
  1fe02000a28-1fe02000a2f : ide1
  1fe02001000-1fe020010ff : sym53c8xx
  1fe02001100-1fe020011ff : sym53c8xx
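
One way to answer the interrupt-count question is to sample the eth0 line twice
and compare. A rough sketch, assuming the /proc/interrupts layout shown above
(IRQ number, one count column per CPU, controller, handler name). irq_count and
irq_rises are hypothetical helper names. Only the first matching line is used,
so of the two sym53c8xx lines only the first is counted, and handler names
containing a space (such as "ALI 5451") are not supported.

```shell
# irq_count NAME [FILE]: print the summed per-CPU count from the first line of
# FILE (default /proc/interrupts) whose last field equals NAME.
irq_count() {
    awk -v name="$1" '
        $NF == name {
            total = 0
            # Field 1 is the IRQ number ("11:"); numeric count columns follow.
            for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++)
                total += $i
            print total
            exit
        }' "${2:-/proc/interrupts}"
}

# irq_rises NAME [SECONDS]: succeed if the count grew during the interval
# (default 5 seconds). Generate traffic on the interface while it waits.
irq_rises() {
    before=$(irq_count "$1")
    sleep "${2:-5}"
    after=$(irq_count "$1")
    echo "$1: $before -> $after"
    [ "${after:-0}" -gt "${before:-0}" ]
}
```

For example, run irq_rises eth0 10 while pinging a host on the affected
network from another terminal.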



* [Bug 10374] sym53c8xx: weird behavior with udev
       [not found] <bug-10374-11613@http.bugzilla.kernel.org/>
                   ` (4 preceding siblings ...)
  2008-04-01 14:48 ` bugme-daemon
@ 2008-04-01 19:05 ` bugme-daemon
  2008-04-01 20:20 ` bugme-daemon
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: bugme-daemon @ 2008-04-01 19:05 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=10374





------- Comment #5 from seraph@xs4all.nl  2008-04-01 12:05 -------
Hello all,


I did a bit more testing, and I think this may be related to the order in which
modules are loaded.

If I let udev load sungem, and load sym53c8xx manually, everything works.

If I let udev load sym53c8xx, and load sungem manually, I get the
non-functional network.

If I let udev load both modules, I also get the non-functional network. While
udev loads sungem first and sym53c8xx later, I don't suppose it waits for one
module to 'settle' before loading the next. :-)


So to sum it up, the bug is triggered if sym53c8xx is loaded before sungem is.
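
The ordering experiments can be reproduced by hand. A sketch, assuming both
sungem and sym53c8xx are listed in /etc/modprobe.d/blacklist so that udev
autoloads neither. A blacklist entry only suppresses alias-based autoloading,
so loading a module by name still works, which matches the manual loading
described in the original report. load_in_order and DRY_RUN are hypothetical
names.

```shell
# load_in_order MODULE...: load the modules one at a time, in the order given,
# stopping at the first failure. With DRY_RUN=1 the modprobe commands are
# printed instead of executed.
load_in_order() {
    for mod in "$@"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "modprobe $mod"
        else
            modprobe "$mod" || return 1
        fi
    done
}

# Order reported as working in this thread:
#   load_in_order sungem sym53c8xx
# Order reported as triggering the bug:
#   load_in_order sym53c8xx sungem
```

Since the failure persists until reboot, each order needs a fresh boot.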



* [Bug 10374] sym53c8xx: weird behavior with udev
       [not found] <bug-10374-11613@http.bugzilla.kernel.org/>
                   ` (5 preceding siblings ...)
  2008-04-01 19:05 ` bugme-daemon
@ 2008-04-01 20:20 ` bugme-daemon
  2008-04-01 20:57 ` bugme-daemon
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: bugme-daemon @ 2008-04-01 20:20 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=10374





------- Comment #6 from anonymous@kernel-bugs.osdl.org  2008-04-01 13:20 -------
Reply-To: James.Bottomley@HansenPartnership.com

On Tue, 2008-04-01 at 21:05 +0200, Jos van der Ende wrote:
> Hello all,
> 
> 
> I did a bit more testing, and I think this may be related to the order in which modules are loaded.
> 
> If I let udev load sungem, and load sym53c8xx manually, everything works.
> 
> If I let udev load sym53c8xx, and load sungem manually, I get the non-functional network.
> 
> If I let udev load both modules, I also get the non-functional network. While udev loads sungem first and sym53c8xx later, I don't suppose it waits for one module to 'settle' before loading the next. :-)

That's odd ... it's behaving like a resource conflict.  However, the
ports and interrupt trace didn't betray anything.  What does lspci -vv
say for each of the devices?  Also, if you remove the sym2 module in the
problem case, does the sungem come back to life?

I'm afraid I can't see anything relevant looking over the sym2 changes,
so you might need to bisect this to identify the culprit.

James



* [Bug 10374] sym53c8xx: weird behavior with udev
       [not found] <bug-10374-11613@http.bugzilla.kernel.org/>
                   ` (6 preceding siblings ...)
  2008-04-01 20:20 ` bugme-daemon
@ 2008-04-01 20:57 ` bugme-daemon
  2008-04-01 21:14 ` bugme-daemon
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: bugme-daemon @ 2008-04-01 20:57 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=10374





------- Comment #7 from seraph@xs4all.nl  2008-04-01 13:57 -------
On Tue, 01 Apr 2008 15:19:29 -0500
James Bottomley <James.Bottomley@HansenPartnership.com> wrote:

> That's odd ... it's behaving like a resource conflict.  However, the
> ports and interrupt trace didn't betray anything.  What does lspci -vv
> say for each of the devices?

Output from lspci -vv attached.

> Also, if you remove the sym2 module in the
> problem case, does the sungem come back to life?

No, once it is hosed it stays hosed until the next boot. Fiddling with the
wrong ioports maybe?

> I'm afraid I can't see anything relevant looking over the sym2 changes,
> so you might need to bisect this to identify the culprit.

Working on that, but it is a hassle as this bitty-box needs some time to
compile a kernel. 2.6.23-rc1 didn't boot, for starters.



* [Bug 10374] sym53c8xx: weird behavior with udev
       [not found] <bug-10374-11613@http.bugzilla.kernel.org/>
                   ` (7 preceding siblings ...)
  2008-04-01 20:57 ` bugme-daemon
@ 2008-04-01 21:14 ` bugme-daemon
  2008-04-01 22:30 ` bugme-daemon
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: bugme-daemon @ 2008-04-01 21:14 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=10374





------- Comment #8 from anonymous@kernel-bugs.osdl.org  2008-04-01 14:14 -------
Reply-To: James.Bottomley@HansenPartnership.com

On Tue, 2008-04-01 at 22:57 +0200, Jos van der Ende wrote:
> On Tue, 01 Apr 2008 15:19:29 -0500
> James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> 
> > That's odd ... it's behaving like a resource conflict.  However, the
> > ports and interrupt trace didn't betray anything.  What does lspci -vv
> > say for each of the devices?
> 
> Output from lspci -vv attached.

Thanks ... unfortunately looks normal too.  The gem has a single memory
region; the sym2 has 2 mem and one IO region, all of which show up in
the /proc/iomem|ports.

> > Also, if you remove the sym2 module in the
> > problem case, does the sungem come back to life?
> 
> No, once it is hosed it stays hosed until the next boot. Fiddling with the wrong ioports maybe?

Yes ... that's what I guess.  Just as one last grasp at a straw, is
there any difference in /proc/iomem or /proc/ioports for the working
case (sungem loaded first followed by sym2)?

> > I'm afraid I can't see anything relevant looking over the sym2 changes,
> > so you might need to bisect this to identify the culprit.
> 
> Working on that, but it is a hassle as this bitty-box needs some time to compile a kernel. 2.6.23-rc1 didn't boot, for starters.

Sorry ... can't think of much else that will help.

James



* [Bug 10374] sym53c8xx: weird behavior with udev
       [not found] <bug-10374-11613@http.bugzilla.kernel.org/>
                   ` (8 preceding siblings ...)
  2008-04-01 21:14 ` bugme-daemon
@ 2008-04-01 22:30 ` bugme-daemon
  2008-04-02 10:29 ` bugme-daemon
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: bugme-daemon @ 2008-04-01 22:30 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=10374





------- Comment #9 from seraph@xs4all.nl  2008-04-01 15:30 -------
On Tue, 01 Apr 2008 16:14:14 -0500
James Bottomley <James.Bottomley@HansenPartnership.com> wrote:

> Yes ... that's what I guess.  Just as one last grasp at a straw, is
> there any difference in /proc/iomem or /proc/ioports for the working
> case (sungem loaded first followed by sym2)?

Nope, exactly the same.

> > > I'm afraid I can't see anything relevant looking over the sym2 changes,
> > > so you might need to bisect this to identify the culprit.

First results are in: 2.6.23-rc1 could not boot, 2.6.23-rc2 already had the
problem.



* [Bug 10374] sym53c8xx: weird behavior with udev
       [not found] <bug-10374-11613@http.bugzilla.kernel.org/>
                   ` (9 preceding siblings ...)
  2008-04-01 22:30 ` bugme-daemon
@ 2008-04-02 10:29 ` bugme-daemon
  2008-04-02 12:07 ` bugme-daemon
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: bugme-daemon @ 2008-04-02 10:29 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=10374





------- Comment #10 from seraph@xs4all.nl  2008-04-02 03:29 -------
Maybe it's nothing, but I did notice that sym53c8xx compiles with a warning of
a possibly uninitialized variable:

  CC [M]  drivers/scsi/sym53c8xx_2/sym_glue.o
drivers/scsi/sym53c8xx_2/sym_glue.c: In function 'sym_eh_handler':
drivers/scsi/sym53c8xx_2/sym_glue.c:612: warning: 'io_reset' may be used uninitialized in this function



* [Bug 10374] sym53c8xx: weird behavior with udev
       [not found] <bug-10374-11613@http.bugzilla.kernel.org/>
                   ` (10 preceding siblings ...)
  2008-04-02 10:29 ` bugme-daemon
@ 2008-04-02 12:07 ` bugme-daemon
  2008-04-02 14:09 ` bugme-daemon
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: bugme-daemon @ 2008-04-02 12:07 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=10374





------- Comment #11 from matthew@wil.cx  2008-04-02 05:07 -------
On Wed, Apr 02, 2008 at 12:29:44PM +0200, Jos van der Ende wrote:
> Maybe it's nothing, but I did notice that sym53c8xx compiles with a warning of a possibly uninitialized variable:
> 
>   CC [M]  drivers/scsi/sym53c8xx_2/sym_glue.o
> drivers/scsi/sym53c8xx_2/sym_glue.c: In function 'sym_eh_handler':
> drivers/scsi/sym53c8xx_2/sym_glue.c:612: warning: 'io_reset' may be used uninitialized in this function

Yeah, that's nothing.  It's actually a bug in GCC that produces that
warning (and it's code that'll never be executed on your platform anyway).



* [Bug 10374] sym53c8xx: weird behavior with udev
       [not found] <bug-10374-11613@http.bugzilla.kernel.org/>
                   ` (11 preceding siblings ...)
  2008-04-02 12:07 ` bugme-daemon
@ 2008-04-02 14:09 ` bugme-daemon
  2008-04-02 15:50 ` bugme-daemon
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: bugme-daemon @ 2008-04-02 14:09 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=10374





------- Comment #12 from seraph@xs4all.nl  2008-04-02 07:09 -------
Finally, bisecting is done. :-)

Well, it took more reboots than a typical Windows XP installation (and thank
the heavens for my Sparc64 cross compiler on my Core 2 Duo), but this seems to
be the culprit:

5a606b72a4309a656cd1a19ad137dc5557c4b8ea is first bad commit
commit 5a606b72a4309a656cd1a19ad137dc5557c4b8ea
Author: David S. Miller <davem@sunset.davemloft.net>
Date:   Mon Jul 9 22:40:36 2007 -0700

    [SPARC64]: Do not ACK an INO if it is disabled or inprogress.

    This is also a partial workaround for a bug in the LDOM firmware which
    double-transmits RX inos during high load.  Without this, such an
    event causes the kernel to loop forever in the interrupt call chain
    ACK'ing but never actually running the IRQ handler (and thus clearing
    the interrupt condition in the device).

    There is still a bad potential effect when double INOs occur,
    not covered by this changeset.  Namely, if the INO is already on
    the per-cpu INO vector list, we still blindly re-insert it and
    thus we can end up losing interrupts already linked in after
    it.

    We could deal with that by traversing the list before insertion,
    but that's too expensive for this edge case.

    Signed-off-by: David S. Miller <davem@davemloft.net>

:040000 040000 7e65c9b16e6c37f2c3f83195c5a57b4d2b8f0a7c e7a7bedcc88d33793a6525e9337a1a51982bc513 M      arch
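
The search behind a result like this is a binary search over history. On a
real tree it is driven with git bisect start, git bisect good <rev> and
git bisect bad <rev>, building and booting each candidate in between. The
sketch below only illustrates the halving logic on a linear, ordered list of
revisions, which is a simplification because real history is a graph.
first_bad and is_good are hypothetical names, and the numbered revisions are
made up for illustration.

```shell
# first_bad TEST REV...: the REVs are ordered oldest to newest; the first is
# known good and the last is known bad. "TEST REV" must return 0 for a good
# revision and non-zero for a bad one. Prints the first bad revision.
first_bad() {
    test_cmd=$1
    shift
    lo=1     # position of the newest revision known to be good
    hi=$#    # position of the oldest revision known to be bad
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        eval "rev=\${$mid}"          # fetch the mid-th positional argument
        if "$test_cmd" "$rev"; then
            lo=$mid
        else
            hi=$mid
        fi
    done
    eval "echo \"\${$hi}\""
}

# Illustration: revisions 1..8, where everything from 6 onwards is broken.
is_good() { [ "$1" -lt 6 ]; }
first_bad is_good 1 2 3 4 5 6 7 8    # prints 6
```

Each pass halves the remaining range, so the number of build-and-boot cycles
grows with the logarithm of the number of candidate commits.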



* [Bug 10374] sym53c8xx: weird behavior with udev
       [not found] <bug-10374-11613@http.bugzilla.kernel.org/>
                   ` (12 preceding siblings ...)
  2008-04-02 14:09 ` bugme-daemon
@ 2008-04-02 15:50 ` bugme-daemon
  2008-04-02 16:07 ` bugme-daemon
  2008-05-17 15:52 ` bugme-daemon
  15 siblings, 0 replies; 16+ messages in thread
From: bugme-daemon @ 2008-04-02 15:50 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=10374





------- Comment #13 from anonymous@kernel-bugs.osdl.org  2008-04-02 08:50 -------
Reply-To: James.Bottomley@HansenPartnership.com

On Wed, 2008-04-02 at 16:09 +0200, Jos van der Ende wrote:
> Finally, bisecting is done. :-)

Thanks for doing this ... we'd never have found it by looking at the
driver code ...

> Well, it took more reboots than a typical Windows XP installation (and
> thank the heavens for my Sparc64 cross compiler on my Core 2 Duo), but
> this seems to be the culprit:
> 
> 5a606b72a4309a656cd1a19ad137dc5557c4b8ea is first bad commit

Reading the code for this, it seems that something fiddled with the
IRQ_DISABLED or IRQ_PENDING flags when it came time for the ->eoi() so
the gem interrupt is always held pending (because it's never ended).

Since the sym2 is on interrupts 16 and 17 and gem on 11 (and the
descriptors are separate entities in the irq_desc array) I can't really
see how sym2 would be doing this.

James



* [Bug 10374] sym53c8xx: weird behavior with udev
       [not found] <bug-10374-11613@http.bugzilla.kernel.org/>
                   ` (13 preceding siblings ...)
  2008-04-02 15:50 ` bugme-daemon
@ 2008-04-02 16:07 ` bugme-daemon
  2008-05-17 15:52 ` bugme-daemon
  15 siblings, 0 replies; 16+ messages in thread
From: bugme-daemon @ 2008-04-02 16:07 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=10374





------- Comment #14 from seraph@xs4all.nl  2008-04-02 09:07 -------
On Wed, 02 Apr 2008 10:49:19 -0500
James Bottomley <James.Bottomley@HansenPartnership.com> wrote:

> Reading the code for this, it seems that something fiddled with the
> IRQ_DISABLED or IRQ_PENDING flags when it came time for the ->eoi() so
> the gem interrupt is always held pending (because it's never ended).

So Andrew's first hunch that interrupts were somehow involved is right.

> Since the sym2 is on interrupts 16 and 17 and gem on 11 (and the
> descriptors are separate entities in the irq_desc array) I can't really
> see how sym2 would be doing this.

Yeah, that has me baffled too. Still, the fact is that I can only trigger the
bug by loading sym53c8xx before sungem. I have yet to find any other conditions
that trigger it. Loading sungem before sym53c8xx on an affected kernel gives no
trouble at all.



* [Bug 10374] sym53c8xx: weird behavior with udev
       [not found] <bug-10374-11613@http.bugzilla.kernel.org/>
                   ` (14 preceding siblings ...)
  2008-04-02 16:07 ` bugme-daemon
@ 2008-05-17 15:52 ` bugme-daemon
  15 siblings, 0 replies; 16+ messages in thread
From: bugme-daemon @ 2008-05-17 15:52 UTC (permalink / raw)
  To: linux-scsi

http://bugzilla.kernel.org/show_bug.cgi?id=10374


seraph@xs4all.nl changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |CLOSED
         Resolution|                            |CODE_FIX




------- Comment #15 from seraph@xs4all.nl  2008-05-17 08:52 -------
This is fixed by commit a5f56179c861a23a2f08409711926ff812f30d38 in 2.6.25.4.

