* Linux 2.6.9 Adaptec 4 Port Starfire Sickness
@ 2005-04-03  4:41 jmerkey
  2005-04-03  5:47 ` Willy Tarreau
  2005-04-03  7:26 ` Jeff Garzik
  0 siblings, 2 replies; 7+ messages in thread
From: jmerkey @ 2005-04-03  4:41 UTC
To: linux-kernel

With Linux 2.6.9 running at 192 MB/s network loading, and
protocol-splitting drivers routing packets out of a 2.6.9 device at the
full 100 Mb/s (12.5 MB/s) simultaneously over 4 ports, the Adaptec
starfire driver goes into constant Tx FIFO reconfiguration mode. It
keeps resetting the Tx FIFO window and generates a deluge of messages
such as:

    ethX: PCI bus congestion, resetting Tx FIFO window to X bytes

pouring into the system log file at a rate of a dozen per minute. After
3-4 days of this, the PCI bus totally locks up and hangs the system.

Need a config option to allow the starfire to disable this feature. At
very high bus loading rates, with protocol splitting and routing, the
starfire card will completely lock the bus after 3-4 days of constant
Tx FIFO reconfiguration.

Jeff
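The figures in the report above can be checked with a little
arithmetic. The sketch below is only an illustration; the function
names are made up for this note and do not come from the thread or
from any driver. It converts the per-port link rate to MB/s, sums it
over the ports, and estimates how many log lines accumulate at a steady
per-minute rate.

```c
#include <assert.h>

/* Convert a link rate in megabits per second to megabytes per second. */
static double mbit_to_mbyte(double mbit_per_s)
{
    assert(mbit_per_s >= 0.0);
    return mbit_per_s / 8.0;
}

/* Aggregate rate in MB/s when every port runs at the same link rate. */
static double aggregate_mbyte(double mbit_per_s, int ports)
{
    assert(ports > 0);
    return mbit_to_mbyte(mbit_per_s) * ports;
}

/* Log lines produced at a steady per-minute rate over whole days. */
static long log_lines(long per_minute, long days)
{
    assert(per_minute >= 0 && days >= 0);
    return per_minute * 60L * 24L * days;
}
```

With the numbers from the report, aggregate_mbyte(100.0, 4) gives
50 MB/s of transmit traffic across the four ports, and log_lines(12, 3)
and log_lines(12, 4) give 51840 and 69120 messages, i.e. roughly 50,000
to 70,000 congestion messages before the hang.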
* Re: Linux 2.6.9 Adaptec 4 Port Starfire Sickness
  2005-04-03  4:41 Linux 2.6.9 Adaptec 4 Port Starfire Sickness jmerkey
@ 2005-04-03  5:47 ` Willy Tarreau
  2005-04-03  6:58   ` jmerkey
  2005-04-03  7:26 ` Jeff Garzik
  1 sibling, 1 reply; 7+ messages in thread
From: Willy Tarreau @ 2005-04-03  5:47 UTC
To: jmerkey; +Cc: linux-kernel

Hi Jeff,

I've also experienced those messages under 2.4, but they were harmless,
and I never had a machine hang even after weeks of full load (the
adapter was mounted on a stress test machine before being used in
firewalls for months).

So I wonder how you can be sure that it is this driver which finally
locks the bus. Perhaps the system locks up for some other reason (e.g.
a race condition). Have you tried any other 4-port NIC (tulip or Sun,
for example)? A Sun QFE would be the most interesting to test, as it
also supports 64 bits / 66 MHz.

Regards,
Willy
* Re: Linux 2.6.9 Adaptec 4 Port Starfire Sickness
  2005-04-03  5:47 ` Willy Tarreau
@ 2005-04-03  6:58   ` jmerkey
  2005-04-03  7:38     ` Willy Tarreau
  0 siblings, 1 reply; 7+ messages in thread
From: jmerkey @ 2005-04-03  6:58 UTC
To: Willy Tarreau; +Cc: linux-kernel

It works fine with the Intel Dual Port Pro-1000 MT adapters without
these problems. My test scenarios use jumbo frames as well. I am
guessing the PCI bus contention is high due to the disk I/O bandwidth,
and this is causing conditions the adapter does not normally see.
Documentation states that this message should be very rare, and should
not spool off into the logs at this rate.

See http://www.ibiblio.org/mdw/HOWTO/Ethernet-HOWTO-8.html

Jeff
* Re: Linux 2.6.9 Adaptec 4 Port Starfire Sickness
  2005-04-03  6:58   ` jmerkey
@ 2005-04-03  7:38     ` Willy Tarreau
  2005-04-03  7:21       ` jmerkey
  0 siblings, 1 reply; 7+ messages in thread
From: Willy Tarreau @ 2005-04-03  7:38 UTC
To: jmerkey; +Cc: linux-kernel

On Sat, Apr 02, 2005 at 11:58:44PM -0700, jmerkey wrote:
> It works fine with the Intel Dual Port Pro-1000 MT adapters without
> these problems.

But unless I'm mistaken, there's no PCI bridge on this board, and it is
possible that the two ports share the same IRQ. That's why I suggested
trying a 4-port Sun QFE or something that is more similar to the
starfire.

> I am using testing scenarios with Jumbo Frames as well. I am guessing
> the PCI bus contention is high due to the disk I/O bandwidth and this
> is causing conditions the adapter does not normally see.

As I said, I have been saturating this card for weeks during stress
tests, and although it spat out lots of messages, it never hung (at
least on recent 2.4 kernels; very early 2.4 kernels were a real pain
with this one).

> Documentation states that this message should be very rare, and not
> spool off into the logs at this rate.

Perhaps you have a mix of small and large frames which makes the driver
constantly change the FIFO size, and this part is not handled properly?

Willy
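One way to check the guess that the FIFO size keeps changing back and
forth is to pull the window sizes out of the log and count how often
they reverse direction. The sketch below is an illustration only. It
assumes the log lines follow the format quoted in the original report
("ethX: PCI bus congestion, resetting Tx FIFO window to X bytes"); the
exact wording the driver prints may differ, and both function names are
hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one log line of the form quoted in the report:
 *   "<ifname>: PCI bus congestion, resetting Tx FIFO window to <n> bytes"
 * Any syslog prefix before the interface name is skipped.
 * Returns 1 on a match (ifname and *bytes are filled in), 0 otherwise.
 */
static int parse_congestion_line(const char *line, char ifname[16], int *bytes)
{
    static const char marker[] =
        ": PCI bus congestion, resetting Tx FIFO window to ";
    const char *hit, *start;
    size_t len;

    assert(line != NULL && ifname != NULL && bytes != NULL);

    hit = strstr(line, marker);
    if (hit == NULL)
        return 0;

    /* Walk back from the marker to the start of the interface name. */
    start = hit;
    while (start > line && start[-1] != ' ')
        start--;
    len = (size_t)(hit - start);
    if (len == 0 || len >= 16)
        return 0;
    memcpy(ifname, start, len);
    ifname[len] = '\0';

    /* The byte count follows the marker directly. */
    return sscanf(hit + sizeof(marker) - 1, "%d", bytes) == 1;
}

/*
 * Count how often a sequence of window sizes reverses direction
 * (a rise followed by a fall, or a fall followed by a rise).
 * Repeated equal values are ignored.
 */
static int count_direction_changes(const int *sizes, int n)
{
    int i, changes = 0, prev_dir = 0;

    assert(sizes != NULL || n == 0);
    for (i = 1; i < n; i++) {
        int dir = (sizes[i] > sizes[i - 1]) - (sizes[i] < sizes[i - 1]);
        if (dir == 0)
            continue;
        if (prev_dir != 0 && dir != prev_dir)
            changes++;
        prev_dir = dir;
    }
    return changes;
}
```

A sequence that only ever grows yields zero reversals, whereas a window
that oscillates between two values yields a reversal on nearly every
message, which would support the mixed-frame-size explanation.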
* Re: Linux 2.6.9 Adaptec 4 Port Starfire Sickness
  2005-04-03  7:38     ` Willy Tarreau
@ 2005-04-03  7:21       ` jmerkey
  0 siblings, 0 replies; 7+ messages in thread
From: jmerkey @ 2005-04-03  7:21 UTC
To: Willy Tarreau; +Cc: linux-kernel

I disabled the FIFO resetting code and am running tests to see what
happens. I am on 2.6, not 2.4, so it could be a problem there. At any
rate, I will see if the problem goes away.

Jeff
* Re: Linux 2.6.9 Adaptec 4 Port Starfire Sickness
  2005-04-03  4:41 Linux 2.6.9 Adaptec 4 Port Starfire Sickness jmerkey
  2005-04-03  5:47 ` Willy Tarreau
@ 2005-04-03  7:26 ` Jeff Garzik
  2005-04-03  7:07   ` jmerkey
  1 sibling, 1 reply; 7+ messages in thread
From: Jeff Garzik @ 2005-04-03  7:26 UTC
To: jmerkey; +Cc: linux-kernel

jmerkey wrote:
> After several days, the PCI bus totally locks up and hangs the system.
> Need a config option to allow the starfire to disable this feature.

The feature doesn't need disabling; just modify the driver to stop the
flapping.

Jeff
* Re: Linux 2.6.9 Adaptec 4 Port Starfire Sickness
  2005-04-03  7:26 ` Jeff Garzik
@ 2005-04-03  7:07   ` jmerkey
  0 siblings, 0 replies; 7+ messages in thread
From: jmerkey @ 2005-04-03  7:07 UTC
To: Jeff Garzik; +Cc: linux-kernel

Jeff Garzik wrote:
> The feature doesn't need disabling; just modify the driver to stop the
> flapping.

I am going to try to just turn off the Tx FIFO setting in the code
completely and see if this helps, not just the message. See what
happens ...

Jeff