* sk98lin for 2.6.23-rc1
@ 2007-07-26 15:16 Kyle Rose
2007-07-26 16:28 ` Jan Engelhardt
` (3 more replies)
0 siblings, 4 replies; 35+ messages in thread
From: Kyle Rose @ 2007-07-26 15:16 UTC (permalink / raw)
To: linux-kernel
From http://www.krose.org/~krose/computing.html:
Since the sky2 driver continues to suck ass (which is a technical
description for "it hangs all the time under load, at least on my
hardware" :-) ), I've fixed the sk98lin driver to compile for
linux-2.6.23-rc1. Those who continue to have problems with sky2 can
still use 2.6.23-rc1, simply by doing the following:
1. Make sure you have the headers for your kernel properly installed
   and linked to /usr/src/linux-$KVER.
2. Download the sk98lin source from Marvell's site
   <http://www.marvell.com/drivers/search.do>.
3. Untar the driver and run the install.sh according to the
   directions. It will fail.
4. Look in /tmp for a directory called Sk98something. Go to
   http://www.krose.org/~krose/projects/sk98lin/ and copy the
   Makefile <http://www.krose.org/%7Ekrose/projects/sk98lin/Makefile>
   and sky2.c <http://www.krose.org/%7Ekrose/projects/sk98lin/sky2.c>
   into /tmp/Sk98something/all.
5. Change into /tmp/Sk98something/all and execute:

       sudo -H make -C /usr/src/linux-$KVER M=`pwd` modules
       sudo -H make -C /usr/src/linux-$KVER M=`pwd` modules_install

6. Blacklist sky2 in /etc/modprobe.d/blacklist, and (maybe not
   necessary) manually load sk98lin in /etc/modules.
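Steps 5 and 6 can be sketched as a small shell helper. This is only an
illustration under stated assumptions: the function names
(module_build_cmds, blacklist_line, run_build) and the DRY_RUN switch are
hypothetical and come from neither the post nor Marvell's installer, and
the kernel tree and module directory are passed in rather than guessed.

```shell
#!/bin/sh
# Illustrative sketch of steps 5 and 6; all helper names are hypothetical.

# Print the two kbuild invocations for an external module directory.
#   $1 = kernel tree (e.g. /usr/src/linux-$KVER), $2 = module directory
# Paths containing spaces are not handled in this sketch.
module_build_cmds() {
    echo "make -C $1 M=$2 modules"
    echo "make -C $1 M=$2 modules_install"
}

# Print the modprobe.d line that keeps a module from being auto-loaded.
#   $1 = module name (sky2 in the post)
blacklist_line() {
    echo "blacklist $1"
}

# Run each build command as root, or only print it when DRY_RUN=1
# (the default here, so nothing is changed by accident).
run_build() {
    module_build_cmds "$1" "$2" | while read -r cmd; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "would run: sudo -H $cmd"
        else
            # </dev/null keeps the command from eating the loop's input
            sudo -H sh -c "$cmd" </dev/null || exit 1
        fi
    done
}
```

Calling run_build "/usr/src/linux-$KVER" "$(pwd)" prints what would be
run; setting DRY_RUN=0 executes it. The output of blacklist_line sky2 is
the line that would be appended to /etc/modprobe.d/blacklist.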
There. You're done. Stable networking at last... er, again.
Unfortunately, you lose the nicest differential feature of
sky2---WOL---but that's a small price to pay for networking stability of
a desktop machine. It's nice to be able to watch MythTV again without
having to sudo bash -c 'ifdown eth0; rmmod sky2; modprobe sky2; ifup
eth0' every few minutes.
Personally, I'd like to see sk98lin remain in the kernel proper until
sky2 goes at least 6 months without reported problems. The fact that I
am not the only one still seeing issues is a clear indication that sky2
(even with the recent patches in 2.6.23-rc1) is not yet ready to replace
sk98lin.
I'm happy to help debug the remaining issues with sky2, Stephen; just
let me know what information you need.
Kyle
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: sk98lin for 2.6.23-rc1
2007-07-26 15:16 sk98lin for 2.6.23-rc1 Kyle Rose
@ 2007-07-26 16:28 ` Jan Engelhardt
2007-07-26 16:30 ` Kyle Rose
2007-07-26 16:57 ` Adrian Bunk
` (2 subsequent siblings)
3 siblings, 1 reply; 35+ messages in thread
From: Jan Engelhardt @ 2007-07-26 16:28 UTC (permalink / raw)
To: Kyle Rose; +Cc: linux-kernel

On Jul 26 2007 11:16, Kyle Rose wrote:
> 1. Make sure you have the headers for your kernel properly installed
>    and linked to /usr/src/linux-$KVER.

Why is this a requirement? Makefile not properly done?

> 4. Look in /tmp for a directory called Sk98something. Go to

Why /tmp? If untarred (with default options) in ~, it's in ~/tmp.

> 5. Change into /tmp/Sk98something/all and execute:
>
>        sudo -H make -C /usr/src/linux-$KVER M=`pwd` modules
>        sudo -H make -C /usr/src/linux-$KVER M=`pwd` modules_install

This breaks with O= builds. See (1).
Sorry for the nitpick, it can be done easier :)

Jan
--
* Re: sk98lin for 2.6.23-rc1
2007-07-26 16:28 ` Jan Engelhardt
@ 2007-07-26 16:30 ` Kyle Rose
2007-07-26 16:41 ` Jan Engelhardt
0 siblings, 1 reply; 35+ messages in thread
From: Kyle Rose @ 2007-07-26 16:30 UTC (permalink / raw)
To: Jan Engelhardt; +Cc: Kyle Rose, linux-kernel

> Sorry for the nitpick, it can be done easier :)

I'm sure it can. I didn't want to have to figure out the kernel build
system just to get this one driver working. Hence my desire for it to
remain in the kernel proper until sky2 utterly works. ;-)

Kyle
* Re: sk98lin for 2.6.23-rc1
2007-07-26 16:30 ` Kyle Rose
@ 2007-07-26 16:41 ` Jan Engelhardt
2007-07-27 1:07 ` Kyle Rose
0 siblings, 1 reply; 35+ messages in thread
From: Jan Engelhardt @ 2007-07-26 16:41 UTC (permalink / raw)
To: Kyle Rose; +Cc: Kyle Rose, linux-kernel

On Jul 26 2007 12:30, Kyle Rose wrote:
>> Sorry for the nitpick, it can be done easier :)
>
>I'm sure it can. I didn't want to have to figure out the kernel build
>system just to get this one driver working. Hence my desire for it to
>remain in the kernel proper until sky2 utterly works. ;-)

Oh it's really easy, have a look at
https://dev.computergmbh.de/svn/misc_kernel/oopser/trunk/Makefile

Jan
--
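For readers who cannot reach the linked Makefile, the general shape of
an external-module Makefile that also copes with O= builds can be
sketched as below. This is not the file Jan links to and not the one in
Kyle's tarball; it follows the commonly documented kbuild pattern of
building against /lib/modules/<release>/build, and the helper names
(kbuild_dir, emit_module_makefile) are hypothetical. A multi-file driver
would additionally need a "<name>-objs" list, which this sketch leaves
out.

```shell
#!/bin/sh
# Illustrative only: helper names are assumptions, not from the thread.

# Print the conventional build directory for a kernel release.
# "make modules_install" leaves a "build" symlink there pointing at the
# tree (or O= output directory) the kernel was built in.
#   $1 = kernel release (defaults to the running kernel)
kbuild_dir() {
    kver=${1:-$(uname -r)}
    echo "/lib/modules/$kver/build"
}

# Emit a minimal external-module Makefile on stdout.
#   $1 = object name without suffix (a single-source-file module)
emit_module_makefile() {
    printf 'ifneq ($(KERNELRELEASE),)\n'
    printf 'obj-m := %s.o\n' "$1"
    printf 'else\n'
    printf 'KDIR ?= /lib/modules/$(shell uname -r)/build\n'
    printf 'modules modules_install clean:\n'
    # Recipe lines in a Makefile must start with a tab character.
    printf '\t$(MAKE) -C $(KDIR) M=$(CURDIR) $@\n'
    printf 'endif\n'
}
```

Something like emit_module_makefile mydriver > Makefile followed by
"make modules" would then build against whatever tree kbuild_dir reports,
without requiring a /usr/src/linux-$KVER symlink.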
* Re: sk98lin for 2.6.23-rc1
2007-07-26 16:41 ` Jan Engelhardt
@ 2007-07-27 1:07 ` Kyle Rose
0 siblings, 0 replies; 35+ messages in thread
From: Kyle Rose @ 2007-07-27 1:07 UTC (permalink / raw)
To: Jan Engelhardt; +Cc: Kyle Rose, linux-kernel

Thanks for the pointer. I've done this, and created an actual kernel
module tarball that is now available at
http://www.krose.org/~krose/projects/sk98lin/sk98lin.tar.gz.

Thanks,
Kyle

Jan Engelhardt wrote:
> On Jul 26 2007 12:30, Kyle Rose wrote:
>>> Sorry for the nitpick, it can be done easier :)
>> I'm sure it can. I didn't want to have to figure out the kernel build
>> system just to get this one driver working. Hence my desire for it to
>> remain in the kernel proper until sky2 utterly works. ;-)
>
> Oh it's really easy, have a look at
> https://dev.computergmbh.de/svn/misc_kernel/oopser/trunk/Makefile
>
>
> Jan
* Re: sk98lin for 2.6.23-rc1
2007-07-26 15:16 sk98lin for 2.6.23-rc1 Kyle Rose
2007-07-26 16:28 ` Jan Engelhardt
@ 2007-07-26 16:57 ` Adrian Bunk
2007-07-26 22:58 ` Chris Stromsoe
` (2 more replies)
2007-07-26 19:17 ` Stephen Hemminger
2007-07-26 23:52 ` Bill Davidsen
3 siblings, 3 replies; 35+ messages in thread
From: Adrian Bunk @ 2007-07-26 16:57 UTC (permalink / raw)
To: Kyle Rose; +Cc: linux-kernel

On Thu, Jul 26, 2007 at 11:16:36AM -0400, Kyle Rose wrote:
> From http://www.krose.org/~krose/computing.html:
>
> Since the sky2 driver continues to suck ass (which is a technical
> description for "it hangs all the time under load, at least on my
> hardware" :-) ), I've fixed the sk98lin driver to compile for
> linux-2.6.23-rc1. Those who continue to have problems with sky2 can
> still use 2.6.23-rc1, simply by doing the following:
>...
> Personally, I'd like to see sk98lin remain in the kernel proper until
> sky2 goes at least 6 months without reported problems. The fact that I
> am not the only one still seeing issues is a clear indication that sky2
> (even with the recent patches in 2.6.23-rc1) is not yet ready to replace
> sk98lin.
>...

This sounds good in theory.

The practical problem with this approach is that there are always many
people who use the old driver when the new driver doesn't work for them
instead of reporting their problems with the new driver.

For these people a new driver will often suck when the old driver gets
removed, but after the removal of the old driver they are finally forced
to report their bugs resulting in a better new driver for everyone.

The sky2 driver has been in the kernel for nearly 2 years and Stephen is
usually quite good at handling bugs.

> Kyle

cu
Adrian

--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
* Re: sk98lin for 2.6.23-rc1
2007-07-26 16:57 ` Adrian Bunk
@ 2007-07-26 22:58 ` Chris Stromsoe
2007-07-26 23:38 ` Bill Davidsen
2007-07-30 3:01 ` Rob Sims
2 siblings, 0 replies; 35+ messages in thread
From: Chris Stromsoe @ 2007-07-26 22:58 UTC (permalink / raw)
To: Adrian Bunk; +Cc: Kyle Rose, linux-kernel

On Thu, 26 Jul 2007, Adrian Bunk wrote:
> On Thu, Jul 26, 2007 at 11:16:36AM -0400, Kyle Rose wrote:
>>> From http://www.krose.org/~krose/computing.html:
>>
>> Since the sky2 driver continues to suck ass (which is a technical
>> description for "it hangs all the time under load, at least on my
>> hardware" :-) ), I've fixed the sk98lin driver to compile for
>> linux-2.6.23-rc1. Those who continue to have problems with sky2 can
>> still use 2.6.23-rc1, simply by doing the following: ... Personally,
>> I'd like to see sk98lin remain in the kernel proper until sky2 goes at
>> least 6 months without reported problems. The fact that I am not the
>> only one still seeing issues is a clear indication that sky2 (even with
>> the recent patches in 2.6.23-rc1) is not yet ready to replace sk98lin.
>> ...
>
> This sounds good in theory.
>
> The practical problem with this approach is that there are always many
> people who use the old driver when the new driver doesn't work for them
> instead of reporting their problems with the new driver.

I have a number of SK-9844 "SK-NET GE-SX dual link" cards. skge has
never worked with the cards. The following sequence locks up the machine
completely (power cycle to get it back) with 2.6.22.1:

    fresno:~# modprobe skge
    fresno:~# ip li set eth2 up
    fresno:~# ip li set eth2 down
    fresno:~# ip li set eth3 up

This works just fine:

    fresno:~# rmmod skge
    fresno:~# modprobe sk98lin RlmtMode=DualNet
    fresno:~# ip li set eth2 up
    fresno:~# ip li set eth2 down
    fresno:~# ip li set eth3 up
    fresno:~# ip li set eth3 down

eth2 and eth3 are ports off the sk-9844. I've been reporting the problem
since March. If sk98lin is removed, I won't have networking.

-Chris
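The reproduction sequence above can be turned into a small generator so
the same steps are replayed identically for each driver in a bug report.
This is a sketch only: link_toggle_cmds is a hypothetical helper, not
something used in the thread, and it merely prints commands ("ip link"
is the unabbreviated form of "ip li"). Note that with skge the reporter's
machine locked up at "eth3 up", so the later lines would never be
reached there.

```shell
#!/bin/sh
# Illustrative only: prints the load/up/down sequence, runs nothing.

# Print a module load followed by an up/down toggle of each interface.
#   $1 = module name
#   $2 = module parameters (pass "" for none)
#   remaining arguments = interfaces, toggled in the order given
link_toggle_cmds() {
    module=$1
    params=$2
    shift 2
    echo "modprobe $module${params:+ $params}"
    for iface in "$@"; do
        echo "ip link set $iface up"
        echo "ip link set $iface down"
    done
}
```

For example, link_toggle_cmds sk98lin RlmtMode=DualNet eth2 eth3 prints
the working sequence; piping the output to a root shell would execute
it, which should only be tried on a machine that can be power-cycled.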
* Re: sk98lin for 2.6.23-rc1
2007-07-26 16:57 ` Adrian Bunk
2007-07-26 22:58 ` Chris Stromsoe
@ 2007-07-26 23:38 ` Bill Davidsen
2007-07-26 23:41 ` Jeff Garzik
2007-07-30 3:01 ` Rob Sims
2 siblings, 1 reply; 35+ messages in thread
From: Bill Davidsen @ 2007-07-26 23:38 UTC (permalink / raw)
To: Adrian Bunk; +Cc: Kyle Rose, linux-kernel

Adrian Bunk wrote:
> On Thu, Jul 26, 2007 at 11:16:36AM -0400, Kyle Rose wrote:
>> From http://www.krose.org/~krose/computing.html:
>>
>> Since the sky2 driver continues to suck ass (which is a technical
>> description for "it hangs all the time under load, at least on my
>> hardware" :-) ), I've fixed the sk98lin driver to compile for
>> linux-2.6.23-rc1. Those who continue to have problems with sky2 can
>> still use 2.6.23-rc1, simply by doing the following:
>> ...
>> Personally, I'd like to see sk98lin remain in the kernel proper until
>> sky2 goes at least 6 months without reported problems. The fact that I
>> am not the only one still seeing issues is a clear indication that sky2
>> (even with the recent patches in 2.6.23-rc1) is not yet ready to replace
>> sk98lin.
>> ...
>
> This sounds good in theory.
>
> The practical problem with this approach is that there are always many
> people who use the old driver when the new driver doesn't work for them
> instead of reporting their problems with the new driver.
>
Yes, you've grasped the reason for leaving the old driver in, so people
can use their computers. Because when there is a new driver for
previously unsupported hardware people will be glad to put time into
debugging it to make the hardware useful. But when you take out a
working driver because you (i.e. the responsible developer) have a new
idea which interests you, users don't want to use it because they have
something which works, so you take out the working driver to make work
for the users and create what you call a "better new driver" below.

The old driver wasn't requiring any resources to maintain, the old
hardware wasn't changing, there was no particular benefit to users in
breaking their configuration. This disregard for the users just gives
Linux critics an arguing point, "the next new kernel may withdraw
support for your hardware." Isn't that why 2.6.16 is still being
maintained? Nobody (sane) expects new drivers to be perfect, they just
don't expect the working drivers to be disabled.

> For these people a new driver will often suck when the old driver gets
> removed, but after the removal of the old driver they are finally forced
> to report their bugs resulting in a better new driver for everyone.
>
"Better" is a very subjective thing, you see elegance of design perhaps,
I see works or not, and when I have to use statistical methods to see
latency or CPU overhead benefits, I frankly don't care. Removing a
working driver without a fully functional replacement forces people to
stop upgrading their kernel, or start maintaining old drivers out of
line. Problems of the "just occasionally goes away" type can take months
to debug, the load can't be duplicated in most cases, and there's no log
or oops data to help.

> The sky2 driver has been in the kernel for nearly 2 years and Stephen is
> usually quite good at handling bugs.
>
Where does sky2 come in? Does this mean the recent suggestion to "just
change to skge and stop complaining" is also wrong?

--
Bill Davidsen <davidsen@tmr.com>
"We have more to fear from the bungling of the incompetent than from
the machinations of the wicked." - from Slashdot
* Re: sk98lin for 2.6.23-rc1
2007-07-26 23:38 ` Bill Davidsen
@ 2007-07-26 23:41 ` Jeff Garzik
0 siblings, 0 replies; 35+ messages in thread
From: Jeff Garzik @ 2007-07-26 23:41 UTC (permalink / raw)
To: Bill Davidsen; +Cc: Adrian Bunk, Kyle Rose, linux-kernel

Bill Davidsen wrote:
> The old driver wasn't requiring any resources to maintain, the old

This statement proves you don't know anything at all about the situation.

Jeff
* Re: sk98lin for 2.6.23-rc1
2007-07-26 16:57 ` Adrian Bunk
2007-07-26 22:58 ` Chris Stromsoe
2007-07-26 23:38 ` Bill Davidsen
@ 2007-07-30 3:01 ` Rob Sims
2007-09-05 9:22 ` Stephen Hemminger
2 siblings, 1 reply; 35+ messages in thread
From: Rob Sims @ 2007-07-30 3:01 UTC (permalink / raw)
To: Adrian Bunk; +Cc: Kyle Rose, linux-kernel

On Thu, Jul 26, 2007 at 06:57:01PM +0200, Adrian Bunk wrote:
> On Thu, Jul 26, 2007 at 11:16:36AM -0400, Kyle Rose wrote:
> > From http://www.krose.org/~krose/computing.html:
> >
> > Since the sky2 driver continues to suck ass (which is a technical
> > description for "it hangs all the time under load, at least on my
> > hardware" :-) ), I've fixed the sk98lin driver to compile for
> > linux-2.6.23-rc1. Those who continue to have problems with sky2 can
> > still use 2.6.23-rc1, simply by doing the following:
> >...
> > Personally, I'd like to see sk98lin remain in the kernel proper until
> > sky2 goes at least 6 months without reported problems. The fact that I
> > am not the only one still seeing issues is a clear indication that sky2
> > (even with the recent patches in 2.6.23-rc1) is not yet ready to replace
> > sk98lin.
> >...
>
> This sounds good in theory.
>
> The practical problem with this approach is that there are always many
> people who use the old driver when the new driver doesn't work for them
> instead of reporting their problems with the new driver.
>
> For these people a new driver will often suck when the old driver gets
> removed, but after the removal of the old driver they are finally forced
> to report their bugs resulting in a better new driver for everyone.
>
> The sky2 driver has been in the kernel for nearly 2 years and Stephen is
> usually quite good at handling bugs.

The driver still (2.6.20/sky2 1.13) hangs for me (more rarely than in
the past), and cycling the module generally fixes the issues. I have
supplied all the information that Stephen has asked for, but still no
resolution. I am not complaining about the lack of a fix, but don't
assume that all it takes to get sky2 working is adequate bug reports. I
have been and remain willing to test and assist debug, but after several
dropped threads, I feel like the desire or ability to fix this issue
isn't there (and remote debug of an intermittent hardware issue IS
hard), and I didn't want to be a nuisance to someone that has no
obligation to me to address the issue in the first place.

Stability has improved, it's just not there yet.

I'll switch to 1.16 soon, and respond to Stephen's request on netdev for
current issues.
--
Rob
* Re: sk98lin for 2.6.23-rc1
2007-07-30 3:01 ` Rob Sims
@ 2007-09-05 9:22 ` Stephen Hemminger
2007-09-05 19:42 ` James Corey
2007-09-12 16:46 ` Torsten Kaiser
0 siblings, 2 replies; 35+ messages in thread
From: Stephen Hemminger @ 2007-09-05 9:22 UTC (permalink / raw)
To: Rob Sims; +Cc: Adrian Bunk, Kyle Rose, linux-kernel

On Sun, 29 Jul 2007 21:01:30 -0600
Rob Sims <lkml-z@robsims.com> wrote:

> On Thu, Jul 26, 2007 at 06:57:01PM +0200, Adrian Bunk wrote:
> > On Thu, Jul 26, 2007 at 11:16:36AM -0400, Kyle Rose wrote:
> > > From http://www.krose.org/~krose/computing.html:
> > >
> > > Since the sky2 driver continues to suck ass (which is a technical
> > > description for "it hangs all the time under load, at least on my
> > > hardware" :-) ), I've fixed the sk98lin driver to compile for
> > > linux-2.6.23-rc1. Those who continue to have problems with sky2 can
> > > still use 2.6.23-rc1, simply by doing the following:
> > >...
> > > Personally, I'd like to see sk98lin remain in the kernel proper until
> > > sky2 goes at least 6 months without reported problems. The fact that I
> > > am not the only one still seeing issues is a clear indication that sky2
> > > (even with the recent patches in 2.6.23-rc1) is not yet ready to replace
> > > sk98lin.
> > >...
> >
> > This sounds good in theory.
> >
> > The practical problem with this approach is that there are always many
> > people who use the old driver when the new driver doesn't work for them
> > instead of reporting their problems with the new driver.
> >
> > For these people a new driver will often suck when the old driver gets
> > removed, but after the removal of the old driver they are finally forced
> > to report their bugs resulting in a better new driver for everyone.
> >
> > The sky2 driver has been in the kernel for nearly 2 years and Stephen is
> > usually quite good at handling bugs.
>
> The driver still (2.6.20/sky2 1.13) hangs for me (more rarely than in
> the past), and cycling the module generally fixes the issues. I have
> supplied all the information that Stephen has asked for, but still no
> resolution. I am not complaining about the lack of a fix, but don't
> assume that all it takes to get sky2 working is adequate bug reports. I
> have been and remain willing to test and assist debug, but after several
> dropped threads, I feel like the desire or ability to fix this issue
> isn't there (and remote debug of an intermittent hardware issue IS
> hard), and I didn't want to be a nuisance to someone that has no
> obligation to me to address the issue in the first place.
>
> Stability has improved, it's just not there yet.
>
> I'll switch to 1.16 soon, and respond to Stephen's request on netdev for
> current issues.
> --
> Rob

The only known outstanding problems on 2.6.22.6 of sky2 are:
* problems with fibre PHY based systems
* suspend/resume issues, missing multicast reinitialization, etc.

The previous stability problems have been addressed.
* Re: sk98lin for 2.6.23-rc1
2007-09-05 9:22 ` Stephen Hemminger
@ 2007-09-05 19:42 ` James Corey
2007-09-05 21:04 ` Kyle Rose
2007-09-08 17:44 ` Bill Davidsen
2007-09-12 16:46 ` Torsten Kaiser
1 sibling, 2 replies; 35+ messages in thread
From: James Corey @ 2007-09-05 19:42 UTC (permalink / raw)
To: Stephen Hemminger, Rob Sims; +Cc: Adrian Bunk, Kyle Rose, linux-kernel

--- Stephen Hemminger <shemminger@linux-foundation.org> wrote:

> On Sun, 29 Jul 2007 21:01:30 -0600
> Rob Sims <lkml-z@robsims.com> wrote:
>
> > On Thu, Jul 26, 2007 at 06:57:01PM +0200, Adrian Bunk wrote:
> > > On Thu, Jul 26, 2007 at 11:16:36AM -0400, Kyle Rose wrote:
> > > > From http://www.krose.org/~krose/computing.html:
> > > >
> > > > Since the sky2 driver continues to suck ass (which is a technical
> > > > description for "it hangs all the time under load, at least on my
> > > > hardware" :-) ), I've fixed the sk98lin driver to compile for
> > > > linux-2.6.23-rc1. Those who continue to have problems with sky2 can
> > > > still use 2.6.23-rc1, simply by doing the following:
> > > >...
> > > > Personally, I'd like to see sk98lin remain in the kernel proper until
> > > > sky2 goes at least 6 months without reported problems. The fact that I
> > > > am not the only one still seeing issues is a clear indication that sky2
> > > > (even with the recent patches in 2.6.23-rc1) is not yet ready to replace
> > > > sk98lin.
> > > >...
> > >
> > > This sounds good in theory.
> > >
> > > The practical problem with this approach is that there are always many
> > > people who use the old driver when the new driver doesn't work for them
> > > instead of reporting their problems with the new driver.
> > >
> > > For these people a new driver will often suck when the old driver gets
> > > removed, but after the removal of the old driver they are finally forced
> > > to report their bugs resulting in a better new driver for everyone.
> > >
> > > The sky2 driver has been in the kernel for nearly 2 years and Stephen is
> > > usually quite good at handling bugs.
> >
> > The driver still (2.6.20/sky2 1.13) hangs for me (more rarely than in
> > the past), and cycling the module generally fixes the issues. I have
> > supplied all the information that Stephen has asked for, but still no
> > resolution. I am not complaining about the lack of a fix, but don't
> > assume that all it takes to get sky2 working is adequate bug reports. I
> > have been and remain willing to test and assist debug, but after several
> > dropped threads, I feel like the desire or ability to fix this issue
> > isn't there (and remote debug of an intermittent hardware issue IS
> > hard), and I didn't want to be a nuisance to someone that has no
> > obligation to me to address the issue in the first place.
> >
> > Stability has improved, it's just not there yet.
> >
> > I'll switch to 1.16 soon, and respond to Stephen's request on netdev for
> > current issues.
> > --
> > Rob
>
> The only known outstanding problems on 2.6.22.6 of sky2 are:
> * problems with fibre PHY based systems
> * suspend/resume issues, missing multicast reinitialization, etc.
> The previous stability problems have been addressed.

I pretty much agree with everything said, including
the part about the sky2 people working hard on it. I
have noticed several bugs fixed recently in the driver
source.

However, it really DOES lock up under load. I even
tried 2.6.23-rc4 and the absolute latest version of the
driver and it still locks up, as in

eth1: hw csum failure.

Call Trace:
 <IRQ> [<ffffffff804779b6>] __skb_checksum_complete_head+0x43/0x56
 [<ffffffff804779d5>] __skb_checksum_complete+0xc/0x11
 [<ffffffff804a989d>] tcp_v4_rcv+0x14e/0x801
 [<ffffffff8048ff84>] ip_local_deliver+0xca/0x14c
 [<ffffffff80490472>] ip_rcv+0x46c/0x4ae
 [<ffffffff88006138>] :sky2:sky2_poll+0x72b/0x9c7
 [<ffffffff80245979>] update_wall_time+0x28c/0x39b
 [<ffffffff8047c934>] net_rx_action+0xa8/0x166
 [<ffffffff8023901c>] do_timer+0x10/0xab
 [<ffffffff80235ced>] __do_softirq+0x55/0xc4
 [<ffffffff8020c5cc>] call_softirq+0x1c/0x28
 [<ffffffff8020d6fd>] do_softirq+0x2c/0x7d
 [<ffffffff8020d9bb>] do_IRQ+0x13e/0x15f
 [<ffffffff8020a780>] mwait_idle+0x0/0x48
 [<ffffffff8020b951>] ret_from_intr+0x0/0xa
 <EOI> [<ffffffff804acdb9>] udp_poll+0x0/0xfb
 [<ffffffff8020a7c2>] mwait_idle+0x42/0x48
 [<ffffffff8020a718>] cpu_idle+0xbd/0xe0
 [<ffffffff80704a5a>] start_kernel+0x2ac/0x2b8
 [<ffffffff80704140>] _sinittext+0x140/0x144

As far as I can tell, this bug has been with the sky2
driver all the way back to the Beforetime, based on it
happening with various versions of the driver back to
2.6.18 that I have tried, plus some googling on it.

So while the bug reporting point is a good one, it would
be nice to have a reliable driver in the kernel until the
sky2 one is better. The alternative is to use the vendor
driver, which is less than optimal.

-J
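When collecting evidence for a report like the one above, it can help to
count how often the "hw csum failure" message appears per interface. The
sketch below assumes only the "<iface>: hw csum failure." message format
shown in the trace; the helper name count_csum_failures is hypothetical,
and feeding it from dmesg or a saved log file is an assumption about how
the log would be obtained.

```shell
#!/bin/sh
# Illustrative only: summarises log text, changes nothing on the system.

# Read kernel log text on stdin and print "<iface> <count>" for every
# interface that logged "hw csum failure", sorted by interface name.
count_csum_failures() {
    awk '/hw csum failure/ {
        # The interface is the field ending in ":" right before "hw";
        # scanning fields lets an optional timestamp prefix pass through.
        for (i = 1; i < NF; i++) {
            if ($i ~ /:$/ && $(i + 1) == "hw") {
                name = $i
                sub(/:$/, "", name)
                seen[name]++
                break
            }
        }
    }
    END { for (name in seen) print name, seen[name] }' | sort
}
```

A typical invocation would be dmesg | count_csum_failures, with the
resulting counts included alongside the driver and kernel versions when
reporting the hang.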
* Re: sk98lin for 2.6.23-rc1
2007-09-05 19:42 ` James Corey
@ 2007-09-05 21:04 ` Kyle Rose
2007-09-05 23:00 ` Stephen Hemminger
2007-09-08 17:44 ` Bill Davidsen
1 sibling, 1 reply; 35+ messages in thread
From: Kyle Rose @ 2007-09-05 21:04 UTC (permalink / raw)
To: James Corey; +Cc: Stephen Hemminger, Rob Sims, Adrian Bunk, linux-kernel

> However, it really DOES lock up under load. I even
> tried 2.6.23-rc4 and the absolute latest version of the
> driver and it still locks up, as in
>
Yich. I'm glad I'm still using sk98lin on my unmanned colo box.

Kyle
* Re: sk98lin for 2.6.23-rc1
2007-09-05 21:04 ` Kyle Rose
@ 2007-09-05 23:00 ` Stephen Hemminger
0 siblings, 0 replies; 35+ messages in thread
From: Stephen Hemminger @ 2007-09-05 23:00 UTC (permalink / raw)
To: Kyle Rose; +Cc: James Corey, Rob Sims, Adrian Bunk, linux-kernel

On Wed, 05 Sep 2007 17:04:59 -0400
Kyle Rose <krose@krose.org> wrote:

> > However, it really DOES lock up under load. I even
> > tried 2.6.23-rc4 and the absolute latest version of the
> > driver and it still locks up, as in
> >
> Yich. I'm glad I'm still using sk98lin on my unmanned colo box.
>
> Kyle
>

Great for you; when I was testing, sk98lin crashed my machine on an
overnight stress run.

My intuition is that there is a bug in sk98lin on Yukon EC-U chips
(those without ram buffer) and a hardware problem on Yukon XL chips
(those with ram buffer), and the sky2 driver doesn't have a workaround
for the ram buffer getting stuck (yet). I don't like putting workarounds
in for problems I can't reproduce.

After KS, I'll rerun more stress tests on all the chip flavors and see
if the hang is reproducible.
* Re: sk98lin for 2.6.23-rc1
2007-09-05 19:42 ` James Corey
2007-09-05 21:04 ` Kyle Rose
@ 2007-09-08 17:44 ` Bill Davidsen
2007-09-08 19:11 ` Adrian Bunk
1 sibling, 1 reply; 35+ messages in thread
From: Bill Davidsen @ 2007-09-08 17:44 UTC (permalink / raw)
To: James Corey
Cc: Stephen Hemminger, Rob Sims, Adrian Bunk, Kyle Rose, linux-kernel

James Corey wrote:
> --- Stephen Hemminger <shemminger@linux-foundation.org> wrote:
>
>> On Sun, 29 Jul 2007 21:01:30 -0600
>> Rob Sims <lkml-z@robsims.com> wrote:
>>
>>> On Thu, Jul 26, 2007 at 06:57:01PM +0200, Adrian Bunk wrote:

>> The only known outstanding problems on 2.6.22.6 of sky2 are:
>> * problems with fibre PHY based systems
>> * suspend/resume issues, missing multicast reinitialization, etc.
>> The previous stability problems have been addressed.
>
> I pretty much agree with everything said, including
> the part about the sky2 people working hard on it. I
> have noticed several bugs fixed recently in the driver
> source.
>
> However, it really DOES lock up under load. I even
> tried 2.6.23-rc4 and the absolute latest version of the
> driver and it still locks up, as in
>
> eth1: hw csum failure.
>
I changed from sk98lin to the previous driver Adrian said was the
"right one," skge IIRC. Then he started pushing sky2, and I tried that.
Like you I get hangs, but unlike you the system doesn't hang, just the
NIC. No errors or warnings, and a reboot fixes it. Acts as if the cable
were pulled.

That was with 2.6.22.5 (or so), dropped back to an old kernel with
sk98lin, previously had uptimes in three digit days. Up for a week or so
now.

Haven't tried later kernels, don't intend to, while no network is really
secure, it's not really useful.

--
Bill Davidsen <davidsen@tmr.com>
"We have more to fear from the bungling of the incompetent than from
the machinations of the wicked." - from Slashdot
* Re: sk98lin for 2.6.23-rc1
2007-09-08 17:44 ` Bill Davidsen
@ 2007-09-08 19:11 ` Adrian Bunk
2007-09-09 2:42 ` Kyle Rose
` (2 more replies)
0 siblings, 3 replies; 35+ messages in thread
From: Adrian Bunk @ 2007-09-08 19:11 UTC (permalink / raw)
To: Bill Davidsen
Cc: James Corey, Stephen Hemminger, Rob Sims, Kyle Rose, linux-kernel

On Sat, Sep 08, 2007 at 01:44:20PM -0400, Bill Davidsen wrote:
>...
> That was with 2.6.22.5 (or so), dropped back to an old kernel with sk98lin,
> previously had uptimes in three digit days. Up for a week or so now.

There is a real long-term advantage of removing drivers like sk98lin
because it forces people to report bugs if the new driver doesn't work
instead of giving them the workaround of using the obsolete driver.

And this has the (at first sight surprising) effect that removing code
results in an improvement of the kernel.

> Haven't tried later kernels, don't intend to, while no network is really
> secure, it's not really useful.

You are a regular reader of linux-kernel, and therefore the sk98lin
removal can hardly be a surprise for you. If you prefer whining over
helping to improve the kernel that's your choice...

> Bill Davidsen

cu
Adrian

--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
* Re: sk98lin for 2.6.23-rc1
2007-09-08 19:11 ` Adrian Bunk
@ 2007-09-09 2:42 ` Kyle Rose
2007-09-09 4:48 ` Willy Tarreau
2007-09-09 11:13 ` Adrian Bunk
2007-09-09 12:54 ` Chris Stromsoe
2007-09-10 14:32 ` Bill Davidsen
2 siblings, 2 replies; 35+ messages in thread
From: Kyle Rose @ 2007-09-09 2:42 UTC (permalink / raw)
To: Adrian Bunk
Cc: Bill Davidsen, James Corey, Stephen Hemminger, Rob Sims, linux-kernel

> You are a regular reader of linux-kernel, and therefore the sk98lin
> removal can hardly be a surprise for you. If you prefer whining over
> helping to improve the kernel that's your choice...
>
In my case the issue is simply one of practicality: I cannot go to the
data center 5 times per day to reboot my colo box. Therefore, I run
sk98lin. It's really that simple.

Kyle
* Re: sk98lin for 2.6.23-rc1
2007-09-09 2:42 ` Kyle Rose
@ 2007-09-09 4:48 ` Willy Tarreau
2007-09-09 11:13 ` Adrian Bunk
1 sibling, 0 replies; 35+ messages in thread
From: Willy Tarreau @ 2007-09-09 4:48 UTC (permalink / raw)
To: Kyle Rose
Cc: Adrian Bunk, Bill Davidsen, James Corey, Stephen Hemminger, Rob Sims, linux-kernel

On Sat, Sep 08, 2007 at 10:42:20PM -0400, Kyle Rose wrote:
>
> > You are a regular reader of linux-kernel, and therefore the sk98lin
> > removal can hardly be a surprise for you. If you prefer whining over
> > helping to improve the kernel that's your choice...
> >
> In my case the issue is simply one of practicality: I cannot go to the
> data center 5 times per day to reboot my colo box. Therefore, I run
> sk98lin. It's really that simple.

Adrian generally wants to force "normal" users to test new drivers in
order to quickly find bugs and fade out older ones. While this is often
possible on the desktop, it's not possible for production servers. And
not everyone can run 2.6.16.x to get a long-term stable kernel.

I think that what is really needed is to add the opposite of
"experimental" in the config options. Something like "deprecated
drivers" which would be disabled by default. Desktop users would
normally not care about that and rely only on newer drivers. Server
users would have to enable the option if they want their old driver to
be present because they have no other choice. With each driver's help
text, it would be wise to add some text indicating what will replace
the driver in question, so that their users know how to test it on
non-production machines.

But I agree with Kyle that on production systems, it is not acceptable
to have a driver hang even once a month. This generally implies loss of
service and customers going away. Ideology has no place in this area,
it is quickly replaced by pragmatism. It was the same reason I spent
time trying to get sky2 to reliably work in 2.4; sk98lin v8 was
horribly unstable. Sky2 was fairly better but did not support some
basic operations such as ifdown/ifup. sk98lin v10 finally worked fine,
and I upgraded my customer's system with it because I needed anything
which would reliably work. It was not acceptable anymore to have the
customer phone twice a week complaining that their server had crashed
again.

In the long term, I would really like to get sky2 to work well in 2.4
because I'm more confident in it, it's cleaner, less obscure and less
bloated. Having passed terabytes of data through both drivers I have
not observed any glitch with sky2 as I had with sk98lin v8.

Fortunately, sky2 chips are mostly found on desktop motherboards, so
that helps the driver stabilize very quickly. It should not take as
long as the transition from eepro100 to e100.

Willy

^ permalink raw reply [flat|nested] 35+ messages in thread
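[Editorial aside] Willy's "deprecated drivers" option is a proposal, not
something that exists in the kernel discussed here. As a rough
illustration of the situation it addresses, the sketch below reports
whether a given kernel config builds each of the three drivers at all.
The symbol names (CONFIG_SK98LIN, CONFIG_SKGE, CONFIG_SKY2) and the
helper names driver_state and report_drivers are assumptions made for
this example, not something stated in the thread.

```shell
#!/bin/sh
# Sketch only: report how each driver is configured in a kernel config
# file. The CONFIG_* symbol names below are assumptions; adjust them to
# match the tree being inspected.

driver_state() {
    # $1 = path to a kernel config file, $2 = config symbol
    if grep -q "^$2=y" "$1"; then
        echo built-in
    elif grep -q "^$2=m" "$1"; then
        echo module
    else
        # Covers both "# CONFIG_X is not set" and a missing symbol.
        echo disabled
    fi
}

report_drivers() {
    # $1 = path to a kernel config file
    for sym in CONFIG_SK98LIN CONFIG_SKGE CONFIG_SKY2; do
        printf '%s: %s\n' "$sym" "$(driver_state "$1" "$sym")"
    done
}

# Only inspect a real file when one is named on the command line.
if [ -n "${1:-}" ] && [ -f "$1" ]; then
    report_drivers "$1"
fi
```

Saved under a hypothetical name such as check-drivers.sh, it would be
run as "sh check-drivers.sh /usr/src/linux-$KVER/.config".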
* Re: sk98lin for 2.6.23-rc1
2007-09-09 2:42 ` Kyle Rose
2007-09-09 4:48 ` Willy Tarreau
@ 2007-09-09 11:13 ` Adrian Bunk
2007-09-11 8:05 ` Stephen Hemminger
1 sibling, 1 reply; 35+ messages in thread
From: Adrian Bunk @ 2007-09-09 11:13 UTC (permalink / raw)
To: Kyle Rose
Cc: Bill Davidsen, James Corey, Stephen Hemminger, Rob Sims, linux-kernel

On Sat, Sep 08, 2007 at 10:42:20PM -0400, Kyle Rose wrote:
>
> > You are a regular reader of linux-kernel, and therefore the sk98lin
> > removal can hardly be a surprise for you. If you prefer whining over
> > helping to improve the kernel that's your choice...
> >
> In my case the issue is simply one of practicality: I cannot go to the
> data center 5 times per day to reboot my colo box. Therefore, I run
> sk98lin. It's really that simple.

When did you report this bug the first time?

What we need is that people when testing a new kernel they plan to use
test the new drivers *and report the bugs if they run into any*.

What could we have done so that you reported your bug without removing
the sk98lin driver?

> Kyle

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: sk98lin for 2.6.23-rc1
2007-09-09 11:13 ` Adrian Bunk
@ 2007-09-11 8:05 ` Stephen Hemminger
2007-09-11 11:54 ` Adrian Bunk
2007-09-11 22:20 ` James Corey
0 siblings, 2 replies; 35+ messages in thread
From: Stephen Hemminger @ 2007-09-11 8:05 UTC (permalink / raw)
To: Adrian Bunk; +Cc: Kyle Rose, Bill Davidsen, James Corey, Rob Sims, linux-kernel

On Sun, 9 Sep 2007 13:13:26 +0200
Adrian Bunk <bunk@kernel.org> wrote:

> On Sat, Sep 08, 2007 at 10:42:20PM -0400, Kyle Rose wrote:
> >
> > > You are a regular reader of linux-kernel, and therefore the sk98lin
> > > removal can hardly be a surprise for you. If you prefer whining over
> > > helping to improve the kernel that's your choice...
> > >
> > In my case the issue is simply one of practicality: I cannot go to the
> > data center 5 times per day to reboot my colo box. Therefore, I run
> > sk98lin. It's really that simple.
>
> When did you report this bug the first time?
>
> What we need is that people when testing a new kernel they plan to use
> test the new drivers *and report the bugs if they run into any*.
>
> What could we have done so that you reported your bug without removing
> the sk98lin driver?
>
> > Kyle
>
> cu
> Adrian

There are several different problems in this thread:

1. The removal of old sk98lin driver caused some users to be forced to
   use skge. These users have uncovered issues with the dual port fiber
   based versions of the board.
   Short term: The sk98lin driver should be restored to previous state,
     and the PCI table should be used to limit the usage to only fiber
     systems. If Adrian doesn't do it, I'll do it when I return from
     Germany.
   Long term: I have fiber based board (thanks ebay) on the way to
     resolve skge bug.

2. Sky2 driver has its own fiber based problems. Solve these after skge
   fiber.

3. Sky2 doesn't have as many workarounds for hardware problems as vendor
   sk98lin driver.

^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: sk98lin for 2.6.23-rc1
2007-09-11 8:05 ` Stephen Hemminger
@ 2007-09-11 11:54 ` Adrian Bunk
2007-09-11 14:29 ` Bill Davidsen
2007-09-11 22:20 ` James Corey
1 sibling, 1 reply; 35+ messages in thread
From: Adrian Bunk @ 2007-09-11 11:54 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Kyle Rose, Bill Davidsen, James Corey, Rob Sims, linux-kernel, Jeff Garzik, netdev

On Tue, Sep 11, 2007 at 10:05:26AM +0200, Stephen Hemminger wrote:
>
> There are several different problems in this thread:
> 1. The removal of old sk98lin driver caused some users to be forced to use
> skge. These users have uncovered issues with the dual port fiber based versions
> of the board.
> Short term: The sk98lin driver should be restored to previous state,
> and the PCI table should be used to limit the usage to only fiber systems.
> If Adrian doesn't do it, I'll do it when I return from Germany.
>...

No problem with this, but since it was Jeff's patch it should better be
him who reverts it (and he's anyway one step nearer to Linus).

But the underlying general problem still remains:

How can we get people to test and report bugs with the new drivers
before removing the old driver?

That's a question especially for the people who now had problems after
sk98lin was removed.

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: sk98lin for 2.6.23-rc1
2007-09-11 11:54 ` Adrian Bunk
@ 2007-09-11 14:29 ` Bill Davidsen
2007-09-11 15:03 ` Adrian Bunk
0 siblings, 1 reply; 35+ messages in thread
From: Bill Davidsen @ 2007-09-11 14:29 UTC (permalink / raw)
To: Adrian Bunk
Cc: Stephen Hemminger, Kyle Rose, James Corey, Rob Sims, linux-kernel, Jeff Garzik, netdev

Adrian Bunk wrote:
> On Tue, Sep 11, 2007 at 10:05:26AM +0200, Stephen Hemminger wrote:
>
>> There are several different problems in this thread:
>> 1. The removal of old sk98lin driver caused some users to be forced to use
>> skge. These users have uncovered issues with the dual port fiber based versions
>> of the board.
>> Short term: The sk98lin driver should be restored to previous state,
>> and the PCI table should be used to limit the usage to only fiber systems.
>> If Adrian doesn't do it, I'll do it when I return from Germany.
>> ...
>>
>
> No problem with this, but since it was Jeff's patch it should better be
> him who reverts it (and he's anyway one step nearer to Linus).
>
> But the underlying general problem still remains:
>
> How can we get people to test and report bugs with the new drivers
> before removing the old driver?
>
>
Sorry for a long answer, I'm trying to provide insight on two recent
cases.

Thinking back to several drivers, when e100 was new I tried it because
I had problems with eepro100 in the area of multiple cards, multiple
cables on a single card, and jumbo packets. For a while I used both,
until e100 worked where I needed it. So I initially tried it because it
had features I needed, and then dropped the older driver just to avoid
having to decide.

With sk98lin, the driver worked flawlessly with all (3-4) systems, so I
had no reason to try any other. When removing sk98lin was first
proposed, I tried skge, first measurements showed it was 5-8% slower,
NOT what I want, so I went back. For me there was no reliability issue,
but I never tried it in a system with more than one NIC on the driver.
Would "it's a little slower" be a valid bug report? Or would I have
gotten "works fine for me" from people not beating it over Gbit?

I didn't try sky2 until you suggested it, and I have reported my
results previously: it just stops working. Could it be my hardware? I
tried it on one system, so yes, but sk98lin works for months.

> That's a question especially for the people who now had problems after
> sk98lin was removed.

So if you want people to try a new driver, I think it really has to
have some benefits to the users, in terms of performance, reliability,
or features. "Cleaner design" doesn't motivate, and it does raise the
question of why the old driver wasn't just cleaned up. I've been doing
software for decades, I appreciate why, but users in general just want
to use their system. Which raises the question of why to delete drivers
which work for many or even most users? Testing a new kernel is no
longer a drop-in-and-boot operation if modprobe.conf must be edited to
get the network up, and the typical user isn't going to write that
shell script to try one or the other driver.

Honestly, new drivers which offer little benefit to most users are the
exception rather than the rule, so this may be a corner case. I would
like to see sk98lin back in the kernel; for a while I can build my own
kernels and patch it in, but until other drivers are drop-in, I
probably won't change.

Separate but related: why keep skge and sky2? Are we going through this
again in a year? Is the benefit worth the effort?

Hope some of this is helpful.

--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979

^ permalink raw reply [flat|nested] 35+ messages in thread
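[Editorial aside] A throughput comparison like the "5-8% slower" figure
above is easier to act on when reported as two measured numbers plus the
relative difference. A minimal sketch of that arithmetic follows; the
function name percent_slower and the sample figures are made up for
illustration and are not measurements from this thread.

```shell
#!/bin/sh
# Sketch only: relative slowdown of a candidate driver against a
# baseline, as a percentage with one decimal place.
# Usage: percent_slower BASELINE CANDIDATE   (same unit, e.g. Mbit/s)

percent_slower() {
    # LC_ALL=C keeps the decimal separator a dot regardless of locale.
    LC_ALL=C awk -v base="$1" -v cand="$2" 'BEGIN {
        if (base <= 0) exit 1          # a baseline of zero is meaningless
        printf "%.1f\n", (base - cand) / base * 100
    }'
}

# Hypothetical example: baseline 940 Mbit/s, candidate 880 Mbit/s.
# percent_slower 940 880
```

A negative result means the candidate was faster than the baseline.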
* Re: sk98lin for 2.6.23-rc1
2007-09-11 14:29 ` Bill Davidsen
@ 2007-09-11 15:03 ` Adrian Bunk
2007-09-11 22:37 ` Willy Tarreau
0 siblings, 1 reply; 35+ messages in thread
From: Adrian Bunk @ 2007-09-11 15:03 UTC (permalink / raw)
To: Bill Davidsen
Cc: Stephen Hemminger, Kyle Rose, James Corey, Rob Sims, linux-kernel, Jeff Garzik, netdev

On Tue, Sep 11, 2007 at 10:29:47AM -0400, Bill Davidsen wrote:
> Adrian Bunk wrote:
>> On Tue, Sep 11, 2007 at 10:05:26AM +0200, Stephen Hemminger wrote:
>>
>>> There are several different problems in this thread:
>>> 1. The removal of old sk98lin driver caused some users to be forced to
>>> use
>>> skge. These users have uncovered issues with the dual port fiber
>>> based versions
>>> of the board. Short term: The sk98lin driver should be restored
>>> to previous state, and the PCI table should be used to limit the
>>> usage to only fiber systems.
>>> If Adrian doesn't do it, I'll do it when I return from Germany.
>>> ...
>>>
>>
>> No problem with this, but since it was Jeff's patch it should better be
>> him who reverts it (and he's anyway one step nearer to Linus).
>>
>> But the underlying general problem still remains:
>>
>> How can we get people to test and report bugs with the new drivers before
>> removing the old driver?
>>
>>
> Sorry for a long answer, I'm trying to provide insight on two recent cases.
>
> Thinking back to several drivers, when e100 was new I tried it because I
> had problems with eepro100 in the area of multiple cards, multiple cables
> on a single card, and jumbo packets. For a while I used both, until e100
> worked where I need it. So I initially tried it because it had features I
> needed, and then dropped to older driver just to avoid having to decide.
>
> With sk98lin, the driver worked flawlessly with all (3-4) systems, so I had
> no reason to try any other. When removing sk98lin was first proposed, I
> tried skge, first measurements showed it was 5-8% slower, NOT what I want,
> so I went back. For me there was no reliability issue, but I never tried it
> in a system with more than on NIC on the driver. Would "it's a little
> slower" be a valid bug report? Or would I have gotten "works fine for me"
> from people not beating it over Gbit?
>...

If you get less throughput that is a regression, and it should be
reported and fixed. I doubt anybody would have told you otherwise.

Is this bug still present as of 2.6.23-rc6?

>> That's a question especially for the people who now had problems after
>> sk98lin was removed.
>
> So if you want people to try a new driver, I think it really has to have
> some benefits to the users, in terms of performance, reliability, or
> features. "Cleaner design" doesn't motivate, and it does raise the question
> of why the old driver wasn't just cleaned up. I've been doing software for
> decades, I appreciate why, but users in general just want to use their
> system. Which raises the question of why to delete drivers which work for
> many or even most users?

As I already explained, there is a long term advantage for all users if
there is only one driver in the kernel. Therefore all users should
switch away from obsolete drivers to the replacement drivers, and the
obsolete driver will be removed at some point in time. The only question
is how to do it.

> Testing a new kernel is no longer a drop in a boot
> operation if modprobe.conf must be edited to get the network up, and the
> typical user isn't going to write that shell script to try one or the other
> driver.

The typical user will let his distribution handle this.

And MODULE_ALIAS can also handle this.

> Honestly, new drivers which offer little benefit to most users are the
> exception rather than the rule, so this may a corner case I would like to
> see sk98lin back in the kernel, for a while I can build my own kernels and
> patch it in, but until other drivers are drop-in, I probably won't change.

That a new driver offers benefits that cause most users to switch isn't
realistic.

You mention e100 as an example - well, I'm using this driver in my
computer, but I doubt anything would be worse for me if I'd use the
obsolete eepro100 driver instead since I'm not using any of the fancy
e100 features you mentioned as advantages.

There is a long term advantage for all users if there is only one driver
in the kernel. Therefore all users should switch away from obsolete
drivers to the replacement drivers, and the obsolete driver will be
removed at some point in time. The only question is how to do it.

> Separate but related: why keep skge and sky2? Are we going through this
> again in a year? Is the benefit worth the effort?
>...

skge and sky2 support distinct hardware.

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: sk98lin for 2.6.23-rc1
2007-09-11 15:03 ` Adrian Bunk
@ 2007-09-11 22:37 ` Willy Tarreau
0 siblings, 0 replies; 35+ messages in thread
From: Willy Tarreau @ 2007-09-11 22:37 UTC (permalink / raw)
To: Adrian Bunk
Cc: Bill Davidsen, Stephen Hemminger, Kyle Rose, James Corey, Rob Sims, linux-kernel, Jeff Garzik, netdev

On Tue, Sep 11, 2007 at 05:03:57PM +0200, Adrian Bunk wrote:
> On Tue, Sep 11, 2007 at 10:29:47AM -0400, Bill Davidsen wrote:
> > So if you want people to try a new driver, I think it really has to have
> > some benefits to the users, in terms of performance, reliability, or
> > features. "Cleaner design" doesn't motivate, and it does raise the question
> > of why the old driver wasn't just cleaned up. I've been doing software for
> > decades, I appreciate why, but users in general just want to use their
> > system. Which raises the question of why to delete drivers which work for
> > many or even most users?
>
> As I already explained, there is a long term advantage for all users if
> there is only one driver in the kernel.

Not only that. You have to place the switch in its context with
history. Stephen, please correct me if I'm wrong, but sk98lin has been
randomly working for a very long time. Not 100% the driver's fault,
because it has had to work around a lot of chip bugs. The fact that
this driver supports *all* chips in the family makes it harder to
identify whether problems are caused by the hardware or by the driver
because it is bloated with tons of if/else.

I've personally encountered random data corruption on the receive path
with PCI-E hardware with sk98lin, as well as random TX stops. Sometimes
it would require one terabyte of data, sometimes just a few hundred
megs. On other hardware (skge now), UDP would simply stop being sent
and some TCP traffic was necessary to restart UDP! One guy at Marvell
once asked me for more information, but it was not easy to provide much
more, given the randomness of the problems!

Stephen has done an excellent (and thankless) job at restarting from
scratch, and the idea to separate the two chips was a good one IMHO.
The problem is that he might have thought that most of the bugs were in
the driver, while most of them are in the hardware, and this requires a
lot of workarounds, which do not always work the same for everybody (I
remember having tried to disable flow control with sk98lin because it
helped with sky2).

In parallel, sk98lin has improved on the vendor's site. v8 exhibited
all the problems I explained above, but v10 has fixed a lot of them,
making the new sk98lin more reliable.

In parallel, sky2 and skge had got wider acceptance and testing. The
nastiest hardware bugs will slowly surface, a good deal of driver bugs
have been detected too (and that's expected from any new driver). It is
possible that after 2 or 3 patches, a lot of the remaining problems
will suddenly vanish. But it's also possible that the driver will still
not work for 1% of people for 1 or 2 years because of some obscure
hardware combinations which trigger some obscure hardware bugs.

> Therefore all users should
> switch away from obsolete drivers to the replacement drivers, and the
> obsolete driver will be removed at some point in time. The only question
> is how to do it.

Desktop users generally have no problem experimenting with multiple
kernels or drivers. They can report feedback too, but generally,
they're not very good at downloading alternative drivers and patching
their kernel with those.

Server users cannot experiment for a long time. After 2 or 3 losses of
service, they *have* to provide a definitive solution. For some of them
when sky2 fails, it may very well be to switch over to sk98lin.
Downloading from the vendor's site and patching is not a problem for
those users, but it causes them the trouble of updating the kernel for
security fixes, so the old driver must be shipped with the kernel.

However, I remember something which might constitute a solution. In
2.4, there's a small bug in the kbuild process on alpha. One question
is always asked during make oldconfig. Its saved value is ignored
because of the way it is computed. I don't know if we could do this
with 2.6 kbuild. It would then be nice to always set sk98lin to unset
if it was set to "Y" or "M", so that at each build, the user has to
explicitly state he wants it. It's annoying enough to give the other
one a try once in a while, without causing too much trouble to people
who really have no other choice right now.

What we need with this driver is people being fed up with it, not them
being unable to use it as a last resort. Also, given that it has
improved over the last years (probably due to competition pressure from
sky2/skge), users will even less understand why there is such incentive
to remove it.

Another trick for obsolete drivers would be to simply remove them from
the usual build system, but have them being available for explicit
build. Eg: make modules will not build them, but make obsolete-modules
would do.

> > Testing a new kernel is no longer a drop in a boot
> > operation if modprobe.conf must be edited to get the network up, and the
> > typical user isn't going to write that shell script to try one or the other
> > driver.
>
> The typical user will let his distribution handle this.
>
> And MODULE_ALIAS can also handle this.

No system config should be edited to switch back to the alternative,
otherwise it remains in its working state.

> > Honestly, new drivers which offer little benefit to most users are the
> > exception rather than the rule, so this may a corner case I would like to
> > see sk98lin back in the kernel, for a while I can build my own kernels and
> > patch it in, but until other drivers are drop-in, I probably won't change.
>
> That a new driver offers benefits that cause most users to switch isn't
> realistic.

Desktop users are curious and have plenty of time to kill. Server users
are frightened and lazy. So I think that annoying the user slightly is
a good solution (eg: make obsolete-modules).

> You mention e100 as an example - well, I'm using this driver in my
> computer, but I doubt anything would be worse for me if I'd use the
> obsolete eepro100 driver instead since I'm not using any of the fancy
> e100 features you mentioned as advantages.

After having been happy with eepro100 for years, I discovered many
problems with its VLAN support in 2.4 (MTU, ...) for which e100 was a
solution. It was a good reason to switch. But the old e100 driver took
ages to load (half of the machine boot time), which was not satisfying.
So having a new driver load faster is another good reason to switch.

> There is a long term advantage for all users if there is only one driver
> in the kernel. Therefore all users should switch away from obsolete
> drivers to the replacement drivers, and the obsolete driver will be
> removed at some point in time. The only question is how to do it.

Hmmm we already read this paragraph above :-)

> > Separate but related: why keep skge and sky2? Are we going through this
> > again in a year? Is the benefit worth the effort?
> >...
>
> skge and sky2 support distinct hardware.

... and as such are both smaller than sk98lin which supports both.

Cheers,
Willy

^ permalink raw reply [flat|nested] 35+ messages in thread
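[Editorial aside] When two drivers can claim the same hardware, as
discussed above, it helps to confirm which one actually bound to an
interface after boot. The sketch below reads the driver symlink that
sysfs exposes for a network device. The function name bound_driver and
the optional sysfs-root argument are conveniences invented for this
example, and it assumes a kernel that provides
/sys/class/net/<iface>/device/driver.

```shell
#!/bin/sh
# Sketch only: print the name of the driver bound to a network
# interface by resolving the sysfs "driver" symlink.

bound_driver() {
    # $1 = interface name (e.g. eth0)
    # $2 = sysfs root, defaults to /sys (overridable for testing)
    link="${2:-/sys}/class/net/$1/device/driver"
    [ -L "$link" ] || return 1      # no such interface, or no driver bound
    basename "$(readlink "$link")"
}

# Example: bound_driver eth0
```

On a machine affected by this thread the expected answers would be
sk98lin, skge or sky2, which makes it a quick sanity check after
changing a blacklist entry.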
* Re: sk98lin for 2.6.23-rc1
2007-09-11 8:05 ` Stephen Hemminger
2007-09-11 11:54 ` Adrian Bunk
@ 2007-09-11 22:20 ` James Corey
1 sibling, 0 replies; 35+ messages in thread
From: James Corey @ 2007-09-11 22:20 UTC (permalink / raw)
To: Stephen Hemminger, Adrian Bunk
Cc: Kyle Rose, Bill Davidsen, James Corey, Rob Sims, linux-kernel

--- Stephen Hemminger <shemminger@linux-foundation.org> wrote:

> On Sun, 9 Sep 2007 13:13:26 +0200
> Adrian Bunk <bunk@kernel.org> wrote:
>
> > On Sat, Sep 08, 2007 at 10:42:20PM -0400, Kyle Rose wrote:
> > >
> > > > You are a regular reader of linux-kernel, and therefore the sk98lin
> > > > removal can hardly be a surprise for you. If you prefer whining over
> > > > helping to improve the kernel that's your choice...
> > > >
> > > In my case the issue is simply one of practicality: I cannot go to the
> > > data center 5 times per day to reboot my colo box. Therefore, I run
> > > sk98lin. It's really that simple.
> >
> > When did you report this bug the first time?
> >
> > What we need is that people when testing a new kernel they plan to use
> > test the new drivers *and report the bugs if they run into any*.
> >
> > What could we have done so that you reported your bug without removing
> > the sk98lin driver?
> >
> > > Kyle
> >
> > cu
> > Adrian
>
>
> There are several different problems in this thread:
> 1. The removal of old sk98lin driver caused some users to be forced to use
> skge. These users have uncovered issues with the dual port fiber based versions
> of the board.
> Short term: The sk98lin driver should be restored to previous state,
> and the PCI table should be used to limit the usage to only fiber systems.
> If Adrian doesn't do it, I'll do it when I return from Germany.
> Long term: I have fiber based board (thanks ebay) on the way to resolve
> skge bug.
>
> 2. Sky2 driver has it's own fiber based problems. Solve these after skge fiber.
>
> 3. Sky2 doesn't have as many workarounds for hardware problems as vendor sk98lin
> driver.
> -

Hm, hope I didn't trigger a religious debate.

When you get to the point of working on the SKY2 driver problem with
DGE-550SX (Syskonnect SK-9S81) also known as the "hw csum failure"
issue, I'll be glad to test a patch or take debug data.

Til then, I'll stay out of the way.

-J

^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: sk98lin for 2.6.23-rc1
2007-09-08 19:11 ` Adrian Bunk
2007-09-09 2:42 ` Kyle Rose
@ 2007-09-09 12:54 ` Chris Stromsoe
2007-11-06 22:23 ` Stephen Hemminger
2007-09-10 14:32 ` Bill Davidsen
2 siblings, 1 reply; 35+ messages in thread
From: Chris Stromsoe @ 2007-09-09 12:54 UTC (permalink / raw)
To: Adrian Bunk
Cc: Bill Davidsen, James Corey, Stephen Hemminger, Rob Sims, Kyle Rose, linux-kernel

On Sat, 8 Sep 2007, Adrian Bunk wrote:
> On Sat, Sep 08, 2007 at 01:44:20PM -0400, Bill Davidsen wrote:
>
>> Haven't tried later kernels, don't intend to, while no network is
>> really secure, it not really useful.
>
> You are a regular reader of linux-kernel, and therefore the sk98lin
> removal can hardly be a surprise for you. If you prefer whining over
> helping to improve the kernel that's your choice...

I've been trying to migrate off sk98lin to skge since earlier this
year, without success, starting with 2.6.18 or .19.

I have several of these cards in production using the sk98lin driver:

fresno:~# lspci -vv -s 02:01
02:01.0 Ethernet controller: SysKonnect SK-9872 Gigabit Ethernet Server Adapter (SK-NET GE-ZX dual link) (rev 11)
        Subsystem: SysKonnect SK-9844 Gigabit Ethernet Server Adapter (SK-NET GE-SX dual link)
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
        Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 64 (5750ns min, 7750ns max), Cache Line Size: 32 bytes
        Interrupt: pin A routed to IRQ 22
        Region 0: Memory at febfc000 (32-bit, non-prefetchable) [size=16K]
        Region 1: I/O ports at e800 [size=256]
        Expansion ROM at febc0000 [disabled] [size=128K]
        Capabilities: [48] Power Management version 1
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [50] Vital Product Data

They are dual port SX fiber. Both ports are connected. If I do this:

fresno:~# modprobe skge
fresno:~# ip li set eth2 up
fresno:~# ip li set eth2 down
fresno:~# ip li set eth3 up

the system locks up and I have to power cycle it. The order doesn't
matter (if I do eth3 up/down, then eth2 up kills it).

I don't have any problems with sk98lin. This works fine:

fresno:~# modprobe sk98lin RlmtMode=DualNet
fresno:~# ip li set eth2 up
fresno:~# ip li set eth2 down
fresno:~# ip li set eth3 up
fresno:~# ip li set eth3 down

I am more than happy to test various driver changes, and have tried a
few suggested patches but nothing has worked so far. I would like to be
using skge instead of sk98lin, but so far haven't had any success.

-Chris

^ permalink raw reply [flat|nested] 35+ messages in thread
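[Editorial aside] The up/down/up sequence in the report above can be
wrapped in a small script so that others with the same board run exactly
the same steps. The sketch below is a hypothetical helper, not something
posted to the list. It defaults to a dry run that only prints the
commands, because on affected hardware the real sequence is reported to
hard-lock the machine. The names run, repro and DRY_RUN are inventions
for this example.

```shell
#!/bin/sh
# Sketch only: replay the reported sequence for a given driver and pair
# of ports. With DRY_RUN=1 (the default) nothing is executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    echo "+ $*"
    if [ "$DRY_RUN" != 1 ]; then
        sync        # flush the line above before a possible hard lock
        "$@"
    fi
}

repro() {
    # $1 = driver module, $2 = first port, $3 = second port
    run modprobe "$1"
    run ip link set "$2" up
    run ip link set "$2" down
    run ip link set "$3" up
}

# Dry run:   repro skge eth2 eth3
# Real run:  DRY_RUN=0, as root, on a box you can power cycle.
```

Echoing each step before running it means the last line captured on a
console or serial log shows which command was in flight when the machine
stopped responding.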
* Re: sk98lin for 2.6.23-rc1
2007-09-09 12:54 ` Chris Stromsoe
@ 2007-11-06 22:23 ` Stephen Hemminger
2007-11-07 1:42 ` Chris Stromsoe
0 siblings, 1 reply; 35+ messages in thread
From: Stephen Hemminger @ 2007-11-06 22:23 UTC (permalink / raw)
To: Chris Stromsoe
Cc: Adrian Bunk, Bill Davidsen, James Corey, Rob Sims, Kyle Rose, linux-kernel

On Sun, 9 Sep 2007 05:54:45 -0700 (PDT)
Chris Stromsoe <cbs@cts.ucla.edu> wrote:

> On Sat, 8 Sep 2007, Adrian Bunk wrote:
> > On Sat, Sep 08, 2007 at 01:44:20PM -0400, Bill Davidsen wrote:
> >
> >> Haven't tried later kernels, don't intend to, while no network is
> >> really secure, it not really useful.
> >
> > You are a regular reader of linux-kernel, and therefore the sk98lin
> > removal can hardly be a surprise for you. If you prefer whining over
> > helping to improve the kernel that's your choice...
>
> I've been trying to migrate off sk98lin to skge since earlier this year,
> without success, starting with 2.6.18 or .19.
>
> I have several of these cards in production using the sk98lin driver:
>
> fresno:~# lspci -vv -s 02:01
> 02:01.0 Ethernet controller: SysKonnect SK-9872 Gigabit Ethernet Server Adapter (SK-NET GE-ZX dual link) (rev 11)
>         Subsystem: SysKonnect SK-9844 Gigabit Ethernet Server Adapter (SK-NET GE-SX dual link)
>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
>         Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
>         Latency: 64 (5750ns min, 7750ns max), Cache Line Size: 32 bytes
>         Interrupt: pin A routed to IRQ 22
>         Region 0: Memory at febfc000 (32-bit, non-prefetchable) [size=16K]
>         Region 1: I/O ports at e800 [size=256]
>         Expansion ROM at febc0000 [disabled] [size=128K]
>         Capabilities: [48] Power Management version 1
>                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
>                 Status: D0 PME-Enable- DSel=0 DScale=1 PME-
>         Capabilities: [50] Vital Product Data
>
> They are dual port SX fiber. Both ports are connected. If I do this:
>
> fresno:~# modprobe skge
> fresno:~# ip li set eth2 up
> fresno:~# ip li set eth2 down
> fresno:~# ip li set eth3 up
>
> the system locks up and I have to power cycle it. The order doesn't
> matter (if I do eth3 up/down, then eth2 up kills it).
>
> I don't have any problems with sk98lin. This works fine:
>
> fresno:~# modprobe sk98lin RlmtMode=DualNet
> fresno:~# ip li set eth2 up
> fresno:~# ip li set eth2 down
> fresno:~# ip li set eth3 up
> fresno:~# ip li set eth3 down
>
>
> I am more than happy to test various driver changes, and have tried a few
> suggested patches but nothing has worked so far. I would like to be using
> skge instead of sk98lin, but so far haven't had any success.

Please test 2.6.24-rc1 (or -rc2) because there were several fixes for
skge that made it work correctly for dual port fiber board. The worst
bug in skge was that it configured the ram buffer incorrectly.

I just submitted these for next 2.6.23.X stable release as well

--
Stephen Hemminger <shemminger@linux-foundation.org>

^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: sk98lin for 2.6.23-rc1
2007-11-06 22:23 ` Stephen Hemminger
@ 2007-11-07 1:42 ` Chris Stromsoe
0 siblings, 0 replies; 35+ messages in thread
From: Chris Stromsoe @ 2007-11-07 1:42 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Adrian Bunk, Bill Davidsen, James Corey, Rob Sims, Kyle Rose, linux-kernel

On Tue, 6 Nov 2007, Stephen Hemminger wrote:
> On Sun, 9 Sep 2007 05:54:45 -0700 (PDT)
> Chris Stromsoe <cbs@cts.ucla.edu> wrote:
>
>> On Sat, 8 Sep 2007, Adrian Bunk wrote:
>>> On Sat, Sep 08, 2007 at 01:44:20PM -0400, Bill Davidsen wrote:
>>>
>>>> Haven't tried later kernels, don't intend to, while no network is
>>>> really secure, it not really useful.
>>>
>>> You are a regular reader of linux-kernel, and therefore the sk98lin
>>> removal can hardly be a surprise for you. If you prefer whining over
>>> helping to improve the kernel that's your choice...
>>
>> I've been trying to migrate off sk98lin to skge since earlier this year,
>> without success, starting with 2.6.18 or .19.
>>
>> I have several of these cards in production using the sk98lin driver:
>>
>> fresno:~# lspci -vv -s 02:01
>> 02:01.0 Ethernet controller: SysKonnect SK-9872 Gigabit Ethernet Server Adapter (SK-NET GE-ZX dual link) (rev 11)
>>         Subsystem: SysKonnect SK-9844 Gigabit Ethernet Server Adapter (SK-NET GE-SX dual link)
>>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
>>         Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
>>         Latency: 64 (5750ns min, 7750ns max), Cache Line Size: 32 bytes
>>         Interrupt: pin A routed to IRQ 22
>>         Region 0: Memory at febfc000 (32-bit, non-prefetchable) [size=16K]
>>         Region 1: I/O ports at e800 [size=256]
>>         Expansion ROM at febc0000 [disabled] [size=128K]
>>         Capabilities: [48] Power Management version 1
>>                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
>>                 Status: D0 PME-Enable- DSel=0 DScale=1 PME-
>>         Capabilities: [50] Vital Product Data
>>
>> They are dual port SX fiber. Both ports are connected. If I do this:
>>
>> fresno:~# modprobe skge
>> fresno:~# ip li set eth2 up
>> fresno:~# ip li set eth2 down
>> fresno:~# ip li set eth3 up
>>
>> the system locks up and I have to power cycle it. The order doesn't
>> matter (if I do eth3 up/down, then eth2 up kills it).
>>
>> I don't have any problems with sk98lin. This works fine:
>>
>> fresno:~# modprobe sk98lin RlmtMode=DualNet
>> fresno:~# ip li set eth2 up
>> fresno:~# ip li set eth2 down
>> fresno:~# ip li set eth3 up
>> fresno:~# ip li set eth3 down
>>
>>
>> I am more than happy to test various driver changes, and have tried a few
>> suggested patches but nothing has worked so far. I would like to be using
>> skge instead of sk98lin, but so far haven't had any success.
>
> Please test 2.6.24-rc1 (or -rc2) because there were several fixes for skge
> that made it work correctly for dual port fiber board. The worst bug in skge
> was that it configured the ram buffer incorrectly.
>
> I just submitted these for next 2.6.23.X stable release as well

I tested 2.6.24-rc1. This series of commands

fresno:~# modprobe skge
fresno:~# ip li set eth2 up
fresno:~# ip li set eth2 down
fresno:~# ip li set eth3 up

still hard-locks the box in the same place. Was there anything in the
-rc2 patch for skge?

-Chris

^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: sk98lin for 2.6.23-rc1
2007-09-08 19:11 ` Adrian Bunk
2007-09-09 2:42 ` Kyle Rose
2007-09-09 12:54 ` Chris Stromsoe
@ 2007-09-10 14:32 ` Bill Davidsen
2007-09-10 15:39 ` Adrian Bunk
2 siblings, 1 reply; 35+ messages in thread
From: Bill Davidsen @ 2007-09-10 14:32 UTC (permalink / raw)
To: Adrian Bunk
Cc: James Corey, Stephen Hemminger, Rob Sims, Kyle Rose, linux-kernel

Adrian Bunk wrote:
> On Sat, Sep 08, 2007 at 01:44:20PM -0400, Bill Davidsen wrote:
>
>> ...
>> That was with 2.6.22.5 (or so), dropped back to an old kernel with sk98lin,
>> previously had uptimes in three digit days. Up for a week or so now.
>>
>
> There is a real long-term advantage of removing drivers like sk98lin
> because it forces people to report bugs if the new driver doesn't work
> instead of giving them the workaround of using the obsolete driver.

The issue is that sk98lin is only obsolete because you say so! skge
crashes the system, as Chris reports. sky2 just stops passing bits and
behaves as if the network cable were idle: no error messages of any
nature, ping claims it's sending packets, tcpdump claims packets are
being sent, but the switch never blinks and systems on the switch see no
packets. Again, no error messages, no dumps, nothing which would help
you debug it, and it happens after some undefined time.

skge and sky2 are up to eight or ten versions now, and they still don't
work. Just because a driver works doesn't mean it's obsolete.

>
> And this has the (at first sight surprising) effect that removing code
> results in an improvement of the kernel.
>
>
>> Haven't tried later kernels, don't intend to, while no network is really
>> secure, it not really useful.
>>
>
> You are a regular reader of linux-kernel, and therefore the sk98lin
> removal can hardly be a surprise for you. If you prefer whining over
> helping to improve the kernel that's your choice...
>

I am trying to "improve the kernel" by advocating not removing reliable
drivers in favor of unreliable drivers. Saying a driver is better
because it has a clean design and good code is something I would expect
from someone who hadn't written or used code. If skge and sky2 were so
clean you wouldn't still be chasing obscure bugs after the driver had
been in the kernel for six+ versions, you wouldn't have me wasting time
trying to get a more secure kernel which is still reliable, wouldn't
have Willy Tarreau suggesting you should be marking sk98lin as obsolete
and leaving it in, wouldn't have someone maintaining sk98lin as a patch,
wouldn't have Chris Stromsoe getting hard lock-ups. No matter how ugly
sk98lin looks, and how well designed skge and sky2 may be, reliability
is not a beauty contest.

The volume of complaints should give you a hint that in this case the
new drivers aren't usefully stable for many people, and that you are
advocating a removal which is at least premature. If you can't admit
you're wrong on this one, you can say you have reconsidered the timing
of removal in light of new information.

--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979

* Re: sk98lin for 2.6.23-rc1
2007-09-10 14:32 ` Bill Davidsen
@ 2007-09-10 15:39 ` Adrian Bunk
2007-09-11 4:23 ` Kyle Moffett
0 siblings, 1 reply; 35+ messages in thread
From: Adrian Bunk @ 2007-09-10 15:39 UTC (permalink / raw)
To: Bill Davidsen
Cc: James Corey, Stephen Hemminger, Rob Sims, Kyle Rose, linux-kernel

On Mon, Sep 10, 2007 at 10:32:45AM -0400, Bill Davidsen wrote:
> Adrian Bunk wrote:
>> On Sat, Sep 08, 2007 at 01:44:20PM -0400, Bill Davidsen wrote:
>>
>>> ...
>>> That was with 2.6.22.5 (or so), dropped back to an old kernel with
>>> sk98lin, previously had uptimes in three digit days. Up for a week or so
>>> now.
>>>
>>
>> There is a real long-term advantage of removing drivers like sk98lin
>> because it forces people to report bugs if the new driver doesn't work
>> instead of giving them the workaround of using the obsolete driver.
>
> The issue is that sk98lin is only obsolete because you say so!

No, it is obsolete because we have more than one driver for this
hardware, and the people responsible for network drivers in the kernel
decided some time ago that sk98lin is the one that is obsolete.

>...
>> And this has the (at first sight surprising) effect that removing code
>> results in an improvement of the kernel.
>>
>>
>>> Haven't tried later kernels, don't intend to, while no network is really
>>> secure, it not really useful.
>>>
>>
>> You are a regular reader of linux-kernel, and therefore the sk98lin
>> removal can hardly be a surprise for you. If you prefer whining over
>> helping to improve the kernel that's your choice...
>>
>
> I am trying to "improve the kernel" by advocating not removing reliable
> drivers in favor of unreliable drivers. Saying a driver is better because
> it has a clean design and good code is something I would expect from
> someone who hadn't written or used code. If skge and sky2 were so clean you
> wouldn't still be chasing obscure bugs after the driver had been in the
> kernel for six+ versions, you wouldn't have me wasting time trying to get a
> more secure kernel which is still reliable, wouldn't have Willy Tarreau
> suggesting you should be marking sk98lin as obsolete and leaving it in,
> wouldn't have someone maintaining sk98lin as a patch, wouldn't have Chris
> Stromsoe getting hard lock-ups. No matter how ugly sk98lin looks, and how
> well designed skge and sky2 may be, reliability is not a beauty contest.

A better written driver might still lack some workarounds for broken
hardware or similar problems. Or simply contain some bugs like all
software does.

The important word is not "reliability", it's "maintainability".
And that's something that pays off in the long term.

> The volume of complaint should give you a hint that in this case the new
> drivers aren't usefully stable for many people, and that you are advocating
> a removal which is at least premature. If you can't admit you're wrong on
> this one, you can say you have reconsidered the timing of removal in light
> of new information.

It was clear that sk98lin would go in the long term, and the only thing
that could be discussed is the when and how of removal.

When you talk about "new information", why did this information not
surface until after the sk98lin driver was removed? Is there really a
problem with "the timing of removal" or would we have faced exactly the
same problems if the removal was timed a year later?

And this is really the essence when I'm saying "removing code improves
the kernel": The goal is to get people to report if the new drivers
aren't usefully stable for them, not to use sk98lin instead without
sending a bug report.

Having different drivers with different sets of bugs and features is not
a situation that should be retained for a longer time.

The underlying question is:
Is there anything better than a quick removal of the obsolete driver to
get people to both test and report bugs with the new driver?

Keeping obsolete drivers longer, only to run into exactly the same
problem later, isn't an improvement.

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

* Re: sk98lin for 2.6.23-rc1
2007-09-10 15:39 ` Adrian Bunk
@ 2007-09-11 4:23 ` Kyle Moffett
0 siblings, 0 replies; 35+ messages in thread
From: Kyle Moffett @ 2007-09-11 4:23 UTC (permalink / raw)
To: Adrian Bunk
Cc: Bill Davidsen, James Corey, Stephen Hemminger, Rob Sims, Kyle Rose, linux-kernel

On Sep 10, 2007, at 11:39:53, Adrian Bunk wrote:
> No, it is obsolete because we have more than one driver for this
> hardware, and the people responsible for network drivers in the
> kernel decided some time ago that sk98lin is the one that is obsolete.

I would like to happily report that the sky2 driver works great in
the NIC on my tablet where the sk98lin and skge drivers both fail
utterly and hang the kernel. On another system the sk98lin and skge
drivers don't recognize the chipset at all (missing PCI ID?) while
the sky2 driver works perfectly for large quantities of data
transferred.

Cheers,
Kyle Moffett

* Re: sk98lin for 2.6.23-rc1
2007-09-05 9:22 ` Stephen Hemminger
2007-09-05 19:42 ` James Corey
@ 2007-09-12 16:46 ` Torsten Kaiser
1 sibling, 0 replies; 35+ messages in thread
From: Torsten Kaiser @ 2007-09-12 16:46 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: Rob Sims, Adrian Bunk, Kyle Rose, linux-kernel

On 9/5/07, Stephen Hemminger <shemminger@linux-foundation.org> wrote:
>
> The only known outstanding problems on 2.6.22.6 of sky2 are:
> * problems with fibre PHY based systems
> * suspend/resume issues, missing multicast reinitialization, etc.
> The previous stability problems have been addressed.

Sorry to disappoint you, but it just hung for me again.

After seeing the backport of commit
c59697e06058fc2361da8cefcfa3de85ac107582 as "sky2: restore workarounds
for lost interrupts" going into 2.6.22.5 I decided to give it another
try. First tests worked and for two days I had no trouble, but today the
network hung again, until I removed and reinserted the sky2 module.

I'm using the Gentoo kernel 2.6.22-gentoo-r6 which is based on 2.6.22.6.
(All patches at http://dev.gentoo.org/~dsd/genpatches/patches-2.6.22-7.htm )
This is an x86_64 kernel but with a 32bit userland.

My hardware:
00:00.0 Host bridge: Intel Corporation 82915G/P/GV/GL/PL/910GL Memory Controller Hub (rev 04)
00:02.0 VGA compatible controller: Intel Corporation 82915G/GV/910GL Integrated Graphics Controller (rev 04)
00:02.1 Display controller: Intel Corporation 82915G Integrated Graphics Controller (rev 04)
00:1b.0 Audio device: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) High Definition Audio Controller (rev 03)
00:1c.0 PCI bridge: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) PCI Express Port 1 (rev 03)
00:1d.0 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #1 (rev 03)
00:1d.1 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #2 (rev 03)
00:1d.2 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #3 (rev 03)
00:1d.3 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #4 (rev 03)
00:1d.7 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB2 EHCI Controller (rev 03)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev d3)
00:1f.0 ISA bridge: Intel Corporation 82801FB/FR (ICH6/ICH6R) LPC Interface Bridge (rev 03)
00:1f.1 IDE interface: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) IDE Controller (rev 03)
00:1f.2 IDE interface: Intel Corporation 82801FB/FW (ICH6/ICH6W) SATA Controller (rev 03)
00:1f.3 SMBus: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) SMBus Controller (rev 03)
01:04.0 FireWire (IEEE 1394): Texas Instruments TSB43AB22/A IEEE-1394a-2000 Controller (PHY/Link)
01:0b.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
02:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit Ethernet Controller (rev 19)

The Marvell controller is onboard, more info:
linux ~ # lspci -vxxx -s 02:00.0
02:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit Ethernet Controller (rev 19)
    Subsystem: ASUSTeK Computer Inc. Marvell 88E8053 Gigabit Ethernet controller PCIe (Asus)
    Flags: bus master, fast devsel, latency 0, IRQ 318
    Memory at cfffc000 (64-bit, non-prefetchable) [size=16K]
    I/O ports at e800 [size=256]
    Expansion ROM at cffc0000 [disabled] [size=128K]
    Capabilities: [48] Power Management version 2
    Capabilities: [50] Vital Product Data
    Capabilities: [5c] Message Signalled Interrupts: Mask- 64bit+ Queue=0/1 Enable+
    Capabilities: [e0] Express Legacy Endpoint IRQ 0
00: ab 11 62 43 07 04 10 00 19 00 00 02 04 00 00 00
10: 04 c0 ff cf 00 00 00 00 01 e8 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 43 10 42 81
30: 00 00 fc cf 48 00 00 00 00 00 00 00 0a 01 00 00
40: 00 00 f0 01 00 80 a0 01 01 50 02 fe 00 20 00 13
50: 03 5c 00 80 00 00 00 01 00 00 00 01 05 e0 83 00
60: 0c 30 e0 fe 00 00 00 00 89 41 00 00 00 00 00 00
70: 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 10 00 11 00 c0 0f 00 00 00 20 1b 00 11 a4 03 00
f0: 08 00 11 10 00 00 00 00 00 00 00 00 00 00 00 00

From /proc/interrupts:
318: 230462 0 PCI-MSI-edge eth2

From syslog:
Sep 12 11:01:27 linux [ 9580.538373] CIFS VFS: server not responding
Sep 12 11:01:27 linux [ 9580.538385] CIFS VFS: No response for cmd 50 mid 34863

Now the network was dead. I tried to restart it with
ifconfig down && ifconfig up:
Sep 12 11:03:54 linux [ 9727.917997] sky2 eth2: disabling interface
Sep 12 11:03:55 linux [ 9728.270436] sky2 eth2: enabling interface
Sep 12 11:03:55 linux [ 9728.272401] sky2 eth2: ram buffer 48K
Sep 12 11:03:56 linux [ 9730.016797] sky2 eth2: Link is up at 100 Mbps, full duplex, flow control both

As that did not help, I removed the sky2 module and reinserted it:
Sep 12 11:04:12 linux [ 9745.832197] sky2 eth2: disabling interface
Sep 12 11:04:18 linux [ 9751.197733] ACPI: PCI interrupt for device 0000:02:00.0 disabled
Sep 12 11:04:25 linux [ 9758.264714] ACPI: PCI Interrupt 0000:02:00.0[A] -> GSI 16 (level, low) -> IRQ 16
Sep 12 11:04:25 linux [ 9758.264736] PCI: Setting latency timer of device 0000:02:00.0 to 64
Sep 12 11:04:25 linux [ 9758.265409] sky2 0000:02:00.0: v1.14 addr 0xcfffc000 irq 16 Yukon-EC (0xb6) rev 2
Sep 12 11:04:25 linux [ 9758.265910] sky2 eth0: addr 00:15:f2:55:ce:f9
Sep 12 11:04:25 linux [ 9758.267754] udev: renamed network interface eth0 to eth2
Sep 12 11:04:25 linux [ 9758.705240] sky2 eth2: enabling interface
Sep 12 11:04:25 linux [ 9758.707076] sky2 eth2: ram buffer 48K
Sep 12 11:04:27 linux [ 9760.592061] sky2 eth2: Link is up at 100 Mbps, full duplex, flow control both

Now the network was up again, but around one hour later it hung again.
After removing and reinserting the module it started to work again,
this time until I went home.

I switched back to the Realtek 8139, as that card works.

I can provide more info about the hardware, but I can't test any
patches, as this server is needed for work and random hangs after hours
of working are not really the nicest things to debug.

Torsten

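The workaround Torsten describes (notice that the interface has gone quiet, then remove and reinsert sky2) might be sketched along these lines. It is only a heuristic under stated assumptions: it treats an unchanged /proc/interrupts counter for the interface as a sign of a stall, which is only meaningful while traffic is expected, and it assumes the interface name is the last field on its line, which is not true for shared IRQs. irq_count, is_stalled, reload_driver and DRY_RUN are hypothetical names, not part of sky2 or of any tool mentioned in the thread.

```sh
#!/bin/sh
# Heuristic sketch, not from the thread: detect a possibly stalled NIC by
# comparing two samples of its interrupt counter, then reload the driver.
# DRY_RUN=1 (the default) only prints the reload commands.

DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# irq_count IFNAME: read /proc/interrupts-style text on stdin and print the
# sum of the per-CPU counters on the line whose last field is IFNAME.
# Exits non-zero if no such line exists.
irq_count() {
    awk -v ifname="$1" '
        $NF == ifname {
            total = 0
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^[0-9]+$/) total += $i
                else break
            }
            printf "%.0f\n", total
            found = 1
            exit
        }
        END { if (!found) exit 1 }
    '
}

# is_stalled BEFORE AFTER: succeeds when the counter did not advance.
is_stalled() {
    [ "$2" -le "$1" ]
}

# reload_driver MODULE: remove and reinsert the module, as in the message above.
reload_driver() {
    run rmmod "$1"
    run modprobe "$1"
}

# Possible use (left commented out because it needs the real interface):
#   before=$(irq_count eth2 < /proc/interrupts)
#   sleep 30
#   after=$(irq_count eth2 < /proc/interrupts)
#   is_stalled "$before" "$after" && reload_driver sky2
```

Matching on the interface name rather than the IRQ number is deliberate: in the logs above the device moves from IRQ 318 to IRQ 16 after the module is reloaded.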
* Re: sk98lin for 2.6.23-rc1
2007-07-26 15:16 sk98lin for 2.6.23-rc1 Kyle Rose
2007-07-26 16:28 ` Jan Engelhardt
2007-07-26 16:57 ` Adrian Bunk
@ 2007-07-26 19:17 ` Stephen Hemminger
2007-07-26 23:52 ` Bill Davidsen
3 siblings, 0 replies; 35+ messages in thread
From: Stephen Hemminger @ 2007-07-26 19:17 UTC (permalink / raw)
To: linux-kernel

On Thu, 26 Jul 2007 11:16:36 -0400
Kyle Rose <krose@akamai.com> wrote:

> From http://www.krose.org/~krose/computing.html:
>
> Since the sky2 driver continues to suck ass (which is a technical
> description for "it hangs all the time under load, at least on my
> hardware" :-) ), I've fixed the sk98lin driver to compile for
> linux-2.6.23-rc1. Those who continue to have problems with sky2 can
> still use 2.6.23-rc1, simply by doing the following:
>

Just don't build it with lock debugging enabled or you will see all the
deadlocks lying below the surface. Worse yet, read the macro hell of
sky2le.h.

* Re: sk98lin for 2.6.23-rc1
2007-07-26 15:16 sk98lin for 2.6.23-rc1 Kyle Rose
` (2 preceding siblings ...)
2007-07-26 19:17 ` Stephen Hemminger
@ 2007-07-26 23:52 ` Bill Davidsen
2007-07-27 1:13 ` Kyle Rose
3 siblings, 1 reply; 35+ messages in thread
From: Bill Davidsen @ 2007-07-26 23:52 UTC (permalink / raw)
To: Kyle Rose; +Cc: linux-kernel

Kyle Rose wrote:
> From http://www.krose.org/~krose/computing.html:
>
> Since the sky2 driver continues to suck ass (which is a technical
> description for "it hangs all the time under load, at least on my
> hardware" :-) ), I've fixed the sk98lin driver to compile for
> linux-2.6.23-rc1. Those who continue to have problems with sky2 can
> still use 2.6.23-rc1, simply by doing the following:
>

Bless you, that extends my update capability for another version. ;-)

However, Ingo posted a patch for the thread "network dies after random
time" which probably didn't make it into rc1. In all fairness, applying
that might fix the problem; it's possible, if unlikely, that the new
driver tickles a bug the stable sk98lin driver didn't.

Does skge work for your hardware? Based on a sample size of one (four to
go), everything worked for me except NFS: jumbo packets work with TCP,
not with UDP. I don't have everything nailed down enough for a proper
bug report, it's just something to note. In truth there's little to
choose between TCP and UDP for machines in the same room, so I could
live with skge.

I haven't tried sky2. There was a build failure on my last server build,
and I won't look at it until Monday.

--
Bill Davidsen <davidsen@tmr.com>
"We have more to fear from the bungling of the incompetent than from
the machinations of the wicked." - from Slashdot

* Re: sk98lin for 2.6.23-rc1
2007-07-26 23:52 ` Bill Davidsen
@ 2007-07-27 1:13 ` Kyle Rose
0 siblings, 0 replies; 35+ messages in thread
From: Kyle Rose @ 2007-07-27 1:13 UTC (permalink / raw)
To: Bill Davidsen; +Cc: linux-kernel

> Does skge work for your hardware?

I unloaded sky2 and loaded skge at one point, but it didn't recognize my
hardware. Perhaps it doesn't work with the 88E8053?

Kyle