* Oops report for the week preceding June 16th, 2008
@ 2008-06-16 18:24 Arjan van de Ven
  2008-06-17  9:20 ` Ingo Molnar
  2008-06-17 17:18 ` Bob Copeland
  0 siblings, 2 replies; 20+ messages in thread
From: Arjan van de Ven @ 2008-06-16 18:24 UTC
To: Linux Kernel Mailing List
Cc: Linus Torvalds, Andrew Morton, Ingo Molnar, Thomas Gleixner,
    John W. Linville

This week, a total of 3877 oopses and warnings have been reported,
compared to 3390 reports in the previous week.

Recently, Fedora put out an updated kernel that contained a wireless
update; unfortunately, this update was rather broken and caused
various things to show up in the top 20. A few days later, another
update fixing the most obvious ones got released with the result that
only rank 2 and 9 are from the broken update, rather than a lot more..

A new feature this week: for certain types of oopses (need to happen in
vmlinux etc), the website now shows a mixed view of code/disassembly of
the oops. For example, on
http://www.kerneloops.org/searchweek.php?search=page_remove_rmap
you can hover your mouse over the "Decode" word and it'll show the
disassembled view. This information is also part of the detailed view of
each oops that is suitable, and is also in the git exported view
(git clone git://www.kerneloops.org/ )

Per file statistics
  570 external/madwifi/wrapper (P)
  323 drivers/net/wireless/b43/main.c
  284 external/madwifi/binary (P)
  276 drivers/parport/procfs.c
  230 fs/sysfs/dir.c
  208 fs/jbd/journal.c
  174 security/selinux/hooks.c
  137 net/mac80211/util.c
   81 kernel/time/tick-broadcast.c
   58 fs/ext3/super.c
   49 net/mac80211/main.c
   48 external/nvidia/binary (P)
   45 mm/rmap.c

Seen with untainted kernels
---------------------------

Rank 2: b43_generate_noise_sample (warning)
    Reported 323 times (389 total reports)
    [fixed] too strict WARN_ON in b43 driver
    Fix available; not merged in mainline yet
    This warning was last seen in version 2.6.26-rc4-git5, and first seen in 2.6.25.3.
    More info: http://www.kerneloops.org/searchweek.php?search=b43_generate_noise_sample

Rank 4: parport_device_proc_register (warning)
    Reported 276 times (1290 total reports)
    Duplicate /proc registration in the parport driver
    This warning was last seen in version 2.6.26-rc5-git7, and first seen in 2.6.24-rc5.
    More info: http://www.kerneloops.org/searchweek.php?search=parport_device_proc_register

Rank 5: sysfs_add_one (warning)
    Reported 217 times (1280 total reports)
    Standard duplicate device name registration issue, still very much alive
    This warning was last seen in version 2.6.26-rc3, and first seen in 2.6.24-rc6.
    More info: http://www.kerneloops.org/searchweek.php?search=sysfs_add_one

Rank 6: journal_update_superblock (warning)
    Reported 208 times (972 total reports)
    Likely caused by the user removing a USB stick while mounted
    This warning was last seen in version 2.6.26, and first seen in 2.6.24-rc6-git1.
    More info: http://www.kerneloops.org/searchweek.php?search=journal_update_superblock

Rank 9: ieee80211_iterate_active_interfaces (warning)
    Reported 137 times (240 total reports)
    [fedora] Fedora merged a buggy rt25xx wireless driver patch
    This warning was last seen in version 2.6.25.4, and first seen in 2.6.25.4.
    More info: http://www.kerneloops.org/searchweek.php?search=ieee80211_iterate_active_interfaces

Rank 10: tick_broadcast_oneshot_control (softlockup)
    Reported 81 times (425 total reports)
    Hard to trace down issue; but it's too frequent to be a fluke
    This softlockup was last seen in version 2.6.25.6, and first seen in 2.6.24-rc4.
    More info: http://www.kerneloops.org/searchweek.php?search=tick_broadcast_oneshot_control

Rank 11: ext3_commit_super (warning)
    Reported 53 times (237 total reports)
    Likely caused by the user removing a USB stick while mounted
    This warning was last seen in version 2.6.25.6, and first seen in 2.6.24.
    More info: http://www.kerneloops.org/searchweek.php?search=ext3_commit_super

Rank 12: default_idle (oops)
    Reported 44 times (118 total reports)
    Similar to the tick_broadcast_oneshot_control issue; hard to trace down
    This oops was last seen in version 2.6.26-rc5, and first seen in 2.6.21.3.
    More info: http://www.kerneloops.org/searchweek.php?search=default_idle

Only seen with tainted kernels
------------------------------

Rank 1: ath_dynamic_sysctl_register (warning)
    Reported 376 times (3007 total reports)
    [external] Bug in the proprietary madwifi driver
    warning only shows up in tainted kernels
    This warning was last seen in version 2.6.25.6, and first seen in 2.6.24.
    More info: http://www.kerneloops.org/searchweek.php?search=ath_dynamic_sysctl_register

Rank 3: init_ath_hal (warning)
    Reported 284 times (1843 total reports)
    [external] Bug in the proprietary madwifi driver
    warning only shows up in tainted kernels
    This warning was last seen in version 2.6.25.6, and first seen in 2.6.24.
    More info: http://www.kerneloops.org/searchweek.php?search=init_ath_hal

Rank 7: ath_sysctl_register (warning)
    Reported 194 times (810 total reports)
    [external] Bug in the proprietary madwifi driver
    warning only shows up in tainted kernels
    This warning was last seen in version 2.6.25.6, and first seen in 2.6.24-rc4-git4.
    More info: http://www.kerneloops.org/searchweek.php?search=ath_sysctl_register

Rank 8: task_has_capability (warning)
    Reported 166 times (580 total reports)
    [out of tree] Bug in the proprietary firegl driver
    warning only shows up in tainted kernels
    This warning was last seen in version 2.6.25.6, and first seen in 2.6.25.
    More info: http://www.kerneloops.org/searchweek.php?search=task_has_capability

Rank 13: VNetBridgeDown (warning)
    Reported 41 times (230 total reports)
    [external] Bug in the proprietary VMWare drivers
    warning only shows up in tainted kernels
    This warning was last seen in version 2.6.25.6, and first seen in 2.6.24.
    More info: http://www.kerneloops.org/searchweek.php?search=VNetBridgeDown
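As a rough cross-check of the figures in the report above, here is a minimal Python sketch (not part of the original mail). It computes the week-over-week change and how the ranked entries split between untainted and tainted kernels. The counts are copied from the report; the names `percent_change`, `UNTAINTED` and `TAINTED` are made up for this illustration.

```python
# Weekly totals as stated in the report.
THIS_WEEK = 3877
LAST_WEEK = 3390

# (rank, function, reports this week), copied from the ranked entries.
UNTAINTED = [
    (2, "b43_generate_noise_sample", 323),
    (4, "parport_device_proc_register", 276),
    (5, "sysfs_add_one", 217),
    (6, "journal_update_superblock", 208),
    (9, "ieee80211_iterate_active_interfaces", 137),
    (10, "tick_broadcast_oneshot_control", 81),
    (11, "ext3_commit_super", 53),
    (12, "default_idle", 44),
]
TAINTED = [
    (1, "ath_dynamic_sysctl_register", 376),
    (3, "init_ath_hal", 284),
    (7, "ath_sysctl_register", 194),
    (8, "task_has_capability", 166),
    (13, "VNetBridgeDown", 41),
]


def percent_change(new, old):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old


untainted_total = sum(count for _, _, count in UNTAINTED)
tainted_total = sum(count for _, _, count in TAINTED)
listed_total = untainted_total + tainted_total

print(f"week-over-week change: {percent_change(THIS_WEEK, LAST_WEEK):+.1f}%")
print(f"untainted (ranked entries): {untainted_total}")
print(f"tainted (ranked entries):   {tainted_total}")
print(f"ranked entries as share of all reports: "
      f"{100.0 * listed_total / THIS_WEEK:.1f}%")
```

Only the thirteen ranked entries are counted here, so the split says nothing about the reports that did not make the list.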
* Re: Oops report for the week preceding June 16th, 2008
  2008-06-16 18:24 Oops report for the week preceding June 16th, 2008 Arjan van de Ven
@ 2008-06-17  9:20 ` Ingo Molnar
  2008-06-17  9:26   ` David Miller
  2008-06-17 17:18 ` Bob Copeland
  1 sibling, 1 reply; 20+ messages in thread
From: Ingo Molnar @ 2008-06-17 9:20 UTC
To: Arjan van de Ven
Cc: Linux Kernel Mailing List, Linus Torvalds, Andrew Morton,
    Thomas Gleixner, John W. Linville, Dave Jones, Greg Kroah-Hartman

* Arjan van de Ven <arjan@linux.intel.com> wrote:

> This week, a total of 3877 oopses and warnings have been reported,
> compared to 3390 reports in the previous week.
>
> Recently, Fedora put out an updated kernel that contained a wireless
> update; unfortunately, this update was rather broken and caused
> various things to show up in the top 20. A few days later, another
> update fixing the most obvious ones got released with the result that
> only rank 2 and 9 are from the broken update, rather than a lot more..

sidenote: i suspect Fedora has done this to enable more hardware, and/or
to fix mainline wireless bugs? I wish we would do such new driver
merging in mainline instead, so that we had a single point of testing
and single point of effort.

Same for Nouveau: Fedora carries it and i dont understand why such a
major piece of work is not done in mainline and not _helped by_
mainline. It's not like there would be any big risk from having such a
new, experimental 3D driver around - instead of people running nvidia.ko
that causes trouble in all sorts of other subsystems.

All the years of moaning about nvidia.ko and finally we have some real
OSS project and real chance of action but after a year of development
Nouveau still has not been picked up ...

When distros feel the need to add large and risky patches that IMO shows
process failure on our part and further isolates mainline from distros
and from testers.

While we dont want to merge anything that gets thrown at us, not merging
new, major, new-hardware-enabling OSS drivers in the mainline kernel is
almost the same thing as intentionally hurting OSS projects and helping
binary-only drivers.

	Ingo
* Re: Oops report for the week preceding June 16th, 2008
  2008-06-17  9:20 ` Ingo Molnar
@ 2008-06-17  9:26   ` David Miller
  2008-06-17 15:33     ` Ingo Molnar
  0 siblings, 1 reply; 20+ messages in thread
From: David Miller @ 2008-06-17 9:26 UTC
To: mingo; +Cc: arjan, linux-kernel, torvalds, akpm, tglx, linville, davej, gregkh

From: Ingo Molnar <mingo@elte.hu>
Date: Tue, 17 Jun 2008 11:20:23 +0200

> When distros feel the need to add large and risky patches that IMO shows
> process failure on our part and further isolates mainline from distros
> and from testers.

You say this to the point where you sound like a broken record.
It's a bit tiring, and nothing positive ever comes of these rants.

And I think you massively oversimplify the situation, to top it off.

On the wireless front, I severely doubt... in fact I know because I'm
looking at every wireless merge going into my tree, that John Linville
is not holding back new drivers submissions from the current 2.6.26
tree. Neither is Jeff Garzik for non-wireless net drivers.

If the Fedora9 tree is based off of 2.6.25 or similar (it is), well
that's how life works. Stuff gets backported from mainline into
whatever they are using and stuff breaks from time to time.

Did you investigate any of these facts to figure out what the specific
situation is here? Or did some of your favorite keywords pop up in
Arjan's report so that you could unleash your favorite whine?
* Re: Oops report for the week preceding June 16th, 2008
  2008-06-17  9:26   ` David Miller
@ 2008-06-17 15:33     ` Ingo Molnar
  2008-06-17 17:54       ` Greg KH
                        ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Ingo Molnar @ 2008-06-17 15:33 UTC
To: David Miller
Cc: arjan, linux-kernel, torvalds, akpm, tglx, linville, davej, gregkh

* David Miller <davem@davemloft.net> wrote:

> On the wireless front, I severely doubt... in fact I know because I'm
> looking at every wireless merge going into my tree, that John Linville
> is not holding back new drivers submissions from the current 2.6.26
> tree. Neither is Jeff Garzik for non-wireless net drivers.

i have no gripes about the current situation of wireless in linux-next,
other than it all came 1-2 years too late:

  $ for ((v=12; v<27; v++)); do v2=v2.6.$[$v+1]; \
      [ $v = 25 ] && v2=linus/master; \
      [ $v = 26 ] && v2=linux-next/master; \
      echo -n v2.6.$[$v+1]": "; \
      git-diff --shortstat -M v2.6.$v..$v2 drivers/net/wireless/; done

  v2.6.13:  16 files changed,   1707 insertions(+),  1353 deletions(-)
  v2.6.14:  46 files changed,  40734 insertions(+),   756 deletions(-)
  v2.6.15:  53 files changed,   8016 insertions(+),  4183 deletions(-)
  v2.6.16:  37 files changed,   1818 insertions(+),  2513 deletions(-)
  v2.6.17:  64 files changed,  17829 insertions(+),  2214 deletions(-)
  v2.6.18:  78 files changed,  11159 insertions(+),  1427 deletions(-)
  v2.6.19:  63 files changed,   3441 insertions(+),  1500 deletions(-)
  v2.6.20:  58 files changed,   1290 insertions(+),  1028 deletions(-)
  v2.6.21:  42 files changed,    729 insertions(+),   678 deletions(-)
  v2.6.22:  85 files changed,  18989 insertions(+),   552 deletions(-)
  v2.6.23:  42 files changed,   2824 insertions(+),   356 deletions(-)
  v2.6.24: 208 files changed, 100960 insertions(+),  4303 deletions(-)
  v2.6.25: 227 files changed,  54467 insertions(+), 23126 deletions(-)
     -git: 214 files changed,  21940 insertions(+), 34143 deletions(-)
    -next: 126 files changed,  13585 insertions(+), 10146 deletions(-)

up to v2.6.24 (released only 4 months ago!) we amassed a huge backlog of
~100+ KLOC wireless changes - there were OSS wireless drivers that
havent been merged for up to 1.5 years.

v2.6.24 was no doubt a huge step in the right direction but it came too
late and we are still suffering from the fallout today as we have not
reached test cycle equilibrium yet: by the time mainline gets the
patches a new large batch comes up, invalidating much of mainline's role
and forcing distros to gamble with (much untested and thus detached from
reality) experimental branches.

That's my main point: when we mess up and dont merge OSS driver code
that was out there in time - and we messed up big time with wireless -
we should admit the screwup and swallow the bitter pill.

I.e. should merge the _full_ pipeline, open up to every developer who is
willing to help with the mess, face instability for a short while until
the dust settles and go for absolutely short turnaround for fixes and
even enhancements - because there's little QA value in the existing
code.

Instead of pretending that we are "stable" (in this area of the kernel)
- with a code base that distros end up skipping over.

Have a look at Fedora's kernel-2.6.25.6-24.fc8.src.rpm (which is the
Fedora kernel Arjan referred to and which we are talking about here) to
see how this all ends up in distros in practice:

  earth4:/usr/src/redhat/SOURCES> ls -ldt *wireless*

  -rw-r--r-- 1 root root 4102957 2008-06-03 23:01 linux-2.6-wireless.patch
   (excluding renames: 214 files changed, 21940 insertions, 34143 deletions)

  -rw-r--r-- 1 root root 1663540 2008-05-29 20:46 linux-2.6-wireless-pending.patch
   (excluding renames: 126 files changed, 13585 insertions, 10146 deletions)

  -rw-r--r-- 1 root root   38430 2008-05-29 20:46 linux-2.6-wireless-fixups.patch

linux-2.6-wireless.patch [4 MB patch, 55KLOC flux] is what v2.6.26 will
be in a month or so, and it is already an obsolete, historic version,
compared to what Fedora ships today...

linux-2.6-wireless-pending.patch [1.6 MB patch, 23 KLOC flux] is what is
in linux-next in essence and what will go into v2.6.27.

Do you think Fedora jumped to the linux-next version of wireless because
the current (not even released) mainline version was working so well?

And lets finally admit that this pain is all happening to us because we:

     _didnt merge drivers soon enough_

Just about anyone who tried to use 3D and wireless on a Linux PC in the
past 3 years will attest to that, without the need for much background
research ;-)

IMO we are not learning and are repeating history once again, as the
Nouveau situation is building up towards a similar "we didnt merge it in
time" pain point. From kernel-2.6.25.6-24.fc8.src.rpm:

  -rw-r--r-- 1 root root 513639 2008-05-22 04:31 nouveau-drm.patch

  39 files changed, 13960 insertions(+), 5 deletions(-)

Nouveau has been started in 2006, about two years ago. It's a lot less
painful (not the least it is a lot faster as well) if such things are
developed gradually in mainline.

	Ingo
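The "flux" figures in the message above appear to be insertions plus deletions from the `--shortstat` lines (an assumption on my part; the mail does not define the term). A minimal Python sketch, not part of the original mail, that parses such lines and adds the two counts. The function names `parse_shortstat` and `flux` are hypothetical.

```python
import re

# Matches the summary line printed by `git diff --shortstat`, e.g.
#   "214 files changed, 21940 insertions(+), 34143 deletions(-)"
# The insertion and deletion parts are optional because git omits
# a part whose count is zero.
SHORTSTAT = re.compile(
    r"(\d+) files? changed"
    r"(?:, (\d+) insertions?\(\+\))?"
    r"(?:, (\d+) deletions?\(-\))?"
)


def parse_shortstat(line):
    """Return (files, insertions, deletions) from one shortstat line."""
    match = SHORTSTAT.search(line)
    if match is None:
        raise ValueError(f"not a shortstat line: {line!r}")
    files, insertions, deletions = (
        int(group) if group else 0 for group in match.groups()
    )
    return files, insertions, deletions


def flux(line):
    """Insertions plus deletions: the assumed meaning of 'flux'."""
    _, insertions, deletions = parse_shortstat(line)
    return insertions + deletions


# Lines copied from the table in the message above.
git_line = "-git: 214 files changed, 21940 insertions(+), 34143 deletions(-)"
next_line = "-next: 126 files changed, 13585 insertions(+), 10146 deletions(-)"

print(flux(git_line))   # 21940 + 34143
print(flux(next_line))  # 13585 + 10146
```

Under that reading the two sums come to 56083 and 23731 lines, which is close to the "55KLOC" and "23 KLOC" quoted in the mail.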
* Re: Oops report for the week preceding June 16th, 2008
  2008-06-17 15:33     ` Ingo Molnar
@ 2008-06-17 17:54       ` Greg KH
  2008-06-17 18:14         ` Dave Jones
  2008-06-17 18:43         ` Daniel Barkalow
  2008-06-17 19:24       ` Johannes Berg
  2008-06-17 21:51       ` David Miller
  2 siblings, 2 replies; 20+ messages in thread
From: Greg KH @ 2008-06-17 17:54 UTC
To: Ingo Molnar
Cc: David Miller, arjan, linux-kernel, torvalds, akpm, tglx, linville, davej

On Tue, Jun 17, 2008 at 05:33:56PM +0200, Ingo Molnar wrote:
> IMO we are not learning and are repeating history once again, as the
> Nouveau situation is building up towards a similar "we didnt merge it in
> time" pain point. From kernel-2.6.25.6-24.fc8.src.rpm:
>
>   -rw-r--r-- 1 root root 513639 2008-05-22 04:31 nouveau-drm.patch
>
>   39 files changed, 13960 insertions(+), 5 deletions(-)
>
> Nouveau has been started in 2006, about two years ago. It's a lot less
> painful (not the least it is a lot faster as well) if such things are
> developed gradually in mainline.

Not to dispute your original claim of wanting to merge drivers earlier,
but a lot of the time, there are good reasons why the code doesn't get
merged.

As recently pointed out by the nouveau driver authors on the xorg
mailing list, they don't want the driver to be added to the main
kernel.org tree yet as they feel that their userspace/kernelspace
interface is not complete and will change in the future.

We try to respect the authors of the code when not including them into
the kernel tree for situations like this :)

As for why Fedora added it, it might be because they can control both
sides of the boundary with matching packages much easier.

thanks,

greg k-h
* Re: Oops report for the week preceding June 16th, 2008
  2008-06-17 17:54       ` Greg KH
@ 2008-06-17 18:14         ` Dave Jones
  2008-06-17 18:43         ` Daniel Barkalow
  1 sibling, 0 replies; 20+ messages in thread
From: Dave Jones @ 2008-06-17 18:14 UTC
To: Greg KH
Cc: Ingo Molnar, David Miller, arjan, linux-kernel, torvalds, akpm,
    tglx, linville

On Tue, Jun 17, 2008 at 10:54:14AM -0700, Greg KH wrote:

 > We try to respect the authors of the code when not including them into
 > the kernel tree for situations like this :)
 >
 > As for why Fedora added it, it might be because they can control both
 > sides of the boundary with matching packages much easier.

That's exactly it. Dave Airlie keeps both the X and kernel side of DRI
in check in Fedora, and with him being the DRI maintainer, he tends to
have a good handle on the state of things.

Nouveau has been a bit bumpy, and isn't ready for mass-use, which is why
we don't enable it by default. We ship it, but a user has to actually
install it, and set it up to explicitly use it instead of the 'nv' X
driver right now.

Given it's there as a sort of 'preview' for interested parties, I don't
think the world is ending because we jumped the gun by shipping this
even though it's not upstream.

	Dave

--
http://www.codemonkey.org.uk
* Re: Oops report for the week preceding June 16th, 2008
  2008-06-17 17:54       ` Greg KH
  2008-06-17 18:14         ` Dave Jones
@ 2008-06-17 18:43         ` Daniel Barkalow
  2008-06-17 19:31           ` Johannes Berg
  2008-06-17 22:48           ` Greg KH
  1 sibling, 2 replies; 20+ messages in thread
From: Daniel Barkalow @ 2008-06-17 18:43 UTC
To: Greg KH
Cc: Ingo Molnar, David Miller, arjan, linux-kernel, torvalds, akpm,
    tglx, linville, davej

On Tue, 17 Jun 2008, Greg KH wrote:

> On Tue, Jun 17, 2008 at 05:33:56PM +0200, Ingo Molnar wrote:
> > IMO we are not learning and are repeating history once again, as the
> > Nouveau situation is building up towards a similar "we didnt merge it in
> > time" pain point. From kernel-2.6.25.6-24.fc8.src.rpm:
> >
> >   -rw-r--r-- 1 root root 513639 2008-05-22 04:31 nouveau-drm.patch
> >
> >   39 files changed, 13960 insertions(+), 5 deletions(-)
> >
> > Nouveau has been started in 2006, about two years ago. It's a lot less
> > painful (not the least it is a lot faster as well) if such things are
> > developed gradually in mainline.
>
> Not to dispute your original claim of wanting to merge drivers earlier,
> but a lot of the time, there are good reasons why the code doesn't get
> merged.
>
> As recently pointed out by the nouveau driver authors on the xorg
> mailing list, they don't want the driver to be added to the main
> kernel.org tree yet as they feel that their userspace/kernelspace
> interface is not complete and will change in the future.

That's the same reason the wireless drivers didn't get merged sooner, with
the slight difference that they were waiting on the new 802.11 stack's
interface to stabilize, rather than their own interface.

On the other hand, it would be good if there were a way to include
unstable APIs in the mainline kernel so that they could get some exposure
before they're set in stone, and that would also eliminate that reason for
keeping drivers out so long.

	-Daniel
*This .sig left intentionally blank*
* Re: Oops report for the week preceding June 16th, 2008
  2008-06-17 18:43         ` Daniel Barkalow
@ 2008-06-17 19:31           ` Johannes Berg
  2008-06-17 22:48           ` Greg KH
  1 sibling, 0 replies; 20+ messages in thread
From: Johannes Berg @ 2008-06-17 19:31 UTC
To: Daniel Barkalow
Cc: Greg KH, Ingo Molnar, David Miller, arjan, linux-kernel, torvalds,
    akpm, tglx, linville, davej

On Tue, 2008-06-17 at 14:43 -0400, Daniel Barkalow wrote:

> That's the same reason the wireless drivers didn't get merged sooner, with
> the slight difference that they were waiting on the new 802.11 stack's
> interface to stabilize, rather than their own interface.
>
> On the other hand, it would be good if there were a way to include
> unstable APIs in the mainline kernel so that they could get some exposure
> before they're set in stone, and that would also eliminate that reason for
> keeping drivers out so long.

Small correction here: We actually did evolve the mac80211 APIs quite
radically in mainline (and new API revamps will be landing in .26 and
.27). Most drivers, however, were waiting to be written against the
mac80211 API, i.e. there were drivers against net80211 or (even more of
them) drivers that had their own 802.11 stack in the driver, which meant
the driver needed to be "ported" (rewritten) to mac80211's API. Stuff
like that takes time when each driver has at best one or two interested
developers.

johannes
* Re: Oops report for the week preceding June 16th, 2008
  2008-06-17 18:43         ` Daniel Barkalow
  2008-06-17 19:31           ` Johannes Berg
@ 2008-06-17 22:48           ` Greg KH
  2008-06-18  2:40             ` Daniel Barkalow
  1 sibling, 1 reply; 20+ messages in thread
From: Greg KH @ 2008-06-17 22:48 UTC
To: Daniel Barkalow
Cc: Ingo Molnar, David Miller, arjan, linux-kernel, torvalds, akpm,
    tglx, linville, davej

On Tue, Jun 17, 2008 at 02:43:02PM -0400, Daniel Barkalow wrote:
>
> On the other hand, it would be good if there were a way to include
> unstable APIs in the mainline kernel so that they could get some exposure
> before they're set in stone, and that would also eliminate that reason for
> keeping drivers out so long.

That's exactly what the documentation in Documentation/ABI is there for.
Document your "experimental" API, along with any userspace programs that
are using it, and work to try to finalize it.

thanks,

greg k-h
* Re: Oops report for the week preceding June 16th, 2008
  2008-06-17 22:48           ` Greg KH
@ 2008-06-18  2:40             ` Daniel Barkalow
  0 siblings, 0 replies; 20+ messages in thread
From: Daniel Barkalow @ 2008-06-18 2:40 UTC
To: Greg KH
Cc: Ingo Molnar, David Miller, arjan, linux-kernel, torvalds, akpm,
    tglx, linville, davej

On Tue, 17 Jun 2008, Greg KH wrote:

> On Tue, Jun 17, 2008 at 02:43:02PM -0400, Daniel Barkalow wrote:
> >
> > On the other hand, it would be good if there were a way to include
> > unstable APIs in the mainline kernel so that they could get some exposure
> > before they're set in stone, and that would also eliminate that reason for
> > keeping drivers out so long.
>
> That's exactly what the documentation in Documentation/ABI is there for.
> Document your "experimental" API, along with any userspace programs that
> are using it, and work to try to finalize it.

Documentation/ABI/README doesn't list an "experimental" level of
stability. I suppose a developer with an API they expect to change could
create it as "obsolete" (since the experimental version will get removed
when the real one is done), but that's a little odd as a use of that
category. Also, that doesn't stop people from looking through sysfs for
useful stuff they expect to be undocumented but easy enough to figure out,
and starting to use it without realizing that it's not intended to be
maintained. And the "stable/syscalls" entry implies that all syscalls are
stable when they get merged, which means that a patch that adds a syscall
can't stabilize in mainline.

If there are people, like the Nouveau developers, using the instability of
their userspace API as a reason not to submit their drivers, and we would
ideally like the drivers to stabilize with mainline exposure, then we need
to do something more to address these authors' concerns.

	-Daniel
*This .sig left intentionally blank*
* Re: Oops report for the week preceding June 16th, 2008
  2008-06-17 15:33     ` Ingo Molnar
  2008-06-17 17:54       ` Greg KH
@ 2008-06-17 19:24       ` Johannes Berg
  2008-06-17 19:41         ` Dave Jones
  2008-06-17 21:51       ` David Miller
  2 siblings, 1 reply; 20+ messages in thread
From: Johannes Berg @ 2008-06-17 19:24 UTC
To: Ingo Molnar
Cc: David Miller, arjan, linux-kernel, torvalds, akpm, tglx, linville,
    davej, gregkh

> i have no gripes about the current situation of wireless in linux-next,
> other than it all came 1-2 years too late:

Clearly, you don't have a clue about wireless. I'll admit to being
pissed off by statements like this because I personally spent a lot of
time getting wireless code into shape for merging, and it took a long
time.

If we'd have merged the existing wireless drivers 2 years ago, we would
have (at least) four 802.11 stacks in the kernel now, at least two
legally questionable drivers (the ath5k legal situation would probably
never have been cleared up, acx100 still isn't), no uniform API so it
would be impossible to write userspace support tools etc.

johannes
* Re: Oops report for the week preceding June 16th, 2008
  2008-06-17 19:24       ` Johannes Berg
@ 2008-06-17 19:41         ` Dave Jones
  2008-06-18  3:34           ` Arjan van de Ven
  0 siblings, 1 reply; 20+ messages in thread
From: Dave Jones @ 2008-06-17 19:41 UTC
To: Johannes Berg
Cc: Ingo Molnar, David Miller, arjan, linux-kernel, torvalds, akpm,
    tglx, linville, gregkh

On Tue, Jun 17, 2008 at 09:24:14PM +0200, Johannes Berg wrote:
 >
 > > i have no gripes about the current situation of wireless in linux-next,
 > > other than it all came 1-2 years too late:
 >
 > Clearly, you don't have a clue about wireless. I'll admit to being
 > pissed off by statements like this because I personally spent a lot of
 > time getting wireless code into shape for merging, and it took a long
 > time.
 >
 > If we'd have merged the existing wireless drivers 2 years ago, we would
 > have (at least) four 802.11 stacks in the kernel now, at least two
 > legally questionable drivers (the ath5k legal situation would probably
 > never have been cleared up, acx100 still isn't), no uniform API so it
 > would be impossible to write userspace support tools etc.

FWIW, the fact that there's so much churn happening in wireless right
now is IMO, a sign of its health. When I told John "commit whatever
wireless bits you think need to be in Fedora" many months back, I admit
I wasn't expecting as much churn as there has been.

It's been something of a double edged sword. It's great that users are
getting the latest drivers & fixes, but at the same time, it means they
get exposed to all the latest breakage at the same time.
Given the volume of change occurring, cherry-picking isn't an enviable task,
so distros are stuck between this reality, or leaving users hanging until we
get to the next point release.

FWIW, wireless isn't unique in this regard. For eg, the last few months we've
always been shipping the latest ALSA bits rather than what's in kernel.org too,
for similar reasons -- when bugs appear, the developers want to know
"does it still happen with the latest bits?"

The situation isn't perfect, but I don't think it's quite as bleak as
Ingo painted it to be.

	Dave

--
http://www.codemonkey.org.uk
* Re: Oops report for the week preceding June 16th, 2008
  2008-06-17 19:41         ` Dave Jones
@ 2008-06-18  3:34           ` Arjan van de Ven
  2008-06-18  7:17             ` Johannes Berg
  0 siblings, 1 reply; 20+ messages in thread
From: Arjan van de Ven @ 2008-06-18 3:34 UTC
To: Dave Jones, Johannes Berg, Ingo Molnar, David Miller, arjan,
    linux-kernel, torvalds, akpm, tglx, linville, gregkh

Dave Jones wrote:
> On Tue, Jun 17, 2008 at 09:24:14PM +0200, Johannes Berg wrote:
>  >
>  > > i have no gripes about the current situation of wireless in linux-next,
>  > > other than it all came 1-2 years too late:
>  >
>  > Clearly, you don't have a clue about wireless. I'll admit to being
>  > pissed off by statements like this because I personally spent a lot of
>  > time getting wireless code into shape for merging, and it took a long
>  > time.
>  >
>  > If we'd have merged the existing wireless drivers 2 years ago, we would
>  > have (at least) four 802.11 stacks in the kernel now, at least two
>  > legally questionable drivers (the ath5k legal situation would probably
>  > never have been cleared up, acx100 still isn't), no uniform API so it
>  > would be impossible to write userspace support tools etc.
>
> FWIW, the fact that there's so much churn happening in wireless right
> now is IMO, a sign of its health.

I totally agree with that. In fact I'm quite happy with the progress.

> It's been something of a double edged sword. It's great that users are
> getting the latest drivers & fixes, but at the same time, it means they
> get exposed to all the latest breakage at the same time.
> Given the volume of change occurring, cherry-picking isn't an enviable task,
> so distros are stuck between this reality, or leaving users hanging until we
> get to the next point release.
>
> FWIW, wireless isn't unique in this regard. For eg, the last few months we've
> always been shipping the latest ALSA bits rather than what's in kernel.org too,
> for similar reasons -- when bugs appear, the developers want to know
> "does it still happen with the latest bits?"
>

this is the part that concerns me. The fact that you feel the need to use "not yet in mainline" pieces
(I'm not so much talking about backporting from 2.6.26-git to 2.6.25; that's perfectly fine, but I'm
talking about code not in 2.6.26-git) is NOT a healthy sign.... if that truly is the case then that code surely
deserves to be in mainline as well?
* Re: Oops report for the week preceding June 16th, 2008 2008-06-18 3:34 ` Arjan van de Ven @ 2008-06-18 7:17 ` Johannes Berg 2008-06-18 14:22 ` Arjan van de Ven 0 siblings, 1 reply; 20+ messages in thread From: Johannes Berg @ 2008-06-18 7:17 UTC (permalink / raw) To: Arjan van de Ven Cc: Dave Jones, Ingo Molnar, David Miller, linux-kernel, torvalds, akpm, tglx, linville, gregkh [-- Attachment #1: Type: text/plain, Size: 1619 bytes --] > > It's been something of a double edged sword. It's great that users are > > getting the latest drivers & fixes, but at the same time, it means they > > get exposed to all the latest breakage at the same time. > > Given the volume of change occuring, cherry-picking isn't an enviable task, > > so distros are stuck between this reality, or leaving users hanging until we > > get to the next point release. > > > > FWIW, wireless isn't unique in this regard. For eg, the last few months we've > > always been shipping the latest ALSA bits rather than what's in kernel.org too, > > for similar reasons -- when bugs appear, the developers want to know > > "does it still happen with the latest bits?" > > > > > this is the part that concerns me. The fact that you feel the need to use "not yet in mainline" pieces > (I'm not so much talking about backporting from 2.6.26-git to 2.6.25; that's perfectly fine, but I'm > talking about code not in 2.6.26-git) is NOT a healthy sign.... if that truely is the case then that code surely > deserves to be in mainline as well? That's more a case of Fedora living on the bleeding edge. The code is fairly stable, all in linux-next, but the churn tends to be high because of internal API changes that affect all drivers. Currently, I don't think there is actually any _feature_ pending in linux-next, only internal cleanups. Such cleanups are desirable, but at the same time can lead to instability, hence being kept out of .26-git for the time being, and are in -next for .27. 
Mostly because we only wrote them after .26 started. johannes
* Re: Oops report for the week preceding June 16th, 2008 2008-06-18 7:17 ` Johannes Berg @ 2008-06-18 14:22 ` Arjan van de Ven 2008-06-23 16:55 ` John W. Linville 0 siblings, 1 reply; 20+ messages in thread From: Arjan van de Ven @ 2008-06-18 14:22 UTC (permalink / raw) To: Johannes Berg Cc: Dave Jones, Ingo Molnar, David Miller, linux-kernel, torvalds, akpm, tglx, linville, gregkh Johannes Berg wrote: >>> FWIW, wireless isn't unique in this regard. For example, the last few months we've >>> always been shipping the latest ALSA bits rather than what's in kernel.org too, >>> for similar reasons -- when bugs appear, the developers want to know >>> "does it still happen with the latest bits?" >>> >> >> this is the part that concerns me. The fact that you feel the need to use "not yet in mainline" pieces >> (I'm not so much talking about backporting from 2.6.26-git to 2.6.25; that's perfectly fine, but I'm >> talking about code not in 2.6.26-git) is NOT a healthy sign.... if that truly is the case then that code surely >> deserves to be in mainline as well? > > That's more a case of Fedora living on the bleeding edge. The code is > fairly stable, all in linux-next, but the churn tends to be high because > of internal API changes that affect all drivers. Currently, I don't > think there is actually any _feature_ pending in linux-next, only > internal cleanups. Such cleanups are desirable, but at the same time can > lead to instability, hence being kept out of .26-git for the time being, > and are in -next for .27. Mostly because we only wrote them after .26 > started. > My concern is that if there's something technological in the "bleeding tree" that is so valuable to users that distros feel that it's ready "enough" and that they need to pick it up for their users, we have a flaw in our processes in moving too slowly for users. 
From what you described that's not the case for wireless (more a case of Fedora jumping off the bridge while forgetting to tie down the bungee cord ;-), and that's good. I hope the same applies for the ALSA parts....
* Re: Oops report for the week preceding June 16th, 2008 2008-06-18 14:22 ` Arjan van de Ven @ 2008-06-23 16:55 ` John W. Linville 0 siblings, 0 replies; 20+ messages in thread From: John W. Linville @ 2008-06-23 16:55 UTC (permalink / raw) To: linux-kernel; +Cc: Arjan van de Ven Arjan van de Ven wrote: > Johannes Berg wrote: > > That's more a case of Fedora living on the bleeding edge. The code is > > fairly stable, all in linux-next, but the churn tends to be high because > > of internal API changes that affect all drivers. Currently, I don't > > think there is actually any _feature_ pending in linux-next, only > > internal cleanups. Such cleanups are desirable, but at the same time can > > lead to instability, hence being kept out of .26-git for the time being, > > and are in -next for .27. Mostly because we only wrote them after .26 > > started. > > > > My concern is that if there's something technological in the "bleeding tree" that is > so valuable to users that distros feel that it's ready "enough" and that they need to > pick it up for their users, we have a flaw in our processes in moving too slowly for > users. From what you described that's not the case for wireless (more a case of > Fedora jumping off the bridge while forgetting to tie down the bungee cord ;-), and > that's good. I hope the same applies for the ALSA parts.... My first question is "how do you guys know to start these discussions when I go on vacation?" :-) I was out of town last week and missed this little eruption, so forgive my late reply. Given my pertinent role in the topic, I thought I should still remark. I would remind everyone that at one time there was still a lot of pressure to _not_ merge the new wireless bits upstream. The reasons for that pressure were mentioned elsewhere in this thread -- mostly fear of introducing new/unstable userland ABI as well as general concerns about the design/implementation of what is now the mac80211 component. 
My own lack of experience as a maintainer contributed here, as I was often uncertain about how to get things moving along sooner. Thankfully my experience dealing with these maintenance issues has increased. Moreover the external pressure against merging has subsided due both to some technical resolutions and also perhaps to a shift in attitude about what is mergeable upstream. I don't think there is any remaining logjam with regard to upstream wireless merges. The practice of pushing cutting-edge wireless stuff into Fedora started as a means of getting testers. Once it was in Rawhide, it never made sense to yank it away from users. So, I have continued the process of merging what is now known as -next wireless bits into Fedora. This is at least partly because I haven't figured out how to gracefully stop doing that. :-) In fact, in Fedora 9 I have started to stage those bits more slowly between -next, Rawhide, F9, and F8. FWIW, I think this staging (rather than pushing new -next stuff into F{10,9,8} more-or-less immediately) may have created the window for releasing the bad Fedora kernels that plagued kerneloops.org last week. Anyway, the wireless bits in Fedora are all on their way upstream. The ones that aren't in linux-2.6 are only missing due to the "bugfixes only after -rc" policy, not some systemic refusal to merge. Given the current process, it would be impossible to get them upstream any faster. In fact, getting exposure in Fedora gives us an early jump in _avoiding_ upstream regressions when these bits get into 2.6.27-rc1. The fact that it _usually_ makes things better for Fedora wireless users is just gravy. :-) Thanks, John -- John W. Linville linville@tuxdriver.com
* Re: Oops report for the week preceding June 16th, 2008 2008-06-17 15:33 ` Ingo Molnar 2008-06-17 17:54 ` Greg KH 2008-06-17 19:24 ` Johannes Berg @ 2008-06-17 21:51 ` David Miller 2008-06-19 0:21 ` Ingo Molnar 2 siblings, 1 reply; 20+ messages in thread From: David Miller @ 2008-06-17 21:51 UTC (permalink / raw) To: mingo; +Cc: arjan, linux-kernel, torvalds, akpm, tglx, linville, davej, gregkh From: Ingo Molnar <mingo@elte.hu> Date: Tue, 17 Jun 2008 17:33:56 +0200 > v2.6.24 was no doubt a huge step in the right direction but it came too > late and we are still suffering from the fallout today as we have not > reached test cycle equilibrium yet: by the time mainline gets the > patches a new large batch comes up, invalidating much of mainline's role > and forcing distros to gamble with (much untested and thus detached from > reality) experimental branches. > > That's my main point: when we mess up and don't merge OSS driver code > that was out there in time - and we messed up big time with wireless - > we should admit the screwup and swallow the bitter pill. Your point seems to be that, even though we've acknowledged and entirely corrected the problem now, you still will whack us over the head and complain because it took in your opinion too long to get to that point. How nice. That makes the wireless folks feel great I imagine. You also have no idea what infrastructure or other invasive wireless subsystem changes might have been necessary to merge in some of those drivers. Of course, that doesn't suit your goal of making the wireless folks look like a bunch of incompetent twits, so it doesn't surprise me that you haven't investigated any such facts. It is impossible, therefore, to please you since we cannot change the past. So all we can do at this point is continue doing the right thing and completely ignore your pointless whines. 
In this context your complaints are beyond unfair and beyond pointless; therefore you're finally in my kill file now. Have a nice day, Ingo.
* Re: Oops report for the week preceding June 16th, 2008 2008-06-17 21:51 ` David Miller @ 2008-06-19 0:21 ` Ingo Molnar 2008-06-20 6:01 ` Len Brown 0 siblings, 1 reply; 20+ messages in thread From: Ingo Molnar @ 2008-06-19 0:21 UTC (permalink / raw) To: David Miller Cc: arjan, linux-kernel, torvalds, akpm, tglx, linville, davej, gregkh * David Miller <davem@davemloft.net> wrote: > From: Ingo Molnar <mingo@elte.hu> > Date: Tue, 17 Jun 2008 17:33:56 +0200 > > > v2.6.24 was no doubt a huge step in the right direction but it came > > too late and we are still suffering from the fallout today as we > > have not reached test cycle equilibrium yet: by the time mainline > > gets the patches a new large batch comes up, invalidating much of > > mainline's role and forcing distros to gamble with (much untested > > and thus detached from reality) experimental branches. > > > > That's my main point: when we mess up and don't merge OSS driver code > > that was out there in time - and we messed up big time with wireless > > - we should admit the screwup and swallow the bitter pill. > > Your point seems to be that, even though we've acknowledged and > entirely corrected the problem now, you still will whack us over the > head and complain because it took in your opinion too long to get to > that point. from the discussion it was not at all clear to me that you appear to agree with me - all i saw really was that you tried to ridicule my position. > How nice. That makes the wireless folks feel great I imagine. my only worry was about the current situation, which, according to kerneloops.org, with 17442 oopses reported against v2.6.25, isn't anything to feel too great about. (And that's not limited to wireless in any way - there is a rather prominent tick_broadcast_oneshot_control() soft lockup entry as well that we are trying to figure out.) It will all get better i'm sure - we now finally have objective visibility of bugs as they happen to users. 
Ingo
* Re: Oops report for the week preceding June 16th, 2008 2008-06-19 0:21 ` Ingo Molnar @ 2008-06-20 6:01 ` Len Brown 0 siblings, 0 replies; 20+ messages in thread From: Len Brown @ 2008-06-20 6:01 UTC (permalink / raw) To: Ingo Molnar Cc: David Miller, arjan, linux-kernel, torvalds, akpm, tglx, linville, davej, gregkh > my only worry was about the current situation, which, according to > kerneloops.org, with 17442 oopses reported against v2.6.25, isn't > anything to feel too great about. (And that's not limited to wireless in > any way - there is a rather prominent tick_broadcast_oneshot_control() > soft lockup entry as well that we are trying to figure out.) > > It will all get better i'm sure - we now finally have objective > visibility of bugs as they happen to users. kerneloops.org is indeed a wonderful thing (kudos to Arjan, once again!). Note, however, that the large number of recent reports isn't necessarily a fair comparison with numbers for previous releases, because the number of clients reporting to kerneloops.org is not constant (e.g. it was included with Fedora Core 9, but not with Fedora Core 8). cheers, -Len
* Re: Oops report for the week preceding June 16th, 2008 2008-06-16 18:24 Oops report for the week preceding June 16th, 2008 Arjan van de Ven 2008-06-17 9:20 ` Ingo Molnar @ 2008-06-17 17:18 ` Bob Copeland 1 sibling, 0 replies; 20+ messages in thread From: Bob Copeland @ 2008-06-17 17:18 UTC (permalink / raw) To: Arjan van de Ven Cc: Linux Kernel Mailing List, Linus Torvalds, Andrew Morton, Ingo Molnar, Thomas Gleixner, John W. Linville On Mon, Jun 16, 2008 at 2:24 PM, Arjan van de Ven <arjan@linux.intel.com> wrote: > Only seen with tainted kernels > ------------------------------ > Rank 1: ath_dynamic_sysctl_register (warning) > Reported 376 times (3007 total reports) > [external] Bug in the proprietary madwifi driver > warning only shows up in tainted kernels > This warning was last seen in version 2.6.25.6, and first seen in > 2.6.24. > More info: > http://www.kerneloops.org/searchweek.php?search=ath_dynamic_sysctl_register I just looked at my moldy old copy of madwifi - AFAICT this is in the non-HAL section. We can probably fix this one by filling out newer fields in the ctl_table, just to get it off the #1 list. Even though they should be using ath5k anyway... -- Bob Copeland %% www.bobcopeland.com
end of thread, other threads:[~2008-06-23 17:17 UTC | newest]
Thread overview: 20+ messages
2008-06-16 18:24 Oops report for the week preceding June 16th, 2008 Arjan van de Ven
2008-06-17 9:20 ` Ingo Molnar
2008-06-17 9:26 ` David Miller
2008-06-17 15:33 ` Ingo Molnar
2008-06-17 17:54 ` Greg KH
2008-06-17 18:14 ` Dave Jones
2008-06-17 18:43 ` Daniel Barkalow
2008-06-17 19:31 ` Johannes Berg
2008-06-17 22:48 ` Greg KH
2008-06-18 2:40 ` Daniel Barkalow
2008-06-17 19:24 ` Johannes Berg
2008-06-17 19:41 ` Dave Jones
2008-06-18 3:34 ` Arjan van de Ven
2008-06-18 7:17 ` Johannes Berg
2008-06-18 14:22 ` Arjan van de Ven
2008-06-23 16:55 ` John W. Linville
2008-06-17 21:51 ` David Miller
2008-06-19 0:21 ` Ingo Molnar
2008-06-20 6:01 ` Len Brown
2008-06-17 17:18 ` Bob Copeland