public inbox for linux-kernel@vger.kernel.org
* stable? quality assurance?
@ 2010-07-11  7:18 Martin Steigerwald
  2010-07-11  8:39 ` Eric Dumazet
                   ` (4 more replies)
  0 siblings, 5 replies; 72+ messages in thread
From: Martin Steigerwald @ 2010-07-11  7:18 UTC (permalink / raw)
  To: linux-kernel


Hi!

2.6.34 was a disaster for me: bug #15969 - a patch was available before 
2.6.34 already; bug #15788, also reported with 2.6.34-rc2 already; and, 
most important, two complete lockups - well, maybe just X.org and radeon 
KMS, I didn't start my second laptop to SSH into the locked up one - on my 
ThinkPad T42. I fixed the first one with the patch, but after the lockups I 
just downgraded to 2.6.33 again.

I still actually *use* my machines for something other than hunting patches 
for kernel bugs, and on kernel.org it says "Latest *Stable* Kernel" 
(emphasis mine). I know the argument that one should use a distro kernel 
for machines that are in production use. But frankly, does that justify 
delivering crap that is known in advance to the distributors? What impact 
do bugs reported on bugzilla, some of them grave, have on the release 
decision?

And how about people who have their reasons - mine is TuxOnIce - to 
compile their own kernels?

Well 2.6.34.1 fixed the two reported bugs and it seemed to have fixed the 
freezes as well. So far so good.

Maybe it should read "prerelease of stable" for at least 2.6.34.0 on the 
website. And I will just go back to always waiting for .2 or .3, as with 
2.6.34.1 I still have some problems, like the hang on hibernation reported in

hang on hibernation with kernel 2.6.34.1 and TuxOnIce 3.1.1.1

on this mailing list just a moment ago. But then 2.6.33 did hang with 
TuxOnIce, which apparently (!) wasn't a TuxOnIce problem either, since 
2.6.34 no longer hung with it, which was a reason for me to try 2.6.34 
earlier.

I am quite a bit worried about the quality of the recent kernels. Some 
iterations ago I just compiled them, partly even rc ones, which I do not 
expect to be stable, and they just worked. But recently .0, and partly even 
.1 or .2, versions have not been stable for me quite a few times, and thus 
I think they had better not be advertised as such on kernel.org. I am 
willing to risk some testing and file bug reports, but these are still 
production machines, I do not have any spare test machines, and there needs 
to be some balance, i.e. the kernels should basically work. Thus I will 
certainly be more reluctant to upgrade in the future.

Ciao,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

* Re: stable? quality assurance?
@ 2010-09-04 16:42 Martin Steigerwald
  2010-09-04 17:22 ` Willy Tarreau
  0 siblings, 1 reply; 72+ messages in thread
From: Martin Steigerwald @ 2010-09-04 16:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Willy Tarreau

Sorry, forgot Cc again.

On Sunday 11 July 2010, Willy Tarreau wrote:
> Hi Martin,

Hi Willy, hi everyone else reading this,

> On Sun, Jul 11, 2010 at 04:51:42PM +0200, Martin Steigerwald wrote:
> > I hope that someone answers who actually can take some critique. From
> > the  current replies I perceive a lack of that ability.
> 
> well, I'll try to do then :-)
> 
> There were some threads in the past about kernel releases quality,
> where Linus explained why it could not be completely black or white.
> 
> Among the things he explained, I remember that one of primary concern
> was the inability to slow down development. I mean, if he waits 2 more
> weeks for things to stabilize, then there will be two more weeks of
> crap^H^H^H^Hdevelopment merged in next merge window, so in fact this
> will just shift dates and not quality.

During bisecting [Bug 16376] random - possibly Radeon DRM KMS related 
freezes, which goes very slowly due to having lots of unbootable kernels 
with an ext4 / readahead related backtrace during boot, I had an idea:

I think the main problem is that the current development process does not 
give time for quality work and bug fixing. As I understand it, it is 
currently just a constant development of new features, with bug fixing and 
quality work having to be done underneath that development:

- before 2.6.36 is released developers aim at developing new stuff for 
2.6.37.

- after 2.6.36 is released developers aim at getting as much stuff as 
possible into 2.6.37 and then, after two weeks, at developing new features 
for 2.6.38.

This process does not take bug fixing into account at all, because once the 
merge window has closed, developers hurry to get their stuff ready for the 
next window.

In that model, extending the freeze period after rc1 doesn't help at all, 
because, as you say, more "crap^H^H^H^Hdevelopment" gets collected for the 
next kernel.

But is that a *given* that no one actually has any influence on? Is 
collecting changes for the next kernel like rain that either pours down or 
not - usually pours down in this case, like in August in Germany ;)? Who 
feeds Linus with new stuff during the merge window? From what I understand 
of the Linux development process, it is mainly the subsystem maintainers 
and Andrew Morton.

What if those people stop collecting new stuff for Linus, except bugfixes, 
about two or three weeks before the next kernel is released? This would 
give the subsystem trees and the mm tree some time to stabilize a bit, so 
that Linus gets higher-quality stuff in the first place. And more 
importantly, since developers know that subsystem maintainers and Andrew 
only collect bugfixes 2-3 weeks before the release of a stable kernel, they 
can just as well spend some time on quality work.

Of course, developers can still decide: well, 2.6.37 work is closed 
already, so I will continue developing for 2.6.38 even earlier. But I still 
think this would help to slow things down a bit during the critical phase 
before releasing a stable kernel. Because when I know my subsystem 
maintainer or Andrew won't be taking my stuff anyway before the stable 
kernel is released, I can take a little time for other things.

The main idea here is to have a two-staged freeze process and to 
distribute the "I am only taking bug fixes" work to more people than Linus.

For this to work properly, I think subsystem maintainers and Andrew should 
branch their trees at the time the stable kernel is released. For example, 
when 2.6.36 is released:

- tree 
  => 2.6.36-stable-tree
  => tree, where 2.6.37 stuff will be going in

Thus when subsystem maintainers take new stuff during the merge window, it 
will already be for the next kernel release, not for the current one. The 
exception is bugfix work, for which I think the criteria should not be as 
strict as for the stable patches Greg collects.

Thus it needs to be clear: no new stuff for the next kernel from two weeks 
before the release of the current stable kernel.

I think this could help. It is a bit like the two-staged development 
process of Debian, but with the freeze period for "unstable" being a fixed 
time interval of about 2 weeks instead of RC=0 for stable ;). It is a bit 
of a formal shift of attention to the stable kernel about 2 weeks before 
its release. Developers might find creative ways to circumvent it, or they 
might understand that this process serves the purpose of improving kernel 
quality.

If you think these two weeks cannot be squeezed into the three-month 
development cycle, a four-month development cycle might do. But actually 
I don't see why these two weeks could not be made to fit in there.

Installing and testing next kernel after yet another mail to this thread,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


end of thread, other threads:[~2010-09-05  9:48 UTC | newest]

Thread overview: 72+ messages
2010-07-11  7:18 stable? quality assurance? Martin Steigerwald
2010-07-11  8:39 ` Eric Dumazet
2010-07-11 14:22   ` Martin Steigerwald
2010-07-11 14:52     ` Martin Steigerwald
2010-07-11 15:58   ` William Pitcock
2010-07-11 16:34     ` Eric Dumazet
2010-07-16  6:59     ` Greg KH
2010-08-05  3:27       ` Jeremy Fitzhardinge
2010-07-11 17:04   ` Heinz Diehl
2010-07-11 13:16 ` Ted Ts'o
2010-07-11 18:02   ` Anca Emanuel
2010-07-12  6:46   ` David Newall
     [not found]     ` <AANLkTilGjfx9sb66qVfZn1SeFPURHUrrdE7JCrild8VX@mail.gmail.com>
2010-07-12 12:35       ` Fwd: " Marcin Letyns
2010-07-12 12:42         ` Alexey Dobriyan
     [not found]           ` <AANLkTik64lxDiCN-eRo3i_-cTqAvCzbaRI4EEXoD44Vj@mail.gmail.com>
2010-07-12 12:52             ` Fwd: " Marcin Letyns
2010-07-12 14:57           ` Valdis.Kletnieks
2010-07-12 15:56       ` David Newall
2010-07-12 17:48         ` Marcin Letyns
2010-07-12 18:00         ` Stefan Richter
2010-07-12 19:58           ` David Newall
2010-07-12 21:11             ` Stefan Richter
2010-07-12 21:39             ` Martin Steigerwald
2010-07-12 22:44               ` Stefan Richter
2010-07-15  7:23             ` david
2010-07-13 16:50         ` Theodore Tso
2010-07-13 20:45           ` David Newall
2010-07-14  6:33             ` Theodore Tso
2010-09-04 17:12   ` Martin Steigerwald
2010-07-11 13:56 ` Lee Mathers
2010-07-11 14:51   ` Martin Steigerwald
2010-07-11 17:22     ` Willy Tarreau
2010-07-11 21:38       ` Rafael J. Wysocki
2010-07-12  4:17         ` Willy Tarreau
2010-07-12  9:56       ` Martin Steigerwald
2010-07-12 15:43       ` Martin Steigerwald
2010-07-12 17:36         ` Willy Tarreau
2010-07-12 19:56           ` Martin Steigerwald
2010-07-12 23:03             ` Stefan Richter
2010-07-13 10:30               ` Martin Steigerwald
2010-07-15  7:32               ` david
2010-07-12 17:55         ` Stefan Richter
2010-09-04 16:38       ` Martin Steigerwald
2010-09-04 18:46         ` Ted Ts'o
2010-09-04 19:11           ` Martin Steigerwald
2010-09-04 23:23             ` Ted Ts'o
2010-09-05  7:59               ` Martin Steigerwald
2010-09-04 19:24         ` Stefan Richter
2010-09-04 19:34           ` Stefan Richter
2010-09-04 20:21           ` Martin Steigerwald
2010-09-04 22:50             ` Stefan Richter
2010-09-04 23:16             ` Ted Ts'o
2010-09-05  8:35         ` Avi Kivity
2010-09-05  9:48           ` Martin Steigerwald
2010-07-11 19:49     ` Stefan Richter
2010-07-13 11:11     ` Alejandro Riveira Fernández
2010-07-13 12:50       ` rt2x00: slow wifi with correct basic rate bitmap (was Re: stable? quality assurance?) Stefan Richter
2010-07-13 15:35         ` John W. Linville
2010-07-13 18:19           ` Alejandro Riveira Fernández
2010-07-13 18:38             ` John W. Linville
2010-07-13 19:07               ` Alejandro Riveira Fernández
2010-07-13 18:06         ` Alejandro Riveira Fernández
2010-07-13 19:18           ` Stefan Richter
2010-07-12 19:46 ` stable? quality assurance? Nix
     [not found] ` <AANLkTimEdVsmIgXBbmhsq75ElQvGAI8avsM8-wlDpm4z@mail.gmail.com>
2010-07-15  9:09   ` Valeo de Vries
2010-07-16  7:00     ` Greg KH
2010-07-16  7:19       ` Justin P. Mattock
2010-07-16 15:25       ` Randy Dunlap
2010-07-16 15:34       ` Valeo de Vries
  -- strict thread matches above, loose matches on Subject: below --
2010-09-04 16:42 Martin Steigerwald
2010-09-04 17:22 ` Willy Tarreau
2010-09-04 19:33   ` Martin Steigerwald
2010-09-04 20:19     ` Willy Tarreau
