public inbox for linux-kernel@vger.kernel.org
* Re: Linux 2.4.21-rc7
@ 2003-06-08  8:54 Clayton Weaver
  2003-06-08  9:47 ` Willy Tarreau
  0 siblings, 1 reply; 24+ messages in thread
From: Clayton Weaver @ 2003-06-08  8:54 UTC (permalink / raw)
  To: linux-kernel

> Now I really hope it's the last one, all these
> rc's are making me mad.

We still have ide problems, and I don't see any
potential fixes for that in the changelog between -rc6 and -rc7.

I tried -rc6 on a whim and had hda report
a timeout (dma, I think, but the message went by kind of quick), then the big freeze with the
disk light stuck on. That never happened
in 6 months on the same hardware running
2.4.19-rc2 (with glibc-2.2.5, gcc-2.95.3,
binutils-2.12.90.0.9, all ext2 filesystems).

I recompiled with all kernel debugging options
enabled and disabled partition statistics, since that was the one thing that was obviously new about the enabled ide options (I didn't select
any other new options, but of course the kernel code underneath is probably different, so one could not conclude anything from such meager
testing). It ran for about 8 hours without freezing, with that drive doing a lot more
work than it was doing when it livelocked.

e2fsck reported errors on the next reboot, though,
and it's been rebooted into 2.4.19-rc2 to get some
other work done with it since then (caching the source for an upgrade of a 2.2.x box, different libc, yada yada, needs to be reliable until
that is finished).

SiS530/5513, k6-II/450, udma33 Maxtor drive that 2.4.19-rc2 has no problems with.

You can release a 2.4.21 anyway, of course, but without finding out where the ide livelock (and other big freezes, thinking of the report on the all-scsi system already posted) originates, calling it "stable" would be a bit fanciful.

(2.4.19-rc2 has its own quirks, of course, but
not "single-threaded ide livelock with this
chipset and ide drive". I can reliably kill it with 32 threads depth-first scanning different directory trees on that same disk in parallel, unfortunately without an oops to show for it.
It is not running out of memory (no ENOMEM reports), merely some mundane race condition or missing lock or whatever. Change it to 32 forks running in parallel, and they finish normally, though of course not all that quickly while seek-thrashing one and the same disk between them.)
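
A minimal sketch of that kind of threads-versus-forks comparison, for anyone who wants to try something similar. It is written in Python for brevity and is not the program used above; the function names (scan_tree, scan_with_threads, scan_with_forks) are made up for illustration, and it only mirrors the structure of the test (N workers, each walking its own tree depth-first), not the exact i/o pattern or the thread library involved.

```python
import os
import threading


def scan_tree(root):
    """Depth-first walk of one directory tree; returns the number of entries seen."""
    count = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        count += len(dirnames) + len(filenames)
    return count


def scan_with_threads(roots):
    """One thread per tree, all inside a single process."""
    results = [0] * len(roots)

    def worker(i, root):
        results[i] = scan_tree(root)

    threads = [threading.Thread(target=worker, args=(i, r))
               for i, r in enumerate(roots)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def scan_with_forks(roots):
    """One forked child per tree (POSIX only); each child reports over a pipe."""
    children = []
    for root in roots:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:  # child: scan, report the count, exit without cleanup
            os.close(read_fd)
            try:
                os.write(write_fd, str(scan_tree(root)).encode())
            finally:
                os._exit(0)
        os.close(write_fd)  # parent keeps only the read end
        children.append((pid, read_fd))
    results = []
    for pid, read_fd in children:
        with os.fdopen(read_fd) as pipe:
            results.append(int(pipe.read()))
        os.waitpid(pid, 0)
    return results
```

Calling scan_with_threads(roots) and scan_with_forks(roots) with the same list of 32 directories on one disk gives the two variants to compare; both should return the same per-tree counts if nothing hangs.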

Not what you wanted to hear, right? Oh well.

(Better to find out sooner than release
2.4.21-stable and watch 52 different bug reports on it arrive at the list the next day.)

Regards,

Clayton Weaver
<mailto: cgweav@email.com>

-- 
_______________________________________________
Sign-up for your own FREE Personalized E-mail at Mail.com
http://www.mail.com/?sr=signup


* Re: Linux 2.4.21-rc7
@ 2003-06-08 20:17 Clayton Weaver
  2003-06-08 20:51 ` Bartlomiej Zolnierkiewicz
  2003-06-08 21:47 ` Willy Tarreau
  0 siblings, 2 replies; 24+ messages in thread
From: Clayton Weaver @ 2003-06-08 20:17 UTC (permalink / raw)
  To: willy; +Cc: linux-kernel

----- Original Message -----
From: Willy Tarreau <willy@w.ods.org>
Date: Sun, 8 Jun 2003 11:47:29 +0200
To: Clayton Weaver <cgweav@email.com>
Subject: Re: Linux 2.4.21-rc7

> Hi !

Greets.

> [ first, please fix your mailer and cut your lines, it's not easy to quote you in replies ]

Long lines?

email.com is a web mailer. If it is failing
to wrap where I put newlines, I'll see what I
can do.
 
> On Sun, Jun 08, 2003 at 03:54:48AM -0500, Clayton Weaver wrote:
> > > Now I really hope it's the last one, all these
> > > rc's are making me mad.

> > We still have ide problems, and I don't see any
> > potential fixes for that in the changelog between -rc6 and -rc7.
> > 
> > I tried -rc6 on a whim and had hda report
> > a timeout (dma, I think, but the message went by kind of quick), then the big freeze with the
> > disk light stuck on. That never happened in 6 months on the same hardware running
> > 2.4.19-rc2 (with glibc-2.2.5, gcc-2.95.3, binutils-2.12.90.0.9, all ext2 filesystems).
 
> Did you try with "ide0=nodma", or other similar options ?

No.

Note that "nodma" is unnecessary on this
same box running kernel 2.4.19-rc2. Why would
2.4.21-rcX need it? To pin down whether the
problem is in the ide dma code or some other
part of the ide code?

> > SiS530/5513, k6-II/450, udma33 Maxtor drive that 2.4.19-rc2 has no problems with.

Here is the data on the drive from hdparm
while running under 2.4.19-rc2. rc.local
executes "hdparm -c 1 /dev/hda" at boot.

hdparm -v:

/dev/hda:
 multcount    = 16 (on)
 IO_support   =  1 (32-bit)
 unmaskirq    =  0 (off)
 using_dma    =  1 (on)
 keepsettings =  0 (off)
 readonly     =  0 (off)
 readahead    =  8 (on)
 geometry     = 1655/255/63, sectors = 26588016, start = 0

hdparm -i:

/dev/hda:

 Model=Maxtor 91360U4, FwRev=MA540RR0, SerialNo=C40LMAFC
 Config={ Fixed }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=57
 BuffType=DualPortCache, BuffSize=2048kB, MaxMultSect=16, MultSect=16
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=26588016
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes:  pio0 pio1 pio2 pio3 pio4
 DMA modes:  mdma0 mdma1 mdma2
 UDMA modes: udma0 udma1 *udma2
 AdvancedPM=yes: disabled (255) WriteCache=enabled
 Drive conforms to: ATA/ATAPI-4 T13 1153D revision 17:  1 2 3 4 5

> That's not exactly what you said below. You said that you could reliably kill it with 32 threads...
> Perhaps you have broken hardware, and 2.4.21 stresses it more than 2.4.19-rc2. Perhaps it's
> really an old driver bug, in which case reporting it when you first encountered it would have been
> more constructive than telling us at 2.4.21 time that it dies even more easily than a one-year-old
> 2.4.19-rc2.

It does not die more easily with 2.4.19-rc2
(in my opinion). It dies in a threads context
but not in a forks context, where the threads
and the forks are doing the same i/o to/from
the same controller/disk (different versions
of same program).

I have also seen it freeze with an unlucky
mouse click in XFree86 4.0 under 2.4.19-rc2,
so I did not assume that the threads hang
was necessarily ide-relevant. Something
disk i/o intensive was merely what it
happened to be doing with those threads,
but that problem seemed to me more thread
related than ide related. (Guess I'll have
to spawn a bunch of threads doing some other
kind of i/o to test that assumption.)
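
A rough sketch of what such a test could look like, again in Python and purely illustrative: each worker thread pushes data through its own anonymous pipe, so the threads do plenty of i/o without touching the ide disk. The names pump_pipe and run_pipe_threads are hypothetical, and this says nothing about how the original threaded program was written.

```python
import os
import threading


def pump_pipe(n_bytes, chunk=4096):
    """Push n_bytes through an anonymous pipe; return how many bytes came out."""
    read_fd, write_fd = os.pipe()

    def writer():
        remaining = n_bytes
        block = b"x" * chunk
        try:
            while remaining > 0:
                # os.write returns the number of bytes actually written
                remaining -= os.write(write_fd, block[:remaining])
        finally:
            os.close(write_fd)  # closing the write end gives the reader EOF

    feeder = threading.Thread(target=writer)
    feeder.start()
    received = 0
    while True:
        data = os.read(read_fd, chunk)
        if not data:
            break
        received += len(data)
    feeder.join()
    os.close(read_fd)
    return received


def run_pipe_threads(n_workers, n_bytes):
    """Run n_workers pipe pumps concurrently, one thread each, in one process."""
    results = [0] * n_workers

    def worker(i):
        results[i] = pump_pipe(n_bytes)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Something like run_pipe_threads(32, 1 << 20) would keep 32 workers busy with pipe traffic only. If a box that hangs under the threaded disk scan also hangs here, that would point away from the ide code.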

[]

> > (Better to find out sooner than release
> > 2.4.21-stable and watch 52 different bug reports on it arrive at the list the next day.)
 
> Well, look through the archives, there have been two patches by Lionel Bouton and Vojtech Pavlik
> posted in May for the 5513 driver, to support newer chipsets. I don't know if they have been
> included, nor if they also fixed old bugs. Perhaps you'll be interested in checking them.

(SiS530 is not newer, k6-II era, but it
is worth a look anyway.)

The SiS5513 driver seems fine. You can
hammer on it all day with this motherboard
with gcc, multiple smb mounts, gigabyte ftp or
sftp transfers, etc, in parallel, and no blinks from the hard drive (modulo threads or the X-server under 2.4.19-rc2).

(Why 2.4.19-rc2? It mostly works, ie it is
stable for what I typically use that box
for. Someone running a different application
mix or different hardware might consider it useless crap. It has the lcall fix and a
few other minor bug fixes that were posted
to the kernel list between then and now.)
 
> BTW, someone reported yesterday that his 5513 worked flawlessly in 2.4.20, but behaved like yours
> on 2.5.70. Have you tested 2.4.20, or better, have you tried to narrow the problem down to a
> particular version? (I bet it will be tied to the introduction of the newer IDE code.)

No. (I do actually need this thing to work at
times.) The newer ide code as the source of the
problem matches my hunch. Maybe the kernel
debugging that I enabled at compile time will
come up with something (*before* the
deadlock, so it can actually log an anomaly).

The newer ide code may have found a bug in
the SiS5513 driver that the old code did
not exercise. Let us hope not, because then
a fix only fixes it for me and other users
of that driver, while lots of people with
other kinds of ide hardware seem to be
reporting similar problems.

My guess is that the problems are upstream
of any specific driver, but that is merely
a hunch. (It is possible that they all do
the same wrong thing *in the drivers*.)

> You may also try -ac kernels which have more recent, but less tested code.

> Regards,
> Willy

Thanks for the insight.

Regards,

Clayton Weaver
<mailto: cgweav@email.com>



* Re: Linux 2.4.21-rc7
@ 2003-06-03 18:45 Margit Schubert-While
  2003-06-03 18:50 ` Marc-Christian Petersen
  0 siblings, 1 reply; 24+ messages in thread
From: Margit Schubert-While @ 2003-06-03 18:45 UTC (permalink / raw)
  To: linux-kernel

if [ -r System.map ]; then /sbin/depmod -ae -F System.map  2.4.21-rc7; fi
depmod: *** Unresolved symbols in /lib/modules/2.4.21-rc7/kernel/drivers/net/wan/comx.o
depmod:         proc_get_inode

Margit


* Linux 2.4.21-rc7
@ 2003-06-03 17:04 Marcelo Tosatti
  2003-06-03 18:02 ` Tomas Szepe
                   ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Marcelo Tosatti @ 2003-06-03 17:04 UTC (permalink / raw)
  To: lkml


Hallo,

Now I really hope it's the last one, all these rc's are making me mad.

Ok, here it is.


Summary of changes from v2.4.21-rc6 to v2.4.21-rc7
============================================

<ehabkost@conectiva.com.br>:
  o [SPARC]: Export phys_base on sparc32

<jgarzik@pobox.com>:
  o fix olympic driver build

<lethal@linux-sh.org>:
  o Fix Solution Engine 7751 Build
  o Define VM_DATA_DEFAULT_FLAGS for SH

<wesolows@foobazco.org>:
  o [sparc]: Attempt mul/div emulation handling on all cpus

David S. Miller <davem@nuts.ninka.net>:
  o [SPARC]: Fix sys_ipc to return ENOSYS instead of EINVAL as appropriate
  o [SPARC64]: Implement dump_stack in 2.4.x
  o [SPARC64]: Only use power interrupt when button property exists
  o [IPV4/IPV6]: Use Jenkins hash for fragment reassembly handling
  o [IPV6]: Input full addresses into TCP_SYNQ hash function
  o [IPV4]: Add sysctl to control ipfrag_secret_interval
  o [SPARC64]: Fix probe error handling in envctrl.c driver
  o [SPARC64]: Fix probe error handling in bbc_{envctrl,i2c}.c driver
  o [SPARC64]: Fix exploitable holes and bugs in ioctl32 translations

Douglas Gilbert <dougg@torque.net>:
  o sg: Fix side effect introduced by last "off by one" fix

Eric Brower <ebrower@usa.net>:
  o [SPARC]: Refactor AUXIO support

Marcelo Tosatti <marcelo@freak.distro.conectiva>:
  o Changed EXTRAVERSION to -rc7

Pete Zaitcev <zaitcev@redhat.com>:
  o [sparc] Force type in __put_user
  o [SPARC]: Fix gcc-3.x builds

Rob Radez <rob@osinvestor.com>:
  o [sparc]: Fix uninitialized spinlock in SRMMU code
  o [SPARC]: Kill initialize_secondary, unused



end of thread, other threads:[~2003-06-12  9:24 UTC | newest]

Thread overview: 24+ messages
2003-06-08  8:54 Linux 2.4.21-rc7 Clayton Weaver
2003-06-08  9:47 ` Willy Tarreau
  -- strict thread matches above, loose matches on Subject: below --
2003-06-08 20:17 Clayton Weaver
2003-06-08 20:51 ` Bartlomiej Zolnierkiewicz
2003-06-08 21:47 ` Willy Tarreau
2003-06-03 18:45 Margit Schubert-While
2003-06-03 18:50 ` Marc-Christian Petersen
2003-06-03 19:38   ` Christoph Hellwig
2003-06-03 17:04 Marcelo Tosatti
2003-06-03 18:02 ` Tomas Szepe
2003-06-03 18:07   ` Marcelo Tosatti
2003-06-03 19:15     ` lk
2003-06-03 19:40       ` Alan Cox
2003-06-03 18:30 ` Alex Romosan
2003-06-03 19:27   ` Jeff Garzik
2003-06-03 19:58     ` Alex Romosan
2003-06-03 20:14       ` Tom Rini
2003-06-04  3:35         ` David S. Miller
2003-06-04 15:09           ` Mr. James W. Laferriere
2003-06-04 23:37           ` Alex Romosan
2003-06-05 12:09 ` Andreas Haumer
2003-06-07 15:46   ` Andreas Haumer
2003-06-11 20:48     ` Marcelo Tosatti
     [not found]       ` <1055408183.2552.18.camel@tor.trudheim.com>
2003-06-12  9:35         ` Andreas Haumer
