Error messages.


* Error messages.
@ 2003-03-05 19:18 Anders Widman
  2003-03-05 20:41 ` Ross Vandegrift
                   ` (2 more replies)
  0 siblings, 3 replies; 46+ messages in thread
From: Anders Widman @ 2003-03-05 19:18 UTC (permalink / raw)
  To: reiserfs-list

   This has come up on this list a number of times, and still no one
   seems to have found the true answer to the problem.

   kernel: status error: status=0x58 { DriveReady SeekComplete DataRequest }

   Most seem to say this is a bad block on the hard drive. I am not
   convinced, though. I use Linux on three machines here, and I have
   seen this error on all of them, with lots of disks. The error seems
   to come at random, but it does cause system lockups and broken
   filesystems.

   I have about 20 disks, and have replaced and upgraded them several
   times too. This error has shown up on most of them. But when testing
   them with tools like IBM DFT, Maxtor PowerMax, badblocks or chkdsk
   in Windows, none of them show any errors.

   Sometimes it seems to help to disable DMA and/or lower the UDMA mode
   (all drives are ATA-100 or ATA-133). But then after a few days, or
   a few minutes, the kernel starts spitting out these status=0x58
   errors.

   After looking online on different forums, it does seem that many
   people are experiencing them.

   What  exactly does this status=0x58 error mean, and what can one do
   to solve the problem?

   //Anders

--------
PGP public key: https://tnonline.net/secure/pgp_key.txt

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-05 19:18 Anders Widman
@ 2003-03-05 20:41 ` Ross Vandegrift
  2003-03-05 20:51   ` Anders Widman
  2003-03-06  6:57 ` Oleg Drokin
  2003-03-07  5:47 ` Zygo Blaxell
  2 siblings, 1 reply; 46+ messages in thread
From: Ross Vandegrift @ 2003-03-05 20:41 UTC (permalink / raw)
  To: Anders Widman; +Cc: reiserfs-list

On Wed, Mar 05, 2003 at 08:18:18PM +0100, Anders Widman wrote:
>    What  exactly does this status=0x58 error mean, and what can one do
>    to solve the problem?

You said you tried changing the disks - next step is cables and
controllers.  There's absolutely no question that's a hardware failure -
you just have to figure out what piece has gone bad.

-- 
Ross Vandegrift
ross@willow.seitz.com

A Pope has a Water Cannon.                               It is a Water Cannon.
He fires Holy-Water from it.                        It is a Holy-Water Cannon.
He Blesses it.                                 It is a Holy Holy-Water Cannon.
He Blesses the Hell out of it.          It is a Wholly Holy Holy-Water Cannon.
He has it pierced.                It is a Holey Wholly Holy Holy-Water Cannon.
He makes it official.       It is a Canon Holey Wholly Holy Holy-Water Cannon.
Batman and Robin arrive.                                       He shoots them.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-05 20:41 ` Ross Vandegrift
@ 2003-03-05 20:51   ` Anders Widman
  2003-03-05 21:01     ` Anders Widman
                       ` (2 more replies)
  0 siblings, 3 replies; 46+ messages in thread
From: Anders Widman @ 2003-03-05 20:51 UTC (permalink / raw)
  To: reiserfs-list

> On Wed, Mar 05, 2003 at 08:18:18PM +0100, Anders Widman wrote:
>>    What  exactly does this status=0x58 error mean, and what can one do
>>    to solve the problem?

> You said you tried changing the disks - next step is cables and
> controllers.  There's absolutely no question that's a hardware failure -
> you just have to figure out what piece has gone bad.

   This is what I have been thinking too.. But it would mean the
   hardware is broken on all three machines, and on 20 hard drives. So far
   I have tested with the following:

   New 80w IDE cables
   New 40w IDE cables (running DMA-33 only)
   New harddrives
       Maxtor
       IBM
       Seagate
   New mainboards
       MSI with VIA KT400 chipset
       MSI with VIA KT266A chipset
       MSI with Intel 440BX chipset
       MSI with Intel 440BX chipset with dual CPU
   Different RAM
   Different Power-supply
   New Promise controllers
       PDC20268 (Ultra 100Tx2)

   So.. What more can I do? I seriously do not believe all of this
   hardware is broken. It all runs without problems in Windows
   2000/XP, and all drives pass checks with badblocks and the
   IBM/Maxtor/Seagate Drive Fitness tools.

   I made a quick search on Google and found multiple forums where
   users have this same problem, and without any error reported by
   DFT tools...

--------
PGP public key: https://tnonline.net/secure/pgp_key.txt

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-05 20:51   ` Anders Widman
@ 2003-03-05 21:01     ` Anders Widman
  2003-03-05 21:14     ` Ross Vandegrift
  2003-03-05 23:02     ` Soeren Sonnenburg
  2 siblings, 0 replies; 46+ messages in thread
From: Anders Widman @ 2003-03-05 21:01 UTC (permalink / raw)
  To: reiserfs-list

>> On Wed, Mar 05, 2003 at 08:18:18PM +0100, Anders Widman wrote:
>>>    What  exactly does this status=0x58 error mean, and what can one do
>>>    to solve the problem?

>> You said you tried changing the disks - next step is cables and
>> controllers.  There's absolutely no question that's a hardware failure -
>> you just have to figure out what piece has gone bad.

>    This  is  what  I  also  have been thinking.. But it would mean the
>    hardware is broken on all three machines, and 20 harddrives. So far
>    I have tested with the following:

>    New 80w IDE cables
>    New 40w IDE cables (running DMA-33 only)
>    New harddrives
>        Maxtor
>        IBM
>        Seagate
>    New mainboards
>        MSI with VIA KT400 chipset
>        MSI with VIA KT266A chipset
>        MSI with Intel 440BX chipset
>        MSI with Intel 440BX chipset with dual CPU

Heck... The BX chipset boards were Asus.. :)

>    Different RAM
>    Different Power-supply
>    New Promise controllers
>        PDC20268 (Ultra 100Tx2)

>    So..  What  more  can  I do? I seriously do not believe all of this
>    hardware  is  broken.  They  all  run  without  problems in Windows
>    2000/XP   and   all   drives  pass  check  through  badblocks  and
>    IBM/Maxtor/Seagate Drive Fitness tools.

>    I  made  a  quick  search on Google and found multiple forums where
>    users have this same problems - and without any error reported from
>    DFT tools...

--------
PGP public key: https://tnonline.net/secure/pgp_key.txt



^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: Error messages.
@ 2003-03-05 21:01 berthiaume_wayne
  2003-03-05 21:18 ` Anders Widman
  0 siblings, 1 reply; 46+ messages in thread
From: berthiaume_wayne @ 2003-03-05 21:01 UTC (permalink / raw)
  To: andewid; +Cc: reiserfs-list

	Anders, you need the subsequent lines for the error to determine
what the status error is. The 0x58 is a status error. It could be a status
timeout, seek error, etc. The subsequent line generally tells you what it
is. The fact you have noticed the errors diminished with DMA off or at the
lower speeds seems to indicate possible timeouts.
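
A minimal sketch of how the status byte breaks down into the flags shown
inside the braces. It assumes the standard ATA status register layout and
the labels the 2.4-era IDE driver prints, as far as I recall them; the
table and the name decode_status are illustrative and not taken from the
kernel source.

```python
# Assumed ATA status register bits, labelled the way the IDE driver
# appears to print them inside { ... }. Treat the names as an assumption.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_status(status):
    """Return the names of all flags set in an 8-bit status value."""
    return [name for mask, name in STATUS_BITS if status & mask]


if __name__ == "__main__":
    print(decode_status(0x58))
```

0x58 is 0x40 + 0x10 + 0x08, so the call yields the three flags from the
reported message. The Error bit (0x01) is not set in 0x58, which is why
the follow-up line is needed to see what the driver was complaining about.
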
	Typically, the IDE driver will drop out of DMA and retry in PIO when
an error is encountered. If the kernel is <2.4.18(?) it will have a
configuration switch that says DMA always for disks. What this means is IDE
will drop back to PIO, complete the request, then go back to DMA for
subsequent requests. Prior to this kernel, IDE would degrade that drive to
PIO forever until the next boot.
	To solve it - good hardware. Really! For instance, there is a Maxtor
tool that allows you to examine the probational logs of the drive - areas of
the disk that are suspected as bad. These are areas that aren't necessarily
logged as a bad block but are on "probation" for the next write. They can
cause these error messages to occur. Unfortunately, I don't believe the tool
is available to the general populace. Sorry.
	You may want to pose your question to the IDE mailing list
(linux-ide@vger.kernel.org) to see if anyone else has any ideas for you on
how to deal with these errors.
Regards,
Wayne

email:       Berthiaume_Wayne@emc.com

"One man can make a difference, and every man should try."  - JFK

-----Original Message-----
From: Anders Widman [mailto:andewid@tnonline.net]
Sent: Wednesday, March 05, 2003 2:18 PM
To: reiserfs-list@namesys.com
Subject: Error messages.

   This  has  come up on this list a number of times, and no one still
   seem to have found the true answer to the problem.

   kernel: status error: status=0x58 { DriveReady SeekComplete DataRequest }

   Most  seem  to  say  this is a bad block on the harddrive. I am not
   convinced  though.  Using  Linux on three machines here, and I have
   seen  this error on all of them, with lots of disks. The error seem
   to   come  random,  but  does  cause  system  lockups  and  broken
   filesystems.

   Have  about  20  disks, and have replaced and upgraded them several
   times  too.  This error has shown on most of them. But when testing
   them  with tools like IBM DFT, Maxtor Powermax, badblocks or chkdsk
   in Windows none show up to be with errors on.

   Sometimes  it  seem  to  help to disable DMA and or lower UDMA mode
   (all  drives are ATA-100 or ATA-133). But then after a few days, or
   a  few  minutes  the  kernel  starts spitting out these status=0x58
   errors.

   After  looking  online  on  different forums it does seem that many
   people are experiencing them.

   What  exactly does this status=0x58 error mean, and what can one do
   to solve the problem?

   //Anders

--------
PGP public key: https://tnonline.net/secure/pgp_key.txt

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: Error messages.
@ 2003-03-05 21:07 berthiaume_wayne
  2003-03-05 21:21 ` Anders Widman
  0 siblings, 1 reply; 46+ messages in thread
From: berthiaume_wayne @ 2003-03-05 21:07 UTC (permalink / raw)
  To: andewid; +Cc: reiserfs-list

	Anders, which kernel are you running and what Promise driver? I am
very familiar with these errors occurring on this hardware. There are issues
with the driver in lk 2.4.13, for a fact. I am currently using lk 2.4.19
which has these problems fixed. I believe the problems were fixed in lk
2.4.16-pre7.
Regards,
Wayne

email:       Berthiaume_Wayne@emc.com

"One man can make a difference, and every man should try."  - JFK


-----Original Message-----
From: Anders Widman [mailto:andewid@tnonline.net]
Sent: Wednesday, March 05, 2003 3:51 PM
To: reiserfs-list@namesys.com
Subject: Re: Error messages.


> On Wed, Mar 05, 2003 at 08:18:18PM +0100, Anders Widman wrote:
>>    What  exactly does this status=0x58 error mean, and what can one do
>>    to solve the problem?

> You said you tried changing the disks - next step is cables and
> controllers.  There's absolutely no question that's a hardware failure -
> you just have to figure out what piece has gone bad.

   This  is  what  I  also  have been thinking.. But it would mean the
   hardware is broken on all three machines, and 20 harddrives. So far
   I have tested with the following:

   New 80w IDE cables
   New 40w IDE cables (running DMA-33 only)
   New harddrives
       Maxtor
       IBM
       Seagate
   New mainboards
       MSI with VIA KT400 chipset
       MSI with VIA KT266A chipset
       MSI with Intel 440BX chipset
       MSI with Intel 440BX chipset with dual CPU
   Different RAM
   Different Power-supply
   New Promise controllers
       PDC20268 (Ultra 100Tx2)

   So..  What  more  can  I do? I seriously do not believe all of this
   hardware  is  broken.  They  all  run  without  problems in Windows
   2000/XP   and   all   drives  pass  check  through  badblocks  and
   IBM/Maxtor/Seagate Drive Fitness tools.

   I  made  a  quick  search on Google and found multiple forums where
   users have this same problems - and without any error reported from
   DFT tools...




--------
PGP public key: https://tnonline.net/secure/pgp_key.txt

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-05 20:51   ` Anders Widman
  2003-03-05 21:01     ` Anders Widman
@ 2003-03-05 21:14     ` Ross Vandegrift
  2003-03-05 23:02     ` Soeren Sonnenburg
  2 siblings, 0 replies; 46+ messages in thread
From: Ross Vandegrift @ 2003-03-05 21:14 UTC (permalink / raw)
  To: Anders Widman; +Cc: reiserfs-list

On Wed, Mar 05, 2003 at 09:51:22PM +0100, Anders Widman wrote:
>    New 80w IDE cables
>    New 40w IDE cables (running DMA-33 only)

How long are your cables?  If your cables are out of spec, this can
cause errors.  Your 40-wire IDE cables should be no longer than 18" and
the 80-wire ones should be no longer than 24".  It's also possible that you
just got crummy cables due to some damage/defect/etc.

-- 
Ross Vandegrift
ross@willow.seitz.com


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-05 21:01 berthiaume_wayne
@ 2003-03-05 21:18 ` Anders Widman
  0 siblings, 0 replies; 46+ messages in thread
From: Anders Widman @ 2003-03-05 21:18 UTC (permalink / raw)
  To: reiserfs-list

>         Anders, you need the subsequent lines for the error to determine
> what the status error is. The 0x58 is a status error. It could be a status
> timeout, seek error, etc. The subsequent line generally tells you what it
> is. The fact you have noticed the errors diminished with DMA off or at the
> lower speeds seems to indicate possible timeouts.

The errors come even without DMA enabled.

Right now most of them contain "drive not ready for command".

>         To solve it - good hardware. Really! For instance, there is a Maxtor
> tool that allows you to examine the probational logs of the drive - areas of
> the disk that are suspected as bad. These are areas that aren't necessarily
> logged as a bad block but are on "probation" for the next write. They can
> cause these error messages to occur. Unfortunately, I don't believe the tool
> is available to the general populace. Sorry.

Well, you can check the SMART levels and thresholds, and they are not
violated yet. Also, I do not believe that all 4 computer systems are
broken. Especially when I have run Windows on them for a long time
without any problems at all.



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-05 21:07 berthiaume_wayne
@ 2003-03-05 21:21 ` Anders Widman
  2003-03-06  6:58   ` Todd Lyons
  0 siblings, 1 reply; 46+ messages in thread
From: Anders Widman @ 2003-03-05 21:21 UTC (permalink / raw)
  To: berthiaume_wayne; +Cc: reiserfs-list

>         Anders, which kernel are you running and what Promise driver?

I  have  been  trying most kernels since 2.4.17 including stock redhat
and mandrake kernels.


> Wayne



--------
PGP public key: https://tnonline.net/secure/pgp_key.txt


^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: Error messages.
@ 2003-03-05 21:31 berthiaume_wayne
  0 siblings, 0 replies; 46+ messages in thread
From: berthiaume_wayne @ 2003-03-05 21:31 UTC (permalink / raw)
  To: andewid; +Cc: reiserfs-list

	Are you running a slave/master drive configuration on the channel
that is failing? The error indicates that you have a command pending and are
trying to stuff another one on the drive. We had been seeing a similar issue
while testing IDE write barrier patches against lk 2.4.19 and needed to back
port a patch for ide.c from lk 2.4.20. Not sure it's the same issue for you.
Regards,
Wayne

email:       Berthiaume_Wayne@emc.com

"One man can make a difference, and every man should try."  - JFK


-----Original Message-----
From: Anders Widman [mailto:andewid@tnonline.net]
Sent: Wednesday, March 05, 2003 4:18 PM
To: reiserfs-list@namesys.com
Subject: Re: Error messages.


>         Anders, you need the subsequent lines for the error to determine
> what the status error is. The 0x58 is a status error. It could be a status
> timeout, seek error, etc. The subsequent line generally tells you what it
> is. The fact you have noticed the errors diminished with DMA off or at the
> lower speeds seems to indicate possible timeouts.

The errors come even without DMA enabled.

Right now most of them contain "drive not ready for command".

>         To solve it - good hardware. Really! For instance, there is a Maxtor
> tool that allows you to examine the probational logs of the drive - areas of
> the disk that are suspected as bad. These are areas that aren't necessarily
> logged as a bad block but are on "probation" for the next write. They can
> cause these error messages to occur. Unfortunately, I don't believe the tool
> is available to the general populace. Sorry.

Well,  you  can  check  smart  levels  and thresholds and they are not
violated  yet.  Also, I do not believe that all 4 computer systems are
broken.  Especially  when  I  have run Windows on them for a long time
without any problems at all.


^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: Error messages.
@ 2003-03-05 21:36 berthiaume_wayne
  2003-03-05 22:50 ` Anders Widman
  0 siblings, 1 reply; 46+ messages in thread
From: berthiaume_wayne @ 2003-03-05 21:36 UTC (permalink / raw)
  To: andewid; +Cc: reiserfs-list

	You may want to look at the latest from SuSE. I've been mostly
successful with lk 2.4.19 on the pdc202xx driver with Maxtor 250GB drives on
a Tyan PIII motherboard. I added IDE write barrier patches from Chris Mason
and Jens Axboe and ran into the "drive not ready for command" errors. This
was subsequently fixed by Jens for a 2.4.20 kernel and I've back ported that
bit for the 2.4.19 kernel I'm testing. The status errors are gone but now
I'm investigating filesystem corruptions on my ReiserFS partitions.

-----Original Message-----
From: Anders Widman [mailto:andewid@tnonline.net]
Sent: Wednesday, March 05, 2003 4:21 PM
To: berthiaume_wayne@emc.com
Cc: reiserfs-list@namesys.com
Subject: Re: Error messages.


>         Anders, which kernel are you running and what Promise driver?

I  have  been  trying most kernels since 2.4.17 including stock redhat
and mandrake kernels.


> Wayne



--------
PGP public key: https://tnonline.net/secure/pgp_key.txt

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-05 21:36 berthiaume_wayne
@ 2003-03-05 22:50 ` Anders Widman
  2003-03-05 22:53   ` Anders Widman
  0 siblings, 1 reply; 46+ messages in thread
From: Anders Widman @ 2003-03-05 22:50 UTC (permalink / raw)
  To: reiserfs-list

>         You may want to look at the latest from SuSE. I've been mostly
> successful with lk 2.4.19 on the pdc202xx driver with Maxtor 250GB drives on
> a Tyan PIII motherboard. I added IDE write barrier patches from Chris Mason
> and Jens Axboe and ran into the "drive not ready for command" errors. This
> was subsequently fixed by Jens for a 2.4.20 kernel and I've back ported that
> bit for the 2.4.19 kernel I'm testing. The status errors are gone but now
> I'm investigating filesystem corruptions on my ReiserFS partitions.

I am currently running 2.4.21-pre4. It seems to keep most other things
working, except for the "drive not ready for command" error.

It might be worth testing 2.4.19 kernel with these patches then. I did
try with mandrake 2.4.19 without success though.

Also, am going to try 2.5.64 too.


PGP public key: https://tnonline.net/secure/pgp_key.txt


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-05 22:50 ` Anders Widman
@ 2003-03-05 22:53   ` Anders Widman
  0 siblings, 0 replies; 46+ messages in thread
From: Anders Widman @ 2003-03-05 22:53 UTC (permalink / raw)
  To: reiserfs-list

>>         You may want to look at the latest from SuSE. I've been mostly
>> successful with lk 2.4.19 on the pdc202xx driver with Maxtor 250GB drives on
>> a Tyan PIII motherboard. I added IDE write barrier patches from Chris Mason
>> and Jens Axboe and ran into the "drive not ready for command" errors. This
>> was subsequently fixed by Jens for a 2.4.20 kernel and I've back ported that
>> bit for the 2.4.19 kernel I'm testing. The status errors are gone but now
>> I'm investigating filesystem corruptions on my ReiserFS partitions.

> I  am currently running 2.4.21-pre4. It seem to keep most other things
> working, except for the "drive not ready for command" error.

> It might be worth testing 2.4.19 kernel with these patches then. I did
> try with mandrake 2.4.19 without success though.

> Also, am going to try 2.5.64 too.

Might  be  something to do with APIC and/or ACPI/APM too. Have to test
with/without these options.

--------
PGP public key: https://tnonline.net/secure/pgp_key.txt


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-05 20:51   ` Anders Widman
  2003-03-05 21:01     ` Anders Widman
  2003-03-05 21:14     ` Ross Vandegrift
@ 2003-03-05 23:02     ` Soeren Sonnenburg
  2003-03-06  8:46       ` Anders Widman
  2 siblings, 1 reply; 46+ messages in thread
From: Soeren Sonnenburg @ 2003-03-05 23:02 UTC (permalink / raw)
  To: reiserfs-list; +Cc: Anders Widman

On Wed, 2003-03-05 at 21:51, Anders Widman wrote:
> > On Wed, Mar 05, 2003 at 08:18:18PM +0100, Anders Widman wrote:
>    New Promise controllers
>        PDC20268 (Ultra 100Tx2)

Does that mean you only tested on these PDCs?

If so, then drop this damn PDC controller and get one that is
supported under Linux (e.g. hpt370 based controllers).

I had the very same problems with these PDC20268 controllers. When I
switched to anything above MDMA0 (note not even UDMA) the system was
freezing from time to time.

On the internal controller your drives should work all fine (via/intel
chipsets work nicely), also on hpt based chipsets and also cmd is
supporting linux... but forget about promise. This company just does not
support linux.

I was using kernels 2.4.19/20/21pre1/21pre4/21pre4-ac5 and all had the
very same problem. When I heard from others that they had problems with
promise I switched... and I am now enjoying a rock stable system.

Soeren.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: Error messages.
@ 2003-03-05 23:09 berthiaume_wayne
  0 siblings, 0 replies; 46+ messages in thread
From: berthiaume_wayne @ 2003-03-05 23:09 UTC (permalink / raw)
  To: andewid; +Cc: reiserfs-list, axboe

	I don't know if the patch for 2.4.20 I received was rolled out to
the community at large or even if it will resolve your problem. It did help
my particular problem though.

-----Original Message-----
From: Anders Widman [mailto:andewid@tnonline.net]
Sent: Wednesday, March 05, 2003 5:50 PM
To: reiserfs-list@namesys.com
Subject: Re: Error messages.


>         You may want to look at the latest from SuSE. I've been mostly
> successful with lk 2.4.19 on the pdc202xx driver with Maxtor 250GB drives on
> a Tyan PIII motherboard. I added IDE write barrier patches from Chris Mason
> and Jens Axboe and ran into the "drive not ready for command" errors. This
> was subsequently fixed by Jens for a 2.4.20 kernel and I've back ported that
> bit for the 2.4.19 kernel I'm testing. The status errors are gone but now
> I'm investigating filesystem corruptions on my ReiserFS partitions.

I  am currently running 2.4.21-pre4. It seem to keep most other things
working, except for the "drive not ready for command" error.

It might be worth testing 2.4.19 kernel with these patches then. I did
try with mandrake 2.4.19 without success though.

Also, am going to try 2.5.64 too.


PGP public key: https://tnonline.net/secure/pgp_key.txt

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-05 19:18 Anders Widman
  2003-03-05 20:41 ` Ross Vandegrift
@ 2003-03-06  6:57 ` Oleg Drokin
  2003-03-06  7:07   ` Voicu Liviu
  2003-03-06  8:32   ` Anders Widman
  2003-03-07  5:47 ` Zygo Blaxell
  2 siblings, 2 replies; 46+ messages in thread
From: Oleg Drokin @ 2003-03-06  6:57 UTC (permalink / raw)
  To: Anders Widman; +Cc: reiserfs-list

Hello!

On Wed, Mar 05, 2003 at 08:18:18PM +0100, Anders Widman wrote:

>    This  has  come up on this list a number of times, and no one still
>    seem to have found the true answer to the problem.
>    kernel: status error: status=0x58 { DriveReady SeekComplete DataRequest }

Is this the only message?
The most similar stuff I have seen comes in pairs, with a different message like this:
  hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
  hda: dma_intr: error=0x84 { DriveStatusError BadCRC }

And this one means there is a noisy IDE cable and/or too high a UDMA mode for this cable.
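
The error byte on the second line can be decoded bit by bit. This is a
small sketch, assuming the ATA error register layout as the IDE driver
labels it; decode_error is a hypothetical helper. Printing bit 0x80 as
BadCRC only when the abort bit 0x04 is also set is my understanding of
the driver's behaviour, not something stated in this thread.

```python
def decode_error(err):
    """Return assumed driver-style names for the bits set in an ATA error byte."""
    names = []
    if err & 0x04:
        names.append("DriveStatusError")  # command aborted
    if err & 0x80:
        # Assumption: with the abort bit set this bit is reported as a
        # CRC error on the cable, otherwise as a bad sector.
        names.append("BadCRC" if err & 0x04 else "BadSector")
    if err & 0x40:
        names.append("UncorrectableError")
    if err & 0x10:
        names.append("SectorIdNotFound")
    if err & 0x02:
        names.append("TrackZeroNotFound")
    if err & 0x01:
        names.append("AddrMarkNotFound")
    return names


if __name__ == "__main__":
    print(decode_error(0x84))
```

0x84 is 0x80 + 0x04, which corresponds to the DriveStatusError and BadCRC
flags in the log line above.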

>    Sometimes  it  seem  to  help to disable DMA and or lower UDMA mode

This also confirms the above theory of a bad data cable.
And if you do not use UDMA, the CRC is not checked at all.

>    (all  drives are ATA-100 or ATA-133). But then after a few days, or
>    a  few  minutes  the  kernel  starts spitting out these status=0x58
>    errors.

Hm. This is strange.
Well, the best thing you can probably do is just post the whole history to
IDE maintainers and ask for decoding. Worked for me.

>    What  exactly does this status=0x58 error mean, and what can one do
>    to solve the problem?

Well, I checked my logs:
Feb 27 21:14:47 car kernel: hdg: status error: status=0x58 { DriveReady SeekComplete DataRequest }
Feb 27 21:14:47 car kernel: hdg: drive not ready for command
Feb 27 21:14:51 car kernel: hdg: status error: status=0x58 { DriveReady SeekComplete DataRequest }
Feb 27 21:14:51 car kernel: hdg: drive not ready for command
;)))
And in this case I am sure this was a scratched CD-ROM disc in my CD-ROM drive.

Probably the same thing can happen when a drive is busy remapping bad sectors?
Use smartctl to find out how these messages correlate with the remapped bad sector counts?
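
One way to get numbers for such a comparison is to count the status
errors per drive in the syslog. The sketch below assumes syslog lines in
the format shown above; the regular expressions and the name summarize
are illustrative. It pairs each "status=" line with the follow-up line
for the same drive and does not call smartctl itself, so the counts
would have to be compared by hand with what smartctl reports.

```python
import re
from collections import Counter

# A line carrying a status byte, e.g. "kernel: hdg: status error: status=0x58 {...}"
STATUS_RE = re.compile(r"kernel: (hd[a-z]): .*status=(0x[0-9a-fA-F]{2})")
# Any other kernel line for an IDE drive, e.g. "kernel: hdg: drive not ready for command"
DETAIL_RE = re.compile(r"kernel: (hd[a-z]): (?!.*status=)(.+)$")


def summarize(lines):
    """Count (drive, status, follow-up message) triples in syslog lines."""
    counts = Counter()
    pending = None  # (drive, status) still waiting for its follow-up line
    for line in lines:
        status = STATUS_RE.search(line)
        if status:
            if pending:
                counts[pending + ("",)] += 1  # previous error had no follow-up
            pending = (status.group(1), status.group(2))
            continue
        detail = DETAIL_RE.search(line)
        if detail and pending and detail.group(1) == pending[0]:
            counts[pending + (detail.group(2).strip(),)] += 1
            pending = None
    if pending:
        counts[pending + ("",)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Feb 27 21:14:47 car kernel: hdg: status error: status=0x58 { DriveReady SeekComplete DataRequest }",
        "Feb 27 21:14:47 car kernel: hdg: drive not ready for command",
        "Feb 27 21:14:51 car kernel: hdg: status error: status=0x58 { DriveReady SeekComplete DataRequest }",
        "Feb 27 21:14:51 car kernel: hdg: drive not ready for command",
    ]
    for key, n in summarize(sample).items():
        print(key, n)
```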

Bye,
    Oleg

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-05 21:21 ` Anders Widman
@ 2003-03-06  6:58   ` Todd Lyons
  2003-03-06  8:34     ` Anders Widman
  0 siblings, 1 reply; 46+ messages in thread
From: Todd Lyons @ 2003-03-06  6:58 UTC (permalink / raw)
  To: reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Anders Widman wanted us to know:

>I  have  been  trying most kernels since 2.4.17 including stock redhat
>and mandrake kernels.

Do you have apic enabled or disabled in both the kernel and the BIOS?  
Do you have acpi enabled or disabled in both the kernel and the BIOS?

Have you tried the absolute latest Cooker (Mandrake) kernel?
kernel-2.4.21*12mdk.i586.rpm? (I can't remember the _exact_ file name) 
- -- 
Blue skies...		Todd
| Get a bigger hammer!   |  All vendors suck, but different ones  |
| http://www.mrball.net  |  suck less in different applications.  |
| http://faq.mrball.net  |                --Andy Walden on NANOG  |
Linux kernel 2.4.19-16mdk   4 users,  load average: 0.00, 0.10, 0.22
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (GNU/Linux)
Comment: http://www.mrball.net/todd.asc

iD8DBQE+ZvF/IBT1264ScBURAgnmAJ0RdFCpIVohPHEpArcuGBbB24GQBACfUdIo
4xzbyGrxNG3pr8QiZI2C0s8=
=7irQ
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-06  6:57 ` Oleg Drokin
@ 2003-03-06  7:07   ` Voicu Liviu
  2003-03-06  7:19     ` Oleg Drokin
  2003-03-06  8:32   ` Anders Widman
  1 sibling, 1 reply; 46+ messages in thread
From: Voicu Liviu @ 2003-03-06  7:07 UTC (permalink / raw)
  To: reiserfs-list

I also get messages like this: ( what is this? )

On Thursday 06 March 2003 08:57, Oleg Drokin wrote:
> hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
>   hda: dma_intr: error=0x84 { DriveStatusError BadCRC }


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-06  7:19     ` Oleg Drokin
@ 2003-03-06  7:19       ` Voicu Liviu
  2003-03-06  8:37         ` Oleg Drokin
  0 siblings, 1 reply; 46+ messages in thread
From: Voicu Liviu @ 2003-03-06  7:19 UTC (permalink / raw)
  To: reiserfs-list

On Thursday 06 March 2003 09:19, Oleg Drokin wrote:
> Hello!
>
> On Thu, Mar 06, 2003 at 09:07:49AM +0200, Voicu Liviu wrote:
> > I also get messages like this: ( what is this? )
> >
> > On Thursday 06 March 2003 08:57, Oleg Drokin wrote:
> > > hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> > >   hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
>
> This is a complete sign of either your having noisy/broken (or too long?)
> IDE cable (consider changing).
> or your using too high UDMA mode for your IDE cable (e.g. UDMA3+ for 40
> wire cable) (consider changing cable or lowering UDMA mode).

But not a bad HD, right? This is a new Seagate 40 GB.

>
> Bye,
>     Oleg


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-06  7:07   ` Voicu Liviu
@ 2003-03-06  7:19     ` Oleg Drokin
  2003-03-06  7:19       ` Voicu Liviu
  0 siblings, 1 reply; 46+ messages in thread
From: Oleg Drokin @ 2003-03-06  7:19 UTC (permalink / raw)
  To: Voicu Liviu; +Cc: reiserfs-list

Hello!

On Thu, Mar 06, 2003 at 09:07:49AM +0200, Voicu Liviu wrote:
> I also get messages like this: ( what is this? )
> On Thursday 06 March 2003 08:57, Oleg Drokin wrote:
> > hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> >   hda: dma_intr: error=0x84 { DriveStatusError BadCRC }

This is a clear sign that either you have a noisy/broken (or too long?)
IDE cable (consider changing it),
or you are using too high a UDMA mode for your IDE cable (e.g. UDMA3+ for a 40-wire cable)
(consider changing the cable or lowering the UDMA mode).
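
The cable rule can be written down as a tiny check. This is only a
sketch of the rule of thumb given above (UDMA3 and higher is too much
for a 40-wire cable); the names udma_mode_ok and MAX_UDMA_40_WIRE are
made up for illustration, and the check says nothing about cable length
or quality.

```python
# Highest UDMA mode assumed safe on a 40-wire cable, following the rule
# of thumb above that UDMA3+ needs an 80-wire cable.
MAX_UDMA_40_WIRE = 2


def udma_mode_ok(mode, cable_wires):
    """Return True if the UDMA mode is plausible for the given cable type."""
    if cable_wires not in (40, 80):
        raise ValueError("cable_wires must be 40 or 80")
    if mode < 0:
        raise ValueError("UDMA mode must be non-negative")
    return cable_wires == 80 or mode <= MAX_UDMA_40_WIRE


if __name__ == "__main__":
    print(udma_mode_ok(5, 40))
    print(udma_mode_ok(2, 40))
```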

Bye,
    Oleg

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-06  6:57 ` Oleg Drokin
  2003-03-06  7:07   ` Voicu Liviu
@ 2003-03-06  8:32   ` Anders Widman
  2003-03-06  8:40     ` Oleg Drokin
  2003-03-06 12:16     ` Hans Reiser
  1 sibling, 2 replies; 46+ messages in thread
From: Anders Widman @ 2003-03-06  8:32 UTC (permalink / raw)
  To: reiserfs-list


> Well, I checked my logs:
> kernel: hdg: status error: status=0x58 { DriveReady SeekComplete DataRequest }
> kernel: hdg: drive not ready for command

   This is what I get all the time.

> And for this case I am sure this was a scratchy CD-ROM disk in my CD-ROM drive.

   Well, have no CD-ROM. :)

> Probably same stuff can be get when drive is busy remapping bad sectors?
> Use smartctl to find out how these messages corellate with remapped bad sectors counts?

  Very strange. That would mean all of my hard drives are broken, or on
  their way to being broken. I do not believe that. Most of the
  hardware, including the cabling, has been replaced and changed.

> Bye,
>     Oleg



--------
PGP public key: https://tnonline.net/secure/pgp_key.txt


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-06  6:58   ` Todd Lyons
@ 2003-03-06  8:34     ` Anders Widman
  2003-03-06 17:33       ` Anders Widman
  0 siblings, 1 reply; 46+ messages in thread
From: Anders Widman @ 2003-03-06  8:34 UTC (permalink / raw)
  To: reiserfs-list

> Anders Widman wanted us to know:

>>I  have  been  trying most kernels since 2.4.17 including stock redhat
>>and mandrake kernels.

> Do you have apic enabled or disabled in both the kernel and the BIOS?
> Do you have acpi enabled or disabled in both the kernel and the BIOS?

Yes, right now both are enabled. I will try without them. If that works,
does it mean there is a nasty bug in the kernel and/or the Promise drivers?


> Have you tried the absolute latest Cooker (Mandrake) kernel?
> kernel-2.4.21*12mdk.i586.rpm? (I can't remember the _exact_ file name)

  Well, tried 2.4.21pre4-6mdk as the latest Mandrake kernel.

--------
PGP public key: https://tnonline.net/secure/pgp_key.txt


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-06  7:19       ` Voicu Liviu
@ 2003-03-06  8:37         ` Oleg Drokin
  0 siblings, 0 replies; 46+ messages in thread
From: Oleg Drokin @ 2003-03-06  8:37 UTC (permalink / raw)
  To: Voicu Liviu; +Cc: reiserfs-list

Hello!

On Thu, Mar 06, 2003 at 09:19:45AM +0200, Voicu Liviu wrote:
> > > On Thursday 06 March 2003 08:57, Oleg Drokin wrote:
> > > > hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> > > >   hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
> > This is a complete sign of either your having noisy/broken (or too long?)
> > IDE cable (consider changing).
> > or your using too high UDMA mode for your IDE cable (e.g. UDMA3+ for 40
> > wire cable) (consider changing cable or lowering UDMA mode).
> but not bad HD right? this is a new Seagate 40 GB

Right, this is not a bad HDD.

Bye,
    Oleg

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-06  8:32   ` Anders Widman
@ 2003-03-06  8:40     ` Oleg Drokin
  2003-03-06  8:43       ` Anders Widman
  2003-03-06 12:16     ` Hans Reiser
  1 sibling, 1 reply; 46+ messages in thread
From: Oleg Drokin @ 2003-03-06  8:40 UTC (permalink / raw)
  To: Anders Widman; +Cc: reiserfs-list

Hello!

On Thu, Mar 06, 2003 at 09:32:38AM +0100, Anders Widman wrote:

> > And for this case I am sure this was a scratchy CD-ROM disk in my CD-ROM drive.
>    Well, have no CD-ROM. :)

/dev/hdg is one of my CD-ROMs ;)

> > Probably same stuff can be get when drive is busy remapping bad sectors?
> > Use smartctl to find out how these messages corellate with remapped bad sectors counts?
>   Very strange. Would mean all of my harddrives would be broken, or on
>   their  way  to  get  broken.  I  do  not  believe that.  Most of the
>   hardware, including the cabling has been replaced and changed.

Well, as Wayne has noticed, it seems you have one common part:
Promise controllers. How about using a different kind of controller
on one of the boxes to see if it helps?

Bye,
    Oleg

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-06  8:40     ` Oleg Drokin
@ 2003-03-06  8:43       ` Anders Widman
  2003-03-06  8:48         ` Oleg Drokin
  0 siblings, 1 reply; 46+ messages in thread
From: Anders Widman @ 2003-03-06  8:43 UTC (permalink / raw)
  To: reiserfs-list

> Hello!

> On Thu, Mar 06, 2003 at 09:32:38AM +0100, Anders Widman wrote:

>> > And for this case I am sure this was a scratchy CD-ROM disk in my CD-ROM drive.
>>    Well, have no CD-ROM. :)

> /dev/hdg is one of my CD-ROMs ;)

>> > Probably same stuff can be get when drive is busy remapping bad sectors?
>> > Use smartctl to find out how these messages corellate with remapped bad sectors counts?
>>   Very strange. Would mean all of my harddrives would be broken, or on
>>   their  way  to  get  broken.  I  do  not  believe that.  Most of the
>>   hardware, including the cabling has been replaced and changed.

> Well, seems as Wayne have noticed, you have one common part:
> Promise controllers. How about using different kind of controller
> on one of the boxes and see if it helps?

Perhaps, but the same happens on the internal controller. In fact,
the internal controller (either the VIA or the Intel) causes the system
to freeze when it happens too many times.

> Bye,
>     Oleg


   



--------
PGP public key: https://tnonline.net/secure/pgp_key.txt


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-05 23:02     ` Soeren Sonnenburg
@ 2003-03-06  8:46       ` Anders Widman
  0 siblings, 0 replies; 46+ messages in thread
From: Anders Widman @ 2003-03-06  8:46 UTC (permalink / raw)
  To: reiserfs-list

> On Wed, 2003-03-05 at 21:51, Anders Widman wrote:
>> > On Wed, Mar 05, 2003 at 08:18:18PM +0100, Anders Widman wrote:
>>    New Promise controllers
>>        PDC20268 (Ultra 100Tx2)

> does that mean you only tested on these pdc's ?

I changed from three Ultra100 boards to Ultra100Tx2. Now I only use two
boards in this particular system.

> If so then then drop this damn PDC controller and get one that is
> supported under linux (e.g. hpt370 based controllers).

> I had the very same problems with these PDC20268 controllers. When I
> switched to anything above MDMA0 (note not even UDMA) the system was
> freezing from time to time.

This happens here too...

> On the internal controller your drives should work all fine (via/intel
> chipsets work nicely), also on hpt based chipsets and also cmd is
> supporting linux... but forget about promise. This company just does not
> support linux.

> I was using kernels 2.4.19/20/21pre1/21pre4/21pre4-ac5 and all had the
> very same problem. When I heard from others that they had problems with
> promise I switched... and I am now enjoying a rock stable system.

It  might  just  have  to  come  to this, but I do not want to buy new
hardware :)

> Soeren.


   



--------
PGP public key: https://tnonline.net/secure/pgp_key.txt


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-06  8:43       ` Anders Widman
@ 2003-03-06  8:48         ` Oleg Drokin
  0 siblings, 0 replies; 46+ messages in thread
From: Oleg Drokin @ 2003-03-06  8:48 UTC (permalink / raw)
  To: Anders Widman; +Cc: reiserfs-list

Hello!

On Thu, Mar 06, 2003 at 09:43:32AM +0100, Anders Widman wrote:
> >> > Probably same stuff can be get when drive is busy remapping bad sectors?
> >> > Use smartctl to find out how these messages corellate with remapped bad sectors counts?
> >>   Very strange. Would mean all of my harddrives would be broken, or on
> >>   their  way  to  get  broken.  I  do  not  believe that.  Most of the
> >>   hardware, including the cabling has been replaced and changed.
> > Well, seems as Wayne have noticed, you have one common part:
> > Promise controllers. How about using different kind of controller
> > on one of the boxes and see if it helps?
> Perhaps,  but  the  same  happens on the internal controller. In fact,
> the  internal  controller  (either VIA or the Intel) causes the system
> to freeze when it happens to many times.

Ah, hm. Then I think your best bet is to contact the Linux IDE maintainers.

Bye,
    Oleg

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-06  8:32   ` Anders Widman
  2003-03-06  8:40     ` Oleg Drokin
@ 2003-03-06 12:16     ` Hans Reiser
  2003-03-06 12:23       ` Anders Widman
  1 sibling, 1 reply; 46+ messages in thread
From: Hans Reiser @ 2003-03-06 12:16 UTC (permalink / raw)
  To: Anders Widman; +Cc: reiserfs-list

Anders Widman wrote:

>>Well, I checked my logs:
>>kernel: hdg: status error: status=0x58 { DriveReady SeekComplete DataRequest }
>>kernel: hdg: drive not ready for command
>
>   This is what I get all the time.
>
>>And for this case I am sure this was a scratchy CD-ROM disk in my CD-ROM drive.
>
>   Well, have no CD-ROM. :)
>
>>Probably same stuff can be get when drive is busy remapping bad sectors?
>>Use smartctl to find out how these messages corellate with remapped bad sectors counts?
>
>  Very strange. Would mean all of my harddrives would be broken, or on
>  their  way  to  get  broken.  I  do  not  believe that.  Most of the
>  hardware, including the cabling has been replaced and changed.
>
>>Bye,
>>    Oleg
>
>--------
>PGP public key: https://tnonline.net/secure/pgp_key.txt

Hardware is so much fun to debug sometimes, and when you are the 1% case 
life can really suck.

Do you have:

bad cooling

bad power supply

bad voltage from power company

electrical noise

?

I think though that Oleg is right that you should contact the IDE guys 
and ask them for their list of things to check for that particular error 
message (and tell us about it).

-- 
Hans



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-06 12:16     ` Hans Reiser
@ 2003-03-06 12:23       ` Anders Widman
  2003-03-06 12:23         ` Dieter Nützel
  0 siblings, 1 reply; 46+ messages in thread
From: Anders Widman @ 2003-03-06 12:23 UTC (permalink / raw)
  To: reiserfs-list

>>
> Hardware is so much fun to debug sometimes, and when you are the 1% case
> life can really suck.

> Do you have:

> bad cooling

Nope.  Not  warmer  than  35C  anywhere,  including the surface of the
drives.

> bad power supply

Well, this has been checked too, though I cannot be entirely sure.
I have used two different Chieftek 340W PSUs.

> bad voltage from power company

The power distribution facility is just about 300m from here, and I
have installed line filters that take care of noise and spikes.

> electrical noise

How do I measure this? It might be an issue, as there are many drives
installed. They might disturb each other.

> ?

> I think though that Oleg is right that you should contact the IDE guys
> and ask them for their list of things to check for that particular error
> message (and tell us about it).

Yes, I am trying to get in contact with the linux-ide mailing list. We'll
see what comes out of that. My prior experience with that list has
not been very successful ;)

Regards,
Anders

--------
PGP public key: https://tnonline.net/secure/pgp_key.txt

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-06 12:23       ` Anders Widman
@ 2003-03-06 12:23         ` Dieter Nützel
  0 siblings, 0 replies; 46+ messages in thread
From: Dieter Nützel @ 2003-03-06 12:23 UTC (permalink / raw)
  To: Anders Widman, reiserfs-list

On Thursday, 6 March 2003 13:23, Anders Widman wrote:
> > Hardware is so much fun to debug sometimes, and when you are the 1% case
> > life can really suck.
> >
> > Do you have:
> >
> > bad cooling
>
> Nope.  Not  warmer  than  35C  anywhere,  including the surface of the
> drives.
>
> > bad power supply
>
> Well,  this has been checked too, though I could not be entirely sure.
> I have used two different Chieftek 340W PSUs
>
> > bad voltage from power company
>
> The  power  distribution  facility is just about 300m from here. And I
> have installed line filters that takes cares of noise and spikes.
>
> > electrical noise
>
> How do I measure this?... Might be as there are many drives installed.
> They might disturb each other.
>
> > ?
> >
> > I think though that Oleg is right that you should contact the IDE guys
> > and ask them for their list of things to check for that particular error
> > message (and tell us about it).
>
> Yes, I am trying to get contact with the linux-ide mailing list. We'll
> see  what  will  come out of those. Prior experience with that list is
> not very successful ;)

Maybe "simple" l-k?
Or Alan Cox, Andre Hedrick directly? ;-)

Regards,
	Dieter

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: Error messages.
@ 2003-03-06 13:27 berthiaume_wayne
  2003-03-06 13:52 ` Anders Widman
  0 siblings, 1 reply; 46+ messages in thread
From: berthiaume_wayne @ 2003-03-06 13:27 UTC (permalink / raw)
  To: andewid; +Cc: reiserfs-list

	That's rather puzzling... I did not have the same problems with the
mii driver; however, I was unable to use the full capacity of the 250GB drive
or UDMA level 6 with mii under 2.4.13, so I was using a specially patched
driver from Promise to support both the pdc20269 and 48-bit LBA. In 2.4.19
48-bit LBA support was added, so I was able to get the full address range on
the 250GB drives without patches from Promise; however, I was still unable to
run UDMA level 6 on the onboard Intel chip.
	I still use the Promise pdc20269 and run UDMA level 6 on thousands
of deployed servers at this time. What is the cable length from drives to
controller? Even though you have several configured servers, I have thousands
without the problem you are seeing. Yes, I do get an occasional status error
under heavy loads, but they've always been recoverable and the systems
continue to chug along.


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-06 13:27 berthiaume_wayne
@ 2003-03-06 13:52 ` Anders Widman
  0 siblings, 0 replies; 46+ messages in thread
From: Anders Widman @ 2003-03-06 13:52 UTC (permalink / raw)
  To: reiserfs-list

>         That's rather puzzling... I did not have the same problems with the
> mii driver; however, I was unable to run the full extent of the 250GB drive
> or the UDMA level 6 with mii under 2.4.13, so I was using a special patched
> driver form Promise to support both the pdc20269 and 48LBA. In 2.4.19 the 48
> LBA was added so I was able to get the full address range on the 250GB
> drives without patches from Promise; however, was still unable to run UDMA
> level 6 on the onboard Intel chip.

UDMA6 works on the machine with the VIA KT400 chip and the 2.4.21 kernel.
The other machines are limited to ATA-100 as the controllers do not
support higher. Actually I do not need high DMA modes; DMA-33 should be
enough.

The errors come even with DMA turned off, though. It does seem, at
least so far, that the system crashes/lockups come much more often
with DMA than without.

>         I still use the Promise pdc20269 and run UDMA level 6 on thousands
> of deployed servers at this time. What is the cable length from drives to
> controller? Eventhough you have several configured servers, I have thousands
> without the problem you are seeing. Yes, I do get an occasional status error
> under heavy loads but they've always been recoverable and the systems
> continue to chug along.

Cables are between 40 and 45 cm (15.5-17 in).

--------
PGP public key: https://tnonline.net/secure/pgp_key.txt


^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: Error messages.
@ 2003-03-06 14:12 berthiaume_wayne
  2003-03-06 14:20 ` Anders Widman
  0 siblings, 1 reply; 46+ messages in thread
From: berthiaume_wayne @ 2003-03-06 14:12 UTC (permalink / raw)
  To: andewid; +Cc: reiserfs-list

	Anders, here is what I have and it works on thousands of duplicate
servers:

Tyan S2420 with 1.0GHz PIII
512MB RAM
Promise PDC20269 in PCI1
Intel Dual 10/100 NIC in PCI2
Four Maxtor 250GB IDE drives off of the Promise controller
lk 2.4.19 on RH7.3

hdparm -a64 -K1 -W1 -u1 -m16 -c1 -d1 /dev/hd<x>
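
For readers unfamiliar with the flags, here is a small Python sketch that
splits such a command line into its options and reports what each one
controls, following the descriptions in the hdparm man page. The lookup
table and the explain() helper are written for this example only, and the
device name in the usage line is a placeholder:

```python
import re

# Short meanings for each flag, paraphrased from the hdparm man page.
HDPARM_FLAGS = {
    "a": "sector count for filesystem read-ahead",
    "K": "keep drive features over a reset",
    "W": "drive write caching",
    "u": "unmask other interrupts during disk I/O",
    "m": "sector count for multiple-sector I/O",
    "c": "32-bit I/O support (1 = on, 3 = on with sync sequence)",
    "d": "using_dma flag",
}


def explain(command):
    """Return a (flag, value, meaning) tuple for each -<letter><number> option."""
    result = []
    for flag, value in re.findall(r"-([A-Za-z])(\d+)", command):
        result.append((flag, int(value), HDPARM_FLAGS.get(flag, "unknown")))
    return result


if __name__ == "__main__":
    for flag, value, meaning in explain(
        "hdparm -a64 -K1 -W1 -u1 -m16 -c1 -d1 /dev/hda"
    ):
        print("-%s%d: %s" % (flag, value, meaning))
```

Running the same helper over two machines' settings makes differences such
as -c1 versus -c3 easy to spot.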

Regards,
Wayne.


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-06 14:12 berthiaume_wayne
@ 2003-03-06 14:20 ` Anders Widman
  0 siblings, 0 replies; 46+ messages in thread
From: Anders Widman @ 2003-03-06 14:20 UTC (permalink / raw)
  To: berthiaume_wayne; +Cc: reiserfs-list

>         Anders, here is what I have and it works on thousands of duplicate
> servers:

> Tyan S2420 with 1.0GHz PIII
> 512MB RAM
> Promise PDC20269 in PCI1

Using PDC20268

> Intel Dual 10/100 NIC in PCI2
> Four Maxtor 250GB IDE drives off of the Promise controller
> lk 2.4.19 on RH7.3

> hdparm -a64 -K1 -W1 -u1 -m16 -c1 -d1 /dev/hd<x>

Hm... the big difference I see is that I normally use -c3.




^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: Error messages.
@ 2003-03-06 14:25 berthiaume_wayne
  0 siblings, 0 replies; 46+ messages in thread
From: berthiaume_wayne @ 2003-03-06 14:25 UTC (permalink / raw)
  To: andewid; +Cc: reiserfs-list

	Cable length is similar to mine. The PDC20268 will only go to UDMA
5. I haven't done any testing with this controller; I needed the PDC20269's
UDMA 6 capability.


^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: Error messages.
@ 2003-03-06 14:32 berthiaume_wayne
  0 siblings, 0 replies; 46+ messages in thread
From: berthiaume_wayne @ 2003-03-06 14:32 UTC (permalink / raw)
  To: andewid; +Cc: reiserfs-list

	Originally, I was running -W0 with fsync(2) being used to ensure data
integrity. I'm presently testing lk 2.4.19 + Namesys patches 1 thru 13 +
Chris Mason's write barrier patch with hdparm -W1 and fsync(2). Under this
configuration I don't see the problem you are encountering, but I am
investigating data corruption on the ReiserFS partitions.
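
For reference, the fsync(2) pattern mentioned here looks roughly like this
from user space. This is only a minimal Python sketch: durable_write() and
the temporary path are invented for the example, and whether the data
actually reaches the platters when the drive's write cache is enabled (-W1)
depends on the kernel and the drive, not on this code:

```python
import os
import tempfile


def durable_write(path, data):
    """Write bytes to path and ask the kernel to push them to the device."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # flush Python's userspace buffer to the kernel
        os.fsync(f.fileno())   # fsync(2): flush kernel buffers to the device


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "example.dat")
    durable_write(target, b"important data\n")
    print(os.path.getsize(target))
```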


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-06  8:34     ` Anders Widman
@ 2003-03-06 17:33       ` Anders Widman
  2003-03-07  5:50         ` Todd Lyons
  0 siblings, 1 reply; 46+ messages in thread
From: Anders Widman @ 2003-03-06 17:33 UTC (permalink / raw)
  To: reiserfs-list

>> Do you have apic enabled or disabled in both the kernel and the BIOS?
>> Do you have acpi enabled or disabled in both the kernel and the BIOS?

> Yes,  right now both are. Will be trying without. If it works it means
> there is a nasty bug in the kernel/or Promise drivers?

I have now tried without ACPI, APIC and APM. It still crashes... :(

I will fiddle more with this over the weekend.



--------
PGP public key: https://tnonline.net/secure/pgp_key.txt


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-05 19:18 Anders Widman
  2003-03-05 20:41 ` Ross Vandegrift
  2003-03-06  6:57 ` Oleg Drokin
@ 2003-03-07  5:47 ` Zygo Blaxell
  2 siblings, 0 replies; 46+ messages in thread
From: Zygo Blaxell @ 2003-03-07  5:47 UTC (permalink / raw)
  To: reiserfs-list

In article <7473069171.20030305201818@tnonline.net>,
Anders Widman  <andewid@tnonline.net> wrote:
>   This  has  come up on this list a number of times, and no one still
>   seem to have found the true answer to the problem.
>
>   kernel: status error: status=0x58 { DriveReady SeekComplete DataRequest }
>
>   Most  seem  to  say  this is a bad block on the harddrive. I am not
>   convinced  though.  Using  Linux on three machines here, and I have
>   seen  this error on all of them, with lots of disks. The error seem
>   to   come  random,  but  does  cause  system  lockups  and  broken
>   filesystems.

It's a timeout during a data request, which could be caused by a bad
block, but might also be caused by poor cabling, overheating, or crap
drive firmware.  If there is a disk that appears to be implicated, the
real culprit could actually be the _other_ disk on the cable,
if there is one.  It's very hard to tell which of these is the case
without more information than this log message--all you know is that
suddenly the drive stops responding to commands, or that you can't
send commands to the drive any more.

You get data corruption because the usual way out of one of these
messages is a drive reset, which will discard any writes that might
have been buffered in the drive's controller but not written on the disk.
Linux might also get confused here, which just makes a bad situation
worse.

I had dozens of these messages every day before I started explicitly
cooling drives _and_ the drive controllers.  For some reason board
manufacturers to this day do not put heat sinks on their ATA100 and
faster chips.  I can only assume that this is because they assume your
machine will spend no more than 20% of its time doing disk I/O, and
design a system that will overheat if it does disk I/O continuously at
full speed for any length of time.

After I started aggressively cooling disks and controllers, I now only
see that message a few weeks before disks fail.  Usually the 'smartctl'
utility (from smartsuite) will also list reallocated sectors in the
output of 'smartctl -v' (i.e. bad sectors that have been remapped).

>   Have  about  20  disks, and have replaced and upgraded them several
>   times  too.  This error has shown on most of them. But when testing
>   them  with tools like IBM DFT, Maxtor Powermax, badblocks or chkdsk
>   in Windows none show up to be with errors on.

Most vendor utilities will never report errors on a drive until the
disk has failed in some fatal way.  It's against their interests to
do otherwise.

>   Sometimes  it  seem  to  help to disable DMA and or lower UDMA mode
>   (all  drives are ATA-100 or ATA-133). But then after a few days, or
>   a  few  minutes  the  kernel  starts spitting out these status=0x58
>   errors.

This happens to make the chips run cooler.

>   After  looking  online  on  different forums it does seem that many
>   people are experiencing them.
>
>   What  exactly does this status=0x58 error mean, and what can one do
>   to solve the problem?

0x58 = 0x40 | 0x10 | 0x08 (i.e. the DriveReady, SeekComplete, and
DataRequest bits).  Usually this is followed by an error message from
the last command that was sent to the drive (e.g. 

	end_request: I/O error, dev 03:42 (hdb), sector 69234536

).
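
To make the bit arithmetic concrete, here is a small Python sketch that
decodes an IDE status byte into the names the kernel prints. The bit names
follow the 2.4 IDE driver's output as seen in the log lines in this thread;
decode_status() itself is just an illustrative helper, not kernel code:

```python
# IDE status register bits, named as the Linux 2.4 IDE driver prints them.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_status(status):
    """Return the names of all bits set in an IDE status byte."""
    return [name for bit, name in STATUS_BITS if status & bit]


if __name__ == "__main__":
    for value in (0x58, 0x51):
        print("status=0x%02x { %s }" % (value, " ".join(decode_status(value))))
```

The same helper applied to 0x51 gives DriveReady, SeekComplete and Error,
which matches the dma_intr lines quoted elsewhere in the thread.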

-- 
Zygo Blaxell (Laptop) <zblaxell@feedme.hungrycats.org>
GPG = D13D 6651 F446 9787 600B AD1E CCF3 6F93 2823 44AD

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages.
  2003-03-06 17:33       ` Anders Widman
@ 2003-03-07  5:50         ` Todd Lyons
  0 siblings, 0 replies; 46+ messages in thread
From: Todd Lyons @ 2003-03-07  5:50 UTC (permalink / raw)
  To: reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Anders Widman wanted us to know:

>Have now tried without ACPI,APIC and APM. Still crashes.... :(
>Will fiddle more with this in the weekend.

Get the absolute latest Cooker kernel.  For what it's worth, I've heard
that RedHat's kernels work well with the PDC chipsets, so you might try
with their latest from their Beta as well.
- -- 
Blue skies...		Todd
| Get a bigger hammer!   |  Are you feeling lucky...punk?         |
| http://www.mrball.net  |  I've had better days...               |
| http://faq.mrball.net  |  It's the end of the world as we know i|
Linux kernel 2.4.19-24mdk   load average: 0.00, 0.04, 0.00
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (GNU/Linux)
Comment: http://www.mrball.net/todd.asc

iD8DBQE+aDMSIBT1264ScBURAjDSAJ4zrTIW67XJPkjL0jn8fwew1HCERgCeJq2R
+pV0MM+C8F1mbI9Hkdafw8A=
=QjxQ
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Error messages
@ 2007-11-12 19:30 Haydn Solomon
       [not found] ` <b75785ba0711121130v2d222800k201506c3802ecde8-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 46+ messages in thread
From: Haydn Solomon @ 2007-11-12 19:30 UTC (permalink / raw)
  To: kvm-devel


[-- Attachment #1.1: Type: text/plain, Size: 6137 bytes --]

I'm not sure what happened, but now I'm getting messages like the following
when running my Windows XP guest (ACPI HAL). This is the first time I'm seeing
this type of error. The only thing I did today was upgrade to kvm release 52.
The only thing I did prior to seeing this error was upgrade a Vista 32 guest
machine. Sorry about the long output.

Message from syslogd@ at Mon Nov 12 14:23:46 2007 ...
localhost kernel: Bad page state in process 'qemu-system-x86'
Message from syslogd@ at Mon Nov 12 14:23:46 2007 ...
localhost kernel: page:ffff8100015c6930 flags:0x0018080000000014
mapping:0000000000000000 mapcount:1 count:0 (Tainted: P       )
Message from syslogd@ at Mon Nov 12 14:23:46 2007 ...
localhost kernel: Trying to fix it up, but a reboot is needed
Message from syslogd@ at Mon Nov 12 14:23:46 2007 ...
localhost kernel: Backtrace:
Message from syslogd@ at Mon Nov 12 14:23:46 2007 ...
localhost kernel: Bad page state in process 'qemu-system-x86'
Message from syslogd@ at Mon Nov 12 14:23:46 2007 ...
localhost kernel: page:ffff8100015c5430 flags:0x0018080000000014
mapping:0000000000000000 mapcount:1 count:0 (Tainted: P    B  )
Message from syslogd@ at Mon Nov 12 14:23:46 2007 ...
localhost kernel: Trying to fix it up, but a reboot is needed
Message from syslogd@ at Mon Nov 12 14:23:46 2007 ...
localhost kernel: Backtrace:
Message from syslogd@ at Mon Nov 12 14:23:46 2007 ...
localhost kernel: Bad page state in process 'qemu-system-x86'
Message from syslogd@ at Mon Nov 12 14:23:46 2007 ...
localhost kernel: page:ffff8100015c5dd0 flags:0x0018080000000014
mapping:0000000000000000 mapcount:1 count:0 (Tainted: P    B  )
Message from syslogd@ at Mon Nov 12 14:23:46 2007 ...
localhost kernel: Trying to fix it up, but a reboot is needed
Message from syslogd@ at Mon Nov 12 14:23:46 2007 ...
localhost kernel: Backtrace:
Message from syslogd@ at Mon Nov 12 14:23:46 2007 ...
localhost kernel: Bad page state in process 'qemu-system-x86'
Message from syslogd@ at Mon Nov 12 14:23:46 2007 ...
localhost kernel: page:ffff8100015c5698 flags:0x0018080000000014
mapping:0000000000000000 mapcount:1 count:0 (Tainted: P    B  )
Message from syslogd@ at Mon Nov 12 14:23:46 2007 ...
localhost kernel: Trying to fix it up, but a reboot is needed
Message from syslogd@ at Mon Nov 12 14:23:46 2007 ...
localhost kernel: Backtrace:
Message from syslogd@ at Mon Nov 12 14:23:46 2007 ...
localhost kernel: Bad page state in process 'qemu-system-x86'
Message from syslogd@ at Mon Nov 12 14:23:46 2007 ...
localhost kernel: page:ffff8100015c5350 flags:0x0018080000000014
mapping:0000000000000000 mapcount:1 count:0 (Tainted: P    B  )
Message from syslogd@ at Mon Nov 12 14:23:46 2007 ...
localhost kernel: Trying to fix it up, but a reboot is needed
Message from syslogd@ at Mon Nov 12 14:23:46 2007 ...
localhost kernel: Backtrace:
unhandled vm exit: 0x9 vcpu_id 0
rax 0000000000000020 rbx 0000000080542ffc rcx 00000000000020ac rdx 000000000000018a
rsi 0000000080042000 rdi 00000000ffdff000 rsp 0000000000000990 rbp 00000000f8acd718
r8  0000000000000000 r9  0000000000000000 r10 0000000000000000 r11 0000000000000000
r12 0000000000000000 r13 0000000000000000 r14 0000000000000000 r15 0000000000000000
rip 0000000000000192 rflags 00033016
cs 2000 (00020000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
ds 0000 (00000000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
es 0000 (00000000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
ss 2000 (00020000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
fs 0000 (00000000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
gs 0000 (00000000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
tr 0028 (80042000/000020ab p 1 dpl 0 db 0 s 0 type b l 0 g 0 avl 0)
ldt 0000 (00000000/ffffffff p 0 dpl 0 db 0 s 0 type 0 l 0 g 0 avl 0)
gdt 8003f000/3ff
idt 8003f400/7ff
cr0 e001003b cr2 f8acd4e0 cr3 2c00020 cr4 6f8 cr8 f efer 800

Message from syslogd@ at Mon Nov 12 14:23:47 2007 ...
localhost kernel: Bad page state in process 'qemu-system-x86'
Message from syslogd@ at Mon Nov 12 14:23:47 2007 ...
localhost kernel: page:ffff8100015c6230 flags:0x0018080000000014
mapping:0000000000000000 mapcount:1 count:0 (Tainted: P    B  )
Message from syslogd@ at Mon Nov 12 14:23:47 2007 ...
localhost kernel: Trying to fix it up, but a reboot is needed
Message from syslogd@ at Mon Nov 12 14:23:47 2007 ...
localhost kernel: Backtrace:
Message from syslogd@ at Mon Nov 12 14:23:47 2007 ...
localhost kernel: Bad page state in process 'qemu-system-x86'
Message from syslogd@ at Mon Nov 12 14:23:47 2007 ...
localhost kernel: page:ffff8100015c68c0 flags:0x0018080000000014
mapping:0000000000000000 mapcount:1 count:0 (Tainted: P    B  )
Message from syslogd@ at Mon Nov 12 14:23:47 2007 ...
localhost kernel: Trying to fix it up, but a reboot is needed
Message from syslogd@ at Mon Nov 12 14:23:47 2007 ...
localhost kernel: Backtrace:
Message from syslogd@ at Mon Nov 12 14:23:47 2007 ...
localhost kernel: Bad page state in process 'qemu-system-x86'
Message from syslogd@ at Mon Nov 12 14:23:47 2007 ...
localhost kernel: page:ffff8100015c5200 flags:0x0018080000000014
mapping:0000000000000000 mapcount:1 count:0 (Tainted: P    B  )
Message from syslogd@ at Mon Nov 12 14:23:47 2007 ...
localhost kernel: Trying to fix it up, but a reboot is needed
Message from syslogd@ at Mon Nov 12 14:23:47 2007 ...
localhost kernel: Backtrace:
Message from syslogd@ at Mon Nov 12 14:23:47 2007 ...
localhost kernel: Eeek! page_mapcount(page) went negative! (-1)
Message from syslogd@ at Mon Nov 12 14:23:47 2007 ...
localhost kernel:   page pfn = 1a60a
Message from syslogd@ at Mon Nov 12 14:23:47 2007 ...
localhost kernel:   page->flags = 18080000080014
Message from syslogd@ at Mon Nov 12 14:23:47 2007 ...
localhost kernel:   page->count = 0
Message from syslogd@ at Mon Nov 12 14:23:47 2007 ...
localhost kernel:   page->mapping = 0000000000000000
Message from syslogd@ at Mon Nov 12 14:23:47 2007 ...
localhost kernel:   vma->vm_ops = 0x0
Message from syslogd@ at Mon Nov 12 14:23:47 2007 ...
localhost kernel: ------------[ cut here ]------------
Message from syslogd@ at Mon Nov 12 14:23:47 2007 ...
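
The rflags value in the register dump above (00033016) can be decoded by hand or with a few lines of Python. The following is only a minimal sketch using the architectural x86 flag bit positions; the name decode_rflags is made up for illustration and is not part of kvm or qemu. The VM bit comes out set, which would mean the guest was in virtual-8086 mode at the time of the exit. If the host is an Intel VT-x machine (an assumption, the thread does not say), basic exit reason 0x9 corresponds to a task switch.

```python
# Architectural x86 (R/E)FLAGS bit positions; IOPL (bits 12-13) is handled separately.
FLAG_BITS = {
    0: "CF", 2: "PF", 4: "AF", 6: "ZF", 7: "SF", 8: "TF", 9: "IF",
    10: "DF", 11: "OF", 14: "NT", 16: "RF", 17: "VM", 18: "AC",
    19: "VIF", 20: "VIP", 21: "ID",
}


def decode_rflags(value):
    """Return (list of set flag names in bit order, IOPL) for an rflags value."""
    names = [name for bit, name in sorted(FLAG_BITS.items()) if (value >> bit) & 1]
    iopl = (value >> 12) & 0x3
    return names, iopl


if __name__ == "__main__":
    names, iopl = decode_rflags(0x00033016)  # value taken from the dump above
    print(names, iopl)  # ['PF', 'AF', 'RF', 'VM'] 3
```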

[-- Attachment #1.2: Type: text/html, Size: 6774 bytes --]

[-- Attachment #2: Type: text/plain, Size: 314 bytes --]

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/

[-- Attachment #3: Type: text/plain, Size: 186 bytes --]

_______________________________________________
kvm-devel mailing list
kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
https://lists.sourceforge.net/lists/listinfo/kvm-devel

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages
       [not found] ` <b75785ba0711121130v2d222800k201506c3802ecde8-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2007-11-12 20:17   ` Haydn Solomon
  2007-11-12 20:42   ` Izik Eidus
  1 sibling, 0 replies; 46+ messages in thread
From: Haydn Solomon @ 2007-11-12 20:17 UTC (permalink / raw)
  To: kvm-devel


[-- Attachment #1.1: Type: text/plain, Size: 6531 bytes --]

Reverting to release 51 seems to have solved my problems.



On Nov 12, 2007 2:30 PM, Haydn Solomon <haydn.solomon-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:

> Not sure what happened, but I'm now getting messages like the following
> when running my Windows XP guest (ACPI HAL). This is the first time I've seen
> this type of error. The only thing I did today was upgrade to kvm release
> 52, and the only thing I did just before seeing this error was upgrade a Vista 32
> guest machine. Sorry about the long output.
>

[-- Attachment #1.2: Type: text/html, Size: 7214 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Error messages
       [not found] ` <b75785ba0711121130v2d222800k201506c3802ecde8-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2007-11-12 20:17   ` Haydn Solomon
@ 2007-11-12 20:42   ` Izik Eidus
       [not found]     ` <b75785ba0711121246s2e3fb110ud339182b267f39c9@mail.gmail.com>
  1 sibling, 1 reply; 46+ messages in thread
From: Izik Eidus @ 2007-11-12 20:42 UTC (permalink / raw)
  To: Haydn Solomon; +Cc: kvm-devel

Haydn Solomon wrote:
> Not sure what happened, but I'm now getting messages like the following
> when running my Windows XP guest (ACPI HAL). This is the first time I've
> seen this type of error. The only thing I did today was upgrade to
> kvm release 52, and the only thing I did just before seeing this error was
> upgrade a Vista 32 guest machine. Sorry about the long output.
>
Is it repeatable?
If yes, I will give you a patch to check something.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Fwd:  Error messages
       [not found]       ` <b75785ba0711121246s2e3fb110ud339182b267f39c9-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2007-11-12 20:46         ` Haydn Solomon
       [not found]           ` <b75785ba0711121246r2eac79ect88f7a603ad6ea859-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 46+ messages in thread
From: Haydn Solomon @ 2007-11-12 20:46 UTC (permalink / raw)
  To: kvm-devel


[-- Attachment #1.1: Type: text/plain, Size: 1014 bytes --]

---------- Forwarded message ----------
From: Haydn Solomon <haydn.solomon-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date: Nov 12, 2007 3:46 PM
Subject: Re: [kvm-devel] Error messages
To: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>


Yeah, it was repeating, and sometimes it would hang the host. This continued
to happen even after rebooting the host a couple of times. I'm willing to
test your patch, as I currently have to run release 51.
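
To put a number on how often the error repeats between runs, the captured console output can be tallied. The following is only a rough sketch in Python, assuming the output has been saved as text in the same form as the messages pasted earlier in this thread; the function name summarize and the two regular expressions are made up for illustration.

```python
import re
from collections import Counter

# Patterns written against the log lines quoted in this thread.
BAD_PAGE = re.compile(r"Bad page state in process '([^']+)'")
PAGE_ADDR = re.compile(r"\bpage:([0-9a-f]+) flags:")


def summarize(log_text):
    """Count 'Bad page state' reports per process and list distinct page addresses."""
    processes = Counter(BAD_PAGE.findall(log_text))
    pages = sorted(set(PAGE_ADDR.findall(log_text)))
    return processes, pages


if __name__ == "__main__":
    sample = (
        "localhost kernel: Bad page state in process 'qemu-system-x86'\n"
        "localhost kernel: page:ffff8100015c5dd0 flags:0x0018080000000014\n"
        "localhost kernel: Bad page state in process 'qemu-system-x86'\n"
        "localhost kernel: page:ffff8100015c5698 flags:0x0018080000000014\n"
    )
    processes, pages = summarize(sample)
    print(processes["qemu-system-x86"], pages)
```

Comparing the counts from a run on release 52 with a run on release 51 gives a simple check of whether a change actually made the messages go away.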


On Nov 12, 2007 3:42 PM, Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org> wrote:

> Haydn Solomon wrote:
> > Not sure what happened, but I'm now getting messages like the following
> > when running my Windows XP guest (ACPI HAL). This is the first time I've
> > seen this type of error. The only thing I did today was upgrade to
> > kvm release 52, and the only thing I did just before seeing this error was
> > upgrade a Vista 32 guest machine. Sorry about the long output.
> >
> Is it repeatable?
> If yes, I will give you a patch to check something.
>

[-- Attachment #1.2: Type: text/html, Size: 1643 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Fwd:  Error messages
       [not found]           ` <b75785ba0711121246r2eac79ect88f7a603ad6ea859-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2007-11-12 20:52             ` Izik Eidus
       [not found]               ` <4738BD18.4000901-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 46+ messages in thread
From: Izik Eidus @ 2007-11-12 20:52 UTC (permalink / raw)
  To: Haydn Solomon; +Cc: kvm-devel

Before trying the patch, can you test the kvm-52 userspace with the kvm
module from kvm-51 and report whether you still see this issue?
>
>
> ---------- Forwarded message ----------
> From: Haydn Solomon <haydn.solomon-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Date: Nov 12, 2007 3:46 PM
> Subject: Re: [kvm-devel] Error messages
> To: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
>
>
> Yeah, it was repeating, and sometimes it would hang the host. This
> continued to happen even after rebooting the host a couple of times.
> I'm willing to test your patch, as I currently have to run release 51.
>
>
> On Nov 12, 2007 3:42 PM, Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org> wrote:
>
>     Haydn Solomon wrote:
>     > Not sure what happened, but I'm now getting messages like the following
>     > when running my Windows XP guest (ACPI HAL). This is the first time I've
>     > seen this type of error. The only thing I did today was upgrade to
>     > kvm release 52, and the only thing I did just before seeing this error was
>     > upgrade a Vista 32 guest machine. Sorry about the long output.
>     >
>     Is it repeatable?
>     If yes, I will give you a patch to check something.
>


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: Fwd: Error messages
       [not found]               ` <4738BD18.4000901-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-11-12 21:07                 ` Haydn Solomon
  0 siblings, 0 replies; 46+ messages in thread
From: Haydn Solomon @ 2007-11-12 21:07 UTC (permalink / raw)
  To: Izik Eidus; +Cc: kvm-devel


[-- Attachment #1.1: Type: text/plain, Size: 2566 bytes --]

On Nov 12, 2007 3:52 PM, Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org> wrote:

> Before trying the patch, can you test the kvm-52 userspace with the kvm
> module from kvm-51 and report whether you still see this issue?


OK, I ran userspace 52 against module 51, and so far I haven't been able to
reproduce the problem. I was able to boot two 32-bit Windows XP guests, one
with the standard HAL and one with the ACPI HAL. Both of these were giving me
the messages from my previous email, but with the 52 userspace and the 51
module they are fine.
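
The reasoning behind this test can be written down as a small cross-check: a component version is suspect if it is present in every failing combination and in no passing one. The following Python sketch is illustrative only; the function name suspects and the results table are assumptions built from the outcomes reported in this thread (errors on full release 52, none on full release 51, none so far with userspace 52 on module 51).

```python
def suspects(results):
    """results maps (userspace_version, module_version) -> True if the errors appeared.

    A (component, version) pair is suspect when it is part of every failing
    combination and of no passing combination.
    """
    failing = [combo for combo, bad in results.items() if bad]
    passing = [combo for combo, bad in results.items() if not bad]
    found = []
    for idx, component in enumerate(("userspace", "module")):
        for version in sorted({combo[idx] for combo in failing}):
            in_all_failing = all(combo[idx] == version for combo in failing)
            in_any_passing = any(combo[idx] == version for combo in passing)
            if in_all_failing and not in_any_passing:
                found.append((component, version))
    return found


if __name__ == "__main__":
    # Outcomes as reported in this thread (the mixed run is "so far").
    results = {
        ("52", "52"): True,   # errors after upgrading to release 52
        ("51", "51"): False,  # reverting to release 51 made them go away
        ("52", "51"): False,  # userspace 52 with module 51: not reproduced
    }
    print(suspects(results))  # [('module', '52')]
```

Under these assumptions the check points at the release 52 kernel module rather than the userspace, which matches the purpose of the mixed userspace/module test.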



>
> >
> >
> > ---------- Forwarded message ----------
> > From: Haydn Solomon <haydn.solomon-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> > Date: Nov 12, 2007 3:46 PM
> > Subject: Re: [kvm-devel] Error messages
> > To: Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
> >
> >
> > Yeah, it was repeating, and sometimes it would hang the host. This
> > continued to happen even after rebooting the host a couple of times.
> > I'm willing to test your patch, as I currently have to run release 51.
> >
> >
> > On Nov 12, 2007 3:42 PM, Izik Eidus <izike-atKUWr5tajBWk0Htik3J/w@public.gmane.org> wrote:
> >
> >     Haydn Solomon wrote:
> >     > Not sure what happened, but I'm now getting messages like the following
> >     > when running my Windows XP guest (ACPI HAL). This is the first time I've
> >     > seen this type of error. The only thing I did today was upgrade to
> >     > kvm release 52, and the only thing I did just before seeing this error was
> >     > upgrade a Vista 32 guest machine. Sorry about the long output.
> >     >
> >     Is it repeatable?
> >     If yes, I will give you a patch to check something.
> >
>

[-- Attachment #1.2: Type: text/html, Size: 4137 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2007-11-12 21:07 UTC | newest]

Thread overview: 46+ messages
2007-11-12 19:30 Error messages Haydn Solomon
     [not found] ` <b75785ba0711121130v2d222800k201506c3802ecde8-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2007-11-12 20:17   ` Haydn Solomon
2007-11-12 20:42   ` Izik Eidus
     [not found]     ` <b75785ba0711121246s2e3fb110ud339182b267f39c9@mail.gmail.com>
     [not found]       ` <b75785ba0711121246s2e3fb110ud339182b267f39c9-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2007-11-12 20:46         ` Fwd: " Haydn Solomon
     [not found]           ` <b75785ba0711121246r2eac79ect88f7a603ad6ea859-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2007-11-12 20:52             ` Izik Eidus
     [not found]               ` <4738BD18.4000901-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-11-12 21:07                 ` Haydn Solomon
  -- strict thread matches above, loose matches on Subject: below --
2005-11-03 18:32 Error Messages Wilfred Holloway
2003-03-06 14:32 Error messages berthiaume_wayne
2003-03-06 14:25 berthiaume_wayne
2003-03-06 14:12 berthiaume_wayne
2003-03-06 14:20 ` Anders Widman
2003-03-06 13:27 berthiaume_wayne
2003-03-06 13:52 ` Anders Widman
2003-03-05 23:09 berthiaume_wayne
2003-03-05 21:36 berthiaume_wayne
2003-03-05 22:50 ` Anders Widman
2003-03-05 22:53   ` Anders Widman
2003-03-05 21:31 berthiaume_wayne
2003-03-05 21:07 berthiaume_wayne
2003-03-05 21:21 ` Anders Widman
2003-03-06  6:58   ` Todd Lyons
2003-03-06  8:34     ` Anders Widman
2003-03-06 17:33       ` Anders Widman
2003-03-07  5:50         ` Todd Lyons
2003-03-05 21:01 berthiaume_wayne
2003-03-05 21:18 ` Anders Widman
2003-03-05 19:18 Anders Widman
2003-03-05 20:41 ` Ross Vandegrift
2003-03-05 20:51   ` Anders Widman
2003-03-05 21:01     ` Anders Widman
2003-03-05 21:14     ` Ross Vandegrift
2003-03-05 23:02     ` Soeren Sonnenburg
2003-03-06  8:46       ` Anders Widman
2003-03-06  6:57 ` Oleg Drokin
2003-03-06  7:07   ` Voicu Liviu
2003-03-06  7:19     ` Oleg Drokin
2003-03-06  7:19       ` Voicu Liviu
2003-03-06  8:37         ` Oleg Drokin
2003-03-06  8:32   ` Anders Widman
2003-03-06  8:40     ` Oleg Drokin
2003-03-06  8:43       ` Anders Widman
2003-03-06  8:48         ` Oleg Drokin
2003-03-06 12:16     ` Hans Reiser
2003-03-06 12:23       ` Anders Widman
2003-03-06 12:23         ` Dieter Nützel
2003-03-07  5:47 ` Zygo Blaxell
