public inbox for linux-kernel@vger.kernel.org
From: Rob Landley <rob@landley.net>
To: Coen Rosdorff <coen@rosdorff.dyndns.org>,
	Hugh Dickins <hugh@veritas.com>
Cc: linux-kernel@vger.kernel.org
Subject: Re: VM: killing process amavis
Date: Sun, 17 Aug 2003 05:48:16 -0400
Message-ID: <200308170548.16094.rob@landley.net>
In-Reply-To: <Pine.LNX.4.44.0308132119140.4138-100000@rosdorff.dyndns.org>

On Wednesday 13 August 2003 15:40, Coen Rosdorff wrote:
> On Wed, 13 Aug 2003, Hugh Dickins wrote:
> > It really would be worth giving memtest86 a good long run.
> >
> > 02000000 looks very much like a single-bit memory error,
> > and swap_free is exactly where such errors often show up.
>
> I had the same problem before on the previous server. Running memtest for
> 19 days didn't show any memory problems.
>
> After replacing the motherboard, CPU and RAM, I now have the same problem.

I had a system once that looked very much like it had bad RAM, but it turned
out to have a bad hard drive controller.  The corruption showed up when paging
stuff into memory from disk (a la exec, sometimes) and when bringing stuff back
in from swap.  (The kernel almost never went bye-bye, because it never gets
swapped out, you see...)

Caused the weirdest problems in Myth II, among other things...

> So the problem moved from 00000100 to 02000000
>
> The network cards and the 3ware RAID controller moved from the old to the
> new box. Could one of them be the problem?
>
> I am running out of options.

Check the RAID controller, especially if you're swapping through it.  I found
out what was wrong with the other system by copying big tarballs through the
network and verifying them.

Try this:

1) Stream a tarball to the remote system and confirm that it comes out OK when
it only has to cross the network (nothing is written to the remote disk).

  cat enormous.tgz | ssh othersystem "tar tvzf -"

2) Now copy the tarball to the remote machine's disk, and test that the copy 
on disk is good.

  cat enormous.tgz | ssh othersystem "cat > temp.tgz; tar tvzf temp.tgz"

Of course, use a tarball that's bigger than your RAM, so the remote machine
actually has to write it out to disk and read it back in again rather than
serving it from cache.  Using ssh provides a little bit of a CPU load, and the
network provides a competing source of interrupts.  (You could also run contest
in the background or some such to really beat the system to death...)
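
The disk round trip in step 2 can also be checked by comparing checksums
instead of listing the archive, which works for any large file and makes
repeated passes easy.  What follows is only a minimal sketch, assuming
sha256sum from GNU coreutils is installed; the function name roundtrip_check,
the scratch directory argument and the pass count are hypothetical and not part
of the original advice.  It runs on one machine; the same comparison could be
made across ssh by running sha256sum on each side.

```shell
#!/bin/sh
# Hypothetical helper (not from the original thread): copy a file to a
# scratch directory several times and compare checksums after each copy.
# Usage: roundtrip_check SOURCE_FILE SCRATCH_DIR [PASSES]
# Returns 0 if every pass matches, 1 on a mismatch, 2 on a usage/IO error.
roundtrip_check() {
    src=$1
    dir=$2
    passes=${3:-3}

    [ -r "$src" ] && [ -d "$dir" ] || return 2

    # Checksum of the source as read right now.
    want=$(sha256sum < "$src" | cut -d' ' -f1)

    i=1
    while [ "$i" -le "$passes" ]; do
        copy="$dir/roundtrip.$i"
        cp "$src" "$copy" || return 2
        sync    # ask the kernel to flush dirty pages to disk

        got=$(sha256sum < "$copy" | cut -d' ' -f1)
        if [ "$got" != "$want" ]; then
            echo "pass $i: MISMATCH"
            return 1    # keep the bad copy around for inspection
        fi
        echo "pass $i: ok"
        rm -f "$copy"
        i=$((i + 1))
    done
    return 0
}
```

For example, `roundtrip_check enormous.tgz /mnt/scratch 10` (the path is an
assumption) would make ten passes.  As with the tarball test, the file needs to
be bigger than RAM, otherwise the read-back may come from the page cache and
never touch the disk.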

Rob




Thread overview: 4+ messages
2003-08-13 15:23 VM: killing process amavis Coen Rosdorff
2003-08-13 15:40 ` Hugh Dickins
2003-08-13 19:40   ` Coen Rosdorff
2003-08-17  9:48     ` Rob Landley [this message]
