* VM: killing process amavis
@ 2003-08-13 15:23 Coen Rosdorff
2003-08-13 15:40 ` Hugh Dickins
0 siblings, 1 reply; 4+ messages in thread
From: Coen Rosdorff @ 2003-08-13 15:23 UTC (permalink / raw)
To: linux-kernel
Who can tell me something about this error in /var/log/messages:
Aug 13 10:12:51 rosdorff kernel: VM: killing process amavis
Aug 13 10:12:51 rosdorff kernel: swap_free: Unused swap offset entry 02000000
Memtest86: No errors.
Kernel: 2.4.21
Mem: 256MB
CPU: Intel PII 300Mhz
# cat /proc/swaps
Filename Type Size Used Priority
/dev/sda2 partition 530136 44256 -1
# cat /proc/meminfo
total: used: free: shared: buffers: cached:
Mem: 263229440 194764800 68464640 0 55820288 91078656
Swap: 542859264 45318144 497541120
MemTotal: 257060 kB
MemFree: 66860 kB
MemShared: 0 kB
Buffers: 54512 kB
Cached: 58248 kB
SwapCached: 30696 kB
Active: 90332 kB
Inactive: 74088 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 257060 kB
LowFree: 66860 kB
SwapTotal: 530136 kB
SwapFree: 485880 kB
^ permalink raw reply [flat|nested] 4+ messages in thread

* Re: VM: killing process amavis
2003-08-13 15:23 VM: killing process amavis Coen Rosdorff
@ 2003-08-13 15:40 ` Hugh Dickins
2003-08-13 19:40 ` Coen Rosdorff
0 siblings, 1 reply; 4+ messages in thread
From: Hugh Dickins @ 2003-08-13 15:40 UTC (permalink / raw)
To: Coen Rosdorff; +Cc: linux-kernel

On Wed, 13 Aug 2003, Coen Rosdorff wrote:
> Who can tell me something about this error in /var/log/messages:
>
> Aug 13 10:12:51 rosdorff kernel: VM: killing process amavis
> Aug 13 10:12:51 rosdorff kernel: swap_free: Unused swap offset entry 02000000
>
> Memtest86: No errors.

It really would be worth giving memtest86 a good long run.

02000000 looks very much like a single-bit memory error,
and swap_free is exactly where such errors often show up.

Hugh

^ permalink raw reply [flat|nested] 4+ messages in thread

* Re: VM: killing process amavis
2003-08-13 15:40 ` Hugh Dickins
@ 2003-08-13 19:40 ` Coen Rosdorff
2003-08-17 9:48 ` Rob Landley
0 siblings, 1 reply; 4+ messages in thread
From: Coen Rosdorff @ 2003-08-13 19:40 UTC (permalink / raw)
To: Hugh Dickins; +Cc: linux-kernel

On Wed, 13 Aug 2003, Hugh Dickins wrote:
> It really would be worth giving memtest86 a good long run.
>
> 02000000 looks very much like a single-bit memory error,
> and swap_free is exactly where such errors often show up.

I had the same problem before on the previous server. Running memtest for
19 days didn't show any memory problems.

After replacing the motherboard, CPU and RAM, I now have the same problem.

Previous motherboard:
swap_free: Unused swap offset entry 00000100

Apr 26 09:40:05 rosdorff kernel: kernel BUG at dcache.c:345!
Apr 26 09:40:05 rosdorff kernel: invalid operand: 0000
Apr 26 09:40:05 rosdorff kernel: CPU: 0
Apr 26 09:40:05 rosdorff kernel: EIP: 0010:[<c0141764>] Not tainted
Apr 26 09:40:05 rosdorff kernel: EFLAGS: 00010206
Apr 26 09:40:05 rosdorff kernel: eax: 00000100 ebx: c17a8958 ecx: c1127f84 edx: c17a89d8
Apr 26 09:40:05 rosdorff kernel: esi: c17a8940 edi: 0000064d ebp: 00000113 esp: c1147f20
Apr 26 09:40:05 rosdorff kernel: ds: 0018 es: 0018 ss: 0018
Apr 26 09:40:05 rosdorff kernel: Process kswapd (pid: 4, stackpage=c1147000)

May 11 14:40:05 rosdorff kernel: kernel BUG at dcache.c:345!
May 11 14:40:05 rosdorff kernel: invalid operand: 0000
May 11 14:40:05 rosdorff kernel: CPU: 0
May 11 14:40:05 rosdorff kernel: EIP: 0010:[<c0141b84>] Not tainted
May 11 14:40:05 rosdorff kernel: EFLAGS: 00010206
May 11 14:40:05 rosdorff kernel: eax: 00000100 ebx: c17a8958 ecx: c1127f84 edx: c4d2a6d8
May 11 14:40:05 rosdorff kernel: esi: c17a8940 edi: 000011aa ebp: 0000021b esp: c114bf20
May 11 14:40:05 rosdorff kernel: ds: 0018 es: 0018 ss: 0018
May 11 14:40:05 rosdorff kernel: Process kswapd (pid: 4, stackpage=c114b000)

Jun 18 05:00:06 rosdorff kernel: kernel BUG at dcache.c:345!
Jun 18 05:00:06 rosdorff kernel: invalid operand: 0000
Jun 18 05:00:06 rosdorff kernel: CPU: 0
Jun 18 05:00:06 rosdorff kernel: EIP: 0010:[<c0141264>] Not tainted
Jun 18 05:00:06 rosdorff kernel: EFLAGS: 00010206
Jun 18 05:00:06 rosdorff kernel: eax: 00000100 ebx: c17a8958 ecx: c110ff84 edx: c17a89d8
Jun 18 05:00:06 rosdorff kernel: esi: c17a8940 edi: 000019c1 ebp: 00000393 esp: c1163f20
Jun 18 05:00:06 rosdorff kernel: ds: 0018 es: 0018 ss: 0018
Jun 18 05:00:06 rosdorff kernel: Process kswapd (pid: 4, stackpage=c1163000)

Current motherboard:
Jul 8 08:31:53 rosdorff kernel: memory.c:100: bad pmd 02000000

Jul 15 04:05:16 rosdorff kernel: Unable to handle kernel paging request at virtual address 02000000
Jul 15 04:05:16 rosdorff kernel: printing eip:
Jul 15 04:05:16 rosdorff kernel: c0131614
Jul 15 04:05:16 rosdorff kernel: *pde = 00000000
Jul 15 04:05:16 rosdorff kernel: Oops: 0002
Jul 15 04:05:16 rosdorff kernel: CPU: 0
Jul 15 04:05:16 rosdorff kernel: EIP: 0010:[<c0131614>] Not tainted
Jul 15 04:05:16 rosdorff kernel: EFLAGS: 00010256
Jul 15 04:05:16 rosdorff kernel: eax: 00000000 ebx: c36cf3e0 ecx: c36cf3e0 edx: 02000000
Jul 15 04:05:16 rosdorff kernel: esi: c36cf3e0 edi: c36cf3e0 ebp: c11b5970 esp: c136df00
Jul 15 04:05:16 rosdorff kernel: ds: 0018 es: 0018 ss: 0018
Jul 15 04:05:16 rosdorff kernel: Process kswapd (pid: 4, stackpage=c136d000)

Aug 13 10:12:51 rosdorff kernel: VM: killing process amavis
Aug 13 10:12:51 rosdorff kernel: swap_free: Unused swap offset entry 02000000

So the problem moved from 00000100 to 02000000.

The network cards and the 3ware RAID controller moved from the old to the
new box. Could one of them be the problem?

I am running out of options.

TIA,

Coen Rosdorff

^ permalink raw reply [flat|nested] 4+ messages in thread
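
Hugh's single-bit observation can be checked against both values reported in the logs above. The following is only an illustrative sketch: the helper names `is_single_bit` and `set_bits` are made up for this example, and a single set bit is a hint of corruption somewhere on the data path, not proof of where it happened.

```python
def is_single_bit(value):
    """True if exactly one bit is set (value is a power of two)."""
    return value > 0 and (value & (value - 1)) == 0


def set_bits(value):
    """Return the positions of all set bits, lowest first."""
    return [i for i in range(value.bit_length()) if (value >> i) & 1]


# Hex values copied from the kernel messages in this thread.
reported = [
    ("previous motherboard", "00000100"),
    ("current motherboard", "02000000"),
]

for label, text in reported:
    value = int(text, 16)
    print(label, text, "single bit:", is_single_bit(value), "bits:", set_bits(value))
```

Both values have exactly one bit set (bit 8 on the previous board, bit 25 on the current one), which is the pattern a flipped bit in an otherwise zero word would leave.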

* Re: VM: killing process amavis
2003-08-13 19:40 ` Coen Rosdorff
@ 2003-08-17 9:48 ` Rob Landley
0 siblings, 0 replies; 4+ messages in thread
From: Rob Landley @ 2003-08-17 9:48 UTC (permalink / raw)
To: Coen Rosdorff, Hugh Dickins; +Cc: linux-kernel

On Wednesday 13 August 2003 15:40, Coen Rosdorff wrote:
> On Wed, 13 Aug 2003, Hugh Dickins wrote:
> > It really would be worth giving memtest86 a good long run.
> >
> > 02000000 looks very much like a single-bit memory error,
> > and swap_free is exactly where such errors often show up.
>
> I had the same problem before on the previous server. Running memtest for
> 19 days didn't show any memory problems.
>
> After replacing the motherboard, CPU and RAM, I now have the same problem.

I had a system once that looked very much like it had bad RAM, but it turned
out to have a bad hard drive controller. The corruption showed up when paging
stuff into memory from disk (a la exec, sometimes), and when bringing stuff
back in from swap. (The kernel almost never went bye-bye, because it never
swapped out, you see...)

Caused the weirdest problems in Myth II, among other things...

> So the problem moved from 00000100 to 02000000.
>
> The network cards and the 3ware RAID controller moved from the old to the
> new box. Could one of them be the problem?
>
> I am running out of options.

Check the RAID controller. Especially if you're swapping through the RAID
controller.

I found out what was wrong with the other system by copying big tarballs
through the network and verifying them. Try this:

1) Copy a tarball to the remote system and confirm that it came out OK just
coming across the network.

cat enormous.tgz | ssh othersystem "tar tvz"

2) Now copy the tarball to the remote machine's disk, and test that the copy
on disk is good.

cat enormous.tgz | ssh othersystem "cat > temp.tgz; tar tvzf temp.tgz"

Of course, use a tarball that's bigger than your RAM, so it actually does
have to write it out to disk and read it back in again.

Using ssh provides a little bit of a CPU load, and of course the network is
providing a competing source of interrupts. (You could also run contest in
the background or some such to really beat the system to death...)

Rob

^ permalink raw reply [flat|nested] 4+ messages in thread
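
The disk half of Rob's test can also be approximated locally, without a second machine. What follows is a rough sketch, not Rob's procedure: it swaps the tarball for random data and `tar tv` for a SHA-256 comparison, and the function names, the file name `roundtrip.bin` and the demo size are all assumptions made for illustration. As Rob notes, the file has to be larger than RAM for the read-back to come from the disk rather than the page cache.

```python
import hashlib
import os

CHUNK = 1 << 20  # read and write in 1 MiB blocks


def write_random_file(path, size_bytes):
    """Write size_bytes of random data to path; return the SHA-256 of what was sent."""
    digest = hashlib.sha256()
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            block = os.urandom(min(CHUNK, remaining))
            digest.update(block)
            f.write(block)
            remaining -= len(block)
        f.flush()
        os.fsync(f.fileno())  # push the data out to the device
    return digest.hexdigest()


def hash_file(path):
    """Read path back and return the SHA-256 of its contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(CHUNK), b""):
            digest.update(block)
    return digest.hexdigest()


def roundtrip_ok(path, size_bytes):
    """True if the data read back from disk matches the data written."""
    return write_random_file(path, size_bytes) == hash_file(path)


if __name__ == "__main__":
    # Demo size only; a real test needs a size larger than installed RAM.
    target = "roundtrip.bin"  # hypothetical path on the disk under test
    print("round trip OK:", roundtrip_ok(target, 8 * CHUNK))
    os.remove(target)
```

A mismatch here would point at the storage path (controller, cabling, disk) rather than at the network, which is the distinction Rob's two steps are meant to draw.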