* Bug at mm/rmap.c:493, Kernel 2.6.13.2
From: Christian Seiler @ 2005-10-02 16:53 UTC
To: linux-kernel
Hello,
In the kernel log of a computer I'm administrating a strange message
appeared stating there was a kernel bug in mm/rmap.c, line 493. I put
together the kernel log message (including the stack trace), the kernel
configuration, the output of lspci -v, lsmod, uname -a and gcc/ld
-version here:
http://src.selfhtml.org/lkml/
Although the message says a reboot is needed, the server still seems to
work after that message (login using SSH is possible, all services still
respond normally). After a reboot the same message reappears inside the
log after some time.
The distribution is Gentoo Linux, but the kernel is built from vanilla
sources. The system is entirely 64bit - no 32bit libraries are
installed. The server itself is a Sun Fire V20z with two Opteron 244, 2
GiB of RAM and hardware RAID-1 with two U320 SCSI disks.
Regards,
Christian
* Re: Bug at mm/rmap.c:493, Kernel 2.6.13.2
From: Hugh Dickins @ 2005-10-03 4:40 UTC
To: Christian Seiler; +Cc: linux-kernel
On Sun, 2 Oct 2005, Christian Seiler wrote:
>
> In the kernel log of a computer I'm administrating a strange message
> appeared stating there was a kernel bug in mm/rmap.c, line 493. I put
> together the kernel log message (including the stack trace), the kernel
> configuration, the output of lspci -v, lsmod, uname -a and gcc/ld
> -version here:
>
> http://src.selfhtml.org/lkml/
>
> Although the message says a reboot is needed, the server still seems to
> work after that message (login using SSH is possible, all services still
> respond normally). After a reboot the same message reappears inside the
> log after some time.
>
> The distribution is Gentoo Linux, but the kernel is built from vanilla
> sources. The system is entirely 64bit - no 32bit libraries are
> installed. The server itself is a Sun Fire V20z with two Opteron 244, 2
> GiB of RAM and hardware RAID-1 with two U320 SCSI disks.
Please try Linus' patch at the bottom: on dual Opteron, our best guess
is that yours is a different manifestation of the same underlying issue.
(I believe there's now a more finely targeted version of the patch in
-rc3, but this will do if it is your problem). Please get back to me
if you find this doesn't fix it - thanks.
Here's what Linus said on 20 Sep:
On Tue, 20 Sep 2005, Charles McCreary wrote:
>
> Another datapoint for this thread. The box spewing the bad pmds messages is a
> dual opteron 246 on a TYAN S2885 Thunder K8W motherboard. Kernel is
> 2.6.11.4-20a-smp.
This is quite possibly the result of an Opteron erratum (tlb flush
filtering is broken on SMP) that we worked around as of 2.6.14-rc2.
So either just try 2.6.14-rc2, or try the appended patch (it has since
been confirmed by many more people).
Linus
---
diff-tree bc5e8fdfc622b03acf5ac974a1b8b26da6511c99 (from 61ffcafafb3d985e1ab8463be0187b421614775c)
Author: Linus Torvalds <torvalds@g5.osdl.org>
Date: Sat Sep 17 15:41:04 2005 -0700
x86-64/smp: fix random SIGSEGV issues
They seem to have been due to AMD errata 63/122; the fix is to disable
TLB flush filtering in SMP configurations.
Confirmed to fix the problem by Andrew Walrond <andrew@walrond.org>
[ Let's see if we'll have a better fix eventually, this is the Q&D
"let's get this fixed and out there" version ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
diff --git a/arch/x86_64/kernel/setup.c b/arch/x86_64/kernel/setup.c
--- a/arch/x86_64/kernel/setup.c
+++ b/arch/x86_64/kernel/setup.c
@@ -831,11 +831,26 @@ static void __init amd_detect_cmp(struct
 #endif
 }
+#define HWCR 0xc0010015
+
 static int __init init_amd(struct cpuinfo_x86 *c)
 {
 	int r;
 	int level;
+#ifdef CONFIG_SMP
+	unsigned long value;
+
+	// Disable TLB flush filter by setting HWCR.FFDIS:
+	// bit 6 of msr C001_0015
+	//
+	// Errata 63 for SH-B3 steppings
+	// Errata 122 for all(?) steppings
+	rdmsrl(HWCR, value);
+	value |= 1 << 6;
+	wrmsrl(HWCR, value);
+#endif
+
 	/* Bit 31 in normal CPUID used for nonstandard 3DNow ID;
 	   3DNow is IDd by bit 31 in extended CPUID (1*32+31) anyway */
 	clear_bit(0*32+31, &c->x86_capability);
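The core of the patch above is a read-modify-write of one bit: read HWCR (MSR C001_0015), set FFDIS (bit 6), write the value back. A minimal user-space sketch of just that bit arithmetic follows. The helper names hwcr_set_ffdis() and hwcr_ffdis_is_set() are hypothetical and are not kernel functions. The sketch touches no MSR and only operates on a value that has already been read.

```c
#include <stdbool.h>
#include <stdint.h>

/* MSR number and bit position as stated in the patch comment:
 * HWCR is MSR C001_0015 and FFDIS is bit 6. */
#define HWCR_MSR       0xc0010015u
#define HWCR_FFDIS_BIT 6

/* Hypothetical helper: return the HWCR value with FFDIS set,
 * leaving every other bit unchanged. */
static inline uint64_t hwcr_set_ffdis(uint64_t hwcr)
{
    return hwcr | (UINT64_C(1) << HWCR_FFDIS_BIT);
}

/* Hypothetical helper: report whether TLB flush filtering is
 * disabled (FFDIS set) in a given HWCR value. */
static inline bool hwcr_ffdis_is_set(uint64_t hwcr)
{
    return ((hwcr >> HWCR_FFDIS_BIT) & 1u) != 0;
}
```

In the kernel the value comes from rdmsrl(HWCR, value) and goes back with wrmsrl(HWCR, value), as the patch shows. Setting the bit is idempotent, so applying it to a value that already has FFDIS set changes nothing.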
* Re: Bug at mm/rmap.c:493, Kernel 2.6.13.2
From: Christian Seiler @ 2005-10-03 15:58 UTC
To: Hugh Dickins; +Cc: linux-kernel
Hello,
> Please try Linus' patch at the bottom: on dual Opteron, our best guess
> is that yours is a different manifestation of the same underlying issue.
Thanks a lot - this patch really seems to help. The server has now been
up for 1.5 hours with the patched kernel and the error has not occurred
again.
Furthermore, another issue I had on that computer seems to be fixed by
this patch, too: gcc and ld sometimes failed at random to compile/link a
file, gcc exiting with error code 1 and no message and ld exiting with
the error message:
Inconsistency detected by ld.so: rtld.c: 1075: dl_main: Assertion
`_rtld_local._dl_rtld_map.l_libname' failed!
This error occurred at random. It seems to be gone now. I didn't report
it here because I thought it was an issue with binutils, gcc or glibc,
but it seems that this kernel patch fixes it, too.
Thanks!
Christian