* Document hadling of bad memory
From: Pavel Machek @ 2008-11-26 16:15 UTC
To: kernel list, mtk.manpages, dl9pf, rdunlap, linux-doc,
Andrew Morton, Trivial patch monkey
Document how to deal with bad memory reported with memtest.
Signed-off-by: Pavel Machek <pavel@suse.cz>
diff --git a/Documentation/bad_memory.txt b/Documentation/bad_memory.txt
new file mode 100644
index 0000000..df84162
--- /dev/null
+++ b/Documentation/bad_memory.txt
@@ -0,0 +1,45 @@
+March 2008
+Jan-Simon Moeller, dl9pf@gmx.de
+
+
+How to deal with bad memory e.g. reported by memtest86+ ?
+#########################################################
+
+There are three possibilities I know of:
+
+1) Reinsert/swap the memory modules
+
+2) Buy new modules (best!) or try to exchange the memory
+ if you have spare-parts
+
+3) Use BadRAM or memmap
+
+This Howto is about number 3) .
+
+
+BadRAM
+######
+BadRAM is the actively developed and available as kernel-patch
+here: http://rick.vanrein.org/linux/badram/
+
+For more details see the BadRAM documentation.
+
+memmap
+######
+
+memmap is already in the kernel and usable as kernel-parameter at
+boot-time. Its syntax is slightly strange and you may need to
+calculate the values by yourself!
+
+Syntax to exclude a memory area (see kernel-parameters.txt for details):
+memmap=<size>$<address>
+
+Example: memtest86+ reported here errors at address 0x18691458, 0x18698424 and
+ some others. All had 0x1869xxxx in common, so I chose a pattern of
+ 0x18690000,0xffff0000.
+
+With the numbers of the example above:
+memmap=64K$0x18690000
+ or
+memmap=0x10000$0x18690000
+
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
* Re: Document hadling of bad memory
From: Jan-Simon Möller @ 2008-11-26 16:25 UTC
To: Pavel Machek
Cc: kernel list, mtk.manpages, rdunlap, linux-doc, Andrew Morton,
 Trivial patch monkey

On Wednesday 26 November 2008 17:15:21, Pavel Machek wrote:
>
> Document how to deal with bad memory reported with memtest.
>
> Signed-off-by: Pavel Machek <pavel@suse.cz>

Signed-off-by: Jan-Simon Möller <dl9pf@gmx.de>

> diff --git a/Documentation/bad_memory.txt b/Documentation/bad_memory.txt

[...]

Best regards,
Jan-Simon
* Re: Document hadling of bad memory
From: Jiri Kosina @ 2008-11-27 0:42 UTC
To: Pavel Machek
Cc: kernel list, mtk.manpages, dl9pf, rdunlap, linux-doc,
 Andrew Morton, Trivial patch monkey, linux-doc

[ linux-doc@vger.kernel.org added, these should be the proper guys to
  merge this ]

On Wed, 26 Nov 2008, Pavel Machek wrote:

>
> Document how to deal with bad memory reported with memtest.
>
> Signed-off-by: Pavel Machek <pavel@suse.cz>
>
> diff --git a/Documentation/bad_memory.txt b/Documentation/bad_memory.txt
> new file mode 100644
> index 0000000..df84162
> --- /dev/null
> +++ b/Documentation/bad_memory.txt
> @@ -0,0 +1,45 @@
> +March 2008
> +Jan-Simon Moeller, dl9pf@gmx.de
> +
> +
> +How to deal with bad memory e.g. reported by memtest86+ ?
> +#########################################################
> +
> +There are three possibilities I know of:
> +
> +1) Reinsert/swap the memory modules
> +
> +2) Buy new modules (best!) or try to exchange the memory
> + if you have spare-parts
> +
> +3) Use BadRAM or memmap
> +
> +This Howto is about number 3) .
> +
> +
> +BadRAM
> +######
> +BadRAM is the actively developed and available as kernel-patch
> +here: http://rick.vanrein.org/linux/badram/
> +
> +For more details see the BadRAM documentation.
> +
> +memmap
> +######
> +
> +memmap is already in the kernel and usable as kernel-parameter at
> +boot-time. Its syntax is slightly strange and you may need to
> +calculate the values by yourself!
> +
> +Syntax to exclude a memory area (see kernel-parameters.txt for details):
> +memmap=<size>$<address>
> +
> +Example: memtest86+ reported here errors at address 0x18691458, 0x18698424 and
> + some others. All had 0x1869xxxx in common, so I chose a pattern of
> + 0x18690000,0xffff0000.
> +
> +With the numbers of the example above:
> +memmap=64K$0x18690000
> + or
> +memmap=0x10000$0x18690000
> +
>
> --
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
>

--
Jiri Kosina
SUSE Labs
* Re: Document hadling of bad memory
From: Rob Landley @ 2008-11-28 9:00 UTC
To: Pavel Machek
Cc: kernel list, mtk.manpages, dl9pf, rdunlap, linux-doc,
 Andrew Morton, Trivial patch monkey

On Wednesday 26 November 2008 10:15:21 Pavel Machek wrote:
> Document how to deal with bad memory reported with memtest.
...
> +BadRAM
> +######
> +BadRAM is the actively developed and available as kernel-patch
> +here: http://rick.vanrein.org/linux/badram/

So the patch isn't worth merging, but documentation about the out-of-tree
patch is worth merging?

I'm not objecting, I'm just confused about what the merge criteria are...

Rob
* Re: Document hadling of bad memory
From: Jan-Simon Möller @ 2008-11-28 9:47 UTC
To: Rob Landley
Cc: Pavel Machek, kernel list, mtk.manpages, rdunlap, linux-doc,
 Andrew Morton, Trivial patch monkey

On Friday 28 November 2008 10:00:26, Rob Landley wrote:
>
> So the patch isn't worth merging, but documentation about the out-of-tree
> patch is worth merging?

Good point. IIRC we tried merging the patch, but without luck at that
time. It was said that there is another method (with an even
<irony>better</irony> syntax) which could also handle this case, and that
it would be better to do some hacking so that the syntax is parsed to use
the functions of this already in-kernel method.

I don't know the status of this (guess: none).

What I know: badmem worked really well here.
(But in the meantime I bought new RAM.)

Best regards,
Jan-Simon
* Re: Document hadling of bad memory
From: Pavel Machek @ 2008-11-28 12:18 UTC
To: Rob Landley
Cc: kernel list, mtk.manpages, dl9pf, rdunlap, linux-doc,
 Andrew Morton, Trivial patch monkey

On Fri 2008-11-28 03:00:26, Rob Landley wrote:
> On Wednesday 26 November 2008 10:15:21 Pavel Machek wrote:
> > Document how to deal with bad memory reported with memtest.
> ...
> > +BadRAM
> > +######
> > +BadRAM is the actively developed and available as kernel-patch
> > +here: http://rick.vanrein.org/linux/badram/
>
> So the patch isn't worth merging, but documentation about the out-of-tree
> patch is worth merging?

Well, why not. The patch is unnecessary, but for the poor souls hit
by bad memory, a one-line pointer can help...

 Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
* Re: Document hadling of bad memory
From: Rob Landley @ 2008-11-29 5:28 UTC
To: Pavel Machek
Cc: kernel list, mtk.manpages, dl9pf, rdunlap, linux-doc,
 Andrew Morton, Trivial patch monkey

On Friday 28 November 2008 06:18:38 Pavel Machek wrote:
> On Fri 2008-11-28 03:00:26, Rob Landley wrote:
> > So the patch isn't worth merging, but documentation about the out-of-tree
> > patch is worth merging?
>
> Well, why not. The patch is unnecessary, but for the poor souls hit
> by bad memory, one line pointer can help...
> Pavel

Define "unnecessary".

Rob
* Re: Document hadling of bad memory
From: Andrew Morton @ 2008-11-29 6:50 UTC
To: Rob Landley
Cc: Pavel Machek, kernel list, mtk.manpages, dl9pf, rdunlap,
 linux-doc, Trivial patch monkey

On Fri, 28 Nov 2008 03:00:26 -0600 Rob Landley <rob@landley.net> wrote:

> On Wednesday 26 November 2008 10:15:21 Pavel Machek wrote:
> > Document how to deal with bad memory reported with memtest.
> ...
> > +BadRAM
> > +######
> > +BadRAM is the actively developed and available as kernel-patch
> > +here: http://rick.vanrein.org/linux/badram/
>
> So the patch isn't worth merging, but documentation about the out-of-tree
> patch is worth merging?
>
> I'm not objecting, I'm just confused about what the merge criteria are...
>

mm.. If someone finds it useful (and I assume that at least one person
would have found it useful, hence the effort to write the patch) then
why not?

(And yeah, yeah, someone might find a .gif of a parrot useful too. Go
do some work.)
* Re: Document hadling of bad memory
From: Randy Dunlap @ 2008-12-01 18:56 UTC
To: Pavel Machek
Cc: kernel list, mtk.manpages, dl9pf, rdunlap, linux-doc,
 Andrew Morton, Trivial patch monkey

On Wed, 26 Nov 2008 17:15:21 +0100 Pavel Machek wrote:

> Document how to deal with bad memory reported with memtest.
>
> Signed-off-by: Pavel Machek <pavel@suse.cz>
>
> diff --git a/Documentation/bad_memory.txt b/Documentation/bad_memory.txt
> new file mode 100644
> index 0000000..df84162
> --- /dev/null
> +++ b/Documentation/bad_memory.txt
> @@ -0,0 +1,45 @@
> +March 2008
> +Jan-Simon Moeller, dl9pf@gmx.de
> +
> +
> +How to deal with bad memory e.g. reported by memtest86+ ?
> +#########################################################
> +
> +There are three possibilities I know of:
> +
> +1) Reinsert/swap the memory modules
> +
> +2) Buy new modules (best!) or try to exchange the memory
> + if you have spare-parts
> +
> +3) Use BadRAM or memmap
> +
> +This Howto is about number 3) .

No space between 3) and '.'.

> +
> +
> +BadRAM
> +######
> +BadRAM is the actively developed and available as kernel-patch
> +here: http://rick.vanrein.org/linux/badram/
> +
> +For more details see the BadRAM documentation.
> +
> +memmap
> +######
> +
> +memmap is already in the kernel and usable as kernel-parameter at

 a kernel parameter at

> +boot-time. Its syntax is slightly strange and you may need to

 boot time.

> +calculate the values by yourself!

s/!/./

> +
> +Syntax to exclude a memory area (see kernel-parameters.txt for details):
> +memmap=<size>$<address>
> +
> +Example: memtest86+ reported here errors at address 0x18691458, 0x18698424 and

s/here //

> + some others. All had 0x1869xxxx in common, so I chose a pattern of
> + 0x18690000,0xffff0000.

What is the 0xffff0000 for? Needs explanation.

> +
> +With the numbers of the example above:
> +memmap=64K$0x18690000
> + or
> +memmap=0x10000$0x18690000
> +

Please lose the last empty line.

and thanks for the patch/new file.

---
~Randy
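One possible reading of the 0x18690000,0xffff0000 pair, sketched below, is a BadRAM-style address/mask pattern: an address is treated as bad when it agrees with the base address on every bit that is set in the mask. That interpretation is an assumption taken from how BadRAM describes its patterns, not something stated in this thread, and the helper names `matches_pattern` and `mask_to_size` are hypothetical.

```python
# Sketch only: assumes a BadRAM-style "address,mask" pattern.
# matches_pattern and mask_to_size are illustrative names, not kernel APIs.

def matches_pattern(addr, base, mask):
    """True if addr agrees with base on every bit that is set in mask."""
    return (addr & mask) == (base & mask)


def mask_to_size(mask, bits=32):
    """Size of the region covered by a mask made of contiguous high bits."""
    low_bits = ~mask & ((1 << bits) - 1)   # bits the mask leaves free
    if low_bits & (low_bits + 1):          # free bits must look like 0b0..01..1
        raise ValueError("mask does not describe one contiguous region")
    return low_bits + 1


base, mask = 0x18690000, 0xFFFF0000
for addr in (0x18691458, 0x18698424):
    print(hex(addr), matches_pattern(addr, base, mask))
print(hex(mask_to_size(mask)))
```

Under that assumption, the mask 0xffff0000 leaves the low 16 bits free, which corresponds to the 0x10000 (64K) size used in the memmap examples of the patch.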
* Re: Document hadling of bad memory
From: Pavel Machek @ 2008-12-09 12:31 UTC
To: Randy Dunlap
Cc: kernel list, mtk.manpages, dl9pf, rdunlap, linux-doc,
 Andrew Morton, Trivial patch monkey

I cleaned the document up according to Randy (thanks!). I don't actually
know enough about DRAM error characteristics; I guess rounding the size of
the bad region up to the nearest 2^n makes sense.

Signed-off-by: Pavel Machek <pavel@suse.cz>

diff --git a/Documentation/bad_memory.txt b/Documentation/bad_memory.txt
index df84162..a2a8703 100644
--- a/Documentation/bad_memory.txt
+++ b/Documentation/bad_memory.txt
@@ -14,12 +14,12 @@ There are three possibilities I know of:
 
 3) Use BadRAM or memmap
 
-This Howto is about number 3) .
+This Howto is about number 3).
 
 
 BadRAM
 ######
-BadRAM is the actively developed and available as kernel-patch
+BadRAM is the actively developed and available as a kernel patch
 here: http://rick.vanrein.org/linux/badram/
 
 For more details see the BadRAM documentation.
@@ -27,19 +27,20 @@ For more details see the BadRAM documentation.
 memmap
 ######
 
-memmap is already in the kernel and usable as kernel-parameter at
-boot-time. Its syntax is slightly strange and you may need to
-calculate the values by yourself!
+memmap is already in the kernel and usable as a kernel parameter at
+boot time. Its syntax is slightly strange and you may need to
+calculate the values by yourself.
 
 Syntax to exclude a memory area (see kernel-parameters.txt for details):
 memmap=<size>$<address>
 
-Example: memtest86+ reported here errors at address 0x18691458, 0x18698424 and
+Example: memtest86+ reported errors at address 0x18691458, 0x18698424 and
  some others. All had 0x1869xxxx in common, so I chose a pattern of
- 0x18690000,0xffff0000.
+ 0x18690000 and size of 0x10000. (Size needs to cover at least all
+ known bad places, and rounding to nearest power of 2 makes sense
+ 'just to be safe').
 
 With the numbers of the example above:
 memmap=64K$0x18690000
  or
 memmap=0x10000$0x18690000
-
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
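The calculation the revised text asks the reader to do by hand (cover all known bad addresses, then round up to a power of two) could be sketched roughly as follows. This is an illustration under assumptions, not part of the patch: the function names `covering_region` and `memmap_argument` are hypothetical, and the 4 KiB starting size is an arbitrary choice that must itself be a power of two.

```python
# Sketch only: derives a memmap=<size>$<address> argument from bad addresses.
# The function names and the 4 KiB starting size are assumptions.

def covering_region(bad_addresses, min_size=0x1000):
    """Smallest power-of-two sized, size-aligned region covering all addresses."""
    lowest, highest = min(bad_addresses), max(bad_addresses)
    size = min_size
    while True:
        base = lowest & ~(size - 1)        # align the base down to the size
        if highest < base + size:          # every address fits inside
            return base, size
        size *= 2                          # otherwise try the next power of two


def memmap_argument(bad_addresses):
    """Format the covering region in the memmap=<size>$<address> syntax."""
    base, size = covering_region(bad_addresses)
    return f"memmap={size:#x}${base:#x}"


print(memmap_argument([0x18691458, 0x18698424]))  # memmap=0x10000$0x18690000
```

For the two addresses quoted in the example this reproduces the `memmap=0x10000$0x18690000` form from the patch. In practice every address memtest86+ reports should be passed in, including the "some others" the example does not list, and depending on the bootloader the `$` may need to be escaped in its configuration.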
* Re: Document hadling of bad memory
From: Rob Landley @ 2008-12-09 21:40 UTC
To: Pavel Machek
Cc: Randy Dunlap, kernel list, mtk.manpages, dl9pf, rdunlap,
 linux-doc, Andrew Morton, Trivial patch monkey

On Tuesday 09 December 2008 06:31:52 Pavel Machek wrote:
> I cleaned the document up according to Randy (thanks!). I don't actually
> know enough about DRAM error characteristics; I guess rounding the size of
> the bad region up to the nearest 2^n makes sense.
>
> Signed-off-by: Pavel Machek <pavel@suse.cz>
>
> diff --git a/Documentation/bad_memory.txt b/Documentation/bad_memory.txt
...
> +This Howto is about number 3).
>
>
> BadRAM
> ######
> -BadRAM is the actively developed and available as kernel-patch
> +BadRAM is the actively developed and available as a kernel patch
> here: http://rick.vanrein.org/linux/badram/

Ok, once again: the point of this patch is to document an out of tree patch.
The out of tree patch is here:
http://rick.vanrein.org/linux/badram/software/BadRAM-2.6.27.1.patch

It has its own Documentation/badram.txt file and it patches
Documentation/memory.txt, as acknowledged here:

> For more details see the BadRAM documentation.
> @@ -27,19 +27,20 @@ For more details see the BadRAM documentation.
> memmap
> ######

Now what I don't understand is, why add something to the tree formalizing the
out-of-tree status of this other patch? Why not just merge it? If it's
interesting enough to have documentation about the patch in the tree, why is
the patch itself not interesting enough to merge? It's clearly got an active
maintainer, and has for years. (Is there something specific about it that
needs to be cleaned up?)

Adding this extra documentation to the badram patch sounds great. Merging the
badram patch into the linux kernel sounds useful; obviously _this_ patch is
inherently an expression of interest in it. Adding documentation about the
badram patch to the linux kernel tree but _not_ adding the badram patch itself
seems kind of crazy.

Would someone please explain the reasoning here? I don't understand it.

Rob
* Re: Document hadling of bad memory
From: Pavel Machek @ 2008-12-09 23:11 UTC
To: Rob Landley
Cc: Randy Dunlap, kernel list, mtk.manpages, dl9pf, rdunlap,
 linux-doc, Andrew Morton, Trivial patch monkey

On Tue 2008-12-09 15:40:41, Rob Landley wrote:
> On Tuesday 09 December 2008 06:31:52 Pavel Machek wrote:
> > I cleaned the document up according to Randy (thanks!). I don't actually
> > know enough about DRAM error characteristics; I guess rounding the size of
> > the bad region up to the nearest 2^n makes sense.
> >
> > Signed-off-by: Pavel Machek <pavel@suse.cz>
> >
> > diff --git a/Documentation/bad_memory.txt b/Documentation/bad_memory.txt
> ...
> > +This Howto is about number 3).
> >
> >
> > BadRAM
> > ######
> > -BadRAM is the actively developed and available as kernel-patch
> > +BadRAM is the actively developed and available as a kernel patch
> > here: http://rick.vanrein.org/linux/badram/
>
> Ok, once again: the point of this patch is to document an out of tree patch.

No; the point of this piece of documentation is to tell people how to
work _without_ that patch. Because it is simple enough.

> The out of tree patch is here:
> http://rick.vanrein.org/linux/badram/software/BadRAM-2.6.27.1.patch
>
> It has its own Documentation/badram.txt file and it patches
> Documentation/memory.txt, as acknowledged here:
>
> > For more details see the BadRAM documentation.
> > @@ -27,19 +27,20 @@ For more details see the BadRAM documentation.
> > memmap
> > ######
>
> Now what I don't understand is, why add something to the tree formalizing the
> out-of-tree status of this other patch? Why not just merge it? If
> it's

Take a look at that patch. It is seriously overengineered. This should
not need a config option, should not introduce a new page flag, etc.

We already have a perfectly working interface for excluding specific
addresses; maybe we need better documentation, and maybe the kernel
command-line interface should be changed to be more user friendly, but we
certainly don't want to take the badram patch.

This excerpt should be enough:

diff -pruN linux-2.6.27/include/linux/page-flags.h linux-2.6.27-new/include/linux/page-flags.h
--- linux-2.6.27/include/linux/page-flags.h 2008-10-10 03:43:53.000000000 +0530
+++ linux-2.6.27-new/include/linux/page-flags.h 2008-10-15 10:04:48.000000000 +0530
@@ -93,6 +93,9 @@ enum pageflags {
 	PG_mappedtodisk,	/* Has blocks allocated on-disk */
 	PG_reclaim,		/* To be reclaimed asap */
 	PG_buddy,		/* Page is free, on buddy lists */
+#ifdef CONFIG_BADRAM
+	PG_badram,		/* BadRam page */
+#endif
 #ifdef CONFIG_IA64_UNCACHED_ALLOCATOR
 	PG_uncached,		/* Page has been mapped as uncached */
 #

 Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html