linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/3] support for broken memory modules (BadRAM)
@ 2011-06-21  9:23 Stefan Assmann
  2011-06-21 22:02 ` Andrew Morton
  0 siblings, 1 reply; 49+ messages in thread
From: Stefan Assmann @ 2011-06-21  9:23 UTC (permalink / raw)
  To: linux-mm; +Cc: akpm, tony.luck, andi, mingo, hpa, rick, rdunlap, sassmann

Following the RFC for the BadRAM feature here's the updated version with
spelling fixes, thanks go to Randy Dunlap. Also the code is now less verbose,
as requested by Andi Kleen.
v2 with even more spelling fixes suggested by Randy.
Patches are against vanilla 2.6.39.

The idea is to allow the user to specify RAM addresses that shouldn't be
touched by the OS, because they are broken in some way. Not all machines have
hardware support for hwpoison, ECC RAM, etc, so here's a solution that allows to
use bitmasks to mask address patterns with the new "badram" kernel command line
parameter.
Memtest86 has an option to generate these patterns since v2.3 so the only thing
for the user to do should be:
- run Memtest86
- note down the pattern
- add badram=<pattern> to the kernel command line

The concerning pages are then marked with the hwpoison flag and thus won't be
used by the memory managment system.

Link to Ricks original patches and docs:
http://rick.vanrein.org/linux/badram/

  Stefan

Stefan Assmann (3):
  Add string parsing function get_next_ulong
  support for broken memory modules (BadRAM)
  Add documentation and credits for BadRAM

 CREDITS                             |    9 +
 Documentation/BadRAM.txt            |  370 +++++++++++++++++++++++++++++++++++
 Documentation/kernel-parameters.txt |    6 +
 include/linux/kernel.h              |    1 +
 lib/cmdline.c                       |   35 ++++
 mm/memory-failure.c                 |  100 ++++++++++
 6 files changed, 521 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/BadRAM.txt

-- 
1.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-21  9:23 Stefan Assmann
@ 2011-06-21 22:02 ` Andrew Morton
  2011-06-22 11:11   ` Stefan Assmann
  0 siblings, 1 reply; 49+ messages in thread
From: Andrew Morton @ 2011-06-21 22:02 UTC (permalink / raw)
  To: Stefan Assmann; +Cc: linux-mm, tony.luck, andi, mingo, hpa, rick, rdunlap

On Tue, 21 Jun 2011 11:23:15 +0200
Stefan Assmann <sassmann@kpanic.de> wrote:

> Following the RFC for the BadRAM feature here's the updated version with
> spelling fixes, thanks go to Randy Dunlap.

I have some thoughts but I think that linux-mm is too narrow an audience for
a patchset of this scope.

Please resend the patchset cc'ing linux-kernel so that others can see what
we're talking about.

Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-21 22:02 ` Andrew Morton
@ 2011-06-22 11:11   ` Stefan Assmann
  0 siblings, 0 replies; 49+ messages in thread
From: Stefan Assmann @ 2011-06-22 11:11 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, tony.luck, andi, mingo, hpa, rick, rdunlap

On 22.06.2011 00:02, Andrew Morton wrote:
> On Tue, 21 Jun 2011 11:23:15 +0200
> Stefan Assmann <sassmann@kpanic.de> wrote:
> 
>> Following the RFC for the BadRAM feature here's the updated version with
>> spelling fixes, thanks go to Randy Dunlap.
> 
> I have some thoughts but I think that linux-mm is too narrow an audience for
> a patchset of this scope.
> 
> Please resend the patchset cc'ing linux-kernel so that others can see what
> we're talking about.

Sure, I'm looking forward to your feedback. Patches are going to be reposted
with LKML in Cc soon.
Thanks!

  Stefan

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH v2 0/3] support for broken memory modules (BadRAM)
@ 2011-06-22 11:18 Stefan Assmann
  2011-06-22 18:00 ` Andrew Morton
                   ` (2 more replies)
  0 siblings, 3 replies; 49+ messages in thread
From: Stefan Assmann @ 2011-06-22 11:18 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, akpm, tony.luck, andi, mingo, hpa, rick, rdunlap,
	sassmann

Following the RFC for the BadRAM feature here's the updated version with
spelling fixes, thanks go to Randy Dunlap. Also the code is now less verbose,
as requested by Andi Kleen.
v2 with even more spelling fixes suggested by Randy.
Patches are against vanilla 2.6.39.
Repost with LKML in Cc as suggested by Andrew Morton.

The idea is to allow the user to specify RAM addresses that shouldn't be
touched by the OS, because they are broken in some way. Not all machines have
hardware support for hwpoison, ECC RAM, etc, so here's a solution that allows to
use bitmasks to mask address patterns with the new "badram" kernel command line
parameter.
Memtest86 has an option to generate these patterns since v2.3 so the only thing
for the user to do should be:
- run Memtest86
- note down the pattern
- add badram=<pattern> to the kernel command line

The concerning pages are then marked with the hwpoison flag and thus won't be
used by the memory managment system.

Link to Ricks original patches and docs:
http://rick.vanrein.org/linux/badram/

  Stefan

Stefan Assmann (3):
  Add string parsing function get_next_ulong
  support for broken memory modules (BadRAM)
  Add documentation and credits for BadRAM

 CREDITS                             |    9 +
 Documentation/BadRAM.txt            |  370 +++++++++++++++++++++++++++++++++++
 Documentation/kernel-parameters.txt |    6 +
 include/linux/kernel.h              |    1 +
 lib/cmdline.c                       |   35 ++++
 mm/memory-failure.c                 |  100 ++++++++++
 6 files changed, 521 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/BadRAM.txt

-- 
1.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-22 11:18 Stefan Assmann
@ 2011-06-22 18:00 ` Andrew Morton
  2011-06-22 18:06   ` Josh Boyer
                     ` (5 more replies)
  2011-06-22 18:15 ` H. Peter Anvin
  2011-06-23 13:39 ` Matthew Garrett
  2 siblings, 6 replies; 49+ messages in thread
From: Andrew Morton @ 2011-06-22 18:00 UTC (permalink / raw)
  To: Stefan Assmann
  Cc: linux-mm, linux-kernel, tony.luck, andi, mingo, hpa, rick,
	rdunlap, Nancy Yuen, Michael Ditto

On Wed, 22 Jun 2011 13:18:51 +0200 Stefan Assmann <sassmann@kpanic.de> wrote:

> Following the RFC for the BadRAM feature here's the updated version with
> spelling fixes, thanks go to Randy Dunlap. Also the code is now less verbose,
> as requested by Andi Kleen.
> v2 with even more spelling fixes suggested by Randy.
> Patches are against vanilla 2.6.39.
> 
> The idea is to allow the user to specify RAM addresses that shouldn't be
> touched by the OS, because they are broken in some way. Not all machines have
> hardware support for hwpoison, ECC RAM, etc, so here's a solution that allows to
> use bitmasks to mask address patterns with the new "badram" kernel command line
> parameter.
> Memtest86 has an option to generate these patterns since v2.3 so the only thing
> for the user to do should be:
> - run Memtest86
> - note down the pattern
> - add badram=<pattern> to the kernel command line
> 
> The concerning pages are then marked with the hwpoison flag and thus won't be
> used by the memory managment system.

The google kernel has a similar capability.  I asked Nancy to comment
on these patches and she said:

: One, the bad addresses are passed via the kernel command line, which
: has a limited length.  It's okay if the addresses can be fit into a
: pattern, but that's not necessarily the case in the google kernel.  And
: even with patterns, the limit on the command line length limits the
: number of patterns that user can specify.  Instead we use lilo to pass
: a file containing the bad pages in e820 format to the kernel.
: 
: Second, the BadRAM patch expands the address patterns from the command
: line into individual entries in the kernel's e820 table.  The e820
: table is a fixed buffer that supports a very small, hard coded number
: of entries (128).  We require a much larger number of entries (on
: the order of a few thousand), so much of the google kernel patch deals
: with expanding the e820 table. Also, with the BadRAM patch, entries
: that don't fit in the table are silently dropped and this isn't
: appropriate for us.
: 
: Another caveat of mapping out too much bad memory in general.  If too
: much memory is removed from low memory, a system may not boot.  We
: solve this by generating good maps.  Our userspace tools do not map out
: memory below a certain limit, and it verifies against a system's iomap
: that only addresses from memory is mapped out.

I have a couple of thoughts here:

- If this patchset is merged and a major user such as google is
  unable to use it and has to continue to carry a separate patch then
  that's a regrettable situation for the upstream kernel.

- Google's is, afaik, the largest use case we know of: zillions of
  machines for a number of years.  And this real-world experience tells
  us that the badram patchset has shortcomings.  Shortcomings which we
  can expect other users to experience.

So.  What are your thoughts on these issues?

Thanks

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-22 18:00 ` Andrew Morton
@ 2011-06-22 18:06   ` Josh Boyer
  2011-06-22 18:09   ` Randy Dunlap
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 49+ messages in thread
From: Josh Boyer @ 2011-06-22 18:06 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Stefan Assmann, linux-mm, linux-kernel, tony.luck, andi, mingo,
	hpa, rick, rdunlap, Nancy Yuen, Michael Ditto

On Wed, Jun 22, 2011 at 2:00 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:>
> I have a couple of thoughts here:
>
> - If this patchset is merged and a major user such as google is
>  unable to use it and has to continue to carry a separate patch then
>  that's a regrettable situation for the upstream kernel.
>
> - Google's is, afaik, the largest use case we know of: zillions of
>  machines for a number of years.  And this real-world experience tells
>  us that the badram patchset has shortcomings.  Shortcomings which we
>  can expect other users to experience.
>
> So.  What are your thoughts on these issues?

Has Google submitted patches for their implementation?

josh

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-22 18:00 ` Andrew Morton
  2011-06-22 18:06   ` Josh Boyer
@ 2011-06-22 18:09   ` Randy Dunlap
  2011-06-22 18:11     ` Nancy Yuen
  2011-06-22 18:13   ` H. Peter Anvin
                     ` (3 subsequent siblings)
  5 siblings, 1 reply; 49+ messages in thread
From: Randy Dunlap @ 2011-06-22 18:09 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Stefan Assmann, linux-mm, linux-kernel, tony.luck, andi, mingo,
	hpa, rick, rdunlap, Nancy Yuen, Michael Ditto

On Wed, 22 Jun 2011 11:00:34 -0700 Andrew Morton wrote:

> On Wed, 22 Jun 2011 13:18:51 +0200 Stefan Assmann <sassmann@kpanic.de> wrote:
> 
> > Following the RFC for the BadRAM feature here's the updated version with
> > spelling fixes, thanks go to Randy Dunlap. Also the code is now less verbose,
> > as requested by Andi Kleen.
> > v2 with even more spelling fixes suggested by Randy.
> > Patches are against vanilla 2.6.39.
> > 
> > The idea is to allow the user to specify RAM addresses that shouldn't be
> > touched by the OS, because they are broken in some way. Not all machines have
> > hardware support for hwpoison, ECC RAM, etc, so here's a solution that allows to
> > use bitmasks to mask address patterns with the new "badram" kernel command line
> > parameter.
> > Memtest86 has an option to generate these patterns since v2.3 so the only thing
> > for the user to do should be:
> > - run Memtest86
> > - note down the pattern
> > - add badram=<pattern> to the kernel command line
> > 
> > The concerning pages are then marked with the hwpoison flag and thus won't be
> > used by the memory managment system.
> 
> The google kernel has a similar capability.  I asked Nancy to comment
> on these patches and she said:
> 
> : One, the bad addresses are passed via the kernel command line, which
> : has a limited length.  It's okay if the addresses can be fit into a
> : pattern, but that's not necessarily the case in the google kernel.  And
> : even with patterns, the limit on the command line length limits the
> : number of patterns that user can specify.  Instead we use lilo to pass
> : a file containing the bad pages in e820 format to the kernel.
> : 
> : Second, the BadRAM patch expands the address patterns from the command
> : line into individual entries in the kernel's e820 table.  The e820
> : table is a fixed buffer that supports a very small, hard coded number
> : of entries (128).  We require a much larger number of entries (on
> : the order of a few thousand), so much of the google kernel patch deals
> : with expanding the e820 table. Also, with the BadRAM patch, entries
> : that don't fit in the table are silently dropped and this isn't
> : appropriate for us.
> : 
> : Another caveat of mapping out too much bad memory in general.  If too
> : much memory is removed from low memory, a system may not boot.  We
> : solve this by generating good maps.  Our userspace tools do not map out
> : memory below a certain limit, and it verifies against a system's iomap
> : that only addresses from memory is mapped out.
> 
> I have a couple of thoughts here:
> 
> - If this patchset is merged and a major user such as google is
>   unable to use it and has to continue to carry a separate patch then
>   that's a regrettable situation for the upstream kernel.
> 
> - Google's is, afaik, the largest use case we know of: zillions of
>   machines for a number of years.  And this real-world experience tells
>   us that the badram patchset has shortcomings.  Shortcomings which we
>   can expect other users to experience.
> 
> So.  What are your thoughts on these issues?


Good comments, so where is google's patch submittal?

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-22 18:09   ` Randy Dunlap
@ 2011-06-22 18:11     ` Nancy Yuen
  0 siblings, 0 replies; 49+ messages in thread
From: Nancy Yuen @ 2011-06-22 18:11 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: Andrew Morton, Stefan Assmann, linux-mm, linux-kernel, tony.luck,
	andi, mingo, hpa, rick, Michael Ditto

I haven't had time to submit the patches, though it's on my todo list.

----------
Nancy



On Wed, Jun 22, 2011 at 11:09, Randy Dunlap <rdunlap@xenotime.net> wrote:
> On Wed, 22 Jun 2011 11:00:34 -0700 Andrew Morton wrote:
>
>> On Wed, 22 Jun 2011 13:18:51 +0200 Stefan Assmann <sassmann@kpanic.de> wrote:
>>
>> > Following the RFC for the BadRAM feature here's the updated version with
>> > spelling fixes, thanks go to Randy Dunlap. Also the code is now less verbose,
>> > as requested by Andi Kleen.
>> > v2 with even more spelling fixes suggested by Randy.
>> > Patches are against vanilla 2.6.39.
>> >
>> > The idea is to allow the user to specify RAM addresses that shouldn't be
>> > touched by the OS, because they are broken in some way. Not all machines have
>> > hardware support for hwpoison, ECC RAM, etc, so here's a solution that allows to
>> > use bitmasks to mask address patterns with the new "badram" kernel command line
>> > parameter.
>> > Memtest86 has an option to generate these patterns since v2.3 so the only thing
>> > for the user to do should be:
>> > - run Memtest86
>> > - note down the pattern
>> > - add badram=<pattern> to the kernel command line
>> >
>> > The concerning pages are then marked with the hwpoison flag and thus won't be
>> > used by the memory managment system.
>>
>> The google kernel has a similar capability.  I asked Nancy to comment
>> on these patches and she said:
>>
>> : One, the bad addresses are passed via the kernel command line, which
>> : has a limited length.  It's okay if the addresses can be fit into a
>> : pattern, but that's not necessarily the case in the google kernel.  And
>> : even with patterns, the limit on the command line length limits the
>> : number of patterns that user can specify.  Instead we use lilo to pass
>> : a file containing the bad pages in e820 format to the kernel.
>> :
>> : Second, the BadRAM patch expands the address patterns from the command
>> : line into individual entries in the kernel's e820 table.  The e820
>> : table is a fixed buffer that supports a very small, hard coded number
>> : of entries (128).  We require a much larger number of entries (on
>> : the order of a few thousand), so much of the google kernel patch deals
>> : with expanding the e820 table. Also, with the BadRAM patch, entries
>> : that don't fit in the table are silently dropped and this isn't
>> : appropriate for us.
>> :
>> : Another caveat of mapping out too much bad memory in general.  If too
>> : much memory is removed from low memory, a system may not boot.  We
>> : solve this by generating good maps.  Our userspace tools do not map out
>> : memory below a certain limit, and it verifies against a system's iomap
>> : that only addresses from memory is mapped out.
>>
>> I have a couple of thoughts here:
>>
>> - If this patchset is merged and a major user such as google is
>>   unable to use it and has to continue to carry a separate patch then
>>   that's a regrettable situation for the upstream kernel.
>>
>> - Google's is, afaik, the largest use case we know of: zillions of
>>   machines for a number of years.  And this real-world experience tells
>>   us that the badram patchset has shortcomings.  Shortcomings which we
>>   can expect other users to experience.
>>
>> So.  What are your thoughts on these issues?
>
>
> Good comments, so where is google's patch submittal?
>
> ---
> ~Randy
> *** Remember to use Documentation/SubmitChecklist when testing your code ***
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-22 18:00 ` Andrew Morton
  2011-06-22 18:06   ` Josh Boyer
  2011-06-22 18:09   ` Randy Dunlap
@ 2011-06-22 18:13   ` H. Peter Anvin
  2011-06-22 19:01     ` Nancy Yuen
  2011-06-22 18:24   ` Andi Kleen
                     ` (2 subsequent siblings)
  5 siblings, 1 reply; 49+ messages in thread
From: H. Peter Anvin @ 2011-06-22 18:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Stefan Assmann, linux-mm, linux-kernel, tony.luck, andi, mingo,
	rick, rdunlap, Nancy Yuen, Michael Ditto

On 06/22/2011 11:00 AM, Andrew Morton wrote:
> : 
> : Second, the BadRAM patch expands the address patterns from the command
> : line into individual entries in the kernel's e820 table.  The e820
> : table is a fixed buffer that supports a very small, hard coded number
> : of entries (128).  We require a much larger number of entries (on
> : the order of a few thousand), so much of the google kernel patch deals
> : with expanding the e820 table.

This has not been true for a long time.

> I have a couple of thoughts here:
> 
> - If this patchset is merged and a major user such as google is
>   unable to use it and has to continue to carry a separate patch then
>   that's a regrettable situation for the upstream kernel.
> 
> - Google's is, afaik, the largest use case we know of: zillions of
>   machines for a number of years.  And this real-world experience tells
>   us that the badram patchset has shortcomings.  Shortcomings which we
>   can expect other users to experience.
> 
> So.  What are your thoughts on these issues?

I think a binary structure fed as a linked list data object makes a lot
more sense.  We already support feeding e820 entries in this way,
bypassing the 128-entry limitation of the fixed table in the zeropage.

The main issue then is priority; in particular memory marked UNUSABLE
(type 5) in the fed-in e820 map will of course overlap entries with
normal RAM (type 1) information in the native map; we need to make sure
that the type 5 information takes priority.

	-hpa

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-22 11:18 Stefan Assmann
  2011-06-22 18:00 ` Andrew Morton
@ 2011-06-22 18:15 ` H. Peter Anvin
  2011-06-22 20:30   ` Stefan Assmann
  2011-06-23 13:39 ` Matthew Garrett
  2 siblings, 1 reply; 49+ messages in thread
From: H. Peter Anvin @ 2011-06-22 18:15 UTC (permalink / raw)
  To: Stefan Assmann
  Cc: linux-mm, linux-kernel, akpm, tony.luck, andi, mingo, rick,
	rdunlap

On 06/22/2011 04:18 AM, Stefan Assmann wrote:
> 
> The idea is to allow the user to specify RAM addresses that shouldn't be
> touched by the OS, because they are broken in some way. Not all machines have
> hardware support for hwpoison, ECC RAM, etc, so here's a solution that allows to
> use bitmasks to mask address patterns with the new "badram" kernel command line
> parameter.
> Memtest86 has an option to generate these patterns since v2.3 so the only thing
> for the user to do should be:
> - run Memtest86
> - note down the pattern
> - add badram=<pattern> to the kernel command line
> 

We already support the equivalent functionality with
memmap=<address>$<length> for those with only a few ranges... this has
been supported for ages, literally.  For those with a lot of ranges,
like Google, the command line is insufficient.

	-hpa

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-22 18:00 ` Andrew Morton
                     ` (2 preceding siblings ...)
  2011-06-22 18:13   ` H. Peter Anvin
@ 2011-06-22 18:24   ` Andi Kleen
  2011-06-22 18:38     ` Andrew Morton
  2011-06-22 20:18   ` Stefan Assmann
  2011-06-23 10:10   ` Rick van Rein
  5 siblings, 1 reply; 49+ messages in thread
From: Andi Kleen @ 2011-06-22 18:24 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Stefan Assmann, linux-mm, linux-kernel, tony.luck, andi, mingo,
	hpa, rick, rdunlap, Nancy Yuen, Michael Ditto

> So.  What are your thoughts on these issues?

Sounds orthogonal to me. You have to crawl before you walk.

A better way to pass in the data would be nice, but can be always
added on top (e.g. some EFI environment variable) 

For a first try a command line argument is quite
appropiate and simple enough.

A check for removing too much memory would be nice though,
although it's just a choice between panicing early or later.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-22 18:24   ` Andi Kleen
@ 2011-06-22 18:38     ` Andrew Morton
  2011-06-22 18:56       ` Andi Kleen
  0 siblings, 1 reply; 49+ messages in thread
From: Andrew Morton @ 2011-06-22 18:38 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Stefan Assmann, linux-mm, linux-kernel, tony.luck, mingo, hpa,
	rick, rdunlap, Nancy Yuen, Michael Ditto

On Wed, 22 Jun 2011 20:24:45 +0200
Andi Kleen <andi@firstfloor.org> wrote:

> > So.  What are your thoughts on these issues?
> 
> Sounds orthogonal to me. You have to crawl before you walk.
> 
> A better way to pass in the data would be nice, but can be always
> added on top (e.g. some EFI environment variable) 
> 
> For a first try a command line argument is quite
> appropiate and simple enough.
> 
> A check for removing too much memory would be nice though,
> although it's just a choice between panicing early or later.
> 

If something can be grafted on later then that's of course all good.  I
do think we should have some sort of plan in which we work out how that
will be done.  If we want to do it, that is.

However if we go this way then there's a risk that we'll end up with
two different ways of configuring the feature and we'll need to
maintain the old way for ever.  That's a bad thing and we'd be better
off implementing the fancier scheme on day one.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-22 18:38     ` Andrew Morton
@ 2011-06-22 18:56       ` Andi Kleen
  2011-06-22 19:05         ` H. Peter Anvin
  0 siblings, 1 reply; 49+ messages in thread
From: Andi Kleen @ 2011-06-22 18:56 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andi Kleen, Stefan Assmann, linux-mm, linux-kernel, tony.luck,
	mingo, hpa, rick, rdunlap, Nancy Yuen, Michael Ditto

> If something can be grafted on later then that's of course all good.  I
> do think we should have some sort of plan in which we work out how that
> will be done.  If we want to do it, that is.
> 
> However if we go this way then there's a risk that we'll end up with
> two different ways of configuring the feature and we'll need to

You'll always have multiple ways. Whatever magic you come up for
the google BIOS or for EFI won't help the majority of users with
old crufty legacy BIOS.

So you need a "everything included" way -- and the only straight forward
way to do that that I can see is the command line.

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-22 18:13   ` H. Peter Anvin
@ 2011-06-22 19:01     ` Nancy Yuen
  2011-06-22 19:06       ` H. Peter Anvin
  0 siblings, 1 reply; 49+ messages in thread
From: Nancy Yuen @ 2011-06-22 19:01 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Andrew Morton, Stefan Assmann, linux-mm, linux-kernel, tony.luck,
	andi, mingo, rick, rdunlap, Michael Ditto

On Wed, Jun 22, 2011 at 11:13, H. Peter Anvin <hpa@zytor.com> wrote:
> On 06/22/2011 11:00 AM, Andrew Morton wrote:
>> :
>> : Second, the BadRAM patch expands the address patterns from the command
>> : line into individual entries in the kernel's e820 table.  The e820
>> : table is a fixed buffer that supports a very small, hard coded number
>> : of entries (128).  We require a much larger number of entries (on
>> : the order of a few thousand), so much of the google kernel patch deals
>> : with expanding the e820 table.
>
> This has not been true for a long time.

Good point.  There's the MAX_NODES that expands it, though it's still
hard coded, and as I understand, intended for NUMA node entries.  We
need anywhere from 8K to 64K 'bad' entries.  This creates holes and
translates to twice as many entries in the e820.  We only want to
allow this memory if it's needed, instead of hard coding it.

>
>> I have a couple of thoughts here:
>>
>> - If this patchset is merged and a major user such as google is
>>   unable to use it and has to continue to carry a separate patch then
>>   that's a regrettable situation for the upstream kernel.
>>
>> - Google's is, afaik, the largest use case we know of: zillions of
>>   machines for a number of years.  And this real-world experience tells
>>   us that the badram patchset has shortcomings.  Shortcomings which we
>>   can expect other users to experience.
>>
>> So.  What are your thoughts on these issues?
>
> I think a binary structure fed as a linked list data object makes a lot
> more sense.  We already support feeding e820 entries in this way,
> bypassing the 128-entry limitation of the fixed table in the zeropage.
>
> The main issue then is priority; in particular memory marked UNUSABLE
> (type 5) in the fed-in e820 map will of course overlap entries with
> normal RAM (type 1) information in the native map; we need to make sure
> that the type 5 information takes priority.
>
>        -hpa
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-22 18:56       ` Andi Kleen
@ 2011-06-22 19:05         ` H. Peter Anvin
  2011-06-22 19:15           ` Andi Kleen
  0 siblings, 1 reply; 49+ messages in thread
From: H. Peter Anvin @ 2011-06-22 19:05 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andrew Morton, Stefan Assmann, linux-mm, linux-kernel, tony.luck,
	mingo, rick, rdunlap, Nancy Yuen, Michael Ditto

On 06/22/2011 11:56 AM, Andi Kleen wrote:
> 
> You'll always have multiple ways. Whatever magic you come up for
> the google BIOS or for EFI won't help the majority of users with
> old crufty legacy BIOS.
> 

I don't think this has anything to do with this.

	-hpa

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-22 19:01     ` Nancy Yuen
@ 2011-06-22 19:06       ` H. Peter Anvin
  0 siblings, 0 replies; 49+ messages in thread
From: H. Peter Anvin @ 2011-06-22 19:06 UTC (permalink / raw)
  To: Nancy Yuen
  Cc: Andrew Morton, Stefan Assmann, linux-mm, linux-kernel, tony.luck,
	andi, mingo, rick, rdunlap, Michael Ditto

On 06/22/2011 12:01 PM, Nancy Yuen wrote:
> 
> Good point.  There's the MAX_NODES that expands it, though it's still
> hard coded, and as I understand, intended for NUMA node entries.  We
> need anywhere from 8K to 64K 'bad' entries.  This creates holes and
> translates to twice as many entries in the e820.  We only want to
> allow this memory if it's needed, instead of hard coding it.
> 

It should be dynamic, probably.  We can waste memory during early
reclaim, but the memblock stuff should be dynamic.

	-hpa

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-22 19:05         ` H. Peter Anvin
@ 2011-06-22 19:15           ` Andi Kleen
  2011-06-22 20:25             ` H. Peter Anvin
  0 siblings, 1 reply; 49+ messages in thread
From: Andi Kleen @ 2011-06-22 19:15 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Andi Kleen, Andrew Morton, Stefan Assmann, linux-mm, linux-kernel,
	tony.luck, mingo, rick, rdunlap, Nancy Yuen, Michael Ditto

On Wed, Jun 22, 2011 at 12:05:07PM -0700, H. Peter Anvin wrote:
> On 06/22/2011 11:56 AM, Andi Kleen wrote:
> > 
> > You'll always have multiple ways. Whatever magic you come up for
> > the google BIOS or for EFI won't help the majority of users with
> > old crufty legacy BIOS.
> > 
> 
> I don't think this has anything to do with this.

Please elaborate.

How would you pass the bad page information instead in a fully backwards
compatible way?

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-22 18:00 ` Andrew Morton
                     ` (3 preceding siblings ...)
  2011-06-22 18:24   ` Andi Kleen
@ 2011-06-22 20:18   ` Stefan Assmann
  2011-06-23 10:33     ` Rick van Rein
  2011-06-23 10:10   ` Rick van Rein
  5 siblings, 1 reply; 49+ messages in thread
From: Stefan Assmann @ 2011-06-22 20:18 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, tony.luck, andi, mingo, hpa, rick,
	rdunlap, Nancy Yuen, Michael Ditto

On 22.06.2011 20:00, Andrew Morton wrote:
> On Wed, 22 Jun 2011 13:18:51 +0200 Stefan Assmann <sassmann@kpanic.de> wrote:
> 

[...]

>> The idea is to allow the user to specify RAM addresses that shouldn't be
>> touched by the OS, because they are broken in some way. Not all machines have
>> hardware support for hwpoison, ECC RAM, etc, so here's a solution that allows to
>> use bitmasks to mask address patterns with the new "badram" kernel command line
>> parameter.
>> Memtest86 has an option to generate these patterns since v2.3 so the only thing
>> for the user to do should be:
>> - run Memtest86
>> - note down the pattern
>> - add badram=<pattern> to the kernel command line
>>
>> The concerning pages are then marked with the hwpoison flag and thus won't be
>> used by the memory managment system.
> 
> The google kernel has a similar capability.  I asked Nancy to comment
> on these patches and she said:

This is the first time I hear about this feature from Google. If I had
known about it I sure would have talked to the person responsible.

> 
> : One, the bad addresses are passed via the kernel command line, which
> : has a limited length.  It's okay if the addresses can be fit into a
> : pattern, but that's not necessarily the case in the google kernel.  And
> : even with patterns, the limit on the command line length limits the
> : number of patterns that user can specify.  Instead we use lilo to pass
> : a file containing the bad pages in e820 format to the kernel.

I see no reason why there couldn't be multiple ways of specifying bad
addresses.

> : 
> : Second, the BadRAM patch expands the address patterns from the command
> : line into individual entries in the kernel's e820 table.  The e820
> : table is a fixed buffer that supports a very small, hard coded number
> : of entries (128).  We require a much larger number of entries (on
> : the order of a few thousand), so much of the google kernel patch deals
> : with expanding the e820 table. Also, with the BadRAM patch, entries
> : that don't fit in the table are silently dropped and this isn't
> : appropriate for us.

So far the use case I had in mind wasn't "thousands of entries". However
expanding the e820 table is probably an issue that could be dealt with
separately ?

> : 
> : Another caveat of mapping out too much bad memory in general.  If too
> : much memory is removed from low memory, a system may not boot.  We
> : solve this by generating good maps.  Our userspace tools do not map out
> : memory below a certain limit, and it verifies against a system's iomap
> : that only addresses from memory is mapped out.

Well if too much low memory is bad, you're screwed anyway, not? :)

> 
> I have a couple of thoughts here:
> 
> - If this patchset is merged and a major user such as google is
>   unable to use it and has to continue to carry a separate patch then
>   that's a regrettable situation for the upstream kernel.

I'm all ears for making things work out for potential users, I just
didn't know.

> 
> - Google's is, afaik, the largest use case we know of: zillions of
>   machines for a number of years.  And this real-world experience tells
>   us that the badram patchset has shortcomings.  Shortcomings which we
>   can expect other users to experience.
> 
> So.  What are your thoughts on these issues?

I'm aware that the implementation I posted is not covering *everything*.
It's a start and I tried to keep it simple and make use of already
existing infrastructure.
At the moment I don't see any arguments why this patchset couldn't play
along nicely or get enhanced to support what Google needs, but I don't
know Googles patches yet.

Thanks!

  Stefan

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-22 19:15           ` Andi Kleen
@ 2011-06-22 20:25             ` H. Peter Anvin
  2011-06-22 20:28               ` Andi Kleen
  0 siblings, 1 reply; 49+ messages in thread
From: H. Peter Anvin @ 2011-06-22 20:25 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andrew Morton, Stefan Assmann, linux-mm, linux-kernel, tony.luck,
	mingo, rick, rdunlap, Nancy Yuen, Michael Ditto

On 06/22/2011 12:15 PM, Andi Kleen wrote:
> On Wed, Jun 22, 2011 at 12:05:07PM -0700, H. Peter Anvin wrote:
>> On 06/22/2011 11:56 AM, Andi Kleen wrote:
>>>
>>> You'll always have multiple ways. Whatever magic you come up for
>>> the google BIOS or for EFI won't help the majority of users with
>>> old crufty legacy BIOS.
>>>
>>
>> I don't think this has anything to do with this.
> 
> Please elaborate.
> 
> How would you pass the bad page information instead in a fully backwards
> compatible way?
> 

Depends on what you mean with "fully backward compatible".  In some ways
this is a nonsense statement since if we create anything new older
kernels will not run.

However, the other discussions in this thread have been about injecting
data in kernel-specific data structures and thus aren't dependent on the
firmware layer used.

The fully backward compatible way is "memmap=<address>$<length>".

	-hpa

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-22 20:25             ` H. Peter Anvin
@ 2011-06-22 20:28               ` Andi Kleen
  0 siblings, 0 replies; 49+ messages in thread
From: Andi Kleen @ 2011-06-22 20:28 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Andi Kleen, Andrew Morton, Stefan Assmann, linux-mm, linux-kernel,
	tony.luck, mingo, rick, rdunlap, Nancy Yuen, Michael Ditto

> The fully backward compatible way is "memmap=<address>$<length>".

This doesn't really work for patterns. badmem is about making patterns/
strides/etc.  work as far as I understand. Those are very common
with modern interleaving schemes.

Please read the original patchkit and its documentation.

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-22 18:15 ` H. Peter Anvin
@ 2011-06-22 20:30   ` Stefan Assmann
  2011-06-22 20:33     ` H. Peter Anvin
  0 siblings, 1 reply; 49+ messages in thread
From: Stefan Assmann @ 2011-06-22 20:30 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: linux-mm, linux-kernel, akpm, tony.luck, andi, mingo, rick,
	rdunlap

On 22.06.2011 20:15, H. Peter Anvin wrote:
> On 06/22/2011 04:18 AM, Stefan Assmann wrote:
>>
>> The idea is to allow the user to specify RAM addresses that shouldn't be
>> touched by the OS, because they are broken in some way. Not all machines have
>> hardware support for hwpoison, ECC RAM, etc, so here's a solution that allows to
>> use bitmasks to mask address patterns with the new "badram" kernel command line
>> parameter.
>> Memtest86 has an option to generate these patterns since v2.3 so the only thing
>> for the user to do should be:
>> - run Memtest86
>> - note down the pattern
>> - add badram=<pattern> to the kernel command line
>>
> 
> We already support the equivalent functionality with
> memmap=<address>$<length> for those with only a few ranges... this has
> been supported for ages, literally.  For those with a lot of ranges,
> like Google, the command line is insufficient.

Right, I think this has been discussed a while ago. So the advantages I
see in this approach are. It allows to break down memory exclusion to
the page level with a pattern of non-consecutive pages. So if every
other page would be considered bad that's a bit tough to deal with using
memmap.
Secondly patterns can be easily generated by running Memtest86 and thus
easily be fed to the kernel by command line. Making it much more feasible
for the average user to take advantage of it.

  Stefan

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-22 20:30   ` Stefan Assmann
@ 2011-06-22 20:33     ` H. Peter Anvin
  0 siblings, 0 replies; 49+ messages in thread
From: H. Peter Anvin @ 2011-06-22 20:33 UTC (permalink / raw)
  To: Stefan Assmann
  Cc: linux-mm, linux-kernel, akpm, tony.luck, andi, mingo, rick,
	rdunlap

On 06/22/2011 01:30 PM, Stefan Assmann wrote:
> On 22.06.2011 20:15, H. Peter Anvin wrote:
>> On 06/22/2011 04:18 AM, Stefan Assmann wrote:
>>>
>>> The idea is to allow the user to specify RAM addresses that shouldn't be
>>> touched by the OS, because they are broken in some way. Not all machines have
>>> hardware support for hwpoison, ECC RAM, etc, so here's a solution that allows to
>>> use bitmasks to mask address patterns with the new "badram" kernel command line
>>> parameter.
>>> Memtest86 has an option to generate these patterns since v2.3 so the only thing
>>> for the user to do should be:
>>> - run Memtest86
>>> - note down the pattern
>>> - add badram=<pattern> to the kernel command line
>>>
>>
>> We already support the equivalent functionality with
>> memmap=<address>$<length> for those with only a few ranges... this has
>> been supported for ages, literally.  For those with a lot of ranges,
>> like Google, the command line is insufficient.
> 
> Right, I think this has been discussed a while ago. So the advantages I
> see in this approach are. It allows to break down memory exclusion to
> the page level with a pattern of non-consecutive pages. So if every
> other page would be considered bad that's a bit tough to deal with using
> memmap.
> Secondly patterns can be easily generated by running Memtest86 and thus
> easily be fed to the kernel by command line. Making it much more feasible
> for the average user to take advantage of it.
> 

How common are nontrivial patterns on real hardware?  This would be
interesting to hear from Google or another large user.

If so, we should probably introduce this as another linked-list data
structure; we can allow it to be preprocessed from the command line if
need be.

I have to say I think Google's point that truncating the list is
unacceptable... that would mean running in a known-bad configuration,
and even a hard crash would be better.

	-hpa

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-22 18:00 ` Andrew Morton
                     ` (4 preceding siblings ...)
  2011-06-22 20:18   ` Stefan Assmann
@ 2011-06-23 10:10   ` Rick van Rein
  5 siblings, 0 replies; 49+ messages in thread
From: Rick van Rein @ 2011-06-23 10:10 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Stefan Assmann, linux-mm, linux-kernel, tony.luck, andi, mingo,
	hpa, rick, rdunlap, Nancy Yuen, Michael Ditto

Hello,

> > The concerning pages are then marked with the hwpoison flag and thus won't be
> > used by the memory managment system.
> 
> The google kernel has a similar capability.  I asked Nancy to comment
> on these patches and she said:
> 
> : One, the bad addresses are passed via the kernel command line, which
> : has a limited length.  It's okay if the addresses can be fit into a
> : pattern, but that's not necessarily the case in the google kernel.

They are guaranteed to fit in 5 patterns (and even that is a choice).
The BadRAM pattern printing option built into Memtest86 will never
create more than that.  If your memory is really screwed, it will
simply make patterns so generic that at least all the faults are
covered.

The figure 5 is a bit arbitrary, but was chosen in a time that we all
used LILO and had to live with its limited cmdline length.  GRUB is
more relaxed in that respect, but there has never been a need to go
beyond five.  Most errors are regular patterns (because an entire row,
or an entire column is damaged if not just a single cell is affected)
that will fit into a limited number of patterns without a need for so
many.

> : And
> : even with patterns, the limit on the command line length limits the
> : number of patterns that user can specify.  Instead we use lilo to pass
> : a file containing the bad pages in e820 format to the kernel.

I've looked into the aproach of e820 and actually turned away from it.
The e820 format does not permit to specify the regularity that comes
with real-life memory problems.  Having made the BadRAM patch, I've seen
numerous examples, and they all came down to single-cell errors and
either one or more rows and/or one or more columns of cells.  There has
never been a reporting of such erratic destruction that it could not
comfortably (that is, with minimal pages sacrificed) fit in the limit
of 5 patterns that Memtest86 (not BadRAM) imposes.  I'm pretty sure I
would have heard about it if there had been any such problems, given
the interactivity of people who had gone through all the effort of
patching a kernel.  Kernel patchers are not usually the silent kind
when it comes to an opportunity to improve Linux ;-)

> : Second, the BadRAM patch expands the address patterns from the command
> : line into individual entries in the kernel's e820 table.  The e820
> : table is a fixed buffer [...]

This is not how BadRAM works -- it will set a page flag for defected
pages in Linux' page table.  It does this before getting to the stage
where all pages are initially 'freed' into the memory pool, and can
thus avoid that damaged pages are ever released for allocation.

> : We require a much larger number of entries (on
> : the order of a few thousand), so much of the google kernel patch deals
> : with expanding the e820 table.

Interesting.  I have made a deliberate choice not to go that way,
but that was because we were looking at e820 as a communications
mechanism between a BadRAM-supportive GRUB and the kernel.  The
advantage of that would have been to do it before the kernel.

Indeed, if you take this route you will see a severe expansion of the
e820 table.  A damaged row (or column) does indeed lead to 4096 or so
error spots, that is quite common.

I'd like to know -- are the pages with faults that you have not also
organised in a regular pattern, which is what BadRAM addresses?  If
not, that would be a strongly countering argument for the
pattern-based approach of BadRAM, but I would be really surprised if
one or two patterns (or up to five) could not comfortably describe
the error patterns -- as they were designed to match how memory
hardware actually work.

Also, if you find 4093 error pages, you would not generalise it to
a 4096 page error, right?  I would not feel comfortable in that
case.

> : Also, with the BadRAM patch, entries
> : that don't fit in the table are silently dropped and this isn't
> : appropriate for us.

The e820 page is not used, so nothing is silently dropped.  BadRAM
would rather err at the expense of a few pages than miss an opportunity
to fix a problem.  There's nothing Google-specific about that wish :-)

> : Another caveat of mapping out too much bad memory in general.

Never seen that, or heard complaints about it, in over 10 years.  Do
you have examples on the contrary, or is this merely a concern?

> : If too
> : much memory is removed from low memory, a system may not boot.  We
> : solve this by generating good maps.  Our userspace tools do not map out
> : memory below a certain limit, and it verifies against a system's iomap
> : that only addresses from memory is mapped out.

I've seen rare occasions where a system could not be helped due to a
bug in the low parts of memory, indeed.  Maybe 1 or 2 cases in >10 years.

> - If this patchset is merged and a major user such as google is
>   unable to use it and has to continue to carry a separate patch then
>   that's a regrettable situation for the upstream kernel.

First, I wonder if there is any conflict at all.  If someone wanted
to use their own local approach, such as one based on e820 tables, I
don't think there would be any interference?

But I doubt that Google's requirements are that different from those of
other users.  BadRAM adds a layer of abstraction, but this is not an
office worker's abstraction -- instead it reflects the structures of
hardware, leading to the BadRAM pattern abstraction.  I really believe
that Google would be able to work easily with the BadRAM patch if it
was in conflict with their e820-based approach.

> - Google's is, afaik, the largest use case we know of: zillions of
>   machines for a number of years.  And this real-world experience tells
>   us that the badram patchset has shortcomings.  Shortcomings which we
>   can expect other users to experience.

Please, do show examples and figures of how common they are if you have
anything concrete to counter the pattern-based approach.  I am eager
to learn if my experience with a diverse set of individual cases for over
a decade has any shortcomings.


Best wishes,
 -Rick

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-22 20:18   ` Stefan Assmann
@ 2011-06-23 10:33     ` Rick van Rein
  2011-06-23 10:49       ` Rick van Rein
  0 siblings, 1 reply; 49+ messages in thread
From: Rick van Rein @ 2011-06-23 10:33 UTC (permalink / raw)
  To: H. Peter Anvin, Stefan Assmann
  Cc: linux-mm, linux-kernel, akpm, tony.luck, andi, mingo, rick,
	rdunlap, Nancy Yuen, Michael Ditto

Hello,

> We already support the equivalent functionality with
> memmap=<address>$<length> for those with only a few ranges...

This is not a realistic option for people whose memory failed.
Google is quite right when they say they hit thousands of erroneous
pages.  If you have, say, a static discharge damaging the buffers
from the cell array to the outside world, then the entire row or
column behind that buffer will fail.  I've seen many such examples.

> For those with a lot of ranges,
> like Google, the command line is insufficient.

Not if you recognise that there is a pattern :-)

Google does not seem to have realised that, and is simply listing
the pages that are defected.  IMHO, but being the BadRAM author I
can hardly be called objective, this is the added value of BadRAM,
that it understands the nature of the problem and solves it with
an elegant concept at the right level of abstraction.

> So far the use case I had in mind wasn't "thousands of entries". However
> expanding the e820 table is probably an issue that could be dealt with
> separately ?

This could help with other approaches as well -- as mentioned,
there have been attempts to get BadRAM into GRUB, so that the
kernel needn't be aware of it.  But adding BadRAM or expanding
the e820 table are both cases of changing the kernel, and in that
case I thought it'd be best to actually solve the problem and
not upgrade the messenger.

> Well if too much low memory is bad, you're screwed anyway, not? :)

If the kernel is always loaded in a fixed location, yes.  That
is one assumption that the kernel makes (made?) that will only
work if all your memory is good.

> At the moment I don't see any arguments why this patchset couldn't play
> along nicely or get enhanced to support what Google needs, but I don't
> know Googles patches yet.

Changes to e820 should not interfere with setting flags (and
living by them) for failing memory pages.  One property of BadRAM,
namely that it does not slow down your system (you have less
pages on hand, but that's all) may or may not apply to an e820-based
approach.  I don't know if e820 is ever consulted after boot?

> How common are nontrivial patterns on real hardware?  This would be
> interesting to hear from Google or another large user.

Yes.  And "non-trivial" would mean that the patterns waste more space
than fair, *because of* the generalisation to patterns.

If you plug 10 DIMMs into your machine, and each has a faulty row
somewhere, then you will get into trouble if you stick to 5 patterns.
But if you happen to run into a faulty DIMM from time to time, the
patterns should be your way out.

> I have to say I think Google's point that truncating the list is
> unacceptable...

Of course, that is true.  This is why memmap=... does not work.
It has nothing to do with BadRAM however, there will never be more
than 5 patterns.

> that would mean running in a known-bad configuration,
> and even a hard crash would be better.

...which is so sensible that it was of course taken into account in
the BadRAM design!


Cheers,
 -Rick

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-23 10:33     ` Rick van Rein
@ 2011-06-23 10:49       ` Rick van Rein
  0 siblings, 0 replies; 49+ messages in thread
From: Rick van Rein @ 2011-06-23 10:49 UTC (permalink / raw)
  To: Rick van Rein
  Cc: H. Peter Anvin, Stefan Assmann, linux-mm, linux-kernel, akpm,
	tony.luck, andi, mingo, rdunlap, Nancy Yuen, Michael Ditto

Hello,

My last email may have assumed that you knew all about BadRAM; this
is probably worth an expansion:

> If you plug 10 DIMMs into your machine, and each has a faulty row
> somewhere, then you will get into trouble if you stick to 5 patterns.

With "trouble" I mean that a 6th pattern would be merged with the
nearest of the already-found 5 patterns.  It may be that this leads
to a pattern that covers more addresses than strictly needed.  This
is how I can guarantee that there are never more than 5 patterns,
and so never more than the cmdline can take.  No cut-offs are made.

> But if you happen to run into a faulty DIMM from time to time, the
> patterns should be your way out.

...without needing to be more general than really required.  Of course,
if all your PCs ran on 10 DIMMs, you could expand the number of
patterns to a comfortably higher number, but what I've seen with the
various cases I've supported, this has never been necessary.

> > that would mean running in a known-bad configuration,
> > and even a hard crash would be better.
> 
> ...which is so sensible that it was of course taken into account in
> the BadRAM design!

Meaning, that is why patterns are merged if the exceed the rather high
number of 5 patterns.  Rather waste those extra pages than running
into a known fault.

This high number of patterns is not at all common, however, making it
safe to assume that the figure is high enough, in spite of leaving
space on even LILO's cmdline to support adding several other tweaks.


-Rick

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-22 11:18 Stefan Assmann
  2011-06-22 18:00 ` Andrew Morton
  2011-06-22 18:15 ` H. Peter Anvin
@ 2011-06-23 13:39 ` Matthew Garrett
  2011-06-23 14:08   ` Stefan Assmann
  2 siblings, 1 reply; 49+ messages in thread
From: Matthew Garrett @ 2011-06-23 13:39 UTC (permalink / raw)
  To: Stefan Assmann
  Cc: linux-mm, linux-kernel, akpm, tony.luck, andi, mingo, hpa, rick,
	rdunlap

On Wed, Jun 22, 2011 at 01:18:51PM +0200, Stefan Assmann wrote:
> Following the RFC for the BadRAM feature here's the updated version with
> spelling fixes, thanks go to Randy Dunlap. Also the code is now less verbose,
> as requested by Andi Kleen.
> v2 with even more spelling fixes suggested by Randy.
> Patches are against vanilla 2.6.39.
> Repost with LKML in Cc as suggested by Andrew Morton.

Would it be more reasonable to do this in the bootloader? You'd ideally 
want this to be done as early as possible in order to avoid awkward 
situations like your ramdisk ending up in the bad RAM area.

-- 
Matthew Garrett | mjg59@srcf.ucam.org

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-23 13:39 ` Matthew Garrett
@ 2011-06-23 14:08   ` Stefan Assmann
  2011-06-23 14:12     ` Matthew Garrett
  0 siblings, 1 reply; 49+ messages in thread
From: Stefan Assmann @ 2011-06-23 14:08 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: linux-mm, linux-kernel, akpm, tony.luck, andi, mingo, hpa, rick,
	rdunlap

On 23.06.2011 15:39, Matthew Garrett wrote:
> On Wed, Jun 22, 2011 at 01:18:51PM +0200, Stefan Assmann wrote:
>> Following the RFC for the BadRAM feature here's the updated version with
>> spelling fixes, thanks go to Randy Dunlap. Also the code is now less verbose,
>> as requested by Andi Kleen.
>> v2 with even more spelling fixes suggested by Randy.
>> Patches are against vanilla 2.6.39.
>> Repost with LKML in Cc as suggested by Andrew Morton.
> 
> Would it be more reasonable to do this in the bootloader? You'd ideally 
> want this to be done as early as possible in order to avoid awkward 
> situations like your ramdisk ending up in the bad RAM area.

Not sure what exactly you are suggesting here. The kernel somehow needs
to know what memory areas to avoid so we supply this information via
kernel command line.
What the bootloader could do is to allow the kernel/initrd to be loaded
at an alternative address. That's briefly mentioned in the BadRAM
Documentation as well. Is that what you mean or am I missing something?

  Stefan

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-23 14:08   ` Stefan Assmann
@ 2011-06-23 14:12     ` Matthew Garrett
  2011-06-23 15:37       ` Stefan Assmann
  0 siblings, 1 reply; 49+ messages in thread
From: Matthew Garrett @ 2011-06-23 14:12 UTC (permalink / raw)
  To: Stefan Assmann
  Cc: linux-mm, linux-kernel, akpm, tony.luck, andi, mingo, hpa, rick,
	rdunlap

On Thu, Jun 23, 2011 at 04:08:32PM +0200, Stefan Assmann wrote:
> On 23.06.2011 15:39, Matthew Garrett wrote:
> > Would it be more reasonable to do this in the bootloader? You'd ideally 
> > want this to be done as early as possible in order to avoid awkward 
> > situations like your ramdisk ending up in the bad RAM area.
> 
> Not sure what exactly you are suggesting here. The kernel somehow needs
> to know what memory areas to avoid so we supply this information via
> kernel command line.
> What the bootloader could do is to allow the kernel/initrd to be loaded
> at an alternative address. That's briefly mentioned in the BadRAM
> Documentation as well. Is that what you mean or am I missing something?

For EFI booting we just hand an e820 map to the kernel. It ought to be 
easy enough to add support for that to the 16-bit entry point as well. 
Then the bootloader just needs to construct an e820 map of its own. I 
think grub2 actually already has some support for this. The advantage of 
this approach is that the knowledge of bad memory only has to exist in 
one place (ie, the bootloader) - the kernel can remain blisfully 
unaware.

-- 
Matthew Garrett | mjg59@srcf.ucam.org

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-23 14:12     ` Matthew Garrett
@ 2011-06-23 15:37       ` Stefan Assmann
  2011-06-23 16:30         ` H. Peter Anvin
  2011-06-23 17:00         ` Andi Kleen
  0 siblings, 2 replies; 49+ messages in thread
From: Stefan Assmann @ 2011-06-23 15:37 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: linux-mm, linux-kernel, akpm, tony.luck, andi, mingo, hpa, rick,
	rdunlap

On 23.06.2011 16:12, Matthew Garrett wrote:
> On Thu, Jun 23, 2011 at 04:08:32PM +0200, Stefan Assmann wrote:
>> On 23.06.2011 15:39, Matthew Garrett wrote:
>>> Would it be more reasonable to do this in the bootloader? You'd ideally 
>>> want this to be done as early as possible in order to avoid awkward 
>>> situations like your ramdisk ending up in the bad RAM area.
>>
>> Not sure what exactly you are suggesting here. The kernel somehow needs
>> to know what memory areas to avoid so we supply this information via
>> kernel command line.
>> What the bootloader could do is to allow the kernel/initrd to be loaded
>> at an alternative address. That's briefly mentioned in the BadRAM
>> Documentation as well. Is that what you mean or am I missing something?
> 
> For EFI booting we just hand an e820 map to the kernel. It ought to be 
> easy enough to add support for that to the 16-bit entry point as well. 
> Then the bootloader just needs to construct an e820 map of its own. I 
> think grub2 actually already has some support for this. The advantage of 
> this approach is that the knowledge of bad memory only has to exist in 
> one place (ie, the bootloader) - the kernel can remain blisfully 
> unaware.
> 

According to Rick's reply in this thread a damaged row in a DIMM can
easily cause a few thousand entries in the e820 table because it doesn't
handle patterns. So the question I'm asking is, is it acceptable to
have an e820 table with thousands maybe ten-thousands of entries?
I really have no idea of the implications, maybe somebody else can
comment on that.

  Stefan

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-23 15:37       ` Stefan Assmann
@ 2011-06-23 16:30         ` H. Peter Anvin
  2011-06-24  0:59           ` Andi Kleen
  2011-06-23 17:00         ` Andi Kleen
  1 sibling, 1 reply; 49+ messages in thread
From: H. Peter Anvin @ 2011-06-23 16:30 UTC (permalink / raw)
  To: Stefan Assmann
  Cc: Matthew Garrett, linux-mm, linux-kernel, akpm, tony.luck, andi,
	mingo, rick, rdunlap

On 06/23/2011 08:37 AM, Stefan Assmann wrote:
> 
> According to Rick's reply in this thread a damaged row in a DIMM can
> easily cause a few thousand entries in the e820 table because it doesn't
> handle patterns. So the question I'm asking is, is it acceptable to
> have an e820 table with thousands maybe ten-thousands of entries?
> I really have no idea of the implications, maybe somebody else can
> comment on that.
> 

Given that that is what actually ends up happening in the kernel at some
point anyway, I don't see why it would matter.

The bubble sort has to go, but quite frankly stress-testing the range
handling isn't a bad thing.
	
	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-23 15:37       ` Stefan Assmann
  2011-06-23 16:30         ` H. Peter Anvin
@ 2011-06-23 17:00         ` Andi Kleen
  2011-06-23 17:12           ` Luck, Tony
  1 sibling, 1 reply; 49+ messages in thread
From: Andi Kleen @ 2011-06-23 17:00 UTC (permalink / raw)
  To: Stefan Assmann
  Cc: Matthew Garrett, linux-mm, linux-kernel, akpm, tony.luck, andi,
	mingo, hpa, rick, rdunlap

> According to Rick's reply in this thread a damaged row in a DIMM can
> easily cause a few thousand entries in the e820 table because it doesn't
> handle patterns. So the question I'm asking is, is it acceptable to
> have an e820 table with thousands maybe ten-thousands of entries?
> I really have no idea of the implications, maybe somebody else can
> comment on that.

I don't think it makes sense to handle something like that with a list.
The compact representation currently in badram is great for that.

-Andi

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* RE: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-23 17:00         ` Andi Kleen
@ 2011-06-23 17:12           ` Luck, Tony
  2011-06-24  1:03             ` Craig Bergstrom
  0 siblings, 1 reply; 49+ messages in thread
From: Luck, Tony @ 2011-06-23 17:12 UTC (permalink / raw)
  To: Andi Kleen, Stefan Assmann
  Cc: Matthew Garrett, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, mingo@elte.hu, hpa@zytor.com,
	rick@vanrein.org, rdunlap@xenotime.net

> I don't think it makes sense to handle something like that with a list.
> The compact representation currently in badram is great for that.

I'd tend to agree here.  Rick has made a convincing argument that there
are significant numbers of real world cases where a defective row/column
in a DIMM results in a predictable pattern of errors.  The ball is now
in Google's court to take a look at their systems that have high numbers
of errors to see if they can actually be described by a small number
of BadRAM patterns as Rick has claimed.

-Tony

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-23 16:30         ` H. Peter Anvin
@ 2011-06-24  0:59           ` Andi Kleen
  0 siblings, 0 replies; 49+ messages in thread
From: Andi Kleen @ 2011-06-24  0:59 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Stefan Assmann, Matthew Garrett, linux-mm, linux-kernel, akpm,
	tony.luck, andi, mingo, rick, rdunlap

On Thu, Jun 23, 2011 at 09:30:37AM -0700, H. Peter Anvin wrote:
> On 06/23/2011 08:37 AM, Stefan Assmann wrote:
> > 
> > According to Rick's reply in this thread a damaged row in a DIMM can
> > easily cause a few thousand entries in the e820 table because it doesn't
> > handle patterns. So the question I'm asking is, is it acceptable to
> > have an e820 table with thousands maybe ten-thousands of entries?
> > I really have no idea of the implications, maybe somebody else can
> > comment on that.
> > 
> 
> Given that that is what actually ends up happening in the kernel at some
> point anyway, 

hwpoison can poison most pages without any lists.  Read Stefan's original patch.

The only thing that needs list really is conflict handling with
early allocations.

-Andi

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-23 17:12           ` Luck, Tony
@ 2011-06-24  1:03             ` Craig Bergstrom
  2011-06-24  1:08               ` Andi Kleen
  2011-06-24  8:05               ` Rick van Rein
  0 siblings, 2 replies; 49+ messages in thread
From: Craig Bergstrom @ 2011-06-24  1:03 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Andi Kleen, Stefan Assmann, Matthew Garrett, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
	mingo@elte.hu, hpa@zytor.com, rick@vanrein.org,
	rdunlap@xenotime.net

[-- Attachment #1: Type: text/plain, Size: 1181 bytes --]

On Thu, Jun 23, 2011 at 10:12 AM, Luck, Tony <tony.luck@intel.com> wrote:

> > I don't think it makes sense to handle something like that with a list.
> > The compact representation currently in badram is great for that.
>
> I'd tend to agree here.  Rick has made a convincing argument that there
> are significant numbers of real world cases where a defective row/column
> in a DIMM results in a predictable pattern of errors.  The ball is now
> in Google's court to take a look at their systems that have high numbers
> of errors to see if they can actually be described by a small number
> of BadRAM patterns as Rick has claimed.
>
>
Hi All,

We (Google) are working on a data-driven answer for this question.  I know
that there has been some analysis on this topic on the past, but I don't
want to speculate until we've had some time to put all the pieces together.
 Stay tuned for specifics.

Cheers,
CraigB



> -Tony
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>

[-- Attachment #2: Type: text/html, Size: 2022 bytes --]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-24  1:03             ` Craig Bergstrom
@ 2011-06-24  1:08               ` Andi Kleen
  2011-06-24  1:22                 ` Craig Bergstrom
  2011-06-24  8:05               ` Rick van Rein
  1 sibling, 1 reply; 49+ messages in thread
From: Andi Kleen @ 2011-06-24  1:08 UTC (permalink / raw)
  To: Craig Bergstrom
  Cc: Luck, Tony, Andi Kleen, Stefan Assmann, Matthew Garrett,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, mingo@elte.hu, hpa@zytor.com,
	rick@vanrein.org, rdunlap@xenotime.net

> We (Google) are working on a data-driven answer for this question.  I know
> that there has been some analysis on this topic on the past, but I don't
> want to speculate until we've had some time to put all the pieces together.
>  Stay tuned for specifics.

It would be also good if you posted your kernel patches.

It's highly unusual -- to say the least -- to let someone's openly
developed and posted patchkit compete with someone's else secret 
internal solution for review purposes.

-Andi

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-24  1:08               ` Andi Kleen
@ 2011-06-24  1:22                 ` Craig Bergstrom
  0 siblings, 0 replies; 49+ messages in thread
From: Craig Bergstrom @ 2011-06-24  1:22 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Luck, Tony, Stefan Assmann, Matthew Garrett, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
	mingo@elte.hu, hpa@zytor.com, rick@vanrein.org,
	rdunlap@xenotime.net

On Thu, Jun 23, 2011 at 6:08 PM, Andi Kleen <andi@firstfloor.org> wrote:
>> We (Google) are working on a data-driven answer for this question.  I know
>> that there has been some analysis on this topic on the past, but I don't
>> want to speculate until we've had some time to put all the pieces together.
>>  Stay tuned for specifics.
>
> It would be also good if you posted your kernel patches.
>
> It's highly unusual -- to say the least -- to let someone's openly
> developed and posted patchkit compete with someone's else secret
> internal solution for review purposes.

Hi Andi,

This is quite hard to argue with.  Let me see what I can do.

Cheers,
CraigB

> -Andi
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-24  1:03             ` Craig Bergstrom
  2011-06-24  1:08               ` Andi Kleen
@ 2011-06-24  8:05               ` Rick van Rein
  2011-06-24 14:34                 ` Craig Bergstrom
  2011-06-24 16:16                 ` H. Peter Anvin
  1 sibling, 2 replies; 49+ messages in thread
From: Rick van Rein @ 2011-06-24  8:05 UTC (permalink / raw)
  To: Craig Bergstrom
  Cc: Luck, Tony, Andi Kleen, Stefan Assmann, Matthew Garrett,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, mingo@elte.hu, hpa@zytor.com,
	rick@vanrein.org, rdunlap@xenotime.net

Hi Craig,

> We (Google) are working on a data-driven answer for this question.  I know
> that there has been some analysis on this topic on the past, but I don't
> want to speculate until we've had some time to put all the pieces together.

The easiest way to do this could be to take the algorithm from Memtest86
and apply it to your data, to see if it finds suitable patterns for the
cases tried.

By counting bits set to zero in the masks, you could then determine how
'tight' they are.  A mask with all-ones covers one memory page; each
zero bit in the mask (outside of the CPU's page size) doubles the number
of pages covered.

You can ignore the address over which the mask is applied, although you
would then be assuming that all the pages covered by the mask are indeed
filled with RAM.

You would want to add the figures for the different masks.

I am very curious about your findings.  Independently of those, I am in
favour of a patch that enables longer e820 tables if it has no further
impact on speed or space.


Cheers,
 -Rick

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-24  8:05               ` Rick van Rein
@ 2011-06-24 14:34                 ` Craig Bergstrom
  2011-06-24 16:16                 ` H. Peter Anvin
  1 sibling, 0 replies; 49+ messages in thread
From: Craig Bergstrom @ 2011-06-24 14:34 UTC (permalink / raw)
  To: Rick van Rein
  Cc: Luck, Tony, Andi Kleen, Stefan Assmann, Matthew Garrett,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, mingo@elte.hu, hpa@zytor.com,
	rdunlap@xenotime.net

[-- Attachment #1: Type: text/plain, Size: 1480 bytes --]

On Fri, Jun 24, 2011 at 1:05 AM, Rick van Rein <rick@vanrein.org> wrote:

> Hi Craig,
>
> > We (Google) are working on a data-driven answer for this question.  I
> know
> > that there has been some analysis on this topic on the past, but I don't
> > want to speculate until we've had some time to put all the pieces
> together.
>
> The easiest way to do this could be to take the algorithm from Memtest86
> and apply it to your data, to see if it finds suitable patterns for the
> cases tried.
>
> By counting bits set to zero in the masks, you could then determine how
> 'tight' they are.  A mask with all-ones covers one memory page; each
> zero bit in the mask (outside of the CPU's page size) doubles the number
> of pages covered.
>
> You can ignore the address over which the mask is applied, although you
> would then be assuming that all the pages covered by the mask are indeed
> filled with RAM.
>
> You would want to add the figures for the different masks.
>

This seems like a reasonable approach.  I know there was some analysis done,
and I'm doing my best to get the folks who made the original decision to
weigh in.


>
> I am very curious about your findings.  Independently of those, I am in
> favour of a patch that enables longer e820 tables if it has no further
> impact on speed or space.
>

I think that we'd all be satisfied with a mechanism that allows for badram
to be specified via both command line and an extended e820 map.


>
>
> Cheers,
>  -Rick
>

[-- Attachment #2: Type: text/html, Size: 2186 bytes --]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-24  8:05               ` Rick van Rein
  2011-06-24 14:34                 ` Craig Bergstrom
@ 2011-06-24 16:16                 ` H. Peter Anvin
  2011-06-24 16:40                   ` Luck, Tony
  1 sibling, 1 reply; 49+ messages in thread
From: H. Peter Anvin @ 2011-06-24 16:16 UTC (permalink / raw)
  To: Rick van Rein
  Cc: Craig Bergstrom, Luck, Tony, Andi Kleen, Stefan Assmann,
	Matthew Garrett, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, mingo@elte.hu, rdunlap@xenotime.net

On 06/24/2011 01:05 AM, Rick van Rein wrote:
> 
> I am very curious about your findings.  Independently of those, I am in
> favour of a patch that enables longer e820 tables if it has no further
> impact on speed or space.
> 

That is already in the mainline kernel, although only if fed from the
boot loader (it was developed in the context of mega-NUMA machines); the
stub fetching from INT 15h doesn't use this at the moment.

	-hpa

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* RE: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-24 16:16                 ` H. Peter Anvin
@ 2011-06-24 16:40                   ` Luck, Tony
  2011-06-24 16:56                     ` Rick van Rein
  0 siblings, 1 reply; 49+ messages in thread
From: Luck, Tony @ 2011-06-24 16:40 UTC (permalink / raw)
  To: H. Peter Anvin, Rick van Rein
  Cc: Craig Bergstrom, Andi Kleen, Stefan Assmann, Matthew Garrett,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, mingo@elte.hu, rdunlap@xenotime.net

> > I am very curious about your findings.  Independently of those, I am in
> > favour of a patch that enables longer e820 tables if it has no further
> > impact on speed or space.
> > 
>
> That is already in the mainline kernel, although only if fed from the
> boot loader (it was developed in the context of mega-NUMA machines); the
> stub fetching from INT 15h doesn't use this at the moment.

Does it scale?  Current X86 systems go up to about 2TB - presumably
in the form of 256 8GB DIMMs (or maybe 512 4GB ones).  If a faulty
row or column on a DIMM can give rise to 4K bad pages, then these
large systems could conceivably have 1-2 million bad pages (while
still being quite usable - a loss of 4-8G from a 2TB system is down
in the noise).  Can we handle a 2 million entry e820 table? Do we
want to?

Perhaps we may end up with a composite solution. Use e820 to map out
the bad pages below some limit (like 4GB). Preferably in the boot loader
so it can find a range of good memory to load the kernel. Then use
badRAM patterns for addresses over 4GB for Linux to avoid bad pages
by flagging their page structures.

-Tony

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-24 16:40                   ` Luck, Tony
@ 2011-06-24 16:56                     ` Rick van Rein
  2011-06-24 17:14                       ` H. Peter Anvin
  0 siblings, 1 reply; 49+ messages in thread
From: Rick van Rein @ 2011-06-24 16:56 UTC (permalink / raw)
  To: Luck, Tony
  Cc: H. Peter Anvin, Rick van Rein, Craig Bergstrom, Andi Kleen,
	Stefan Assmann, Matthew Garrett, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
	mingo@elte.hu, rdunlap@xenotime.net

Hello,

> Does it scale? [...] Perhaps we may end up with a composite solution. 

If I had my way, there would be an extension to the e820 format to allow
the BadRAM patterns to be specified.  Since the extension with bad page
information is specific to boot loader interaction, this would work in
exactly those cases that are covered by the current situation.

-Rick

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-24 16:56                     ` Rick van Rein
@ 2011-06-24 17:14                       ` H. Peter Anvin
  0 siblings, 0 replies; 49+ messages in thread
From: H. Peter Anvin @ 2011-06-24 17:14 UTC (permalink / raw)
  To: Rick van Rein
  Cc: Luck, Tony, Craig Bergstrom, Andi Kleen, Stefan Assmann,
	Matthew Garrett, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, mingo@elte.hu, rdunlap@xenotime.net

On 06/24/2011 09:56 AM, Rick van Rein wrote:
> Hello,
> 
>> Does it scale? [...] Perhaps we may end up with a composite solution. 
> 
> If I had my way, there would be an extension to the e820 format to allow
> the BadRAM patterns to be specified.  Since the extension with bad page
> information is specific to boot loader interaction, this would work in
> exactly those cases that are covered by the current situation.
> 

Yes, a different table might be worthwhile.

Another question, however, is what does this look like at runtime.  In
particular, if I'm not mistaken hwpoison will create struct pages for
these non-memory pages, which seems undesirable...

	-hpa

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
       [not found] <fa.fHPNPTsllvyE/7DxrKwiwgVbVww@ifi.uio.no>
@ 2011-06-24 21:10 ` Shane Nay
  2011-06-28  2:33   ` Craig Bergstrom
  0 siblings, 1 reply; 49+ messages in thread
From: Shane Nay @ 2011-06-24 21:10 UTC (permalink / raw)
  To: fa.linux.kernel
  Cc: H. Peter Anvin, Stefan Assmann, linux-mm, linux-kernel, akpm,
	tony.luck, andi, mingo, rick, rdunlap, Nancy Yuen, Michael Ditto


> > For those with a lot of ranges,
> > like Google, the command line is insufficient.
> 
> Not if you recognise that there is a pattern :-)
> 
> Google does not seem to have realised that, and is simply listing
> the pages that are defected.  IMHO, but being the BadRAM author I
> can hardly be called objective, this is the added value of BadRAM,
> that it understands the nature of the problem and solves it with
> an elegant concept at the right level of abstraction.

No, we have realized patterns when there is one.  It depends on the specific defect that is at play.  There are several different defect types, and incidence rate with respect to the defect being observed.  We do observe "classic" failures of the type you are describing, where with the physical addressing information (bank, row, column), we can reproducibly cause errors to occur along that path.

One problem is that badram syntax doesn't cleanly mesh with all modern systems.  For instance, not all chipsets have power-of-two bank interleave.  Holes in addressing also create trouble on some systems.

Other defects look like white noise, these are typically indicative of manufacturing process defects.

When we find a crisp-pattern in the data, it's not always the entirety of that bit-maskable pattern which is effected.  There can be interleaved subtractions from the underlying pattern orthogonal to interleave.

IMHO, badram is a good tool for it's intended purpose.  They aren't really mutually exclusive anyway.  We're cleaning up our existing patches to send out early next week.  However, we had at one time had a way of inserting badram syntax generated e820's from command line along with passed in e820's, and extended versions.  That bit isn't in our tree right now, but it's possible, and we're looking to see if we can make it work with the existing code.


> s (and
> living by them) for failing memory pages.  One property of BadRAM,
> namely that it does not slow down your system (you have less
> pages on hand, but that's all) may or may not apply to an e820-based
> approach.  I don't know if e820 is ever consulted after boot?
> 
> > How common are nontrivial patterns on real hardware?  This would be
> > interesting to hear from Google or another large user.
> 
> Yes.  And "non-trivial" would mean that the patterns waste more space
> than fair, *because of* the generalisation to patterns.
> 
> If you plug 10 DIMMs into your machine, and each has a faulty row
> somewhere, then you will get into trouble if you stick to 5 patterns.
> But if you happen to run into a faulty DIMM from time to time, the
> patterns should be your way out.
> 
> > I have to say I think Google's point that truncating the list is
> > unacceptable...
> 
> Of course, that is true.  This is why memmap=... does not work.
> It has nothing to do with BadRAM however, there will never be more
> than 5 patterns.
> 
> > that would mean running in a known-bad configuration,
> > and even a hard crash would be better.
> 
> ..which is so sensible that it was of course taken into account in
> the BadRAM design!
> 
> 
> Cheers,
>  -Rick
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majo...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-24 21:10 ` [PATCH v2 0/3] support for broken memory modules (BadRAM) Shane Nay
@ 2011-06-28  2:33   ` Craig Bergstrom
  2011-06-29  8:08     ` Rick van Rein
  0 siblings, 1 reply; 49+ messages in thread
From: Craig Bergstrom @ 2011-06-28  2:33 UTC (permalink / raw)
  To: fa.linux.kernel
  Cc: H. Peter Anvin, Stefan Assmann, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
	Luck, Tony, Andi Kleen, mingo@elte.hu, rick@vanrein.org,
	rdunlap@xenotime.net, Nancy Yuen, Michael Ditto

Hi All,

Just a quick update regarding the outstanding request for the
submission of Google's BadRAM patch.

I'm still making some final changes to Google's e820-based BadRAM
patch and plan to send it as an RFC patch to LKML soon (most likely
tomorrow).

Some folks had mentioned that they're interested in details about what
we've learned about bad ram from our fleet of machines.  I suspect
that you need ACM portal access to read this, but for those folks an
interesting read can be found at the link shown below.  My sincere
apologies that I cannot post a world-readable copy.

http://portal.acm.org/citation.cfm?id=1555372

Cheers,
CraigB

On Fri, Jun 24, 2011 at 2:10 PM, Shane Nay <snay@google.com> wrote:
>
>> > For those with a lot of ranges,
>> > like Google, the command line is insufficient.
>>
>> Not if you recognise that there is a pattern :-)
>>
>> Google does not seem to have realised that, and is simply listing
>> the pages that are defected.  IMHO, but being the BadRAM author I
>> can hardly be called objective, this is the added value of BadRAM,
>> that it understands the nature of the problem and solves it with
>> an elegant concept at the right level of abstraction.
>
> No, we have realized patterns when there is one.  It depends on the specific defect that is at play.  There are several different defect types, and incidence rate with respect to the defect being observed.  We do observe "classic" failures of the type you are describing, where with the physical addressing information (bank, row, column), we can reproducibly cause errors to occur along that path.
>
> One problem is that badram syntax doesn't cleanly mesh with all modern systems.  For instance, not all chipsets have power-of-two bank interleave.  Holes in addressing also create trouble on some systems.
>
> Other defects look like white noise, these are typically indicative of manufacturing process defects.
>
> When we find a crisp-pattern in the data, it's not always the entirety of that bit-maskable pattern which is effected.  There can be interleaved subtractions from the underlying pattern orthogonal to interleave.
>
> IMHO, badram is a good tool for it's intended purpose.  They aren't really mutually exclusive anyway.  We're cleaning up our existing patches to send out early next week.  However, we had at one time had a way of inserting badram syntax generated e820's from command line along with passed in e820's, and extended versions.  That bit isn't in our tree right now, but it's possible, and we're looking to see if we can make it work with the existing code.
>
>
>> s (and
>> living by them) for failing memory pages.  One property of BadRAM,
>> namely that it does not slow down your system (you have less
>> pages on hand, but that's all) may or may not apply to an e820-based
>> approach.  I don't know if e820 is ever consulted after boot?
>>
>> > How common are nontrivial patterns on real hardware?  This would be
>> > interesting to hear from Google or another large user.
>>
>> Yes.  And "non-trivial" would mean that the patterns waste more space
>> than fair, *because of* the generalisation to patterns.
>>
>> If you plug 10 DIMMs into your machine, and each has a faulty row
>> somewhere, then you will get into trouble if you stick to 5 patterns.
>> But if you happen to run into a faulty DIMM from time to time, the
>> patterns should be your way out.
>>
>> > I have to say I think Google's point that truncating the list is
>> > unacceptable...
>>
>> Of course, that is true.  This is why memmap=... does not work.
>> It has nothing to do with BadRAM however, there will never be more
>> than 5 patterns.
>>
>> > that would mean running in a known-bad configuration,
>> > and even a hard crash would be better.
>>
>> ..which is so sensible that it was of course taken into account in
>> the BadRAM design!
>>
>>
>> Cheers,
>>  -Rick
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majo...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-28  2:33   ` Craig Bergstrom
@ 2011-06-29  8:08     ` Rick van Rein
  2011-06-29 15:28       ` craig lkml
  2011-06-30 14:32       ` Jody Belka
  0 siblings, 2 replies; 49+ messages in thread
From: Rick van Rein @ 2011-06-29  8:08 UTC (permalink / raw)
  To: Craig Bergstrom
  Cc: fa.linux.kernel, H. Peter Anvin, Stefan Assmann,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, Luck, Tony, Andi Kleen, mingo@elte.hu,
	rick@vanrein.org, rdunlap@xenotime.net, Nancy Yuen, Michael Ditto

Hello Craig,

> Some folks had mentioned that they're interested in details about what
> we've learned about bad ram from our fleet of machines.  I suspect
> that you need ACM portal access to read this,

I'm happy that this didn't cause a flame, but clearly this is not the
right response in an open environment.  ACM may have copyright on the
*form* in which you present your knowledge, but could you please poor
the knowledge in another form that bypasses their copyright so the
knowledge is made available to all?


Thanks,
 -Rick

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-29  8:08     ` Rick van Rein
@ 2011-06-29 15:28       ` craig lkml
  2011-06-29 16:06         ` Craig Bergstrom
  2011-06-30 14:32       ` Jody Belka
  1 sibling, 1 reply; 49+ messages in thread
From: craig lkml @ 2011-06-29 15:28 UTC (permalink / raw)
  To: Rick van Rein
  Cc: Craig Bergstrom, fa.linux.kernel, H. Peter Anvin, Stefan Assmann,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, Luck, Tony, Andi Kleen, mingo@elte.hu,
	rdunlap@xenotime.net, Nancy Yuen, Michael Ditto

[-- Attachment #1: Type: text/plain, Size: 1670 bytes --]

Hi Rick,

Thanks for your response.  My sincere apologies for not posting the work
directly.

My intention is to point interested parties to contributions that Google has
made to this space through known and respected channels.  The cited research
is not my research but the research of my colleagues.  As a result, I
hesitate to paraphrase the work as I will likely get the details wrong.  In
any case, Shane's points are the most relevant for the discussion here.
 Please refer to his post in this thread.

In an attempt to contribute to the community as much as I can, I have
prepared and mailed our BadRAM patch as requested.  In case it is not
otherwise clear, my belief is that the ideal solution for the upstream
kernel is a hybrid of our approaches.

Thank you,
CraigB

On Wed, Jun 29, 2011 at 1:08 AM, Rick van Rein <rick@vanrein.org> wrote:

> Hello Craig,
>
> > Some folks had mentioned that they're interested in details about what
> > we've learned about bad ram from our fleet of machines.  I suspect
> > that you need ACM portal access to read this,
>
> I'm happy that this didn't cause a flame, but clearly this is not the
> right response in an open environment.  ACM may have copyright on the
> *form* in which you present your knowledge, but could you please poor
> the knowledge in another form that bypasses their copyright so the
> knowledge is made available to all?
>
>
> Thanks,
>  -Rick
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>

[-- Attachment #2: Type: text/html, Size: 2374 bytes --]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-29 15:28       ` craig lkml
@ 2011-06-29 16:06         ` Craig Bergstrom
  2011-06-29 21:24           ` Tony Luck
  0 siblings, 1 reply; 49+ messages in thread
From: Craig Bergstrom @ 2011-06-29 16:06 UTC (permalink / raw)
  To: linux-kernel, fa.linux.kernel
  Cc: Rick van Rein, H. Peter Anvin, Stefan Assmann, linux-mm@kvack.org,
	akpm@linux-foundation.org, Luck, Tony, Andi Kleen, mingo@elte.hu,
	rdunlap@xenotime.net, Nancy Yuen, Michael Ditto

My apologies, I send this initial reply from the wrong address. Please
reply to this @google.com address.

Cheers,
CraigB

On Wed, Jun 29, 2011 at 8:28 AM, craig lkml <craig.lkml@gmail.com> wrote:
> Hi Rick,
> Thanks for your response.  My sincere apologies for not posting the work
> directly.
> My intention is to point interested parties to contributions that Google has
> made to this space through known and respected channels.  The cited research
> is not my research but the research of my colleagues.  As a result, I
> hesitate to paraphrase the work as I will likely get the details wrong.  In
> any case, Shane's points are the most relevant for the discussion here.
>  Please refer to his post in this thread.
> In an attempt to contribute to the community as much as I can, I have
> prepared and mailed our BadRAM patch as requested.  In case it is not
> otherwise clear, my belief is that the ideal solution for the upstream
> kernel is a hybrid of our approaches.
> Thank you,
> CraigB
>
> On Wed, Jun 29, 2011 at 1:08 AM, Rick van Rein <rick@vanrein.org> wrote:
>>
>> Hello Craig,
>>
>> > Some folks had mentioned that they're interested in details about what
>> > we've learned about bad ram from our fleet of machines.  I suspect
>> > that you need ACM portal access to read this,
>>
>> I'm happy that this didn't cause a flame, but clearly this is not the
>> right response in an open environment.  ACM may have copyright on the
>> *form* in which you present your knowledge, but could you please poor
>> the knowledge in another form that bypasses their copyright so the
>> knowledge is made available to all?
>>
>>
>> Thanks,
>>  -Rick
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/
>
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-29 16:06         ` Craig Bergstrom
@ 2011-06-29 21:24           ` Tony Luck
  0 siblings, 0 replies; 49+ messages in thread
From: Tony Luck @ 2011-06-29 21:24 UTC (permalink / raw)
  To: Craig Bergstrom
  Cc: linux-kernel, fa.linux.kernel, Rick van Rein, H. Peter Anvin,
	Stefan Assmann, linux-mm@kvack.org, akpm@linux-foundation.org,
	Andi Kleen, mingo@elte.hu, rdunlap@xenotime.net, Nancy Yuen,
	Michael Ditto

One extra consideration for this whole proposal ...

Is the "physical address" a stable enough representation of the location
of the faulty memory cells?

On high end systems I can see a number of ways where the mapping
from cells to physical address may change across reboot:

1) System support redundant memory (rank sparing or mirroring)
2) BIOS self test removes some memory from use
3) A multi-node system elects a different node to be boot-meister,
which results in reshuffling of the address map.

If any of these can happen: then it doesn't matter whether we have
a list of addresses, or a pattern that expands to a list of addresses.
We'll still mark some innocent memory as bad, and allow some known
bad memory to be used - because our "addresses" no longer correspond
to the bad memory cells.

-Tony

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
  2011-06-29  8:08     ` Rick van Rein
  2011-06-29 15:28       ` craig lkml
@ 2011-06-30 14:32       ` Jody Belka
  1 sibling, 0 replies; 49+ messages in thread
From: Jody Belka @ 2011-06-30 14:32 UTC (permalink / raw)
  To: Rick van Rein
  Cc: Craig Bergstrom, fa.linux.kernel, H. Peter Anvin, Stefan Assmann,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, Luck, Tony, Andi Kleen, mingo@elte.hu,
	rdunlap@xenotime.net, Nancy Yuen, Michael Ditto

On 29 June 2011 09:08, Rick van Rein <rick@vanrein.org> wrote:
>
> Hello Craig,
>
> > Some folks had mentioned that they're interested in details about what
> > we've learned about bad ram from our fleet of machines.  I suspect
> > that you need ACM portal access to read this,
>
> I'm happy that this didn't cause a flame, but clearly this is not the
> right response in an open environment.  ACM may have copyright on the
> *form* in which you present your knowledge, but could you please poor
> the knowledge in another form that bypasses their copyright so the
> knowledge is made available to all?

Luckily one of the authors (Bianca Schroeder) has a copy on her
university web space, free for personal/classroom use. Can be found at
http://www.cs.toronto.edu/~bianca/, search for "DRAM errors in the
wild".

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 49+ messages in thread

end of thread, other threads:[~2011-06-30 14:32 UTC | newest]

Thread overview: 49+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <fa.fHPNPTsllvyE/7DxrKwiwgVbVww@ifi.uio.no>
2011-06-24 21:10 ` [PATCH v2 0/3] support for broken memory modules (BadRAM) Shane Nay
2011-06-28  2:33   ` Craig Bergstrom
2011-06-29  8:08     ` Rick van Rein
2011-06-29 15:28       ` craig lkml
2011-06-29 16:06         ` Craig Bergstrom
2011-06-29 21:24           ` Tony Luck
2011-06-30 14:32       ` Jody Belka
2011-06-22 11:18 Stefan Assmann
2011-06-22 18:00 ` Andrew Morton
2011-06-22 18:06   ` Josh Boyer
2011-06-22 18:09   ` Randy Dunlap
2011-06-22 18:11     ` Nancy Yuen
2011-06-22 18:13   ` H. Peter Anvin
2011-06-22 19:01     ` Nancy Yuen
2011-06-22 19:06       ` H. Peter Anvin
2011-06-22 18:24   ` Andi Kleen
2011-06-22 18:38     ` Andrew Morton
2011-06-22 18:56       ` Andi Kleen
2011-06-22 19:05         ` H. Peter Anvin
2011-06-22 19:15           ` Andi Kleen
2011-06-22 20:25             ` H. Peter Anvin
2011-06-22 20:28               ` Andi Kleen
2011-06-22 20:18   ` Stefan Assmann
2011-06-23 10:33     ` Rick van Rein
2011-06-23 10:49       ` Rick van Rein
2011-06-23 10:10   ` Rick van Rein
2011-06-22 18:15 ` H. Peter Anvin
2011-06-22 20:30   ` Stefan Assmann
2011-06-22 20:33     ` H. Peter Anvin
2011-06-23 13:39 ` Matthew Garrett
2011-06-23 14:08   ` Stefan Assmann
2011-06-23 14:12     ` Matthew Garrett
2011-06-23 15:37       ` Stefan Assmann
2011-06-23 16:30         ` H. Peter Anvin
2011-06-24  0:59           ` Andi Kleen
2011-06-23 17:00         ` Andi Kleen
2011-06-23 17:12           ` Luck, Tony
2011-06-24  1:03             ` Craig Bergstrom
2011-06-24  1:08               ` Andi Kleen
2011-06-24  1:22                 ` Craig Bergstrom
2011-06-24  8:05               ` Rick van Rein
2011-06-24 14:34                 ` Craig Bergstrom
2011-06-24 16:16                 ` H. Peter Anvin
2011-06-24 16:40                   ` Luck, Tony
2011-06-24 16:56                     ` Rick van Rein
2011-06-24 17:14                       ` H. Peter Anvin
  -- strict thread matches above, loose matches on Subject: below --
2011-06-21  9:23 Stefan Assmann
2011-06-21 22:02 ` Andrew Morton
2011-06-22 11:11   ` Stefan Assmann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).