From: Andrew Morton <akpm@linux-foundation.org>
To: Stefan Assmann <sassmann@kpanic.de>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
tony.luck@intel.com, andi@firstfloor.org, mingo@elte.hu,
hpa@zytor.com, rick@vanrein.org, rdunlap@xenotime.net,
Nancy Yuen <yuenn@google.com>, Michael Ditto <mditto@google.com>
Subject: Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
Date: Wed, 22 Jun 2011 11:00:34 -0700
Message-ID: <20110622110034.89ee399c.akpm@linux-foundation.org>
In-Reply-To: <1308741534-6846-1-git-send-email-sassmann@kpanic.de>
On Wed, 22 Jun 2011 13:18:51 +0200 Stefan Assmann <sassmann@kpanic.de> wrote:
> Following the RFC for the BadRAM feature here's the updated version with
> spelling fixes, thanks go to Randy Dunlap. Also the code is now less verbose,
> as requested by Andi Kleen.
> v2 with even more spelling fixes suggested by Randy.
> Patches are against vanilla 2.6.39.
>
> The idea is to allow the user to specify RAM addresses that shouldn't be
> touched by the OS, because they are broken in some way. Not all machines have
> hardware support for hwpoison, ECC RAM, etc., so here's a solution that uses
> bitmasks to mask address patterns, specified with the new "badram" kernel
> command line parameter.
> Memtest86 has an option to generate these patterns since v2.3 so the only thing
> for the user to do should be:
> - run Memtest86
> - note down the pattern
> - add badram=<pattern> to the kernel command line
>
> The affected pages are then marked with the hwpoison flag and thus won't be
> used by the memory management system.
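To make the address/mask idea concrete, here is a minimal userspace sketch of how such a pattern could select pages. It assumes the pattern is a comma-separated list of hexadecimal address,mask pairs, and that an address counts as bad when it agrees with the pattern address on every bit set in the mask. The function names and the 4 KiB page size are hypothetical and are not taken from the patch itself.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def parse_badram(arg):
    """Parse "addr,mask[,addr,mask...]" (hex values) into (addr, mask) pairs."""
    values = [int(tok, 16) for tok in arg.split(",")]
    if len(values) % 2:
        raise ValueError("badram pattern needs address,mask pairs")
    return list(zip(values[0::2], values[1::2]))


def is_bad(addr, pairs):
    """True if addr matches any pattern on all bits selected by its mask."""
    return any((addr & mask) == (base & mask) for base, mask in pairs)


def bad_pages(pairs, mem_start, mem_end):
    """List page-aligned addresses in [mem_start, mem_end) that match a pattern."""
    return [a for a in range(mem_start, mem_end, PAGE_SIZE) if is_bad(a, pairs)]
```

Under these assumptions, a mask of 0xfffff000 selects exactly one page, while clearing a bit in the mask turns that bit into a wildcard, so a single pair can cover a regular pattern of faulty pages.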
The google kernel has a similar capability. I asked Nancy to comment
on these patches and she said:
: One, the bad addresses are passed via the kernel command line, which
: has a limited length. It's okay if the addresses can be fit into a
: pattern, but that's not necessarily the case in the google kernel. And
: even with patterns, the limit on the command line length limits the
: number of patterns that a user can specify. Instead we use lilo to pass
: a file containing the bad pages in e820 format to the kernel.
:
: Second, the BadRAM patch expands the address patterns from the command
: line into individual entries in the kernel's e820 table. The e820
: table is a fixed buffer that supports a very small, hard coded number
: of entries (128). We require a much larger number of entries (on
: the order of a few thousand), so much of the google kernel patch deals
: with expanding the e820 table. Also, with the BadRAM patch, entries
: that don't fit in the table are silently dropped and this isn't
: appropriate for us.
:
: Another caveat concerns mapping out too much bad memory in general. If too
: much memory is removed from low memory, a system may not boot. We
: solve this by generating good maps. Our userspace tools do not map out
: memory below a certain limit, and they verify against a system's iomap
: that only addresses from memory are mapped out.
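The last two points can be illustrated with a small sketch of a userspace map generator. This is not the google tooling, which is not shown in this thread. It is a hypothetical example that assumes bad ranges and RAM ranges are given as (start, end) tuples, that refuses ranges below a low-memory limit or outside known RAM, and that raises an error instead of silently dropping entries when the table capacity (128 in the message above) is exceeded.

```python
E820_MAX = 128  # table size quoted in the message above


def build_bad_map(bad_ranges, ram_ranges, low_limit, capacity=E820_MAX):
    """Split bad ranges into (accepted, rejected) and fail loudly on overflow.

    A range is rejected if it starts below low_limit or is not fully
    contained in one of the RAM ranges (hypothetical stand-in for an
    iomap check).
    """
    accepted, rejected = [], []
    for start, end in sorted(bad_ranges):
        in_ram = any(r_start <= start and end <= r_end
                     for r_start, r_end in ram_ranges)
        if start < low_limit or not in_ram:
            rejected.append((start, end))
        else:
            accepted.append((start, end))
    if len(accepted) > capacity:
        # Report the overflow rather than dropping entries silently.
        raise OverflowError(
            f"{len(accepted)} bad ranges exceed table capacity {capacity}")
    return accepted, rejected
```

Returning the rejected ranges alongside the accepted ones lets the caller log what was left mapped in, which is one way to avoid the silent-drop behavior described above.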
I have a couple of thoughts here:
- If this patchset is merged and a major user such as google is
unable to use it and has to continue to carry a separate patch then
that's a regrettable situation for the upstream kernel.
- Google's is, afaik, the largest use case we know of: zillions of
machines for a number of years. And this real-world experience tells
us that the badram patchset has shortcomings. Shortcomings which we
can expect other users to experience.
So. What are your thoughts on these issues?
Thanks