From: Andi Kleen <andi@firstfloor.org>
To: Russ Anderson <rja@sgi.com>
Cc: Andi Kleen <andi@firstfloor.org>,
mingo@elte.hu, tglx@linutronix.de,
Tony Luck <tony.luck@intel.com>,
linux-kernel@vger.kernel.org, linux-ia64@vger.kernel.org
Subject: Re: [PATCH 0/2] Migrate data off physical pages with corrected memory errors (Version 7)
Date: Mon, 21 Jul 2008 21:40:00 +0200
Message-ID: <20080721194000.GE29543@basil.nowhere.org>
In-Reply-To: <20080720173914.GA9409@sgi.com>

On Sun, Jul 20, 2008 at 12:39:14PM -0500, Russ Anderson wrote:
> The patch has a module for IA64, based on experience on IA64 hardware.
> It is a first step, to get the basic functionality in the kernel.
The basic functionality doesn't seem flexible enough to me
for useful policies.
> (~20,000 in one customer system). So disabling the memory on a
> DIMM with a flaky connector is a small percentage of overall memory.
> On a large NUMA machine the flaky DIMM connector would only affect
> memory on one node.
You would still lose significant parts of that node, wouldn't you?
Even on the large systems people might miss a node or two.
> A good enhancement would be to migrate all the data off a DRAM and/or
> DIMM when a threshold is exceeded. That would take knowledge of the
> physical memory to memory map layout.
It would probably be difficult to teach the kernel this in a nice generic
way. Interleaving in particular is difficult.
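To illustrate the interleaving problem, here is a toy sketch. It is not
real hardware address decoding: the 64-byte interleave granule, the 4-way
interleave, and the toy_* helper names are all assumptions made up for
this example. With such a layout every 4 KB page has cache lines on every
DIMM in the interleave set, so "migrate the data off one DIMM" has no
page-level meaning.

```c
#include <stdint.h>

/* All values below are assumptions for illustration only. */
#define TOY_INTERLEAVE_SHIFT 6		/* 64-byte interleave granule */
#define TOY_NUM_DIMMS        4		/* 4-way interleave set */
#define TOY_PAGE_SIZE        4096	/* 4 KB pages */

/* Toy decoder: which DIMM in the set backs this physical address. */
static unsigned int toy_addr_to_dimm(uint64_t paddr)
{
	return (unsigned int)((paddr >> TOY_INTERLEAVE_SHIFT) % TOY_NUM_DIMMS);
}

/* Bitmask of the DIMMs touched by the page that contains paddr. */
static unsigned int toy_page_dimm_mask(uint64_t paddr)
{
	uint64_t base = paddr & ~(uint64_t)(TOY_PAGE_SIZE - 1);
	unsigned int mask = 0;
	uint64_t off;

	/* Walk the page one interleave granule at a time. */
	for (off = 0; off < TOY_PAGE_SIZE; off += 1u << TOY_INTERLEAVE_SHIFT)
		mask |= 1u << toy_addr_to_dimm(base + off);

	return mask;
}
```

In this toy layout toy_page_dimm_mask() returns 0xf for any page, i.e. all
four DIMMs. Real memory controllers use different and often undocumented
decode schemes, which is the hard part of doing this generically.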
> > If you really wanted to do this you probably should hook it up
> > to mcelog's (or the IA64 equivalent) DIMM database
>
> Is there an IA64 equivalent? I've looked at the x86_64 mcelog,
> but have not found an IA64 version.
There's a SAL logger process in user space, I believe, but I have never
looked at it. In theory it could do these things.
Also, in the IA64 case the firmware can actually tell the kernel
what to do because it gets involved here (and firmware often
has usable heuristics for this case).
> > and DIMM specific knowledge. But it's unlikely it can be really
> > done nicely in a way that is isolated from very specific
> > knowledge about the underlying memory configuration.
>
> Agreed. An interface to export the physical memory configuration
> (from ACPI tables?) would be useful.
On x86 there's currently only DMI/SMBIOS for this, but it has some issues.
-Andi