From: Andi Kleen <andi@firstfloor.org>
To: Nils Carlson <nils.carlson@ludd.ltu.se>
Cc: Andi Kleen <andi@firstfloor.org>, Ingo Molnar <mingo@elte.hu>,
	Borislav Petkov <bp@amd64.org>,
	Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>,
	"Luck, Tony" <tony.luck@intel.com>,
	Mauro Carvalho Chehab <mchehab@redhat.com>,
	"Young, Brent" <brent.young@intel.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"bluesmoke-devel@lists.sourceforge.net" 
	<bluesmoke-devel@lists.sourceforge.net>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Doug Thompson <dougthompson@xmission.com>,
	Joe Perches <joe@perches.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Linux Edac Mailing List <linux-edac@vger.kernel.org>,
	Ingo Molnar <mingo@redhat.com>,
	Matt Domsch <Matt_Domsch@dell.com>
Subject: Re: Hardware Error Kernel Mini-Summit
Date: Mon, 14 Jun 2010 22:21:09 +0200
Message-ID: <20100614202109.GD369@basil.fritz.box>
In-Reply-To: <9D5E19B6-5313-43B4-9C3D-493C8C226E8D@ludd.ltu.se>

On Mon, Jun 14, 2010 at 09:47:33PM +0200, Nils Carlson wrote:
>> Also the biggest problem is still that EDAC doesn't
>> give you any silk screen labels, so unless you
>> have motherboard schematics the layout it presents
>> is fairly useless -- you still don't know which DIMM
>> to exchange. So in theory EDAC looks great, but in practice ...
>>
> I do have motherboard schematics, or rather, we build our own
> boards. But the point is valid, a lot of people don't make their own

Just supply correct DMI tables then?

> hardware. On the other hand, the people who do use this part of
> EDAC perhaps aren't your typical home computer users?

Most users do not build their own boards and do not have
schematics. And I'm not talking about home computer users.

Anyway, I think the important thing is that by default you get
something useful (including silk screen labels) without doing
any special configuration steps.

Right now DMI is the only sane option for this that I can see.
EDAC doesn't do it because it has no silk screen labels.
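
As a rough sketch of what "use DMI for the labels" could look like from
user space: DMI type 17 (Memory Device) records carry a "Locator" string,
which on boards with correct tables matches the silk screen. The snippet
below only parses text in the layout that `dmidecode -t 17` prints; the
SAMPLE text and the function name dimm_labels are made up for
illustration, and whether the strings are actually correct depends
entirely on the BIOS vendor.

```python
def dimm_labels(dmidecode_text):
    """Collect (Locator, Bank Locator) pairs from `dmidecode -t 17` style text."""
    devices = []
    current = None
    for line in dmidecode_text.splitlines():
        stripped = line.strip()
        if stripped == "Memory Device":
            # Start of a new DMI type 17 record.
            current = {}
            devices.append(current)
        elif current is not None and ":" in stripped:
            key, _, value = stripped.partition(":")
            if key in ("Locator", "Bank Locator"):
                current[key] = value.strip()
    return [(d.get("Locator"), d.get("Bank Locator")) for d in devices]


# Hypothetical sample, not taken from a real machine.
SAMPLE = """\
Handle 0x0040, DMI type 17, 34 bytes
Memory Device
	Locator: DIMM_A1
	Bank Locator: NODE 0 CHANNEL 0 DIMM 0
	Size: 4096 MB

Handle 0x0041, DMI type 17, 34 bytes
Memory Device
	Locator: DIMM_A2
	Bank Locator: NODE 0 CHANNEL 0 DIMM 1
	Size: No Module Installed
"""

if __name__ == "__main__":
    for locator, bank in dimm_labels(SAMPLE):
        print(locator, "|", bank)
```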

And yes, if someone is a power user they could still override
that. It just has to do something reasonable by default.

>
> This is true, and this is the way things are going on
> our end as well. I guess that would mean
> you wouldn't go to the EDAC sysfs directory
> to find everything to do with the same piece of hardware
> anymore, but would have to go to n different
> directories looking for all the pieces? I don't really
> like that...

Let me try to understand that.

You want to inject errors on a random computer you don't
know anything about? Do you do that frequently? Why
are you doing this? 

Obviously there needs to be a way to identify to what
hardware an error injector belongs.

>
>> Anyways the old EDAC drivers for this are not going
>> away, you can still use them. The interesting
>> question though is how to properly define the interface
>> for new hardware.
>
> But all new hardware will look the way the hardware
> designers want it to, so our interface will be a moving
> target? Maybe it's time to let hardware makers provide

You can define relatively abstract interfaces.

It's just that EDAC is not it. They may not be perfectly
future proof (after all, who knows what the memories of quantum
computers or whatever will look like), but hopefully
at least reasonably forward looking.

e.g. for memory layout imho a reasonable way
is to just define it as

DIMM  (if you need below that look at a log) 
 \-------- silk screen label (most important attribute!)
 |
abstract path. This can be an arbitrary string, e.g. MC0/Ch1/DIMM0
 |             Or MC0/BOB0/Ch1/DIMM3
 |             Parsers don't need to know any details about it.
 |
socket

You can even represent that as a flat data structure;
there is no need to really map the abstract path to directories
(that just makes parsers difficult to write -- most sysfs
parsers traditionally have trouble with varying directories).
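
A minimal sketch of such a flat representation, assuming one record per
DIMM with three fields (silk screen label, abstract path, socket). The
key=value line format, the field names and the sample values are all
invented here for illustration; they are not an existing kernel
interface. The point is that the parser treats the path as an opaque
string and never splits it on "/".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DimmRecord:
    label: str   # silk screen label, the attribute a technician needs
    path: str    # abstract path, treated as an opaque string
    socket: str  # socket the DIMM belongs to


def parse_flat(text):
    """Parse one whitespace-separated key=value record per line.

    Assumes values contain no whitespace (an assumption of this sketch).
    """
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = dict(item.split("=", 1) for item in line.split())
        records.append(
            DimmRecord(fields["label"], fields["path"], fields["socket"])
        )
    return records


def label_for_path(records, path):
    """Map a path reported with an error to its silk screen label."""
    for record in records:
        if record.path == path:  # plain string comparison, no parsing
            return record.label
    return None


# Hypothetical sample data.
SAMPLE = """\
# label path socket
label=DIMM_A1 path=MC0/Ch1/DIMM0 socket=0
label=DIMM_C4 path=MC0/BOB0/Ch1/DIMM3 socket=1
"""

if __name__ == "__main__":
    recs = parse_flat(SAMPLE)
    print(label_for_path(recs, "MC0/BOB0/Ch1/DIMM3"))
```

The two sample paths have different depths, yet the parser is the same
for both, which is the property a directory-per-level layout loses.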



> a board specification with device tree and memory
> layout? (Pure speculation)

That's DMI on x86! 

Well it's not perfect, but also not too bad.


> There is a use-case. A lot has to do with how different patrol
> scrub rates work, some just go through memory at a constant
> speed (MB/s), others vary according to load. The thing is,
> different applications want their memory scrubbed within
> different time frames, and as the amount of memory on boards

What's the theory behind varying scrub rates? 
I would be interested in more details.

> Patrol scrubbing is normally used because it discovers errors
> faster in seldom accessed memory allowing a DIMM with
> too many errors to be replaced faster. Some applications

Yes, but why do you want to vary the rate?
Normally it should just depend on memory size and expected
error rate (that is, the more memory, the faster you scrub).
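
To make the "more memory, faster scrub" arithmetic concrete, here is a
small sketch. It assumes a constant-speed scrubber and a fixed target
time for one full pass over memory; the 24 hour target and the memory
sizes are example numbers picked for illustration, not recommendations
or measurements.

```python
def scrub_rate_mib_per_s(memory_gib, pass_hours):
    """Rate needed to touch all of memory once within pass_hours."""
    total_mib = memory_gib * 1024
    return total_mib / (pass_hours * 3600)


if __name__ == "__main__":
    # Same target pass time, growing memory size: the required rate
    # scales linearly with the amount of memory.
    for gib in (16, 64, 256):
        rate = scrub_rate_mib_per_s(gib, pass_hours=24)
        print(f"{gib:4d} GiB -> {rate:.3f} MiB/s")
```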

> like to use demand scrubbing as well, and some consider
> it to increase memory latency too much.

That sounds odd -- if you have so many errors that you worry
about that, you definitely have other problems.
Is this based on some benchmarking?

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.
