From: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: Borislav Petkov <bp@amd64.org>,
	Mauro Carvalho Chehab <mchehab@redhat.com>,
	Ingo Molnar <mingo@elte.hu>,
	EDAC devel <linux-edac@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/3] mce: Add a msg string to the MCE tracepoint
Date: Fri, 02 Mar 2012 13:02:13 +0900
Message-ID: <4F504645.5040708@jp.fujitsu.com>
In-Reply-To: <3908561D78D1C84285E8C5FCA982C28F040ACC@ORSMSX104.amr.corp.intel.com>

(2012/03/02 3:28), Luck, Tony wrote:
>>> My concern is; on Sandy Bridge, is it safe to gather info about the DIMM
>>> location in/from machine check context in a reasonable time span?
>>
>> Well, what amd64_edac does is "buffer" the required lookup info so
>> whenever you get an error, you simply lookup the channel and chip select
>> - all ops which can be done in atomic context.
> 
> Yes - we could pre-read all the config space registers ahead of time and
> save them in memory (none of the values should change - except if the platform
> supports hot-plug for memory). Total is only a few Kbytes. Then decode in
> machine check context is both safe, and fast.

To sort out my thoughts:

 - First of all, the OS gathers info about the physical location of DIMMs
   from DMI/ACPI/PCI etc., before enabling the MCE mechanism.
 - Build a kind of "physical memory location table" in a memory buffer,
   to ease mapping a physical address to the location of the DIMM module
   and/or chip that contains the memory cell the address points to.
    - It would be better to have a well-organized table rather than
      a raw copy of config space etc.
    - Likewise, it would also be nice if we could map logical processor
      numbers to the location of physical sockets on the motherboard.
    - It would be good if users could read the table via sysfs.
    - Allow updating the table if the platform supports hot-plug.
 - Once MCE is enabled, the handler can refer to the in-memory table to
   determine the faulty device that should be replaced.

The storyline up to here is reasonable and acceptable, I think.
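
As a rough illustration of the table-based lookup outlined above, here is
a minimal sketch in C. Everything in it is an assumption made for the
example: struct mem_loc, its fields, mem_loc_validate() and
mem_loc_lookup() are hypothetical names, not taken from any posted patch.
It also assumes each DIMM backs one contiguous physical range, which real
platforms with channel interleaving do not guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: one physical address range and its location. */
struct mem_loc {
	uint64_t start;    /* first physical address covered */
	uint64_t end;      /* last physical address covered (inclusive) */
	uint8_t  socket;
	uint8_t  channel;
	uint8_t  dimm;
};

/*
 * Run once while building the table, before MCE is enabled:
 * entries must be well-formed, sorted by start, and non-overlapping.
 */
static void mem_loc_validate(const struct mem_loc *tbl, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		assert(tbl[i].start <= tbl[i].end);
		assert(i == 0 || tbl[i - 1].end < tbl[i].start);
	}
}

/*
 * Binary search over the prebuilt table. It only reads memory: no
 * allocation, no locking, no config space access. Returns the index
 * of the matching entry, or -1 if the address is not covered.
 */
static int mem_loc_lookup(const struct mem_loc *tbl, size_t n, uint64_t addr)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (addr < tbl[mid].start)
			hi = mid;
		else if (addr > tbl[mid].end)
			lo = mid + 1;
		else
			return (int)mid;
	}
	return -1;
}

/* Made-up example data: three 2 GiB ranges on two sockets. */
static const struct mem_loc example_table[] = {
	{ 0x000000000ULL, 0x07fffffffULL, 0, 0, 0 },
	{ 0x080000000ULL, 0x0ffffffffULL, 0, 1, 0 },
	{ 0x100000000ULL, 0x17fffffffULL, 1, 0, 0 },
};
```

The index returned by the lookup is the kind of small binary value that
could be reported to userland in place of a decoded string.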

So it is now clear that the one point I still feel uneasy about is
putting a string into the ring buffer instead of binary data such as an
index into the location table.  Please use binary (or "binary + string")
to tell the error location to userland.
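
To make the "binary" option concrete, here is a hedged sketch of what
such a record could look like. The names struct err_loc_record,
make_record() and format_record() are hypothetical and do not correspond
to the MCE tracepoint or any existing kernel interface. The idea shown is
only that the producer writes a small fixed-size record, and the
human-readable string is generated later, on the consumer side.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size record placed in the ring buffer. */
struct err_loc_record {
	uint16_t loc_index;   /* index into the location table */
	uint8_t  socket;
	uint8_t  channel;
	uint8_t  dimm;
};

/* Producer side: fill in binary fields only, no string formatting. */
static struct err_loc_record make_record(uint16_t idx, uint8_t socket,
					 uint8_t channel, uint8_t dimm)
{
	struct err_loc_record r;

	/* Zero the whole record so padding bytes hold no stale data. */
	memset(&r, 0, sizeof(r));
	r.loc_index = idx;
	r.socket = socket;
	r.channel = channel;
	r.dimm = dimm;
	return r;
}

/* Consumer side: turn the binary record into text when it is needed. */
static int format_record(const struct err_loc_record *r, char *buf, size_t len)
{
	return snprintf(buf, len, "socket %u channel %u dimm %u (table index %u)",
			(unsigned)r->socket, (unsigned)r->channel,
			(unsigned)r->dimm, (unsigned)r->loc_index);
}
```

A consumer that only needs to count errors per DIMM can compare
loc_index values directly and never has to parse a string.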


Thanks,
H.Seto

