From: Peter Zijlstra <peterz@infradead.org>
To: Huang Ying <ying.huang@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>, "H. Peter Anvin" <hpa@zytor.com>,
Thomas Gleixner <tglx@linutronix.de>,
Andi Kleen <ak@linux.intel.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH -v2] x86: MCE: Re-implement MCE log ring buffer as per-CPU ring buffer
Date: Tue, 28 Apr 2009 12:21:43 +0200
Message-ID: <1240914103.7620.110.camel@twins>
In-Reply-To: <1240910841.6842.1163.camel@yhuang-dev.sh.intel.com>
On Tue, 2009-04-28 at 17:27 +0800, Huang Ying wrote:
> Re-implement the MCE log ring buffer as per-CPU ring buffers for better
> scalability. The basic design is as follows:
>
> - One ring buffer for each CPU
>
> + MCEs are added to the corresponding local per-CPU buffer instead of
> one big global buffer. Contention and unfairness between CPUs are
> eliminated.
>
> + MCE records are read out and removed from the per-CPU buffers by a
> mutex-protected global reader function, because in most cases there
> are not many readers in the system to contend.
>
> - Per-CPU ring buffer data structure
>
> + An array is used to hold MCE records. An integer "head" indicates
> the next writing position and an integer "tail" indicates the next
> reading position.
>
> + To distinguish an empty buffer from a full one, head and tail wrap
> to 0 at MCE_LOG_LIMIT instead of MCE_LOG_LEN. The real next writing
> position is head % MCE_LOG_LEN and the real next reading position is
> tail % MCE_LOG_LEN. The buffer is empty when head == tail, and full
> when head % MCE_LOG_LEN == tail % MCE_LOG_LEN and head != tail.
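[The empty/full scheme above can be sketched with the following predicates. The
constants are illustrative, not the patch's real values; the only assumption is
that MCE_LOG_LIMIT is a multiple of MCE_LOG_LEN and at least twice as large, so
that a full buffer has head != tail.]

```c
#include <assert.h>

/* Illustrative sizes only; the real MCE_LOG_LEN/MCE_LOG_LIMIT differ. */
#define MCE_LOG_LEN    8
#define MCE_LOG_LIMIT  (2 * MCE_LOG_LEN)

/* Empty: the indices coincide exactly. */
static int buf_empty(unsigned int head, unsigned int tail)
{
	return head == tail;
}

/* Full: same slot modulo the array length, but the raw indices differ. */
static int buf_full(unsigned int head, unsigned int tail)
{
	return head != tail && head % MCE_LOG_LEN == tail % MCE_LOG_LEN;
}

/* Advance an index, wrapping at MCE_LOG_LIMIT rather than MCE_LOG_LEN. */
static unsigned int buf_next(unsigned int idx)
{
	return (idx + 1) % MCE_LOG_LIMIT;
}
```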
>
> - Lock-less for writer side
>
> + The MCE log writer may run in NMI context, so the writer side must
> be lock-less. For the per-CPU buffer of one CPU, writers may come
> from process, IRQ or NMI context, so "head" is increased with
> cmpxchg_local() to allocate buffer space.
>
> + The reader side is protected with a mutex to guarantee that only
> one reader is active in the whole system.
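[The cmpxchg_local() slot reservation described above might look roughly like
the sketch below. cmpxchg_local() is a kernel primitive, so a C11 compare-and-
exchange stands in for it here; the struct layout and ring_reserve() name are
hypothetical, and the sizes are illustrative.]

```c
#include <stdatomic.h>

#define MCE_LOG_LEN    8
#define MCE_LOG_LIMIT  (2 * MCE_LOG_LEN)

struct mce_rec { unsigned long status; /* ... */ };

struct mce_ring {
	atomic_uint head;	/* next write position, wraps at MCE_LOG_LIMIT */
	unsigned int tail;	/* next read position, reader-owned */
	struct mce_rec rec[MCE_LOG_LEN];
};

/* Reserve one slot and return its array index, or -1 if the ring is full.
 * In the kernel the CAS would be cmpxchg_local(), which is cheap because
 * only writers on the same CPU can race with each other. */
static int ring_reserve(struct mce_ring *r)
{
	unsigned int head, next;

	do {
		head = atomic_load(&r->head);
		if (head != r->tail &&
		    head % MCE_LOG_LEN == r->tail % MCE_LOG_LEN)
			return -1;		/* full: drop the record */
		next = (head + 1) % MCE_LOG_LIMIT;
	} while (!atomic_compare_exchange_weak(&r->head, &head, next));

	return head % MCE_LOG_LEN;		/* index of the reserved slot */
}
```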
>
>
> Performance tests show that the throughput of the per-CPU mcelog
> buffer can reach 430k records/s, compared with 5.3k records/s for the
> original implementation, on a 2-core 2.1GHz Core2 machine.
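[The mutex-protected reader side could be sketched as a drain loop over all
per-CPU rings. Everything here is a user-space stand-in: pthread_mutex for the
kernel mutex, a plain array for the per-CPU data, and mce_read_all() as a
hypothetical name, with illustrative sizes.]

```c
#include <pthread.h>

#define NR_CPUS        2
#define MCE_LOG_LEN    8
#define MCE_LOG_LIMIT  (2 * MCE_LOG_LEN)

struct mce_rec { unsigned long status; };

struct mce_ring {
	unsigned int head;	/* writer-owned, next write position */
	unsigned int tail;	/* reader-owned, next read position */
	struct mce_rec rec[MCE_LOG_LEN];
};

static struct mce_ring cpu_ring[NR_CPUS];	/* stand-in for per-CPU data */
static pthread_mutex_t mce_read_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Copy out at most 'max' records, draining each CPU's ring in turn.
 * The mutex makes this the only active reader system-wide, so 'tail'
 * can be updated without atomics. */
static int mce_read_all(struct mce_rec *out, int max)
{
	int n = 0;

	pthread_mutex_lock(&mce_read_mutex);
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		struct mce_ring *r = &cpu_ring[cpu];

		while (n < max && r->tail != r->head) {
			out[n++] = r->rec[r->tail % MCE_LOG_LEN];
			r->tail = (r->tail + 1) % MCE_LOG_LIMIT;
		}
	}
	pthread_mutex_unlock(&mce_read_mutex);
	return n;
}
```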
We're talking about Machine Check Exceptions here, right? Is there a
valid scenario where you care about performance? I always thought that
an MCE meant something seriously went wrong, log the event and reboot
the machine -- possibly start ordering replacement parts.
But now you're saying we want to be able to record more than 5.3k events
a second on this? Sounds daft to me.
Also, it sounds like something that might fit the ftrace ringbuffer
thingy.
Thread overview: 7+ messages
2009-04-28 9:27 [PATCH -v2] x86: MCE: Re-implement MCE log ring buffer as per-CPU ring buffer Huang Ying
2009-04-28 10:21 ` Andi Kleen
2009-04-29 1:31 ` Huang Ying
2009-04-29 6:11 ` Andi Kleen
2009-04-29 6:50 ` Huang Ying
2009-04-28 10:21 ` Peter Zijlstra [this message]
2009-04-28 10:33 ` Andi Kleen