From: Borislav Petkov <bp@alien8.de>
To: Havard Skinnemoen <hskinnemoen@google.com>
Cc: "Luck, Tony" <tony.luck@intel.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Ewout van Bekkum <ewout@google.com>
Subject: Re: [PATCH 2/6] x86-mce: Modify CMCI storm exit to reenable instead of rediscover banks.
Date: Thu, 10 Jul 2014 17:51:19 +0200
Message-ID: <20140710155119.GJ2970@pd.tnic>
In-Reply-To: <CAFQmdRaZ-wNTFzm_CAvuCWrNGDuWWdYRzADXjmz27XgsFUgNaA@mail.gmail.com>
On Wed, Jul 09, 2014 at 02:34:39PM -0700, Havard Skinnemoen wrote:
> On Wed, Jul 9, 2014 at 1:20 PM, Luck, Tony <tony.luck@intel.com> wrote:
> >> The CMCI storm handler previously called cmci_reenable() when exiting a
> >> CMCI storm. However, when entering a CMCI storm the bank ownership was
> >> not relinquished by the affected CPUs. The CMCIs were only disabled via
> >> cmci_storm_disable_banks(). The handler was updated to instead call a
> >> new function, cmci_storm_enable_banks(), to reenable CMCI on the already
> >> owned banks instead of rediscovering CMCI banks (which were still owned
> >> but disabled).
> >
> > Won't this cause problems if we online a cpu during the storm. We will
> > re-run the discovery algorithm and some other cpu that shares the bank
> > will see MCi_CTL2{30} is zero and claim ownership.
>
> Yes, I think you're right. We didn't test this with CPU hotplugging.
>
> I'm at loss about how to fix it though. We need the CMCI bits to
> detect shared banks, but they're not reflecting the actual state of
> things at that point. If the CPU gives up ownership of the banks, then
> we might just see the storm move from CPU to CPU, right?
>
> We could keep a separate bitmask somewhere to indicate ownership, but
> even if we can see that the bank is shared with some other CPU, we
> don't know if it will be shared with a new CPU which we've never seen
> before...
>
> Perhaps we need to temporarily disable the storm handling when we're
> bringing up a new CPU?
Looking at this more, maybe cmci_storm_disable_banks() was a bad idea
after all. There's __cmci_disable_bank() which properly drops ownership
after having disabled CMCI.
Maybe we should kill cmci_storm_disable_banks() and call
__cmci_disable_bank() on each bank instead...
--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.