public inbox for linux-kernel@vger.kernel.org
From: Steffen Persvold <sp@numascale.com>
To: Borislav Petkov <bp@alien8.de>,
	Daniel J Blueman <daniel@numascale-asia.com>,
	Tony Luck <tony.luck@intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	x86@kernel.org, linux-edac@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] x86, amd, mce: Prevent potential cpu-online oops
Date: Thu, 04 Apr 2013 20:05:46 +0200
Message-ID: <515DC0FA.1040408@numascale.com>
In-Reply-To: <20130404161340.GF32271@pd.tnic>

On 4/4/2013 6:13 PM, Borislav Petkov wrote:
> On Thu, Apr 04, 2013 at 11:52:00PM +0800, Daniel J Blueman wrote:
>> On platforms where all Northbridges may not be visible (due to routing, eg on
>> NumaConnect systems), prevent oopsing due to stale pointer access when
>> offlining cores.
>>
>> Signed-off-by: Steffen Persvold <sp@numascale.com>
>> Signed-off-by: Daniel J Blueman <daniel@numascale-asia.com>
> 
> Huh, what's up?
> 
> This one is almost reverting 21c5e50e15b1a which you wrote in the first
> place. What's happening? What stale pointer access, where? We have the
> if (nb ..) guards there.
> 
> This commit message needs a *lot* more explanation about what's going
> on and why we're reverting 21c5e50e15b1a. And why the special handling
> for shared banks? I presume you offline some of the cores and there's a
> dangling pointer but again, there are the nb validity guards...
> 
> /me is genuinely confused.
> 

You get oopses when offlining cores if there's no NB struct for the shared MC4 bank. In threshold_remove_bank(), there's no "if (!nb)" guard:

	if (shared_bank[bank]) {
		if (!atomic_dec_and_test(&b->cpus)) {
			__threshold_remove_blocks(b);
			per_cpu(threshold_banks, cpu)[bank] = NULL;
			return;
		} else {
			/*
			 * the last CPU on this node using the shared bank is
			 * going away, remove that bank now.
			 */
			nb = node_to_amd_nb(amd_get_nb_id(cpu));
			nb->bank4 = NULL;
		}
	}


The "nb->bank4 = NULL" assignment will oops, since node_to_amd_nb() returned NULL for nb.
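
For illustration only, here is a minimal userspace sketch of the kind of guard that is missing. The struct layouts, the visible[] table, mock_node_to_amd_nb() and clear_shared_bank() are hypothetical stand-ins for this mail, not the kernel definitions:

```c
#include <assert.h>
#include <stddef.h>

#define NR_NODES 2

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct threshold_bank { int cpus; };
struct amd_northbridge { struct threshold_bank *bank4; };

static struct amd_northbridge nodes[NR_NODES];

/* visible[i] == 0 models a northbridge that could not be enumerated. */
static int visible[NR_NODES] = { 1, 0 };

/* Mock lookup: returns NULL when the node's northbridge is not visible. */
static struct amd_northbridge *mock_node_to_amd_nb(int node)
{
	assert(node >= 0 && node < NR_NODES);
	return visible[node] ? &nodes[node] : NULL;
}

/*
 * Guarded version of the "last CPU going away" branch.
 * Returns 1 if nb->bank4 was cleared, 0 if there was no NB to touch.
 */
static int clear_shared_bank(int node)
{
	struct amd_northbridge *nb = mock_node_to_amd_nb(node);

	if (!nb)
		return 0;	/* the guard missing from the snippet above */

	nb->bank4 = NULL;
	return 1;
}
```

With node 1 marked invisible, clear_shared_bank(1) returns 0 instead of dereferencing a NULL pointer.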

It made more sense (to me) to skip the creation of MC4 altogether if you can't find the matching northbridge: without the common NB struct for all the sharing cores, you can't reliably do the dec_and_test() reference counting on the shared bank.
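
Roughly, the decision at bank-creation time would look like the sketch below. This is a standalone illustration under my own assumptions: struct sketch_bank, struct sketch_nb, enum create_action and shared_bank_action() are made-up names, and the plain int counter stands in for the atomic refcount.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real kernel structs look different. */
struct sketch_bank { int cpus; };	/* plain int instead of atomic_t */
struct sketch_nb { struct sketch_bank *bank4; };

enum create_action { SKIP_BANK, REUSE_BANK, ALLOC_BANK };

/*
 * Decide what to do with the shared bank when a CPU comes online.
 * nb is whatever the northbridge lookup returned (NULL if not visible).
 */
static enum create_action shared_bank_action(struct sketch_nb *nb)
{
	if (!nb)
		return SKIP_BANK;	/* no common NB: no place to refcount */

	if (nb->bank4) {
		/* A live shared bank must already have at least one user. */
		assert(nb->bank4->cpus > 0);
		nb->bank4->cpus++;
		return REUSE_BANK;
	}

	return ALLOC_BANK;		/* first CPU on this node */
}
```

If the bank is never created for a CPU without an NB, the removal path has nothing to tear down for it either.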

Or am I just smoking the wrong stuff?

Cheers,
Steffen



