linux-rt-users.vger.kernel.org archive mirror
From: Corey Minyard <minyard@acm.org>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-rt-users@vger.kernel.org,
	Corey Minyard <cminyard@mvista.com>,
	Borislav Petkov <bp@alien8.de>
Subject: Re: [PATCH][RT] x86: Fix an RT MCE crash
Date: Thu, 30 Jun 2016 10:58:57 -0500
Message-ID: <577541C1.20302@acm.org>
In-Reply-To: <20160630115101.6337c395@gandalf.local.home>

On 06/30/2016 10:51 AM, Steven Rostedt wrote:
> On Thu, 30 Jun 2016 09:49:19 -0500
> Corey Minyard <minyard@acm.org> wrote:
>
>> On 06/30/2016 08:43 AM, Steven Rostedt wrote:
>>> On Thu, 30 Jun 2016 08:24:49 -0500
>>> minyard@acm.org wrote:
>>>   
>>>> From: Corey Minyard <cminyard@mvista.com>
>>>>
>>>> On some x86 systems an MCE interrupt would come in before the kernel
>>>> was ready for it.  Looking at the latest RT code, it has similar
>>>> (but not quite the same) code, except it adds a bool that tells if
>>>> MCE handling is initialized.  Add the same bool for older versions.
>>>>
>>>> Signed-off-by: Corey Minyard <cminyard@mvista.com>
>>>> ---
>>>>    arch/x86/kernel/cpu/mcheck/mce.c | 5 ++++-
>>>>    1 file changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> We noticed this issue on a new Broadwell system when we booted RT
>>>> on it.  This patch is for 3.10, I'm not sure if it applies to
>>>> other kernel versions.
>>> Do you mean other 'older' versions? and that this works with the
>>> versions after 3.10 without this patch?
>> I haven't looked at supported kernel versions besides 3.10 and 4.4.
>> The fix was from the 4.4 version of this code.  This patch fixes
>> v3.10-rt; I can look at finding which other versions need this.  I
>> was planning to do this, but I wanted to get the patch out for
>> comments first.
> I'm not an MCE expert (I just Cc'd one though ;-)

Ok.  It's not really an MCE bug per se, just an initialization
order bug.

>
> OK, so you are saying that the fix was from 4.4-rt? I can go and look
> for it, and if so, I can add it to the "backport" patches I need to do.
> Which I need to go and do that soon (backport patches from previous
> versions). It may already be in that list.

The fix was from 4.4-rt, but it's not a separate fix.  The 4.4 change is
d21959b8ad98 (x86/mce: use swait queue for mce wakeups)
and it's doing the same thing as the 3.10-rt change
49fe500d2abd (x86/mce: Defer mce wakeups to threads for
PREEMPT_RT).

The 3.10-rt change just doesn't have the bool that fixes the
initialization order issue.
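
To illustrate the ordering guard outside the kernel, here is a minimal
userspace sketch.  It is not the kernel code: struct helper,
wake_helper(), notify_init() and notify_work() are hypothetical
stand-ins for task_struct, wake_up_process(), mce_notify_work_init()
and mce_notify_work(), and it assumes the crash comes from waking a
helper that has not been created yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct task_struct. */
struct helper {
	int wakeups;
};

static struct helper helper_storage;
static struct helper *notify_helper;	/* NULL until notify_init() runs */
static bool notify_ready;		/* plays the role of notify_work_ready */

/* Stand-in for wake_up_process(): must never be handed a NULL helper. */
static void wake_helper(struct helper *h)
{
	assert(h != NULL);
	h->wakeups++;
}

/* Stand-in for mce_notify_work_init(): create the helper, then set the flag. */
static int notify_init(void)
{
	notify_helper = &helper_storage;
	notify_ready = true;
	return 0;
}

/* Stand-in for mce_notify_work(): returns true if a wakeup was delivered. */
static bool notify_work(void)
{
	if (!notify_ready)
		return false;	/* event arrived before init: skip the wakeup */
	wake_helper(notify_helper);
	return true;
}
```

Calling notify_work() before notify_init() returns false and leaves the
helper untouched; after notify_init() it delivers the wakeup.  The
sketch is single-threaded, so it says nothing about the memory ordering
or locking the real code may need.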

-corey

>
> -- Steve
>
>> -corey
>>
>>> -- Steve
>>>   
>>>> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
>>>> index aaf4b9b..7125584 100644
>>>> --- a/arch/x86/kernel/cpu/mcheck/mce.c
>>>> +++ b/arch/x86/kernel/cpu/mcheck/mce.c
>>>> @@ -1365,6 +1365,7 @@ static void __mce_notify_work(void)
>>>>    }
>>>>    
>>>>    #ifdef CONFIG_PREEMPT_RT_FULL
>>>> +static bool notify_work_ready __read_mostly;
>>>>    struct task_struct *mce_notify_helper;
>>>>    
>>>>    static int mce_notify_helper_thread(void *unused)
>>>> @@ -1386,12 +1387,14 @@ static int mce_notify_work_init(void)
>>>>    	if (!mce_notify_helper)
>>>>    		return -ENOMEM;
>>>>    
>>>> +	notify_work_ready = true;
>>>>    	return 0;
>>>>    }
>>>>    
>>>>    static void mce_notify_work(void)
>>>>    {
>>>> -	wake_up_process(mce_notify_helper);
>>>> +	if (notify_work_ready)
>>>> +		wake_up_process(mce_notify_helper);
>>>>    }
>>>>    #else
>>>>    static void mce_notify_work(void)



Thread overview: 25+ messages
2016-06-30 13:24 [PATCH][RT] x86: Fix an RT MCE crash minyard
2016-06-30 13:43 ` Steven Rostedt
2016-06-30 14:49   ` Corey Minyard
2016-06-30 15:51     ` Steven Rostedt
2016-06-30 15:58       ` Corey Minyard [this message]
2016-06-30 16:01       ` Borislav Petkov
2016-06-30 16:17         ` Luck, Tony
2016-06-30 16:40           ` Corey Minyard
2016-06-30 17:01             ` Borislav Petkov
2016-06-30 17:18               ` Corey Minyard
2016-06-30 17:26                 ` Borislav Petkov
2016-06-30 17:54                   ` Corey Minyard
2016-06-30 18:22                     ` Borislav Petkov
2016-06-30 19:44                       ` Corey Minyard
2016-06-30 20:34                         ` Borislav Petkov
2016-06-30 22:47                           ` Corey Minyard
2016-07-01  7:20                             ` Borislav Petkov
2016-07-06  0:59                               ` Corey Minyard
2016-07-06  8:37                                 ` Borislav Petkov
2016-07-06 12:03                                   ` Corey Minyard
2016-07-06 13:32                                     ` Steven Rostedt
2016-07-06 13:43                                       ` Sebastian Andrzej Siewior
2016-07-11 17:32                                         ` Steven Rostedt
2016-07-01  9:20         ` Daniel Wagner
2016-06-30 16:04       ` Corey Minyard
