From: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
To: Borislav Petkov <bp@amd64.org>
Cc: linux-kernel@vger.kernel.org, "x86@kernel.org" <x86@kernel.org>,
Ingo Molnar <mingo@elte.hu>, "H. Peter Anvin" <hpa@zytor.com>,
Thomas Gleixner <tglx@linutronix.de>,
Tony Luck <tony.luck@intel.com>
Subject: Re: [PATCH] x86, mce: stop calling del_timer_sync() from interrupt
Date: Fri, 17 Jun 2011 10:48:29 +0900 [thread overview]
Message-ID: <4DFAB26D.6050202@jp.fujitsu.com> (raw)
In-Reply-To: <20110616140915.GA5065@gere.osrc.amd.com>
(2011/06/16 23:09), Borislav Petkov wrote:
> On Thu, Jun 16, 2011 at 04:28:35PM +0900, Hidetoshi Seto wrote:
>> Use of on_each_cpu() results in calling the function passed as the
>> argument on interrupt context.
>
> in
>
>> Calling del_timer_sync() from interrupt context can cause deadlock
>> if it interrupts the target timer running.
>>
>> MCE code has some such misuse of del_timer_sync() in parts for sysfs
>> file; bank*, check_interval, cmci_disabled and ignore_ce.
>> Fortunately these files are rare-used but you will be warned on write.
>
> So, you're saying you're hitting the WARN_ON(in_irq()) in
> del_timer_sync() ?
Yes. It is terrible that there are simultaneous WARN_ON() * online_cpus.
I should have write something clear about that here in patch description.
>> This patch will fix it.
>
> You could say
>
> "Move timer deletion outside of the interrupt context as a fix"
>
> or something to that effect to make it much more precise what the patch
> is doing.
Sure, I'll update it.
> Patch idea looks fine, see below for some nitpicking :).
>
>>
>> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
>> ---
>> arch/x86/kernel/cpu/mcheck/mce.c | 22 ++++++++++++++++------
>> 1 files changed, 16 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
>> index ff1ae9b..77df54f 100644
>> --- a/arch/x86/kernel/cpu/mcheck/mce.c
>> +++ b/arch/x86/kernel/cpu/mcheck/mce.c
>> @@ -1170,6 +1170,17 @@ static void mce_start_timer(unsigned long data)
>> add_timer_on(t, smp_processor_id());
>> }
>>
>> +/* Must not be called from interrupt where del_timer_sync() can deadlock */
>> +static void mce_timer_delete_all(void)
>> +{
>> + int cpu;
>> +
>> + for_each_online_cpu(cpu) {
>> + if (mce_available(&per_cpu(cpu_info, cpu)))
>> + del_timer_sync(&per_cpu(mce_timer, cpu));
>> + }
>> +}
>> +
>> static void mce_do_trigger(struct work_struct *work)
>> {
>> call_usermodehelper(mce_helper, mce_helper_argv, NULL, UMH_NO_WAIT);
>> @@ -1768,7 +1779,6 @@ static struct syscore_ops mce_syscore_ops = {
>>
>> static void mce_cpu_restart(void *data)
>> {
>> - del_timer_sync(&__get_cpu_var(mce_timer));
>> if (!mce_available(__this_cpu_ptr(&cpu_info)))
>> return;
>
> I'm wondering, was this actually a bug - the fact that we deleted the
> timer _before_ we do the mce_available() check.
>
> In looking at it a bit more, it looks like mce_restart() is always
> executed on codepaths behind mce_available() checks so the
>
> if (mce_available(...))
> del_timer_sync(...);
>
> part in mce_timer_delete_all could be done without the if-check, no?
Yes, I've noticed it too.
Notable point is that these sysfs files are not created when
!mce_available(). :-P
However it is an entirely different matter than what this patch is
addressed. Just to keep old logic in mce_disable_ce(), I've added
mce_available() in the mce_timer_delete_all().
I think we can remove many redundant if-checks w/ mce_available()
around here and there.
Now I'm writing a patch for that... Please wait.
Thanks,
H.Seto
prev parent reply other threads:[~2011-06-17 1:49 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-06-16 7:28 [PATCH] x86, mce: stop calling del_timer_sync() from interrupt Hidetoshi Seto
2011-06-16 14:09 ` Borislav Petkov
2011-06-17 1:48 ` Hidetoshi Seto [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4DFAB26D.6050202@jp.fujitsu.com \
--to=seto.hidetoshi@jp.fujitsu.com \
--cc=bp@amd64.org \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=tglx@linutronix.de \
--cc=tony.luck@intel.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.