From: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
To: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"mm-commits@vger.kernel.org" <mm-commits@vger.kernel.org>,
"jens.axboe@oracle.com" <jens.axboe@oracle.com>,
"mingo@elte.hu" <mingo@elte.hu>,
"nickpiggin@yahoo.com.au" <nickpiggin@yahoo.com.au>,
"rusty@rustcorp.com.au" <rusty@rustcorp.com.au>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: + generic-ipi-fix-the-race-between-generic_smp_call_function_-and-hotplug_cfd.patch added to -mm tree
Date: Tue, 15 Sep 2009 10:03:08 +0800 [thread overview]
Message-ID: <4AAEF5DC.4070308@cn.fujitsu.com> (raw)
In-Reply-To: <1252973802.2899.88.camel@sbs-t61.sc.intel.com>
Suresh Siddha wrote:
> On Mon, 2009-09-14 at 00:22 -0700, Xiao Guangrong wrote:
>> Suresh Siddha wrote:
>>
>>>> CPU A call smp_call_function_many(wait=0) that want CPU B to call
>>>> a specific function, after smp_call_function_many() return, we let
>>>> CPU A offline immediately. Unfortunately, if CPU B receives this
>>>> IPI interrupt after CPU A down, it will crash like above description.
>>> How can cpu B receive the IPI interrupt after cpu A is down?
>>>
>>> As part of the cpu A going down, we first do the stop machine. i.e.,
>>> schedule the stop machine worker threads on each cpu. So, by the time
>>> all the worker threads on all the cpu's get scheduled and synchronized,
>>> ipi on B should get delivered.
>>>
>> Actually, those two examples have the same reason, that is how long
>> the destination CPU will receive the IPI interruption?
>>
>> If the stop machine threads can schedule in CPU B during the IPI
>> interruption delivering, It will occur those issue.
>>
>> I understand what you say but let me confuse is how we ensure it? The IPI
>> interruption is delivered over the APIC bus, It need several CPU instruction
>> cycle I guess, I also have read the spec of Intel 64 and IA-32, but not find
>> the answer, could you point out for me?
>
> Xiao, There is quite a bit of time between the time a particular cpu
> sends a smp call function IPI (with wait == 0) and the time that cpu
> starts running the stop machine thread and the whole system proceeds
> with stop machine. With in this time, typically the smp call function
> destination will receive the ipi interrupt. But theoretically the
> problem you explain might happen.
>
Yeah, though this case is very infrequent, but we can't avoid it.
>>From P4 onwards, interrupts are delivered over system bus and with NHM
> it is QPI. Also, the mechanism of scheduling the stop machine thread on
> a particular cpu involves resched IPI etc.
>
> Nevertheless, Have you seen a real hang or system crash due to this? If
> so, on what platform?
>
> Ideally, for xapic based platform, clear status of sender APIC ICR's
> delivery status indicates that the interrupt is registered at the
> receiver. for x2apic based platform, sending another interrupt will
> ensure that the previous interrupt was delivered.
>
> If you have indeed seen a crash related to this, can you review and give
> the appended patch a try and see if it fixes the issue? If you agree
> with the fix, then I will send the patch with a detailed change log etc.
>
I not seen this crash, just afraid it's unsafe while I review the code,
I try to generate this crash, but as you know, the race point is hard
to control.
> Your current fix is not clean and not complete in my opinion (as calling
> interrupt handlers manually and not doing the callbacks etc might cause
> other side affects). Thanks.
It is not the last version and doing the callbacks in another patch,
see below URL please:
http://marc.info/?l=linux-mm-commits&m=124900028228350&w=2
I think we do better handle this in CPU down path in kernel/smp.c, It's very
safe and let people easy to understand.
> ---
>
> diff --git a/include/linux/smp.h b/include/linux/smp.h
> index 9e3d8af..69ec2a9 100644
> --- a/include/linux/smp.h
> +++ b/include/linux/smp.h
> @@ -93,8 +93,9 @@ void generic_smp_call_function_single_interrupt(void);
> void generic_smp_call_function_interrupt(void);
> void ipi_call_lock(void);
> void ipi_call_unlock(void);
> void ipi_call_lock_irq(void);
> void ipi_call_unlock_irq(void);
> +void quiesce_smp_call_functions(void);
> #endif
>
> /*
> diff --git a/kernel/smp.c b/kernel/smp.c
> index 8e21850..d13a888 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -479,6 +479,27 @@ int smp_call_function(void (*func)(void *), void *info, int wait)
> }
> EXPORT_SYMBOL(smp_call_function);
>
> +void quiesce_smp_call_functions(void)
> +{
> + struct call_single_queue *q = &__get_cpu_var(call_single_queue);
> + bool empty;
> + unsigned long flags;
> +
> + do {
> + cpu_relax();
> + spin_lock_irqsave(&q->lock, flags);
> + empty = list_empty(&q->list);
> + spin_unlock_irqrestore(&q->lock, flags);
> + } while (!empty);
> +
> + do {
> + cpu_relax();
> + spin_lock_irqsave(&call_function.lock, flags);
> + empty = list_empty(&call_function.queue);
> + spin_unlock_irqrestore(&call_function.lock, flags);
> + } while (!empty);
> +}
> +
Why we need waiting CPU to handle this? It make no sense because the CPU is dying,
we can simple ignore the IPI request.
> void ipi_call_lock(void)
> {
> spin_lock(&call_function.lock);
> diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
> index 912823e..dd2d90f 100644
> --- a/kernel/stop_machine.c
> +++ b/kernel/stop_machine.c
> @@ -86,6 +86,9 @@ static void stop_cpu(struct work_struct *unused)
> curstate = state;
> switch (curstate) {
> case STOPMACHINE_DISABLE_IRQ:
> +#ifdef CONFIG_USE_GENERIC_SMP_HELPERS
> + quiesce_smp_call_functions();
> +#endif
It seems ugly, we can define a noop function if CONFIG_USE_GENERIC_SMP_HELPERS is not
defined.
Another problem is that all CPU must call quiesce_smp_call_functions() here, but only
dying CPU need do it.
> local_irq_disable();
> hard_irq_disable();
It will cause another race, if CPU A send a IPI interruption after CPU B call
quiesce_smp_call_functions() and disable IRQ, it will case the same problem.
(in this time, CPU B is enter stop machine, but CPU A is not)
Thanks,
Xiao
next prev parent reply other threads:[~2009-09-15 2:04 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-07-31 0:30 + generic-ipi-fix-the-race-between-generic_smp_call_function_-and-hotplug_cfd.patch added to -mm tree akpm
[not found] ` <1252616988.7205.102.camel@laptop>
2009-09-11 7:45 ` Xiao Guangrong
2009-09-11 7:50 ` Peter Zijlstra
2009-09-11 19:08 ` Suresh Siddha
2009-09-14 7:22 ` Xiao Guangrong
2009-09-15 0:16 ` Suresh Siddha
2009-09-15 2:03 ` Xiao Guangrong [this message]
2009-09-16 2:20 ` Suresh Siddha
2009-09-17 3:00 ` Xiao Guangrong
2009-09-17 7:45 ` Peter Zijlstra
2009-09-19 2:16 ` Suresh Siddha
2009-09-21 2:55 ` Xiao Guangrong
2009-09-21 3:11 ` Suresh Siddha
2009-09-21 4:04 ` Xiao Guangrong
2009-09-22 5:32 ` Suresh Siddha
2009-09-22 6:52 ` Xiao Guangrong
-- strict thread matches above, loose matches on Subject: below --
2009-07-29 23:31 akpm
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4AAEF5DC.4070308@cn.fujitsu.com \
--to=xiaoguangrong@cn.fujitsu.com \
--cc=akpm@linux-foundation.org \
--cc=jens.axboe@oracle.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=mm-commits@vger.kernel.org \
--cc=nickpiggin@yahoo.com.au \
--cc=peterz@infradead.org \
--cc=rusty@rustcorp.com.au \
--cc=suresh.b.siddha@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.