Linux MIPS Architecture development
* Ftrace for MIPS may hang on SMP system
@ 2010-08-22 12:18 wu zhangjin
  2010-08-22 12:20 ` wu zhangjin
  0 siblings, 1 reply; 13+ messages in thread
From: wu zhangjin @ 2010-08-22 12:18 UTC (permalink / raw)
  To: Ralf Baechle, Steven Rostedt, David Daney; +Cc: linux-mips

Hi, all

Since I didn't have an SMP machine, I had not used Ftrace (in 2.6.34)
for MIPS on an SMP system before. Yesterday I got an RMI XLS machine and
found that Ftrace for MIPS hung on it after I issued:

$ echo function > /debug/tracing/current_tracer

I have found the root cause:

in kernel/trace/ftrace.c:

stop_machine() disables the irqs of the other cpus and then modifies the
code by calling the arch-specific ftrace_modify_code() via
__ftrace_modify_code().

As the comment about stop_machine() in arch/x86/kernel/ftrace.c explains:

/*
 * Modifying code must take extra care. On an SMP machine, if
 * the code being modified is also being executed on another CPU
 * that CPU will have undefined results and possibly take a GPF.
 * We use kstop_machine to stop other CPUS from exectuing code.
[snip]

So it is reasonable to use stop_machine() here.

And in arch/mips/kernel/ftrace.c:

flush_icache_range() is called in ftrace_modify_code() to ensure that
the instructions that will be executed are the ones we want.

On a UP system there is no problem, because flush_icache_range() simply
flushes the local instruction cache. On an SMP system this may be
different: flush_icache_range() may also need to ask the other cpus (by
sending an IPI) to flush their icaches, and it then waits until they
have finished.

But as noted above, the irqs of the other cpus are disabled by
stop_machine(), so they never get a chance to flush their icaches and
the current cpu waits for them forever: soft lockup --> hang.
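
The deadlock described above can be modelled in a few lines of
user-space C. This is only a sketch under stated assumptions: the names
struct cpu_model, ipi_flush_completes() and model_stop_machine() are
hypothetical and are not kernel APIs. Each cpu is reduced to a single
"irqs enabled" flag, and a remote icache flush is assumed to complete
only if every other cpu can take the IPI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one cpu: we only track whether it can take an IPI. */
struct cpu_model {
	bool irqs_enabled;
};

/*
 * Model of an IPI-based icache flush issued by cpu `self`.
 * Returns true if every other cpu can acknowledge the IPI, false if the
 * caller would wait forever for at least one of them.
 */
static bool ipi_flush_completes(const struct cpu_model *cpus, size_t ncpus,
				size_t self)
{
	assert(self < ncpus);
	for (size_t i = 0; i < ncpus; i++) {
		if (i == self)
			continue;	/* the local flush needs no IPI */
		if (!cpus[i].irqs_enabled)
			return false;	/* this cpu never services the IPI */
	}
	return true;
}

/* Model of the stopped state: every cpu has its irqs disabled. */
static void model_stop_machine(struct cpu_model *cpus, size_t ncpus)
{
	for (size_t i = 0; i < ncpus; i++)
		cpus[i].irqs_enabled = false;
}
```

In this model a flush issued on a 4-cpu system completes before
model_stop_machine() and can no longer complete afterwards, while a
1-cpu (UP) system is never affected because there is nobody to wait for.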

To fix it, there are two potential solutions:

1. Replace flush_icache_range() with something else. Maybe we can use a
method similar to the one in arch/x86/kernel/ftrace.c: x86 calls
sync_core(), defined in arch/x86/include/asm/processor.h, which
serializes instruction execution on the calling cpu instead of sending
a flush request to the other processors:

/* Stop speculative execution and prefetching of modified code. */
static inline void sync_core(void)
{
        int tmp;

#if defined(CONFIG_M386) || defined(CONFIG_M486)
        if (boot_cpu_data.x86 < 5)
                /* There is no speculative execution.
                 * jmp is a barrier to prefetching. */
                asm volatile("jmp 1f\n1:\n" ::: "memory");
        else
#endif
                /* cpuid is a barrier to speculative execution.
                 * Prefetched instructions are automatically
                 * invalidated when modified. */
                asm volatile("cpuid" : "=a" (tmp) : "0" (1)
                             : "ebx", "ecx", "edx", "memory");
}

But is there a cpuid-like hardware instruction on MIPS SMP? As far as I
know, on UP we may be able to use a prefetch instruction to push the
instruction into the cache, but on SMP, is there an instruction that
forces the other cpus to flush their caches too?

2. Replace stop_machine() with something else.

I have written such a patch:

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 2404b59..e4d058f 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -1129,13 +1129,18 @@ static int __ftrace_modify_code(void *data)
 static void ftrace_run_update_code(int command)
 {
        int ret;
+       unsigned long flags;

        ret = ftrace_arch_code_modify_prepare();
        FTRACE_WARN_ON(ret);
        if (ret)
                return;

-       stop_machine(__ftrace_modify_code, &command, NULL);
+       preempt_disable();
+       local_irq_save(flags);
+       __ftrace_modify_code(&command);
+       local_irq_restore(flags);
+       preempt_enable();

        ret = ftrace_arch_code_modify_post_process();
        FTRACE_WARN_ON(ret);

It works without any hang, but I'm not sure whether it avoids the
"undefined results" problem mentioned above. Here we may need to
prevent the other cpus from executing the code we are modifying, while
still allowing them to take the IPI and flush their icaches.

I have also taken a look at the code modification part of kgdb; it
seems it doesn't use stop_machine().

What do you think?

Thanks & Regards,
Wu Zhangjin

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: Ftrace for MIPS may hang on SMP system
  2010-08-22 12:18 Ftrace for MIPS may hang on SMP system wu zhangjin
@ 2010-08-22 12:20 ` wu zhangjin
  2010-08-22 14:27   ` wu zhangjin
  0 siblings, 1 reply; 13+ messages in thread
From: wu zhangjin @ 2010-08-22 12:20 UTC (permalink / raw)
  To: Ralf Baechle, Steven Rostedt, David Daney; +Cc: linux-mips

(Add 'another' Steven in this loop)


-- 
MSN+Gtalk: wuzhangjin@gmail.com
Blog: http://falcon.oss.lzu.edu.cn
Tel:+86-18710032278

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Ftrace for MIPS may hang on SMP system
  2010-08-22 12:20 ` wu zhangjin
@ 2010-08-22 14:27   ` wu zhangjin
  2010-08-23 12:50     ` wu zhangjin
  2010-08-30 20:47     ` Steven Rostedt
  0 siblings, 2 replies; 13+ messages in thread
From: wu zhangjin @ 2010-08-22 14:27 UTC (permalink / raw)
  To: Ralf Baechle, Steven Rostedt, David Daney; +Cc: linux-mips

On 8/22/10, wu zhangjin <wuzhangjin@gmail.com> wrote:
>> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
>> index 2404b59..e4d058f 100644
>> --- a/kernel/trace/ftrace.c
>> +++ b/kernel/trace/ftrace.c
>> @@ -1129,13 +1129,18 @@ static int __ftrace_modify_code(void *data)
>>  static void ftrace_run_update_code(int command)
>>  {
>>         int ret;
>> +       unsigned long flags;
>>
>>         ret = ftrace_arch_code_modify_prepare();
>>         FTRACE_WARN_ON(ret);
>>         if (ret)
>>                 return;
>>
>> -       stop_machine(__ftrace_modify_code, &command, NULL);
>> +       preempt_disable();
>> +       local_irq_save(flags);
>> +       __ftrace_modify_code(&command);
>> +       local_irq_restore(flags);
>> +       preempt_enable();
>>
>>         ret = ftrace_arch_code_modify_post_process();
>>         FTRACE_WARN_ON(ret);
>>

We may need to protect __ftrace_modify_code() with a raw spinlock.
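
As a rough illustration of what such a lock provides, here is a
user-space sketch built on C11 atomic_flag. It is an assumption-laden
stand-in, not the kernel's raw_spinlock_t: struct model_spinlock,
model_lock(), model_unlock() and model_patch_site() are hypothetical
names. Note that a lock like this only serializes the writers that take
it; it does not stop another cpu from executing the code being patched.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a raw spinlock. */
struct model_spinlock {
	atomic_flag locked;
};

/* Try to take the lock once; returns true on success. */
static bool model_trylock(struct model_spinlock *l)
{
	/* test_and_set returns the previous value: false means we got it */
	return !atomic_flag_test_and_set_explicit(&l->locked,
						  memory_order_acquire);
}

static void model_lock(struct model_spinlock *l)
{
	while (!model_trylock(l))
		;	/* spin until the holder releases the lock */
}

static void model_unlock(struct model_spinlock *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* Counter standing in for "one call site has been rewritten". */
static unsigned int patched_sites;

/* Patch one (imaginary) call site with all other writers excluded. */
static void model_patch_site(struct model_spinlock *l)
{
	assert(l != NULL);
	model_lock(l);
	patched_sites++;
	model_unlock(l);
}
```

A lock object would be initialized with ATOMIC_FLAG_INIT, for example
"struct model_spinlock l = { ATOMIC_FLAG_INIT };", and then passed to
model_patch_site() for every site that is rewritten.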


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Ftrace for MIPS may hang on SMP system
  2010-08-22 14:27   ` wu zhangjin
@ 2010-08-23 12:50     ` wu zhangjin
  2010-08-23 14:16       ` wu zhangjin
  2010-08-30 20:47     ` Steven Rostedt
  1 sibling, 1 reply; 13+ messages in thread
From: wu zhangjin @ 2010-08-23 12:50 UTC (permalink / raw)
  To: Ralf Baechle, Steven Rostedt, David Daney; +Cc: linux-mips

Hi,

To avoid touching the other parts, I have used the following method:
delay the cache flushing operation until after stop_machine() has
returned.

Here is the patch:

diff --git a/arch/mips/kernel/ftrace.c b/arch/mips/kernel/ftrace.c
index 5a84a1f..f4c9581 100644
--- a/arch/mips/kernel/ftrace.c
+++ b/arch/mips/kernel/ftrace.c
@@ -33,6 +33,25 @@ static inline int in_module(unsigned long ip)
        return ip & 0x40000000;
 }

+#ifdef CONFIG_SMP
+static bool machine_stopped;
+
+int ftrace_arch_code_modify_prepare(void)
+{
+       preempt_disable();
+       machine_stopped = 1;
+       return 0;
+}
+
+int ftrace_arch_code_modify_post_process(void)
+{
+       __flush_cache_all();
+       machine_stopped = 0;
+       preempt_enable();
+       return 0;
+}
+#endif
+
 #ifdef CONFIG_DYNAMIC_FTRACE

 #define JAL 0x0c000000         /* jump & link: ip --> ra, jump to target */
@@ -79,7 +98,12 @@ static int ftrace_modify_code(unsigned long ip, unsigned int new_code)
        if (unlikely(faulted))
                return -EFAULT;

-       flush_icache_range(ip, ip + 8);
+#ifndef CONFIG_SMP
+       flush_icache_range(ip, ip + MCOUNT_INSN_SIZE);
+#else
+       if (!machine_stopped)
+               flush_icache_range(ip, ip + MCOUNT_INSN_SIZE);
+#endif

        return 0;
 }
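
The idea behind the patch above can be sketched as a small user-space
model. Everything here is hypothetical: the counters range_flushes and
global_flushes stand in for flush_icache_range() and
__flush_cache_all(), and the model_*() functions only mirror the control
flow of the prepare / modify / post_process hooks. They do not touch any
cache.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical counters standing in for the real cache operations. */
static unsigned int range_flushes;	/* per-site flushes (IPIs on SMP) */
static unsigned int global_flushes;	/* one flush after the stop phase */
static bool machine_stopped;

/* Mirrors ftrace_arch_code_modify_prepare() in the patch. */
static void model_modify_prepare(void)
{
	machine_stopped = true;
}

/* Mirrors the tail of ftrace_modify_code(): flush now only if it is safe. */
static void model_modify_code(void)
{
	if (!machine_stopped)
		range_flushes++;
}

/* Mirrors ftrace_arch_code_modify_post_process() in the patch. */
static void model_modify_post_process(void)
{
	global_flushes++;	/* stands in for __flush_cache_all() */
	machine_stopped = false;
}

/* Patch `nsites` call sites between prepare and post_process. */
static void model_update(unsigned int nsites)
{
	assert(!machine_stopped);	/* updates are not nested */
	model_modify_prepare();
	for (unsigned int i = 0; i < nsites; i++)
		model_modify_code();
	model_modify_post_process();
}
```

With this model, model_update(100) performs no per-site flush while the
machine is stopped and exactly one global flush afterwards, whereas a
call to model_modify_code() outside an update still flushes immediately.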


Regards,
Wu Zhangjin


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: Ftrace for MIPS may hang on SMP system
  2010-08-23 12:50     ` wu zhangjin
@ 2010-08-23 14:16       ` wu zhangjin
  2010-08-23 17:35         ` David Daney
  2010-08-30 20:48         ` Steven Rostedt
  0 siblings, 2 replies; 13+ messages in thread
From: wu zhangjin @ 2010-08-23 14:16 UTC (permalink / raw)
  To: Ralf Baechle, Steven Rostedt, David Daney; +Cc: linux-mips

On 8/23/10, wu zhangjin <wuzhangjin@gmail.com> wrote:
> Hi,
>
> To avoid touching the other parts, I have used the following method:
>
> delay the cache flushing operation after the stop_machine().
>
> Here is the patch:
>
> diff --git a/arch/mips/kernel/ftrace.c b/arch/mips/kernel/ftrace.c
> index 5a84a1f..f4c9581 100644
> --- a/arch/mips/kernel/ftrace.c
> +++ b/arch/mips/kernel/ftrace.c
> @@ -33,6 +33,25 @@ static inline int in_module(unsigned long ip)
>         return ip & 0x40000000;
>  }
>
> +#ifdef CONFIG_SMP
> +static bool machine_stopped;
> +
> +int ftrace_arch_code_modify_prepare(void)
> +{
> +       preempt_disable();

preempt_disable() is not necessary here, and it may trigger the
"scheduling while atomic" warning.

Regards,
Wu Zhangjin

> +       machine_stopped = 1;
> +       return 0;
> +}
> +
> +int ftrace_arch_code_modify_post_process(void)
> +{
> +       __flush_cache_all();
> +       machine_stopped = 0;
> +       preempt_enable();
> +       return 0;
> +}
> +#endif
> +
>  #ifdef CONFIG_DYNAMIC_FTRACE
>
>  #define JAL 0x0c000000         /* jump & link: ip --> ra, jump to target */
> @@ -79,7 +98,12 @@ static int ftrace_modify_code(unsigned long ip, unsigned int new_code)
>         if (unlikely(faulted))
>                 return -EFAULT;
>
> -       flush_icache_range(ip, ip + 8);
> +#ifndef CONFIG_SMP
> +       flush_icache_range(ip, ip + MCOUNT_INSN_SIZE);
> +#else
> +       if (!machine_stopped)
> +               flush_icache_range(ip, ip + MCOUNT_INSN_SIZE);
> +#endif
>
>         return 0;
>  }
>
>
> Regards,
> Wu Zhangjin
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Ftrace for MIPS may hang on SMP system
  2010-08-23 14:16       ` wu zhangjin
@ 2010-08-23 17:35         ` David Daney
  2010-08-24  6:25           ` wu zhangjin
  2010-08-30 20:48         ` Steven Rostedt
  1 sibling, 1 reply; 13+ messages in thread
From: David Daney @ 2010-08-23 17:35 UTC (permalink / raw)
  To: wu zhangjin; +Cc: Ralf Baechle, Steven Rostedt, linux-mips

Can you send a real patch with your proposed complete fix and proper 
Signed-off-by: header?

I would like to test it.

Also, as a point of reference, I have been using ftrace on 16-way SMP
mips64 systems without seeing this issue.

David Daney

On 08/23/2010 07:16 AM, wu zhangjin wrote:
> On 8/23/10, wu zhangjin<wuzhangjin@gmail.com>  wrote:
>> Hi,
>>
>> To avoid touching the other parts, I have used the following method:
>>
>> delay the cache flushing operation after the stop_machine().
>>
>> Here is the patch:
>>
>> diff --git a/arch/mips/kernel/ftrace.c b/arch/mips/kernel/ftrace.c
>> index 5a84a1f..f4c9581 100644
>> --- a/arch/mips/kernel/ftrace.c
>> +++ b/arch/mips/kernel/ftrace.c
>> @@ -33,6 +33,25 @@ static inline int in_module(unsigned long ip)
>>          return ip&  0x40000000;
>>   }
>>
>> +#ifdef CONFIG_SMP
>> +static bool machine_stopped;
>> +
>> +int ftrace_arch_code_modify_prepare(void)
>> +{
>> +       preempt_disable();
>
> preempt_disable() is not necessary, and it may introduce the warning
> about "scheduling in atomic()"
>
> Regards,
> Wu Zhangjin
>
>> +       machine_stopped = 1;
>> +       return 0;
>> +}
>> +
>> +int ftrace_arch_code_modify_post_process(void)
>> +{
>> +       __flush_cache_all();
>> +       machine_stopped = 0;
>> +       preempt_enable();
>> +       return 0;
>> +}
>> +#endif
>> +
>>   #ifdef CONFIG_DYNAMIC_FTRACE
>>
>>   #define JAL 0x0c000000         /* jump&  link: ip -->  ra, jump to target
>> */
>> @@ -79,7 +98,12 @@ static int ftrace_modify_code(unsigned long ip,
>> unsigned int new_code)
>>          if (unlikely(faulted))
>>                  return -EFAULT;
>>
>> -       flush_icache_range(ip, ip + 8);
>> +#ifndef CONFIG_SMP
>> +       flush_icache_range(ip, ip + MCOUNT_INSN_SIZE);
>> +#else
>> +       if (!machine_stopped)
>> +               flush_icache_range(ip, ip + MCOUNT_INSN_SIZE);
>> +#endif
>>
>>          return 0;
>>   }


* Re: Ftrace for MIPS may hang on SMP system
  2010-08-23 17:35         ` David Daney
@ 2010-08-24  6:25           ` wu zhangjin
  2010-08-30 20:51             ` Steven Rostedt
  0 siblings, 1 reply; 13+ messages in thread
From: wu zhangjin @ 2010-08-24  6:25 UTC (permalink / raw)
  To: David Daney; +Cc: Ralf Baechle, Steven Rostedt, linux-mips

On 8/24/10, David Daney <ddaney@caviumnetworks.com> wrote:
> Can you send a real patch with your proposed complete fix and proper
> Signed-off-by: header?
>
> I would like to test it.
>
> Also as a point of reference, I have been using ftrace on 16 way SMP
> mips64 systems without seeing this issue.

Yes, even without this patch it works well on Cavium. The reason is
that Cavium doesn't use smp_call_function() to send the IPIs; it just
sends them with octeon_send_ipi_single():

flush_icache_range              = octeon_flush_icache_range;

/**
 * Flush a range of kernel addresses out of the icache
 *
 */
static void octeon_flush_icache_range(unsigned long start, unsigned long end)
{
        octeon_flush_icache_all_cores(NULL);
}

octeon_flush_icache_all_cores
{
         [snip]
         for_each_cpu_mask(cpu, mask)
                octeon_send_ipi_single(cpu, SMP_ICACHE_FLUSH);
         [snip]
}

static irqreturn_t mailbox_interrupt(int irq, void *dev_id)
{
        [snip]
        if (action & SMP_CALL_FUNCTION)
                smp_call_function_interrupt();

        /* Check if we've been told to flush the icache */
        if (action & SMP_ICACHE_FLUSH)
                asm volatile ("synci 0($0)\n");
        return IRQ_HANDLED;
}

When the IRQs of the other CPUs are disabled by stop_machine(), their
icaches will not be flushed until the IRQs are enabled again. That is
okay on Cavium, because its synci flushes the whole local icache once
the pending IPI is handled:

static inline void octeon_local_flush_icache(void)
{
        asm volatile ("synci 0($0)");
}

But on RMI XLS, flush_icache_range() flushes only the indicated
addresses via the cache instruction; it doesn't flush the whole local
icache. So, if too many flush requests are sent to the other CPUs, only
the icache lines of the latest addresses will be flushed after the IRQs
are enabled. RMI XLS also uses smp_call_function() to send the IPI to
the other CPUs. If the IRQs of those CPUs are disabled, there will be a
deadlock, because smp_call_function_many(), called by
smp_call_function(), waits for the other CPUs.

So, this patch is necessary to fix the deadlock and icache problem on
RMI XLS, and it will also improve performance by reducing the
unnecessary IPIs on RMI XLS and Cavium.
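
The difference between the two IPI styles can be sketched with a small
user-space model in C. This is only an illustration under stated
assumptions, not kernel code: struct model_cpu and the model_*
functions are hypothetical names, and "flush" here just bumps a
counter. It assumes a posted request is latched as a single pending
bit, and that a waiting sender can never finish while any target CPU
has IRQs off.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4

/* Hypothetical per-CPU state for this model; not a kernel structure. */
struct model_cpu {
    bool irqs_disabled;   /* CPU is parked with IRQs off, as under stop_machine() */
    bool flush_pending;   /* a flush IPI was posted but not yet handled */
    unsigned int flushes; /* number of icache flushes this CPU performed */
};

/*
 * Fire-and-forget style: post the request to every other CPU and return.
 * A CPU with IRQs off only latches the request, so repeated posts
 * collapse into one pending bit.
 */
static void model_post_flush(struct model_cpu *cpus, int sender)
{
    assert(sender >= 0 && sender < MODEL_NR_CPUS);
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
        if (cpu == sender)
            continue;
        if (cpus[cpu].irqs_disabled)
            cpus[cpu].flush_pending = true;
        else
            cpus[cpu].flushes++;
    }
}

/*
 * Wait-for-completion style: the sender would spin until every other CPU
 * has run the handler. Returns false where the real code would hang.
 */
static bool model_call_flush_and_wait(struct model_cpu *cpus, int sender)
{
    assert(sender >= 0 && sender < MODEL_NR_CPUS);
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        if (cpu != sender && cpus[cpu].irqs_disabled)
            return false; /* this CPU can never acknowledge */

    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        if (cpu != sender)
            cpus[cpu].flushes++;
    return true;
}

/* Re-enable IRQs on one CPU and service a latched flush request, if any. */
static void model_enable_irqs(struct model_cpu *cpu)
{
    cpu->irqs_disabled = false;
    if (cpu->flush_pending) {
        cpu->flush_pending = false;
        cpu->flushes++;
    }
}
```

In this model, several posted requests to a stopped CPU result in one
flush after its IRQs come back. That is harmless when the handler
flushes the whole icache, but it loses information when each request
names a different address range.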

Thanks & Regards,
Wu Zhangjin



-- 
MSN+Gtalk: wuzhangjin@gmail.com
Blog: http://falcon.oss.lzu.edu.cn
Tel:+86-18710032278


* Re: Ftrace for MIPS may hang on SMP system
  2010-08-22 14:27   ` wu zhangjin
  2010-08-23 12:50     ` wu zhangjin
@ 2010-08-30 20:47     ` Steven Rostedt
  1 sibling, 0 replies; 13+ messages in thread
From: Steven Rostedt @ 2010-08-30 20:47 UTC (permalink / raw)
  To: wu zhangjin; +Cc: Ralf Baechle, David Daney, linux-mips

Sorry for the late reply, just got back from vacation.


On Sun, 2010-08-22 at 22:27 +0800, wu zhangjin wrote:
> On 8/22/10, wu zhangjin <wuzhangjin@gmail.com> wrote:
> > (Add 'another' Steven in this loop)
> >
> > On 8/22/10, wu zhangjin <wuzhangjin@gmail.com> wrote:
> >> Hi, all
> >>
> >> For I didn't have a SMP machine, I haven't used Ftrace(in 2.6.34) for
> >> MIPS on SMP system before, Yesterday,  I got a RMI XLS machine and
> >> found Ftrace for MIPS hanged on it after I issued:
> >>
> >> $ echo function > /debug/tracing/current_tracer
> >>
> >> I have gotten the root cause, that is:
> >>
> >> in kernel/trace/ftrace.c:
> >>
> >> stop_machine() disables the irqs of the other cpus and then modify the
> >> codes via calling the arch specific ftrace_modify_code() in
> >> __ftrace_modify_code().
> >>
> >> As the description about stop_machine() in arch/x86/kernel/ftrace.c
> >> shows:
> >>
> >> /*
> >>  * Modifying code must take extra care. On an SMP machine, if
> >>  * the code being modified is also being executed on another CPU
> >>  * that CPU will have undefined results and possibly take a GPF.
> >>  * We use kstop_machine to stop other CPUS from exectuing code.
> >> [snip]
> >>
> >> Then, it is reasonable to use stop_machine() here.
> >>
> >> And in arch/mips/kernel/ftrace.c:
> >>
> >> flush_icache_range() is called in ftrace_modify_code() to ensure the
> >> intructions will be executed are what we want.
> >>
> >> In UP system, there is no problem for flush_icache_range() simply
> >> flush the instruction cache, but In SMP system, this may be different,
> >> for flush_icache_range() may also need to ask the other cpus (via
> >> sending ipi interrupt) to flush their icaches and will wait for them
> >> till the other cpus finish their flushing.
> >>
> >> But as we know above, the irqs of the other cpus are disabled by
> >> stop_machine(), they have no opportunity to flush their icache and
> >> will let the current cpu wait for them all the time, then soft lock
> >> --> hang.
> >>
> >> To fix it, there are two potential solutions:
> >>
> >> 1. replace flush_icache_range() by something else, maybe we can use
> >> the similar method in arch/x86/kernel/ftrace.c, x86 uses sync_core()
> >> defined in arch/x86/include/asm/processor.h to flush the icache on all
> >> processors:
> >>
> >> /* Stop speculative execution and prefetching of modified code. */
> >> static inline void sync_core(void)
> >> {
> >>         int tmp;
> >>
> >> #if defined(CONFIG_M386) || defined(CONFIG_M486)
> >>         if (boot_cpu_data.x86 < 5)
> >>                 /* There is no speculative execution.
> >>                  * jmp is a barrier to prefetching. */
> >>                 asm volatile("jmp 1f\n1:\n" ::: "memory");
> >>         else
> >> #endif
> >>                 /* cpuid is a barrier to speculative execution.
> >>                  * Prefetched instructions are automatically
> >>                  * invalidated when modified. */
> >>                 asm volatile("cpuid" : "=a" (tmp) : "0" (1)
> >>                              : "ebx", "ecx", "edx", "memory");
> >> }
> >>
> >> But is there a cpuid like hardware instruction in MIPS SMP? As I know,
> >> in UP, we may be possible to use prefetch instruction to push the
> >> instruction to the cache, but in SMP, is there a instruction to force
> >> the other cpus to flush their cache too?
> >>
> >> 2. Replace the stop_machine() by something else
> >>
> >> I have written such a patch:
> >>
> >> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> >> index 2404b59..e4d058f 100644
> >> --- a/kernel/trace/ftrace.c
> >> +++ b/kernel/trace/ftrace.c
> >> @@ -1129,13 +1129,18 @@ static int __ftrace_modify_code(void *data)
> >>  static void ftrace_run_update_code(int command)
> >>  {
> >>         int ret;
> >> +       unsigned long flags;
> >>
> >>         ret = ftrace_arch_code_modify_prepare();
> >>         FTRACE_WARN_ON(ret);
> >>         if (ret)
> >>                 return;
> >>
> >> -       stop_machine(__ftrace_modify_code, &command, NULL);
> >> +       preempt_disable();
> >> +       local_irq_save(flags);
> >> +       __ftrace_modify_code(&command);
> >> +       local_irq_restore(flags);
> >> +       preempt_enable();
> >>
> >>         ret = ftrace_arch_code_modify_post_process();
> >>         FTRACE_WARN_ON(ret);
> >>
> 
> We may need to protect the __ftrace_modify_code() with raw spin lock.

Nope, this will still crash an x86 box. The problem is not with
synchronizing the modification of code, but with modifying code that may
be executing on another CPU.

If the code is entering the pipeline on one CPU while another CPU
modifies that code (especially in kernel mode), the executing
CPU may take a general protection fault.

I believe this is true of some PPCs as well, so it is not an x86-only
problem.

It may be a problem on MIPS too.

-- Steve


* Re: Ftrace for MIPS may hang on SMP system
  2010-08-23 14:16       ` wu zhangjin
  2010-08-23 17:35         ` David Daney
@ 2010-08-30 20:48         ` Steven Rostedt
  1 sibling, 0 replies; 13+ messages in thread
From: Steven Rostedt @ 2010-08-30 20:48 UTC (permalink / raw)
  To: wu zhangjin; +Cc: Ralf Baechle, David Daney, linux-mips

On Mon, 2010-08-23 at 22:16 +0800, wu zhangjin wrote:
> On 8/23/10, wu zhangjin <wuzhangjin@gmail.com> wrote:
> > Hi,
> >
> > To avoid touching the other parts, I have used the following method:
> >
> > delay the cache flushing operation after the stop_machine().
> >
> > Here is the patch:
> >
> > diff --git a/arch/mips/kernel/ftrace.c b/arch/mips/kernel/ftrace.c
> > index 5a84a1f..f4c9581 100644
> > --- a/arch/mips/kernel/ftrace.c
> > +++ b/arch/mips/kernel/ftrace.c
> > @@ -33,6 +33,25 @@ static inline int in_module(unsigned long ip)
> >         return ip & 0x40000000;
> >  }
> >
> > +#ifdef CONFIG_SMP
> > +static bool machine_stopped;
> > +
> > +int ftrace_arch_code_modify_prepare(void)
> > +{
> > +       preempt_disable();
> 
> preempt_disable() is not necessary, and it may introduce the warning
> about "scheduling in atomic()"
> 

Correct,

Acked-by: Steven Rostedt <rostedt@goodmis.org>

-- Steve

> Regards,
> Wu Zhangjin
> 
> > +       machine_stopped = 1;
> > +       return 0;
> > +}
> > +
> > +int ftrace_arch_code_modify_post_process(void)
> > +{
> > +       __flush_cache_all();
> > +       machine_stopped = 0;
> > +       preempt_enable();
> > +       return 0;
> > +}
> > +#endif
> > +
> >  #ifdef CONFIG_DYNAMIC_FTRACE
> >
> >  #define JAL 0x0c000000         /* jump & link: ip --> ra, jump to target
> > */
> > @@ -79,7 +98,12 @@ static int ftrace_modify_code(unsigned long ip,
> > unsigned int new_code)
> >         if (unlikely(faulted))
> >                 return -EFAULT;
> >
> > -       flush_icache_range(ip, ip + 8);
> > +#ifndef CONFIG_SMP
> > +       flush_icache_range(ip, ip + MCOUNT_INSN_SIZE);
> > +#else
> > +       if (!machine_stopped)
> > +               flush_icache_range(ip, ip + MCOUNT_INSN_SIZE);
> > +#endif
> >
> >         return 0;
> >  }
> >
> >
> > Regards,
> > Wu Zhangjin


* Re: Ftrace for MIPS may hang on SMP system
  2010-08-24  6:25           ` wu zhangjin
@ 2010-08-30 20:51             ` Steven Rostedt
  2010-08-31  3:33               ` wu zhangjin
  0 siblings, 1 reply; 13+ messages in thread
From: Steven Rostedt @ 2010-08-30 20:51 UTC (permalink / raw)
  To: wu zhangjin; +Cc: David Daney, Ralf Baechle, linux-mips

On Tue, 2010-08-24 at 14:25 +0800, wu zhangjin wrote:

> So, this patch is necessary to fix the deadlock and icache problem on
> RMI XLS, and it will also improve performance by reducing the
> unnecessary IPIs on RMI XLS and Cavium.

I was about to mention the performance boost of the patch even on
machines not affected by the lockup.

When you enable function tracing, it can update 22,000 locations. Doing
a cache invalidate 22,000 times in a row is not very efficient. Only a
full cache flush is needed at the end of the update (except for the
module updates, which are done on a live system).
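
The cost difference can be put into numbers with a small user-space
model in C. This is a sketch under assumptions, not a measurement:
struct patch_stats, model_patch_sites() and model_ipis() are
hypothetical names, and the model assumes each flush operation has to
notify every other CPU exactly once.

```c
#include <assert.h>
#include <stdbool.h>

/* Counters for this model only; the names are hypothetical. */
struct patch_stats {
    unsigned long sites_patched; /* call sites rewritten */
    unsigned long flush_calls;   /* icache flush operations issued */
};

/*
 * Patch nr_sites call sites. With defer_flush false, every site is
 * followed by its own flush; with defer_flush true, one flush is issued
 * after the last site.
 */
static struct patch_stats model_patch_sites(unsigned long nr_sites,
                                            bool defer_flush)
{
    struct patch_stats st = { 0, 0 };

    for (unsigned long i = 0; i < nr_sites; i++) {
        st.sites_patched++;
        if (!defer_flush)
            st.flush_calls++;
    }
    if (defer_flush && nr_sites > 0)
        st.flush_calls++;
    return st;
}

/* IPIs needed if each flush must notify every other CPU once. */
static unsigned long model_ipis(struct patch_stats st, unsigned int nr_cpus)
{
    assert(nr_cpus >= 1);
    return st.flush_calls * (nr_cpus - 1);
}
```

Under these assumptions, 22,000 sites on a 16-way system cost 330,000
IPIs with a flush per site and 15 IPIs with one deferred flush.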

-- Steve


* Re: Ftrace for MIPS may hang on SMP system
  2010-08-30 20:51             ` Steven Rostedt
@ 2010-08-31  3:33               ` wu zhangjin
  2010-09-01  0:06                 ` Steven Rostedt
  0 siblings, 1 reply; 13+ messages in thread
From: wu zhangjin @ 2010-08-31  3:33 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: David Daney, Ralf Baechle, linux-mips

Hi,

On 8/31/10, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Tue, 2010-08-24 at 14:25 +0800, wu zhangjin wrote:
>
>> So, this patch is necessary to fix the deadlock and icache problem on
>> RMI XLS, and it will also improve performance by reducing the
>> unnecessary IPIs on RMI XLS and Cavium.
>
> I was about to mention the performance boost of the patch even on
> machines not affected by the lockup.
>
> When you enable function tracing, it can update 22,000 locations. Doing
> a cache invalidate 22,000 times in a row is not very efficient. Only a
> full cache flush is needed at the end of the update (except for the
> module updates, which are done on a live system).

Yeah, I will enable it for UP too by removing the #ifdef CONFIG_SMP ... #endif.

BTW, I have found another potential problem:

We call ftrace_modify_code() with IRQs disabled (the same as under
stop_machine()):

kernel/trace/ftrace.c:

       /* disable interrupts to prevent kstop machine */
        local_irq_save(flags);
        ftrace_update_code(mod);
        local_irq_restore(flags);

This may trigger the warning in smp_call_function_many() if we call
flush_icache_range() in ftrace_modify_code() on SMP:

kernel/smp.c:

void smp_call_function_many(const struct cpumask *mask,
                            void (*func)(void *), void *info, bool wait)
{
        struct call_function_data *data;
        unsigned long flags;
        int cpu, next_cpu, this_cpu = smp_processor_id();

        /*
         * Can deadlock when called with interrupts disabled.
         * We allow cpu's that are not yet online though, as no one else can
         * send smp call function interrupt to this cpu and as such deadlocks
         * can't happen.
         */
        WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
                     && !oops_in_progress);

Actually, because the IRQs of the other CPUs are not disabled here,
there will be no real deadlock. But the warning shows there is a
potential problem: if the IRQs of the other CPUs are disabled by
something else, we may really get a deadlock.

So, if we want to fix this problem completely, my patch is not enough.
We may need to move the flush_icache_range() out of
ftrace_modify_code() and call it after the IRQs are enabled, for
example:

       /* disable interrupts to prevent kstop machine */
        local_irq_save(flags);
        ftrace_update_code(mod);
        local_irq_restore(flags);
+        __flush_icache_all();

Similarly, we add this __flush_icache_all() after stop_machine() too:

static void ftrace_run_update_code(int command)
{
        int ret;

        ret = ftrace_arch_code_modify_prepare();
        FTRACE_WARN_ON(ret);
        if (ret)
                return;

        stop_machine(__ftrace_modify_code, &command, NULL);
+      __flush_icache_all();

        ret = ftrace_arch_code_modify_post_process();
        FTRACE_WARN_ON(ret);
}

I'm not sure whether this is needed for all of the architectures, but
this may be needed by MIPS and powerpc.

If X86 doesn't need it, we can add a macro
NEED_FTRACE_FLUSH_ICACHE_ALL for MIPS and powerpc and introduce a
wrapper ftrace_flush_icache_all() for __flush_icache_all().

static inline void ftrace_flush_icache_all(void)
{
#ifdef NEED_FTRACE_FLUSH_ICACHE_ALL
       __flush_icache_all();
#endif
}

Can we apply this method on x86 too? I'm not sure about the performance
of the current sync_core() ;-) If it is not good (especially when we use
it 22,000 times, as you mentioned above), we may be able to apply this
method on x86 to improve the performance too.

One side effect: after moving flush_icache_range() out of
ftrace_modify_code(), we need to ensure that every caller of
ftrace_modify_code() flushes the icache itself. If we don't know which
range needs flushing, we may need to call __flush_icache_all();
otherwise we may be able to call flush_icache_range() to flush just the
range touched by ftrace_modify_code().
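
One way a caller could keep track of what it still has to flush is
sketched below as a user-space model in C. It is only an illustration:
struct flush_tracker, enum flush_action and the tracker_* helpers are
hypothetical names, not existing kernel interfaces. The idea is that
each modification records its range, and the caller decides once, after
the IRQs are enabled again, whether nothing, one covering range, or the
whole icache needs flushing.

```c
#include <assert.h>
#include <stdbool.h>

/* What the caller should do once all modifications are done. */
enum flush_action { FLUSH_NONE, FLUSH_RANGE, FLUSH_ALL };

/* Hypothetical bookkeeping for deferred icache flushing. */
struct flush_tracker {
    unsigned long start; /* lowest patched address */
    unsigned long end;   /* one past the highest patched byte */
    bool dirty;          /* at least one range was recorded */
    bool unknown;        /* some modification had no known range */
};

static void tracker_init(struct flush_tracker *t)
{
    t->start = 0;
    t->end = 0;
    t->dirty = false;
    t->unknown = false;
}

/* Record that len bytes at ip were modified; widen the covering range. */
static void tracker_note(struct flush_tracker *t, unsigned long ip,
                         unsigned long len)
{
    assert(len > 0);
    if (!t->dirty || ip < t->start)
        t->start = ip;
    if (!t->dirty || ip + len > t->end)
        t->end = ip + len;
    t->dirty = true;
}

/* Record a modification whose range the caller does not know. */
static void tracker_note_unknown(struct flush_tracker *t)
{
    t->unknown = true;
}

/*
 * Decide what to flush. On FLUSH_RANGE, *start and *end describe the
 * covering range; otherwise they are left untouched.
 */
static enum flush_action tracker_finish(const struct flush_tracker *t,
                                        unsigned long *start,
                                        unsigned long *end)
{
    if (t->unknown)
        return FLUSH_ALL;
    if (!t->dirty)
        return FLUSH_NONE;
    *start = t->start;
    *end = t->end;
    return FLUSH_RANGE;
}
```

In kernel terms, FLUSH_RANGE would map to flush_icache_range(start,
end) and FLUSH_ALL to __flush_icache_all(). A single covering range can
be much larger than the bytes actually patched when the sites are far
apart, so a full flush may be the simpler choice for large updates.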

Thanks & Regards,
Wu Zhangjin


* Re: Ftrace for MIPS may hang on SMP system
  2010-08-31  3:33               ` wu zhangjin
@ 2010-09-01  0:06                 ` Steven Rostedt
  2010-09-02  5:54                   ` wu zhangjin
  0 siblings, 1 reply; 13+ messages in thread
From: Steven Rostedt @ 2010-09-01  0:06 UTC (permalink / raw)
  To: wu zhangjin; +Cc: David Daney, Ralf Baechle, linux-mips

On Tue, 2010-08-31 at 11:33 +0800, wu zhangjin wrote:

> We call ftrace_modify_code() with IRQs disabled (the same as under
> stop_machine()):
> 
> kernel/trace/ftrace.c:
> 
>        /* disable interrupts to prevent kstop machine */
>         local_irq_save(flags);
>         ftrace_update_code(mod);
>         local_irq_restore(flags);
> 
> This may trigger the warning in smp_call_function_many() if we call
> flush_icache_range() in ftrace_modify_code() on SMP:

Hmm, I think preempt_disable() will do the same thing. Would that work?

> 
> kernel/smp.c:
> 
> void smp_call_function_many(const struct cpumask *mask,
>                             void (*func)(void *), void *info, bool wait)
> {
>         struct call_function_data *data;
>         unsigned long flags;
>         int cpu, next_cpu, this_cpu = smp_processor_id();
> 
>         /*
>          * Can deadlock when called with interrupts disabled.
>          * We allow cpu's that are not yet online though, as no one else can
>          * send smp call function interrupt to this cpu and as such deadlocks
>          * can't happen.
>          */
>         WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
>                      && !oops_in_progress);
> 
> Actually, because the other cpus' irqs are not disabled here, there
> will be no real deadlock, but this warning shows there is a potential
> problem: if the irqs of the other cpus are disabled by something
> else, we may really get the deadlock.
> 
> So, if we want to fix this problem for good, my patch is not enough; we
> may need to move the flush_icache_range() out of
> ftrace_modify_code() and call it after irqs are enabled, for example:
> 
>        /* disable interrupts to prevent kstop machine */
>         local_irq_save(flags);
>         ftrace_update_code(mod);
>         local_irq_restore(flags);
> +        __flush_icache_all();
> 
> and similarly, we add this __flush_icache_all() after the stop_machine() too:
> 
> static void ftrace_run_update_code(int command)
> {
>         int ret;
> 
>         ret = ftrace_arch_code_modify_prepare();
>         FTRACE_WARN_ON(ret);
>         if (ret)
>                 return;
> 
>         stop_machine(__ftrace_modify_code, &command, NULL);
> +      __flush_icache_all();
> 
>         ret = ftrace_arch_code_modify_post_process();
>         FTRACE_WARN_ON(ret);
> }
> 
> I'm not sure whether this is needed for all of the architectures, but
> this may be needed by MIPS and powerpc.
> 
> If X86 doesn't need it, we can add a macro
> NEED_FTRACE_FLUSH_ICACHE_ALL for MIPS and powerpc and introduce a
> wrapper ftrace_flush_icache_all() for __flush_icache_all().

Perhaps just change the above to preempt_disable() and add the
__flush_icache_all().

> 
> static inline void ftrace_flush_icache_all(void)
> {
> #ifdef NEED_FTRACE_FLUSH_ICACHE_ALL
>        __flush_icache_all();
> #endif
> }

Hmm, maybe we could just add another weak function called
ftrace_arch_module_post_process() and have the archs do the flush
themselves. I'm not sure that function is common even for the archs that
would need it.
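
A rough sketch of the weak-function idea, written as a userspace mock
rather than kernel code. The int return type, the MOCK_WEAK macro, and
mock_ftrace_update_module() are assumptions made for illustration; only
the hook name ftrace_arch_module_post_process() comes from this thread.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's weak-symbol annotation (GCC/Clang). */
#define MOCK_WEAK __attribute__((weak))

/*
 * Generic default: a weak no-op. An arch that needs an icache flush
 * would supply a strong definition of the same symbol in its own file,
 * and the linker would pick that one instead.
 */
MOCK_WEAK int ftrace_arch_module_post_process(void)
{
	return 0;
}

static int modules_patched;

/* Sketch of the generic caller: patch first, then let the arch clean up. */
static int mock_ftrace_update_module(void)
{
	modules_patched++;	/* stands in for ftrace_update_code(mod) */
	return ftrace_arch_module_post_process();
}
```

With only the weak default linked in, the hook does nothing and returns
0, so archs that do not need the flush would not need any changes.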


> 
> Can we apply this method on X86 too? I'm not sure about the performance
> of the current sync_core() ;-) If it is not good (especially when it is
> called 22,000 times, as you mentioned above), we may be able to apply
> this method on X86 to improve the performance too.

Well, we also need it for NMIs that do run, thus it should be fine. The
22,000 updates do not seem to be an issue.

> 
> A side effect: after moving flush_icache_range() out of
> ftrace_modify_code(), every caller of ftrace_modify_code() must flush
> the icache itself. When the range to flush is not known, the caller may
> need to call __flush_icache_full(); when it is known, the caller can
> call flush_icache_range() on just the range touched by
> ftrace_modify_code().

Perhaps just be safe and call the flush_icache_full() after the
modifications. Thus I think it would be best to have the ftrace_arch_*
functions.

-- Steve


* Re: Ftrace for MIPS may hang on SMP system
  2010-09-01  0:06                 ` Steven Rostedt
@ 2010-09-02  5:54                   ` wu zhangjin
  0 siblings, 0 replies; 13+ messages in thread
From: wu zhangjin @ 2010-09-02  5:54 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: David Daney, Ralf Baechle, linux-mips

On 9/1/10, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Tue, 2010-08-31 at 11:33 +0800, wu zhangjin wrote:
>
>> We have called ftrace_modify_code() with irqs disabled (the same as
>> under stop_machine()):
>>
>> kernel/trace/ftrace.c:
>>
>>        /* disable interrupts to prevent kstop machine */
>>         local_irq_save(flags);
>>         ftrace_update_code(mod);
>>         local_irq_restore(flags);
>>
>> This may trigger the warning in smp_call_function_many() if we call
>> flush_icache_range() in ftrace_modify_code() on SMP:
>
> Hmm, I think preempt_disable() will do the same thing. Would that work?
>
>>
>> kernel/smp.c:
>>
>> void smp_call_function_many(const struct cpumask *mask,
>>                             void (*func)(void *), void *info, bool wait)
>> {
>>         struct call_function_data *data;
>>         unsigned long flags;
>>         int cpu, next_cpu, this_cpu = smp_processor_id();
>>
>>         /*
>>          * Can deadlock when called with interrupts disabled.
>>          * We allow cpu's that are not yet online though, as no one else can
>>          * send smp call function interrupt to this cpu and as such deadlocks
>>          * can't happen.
>>          */
>>         WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
>>                      && !oops_in_progress);
>>
>> Actually, because the other cpus' irqs are not disabled here, there
>> will be no real deadlock, but this warning shows there is a potential
>> problem: if the irqs of the other cpus are disabled by something
>> else, we may really get the deadlock.
>>
>> So, if we want to fix this problem for good, my patch is not enough; we
>> may need to move the flush_icache_range() out of
>> ftrace_modify_code() and call it after irqs are enabled, for example:
>>
>>        /* disable interrupts to prevent kstop machine */
>>         local_irq_save(flags);
>>         ftrace_update_code(mod);
>>         local_irq_restore(flags);
>> +        __flush_icache_all();
>>
>> and similarly, we add this __flush_icache_all() after the stop_machine() too:
>>
>> static void ftrace_run_update_code(int command)
>> {
>>         int ret;
>>
>>         ret = ftrace_arch_code_modify_prepare();
>>         FTRACE_WARN_ON(ret);
>>         if (ret)
>>                 return;
>>
>>         stop_machine(__ftrace_modify_code, &command, NULL);
>> +      __flush_icache_all();
>>
>>         ret = ftrace_arch_code_modify_post_process();
>>         FTRACE_WARN_ON(ret);
>> }
>>
>> I'm not sure whether this is needed for all of the architectures, but
>> this may be needed by MIPS and powerpc.
>>
>> If X86 doesn't need it, we can add a macro
>> NEED_FTRACE_FLUSH_ICACHE_ALL for MIPS and powerpc and introduce a
>> wrapper ftrace_flush_icache_all() for __flush_icache_all().
>
> Perhaps just change the above to preempt_disable() and add the
> __flush_icache_all().
>
>>
>> static inline void ftrace_flush_icache_all(void)
>> {
>> #ifdef NEED_FTRACE_FLUSH_ICACHE_ALL
>>        __flush_icache_all();
>> #endif
>> }
>
> Hmm, maybe we could just add another weak function called
> ftrace_arch_module_post_process() and have the archs do the flush
> themselves. I'm not sure that function is common even for the archs that
> would need it.
>
>
>>
>> Can we apply this method on X86 too? I'm not sure about the performance
>> of the current sync_core() ;-) If it is not good (especially when it is
>> called 22,000 times, as you mentioned above), we may be able to apply
>> this method on X86 to improve the performance too.
>
> Well, we also need it for NMIs that do run, thus it should be fine. The
> 22,000 updates do not seem to be an issue.
>
>>
>> A side effect: after moving flush_icache_range() out of
>> ftrace_modify_code(), every caller of ftrace_modify_code() must flush
>> the icache itself. When the range to flush is not known, the caller may
>> need to call __flush_icache_full(); when it is known, the caller can
>> call flush_icache_range() on just the range touched by
>> ftrace_modify_code().
>
> Perhaps just be safe and call the flush_icache_full() after the
> modifications. Thus I think it would be best to have the ftrace_arch_*
> functions.

Ok, I will add the ftrace_arch_module_prepare/post_process() to
kernel/trace/ftrace.c.

Thanks Steve!

Regards,
Wu Zhangjin


Thread overview: 13+ messages
2010-08-22 12:18 Ftrace for MIPS may hang on SMP system wu zhangjin
2010-08-22 12:20 ` wu zhangjin
2010-08-22 14:27   ` wu zhangjin
2010-08-23 12:50     ` wu zhangjin
2010-08-23 14:16       ` wu zhangjin
2010-08-23 17:35         ` David Daney
2010-08-24  6:25           ` wu zhangjin
2010-08-30 20:51             ` Steven Rostedt
2010-08-31  3:33               ` wu zhangjin
2010-09-01  0:06                 ` Steven Rostedt
2010-09-02  5:54                   ` wu zhangjin
2010-08-30 20:48         ` Steven Rostedt
2010-08-30 20:47     ` Steven Rostedt
