From: Guenter Roeck <linux@roeck-us.net>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>
Subject: Re: qemu:arm test failure due to commit 8053871d0f7f (smp: Fix smp_call_function_single_async() locking)
Date: Sat, 18 Apr 2015 18:56:02 -0700
Message-ID: <55330B32.4010907@roeck-us.net>
In-Reply-To: <CA+55aFz1934X3wu7FdGerwYMJ_BAkrFsajOQFtU3_ogsUX3_eQ@mail.gmail.com>

On 04/18/2015 05:04 PM, Linus Torvalds wrote:
> On Sat, Apr 18, 2015 at 7:40 PM, Guenter Roeck <linux@roeck-us.net> wrote:
>> On Sat, Apr 18, 2015 at 04:23:25PM -0700, Guenter Roeck wrote:
>>>
>>> my qemu test for arm:vexpress fails with the latest upstream kernel. It fails
>>> hard - I don't get any output from the console. Bisect points to commit
>>> 8053871d0f7f ("smp: Fix smp_call_function_single_async() locking").
>>> Reverting this commit fixes the problem.
>
> Hmm. It being qemu, can you look at where it seems to lock?
>

  static void csd_lock_wait(struct call_single_data *csd)
  {
+#if 0
         while (smp_load_acquire(&csd->flags) & CSD_FLAG_LOCK)
                 cpu_relax();
+#else
+       pr_info("csd_lock_wait: flags=0x%x\n", smp_load_acquire(&csd->flags));
+#endif
  }

prints

csd_lock_wait: flags=0x3

repeatedly for each call to csd_lock_wait() [and bypasses the problem].
Further debugging shows that wait==1, and that csd points to the
pre-initialized csd_stack (which has CSD_FLAG_LOCK set).

It seems that CSD_FLAG_LOCK is never reset (there is no call to csd_unlock(), ever).
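
To make the observation concrete, here is a minimal user-space sketch (C11 atomics,
not kernel code) of a csd whose lock bit starts out set. The csd_model_* names and the
bounded spin count are made up for illustration. The flag values are assumptions chosen
to match the flags=0x3 printed above; only CSD_FLAG_LOCK is named in this mail, so the
name of the second bit is a guess.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed values, picked so that LOCK | second bit == 0x3 as in the log. */
enum {
	CSD_FLAG_LOCK        = 0x01,
	CSD_FLAG_SYNCHRONOUS = 0x02,	/* hypothetical name for the second bit */
};

/* Hypothetical stand-in for struct call_single_data: only the flags word. */
struct csd_model {
	atomic_uint flags;
};

/* Mimic the pre-initialized csd_stack: the lock bit is already set. */
static void csd_model_init_locked(struct csd_model *csd)
{
	atomic_init(&csd->flags, CSD_FLAG_LOCK | CSD_FLAG_SYNCHRONOUS);
}

/* Clear the lock bit. In the failing case nothing ever calls this. */
static void csd_model_unlock(struct csd_model *csd)
{
	atomic_fetch_and_explicit(&csd->flags, ~(unsigned)CSD_FLAG_LOCK,
				  memory_order_release);
}

/*
 * Bounded variant of the wait loop. Returns true if the lock bit was
 * seen clear within max_spins polls, false otherwise. The real loop
 * has no bound and would spin forever instead of returning false.
 */
static bool csd_model_lock_wait(struct csd_model *csd, unsigned long max_spins)
{
	while (max_spins--) {
		unsigned f = atomic_load_explicit(&csd->flags,
						  memory_order_acquire);
		if (!(f & CSD_FLAG_LOCK))
			return true;
	}
	return false;
}
```

After csd_model_init_locked(&csd), a call to csd_model_lock_wait(&csd, 1000) gives up
and returns false. It only succeeds once something has called csd_model_unlock(&csd),
which is the step that never happens here.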

Further debugging (with added WARN_ON if cpu != 0 in smp_call_function_single) shows:

[<800157ec>] (unwind_backtrace) from [<8001250c>] (show_stack+0x10/0x14)
[<8001250c>] (show_stack) from [<80494cb4>] (dump_stack+0x88/0x98)
[<80494cb4>] (dump_stack) from [<80024058>] (warn_slowpath_common+0x84/0xb4)
[<80024058>] (warn_slowpath_common) from [<80024124>] (warn_slowpath_null+0x1c/0x24)
[<80024124>] (warn_slowpath_null) from [<80078fc8>] (smp_call_function_single+0x170/0x178)
[<80078fc8>] (smp_call_function_single) from [<80090024>] (perf_event_exit_cpu+0x80/0xf0)
[<80090024>] (perf_event_exit_cpu) from [<80090110>] (perf_cpu_notify+0x30/0x48)
[<80090110>] (perf_cpu_notify) from [<8003d340>] (notifier_call_chain+0x44/0x84)
[<8003d340>] (notifier_call_chain) from [<8002451c>] (_cpu_up+0x120/0x168)
[<8002451c>] (_cpu_up) from [<800245d4>] (cpu_up+0x70/0x94)
[<800245d4>] (cpu_up) from [<80624234>] (smp_init+0xac/0xb0)
[<80624234>] (smp_init) from [<80618d84>] (kernel_init_freeable+0x118/0x268)
[<80618d84>] (kernel_init_freeable) from [<8049107c>] (kernel_init+0x8/0xe8)
[<8049107c>] (kernel_init) from [<8000f320>] (ret_from_fork+0x14/0x34)
---[ end trace 2f9f1bb8a47b3a1b ]---
smp_call_function_single, cpu=1, wait=1, csd_stack=87825ea0
generic_exec_single, cpu=1, smp_processor_id()=0
csd_lock_wait: csd=87825ea0, flags=0x3

This is repeated for each secondary CPU. But the secondary CPUs don't respond because
they are not enabled, which I guess explains why the lock is never released.

So, in other words, this happens because the system believes (presumably per configuration
/ fdt data) that there are four CPU cores, but that is not really the case. Previously that
did not matter, and was handled correctly. Now it is fatal.
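
The situation can be sketched in user space as follows. This is only a toy model under
stated assumptions, not how the kernel does it: the cpu_model struct, the model_*
functions and the spin bound are all hypothetical. It shows a request queued to a CPU
that is configured but never came up, so nobody ever completes it.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4		/* what the configuration claims */

/* Hypothetical per-CPU state for the model. */
struct cpu_model {
	bool online;		/* did this CPU actually come up? */
	atomic_bool pending;	/* request queued, not yet completed */
};

/* Bit i of online_mask says whether CPU i is really running. */
static void model_init(struct cpu_model *cpus, unsigned online_mask)
{
	for (int i = 0; i < MODEL_NR_CPUS; i++) {
		cpus[i].online = (online_mask >> i) & 1u;
		atomic_init(&cpus[i].pending, false);
	}
}

/* Stand-in for the target CPU handling the request; offline CPUs do nothing. */
static void model_poll(struct cpu_model *cpus, int cpu)
{
	if (cpus[cpu].online)
		atomic_store(&cpus[cpu].pending, false);
}

/*
 * Queue a request to cpu without checking whether it is online, then
 * wait for completion. The wait is bounded here so the model returns
 * false where an unbounded wait would hang.
 */
static bool model_call_and_wait(struct cpu_model *cpus, int cpu, unsigned spins)
{
	atomic_store(&cpus[cpu].pending, true);
	while (spins--) {
		model_poll(cpus, cpu);
		if (!atomic_load(&cpus[cpu].pending))
			return true;
	}
	return false;
}
```

With model_init(cpus, 0x1), i.e. four configured CPUs of which only CPU 0 is running,
model_call_and_wait() succeeds for CPU 0 and times out for CPUs 1 to 3.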

Does this help?

Guenter

