From: Greg KH <greg@kroah.com>
To: Huacai Chen <chenhc@lemote.com>
Cc: Ralf Baechle <ralf@linux-mips.org>,
	Paul Burton <paul.burton@mips.com>,
	James Hogan <jhogan@kernel.org>,
	Linux MIPS Mailing List <linux-mips@linux-mips.org>,
	Fuxin Zhang <zhangfx@lemote.com>,
	Zhangjin Wu <wuzhangjin@gmail.com>,
	stable <stable@vger.kernel.org>
Subject: Re: [PATCH Resend 4.4] MIPS: Use async IPIs for arch_trigger_cpumask_backtrace()
Date: Wed, 18 Jul 2018 11:24:21 +0200
Message-ID: <20180718092421.GC17551@kroah.com>
In-Reply-To: <CAAhV-H75ncA+Q3idrA7byN8U5HV7yxRk-LXciBC8RjRa4bwpBA@mail.gmail.com>
On Wed, Jul 18, 2018 at 08:52:46AM +0800, Huacai Chen wrote:
> On Tue, Jul 17, 2018 at 9:57 PM, Greg KH <greg@kroah.com> wrote:
> > On Tue, Jul 17, 2018 at 04:17:11PM +0800, Huacai Chen wrote:
> >> From: Paul Burton <paul.burton@mips.com>
> >>
> >> The current MIPS implementation of arch_trigger_cpumask_backtrace() is
> >> broken because it attempts to use synchronous IPIs despite the fact that
> >> it may be run with interrupts disabled.
> >>
> >> This means that when arch_trigger_cpumask_backtrace() is invoked, for
> >> example by the RCU CPU stall watchdog, we may:
> >>
> >> - Deadlock due to use of synchronous IPIs with interrupts disabled,
> >> causing the CPU that's attempting to generate the backtrace output
> >> to hang itself.
> >>
> >> - Not succeed in generating the desired output from remote CPUs.
> >>
> >> - Produce warnings about this from smp_call_function_many(), for
> >> example:
> >>
> >> [42760.526910] INFO: rcu_sched detected stalls on CPUs/tasks:
> >> [42760.535755] 0-...!: (1 GPs behind) idle=ade/140000000000000/0 softirq=526944/526945 fqs=0
> >> [42760.547874] 1-...!: (0 ticks this GP) idle=e4a/140000000000000/0 softirq=547885/547885 fqs=0
> >> [42760.559869] (detected by 2, t=2162 jiffies, g=266689, c=266688, q=33)
> >> [42760.568927] ------------[ cut here ]------------
> >> [42760.576146] WARNING: CPU: 2 PID: 1216 at kernel/smp.c:416 smp_call_function_many+0x88/0x20c
> >> [42760.587839] Modules linked in:
> >> [42760.593152] CPU: 2 PID: 1216 Comm: sh Not tainted 4.15.4-00373-gee058bb4d0c2 #2
> >> [42760.603767] Stack : 8e09bd20 8e09bd20 8e09bd20 fffffff0 00000007 00000006 00000000 8e09bca8
> >> [42760.616937] 95b2b379 95b2b379 807a0080 00000007 81944518 0000018a 00000032 00000000
> >> [42760.630095] 00000000 00000030 80000000 00000000 806eca74 00000009 8017e2b8 000001a0
> >> [42760.643169] 00000000 00000002 00000000 8e09baa4 00000008 808b8008 86d69080 8e09bca0
> >> [42760.656282] 8e09ad50 805e20aa 00000000 00000000 00000000 8017e2b8 00000009 801070ca
> >> [42760.669424] ...
> >> [42760.673919] Call Trace:
> >> [42760.678672] [<27fde568>] show_stack+0x70/0xf0
> >> [42760.685417] [<84751641>] dump_stack+0xaa/0xd0
> >> [42760.692188] [<699d671c>] __warn+0x80/0x92
> >> [42760.698549] [<68915d41>] warn_slowpath_null+0x28/0x36
> >> [42760.705912] [<f7c76c1c>] smp_call_function_many+0x88/0x20c
> >> [42760.713696] [<6bbdfc2a>] arch_trigger_cpumask_backtrace+0x30/0x4a
> >> [42760.722216] [<f845bd33>] rcu_dump_cpu_stacks+0x6a/0x98
> >> [42760.729580] [<796e7629>] rcu_check_callbacks+0x672/0x6ac
> >> [42760.737476] [<059b3b43>] update_process_times+0x18/0x34
> >> [42760.744981] [<6eb94941>] tick_sched_handle.isra.5+0x26/0x38
> >> [42760.752793] [<478d3d70>] tick_sched_timer+0x1c/0x50
> >> [42760.759882] [<e56ea39f>] __hrtimer_run_queues+0xc6/0x226
> >> [42760.767418] [<e88bbcae>] hrtimer_interrupt+0x88/0x19a
> >> [42760.775031] [<6765a19e>] gic_compare_interrupt+0x2e/0x3a
> >> [42760.782761] [<0558bf5f>] handle_percpu_devid_irq+0x78/0x168
> >> [42760.790795] [<90c11ba2>] generic_handle_irq+0x1e/0x2c
> >> [42760.798117] [<1b6d462c>] gic_handle_local_int+0x38/0x86
> >> [42760.805545] [<b2ada1c7>] gic_irq_dispatch+0xa/0x14
> >> [42760.812534] [<90c11ba2>] generic_handle_irq+0x1e/0x2c
> >> [42760.820086] [<c7521934>] do_IRQ+0x16/0x20
> >> [42760.826274] [<9aef3ce6>] plat_irq_dispatch+0x62/0x94
> >> [42760.833458] [<6a94b53c>] except_vec_vi_end+0x70/0x78
> >> [42760.840655] [<22284043>] smp_call_function_many+0x1ba/0x20c
> >> [42760.848501] [<54022b58>] smp_call_function+0x1e/0x2c
> >> [42760.855693] [<ab9fc705>] flush_tlb_mm+0x2a/0x98
> >> [42760.862730] [<0844cdd0>] tlb_flush_mmu+0x1c/0x44
> >> [42760.869628] [<cb259b74>] arch_tlb_finish_mmu+0x26/0x3e
> >> [42760.877021] [<1aeaaf74>] tlb_finish_mmu+0x18/0x66
> >> [42760.883907] [<b3fce717>] exit_mmap+0x76/0xea
> >> [42760.890428] [<c4c8a2f6>] mmput+0x80/0x11a
> >> [42760.896632] [<a41a08f4>] do_exit+0x1f4/0x80c
> >> [42760.903158] [<ee01cef6>] do_group_exit+0x20/0x7e
> >> [42760.909990] [<13fa8d54>] __wake_up_parent+0x0/0x1e
> >> [42760.917045] [<46cf89d0>] smp_call_function_many+0x1a2/0x20c
> >> [42760.924893] [<8c21a93b>] syscall_common+0x14/0x1c
> >> [42760.931765] ---[ end trace 02aa09da9dc52a60 ]---
> >> [42760.938342] ------------[ cut here ]------------
> >> [42760.945311] WARNING: CPU: 2 PID: 1216 at kernel/smp.c:291 smp_call_function_single+0xee/0xf8
> >> ...
> >>
> >> This patch switches MIPS' arch_trigger_cpumask_backtrace() to use async
> >> IPIs & smp_call_function_single_async() in order to resolve this
> >> problem. We ensure use of the pre-allocated call_single_data_t
> >> structures is serialized by maintaining a cpumask indicating that
> >> they're busy, and refusing to attempt to send an IPI when a CPU's bit is
> >> set in this mask. This should only happen if a CPU hasn't responded to a
> >> previous backtrace IPI - ie. if it's hung - and we print a warning to
> >> the console in this case.
> >>
> >> I've marked this for stable branches as far back as v4.9, to which it
> >> applies cleanly. Strictly speaking the faulty MIPS implementation can be
> >> traced further back to commit 856839b76836 ("MIPS: Add
> >> arch_trigger_all_cpu_backtrace() function") in v3.19, but kernel
> >> versions v3.19 through v4.8 will require further work to backport due to
> >> the rework performed in commit 9a01c3ed5cdb ("nmi_backtrace: add more
> >> trigger_*_cpu_backtrace() methods").
> >>
> >> Signed-off-by: Paul Burton <paul.burton@mips.com>
> >> Patchwork: https://patchwork.linux-mips.org/patch/19597/
> >> Cc: James Hogan <jhogan@kernel.org>
> >> Cc: Ralf Baechle <ralf@linux-mips.org>
> >> Cc: linux-mips@linux-mips.org
> >> Cc: stable@vger.kernel.org
> >> Fixes: 856839b76836 ("MIPS: Add arch_trigger_all_cpu_backtrace() function")
> >> Fixes: 9a01c3ed5cdb ("nmi_backtrace: add more trigger_*_cpu_backtrace() methods")
> >> [ Huacai: backported to 4.4: restructured since the generic NMI solution is unavailable ]
> >> Signed-off-by: Huacai Chen <chenhc@lemote.com>
> >> ---
> >> arch/mips/kernel/process.c | 29 ++++++++++++++++++++++++++++-
> >> 1 file changed, 28 insertions(+), 1 deletion(-)
> >
> > Always give me a hint as to what the original git commit id is in
> > Linus's tree please.
> >
> > Also, this patch does not apply at all to the 4.4.y tree :(
> commit b63e132b6433a41cf311e8bc382d33fd2b73b505 upstream.
>
> This patch does not apply to 4.4 on its own because it is part of a
> series. The preceding patch is
> 5a267832c2ec47b2dad0fdb291a96bb5b8869315 ("MIPS: Call dump_stack()
> from show_regs()"), which is already Cc'd to stable and needs no
> changes.

But 5a267832c2ec47b2dad0fdb291a96bb5b8869315 was only for 4.9 and newer,
not 4.4.

I'm totally confused here.

> Now, should I resend this patch and its previous patch again?

Yes, as I have no idea what is going on or what I should do here. Please
send a patch series for 4.4.y that has all of the needed patches
properly backported.

thanks,
greg k-h
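
For readers following the thread, the serialization scheme described in the
quoted commit message (a mask of busy bits guarding per-CPU request slots,
with a warning when a CPU has not answered a previous request) can be
sketched in userspace C. This is only an illustration under stated
assumptions, not the kernel patch: the names request_backtrace(),
handle_backtrace(), backtrace_busy and NR_CPUS are hypothetical, C11 atomics
stand in for the kernel's cpumask helpers, and no real IPI is sent.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 4 /* assumption: small fixed CPU count for the sketch */

/* Bit n set: the request slot for CPU n is still in flight. */
static atomic_uint backtrace_busy;

/*
 * Hypothetical stand-in for queueing an asynchronous request to a CPU.
 * Returns immediately and never waits for the target, which is why this
 * pattern is safe to use with interrupts disabled.
 */
static bool request_backtrace(unsigned int cpu)
{
	unsigned int bit = 1u << cpu;

	assert(cpu < NR_CPUS);

	/* Atomic test-and-set: claim the slot only if it is currently free. */
	if (atomic_fetch_or(&backtrace_busy, bit) & bit) {
		fprintf(stderr,
			"Unable to send backtrace request to CPU%u - perhaps it hung?\n",
			cpu);
		return false;
	}

	/* A real implementation would fire the async IPI here. */
	return true;
}

/* Runs on the target CPU once it gets around to handling the request. */
static void handle_backtrace(unsigned int cpu)
{
	assert(cpu < NR_CPUS);

	printf("CPU%u: dumping stack\n", cpu);

	/* Release the slot so that a later request can reuse it. */
	atomic_fetch_and(&backtrace_busy, ~(1u << cpu));
}
```

A second request_backtrace() call for the same CPU is refused until
handle_backtrace() has run for it, so a slot is never reused while a request
is still pending.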