From: Weidong Wang <wangweidong1@huawei.com>
To: <tglx@linutronix.de>, <mingo@redhat.com>, <hpa@zytor.com>,
	<x86@kernel.org>, <linux-kernel@vger.kernel.org>,
	<torvalds@linux-foundation.org>
Cc: Fengtiantian <fengtiantian@huawei.com>, <liuyongan@huawei.com>,
	<wangweidong1@huawei.com>
Subject: [Ask for help] met a deadlock with switch_fpu_finish on suse 3.0.93-0.8-default kernel
Date: Tue, 15 Mar 2016 21:24:49 +0800
Message-ID: <56E80D21.7010607@huawei.com>

Hi all,

We have hit a deadlock on the SUSE 3.0.93-0.8-default kernel when restore_fpu_checking() returns an error during a task switch.
--------------------------------------------
The call trace is:
PID: 2415   TASK: ffff880b739d24c0  CPU: 5   COMMAND: "qemu-kvm"
 #0 [ffff880c7f6a6e40] crash_nmi_callback at ffffffff8102460f
 #1 [ffff880c7f6a6e50] notifier_call_chain at ffffffff81465027
 #2 [ffff880c7f6a6e80] __atomic_notifier_call_chain at ffffffff8146506d
 #3 [ffff880c7f6a6e90] notify_die at ffffffff814650bd
 #4 [ffff880c7f6a6ec0] default_do_nmi at ffffffff81462507
 #5 [ffff880c7f6a6ee0] do_nmi at ffffffff81462738
 #6 [ffff880c7f6a6ef0] restart_nmi at ffffffff81461c91
    [exception RIP: _raw_spin_lock+21]
    RIP: ffffffff814611e5  RSP: ffff8809d8d1ba80  RFLAGS: 00000093
    RAX: 0000000000000010  RBX: 0000000000000010  RCX: 0000000000000093
    RDX: ffff8809d8d1ba80  RSI: 0000000000000018  RDI: 0000000000000001
    RBP: ffffffff814611e5   R8: ffffffff814611e5   R9: 0000000000000018
    R10: ffff8809d8d1ba80  R11: 0000000000000093  R12: ffffffffffffffff
    R13: ffff880c7f6b0a00  R14: 0000000000000005  R15: 000000000000e2b8
    ORIG_RAX: 000000000000e2b8  CS: 0010  SS: 0018
--- <DOUBLEFAULT exception stack> ---
 #7 [ffff8809d8d1ba80] _raw_spin_lock at ffffffff814611e5
 #8 [ffff8809d8d1ba80] try_to_wake_up at ffffffff81054afb
 #9 [ffff8809d8d1bad0] pollwake at ffffffff8116cfc6
#10 [ffff8809d8d1bb10] __wake_up_common at ffffffff81046e1a
#11 [ffff8809d8d1bb50] __wake_up at ffffffff8104bf43
#12 [ffff8809d8d1bb90] __send_signal at ffffffff81074bfd
#13 [ffff8809d8d1bbd0] force_sig_info at ffffffff81076194
#14 [ffff8809d8d1bc00] __switch_to at ffffffff81001930
#15 [ffff8809d8d1bcf0] reschedule_interrupt at ffffffff8146a06e
#16 [ffff8809d8d1bd58] vmx_handle_external_intr at ffffffffa03c3f4c [kvm_intel]
#17 [ffff8809d8d1bd80] vcpu_enter_guest at ffffffffa0363487 [kvm]
#18 [ffff8809d8d1be00] __vcpu_run at ffffffffa0363743 [kvm]
#19 [ffff8809d8d1be40] kvm_arch_vcpu_ioctl_run at ffffffffa0364438 [kvm]
#20 [ffff8809d8d1be70] kvm_vcpu_ioctl at ffffffffa0350cee [kvm]
#21 [ffff8809d8d1bf10] do_vfs_ioctl at ffffffff8116bd1b
#22 [ffff8809d8d1bf40] sys_ioctl at ffffffff8116c0e1
#23 [ffff8809d8d1bf80] system_call_fastpath at ffffffff81469172
--------------------------------------------
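
One possible reading of this trace, offered as an assumption and not a confirmed diagnosis: the lock that try_to_wake_up spins on is already held by this CPU's context-switch path, so the acquire can never succeed. The trace itself does not say which lock it is. The sketch below models that situation with a toy non-recursive lock. All names in it (toy_lock, toy_trylock, toy_unlock, wake_up_while_switching) are hypothetical and are not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a non-recursive spinlock (hypothetical, not a kernel type). */
struct toy_lock {
    bool held;
};

/* Returns true if the lock was acquired, false if it was already held. */
static bool toy_trylock(struct toy_lock *l)
{
    if (l->held)
        return false;   /* a real spinlock would spin here forever */
    l->held = true;
    return true;
}

/* Releases the lock; releasing a lock that is not held is a bug. */
static void toy_unlock(struct toy_lock *l)
{
    assert(l->held);
    l->held = false;
}

/*
 * Models a wakeup issued from inside a context switch.
 * ASSUMPTION: the switch path already holds the lock the wakeup needs.
 * Returns true if the wakeup could take the lock, false if it would spin.
 */
static bool wake_up_while_switching(struct toy_lock *rq_lock)
{
    bool wakeup_ok;

    if (!toy_trylock(rq_lock))          /* context switch takes the lock */
        return false;
    wakeup_ok = toy_trylock(rq_lock);   /* wakeup tries to take it again */
    toy_unlock(rq_lock);                /* context switch releases it */
    return wakeup_ok;
}
```

Calling wake_up_while_switching() on an unlocked toy_lock reports that the inner acquire cannot succeed, which is where a real spinlock would spin until the NMI watchdog fires.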

We found the following patch:
commit 80ab6f1e8c981b1b6604b2f22e36c917526235cd
"i387: use 'restore_fpu_checking()' directly in task switching code"

This patch removes the __math_state_restore() call from switch_fpu_finish(), like this:

 static inline void switch_fpu_finish(struct task_struct *new, fpu_switch_t fpu)
 {
-       if (fpu.preload)
-               __math_state_restore(new);
+       if (fpu.preload) {
+               if (unlikely(restore_fpu_checking(new)))
+                       __thread_fpu_end(new);
+       }
 }

So with this patch, when restore_fpu_checking() fails in switch_fpu_finish(), force_sig() is no longer called.
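
To make the difference concrete, here is a minimal user-space model of the two code paths. It is only a sketch: toy_task, toy_restore_fpu, toy_finish_old and toy_finish_new are hypothetical names, the failure is simulated with a flag, and the "old" path models only the force_sig() call described above (any other side effects of __math_state_restore are left out).

```c
#include <stdbool.h>

/* Hypothetical model of a task; only the fields this sketch needs. */
struct toy_task {
    bool fpu_owner;      /* task currently owns the FPU */
    bool signal_sent;    /* a signal was forced on the task */
    bool restore_fails;  /* simulate restore_fpu_checking() returning an error */
};

/* Stand-in for restore_fpu_checking(): nonzero means failure. */
static int toy_restore_fpu(const struct toy_task *t)
{
    return t->restore_fails ? -1 : 0;
}

/*
 * Model of the path before the patch: a failed restore forces a signal
 * from inside the task switch. Other side effects are omitted here.
 */
static void toy_finish_old(struct toy_task *t, bool preload)
{
    if (preload && toy_restore_fpu(t))
        t->signal_sent = true;   /* models force_sig() */
}

/* Model of the path after the patch: a failed restore only ends FPU use. */
static void toy_finish_new(struct toy_task *t, bool preload)
{
    if (preload && toy_restore_fpu(t))
        t->fpu_owner = false;    /* models __thread_fpu_end(new) */
}
```

With restore_fails set, toy_finish_old() marks a signal as sent while toy_finish_new() only clears fpu_owner, so in this model the new path never enters the signal-delivery code during a switch.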


1. Would this patch fix the issue (the deadlock)?
2. We don't understand why restore_fpu_checking() would fail. Does anyone know?
3. If the patch can fix the problem, we would like to know:
   if restore_fpu_checking(tsk) really fails and we do not force-send SIGSEGV to the task,
   would that introduce other issues?

Regards,
Weidong
