From: Rui Wang <rui.y.wang@intel.com>
To: airlied@redhat.com, daniel.vetter@ffwll.ch, robdclark@gmail.com,
	matthew.d.roper@intel.com, tony.luck@intel.com,
	gong.chen@intel.com, bp@alien8.de
Cc: linux-kernel@vger.kernel.org, Rui Wang <rui.y.wang@intel.com>
Subject: drm/mgag200: doesn't work in panic context
Date: Fri, 26 Jun 2015 15:55:14 +0800
Message-ID: <1435305314-14337-1-git-send-email-rui.y.wang@intel.com>

Hi all,

I'd like to report two panics that hang forever, leaving the machine unable to reboot. The cause is that mgag200 does not work in panic context: it sleeps and allocates memory non-atomically.

These were triggered while injecting machine checks using einj.
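
To illustrate the sleeping half of the problem: trace 1) below shows msleep() being reached from mga_crtc_prepare() on the panic path, where scheduling is not possible. A delay helper that is safe in such a context has to busy-wait instead of sleeping. The following is only a user-space sketch of that idea, not the driver's code. All names in it (sim_atomic_context, sketch_delay, busy_wait_ms, sleep_ms) are made up for illustration; in the kernel the corresponding primitives would be mdelay() and msleep(), and how the driver should detect panic context is left open here.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for "are we in panic/atomic context?".
 * In user space it is just a flag the caller sets. */
static bool sim_atomic_context = false;

enum delay_kind { DELAY_SLEEP, DELAY_BUSY_WAIT };

/* Monotonic clock in nanoseconds. */
static long long now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Busy-wait: spins without yielding the CPU (the mdelay() analogue). */
static void busy_wait_ms(unsigned int ms)
{
    long long end = now_ns() + (long long)ms * 1000000LL;
    while (now_ns() < end)
        ;
}

/* Sleeping wait: gives up the CPU (the msleep() analogue).
 * This is the kind of wait that must not happen in panic context. */
static void sleep_ms(unsigned int ms)
{
    struct timespec ts = { ms / 1000, (long)(ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Pick the wait primitive based on context and report which was used. */
static enum delay_kind sketch_delay(unsigned int ms)
{
    if (sim_atomic_context) {
        busy_wait_ms(ms);
        return DELAY_BUSY_WAIT;
    }
    sleep_ms(ms);
    return DELAY_SLEEP;
}
```

With sim_atomic_context set, sketch_delay() never yields the CPU, which is the property the panic path needs. The second problem, the non-atomic allocation in trace 2), is not addressed by this sketch.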

1)

[321381.466885] ------------[ cut here ]------------
[321381.472144] WARNING: CPU: 136 PID: 0 at kernel/time/timer.c:1098 del_timer_sync+0x36/0x60()
[321381.481571] Modules linked in: einj(E) nmioe(E) iscsi_ibft(E) iscsi_boot_sysfs(E) af_packet(E) x86_pkg_temp_thermal(E) btrfs(E) intel_powerclamp(E) coretemp(E) kvm(E) xor(E) crct10dif_pclmul(E) raid6_pq(E) crc32_pclmul(E) crc32c_intel(E) ghash_clmulni_intel(E) iTCO_wdt(E) iTCO_vendor_support(E) joydev(E) aesni_intel(E) lpc_ich(E) aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E) sb_edac(E) ablk_helper(E) cryptd(E) pcspkr(E) mfd_core(E) i2c_i801(E) wmi(E) edac_core(E) shpchp(E) ipmi_si(E) ipmi_msghandler(E) processor(E) acpi_pad(E) button(E) dm_mod(E) ext4(E) crc16(E) mbcache(E) jbd2(E) hid_generic(E) usbhid(E) sr_mod(E) cdrom(E) sd_mod(E) mgag200(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) ehci_pci(E) ehci_hcd(E) drm_kms_helper(E) ixgbe(E) ahci(E) igb(E) mdio(E) ttm(E) libahci(E) ptp(E) i2c_algo_bit(E) usbcore(E) pps_core(E) drm(E) libata(E) megaraid_sas(E) usb_common(E) dca(E) sg(E) scsi_mod(E) autofs4(E)
[321381.572300] CPU: 136 PID: 0 Comm: swapper/136 Tainted: G        W   E   4.1.0-rc8-7-default+ #4
[321381.582117] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRHSXSD1.86B.0059.R00.1501081238 01/08/2015
[321381.593777]  ffffffff81818089 ffff88047fc88808 ffffffff8157d67e 0000000000000000
[321381.602184]  0000000000000000 ffff88047fc88848 ffffffff810637fa ffff88046e4bc740
[321381.610595]  ffff88047fc888a8 ffff88047fc888a8 0000000104c6f0f8 ffff88047f5cdb00
[321381.619006] Call Trace:
[321381.621834]  <#MC>  [<ffffffff8157d67e>] dump_stack+0x4c/0x65
[321381.628358]  [<ffffffff810637fa>] warn_slowpath_common+0x8a/0xc0
[321381.635168]  [<ffffffff810638ea>] warn_slowpath_null+0x1a/0x20
[321381.641775]  [<ffffffff810cb316>] del_timer_sync+0x36/0x60
[321381.647995]  [<ffffffff81582bf0>] schedule_timeout+0x150/0x280
[321381.654611]  [<ffffffff812cc9fb>] ? idr_alloc+0x7b/0xe0
[321381.660547]  [<ffffffff810c9c90>] ? internal_add_timer+0x80/0x80
[321381.667359]  [<ffffffff810cb85c>] msleep+0x3c/0x50
[321381.672812]  [<ffffffffa0145607>] mga_crtc_prepare+0x167/0x370 [mgag200]
[321381.680404]  [<ffffffffa04f38b6>] drm_crtc_helper_set_mode+0x2d6/0x530 [drm_kms_helper]
[321381.689453]  [<ffffffffa04f4896>] drm_crtc_helper_set_config+0x856/0xa70 [drm_kms_helper]
[321381.698706]  [<ffffffffa00a3318>] drm_mode_set_config_internal+0x68/0x100 [drm]
[321381.706971]  [<ffffffffa04fe8b2>] restore_fbdev_mode+0xc2/0xf0 [drm_kms_helper]
[321381.715244]  [<ffffffffa04feaa3>] drm_fb_helper_force_kernel_mode+0x73/0xb0 [drm_kms_helper]
[321381.724780]  [<ffffffffa04ff6f9>] drm_fb_helper_panic+0x29/0x30 [drm_kms_helper]
[321381.733144]  [<ffffffff8108270d>] notifier_call_chain+0x4d/0x80
[321381.739859]  [<ffffffff81082791>] atomic_notifier_call_chain+0x21/0x30
[321381.747252]  [<ffffffff815792d4>] panic+0xee/0x1f5
[321381.752704]  [<ffffffff8102d272>] mce_panic+0x1e2/0x200
[321381.758640]  [<ffffffff8102d303>] mce_timed_out+0x73/0x80
[321381.764762]  [<ffffffff8102e8a1>] do_machine_check+0x5f1/0xae0
[321381.771377]  [<ffffffff81348eaf>] ? intel_idle+0xbf/0x130
[321381.777499]  [<ffffffff81585d49>] machine_check+0x29/0x50
[321381.783630]  [<ffffffff81348eaf>] ? intel_idle+0xbf/0x130
[321381.789760]  <<EOE>>  [<ffffffff81450170>] cpuidle_enter_state+0x70/0x1f0
[321381.797457]  [<ffffffff81450327>] cpuidle_enter+0x17/0x20
[321381.803586]  [<ffffffff810a5968>] cpu_startup_entry+0x308/0x390
[321381.810297]  [<ffffffff8103a203>] start_secondary+0x143/0x170
[321381.816814] ---[ end trace 9f2a977c4a9be24e ]---
[321381.822068] bad: scheduling from the idle thread!
[321381.827421] CPU: 136 PID: 0 Comm: swapper/136 Tainted: G        W   E   4.1.0-rc8-7-default+ #4
[321381.837238] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRHSXSD1.86B.0059.R00.1501081238 01/08/2015
[321381.848898]  ffff88046e4bc740 ffff88047fc887a8 ffffffff8157d67e 0000000000000000
[321381.857305]  ffff88047fc95300 ffff88047fc887c8 ffffffff81093675 ffff88047fc88808
[321381.865713]  ffff88047fc95300 ffff88047fc887f8 ffffffff8108796c 0000000100000000
[321381.874124] Call Trace:
[321381.876951]  <#MC>  [<ffffffff8157d67e>] dump_stack+0x4c/0x65
[321381.883483]  [<ffffffff81093675>] dequeue_task_idle+0x35/0x50
[321381.890001]  [<ffffffff8108796c>] dequeue_task+0x5c/0x80
[321381.896027]  [<ffffffff8108c56b>] deactivate_task+0x2b/0x30
[321381.902352]  [<ffffffff8157fcea>] __schedule+0x64a/0x910
[321381.908385]  [<ffffffff8157ffee>] schedule+0x3e/0x90
[321381.914030]  [<ffffffff81582be8>] schedule_timeout+0x148/0x280
[321381.920636]  [<ffffffff812cc9fb>] ? idr_alloc+0x7b/0xe0
[321381.926570]  [<ffffffff810c9c90>] ? internal_add_timer+0x80/0x80
[321381.933382]  [<ffffffff810cb85c>] msleep+0x3c/0x50
[321381.938835]  [<ffffffffa0145607>] mga_crtc_prepare+0x167/0x370 [mgag200]
[321381.946428]  [<ffffffffa04f38b6>] drm_crtc_helper_set_mode+0x2d6/0x530 [drm_kms_helper]
[321381.955478]  [<ffffffffa04f4896>] drm_crtc_helper_set_config+0x856/0xa70 [drm_kms_helper]
[321381.964731]  [<ffffffffa00a3318>] drm_mode_set_config_internal+0x68/0x100 [drm]
[321381.973004]  [<ffffffffa04fe8b2>] restore_fbdev_mode+0xc2/0xf0 [drm_kms_helper]
[321381.981277]  [<ffffffffa04feaa3>] drm_fb_helper_force_kernel_mode+0x73/0xb0 [drm_kms_helper]
[321381.990811]  [<ffffffffa04ff6f9>] drm_fb_helper_panic+0x29/0x30 [drm_kms_helper]
[321381.999174]  [<ffffffff8108270d>] notifier_call_chain+0x4d/0x80
[321382.005887]  [<ffffffff81082791>] atomic_notifier_call_chain+0x21/0x30
[321382.013280]  [<ffffffff815792d4>] panic+0xee/0x1f5
[321382.018731]  [<ffffffff8102d272>] mce_panic+0x1e2/0x200
[321382.024660]  [<ffffffff8102d303>] mce_timed_out+0x73/0x80
[321382.030787]  [<ffffffff8102e8a1>] do_machine_check+0x5f1/0xae0
[321382.037404]  [<ffffffff81348eaf>] ? intel_idle+0xbf/0x130
[321382.043533]  [<ffffffff81585d49>] machine_check+0x29/0x50
[321382.049665]  [<ffffffff81348eaf>] ? intel_idle+0xbf/0x130
[321382.055794]  <<EOE>>  [<ffffffff81450170>] cpuidle_enter_state+0x70/0x1f0
[321382.063491]  [<ffffffff81450327>] cpuidle_enter+0x17/0x20
[321382.069623]  [<ffffffff810a5968>] cpu_startup_entry+0x308/0x390
[321382.076335]  [<ffffffff8103a203>] start_secondary+0x143/0x170
[321382.082877] ------------[ cut here ]------------


2)

bkd04sdp:~ # [58109.056018] Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler
[58109.056058] mce: [Hardware Error]: Machine check events logged
[58110.109873] Shutting down cpus with NMI
[58110.176778] Kernel Offset: disabled
[58110.180667] drm_kms_helper: panic occurred, switching back to text console
[58110.188367] mga_delay choosing mdelay...
[58110.242399] mga_delay choosing mdelay...
[58110.266768] ------------[ cut here ]------------
[58110.271926] kernel BUG at mm/vmalloc.c:1335!
[58110.276695] invalid opcode: 0000 [#1] SMP
[58110.281289] Modules linked in: einj(E) nmioe(E) iscsi_ibft(E) iscsi_boot_sysfs(E) af_packet(E) btrfs(E) xor(E) raid6_pq(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm(E) joydev(E) iTCO_wdt(E) iTCO_vendor_support(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) ghash_clmulni_intel(E) aesni_intel(E) aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E) sb_edac(E) ablk_helper(E) lpc_ich(E) cryptd(E) pcspkr(E) edac_core(E) mfd_core(E) i2c_i801(E) shpchp(E) wmi(E) ipmi_si(E) ipmi_msghandler(E) acpi_pad(E) processor(E) button(E) dm_mod(E) ext4(E) crc16(E) mbcache(E) jbd2(E) hid_generic(E) usbhid(E) sr_mod(E) cdrom(E) sd_mod(E) mgag200(E) syscopyarea(E) sysfillrect(E) ahci(E) ehci_pci(E) sysimgblt(E) drm_kms_helper(E) ehci_hcd(E) ixgbe(E) igb(E) ttm(E) libahci(E) mdio(E) ptp(E) usbcore(E) pps_core(E) drm(E) libata(E) i2c_algo_bit(E) usb_common(E) dca(E) megaraid_sas(E) sg(E) scsi_mod(E) autofs4(E)
[58110.371884] CPU: 75 PID: 0 Comm: swapper/75 Tainted: G            E   4.1.0-rc8-7-default+ #10
[58110.381506] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRHSXSD1.86B.0059.R00.1501081238 01/08/2015
[58110.393063] task: ffff88046ea6d580 ti: ffff88046ea70000 task.ti: ffff88046ea70000
[58110.401422] RIP: 0010:[<ffffffff81189c65>]  [<ffffffff81189c65>] __get_vm_area_node+0x155/0x160
[58110.411156] RSP: 0018:ffff88047f7284b8  EFLAGS: 00010006
[58110.417091] RAX: 0000000080010003 RBX: 0000000091000000 RCX: ffffc90000000000
[58110.425065] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 00000000000eb000
[58110.433038] RBP: ffff88047f7284f8 R08: ffffe8ffffffffff R09: 00000000ffffffff
[58110.441010] R10: ffff880036a6a700 R11: ffff880460ab69c0 R12: 00000000910eb000
[58110.448983] R13: 0000000000000001 R14: 0000000091000000 R15: 00000000000eb000
[58110.456955] FS:  0000000000000000(0000) GS:ffff88047f720000(0000) knlGS:0000000000000000
[58110.465994] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[58110.472413] CR2: 00007f5bdb9ac095 CR3: 0000000001a0b000 CR4: 00000000001407e0
[58110.480386] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[58110.488358] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[58110.496331] Stack:
[58110.498577]  ffff88047f728518 ffffc90000000000 00000000910eafff 0000000091000000
[58110.506885]  00000000910eb000 0000000000000001 0000000091000000 00000000000eb000
[58110.515192]  ffff88047f728518 ffffffff8118ad20 00000000000000d0 ffffffffa0334f78
[58110.523499] Call Trace:
[58110.526229]  <#MC>
[58110.528382]  [<ffffffff8118ad20>] get_vm_area_caller+0x40/0x50
[58110.535111]  [<ffffffffa0334f78>] ? ttm_mem_reg_ioremap+0xc8/0x110 [ttm]
[58110.542607]  [<ffffffff81050f18>] __ioremap_caller+0x188/0x390
[58110.549127]  [<ffffffff812d59b9>] ? find_next_bit+0x19/0x20
[58110.555353]  [<ffffffff81051177>] ioremap_wc+0x17/0x20
[58110.561099]  [<ffffffffa0334f78>] ttm_mem_reg_ioremap+0xc8/0x110 [ttm]
[58110.568398]  [<ffffffffa0335351>] ttm_bo_move_memcpy+0xd1/0x700 [ttm]
[58110.575598]  [<ffffffff811a6c55>] ? __kmalloc+0x4b5/0x4c0
[58110.581632]  [<ffffffffa01d9b48>] mgag200_bo_move+0x18/0x20 [mgag200]
[58110.588830]  [<ffffffffa0332ea0>] ttm_bo_handle_move_mem+0x260/0x590 [ttm]
[58110.596514]  [<ffffffffa03337d2>] ? ttm_bo_mem_space+0xd2/0x320 [ttm]
[58110.603705]  [<ffffffffa0333eb2>] ttm_bo_validate+0x1c2/0x1d0 [ttm]
[58110.610711]  [<ffffffff8113f681>] ? irq_work_queue+0x11/0x90
[58110.617037]  [<ffffffffa01da3d3>] mgag200_bo_push_sysram+0x93/0xe0 [mgag200]
[58110.624915]  [<ffffffffa01d5a26>] mga_crtc_do_set_base.isra.8.constprop.21+0x76/0x410 [mgag200]
[58110.634636]  [<ffffffffa01d6e02>] mga_crtc_mode_set+0x1042/0x2140 [mgag200]
[58110.642416]  [<ffffffffa01d5492>] ? mga_crtc_prepare+0x132/0x370 [mgag200]
[58110.650106]  [<ffffffffa04ec8db>] drm_crtc_helper_set_mode+0x2fb/0x530 [drm_kms_helper]
[58110.659052]  [<ffffffffa04ed896>] drm_crtc_helper_set_config+0x856/0xa70 [drm_kms_helper]
[58110.668217]  [<ffffffffa00b5318>] drm_mode_set_config_internal+0x68/0x100 [drm]
[58110.676388]  [<ffffffffa04f78b2>] restore_fbdev_mode+0xc2/0xf0 [drm_kms_helper]
[58110.684558]  [<ffffffffa04f7aa3>] drm_fb_helper_force_kernel_mode+0x73/0xb0 [drm_kms_helper]
[58110.693989]  [<ffffffffa04f86f9>] drm_fb_helper_panic+0x29/0x30 [drm_kms_helper]
[58110.702260]  [<ffffffff81081bad>] notifier_call_chain+0x4d/0x80
[58110.708873]  [<ffffffff81081c31>] atomic_notifier_call_chain+0x21/0x30
[58110.716169]  [<ffffffff8156f954>] panic+0xee/0x1f5
[58110.721530]  [<ffffffff8102d232>] mce_panic+0x1e2/0x200
[58110.727366]  [<ffffffff8102d2c3>] mce_timed_out+0x73/0x80
[58110.733396]  [<ffffffff8102e861>] do_machine_check+0x5f1/0xae0
[58110.739915]  [<ffffffff8133f52f>] ? intel_idle+0xbf/0x130
[58110.745952]  [<ffffffff8157c3c9>] machine_check+0x29/0x50
[58110.751984]  [<ffffffff8133f52f>] ? intel_idle+0xbf/0x130
[58110.758017]  <<EOE>>
[58110.760362]  [<ffffffff814467f0>] cpuidle_enter_state+0x70/0x1f0
[58110.767275]  [<ffffffff814469a7>] cpuidle_enter+0x17/0x20
[58110.773309]  [<ffffffff810a4e18>] cpu_startup_entry+0x308/0x390
[58110.779916]  [<ffffffff8103a163>] start_secondary+0x143/0x170
[58110.786325] Code: 00 00 48 0f bd cf 83 c1 01 83 f9 0c 0f 4c c8 b0 1e 83 f9 1e 0f 4f c8 49 d3 e6 e9 f8 fe ff ff 48 89 df e8 9f a8 01 00 31 c0 eb b8 <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 49 89 c8 41
[58110.808146] RIP  [<ffffffff81189c65>] __get_vm_area_node+0x155/0x160
[58110.815257]  RSP <ffff88047f7284b8>
[58110.820218] ---[ end trace ab0c230901a0ee95 ]---

Thanks
Rui

