From: "José Pekkarinen" <koalinux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: Xiangliang.Yu-5C7GfCeVMHo@public.gmane.org
Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
Subject: Topaz mistakenly reported as vf
Date: Sun, 17 Dec 2017 21:20:49 +0200
Message-ID: <2405754.qoBN6c2ta5@bee>
Hi,
I hit an issue where a Topaz discrete GPU appears to report itself as a
virtual function (VF) when my laptop is running on battery. I received the
following backtrace:
Dec 17 11:17:28 bee kernel: [ 31.976810] kernel BUG at drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c:310!
Dec 17 11:17:28 bee kernel: [ 31.976815] invalid opcode: 0000 [#1] SMP
Dec 17 11:17:28 bee kernel: [ 31.976831] Modules linked in: vfio_pci
vfio_virqfd udl loop bfq arc4 iwlmvm mac80211 kvmgt vfio_mdev amdgpu(+) mdev
vfio_iommu_type1 vfio i915 uvcvideo x86_pkg_temp_thermal videobuf2_vmalloc
videobuf2_memops videobuf2_v4l2 intel_powerclamp videobuf2_core coretemp videodev kvm_intel
kvm i2c_algo_bit rtsx_pci_sdmmc drm_kms_helper joydev mmc_core media mousedev
rtsx_pci_ms btusb btrtl btbcm memstick ttm drm wmi_bmof hci_uart btintel
bluetooth iwlwifi snd_hda_intel snd_hda_codec cfg80211 irqbypass crc32c_intel
ghash_clmulni_intel intel_cstate snd_hwdep intel_uncore snd_hda_core psmouse
intel_rapl_perf rtsx_pci snd_pcm efi_pstore evdev ideapad_laptop ac input_leds
serio_raw efivars sparse_keymap intel_lpss_acpi battery thermal ecdh_generic wmi fan
syscopyarea snd_timer sysfillrect snd rfkill intel_lpss
Dec 17 11:17:28 bee kernel: [ 31.977023] video sysimgblt tpm_crb soundcore
button mfd_core i2c_hid i2c_i801 fb_sys_fops backlight acpi_pad efivarfs unix
dm_zero dm_thin_pool dm_persistent_data dm_bio_prison dm_service_time
dm_round_robin dm_queue_length dm_multipath dm_log_userspace cn dm_flakey dm_delay xts
aesni_intel crypto_simd cryptd glue_helper aes_x86_64 cbc sha256_generic
scsi_transport_iscsi r8169 mii fuse xfs nfs lockd grace sunrpc fscache ext4
mbcache jbd2
multipath linear raid10 raid1 raid0 dm_raid raid456 md_mod async_raid6_recov
async_memcpy async_pq async_xor xor async_tx raid6_pq libcrc32c dm_snapshot
dm_bufio dm_crypt dm_mirror dm_region_hash dm_log dm_mod dax hid_generic usbhid
xhci_pci xhci_hcd ohci_hcd uhci_hcd usb_storage ehci_pci ehci_hcd usbcore
usb_common scsi_transport_fc sr_mod cdrom sg sd_mod ata_piix
Dec 17 11:17:28 bee kernel: [ 31.977223] ahci libahci sata_sx4 pata_oldpiix
Dec 17 11:17:28 bee kernel: [ 31.977239] CPU: 0 PID: 3698 Comm: udevd Not
tainted 4.14.5 #10
Dec 17 11:17:28 bee kernel: [ 31.977255] Hardware name: LENOVO 80UV/Lenovo
ideapad 510S-14IKB, BIOS 2SCN21WW(V2.01) 12/20/2016
Dec 17 11:17:28 bee kernel: [ 31.977278] task: ffff880358b54280 task.stack:
ffffc900014dc000
Dec 17 11:17:28 bee kernel: [ 31.977323] RIP:
0010:xgpu_vi_init_golden_registers+0x56/0xa0 [amdgpu]
Dec 17 11:17:28 bee kernel: [ 31.977341] RSP: 0018:ffffc900014dfa08 EFLAGS:
00010293
Dec 17 11:17:28 bee kernel: [ 31.977356] RAX: 000000000000000a RBX:
ffff880340040000 RCX: 0000000000000000
Dec 17 11:17:28 bee kernel: [ 31.977375] RDX: ffff880358b54280 RSI:
0000000000000100 RDI: ffff880340040000
Dec 17 11:17:28 bee kernel: [ 31.977394] RBP: ffffc900014dfa10 R08:
ffff88033c6dd198 R09: 0000000000000000
Dec 17 11:17:28 bee kernel: [ 31.977413] R10: ffff880352c0aaa0 R11:
0000000000000008 R12: ffff880340040458
Dec 17 11:17:28 bee kernel: [ 31.977432] R13: 0000000000000000 R14:
0000000000000000 R15: ffff880340040000
Dec 17 11:17:28 bee kernel: [ 31.977452] FS: 00007fbfdd8c0780(0000)
GS:ffff88046ec00000(0000) knlGS:0000000000000000
Dec 17 11:17:28 bee kernel: [ 31.977474] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
Dec 17 11:17:28 bee kernel: [ 31.977490] CR2: 000055c3b48c1408 CR3:
0000000358307003 CR4: 00000000003606f0
Dec 17 11:17:28 bee kernel: [ 31.977527] Call Trace:
Dec 17 11:17:28 bee kernel: [ 31.977555] vi_common_hw_init+0x77/0xe0
[amdgpu]
Dec 17 11:17:28 bee kernel: [ 31.977584] amdgpu_device_init+0xc4b/0x14b0
[amdgpu]
Dec 17 11:17:28 bee kernel: [ 31.977601] ? kmem_cache_alloc_trace+0x208/0x250
Dec 17 11:17:28 bee kernel: [ 31.977629] ? amdgpu_driver_load_kms+0x2a/0x1b0 [amdgpu]
Dec 17 11:17:28 bee kernel: [ 31.977658] amdgpu_driver_load_kms+0x4f/0x1b0
[amdgpu]
Dec 17 11:17:28 bee kernel: [ 31.977682] drm_dev_register+0x146/0x1d0 [drm]
Dec 17 11:17:28 bee kernel: [ 31.977710] amdgpu_pci_probe+0x118/0x140
[amdgpu]
Dec 17 11:17:28 bee kernel: [ 31.977725] pci_device_probe+0xcf/0x150
Dec 17 11:17:28 bee kernel: [ 31.977739] driver_probe_device+0x29c/0x450
Dec 17 11:17:28 bee kernel: [ 31.977753] __driver_attach+0xdf/0xf0
Dec 17 11:17:28 bee kernel: [ 31.978775] ? driver_probe_device+0x450/0x450
Dec 17 11:17:28 bee kernel: [ 31.979815] bus_for_each_dev+0x60/0xa0
Dec 17 11:17:28 bee kernel: [ 31.980882] driver_attach+0x1e/0x20
Dec 17 11:17:28 bee kernel: [ 31.981931] bus_add_driver+0x170/0x260
Dec 17 11:17:28 bee kernel: [ 31.982977] driver_register+0x60/0xe0
Dec 17 11:17:28 bee kernel: [ 31.984033] __pci_register_driver+0x5a/0x60
Dec 17 11:17:28 bee kernel: [ 31.985089] amdgpu_init+0x88/0x9b [amdgpu]
Dec 17 11:17:28 bee kernel: [ 31.986146] ? 0xffffffffa0c51000
Dec 17 11:17:28 bee kernel: [ 31.987192] do_one_initcall+0x52/0x190
Dec 17 11:17:28 bee kernel: [ 31.988229] ? kmem_cache_alloc_trace+0x208/0x250
Dec 17 11:17:28 bee kernel: [ 31.989270] ? do_init_module+0x27/0x202
Dec 17 11:17:28 bee kernel: [ 31.990308] ? do_init_module+0x27/0x202
Dec 17 11:17:28 bee kernel: [ 31.991383] do_init_module+0x5f/0x202
Dec 17 11:17:28 bee kernel: [ 31.992396] load_module+0x1511/0x1740
Dec 17 11:17:28 bee kernel: [ 31.993433] SyS_finit_module+0xc1/0x100
Dec 17 11:17:28 bee kernel: [ 31.994478] ? SyS_finit_module+0xc1/0x100
Dec 17 11:17:28 bee kernel: [ 31.995505] do_syscall_64+0x66/0x1a0
Dec 17 11:17:28 bee kernel: [ 31.996556] entry_SYSCALL64_slow_path+0x25/0x25
Dec 17 11:17:28 bee kernel: [ 31.997616] RIP: 0033:0x7fbfdcfd68f9
Dec 17 11:17:28 bee kernel: [ 31.998643] RSP: 002b:00007ffd31e4f848 EFLAGS:
00000246 ORIG_RAX: 0000000000000139
Dec 17 11:17:28 bee kernel: [ 31.999659] RAX: ffffffffffffffda RBX:
000055e4a76c8430 RCX: 00007fbfdcfd68f9
Dec 17 11:17:28 bee kernel: [ 32.000689] RDX: 0000000000000000 RSI:
00007fbfdd2a4565 RDI: 000000000000000e
Dec 17 11:17:28 bee kernel: [ 32.001736] RBP: 00007fbfdd2a4565 R08:
0000000000000000 R09: 00007ffd31e4f9c0
Dec 17 11:17:28 bee kernel: [ 32.002813] R10: 000000000000000e R11:
0000000000000246 R12: 0000000000000000
Dec 17 11:17:28 bee kernel: [ 32.003862] R13: 000055e4a76d6710 R14:
0000000000020000 R15: 000055e4a741b8e9
Dec 17 11:17:28 bee kernel: [ 32.004906] Code: 48 89 df ba 4b 00 00 00 48 c7
c6 60 62 13 a1 e8 11 b7 fc ff 48 89 df ba 1e 00 00 00 48 c7 c6 e0 61 13 a1 e8
fd b6 fc ff 5b 5d c3 <0f> 0b ba 05 01 00 00 48 c7 c6 c0 5d 13 a1 e8 e7 b6 fc
ff 48 89
Dec 17 11:17:28 bee kernel: [ 32.006061] RIP: xgpu_vi_init_golden_registers+0x56/0xa0 [amdgpu] RSP: ffffc900014dfa08
Dec 17 11:17:28 bee kernel: [ 32.007226] ---[ end trace eb52a49a747a04be
]---
Which in the end means we hit the following BUG_ON in
xgpu_vi_init_golden_registers():
    BUG_ON("Doesn't support chip type.\n");
The path that leads there is in vi_init_golden_registers():
    if (amdgpu_sriov_vf(adev)) {
        xgpu_vi_init_golden_registers(adev);
        mutex_unlock(&adev->grbm_idx_mutex);
        return;
    }
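To make the failing path easier to follow, here is a minimal stand-alone sketch of the dispatch, not the driver code itself. The names fake_device, is_sriov_vf, vf_init_golden_registers and init_golden_registers are hypothetical stand-ins, and the sketch assumes that the VF helper only has golden settings for a couple of chips (Fiji and Tonga here) and falls through to the BUG_ON for anything else. Note that BUG_ON() with a string literal as its condition is always true, so reaching that statement at all is enough to trigger the oops.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's asic_type values. */
enum chip_type { CHIP_TOPAZ, CHIP_TONGA, CHIP_FIJI };

/* Hypothetical, simplified device: only the two facts relevant here. */
struct fake_device {
    enum chip_type asic_type;
    bool is_sriov_vf;   /* stand-in for amdgpu_sriov_vf(adev) */
};

/*
 * Mirrors the shape of xgpu_vi_init_golden_registers().
 * Returns 0 when the chip has VF golden settings, -1 where the
 * kernel would hit BUG_ON("Doesn't support chip type.\n").
 * Assumption: only Fiji and Tonga are handled on the VF path.
 */
static int vf_init_golden_registers(const struct fake_device *dev)
{
    assert(dev != NULL);

    switch (dev->asic_type) {
    case CHIP_FIJI:
    case CHIP_TONGA:
        return 0;
    default:
        /* A string literal is never NULL, so the kernel's BUG_ON
         * fires unconditionally once this branch is reached. */
        return -1;
    }
}

/* Mirrors the branch quoted from vi_init_golden_registers(). */
static int init_golden_registers(const struct fake_device *dev)
{
    assert(dev != NULL);

    if (dev->is_sriov_vf)
        return vf_init_golden_registers(dev);

    return 0;   /* bare-metal path, per-chip setup elided */
}
```

In this sketch, a device with asic_type CHIP_TOPAZ and is_sriov_vf set returns -1, which corresponds to the oops above. The same chip with is_sriov_vf clear takes the bare-metal path and returns 0, which would match the successful boot on AC power if the VF flag is indeed what differs between the two cases.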
The system is running the following kernel and CPU:
$ uname -a
Linux bee 4.14.5 #10 SMP Wed Dec 13 12:07:06 EET 2017 x86_64 Intel(R) Core(TM)
i7-7500U CPU @ 2.70GHz GenuineIntel GNU/Linux
The graphics card is the following:
# lspci -vvvs 01:00.0
01:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Topaz XT
[Radeon R7 M260/M265 / M340/M360 / M440/M445] (rev 81)
Subsystem: Lenovo Topaz XT [Radeon R7 M260/M265 / M340/M360 / M440/M445]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 128
Region 0: Memory at a0000000 (64-bit, prefetchable) [size=256M]
Region 2: Memory at b0000000 (64-bit, prefetchable) [size=2M]
Region 4: I/O ports at 4000 [size=256]
Region 5: Memory at b2300000 (32-bit, non-prefetchable) [size=256K]
Expansion ROM at b2340000 [disabled] [size=128K]
Capabilities: [48] Vendor Specific Information: Len=08 <?>
Capabilities: [50] Power Management version 3
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA
PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us,
L1 unlimited
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
DevCtl: Report errors: Correctable- Non-Fatal- Fatal-
Unsupported-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
MaxPayload 256 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr-
TransPend-
LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM L0s L1, Exit
Latency L0s <64ns, L1 <1us
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+
DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-,
OBFF Not Supported
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-,
OBFF Disabled
LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range,
EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -3.5dB,
EqualizationComplete+, EqualizationPhase1+
EqualizationPhase2+, EqualizationPhase3+,
LinkEqualizationRequest-
Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Address: 00000000fee00338 Data: 0000
Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1
Len=010 <?>
Capabilities: [150 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt-
RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout-
NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+
ChkEn-
Capabilities: [270 v1] #19
Capabilities: [2b0 v1] Address Translation Service (ATS)
ATSCap: Invalidate Queue Depth: 00
ATSCtl: Enable-, Smallest Translation Unit: 00
Capabilities: [2c0 v1] Page Request Interface (PRI)
PRICtl: Enable- Reset-
PRISta: RF- UPRGI- Stopped+
Page Request Capacity: 00000020, Page Request Allocation:
00000000
Capabilities: [2d0 v1] Process Address Space ID (PASID)
PASIDCap: Exec+ Priv+, Max PASID Width: 10
PASIDCtl: Enable- Exec- Priv-
Kernel driver in use: amdgpu
Kernel modules: amdgpu
Oddly, the machine boots properly when it is not running on battery, so
this looks like either a firmware problem or a problem in the way ACPI
interacts with the driver.
Any help or ideas are appreciated.
José.
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx