* Topaz mistakenly reported as vf
@ 2017-12-17 19:20 José Pekkarinen
  2017-12-19  7:12 ` Topaz mistakenly reported as vf José Pekkarinen
  0 siblings, 1 reply; 8+ messages in thread
From: José Pekkarinen @ 2017-12-17 19:20 UTC (permalink / raw)
  To: Xiangliang.Yu-5C7GfCeVMHo; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

	Hi,

	I hit an issue where a Topaz discrete GPU seems to report itself as a
virtual function when my laptop is running on battery. I received the
following backtrace:

Dec 17 11:17:28 bee kernel: [   31.976810] kernel BUG at drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c:310!
Dec 17 11:17:28 bee kernel: [   31.976815] invalid opcode: 0000 [#1] SMP
Dec 17 11:17:28 bee kernel: [   31.976831] Modules linked in: vfio_pci vfio_virqfd udl loop bfq arc4 iwlmvm mac80211 kvmgt vfio_mdev amdgpu(+) mdev vfio_iommu_type1 vfio i915 uvcvideo x86_pkg_temp_thermal videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 intel_powerclamp videobuf2_core coretemp videodev kvm_intel kvm i2c_algo_bit rtsx_pci_sdmmc drm_kms_helper joydev mmc_core media mousedev rtsx_pci_ms btusb btrtl btbcm memstick ttm drm wmi_bmof hci_uart btintel bluetooth iwlwifi snd_hda_intel snd_hda_codec cfg80211 irqbypass crc32c_intel ghash_clmulni_intel intel_cstate snd_hwdep intel_uncore snd_hda_core psmouse intel_rapl_perf rtsx_pci snd_pcm efi_pstore evdev ideapad_laptop ac input_leds serio_raw efivars sparse_keymap intel_lpss_acpi battery thermal ecdh_generic wmi fan syscopyarea snd_timer sysfillrect snd rfkill intel_lpss
Dec 17 11:17:28 bee kernel: [   31.977023]  video sysimgblt tpm_crb soundcore button mfd_core i2c_hid i2c_i801 fb_sys_fops backlight acpi_pad efivarfs unix dm_zero dm_thin_pool dm_persistent_data dm_bio_prison dm_service_time dm_round_robin dm_queue_length dm_multipath dm_log_userspace cn dm_flakey dm_delay xts aesni_intel crypto_simd cryptd glue_helper aes_x86_64 cbc sha256_generic scsi_transport_iscsi r8169 mii fuse xfs nfs lockd grace sunrpc fscache ext4 mbcache jbd2 multipath linear raid10 raid1 raid0 dm_raid raid456 md_mod async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq libcrc32c dm_snapshot dm_bufio dm_crypt dm_mirror dm_region_hash dm_log dm_mod dax hid_generic usbhid xhci_pci xhci_hcd ohci_hcd uhci_hcd usb_storage ehci_pci ehci_hcd usbcore usb_common scsi_transport_fc sr_mod cdrom sg sd_mod ata_piix
Dec 17 11:17:28 bee kernel: [   31.977223]  ahci libahci sata_sx4 pata_oldpiix
Dec 17 11:17:28 bee kernel: [   31.977239] CPU: 0 PID: 3698 Comm: udevd Not tainted 4.14.5 #10
Dec 17 11:17:28 bee kernel: [   31.977255] Hardware name: LENOVO 80UV/Lenovo ideapad 510S-14IKB, BIOS 2SCN21WW(V2.01) 12/20/2016
Dec 17 11:17:28 bee kernel: [   31.977278] task: ffff880358b54280 task.stack: ffffc900014dc000
Dec 17 11:17:28 bee kernel: [   31.977323] RIP: 0010:xgpu_vi_init_golden_registers+0x56/0xa0 [amdgpu]
Dec 17 11:17:28 bee kernel: [   31.977341] RSP: 0018:ffffc900014dfa08 EFLAGS: 00010293
Dec 17 11:17:28 bee kernel: [   31.977356] RAX: 000000000000000a RBX: ffff880340040000 RCX: 0000000000000000
Dec 17 11:17:28 bee kernel: [   31.977375] RDX: ffff880358b54280 RSI: 0000000000000100 RDI: ffff880340040000
Dec 17 11:17:28 bee kernel: [   31.977394] RBP: ffffc900014dfa10 R08: ffff88033c6dd198 R09: 0000000000000000
Dec 17 11:17:28 bee kernel: [   31.977413] R10: ffff880352c0aaa0 R11: 0000000000000008 R12: ffff880340040458
Dec 17 11:17:28 bee kernel: [   31.977432] R13: 0000000000000000 R14: 0000000000000000 R15: ffff880340040000
Dec 17 11:17:28 bee kernel: [   31.977452] FS:  00007fbfdd8c0780(0000) GS:ffff88046ec00000(0000) knlGS:0000000000000000
Dec 17 11:17:28 bee kernel: [   31.977474] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 17 11:17:28 bee kernel: [   31.977490] CR2: 000055c3b48c1408 CR3: 0000000358307003 CR4: 00000000003606f0
Dec 17 11:17:28 bee kernel: [   31.977527] Call Trace:
Dec 17 11:17:28 bee kernel: [   31.977555]  vi_common_hw_init+0x77/0xe0 [amdgpu]
Dec 17 11:17:28 bee kernel: [   31.977584]  amdgpu_device_init+0xc4b/0x14b0 [amdgpu]
Dec 17 11:17:28 bee kernel: [   31.977601]  ? kmem_cache_alloc_trace+0x208/0x250
Dec 17 11:17:28 bee kernel: [   31.977629]  ? amdgpu_driver_load_kms+0x2a/0x1b0 [amdgpu]
Dec 17 11:17:28 bee kernel: [   31.977658]  amdgpu_driver_load_kms+0x4f/0x1b0 [amdgpu]
Dec 17 11:17:28 bee kernel: [   31.977682]  drm_dev_register+0x146/0x1d0 [drm]
Dec 17 11:17:28 bee kernel: [   31.977710]  amdgpu_pci_probe+0x118/0x140 [amdgpu]
Dec 17 11:17:28 bee kernel: [   31.977725]  pci_device_probe+0xcf/0x150
Dec 17 11:17:28 bee kernel: [   31.977739]  driver_probe_device+0x29c/0x450
Dec 17 11:17:28 bee kernel: [   31.977753]  __driver_attach+0xdf/0xf0
Dec 17 11:17:28 bee kernel: [   31.978775]  ? driver_probe_device+0x450/0x450
Dec 17 11:17:28 bee kernel: [   31.979815]  bus_for_each_dev+0x60/0xa0
Dec 17 11:17:28 bee kernel: [   31.980882]  driver_attach+0x1e/0x20
Dec 17 11:17:28 bee kernel: [   31.981931]  bus_add_driver+0x170/0x260
Dec 17 11:17:28 bee kernel: [   31.982977]  driver_register+0x60/0xe0
Dec 17 11:17:28 bee kernel: [   31.984033]  __pci_register_driver+0x5a/0x60
Dec 17 11:17:28 bee kernel: [   31.985089]  amdgpu_init+0x88/0x9b [amdgpu]
Dec 17 11:17:28 bee kernel: [   31.986146]  ? 0xffffffffa0c51000
Dec 17 11:17:28 bee kernel: [   31.987192]  do_one_initcall+0x52/0x190
Dec 17 11:17:28 bee kernel: [   31.988229]  ? kmem_cache_alloc_trace+0x208/0x250
Dec 17 11:17:28 bee kernel: [   31.989270]  ? do_init_module+0x27/0x202
Dec 17 11:17:28 bee kernel: [   31.990308]  ? do_init_module+0x27/0x202
Dec 17 11:17:28 bee kernel: [   31.991383]  do_init_module+0x5f/0x202
Dec 17 11:17:28 bee kernel: [   31.992396]  load_module+0x1511/0x1740
Dec 17 11:17:28 bee kernel: [   31.993433]  SyS_finit_module+0xc1/0x100
Dec 17 11:17:28 bee kernel: [   31.994478]  ? SyS_finit_module+0xc1/0x100
Dec 17 11:17:28 bee kernel: [   31.995505]  do_syscall_64+0x66/0x1a0
Dec 17 11:17:28 bee kernel: [   31.996556]  entry_SYSCALL64_slow_path+0x25/0x25
Dec 17 11:17:28 bee kernel: [   31.997616] RIP: 0033:0x7fbfdcfd68f9
Dec 17 11:17:28 bee kernel: [   31.998643] RSP: 002b:00007ffd31e4f848 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
Dec 17 11:17:28 bee kernel: [   31.999659] RAX: ffffffffffffffda RBX: 000055e4a76c8430 RCX: 00007fbfdcfd68f9
Dec 17 11:17:28 bee kernel: [   32.000689] RDX: 0000000000000000 RSI: 00007fbfdd2a4565 RDI: 000000000000000e
Dec 17 11:17:28 bee kernel: [   32.001736] RBP: 00007fbfdd2a4565 R08: 0000000000000000 R09: 00007ffd31e4f9c0
Dec 17 11:17:28 bee kernel: [   32.002813] R10: 000000000000000e R11: 0000000000000246 R12: 0000000000000000
Dec 17 11:17:28 bee kernel: [   32.003862] R13: 000055e4a76d6710 R14: 0000000000020000 R15: 000055e4a741b8e9
Dec 17 11:17:28 bee kernel: [   32.004906] Code: 48 89 df ba 4b 00 00 00 48 c7 c6 60 62 13 a1 e8 11 b7 fc ff 48 89 df ba 1e 00 00 00 48 c7 c6 e0 61 13 a1 e8 fd b6 fc ff 5b 5d c3 <0f> 0b ba 05 01 00 00 48 c7 c6 c0 5d 13 a1 e8 e7 b6 fc ff 48 89
Dec 17 11:17:28 bee kernel: [   32.006061] RIP: xgpu_vi_init_golden_registers+0x56/0xa0 [amdgpu] RSP: ffffc900014dfa08
Dec 17 11:17:28 bee kernel: [   32.007226] ---[ end trace eb52a49a747a04be ]---

	This means we ended up at the following BUG_ON in
xgpu_vi_init_golden_registers:

	BUG_ON("Doesn't support chip type.\n");

	It is reached through this path in vi_init_golden_registers:

        if (amdgpu_sriov_vf(adev)) {                                                                                  
                xgpu_vi_init_golden_registers(adev);
                mutex_unlock(&adev->grbm_idx_mutex);
                return;                          
        }
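
	To make the path above easier to follow, here is a minimal user-space
sketch of the dispatch. It is not the driver code: struct device_model, the
init_result values and both function names are hypothetical stand-ins, and
the set of chips with a VF table is an assumption made for illustration. It
only shows why a device that is flagged as a virtual function but has no VF
golden-register table lands in the default branch, where the driver calls
BUG_ON with a string literal (a non-NULL pointer, so the condition always
holds).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's types; illustration only. */
enum chip_type { CHIP_TOPAZ, CHIP_FIJI, CHIP_TONGA };

struct device_model {
    enum chip_type chip;
    bool reports_vf;            /* models the result of amdgpu_sriov_vf(adev) */
};

enum init_result { INIT_BARE_METAL, INIT_VF, INIT_UNSUPPORTED_VF };

/*
 * Models xgpu_vi_init_golden_registers(): only chips with a VF table are
 * handled (which chips those are is an assumption here).
 */
static enum init_result vf_init_golden_registers(const struct device_model *dev)
{
    switch (dev->chip) {
    case CHIP_FIJI:
    case CHIP_TONGA:
        return INIT_VF;
    default:
        /*
         * The driver does BUG_ON("Doesn't support chip type.\n") here.
         * A string literal is never NULL, so that check always fires;
         * this sketch returns an error code instead of crashing.
         */
        return INIT_UNSUPPORTED_VF;
    }
}

/* Models the branch quoted from vi_init_golden_registers(). */
static enum init_result init_golden_registers(const struct device_model *dev)
{
    if (dev->reports_vf)
        return vf_init_golden_registers(dev);
    return INIT_BARE_METAL;
}
```

	With chip set to CHIP_TOPAZ and reports_vf set to true, which is what
the backtrace suggests happens on battery, init_golden_registers() returns
INIT_UNSUPPORTED_VF; with reports_vf set to false it takes the bare-metal
path.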

	The system is using the following kernel and CPU:

$ uname -a
Linux bee 4.14.5 #10 SMP Wed Dec 13 12:07:06 EET 2017 x86_64 Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz GenuineIntel GNU/Linux

	And the graphics card is the following:

# lspci -vvvs 01:00.0
01:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Topaz XT [Radeon R7 M260/M265 / M340/M360 / M440/M445] (rev 81)
        Subsystem: Lenovo Topaz XT [Radeon R7 M260/M265 / M340/M360 / M440/M445]
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 128
        Region 0: Memory at a0000000 (64-bit, prefetchable) [size=256M]
        Region 2: Memory at b0000000 (64-bit, prefetchable) [size=2M]
        Region 4: I/O ports at 4000 [size=256]
        Region 5: Memory at b2300000 (32-bit, non-prefetchable) [size=256K]
        Expansion ROM at b2340000 [disabled] [size=128K]
        Capabilities: [48] Vendor Specific Information: Len=08 <?>
        Capabilities: [50] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 256 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+, EqualizationPhase1+
                         EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
        Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
                Address: 00000000fee00338  Data: 0000
        Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
        Capabilities: [150 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
        Capabilities: [270 v1] #19
        Capabilities: [2b0 v1] Address Translation Service (ATS)
                ATSCap: Invalidate Queue Depth: 00
                ATSCtl: Enable-, Smallest Translation Unit: 00
        Capabilities: [2c0 v1] Page Request Interface (PRI)
                PRICtl: Enable- Reset-
                PRISta: RF- UPRGI- Stopped+
                Page Request Capacity: 00000020, Page Request Allocation: 00000000
        Capabilities: [2d0 v1] Process Address Space ID (PASID)
                PASIDCap: Exec+ Priv+, Max PASID Width: 10
                PASIDCtl: Enable- Exec- Priv-
        Kernel driver in use: amdgpu
        Kernel modules: amdgpu

	The odd thing is that the machine boots properly when it is not running
on battery, so this seems to be a problem either in the firmware or in the
way ACPI interacts with the driver.

	Any help or ideas are appreciated.

	José.
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

* Re: Topaz mistakenly reported as vf
  2017-12-17 19:20 Topaz mistakenly reported as vf José Pekkarinen
@ 2017-12-19  7:12 ` José Pekkarinen
  2017-12-19  7:19   ` Yu, Xiangliang
  0 siblings, 1 reply; 8+ messages in thread
From: José Pekkarinen @ 2017-12-19  7:12 UTC (permalink / raw)
  To: Xiangliang.Yu-5C7GfCeVMHo
  Cc: alexander.deucher-5C7GfCeVMHo, christian.koenig-5C7GfCeVMHo,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

	Adding Alex and Christian.

	Best regards.

	José


* RE: Topaz mistakenly reported as vf
  2017-12-19  7:12 ` Topaz mistakenly reported as vf José Pekkarinen
@ 2017-12-19  7:19   ` Yu, Xiangliang
       [not found]     ` <BY2PR1201MB0935B6DE5790F85F6CEC05FDEB0F0-O28G1zQ8oGkaqtME6NEo1mrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Yu, Xiangliang @ 2017-12-19  7:19 UTC (permalink / raw)
  To: José Pekkarinen
  Cc: Deucher, Alexander, Koenig, Christian,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org

Topaz doesn't support SRIOV.


> -----Original Message-----
> From: José Pekkarinen [mailto:jose.pekkarinen@canonical.com]
> Sent: Tuesday, December 19, 2017 3:12 PM
> To: Yu, Xiangliang <Xiangliang.Yu@amd.com>
> Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander
> <Alexander.Deucher@amd.com>; Koenig, Christian
> <Christian.Koenig@amd.com>
> Subject: Re: Topaz mistakenly reported as vf
> 
> On Sunday, 17 December 2017 21:20:49 EET José Pekkarinen wrote:
> > 	Hi,
> >
> > 	I hit an issue that seems to be a topaz discrete vga reporting it's a
> > virtual function when my laptop is running on the battery. I received
> > the following bactrace:
> >
> > Dec 17 11:17:28 bee kernel: [   31.976810] kernel BUG at
> > drivers/gpu/drm/amd/ amdgpu/mxgpu_vi.c:310!
> > Dec 17 11:17:28 bee kernel: [   31.976815] invalid opcode: 0000 [#1] SMP
> > Dec 17 11:17:28 bee kernel: [   31.976831] Modules linked in: vfio_pci
> > vfio_virqfd udl loop bfq arc4 iwlmvm mac80211 kvmgt vfio_mdev
> > amdgpu(+) mdev
> > vfio_iommu_type1 vfio i915 uvcvideo x86_pkg_temp_thermal
> > videobuf2_vmalloc videobuf2_memo ps videobuf2_v4l2 intel_powerclamp
> > videobuf2_core coretemp videodev kvm_intel kvm i2c_algo_bit
> > rtsx_pci_sdmmc drm_kms_helper joydev mmc_core media mousedev
> > rtsx_pci_ms btusb btrtl btbcm memstick ttm drm wmi_bmof hci_uart
> > btintel bluetoot h iwlwifi snd_hda_intel snd_hda_codec cfg80211
> > irqbypass crc32c_intel ghash_clmulni_intel intel_cstate snd_hwdep
> > intel_uncore snd_hda_core psmouse intel_rapl_perf rtsx_pci snd_pcm
> > efi_pstore evdev ideapad_laptop ac input_leds serio_raw e fivars
> > sparse_keymap intel_lpss_acpi battery thermal ecdh_generic wmi fan
> > syscopyarea snd_timer sysfillrect snd rfkill intel_lpss
> > Dec 17 11:17:28 bee kernel: [   31.977023]  video sysimgblt tpm_crb
> > soundcore button mfd_core i2c_hid i2c_i801 fb_sys_fops backlight
> > acpi_pad efivarfs unix dm_zero dm_thin_pool dm_persistent_data
> > dm_bio_prison dm_service_time dm_round_ro bin dm_queue_length
> > dm_multipath dm_log_userspace cn dm_flakey dm_delay xts aesni_intel
> > crypto_simd cryptd glue_helper aes_x86_64 cbc sha256_generic
> > scsi_transport_iscsi r8169 mii fuse xfs nfs lockd grace sunrpc fscache
> > ext4 mbcache jbd2  multipath linear raid10 raid1 raid0 dm_raid raid456
> > md_mod async_raid6_recov async_memcpy async_pq async_xor xor
> async_tx
> > raid6_pq libcrc32c dm_snapshot dm_bufio dm_crypt dm_mirror
> > dm_region_hash dm_log dm_mod dax hid_generic usbhid xhc i_pci
> xhci_hcd
> > ohci_hcd uhci_hcd usb_storage ehci_pci ehci_hcd usbcore usb_common
> > scsi_transport_fc sr_mod cdrom sg sd_mod ata_piix
> > Dec 17 11:17:28 bee kernel: [   31.977223]  ahci libahci sata_sx4
> > pata_oldpiix Dec 17 11:17:28 bee kernel: [   31.977239] CPU: 0 PID: 3698
> > Comm: udevd Not tainted 4.14.5 #10
> > Dec 17 11:17:28 bee kernel: [   31.977255] Hardware name: LENOVO
> 80UV/Lenovo
> > ideapad 510S-14IKB, BIOS 2SCN21WW(V2.01) 12/20/2016
> > Dec 17 11:17:28 bee kernel: [   31.977278] task: ffff880358b54280
> > task.stack: ffffc900014dc000
> > Dec 17 11:17:28 bee kernel: [   31.977323] RIP:
> > 0010:xgpu_vi_init_golden_registers+0x56/0xa0 [amdgpu]
> > Dec 17 11:17:28 bee kernel: [   31.977341] RSP: 0018:ffffc900014dfa08
> > EFLAGS: 00010293
> > Dec 17 11:17:28 bee kernel: [   31.977356] RAX: 000000000000000a RBX:
> > ffff880340040000 RCX: 0000000000000000
> > Dec 17 11:17:28 bee kernel: [   31.977375] RDX: ffff880358b54280 RSI:
> > 0000000000000100 RDI: ffff880340040000
> > Dec 17 11:17:28 bee kernel: [   31.977394] RBP: ffffc900014dfa10 R08:
> > ffff88033c6dd198 R09: 0000000000000000
> > Dec 17 11:17:28 bee kernel: [   31.977413] R10: ffff880352c0aaa0 R11:
> > 0000000000000008 R12: ffff880340040458
> > Dec 17 11:17:28 bee kernel: [   31.977432] R13: 0000000000000000 R14:
> > 0000000000000000 R15: ffff880340040000
> > Dec 17 11:17:28 bee kernel: [   31.977452] FS:  00007fbfdd8c0780(0000)
> > GS:ffff88046ec00000(0000) knlGS:0000000000000000
> > Dec 17 11:17:28 bee kernel: [   31.977474] CS:  0010 DS: 0000 ES: 0000 CR0:
> > 0000000080050033
> > Dec 17 11:17:28 bee kernel: [   31.977490] CR2: 000055c3b48c1408 CR3:
> > 0000000358307003 CR4: 00000000003606f0
> > Dec 17 11:17:28 bee kernel: [   31.977527] Call Trace:
> > Dec 17 11:17:28 bee kernel: [   31.977555]  vi_common_hw_init+0x77/0xe0 [amdgpu]
> > Dec 17 11:17:28 bee kernel: [   31.977584]  amdgpu_device_init+0xc4b/0x14b0 [amdgpu]
> > Dec 17 11:17:28 bee kernel: [   31.977601]  ? kmem_cache_alloc_trace+0x208/0x250
> > Dec 17 11:17:28 bee kernel: [   31.977629]  ? amdgpu_driver_load_kms+0x2a/0x1b0 [amdgpu]
> > Dec 17 11:17:28 bee kernel: [   31.977658]  amdgpu_driver_load_kms+0x4f/0x1b0 [amdgpu]
> > Dec 17 11:17:28 bee kernel: [   31.977682]  drm_dev_register+0x146/0x1d0 [drm]
> > Dec 17 11:17:28 bee kernel: [   31.977710]  amdgpu_pci_probe+0x118/0x140 [amdgpu]
> > Dec 17 11:17:28 bee kernel: [   31.977725]  pci_device_probe+0xcf/0x150
> > Dec 17 11:17:28 bee kernel: [   31.977739]  driver_probe_device+0x29c/0x450
> > Dec 17 11:17:28 bee kernel: [   31.977753]  __driver_attach+0xdf/0xf0
> > Dec 17 11:17:28 bee kernel: [   31.978775]  ? driver_probe_device+0x450/0x450
> > Dec 17 11:17:28 bee kernel: [   31.979815]  bus_for_each_dev+0x60/0xa0
> > Dec 17 11:17:28 bee kernel: [   31.980882]  driver_attach+0x1e/0x20
> > Dec 17 11:17:28 bee kernel: [   31.981931]  bus_add_driver+0x170/0x260
> > Dec 17 11:17:28 bee kernel: [   31.982977]  driver_register+0x60/0xe0
> > Dec 17 11:17:28 bee kernel: [   31.984033]  __pci_register_driver+0x5a/0x60
> > Dec 17 11:17:28 bee kernel: [   31.985089]  amdgpu_init+0x88/0x9b [amdgpu]
> > Dec 17 11:17:28 bee kernel: [   31.986146]  ? 0xffffffffa0c51000
> > Dec 17 11:17:28 bee kernel: [   31.987192]  do_one_initcall+0x52/0x190
> > Dec 17 11:17:28 bee kernel: [   31.988229]  ? kmem_cache_alloc_trace+0x208/0x250
> > Dec 17 11:17:28 bee kernel: [   31.989270]  ? do_init_module+0x27/0x202
> > Dec 17 11:17:28 bee kernel: [   31.990308]  ? do_init_module+0x27/0x202
> > Dec 17 11:17:28 bee kernel: [   31.991383]  do_init_module+0x5f/0x202
> > Dec 17 11:17:28 bee kernel: [   31.992396]  load_module+0x1511/0x1740
> > Dec 17 11:17:28 bee kernel: [   31.993433]  SyS_finit_module+0xc1/0x100
> > Dec 17 11:17:28 bee kernel: [   31.994478]  ? SyS_finit_module+0xc1/0x100
> > Dec 17 11:17:28 bee kernel: [   31.995505]  do_syscall_64+0x66/0x1a0
> > Dec 17 11:17:28 bee kernel: [   31.996556]  entry_SYSCALL64_slow_path+0x25/0x25
> > Dec 17 11:17:28 bee kernel: [   31.997616] RIP: 0033:0x7fbfdcfd68f9
> > Dec 17 11:17:28 bee kernel: [   31.998643] RSP: 002b:00007ffd31e4f848 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
> > Dec 17 11:17:28 bee kernel: [   31.999659] RAX: ffffffffffffffda RBX: 000055e4a76c8430 RCX: 00007fbfdcfd68f9
> > Dec 17 11:17:28 bee kernel: [   32.000689] RDX: 0000000000000000 RSI: 00007fbfdd2a4565 RDI: 000000000000000e
> > Dec 17 11:17:28 bee kernel: [   32.001736] RBP: 00007fbfdd2a4565 R08: 0000000000000000 R09: 00007ffd31e4f9c0
> > Dec 17 11:17:28 bee kernel: [   32.002813] R10: 000000000000000e R11: 0000000000000246 R12: 0000000000000000
> > Dec 17 11:17:28 bee kernel: [   32.003862] R13: 000055e4a76d6710 R14: 0000000000020000 R15: 000055e4a741b8e9
> > Dec 17 11:17:28 bee kernel: [   32.004906] Code: 48 89 df ba 4b 00 00 00 48 c7 c6 60 62 13 a1 e8 11 b7 fc ff 48 89 df ba 1e 00 00 00 48 c7 c6 e0 61 13 a1 e8 fd b6 fc ff 5b 5d c3 <0f> 0b ba 05 01 00 00 48 c7 c6 c0 5d 13 a1 e8 e7 b6 fc ff 48 89
> > Dec 17 11:17:28 bee kernel: [   32.006061] RIP: xgpu_vi_init_golden_registers+0x56/0xa0 [amdgpu] RSP: ffffc900014dfa08
> > Dec 17 11:17:28 bee kernel: [   32.007226] ---[ end trace eb52a49a747a04be ]---
> >
> > 	Which in the end means we got to the following BUG_ON on
> > xgpu_vi_init_golden_registers:
> >
> > 	BUG_ON("Doesn't support chip type.\n");
> >
> > 	Following the path in vi_init_golden_registers:
> >
> >         if (amdgpu_sriov_vf(adev)) {
> >                 xgpu_vi_init_golden_registers(adev);
> >                 mutex_unlock(&adev->grbm_idx_mutex);
> >                 return;
> >         }
> >
> > 	System is using the following kernel and cpu:
> >
> > $ uname -a
> > Linux bee 4.14.5 #10 SMP Wed Dec 13 12:07:06 EET 2017 x86_64 Intel(R)
> > Core(TM) i7-7500U CPU @ 2.70GHz GenuineIntel GNU/Linux
> >
> > 	And the graphic card is the following:
> >
> > # lspci -vvvs 01:00.0
> > 01:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI]
> > Topaz XT [Radeon R7 M260/M265 / M340/M360 / M440/M445] (rev 81)
> >         Subsystem: Lenovo Topaz XT [Radeon R7 M260/M265 / M340/M360 /
> > M440/ M445]
> >         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> > ParErr-
> > Stepping- SERR- FastB2B- DisINTx+
> >         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> > <TAbort- <MAbort- >SERR- <PERR- INTx-
> >         Latency: 0, Cache Line Size: 64 bytes
> >         Interrupt: pin A routed to IRQ 128
> >         Region 0: Memory at a0000000 (64-bit, prefetchable) [size=256M]
> >         Region 2: Memory at b0000000 (64-bit, prefetchable) [size=2M]
> >         Region 4: I/O ports at 4000 [size=256]
> >         Region 5: Memory at b2300000 (32-bit, non-prefetchable) [size=256K]
> >         Expansion ROM at b2340000 [disabled] [size=128K]
> >         Capabilities: [48] Vendor Specific Information: Len=08 <?>
> >         Capabilities: [50] Power Management version 3
> >                 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA
> > PME(D0-,D1-,D2-,D3hot-,D3cold-)
> >                 Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> >         Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
> >                 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s
> > <4us,
> > L1 unlimited
> >                         ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
> >                 DevCtl: Report errors: Correctable- Non-Fatal- Fatal-
> > Unsupported-
> >                         RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
> >                         MaxPayload 256 bytes, MaxReadReq 512 bytes
> >                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq-
> > AuxPwr-
> > TransPend-
> >                 LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM L0s L1,
> > Exit Latency L0s <64ns, L1 <1us
> >                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
> >                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
> >                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> >                 LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train-
> > SlotClk+
> > DLActive- BWMgmt- ABWMgmt-
> >                 DevCap2: Completion Timeout: Not Supported,
> > TimeoutDis-, LTR-, OBFF Not Supported
> >                 DevCtl2: Completion Timeout: 50us to 50ms,
> > TimeoutDis-, LTR-, OBFF Disabled
> >                 LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance-
> > SpeedDis- Transmit Margin: Normal Operating Range,
> > EnterModifiedCompliance- ComplianceSOS-
> >                          Compliance De-emphasis: -6dB
> >                 LnkSta2: Current De-emphasis Level: -3.5dB,
> > EqualizationComplete+, EqualizationPhase1+
> >                          EqualizationPhase2+, EqualizationPhase3+,
> > LinkEqualizationRequest-
> >         Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
> >                 Address: 00000000fee00338  Data: 0000
> >         Capabilities: [100 v1] Vendor Specific Information: ID=0001
> > Rev=1
> > Len=010 <?>
> >         Capabilities: [150 v2] Advanced Error Reporting
> >                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
> > UnxCmplt-
> > RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> >                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
> > UnxCmplt-
> > RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> >                 UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt-
> > UnxCmplt-
> > RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> >                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
> > NonFatalErr-
> >                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
> > NonFatalErr +
> >                 AERCap: First Error Pointer: 00, GenCap+ CGenEn-
> > ChkCap+
> > ChkEn-
> >         Capabilities: [270 v1] #19
> >         Capabilities: [2b0 v1] Address Translation Service (ATS)
> >                 ATSCap: Invalidate Queue Depth: 00
> >                 ATSCtl: Enable-, Smallest Translation Unit: 00
> >         Capabilities: [2c0 v1] Page Request Interface (PRI)
> >                 PRICtl: Enable- Reset-
> >                 PRISta: RF- UPRGI- Stopped+
> >                 Page Request Capacity: 00000020, Page Request Allocation:
> > 00000000
> >         Capabilities: [2d0 v1] Process Address Space ID (PASID)
> >                 PASIDCap: Exec+ Priv+, Max PASID Width: 10
> >                 PASIDCtl: Enable- Exec- Priv-
> >         Kernel driver in use: amdgpu
> >         Kernel modules: amdgpu
> >
> > 	Funny thing is that I can boot the machine properly when not running
> > on the battery, so either this seems to be a problem in the firmware,
> > or in the way acpi interacts with the driver.
> >
> > 	Any help, or ideas are appreciated.
> >
> > 	José.
> 
> 	Adding Alex and Christian.
> 
> 	Best regards.
> 
> 	José
> 

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* Re: Topaz mistakenly reported as vf
       [not found]     ` <BY2PR1201MB0935B6DE5790F85F6CEC05FDEB0F0-O28G1zQ8oGkaqtME6NEo1mrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
@ 2017-12-19  7:27       ` José Pekkarinen
  2017-12-19  7:44         ` Yu, Xiangliang
  0 siblings, 1 reply; 8+ messages in thread
From: José Pekkarinen @ 2017-12-19  7:27 UTC (permalink / raw)
  To: Yu, Xiangliang
  Cc: Deucher, Alexander, Koenig, Christian,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org

On Tuesday, 19 December 2017 09:19:02 EET Yu, Xiangliang wrote:
> Topaz doesn't support SRIOV.
> 

	Hi Xiangliang,

	Let me clarify what I mean: since Topaz does not support SR-IOV, 
amdgpu_sriov_vf(adev) should return false and the SR-IOV path in vi.c should 
be skipped. Instead, the driver enters that path and reaches the BUG_ON that 
states the chip type is not supported.
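
	To make the failing path concrete, here is a small user-space model of 
the control flow described above. It is only a sketch: the enum names and 
the model_* functions are hypothetical stand-ins, not driver code, and the 
assumption that Fiji and Tonga are the chips handled before the default case 
is based on the discussion in this thread rather than on the full source.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's chip identifiers. */
enum asic_type { ASIC_TOPAZ, ASIC_TONGA, ASIC_FIJI };

/* Possible outcomes of the modelled init path. */
enum init_path { PATH_BARE_METAL, PATH_VF_OK, PATH_BUG };

/*
 * Models xgpu_vi_init_golden_registers(): chips with VF golden settings
 * are programmed, anything else lands in the default case, which in the
 * driver is the BUG_ON seen in the trace. (Assumed chip list.)
 */
static enum init_path model_xgpu_init(enum asic_type type)
{
    switch (type) {
    case ASIC_FIJI:
    case ASIC_TONGA:
        return PATH_VF_OK;
    default:
        return PATH_BUG;
    }
}

/* Models the early-return branch quoted from vi_init_golden_registers(). */
static enum init_path model_vi_init(enum asic_type type, bool is_vf)
{
    if (is_vf)
        return model_xgpu_init(type);
    return PATH_BARE_METAL;
}
```

	In this model, a Topaz that is wrongly flagged as a VF ends in PATH_BUG, 
while the same chip with the flag cleared takes the normal bare-metal path.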

	Thanks for coming back!

	José.



* RE: Topaz mistakenly reported as vf
  2017-12-19  7:27       ` José Pekkarinen
@ 2017-12-19  7:44         ` Yu, Xiangliang
       [not found]           ` <BY2PR1201MB0935ECB9DDC4DE571DF65433EB0F0-O28G1zQ8oGkaqtME6NEo1mrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Yu, Xiangliang @ 2017-12-19  7:44 UTC (permalink / raw)
  To: José Pekkarinen
  Cc: Deucher, Alexander, Koenig, Christian,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org

We can add an ASIC check in the vi_detect_hw_virtualization function to avoid entering the error path.
I'll submit a patch later.

Thanks!

> -----Original Message-----
> From: José Pekkarinen [mailto:jose.pekkarinen@canonical.com]
> Sent: Tuesday, December 19, 2017 3:27 PM
> To: Yu, Xiangliang <Xiangliang.Yu@amd.com>
> Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander
> <Alexander.Deucher@amd.com>; Koenig, Christian
> <Christian.Koenig@amd.com>
> Subject: Re: Topaz mistakenly reported as vf
> 
> On Tuesday, 19 December 2017 09:19:02 EET Yu, Xiangliang wrote:
> > Topaz doesn't support SRIOV.
> >
> 
> 	Hi Xiangliang,
> 
> 	Let me clarify what I mean: since Topaz does not support SR-IOV,
> amdgpu_sriov_vf(adev) should return false and the SR-IOV path in vi.c
> should be skipped. Instead, the driver enters that path and reaches the
> BUG_ON that states the chip type is not supported.
> 
> 	Thanks for coming back!
> 
> 	José.
> 


* Re: Topaz mistakenly reported as vf
       [not found]           ` <BY2PR1201MB0935ECB9DDC4DE571DF65433EB0F0-O28G1zQ8oGkaqtME6NEo1mrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
@ 2017-12-19  7:50             ` José Pekkarinen
  2017-12-19  7:56               ` Yu, Xiangliang
  2017-12-19 14:20             ` Deucher, Alexander
  1 sibling, 1 reply; 8+ messages in thread
From: José Pekkarinen @ 2017-12-19  7:50 UTC (permalink / raw)
  To: Yu, Xiangliang
  Cc: Deucher, Alexander, Koenig, Christian,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org

[-- Attachment #1: Type: text/plain, Size: 334 bytes --]

On Tuesday, 19 December 2017 09:44:00 EET Yu, Xiangliang wrote:
> We can add an ASIC check in the vi_detect_hw_virtualization function to
> avoid entering the error path. I'll submit a patch later.
> 
> Thanks!

	Awesome, thanks!

	Also, I noticed the following patch in case you can consider it for 
upstream.

	Best regards.

	José.

[-- Attachment #2: 0001-Release-the-mutex-hold-before-backtracing-for-not-su.patch --]
[-- Type: text/x-patch, Size: 1043 bytes --]

From dc691db50dbfb213aaee2b76574a5e8de2d7bca0 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jos=C3=A9=20Pekkarinen?= <koalinux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date: Tue, 19 Dec 2017 09:39:50 +0200
Subject: [PATCH] Release the mutex hold before backtracing for not supported
 mxgpu.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: José Pekkarinen <koalinux-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
 drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
index c25a831f94ec..cac1d8b003e6 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
@@ -307,6 +307,7 @@ void xgpu_vi_init_golden_registers(struct amdgpu_device *adev)
 						 xgpu_tonga_golden_common_all));
 		break;
 	default:
+		mutex_unlock(&adev->grbm_idx_mutex);
 		BUG_ON("Doesn't support chip type.\n");
 		break;
 	}
-- 
2.13.6



* RE: Topaz mistakenly reported as vf
  2017-12-19  7:50             ` José Pekkarinen
@ 2017-12-19  7:56               ` Yu, Xiangliang
  0 siblings, 0 replies; 8+ messages in thread
From: Yu, Xiangliang @ 2017-12-19  7:56 UTC (permalink / raw)
  To: José Pekkarinen
  Cc: Deucher, Alexander, Koenig, Christian,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org

> 	Also, I noticed the following patch in case you can consider it for
> upstream.

Please send it to the amd-gfx mailing list for code review.

> 
> 	Best regards.
> 
> 	José.

* Re: Topaz mistakenly reported as vf
       [not found]           ` <BY2PR1201MB0935ECB9DDC4DE571DF65433EB0F0-O28G1zQ8oGkaqtME6NEo1mrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
  2017-12-19  7:50             ` José Pekkarinen
@ 2017-12-19 14:20             ` Deucher, Alexander
  1 sibling, 0 replies; 8+ messages in thread
From: Deucher, Alexander @ 2017-12-19 14:20 UTC (permalink / raw)
  To: Yu, Xiangliang, José Pekkarinen
  Cc: Koenig, Christian,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org


[-- Attachment #1.1: Type: text/plain, Size: 1557 bytes --]

Please make sure we only check that register on asics where it is relevant.  There are a bunch of VI asics.  It should probably be restricted to tonga and fiji only.
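
A rough user-space sketch of that restriction is below. It only illustrates the idea of gating the register check by ASIC; the enum values, flag names, and the register bit layout (bit 0 = IOV enabled, bit 1 = function is a VF) are assumptions made up for this example and are not taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's ASIC identifiers. */
enum asic_type { ASIC_TOPAZ, ASIC_TONGA, ASIC_FIJI, ASIC_POLARIS10 };

/* Hypothetical capability flags for this sketch. */
#define CAPS_ENABLE_IOV (1u << 0)
#define CAPS_IS_VF      (1u << 1)

/* Only trust the virtualization register on ASICs where it is relevant. */
static bool asic_has_virt_register(enum asic_type type)
{
    return type == ASIC_TONGA || type == ASIC_FIJI;
}

/*
 * Translate a raw register value into capability flags.
 * For any other ASIC the value is ignored, so a stray bit (for example
 * from a device that is powered down) cannot mark the device as a VF.
 */
static uint32_t detect_virt_caps(enum asic_type type, uint32_t reg)
{
    uint32_t caps = 0;

    if (!asic_has_virt_register(type))
        return 0;

    if (reg & 1u)            /* assumed bit: IOV enabled */
        caps |= CAPS_ENABLE_IOV;
    if (reg & 2u)            /* assumed bit: function is a VF */
        caps |= CAPS_IS_VF;

    return caps;
}
```

With a gate like this, a Topaz device reports no virtualization capabilities regardless of what the register read returns, so the VF-only init path is never entered for it.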


Alex


________________________________
From: Yu, Xiangliang
Sent: Tuesday, December 19, 2017 2:44 AM
To: José Pekkarinen
Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org; Deucher, Alexander; Koenig, Christian
Subject: RE: Topaz mistakenly reported as vf

We can add an ASIC check in the vi_detect_hw_virtualization function to avoid entering the error path.
I'll submit a patch later.

Thanks!

> -----Original Message-----
> From: José Pekkarinen [mailto:jose.pekkarinen-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org]
> Sent: Tuesday, December 19, 2017 3:27 PM
> To: Yu, Xiangliang <Xiangliang.Yu-5C7GfCeVMHo@public.gmane.org>
> Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org; Deucher, Alexander
> <Alexander.Deucher-5C7GfCeVMHo@public.gmane.org>; Koenig, Christian
> <Christian.Koenig-5C7GfCeVMHo@public.gmane.org>
> Subject: Re: Topaz mistakenly reported as vf
>
> On Tuesday, 19 December 2017 09:19:02 EET Yu, Xiangliang wrote:
> > Topaz doesn't support SRIOV.
> >
>
>        Hi Xiangliang,
>
>        Let me clarify what I mean: since Topaz does not support SR-IOV,
> amdgpu_sriov_vf(adev) should return false and the SR-IOV path in vi.c
> should be skipped. Instead, the driver enters that path and reaches the
> BUG_ON that states the chip type is not supported.
>
>        Thanks for coming back!
>
>        José.
>



Thread overview: 8+ messages
2017-12-17 19:20 Topaz mistakenly reported as José Pekkarinen
2017-12-19  7:12 ` Topaz mistakenly reported as vf José Pekkarinen
2017-12-19  7:19   ` Yu, Xiangliang
     [not found]     ` <BY2PR1201MB0935B6DE5790F85F6CEC05FDEB0F0-O28G1zQ8oGkaqtME6NEo1mrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
2017-12-19  7:27       ` José Pekkarinen
2017-12-19  7:44         ` Yu, Xiangliang
     [not found]           ` <BY2PR1201MB0935ECB9DDC4DE571DF65433EB0F0-O28G1zQ8oGkaqtME6NEo1mrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
2017-12-19  7:50             ` José Pekkarinen
2017-12-19  7:56               ` Yu, Xiangliang
2017-12-19 14:20             ` Deucher, Alexander
