From: "José Pekkarinen" <jose.pekkarinen-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
To: Xiangliang.Yu-5C7GfCeVMHo@public.gmane.org
Cc: alexander.deucher-5C7GfCeVMHo@public.gmane.org,
	christian.koenig-5C7GfCeVMHo@public.gmane.org,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
Subject: Re: Topaz mistakenly reported as vf
Date: Tue, 19 Dec 2017 09:12:25 +0200
Message-ID: <2725982.yAJCsOSm2P@localhost>
In-Reply-To: <2405754.qoBN6c2ta5@bee>

On Sunday, 17 December 2017 21:20:49 EET José Pekkarinen wrote:
> 	Hi,
> 
> 	I hit an issue where a Topaz discrete GPU seems to report itself as a
> virtual function when my laptop is running on battery. I received the
> following backtrace:
> 
> Dec 17 11:17:28 bee kernel: [   31.976810] kernel BUG at
> drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c:310!
> Dec 17 11:17:28 bee kernel: [   31.976815] invalid opcode: 0000 [#1] SMP
> Dec 17 11:17:28 bee kernel: [   31.976831] Modules linked in: vfio_pci
> vfio_virqfd udl loop bfq arc4 iwlmvm mac80211 kvmgt vfio_mdev amdgpu(+) mdev
> vfio_iommu_type1 vfio i915 uvcvideo x86_pkg_temp_thermal videobuf2_vmalloc
> videobuf2_memops videobuf2_v4l2 intel_powerclamp videobuf2_core coretemp videodev
> kvm_intel kvm i2c_algo_bit rtsx_pci_sdmmc drm_kms_helper joydev mmc_core
> media mousedev rtsx_pci_ms btusb btrtl btbcm memstick ttm drm wmi_bmof
> hci_uart btintel bluetooth iwlwifi snd_hda_intel snd_hda_codec cfg80211
> irqbypass crc32c_intel
> ghash_clmulni_intel intel_cstate snd_hwdep intel_uncore snd_hda_core psmouse
> intel_rapl_perf rtsx_pci snd_pcm efi_pstore evdev ideapad_laptop ac
> input_leds serio_raw efivars sparse_keymap intel_lpss_acpi battery thermal
> ecdh_generic wmi fan
> syscopyarea snd_timer sysfillrect snd rfkill intel_lpss
> Dec 17 11:17:28 bee kernel: [   31.977023]  video sysimgblt tpm_crb
> soundcore button mfd_core i2c_hid i2c_i801 fb_sys_fops backlight acpi_pad
> efivarfs unix dm_zero dm_thin_pool dm_persistent_data dm_bio_prison
> dm_service_time dm_round_robin dm_queue_length dm_multipath dm_log_userspace
> cn dm_flakey dm_delay xts
> aesni_intel crypto_simd cryptd glue_helper aes_x86_64 cbc sha256_generic
> scsi_transport_iscsi r8169 mii fuse xfs nfs lockd grace sunrpc fscache ext4
> mbcache jbd2
>  multipath linear raid10 raid1 raid0 dm_raid raid456 md_mod
> async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq
> libcrc32c dm_snapshot dm_bufio dm_crypt dm_mirror dm_region_hash dm_log
> dm_mod dax hid_generic usbhid xhci_pci xhci_hcd ohci_hcd uhci_hcd usb_storage
> ehci_pci ehci_hcd usbcore
> usb_common scsi_transport_fc sr_mod cdrom sg sd_mod ata_piix
> Dec 17 11:17:28 bee kernel: [   31.977223]  ahci libahci sata_sx4 pata_oldpiix
> Dec 17 11:17:28 bee kernel: [   31.977239] CPU: 0 PID: 3698 Comm: udevd Not
> tainted 4.14.5 #10
> Dec 17 11:17:28 bee kernel: [   31.977255] Hardware name: LENOVO 80UV/Lenovo
> ideapad 510S-14IKB, BIOS 2SCN21WW(V2.01) 12/20/2016
> Dec 17 11:17:28 bee kernel: [   31.977278] task: ffff880358b54280
> task.stack: ffffc900014dc000
> Dec 17 11:17:28 bee kernel: [   31.977323] RIP:
> 0010:xgpu_vi_init_golden_registers+0x56/0xa0 [amdgpu]
> Dec 17 11:17:28 bee kernel: [   31.977341] RSP: 0018:ffffc900014dfa08
> EFLAGS: 00010293
> Dec 17 11:17:28 bee kernel: [   31.977356] RAX: 000000000000000a RBX:
> ffff880340040000 RCX: 0000000000000000
> Dec 17 11:17:28 bee kernel: [   31.977375] RDX: ffff880358b54280 RSI:
> 0000000000000100 RDI: ffff880340040000
> Dec 17 11:17:28 bee kernel: [   31.977394] RBP: ffffc900014dfa10 R08:
> ffff88033c6dd198 R09: 0000000000000000
> Dec 17 11:17:28 bee kernel: [   31.977413] R10: ffff880352c0aaa0 R11:
> 0000000000000008 R12: ffff880340040458
> Dec 17 11:17:28 bee kernel: [   31.977432] R13: 0000000000000000 R14:
> 0000000000000000 R15: ffff880340040000
> Dec 17 11:17:28 bee kernel: [   31.977452] FS:  00007fbfdd8c0780(0000)
> GS:ffff88046ec00000(0000) knlGS:0000000000000000
> Dec 17 11:17:28 bee kernel: [   31.977474] CS:  0010 DS: 0000 ES: 0000 CR0:
> 0000000080050033
> Dec 17 11:17:28 bee kernel: [   31.977490] CR2: 000055c3b48c1408 CR3:
> 0000000358307003 CR4: 00000000003606f0
> Dec 17 11:17:28 bee kernel: [   31.977527] Call Trace:
> Dec 17 11:17:28 bee kernel: [   31.977555]  vi_common_hw_init+0x77/0xe0 [amdgpu]
> Dec 17 11:17:28 bee kernel: [   31.977584]  amdgpu_device_init+0xc4b/0x14b0 [amdgpu]
> Dec 17 11:17:28 bee kernel: [   31.977601]  ? kmem_cache_alloc_trace+0x208/0x250
> Dec 17 11:17:28 bee kernel: [   31.977629]  ? amdgpu_driver_load_kms+0x2a/0x1b0 [amdgpu]
> Dec 17 11:17:28 bee kernel: [   31.977658]  amdgpu_driver_load_kms+0x4f/0x1b0 [amdgpu]
> Dec 17 11:17:28 bee kernel: [   31.977682]  drm_dev_register+0x146/0x1d0 [drm]
> Dec 17 11:17:28 bee kernel: [   31.977710]  amdgpu_pci_probe+0x118/0x140 [amdgpu]
> Dec 17 11:17:28 bee kernel: [   31.977725]  pci_device_probe+0xcf/0x150
> Dec 17 11:17:28 bee kernel: [   31.977739]  driver_probe_device+0x29c/0x450
> Dec 17 11:17:28 bee kernel: [   31.977753]  __driver_attach+0xdf/0xf0
> Dec 17 11:17:28 bee kernel: [   31.978775]  ? driver_probe_device+0x450/0x450
> Dec 17 11:17:28 bee kernel: [   31.979815]  bus_for_each_dev+0x60/0xa0
> Dec 17 11:17:28 bee kernel: [   31.980882]  driver_attach+0x1e/0x20
> Dec 17 11:17:28 bee kernel: [   31.981931]  bus_add_driver+0x170/0x260
> Dec 17 11:17:28 bee kernel: [   31.982977]  driver_register+0x60/0xe0
> Dec 17 11:17:28 bee kernel: [   31.984033]  __pci_register_driver+0x5a/0x60
> Dec 17 11:17:28 bee kernel: [   31.985089]  amdgpu_init+0x88/0x9b [amdgpu]
> Dec 17 11:17:28 bee kernel: [   31.986146]  ? 0xffffffffa0c51000
> Dec 17 11:17:28 bee kernel: [   31.987192]  do_one_initcall+0x52/0x190
> Dec 17 11:17:28 bee kernel: [   31.988229]  ? kmem_cache_alloc_trace+0x208/0x250
> Dec 17 11:17:28 bee kernel: [   31.989270]  ? do_init_module+0x27/0x202
> Dec 17 11:17:28 bee kernel: [   31.990308]  ? do_init_module+0x27/0x202
> Dec 17 11:17:28 bee kernel: [   31.991383]  do_init_module+0x5f/0x202
> Dec 17 11:17:28 bee kernel: [   31.992396]  load_module+0x1511/0x1740
> Dec 17 11:17:28 bee kernel: [   31.993433]  SyS_finit_module+0xc1/0x100
> Dec 17 11:17:28 bee kernel: [   31.994478]  ? SyS_finit_module+0xc1/0x100
> Dec 17 11:17:28 bee kernel: [   31.995505]  do_syscall_64+0x66/0x1a0
> Dec 17 11:17:28 bee kernel: [   31.996556]  entry_SYSCALL64_slow_path+0x25/0x25
> Dec 17 11:17:28 bee kernel: [   31.997616] RIP: 0033:0x7fbfdcfd68f9
> Dec 17 11:17:28 bee kernel: [   31.998643] RSP: 002b:00007ffd31e4f848
> EFLAGS: 00000246 ORIG_RAX: 0000000000000139
> Dec 17 11:17:28 bee kernel: [   31.999659] RAX: ffffffffffffffda RBX:
> 000055e4a76c8430 RCX: 00007fbfdcfd68f9
> Dec 17 11:17:28 bee kernel: [   32.000689] RDX: 0000000000000000 RSI:
> 00007fbfdd2a4565 RDI: 000000000000000e
> Dec 17 11:17:28 bee kernel: [   32.001736] RBP: 00007fbfdd2a4565 R08:
> 0000000000000000 R09: 00007ffd31e4f9c0
> Dec 17 11:17:28 bee kernel: [   32.002813] R10: 000000000000000e R11:
> 0000000000000246 R12: 0000000000000000
> Dec 17 11:17:28 bee kernel: [   32.003862] R13: 000055e4a76d6710 R14:
> 0000000000020000 R15: 000055e4a741b8e9
> Dec 17 11:17:28 bee kernel: [   32.004906] Code: 48 89 df ba 4b 00 00 00 48
> c7 c6 60 62 13 a1 e8 11 b7 fc ff 48 89 df ba 1e 00 00 00 48 c7 c6 e0 61 13
> a1 e8 fd b6 fc ff 5b 5d c3 <0f> 0b ba 05 01 00 00 48 c7 c6 c0 5d 13 a1 e8
> e7 b6 fc ff 48 89
> Dec 17 11:17:28 bee kernel: [   32.006061] RIP:
> xgpu_vi_init_golden_registers+0x56/0xa0 [amdgpu] RSP: ffffc900014dfa08
> Dec 17 11:17:28 bee kernel: [   32.007226] ---[ end trace eb52a49a747a04be ]---
> 
> 	This means we hit the following BUG_ON in
> xgpu_vi_init_golden_registers:
> 
> 	BUG_ON("Doesn't support chip type.\n");
> 
> 	It is reached through this path in vi_init_golden_registers:
> 
>         if (amdgpu_sriov_vf(adev)) {
>                 xgpu_vi_init_golden_registers(adev);
>                 mutex_unlock(&adev->grbm_idx_mutex);
>                 return;
>         }
> 
> 	The system is using the following kernel and CPU:
> 
> $ uname -a
> Linux bee 4.14.5 #10 SMP Wed Dec 13 12:07:06 EET 2017 x86_64 Intel(R)
> Core(TM) i7-7500U CPU @ 2.70GHz GenuineIntel GNU/Linux
> 
> 	And the graphics card is the following:
> 
> # lspci -vvvs 01:00.0
> 01:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Topaz XT
> [Radeon R7 M260/M265 / M340/M360 / M440/M445] (rev 81)
>         Subsystem: Lenovo Topaz XT [Radeon R7 M260/M265 / M340/M360 / M440/
> M445]
>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> Stepping- SERR- FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR- INTx-
>         Latency: 0, Cache Line Size: 64 bytes
>         Interrupt: pin A routed to IRQ 128
>         Region 0: Memory at a0000000 (64-bit, prefetchable) [size=256M]
>         Region 2: Memory at b0000000 (64-bit, prefetchable) [size=2M]
>         Region 4: I/O ports at 4000 [size=256]
>         Region 5: Memory at b2300000 (32-bit, non-prefetchable) [size=256K]
>         Expansion ROM at b2340000 [disabled] [size=128K]
>         Capabilities: [48] Vendor Specific Information: Len=08 <?>
>         Capabilities: [50] Power Management version 3
>                 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA
> PME(D0-,D1-,D2-,D3hot-,D3cold-)
>                 Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
>         Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
>                 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us,
> L1 unlimited
>                         ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>                 DevCtl: Report errors: Correctable- Non-Fatal- Fatal-
> Unsupported-
>                         RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
>                         MaxPayload 256 bytes, MaxReadReq 512 bytes
>                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr-
> TransPend-
>                 LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM L0s L1, Exit
> Latency L0s <64ns, L1 <1us
>                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
>                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
>                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>                 LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+
> DLActive- BWMgmt- ABWMgmt-
>                 DevCap2: Completion Timeout: Not Supported, TimeoutDis-,
> LTR-, OBFF Not Supported
>                 DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-,
> LTR-, OBFF Disabled
>                 LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance-
> SpeedDis- Transmit Margin: Normal Operating Range,
> EnterModifiedCompliance- ComplianceSOS-
>                          Compliance De-emphasis: -6dB
>                 LnkSta2: Current De-emphasis Level: -3.5dB,
> EqualizationComplete+, EqualizationPhase1+
>                          EqualizationPhase2+, EqualizationPhase3+,
> LinkEqualizationRequest-
>         Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
>                 Address: 00000000fee00338  Data: 0000
>         Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1
> Len=010 <?>
>         Capabilities: [150 v2] Advanced Error Reporting
>                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
> RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
> RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                 UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt-
> RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
> NonFatalErr-
>                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
> NonFatalErr+
>                 AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+
> ChkEn-
>         Capabilities: [270 v1] #19
>         Capabilities: [2b0 v1] Address Translation Service (ATS)
>                 ATSCap: Invalidate Queue Depth: 00
>                 ATSCtl: Enable-, Smallest Translation Unit: 00
>         Capabilities: [2c0 v1] Page Request Interface (PRI)
>                 PRICtl: Enable- Reset-
>                 PRISta: RF- UPRGI- Stopped+
>                 Page Request Capacity: 00000020, Page Request Allocation:
> 00000000
>         Capabilities: [2d0 v1] Process Address Space ID (PASID)
>                 PASIDCap: Exec+ Priv+, Max PASID Width: 10
>                 PASIDCtl: Enable- Exec- Priv-
>         Kernel driver in use: amdgpu
>         Kernel modules: amdgpu
> 
> 	The funny thing is that the machine boots properly when it is not running
> on battery, so this looks like a problem either in the firmware or in the
> way ACPI interacts with the driver.
> 
> 	Any help or ideas are appreciated.
> 
> 	José.

	Adding Alex and Christian.
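
	To make the path quoted above easier to follow, here is a minimal
userspace sketch of the decision it describes. It is not the driver code:
every name in it (chip_model, device_model, MODEL_CAPS_IS_VF, init_path_for
and so on) is hypothetical, the flag value is made up, and the set of chips
the VF branch accepts (Tonga and Fiji) is an assumption for illustration.
The only things taken from the report are the shape of the amdgpu_sriov_vf()
check and the fact that the VF init function ends in a BUG_ON for chips it
does not handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for illustration; not the kernel's definitions. */
enum chip_model { MODEL_TOPAZ, MODEL_TONGA, MODEL_FIJI };

/* Hypothetical flag value standing in for the driver's "is VF" bit. */
#define MODEL_CAPS_IS_VF (1u << 2)

enum init_path {
	PATH_BARE_METAL,     /* ordinary golden-register setup */
	PATH_VF_OK,          /* VF setup for a chip the VF code handles */
	PATH_VF_UNSUPPORTED  /* where the real driver would hit BUG_ON */
};

struct device_model {
	enum chip_model asic;
	unsigned int caps;
};

/* Models the amdgpu_sriov_vf() check: true when the VF bit is set. */
static bool model_is_vf(const struct device_model *dev)
{
	assert(dev != NULL);
	return (dev->caps & MODEL_CAPS_IS_VF) != 0;
}

/*
 * Models the VF init function. Assumption: only Tonga and Fiji are
 * handled; anything else lands in the default case.
 */
static enum init_path model_vf_init(const struct device_model *dev)
{
	assert(dev != NULL);
	switch (dev->asic) {
	case MODEL_TONGA:
	case MODEL_FIJI:
		return PATH_VF_OK;
	default:
		return PATH_VF_UNSUPPORTED;
	}
}

/* Models the quoted branch: VF devices take the VF path and return early. */
static enum init_path model_init_golden_registers(const struct device_model *dev)
{
	if (model_is_vf(dev))
		return model_vf_init(dev);
	return PATH_BARE_METAL;
}

/* Convenience wrapper so callers can pass plain values. */
static enum init_path init_path_for(enum chip_model asic, unsigned int caps)
{
	struct device_model dev = { .asic = asic, .caps = caps };

	return model_init_golden_registers(&dev);
}
```

	In this model, init_path_for(MODEL_TOPAZ, 0) takes the bare-metal path,
while init_path_for(MODEL_TOPAZ, MODEL_CAPS_IS_VF) ends in
PATH_VF_UNSUPPORTED. So a Topaz only reaches the BUG_ON if the VF flag is
set for it, which is why the VF detection looks like the thing to check.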

	Best regards.

	José
