* snd-cmipci oops during probe on arm64 (current mainline, pre-6.6-rc1)
From: Antonio Terceiro @ 2023-09-05 22:01 UTC
To: Jaroslav Kysela, Takashi Iwai, Catalin Marinas, Will Deacon
Cc: alsa-devel, linux-kernel, linux-arm-kernel

Hi,

I'm using an arm64 workstation, and wanted to add a sound card to it. I bought
one that was pretty popular around where I live, and it is supported by the
snd-cmipci driver.

It's this one:

0005:02:00.0 Multimedia audio controller: C-Media Electronics Inc CMI8738/CMI8768 PCI Audio (rev 10)

After building a mainline kernel (post-v6.5, pre-6.6-rc1) on Debian testing arm64
with localmodconfig + CONFIG_SND_CMIPCI=m, it crashes with "Unable to handle
kernel paging request at virtual address fffffbfffe80000c", and the system
never finishes booting. The login manager never shows up and the serial console
never gets to a login prompt. I observed the same issue on a 6.3 Debian kernel,
after rebuilding with CONFIG_SND_CMIPCI=m.

If I stop the module from being automatically loaded by adding
`blacklist snd-cmipci` to /etc/modprobe.d/snd-cmipci.conf (or if I
remove the card from the PCIe slot), I get the system to boot. But trying
to load the module manually causes the same crash (I only tested this
with the card installed):

[ +4,501093] snd_cmipci 0005:02:00.0: stream 512 already in tree
[ +0,000155] Unable to handle kernel paging request at virtual address fffffbfffe80000c
[ +0,007927] Mem abort info:
[ +0,002793] ESR = 0x0000000096000006
[ +0,003743] EC = 0x25: DABT (current EL), IL = 32 bits
[ +0,005307] SET = 0, FnV = 0
[ +0,003049] EA = 0, S1PTW = 0
[ +0,003134] FSC = 0x06: level 2 translation fault
[ +0,004872] Data abort info:
[ +0,002873] ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
[ +0,005479] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ +0,005047] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ +0,000003] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000080519fe9000
[ +0,000004] [fffffbfffe80000c] pgd=000008051a979003, p4d=000008051a979003, pud=000008051a97a003, pmd=0000000000000000
[ +0,000009] Internal error: Oops: 0000000096000006 [#1] SMP
[ +0,028142] Modules linked in: snd_cmipci(+) snd_mpu401_uart snd_opl3_lib xt_conntrack xt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype nft_compat br_netfilter nft_masq nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bridge stp llc nf_tables nfnetlink uvcvideo videobuf2_vmalloc videobuf2_memops uvc videobuf2_v4l2 videodev videobuf2_common snd_seq_dummy snd_hrtimer snd_seq qrtr rfkill overlay ftdi_sio usbserial snd_usb_audio snd_usbmidi_lib snd_pcm aes_ce_blk aes_ce_cipher snd_hwdep polyval_ce snd_rawmidi polyval_generic snd_seq_device joydev snd_timer ghash_ce hid_generic gf128mul snd usbhid sha2_ce ipmi_ssif soundcore hid mc sha256_arm64 ipmi_devintf arm_spe_pmu ipmi_msghandler sha1_ce sbsa_gwdt binfmt_misc nls_ascii nls_cp437 vfat fat xgene_hwmon cppc_cpufreq arm_cmn arm_dsu_pmu evdev nfsd auth_rpcgss nfs_acl lockd grace dm_mod fuse loop efi_pstore dax sunrpc configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 btrfs efivarfs raid10 raid456 async_raid6_recov async_memcpy
[ +0,000142] async_pq async_xor async_tx libcrc32c crc32c_generic xor xor_neon raid6_pq raid1 raid0 multipath linear md_mod nvme nvme_core ast t10_pi drm_shmem_helper xhci_pci drm_kms_helper xhci_hcd crc64_rocksoft crc64 drm crc_t10dif usbcore crct10dif_generic igb crct10dif_ce crct10dif_common usb_common i2c_algo_bit i2c_designware_platform i2c_designware_core
[ +0,121670] CPU: 0 PID: 442 Comm: kworker/0:4 Not tainted 6.5.0+ #2
[ +0,006259] Hardware name: ADLINK AVA Developer Platform/AVA Developer Platform, BIOS TianoCore 2.04.100.07 (SYS: 2.06.20220308) 09/08/2022
[ +0,012506] Workqueue: events work_for_cpu_fn
[ +0,004353] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ +0,006953] pc : logic_inl+0xa0/0xd8
[ +0,003570] lr : snd_cmipci_probe+0x7a4/0x1140 [snd_cmipci]
[ +0,005578] sp : ffff80008287bc70
[ +0,003303] x29: ffff80008287bc70 x28: ffff08008af9d6a0 x27: 0000000000000000
[ +0,007128] x26: ffffc4818263c228 x25: 0000000000000000 x24: 0000000000000001
[ +0,007127] x23: ffff07ff81a9e000 x22: ffff07ff81a9e0c0 x21: ffff08008af9d080
[ +0,007127] x20: ffffc4818263c000 x19: 0000000000000000 x18: ffffffffffffffff
[ +0,007127] x17: 0000000000000000 x16: ffffc4819ac3cd38 x15: ffff80008287ba80
[ +0,007127] x14: 0000000000000001 x13: ffff80008287bbc4 x12: 0000000000000000
[ +0,007126] x11: ffff07ff834616d0 x10: ffffffffffffffc0 x9 : ffffc4819a61dd18
[ +0,007127] x8 : 0000000000000228 x7 : 0000000000000001 x6 : 00000000000000ff
[ +0,007127] x5 : ffffc4819adb7998 x4 : 0000000000000000 x3 : 00000000000000ff
[ +0,007127] x2 : 0000000000ffbffe x1 : 000000000000000c x0 : fffffbfffe80000c
[ +0,007126] Call trace:
[ +0,002436] logic_inl+0xa0/0xd8
[ +0,003221] local_pci_probe+0x48/0xb8
[ +0,003744] work_for_cpu_fn+0x24/0x40
[ +0,003741] process_one_work+0x170/0x3a8
[ +0,004002] worker_thread+0x23c/0x460
[ +0,003742] kthread+0xe8/0xf8
[ +0,003047] ret_from_fork+0x10/0x20
[ +0,003569] Code: d2bfd000 f2df7fe0 f2ffffe0 8b000020 (b9400000)
[ +0,006083] ---[ end trace 0000000000000000 ]---

Because this sound card chipset seems to be popular (pretty much all PCI cards
I can find to buy locally use it), I'm thinking this might be specific to
arm64, otherwise someone would have seen this before.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
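For readers checking the abort report in the message above by hand, here is a minimal Python sketch (an editorial illustration, not part of the original message) that splits the ESR value from the log into the fields the kernel prints. The bit positions used (EC in bits 31:26, IL in bit 25, ISS in bits 24:0, and within a data-abort ISS, WnR in bit 6 and the fault status code in bits 5:0) follow the Arm ESR_EL1 layout; the helper name decode_esr is made up for this example.

```python
def decode_esr(esr: int) -> dict:
    """Split an ESR_EL1 value into the fields shown in the oops (hypothetical helper)."""
    iss = esr & 0x1FFFFFF              # instruction specific syndrome, bits 24:0
    return {
        "EC": (esr >> 26) & 0x3F,      # exception class, bits 31:26
        "IL": (esr >> 25) & 0x1,       # 1 means a 32-bit instruction trapped
        "ISS": iss,
        "WnR": (iss >> 6) & 0x1,       # 0 means the faulting access was a read
        "FSC": iss & 0x3F,             # fault status code, bits 5:0
    }

# ESR value copied from the "Mem abort info" section of the log.
fields = decode_esr(0x96000006)
for name, value in fields.items():
    print(f"{name} = {value:#x}")
```

The decoded values agree with the kernel's own annotation in the log: EC 0x25 (data abort at the current EL), WnR 0 (a read), and FSC 0x06 (level 2 translation fault).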
* Re: snd-cmipci oops during probe on arm64 (current mainline, pre-6.6-rc1)
From: Takashi Iwai @ 2023-09-06 6:10 UTC
To: Antonio Terceiro
Cc: Jaroslav Kysela, Takashi Iwai, Catalin Marinas, Will Deacon, alsa-devel, linux-kernel, linux-arm-kernel

On Wed, 06 Sep 2023 00:01:01 +0200, Antonio Terceiro wrote:
> [...]
> Because this sound card chipset seems to be popular (pretty much all PCI cards
> I can find to buy locally use that), I'm thinking this might be specific to
> arm64, otherwise someone would have seen this before.

There is only one change in this driver code itself since 6.5 (commit
b6ba0aa46138), and judging from the stack trace, it's unrelated to
your problem. It's more likely a regression in the lower level code,
e.g. PCI layer or arch/arm64 stuff.

Could you try git bisect?

thanks,

Takashi
* Re: snd-cmipci oops during probe on arm64 (current mainline, pre-6.6-rc1)
From: Robin Murphy @ 2023-09-06 12:49 UTC
To: Takashi Iwai, Antonio Terceiro
Cc: Jaroslav Kysela, Takashi Iwai, Catalin Marinas, Will Deacon, alsa-devel, linux-kernel, linux-arm-kernel

On 2023-09-06 07:10, Takashi Iwai wrote:
> On Wed, 06 Sep 2023 00:01:01 +0200,
> Antonio Terceiro wrote:
>> [...]
>> Because this sound card chipset seems to be popular (pretty much all PCI cards
>> I can find to buy locally use that), I'm thinking this might be specific to
>> arm64, otherwise someone would have seen this before.
>
> There is only one change in this driver code itself since 6.5 (commit
> b6ba0aa46138), and judging from the stack trace, it's unrelated to
> your problem. It's more likely a regression in the lower level code,
> e.g. PCI layer or arch/arm64 stuff.
>
> Could you try git bisect?

Hmm, but has this combination of card and machine *ever* actually worked?

It's blowing up trying to access PCI I/O space, which has apparently
ended up in the indirect access mechanism without that being configured
correctly. That is definitely an issue down somewhere between the PCI
layer and the system firmware. Does the system even have an I/O space
window? Some arm64 machines don't. I guess we might not have got as far
as probing a driver if the I/O BAR couldn't be assigned at all, but
either way something's not gone right.

Thanks,
Robin.
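As a worked check of the register dump in the original report (an editorial sketch, not something posted in the thread): the first three words of the "Code:" line are AArch64 wide-move instructions that build a 64-bit constant in x0, the fourth adds x1 to it, and the bracketed word is the 32-bit load that faulted. The Python below rebuilds that constant from the instruction encodings and confirms that constant plus x1 equals the faulting address. The helper names are hypothetical, and reading the constant as the base of the arm64 PCI I/O window is an assumption rather than something stated in the thread.

```python
# First three words of the "Code:" line, plus values from the register dump.
WIDE_MOVES = [0xD2BFD000, 0xF2DF7FE0, 0xF2FFFFE0]
X1_OFFSET = 0x000000000000000C
FAULT_ADDRESS = 0xFFFFFBFFFE80000C

def wide_move_field(insn: int) -> tuple[int, int]:
    """Return (shift, imm16) of a MOVZ/MOVK instruction word (hypothetical helper)."""
    shift = 16 * ((insn >> 21) & 0x3)   # hw field, bits 22:21, selects the 16-bit lane
    imm16 = (insn >> 5) & 0xFFFF        # imm16 field, bits 20:5
    return shift, imm16

def rebuild_constant(words: list[int]) -> int:
    """Apply each wide move to a register that starts at zero.

    Starting from zero makes the leading MOVZ (which clears the other lanes)
    and the following MOVKs (which keep them) behave the same way here.
    """
    value = 0
    for insn in words:
        shift, imm16 = wide_move_field(insn)
        value = (value & ~(0xFFFF << shift)) | (imm16 << shift)
    return value

base = rebuild_constant(WIDE_MOVES)
print(f"base          = {base:#018x}")
print(f"base + x1     = {base + X1_OFFSET:#018x}")
print(f"matches fault = {base + X1_OFFSET == FAULT_ADDRESS}")
```

The constant comes out as 0xfffffbfffe800000, so the port number passed down to the accessor was just 0xc. A port number that small would be consistent with an I/O BAR that never received a usable address, though the log alone does not prove that.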
* Re: snd-cmipci oops during probe on arm64 (current mainline, pre-6.6-rc1)
From: Antonio Terceiro @ 2023-09-06 18:36 UTC
To: Robin Murphy
Cc: Takashi Iwai, Jaroslav Kysela, Takashi Iwai, Catalin Marinas, Will Deacon, alsa-devel, linux-kernel, linux-arm-kernel

On Wed, Sep 06, 2023 at 01:49:16PM +0100, Robin Murphy wrote:
> On 2023-09-06 07:10, Takashi Iwai wrote:
> > [...]
> > Could you try git bisect?
>
> Hmm, but has this combination of card and machine *ever* actually worked?

That could be it. In trying to find a starting point for the bisection, I
tried 6.1.0, 5.15.130, and 5.10.19, and they all fail in exactly the same
way. I didn't go further back.

> It's blowing up trying to access PCI I/O space, which has apparently ended
> up in the indirect access mechanism without that being configured correctly.
> That is definitely an issue down somewhere between the PCI layer and the
> system firmware. Does the system even have an I/O space window? Some arm64
> machines don't. I guess we might not have got as far as probing a driver if
> the I/O BAR couldn't be assigned at all, but either way something's not gone
> right.

I'm pretty sure I saw reports of people using PCI GPUs on this machine, but I
would need to confirm. What info would I need to gather from the machine in
order to figure this out?
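One possible starting point for that question (an editorial suggestion, not advice given in the thread) is to look at the resources the kernel recorded for the card. The sketch below parses text in the format of a PCI device's sysfs "resource" file, which holds one "start end flags" triple of hex numbers per line, and lists the entries flagged as I/O port space. The flag value 0x100 is IORESOURCE_IO from the kernel's include/linux/ioport.h; the function name and the sample text are invented for illustration and are not output from Antonio's machine.

```python
IORESOURCE_IO = 0x100  # I/O port resource flag, from include/linux/ioport.h

def io_bars(resource_text: str) -> list[tuple[int, int, int]]:
    """Return (line index, start, end) for each I/O port entry (hypothetical helper)."""
    found = []
    for index, line in enumerate(resource_text.splitlines()):
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        start, end, flags = (int(part, 16) for part in parts)
        if flags & IORESOURCE_IO:
            found.append((index, start, end))
    return found

# Invented sample in the sysfs "resource" format, for illustration only.
SAMPLE = """\
0x0000000000001000 0x00000000000010ff 0x0000000000040101
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000280000000000 0x00002800000fffff 0x0000000000040200
"""

for index, start, end in io_bars(SAMPLE):
    print(f"resource {index}: I/O ports {start:#x}-{end:#x}")
```

On the affected system the input would come from /sys/bus/pci/devices/0005:02:00.0/resource (the device address is the one in the original report), and the result could be compared against /proc/ioports to see whether the host bridge exposes an I/O window at all.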
* Re: snd-cmipci oops during probe on arm64 (current mainline, pre-6.6-rc1)
From: Geraldo Nascimento @ 2023-09-06 19:03 UTC
To: Antonio Terceiro
Cc: Robin Murphy, Takashi Iwai, Jaroslav Kysela, Takashi Iwai, Catalin Marinas, Will Deacon, alsa-devel, linux-kernel, linux-arm-kernel

On Wed, Sep 06, 2023 at 03:36:40PM -0300, Antonio Terceiro wrote:
> On Wed, Sep 06, 2023 at 01:49:16PM +0100, Robin Murphy wrote:
> > On 2023-09-06 07:10, Takashi Iwai wrote:
> > > On Wed, 06 Sep 2023 00:01:01 +0200,
> > > Antonio Terceiro wrote:
> > > >
> > > > Hi,
> > > >

Hi Antonio, my 2 cents:

> > > > I'm using an arm64 workstation, and wanted to add a sound card to it. I bought
> > > > one that was pretty popular around where I live, and it is supported by the
> > > > snd-cmipci driver.

Specifically, which arm64 workstation? I'm guessing Compute Module 4 IO
Board + Raspberry Pi CM4? This detail is important because the stack
trace you provided only references generic PCI calls and there's a need
to know exactly which PCIe driver could be failing. Is it pcie-brcmstb?

Thanks,
Geraldo Nascimento
* Re: snd-cmipci oops during probe on arm64 (current mainline, pre-6.6-rc1)
From: Robin Murphy @ 2023-09-06 20:37 UTC
To: Geraldo Nascimento, Antonio Terceiro
Cc: Takashi Iwai, Jaroslav Kysela, Takashi Iwai, Catalin Marinas, Will Deacon, alsa-devel, linux-kernel, linux-arm-kernel

On 2023-09-06 20:03, Geraldo Nascimento wrote:
> On Wed, Sep 06, 2023 at 03:36:40PM -0300, Antonio Terceiro wrote:
>> On Wed, Sep 06, 2023 at 01:49:16PM +0100, Robin Murphy wrote:
>>> On 2023-09-06 07:10, Takashi Iwai wrote:
>>>> On Wed, 06 Sep 2023 00:01:01 +0200,
>>>> Antonio Terceiro wrote:
>>>>>
>>>>> Hi,
>>>>>
>
> Hi Antonio, my 2 cents:
>
>>>>> I'm using an arm64 workstation, and wanted to add a sound card to it. I bought
>>>>> one that was pretty popular around where I live, and it is supported by the
>>>>> snd-cmipci driver.
>
> Specifically, which arm64 workstation? I'm guessing Compute Module 4 IO
> Board + Raspberry Pi CM4? This detail is important because the stack
> trace you provided only references generic PCI calls and there's a need
> to know exactly which PCIe driver could be failing. Is it pcie-brcmstb?

Bit bigger than a Pi... ;)

> [ +0,006259] Hardware name: ADLINK AVA Developer Platform/AVA Developer Platform, BIOS TianoCore 2.04.100.07 (SYS: 2.06.20220308) 09/08/2022

They look like pretty nice boxes - https://www.ipi.wiki/pages/com-hpc-altra

Robin.
* Re: snd-cmipci oops during probe on arm64 (current mainline, pre-6.6-rc1)
From: Geraldo Nascimento @ 2023-09-06 21:00 UTC
To: Robin Murphy
Cc: Antonio Terceiro, Takashi Iwai, Jaroslav Kysela, Takashi Iwai, Catalin Marinas, Will Deacon, alsa-devel, linux-kernel, linux-arm-kernel

On Wed, Sep 06, 2023 at 09:37:18PM +0100, Robin Murphy wrote:
>
> Bit bigger than a Pi... ;)
>

Ohh, that's impressive indeed! But looking around with Google, it turns out
the Ampere Altra PCIe is definitely quirky, see:

https://lore.kernel.org/linux-acpi/20200806225525.GA706347@bjorn-Precision-5520/T/

https://github.com/Tencent/TencentOS-kernel/commit/f454797b673c06c0eb1b77be20d8a475ad2fbf6f

The first quirk should probably be activated on Antonio's kernel, but the
second one, being a downstream Tencent patch, isn't.

Alas, the second quirk comes with a performance hit, see:

https://gitlab.freedesktop.org/drm/amd/-/issues/2078

Thanks,
Geraldo Nascimento
* Re: snd-cmipci oops during probe on arm64 (current mainline, pre-6.6-rc1) 2023-09-06 18:36 ` Antonio Terceiro 2023-09-06 19:03 ` Geraldo Nascimento @ 2023-09-06 19:52 ` Robin Murphy 2023-09-07 0:41 ` Antonio Terceiro 2023-09-07 2:29 ` Geraldo Nascimento 2 siblings, 1 reply; 11+ messages in thread From: Robin Murphy @ 2023-09-06 19:52 UTC (permalink / raw) To: Antonio Terceiro Cc: Takashi Iwai, Jaroslav Kysela, Takashi Iwai, Catalin Marinas, Will Deacon, alsa-devel, linux-kernel, linux-arm-kernel On 2023-09-06 19:36, Antonio Terceiro wrote: > On Wed, Sep 06, 2023 at 01:49:16PM +0100, Robin Murphy wrote: >> On 2023-09-06 07:10, Takashi Iwai wrote: >>> On Wed, 06 Sep 2023 00:01:01 +0200, >>> Antonio Terceiro wrote: >>>> >>>> Hi, >>>> >>>> I'm using an arm64 workstation, and wanted to add a sound card to it. I bought >>>> one who was pretty popular around where I live, and it is supported by the >>>> snd-cmipci driver. >>>> >>>> It's this one: >>>> >>>> 0005:02:00.0 Multimedia audio controller: C-Media Electronics Inc CMI8738/CMI8768 PCI Audio (rev 10) >>>> >>>> After building a mailine kernel (post-v6.5, pre-rc1) on Debian testing arm64 >>>> with localmodconfig + CONFIG_SND_CMIPCI=m, it crashes with "Unable to handle >>>> kernel paging request at virtual address fffffbfffe80000c", and the system >>>> never finishes to boot. The login manager never shows up and the serial console >>>> never gets to a login prompt. I observed the same issue on a 6.3 Debian kernel, >>>> after rebuilding with CONFIG_SND_CMIPCI=m. >>>> >>>> If I stop the module from being automatically loaded by adding >>>> `blacklist snd-cmipci` to /etc/modprobe.d/snd-cmipci.conf (or if I >>>> remove the card from the PCIe slot), I get the system to boot. 
But tring >>>> to load the module manually causes the same crash (I only tested this >>>> with the card on): >>>> >>>> [ +4,501093] snd_cmipci 0005:02:00.0: stream 512 already in tree >>>> [ +0,000155] Unable to handle kernel paging request at virtual address fffffbfffe80000c >>>> [ +0,007927] Mem abort info: >>>> [ +0,002793] ESR = 0x0000000096000006 >>>> [ +0,003743] EC = 0x25: DABT (current EL), IL = 32 bits >>>> [ +0,005307] SET = 0, FnV = 0 >>>> [ +0,003049] EA = 0, S1PTW = 0 >>>> [ +0,003134] FSC = 0x06: level 2 translation fault >>>> [ +0,004872] Data abort info: >>>> [ +0,002873] ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000 >>>> [ +0,005479] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 >>>> [ +0,005047] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 >>>> [ +0,000003] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000080519fe9000 >>>> [ +0,000004] [fffffbfffe80000c] pgd=000008051a979003, p4d=000008051a979003, pud=000008051a97a003, pmd=0000000000000000 >>>> [ +0,000009] Internal error: Oops: 0000000096000006 [#1] SMP >>>> [ +0,028142] Modules linked in: snd_cmipci(+) snd_mpu401_uart snd_opl3_lib xt_conntrack xt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype nft_compat br_netfilter nft_masq nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bridge stp llc nf_tables nfnetlink uvcvideo videobuf2_vmalloc videobuf2_memops uvc videobuf2_v4l2 videodev videobuf2_common snd_seq_dummy snd_hrtimer snd_seq qrtr rfkill overlay ftdi_sio usbserial snd_usb_audio snd_usbmidi_lib snd_pcm aes_ce_blk aes_ce_cipher snd_hwdep polyval_ce snd_rawmidi polyval_generic snd_seq_device joydev snd_timer ghash_ce hid_generic gf128mul snd usbhid sha2_ce ipmi_ssif soundcore hid mc sha256_arm64 ipmi_devintf arm_spe_pmu ipmi_msghandler sha1_ce sbsa_gwdt binfmt_misc nls_ascii nls_cp437 vfat fat xgene_hwmon cppc_cpufreq arm_cmn arm_dsu_pmu evdev nfsd auth_rpcgss nfs_acl lockd grace dm_mod fuse loop efi_pstore dax sunrpc configfs ip_tables x_tables autofs4 ext4 crc16 mbcache 
jbd2 btrfs efivarfs raid10 raid >>> 456 async_raid6_recov async_memcpy >>>> [ +0,000142] async_pq async_xor async_tx libcrc32c crc32c_generic xor xor_neon raid6_pq raid1 raid0 multipath linear md_mod nvme nvme_core ast t10_pi drm_shmem_helper xhci_pci drm_kms_helper xhci_hcd crc64_rocksoft crc64 drm crc_t10dif usbcore crct10dif_generic igb crct10dif_ce crct10dif_common usb_common i2c_algo_bit i2c_designware_platform i2c_designware_core >>>> [ +0,121670] CPU: 0 PID: 442 Comm: kworker/0:4 Not tainted 6.5.0+ #2 >>>> [ +0,006259] Hardware name: ADLINK AVA Developer Platform/AVA Developer Platform, BIOS TianoCore 2.04.100.07 (SYS: 2.06.20220308) 09/08/2022 >>>> [ +0,012506] Workqueue: events work_for_cpu_fn >>>> [ +0,004353] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) >>>> [ +0,006953] pc : logic_inl+0xa0/0xd8 >>>> [ +0,003570] lr : snd_cmipci_probe+0x7a4/0x1140 [snd_cmipci] >>>> [ +0,005578] sp : ffff80008287bc70 >>>> [ +0,003303] x29: ffff80008287bc70 x28: ffff08008af9d6a0 x27: 0000000000000000 >>>> [ +0,007128] x26: ffffc4818263c228 x25: 0000000000000000 x24: 0000000000000001 >>>> [ +0,007127] x23: ffff07ff81a9e000 x22: ffff07ff81a9e0c0 x21: ffff08008af9d080 >>>> [ +0,007127] x20: ffffc4818263c000 x19: 0000000000000000 x18: ffffffffffffffff >>>> [ +0,007127] x17: 0000000000000000 x16: ffffc4819ac3cd38 x15: ffff80008287ba80 >>>> [ +0,007127] x14: 0000000000000001 x13: ffff80008287bbc4 x12: 0000000000000000 >>>> [ +0,007126] x11: ffff07ff834616d0 x10: ffffffffffffffc0 x9 : ffffc4819a61dd18 >>>> [ +0,007127] x8 : 0000000000000228 x7 : 0000000000000001 x6 : 00000000000000ff >>>> [ +0,007127] x5 : ffffc4819adb7998 x4 : 0000000000000000 x3 : 00000000000000ff >>>> [ +0,007127] x2 : 0000000000ffbffe x1 : 000000000000000c x0 : fffffbfffe80000c >>>> [ +0,007126] Call trace: >>>> [ +0,002436] logic_inl+0xa0/0xd8 >>>> [ +0,003221] local_pci_probe+0x48/0xb8 >>>> [ +0,003744] work_for_cpu_fn+0x24/0x40 >>>> [ +0,003741] process_one_work+0x170/0x3a8 >>>> [ 
+0,004002] worker_thread+0x23c/0x460 >>>> [ +0,003742] kthread+0xe8/0xf8 >>>> [ +0,003047] ret_from_fork+0x10/0x20 >>>> [ +0,003569] Code: d2bfd000 f2df7fe0 f2ffffe0 8b000020 (b9400000) >>>> [ +0,006083] ---[ end trace 0000000000000000 ]--- >>>> >>>> Because this sound card chipset seems to be popular (pretty much all PCI cards >>>> I can find to buy locally use that), I'm thinking this might be specific to >>>> arm64, otherwise someone would have seen this before. >>> >>> There is only one change in this driver code itself since 6.5 (commit >>> b6ba0aa46138), and judging from the stack trace, it's unrelated with >>> your problem. It's more likely a regression in the lower level code, >>> e.g. PCI layer or arch/arm64 stuff. >>> >>> Could you try git bisect? >> >> Hmm, but has this combination of card and machine *ever* actually worked? > > That could be it. In trying to find a starting point for the bisection, > I tried 6.1.0, 5.15.130, and 5.10.19, and they all fail in exactly the > same way. I didn't go further back. > >> It's blowing up trying to access PCI I/O space, which has apparently ended >> up in the indirect access mechanism without that being configured correctly. >> That is definitely an issue down somewhere between the PCI layer and the >> system firmware. Does the system even have an I/O space window? Some arm64 >> machines don't. I guess we might not have got as far as probing a driver if >> the I/O BAR couldn't be assigned at all, but either way something's not gone >> right. > > I'm pretty sure I saw reports of people using PCI GPUs on this machine, > but I would need to confirm. GPUs and any other PCIe devices will be fine, since they will use memory BARs - I/O space is pretty much deprecated in PCIe, and as mentioned some systems don't even support it at all. 
I found a datasheet for CMI8738, and
they seem to be right at the other end of the scale as legacy PCI chips
with *only* an I/O BAR (and so I guess your card includes a PCIe-PCI
bridge as well), so are definitely going to be hitting paths that are
less well-exercised on arm64 in general.

> What info would I need to gather from the machine in order to figure
> this out?

The first thing I'd try is rebuilding the kernel with
CONFIG_INDIRECT_PIO disabled and see what difference that makes. I'm not
too familiar with that area of the code, so the finer details of how to
debug broken I/O space beyond that would be more of a linux-pci question.

Thanks,
Robin.
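[Editorial sketch, not part of Robin's message.] The point about the
CMI8738 exposing *only* an I/O BAR can be checked from userspace by
reading the device's sysfs `resource` file, which lists one
"start end flags" triple of hex numbers per resource. The Python below is
a minimal sketch under that assumption; the helper names `classify_bars`
and `only_io_bars` are made up for illustration, and the sample values in
any test are hypothetical, not taken from Antonio's machine. The two flag
constants are the IORESOURCE_IO and IORESOURCE_MEM bits from the kernel's
include/linux/ioport.h.

```python
# Resource type bits, as defined in the kernel's include/linux/ioport.h.
IORESOURCE_IO = 0x00000100
IORESOURCE_MEM = 0x00000200


def classify_bars(resource_text):
    """Classify the populated entries of a PCI sysfs 'resource' file.

    resource_text is the content of /sys/bus/pci/devices/<BDF>/resource.
    Returns a list of (index, kind, start, end) tuples, where kind is
    "io", "mem" or "other". Entries whose flags are zero are unused
    slots and are skipped.
    """
    bars = []
    for index, line in enumerate(resource_text.splitlines()):
        fields = line.split()
        if len(fields) != 3:
            continue  # not a "start end flags" line
        start, end, flags = (int(field, 16) for field in fields)
        if flags == 0:
            continue  # unused resource slot
        if flags & IORESOURCE_IO:
            kind = "io"
        elif flags & IORESOURCE_MEM:
            kind = "mem"
        else:
            kind = "other"
        bars.append((index, kind, start, end))
    return bars


def only_io_bars(resource_text):
    """True if the device has at least one resource and all are I/O."""
    bars = classify_bars(resource_text)
    return bool(bars) and all(kind == "io" for _, kind, _, _ in bars)
```

To use it on a real system, one would read the file for the card in
question (for the device in this thread that would presumably be
/sys/bus/pci/devices/0005:02:00.0/resource) and pass its content to
`only_io_bars`. A True result means the driver has no memory BAR to fall
back on and depends entirely on the host providing a PCI I/O window.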
* Re: snd-cmipci oops during probe on arm64 (current mainline, pre-6.6-rc1)
  2023-09-06 19:52 ` Robin Murphy
@ 2023-09-07  0:41 ` Antonio Terceiro
  2023-09-07 12:22 ` Robin Murphy
  0 siblings, 1 reply; 11+ messages in thread
From: Antonio Terceiro @ 2023-09-07  0:41 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Takashi Iwai, Jaroslav Kysela, Takashi Iwai, Catalin Marinas,
	Will Deacon, alsa-devel, linux-kernel, linux-arm-kernel

On Wed, Sep 06, 2023 at 08:52:40PM +0100, Robin Murphy wrote:
> On 2023-09-06 19:36, Antonio Terceiro wrote:
> > I'm pretty sure I saw reports of people using PCI GPUs on this machine,
> > but I would need to confirm.
>
> GPUs and any other PCIe devices will be fine, since they will use memory
> BARs - I/O space is pretty much deprecated in PCIe, and as mentioned some
> systems don't even support it at all. I found a datasheet for CMI8738, and
> they seem to be right at the other end of the scale as legacy PCI chips with
> *only* an I/O BAR (and so I guess your card includes a PCIe-PCI bridge as
> well), so are definitely going to be hitting paths that are less
> well-exercised on arm64 in general.

OK, that makes sense. So if I'm able to find a card that is genuinely
PCIe¹, then it should work?

¹ this one has a connector that looks like a PCIe x1, but it's not
  really PCIe as the chipset was designed for legacy PCI?

> > What info would I need to gather from the machine in order to figure
> > this out?
>
> The first thing I'd try is rebuilding the kernel with CONFIG_INDIRECT_PIO
> disabled and see what difference that makes. I'm not too familiar with that
> area of the code, so the finer details of how to debug broken I/O space
> beyond that would be more of a linux-pci question.

Tried that, didn't help.
* Re: snd-cmipci oops during probe on arm64 (current mainline, pre-6.6-rc1)
  2023-09-07  0:41 ` Antonio Terceiro
@ 2023-09-07 12:22 ` Robin Murphy
  0 siblings, 0 replies; 11+ messages in thread
From: Robin Murphy @ 2023-09-07 12:22 UTC (permalink / raw)
  To: Antonio Terceiro
  Cc: Takashi Iwai, Jaroslav Kysela, Takashi Iwai, Catalin Marinas,
	Will Deacon, alsa-devel, linux-kernel, linux-arm-kernel

On 07/09/2023 1:41 am, Antonio Terceiro wrote:
> On Wed, Sep 06, 2023 at 08:52:40PM +0100, Robin Murphy wrote:
>> On 2023-09-06 19:36, Antonio Terceiro wrote:
>>> I'm pretty sure I saw reports of people using PCI GPUs on this machine,
>>> but I would need to confirm.
>>
>> GPUs and any other PCIe devices will be fine, since they will use memory
>> BARs - I/O space is pretty much deprecated in PCIe, and as mentioned some
>> systems don't even support it at all. I found a datasheet for CMI8738, and
>> they seem to be right at the other end of the scale as legacy PCI chips with
>> *only* an I/O BAR (and so I guess your card includes a PCIe-PCI bridge as
>> well), so are definitely going to be hitting paths that are less
>> well-exercised on arm64 in general.
>
> OK, that makes sense. So If I'm able to find a card that is genuinely
> PCIe¹, then it should work?
>
> ¹ this one has a connector that looks like a PCIe x1, but it's not
>   really PCIe as the chipset was designed for legacy PCI?

Probably - native PCIe endpoints are still allowed to have I/O
resources, but they are required to be accessible as equivalent memory
resources as well, so most PCIe drivers are unlikely to care about I/O
BARs at all.

>>> What info would I need to gather from the machine in order to figure
>>> this out?
>>
>> The first thing I'd try is rebuilding the kernel with CONFIG_INDIRECT_PIO
>> disabled and see what difference that makes. I'm not too familiar with that
>> area of the code, so the finer details of how to debug broken I/O space
>> beyond that would be more of a linux-pci question.
>
> Tried that, didn't help.

OK, I managed to have a poke around on a full-fat Altra Mt.Jade system,
and indeed, at least on this one, the firmware is not describing any I/O
space windows at all:

[    8.657752] pci_bus 0001:00: root bus resource [bus 00-ff]
[    8.663235] pci_bus 0001:00: root bus resource [mem 0x30000000-0x37ffffff window]
[    8.670715] pci_bus 0001:00: root bus resource [mem 0x380000000000-0x3bffdfffffff window]
[    8.678926] pci 0001:00:00.0: [1def:e100] type 00 class 0x060000

[and so on for all 11(!) PCI segments...]

...which then leads to a lot of failing to configure I/O at the bridges:

[    9.005653] pci 0000:00:01.0: BAR 13: no space for [io size 0x1000]
[    9.012006] pci 0000:00:01.0: BAR 13: failed to assign [io size 0x1000]

...but unfortunately what I don't then have is any endpoint with an I/O
BAR in that machine to see how that plays out. Either way, though, if
your machine looks the same as this (i.e. does not report any "root bus
resource [io ... window]" entries and fails to assign any I/O space),
then there's no way that card can work, and it would seem to indicate a
bug somewhere between the PCI layer and the driver that it's able to get
as far as making an access to something it has no means of accessing.

If on the other hand your firmware is different and *does* claim to have
I/O windows as well, then something else is going screwy and I don't
know, sorry.

Cheers,
Robin.
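[Editorial sketch, not part of Robin's message.] The check Robin
describes (does the boot log show any "root bus resource [io ... window]"
entries, and did any I/O assignment fail?) can be automated over saved
`dmesg` output. The Python below is a minimal sketch that assumes the log
lines have the same shape as the ones quoted in his message; other kernel
versions may word these messages differently, so the two regular
expressions and the function name `summarize_io_space` are assumptions
for illustration only.

```python
import re

# Matches e.g. "pci_bus 0001:00: root bus resource [io  0x0000-0xffff window]"
ROOT_IO_WINDOW = re.compile(r"pci_bus (\S+): root bus resource \[io\s")
# Matches e.g. "pci 0000:00:01.0: BAR 13: failed to assign [io size 0x1000]"
IO_ASSIGN_FAILED = re.compile(r"pci (\S+): BAR \d+: failed to assign \[io\s")


def summarize_io_space(dmesg_text):
    """Scan dmesg text for PCI I/O windows and failed I/O assignments.

    Returns (buses_with_io_window, devices_with_failed_io), both as
    sorted lists of PCI bus or device names.
    """
    buses_with_io = set()
    failed_devices = set()
    for line in dmesg_text.splitlines():
        window = ROOT_IO_WINDOW.search(line)
        if window:
            buses_with_io.add(window.group(1))
        failure = IO_ASSIGN_FAILED.search(line)
        if failure:
            failed_devices.add(failure.group(1))
    return sorted(buses_with_io), sorted(failed_devices)
```

Fed the log excerpt quoted in the message above, this returns an empty
list of root buses with I/O windows and reports 0000:00:01.0 as having a
failed I/O assignment, which is the "no I/O space at all" situation Robin
describes. A non-empty first list would instead point to the other case
he mentions, where the firmware does claim I/O windows.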
* Re: snd-cmipci oops during probe on arm64 (current mainline, pre-6.6-rc1)
  2023-09-06 18:36 ` Antonio Terceiro
  2023-09-06 19:03 ` Geraldo Nascimento
  2023-09-06 19:52 ` Robin Murphy
@ 2023-09-07  2:29 ` Geraldo Nascimento
  2 siblings, 0 replies; 11+ messages in thread
From: Geraldo Nascimento @ 2023-09-07  2:29 UTC (permalink / raw)
  To: Antonio Terceiro
  Cc: Robin Murphy, Takashi Iwai, Jaroslav Kysela, Takashi Iwai,
	Catalin Marinas, Will Deacon, alsa-devel, linux-kernel,
	linux-arm-kernel

On Wed, Sep 06, 2023 at 03:36:40PM -0300, Antonio Terceiro wrote:
> On Wed, Sep 06, 2023 at 01:49:16PM +0100, Robin Murphy wrote:
> > It's blowing up trying to access PCI I/O space, which has apparently ended
> > up in the indirect access mechanism without that being configured correctly.
> > That is definitely an issue down somewhere between the PCI layer and the
> > system firmware. Does the system even have an I/O space window? Some arm64
> > machines don't. I guess we might not have got as far as probing a driver if
> > the I/O BAR couldn't be assigned at all, but either way something's not gone
> > right.
>
> I'm pretty sure I saw reports of people using PCI GPUs on this machine,
> but I would need to confirm.
>
> What info would I need to gather from the machine in order to figure
> this out?

Antonio, please see:

https://community.amperecomputing.com/t/amd-gpus-on-the-altra-devkit-and-other-altras-patches-available-now/336/11

You have a quirky PCIe controller it seems. You'll have to go through
the errata and then some.

Good Luck,
Geraldo Nascimento
end of thread, other threads:[~2023-09-07 12:22 UTC | newest]

Thread overview: 11+ messages
2023-09-05 22:01 snd-cmipci oops during probe on arm64 (current mainline, pre-6.6-rc1) Antonio Terceiro
2023-09-06  6:10 ` Takashi Iwai
2023-09-06 12:49   ` Robin Murphy
2023-09-06 18:36     ` Antonio Terceiro
2023-09-06 19:03       ` Geraldo Nascimento
2023-09-06 20:37         ` Robin Murphy
2023-09-06 21:00           ` Geraldo Nascimento
2023-09-06 19:52       ` Robin Murphy
2023-09-07  0:41         ` Antonio Terceiro
2023-09-07 12:22           ` Robin Murphy
2023-09-07  2:29       ` Geraldo Nascimento