linux-arm-kernel.lists.infradead.org archive mirror
* snd-cmipci oops during probe on arm64 (current mainline, pre-6.6-rc1)
@ 2023-09-05 22:01 Antonio Terceiro
  2023-09-06  6:10 ` Takashi Iwai
  0 siblings, 1 reply; 11+ messages in thread
From: Antonio Terceiro @ 2023-09-05 22:01 UTC (permalink / raw)
  To: Jaroslav Kysela, Takashi Iwai, Catalin Marinas, Will Deacon
  Cc: alsa-devel, linux-kernel, linux-arm-kernel



Hi,

I'm using an arm64 workstation, and wanted to add a sound card to it. I bought
one that is pretty popular around where I live, and it is supported by the
snd-cmipci driver.

It's this one:

0005:02:00.0 Multimedia audio controller: C-Media Electronics Inc CMI8738/CMI8768 PCI Audio (rev 10)

After building a mainline kernel (post-v6.5, pre-rc1) on Debian testing arm64
with localmodconfig + CONFIG_SND_CMIPCI=m, it crashes with "Unable to handle
kernel paging request at virtual address fffffbfffe80000c", and the system
never finishes booting. The login manager never shows up and the serial console
never gets to a login prompt. I observed the same issue on a 6.3 Debian kernel,
after rebuilding with CONFIG_SND_CMIPCI=m.

If I stop the module from being automatically loaded by adding
`blacklist snd-cmipci` to /etc/modprobe.d/snd-cmipci.conf (or if I
remove the card from the PCIe slot), I get the system to boot. But trying
to load the module manually causes the same crash (I only tested this
with the card on):

[  +4,501093] snd_cmipci 0005:02:00.0: stream 512 already in tree
[  +0,000155] Unable to handle kernel paging request at virtual address fffffbfffe80000c
[  +0,007927] Mem abort info:
[  +0,002793]   ESR = 0x0000000096000006
[  +0,003743]   EC = 0x25: DABT (current EL), IL = 32 bits
[  +0,005307]   SET = 0, FnV = 0
[  +0,003049]   EA = 0, S1PTW = 0
[  +0,003134]   FSC = 0x06: level 2 translation fault
[  +0,004872] Data abort info:
[  +0,002873]   ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
[  +0,005479]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[  +0,005047]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[  +0,000003] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000080519fe9000
[  +0,000004] [fffffbfffe80000c] pgd=000008051a979003, p4d=000008051a979003, pud=000008051a97a003, pmd=0000000000000000
[  +0,000009] Internal error: Oops: 0000000096000006 [#1] SMP
[  +0,028142] Modules linked in: snd_cmipci(+) snd_mpu401_uart snd_opl3_lib xt_conntrack xt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype nft_compat br_netfilter nft_masq nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bridge stp llc nf_tables nfnetlink uvcvideo videobuf2_vmalloc videobuf2_memops uvc videobuf2_v4l2 videodev videobuf2_common snd_seq_dummy snd_hrtimer snd_seq qrtr rfkill overlay ftdi_sio usbserial snd_usb_audio snd_usbmidi_lib snd_pcm aes_ce_blk aes_ce_cipher snd_hwdep polyval_ce snd_rawmidi polyval_generic snd_seq_device joydev snd_timer ghash_ce hid_generic gf128mul snd usbhid sha2_ce ipmi_ssif soundcore hid mc sha256_arm64 ipmi_devintf arm_spe_pmu ipmi_msghandler sha1_ce sbsa_gwdt binfmt_misc nls_ascii nls_cp437 vfat fat xgene_hwmon cppc_cpufreq arm_cmn arm_dsu_pmu evdev nfsd auth_rpcgss nfs_acl lockd grace dm_mod fuse loop efi_pstore dax sunrpc configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 btrfs efivarfs raid10 raid456 async_raid6_recov async_memcpy
[  +0,000142]  async_pq async_xor async_tx libcrc32c crc32c_generic xor xor_neon raid6_pq raid1 raid0 multipath linear md_mod nvme nvme_core ast t10_pi drm_shmem_helper xhci_pci drm_kms_helper xhci_hcd crc64_rocksoft crc64 drm crc_t10dif usbcore crct10dif_generic igb crct10dif_ce crct10dif_common usb_common i2c_algo_bit i2c_designware_platform i2c_designware_core
[  +0,121670] CPU: 0 PID: 442 Comm: kworker/0:4 Not tainted 6.5.0+ #2
[  +0,006259] Hardware name: ADLINK AVA Developer Platform/AVA Developer Platform, BIOS TianoCore 2.04.100.07 (SYS: 2.06.20220308) 09/08/2022
[  +0,012506] Workqueue: events work_for_cpu_fn
[  +0,004353] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  +0,006953] pc : logic_inl+0xa0/0xd8
[  +0,003570] lr : snd_cmipci_probe+0x7a4/0x1140 [snd_cmipci]
[  +0,005578] sp : ffff80008287bc70
[  +0,003303] x29: ffff80008287bc70 x28: ffff08008af9d6a0 x27: 0000000000000000
[  +0,007128] x26: ffffc4818263c228 x25: 0000000000000000 x24: 0000000000000001
[  +0,007127] x23: ffff07ff81a9e000 x22: ffff07ff81a9e0c0 x21: ffff08008af9d080
[  +0,007127] x20: ffffc4818263c000 x19: 0000000000000000 x18: ffffffffffffffff
[  +0,007127] x17: 0000000000000000 x16: ffffc4819ac3cd38 x15: ffff80008287ba80
[  +0,007127] x14: 0000000000000001 x13: ffff80008287bbc4 x12: 0000000000000000
[  +0,007126] x11: ffff07ff834616d0 x10: ffffffffffffffc0 x9 : ffffc4819a61dd18
[  +0,007127] x8 : 0000000000000228 x7 : 0000000000000001 x6 : 00000000000000ff
[  +0,007127] x5 : ffffc4819adb7998 x4 : 0000000000000000 x3 : 00000000000000ff
[  +0,007127] x2 : 0000000000ffbffe x1 : 000000000000000c x0 : fffffbfffe80000c
[  +0,007126] Call trace:
[  +0,002436]  logic_inl+0xa0/0xd8
[  +0,003221]  local_pci_probe+0x48/0xb8
[  +0,003744]  work_for_cpu_fn+0x24/0x40
[  +0,003741]  process_one_work+0x170/0x3a8
[  +0,004002]  worker_thread+0x23c/0x460
[  +0,003742]  kthread+0xe8/0xf8
[  +0,003047]  ret_from_fork+0x10/0x20
[  +0,003569] Code: d2bfd000 f2df7fe0 f2ffffe0 8b000020 (b9400000) 
[  +0,006083] ---[ end trace 0000000000000000 ]---

Because this sound card chipset seems to be popular (pretty much all PCI cards
I can find to buy locally use that), I'm thinking this might be specific to
arm64, otherwise someone would have seen this before.
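For readers following the oops above, here is a minimal sketch of how the ESR value in the log breaks down into the fields the kernel prints. The function name `decode_data_abort_esr` is made up for illustration; the bit positions follow the Arm architecture's ESR_EL1 layout for data aborts (EC in bits 31:26, IL in bit 25, ISS in bits 24:0, with WnR in bit 6 and the fault status code in bits 5:0).

```python
def decode_data_abort_esr(esr):
    """Split an arm64 ESR value for a data abort into its main fields."""
    ec = (esr >> 26) & 0x3F      # exception class
    il = (esr >> 25) & 0x1       # 1 = 32-bit instruction
    iss = esr & 0x1FFFFFF        # instruction-specific syndrome
    wnr = (iss >> 6) & 0x1       # 1 = write, 0 = read
    dfsc = iss & 0x3F            # data fault status code
    if (dfsc >> 2) == 0b0001:
        # 0b0001LL encodes a translation fault at level LL
        fault = "level %d translation fault" % (dfsc & 0x3)
    else:
        fault = "other (0x%02x)" % dfsc
    return {"ec": ec, "il": il, "iss": iss, "wnr": wnr,
            "dfsc": dfsc, "fault": fault}


# ESR value taken from the oops in this report
info = decode_data_abort_esr(0x96000006)
print(info)
```

With the value from the log this gives EC 0x25 (data abort at the current EL), a read access (WnR = 0), and a level 2 translation fault, which agrees with the `pmd=0000000000000000` entry in the page table walk.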


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* Re: snd-cmipci oops during probe on arm64 (current mainline, pre-6.6-rc1)
  2023-09-05 22:01 snd-cmipci oops during probe on arm64 (current mainline, pre-6.6-rc1) Antonio Terceiro
@ 2023-09-06  6:10 ` Takashi Iwai
  2023-09-06 12:49   ` Robin Murphy
  0 siblings, 1 reply; 11+ messages in thread
From: Takashi Iwai @ 2023-09-06  6:10 UTC (permalink / raw)
  To: Antonio Terceiro
  Cc: Jaroslav Kysela, Takashi Iwai, Catalin Marinas, Will Deacon,
	alsa-devel, linux-kernel, linux-arm-kernel

On Wed, 06 Sep 2023 00:01:01 +0200,
Antonio Terceiro wrote:
> 
> Hi,
> 
> I'm using an arm64 workstation, and wanted to add a sound card to it. I bought
> one that is pretty popular around where I live, and it is supported by the
> snd-cmipci driver.
> 
> It's this one:
> 
> 0005:02:00.0 Multimedia audio controller: C-Media Electronics Inc CMI8738/CMI8768 PCI Audio (rev 10)
> 
> After building a mainline kernel (post-v6.5, pre-rc1) on Debian testing arm64
> with localmodconfig + CONFIG_SND_CMIPCI=m, it crashes with "Unable to handle
> kernel paging request at virtual address fffffbfffe80000c", and the system
> never finishes booting. The login manager never shows up and the serial console
> never gets to a login prompt. I observed the same issue on a 6.3 Debian kernel,
> after rebuilding with CONFIG_SND_CMIPCI=m.
> 
> If I stop the module from being automatically loaded by adding
> `blacklist snd-cmipci` to /etc/modprobe.d/snd-cmipci.conf (or if I
> remove the card from the PCIe slot), I get the system to boot. But trying
> to load the module manually causes the same crash (I only tested this
> with the card on):
> 
> [  +4,501093] snd_cmipci 0005:02:00.0: stream 512 already in tree
> [  +0,000155] Unable to handle kernel paging request at virtual address fffffbfffe80000c
> [  +0,007927] Mem abort info:
> [  +0,002793]   ESR = 0x0000000096000006
> [  +0,003743]   EC = 0x25: DABT (current EL), IL = 32 bits
> [  +0,005307]   SET = 0, FnV = 0
> [  +0,003049]   EA = 0, S1PTW = 0
> [  +0,003134]   FSC = 0x06: level 2 translation fault
> [  +0,004872] Data abort info:
> [  +0,002873]   ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
> [  +0,005479]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> [  +0,005047]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> [  +0,000003] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000080519fe9000
> [  +0,000004] [fffffbfffe80000c] pgd=000008051a979003, p4d=000008051a979003, pud=000008051a97a003, pmd=0000000000000000
> [  +0,000009] Internal error: Oops: 0000000096000006 [#1] SMP
> [  +0,028142] Modules linked in: snd_cmipci(+) snd_mpu401_uart snd_opl3_lib xt_conntrack xt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype nft_compat br_netfilter nft_masq nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bridge stp llc nf_tables nfnetlink uvcvideo videobuf2_vmalloc videobuf2_memops uvc videobuf2_v4l2 videodev videobuf2_common snd_seq_dummy snd_hrtimer snd_seq qrtr rfkill overlay ftdi_sio usbserial snd_usb_audio snd_usbmidi_lib snd_pcm aes_ce_blk aes_ce_cipher snd_hwdep polyval_ce snd_rawmidi polyval_generic snd_seq_device joydev snd_timer ghash_ce hid_generic gf128mul snd usbhid sha2_ce ipmi_ssif soundcore hid mc sha256_arm64 ipmi_devintf arm_spe_pmu ipmi_msghandler sha1_ce sbsa_gwdt binfmt_misc nls_ascii nls_cp437 vfat fat xgene_hwmon cppc_cpufreq arm_cmn arm_dsu_pmu evdev nfsd auth_rpcgss nfs_acl lockd grace dm_mod fuse loop efi_pstore dax sunrpc configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 btrfs efivarfs raid10 raid456 async_raid6_recov async_memcpy
> [  +0,000142]  async_pq async_xor async_tx libcrc32c crc32c_generic xor xor_neon raid6_pq raid1 raid0 multipath linear md_mod nvme nvme_core ast t10_pi drm_shmem_helper xhci_pci drm_kms_helper xhci_hcd crc64_rocksoft crc64 drm crc_t10dif usbcore crct10dif_generic igb crct10dif_ce crct10dif_common usb_common i2c_algo_bit i2c_designware_platform i2c_designware_core
> [  +0,121670] CPU: 0 PID: 442 Comm: kworker/0:4 Not tainted 6.5.0+ #2
> [  +0,006259] Hardware name: ADLINK AVA Developer Platform/AVA Developer Platform, BIOS TianoCore 2.04.100.07 (SYS: 2.06.20220308) 09/08/2022
> [  +0,012506] Workqueue: events work_for_cpu_fn
> [  +0,004353] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [  +0,006953] pc : logic_inl+0xa0/0xd8
> [  +0,003570] lr : snd_cmipci_probe+0x7a4/0x1140 [snd_cmipci]
> [  +0,005578] sp : ffff80008287bc70
> [  +0,003303] x29: ffff80008287bc70 x28: ffff08008af9d6a0 x27: 0000000000000000
> [  +0,007128] x26: ffffc4818263c228 x25: 0000000000000000 x24: 0000000000000001
> [  +0,007127] x23: ffff07ff81a9e000 x22: ffff07ff81a9e0c0 x21: ffff08008af9d080
> [  +0,007127] x20: ffffc4818263c000 x19: 0000000000000000 x18: ffffffffffffffff
> [  +0,007127] x17: 0000000000000000 x16: ffffc4819ac3cd38 x15: ffff80008287ba80
> [  +0,007127] x14: 0000000000000001 x13: ffff80008287bbc4 x12: 0000000000000000
> [  +0,007126] x11: ffff07ff834616d0 x10: ffffffffffffffc0 x9 : ffffc4819a61dd18
> [  +0,007127] x8 : 0000000000000228 x7 : 0000000000000001 x6 : 00000000000000ff
> [  +0,007127] x5 : ffffc4819adb7998 x4 : 0000000000000000 x3 : 00000000000000ff
> [  +0,007127] x2 : 0000000000ffbffe x1 : 000000000000000c x0 : fffffbfffe80000c
> [  +0,007126] Call trace:
> [  +0,002436]  logic_inl+0xa0/0xd8
> [  +0,003221]  local_pci_probe+0x48/0xb8
> [  +0,003744]  work_for_cpu_fn+0x24/0x40
> [  +0,003741]  process_one_work+0x170/0x3a8
> [  +0,004002]  worker_thread+0x23c/0x460
> [  +0,003742]  kthread+0xe8/0xf8
> [  +0,003047]  ret_from_fork+0x10/0x20
> [  +0,003569] Code: d2bfd000 f2df7fe0 f2ffffe0 8b000020 (b9400000) 
> [  +0,006083] ---[ end trace 0000000000000000 ]---
> 
> Because this sound card chipset seems to be popular (pretty much all PCI cards
> I can find to buy locally use that), I'm thinking this might be specific to
> arm64, otherwise someone would have seen this before.

There is only one change in this driver code itself since 6.5 (commit
b6ba0aa46138), and judging from the stack trace, it's unrelated to
your problem. It's more likely a regression in the lower-level code,
e.g. PCI layer or arch/arm64 stuff.

Could you try git bisect?


thanks,

Takashi


* Re: snd-cmipci oops during probe on arm64 (current mainline, pre-6.6-rc1)
  2023-09-06  6:10 ` Takashi Iwai
@ 2023-09-06 12:49   ` Robin Murphy
  2023-09-06 18:36     ` Antonio Terceiro
  0 siblings, 1 reply; 11+ messages in thread
From: Robin Murphy @ 2023-09-06 12:49 UTC (permalink / raw)
  To: Takashi Iwai, Antonio Terceiro
  Cc: Jaroslav Kysela, Takashi Iwai, Catalin Marinas, Will Deacon,
	alsa-devel, linux-kernel, linux-arm-kernel

On 2023-09-06 07:10, Takashi Iwai wrote:
> On Wed, 06 Sep 2023 00:01:01 +0200,
> Antonio Terceiro wrote:
>>
>> Hi,
>>
>> I'm using an arm64 workstation, and wanted to add a sound card to it. I bought
>> one that is pretty popular around where I live, and it is supported by the
>> snd-cmipci driver.
>>
>> It's this one:
>>
>> 0005:02:00.0 Multimedia audio controller: C-Media Electronics Inc CMI8738/CMI8768 PCI Audio (rev 10)
>>
>> After building a mainline kernel (post-v6.5, pre-rc1) on Debian testing arm64
>> with localmodconfig + CONFIG_SND_CMIPCI=m, it crashes with "Unable to handle
>> kernel paging request at virtual address fffffbfffe80000c", and the system
>> never finishes booting. The login manager never shows up and the serial console
>> never gets to a login prompt. I observed the same issue on a 6.3 Debian kernel,
>> after rebuilding with CONFIG_SND_CMIPCI=m.
>>
>> If I stop the module from being automatically loaded by adding
>> `blacklist snd-cmipci` to /etc/modprobe.d/snd-cmipci.conf (or if I
>> remove the card from the PCIe slot), I get the system to boot. But trying
>> to load the module manually causes the same crash (I only tested this
>> with the card on):
>>
>> [  +4,501093] snd_cmipci 0005:02:00.0: stream 512 already in tree
>> [  +0,000155] Unable to handle kernel paging request at virtual address fffffbfffe80000c
>> [  +0,007927] Mem abort info:
>> [  +0,002793]   ESR = 0x0000000096000006
>> [  +0,003743]   EC = 0x25: DABT (current EL), IL = 32 bits
>> [  +0,005307]   SET = 0, FnV = 0
>> [  +0,003049]   EA = 0, S1PTW = 0
>> [  +0,003134]   FSC = 0x06: level 2 translation fault
>> [  +0,004872] Data abort info:
>> [  +0,002873]   ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
>> [  +0,005479]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
>> [  +0,005047]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
>> [  +0,000003] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000080519fe9000
>> [  +0,000004] [fffffbfffe80000c] pgd=000008051a979003, p4d=000008051a979003, pud=000008051a97a003, pmd=0000000000000000
>> [  +0,000009] Internal error: Oops: 0000000096000006 [#1] SMP
>> [  +0,028142] Modules linked in: snd_cmipci(+) snd_mpu401_uart snd_opl3_lib xt_conntrack xt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype nft_compat br_netfilter nft_masq nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bridge stp llc nf_tables nfnetlink uvcvideo videobuf2_vmalloc videobuf2_memops uvc videobuf2_v4l2 videodev videobuf2_common snd_seq_dummy snd_hrtimer snd_seq qrtr rfkill overlay ftdi_sio usbserial snd_usb_audio snd_usbmidi_lib snd_pcm aes_ce_blk aes_ce_cipher snd_hwdep polyval_ce snd_rawmidi polyval_generic snd_seq_device joydev snd_timer ghash_ce hid_generic gf128mul snd usbhid sha2_ce ipmi_ssif soundcore hid mc sha256_arm64 ipmi_devintf arm_spe_pmu ipmi_msghandler sha1_ce sbsa_gwdt binfmt_misc nls_ascii nls_cp437 vfat fat xgene_hwmon cppc_cpufreq arm_cmn arm_dsu_pmu evdev nfsd auth_rpcgss nfs_acl lockd grace dm_mod fuse loop efi_pstore dax sunrpc configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 btrfs efivarfs raid10 raid456 async_raid6_recov async_memcpy
>> [  +0,000142]  async_pq async_xor async_tx libcrc32c crc32c_generic xor xor_neon raid6_pq raid1 raid0 multipath linear md_mod nvme nvme_core ast t10_pi drm_shmem_helper xhci_pci drm_kms_helper xhci_hcd crc64_rocksoft crc64 drm crc_t10dif usbcore crct10dif_generic igb crct10dif_ce crct10dif_common usb_common i2c_algo_bit i2c_designware_platform i2c_designware_core
>> [  +0,121670] CPU: 0 PID: 442 Comm: kworker/0:4 Not tainted 6.5.0+ #2
>> [  +0,006259] Hardware name: ADLINK AVA Developer Platform/AVA Developer Platform, BIOS TianoCore 2.04.100.07 (SYS: 2.06.20220308) 09/08/2022
>> [  +0,012506] Workqueue: events work_for_cpu_fn
>> [  +0,004353] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>> [  +0,006953] pc : logic_inl+0xa0/0xd8
>> [  +0,003570] lr : snd_cmipci_probe+0x7a4/0x1140 [snd_cmipci]
>> [  +0,005578] sp : ffff80008287bc70
>> [  +0,003303] x29: ffff80008287bc70 x28: ffff08008af9d6a0 x27: 0000000000000000
>> [  +0,007128] x26: ffffc4818263c228 x25: 0000000000000000 x24: 0000000000000001
>> [  +0,007127] x23: ffff07ff81a9e000 x22: ffff07ff81a9e0c0 x21: ffff08008af9d080
>> [  +0,007127] x20: ffffc4818263c000 x19: 0000000000000000 x18: ffffffffffffffff
>> [  +0,007127] x17: 0000000000000000 x16: ffffc4819ac3cd38 x15: ffff80008287ba80
>> [  +0,007127] x14: 0000000000000001 x13: ffff80008287bbc4 x12: 0000000000000000
>> [  +0,007126] x11: ffff07ff834616d0 x10: ffffffffffffffc0 x9 : ffffc4819a61dd18
>> [  +0,007127] x8 : 0000000000000228 x7 : 0000000000000001 x6 : 00000000000000ff
>> [  +0,007127] x5 : ffffc4819adb7998 x4 : 0000000000000000 x3 : 00000000000000ff
>> [  +0,007127] x2 : 0000000000ffbffe x1 : 000000000000000c x0 : fffffbfffe80000c
>> [  +0,007126] Call trace:
>> [  +0,002436]  logic_inl+0xa0/0xd8
>> [  +0,003221]  local_pci_probe+0x48/0xb8
>> [  +0,003744]  work_for_cpu_fn+0x24/0x40
>> [  +0,003741]  process_one_work+0x170/0x3a8
>> [  +0,004002]  worker_thread+0x23c/0x460
>> [  +0,003742]  kthread+0xe8/0xf8
>> [  +0,003047]  ret_from_fork+0x10/0x20
>> [  +0,003569] Code: d2bfd000 f2df7fe0 f2ffffe0 8b000020 (b9400000)
>> [  +0,006083] ---[ end trace 0000000000000000 ]---
>>
>> Because this sound card chipset seems to be popular (pretty much all PCI cards
>> I can find to buy locally use that), I'm thinking this might be specific to
>> arm64, otherwise someone would have seen this before.
> 
> There is only one change in this driver code itself since 6.5 (commit
> b6ba0aa46138), and judging from the stack trace, it's unrelated to
> your problem. It's more likely a regression in the lower-level code,
> e.g. PCI layer or arch/arm64 stuff.
> 
> Could you try git bisect?

Hmm, but has this combination of card and machine *ever* actually worked?

It's blowing up trying to access PCI I/O space, which has apparently 
ended up in the indirect access mechanism without that being configured 
correctly. That is definitely an issue down somewhere between the PCI 
layer and the system firmware. Does the system even have an I/O space 
window? Some arm64 machines don't. I guess we might not have got as far 
as probing a driver if the I/O BAR couldn't be assigned at all, but 
either way something's not gone right.
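As a rough illustration of why the faulting address points at PCI I/O space, the sketch below decodes the instruction words from the `Code:` line of the oops. The helper names are made up for this example; the MOVZ/MOVK field layout is the standard A64 encoding. The three leading words build a constant in x0, the fourth adds x1 to it, and the faulting word `b9400000` is a 32-bit load from the result. Reading that constant as the arm64 I/O port base (`PCI_IOBASE`) is an assumption, not something the log states.

```python
def decode_mov_wide(insn):
    """Decode a 64-bit A64 MOVZ/MOVK; return (op, rd, imm16, shift) or None."""
    if (insn >> 31) != 1 or ((insn >> 23) & 0x3F) != 0b100101:
        return None
    op = {0b10: "movz", 0b11: "movk"}.get((insn >> 29) & 0x3)
    if op is None:
        return None
    shift = ((insn >> 21) & 0x3) * 16   # hw field selects the 16-bit lane
    imm16 = (insn >> 5) & 0xFFFF
    rd = insn & 0x1F
    return op, rd, imm16, shift


def build_constant(words):
    """Replay a leading MOVZ/MOVK sequence and return the resulting value."""
    value = 0
    for word in words:
        decoded = decode_mov_wide(word)
        if decoded is None:
            break                        # stop at the first other instruction
        op, _rd, imm16, shift = decoded
        if op == "movz":
            value = imm16 << shift
        else:
            value = (value & ~(0xFFFF << shift)) | (imm16 << shift)
    return value


# Words from the "Code:" line; b9400000 is the faulting load
code = [0xd2bfd000, 0xf2df7fe0, 0xf2ffffe0, 0x8b000020, 0xb9400000]
base = build_constant(code)
port = 0xc                               # x1 in the register dump
print(hex(base), hex(base + port))
```

The constant plus the 0xc offset held in x1 reproduces the faulting address `fffffbfffe80000c` from the log, so the access is a read at offset 0xc from a fixed base that has no page-table mapping behind it.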

Thanks,
Robin.

* Re: snd-cmipci oops during probe on arm64 (current mainline, pre-6.6-rc1)
  2023-09-06 12:49   ` Robin Murphy
@ 2023-09-06 18:36     ` Antonio Terceiro
  2023-09-06 19:03       ` Geraldo Nascimento
                         ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Antonio Terceiro @ 2023-09-06 18:36 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Takashi Iwai, Jaroslav Kysela, Takashi Iwai, Catalin Marinas,
	Will Deacon, alsa-devel, linux-kernel, linux-arm-kernel



On Wed, Sep 06, 2023 at 01:49:16PM +0100, Robin Murphy wrote:
> On 2023-09-06 07:10, Takashi Iwai wrote:
> > On Wed, 06 Sep 2023 00:01:01 +0200,
> > Antonio Terceiro wrote:
> > > 
> > > Hi,
> > > 
> > > I'm using an arm64 workstation, and wanted to add a sound card to it. I bought
> > > one that is pretty popular around where I live, and it is supported by the
> > > snd-cmipci driver.
> > > 
> > > It's this one:
> > > 
> > > 0005:02:00.0 Multimedia audio controller: C-Media Electronics Inc CMI8738/CMI8768 PCI Audio (rev 10)
> > > 
> > > After building a mainline kernel (post-v6.5, pre-rc1) on Debian testing arm64
> > > with localmodconfig + CONFIG_SND_CMIPCI=m, it crashes with "Unable to handle
> > > kernel paging request at virtual address fffffbfffe80000c", and the system
> > > never finishes booting. The login manager never shows up and the serial console
> > > never gets to a login prompt. I observed the same issue on a 6.3 Debian kernel,
> > > after rebuilding with CONFIG_SND_CMIPCI=m.
> > > 
> > > If I stop the module from being automatically loaded by adding
> > > `blacklist snd-cmipci` to /etc/modprobe.d/snd-cmipci.conf (or if I
> > > remove the card from the PCIe slot), I get the system to boot. But trying
> > > to load the module manually causes the same crash (I only tested this
> > > with the card on):
> > > 
> > > [  +4,501093] snd_cmipci 0005:02:00.0: stream 512 already in tree
> > > [  +0,000155] Unable to handle kernel paging request at virtual address fffffbfffe80000c
> > > [  +0,007927] Mem abort info:
> > > [  +0,002793]   ESR = 0x0000000096000006
> > > [  +0,003743]   EC = 0x25: DABT (current EL), IL = 32 bits
> > > [  +0,005307]   SET = 0, FnV = 0
> > > [  +0,003049]   EA = 0, S1PTW = 0
> > > [  +0,003134]   FSC = 0x06: level 2 translation fault
> > > [  +0,004872] Data abort info:
> > > [  +0,002873]   ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
> > > [  +0,005479]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> > > [  +0,005047]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> > > [  +0,000003] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000080519fe9000
> > > [  +0,000004] [fffffbfffe80000c] pgd=000008051a979003, p4d=000008051a979003, pud=000008051a97a003, pmd=0000000000000000
> > > [  +0,000009] Internal error: Oops: 0000000096000006 [#1] SMP
> > > [  +0,028142] Modules linked in: snd_cmipci(+) snd_mpu401_uart snd_opl3_lib xt_conntrack xt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype nft_compat br_netfilter nft_masq nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bridge stp llc nf_tables nfnetlink uvcvideo videobuf2_vmalloc videobuf2_memops uvc videobuf2_v4l2 videodev videobuf2_common snd_seq_dummy snd_hrtimer snd_seq qrtr rfkill overlay ftdi_sio usbserial snd_usb_audio snd_usbmidi_lib snd_pcm aes_ce_blk aes_ce_cipher snd_hwdep polyval_ce snd_rawmidi polyval_generic snd_seq_device joydev snd_timer ghash_ce hid_generic gf128mul snd usbhid sha2_ce ipmi_ssif soundcore hid mc sha256_arm64 ipmi_devintf arm_spe_pmu ipmi_msghandler sha1_ce sbsa_gwdt binfmt_misc nls_ascii nls_cp437 vfat fat xgene_hwmon cppc_cpufreq arm_cmn arm_dsu_pmu evdev nfsd auth_rpcgss nfs_acl lockd grace dm_mod fuse loop efi_pstore dax sunrpc configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 btrfs efivarfs raid10 raid456 async_raid6_recov async_memcpy
> > > [  +0,000142]  async_pq async_xor async_tx libcrc32c crc32c_generic xor xor_neon raid6_pq raid1 raid0 multipath linear md_mod nvme nvme_core ast t10_pi drm_shmem_helper xhci_pci drm_kms_helper xhci_hcd crc64_rocksoft crc64 drm crc_t10dif usbcore crct10dif_generic igb crct10dif_ce crct10dif_common usb_common i2c_algo_bit i2c_designware_platform i2c_designware_core
> > > [  +0,121670] CPU: 0 PID: 442 Comm: kworker/0:4 Not tainted 6.5.0+ #2
> > > [  +0,006259] Hardware name: ADLINK AVA Developer Platform/AVA Developer Platform, BIOS TianoCore 2.04.100.07 (SYS: 2.06.20220308) 09/08/2022
> > > [  +0,012506] Workqueue: events work_for_cpu_fn
> > > [  +0,004353] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > > [  +0,006953] pc : logic_inl+0xa0/0xd8
> > > [  +0,003570] lr : snd_cmipci_probe+0x7a4/0x1140 [snd_cmipci]
> > > [  +0,005578] sp : ffff80008287bc70
> > > [  +0,003303] x29: ffff80008287bc70 x28: ffff08008af9d6a0 x27: 0000000000000000
> > > [  +0,007128] x26: ffffc4818263c228 x25: 0000000000000000 x24: 0000000000000001
> > > [  +0,007127] x23: ffff07ff81a9e000 x22: ffff07ff81a9e0c0 x21: ffff08008af9d080
> > > [  +0,007127] x20: ffffc4818263c000 x19: 0000000000000000 x18: ffffffffffffffff
> > > [  +0,007127] x17: 0000000000000000 x16: ffffc4819ac3cd38 x15: ffff80008287ba80
> > > [  +0,007127] x14: 0000000000000001 x13: ffff80008287bbc4 x12: 0000000000000000
> > > [  +0,007126] x11: ffff07ff834616d0 x10: ffffffffffffffc0 x9 : ffffc4819a61dd18
> > > [  +0,007127] x8 : 0000000000000228 x7 : 0000000000000001 x6 : 00000000000000ff
> > > [  +0,007127] x5 : ffffc4819adb7998 x4 : 0000000000000000 x3 : 00000000000000ff
> > > [  +0,007127] x2 : 0000000000ffbffe x1 : 000000000000000c x0 : fffffbfffe80000c
> > > [  +0,007126] Call trace:
> > > [  +0,002436]  logic_inl+0xa0/0xd8
> > > [  +0,003221]  local_pci_probe+0x48/0xb8
> > > [  +0,003744]  work_for_cpu_fn+0x24/0x40
> > > [  +0,003741]  process_one_work+0x170/0x3a8
> > > [  +0,004002]  worker_thread+0x23c/0x460
> > > [  +0,003742]  kthread+0xe8/0xf8
> > > [  +0,003047]  ret_from_fork+0x10/0x20
> > > [  +0,003569] Code: d2bfd000 f2df7fe0 f2ffffe0 8b000020 (b9400000)
> > > [  +0,006083] ---[ end trace 0000000000000000 ]---
> > > 
> > > Because this sound card chipset seems to be popular (pretty much all PCI cards
> > > I can find to buy locally use that), I'm thinking this might be specific to
> > > arm64, otherwise someone would have seen this before.
> > 
> > There is only one change in this driver code itself since 6.5 (commit
> > > b6ba0aa46138), and judging from the stack trace, it's unrelated to
> > > your problem. It's more likely a regression in the lower-level code,
> > e.g. PCI layer or arch/arm64 stuff.
> > 
> > Could you try git bisect?
> 
> Hmm, but has this combination of card and machine *ever* actually worked?

That could be it. In trying to find a starting point for the bisection,
I tried 6.1.0, 5.15.130, and 5.10.19, and they all fail in exactly the
same way. I didn't go further back.

> It's blowing up trying to access PCI I/O space, which has apparently ended
> up in the indirect access mechanism without that being configured correctly.
> That is definitely an issue down somewhere between the PCI layer and the
> system firmware. Does the system even have an I/O space window? Some arm64
> machines don't. I guess we might not have got as far as probing a driver if
> the I/O BAR couldn't be assigned at all, but either way something's not gone
> right.

I'm pretty sure I saw reports of people using PCI GPUs on this machine,
but I would need to confirm.

What info would I need to gather from the machine in order to figure
this out?
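One possible starting point for the I/O window question, offered only as a sketch: on Linux, `/proc/ioports` lists the I/O port resources the kernel knows about, with child resources indented under their parents, and a host bridge that has an I/O window normally shows up there as a top-level "PCI Bus" entry. The helper name `pci_io_windows` and the sample text below are made up for illustration and are not output from this machine.

```python
def pci_io_windows(ioports_text):
    """Return (start, end, name) for top-level PCI bus entries in /proc/ioports text."""
    windows = []
    for line in ioports_text.splitlines():
        if line.startswith((" ", "\t")):
            continue                     # indented lines are child resources
        rng, _, name = line.partition(" : ")
        if name.startswith("PCI Bus"):
            start, _, end = rng.strip().partition("-")
            windows.append((int(start, 16), int(end, 16), name.strip()))
    return windows


# Hypothetical sample, NOT taken from the machine in this thread
sample = (
    "00000000-0000ffff : PCI Bus 0000:00\n"
    "  00001000-00001fff : PCI Bus 0000:01\n"
    "    00001000-000010ff : 0000:01:00.0\n"
)
print(pci_io_windows(sample))
```

On a real system the input would come from reading `/proc/ioports` as root, since unprivileged reads show the addresses as zeros. An empty result for the PCI domain the card sits in would suggest that no I/O window was set up for it, though that alone does not prove where the fault lies.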


* Re: snd-cmipci oops during probe on arm64 (current mainline, pre-6.6-rc1)
  2023-09-06 18:36     ` Antonio Terceiro
@ 2023-09-06 19:03       ` Geraldo Nascimento
  2023-09-06 20:37         ` Robin Murphy
  2023-09-06 19:52       ` Robin Murphy
  2023-09-07  2:29       ` Geraldo Nascimento
  2 siblings, 1 reply; 11+ messages in thread
From: Geraldo Nascimento @ 2023-09-06 19:03 UTC (permalink / raw)
  To: Antonio Terceiro
  Cc: Robin Murphy, Takashi Iwai, Jaroslav Kysela, Takashi Iwai,
	Catalin Marinas, Will Deacon, alsa-devel, linux-kernel,
	linux-arm-kernel

On Wed, Sep 06, 2023 at 03:36:40PM -0300, Antonio Terceiro wrote:
> On Wed, Sep 06, 2023 at 01:49:16PM +0100, Robin Murphy wrote:
> > On 2023-09-06 07:10, Takashi Iwai wrote:
> > > On Wed, 06 Sep 2023 00:01:01 +0200,
> > > Antonio Terceiro wrote:
> > > > 
> > > > Hi,
> > > > 

Hi Antonio, my 2 cents:

> > > > I'm using an arm64 workstation, and wanted to add a sound card to it. I bought
> > > > one that is pretty popular around where I live, and it is supported by the
> > > > snd-cmipci driver.

Specifically, which arm64 workstation? I'm guessing Compute Module 4 IO
Board + Raspberry Pi CM4? This detail is important because the stack
trace you provided only references generic PCI calls and there's a need
to know exactly which PCIe driver could be failing. Is it pcie-brcmstb?

Thanks,
Geraldo Nascimento



* Re: snd-cmipci oops during probe on arm64 (current mainline, pre-6.6-rc1)
  2023-09-06 18:36     ` Antonio Terceiro
  2023-09-06 19:03       ` Geraldo Nascimento
@ 2023-09-06 19:52       ` Robin Murphy
  2023-09-07  0:41         ` Antonio Terceiro
  2023-09-07  2:29       ` Geraldo Nascimento
  2 siblings, 1 reply; 11+ messages in thread
From: Robin Murphy @ 2023-09-06 19:52 UTC (permalink / raw)
  To: Antonio Terceiro
  Cc: Takashi Iwai, Jaroslav Kysela, Takashi Iwai, Catalin Marinas,
	Will Deacon, alsa-devel, linux-kernel, linux-arm-kernel

On 2023-09-06 19:36, Antonio Terceiro wrote:
> On Wed, Sep 06, 2023 at 01:49:16PM +0100, Robin Murphy wrote:
>> On 2023-09-06 07:10, Takashi Iwai wrote:
>>> On Wed, 06 Sep 2023 00:01:01 +0200,
>>> Antonio Terceiro wrote:
>>>>
>>>> Hi,
>>>>
>>>> I'm using an arm64 workstation, and wanted to add a sound card to it. I bought
>>>> one that is pretty popular around where I live, and it is supported by the
>>>> snd-cmipci driver.
>>>>
>>>> It's this one:
>>>>
>>>> 0005:02:00.0 Multimedia audio controller: C-Media Electronics Inc CMI8738/CMI8768 PCI Audio (rev 10)
>>>>
>>>> After building a mainline kernel (post-v6.5, pre-rc1) on Debian testing arm64
>>>> with localmodconfig + CONFIG_SND_CMIPCI=m, it crashes with "Unable to handle
>>>> kernel paging request at virtual address fffffbfffe80000c", and the system
>>>> never finishes booting. The login manager never shows up and the serial console
>>>> never gets to a login prompt. I observed the same issue on a 6.3 Debian kernel,
>>>> after rebuilding with CONFIG_SND_CMIPCI=m.
>>>>
>>>> If I stop the module from being automatically loaded by adding
>>>> `blacklist snd-cmipci` to /etc/modprobe.d/snd-cmipci.conf (or if I
>>>> remove the card from the PCIe slot), I get the system to boot. But trying
>>>> to load the module manually causes the same crash (I only tested this
>>>> with the card on):
>>>>
>>>> [  +4,501093] snd_cmipci 0005:02:00.0: stream 512 already in tree
>>>> [  +0,000155] Unable to handle kernel paging request at virtual address fffffbfffe80000c
>>>> [  +0,007927] Mem abort info:
>>>> [  +0,002793]   ESR = 0x0000000096000006
>>>> [  +0,003743]   EC = 0x25: DABT (current EL), IL = 32 bits
>>>> [  +0,005307]   SET = 0, FnV = 0
>>>> [  +0,003049]   EA = 0, S1PTW = 0
>>>> [  +0,003134]   FSC = 0x06: level 2 translation fault
>>>> [  +0,004872] Data abort info:
>>>> [  +0,002873]   ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
>>>> [  +0,005479]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
>>>> [  +0,005047]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
>>>> [  +0,000003] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000080519fe9000
>>>> [  +0,000004] [fffffbfffe80000c] pgd=000008051a979003, p4d=000008051a979003, pud=000008051a97a003, pmd=0000000000000000
>>>> [  +0,000009] Internal error: Oops: 0000000096000006 [#1] SMP
>>>> [  +0,028142] Modules linked in: snd_cmipci(+) snd_mpu401_uart snd_opl3_lib xt_conntrack xt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype nft_compat br_netfilter nft_masq nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bridge stp llc nf_tables nfnetlink uvcvideo videobuf2_vmalloc videobuf2_memops uvc videobuf2_v4l2 videodev videobuf2_common snd_seq_dummy snd_hrtimer snd_seq qrtr rfkill overlay ftdi_sio usbserial snd_usb_audio snd_usbmidi_lib snd_pcm aes_ce_blk aes_ce_cipher snd_hwdep polyval_ce snd_rawmidi polyval_generic snd_seq_device joydev snd_timer ghash_ce hid_generic gf128mul snd usbhid sha2_ce ipmi_ssif soundcore hid mc sha256_arm64 ipmi_devintf arm_spe_pmu ipmi_msghandler sha1_ce sbsa_gwdt binfmt_misc nls_ascii nls_cp437 vfat fat xgene_hwmon cppc_cpufreq arm_cmn arm_dsu_pmu evdev nfsd auth_rpcgss nfs_acl lockd grace dm_mod fuse loop efi_pstore dax sunrpc configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 btrfs efivarfs raid10 raid456 async_raid6_recov async_memcpy
>>>> [  +0,000142]  async_pq async_xor async_tx libcrc32c crc32c_generic xor xor_neon raid6_pq raid1 raid0 multipath linear md_mod nvme nvme_core ast t10_pi drm_shmem_helper xhci_pci drm_kms_helper xhci_hcd crc64_rocksoft crc64 drm crc_t10dif usbcore crct10dif_generic igb crct10dif_ce crct10dif_common usb_common i2c_algo_bit i2c_designware_platform i2c_designware_core
>>>> [  +0,121670] CPU: 0 PID: 442 Comm: kworker/0:4 Not tainted 6.5.0+ #2
>>>> [  +0,006259] Hardware name: ADLINK AVA Developer Platform/AVA Developer Platform, BIOS TianoCore 2.04.100.07 (SYS: 2.06.20220308) 09/08/2022
>>>> [  +0,012506] Workqueue: events work_for_cpu_fn
>>>> [  +0,004353] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>>> [  +0,006953] pc : logic_inl+0xa0/0xd8
>>>> [  +0,003570] lr : snd_cmipci_probe+0x7a4/0x1140 [snd_cmipci]
>>>> [  +0,005578] sp : ffff80008287bc70
>>>> [  +0,003303] x29: ffff80008287bc70 x28: ffff08008af9d6a0 x27: 0000000000000000
>>>> [  +0,007128] x26: ffffc4818263c228 x25: 0000000000000000 x24: 0000000000000001
>>>> [  +0,007127] x23: ffff07ff81a9e000 x22: ffff07ff81a9e0c0 x21: ffff08008af9d080
>>>> [  +0,007127] x20: ffffc4818263c000 x19: 0000000000000000 x18: ffffffffffffffff
>>>> [  +0,007127] x17: 0000000000000000 x16: ffffc4819ac3cd38 x15: ffff80008287ba80
>>>> [  +0,007127] x14: 0000000000000001 x13: ffff80008287bbc4 x12: 0000000000000000
>>>> [  +0,007126] x11: ffff07ff834616d0 x10: ffffffffffffffc0 x9 : ffffc4819a61dd18
>>>> [  +0,007127] x8 : 0000000000000228 x7 : 0000000000000001 x6 : 00000000000000ff
>>>> [  +0,007127] x5 : ffffc4819adb7998 x4 : 0000000000000000 x3 : 00000000000000ff
>>>> [  +0,007127] x2 : 0000000000ffbffe x1 : 000000000000000c x0 : fffffbfffe80000c
>>>> [  +0,007126] Call trace:
>>>> [  +0,002436]  logic_inl+0xa0/0xd8
>>>> [  +0,003221]  local_pci_probe+0x48/0xb8
>>>> [  +0,003744]  work_for_cpu_fn+0x24/0x40
>>>> [  +0,003741]  process_one_work+0x170/0x3a8
>>>> [  +0,004002]  worker_thread+0x23c/0x460
>>>> [  +0,003742]  kthread+0xe8/0xf8
>>>> [  +0,003047]  ret_from_fork+0x10/0x20
>>>> [  +0,003569] Code: d2bfd000 f2df7fe0 f2ffffe0 8b000020 (b9400000)
>>>> [  +0,006083] ---[ end trace 0000000000000000 ]---
>>>>
>>>> Because this sound card chipset seems to be popular (pretty much all PCI cards
>>>> I can find to buy locally use that), I'm thinking this might be specific to
>>>> arm64, otherwise someone would have seen this before.
>>>
>>> There is only one change in this driver code itself since 6.5 (commit
>>> b6ba0aa46138), and judging from the stack trace, it's unrelated to
>>> your problem.   It's more likely a regression in the lower level code,
>>> e.g. PCI layer or arch/arm64 stuff.
>>>
>>> Could you try git bisect?
>>
>> Hmm, but has this combination of card and machine *ever* actually worked?
> 
> That could be it. In trying to find a starting point for the bisection,
> I tried 6.1.0, 5.15.130, and 5.10.19, and they all fail in exactly the
> same way. I didn't go further back.
> 
>> It's blowing up trying to access PCI I/O space, which has apparently ended
>> up in the indirect access mechanism without that being configured correctly.
>> That is definitely an issue down somewhere between the PCI layer and the
>> system firmware. Does the system even have an I/O space window? Some arm64
>> machines don't. I guess we might not have got as far as probing a driver if
>> the I/O BAR couldn't be assigned at all, but either way something's not gone
>> right.
> 
> I'm pretty sure I saw reports of people using PCI GPUs on this machine,
> but I would need to confirm.

GPUs and any other PCIe devices will be fine, since they will use memory 
BARs - I/O space is pretty much deprecated in PCIe, and as mentioned 
some systems don't even support it at all. I found a datasheet for 
CMI8738, and they seem to be right at the other end of the scale as 
legacy PCI chips with *only* an I/O BAR (and so I guess your card 
includes a PCIe-PCI bridge as well), so are definitely going to be 
hitting paths that are less well-exercised on arm64 in general.
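
To check which kind of BARs a given device actually exposes, the
per-device sysfs "resource" file can be parsed. What follows is only a
minimal sketch, not something I have run on this machine: it assumes the
usual three hex columns per line (start, end, flags) and the resource
type bits from include/linux/ioport.h. The helper names are made up for
illustration.

```python
# Resource type bits as defined in the kernel's include/linux/ioport.h.
IORESOURCE_IO = 0x00000100
IORESOURCE_MEM = 0x00000200


def classify_resources(text):
    """Return (index, kind, size) for every populated line of a sysfs
    PCI 'resource' file, where each line is 'start end flags' in hex."""
    found = []
    for index, line in enumerate(text.splitlines()):
        fields = line.split()
        if len(fields) != 3:
            continue
        start, end, flags = (int(field, 16) for field in fields)
        if flags == 0:
            continue  # unused resource slot
        if flags & IORESOURCE_IO:
            kind = "io"
        elif flags & IORESOURCE_MEM:
            kind = "mem"
        else:
            kind = "other"
        found.append((index, kind, end - start + 1))
    return found


def io_only(text):
    """True when every populated resource is an I/O port resource."""
    kinds = {kind for _, kind, _ in classify_resources(text)}
    return kinds == {"io"}
```

For the card in the report, the text to feed in would come from
/sys/bus/pci/devices/0005:02:00.0/resource. If io_only() returns True
there, the device has no memory BAR to fall back on.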

> What info would I need to gather from the machine in order to figure
> this out?

The first thing I'd try is rebuilding the kernel with 
CONFIG_INDIRECT_PIO disabled and see what difference that makes. I'm not 
too familiar with that area of the code, so the finer details of how to 
debug broken I/O space beyond that would be more of a linux-pci question.
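
Before and after the rebuild it may be worth confirming what the config
really says. A small sketch, assuming the standard .config syntax
("CONFIG_FOO=y" when set, "# CONFIG_FOO is not set" otherwise);
config_state() is a hypothetical helper name:

```python
import re


def config_state(config_text, option):
    """Return the value assigned to `option` in kernel config text
    (for example 'y' or 'm'), or None when it is unset or absent."""
    pattern = re.compile(r"^%s=(.*)$" % re.escape(option), re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else None
```

On Debian the config of an installed kernel is normally available as
/boot/config-<version>, so config_state(text, "CONFIG_INDIRECT_PIO")
on that file's contents should return None once the option is off.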

Thanks,
Robin.
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* Re: snd-cmipci oops during probe on arm64 (current mainline, pre-6.6-rc1)
  2023-09-06 19:03       ` Geraldo Nascimento
@ 2023-09-06 20:37         ` Robin Murphy
  2023-09-06 21:00           ` Geraldo Nascimento
  0 siblings, 1 reply; 11+ messages in thread
From: Robin Murphy @ 2023-09-06 20:37 UTC (permalink / raw)
  To: Geraldo Nascimento, Antonio Terceiro
  Cc: Takashi Iwai, Jaroslav Kysela, Takashi Iwai, Catalin Marinas,
	Will Deacon, alsa-devel, linux-kernel, linux-arm-kernel

On 2023-09-06 20:03, Geraldo Nascimento wrote:
> On Wed, Sep 06, 2023 at 03:36:40PM -0300, Antonio Terceiro wrote:
>> On Wed, Sep 06, 2023 at 01:49:16PM +0100, Robin Murphy wrote:
>>> On 2023-09-06 07:10, Takashi Iwai wrote:
>>>> On Wed, 06 Sep 2023 00:01:01 +0200,
>>>> Antonio Terceiro wrote:
>>>>>
>>>>> Hi,
>>>>>
> 
> Hi Antonio, my 2 cents:
> 
>>>>> I'm using an arm64 workstation, and wanted to add a sound card to it. I bought
>>>>> one that was pretty popular around where I live, and it is supported by the
>>>>> snd-cmipci driver.
> 
> Specifically, which arm64 workstation? I'm guessing Compute Module 4 IO
> Board + Raspberry Pi CM4? This detail is important because the stack
> trace you provided only references generic PCI calls and there's a need
> to know exactly which PCIe driver could be failing. Is it pcie-brcmstb?

Bit bigger than a Pi... ;)

> [  +0,006259] Hardware name: ADLINK AVA Developer Platform/AVA Developer Platform, BIOS TianoCore 2.04.100.07 (SYS: 2.06.20220308) 09/08/2022

They look like pretty nice boxes - https://www.ipi.wiki/pages/com-hpc-altra


Robin.


* Re: snd-cmipci oops during probe on arm64 (current mainline, pre-6.6-rc1)
  2023-09-06 20:37         ` Robin Murphy
@ 2023-09-06 21:00           ` Geraldo Nascimento
  0 siblings, 0 replies; 11+ messages in thread
From: Geraldo Nascimento @ 2023-09-06 21:00 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Antonio Terceiro, Takashi Iwai, Jaroslav Kysela, Takashi Iwai,
	Catalin Marinas, Will Deacon, alsa-devel, linux-kernel,
	linux-arm-kernel

On Wed, Sep 06, 2023 at 09:37:18PM +0100, Robin Murphy wrote:
> 
> Bit bigger than a Pi... ;)
>

Ohh, that's impressive indeed!

But looking around with Google, it turns out the Ampere Altra PCIe is
definitely quirky, see:

https://lore.kernel.org/linux-acpi/20200806225525.GA706347@bjorn-Precision-5520/T/
https://github.com/Tencent/TencentOS-kernel/commit/f454797b673c06c0eb1b77be20d8a475ad2fbf6f

The first quirk should probably be activated on Antonio's kernel, but the
second one, being a downstream Tencent patch, isn't. Alas, the second
quirk comes with a performance hit, see:

https://gitlab.freedesktop.org/drm/amd/-/issues/2078

Thanks,
Geraldo Nascimento


* Re: snd-cmipci oops during probe on arm64 (current mainline, pre-6.6-rc1)
  2023-09-06 19:52       ` Robin Murphy
@ 2023-09-07  0:41         ` Antonio Terceiro
  2023-09-07 12:22           ` Robin Murphy
  0 siblings, 1 reply; 11+ messages in thread
From: Antonio Terceiro @ 2023-09-07  0:41 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Takashi Iwai, Jaroslav Kysela, Takashi Iwai, Catalin Marinas,
	Will Deacon, alsa-devel, linux-kernel, linux-arm-kernel


On Wed, Sep 06, 2023 at 08:52:40PM +0100, Robin Murphy wrote:
> On 2023-09-06 19:36, Antonio Terceiro wrote:
> > I'm pretty sure I saw reports of people using PCI GPUs on this machine,
> > but I would need to confirm.
> 
> GPUs and any other PCIe devices will be fine, since they will use memory
> BARs - I/O space is pretty much deprecated in PCIe, and as mentioned some
> systems don't even support it at all. I found a datasheet for CMI8738, and
> they seem to be right at the other end of the scale as legacy PCI chips with
> *only* an I/O BAR (and so I guess your card includes a PCIe-PCI bridge as
> well), so are definitely going to be hitting paths that are less
> well-exercised on arm64 in general.

OK, that makes sense. So if I'm able to find a card that is genuinely
PCIe¹, then it should work?

¹ this one has a connector that looks like a PCIe x1, but it's not
  really PCIe as the chipset was designed for legacy PCI?

> > What info would I need to gather from the machine in order to figure
> > this out?
> 
> The first thing I'd try is rebuilding the kernel with CONFIG_INDIRECT_PIO
> disabled and see what difference that makes. I'm not too familiar with that
> area of the code, so the finer details of how to debug broken I/O space
> beyond that would be more of a linux-pci question.

Tried that, didn't help.


* Re: snd-cmipci oops during probe on arm64 (current mainline, pre-6.6-rc1)
  2023-09-06 18:36     ` Antonio Terceiro
  2023-09-06 19:03       ` Geraldo Nascimento
  2023-09-06 19:52       ` Robin Murphy
@ 2023-09-07  2:29       ` Geraldo Nascimento
  2 siblings, 0 replies; 11+ messages in thread
From: Geraldo Nascimento @ 2023-09-07  2:29 UTC (permalink / raw)
  To: Antonio Terceiro
  Cc: Robin Murphy, Takashi Iwai, Jaroslav Kysela, Takashi Iwai,
	Catalin Marinas, Will Deacon, alsa-devel, linux-kernel,
	linux-arm-kernel

On Wed, Sep 06, 2023 at 03:36:40PM -0300, Antonio Terceiro wrote:
> On Wed, Sep 06, 2023 at 01:49:16PM +0100, Robin Murphy wrote:
> > It's blowing up trying to access PCI I/O space, which has apparently ended
> > up in the indirect access mechanism without that being configured correctly.
> > That is definitely an issue down somewhere between the PCI layer and the
> > system firmware. Does the system even have an I/O space window? Some arm64
> > machines don't. I guess we might not have got as far as probing a driver if
> > the I/O BAR couldn't be assigned at all, but either way something's not gone
> > right.
> 
> I'm pretty sure I saw reports of people using PCI GPUs on this machine,
> but I would need to confirm.
> 
> What info would I need to gather from the machine in order to figure
> this out?

Antonio, please see:
https://community.amperecomputing.com/t/amd-gpus-on-the-altra-devkit-and-other-altras-patches-available-now/336/11

You have a quirky PCIe controller it seems. You'll have to go through
the errata and then some.

Good Luck,
Geraldo Nascimento


* Re: snd-cmipci oops during probe on arm64 (current mainline, pre-6.6-rc1)
  2023-09-07  0:41         ` Antonio Terceiro
@ 2023-09-07 12:22           ` Robin Murphy
  0 siblings, 0 replies; 11+ messages in thread
From: Robin Murphy @ 2023-09-07 12:22 UTC (permalink / raw)
  To: Antonio Terceiro
  Cc: Takashi Iwai, Jaroslav Kysela, Takashi Iwai, Catalin Marinas,
	Will Deacon, alsa-devel, linux-kernel, linux-arm-kernel

On 07/09/2023 1:41 am, Antonio Terceiro wrote:
> On Wed, Sep 06, 2023 at 08:52:40PM +0100, Robin Murphy wrote:
>> On 2023-09-06 19:36, Antonio Terceiro wrote:
>>> I'm pretty sure I saw reports of people using PCI GPUs on this machine,
>>> but I would need to confirm.
>>
>> GPUs and any other PCIe devices will be fine, since they will use memory
>> BARs - I/O space is pretty much deprecated in PCIe, and as mentioned some
>> systems don't even support it at all. I found a datasheet for CMI8738, and
>> they seem to be right at the other end of the scale as legacy PCI chips with
>> *only* an I/O BAR (and so I guess your card includes a PCIe-PCI bridge as
>> well), so are definitely going to be hitting paths that are less
>> well-exercised on arm64 in general.
> 
> OK, that makes sense. So if I'm able to find a card that is genuinely
> PCIe¹, then it should work?
> 
> ¹ this one has a connector that looks like a PCIe x1, but it's not
>    really PCIe as the chipset was designed for legacy PCI?

Probably - native PCIe endpoints are still allowed to have I/O 
resources, but they are required to be accessible as equivalent memory 
resources as well, so most PCIe drivers are unlikely to care about I/O 
BARs at all.

>>> What info would I need to gather from the machine in order to figure
>>> this out?
>>
>> The first thing I'd try is rebuilding the kernel with CONFIG_INDIRECT_PIO
>> disabled and see what difference that makes. I'm not too familiar with that
>> area of the code, so the finer details of how to debug broken I/O space
>> beyond that would be more of a linux-pci question.
> 
> Tried that, didn't help.

OK, I managed to have a poke around on a full-fat Altra Mt.Jade system, 
and indeed, at least on this one, the firmware is not describing any I/O 
space windows at all:

[    8.657752] pci_bus 0001:00: root bus resource [bus 00-ff]
[    8.663235] pci_bus 0001:00: root bus resource [mem 0x30000000-0x37ffffff window]
[    8.670715] pci_bus 0001:00: root bus resource [mem 0x380000000000-0x3bffdfffffff window]
[    8.678926] pci 0001:00:00.0: [1def:e100] type 00 class 0x060000

[and so on for all 11(!) PCI segments...]

...which then leads to a lot of failing to configure I/O at the bridges:

[    9.005653] pci 0000:00:01.0: BAR 13: no space for [io  size 0x1000]
[    9.012006] pci 0000:00:01.0: BAR 13: failed to assign [io  size 0x1000]

...but unfortunately what I don't then have is any endpoint with an I/O 
BAR in that machine to see how that plays out. Either way, though, if 
your machine looks the same as this (i.e. does not report any "root bus 
resource [io ... window]" entries and fails to assign any I/O space), 
then there's no way that card can work, and it would seem to indicate a 
bug somewhere between the PCI layer and the driver that it's able to get 
as far as making an access to something it has no means of accessing.
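
For reference, the numbers in the oops from the original report can be
cross-checked by hand. The sketch below decodes the ESR value using the
architectural field layout for a data abort (EC in bits 31:26, IL in
bit 25, WnR in ISS bit 6, fault status code in ISS bits 5:0), and
subtracts x1 from the faulting address in x0. Treating x1 as the port
offset passed to logic_inl() is my assumption from the register dump,
not something the log states.

```python
def decode_esr(esr):
    """Split an arm64 ESR_ELx value for a data abort into its main fields."""
    iss = esr & 0x1FFFFFF          # ISS occupies bits 24:0
    return {
        "EC": (esr >> 26) & 0x3F,  # exception class
        "IL": (esr >> 25) & 0x1,   # 1 means a 32-bit instruction
        "ISS": iss,
        "WnR": (iss >> 6) & 0x1,   # 0 means the access was a read
        "FSC": iss & 0x3F,         # fault status code
    }


fields = decode_esr(0x96000006)    # ESR from the oops
fault_address = 0xFFFFFBFFFE80000C # x0 and the reported faulting address
offset = 0xC                       # x1 (assumed to be the port offset)
base = fault_address - offset

print("EC=0x%02x IL=%d WnR=%d FSC=0x%02x" % (
    fields["EC"], fields["IL"], fields["WnR"], fields["FSC"]))
print("base=0x%016x offset=0x%x" % (base, offset))
```

This gives EC=0x25, IL=1, WnR=0 and FSC=0x06, matching the "Mem abort
info" lines: a read that hit a level 2 translation fault, consistent
with the pmd=0 entry in the page table dump. The access was therefore
a 32-bit read at offset 0xc from a fixed virtual base that has nothing
mapped behind it.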

If on the other hand your firmware is different and *does* claim to have 
I/O windows as well, then something else is going screwy and I don't 
know, sorry.
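
To compare your boot log against the one above, something along these
lines could do the counting. It is a rough sketch that only looks for
the message formats quoted in this mail ("root bus resource [io ...",
"no space for [io ...", "failed to assign [io ..."); summarize_io() is
a hypothetical helper, and other kernel versions may word the messages
differently.

```python
import re

# Patterns based on the log lines quoted in this thread.
IO_WINDOW = re.compile(r"root bus resource \[io\s")
MEM_WINDOW = re.compile(r"root bus resource \[mem\s")
IO_FAILURE = re.compile(r"(?:no space for|failed to assign) \[io\s")


def summarize_io(log_text):
    """Count host bridge windows and I/O assignment failures in dmesg text."""
    summary = {"io_windows": 0, "mem_windows": 0, "io_failures": 0}
    for line in log_text.splitlines():
        if IO_WINDOW.search(line):
            summary["io_windows"] += 1
        elif MEM_WINDOW.search(line):
            summary["mem_windows"] += 1
        elif IO_FAILURE.search(line):
            summary["io_failures"] += 1
    return summary
```

A result with io_windows == 0 and io_failures > 0 on the output of
dmesg would correspond to the first case described above, i.e. the
firmware offers no I/O space for the card to live in.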

Cheers,
Robin.


Thread overview: 11+ messages
2023-09-05 22:01 snd-cmipci oops during probe on arm64 (current mainline, pre-6.6-rc1) Antonio Terceiro
2023-09-06  6:10 ` Takashi Iwai
2023-09-06 12:49   ` Robin Murphy
2023-09-06 18:36     ` Antonio Terceiro
2023-09-06 19:03       ` Geraldo Nascimento
2023-09-06 20:37         ` Robin Murphy
2023-09-06 21:00           ` Geraldo Nascimento
2023-09-06 19:52       ` Robin Murphy
2023-09-07  0:41         ` Antonio Terceiro
2023-09-07 12:22           ` Robin Murphy
2023-09-07  2:29       ` Geraldo Nascimento