From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <43632d9d-722c-b14f-336a-eac402ef9362@arm.com>
Date: Wed, 6 Sep 2023 20:52:40 +0100
Subject: Re: snd-cmipci oops during probe on arm64 (current mainline, pre-6.6-rc1)
To: Antonio Terceiro
Cc: Takashi Iwai, Jaroslav Kysela, Takashi Iwai, Catalin Marinas, Will Deacon, alsa-devel@alsa-project.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org
References: <877cp3esse.wl-tiwai@suse.de> <4f335dd2-8043-c60e-cf84-c2b01c4fee12@arm.com>
From: Robin Murphy

On 2023-09-06 19:36, Antonio Terceiro wrote:
> On Wed, Sep 06, 2023 at 01:49:16PM +0100, Robin Murphy wrote:
>> On 2023-09-06 07:10, Takashi Iwai wrote:
>>> On Wed, 06 Sep 2023 00:01:01 +0200,
>>> Antonio Terceiro wrote:
>>>>
>>>> Hi,
>>>>
>>>> I'm using an arm64 workstation, and wanted to add a sound card to it. I bought
>>>> one that was pretty popular around where I live, and it is supported by the
>>>> snd-cmipci driver.
>>>>
>>>> It's this one:
>>>>
>>>> 0005:02:00.0 Multimedia audio controller: C-Media Electronics Inc CMI8738/CMI8768 PCI Audio (rev 10)
>>>>
>>>> After building a mainline kernel (post-v6.5, pre-rc1) on Debian testing arm64
>>>> with localmodconfig + CONFIG_SND_CMIPCI=m, it crashes with "Unable to handle
>>>> kernel paging request at virtual address fffffbfffe80000c", and the system
>>>> never finishes booting. The login manager never shows up and the serial console
>>>> never gets to a login prompt. I observed the same issue on a 6.3 Debian kernel,
>>>> after rebuilding with CONFIG_SND_CMIPCI=m.
>>>>
>>>> If I stop the module from being automatically loaded by adding
>>>> `blacklist snd-cmipci` to /etc/modprobe.d/snd-cmipci.conf (or if I
>>>> remove the card from the PCIe slot), I get the system to boot. But trying
>>>> to load the module manually causes the same crash (I only tested this
>>>> with the card on):
>>>>
>>>> [ +4,501093] snd_cmipci 0005:02:00.0: stream 512 already in tree
>>>> [ +0,000155] Unable to handle kernel paging request at virtual address fffffbfffe80000c
>>>> [ +0,007927] Mem abort info:
>>>> [ +0,002793] ESR = 0x0000000096000006
>>>> [ +0,003743] EC = 0x25: DABT (current EL), IL = 32 bits
>>>> [ +0,005307] SET = 0, FnV = 0
>>>> [ +0,003049] EA = 0, S1PTW = 0
>>>> [ +0,003134] FSC = 0x06: level 2 translation fault
>>>> [ +0,004872] Data abort info:
>>>> [ +0,002873] ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
>>>> [ +0,005479] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
>>>> [ +0,005047] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
>>>> [ +0,000003] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000080519fe9000
>>>> [ +0,000004] [fffffbfffe80000c] pgd=000008051a979003, p4d=000008051a979003, pud=000008051a97a003, pmd=0000000000000000
>>>> [ +0,000009] Internal error: Oops: 0000000096000006 [#1] SMP
>>>> [ +0,028142] Modules linked in: snd_cmipci(+) snd_mpu401_uart snd_opl3_lib xt_conntrack xt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype nft_compat br_netfilter nft_masq nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bridge stp llc nf_tables nfnetlink uvcvideo videobuf2_vmalloc videobuf2_memops uvc videobuf2_v4l2 videodev videobuf2_common snd_seq_dummy snd_hrtimer snd_seq qrtr rfkill overlay ftdi_sio usbserial snd_usb_audio snd_usbmidi_lib snd_pcm aes_ce_blk aes_ce_cipher snd_hwdep polyval_ce snd_rawmidi polyval_generic snd_seq_device joydev snd_timer ghash_ce hid_generic gf128mul snd usbhid sha2_ce ipmi_ssif soundcore hid mc sha256_arm64 ipmi_devintf arm_spe_pmu ipmi_msghandler sha1_ce sbsa_gwdt binfmt_misc nls_ascii nls_cp437 vfat fat xgene_hwmon cppc_cpufreq arm_cmn arm_dsu_pmu evdev nfsd auth_rpcgss nfs_acl lockd grace dm_mod fuse loop efi_pstore dax sunrpc configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 btrfs efivarfs raid10 raid456 async_raid6_recov async_memcpy
>>>> [ +0,000142] async_pq async_xor async_tx libcrc32c crc32c_generic xor xor_neon raid6_pq raid1 raid0 multipath linear md_mod nvme nvme_core ast t10_pi drm_shmem_helper xhci_pci drm_kms_helper xhci_hcd crc64_rocksoft crc64 drm crc_t10dif usbcore crct10dif_generic igb crct10dif_ce crct10dif_common usb_common i2c_algo_bit i2c_designware_platform i2c_designware_core
>>>> [ +0,121670] CPU: 0 PID: 442 Comm: kworker/0:4 Not tainted 6.5.0+ #2
>>>> [ +0,006259] Hardware name: ADLINK AVA Developer Platform/AVA Developer Platform, BIOS TianoCore 2.04.100.07 (SYS: 2.06.20220308) 09/08/2022
>>>> [ +0,012506] Workqueue: events work_for_cpu_fn
>>>> [ +0,004353] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>>> [ +0,006953] pc : logic_inl+0xa0/0xd8
>>>> [ +0,003570] lr : snd_cmipci_probe+0x7a4/0x1140 [snd_cmipci]
>>>> [ +0,005578] sp : ffff80008287bc70
>>>> [ +0,003303] x29: ffff80008287bc70 x28: ffff08008af9d6a0 x27: 0000000000000000
>>>> [ +0,007128] x26: ffffc4818263c228 x25: 0000000000000000 x24: 0000000000000001
>>>> [ +0,007127] x23: ffff07ff81a9e000 x22: ffff07ff81a9e0c0 x21: ffff08008af9d080
>>>> [ +0,007127] x20: ffffc4818263c000 x19: 0000000000000000 x18: ffffffffffffffff
>>>> [ +0,007127] x17: 0000000000000000 x16: ffffc4819ac3cd38 x15: ffff80008287ba80
>>>> [ +0,007127] x14: 0000000000000001 x13: ffff80008287bbc4 x12: 0000000000000000
>>>> [ +0,007126] x11: ffff07ff834616d0 x10: ffffffffffffffc0 x9 : ffffc4819a61dd18
>>>> [ +0,007127] x8 : 0000000000000228 x7 : 0000000000000001 x6 : 00000000000000ff
>>>> [ +0,007127] x5 : ffffc4819adb7998 x4 : 0000000000000000 x3 : 00000000000000ff
>>>> [ +0,007127] x2 : 0000000000ffbffe x1 : 000000000000000c x0 : fffffbfffe80000c
>>>> [ +0,007126] Call trace:
>>>> [ +0,002436] logic_inl+0xa0/0xd8
>>>> [ +0,003221] local_pci_probe+0x48/0xb8
>>>> [ +0,003744] work_for_cpu_fn+0x24/0x40
>>>> [ +0,003741] process_one_work+0x170/0x3a8
>>>> [ +0,004002] worker_thread+0x23c/0x460
>>>> [ +0,003742] kthread+0xe8/0xf8
>>>> [ +0,003047] ret_from_fork+0x10/0x20
>>>> [ +0,003569] Code: d2bfd000 f2df7fe0 f2ffffe0 8b000020 (b9400000)
>>>> [ +0,006083] ---[ end trace 0000000000000000 ]---
>>>>
>>>> Because this sound card chipset seems to be popular (pretty much all PCI cards
>>>> I can find to buy locally use it), I'm thinking this might be specific to
>>>> arm64, otherwise someone would have seen this before.
>>>
>>> There is only one change in this driver code itself since 6.5 (commit
>>> b6ba0aa46138), and judging from the stack trace, it's unrelated to
>>> your problem. It's more likely a regression in the lower level code,
>>> e.g. PCI layer or arch/arm64 stuff.
>>>
>>> Could you try git bisect?
>>
>> Hmm, but has this combination of card and machine *ever* actually worked?
>
> That could be it. In trying to find a starting point for the bisection,
> I tried 6.1.0, 5.15.130, and 5.10.19, and they all fail in exactly the
> same way. I didn't go further back.
>
>> It's blowing up trying to access PCI I/O space, which has apparently ended
>> up in the indirect access mechanism without that being configured correctly.
>> That is definitely an issue down somewhere between the PCI layer and the
>> system firmware. Does the system even have an I/O space window? Some arm64
>> machines don't. I guess we might not have got as far as probing a driver if
>> the I/O BAR couldn't be assigned at all, but either way something's not gone
>> right.
>
> I'm pretty sure I saw reports of people using PCI GPUs on this machine,
> but I would need to confirm.

GPUs and any other PCIe devices will be fine, since they will use memory
BARs - I/O space is pretty much deprecated in PCIe, and as mentioned some
systems don't even support it at all. I found a datasheet for CMI8738,
and they seem to be right at the other end of the scale as legacy PCI
chips with *only* an I/O BAR (and so I guess your card includes a
PCIe-PCI bridge as well), so are definitely going to be hitting paths
that are less well-exercised on arm64 in general.

> What info would I need to gather from the machine in order to figure
> this out?

The first thing I'd try is rebuilding the kernel with
CONFIG_INDIRECT_PIO disabled and see what difference that makes. I'm not
too familiar with that area of the code, so the finer details of how to
debug broken I/O space beyond that would be more of a linux-pci question.

Thanks,
Robin.
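
As a rough way to see which kinds of BAR a device exposes, the sketch below parses the text of a PCI device's sysfs `resource` file (for the card in this thread that would presumably be /sys/bus/pci/devices/0005:02:00.0/resource) and labels each populated entry as I/O or memory. The function name `classify_resources` and the `SAMPLE` text are made up for illustration; the sample is not output from the machine discussed here. The two flag values are the IORESOURCE_IO and IORESOURCE_MEM bits from include/linux/ioport.h.

```python
# Resource type bits as defined in the kernel's include/linux/ioport.h.
IORESOURCE_IO = 0x00000100
IORESOURCE_MEM = 0x00000200

# Hypothetical sample in the "start end flags" layout of a sysfs
# `resource` file: one I/O entry followed by an unused slot.
SAMPLE = (
    "0x0000000000001000 0x00000000000010ff 0x0000000000040101\n"
    "0x0000000000000000 0x0000000000000000 0x0000000000000000\n"
)


def classify_resources(text):
    """Return (index, kind, start, end) for every populated entry."""
    entries = []
    for index, line in enumerate(text.splitlines()):
        fields = line.split()
        if len(fields) != 3:
            continue  # skip anything that is not "start end flags"
        start, end, flags = (int(field, 16) for field in fields)
        if flags == 0:
            continue  # all-zero flags mean the slot is unused
        if flags & IORESOURCE_IO:
            kind = "io"
        elif flags & IORESOURCE_MEM:
            kind = "mem"
        else:
            kind = "other"
        entries.append((index, kind, start, end))
    return entries


for index, kind, start, end in classify_resources(SAMPLE):
    print(f"resource {index}: {kind} {start:#x}-{end:#x}")
```

On a live system the same function could be given the contents of the real `resource` file. If the only populated entry is of kind "io", the driver depends entirely on the host bridge providing a working I/O space window.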