Date: Fri, 3 Feb 2023 17:28:42 +0100
From: Luca Ceresoli
To: Georgi Djakov
Cc: linux-arm-kernel@lists.infradead.org, linux-imx@nxp.com, linux-pm@vger.kernel.org, Marek Vasut
Subject: Re: i.MX8 NULL pointer dereference on interconnect instantiation
Message-ID: <20230203172842.34862b90@booty>
References: <20230202175525.3dba79a7@booty>
Organization: Bootlin

Hi Georgi,

On Fri, 3 Feb 2023 09:49:16 +0200
Georgi Djakov wrote:

> Hi Luca,
>
> On 2.02.23 18:55, Luca Ceresoli wrote:
> > Hello,
> >
> > I just met an oops on i.MX8MP that appears sporadically but quite often
> > with my current config (~20%). It seems related to the concurrency of
> > instantiation between an interconnect and peripherals using it.
> >
> > I haven't found any existing similar report.
> >
> > Kernel: v6.2-rc5-20-g7bf70dbb1882 + the audio patches at
> > https://lore.kernel.org/all/20220625013235.710346-1-marex@denx.de/
> > HW: Avnet MSC SM2-MB-EP1 Carrier Board
> >
> > A log of the relevant section follows. Lines starting with ">>>" were
> > added by me and show the relevant code lines being executed and some
> > variable values.
> >
> > ------------------------------8<------------------------------
> >
> > [ 15.170236] at24 0-0050: supply vcc not found, using dummy regulator
> > [ 15.181143] at24 0-0050: 8192 byte 24c64 EEPROM, writable, 32 bytes/write
> > [ 15.272681] >>> of_icc_get_from_provider:383 START, spec: np
> > [ 15.281519] >>> of_icc_get_from_provider:405 RETURN -EPROBE_DEFER
> > [ 15.296345] >>> of_icc_get_from_provider:383 START, spec: np
> > [ 15.305136] >>> of_icc_get_from_provider:405 RETURN -EPROBE_DEFER
> > [ 15.317576] >>> of_icc_get_from_provider:383 START, spec: np
> > [ 15.326715] >>> of_icc_get_from_provider:405 RETURN -EPROBE_DEFER
> > [ 15.338297] input: 30370000.snvs:snvs-powerkey as /devices/platform/soc@0/30000000.bus/30370000.snvs/30370000.snvs:snvs-powerkey/input/input0
> > [ 15.359831] >>> of_icc_get_from_provider:383 START, spec: np
> > [ 15.368372] >>> of_icc_get_from_provider:405 RETURN -EPROBE_DEFER
> > [ 15.381942] >>> of_icc_get_from_provider:383 START, spec: np
> > [ 15.383139] imx-bus-devfreq 32700000.interconnect: interconnect provider added to topology
> > [ 15.387956] snvs_rtc 30370000.snvs:snvs-rtc-lp: registered as rtc1
> > [ 15.390482] >>> of_icc_xlate_onecell:352 START
> > [ 15.401380] >>> of_icc_xlate_onecell:359 RETURN icc_data->nodes[37] = 0000000000000000
> > [ 15.409421] >>> of_icc_get_from_provider:416 RETURN data->node 0000000000000000
> > [ 15.416865] >>> of_icc_get_from_provider:383 START, spec: np
> > [ 15.425391] >>> of_icc_xlate_onecell:352 START
> > [ 15.429996] >>> of_icc_xlate_onecell:359 RETURN icc_data->nodes[36] = ffff000005fe9e00
> > [ 15.434640] i.mx8mm_thermal 30260000.tmu: No OCOTP nvmem reference found, SoC-specific calibration not loaded. Please update your DT.
> > [ 15.438012] >>> of_icc_get_from_provider:416 RETURN data->node ffff000005fe9e00
> > [ 15.457502] >>> path_find:197 src 0000000000000000
> > [ 15.462430] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> > [ 15.471339] Mem abort info:
> > [ 15.473249] imx-cpufreq-dt imx-cpufreq-dt: cpu speed grade 7 mkt segment 2 supported-hw 0x80 0x4
> > [ 15.474253] ESR = 0x0000000096000004
> > [ 15.486891] EC = 0x25: DABT (current EL), IL = 32 bits
> > [ 15.492315] SET = 0, FnV = 0
> > [ 15.495407] EA = 0, S1PTW = 0
> > [ 15.498704] FSC = 0x04: level 0 translation fault
> > [ 15.503725] Data abort info:
> > [ 15.506646] ISV = 0, ISS = 0x00000004
> > [ 15.510728] CM = 0, WnR = 0
> > [ 15.513796] user pgtable: 4k pages, 48-bit VAs, pgdp=000000004611a000
> > [ 15.520354] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
> > [ 15.527450] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
> > [ 15.533737] Modules linked in: imx_cpufreq_dt imx8mm_thermal imx8mp_interconnect rtc_snvs imx_interconnect snvs_pwrkey governor_userspace imx_bus at24 fsl_imx8_ddr_perf caam error crct10dif_ce
> > [ 15.550925] CPU: 2 PID: 68 Comm: kworker/u8:4 Not tainted 6.2.0-rc5-00040-ged7bb521b8fe-dirty #70
> > [ 15.559809] Hardware name: MSC SM2-MB-EP1 Carrier Board with SM2S-IMX8PLUS-QC6-14N0600E SoM (DT)
> > [ 15.568602] Workqueue: events_unbound deferred_probe_work_func
> > [ 15.577666] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [ 15.584637] pc : path_find+0x94/0x374
> > [ 15.588314] lr : path_find+0x94/0x374
> > [ 15.591988] sp : ffff80000a78b730
> > [ 15.595305] x29: ffff80000a78b730 x28: 0000000000000000 x27: ffff80000a78b7c8
> > [ 15.602787] x26: ffff800009161988 x25: 0000000000000001 x24: 0000000000000000
> > [ 15.611498] x23: ffff800008e535c8 x22: ffff800008e53250 x21: ffff000005fe9e00
> > [ 15.618804] x20: ffff80000a78b7b8 x19: ffff80000a78b7a8 x18: 0000000000000030
> > [ 15.625956] x17: 3965663530303030 x16: 3066666666206564 x15: ffffffffffffffff
> > [ 15.633112] x14: 0000000000000000 x13: 3030303030303030 x12: 000000000004034f
> > [ 15.640265] x11: ffff8000095af930 x10: 000000000000011b x9 : 00000000ffffefff
> > [ 15.647418] x8 : ffff800009607930 x7 : 0000000000017fe8 x6 : 0000000000000000
> > [ 15.654571] x5 : 80000000fffff000 x4 : 0000000000000000 x3 : 0000000000000000
> > [ 15.661726] x2 : 0000000000000000 x1 : ffff000003516100 x0 : 0000000000000026
> > [ 15.668877] Call trace:
> > [ 15.671326] path_find+0x94/0x374
> > [ 15.674653] of_icc_get_by_index+0x1b0/0x290
> > [ 15.678932] of_icc_get+0x70/0xa0
> > [ 15.682252] of_icc_bulk_get+0x54/0xf0
> > [ 15.686007] devm_of_icc_bulk_get+0x5c/0xc0
> > [ 15.690196] imx8m_blk_ctrl_probe+0x22c/0x540
> > [ 15.694562] platform_probe+0x68/0xe0
> > [ 15.698231] really_probe+0xc0/0x3e0
> > [ 15.701820] __driver_probe_device+0x7c/0x190
> > [ 15.706182] driver_probe_device+0x3c/0x110
> > [ 15.710374] __device_attach_driver+0xbc/0x160
> > [ 15.714827] bus_for_each_drv+0x78/0xd0
> > [ 15.718670] __device_attach+0xa8/0x1f0
> > [ 15.722513] device_initial_probe+0x14/0x20
> > [ 15.726705] bus_probe_device+0x9c/0xb0
> > [ 15.730549] deferred_probe_work_func+0xa4/0x100
> > [ 15.735174] process_one_work+0x288/0x6b0
> > [ 15.739193] worker_thread+0x74/0x450
> > [ 15.742862] kthread+0x10c/0x110
> > [ 15.746095] ret_from_fork+0x10/0x20
> > [ 15.749683] Code: 90002480 91250000 f90053fb 97ffc398 (b8438783)
> > [ 15.755783] ---[ end trace 0000000000000000 ]---
> > [ 23.343608] random: crng init done
> >
> >
> > ------------------------------8<------------------------------
> >
> > The relevant line is line "B" in this snippet:
> >
> > A [ 15.381942] >>> of_icc_get_from_provider:383 START, spec: np
> > B [ 15.383139] imx-bus-devfreq 32700000.interconnect: interconnect provider added to topology
> > C [ 15.387956] snvs_rtc 30370000.snvs:snvs-rtc-lp: registered as rtc1
> > D [ 15.390482] >>> of_icc_xlate_onecell:352 START
> > E [ 15.401380] >>> of_icc_xlate_onecell:359 RETURN icc_data->nodes[37] = 0000000000000000
> > F [ 15.409421] >>> of_icc_get_from_provider:416 RETURN data->node 0000000000000000
> >
> > Here 32700000.interconnect is added during the execution of
> > of_icc_get_from_provider(), which in turn calls of_icc_xlate_onecell()
> > to find the interconnect node, failing and thus returning NULL. This
> > NULL pointer is propagated up to of_icc_get_by_index() which passes it
> > to path_find() where the pointer is dereferenced and the kernel oopses.
> >
> > In successful runs, line B always appears outside of the execution of
> > of_icc_get_from_provider(), i.e. either before line A or after line F, so
> > it seems to me that the interconnect is being looked for while it is
> > being added and the state is inconsistent.
> >
> > That's all on my side at the moment. I haven't looked at how this
> > could be fixed but I think the problem is pretty focused now.
> >
> > I am of course available to provide more details.
>
> Thanks for the report! Could you please try this patchset and see if it helps:
> https://lore.kernel.org/all/20230201101559.15529-1-johan+linaro@kernel.org/

For the record: it seems to be working. I replied in more detail to
patch 4 in the series.

-- 
Luca Ceresoli, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com