From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B204CCD6E77 for ; Thu, 4 Jun 2026 17:55:56 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 10DA01128F8; Thu, 4 Jun 2026 17:55:56 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=kernel.org header.i=@kernel.org header.b="PfjNQ4V3"; dkim-atps=neutral Received: from sea.source.kernel.org (sea.source.kernel.org [172.234.252.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 01AA51128F8 for ; Thu, 4 Jun 2026 17:55:54 +0000 (UTC) Received: from smtp.kernel.org (quasi.space.kernel.org [100.103.45.18]) by sea.source.kernel.org (Postfix) with ESMTP id 921A14393C; Thu, 4 Jun 2026 17:55:54 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 593331F00893; Thu, 4 Jun 2026 17:55:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780595754; bh=zaFj2py8TsK6T7o0SYWKM/IbFagJ8zS2Zepg+odLIdw=; h=From:Subject:Reply-To:To:Cc:In-Reply-To:References:Date; b=PfjNQ4V33FjbielV006qgc5f3dfdTUcIW/3DxqQl51cVSsGOq+yRZ3eMc3CAaLIlI Zvit11VH2el80eooPKlvnpF/A524Bbb1w98LjwnqDKAcZxNU8b9hxP32StJCNeEQIy xKNlQ230WYZiGWnsEtl4lVDO1QA+f/b5ciiJVriwZ277LH6PE6VkjtTw3ZO31E3Hb1 m+6J3VVxfj5BnTsAETSWAhN8hjvDI9RrUvR2fbMQmsU8JmNjSTedeB6Ic49kbnb7rs cXTGASd3vt2E37RBIo4fYkhm1wfGNIrRwJybLVJ+1BMYRSn2BUaajCr73Achq6KdfH 1j3c5zyzzsfBA== From: sashiko-bot@kernel.org Subject: Re: [PATCH v2 7/7] drm/panfrost: Explicitly enable MMU interrupts at device init To: =?utf-8?b?QWRyacOhbiBMYXJ1bWJl?= Cc: dri-devel@lists.freedesktop.org In-Reply-To: <20260604-claude-fixes-v2-7-57c6bd4c1655@collabora.com> References: <20260604-claude-fixes-v2-7-57c6bd4c1655@collabora.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Date: Thu, 04 Jun 2026 17:55:53 +0000 Message-Id: <20260604175554.593331F00893@smtp.kernel.org> X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: sashiko-reviews@lists.linux.dev Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Thank you for your contribution! Sashiko AI review found 2 potential issue(= s) to consider: Pre-existing issues: - [High] The MMU IRQ handler remains active while device clocks and power d= omains are disabled during error teardown, causing unclocked MMIO accesses = (S-Error). - [High] panfrost_reset() does not mask or synchronize the MMU interrupt be= fore resetting the GPU hardware, leading to a race condition with the MMU I= RQ thread. -- commit cac1cfb873df895888c7871605b659cdaf55b78e Author: Adri=C3=A1n Larumbe Date: Thu Jun 4 17:35:26 2026 +0000 drm/panfrost: Explicitly enable MMU interrupts at device init =20 Because the device must be in a position to accept jobs between the time drm_dev_register() is called and autosuspend first kicks in, there's a = very narrow window inbetween during which jobs targeting the tiler buffer object would time out, since the device's PM status is 'Active', but no= MMU interrupts were enabled at device initialisation time. > diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/pa= nfrost/panfrost_mmu.c > index 4a3162c3b659f..0e1ad6256c2cb 100644 > --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c > +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c [ ... ] > @@ -355,8 +361,7 @@ void panfrost_mmu_reset(struct panfrost_device *pfdev) > =20 > spin_unlock(&pfdev->as_lock); > =20 > - mmu_write(pfdev, MMU_INT_CLEAR, ~0); > - mmu_write(pfdev, MMU_INT_MASK, ~0); > + panfrost_mmu_enable_interrupts(pfdev); > } [Severity: High] This isn't a bug introduced by this patch, but while reviewing the interrupt handling during reset, does panfrost_reset() need to mask and synchronize the MMU interrupt before resetting the hardware? When a job timeout occurs, panfrost_reset() correctly masks and synchronizes the job manager interrupts: drivers/gpu/drm/panfrost/panfrost_job.c:panfrost_reset() { /* Mask job interrupts and synchronize to make sure we won't be * interrupted during our reset. */ job_write(pfdev, JOB_INT_MASK, 0); synchronize_irq(pfdev->js->irq); ... panfrost_device_reset(pfdev, true); ... } However, it does not do the same for the MMU interrupt (pfdev->mmu_irq). Because MMU faults often cause job timeouts, the panfrost_mmu_irq_handler_thread could be actively running concurrently on another CPU. If the IRQ thread attempts to handle an MMU page fault, it will write to MMU registers while panfrost_device_reset() concurrently calls panfrost_gpu_soft_reset() to reset the hardware. Can uncoordinated MMIO accesses to hardware undergoing a soft reset lead to unpredictable behavior or bus errors? [ ... ] > @@ -970,6 +975,8 @@ int panfrost_mmu_init(struct panfrost_device *pfdev) > return err; > } > =20 > + panfrost_mmu_enable_interrupts(pfdev); > + [Severity: High] This is a pre-existing issue, but by explicitly enabling the MMU interrupts here during init, does this expose a race condition in the device initialization error paths? If a subsequent initialization step like panfrost_jm_init() fails, panfrost_device_init() will jump to the out_mmu label: drivers/gpu/drm/panfrost/panfrost_device.c:panfrost_device_init() { ... err =3D panfrost_jm_init(pfdev); if (err) goto out_mmu; ... out_mmu: panfrost_mmu_fini(pfdev); ... out_clk: panfrost_clk_fini(pfdev); ... } panfrost_mmu_fini() masks the interrupt in hardware, but does not synchroni= ze or disable the IRQ handler in the kernel: void panfrost_mmu_fini(struct panfrost_device *pfdev) { mmu_write(pfdev, MMU_INT_MASK, 0); } Because panfrost_mmu_init() registers a shared threaded IRQ handler using devm_request_threaded_irq(), the handler remains active during the rest of the teardown. If an interrupt fires or the thread runs after panfrost_clk_fini() has disabled the hardware clocks, panfrost_mmu_irq_handler() will read unclocked MMIO registers (e.g., mmu_read(pfdev, MMU_INT_STAT)). Could this lead to a fatal S-Error (synchronous external abort)? > return 0; > } --=20 Sashiko AI review =C2=B7 https://sashiko.dev/#/patchset/20260604-claude-fix= es-v2-0-57c6bd4c1655@collabora.com?part=3D7