From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from bali.collaboradmins.com (bali.collaboradmins.com [148.251.105.195]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B4F4936215E for ; Thu, 15 Jan 2026 11:31:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=148.251.105.195 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1768476662; cv=none; b=IEHVtoYQC/y3A7/t70zDuDNnW9jdWXIre2+3oYGBCcS9VoCXEC3fL8lkDB4k8gON/ofRyasNRlfPcf5gaNQCmYuyQMMhyRN++c9JXJmmwsO+XwoP8YnE6yffIL7rv2aYQTmflbXa0DQg4i/59c8g+Hb1P+7itjVgzONZec+gSDI= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1768476662; c=relaxed/simple; bh=4BSQh18x4p2LA8T7wpkFltsvFgNZGSy+EpUsFfN67Vw=; h=Date:From:To:Cc:Subject:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=GqQD50yOXtEn4caaOKXBchumQbixi8YT0I3Nq5c/B1/vYvdfNl4RVRWsR1s0IuhiKs/qsClFC5jMFLVd1NrH0bOc9ETv8xHNrJ4neOIaMyZ4LTww/0go/yuotBIpVTSQPqrm9JbECsffoRTajBzSPhoSWqWv/B9DSxg8iNDAu+I= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=collabora.com; spf=pass smtp.mailfrom=collabora.com; dkim=pass (2048-bit key) header.d=collabora.com header.i=@collabora.com header.b=i4WEFWPY; arc=none smtp.client-ip=148.251.105.195 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=collabora.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=collabora.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=collabora.com header.i=@collabora.com header.b="i4WEFWPY" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1768476659; bh=4BSQh18x4p2LA8T7wpkFltsvFgNZGSy+EpUsFfN67Vw=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=i4WEFWPYM5px7QUWmkefLoc9pM9YwR9pcZ3xlmaHyt5t7r8Zt2FcSXRgXBMpNrgpJ mKQS9eyOaqX5LNGtprCXpJ2MagfkScFogfjrh1urCXuxUq2HN607eruLJgcxTzo5P6 ToRoN0lc8GfhBhbD2nZuEIML9DtwCC1zCCpDXbwmehxoxBPkw2P3q6FgIP3pqy9tA2 PLOGLhCeWHD3zhQvqw+uAK6YVE3ZoqZlwYAmXVYtP0HjR2b4Z8fnf9Ign/Pv8m2YFz PDvIw6VLECsyb7HDbaswqg/VglSA9oqFODSUeki0QpkhuNQQE+S390xOvh7Olm0IHh GraxFL1aRQSjw== Received: from fedora (unknown [IPv6:2a01:e0a:2c:6930:d919:a6e:5ea1:8a9f]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (prime256v1) server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: bbrezillon) by bali.collaboradmins.com (Postfix) with ESMTPSA id 8348217E01E7; Thu, 15 Jan 2026 12:30:58 +0100 (CET) Date: Thu, 15 Jan 2026 12:30:52 +0100 From: Boris Brezillon To: Nicolas Frattaroli Cc: Steven Price , Liviu Dudau , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , David Airlie , Simona Vetter , Chia-I Wu , Karunika Choo , kernel@collabora.com, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org Subject: Re: [PATCH v8 2/4] drm/panthor: Extend IRQ helpers for mask modification/restoration Message-ID: <20260115123052.5c30e714@fedora> In-Reply-To: <3991854.44csPzL39Z@workhorse> References: <20260112-panthor-tracepoints-v8-0-63efcb421d22@collabora.com> <20260112-panthor-tracepoints-v8-2-63efcb421d22@collabora.com> <20260112161252.09396916@fedora> <3991854.44csPzL39Z@workhorse> Organization: Collabora X-Mailer: Claws Mail 4.3.1 (GTK 3.24.51; x86_64-redhat-linux-gnu) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit On Thu, 15 Jan 2026 12:15:22 +0100 Nicolas Frattaroli wrote: > On Monday, 12 January 2026 16:12:52 Central European Standard Time Boris Brezillon wrote: > > On Mon, 12 Jan 2026 15:37:50 +0100 > > Nicolas Frattaroli wrote: > > > > > The current IRQ helpers do not guarantee mutual exclusion that covers > > > the entire transaction from accessing the mask member and modifying the > > > mask register. > > > > > > This makes it hard, if not impossible, to implement mask modification > > > helpers that may change one of these outside the normal > > > suspend/resume/isr code paths. > > > > > > Add a spinlock to struct panthor_irq that protects both the mask member > > > and register. Acquire it in all code paths that access these, but drop > > > it before processing the threaded handler function. Then, add the > > > aforementioned new helpers: enable_events, and disable_events. They work > > > by ORing and NANDing the mask bits. > > > > > > resume is changed to no longer have a mask passed, as pirq->mask is > > > supposed to be the user-requested mask now, rather than a mirror of the > > > INT_MASK register contents. Users of the resume helper are adjusted > > > accordingly, including a rather painful refactor in panthor_mmu.c. > > > > > > Signed-off-by: Nicolas Frattaroli > > > --- > > > drivers/gpu/drm/panthor/panthor_device.h | 72 +++++++-- > > > drivers/gpu/drm/panthor/panthor_fw.c | 3 +- > > > drivers/gpu/drm/panthor/panthor_gpu.c | 2 +- > > > drivers/gpu/drm/panthor/panthor_mmu.c | 247 ++++++++++++++++--------------- > > > drivers/gpu/drm/panthor/panthor_pwr.c | 2 +- > > > 5 files changed, 187 insertions(+), 139 deletions(-) > > > > > [... snip ...] > > > diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c > > > index 198d59f42578..71b8318eab31 100644 > > > --- a/drivers/gpu/drm/panthor/panthor_mmu.c > > > +++ b/drivers/gpu/drm/panthor/panthor_mmu.c > > > @@ -655,125 +655,6 @@ static void panthor_vm_release_as_locked(struct panthor_vm *vm) > > > vm->as.id = -1; > > > } > > > > > > -/** > > > - * panthor_vm_active() - Flag a VM as active > > > - * @vm: VM to flag as active. > > > - * > > > - * Assigns an address space to a VM so it can be used by the GPU/MCU. > > > - * > > > - * Return: 0 on success, a negative error code otherwise. > > > - */ > > > -int panthor_vm_active(struct panthor_vm *vm) > > > -{ > > > - struct panthor_device *ptdev = vm->ptdev; > > > - u32 va_bits = GPU_MMU_FEATURES_VA_BITS(ptdev->gpu_info.mmu_features); > > > - struct io_pgtable_cfg *cfg = &io_pgtable_ops_to_pgtable(vm->pgtbl_ops)->cfg; > > > - int ret = 0, as, cookie; > > > - u64 transtab, transcfg; > > > - > > > - if (!drm_dev_enter(&ptdev->base, &cookie)) > > > - return -ENODEV; > > > - > > > - if (refcount_inc_not_zero(&vm->as.active_cnt)) > > > - goto out_dev_exit; > > > - > > > - /* Make sure we don't race with lock/unlock_region() calls > > > - * happening around VM bind operations. > > > - */ > > > - mutex_lock(&vm->op_lock); > > > - mutex_lock(&ptdev->mmu->as.slots_lock); > > > - > > > - if (refcount_inc_not_zero(&vm->as.active_cnt)) > > > - goto out_unlock; > > > - > > > - as = vm->as.id; > > > - if (as >= 0) { > > > - /* Unhandled pagefault on this AS, the MMU was disabled. We need to > > > - * re-enable the MMU after clearing+unmasking the AS interrupts. > > > - */ > > > - if (ptdev->mmu->as.faulty_mask & panthor_mmu_as_fault_mask(ptdev, as)) > > > - goto out_enable_as; > > > - > > > - goto out_make_active; > > > - } > > > - > > > - /* Check for a free AS */ > > > - if (vm->for_mcu) { > > > - drm_WARN_ON(&ptdev->base, ptdev->mmu->as.alloc_mask & BIT(0)); > > > - as = 0; > > > - } else { > > > - as = ffz(ptdev->mmu->as.alloc_mask | BIT(0)); > > > - } > > > - > > > - if (!(BIT(as) & ptdev->gpu_info.as_present)) { > > > - struct panthor_vm *lru_vm; > > > - > > > - lru_vm = list_first_entry_or_null(&ptdev->mmu->as.lru_list, > > > - struct panthor_vm, > > > - as.lru_node); > > > - if (drm_WARN_ON(&ptdev->base, !lru_vm)) { > > > - ret = -EBUSY; > > > - goto out_unlock; > > > - } > > > - > > > - drm_WARN_ON(&ptdev->base, refcount_read(&lru_vm->as.active_cnt)); > > > - as = lru_vm->as.id; > > > - > > > - ret = panthor_mmu_as_disable(ptdev, as, true); > > > - if (ret) > > > - goto out_unlock; > > > - > > > - panthor_vm_release_as_locked(lru_vm); > > > - } > > > - > > > - /* Assign the free or reclaimed AS to the FD */ > > > - vm->as.id = as; > > > - set_bit(as, &ptdev->mmu->as.alloc_mask); > > > - ptdev->mmu->as.slots[as].vm = vm; > > > - > > > -out_enable_as: > > > - transtab = cfg->arm_lpae_s1_cfg.ttbr; > > > - transcfg = AS_TRANSCFG_PTW_MEMATTR_WB | > > > - AS_TRANSCFG_PTW_RA | > > > - AS_TRANSCFG_ADRMODE_AARCH64_4K | > > > - AS_TRANSCFG_INA_BITS(55 - va_bits); > > > - if (ptdev->coherent) > > > - transcfg |= AS_TRANSCFG_PTW_SH_OS; > > > - > > > - /* If the VM is re-activated, we clear the fault. */ > > > - vm->unhandled_fault = false; > > > - > > > - /* Unhandled pagefault on this AS, clear the fault and re-enable interrupts > > > - * before enabling the AS. > > > - */ > > > - if (ptdev->mmu->as.faulty_mask & panthor_mmu_as_fault_mask(ptdev, as)) { > > > - gpu_write(ptdev, MMU_INT_CLEAR, panthor_mmu_as_fault_mask(ptdev, as)); > > > - ptdev->mmu->as.faulty_mask &= ~panthor_mmu_as_fault_mask(ptdev, as); > > > - ptdev->mmu->irq.mask |= panthor_mmu_as_fault_mask(ptdev, as); > > > - gpu_write(ptdev, MMU_INT_MASK, ~ptdev->mmu->as.faulty_mask); > > > - } > > > - > > > - /* The VM update is guarded by ::op_lock, which we take at the beginning > > > - * of this function, so we don't expect any locked region here. > > > - */ > > > - drm_WARN_ON(&vm->ptdev->base, vm->locked_region.size > 0); > > > - ret = panthor_mmu_as_enable(vm->ptdev, vm->as.id, transtab, transcfg, vm->memattr); > > > - > > > -out_make_active: > > > - if (!ret) { > > > - refcount_set(&vm->as.active_cnt, 1); > > > - list_del_init(&vm->as.lru_node); > > > - } > > > - > > > -out_unlock: > > > - mutex_unlock(&ptdev->mmu->as.slots_lock); > > > - mutex_unlock(&vm->op_lock); > > > - > > > -out_dev_exit: > > > - drm_dev_exit(cookie); > > > - return ret; > > > -} > > > - > > > /** > > > * panthor_vm_idle() - Flag a VM idle > > > * @vm: VM to flag as idle. > > > @@ -1762,6 +1643,128 @@ static void panthor_mmu_irq_handler(struct panthor_device *ptdev, u32 status) > > > } > > > PANTHOR_IRQ_HANDLER(mmu, MMU, panthor_mmu_irq_handler); > > > > > > +/** > > > + * panthor_vm_active() - Flag a VM as active > > > + * @vm: VM to flag as active. > > > + * > > > + * Assigns an address space to a VM so it can be used by the GPU/MCU. > > > + * > > > + * Return: 0 on success, a negative error code otherwise. > > > + */ > > > +int panthor_vm_active(struct panthor_vm *vm) > > > +{ > > > + struct panthor_device *ptdev = vm->ptdev; > > > + u32 va_bits = GPU_MMU_FEATURES_VA_BITS(ptdev->gpu_info.mmu_features); > > > + struct io_pgtable_cfg *cfg = &io_pgtable_ops_to_pgtable(vm->pgtbl_ops)->cfg; > > > + int ret = 0, as, cookie; > > > + u64 transtab, transcfg; > > > + u32 fault_mask; > > > + > > > + if (!drm_dev_enter(&ptdev->base, &cookie)) > > > + return -ENODEV; > > > + > > > + if (refcount_inc_not_zero(&vm->as.active_cnt)) > > > + goto out_dev_exit; > > > + > > > + /* Make sure we don't race with lock/unlock_region() calls > > > + * happening around VM bind operations. > > > + */ > > > + mutex_lock(&vm->op_lock); > > > + mutex_lock(&ptdev->mmu->as.slots_lock); > > > + > > > + if (refcount_inc_not_zero(&vm->as.active_cnt)) > > > + goto out_unlock; > > > + > > > + as = vm->as.id; > > > + if (as >= 0) { > > > + /* Unhandled pagefault on this AS, the MMU was disabled. We need to > > > + * re-enable the MMU after clearing+unmasking the AS interrupts. > > > + */ > > > + if (ptdev->mmu->as.faulty_mask & panthor_mmu_as_fault_mask(ptdev, as)) > > > + goto out_enable_as; > > > + > > > + goto out_make_active; > > > + } > > > + > > > + /* Check for a free AS */ > > > + if (vm->for_mcu) { > > > + drm_WARN_ON(&ptdev->base, ptdev->mmu->as.alloc_mask & BIT(0)); > > > + as = 0; > > > + } else { > > > + as = ffz(ptdev->mmu->as.alloc_mask | BIT(0)); > > > + } > > > + > > > + if (!(BIT(as) & ptdev->gpu_info.as_present)) { > > > + struct panthor_vm *lru_vm; > > > + > > > + lru_vm = list_first_entry_or_null(&ptdev->mmu->as.lru_list, > > > + struct panthor_vm, > > > + as.lru_node); > > > + if (drm_WARN_ON(&ptdev->base, !lru_vm)) { > > > + ret = -EBUSY; > > > + goto out_unlock; > > > + } > > > + > > > + drm_WARN_ON(&ptdev->base, refcount_read(&lru_vm->as.active_cnt)); > > > + as = lru_vm->as.id; > > > + > > > + ret = panthor_mmu_as_disable(ptdev, as, true); > > > + if (ret) > > > + goto out_unlock; > > > + > > > + panthor_vm_release_as_locked(lru_vm); > > > + } > > > + > > > + /* Assign the free or reclaimed AS to the FD */ > > > + vm->as.id = as; > > > + set_bit(as, &ptdev->mmu->as.alloc_mask); > > > + ptdev->mmu->as.slots[as].vm = vm; > > > + > > > +out_enable_as: > > > + transtab = cfg->arm_lpae_s1_cfg.ttbr; > > > + transcfg = AS_TRANSCFG_PTW_MEMATTR_WB | > > > + AS_TRANSCFG_PTW_RA | > > > + AS_TRANSCFG_ADRMODE_AARCH64_4K | > > > + AS_TRANSCFG_INA_BITS(55 - va_bits); > > > + if (ptdev->coherent) > > > + transcfg |= AS_TRANSCFG_PTW_SH_OS; > > > + > > > + /* If the VM is re-activated, we clear the fault. */ > > > + vm->unhandled_fault = false; > > > + > > > + /* Unhandled pagefault on this AS, clear the fault and re-enable interrupts > > > + * before enabling the AS. > > > + */ > > > + fault_mask = panthor_mmu_as_fault_mask(ptdev, as); > > > + if (ptdev->mmu->as.faulty_mask & fault_mask) { > > > + gpu_write(ptdev, MMU_INT_CLEAR, fault_mask); > > > + ptdev->mmu->as.faulty_mask &= ~fault_mask; > > > + panthor_mmu_irq_enable_events(&ptdev->mmu->irq, fault_mask); > > > + panthor_mmu_irq_disable_events(&ptdev->mmu->irq, ptdev->mmu->as.faulty_mask); > > > > Why do we need a _disable_events() here? > > It's what the code originally did as far as I can tell. Not super obvious > because I had to move the function, but it did: > > /* Unhandled pagefault on this AS, clear the fault and re-enable interrupts > * before enabling the AS. > */ > if (ptdev->mmu->as.faulty_mask & panthor_mmu_as_fault_mask(ptdev, as)) { > gpu_write(ptdev, MMU_INT_CLEAR, panthor_mmu_as_fault_mask(ptdev, as)); > ptdev->mmu->as.faulty_mask &= ~panthor_mmu_as_fault_mask(ptdev, as); > ptdev->mmu->irq.mask |= panthor_mmu_as_fault_mask(ptdev, as); > gpu_write(ptdev, MMU_INT_MASK, ~ptdev->mmu->as.faulty_mask); > } > > We write `~(ptdev->mmu->as.faulty_mask & ~panthor_mmu_as_fault_mask(ptdev, as))` to > the mask register. Though now looking at it again, I don't think my new version > expands to the same thing at all, since > `ptdev->mmu->as.faulty_mask &= ~panthor_mmu_as_fault_mask(ptdev, as);` is trying > to clear the fault mask of the one bit this translates to from what I can tell, > and then the negation in the write re-enables it but clears all other bits? That > can't be right. If anything if it wanted to re-enable interrupts it should OR > the register contents, not overwrite them. It's more an "enable anything that's not faulty" than a "re-enable events for this specific AS". So all we were doing is keep track of the faulty+not-acknowledged mask, and then clearing bits in this mask as AS slots get acknowledged or recycled. > > I feel a little better about the me from a few days ago when I can look at the > code with a fresh set of eyes and still not get what it's actually trying to do, > other than trusting the comment. > > Also, genuinely what is the point of `panthor_mmu_as_fault_mask`? Half of its > parameters are unused and its entire implementation is shorter than the function > name. panthor_mmu_as_fault_mask() is only here because at some point I intended to port JM HW support to panthor, the fault mask for an AS there is `BIT(as) | BIT(as + 16)` instead of just `BIT(as)` on new HW. So the plan was for panthor_mmu_as_fault_mask() to abstract that for us.