Re: [PATCH v8 2/4] drm/panthor: Extend IRQ helpers for mask modification/restoration

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Boris Brezillon <boris.brezillon@collabora.com>
To: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Cc: Steven Price <steven.price@arm.com>,
	Liviu Dudau <liviu.dudau@arm.com>,
	Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
	Maxime Ripard <mripard@kernel.org>,
	Thomas Zimmermann <tzimmermann@suse.de>,
	David Airlie <airlied@gmail.com>, Simona Vetter <simona@ffwll.ch>,
	Chia-I Wu <olvaffe@gmail.com>,
	Karunika Choo <karunika.choo@arm.com>,
	kernel@collabora.com, linux-kernel@vger.kernel.org,
	dri-devel@lists.freedesktop.org
Subject: Re: [PATCH v8 2/4] drm/panthor: Extend IRQ helpers for mask modification/restoration
Date: Thu, 15 Jan 2026 12:30:52 +0100	[thread overview]
Message-ID: <20260115123052.5c30e714@fedora> (raw)
In-Reply-To: <3991854.44csPzL39Z@workhorse>

On Thu, 15 Jan 2026 12:15:22 +0100
Nicolas Frattaroli <nicolas.frattaroli@collabora.com> wrote:

> On Monday, 12 January 2026 16:12:52 Central European Standard Time Boris Brezillon wrote:
> > On Mon, 12 Jan 2026 15:37:50 +0100
> > Nicolas Frattaroli <nicolas.frattaroli@collabora.com> wrote:
> >   
> > > The current IRQ helpers do not guarantee mutual exclusion that covers
> > > the entire transaction from accessing the mask member and modifying the
> > > mask register.
> > > 
> > > This makes it hard, if not impossible, to implement mask modification
> > > helpers that may change one of these outside the normal
> > > suspend/resume/isr code paths.
> > > 
> > > Add a spinlock to struct panthor_irq that protects both the mask member
> > > and register. Acquire it in all code paths that access these, but drop
> > > it before processing the threaded handler function. Then, add the
> > > aforementioned new helpers: enable_events, and disable_events. They work
> > > by ORing and NANDing the mask bits.
> > > 
> > > resume is changed to no longer have a mask passed, as pirq->mask is
> > > supposed to be the user-requested mask now, rather than a mirror of the
> > > INT_MASK register contents. Users of the resume helper are adjusted
> > > accordingly, including a rather painful refactor in panthor_mmu.c.
> > > 
> > > Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
> > > ---
> > >  drivers/gpu/drm/panthor/panthor_device.h |  72 +++++++--
> > >  drivers/gpu/drm/panthor/panthor_fw.c     |   3 +-
> > >  drivers/gpu/drm/panthor/panthor_gpu.c    |   2 +-
> > >  drivers/gpu/drm/panthor/panthor_mmu.c    | 247 ++++++++++++++++---------------
> > >  drivers/gpu/drm/panthor/panthor_pwr.c    |   2 +-
> > >  5 files changed, 187 insertions(+), 139 deletions(-)
> > >   
> > [... snip ...]  
> > > diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
> > > index 198d59f42578..71b8318eab31 100644
> > > --- a/drivers/gpu/drm/panthor/panthor_mmu.c
> > > +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
> > > @@ -655,125 +655,6 @@ static void panthor_vm_release_as_locked(struct panthor_vm *vm)
> > >  	vm->as.id = -1;
> > >  }
> > >  
> > > -/**
> > > - * panthor_vm_active() - Flag a VM as active
> > > - * @vm: VM to flag as active.
> > > - *
> > > - * Assigns an address space to a VM so it can be used by the GPU/MCU.
> > > - *
> > > - * Return: 0 on success, a negative error code otherwise.
> > > - */
> > > -int panthor_vm_active(struct panthor_vm *vm)
> > > -{
> > > -	struct panthor_device *ptdev = vm->ptdev;
> > > -	u32 va_bits = GPU_MMU_FEATURES_VA_BITS(ptdev->gpu_info.mmu_features);
> > > -	struct io_pgtable_cfg *cfg = &io_pgtable_ops_to_pgtable(vm->pgtbl_ops)->cfg;
> > > -	int ret = 0, as, cookie;
> > > -	u64 transtab, transcfg;
> > > -
> > > -	if (!drm_dev_enter(&ptdev->base, &cookie))
> > > -		return -ENODEV;
> > > -
> > > -	if (refcount_inc_not_zero(&vm->as.active_cnt))
> > > -		goto out_dev_exit;
> > > -
> > > -	/* Make sure we don't race with lock/unlock_region() calls
> > > -	 * happening around VM bind operations.
> > > -	 */
> > > -	mutex_lock(&vm->op_lock);
> > > -	mutex_lock(&ptdev->mmu->as.slots_lock);
> > > -
> > > -	if (refcount_inc_not_zero(&vm->as.active_cnt))
> > > -		goto out_unlock;
> > > -
> > > -	as = vm->as.id;
> > > -	if (as >= 0) {
> > > -		/* Unhandled pagefault on this AS, the MMU was disabled. We need to
> > > -		 * re-enable the MMU after clearing+unmasking the AS interrupts.
> > > -		 */
> > > -		if (ptdev->mmu->as.faulty_mask & panthor_mmu_as_fault_mask(ptdev, as))
> > > -			goto out_enable_as;
> > > -
> > > -		goto out_make_active;
> > > -	}
> > > -
> > > -	/* Check for a free AS */
> > > -	if (vm->for_mcu) {
> > > -		drm_WARN_ON(&ptdev->base, ptdev->mmu->as.alloc_mask & BIT(0));
> > > -		as = 0;
> > > -	} else {
> > > -		as = ffz(ptdev->mmu->as.alloc_mask | BIT(0));
> > > -	}
> > > -
> > > -	if (!(BIT(as) & ptdev->gpu_info.as_present)) {
> > > -		struct panthor_vm *lru_vm;
> > > -
> > > -		lru_vm = list_first_entry_or_null(&ptdev->mmu->as.lru_list,
> > > -						  struct panthor_vm,
> > > -						  as.lru_node);
> > > -		if (drm_WARN_ON(&ptdev->base, !lru_vm)) {
> > > -			ret = -EBUSY;
> > > -			goto out_unlock;
> > > -		}
> > > -
> > > -		drm_WARN_ON(&ptdev->base, refcount_read(&lru_vm->as.active_cnt));
> > > -		as = lru_vm->as.id;
> > > -
> > > -		ret = panthor_mmu_as_disable(ptdev, as, true);
> > > -		if (ret)
> > > -			goto out_unlock;
> > > -
> > > -		panthor_vm_release_as_locked(lru_vm);
> > > -	}
> > > -
> > > -	/* Assign the free or reclaimed AS to the FD */
> > > -	vm->as.id = as;
> > > -	set_bit(as, &ptdev->mmu->as.alloc_mask);
> > > -	ptdev->mmu->as.slots[as].vm = vm;
> > > -
> > > -out_enable_as:
> > > -	transtab = cfg->arm_lpae_s1_cfg.ttbr;
> > > -	transcfg = AS_TRANSCFG_PTW_MEMATTR_WB |
> > > -		   AS_TRANSCFG_PTW_RA |
> > > -		   AS_TRANSCFG_ADRMODE_AARCH64_4K |
> > > -		   AS_TRANSCFG_INA_BITS(55 - va_bits);
> > > -	if (ptdev->coherent)
> > > -		transcfg |= AS_TRANSCFG_PTW_SH_OS;
> > > -
> > > -	/* If the VM is re-activated, we clear the fault. */
> > > -	vm->unhandled_fault = false;
> > > -
> > > -	/* Unhandled pagefault on this AS, clear the fault and re-enable interrupts
> > > -	 * before enabling the AS.
> > > -	 */
> > > -	if (ptdev->mmu->as.faulty_mask & panthor_mmu_as_fault_mask(ptdev, as)) {
> > > -		gpu_write(ptdev, MMU_INT_CLEAR, panthor_mmu_as_fault_mask(ptdev, as));
> > > -		ptdev->mmu->as.faulty_mask &= ~panthor_mmu_as_fault_mask(ptdev, as);
> > > -		ptdev->mmu->irq.mask |= panthor_mmu_as_fault_mask(ptdev, as);
> > > -		gpu_write(ptdev, MMU_INT_MASK, ~ptdev->mmu->as.faulty_mask);
> > > -	}
> > > -
> > > -	/* The VM update is guarded by ::op_lock, which we take at the beginning
> > > -	 * of this function, so we don't expect any locked region here.
> > > -	 */
> > > -	drm_WARN_ON(&vm->ptdev->base, vm->locked_region.size > 0);
> > > -	ret = panthor_mmu_as_enable(vm->ptdev, vm->as.id, transtab, transcfg, vm->memattr);
> > > -
> > > -out_make_active:
> > > -	if (!ret) {
> > > -		refcount_set(&vm->as.active_cnt, 1);
> > > -		list_del_init(&vm->as.lru_node);
> > > -	}
> > > -
> > > -out_unlock:
> > > -	mutex_unlock(&ptdev->mmu->as.slots_lock);
> > > -	mutex_unlock(&vm->op_lock);
> > > -
> > > -out_dev_exit:
> > > -	drm_dev_exit(cookie);
> > > -	return ret;
> > > -}
> > > -
> > >  /**
> > >   * panthor_vm_idle() - Flag a VM idle
> > >   * @vm: VM to flag as idle.
> > > @@ -1762,6 +1643,128 @@ static void panthor_mmu_irq_handler(struct panthor_device *ptdev, u32 status)
> > >  }
> > >  PANTHOR_IRQ_HANDLER(mmu, MMU, panthor_mmu_irq_handler);
> > >  
> > > +/**
> > > + * panthor_vm_active() - Flag a VM as active
> > > + * @vm: VM to flag as active.
> > > + *
> > > + * Assigns an address space to a VM so it can be used by the GPU/MCU.
> > > + *
> > > + * Return: 0 on success, a negative error code otherwise.
> > > + */
> > > +int panthor_vm_active(struct panthor_vm *vm)
> > > +{
> > > +	struct panthor_device *ptdev = vm->ptdev;
> > > +	u32 va_bits = GPU_MMU_FEATURES_VA_BITS(ptdev->gpu_info.mmu_features);
> > > +	struct io_pgtable_cfg *cfg = &io_pgtable_ops_to_pgtable(vm->pgtbl_ops)->cfg;
> > > +	int ret = 0, as, cookie;
> > > +	u64 transtab, transcfg;
> > > +	u32 fault_mask;
> > > +
> > > +	if (!drm_dev_enter(&ptdev->base, &cookie))
> > > +		return -ENODEV;
> > > +
> > > +	if (refcount_inc_not_zero(&vm->as.active_cnt))
> > > +		goto out_dev_exit;
> > > +
> > > +	/* Make sure we don't race with lock/unlock_region() calls
> > > +	 * happening around VM bind operations.
> > > +	 */
> > > +	mutex_lock(&vm->op_lock);
> > > +	mutex_lock(&ptdev->mmu->as.slots_lock);
> > > +
> > > +	if (refcount_inc_not_zero(&vm->as.active_cnt))
> > > +		goto out_unlock;
> > > +
> > > +	as = vm->as.id;
> > > +	if (as >= 0) {
> > > +		/* Unhandled pagefault on this AS, the MMU was disabled. We need to
> > > +		 * re-enable the MMU after clearing+unmasking the AS interrupts.
> > > +		 */
> > > +		if (ptdev->mmu->as.faulty_mask & panthor_mmu_as_fault_mask(ptdev, as))
> > > +			goto out_enable_as;
> > > +
> > > +		goto out_make_active;
> > > +	}
> > > +
> > > +	/* Check for a free AS */
> > > +	if (vm->for_mcu) {
> > > +		drm_WARN_ON(&ptdev->base, ptdev->mmu->as.alloc_mask & BIT(0));
> > > +		as = 0;
> > > +	} else {
> > > +		as = ffz(ptdev->mmu->as.alloc_mask | BIT(0));
> > > +	}
> > > +
> > > +	if (!(BIT(as) & ptdev->gpu_info.as_present)) {
> > > +		struct panthor_vm *lru_vm;
> > > +
> > > +		lru_vm = list_first_entry_or_null(&ptdev->mmu->as.lru_list,
> > > +						  struct panthor_vm,
> > > +						  as.lru_node);
> > > +		if (drm_WARN_ON(&ptdev->base, !lru_vm)) {
> > > +			ret = -EBUSY;
> > > +			goto out_unlock;
> > > +		}
> > > +
> > > +		drm_WARN_ON(&ptdev->base, refcount_read(&lru_vm->as.active_cnt));
> > > +		as = lru_vm->as.id;
> > > +
> > > +		ret = panthor_mmu_as_disable(ptdev, as, true);
> > > +		if (ret)
> > > +			goto out_unlock;
> > > +
> > > +		panthor_vm_release_as_locked(lru_vm);
> > > +	}
> > > +
> > > +	/* Assign the free or reclaimed AS to the FD */
> > > +	vm->as.id = as;
> > > +	set_bit(as, &ptdev->mmu->as.alloc_mask);
> > > +	ptdev->mmu->as.slots[as].vm = vm;
> > > +
> > > +out_enable_as:
> > > +	transtab = cfg->arm_lpae_s1_cfg.ttbr;
> > > +	transcfg = AS_TRANSCFG_PTW_MEMATTR_WB |
> > > +		   AS_TRANSCFG_PTW_RA |
> > > +		   AS_TRANSCFG_ADRMODE_AARCH64_4K |
> > > +		   AS_TRANSCFG_INA_BITS(55 - va_bits);
> > > +	if (ptdev->coherent)
> > > +		transcfg |= AS_TRANSCFG_PTW_SH_OS;
> > > +
> > > +	/* If the VM is re-activated, we clear the fault. */
> > > +	vm->unhandled_fault = false;
> > > +
> > > +	/* Unhandled pagefault on this AS, clear the fault and re-enable interrupts
> > > +	 * before enabling the AS.
> > > +	 */
> > > +	fault_mask = panthor_mmu_as_fault_mask(ptdev, as);
> > > +	if (ptdev->mmu->as.faulty_mask & fault_mask) {
> > > +		gpu_write(ptdev, MMU_INT_CLEAR, fault_mask);
> > > +		ptdev->mmu->as.faulty_mask &= ~fault_mask;
> > > +		panthor_mmu_irq_enable_events(&ptdev->mmu->irq, fault_mask);
> > > +		panthor_mmu_irq_disable_events(&ptdev->mmu->irq, ptdev->mmu->as.faulty_mask);  
> > 
> > Why do we need a _disable_events() here?  
> 
> It's what the code originally did as far as I can tell. Not super obvious
> because I had to move the function, but it did:
> 
> 	/* Unhandled pagefault on this AS, clear the fault and re-enable interrupts
> 	 * before enabling the AS.
> 	 */
> 	if (ptdev->mmu->as.faulty_mask & panthor_mmu_as_fault_mask(ptdev, as)) {
> 		gpu_write(ptdev, MMU_INT_CLEAR, panthor_mmu_as_fault_mask(ptdev, as));
> 		ptdev->mmu->as.faulty_mask &= ~panthor_mmu_as_fault_mask(ptdev, as);
> 		ptdev->mmu->irq.mask |= panthor_mmu_as_fault_mask(ptdev, as);
> 		gpu_write(ptdev, MMU_INT_MASK, ~ptdev->mmu->as.faulty_mask);
> 	}
> 
> We write `~(ptdev->mmu->as.faulty_mask & ~panthor_mmu_as_fault_mask(ptdev, as))` to
> the mask register. Though now looking at it again, I don't think my new version
> expands to the same thing at all, since
> `ptdev->mmu->as.faulty_mask &= ~panthor_mmu_as_fault_mask(ptdev, as);` is trying
> to clear the fault mask of the one bit this translates to from what I can tell,
> and then the negation in the write re-enables it but clears all other bits? That
> can't be right. If anything if it wanted to re-enable interrupts it should OR
> the register contents, not overwrite them.

It's more an "enable anything that's not faulty" than a "re-enable
events for this specific AS". So all we were doing is keep track of the
faulty+not-acknowledged mask, and then clearing bits in this mask as AS
slots get acknowledged or recycled.

> 
> I feel a little better about the me from a few days ago when I can look at the
> code with a fresh set of eyes and still not get what it's actually trying to do,
> other than trusting the comment.
> 
> Also, genuinely what is the point of `panthor_mmu_as_fault_mask`? Half of its
> parameters are unused and its entire implementation is shorter than the function
> name.

panthor_mmu_as_fault_mask() is only here because at some point I
intended to port JM HW support to panthor, the fault mask for an AS
there is `BIT(as) | BIT(as + 16)` instead of just `BIT(as)` on new HW.
So the plan was for panthor_mmu_as_fault_mask() to abstract that for us.

next prev parent reply	other threads:[~2026-01-15 11:31 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-12 14:37 [PATCH v8 0/4] Add a few tracepoints to panthor Nicolas Frattaroli
2026-01-12 14:37 ` [PATCH v8 1/4] drm/panthor: Rework panthor_irq::suspended into panthor_irq::state Nicolas Frattaroli
2026-01-14 16:07   ` Steven Price
2026-01-12 14:37 ` [PATCH v8 2/4] drm/panthor: Extend IRQ helpers for mask modification/restoration Nicolas Frattaroli
2026-01-12 15:12   ` Boris Brezillon
2026-01-15 11:15     ` Nicolas Frattaroli
2026-01-15 11:30       ` Boris Brezillon [this message]
2026-01-13 12:23   ` Boris Brezillon
2026-01-12 14:37 ` [PATCH v8 3/4] drm/panthor: Add tracepoint for hardware utilisation changes Nicolas Frattaroli
2026-01-12 14:37 ` [PATCH v8 4/4] drm/panthor: Add gpu_job_irq tracepoint Nicolas Frattaroli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260115123052.5c30e714@fedora \
    --to=boris.brezillon@collabora.com \
    --cc=airlied@gmail.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=karunika.choo@arm.com \
    --cc=kernel@collabora.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=liviu.dudau@arm.com \
    --cc=maarten.lankhorst@linux.intel.com \
    --cc=mripard@kernel.org \
    --cc=nicolas.frattaroli@collabora.com \
    --cc=olvaffe@gmail.com \
    --cc=simona@ffwll.ch \
    --cc=steven.price@arm.com \
    --cc=tzimmermann@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.