All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Timur Kristóf" <timur.kristof@gmail.com>
To: amd-gfx@lists.freedesktop.org,
	"Alex Deucher" <alexander.deucher@amd.com>,
	christian.koenig@amd.com, "Natalie Vock" <natalie.vock@gmx.de>,
	"Mario Limonciello" <mario.limonciello@amd.com>,
	"Amir Shetaia" <Amir.Shetaia@amd.com>,
	"Marek Olšák" <maraeo@gmail.com>,
	"Tvrtko Ursulin" <tursulin@ursulin.net>
Subject: Re: [PATCH 4/7] drm/amdgpu/ih: Add retry_cam_ack IH function pointer
Date: Mon, 15 Jun 2026 17:02:06 +0200	[thread overview]
Message-ID: <3701855.dWV9SEqChM@timur-hyperion> (raw)
In-Reply-To: <828817bb-8d69-429f-b206-7c9858eeca72@ursulin.net>

On Monday, June 15, 2026 4:44:22 PM Central European Summer Time Tvrtko 
Ursulin wrote:
> On 25/05/2026 12:45, Timur Kristóf wrote:
> > Instead of writing the doorbell in amdgpu_gmc_handle_retry_fault()
> > directly, add an IH function pointer which can be defined in
> > a different way for different IH versions.
> > 
> > This is to allow implementing the filter CAM without a doorbell.
> > 
> > Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> > ---
> > 
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 2 +-
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h  | 1 +
> >   drivers/gpu/drm/amd/amdgpu/ih_v7_0.c    | 6 ++++++
> >   drivers/gpu/drm/amd/amdgpu/vega20_ih.c  | 8 +++++++-
> >   4 files changed, 15 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index
> > 52258f1341c2..d790b7619ccd 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > @@ -565,7 +565,7 @@ int amdgpu_gmc_handle_retry_fault(struct amdgpu_device
> > *adev,> 
> >   		ret = amdgpu_vm_handle_fault(adev, entry->pasid, 
entry->vmid, node_id,
> >   		
> >   					     addr, entry-
>timestamp, write_fault);
> > 
> > -		WDOORBELL32(adev->irq.retry_cam_doorbell_index, 
cam_index);
> > +		adev->irq.ih_funcs->retry_cam_ack(adev, cam_index);
> 
> How does not map which IP generations can end up calling it? Presumably
> your selection of ih_v7_0 and vega20_ih.c is an insightful one, but for
> me I see amdgpu_gmc_handle_retry_fault() is called from
> gmc_v9_0_process_interrupt, gmc_v10_0_process_interrupt,
> gmc_v11_0_process_interrupt and gmc_v12_0_process_interrupt(). Is there
> a map somewhere which shows which GMC versions go with which IH blocks?

The hardware writes interrupt data into a so-called IH ring (interrupt handler 
ring):
- Old GPUs only have one IH ring
- Newer dedicated GPUs have two IH rings
- APUs only have one IH ring
- Additionally there is a soft IH ring, which is implemented entirely in the 
driver and is used to make the processing more reliable.

Specifically for retry faults, it is beneficial to configure page fault 
interrupts on the second IH ring (when available) to avoid them competing with 
other interrupts for space in the ring. In the upstream code this is only done 
on Vega dGPUs. My series implements that for GFX11 and a subsequent series 
also for GFX12 dGPUs.

To answer your actual question, it all depends on the IH IP block for any 
given generation. For some GPUs (but not all), the IH code configures the 
hardware to use the second IH ring for page fault interrupts. For everything 
else, the first IH ring is used. On GFX12 it seems the interrupts are handled 
on the first IH ring even though the second ring is configured; it's unclear if 
that's a bug or missing code in the kernel.

Hope this helps,
Timur




> 
> >   		if (ret)
> >   		
> >   			return 1;
> >   	
> >   	} else {
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h index 444437c30088..e6e34f6e86f4
> > 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
> > @@ -97,6 +97,7 @@ struct amdgpu_ih_funcs {
> > 
> >   	const char *(*node_id_to_die_name)(struct amdgpu_device *adev,
> >   	
> >   					   unsigned int 
node_id,
> >   					   char *buf, size_t 
size);
> > 
> > +	void (*retry_cam_ack)(struct amdgpu_device *adev, u32 cam_index);
> > 
> >   };
> >   
> >   #define amdgpu_ih_get_wptr(adev, ih)
> >   (adev)->irq.ih_funcs->get_wptr((adev), (ih))> 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/ih_v7_0.c
> > b/drivers/gpu/drm/amd/amdgpu/ih_v7_0.c index 6de9e87e04e1..c2431f4c2671
> > 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/ih_v7_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/ih_v7_0.c
> > @@ -289,6 +289,11 @@ static uint32_t ih_v7_0_setup_retry_doorbell(u32
> > doorbell_index)> 
> >   	return val;
> >   
> >   }
> > 
> > +static void ih_v7_0_retry_cam_ack(struct amdgpu_device *adev, u32
> > cam_index) +{
> > +	WDOORBELL32(adev->irq.retry_cam_doorbell_index, cam_index);
> > +}
> > +
> > 
> >   #define regIH_RING1_CLIENT_CFG_INDEX_V7_1             0x122
> >   #define regIH_RING1_CLIENT_CFG_INDEX_V7_1_BASE_IDX    0
> >   #define regIH_RING1_CLIENT_CFG_DATA_V7_1              0x123
> > 
> > @@ -858,6 +863,7 @@ static const struct amdgpu_ih_funcs ih_v7_0_funcs = {
> > 
> >   	.decode_iv_ts = amdgpu_ih_decode_iv_ts_helper,
> >   	.set_rptr = ih_v7_0_set_rptr,
> >   	.node_id_to_die_name = ih_v7_0_node_id_to_die_name,
> > 
> > +	.retry_cam_ack = ih_v7_0_retry_cam_ack,
> > 
> >   };
> >   
> >   static void ih_v7_0_set_interrupt_funcs(struct amdgpu_device *adev)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
> > b/drivers/gpu/drm/amd/amdgpu/vega20_ih.c index 85846fd08ce4..30a82fff3ff7
> > 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/vega20_ih.c
> > @@ -293,6 +293,11 @@ static uint32_t vega20_setup_retry_doorbell(u32
> > doorbell_index)> 
> >   	return val;
> >   
> >   }
> > 
> > +static void vega20_retry_cam_ack(struct amdgpu_device *adev, u32
> > cam_index) +{
> > +	WDOORBELL32(adev->irq.retry_cam_doorbell_index, cam_index);
> > +}
> > +
> > 
> >   /**
> >   
> >    * vega20_ih_irq_init - init and enable the interrupt ring
> >    *
> > 
> > @@ -738,7 +743,8 @@ static const struct amdgpu_ih_funcs vega20_ih_funcs =
> > {
> > 
> >   	.get_wptr = vega20_ih_get_wptr,
> >   	.decode_iv = amdgpu_ih_decode_iv_helper,
> >   	.decode_iv_ts = amdgpu_ih_decode_iv_ts_helper,
> > 
> > -	.set_rptr = vega20_ih_set_rptr
> > +	.set_rptr = vega20_ih_set_rptr,
> > +	.retry_cam_ack = vega20_retry_cam_ack,
> > 
> >   };
> >   
> >   static void vega20_ih_set_interrupt_funcs(struct amdgpu_device *adev)





  reply	other threads:[~2026-06-15 15:02 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-25 11:45 [PATCH 0/7] drm/amdgpu: Improve retry fault handling (v2) Timur Kristóf
2026-05-25 11:45 ` [PATCH 1/7] drm/amdgpu: Use gmc->noretry instead of amdgpu_noretry directly Timur Kristóf
2026-05-25 11:45 ` [PATCH 2/7] drm/amdgpu/gfxhub: Program CRASH_ON_*_FAULT bits to 0 as needed Timur Kristóf
2026-05-26 15:00   ` Alex Deucher
2026-05-25 11:45 ` [PATCH 3/7] drm/amdgpu/gmc: Don't compare page fault timestamps with other interrupts Timur Kristóf
2026-06-15 14:32   ` Tvrtko Ursulin
2026-06-15 14:52     ` Timur Kristóf
2026-06-15 15:23       ` Tvrtko Ursulin
2026-06-15 15:32         ` Timur Kristóf
2026-06-15 15:48           ` Tvrtko Ursulin
2026-05-25 11:45 ` [PATCH 4/7] drm/amdgpu/ih: Add retry_cam_ack IH function pointer Timur Kristóf
2026-06-15 14:44   ` Tvrtko Ursulin
2026-06-15 15:02     ` Timur Kristóf [this message]
2026-05-25 11:45 ` [PATCH 5/7] drm/amdgpu/gfxhub: Enable retry fault interrupts when needed Timur Kristóf
2026-05-25 11:45 ` [PATCH 6/7] drm/amdgpu/gfxhub: Respect noretry flag for retry faults on GFX12.1 Timur Kristóf
2026-05-25 11:45 ` [PATCH 7/7] drm/amdgpu: Enable retry CAM on Navi 3 dGPUs Timur Kristóf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3701855.dWV9SEqChM@timur-hyperion \
    --to=timur.kristof@gmail.com \
    --cc=Amir.Shetaia@amd.com \
    --cc=alexander.deucher@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=christian.koenig@amd.com \
    --cc=maraeo@gmail.com \
    --cc=mario.limonciello@amd.com \
    --cc=natalie.vock@gmx.de \
    --cc=tursulin@ursulin.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.