AMD-GFX Archive on lore.kernel.org
From: "Timur Kristóf" <timur.kristof@gmail.com>
To: amd-gfx@lists.freedesktop.org,
	"Alex Deucher" <alexander.deucher@amd.com>,
	christian.koenig@amd.com, "Natalie Vock" <natalie.vock@gmx.de>,
	"Mario Limonciello" <mario.limonciello@amd.com>,
	"Amir Shetaia" <Amir.Shetaia@amd.com>,
	"Marek Olšák" <maraeo@gmail.com>,
	"Tvrtko Ursulin" <tursulin@ursulin.net>
Subject: Re: [PATCH 6/7] drm/amdgpu/gfxhub: Respect noretry flag for retry faults on GFX12.1
Date: Tue, 16 Jun 2026 14:36:41 +0200
Message-ID: <3036224.DJkKcVGEfx@timur-hyperion>
In-Reply-To: <4296af8f-8001-4a98-b942-3a2840296b7e@ursulin.net>

On Tuesday, June 16, 2026 2:16:35 PM Central European Summer Time Tvrtko 
Ursulin wrote:
> On 16/06/2026 12:57, Timur Kristóf wrote:
> > On Tuesday, June 16, 2026 10:09:53 AM Central European Summer Time Tvrtko
> > Ursulin wrote:
> >> On 25/05/2026 12:45, Timur Kristóf wrote:
> >>> When retry faults are disabled (amdgpu.noretry=1),
> >>> the ENABLE_RETRY_FAULT_INTERRUPT bit should be programmed to 0.
> >>> 
> >>> Note that retry faults are enabled by default on GFX12.1
> >>> so this just fixes the case when they are explicitly disabled.
> >>> 
> >>> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
> >>> ---
> >>> 
> >>>    drivers/gpu/drm/amd/amdgpu/gfxhub_v12_1.c | 2 +-
> >>>    1 file changed, 1 insertion(+), 1 deletion(-)
> >>> 
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v12_1.c b/drivers/gpu/drm/amd/amdgpu/gfxhub_v12_1.c
> >>> index 4c2fd1e6616e..d2edfe037da8 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v12_1.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v12_1.c
> >>> @@ -243,7 +243,7 @@ static void gfxhub_v12_1_xcc_init_system_aperture_regs(struct amdgpu_device *ade>
> >>>    		tmp = REG_SET_FIELD(tmp, GCVM_L2_PROTECTION_FAULT_CNTL2,
> >>>    				    ACTIVE_PAGE_MIGRATION_PTE_READ_RETRY, 1);
> >>>    		tmp = REG_SET_FIELD(tmp, GCVM_L2_PROTECTION_FAULT_CNTL2,
> >>> -				    ENABLE_RETRY_FAULT_INTERRUPT, 0x1);
> >>> +				    ENABLE_RETRY_FAULT_INTERRUPT, !adev->gmc.noretry);
> >>>    		WREG32_SOC15(GC, GET_INST(GC, i),
> >>>    			     regGCVM_L2_PROTECTION_FAULT_CNTL2, tmp);
> >>>    	}
> >> 
> >> If I look at 6f894c92490b ("drm/amdgpu: Enable retry faults for GFX
> >> 12.1") which added this code, it also touched
> >> 
> >> gfxhub_v12_1_xcc_setup_vmid_config():
> >>       tmp = REG_SET_FIELD(tmp, GCVM_CONTEXT1_CNTL,
> >>       
> >>                           RETRY_PERMISSION_OR_INVALID_PAGE_FAULT,
> >> 
> >> -                       !amdgpu_noretry);
> >> +                       1);
> >> 
> >> Should that be changed as well?
> > 
> > I personally don't have a GFX12.1 GPU so I have no way to verify how that
> > works, which is why I try to avoid changing it unless it's pretty obvious
> > that the upstream code is wrong.
> > 
> > Can you elaborate on what you are suggesting exactly?
> 
> I'm asking. :)
> 
> commit 6f894c92490be1bb27492a82544b4b1e4ad20915
> Author: Mukul Joshi <mukul.joshi@amd.com>
> Date:   Wed Mar 26 22:06:39 2025 -0400
> 
>      drm/amdgpu: Enable retry faults for GFX 12.1
> 
> Made these three changes:
> 
> gfxhub_v12_1_xcc_init_system_aperture_regs:
> +                       tmp = REG_SET_FIELD(tmp, GCVM_L2_PROTECTION_FAULT_CNTL2,
> +                                           ENABLE_RETRY_FAULT_INTERRUPT, 0x1);
> 
> 
> gfxhub_v12_1_xcc_setup_vmid_config:
> -                                           !amdgpu_noretry);
> +                                           1);
> 
> 
> mmhub_v4_2_0_mid_init_system_aperture_regs:
> +               tmp = REG_SET_FIELD(tmp, MMVM_L2_PROTECTION_FAULT_CNTL2,
> +                                   ENABLE_RETRY_FAULT_INTERRUPT, 0x1);
> 
> 
> The claim in that commit was that it enables retry faults on GFX
> 12.1. If that is correct, then your patch, which wants to respect
> the noretry modparam, only changes one of those three places.
> 
> So the question is: are you confident that this is the only one you
> need to change to make it respect the modparam? To be clear, I don't
> know; these are just things I spotted while reading your patch and the
> relevant history, trying to familiarise myself with this area.

I think I see what you mean.
Indeed it would make sense to change the patch to use the gmc->noretry flag 
there as well. Will add that to the next version.

Thanks,
Timur
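
For readers following along, the pattern under discussion (deriving a
register field from the noretry flag instead of hardcoding 1) can be
sketched in standalone C. This is only an illustration under stated
assumptions: demo_set_field() is a simplified stand-in for the kernel's
REG_SET_FIELD() macro, struct demo_gmc stands in for the relevant part of
the driver's GMC state, and the DEMO_* shift and mask values are made up.
The real field layouts live in the generated register headers and are not
reproduced here, and this is not the code of the next patch version.

```c
#include <stdint.h>

/* Hypothetical field layout, chosen only for this sketch. */
#define DEMO_RETRY_FIELD_SHIFT 5u
#define DEMO_RETRY_FIELD_MASK  (1u << DEMO_RETRY_FIELD_SHIFT)

/*
 * Simplified stand-in for REG_SET_FIELD(): clear the field in the
 * register value, then OR in the new value shifted into position.
 */
static uint32_t demo_set_field(uint32_t reg, uint32_t mask,
			       unsigned int shift, uint32_t val)
{
	return (reg & ~mask) | ((val << shift) & mask);
}

/* Hypothetical stand-in for the driver state holding the flag. */
struct demo_gmc {
	int noretry;	/* nonzero: retry faults are disabled */
};

/*
 * Program the retry-related field from the flag: the field becomes 1
 * when noretry is 0 and becomes 0 when noretry is nonzero. All other
 * bits of the register value are preserved.
 */
static uint32_t demo_program_retry_field(uint32_t reg,
					  const struct demo_gmc *gmc)
{
	return demo_set_field(reg, DEMO_RETRY_FIELD_MASK,
			      DEMO_RETRY_FIELD_SHIFT, !gmc->noretry);
}
```

Calling demo_program_retry_field() with noretry set to 0 sets the field,
and calling it with noretry set to 1 clears the field even if it was
previously set. The same expression would presumably apply at each site
that the earlier commit hardcoded to 1, subject to verification on real
GFX12.1 hardware.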







Thread overview: 26+ messages
2026-05-25 11:45 [PATCH 0/7] drm/amdgpu: Improve retry fault handling (v2) Timur Kristóf
2026-05-25 11:45 ` [PATCH 1/7] drm/amdgpu: Use gmc->noretry instead of amdgpu_noretry directly Timur Kristóf
2026-05-25 11:45 ` [PATCH 2/7] drm/amdgpu/gfxhub: Program CRASH_ON_*_FAULT bits to 0 as needed Timur Kristóf
2026-05-26 15:00   ` Alex Deucher
2026-05-25 11:45 ` [PATCH 3/7] drm/amdgpu/gmc: Don't compare page fault timestamps with other interrupts Timur Kristóf
2026-06-15 14:32   ` Tvrtko Ursulin
2026-06-15 14:52     ` Timur Kristóf
2026-06-15 15:23       ` Tvrtko Ursulin
2026-06-15 15:32         ` Timur Kristóf
2026-06-15 15:48           ` Tvrtko Ursulin
2026-06-16 10:15             ` Christian König
2026-06-16 11:17               ` Timur Kristóf
2026-06-16 12:48                 ` Christian König
2026-05-25 11:45 ` [PATCH 4/7] drm/amdgpu/ih: Add retry_cam_ack IH function pointer Timur Kristóf
2026-06-15 14:44   ` Tvrtko Ursulin
2026-06-15 15:02     ` Timur Kristóf
2026-06-16 10:34   ` Christian König
2026-05-25 11:45 ` [PATCH 5/7] drm/amdgpu/gfxhub: Enable retry fault interrupts when needed Timur Kristóf
2026-06-16  8:02   ` Tvrtko Ursulin
2026-06-16 11:54     ` Timur Kristóf
2026-05-25 11:45 ` [PATCH 6/7] drm/amdgpu/gfxhub: Respect noretry flag for retry faults on GFX12.1 Timur Kristóf
2026-06-16  8:09   ` Tvrtko Ursulin
2026-06-16 11:57     ` Timur Kristóf
2026-06-16 12:16       ` Tvrtko Ursulin
2026-06-16 12:36         ` Timur Kristóf [this message]
2026-05-25 11:45 ` [PATCH 7/7] drm/amdgpu: Enable retry CAM on Navi 3 dGPUs Timur Kristóf
