From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id EC7DFCD98CE for ; Fri, 12 Jun 2026 06:54:53 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 7D06B10F2A6; Fri, 12 Jun 2026 06:54:53 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="pSxIocfe"; dkim-atps=neutral Received: from mail-wm1-f45.google.com (mail-wm1-f45.google.com [209.85.128.45]) by gabe.freedesktop.org (Postfix) with ESMTPS id 98FBE10E953 for ; Fri, 12 Jun 2026 06:54:52 +0000 (UTC) Received: by mail-wm1-f45.google.com with SMTP id 5b1f17b1804b1-4908b92904fso6183525e9.0 for ; Thu, 11 Jun 2026 23:54:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1781247291; x=1781852091; darn=lists.freedesktop.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=eZTJ4uVj4hDcSLanrhU5Pt0t0xESLCC5Ll88CHT3pR0=; b=pSxIocfe/xwDge3zK4Je9o2kYuhvdjNWleiKeQAPe/ngy5S6TdSqDzfL5Zx5sbItzN H8Jx/CnLlThHfR31sF4RYbg2BlE6dGzM2iIunr3bGKdmjpjeIMp5OTkWSljMVr88sFV5 9OVC3IM1h9B8vqjPx6lUsR5AOhdaOEnTjdW6UuOjq+Am4/V1Z5HHPc7KCoN40rpc2lVa jMjYdsVCNVJMVx35B/RYU/saP7o0/B57KdHhz++4uhwuEEr3xkXvuvuzF2OFqMsI/Yut U1i6v7BZNr0YoNbqmhAxUNqZD/SDOORvi4n/6dxC2pGpibPBSUznfmlp+2z0qYuf9Toq LX2w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1781247291; x=1781852091; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=eZTJ4uVj4hDcSLanrhU5Pt0t0xESLCC5Ll88CHT3pR0=; b=jWRh1STB2egD/Wsa+0lF/w+la7s4xeglUnky0L1cbG4SHM9EB0rufGJrM/5n74rcJk 0fDxatseslNyybaoXayJ/jW3l+Qy+x3ac1YDxjohJBvHWNCNEGL1JTKfiEpnu8OBD3cR s/n+20LUfLhJwXm88CqOGKjMnx9KoS4jpwyIAZR5zUEUZfj5EBsabPs/O26hzcy1RlUy 92itnjAuYGDADv5BdCnppIeemwtoKSg4gDoCIB0RBmGOADq5yezuO+aa2L8YsUrXExeT 9l1GRaXwkcqttPiJLmTAJiLe9QTwMIv5UIE/dGku1Dj3Zz4Xvs77gjOlR5FUueC0efCl RFyg== X-Gm-Message-State: AOJu0Yzb+XfD57P3429WcZo4TKQrNT+YDGqsAxymioFwIIArpzx5Cuym /g41PCCRepRxscqpMSJUBzrkyreWb3wwDbqQersAg0QYJdPLEKB52l44 X-Gm-Gg: Acq92OEReiV7uE8ZUoBZtK5k/NB65BMgIF87Dc+x4DrEVXaPkD0Sp6GxI7tWhRoKqos 8mq1lhgZWF/3F8uh8wXyOmJ248oWh4xRyYN6CEqWXmMf36j2fPy4hqQzpbn30ybIjw9asWLbpmH v9R7ZhBMpHsBGkJ6lN9UlOZi6U73PLLNRCfrtQKMHY5pH4PzTeMPCK+1r6pBMVeeM9/heThpe5Q +Ve7UW2PDUdejyi9EgW2oODqyJkes1QttISXUiEc8rWZ1n+87TFD48MI+J3vuJHRASBLAATnr1r HC3/e2V/MnvEs36cRfEWZ2uWl8GhBNTty1vsD0DNxl04KfvFZEb9MvHhNTr3PNr/Vu1kNOdxk30 MS+9tted5xrJKPEUnxz1ikNTfXVmffhqLsB/KnQYFO3948XbKF4gmfgaEAl1pKIesu24j07MK+A QulrU+6pxx0b4A5ZK2FfDUXuw80+thJCWnbFwSCVKFbVJzww+f1VQQIBqL9Hib/gZhXwPp42bme wdh7N/mepxC8Lxf1mDq7EbTMsCZ7g== X-Received: by 2002:a05:600c:4693:b0:490:d9d0:51c8 with SMTP id 5b1f17b1804b1-490ec504d3bmr18606035e9.18.1781247291016; Thu, 11 Jun 2026 23:54:51 -0700 (PDT) Received: from timur-max.localnet (20014C4E24EDFA00ACE56DB4A62D1E35.dsl.pool.telekom.hu. [2001:4c4e:24ed:fa00:ace5:6db4:a62d:1e35]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-4606f2e592csm2873627f8f.36.2026.06.11.23.54.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 11 Jun 2026 23:54:50 -0700 (PDT) From: Timur =?UTF-8?B?S3Jpc3TDs2Y=?= To: natalie.vock@gmx.de, honghuan@amd.com, Alexander.Deucher@amd.com, Felix.Kuehling@amd.com, Philip.Yang@amd.com, christian.koenig@amd.com Cc: amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 06/13] drm/amdgpu: add amdgpu_vm_update_leaves() Date: Fri, 12 Jun 2026 08:54:49 +0200 Message-ID: <2299736.hkbZ0PkbqX@timur-max> In-Reply-To: <20260529114031.3714-7-christian.koenig@amd.com> References: <20260529114031.3714-1-christian.koenig@amd.com> <20260529114031.3714-7-christian.koenig@amd.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" X-BeenThere: amd-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Discussion list for AMD gfx List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: amd-gfx-bounces@lists.freedesktop.org Sender: "amd-gfx" On 2026. m=C3=A1jus 29., p=C3=A9ntek 13:24:08 k=C3=B6z=C3=A9p-eur=C3=B3pai = ny=C3=A1ri id=C5=91 Christian K=C3=B6nig=20 wrote: > Add a new function amdgpu_vm_update_leaves() to avoid memory allocation > on page faults. >=20 > The idea is to only update the leave PTEs to insert a dummy PRT PTE. We should only use PRT PTEs on GPUs that do not have the SMEM PRT bug, in other words only GFX9 for now. As-is, this basically regresses all the work that I did for retry faults. Due to the SMEM PRT bug, using PRT PTEs is unacceptable because it would im= ply=20 that we can only mitigate VM faults that come from VMEM instructions. That= =20 creates unnecessary inconsistency (inability to mitigate some VM faults) an= d=20 poor user experience (GPU hangs). >=20 > TODO: HW older than GMC v9 needs a different solution. Can you elaborate what different solution is needed? As far as I know, GMC v6-v7-v8 do not use and do not need to call the=20 amdgpu_vm_handle_fault() function, they can just mitigate page faults on th= eir=20 own. >=20 > Signed-off-by: Christian K=C3=B6nig > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 40 ++++++++++--- > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 3 + > drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c | 71 ++++++++++++++++++++++- > 3 files changed, 103 insertions(+), 11 deletions(-) >=20 > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index e5588346a03f..94632a660b79 > 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > @@ -2984,10 +2984,12 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device > *adev, u32 pasid, u32 vmid, u32 node_id, uint64_t addr, > uint64_t ts, bool write_fault) > { > - bool is_compute_context =3D false; > + struct amdgpu_vm_update_params params; > + bool is_compute_context; > struct amdgpu_bo *root; > uint64_t value, flags; > struct amdgpu_vm *vm; > + unsigned int idx; > int r; >=20 > vm =3D amdgpu_vm_lock_by_pasid(adev, &root, pasid); > @@ -3029,24 +3031,46 @@ bool amdgpu_vm_handle_fault(struct amdgpu_device > *adev, u32 pasid, value =3D adev->dummy_page_addr; > flags |=3D AMDGPU_PTE_EXECUTABLE | AMDGPU_PTE_READABLE | > AMDGPU_PTE_WRITEABLE; > - > + /* On +gfx9 we can use the PRT functionality instead */ > + if (!adev->gmc.gmc_funcs->set_prt) { > + flags &=3D ~AMDGPU_PTE_VALID; > + flags |=3D AMDGPU_PTE_PRT; > + } As noted above, PRT should only be used on GFX9 (and potentially, future=20 products where the SMEM PRT bug is fixed). Also, checking the set_prt is not necessary here as this function only make= s=20 sense on GFX9 and newer. Maybe we could help the confusion by documenting both amdgpu_vm_handle_faul= t()=20 and amdgpu_gmc_handle_retry_fault() that they are only applicable to GFX9 a= nd=20 newer that support recoverable faults or retry faults. (In a separate commi= t=20 from this one, though.) > } else { > /* Let the hw retry silently on the PTE */ > value =3D 0; > } >=20 > + if (!drm_dev_enter(adev_to_drm(adev), &idx)) { > + r =3D -ENODEV; > + goto error_unlock; > + } > + > + amdgpu_vm_eviction_lock(vm); > + if (vm->evicting) { > + r =3D -EBUSY; > + goto error_dev_exit; > + } > + > + memset(¶ms, 0, sizeof(params)); > + params.adev =3D adev; > + params.vm =3D vm; > + params.immediate =3D true; > + params.pages_addr =3D NULL; > + > r =3D dma_resv_reserve_fences(root->tbo.base.resv, 1); > if (r) { > pr_debug("failed %d to reserve fence slot\n", r); > - goto error_unlock; > + goto error_eviction_lock; > } >=20 > - r =3D amdgpu_vm_update_range(adev, vm, true, false, false, false, > - NULL, addr, addr, flags, value,=20 0, NULL, NULL, NULL); > - if (r) > - goto error_unlock; > + amdgpu_vm_update_leaves(¶ms, addr, addr, value, flags); >=20 > - r =3D amdgpu_vm_update_pdes(adev, vm, true); > +error_eviction_lock: > + amdgpu_vm_eviction_unlock(vm); > + > +error_dev_exit: > + drm_dev_exit(idx); >=20 > error_unlock: > amdgpu_bo_unreserve(root); > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h index cc096c005e34..04b32accfa3f > 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h > @@ -613,6 +613,9 @@ int amdgpu_vm_pde_update(struct amdgpu_vm_update_para= ms > *params, int amdgpu_vm_ptes_update(struct amdgpu_vm_update_params *params, > uint64_t start, uint64_t end, > uint64_t dst, uint64_t flags); > +void amdgpu_vm_update_leaves(struct amdgpu_vm_update_params *params, > + uint64_t start, uint64_t end, > + int64_t dst, uint64_t flags); > void amdgpu_vm_pt_free_work(struct work_struct *work); > void amdgpu_vm_pt_free_list(struct amdgpu_device *adev, > struct amdgpu_vm_update_params *params); > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c index > e43a60d09808..9766b6b9aecc 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c > @@ -790,6 +790,8 @@ int amdgpu_vm_ptes_update(struct amdgpu_vm_update_par= ams > *params, uint64_t dst, uint64_t flags) > { > struct amdgpu_device *adev =3D params->adev; > + struct amdgpu_vm *vm =3D params->vm; > + pid_t tgid =3D vm->task_info ? vm->task_info->tgid : 0; These spurious changes in amdgpu_vm_ptes_update() look unrelated the rest o= f=20 the commit, so I think they should either be split off into a separate comm= it=20 or explained in the commit message why they are necessary here. > struct amdgpu_vm_pt_cursor cursor; > uint64_t frag_start =3D start, frag_end; > unsigned int frag; > @@ -881,7 +883,6 @@ int amdgpu_vm_ptes_update(struct amdgpu_vm_update_par= ams > *params, entry_end =3D min(entry_end, end); >=20 > do { > - struct amdgpu_vm *vm =3D params->vm; > uint64_t upd_end =3D min(entry_end, frag_end); > unsigned int nptes =3D (upd_end - frag_start)=20 >> shift; > uint64_t upd_flags =3D flags |=20 AMDGPU_PTE_FRAG(frag); > @@ -893,8 +894,7 @@ int amdgpu_vm_ptes_update(struct amdgpu_vm_update_par= ams > *params, >=20 > trace_amdgpu_vm_update_ptes(params,=20 frag_start, upd_end, > =20 min(nptes, 32u), dst, incr, > - =20 upd_flags, > - vm- >task_info ? vm->task_info->tgid : 0, > + =20 upd_flags, tgid, > vm- >immediate.fence_context); > amdgpu_vm_pte_update_flags(params,=20 to_amdgpu_bo_vm(pt), > =20 cursor.level, pe_start, dst, > @@ -938,6 +938,71 @@ int amdgpu_vm_ptes_update(struct > amdgpu_vm_update_params *params, return 0; > } >=20 > +/** > + * amdgpu_vm_update_leaves - update leave PDEs/PTEs > + * > + * @params: see amdgpu_vm_update_params definition > + * @start: start of GPU address range > + * @end: end of GPU address range > + * @dst: destination address to insert into the leave PDEs/PTEs > + * @flags: mapping flags > + * > + * Update the leave PDEs/PTEs in the range @start - @end without allocat= ing > or + * freeing page tables. > + * > + * Returns: > + * 0 for success, negative error code for failure. > + */ > +void amdgpu_vm_update_leaves(struct amdgpu_vm_update_params *params, > + uint64_t start, uint64_t end, > + int64_t dst, uint64_t flags) > +{ > + struct amdgpu_device *adev =3D params->adev; > + struct amdgpu_vm *vm =3D params->vm; > + pid_t tgid =3D vm->task_info ? vm->task_info->tgid : 0; > + struct amdgpu_vm_pt_cursor cursor; > + > + amdgpu_vm_pt_start(adev, params->vm, start, &cursor); > + while (cursor.pfn < end) { > + unsigned int shift, mask; > + uint64_t entry_end, pe_start; > + struct amdgpu_bo *pt; > + unsigned int nptes; > + > + /* Walk to the leave entries */ > + if (amdgpu_vm_pt_descendant(adev, &cursor)) > + continue; > + > + pt =3D cursor.parent->bo; > + shift =3D amdgpu_vm_pt_level_shift(adev, cursor.level -=20 1); > + mask =3D amdgpu_vm_pt_entries_mask(adev, cursor.level -=20 1); > + > + /* Looks good so far, calculate parameters for the=20 update */ > + pe_start =3D ((cursor.pfn >> shift) & mask) * 8; > + > + entry_end =3D ((uint64_t)mask + 1) << shift; > + entry_end +=3D cursor.pfn & ~(entry_end - 1); > + entry_end =3D min(entry_end, end); > + > + nptes =3D (entry_end - cursor.pfn) >> shift; > + /* > + * This can happen when we set higher level PDEs to=20 unmap and/or > + * silent to stop fault floods. > + */ > + nptes =3D max(nptes, 1u); > + > + trace_amdgpu_vm_update_ptes(params, cursor.pfn,=20 entry_end, > + min(nptes, 32u),=20 dst, 0, flags, > + tgid, > + vm- >immediate.fence_context); > + amdgpu_vm_pte_update_flags(params, to_amdgpu_bo_vm(pt), > + cursor.level - 1,=20 pe_start, dst, > + nptes, 0, flags); > + > + amdgpu_vm_pt_next(adev, &cursor); > + } > +} > + > /** > * amdgpu_vm_pt_map_tables - have bo of root PD cpu accessible > * @adev: amdgpu device structure