From: "Christian König" <ckoenig.leichtzumerken@gmail.com>
To: Daniel Tang <danielzgtg.opensource@gmail.com>,
	amd-gfx@lists.freedesktop.org
Cc: Xiaogang Chen <xiaogang.chen@amd.com>,
	Alex Deucher <alexander.deucher@amd.com>,
	Felix Kuehling <Felix.Kuehling@amd.com>
Subject: Re: [PATCH] Revert "drm/amdgpu: remove vm sanity check from amdgpu_vm_make_compute"
Date: Mon, 23 Oct 2023 17:15:43 +0200
Message-ID: <38742906-869c-4bc6-9cce-ea3ea98873d4@gmail.com>
In-Reply-To: <5984374.lOV4Wx5bFT@daniel-desktop2>

On 23.10.23 at 15:06, Daniel Tang wrote:
> That commit causes the screen to freeze a few moments after running
> clinfo on v6.6-rc7 and ROCm 5.6. Sometimes the rest of the computer
> including ssh also freezes. On v6.5-rc1, it only results in a NULL pointer
> dereference message in dmesg and the process becoming a zombie whose
> unkillability prevents shutdown without REISUB. Although llama.cpp and
> hashcat worked with v6.2 and ROCm 5.6, have since broken, and are not
> fixed by this revert, pytorch-rocm now works stably, without the
> whole-computer freezes caused by accidentally running clinfo.
>
> This reverts commit 1d7776cc148b9f2f3ebaf1181662ba695a29f639.

That result doesn't make much sense. Felix, please correct me if I'm
wrong, but AFAIK the ATS stuff has been completely removed by now.

Are you sure that this is pure v6.6-rc7 without any other patches
applied? If yes, then we must have missed something.

Regards,
Christian.

>
> Closes: https://github.com/RadeonOpenCompute/ROCm/issues/2596
> Signed-off-by: Daniel Tang <danielzgtg.opensource@gmail.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 12 ++++++------
>   1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 82f25996ff5e..602f311ab766 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -2243,16 +2243,16 @@ int amdgpu_vm_make_compute(struct amdgpu_device *adev, struct amdgpu_vm *vm)
>   	if (r)
>   		return r;
>   
> +	/* Sanity checks */
> +	if (!amdgpu_vm_pt_is_root_clean(adev, vm)) {
> +		r = -EINVAL;
> +		goto unreserve_bo;
> +	}
> +
>   	/* Check if PD needs to be reinitialized and do it before
>   	 * changing any other state, in case it fails.
>   	 */
>   	if (pte_support_ats != vm->pte_support_ats) {
> -		/* Sanity checks */
> -		if (!amdgpu_vm_pt_is_root_clean(adev, vm)) {
> -			r = -EINVAL;
> -			goto unreserve_bo;
> -		}
> -
>   		vm->pte_support_ats = pte_support_ats;
>   		r = amdgpu_vm_pt_clear(adev, vm, to_amdgpu_bo_vm(vm->root.bo),
>   				       false);
> --
> 2.40.1
>
>
>
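
For readers following the quoted diff, here is a minimal sketch of what moving
the check changes. It is a toy model, not driver code: struct toy_vm, its
root_clean field (standing in for the result of amdgpu_vm_pt_is_root_clean())
and both toy_make_compute_* functions are hypothetical names invented for this
illustration, and everything else amdgpu_vm_make_compute() does is left out.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct amdgpu_vm; only the two bits that matter. */
struct toy_vm {
	bool pte_support_ats;
	bool root_clean;	/* models what amdgpu_vm_pt_is_root_clean() would report */
};

/* Placement without the revert: the check runs only if the ATS setting changes. */
static int toy_make_compute_conditional(struct toy_vm *vm, bool pte_support_ats)
{
	if (pte_support_ats != vm->pte_support_ats) {
		if (!vm->root_clean)
			return -EINVAL;
		vm->pte_support_ats = pte_support_ats;
	}
	return 0;
}

/* Placement with the revert: the check runs on every call. */
static int toy_make_compute_unconditional(struct toy_vm *vm, bool pte_support_ats)
{
	if (!vm->root_clean)
		return -EINVAL;
	if (pte_support_ats != vm->pte_support_ats)
		vm->pte_support_ats = pte_support_ats;
	return 0;
}
```

In this model the two placements differ only when the root is not clean and
the ATS setting is unchanged: the conditional form returns 0 and carries on,
the unconditional form returns -EINVAL before touching any state.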

