From: "Christian König" <christian.koenig@amd.com>
To: Leslie Shi <Yuliang.Shi@amd.com>,
	andrey.grodzovsky@amd.com, xinhui.pan@amd.com,
	alexander.deucher@amd.com, amd-gfx@lists.freedesktop.org
Cc: guchun.chen@amd.com
Subject: Re: [PATCH] drm/amdgpu: add drm_dev_unplug() in GPU initialization failure to prevent crash
Date: Wed, 15 Dec 2021 11:59:45 +0100
Message-ID: <a8f7b8b6-669c-86b8-78eb-e08e6ce147a2@amd.com>
In-Reply-To: <20211215084636.2133355-1-Yuliang.Shi@amd.com>

Am 15.12.21 um 09:46 schrieb Leslie Shi:
> [Why]
> In amdgpu_driver_load_kms, when amdgpu_device_init returns an error during driver modprobe, it
> starts the error-handling path immediately and also calls into amdgpu_device_unmap_mmio
> to release mapped VRAM. However, in the subsequent release callback, the driver still accesses the
> unmapped memory, such as vcn.inst[i].fw_shared_cpu_addr in vcn_v3_0_sw_fini, so a kernel crash occurs.

Mhm, interesting workaround but I'm not sure that's the right thing to do.

The question is why we are unmapping the MMIO space so early on driver
load failure in the first place. I mean, don't we need to clean up a bit?

If that's really the way to go then we should at least add a comment 
explaining why it's done that way.

Regards,
Christian.

>
> [How]
> Add drm_dev_unplug() before executing amdgpu_driver_unload_kms to prevent this crash.
> A GPU initialization failure is tolerable, but it should never result in a kernel crash.
>
> Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 2 ++
>   1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> index 651c7abfde03..7bf6aecdbb92 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> @@ -268,6 +268,8 @@ int amdgpu_driver_load_kms(struct amdgpu_device *adev, unsigned long flags)
>   		/* balance pm_runtime_get_sync in amdgpu_driver_unload_kms */
>   		if (adev->rmmio && adev->runpm)
>   			pm_runtime_put_noidle(dev->dev);
> +
> +		drm_dev_unplug(dev);
>   		amdgpu_driver_unload_kms(dev);
>   	}
>   
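
To make the ordering argument in the quoted patch easier to follow, here is a minimal userspace sketch of the idea, not kernel code. It assumes the teardown step guards its access to the mapping with an enter-style check, in the spirit of drm_dev_enter()/drm_dev_exit() in the DRM core. Every name prefixed with model_ is hypothetical and exists only for this illustration; the real amdgpu paths are considerably more involved.

```c
/* Userspace model only: none of the model_* names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_dev {
	bool unplugged;     /* set by model_dev_unplug() */
	int *fw_shared;     /* stands in for a CPU pointer into mapped VRAM */
	int writes_skipped; /* counts guarded accesses that were skipped */
};

/* Stand-in for unmapping: the pointer must not be dereferenced afterwards. */
static void model_unmap(struct model_dev *dev)
{
	dev->fw_shared = NULL;
}

/* Stand-in for marking the device as gone before teardown runs. */
static void model_dev_unplug(struct model_dev *dev)
{
	dev->unplugged = true;
}

/* Stand-in for an enter-style guard: false once the device is unplugged. */
static bool model_dev_enter(const struct model_dev *dev)
{
	return !dev->unplugged;
}

/*
 * Teardown step that touches the mapping only when the guard allows it.
 * Returns true if the memory was written, false if the access was skipped.
 */
static bool model_sw_fini(struct model_dev *dev)
{
	if (!model_dev_enter(dev)) {
		dev->writes_skipped++;
		return false;
	}
	/* Without the unplug mark, a NULL mapping here would be the crash. */
	assert(dev->fw_shared != NULL);
	*dev->fw_shared = 0;
	return true;
}

/* Failure path in the order the quoted patch proposes: unmap, unplug, fini. */
static bool model_load_failure_path(struct model_dev *dev)
{
	model_unmap(dev);
	model_dev_unplug(dev);
	return model_sw_fini(dev);
}
```

In this model, calling model_sw_fini() on a device that is still mapped and not unplugged writes through the pointer, while model_load_failure_path() skips the write because the unplug mark is set before teardown. Whether marking the device unplugged is the right fix, rather than delaying the unmap, is exactly the open question raised in the reply above.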

