From: "Lazar, Lijo" <lijo.lazar@amd.com>
To: "Das, Nirmoy" <nirmoy.das@amd.com>, amd-gfx@lists.freedesktop.org
Cc: Christian.Koenig@amd.com, andrey.grodzovsky@amd.com
Subject: Re: [PATCH v2 3/3] drm/amdgpu: recover gart table at resume
Date: Wed, 20 Oct 2021 15:33:49 +0530 [thread overview]
Message-ID: <4897383c-b931-3bf5-e3c3-04ee9e3467ca@amd.com> (raw)
In-Reply-To: <2715b2e9-4f7e-a356-fbdf-48006985a252@amd.com>
On 10/20/2021 3:23 PM, Das, Nirmoy wrote:
>
> On 10/20/2021 11:11 AM, Lazar, Lijo wrote:
>>
>>
>> On 10/19/2021 11:44 PM, Nirmoy Das wrote:
>>> Get rid off pin/unpin of gart BO at resume/suspend and
>>> instead pin only once and try to recover gart content
>>> at resume time. This is much more stable in case there
>>> is OOM situation at 2nd call to amdgpu_device_evict_resources()
>>> while evicting GART table.
>>>
>>> Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 42 ++++++++++++----------
>>> drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 9 ++---
>>> drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 10 +++---
>>> drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 10 +++---
>>> drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 9 ++---
>>> 6 files changed, 45 insertions(+), 39 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index 5807df52031c..f69e613805db 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -3941,10 +3941,6 @@ int amdgpu_device_suspend(struct drm_device
>>> *dev, bool fbcon)
>>> amdgpu_fence_driver_hw_fini(adev);
>>>
>>> amdgpu_device_ip_suspend_phase2(adev);
>>> - /* This second call to evict device resources is to evict
>>> - * the gart page table using the CPU.
>>> - */
>>> - amdgpu_device_evict_resources(adev);
>>>
>>> return 0;
>>> }
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>>> index d3e4203f6217..97a9f61fa106 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>>> @@ -107,33 +107,37 @@ void amdgpu_gart_dummy_page_fini(struct
>>> amdgpu_device *adev)
>>> *
>>> * @adev: amdgpu_device pointer
>>> *
>>> - * Allocate video memory for GART page table
>>> + * Allocate and pin video memory for GART page table
>>> * (pcie r4xx, r5xx+). These asics require the
>>> * gart table to be in video memory.
>>> * Returns 0 for success, error for failure.
>>> */
>>> int amdgpu_gart_table_vram_alloc(struct amdgpu_device *adev)
>>> {
>>> + struct amdgpu_bo_param bp;
>>> int r;
>>>
>>> - if (adev->gart.bo == NULL) {
>>> - struct amdgpu_bo_param bp;
>>> -
>>> - memset(&bp, 0, sizeof(bp));
>>> - bp.size = adev->gart.table_size;
>>> - bp.byte_align = PAGE_SIZE;
>>> - bp.domain = AMDGPU_GEM_DOMAIN_VRAM;
>>> - bp.flags = AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
>>> - AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
>>> - bp.type = ttm_bo_type_kernel;
>>> - bp.resv = NULL;
>>> - bp.bo_ptr_size = sizeof(struct amdgpu_bo);
>>> -
>>> - r = amdgpu_bo_create(adev, &bp, &adev->gart.bo);
>>> - if (r) {
>>> - return r;
>>> - }
>>> - }
>>> + if (adev->gart.bo != NULL)
>>> + return 0;
>>> +
>>> + memset(&bp, 0, sizeof(bp));
>>> + bp.size = adev->gart.table_size;
>>> + bp.byte_align = PAGE_SIZE;
>>> + bp.domain = AMDGPU_GEM_DOMAIN_VRAM;
>>> + bp.flags = AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
>>> + AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
>>> + bp.type = ttm_bo_type_kernel;
>>> + bp.resv = NULL;
>>> + bp.bo_ptr_size = sizeof(struct amdgpu_bo);
>>> +
>>> + r = amdgpu_bo_create(adev, &bp, &adev->gart.bo);
>>> + if (r)
>>> + return r;
>>> +
>>> + r = amdgpu_gart_table_vram_pin(adev);
>>> + if (r)
>>> + return r;
>>> +
>>> return 0;
>>> }
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>>> index 3ec5ff5a6dbe..75d584e1b0e9 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>>> @@ -992,9 +992,11 @@ static int gmc_v10_0_gart_enable(struct
>>> amdgpu_device *adev)
>>> return -EINVAL;
>>> }
>>>
>>> - r = amdgpu_gart_table_vram_pin(adev);
>>> - if (r)
>>> - return r;
>>> + if (adev->in_suspend) {
>>> + r = amdgpu_gtt_mgr_recover(adev);
>>
>> When the existing usage of this function is checked, this is called
>> during reset recovery after ip resume phase1. Can't the same thing be
>> done in ip_resume() to place this after phase1 resume instead of
>> repeating in every IP version?
>
>
> Placing amdgpu_gtt_mgr_recover() after phase1 generally works but
> gmc_v10_0_gart_enable() seems to be correct place to do this
>
> gart specific work.
>
I see. In that case probably the patch needs to change other call places
also as this step is already taken care in gart enable.
Thanks,
Lijo
>
> Regards,
>
> Nirmoy
>
>
>
>>
>> Thanks,
>> Lijo
>>
>>> + if (r)
>>> + return r;
>>> + }
>>>
>>> r = adev->gfxhub.funcs->gart_enable(adev);
>>> if (r)
>>> @@ -1062,7 +1064,6 @@ static void gmc_v10_0_gart_disable(struct
>>> amdgpu_device *adev)
>>> {
>>> adev->gfxhub.funcs->gart_disable(adev);
>>> adev->mmhub.funcs->gart_disable(adev);
>>> - amdgpu_gart_table_vram_unpin(adev);
>>> }
>>>
>>> static int gmc_v10_0_hw_fini(void *handle)
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
>>> index 0a50fdaced7e..02e90d9443c1 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
>>> @@ -620,9 +620,12 @@ static int gmc_v7_0_gart_enable(struct
>>> amdgpu_device *adev)
>>> dev_err(adev->dev, "No VRAM object for PCIE GART.\n");
>>> return -EINVAL;
>>> }
>>> - r = amdgpu_gart_table_vram_pin(adev);
>>> - if (r)
>>> - return r;
>>> +
>>> + if (adev->in_suspend) {
>>> + r = amdgpu_gtt_mgr_recover(adev);
>>> + if (r)
>>> + return r;
>>> + }
>>>
>>> table_addr = amdgpu_bo_gpu_offset(adev->gart.bo);
>>>
>>> @@ -758,7 +761,6 @@ static void gmc_v7_0_gart_disable(struct
>>> amdgpu_device *adev)
>>> tmp = REG_SET_FIELD(tmp, VM_L2_CNTL, ENABLE_L2_CACHE, 0);
>>> WREG32(mmVM_L2_CNTL, tmp);
>>> WREG32(mmVM_L2_CNTL2, 0);
>>> - amdgpu_gart_table_vram_unpin(adev);
>>> }
>>>
>>> /**
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>>> index 492ebed2915b..dc2577e37688 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>>> @@ -837,9 +837,12 @@ static int gmc_v8_0_gart_enable(struct
>>> amdgpu_device *adev)
>>> dev_err(adev->dev, "No VRAM object for PCIE GART.\n");
>>> return -EINVAL;
>>> }
>>> - r = amdgpu_gart_table_vram_pin(adev);
>>> - if (r)
>>> - return r;
>>> +
>>> + if (adev->in_suspend) {
>>> + r = amdgpu_gtt_mgr_recover(adev);
>>> + if (r)
>>> + return r;
>>> + }
>>>
>>> table_addr = amdgpu_bo_gpu_offset(adev->gart.bo);
>>>
>>> @@ -992,7 +995,6 @@ static void gmc_v8_0_gart_disable(struct
>>> amdgpu_device *adev)
>>> tmp = REG_SET_FIELD(tmp, VM_L2_CNTL, ENABLE_L2_CACHE, 0);
>>> WREG32(mmVM_L2_CNTL, tmp);
>>> WREG32(mmVM_L2_CNTL2, 0);
>>> - amdgpu_gart_table_vram_unpin(adev);
>>> }
>>>
>>> /**
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> index cb82404df534..732d91dbf449 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> @@ -1714,9 +1714,11 @@ static int gmc_v9_0_gart_enable(struct
>>> amdgpu_device *adev)
>>> return -EINVAL;
>>> }
>>>
>>> - r = amdgpu_gart_table_vram_pin(adev);
>>> - if (r)
>>> - return r;
>>> + if (adev->in_suspend) {
>>> + r = amdgpu_gtt_mgr_recover(adev);
>>> + if (r)
>>> + return r;
>>> + }
>>>
>>> r = adev->gfxhub.funcs->gart_enable(adev);
>>> if (r)
>>> @@ -1793,7 +1795,6 @@ static void gmc_v9_0_gart_disable(struct
>>> amdgpu_device *adev)
>>> {
>>> adev->gfxhub.funcs->gart_disable(adev);
>>> adev->mmhub.funcs->gart_disable(adev);
>>> - amdgpu_gart_table_vram_unpin(adev);
>>> }
>>>
>>> static int gmc_v9_0_hw_fini(void *handle)
>>> --
>>> 2.32.0
>>>
next prev parent reply other threads:[~2021-10-20 10:04 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-10-19 18:14 [PATCH 1/3] drm/amdgpu: do not pass ttm_resource_manager to gtt_mgr Nirmoy Das
2021-10-19 18:14 ` [PATCH 2/3] drm/amdgpu: do not pass ttm_resource_manager to vram_mgr Nirmoy Das
2021-10-19 18:14 ` [PATCH v2 3/3] drm/amdgpu: recover gart table at resume Nirmoy Das
2021-10-20 6:52 ` Christian König
2021-10-20 8:41 ` Das, Nirmoy
2021-10-20 9:11 ` Lazar, Lijo
2021-10-20 9:53 ` Das, Nirmoy
2021-10-20 10:03 ` Lazar, Lijo [this message]
2021-10-20 10:12 ` Das, Nirmoy
2021-10-20 10:15 ` Lazar, Lijo
2021-10-20 10:21 ` Das, Nirmoy
2021-10-20 10:51 ` Christian König
2021-10-20 11:10 ` Das, Nirmoy
2021-10-20 6:49 ` [PATCH 1/3] drm/amdgpu: do not pass ttm_resource_manager to gtt_mgr Christian König
2021-10-20 8:48 ` Das, Nirmoy
2021-10-20 9:19 ` Lazar, Lijo
2021-10-20 10:49 ` Christian König
2021-10-20 11:12 ` Das, Nirmoy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4897383c-b931-3bf5-e3c3-04ee9e3467ca@amd.com \
--to=lijo.lazar@amd.com \
--cc=Christian.Koenig@amd.com \
--cc=amd-gfx@lists.freedesktop.org \
--cc=andrey.grodzovsky@amd.com \
--cc=nirmoy.das@amd.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox