AMD-GFX Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: "Das, Nirmoy" <nirmoy.das@amd.com>
To: "Lazar, Lijo" <lijo.lazar@amd.com>, amd-gfx@lists.freedesktop.org
Cc: Christian.Koenig@amd.com, andrey.grodzovsky@amd.com
Subject: Re: [PATCH v2 3/3] drm/amdgpu: recover gart table at resume
Date: Wed, 20 Oct 2021 12:12:42 +0200	[thread overview]
Message-ID: <968c352a-29db-49c0-0f79-763596785be7@amd.com> (raw)
In-Reply-To: <4897383c-b931-3bf5-e3c3-04ee9e3467ca@amd.com>


On 10/20/2021 12:03 PM, Lazar, Lijo wrote:
>
>
> On 10/20/2021 3:23 PM, Das, Nirmoy wrote:
>>
>> On 10/20/2021 11:11 AM, Lazar, Lijo wrote:
>>>
>>>
>>> On 10/19/2021 11:44 PM, Nirmoy Das wrote:
>>>> Get rid off pin/unpin of gart BO at resume/suspend and
>>>> instead pin only once and try to recover gart content
>>>> at resume time. This is much more stable in case there
>>>> is OOM situation at 2nd call to amdgpu_device_evict_resources()
>>>> while evicting GART table.
>>>>
>>>> Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
>>>> ---
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  4 ---
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   | 42 
>>>> ++++++++++++----------
>>>>   drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c     |  9 ++---
>>>>   drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c      | 10 +++---
>>>>   drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c      | 10 +++---
>>>>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c      |  9 ++---
>>>>   6 files changed, 45 insertions(+), 39 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> index 5807df52031c..f69e613805db 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> @@ -3941,10 +3941,6 @@ int amdgpu_device_suspend(struct drm_device 
>>>> *dev, bool fbcon)
>>>>       amdgpu_fence_driver_hw_fini(adev);
>>>>
>>>>       amdgpu_device_ip_suspend_phase2(adev);
>>>> -    /* This second call to evict device resources is to evict
>>>> -     * the gart page table using the CPU.
>>>> -     */
>>>> -    amdgpu_device_evict_resources(adev);
>>>>
>>>>       return 0;
>>>>   }
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>>>> index d3e4203f6217..97a9f61fa106 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>>>> @@ -107,33 +107,37 @@ void amdgpu_gart_dummy_page_fini(struct 
>>>> amdgpu_device *adev)
>>>>    *
>>>>    * @adev: amdgpu_device pointer
>>>>    *
>>>> - * Allocate video memory for GART page table
>>>> + * Allocate and pin video memory for GART page table
>>>>    * (pcie r4xx, r5xx+).  These asics require the
>>>>    * gart table to be in video memory.
>>>>    * Returns 0 for success, error for failure.
>>>>    */
>>>>   int amdgpu_gart_table_vram_alloc(struct amdgpu_device *adev)
>>>>   {
>>>> +    struct amdgpu_bo_param bp;
>>>>       int r;
>>>>
>>>> -    if (adev->gart.bo == NULL) {
>>>> -        struct amdgpu_bo_param bp;
>>>> -
>>>> -        memset(&bp, 0, sizeof(bp));
>>>> -        bp.size = adev->gart.table_size;
>>>> -        bp.byte_align = PAGE_SIZE;
>>>> -        bp.domain = AMDGPU_GEM_DOMAIN_VRAM;
>>>> -        bp.flags = AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
>>>> -            AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
>>>> -        bp.type = ttm_bo_type_kernel;
>>>> -        bp.resv = NULL;
>>>> -        bp.bo_ptr_size = sizeof(struct amdgpu_bo);
>>>> -
>>>> -        r = amdgpu_bo_create(adev, &bp, &adev->gart.bo);
>>>> -        if (r) {
>>>> -            return r;
>>>> -        }
>>>> -    }
>>>> +    if (adev->gart.bo != NULL)
>>>> +        return 0;
>>>> +
>>>> +    memset(&bp, 0, sizeof(bp));
>>>> +    bp.size = adev->gart.table_size;
>>>> +    bp.byte_align = PAGE_SIZE;
>>>> +    bp.domain = AMDGPU_GEM_DOMAIN_VRAM;
>>>> +    bp.flags = AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
>>>> +        AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS;
>>>> +    bp.type = ttm_bo_type_kernel;
>>>> +    bp.resv = NULL;
>>>> +    bp.bo_ptr_size = sizeof(struct amdgpu_bo);
>>>> +
>>>> +    r = amdgpu_bo_create(adev, &bp, &adev->gart.bo);
>>>> +    if (r)
>>>> +        return r;
>>>> +
>>>> +    r = amdgpu_gart_table_vram_pin(adev);
>>>> +    if (r)
>>>> +        return r;
>>>> +
>>>>       return 0;
>>>>   }
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>>>> index 3ec5ff5a6dbe..75d584e1b0e9 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>>>> @@ -992,9 +992,11 @@ static int gmc_v10_0_gart_enable(struct 
>>>> amdgpu_device *adev)
>>>>           return -EINVAL;
>>>>       }
>>>>
>>>> -    r = amdgpu_gart_table_vram_pin(adev);
>>>> -    if (r)
>>>> -        return r;
>>>> +    if (adev->in_suspend) {
>>>> +        r = amdgpu_gtt_mgr_recover(adev);
>>>
>>> When the existing usage of this function is checked, this is called 
>>> during reset recovery after ip resume phase1. Can't the same thing 
>>> be done in ip_resume() to place this after phase1 resume instead of 
>>> repeating in every IP version?
>>
>>
>> Placing amdgpu_gtt_mgr_recover() after phase1 generally works but 
>> gmc_v10_0_gart_enable() seems to be correct  place  to do this
>>
>> gart specific work.
>>
>
> I see. In that case probably the patch needs to change other call 
> places also as this step is already taken care in gart enable.


Do you mean amdgpu_do_asic_reset() ?


Nirmoy


>
> Thanks,
> Lijo
>
>>
>> Regards,
>>
>> Nirmoy
>>
>>
>>
>>>
>>> Thanks,
>>> Lijo
>>>
>>>> +        if (r)
>>>> +            return r;
>>>> +    }
>>>>
>>>>       r = adev->gfxhub.funcs->gart_enable(adev);
>>>>       if (r)
>>>> @@ -1062,7 +1064,6 @@ static void gmc_v10_0_gart_disable(struct 
>>>> amdgpu_device *adev)
>>>>   {
>>>>       adev->gfxhub.funcs->gart_disable(adev);
>>>>       adev->mmhub.funcs->gart_disable(adev);
>>>> -    amdgpu_gart_table_vram_unpin(adev);
>>>>   }
>>>>
>>>>   static int gmc_v10_0_hw_fini(void *handle)
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
>>>> index 0a50fdaced7e..02e90d9443c1 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
>>>> @@ -620,9 +620,12 @@ static int gmc_v7_0_gart_enable(struct 
>>>> amdgpu_device *adev)
>>>>           dev_err(adev->dev, "No VRAM object for PCIE GART.\n");
>>>>           return -EINVAL;
>>>>       }
>>>> -    r = amdgpu_gart_table_vram_pin(adev);
>>>> -    if (r)
>>>> -        return r;
>>>> +
>>>> +    if (adev->in_suspend) {
>>>> +        r = amdgpu_gtt_mgr_recover(adev);
>>>> +        if (r)
>>>> +            return r;
>>>> +    }
>>>>
>>>>       table_addr = amdgpu_bo_gpu_offset(adev->gart.bo);
>>>>
>>>> @@ -758,7 +761,6 @@ static void gmc_v7_0_gart_disable(struct 
>>>> amdgpu_device *adev)
>>>>       tmp = REG_SET_FIELD(tmp, VM_L2_CNTL, ENABLE_L2_CACHE, 0);
>>>>       WREG32(mmVM_L2_CNTL, tmp);
>>>>       WREG32(mmVM_L2_CNTL2, 0);
>>>> -    amdgpu_gart_table_vram_unpin(adev);
>>>>   }
>>>>
>>>>   /**
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>>>> index 492ebed2915b..dc2577e37688 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>>>> @@ -837,9 +837,12 @@ static int gmc_v8_0_gart_enable(struct 
>>>> amdgpu_device *adev)
>>>>           dev_err(adev->dev, "No VRAM object for PCIE GART.\n");
>>>>           return -EINVAL;
>>>>       }
>>>> -    r = amdgpu_gart_table_vram_pin(adev);
>>>> -    if (r)
>>>> -        return r;
>>>> +
>>>> +    if (adev->in_suspend) {
>>>> +        r = amdgpu_gtt_mgr_recover(adev);
>>>> +        if (r)
>>>> +            return r;
>>>> +    }
>>>>
>>>>       table_addr = amdgpu_bo_gpu_offset(adev->gart.bo);
>>>>
>>>> @@ -992,7 +995,6 @@ static void gmc_v8_0_gart_disable(struct 
>>>> amdgpu_device *adev)
>>>>       tmp = REG_SET_FIELD(tmp, VM_L2_CNTL, ENABLE_L2_CACHE, 0);
>>>>       WREG32(mmVM_L2_CNTL, tmp);
>>>>       WREG32(mmVM_L2_CNTL2, 0);
>>>> -    amdgpu_gart_table_vram_unpin(adev);
>>>>   }
>>>>
>>>>   /**
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>> index cb82404df534..732d91dbf449 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>> @@ -1714,9 +1714,11 @@ static int gmc_v9_0_gart_enable(struct 
>>>> amdgpu_device *adev)
>>>>           return -EINVAL;
>>>>       }
>>>>
>>>> -    r = amdgpu_gart_table_vram_pin(adev);
>>>> -    if (r)
>>>> -        return r;
>>>> +    if (adev->in_suspend) {
>>>> +        r = amdgpu_gtt_mgr_recover(adev);
>>>> +        if (r)
>>>> +            return r;
>>>> +    }
>>>>
>>>>       r = adev->gfxhub.funcs->gart_enable(adev);
>>>>       if (r)
>>>> @@ -1793,7 +1795,6 @@ static void gmc_v9_0_gart_disable(struct 
>>>> amdgpu_device *adev)
>>>>   {
>>>>       adev->gfxhub.funcs->gart_disable(adev);
>>>>       adev->mmhub.funcs->gart_disable(adev);
>>>> -    amdgpu_gart_table_vram_unpin(adev);
>>>>   }
>>>>
>>>>   static int gmc_v9_0_hw_fini(void *handle)
>>>> -- 
>>>> 2.32.0
>>>>

  reply	other threads:[~2021-10-20 10:12 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-19 18:14 [PATCH 1/3] drm/amdgpu: do not pass ttm_resource_manager to gtt_mgr Nirmoy Das
2021-10-19 18:14 ` [PATCH 2/3] drm/amdgpu: do not pass ttm_resource_manager to vram_mgr Nirmoy Das
2021-10-19 18:14 ` [PATCH v2 3/3] drm/amdgpu: recover gart table at resume Nirmoy Das
2021-10-20  6:52   ` Christian König
2021-10-20  8:41     ` Das, Nirmoy
2021-10-20  9:11   ` Lazar, Lijo
2021-10-20  9:53     ` Das, Nirmoy
2021-10-20 10:03       ` Lazar, Lijo
2021-10-20 10:12         ` Das, Nirmoy [this message]
2021-10-20 10:15           ` Lazar, Lijo
2021-10-20 10:21             ` Das, Nirmoy
2021-10-20 10:51               ` Christian König
2021-10-20 11:10                 ` Das, Nirmoy
2021-10-20  6:49 ` [PATCH 1/3] drm/amdgpu: do not pass ttm_resource_manager to gtt_mgr Christian König
2021-10-20  8:48   ` Das, Nirmoy
2021-10-20  9:19     ` Lazar, Lijo
2021-10-20 10:49       ` Christian König
2021-10-20 11:12         ` Das, Nirmoy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=968c352a-29db-49c0-0f79-763596785be7@amd.com \
    --to=nirmoy.das@amd.com \
    --cc=Christian.Koenig@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=andrey.grodzovsky@amd.com \
    --cc=lijo.lazar@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox