From: "Koenig, Christian" <Christian.Koenig-5C7GfCeVMHo@public.gmane.org>
To: "Kuehling, Felix" <Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>,
"Olsak, Marek" <Marek.Olsak-5C7GfCeVMHo@public.gmane.org>,
"Zhou, David(ChunMing)" <David1.Zhou-5C7GfCeVMHo@public.gmane.org>,
"Liang, Prike" <Prike.Liang-5C7GfCeVMHo@public.gmane.org>,
"dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org"
<dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org>,
"amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org"
<amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org>
Subject: Re: [PATCH 10/10] drm/amdgpu: stop removing BOs from the LRU v3
Date: Mon, 27 May 2019 10:51:06 +0000
Message-ID: <198d6abf-146d-c8f8-5602-37b95cd6b809@amd.com>
In-Reply-To: <776d29df-428f-ad98-8e38-4b191b602abb-5C7GfCeVMHo@public.gmane.org>

On 24.05.19 at 23:34, Kuehling, Felix wrote:
> On 2019-05-23 5:06 a.m., Christian König wrote:
>> Leaving BOs on the LRU is harmless. We have always done this for VM page
>> table and per-VM BOs.
>>
>> The key point is that BOs which couldn't be reserved can't be evicted.
>> What happened is that an application used basically all of VRAM during
>> CS, and because of this the X server couldn't pin a BO for scanout.
>>
>> Now we keep the BOs on the LRU and modify TTM to block for the CS to
>> complete, which in turn allows the X server to pin its BO for scanout.
>
> OK, let me rephrase that to make sure I understand it correctly. I think
> the point is that eviction candidates come from an LRU list, so leaving
> things on the LRU makes more BOs available for eviction and avoids OOM
> situations. To take advantage of that, patch 6 adds the ability to wait
> for reserved BOs when there is nothing easier to evict.
>
> ROCm applications like to use lots of memory. So it probably makes sense
> for us to stop removing our BOs from the LRU as well while we
> mass-validate our BOs in amdgpu_amdkfd_gpuvm_restore_process_bos.

Well, that would make concurrent calls of
amdgpu_amdkfd_gpuvm_restore_process_bos() wait for each other.

If that's what you want, then yes, that certainly makes sense.

Regards,
Christian.

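The eviction argument discussed in this thread can be sketched with a small
user-space model in C. This is only an illustration under stated assumptions,
not TTM code: struct toy_bo, toy_reserve_all() and toy_pick_victim() are
hypothetical names, and the del_lru flag merely mimics the last argument of
ttm_eu_reserve_buffers() that the patch flips from true to false. The sketch
shows that once reserved BOs stay on the LRU, the evictor can at least find a
busy BO to wait for instead of finding nothing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical types, not the real TTM structures. */
struct toy_bo {
	const char *name;
	bool reserved;	/* locked by a submission that is in flight */
	bool on_lru;	/* still visible to the evictor */
};

enum evict_result { EVICT_NOW, EVICT_AFTER_WAIT, EVICT_NONE };

/*
 * Walk the LRU from oldest to newest. Prefer a BO that is not reserved;
 * otherwise remember the oldest reserved BO so the caller can block on it.
 */
enum evict_result toy_pick_victim(struct toy_bo *lru, size_t n,
				  struct toy_bo **victim)
{
	struct toy_bo *busy = NULL;

	assert(lru && victim);

	for (size_t i = 0; i < n; i++) {
		if (!lru[i].on_lru)
			continue;	/* invisible: can never be evicted */
		if (!lru[i].reserved) {
			*victim = &lru[i];
			return EVICT_NOW;
		}
		if (!busy)
			busy = &lru[i];
	}

	*victim = busy;
	return busy ? EVICT_AFTER_WAIT : EVICT_NONE;
}

/* Reserve every BO; del_lru == true models the old "remove from LRU" mode. */
void toy_reserve_all(struct toy_bo *lru, size_t n, bool del_lru)
{
	for (size_t i = 0; i < n; i++) {
		lru[i].reserved = true;
		lru[i].on_lru = !del_lru;
	}
}
```

With every BO reserved and del_lru set, toy_pick_victim() returns EVICT_NONE,
which corresponds to the OOM case. With del_lru cleared it returns
EVICT_AFTER_WAIT for the oldest BO, which is the situation where concurrent
callers end up waiting for each other.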
>
> Regards,
> Felix
>
>
>> Christian.
>>
>> On 22.05.19 at 21:43, Kuehling, Felix wrote:
>>> Can you explain how this avoids OOM situations? When is it safe to leave
>>> a reserved BO on the LRU list? Could we do the same thing in
>>> amdgpu_amdkfd_gpuvm.c? And if we did, what would be the expected side
>>> effects or consequences?
>>>
>>> Thanks,
>>> Felix
>>>
>>> On 2019-05-22 8:59 a.m., Christian König wrote:
>>>> This avoids OOM situations when we have lots of threads
>>>> submitting at the same time.
>>>>
>>>> v3: apply this to the whole driver, not just CS
>>>>
>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>> ---
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 2 +-
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 4 ++--
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 2 +-
>>>> 4 files changed, 5 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>> index 20f2955d2a55..3e2da24cd17a 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>> @@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>>>> }
>>>>
>>>> r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
>>>> - &duplicates, true);
>>>> + &duplicates, false);
>>>> if (unlikely(r != 0)) {
>>>> if (r != -ERESTARTSYS)
>>>> DRM_ERROR("ttm_eu_reserve_buffers failed.\n");
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
>>>> index 06f83cac0d3a..f660628e6af9 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
>>>> @@ -79,7 +79,7 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm,
>>>> list_add(&csa_tv.head, &list);
>>>> amdgpu_vm_get_pd_bo(vm, &list, &pd);
>>>>
>>>> - r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL, true);
>>>> + r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL, false);
>>>> if (r) {
>>>> DRM_ERROR("failed to reserve CSA,PD BOs: err=%d\n", r);
>>>> return r;
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>>> index d513a5ad03dd..ed25a4e14404 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>>> @@ -171,7 +171,7 @@ void amdgpu_gem_object_close(struct drm_gem_object *obj,
>>>>
>>>> amdgpu_vm_get_pd_bo(vm, &list, &vm_pd);
>>>>
>>>> - r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates, true);
>>>> + r = ttm_eu_reserve_buffers(&ticket, &list, false, &duplicates, false);
>>>> if (r) {
>>>> dev_err(adev->dev, "leaking bo va because "
>>>> "we fail to reserve bo (%d)\n", r);
>>>> @@ -608,7 +608,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
>>>>
>>>> amdgpu_vm_get_pd_bo(&fpriv->vm, &list, &vm_pd);
>>>>
>>>> - r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates, true);
>>>> + r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates, false);
>>>> if (r)
>>>> goto error_unref;
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>>> index c430e8259038..d60593cc436e 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>>> @@ -155,7 +155,7 @@ static inline int amdgpu_bo_reserve(struct amdgpu_bo *bo, bool no_intr)
>>>> struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
>>>> int r;
>>>>
>>>> - r = ttm_bo_reserve(&bo->tbo, !no_intr, false, NULL);
>>>> + r = __ttm_bo_reserve(&bo->tbo, !no_intr, false, NULL);
>>>> if (unlikely(r != 0)) {
>>>> if (r != -ERESTARTSYS)
>>>> dev_err(adev->dev, "%p reserve failed\n", bo);
>>>> --
>>>> 2.17.1
>>>>
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Thread overview: 28+ messages
2019-05-22 12:59 [PATCH 01/10] drm/ttm: Make LRU removal optional Christian König
2019-05-22 12:59 ` [PATCH 02/10] drm/ttm: return immediately in case of a signal Christian König
2019-05-22 12:59 ` [PATCH 03/10] drm/ttm: remove manual placement preference Christian König
[not found] ` <20190522125947.4592-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2019-05-22 12:59 ` [PATCH 04/10] drm/ttm: cleanup ttm_bo_mem_space Christian König
2019-05-22 12:59 ` [PATCH 05/10] drm/ttm: immediately move BOs to the new LRU v2 Christian König
2019-05-22 12:59 ` [PATCH 06/10] drm/ttm: fix busy memory to fail other user v10 Christian König
2019-05-23 10:24 ` zhoucm1
2019-05-23 11:03 ` Christian König
[not found] ` <16918096-1430-d581-7284-a987aacb89da-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2019-05-23 11:50 ` Chunming Zhou
[not found] ` <5d68ba04-250d-918e-3633-ec45e5b18904-5C7GfCeVMHo@public.gmane.org>
2019-05-23 14:15 ` Koenig, Christian
2019-05-24 5:35 ` Liang, Prike
[not found] ` <MN2PR12MB35364235378F29899838CD80FB020-rweVpJHSKTovpq7YPKzLfQdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2019-05-24 8:49 ` Christian König
[not found] ` <20190522125947.4592-6-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2019-06-26 6:36 ` Kuehling, Felix
2019-05-22 12:59 ` [PATCH 07/10] drm/amd/display: use ttm_eu_reserve_buffers instead of amdgpu_bo_reserve v2 Christian König
2019-05-22 12:59 ` [PATCH 08/10] drm/amdgpu: drop some validation failure messages Christian König
2019-05-22 12:59 ` [PATCH 09/10] drm/amdgpu: create GDS, GWS and OA in system domain Christian König
2019-05-23 9:15 ` [PATCH 01/10] drm/ttm: Make LRU removal optional zhoucm1
[not found] ` <fbb023f9-28e7-2ac8-994f-e262da597098-5C7GfCeVMHo@public.gmane.org>
2019-05-23 9:39 ` Christian König
2019-05-22 12:59 ` [PATCH 10/10] drm/amdgpu: stop removing BOs from the LRU v3 Christian König
[not found] ` <20190522125947.4592-10-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2019-05-22 19:43 ` Kuehling, Felix
[not found] ` <48ac98a8-de22-3549-5d63-078a0effab72-5C7GfCeVMHo@public.gmane.org>
2019-05-23 9:06 ` Christian König
[not found] ` <eea6245e-616d-eb16-8521-2f21ce5d6d25-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2019-05-24 21:34 ` Kuehling, Felix
[not found] ` <776d29df-428f-ad98-8e38-4b191b602abb-5C7GfCeVMHo@public.gmane.org>
2019-05-27 10:51 ` Koenig, Christian [this message]
2019-05-23 8:27 ` Liang, Prike
-- strict thread matches above, loose matches on Subject: below --
2019-05-28 16:25 [PATCH 01/10] drm/ttm: Make LRU removal optional v2 Christian König
[not found] ` <20190528162557.1280-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2019-05-28 16:25 ` [PATCH 10/10] drm/amdgpu: stop removing BOs from the LRU v3 Christian König
2019-05-29 12:26 [PATCH 01/10] drm/ttm: Make LRU removal optional v2 Christian König
2019-05-29 12:27 ` [PATCH 10/10] drm/amdgpu: stop removing BOs from the LRU v3 Christian König
[not found] ` <20190529122702.13035-10-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2019-05-29 13:10 ` Zhou, David(ChunMing)
2019-05-29 13:40 ` Pelloux-prayer, Pierre-eric