From: zhoucm1 <david1.zhou-5C7GfCeVMHo@public.gmane.org>
To: "Christian König"
	<deathsimple-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
Subject: Re: [PATCH 00/13] shadow page table support
Date: Tue, 26 Jul 2016 10:40:20 +0800
Message-ID: <5796CD94.6080405@amd.com>
In-Reply-To: <b2f1e133-c7e2-88c4-1e0f-d12310d734f0-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>



On 2016-07-25 18:31, Christian König wrote:
> First of all patches #10 and #11 look like bug fixes to existing code 
> to me. So we should fix those problems before working on anything else.
>
> Patch #10 is Reviewed-by: Christian König <christian.koenig@amd.com>
>
> Patch #11:
>
>>      list_for_each_entry(s_job, &sched->ring_mirror_list, node) {
>>          struct amd_sched_fence *s_fence = s_job->s_fence;
>> -        struct fence *fence = sched->ops->run_job(s_job);
>> +        struct fence *fence;
>>
>> +        spin_unlock(&sched->job_list_lock);
>> +        fence = sched->ops->run_job(s_job);
>>          atomic_inc(&sched->hw_rq_count);
>>          if (fence) {
>>              s_fence->parent = fence_get(fence);
>> @@ -451,6 +453,7 @@ void amd_sched_job_recovery(struct amd_gpu_scheduler *sched)
>>              DRM_ERROR("Failed to run job!\n");
>>              amd_sched_process_job(NULL, &s_fence->cb);
>>          }
>> +        spin_lock(&sched->job_list_lock);
>>      }
>>      spin_unlock(&sched->job_list_lock);
> The problem is that the job might complete while we dropped the lock.
>
> Please use list_for_each_entry_safe here and add a comment why the 
> list could be modified in the meantime.
>
> With that fixed the patch is Reviewed-by: Christian König 
> <christian.koenig@amd.com> as well.

OK, pushed the above two.
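
For illustration, here is a minimal userspace sketch of the pattern requested above: caching the next list entry before dropping the lock, so that the current job can unlink itself while it is being resubmitted. It is not the driver code. The list helpers only imitate the kernel's list_head, and `struct job`, `struct sched`, `lock()`, `unlock()`, `run_job()` and `demo_recover()` are hypothetical stand-ins. Note that caching the next pointer only covers removal of the current entry; it does not help if the cached next entry is removed in the meantime.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modeled on the kernel's list_head. */
struct list_node { struct list_node *prev, *next; };

static void list_init(struct list_node *h) { h->prev = h->next = h; }

static void list_add_tail(struct list_node *n, struct list_node *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

static void list_del(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = NULL; /* poison: a stale walk through n would crash */
}

/* Hypothetical job; "node" must stay the first member for the cast below. */
struct job { struct list_node node; int id; int completes_immediately; };

/* Hypothetical scheduler; lock_held is a toy stand-in for job_list_lock. */
struct sched { struct list_node mirror_list; int lock_held; int resubmitted; };

static void lock(struct sched *s)   { assert(!s->lock_held); s->lock_held = 1; }
static void unlock(struct sched *s) { assert(s->lock_held);  s->lock_held = 0; }

/* Stand-in for run_job(): the job may complete right away, and completion
 * takes the lock and unlinks the job from the mirror list. */
static void run_job(struct sched *s, struct job *j)
{
    s->resubmitted++;
    if (j->completes_immediately) {
        lock(s);
        list_del(&j->node);
        unlock(s);
    }
}

static int recover_jobs(struct sched *s)
{
    struct list_node *pos, *tmp;

    lock(s);
    /* Same shape as list_for_each_entry_safe(): tmp is read before the body
     * runs, because the list can be modified while the lock is dropped. */
    for (pos = s->mirror_list.next, tmp = pos->next;
         pos != &s->mirror_list;
         pos = tmp, tmp = pos->next) {
        struct job *j = (struct job *)pos;

        unlock(s);
        run_job(s, j);   /* may unlink j; pos->next is never read afterwards */
        lock(s);
    }
    unlock(s);
    return s->resubmitted;
}

/* Three jobs; the middle one completes while it is being resubmitted. */
int demo_recover(void)
{
    struct sched s = { .lock_held = 0, .resubmitted = 0 };
    struct job jobs[3] = {
        { .id = 0 }, { .id = 1, .completes_immediately = 1 }, { .id = 2 }
    };

    list_init(&s.mirror_list);
    for (int i = 0; i < 3; i++)
        list_add_tail(&jobs[i].node, &s.mirror_list);
    return recover_jobs(&s);
}
```

With a plain (non-safe) walk, the step after the middle job would read the poisoned `next` pointer of the job that just unlinked itself.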

>
> The remaining set looks very good to me as well, but I was rather 
> thinking of a more general approach instead of making it VM PD/PT 
> specific.
>
> For example we also need to backup/restore shaders when a hard GPU 
> reset happens.
>
> So I would suggest the following:
> 1. We add an optional "shadow" flag so that when a BO in VRAM is 
> allocated we also allocate a shadow BO in GART.
>
> 2. We have another "backup" flag that says on the next command 
> submission the BO is backed up from VRAM to GART before that submission.
>
> 3. We set the shadow flag for VM PD/PT BOs and every time we modify 
> them set the backup flag so they get backed up on next CS.
>
> 4. We add an IOCTL to allow setting the backup flag from userspace so 
> that we can trigger another backup even after the first CS.
>
> What do you think?

Sounds very good, will try.
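
As a rough sketch of how I read points 1 to 3 of the proposal, the following userspace model uses two flags per BO. All names here (`BO_FLAG_SHADOW`, `BO_FLAG_BACKUP`, `struct bo`, `bo_write()`, `cs_submit()`, `gpu_reset_and_restore()`, `demo_shadow()`) are hypothetical and not existing amdgpu interfaces, plain heap buffers stand in for VRAM and GART, and `memcpy()` stands in for the SDMA copy. The IOCTL from point 4 is not modeled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BO_FLAG_SHADOW (1u << 0) /* hypothetical: BO has a shadow copy in GART */
#define BO_FLAG_BACKUP (1u << 1) /* hypothetical: back up VRAM on the next CS */

struct bo {
    unsigned flags;
    size_t size;
    unsigned char *vram;   /* stand-in for the VRAM allocation */
    unsigned char *shadow; /* stand-in for the GART shadow BO, may be NULL */
};

/* Point 1: allocating a BO with the shadow flag also allocates its shadow. */
static int bo_create(struct bo *bo, size_t size, unsigned flags)
{
    memset(bo, 0, sizeof(*bo));
    bo->size = size;
    bo->flags = flags & BO_FLAG_SHADOW;
    bo->vram = calloc(1, size);
    if (!bo->vram)
        return -1;
    if (bo->flags & BO_FLAG_SHADOW) {
        bo->shadow = calloc(1, size);
        if (!bo->shadow) {
            free(bo->vram);
            bo->vram = NULL;
            return -1;
        }
    }
    return 0;
}

static void bo_destroy(struct bo *bo)
{
    free(bo->vram);
    free(bo->shadow);
}

/* Point 3: every modification of a shadowed BO sets the backup flag. */
static void bo_write(struct bo *bo, size_t off, unsigned char val)
{
    bo->vram[off] = val;
    if (bo->flags & BO_FLAG_SHADOW)
        bo->flags |= BO_FLAG_BACKUP;
}

/* Point 2: on command submission, flagged BOs are copied VRAM -> shadow. */
static void cs_submit(struct bo *bos, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if ((bos[i].flags & BO_FLAG_BACKUP) && bos[i].shadow) {
            memcpy(bos[i].shadow, bos[i].vram, bos[i].size);
            bos[i].flags &= ~BO_FLAG_BACKUP;
        }
    }
}

/* After a hard reset VRAM content is assumed lost; restore from the shadow. */
static void gpu_reset_and_restore(struct bo *bos, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        memset(bos[i].vram, 0xff, bos[i].size); /* simulate lost content */
        if (bos[i].shadow)
            memcpy(bos[i].vram, bos[i].shadow, bos[i].size);
    }
}

/* Write a value, submit, reset, and read the value back from "VRAM". */
int demo_shadow(void)
{
    struct bo pt;
    int v;

    if (bo_create(&pt, 16, BO_FLAG_SHADOW))
        return -1;
    bo_write(&pt, 3, 42);
    cs_submit(&pt, 1);
    gpu_reset_and_restore(&pt, 1);
    v = pt.vram[3];
    bo_destroy(&pt);
    return v;
}
```

In this model a write that is not followed by a submission before the reset is lost, which is the gap the userspace-triggered backup in point 4 is meant to cover.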

Thanks,
David Zhou
>
> Regards,
> Christian.
>
> On 25.07.2016 at 09:22, Chunming Zhou wrote:
>> Since we cannot be sure that VRAM is intact after a GPU reset, a page
>> table backup is necessary. A shadow page table is a sensible way to
>> recover the page table when a GPU reset happens.
>> We need to allocate a GTT BO as the shadow of the VRAM BO when creating
>> a page table, and keep the two identical. After a GPU reset, we use SDMA
>> to copy the GTT BO content to the VRAM BO, and the page table is then
>> recovered.
>>
>> Chunming Zhou (13):
>>    drm/amdgpu: add pd/pt bo shadow
>>    drm/amdgpu: update shadow pt bo while update pt
>>    drm/amdgpu: update pd shadow while updating pd
>>    drm/amdgpu: implement amdgpu_vm_recover_page_table_from_shadow
>>    drm/amdgpu: link all vm clients
>>    drm/amdgpu: add vm_list_lock
>>    drm/amd: add block entity function
>>    drm/amdgpu: recover page tables after gpu reset
>>    drm/amdgpu: add vm recover pt fence
>>    drm/amd: reset hw count when reset job
>>    drm/amd: fix deadlock of job_list_lock
>>    drm/amd: wait neccessary dependency before running job
>>    drm/amdgpu: fix sched deadoff
>>
>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h           |  17 ++-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c        |  12 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  30 ++++-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c       |   5 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c       |   5 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        | 161 ++++++++++++++++++++++++--
>>   drivers/gpu/drm/amd/scheduler/gpu_scheduler.c |  35 +++++-
>>   drivers/gpu/drm/amd/scheduler/gpu_scheduler.h |   3 +
>>   8 files changed, 250 insertions(+), 18 deletions(-)
>>
>

