Re: [PATCH] drm/amdgpu: fix ocl test performance drop

From: Flora Cui <flora.cui-5C7GfCeVMHo@public.gmane.org>
To: "Christian König" <deathsimple-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
Cc: zhoucm1 <david1.zhou-5C7GfCeVMHo@public.gmane.org>,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
Subject: Re: [PATCH] drm/amdgpu: fix ocl test performance drop
Date: Mon, 22 May 2017 11:42:27 +0800	[thread overview]
Message-ID: <20170522034224.GA27084@flora> (raw)
In-Reply-To: <0246394a-ddbb-6478-b36f-8fb7e9213acb-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>

Could anyone give me a RB?

On Fri, May 19, 2017 at 01:59:16PM +0200, Christian König wrote:
> I'm pretty sure that bug is fixed in GFX9. So we don't need the check there
> any more.
> 
> Regards,
> Christian.
> 
> Am 19.05.2017 um 07:16 schrieb zhoucm1:
> >I also realized this problem when I did fix for sched fence, I thought if
> >no issue, just left it as current code, it's more safe.
> >As Flora point it out, it results in performance drop, then we need to
> >correct it back.
> >pipeline_sync in vm flush should be inserted only when
> >amdgpu_vm_ring_has_compute_vm_bug() is true, not all time for every vm
> >flush for all engines. The bug has been fixed in mec firmware from version
> >673 in gfx8.
> >About gfx9 for amdgpu_vm_ring_has_compute_vm_bug(), current implementation
> >is thinking gfx9 doesn't have compute vm bug by default. Not sure it's in
> >there. If no vm fault related to this bug, then we can think gfx9 has no
> >compute vm bug, I believe firmware team does that based on previous
> >generation, otherwise we need identify every generation, that's crazy.
> >
> >Regards,
> >David Zhou
> >On 2017年05月19日 11:36, Flora Cui wrote:
> >>btw, what's about gfx9 for amdgpu_vm_ring_has_compute_vm_bug()? Is the
> >>workaround still needed?
> >>
> >>On Fri, May 19, 2017 at 10:25:19AM +0800, Flora Cui wrote:
> >>>On Thu, May 18, 2017 at 01:38:15PM +0200, Christian König wrote:
> >>>>Am 18.05.2017 um 09:45 schrieb Flora Cui:
> >>>>>partial revert commit <6971d3d> - drm/amdgpu: cleanup logic in
> >>>>>amdgpu_vm_flush
> >>>>>
> >>>>>Change-Id: Iadce9d613dfe9a739643a74050cea55854832adb
> >>>>>Signed-off-by: Flora Cui <Flora.Cui@amd.com>
> >>>>I don't see how the revert should be faster than the original.
> >>>>
> >>>>Especially that amdgpu_vm_had_gpu_reset() is now called twice sounds
> >>>>like
> >>>>more overhead than necessary.
> >>>>
> >>>>Please explain further.
> >>>>
> >>>>Christian.
> >>>>
> >>>>>---
> >>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 14 +++++---------
> >>>>>  1 file changed, 5 insertions(+), 9 deletions(-)
> >>>>>
> >>>>>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> >>>>>b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> >>>>>index 88420dc..a96bad6 100644
> >>>>>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> >>>>>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> >>>>>@@ -743,23 +743,19 @@ int amdgpu_vm_flush(struct amdgpu_ring
> >>>>>*ring, struct amdgpu_job *job)
> >>>>>          id->gws_size != job->gws_size ||
> >>>>>          id->oa_base != job->oa_base ||
> >>>>>          id->oa_size != job->oa_size);
> >>>>>-    bool vm_flush_needed = job->vm_needs_flush ||
> >>>>>-        amdgpu_vm_ring_has_compute_vm_bug(ring);
> >>>>>      unsigned patch_offset = 0;
> >>>>>      int r;
> >>>>>-    if (amdgpu_vm_had_gpu_reset(adev, id)) {
> >>>>>-        gds_switch_needed = true;
> >>>>>-        vm_flush_needed = true;
> >>>>>-    }
> >>>>>-
> >>>>>-    if (!vm_flush_needed && !gds_switch_needed)
> >>>>>+    if (!job->vm_needs_flush && !gds_switch_needed &&
> >>>>>+        !amdgpu_vm_had_gpu_reset(adev, id) &&
> >>>>>+        !amdgpu_vm_ring_has_compute_vm_bug(ring))
> >>>>>          return 0;
> >>>>>      if (ring->funcs->init_cond_exec)
> >>>>>          patch_offset = amdgpu_ring_init_cond_exec(ring);
> >>>>>-    if (ring->funcs->emit_vm_flush && vm_flush_needed) {
> >>>[flora]: for compute ring & amdgpu_vm_ring_has_compute_vm_bug(), a
> >>>vm_flush is
> >>>inserted. This might cause performance drop.
> >>>>>+    if (ring->funcs->emit_vm_flush &&
> >>>>>+        (job->vm_needs_flush || amdgpu_vm_had_gpu_reset(adev,
> >>>>>id))) {
> >>>>>          struct fence *fence;
> >>>>>          trace_amdgpu_vm_flush(ring, job->vm_id, job->vm_pd_addr);
> >>>>
> >>>_______________________________________________
> >>>amd-gfx mailing list
> >>>amd-gfx@lists.freedesktop.org
> >>>https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> >>_______________________________________________
> >>amd-gfx mailing list
> >>amd-gfx@lists.freedesktop.org
> >>https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> >
> >_______________________________________________
> >amd-gfx mailing list
> >amd-gfx@lists.freedesktop.org
> >https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> 
> 
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx