From mboxrd@z Thu Jan 1 00:00:00 1970
From: zhoucm1
Subject: Re: Support for amdgpu VM update via CPU on large-bar systems
Date: Fri, 12 May 2017 16:44:01 +0800
Message-ID: <591575D1.3020909@amd.com>
References: <5915717A.3000209@amd.com> <21e206ed-8d60-450f-6d23-e01c68c5ad73@vodafone.de> <59157453.8010308@amd.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 8bit
List-Id: Discussion list for AMD gfx
Errors-To: amd-gfx-bounces-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
Sender: "amd-gfx"
To: Christian König, "Kasiviswanathan, Harish", "amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org"

On 2017-05-12 16:43, Christian König wrote:
> On 12.05.2017 at 10:37, zhoucm1 wrote:
>>
>> On 2017-05-12 16:33, Christian König wrote:
>>> On 12.05.2017 at 10:25, zhoucm1 wrote:
>>>>
>>>> On 2017-05-10 05:47, Kasiviswanathan, Harish wrote:
>>>>> Hi,
>>>>>
>>>>> Please review the patch set that supports amdgpu VM update via
>>>>> CPU. This feature provides improved performance for compute
>>>>> (HSA), where mapping / unmapping is carried out (by the kernel)
>>>>> independently of command submissions (done directly by user
>>>>> space). This version doesn't support shadow copy of VM page
>>>>> tables for CPU-based update.
>>>> I think your improved performance comes from less waiting for CS.
>>>> Generally, the SDMA engine updates page tables faster than the
>>>> CPU; otherwise we wouldn't need SDMA for updating PTs.
>>>> So does this improvement show that we have some redundant sync
>>>> when mapping / unmapping? If yes, we should fix that, and then I'm
>>>> not sure whether the CPU method is needed.
>>>
>>> The problem is that the KFD is designed to do page table updates
>>> synchronously. In other words, they need to wait for the update to
>>> finish, and that takes time.
>>>
>>> Apart from that, your comment is absolutely correct: we found that
>>> the SDMA is sometimes much faster at doing the update than the CPU.
>> If the SDMA is faster, then even if they wait for it to finish, the
>> total time is still shorter than with the CPU, isn't it? Of course,
>> the precondition is that the SDMA is exclusive. They could reserve
>> an SDMA for PT updates.
>
> No, if I understood Felix's numbers correctly, the setup and wait
> time for SDMA is a bit (but not much) longer than doing it with the
> CPU.
>
> What would really help is to fix the KFD design and work with async
> page table updates there as well.
OK, no problem, just curious.

Regards,
David Zhou
>
> Regards,
> Christian.
>
>>
>> Regards,
>> David Zhou
>>>
>>> Regards,
>>> Christian.
>>>
>>>>
>>>> Regards,
>>>> David Zhou
>>>>> Best Regards,
>>>>> Harish
>>>>>
>>>>> _______________________________________________
>>>>> amd-gfx mailing list
>>>>> amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
>>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
