Hi,
Please review the patch set that supports amdgpu VM updates via the CPU. This feature provides improved performance for compute (HSA), where mapping / unmapping is carried out by the kernel independently of command submissions (which are done directly by user space). This version doesn't support shadow copies of VM page tables for CPU-based updates.
I think your improved performance comes from less waiting on CS.
In general, updating page tables with the SDMA engine is faster than with the CPU;
otherwise we wouldn't need SDMA for updating PTs.