From: "Koenig, Christian" <Christian.Koenig-5C7GfCeVMHo@public.gmane.org>
To: "Kuehling, Felix" <Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
Cc: Jerome Glisse <j.glisse-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
"Yang, Philip" <Philip.Yang-5C7GfCeVMHo@public.gmane.org>,
"amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org"
<amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org>
Subject: RE: [PATCH] drm/amdgpu: use HMM mirror callback to replace mmu notifier v4
Date: Thu, 27 Sep 2018 06:59:32 +0000 [thread overview]
Message-ID: <a76b71ac-4b5b-45d7-b48b-6d0e4a7e7524@email.android.com> (raw)
In-Reply-To: <DM5PR12MB17078469EB6D3AF1D53B788992140-2J9CzHegvk9TCtO+SvGBKwdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
[-- Attachment #1.1: Type: text/plain, Size: 4751 bytes --]
No, that won't work. We would still run into lock inversion problems.
What we could do with the scheduler is to turn submissions into dummies if we find that the page tables are now outdated.
But that would be really hacky and I'm not sure if that would really work in all cases.
Christian.
Am 27.09.2018 08:53 schrieb "Kuehling, Felix" <Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>:
I had a chat with Jerome yesterday. He pointed out that the new blockable parameter can be used to infer whether the MMU notifier is being called in a reclaim operation. So if blockable==true, it should even be safe to take the BO reservation lock without problems. I think with that we should be able to remove the read-write locking completely and go back to locking (or try-locking for blockable==false) the reservation locks in the MMU notifier?
Regards,
Felix
-----Original Message-----
From: amd-gfx <amd-gfx-bounces-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org> On Behalf Of Christian König
Sent: Saturday, September 15, 2018 3:47 AM
To: Kuehling, Felix <Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>; Yang, Philip <Philip.Yang@amd.com>; amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org; Jerome Glisse <j.glisse-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Subject: Re: [PATCH] drm/amdgpu: use HMM mirror callback to replace mmu notifier v4
Am 14.09.2018 um 22:21 schrieb Felix Kuehling:
> On 2018-09-14 01:52 PM, Christian König wrote:
>> Am 14.09.2018 um 19:47 schrieb Philip Yang:
>>> On 2018-09-14 03:51 AM, Christian König wrote:
>>>> Am 13.09.2018 um 23:51 schrieb Felix Kuehling:
>>>>> On 2018-09-13 04:52 PM, Philip Yang wrote:
>>>>> [SNIP]
>>>>>> + amdgpu_mn_read_unlock(amn);
>>>>>> +
>>>>> amdgpu_mn_read_lock/unlock support recursive locking for multiple
>>>>> overlapping or nested invalidation ranges. But if you'r locking
>>>>> and unlocking in the same function. Is that still a concern?
>>> I don't understand the possible recursive case, but
>>> amdgpu_mn_read_lock() still support recursive locking.
>>>> Well the real problem is that unlocking them here won't work.
>>>>
>>>> We need to hold the lock until we are sure that the operation which
>>>> updates the page tables is completed.
>>>>
>>> The reason for this change is because hmm mirror has
>>> invalidate_start callback, no invalidate_end callback
>>>
>>> Check mmu_notifier.c and hmm.c again, below is entire logic to
>>> update CPU page tables and callback:
>>>
>>> mn lock amn->lock is used to protect interval tree access because
>>> user may submit/register new userptr anytime.
>>> This is same for old and new way.
>>>
>>> step 2 guarantee the GPU operation is done before updating CPU page
>>> table.
>>>
>>> So I think the change is safe. We don't need hold mn lock until the
>>> CPU page tables update is completed.
>> No, that isn't even remotely correct. The lock doesn't protects the
>> interval tree.
>>
>>> Old:
>>> 1. down_read_non_owner(&amn->lock)
>>> 2. loop to handle BOs from node->bos through interval tree
>>> amn->object nodes
>>> gfx: wait for pending BOs fence operation done, mark user
>>> pages dirty
>>> kfd: evict user queues of the process, wait for queue
>>> unmap/map operation done
>>> 3. update CPU page tables
>>> 4. up_read(&amn->lock)
>>>
>>> New, switch step 3 and 4
>>> 1. down_read_non_owner(&amn->lock)
>>> 2. loop to handle BOs from node->bos through interval tree
>>> amn->object nodes
>>> gfx: wait for pending BOs fence operation done, mark user
>>> pages dirty
>>> kfd: evict user queues of the process, wait for queue
>>> unmap/map operation done
>>> 3. up_read(&amn->lock)
>>> 4. update CPU page tables
>> The lock is there to make sure that we serialize page table updates
>> with command submission.
> As I understand it, the idea is to prevent command submission (adding
> new fences to BOs) while a page table invalidation is in progress.
Yes, exactly.
> But do we really need another lock for this? Wouldn't the
> re-validation of userptr BOs (currently calling get_user_pages) force
> synchronization with the ongoing page table invalidation through the
> mmap_sem or other MM locks?
No and yes. We don't hold any other locks while doing command submission, but I expect that HMM has its own mechanism to prevent that.
Since we don't modify amdgpu_mn_lock()/amdgpu_mn_unlock() we are certainly not using this mechanism correctly.
Regards,
Christian.
_______________________________________________
amd-gfx mailing list
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[-- Attachment #1.2: Type: text/html, Size: 6876 bytes --]
[-- Attachment #2: Type: text/plain, Size: 154 bytes --]
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
next prev parent reply other threads:[~2018-09-27 6:59 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-09-13 20:52 [PATCH] drm/amdgpu: use HMM mirror callback to replace mmu notifier v4 Philip Yang
[not found] ` <1536871954-8451-1-git-send-email-Philip.Yang-5C7GfCeVMHo@public.gmane.org>
2018-09-13 21:51 ` Felix Kuehling
[not found] ` <9d6717ac-23f0-7beb-6e41-58c6e32acdf8-5C7GfCeVMHo@public.gmane.org>
2018-09-14 7:51 ` Christian König
[not found] ` <58bc3bb9-b7b1-a32f-e355-c78a23d95215-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2018-09-14 17:47 ` Philip Yang
[not found] ` <383388c8-1bff-48d9-1044-f16e66bcbfa5-5C7GfCeVMHo@public.gmane.org>
2018-09-14 17:52 ` Christian König
[not found] ` <3850fbeb-5d91-9c14-43c9-45d5d058e15b-5C7GfCeVMHo@public.gmane.org>
2018-09-14 20:21 ` Felix Kuehling
[not found] ` <de28cee0-3461-4f99-eeae-b793de00ca58-5C7GfCeVMHo@public.gmane.org>
2018-09-15 7:46 ` Christian König
[not found] ` <e4cf7212-4340-8639-c8c1-057e4d1083f0-5C7GfCeVMHo@public.gmane.org>
2018-09-27 6:53 ` Kuehling, Felix
[not found] ` <DM5PR12MB17078469EB6D3AF1D53B788992140-2J9CzHegvk9TCtO+SvGBKwdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2018-09-27 6:59 ` Koenig, Christian [this message]
[not found] ` <a76b71ac-4b5b-45d7-b48b-6d0e4a7e7524-2ueSQiBKiTY7tOexoI0I+QC/G2K4zDHf@public.gmane.org>
2018-09-27 8:47 ` Kuehling, Felix
[not found] ` <DM5PR12MB1707D5E46617B2936F800F1992140-2J9CzHegvk9TCtO+SvGBKwdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2018-09-27 9:27 ` Christian König
[not found] ` <d752c19b-6d2e-c7c1-1cd7-651e25b8f708-5C7GfCeVMHo@public.gmane.org>
2018-09-27 11:08 ` Kuehling, Felix
[not found] ` <DM5PR12MB17077A78E0F95BFBA57F2E1A92140-2J9CzHegvk9TCtO+SvGBKwdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2018-09-27 11:23 ` Christian König
[not found] ` <425fe859-c780-48a5-a2c6-c3bf2b9abb38-5C7GfCeVMHo@public.gmane.org>
2018-09-27 13:18 ` Kuehling, Felix
[not found] ` <DM5PR12MB17072AE77EB0DE3AD22B8D7892140-2J9CzHegvk9TCtO+SvGBKwdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2018-09-27 13:21 ` Christian König
[not found] ` <9b427976-f2ff-8ba1-6ebf-588ca95aef80-5C7GfCeVMHo@public.gmane.org>
2018-09-27 13:41 ` Kuehling, Felix
[not found] ` <DM5PR12MB17072879EC907150027D6F7892140-2J9CzHegvk9TCtO+SvGBKwdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2018-09-27 13:58 ` Koenig, Christian
[not found] ` <58199419-e20f-4ab0-ac1d-a7eb79f5c6f7-2ueSQiBKiTY7tOexoI0I+QC/G2K4zDHf@public.gmane.org>
2018-09-27 14:27 ` Kuehling, Felix
[not found] ` <DM5PR12MB1707273AC4B0C03A3BEDF73092140-2J9CzHegvk9TCtO+SvGBKwdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2018-09-27 14:29 ` Koenig, Christian
[not found] ` <1068c389-56fc-4125-ac40-b1ef2d60eabd-2ueSQiBKiTY7tOexoI0I+QC/G2K4zDHf@public.gmane.org>
2018-09-27 14:50 ` Kuehling, Felix
[not found] ` <DM5PR12MB1707BCD7BFC10EDD594FE90B92140-2J9CzHegvk9TCtO+SvGBKwdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2018-09-27 15:36 ` Christian König
[not found] ` <11ba3857-9bb0-648e-2806-0533090d9a0a-5C7GfCeVMHo@public.gmane.org>
2018-09-27 20:17 ` Philip Yang
[not found] ` <16d1faf6-80a4-dc46-bd2a-9cd475808e98-5C7GfCeVMHo@public.gmane.org>
2018-09-28 5:25 ` Koenig, Christian
[not found] ` <8f9f5703-214f-488d-9cfe-ccc64e8cd009-2ueSQiBKiTY7tOexoI0I+QC/G2K4zDHf@public.gmane.org>
2018-09-28 15:07 ` Philip Yang
[not found] ` <fe0d429b-5038-a297-e02e-423302544477-5C7GfCeVMHo@public.gmane.org>
2018-09-28 15:13 ` Koenig, Christian
[not found] ` <b8686e6b-0c3e-4feb-afbd-80397aac31a0-2ueSQiBKiTY7tOexoI0I+QC/G2K4zDHf@public.gmane.org>
2018-10-02 15:05 ` Christian König
[not found] ` <09916f9a-3f5f-27ab-01e6-6d77303cf052-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2018-10-03 20:31 ` Philip Yang
[not found] ` <bf806477-06b3-61de-fea3-5ad260d92cdd-5C7GfCeVMHo@public.gmane.org>
2018-10-04 7:20 ` Christian König
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a76b71ac-4b5b-45d7-b48b-6d0e4a7e7524@email.android.com \
--to=christian.koenig-5c7gfcevmho@public.gmane.org \
--cc=Felix.Kuehling-5C7GfCeVMHo@public.gmane.org \
--cc=Philip.Yang-5C7GfCeVMHo@public.gmane.org \
--cc=amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org \
--cc=j.glisse-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox