From: zhoucm1 <david1.zhou-5C7GfCeVMHo@public.gmane.org>
To: "Christian König"
<deathsimple-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>,
"Zhang, Jerry" <Jerry.Zhang-5C7GfCeVMHo@public.gmane.org>,
"amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org"
<amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org>
Subject: Re: [PATCH] drm/amdgpu: fix deadlock of reservation between cs and gpu reset v2
Date: Fri, 28 Apr 2017 16:33:46 +0800
Message-ID: <5902FE6A.3020801@amd.com>
In-Reply-To: <a8fe5428-756b-9fb8-a7ba-52ea9058f90c-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>

Agreed, but libdrm doesn't allow concurrent submissions from the same
context; see the 'pthread_mutex_lock(&context->sequence_mutex);'
protection in amdgpu_cs_submit_one.
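
To illustrate the kind of per-context serialization meant here, a minimal
sketch in plain C with pthreads follows. It is not libdrm's code:
demo_context, demo_submit_one and the other demo_* names are hypothetical,
and the fixed-size array only stands in for a software queue. The point is
that the sequence number is assigned and the job is queued under the same
mutex, so two threads submitting on one context cannot interleave those two
steps.

```c
/* Hypothetical sketch: none of these names come from libdrm or the kernel. */
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define QUEUE_DEPTH     64
#define JOBS_PER_THREAD 16

struct demo_context {
	pthread_mutex_t sequence_mutex; /* serializes submissions on this context */
	uint64_t next_seq;              /* next sequence number to hand out */
	uint64_t queue[QUEUE_DEPTH];    /* stand-in for a software job queue */
	unsigned int count;             /* number of queued jobs */
};

static void demo_context_init(struct demo_context *ctx)
{
	pthread_mutex_init(&ctx->sequence_mutex, NULL);
	ctx->next_seq = 1;
	ctx->count = 0;
}

/* Assign a sequence number and queue the job as one critical section.
 * Returns the assigned sequence number, or 0 if the queue is full. */
static uint64_t demo_submit_one(struct demo_context *ctx)
{
	uint64_t seq = 0;

	pthread_mutex_lock(&ctx->sequence_mutex);
	if (ctx->count < QUEUE_DEPTH) {
		seq = ctx->next_seq++;
		ctx->queue[ctx->count++] = seq;
	}
	pthread_mutex_unlock(&ctx->sequence_mutex);
	return seq;
}

/* Returns 1 if queued jobs appear in strictly increasing sequence order. */
static int demo_queue_is_ordered(const struct demo_context *ctx)
{
	for (unsigned int i = 1; i < ctx->count; i++)
		if (ctx->queue[i] <= ctx->queue[i - 1])
			return 0;
	return 1;
}

/* Thread body: submit a fixed number of jobs on the shared context. */
static void *demo_worker(void *arg)
{
	struct demo_context *ctx = arg;

	for (int i = 0; i < JOBS_PER_THREAD; i++)
		demo_submit_one(ctx);
	return NULL;
}

/* Two threads submit concurrently; returns 1 if every job was queued
 * and the queue order matches the sequence order. */
static int demo_run(void)
{
	struct demo_context ctx;
	pthread_t threads[2];
	int ok;

	demo_context_init(&ctx);
	for (int i = 0; i < 2; i++) {
		int rc = pthread_create(&threads[i], NULL, demo_worker, &ctx);
		assert(rc == 0);
		(void)rc;
	}
	for (int i = 0; i < 2; i++)
		pthread_join(threads[i], NULL);

	ok = demo_queue_is_ordered(&ctx) && ctx.count == 2 * JOBS_PER_THREAD;
	pthread_mutex_destroy(&ctx.sequence_mutex);
	return ok;
}
```

Calling demo_run() starts two threads that each submit 16 jobs on one
shared context, and returns 1 if all 32 jobs were queued in strictly
increasing sequence order. Whether the kernel side should rely on userspace
providing this guarantee is a separate question.
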
Regards,
David Zhou
On 2017-04-28 16:15, Christian König wrote:
> Indeed, but after a bit of thinking I've found another problem with
> that patch.
>
> When two threads are pushing jobs into the same scheduler context we
> don't guarantee correct execution order any more!
>
> Before that patch it was handled by the exclusiveness we had because
> of reserving the VM page tables, but now nothing prevents us from
> calling amd_sched_entity_push_job() in nondeterministic order.
>
> In other words we need an additional lock in amdgpu_ctx_ring or
> something like that.
>
> Regards,
> Christian.
>
> On 28.04.2017 at 04:51, Zhang, Jerry wrote:
>> Nice catch!
>> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
>>
>> Regards,
>> Jerry (Junwei Zhang)
>>
>> Linux Base Graphics
>> SRDC Software Development
>> _____________________________________
>>
>>
>>> -----Original Message-----
>>> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf Of Chunming Zhou
>>> Sent: Friday, April 28, 2017 10:46
>>> To: amd-gfx@lists.freedesktop.org
>>> Cc: Zhou, David(ChunMing)
>>> Subject: [PATCH] drm/amdgpu: fix deadlock of reservation between cs and gpu reset v2
>>>
>>> The following deadlock can happen during a gpu reset:
>>> 1. During the gpu reset, cs can continue until the sw queue is full;
>>> push job then waits while still holding the pd reservation.
>>> 2. The gpu_reset routine also needs the pd reservation to restore the
>>> page tables from their shadows.
>>> 3. cs is waiting for gpu_reset to complete, but gpu_reset is waiting
>>> for cs to release the reservation.
>>>
>>> v2: handle amdgpu_cs_submit error path.
>>>
>>> Change-Id: I0f66d04b2bef3433035109623c8a5c5992c84202
>>> Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
>>> Reviewed-by: Monk Liu <monk.liu@amd.com>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 4 ++++
>>> 1 file changed, 4 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> index 26168df..699f5fe 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> @@ -1074,6 +1074,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
>>> cs->out.handle = amdgpu_ctx_add_fence(p->ctx, ring, p->fence);
>>> job->uf_sequence = cs->out.handle;
>>> amdgpu_job_free_resources(job);
>>> + amdgpu_cs_parser_fini(p, 0, true);
>>>
>>> trace_amdgpu_cs_ioctl(job);
>>> amd_sched_entity_push_job(&job->base);
>>> @@ -1129,7 +1130,10 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
>>> goto out;
>>>
>>> r = amdgpu_cs_submit(&parser, cs);
>>> + if (r)
>>> + goto out;
>>>
>>> + return 0;
>>> out:
>>> amdgpu_cs_parser_fini(&parser, r, reserved_buffers);
>>> return r;
>>> --
>>> 1.9.1
>>>
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx