From: zhoucm1 <david1.zhou-5C7GfCeVMHo@public.gmane.org>
To: "Christian König"
	<deathsimple-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>,
	"Zhang, Jerry" <Jerry.Zhang-5C7GfCeVMHo@public.gmane.org>,
	"amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org"
	<amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org>
Subject: Re: [PATCH] drm/amdgpu: fix deadlock of reservation between cs and gpu reset v2
Date: Fri, 28 Apr 2017 16:33:46 +0800
Message-ID: <5902FE6A.3020801@amd.com>
In-Reply-To: <a8fe5428-756b-9fb8-a7ba-52ea9058f90c-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>

Agreed, but libdrm doesn't allow concurrent submissions from the same 
context; see the 'pthread_mutex_lock(&context->sequence_mutex);' 
protection in amdgpu_cs_submit_one.
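
To illustrate the kind of userspace serialization meant here, below is a minimal sketch and not libdrm code. Only the idea of a per-context sequence_mutex is taken from this message; demo_context, demo_submit, demo_worker, demo_run_two_threads and the queue array are hypothetical names made up for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define DEMO_QUEUE_LEN 4096
#define DEMO_JOBS_PER_THREAD 1000

/* Hypothetical stand-in for a libdrm context; not the real structure. */
struct demo_context {
    pthread_mutex_t sequence_mutex;  /* serializes submissions on this context */
    uint64_t next_seq;               /* next sequence number to hand out */
    uint64_t queue[DEMO_QUEUE_LEN];  /* order in which jobs were queued */
    unsigned int queued;
};

void demo_context_init(struct demo_context *ctx)
{
    pthread_mutex_init(&ctx->sequence_mutex, NULL);
    ctx->next_seq = 1;
    ctx->queued = 0;
}

/* Take a sequence number and queue the job inside one critical section. */
uint64_t demo_submit(struct demo_context *ctx)
{
    uint64_t seq;

    pthread_mutex_lock(&ctx->sequence_mutex);
    assert(ctx->queued < DEMO_QUEUE_LEN);
    seq = ctx->next_seq++;
    ctx->queue[ctx->queued++] = seq;
    pthread_mutex_unlock(&ctx->sequence_mutex);
    return seq;
}

void *demo_worker(void *arg)
{
    struct demo_context *ctx = arg;
    int i;

    for (i = 0; i < DEMO_JOBS_PER_THREAD; i++)
        demo_submit(ctx);
    return NULL;
}

/* Two threads submit on one context; returns 1 if the queue is in order. */
int demo_run_two_threads(void)
{
    struct demo_context ctx;
    pthread_t a, b;
    unsigned int i;
    int ok = 1;

    demo_context_init(&ctx);
    if (pthread_create(&a, NULL, demo_worker, &ctx) != 0)
        return 0;
    if (pthread_create(&b, NULL, demo_worker, &ctx) != 0) {
        pthread_join(a, NULL);
        return 0;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);

    if (ctx.queued != 2 * DEMO_JOBS_PER_THREAD)
        ok = 0;
    for (i = 0; ok && i < ctx.queued; i++)
        if (ctx.queue[i] != (uint64_t)i + 1)
            ok = 0;
    pthread_mutex_destroy(&ctx.sequence_mutex);
    return ok;
}
```

Because the sequence number is taken and the job queued under the same mutex, demo_run_two_threads() should report an in-order queue no matter how the two threads interleave.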

Regards,
David Zhou
On 2017-04-28 16:15, Christian König wrote:
> Indeed, but after a bit of thinking I've found another problem with 
> that patch.
>
> When two threads are pushing jobs into the same scheduler context we 
> don't guarantee correct execution order any more!
>
> Before that patch it was handled by the exclusiveness we had because 
> of reserving the VM page tables, but now nothing prevents us from 
> calling amd_sched_entity_push_job() in nondeterministic order.
>
> In other words we need an additional lock in amdgpu_ctx_ring or 
> something like that.
>
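
The ordering problem described above can be sketched in a small userspace model. This is only an illustration under stated assumptions, not kernel code: demo_ctx_ring, demo_add_fence, demo_push_job, demo_submit_locked and demo_in_order are hypothetical names, and a pthread mutex stands in for whatever lock the driver would actually use.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define DEMO_MAX_JOBS 8

/* Hypothetical model of one context ring; not the real amdgpu_ctx_ring. */
struct demo_ctx_ring {
    pthread_mutex_t lock;            /* models the additional lock */
    uint64_t next_seq;               /* sequence handed out with the fence */
    uint64_t pushed[DEMO_MAX_JOBS];  /* order seen by the scheduler entity */
    unsigned int count;
};

void demo_ring_init(struct demo_ctx_ring *r)
{
    pthread_mutex_init(&r->lock, NULL);
    r->next_seq = 1;
    r->count = 0;
}

/* Step 1: allocate a sequence number (stands in for adding the fence). */
uint64_t demo_add_fence(struct demo_ctx_ring *r)
{
    return r->next_seq++;
}

/* Step 2: hand the job to the scheduler (stands in for pushing the job). */
void demo_push_job(struct demo_ctx_ring *r, uint64_t seq)
{
    assert(r->count < DEMO_MAX_JOBS);
    r->pushed[r->count++] = seq;
}

/* Both steps under one lock, so no other submitter can get in between. */
uint64_t demo_submit_locked(struct demo_ctx_ring *r)
{
    uint64_t seq;

    pthread_mutex_lock(&r->lock);
    seq = demo_add_fence(r);
    demo_push_job(r, seq);
    pthread_mutex_unlock(&r->lock);
    return seq;
}

/* Returns 1 if jobs were pushed in strictly increasing sequence order. */
int demo_in_order(const struct demo_ctx_ring *r)
{
    unsigned int i;

    for (i = 1; i < r->count; i++)
        if (r->pushed[i] <= r->pushed[i - 1])
            return 0;
    return 1;
}
```

Calling demo_add_fence() twice and then demo_push_job() in the opposite order reproduces the out-of-order case by hand, while two calls to demo_submit_locked() always leave the model's queue in sequence order.
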
> Regards,
> Christian.
>
> On 28.04.2017 at 04:51, Zhang, Jerry wrote:
>> Nice catch!
>> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
>>
>> Regards,
>> Jerry (Junwei Zhang)
>>
>> Linux Base Graphics
>> SRDC Software Development
>> _____________________________________
>>
>>
>>> -----Original Message-----
>>> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf Of Chunming Zhou
>>> Sent: Friday, April 28, 2017 10:46
>>> To: amd-gfx@lists.freedesktop.org
>>> Cc: Zhou, David(ChunMing)
>>> Subject: [PATCH] drm/amdgpu: fix deadlock of reservation between cs and gpu reset v2
>>>
>>> This deadlock can happen during a gpu reset:
>>> 1. During gpu reset, cs can continue until the sw queue is full; the
>>>    job push then waits while holding the pd reservation.
>>> 2. The gpu_reset routine also needs the pd reservation to restore the
>>>    page tables from their shadows.
>>> 3. cs is waiting for gpu_reset to complete, but gpu_reset is waiting
>>>    for cs to release the reservation.
>>>
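
The three steps above can be modelled in userspace as follows. This is a rough sketch under loud assumptions, not driver code: a single pthread mutex stands in for the pd reservation, and demo_vm, demo_reset_try_restore, demo_cs_push_holding and demo_cs_push_after_release are hypothetical names. A trylock is used in place of a blocking wait so that the example reports the conflict instead of hanging.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical model: one mutex stands in for the pd reservation. */
struct demo_vm {
    pthread_mutex_t pd_reservation;
};

void demo_vm_init(struct demo_vm *vm)
{
    pthread_mutex_init(&vm->pd_reservation, NULL);
}

/*
 * Models the reset path needing the reservation to restore page tables.
 * Returns 0 if it could take the reservation, EBUSY if it would have to wait.
 */
int demo_reset_try_restore(struct demo_vm *vm)
{
    int ret = pthread_mutex_trylock(&vm->pd_reservation);

    if (ret == 0)
        pthread_mutex_unlock(&vm->pd_reservation);
    return ret;
}

/* Old ordering: the push point is reached with the reservation still held. */
int demo_cs_push_holding(struct demo_vm *vm)
{
    int err, reset_result;

    err = pthread_mutex_lock(&vm->pd_reservation);
    assert(err == 0);
    /* A full queue would block here; the reset path cannot get in. */
    reset_result = demo_reset_try_restore(vm);
    pthread_mutex_unlock(&vm->pd_reservation);
    return reset_result;
}

/* Patched ordering: the reservation is dropped before the push point. */
int demo_cs_push_after_release(struct demo_vm *vm)
{
    int err;

    err = pthread_mutex_lock(&vm->pd_reservation);
    assert(err == 0);
    /* ... job is prepared while the reservation is held ... */
    pthread_mutex_unlock(&vm->pd_reservation);
    /* Even if the push blocks now, the reset path can take the reservation. */
    return demo_reset_try_restore(vm);
}
```

In this model demo_cs_push_holding() makes the reset path see EBUSY, which corresponds to step 3, while demo_cs_push_after_release() lets it proceed, which is the effect the patch aims for by finishing the parser before the job is pushed.
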
>>> v2: handle amdgpu_cs_submit error path.
>>>
>>> Change-Id: I0f66d04b2bef3433035109623c8a5c5992c84202
>>> Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
>>> Reviewed-by: Monk Liu <monk.liu@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 4 ++++
>>>   1 file changed, 4 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> index 26168df..699f5fe 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> @@ -1074,6 +1074,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
>>>       cs->out.handle = amdgpu_ctx_add_fence(p->ctx, ring, p->fence);
>>>       job->uf_sequence = cs->out.handle;
>>>       amdgpu_job_free_resources(job);
>>> +    amdgpu_cs_parser_fini(p, 0, true);
>>>
>>>       trace_amdgpu_cs_ioctl(job);
>>>       amd_sched_entity_push_job(&job->base);
>>> @@ -1129,7 +1130,10 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
>>>           goto out;
>>>
>>>       r = amdgpu_cs_submit(&parser, cs);
>>> +    if (r)
>>> +        goto out;
>>>
>>> +    return 0;
>>>   out:
>>>       amdgpu_cs_parser_fini(&parser, r, reserved_buffers);
>>>       return r;
>>> -- 
>>> 1.9.1
>>>
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
>


