* [PATCH] drm/amdgpu: fix deadlock of reservation between cs and gpu reset
@ 2017-04-27 7:20 Chunming Zhou
[not found] ` <1493277617-14966-1-git-send-email-David1.Zhou-5C7GfCeVMHo@public.gmane.org>
0 siblings, 1 reply; 9+ messages in thread
From: Chunming Zhou @ 2017-04-27 7:20 UTC (permalink / raw)
To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
Cc: Chunming Zhou, Monk.Liu-5C7GfCeVMHo
the case could happen when gpu reset:
1. when gpu reset, cs can be continue until sw queue is full, then push job will wait with holding pd reservation.
2. gpu_reset routine will also need pd reservation to restore page table from their shadow.
3. cs is waiting for gpu_reset complete, but gpu reset is waiting for cs releases reservation.
Change-Id: I0f66d04b2bef3433035109623c8a5c5992c84202
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 9edb1a4..8c6d036 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1076,7 +1076,6 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
amdgpu_job_free_resources(job);
trace_amdgpu_cs_ioctl(job);
- amd_sched_entity_push_job(&job->base);
return 0;
}
@@ -1132,6 +1131,8 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
out:
amdgpu_cs_parser_fini(&parser, r, reserved_buffers);
+ if (!r)
+ amd_sched_entity_push_job(&parser.job->base);
return r;
}
--
1.9.1
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply related [flat|nested] 9+ messages in thread[parent not found: <1493277617-14966-1-git-send-email-David1.Zhou-5C7GfCeVMHo@public.gmane.org>]
* Re: [PATCH] drm/amdgpu: fix deadlock of reservation between cs and gpu reset [not found] ` <1493277617-14966-1-git-send-email-David1.Zhou-5C7GfCeVMHo@public.gmane.org> @ 2017-04-27 8:01 ` zhoucm1 0 siblings, 0 replies; 9+ messages in thread From: zhoucm1 @ 2017-04-27 8:01 UTC (permalink / raw) To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Monk.Liu-5C7GfCeVMHo pls ignore this one, will send another new one. On 2017年04月27日 15:20, Chunming Zhou wrote: > the case could happen when gpu reset: > 1. when gpu reset, cs can be continue until sw queue is full, then push job will wait with holding pd reservation. > 2. gpu_reset routine will also need pd reservation to restore page table from their shadow. > 3. cs is waiting for gpu_reset complete, but gpu reset is waiting for cs releases reservation. > > Change-Id: I0f66d04b2bef3433035109623c8a5c5992c84202 > Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > index 9edb1a4..8c6d036 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > @@ -1076,7 +1076,6 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, > amdgpu_job_free_resources(job); > > trace_amdgpu_cs_ioctl(job); > - amd_sched_entity_push_job(&job->base); > > return 0; > } > @@ -1132,6 +1131,8 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, struct drm_file *filp) > > out: > amdgpu_cs_parser_fini(&parser, r, reserved_buffers); > + if (!r) > + amd_sched_entity_push_job(&parser.job->base); > return r; > } > _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] drm/amdgpu: fix deadlock of reservation between cs and gpu reset
@ 2017-04-27 8:25 Chunming Zhou
[not found] ` <1493281554-9898-1-git-send-email-David1.Zhou-5C7GfCeVMHo@public.gmane.org>
0 siblings, 1 reply; 9+ messages in thread
From: Chunming Zhou @ 2017-04-27 8:25 UTC (permalink / raw)
To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
Cc: Chunming Zhou, Monk.Liu-5C7GfCeVMHo
the case could happen when gpu reset:
1. when gpu reset, cs can be continue until sw queue is full, then push job will wait with holding pd reservation.
2. gpu_reset routine will also need pd reservation to restore page table from their shadow.
3. cs is waiting for gpu_reset complete, but gpu reset is waiting for cs releases reservation.
Change-Id: I0f66d04b2bef3433035109623c8a5c5992c84202
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 9edb1a4..a6722a7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1076,6 +1076,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
amdgpu_job_free_resources(job);
trace_amdgpu_cs_ioctl(job);
+ amdgpu_cs_parser_fini(p, 0, true);
amd_sched_entity_push_job(&job->base);
return 0;
@@ -1130,6 +1131,7 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
r = amdgpu_cs_submit(&parser, cs);
+ return r;
out:
amdgpu_cs_parser_fini(&parser, r, reserved_buffers);
return r;
--
1.9.1
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply related [flat|nested] 9+ messages in thread[parent not found: <1493281554-9898-1-git-send-email-David1.Zhou-5C7GfCeVMHo@public.gmane.org>]
* Re: [PATCH] drm/amdgpu: fix deadlock of reservation between cs and gpu reset [not found] ` <1493281554-9898-1-git-send-email-David1.Zhou-5C7GfCeVMHo@public.gmane.org> @ 2017-04-27 9:08 ` Zhang, Jerry (Junwei) [not found] ` <5901B526.6080305-5C7GfCeVMHo@public.gmane.org> 0 siblings, 1 reply; 9+ messages in thread From: Zhang, Jerry (Junwei) @ 2017-04-27 9:08 UTC (permalink / raw) To: Chunming Zhou, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW Cc: Monk.Liu-5C7GfCeVMHo On 04/27/2017 04:25 PM, Chunming Zhou wrote: > the case could happen when gpu reset: > 1. when gpu reset, cs can be continue until sw queue is full, then push job will wait with holding pd reservation. > 2. gpu_reset routine will also need pd reservation to restore page table from their shadow. > 3. cs is waiting for gpu_reset complete, but gpu reset is waiting for cs releases reservation. > > Change-Id: I0f66d04b2bef3433035109623c8a5c5992c84202 > Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > index 9edb1a4..a6722a7 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > @@ -1076,6 +1076,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, > amdgpu_job_free_resources(job); > > trace_amdgpu_cs_ioctl(job); > + amdgpu_cs_parser_fini(p, 0, true); Please confirm: It will not free/release more things that may be accessed in job working process, as cs_parse_fini free so many things related to parser. Or we can just release the pd reservation here? Jerry > amd_sched_entity_push_job(&job->base); > > return 0; > @@ -1130,6 +1131,7 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, struct drm_file *filp) > > r = amdgpu_cs_submit(&parser, cs); > > + return r; > out: > amdgpu_cs_parser_fini(&parser, r, reserved_buffers); > return r; > _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 9+ messages in thread
[parent not found: <5901B526.6080305-5C7GfCeVMHo@public.gmane.org>]
* Re: [PATCH] drm/amdgpu: fix deadlock of reservation between cs and gpu reset [not found] ` <5901B526.6080305-5C7GfCeVMHo@public.gmane.org> @ 2017-04-27 10:11 ` Christian König [not found] ` <62336e5c-571a-196f-6c07-be25c928cdd3-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org> 0 siblings, 1 reply; 9+ messages in thread From: Christian König @ 2017-04-27 10:11 UTC (permalink / raw) To: Zhang, Jerry (Junwei), Chunming Zhou, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW Cc: Monk.Liu-5C7GfCeVMHo Am 27.04.2017 um 11:08 schrieb Zhang, Jerry (Junwei): > On 04/27/2017 04:25 PM, Chunming Zhou wrote: >> the case could happen when gpu reset: >> 1. when gpu reset, cs can be continue until sw queue is full, then >> push job will wait with holding pd reservation. >> 2. gpu_reset routine will also need pd reservation to restore page >> table from their shadow. >> 3. cs is waiting for gpu_reset complete, but gpu reset is waiting for >> cs releases reservation. >> >> Change-Id: I0f66d04b2bef3433035109623c8a5c5992c84202 >> Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> >> --- >> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >> index 9edb1a4..a6722a7 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >> @@ -1076,6 +1076,7 @@ static int amdgpu_cs_submit(struct >> amdgpu_cs_parser *p, >> amdgpu_job_free_resources(job); >> >> trace_amdgpu_cs_ioctl(job); >> + amdgpu_cs_parser_fini(p, 0, true); > > Please confirm: > It will not free/release more things that may be accessed in job > working process, as cs_parse_fini free so many things related to parser. Yeah, that indeed won't work. amdgpu_cs_parser_fini() does: if (parser->job) amdgpu_job_free(parser->job); And amdgpu_cs_submit() sets the fence, so it must be called before amdgpu_cs_parser_fini(). But apart from that this is a rather nifty idea. What was the problem with the initial patch? > > Or we can just release the pd reservation here? We could run into problem with the reservation ticked with that. So I would want to avoid that. Christian. > > Jerry > >> amd_sched_entity_push_job(&job->base); >> >> return 0; >> @@ -1130,6 +1131,7 @@ int amdgpu_cs_ioctl(struct drm_device *dev, >> void *data, struct drm_file *filp) >> >> r = amdgpu_cs_submit(&parser, cs); >> >> + return r; >> out: >> amdgpu_cs_parser_fini(&parser, r, reserved_buffers); >> return r; >> > _______________________________________________ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 9+ messages in thread
[parent not found: <62336e5c-571a-196f-6c07-be25c928cdd3-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>]
* Re: [PATCH] drm/amdgpu: fix deadlock of reservation between cs and gpu reset [not found] ` <62336e5c-571a-196f-6c07-be25c928cdd3-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org> @ 2017-04-27 10:23 ` zhoucm1 [not found] ` <5901C6BD.1070303-5C7GfCeVMHo@public.gmane.org> 0 siblings, 1 reply; 9+ messages in thread From: zhoucm1 @ 2017-04-27 10:23 UTC (permalink / raw) To: Christian König, Zhang, Jerry (Junwei), amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW Cc: Monk.Liu-5C7GfCeVMHo On 2017年04月27日 18:11, Christian König wrote: > Am 27.04.2017 um 11:08 schrieb Zhang, Jerry (Junwei): >> On 04/27/2017 04:25 PM, Chunming Zhou wrote: >>> the case could happen when gpu reset: >>> 1. when gpu reset, cs can be continue until sw queue is full, then >>> push job will wait with holding pd reservation. >>> 2. gpu_reset routine will also need pd reservation to restore page >>> table from their shadow. >>> 3. cs is waiting for gpu_reset complete, but gpu reset is waiting >>> for cs releases reservation. >>> >>> Change-Id: I0f66d04b2bef3433035109623c8a5c5992c84202 >>> Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> >>> --- >>> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ >>> 1 file changed, 2 insertions(+) >>> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>> index 9edb1a4..a6722a7 100644 >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>> @@ -1076,6 +1076,7 @@ static int amdgpu_cs_submit(struct >>> amdgpu_cs_parser *p, >>> amdgpu_job_free_resources(job); >>> >>> trace_amdgpu_cs_ioctl(job); >>> + amdgpu_cs_parser_fini(p, 0, true); >> >> Please confirm: >> It will not free/release more things that may be accessed in job >> working process, as cs_parse_fini free so many things related to parser. > > Yeah, that indeed won't work. amdgpu_cs_parser_fini() does: > No, with my patch, parser->job already is NULL here, same as previous sequence, just put push_job action after amdgpu_cs_parser_fini() > if (parser->job) > amdgpu_job_free(parser->job); > > And amdgpu_cs_submit() sets the fence, so it must be called before > amdgpu_cs_parser_fini(). yes, fence is already generated, so amdgpu_cs_parser_fini before pushing job is completely safe here. > > But apart from that this is a rather nifty idea. What was the problem > with the initial patch? Just as comments in patch, which is found by Monk. Regards, David Zhou > >> >> Or we can just release the pd reservation here? > > We could run into problem with the reservation ticked with that. So I > would want to avoid that. > > Christian. > >> >> Jerry >> >>> amd_sched_entity_push_job(&job->base); >>> >>> return 0; >>> @@ -1130,6 +1131,7 @@ int amdgpu_cs_ioctl(struct drm_device *dev, >>> void *data, struct drm_file *filp) >>> >>> r = amdgpu_cs_submit(&parser, cs); >>> >>> + return r; >>> out: >>> amdgpu_cs_parser_fini(&parser, r, reserved_buffers); >>> return r; >>> >> _______________________________________________ >> amd-gfx mailing list >> amd-gfx@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/amd-gfx > > _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 9+ messages in thread
[parent not found: <5901C6BD.1070303-5C7GfCeVMHo@public.gmane.org>]
* Re: [PATCH] drm/amdgpu: fix deadlock of reservation between cs and gpu reset [not found] ` <5901C6BD.1070303-5C7GfCeVMHo@public.gmane.org> @ 2017-04-27 12:04 ` Christian König [not found] ` <b6e45c8f-40a1-4185-89fb-1f4f7719c0fb-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org> 2017-04-27 14:02 ` Liu, Monk 1 sibling, 1 reply; 9+ messages in thread From: Christian König @ 2017-04-27 12:04 UTC (permalink / raw) To: zhoucm1, Zhang, Jerry (Junwei), amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW Cc: Monk.Liu-5C7GfCeVMHo Am 27.04.2017 um 12:23 schrieb zhoucm1: > > > On 2017年04月27日 18:11, Christian König wrote: >> Am 27.04.2017 um 11:08 schrieb Zhang, Jerry (Junwei): >>> On 04/27/2017 04:25 PM, Chunming Zhou wrote: >>>> the case could happen when gpu reset: >>>> 1. when gpu reset, cs can be continue until sw queue is full, then >>>> push job will wait with holding pd reservation. >>>> 2. gpu_reset routine will also need pd reservation to restore page >>>> table from their shadow. >>>> 3. cs is waiting for gpu_reset complete, but gpu reset is waiting >>>> for cs releases reservation. >>>> >>>> Change-Id: I0f66d04b2bef3433035109623c8a5c5992c84202 >>>> Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> >>>> --- >>>> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ >>>> 1 file changed, 2 insertions(+) >>>> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>>> index 9edb1a4..a6722a7 100644 >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>>> @@ -1076,6 +1076,7 @@ static int amdgpu_cs_submit(struct >>>> amdgpu_cs_parser *p, >>>> amdgpu_job_free_resources(job); >>>> >>>> trace_amdgpu_cs_ioctl(job); >>>> + amdgpu_cs_parser_fini(p, 0, true); >>> >>> Please confirm: >>> It will not free/release more things that may be accessed in job >>> working process, as cs_parse_fini free so many things related to >>> parser. >> >> Yeah, that indeed won't work. amdgpu_cs_parser_fini() does: >> > No, with my patch, parser->job already is NULL here, same as previous > sequence, just put push_job action after amdgpu_cs_parser_fini() Ah! You add that into amdgpu_cs_submit! That was the point I was missing. > trace_amdgpu_cs_ioctl(job); > + amdgpu_cs_parser_fini(p, 0, true); > amd_sched_entity_push_job(&job->base); Can we keep the trace_amdgpu_cs_ioctl(); directly before the amd_sched_entity_push_job()? With that changed, the patch is Reviewed-by: Christian König <christian.koenig@amd.com>. Regards, Christian. >> if (parser->job) >> amdgpu_job_free(parser->job); >> >> And amdgpu_cs_submit() sets the fence, so it must be called before >> amdgpu_cs_parser_fini(). > yes, fence is already generated, so amdgpu_cs_parser_fini before > pushing job is completely safe here. >> >> But apart from that this is a rather nifty idea. What was the problem >> with the initial patch? > Just as comments in patch, which is found by Monk. > > Regards, > David Zhou >> >>> >>> Or we can just release the pd reservation here? >> >> We could run into problem with the reservation ticked with that. So I >> would want to avoid that. >> >> Christian. >> >>> >>> Jerry >>> >>>> amd_sched_entity_push_job(&job->base); >>>> >>>> return 0; >>>> @@ -1130,6 +1131,7 @@ int amdgpu_cs_ioctl(struct drm_device *dev, >>>> void *data, struct drm_file *filp) >>>> >>>> r = amdgpu_cs_submit(&parser, cs); >>>> >>>> + return r; >>>> out: >>>> amdgpu_cs_parser_fini(&parser, r, reserved_buffers); >>>> return r; >>>> >>> _______________________________________________ >>> amd-gfx mailing list >>> amd-gfx@lists.freedesktop.org >>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx >> >> > > _______________________________________________ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 9+ messages in thread
[parent not found: <b6e45c8f-40a1-4185-89fb-1f4f7719c0fb-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>]
* Re: [PATCH] drm/amdgpu: fix deadlock of reservation between cs and gpu reset [not found] ` <b6e45c8f-40a1-4185-89fb-1f4f7719c0fb-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org> @ 2017-04-28 2:37 ` Zhang, Jerry (Junwei) 0 siblings, 0 replies; 9+ messages in thread From: Zhang, Jerry (Junwei) @ 2017-04-28 2:37 UTC (permalink / raw) To: Christian König, zhoucm1, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW Cc: Monk.Liu-5C7GfCeVMHo On 04/27/2017 08:04 PM, Christian König wrote: > Am 27.04.2017 um 12:23 schrieb zhoucm1: >> >> >> On 2017年04月27日 18:11, Christian König wrote: >>> Am 27.04.2017 um 11:08 schrieb Zhang, Jerry (Junwei): >>>> On 04/27/2017 04:25 PM, Chunming Zhou wrote: >>>>> the case could happen when gpu reset: >>>>> 1. when gpu reset, cs can be continue until sw queue is full, then push >>>>> job will wait with holding pd reservation. >>>>> 2. gpu_reset routine will also need pd reservation to restore page table >>>>> from their shadow. >>>>> 3. cs is waiting for gpu_reset complete, but gpu reset is waiting for cs >>>>> releases reservation. >>>>> >>>>> Change-Id: I0f66d04b2bef3433035109623c8a5c5992c84202 >>>>> Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> >>>>> --- >>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ >>>>> 1 file changed, 2 insertions(+) >>>>> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>>>> index 9edb1a4..a6722a7 100644 >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>>>> @@ -1076,6 +1076,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, >>>>> amdgpu_job_free_resources(job); >>>>> >>>>> trace_amdgpu_cs_ioctl(job); >>>>> + amdgpu_cs_parser_fini(p, 0, true); >>>> >>>> Please confirm: >>>> It will not free/release more things that may be accessed in job working >>>> process, as cs_parse_fini free so many things related to parser. >>> >>> Yeah, that indeed won't work. amdgpu_cs_parser_fini() does: >>> >> No, with my patch, parser->job already is NULL here, same as previous >> sequence, just put push_job action after amdgpu_cs_parser_fini() > > Ah! You add that into amdgpu_cs_submit! That was the point I was missing. Thanks your all input. See it too. Feel free to add my RB, if need. Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Jerry > >> trace_amdgpu_cs_ioctl(job); >> + amdgpu_cs_parser_fini(p, 0, true); >> amd_sched_entity_push_job(&job->base); > Can we keep the trace_amdgpu_cs_ioctl(); directly before the > amd_sched_entity_push_job()? > > With that changed, the patch is Reviewed-by: Christian König > <christian.koenig@amd.com>. > > Regards, > Christian. > >>> if (parser->job) >>> amdgpu_job_free(parser->job); >>> >>> And amdgpu_cs_submit() sets the fence, so it must be called before >>> amdgpu_cs_parser_fini(). >> yes, fence is already generated, so amdgpu_cs_parser_fini before pushing job >> is completely safe here. >>> >>> But apart from that this is a rather nifty idea. What was the problem with >>> the initial patch? >> Just as comments in patch, which is found by Monk. >> >> Regards, >> David Zhou >>> >>>> >>>> Or we can just release the pd reservation here? >>> >>> We could run into problem with the reservation ticked with that. So I would >>> want to avoid that. >>> >>> Christian. >>> >>>> >>>> Jerry >>>> >>>>> amd_sched_entity_push_job(&job->base); >>>>> >>>>> return 0; >>>>> @@ -1130,6 +1131,7 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void >>>>> *data, struct drm_file *filp) >>>>> >>>>> r = amdgpu_cs_submit(&parser, cs); >>>>> >>>>> + return r; >>>>> out: >>>>> amdgpu_cs_parser_fini(&parser, r, reserved_buffers); >>>>> return r; >>>>> >>>> _______________________________________________ >>>> amd-gfx mailing list >>>> amd-gfx@lists.freedesktop.org >>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx >>> >>> >> >> _______________________________________________ >> amd-gfx mailing list >> amd-gfx@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/amd-gfx > > _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 9+ messages in thread
* RE: [PATCH] drm/amdgpu: fix deadlock of reservation between cs and gpu reset [not found] ` <5901C6BD.1070303-5C7GfCeVMHo@public.gmane.org> 2017-04-27 12:04 ` Christian König @ 2017-04-27 14:02 ` Liu, Monk 1 sibling, 0 replies; 9+ messages in thread From: Liu, Monk @ 2017-04-27 14:02 UTC (permalink / raw) To: Zhou, David(ChunMing), Christian König, Zhang, Jerry, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org A really good improvement Reviewed-by: Monk Liu <monk.liu@amd.com> -----Original Message----- From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf Of zhoucm1 Sent: Thursday, April 27, 2017 6:24 PM To: Christian König <deathsimple@vodafone.de>; Zhang, Jerry <Jerry.Zhang@amd.com>; amd-gfx@lists.freedesktop.org Cc: Liu, Monk <Monk.Liu@amd.com> Subject: Re: [PATCH] drm/amdgpu: fix deadlock of reservation between cs and gpu reset On 2017年04月27日 18:11, Christian König wrote: > Am 27.04.2017 um 11:08 schrieb Zhang, Jerry (Junwei): >> On 04/27/2017 04:25 PM, Chunming Zhou wrote: >>> the case could happen when gpu reset: >>> 1. when gpu reset, cs can be continue until sw queue is full, then >>> push job will wait with holding pd reservation. >>> 2. gpu_reset routine will also need pd reservation to restore page >>> table from their shadow. >>> 3. cs is waiting for gpu_reset complete, but gpu reset is waiting >>> for cs releases reservation. >>> >>> Change-Id: I0f66d04b2bef3433035109623c8a5c5992c84202 >>> Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> >>> --- >>> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ >>> 1 file changed, 2 insertions(+) >>> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>> index 9edb1a4..a6722a7 100644 >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>> @@ -1076,6 +1076,7 @@ static int amdgpu_cs_submit(struct >>> amdgpu_cs_parser *p, >>> amdgpu_job_free_resources(job); >>> >>> trace_amdgpu_cs_ioctl(job); >>> + amdgpu_cs_parser_fini(p, 0, true); >> >> Please confirm: >> It will not free/release more things that may be accessed in job >> working process, as cs_parse_fini free so many things related to parser. > > Yeah, that indeed won't work. amdgpu_cs_parser_fini() does: > No, with my patch, parser->job already is NULL here, same as previous sequence, just put push_job action after amdgpu_cs_parser_fini() > if (parser->job) > amdgpu_job_free(parser->job); > > And amdgpu_cs_submit() sets the fence, so it must be called before > amdgpu_cs_parser_fini(). yes, fence is already generated, so amdgpu_cs_parser_fini before pushing job is completely safe here. > > But apart from that this is a rather nifty idea. What was the problem > with the initial patch? Just as comments in patch, which is found by Monk. Regards, David Zhou > >> >> Or we can just release the pd reservation here? > > We could run into problem with the reservation ticked with that. So I > would want to avoid that. > > Christian. > >> >> Jerry >> >>> amd_sched_entity_push_job(&job->base); >>> >>> return 0; >>> @@ -1130,6 +1131,7 @@ int amdgpu_cs_ioctl(struct drm_device *dev, >>> void *data, struct drm_file *filp) >>> >>> r = amdgpu_cs_submit(&parser, cs); >>> >>> + return r; >>> out: >>> amdgpu_cs_parser_fini(&parser, r, reserved_buffers); >>> return r; >>> >> _______________________________________________ >> amd-gfx mailing list >> amd-gfx@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/amd-gfx > > _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2017-04-28 2:37 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-04-27 7:20 [PATCH] drm/amdgpu: fix deadlock of reservation between cs and gpu reset Chunming Zhou
[not found] ` <1493277617-14966-1-git-send-email-David1.Zhou-5C7GfCeVMHo@public.gmane.org>
2017-04-27 8:01 ` zhoucm1
-- strict thread matches above, loose matches on Subject: below --
2017-04-27 8:25 Chunming Zhou
[not found] ` <1493281554-9898-1-git-send-email-David1.Zhou-5C7GfCeVMHo@public.gmane.org>
2017-04-27 9:08 ` Zhang, Jerry (Junwei)
[not found] ` <5901B526.6080305-5C7GfCeVMHo@public.gmane.org>
2017-04-27 10:11 ` Christian König
[not found] ` <62336e5c-571a-196f-6c07-be25c928cdd3-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
2017-04-27 10:23 ` zhoucm1
[not found] ` <5901C6BD.1070303-5C7GfCeVMHo@public.gmane.org>
2017-04-27 12:04 ` Christian König
[not found] ` <b6e45c8f-40a1-4185-89fb-1f4f7719c0fb-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
2017-04-28 2:37 ` Zhang, Jerry (Junwei)
2017-04-27 14:02 ` Liu, Monk
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.