* [PATCH 1/2] dma-buf: keep only not signaled fence in reservation_object_add_shared_replace v2
@ 2017-10-30 14:59 Christian König
2017-10-30 14:59 ` [PATCH 2/2] dma-buf: try to replace a signaled fence in reservation_object_add_shared_inplace Christian König
[not found] ` <20171030145904.18584-1-deathsimple-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
0 siblings, 2 replies; 17+ messages in thread
From: Christian König @ 2017-10-30 14:59 UTC (permalink / raw)
To: dri-devel, amd-gfx
From: Christian König <christian.koenig@amd.com>
The amdgpu issue of also needing signaled fences in the reservation objects
should be fixed by now.
Optimize the list by keeping only the fences which are not yet signaled.
v2: temporarily put the signaled fences at the end of the new container
Signed-off-by: Christian König <christian.koenig@amd.com>
---
drivers/dma-buf/reservation.c | 36 ++++++++++++++++++++++++++----------
1 file changed, 26 insertions(+), 10 deletions(-)
diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
index b44d9d7db347..6fc794576997 100644
--- a/drivers/dma-buf/reservation.c
+++ b/drivers/dma-buf/reservation.c
@@ -145,8 +145,8 @@ reservation_object_add_shared_replace(struct reservation_object *obj,
struct reservation_object_list *fobj,
struct dma_fence *fence)
{
- unsigned i;
struct dma_fence *old_fence = NULL;
+ unsigned i, j, k;
dma_fence_get(fence);
@@ -162,9 +162,7 @@ reservation_object_add_shared_replace(struct reservation_object *obj,
* references from the old struct are carried over to
* the new.
*/
- fobj->shared_count = old->shared_count;
-
- for (i = 0; i < old->shared_count; ++i) {
+ for (i = 0, j = 0, k = fobj->shared_max; i < old->shared_count; ++i) {
struct dma_fence *check;
check = rcu_dereference_protected(old->shared[i],
@@ -172,10 +170,18 @@ reservation_object_add_shared_replace(struct reservation_object *obj,
if (!old_fence && check->context == fence->context) {
old_fence = check;
- RCU_INIT_POINTER(fobj->shared[i], fence);
- } else
- RCU_INIT_POINTER(fobj->shared[i], check);
+ RCU_INIT_POINTER(fobj->shared[j++], fence);
+ } else if (!dma_fence_is_signaled(check)) {
+ RCU_INIT_POINTER(fobj->shared[j++], check);
+ } else {
+ /*
+ * Temporary save the signaled fences at the end of the
+ * new container
+ */
+ RCU_INIT_POINTER(fobj->shared[--k], check);
+ }
}
+ fobj->shared_count = j;
if (!old_fence) {
RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence);
fobj->shared_count++;
@@ -192,10 +198,20 @@ reservation_object_add_shared_replace(struct reservation_object *obj,
write_seqcount_end(&obj->seq);
preempt_enable();
- if (old)
- kfree_rcu(old, rcu);
-
dma_fence_put(old_fence);
+
+ if (!old)
+ return;
+
+ /* Drop the references to the signaled fences */
+ for (i = k; i < fobj->shared_max; ++i) {
+ struct dma_fence *f;
+
+ f = rcu_dereference_protected(fobj->shared[i],
+ reservation_object_held(obj));
+ dma_fence_put(f);
+ }
+ kfree_rcu(old, rcu);
}
/**
--
2.11.0
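The mechanics of the new loop are easier to follow outside the RCU and seqcount
machinery. The following standalone program is a rough userspace model of the
v2 scheme, not kernel code: struct fence, its signaled flag and the refs
counter are stand-ins invented for this sketch in place of struct dma_fence and
its refcounting. Fences which are not yet signaled are carried over to the
front of the new array, a fence from the same context is replaced by the new
one, and already signaled fences are parked at the tail so their references can
be dropped only after the new list would have been published.

#include <stdbool.h>
#include <stdio.h>

struct fence {
	unsigned context;
	bool signaled;
	int refs;
};

static void fence_put(struct fence *f)
{
	if (f && --f->refs == 0)
		printf("released fence from context %u\n", f->context);
}

/*
 * Build the new shared list from the old one while adding "fence":
 * unsignaled fences are packed at the front, signaled fences are parked
 * at the tail so their references can be dropped after the new list is
 * published. Returns the new shared_count.
 */
static unsigned add_shared_replace(struct fence **new, unsigned max,
				   struct fence **old, unsigned old_count,
				   struct fence *fence)
{
	struct fence *old_fence = NULL;
	unsigned i, j = 0, k = max;

	for (i = 0; i < old_count; ++i) {
		struct fence *check = old[i];

		if (!old_fence && check->context == fence->context) {
			old_fence = check;
			new[j++] = fence;	/* replaces the old one */
		} else if (!check->signaled) {
			new[j++] = check;	/* still busy, carry over */
		} else {
			new[--k] = check;	/* parked for release */
		}
	}
	if (!old_fence)
		new[j++] = fence;		/* nothing replaced, append */

	/* ...the kernel publishes the new list under the seqcount here... */

	fence_put(old_fence);
	while (k < max)				/* now drop parked references */
		fence_put(new[k++]);
	return j;
}

int main(void)
{
	struct fence done = { .context = 1, .signaled = true,  .refs = 1 };
	struct fence busy = { .context = 2, .signaled = false, .refs = 1 };
	struct fence add  = { .context = 3, .signaled = false, .refs = 1 };
	struct fence *old[] = { &done, &busy };
	struct fence *new[4];

	printf("kept %u fences\n", add_shared_replace(new, 4, old, 2, &add));
	return 0;
}

Parking the signaled fences instead of dropping them on the spot is the point
of the v2 note above: their references may only be released once readers can no
longer see the old array.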
* [PATCH 2/2] dma-buf: try to replace a signaled fence in reservation_object_add_shared_inplace
2017-10-30 14:59 [PATCH 1/2] dma-buf: keep only not signaled fence in reservation_object_add_shared_replace v2 Christian König
@ 2017-10-30 14:59 ` Christian König
2017-11-06 16:22 ` Chris Wilson
[not found] ` <20171030145904.18584-1-deathsimple-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
1 sibling, 1 reply; 17+ messages in thread
From: Christian König @ 2017-10-30 14:59 UTC (permalink / raw)
To: dri-devel, amd-gfx
From: Christian König <christian.koenig@amd.com>

The amdgpu issue of also needing signaled fences in the reservation objects
should be fixed by now.

Optimize the handling by replacing a signaled fence when adding a new
shared one.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/reservation.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
index 6fc794576997..a3928ce9f311 100644
--- a/drivers/dma-buf/reservation.c
+++ b/drivers/dma-buf/reservation.c
@@ -104,7 +104,8 @@ reservation_object_add_shared_inplace(struct reservation_object *obj,
 					   struct reservation_object_list *fobj,
 					   struct dma_fence *fence)
 {
-	u32 i;
+	struct dma_fence *signaled = NULL;
+	u32 i, signaled_idx;

 	dma_fence_get(fence);

@@ -126,17 +127,28 @@ reservation_object_add_shared_inplace(struct reservation_object *obj,
 			dma_fence_put(old_fence);
 			return;
 		}
+
+		if (!signaled && dma_fence_is_signaled(old_fence)) {
+			signaled = old_fence;
+			signaled_idx = i;
+		}
 	}

 	/*
 	 * memory barrier is added by write_seqcount_begin,
 	 * fobj->shared_count is protected by this lock too
 	 */
-	RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence);
-	fobj->shared_count++;
+	if (signaled) {
+		RCU_INIT_POINTER(fobj->shared[signaled_idx], fence);
+	} else {
+		RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence);
+		fobj->shared_count++;
+	}

 	write_seqcount_end(&obj->seq);
 	preempt_enable();
+
+	dma_fence_put(signaled);
 }

 static void
--
2.11.0
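Patch 2/2 changes the other path, reservation_object_add_shared_inplace(),
which is taken when the existing container still has room. Ignoring the
locking, RCU and refcounting, the behaviour reduces to the standalone sketch
below; struct fence and its fields are again stand-ins invented for the sketch,
not the dma_fence API. A fence from the same context is still replaced as
before; failing that, the first signaled slot is recycled, and the list only
grows when neither is found.

#include <stdbool.h>
#include <stdio.h>

struct fence {
	unsigned context;
	bool signaled;
};

/* Add "fence" to shared[]: reuse the slot of the same context if there is
 * one, otherwise recycle the first signaled slot, and only grow the list
 * when neither is found. Returns the new count. */
static unsigned add_shared_inplace(struct fence **shared, unsigned count,
				   struct fence *fence)
{
	struct fence *signaled = NULL;
	unsigned i, signaled_idx = 0;

	for (i = 0; i < count; ++i) {
		struct fence *old = shared[i];

		if (old->context == fence->context) {
			shared[i] = fence;	/* same context: replace */
			return count;
		}
		if (!signaled && old->signaled) {
			signaled = old;		/* remember first dead slot */
			signaled_idx = i;
		}
	}

	if (signaled) {
		shared[signaled_idx] = fence;	/* recycle a signaled slot */
		return count;
	}
	shared[count] = fence;			/* nothing to reuse: grow */
	return count + 1;
}

int main(void)
{
	struct fence done = { .context = 1, .signaled = true };
	struct fence busy = { .context = 2, .signaled = false };
	struct fence add  = { .context = 3, .signaled = false };
	struct fence *shared[4] = { &done, &busy };

	printf("count is now %u\n", add_shared_inplace(shared, 2, &add));
	return 0;
}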
* Re: [PATCH 2/2] dma-buf: try to replace a signaled fence in reservation_object_add_shared_inplace
2017-10-30 14:59 ` [PATCH 2/2] dma-buf: try to replace a signaled fence in reservation_object_add_shared_inplace Christian König
@ 2017-11-06 16:22 ` Chris Wilson
[not found] ` <150998532204.10527.4781311190552781355-M6iVdVfohj6unts5RBS2dVaTQe2KTcn/@public.gmane.org>
0 siblings, 1 reply; 17+ messages in thread
From: Chris Wilson @ 2017-11-06 16:22 UTC (permalink / raw)
To: Christian König, dri-devel, amd-gfx

Quoting Christian König (2017-10-30 14:59:04)
> From: Christian König <christian.koenig@amd.com>
>
> The amdgpu issue to also need signaled fences in the reservation objects should
> be fixed by now.
>
> Optimize the handling by replacing a signaled fence when adding a new
> shared one.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/dma-buf/reservation.c | 18 +++++++++++++++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
> index 6fc794576997..a3928ce9f311 100644
> --- a/drivers/dma-buf/reservation.c
> +++ b/drivers/dma-buf/reservation.c
> @@ -104,7 +104,8 @@ reservation_object_add_shared_inplace(struct reservation_object *obj,
>                                            struct reservation_object_list *fobj,
>                                            struct dma_fence *fence)
>  {
> -       u32 i;
> +       struct dma_fence *signaled = NULL;
> +       u32 i, signaled_idx;
>
>         dma_fence_get(fence);
>
> @@ -126,17 +127,28 @@ reservation_object_add_shared_inplace(struct reservation_object *obj,
>                         dma_fence_put(old_fence);
>                         return;
>                 }
> +
> +               if (!signaled && dma_fence_is_signaled(old_fence)) {
> +                       signaled = old_fence;
> +                       signaled_idx = i;
> +               }

How much do we care about only keeping one fence per-ctx here? You could
rearrange this to break on old_fence->context == fence->context ||
dma_fence_is_signaled(old_fence) then everyone can use the final block.
-Chris
[parent not found: <150998532204.10527.4781311190552781355-M6iVdVfohj6unts5RBS2dVaTQe2KTcn/@public.gmane.org>]
* Re: [PATCH 2/2] dma-buf: try to replace a signaled fence in reservation_object_add_shared_inplace [not found] ` <150998532204.10527.4781311190552781355-M6iVdVfohj6unts5RBS2dVaTQe2KTcn/@public.gmane.org> @ 2017-11-14 14:24 ` Christian König [not found] ` <4fc05061-e92d-1702-a0f6-ee337b7d3eb3-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> 0 siblings, 1 reply; 17+ messages in thread From: Christian König @ 2017-11-14 14:24 UTC (permalink / raw) To: Chris Wilson, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW Am 06.11.2017 um 17:22 schrieb Chris Wilson: > Quoting Christian König (2017-10-30 14:59:04) >> From: Christian König <christian.koenig@amd.com> >> >> The amdgpu issue to also need signaled fences in the reservation objects should >> be fixed by now. >> >> Optimize the handling by replacing a signaled fence when adding a new >> shared one. >> >> Signed-off-by: Christian König <christian.koenig@amd.com> >> --- >> drivers/dma-buf/reservation.c | 18 +++++++++++++++--- >> 1 file changed, 15 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c >> index 6fc794576997..a3928ce9f311 100644 >> --- a/drivers/dma-buf/reservation.c >> +++ b/drivers/dma-buf/reservation.c >> @@ -104,7 +104,8 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, >> struct reservation_object_list *fobj, >> struct dma_fence *fence) >> { >> - u32 i; >> + struct dma_fence *signaled = NULL; >> + u32 i, signaled_idx; >> >> dma_fence_get(fence); >> >> @@ -126,17 +127,28 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, >> dma_fence_put(old_fence); >> return; >> } >> + >> + if (!signaled && dma_fence_is_signaled(old_fence)) { >> + signaled = old_fence; >> + signaled_idx = i; >> + } > How much do we care about only keeping one fence per-ctx here? You could > rearrange this to break on old_fence->context == fence->context || > dma_fence_is_signaled(old_fence) then everyone can use the final block. Yeah, that is what David Zhou suggested as well. I've rejected this approach for now cause I think we still have cases where we rely on one fence per ctx (but I'm not 100% sure). I changed patch #1 in this series as you suggest and going to send that out once more in a minute. Can we get this upstream as is for now? I won't have much more time working on this. Regards, Christian. > -Chris _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 17+ messages in thread
[parent not found: <4fc05061-e92d-1702-a0f6-ee337b7d3eb3-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>]
* Re: [PATCH 2/2] dma-buf: try to replace a signaled fence in reservation_object_add_shared_inplace [not found] ` <4fc05061-e92d-1702-a0f6-ee337b7d3eb3-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> @ 2017-11-14 14:34 ` Chris Wilson 2017-11-15 16:55 ` Chris Wilson 0 siblings, 1 reply; 17+ messages in thread From: Chris Wilson @ 2017-11-14 14:34 UTC (permalink / raw) To: christian.koenig-5C7GfCeVMHo, Christian König, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW Quoting Christian König (2017-11-14 14:24:44) > Am 06.11.2017 um 17:22 schrieb Chris Wilson: > > Quoting Christian König (2017-10-30 14:59:04) > >> From: Christian König <christian.koenig@amd.com> > >> > >> The amdgpu issue to also need signaled fences in the reservation objects should > >> be fixed by now. > >> > >> Optimize the handling by replacing a signaled fence when adding a new > >> shared one. > >> > >> Signed-off-by: Christian König <christian.koenig@amd.com> > >> --- > >> drivers/dma-buf/reservation.c | 18 +++++++++++++++--- > >> 1 file changed, 15 insertions(+), 3 deletions(-) > >> > >> diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c > >> index 6fc794576997..a3928ce9f311 100644 > >> --- a/drivers/dma-buf/reservation.c > >> +++ b/drivers/dma-buf/reservation.c > >> @@ -104,7 +104,8 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, > >> struct reservation_object_list *fobj, > >> struct dma_fence *fence) > >> { > >> - u32 i; > >> + struct dma_fence *signaled = NULL; > >> + u32 i, signaled_idx; > >> > >> dma_fence_get(fence); > >> > >> @@ -126,17 +127,28 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, > >> dma_fence_put(old_fence); > >> return; > >> } > >> + > >> + if (!signaled && dma_fence_is_signaled(old_fence)) { > >> + signaled = old_fence; > >> + signaled_idx = i; > >> + } > > How much do we care about only keeping one fence per-ctx here? You could > > rearrange this to break on old_fence->context == fence->context || > > dma_fence_is_signaled(old_fence) then everyone can use the final block. > > Yeah, that is what David Zhou suggested as well. > > I've rejected this approach for now cause I think we still have cases > where we rely on one fence per ctx (but I'm not 100% sure). > > I changed patch #1 in this series as you suggest and going to send that > out once more in a minute. > > Can we get this upstream as is for now? I won't have much more time > working on this. Sure, we are only discussing how we might make it look tidier, pure micro-optimisation with the caveat of losing the one-fence-per-ctx guarantee. -Chris _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] dma-buf: try to replace a signaled fence in reservation_object_add_shared_inplace 2017-11-14 14:34 ` Chris Wilson @ 2017-11-15 16:55 ` Chris Wilson 2017-11-15 17:34 ` Christian König 0 siblings, 1 reply; 17+ messages in thread From: Chris Wilson @ 2017-11-15 16:55 UTC (permalink / raw) To: christian.koenig, Christian König, dri-devel, amd-gfx Quoting Chris Wilson (2017-11-14 14:34:05) > Quoting Christian König (2017-11-14 14:24:44) > > Am 06.11.2017 um 17:22 schrieb Chris Wilson: > > > Quoting Christian König (2017-10-30 14:59:04) > > >> @@ -126,17 +127,28 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, > > >> dma_fence_put(old_fence); > > >> return; > > >> } > > >> + > > >> + if (!signaled && dma_fence_is_signaled(old_fence)) { > > >> + signaled = old_fence; > > >> + signaled_idx = i; > > >> + } > > > How much do we care about only keeping one fence per-ctx here? You could > > > rearrange this to break on old_fence->context == fence->context || > > > dma_fence_is_signaled(old_fence) then everyone can use the final block. > > > > Yeah, that is what David Zhou suggested as well. > > > > I've rejected this approach for now cause I think we still have cases > > where we rely on one fence per ctx (but I'm not 100% sure). > > > > I changed patch #1 in this series as you suggest and going to send that > > out once more in a minute. > > > > Can we get this upstream as is for now? I won't have much more time > > working on this. > > Sure, we are only discussing how we might make it look tidier, pure > micro-optimisation with the caveat of losing the one-fence-per-ctx > guarantee. Ah, one thing to note is that extra checking pushed one of our corner case tests over its time limit. If we can completely forgo the one-fence-per-ctx here, what works really well in testing is diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c index 5319ac478918..5755e95fab1b 100644 --- a/drivers/dma-buf/reservation.c +++ b/drivers/dma-buf/reservation.c @@ -104,39 +104,19 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, struct reservation_object_list *fobj, struct dma_fence *fence) { - struct dma_fence *replace = NULL; - u32 ctx = fence->context; - u32 i; - dma_fence_get(fence); preempt_disable(); write_seqcount_begin(&obj->seq); - for (i = 0; i < fobj->shared_count; ++i) { - struct dma_fence *check; - - check = rcu_dereference_protected(fobj->shared[i], - reservation_object_held(obj)); - - if (check->context == ctx || dma_fence_is_signaled(check)) { - replace = old_fence; - break; - } - } - /* * memory barrier is added by write_seqcount_begin, * fobj->shared_count is protected by this lock too */ - RCU_INIT_POINTER(fobj->shared[i], fence); - if (!replace) - fobj->shared_count++; + RCU_INIT_POINTER(fobj->shared[fobj->shared_count++], fence); write_seqcount_end(&obj->seq); preempt_enable(); - - dma_fence_put(replace); } static void i.e. don't check when not replacing the shared[], on creating the new buffer we then discard all the old fences. It should work for amdgpu as well since you do a ht to coalesce redundant fences before queuing. -Chris _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] dma-buf: try to replace a signaled fence in reservation_object_add_shared_inplace 2017-11-15 16:55 ` Chris Wilson @ 2017-11-15 17:34 ` Christian König [not found] ` <1a565164-85cf-0678-b6ab-802ec047bb91-5C7GfCeVMHo@public.gmane.org> 0 siblings, 1 reply; 17+ messages in thread From: Christian König @ 2017-11-15 17:34 UTC (permalink / raw) To: Chris Wilson, Christian König, dri-devel, amd-gfx Am 15.11.2017 um 17:55 schrieb Chris Wilson: > Quoting Chris Wilson (2017-11-14 14:34:05) >> Quoting Christian König (2017-11-14 14:24:44) >>> Am 06.11.2017 um 17:22 schrieb Chris Wilson: >>>> Quoting Christian König (2017-10-30 14:59:04) >>>>> @@ -126,17 +127,28 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, >>>>> dma_fence_put(old_fence); >>>>> return; >>>>> } >>>>> + >>>>> + if (!signaled && dma_fence_is_signaled(old_fence)) { >>>>> + signaled = old_fence; >>>>> + signaled_idx = i; >>>>> + } >>>> How much do we care about only keeping one fence per-ctx here? You could >>>> rearrange this to break on old_fence->context == fence->context || >>>> dma_fence_is_signaled(old_fence) then everyone can use the final block. >>> Yeah, that is what David Zhou suggested as well. >>> >>> I've rejected this approach for now cause I think we still have cases >>> where we rely on one fence per ctx (but I'm not 100% sure). >>> >>> I changed patch #1 in this series as you suggest and going to send that >>> out once more in a minute. >>> >>> Can we get this upstream as is for now? I won't have much more time >>> working on this. >> Sure, we are only discussing how we might make it look tidier, pure >> micro-optimisation with the caveat of losing the one-fence-per-ctx >> guarantee. > Ah, one thing to note is that extra checking pushed one of our corner > case tests over its time limit. > > If we can completely forgo the one-fence-per-ctx here, what works really > well in testing is > > diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c > index 5319ac478918..5755e95fab1b 100644 > --- a/drivers/dma-buf/reservation.c > +++ b/drivers/dma-buf/reservation.c > @@ -104,39 +104,19 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, > struct reservation_object_list *fobj, > struct dma_fence *fence) > { > - struct dma_fence *replace = NULL; > - u32 ctx = fence->context; > - u32 i; > - > dma_fence_get(fence); > > preempt_disable(); > write_seqcount_begin(&obj->seq); > > - for (i = 0; i < fobj->shared_count; ++i) { > - struct dma_fence *check; > - > - check = rcu_dereference_protected(fobj->shared[i], > - reservation_object_held(obj)); > - > - if (check->context == ctx || dma_fence_is_signaled(check)) { > - replace = old_fence; > - break; > - } > - } > - > /* > * memory barrier is added by write_seqcount_begin, > * fobj->shared_count is protected by this lock too > */ > - RCU_INIT_POINTER(fobj->shared[i], fence); > - if (!replace) > - fobj->shared_count++; > + RCU_INIT_POINTER(fobj->shared[fobj->shared_count++], fence); > > write_seqcount_end(&obj->seq); > preempt_enable(); > - > - dma_fence_put(replace); > } > > static void > > i.e. don't check when not replacing the shared[], on creating the new > buffer we then discard all the old fences. > > It should work for amdgpu as well since you do a ht to coalesce > redundant fences before queuing. That won't work for all cases. This way the reservation object would keep growing without a chance to ever shrink. Christian. 
> -Chris
[parent not found: <1a565164-85cf-0678-b6ab-802ec047bb91-5C7GfCeVMHo@public.gmane.org>]
* Re: [PATCH 2/2] dma-buf: try to replace a signaled fence in reservation_object_add_shared_inplace [not found] ` <1a565164-85cf-0678-b6ab-802ec047bb91-5C7GfCeVMHo@public.gmane.org> @ 2017-11-15 17:43 ` Chris Wilson 2017-11-15 18:56 ` Christian König 0 siblings, 1 reply; 17+ messages in thread From: Chris Wilson @ 2017-11-15 17:43 UTC (permalink / raw) To: Christian König, Christian König, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW Quoting Christian König (2017-11-15 17:34:07) > Am 15.11.2017 um 17:55 schrieb Chris Wilson: > > Quoting Chris Wilson (2017-11-14 14:34:05) > >> Quoting Christian König (2017-11-14 14:24:44) > >>> Am 06.11.2017 um 17:22 schrieb Chris Wilson: > >>>> Quoting Christian König (2017-10-30 14:59:04) > >>>>> @@ -126,17 +127,28 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, > >>>>> dma_fence_put(old_fence); > >>>>> return; > >>>>> } > >>>>> + > >>>>> + if (!signaled && dma_fence_is_signaled(old_fence)) { > >>>>> + signaled = old_fence; > >>>>> + signaled_idx = i; > >>>>> + } > >>>> How much do we care about only keeping one fence per-ctx here? You could > >>>> rearrange this to break on old_fence->context == fence->context || > >>>> dma_fence_is_signaled(old_fence) then everyone can use the final block. > >>> Yeah, that is what David Zhou suggested as well. > >>> > >>> I've rejected this approach for now cause I think we still have cases > >>> where we rely on one fence per ctx (but I'm not 100% sure). > >>> > >>> I changed patch #1 in this series as you suggest and going to send that > >>> out once more in a minute. > >>> > >>> Can we get this upstream as is for now? I won't have much more time > >>> working on this. > >> Sure, we are only discussing how we might make it look tidier, pure > >> micro-optimisation with the caveat of losing the one-fence-per-ctx > >> guarantee. > > Ah, one thing to note is that extra checking pushed one of our corner > > case tests over its time limit. > > > > If we can completely forgo the one-fence-per-ctx here, what works really > > well in testing is > > > > diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c > > index 5319ac478918..5755e95fab1b 100644 > > --- a/drivers/dma-buf/reservation.c > > +++ b/drivers/dma-buf/reservation.c > > @@ -104,39 +104,19 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, > > struct reservation_object_list *fobj, > > struct dma_fence *fence) > > { > > - struct dma_fence *replace = NULL; > > - u32 ctx = fence->context; > > - u32 i; > > - > > dma_fence_get(fence); > > > > preempt_disable(); > > write_seqcount_begin(&obj->seq); > > > > - for (i = 0; i < fobj->shared_count; ++i) { > > - struct dma_fence *check; > > - > > - check = rcu_dereference_protected(fobj->shared[i], > > - reservation_object_held(obj)); > > - > > - if (check->context == ctx || dma_fence_is_signaled(check)) { > > - replace = old_fence; > > - break; > > - } > > - } > > - > > /* > > * memory barrier is added by write_seqcount_begin, > > * fobj->shared_count is protected by this lock too > > */ > > - RCU_INIT_POINTER(fobj->shared[i], fence); > > - if (!replace) > > - fobj->shared_count++; > > + RCU_INIT_POINTER(fobj->shared[fobj->shared_count++], fence); > > > > write_seqcount_end(&obj->seq); > > preempt_enable(); > > - > > - dma_fence_put(replace); > > } > > > > static void > > > > i.e. don't check when not replacing the shared[], on creating the new > > buffer we then discard all the old fences. 
> > > > It should work for amdgpu as well since you do a ht to coalesce > > redundant fences before queuing. > > That won't work for all cases. This way the reservation object would > keep growing without a chance to ever shrink. We only keep the active fences when it grows, which is effective enough to keep it in check on the workloads I can find in the hour since noticing the failure in CI ;) And on the workloads where it is being flooded with live fences from many contexts, the order of magnitude throughput improvement is not easy to ignore. -Chris _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] dma-buf: try to replace a signaled fence in reservation_object_add_shared_inplace 2017-11-15 17:43 ` Chris Wilson @ 2017-11-15 18:56 ` Christian König 2017-11-15 19:30 ` Chris Wilson 0 siblings, 1 reply; 17+ messages in thread From: Christian König @ 2017-11-15 18:56 UTC (permalink / raw) To: Chris Wilson, Christian König, dri-devel, amd-gfx Am 15.11.2017 um 18:43 schrieb Chris Wilson: > Quoting Christian König (2017-11-15 17:34:07) >> Am 15.11.2017 um 17:55 schrieb Chris Wilson: >>> Quoting Chris Wilson (2017-11-14 14:34:05) >>>> Quoting Christian König (2017-11-14 14:24:44) >>>>> Am 06.11.2017 um 17:22 schrieb Chris Wilson: >>>>>> Quoting Christian König (2017-10-30 14:59:04) >>>>>>> @@ -126,17 +127,28 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, >>>>>>> dma_fence_put(old_fence); >>>>>>> return; >>>>>>> } >>>>>>> + >>>>>>> + if (!signaled && dma_fence_is_signaled(old_fence)) { >>>>>>> + signaled = old_fence; >>>>>>> + signaled_idx = i; >>>>>>> + } >>>>>> How much do we care about only keeping one fence per-ctx here? You could >>>>>> rearrange this to break on old_fence->context == fence->context || >>>>>> dma_fence_is_signaled(old_fence) then everyone can use the final block. >>>>> Yeah, that is what David Zhou suggested as well. >>>>> >>>>> I've rejected this approach for now cause I think we still have cases >>>>> where we rely on one fence per ctx (but I'm not 100% sure). >>>>> >>>>> I changed patch #1 in this series as you suggest and going to send that >>>>> out once more in a minute. >>>>> >>>>> Can we get this upstream as is for now? I won't have much more time >>>>> working on this. >>>> Sure, we are only discussing how we might make it look tidier, pure >>>> micro-optimisation with the caveat of losing the one-fence-per-ctx >>>> guarantee. >>> Ah, one thing to note is that extra checking pushed one of our corner >>> case tests over its time limit. >>> >>> If we can completely forgo the one-fence-per-ctx here, what works really >>> well in testing is >>> >>> diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c >>> index 5319ac478918..5755e95fab1b 100644 >>> --- a/drivers/dma-buf/reservation.c >>> +++ b/drivers/dma-buf/reservation.c >>> @@ -104,39 +104,19 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, >>> struct reservation_object_list *fobj, >>> struct dma_fence *fence) >>> { >>> - struct dma_fence *replace = NULL; >>> - u32 ctx = fence->context; >>> - u32 i; >>> - >>> dma_fence_get(fence); >>> >>> preempt_disable(); >>> write_seqcount_begin(&obj->seq); >>> >>> - for (i = 0; i < fobj->shared_count; ++i) { >>> - struct dma_fence *check; >>> - >>> - check = rcu_dereference_protected(fobj->shared[i], >>> - reservation_object_held(obj)); >>> - >>> - if (check->context == ctx || dma_fence_is_signaled(check)) { >>> - replace = old_fence; >>> - break; >>> - } >>> - } >>> - >>> /* >>> * memory barrier is added by write_seqcount_begin, >>> * fobj->shared_count is protected by this lock too >>> */ >>> - RCU_INIT_POINTER(fobj->shared[i], fence); >>> - if (!replace) >>> - fobj->shared_count++; >>> + RCU_INIT_POINTER(fobj->shared[fobj->shared_count++], fence); >>> >>> write_seqcount_end(&obj->seq); >>> preempt_enable(); >>> - >>> - dma_fence_put(replace); >>> } >>> >>> static void >>> >>> i.e. don't check when not replacing the shared[], on creating the new >>> buffer we then discard all the old fences. 
>>> >>> It should work for amdgpu as well since you do a ht to coalesce >>> redundant fences before queuing. >> That won't work for all cases. This way the reservation object would >> keep growing without a chance to ever shrink. > We only keep the active fences when it grows, which is effective enough > to keep it in check on the workloads I can find in the hour since > noticing the failure in CI ;) Not sure what tests you run, but this change means that we basically just add the fence to an ever growing list of fences on every command submission which is only reaped when we double the list size. That is a serious no-go as far as I can see. What we could do is improve reservation_object_reserve_shared() as well to figure out how many signaled fences are actually in the array when we reallocate it. > And on the workloads where it is being > flooded with live fences from many contexts, the order of magnitude > throughput improvement is not easy to ignore. Well not sure about the Intel driver, but since we started to mostly use a single reservation object per process the time spend in those functions on amdgpu are completely negligible. Regards, Christian. > -Chris _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] dma-buf: try to replace a signaled fence in reservation_object_add_shared_inplace 2017-11-15 18:56 ` Christian König @ 2017-11-15 19:30 ` Chris Wilson 0 siblings, 0 replies; 17+ messages in thread From: Chris Wilson @ 2017-11-15 19:30 UTC (permalink / raw) To: christian.koenig, Christian König Quoting Christian König (2017-11-15 18:56:43) > Am 15.11.2017 um 18:43 schrieb Chris Wilson: > > Quoting Christian König (2017-11-15 17:34:07) > >> Am 15.11.2017 um 17:55 schrieb Chris Wilson: > >>> Quoting Chris Wilson (2017-11-14 14:34:05) > >>>> Quoting Christian König (2017-11-14 14:24:44) > >>>>> Am 06.11.2017 um 17:22 schrieb Chris Wilson: > >>>>>> Quoting Christian König (2017-10-30 14:59:04) > >>>>>>> @@ -126,17 +127,28 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, > >>>>>>> dma_fence_put(old_fence); > >>>>>>> return; > >>>>>>> } > >>>>>>> + > >>>>>>> + if (!signaled && dma_fence_is_signaled(old_fence)) { > >>>>>>> + signaled = old_fence; > >>>>>>> + signaled_idx = i; > >>>>>>> + } > >>>>>> How much do we care about only keeping one fence per-ctx here? You could > >>>>>> rearrange this to break on old_fence->context == fence->context || > >>>>>> dma_fence_is_signaled(old_fence) then everyone can use the final block. > >>>>> Yeah, that is what David Zhou suggested as well. > >>>>> > >>>>> I've rejected this approach for now cause I think we still have cases > >>>>> where we rely on one fence per ctx (but I'm not 100% sure). > >>>>> > >>>>> I changed patch #1 in this series as you suggest and going to send that > >>>>> out once more in a minute. > >>>>> > >>>>> Can we get this upstream as is for now? I won't have much more time > >>>>> working on this. > >>>> Sure, we are only discussing how we might make it look tidier, pure > >>>> micro-optimisation with the caveat of losing the one-fence-per-ctx > >>>> guarantee. > >>> Ah, one thing to note is that extra checking pushed one of our corner > >>> case tests over its time limit. > >>> > >>> If we can completely forgo the one-fence-per-ctx here, what works really > >>> well in testing is > >>> > >>> diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c > >>> index 5319ac478918..5755e95fab1b 100644 > >>> --- a/drivers/dma-buf/reservation.c > >>> +++ b/drivers/dma-buf/reservation.c > >>> @@ -104,39 +104,19 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, > >>> struct reservation_object_list *fobj, > >>> struct dma_fence *fence) > >>> { > >>> - struct dma_fence *replace = NULL; > >>> - u32 ctx = fence->context; > >>> - u32 i; > >>> - > >>> dma_fence_get(fence); > >>> > >>> preempt_disable(); > >>> write_seqcount_begin(&obj->seq); > >>> > >>> - for (i = 0; i < fobj->shared_count; ++i) { > >>> - struct dma_fence *check; > >>> - > >>> - check = rcu_dereference_protected(fobj->shared[i], > >>> - reservation_object_held(obj)); > >>> - > >>> - if (check->context == ctx || dma_fence_is_signaled(check)) { > >>> - replace = old_fence; > >>> - break; > >>> - } > >>> - } > >>> - > >>> /* > >>> * memory barrier is added by write_seqcount_begin, > >>> * fobj->shared_count is protected by this lock too > >>> */ > >>> - RCU_INIT_POINTER(fobj->shared[i], fence); > >>> - if (!replace) > >>> - fobj->shared_count++; > >>> + RCU_INIT_POINTER(fobj->shared[fobj->shared_count++], fence); > >>> > >>> write_seqcount_end(&obj->seq); > >>> preempt_enable(); > >>> - > >>> - dma_fence_put(replace); > >>> } > >>> > >>> static void > >>> > >>> i.e. 
don't check when not replacing the shared[], on creating the new > >>> buffer we then discard all the old fences. > >>> > >>> It should work for amdgpu as well since you do a ht to coalesce > >>> redundant fences before queuing. > >> That won't work for all cases. This way the reservation object would > >> keep growing without a chance to ever shrink. > > We only keep the active fences when it grows, which is effective enough > > to keep it in check on the workloads I can find in the hour since > > noticing the failure in CI ;) > > Not sure what tests you run, but this change means that we basically > just add the fence to an ever growing list of fences on every command > submission which is only reaped when we double the list size. Sure, just the frequency of doubling is also halved everytime. Then the entire array of shared is reaped when idle. Just throwing the issue out there; something that is already slow, just got noticeably slower. And that if we just ignore it, and only reap on reallocate, it vanishes. (It's just one case that happened to be caught by CI because it triggered a timeout, what CI isn't telling us is the dramatic improvement in other cases from avoiding an array of 1,000,000 stale fences on every object.) > That is a serious no-go as far as I can see. What we could do is improve > reservation_object_reserve_shared() as well to figure out how many > signaled fences are actually in the array when we reallocate it. Just amoritizing all the cost to the reallocate and then avoiding the growth should be enough. In the past I sent a conversion of the shared fences over to a compressed idr to avoid the cost of searching the shared array. It doesn't have the builtin reaping of stale fences; but we can do the discard on idle. > > And on the workloads where it is being > > flooded with live fences from many contexts, the order of magnitude > > throughput improvement is not easy to ignore. > > Well not sure about the Intel driver, but since we started to mostly use > a single reservation object per process the time spend in those > functions on amdgpu are completely negligible. Implicit fence tracking at object granularity. They are everywhere and very costly, until we can change the entirety of userspace (and dmabuf) to switch over to only using explicit fencing. Then it becomes somebody else's problem and the kernel need only track HW access which is a much smaller (and predefined) problem. -Chris _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 17+ messages in thread
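Christian's counter-proposal above, sizing the reallocation from the number of
fences that are actually still alive instead of growing blindly, never got code
in this thread. The fragment below is only a rough userspace illustration of
that idea; the helper name and the types are invented for the sketch and do not
correspond to an existing kernel function.

#include <stdbool.h>
#include <stdio.h>

struct fence {
	bool signaled;
};

/* Pick the size of the next shared[] allocation from the number of fences
 * that are still unsignaled, plus one slot for the fence being added,
 * rather than doubling unconditionally. */
static unsigned shared_size_for_reserve(struct fence **shared, unsigned count,
					unsigned shared_max)
{
	unsigned i, alive = 1;	/* the fence about to be added */

	for (i = 0; i < count; ++i)
		if (!shared[i]->signaled)
			alive++;

	return alive <= shared_max ? shared_max : shared_max * 2;
}

int main(void)
{
	struct fence done = { .signaled = true };
	struct fence busy = { .signaled = false };
	struct fence *shared[4] = { &done, &busy, &done, &busy };

	/* all four slots used, but two fences are already signaled */
	printf("next allocation: %u slots\n",
	       shared_size_for_reserve(shared, 4, 4));
	return 0;
}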
[parent not found: <20171030145904.18584-1-deathsimple-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>]
* Re: [PATCH 1/2] dma-buf: keep only not signaled fence in reservation_object_add_shared_replace v2
[not found] ` <20171030145904.18584-1-deathsimple-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
@ 2017-11-02 12:26 ` Sumit Semwal
2017-11-06 16:18 ` Chris Wilson
1 sibling, 0 replies; 17+ messages in thread
From: Sumit Semwal @ 2017-11-02 12:26 UTC (permalink / raw)
To: Christian König; +Cc: amd-gfx list, DRI mailing list

Hi Christian,

On 30 October 2017 at 20:29, Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
> From: Christian König <christian.koenig@amd.com>
>
> The amdgpu issue to also need signaled fences in the reservation objects
> should be fixed by now.
>
> Optimize the list by keeping only the not signaled yet fences around.

I think this warrants some more context in the commit log. Could you
please do the same for the other patch as well?

>
> v2: temporary put the signaled fences at the end of the new container
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/dma-buf/reservation.c | 36 ++++++++++++++++++++++++++----------
>  1 file changed, 26 insertions(+), 10 deletions(-)
>

Best,
Sumit.
* Re: [PATCH 1/2] dma-buf: keep only not signaled fence in reservation_object_add_shared_replace v2 [not found] ` <20171030145904.18584-1-deathsimple-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org> 2017-11-02 12:26 ` [PATCH 1/2] dma-buf: keep only not signaled fence in reservation_object_add_shared_replace v2 Sumit Semwal @ 2017-11-06 16:18 ` Chris Wilson 1 sibling, 0 replies; 17+ messages in thread From: Chris Wilson @ 2017-11-06 16:18 UTC (permalink / raw) To: Christian König, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW Quoting Christian König (2017-10-30 14:59:03) > From: Christian König <christian.koenig@amd.com> > > The amdgpu issue to also need signaled fences in the reservation objects > should be fixed by now. > > Optimize the list by keeping only the not signaled yet fences around. > > v2: temporary put the signaled fences at the end of the new container > > Signed-off-by: Christian König <christian.koenig@amd.com> > --- > drivers/dma-buf/reservation.c | 36 ++++++++++++++++++++++++++---------- > 1 file changed, 26 insertions(+), 10 deletions(-) > > diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c > index b44d9d7db347..6fc794576997 100644 > --- a/drivers/dma-buf/reservation.c > +++ b/drivers/dma-buf/reservation.c > @@ -145,8 +145,8 @@ reservation_object_add_shared_replace(struct reservation_object *obj, > struct reservation_object_list *fobj, > struct dma_fence *fence) > { > - unsigned i; > struct dma_fence *old_fence = NULL; > + unsigned i, j, k; > > dma_fence_get(fence); > > @@ -162,9 +162,7 @@ reservation_object_add_shared_replace(struct reservation_object *obj, > * references from the old struct are carried over to > * the new. > */ > - fobj->shared_count = old->shared_count; > - > - for (i = 0; i < old->shared_count; ++i) { > + for (i = 0, j = 0, k = fobj->shared_max; i < old->shared_count; ++i) { > struct dma_fence *check; > > check = rcu_dereference_protected(old->shared[i], > @@ -172,10 +170,18 @@ reservation_object_add_shared_replace(struct reservation_object *obj, > > if (!old_fence && check->context == fence->context) { > old_fence = check; > - RCU_INIT_POINTER(fobj->shared[i], fence); > - } else > - RCU_INIT_POINTER(fobj->shared[i], check); > + RCU_INIT_POINTER(fobj->shared[j++], fence); > + } else if (!dma_fence_is_signaled(check)) { > + RCU_INIT_POINTER(fobj->shared[j++], check); > + } else { > + /* > + * Temporary save the signaled fences at the end of the > + * new container > + */ > + RCU_INIT_POINTER(fobj->shared[--k], check); > + } > } > + fobj->shared_count = j; > if (!old_fence) { > RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence); > fobj->shared_count++; > @@ -192,10 +198,20 @@ reservation_object_add_shared_replace(struct reservation_object *obj, > write_seqcount_end(&obj->seq); > preempt_enable(); > > - if (old) > - kfree_rcu(old, rcu); > - > dma_fence_put(old_fence); We don't need to keep old_fence (local var) around any more in this scheme. 
By construction, the new shared_list is large enough to hold all the old_fences+1, so you can use a loop like j = 0; k = fobj->shared_max; RCU_INIT_POINTER(fobj->shared[j++], fence); for (i = 0; i < old->shared_count; ++i) { struct dma_fence *check; check = rcu_dereference_protected(old->shared[i], reservation_object_held(obj)); if (check->context == fence->context || dma_fence_is_signaled(check)) RCU_INIT_POINTER(fobj->shared[--k], check); else RCU_INIT_POINTER(fobj->shared[j++], check); } fobj->shared_count = j; > + > + if (!old) > + return; > + > + /* Drop the references to the signaled fences */ > + for (i = k; i < fobj->shared_max; ++i) { > + struct dma_fence *f; > + > + f = rcu_dereference_protected(fobj->shared[i], > + reservation_object_held(obj)); > + dma_fence_put(f); > + } > + kfree_rcu(old, rcu); > } Fwiw, this fixes a big problem with the lack of pruning of the reservation_object, so \o/ -Chris _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 1/2] dma-buf: keep only not signaled fence in reservation_object_add_shared_replace v3
@ 2017-11-14 14:24 Christian König
[not found] ` <20171114142436.1360-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
0 siblings, 1 reply; 17+ messages in thread
From: Christian König @ 2017-11-14 14:24 UTC (permalink / raw)
To: chris-Y6uKTt2uX1cEflXRtASbqLVCufUGDwFn,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
The amdgpu issue of also needing signaled fences in the reservation objects
should be fixed by now.
Optimize the list by keeping only the fences which are not yet signaled.
v2: temporarily put the signaled fences at the end of the new container
v3: put the old fence at the end of the new container as well.
Signed-off-by: Christian König <christian.koenig@amd.com>
---
drivers/dma-buf/reservation.c | 38 +++++++++++++++++++++-----------------
1 file changed, 21 insertions(+), 17 deletions(-)
diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
index b44d9d7db347..6d7a53dadf77 100644
--- a/drivers/dma-buf/reservation.c
+++ b/drivers/dma-buf/reservation.c
@@ -145,8 +145,7 @@ reservation_object_add_shared_replace(struct reservation_object *obj,
struct reservation_object_list *fobj,
struct dma_fence *fence)
{
- unsigned i;
- struct dma_fence *old_fence = NULL;
+ unsigned i, j, k;
dma_fence_get(fence);
@@ -162,24 +161,21 @@ reservation_object_add_shared_replace(struct reservation_object *obj,
* references from the old struct are carried over to
* the new.
*/
- fobj->shared_count = old->shared_count;
-
- for (i = 0; i < old->shared_count; ++i) {
+ for (i = 0, j = 0, k = fobj->shared_max; i < old->shared_count; ++i) {
struct dma_fence *check;
check = rcu_dereference_protected(old->shared[i],
reservation_object_held(obj));
- if (!old_fence && check->context == fence->context) {
- old_fence = check;
- RCU_INIT_POINTER(fobj->shared[i], fence);
- } else
- RCU_INIT_POINTER(fobj->shared[i], check);
- }
- if (!old_fence) {
- RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence);
- fobj->shared_count++;
+ if (check->context == fence->context ||
+ dma_fence_is_signaled(check))
+ RCU_INIT_POINTER(fobj->shared[--k], check);
+ else
+ RCU_INIT_POINTER(fobj->shared[j++], check);
}
+ fobj->shared_count = j;
+ RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence);
+ fobj->shared_count++;
done:
preempt_disable();
@@ -192,10 +188,18 @@ reservation_object_add_shared_replace(struct reservation_object *obj,
write_seqcount_end(&obj->seq);
preempt_enable();
- if (old)
- kfree_rcu(old, rcu);
+ if (!old)
+ return;
- dma_fence_put(old_fence);
+ /* Drop the references to the signaled fences */
+ for (i = k; i < fobj->shared_max; ++i) {
+ struct dma_fence *f;
+
+ f = rcu_dereference_protected(fobj->shared[i],
+ reservation_object_held(obj));
+ dma_fence_put(f);
+ }
+ kfree_rcu(old, rcu);
}
/**
--
2.11.0
[parent not found: <20171114142436.1360-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>]
* [PATCH 2/2] dma-buf: try to replace a signaled fence in reservation_object_add_shared_inplace [not found] ` <20171114142436.1360-1-christian.koenig-5C7GfCeVMHo@public.gmane.org> @ 2017-11-14 14:24 ` Christian König 2017-11-14 14:41 ` Chris Wilson 0 siblings, 1 reply; 17+ messages in thread From: Christian König @ 2017-11-14 14:24 UTC (permalink / raw) To: chris-Y6uKTt2uX1cEflXRtASbqLVCufUGDwFn, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW The amdgpu issue to also need signaled fences in the reservation objects should be fixed by now. Optimize the handling by replacing a signaled fence when adding a new shared one. Signed-off-by: Christian König <christian.koenig@amd.com> --- drivers/dma-buf/reservation.c | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c index 6d7a53dadf77..d72079742ca9 100644 --- a/drivers/dma-buf/reservation.c +++ b/drivers/dma-buf/reservation.c @@ -104,7 +104,8 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, struct reservation_object_list *fobj, struct dma_fence *fence) { - u32 i; + struct dma_fence *signaled = NULL; + u32 i, signaled_idx; dma_fence_get(fence); @@ -126,17 +127,28 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, dma_fence_put(old_fence); return; } + + if (!signaled && dma_fence_is_signaled(old_fence)) { + signaled = old_fence; + signaled_idx = i; + } } /* * memory barrier is added by write_seqcount_begin, * fobj->shared_count is protected by this lock too */ - RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence); - fobj->shared_count++; + if (signaled) { + RCU_INIT_POINTER(fobj->shared[signaled_idx], fence); + } else { + RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence); + fobj->shared_count++; + } write_seqcount_end(&obj->seq); preempt_enable(); + + dma_fence_put(signaled); } static void -- 2.11.0 _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] dma-buf: try to replace a signaled fence in reservation_object_add_shared_inplace 2017-11-14 14:24 ` [PATCH 2/2] dma-buf: try to replace a signaled fence in reservation_object_add_shared_inplace Christian König @ 2017-11-14 14:41 ` Chris Wilson 0 siblings, 0 replies; 17+ messages in thread From: Chris Wilson @ 2017-11-14 14:41 UTC (permalink / raw) To: Christian König, dri-devel, amd-gfx Quoting Christian König (2017-11-14 14:24:36) > The amdgpu issue to also need signaled fences in the reservation objects should > be fixed by now. > > Optimize the handling by replacing a signaled fence when adding a new > shared one. > > Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Does what you say, but not quite the same patch I've been testing. -Chris _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 1/2] dma-buf: keep only not signaled fence in reservation_object_add_shared_replace v2
@ 2017-10-31 8:43 Christian König
[not found] ` <20171031084306.1986-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
0 siblings, 1 reply; 17+ messages in thread
From: Christian König @ 2017-10-31 8:43 UTC (permalink / raw)
To: dri-devel, amd-gfx, sumit.semwal, christian.koenig
The amdgpu issue of also needing signaled fences in the reservation objects
should be fixed by now.
Optimize the list by keeping only the fences which are not yet signaled.
v2: temporarily put the signaled fences at the end of the new container
Signed-off-by: Christian König <christian.koenig@amd.com>
---
drivers/dma-buf/reservation.c | 36 ++++++++++++++++++++++++++----------
1 file changed, 26 insertions(+), 10 deletions(-)
diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
index b44d9d7db347..6fc794576997 100644
--- a/drivers/dma-buf/reservation.c
+++ b/drivers/dma-buf/reservation.c
@@ -145,8 +145,8 @@ reservation_object_add_shared_replace(struct reservation_object *obj,
struct reservation_object_list *fobj,
struct dma_fence *fence)
{
- unsigned i;
struct dma_fence *old_fence = NULL;
+ unsigned i, j, k;
dma_fence_get(fence);
@@ -162,9 +162,7 @@ reservation_object_add_shared_replace(struct reservation_object *obj,
* references from the old struct are carried over to
* the new.
*/
- fobj->shared_count = old->shared_count;
-
- for (i = 0; i < old->shared_count; ++i) {
+ for (i = 0, j = 0, k = fobj->shared_max; i < old->shared_count; ++i) {
struct dma_fence *check;
check = rcu_dereference_protected(old->shared[i],
@@ -172,10 +170,18 @@ reservation_object_add_shared_replace(struct reservation_object *obj,
if (!old_fence && check->context == fence->context) {
old_fence = check;
- RCU_INIT_POINTER(fobj->shared[i], fence);
- } else
- RCU_INIT_POINTER(fobj->shared[i], check);
+ RCU_INIT_POINTER(fobj->shared[j++], fence);
+ } else if (!dma_fence_is_signaled(check)) {
+ RCU_INIT_POINTER(fobj->shared[j++], check);
+ } else {
+ /*
+ * Temporary save the signaled fences at the end of the
+ * new container
+ */
+ RCU_INIT_POINTER(fobj->shared[--k], check);
+ }
}
+ fobj->shared_count = j;
if (!old_fence) {
RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence);
fobj->shared_count++;
@@ -192,10 +198,20 @@ reservation_object_add_shared_replace(struct reservation_object *obj,
write_seqcount_end(&obj->seq);
preempt_enable();
- if (old)
- kfree_rcu(old, rcu);
-
dma_fence_put(old_fence);
+
+ if (!old)
+ return;
+
+ /* Drop the references to the signaled fences */
+ for (i = k; i < fobj->shared_max; ++i) {
+ struct dma_fence *f;
+
+ f = rcu_dereference_protected(fobj->shared[i],
+ reservation_object_held(obj));
+ dma_fence_put(f);
+ }
+ kfree_rcu(old, rcu);
}
/**
--
2.11.0
[parent not found: <20171031084306.1986-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>]
* [PATCH 2/2] dma-buf: try to replace a signaled fence in reservation_object_add_shared_inplace [not found] ` <20171031084306.1986-1-christian.koenig-5C7GfCeVMHo@public.gmane.org> @ 2017-10-31 8:43 ` Christian König 0 siblings, 0 replies; 17+ messages in thread From: Christian König @ 2017-10-31 8:43 UTC (permalink / raw) To: dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, sumit.semwal-QSEj5FYQhm4dnm+yROfE0A, christian.koenig-5C7GfCeVMHo The amdgpu issue to also need signaled fences in the reservation objects should be fixed by now. Optimize the handling by replacing a signaled fence when adding a new shared one. Signed-off-by: Christian König <christian.koenig@amd.com> --- drivers/dma-buf/reservation.c | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c index 6fc794576997..a3928ce9f311 100644 --- a/drivers/dma-buf/reservation.c +++ b/drivers/dma-buf/reservation.c @@ -104,7 +104,8 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, struct reservation_object_list *fobj, struct dma_fence *fence) { - u32 i; + struct dma_fence *signaled = NULL; + u32 i, signaled_idx; dma_fence_get(fence); @@ -126,17 +127,28 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, dma_fence_put(old_fence); return; } + + if (!signaled && dma_fence_is_signaled(old_fence)) { + signaled = old_fence; + signaled_idx = i; + } } /* * memory barrier is added by write_seqcount_begin, * fobj->shared_count is protected by this lock too */ - RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence); - fobj->shared_count++; + if (signaled) { + RCU_INIT_POINTER(fobj->shared[signaled_idx], fence); + } else { + RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence); + fobj->shared_count++; + } write_seqcount_end(&obj->seq); preempt_enable(); + + dma_fence_put(signaled); } static void -- 2.11.0 _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH 1/2] dma-buf: keep only not signaled fence in reservation_object_add_shared_replace v2
@ 2017-10-31 8:41 Christian König
[not found] ` <20171031084137.1925-1-deathsimple-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
0 siblings, 1 reply; 17+ messages in thread
From: Christian König @ 2017-10-31 8:41 UTC (permalink / raw)
To: dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
sumit.semwal-QSEj5FYQhm4dnm+yROfE0A, christian.koenig-5C7GfCeVMHo
From: Christian König <christian.koenig@amd.com>
The amdgpu issue of also needing signaled fences in the reservation objects
should be fixed by now.
Optimize the list by keeping only the fences which are not yet signaled.
v2: temporarily put the signaled fences at the end of the new container
Signed-off-by: Christian König <christian.koenig@amd.com>
---
drivers/dma-buf/reservation.c | 36 ++++++++++++++++++++++++++----------
1 file changed, 26 insertions(+), 10 deletions(-)
diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
index b44d9d7db347..6fc794576997 100644
--- a/drivers/dma-buf/reservation.c
+++ b/drivers/dma-buf/reservation.c
@@ -145,8 +145,8 @@ reservation_object_add_shared_replace(struct reservation_object *obj,
struct reservation_object_list *fobj,
struct dma_fence *fence)
{
- unsigned i;
struct dma_fence *old_fence = NULL;
+ unsigned i, j, k;
dma_fence_get(fence);
@@ -162,9 +162,7 @@ reservation_object_add_shared_replace(struct reservation_object *obj,
* references from the old struct are carried over to
* the new.
*/
- fobj->shared_count = old->shared_count;
-
- for (i = 0; i < old->shared_count; ++i) {
+ for (i = 0, j = 0, k = fobj->shared_max; i < old->shared_count; ++i) {
struct dma_fence *check;
check = rcu_dereference_protected(old->shared[i],
@@ -172,10 +170,18 @@ reservation_object_add_shared_replace(struct reservation_object *obj,
if (!old_fence && check->context == fence->context) {
old_fence = check;
- RCU_INIT_POINTER(fobj->shared[i], fence);
- } else
- RCU_INIT_POINTER(fobj->shared[i], check);
+ RCU_INIT_POINTER(fobj->shared[j++], fence);
+ } else if (!dma_fence_is_signaled(check)) {
+ RCU_INIT_POINTER(fobj->shared[j++], check);
+ } else {
+ /*
+ * Temporary save the signaled fences at the end of the
+ * new container
+ */
+ RCU_INIT_POINTER(fobj->shared[--k], check);
+ }
}
+ fobj->shared_count = j;
if (!old_fence) {
RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence);
fobj->shared_count++;
@@ -192,10 +198,20 @@ reservation_object_add_shared_replace(struct reservation_object *obj,
write_seqcount_end(&obj->seq);
preempt_enable();
- if (old)
- kfree_rcu(old, rcu);
-
dma_fence_put(old_fence);
+
+ if (!old)
+ return;
+
+ /* Drop the references to the signaled fences */
+ for (i = k; i < fobj->shared_max; ++i) {
+ struct dma_fence *f;
+
+ f = rcu_dereference_protected(fobj->shared[i],
+ reservation_object_held(obj));
+ dma_fence_put(f);
+ }
+ kfree_rcu(old, rcu);
}
/**
--
2.11.0
[parent not found: <20171031084137.1925-1-deathsimple-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>]
* [PATCH 2/2] dma-buf: try to replace a signaled fence in reservation_object_add_shared_inplace [not found] ` <20171031084137.1925-1-deathsimple-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org> @ 2017-10-31 8:41 ` Christian König 0 siblings, 0 replies; 17+ messages in thread From: Christian König @ 2017-10-31 8:41 UTC (permalink / raw) To: dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, sumit.semwal-QSEj5FYQhm4dnm+yROfE0A, christian.koenig-5C7GfCeVMHo From: Christian König <christian.koenig@amd.com> The amdgpu issue to also need signaled fences in the reservation objects should be fixed by now. Optimize the handling by replacing a signaled fence when adding a new shared one. Signed-off-by: Christian König <christian.koenig@amd.com> --- drivers/dma-buf/reservation.c | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c index 6fc794576997..a3928ce9f311 100644 --- a/drivers/dma-buf/reservation.c +++ b/drivers/dma-buf/reservation.c @@ -104,7 +104,8 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, struct reservation_object_list *fobj, struct dma_fence *fence) { - u32 i; + struct dma_fence *signaled = NULL; + u32 i, signaled_idx; dma_fence_get(fence); @@ -126,17 +127,28 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, dma_fence_put(old_fence); return; } + + if (!signaled && dma_fence_is_signaled(old_fence)) { + signaled = old_fence; + signaled_idx = i; + } } /* * memory barrier is added by write_seqcount_begin, * fobj->shared_count is protected by this lock too */ - RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence); - fobj->shared_count++; + if (signaled) { + RCU_INIT_POINTER(fobj->shared[signaled_idx], fence); + } else { + RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence); + fobj->shared_count++; + } write_seqcount_end(&obj->seq); preempt_enable(); + + dma_fence_put(signaled); } static void -- 2.11.0 _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH 1/2] dma-buf: keep only not signaled fence in reservation_object_add_shared_replace
@ 2017-10-24 13:55 Christian König
2017-10-24 13:55 ` [PATCH 2/2] dma-buf: try to replace a signaled fence in reservation_object_add_shared_inplace Christian König
0 siblings, 1 reply; 17+ messages in thread
From: Christian König @ 2017-10-24 13:55 UTC (permalink / raw)
To: sumit.semwal, dri-devel, amd-gfx, zhoucm1
From: Christian König <christian.koenig@amd.com>
The amdgpu issue to also need signaled fences in the reservation objects
should be fixed by now.
Optimize the list by keeping only the not signaled yet fences around.
Signed-off-by: Christian König <christian.koenig@amd.com>
---
drivers/dma-buf/reservation.c | 31 +++++++++++++++++++++----------
1 file changed, 21 insertions(+), 10 deletions(-)
diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
index b44d9d7db347..4ede77d1bb31 100644
--- a/drivers/dma-buf/reservation.c
+++ b/drivers/dma-buf/reservation.c
@@ -145,8 +145,8 @@ reservation_object_add_shared_replace(struct reservation_object *obj,
struct reservation_object_list *fobj,
struct dma_fence *fence)
{
- unsigned i;
struct dma_fence *old_fence = NULL;
+ unsigned i, j, k;
dma_fence_get(fence);
@@ -162,9 +162,7 @@ reservation_object_add_shared_replace(struct reservation_object *obj,
* references from the old struct are carried over to
* the new.
*/
- fobj->shared_count = old->shared_count;
-
- for (i = 0; i < old->shared_count; ++i) {
+ for (i = 0, j = 0, k = old->shared_count; i < old->shared_count; ++i) {
struct dma_fence *check;
check = rcu_dereference_protected(old->shared[i],
@@ -172,10 +170,14 @@ reservation_object_add_shared_replace(struct reservation_object *obj,
if (!old_fence && check->context == fence->context) {
old_fence = check;
- RCU_INIT_POINTER(fobj->shared[i], fence);
- } else
- RCU_INIT_POINTER(fobj->shared[i], check);
+ RCU_INIT_POINTER(fobj->shared[j++], fence);
+ } else if (!dma_fence_is_signaled(check)) {
+ RCU_INIT_POINTER(fobj->shared[j++], check);
+ } else {
+ RCU_INIT_POINTER(fobj->shared[--k], check);
+ }
}
+ fobj->shared_count = j;
if (!old_fence) {
RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence);
fobj->shared_count++;
@@ -192,10 +194,19 @@ reservation_object_add_shared_replace(struct reservation_object *obj,
write_seqcount_end(&obj->seq);
preempt_enable();
- if (old)
- kfree_rcu(old, rcu);
-
dma_fence_put(old_fence);
+
+ if (!old)
+ return;
+
+ for (i = fobj->shared_count; i < old->shared_count; ++i) {
+ struct dma_fence *f;
+
+ f = rcu_dereference_protected(fobj->shared[i],
+ reservation_object_held(obj));
+ dma_fence_put(f);
+ }
+ kfree_rcu(old, rcu);
}
/**
--
2.11.0
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply related	[flat|nested] 17+ messages in thread
end of thread, other threads: [~2017-11-15 19:30 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-30 14:59 [PATCH 1/2] dma-buf: keep only not signaled fence in reservation_object_add_shared_replace v2 Christian König
2017-10-30 14:59 ` [PATCH 2/2] dma-buf: try to replace a signaled fence in reservation_object_add_shared_inplace Christian König
2017-11-06 16:22 ` Chris Wilson
[not found] ` <150998532204.10527.4781311190552781355-M6iVdVfohj6unts5RBS2dVaTQe2KTcn/@public.gmane.org>
2017-11-14 14:24 ` Christian König
[not found] ` <4fc05061-e92d-1702-a0f6-ee337b7d3eb3-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-11-14 14:34 ` Chris Wilson
2017-11-15 16:55 ` Chris Wilson
2017-11-15 17:34 ` Christian König
[not found] ` <1a565164-85cf-0678-b6ab-802ec047bb91-5C7GfCeVMHo@public.gmane.org>
2017-11-15 17:43 ` Chris Wilson
2017-11-15 18:56 ` Christian König
2017-11-15 19:30 ` Chris Wilson
[not found] ` <20171030145904.18584-1-deathsimple-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
2017-11-02 12:26 ` [PATCH 1/2] dma-buf: keep only not signaled fence in reservation_object_add_shared_replace v2 Sumit Semwal
2017-11-06 16:18 ` Chris Wilson
-- strict thread matches above, loose matches on Subject: below --
2017-11-14 14:24 [PATCH 1/2] dma-buf: keep only not signaled fence in reservation_object_add_shared_replace v3 Christian König
[not found] ` <20171114142436.1360-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2017-11-14 14:24 ` [PATCH 2/2] dma-buf: try to replace a signaled fence in reservation_object_add_shared_inplace Christian König
2017-11-14 14:41 ` Chris Wilson
2017-10-31 8:43 [PATCH 1/2] dma-buf: keep only not signaled fence in reservation_object_add_shared_replace v2 Christian König
[not found] ` <20171031084306.1986-1-christian.koenig-5C7GfCeVMHo@public.gmane.org>
2017-10-31 8:43 ` [PATCH 2/2] dma-buf: try to replace a signaled fence in reservation_object_add_shared_inplace Christian König
2017-10-31 8:41 [PATCH 1/2] dma-buf: keep only not signaled fence in reservation_object_add_shared_replace v2 Christian König
[not found] ` <20171031084137.1925-1-deathsimple-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
2017-10-31 8:41 ` [PATCH 2/2] dma-buf: try to replace a signaled fence in reservation_object_add_shared_inplace Christian König
2017-10-24 13:55 [PATCH 1/2] dma-buf: keep only not signaled fence in reservation_object_add_shared_replace Christian König
2017-10-24 13:55 ` [PATCH 2/2] dma-buf: try to replace a signaled fence in reservation_object_add_shared_inplace Christian König
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).