All of lore.kernel.org
 help / color / mirror / Atom feed
* Ensure progress for dma_fence_array
@ 2024-11-08  9:42 Christian König
  2024-11-08  9:42 ` [PATCH] dma-buf: fix dma_fence_array_signaled Christian König
  2024-11-09  0:32 ` Ensure progress for dma_fence_array Chia-I Wu
  0 siblings, 2 replies; 11+ messages in thread
From: Christian König @ 2024-11-08  9:42 UTC (permalink / raw)
  To: boris.brezillon, olvaffe, maarten.lankhorst, mripard, tzimmermann,
	lionel.g.landwerlin, dri-devel, faith.ekstrand, simona

Hi guys,

as pointed out by Chia-I userspace doesn't see any progress when
signaling is not enabled and Boris noted that this is because
dma_fence_array_signaled() never returns true in this case.

Improve this by fixing the dma_fence_array_signaled() implementation to
also return true even if signaling was never explicitely enabled.

We should probably adjust the documentation as well that when the
callback is implemented it should make progess visible even without
enabling signaling.

Please test and review,
Christian.



^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] dma-buf: fix dma_fence_array_signaled
  2024-11-08  9:42 Ensure progress for dma_fence_array Christian König
@ 2024-11-08  9:42 ` Christian König
  2024-11-08 13:01   ` Tvrtko Ursulin
  2024-11-08 14:54   ` Boris Brezillon
  2024-11-09  0:32 ` Ensure progress for dma_fence_array Chia-I Wu
  1 sibling, 2 replies; 11+ messages in thread
From: Christian König @ 2024-11-08  9:42 UTC (permalink / raw)
  To: boris.brezillon, olvaffe, maarten.lankhorst, mripard, tzimmermann,
	lionel.g.landwerlin, dri-devel, faith.ekstrand, simona

The function silently assumed that signaling was already enabled for the
dma_fence_array. This meant that without enabling signaling first we would
never see forward progress.

Fix that by falling back to testing each individual fence when signaling
isn't enabled yet.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/dma-fence-array.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma-buf/dma-fence-array.c
index 46ac42bcfac0..01203796827a 100644
--- a/drivers/dma-buf/dma-fence-array.c
+++ b/drivers/dma-buf/dma-fence-array.c
@@ -103,10 +103,22 @@ static bool dma_fence_array_enable_signaling(struct dma_fence *fence)
 static bool dma_fence_array_signaled(struct dma_fence *fence)
 {
 	struct dma_fence_array *array = to_dma_fence_array(fence);
+	unsigned int i, num_pending;
 
-	if (atomic_read(&array->num_pending) > 0)
+	num_pending = atomic_read(&array->num_pending);
+	if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &array->base.flags)) {
+		if (!num_pending)
+			goto signal;
 		return false;
+	}
+
+	for (i = 0; i < array->num_fences; ++i) {
+		if (dma_fence_is_signaled(array->fences[i]) && !--num_pending)
+			goto signal;
+	}
+	return false;
 
+signal:
 	dma_fence_array_clear_pending_error(array);
 	return true;
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] dma-buf: fix dma_fence_array_signaled
  2024-11-08  9:42 ` [PATCH] dma-buf: fix dma_fence_array_signaled Christian König
@ 2024-11-08 13:01   ` Tvrtko Ursulin
  2024-11-08 14:18     ` Christian König
  2024-11-08 14:54   ` Boris Brezillon
  1 sibling, 1 reply; 11+ messages in thread
From: Tvrtko Ursulin @ 2024-11-08 13:01 UTC (permalink / raw)
  To: Christian König, boris.brezillon, olvaffe, maarten.lankhorst,
	mripard, tzimmermann, lionel.g.landwerlin, dri-devel,
	faith.ekstrand, simona


On 08/11/2024 09:42, Christian König wrote:
> The function silently assumed that signaling was already enabled for the
> dma_fence_array. This meant that without enabling signaling first we would
> never see forward progress.
> 
> Fix that by falling back to testing each individual fence when signaling
> isn't enabled yet.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>   drivers/dma-buf/dma-fence-array.c | 14 +++++++++++++-
>   1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma-buf/dma-fence-array.c
> index 46ac42bcfac0..01203796827a 100644
> --- a/drivers/dma-buf/dma-fence-array.c
> +++ b/drivers/dma-buf/dma-fence-array.c
> @@ -103,10 +103,22 @@ static bool dma_fence_array_enable_signaling(struct dma_fence *fence)
>   static bool dma_fence_array_signaled(struct dma_fence *fence)
>   {
>   	struct dma_fence_array *array = to_dma_fence_array(fence);
> +	unsigned int i, num_pending;
>   
> -	if (atomic_read(&array->num_pending) > 0)
> +	num_pending = atomic_read(&array->num_pending);
> +	if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &array->base.flags)) {
> +		if (!num_pending)
> +			goto signal;
>   		return false;
> +	}
> +
> +	for (i = 0; i < array->num_fences; ++i) {
> +		if (dma_fence_is_signaled(array->fences[i]) && !--num_pending)
> +			goto signal;
> +	}
> +	return false;

Sampling num_pending (and decrementing) and test_bit from an unlocked 
path makes one need to think if there are consequences, false negatives, 
positives or something. Would it be fine to simplify like the below?

static bool dma_fence_array_signaled(struct dma_fence *fence)
{
	struct dma_fence_array *array = to_dma_fence_array(fence);
	unsigned int i;

	if (atomic_read(&array->num_pending)) {
		for (i = 0; i < array->num_fences; i++) {
			if (!dma_fence_is_signaled(array->fences[i]))
				return false;
		}
	}

	dma_fence_array_clear_pending_error(array);
	return true;
}

Or if the optimisation to not walk the array when signalling is already 
enabled is deemed important, perhaps a less thinking inducing way would 
be this:

static bool dma_fence_array_signaled(struct dma_fence *fence)
{
	struct dma_fence_array *array = to_dma_fence_array(fence);
	unsigned int i;

	if (atomic_read(&array->num_pending) == 0)
		goto signalled;

	if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &array->base.flags))
		return false;

	for (i = 0; i < array->num_fences; i++) {
		if (!dma_fence_is_signaled(array->fences[i]))
			return false;
	}

signalled:
	dma_fence_array_clear_pending_error(array);
	return true;
}

Decrementing locally cached num_pending in the loop I think does not 
bring anything since when signalling is not enabled it will be stuck at 
num_fences. So the loop walks the whole array versus bail on first 
unsignalled, so latter even more efficient.

In which case, should dma-fence-chain also be aligned to have the fast 
path bail out?

Regards,

Tvrtko

>   
> +signal:
>   	dma_fence_array_clear_pending_error(array);
>   	return true;
>   }

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] dma-buf: fix dma_fence_array_signaled
  2024-11-08 13:01   ` Tvrtko Ursulin
@ 2024-11-08 14:18     ` Christian König
  2024-11-08 15:03       ` Boris Brezillon
  2024-11-08 15:52       ` Tvrtko Ursulin
  0 siblings, 2 replies; 11+ messages in thread
From: Christian König @ 2024-11-08 14:18 UTC (permalink / raw)
  To: Tvrtko Ursulin, boris.brezillon, olvaffe, maarten.lankhorst,
	mripard, tzimmermann, lionel.g.landwerlin, dri-devel,
	faith.ekstrand, simona

Am 08.11.24 um 14:01 schrieb Tvrtko Ursulin:
>
> On 08/11/2024 09:42, Christian König wrote:
>> The function silently assumed that signaling was already enabled for the
>> dma_fence_array. This meant that without enabling signaling first we 
>> would
>> never see forward progress.
>>
>> Fix that by falling back to testing each individual fence when signaling
>> isn't enabled yet.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>>   drivers/dma-buf/dma-fence-array.c | 14 +++++++++++++-
>>   1 file changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/dma-buf/dma-fence-array.c 
>> b/drivers/dma-buf/dma-fence-array.c
>> index 46ac42bcfac0..01203796827a 100644
>> --- a/drivers/dma-buf/dma-fence-array.c
>> +++ b/drivers/dma-buf/dma-fence-array.c
>> @@ -103,10 +103,22 @@ static bool 
>> dma_fence_array_enable_signaling(struct dma_fence *fence)
>>   static bool dma_fence_array_signaled(struct dma_fence *fence)
>>   {
>>       struct dma_fence_array *array = to_dma_fence_array(fence);
>> +    unsigned int i, num_pending;
>>   -    if (atomic_read(&array->num_pending) > 0)
>> +    num_pending = atomic_read(&array->num_pending);
>> +    if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, 
>> &array->base.flags)) {
>> +        if (!num_pending)
>> +            goto signal;
>>           return false;
>> +    }
>> +
>> +    for (i = 0; i < array->num_fences; ++i) {
>> +        if (dma_fence_is_signaled(array->fences[i]) && !--num_pending)
>> +            goto signal;
>> +    }
>> +    return false;
>
> Sampling num_pending (and decrementing) and test_bit from an unlocked 
> path makes one need to think if there are consequences, false 
> negatives, positives or something. Would it be fine to simplify like 
> the below?

Yeah I've played around with those ideas as well but came to the 
conclusion that neither of them are correct.

>
> static bool dma_fence_array_signaled(struct dma_fence *fence)
> {
>     struct dma_fence_array *array = to_dma_fence_array(fence);
>     unsigned int i;
>
>     if (atomic_read(&array->num_pending)) {
>         for (i = 0; i < array->num_fences; i++) {
>             if (!dma_fence_is_signaled(array->fences[i]))
>                 return false;

That's not correct. num_pending is not necessary equal to the number of 
fences in the array.

E.g. we have cases where num_pending is just 1 so that the 
dma_fence_array signals when *any* fence in it signals.

> }
>     }
>
>     dma_fence_array_clear_pending_error(array);
>     return true;
> }
>
> Or if the optimisation to not walk the array when signalling is 
> already enabled is deemed important, perhaps a less thinking inducing 
> way would be this:
...
> Decrementing locally cached num_pending in the loop I think does not 
> bring anything since when signalling is not enabled it will be stuck 
> at num_fences. So the loop walks the whole array versus bail on first 
> unsignalled, so latter even more efficient.

That is not for optimization but for correctness.

What the patch basically does is the following:
1. Grab the current value of num_pending.

2. Test if num_pending was potentially already modified because 
signaling is already enabled, if yes just test it and return the result.

3. If it wasn't modified go over the fences and see if we already have 
at least num_pending signaled.

I should probably add a code comment explaining that.

> In which case, should dma-fence-chain also be aligned to have the fast 
> path bail out?

Good point need to double check that code as well.

Thanks,
Christian.

>
> Regards,
>
> Tvrtko
>
>>   +signal:
>>       dma_fence_array_clear_pending_error(array);
>>       return true;
>>   }


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] dma-buf: fix dma_fence_array_signaled
  2024-11-08  9:42 ` [PATCH] dma-buf: fix dma_fence_array_signaled Christian König
  2024-11-08 13:01   ` Tvrtko Ursulin
@ 2024-11-08 14:54   ` Boris Brezillon
  1 sibling, 0 replies; 11+ messages in thread
From: Boris Brezillon @ 2024-11-08 14:54 UTC (permalink / raw)
  To: Christian König
  Cc: olvaffe, maarten.lankhorst, mripard, tzimmermann,
	lionel.g.landwerlin, dri-devel, faith.ekstrand, simona

On Fri,  8 Nov 2024 10:42:56 +0100
"Christian König" <ckoenig.leichtzumerken@gmail.com> wrote:

> The function silently assumed that signaling was already enabled for the
> dma_fence_array. This meant that without enabling signaling first we would
> never see forward progress.
> 
> Fix that by falling back to testing each individual fence when signaling
> isn't enabled yet.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/dma-buf/dma-fence-array.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma-buf/dma-fence-array.c
> index 46ac42bcfac0..01203796827a 100644
> --- a/drivers/dma-buf/dma-fence-array.c
> +++ b/drivers/dma-buf/dma-fence-array.c
> @@ -103,10 +103,22 @@ static bool dma_fence_array_enable_signaling(struct dma_fence *fence)
>  static bool dma_fence_array_signaled(struct dma_fence *fence)
>  {
>  	struct dma_fence_array *array = to_dma_fence_array(fence);
> +	unsigned int i, num_pending;
>  
> -	if (atomic_read(&array->num_pending) > 0)
> +	num_pending = atomic_read(&array->num_pending);
> +	if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &array->base.flags)) {
> +		if (!num_pending)
> +			goto signal;
>  		return false;
> +	}
> +
> +	for (i = 0; i < array->num_fences; ++i) {
> +		if (dma_fence_is_signaled(array->fences[i]) && !--num_pending)
> +			goto signal;
> +	}
> +	return false;
>  
> +signal:
>  	dma_fence_array_clear_pending_error(array);
>  	return true;
>  }

It would be good to have comments explaining what happens here. I think
I figured it out, but it's far from obvious:

- we need to read array->num_pending before checking the enable_signal
  bit to avoid racing with the enable_signaling() implementation,
  which might decrement the counter, and cause a partial check.
- the !--num_pending is here to account for the any_signaled case
- if we race with enable_signaling(), that means the !num_pending
  check in the is_signalling_enabled branch might be outdated
  (num_pending might have been decremented), but that's fine. The user
  will get the right value when testing again later

With this explained in comments, the patch is

`Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>`

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] dma-buf: fix dma_fence_array_signaled
  2024-11-08 14:18     ` Christian König
@ 2024-11-08 15:03       ` Boris Brezillon
  2024-11-08 15:52       ` Tvrtko Ursulin
  1 sibling, 0 replies; 11+ messages in thread
From: Boris Brezillon @ 2024-11-08 15:03 UTC (permalink / raw)
  To: Christian König
  Cc: Tvrtko Ursulin, olvaffe, maarten.lankhorst, mripard, tzimmermann,
	lionel.g.landwerlin, dri-devel, faith.ekstrand, simona

On Fri, 8 Nov 2024 15:18:38 +0100
Christian König <ckoenig.leichtzumerken@gmail.com> wrote:

> > }
> >     }
> >
> >     dma_fence_array_clear_pending_error(array);
> >     return true;
> > }
> >
> > Or if the optimisation to not walk the array when signalling is 
> > already enabled is deemed important, perhaps a less thinking inducing 
> > way would be this:  
> ...
> > Decrementing locally cached num_pending in the loop I think does not 
> > bring anything since when signalling is not enabled it will be stuck 
> > at num_fences. So the loop walks the whole array versus bail on first 
> > unsignalled, so latter even more efficient.  
> 
> That is not for optimization but for correctness.
> 
> What the patch basically does is the following:
> 1. Grab the current value of num_pending.
> 
> 2. Test if num_pending was potentially already modified because 
> signaling is already enabled, if yes just test it and return the result.
> 
> 3. If it wasn't modified go over the fences and see if we already have 
> at least num_pending signaled.
> 
> I should probably add a code comment explaining that.

Sorry, I didn't sync my inbox before replying. Looks like we're on the
same page here: the code clearly needs comments to explain what's going
on.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] dma-buf: fix dma_fence_array_signaled
  2024-11-08 14:18     ` Christian König
  2024-11-08 15:03       ` Boris Brezillon
@ 2024-11-08 15:52       ` Tvrtko Ursulin
  2024-11-12 12:16         ` Christian König
  1 sibling, 1 reply; 11+ messages in thread
From: Tvrtko Ursulin @ 2024-11-08 15:52 UTC (permalink / raw)
  To: Christian König, boris.brezillon, olvaffe, maarten.lankhorst,
	mripard, tzimmermann, lionel.g.landwerlin, dri-devel,
	faith.ekstrand, simona


On 08/11/2024 14:18, Christian König wrote:
> Am 08.11.24 um 14:01 schrieb Tvrtko Ursulin:
>>
>> On 08/11/2024 09:42, Christian König wrote:
>>> The function silently assumed that signaling was already enabled for the
>>> dma_fence_array. This meant that without enabling signaling first we 
>>> would
>>> never see forward progress.
>>>
>>> Fix that by falling back to testing each individual fence when signaling
>>> isn't enabled yet.
>>>
>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>> ---
>>>   drivers/dma-buf/dma-fence-array.c | 14 +++++++++++++-
>>>   1 file changed, 13 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/dma-buf/dma-fence-array.c 
>>> b/drivers/dma-buf/dma-fence-array.c
>>> index 46ac42bcfac0..01203796827a 100644
>>> --- a/drivers/dma-buf/dma-fence-array.c
>>> +++ b/drivers/dma-buf/dma-fence-array.c
>>> @@ -103,10 +103,22 @@ static bool 
>>> dma_fence_array_enable_signaling(struct dma_fence *fence)
>>>   static bool dma_fence_array_signaled(struct dma_fence *fence)
>>>   {
>>>       struct dma_fence_array *array = to_dma_fence_array(fence);
>>> +    unsigned int i, num_pending;
>>>   -    if (atomic_read(&array->num_pending) > 0)
>>> +    num_pending = atomic_read(&array->num_pending);
>>> +    if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, 
>>> &array->base.flags)) {
>>> +        if (!num_pending)
>>> +            goto signal;
>>>           return false;
>>> +    }
>>> +
>>> +    for (i = 0; i < array->num_fences; ++i) {
>>> +        if (dma_fence_is_signaled(array->fences[i]) && !--num_pending)
>>> +            goto signal;
>>> +    }
>>> +    return false;
>>
>> Sampling num_pending (and decrementing) and test_bit from an unlocked 
>> path makes one need to think if there are consequences, false 
>> negatives, positives or something. Would it be fine to simplify like 
>> the below?
> 
> Yeah I've played around with those ideas as well but came to the 
> conclusion that neither of them are correct.
> 
>>
>> static bool dma_fence_array_signaled(struct dma_fence *fence)
>> {
>>     struct dma_fence_array *array = to_dma_fence_array(fence);
>>     unsigned int i;
>>
>>     if (atomic_read(&array->num_pending)) {
>>         for (i = 0; i < array->num_fences; i++) {
>>             if (!dma_fence_is_signaled(array->fences[i]))
>>                 return false;
> 
> That's not correct. num_pending is not necessary equal to the number of 
> fences in the array.
> 
> E.g. we have cases where num_pending is just 1 so that the 
> dma_fence_array signals when *any* fence in it signals.

I forgot about that mode.

>> }
>>     }
>>
>>     dma_fence_array_clear_pending_error(array);
>>     return true;
>> }
>>
>> Or if the optimisation to not walk the array when signalling is 
>> already enabled is deemed important, perhaps a less thinking inducing 
>> way would be this:
> ...
>> Decrementing locally cached num_pending in the loop I think does not 
>> bring anything since when signalling is not enabled it will be stuck 
>> at num_fences. So the loop walks the whole array versus bail on first 
>> unsignalled, so latter even more efficient.
> 
> That is not for optimization but for correctness.
> 
> What the patch basically does is the following:
> 1. Grab the current value of num_pending.
> 
> 2. Test if num_pending was potentially already modified because 
> signaling is already enabled, if yes just test it and return the result.
> 
> 3. If it wasn't modified go over the fences and see if we already have 
> at least num_pending signaled.
> 
> I should probably add a code comment explaining that.

It would be good yes.

DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT can appear any time, even after the 
check, hence I am not sure the absence of the bit can be used to 
guarantee num_pending is stable. But I think this one is safe, since the 
loop will in any case find the desired number of signalled fences even 
if num_pending is stale (too high).

Also, can num_pending underflow in signal-on-any mode and if it can what 
can happen in unsigned int num_pending and the below loop. Potentially 
just one false negative with the following query returning signalled 
from the top level dma-fence code.

In summary I think patch works. I am just unsure if the above race can 
silently happen and cause one extra round trip through the query. If it 
can it still works, but definitely needs a big fat comment to explain it.

Regards,

Tvrtko

> 
>> In which case, should dma-fence-chain also be aligned to have the fast 
>> path bail out?
> 
> Good point need to double check that code as well.
> 
> Thanks,
> Christian.
> 
>>
>> Regards,
>>
>> Tvrtko
>>
>>>   +signal:
>>>       dma_fence_array_clear_pending_error(array);
>>>       return true;
>>>   }
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Ensure progress for dma_fence_array
  2024-11-08  9:42 Ensure progress for dma_fence_array Christian König
  2024-11-08  9:42 ` [PATCH] dma-buf: fix dma_fence_array_signaled Christian König
@ 2024-11-09  0:32 ` Chia-I Wu
  2024-11-12 12:00   ` Christian König
  1 sibling, 1 reply; 11+ messages in thread
From: Chia-I Wu @ 2024-11-09  0:32 UTC (permalink / raw)
  To: Christian König
  Cc: boris.brezillon, maarten.lankhorst, mripard, tzimmermann,
	lionel.g.landwerlin, dri-devel, faith.ekstrand, simona

On Fri, Nov 8, 2024 at 1:43 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Hi guys,
>
> as pointed out by Chia-I userspace doesn't see any progress when
> signaling is not enabled and Boris noted that this is because
> dma_fence_array_signaled() never returns true in this case.
>
> Improve this by fixing the dma_fence_array_signaled() implementation to
> also return true even if signaling was never explicitely enabled.
Yeah, this fixes the timeout I was seeing on panvk.

> We should probably adjust the documentation as well that when the
> callback is implemented it should make progess visible even without
> enabling signaling.
That would be really nice.  Both dma_fence_is_signaled and
dma_fence_ops::signaled explicitly state otherwise at the moment.



>
> Please test and review,
> Christian.
>
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Ensure progress for dma_fence_array
  2024-11-09  0:32 ` Ensure progress for dma_fence_array Chia-I Wu
@ 2024-11-12 12:00   ` Christian König
  2024-11-12 21:10     ` Chia-I Wu
  0 siblings, 1 reply; 11+ messages in thread
From: Christian König @ 2024-11-12 12:00 UTC (permalink / raw)
  To: Chia-I Wu
  Cc: boris.brezillon, maarten.lankhorst, mripard, tzimmermann,
	lionel.g.landwerlin, dri-devel, faith.ekstrand, simona

Am 09.11.24 um 01:32 schrieb Chia-I Wu:
> On Fri, Nov 8, 2024 at 1:43 AM Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
>> Hi guys,
>>
>> as pointed out by Chia-I userspace doesn't see any progress when
>> signaling is not enabled and Boris noted that this is because
>> dma_fence_array_signaled() never returns true in this case.
>>
>> Improve this by fixing the dma_fence_array_signaled() implementation to
>> also return true even if signaling was never explicitely enabled.
> Yeah, this fixes the timeout I was seeing on panvk.

Any objections to add your Tested-by?

>
>> We should probably adjust the documentation as well that when the
>> callback is implemented it should make progess visible even without
>> enabling signaling.
> That would be really nice.  Both dma_fence_is_signaled and
> dma_fence_ops::signaled explicitly state otherwise at the moment.

I have put that on my TODO list, but could take a while until I have 
time for it.

If anybody wants to make suggestions on the wording feel free to make a 
patch.

Thanks,
Christian.

>
>
>
>> Please test and review,
>> Christian.
>>
>>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] dma-buf: fix dma_fence_array_signaled
  2024-11-08 15:52       ` Tvrtko Ursulin
@ 2024-11-12 12:16         ` Christian König
  0 siblings, 0 replies; 11+ messages in thread
From: Christian König @ 2024-11-12 12:16 UTC (permalink / raw)
  To: Tvrtko Ursulin, boris.brezillon, olvaffe, maarten.lankhorst,
	mripard, tzimmermann, lionel.g.landwerlin, dri-devel,
	faith.ekstrand, simona

Am 08.11.24 um 16:52 schrieb Tvrtko Ursulin:
> On 08/11/2024 14:18, Christian König wrote:
>> Am 08.11.24 um 14:01 schrieb Tvrtko Ursulin:
>>>
>>> On 08/11/2024 09:42, Christian König wrote:
>>>> The function silently assumed that signaling was already enabled 
>>>> for the
>>>> dma_fence_array. This meant that without enabling signaling first 
>>>> we would
>>>> never see forward progress.
>>>>
>>>> Fix that by falling back to testing each individual fence when 
>>>> signaling
>>>> isn't enabled yet.
>>>>
>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>> ---
>>>>   drivers/dma-buf/dma-fence-array.c | 14 +++++++++++++-
>>>>   1 file changed, 13 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/dma-buf/dma-fence-array.c 
>>>> b/drivers/dma-buf/dma-fence-array.c
>>>> index 46ac42bcfac0..01203796827a 100644
>>>> --- a/drivers/dma-buf/dma-fence-array.c
>>>> +++ b/drivers/dma-buf/dma-fence-array.c
>>>> @@ -103,10 +103,22 @@ static bool 
>>>> dma_fence_array_enable_signaling(struct dma_fence *fence)
>>>>   static bool dma_fence_array_signaled(struct dma_fence *fence)
>>>>   {
>>>>       struct dma_fence_array *array = to_dma_fence_array(fence);
>>>> +    unsigned int i, num_pending;
>>>>   -    if (atomic_read(&array->num_pending) > 0)
>>>> +    num_pending = atomic_read(&array->num_pending);
>>>> +    if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, 
>>>> &array->base.flags)) {
>>>> +        if (!num_pending)
>>>> +            goto signal;
>>>>           return false;
>>>> +    }
>>>> +
>>>> +    for (i = 0; i < array->num_fences; ++i) {
>>>> +        if (dma_fence_is_signaled(array->fences[i]) && 
>>>> !--num_pending)
>>>> +            goto signal;
>>>> +    }
>>>> +    return false;
>>>
>>> Sampling num_pending (and decrementing) and test_bit from an 
>>> unlocked path makes one need to think if there are consequences, 
>>> false negatives, positives or something. Would it be fine to 
>>> simplify like the below?
>>
>> Yeah I've played around with those ideas as well but came to the 
>> conclusion that neither of them are correct.
>>
>>>
>>> static bool dma_fence_array_signaled(struct dma_fence *fence)
>>> {
>>>     struct dma_fence_array *array = to_dma_fence_array(fence);
>>>     unsigned int i;
>>>
>>>     if (atomic_read(&array->num_pending)) {
>>>         for (i = 0; i < array->num_fences; i++) {
>>>             if (!dma_fence_is_signaled(array->fences[i]))
>>>                 return false;
>>
>> That's not correct. num_pending is not necessary equal to the number 
>> of fences in the array.
>>
>> E.g. we have cases where num_pending is just 1 so that the 
>> dma_fence_array signals when *any* fence in it signals.
>
> I forgot about that mode.
>
>>> }
>>>     }
>>>
>>>     dma_fence_array_clear_pending_error(array);
>>>     return true;
>>> }
>>>
>>> Or if the optimisation to not walk the array when signalling is 
>>> already enabled is deemed important, perhaps a less thinking 
>>> inducing way would be this:
>> ...
>>> Decrementing locally cached num_pending in the loop I think does not 
>>> bring anything since when signalling is not enabled it will be stuck 
>>> at num_fences. So the loop walks the whole array versus bail on 
>>> first unsignalled, so latter even more efficient.
>>
>> That is not for optimization but for correctness.
>>
>> What the patch basically does is the following:
>> 1. Grab the current value of num_pending.
>>
>> 2. Test if num_pending was potentially already modified because 
>> signaling is already enabled, if yes just test it and return the result.
>>
>> 3. If it wasn't modified go over the fences and see if we already 
>> have at least num_pending signaled.
>>
>> I should probably add a code comment explaining that.
>
> It would be good yes.
>
> DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT can appear any time, even after the 
> check, hence I am not sure the absence of the bit can be used to 
> guarantee num_pending is stable. But I think this one is safe, since 
> the loop will in any case find the desired number of signalled fences 
> even if num_pending is stale (too high).
>
> Also, can num_pending underflow in signal-on-any mode and if it can 
> what can happen in unsigned int num_pending and the below loop. 
> Potentially just one false negative with the following query returning 
> signalled from the top level dma-fence code.

Oh, good point. Yeah num_pending can indeed underflow when signaling is 
enabled. I've fixed the check accordingly.

> In summary I think patch works. I am just unsure if the above race can 
> silently happen and cause one extra round trip through the query. If 
> it can it still works, but definitely needs a big fat comment to 
> explain it.

I've added the summary from Boris as comment.

Thanks,
Christian.

>
> Regards,
>
> Tvrtko
>
>>
>>> In which case, should dma-fence-chain also be aligned to have the 
>>> fast path bail out?
>>
>> Good point need to double check that code as well.
>>
>> Thanks,
>> Christian.
>>
>>>
>>> Regards,
>>>
>>> Tvrtko
>>>
>>>>   +signal:
>>>>       dma_fence_array_clear_pending_error(array);
>>>>       return true;
>>>>   }
>>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Ensure progress for dma_fence_array
  2024-11-12 12:00   ` Christian König
@ 2024-11-12 21:10     ` Chia-I Wu
  0 siblings, 0 replies; 11+ messages in thread
From: Chia-I Wu @ 2024-11-12 21:10 UTC (permalink / raw)
  To: Christian König
  Cc: boris.brezillon, maarten.lankhorst, mripard, tzimmermann,
	lionel.g.landwerlin, dri-devel, faith.ekstrand, simona

On Tue, Nov 12, 2024 at 4:00 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Am 09.11.24 um 01:32 schrieb Chia-I Wu:
> > On Fri, Nov 8, 2024 at 1:43 AM Christian König
> > <ckoenig.leichtzumerken@gmail.com> wrote:
> >> Hi guys,
> >>
> >> as pointed out by Chia-I userspace doesn't see any progress when
> >> signaling is not enabled and Boris noted that this is because
> >> dma_fence_array_signaled() never returns true in this case.
> >>
> >> Improve this by fixing the dma_fence_array_signaled() implementation to
> >> also return true even if signaling was never explicitely enabled.
> > Yeah, this fixes the timeout I was seeing on panvk.
>
> Any objections to add your Tested-by?
No.  v3 is also

  Tested-by: Chia-I Wu <olvaffe@gmail.com>

>
> >
> >> We should probably adjust the documentation as well that when the
> >> callback is implemented it should make progess visible even without
> >> enabling signaling.
> > That would be really nice.  Both dma_fence_is_signaled and
> > dma_fence_ops::signaled explicitly state otherwise at the moment.
>
> I have put that on my TODO list, but could take a while until I have
> time for it.
>
> If anybody wants to make suggestions on the wording feel free to make a
> patch.
>
> Thanks,
> Christian.
>
> >
> >
> >
> >> Please test and review,
> >> Christian.
> >>
> >>
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2024-11-12 21:10 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-08  9:42 Ensure progress for dma_fence_array Christian König
2024-11-08  9:42 ` [PATCH] dma-buf: fix dma_fence_array_signaled Christian König
2024-11-08 13:01   ` Tvrtko Ursulin
2024-11-08 14:18     ` Christian König
2024-11-08 15:03       ` Boris Brezillon
2024-11-08 15:52       ` Tvrtko Ursulin
2024-11-12 12:16         ` Christian König
2024-11-08 14:54   ` Boris Brezillon
2024-11-09  0:32 ` Ensure progress for dma_fence_array Chia-I Wu
2024-11-12 12:00   ` Christian König
2024-11-12 21:10     ` Chia-I Wu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.