netdev.vger.kernel.org archive mirror
* [PATCH net-next] mlxsw: spectrum_router: fix xa_store() error checking
@ 2024-10-15  6:36 Yuan Can
  2024-10-15  8:06 ` Petr Machata
  0 siblings, 1 reply; 5+ messages in thread
From: Yuan Can @ 2024-10-15  6:36 UTC
  To: idosch, petrm, davem, edumazet, kuba, pabeni, netdev; +Cc: yuancan

It is meant to use xa_err() to extract the error encoded in the return
value of xa_store().

Fixes: 44c2fbebe18a ("mlxsw: spectrum_router: Share nexthop counters in resilient groups")
Signed-off-by: Yuan Can <yuancan@huawei.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
index 800dfb64ec83..7d6d859cef3f 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -3197,7 +3197,6 @@ mlxsw_sp_nexthop_sh_counter_get(struct mlxsw_sp *mlxsw_sp,
 {
 	struct mlxsw_sp_nexthop_group *nh_grp = nh->nhgi->nh_grp;
 	struct mlxsw_sp_nexthop_counter *nhct;
-	void *ptr;
 	int err;
 
 	nhct = xa_load(&nh_grp->nhgi->nexthop_counters, nh->id);
@@ -3210,12 +3209,10 @@ mlxsw_sp_nexthop_sh_counter_get(struct mlxsw_sp *mlxsw_sp,
 	if (IS_ERR(nhct))
 		return nhct;
 
-	ptr = xa_store(&nh_grp->nhgi->nexthop_counters, nh->id, nhct,
-		       GFP_KERNEL);
-	if (IS_ERR(ptr)) {
-		err = PTR_ERR(ptr);
+	err = xa_err(xa_store(&nh_grp->nhgi->nexthop_counters, nh->id, nhct,
+			      GFP_KERNEL));
+	if (err)
 		goto err_store;
-	}
 
 	return nhct;
 
-- 
2.17.1



* Re: [PATCH net-next] mlxsw: spectrum_router: fix xa_store() error checking
  2024-10-15  6:36 [PATCH net-next] mlxsw: spectrum_router: fix xa_store() error checking Yuan Can
@ 2024-10-15  8:06 ` Petr Machata
  2024-10-16  2:19   ` Yuan Can
  0 siblings, 1 reply; 5+ messages in thread
From: Petr Machata @ 2024-10-15  8:06 UTC
  To: Yuan Can; +Cc: idosch, petrm, davem, edumazet, kuba, pabeni, netdev


Yuan Can <yuancan@huawei.com> writes:

> It is meant to use xa_err() to extract the error encoded in the return
> value of xa_store().
>
> Fixes: 44c2fbebe18a ("mlxsw: spectrum_router: Share nexthop counters in resilient groups")
> Signed-off-by: Yuan Can <yuancan@huawei.com>

Reviewed-by: Petr Machata <petrm@nvidia.com>

What's the consequence of using IS_ERR()/PTR_ERR() vs. xa_err()? From
the documentation it looks like IS_ERR() might interpret some valid
pointers as errors[0]. Which would then show as leaks, because we bail
out early and never clean up?

I.e. should this aim at net rather than net-next? It looks like it's not
just semantics, but has actual observable impact.

[0] "The XArray does not support storing IS_ERR() pointers as some
    conflict with value entries or internal entries."

> ---
>  drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c | 9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
> index 800dfb64ec83..7d6d859cef3f 100644
> --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
> +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
> @@ -3197,7 +3197,6 @@ mlxsw_sp_nexthop_sh_counter_get(struct mlxsw_sp *mlxsw_sp,
>  {
>  	struct mlxsw_sp_nexthop_group *nh_grp = nh->nhgi->nh_grp;
>  	struct mlxsw_sp_nexthop_counter *nhct;
> -	void *ptr;
>  	int err;
>  
>  	nhct = xa_load(&nh_grp->nhgi->nexthop_counters, nh->id);
> @@ -3210,12 +3209,10 @@ mlxsw_sp_nexthop_sh_counter_get(struct mlxsw_sp *mlxsw_sp,
>  	if (IS_ERR(nhct))
>  		return nhct;
>  
> -	ptr = xa_store(&nh_grp->nhgi->nexthop_counters, nh->id, nhct,
> -		       GFP_KERNEL);
> -	if (IS_ERR(ptr)) {
> -		err = PTR_ERR(ptr);
> +	err = xa_err(xa_store(&nh_grp->nhgi->nexthop_counters, nh->id, nhct,
> +			      GFP_KERNEL));
> +	if (err)
>  		goto err_store;
> -	}


* Re: [PATCH net-next] mlxsw: spectrum_router: fix xa_store() error checking
  2024-10-15  8:06 ` Petr Machata
@ 2024-10-16  2:19   ` Yuan Can
  2024-10-16  9:41     ` Petr Machata
  2024-10-16 12:38     ` Przemek Kitszel
  0 siblings, 2 replies; 5+ messages in thread
From: Yuan Can @ 2024-10-16  2:19 UTC
  To: Petr Machata; +Cc: idosch, davem, edumazet, kuba, pabeni, netdev

On 2024/10/15 16:06, Petr Machata wrote:
> Yuan Can <yuancan@huawei.com> writes:
>
>> It is meant to use xa_err() to extract the error encoded in the return
>> value of xa_store().
>>
>> Fixes: 44c2fbebe18a ("mlxsw: spectrum_router: Share nexthop counters in resilient groups")
>> Signed-off-by: Yuan Can <yuancan@huawei.com>
> Reviewed-by: Petr Machata <petrm@nvidia.com>
>
> What's the consequence of using IS_ERR()/PTR_ERR() vs. xa_err()? From
> the documentation it looks like IS_ERR() might interpret some valid
> pointers as errors[0]. Which would then show as leaks, because we bail
> out early and never clean up?

At least PTR_ERR() will return a wrong error number, though the error number
seems to be neither used nor printed.
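
To make the wrong-errno point concrete, here is a minimal userspace sketch (not kernel code). The model_* helpers are hypothetical stand-ins written for this illustration; they assume the kernel's XA_ERROR() encodes an errno as (errno << 2) | 2 and that xa_err() undoes that shift, as in include/linux/xarray.h.

```c
#include <assert.h>
#include <errno.h>

#define MAX_ERRNO 4095

/* Model of XA_ERROR(): the XArray stores an errno as (errno << 2) | 2. */
static void *model_xa_error(long err)
{
	assert(err < 0 && err >= -MAX_ERRNO);
	return (void *)(((unsigned long)err << 2) | 2UL);
}

/* Model of PTR_ERR(): the pointer bits reinterpreted as a long. */
static long model_ptr_err(const void *ptr)
{
	return (long)ptr;
}

/*
 * Model of xa_err(): only an internal entry (low bits == 2) inside the
 * errno range is an error; the errno is recovered by shifting back.
 * Assumes an arithmetic right shift of negative values, as the kernel does.
 */
static int model_xa_err(const void *entry)
{
	unsigned long v = (unsigned long)entry;
	unsigned long lowest = ((unsigned long)-MAX_ERRNO << 2) | 2UL;

	if ((v & 3UL) == 2UL && v >= lowest)
		return (int)((long)v >> 2);
	return 0;
}
```

With these helpers, model_ptr_err(model_xa_error(-ENOMEM)) yields 4 * -ENOMEM + 2 (that is -46 where ENOMEM is 12), while model_xa_err() on the same value yields -ENOMEM.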

>
> I.e. should this aim at net rather than net-next? It looks like it's not
> just semantics, but has actual observable impact.
Ok, do I need to send a V2 patch to net branch?
>
> [0] "The XArray does not support storing IS_ERR() pointers as some
>      conflict with value entries or internal entries."
>
>> ---
>>   drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c | 9 +++------
>>   1 file changed, 3 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
>> index 800dfb64ec83..7d6d859cef3f 100644
>> --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
>> +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
>> @@ -3197,7 +3197,6 @@ mlxsw_sp_nexthop_sh_counter_get(struct mlxsw_sp *mlxsw_sp,
>>   {
>>   	struct mlxsw_sp_nexthop_group *nh_grp = nh->nhgi->nh_grp;
>>   	struct mlxsw_sp_nexthop_counter *nhct;
>> -	void *ptr;
>>   	int err;
>>   
>>   	nhct = xa_load(&nh_grp->nhgi->nexthop_counters, nh->id);
>> @@ -3210,12 +3209,10 @@ mlxsw_sp_nexthop_sh_counter_get(struct mlxsw_sp *mlxsw_sp,
>>   	if (IS_ERR(nhct))
>>   		return nhct;
>>   
>> -	ptr = xa_store(&nh_grp->nhgi->nexthop_counters, nh->id, nhct,
>> -		       GFP_KERNEL);
>> -	if (IS_ERR(ptr)) {
>> -		err = PTR_ERR(ptr);
>> +	err = xa_err(xa_store(&nh_grp->nhgi->nexthop_counters, nh->id, nhct,
>> +			      GFP_KERNEL));
>> +	if (err)
>>   		goto err_store;
>> -	}

-- 
Best regards,
Yuan Can



* Re: [PATCH net-next] mlxsw: spectrum_router: fix xa_store() error checking
  2024-10-16  2:19   ` Yuan Can
@ 2024-10-16  9:41     ` Petr Machata
  2024-10-16 12:38     ` Przemek Kitszel
  1 sibling, 0 replies; 5+ messages in thread
From: Petr Machata @ 2024-10-16  9:41 UTC
  To: Yuan Can; +Cc: Petr Machata, idosch, davem, edumazet, kuba, pabeni, netdev


Yuan Can <yuancan@huawei.com> writes:

> On 2024/10/15 16:06, Petr Machata wrote:
>> Yuan Can <yuancan@huawei.com> writes:
>>
>>> It is meant to use xa_err() to extract the error encoded in the return
>>> value of xa_store().
>>>
>>> Fixes: 44c2fbebe18a ("mlxsw: spectrum_router: Share nexthop counters in resilient groups")
>>> Signed-off-by: Yuan Can <yuancan@huawei.com>
>>
>> Reviewed-by: Petr Machata <petrm@nvidia.com>
>>
>> What's the consequence of using IS_ERR()/PTR_ERR() vs. xa_err()? From
>> the documentation it looks like IS_ERR() might interpret some valid
>> pointers as errors[0]. Which would then show as leaks, because we bail
>> out early and never clean up?
>
> At least PTR_ERR() will return a wrong error number, though the error number
> seems to be neither used nor printed.

What I'm saying is that if IS_ERR overestimates what is an error, we
bail out from mlxsw_sp_nexthop_sh_counter_get() with a failure, but
xa_store() actually succeeded, and the corresponding xa_erase is never
called, causing a leak.

(If IS_ERR underestimates what is an error, xa_store() fails to store the
allocated counter without us noticing, and counter sharing stops working.
This will waste HW resources, though I think it should still behave
correctly overall.)
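
A minimal userspace sketch of where the two checks can disagree, in both directions. The model_* helpers and first_errno_missed_by_is_err() are hypothetical names for this illustration; they assume IS_ERR() tests for the top MAX_ERRNO addresses and xa_is_err() tests for an internal entry (low bits == 2) at or above XA_ERROR(-MAX_ERRNO). As far as I can tell xa_store() is only documented to return -ENOMEM or -EINVAL, so the "underestimate" case is hypothetical for this call site.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ERRNO 4095

/* Model of XA_ERROR(): errno encoded as an XArray internal entry. */
static void *model_xa_error(long err)
{
	assert(err < 0 && err >= -MAX_ERRNO);
	return (void *)(((unsigned long)err << 2) | 2UL);
}

/* Model of IS_ERR(): true for the top MAX_ERRNO values of the address space. */
static bool model_is_err(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Model of xa_is_err(): internal entry within the encoded errno range. */
static bool model_xa_is_err(const void *entry)
{
	unsigned long v = (unsigned long)entry;

	return (v & 3UL) == 2UL &&
	       v >= (((unsigned long)-MAX_ERRNO << 2) | 2UL);
}

/* First errno (counting down from -1) whose XArray encoding IS_ERR() misses. */
static long first_errno_missed_by_is_err(void)
{
	for (long e = -1; e >= -MAX_ERRNO; e--)
		if (!model_is_err(model_xa_error(e)))
			return e;
	return 0;
}
```

In this model, an all-ones pointer value is an error for model_is_err() but not for model_xa_is_err() (overestimate), while the encoding of a large errno such as -2000 is an error for model_xa_is_err() but not for model_is_err() (underestimate). Small errnos such as -ENOMEM are flagged by both, which is why only the extracted number differs there.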

Anyway, it looks like net material to me.

>>
>> I.e. should this aim at net rather than net-next? It looks like it's not
>> just semantics, but has actual observable impact.
>
> Ok, do I need to send a V2 patch to net branch?

Yes please.


* Re: [PATCH net-next] mlxsw: spectrum_router: fix xa_store() error checking
  2024-10-16  2:19   ` Yuan Can
  2024-10-16  9:41     ` Petr Machata
@ 2024-10-16 12:38     ` Przemek Kitszel
  1 sibling, 0 replies; 5+ messages in thread
From: Przemek Kitszel @ 2024-10-16 12:38 UTC
  To: Yuan Can, Petr Machata; +Cc: idosch, davem, edumazet, kuba, pabeni, netdev

On 10/16/24 04:19, Yuan Can wrote:
> On 2024/10/15 16:06, Petr Machata wrote:
>> Yuan Can <yuancan@huawei.com> writes:
>>
>>> It is meant to use xa_err() to extract the error encoded in the return
>>> value of xa_store().
>>>
>>> Fixes: 44c2fbebe18a ("mlxsw: spectrum_router: Share nexthop counters in resilient groups")
>>> Signed-off-by: Yuan Can <yuancan@huawei.com>
>> Reviewed-by: Petr Machata <petrm@nvidia.com>
>>
>> What's the consequence of using IS_ERR()/PTR_ERR() vs. xa_err()? From
>> the documentation it looks like IS_ERR() might interpret some valid
>> pointers as errors[0]. 

It is an error to insert error pointers into an XArray, but @nhct is not
an error pointer thanks to the prior IS_ERR() check.

This patch correctly checks for the error returned from the XArray store
attempt, which is later (just after the context of the patch) converted
via ERR_PTR(), so:
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>

>> Which would then show as leaks, because we bail
>> out early and never clean up?
> 
> At least PTR_ERR() will return a wrong error number, though the error number
> seems to be neither used nor printed.
> 
>>
>> I.e. should this aim at net rather than net-next? It looks like it's not
>> just semantics, but has actual observable impact.
> Ok, do I need to send a V2 patch to net branch?
>>
>> [0] "The XArray does not support storing IS_ERR() pointers as some
>>      conflict with value entries or internal entries."
>>
>>> ---
>>>   drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c | 9 +++------
>>>   1 file changed, 3 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
>>> index 800dfb64ec83..7d6d859cef3f 100644
>>> --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
>>> +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
>>> @@ -3197,7 +3197,6 @@ mlxsw_sp_nexthop_sh_counter_get(struct mlxsw_sp *mlxsw_sp,
>>>   {
>>>   	struct mlxsw_sp_nexthop_group *nh_grp = nh->nhgi->nh_grp;
>>>   	struct mlxsw_sp_nexthop_counter *nhct;
>>> -	void *ptr;
>>>   	int err;
>>>   
>>>   	nhct = xa_load(&nh_grp->nhgi->nexthop_counters, nh->id);
>>> @@ -3210,12 +3209,10 @@ mlxsw_sp_nexthop_sh_counter_get(struct mlxsw_sp *mlxsw_sp,
>>>   	if (IS_ERR(nhct))
>>>   		return nhct;
>>>   
>>> -	ptr = xa_store(&nh_grp->nhgi->nexthop_counters, nh->id, nhct,
>>> -		       GFP_KERNEL);
>>> -	if (IS_ERR(ptr)) {
>>> -		err = PTR_ERR(ptr);
>>> +	err = xa_err(xa_store(&nh_grp->nhgi->nexthop_counters, nh->id, nhct,
>>> +			      GFP_KERNEL));
>>> +	if (err)
>>>   		goto err_store;
>>> -	}
> 

