linux-rdma.vger.kernel.org archive mirror
* [PATCH] RDMA/rxe: Fix parameter errors
@ 2023-01-19 18:05 Bob Pearson
  2023-01-19 19:18 ` Jason Gunthorpe
  0 siblings, 1 reply; 6+ messages in thread
From: Bob Pearson @ 2023-01-19 18:05 UTC (permalink / raw)
  To: jgg, zyjzyj2000, leonro, linux-rdma, Rao.Shoaib; +Cc: Bob Pearson

Correct errors in rxe_param.h caused by extending the range of
indices for MRs allowing it to overlap the range for MWs. Since
the driver determines whether an rkey is for an MR or MW by comparing
the index part of the rkey with these ranges this can cause an
MR to be incorrectly determined to be an MW.

Additionally the parameters which determine the size of the index
ranges for MR, MW, QP and SRQ are incorrect since the actual
number of integers in the range [min, max] is (max - min + 1) not
(max - min).

This patch corrects these errors.

Fixes: 0994a1bcd5f7 ("RDMA/rxe: Bump up default maximum values used via uverbs")
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_param.h | 27 +++++++++++++++++++--------
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_param.h b/drivers/infiniband/sw/rxe/rxe_param.h
index a754fc902e3d..14baa84d1d9d 100644
--- a/drivers/infiniband/sw/rxe/rxe_param.h
+++ b/drivers/infiniband/sw/rxe/rxe_param.h
@@ -91,18 +91,29 @@ enum rxe_device_param {
 
 	RXE_MIN_QP_INDEX		= 16,
 	RXE_MAX_QP_INDEX		= DEFAULT_MAX_VALUE,
-	RXE_MAX_QP			= DEFAULT_MAX_VALUE - RXE_MIN_QP_INDEX,
+	RXE_MAX_QP			= RXE_MAX_QP_INDEX
+						- RXE_MIN_QP_INDEX + 1,
 
 	RXE_MIN_SRQ_INDEX		= 0x00020001,
 	RXE_MAX_SRQ_INDEX		= DEFAULT_MAX_VALUE,
-	RXE_MAX_SRQ			= DEFAULT_MAX_VALUE - RXE_MIN_SRQ_INDEX,
-
-	RXE_MIN_MR_INDEX		= 0x00000001,
+	RXE_MAX_SRQ			= RXE_MAX_SRQ_INDEX
+						- RXE_MIN_SRQ_INDEX + 1,
+
+	/*
+	 * MR and MW indices are converted to rkeys by shifting
+	 * left 8 bits and oring in an 8 bit key which either
+	 * belongs to the driver or the user depending on the
+	 * MR type. In order to determine if the rkey is an MR
+	 * or an MW the index ranges below must not overlap.
+	 */
+	RXE_MIN_MR_INDEX		= 1,
 	RXE_MAX_MR_INDEX		= DEFAULT_MAX_VALUE,
-	RXE_MAX_MR			= DEFAULT_MAX_VALUE - RXE_MIN_MR_INDEX,
-	RXE_MIN_MW_INDEX		= 0x00010001,
-	RXE_MAX_MW_INDEX		= 0x00020000,
-	RXE_MAX_MW			= 0x00001000,
+	RXE_MAX_MR			= RXE_MAX_MR_INDEX
+						- RXE_MIN_MR_INDEX + 1,
+	RXE_MIN_MW_INDEX		= DEFAULT_MAX_VALUE + 1,
+	RXE_MAX_MW_INDEX		= 2*DEFAULT_MAX_VALUE,
+	RXE_MAX_MW			= RXE_MAX_MW_INDEX
+						- RXE_MIN_MW_INDEX + 1,
 
 	RXE_MAX_PKT_PER_ACK		= 64,
 
-- 
2.37.2
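
As an illustration only (this is not part of the patch): the comment added
to rxe_param.h above describes how an index becomes an rkey and why the MR
and MW index ranges must stay disjoint. The C fragment below is a minimal
sketch of that scheme using the ranges proposed in the patch. The value of
DEFAULT_MAX_VALUE is an assumption, since the thread does not state it, and
the helpers make_rkey(), rkey_index(), rkey_is_mr() and rkey_is_mw() are
hypothetical names, not functions from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value; the thread itself does not state DEFAULT_MAX_VALUE. */
#define DEFAULT_MAX_VALUE	(1u << 20)

/* Index ranges as proposed in the patch above. */
#define MIN_MR_INDEX		1u
#define MAX_MR_INDEX		DEFAULT_MAX_VALUE
#define MIN_MW_INDEX		(DEFAULT_MAX_VALUE + 1u)
#define MAX_MW_INDEX		(2u * DEFAULT_MAX_VALUE)

/* The classification below only works if the ranges are disjoint. */
_Static_assert(MAX_MR_INDEX < MIN_MW_INDEX,
	       "MR and MW index ranges must not overlap");

/* Hypothetical helper: shift the index left 8 bits and OR in an 8-bit key. */
static inline uint32_t make_rkey(uint32_t index, uint8_t key)
{
	assert(index <= MAX_MW_INDEX);
	return (index << 8) | key;
}

/* Hypothetical helper: recover the index part of an rkey. */
static inline uint32_t rkey_index(uint32_t rkey)
{
	return rkey >> 8;
}

/* Hypothetical helper: true if the index part falls in the MR range. */
static inline bool rkey_is_mr(uint32_t rkey)
{
	uint32_t index = rkey_index(rkey);

	return index >= MIN_MR_INDEX && index <= MAX_MR_INDEX;
}

/* Hypothetical helper: true if the index part falls in the MW range. */
static inline bool rkey_is_mw(uint32_t rkey)
{
	uint32_t index = rkey_index(rkey);

	return index >= MIN_MW_INDEX && index <= MAX_MW_INDEX;
}
```

With disjoint ranges, an rkey built from the largest MR index is classified
as an MR and never as an MW, whatever the 8-bit key is.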



* Re: [PATCH] RDMA/rxe: Fix parameter errors
  2023-01-19 18:05 [PATCH] RDMA/rxe: Fix parameter errors Bob Pearson
@ 2023-01-19 19:18 ` Jason Gunthorpe
  2023-01-19 20:18   ` Bob Pearson
  2023-03-01 23:15   ` Bob Pearson
  0 siblings, 2 replies; 6+ messages in thread
From: Jason Gunthorpe @ 2023-01-19 19:18 UTC (permalink / raw)
  To: Bob Pearson; +Cc: zyjzyj2000, leonro, linux-rdma, Rao.Shoaib

On Thu, Jan 19, 2023 at 12:05:07PM -0600, Bob Pearson wrote:
> Correct errors in rxe_param.h caused by extending the range of
> indices for MRs allowing it to overlap the range for MWs. Since
> the driver determines whether an rkey is for an MR or MW by comparing
> the index part of the rkey with these ranges this can cause an
> MR to be incorrectly determined to be an MW.
> 
> Additionally the parameters which determine the size of the index
> ranges for MR, MW, QP and SRQ are incorrect since the actual
> number of integers in the range [min, max] is (max - min + 1) not
> (max - min).
> 
> This patch corrects these errors.
> 
> Fixes: 0994a1bcd5f7 ("RDMA/rxe: Bump up default maximum values used via uverbs")
> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
> ---
>  drivers/infiniband/sw/rxe/rxe_param.h | 27 +++++++++++++++++++--------
>  1 file changed, 19 insertions(+), 8 deletions(-)

This

commit 1aefe5c177c1922119afb4ee443ddd6ac3140b37
Author: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
Date:   Tue Dec 20 17:08:48 2022 +0900

    RDMA/rxe: Prevent faulty rkey generation
    
    If you create MRs more than 0x10000 times after loading the module,
    responder starts to reply NAKs for RDMA/Atomic operations because of rkey
    violation detected in check_rkey(). The root cause is that rkeys are
    incremented each time a new MR is created and the value overflows into the
    range reserved for MWs.
    
    This commit also increases the value of RXE_MAX_MW that has been limited
    unlike other parameters.
    
    Fixes: 0994a1bcd5f7 ("RDMA/rxe: Bump up default maximum values used via uverbs")
    Link: https://lore.kernel.org/r/20221220080848.253785-2-matsuda-daisuke@fujitsu.com
    Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
    Tested-by: Li Zhijian <lizhijian@fujitsu.com>
    Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>


is already in v6.2-rc and conflicts with this patch. It looks like it
is doing the same thing; can you sort it out please?

Thanks,
Jason


* Re: [PATCH] RDMA/rxe: Fix parameter errors
  2023-01-19 19:18 ` Jason Gunthorpe
@ 2023-01-19 20:18   ` Bob Pearson
  2023-03-01 23:15   ` Bob Pearson
  1 sibling, 0 replies; 6+ messages in thread
From: Bob Pearson @ 2023-01-19 20:18 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: zyjzyj2000, leonro, linux-rdma, Rao.Shoaib

On 1/19/23 13:18, Jason Gunthorpe wrote:
> On Thu, Jan 19, 2023 at 12:05:07PM -0600, Bob Pearson wrote:
>> Correct errors in rxe_param.h caused by extending the range of
>> indices for MRs allowing it to overlap the range for MWs. Since
>> the driver determines whether an rkey is for an MR or MW by comparing
>> the index part of the rkey with these ranges this can cause an
>> MR to be incorrectly determined to be an MW.
>>
>> Additionally the parameters which determine the size of the index
>> ranges for MR, MW, QP and SRQ are incorrect since the actual
>> number of integers in the range [min, max] is (max - min + 1) not
>> (max - min).
>>
>> This patch corrects these errors.
>>
>> Fixes: 0994a1bcd5f7 ("RDMA/rxe: Bump up default maximum values used via uverbs")
>> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
>> ---
>>  drivers/infiniband/sw/rxe/rxe_param.h | 27 +++++++++++++++++++--------
>>  1 file changed, 19 insertions(+), 8 deletions(-)
> 
> This
> 
> commit 1aefe5c177c1922119afb4ee443ddd6ac3140b37
> Author: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
> Date:   Tue Dec 20 17:08:48 2022 +0900
> 
>     RDMA/rxe: Prevent faulty rkey generation
>     
>     If you create MRs more than 0x10000 times after loading the module,
>     responder starts to reply NAKs for RDMA/Atomic operations because of rkey
>     violation detected in check_rkey(). The root cause is that rkeys are
>     incremented each time a new MR is created and the value overflows into the
>     range reserved for MWs.
>     
>     This commit also increases the value of RXE_MAX_MW that has been limited
>     unlike other parameters.
>     
>     Fixes: 0994a1bcd5f7 ("RDMA/rxe: Bump up default maximum values used via uverbs")
>     Link: https://lore.kernel.org/r/20221220080848.253785-2-matsuda-daisuke@fujitsu.com
>     Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
>     Tested-by: Li Zhijian <lizhijian@fujitsu.com>
>     Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
>     Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> 
> 
> is already in v6.2-rc and conflicts with this patch. It looks like it
> is doing the same thing; can you sort it out please?
> 
> Thanks,
> Jason

Missed that one. Yes, the two patches are basically identical, except that
he cut the index range in half and gave one half each to MRs and MWs,
whereas I doubled it. The other change I made still fixes a bug, but a much
less important one: the driver reports an incorrect max_xxx number in the
HCA attributes, which has no ill effect.
We can leave it the way it is for now.

Bob
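
For illustration, the two layouts and the count correction discussed above
can be sketched in C. This is a hedged sketch, not code from either patch:
struct index_range and the helpers range_count(), ranges_overlap(),
split_halved() and split_doubled() are hypothetical names, and the exact
split point used by the "halved" layout is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type: a closed interval [min, max] of pool indices. */
struct index_range {
	uint32_t min;
	uint32_t max;
};

/* Number of integers in [min, max]: (max - min + 1), not (max - min). */
static inline uint32_t range_count(struct index_range r)
{
	assert(r.min <= r.max);
	return r.max - r.min + 1;
}

/* True if the two closed intervals share at least one index. */
static inline bool ranges_overlap(struct index_range a, struct index_range b)
{
	return a.min <= b.max && b.min <= a.max;
}

/* "Halved" layout: MRs and MWs each get half of [1, limit] (assumed split). */
static inline void split_halved(uint32_t limit, struct index_range *mr,
				struct index_range *mw)
{
	mr->min = 1;
	mr->max = limit >> 1;
	mw->min = mr->max + 1;
	mw->max = limit;
}

/* "Doubled" layout: MRs keep [1, limit], MWs get [limit + 1, 2 * limit]. */
static inline void split_doubled(uint32_t limit, struct index_range *mr,
				 struct index_range *mw)
{
	mr->min = 1;
	mr->max = limit;
	mw->min = limit + 1;
	mw->max = 2 * limit;
}
```

Either layout keeps the MR and MW ranges disjoint, which is the important
fix. Using (max - min) instead of range_count() only makes the reported
maximum one smaller than the number of usable indices.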


* Re: [PATCH] RDMA/rxe: Fix parameter errors
  2023-01-19 19:18 ` Jason Gunthorpe
  2023-01-19 20:18   ` Bob Pearson
@ 2023-03-01 23:15   ` Bob Pearson
  2023-03-06 20:51     ` Jason Gunthorpe
  1 sibling, 1 reply; 6+ messages in thread
From: Bob Pearson @ 2023-03-01 23:15 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: zyjzyj2000, leonro, linux-rdma, Rao.Shoaib

On 1/19/23 13:18, Jason Gunthorpe wrote:
> On Thu, Jan 19, 2023 at 12:05:07PM -0600, Bob Pearson wrote:
>> Correct errors in rxe_param.h caused by extending the range of
>> indices for MRs allowing it to overlap the range for MWs. Since
>> the driver determines whether an rkey is for an MR or MW by comparing
>> the index part of the rkey with these ranges this can cause an
>> MR to be incorrectly determined to be an MW.
>>
>> Additionally the parameters which determine the size of the index
>> ranges for MR, MW, QP and SRQ are incorrect since the actual
>> number of integers in the range [min, max] is (max - min + 1) not
>> (max - min).
>>
>> This patch corrects these errors.
>>
>> Fixes: 0994a1bcd5f7 ("RDMA/rxe: Bump up default maximum values used via uverbs")
>> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
>> ---
>>  drivers/infiniband/sw/rxe/rxe_param.h | 27 +++++++++++++++++++--------
>>  1 file changed, 19 insertions(+), 8 deletions(-)
> 
> This
> 
> commit 1aefe5c177c1922119afb4ee443ddd6ac3140b37
> Author: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
> Date:   Tue Dec 20 17:08:48 2022 +0900
> 
>     RDMA/rxe: Prevent faulty rkey generation
>     
>     If you create MRs more than 0x10000 times after loading the module,
>     responder starts to reply NAKs for RDMA/Atomic operations because of rkey
>     violation detected in check_rkey(). The root cause is that rkeys are
>     incremented each time a new MR is created and the value overflows into the
>     range reserved for MWs.
>     
>     This commit also increases the value of RXE_MAX_MW that has been limited
>     unlike other parameters.
>     
>     Fixes: 0994a1bcd5f7 ("RDMA/rxe: Bump up default maximum values used via uverbs")
>     Link: https://lore.kernel.org/r/20221220080848.253785-2-matsuda-daisuke@fujitsu.com
>     Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
>     Tested-by: Li Zhijian <lizhijian@fujitsu.com>
>     Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
>     Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> 
> 
> is already in v6.2-rc and conflicts with this patch. It looks like it
> is doing the same thing; can you sort it out please?
> 
> Thanks,
> Jason

Did this get lost? for-next is now at 6.2-rc3 and the bug is still in
rxe_param.h.

Bob


* Re: [PATCH] RDMA/rxe: Fix parameter errors
  2023-03-01 23:15   ` Bob Pearson
@ 2023-03-06 20:51     ` Jason Gunthorpe
  2023-03-13 19:55       ` Bob Pearson
  0 siblings, 1 reply; 6+ messages in thread
From: Jason Gunthorpe @ 2023-03-06 20:51 UTC (permalink / raw)
  To: Bob Pearson; +Cc: zyjzyj2000, leonro, linux-rdma, Rao.Shoaib

On Wed, Mar 01, 2023 at 05:15:07PM -0600, Bob Pearson wrote:
> On 1/19/23 13:18, Jason Gunthorpe wrote:
> > On Thu, Jan 19, 2023 at 12:05:07PM -0600, Bob Pearson wrote:
> >> Correct errors in rxe_param.h caused by extending the range of
> >> indices for MRs allowing it to overlap the range for MWs. Since
> >> the driver determines whether an rkey is for an MR or MW by comparing
> >> the index part of the rkey with these ranges this can cause an
> >> MR to be incorrectly determined to be an MW.
> >>
> >> Additionally the parameters which determine the size of the index
> >> ranges for MR, MW, QP and SRQ are incorrect since the actual
> >> number of integers in the range [min, max] is (max - min + 1) not
> >> (max - min).
> >>
> >> This patch corrects these errors.
> >>
> >> Fixes: 0994a1bcd5f7 ("RDMA/rxe: Bump up default maximum values used via uverbs")
> >> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
> >> ---
> >>  drivers/infiniband/sw/rxe/rxe_param.h | 27 +++++++++++++++++++--------
> >>  1 file changed, 19 insertions(+), 8 deletions(-)
> > 
> > This
> > 
> > commit 1aefe5c177c1922119afb4ee443ddd6ac3140b37
> > Author: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
> > Date:   Tue Dec 20 17:08:48 2022 +0900
> > 
> >     RDMA/rxe: Prevent faulty rkey generation
> >     
> >     If you create MRs more than 0x10000 times after loading the module,
> >     responder starts to reply NAKs for RDMA/Atomic operations because of rkey
> >     violation detected in check_rkey(). The root cause is that rkeys are
> >     incremented each time a new MR is created and the value overflows into the
> >     range reserved for MWs.
> >     
> >     This commit also increases the value of RXE_MAX_MW that has been limited
> >     unlike other parameters.
> >     
> >     Fixes: 0994a1bcd5f7 ("RDMA/rxe: Bump up default maximum values used via uverbs")
> >     Link: https://lore.kernel.org/r/20221220080848.253785-2-matsuda-daisuke@fujitsu.com
> >     Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
> >     Tested-by: Li Zhijian <lizhijian@fujitsu.com>
> >     Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
> >     Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > 
> > 
> > is already in v6.2-rc and conflicts with this patch. It looks like it
> > is doing the same thing; can you sort it out please?
> > 
> > Thanks,
> > Jason
> 
> Did this get lost? for-next is now at 6.2-rc3 and the bug is
> still in rxe_param.h.

Check again, we are at v6.3-rc1 now. If something needs to be fixed,
send a new patch.

Jason


* Re: [PATCH] RDMA/rxe: Fix parameter errors
  2023-03-06 20:51     ` Jason Gunthorpe
@ 2023-03-13 19:55       ` Bob Pearson
  0 siblings, 0 replies; 6+ messages in thread
From: Bob Pearson @ 2023-03-13 19:55 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: zyjzyj2000, leonro, linux-rdma, Rao.Shoaib

On 3/6/23 14:51, Jason Gunthorpe wrote:
> On Wed, Mar 01, 2023 at 05:15:07PM -0600, Bob Pearson wrote:
>> On 1/19/23 13:18, Jason Gunthorpe wrote:
>>> On Thu, Jan 19, 2023 at 12:05:07PM -0600, Bob Pearson wrote:
>>>> Correct errors in rxe_param.h caused by extending the range of
>>>> indices for MRs allowing it to overlap the range for MWs. Since
>>>> the driver determines whether an rkey is for an MR or MW by comparing
>>>> the index part of the rkey with these ranges this can cause an
>>>> MR to be incorrectly determined to be an MW.
>>>>
>>>> Additionally the parameters which determine the size of the index
>>>> ranges for MR, MW, QP and SRQ are incorrect since the actual
>>>> number of integers in the range [min, max] is (max - min + 1) not
>>>> (max - min).
>>>>
>>>> This patch corrects these errors.
>>>>
>>>> Fixes: 0994a1bcd5f7 ("RDMA/rxe: Bump up default maximum values used via uverbs")
>>>> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
>>>> ---
>>>>  drivers/infiniband/sw/rxe/rxe_param.h | 27 +++++++++++++++++++--------
>>>>  1 file changed, 19 insertions(+), 8 deletions(-)
>>>
>>> This
>>>
>>> commit 1aefe5c177c1922119afb4ee443ddd6ac3140b37
>>> Author: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
>>> Date:   Tue Dec 20 17:08:48 2022 +0900
>>>
>>>     RDMA/rxe: Prevent faulty rkey generation
>>>     
>>>     If you create MRs more than 0x10000 times after loading the module,
>>>     responder starts to reply NAKs for RDMA/Atomic operations because of rkey
>>>     violation detected in check_rkey(). The root cause is that rkeys are
>>>     incremented each time a new MR is created and the value overflows into the
>>>     range reserved for MWs.
>>>     
>>>     This commit also increases the value of RXE_MAX_MW that has been limited
>>>     unlike other parameters.
>>>     
>>>     Fixes: 0994a1bcd5f7 ("RDMA/rxe: Bump up default maximum values used via uverbs")
>>>     Link: https://lore.kernel.org/r/20221220080848.253785-2-matsuda-daisuke@fujitsu.com
>>>     Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
>>>     Tested-by: Li Zhijian <lizhijian@fujitsu.com>
>>>     Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
>>>     Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
>>>
>>>
>>> is already in v6.2-rc and conflicts with this patch. It looks like it
>>> is doing the same thing; can you sort it out please?
>>>
>>> Thanks,
>>> Jason
>>
>> Did this get lost? for-next is now at 6.2-rc3 and the bug is
>> still in rxe_param.h.
> 
> Check again, we are at v6.3-rc1 now. If something needs to be fixed,
> send a new patch.
> 
> Jason

Just checked. It now looks good in for-next.

Thanks

Bob
