* [PATCH for-next 0/2] Enable rcu locking of verbs objects
@ 2023-07-18 17:59 Bob Pearson
2023-07-18 17:59 ` [PATCH for-next 1/2] RDMA/core: Support drivers use of rcu locking Bob Pearson
2023-07-18 17:59 ` [PATCH for-next 2/2] RDMA/rxe: Enable rcu locking of indexed objects Bob Pearson
0 siblings, 2 replies; 10+ messages in thread
From: Bob Pearson @ 2023-07-18 17:59 UTC (permalink / raw)
To: jgg, leon, zyjzyj2000, linux-rdma, jhack; +Cc: Bob Pearson
This patch set consists of two patches. The first adds code to
rdma-core to optionally use kfree_rcu instead of kfree for three
of the verbs objects which are looked up by their indices in the
rxe driver and which are freed in the destroy verbs in rdma-core.
The second patch adds rcu_head to the private data in the rxe
driver and sets the offsets to these structs. This allows the
rxe driver to correctly use rcu locking on these objects.
Bob Pearson (2):
RDMA/core: Support drivers use of rcu locking
RDMA/rxe: Enable rcu locking of indexed objects
drivers/infiniband/core/uverbs_main.c | 2 +-
drivers/infiniband/core/verbs.c | 6 +++---
drivers/infiniband/sw/rxe/rxe_pool.h | 1 +
drivers/infiniband/sw/rxe/rxe_verbs.c | 6 +++++-
include/rdma/ib_verbs.h | 24 ++++++++++++++++++++++++
5 files changed, 34 insertions(+), 5 deletions(-)
base-commit: f877f22ac1e9bf1f9aded3765b0012851e1dc4c5
--
2.39.2
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH for-next 1/2] RDMA/core: Support drivers use of rcu locking
2023-07-18 17:59 [PATCH for-next 0/2] Enable rcu locking of verbs objects Bob Pearson
@ 2023-07-18 17:59 ` Bob Pearson
2023-08-09 19:10 ` Jason Gunthorpe
2023-07-18 17:59 ` [PATCH for-next 2/2] RDMA/rxe: Enable rcu locking of indexed objects Bob Pearson
1 sibling, 1 reply; 10+ messages in thread
From: Bob Pearson @ 2023-07-18 17:59 UTC (permalink / raw)
To: jgg, leon, zyjzyj2000, linux-rdma, jhack; +Cc: Bob Pearson
This patch allows drivers to optionally include struct rcu_head
in their private object data structs and have rdma-core use kfree_rcu
to free the objects.
The offsets of the rcu_heads are stored in fields in struct
ib_device_ops and a macro RDMA_KFREE_RCU is introduced which calls
(an open coded) kfree_rcu instead of kfree if the value is non-
zero. Currently the supported object types are AH, QP and MW.
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
drivers/infiniband/core/uverbs_main.c | 2 +-
drivers/infiniband/core/verbs.c | 6 +++---
include/rdma/ib_verbs.h | 24 ++++++++++++++++++++++++
3 files changed, 28 insertions(+), 4 deletions(-)
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index 7c9c79c13941..50497e550f18 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -112,7 +112,7 @@ int uverbs_dealloc_mw(struct ib_mw *mw)
return ret;
atomic_dec(&pd->usecnt);
- kfree(mw);
+ RDMA_KFREE_RCU(mw);
return ret;
}
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index b99b3cc283b6..a49ae8c52c66 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -982,7 +982,7 @@ int rdma_destroy_ah_user(struct ib_ah *ah, u32 flags, struct ib_udata *udata)
if (sgid_attr)
rdma_put_gid_attr(sgid_attr);
- kfree(ah);
+ RDMA_KFREE_RCU(ah);
return ret;
}
EXPORT_SYMBOL(rdma_destroy_ah_user);
@@ -1970,7 +1970,7 @@ int ib_close_qp(struct ib_qp *qp)
atomic_dec(&real_qp->usecnt);
if (qp->qp_sec)
ib_close_shared_qp_security(qp->qp_sec);
- kfree(qp);
+ RDMA_KFREE_RCU(qp);
return 0;
}
@@ -2041,7 +2041,7 @@ int ib_destroy_qp_user(struct ib_qp *qp, struct ib_udata *udata)
ib_destroy_qp_security_end(sec);
rdma_restrack_del(&qp->res);
- kfree(qp);
+ RDMA_KFREE_RCU(qp);
return ret;
}
EXPORT_SYMBOL(ib_destroy_qp_user);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 1e7774ac808f..616e9e54a733 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2684,8 +2684,32 @@ struct ib_device_ops {
DECLARE_RDMA_OBJ_SIZE(ib_srq);
DECLARE_RDMA_OBJ_SIZE(ib_ucontext);
DECLARE_RDMA_OBJ_SIZE(ib_xrcd);
+
+ /* if non-zero holds offset of rcu_head in object */
+ ssize_t rcu_offset_ah;
+ ssize_t rcu_offset_qp;
+ ssize_t rcu_offset_mw;
+
};
+/* drivers may optionally use rcu locking on verbs objects
+ * by including struct rcu_head in their private data and
+ * setting rcu_offset_xx in the ib_driver_ops struct where
+ * xx is one of ah, qp or mw.
+ */
+static inline void __rdma_kfree_rcu(void *obj, ssize_t rcu_offset)
+{
+ if (rcu_offset)
+ kvfree_call_rcu((struct rcu_head *)((u8 *)obj + rcu_offset),
+ obj);
+ else
+ kfree(obj);
+}
+
+/* obj must be one of ah, qp, or mw */
+#define RDMA_KFREE_RCU(obj) __rdma_kfree_rcu(obj, \
+ obj->device->ops.rcu_offset_##obj)
+
struct ib_core_device {
/* device must be the first element in structure until,
* union of ib_core_device and device exists in ib_device.
--
2.39.2
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH for-next 2/2] RDMA/rxe: Enable rcu locking of indexed objects
2023-07-18 17:59 [PATCH for-next 0/2] Enable rcu locking of verbs objects Bob Pearson
2023-07-18 17:59 ` [PATCH for-next 1/2] RDMA/core: Support drivers use of rcu locking Bob Pearson
@ 2023-07-18 17:59 ` Bob Pearson
2023-07-19 5:38 ` Zhu Yanjun
` (2 more replies)
1 sibling, 3 replies; 10+ messages in thread
From: Bob Pearson @ 2023-07-18 17:59 UTC (permalink / raw)
To: jgg, leon, zyjzyj2000, linux-rdma, jhack; +Cc: Bob Pearson
Make rcu_read locking of critical sections with the indexed
verbs objects be protected from early freeing of those objects.
The AH, QP, MR and MW objects are looked up from their indices
contained in received packets or WQEs during I/O processing.
Make these objects be freed using kfree_rcu to avoid races.
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
drivers/infiniband/sw/rxe/rxe_pool.h | 1 +
drivers/infiniband/sw/rxe/rxe_verbs.c | 6 +++++-
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_pool.h b/drivers/infiniband/sw/rxe/rxe_pool.h
index b42e26427a70..ef2f6f88e254 100644
--- a/drivers/infiniband/sw/rxe/rxe_pool.h
+++ b/drivers/infiniband/sw/rxe/rxe_pool.h
@@ -25,6 +25,7 @@ struct rxe_pool_elem {
struct kref ref_cnt;
struct list_head list;
struct completion complete;
+ struct rcu_head rcu;
u32 index;
};
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 903f0b71447e..b899924b81eb 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -1395,7 +1395,7 @@ static int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
if (cleanup_err)
rxe_err_mr(mr, "cleanup failed, err = %d", cleanup_err);
- kfree_rcu_mightsleep(mr);
+ kfree_rcu(mr, elem.rcu);
return 0;
err_out:
@@ -1494,6 +1494,10 @@ static const struct ib_device_ops rxe_dev_ops = {
INIT_RDMA_OBJ_SIZE(ib_srq, rxe_srq, ibsrq),
INIT_RDMA_OBJ_SIZE(ib_ucontext, rxe_ucontext, ibuc),
INIT_RDMA_OBJ_SIZE(ib_mw, rxe_mw, ibmw),
+
+ .rcu_offset_ah = offsetof(struct rxe_ah, elem.rcu),
+ .rcu_offset_qp = offsetof(struct rxe_qp, elem.rcu),
+ .rcu_offset_mw = offsetof(struct rxe_mw, elem.rcu),
};
int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name)
--
2.39.2
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH for-next 2/2] RDMA/rxe: Enable rcu locking of indexed objects
2023-07-18 17:59 ` [PATCH for-next 2/2] RDMA/rxe: Enable rcu locking of indexed objects Bob Pearson
@ 2023-07-19 5:38 ` Zhu Yanjun
2023-07-19 14:34 ` Bob Pearson
2023-07-19 7:49 ` Leon Romanovsky
2023-08-09 19:17 ` Jason Gunthorpe
2 siblings, 1 reply; 10+ messages in thread
From: Zhu Yanjun @ 2023-07-19 5:38 UTC (permalink / raw)
To: Bob Pearson; +Cc: jgg, leon, linux-rdma, jhack
On Wed, Jul 19, 2023 at 2:00 AM Bob Pearson <rpearsonhpe@gmail.com> wrote:
>
> Make rcu_read locking of critical sections with the indexed
> verbs objects be protected from early freeing of those objects.
> The AH, QP, MR and MW objects are looked up from their indices
> contained in received packets or WQEs during I/O processing.
> Make these objects be freed using kfree_rcu to avoid races.
>
> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
> ---
> drivers/infiniband/sw/rxe/rxe_pool.h | 1 +
> drivers/infiniband/sw/rxe/rxe_verbs.c | 6 +++++-
> 2 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_pool.h b/drivers/infiniband/sw/rxe/rxe_pool.h
> index b42e26427a70..ef2f6f88e254 100644
> --- a/drivers/infiniband/sw/rxe/rxe_pool.h
> +++ b/drivers/infiniband/sw/rxe/rxe_pool.h
> @@ -25,6 +25,7 @@ struct rxe_pool_elem {
> struct kref ref_cnt;
> struct list_head list;
> struct completion complete;
> + struct rcu_head rcu;
> u32 index;
> };
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
> index 903f0b71447e..b899924b81eb 100644
> --- a/drivers/infiniband/sw/rxe/rxe_verbs.c
> +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
> @@ -1395,7 +1395,7 @@ static int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
> if (cleanup_err)
> rxe_err_mr(mr, "cleanup failed, err = %d", cleanup_err);
>
> - kfree_rcu_mightsleep(mr);
> + kfree_rcu(mr, elem.rcu);
> return 0;
>
> err_out:
> @@ -1494,6 +1494,10 @@ static const struct ib_device_ops rxe_dev_ops = {
> INIT_RDMA_OBJ_SIZE(ib_srq, rxe_srq, ibsrq),
> INIT_RDMA_OBJ_SIZE(ib_ucontext, rxe_ucontext, ibuc),
> INIT_RDMA_OBJ_SIZE(ib_mw, rxe_mw, ibmw),
> +
> + .rcu_offset_ah = offsetof(struct rxe_ah, elem.rcu),
> + .rcu_offset_qp = offsetof(struct rxe_qp, elem.rcu),
> + .rcu_offset_mw = offsetof(struct rxe_mw, elem.rcu),
In your commits, you mentioned that ah, qp, mw and mr will be
protected. But here, only ah, qp and mw are set.
Not sure if mr is also protected and set in other place.
Except that, I am fine with it.
Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Zhu Yanjun
> };
>
> int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name)
> --
> 2.39.2
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH for-next 2/2] RDMA/rxe: Enable rcu locking of indexed objects
2023-07-18 17:59 ` [PATCH for-next 2/2] RDMA/rxe: Enable rcu locking of indexed objects Bob Pearson
2023-07-19 5:38 ` Zhu Yanjun
@ 2023-07-19 7:49 ` Leon Romanovsky
2023-07-19 16:43 ` Bob Pearson
2023-08-09 19:17 ` Jason Gunthorpe
2 siblings, 1 reply; 10+ messages in thread
From: Leon Romanovsky @ 2023-07-19 7:49 UTC (permalink / raw)
To: Bob Pearson; +Cc: jgg, zyjzyj2000, linux-rdma, jhack
On Tue, Jul 18, 2023 at 12:59:44PM -0500, Bob Pearson wrote:
> Make rcu_read locking of critical sections with the indexed
> verbs objects be protected from early freeing of those objects.
> The AH, QP, MR and MW objects are looked up from their indices
> contained in received packets or WQEs during I/O processing.
> Make these objects be freed using kfree_rcu to avoid races.
Sorry, how use of RCU avoid races?
Thanks
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH for-next 2/2] RDMA/rxe: Enable rcu locking of indexed objects
2023-07-19 5:38 ` Zhu Yanjun
@ 2023-07-19 14:34 ` Bob Pearson
0 siblings, 0 replies; 10+ messages in thread
From: Bob Pearson @ 2023-07-19 14:34 UTC (permalink / raw)
To: Zhu Yanjun; +Cc: jgg, leon, linux-rdma, jhack
On 7/19/23 00:38, Zhu Yanjun wrote:
> On Wed, Jul 19, 2023 at 2:00 AM Bob Pearson <rpearsonhpe@gmail.com> wrote:
>>
>> Make rcu_read locking of critical sections with the indexed
>> verbs objects be protected from early freeing of those objects.
>> The AH, QP, MR and MW objects are looked up from their indices
>> contained in received packets or WQEs during I/O processing.
>> Make these objects be freed using kfree_rcu to avoid races.
>>
>> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
>> ---
>> drivers/infiniband/sw/rxe/rxe_pool.h | 1 +
>> drivers/infiniband/sw/rxe/rxe_verbs.c | 6 +++++-
>> 2 files changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_pool.h b/drivers/infiniband/sw/rxe/rxe_pool.h
>> index b42e26427a70..ef2f6f88e254 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_pool.h
>> +++ b/drivers/infiniband/sw/rxe/rxe_pool.h
>> @@ -25,6 +25,7 @@ struct rxe_pool_elem {
>> struct kref ref_cnt;
>> struct list_head list;
>> struct completion complete;
>> + struct rcu_head rcu;
>> u32 index;
>> };
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
>> index 903f0b71447e..b899924b81eb 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_verbs.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
>> @@ -1395,7 +1395,7 @@ static int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
>> if (cleanup_err)
>> rxe_err_mr(mr, "cleanup failed, err = %d", cleanup_err);
>>
>> - kfree_rcu_mightsleep(mr);
>> + kfree_rcu(mr, elem.rcu);
>> return 0;
>>
>> err_out:
>> @@ -1494,6 +1494,10 @@ static const struct ib_device_ops rxe_dev_ops = {
>> INIT_RDMA_OBJ_SIZE(ib_srq, rxe_srq, ibsrq),
>> INIT_RDMA_OBJ_SIZE(ib_ucontext, rxe_ucontext, ibuc),
>> INIT_RDMA_OBJ_SIZE(ib_mw, rxe_mw, ibmw),
>> +
>> + .rcu_offset_ah = offsetof(struct rxe_ah, elem.rcu),
>> + .rcu_offset_qp = offsetof(struct rxe_qp, elem.rcu),
>> + .rcu_offset_mw = offsetof(struct rxe_mw, elem.rcu),
>
> In your commits, you mentioned that ah, qp, mw and mr will be
> protected. But here, only ah, qp and mw are set.
> Not sure if mr is also protected and set in other place.
> Except that, I am fine with it.
>
> Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com>
>
> Zhu Yanjun
>
>> };
>>
>> int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name)
>> --
>> 2.39.2
>>
All of the verbs objects except MR are allocated and freed by rdma-core. MR is allocated
and freed by the drivers. So for MR the kfree_rcu call is made in rxe_dereg_mr().
I changed it from kfree_rcu_mightsleep to the kfree_rcu variant to match the three others.
Bob
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH for-next 2/2] RDMA/rxe: Enable rcu locking of indexed objects
2023-07-19 7:49 ` Leon Romanovsky
@ 2023-07-19 16:43 ` Bob Pearson
2023-07-23 15:00 ` Leon Romanovsky
0 siblings, 1 reply; 10+ messages in thread
From: Bob Pearson @ 2023-07-19 16:43 UTC (permalink / raw)
To: Leon Romanovsky; +Cc: jgg, zyjzyj2000, linux-rdma, jhack
On 7/19/23 02:49, Leon Romanovsky wrote:
> On Tue, Jul 18, 2023 at 12:59:44PM -0500, Bob Pearson wrote:
>> Make rcu_read locking of critical sections with the indexed
>> verbs objects be protected from early freeing of those objects.
>> The AH, QP, MR and MW objects are looked up from their indices
>> contained in received packets or WQEs during I/O processing.
>> Make these objects be freed using kfree_rcu to avoid races.
>
> Sorry, how use of RCU avoid races?
>
> Thanks
The races are between destroy/deallocate/dereg verbs API calls and packets arriving or completing send
or deferred processing of wqes. Packets and wqes contain indices/keys/numbers that refer to objects.
The rxe driver maintains xarrays for each type of object that allow to lookup the address of the object
from its index and then take a reference to protect the pointer. The destroy verbs defer completion
until the reference count falls to zero and then removes the entry in the xarray. These operations
need to be atomic. One alternative is to use spinlocks to protect them but that places a load on
performance under heavy load which is typically dominated by the lookup function since objects tend
to have a long lifetime. rcu readlocks are a better alternative but depend on the deferred destruction
of the objects used in the rcu critical section. In certain benchmarks the use of rcu locking has
a very large performance advantage over spinlocks. For example 100s of QPs running over a 200 gbit/s
network.
Bob
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH for-next 2/2] RDMA/rxe: Enable rcu locking of indexed objects
2023-07-19 16:43 ` Bob Pearson
@ 2023-07-23 15:00 ` Leon Romanovsky
0 siblings, 0 replies; 10+ messages in thread
From: Leon Romanovsky @ 2023-07-23 15:00 UTC (permalink / raw)
To: Bob Pearson; +Cc: jgg, zyjzyj2000, linux-rdma, jhack
On Wed, Jul 19, 2023 at 11:43:30AM -0500, Bob Pearson wrote:
> On 7/19/23 02:49, Leon Romanovsky wrote:
> > On Tue, Jul 18, 2023 at 12:59:44PM -0500, Bob Pearson wrote:
> >> Make rcu_read locking of critical sections with the indexed
> >> verbs objects be protected from early freeing of those objects.
> >> The AH, QP, MR and MW objects are looked up from their indices
> >> contained in received packets or WQEs during I/O processing.
> >> Make these objects be freed using kfree_rcu to avoid races.
> >
> > Sorry, how use of RCU avoid races?
> >
> > Thanks
>
> The races are between destroy/deallocate/dereg verbs API calls and packets arriving or completing send
> or deferred processing of wqes. Packets and wqes contain indices/keys/numbers that refer to objects.
> The rxe driver maintains xarrays for each type of object that allow to lookup the address of the object
> from its index and then take a reference to protect the pointer. The destroy verbs defer completion
> until the reference count falls to zero and then removes the entry in the xarray. These operations
> need to be atomic. One alternative is to use spinlocks to protect them but that places a load on
> performance under heavy load which is typically dominated by the lookup function since objects tend
> to have a long lifetime. rcu readlocks are a better alternative but depend on the deferred destruction
> of the objects used in the rcu critical section.
You rarely can replace locks with RCU without careful design changes.
Thanks
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH for-next 1/2] RDMA/core: Support drivers use of rcu locking
2023-07-18 17:59 ` [PATCH for-next 1/2] RDMA/core: Support drivers use of rcu locking Bob Pearson
@ 2023-08-09 19:10 ` Jason Gunthorpe
0 siblings, 0 replies; 10+ messages in thread
From: Jason Gunthorpe @ 2023-08-09 19:10 UTC (permalink / raw)
To: Bob Pearson; +Cc: leon, zyjzyj2000, linux-rdma, jhack
On Tue, Jul 18, 2023 at 12:59:43PM -0500, Bob Pearson wrote:
> This patch allows drivers to optionally include struct rcu_head
> in their private object data structs and have rdma-core use kfree_rcu
> to free the objects.
>
> The offsets of the rcu_heads are stored in fields in struct
> ib_device_ops and a macro RDMA_KFREE_RCU is introduced which calls
> (an open coded) kfree_rcu instead of kfree if the value is non-
> zero. Currently the supported object types are AH, QP and MW.
>
> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
> ---
> drivers/infiniband/core/uverbs_main.c | 2 +-
> drivers/infiniband/core/verbs.c | 6 +++---
> include/rdma/ib_verbs.h | 24 ++++++++++++++++++++++++
> 3 files changed, 28 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
> index 7c9c79c13941..50497e550f18 100644
> --- a/drivers/infiniband/core/uverbs_main.c
> +++ b/drivers/infiniband/core/uverbs_main.c
> @@ -112,7 +112,7 @@ int uverbs_dealloc_mw(struct ib_mw *mw)
> return ret;
>
> atomic_dec(&pd->usecnt);
> - kfree(mw);
> + RDMA_KFREE_RCU(mw);
> return ret;
> }
>
> diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
> index b99b3cc283b6..a49ae8c52c66 100644
> --- a/drivers/infiniband/core/verbs.c
> +++ b/drivers/infiniband/core/verbs.c
> @@ -982,7 +982,7 @@ int rdma_destroy_ah_user(struct ib_ah *ah, u32 flags, struct ib_udata *udata)
> if (sgid_attr)
> rdma_put_gid_attr(sgid_attr);
>
> - kfree(ah);
> + RDMA_KFREE_RCU(ah);
> return ret;
> }
> EXPORT_SYMBOL(rdma_destroy_ah_user);
> @@ -1970,7 +1970,7 @@ int ib_close_qp(struct ib_qp *qp)
> atomic_dec(&real_qp->usecnt);
> if (qp->qp_sec)
> ib_close_shared_qp_security(qp->qp_sec);
> - kfree(qp);
> + RDMA_KFREE_RCU(qp);
>
> return 0;
> }
> @@ -2041,7 +2041,7 @@ int ib_destroy_qp_user(struct ib_qp *qp, struct ib_udata *udata)
> ib_destroy_qp_security_end(sec);
>
> rdma_restrack_del(&qp->res);
> - kfree(qp);
> + RDMA_KFREE_RCU(qp);
> return ret;
> }
> EXPORT_SYMBOL(ib_destroy_qp_user);
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 1e7774ac808f..616e9e54a733 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -2684,8 +2684,32 @@ struct ib_device_ops {
> DECLARE_RDMA_OBJ_SIZE(ib_srq);
> DECLARE_RDMA_OBJ_SIZE(ib_ucontext);
> DECLARE_RDMA_OBJ_SIZE(ib_xrcd);
> +
> + /* if non-zero holds offset of rcu_head in object */
> + ssize_t rcu_offset_ah;
> + ssize_t rcu_offset_qp;
> + ssize_t rcu_offset_mw;
Just size_t, this isn't negative
> +static inline void __rdma_kfree_rcu(void *obj, ssize_t rcu_offset)
Here too
Jason
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH for-next 2/2] RDMA/rxe: Enable rcu locking of indexed objects
2023-07-18 17:59 ` [PATCH for-next 2/2] RDMA/rxe: Enable rcu locking of indexed objects Bob Pearson
2023-07-19 5:38 ` Zhu Yanjun
2023-07-19 7:49 ` Leon Romanovsky
@ 2023-08-09 19:17 ` Jason Gunthorpe
2 siblings, 0 replies; 10+ messages in thread
From: Jason Gunthorpe @ 2023-08-09 19:17 UTC (permalink / raw)
To: Bob Pearson; +Cc: leon, zyjzyj2000, linux-rdma, jhack
On Tue, Jul 18, 2023 at 12:59:44PM -0500, Bob Pearson wrote:
> Make rcu_read locking of critical sections with the indexed
> verbs objects be protected from early freeing of those objects.
> The AH, QP, MR and MW objects are looked up from their indices
> contained in received packets or WQEs during I/O processing.
> Make these objects be freed using kfree_rcu to avoid races.
>
> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
> ---
> drivers/infiniband/sw/rxe/rxe_pool.h | 1 +
> drivers/infiniband/sw/rxe/rxe_verbs.c | 6 +++++-
> 2 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_pool.h b/drivers/infiniband/sw/rxe/rxe_pool.h
> index b42e26427a70..ef2f6f88e254 100644
> --- a/drivers/infiniband/sw/rxe/rxe_pool.h
> +++ b/drivers/infiniband/sw/rxe/rxe_pool.h
> @@ -25,6 +25,7 @@ struct rxe_pool_elem {
> struct kref ref_cnt;
> struct list_head list;
> struct completion complete;
> + struct rcu_head rcu;
> u32 index;
> };
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
> index 903f0b71447e..b899924b81eb 100644
> --- a/drivers/infiniband/sw/rxe/rxe_verbs.c
> +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
> @@ -1395,7 +1395,7 @@ static int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
> if (cleanup_err)
> rxe_err_mr(mr, "cleanup failed, err = %d", cleanup_err);
>
> - kfree_rcu_mightsleep(mr);
> + kfree_rcu(mr, elem.rcu);
> return 0;
>
> err_out:
> @@ -1494,6 +1494,10 @@ static const struct ib_device_ops rxe_dev_ops = {
> INIT_RDMA_OBJ_SIZE(ib_srq, rxe_srq, ibsrq),
> INIT_RDMA_OBJ_SIZE(ib_ucontext, rxe_ucontext, ibuc),
> INIT_RDMA_OBJ_SIZE(ib_mw, rxe_mw, ibmw),
> +
> + .rcu_offset_ah = offsetof(struct rxe_ah, elem.rcu),
> + .rcu_offset_qp = offsetof(struct rxe_qp, elem.rcu),
> + .rcu_offset_mw = offsetof(struct rxe_mw, elem.rcu),
> };
This isn't quite the right calculation.. It works because the ib
struct is at the top of these structs
Do it like this:
/*
* When used the core will RCU free the object using the driver provided
* rcu member.
*/
#define INIT_RDMA_OBJ_SIZE_RCU(ib_struct, drv_struct, member, rcu_member) \
.rcu_offset_##ib_struct = \
(offsetof(struct drv_struct, rcu_member) - \
offsetof(struct drv_struct, member) + \
BUILD_BUG_ON_ZERO(offsetof(struct drv_struct, rcu_member) > \
offsetof(struct drv_struct, member))), \
INIT_RDMA_OBJ_SIZE(ib_struct, drv_struct, member),
(and maybe the prior patch could stick with the ssize_t and we could
remove the BUILD_BUG_ON_ZERO?)
Jason
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2023-08-09 19:18 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-07-18 17:59 [PATCH for-next 0/2] Enable rcu locking of verbs objects Bob Pearson
2023-07-18 17:59 ` [PATCH for-next 1/2] RDMA/core: Support drivers use of rcu locking Bob Pearson
2023-08-09 19:10 ` Jason Gunthorpe
2023-07-18 17:59 ` [PATCH for-next 2/2] RDMA/rxe: Enable rcu locking of indexed objects Bob Pearson
2023-07-19 5:38 ` Zhu Yanjun
2023-07-19 14:34 ` Bob Pearson
2023-07-19 7:49 ` Leon Romanovsky
2023-07-19 16:43 ` Bob Pearson
2023-07-23 15:00 ` Leon Romanovsky
2023-08-09 19:17 ` Jason Gunthorpe
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).