* [PATCH bpf-next v2 1/4] rcu-tasks: Provide rcu_trace_implies_rcu_gp()
2022-10-14 11:39 [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining Hou Tao
@ 2022-10-14 11:39 ` Hou Tao
2022-10-14 11:39 ` [PATCH bpf-next v2 2/4] bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator Hou Tao
` (4 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: Hou Tao @ 2022-10-14 11:39 UTC (permalink / raw)
To: bpf
Cc: Martin KaFai Lau, Andrii Nakryiko, Song Liu, Hao Luo,
Yonghong Song, Alexei Starovoitov, Daniel Borkmann, KP Singh,
David S . Miller, Jakub Kicinski, Stanislav Fomichev, Jiri Olsa,
John Fastabend, houtao1, Paul E . McKenney, Delyan Kratunov, rcu
From: "Paul E. McKenney" <paulmck@kernel.org>
As an accident of implementation, an RCU Tasks Trace grace period also
acts as an RCU grace period. However, this could change at any time.
This commit therefore creates an rcu_trace_implies_rcu_gp() that currently
returns true to codify this accident. Code relying on this accident
must call this function to verify that this accident is still happening.
Reported-by: Hou Tao <houtao@huaweicloud.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
---
include/linux/rcupdate.h | 12 ++++++++++++
kernel/rcu/tasks.h | 2 ++
2 files changed, 14 insertions(+)
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 08605ce7379d..8822f06e4b40 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -240,6 +240,18 @@ static inline void exit_tasks_rcu_start(void) { }
static inline void exit_tasks_rcu_finish(void) { }
#endif /* #else #ifdef CONFIG_TASKS_RCU_GENERIC */
+/**
+ * rcu_trace_implies_rcu_gp - does an RCU Tasks Trace grace period imply an RCU grace period?
+ *
+ * As an accident of implementation, an RCU Tasks Trace grace period also
+ * acts as an RCU grace period. However, this could change at any time.
+ * Code relying on this accident must call this function to verify that
+ * this accident is still happening.
+ *
+ * You have been warned!
+ */
+static inline bool rcu_trace_implies_rcu_gp(void) { return true; }
+
/**
* cond_resched_tasks_rcu_qs - Report potential quiescent states to RCU
*
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index f5bf6fb430da..9435e5a7b53e 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -1535,6 +1535,8 @@ static void rcu_tasks_trace_postscan(struct list_head *hop)
{
// Wait for late-stage exiting tasks to finish exiting.
// These might have passed the call to exit_tasks_rcu_finish().
+
+ // If you remove the following line, update rcu_trace_implies_rcu_gp()!!!
synchronize_rcu();
// Any tasks that exit after this point will set
// TRC_NEED_QS_CHECKED in ->trc_reader_special.b.need_qs.
--
2.29.2
^ permalink raw reply related [flat|nested] 11+ messages in thread* [PATCH bpf-next v2 2/4] bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator
2022-10-14 11:39 [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining Hou Tao
2022-10-14 11:39 ` [PATCH bpf-next v2 1/4] rcu-tasks: Provide rcu_trace_implies_rcu_gp() Hou Tao
@ 2022-10-14 11:39 ` Hou Tao
2022-10-14 11:39 ` [PATCH bpf-next v2 3/4] bpf: Use rcu_trace_implies_rcu_gp() in local storage map Hou Tao
` (3 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: Hou Tao @ 2022-10-14 11:39 UTC (permalink / raw)
To: bpf
Cc: Martin KaFai Lau, Andrii Nakryiko, Song Liu, Hao Luo,
Yonghong Song, Alexei Starovoitov, Daniel Borkmann, KP Singh,
David S . Miller, Jakub Kicinski, Stanislav Fomichev, Jiri Olsa,
John Fastabend, houtao1, Paul E . McKenney, Delyan Kratunov, rcu
From: Hou Tao <houtao1@huawei.com>
The memory free logic in bpf memory allocator chains a RCU Tasks Trace
grace period and a normal RCU grace period one after the other, so it
can ensure that both sleepable and non-sleepable programs have finished.
With the introduction of rcu_trace_implies_rcu_gp(),
__free_rcu_tasks_trace() can check whether or not a normal RCU grace
period has also passed after a RCU Tasks Trace grace period has passed.
If it is true, freeing these elements directly, else freeing through
call_rcu().
Signed-off-by: Hou Tao <houtao1@huawei.com>
---
kernel/bpf/memalloc.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 5f83be1d2018..2433be58bb85 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -222,9 +222,13 @@ static void __free_rcu(struct rcu_head *head)
static void __free_rcu_tasks_trace(struct rcu_head *head)
{
- struct bpf_mem_cache *c = container_of(head, struct bpf_mem_cache, rcu);
-
- call_rcu(&c->rcu, __free_rcu);
+ /* If RCU Tasks Trace grace period implies RCU grace period,
+ * there is no need to invoke call_rcu().
+ */
+ if (rcu_trace_implies_rcu_gp())
+ __free_rcu(head);
+ else
+ call_rcu(head, __free_rcu);
}
static void enque_to_free(struct bpf_mem_cache *c, void *obj)
@@ -253,8 +257,9 @@ static void do_call_rcu(struct bpf_mem_cache *c)
*/
__llist_add(llnode, &c->waiting_for_gp);
/* Use call_rcu_tasks_trace() to wait for sleepable progs to finish.
- * Then use call_rcu() to wait for normal progs to finish
- * and finally do free_one() on each element.
+ * If RCU Tasks Trace grace period implies RCU grace period, free
+ * these elements directly, else use call_rcu() to wait for normal
+ * progs to finish and finally do free_one() on each element.
*/
call_rcu_tasks_trace(&c->rcu, __free_rcu_tasks_trace);
}
--
2.29.2
^ permalink raw reply related [flat|nested] 11+ messages in thread* [PATCH bpf-next v2 3/4] bpf: Use rcu_trace_implies_rcu_gp() in local storage map
2022-10-14 11:39 [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining Hou Tao
2022-10-14 11:39 ` [PATCH bpf-next v2 1/4] rcu-tasks: Provide rcu_trace_implies_rcu_gp() Hou Tao
2022-10-14 11:39 ` [PATCH bpf-next v2 2/4] bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator Hou Tao
@ 2022-10-14 11:39 ` Hou Tao
2022-10-14 11:39 ` [PATCH bpf-next v2 4/4] bpf: Use rcu_trace_implies_rcu_gp() for program array freeing Hou Tao
` (2 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: Hou Tao @ 2022-10-14 11:39 UTC (permalink / raw)
To: bpf
Cc: Martin KaFai Lau, Andrii Nakryiko, Song Liu, Hao Luo,
Yonghong Song, Alexei Starovoitov, Daniel Borkmann, KP Singh,
David S . Miller, Jakub Kicinski, Stanislav Fomichev, Jiri Olsa,
John Fastabend, houtao1, Paul E . McKenney, Delyan Kratunov, rcu
From: Hou Tao <houtao1@huawei.com>
Local storage map is accessible for both sleepable and non-sleepable bpf
program, and its memory is freed by using both call_rcu_tasks_trace() and
kfree_rcu() to wait for both RCU-tasks-trace grace period and RCU grace
period to pass.
With the introduction of rcu_trace_implies_rcu_gp(), both
bpf_selem_free_rcu() and bpf_local_storage_free_rcu() can check whether
or not a normal RCU grace period has also passed after a RCU-tasks-trace
grace period has passed. If it is true, it is safe to call kfree()
directly.
Signed-off-by: Hou Tao <houtao1@huawei.com>
---
kernel/bpf/bpf_local_storage.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
index 802fc15b0d73..9dc6de1cf185 100644
--- a/kernel/bpf/bpf_local_storage.c
+++ b/kernel/bpf/bpf_local_storage.c
@@ -88,8 +88,14 @@ void bpf_local_storage_free_rcu(struct rcu_head *rcu)
{
struct bpf_local_storage *local_storage;
+ /* If RCU Tasks Trace grace period implies RCU grace period, do
+ * kfree(), else do kfree_rcu().
+ */
local_storage = container_of(rcu, struct bpf_local_storage, rcu);
- kfree_rcu(local_storage, rcu);
+ if (rcu_trace_implies_rcu_gp())
+ kfree(local_storage);
+ else
+ kfree_rcu(local_storage, rcu);
}
static void bpf_selem_free_rcu(struct rcu_head *rcu)
@@ -97,7 +103,10 @@ static void bpf_selem_free_rcu(struct rcu_head *rcu)
struct bpf_local_storage_elem *selem;
selem = container_of(rcu, struct bpf_local_storage_elem, rcu);
- kfree_rcu(selem, rcu);
+ if (rcu_trace_implies_rcu_gp())
+ kfree(selem);
+ else
+ kfree_rcu(selem, rcu);
}
/* local_storage->lock must be held and selem->local_storage == local_storage.
--
2.29.2
^ permalink raw reply related [flat|nested] 11+ messages in thread* [PATCH bpf-next v2 4/4] bpf: Use rcu_trace_implies_rcu_gp() for program array freeing
2022-10-14 11:39 [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining Hou Tao
` (2 preceding siblings ...)
2022-10-14 11:39 ` [PATCH bpf-next v2 3/4] bpf: Use rcu_trace_implies_rcu_gp() in local storage map Hou Tao
@ 2022-10-14 11:39 ` Hou Tao
2022-10-17 13:39 ` [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining Paul E. McKenney
2022-10-18 17:40 ` patchwork-bot+netdevbpf
5 siblings, 0 replies; 11+ messages in thread
From: Hou Tao @ 2022-10-14 11:39 UTC (permalink / raw)
To: bpf
Cc: Martin KaFai Lau, Andrii Nakryiko, Song Liu, Hao Luo,
Yonghong Song, Alexei Starovoitov, Daniel Borkmann, KP Singh,
David S . Miller, Jakub Kicinski, Stanislav Fomichev, Jiri Olsa,
John Fastabend, houtao1, Paul E . McKenney, Delyan Kratunov, rcu
From: Hou Tao <houtao1@huawei.com>
To support both sleepable and normal uprobe bpf program, the freeing of
trace program array chains a RCU-tasks-trace grace period and a normal
RCU grace period one after the other.
With the introduction of rcu_trace_implies_rcu_gp(),
__bpf_prog_array_free_sleepable_cb() can check whether or not a normal
RCU grace period has also passed after a RCU-tasks-trace grace period
has passed. If it is true, it is safe to invoke kfree() directly.
Signed-off-by: Hou Tao <houtao1@huawei.com>
---
kernel/bpf/core.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 711fd293b6de..4bc5f46d7030 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2251,8 +2251,14 @@ static void __bpf_prog_array_free_sleepable_cb(struct rcu_head *rcu)
{
struct bpf_prog_array *progs;
+ /* If RCU Tasks Trace grace period implies RCU grace period, there is
+ * no need to call kfree_rcu(), just call kfree() directly.
+ */
progs = container_of(rcu, struct bpf_prog_array, rcu);
- kfree_rcu(progs, rcu);
+ if (rcu_trace_implies_rcu_gp())
+ kfree(progs);
+ else
+ kfree_rcu(progs, rcu);
}
void bpf_prog_array_free_sleepable(struct bpf_prog_array *progs)
--
2.29.2
^ permalink raw reply related [flat|nested] 11+ messages in thread* Re: [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining
2022-10-14 11:39 [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining Hou Tao
` (3 preceding siblings ...)
2022-10-14 11:39 ` [PATCH bpf-next v2 4/4] bpf: Use rcu_trace_implies_rcu_gp() for program array freeing Hou Tao
@ 2022-10-17 13:39 ` Paul E. McKenney
2022-10-18 7:31 ` Hou Tao
2022-10-18 17:40 ` patchwork-bot+netdevbpf
5 siblings, 1 reply; 11+ messages in thread
From: Paul E. McKenney @ 2022-10-17 13:39 UTC (permalink / raw)
To: Hou Tao
Cc: bpf, Martin KaFai Lau, Andrii Nakryiko, Song Liu, Hao Luo,
Yonghong Song, Alexei Starovoitov, Daniel Borkmann, KP Singh,
David S . Miller, Jakub Kicinski, Stanislav Fomichev, Jiri Olsa,
John Fastabend, houtao1, Delyan Kratunov, rcu
On Fri, Oct 14, 2022 at 07:39:42PM +0800, Hou Tao wrote:
> From: Hou Tao <houtao1@huawei.com>
>
> Hi,
>
> Now bpf uses RCU grace period chaining to wait for the completion of
> access from both sleepable and non-sleepable bpf program: calling
> call_rcu_tasks_trace() firstly to wait for a RCU-tasks-trace grace
> period, then in its callback calls call_rcu() or kfree_rcu() to wait for
> a normal RCU grace period.
>
> According to the implementation of RCU Tasks Trace, it inovkes
> ->postscan_func() to wait for one RCU-tasks-trace grace period and
> rcu_tasks_trace_postscan() inovkes synchronize_rcu() to wait for one
> normal RCU grace period in turn, so one RCU-tasks-trace grace period
> will imply one normal RCU grace period. To codify the implication,
> introduces rcu_trace_implies_rcu_gp() in patch #1. And using it in patch
> #2~#4 to remove unnecessary call_rcu() or kfree_rcu() in bpf subsystem.
> Other two uses of call_rcu_tasks_trace() are unchanged: for
> __bpf_prog_put_rcu() there is no gp chain and for
> __bpf_tramp_image_put_rcu_tasks() it chains RCU tasks trace GP and RCU
> tasks GP.
>
> An alternative way to remove these unnecessary RCU grace period
> chainings is using the RCU polling API to check whether or not a normal
> RCU grace period has passed (e.g. get_state_synchronize_rcu()). But it
> needs an unsigned long space for each free element or each call, and
> it is not affordable for local storage element, so as for now always
> rcu_trace_implies_rcu_gp().
>
> Comments are always welcome.
For #2-#4:
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
(#1 already has my Signed-off-by, in case anyone was wondering.)
> Change Log:
>
> v2:
> * codify the implication of RCU Tasks Trace grace period instead of
> assuming for it
>
> v1: https://lore.kernel.org/bpf/20221011071128.3470622-1-houtao@huaweicloud.com
>
> Hou Tao (3):
> bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator
> bpf: Use rcu_trace_implies_rcu_gp() in local storage map
> bpf: Use rcu_trace_implies_rcu_gp() for program array freeing
>
> Paul E. McKenney (1):
> rcu-tasks: Provide rcu_trace_implies_rcu_gp()
>
> include/linux/rcupdate.h | 12 ++++++++++++
> kernel/bpf/bpf_local_storage.c | 13 +++++++++++--
> kernel/bpf/core.c | 8 +++++++-
> kernel/bpf/memalloc.c | 15 ++++++++++-----
> kernel/rcu/tasks.h | 2 ++
> 5 files changed, 42 insertions(+), 8 deletions(-)
>
> --
> 2.29.2
>
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining
2022-10-17 13:39 ` [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining Paul E. McKenney
@ 2022-10-18 7:31 ` Hou Tao
2022-10-18 15:08 ` Paul E. McKenney
0 siblings, 1 reply; 11+ messages in thread
From: Hou Tao @ 2022-10-18 7:31 UTC (permalink / raw)
To: paulmck
Cc: bpf, Martin KaFai Lau, Andrii Nakryiko, Song Liu, Hao Luo,
Yonghong Song, Alexei Starovoitov, Daniel Borkmann, KP Singh,
David S . Miller, Jakub Kicinski, Stanislav Fomichev, Jiri Olsa,
John Fastabend, houtao1, Delyan Kratunov, rcu
Hi,
On 10/17/2022 9:39 PM, Paul E. McKenney wrote:
> On Fri, Oct 14, 2022 at 07:39:42PM +0800, Hou Tao wrote:
>> From: Hou Tao <houtao1@huawei.com>
>>
>> Hi,
>>
>> Now bpf uses RCU grace period chaining to wait for the completion of
>> access from both sleepable and non-sleepable bpf program: calling
>> call_rcu_tasks_trace() firstly to wait for a RCU-tasks-trace grace
>> period, then in its callback calls call_rcu() or kfree_rcu() to wait for
>> a normal RCU grace period.
>>
>> According to the implementation of RCU Tasks Trace, it inovkes
>> ->postscan_func() to wait for one RCU-tasks-trace grace period and
>> rcu_tasks_trace_postscan() inovkes synchronize_rcu() to wait for one
>> normal RCU grace period in turn, so one RCU-tasks-trace grace period
>> will imply one normal RCU grace period. To codify the implication,
>> introduces rcu_trace_implies_rcu_gp() in patch #1. And using it in patch
>> #2~#4 to remove unnecessary call_rcu() or kfree_rcu() in bpf subsystem.
>> Other two uses of call_rcu_tasks_trace() are unchanged: for
>> __bpf_prog_put_rcu() there is no gp chain and for
>> __bpf_tramp_image_put_rcu_tasks() it chains RCU tasks trace GP and RCU
>> tasks GP.
>>
>> An alternative way to remove these unnecessary RCU grace period
>> chainings is using the RCU polling API to check whether or not a normal
>> RCU grace period has passed (e.g. get_state_synchronize_rcu()). But it
>> needs an unsigned long space for each free element or each call, and
>> it is not affordable for local storage element, so as for now always
>> rcu_trace_implies_rcu_gp().
>>
>> Comments are always welcome.
> For #2-#4:
>
> Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
>
> (#1 already has my Signed-off-by, in case anyone was wondering.)
Thanks for the review. But it seems I missed another possible use case for
rcu_trace_implies_rcu_gp() in bpf memory allocator. The code snippet for
free_mem_alloc() is as following:
static void free_mem_alloc(struct bpf_mem_alloc *ma)
{
/* waiting_for_gp lists was drained, but __free_rcu might
* still execute. Wait for it now before we freeing percpu caches.
*/
rcu_barrier_tasks_trace();
rcu_barrier();
free_mem_alloc_no_barrier(ma);
}
It uses rcu_barrier_tasks_trace() and rcu_barrier() to wait for the completion
of pending call_rcu_tasks_trace()s and call_rcu()s. I think it is also safe to
check rcu_trace_implies_rcu_gp() in free_mem_alloc() and if it is true, there is
no need to call rcu_barrier().
static void free_mem_alloc(struct bpf_mem_alloc *ma)
{
/* waiting_for_gp lists was drained, but __free_rcu_tasks_trace()
* or __free_rcu() might still execute. Wait for it now before we
* freeing percpu caches.
*/
rcu_barrier_tasks_trace();
if (!rcu_trace_implies_rcu_gp())
rcu_barrier();
free_mem_alloc_no_barrier(ma);
}
Does the above change look good to you ? If it is, I will post v3 to include the
above change and add your Reviewed-by tag.
>
>> Change Log:
>>
>> v2:
>> * codify the implication of RCU Tasks Trace grace period instead of
>> assuming for it
>>
>> v1: https://lore.kernel.org/bpf/20221011071128.3470622-1-houtao@huaweicloud.com
>>
>> Hou Tao (3):
>> bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator
>> bpf: Use rcu_trace_implies_rcu_gp() in local storage map
>> bpf: Use rcu_trace_implies_rcu_gp() for program array freeing
>>
>> Paul E. McKenney (1):
>> rcu-tasks: Provide rcu_trace_implies_rcu_gp()
>>
>> include/linux/rcupdate.h | 12 ++++++++++++
>> kernel/bpf/bpf_local_storage.c | 13 +++++++++++--
>> kernel/bpf/core.c | 8 +++++++-
>> kernel/bpf/memalloc.c | 15 ++++++++++-----
>> kernel/rcu/tasks.h | 2 ++
>> 5 files changed, 42 insertions(+), 8 deletions(-)
>>
>> --
>> 2.29.2
>>
> .
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining
2022-10-18 7:31 ` Hou Tao
@ 2022-10-18 15:08 ` Paul E. McKenney
2022-10-21 7:08 ` Hou Tao
0 siblings, 1 reply; 11+ messages in thread
From: Paul E. McKenney @ 2022-10-18 15:08 UTC (permalink / raw)
To: Hou Tao
Cc: bpf, Martin KaFai Lau, Andrii Nakryiko, Song Liu, Hao Luo,
Yonghong Song, Alexei Starovoitov, Daniel Borkmann, KP Singh,
David S . Miller, Jakub Kicinski, Stanislav Fomichev, Jiri Olsa,
John Fastabend, houtao1, Delyan Kratunov, rcu
On Tue, Oct 18, 2022 at 03:31:20PM +0800, Hou Tao wrote:
> Hi,
>
> On 10/17/2022 9:39 PM, Paul E. McKenney wrote:
> > On Fri, Oct 14, 2022 at 07:39:42PM +0800, Hou Tao wrote:
> >> From: Hou Tao <houtao1@huawei.com>
> >>
> >> Hi,
> >>
> >> Now bpf uses RCU grace period chaining to wait for the completion of
> >> access from both sleepable and non-sleepable bpf program: calling
> >> call_rcu_tasks_trace() firstly to wait for a RCU-tasks-trace grace
> >> period, then in its callback calls call_rcu() or kfree_rcu() to wait for
> >> a normal RCU grace period.
> >>
> >> According to the implementation of RCU Tasks Trace, it inovkes
> >> ->postscan_func() to wait for one RCU-tasks-trace grace period and
> >> rcu_tasks_trace_postscan() inovkes synchronize_rcu() to wait for one
> >> normal RCU grace period in turn, so one RCU-tasks-trace grace period
> >> will imply one normal RCU grace period. To codify the implication,
> >> introduces rcu_trace_implies_rcu_gp() in patch #1. And using it in patch
> >> #2~#4 to remove unnecessary call_rcu() or kfree_rcu() in bpf subsystem.
> >> Other two uses of call_rcu_tasks_trace() are unchanged: for
> >> __bpf_prog_put_rcu() there is no gp chain and for
> >> __bpf_tramp_image_put_rcu_tasks() it chains RCU tasks trace GP and RCU
> >> tasks GP.
> >>
> >> An alternative way to remove these unnecessary RCU grace period
> >> chainings is using the RCU polling API to check whether or not a normal
> >> RCU grace period has passed (e.g. get_state_synchronize_rcu()). But it
> >> needs an unsigned long space for each free element or each call, and
> >> it is not affordable for local storage element, so as for now always
> >> rcu_trace_implies_rcu_gp().
> >>
> >> Comments are always welcome.
> > For #2-#4:
> >
> > Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
> >
> > (#1 already has my Signed-off-by, in case anyone was wondering.)
> Thanks for the review. But it seems I missed another possible use case for
> rcu_trace_implies_rcu_gp() in bpf memory allocator. The code snippet for
> free_mem_alloc() is as following:
>
> static void free_mem_alloc(struct bpf_mem_alloc *ma)
> {
> /* waiting_for_gp lists was drained, but __free_rcu might
> * still execute. Wait for it now before we freeing percpu caches.
> */
> rcu_barrier_tasks_trace();
> rcu_barrier();
> free_mem_alloc_no_barrier(ma);
> }
>
> It uses rcu_barrier_tasks_trace() and rcu_barrier() to wait for the completion
> of pending call_rcu_tasks_trace()s and call_rcu()s. I think it is also safe to
> check rcu_trace_implies_rcu_gp() in free_mem_alloc() and if it is true, there is
> no need to call rcu_barrier().
>
> static void free_mem_alloc(struct bpf_mem_alloc *ma)
> {
> /* waiting_for_gp lists was drained, but __free_rcu_tasks_trace()
> * or __free_rcu() might still execute. Wait for it now before we
> * freeing percpu caches.
> */
> rcu_barrier_tasks_trace();
> if (!rcu_trace_implies_rcu_gp())
> rcu_barrier();
> free_mem_alloc_no_barrier(ma);
> }
>
> Does the above change look good to you ? If it is, I will post v3 to include the
> above change and add your Reviewed-by tag.
Unfortunately, although synchronize_rcu_tasks_trace() implies
that synchronize_rcu(), there is no relationship between the
callbacks. Furthermore, rcu_barrier_tasks_trace() does not imply
synchronize_rcu_tasks_trace().
So the above change really would break things. Please do not do it.
You could use workqueues or similar to make the rcu_barrier_tasks_trace()
and the rcu_barrier() wait concurrently, though. This would of course
require some synchronization.
Thanx, Paul
> >> Change Log:
> >>
> >> v2:
> >> * codify the implication of RCU Tasks Trace grace period instead of
> >> assuming for it
> >>
> >> v1: https://lore.kernel.org/bpf/20221011071128.3470622-1-houtao@huaweicloud.com
> >>
> >> Hou Tao (3):
> >> bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator
> >> bpf: Use rcu_trace_implies_rcu_gp() in local storage map
> >> bpf: Use rcu_trace_implies_rcu_gp() for program array freeing
> >>
> >> Paul E. McKenney (1):
> >> rcu-tasks: Provide rcu_trace_implies_rcu_gp()
> >>
> >> include/linux/rcupdate.h | 12 ++++++++++++
> >> kernel/bpf/bpf_local_storage.c | 13 +++++++++++--
> >> kernel/bpf/core.c | 8 +++++++-
> >> kernel/bpf/memalloc.c | 15 ++++++++++-----
> >> kernel/rcu/tasks.h | 2 ++
> >> 5 files changed, 42 insertions(+), 8 deletions(-)
> >>
> >> --
> >> 2.29.2
> >>
> > .
>
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining
2022-10-18 15:08 ` Paul E. McKenney
@ 2022-10-21 7:08 ` Hou Tao
2022-10-21 18:50 ` Paul E. McKenney
0 siblings, 1 reply; 11+ messages in thread
From: Hou Tao @ 2022-10-21 7:08 UTC (permalink / raw)
To: paulmck
Cc: bpf, Martin KaFai Lau, Andrii Nakryiko, Song Liu, Hao Luo,
Yonghong Song, Alexei Starovoitov, Daniel Borkmann, KP Singh,
David S . Miller, Jakub Kicinski, Stanislav Fomichev, Jiri Olsa,
John Fastabend, houtao1, Delyan Kratunov, rcu
Hi,
On 10/18/2022 11:08 PM, Paul E. McKenney wrote:
> On Tue, Oct 18, 2022 at 03:31:20PM +0800, Hou Tao wrote:
>> Hi,
>>
>> On 10/17/2022 9:39 PM, Paul E. McKenney wrote:
>>> On Fri, Oct 14, 2022 at 07:39:42PM +0800, Hou Tao wrote:
SNIP
>>>
>> Thanks for the review. But it seems I missed another possible use case for
>> rcu_trace_implies_rcu_gp() in bpf memory allocator. The code snippet for
>> free_mem_alloc() is as following:
>>
>> static void free_mem_alloc(struct bpf_mem_alloc *ma)
>> {
>> /* waiting_for_gp lists was drained, but __free_rcu might
>> * still execute. Wait for it now before we freeing percpu caches.
>> */
>> rcu_barrier_tasks_trace();
>> rcu_barrier();
>> free_mem_alloc_no_barrier(ma);
>> }
>>
>> It uses rcu_barrier_tasks_trace() and rcu_barrier() to wait for the completion
>> of pending call_rcu_tasks_trace()s and call_rcu()s. I think it is also safe to
>> check rcu_trace_implies_rcu_gp() in free_mem_alloc() and if it is true, there is
>> no need to call rcu_barrier().
>>
>> static void free_mem_alloc(struct bpf_mem_alloc *ma)
>> {
>> /* waiting_for_gp lists was drained, but __free_rcu_tasks_trace()
>> * or __free_rcu() might still execute. Wait for it now before we
>> * freeing percpu caches.
>> */
>> rcu_barrier_tasks_trace();
>> if (!rcu_trace_implies_rcu_gp())
>> rcu_barrier();
>> free_mem_alloc_no_barrier(ma);
>> }
>>
>> Does the above change look good to you ? If it is, I will post v3 to include the
>> above change and add your Reviewed-by tag.
> Unfortunately, although synchronize_rcu_tasks_trace() implies
> that synchronize_rcu(), there is no relationship between the
> callbacks. Furthermore, rcu_barrier_tasks_trace() does not imply
> synchronize_rcu_tasks_trace().
Yes. I see. And according to the code, if there is not pending cb,
rcu_barrier_tasks_trace() will returned immediately. It is also possible
rcu_tasks_trace kthread is in the middle of grace period waiting when invoking
rcu_barrier_task_trace(), so rcu_barrier_task_trace() does not imply
synchronize_rcu_tasks_trace().
>
> So the above change really would break things. Please do not do it.
However I am a little confused about the conclusion. If only considering the
invocations of call_rcu() and call_rcu_tasks_trace() in kernel/bpf/memalloc.c, I
think it is safe to do so, right ? Because if rcu_trace_implies_rcu_gp() is
true, there will be no invocation of call_rcu() and rcu_barrier_tasks_trace()
will wait for the completion of pending call_rcu_tasks_trace(). If
rcu_trace_implies_rcu_gp(), rcu_barrier_tasks_trace() and rcu_barrier() will do
the job. If considering the invocations of call_rcu() in other places, I think
it is definitely unsafe to do that, right ?
>
> You could use workqueues or similar to make the rcu_barrier_tasks_trace()
> and the rcu_barrier() wait concurrently, though. This would of course
> require some synchronization.
Thanks for the suggestion. Will check it later.
>
> Thanx, Paul
>
>>>> Change Log:
>>>>
>>>> v2:
>>>> * codify the implication of RCU Tasks Trace grace period instead of
>>>> assuming for it
>>>>
>>>> v1: https://lore.kernel.org/bpf/20221011071128.3470622-1-houtao@huaweicloud.com
>>>>
>>>> Hou Tao (3):
>>>> bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator
>>>> bpf: Use rcu_trace_implies_rcu_gp() in local storage map
>>>> bpf: Use rcu_trace_implies_rcu_gp() for program array freeing
>>>>
>>>> Paul E. McKenney (1):
>>>> rcu-tasks: Provide rcu_trace_implies_rcu_gp()
>>>>
>>>> include/linux/rcupdate.h | 12 ++++++++++++
>>>> kernel/bpf/bpf_local_storage.c | 13 +++++++++++--
>>>> kernel/bpf/core.c | 8 +++++++-
>>>> kernel/bpf/memalloc.c | 15 ++++++++++-----
>>>> kernel/rcu/tasks.h | 2 ++
>>>> 5 files changed, 42 insertions(+), 8 deletions(-)
>>>>
>>>> --
>>>> 2.29.2
>>>>
>>> .
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining
2022-10-21 7:08 ` Hou Tao
@ 2022-10-21 18:50 ` Paul E. McKenney
0 siblings, 0 replies; 11+ messages in thread
From: Paul E. McKenney @ 2022-10-21 18:50 UTC (permalink / raw)
To: Hou Tao
Cc: bpf, Martin KaFai Lau, Andrii Nakryiko, Song Liu, Hao Luo,
Yonghong Song, Alexei Starovoitov, Daniel Borkmann, KP Singh,
David S . Miller, Jakub Kicinski, Stanislav Fomichev, Jiri Olsa,
John Fastabend, houtao1, Delyan Kratunov, rcu
On Fri, Oct 21, 2022 at 03:08:21PM +0800, Hou Tao wrote:
> Hi,
>
> On 10/18/2022 11:08 PM, Paul E. McKenney wrote:
> > On Tue, Oct 18, 2022 at 03:31:20PM +0800, Hou Tao wrote:
> >> Hi,
> >>
> >> On 10/17/2022 9:39 PM, Paul E. McKenney wrote:
> >>> On Fri, Oct 14, 2022 at 07:39:42PM +0800, Hou Tao wrote:
> SNIP
> >>>
> >> Thanks for the review. But it seems I missed another possible use case for
> >> rcu_trace_implies_rcu_gp() in bpf memory allocator. The code snippet for
> >> free_mem_alloc() is as following:
> >>
> >> static void free_mem_alloc(struct bpf_mem_alloc *ma)
> >> {
> >> /* waiting_for_gp lists was drained, but __free_rcu might
> >> * still execute. Wait for it now before we freeing percpu caches.
> >> */
> >> rcu_barrier_tasks_trace();
> >> rcu_barrier();
> >> free_mem_alloc_no_barrier(ma);
> >> }
> >>
> >> It uses rcu_barrier_tasks_trace() and rcu_barrier() to wait for the completion
> >> of pending call_rcu_tasks_trace()s and call_rcu()s. I think it is also safe to
> >> check rcu_trace_implies_rcu_gp() in free_mem_alloc() and if it is true, there is
> >> no need to call rcu_barrier().
> >>
> >> static void free_mem_alloc(struct bpf_mem_alloc *ma)
> >> {
> >> /* waiting_for_gp lists was drained, but __free_rcu_tasks_trace()
> >> * or __free_rcu() might still execute. Wait for it now before we
> >> * freeing percpu caches.
> >> */
> >> rcu_barrier_tasks_trace();
> >> if (!rcu_trace_implies_rcu_gp())
> >> rcu_barrier();
> >> free_mem_alloc_no_barrier(ma);
> >> }
> >>
> >> Does the above change look good to you ? If it is, I will post v3 to include the
> >> above change and add your Reviewed-by tag.
> > Unfortunately, although synchronize_rcu_tasks_trace() implies
> > that synchronize_rcu(), there is no relationship between the
> > callbacks. Furthermore, rcu_barrier_tasks_trace() does not imply
> > synchronize_rcu_tasks_trace().
>
> Yes. I see. And according to the code, if there is not pending cb,
> rcu_barrier_tasks_trace() will returned immediately. It is also possible
> rcu_tasks_trace kthread is in the middle of grace period waiting when invoking
> rcu_barrier_task_trace(), so rcu_barrier_task_trace() does not imply
> synchronize_rcu_tasks_trace().
Very good!
> > So the above change really would break things. Please do not do it.
>
> However I am a little confused about the conclusion. If only considering the
> invocations of call_rcu() and call_rcu_tasks_trace() in kernel/bpf/memalloc.c, I
> think it is safe to do so, right ? Because if rcu_trace_implies_rcu_gp() is
> true, there will be no invocation of call_rcu() and rcu_barrier_tasks_trace()
> will wait for the completion of pending call_rcu_tasks_trace(). If
> rcu_trace_implies_rcu_gp(), rcu_barrier_tasks_trace() and rcu_barrier() will do
> the job. If considering the invocations of call_rcu() in other places, I think
> it is definitely unsafe to do that, right ?
Agreed, I am being cautious and pessimistic in assuming that there are
other call_rcu() invocations. On the other hand, my caution and pessimism
is based on their having been other call_rcu() invocations in the past.
So please verify that there are none before making this sort of change.
Thanx, Paul
> > You could use workqueues or similar to make the rcu_barrier_tasks_trace()
> > and the rcu_barrier() wait concurrently, though. This would of course
> > require some synchronization.
> Thanks for the suggestion. Will check it later.
> >
> > Thanx, Paul
> >
> >>>> Change Log:
> >>>>
> >>>> v2:
> >>>> * codify the implication of RCU Tasks Trace grace period instead of
> >>>> assuming for it
> >>>>
> >>>> v1: https://lore.kernel.org/bpf/20221011071128.3470622-1-houtao@huaweicloud.com
> >>>>
> >>>> Hou Tao (3):
> >>>> bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator
> >>>> bpf: Use rcu_trace_implies_rcu_gp() in local storage map
> >>>> bpf: Use rcu_trace_implies_rcu_gp() for program array freeing
> >>>>
> >>>> Paul E. McKenney (1):
> >>>> rcu-tasks: Provide rcu_trace_implies_rcu_gp()
> >>>>
> >>>> include/linux/rcupdate.h | 12 ++++++++++++
> >>>> kernel/bpf/bpf_local_storage.c | 13 +++++++++++--
> >>>> kernel/bpf/core.c | 8 +++++++-
> >>>> kernel/bpf/memalloc.c | 15 ++++++++++-----
> >>>> kernel/rcu/tasks.h | 2 ++
> >>>> 5 files changed, 42 insertions(+), 8 deletions(-)
> >>>>
> >>>> --
> >>>> 2.29.2
> >>>>
> >>> .
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining
2022-10-14 11:39 [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining Hou Tao
` (4 preceding siblings ...)
2022-10-17 13:39 ` [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining Paul E. McKenney
@ 2022-10-18 17:40 ` patchwork-bot+netdevbpf
5 siblings, 0 replies; 11+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-10-18 17:40 UTC (permalink / raw)
To: Hou Tao
Cc: bpf, kafai, andrii, songliubraving, haoluo, yhs, ast, daniel,
kpsingh, davem, kuba, sdf, jolsa, john.fastabend, houtao1,
paulmck, delyank, rcu
Hello:
This series was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:
On Fri, 14 Oct 2022 19:39:42 +0800 you wrote:
> From: Hou Tao <houtao1@huawei.com>
>
> Hi,
>
> Now bpf uses RCU grace period chaining to wait for the completion of
> access from both sleepable and non-sleepable bpf program: calling
> call_rcu_tasks_trace() firstly to wait for a RCU-tasks-trace grace
> period, then in its callback calls call_rcu() or kfree_rcu() to wait for
> a normal RCU grace period.
>
> [...]
Here is the summary with links:
- [bpf-next,v2,1/4] rcu-tasks: Provide rcu_trace_implies_rcu_gp()
https://git.kernel.org/bpf/bpf-next/c/e6c86c513f44
- [bpf-next,v2,2/4] bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator
https://git.kernel.org/bpf/bpf-next/c/59be91e5e70a
- [bpf-next,v2,3/4] bpf: Use rcu_trace_implies_rcu_gp() in local storage map
https://git.kernel.org/bpf/bpf-next/c/d39d1445d377
- [bpf-next,v2,4/4] bpf: Use rcu_trace_implies_rcu_gp() for program array freeing
https://git.kernel.org/bpf/bpf-next/c/4835f9ee980c
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 11+ messages in thread