public inbox for linux-kernel@vger.kernel.org
* [PATCH 0/3] bpf: kernel/bpf/task_iter.c: don't abuse next_thread()
@ 2023-11-14 16:32 Oleg Nesterov
  2023-11-14 16:32 ` [PATCH 1/3] bpf: task_group_seq_get_next: use __next_thread() rather than next_thread() Oleg Nesterov
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Oleg Nesterov @ 2023-11-14 16:32 UTC (permalink / raw)
  To: Alexei Starovoitov, Yonghong Song
  Cc: Chuyi Zhou, Daniel Borkmann, Kui-Feng Lee, linux-kernel, bpf

Compile tested.

Every lockless usage of next_thread() was wrong, bpf/task_iter.c is
the last user and is no exception.

Oleg.
---

 kernel/bpf/task_iter.c | 29 +++++++++++------------------
 1 file changed, 11 insertions(+), 18 deletions(-)


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/3] bpf: task_group_seq_get_next: use __next_thread() rather than next_thread()
  2023-11-14 16:32 [PATCH 0/3] bpf: kernel/bpf/task_iter.c: don't abuse next_thread() Oleg Nesterov
@ 2023-11-14 16:32 ` Oleg Nesterov
  2023-11-16  3:31   ` Yonghong Song
  2023-11-14 16:32 ` [PATCH 2/3] bpf: bpf_iter_task_next: " Oleg Nesterov
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Oleg Nesterov @ 2023-11-14 16:32 UTC (permalink / raw)
  To: Alexei Starovoitov, Yonghong Song
  Cc: Chuyi Zhou, Daniel Borkmann, Kui-Feng Lee, linux-kernel, bpf

Lockless use of next_thread() should be avoided, kernel/bpf/task_iter.c
is the last user and the usage is wrong.

task_group_seq_get_next() can return the group leader twice if it races
with mt-thread exec which changes the group->leader's pid.

Change the main loop to use __next_thread(), kill "next_tid == common->pid"
check.

__next_thread() can't loop forever, we can also change this code to retry
if next_tid == 0.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 kernel/bpf/task_iter.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
index 26082b97894d..51ae15e2b290 100644
--- a/kernel/bpf/task_iter.c
+++ b/kernel/bpf/task_iter.c
@@ -70,15 +70,13 @@ static struct task_struct *task_group_seq_get_next(struct bpf_iter_seq_task_comm
 		return NULL;
 
 retry:
-	task = next_thread(task);
+	task = __next_thread(task);
+	if (!task)
+		return NULL;
 
 	next_tid = __task_pid_nr_ns(task, PIDTYPE_PID, common->ns);
-	if (!next_tid || next_tid == common->pid) {
-		/* Run out of tasks of a process.  The tasks of a
-		 * thread_group are linked as circular linked list.
-		 */
-		return NULL;
-	}
+	if (!next_tid)
+		goto retry;
 
 	if (skip_if_dup_files && task->files == task->group_leader->files)
 		goto retry;
-- 
2.25.1.362.g51ebf55


^ permalink raw reply related	[flat|nested] 14+ messages in thread
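
Editorial sketch, not part of the patch: the retry loop from the hunk in patch 1/3 can be modelled in user space roughly as follows. All toy_* names are hypothetical. A tid of 0 stands in for a thread whose pid link is already gone, and the integer "files" field stands in for task->files. The sketch only illustrates that a NULL-terminated walk ends on its own while still skipping unwanted entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for task_struct; not kernel code. */
struct toy_thread {
	int tid;			/* 0: pid already detached */
	int files;			/* stand-in for task->files */
	struct toy_thread *next;	/* NULL at the end, as with __next_thread() */
	struct toy_thread *group_leader;
};

/* Models the patched loop: stop on NULL, retry on tid 0 or shared files. */
static struct toy_thread *toy_group_get_next(struct toy_thread *task,
					     bool skip_if_dup_files)
{
	assert(task != NULL);
	for (;;) {
		task = task->next;
		if (!task)
			return NULL;	/* end of the thread list */
		if (!task->tid)
			continue;	/* exited thread, try the next one */
		if (skip_if_dup_files &&
		    task->files == task->group_leader->files)
			continue;	/* shares files with the leader */
		return task;
	}
}

/* Leader(1) -> exited(0) -> 3 (shares files) -> 4 (own files).
 * Returns the tid found after the leader, or -1 if there is none. */
static int toy_first_after_leader(bool skip_if_dup_files)
{
	struct toy_thread t4 = { 4, 9, NULL, NULL };
	struct toy_thread t3 = { 3, 7, &t4, NULL };
	struct toy_thread t2 = { 0, 7, &t3, NULL };
	struct toy_thread t1 = { 1, 7, &t2, NULL };
	struct toy_thread *found;

	t1.group_leader = t2.group_leader = &t1;
	t3.group_leader = t4.group_leader = &t1;
	found = toy_group_get_next(&t1, skip_if_dup_files);
	return found ? found->tid : -1;
}
```

Calling toy_first_after_leader(false) skips only the exited thread, while passing true also skips the thread that shares the leader's files.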

* [PATCH 2/3] bpf: bpf_iter_task_next: use __next_thread() rather than next_thread()
  2023-11-14 16:32 [PATCH 0/3] bpf: kernel/bpf/task_iter.c: don't abuse next_thread() Oleg Nesterov
  2023-11-14 16:32 ` [PATCH 1/3] bpf: task_group_seq_get_next: use __next_thread() rather than next_thread() Oleg Nesterov
@ 2023-11-14 16:32 ` Oleg Nesterov
  2023-11-16  3:34   ` Yonghong Song
  2023-11-14 16:32 ` [PATCH 3/3] bpf: bpf_iter_task_next: use next_task(kit->task) rather than next_task(kit->pos) Oleg Nesterov
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Oleg Nesterov @ 2023-11-14 16:32 UTC (permalink / raw)
  To: Alexei Starovoitov, Yonghong Song
  Cc: Chuyi Zhou, Daniel Borkmann, Kui-Feng Lee, linux-kernel, bpf

Lockless use of next_thread() should be avoided, kernel/bpf/task_iter.c
is the last user and the usage is wrong.

bpf_iter_task_next() can loop forever, "kit->pos == kit->task" can never
happen if kit->pos execs. Change this code to use __next_thread().

With or without this change the usage of kit->pos/task and next_task()
doesn't look nice, see the next patch.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 kernel/bpf/task_iter.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
index 51ae15e2b290..d42e08d0d0b7 100644
--- a/kernel/bpf/task_iter.c
+++ b/kernel/bpf/task_iter.c
@@ -1015,12 +1015,11 @@ __bpf_kfunc struct task_struct *bpf_iter_task_next(struct bpf_iter_task *it)
 	if (flags == BPF_TASK_ITER_ALL_PROCS)
 		goto get_next_task;
 
-	kit->pos = next_thread(kit->pos);
-	if (kit->pos == kit->task) {
-		if (flags == BPF_TASK_ITER_PROC_THREADS) {
-			kit->pos = NULL;
+	kit->pos = __next_thread(kit->pos);
+	if (!kit->pos) {
+		if (flags == BPF_TASK_ITER_PROC_THREADS)
 			return pos;
-		}
+		kit->pos = kit->task;
 	} else
 		return pos;
 
-- 
2.25.1.362.g51ebf55


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/3] bpf: bpf_iter_task_next: use next_task(kit->task) rather than next_task(kit->pos)
  2023-11-14 16:32 [PATCH 0/3] bpf: kernel/bpf/task_iter.c: don't abuse next_thread() Oleg Nesterov
  2023-11-14 16:32 ` [PATCH 1/3] bpf: task_group_seq_get_next: use __next_thread() rather than next_thread() Oleg Nesterov
  2023-11-14 16:32 ` [PATCH 2/3] bpf: bpf_iter_task_next: " Oleg Nesterov
@ 2023-11-14 16:32 ` Oleg Nesterov
  2023-11-16  5:16   ` Yonghong Song
  2023-11-16  3:13 ` [PATCH 0/3] bpf: kernel/bpf/task_iter.c: don't abuse next_thread() Yonghong Song
  2023-11-19 20:00 ` patchwork-bot+netdevbpf
  4 siblings, 1 reply; 14+ messages in thread
From: Oleg Nesterov @ 2023-11-14 16:32 UTC (permalink / raw)
  To: Alexei Starovoitov, Yonghong Song
  Cc: Chuyi Zhou, Daniel Borkmann, Kui-Feng Lee, linux-kernel, bpf

This looks more clear and simplifies the code. While at it, remove the
unnecessary initialization of pos/task at the start of bpf_iter_task_new().

Note that we can even kill kit->task, we can just use pos->group_leader,
but I don't understand the BUILD_BUG_ON() checks in bpf_iter_task_new().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 kernel/bpf/task_iter.c | 14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
index d42e08d0d0b7..e5c3500443c6 100644
--- a/kernel/bpf/task_iter.c
+++ b/kernel/bpf/task_iter.c
@@ -978,7 +978,6 @@ __bpf_kfunc int bpf_iter_task_new(struct bpf_iter_task *it,
 	BUILD_BUG_ON(__alignof__(struct bpf_iter_task_kern) !=
 					__alignof__(struct bpf_iter_task));
 
-	kit->task = kit->pos = NULL;
 	switch (flags) {
 	case BPF_TASK_ITER_ALL_THREADS:
 	case BPF_TASK_ITER_ALL_PROCS:
@@ -1016,18 +1015,15 @@ __bpf_kfunc struct task_struct *bpf_iter_task_next(struct bpf_iter_task *it)
 		goto get_next_task;
 
 	kit->pos = __next_thread(kit->pos);
-	if (!kit->pos) {
-		if (flags == BPF_TASK_ITER_PROC_THREADS)
-			return pos;
-		kit->pos = kit->task;
-	} else
+	if (kit->pos || flags == BPF_TASK_ITER_PROC_THREADS)
 		return pos;
 
 get_next_task:
-	kit->pos = next_task(kit->pos);
-	kit->task = kit->pos;
-	if (kit->pos == &init_task)
+	kit->task = next_task(kit->task);
+	if (kit->task == &init_task)
 		kit->pos = NULL;
+	else
+		kit->pos = kit->task;
 
 	return pos;
 }
-- 
2.25.1.362.g51ebf55


^ permalink raw reply related	[flat|nested] 14+ messages in thread
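
Editorial sketch, not part of the patch: the control flow that bpf_iter_task_next() has after patch 3/3 can be modelled in user space as below. All toy_* names are hypothetical. The early return on a NULL pos and the choice of the init task as the starting point lie outside the quoted hunks and are assumptions here.

```c
#include <assert.h>
#include <stddef.h>

enum toy_flags { TOY_ALL_PROCS, TOY_ALL_THREADS, TOY_PROC_THREADS };

/* Hypothetical stand-in for task_struct; not kernel code. */
struct toy_task {
	int tid;
	struct toy_task *next_thread;	/* NULL-terminated, as with __next_thread() */
	struct toy_task *next_task;	/* circular list of group leaders */
};

struct toy_iter {
	struct toy_task *task;	/* current group leader */
	struct toy_task *pos;	/* task the next call will return */
	struct toy_task *init;	/* stand-in for &init_task */
	enum toy_flags flags;
};

/* Mirrors the control flow of the patched bpf_iter_task_next(). */
static struct toy_task *toy_iter_next(struct toy_iter *kit)
{
	struct toy_task *pos = kit->pos;

	if (!pos)
		return NULL;	/* assumed early exit, outside the hunk */

	if (kit->flags != TOY_ALL_PROCS) {
		kit->pos = kit->pos->next_thread;
		if (kit->pos || kit->flags == TOY_PROC_THREADS)
			return pos;
	}

	/* get_next_task: advance the leader, stop when the list wraps. */
	kit->task = kit->task->next_task;
	kit->pos = (kit->task == kit->init) ? NULL : kit->task;
	return pos;
}

/* init(0) -> A(10, with thread 11) -> B(20) -> back to init.
 * Fills out[] with the tids visited and returns how many there were. */
static int toy_collect(enum toy_flags flags, int *out, int cap)
{
	struct toy_task init = { 0, NULL, NULL };
	struct toy_task a2 = { 11, NULL, NULL };
	struct toy_task a = { 10, &a2, NULL };
	struct toy_task b = { 20, NULL, NULL };
	struct toy_task *start = (flags == TOY_PROC_THREADS) ? &a : &init;
	struct toy_iter kit = { start, start, &init, flags };
	struct toy_task *t;
	int n = 0;

	assert(out != NULL && cap > 0);
	init.next_task = &a;
	a.next_task = &b;
	b.next_task = &init;

	while (n < cap && (t = toy_iter_next(&kit)) != NULL)
		out[n++] = t->tid;
	return n;
}
```

In this model kit->task only ever advances along the list of leaders and kit->pos only along the thread list, which is the separation of roles the patch is after.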

* Re: [PATCH 0/3] bpf: kernel/bpf/task_iter.c: don't abuse next_thread()
  2023-11-14 16:32 [PATCH 0/3] bpf: kernel/bpf/task_iter.c: don't abuse next_thread() Oleg Nesterov
                   ` (2 preceding siblings ...)
  2023-11-14 16:32 ` [PATCH 3/3] bpf: bpf_iter_task_next: use next_task(kit->task) rather than next_task(kit->pos) Oleg Nesterov
@ 2023-11-16  3:13 ` Yonghong Song
  2023-11-16  9:54   ` Oleg Nesterov
  2023-11-19 20:00 ` patchwork-bot+netdevbpf
  4 siblings, 1 reply; 14+ messages in thread
From: Yonghong Song @ 2023-11-16  3:13 UTC (permalink / raw)
  To: Oleg Nesterov, Alexei Starovoitov
  Cc: Chuyi Zhou, Daniel Borkmann, Kui-Feng Lee, linux-kernel, bpf


On 11/14/23 11:32 AM, Oleg Nesterov wrote:
> Compile tested.
>
> Every lockless usage of next_thread() was wrong, bpf/task_iter.c is
> the last user and is no exception.

It would be great if you can give more information in the commit message
about why the usage of next_thread() is wrong in bpf/task_iter.c.
IIUC, some information is presented in:
   https://lore.kernel.org/all/20230824143112.GA31208@redhat.com/

Also, please add 'bpf' in the subject tag ([PATCH bpf 0/3]) to
make it clear the patch should be applied to bpf tree.

>
> Oleg.
> ---
>
>   kernel/bpf/task_iter.c | 29 +++++++++++------------------
>   1 file changed, 11 insertions(+), 18 deletions(-)
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] bpf: task_group_seq_get_next: use __next_thread() rather than next_thread()
  2023-11-14 16:32 ` [PATCH 1/3] bpf: task_group_seq_get_next: use __next_thread() rather than next_thread() Oleg Nesterov
@ 2023-11-16  3:31   ` Yonghong Song
  2023-11-16  9:34     ` Oleg Nesterov
  0 siblings, 1 reply; 14+ messages in thread
From: Yonghong Song @ 2023-11-16  3:31 UTC (permalink / raw)
  To: Oleg Nesterov, Alexei Starovoitov
  Cc: Chuyi Zhou, Daniel Borkmann, Kui-Feng Lee, linux-kernel, bpf


On 11/14/23 11:32 AM, Oleg Nesterov wrote:
> Lockless use of next_thread() should be avoided, kernel/bpf/task_iter.c
> is the last user and the usage is wrong.
>
> task_group_seq_get_next() can return the group leader twice if it races
> with mt-thread exec which changes the group->leader's pid.
>
> Change the main loop to use __next_thread(), kill "next_tid == common->pid"
> check.
>
> __next_thread() can't loop forever, we can also change this code to retry
> if next_tid == 0.
>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> ---
>   kernel/bpf/task_iter.c | 12 +++++-------
>   1 file changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
> index 26082b97894d..51ae15e2b290 100644
> --- a/kernel/bpf/task_iter.c
> +++ b/kernel/bpf/task_iter.c
> @@ -70,15 +70,13 @@ static struct task_struct *task_group_seq_get_next(struct bpf_iter_seq_task_comm
>   		return NULL;
>   
>   retry:
> -	task = next_thread(task);
> +	task = __next_thread(task);
> +	if (!task)
> +		return NULL;
>   
>   	next_tid = __task_pid_nr_ns(task, PIDTYPE_PID, common->ns);
> -	if (!next_tid || next_tid == common->pid) {
> -		/* Run out of tasks of a process.  The tasks of a
> -		 * thread_group are linked as circular linked list.
> -		 */
> -		return NULL;
> -	}
> +	if (!next_tid)
> +		goto retry;

Look at the code. Looks like next_tid should never be 0 unless some
task is migrated to another namespace, which I think is not possible.

common->ns is assigned as below:
   common->ns = get_pid_ns(task_active_pid_ns(current))
so we are searching tasks in the *current* namespace.

Look at:
pid_t pid_nr_ns(struct pid *pid, struct pid_namespace *ns)
{
         struct upid *upid;
         pid_t nr = 0;

         if (pid && ns->level <= pid->level) {
                 upid = &pid->numbers[ns->level];
                 if (upid->ns == ns)
                         nr = upid->nr;
         }
         return nr;
}

pid_t __task_pid_nr_ns(struct task_struct *task, enum pid_type type,
                         struct pid_namespace *ns)
{
         pid_t nr = 0;

         rcu_read_lock();
         if (!ns)
                 ns = task_active_pid_ns(current);
         nr = pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns);
         rcu_read_unlock();
         
         return nr;
}

In pid_nr_ns(), ns->level should be equal to pid->level if pid is
in the input namespace 'ns', and in this case the return value 'nr'
should be nonzero.

If this is the case, could you remove
	if (!next_tid)
		goto retry;

Other than above, the change looks good to me.

>   
>   	if (skip_if_dup_files && task->files == task->group_leader->files)
>   		goto retry;

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] bpf: bpf_iter_task_next: use __next_thread() rather than next_thread()
  2023-11-14 16:32 ` [PATCH 2/3] bpf: bpf_iter_task_next: " Oleg Nesterov
@ 2023-11-16  3:34   ` Yonghong Song
  0 siblings, 0 replies; 14+ messages in thread
From: Yonghong Song @ 2023-11-16  3:34 UTC (permalink / raw)
  To: Oleg Nesterov, Alexei Starovoitov
  Cc: Chuyi Zhou, Daniel Borkmann, Kui-Feng Lee, linux-kernel, bpf


On 11/14/23 11:32 AM, Oleg Nesterov wrote:
> Lockless use of next_thread() should be avoided, kernel/bpf/task_iter.c
> is the last user and the usage is wrong.
>
> bpf_iter_task_next() can loop forever, "kit->pos == kit->task" can never
> happen if kit->pos execs. Change this code to use __next_thread().
>
> With or without this change the usage of kit->pos/task and next_task()
> doesn't look nice, see the next patch.
>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: Yonghong Song <yonghong.song@linux.dev>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] bpf: bpf_iter_task_next: use next_task(kit->task) rather than next_task(kit->pos)
  2023-11-14 16:32 ` [PATCH 3/3] bpf: bpf_iter_task_next: use next_task(kit->task) rather than next_task(kit->pos) Oleg Nesterov
@ 2023-11-16  5:16   ` Yonghong Song
  2023-11-16  9:38     ` Oleg Nesterov
  0 siblings, 1 reply; 14+ messages in thread
From: Yonghong Song @ 2023-11-16  5:16 UTC (permalink / raw)
  To: Oleg Nesterov, Alexei Starovoitov
  Cc: Chuyi Zhou, Daniel Borkmann, Kui-Feng Lee, linux-kernel, bpf


On 11/14/23 11:32 AM, Oleg Nesterov wrote:
> This looks more clear and simplifies the code. While at it, remove the
> unnecessary initialization of pos/task at the start of bpf_iter_task_new().
>
> Note that we can even kill kit->task, we can just use pos->group_leader,
> but I don't understand the BUILD_BUG_ON() checks in bpf_iter_task_new().

Let us keep kit->task, which is used in later function
bpf_iter_task_next(). The patch looks good to me.

>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: Yonghong Song <yonghong.song@linux.dev>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] bpf: task_group_seq_get_next: use __next_thread() rather than next_thread()
  2023-11-16  3:31   ` Yonghong Song
@ 2023-11-16  9:34     ` Oleg Nesterov
  2023-11-16 11:46       ` Yonghong Song
  0 siblings, 1 reply; 14+ messages in thread
From: Oleg Nesterov @ 2023-11-16  9:34 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Alexei Starovoitov, Chuyi Zhou, Daniel Borkmann, Kui-Feng Lee,
	linux-kernel, bpf

On 11/15, Yonghong Song wrote:
>
> On 11/14/23 11:32 AM, Oleg Nesterov wrote:
> >@@ -70,15 +70,13 @@ static struct task_struct *task_group_seq_get_next(struct bpf_iter_seq_task_comm
> >  		return NULL;
> >  retry:
> >-	task = next_thread(task);
> >+	task = __next_thread(task);
> >+	if (!task)
> >+		return NULL;
> >  	next_tid = __task_pid_nr_ns(task, PIDTYPE_PID, common->ns);
> >-	if (!next_tid || next_tid == common->pid) {
> >-		/* Run out of tasks of a process.  The tasks of a
> >-		 * thread_group are linked as circular linked list.
> >-		 */
> >-		return NULL;
> >-	}
> >+	if (!next_tid)
> >+		goto retry;
>
> Look at the code. Looks like next_tid should never be 0

...

> pid_t __task_pid_nr_ns(struct task_struct *task, enum pid_type type,
>                         struct pid_namespace *ns)
> {
>         pid_t nr = 0;
>
>         rcu_read_lock();
>         if (!ns)
>                 ns = task_active_pid_ns(current);
>         nr = pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns);
                                          ^^^^^^^^^^^^^^^^^^^^^^^^^

Please note that *task_pid_ptr(task, type) can be NULL if this
task has already exited and called detach_pid().

detach_pid() does __change_pid(task, type, NULL), please note the

	*pid_ptr = new; // NULL in this case

assignment in __change_pid().

IOW. The problem is not that ns can change, the problem is that
task->thread_pid (and other pid links) can be NULL, and in this
case pid_nr_ns() returns zero.


This code should be rewritten from the very beginning, it should
not rely on pid_nr. If nothing else common->pid and/or pid_visiting
can be reused. But currently my only concern is next_thread().

> Other than above, the change looks good to me.

Thanks for review!

Oleg.


^ permalink raw reply	[flat|nested] 14+ messages in thread
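
Editorial sketch, not part of the thread: the point about a detached pid can be reproduced with a small user-space model of pid_nr_ns() as quoted earlier in the thread. All toy_* names and the simplified structures are hypothetical. toy_detach_pid() only models the "*pid_ptr = NULL" assignment mentioned above, not the rest of detach_pid().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct toy_ns { unsigned int level; };
struct toy_upid { int nr; const struct toy_ns *ns; };
struct toy_pid { unsigned int level; struct toy_upid numbers[1]; };
struct toy_task { struct toy_pid *thread_pid; };

/* Same shape as the quoted pid_nr_ns(): returns 0 when pid is NULL. */
static int toy_pid_nr_ns(const struct toy_pid *pid, const struct toy_ns *ns)
{
	int nr = 0;

	if (pid && ns->level <= pid->level) {
		const struct toy_upid *upid = &pid->numbers[ns->level];

		if (upid->ns == ns)
			nr = upid->nr;
	}
	return nr;
}

/* Models the pid link being cleared when the task exits. */
static void toy_detach_pid(struct toy_task *task)
{
	task->thread_pid = NULL;
}

/* Reports the tid seen before and after the task's pid is detached. */
static void toy_demo(int *before, int *after)
{
	struct toy_ns ns = { 0 };
	struct toy_pid pid = { 0, { { 42, &ns } } };
	struct toy_task task = { &pid };

	assert(before != NULL && after != NULL);
	*before = toy_pid_nr_ns(task.thread_pid, &ns);
	toy_detach_pid(&task);
	*after = toy_pid_nr_ns(task.thread_pid, &ns);
}
```

The namespace never changes in this model. The result drops to 0 only because the task's pid pointer became NULL, which is why the patched loop retries instead of treating 0 as the end of the list.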

* Re: [PATCH 3/3] bpf: bpf_iter_task_next: use next_task(kit->task) rather than next_task(kit->pos)
  2023-11-16  5:16   ` Yonghong Song
@ 2023-11-16  9:38     ` Oleg Nesterov
  0 siblings, 0 replies; 14+ messages in thread
From: Oleg Nesterov @ 2023-11-16  9:38 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Alexei Starovoitov, Chuyi Zhou, Daniel Borkmann, Kui-Feng Lee,
	linux-kernel, bpf

On 11/16, Yonghong Song wrote:
>
> On 11/14/23 11:32 AM, Oleg Nesterov wrote:
> >This looks more clear and simplifies the code. While at it, remove the
> >unnecessary initialization of pos/task at the start of bpf_iter_task_new().
> >
> >Note that we can even kill kit->task, we can just use pos->group_leader,
> >but I don't understand the BUILD_BUG_ON() checks in bpf_iter_task_new().
>
> Let us keep kit->task, which is used in later function
> bpf_iter_task_next(). The patch looks good to me.

Yes, but it can use pos->group_leader instead of kit->task.
But I agree, let's keep kit->task.

> Acked-by: Yonghong Song <yonghong.song@linux.dev>

Thanks!

Oleg.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/3] bpf: kernel/bpf/task_iter.c: don't abuse next_thread()
  2023-11-16  3:13 ` [PATCH 0/3] bpf: kernel/bpf/task_iter.c: don't abuse next_thread() Yonghong Song
@ 2023-11-16  9:54   ` Oleg Nesterov
  2023-11-16 11:52     ` Yonghong Song
  0 siblings, 1 reply; 14+ messages in thread
From: Oleg Nesterov @ 2023-11-16  9:54 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Alexei Starovoitov, Chuyi Zhou, Daniel Borkmann, Kui-Feng Lee,
	linux-kernel, bpf

On 11/15, Yonghong Song wrote:
>
> On 11/14/23 11:32 AM, Oleg Nesterov wrote:
> >Compile tested.
> >
> >Every lockless usage of next_thread() was wrong, bpf/task_iter.c is
> >the last user and is no exception.
>
> It would be great if you can give more information in the commit message
> about why the usage of next_thread() is wrong in bpf/task_iter.c.

I tried to explain the problems in the changelogs:

1/3:
	task_group_seq_get_next() can return the group leader twice if it races
	with mt-thread exec which changes the group->leader's pid.

2/3:
	bpf_iter_task_next() can loop forever, "kit->pos == kit->task" can never
	happen if kit->pos execs.

> IIUC, some information is presented in :
>   https://lore.kernel.org/all/20230824143112.GA31208@redhat.com/

Yes, Linus and Eric suggest to simply kill next_thread(). I am not
sure, this needs another discussion.

But as for bpf/task_iter.c... Even _if_ the usage was correct, this
code simply doesn't need the "circular" next_thread(), NULL at the
end simplifies the code.

> Also, please add 'bpf' in the subject tag ([PATCH bpf 0/3]) to
> make it clear the patch should be applied to bpf tree.

OK, will do next time. Or should I resend this series with 'bpf'
in the subject tag?

Thanks,

Oleg.


^ permalink raw reply	[flat|nested] 14+ messages in thread
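
Editorial sketch, not part of the thread: the difference between a circular walk that stops at "pos == start" and a NULL-terminated walk can be modelled in user space as below. All toy_* names are hypothetical. The model assumes that the circular helper wraps to the group's current leader and that an unlinked old leader keeps a stale next pointer. The step limit exists only so that the demonstration terminates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for task_struct; not kernel code. */
struct toy_thread {
	int tid;
	struct toy_thread *next;	/* NULL at the end of the thread list */
	struct toy_thread **leader;	/* where the group's current leader is stored */
};

/* Models __next_thread(): NULL once the list is exhausted. */
static struct toy_thread *toy_next_or_null(struct toy_thread *t)
{
	return t->next;
}

/* Models next_thread(): wraps around to the current group leader. */
static struct toy_thread *toy_next_circular(struct toy_thread *t)
{
	return t->next ? t->next : *t->leader;
}

/* Old-style walk: stop when we are back at start. Returns the number
 * of steps, or -1 if start was not reached within max_steps. */
static int toy_walk_circular(struct toy_thread *start, int max_steps)
{
	struct toy_thread *pos = start;

	assert(start != NULL);
	for (int steps = 1; steps <= max_steps; steps++) {
		pos = toy_next_circular(pos);
		if (pos == start)
			return steps;
	}
	return -1;
}

/* New-style walk: stop on NULL. Returns the number of threads seen. */
static int toy_walk_to_null(struct toy_thread *start)
{
	int seen = 0;

	for (struct toy_thread *pos = start; pos; pos = toy_next_or_null(pos))
		seen++;
	return seen;
}

/* Build 1 -> 2 -> 3 with 1 as leader and walk it. Then model an exec
 * by thread 3: it becomes the leader and is the only thread left on
 * the list, while the old leader keeps a stale next pointer. */
static void toy_demo(int *circ_before, int *circ_after, int *null_after)
{
	struct toy_thread *leader;
	struct toy_thread t3 = { 3, NULL, &leader };
	struct toy_thread t2 = { 2, &t3, &leader };
	struct toy_thread t1 = { 1, &t2, &leader };

	leader = &t1;
	*circ_before = toy_walk_circular(&t1, 100);

	leader = &t3;
	t1.next = &t3;
	*circ_after = toy_walk_circular(&t1, 100);
	*null_after = toy_walk_to_null(&t1);
}
```

Before the modelled exec the circular walk returns to its start after three steps. Afterwards it never sees the old leader again and exhausts its step budget, whereas the NULL-terminated walk still ends.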

* Re: [PATCH 1/3] bpf: task_group_seq_get_next: use __next_thread() rather than next_thread()
  2023-11-16  9:34     ` Oleg Nesterov
@ 2023-11-16 11:46       ` Yonghong Song
  0 siblings, 0 replies; 14+ messages in thread
From: Yonghong Song @ 2023-11-16 11:46 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Alexei Starovoitov, Chuyi Zhou, Daniel Borkmann, Kui-Feng Lee,
	linux-kernel, bpf


On 11/16/23 4:34 AM, Oleg Nesterov wrote:
> On 11/15, Yonghong Song wrote:
>> On 11/14/23 11:32 AM, Oleg Nesterov wrote:
>>> @@ -70,15 +70,13 @@ static struct task_struct *task_group_seq_get_next(struct bpf_iter_seq_task_comm
>>>   		return NULL;
>>>   retry:
>>> -	task = next_thread(task);
>>> +	task = __next_thread(task);
>>> +	if (!task)
>>> +		return NULL;
>>>   	next_tid = __task_pid_nr_ns(task, PIDTYPE_PID, common->ns);
>>> -	if (!next_tid || next_tid == common->pid) {
>>> -		/* Run out of tasks of a process.  The tasks of a
>>> -		 * thread_group are linked as circular linked list.
>>> -		 */
>>> -		return NULL;
>>> -	}
>>> +	if (!next_tid)
>>> +		goto retry;
>> Look at the code. Looks like next_tid should never be 0
> ...
>
>> pid_t __task_pid_nr_ns(struct task_struct *task, enum pid_type type,
>>                          struct pid_namespace *ns)
>> {
>>          pid_t nr = 0;
>>
>>          rcu_read_lock();
>>          if (!ns)
>>                  ns = task_active_pid_ns(current);
>>          nr = pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns);
>                                            ^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Please note that *task_pid_ptr(task, type) can be NULL if this
> task has already exited and called detach_pid().
>
> detach_pid() does __change_pid(task, type, NULL), please note the
>
> 	*pid_ptr = new; // NULL in this case
>
> assignment in __change_pid().
>
> IOW. The problem is not that ns can change, the problem is that
> task->thread_pid (and other pid links) can be NULL, and in this
> case pid_nr_ns() returns zero.

Thanks for the explanation. I certainly missed the race between the task
iterator and __change_pid(). With that, the patch looks good to me.

Acked-by: Yonghong Song <yonghong.song@linux.dev>

>
>
> This code should be rewritten from the very beginning, it should
> not rely on pid_nr. If nothing else common->pid and/or pid_visiting
> can be reused. But currently my only concern is next_thread().
>
>> Other than above, the change looks good to me.
> Thanks for review!
>
> Oleg.
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/3] bpf: kernel/bpf/task_iter.c: don't abuse next_thread()
  2023-11-16  9:54   ` Oleg Nesterov
@ 2023-11-16 11:52     ` Yonghong Song
  0 siblings, 0 replies; 14+ messages in thread
From: Yonghong Song @ 2023-11-16 11:52 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Alexei Starovoitov, Chuyi Zhou, Daniel Borkmann, Kui-Feng Lee,
	linux-kernel, bpf


On 11/16/23 4:54 AM, Oleg Nesterov wrote:
> On 11/15, Yonghong Song wrote:
>> On 11/14/23 11:32 AM, Oleg Nesterov wrote:
>>> Compile tested.
>>>
>>> Every lockless usage of next_thread() was wrong, bpf/task_iter.c is
>>> the last user and is no exception.
>> It would be great if you can give more information in the commit message
>> about why the usage of next_thread() is wrong in bpf/task_iter.c.
> I tried to explain the problems in the changelogs:
>
> 1/3:
> 	task_group_seq_get_next() can return the group leader twice if it races
> 	with mt-thread exec which changes the group->leader's pid.
>
> 2/3:
> 	bpf_iter_task_next() can loop forever, "kit->pos == kit->task" can never
> 	happen if kit->pos execs.
>
>> IIUC, some information is presented in :
>>    https://lore.kernel.org/all/20230824143112.GA31208@redhat.com/
> Yes, Linus and Eric suggest to simply kill next_thread(). I am not
> sure, this needs another discussion.
>
> But as for bpf/task_iter.c... Even _if_ the usage was correct, this
> code simply doesn't need the "circular" next_thread(), NULL at the
> end simplifies the code.
>
>> Also, please add 'bpf' in the subject tag ([PATCH bpf 0/3]) to
>> make it clear the patch should be applied to bpf tree.
> OK, will do next time. Or should I resend this series with 'bpf'
> in the subject tag?

There is no need then. We can wait for maintainers who may or
may not have additional requests.


>
> Thanks,
>
> Oleg.
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/3] bpf: kernel/bpf/task_iter.c: don't abuse next_thread()
  2023-11-14 16:32 [PATCH 0/3] bpf: kernel/bpf/task_iter.c: don't abuse next_thread() Oleg Nesterov
                   ` (3 preceding siblings ...)
  2023-11-16  3:13 ` [PATCH 0/3] bpf: kernel/bpf/task_iter.c: don't abuse next_thread() Yonghong Song
@ 2023-11-19 20:00 ` patchwork-bot+netdevbpf
  4 siblings, 0 replies; 14+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-11-19 20:00 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: ast, yonghong.song, zhouchuyi, daniel, kuifeng, linux-kernel, bpf

Hello:

This series was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Tue, 14 Nov 2023 17:32:11 +0100 you wrote:
> Compile tested.
> 
> Every lockless usage of next_thread() was wrong, bpf/task_iter.c is
> the last user and is no exception.
> 
> Oleg.
> 
> [...]

Here is the summary with links:
  - [1/3] bpf: task_group_seq_get_next: use __next_thread() rather than next_thread()
    https://git.kernel.org/bpf/bpf-next/c/2d1618054f25
  - [2/3] bpf: bpf_iter_task_next: use __next_thread() rather than next_thread()
    https://git.kernel.org/bpf/bpf-next/c/5a34f9dabd9a
  - [3/3] bpf: bpf_iter_task_next: use next_task(kit->task) rather than next_task(kit->pos)
    https://git.kernel.org/bpf/bpf-next/c/ac8148d957f5

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 14+ messages in thread
