From: Yonghong Song <yhs@fb.com>
To: Song Liu <songliubraving@fb.com>
Cc: KP Singh <kpsingh@kernel.org>, Martin Lau <kafai@fb.com>,
bpf <bpf@vger.kernel.org>, Networking <netdev@vger.kernel.org>,
open list <linux-kernel@vger.kernel.org>,
"mingo@redhat.com" <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
John Fastabend <john.fastabend@gmail.com>,
Kernel Team <Kernel-team@fb.com>, Hao Luo <haoluo@google.com>,
kernel test robot <lkp@intel.com>
Subject: Re: [PATCH bpf-next 1/4] bpf: enable task local storage for tracing programs
Date: Fri, 15 Jan 2021 17:50:03 -0800 [thread overview]
Message-ID: <d798d2c0-66da-a3ee-a140-df11cd4ba69b@fb.com> (raw)
In-Reply-To: <75A2A254-4D51-41F0-9B01-ED2AFA745E03@fb.com>
On 1/15/21 5:12 PM, Song Liu wrote:
>
>
>> On Jan 15, 2021, at 4:55 PM, Yonghong Song <yhs@fb.com> wrote:
>>
>>
>>
>> On 1/15/21 3:34 PM, Song Liu wrote:
>>>> On Jan 12, 2021, at 8:53 AM, KP Singh <kpsingh@kernel.org> wrote:
>>>>
>>>> On Tue, Jan 12, 2021 at 5:32 PM Yonghong Song <yhs@fb.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 1/11/21 3:45 PM, Song Liu wrote:
>>>>>>
>>>>>>
>>>>>>> On Jan 11, 2021, at 1:58 PM, Martin Lau <kafai@fb.com> wrote:
>>>>>>>
>>>>>>> On Mon, Jan 11, 2021 at 10:35:43PM +0100, KP Singh wrote:
>>>>>>>> On Mon, Jan 11, 2021 at 7:57 PM Martin KaFai Lau <kafai@fb.com> wrote:
>>>>>>>>>
>>>>>>>>> On Fri, Jan 08, 2021 at 03:19:47PM -0800, Song Liu wrote:
>>>>>>>>>
>>>>>>>>> [ ... ]
>>>>>>>>>
>>>>>>>>>> diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
>>>>>>>>>> index dd5aedee99e73..9bd47ad2b26f1 100644
>>>>>>>>>> --- a/kernel/bpf/bpf_local_storage.c
>>>>>>>>>> +++ b/kernel/bpf/bpf_local_storage.c
>>>>
>>>> [...]
>>>>
>>>>>>>>>> +#include <linux/bpf.h>
>>>>>>>>>>
>>>>>>>>>> #include <asm/pgalloc.h>
>>>>>>>>>> #include <linux/uaccess.h>
>>>>>>>>>> @@ -734,6 +735,7 @@ void __put_task_struct(struct task_struct *tsk)
>>>>>>>>>> cgroup_free(tsk);
>>>>>>>>>> task_numa_free(tsk, true);
>>>>>>>>>> security_task_free(tsk);
>>>>>>>>>> + bpf_task_storage_free(tsk);
>>>>>>>>>> exit_creds(tsk);
>>>>>>>>> If exit_creds() is traced by a bpf and this bpf is doing
>>>>>>>>> bpf_task_storage_get(..., BPF_LOCAL_STORAGE_GET_F_CREATE),
>>>>>>>>> new task storage will be created after bpf_task_storage_free().
>>>>>>>>>
>>>>>>>>> I recalled there was an earlier discussion with KP and KP mentioned
>>>>>>>>> BPF_LSM will not be called with a task that is going away.
>>>>>>>>> It seems enabling bpf task storage in bpf tracing will break
>>>>>>>>> this assumption and needs to be addressed?
>>>>>>>>
>>>>>>>> For tracing programs, I think we will need an allow list where
>>>>>>>> task local storage can be used.
>>>>>>> Instead of whitelist, can refcount_inc_not_zero(&tsk->usage) be used?
>>>>>>
>>>>>> I think we can put refcount_inc_not_zero() in bpf_task_storage_get, like:
>>>>>>
>>>>>> diff --git i/kernel/bpf/bpf_task_storage.c w/kernel/bpf/bpf_task_storage.c
>>>>>> index f654b56907b69..93d01b0a010e6 100644
>>>>>> --- i/kernel/bpf/bpf_task_storage.c
>>>>>> +++ w/kernel/bpf/bpf_task_storage.c
>>>>>> @@ -216,6 +216,9 @@ BPF_CALL_4(bpf_task_storage_get, struct bpf_map *, map, struct task_struct *,
>>>>>> * by an RCU read-side critical section.
>>>>>> */
>>>>>> if (flags & BPF_LOCAL_STORAGE_GET_F_CREATE) {
>>>>>> + if (!refcount_inc_not_zero(&task->usage))
>>>>>> + return -EBUSY;
>>>>>> +
>>>>>> sdata = bpf_local_storage_update(
>>>>>> task, (struct bpf_local_storage_map *)map, value,
>>>>>> BPF_NOEXIST);
>>>>>>
>>>>>> But where shall we add the refcount_dec()? IIUC, we cannot add it to
>>>>>> __put_task_struct().
>>>>>
>>>>> Maybe put_task_struct()?
>>>>
>>>> Yeah, something like, or if you find a more elegant alternative :)
>>>>
>>>> --- a/include/linux/sched/task.h
>>>> +++ b/include/linux/sched/task.h
>>>> @@ -107,13 +107,20 @@ extern void __put_task_struct(struct task_struct *t);
>>>>
>>>> static inline void put_task_struct(struct task_struct *t)
>>>> {
>>>> - if (refcount_dec_and_test(&t->usage))
>>>> +
>>>> + if (rcu_access_pointer(t->bpf_storage)) {
>>>> + if (refcount_sub_and_test(2, &t->usage))
>>>> + __put_task_struct(t);
>>>> + } else if (refcount_dec_and_test(&t->usage))
>>>> __put_task_struct(t);
>>>> }
>>>>
>>>> static inline void put_task_struct_many(struct task_struct *t, int nr)
>>>> {
>>>> - if (refcount_sub_and_test(nr, &t->usage))
>>>> + if (rcu_access_pointer(t->bpf_storage)) {
>>>> + if (refcount_sub_and_test(nr + 1, &t->usage))
>>>> + __put_task_struct(t);
>>>> + } else if (refcount_sub_and_test(nr, &t->usage))
>>>> __put_task_struct(t);
>>>> }
>>> It is not ideal to leak bpf_storage here. How about we only add the
>>> following:
>>> diff --git i/kernel/bpf/bpf_task_storage.c w/kernel/bpf/bpf_task_storage.c
>>> index f654b56907b69..2811b9fc47233 100644
>>> --- i/kernel/bpf/bpf_task_storage.c
>>> +++ w/kernel/bpf/bpf_task_storage.c
>>> @@ -216,6 +216,10 @@ BPF_CALL_4(bpf_task_storage_get, struct bpf_map *, map, struct task_struct *,
>>> * by an RCU read-side critical section.
>>> */
>>> if (flags & BPF_LOCAL_STORAGE_GET_F_CREATE) {
>>> + /* the task_struct is being freed, fail over*/
>>> + if (!refcount_read(&task->usage))
>>> + return -EBUSY;
>>
>> This may not work? Even we check here and task->usage is not 0, it could still become 0 immediately after the above refcount_read, right?
>
> We call bpf_task_storage_get() with "task" that has valid BTF, so "task"
> should not go away during the BPF program? Whatever mechanism that
Oh, right. this is true. Otherwise, we cannot use task ptr in the helper.
> triggers the BPF program should either hold a reference to task (usage > 0)
> or be the only one owning it (usage == 0, in __put_task_struct). Did I miss
> anything?
Sorry. I think you are right. Not sure lsm requirement. There are two
more possible ways to check task is exiting which happens before
__put_task_struct():
. check task->exit_state
. check task->flags & PF_EXITING (used in bpf_trace.c)
Not sure which condition is the correct one to check.
>
> Thanks,
> Song
>
>>
>>> +
>>> sdata = bpf_local_storage_update(
>>> task, (struct bpf_local_storage_map *)map, value,
>>> BPF_NOEXIST);
>>>>
>>>>
>>>> I may be missing something but shouldn't bpf_storage be an __rcu
>>>> member like we have for sk_bpf_storage?
>>> Good catch! I will fix this in v2.
>>> Thanks,
>>> Song
>
next prev parent reply other threads:[~2021-01-16 1:51 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20210108231950.3844417-1-songliubraving@fb.com>
[not found] ` <20210108231950.3844417-2-songliubraving@fb.com>
2021-01-11 6:27 ` [PATCH bpf-next 1/4] bpf: enable task local storage for tracing programs Yonghong Song
2021-01-11 10:17 ` KP Singh
2021-01-11 15:56 ` Yonghong Song
2021-01-11 10:14 ` KP Singh
2021-01-11 23:16 ` Song Liu
2021-01-11 17:16 ` Yonghong Song
2021-01-11 18:56 ` Martin KaFai Lau
2021-01-11 21:35 ` KP Singh
2021-01-11 21:58 ` Martin KaFai Lau
2021-01-11 23:45 ` Song Liu
2021-01-12 16:32 ` Yonghong Song
2021-01-12 16:53 ` KP Singh
2021-01-15 23:34 ` Song Liu
2021-01-16 0:55 ` Yonghong Song
2021-01-16 1:12 ` Song Liu
2021-01-16 1:50 ` Yonghong Song [this message]
2021-01-11 23:41 ` Song Liu
2021-01-12 18:21 ` Martin KaFai Lau
[not found] ` <20210108231950.3844417-4-songliubraving@fb.com>
2021-01-11 17:37 ` [PATCH bpf-next 3/4] bpf: runqslower: prefer use local vmlinux Yonghong Song
[not found] ` <20210108231950.3844417-5-songliubraving@fb.com>
2021-01-11 17:49 ` [PATCH bpf-next 4/4] bpf: runqslower: use task local storage Yonghong Song
2021-01-11 22:54 ` Song Liu
2021-01-12 3:24 ` Yonghong Song
2021-01-12 7:14 ` Andrii Nakryiko
2021-01-12 7:33 ` Yonghong Song
[not found] ` <20210108231950.3844417-3-songliubraving@fb.com>
2021-01-11 17:30 ` [PATCH bpf-next 2/4] selftests/bpf: add non-BPF_LSM test for " Yonghong Song
2021-01-11 17:44 ` KP Singh
2021-01-11 22:50 ` Song Liu
2021-01-11 22:49 ` Song Liu
2021-01-12 7:06 ` Andrii Nakryiko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d798d2c0-66da-a3ee-a140-df11cd4ba69b@fb.com \
--to=yhs@fb.com \
--cc=Kernel-team@fb.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=haoluo@google.com \
--cc=john.fastabend@gmail.com \
--cc=kafai@fb.com \
--cc=kpsingh@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=lkp@intel.com \
--cc=mingo@redhat.com \
--cc=netdev@vger.kernel.org \
--cc=peterz@infradead.org \
--cc=songliubraving@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.