From: John Fastabend <john.fastabend@gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
Hou Tao <houtao@huaweicloud.com>
Cc: John Fastabend <john.fastabend@gmail.com>,
bpf <bpf@vger.kernel.org>,
Martin KaFai Lau <martin.lau@linux.dev>,
Andrii Nakryiko <andrii@kernel.org>, Song Liu <song@kernel.org>,
Hao Luo <haoluo@google.com>,
Yonghong Song <yonghong.song@linux.dev>,
Daniel Borkmann <daniel@iogearbox.net>,
KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@google.com>,
Jiri Olsa <jolsa@kernel.org>,
xingwei lee <xrivendell7@gmail.com>,
Hou Tao <houtao1@huawei.com>
Subject: Re: [PATCH bpf-next v3 1/2] bpf: Reduce the scope of rcu_read_lock when updating fd map
Date: Thu, 14 Dec 2023 11:15:32 -0800
Message-ID: <657b545493a0b_511332086@john.notmuch>
In-Reply-To: <CAADnVQK+C+9BVowRxESJhuH7BM+SWn2u_fTU2wjH0YuA-N9egw@mail.gmail.com>
Alexei Starovoitov wrote:
> On Wed, Dec 13, 2023 at 11:31 PM Hou Tao <houtao@huaweicloud.com> wrote:
> >
> > Hi,
> >
> > On 12/14/2023 2:22 PM, John Fastabend wrote:
> > > Hou Tao wrote:
> > >> From: Hou Tao <houtao1@huawei.com>
> > >>
> > >> There is no rcu-read-lock requirement for ops->map_fd_get_ptr() or
> > >> ops->map_fd_put_ptr(), so doesn't use rcu-read-lock for these two
> > >> callbacks.
> > >>
> > >> For bpf_fd_array_map_update_elem(), accessing array->ptrs doesn't need
> > >> rcu-read-lock because array->ptrs must still be allocated. For
> > >> bpf_fd_htab_map_update_elem(), htab_map_update_elem() only requires
> > >> rcu-read-lock to be held to avoid the WARN_ON_ONCE(), so only use
> > >> rcu_read_lock() during the invocation of htab_map_update_elem().
> > >>
> > >> Acked-by: Yonghong Song <yonghong.song@linux.dev>
> > >> Signed-off-by: Hou Tao <houtao1@huawei.com>
> > >> ---
> > >> kernel/bpf/hashtab.c | 6 ++++++
> > >> kernel/bpf/syscall.c | 4 ----
> > >> 2 files changed, 6 insertions(+), 4 deletions(-)
> > >>
> > >> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> > >> index 5b9146fa825f..ec3bdcc6a3cf 100644
> > >> --- a/kernel/bpf/hashtab.c
> > >> +++ b/kernel/bpf/hashtab.c
> > >> @@ -2523,7 +2523,13 @@ int bpf_fd_htab_map_update_elem(struct bpf_map *map, struct file *map_file,
> > >> if (IS_ERR(ptr))
> > >> return PTR_ERR(ptr);
> > >>
> > >> + /* The htab bucket lock is always held during update operations in fd
> > >> + * htab map, and the following rcu_read_lock() is only used to avoid
> > >> + * the WARN_ON_ONCE in htab_map_update_elem().
> > >> + */
Ah, OK, but isn't this comment wrong? You do need the RCU read lock for
the walk with lookup_nulls_elem_raw(), where no lock is held. The
subsequent copy in place is fine because you do hold a lock by then.
So it's not just there to appease the WARN_ON_ONCE here; it has an actual
need?
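To make the question concrete, here is a rough userspace model of the
pattern I have in mind. It is not kernel code: every name in it
(rcu_depth, model_rcu_read_lock(), model_lookup_raw(),
model_update_in_place(), struct elem) is made up for illustration, the
"lock" is a plain flag, and the read-side check stands in for what the
kernel expresses with WARN_ON_ONCE(). It only shows the shape of the
argument: the walk happens with no lock held, so it depends on the
read-side section, while the in-place copy is covered by a lock.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the RCU read-side nesting count (hypothetical). */
static int rcu_depth;

static void model_rcu_read_lock(void)
{
	rcu_depth++;
}

static void model_rcu_read_unlock(void)
{
	assert(rcu_depth > 0);	/* unlock must pair with an earlier lock */
	rcu_depth--;
}

struct elem {
	int key;
	int value;
	int locked;		/* stand-in for a per-element lock */
	struct elem *next;
};

/* Walk the chain without taking any lock. In the real thing this is only
 * safe inside a read-side section, since an element may be unlinked
 * concurrently.
 */
static struct elem *model_lookup_raw(struct elem *head, int key)
{
	for (struct elem *e = head; e; e = e->next)
		if (e->key == key)
			return e;
	return NULL;
}

/* Returns 0 after an in-place update, -1 if called outside a read-side
 * section (the walk is refused), -2 if the key is not present.
 */
static int model_update_in_place(struct elem *head, int key, int value)
{
	struct elem *e;

	if (rcu_depth <= 0)
		return -1;

	e = model_lookup_raw(head, key);	/* lockless walk */
	if (!e)
		return -2;

	e->locked = 1;				/* copy happens under a lock */
	e->value = value;
	e->locked = 0;
	return 0;
}
```

In this model, calling model_update_in_place() without a preceding
model_rcu_read_lock() fails before the walk starts, which is the case I
am asking about.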
> > >> + rcu_read_lock();
> > >> ret = htab_map_update_elem(map, key, &ptr, map_flags);
> > >> + rcu_read_unlock();
> > > Did we consider dropping the WARN_ON_ONCE in htab_map_update_elem()? It
> > > looks like there are two ways to get to htab_map_update_elem() either
> > > through a syscall and the path here (bpf_fd_htab_map_update_elem) or
> > > through a BPF program calling, bpf_update_elem()? In the BPF_CALL
> > > case bpf_map_update_elem() already has,
> > >
> > > WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_bh_held())
> > >
> > > The htab_map_update_elem() has an additional check for
> > > rcu_read_lock_trace_held(), but not sure where this is coming from
> > > at the moment. Can that be added to the BPF caller side if needed?
> > >
> > > Did I miss some caller path?
> >
> > No. But I think the main reason for the extra WARN in
> > bpf_map_update_elem() is that bpf_map_update_elem() may be inlined by
> > verifier in do_misc_fixups(), so the WARN_ON_ONCE in
> > bpf_map_update_elem() will not be invoked ever. For
> > rcu_read_lock_trace_held(), I have added the assertion in
> > bpf_map_delete_elem() recently in commit 169410eba271 ("bpf: Check
> > rcu_read_lock_trace_held() before calling bpf map helpers").
>
> Yep.
> We should probably remove WARN_ONs from
> bpf_map_update_elem() and others in kernel/bpf/helpers.c
> since they are inlined by the verifier with 99% probability
> and the WARNs are never called even in DEBUG kernels.
> And confusing developers. As this thread shows.
Agreed. The rcu_read_lock() needs to be as close as possible to where it's
actually needed, and the WARN_ON_ONCE should be dropped where the helper
is going to be inlined anyway.
>
> We can replace them with a comment that explains this inlining logic
> and where the real WARNs are.