From: Florian Westphal <fw@strlen.de>
To: Daniel Xu <dxu@dxuuu.xyz>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
Andrii Nakryiko <andrii@kernel.org>,
Alexei Starovoitov <ast@kernel.org>,
Florian Westphal <fw@strlen.de>,
"David S. Miller" <davem@davemloft.net>,
Pablo Neira Ayuso <pablo@netfilter.org>,
Paolo Abeni <pabeni@redhat.com>,
Daniel Borkmann <daniel@iogearbox.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>,
Jozsef Kadlecsik <kadlec@netfilter.org>,
Martin KaFai Lau <martin.lau@linux.dev>,
Song Liu <song@kernel.org>, Yonghong Song <yhs@fb.com>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
Jiri Olsa <jolsa@kernel.org>, bpf <bpf@vger.kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
netfilter-devel <netfilter-devel@vger.kernel.org>,
coreteam@netfilter.org,
Network Development <netdev@vger.kernel.org>,
David Ahern <dsahern@kernel.org>
Subject: Re: [PATCH bpf-next v4 2/6] netfilter: bpf: Support BPF_F_NETFILTER_IP_DEFRAG in netfilter link
Date: Fri, 14 Jul 2023 11:47:41 +0200
Message-ID: <20230714094741.GA7912@breakpoint.cc>
In-Reply-To: <t6wypww537golmoosbikfuombrqq555fh5mbycwl4whto6joo4@hcqlospkgqyr>
Daniel Xu <dxu@dxuuu.xyz> wrote:
> On Thu, Jul 13, 2023 at 04:10:03PM -0700, Alexei Starovoitov wrote:
> > Why is rcu_assign_pointer() used?
> > If it's not RCU protected, what is the point of rcu_*() accessors
> > and rcu_read_lock() ?
> >
> > In general, the pattern:
> > rcu_read_lock();
> > ptr = rcu_dereference(...);
> > rcu_read_unlock();
> > ptr->..
> > is a bug. 100%.
FWIW, I agree with Alexei, it does look... dodgy.
> The reason I left it like this is b/c otherwise I think there is a race
> with module unload and taking a refcnt. For example:
>
> ptr = READ_ONCE(global_var)
> <module unload on other cpu>
> // ptr invalid
> try_module_get(ptr->owner)
>
Yes, I agree.
> I think the synchronize_rcu() call in
> kernel/module/main.c:free_module() protects against that race based on
> my reading.
>
> Maybe the ->enable() path can store a copy of the hook ptr in
> struct bpf_nf_link to get rid of the odd rcu_dereference()?
>
> Open to other ideas too -- would appreciate any hints.
I would suggest the following:
- Switch the order of patches 2 and 3.
  What is currently patch 3 would add the .owner fields only.
  Then, what is currently patch 2 would document the rcu/modref
  interaction like this (omitting error checking for brevity):
	rcu_read_lock();
	v6_hook = rcu_dereference(nf_defrag_v6_hook);
	if (!v6_hook) {
		rcu_read_unlock();
		err = request_module("nf_defrag_ipv6");
		if (err)
			return err < 0 ? err : -EINVAL;
		rcu_read_lock();
		v6_hook = rcu_dereference(nf_defrag_v6_hook);
	}

	if (v6_hook && try_module_get(v6_hook->owner))
		v6_hook = rcu_pointer_handoff(v6_hook);
	else
		v6_hook = NULL;
	rcu_read_unlock();

	if (!v6_hook)
		err();
	v6_hook->enable();
I'd store the v4/6_hook pointer in the nf bpf link struct; it's probably more
self-explanatory for the disable side, in that we picked up a module reference
that we still own at delete time, without need for any rcu involvement.
Because the above handoff is repetitive for ipv4 and ipv6,
I suggest adding a protocol-agnostic helper for this.
I know you added distinct structures for ipv4 and ipv6, but if they used
the same one you could add
	static const struct nf_defrag_hook *
	get_proto_frag_hook(const struct nf_defrag_hook __rcu *hook,
			    const char *modulename);
And then use it like:
	v4_hook = get_proto_frag_hook(nf_defrag_v4_hook, "nf_defrag_ipv4");
without needing to copy the modprobe and handoff part.
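A rough userspace model of such a helper might look like the sketch below.
It is only meant to show the control flow, not to be the kernel code:
struct module, the rcu_*() accessors, try_module_get(), module_put() and
request_module() are trivial stand-ins written for this example, and the
hook's enable()/disable() take no arguments here. One assumed deviation from
the prototype above: the model takes the address of the global hook pointer,
so that the second rcu_dereference() can observe a pointer published by the
freshly loaded module. With a by-value argument the re-read would see the
old NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the kernel's struct module: just a refcount and a liveness flag. */
struct module { int refcnt; bool live; };

/* Simplified hook type (assumption: enable/disable take no arguments). */
struct nf_defrag_hook {
	struct module *owner;
	int (*enable)(void);
	void (*disable)(void);
};

/* No-op stand-ins for the RCU primitives; they provide no real protection. */
static void rcu_read_lock(void) {}
static void rcu_read_unlock(void) {}
#define rcu_dereference(p)	(p)
#define rcu_pointer_handoff(p)	(p)

/* Modelled after the kernel call: a NULL owner (built-in code) always succeeds. */
static bool try_module_get(struct module *m)
{
	if (!m)
		return true;
	if (!m->live)
		return false;
	m->refcnt++;
	return true;
}

static void module_put(struct module *m)
{
	if (m) {
		assert(m->refcnt > 0);
		m->refcnt--;
	}
}

/* Fake "nf_defrag_ipv4" module and the hook it would publish on load. */
static int noop_enable(void) { return 0; }
static void noop_disable(void) {}
static struct module defrag_v4_mod = { .refcnt = 0, .live = true };
static const struct nf_defrag_hook defrag_v4_hook_impl = {
	.owner = &defrag_v4_mod,
	.enable = noop_enable,
	.disable = noop_disable,
};

/* Global hook pointer; __rcu annotated in the kernel, NULL until "loaded". */
static const struct nf_defrag_hook *nf_defrag_v4_hook;

/* Stand-in for request_module(): "loading" publishes the hook pointer. */
static int request_module(const char *name)
{
	if (strcmp(name, "nf_defrag_ipv4") == 0) {
		nf_defrag_v4_hook = &defrag_v4_hook_impl;
		return 0;
	}
	return -2; /* models -ENOENT */
}

/* Returns the hook with a module reference held, or NULL on failure. */
static const struct nf_defrag_hook *
get_proto_frag_hook(const struct nf_defrag_hook **global_hook,
		    const char *modulename)
{
	const struct nf_defrag_hook *hook;

	rcu_read_lock();
	hook = rcu_dereference(*global_hook);
	if (!hook) {
		/* Module loading may sleep, so leave the read-side section. */
		rcu_read_unlock();
		if (request_module(modulename))
			return NULL;
		rcu_read_lock();
		hook = rcu_dereference(*global_hook);
	}

	/* Take the module reference while still inside the read-side section. */
	if (hook && try_module_get(hook->owner))
		hook = rcu_pointer_handoff(hook);
	else
		hook = NULL;
	rcu_read_unlock();

	return hook;
}
```

The caller would keep the returned pointer (for instance in the link struct)
and, at delete time, call hook->disable() followed by module_put(hook->owner)
without taking rcu_read_lock() again, since the reference is still owned.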
What do you think?