Re: [PATCH bpf-next 1/2] bpf: add get_netns_cookie helper to tracing programs

From: Yonghong Song <yonghong.song@linux.dev>
To: Martin KaFai Lau <martin.lau@linux.dev>,
	Mahe Tardy <mahe.tardy@gmail.com>
Cc: daniel@iogearbox.net, john.fastabend@gmail.com, ast@kernel.org,
	andrii@kernel.org, jolsa@kernel.org, bpf@vger.kernel.org,
	Network Development <netdev@vger.kernel.org>
Subject: Re: [PATCH bpf-next 1/2] bpf: add get_netns_cookie helper to tracing programs
Date: Fri, 7 Mar 2025 22:17:44 -0800
Message-ID: <908c6a63-3049-4dd2-859a-215b31e5d1ea@linux.dev>
In-Reply-To: <a66af5a8-1aa4-481a-b57e-b3076cc520b0@linux.dev>



On 3/7/25 3:06 PM, Martin KaFai Lau wrote:
> On 3/6/25 9:03 AM, Mahe Tardy wrote:
>>>>> The immediate question is whether sock_net(sk) must be non-NULL 
>>>>> for tracing.
>>>> We discussed this offline with Daniel Borkmann and we think that it
>>>> might not be the question. The get_netns_cookie(NULL) call allows us to
>>>> compare against get_netns_cookie(sock) to see whether the sock's netns
>>>> is equal to the init netns and thus dispatch different logic.
>>> bpf_get_netns_cookie(NULL) should be fine.
>>>
>>> I meant to ask if sock_net(sk) may return NULL for a non NULL sk. 
>>> Please check.
>> Oh sorry for the confusion, I investigated with my humble kernel
>> knowledge: essentially sock_net(sk) is doing sk->sk_net->net, retrieving
>> the net struct representing the network namespace, to later extract the
>> cookie, and thus dereference the returned pointer (here is the concern).
>> The sk_net intermediary (in reality __sk_common.skc_net) is here because
>> of the possibility of switching on/off network namespaces via
>> CONFIG_NET_NS. It's a possible_net_t type containing (or not) the struct
>> net pointer, explaining why we use write/read_pnet to no-op or return
>> the global net ns.
>>
>> Now by adding this helper to tracing progs, it allows to call this
>> function in any function entry or function exit, but unlike kprobes,
>> it's not possible to just hook at an obvious arbitrary point in the code
>> where the net ns would be NULL in the sock struct. With that in mind, I
>> failed to crash the kernel tracing a function (some candidates were
>> inlined). I mostly grepped for sock_net_set, but I lack the knowledge to
>
> Thanks for checking.
>
> I took a quick look at the callers of sock_net_set. I suspect 
> "fentry/sk_prot_alloc" and "lsm/sk_alloc" could have a NULL?
>
>> guarantee that this could not happen right now or in the future. Maybe
>> that would be just safer to add a check and return 0 in that case if
>> that's ok? Not sure since the helper returns an 8-byte long opaque
>> number which thus includes 0 as a valid value.
>
> I assume net_cookie 0 is invalid, but then it leaks the implementation 
> details of what is a valid cookie in a uapi helper
>
>  * u64 bpf_get_netns_cookie(void *ctx)
>  * ...
>  *      Return
>  *              A 8-byte long opaque number
>
> Note that, the tracing program can already read most fields of the sk, 
> including sk->sk_net.net->net_cookie. Therefore, what this patch aims 
> to achieve has already been supported in tracing. It can also save a 
> helper call.
>
> The only thing that may be missing in your use case is determining the 
> init_net. I don't think reading a global kernel variable has been 
> supported yet. Not sure if init_net must have net_cookie 1. Otherwise, 
> we could consider to add a kfunc to return &init_net, which could be 
> used to compare with sk->sk_net.net. Having a pointer to &init_net 
> might be more useful for other tracing use cases in general.

There is a workaround for this tracing use case.

1. Declare a global variable in the bpf program, e.g.
    struct net *init_net;

2. After skel_open and before skel_load, find the address of init_net (from /proc/kallsyms) and
    assign it to skel->bss->init_net.

3. In the prog, do
    struct net *netns = bpf_rdonly_cast(init_net, bpf_core_type_id_kernel(struct net));
    bpf_printk("%llu\n", netns->net_cookie);
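
For step 2, a minimal userspace sketch of the /proc/kallsyms lookup might
look like the following. The helper names find_ksym and find_ksym_in_proc are
hypothetical and not part of libbpf or the kernel tree; the sketch only
assumes the usual "<hex address> <type> <name> [module]" line format. Note
that without sufficient privileges (see kptr_restrict) the addresses read
back as zero, so the wrapper treats a zero address as a failure.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan a kallsyms-formatted stream for an exact symbol name.
 * Returns 0 and stores the address in *addr on success,
 * -1 if the symbol is not present in the stream.
 */
int find_ksym(FILE *f, const char *name, uint64_t *addr)
{
    char line[512], sym[256], type;
    uint64_t a;

    assert(f && name && addr);

    while (fgets(line, sizeof(line), f)) {
        /* Skip lines that do not have address, type and name fields. */
        if (sscanf(line, "%" SCNx64 " %c %255s", &a, &type, sym) != 3)
            continue;
        if (strcmp(sym, name) == 0) {
            *addr = a;
            return 0;
        }
    }
    return -1;
}

/*
 * Hypothetical convenience wrapper around the real file. A zero
 * address is reported as failure, since that is what an
 * unprivileged reader sees when kernel pointers are hidden.
 */
int find_ksym_in_proc(const char *name, uint64_t *addr)
{
    FILE *f = fopen("/proc/kallsyms", "r");
    int err;

    if (!f)
        return -1;
    err = find_ksym(f, name, addr);
    fclose(f);
    if (!err && *addr == 0)
        return -1;
    return err;
}
```

The loader would call find_ksym_in_proc("init_net", &addr) between skel_open
and skel_load and store the result in skel->bss->init_net, as described in
step 2.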

There is an effort to add global variables to BTF.
See https://lore.kernel.org/bpf/20250207012045.2129841-1-stephen.s.brennan@oracle.com/
The recommended way is to put these global variables in a module to avoid consuming
too much kernel memory unconditionally.


Thread overview: 6+ messages
     [not found] <20250227182830.90863-1-mahe.tardy@gmail.com>
2025-02-27 20:32 ` [PATCH bpf-next 1/2] bpf: add get_netns_cookie helper to tracing programs Martin KaFai Lau
2025-03-03 10:14   ` Mahe Tardy
2025-03-03 19:14     ` Martin KaFai Lau
2025-03-06 17:03       ` Mahe Tardy
2025-03-07 23:06         ` Martin KaFai Lau
2025-03-08  6:17           ` Yonghong Song [this message]
