All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yonghong Song <yonghong.song@linux.dev>
To: Martin KaFai Lau <martin.lau@linux.dev>,
	Mahe Tardy <mahe.tardy@gmail.com>
Cc: daniel@iogearbox.net, john.fastabend@gmail.com, ast@kernel.org,
	andrii@kernel.org, jolsa@kernel.org, bpf@vger.kernel.org,
	Network Development <netdev@vger.kernel.org>
Subject: Re: [PATCH bpf-next 1/2] bpf: add get_netns_cookie helper to tracing programs
Date: Fri, 7 Mar 2025 22:17:44 -0800	[thread overview]
Message-ID: <908c6a63-3049-4dd2-859a-215b31e5d1ea@linux.dev> (raw)
In-Reply-To: <a66af5a8-1aa4-481a-b57e-b3076cc520b0@linux.dev>



On 3/7/25 3:06 PM, Martin KaFai Lau wrote:
> On 3/6/25 9:03 AM, Mahe Tardy wrote:
>>>>> The immediate question is whether sock_net(sk) must be non-NULL 
>>>>> for tracing.
>>>> We discussed this offline with Daniel Borkmann and we think that it
>>>> might not be the question. The get_netns_cookie(NULL) call allows 
>>>> us to
>>>> compare against get_netns_cookie(sock) to see whether the sock's netns
>>>> is equal to the init netns and thus dispatch different logic.
>>> bpf_get_netns_cookie(NULL) should be fine.
>>>
>>> I meant to ask if sock_net(sk) may return NULL for a non NULL sk. 
>>> Please check.
>> Oh sorry for the confusion, I investigated with my humble kernel
>> knowledge: essentially sock_net(sk) is doing sk->sk_net->net, retrieving
>> the net struct representing the network namespace, to later extract the
>> cookie, and thus dereference the returned pointer (here is the concern).
>> The sk_net intermediary (in reality __sk_common.skc_net) is here because
>> of the possibility of switching on/off network namespaces via
>> CONFIG_NET_NS. It's a possible_net_t type containing (or not) the struct
>> net pointer, explaining why we use write/read_pnet to no-op or return
>> the global net ns.
>>
>> Now by adding this helper to tracing progs, it allows to call this
>> function in any function entry or function exit, but unlike kprobes,
>> it's not possible to just hook at an obvious arbitrary point in the code
>> where the net ns would be NULL in the sock struct. With that in mind, I
>> failed to crash the kernel tracing a function (some candidates were
>> inlined). I mostly grepped for sock_net_set, but I lack the knowledge to
>
> Thanks for checking.
>
> I took a quick look at the callers of sock_net_set. I suspect 
> "fentry/sk_prot_alloc" and "lsm/sk_alloc" could have a NULL?
>
>> guarantee that this could not happen right now or in the future. Maybe
>> that would be just safer to add a check and return 0 in that case if
>> that's ok? Not sure since the helper returns an 8-byte long opaque
>> number which thus includes 0 as a valid value.
>
> I assume net_cookie 0 is invalid, but then it leaks the implementation 
> details of what is a valid cookie in a uapi helper
>
>  * u64 bpf_get_netns_cookie(void *ctx)
>  * ...
>  *      Return
>  *              A 8-byte long opaque number
>
> Note that, the tracing program can already read most fields of the sk, 
> including sk->sk_net.net->net_cookie. Therefore, what this patch aims 
> to achieve has already been supported in tracing. It can also save a 
> helper call.
>
> The only thing that may be missing in your use case is determining the 
> init_net. I don't think reading a global kernel variable has been 
> supported yet. Not sure if init_net must have net_cookie 1. Otherwise, 
> we could consider to add a kfunc to return &init_net, which could be 
> used to compare with sk->sk_net.net. Having a pointer to &init_net 
> might be more useful for other tracing use cases in general.

There is the workaround for this tracing use case.

1. Declare a global variable in the bpf program, e.g.
    struct net *init_net;

2. After skel_open and before skel_load, find init_net address (from /proc/kallsyms) and
    assign the address to skel->bss->init_net.

3. In the prog, do
    struct net *netns = bpf_rdonly_cast(init_net, bpf_core_type_id_kernel(struct net));
    bpf_printk("%u\n", netns->net_cookie);

There is an effort to add global variables to BTF.
See https://lore.kernel.org/bpf/20250207012045.2129841-1-stephen.s.brennan@oracle.com/
The recommended way is to put these global variables in a module to avoid consume
too much kernel memory unconditionally.


      reply	other threads:[~2025-03-08  6:17 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-27 18:28 [PATCH bpf-next 1/2] bpf: add get_netns_cookie helper to tracing programs Mahe Tardy
2025-02-27 18:28 ` [PATCH bpf-next 2/2] selftests/bpf: add tracing netns cookie tests Mahe Tardy
2025-02-27 20:32 ` [PATCH bpf-next 1/2] bpf: add get_netns_cookie helper to tracing programs Martin KaFai Lau
2025-03-03 10:14   ` Mahe Tardy
2025-03-03 19:14     ` Martin KaFai Lau
2025-03-06 17:03       ` Mahe Tardy
2025-03-07 23:06         ` Martin KaFai Lau
2025-03-08  6:17           ` Yonghong Song [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=908c6a63-3049-4dd2-859a-215b31e5d1ea@linux.dev \
    --to=yonghong.song@linux.dev \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=mahe.tardy@gmail.com \
    --cc=martin.lau@linux.dev \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.