netdev.vger.kernel.org archive mirror
From: Ido Schimmel <idosch@nvidia.com>
To: Ferenc Fejes <ferenc@fejes.dev>, dsahern@gmail.com
Cc: netdev <netdev@vger.kernel.org>, kuniyu@amazon.com
Subject: Re: [question] robust netns association with fib4 lookup
Date: Fri, 25 Apr 2025 21:17:40 +0300
Message-ID: <aAvRxOGcyaEx0_V2@shredder>
In-Reply-To: <c28ded3224734ca62187ed9a41f7ab39ceecb610.camel@fejes.dev>

On Thu, Apr 24, 2025 at 01:33:08PM +0200, Ferenc Fejes wrote:
> Hi,
> 
> tl;dr: I want to trace fib4 lookups within a network namespace with eBPF. This
> works well with fib6, as the struct net ptr is passed as an argument to
> fib6_table_lookup [0], so I can read the inode from it and pass it to userspace.
> 
> 
> Additional context. I'm working on a fib table and fib rule lookup tracer
> application that hooks fib_table_lookup/fib6_table_lookup and fib_rules_lookup
> with fexit eBPF probes and gathers useful data from the struct flowi4 and flowi6
> used for the lookup as well as the resulting nexthop (gw, seg6, mpls tunnel) if
> the lookup is successful. If this works, my plan is to extend it to neighbour,
> fdb and mdb lookups.
> 
> Tracepoints exist for fib lookups v4 [1] and v6 [2] but in my tracer I would
> like to have netns filtering. For example: "check unsuccessful fib4 rule and
> table lookups in netns foo". Unfortunately I can't find a reliable way to
> associate netns info with fib4 lookups. The main problems are as follows.
> 
> Unlike fib6_table_lookup for v6, fib_table_lookup for v4 does not have a struct
> net argument. This makes sense, as struct net is not needed there. But without
> it, the netns association is not as easy as in the v6 case.
> 
> On the other hand, fib_lookup [3], which in most cases calls fib_table_lookup,
> has a struct net parameter. Even better, there is the struct fib_result ptr
> returned by fib_table_lookup. This would be the perfect candidate to hook into,
> but unfortunately it is an inline function.
> 
> If there are custom fib rules in the netns, __fib_lookup [4] is called, which is
> hookable. This has all the necessary info like netns, table and result. To use
> this I have to add the custom rule to the traced netns and remove it
> immediately. This will enforce the __fib_lookup codepath. But I feel that at
> some point this bug(?) will be fixed and the kernel will notice the absence of
> custom rules and switch back to the original codepath.
> 
> But this option is useless for tracing unsuccessful lookups. The stack looks
> like this:
> __fib_lookup                    <-- netns info available
>   fib_rules_lookup              <-- losing netns info... :-(
>     fib4_rule_action            <-- unsuccessful result available
>       fib_table_lookup          <-- source of unsuccessful result
> 
> My current workaround is to restore the netns info using the struct flowi4
> pointer. When we have the stack above, I use an eBPF hashmap and use the flowi4
> pointer as the key and netns as the value. Then in the fib_table_lookup I look
> up the netns id based on the value of the flowi4 pointer. Since this is the
> common case, it works, but it looks like fib_table_lookup is called from other
> places as well (even if it's rare).
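
The pointer-keyed correlation described in the quoted paragraph above can be
sketched roughly as follows. This is a plain userspace C stand-in, not the
poster's code: in a real tracer the table would be a BPF hash map and the three
functions would be the bodies of the __fib_lookup and fib_table_lookup hooks.
All names here (corr_map, corr_enter, corr_lookup, corr_exit, CORR_SLOTS) are
hypothetical, and deleting the entry when the outer call returns is an
assumption added so that a reused flowi4 address is not attributed to the wrong
netns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CORR_SLOTS 64 /* hypothetical capacity of the correlation table */

/* One entry: flowi4 pointer (key) -> netns id (value). */
struct corr_entry {
	const void *flp;   /* address of the struct flowi4; NULL marks a free slot */
	uint64_t netns_id; /* netns inode or cookie seen at the outer hook */
};

static struct corr_entry corr_map[CORR_SLOTS];

/* Outer hook (where struct net is visible): remember flp -> netns id.
 * Returns 0 on success, -1 if the table is full. */
static int corr_enter(const void *flp, uint64_t netns_id)
{
	struct corr_entry *free_slot = NULL;
	size_t i;

	assert(flp != NULL); /* NULL is reserved for free slots */

	for (i = 0; i < CORR_SLOTS; i++) {
		if (corr_map[i].flp == flp) {
			/* Same pointer seen again: overwrite the old value. */
			corr_map[i].netns_id = netns_id;
			return 0;
		}
		if (corr_map[i].flp == NULL && free_slot == NULL)
			free_slot = &corr_map[i];
	}
	if (free_slot == NULL)
		return -1;

	free_slot->flp = flp;
	free_slot->netns_id = netns_id;
	return 0;
}

/* Inner hook (no struct net argument): recover the netns id from flp.
 * Returns 0 and fills *netns_id on a hit, -1 on a miss. */
static int corr_lookup(const void *flp, uint64_t *netns_id)
{
	size_t i;

	if (flp == NULL)
		return -1;

	for (i = 0; i < CORR_SLOTS; i++) {
		if (corr_map[i].flp == flp) {
			*netns_id = corr_map[i].netns_id;
			return 0;
		}
	}
	return -1;
}

/* Outer hook exit: drop the entry so a stale pointer cannot match later. */
static void corr_exit(const void *flp)
{
	size_t i;

	for (i = 0; i < CORR_SLOTS; i++) {
		if (corr_map[i].flp == flp && flp != NULL) {
			corr_map[i].flp = NULL;
			corr_map[i].netns_id = 0;
		}
	}
}
```

A miss in corr_lookup() corresponds to the case mentioned in the quoted text:
fib_table_lookup being reached from a caller that never passed through the
outer hook, so no netns was recorded for that flowi4 pointer.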
> 
> Is there any other way to get the netns info for fib4 lookups? If not, would it
> be worth an RFC to pass the struct net argument to fib_table_lookup as well, as
> is currently done in fib6_table_lookup?

I think it makes sense to make both tracepoints similar and pass the net
argument to trace_fib_table_lookup().

> Unfortunately this includes some callers to fib_table_lookup. The
> netns id would also be presented in the existing tracepoints ([1] and
> [2]). Thanks in advance for any suggestion.

By "netns id" you mean the netns cookie? It seems that some TCP trace
events already expose it (see include/trace/events/tcp.h). It would be
nice to finally have "perf" filter these FIB events based on netns.

David, any objections?

> 
> Best,
> Ferenc
> 
> 
> [0] https://elixir.bootlin.com/linux/v6.15-rc3/source/net/ipv6/route.c#L2221
> [1] https://elixir.bootlin.com/linux/v6.15-rc3/source/include/trace/events/fib.h
> [2] https://elixir.bootlin.com/linux/v6.14/source/include/trace/events/fib6.h
> [3] https://elixir.bootlin.com/linux/v6.15-rc3/source/include/net/ip_fib.h#L374

Thread overview: 7+ messages
2025-04-24 11:33 [question] robust netns association with fib4 lookup Ferenc Fejes
2025-04-25 18:17 ` Ido Schimmel [this message]
2025-04-25 18:21   ` David Ahern
2025-04-28 10:23     ` Ferenc Fejes
2025-04-28 10:20   ` Ferenc Fejes
2025-04-28 15:35     ` Ido Schimmel
2025-04-29  5:50       ` Ferenc Fejes