Netdev List
From: Leon Hwang <leon.hwang@linux.dev>
To: Masashi Honma <masashi.honma@gmail.com>,
	netdev@vger.kernel.org, bpf@vger.kernel.org,
	linux-trace-kernel@vger.kernel.org
Cc: ast@kernel.org, daniel@iogearbox.net, kuba@kernel.org,
	hawk@kernel.org, andrii@kernel.org, rostedt@goodmis.org,
	mhiramat@kernel.org, edumazet@google.com, pabeni@redhat.com,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC] xdp: add device context to bpf_xdp_link_attach_failed tracepoint
Date: Sun, 28 Jun 2026 23:26:15 +0800
Message-ID: <29129c40-4010-4862-9b4b-3bafad874568@linux.dev>
In-Reply-To: <CAFk-A4mE9Jweo2hfX7y_85xbPyt0FqpMT1EvqX1OcYZ=LTLgRA@mail.gmail.com>

On 2026/6/28 19:39, Masashi Honma wrote:
> Hello, I am re-posting this mail because I forgot to add [RFC].
> 
> The bpf_xdp_link_attach_failed tracepoint (added in commit bf4ea1d0b2cb
> "xdp: Add tracepoint for xdp attaching failure") exposes the netlink
> extack message produced when attaching an XDP program via BPF_LINK_CREATE
> fails. This is useful because, unlike the netlink attach path, the

I really appreciate that the XDP tracepoint helped someone.

> bpf_link attach path does not return the extack to userspace -- the caller
> only gets an errno (e.g. EINVAL/ERANGE).
> 
> We would like to use this in Cilium [1][2]: when attaching the XDP
> datapath program fails, surface the kernel's reason (e.g. "single-buffer
> XDP requires MTU less than ...") in the agent logs instead of an opaque
> errno, so operators don't have to inspect dmesg on the host.
> 
> The limitation we hit is that the tracepoint only carries the message
> string, so a consumer cannot tell which device a failure belongs to.
> This matters for two reasons:
> 
>   1. Correlation: with only the message, a consumer cannot reliably
>       attribute a failure to a specific attach, particularly if multiple
>       XDP attaches happen concurrently.
>   2. Scoping: a consumer watching this tracepoint sees XDP attach
>       failures system-wide and cannot limit them to the devices it
>       manages.
> 
> At the call site (bpf_xdp_link_attach() in net/core/dev.c) the net_device
> is in scope, so exposing it looks straightforward:
> 
>   TRACE_EVENT(bpf_xdp_link_attach_failed,
>       TP_PROTO(const char *msg, const struct net_device *dev),
>       TP_ARGS(msg, dev),
>       TP_STRUCT__entry(
>           __string(msg, msg)
>           __field(int, ifindex)
>       ),
>       TP_fast_assign(
>           __assign_str(msg);
>           __entry->ifindex = dev->ifindex;
>       ),
>       TP_printk("ifindex=%d errmsg=%s", __entry->ifindex, __get_str(msg))
>   );
> 
>   - trace_bpf_xdp_link_attach_failed(extack._msg);
>   + trace_bpf_xdp_link_attach_failed(extack._msg, dev);
> 
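
To make the scoping point concrete, here is a minimal userspace sketch
of how a consumer might filter on the proposed "ifindex=%d errmsg=%s"
output. It assumes the TP_printk format quoted above is adopted
unchanged and that the consumer reads text lines (for example from
trace_pipe); parse_attach_failed() and is_managed() are hypothetical
helper names, not existing kernel or libbpf APIs.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Extract the ifindex and message from one trace line, assuming the
 * proposed "ifindex=%d errmsg=%s" format. On success, *errmsg points
 * into the caller's line buffer (no copy is made).
 */
static bool parse_attach_failed(const char *line, int *ifindex,
				const char **errmsg)
{
	const char *p = strstr(line, "ifindex=");
	const char *m;

	if (!p || sscanf(p, "ifindex=%d", ifindex) != 1)
		return false;

	m = strstr(p, " errmsg=");
	if (!m)
		return false;

	*errmsg = m + strlen(" errmsg=");
	return true;
}

/* Scoping: report only failures on devices this consumer manages. */
static bool is_managed(int ifindex, const int *managed, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (managed[i] == ifindex)
			return true;
	return false;
}
```

A consumer would call parse_attach_failed() on each line and drop the
event unless is_managed() accepts the parsed ifindex. Without the
ifindex field there is nothing to filter on.
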
> Before sending a formal patch I'd appreciate guidance on a few points:
> 
>   - Should the tracepoint take const struct net_device *dev (consistent
>     with the other tracepoints in this file, and lets TP_printk show the
>     device), or just the ifindex as an int (simpler for raw_tp BPF
>     consumers, which otherwise read dev->ifindex via CO-RE)?
> 
>   - For raw_tp consumers the argument order is effectively ABI: prepending
>     dev would shift the existing msg argument. I've appended dev above to
>     keep msg at args[0]. Is preserving the existing argument position the
>     right call, or is reordering acceptable given how new and rarely
>     consumed this tracepoint is?
> 

Good concerns. I'm not sure about these parts.
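
For anyone following along, the ordering concern can be modeled in
plain userspace C. A raw_tp program sees the tracepoint arguments as a
positional array of 64-bit slots, so a consumer written against the
current one-argument form reads the message from slot 0. The sketch
below is only a model of that layout, not kernel or BPF code, and
struct fake_net_device, struct raw_tp_args, and the emit_*/read_*
helpers are hypothetical names used for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct net_device; only ifindex matters. */
struct fake_net_device {
	int ifindex;
};

/* Model of what a raw_tp program receives: positional 64-bit slots. */
struct raw_tp_args {
	uint64_t args[2];
};

/* dev appended: msg stays in slot 0, dev lands in slot 1. */
static struct raw_tp_args emit_appended(const char *msg,
					const struct fake_net_device *dev)
{
	struct raw_tp_args ctx = {
		.args = { (uint64_t)(uintptr_t)msg, (uint64_t)(uintptr_t)dev },
	};
	return ctx;
}

/* dev prepended: msg is shifted to slot 1. */
static struct raw_tp_args emit_prepended(const char *msg,
					 const struct fake_net_device *dev)
{
	struct raw_tp_args ctx = {
		.args = { (uint64_t)(uintptr_t)dev, (uint64_t)(uintptr_t)msg },
	};
	return ctx;
}

/* Existing consumer, written against the one-argument tracepoint. */
static const char *read_msg_v1(const struct raw_tp_args *ctx)
{
	return (const char *)(uintptr_t)ctx->args[0];
}

/* Updated consumer that expects the appended layout. */
static int read_ifindex_v2(const struct raw_tp_args *ctx)
{
	const struct fake_net_device *dev =
		(const struct fake_net_device *)(uintptr_t)ctx->args[1];

	return dev->ifindex;
}
```

With emit_appended(), read_msg_v1() still returns the message pointer
and read_ifindex_v2() can find the device. With emit_prepended(),
read_msg_v1() would hand back the device pointer as if it were a
string, which is the breakage the appended order avoids.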

>   - Is extending the existing tracepoint preferred, or would you rather
>     keep it as-is and expose the device context some other way?
> 

I'm planning to retire this tracepoint, but I don't think I can do so
if any user space application relies on it.

I'm planning to add BPF syscall common attributes support for
BPF_LINK_CREATE, including XDP links. That way, the kernel will be
able to propagate 'extack._msg' back to user space when creating an
XDP link fails. The user space library will then be able to get the
error message alongside the errno.

Thanks,
Leon

> This would be my first XDP/BPF tracepoint change, so any direction is
> welcome. I'm happy to send a proper patch once the shape is agreed.
> 
> Regards,
> Masashi Honma
> 
> [1] https://github.com/cilium/cilium/issues/40777
> [2] https://github.com/cilium/cilium/pull/46546


