From: Stanislav Fomichev <sdf@google.com>
To: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin KaFai Lau <martin.lau@linux.dev>,
ast@kernel.org, andrii@kernel.org, song@kernel.org, yhs@fb.com,
john.fastabend@gmail.com, kpsingh@kernel.org, haoluo@google.com,
jolsa@kernel.org, bpf@vger.kernel.org
Subject: Re: [PATCH bpf-next 1/2] bpf: return correct -ENOBUFS from bpf_clone_redirect
Date: Mon, 11 Sep 2023 11:52:31 -0700 [thread overview]
Message-ID: <ZP9h71zBaUUVsYYs@google.com> (raw)
In-Reply-To: <927cc104-a266-7300-f601-e39d5d0fef59@iogearbox.net>
On 09/11, Daniel Borkmann wrote:
> On 9/11/23 7:41 PM, Stanislav Fomichev wrote:
> > On 09/11, Daniel Borkmann wrote:
> > > On 9/11/23 7:11 PM, Stanislav Fomichev wrote:
> > > > On 09/09, Martin KaFai Lau wrote:
> > > > > On 9/8/23 2:00 PM, Stanislav Fomichev wrote:
> > > > > > Commit 151e887d8ff9 ("veth: Fixing transmit return status for dropped
> > > > > > packets") exposed the fact that bpf_clone_redirect is capable of
> > > > > > returning raw NET_XMIT_XXX return codes.
> > > > > >
> > > > > > This is in the conflict with its UAPI doc which says the following:
> > > > > > "0 on success, or a negative error in case of failure."
> > > > > >
> > > > > > Let's wrap dev_queue_xmit's return value (in __bpf_tx_skb) into
> > > > > > net_xmit_errno to make sure we correctly propagate NET_XMIT_DROP
> > > > > > as -ENOBUFS instead of 1.
> > > > > >
> > > > > > Note, this is technically breaking existing UAPI where we used to
> > > > > > return 1 and now will do -ENOBUFS. The alternative is to
> > > > > > document that bpf_clone_redirect can return 1 for DROP and 2 for CN.
> > > > > >
> > > > > > Reported-by: Daniel Borkmann <daniel@iogearbox.net>
> > > > > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > > > > ---
> > > > > > net/core/filter.c | 3 +++
> > > > > > 1 file changed, 3 insertions(+)
> > > > > >
> > > > > > diff --git a/net/core/filter.c b/net/core/filter.c
> > > > > > index a094694899c9..9e297931b02f 100644
> > > > > > --- a/net/core/filter.c
> > > > > > +++ b/net/core/filter.c
> > > > > > @@ -2129,6 +2129,9 @@ static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb)
> > > > > > ret = dev_queue_xmit(skb);
> > > > > > dev_xmit_recursion_dec();
> > > > > > + if (ret > 0)
> > > > > > + ret = net_xmit_errno(ret);
> > > > >
> > > > > I think it is better to have bpf_clone_redirect returning -ENOBUFS instead
> > > > > of leaking NET_XMIT_XXX to the uapi. The bpf_clone_redirect in the
> > > > > uapi/bpf.h also mentions
> > > > >
> > > > > * Return
> > > > > * 0 on success, or a negative error in case of failure.
> > > > >
> > > > > If -ENOBUFS is returned in __bpf_tx_skb, should the same be done for
> > > > > __bpf_rx_skb? and should net_xmit_errno() only be done for
> > > > > bpf_clone_redirect()? __bpf_{tx,rx}_skb is also used by skb_do_redirect()
> > > > > which also calls __bpf_redirect_neigh() that returns NET_XMIT_xxx but no
> > > > > caller seems to care the NET_XMIT_xxx value now.
> > > >
> > > > __bpf_rx_skb seems to only add to backlog and doesn't seem to return any
> > > > of the NET_XMIT_xxx. But I might be wrong and haven't looked too deep
> > > > into that.
> > > >
> > > > > Daniel should know more here. I would wait for Daniel to comment.
> > > >
> > > > Ack, sure!
> > >
> > > I think my preference would be to just document it in the helper UAPI, what
> > > Stan was suggesting below:
> > >
> > > | Note, this is technically breaking existing UAPI where we used to
> > > | return 1 and now will do -ENOBUFS. The alternative is to
> > > | document that bpf_clone_redirect can return 1 for DROP and 2 for CN.
> > >
> > > And then only adjusting the test case.
> >
> > In this case, would we also need something similar to our
> > TCP_BPF_<state> changes? Like BUILD_BUG_ON(BPF_NET_XMIT_XXX !=
> > NET_XMIT_XXX)? Otherwise, we risk more leakage into the UAPI.
> > Merely documenting doesn't seem enough?
>
> We could probably just mention that a positive, non-zero code indicates
> that the skb clone got forwarded to the target netdevice but got dropped
> from driver side. This is somewhat also driver dependent e.g. if you look
> at dummy which does drop-all, it returns NETDEV_TX_OK. Anything more
> specific in the helper doc such as defining BPF_NET_XMIT_* would be more
> confusing.
Something like the following?
Return
0 on success, or a negative error in case of failure. Positive
error indicates a potential drop or congestion in the target
device. The particular positive error codes are not defined.
next prev parent reply other threads:[~2023-09-11 18:52 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-09-08 21:00 [PATCH bpf-next 1/2] bpf: return correct -ENOBUFS from bpf_clone_redirect Stanislav Fomichev
2023-09-08 21:00 ` [PATCH bpf-next 2/2] selftests/bpf: update bpf_clone_redirect expected return code Stanislav Fomichev
2023-09-09 7:31 ` [PATCH bpf-next 1/2] bpf: return correct -ENOBUFS from bpf_clone_redirect Martin KaFai Lau
2023-09-11 17:11 ` Stanislav Fomichev
2023-09-11 17:23 ` Daniel Borkmann
2023-09-11 17:41 ` Stanislav Fomichev
2023-09-11 18:36 ` Daniel Borkmann
2023-09-11 18:52 ` Stanislav Fomichev [this message]
2023-09-11 19:05 ` Daniel Borkmann
2023-09-11 18:08 ` Martin KaFai Lau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZP9h71zBaUUVsYYs@google.com \
--to=sdf@google.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=haoluo@google.com \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kpsingh@kernel.org \
--cc=martin.lau@linux.dev \
--cc=song@kernel.org \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.