From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9661EAD2E for ; Mon, 11 Sep 2023 17:41:34 +0000 (UTC) Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E2B8DC5 for ; Mon, 11 Sep 2023 10:41:32 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id 3f1490d57ef6-c8f360a07a2so4562204276.2 for ; Mon, 11 Sep 2023 10:41:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1694454092; x=1695058892; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=xu7iMhDcNFGmX/EjUlUOV/uW3b7muKE6w5Sjrvs71hU=; b=hMmrWoccR5H7OZjeJkb4MCg46cy3iyLCQoleZmr79A8xdtoywh2jA53Nub9bkAU/n4 XF3c3MbKjf6DtM0IMoaJqwVkKatE+XulD8tqJrqO5jibCUdpxE1l2noQnyNuZyXUNEno f0bPNd9EOL/PjLQ/NpvFIXzrMPHL8hgkEJv568B1nyowOYL17zq9cXMtHUyf5E65T3OC FmjTbI52sJ5RL0g9Uj0r3C7btYMsm691nQrv2v+C0O++uimWpodGkIsz1wndzFWxbCIM oSbXF0NsmOmlYZMreAGq2ujPZAPuh4LTiltY7m6iIdTZBcsgRI74wcsgHjrHaPqurwKo wJyg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1694454092; x=1695058892; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=xu7iMhDcNFGmX/EjUlUOV/uW3b7muKE6w5Sjrvs71hU=; b=R/i35dXvenzfVMmzuVAcnp+D/jtdMz72OschXEYGblA9eFMRR4X64ZO13Mj9M1C/sy h5i31DuOlQHgSeG6Y39azJK7YyMvUU+hNHFcWzDSmFGf7lQvxDg+2eMEBp6nCbJ/k+6f f544C2o6u/9oq24KaiUtBzuO/w5u1onhm9Xj+WgLxYvCVtTYPtuw2kqfBJtASUdqY7YD UHddTNRh/HaIHqXHHO4mrO5IE2N8tpLjZwyla+U6CTaeO7Rt2o+PqxchAM7vLGA8+kcO cUl77qOUBF2XnulobRZIUP50KjGhLy/HEi/s5TANqYHGLmo8BkybUjilp81DqnOTfdD1 25tg== X-Gm-Message-State: AOJu0Yx2/z6bHNvp4AZ/9H2rZOo+od9iBt9lI18tF+gRmVolegh6Awkx vp1fdkS38e1MjTcCDigoUcMF4Uc= X-Google-Smtp-Source: AGHT+IFD1OBnNOA3ZKLNQqB+s+Z5vDfLAh4EUipPkzJD6d1yh/dyS3XBJqCZPuj/+AZQZ5fdaScDMs0= X-Received: from sdf.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5935]) (user=sdf job=sendgmr) by 2002:a25:adc3:0:b0:d01:60ec:d0e with SMTP id d3-20020a25adc3000000b00d0160ec0d0emr240559ybe.9.1694454092113; Mon, 11 Sep 2023 10:41:32 -0700 (PDT) Date: Mon, 11 Sep 2023 10:41:30 -0700 In-Reply-To: Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20230908210007.1469091-1-sdf@google.com> Message-ID: Subject: Re: [PATCH bpf-next 1/2] bpf: return correct -ENOBUFS from bpf_clone_redirect From: Stanislav Fomichev To: Daniel Borkmann Cc: Martin KaFai Lau , ast@kernel.org, andrii@kernel.org, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, haoluo@google.com, jolsa@kernel.org, bpf@vger.kernel.org Content-Type: text/plain; charset="utf-8" X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net On 09/11, Daniel Borkmann wrote: > On 9/11/23 7:11 PM, Stanislav Fomichev wrote: > > On 09/09, Martin KaFai Lau wrote: > > > On 9/8/23 2:00 PM, Stanislav Fomichev wrote: > > > > Commit 151e887d8ff9 ("veth: Fixing transmit return status for dropped > > > > packets") exposed the fact that bpf_clone_redirect is capable of > > > > returning raw NET_XMIT_XXX return codes. > > > > > > > > This is in the conflict with its UAPI doc which says the following: > > > > "0 on success, or a negative error in case of failure." > > > > > > > > Let's wrap dev_queue_xmit's return value (in __bpf_tx_skb) into > > > > net_xmit_errno to make sure we correctly propagate NET_XMIT_DROP > > > > as -ENOBUFS instead of 1. > > > > > > > > Note, this is technically breaking existing UAPI where we used to > > > > return 1 and now will do -ENOBUFS. The alternative is to > > > > document that bpf_clone_redirect can return 1 for DROP and 2 for CN. > > > > > > > > Reported-by: Daniel Borkmann > > > > Signed-off-by: Stanislav Fomichev > > > > --- > > > > net/core/filter.c | 3 +++ > > > > 1 file changed, 3 insertions(+) > > > > > > > > diff --git a/net/core/filter.c b/net/core/filter.c > > > > index a094694899c9..9e297931b02f 100644 > > > > --- a/net/core/filter.c > > > > +++ b/net/core/filter.c > > > > @@ -2129,6 +2129,9 @@ static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb) > > > > ret = dev_queue_xmit(skb); > > > > dev_xmit_recursion_dec(); > > > > + if (ret > 0) > > > > + ret = net_xmit_errno(ret); > > > > > > I think it is better to have bpf_clone_redirect returning -ENOBUFS instead > > > of leaking NET_XMIT_XXX to the uapi. The bpf_clone_redirect in the > > > uapi/bpf.h also mentions > > > > > > * Return > > > * 0 on success, or a negative error in case of failure. > > > > > > If -ENOBUFS is returned in __bpf_tx_skb, should the same be done for > > > __bpf_rx_skb? and should net_xmit_errno() only be done for > > > bpf_clone_redirect()? __bpf_{tx,rx}_skb is also used by skb_do_redirect() > > > which also calls __bpf_redirect_neigh() that returns NET_XMIT_xxx but no > > > caller seems to care the NET_XMIT_xxx value now. > > > > __bpf_rx_skb seems to only add to backlog and doesn't seem to return any > > of the NET_XMIT_xxx. But I might be wrong and haven't looked too deep > > into that. > > > > > Daniel should know more here. I would wait for Daniel to comment. > > > > Ack, sure! > > I think my preference would be to just document it in the helper UAPI, what > Stan was suggesting below: > > | Note, this is technically breaking existing UAPI where we used to > | return 1 and now will do -ENOBUFS. The alternative is to > | document that bpf_clone_redirect can return 1 for DROP and 2 for CN. > > And then only adjusting the test case. In this case, would we also need something similar to our TCP_BPF_ changes? Like BUILD_BUG_ON(BPF_NET_XMIT_XXX != NET_XMIT_XXX)? Otherwise, we risk more leakage into the UAPI. Merely documenting doesn't seem enough? > Programs checking for ret < 0 will continue to behave as before. Technically > the bpf_clone_redirect() did its job just that on the veth side things were > dropped. Other drivers such as tun, vrf, ipvlan, bond could already have > returned NET_XMIT_DROP, so technically it's not a new situation where it is > possible. And having a ret > 0 could then also be clearly used to differentiate > that something came from driver side rather than helper side. > > > > For the selftest, may be another option is to use a 28 bytes data_in for the > > > lwt program redirecting to veth? 14 bytes used by bpf_prog_test_run_skb and > > > leave 14 bytes for veth_xmit. It seems the original intention of the "veth > > > ETH_HLEN+1 packet ingress" test is expecting it to succeed also. > > > > IIUC, you're suggesting to pass full ipv4 or ipv6 packet for veth tests > > to make them actually succeed with the forwarding, right? > > > > Sure, I can do that. But let's keep this entry with the -NOBUFS as well? > > Just for the sake of ensuring that we don't export NET_XMIT_xxx from > > uapi. > > >