From: Peter Oskolkov
To: Alexei Starovoitov, Daniel Borkmann, netdev@vger.kernel.org
Cc: Peter Oskolkov, David Ahern, Willem de Bruijn, Peter Oskolkov
Date: Fri, 1 Feb 2019 09:22:26 -0800
Subject: [PATCH bpf-next v6 2/5] bpf: implement BPF_LWT_ENCAP_IP mode in bpf_lwt_push_encap
Message-Id: <20190201172229.108867-3-posk@google.com>
In-Reply-To: <20190201172229.108867-1-posk@google.com>
References: <20190201172229.108867-1-posk@google.com>

This patch implements BPF_LWT_ENCAP_IP mode in the bpf_lwt_push_encap
BPF helper. It enables BPF programs (specifically, the
BPF_PROG_TYPE_LWT_IN and BPF_PROG_TYPE_LWT_XMIT prog types) to add IP
encapsulation headers to packets (e.g. IP/GRE, GUE, IPIP).

This is useful when thousands of different short-lived flows should be
encapped, each with a different, dynamically determined destination.
Although lwtunnels can be used in some of these scenarios, the ability
to dynamically generate encap headers adds more flexibility, e.g.
when routing depends on the state of the host (reflected in global bpf
maps).

Note: a follow-up patch will deal with GSO-enabled packets, which are
currently rejected when encapping is attempted.

Signed-off-by: Peter Oskolkov
---
 include/net/lwtunnel.h |  3 ++
 net/core/filter.c      |  3 +-
 net/core/lwt_bpf.c     | 63 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/include/net/lwtunnel.h b/include/net/lwtunnel.h
index 33fd9ba7e0e5..f0973eca8036 100644
--- a/include/net/lwtunnel.h
+++ b/include/net/lwtunnel.h
@@ -126,6 +126,8 @@ int lwtunnel_cmp_encap(struct lwtunnel_state *a, struct lwtunnel_state *b);
 int lwtunnel_output(struct net *net, struct sock *sk, struct sk_buff *skb);
 int lwtunnel_input(struct sk_buff *skb);
 int lwtunnel_xmit(struct sk_buff *skb);
+int bpf_lwt_push_ip_encap(struct sk_buff *skb, void *hdr, u32 len,
+			  bool ingress);
 
 static inline void lwtunnel_set_redirect(struct dst_entry *dst)
 {
@@ -138,6 +140,7 @@ static inline void lwtunnel_set_redirect(struct dst_entry *dst)
 		dst->input = lwtunnel_input;
 	}
 }
+
 #else
 
 static inline void lwtstate_free(struct lwtunnel_state *lws)
diff --git a/net/core/filter.c b/net/core/filter.c
index 27d3fbe4b77b..de6bd4b4e0a3 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -73,6 +73,7 @@
 #include
 #include
 #include
+#include
 
 /**
  * sk_filter_trim_cap - run a packet through a socket filter
@@ -4804,7 +4805,7 @@ static int bpf_push_seg6_encap(struct sk_buff *skb, u32 type, void *hdr, u32 len
 static int bpf_push_ip_encap(struct sk_buff *skb, void *hdr, u32 len,
 			     bool ingress)
 {
-	return -EINVAL; /* Implemented in the next patch. */
+	return bpf_lwt_push_ip_encap(skb, hdr, len, ingress);
 }
 
 BPF_CALL_4(bpf_lwt_in_push_encap, struct sk_buff *, skb, u32, type, void *, hdr,
diff --git a/net/core/lwt_bpf.c b/net/core/lwt_bpf.c
index a648568c5e8f..eaec3d491894 100644
--- a/net/core/lwt_bpf.c
+++ b/net/core/lwt_bpf.c
@@ -390,6 +390,69 @@ static const struct lwtunnel_encap_ops bpf_encap_ops = {
 	.owner = THIS_MODULE,
 };
 
+int bpf_lwt_push_ip_encap(struct sk_buff *skb, void *hdr, u32 len, bool ingress)
+{
+	struct iphdr *iph;
+	bool ipv4;
+	int err;
+
+	if (unlikely(len < sizeof(struct iphdr) || len > LWT_BPF_MAX_HEADROOM))
+		return -EINVAL;
+
+	/* GSO-enabled packets cannot be encapped at the moment. */
+	if (unlikely(skb_is_gso(skb)))
+		return -EINVAL;
+
+	/* validate protocol and length */
+	iph = (struct iphdr *)hdr;
+	if (iph->version == 4) {
+		ipv4 = true;
+		if (unlikely(len < iph->ihl * 4))
+			return -EINVAL;
+	} else if (iph->version == 6) {
+		ipv4 = false;
+		if (unlikely(len < sizeof(struct ipv6hdr)))
+			return -EINVAL;
+	} else {
+		return -EINVAL;
+	}
+
+	if (ingress)
+		err = skb_cow_head(skb, len + skb->mac_len);
+	else
+		err = skb_cow_head(skb,
+				   len + LL_RESERVED_SPACE(skb_dst(skb)->dev));
+	if (unlikely(err))
+		return err;
+
+	/* push the encap headers and fix pointers */
+	skb_reset_inner_headers(skb);
+	skb->encapsulation = 1;
+	skb_push(skb, len);
+	if (ingress)
+		skb_postpush_rcsum(skb, iph, len);
+	skb_reset_network_header(skb);
+	memcpy(skb_network_header(skb), hdr, len);
+	bpf_compute_data_pointers(skb);
+
+	if (ipv4) {
+		skb->protocol = htons(ETH_P_IP);
+		iph = ip_hdr(skb);
+		if (iph->ihl * 4 < len)
+			skb_set_transport_header(skb, iph->ihl * 4);
+
+		if (!iph->check)
+			iph->check = ip_fast_csum((unsigned char *)iph,
+						  iph->ihl);
+	} else {
+		skb->protocol = htons(ETH_P_IPV6);
+		if (sizeof(struct ipv6hdr) < len)
+			skb_set_transport_header(skb, sizeof(struct ipv6hdr));
+	}
+
+	return 0;
+}
+
 static int __init bpf_lwt_init(void)
 {
 	return lwtunnel_encap_add_ops(&bpf_encap_ops, LWTUNNEL_ENCAP_BPF);
-- 
2.20.1.611.gfbb209baf1-goog
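
Note: the header checks in bpf_lwt_push_ip_encap() can be tried out in
userspace. The following is a minimal sketch, not kernel code. It mirrors
only the length/version checks and the zero-checksum fill-in visible in
the patch above. The names encap_hdr_validate(), encap_hdr_fill_csum(),
inet_csum() and ENCAP_MAX_LEN are hypothetical; ENCAP_MAX_LEN = 256 is an
assumed stand-in for LWT_BPF_MAX_HEADROOM, and a plain RFC 1071 loop is
swapped in for ip_fast_csum().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ENCAP_MAX_LEN 256u /* assumption: stand-in for LWT_BPF_MAX_HEADROOM */
#define IPV4_MIN_HDR  20u  /* size of an IPv4 header without options */
#define IPV6_HDR      40u  /* size of the fixed IPv6 header */

/* RFC 1071 one's-complement checksum over an even number of bytes. */
static uint16_t inet_csum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;

	assert(len % 2 == 0); /* IPv4 header length is always ihl * 4 */
	for (size_t i = 0; i < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/*
 * Mirror of the checks in the patch: returns 4 or 6 for an acceptable
 * IPv4/IPv6 encap header, -1 if the header would be rejected.
 */
static int encap_hdr_validate(const uint8_t *hdr, size_t len)
{
	if (len < IPV4_MIN_HDR || len > ENCAP_MAX_LEN)
		return -1;

	unsigned int version = hdr[0] >> 4;

	if (version == 4) {
		size_t ihl_bytes = (size_t)(hdr[0] & 0x0f) * 4;

		/* the supplied buffer must cover the whole IPv4 header */
		return len < ihl_bytes ? -1 : 4;
	}
	if (version == 6)
		return len < IPV6_HDR ? -1 : 6;
	return -1;
}

/* Fill in the IPv4 header checksum only if the caller left it zero. */
static void encap_hdr_fill_csum(uint8_t *hdr)
{
	size_t ihl_bytes = (size_t)(hdr[0] & 0x0f) * 4;

	if (hdr[10] == 0 && hdr[11] == 0) {
		uint16_t csum = inet_csum(hdr, ihl_bytes);

		hdr[10] = (uint8_t)(csum >> 8);   /* network byte order */
		hdr[11] = (uint8_t)(csum & 0xff);
	}
}
```

For the 20-byte header 45 00 00 73 00 00 40 00 40 11 00 00 c0 a8 00 01
c0 a8 00 c7, encap_hdr_validate() returns 4 and encap_hdr_fill_csum()
stores b8 61 in the checksum field.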