Date: Wed, 6 Feb 2019 16:37:16 -0800
In-Reply-To: <20190207003720.51096-1-posk@google.com>
Message-Id: <20190207003720.51096-3-posk@google.com>
References: <20190207003720.51096-1-posk@google.com>
Subject: [PATCH bpf-next v7 2/6] bpf: implement BPF_LWT_ENCAP_IP mode in bpf_lwt_push_encap
From: Peter Oskolkov
To: Alexei Starovoitov, Daniel Borkmann, netdev@vger.kernel.org
Cc: Peter Oskolkov, David Ahern, Willem de Bruijn, Peter Oskolkov
X-Mailing-List: netdev@vger.kernel.org

This patch implements BPF_LWT_ENCAP_IP mode in the bpf_lwt_push_encap
BPF helper. It enables BPF programs (specifically, BPF_PROG_TYPE_LWT_IN
and BPF_PROG_TYPE_LWT_XMIT prog types) to add IP encapsulation headers
to packets (e.g. IP/GRE, GUE, IPIP).

This is useful when thousands of different short-lived flows should be
encapped, each with a different and dynamically determined destination.
Although lwtunnels can be used in some of these scenarios, the ability
to dynamically generate encap headers adds more flexibility, e.g.
when routing depends on the state of the host (reflected in global bpf
maps).

v7 changes:
 - added a call skb_clear_hash();
 - removed calls to skb_set_transport_header();
 - refuse to encap GSO-enabled packets.

Note: the next patch in the patchset will deal with GSO-enabled packets,
which are currently rejected at encapping attempt.

Signed-off-by: Peter Oskolkov
---
 include/net/lwtunnel.h |  3 ++
 net/core/filter.c      |  3 +-
 net/core/lwt_bpf.c     | 65 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 70 insertions(+), 1 deletion(-)

diff --git a/include/net/lwtunnel.h b/include/net/lwtunnel.h
index 33fd9ba7e0e5..f0973eca8036 100644
--- a/include/net/lwtunnel.h
+++ b/include/net/lwtunnel.h
@@ -126,6 +126,8 @@ int lwtunnel_cmp_encap(struct lwtunnel_state *a, struct lwtunnel_state *b);
 int lwtunnel_output(struct net *net, struct sock *sk, struct sk_buff *skb);
 int lwtunnel_input(struct sk_buff *skb);
 int lwtunnel_xmit(struct sk_buff *skb);
+int bpf_lwt_push_ip_encap(struct sk_buff *skb, void *hdr, u32 len,
+			  bool ingress);
 
 static inline void lwtunnel_set_redirect(struct dst_entry *dst)
 {
@@ -138,6 +140,7 @@ static inline void lwtunnel_set_redirect(struct dst_entry *dst)
 		dst->input = lwtunnel_input;
 	}
 }
+
 #else
 
 static inline void lwtstate_free(struct lwtunnel_state *lws)
diff --git a/net/core/filter.c b/net/core/filter.c
index 8884120fe458..7b7e7c9125e2 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -73,6 +73,7 @@
 #include
 #include
 #include
+#include
 
 /**
  * sk_filter_trim_cap - run a packet through a socket filter
@@ -4804,7 +4805,7 @@ static int bpf_push_seg6_encap(struct sk_buff *skb, u32 type, void *hdr, u32 len
 static int bpf_push_ip_encap(struct sk_buff *skb, void *hdr, u32 len,
 			     bool ingress)
 {
-	return -EINVAL; /* Implemented in the next patch. */
+	return bpf_lwt_push_ip_encap(skb, hdr, len, ingress);
 }
 
 BPF_CALL_4(bpf_lwt_in_push_encap, struct sk_buff *, skb, u32, type, void *, hdr,
diff --git a/net/core/lwt_bpf.c b/net/core/lwt_bpf.c
index a648568c5e8f..786b96148937 100644
--- a/net/core/lwt_bpf.c
+++ b/net/core/lwt_bpf.c
@@ -390,6 +390,71 @@ static const struct lwtunnel_encap_ops bpf_encap_ops = {
 	.owner = THIS_MODULE,
 };
 
+static int handle_gso_encap(struct sk_buff *skb, bool ipv4, int encap_len)
+{
+	/* Handling of GSO-enabled packets is added in the next patch. */
+	if (unlikely(skb_is_gso(skb)))
+		return -EINVAL;
+
+	return 0;
+}
+
+int bpf_lwt_push_ip_encap(struct sk_buff *skb, void *hdr, u32 len, bool ingress)
+{
+	struct iphdr *iph;
+	bool ipv4;
+	int err;
+
+	if (unlikely(len < sizeof(struct iphdr) || len > LWT_BPF_MAX_HEADROOM))
+		return -EINVAL;
+
+	/* validate protocol and length */
+	iph = (struct iphdr *)hdr;
+	if (iph->version == 4) {
+		ipv4 = true;
+		if (unlikely(len < iph->ihl * 4))
+			return -EINVAL;
+	} else if (iph->version == 6) {
+		ipv4 = false;
+		if (unlikely(len < sizeof(struct ipv6hdr)))
+			return -EINVAL;
+	} else {
+		return -EINVAL;
+	}
+
+	if (ingress)
+		err = skb_cow_head(skb, len + skb->mac_len);
+	else
+		err = skb_cow_head(skb,
+				   len + LL_RESERVED_SPACE(skb_dst(skb)->dev));
+	if (unlikely(err))
+		return err;
+
+	/* push the encap headers and fix pointers */
+	skb_reset_inner_headers(skb);
+	skb->encapsulation = 1;
+	skb_push(skb, len);
+	if (ingress)
+		skb_postpush_rcsum(skb, iph, len);
+	skb_reset_network_header(skb);
+	memcpy(skb_network_header(skb), hdr, len);
+	bpf_compute_data_pointers(skb);
+	skb_clear_hash(skb);
+
+	if (ipv4) {
+		skb->protocol = htons(ETH_P_IP);
+		iph = ip_hdr(skb);
+
+		if (!iph->check)
+			iph->check = ip_fast_csum((unsigned char *)iph,
+						  iph->ihl);
+	} else {
+		skb->protocol = htons(ETH_P_IPV6);
+	}
+
+	return handle_gso_encap(skb, ipv4, len);
+}
+
 static int __init bpf_lwt_init(void)
 {
 	return lwtunnel_encap_add_ops(&bpf_encap_ops, LWTUNNEL_ENCAP_BPF);
-- 
2.20.1.611.gfbb209baf1-goog
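
For readers who want to try the header validation in bpf_lwt_push_ip_encap() outside the kernel, here is a rough userspace sketch of just that step. It is not part of the patch. The names sketch_check_encap_hdr() and sketch_self_check() and the SKETCH_* constants are hypothetical; in particular, the value 256 is only an assumed stand-in for LWT_BPF_MAX_HEADROOM, whose definition is not shown in this patch. The sketch reads the version and IHL nibbles from the first byte instead of going through struct iphdr bitfields, and it mirrors the checks as posted without adding any stricter ones.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed sizes for illustration only. The kernel code uses
 * sizeof(struct iphdr), sizeof(struct ipv6hdr) and LWT_BPF_MAX_HEADROOM. */
#define SKETCH_IPV4_MIN_HDR 20u
#define SKETCH_IPV6_HDR     40u
#define SKETCH_MAX_HEADROOM 256u	/* hypothetical stand-in value */

/* Hypothetical helper: returns 4 or 6 when the encap header passes the
 * same checks as the patch, -EINVAL otherwise. */
int sketch_check_encap_hdr(const uint8_t *hdr, uint32_t len)
{
	unsigned int version, ihl;

	if (len < SKETCH_IPV4_MIN_HDR || len > SKETCH_MAX_HEADROOM)
		return -EINVAL;

	version = hdr[0] >> 4;		/* high nibble of the first byte */
	ihl = hdr[0] & 0x0f;		/* IPv4 header length in 32-bit words */

	if (version == 4)
		return len < ihl * 4 ? -EINVAL : 4;
	if (version == 6)
		return len < SKETCH_IPV6_HDR ? -EINVAL : 6;

	return -EINVAL;
}

/* Small usage example exercising each branch. */
void sketch_self_check(void)
{
	uint8_t v4[20] = { 0x45 };	/* version 4, ihl 5 (20 bytes) */
	uint8_t v4_opts[20] = { 0x46 };	/* ihl 6 claims 24 bytes, only 20 given */
	uint8_t v6[40] = { 0x60 };	/* version 6 */
	uint8_t bad[20] = { 0x50 };	/* unsupported version 5 */

	assert(sketch_check_encap_hdr(v4, sizeof(v4)) == 4);
	assert(sketch_check_encap_hdr(v4_opts, sizeof(v4_opts)) == -EINVAL);
	assert(sketch_check_encap_hdr(v6, sizeof(v6)) == 6);
	assert(sketch_check_encap_hdr(v6, 20) == -EINVAL);
	assert(sketch_check_encap_hdr(bad, sizeof(bad)) == -EINVAL);
}
```

Calling sketch_self_check() from a test program exercises the IPv4, IPv6 and rejected-version paths. The GSO check and the skb manipulation are left out because they depend on kernel-only structures.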