From: Prashant Bhole
To: "Michael S . Tsirkin", Jason Wang, qemu-devel@nongnu.org
Cc: Song Liu, Jakub Kicinski, Jesper Dangaard Brouer, Daniel Borkmann,
 netdev@vger.kernel.org, John Fastabend, Alexei Starovoitov,
 Martin KaFai Lau, Prashant Bhole, kvm@vger.kernel.org, Yonghong Song,
 Andrii Nakryiko, "David S . Miller"
Subject: [RFC 2/3] virtio-net: add support for offloading XDP program
Date: Tue, 26 Nov 2019 19:09:13 +0900
Message-Id: <20191126100914.5150-3-prashantbhole.linux@gmail.com>
In-Reply-To: <20191126100914.5150-1-prashantbhole.linux@gmail.com>
References: <20191126100914.5150-1-prashantbhole.linux@gmail.com>

From: Jason Wang

This feature involves offloading an XDP program and ebpf maps from the
guest to the host. This patch takes care of offloading the program.

A handler for the VIRTIO_NET_CTRL_EBPF command is added to virtio-net.
The control buffer consists of a struct virtio_net_ctrl_ebpf_prog
followed by the ebpf program instructions. An array of bpf_insn is
prepared and passed to the libbpf API bpf_load_program(). The program fd
returned by the API is then attached to the tap fd using the
TUNSETOFFLOADEDXDP ioctl command.

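For illustration only, the control buffer layout described above (header
first, raw instructions right after it) can be sketched as follows. This
is not part of the patch: struct ebpf_prog_hdr, BPF_CMD_SET_OFFLOAD and
pack_ebpf_ctrl() are hypothetical names, the header mirrors
struct virtio_net_ctrl_ebpf_prog with plain fixed-width types, and the
sketch assumes guest and device use the same byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of struct virtio_net_ctrl_ebpf_prog from this patch,
 * using plain fixed-width types instead of __virtio32/__virtio16. */
struct ebpf_prog_hdr {
    uint32_t len;            /* program length in bytes */
    uint16_t cmd;            /* VIRTIO_NET_BPF_CMD_* value */
    uint16_t gpl_compatible; /* non-zero if the program is GPL compatible */
};

/* Same value as VIRTIO_NET_BPF_CMD_SET_OFFLOAD in the patch. */
#define BPF_CMD_SET_OFFLOAD 1

/*
 * Pack a SET_OFFLOAD control buffer: header first, instructions after it.
 * Returns the number of bytes written, or 0 if the destination is too small.
 * Assumption: no byte-order conversion is needed between guest and device.
 */
static size_t pack_ebpf_ctrl(uint8_t *buf, size_t buf_len,
                             const void *insns, uint32_t insn_bytes,
                             int gpl)
{
    struct ebpf_prog_hdr hdr = {
        .len = insn_bytes,
        .cmd = BPF_CMD_SET_OFFLOAD,
        .gpl_compatible = gpl ? 1 : 0,
    };

    assert(buf != NULL && insns != NULL);

    /* Reject buffers that cannot hold the header plus the program. */
    if (buf_len < sizeof(hdr) || buf_len - sizeof(hdr) < insn_bytes) {
        return 0;
    }

    memcpy(buf, &hdr, sizeof(hdr));
    memcpy(buf + sizeof(hdr), insns, insn_bytes);
    return sizeof(hdr) + insn_bytes;
}
```

With a two-instruction program (16 bytes), the packed buffer is 24 bytes
long: an 8-byte header whose len field is 16, followed by the
instructions unchanged.
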
Signed-off-by: Jason Wang
Co-developed-by: Prashant Bhole
Signed-off-by: Prashant Bhole
---
 hw/net/virtio-net.c                         | 69 +++++++++++++++++++++
 include/net/tap.h                           |  2 +
 include/standard-headers/linux/virtio_net.h | 27 ++++++++
 net/Makefile.objs                           |  1 +
 net/tap-bsd.c                               |  5 ++
 net/tap-linux.c                             | 48 ++++++++++++++
 net/tap-linux.h                             |  1 +
 net/tap-solaris.c                           |  5 ++
 net/tap-stub.c                              |  5 ++
 net/tap.c                                   |  7 +++
 net/tap_int.h                               |  1 +
 11 files changed, 171 insertions(+)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 97a5113f7e..7cc1bd1654 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -43,6 +43,11 @@
 #include "monitor/qdev.h"
 #include "hw/pci/pci.h"

+#ifdef CONFIG_LIBBPF
+#include
+#include
+#endif
+
 #define VIRTIO_NET_VM_VERSION 11

 #define MAC_TABLE_ENTRIES 64
@@ -628,6 +633,21 @@ static int peer_attach(VirtIONet *n, int index)
     return tap_enable(nc->peer);
 }

+static int peer_attach_ebpf(VirtIONet *n, int len, void *insns, uint8_t gpl)
+{
+    NetClientState *nc = qemu_get_subqueue(n->nic, 0);
+
+    if (!nc->peer) {
+        return 0;
+    }
+
+    if (nc->peer->info->type != NET_CLIENT_DRIVER_TAP) {
+        return 0;
+    }
+
+    return tap_attach_ebpf(nc->peer, len, insns, gpl);
+}
+
 static int peer_detach(VirtIONet *n, int index)
 {
     NetClientState *nc = qemu_get_subqueue(n->nic, index);
@@ -991,6 +1011,53 @@ static int virtio_net_handle_offloads(VirtIONet *n, uint8_t cmd,
     }
 }

+static int virtio_net_handle_ebpf_prog(VirtIONet *n, struct iovec *iov,
+                                       unsigned int iov_cnt)
+{
+#ifdef CONFIG_LIBBPF
+    struct bpf_insn prog[4096];
+    struct virtio_net_ctrl_ebpf_prog ctrl;
+    size_t s;
+    int err = VIRTIO_NET_ERR;
+
+    s = iov_to_buf(iov, iov_cnt, 0, &ctrl, sizeof(ctrl));
+    if (s != sizeof(ctrl)) {
+        error_report("Invalid ebpf prog control buffer");
+        goto err;
+    }
+
+    if (ctrl.cmd == VIRTIO_NET_BPF_CMD_SET_OFFLOAD) {
+        s = iov_to_buf(iov, iov_cnt, sizeof(ctrl), prog, sizeof(prog));
+        if (s != ctrl.len) {
+            error_report("Invalid ebpf prog control buffer");
+            goto err;
+        }
+
+        err = peer_attach_ebpf(n, s, prog, ctrl.gpl_compatible);
+        if (err) {
+            error_report("Failed to attach XDP program");
+            goto err;
+        }
+    } else if (ctrl.cmd == VIRTIO_NET_BPF_CMD_UNSET_OFFLOAD) {
+        err = peer_attach_ebpf(n, 0, NULL, 0);
+    }
+err:
+    return err ? VIRTIO_NET_ERR : VIRTIO_NET_OK;
+#else
+    return VIRTIO_NET_ERR;
+#endif
+}
+
+static int virtio_net_handle_ebpf(VirtIONet *n, uint8_t cmd,
+                                  struct iovec *iov, unsigned int iov_cnt)
+{
+    if (cmd == VIRTIO_NET_CTRL_EBPF_PROG) {
+        return virtio_net_handle_ebpf_prog(n, iov, iov_cnt);
+    }
+
+    return VIRTIO_NET_ERR;
+}
+
 static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd,
                                  struct iovec *iov, unsigned int iov_cnt)
 {
@@ -1208,6 +1275,8 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
             status = virtio_net_handle_mq(n, ctrl.cmd, iov, iov_cnt);
         } else if (ctrl.class == VIRTIO_NET_CTRL_GUEST_OFFLOADS) {
             status = virtio_net_handle_offloads(n, ctrl.cmd, iov, iov_cnt);
+        } else if (ctrl.class == VIRTIO_NET_CTRL_EBPF) {
+            status = virtio_net_handle_ebpf(n, ctrl.cmd, iov, iov_cnt);
         }

         s = iov_from_buf(elem->in_sg, elem->in_num, 0, &status, sizeof(status));
diff --git a/include/net/tap.h b/include/net/tap.h
index 5d585515f9..19c507a1c2 100644
--- a/include/net/tap.h
+++ b/include/net/tap.h
@@ -33,6 +33,8 @@ int tap_disable(NetClientState *nc);

 int tap_get_fd(NetClientState *nc);

+int tap_attach_ebpf(NetClientState *nc, int len, void *insns, uint8_t gpl);
+
 struct vhost_net;
 struct vhost_net *tap_get_vhost_net(NetClientState *nc);

diff --git a/include/standard-headers/linux/virtio_net.h b/include/standard-headers/linux/virtio_net.h
index 260c3681d7..83292c81bc 100644
--- a/include/standard-headers/linux/virtio_net.h
+++ b/include/standard-headers/linux/virtio_net.h
@@ -261,4 +261,31 @@ struct virtio_net_ctrl_mq {
 #define VIRTIO_NET_CTRL_GUEST_OFFLOADS 5
 #define VIRTIO_NET_CTRL_GUEST_OFFLOADS_SET 0

+/*
+ * Control XDP offloads
+ *
+ * When the guest wants to offload an XDP program to the tap device, it issues
+ * VIRTIO_NET_CTRL_EBPF_PROG with the VIRTIO_NET_BPF_CMD_SET_OFFLOAD
+ * subcommand. When offloading succeeds, the tap device runs the offloaded
+ * XDP program for each packet before sending it to the guest.
+ *
+ * VIRTIO_NET_BPF_CMD_UNSET_OFFLOAD removes the offloaded program from
+ * the tap device, if one exists.
+ */
+
+struct virtio_net_ctrl_ebpf_prog {
+	/* program length in bytes */
+	__virtio32 len;
+	__virtio16 cmd;
+	__virtio16 gpl_compatible;
+	uint8_t insns[0];
+};
+
+#define VIRTIO_NET_CTRL_EBPF 6
+ #define VIRTIO_NET_CTRL_EBPF_PROG 1
+
+/* Commands for VIRTIO_NET_CTRL_EBPF_PROG */
+#define VIRTIO_NET_BPF_CMD_SET_OFFLOAD 1
+#define VIRTIO_NET_BPF_CMD_UNSET_OFFLOAD 2
+
 #endif /* _LINUX_VIRTIO_NET_H */
diff --git a/net/Makefile.objs b/net/Makefile.objs
index c5d076d19c..e7645225be 100644
--- a/net/Makefile.objs
+++ b/net/Makefile.objs
@@ -28,5 +28,6 @@ common-obj-$(CONFIG_POSIX) += tap.o $(tap-obj-y)
 common-obj-$(CONFIG_WIN32) += tap-win32.o

 vde.o-libs = $(VDE_LIBS)
+tap-linux.o-libs = $(LIBBPF_LIBS)

 common-obj-$(CONFIG_CAN_BUS) += can/
diff --git a/net/tap-bsd.c b/net/tap-bsd.c
index a5c3707f80..e4e2a5c799 100644
--- a/net/tap-bsd.c
+++ b/net/tap-bsd.c
@@ -259,3 +259,8 @@ int tap_fd_get_ifname(int fd, char *ifname)
 {
     return -1;
 }
+
+int tap_fd_attach_ebpf(int fd, int len, void *insns, uint8_t gpl)
+{
+    return -EINVAL;
+}
diff --git a/net/tap-linux.c b/net/tap-linux.c
index e0dd442ee3..3ff806bf4f 100644
--- a/net/tap-linux.c
+++ b/net/tap-linux.c
@@ -31,6 +31,8 @@

 #include
 #include
+#include
+#include

 #include "qapi/error.h"
 #include "qemu/error-report.h"
@@ -314,3 +316,49 @@ int tap_fd_get_ifname(int fd, char *ifname)
     pstrcpy(ifname, sizeof(ifr.ifr_name), ifr.ifr_name);
     return 0;
 }
+
+int tap_fd_attach_ebpf(int fd, int len, void *insns, uint8_t gpl)
+{
+#ifdef CONFIG_LIBBPF
+    struct bpf_insn *prog = (struct bpf_insn *)insns;
+    static char log_buf[65536];
+    char license[16] = {0};
+    int num_insn;
+    int bpf_fd;
+    int ret;
+
+    if (!prog) {
+        bpf_fd = -1;
+        ret = ioctl(fd, TUNSETOFFLOADEDXDP, &bpf_fd);
+        if (ret) {
+            error_report("Failed to remove offloaded XDP: %s", strerror(errno));
+            return -EFAULT;
+        }
+        return ret;
+    }
+
+    num_insn = len / sizeof(prog[0]);
+    if (gpl) {
+        strncpy(license, "GPL", sizeof(license));
+    }
+
+    bpf_fd = bpf_load_program(BPF_PROG_TYPE_XDP, prog, num_insn, license,
+                              0, log_buf, sizeof(log_buf));
+    if (bpf_fd < 0) {
+        error_report("Failed to load XDP program: %s", strerror(errno));
+        error_report("ebpf verifier log: %s", log_buf);
+        return -EFAULT;
+    }
+
+    ret = ioctl(fd, TUNSETOFFLOADEDXDP, &bpf_fd);
+    if (ret) {
+        error_report("Failed to set offloaded XDP: %s", strerror(errno));
+        return -EFAULT;
+    }
+    close(bpf_fd);
+
+    return ret;
+#else
+    return -EINVAL;
+#endif
+}
diff --git a/net/tap-linux.h b/net/tap-linux.h
index 2f36d100fc..791aeaebc4 100644
--- a/net/tap-linux.h
+++ b/net/tap-linux.h
@@ -31,6 +31,7 @@
 #define TUNSETQUEUE _IOW('T', 217, int)
 #define TUNSETVNETLE _IOW('T', 220, int)
 #define TUNSETVNETBE _IOW('T', 222, int)
+#define TUNSETOFFLOADEDXDP _IOW('T', 228, int)

 #endif

diff --git a/net/tap-solaris.c b/net/tap-solaris.c
index 4725d2314e..38b9136b5f 100644
--- a/net/tap-solaris.c
+++ b/net/tap-solaris.c
@@ -254,3 +254,8 @@ int tap_fd_get_ifname(int fd, char *ifname)
 {
     return -1;
 }
+
+int tap_fd_attach_ebpf(int fd, int len, void *insns, uint8_t gpl)
+{
+    return -EINVAL;
+}
diff --git a/net/tap-stub.c b/net/tap-stub.c
index a9ab8f8293..5f4161b390 100644
--- a/net/tap-stub.c
+++ b/net/tap-stub.c
@@ -85,3 +85,8 @@ int tap_fd_get_ifname(int fd, char *ifname)
 {
     return -1;
 }
+
+int tap_fd_attach_ebpf(int fd, int len, void *insns, uint8_t gpl)
+{
+    return -EINVAL;
+}
diff --git a/net/tap.c b/net/tap.c
index 6207f61f84..3dba8eacb1 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -971,6 +971,13 @@ int tap_enable(NetClientState *nc)
     }
 }

+int tap_attach_ebpf(NetClientState *nc, int len, void *insns, uint8_t gpl)
+{
+    TAPState *s = DO_UPCAST(TAPState, nc, nc);
+
+    return tap_fd_attach_ebpf(s->fd, len, insns, gpl);
+}
+
 int tap_disable(NetClientState *nc)
 {
     TAPState *s = DO_UPCAST(TAPState, nc, nc);
diff --git a/net/tap_int.h b/net/tap_int.h
index e3194b23f4..af641607e2 100644
--- a/net/tap_int.h
+++ b/net/tap_int.h
@@ -44,5 +44,6 @@ int tap_fd_set_vnet_be(int fd, int vnet_is_be);
 int tap_fd_enable(int fd);
 int tap_fd_disable(int fd);
 int tap_fd_get_ifname(int fd, char *ifname);
+int tap_fd_attach_ebpf(int fd, int len, void *insns, uint8_t gpl);

 #endif /* NET_TAP_INT_H */
--
2.20.1
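
The instruction count handed to bpf_load_program() is the guest-supplied
byte length divided by sizeof(struct bpf_insn), which is 8 bytes on
Linux. As a sketch only, and not part of this patch, that conversion
could be guarded roughly as below. ebpf_prog_len_valid(), EBPF_INSN_SIZE
and EBPF_MAX_INSNS are hypothetical names; the 4096 limit is assumed
from the prog[4096] array in virtio_net_handle_ebpf_prog().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EBPF_INSN_SIZE 8u     /* sizeof(struct bpf_insn) on Linux */
#define EBPF_MAX_INSNS 4096u  /* assumed from prog[4096] in the handler */

/*
 * Hypothetical helper: accept a guest-supplied byte length only if it
 * describes a whole, non-zero number of instructions that fits the
 * handler's buffer. On success, store the instruction count in *num_insn.
 */
static bool ebpf_prog_len_valid(uint32_t len_bytes, size_t *num_insn)
{
    /* An empty program or a partial trailing instruction is rejected. */
    if (len_bytes == 0 || len_bytes % EBPF_INSN_SIZE != 0) {
        return false;
    }

    /* More instructions than the fixed-size array can hold is rejected. */
    if (len_bytes / EBPF_INSN_SIZE > EBPF_MAX_INSNS) {
        return false;
    }

    *num_insn = len_bytes / EBPF_INSN_SIZE;
    return true;
}
```

Under these assumptions a 16-byte program yields two instructions, while
a 12-byte length is rejected because it ends in the middle of an
instruction.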