* [PATCH net-next v4 20/20] net: WireGuard secure network tunnel
From: Jason A. Donenfeld @ 2018-09-14 16:22 UTC (permalink / raw)
To: linux-kernel, netdev, linux-crypto, davem, gregkh; +Cc: Jason A. Donenfeld
In-Reply-To: <20180914162240.7925-1-Jason@zx2c4.com>
WireGuard is a layer 3 secure networking tunnel made specifically for
the kernel. It aims to be much simpler and easier to audit than IPsec.
Extensive documentation and description of the protocol and
considerations, along with formal proofs of the cryptography, are
available at:
* https://www.wireguard.com/
* https://www.wireguard.com/papers/wireguard.pdf
This commit implements WireGuard as a simple network device driver,
accessible in the usual RTNL way used by virtual network drivers. It
makes use of the udp_tunnel APIs, GRO, GSO, NAPI, and the usual set of
networking subsystem APIs. It has a somewhat novel multicore queueing
system designed for maximum throughput and minimal latency of encryption
operations, but it is implemented modestly using workqueues and NAPI.
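As a rough illustration of the queueing idea (packets are processed in
parallel but released in arrival order, as described for send.c and
receive.c below), here is a minimal single-threaded sketch in userspace
C. It is not the driver's code: the names ordered_ring, ring_enqueue,
ring_complete, and ring_drain, and the RING_SIZE constant, are all
hypothetical, and the locking, workqueues, and NAPI integration are
deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ring size for this sketch; must be a power of two. */
#define RING_SIZE 8

enum slot_state { SLOT_EMPTY, SLOT_PENDING, SLOT_DONE };

struct ordered_ring {
	enum slot_state state[RING_SIZE];
	int packet[RING_SIZE];	/* stand-in for a real packet pointer */
	unsigned int head;	/* next slot handed out, in arrival order */
	unsigned int tail;	/* next slot to be released, in arrival order */
};

/* Reserve the next slot in arrival order. Returns the slot, or -1 if full. */
static int ring_enqueue(struct ordered_ring *r, int packet)
{
	unsigned int slot;

	if (r->head - r->tail == RING_SIZE)
		return -1;
	slot = r->head++ & (RING_SIZE - 1);
	assert(r->state[slot] == SLOT_EMPTY);
	r->packet[slot] = packet;
	r->state[slot] = SLOT_PENDING;
	return (int)slot;
}

/* Called by whichever worker finishes the crypto for this slot. */
static void ring_complete(struct ordered_ring *r, int slot)
{
	assert(r->state[slot] == SLOT_PENDING);
	r->state[slot] = SLOT_DONE;
}

/*
 * Release finished packets strictly in arrival order. Stops at the first
 * slot that is still pending, even if later slots are already done.
 * Returns the number of packets written to out.
 */
static size_t ring_drain(struct ordered_ring *r, int *out, size_t max)
{
	size_t n = 0;

	while (n < max && r->tail != r->head) {
		unsigned int slot = r->tail & (RING_SIZE - 1);

		if (r->state[slot] != SLOT_DONE)
			break;
		out[n++] = r->packet[slot];
		r->state[slot] = SLOT_EMPTY;
		r->tail++;
	}
	return n;
}
```

In this sketch, a worker that finishes the third packet first cannot
cause it to be released early: ring_drain() returns nothing until the
first packet is marked done, which is the ordering property the
parallel design needs to preserve.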
Configuration is done via generic Netlink, and following a review from
the Netlink maintainer a year ago, several high-profile userspace tools
have already implemented the API.
This commit also comes with several different tests, both in-kernel
tests and out-of-kernel tests based on network namespaces, taking
advantage of the fact that sockets used by WireGuard intentionally stay
in the namespace in which the WireGuard interface was originally
created, exactly like the semantics of userspace tun devices. See
wireguard.com/netns/ for pictures and examples.
The source code is fairly short, but rather than combining everything
into a single file, WireGuard is developed as cleanly separable files,
making auditing and comprehension easier. Things are laid out as
follows:
* noise.[ch], cookie.[ch], messages.h: These implement the bulk of the
cryptographic aspects of the protocol, and are mostly data-only in
nature, taking in buffers of bytes and spitting out buffers of
bytes. They also handle reference counting for their various shared
pieces of data, like keys and key lists.
* ratelimiter.[ch]: Used as an integral part of cookie.[ch] for
ratelimiting certain types of cryptographic operations in accordance
with particular WireGuard semantics.
* allowedips.[ch], hashtables.[ch]: The main lookup structures of
WireGuard, the former being trie-like with particular semantics, an
integral part of the design of the protocol, and the latter just
being nice helper functions around the specific hashtables we use.
* device.[ch]: Implementation of functions for the netdevice and for
rtnl, responsible for maintaining the life of a given interface and
wiring it up to the rest of WireGuard.
* peer.[ch]: Each interface has a list of peers, with helper functions
available here for creation, destruction, and reference counting.
* socket.[ch]: Implementation of functions related to udp_socket and
the general set of kernel socket APIs, for sending and receiving
ciphertext UDP packets, and taking care of WireGuard-specific sticky
socket routing semantics for the automatic roaming.
* netlink.[ch]: Userspace API entry point for configuring WireGuard
peers and devices. The API has been implemented by several userspace
tools and network management utilities, and the WireGuard project
distributes the basic wg(8) tool.
* queueing.[ch]: Shared functions on the rx and tx paths for handling
the various queues used in the multicore algorithms.
* send.c: Handles encrypting outgoing packets in parallel on
multiple cores, before sending them in order on a single core, via
workqueues and ring buffers. Also handles sending handshake and cookie
messages as part of the protocol, in parallel.
* receive.c: Handles decrypting incoming packets in parallel on
multiple cores, before passing them off in order to be ingested via
the rest of the networking subsystem with GRO via the typical NAPI
poll function. Also handles receiving handshake and cookie messages
as part of the protocol, in parallel.
* timers.[ch]: Uses the timer wheel to implement protocol-specific
event timeouts, and provides a set of very simple event-driven entry
point functions for callers.
* main.c, version.h: Initialization and deinitialization of the module.
* selftest/*.h: Runtime unit tests for some of the most security
sensitive functions.
* tools/testing/selftests/wireguard/netns.sh: Aforementioned testing
script using network namespaces.
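To illustrate what a trie-like lookup from an IP address to a peer can
look like, here is a minimal sketch of longest-prefix matching over IPv4
addresses in userspace C, where the most specific matching prefix wins
(as find_node() in allowedips.c appears to do). It assumes a plain
one-bit-per-level trie, host-byte-order addresses, and integer peer ids;
allowedips.c uses a more compact node layout, RCU, and real peer
references, so every name here (lpm_node, lpm_insert, lpm_lookup,
lpm_free) is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct lpm_node {
	struct lpm_node *bit[2];
	int peer_id;	/* hypothetical peer handle; 0 means "no peer here" */
};

/* Insert prefix ip/cidr (ip in host byte order). Returns 0, or -1 on error. */
static int lpm_insert(struct lpm_node **root, uint32_t ip, unsigned int cidr,
		      int peer_id)
{
	struct lpm_node **slot = root;
	unsigned int i;

	assert(root != NULL);
	if (cidr > 32 || peer_id == 0)
		return -1;
	for (i = 0;; ++i) {
		if (!*slot) {
			*slot = calloc(1, sizeof(**slot));
			if (!*slot)
				return -1;
		}
		if (i == cidr)
			break;
		/* Descend by the i-th most significant bit of the address. */
		slot = &(*slot)->bit[(ip >> (31 - i)) & 1];
	}
	(*slot)->peer_id = peer_id;
	return 0;
}

/* Return the peer id of the most specific prefix covering ip, or 0. */
static int lpm_lookup(const struct lpm_node *node, uint32_t ip)
{
	int found = 0;
	unsigned int i;

	for (i = 0; node; ++i) {
		if (node->peer_id)
			found = node->peer_id;	/* remember deepest match so far */
		if (i == 32)
			break;
		node = node->bit[(ip >> (31 - i)) & 1];
	}
	return found;
}

/* Free the whole trie; recursion depth is bounded by 33 levels. */
static void lpm_free(struct lpm_node *node)
{
	if (!node)
		return;
	lpm_free(node->bit[0]);
	lpm_free(node->bit[1]);
	free(node);
}
```

With 10.0.0.0/8 mapped to one peer and 10.1.0.0/16 mapped to another, a
lookup of 10.1.2.3 in this sketch resolves to the /16 entry, while
10.2.0.1 falls back to the /8 entry and an address outside both
prefixes matches no peer at all.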
This commit aims to be as self-contained as possible, implementing
WireGuard as a standalone module not needing much special handling or
coordination from the network subsystem. I expect future optimizations
to the network stack to benefit WireGuard, and vice versa, but for the
time being, this is intentionally standalone.
We introduce a menu option for CONFIG_WIREGUARD, and provide a
verbose debug log and self-tests via CONFIG_WIREGUARD_DEBUG.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: David Miller <davem@davemloft.net>
Cc: Greg KH <gregkh@linuxfoundation.org>
---
MAINTAINERS | 8 +
drivers/net/Kconfig | 30 +
drivers/net/Makefile | 1 +
drivers/net/wireguard/Makefile | 18 +
drivers/net/wireguard/allowedips.c | 404 ++++++++++
drivers/net/wireguard/allowedips.h | 55 ++
drivers/net/wireguard/cookie.c | 234 ++++++
drivers/net/wireguard/cookie.h | 59 ++
drivers/net/wireguard/device.c | 438 +++++++++++
drivers/net/wireguard/device.h | 65 ++
drivers/net/wireguard/hashtables.c | 209 +++++
drivers/net/wireguard/hashtables.h | 63 ++
drivers/net/wireguard/main.c | 65 ++
drivers/net/wireguard/messages.h | 128 +++
drivers/net/wireguard/netlink.c | 605 ++++++++++++++
drivers/net/wireguard/netlink.h | 12 +
drivers/net/wireguard/noise.c | 784 +++++++++++++++++++
drivers/net/wireguard/noise.h | 129 +++
drivers/net/wireguard/peer.c | 191 +++++
drivers/net/wireguard/peer.h | 87 ++
drivers/net/wireguard/queueing.c | 52 ++
drivers/net/wireguard/queueing.h | 193 +++++
drivers/net/wireguard/ratelimiter.c | 220 ++++++
drivers/net/wireguard/ratelimiter.h | 19 +
drivers/net/wireguard/receive.c | 597 ++++++++++++++
drivers/net/wireguard/selftest/allowedips.h | 656 ++++++++++++++++
drivers/net/wireguard/selftest/counter.h | 103 +++
drivers/net/wireguard/selftest/ratelimiter.h | 174 ++++
drivers/net/wireguard/send.c | 420 ++++++++++
drivers/net/wireguard/socket.c | 435 ++++++++++
drivers/net/wireguard/socket.h | 44 ++
drivers/net/wireguard/timers.c | 256 ++++++
drivers/net/wireguard/timers.h | 30 +
drivers/net/wireguard/version.h | 1 +
include/uapi/linux/wireguard.h | 190 +++++
tools/testing/selftests/wireguard/netns.sh | 499 ++++++++++++
36 files changed, 7474 insertions(+)
create mode 100644 drivers/net/wireguard/Makefile
create mode 100644 drivers/net/wireguard/allowedips.c
create mode 100644 drivers/net/wireguard/allowedips.h
create mode 100644 drivers/net/wireguard/cookie.c
create mode 100644 drivers/net/wireguard/cookie.h
create mode 100644 drivers/net/wireguard/device.c
create mode 100644 drivers/net/wireguard/device.h
create mode 100644 drivers/net/wireguard/hashtables.c
create mode 100644 drivers/net/wireguard/hashtables.h
create mode 100644 drivers/net/wireguard/main.c
create mode 100644 drivers/net/wireguard/messages.h
create mode 100644 drivers/net/wireguard/netlink.c
create mode 100644 drivers/net/wireguard/netlink.h
create mode 100644 drivers/net/wireguard/noise.c
create mode 100644 drivers/net/wireguard/noise.h
create mode 100644 drivers/net/wireguard/peer.c
create mode 100644 drivers/net/wireguard/peer.h
create mode 100644 drivers/net/wireguard/queueing.c
create mode 100644 drivers/net/wireguard/queueing.h
create mode 100644 drivers/net/wireguard/ratelimiter.c
create mode 100644 drivers/net/wireguard/ratelimiter.h
create mode 100644 drivers/net/wireguard/receive.c
create mode 100644 drivers/net/wireguard/selftest/allowedips.h
create mode 100644 drivers/net/wireguard/selftest/counter.h
create mode 100644 drivers/net/wireguard/selftest/ratelimiter.h
create mode 100644 drivers/net/wireguard/send.c
create mode 100644 drivers/net/wireguard/socket.c
create mode 100644 drivers/net/wireguard/socket.h
create mode 100644 drivers/net/wireguard/timers.c
create mode 100644 drivers/net/wireguard/timers.h
create mode 100644 drivers/net/wireguard/version.h
create mode 100644 include/uapi/linux/wireguard.h
create mode 100755 tools/testing/selftests/wireguard/netns.sh
diff --git a/MAINTAINERS b/MAINTAINERS
index d2092e52320d..2043437adf0b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15813,6 +15813,14 @@ L: linux-gpio@vger.kernel.org
S: Maintained
F: drivers/gpio/gpio-ws16c48.c
+WIREGUARD SECURE NETWORK TUNNEL
+M: Jason A. Donenfeld <Jason@zx2c4.com>
+S: Maintained
+F: drivers/net/wireguard/
+F: tools/testing/selftests/wireguard/
+L: wireguard@lists.zx2c4.com
+L: netdev@vger.kernel.org
+
WISTRON LAPTOP BUTTON DRIVER
M: Miloslav Trmac <mitr@volny.cz>
S: Maintained
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index d03775100f7d..aa631fe3b395 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -70,6 +70,36 @@ config DUMMY
To compile this driver as a module, choose M here: the module
will be called dummy.
+config WIREGUARD
+ tristate "WireGuard secure network tunnel"
+ depends on NET && INET
+ select NET_UDP_TUNNEL
+ select DST_CACHE
+ select ZINC_CHACHA20POLY1305
+ select ZINC_BLAKE2S
+ select ZINC_CURVE25519
+ default m
+ help
+ WireGuard is a secure, fast, and easy to use replacement for IPSec
+ that uses modern cryptography and clever networking tricks. It's
+ designed to be fairly general purpose and abstract enough to fit most
+ use cases, while at the same time remaining extremely simple to
+ configure. See www.wireguard.com for more info.
+
+ It's safe to say Y or M here, as the driver is very lightweight and
+ is only in use when an administrator chooses to add an interface.
+
+config WIREGUARD_DEBUG
+ bool "Debugging checks and verbose messages"
+ depends on WIREGUARD
+ help
+ This will write log messages for handshake and other events
+ that occur for a WireGuard interface. It will also perform some
+ extra validation checks and unit tests at various points. This is
+ only useful for debugging.
+
+ Say N here unless you know what you're doing.
+
config EQUALIZER
tristate "EQL (serial line load balancing) support"
---help---
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 21cde7e78621..f0acd11a143d 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -24,6 +24,7 @@ obj-$(CONFIG_RIONET) += rionet.o
obj-$(CONFIG_NET_TEAM) += team/
obj-$(CONFIG_TUN) += tun.o
obj-$(CONFIG_TAP) += tap.o
+obj-$(CONFIG_WIREGUARD) += wireguard/
obj-$(CONFIG_VETH) += veth.o
obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
obj-$(CONFIG_VXLAN) += vxlan.o
diff --git a/drivers/net/wireguard/Makefile b/drivers/net/wireguard/Makefile
new file mode 100644
index 000000000000..d8856255bc9d
--- /dev/null
+++ b/drivers/net/wireguard/Makefile
@@ -0,0 +1,18 @@
+ccflags-y := -O3
+ccflags-y += -D'pr_fmt(fmt)=KBUILD_MODNAME ": " fmt'
+ccflags-$(CONFIG_WIREGUARD_DEBUG) += -DDEBUG
+wireguard-y := main.o
+wireguard-y += noise.o
+wireguard-y += device.o
+wireguard-y += peer.o
+wireguard-y += timers.o
+wireguard-y += queueing.o
+wireguard-y += send.o
+wireguard-y += receive.o
+wireguard-y += socket.o
+wireguard-y += hashtables.o
+wireguard-y += allowedips.o
+wireguard-y += ratelimiter.o
+wireguard-y += cookie.o
+wireguard-y += netlink.o
+obj-$(CONFIG_WIREGUARD) := wireguard.o
diff --git a/drivers/net/wireguard/allowedips.c b/drivers/net/wireguard/allowedips.c
new file mode 100644
index 000000000000..fab15ad50170
--- /dev/null
+++ b/drivers/net/wireguard/allowedips.c
@@ -0,0 +1,404 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#include "allowedips.h"
+#include "peer.h"
+
+struct allowedips_node {
+ struct wireguard_peer __rcu *peer;
+ struct rcu_head rcu;
+ struct allowedips_node __rcu *bit[2];
+ /* While it may seem scandalous that we waste space for v4,
+ * we're alloc'ing to the nearest power of 2 anyway, so this
+ * doesn't actually make a difference.
+ */
+ u8 bits[16] __aligned(__alignof(u64));
+ u8 cidr, bit_at_a, bit_at_b;
+};
+
+static __always_inline void swap_endian(u8 *dst, const u8 *src, u8 bits)
+{
+ if (bits == 32)
+ *(u32 *)dst = be32_to_cpu(*(const __be32 *)src);
+ else if (bits == 128) {
+ ((u64 *)dst)[0] = be64_to_cpu(((const __be64 *)src)[0]);
+ ((u64 *)dst)[1] = be64_to_cpu(((const __be64 *)src)[1]);
+ }
+}
+
+static void copy_and_assign_cidr(struct allowedips_node *node, const u8 *src,
+ u8 cidr, u8 bits)
+{
+ node->cidr = cidr;
+ node->bit_at_a = cidr / 8U;
+#ifdef __LITTLE_ENDIAN
+ node->bit_at_a ^= (bits / 8U - 1U) % 8U;
+#endif
+ node->bit_at_b = 7U - (cidr % 8U);
+ memcpy(node->bits, src, bits / 8U);
+}
+
+#define choose_node(parent, key) \
+ parent->bit[(key[parent->bit_at_a] >> parent->bit_at_b) & 1]
+
+static void node_free_rcu(struct rcu_head *rcu)
+{
+ kfree(container_of(rcu, struct allowedips_node, rcu));
+}
+
+#define push_rcu(stack, p, len) ({ \
+ if (rcu_access_pointer(p)) { \
+ BUG_ON(len >= 128); \
+ stack[len++] = rcu_dereference_raw(p); \
+ } \
+ true; \
+ })
+static void root_free_rcu(struct rcu_head *rcu)
+{
+ struct allowedips_node *node, *stack[128] = {
+ container_of(rcu, struct allowedips_node, rcu) };
+ unsigned int len = 1;
+
+ while (len > 0 && (node = stack[--len]) &&
+ push_rcu(stack, node->bit[0], len) &&
+ push_rcu(stack, node->bit[1], len))
+ kfree(node);
+}
+
+static int
+walk_by_peer(struct allowedips_node __rcu *top, u8 bits,
+ struct allowedips_cursor *cursor, struct wireguard_peer *peer,
+ int (*func)(void *ctx, const u8 *ip, u8 cidr, int family),
+ void *ctx, struct mutex *lock)
+{
+ const int address_family = bits == 32 ? AF_INET : AF_INET6;
+ u8 ip[16] __aligned(__alignof(u64));
+ struct allowedips_node *node;
+ int ret;
+
+ if (!rcu_access_pointer(top))
+ return 0;
+
+ if (!cursor->len)
+ push_rcu(cursor->stack, top, cursor->len);
+
+ for (; cursor->len > 0 && (node = cursor->stack[cursor->len - 1]);
+ --cursor->len, push_rcu(cursor->stack, node->bit[0], cursor->len),
+ push_rcu(cursor->stack, node->bit[1], cursor->len)) {
+ const unsigned int cidr_bytes = DIV_ROUND_UP(node->cidr, 8U);
+
+ if (rcu_dereference_protected(node->peer,
+ lockdep_is_held(lock)) != peer)
+ continue;
+
+ swap_endian(ip, node->bits, bits);
+ memset(ip + cidr_bytes, 0, bits / 8U - cidr_bytes);
+ if (node->cidr)
+ ip[cidr_bytes - 1U] &= ~0U << (-node->cidr % 8U);
+
+ ret = func(ctx, ip, node->cidr, address_family);
+ if (ret)
+ return ret;
+ }
+ return 0;
+}
+#undef push_rcu
+
+#define ref(p) rcu_access_pointer(p)
+#define deref(p) rcu_dereference_protected(*p, lockdep_is_held(lock))
+#define push(p) ({ \
+ BUG_ON(len >= 128); \
+ stack[len++] = p; \
+ })
+static void walk_remove_by_peer(struct allowedips_node __rcu **top,
+ struct wireguard_peer *peer, struct mutex *lock)
+{
+ struct allowedips_node __rcu **stack[128], **nptr;
+ struct allowedips_node *node, *prev;
+ unsigned int len;
+
+ if (unlikely(!peer || !ref(*top)))
+ return;
+
+ for (prev = NULL, len = 0, push(top); len > 0; prev = node) {
+ nptr = stack[len - 1];
+ node = deref(nptr);
+ if (!node) {
+ --len;
+ continue;
+ }
+ if (!prev || ref(prev->bit[0]) == node ||
+ ref(prev->bit[1]) == node) {
+ if (ref(node->bit[0]))
+ push(&node->bit[0]);
+ else if (ref(node->bit[1]))
+ push(&node->bit[1]);
+ } else if (ref(node->bit[0]) == prev) {
+ if (ref(node->bit[1]))
+ push(&node->bit[1]);
+ } else {
+ if (rcu_dereference_protected(node->peer,
+ lockdep_is_held(lock)) == peer) {
+ RCU_INIT_POINTER(node->peer, NULL);
+ if (!node->bit[0] || !node->bit[1]) {
+ rcu_assign_pointer(*nptr,
+ deref(&node->bit[!ref(node->bit[0])]));
+ call_rcu_bh(&node->rcu, node_free_rcu);
+ node = deref(nptr);
+ }
+ }
+ --len;
+ }
+ }
+}
+#undef ref
+#undef deref
+#undef push
+
+static __always_inline unsigned int fls128(u64 a, u64 b)
+{
+ return a ? fls64(a) + 64U : fls64(b);
+}
+
+static __always_inline u8 common_bits(const struct allowedips_node *node,
+ const u8 *key, u8 bits)
+{
+ if (bits == 32)
+ return 32U - fls(*(const u32 *)node->bits ^ *(const u32 *)key);
+ else if (bits == 128)
+ return 128U - fls128(
+ *(const u64 *)&node->bits[0] ^ *(const u64 *)&key[0],
+ *(const u64 *)&node->bits[8] ^ *(const u64 *)&key[8]);
+ return 0;
+}
+
+/* This could be much faster if it actually just compared the common bits
+ * properly, by precomputing a mask bswap(~0 << (32 - cidr)), and the rest, but
+ * it turns out that common_bits is already super fast on modern processors,
+ * even taking into account the unfortunate bswap. So, we just inline it like
+ * this instead.
+ */
+#define prefix_matches(node, key, bits) \
+ (common_bits(node, key, bits) >= node->cidr)
+
+static __always_inline struct allowedips_node *
+find_node(struct allowedips_node *trie, u8 bits, const u8 *key)
+{
+ struct allowedips_node *node = trie, *found = NULL;
+
+ while (node && prefix_matches(node, key, bits)) {
+ if (rcu_access_pointer(node->peer))
+ found = node;
+ if (node->cidr == bits)
+ break;
+ node = rcu_dereference_bh(choose_node(node, key));
+ }
+ return found;
+}
+
+/* Returns a strong reference to a peer */
+static __always_inline struct wireguard_peer *
+lookup(struct allowedips_node __rcu *root, u8 bits, const void *be_ip)
+{
+ u8 ip[16] __aligned(__alignof(u64));
+ struct wireguard_peer *peer = NULL;
+ struct allowedips_node *node;
+
+ swap_endian(ip, be_ip, bits);
+
+ rcu_read_lock_bh();
+retry:
+ node = find_node(rcu_dereference_bh(root), bits, ip);
+ if (node) {
+ peer = peer_get_maybe_zero(rcu_dereference_bh(node->peer));
+ if (!peer)
+ goto retry;
+ }
+ rcu_read_unlock_bh();
+ return peer;
+}
+
+__attribute__((nonnull(1))) static inline bool
+node_placement(struct allowedips_node __rcu *trie, const u8 *key, u8 cidr,
+ u8 bits, struct allowedips_node **rnode, struct mutex *lock)
+{
+ struct allowedips_node *node = rcu_dereference_protected(trie,
+ lockdep_is_held(lock));
+ struct allowedips_node *parent = NULL;
+ bool exact = false;
+
+ while (node && node->cidr <= cidr && prefix_matches(node, key, bits)) {
+ parent = node;
+ if (parent->cidr == cidr) {
+ exact = true;
+ break;
+ }
+ node = rcu_dereference_protected(choose_node(parent, key),
+ lockdep_is_held(lock));
+ }
+ *rnode = parent;
+ return exact;
+}
+
+static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *be_key,
+ u8 cidr, struct wireguard_peer *peer, struct mutex *lock)
+{
+ struct allowedips_node *node, *parent, *down, *newnode;
+ u8 key[16] __aligned(__alignof(u64));
+
+ if (unlikely(cidr > bits || !peer))
+ return -EINVAL;
+
+ swap_endian(key, be_key, bits);
+
+ if (!rcu_access_pointer(*trie)) {
+ node = kzalloc(sizeof(*node), GFP_KERNEL);
+ if (unlikely(!node))
+ return -ENOMEM;
+ RCU_INIT_POINTER(node->peer, peer);
+ copy_and_assign_cidr(node, key, cidr, bits);
+ rcu_assign_pointer(*trie, node);
+ return 0;
+ }
+ if (node_placement(*trie, key, cidr, bits, &node, lock)) {
+ rcu_assign_pointer(node->peer, peer);
+ return 0;
+ }
+
+ newnode = kzalloc(sizeof(*newnode), GFP_KERNEL);
+ if (unlikely(!newnode))
+ return -ENOMEM;
+ RCU_INIT_POINTER(newnode->peer, peer);
+ copy_and_assign_cidr(newnode, key, cidr, bits);
+
+ if (!node)
+ down = rcu_dereference_protected(*trie, lockdep_is_held(lock));
+ else {
+ down = rcu_dereference_protected(choose_node(node, key),
+ lockdep_is_held(lock));
+ if (!down) {
+ rcu_assign_pointer(choose_node(node, key), newnode);
+ return 0;
+ }
+ }
+ cidr = min(cidr, common_bits(down, key, bits));
+ parent = node;
+
+ if (newnode->cidr == cidr) {
+ rcu_assign_pointer(choose_node(newnode, down->bits), down);
+ if (!parent)
+ rcu_assign_pointer(*trie, newnode);
+ else
+ rcu_assign_pointer(choose_node(parent, newnode->bits),
+ newnode);
+ } else {
+ node = kzalloc(sizeof(*node), GFP_KERNEL);
+ if (unlikely(!node)) {
+ kfree(newnode);
+ return -ENOMEM;
+ }
+ copy_and_assign_cidr(node, newnode->bits, cidr, bits);
+
+ rcu_assign_pointer(choose_node(node, down->bits), down);
+ rcu_assign_pointer(choose_node(node, newnode->bits), newnode);
+ if (!parent)
+ rcu_assign_pointer(*trie, node);
+ else
+ rcu_assign_pointer(choose_node(parent, node->bits),
+ node);
+ }
+ return 0;
+}
+
+void allowedips_init(struct allowedips *table)
+{
+ table->root4 = table->root6 = NULL;
+ table->seq = 1;
+}
+
+void allowedips_free(struct allowedips *table, struct mutex *lock)
+{
+ struct allowedips_node __rcu *old4 = table->root4, *old6 = table->root6;
+ ++table->seq;
+ RCU_INIT_POINTER(table->root4, NULL);
+ RCU_INIT_POINTER(table->root6, NULL);
+ if (rcu_access_pointer(old4))
+ call_rcu_bh(&rcu_dereference_protected(old4,
+ lockdep_is_held(lock))->rcu, root_free_rcu);
+ if (rcu_access_pointer(old6))
+ call_rcu_bh(&rcu_dereference_protected(old6,
+ lockdep_is_held(lock))->rcu, root_free_rcu);
+}
+
+int allowedips_insert_v4(struct allowedips *table, const struct in_addr *ip,
+ u8 cidr, struct wireguard_peer *peer,
+ struct mutex *lock)
+{
+ ++table->seq;
+ return add(&table->root4, 32, (const u8 *)ip, cidr, peer, lock);
+}
+
+int allowedips_insert_v6(struct allowedips *table, const struct in6_addr *ip,
+ u8 cidr, struct wireguard_peer *peer,
+ struct mutex *lock)
+{
+ ++table->seq;
+ return add(&table->root6, 128, (const u8 *)ip, cidr, peer, lock);
+}
+
+void allowedips_remove_by_peer(struct allowedips *table,
+ struct wireguard_peer *peer, struct mutex *lock)
+{
+ ++table->seq;
+ walk_remove_by_peer(&table->root4, peer, lock);
+ walk_remove_by_peer(&table->root6, peer, lock);
+}
+
+int allowedips_walk_by_peer(struct allowedips *table,
+ struct allowedips_cursor *cursor,
+ struct wireguard_peer *peer,
+ int (*func)(void *ctx, const u8 *ip, u8 cidr, int family),
+ void *ctx, struct mutex *lock)
+{
+ int ret;
+
+ if (!cursor->seq)
+ cursor->seq = table->seq;
+ else if (cursor->seq != table->seq)
+ return 0;
+
+ if (!cursor->second_half) {
+ ret = walk_by_peer(table->root4, 32, cursor, peer, func, ctx, lock);
+ if (ret)
+ return ret;
+ cursor->len = 0;
+ cursor->second_half = true;
+ }
+ return walk_by_peer(table->root6, 128, cursor, peer, func, ctx, lock);
+}
+
+/* Returns a strong reference to a peer */
+struct wireguard_peer *allowedips_lookup_dst(struct allowedips *table,
+ struct sk_buff *skb)
+{
+ if (skb->protocol == htons(ETH_P_IP))
+ return lookup(table->root4, 32, &ip_hdr(skb)->daddr);
+ else if (skb->protocol == htons(ETH_P_IPV6))
+ return lookup(table->root6, 128, &ipv6_hdr(skb)->daddr);
+ return NULL;
+}
+
+/* Returns a strong reference to a peer */
+struct wireguard_peer *allowedips_lookup_src(struct allowedips *table,
+ struct sk_buff *skb)
+{
+ if (skb->protocol == htons(ETH_P_IP))
+ return lookup(table->root4, 32, &ip_hdr(skb)->saddr);
+ else if (skb->protocol == htons(ETH_P_IPV6))
+ return lookup(table->root6, 128, &ipv6_hdr(skb)->saddr);
+ return NULL;
+}
+
+#include "selftest/allowedips.h"
diff --git a/drivers/net/wireguard/allowedips.h b/drivers/net/wireguard/allowedips.h
new file mode 100644
index 000000000000..d5ba1bee595e
--- /dev/null
+++ b/drivers/net/wireguard/allowedips.h
@@ -0,0 +1,55 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#ifndef _WG_ALLOWEDIPS_H
+#define _WG_ALLOWEDIPS_H
+
+#include <linux/mutex.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+
+struct wireguard_peer;
+struct allowedips_node;
+
+struct allowedips {
+ struct allowedips_node __rcu *root4;
+ struct allowedips_node __rcu *root6;
+ u64 seq;
+};
+
+struct allowedips_cursor {
+ u64 seq;
+ struct allowedips_node *stack[128];
+ unsigned int len;
+ bool second_half;
+};
+
+void allowedips_init(struct allowedips *table);
+void allowedips_free(struct allowedips *table, struct mutex *mutex);
+int allowedips_insert_v4(struct allowedips *table, const struct in_addr *ip,
+ u8 cidr, struct wireguard_peer *peer,
+ struct mutex *lock);
+int allowedips_insert_v6(struct allowedips *table, const struct in6_addr *ip,
+ u8 cidr, struct wireguard_peer *peer,
+ struct mutex *lock);
+void allowedips_remove_by_peer(struct allowedips *table,
+ struct wireguard_peer *peer, struct mutex *lock);
+int allowedips_walk_by_peer(struct allowedips *table,
+ struct allowedips_cursor *cursor,
+ struct wireguard_peer *peer,
+ int (*func)(void *ctx, const u8 *ip, u8 cidr, int family),
+ void *ctx, struct mutex *lock);
+
+/* These return a strong reference to a peer: */
+struct wireguard_peer *allowedips_lookup_dst(struct allowedips *table,
+ struct sk_buff *skb);
+struct wireguard_peer *allowedips_lookup_src(struct allowedips *table,
+ struct sk_buff *skb);
+
+#ifdef DEBUG
+bool allowedips_selftest(void);
+#endif
+
+#endif /* _WG_ALLOWEDIPS_H */
diff --git a/drivers/net/wireguard/cookie.c b/drivers/net/wireguard/cookie.c
new file mode 100644
index 000000000000..d0739622341a
--- /dev/null
+++ b/drivers/net/wireguard/cookie.c
@@ -0,0 +1,234 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#include "cookie.h"
+#include "peer.h"
+#include "device.h"
+#include "messages.h"
+#include "ratelimiter.h"
+#include "timers.h"
+
+#include <zinc/blake2s.h>
+#include <zinc/chacha20poly1305.h>
+
+#include <net/ipv6.h>
+#include <crypto/algapi.h>
+
+void cookie_checker_init(struct cookie_checker *checker,
+ struct wireguard_device *wg)
+{
+ init_rwsem(&checker->secret_lock);
+ checker->secret_birthdate = ktime_get_boot_fast_ns();
+ get_random_bytes(checker->secret, NOISE_HASH_LEN);
+ checker->device = wg;
+}
+
+enum { COOKIE_KEY_LABEL_LEN = 8 };
+static const u8 mac1_key_label[COOKIE_KEY_LABEL_LEN] = "mac1----";
+static const u8 cookie_key_label[COOKIE_KEY_LABEL_LEN] = "cookie--";
+
+static void precompute_key(u8 key[NOISE_SYMMETRIC_KEY_LEN],
+ const u8 pubkey[NOISE_PUBLIC_KEY_LEN],
+ const u8 label[COOKIE_KEY_LABEL_LEN])
+{
+ struct blake2s_state blake;
+
+ blake2s_init(&blake, NOISE_SYMMETRIC_KEY_LEN);
+ blake2s_update(&blake, label, COOKIE_KEY_LABEL_LEN);
+ blake2s_update(&blake, pubkey, NOISE_PUBLIC_KEY_LEN);
+ blake2s_final(&blake, key, NOISE_SYMMETRIC_KEY_LEN);
+}
+
+/* Must hold peer->handshake.static_identity->lock */
+void cookie_checker_precompute_device_keys(struct cookie_checker *checker)
+{
+ if (likely(checker->device->static_identity.has_identity)) {
+ precompute_key(checker->cookie_encryption_key,
+ checker->device->static_identity.static_public,
+ cookie_key_label);
+ precompute_key(checker->message_mac1_key,
+ checker->device->static_identity.static_public,
+ mac1_key_label);
+ } else {
+ memset(checker->cookie_encryption_key, 0,
+ NOISE_SYMMETRIC_KEY_LEN);
+ memset(checker->message_mac1_key, 0, NOISE_SYMMETRIC_KEY_LEN);
+ }
+}
+
+void cookie_checker_precompute_peer_keys(struct wireguard_peer *peer)
+{
+ precompute_key(peer->latest_cookie.cookie_decryption_key,
+ peer->handshake.remote_static, cookie_key_label);
+ precompute_key(peer->latest_cookie.message_mac1_key,
+ peer->handshake.remote_static, mac1_key_label);
+}
+
+void cookie_init(struct cookie *cookie)
+{
+ memset(cookie, 0, sizeof(*cookie));
+ init_rwsem(&cookie->lock);
+}
+
+static void compute_mac1(u8 mac1[COOKIE_LEN], const void *message, size_t len,
+ const u8 key[NOISE_SYMMETRIC_KEY_LEN])
+{
+ len = len - sizeof(struct message_macs) +
+ offsetof(struct message_macs, mac1);
+ blake2s(mac1, message, key, COOKIE_LEN, len, NOISE_SYMMETRIC_KEY_LEN);
+}
+
+static void compute_mac2(u8 mac2[COOKIE_LEN], const void *message, size_t len,
+ const u8 cookie[COOKIE_LEN])
+{
+ len = len - sizeof(struct message_macs) +
+ offsetof(struct message_macs, mac2);
+ blake2s(mac2, message, cookie, COOKIE_LEN, len, COOKIE_LEN);
+}
+
+static void make_cookie(u8 cookie[COOKIE_LEN], struct sk_buff *skb,
+ struct cookie_checker *checker)
+{
+ struct blake2s_state state;
+
+ if (has_expired(checker->secret_birthdate, COOKIE_SECRET_MAX_AGE)) {
+ down_write(&checker->secret_lock);
+ checker->secret_birthdate = ktime_get_boot_fast_ns();
+ get_random_bytes(checker->secret, NOISE_HASH_LEN);
+ up_write(&checker->secret_lock);
+ }
+
+ down_read(&checker->secret_lock);
+
+ blake2s_init_key(&state, COOKIE_LEN, checker->secret, NOISE_HASH_LEN);
+ if (skb->protocol == htons(ETH_P_IP))
+ blake2s_update(&state, (u8 *)&ip_hdr(skb)->saddr,
+ sizeof(struct in_addr));
+ else if (skb->protocol == htons(ETH_P_IPV6))
+ blake2s_update(&state, (u8 *)&ipv6_hdr(skb)->saddr,
+ sizeof(struct in6_addr));
+ blake2s_update(&state, (u8 *)&udp_hdr(skb)->source, sizeof(__be16));
+ blake2s_final(&state, cookie, COOKIE_LEN);
+
+ up_read(&checker->secret_lock);
+}
+
+enum cookie_mac_state cookie_validate_packet(struct cookie_checker *checker,
+ struct sk_buff *skb,
+ bool check_cookie)
+{
+ struct message_macs *macs = (struct message_macs *)
+ (skb->data + skb->len - sizeof(*macs));
+ enum cookie_mac_state ret;
+ u8 computed_mac[COOKIE_LEN];
+ u8 cookie[COOKIE_LEN];
+
+ ret = INVALID_MAC;
+ compute_mac1(computed_mac, skb->data, skb->len,
+ checker->message_mac1_key);
+ if (crypto_memneq(computed_mac, macs->mac1, COOKIE_LEN))
+ goto out;
+
+ ret = VALID_MAC_BUT_NO_COOKIE;
+
+ if (!check_cookie)
+ goto out;
+
+ make_cookie(cookie, skb, checker);
+
+ compute_mac2(computed_mac, skb->data, skb->len, cookie);
+ if (crypto_memneq(computed_mac, macs->mac2, COOKIE_LEN))
+ goto out;
+
+ ret = VALID_MAC_WITH_COOKIE_BUT_RATELIMITED;
+ if (!ratelimiter_allow(skb, dev_net(checker->device->dev)))
+ goto out;
+
+ ret = VALID_MAC_WITH_COOKIE;
+
+out:
+ return ret;
+}
+
+void cookie_add_mac_to_packet(void *message, size_t len,
+ struct wireguard_peer *peer)
+{
+ struct message_macs *macs = (struct message_macs *)
+ ((u8 *)message + len - sizeof(*macs));
+
+ down_write(&peer->latest_cookie.lock);
+ compute_mac1(macs->mac1, message, len,
+ peer->latest_cookie.message_mac1_key);
+ memcpy(peer->latest_cookie.last_mac1_sent, macs->mac1, COOKIE_LEN);
+ peer->latest_cookie.have_sent_mac1 = true;
+ up_write(&peer->latest_cookie.lock);
+
+ down_read(&peer->latest_cookie.lock);
+ if (peer->latest_cookie.is_valid &&
+ !has_expired(peer->latest_cookie.birthdate,
+ COOKIE_SECRET_MAX_AGE - COOKIE_SECRET_LATENCY))
+ compute_mac2(macs->mac2, message, len,
+ peer->latest_cookie.cookie);
+ else
+ memset(macs->mac2, 0, COOKIE_LEN);
+ up_read(&peer->latest_cookie.lock);
+}
+
+void cookie_message_create(struct message_handshake_cookie *dst,
+ struct sk_buff *skb, __le32 index,
+ struct cookie_checker *checker)
+{
+ struct message_macs *macs = (struct message_macs *)
+ ((u8 *)skb->data + skb->len - sizeof(*macs));
+ u8 cookie[COOKIE_LEN];
+
+ dst->header.type = cpu_to_le32(MESSAGE_HANDSHAKE_COOKIE);
+ dst->receiver_index = index;
+ get_random_bytes_wait(dst->nonce, COOKIE_NONCE_LEN);
+
+ make_cookie(cookie, skb, checker);
+ xchacha20poly1305_encrypt(dst->encrypted_cookie, cookie, COOKIE_LEN,
+ macs->mac1, COOKIE_LEN, dst->nonce,
+ checker->cookie_encryption_key);
+}
+
+void cookie_message_consume(struct message_handshake_cookie *src,
+ struct wireguard_device *wg)
+{
+ struct wireguard_peer *peer = NULL;
+ u8 cookie[COOKIE_LEN];
+ bool ret;
+
+ if (unlikely(!index_hashtable_lookup(&wg->index_hashtable,
+ INDEX_HASHTABLE_HANDSHAKE |
+ INDEX_HASHTABLE_KEYPAIR,
+ src->receiver_index, &peer)))
+ return;
+
+ down_read(&peer->latest_cookie.lock);
+ if (unlikely(!peer->latest_cookie.have_sent_mac1)) {
+ up_read(&peer->latest_cookie.lock);
+ goto out;
+ }
+ ret = xchacha20poly1305_decrypt(
+ cookie, src->encrypted_cookie, sizeof(src->encrypted_cookie),
+ peer->latest_cookie.last_mac1_sent, COOKIE_LEN, src->nonce,
+ peer->latest_cookie.cookie_decryption_key);
+ up_read(&peer->latest_cookie.lock);
+
+ if (ret) {
+ down_write(&peer->latest_cookie.lock);
+ memcpy(peer->latest_cookie.cookie, cookie, COOKIE_LEN);
+ peer->latest_cookie.birthdate = ktime_get_boot_fast_ns();
+ peer->latest_cookie.is_valid = true;
+ peer->latest_cookie.have_sent_mac1 = false;
+ up_write(&peer->latest_cookie.lock);
+ } else
+ net_dbg_ratelimited("%s: Could not decrypt invalid cookie response\n",
+ wg->dev->name);
+
+out:
+ peer_put(peer);
+}
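Note on the mac2 freshness rule in cookie_add_mac_to_packet() above: mac2
is only computed while the stored cookie is valid and younger than
COOKIE_SECRET_MAX_AGE - COOKIE_SECRET_LATENCY seconds; otherwise it is
zeroed. The userspace sketch below illustrates that rule only. The
has_expired() helper is not part of this hunk, so the sketch_has_expired()
semantics (birthdate plus lifetime compared against "now") are an
assumption, and every sketch_* name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from enum cookie_values in messages.h of this patch. */
enum {
	SKETCH_COOKIE_SECRET_MAX_AGE = 2 * 60,
	SKETCH_COOKIE_SECRET_LATENCY = 5
};

#define SKETCH_NSEC_PER_SEC UINT64_C(1000000000)

/* Assumed semantics of has_expired(): a value born at birthdate_ns is
 * expired once expiration_secs have fully elapsed. The real helper reads
 * the clock itself; here "now" is passed in so the logic can be tested.
 */
bool sketch_has_expired(uint64_t birthdate_ns, uint64_t expiration_secs,
			uint64_t now_ns)
{
	return birthdate_ns + expiration_secs * SKETCH_NSEC_PER_SEC <= now_ns;
}

/* Mirrors the condition guarding compute_mac2(): attach mac2 only for a
 * valid cookie that is not within the latency margin of its maximum age.
 */
bool sketch_should_send_mac2(bool is_valid, uint64_t birthdate_ns,
			     uint64_t now_ns)
{
	return is_valid &&
	       !sketch_has_expired(birthdate_ns,
				   SKETCH_COOKIE_SECRET_MAX_AGE -
					   SKETCH_COOKIE_SECRET_LATENCY,
				   now_ns);
}
```

With these constants the sender stops attaching mac2 115 seconds after the
cookie was received, five seconds before the 120 second maximum age.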
diff --git a/drivers/net/wireguard/cookie.h b/drivers/net/wireguard/cookie.h
new file mode 100644
index 000000000000..7802f6158d66
--- /dev/null
+++ b/drivers/net/wireguard/cookie.h
@@ -0,0 +1,59 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#ifndef _WG_COOKIE_H
+#define _WG_COOKIE_H
+
+#include "messages.h"
+#include <linux/rwsem.h>
+
+struct wireguard_peer;
+
+struct cookie_checker {
+ u8 secret[NOISE_HASH_LEN];
+ u8 cookie_encryption_key[NOISE_SYMMETRIC_KEY_LEN];
+ u8 message_mac1_key[NOISE_SYMMETRIC_KEY_LEN];
+ u64 secret_birthdate;
+ struct rw_semaphore secret_lock;
+ struct wireguard_device *device;
+};
+
+struct cookie {
+ u64 birthdate;
+ bool is_valid;
+ u8 cookie[COOKIE_LEN];
+ bool have_sent_mac1;
+ u8 last_mac1_sent[COOKIE_LEN];
+ u8 cookie_decryption_key[NOISE_SYMMETRIC_KEY_LEN];
+ u8 message_mac1_key[NOISE_SYMMETRIC_KEY_LEN];
+ struct rw_semaphore lock;
+};
+
+enum cookie_mac_state {
+ INVALID_MAC,
+ VALID_MAC_BUT_NO_COOKIE,
+ VALID_MAC_WITH_COOKIE_BUT_RATELIMITED,
+ VALID_MAC_WITH_COOKIE
+};
+
+void cookie_checker_init(struct cookie_checker *checker,
+ struct wireguard_device *wg);
+void cookie_checker_precompute_device_keys(struct cookie_checker *checker);
+void cookie_checker_precompute_peer_keys(struct wireguard_peer *peer);
+void cookie_init(struct cookie *cookie);
+
+enum cookie_mac_state cookie_validate_packet(struct cookie_checker *checker,
+ struct sk_buff *skb,
+ bool check_cookie);
+void cookie_add_mac_to_packet(void *message, size_t len,
+ struct wireguard_peer *peer);
+
+void cookie_message_create(struct message_handshake_cookie *dst,
+ struct sk_buff *skb, __le32 index,
+ struct cookie_checker *checker);
+void cookie_message_consume(struct message_handshake_cookie *src,
+ struct wireguard_device *wg);
+
+#endif /* _WG_COOKIE_H */
diff --git a/drivers/net/wireguard/device.c b/drivers/net/wireguard/device.c
new file mode 100644
index 000000000000..a87a39e25e94
--- /dev/null
+++ b/drivers/net/wireguard/device.c
@@ -0,0 +1,438 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#include "queueing.h"
+#include "socket.h"
+#include "timers.h"
+#include "device.h"
+#include "ratelimiter.h"
+#include "peer.h"
+#include "messages.h"
+
+#include <linux/module.h>
+#include <linux/rtnetlink.h>
+#include <linux/inet.h>
+#include <linux/netdevice.h>
+#include <linux/inetdevice.h>
+#include <linux/if_arp.h>
+#include <linux/icmp.h>
+#include <linux/suspend.h>
+#include <net/icmp.h>
+#include <net/rtnetlink.h>
+#include <net/ip_tunnels.h>
+#include <net/addrconf.h>
+
+static LIST_HEAD(device_list);
+
+static int open(struct net_device *dev)
+{
+ struct in_device *dev_v4 = __in_dev_get_rtnl(dev);
+ struct wireguard_device *wg = netdev_priv(dev);
+ struct inet6_dev *dev_v6 = __in6_dev_get(dev);
+ struct wireguard_peer *peer;
+ int ret;
+
+ if (dev_v4) {
+ /* At some point we might put this check near the
+ * ip_rt_send_redirect call of ip_forward in
+ * net/ipv4/ip_forward.c, similar to the current secpath check.
+ */
+ IN_DEV_CONF_SET(dev_v4, SEND_REDIRECTS, false);
+ IPV4_DEVCONF_ALL(dev_net(dev), SEND_REDIRECTS) = false;
+ }
+ if (dev_v6)
+ dev_v6->cnf.addr_gen_mode = IN6_ADDR_GEN_MODE_NONE;
+
+ ret = socket_init(wg, wg->incoming_port);
+ if (ret < 0)
+ return ret;
+ mutex_lock(&wg->device_update_lock);
+ list_for_each_entry (peer, &wg->peer_list, peer_list) {
+ packet_send_staged_packets(peer);
+ if (peer->persistent_keepalive_interval)
+ packet_send_keepalive(peer);
+ }
+ mutex_unlock(&wg->device_update_lock);
+ return 0;
+}
+
+#if defined(CONFIG_PM_SLEEP) && !defined(CONFIG_ANDROID)
+static int pm_notification(struct notifier_block *nb, unsigned long action,
+ void *data)
+{
+ struct wireguard_device *wg;
+ struct wireguard_peer *peer;
+
+ if (action != PM_HIBERNATION_PREPARE && action != PM_SUSPEND_PREPARE)
+ return 0;
+
+ rtnl_lock();
+ list_for_each_entry (wg, &device_list, device_list) {
+ mutex_lock(&wg->device_update_lock);
+ list_for_each_entry (peer, &wg->peer_list, peer_list) {
+ noise_handshake_clear(&peer->handshake);
+ noise_keypairs_clear(&peer->keypairs);
+ if (peer->timers_enabled)
+ del_timer(&peer->timer_zero_key_material);
+ }
+ mutex_unlock(&wg->device_update_lock);
+ }
+ rtnl_unlock();
+ rcu_barrier_bh();
+ return 0;
+}
+static struct notifier_block pm_notifier = { .notifier_call = pm_notification };
+#endif
+
+static int stop(struct net_device *dev)
+{
+ struct wireguard_device *wg = netdev_priv(dev);
+ struct wireguard_peer *peer;
+
+ mutex_lock(&wg->device_update_lock);
+ list_for_each_entry (peer, &wg->peer_list, peer_list) {
+ skb_queue_purge(&peer->staged_packet_queue);
+ timers_stop(peer);
+ noise_handshake_clear(&peer->handshake);
+ noise_keypairs_clear(&peer->keypairs);
+ atomic64_set(&peer->last_sent_handshake,
+ ktime_get_boot_fast_ns() -
+ (u64)(REKEY_TIMEOUT + 1) * NSEC_PER_SEC);
+ }
+ mutex_unlock(&wg->device_update_lock);
+ skb_queue_purge(&wg->incoming_handshakes);
+ socket_reinit(wg, NULL, NULL);
+ return 0;
+}
+
+static netdev_tx_t xmit(struct sk_buff *skb, struct net_device *dev)
+{
+ struct wireguard_device *wg = netdev_priv(dev);
+ struct wireguard_peer *peer;
+ struct sk_buff *next;
+ struct sk_buff_head packets;
+ sa_family_t family;
+ u32 mtu;
+ int ret;
+
+ if (unlikely(skb_examine_untrusted_ip_hdr(skb) != skb->protocol)) {
+ ret = -EPROTONOSUPPORT;
+ net_dbg_ratelimited("%s: Invalid IP packet\n", dev->name);
+ goto err;
+ }
+
+ peer = allowedips_lookup_dst(&wg->peer_allowedips, skb);
+ if (unlikely(!peer)) {
+ ret = -ENOKEY;
+ if (skb->protocol == htons(ETH_P_IP))
+ net_dbg_ratelimited("%s: No peer has allowed IPs matching %pI4\n",
+ dev->name, &ip_hdr(skb)->daddr);
+ else if (skb->protocol == htons(ETH_P_IPV6))
+ net_dbg_ratelimited("%s: No peer has allowed IPs matching %pI6\n",
+ dev->name, &ipv6_hdr(skb)->daddr);
+ goto err;
+ }
+
+ family = READ_ONCE(peer->endpoint.addr.sa_family);
+ if (unlikely(family != AF_INET && family != AF_INET6)) {
+ ret = -EDESTADDRREQ;
+ net_dbg_ratelimited("%s: No valid endpoint has been configured or discovered for peer %llu\n",
+ dev->name, peer->internal_id);
+ goto err_peer;
+ }
+
+ mtu = skb_dst(skb) ? dst_mtu(skb_dst(skb)) : dev->mtu;
+
+ __skb_queue_head_init(&packets);
+ if (!skb_is_gso(skb))
+ skb->next = NULL;
+ else {
+ struct sk_buff *segs = skb_gso_segment(skb, 0);
+
+ if (unlikely(IS_ERR(segs))) {
+ ret = PTR_ERR(segs);
+ goto err_peer;
+ }
+ dev_kfree_skb(skb);
+ skb = segs;
+ }
+ do {
+ next = skb->next;
+ skb->next = skb->prev = NULL;
+
+ skb = skb_share_check(skb, GFP_ATOMIC);
+ if (unlikely(!skb))
+ continue;
+
+ /* We only need to keep the original dst around for icmp,
+ * so at this point we're in a position to drop it.
+ */
+ skb_dst_drop(skb);
+
+ PACKET_CB(skb)->mtu = mtu;
+
+ __skb_queue_tail(&packets, skb);
+ } while ((skb = next) != NULL);
+
+ spin_lock_bh(&peer->staged_packet_queue.lock);
+ /* If the queue is getting too big, we start removing the oldest packets
+ * until it's small again. We do this before adding the new packet, so
+ * we don't remove GSO segments that are in excess.
+ */
+ while (skb_queue_len(&peer->staged_packet_queue) > MAX_STAGED_PACKETS)
+ dev_kfree_skb(__skb_dequeue(&peer->staged_packet_queue));
+ skb_queue_splice_tail(&packets, &peer->staged_packet_queue);
+ spin_unlock_bh(&peer->staged_packet_queue.lock);
+
+ packet_send_staged_packets(peer);
+
+ peer_put(peer);
+ return NETDEV_TX_OK;
+
+err_peer:
+ peer_put(peer);
+err:
+ ++dev->stats.tx_errors;
+ if (skb->protocol == htons(ETH_P_IP))
+ icmp_send(skb, ICMP_DEST_UNREACH, ICMP_HOST_UNREACH, 0);
+ else if (skb->protocol == htons(ETH_P_IPV6))
+ icmpv6_send(skb, ICMPV6_DEST_UNREACH, ICMPV6_ADDR_UNREACH, 0);
+ kfree_skb(skb);
+ return ret;
+}
+
+static const struct net_device_ops netdev_ops = {
+ .ndo_open = open,
+ .ndo_stop = stop,
+ .ndo_start_xmit = xmit,
+ .ndo_get_stats64 = ip_tunnel_get_stats64
+};
+
+static void destruct(struct net_device *dev)
+{
+ struct wireguard_device *wg = netdev_priv(dev);
+
+ rtnl_lock();
+ list_del(&wg->device_list);
+ rtnl_unlock();
+ mutex_lock(&wg->device_update_lock);
+ wg->incoming_port = 0;
+ socket_reinit(wg, NULL, NULL);
+ allowedips_free(&wg->peer_allowedips, &wg->device_update_lock);
+ /* The final references are cleared in the below calls to destroy_workqueue. */
+ peer_remove_all(wg);
+ destroy_workqueue(wg->handshake_receive_wq);
+ destroy_workqueue(wg->handshake_send_wq);
+ destroy_workqueue(wg->packet_crypt_wq);
+ packet_queue_free(&wg->decrypt_queue, true);
+ packet_queue_free(&wg->encrypt_queue, true);
+ rcu_barrier_bh(); /* Wait for all the peers to be actually freed. */
+ ratelimiter_uninit();
+ memzero_explicit(&wg->static_identity, sizeof(wg->static_identity));
+ skb_queue_purge(&wg->incoming_handshakes);
+ free_percpu(dev->tstats);
+ free_percpu(wg->incoming_handshakes_worker);
+ if (wg->have_creating_net_ref)
+ put_net(wg->creating_net);
+ mutex_unlock(&wg->device_update_lock);
+
+ pr_debug("%s: Interface deleted\n", dev->name);
+ free_netdev(dev);
+}
+
+static const struct device_type device_type = { .name = KBUILD_MODNAME };
+
+static void setup(struct net_device *dev)
+{
+ struct wireguard_device *wg = netdev_priv(dev);
+ enum { WG_NETDEV_FEATURES = NETIF_F_HW_CSUM | NETIF_F_RXCSUM |
+ NETIF_F_SG | NETIF_F_GSO |
+ NETIF_F_GSO_SOFTWARE | NETIF_F_HIGHDMA };
+
+ dev->netdev_ops = &netdev_ops;
+ dev->hard_header_len = 0;
+ dev->addr_len = 0;
+ dev->needed_headroom = DATA_PACKET_HEAD_ROOM;
+ dev->needed_tailroom = noise_encrypted_len(MESSAGE_PADDING_MULTIPLE);
+ dev->type = ARPHRD_NONE;
+ dev->flags = IFF_POINTOPOINT | IFF_NOARP;
+ dev->priv_flags |= IFF_NO_QUEUE;
+ dev->features |= NETIF_F_LLTX;
+ dev->features |= WG_NETDEV_FEATURES;
+ dev->hw_features |= WG_NETDEV_FEATURES;
+ dev->hw_enc_features |= WG_NETDEV_FEATURES;
+ dev->mtu = ETH_DATA_LEN - MESSAGE_MINIMUM_LENGTH -
+ sizeof(struct udphdr) -
+ max(sizeof(struct ipv6hdr), sizeof(struct iphdr));
+
+ SET_NETDEV_DEVTYPE(dev, &device_type);
+
+ /* We need to keep the dst around in case of icmp replies. */
+ netif_keep_dst(dev);
+
+ memset(wg, 0, sizeof(*wg));
+ wg->dev = dev;
+}
+
+static int newlink(struct net *src_net, struct net_device *dev,
+ struct nlattr *tb[], struct nlattr *data[],
+ struct netlink_ext_ack *extack)
+{
+ int ret = -ENOMEM;
+ struct wireguard_device *wg = netdev_priv(dev);
+
+ wg->creating_net = src_net;
+ init_rwsem(&wg->static_identity.lock);
+ mutex_init(&wg->socket_update_lock);
+ mutex_init(&wg->device_update_lock);
+ skb_queue_head_init(&wg->incoming_handshakes);
+ pubkey_hashtable_init(&wg->peer_hashtable);
+ index_hashtable_init(&wg->index_hashtable);
+ allowedips_init(&wg->peer_allowedips);
+ cookie_checker_init(&wg->cookie_checker, wg);
+ INIT_LIST_HEAD(&wg->peer_list);
+ wg->device_update_gen = 1;
+
+ dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
+ if (!dev->tstats)
+ goto error_1;
+
+ wg->incoming_handshakes_worker = packet_alloc_percpu_multicore_worker(
+ packet_handshake_receive_worker, wg);
+ if (!wg->incoming_handshakes_worker)
+ goto error_2;
+
+ wg->handshake_receive_wq = alloc_workqueue("wg-kex-%s",
+ WQ_CPU_INTENSIVE | WQ_FREEZABLE, 0, dev->name);
+ if (!wg->handshake_receive_wq)
+ goto error_3;
+
+ wg->handshake_send_wq = alloc_workqueue("wg-kex-%s",
+ WQ_UNBOUND | WQ_FREEZABLE, 0, dev->name);
+ if (!wg->handshake_send_wq)
+ goto error_4;
+
+ wg->packet_crypt_wq = alloc_workqueue("wg-crypt-%s",
+ WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM, 0, dev->name);
+ if (!wg->packet_crypt_wq)
+ goto error_5;
+
+ if (packet_queue_init(&wg->encrypt_queue, packet_encrypt_worker, true,
+ MAX_QUEUED_PACKETS) < 0)
+ goto error_6;
+
+ if (packet_queue_init(&wg->decrypt_queue, packet_decrypt_worker, true,
+ MAX_QUEUED_PACKETS) < 0)
+ goto error_7;
+
+ ret = ratelimiter_init();
+ if (ret < 0)
+ goto error_8;
+
+ ret = register_netdevice(dev);
+ if (ret < 0)
+ goto error_9;
+
+ list_add(&wg->device_list, &device_list);
+
+ /* We wait until the end to assign priv_destructor, so that
+ * register_netdevice doesn't call it for us if it fails.
+ */
+ dev->priv_destructor = destruct;
+
+ pr_debug("%s: Interface created\n", dev->name);
+ return ret;
+
+error_9:
+ ratelimiter_uninit();
+error_8:
+ packet_queue_free(&wg->decrypt_queue, true);
+error_7:
+ packet_queue_free(&wg->encrypt_queue, true);
+error_6:
+ destroy_workqueue(wg->packet_crypt_wq);
+error_5:
+ destroy_workqueue(wg->handshake_send_wq);
+error_4:
+ destroy_workqueue(wg->handshake_receive_wq);
+error_3:
+ free_percpu(wg->incoming_handshakes_worker);
+error_2:
+ free_percpu(dev->tstats);
+error_1:
+ return ret;
+}
+
+static struct rtnl_link_ops link_ops __read_mostly = {
+ .kind = KBUILD_MODNAME,
+ .priv_size = sizeof(struct wireguard_device),
+ .setup = setup,
+ .newlink = newlink,
+};
+
+static int netdevice_notification(struct notifier_block *nb,
+ unsigned long action, void *data)
+{
+ struct net_device *dev = ((struct netdev_notifier_info *)data)->dev;
+ struct wireguard_device *wg = netdev_priv(dev);
+
+ ASSERT_RTNL();
+
+ if (action != NETDEV_REGISTER || dev->netdev_ops != &netdev_ops)
+ return 0;
+
+ if (dev_net(dev) == wg->creating_net && wg->have_creating_net_ref) {
+ put_net(wg->creating_net);
+ wg->have_creating_net_ref = false;
+ } else if (dev_net(dev) != wg->creating_net &&
+ !wg->have_creating_net_ref) {
+ wg->have_creating_net_ref = true;
+ get_net(wg->creating_net);
+ }
+ return 0;
+}
+
+static struct notifier_block netdevice_notifier = {
+ .notifier_call = netdevice_notification
+};
+
+int __init device_init(void)
+{
+ int ret;
+
+#if defined(CONFIG_PM_SLEEP) && !defined(CONFIG_ANDROID)
+ ret = register_pm_notifier(&pm_notifier);
+ if (ret)
+ return ret;
+#endif
+
+ ret = register_netdevice_notifier(&netdevice_notifier);
+ if (ret)
+ goto error_pm;
+
+ ret = rtnl_link_register(&link_ops);
+ if (ret)
+ goto error_netdevice;
+
+ return 0;
+
+error_netdevice:
+ unregister_netdevice_notifier(&netdevice_notifier);
+error_pm:
+#if defined(CONFIG_PM_SLEEP) && !defined(CONFIG_ANDROID)
+ unregister_pm_notifier(&pm_notifier);
+#endif
+ return ret;
+}
+
+void device_uninit(void)
+{
+ rtnl_link_unregister(&link_ops);
+ unregister_netdevice_notifier(&netdevice_notifier);
+#if defined(CONFIG_PM_SLEEP) && !defined(CONFIG_ANDROID)
+ unregister_pm_notifier(&pm_notifier);
+#endif
+ rcu_barrier_bh();
+}
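Note on the default MTU chosen in setup() above: it is the Ethernet payload
size minus the WireGuard data message overhead, the UDP header, and the
larger of the two IP headers. The userspace sketch below works through
that arithmetic. The sizes are assumptions based on the usual Linux
definitions (ETH_DATA_LEN of 1500, 8 byte UDP header, 20 and 40 byte IPv4
and IPv6 headers) and on struct message_data in messages.h of this patch;
all sketch_* and SKETCH_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum {
	SKETCH_ETH_DATA_LEN = 1500, /* assumed value of ETH_DATA_LEN */
	SKETCH_UDP_HDR_LEN = 8,     /* assumed sizeof(struct udphdr) */
	SKETCH_IPV4_HDR_LEN = 20,   /* assumed sizeof(struct iphdr) */
	SKETCH_IPV6_HDR_LEN = 40,   /* assumed sizeof(struct ipv6hdr) */
	SKETCH_DATA_HDR_LEN = 16,   /* __le32 type + __le32 key_idx + __le64 counter */
	SKETCH_AUTHTAG_LEN = 16     /* Poly1305 authentication tag */
};

/* Same shape as message_data_len(): header plus ciphertext plus tag. */
size_t sketch_message_data_len(size_t plain_len)
{
	return plain_len + SKETCH_AUTHTAG_LEN + SKETCH_DATA_HDR_LEN;
}

/* Same shape as the dev->mtu assignment in setup(). */
size_t sketch_default_mtu(void)
{
	size_t ip_hdr = SKETCH_IPV6_HDR_LEN > SKETCH_IPV4_HDR_LEN ?
				SKETCH_IPV6_HDR_LEN : SKETCH_IPV4_HDR_LEN;

	return SKETCH_ETH_DATA_LEN - sketch_message_data_len(0) -
	       SKETCH_UDP_HDR_LEN - ip_hdr;
}
```

Under these assumptions the result is 1500 - 32 - 8 - 40 = 1420 bytes.
Sizing for the IPv6 header means the same MTU works whichever address
family the outer endpoint uses.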
diff --git a/drivers/net/wireguard/device.h b/drivers/net/wireguard/device.h
new file mode 100644
index 000000000000..2499782518c1
--- /dev/null
+++ b/drivers/net/wireguard/device.h
@@ -0,0 +1,65 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#ifndef _WG_DEVICE_H
+#define _WG_DEVICE_H
+
+#include "noise.h"
+#include "allowedips.h"
+#include "hashtables.h"
+#include "cookie.h"
+
+#include <linux/types.h>
+#include <linux/netdevice.h>
+#include <linux/workqueue.h>
+#include <linux/mutex.h>
+#include <linux/net.h>
+#include <linux/ptr_ring.h>
+
+struct wireguard_device;
+
+struct multicore_worker {
+ void *ptr;
+ struct work_struct work;
+};
+
+struct crypt_queue {
+ struct ptr_ring ring;
+ union {
+ struct {
+ struct multicore_worker __percpu *worker;
+ int last_cpu;
+ };
+ struct work_struct work;
+ };
+};
+
+struct wireguard_device {
+ struct net_device *dev;
+ struct crypt_queue encrypt_queue, decrypt_queue;
+ struct sock __rcu *sock4, *sock6;
+ struct net *creating_net;
+ struct noise_static_identity static_identity;
+ struct workqueue_struct *handshake_receive_wq, *handshake_send_wq;
+ struct workqueue_struct *packet_crypt_wq;
+ struct sk_buff_head incoming_handshakes;
+ int incoming_handshake_cpu;
+ struct multicore_worker __percpu *incoming_handshakes_worker;
+ struct cookie_checker cookie_checker;
+ struct pubkey_hashtable peer_hashtable;
+ struct index_hashtable index_hashtable;
+ struct allowedips peer_allowedips;
+ struct mutex device_update_lock, socket_update_lock;
+ struct list_head device_list, peer_list;
+ unsigned int num_peers, device_update_gen;
+ u32 fwmark;
+ u16 incoming_port;
+ bool have_creating_net_ref;
+};
+
+int device_init(void);
+void device_uninit(void);
+
+#endif /* _WG_DEVICE_H */
diff --git a/drivers/net/wireguard/hashtables.c b/drivers/net/wireguard/hashtables.c
new file mode 100644
index 000000000000..4ba228845f2d
--- /dev/null
+++ b/drivers/net/wireguard/hashtables.c
@@ -0,0 +1,209 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#include "hashtables.h"
+#include "peer.h"
+#include "noise.h"
+
+static inline struct hlist_head *pubkey_bucket(struct pubkey_hashtable *table,
+ const u8 pubkey[NOISE_PUBLIC_KEY_LEN])
+{
+ /* siphash gives us a secure 64-bit number based on a random key. Since
+ * the bits are uniformly distributed, we can then mask off to get the
+ * bits we need.
+ */
+ return &table->hashtable[
+ siphash(pubkey, NOISE_PUBLIC_KEY_LEN, &table->key) &
+ (HASH_SIZE(table->hashtable) - 1)];
+}
+
+void pubkey_hashtable_init(struct pubkey_hashtable *table)
+{
+ get_random_bytes(&table->key, sizeof(table->key));
+ hash_init(table->hashtable);
+ mutex_init(&table->lock);
+}
+
+void pubkey_hashtable_add(struct pubkey_hashtable *table,
+ struct wireguard_peer *peer)
+{
+ mutex_lock(&table->lock);
+ hlist_add_head_rcu(&peer->pubkey_hash,
+ pubkey_bucket(table, peer->handshake.remote_static));
+ mutex_unlock(&table->lock);
+}
+
+void pubkey_hashtable_remove(struct pubkey_hashtable *table,
+ struct wireguard_peer *peer)
+{
+ mutex_lock(&table->lock);
+ hlist_del_init_rcu(&peer->pubkey_hash);
+ mutex_unlock(&table->lock);
+}
+
+/* Returns a strong reference to a peer */
+struct wireguard_peer *
+pubkey_hashtable_lookup(struct pubkey_hashtable *table,
+ const u8 pubkey[NOISE_PUBLIC_KEY_LEN])
+{
+ struct wireguard_peer *iter_peer, *peer = NULL;
+
+ rcu_read_lock_bh();
+ hlist_for_each_entry_rcu_bh (iter_peer, pubkey_bucket(table, pubkey),
+ pubkey_hash) {
+ if (!memcmp(pubkey, iter_peer->handshake.remote_static,
+ NOISE_PUBLIC_KEY_LEN)) {
+ peer = iter_peer;
+ break;
+ }
+ }
+ peer = peer_get_maybe_zero(peer);
+ rcu_read_unlock_bh();
+ return peer;
+}
+
+static inline struct hlist_head *index_bucket(struct index_hashtable *table,
+ const __le32 index)
+{
+ /* Since the indices are random and thus all bits are uniformly
+ * distributed, we can find its bucket simply by masking.
+ */
+ return &table->hashtable[(__force u32)index &
+ (HASH_SIZE(table->hashtable) - 1)];
+}
+
+void index_hashtable_init(struct index_hashtable *table)
+{
+ hash_init(table->hashtable);
+ spin_lock_init(&table->lock);
+}
+
+/* At the moment, we limit ourselves to 2^20 total peers, which generally might
+ * amount to 2^20*3 items in this hashtable. The algorithm below works by
+ * picking a random number and testing it. We can see that these limits mean we
+ * usually succeed pretty quickly:
+ *
+ * >>> def calculation(tries, size):
+ * ... return (size / 2**32)**(tries - 1) * (1 - (size / 2**32))
+ * ...
+ * >>> calculation(1, 2**20 * 3)
+ * 0.999267578125
+ * >>> calculation(2, 2**20 * 3)
+ * 0.0007318854331970215
+ * >>> calculation(3, 2**20 * 3)
+ * 5.360489012673497e-07
+ * >>> calculation(4, 2**20 * 3)
+ * 3.9261394135792216e-10
+ *
+ * At the moment, we don't do any masking, so this algorithm isn't exactly
+ * constant time in either the random guessing or in the hash list lookup. We
+ * could require a minimum of 3 tries, which would successfully mask the
+ * guessing. This would not, however, help with the growing hash lengths, which
+ * is another thing to consider moving forward.
+ */
+
+__le32 index_hashtable_insert(struct index_hashtable *table,
+ struct index_hashtable_entry *entry)
+{
+ struct index_hashtable_entry *existing_entry;
+
+ spin_lock_bh(&table->lock);
+ hlist_del_init_rcu(&entry->index_hash);
+ spin_unlock_bh(&table->lock);
+
+ rcu_read_lock_bh();
+
+search_unused_slot:
+ /* First we try to find an unused slot, randomly, while unlocked. */
+ entry->index = (__force __le32)get_random_u32();
+ hlist_for_each_entry_rcu_bh (existing_entry,
+ index_bucket(table, entry->index),
+ index_hash) {
+ if (existing_entry->index == entry->index)
+ /* If it's already in use, we continue searching. */
+ goto search_unused_slot;
+ }
+
+ /* Once we've found an unused slot, we lock it, and then double-check
+ * that nobody else stole it from us.
+ */
+ spin_lock_bh(&table->lock);
+ hlist_for_each_entry_rcu_bh (existing_entry,
+ index_bucket(table, entry->index),
+ index_hash) {
+ if (existing_entry->index == entry->index) {
+ spin_unlock_bh(&table->lock);
+ /* If it was stolen, we start over. */
+ goto search_unused_slot;
+ }
+ }
+ /* Otherwise, we know we have it exclusively (since we're locked),
+ * so we insert.
+ */
+ hlist_add_head_rcu(&entry->index_hash,
+ index_bucket(table, entry->index));
+ spin_unlock_bh(&table->lock);
+
+ rcu_read_unlock_bh();
+
+ return entry->index;
+}
+
+bool index_hashtable_replace(struct index_hashtable *table,
+ struct index_hashtable_entry *old,
+ struct index_hashtable_entry *new)
+{
+ if (unlikely(hlist_unhashed(&old->index_hash)))
+ return false;
+ spin_lock_bh(&table->lock);
+ new->index = old->index;
+ hlist_replace_rcu(&old->index_hash, &new->index_hash);
+
+ /* Calling init here NULLs out index_hash, and in fact after this
+ * function returns, it's theoretically possible for this to get
+ * reinserted elsewhere. That means the RCU lookup below might either
+ * terminate early or jump between buckets, in which case the packet
+ * simply gets dropped, which isn't terrible.
+ */
+ INIT_HLIST_NODE(&old->index_hash);
+ spin_unlock_bh(&table->lock);
+ return true;
+}
+
+void index_hashtable_remove(struct index_hashtable *table,
+ struct index_hashtable_entry *entry)
+{
+ spin_lock_bh(&table->lock);
+ hlist_del_init_rcu(&entry->index_hash);
+ spin_unlock_bh(&table->lock);
+}
+
+/* Returns a strong reference to an entry->peer */
+struct index_hashtable_entry *
+index_hashtable_lookup(struct index_hashtable *table,
+ const enum index_hashtable_type type_mask,
+ const __le32 index, struct wireguard_peer **peer)
+{
+ struct index_hashtable_entry *iter_entry, *entry = NULL;
+
+ rcu_read_lock_bh();
+ hlist_for_each_entry_rcu_bh (iter_entry, index_bucket(table, index),
+ index_hash) {
+ if (iter_entry->index == index) {
+ if (likely(iter_entry->type & type_mask))
+ entry = iter_entry;
+ break;
+ }
+ }
+ if (likely(entry)) {
+ entry->peer = peer_get_maybe_zero(entry->peer);
+ if (likely(entry->peer))
+ *peer = entry->peer;
+ else
+ entry = NULL;
+ }
+ rcu_read_unlock_bh();
+ return entry;
+}
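Note on the index selection comment in hashtables.c above: the Python
session there computes the chance that the first free random index is
found on a given try. The userspace C sketch below restates the same
formula and also shows the mask-based bucket selection used by
index_bucket(). It is an illustration only; the function names are
hypothetical and nothing here is kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Probability that a free index is found on exactly try number `tries`
 * when `size` of the 2^32 possible indices are already in use: the first
 * (tries - 1) guesses collide and the last one does not.
 */
double sketch_index_found_on_try(unsigned int tries, double size)
{
	const double used = size / 4294967296.0; /* size / 2^32 */
	double p = 1.0 - used;
	unsigned int i;

	for (i = 1; i < tries; ++i)
		p *= used;
	return p;
}

/* Bucket selection by masking: since indices are uniformly random, the
 * low `bits` bits already form a uniformly distributed bucket number.
 */
uint32_t sketch_index_bucket(uint32_t index, unsigned int bits)
{
	return index & ((UINT32_C(1) << bits) - 1);
}
```

For a full table of 3 * 2^20 entries the first call succeeds with
probability 1 - 3/4096, which is where the 0.999267578125 in the comment
comes from.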
diff --git a/drivers/net/wireguard/hashtables.h b/drivers/net/wireguard/hashtables.h
new file mode 100644
index 000000000000..62858c554283
--- /dev/null
+++ b/drivers/net/wireguard/hashtables.h
@@ -0,0 +1,63 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#ifndef _WG_HASHTABLES_H
+#define _WG_HASHTABLES_H
+
+#include "messages.h"
+
+#include <linux/hashtable.h>
+#include <linux/mutex.h>
+#include <linux/siphash.h>
+
+struct wireguard_peer;
+
+struct pubkey_hashtable {
+ /* TODO: move to rhashtable */
+ DECLARE_HASHTABLE(hashtable, 11);
+ siphash_key_t key;
+ struct mutex lock;
+};
+
+void pubkey_hashtable_init(struct pubkey_hashtable *table);
+void pubkey_hashtable_add(struct pubkey_hashtable *table,
+ struct wireguard_peer *peer);
+void pubkey_hashtable_remove(struct pubkey_hashtable *table,
+ struct wireguard_peer *peer);
+struct wireguard_peer *
+pubkey_hashtable_lookup(struct pubkey_hashtable *table,
+ const u8 pubkey[NOISE_PUBLIC_KEY_LEN]);
+
+struct index_hashtable {
+ /* TODO: move to rhashtable */
+ DECLARE_HASHTABLE(hashtable, 13);
+ spinlock_t lock;
+};
+
+enum index_hashtable_type {
+ INDEX_HASHTABLE_HANDSHAKE = 1U << 0,
+ INDEX_HASHTABLE_KEYPAIR = 1U << 1
+};
+
+struct index_hashtable_entry {
+ struct wireguard_peer *peer;
+ struct hlist_node index_hash;
+ enum index_hashtable_type type;
+ __le32 index;
+};
+void index_hashtable_init(struct index_hashtable *table);
+__le32 index_hashtable_insert(struct index_hashtable *table,
+ struct index_hashtable_entry *entry);
+bool index_hashtable_replace(struct index_hashtable *table,
+ struct index_hashtable_entry *old,
+ struct index_hashtable_entry *new);
+void index_hashtable_remove(struct index_hashtable *table,
+ struct index_hashtable_entry *entry);
+struct index_hashtable_entry *
+index_hashtable_lookup(struct index_hashtable *table,
+ const enum index_hashtable_type type_mask,
+ const __le32 index, struct wireguard_peer **peer);
+
+#endif /* _WG_HASHTABLES_H */
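Note on the fixed table sizes in hashtables.h above: DECLARE_HASHTABLE()
takes the number of bits, so the pubkey table has 2^11 buckets and the
index table has 2^13. The sketch below estimates the average chain length
at the MAX_PEERS_PER_DEVICE limit, which gives a feel for the "growing
hash lengths" concern and the rhashtable TODO. It is plain arithmetic in
userspace C with hypothetical names, and it assumes entries are spread
evenly across buckets.

```c
#include <assert.h>
#include <stdint.h>

/* Number of buckets for a table declared with the given bit count. */
uint64_t sketch_bucket_count(unsigned int bits)
{
	return UINT64_C(1) << bits;
}

/* Average entries per bucket, assuming a uniform distribution. */
uint64_t sketch_avg_chain_len(uint64_t entries, unsigned int bits)
{
	return entries / sketch_bucket_count(bits);
}
```

At 2^20 peers this comes to 512 entries per pubkey bucket, and at 3 * 2^20
index entries to 384 per index bucket, so lookups near the limit walk
fairly long lists.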
diff --git a/drivers/net/wireguard/main.c b/drivers/net/wireguard/main.c
new file mode 100644
index 000000000000..45f999041660
--- /dev/null
+++ b/drivers/net/wireguard/main.c
@@ -0,0 +1,65 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#include "version.h"
+#include "device.h"
+#include "noise.h"
+#include "queueing.h"
+#include "ratelimiter.h"
+#include "netlink.h"
+
+#include <uapi/linux/wireguard.h>
+
+#include <linux/version.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/genetlink.h>
+#include <net/rtnetlink.h>
+
+static int __init mod_init(void)
+{
+ int ret;
+
+#ifdef DEBUG
+ if (!allowedips_selftest() || !packet_counter_selftest() ||
+ !ratelimiter_selftest())
+ return -ENOTRECOVERABLE;
+#endif
+ noise_init();
+
+ ret = device_init();
+ if (ret < 0)
+ goto err_device;
+
+ ret = genetlink_init();
+ if (ret < 0)
+ goto err_netlink;
+
+ pr_info("WireGuard " WIREGUARD_VERSION " loaded. See www.wireguard.com for information.\n");
+ pr_info("Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.\n");
+
+ return 0;
+
+err_netlink:
+ device_uninit();
+err_device:
+ return ret;
+}
+
+static void __exit mod_exit(void)
+{
+ genetlink_uninit();
+ device_uninit();
+ pr_debug("WireGuard unloaded\n");
+}
+
+module_init(mod_init);
+module_exit(mod_exit);
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION("Fast, modern, and secure VPN tunnel");
+MODULE_AUTHOR("Jason A. Donenfeld <Jason@zx2c4.com>");
+MODULE_VERSION(WIREGUARD_VERSION);
+MODULE_ALIAS_RTNL_LINK(KBUILD_MODNAME);
+MODULE_ALIAS_GENL_FAMILY(WG_GENL_NAME);
diff --git a/drivers/net/wireguard/messages.h b/drivers/net/wireguard/messages.h
new file mode 100644
index 000000000000..131e1c44049d
--- /dev/null
+++ b/drivers/net/wireguard/messages.h
@@ -0,0 +1,128 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#ifndef _WG_MESSAGES_H
+#define _WG_MESSAGES_H
+
+#include <zinc/curve25519.h>
+#include <zinc/chacha20poly1305.h>
+#include <zinc/blake2s.h>
+
+#include <linux/kernel.h>
+#include <linux/param.h>
+#include <linux/skbuff.h>
+
+enum noise_lengths {
+ NOISE_PUBLIC_KEY_LEN = CURVE25519_POINT_SIZE,
+ NOISE_SYMMETRIC_KEY_LEN = CHACHA20POLY1305_KEYLEN,
+ NOISE_TIMESTAMP_LEN = sizeof(u64) + sizeof(u32),
+ NOISE_AUTHTAG_LEN = CHACHA20POLY1305_AUTHTAGLEN,
+ NOISE_HASH_LEN = BLAKE2S_OUTBYTES
+};
+
+#define noise_encrypted_len(plain_len) (plain_len + NOISE_AUTHTAG_LEN)
+
+enum cookie_values {
+ COOKIE_SECRET_MAX_AGE = 2 * 60,
+ COOKIE_SECRET_LATENCY = 5,
+ COOKIE_NONCE_LEN = XCHACHA20POLY1305_NONCELEN,
+ COOKIE_LEN = 16
+};
+
+enum counter_values {
+ COUNTER_BITS_TOTAL = 2048,
+ COUNTER_REDUNDANT_BITS = BITS_PER_LONG,
+ COUNTER_WINDOW_SIZE = COUNTER_BITS_TOTAL - COUNTER_REDUNDANT_BITS
+};
+
+enum limits {
+ REKEY_AFTER_MESSAGES = U64_MAX - 0xffff,
+ REJECT_AFTER_MESSAGES = U64_MAX - COUNTER_WINDOW_SIZE - 1,
+ REKEY_TIMEOUT = 5,
+ REKEY_TIMEOUT_JITTER_MAX_JIFFIES = HZ / 3,
+ REKEY_AFTER_TIME = 120,
+ REJECT_AFTER_TIME = 180,
+ INITIATIONS_PER_SECOND = 50,
+ MAX_PEERS_PER_DEVICE = 1U << 20,
+ KEEPALIVE_TIMEOUT = 10,
+ MAX_TIMER_HANDSHAKES = 90 / REKEY_TIMEOUT,
+ MAX_QUEUED_INCOMING_HANDSHAKES = 4096, /* TODO: replace this with DQL */
+ MAX_STAGED_PACKETS = 128,
+ MAX_QUEUED_PACKETS = 1024 /* TODO: replace this with DQL */
+};
+
+enum message_type {
+ MESSAGE_INVALID = 0,
+ MESSAGE_HANDSHAKE_INITIATION = 1,
+ MESSAGE_HANDSHAKE_RESPONSE = 2,
+ MESSAGE_HANDSHAKE_COOKIE = 3,
+ MESSAGE_DATA = 4
+};
+
+struct message_header {
+ /* The actual layout of this that we want is:
+ * u8 type
+ * u8 reserved_zero[3]
+ *
+ * But it turns out that by encoding this as little endian,
+ * we achieve the same thing, and it makes checking faster.
+ */
+ __le32 type;
+};
+
+struct message_macs {
+ u8 mac1[COOKIE_LEN];
+ u8 mac2[COOKIE_LEN];
+};
+
+struct message_handshake_initiation {
+ struct message_header header;
+ __le32 sender_index;
+ u8 unencrypted_ephemeral[NOISE_PUBLIC_KEY_LEN];
+ u8 encrypted_static[noise_encrypted_len(NOISE_PUBLIC_KEY_LEN)];
+ u8 encrypted_timestamp[noise_encrypted_len(NOISE_TIMESTAMP_LEN)];
+ struct message_macs macs;
+};
+
+struct message_handshake_response {
+ struct message_header header;
+ __le32 sender_index;
+ __le32 receiver_index;
+ u8 unencrypted_ephemeral[NOISE_PUBLIC_KEY_LEN];
+ u8 encrypted_nothing[noise_encrypted_len(0)];
+ struct message_macs macs;
+};
+
+struct message_handshake_cookie {
+ struct message_header header;
+ __le32 receiver_index;
+ u8 nonce[COOKIE_NONCE_LEN];
+ u8 encrypted_cookie[noise_encrypted_len(COOKIE_LEN)];
+};
+
+struct message_data {
+ struct message_header header;
+ __le32 key_idx;
+ __le64 counter;
+ u8 encrypted_data[];
+};
+
+#define message_data_len(plain_len) \
+ (noise_encrypted_len(plain_len) + sizeof(struct message_data))
+
+enum message_alignments {
+ MESSAGE_PADDING_MULTIPLE = 16,
+ MESSAGE_MINIMUM_LENGTH = message_data_len(0)
+};
+
+#define SKB_HEADER_LEN \
+ (max(sizeof(struct iphdr), sizeof(struct ipv6hdr)) + \
+ sizeof(struct udphdr) + NET_SKB_PAD)
+#define DATA_PACKET_HEAD_ROOM \
+ ALIGN(sizeof(struct message_data) + SKB_HEADER_LEN, 4)
+
+enum { HANDSHAKE_DSCP = 0x88 /* AF41, plus 00 ECN */ };
+
+#endif /* _WG_MESSAGES_H */
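Note on the struct message_header comment in messages.h above: storing the
type as a little-endian 32-bit value yields the same bytes on the wire as
a u8 type followed by three reserved zero bytes, and lets the receiver
validate all four bytes with one comparison. The userspace sketch below
demonstrates this with explicit byte handling so it behaves the same on
any host endianness. The sketch_* helpers are hypothetical and only stand
in for cpu_to_le32() and the corresponding comparison.

```c
#include <assert.h>
#include <stdint.h>

/* Store v in little-endian byte order, independent of host endianness. */
void sketch_put_le32(uint8_t out[4], uint32_t v)
{
	out[0] = (uint8_t)(v & 0xff);
	out[1] = (uint8_t)((v >> 8) & 0xff);
	out[2] = (uint8_t)((v >> 16) & 0xff);
	out[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Load a little-endian 32-bit value from four bytes. */
uint32_t sketch_get_le32(const uint8_t in[4])
{
	return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
	       ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}

/* One 32-bit comparison checks the type byte and the three reserved
 * zero bytes at the same time.
 */
int sketch_header_is_type(const uint8_t hdr[4], uint8_t type)
{
	return sketch_get_le32(hdr) == (uint32_t)type;
}
```

A header with any nonzero reserved byte fails the comparison, so no
separate check of the reserved field is needed.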
diff --git a/drivers/net/wireguard/netlink.c b/drivers/net/wireguard/netlink.c
new file mode 100644
index 000000000000..c17049958426
--- /dev/null
+++ b/drivers/net/wireguard/netlink.c
@@ -0,0 +1,605 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#include "netlink.h"
+#include "device.h"
+#include "peer.h"
+#include "socket.h"
+#include "queueing.h"
+#include "messages.h"
+
+#include <uapi/linux/wireguard.h>
+
+#include <linux/if.h>
+#include <net/genetlink.h>
+#include <net/sock.h>
+
+static struct genl_family genl_family;
+
+static const struct nla_policy device_policy[WGDEVICE_A_MAX + 1] = {
+ [WGDEVICE_A_IFINDEX] = { .type = NLA_U32 },
+ [WGDEVICE_A_IFNAME] = { .type = NLA_NUL_STRING, .len = IFNAMSIZ - 1 },
+ [WGDEVICE_A_PRIVATE_KEY] = { .len = NOISE_PUBLIC_KEY_LEN },
+ [WGDEVICE_A_PUBLIC_KEY] = { .len = NOISE_PUBLIC_KEY_LEN },
+ [WGDEVICE_A_FLAGS] = { .type = NLA_U32 },
+ [WGDEVICE_A_LISTEN_PORT] = { .type = NLA_U16 },
+ [WGDEVICE_A_FWMARK] = { .type = NLA_U32 },
+ [WGDEVICE_A_PEERS] = { .type = NLA_NESTED }
+};
+
+static const struct nla_policy peer_policy[WGPEER_A_MAX + 1] = {
+ [WGPEER_A_PUBLIC_KEY] = { .len = NOISE_PUBLIC_KEY_LEN },
+ [WGPEER_A_PRESHARED_KEY] = { .len = NOISE_SYMMETRIC_KEY_LEN },
+ [WGPEER_A_FLAGS] = { .type = NLA_U32 },
+ [WGPEER_A_ENDPOINT] = { .len = sizeof(struct sockaddr) },
+ [WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL] = { .type = NLA_U16 },
+ [WGPEER_A_LAST_HANDSHAKE_TIME] = { .len = sizeof(struct timespec) },
+ [WGPEER_A_RX_BYTES] = { .type = NLA_U64 },
+ [WGPEER_A_TX_BYTES] = { .type = NLA_U64 },
+ [WGPEER_A_ALLOWEDIPS] = { .type = NLA_NESTED },
+ [WGPEER_A_PROTOCOL_VERSION] = { .type = NLA_U32 }
+};
+
+static const struct nla_policy allowedip_policy[WGALLOWEDIP_A_MAX + 1] = {
+ [WGALLOWEDIP_A_FAMILY] = { .type = NLA_U16 },
+ [WGALLOWEDIP_A_IPADDR] = { .len = sizeof(struct in_addr) },
+ [WGALLOWEDIP_A_CIDR_MASK] = { .type = NLA_U8 }
+};
+
+static struct wireguard_device *lookup_interface(struct nlattr **attrs,
+ struct sk_buff *skb)
+{
+ struct net_device *dev = NULL;
+
+ if (!attrs[WGDEVICE_A_IFINDEX] == !attrs[WGDEVICE_A_IFNAME])
+ return ERR_PTR(-EBADR);
+ if (attrs[WGDEVICE_A_IFINDEX])
+ dev = dev_get_by_index(sock_net(skb->sk),
+ nla_get_u32(attrs[WGDEVICE_A_IFINDEX]));
+ else if (attrs[WGDEVICE_A_IFNAME])
+ dev = dev_get_by_name(sock_net(skb->sk),
+ nla_data(attrs[WGDEVICE_A_IFNAME]));
+ if (!dev)
+ return ERR_PTR(-ENODEV);
+ if (!dev->rtnl_link_ops || !dev->rtnl_link_ops->kind ||
+ strcmp(dev->rtnl_link_ops->kind, KBUILD_MODNAME)) {
+ dev_put(dev);
+ return ERR_PTR(-EOPNOTSUPP);
+ }
+ return netdev_priv(dev);
+}
+
+struct allowedips_ctx {
+ struct sk_buff *skb;
+ unsigned int i;
+};
+
+static int get_allowedips(void *ctx, const u8 *ip, u8 cidr, int family)
+{
+ struct allowedips_ctx *actx = ctx;
+ struct nlattr *allowedip_nest;
+
+ allowedip_nest = nla_nest_start(actx->skb, actx->i++);
+ if (!allowedip_nest)
+ return -EMSGSIZE;
+
+ if (nla_put_u8(actx->skb, WGALLOWEDIP_A_CIDR_MASK, cidr) ||
+ nla_put_u16(actx->skb, WGALLOWEDIP_A_FAMILY, family) ||
+ nla_put(actx->skb, WGALLOWEDIP_A_IPADDR, family == AF_INET6 ?
+ sizeof(struct in6_addr) : sizeof(struct in_addr), ip)) {
+ nla_nest_cancel(actx->skb, allowedip_nest);
+ return -EMSGSIZE;
+ }
+
+ nla_nest_end(actx->skb, allowedip_nest);
+ return 0;
+}
+
+static int get_peer(struct wireguard_peer *peer, unsigned int index,
+ struct allowedips_cursor *rt_cursor, struct sk_buff *skb)
+{
+ struct nlattr *allowedips_nest, *peer_nest = nla_nest_start(skb, index);
+ struct allowedips_ctx ctx = { .skb = skb };
+ bool fail;
+
+ if (!peer_nest)
+ return -EMSGSIZE;
+
+ down_read(&peer->handshake.lock);
+ fail = nla_put(skb, WGPEER_A_PUBLIC_KEY, NOISE_PUBLIC_KEY_LEN,
+ peer->handshake.remote_static);
+ up_read(&peer->handshake.lock);
+ if (fail)
+ goto err;
+
+ if (!rt_cursor->seq) {
+ down_read(&peer->handshake.lock);
+ fail = nla_put(skb, WGPEER_A_PRESHARED_KEY,
+ NOISE_SYMMETRIC_KEY_LEN,
+ peer->handshake.preshared_key);
+ up_read(&peer->handshake.lock);
+ if (fail)
+ goto err;
+
+ if (nla_put(skb, WGPEER_A_LAST_HANDSHAKE_TIME,
+ sizeof(peer->walltime_last_handshake),
+ &peer->walltime_last_handshake) ||
+ nla_put_u16(skb, WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL,
+ peer->persistent_keepalive_interval) ||
+ nla_put_u64_64bit(skb, WGPEER_A_TX_BYTES, peer->tx_bytes,
+ WGPEER_A_UNSPEC) ||
+ nla_put_u64_64bit(skb, WGPEER_A_RX_BYTES, peer->rx_bytes,
+ WGPEER_A_UNSPEC) ||
+ nla_put_u32(skb, WGPEER_A_PROTOCOL_VERSION, 1))
+ goto err;
+
+ read_lock_bh(&peer->endpoint_lock);
+ if (peer->endpoint.addr.sa_family == AF_INET)
+ fail = nla_put(skb, WGPEER_A_ENDPOINT,
+ sizeof(peer->endpoint.addr4),
+ &peer->endpoint.addr4);
+ else if (peer->endpoint.addr.sa_family == AF_INET6)
+ fail = nla_put(skb, WGPEER_A_ENDPOINT,
+ sizeof(peer->endpoint.addr6),
+ &peer->endpoint.addr6);
+ read_unlock_bh(&peer->endpoint_lock);
+ if (fail)
+ goto err;
+ }
+
+ allowedips_nest = nla_nest_start(skb, WGPEER_A_ALLOWEDIPS);
+ if (!allowedips_nest)
+ goto err;
+ if (allowedips_walk_by_peer(&peer->device->peer_allowedips, rt_cursor,
+ peer, get_allowedips, &ctx,
+ &peer->device->device_update_lock)) {
+ nla_nest_end(skb, allowedips_nest);
+ nla_nest_end(skb, peer_nest);
+ return -EMSGSIZE;
+ }
+ memset(rt_cursor, 0, sizeof(*rt_cursor));
+ nla_nest_end(skb, allowedips_nest);
+ nla_nest_end(skb, peer_nest);
+ return 0;
+err:
+ nla_nest_cancel(skb, peer_nest);
+ return -EMSGSIZE;
+}
+
+static int get_device_start(struct netlink_callback *cb)
+{
+ struct nlattr **attrs = genl_family_attrbuf(&genl_family);
+ int ret = nlmsg_parse(cb->nlh, GENL_HDRLEN + genl_family.hdrsize, attrs,
+ genl_family.maxattr, device_policy, NULL);
+ struct wireguard_device *wg;
+
+ if (ret < 0)
+ return ret;
+ cb->args[2] = (long)kzalloc(sizeof(struct allowedips_cursor),
+ GFP_KERNEL);
+ if (unlikely(!cb->args[2]))
+ return -ENOMEM;
+ wg = lookup_interface(attrs, cb->skb);
+ if (IS_ERR(wg)) {
+ kfree((void *)cb->args[2]);
+ cb->args[2] = 0;
+ return PTR_ERR(wg);
+ }
+ cb->args[0] = (long)wg;
+ return 0;
+}
+
+static int get_device_dump(struct sk_buff *skb, struct netlink_callback *cb)
+{
+ struct wireguard_peer *peer, *next_peer_cursor, *last_peer_cursor;
+ struct allowedips_cursor *rt_cursor;
+ struct wireguard_device *wg;
+ unsigned int peer_idx = 0;
+ struct nlattr *peers_nest;
+ bool done = true;
+ void *hdr;
+ int ret = -EMSGSIZE;
+
+ wg = (struct wireguard_device *)cb->args[0];
+ next_peer_cursor = (struct wireguard_peer *)cb->args[1];
+ last_peer_cursor = (struct wireguard_peer *)cb->args[1];
+ rt_cursor = (struct allowedips_cursor *)cb->args[2];
+
+ rtnl_lock();
+ mutex_lock(&wg->device_update_lock);
+ cb->seq = wg->device_update_gen;
+
+ hdr = genlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
+ &genl_family, NLM_F_MULTI, WG_CMD_GET_DEVICE);
+ if (!hdr)
+ goto out;
+ genl_dump_check_consistent(cb, hdr);
+
+ if (!last_peer_cursor) {
+ if (nla_put_u16(skb, WGDEVICE_A_LISTEN_PORT,
+ wg->incoming_port) ||
+ nla_put_u32(skb, WGDEVICE_A_FWMARK, wg->fwmark) ||
+ nla_put_u32(skb, WGDEVICE_A_IFINDEX, wg->dev->ifindex) ||
+ nla_put_string(skb, WGDEVICE_A_IFNAME, wg->dev->name))
+ goto out;
+
+ down_read(&wg->static_identity.lock);
+ if (wg->static_identity.has_identity) {
+ if (nla_put(skb, WGDEVICE_A_PRIVATE_KEY,
+ NOISE_PUBLIC_KEY_LEN,
+ wg->static_identity.static_private) ||
+ nla_put(skb, WGDEVICE_A_PUBLIC_KEY,
+ NOISE_PUBLIC_KEY_LEN,
+ wg->static_identity.static_public)) {
+ up_read(&wg->static_identity.lock);
+ goto out;
+ }
+ }
+ up_read(&wg->static_identity.lock);
+ }
+
+ peers_nest = nla_nest_start(skb, WGDEVICE_A_PEERS);
+ if (!peers_nest)
+ goto out;
+ ret = 0;
+ /* If the last cursor was removed via list_del_init in peer_remove, then
+ * we just treat this the same as there being no more peers left. The
+ * reason is that seq_nr should indicate to userspace that this isn't a
+ * coherent dump anyway, so they'll try again.
+ */
+ if (list_empty(&wg->peer_list) ||
+ (last_peer_cursor && list_empty(&last_peer_cursor->peer_list))) {
+ nla_nest_cancel(skb, peers_nest);
+ goto out;
+ }
+ lockdep_assert_held(&wg->device_update_lock);
+ peer = list_prepare_entry(last_peer_cursor, &wg->peer_list, peer_list);
+ list_for_each_entry_continue (peer, &wg->peer_list, peer_list) {
+ if (get_peer(peer, peer_idx++, rt_cursor, skb)) {
+ done = false;
+ break;
+ }
+ next_peer_cursor = peer;
+ }
+ nla_nest_end(skb, peers_nest);
+
+out:
+ if (!ret && !done && next_peer_cursor)
+ peer_get(next_peer_cursor);
+ peer_put(last_peer_cursor);
+ mutex_unlock(&wg->device_update_lock);
+ rtnl_unlock();
+
+ if (ret) {
+ genlmsg_cancel(skb, hdr);
+ return ret;
+ }
+ genlmsg_end(skb, hdr);
+ if (done) {
+ cb->args[1] = 0;
+ return 0;
+ }
+ cb->args[1] = (long)next_peer_cursor;
+ return skb->len;
+
+ /* At this point, we cannot safely zero out the private key
+ * material ourselves after the skb is used. Doing so will need an
+ * additional API in the kernel for marking skbs as zero_on_free.
+ */
+}
+
+static int get_device_done(struct netlink_callback *cb)
+{
+ struct wireguard_device *wg = (struct wireguard_device *)cb->args[0];
+ struct wireguard_peer *peer = (struct wireguard_peer *)cb->args[1];
+ struct allowedips_cursor *rt_cursor =
+ (struct allowedips_cursor *)cb->args[2];
+
+ if (wg)
+ dev_put(wg->dev);
+ kfree(rt_cursor);
+ peer_put(peer);
+ return 0;
+}
+
+static int set_port(struct wireguard_device *wg, u16 port)
+{
+ struct wireguard_peer *peer;
+
+ if (wg->incoming_port == port)
+ return 0;
+ list_for_each_entry (peer, &wg->peer_list, peer_list)
+ socket_clear_peer_endpoint_src(peer);
+ if (!netif_running(wg->dev)) {
+ wg->incoming_port = port;
+ return 0;
+ }
+ return socket_init(wg, port);
+}
+
+static int set_allowedip(struct wireguard_peer *peer, struct nlattr **attrs)
+{
+ int ret = -EINVAL;
+ u16 family;
+ u8 cidr;
+
+ if (!attrs[WGALLOWEDIP_A_FAMILY] || !attrs[WGALLOWEDIP_A_IPADDR] ||
+ !attrs[WGALLOWEDIP_A_CIDR_MASK])
+ return ret;
+ family = nla_get_u16(attrs[WGALLOWEDIP_A_FAMILY]);
+ cidr = nla_get_u8(attrs[WGALLOWEDIP_A_CIDR_MASK]);
+
+ if (family == AF_INET && cidr <= 32 &&
+ nla_len(attrs[WGALLOWEDIP_A_IPADDR]) == sizeof(struct in_addr))
+ ret = allowedips_insert_v4(
+ &peer->device->peer_allowedips,
+ nla_data(attrs[WGALLOWEDIP_A_IPADDR]), cidr, peer,
+ &peer->device->device_update_lock);
+ else if (family == AF_INET6 && cidr <= 128 &&
+ nla_len(attrs[WGALLOWEDIP_A_IPADDR]) == sizeof(struct in6_addr))
+ ret = allowedips_insert_v6(
+ &peer->device->peer_allowedips,
+ nla_data(attrs[WGALLOWEDIP_A_IPADDR]), cidr, peer,
+ &peer->device->device_update_lock);
+
+ return ret;
+}
+
+static int set_peer(struct wireguard_device *wg, struct nlattr **attrs)
+{
+ int ret;
+ u32 flags = 0;
+ struct wireguard_peer *peer = NULL;
+ u8 *public_key = NULL, *preshared_key = NULL;
+
+ ret = -EINVAL;
+ if (attrs[WGPEER_A_PUBLIC_KEY] &&
+ nla_len(attrs[WGPEER_A_PUBLIC_KEY]) == NOISE_PUBLIC_KEY_LEN)
+ public_key = nla_data(attrs[WGPEER_A_PUBLIC_KEY]);
+ else
+ goto out;
+ if (attrs[WGPEER_A_PRESHARED_KEY] &&
+ nla_len(attrs[WGPEER_A_PRESHARED_KEY]) == NOISE_SYMMETRIC_KEY_LEN)
+ preshared_key = nla_data(attrs[WGPEER_A_PRESHARED_KEY]);
+ if (attrs[WGPEER_A_FLAGS])
+ flags = nla_get_u32(attrs[WGPEER_A_FLAGS]);
+
+ ret = -EPFNOSUPPORT;
+ if (attrs[WGPEER_A_PROTOCOL_VERSION]) {
+ if (nla_get_u32(attrs[WGPEER_A_PROTOCOL_VERSION]) != 1)
+ goto out;
+ }
+
+ peer = pubkey_hashtable_lookup(&wg->peer_hashtable,
+ nla_data(attrs[WGPEER_A_PUBLIC_KEY]));
+ if (!peer) { /* Peer doesn't exist yet. Add a new one. */
+ ret = -ENODEV;
+ if (flags & WGPEER_F_REMOVE_ME)
+ goto out; /* Tried to remove a non-existing peer. */
+
+ down_read(&wg->static_identity.lock);
+ if (wg->static_identity.has_identity &&
+ !memcmp(nla_data(attrs[WGPEER_A_PUBLIC_KEY]),
+ wg->static_identity.static_public,
+ NOISE_PUBLIC_KEY_LEN)) {
+ /* We silently ignore peers that have the same public
+ * key as the device. The reason we do it silently is
+ * that we'd like for people to be able to reuse the
+ * same set of API calls across peers.
+ */
+ up_read(&wg->static_identity.lock);
+ ret = 0;
+ goto out;
+ }
+ up_read(&wg->static_identity.lock);
+
+ ret = -ENOMEM;
+ peer = peer_create(wg, public_key, preshared_key);
+ if (!peer)
+ goto out;
+ /* Take an additional reference, as though the peer had just
+ * been looked up.
+ */
+ peer_get(peer);
+ }
+
+ ret = 0;
+ if (flags & WGPEER_F_REMOVE_ME) {
+ peer_remove(peer);
+ goto out;
+ }
+
+ if (preshared_key) {
+ down_write(&peer->handshake.lock);
+ memcpy(&peer->handshake.preshared_key, preshared_key,
+ NOISE_SYMMETRIC_KEY_LEN);
+ up_write(&peer->handshake.lock);
+ }
+
+ if (attrs[WGPEER_A_ENDPOINT]) {
+ struct sockaddr *addr = nla_data(attrs[WGPEER_A_ENDPOINT]);
+ size_t len = nla_len(attrs[WGPEER_A_ENDPOINT]);
+
+ if ((len == sizeof(struct sockaddr_in) &&
+ addr->sa_family == AF_INET) ||
+ (len == sizeof(struct sockaddr_in6) &&
+ addr->sa_family == AF_INET6)) {
+ struct endpoint endpoint = { { { 0 } } };
+
+ memcpy(&endpoint.addr, addr, len);
+ socket_set_peer_endpoint(peer, &endpoint);
+ }
+ }
+
+ if (flags & WGPEER_F_REPLACE_ALLOWEDIPS)
+ allowedips_remove_by_peer(&wg->peer_allowedips, peer,
+ &wg->device_update_lock);
+
+ if (attrs[WGPEER_A_ALLOWEDIPS]) {
+ struct nlattr *attr, *allowedip[WGALLOWEDIP_A_MAX + 1];
+ int rem;
+
+ nla_for_each_nested (attr, attrs[WGPEER_A_ALLOWEDIPS], rem) {
+ ret = nla_parse_nested(allowedip, WGALLOWEDIP_A_MAX,
+ attr, allowedip_policy, NULL);
+ if (ret < 0)
+ goto out;
+ ret = set_allowedip(peer, allowedip);
+ if (ret < 0)
+ goto out;
+ }
+ }
+
+ if (attrs[WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL]) {
+ const u16 persistent_keepalive_interval = nla_get_u16(
+ attrs[WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL]);
+ const bool send_keepalive =
+ !peer->persistent_keepalive_interval &&
+ persistent_keepalive_interval &&
+ netif_running(wg->dev);
+
+ peer->persistent_keepalive_interval = persistent_keepalive_interval;
+ if (send_keepalive)
+ packet_send_keepalive(peer);
+ }
+
+ if (netif_running(wg->dev))
+ packet_send_staged_packets(peer);
+
+out:
+ peer_put(peer);
+ if (attrs[WGPEER_A_PRESHARED_KEY])
+ memzero_explicit(nla_data(attrs[WGPEER_A_PRESHARED_KEY]),
+ nla_len(attrs[WGPEER_A_PRESHARED_KEY]));
+ return ret;
+}
+
+static int set_device(struct sk_buff *skb, struct genl_info *info)
+{
+ int ret;
+ struct wireguard_device *wg = lookup_interface(info->attrs, skb);
+
+ if (IS_ERR(wg)) {
+ ret = PTR_ERR(wg);
+ goto out_nodev;
+ }
+
+ rtnl_lock();
+ mutex_lock(&wg->device_update_lock);
+ ++wg->device_update_gen;
+
+ if (info->attrs[WGDEVICE_A_FWMARK]) {
+ struct wireguard_peer *peer;
+
+ wg->fwmark = nla_get_u32(info->attrs[WGDEVICE_A_FWMARK]);
+ list_for_each_entry (peer, &wg->peer_list, peer_list)
+ socket_clear_peer_endpoint_src(peer);
+ }
+
+ if (info->attrs[WGDEVICE_A_LISTEN_PORT]) {
+ ret = set_port(
+ wg, nla_get_u16(info->attrs[WGDEVICE_A_LISTEN_PORT]));
+ if (ret)
+ goto out;
+ }
+
+ if (info->attrs[WGDEVICE_A_FLAGS] &&
+ nla_get_u32(info->attrs[WGDEVICE_A_FLAGS]) &
+ WGDEVICE_F_REPLACE_PEERS)
+ peer_remove_all(wg);
+
+ if (info->attrs[WGDEVICE_A_PRIVATE_KEY] &&
+ nla_len(info->attrs[WGDEVICE_A_PRIVATE_KEY]) ==
+ NOISE_PUBLIC_KEY_LEN) {
+ u8 *private_key = nla_data(info->attrs[WGDEVICE_A_PRIVATE_KEY]);
+ u8 public_key[NOISE_PUBLIC_KEY_LEN];
+ struct wireguard_peer *peer, *temp;
+
+ /* We remove any peer sharing the new public key before setting the
+ * key, to prevent a race, which means doing two 25519-genpub ops.
+ */
+ if (curve25519_generate_public(public_key, private_key)) {
+ peer = pubkey_hashtable_lookup(&wg->peer_hashtable,
+ public_key);
+ if (peer) {
+ peer_put(peer);
+ peer_remove(peer);
+ }
+ }
+
+ down_write(&wg->static_identity.lock);
+ noise_set_static_identity_private_key(&wg->static_identity,
+ private_key);
+ list_for_each_entry_safe (peer, temp, &wg->peer_list,
+ peer_list) {
+ if (!noise_precompute_static_static(peer))
+ peer_remove(peer);
+ }
+ cookie_checker_precompute_device_keys(&wg->cookie_checker);
+ up_write(&wg->static_identity.lock);
+ }
+
+ if (info->attrs[WGDEVICE_A_PEERS]) {
+ int rem;
+ struct nlattr *attr, *peer[WGPEER_A_MAX + 1];
+
+ nla_for_each_nested (attr, info->attrs[WGDEVICE_A_PEERS], rem) {
+ ret = nla_parse_nested(peer, WGPEER_A_MAX, attr,
+ peer_policy, NULL);
+ if (ret < 0)
+ goto out;
+ ret = set_peer(wg, peer);
+ if (ret < 0)
+ goto out;
+ }
+ }
+ ret = 0;
+
+out:
+ mutex_unlock(&wg->device_update_lock);
+ rtnl_unlock();
+ dev_put(wg->dev);
+out_nodev:
+ if (info->attrs[WGDEVICE_A_PRIVATE_KEY])
+ memzero_explicit(nla_data(info->attrs[WGDEVICE_A_PRIVATE_KEY]),
+ nla_len(info->attrs[WGDEVICE_A_PRIVATE_KEY]));
+ return ret;
+}
+
+static const struct genl_ops genl_ops[] = {
+ {
+ .cmd = WG_CMD_GET_DEVICE,
+ .start = get_device_start,
+ .dumpit = get_device_dump,
+ .done = get_device_done,
+ .policy = device_policy,
+ .flags = GENL_UNS_ADMIN_PERM
+ }, {
+ .cmd = WG_CMD_SET_DEVICE,
+ .doit = set_device,
+ .policy = device_policy,
+ .flags = GENL_UNS_ADMIN_PERM
+ }
+};
+
+static struct genl_family genl_family __ro_after_init = {
+ .ops = genl_ops,
+ .n_ops = ARRAY_SIZE(genl_ops),
+ .name = WG_GENL_NAME,
+ .version = WG_GENL_VERSION,
+ .maxattr = WGDEVICE_A_MAX,
+ .module = THIS_MODULE,
+ .netnsok = true
+};
+
+int __init genetlink_init(void)
+{
+ return genl_register_family(&genl_family);
+}
+
+void __exit genetlink_uninit(void)
+{
+ genl_unregister_family(&genl_family);
+}
diff --git a/drivers/net/wireguard/netlink.h b/drivers/net/wireguard/netlink.h
new file mode 100644
index 000000000000..c1cd9b019bd1
--- /dev/null
+++ b/drivers/net/wireguard/netlink.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#ifndef _WG_NETLINK_H
+#define _WG_NETLINK_H
+
+int genetlink_init(void);
+void genetlink_uninit(void);
+
+#endif /* _WG_NETLINK_H */
diff --git a/drivers/net/wireguard/noise.c b/drivers/net/wireguard/noise.c
new file mode 100644
index 000000000000..9bd2d7ef869a
--- /dev/null
+++ b/drivers/net/wireguard/noise.c
@@ -0,0 +1,784 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#include "noise.h"
+#include "device.h"
+#include "peer.h"
+#include "messages.h"
+#include "queueing.h"
+#include "hashtables.h"
+
+#include <linux/rcupdate.h>
+#include <linux/slab.h>
+#include <linux/bitmap.h>
+#include <linux/scatterlist.h>
+#include <linux/highmem.h>
+#include <crypto/algapi.h>
+
+/* This implements Noise_IKpsk2:
+ *
+ * <- s
+ * ******
+ * -> e, es, s, ss, {t}
+ * <- e, ee, se, psk, {}
+ */
+
+static const u8 handshake_name[37] = "Noise_IKpsk2_25519_ChaChaPoly_BLAKE2s";
+static const u8 identifier_name[34] = "WireGuard v1 zx2c4 Jason@zx2c4.com";
+static u8 handshake_init_hash[NOISE_HASH_LEN] __ro_after_init;
+static u8 handshake_init_chaining_key[NOISE_HASH_LEN] __ro_after_init;
+static atomic64_t keypair_counter = ATOMIC64_INIT(0);
+
+void __init noise_init(void)
+{
+ struct blake2s_state blake;
+
+ blake2s(handshake_init_chaining_key, handshake_name, NULL,
+ NOISE_HASH_LEN, sizeof(handshake_name), 0);
+ blake2s_init(&blake, NOISE_HASH_LEN);
+ blake2s_update(&blake, handshake_init_chaining_key, NOISE_HASH_LEN);
+ blake2s_update(&blake, identifier_name, sizeof(identifier_name));
+ blake2s_final(&blake, handshake_init_hash, NOISE_HASH_LEN);
+}
+
+/* Must hold peer->handshake.static_identity->lock */
+bool noise_precompute_static_static(struct wireguard_peer *peer)
+{
+ bool ret = true;
+
+ down_write(&peer->handshake.lock);
+ if (peer->handshake.static_identity->has_identity)
+ ret = curve25519(
+ peer->handshake.precomputed_static_static,
+ peer->handshake.static_identity->static_private,
+ peer->handshake.remote_static);
+ else
+ memset(peer->handshake.precomputed_static_static, 0,
+ NOISE_PUBLIC_KEY_LEN);
+ up_write(&peer->handshake.lock);
+ return ret;
+}
+
+bool noise_handshake_init(struct noise_handshake *handshake,
+ struct noise_static_identity *static_identity,
+ const u8 peer_public_key[NOISE_PUBLIC_KEY_LEN],
+ const u8 peer_preshared_key[NOISE_SYMMETRIC_KEY_LEN],
+ struct wireguard_peer *peer)
+{
+ memset(handshake, 0, sizeof(*handshake));
+ init_rwsem(&handshake->lock);
+ handshake->entry.type = INDEX_HASHTABLE_HANDSHAKE;
+ handshake->entry.peer = peer;
+ memcpy(handshake->remote_static, peer_public_key, NOISE_PUBLIC_KEY_LEN);
+ if (peer_preshared_key)
+ memcpy(handshake->preshared_key, peer_preshared_key,
+ NOISE_SYMMETRIC_KEY_LEN);
+ handshake->static_identity = static_identity;
+ handshake->state = HANDSHAKE_ZEROED;
+ return noise_precompute_static_static(peer);
+}
+
+static void handshake_zero(struct noise_handshake *handshake)
+{
+ memset(&handshake->ephemeral_private, 0, NOISE_PUBLIC_KEY_LEN);
+ memset(&handshake->remote_ephemeral, 0, NOISE_PUBLIC_KEY_LEN);
+ memset(&handshake->hash, 0, NOISE_HASH_LEN);
+ memset(&handshake->chaining_key, 0, NOISE_HASH_LEN);
+ handshake->remote_index = 0;
+ handshake->state = HANDSHAKE_ZEROED;
+}
+
+void noise_handshake_clear(struct noise_handshake *handshake)
+{
+ index_hashtable_remove(&handshake->entry.peer->device->index_hashtable,
+ &handshake->entry);
+ down_write(&handshake->lock);
+ handshake_zero(handshake);
+ up_write(&handshake->lock);
+ index_hashtable_remove(&handshake->entry.peer->device->index_hashtable,
+ &handshake->entry);
+}
+
+static struct noise_keypair *keypair_create(struct wireguard_peer *peer)
+{
+ struct noise_keypair *keypair = kzalloc(sizeof(*keypair), GFP_KERNEL);
+
+ if (unlikely(!keypair))
+ return NULL;
+ keypair->internal_id = atomic64_inc_return(&keypair_counter);
+ keypair->entry.type = INDEX_HASHTABLE_KEYPAIR;
+ keypair->entry.peer = peer;
+ kref_init(&keypair->refcount);
+ return keypair;
+}
+
+static void keypair_free_rcu(struct rcu_head *rcu)
+{
+ kzfree(container_of(rcu, struct noise_keypair, rcu));
+}
+
+static void keypair_free_kref(struct kref *kref)
+{
+ struct noise_keypair *keypair =
+ container_of(kref, struct noise_keypair, refcount);
+ net_dbg_ratelimited("%s: Keypair %llu destroyed for peer %llu\n",
+ keypair->entry.peer->device->dev->name,
+ keypair->internal_id,
+ keypair->entry.peer->internal_id);
+ index_hashtable_remove(&keypair->entry.peer->device->index_hashtable,
+ &keypair->entry);
+ call_rcu_bh(&keypair->rcu, keypair_free_rcu);
+}
+
+void noise_keypair_put(struct noise_keypair *keypair, bool unreference_now)
+{
+ if (unlikely(!keypair))
+ return;
+ if (unlikely(unreference_now))
+ index_hashtable_remove(
+ &keypair->entry.peer->device->index_hashtable,
+ &keypair->entry);
+ kref_put(&keypair->refcount, keypair_free_kref);
+}
+
+struct noise_keypair *noise_keypair_get(struct noise_keypair *keypair)
+{
+ RCU_LOCKDEP_WARN(!rcu_read_lock_bh_held(),
+ "Taking noise keypair reference without holding the RCU BH read lock");
+ if (unlikely(!keypair || !kref_get_unless_zero(&keypair->refcount)))
+ return NULL;
+ return keypair;
+}
+
+void noise_keypairs_clear(struct noise_keypairs *keypairs)
+{
+ struct noise_keypair *old;
+
+ spin_lock_bh(&keypairs->keypair_update_lock);
+ old = rcu_dereference_protected(keypairs->previous_keypair,
+ lockdep_is_held(&keypairs->keypair_update_lock));
+ RCU_INIT_POINTER(keypairs->previous_keypair, NULL);
+ noise_keypair_put(old, true);
+ old = rcu_dereference_protected(keypairs->next_keypair,
+ lockdep_is_held(&keypairs->keypair_update_lock));
+ RCU_INIT_POINTER(keypairs->next_keypair, NULL);
+ noise_keypair_put(old, true);
+ old = rcu_dereference_protected(keypairs->current_keypair,
+ lockdep_is_held(&keypairs->keypair_update_lock));
+ RCU_INIT_POINTER(keypairs->current_keypair, NULL);
+ noise_keypair_put(old, true);
+ spin_unlock_bh(&keypairs->keypair_update_lock);
+}
+
+static void add_new_keypair(struct noise_keypairs *keypairs,
+ struct noise_keypair *new_keypair)
+{
+ struct noise_keypair *previous_keypair, *next_keypair, *current_keypair;
+
+ spin_lock_bh(&keypairs->keypair_update_lock);
+ previous_keypair = rcu_dereference_protected(keypairs->previous_keypair,
+ lockdep_is_held(&keypairs->keypair_update_lock));
+ next_keypair = rcu_dereference_protected(keypairs->next_keypair,
+ lockdep_is_held(&keypairs->keypair_update_lock));
+ current_keypair = rcu_dereference_protected(keypairs->current_keypair,
+ lockdep_is_held(&keypairs->keypair_update_lock));
+ if (new_keypair->i_am_the_initiator) {
+ /* If we're the initiator, it means we've sent a handshake, and
+ * received a confirmation response, which means this new
+ * keypair can now be used.
+ */
+ if (next_keypair) {
+ /* If there already was a next keypair pending, we
+ * demote it to be the previous keypair, and free the
+ * existing current. Note that this means KCI can result
+ * in this transition. It would perhaps be more sound to
+ * always just get rid of the unused next keypair
+ * instead of putting it in the previous slot, but this
+ * might be a bit less robust. Something to think about
+ * for the future.
+ */
+ RCU_INIT_POINTER(keypairs->next_keypair, NULL);
+ rcu_assign_pointer(keypairs->previous_keypair,
+ next_keypair);
+ noise_keypair_put(current_keypair, true);
+ } else {
+ /* No next keypair pending: previous = current. */
+ rcu_assign_pointer(keypairs->previous_keypair,
+ current_keypair);
+ }
+ /* At this point we can get rid of the old previous keypair, and
+ * set up the new keypair.
+ */
+ noise_keypair_put(previous_keypair, true);
+ rcu_assign_pointer(keypairs->current_keypair, new_keypair);
+ } else {
+ /* If we're the responder, it means we can't use the new keypair
+ * until we receive confirmation via the first data packet, so
+ * we get rid of the existing previous one, the possibly
+ * existing next one, and slide in the new next one.
+ */
+ rcu_assign_pointer(keypairs->next_keypair, new_keypair);
+ noise_keypair_put(next_keypair, true);
+ RCU_INIT_POINTER(keypairs->previous_keypair, NULL);
+ noise_keypair_put(previous_keypair, true);
+ }
+ spin_unlock_bh(&keypairs->keypair_update_lock);
+}
+
+bool noise_received_with_keypair(struct noise_keypairs *keypairs,
+ struct noise_keypair *received_keypair)
+{
+ struct noise_keypair *old_keypair;
+ bool key_is_new;
+
+ /* We first check without taking the spinlock. */
+ key_is_new = received_keypair ==
+ rcu_access_pointer(keypairs->next_keypair);
+ if (likely(!key_is_new))
+ return false;
+
+ spin_lock_bh(&keypairs->keypair_update_lock);
+ /* After locking, we double check that things didn't change from
+ * beneath us.
+ */
+ if (unlikely(received_keypair !=
+ rcu_dereference_protected(keypairs->next_keypair,
+ lockdep_is_held(&keypairs->keypair_update_lock)))) {
+ spin_unlock_bh(&keypairs->keypair_update_lock);
+ return false;
+ }
+
+ /* When we've finally received the confirmation, we slide the next
+ * into the current, the current into the previous, and get rid of
+ * the old previous.
+ */
+ old_keypair = rcu_dereference_protected(keypairs->previous_keypair,
+ lockdep_is_held(&keypairs->keypair_update_lock));
+ rcu_assign_pointer(keypairs->previous_keypair,
+ rcu_dereference_protected(keypairs->current_keypair,
+ lockdep_is_held(&keypairs->keypair_update_lock)));
+ noise_keypair_put(old_keypair, true);
+ rcu_assign_pointer(keypairs->current_keypair, received_keypair);
+ RCU_INIT_POINTER(keypairs->next_keypair, NULL);
+
+ spin_unlock_bh(&keypairs->keypair_update_lock);
+ return true;
+}
+
+/* Must hold static_identity->lock */
+void noise_set_static_identity_private_key(
+ struct noise_static_identity *static_identity,
+ const u8 private_key[NOISE_PUBLIC_KEY_LEN])
+{
+ memcpy(static_identity->static_private, private_key,
+ NOISE_PUBLIC_KEY_LEN);
+ static_identity->has_identity = curve25519_generate_public(
+ static_identity->static_public, private_key);
+}
+
+/* This is Hugo Krawczyk's HKDF:
+ * - https://eprint.iacr.org/2010/264.pdf
+ * - https://tools.ietf.org/html/rfc5869
+ */
+static void kdf(u8 *first_dst, u8 *second_dst, u8 *third_dst, const u8 *data,
+ size_t first_len, size_t second_len, size_t third_len,
+ size_t data_len, const u8 chaining_key[NOISE_HASH_LEN])
+{
+ u8 output[BLAKE2S_OUTBYTES + 1];
+ u8 secret[BLAKE2S_OUTBYTES];
+
+#ifdef DEBUG
+ BUG_ON(first_len > BLAKE2S_OUTBYTES || second_len > BLAKE2S_OUTBYTES ||
+ third_len > BLAKE2S_OUTBYTES ||
+ ((second_len || second_dst || third_len || third_dst) &&
+ (!first_len || !first_dst)) ||
+ ((third_len || third_dst) && (!second_len || !second_dst)));
+#endif
+
+ /* Extract entropy from data into secret */
+ blake2s_hmac(secret, data, chaining_key, BLAKE2S_OUTBYTES, data_len,
+ NOISE_HASH_LEN);
+
+ if (!first_dst || !first_len)
+ goto out;
+
+ /* Expand first key: key = secret, data = 0x1 */
+ output[0] = 1;
+ blake2s_hmac(output, output, secret, BLAKE2S_OUTBYTES, 1,
+ BLAKE2S_OUTBYTES);
+ memcpy(first_dst, output, first_len);
+
+ if (!second_dst || !second_len)
+ goto out;
+
+ /* Expand second key: key = secret, data = first-key || 0x2 */
+ output[BLAKE2S_OUTBYTES] = 2;
+ blake2s_hmac(output, output, secret, BLAKE2S_OUTBYTES,
+ BLAKE2S_OUTBYTES + 1, BLAKE2S_OUTBYTES);
+ memcpy(second_dst, output, second_len);
+
+ if (!third_dst || !third_len)
+ goto out;
+
+ /* Expand third key: key = secret, data = second-key || 0x3 */
+ output[BLAKE2S_OUTBYTES] = 3;
+ blake2s_hmac(output, output, secret, BLAKE2S_OUTBYTES,
+ BLAKE2S_OUTBYTES + 1, BLAKE2S_OUTBYTES);
+ memcpy(third_dst, output, third_len);
+
+out:
+ /* Clear sensitive data from stack */
+ memzero_explicit(secret, BLAKE2S_OUTBYTES);
+ memzero_explicit(output, BLAKE2S_OUTBYTES + 1);
+}
+
+static void symmetric_key_init(struct noise_symmetric_key *key)
+{
+ spin_lock_init(&key->counter.receive.lock);
+ atomic64_set(&key->counter.counter, 0);
+ memset(key->counter.receive.backtrack, 0,
+ sizeof(key->counter.receive.backtrack));
+ key->birthdate = ktime_get_boot_fast_ns();
+ key->is_valid = true;
+}
+
+static void derive_keys(struct noise_symmetric_key *first_dst,
+ struct noise_symmetric_key *second_dst,
+ const u8 chaining_key[NOISE_HASH_LEN])
+{
+ kdf(first_dst->key, second_dst->key, NULL, NULL,
+ NOISE_SYMMETRIC_KEY_LEN, NOISE_SYMMETRIC_KEY_LEN, 0, 0,
+ chaining_key);
+ symmetric_key_init(first_dst);
+ symmetric_key_init(second_dst);
+}
+
+static bool __must_check mix_dh(u8 chaining_key[NOISE_HASH_LEN],
+ u8 key[NOISE_SYMMETRIC_KEY_LEN],
+ const u8 private[NOISE_PUBLIC_KEY_LEN],
+ const u8 public[NOISE_PUBLIC_KEY_LEN])
+{
+ u8 dh_calculation[NOISE_PUBLIC_KEY_LEN];
+
+ if (unlikely(!curve25519(dh_calculation, private, public)))
+ return false;
+ kdf(chaining_key, key, NULL, dh_calculation, NOISE_HASH_LEN,
+ NOISE_SYMMETRIC_KEY_LEN, 0, NOISE_PUBLIC_KEY_LEN, chaining_key);
+ memzero_explicit(dh_calculation, NOISE_PUBLIC_KEY_LEN);
+ return true;
+}
+
+static void mix_hash(u8 hash[NOISE_HASH_LEN], const u8 *src, size_t src_len)
+{
+ struct blake2s_state blake;
+
+ blake2s_init(&blake, NOISE_HASH_LEN);
+ blake2s_update(&blake, hash, NOISE_HASH_LEN);
+ blake2s_update(&blake, src, src_len);
+ blake2s_final(&blake, hash, NOISE_HASH_LEN);
+}
+
+static void mix_psk(u8 chaining_key[NOISE_HASH_LEN], u8 hash[NOISE_HASH_LEN],
+ u8 key[NOISE_SYMMETRIC_KEY_LEN],
+ const u8 psk[NOISE_SYMMETRIC_KEY_LEN])
+{
+ u8 temp_hash[NOISE_HASH_LEN];
+
+ kdf(chaining_key, temp_hash, key, psk, NOISE_HASH_LEN, NOISE_HASH_LEN,
+ NOISE_SYMMETRIC_KEY_LEN, NOISE_SYMMETRIC_KEY_LEN, chaining_key);
+ mix_hash(hash, temp_hash, NOISE_HASH_LEN);
+ memzero_explicit(temp_hash, NOISE_HASH_LEN);
+}
+
+static void handshake_init(u8 chaining_key[NOISE_HASH_LEN],
+ u8 hash[NOISE_HASH_LEN],
+ const u8 remote_static[NOISE_PUBLIC_KEY_LEN])
+{
+ memcpy(hash, handshake_init_hash, NOISE_HASH_LEN);
+ memcpy(chaining_key, handshake_init_chaining_key, NOISE_HASH_LEN);
+ mix_hash(hash, remote_static, NOISE_PUBLIC_KEY_LEN);
+}
+
+static void message_encrypt(u8 *dst_ciphertext, const u8 *src_plaintext,
+ size_t src_len, u8 key[NOISE_SYMMETRIC_KEY_LEN],
+ u8 hash[NOISE_HASH_LEN])
+{
+ chacha20poly1305_encrypt(dst_ciphertext, src_plaintext, src_len, hash,
+ NOISE_HASH_LEN,
+ 0 /* Always zero for Noise_IK */, key);
+ mix_hash(hash, dst_ciphertext, noise_encrypted_len(src_len));
+}
+
+static bool message_decrypt(u8 *dst_plaintext, const u8 *src_ciphertext,
+ size_t src_len, u8 key[NOISE_SYMMETRIC_KEY_LEN],
+ u8 hash[NOISE_HASH_LEN])
+{
+ if (!chacha20poly1305_decrypt(dst_plaintext, src_ciphertext, src_len,
+ hash, NOISE_HASH_LEN,
+ 0 /* Always zero for Noise_IK */, key))
+ return false;
+ mix_hash(hash, src_ciphertext, src_len);
+ return true;
+}
+
+static void message_ephemeral(u8 ephemeral_dst[NOISE_PUBLIC_KEY_LEN],
+ const u8 ephemeral_src[NOISE_PUBLIC_KEY_LEN],
+ u8 chaining_key[NOISE_HASH_LEN],
+ u8 hash[NOISE_HASH_LEN])
+{
+ if (ephemeral_dst != ephemeral_src)
+ memcpy(ephemeral_dst, ephemeral_src, NOISE_PUBLIC_KEY_LEN);
+ mix_hash(hash, ephemeral_src, NOISE_PUBLIC_KEY_LEN);
+ kdf(chaining_key, NULL, NULL, ephemeral_src, NOISE_HASH_LEN, 0, 0,
+ NOISE_PUBLIC_KEY_LEN, chaining_key);
+}
+
+static void tai64n_now(u8 output[NOISE_TIMESTAMP_LEN])
+{
+ struct timespec64 now;
+
+ getnstimeofday64(&now);
+ /* https://cr.yp.to/libtai/tai64.html */
+ *(__be64 *)output = cpu_to_be64(0x400000000000000aULL + now.tv_sec);
+ *(__be32 *)(output + sizeof(__be64)) = cpu_to_be32(now.tv_nsec);
+}
+
+bool noise_handshake_create_initiation(struct message_handshake_initiation *dst,
+ struct noise_handshake *handshake)
+{
+ u8 timestamp[NOISE_TIMESTAMP_LEN];
+ u8 key[NOISE_SYMMETRIC_KEY_LEN];
+ bool ret = false;
+
+ /* We need to wait for crng _before_ taking any locks, since
+ * curve25519_generate_secret uses get_random_bytes_wait.
+ */
+ wait_for_random_bytes();
+
+ down_read(&handshake->static_identity->lock);
+ down_write(&handshake->lock);
+
+ if (unlikely(!handshake->static_identity->has_identity))
+ goto out;
+
+ dst->header.type = cpu_to_le32(MESSAGE_HANDSHAKE_INITIATION);
+
+ handshake_init(handshake->chaining_key, handshake->hash,
+ handshake->remote_static);
+
+ /* e */
+ curve25519_generate_secret(handshake->ephemeral_private);
+ if (!curve25519_generate_public(dst->unencrypted_ephemeral,
+ handshake->ephemeral_private))
+ goto out;
+ message_ephemeral(dst->unencrypted_ephemeral,
+ dst->unencrypted_ephemeral, handshake->chaining_key,
+ handshake->hash);
+
+ /* es */
+ if (!mix_dh(handshake->chaining_key, key, handshake->ephemeral_private,
+ handshake->remote_static))
+ goto out;
+
+ /* s */
+ message_encrypt(dst->encrypted_static,
+ handshake->static_identity->static_public,
+ NOISE_PUBLIC_KEY_LEN, key, handshake->hash);
+
+ /* ss */
+ kdf(handshake->chaining_key, key, NULL,
+ handshake->precomputed_static_static, NOISE_HASH_LEN,
+ NOISE_SYMMETRIC_KEY_LEN, 0, NOISE_PUBLIC_KEY_LEN,
+ handshake->chaining_key);
+
+ /* {t} */
+ tai64n_now(timestamp);
+ message_encrypt(dst->encrypted_timestamp, timestamp,
+ NOISE_TIMESTAMP_LEN, key, handshake->hash);
+
+ dst->sender_index = index_hashtable_insert(
+ &handshake->entry.peer->device->index_hashtable,
+ &handshake->entry);
+
+ handshake->state = HANDSHAKE_CREATED_INITIATION;
+ ret = true;
+
+out:
+ up_write(&handshake->lock);
+ up_read(&handshake->static_identity->lock);
+ memzero_explicit(key, NOISE_SYMMETRIC_KEY_LEN);
+ return ret;
+}
+
+struct wireguard_peer *
+noise_handshake_consume_initiation(struct message_handshake_initiation *src,
+ struct wireguard_device *wg)
+{
+ struct wireguard_peer *peer = NULL, *ret_peer = NULL;
+ struct noise_handshake *handshake;
+ bool replay_attack, flood_attack;
+ u8 key[NOISE_SYMMETRIC_KEY_LEN];
+ u8 chaining_key[NOISE_HASH_LEN];
+ u8 hash[NOISE_HASH_LEN];
+ u8 s[NOISE_PUBLIC_KEY_LEN];
+ u8 e[NOISE_PUBLIC_KEY_LEN];
+ u8 t[NOISE_TIMESTAMP_LEN];
+
+ down_read(&wg->static_identity.lock);
+ if (unlikely(!wg->static_identity.has_identity))
+ goto out;
+
+ handshake_init(chaining_key, hash, wg->static_identity.static_public);
+
+ /* e */
+ message_ephemeral(e, src->unencrypted_ephemeral, chaining_key, hash);
+
+ /* es */
+ if (!mix_dh(chaining_key, key, wg->static_identity.static_private, e))
+ goto out;
+
+ /* s */
+ if (!message_decrypt(s, src->encrypted_static,
+ sizeof(src->encrypted_static), key, hash))
+ goto out;
+
+ /* Lookup which peer we're actually talking to */
+ peer = pubkey_hashtable_lookup(&wg->peer_hashtable, s);
+ if (!peer)
+ goto out;
+ handshake = &peer->handshake;
+
+ /* ss */
+ kdf(chaining_key, key, NULL, handshake->precomputed_static_static,
+ NOISE_HASH_LEN, NOISE_SYMMETRIC_KEY_LEN, 0, NOISE_PUBLIC_KEY_LEN,
+ chaining_key);
+
+ /* {t} */
+ if (!message_decrypt(t, src->encrypted_timestamp,
+ sizeof(src->encrypted_timestamp), key, hash))
+ goto out;
+
+ down_read(&handshake->lock);
+ replay_attack = memcmp(t, handshake->latest_timestamp,
+ NOISE_TIMESTAMP_LEN) <= 0;
+ flood_attack = handshake->last_initiation_consumption +
+ NSEC_PER_SEC / INITIATIONS_PER_SECOND >
+ ktime_get_boot_fast_ns();
+ up_read(&handshake->lock);
+ if (replay_attack || flood_attack)
+ goto out;
+
+ /* Success! Copy everything to peer */
+ down_write(&handshake->lock);
+ memcpy(handshake->remote_ephemeral, e, NOISE_PUBLIC_KEY_LEN);
+ memcpy(handshake->latest_timestamp, t, NOISE_TIMESTAMP_LEN);
+ memcpy(handshake->hash, hash, NOISE_HASH_LEN);
+ memcpy(handshake->chaining_key, chaining_key, NOISE_HASH_LEN);
+ handshake->remote_index = src->sender_index;
+ handshake->last_initiation_consumption = ktime_get_boot_fast_ns();
+ handshake->state = HANDSHAKE_CONSUMED_INITIATION;
+ up_write(&handshake->lock);
+ ret_peer = peer;
+
+out:
+ memzero_explicit(key, NOISE_SYMMETRIC_KEY_LEN);
+ memzero_explicit(hash, NOISE_HASH_LEN);
+ memzero_explicit(chaining_key, NOISE_HASH_LEN);
+ up_read(&wg->static_identity.lock);
+ if (!ret_peer)
+ peer_put(peer);
+ return ret_peer;
+}
+
+bool noise_handshake_create_response(struct message_handshake_response *dst,
+ struct noise_handshake *handshake)
+{
+ bool ret = false;
+ u8 key[NOISE_SYMMETRIC_KEY_LEN];
+
+ /* We need to wait for crng _before_ taking any locks, since
+ * curve25519_generate_secret uses get_random_bytes_wait.
+ */
+ wait_for_random_bytes();
+
+ down_read(&handshake->static_identity->lock);
+ down_write(&handshake->lock);
+
+ if (handshake->state != HANDSHAKE_CONSUMED_INITIATION)
+ goto out;
+
+ dst->header.type = cpu_to_le32(MESSAGE_HANDSHAKE_RESPONSE);
+ dst->receiver_index = handshake->remote_index;
+
+ /* e */
+ curve25519_generate_secret(handshake->ephemeral_private);
+ if (!curve25519_generate_public(dst->unencrypted_ephemeral,
+ handshake->ephemeral_private))
+ goto out;
+ message_ephemeral(dst->unencrypted_ephemeral,
+ dst->unencrypted_ephemeral, handshake->chaining_key,
+ handshake->hash);
+
+ /* ee */
+ if (!mix_dh(handshake->chaining_key, NULL, handshake->ephemeral_private,
+ handshake->remote_ephemeral))
+ goto out;
+
+ /* se */
+ if (!mix_dh(handshake->chaining_key, NULL, handshake->ephemeral_private,
+ handshake->remote_static))
+ goto out;
+
+ /* psk */
+ mix_psk(handshake->chaining_key, handshake->hash, key,
+ handshake->preshared_key);
+
+ /* {} */
+ message_encrypt(dst->encrypted_nothing, NULL, 0, key, handshake->hash);
+
+ dst->sender_index = index_hashtable_insert(
+ &handshake->entry.peer->device->index_hashtable,
+ &handshake->entry);
+
+ handshake->state = HANDSHAKE_CREATED_RESPONSE;
+ ret = true;
+
+out:
+ up_write(&handshake->lock);
+ up_read(&handshake->static_identity->lock);
+ memzero_explicit(key, NOISE_SYMMETRIC_KEY_LEN);
+ return ret;
+}
+
+struct wireguard_peer *
+noise_handshake_consume_response(struct message_handshake_response *src,
+ struct wireguard_device *wg)
+{
+ struct noise_handshake *handshake;
+ struct wireguard_peer *peer = NULL, *ret_peer = NULL;
+ u8 key[NOISE_SYMMETRIC_KEY_LEN];
+ u8 hash[NOISE_HASH_LEN];
+ u8 chaining_key[NOISE_HASH_LEN];
+ u8 e[NOISE_PUBLIC_KEY_LEN];
+ u8 ephemeral_private[NOISE_PUBLIC_KEY_LEN];
+ u8 static_private[NOISE_PUBLIC_KEY_LEN];
+ enum noise_handshake_state state = HANDSHAKE_ZEROED;
+
+ down_read(&wg->static_identity.lock);
+
+ if (unlikely(!wg->static_identity.has_identity))
+ goto out;
+
+ handshake = (struct noise_handshake *)index_hashtable_lookup(
+ &wg->index_hashtable, INDEX_HASHTABLE_HANDSHAKE,
+ src->receiver_index, &peer);
+ if (unlikely(!handshake))
+ goto out;
+
+ down_read(&handshake->lock);
+ state = handshake->state;
+ memcpy(hash, handshake->hash, NOISE_HASH_LEN);
+ memcpy(chaining_key, handshake->chaining_key, NOISE_HASH_LEN);
+ memcpy(ephemeral_private, handshake->ephemeral_private,
+ NOISE_PUBLIC_KEY_LEN);
+ up_read(&handshake->lock);
+
+ if (state != HANDSHAKE_CREATED_INITIATION)
+ goto fail;
+
+ /* e */
+ message_ephemeral(e, src->unencrypted_ephemeral, chaining_key, hash);
+
+ /* ee */
+ if (!mix_dh(chaining_key, NULL, ephemeral_private, e))
+ goto fail;
+
+ /* se */
+ if (!mix_dh(chaining_key, NULL, wg->static_identity.static_private, e))
+ goto fail;
+
+ /* psk */
+ mix_psk(chaining_key, hash, key, handshake->preshared_key);
+
+ /* {} */
+ if (!message_decrypt(NULL, src->encrypted_nothing,
+ sizeof(src->encrypted_nothing), key, hash))
+ goto fail;
+
+ /* Success! Copy everything to peer */
+ down_write(&handshake->lock);
+ /* It's important to check that the state is still the same, while we
+ * have an exclusive lock.
+ */
+ if (handshake->state != state) {
+ up_write(&handshake->lock);
+ goto fail;
+ }
+ memcpy(handshake->remote_ephemeral, e, NOISE_PUBLIC_KEY_LEN);
+ memcpy(handshake->hash, hash, NOISE_HASH_LEN);
+ memcpy(handshake->chaining_key, chaining_key, NOISE_HASH_LEN);
+ handshake->remote_index = src->sender_index;
+ handshake->state = HANDSHAKE_CONSUMED_RESPONSE;
+ up_write(&handshake->lock);
+ ret_peer = peer;
+ goto out;
+
+fail:
+ peer_put(peer);
+out:
+ memzero_explicit(key, NOISE_SYMMETRIC_KEY_LEN);
+ memzero_explicit(hash, NOISE_HASH_LEN);
+ memzero_explicit(chaining_key, NOISE_HASH_LEN);
+ memzero_explicit(ephemeral_private, NOISE_PUBLIC_KEY_LEN);
+ memzero_explicit(static_private, NOISE_PUBLIC_KEY_LEN);
+ up_read(&wg->static_identity.lock);
+ return ret_peer;
+}
+
+bool noise_handshake_begin_session(struct noise_handshake *handshake,
+ struct noise_keypairs *keypairs)
+{
+ struct noise_keypair *new_keypair;
+ bool ret = false;
+
+ down_write(&handshake->lock);
+ if (handshake->state != HANDSHAKE_CREATED_RESPONSE &&
+ handshake->state != HANDSHAKE_CONSUMED_RESPONSE)
+ goto out;
+
+ new_keypair = keypair_create(handshake->entry.peer);
+ if (!new_keypair)
+ goto out;
+ new_keypair->i_am_the_initiator = handshake->state ==
+ HANDSHAKE_CONSUMED_RESPONSE;
+ new_keypair->remote_index = handshake->remote_index;
+
+ if (new_keypair->i_am_the_initiator)
+ derive_keys(&new_keypair->sending, &new_keypair->receiving,
+ handshake->chaining_key);
+ else
+ derive_keys(&new_keypair->receiving, &new_keypair->sending,
+ handshake->chaining_key);
+
+ handshake_zero(handshake);
+ rcu_read_lock_bh();
+ if (likely(!container_of(handshake, struct wireguard_peer,
+ handshake)->is_dead)) {
+ add_new_keypair(keypairs, new_keypair);
+ net_dbg_ratelimited("%s: Keypair %llu created for peer %llu\n",
+ handshake->entry.peer->device->dev->name,
+ new_keypair->internal_id,
+ handshake->entry.peer->internal_id);
+ ret = index_hashtable_replace(
+ &handshake->entry.peer->device->index_hashtable,
+ &handshake->entry, &new_keypair->entry);
+ } else
+ kzfree(new_keypair);
+ rcu_read_unlock_bh();
+
+out:
+ up_write(&handshake->lock);
+ return ret;
+}
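Editorial illustration, not part of the patch: tai64n_now() above writes a
12-byte label (8 bytes of big-endian seconds offset by 0x400000000000000a,
then 4 bytes of big-endian nanoseconds), and
noise_handshake_consume_initiation() detects replays by comparing two such
labels with memcmp(). The userspace sketch below shows why a bytewise
comparison is sufficient: with both fields stored big-endian, lexicographic
byte order matches chronological order. The names tai64n_encode() and
tai64n_is_newer() are hypothetical and exist only for this example; the base
constant is copied from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TAI64N_LEN 12 /* 8 bytes of seconds + 4 bytes of nanoseconds */

/* Store v most-significant byte first. */
static void put_be64(uint8_t *out, uint64_t v)
{
	for (int i = 7; i >= 0; --i) {
		out[i] = (uint8_t)(v & 0xff);
		v >>= 8;
	}
}

static void put_be32(uint8_t *out, uint32_t v)
{
	for (int i = 3; i >= 0; --i) {
		out[i] = (uint8_t)(v & 0xff);
		v >>= 8;
	}
}

/* Hypothetical helper: same layout as tai64n_now(), but with the time
 * passed in so the result is deterministic.
 */
static void tai64n_encode(uint8_t out[TAI64N_LEN], uint64_t sec, uint32_t nsec)
{
	assert(nsec < 1000000000u);
	put_be64(out, 0x400000000000000aULL + sec);
	put_be32(out + 8, nsec);
}

/* Hypothetical helper: nonzero if a is strictly later than b. The patch
 * treats "memcmp(t, latest) <= 0" as a replay, which is the negation.
 */
static int tai64n_is_newer(const uint8_t a[TAI64N_LEN],
			   const uint8_t b[TAI64N_LEN])
{
	return memcmp(a, b, TAI64N_LEN) > 0;
}
```

A receiver that stores only the greatest label seen so far can therefore
reject any initiation whose label is not strictly newer, without decoding
the fields.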
diff --git a/drivers/net/wireguard/noise.h b/drivers/net/wireguard/noise.h
new file mode 100644
index 000000000000..6a563ce41750
--- /dev/null
+++ b/drivers/net/wireguard/noise.h
@@ -0,0 +1,129 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+#ifndef _WG_NOISE_H
+#define _WG_NOISE_H
+
+#include "messages.h"
+#include "hashtables.h"
+
+#include <linux/types.h>
+#include <linux/spinlock.h>
+#include <linux/atomic.h>
+#include <linux/rwsem.h>
+#include <linux/mutex.h>
+#include <linux/ktime.h>
+#include <linux/kref.h>
+
+union noise_counter {
+ struct {
+ u64 counter;
+ unsigned long backtrack[COUNTER_BITS_TOTAL / BITS_PER_LONG];
+ spinlock_t lock;
+ } receive;
+ atomic64_t counter;
+};
+
+struct noise_symmetric_key {
+ u8 key[NOISE_SYMMETRIC_KEY_LEN];
+ union noise_counter counter;
+ u64 birthdate;
+ bool is_valid;
+};
+
+struct noise_keypair {
+ struct index_hashtable_entry entry;
+ struct noise_symmetric_key sending;
+ struct noise_symmetric_key receiving;
+ __le32 remote_index;
+ bool i_am_the_initiator;
+ struct kref refcount;
+ struct rcu_head rcu;
+ u64 internal_id;
+};
+
+struct noise_keypairs {
+ struct noise_keypair __rcu *current_keypair;
+ struct noise_keypair __rcu *previous_keypair;
+ struct noise_keypair __rcu *next_keypair;
+ spinlock_t keypair_update_lock;
+};
+
+struct noise_static_identity {
+ u8 static_public[NOISE_PUBLIC_KEY_LEN];
+ u8 static_private[NOISE_PUBLIC_KEY_LEN];
+ struct rw_semaphore lock;
+ bool has_identity;
+};
+
+enum noise_handshake_state {
+ HANDSHAKE_ZEROED,
+ HANDSHAKE_CREATED_INITIATION,
+ HANDSHAKE_CONSUMED_INITIATION,
+ HANDSHAKE_CREATED_RESPONSE,
+ HANDSHAKE_CONSUMED_RESPONSE
+};
+
+struct noise_handshake {
+ struct index_hashtable_entry entry;
+
+ enum noise_handshake_state state;
+ u64 last_initiation_consumption;
+
+ struct noise_static_identity *static_identity;
+
+ u8 ephemeral_private[NOISE_PUBLIC_KEY_LEN];
+ u8 remote_static[NOISE_PUBLIC_KEY_LEN];
+ u8 remote_ephemeral[NOISE_PUBLIC_KEY_LEN];
+ u8 precomputed_static_static[NOISE_PUBLIC_KEY_LEN];
+
+ u8 preshared_key[NOISE_SYMMETRIC_KEY_LEN];
+
+ u8 hash[NOISE_HASH_LEN];
+ u8 chaining_key[NOISE_HASH_LEN];
+
+ u8 latest_timestamp[NOISE_TIMESTAMP_LEN];
+ __le32 remote_index;
+
+ /* Protects all but the members immutable after noise_handshake_init:
+ * remote_static, precomputed_static_static, static_identity. */
+ struct rw_semaphore lock;
+};
+
+struct wireguard_device;
+
+void noise_init(void);
+bool noise_handshake_init(struct noise_handshake *handshake,
+ struct noise_static_identity *static_identity,
+ const u8 peer_public_key[NOISE_PUBLIC_KEY_LEN],
+ const u8 peer_preshared_key[NOISE_SYMMETRIC_KEY_LEN],
+ struct wireguard_peer *peer);
+void noise_handshake_clear(struct noise_handshake *handshake);
+void noise_keypair_put(struct noise_keypair *keypair, bool unreference_now);
+struct noise_keypair *noise_keypair_get(struct noise_keypair *keypair);
+void noise_keypairs_clear(struct noise_keypairs *keypairs);
+bool noise_received_with_keypair(struct noise_keypairs *keypairs,
+ struct noise_keypair *received_keypair);
+
+void noise_set_static_identity_private_key(
+ struct noise_static_identity *static_identity,
+ const u8 private_key[NOISE_PUBLIC_KEY_LEN]);
+bool noise_precompute_static_static(struct wireguard_peer *peer);
+
+bool noise_handshake_create_initiation(struct message_handshake_initiation *dst,
+ struct noise_handshake *handshake);
+struct wireguard_peer *
+noise_handshake_consume_initiation(struct message_handshake_initiation *src,
+ struct wireguard_device *wg);
+
+bool noise_handshake_create_response(struct message_handshake_response *dst,
+ struct noise_handshake *handshake);
+struct wireguard_peer *
+noise_handshake_consume_response(struct message_handshake_response *src,
+ struct wireguard_device *wg);
+
+bool noise_handshake_begin_session(struct noise_handshake *handshake,
+ struct noise_keypairs *keypairs);
+
+#endif /* _WG_NOISE_H */
diff --git a/drivers/net/wireguard/peer.c b/drivers/net/wireguard/peer.c
new file mode 100644
index 000000000000..ca7981cf77c1
--- /dev/null
+++ b/drivers/net/wireguard/peer.c
@@ -0,0 +1,191 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#include "peer.h"
+#include "device.h"
+#include "queueing.h"
+#include "timers.h"
+#include "hashtables.h"
+#include "noise.h"
+
+#include <linux/kref.h>
+#include <linux/lockdep.h>
+#include <linux/rcupdate.h>
+#include <linux/list.h>
+
+static atomic64_t peer_counter = ATOMIC64_INIT(0);
+
+struct wireguard_peer *
+peer_create(struct wireguard_device *wg,
+ const u8 public_key[NOISE_PUBLIC_KEY_LEN],
+ const u8 preshared_key[NOISE_SYMMETRIC_KEY_LEN])
+{
+ struct wireguard_peer *peer;
+
+ lockdep_assert_held(&wg->device_update_lock);
+
+ if (wg->num_peers >= MAX_PEERS_PER_DEVICE)
+ return NULL;
+
+ peer = kzalloc(sizeof(*peer), GFP_KERNEL);
+ if (unlikely(!peer))
+ return NULL;
+ peer->device = wg;
+
+ if (!noise_handshake_init(&peer->handshake, &wg->static_identity,
+ public_key, preshared_key, peer))
+ goto err_1;
+ if (dst_cache_init(&peer->endpoint_cache, GFP_KERNEL))
+ goto err_1;
+ if (packet_queue_init(&peer->tx_queue, packet_tx_worker, false,
+ MAX_QUEUED_PACKETS))
+ goto err_2;
+ if (packet_queue_init(&peer->rx_queue, NULL, false, MAX_QUEUED_PACKETS))
+ goto err_3;
+
+ peer->internal_id = atomic64_inc_return(&peer_counter);
+ peer->serial_work_cpu = nr_cpumask_bits;
+ cookie_init(&peer->latest_cookie);
+ timers_init(peer);
+ cookie_checker_precompute_peer_keys(peer);
+ spin_lock_init(&peer->keypairs.keypair_update_lock);
+ INIT_WORK(&peer->transmit_handshake_work, packet_handshake_send_worker);
+ rwlock_init(&peer->endpoint_lock);
+ kref_init(&peer->refcount);
+ skb_queue_head_init(&peer->staged_packet_queue);
+ atomic64_set(&peer->last_sent_handshake,
+ ktime_get_boot_fast_ns() -
+ (u64)(REKEY_TIMEOUT + 1) * NSEC_PER_SEC);
+ set_bit(NAPI_STATE_NO_BUSY_POLL, &peer->napi.state);
+ netif_napi_add(wg->dev, &peer->napi, packet_rx_poll, NAPI_POLL_WEIGHT);
+ napi_enable(&peer->napi);
+ list_add_tail(&peer->peer_list, &wg->peer_list);
+ pubkey_hashtable_add(&wg->peer_hashtable, peer);
+ ++wg->num_peers;
+ pr_debug("%s: Peer %llu created\n", wg->dev->name, peer->internal_id);
+ return peer;
+
+err_3:
+ packet_queue_free(&peer->tx_queue, false);
+err_2:
+ dst_cache_destroy(&peer->endpoint_cache);
+err_1:
+ kfree(peer);
+ return NULL;
+}
+
+struct wireguard_peer *peer_get_maybe_zero(struct wireguard_peer *peer)
+{
+ RCU_LOCKDEP_WARN(!rcu_read_lock_bh_held(),
+ "Taking peer reference without holding the RCU read lock");
+ if (unlikely(!peer || !kref_get_unless_zero(&peer->refcount)))
+ return NULL;
+ return peer;
+}
+
+/* We have a separate "remove" function to get rid of the final reference
+ * because peer_list, clearing handshakes, and flushing all require mutexes,
+ * which may sleep, and sleeping is only allowed in certain contexts.
+ */
+void peer_remove(struct wireguard_peer *peer)
+{
+ if (unlikely(!peer))
+ return;
+ lockdep_assert_held(&peer->device->device_update_lock);
+
+ /* Remove from configuration-time lookup structures so new packets
+ * can't enter.
+ */
+ list_del_init(&peer->peer_list);
+ allowedips_remove_by_peer(&peer->device->peer_allowedips, peer,
+ &peer->device->device_update_lock);
+ pubkey_hashtable_remove(&peer->device->peer_hashtable, peer);
+
+ /* Mark as dead, so that we don't allow jumping contexts after. */
+ WRITE_ONCE(peer->is_dead, true);
+ synchronize_rcu_bh();
+
+ /* Now that no more keypairs can be created for this peer, we destroy
+ * existing ones.
+ */
+ noise_keypairs_clear(&peer->keypairs);
+
+ /* Destroy all ongoing timers that were in-flight at the beginning of
+ * this function.
+ */
+ timers_stop(peer);
+
+ /* The transition between packet encryption/decryption queues isn't
+ * guarded by is_dead, but each reference's life is strictly bounded by
+ * two generations: once for parallel crypto and once for serial
+ * ingestion, so we can simply flush twice, and be sure that we no
+ * longer have references inside these queues.
+ */
+
+ /* a) For encrypt/decrypt. */
+ flush_workqueue(peer->device->packet_crypt_wq);
+ /* b.1) For send (but not receive, since that's napi). */
+ flush_workqueue(peer->device->packet_crypt_wq);
+ /* b.2.1) For receive (but not send, since that's wq). */
+ napi_disable(&peer->napi);
+ /* b.2.2) It's now safe to remove the napi struct, which must be done
+ * here from process context.
+ */
+ netif_napi_del(&peer->napi);
+
+ /* Ensure any workstructs we own (like transmit_handshake_work or
+ * clear_peer_work) are no longer in use.
+ */
+ flush_workqueue(peer->device->handshake_send_wq);
+
+ --peer->device->num_peers;
+ peer_put(peer);
+}
+
+static void rcu_release(struct rcu_head *rcu)
+{
+ struct wireguard_peer *peer =
+ container_of(rcu, struct wireguard_peer, rcu);
+ dst_cache_destroy(&peer->endpoint_cache);
+ packet_queue_free(&peer->rx_queue, false);
+ packet_queue_free(&peer->tx_queue, false);
+ kzfree(peer);
+}
+
+static void kref_release(struct kref *refcount)
+{
+ struct wireguard_peer *peer =
+ container_of(refcount, struct wireguard_peer, refcount);
+ pr_debug("%s: Peer %llu (%pISpfsc) destroyed\n",
+ peer->device->dev->name, peer->internal_id,
+ &peer->endpoint.addr);
+ /* Remove ourself from dynamic runtime lookup structures, now that the
+ * last reference is gone.
+ */
+ index_hashtable_remove(&peer->device->index_hashtable,
+ &peer->handshake.entry);
+ /* Remove any lingering packets that didn't have a chance to be
+ * transmitted.
+ */
+ skb_queue_purge(&peer->staged_packet_queue);
+ /* Free the memory used. */
+ call_rcu_bh(&peer->rcu, rcu_release);
+}
+
+void peer_put(struct wireguard_peer *peer)
+{
+ if (unlikely(!peer))
+ return;
+ kref_put(&peer->refcount, kref_release);
+}
+
+void peer_remove_all(struct wireguard_device *wg)
+{
+ struct wireguard_peer *peer, *temp;
+
+ lockdep_assert_held(&wg->device_update_lock);
+ list_for_each_entry_safe (peer, temp, &wg->peer_list, peer_list)
+ peer_remove(peer);
+}
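Editorial illustration, not part of the patch: peer_get_maybe_zero() above
uses kref_get_unless_zero() so that a lookup racing with the final
peer_put() cannot revive a peer whose release has already begun. The
sketch below is a rough userspace analogue built on C11 atomics. The names
struct ref, ref_init(), ref_get_unless_zero() and ref_put() are
hypothetical; this is not the kernel's kref implementation, only the same
idea in miniature.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct ref {
	atomic_uint count;
};

/* A new object starts with one reference, owned by its creator. */
static void ref_init(struct ref *r)
{
	atomic_init(&r->count, 1);
}

/* Take a reference only if the object is still alive. Returns false once
 * the count has reached zero, so a dying object is never resurrected.
 */
static bool ref_get_unless_zero(struct ref *r)
{
	unsigned int old = atomic_load(&r->count);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference. Returns true if this was the last one, in which case
 * the caller is responsible for releasing the object.
 */
static bool ref_put(struct ref *r)
{
	unsigned int old = atomic_fetch_sub(&r->count, 1);

	assert(old != 0); /* dropping a reference nobody holds is a bug */
	return old == 1;
}
```

In the patch the actual free is additionally deferred through
call_rcu_bh(), so readers inside an RCU read-side section can still
safely attempt the conditional get on a peer they found via a lookup.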
diff --git a/drivers/net/wireguard/peer.h b/drivers/net/wireguard/peer.h
new file mode 100644
index 000000000000..5613ccc2e9c2
--- /dev/null
+++ b/drivers/net/wireguard/peer.h
@@ -0,0 +1,87 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#ifndef _WG_PEER_H
+#define _WG_PEER_H
+
+#include "device.h"
+#include "noise.h"
+#include "cookie.h"
+
+#include <linux/types.h>
+#include <linux/netfilter.h>
+#include <linux/spinlock.h>
+#include <linux/kref.h>
+#include <net/dst_cache.h>
+
+struct wireguard_device;
+
+struct endpoint {
+ union {
+ struct sockaddr addr;
+ struct sockaddr_in addr4;
+ struct sockaddr_in6 addr6;
+ };
+ union {
+ struct {
+ struct in_addr src4;
+ /* Essentially the same as addr6.sin6_scope_id */
+ int src_if4;
+ };
+ struct in6_addr src6;
+ };
+};
+
+struct wireguard_peer {
+ struct wireguard_device *device;
+ struct crypt_queue tx_queue, rx_queue;
+ struct sk_buff_head staged_packet_queue;
+ int serial_work_cpu;
+ struct noise_keypairs keypairs;
+ struct endpoint endpoint;
+ struct dst_cache endpoint_cache;
+ rwlock_t endpoint_lock;
+ struct noise_handshake handshake;
+ atomic64_t last_sent_handshake;
+ struct work_struct transmit_handshake_work, clear_peer_work;
+ struct cookie latest_cookie;
+ struct hlist_node pubkey_hash;
+ u64 rx_bytes, tx_bytes;
+ struct timer_list timer_retransmit_handshake, timer_send_keepalive;
+ struct timer_list timer_new_handshake, timer_zero_key_material;
+ struct timer_list timer_persistent_keepalive;
+ unsigned int timer_handshake_attempts;
+ u16 persistent_keepalive_interval;
+ bool timers_enabled, timer_need_another_keepalive;
+ bool sent_lastminute_handshake;
+ struct timespec walltime_last_handshake;
+ struct kref refcount;
+ struct rcu_head rcu;
+ struct list_head peer_list;
+ u64 internal_id;
+ struct napi_struct napi;
+ bool is_dead;
+};
+
+struct wireguard_peer *
+peer_create(struct wireguard_device *wg,
+ const u8 public_key[NOISE_PUBLIC_KEY_LEN],
+ const u8 preshared_key[NOISE_SYMMETRIC_KEY_LEN]);
+
+struct wireguard_peer *__must_check
+peer_get_maybe_zero(struct wireguard_peer *peer);
+static inline struct wireguard_peer *peer_get(struct wireguard_peer *peer)
+{
+ kref_get(&peer->refcount);
+ return peer;
+}
+void peer_put(struct wireguard_peer *peer);
+void peer_remove(struct wireguard_peer *peer);
+void peer_remove_all(struct wireguard_device *wg);
+
+struct wireguard_peer *peer_lookup_by_index(struct wireguard_device *wg,
+ u32 index);
+
+#endif /* _WG_PEER_H */
diff --git a/drivers/net/wireguard/queueing.c b/drivers/net/wireguard/queueing.c
new file mode 100644
index 000000000000..9ec6588e3bf1
--- /dev/null
+++ b/drivers/net/wireguard/queueing.c
@@ -0,0 +1,52 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#include "queueing.h"
+
+struct multicore_worker __percpu *
+packet_alloc_percpu_multicore_worker(work_func_t function, void *ptr)
+{
+ int cpu;
+ struct multicore_worker __percpu *worker =
+ alloc_percpu(struct multicore_worker);
+
+ if (!worker)
+ return NULL;
+
+ for_each_possible_cpu (cpu) {
+ per_cpu_ptr(worker, cpu)->ptr = ptr;
+ INIT_WORK(&per_cpu_ptr(worker, cpu)->work, function);
+ }
+ return worker;
+}
+
+int packet_queue_init(struct crypt_queue *queue, work_func_t function,
+ bool multicore, unsigned int len)
+{
+ int ret;
+
+ memset(queue, 0, sizeof(*queue));
+ ret = ptr_ring_init(&queue->ring, len, GFP_KERNEL);
+ if (ret)
+ return ret;
+ if (function) {
+ if (multicore) {
+ queue->worker = packet_alloc_percpu_multicore_worker(
+ function, queue);
+ if (!queue->worker)
+ return -ENOMEM;
+ } else
+ INIT_WORK(&queue->work, function);
+ }
+ return 0;
+}
+
+void packet_queue_free(struct crypt_queue *queue, bool multicore)
+{
+ if (multicore)
+ free_percpu(queue->worker);
+ WARN_ON(!__ptr_ring_empty(&queue->ring));
+ ptr_ring_cleanup(&queue->ring, NULL);
+}
diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h
new file mode 100644
index 000000000000..66b7134a968a
--- /dev/null
+++ b/drivers/net/wireguard/queueing.h
@@ -0,0 +1,193 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#ifndef _WG_QUEUEING_H
+#define _WG_QUEUEING_H
+
+#include "peer.h"
+#include <linux/types.h>
+#include <linux/skbuff.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+
+struct wireguard_device;
+struct wireguard_peer;
+struct multicore_worker;
+struct crypt_queue;
+struct sk_buff;
+
+/* queueing.c APIs: */
+int packet_queue_init(struct crypt_queue *queue, work_func_t function,
+ bool multicore, unsigned int len);
+void packet_queue_free(struct crypt_queue *queue, bool multicore);
+struct multicore_worker __percpu *
+packet_alloc_percpu_multicore_worker(work_func_t function, void *ptr);
+
+/* receive.c APIs: */
+void packet_receive(struct wireguard_device *wg, struct sk_buff *skb);
+void packet_handshake_receive_worker(struct work_struct *work);
+/* NAPI poll function: */
+int packet_rx_poll(struct napi_struct *napi, int budget);
+/* Workqueue worker: */
+void packet_decrypt_worker(struct work_struct *work);
+
+/* send.c APIs: */
+void packet_send_queued_handshake_initiation(struct wireguard_peer *peer,
+ bool is_retry);
+void packet_send_handshake_response(struct wireguard_peer *peer);
+void packet_send_handshake_cookie(struct wireguard_device *wg,
+ struct sk_buff *initiating_skb,
+ __le32 sender_index);
+void packet_send_keepalive(struct wireguard_peer *peer);
+void packet_send_staged_packets(struct wireguard_peer *peer);
+/* Workqueue workers: */
+void packet_handshake_send_worker(struct work_struct *work);
+void packet_tx_worker(struct work_struct *work);
+void packet_encrypt_worker(struct work_struct *work);
+
+enum packet_state {
+ PACKET_STATE_UNCRYPTED,
+ PACKET_STATE_CRYPTED,
+ PACKET_STATE_DEAD
+};
+
+struct packet_cb {
+ u64 nonce;
+ struct noise_keypair *keypair;
+ atomic_t state;
+ u32 mtu;
+ u8 ds;
+};
+
+#define PACKET_PEER(skb) (((struct packet_cb *)skb->cb)->keypair->entry.peer)
+#define PACKET_CB(skb) ((struct packet_cb *)skb->cb)
+
+/* Returns either the correct skb->protocol value, or 0 if invalid. */
+static inline __be16 skb_examine_untrusted_ip_hdr(struct sk_buff *skb)
+{
+ if (skb_network_header(skb) >= skb->head &&
+ (skb_network_header(skb) + sizeof(struct iphdr)) <=
+ skb_tail_pointer(skb) &&
+ ip_hdr(skb)->version == 4)
+ return htons(ETH_P_IP);
+ if (skb_network_header(skb) >= skb->head &&
+ (skb_network_header(skb) + sizeof(struct ipv6hdr)) <=
+ skb_tail_pointer(skb) &&
+ ipv6_hdr(skb)->version == 6)
+ return htons(ETH_P_IPV6);
+ return 0;
+}
+
+static inline void skb_reset(struct sk_buff *skb)
+{
+ const int pfmemalloc = skb->pfmemalloc;
+ skb_scrub_packet(skb, true);
+ memset(&skb->headers_start, 0,
+ offsetof(struct sk_buff, headers_end) -
+ offsetof(struct sk_buff, headers_start));
+ skb->pfmemalloc = pfmemalloc;
+ skb->queue_mapping = 0;
+ skb->nohdr = 0;
+ skb->peeked = 0;
+ skb->mac_len = 0;
+ skb->dev = NULL;
+#ifdef CONFIG_NET_SCHED
+ skb->tc_index = 0;
+ skb_reset_tc(skb);
+#endif
+ skb->hdr_len = skb_headroom(skb);
+ skb_reset_mac_header(skb);
+ skb_reset_network_header(skb);
+ skb_probe_transport_header(skb, 0);
+ skb_reset_inner_headers(skb);
+}
+
+static inline int cpumask_choose_online(int *stored_cpu, unsigned int id)
+{
+ unsigned int cpu = *stored_cpu, cpu_index, i;
+
+ if (unlikely(cpu == nr_cpumask_bits ||
+ !cpumask_test_cpu(cpu, cpu_online_mask))) {
+ cpu_index = id % cpumask_weight(cpu_online_mask);
+ cpu = cpumask_first(cpu_online_mask);
+ for (i = 0; i < cpu_index; ++i)
+ cpu = cpumask_next(cpu, cpu_online_mask);
+ *stored_cpu = cpu;
+ }
+ return cpu;
+}
+
+/* This function is racy, in the sense that next is unlocked, so it could return
+ * the same CPU twice. A race-free version of this would be to instead store an
+ * atomic sequence number, do an increment-and-return, and then iterate through
+ * every possible CPU until we get to that index -- choose_cpu. However that's
+ * a bit slower, and it doesn't seem like this potential race actually
+ * introduces any performance loss, so we live with it.
+ */
+static inline int cpumask_next_online(int *next)
+{
+ int cpu = *next;
+
+ while (unlikely(!cpumask_test_cpu(cpu, cpu_online_mask)))
+ cpu = cpumask_next(cpu, cpu_online_mask) % nr_cpumask_bits;
+ *next = cpumask_next(cpu, cpu_online_mask) % nr_cpumask_bits;
+ return cpu;
+}
+
+static inline int queue_enqueue_per_device_and_peer(
+ struct crypt_queue *device_queue, struct crypt_queue *peer_queue,
+ struct sk_buff *skb, struct workqueue_struct *wq, int *next_cpu)
+{
+ int cpu;
+
+ atomic_set_release(&PACKET_CB(skb)->state, PACKET_STATE_UNCRYPTED);
+ /* We first queue this up for peer ingestion, but the consumer will
+ * wait for the state to change to CRYPTED or DEAD before using it.
+ */
+ if (unlikely(ptr_ring_produce_bh(&peer_queue->ring, skb)))
+ return -ENOSPC;
+ /* Then we queue it up in the device queue, which consumes the
+ * packet as soon as it can.
+ */
+ cpu = cpumask_next_online(next_cpu);
+ if (unlikely(ptr_ring_produce_bh(&device_queue->ring, skb)))
+ return -EPIPE;
+ queue_work_on(cpu, wq, &per_cpu_ptr(device_queue->worker, cpu)->work);
+ return 0;
+}
+
+static inline void queue_enqueue_per_peer(struct crypt_queue *queue,
+ struct sk_buff *skb,
+ enum packet_state state)
+{
+ /* We take a reference, because as soon as we call atomic_set, the
+ * peer can be freed from below us.
+ */
+ struct wireguard_peer *peer = peer_get(PACKET_PEER(skb));
+ atomic_set_release(&PACKET_CB(skb)->state, state);
+ queue_work_on(cpumask_choose_online(&peer->serial_work_cpu,
+ peer->internal_id),
+ peer->device->packet_crypt_wq, &queue->work);
+ peer_put(peer);
+}
+
+static inline void queue_enqueue_per_peer_napi(struct crypt_queue *queue,
+ struct sk_buff *skb,
+ enum packet_state state)
+{
+ /* We take a reference, because as soon as we call atomic_set, the
+ * peer can be freed from below us.
+ */
+ struct wireguard_peer *peer = peer_get(PACKET_PEER(skb));
+ atomic_set_release(&PACKET_CB(skb)->state, state);
+ napi_schedule(&peer->napi);
+ peer_put(peer);
+}
+
+#ifdef DEBUG
+bool packet_counter_selftest(void);
+#endif
+
+#endif /* _WG_QUEUEING_H */
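Editorial illustration, not part of the patch: cpumask_next_online() above
hands out online CPUs in round-robin order, skipping offline ones and
wrapping at nr_cpumask_bits. The userspace sketch below models the same
walk with a plain 32-bit mask in place of cpu_online_mask. The names
mask_next() and next_online() are hypothetical, and the model assumes at
most 32 CPUs, at least one online CPU below nr_cpus, and *next < nr_cpus.

```c
#include <assert.h>
#include <stdint.h>

/* Model of cpumask_next(): first set bit strictly after cpu, or nr_cpus
 * if there is none.
 */
static unsigned int mask_next(unsigned int cpu, uint32_t online,
			      unsigned int nr_cpus)
{
	for (++cpu; cpu < nr_cpus; ++cpu) {
		if (online & (UINT32_C(1) << cpu))
			return cpu;
	}
	return nr_cpus;
}

/* Model of cpumask_next_online(): return the CPU stored in *next if it is
 * online (otherwise the next online one), then advance *next.
 */
static unsigned int next_online(uint32_t online, unsigned int nr_cpus,
				unsigned int *next)
{
	uint32_t valid = nr_cpus >= 32 ? UINT32_MAX
				       : (UINT32_C(1) << nr_cpus) - 1;
	unsigned int cpu = *next;

	assert(nr_cpus > 0 && nr_cpus <= 32 && cpu < nr_cpus);
	assert((online & valid) != 0); /* otherwise the loop never ends */

	while (!(online & (UINT32_C(1) << cpu)))
		cpu = mask_next(cpu, online, nr_cpus) % nr_cpus;
	*next = mask_next(cpu, online, nr_cpus) % nr_cpus;
	return cpu;
}
```

As the comment in the patch notes, the real function updates *next without
a lock, so two concurrent callers may occasionally be handed the same CPU;
the single-threaded model here does not reproduce that race.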
diff --git a/drivers/net/wireguard/ratelimiter.c b/drivers/net/wireguard/ratelimiter.c
new file mode 100644
index 000000000000..52381ee80663
--- /dev/null
+++ b/drivers/net/wireguard/ratelimiter.c
@@ -0,0 +1,220 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#include "ratelimiter.h"
+#include <linux/siphash.h>
+#include <linux/mm.h>
+#include <linux/slab.h>
+#include <net/ip.h>
+
+static struct kmem_cache *entry_cache;
+static hsiphash_key_t key;
+static spinlock_t table_lock = __SPIN_LOCK_UNLOCKED("ratelimiter_table_lock");
+static DEFINE_MUTEX(init_lock);
+static atomic64_t refcnt = ATOMIC64_INIT(0);
+static atomic_t total_entries = ATOMIC_INIT(0);
+static unsigned int max_entries, table_size;
+static void gc_entries(struct work_struct *);
+static DECLARE_DEFERRABLE_WORK(gc_work, gc_entries);
+static struct hlist_head *table_v4;
+#if IS_ENABLED(CONFIG_IPV6)
+static struct hlist_head *table_v6;
+#endif
+
+struct ratelimiter_entry {
+ u64 last_time_ns, tokens;
+ __be64 ip;
+ void *net;
+ spinlock_t lock;
+ struct hlist_node hash;
+ struct rcu_head rcu;
+};
+
+enum {
+ PACKETS_PER_SECOND = 20,
+ PACKETS_BURSTABLE = 5,
+ PACKET_COST = NSEC_PER_SEC / PACKETS_PER_SECOND,
+ TOKEN_MAX = PACKET_COST * PACKETS_BURSTABLE
+};
+
+static void entry_free(struct rcu_head *rcu)
+{
+ kmem_cache_free(entry_cache,
+ container_of(rcu, struct ratelimiter_entry, rcu));
+ atomic_dec(&total_entries);
+}
+
+static void entry_uninit(struct ratelimiter_entry *entry)
+{
+ hlist_del_rcu(&entry->hash);
+ call_rcu(&entry->rcu, entry_free);
+}
+
+/* Calling this function with a NULL work uninits all entries. */
+static void gc_entries(struct work_struct *work)
+{
+ const u64 now = ktime_get_boot_fast_ns();
+ struct ratelimiter_entry *entry;
+ struct hlist_node *temp;
+ unsigned int i;
+
+ for (i = 0; i < table_size; ++i) {
+ spin_lock(&table_lock);
+ hlist_for_each_entry_safe (entry, temp, &table_v4[i], hash) {
+ if (unlikely(!work) ||
+ now - entry->last_time_ns > NSEC_PER_SEC)
+ entry_uninit(entry);
+ }
+#if IS_ENABLED(CONFIG_IPV6)
+ hlist_for_each_entry_safe (entry, temp, &table_v6[i], hash) {
+ if (unlikely(!work) ||
+ now - entry->last_time_ns > NSEC_PER_SEC)
+ entry_uninit(entry);
+ }
+#endif
+ spin_unlock(&table_lock);
+ if (likely(work))
+ cond_resched();
+ }
+ if (likely(work))
+ queue_delayed_work(system_power_efficient_wq, &gc_work, HZ);
+}
+
+bool ratelimiter_allow(struct sk_buff *skb, struct net *net)
+{
+ struct { __be64 ip; u32 net; } data = {
+ .net = (unsigned long)net & 0xffffffff };
+ struct ratelimiter_entry *entry;
+ struct hlist_head *bucket;
+
+ if (skb->protocol == htons(ETH_P_IP)) {
+ data.ip = (__force __be64)ip_hdr(skb)->saddr;
+ bucket = &table_v4[hsiphash(&data, sizeof(u32) * 3, &key) &
+ (table_size - 1)];
+ }
+#if IS_ENABLED(CONFIG_IPV6)
+ else if (skb->protocol == htons(ETH_P_IPV6)) {
+ memcpy(&data.ip, &ipv6_hdr(skb)->saddr,
+ sizeof(__be64)); /* Only 64 bits */
+ bucket = &table_v6[hsiphash(&data, sizeof(u32) * 3, &key) &
+ (table_size - 1)];
+ }
+#endif
+ else
+ return false;
+ rcu_read_lock();
+ hlist_for_each_entry_rcu (entry, bucket, hash) {
+ if (entry->net == net && entry->ip == data.ip) {
+ u64 now, tokens;
+ bool ret;
+ /* Quasi-inspired by nft_limit.c, but this is actually a
+ * slightly different algorithm. Namely, we incorporate
+ * the burst as part of the maximum tokens, rather than
+ * as part of the rate.
+ */
+ spin_lock(&entry->lock);
+ now = ktime_get_boot_fast_ns();
+ tokens = min_t(u64, TOKEN_MAX,
+ entry->tokens + now -
+ entry->last_time_ns);
+ entry->last_time_ns = now;
+ ret = tokens >= PACKET_COST;
+ entry->tokens = ret ? tokens - PACKET_COST : tokens;
+ spin_unlock(&entry->lock);
+ rcu_read_unlock();
+ return ret;
+ }
+ }
+ rcu_read_unlock();
+
+ if (atomic_inc_return(&total_entries) > max_entries)
+ goto err_oom;
+
+ entry = kmem_cache_alloc(entry_cache, GFP_KERNEL);
+ if (unlikely(!entry))
+ goto err_oom;
+
+ entry->net = net;
+ entry->ip = data.ip;
+ INIT_HLIST_NODE(&entry->hash);
+ spin_lock_init(&entry->lock);
+ entry->last_time_ns = ktime_get_boot_fast_ns();
+ entry->tokens = TOKEN_MAX - PACKET_COST;
+ spin_lock(&table_lock);
+ hlist_add_head_rcu(&entry->hash, bucket);
+ spin_unlock(&table_lock);
+ return true;
+
+err_oom:
+ atomic_dec(&total_entries);
+ return false;
+}
+
+int ratelimiter_init(void)
+{
+ mutex_lock(&init_lock);
+ if (atomic64_inc_return(&refcnt) != 1)
+ goto out;
+
+ entry_cache = KMEM_CACHE(ratelimiter_entry, 0);
+ if (!entry_cache)
+ goto err;
+
+ /* xt_hashlimit.c uses a slightly different algorithm for ratelimiting,
+ * but what it has in common with this code is a massive hashtable. So,
+ * we borrow its wisdom about good table sizes on different systems,
+ * dependent on RAM. The calculation below comes from there.
+ */
+ table_size = (totalram_pages > (1U << 30) / PAGE_SIZE) ? 8192 :
+ max_t(unsigned long, 16, roundup_pow_of_two(
+ (totalram_pages << PAGE_SHIFT) /
+ (1U << 14) / sizeof(struct hlist_head)));
+ max_entries = table_size * 8;
+
+ table_v4 = kvzalloc(table_size * sizeof(*table_v4), GFP_KERNEL);
+ if (unlikely(!table_v4))
+ goto err_kmemcache;
+
+#if IS_ENABLED(CONFIG_IPV6)
+ table_v6 = kvzalloc(table_size * sizeof(*table_v6), GFP_KERNEL);
+ if (unlikely(!table_v6)) {
+ kvfree(table_v4);
+ goto err_kmemcache;
+ }
+#endif
+
+ queue_delayed_work(system_power_efficient_wq, &gc_work, HZ);
+ get_random_bytes(&key, sizeof(key));
+out:
+ mutex_unlock(&init_lock);
+ return 0;
+
+err_kmemcache:
+ kmem_cache_destroy(entry_cache);
+err:
+ atomic64_dec(&refcnt);
+ mutex_unlock(&init_lock);
+ return -ENOMEM;
+}
+
+void ratelimiter_uninit(void)
+{
+ mutex_lock(&init_lock);
+ if (atomic64_dec_if_positive(&refcnt))
+ goto out;
+
+ cancel_delayed_work_sync(&gc_work);
+ gc_entries(NULL);
+ rcu_barrier();
+ kvfree(table_v4);
+#if IS_ENABLED(CONFIG_IPV6)
+ kvfree(table_v6);
+#endif
+ kmem_cache_destroy(entry_cache);
+out:
+ mutex_unlock(&init_lock);
+}
+
+#include "selftest/ratelimiter.h"
diff --git a/drivers/net/wireguard/ratelimiter.h b/drivers/net/wireguard/ratelimiter.h
new file mode 100644
index 000000000000..8931c0615374
--- /dev/null
+++ b/drivers/net/wireguard/ratelimiter.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#ifndef _WG_RATELIMITER_H
+#define _WG_RATELIMITER_H
+
+#include <linux/skbuff.h>
+
+int ratelimiter_init(void);
+void ratelimiter_uninit(void);
+bool ratelimiter_allow(struct sk_buff *skb, struct net *net);
+
+#ifdef DEBUG
+bool ratelimiter_selftest(void);
+#endif
+
+#endif /* _WG_RATELIMITER_H */
diff --git a/drivers/net/wireguard/receive.c b/drivers/net/wireguard/receive.c
new file mode 100644
index 000000000000..e5ce21703512
--- /dev/null
+++ b/drivers/net/wireguard/receive.c
@@ -0,0 +1,597 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#include "queueing.h"
+#include "device.h"
+#include "peer.h"
+#include "timers.h"
+#include "messages.h"
+#include "cookie.h"
+#include "socket.h"
+
+#include <linux/simd.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+#include <linux/udp.h>
+#include <net/ip_tunnels.h>
+
+/* Must be called with bh disabled. */
+static inline void rx_stats(struct wireguard_peer *peer, size_t len)
+{
+ struct pcpu_sw_netstats *tstats =
+ get_cpu_ptr(peer->device->dev->tstats);
+
+ u64_stats_update_begin(&tstats->syncp);
+ ++tstats->rx_packets;
+ tstats->rx_bytes += len;
+ peer->rx_bytes += len;
+ u64_stats_update_end(&tstats->syncp);
+ put_cpu_ptr(tstats);
+}
+
+#define SKB_TYPE_LE32(skb) (((struct message_header *)(skb)->data)->type)
+
+static inline size_t validate_header_len(struct sk_buff *skb)
+{
+ if (unlikely(skb->len < sizeof(struct message_header)))
+ return 0;
+ if (SKB_TYPE_LE32(skb) == cpu_to_le32(MESSAGE_DATA) &&
+ skb->len >= MESSAGE_MINIMUM_LENGTH)
+ return sizeof(struct message_data);
+ if (SKB_TYPE_LE32(skb) == cpu_to_le32(MESSAGE_HANDSHAKE_INITIATION) &&
+ skb->len == sizeof(struct message_handshake_initiation))
+ return sizeof(struct message_handshake_initiation);
+ if (SKB_TYPE_LE32(skb) == cpu_to_le32(MESSAGE_HANDSHAKE_RESPONSE) &&
+ skb->len == sizeof(struct message_handshake_response))
+ return sizeof(struct message_handshake_response);
+ if (SKB_TYPE_LE32(skb) == cpu_to_le32(MESSAGE_HANDSHAKE_COOKIE) &&
+ skb->len == sizeof(struct message_handshake_cookie))
+ return sizeof(struct message_handshake_cookie);
+ return 0;
+}
+
+static inline int skb_prepare_header(struct sk_buff *skb,
+ struct wireguard_device *wg)
+{
+ size_t data_offset, data_len, header_len;
+ struct udphdr *udp;
+
+ if (unlikely(skb_examine_untrusted_ip_hdr(skb) != skb->protocol ||
+ skb_transport_header(skb) < skb->head ||
+ (skb_transport_header(skb) + sizeof(struct udphdr)) >
+ skb_tail_pointer(skb)))
+ return -EINVAL; /* Bogus IP header */
+ udp = udp_hdr(skb);
+ data_offset = (u8 *)udp - skb->data;
+ if (unlikely(data_offset > U16_MAX ||
+ data_offset + sizeof(struct udphdr) > skb->len))
+ /* Packet has offset at impossible location or isn't big enough
+ * to have UDP fields.
+ */
+ return -EINVAL;
+ data_len = ntohs(udp->len);
+ if (unlikely(data_len < sizeof(struct udphdr) ||
+ data_len > skb->len - data_offset))
+ /* UDP packet is reporting too small of a size or lying about
+ * its size.
+ */
+ return -EINVAL;
+ data_len -= sizeof(struct udphdr);
+ data_offset = (u8 *)udp + sizeof(struct udphdr) - skb->data;
+ if (unlikely(!pskb_may_pull(skb,
+ data_offset + sizeof(struct message_header)) ||
+ pskb_trim(skb, data_len + data_offset) < 0))
+ return -EINVAL;
+ skb_pull(skb, data_offset);
+ if (unlikely(skb->len != data_len))
+ /* Final len does not agree with calculated len */
+ return -EINVAL;
+ header_len = validate_header_len(skb);
+ if (unlikely(!header_len))
+ return -EINVAL;
+ __skb_push(skb, data_offset);
+ if (unlikely(!pskb_may_pull(skb, data_offset + header_len)))
+ return -EINVAL;
+ __skb_pull(skb, data_offset);
+ return 0;
+}
+
+static void receive_handshake_packet(struct wireguard_device *wg,
+ struct sk_buff *skb)
+{
+ struct wireguard_peer *peer = NULL;
+ enum cookie_mac_state mac_state;
+ /* This is global, so that our load calculation applies to
+ * the whole system.
+ */
+ static u64 last_under_load;
+ bool packet_needs_cookie;
+ bool under_load;
+
+ if (SKB_TYPE_LE32(skb) == cpu_to_le32(MESSAGE_HANDSHAKE_COOKIE)) {
+ net_dbg_skb_ratelimited("%s: Receiving cookie response from %pISpfsc\n",
+ wg->dev->name, skb);
+ cookie_message_consume(
+ (struct message_handshake_cookie *)skb->data, wg);
+ return;
+ }
+
+ under_load = skb_queue_len(&wg->incoming_handshakes) >=
+ MAX_QUEUED_INCOMING_HANDSHAKES / 8;
+ if (under_load)
+ last_under_load = ktime_get_boot_fast_ns();
+ else if (last_under_load)
+ under_load = !has_expired(last_under_load, 1);
+ mac_state = cookie_validate_packet(&wg->cookie_checker, skb,
+ under_load);
+ if ((under_load && mac_state == VALID_MAC_WITH_COOKIE) ||
+ (!under_load && mac_state == VALID_MAC_BUT_NO_COOKIE))
+ packet_needs_cookie = false;
+ else if (under_load && mac_state == VALID_MAC_BUT_NO_COOKIE)
+ packet_needs_cookie = true;
+ else {
+ net_dbg_skb_ratelimited("%s: Invalid MAC of handshake, dropping packet from %pISpfsc\n",
+ wg->dev->name, skb);
+ return;
+ }
+
+ switch (SKB_TYPE_LE32(skb)) {
+ case cpu_to_le32(MESSAGE_HANDSHAKE_INITIATION): {
+ struct message_handshake_initiation *message =
+ (struct message_handshake_initiation *)skb->data;
+
+ if (packet_needs_cookie) {
+ packet_send_handshake_cookie(wg, skb,
+ message->sender_index);
+ return;
+ }
+ peer = noise_handshake_consume_initiation(message, wg);
+ if (unlikely(!peer)) {
+ net_dbg_skb_ratelimited("%s: Invalid handshake initiation from %pISpfsc\n",
+ wg->dev->name, skb);
+ return;
+ }
+ socket_set_peer_endpoint_from_skb(peer, skb);
+ net_dbg_ratelimited("%s: Receiving handshake initiation from peer %llu (%pISpfsc)\n",
+ wg->dev->name, peer->internal_id,
+ &peer->endpoint.addr);
+ packet_send_handshake_response(peer);
+ break;
+ }
+ case cpu_to_le32(MESSAGE_HANDSHAKE_RESPONSE): {
+ struct message_handshake_response *message =
+ (struct message_handshake_response *)skb->data;
+
+ if (packet_needs_cookie) {
+ packet_send_handshake_cookie(wg, skb,
+ message->sender_index);
+ return;
+ }
+ peer = noise_handshake_consume_response(message, wg);
+ if (unlikely(!peer)) {
+ net_dbg_skb_ratelimited("%s: Invalid handshake response from %pISpfsc\n",
+ wg->dev->name, skb);
+ return;
+ }
+ socket_set_peer_endpoint_from_skb(peer, skb);
+ net_dbg_ratelimited("%s: Receiving handshake response from peer %llu (%pISpfsc)\n",
+ wg->dev->name, peer->internal_id,
+ &peer->endpoint.addr);
+ if (noise_handshake_begin_session(&peer->handshake,
+ &peer->keypairs)) {
+ timers_session_derived(peer);
+ timers_handshake_complete(peer);
+ /* Calling this function will either send any existing
+ * packets in the queue and not send a keepalive, which
+ * is the best case, or, if there's nothing in the
+ * queue, it will send a keepalive, in order to give
+ * immediate confirmation of the session.
+ */
+ packet_send_keepalive(peer);
+ }
+ break;
+ }
+ }
+
+ if (unlikely(!peer)) {
+ WARN(1, "Somehow a wrong type of packet wound up in the handshake queue!\n");
+ return;
+ }
+
+ local_bh_disable();
+ rx_stats(peer, skb->len);
+ local_bh_enable();
+
+ timers_any_authenticated_packet_received(peer);
+ timers_any_authenticated_packet_traversal(peer);
+ peer_put(peer);
+}
+
+void packet_handshake_receive_worker(struct work_struct *work)
+{
+ struct wireguard_device *wg =
+ container_of(work, struct multicore_worker, work)->ptr;
+ struct sk_buff *skb;
+
+ while ((skb = skb_dequeue(&wg->incoming_handshakes)) != NULL) {
+ receive_handshake_packet(wg, skb);
+ dev_kfree_skb(skb);
+ cond_resched();
+ }
+}
+
+static inline void keep_key_fresh(struct wireguard_peer *peer)
+{
+ struct noise_keypair *keypair;
+ bool send = false;
+
+ if (peer->sent_lastminute_handshake)
+ return;
+
+ rcu_read_lock_bh();
+ keypair = rcu_dereference_bh(peer->keypairs.current_keypair);
+ if (likely(keypair && keypair->sending.is_valid) &&
+ keypair->i_am_the_initiator &&
+ unlikely(has_expired(keypair->sending.birthdate,
+ REJECT_AFTER_TIME - KEEPALIVE_TIMEOUT - REKEY_TIMEOUT)))
+ send = true;
+ rcu_read_unlock_bh();
+
+ if (send) {
+ peer->sent_lastminute_handshake = true;
+ packet_send_queued_handshake_initiation(peer, false);
+ }
+}
+
+static inline bool skb_decrypt(struct sk_buff *skb,
+ struct noise_symmetric_key *key,
+ simd_context_t simd_context)
+{
+ struct scatterlist sg[MAX_SKB_FRAGS * 2 + 1];
+ struct sk_buff *trailer;
+ unsigned int offset;
+ int num_frags;
+
+ if (unlikely(!key))
+ return false;
+
+ if (unlikely(!key->is_valid ||
+ has_expired(key->birthdate, REJECT_AFTER_TIME) ||
+ key->counter.receive.counter >= REJECT_AFTER_MESSAGES)) {
+ key->is_valid = false;
+ return false;
+ }
+
+ PACKET_CB(skb)->nonce =
+ le64_to_cpu(((struct message_data *)skb->data)->counter);
+
+ /* We ensure that the network header is part of the packet before we
+ * call skb_cow_data, so that no header data can be removed from the
+ * skb and the original endpoint can still be extracted later.
+ */
+ offset = skb->data - skb_network_header(skb);
+ skb_push(skb, offset);
+ num_frags = skb_cow_data(skb, 0, &trailer);
+ offset += sizeof(struct message_data);
+ skb_pull(skb, offset);
+ if (unlikely(num_frags < 0 || num_frags > ARRAY_SIZE(sg)))
+ return false;
+
+ sg_init_table(sg, num_frags);
+ if (skb_to_sgvec(skb, sg, 0, skb->len) <= 0)
+ return false;
+
+ if (!chacha20poly1305_decrypt_sg(sg, sg, skb->len, NULL, 0,
+ PACKET_CB(skb)->nonce, key->key,
+ simd_context))
+ return false;
+
+ /* Another ugly situation of pushing and pulling the header so as to
+ * keep endpoint information intact.
+ */
+ skb_push(skb, offset);
+ if (pskb_trim(skb, skb->len - noise_encrypted_len(0)))
+ return false;
+ skb_pull(skb, offset);
+
+ return true;
+}
+
+/* This is RFC6479, a replay detection bitmap algorithm that avoids bitshifts */
+static inline bool counter_validate(union noise_counter *counter,
+ u64 their_counter)
+{
+ unsigned long index, index_current, top, i;
+ bool ret = false;
+
+ spin_lock_bh(&counter->receive.lock);
+
+ if (unlikely(counter->receive.counter >= REJECT_AFTER_MESSAGES + 1 ||
+ their_counter >= REJECT_AFTER_MESSAGES))
+ goto out;
+
+ ++their_counter;
+
+ if (unlikely((COUNTER_WINDOW_SIZE + their_counter) <
+ counter->receive.counter))
+ goto out;
+
+ index = their_counter >> ilog2(BITS_PER_LONG);
+
+ if (likely(their_counter > counter->receive.counter)) {
+ index_current = counter->receive.counter >> ilog2(BITS_PER_LONG);
+ top = min_t(unsigned long, index - index_current,
+ COUNTER_BITS_TOTAL / BITS_PER_LONG);
+ for (i = 1; i <= top; ++i)
+ counter->receive.backtrack[(i + index_current) &
+ ((COUNTER_BITS_TOTAL / BITS_PER_LONG) - 1)] = 0;
+ counter->receive.counter = their_counter;
+ }
+
+ index &= (COUNTER_BITS_TOTAL / BITS_PER_LONG) - 1;
+ ret = !test_and_set_bit(their_counter & (BITS_PER_LONG - 1),
+ &counter->receive.backtrack[index]);
+
+out:
+ spin_unlock_bh(&counter->receive.lock);
+ return ret;
+}
+#include "selftest/counter.h"
+
+static void packet_consume_data_done(struct wireguard_peer *peer,
+ struct sk_buff *skb,
+ struct endpoint *endpoint)
+{
+ struct net_device *dev = peer->device->dev;
+ struct wireguard_peer *routed_peer;
+ unsigned int len, len_before_trim;
+
+ socket_set_peer_endpoint(peer, endpoint);
+
+ if (unlikely(noise_received_with_keypair(&peer->keypairs,
+ PACKET_CB(skb)->keypair))) {
+ timers_handshake_complete(peer);
+ packet_send_staged_packets(peer);
+ }
+
+ keep_key_fresh(peer);
+
+ timers_any_authenticated_packet_received(peer);
+ timers_any_authenticated_packet_traversal(peer);
+
+ /* A packet with length 0 is a keepalive packet */
+ if (unlikely(!skb->len)) {
+ rx_stats(peer, message_data_len(0));
+ net_dbg_ratelimited("%s: Receiving keepalive packet from peer %llu (%pISpfsc)\n",
+ dev->name, peer->internal_id,
+ &peer->endpoint.addr);
+ goto packet_processed;
+ }
+
+ timers_data_received(peer);
+
+ if (unlikely(skb_network_header(skb) < skb->head))
+ goto dishonest_packet_size;
+ if (unlikely(!(pskb_network_may_pull(skb, sizeof(struct iphdr)) &&
+ (ip_hdr(skb)->version == 4 ||
+ (ip_hdr(skb)->version == 6 &&
+ pskb_network_may_pull(skb, sizeof(struct ipv6hdr)))))))
+ goto dishonest_packet_type;
+
+ skb->dev = dev;
+ skb->ip_summed = CHECKSUM_UNNECESSARY;
+ skb->protocol = skb_examine_untrusted_ip_hdr(skb);
+ if (skb->protocol == htons(ETH_P_IP)) {
+ len = ntohs(ip_hdr(skb)->tot_len);
+ if (unlikely(len < sizeof(struct iphdr)))
+ goto dishonest_packet_size;
+ if (INET_ECN_is_ce(PACKET_CB(skb)->ds))
+ IP_ECN_set_ce(ip_hdr(skb));
+ } else if (skb->protocol == htons(ETH_P_IPV6)) {
+ len = ntohs(ipv6_hdr(skb)->payload_len) +
+ sizeof(struct ipv6hdr);
+ if (INET_ECN_is_ce(PACKET_CB(skb)->ds))
+ IP6_ECN_set_ce(skb, ipv6_hdr(skb));
+ } else
+ goto dishonest_packet_type;
+
+ if (unlikely(len > skb->len))
+ goto dishonest_packet_size;
+ len_before_trim = skb->len;
+ if (unlikely(pskb_trim(skb, len)))
+ goto packet_processed;
+
+ routed_peer = allowedips_lookup_src(&peer->device->peer_allowedips, skb);
+ peer_put(routed_peer); /* We don't need the extra reference. */
+
+ if (unlikely(routed_peer != peer))
+ goto dishonest_packet_peer;
+
+ if (unlikely(napi_gro_receive(&peer->napi, skb) == GRO_DROP)) {
+ ++dev->stats.rx_dropped;
+ net_dbg_ratelimited("%s: Failed to give packet to userspace from peer %llu (%pISpfsc)\n",
+ dev->name, peer->internal_id,
+ &peer->endpoint.addr);
+ } else
+ rx_stats(peer, message_data_len(len_before_trim));
+ return;
+
+dishonest_packet_peer:
+ net_dbg_skb_ratelimited("%s: Packet has unallowed src IP (%pISc) from peer %llu (%pISpfsc)\n",
+ dev->name, skb, peer->internal_id,
+ &peer->endpoint.addr);
+ ++dev->stats.rx_errors;
+ ++dev->stats.rx_frame_errors;
+ goto packet_processed;
+dishonest_packet_type:
+ net_dbg_ratelimited("%s: Packet is neither ipv4 nor ipv6 from peer %llu (%pISpfsc)\n",
+ dev->name, peer->internal_id, &peer->endpoint.addr);
+ ++dev->stats.rx_errors;
+ ++dev->stats.rx_frame_errors;
+ goto packet_processed;
+dishonest_packet_size:
+ net_dbg_ratelimited("%s: Packet has incorrect size from peer %llu (%pISpfsc)\n",
+ dev->name, peer->internal_id, &peer->endpoint.addr);
+ ++dev->stats.rx_errors;
+ ++dev->stats.rx_length_errors;
+ goto packet_processed;
+packet_processed:
+ dev_kfree_skb(skb);
+}
+
+int packet_rx_poll(struct napi_struct *napi, int budget)
+{
+ struct wireguard_peer *peer =
+ container_of(napi, struct wireguard_peer, napi);
+ struct crypt_queue *queue = &peer->rx_queue;
+ struct noise_keypair *keypair;
+ struct endpoint endpoint;
+ enum packet_state state;
+ struct sk_buff *skb;
+ int work_done = 0;
+ bool free;
+
+ if (unlikely(budget <= 0))
+ return 0;
+
+ while ((skb = __ptr_ring_peek(&queue->ring)) != NULL &&
+ (state = atomic_read_acquire(&PACKET_CB(skb)->state)) !=
+ PACKET_STATE_UNCRYPTED) {
+ __ptr_ring_discard_one(&queue->ring);
+ peer = PACKET_PEER(skb);
+ keypair = PACKET_CB(skb)->keypair;
+ free = true;
+
+ if (unlikely(state != PACKET_STATE_CRYPTED))
+ goto next;
+
+ if (unlikely(!counter_validate(&keypair->receiving.counter,
+ PACKET_CB(skb)->nonce))) {
+ net_dbg_ratelimited("%s: Packet has invalid nonce %llu (max %llu)\n",
+ peer->device->dev->name,
+ PACKET_CB(skb)->nonce,
+ keypair->receiving.counter.receive.counter);
+ goto next;
+ }
+
+ if (unlikely(socket_endpoint_from_skb(&endpoint, skb)))
+ goto next;
+
+ skb_reset(skb);
+ packet_consume_data_done(peer, skb, &endpoint);
+ free = false;
+
+ next:
+ noise_keypair_put(keypair, false);
+ peer_put(peer);
+ if (unlikely(free))
+ dev_kfree_skb(skb);
+
+ if (++work_done >= budget)
+ break;
+ }
+
+ if (work_done < budget)
+ napi_complete_done(napi, work_done);
+
+ return work_done;
+}
+
+void packet_decrypt_worker(struct work_struct *work)
+{
+ struct crypt_queue *queue =
+ container_of(work, struct multicore_worker, work)->ptr;
+ simd_context_t simd_context = simd_get();
+ struct sk_buff *skb;
+
+ while ((skb = ptr_ring_consume_bh(&queue->ring)) != NULL) {
+ enum packet_state state = likely(skb_decrypt(skb,
+ &PACKET_CB(skb)->keypair->receiving,
+ simd_context)) ?
+ PACKET_STATE_CRYPTED : PACKET_STATE_DEAD;
+ queue_enqueue_per_peer_napi(&PACKET_PEER(skb)->rx_queue, skb,
+ state);
+ simd_context = simd_relax(simd_context);
+ }
+
+ simd_put(simd_context);
+}
+
+static void packet_consume_data(struct wireguard_device *wg,
+ struct sk_buff *skb)
+{
+ __le32 idx = ((struct message_data *)skb->data)->key_idx;
+ struct wireguard_peer *peer = NULL;
+ int ret;
+
+ rcu_read_lock_bh();
+ PACKET_CB(skb)->keypair =
+ (struct noise_keypair *)index_hashtable_lookup(
+ &wg->index_hashtable, INDEX_HASHTABLE_KEYPAIR, idx,
+ &peer);
+ if (unlikely(!noise_keypair_get(PACKET_CB(skb)->keypair)))
+ goto err_keypair;
+
+ if (unlikely(peer->is_dead))
+ goto err;
+
+ ret = queue_enqueue_per_device_and_peer(&wg->decrypt_queue,
+ &peer->rx_queue, skb,
+ wg->packet_crypt_wq,
+ &wg->decrypt_queue.last_cpu);
+ if (unlikely(ret == -EPIPE))
+ queue_enqueue_per_peer(&peer->rx_queue, skb, PACKET_STATE_DEAD);
+ if (likely(!ret || ret == -EPIPE)) {
+ rcu_read_unlock_bh();
+ return;
+ }
+err:
+ noise_keypair_put(PACKET_CB(skb)->keypair, false);
+err_keypair:
+ rcu_read_unlock_bh();
+ peer_put(peer);
+ dev_kfree_skb(skb);
+}
+
+void packet_receive(struct wireguard_device *wg, struct sk_buff *skb)
+{
+ if (unlikely(skb_prepare_header(skb, wg) < 0))
+ goto err;
+ switch (SKB_TYPE_LE32(skb)) {
+ case cpu_to_le32(MESSAGE_HANDSHAKE_INITIATION):
+ case cpu_to_le32(MESSAGE_HANDSHAKE_RESPONSE):
+ case cpu_to_le32(MESSAGE_HANDSHAKE_COOKIE): {
+ int cpu;
+
+ if (skb_queue_len(&wg->incoming_handshakes) >
+ MAX_QUEUED_INCOMING_HANDSHAKES ||
+ unlikely(!rng_is_initialized())) {
+ net_dbg_skb_ratelimited("%s: Dropping handshake packet from %pISpfsc\n",
+ wg->dev->name, skb);
+ goto err;
+ }
+ skb_queue_tail(&wg->incoming_handshakes, skb);
+ /* Queues up a call to
+ * packet_handshake_receive_worker():
+ */
+ cpu = cpumask_next_online(&wg->incoming_handshake_cpu);
+ queue_work_on(cpu, wg->handshake_receive_wq,
+ &per_cpu_ptr(wg->incoming_handshakes_worker, cpu)->work);
+ break;
+ }
+ case cpu_to_le32(MESSAGE_DATA):
+ PACKET_CB(skb)->ds = ip_tunnel_get_dsfield(ip_hdr(skb), skb);
+ packet_consume_data(wg, skb);
+ break;
+ default:
+ net_dbg_skb_ratelimited("%s: Invalid packet from %pISpfsc\n",
+ wg->dev->name, skb);
+ goto err;
+ }
+ return;
+
+err:
+ dev_kfree_skb(skb);
+}
diff --git a/drivers/net/wireguard/selftest/allowedips.h b/drivers/net/wireguard/selftest/allowedips.h
new file mode 100644
index 000000000000..83cfb34ecb46
--- /dev/null
+++ b/drivers/net/wireguard/selftest/allowedips.h
@@ -0,0 +1,656 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#ifdef DEBUG
+
+#ifdef DEBUG_PRINT_TRIE_GRAPHVIZ
+#include <linux/siphash.h>
+
+static __init void swap_endian_and_apply_cidr(u8 *dst, const u8 *src, u8 bits,
+ u8 cidr)
+{
+ swap_endian(dst, src, bits);
+ memset(dst + (cidr + 7) / 8, 0, bits / 8 - (cidr + 7) / 8);
+ if (cidr)
+ dst[(cidr + 7) / 8 - 1] &= ~0U << ((8 - (cidr % 8)) % 8);
+}
+
+static __init void print_node(struct allowedips_node *node, u8 bits)
+{
+ char *fmt_connection = KERN_DEBUG "\t\"%p/%d\" -> \"%p/%d\";\n";
+ char *fmt_declaration = KERN_DEBUG
+ "\t\"%p/%d\"[style=%s, color=\"#%06x\"];\n";
+ char *style = "dotted";
+ u8 ip1[16], ip2[16];
+ u32 color = 0;
+
+ if (bits == 32) {
+ fmt_connection = KERN_DEBUG "\t\"%pI4/%d\" -> \"%pI4/%d\";\n";
+ fmt_declaration = KERN_DEBUG
+ "\t\"%pI4/%d\"[style=%s, color=\"#%06x\"];\n";
+ } else if (bits == 128) {
+ fmt_connection = KERN_DEBUG "\t\"%pI6/%d\" -> \"%pI6/%d\";\n";
+ fmt_declaration = KERN_DEBUG
+ "\t\"%pI6/%d\"[style=%s, color=\"#%06x\"];\n";
+ }
+ if (node->peer) {
+ hsiphash_key_t key = { 0 };
+ memcpy(&key, &node->peer, sizeof(node->peer));
+ color = hsiphash_1u32(0xdeadbeef, &key) % 200 << 16 |
+ hsiphash_1u32(0xbabecafe, &key) % 200 << 8 |
+ hsiphash_1u32(0xabad1dea, &key) % 200;
+ style = "bold";
+ }
+ swap_endian_and_apply_cidr(ip1, node->bits, bits, node->cidr);
+ printk(fmt_declaration, ip1, node->cidr, style, color);
+ if (node->bit[0]) {
+ swap_endian_and_apply_cidr(ip2, node->bit[0]->bits, bits,
+ node->cidr);
+ printk(fmt_connection, ip1, node->cidr, ip2,
+ node->bit[0]->cidr);
+ print_node(node->bit[0], bits);
+ }
+ if (node->bit[1]) {
+ swap_endian_and_apply_cidr(ip2, node->bit[1]->bits, bits,
+ node->cidr);
+ printk(fmt_connection, ip1, node->cidr, ip2,
+ node->bit[1]->cidr);
+ print_node(node->bit[1], bits);
+ }
+}
+static __init void print_tree(struct allowedips_node *top, u8 bits)
+{
+ printk(KERN_DEBUG "digraph trie {\n");
+ print_node(top, bits);
+ printk(KERN_DEBUG "}\n");
+}
+#endif
+
+#ifdef DEBUG_RANDOM_TRIE
+#define NUM_PEERS 2000
+#define NUM_RAND_ROUTES 400
+#define NUM_MUTATED_ROUTES 100
+#define NUM_QUERIES (NUM_RAND_ROUTES * NUM_MUTATED_ROUTES * 30)
+#include <linux/random.h>
+struct horrible_allowedips {
+ struct hlist_head head;
+};
+struct horrible_allowedips_node {
+ struct hlist_node table;
+ union nf_inet_addr ip;
+ union nf_inet_addr mask;
+ uint8_t ip_version;
+ void *value;
+};
+static __init void horrible_allowedips_init(struct horrible_allowedips *table)
+{
+ INIT_HLIST_HEAD(&table->head);
+}
+static __init void horrible_allowedips_free(struct horrible_allowedips *table)
+{
+ struct horrible_allowedips_node *node;
+ struct hlist_node *h;
+
+ hlist_for_each_entry_safe (node, h, &table->head, table) {
+ hlist_del(&node->table);
+ kfree(node);
+ }
+}
+static __init inline union nf_inet_addr horrible_cidr_to_mask(uint8_t cidr)
+{
+ union nf_inet_addr mask;
+
+ memset(&mask, 0x00, 128 / 8);
+ memset(&mask, 0xff, cidr / 8);
+ if (cidr % 32)
+ mask.all[cidr / 32] = htonl(
+ (0xFFFFFFFFUL << (32 - (cidr % 32))) & 0xFFFFFFFFUL);
+ return mask;
+}
+static __init inline uint8_t horrible_mask_to_cidr(union nf_inet_addr subnet)
+{
+ return hweight32(subnet.all[0]) + hweight32(subnet.all[1]) +
+ hweight32(subnet.all[2]) + hweight32(subnet.all[3]);
+}
+static __init inline void
+horrible_mask_self(struct horrible_allowedips_node *node)
+{
+ if (node->ip_version == 4)
+ node->ip.ip &= node->mask.ip;
+ else if (node->ip_version == 6) {
+ node->ip.ip6[0] &= node->mask.ip6[0];
+ node->ip.ip6[1] &= node->mask.ip6[1];
+ node->ip.ip6[2] &= node->mask.ip6[2];
+ node->ip.ip6[3] &= node->mask.ip6[3];
+ }
+}
+static __init inline bool
+horrible_match_v4(const struct horrible_allowedips_node *node,
+ struct in_addr *ip)
+{
+ return (ip->s_addr & node->mask.ip) == node->ip.ip;
+}
+static __init inline bool
+horrible_match_v6(const struct horrible_allowedips_node *node,
+ struct in6_addr *ip)
+{
+ return (ip->in6_u.u6_addr32[0] & node->mask.ip6[0]) ==
+ node->ip.ip6[0] &&
+ (ip->in6_u.u6_addr32[1] & node->mask.ip6[1]) ==
+ node->ip.ip6[1] &&
+ (ip->in6_u.u6_addr32[2] & node->mask.ip6[2]) ==
+ node->ip.ip6[2] &&
+ (ip->in6_u.u6_addr32[3] & node->mask.ip6[3]) == node->ip.ip6[3];
+}
+static __init void
+horrible_insert_ordered(struct horrible_allowedips *table,
+ struct horrible_allowedips_node *node)
+{
+ struct horrible_allowedips_node *other = NULL, *where = NULL;
+ uint8_t my_cidr = horrible_mask_to_cidr(node->mask);
+
+ hlist_for_each_entry (other, &table->head, table) {
+ if (!memcmp(&other->mask, &node->mask,
+ sizeof(union nf_inet_addr)) &&
+ !memcmp(&other->ip, &node->ip,
+ sizeof(union nf_inet_addr)) &&
+ other->ip_version == node->ip_version) {
+ other->value = node->value;
+ kfree(node);
+ return;
+ }
+ where = other;
+ if (horrible_mask_to_cidr(other->mask) <= my_cidr)
+ break;
+ }
+ if (!other && !where)
+ hlist_add_head(&node->table, &table->head);
+ else if (!other)
+ hlist_add_behind(&node->table, &where->table);
+ else
+ hlist_add_before(&node->table, &where->table);
+}
+static __init int
+horrible_allowedips_insert_v4(struct horrible_allowedips *table,
+ struct in_addr *ip, uint8_t cidr, void *value)
+{
+ struct horrible_allowedips_node *node = kzalloc(sizeof(*node), GFP_KERNEL);
+
+ if (unlikely(!node))
+ return -ENOMEM;
+ node->ip.in = *ip;
+ node->mask = horrible_cidr_to_mask(cidr);
+ node->ip_version = 4;
+ node->value = value;
+ horrible_mask_self(node);
+ horrible_insert_ordered(table, node);
+ return 0;
+}
+static __init int
+horrible_allowedips_insert_v6(struct horrible_allowedips *table,
+ struct in6_addr *ip, uint8_t cidr, void *value)
+{
+ struct horrible_allowedips_node *node = kzalloc(sizeof(*node), GFP_KERNEL);
+
+ if (unlikely(!node))
+ return -ENOMEM;
+ node->ip.in6 = *ip;
+ node->mask = horrible_cidr_to_mask(cidr);
+ node->ip_version = 6;
+ node->value = value;
+ horrible_mask_self(node);
+ horrible_insert_ordered(table, node);
+ return 0;
+}
+static __init void *
+horrible_allowedips_lookup_v4(struct horrible_allowedips *table,
+ struct in_addr *ip)
+{
+ struct horrible_allowedips_node *node;
+ void *ret = NULL;
+
+ hlist_for_each_entry (node, &table->head, table) {
+ if (node->ip_version != 4)
+ continue;
+ if (horrible_match_v4(node, ip)) {
+ ret = node->value;
+ break;
+ }
+ }
+ return ret;
+}
+static __init void *
+horrible_allowedips_lookup_v6(struct horrible_allowedips *table,
+ struct in6_addr *ip)
+{
+ struct horrible_allowedips_node *node;
+ void *ret = NULL;
+
+ hlist_for_each_entry (node, &table->head, table) {
+ if (node->ip_version != 6)
+ continue;
+ if (horrible_match_v6(node, ip)) {
+ ret = node->value;
+ break;
+ }
+ }
+ return ret;
+}
+
+static __init bool randomized_test(void)
+{
+ unsigned int i, j, k, mutate_amount, cidr;
+ u8 ip[16], mutate_mask[16], mutated[16];
+ struct wireguard_peer **peers, *peer;
+ struct horrible_allowedips h;
+ DEFINE_MUTEX(mutex);
+ struct allowedips t;
+ bool ret = false;
+
+ mutex_init(&mutex);
+
+ allowedips_init(&t);
+ horrible_allowedips_init(&h);
+
+ peers = kcalloc(NUM_PEERS, sizeof(*peers), GFP_KERNEL);
+ if (unlikely(!peers)) {
+ pr_info("allowedips random self-test: out of memory\n");
+ goto free;
+ }
+ for (i = 0; i < NUM_PEERS; ++i) {
+ peers[i] = kzalloc(sizeof(*peers[i]), GFP_KERNEL);
+ if (unlikely(!peers[i])) {
+ pr_info("allowedips random self-test: out of memory\n");
+ goto free;
+ }
+ kref_init(&peers[i]->refcount);
+ }
+
+ mutex_lock(&mutex);
+
+ for (i = 0; i < NUM_RAND_ROUTES; ++i) {
+ prandom_bytes(ip, 4);
+ cidr = prandom_u32_max(32) + 1;
+ peer = peers[prandom_u32_max(NUM_PEERS)];
+ if (allowedips_insert_v4(&t, (struct in_addr *)ip, cidr, peer,
+ &mutex) < 0) {
+ pr_info("allowedips random self-test: out of memory\n");
+ goto free;
+ }
+ if (horrible_allowedips_insert_v4(&h, (struct in_addr *)ip,
+ cidr, peer) < 0) {
+ pr_info("allowedips random self-test: out of memory\n");
+ goto free;
+ }
+ for (j = 0; j < NUM_MUTATED_ROUTES; ++j) {
+ memcpy(mutated, ip, 4);
+ prandom_bytes(mutate_mask, 4);
+ mutate_amount = prandom_u32_max(32);
+ for (k = 0; k < mutate_amount / 8; ++k)
+ mutate_mask[k] = 0xff;
+ mutate_mask[k] = 0xff
+ << ((8 - (mutate_amount % 8)) % 8);
+ for (; k < 4; ++k)
+ mutate_mask[k] = 0;
+ for (k = 0; k < 4; ++k)
+ mutated[k] = (mutated[k] & mutate_mask[k]) |
+ (~mutate_mask[k] &
+ prandom_u32_max(256));
+ cidr = prandom_u32_max(32) + 1;
+ peer = peers[prandom_u32_max(NUM_PEERS)];
+ if (allowedips_insert_v4(&t, (struct in_addr *)mutated,
+ cidr, peer, &mutex) < 0) {
+ pr_info("allowedips random self-test: out of memory\n");
+ goto free;
+ }
+ if (horrible_allowedips_insert_v4(&h,
+ (struct in_addr *)mutated, cidr, peer)) {
+ pr_info("allowedips random self-test: out of memory\n");
+ goto free;
+ }
+ }
+ }
+
+ for (i = 0; i < NUM_RAND_ROUTES; ++i) {
+ prandom_bytes(ip, 16);
+ cidr = prandom_u32_max(128) + 1;
+ peer = peers[prandom_u32_max(NUM_PEERS)];
+ if (allowedips_insert_v6(&t, (struct in6_addr *)ip, cidr, peer,
+ &mutex) < 0) {
+ pr_info("allowedips random self-test: out of memory\n");
+ goto free;
+ }
+ if (horrible_allowedips_insert_v6(&h, (struct in6_addr *)ip,
+ cidr, peer) < 0) {
+ pr_info("allowedips random self-test: out of memory\n");
+ goto free;
+ }
+ for (j = 0; j < NUM_MUTATED_ROUTES; ++j) {
+ memcpy(mutated, ip, 16);
+ prandom_bytes(mutate_mask, 16);
+ mutate_amount = prandom_u32_max(128);
+ for (k = 0; k < mutate_amount / 8; ++k)
+ mutate_mask[k] = 0xff;
+ mutate_mask[k] = 0xff
+ << ((8 - (mutate_amount % 8)) % 8);
+ for (; k < 16; ++k)
+ mutate_mask[k] = 0;
+ for (k = 0; k < 16; ++k)
+ mutated[k] = (mutated[k] & mutate_mask[k]) |
+ (~mutate_mask[k] &
+ prandom_u32_max(256));
+ cidr = prandom_u32_max(128) + 1;
+ peer = peers[prandom_u32_max(NUM_PEERS)];
+ if (allowedips_insert_v6(&t, (struct in6_addr *)mutated,
+ cidr, peer, &mutex) < 0) {
+ pr_info("allowedips random self-test: out of memory\n");
+ goto free;
+ }
+ if (horrible_allowedips_insert_v6(
+ &h, (struct in6_addr *)mutated, cidr,
+ peer)) {
+ pr_info("allowedips random self-test: out of memory\n");
+ goto free;
+ }
+ }
+ }
+
+ mutex_unlock(&mutex);
+
+#ifdef DEBUG_PRINT_TRIE_GRAPHVIZ
+ print_tree(t.root4, 32);
+ print_tree(t.root6, 128);
+#endif
+
+ for (i = 0; i < NUM_QUERIES; ++i) {
+ prandom_bytes(ip, 4);
+ if (lookup(t.root4, 32, ip) !=
+ horrible_allowedips_lookup_v4(&h, (struct in_addr *)ip)) {
+ pr_info("allowedips random self-test: FAIL\n");
+ goto free;
+ }
+ }
+
+ for (i = 0; i < NUM_QUERIES; ++i) {
+ prandom_bytes(ip, 16);
+ if (lookup(t.root6, 128, ip) !=
+ horrible_allowedips_lookup_v6(&h, (struct in6_addr *)ip)) {
+ pr_info("allowedips random self-test: FAIL\n");
+ goto free;
+ }
+ }
+ ret = true;
+
+free:
+ mutex_lock(&mutex);
+ allowedips_free(&t, &mutex);
+ mutex_unlock(&mutex);
+ horrible_allowedips_free(&h);
+ if (peers) {
+ for (i = 0; i < NUM_PEERS; ++i)
+ kfree(peers[i]);
+ }
+ kfree(peers);
+ return ret;
+}
+#endif
+
+static __init inline struct in_addr *ip4(u8 a, u8 b, u8 c, u8 d)
+{
+ static struct in_addr ip;
+ u8 *split = (u8 *)&ip;
+ split[0] = a;
+ split[1] = b;
+ split[2] = c;
+ split[3] = d;
+ return &ip;
+}
+static __init inline struct in6_addr *ip6(u32 a, u32 b, u32 c, u32 d)
+{
+ static struct in6_addr ip;
+ __be32 *split = (__be32 *)&ip;
+ split[0] = cpu_to_be32(a);
+ split[1] = cpu_to_be32(b);
+ split[2] = cpu_to_be32(c);
+ split[3] = cpu_to_be32(d);
+ return &ip;
+}
+
+struct walk_ctx {
+ int count;
+ bool found_a, found_b, found_c, found_d, found_e;
+ bool found_other;
+};
+
+static __init int walk_callback(void *ctx, const u8 *ip, u8 cidr, int family)
+{
+ struct walk_ctx *wctx = ctx;
+
+ wctx->count++;
+
+ if (cidr == 27 &&
+ !memcmp(ip, ip4(192, 95, 5, 64), sizeof(struct in_addr)))
+ wctx->found_a = true;
+ else if (cidr == 128 &&
+ !memcmp(ip, ip6(0x26075300, 0x60006b00, 0, 0xc05f0543),
+ sizeof(struct in6_addr)))
+ wctx->found_b = true;
+ else if (cidr == 29 &&
+ !memcmp(ip, ip4(10, 1, 0, 16), sizeof(struct in_addr)))
+ wctx->found_c = true;
+ else if (cidr == 83 &&
+ !memcmp(ip, ip6(0x26075300, 0x6d8a6bf8, 0xdab1e000, 0),
+ sizeof(struct in6_addr)))
+ wctx->found_d = true;
+ else if (cidr == 21 &&
+ !memcmp(ip, ip6(0x26075000, 0, 0, 0), sizeof(struct in6_addr)))
+ wctx->found_e = true;
+ else
+ wctx->found_other = true;
+
+ return 0;
+}
+
+#define init_peer(name) do { \
+ name = kzalloc(sizeof(*name), GFP_KERNEL); \
+ if (unlikely(!name)) { \
+ pr_info("allowedips self-test: out of memory\n"); \
+ goto free; \
+ } \
+ kref_init(&name->refcount); \
+ } while (0)
+
+#define insert(version, mem, ipa, ipb, ipc, ipd, cidr) \
+ allowedips_insert_v##version(&t, ip##version(ipa, ipb, ipc, ipd), \
+ cidr, mem, &mutex)
+
+#define maybe_fail() do { \
+ ++i; \
+ if (!_s) { \
+ pr_info("allowedips self-test %zu: FAIL\n", i); \
+ success = false; \
+ } \
+ } while (0)
+
+#define test(version, mem, ipa, ipb, ipc, ipd) do { \
+ bool _s = lookup(t.root##version, version == 4 ? 32 : 128, \
+ ip##version(ipa, ipb, ipc, ipd)) == mem; \
+ maybe_fail(); \
+ } while (0)
+
+#define test_negative(version, mem, ipa, ipb, ipc, ipd) do { \
+ bool _s = lookup(t.root##version, version == 4 ? 32 : 128, \
+ ip##version(ipa, ipb, ipc, ipd)) != mem; \
+ maybe_fail(); \
+ } while (0)
+
+#define test_boolean(cond) do { \
+ bool _s = (cond); \
+ maybe_fail(); \
+ } while (0)
+
+bool __init allowedips_selftest(void)
+{
+ struct wireguard_peer *a = NULL, *b = NULL, *c = NULL, *d = NULL,
+ *e = NULL, *f = NULL, *g = NULL, *h = NULL;
+ struct allowedips_cursor cursor = { 0 };
+ struct walk_ctx wctx = { 0 };
+ bool success = false;
+ struct allowedips t;
+ DEFINE_MUTEX(mutex);
+ struct in6_addr ip;
+ size_t i = 0;
+ __be64 part;
+
+ mutex_init(&mutex);
+ mutex_lock(&mutex);
+
+ allowedips_init(&t);
+ init_peer(a);
+ init_peer(b);
+ init_peer(c);
+ init_peer(d);
+ init_peer(e);
+ init_peer(f);
+ init_peer(g);
+ init_peer(h);
+
+ insert(4, a, 192, 168, 4, 0, 24);
+ insert(4, b, 192, 168, 4, 4, 32);
+ insert(4, c, 192, 168, 0, 0, 16);
+ insert(4, d, 192, 95, 5, 64, 27);
+ /* replaces previous entry, and maskself is required */
+ insert(4, c, 192, 95, 5, 65, 27);
+ insert(6, d, 0x26075300, 0x60006b00, 0, 0xc05f0543, 128);
+ insert(6, c, 0x26075300, 0x60006b00, 0, 0, 64);
+ insert(4, e, 0, 0, 0, 0, 0);
+ insert(6, e, 0, 0, 0, 0, 0);
+ /* replaces previous entry */
+ insert(6, f, 0, 0, 0, 0, 0);
+ insert(6, g, 0x24046800, 0, 0, 0, 32);
+ /* maskself is required */
+ insert(6, h, 0x24046800, 0x40040800, 0xdeadbeef, 0xdeadbeef, 64);
+ insert(6, a, 0x24046800, 0x40040800, 0xdeadbeef, 0xdeadbeef, 128);
+ insert(6, c, 0x24446800, 0x40e40800, 0xdeaebeef, 0xdefbeef, 128);
+ insert(6, b, 0x24446800, 0xf0e40800, 0xeeaebeef, 0, 98);
+ insert(4, g, 64, 15, 112, 0, 20);
+ /* maskself is required */
+ insert(4, h, 64, 15, 123, 211, 25);
+ insert(4, a, 10, 0, 0, 0, 25);
+ insert(4, b, 10, 0, 0, 128, 25);
+ insert(4, a, 10, 1, 0, 0, 30);
+ insert(4, b, 10, 1, 0, 4, 30);
+ insert(4, c, 10, 1, 0, 8, 29);
+ insert(4, d, 10, 1, 0, 16, 29);
+
+#ifdef DEBUG_PRINT_TRIE_GRAPHVIZ
+ print_tree(t.root4, 32);
+ print_tree(t.root6, 128);
+#endif
+
+ success = true;
+
+ test(4, a, 192, 168, 4, 20);
+ test(4, a, 192, 168, 4, 0);
+ test(4, b, 192, 168, 4, 4);
+ test(4, c, 192, 168, 200, 182);
+ test(4, c, 192, 95, 5, 68);
+ test(4, e, 192, 95, 5, 96);
+ test(6, d, 0x26075300, 0x60006b00, 0, 0xc05f0543);
+ test(6, c, 0x26075300, 0x60006b00, 0, 0xc02e01ee);
+ test(6, f, 0x26075300, 0x60006b01, 0, 0);
+ test(6, g, 0x24046800, 0x40040806, 0, 0x1006);
+ test(6, g, 0x24046800, 0x40040806, 0x1234, 0x5678);
+ test(6, f, 0x240467ff, 0x40040806, 0x1234, 0x5678);
+ test(6, f, 0x24046801, 0x40040806, 0x1234, 0x5678);
+ test(6, h, 0x24046800, 0x40040800, 0x1234, 0x5678);
+ test(6, h, 0x24046800, 0x40040800, 0, 0);
+ test(6, h, 0x24046800, 0x40040800, 0x10101010, 0x10101010);
+ test(6, a, 0x24046800, 0x40040800, 0xdeadbeef, 0xdeadbeef);
+ test(4, g, 64, 15, 116, 26);
+ test(4, g, 64, 15, 127, 3);
+ test(4, g, 64, 15, 123, 1);
+ test(4, h, 64, 15, 123, 128);
+ test(4, h, 64, 15, 123, 129);
+ test(4, a, 10, 0, 0, 52);
+ test(4, b, 10, 0, 0, 220);
+ test(4, a, 10, 1, 0, 2);
+ test(4, b, 10, 1, 0, 6);
+ test(4, c, 10, 1, 0, 10);
+ test(4, d, 10, 1, 0, 20);
+
+ insert(4, a, 1, 0, 0, 0, 32);
+ insert(4, a, 64, 0, 0, 0, 32);
+ insert(4, a, 128, 0, 0, 0, 32);
+ insert(4, a, 192, 0, 0, 0, 32);
+ insert(4, a, 255, 0, 0, 0, 32);
+ allowedips_remove_by_peer(&t, a, &mutex);
+ test_negative(4, a, 1, 0, 0, 0);
+ test_negative(4, a, 64, 0, 0, 0);
+ test_negative(4, a, 128, 0, 0, 0);
+ test_negative(4, a, 192, 0, 0, 0);
+ test_negative(4, a, 255, 0, 0, 0);
+
+ allowedips_free(&t, &mutex);
+ allowedips_init(&t);
+ insert(4, a, 192, 168, 0, 0, 16);
+ insert(4, a, 192, 168, 0, 0, 24);
+ allowedips_remove_by_peer(&t, a, &mutex);
+ test_negative(4, a, 192, 168, 0, 1);
+
+ /* These will hit the BUG_ON(len >= 128) in free_node if something goes wrong. */
+ for (i = 0; i < 128; ++i) {
+ part = cpu_to_be64(~(1LLU << (i % 64)));
+ memset(&ip, 0xff, 16);
+ memcpy((u8 *)&ip + (i < 64) * 8, &part, 8);
+ allowedips_insert_v6(&t, &ip, 128, a, &mutex);
+ }
+
+ allowedips_free(&t, &mutex);
+
+ allowedips_init(&t);
+ insert(4, a, 192, 95, 5, 93, 27);
+ insert(6, a, 0x26075300, 0x60006b00, 0, 0xc05f0543, 128);
+ insert(4, a, 10, 1, 0, 20, 29);
+ insert(6, a, 0x26075300, 0x6d8a6bf8, 0xdab1f1df, 0xc05f1523, 83);
+ insert(6, a, 0x26075300, 0x6d8a6bf8, 0xdab1f1df, 0xc05f1523, 21);
+ allowedips_walk_by_peer(&t, &cursor, a, walk_callback, &wctx, &mutex);
+ test_boolean(wctx.count == 5);
+ test_boolean(wctx.found_a);
+ test_boolean(wctx.found_b);
+ test_boolean(wctx.found_c);
+ test_boolean(wctx.found_d);
+ test_boolean(wctx.found_e);
+ test_boolean(!wctx.found_other);
+
+#ifdef DEBUG_RANDOM_TRIE
+ if (success)
+ success = randomized_test();
+#endif
+
+ if (success)
+ pr_info("allowedips self-tests: pass\n");
+
+free:
+ allowedips_free(&t, &mutex);
+ kfree(a);
+ kfree(b);
+ kfree(c);
+ kfree(d);
+ kfree(e);
+ kfree(f);
+ kfree(g);
+ kfree(h);
+ mutex_unlock(&mutex);
+
+ return success;
+}
+#undef test_negative
+#undef test
+#undef remove
+#undef insert
+#undef init_peer
+
+#endif
diff --git a/drivers/net/wireguard/selftest/counter.h b/drivers/net/wireguard/selftest/counter.h
new file mode 100644
index 000000000000..1c2a3b4e1fdc
--- /dev/null
+++ b/drivers/net/wireguard/selftest/counter.h
@@ -0,0 +1,103 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#ifdef DEBUG
+bool __init packet_counter_selftest(void)
+{
+ unsigned int test_num = 0, i;
+ union noise_counter counter;
+ bool success = true;
+
+#define T_INIT do { \
+ memset(&counter, 0, sizeof(union noise_counter)); \
+ spin_lock_init(&counter.receive.lock); \
+ } while (0)
+#define T_LIM (COUNTER_WINDOW_SIZE + 1)
+#define T(n, v) do { \
+ ++test_num; \
+ if (counter_validate(&counter, n) != v) { \
+ pr_info("nonce counter self-test %u: FAIL\n", \
+ test_num); \
+ success = false; \
+ } \
+ } while (0)
+
+ T_INIT;
+ /* 1 */ T(0, true);
+ /* 2 */ T(1, true);
+ /* 3 */ T(1, false);
+ /* 4 */ T(9, true);
+ /* 5 */ T(8, true);
+ /* 6 */ T(7, true);
+ /* 7 */ T(7, false);
+ /* 8 */ T(T_LIM, true);
+ /* 9 */ T(T_LIM - 1, true);
+ /* 10 */ T(T_LIM - 1, false);
+ /* 11 */ T(T_LIM - 2, true);
+ /* 12 */ T(2, true);
+ /* 13 */ T(2, false);
+ /* 14 */ T(T_LIM + 16, true);
+ /* 15 */ T(3, false);
+ /* 16 */ T(T_LIM + 16, false);
+ /* 17 */ T(T_LIM * 4, true);
+ /* 18 */ T(T_LIM * 4 - (T_LIM - 1), true);
+ /* 19 */ T(10, false);
+ /* 20 */ T(T_LIM * 4 - T_LIM, false);
+ /* 21 */ T(T_LIM * 4 - (T_LIM + 1), false);
+ /* 22 */ T(T_LIM * 4 - (T_LIM - 2), true);
+ /* 23 */ T(T_LIM * 4 + 1 - T_LIM, false);
+ /* 24 */ T(0, false);
+ /* 25 */ T(REJECT_AFTER_MESSAGES, false);
+ /* 26 */ T(REJECT_AFTER_MESSAGES - 1, true);
+ /* 27 */ T(REJECT_AFTER_MESSAGES, false);
+ /* 28 */ T(REJECT_AFTER_MESSAGES - 1, false);
+ /* 29 */ T(REJECT_AFTER_MESSAGES - 2, true);
+ /* 30 */ T(REJECT_AFTER_MESSAGES + 1, false);
+ /* 31 */ T(REJECT_AFTER_MESSAGES + 2, false);
+ /* 32 */ T(REJECT_AFTER_MESSAGES - 2, false);
+ /* 33 */ T(REJECT_AFTER_MESSAGES - 3, true);
+ /* 34 */ T(0, false);
+
+ T_INIT;
+ for (i = 1; i <= COUNTER_WINDOW_SIZE; ++i)
+ T(i, true);
+ T(0, true);
+ T(0, false);
+
+ T_INIT;
+ for (i = 2; i <= COUNTER_WINDOW_SIZE + 1; ++i)
+ T(i, true);
+ T(1, true);
+ T(0, false);
+
+ T_INIT;
+ for (i = COUNTER_WINDOW_SIZE + 1; i-- > 0;)
+ T(i, true);
+
+ T_INIT;
+ for (i = COUNTER_WINDOW_SIZE + 2; i-- > 1;)
+ T(i, true);
+ T(0, false);
+
+ T_INIT;
+ for (i = COUNTER_WINDOW_SIZE + 1; i-- > 1;)
+ T(i, true);
+ T(COUNTER_WINDOW_SIZE + 1, true);
+ T(0, false);
+
+ T_INIT;
+ for (i = COUNTER_WINDOW_SIZE + 1; i-- > 1;)
+ T(i, true);
+ T(0, true);
+ T(COUNTER_WINDOW_SIZE + 1, true);
+#undef T
+#undef T_LIM
+#undef T_INIT
+
+ if (success)
+ pr_info("nonce counter self-tests: pass\n");
+ return success;
+}
+#endif
diff --git a/drivers/net/wireguard/selftest/ratelimiter.h b/drivers/net/wireguard/selftest/ratelimiter.h
new file mode 100644
index 000000000000..f9f99961fc7c
--- /dev/null
+++ b/drivers/net/wireguard/selftest/ratelimiter.h
@@ -0,0 +1,174 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#ifdef DEBUG
+
+#include <linux/jiffies.h>
+
+static const struct {
+ bool result;
+ unsigned int msec_to_sleep_before;
+} expected_results[] __initconst = {
+ [0 ... PACKETS_BURSTABLE - 1] = { true, 0 },
+ [PACKETS_BURSTABLE] = { false, 0 },
+ [PACKETS_BURSTABLE + 1] = { true, MSEC_PER_SEC / PACKETS_PER_SECOND },
+ [PACKETS_BURSTABLE + 2] = { false, 0 },
+ [PACKETS_BURSTABLE + 3] = { true, (MSEC_PER_SEC / PACKETS_PER_SECOND) * 2 },
+ [PACKETS_BURSTABLE + 4] = { true, 0 },
+ [PACKETS_BURSTABLE + 5] = { false, 0 }
+};
+
+static __init unsigned int maximum_jiffies_at_index(int index)
+{
+ unsigned int total_msecs = 2 * MSEC_PER_SEC / PACKETS_PER_SECOND / 3;
+ int i;
+
+ for (i = 0; i <= index; ++i)
+ total_msecs += expected_results[i].msec_to_sleep_before;
+ return msecs_to_jiffies(total_msecs);
+}
+
+bool __init ratelimiter_selftest(void)
+{
+ int i, test = 0, tries = 0, ret = false;
+ unsigned long loop_start_time;
+#if IS_ENABLED(CONFIG_IPV6)
+ struct sk_buff *skb6;
+ struct ipv6hdr *hdr6;
+#endif
+ struct sk_buff *skb4;
+ struct iphdr *hdr4;
+
+ BUILD_BUG_ON(MSEC_PER_SEC % PACKETS_PER_SECOND != 0);
+
+ if (ratelimiter_init())
+ goto out;
+ ++test;
+ if (ratelimiter_init()) {
+ ratelimiter_uninit();
+ goto out;
+ }
+ ++test;
+ if (ratelimiter_init()) {
+ ratelimiter_uninit();
+ ratelimiter_uninit();
+ goto out;
+ }
+ ++test;
+
+ skb4 = alloc_skb(sizeof(struct iphdr), GFP_KERNEL);
+ if (unlikely(!skb4))
+ goto err_nofree;
+ skb4->protocol = htons(ETH_P_IP);
+ hdr4 = (struct iphdr *)skb_put(skb4, sizeof(*hdr4));
+ hdr4->saddr = htonl(8182);
+ skb_reset_network_header(skb4);
+ ++test;
+
+#if IS_ENABLED(CONFIG_IPV6)
+ skb6 = alloc_skb(sizeof(struct ipv6hdr), GFP_KERNEL);
+ if (unlikely(!skb6)) {
+ kfree_skb(skb4);
+ goto err_nofree;
+ }
+ skb6->protocol = htons(ETH_P_IPV6);
+ hdr6 = (struct ipv6hdr *)skb_put(skb6, sizeof(*hdr6));
+ hdr6->saddr.in6_u.u6_addr32[0] = htonl(1212);
+ hdr6->saddr.in6_u.u6_addr32[1] = htonl(289188);
+ skb_reset_network_header(skb6);
+ ++test;
+#endif
+
+restart:
+ loop_start_time = jiffies;
+ for (i = 0; i < ARRAY_SIZE(expected_results); ++i) {
+#define ensure_time do { \
+ if (time_is_before_jiffies(loop_start_time + \
+ maximum_jiffies_at_index(i))) { \
+ if (++tries >= 5000) \
+ goto err; \
+ gc_entries(NULL); \
+ rcu_barrier(); \
+ msleep(500); \
+ goto restart; \
+ } \
+ } while (0)
+
+ if (expected_results[i].msec_to_sleep_before)
+ msleep(expected_results[i].msec_to_sleep_before);
+
+ ensure_time;
+ if (ratelimiter_allow(skb4, &init_net) !=
+ expected_results[i].result)
+ goto err;
+ ++test;
+ hdr4->saddr = htonl(ntohl(hdr4->saddr) + i + 1);
+ ensure_time;
+ if (!ratelimiter_allow(skb4, &init_net))
+ goto err;
+ ++test;
+ hdr4->saddr = htonl(ntohl(hdr4->saddr) - i - 1);
+
+#if IS_ENABLED(CONFIG_IPV6)
+ hdr6->saddr.in6_u.u6_addr32[2] =
+ hdr6->saddr.in6_u.u6_addr32[3] = htonl(i);
+ ensure_time;
+ if (ratelimiter_allow(skb6, &init_net) !=
+ expected_results[i].result)
+ goto err;
+ ++test;
+ hdr6->saddr.in6_u.u6_addr32[0] =
+ htonl(ntohl(hdr6->saddr.in6_u.u6_addr32[0]) + i + 1);
+ ensure_time;
+ if (!ratelimiter_allow(skb6, &init_net))
+ goto err;
+ ++test;
+ hdr6->saddr.in6_u.u6_addr32[0] =
+ htonl(ntohl(hdr6->saddr.in6_u.u6_addr32[0]) - i - 1);
+ ensure_time;
+#endif
+ }
+
+ tries = 0;
+restart2:
+ gc_entries(NULL);
+ rcu_barrier();
+
+ if (atomic_read(&total_entries))
+ goto err;
+ ++test;
+
+ for (i = 0; i <= max_entries; ++i) {
+ hdr4->saddr = htonl(i);
+ if (ratelimiter_allow(skb4, &init_net) != (i != max_entries)) {
+ if (++tries < 5000)
+ goto restart2;
+ goto err;
+ }
+ ++test;
+ }
+
+ ret = true;
+
+err:
+ kfree_skb(skb4);
+#if IS_ENABLED(CONFIG_IPV6)
+ kfree_skb(skb6);
+#endif
+err_nofree:
+ ratelimiter_uninit();
+ ratelimiter_uninit();
+ ratelimiter_uninit();
+ /* Uninit one extra time to check underflow detection. */
+ ratelimiter_uninit();
+out:
+ if (ret)
+ pr_info("ratelimiter self-tests: pass\n");
+ else
+ pr_info("ratelimiter self-test %d: FAIL\n", test);
+
+ return ret;
+}
+#endif
diff --git a/drivers/net/wireguard/send.c b/drivers/net/wireguard/send.c
new file mode 100644
index 000000000000..5b6d0fe733c4
--- /dev/null
+++ b/drivers/net/wireguard/send.c
@@ -0,0 +1,420 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#include "queueing.h"
+#include "timers.h"
+#include "device.h"
+#include "peer.h"
+#include "socket.h"
+#include "messages.h"
+#include "cookie.h"
+
+#include <linux/simd.h>
+#include <linux/uio.h>
+#include <linux/inetdevice.h>
+#include <linux/socket.h>
+#include <net/ip_tunnels.h>
+#include <net/udp.h>
+#include <net/sock.h>
+
+static void packet_send_handshake_initiation(struct wireguard_peer *peer)
+{
+ struct message_handshake_initiation packet;
+
+ if (!has_expired(atomic64_read(&peer->last_sent_handshake),
+ REKEY_TIMEOUT))
+ return; /* This function is rate limited. */
+
+ atomic64_set(&peer->last_sent_handshake, ktime_get_boot_fast_ns());
+ net_dbg_ratelimited("%s: Sending handshake initiation to peer %llu (%pISpfsc)\n",
+ peer->device->dev->name, peer->internal_id,
+ &peer->endpoint.addr);
+
+ if (noise_handshake_create_initiation(&packet, &peer->handshake)) {
+ cookie_add_mac_to_packet(&packet, sizeof(packet), peer);
+ timers_any_authenticated_packet_traversal(peer);
+ timers_any_authenticated_packet_sent(peer);
+ atomic64_set(&peer->last_sent_handshake,
+ ktime_get_boot_fast_ns());
+ socket_send_buffer_to_peer(peer, &packet, sizeof(packet),
+ HANDSHAKE_DSCP);
+ timers_handshake_initiated(peer);
+ }
+}
+
+void packet_handshake_send_worker(struct work_struct *work)
+{
+ struct wireguard_peer *peer = container_of(work, struct wireguard_peer,
+ transmit_handshake_work);
+
+ packet_send_handshake_initiation(peer);
+ peer_put(peer);
+}
+
+void packet_send_queued_handshake_initiation(struct wireguard_peer *peer,
+ bool is_retry)
+{
+ if (!is_retry)
+ peer->timer_handshake_attempts = 0;
+
+ rcu_read_lock_bh();
+ /* We check last_sent_handshake here in addition to the actual function
+ * we're queueing up, so that we don't queue things if not strictly
+ * necessary:
+ */
+ if (!has_expired(atomic64_read(&peer->last_sent_handshake),
+ REKEY_TIMEOUT) || unlikely(peer->is_dead))
+ goto out;
+
+ peer_get(peer);
+ /* Queues up calling packet_handshake_send_worker(), where we do a
+ * peer_put(peer) after:
+ */
+ if (!queue_work(peer->device->handshake_send_wq,
+ &peer->transmit_handshake_work))
+ /* If the work was already queued, we want to drop the
+ * extra reference:
+ */
+ peer_put(peer);
+out:
+ rcu_read_unlock_bh();
+}
+
+void packet_send_handshake_response(struct wireguard_peer *peer)
+{
+ struct message_handshake_response packet;
+
+ atomic64_set(&peer->last_sent_handshake, ktime_get_boot_fast_ns());
+ net_dbg_ratelimited("%s: Sending handshake response to peer %llu (%pISpfsc)\n",
+ peer->device->dev->name, peer->internal_id,
+ &peer->endpoint.addr);
+
+ if (noise_handshake_create_response(&packet, &peer->handshake)) {
+ cookie_add_mac_to_packet(&packet, sizeof(packet), peer);
+ if (noise_handshake_begin_session(&peer->handshake,
+ &peer->keypairs)) {
+ timers_session_derived(peer);
+ timers_any_authenticated_packet_traversal(peer);
+ timers_any_authenticated_packet_sent(peer);
+ atomic64_set(&peer->last_sent_handshake,
+ ktime_get_boot_fast_ns());
+ socket_send_buffer_to_peer(peer, &packet,
+ sizeof(packet),
+ HANDSHAKE_DSCP);
+ }
+ }
+}
+
+void packet_send_handshake_cookie(struct wireguard_device *wg,
+ struct sk_buff *initiating_skb,
+ __le32 sender_index)
+{
+ struct message_handshake_cookie packet;
+
+ net_dbg_skb_ratelimited("%s: Sending cookie response for denied handshake message for %pISpfsc\n",
+ wg->dev->name, initiating_skb);
+ cookie_message_create(&packet, initiating_skb, sender_index,
+ &wg->cookie_checker);
+ socket_send_buffer_as_reply_to_skb(wg, initiating_skb, &packet,
+ sizeof(packet));
+}
+
+static inline void keep_key_fresh(struct wireguard_peer *peer)
+{
+ struct noise_keypair *keypair;
+ bool send = false;
+
+ rcu_read_lock_bh();
+ keypair = rcu_dereference_bh(peer->keypairs.current_keypair);
+ if (likely(keypair && keypair->sending.is_valid) &&
+ (unlikely(atomic64_read(&keypair->sending.counter.counter) >
+ REKEY_AFTER_MESSAGES) ||
+ (keypair->i_am_the_initiator &&
+ unlikely(has_expired(keypair->sending.birthdate,
+ REKEY_AFTER_TIME)))))
+ send = true;
+ rcu_read_unlock_bh();
+
+ if (send)
+ packet_send_queued_handshake_initiation(peer, false);
+}
+
+static inline unsigned int skb_padding(struct sk_buff *skb)
+{
+ /* We do this modulo business with the MTU, just in case the networking
+ * layer gives us a packet that's bigger than the MTU. In that case, we
+ * wouldn't want the final subtraction to overflow in the case of the
+ * padded_size being clamped.
+ */
+ unsigned int last_unit = skb->len % PACKET_CB(skb)->mtu;
+ unsigned int padded_size = ALIGN(last_unit, MESSAGE_PADDING_MULTIPLE);
+
+ if (padded_size > PACKET_CB(skb)->mtu)
+ padded_size = PACKET_CB(skb)->mtu;
+ return padded_size - last_unit;
+}
+
+static inline bool skb_encrypt(struct sk_buff *skb,
+ struct noise_keypair *keypair,
+ simd_context_t simd_context)
+{
+ unsigned int padding_len, plaintext_len, trailer_len;
+ struct scatterlist sg[MAX_SKB_FRAGS * 2 + 1];
+ struct message_data *header;
+ struct sk_buff *trailer;
+ int num_frags;
+
+ /* Calculate lengths. */
+ padding_len = skb_padding(skb);
+ trailer_len = padding_len + noise_encrypted_len(0);
+ plaintext_len = skb->len + padding_len;
+
+ /* Expand data section to have room for padding and auth tag. */
+ num_frags = skb_cow_data(skb, trailer_len, &trailer);
+ if (unlikely(num_frags < 0 || num_frags > ARRAY_SIZE(sg)))
+ return false;
+
+ /* Set the padding to zeros, and make sure it and the auth tag are part
+ * of the skb.
+ */
+ memset(skb_tail_pointer(trailer), 0, padding_len);
+
+ /* Expand head section to have room for our header and the network
+ * stack's headers.
+ */
+ if (unlikely(skb_cow_head(skb, DATA_PACKET_HEAD_ROOM) < 0))
+ return false;
+
+ /* We have to remember to add the checksum to the inner packet, in case
+ * the receiver forwards it.
+ */
+ if (likely(!skb_checksum_setup(skb, true)))
+ skb_checksum_help(skb);
+
+ /* Only after checksumming can we safely add on the padding at the end
+ * and the header.
+ */
+ skb_set_inner_network_header(skb, 0);
+ header = (struct message_data *)skb_push(skb, sizeof(*header));
+ header->header.type = cpu_to_le32(MESSAGE_DATA);
+ header->key_idx = keypair->remote_index;
+ header->counter = cpu_to_le64(PACKET_CB(skb)->nonce);
+ pskb_put(skb, trailer, trailer_len);
+
+ /* Now we can encrypt the scatter-gather segments. */
+ sg_init_table(sg, num_frags);
+ if (skb_to_sgvec(skb, sg, sizeof(struct message_data),
+ noise_encrypted_len(plaintext_len)) <= 0)
+ return false;
+ return chacha20poly1305_encrypt_sg(sg, sg, plaintext_len, NULL, 0,
+ PACKET_CB(skb)->nonce,
+ keypair->sending.key, simd_context);
+}
+
+void packet_send_keepalive(struct wireguard_peer *peer)
+{
+ struct sk_buff *skb;
+
+ if (skb_queue_empty(&peer->staged_packet_queue)) {
+ skb = alloc_skb(DATA_PACKET_HEAD_ROOM + MESSAGE_MINIMUM_LENGTH,
+ GFP_ATOMIC);
+ if (unlikely(!skb))
+ return;
+ skb_reserve(skb, DATA_PACKET_HEAD_ROOM);
+ skb->dev = peer->device->dev;
+ PACKET_CB(skb)->mtu = skb->dev->mtu;
+ skb_queue_tail(&peer->staged_packet_queue, skb);
+ net_dbg_ratelimited("%s: Sending keepalive packet to peer %llu (%pISpfsc)\n",
+ peer->device->dev->name, peer->internal_id,
+ &peer->endpoint.addr);
+ }
+
+ packet_send_staged_packets(peer);
+}
+
+#define skb_walk_null_queue_safe(first, skb, next) \
+ for (skb = first, next = skb->next; skb; \
+ skb = next, next = skb ? skb->next : NULL)
+static inline void skb_free_null_queue(struct sk_buff *first)
+{
+ struct sk_buff *skb, *next;
+
+ skb_walk_null_queue_safe (first, skb, next)
+ dev_kfree_skb(skb);
+}
+
+static void packet_create_data_done(struct sk_buff *first,
+ struct wireguard_peer *peer)
+{
+ struct sk_buff *skb, *next;
+ bool is_keepalive, data_sent = false;
+
+ timers_any_authenticated_packet_traversal(peer);
+ timers_any_authenticated_packet_sent(peer);
+ skb_walk_null_queue_safe (first, skb, next) {
+ is_keepalive = skb->len == message_data_len(0);
+ if (likely(!socket_send_skb_to_peer(peer, skb,
+ PACKET_CB(skb)->ds) && !is_keepalive))
+ data_sent = true;
+ }
+
+ if (likely(data_sent))
+ timers_data_sent(peer);
+
+ keep_key_fresh(peer);
+}
+
+void packet_tx_worker(struct work_struct *work)
+{
+ struct crypt_queue *queue =
+ container_of(work, struct crypt_queue, work);
+ struct wireguard_peer *peer;
+ struct noise_keypair *keypair;
+ struct sk_buff *first;
+ enum packet_state state;
+
+ while ((first = __ptr_ring_peek(&queue->ring)) != NULL &&
+ (state = atomic_read_acquire(&PACKET_CB(first)->state)) !=
+ PACKET_STATE_UNCRYPTED) {
+ __ptr_ring_discard_one(&queue->ring);
+ peer = PACKET_PEER(first);
+ keypair = PACKET_CB(first)->keypair;
+
+ if (likely(state == PACKET_STATE_CRYPTED))
+ packet_create_data_done(first, peer);
+ else
+ skb_free_null_queue(first);
+
+ noise_keypair_put(keypair, false);
+ peer_put(peer);
+ }
+}
+
+void packet_encrypt_worker(struct work_struct *work)
+{
+ struct crypt_queue *queue =
+ container_of(work, struct multicore_worker, work)->ptr;
+ struct sk_buff *first, *skb, *next;
+ simd_context_t simd_context = simd_get();
+
+ while ((first = ptr_ring_consume_bh(&queue->ring)) != NULL) {
+ enum packet_state state = PACKET_STATE_CRYPTED;
+
+ skb_walk_null_queue_safe (first, skb, next) {
+ if (likely(skb_encrypt(skb, PACKET_CB(first)->keypair,
+ simd_context)))
+ skb_reset(skb);
+ else {
+ state = PACKET_STATE_DEAD;
+ break;
+ }
+ }
+ queue_enqueue_per_peer(&PACKET_PEER(first)->tx_queue, first,
+ state);
+
+ simd_context = simd_relax(simd_context);
+ }
+ simd_put(simd_context);
+}
+
+static void packet_create_data(struct sk_buff *first)
+{
+ struct wireguard_peer *peer = PACKET_PEER(first);
+ struct wireguard_device *wg = peer->device;
+ int ret = -EINVAL;
+
+ rcu_read_lock_bh();
+ if (unlikely(peer->is_dead))
+ goto err;
+
+ ret = queue_enqueue_per_device_and_peer(&wg->encrypt_queue,
+ &peer->tx_queue, first,
+ wg->packet_crypt_wq,
+ &wg->encrypt_queue.last_cpu);
+ if (unlikely(ret == -EPIPE))
+ queue_enqueue_per_peer(&peer->tx_queue, first,
+ PACKET_STATE_DEAD);
+err:
+ rcu_read_unlock_bh();
+ if (likely(!ret || ret == -EPIPE))
+ return;
+ noise_keypair_put(PACKET_CB(first)->keypair, false);
+ peer_put(peer);
+ skb_free_null_queue(first);
+}
+
+void packet_send_staged_packets(struct wireguard_peer *peer)
+{
+ struct noise_symmetric_key *key;
+ struct noise_keypair *keypair;
+ struct sk_buff_head packets;
+ struct sk_buff *skb;
+
+ /* Steal the current queue into our local one. */
+ __skb_queue_head_init(&packets);
+ spin_lock_bh(&peer->staged_packet_queue.lock);
+ skb_queue_splice_init(&peer->staged_packet_queue, &packets);
+ spin_unlock_bh(&peer->staged_packet_queue.lock);
+ if (unlikely(skb_queue_empty(&packets)))
+ return;
+
+ /* First we make sure we have a valid reference to a valid key. */
+ rcu_read_lock_bh();
+ keypair = noise_keypair_get(
+ rcu_dereference_bh(peer->keypairs.current_keypair));
+ rcu_read_unlock_bh();
+ if (unlikely(!keypair))
+ goto out_nokey;
+ key = &keypair->sending;
+ if (unlikely(!key->is_valid))
+ goto out_nokey;
+ if (unlikely(has_expired(key->birthdate, REJECT_AFTER_TIME)))
+ goto out_invalid;
+
+ /* After we know we have a somewhat valid key, we now try to assign
+ * nonces to all of the packets in the queue. If we can't assign nonces
+ * for all of them, we just consider it a failure and wait for the next
+ * handshake.
+ */
+ skb_queue_walk (&packets, skb) {
+ /* 0 for no outer TOS: no leak. TODO: should we use flowi->tos
+ * as outer? */
+ PACKET_CB(skb)->ds = ip_tunnel_ecn_encap(0, ip_hdr(skb), skb);
+ PACKET_CB(skb)->nonce =
+ atomic64_inc_return(&key->counter.counter) - 1;
+ if (unlikely(PACKET_CB(skb)->nonce >= REJECT_AFTER_MESSAGES))
+ goto out_invalid;
+ }
+
+ packets.prev->next = NULL;
+ peer_get(keypair->entry.peer);
+ PACKET_CB(packets.next)->keypair = keypair;
+ packet_create_data(packets.next);
+ return;
+
+out_invalid:
+ key->is_valid = false;
+out_nokey:
+ noise_keypair_put(keypair, false);
+
+ /* We orphan the packets if we're waiting on a handshake, so that they
+ * don't block a socket's pool.
+ */
+ skb_queue_walk (&packets, skb)
+ skb_orphan(skb);
+ /* Then we put them back on the top of the queue. We're not too
+ * concerned about accidentally getting things a little out of order if
+ * packets are being added really fast, because this queue is for before
+ * packets can even be sent and it's small anyway.
+ */
+ spin_lock_bh(&peer->staged_packet_queue.lock);
+ skb_queue_splice(&packets, &peer->staged_packet_queue);
+ spin_unlock_bh(&peer->staged_packet_queue.lock);
+
+ /* If we're exiting because there's something wrong with the key, it
+ * means we should initiate a new handshake.
+ */
+ packet_send_queued_handshake_initiation(peer, false);
+}
diff --git a/drivers/net/wireguard/socket.c b/drivers/net/wireguard/socket.c
new file mode 100644
index 000000000000..2e9e44f3b1d2
--- /dev/null
+++ b/drivers/net/wireguard/socket.c
@@ -0,0 +1,435 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#include "device.h"
+#include "peer.h"
+#include "socket.h"
+#include "queueing.h"
+#include "messages.h"
+
+#include <linux/ctype.h>
+#include <linux/net.h>
+#include <linux/if_vlan.h>
+#include <linux/if_ether.h>
+#include <linux/inetdevice.h>
+#include <net/udp_tunnel.h>
+#include <net/ipv6.h>
+
+static inline int send4(struct wireguard_device *wg, struct sk_buff *skb,
+ struct endpoint *endpoint, u8 ds,
+ struct dst_cache *cache)
+{
+ struct flowi4 fl = {
+ .saddr = endpoint->src4.s_addr,
+ .daddr = endpoint->addr4.sin_addr.s_addr,
+ .fl4_dport = endpoint->addr4.sin_port,
+ .flowi4_mark = wg->fwmark,
+ .flowi4_proto = IPPROTO_UDP
+ };
+ struct rtable *rt = NULL;
+ struct sock *sock;
+ int ret = 0;
+
+ skb->next = skb->prev = NULL;
+ skb->dev = wg->dev;
+ skb->mark = wg->fwmark;
+
+ rcu_read_lock_bh();
+ sock = rcu_dereference_bh(wg->sock4);
+
+ if (unlikely(!sock)) {
+ ret = -ENONET;
+ goto err;
+ }
+
+ fl.fl4_sport = inet_sk(sock)->inet_sport;
+
+ if (cache)
+ rt = dst_cache_get_ip4(cache, &fl.saddr);
+
+ if (!rt) {
+ security_sk_classify_flow(sock, flowi4_to_flowi(&fl));
+ if (unlikely(!inet_confirm_addr(sock_net(sock), NULL, 0,
+ fl.saddr, RT_SCOPE_HOST))) {
+ endpoint->src4.s_addr = 0;
+ *(__force __be32 *)&endpoint->src_if4 = 0;
+ fl.saddr = 0;
+ if (cache)
+ dst_cache_reset(cache);
+ }
+ rt = ip_route_output_flow(sock_net(sock), &fl, sock);
+ if (unlikely(endpoint->src_if4 && ((IS_ERR(rt) &&
+ PTR_ERR(rt) == -EINVAL) || (!IS_ERR(rt) &&
+ rt->dst.dev->ifindex != endpoint->src_if4)))) {
+ endpoint->src4.s_addr = 0;
+ *(__force __be32 *)&endpoint->src_if4 = 0;
+ fl.saddr = 0;
+ if (cache)
+ dst_cache_reset(cache);
+ if (!IS_ERR(rt))
+ ip_rt_put(rt);
+ rt = ip_route_output_flow(sock_net(sock), &fl, sock);
+ }
+ if (unlikely(IS_ERR(rt))) {
+ ret = PTR_ERR(rt);
+ net_dbg_ratelimited("%s: No route to %pISpfsc, error %d\n",
+ wg->dev->name, &endpoint->addr, ret);
+ goto err;
+ } else if (unlikely(rt->dst.dev == skb->dev)) {
+ ip_rt_put(rt);
+ ret = -ELOOP;
+ net_dbg_ratelimited("%s: Avoiding routing loop to %pISpfsc\n",
+ wg->dev->name, &endpoint->addr);
+ goto err;
+ }
+ if (cache)
+ dst_cache_set_ip4(cache, &rt->dst, fl.saddr);
+ }
+ udp_tunnel_xmit_skb(rt, sock, skb, fl.saddr, fl.daddr, ds,
+ ip4_dst_hoplimit(&rt->dst), 0, fl.fl4_sport,
+ fl.fl4_dport, false, false);
+ goto out;
+
+err:
+ kfree_skb(skb);
+out:
+ rcu_read_unlock_bh();
+ return ret;
+}
+
+static inline int send6(struct wireguard_device *wg, struct sk_buff *skb,
+ struct endpoint *endpoint, u8 ds,
+ struct dst_cache *cache)
+{
+#if IS_ENABLED(CONFIG_IPV6)
+ struct flowi6 fl = {
+ .saddr = endpoint->src6,
+ .daddr = endpoint->addr6.sin6_addr,
+ .fl6_dport = endpoint->addr6.sin6_port,
+ .flowi6_mark = wg->fwmark,
+ .flowi6_oif = endpoint->addr6.sin6_scope_id,
+ .flowi6_proto = IPPROTO_UDP
+ /* TODO: addr->sin6_flowinfo */
+ };
+ struct dst_entry *dst = NULL;
+ struct sock *sock;
+ int ret = 0;
+
+ skb->next = skb->prev = NULL;
+ skb->dev = wg->dev;
+ skb->mark = wg->fwmark;
+
+ rcu_read_lock_bh();
+ sock = rcu_dereference_bh(wg->sock6);
+
+ if (unlikely(!sock)) {
+ ret = -ENONET;
+ goto err;
+ }
+
+ fl.fl6_sport = inet_sk(sock)->inet_sport;
+
+ if (cache)
+ dst = dst_cache_get_ip6(cache, &fl.saddr);
+
+ if (!dst) {
+ security_sk_classify_flow(sock, flowi6_to_flowi(&fl));
+ if (unlikely(!ipv6_addr_any(&fl.saddr) &&
+ !ipv6_chk_addr(sock_net(sock), &fl.saddr, NULL, 0))) {
+ endpoint->src6 = fl.saddr = in6addr_any;
+ if (cache)
+ dst_cache_reset(cache);
+ }
+ ret = ipv6_stub->ipv6_dst_lookup(sock_net(sock), sock, &dst,
+ &fl);
+ if (unlikely(ret)) {
+ net_dbg_ratelimited("%s: No route to %pISpfsc, error %d\n",
+ wg->dev->name, &endpoint->addr, ret);
+ goto err;
+ } else if (unlikely(dst->dev == skb->dev)) {
+ dst_release(dst);
+ ret = -ELOOP;
+ net_dbg_ratelimited("%s: Avoiding routing loop to %pISpfsc\n",
+ wg->dev->name, &endpoint->addr);
+ goto err;
+ }
+ if (cache)
+ dst_cache_set_ip6(cache, dst, &fl.saddr);
+ }
+
+ udp_tunnel6_xmit_skb(dst, sock, skb, skb->dev, &fl.saddr, &fl.daddr, ds,
+ ip6_dst_hoplimit(dst), 0, fl.fl6_sport,
+ fl.fl6_dport, false);
+ goto out;
+
+err:
+ kfree_skb(skb);
+out:
+ rcu_read_unlock_bh();
+ return ret;
+#else
+ return -EAFNOSUPPORT;
+#endif
+}
+
+int socket_send_skb_to_peer(struct wireguard_peer *peer, struct sk_buff *skb,
+ u8 ds)
+{
+ size_t skb_len = skb->len;
+ int ret = -EAFNOSUPPORT;
+
+ read_lock_bh(&peer->endpoint_lock);
+ if (peer->endpoint.addr.sa_family == AF_INET)
+ ret = send4(peer->device, skb, &peer->endpoint, ds,
+ &peer->endpoint_cache);
+ else if (peer->endpoint.addr.sa_family == AF_INET6)
+ ret = send6(peer->device, skb, &peer->endpoint, ds,
+ &peer->endpoint_cache);
+ else
+ dev_kfree_skb(skb);
+ if (likely(!ret))
+ peer->tx_bytes += skb_len;
+ read_unlock_bh(&peer->endpoint_lock);
+
+ return ret;
+}
+
+int socket_send_buffer_to_peer(struct wireguard_peer *peer, void *buffer,
+ size_t len, u8 ds)
+{
+ struct sk_buff *skb = alloc_skb(len + SKB_HEADER_LEN, GFP_ATOMIC);
+
+ if (unlikely(!skb))
+ return -ENOMEM;
+
+ skb_reserve(skb, SKB_HEADER_LEN);
+ skb_set_inner_network_header(skb, 0);
+ skb_put_data(skb, buffer, len);
+ return socket_send_skb_to_peer(peer, skb, ds);
+}
+
+int socket_send_buffer_as_reply_to_skb(struct wireguard_device *wg,
+ struct sk_buff *in_skb, void *buffer,
+ size_t len)
+{
+ int ret = 0;
+ struct sk_buff *skb;
+ struct endpoint endpoint;
+
+ if (unlikely(!in_skb))
+ return -EINVAL;
+ ret = socket_endpoint_from_skb(&endpoint, in_skb);
+ if (unlikely(ret < 0))
+ return ret;
+
+ skb = alloc_skb(len + SKB_HEADER_LEN, GFP_ATOMIC);
+ if (unlikely(!skb))
+ return -ENOMEM;
+ skb_reserve(skb, SKB_HEADER_LEN);
+ skb_set_inner_network_header(skb, 0);
+ skb_put_data(skb, buffer, len);
+
+ if (endpoint.addr.sa_family == AF_INET)
+ ret = send4(wg, skb, &endpoint, 0, NULL);
+ else if (endpoint.addr.sa_family == AF_INET6)
+ ret = send6(wg, skb, &endpoint, 0, NULL);
+ /* No other possibilities if the endpoint is valid, which it is,
+ * as we checked above.
+ */
+
+ return ret;
+}
+
+int socket_endpoint_from_skb(struct endpoint *endpoint,
+ const struct sk_buff *skb)
+{
+ memset(endpoint, 0, sizeof(*endpoint));
+ if (skb->protocol == htons(ETH_P_IP)) {
+ endpoint->addr4.sin_family = AF_INET;
+ endpoint->addr4.sin_port = udp_hdr(skb)->source;
+ endpoint->addr4.sin_addr.s_addr = ip_hdr(skb)->saddr;
+ endpoint->src4.s_addr = ip_hdr(skb)->daddr;
+ endpoint->src_if4 = skb->skb_iif;
+ } else if (skb->protocol == htons(ETH_P_IPV6)) {
+ endpoint->addr6.sin6_family = AF_INET6;
+ endpoint->addr6.sin6_port = udp_hdr(skb)->source;
+ endpoint->addr6.sin6_addr = ipv6_hdr(skb)->saddr;
+ endpoint->addr6.sin6_scope_id = ipv6_iface_scope_id(
+ &ipv6_hdr(skb)->saddr, skb->skb_iif);
+ endpoint->src6 = ipv6_hdr(skb)->daddr;
+ } else
+ return -EINVAL;
+ return 0;
+}
+
+static inline bool endpoint_eq(const struct endpoint *a,
+ const struct endpoint *b)
+{
+ return (a->addr.sa_family == AF_INET && b->addr.sa_family == AF_INET &&
+ a->addr4.sin_port == b->addr4.sin_port &&
+ a->addr4.sin_addr.s_addr == b->addr4.sin_addr.s_addr &&
+ a->src4.s_addr == b->src4.s_addr && a->src_if4 == b->src_if4) ||
+ (a->addr.sa_family == AF_INET6 &&
+ b->addr.sa_family == AF_INET6 &&
+ a->addr6.sin6_port == b->addr6.sin6_port &&
+ ipv6_addr_equal(&a->addr6.sin6_addr, &b->addr6.sin6_addr) &&
+ a->addr6.sin6_scope_id == b->addr6.sin6_scope_id &&
+ ipv6_addr_equal(&a->src6, &b->src6)) ||
+ unlikely(!a->addr.sa_family && !b->addr.sa_family);
+}
+
+void socket_set_peer_endpoint(struct wireguard_peer *peer,
+ const struct endpoint *endpoint)
+{
+ /* First we check unlocked, in order to optimize, since it's pretty rare
+ * that an endpoint will change. If we happen to be mid-write, and two
+ * CPUs wind up writing the same thing or something slightly different,
+ * it doesn't really matter much either.
+ */
+ if (endpoint_eq(endpoint, &peer->endpoint))
+ return;
+ write_lock_bh(&peer->endpoint_lock);
+ if (endpoint->addr.sa_family == AF_INET) {
+ peer->endpoint.addr4 = endpoint->addr4;
+ peer->endpoint.src4 = endpoint->src4;
+ peer->endpoint.src_if4 = endpoint->src_if4;
+ } else if (endpoint->addr.sa_family == AF_INET6) {
+ peer->endpoint.addr6 = endpoint->addr6;
+ peer->endpoint.src6 = endpoint->src6;
+ } else
+ goto out;
+ dst_cache_reset(&peer->endpoint_cache);
+out:
+ write_unlock_bh(&peer->endpoint_lock);
+}
+
+void socket_set_peer_endpoint_from_skb(struct wireguard_peer *peer,
+ const struct sk_buff *skb)
+{
+ struct endpoint endpoint;
+
+ if (!socket_endpoint_from_skb(&endpoint, skb))
+ socket_set_peer_endpoint(peer, &endpoint);
+}
+
+void socket_clear_peer_endpoint_src(struct wireguard_peer *peer)
+{
+ write_lock_bh(&peer->endpoint_lock);
+ memset(&peer->endpoint.src6, 0, sizeof(peer->endpoint.src6));
+ dst_cache_reset(&peer->endpoint_cache);
+ write_unlock_bh(&peer->endpoint_lock);
+}
+
+static int receive(struct sock *sk, struct sk_buff *skb)
+{
+ struct wireguard_device *wg;
+
+ if (unlikely(!sk))
+ goto err;
+ wg = sk->sk_user_data;
+ if (unlikely(!wg))
+ goto err;
+ packet_receive(wg, skb);
+ return 0;
+
+err:
+ kfree_skb(skb);
+ return 0;
+}
+
+static inline void sock_free(struct sock *sock)
+{
+ if (unlikely(!sock))
+ return;
+ sk_clear_memalloc(sock);
+ udp_tunnel_sock_release(sock->sk_socket);
+}
+
+static inline void set_sock_opts(struct socket *sock)
+{
+ sock->sk->sk_allocation = GFP_ATOMIC;
+ sock->sk->sk_sndbuf = INT_MAX;
+ sk_set_memalloc(sock->sk);
+}
+
+int socket_init(struct wireguard_device *wg, u16 port)
+{
+ int ret;
+ struct udp_tunnel_sock_cfg cfg = {
+ .sk_user_data = wg,
+ .encap_type = 1,
+ .encap_rcv = receive
+ };
+ struct socket *new4 = NULL, *new6 = NULL;
+ struct udp_port_cfg port4 = {
+ .family = AF_INET,
+ .local_ip.s_addr = htonl(INADDR_ANY),
+ .local_udp_port = htons(port),
+ .use_udp_checksums = true
+ };
+#if IS_ENABLED(CONFIG_IPV6)
+ int retries = 0;
+ struct udp_port_cfg port6 = {
+ .family = AF_INET6,
+ .local_ip6 = IN6ADDR_ANY_INIT,
+ .use_udp6_tx_checksums = true,
+ .use_udp6_rx_checksums = true,
+ .ipv6_v6only = true
+ };
+#endif
+
+#if IS_ENABLED(CONFIG_IPV6)
+retry:
+#endif
+
+ ret = udp_sock_create(wg->creating_net, &port4, &new4);
+ if (ret < 0) {
+ pr_err("%s: Could not create IPv4 socket\n", wg->dev->name);
+ return ret;
+ }
+ set_sock_opts(new4);
+ setup_udp_tunnel_sock(wg->creating_net, new4, &cfg);
+
+#if IS_ENABLED(CONFIG_IPV6)
+ if (ipv6_mod_enabled()) {
+ port6.local_udp_port = inet_sk(new4->sk)->inet_sport;
+ ret = udp_sock_create(wg->creating_net, &port6, &new6);
+ if (ret < 0) {
+ udp_tunnel_sock_release(new4);
+ if (ret == -EADDRINUSE && !port && retries++ < 100)
+ goto retry;
+ pr_err("%s: Could not create IPv6 socket\n",
+ wg->dev->name);
+ return ret;
+ }
+ set_sock_opts(new6);
+ setup_udp_tunnel_sock(wg->creating_net, new6, &cfg);
+ }
+#endif
+
+ socket_reinit(wg, new4 ? new4->sk : NULL, new6 ? new6->sk : NULL);
+ return 0;
+}
+
+void socket_reinit(struct wireguard_device *wg, struct sock *new4,
+ struct sock *new6)
+{
+ struct sock *old4, *old6;
+
+ mutex_lock(&wg->socket_update_lock);
+ old4 = rcu_dereference_protected(wg->sock4,
+ lockdep_is_held(&wg->socket_update_lock));
+ old6 = rcu_dereference_protected(wg->sock6,
+ lockdep_is_held(&wg->socket_update_lock));
+ rcu_assign_pointer(wg->sock4, new4);
+ rcu_assign_pointer(wg->sock6, new6);
+ if (new4)
+ wg->incoming_port = ntohs(inet_sk(new4)->inet_sport);
+ mutex_unlock(&wg->socket_update_lock);
+ synchronize_rcu_bh();
+ synchronize_net();
+ sock_free(old4);
+ sock_free(old6);
+}
diff --git a/drivers/net/wireguard/socket.h b/drivers/net/wireguard/socket.h
new file mode 100644
index 000000000000..d873ffad9ea3
--- /dev/null
+++ b/drivers/net/wireguard/socket.h
@@ -0,0 +1,44 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#ifndef _WG_SOCKET_H
+#define _WG_SOCKET_H
+
+#include <linux/netdevice.h>
+#include <linux/udp.h>
+#include <linux/if_vlan.h>
+#include <linux/if_ether.h>
+
+int socket_init(struct wireguard_device *wg, u16 port);
+void socket_reinit(struct wireguard_device *wg, struct sock *new4,
+ struct sock *new6);
+int socket_send_buffer_to_peer(struct wireguard_peer *peer, void *data,
+ size_t len, u8 ds);
+int socket_send_skb_to_peer(struct wireguard_peer *peer, struct sk_buff *skb,
+ u8 ds);
+int socket_send_buffer_as_reply_to_skb(struct wireguard_device *wg,
+ struct sk_buff *in_skb, void *out_buffer,
+ size_t len);
+
+int socket_endpoint_from_skb(struct endpoint *endpoint,
+ const struct sk_buff *skb);
+void socket_set_peer_endpoint(struct wireguard_peer *peer,
+ const struct endpoint *endpoint);
+void socket_set_peer_endpoint_from_skb(struct wireguard_peer *peer,
+ const struct sk_buff *skb);
+void socket_clear_peer_endpoint_src(struct wireguard_peer *peer);
+
+#if defined(CONFIG_DYNAMIC_DEBUG) || defined(DEBUG)
+#define net_dbg_skb_ratelimited(fmt, dev, skb, ...) do { \
+ struct endpoint __endpoint; \
+ socket_endpoint_from_skb(&__endpoint, skb); \
+ net_dbg_ratelimited(fmt, dev, &__endpoint.addr, \
+ ##__VA_ARGS__); \
+ } while (0)
+#else
+#define net_dbg_skb_ratelimited(fmt, dev, skb, ...)
+#endif
+
+#endif /* _WG_SOCKET_H */
diff --git a/drivers/net/wireguard/timers.c b/drivers/net/wireguard/timers.c
new file mode 100644
index 000000000000..fead499a7321
--- /dev/null
+++ b/drivers/net/wireguard/timers.c
@@ -0,0 +1,256 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#include "timers.h"
+#include "device.h"
+#include "peer.h"
+#include "queueing.h"
+#include "socket.h"
+
+/*
+ * - Timer for retransmitting the handshake if we don't hear back after
+ * `REKEY_TIMEOUT` seconds plus jitter.
+ *
+ * - Timer for sending an empty packet if we have received a packet but have
+ * not sent one afterwards for `KEEPALIVE_TIMEOUT` seconds.
+ *
+ * - Timer for initiating a new handshake if we have sent a packet but have not
+ * received one (even empty) for `KEEPALIVE_TIMEOUT + REKEY_TIMEOUT` seconds.
+ *
+ * - Timer for zeroing out all ephemeral keys after `REJECT_AFTER_TIME * 3`
+ * seconds if no new keys have been received.
+ *
+ * - Timer for sending, if enabled, an empty authenticated packet every
+ * user-specified number of seconds.
+ */
+
+#define peer_get_from_timer(timer_name) \
+ struct wireguard_peer *peer; \
+ rcu_read_lock_bh(); \
+ peer = peer_get_maybe_zero(from_timer(peer, timer, timer_name)); \
+ rcu_read_unlock_bh(); \
+ if (unlikely(!peer)) \
+ return;
+
+static inline void mod_peer_timer(struct wireguard_peer *peer,
+ struct timer_list *timer,
+ unsigned long expires)
+{
+ rcu_read_lock_bh();
+ if (likely(netif_running(peer->device->dev) && !peer->is_dead))
+ mod_timer(timer, expires);
+ rcu_read_unlock_bh();
+}
+
+static inline void del_peer_timer(struct wireguard_peer *peer,
+ struct timer_list *timer)
+{
+ rcu_read_lock_bh();
+ if (likely(netif_running(peer->device->dev) && !peer->is_dead))
+ del_timer(timer);
+ rcu_read_unlock_bh();
+}
+
+static void expired_retransmit_handshake(struct timer_list *timer)
+{
+ peer_get_from_timer(timer_retransmit_handshake);
+
+ if (peer->timer_handshake_attempts > MAX_TIMER_HANDSHAKES) {
+ pr_debug("%s: Handshake for peer %llu (%pISpfsc) did not complete after %d attempts, giving up\n",
+ peer->device->dev->name, peer->internal_id,
+ &peer->endpoint.addr, MAX_TIMER_HANDSHAKES + 2);
+
+ del_peer_timer(peer, &peer->timer_send_keepalive);
+ /* We drop all packets without a keypair and don't try again,
+ * if we try unsuccessfully for too long to make a handshake.
+ */
+ skb_queue_purge(&peer->staged_packet_queue);
+
+ /* We set a timer for destroying any residue that might be left
+ * of a partial exchange.
+ */
+ if (!timer_pending(&peer->timer_zero_key_material))
+ mod_peer_timer(peer, &peer->timer_zero_key_material,
+ jiffies + REJECT_AFTER_TIME * 3 * HZ);
+ } else {
+ ++peer->timer_handshake_attempts;
+ pr_debug("%s: Handshake for peer %llu (%pISpfsc) did not complete after %d seconds, retrying (try %d)\n",
+ peer->device->dev->name, peer->internal_id,
+ &peer->endpoint.addr, REKEY_TIMEOUT,
+ peer->timer_handshake_attempts + 1);
+
+ /* We clear the endpoint's source address, in case this is
+ * the cause of trouble.
+ */
+ socket_clear_peer_endpoint_src(peer);
+
+ packet_send_queued_handshake_initiation(peer, true);
+ }
+ peer_put(peer);
+}
+
+static void expired_send_keepalive(struct timer_list *timer)
+{
+ peer_get_from_timer(timer_send_keepalive);
+
+ packet_send_keepalive(peer);
+ if (peer->timer_need_another_keepalive) {
+ peer->timer_need_another_keepalive = false;
+ mod_peer_timer(peer, &peer->timer_send_keepalive,
+ jiffies + KEEPALIVE_TIMEOUT * HZ);
+ }
+ peer_put(peer);
+}
+
+static void expired_new_handshake(struct timer_list *timer)
+{
+ peer_get_from_timer(timer_new_handshake);
+
+ pr_debug("%s: Retrying handshake with peer %llu (%pISpfsc) because we stopped hearing back after %d seconds\n",
+ peer->device->dev->name, peer->internal_id,
+ &peer->endpoint.addr, KEEPALIVE_TIMEOUT + REKEY_TIMEOUT);
+ /* We clear the endpoint's source address, in case this is the cause
+ * of trouble.
+ */
+ socket_clear_peer_endpoint_src(peer);
+ packet_send_queued_handshake_initiation(peer, false);
+ peer_put(peer);
+}
+
+static void expired_zero_key_material(struct timer_list *timer)
+{
+ peer_get_from_timer(timer_zero_key_material);
+
+ rcu_read_lock_bh();
+ if (!peer->is_dead) {
+ /* Should take our reference. */
+ if (!queue_work(peer->device->handshake_send_wq,
+ &peer->clear_peer_work))
+ /* If the work was already on the queue, we want to drop the extra reference */
+ peer_put(peer);
+ }
+ rcu_read_unlock_bh();
+}
+static void queued_expired_zero_key_material(struct work_struct *work)
+{
+ struct wireguard_peer *peer =
+ container_of(work, struct wireguard_peer, clear_peer_work);
+
+ pr_debug("%s: Zeroing out all keys for peer %llu (%pISpfsc), since we haven't received a new one in %d seconds\n",
+ peer->device->dev->name, peer->internal_id,
+ &peer->endpoint.addr, REJECT_AFTER_TIME * 3);
+ noise_handshake_clear(&peer->handshake);
+ noise_keypairs_clear(&peer->keypairs);
+ peer_put(peer);
+}
+
+static void expired_send_persistent_keepalive(struct timer_list *timer)
+{
+ peer_get_from_timer(timer_persistent_keepalive);
+
+ if (likely(peer->persistent_keepalive_interval))
+ packet_send_keepalive(peer);
+ peer_put(peer);
+}
+
+/* Should be called after an authenticated data packet is sent. */
+void timers_data_sent(struct wireguard_peer *peer)
+{
+ if (!timer_pending(&peer->timer_new_handshake))
+ mod_peer_timer(peer, &peer->timer_new_handshake,
+ jiffies + (KEEPALIVE_TIMEOUT + REKEY_TIMEOUT) * HZ);
+}
+
+/* Should be called after an authenticated data packet is received. */
+void timers_data_received(struct wireguard_peer *peer)
+{
+ if (likely(netif_running(peer->device->dev))) {
+ if (!timer_pending(&peer->timer_send_keepalive))
+ mod_peer_timer(peer, &peer->timer_send_keepalive,
+ jiffies + KEEPALIVE_TIMEOUT * HZ);
+ else
+ peer->timer_need_another_keepalive = true;
+ }
+}
+
+/* Should be called after any type of authenticated packet is sent, whether
+ * keepalive, data, or handshake.
+ */
+void timers_any_authenticated_packet_sent(struct wireguard_peer *peer)
+{
+ del_peer_timer(peer, &peer->timer_send_keepalive);
+}
+
+/* Should be called after any type of authenticated packet is received, whether
+ * keepalive, data, or handshake.
+ */
+void timers_any_authenticated_packet_received(struct wireguard_peer *peer)
+{
+ del_peer_timer(peer, &peer->timer_new_handshake);
+}
+
+/* Should be called after a handshake initiation message is sent. */
+void timers_handshake_initiated(struct wireguard_peer *peer)
+{
+ mod_peer_timer(
+ peer, &peer->timer_retransmit_handshake,
+ jiffies + REKEY_TIMEOUT * HZ +
+ prandom_u32_max(REKEY_TIMEOUT_JITTER_MAX_JIFFIES));
+}
+
+/* Should be called after a handshake response message is received and processed
+ * or when getting key confirmation via the first data message.
+ */
+void timers_handshake_complete(struct wireguard_peer *peer)
+{
+ del_peer_timer(peer, &peer->timer_retransmit_handshake);
+ peer->timer_handshake_attempts = 0;
+ peer->sent_lastminute_handshake = false;
+ getnstimeofday(&peer->walltime_last_handshake);
+}
+
+/* Should be called after an ephemeral key is created, which is before sending a
+ * handshake response or after receiving a handshake response.
+ */
+void timers_session_derived(struct wireguard_peer *peer)
+{
+ mod_peer_timer(peer, &peer->timer_zero_key_material,
+ jiffies + REJECT_AFTER_TIME * 3 * HZ);
+}
+
+/* Should be called before a packet with authentication, whether
+ * keepalive, data, or handshake, is sent, or after one is received.
+ */
+void timers_any_authenticated_packet_traversal(struct wireguard_peer *peer)
+{
+ if (peer->persistent_keepalive_interval)
+ mod_peer_timer(peer, &peer->timer_persistent_keepalive,
+ jiffies + peer->persistent_keepalive_interval * HZ);
+}
+
+void timers_init(struct wireguard_peer *peer)
+{
+ timer_setup(&peer->timer_retransmit_handshake,
+ expired_retransmit_handshake, 0);
+ timer_setup(&peer->timer_send_keepalive, expired_send_keepalive, 0);
+ timer_setup(&peer->timer_new_handshake, expired_new_handshake, 0);
+ timer_setup(&peer->timer_zero_key_material, expired_zero_key_material, 0);
+ timer_setup(&peer->timer_persistent_keepalive,
+ expired_send_persistent_keepalive, 0);
+ INIT_WORK(&peer->clear_peer_work, queued_expired_zero_key_material);
+ peer->timer_handshake_attempts = 0;
+ peer->sent_lastminute_handshake = false;
+ peer->timer_need_another_keepalive = false;
+}
+
+void timers_stop(struct wireguard_peer *peer)
+{
+ del_timer_sync(&peer->timer_retransmit_handshake);
+ del_timer_sync(&peer->timer_send_keepalive);
+ del_timer_sync(&peer->timer_new_handshake);
+ del_timer_sync(&peer->timer_zero_key_material);
+ del_timer_sync(&peer->timer_persistent_keepalive);
+ flush_work(&peer->clear_peer_work);
+}
diff --git a/drivers/net/wireguard/timers.h b/drivers/net/wireguard/timers.h
new file mode 100644
index 000000000000..483529c9d873
--- /dev/null
+++ b/drivers/net/wireguard/timers.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ */
+
+#ifndef _WG_TIMERS_H
+#define _WG_TIMERS_H
+
+#include <linux/ktime.h>
+
+struct wireguard_peer;
+
+void timers_init(struct wireguard_peer *peer);
+void timers_stop(struct wireguard_peer *peer);
+void timers_data_sent(struct wireguard_peer *peer);
+void timers_data_received(struct wireguard_peer *peer);
+void timers_any_authenticated_packet_sent(struct wireguard_peer *peer);
+void timers_any_authenticated_packet_received(struct wireguard_peer *peer);
+void timers_handshake_initiated(struct wireguard_peer *peer);
+void timers_handshake_complete(struct wireguard_peer *peer);
+void timers_session_derived(struct wireguard_peer *peer);
+void timers_any_authenticated_packet_traversal(struct wireguard_peer *peer);
+
+static inline bool has_expired(u64 birthday_nanoseconds, u64 expiration_seconds)
+{
+ return (s64)(birthday_nanoseconds + expiration_seconds * NSEC_PER_SEC)
+ <= (s64)ktime_get_boot_fast_ns();
+}
+
+#endif /* _WG_TIMERS_H */
diff --git a/drivers/net/wireguard/version.h b/drivers/net/wireguard/version.h
new file mode 100644
index 000000000000..fba9c7ab9423
--- /dev/null
+++ b/drivers/net/wireguard/version.h
@@ -0,0 +1 @@
+#define WIREGUARD_VERSION "0.0.20180910"
diff --git a/include/uapi/linux/wireguard.h b/include/uapi/linux/wireguard.h
new file mode 100644
index 000000000000..3d73ad714e52
--- /dev/null
+++ b/include/uapi/linux/wireguard.h
@@ -0,0 +1,190 @@
+/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR MIT)
+ *
+ * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ *
+ * Documentation
+ * =============
+ *
+ * The below enums and macros are for interfacing with WireGuard, using generic
+ * netlink, with family WG_GENL_NAME and version WG_GENL_VERSION. It defines two
+ * methods: get and set. Note that while they share many common attributes,
+ * these two functions actually accept a slightly different set of inputs and
+ * outputs.
+ *
+ * WG_CMD_GET_DEVICE
+ * -----------------
+ *
+ * May only be called via NLM_F_REQUEST | NLM_F_DUMP. The command should contain
+ * one but not both of:
+ *
+ * WGDEVICE_A_IFINDEX: NLA_U32
+ * WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMSIZ - 1
+ *
+ * The kernel will then return several messages (NLM_F_MULTI) containing the
+ * following tree of nested items:
+ *
+ * WGDEVICE_A_IFINDEX: NLA_U32
+ * WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMSIZ - 1
+ * WGDEVICE_A_PRIVATE_KEY: len WG_KEY_LEN
+ * WGDEVICE_A_PUBLIC_KEY: len WG_KEY_LEN
+ * WGDEVICE_A_LISTEN_PORT: NLA_U16
+ * WGDEVICE_A_FWMARK: NLA_U32
+ * WGDEVICE_A_PEERS: NLA_NESTED
+ * 0: NLA_NESTED
+ * WGPEER_A_PUBLIC_KEY: len WG_KEY_LEN
+ * WGPEER_A_PRESHARED_KEY: len WG_KEY_LEN
+ * WGPEER_A_ENDPOINT: struct sockaddr_in or struct sockaddr_in6
+ * WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL: NLA_U16
+ * WGPEER_A_LAST_HANDSHAKE_TIME: struct timespec
+ * WGPEER_A_RX_BYTES: NLA_U64
+ * WGPEER_A_TX_BYTES: NLA_U64
+ * WGPEER_A_ALLOWEDIPS: NLA_NESTED
+ * 0: NLA_NESTED
+ * WGALLOWEDIP_A_FAMILY: NLA_U16
+ * WGALLOWEDIP_A_IPADDR: struct in_addr or struct in6_addr
+ * WGALLOWEDIP_A_CIDR_MASK: NLA_U8
+ * 1: NLA_NESTED
+ * ...
+ * 2: NLA_NESTED
+ * ...
+ * ...
+ * WGPEER_A_PROTOCOL_VERSION: NLA_U32
+ * 1: NLA_NESTED
+ * ...
+ * ...
+ *
+ * It is possible that not all of the allowed IPs of a single peer will
+ * fit within a single netlink message. In that case, the same peer will
+ * be written in the following message, except it will only contain
+ * WGPEER_A_PUBLIC_KEY and WGPEER_A_ALLOWEDIPS. This may occur several
+ * times in a row for the same peer. It is then up to the receiver to
+ * coalesce adjacent peers. Likewise, it is possible that not all peers
+ * will fit within a single message. So, subsequent peers will be sent
+ * in following messages, except those will only contain WGDEVICE_A_IFNAME
+ * and WGDEVICE_A_PEERS. It is then up to the receiver to coalesce these
+ * messages to form the complete list of peers.
+ *
+ * Since this is an NLM_F_DUMP command, the final message will always be
+ * NLMSG_DONE, even if an error occurs. However, this NLMSG_DONE message
+ * contains an integer error code. It is either zero or a negative error
+ * code corresponding to the errno.
+ *
+ * WG_CMD_SET_DEVICE
+ * -----------------
+ *
+ * May only be called via NLM_F_REQUEST. The command should contain the
+ * following tree of nested items, containing one but not both of
+ * WGDEVICE_A_IFINDEX and WGDEVICE_A_IFNAME:
+ *
+ * WGDEVICE_A_IFINDEX: NLA_U32
+ * WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMSIZ - 1
+ * WGDEVICE_A_FLAGS: NLA_U32, 0 or WGDEVICE_F_REPLACE_PEERS if all current
+ * peers should be removed prior to adding the list below.
+ * WGDEVICE_A_PRIVATE_KEY: len WG_KEY_LEN, all zeros to remove
+ * WGDEVICE_A_LISTEN_PORT: NLA_U16, 0 to choose randomly
+ * WGDEVICE_A_FWMARK: NLA_U32, 0 to disable
+ * WGDEVICE_A_PEERS: NLA_NESTED
+ * 0: NLA_NESTED
+ * WGPEER_A_PUBLIC_KEY: len WG_KEY_LEN
+ * WGPEER_A_FLAGS: NLA_U32, 0 and/or WGPEER_F_REMOVE_ME if the
+ * specified peer should be removed rather than
+ * added/updated and/or WGPEER_F_REPLACE_ALLOWEDIPS
+ * if all current allowed IPs of this peer should be
+ * removed prior to adding the list below.
+ * WGPEER_A_PRESHARED_KEY: len WG_KEY_LEN, all zeros to remove
+ * WGPEER_A_ENDPOINT: struct sockaddr_in or struct sockaddr_in6
+ * WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL: NLA_U16, 0 to disable
+ * WGPEER_A_ALLOWEDIPS: NLA_NESTED
+ * 0: NLA_NESTED
+ * WGALLOWEDIP_A_FAMILY: NLA_U16
+ * WGALLOWEDIP_A_IPADDR: struct in_addr or struct in6_addr
+ * WGALLOWEDIP_A_CIDR_MASK: NLA_U8
+ * 1: NLA_NESTED
+ * ...
+ * 2: NLA_NESTED
+ * ...
+ * ...
+ * WGPEER_A_PROTOCOL_VERSION: NLA_U32, should not be set or used at
+ * all by most users of this API, as the
+ * most recent protocol will be used when
+ * this is unset. Otherwise, must be set
+ * to 1.
+ * 1: NLA_NESTED
+ * ...
+ * ...
+ *
+ * It is possible that the amount of configuration data exceeds that of
+ * the maximum message length accepted by the kernel. In that case, several
+ * messages should be sent one after another, with each successive one
+ * filling in information not contained in the prior. Note that if
+ * WGDEVICE_F_REPLACE_PEERS is specified in the first message, it probably
+ * should not be specified in fragments that come after, so that the list
+ * of peers is only cleared the first time but appended after. Likewise for
+ * peers, if WGPEER_F_REPLACE_ALLOWEDIPS is specified in the first message
+ * of a peer, it likely should not be specified in subsequent fragments.
+ *
+ * If an error occurs, an NLMSG_ERROR reply containing an errno is sent.
+ */
+
+#ifndef _WG_UAPI_WIREGUARD_H
+#define _WG_UAPI_WIREGUARD_H
+
+#define WG_GENL_NAME "wireguard"
+#define WG_GENL_VERSION 1
+
+#define WG_KEY_LEN 32
+
+enum wg_cmd {
+ WG_CMD_GET_DEVICE,
+ WG_CMD_SET_DEVICE,
+ __WG_CMD_MAX
+};
+#define WG_CMD_MAX (__WG_CMD_MAX - 1)
+
+enum wgdevice_flag {
+ WGDEVICE_F_REPLACE_PEERS = 1U << 0
+};
+enum wgdevice_attribute {
+ WGDEVICE_A_UNSPEC,
+ WGDEVICE_A_IFINDEX,
+ WGDEVICE_A_IFNAME,
+ WGDEVICE_A_PRIVATE_KEY,
+ WGDEVICE_A_PUBLIC_KEY,
+ WGDEVICE_A_FLAGS,
+ WGDEVICE_A_LISTEN_PORT,
+ WGDEVICE_A_FWMARK,
+ WGDEVICE_A_PEERS,
+ __WGDEVICE_A_LAST
+};
+#define WGDEVICE_A_MAX (__WGDEVICE_A_LAST - 1)
+
+enum wgpeer_flag {
+ WGPEER_F_REMOVE_ME = 1U << 0,
+ WGPEER_F_REPLACE_ALLOWEDIPS = 1U << 1
+};
+enum wgpeer_attribute {
+ WGPEER_A_UNSPEC,
+ WGPEER_A_PUBLIC_KEY,
+ WGPEER_A_PRESHARED_KEY,
+ WGPEER_A_FLAGS,
+ WGPEER_A_ENDPOINT,
+ WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL,
+ WGPEER_A_LAST_HANDSHAKE_TIME,
+ WGPEER_A_RX_BYTES,
+ WGPEER_A_TX_BYTES,
+ WGPEER_A_ALLOWEDIPS,
+ WGPEER_A_PROTOCOL_VERSION,
+ __WGPEER_A_LAST
+};
+#define WGPEER_A_MAX (__WGPEER_A_LAST - 1)
+
+enum wgallowedip_attribute {
+ WGALLOWEDIP_A_UNSPEC,
+ WGALLOWEDIP_A_FAMILY,
+ WGALLOWEDIP_A_IPADDR,
+ WGALLOWEDIP_A_CIDR_MASK,
+ __WGALLOWEDIP_A_LAST
+};
+#define WGALLOWEDIP_A_MAX (__WGALLOWEDIP_A_LAST - 1)
+
+#endif /* _WG_UAPI_WIREGUARD_H */
diff --git a/tools/testing/selftests/wireguard/netns.sh b/tools/testing/selftests/wireguard/netns.sh
new file mode 100755
index 000000000000..568612c45acc
--- /dev/null
+++ b/tools/testing/selftests/wireguard/netns.sh
@@ -0,0 +1,499 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+#
+# This script tests the below topology:
+#
+# ┌─────────────────────┐ ┌──────────────────────────────────┐ ┌─────────────────────┐
+# │ $ns1 namespace │ │ $ns0 namespace │ │ $ns2 namespace │
+# │ │ │ │ │ │
+# │┌────────┐ │ │ ┌────────┐ │ │ ┌────────┐│
+# ││ wg0 │───────────┼───┼────────────│ lo │────────────┼───┼───────────│ wg0 ││
+# │├────────┴──────────┐│ │ ┌───────┴────────┴────────┐ │ │┌──────────┴────────┤│
+# ││192.168.241.1/24 ││ │ │(ns1) (ns2) │ │ ││192.168.241.2/24 ││
+# ││fd00::1/24 ││ │ │127.0.0.1:1 127.0.0.1:2│ │ ││fd00::2/24 ││
+# │└───────────────────┘│ │ │[::]:1 [::]:2 │ │ │└───────────────────┘│
+# └─────────────────────┘ │ └─────────────────────────┘ │ └─────────────────────┘
+# └──────────────────────────────────┘
+#
+# After the topology is prepared we run a series of TCP/UDP iperf3 tests between the
+# wireguard peers in $ns1 and $ns2. Note that $ns0 is the endpoint for the wg0
+# interfaces in $ns1 and $ns2. See https://www.wireguard.com/netns/ for further
+# details on how this is accomplished.
+set -e
+
+exec 3>&1
+export WG_HIDE_KEYS=never
+netns0="wg-test-$$-0"
+netns1="wg-test-$$-1"
+netns2="wg-test-$$-2"
+pretty() { echo -e "\x1b[32m\x1b[1m[+] ${1:+NS$1: }${2}\x1b[0m" >&3; }
+pp() { pretty "" "$*"; "$@"; }
+maybe_exec() { if [[ $BASHPID -eq $$ ]]; then "$@"; else exec "$@"; fi; }
+n0() { pretty 0 "$*"; maybe_exec ip netns exec $netns0 "$@"; }
+n1() { pretty 1 "$*"; maybe_exec ip netns exec $netns1 "$@"; }
+n2() { pretty 2 "$*"; maybe_exec ip netns exec $netns2 "$@"; }
+ip0() { pretty 0 "ip $*"; ip -n $netns0 "$@"; }
+ip1() { pretty 1 "ip $*"; ip -n $netns1 "$@"; }
+ip2() { pretty 2 "ip $*"; ip -n $netns2 "$@"; }
+sleep() { read -t "$1" -N 0 || true; }
+waitiperf() { pretty "${1//*-}" "wait for iperf:5201"; while [[ $(ss -N "$1" -tlp 'sport = 5201') != *iperf3* ]]; do sleep 0.1; done; }
+waitncatudp() { pretty "${1//*-}" "wait for udp:1111"; while [[ $(ss -N "$1" -ulp 'sport = 1111') != *ncat* ]]; do sleep 0.1; done; }
+waitncattcp() { pretty "${1//*-}" "wait for tcp:1111"; while [[ $(ss -N "$1" -tlp 'sport = 1111') != *ncat* ]]; do sleep 0.1; done; }
+waitiface() { pretty "${1//*-}" "wait for $2 to come up"; ip netns exec "$1" bash -c "while [[ \$(< \"/sys/class/net/$2/operstate\") != up ]]; do read -t .1 -N 0 || true; done;"; }
+
+cleanup() {
+ set +e
+ exec 2>/dev/null
+ printf "$orig_message_cost" > /proc/sys/net/core/message_cost
+ ip0 link del dev wg0
+ ip1 link del dev wg0
+ ip2 link del dev wg0
+ local to_kill="$(ip netns pids $netns0) $(ip netns pids $netns1) $(ip netns pids $netns2)"
+ [[ -n $to_kill ]] && kill $to_kill
+ pp ip netns del $netns1
+ pp ip netns del $netns2
+ pp ip netns del $netns0
+ exit
+}
+
+orig_message_cost="$(< /proc/sys/net/core/message_cost)"
+trap cleanup EXIT
+printf 0 > /proc/sys/net/core/message_cost
+
+ip netns del $netns0 2>/dev/null || true
+ip netns del $netns1 2>/dev/null || true
+ip netns del $netns2 2>/dev/null || true
+pp ip netns add $netns0
+pp ip netns add $netns1
+pp ip netns add $netns2
+ip0 link set up dev lo
+
+ip0 link add dev wg0 type wireguard
+ip0 link set wg0 netns $netns1
+ip0 link add dev wg0 type wireguard
+ip0 link set wg0 netns $netns2
+key1="$(pp wg genkey)"
+key2="$(pp wg genkey)"
+pub1="$(pp wg pubkey <<<"$key1")"
+pub2="$(pp wg pubkey <<<"$key2")"
+psk="$(pp wg genpsk)"
+[[ -n $key1 && -n $key2 && -n $psk ]]
+
+configure_peers() {
+ ip1 addr add 192.168.241.1/24 dev wg0
+ ip1 addr add fd00::1/24 dev wg0
+
+ ip2 addr add 192.168.241.2/24 dev wg0
+ ip2 addr add fd00::2/24 dev wg0
+
+ n1 wg set wg0 \
+ private-key <(echo "$key1") \
+ listen-port 1 \
+ peer "$pub2" \
+ preshared-key <(echo "$psk") \
+ allowed-ips 192.168.241.2/32,fd00::2/128
+ n2 wg set wg0 \
+ private-key <(echo "$key2") \
+ listen-port 2 \
+ peer "$pub1" \
+ preshared-key <(echo "$psk") \
+ allowed-ips 192.168.241.1/32,fd00::1/128
+
+ ip1 link set up dev wg0
+ ip2 link set up dev wg0
+}
+configure_peers
+
+tests() {
+ # Ping over IPv4
+ n2 ping -c 10 -f -W 1 192.168.241.1
+ n1 ping -c 10 -f -W 1 192.168.241.2
+
+ # Ping over IPv6
+ n2 ping6 -c 10 -f -W 1 fd00::1
+ n1 ping6 -c 10 -f -W 1 fd00::2
+
+ # TCP over IPv4
+ n2 iperf3 -s -1 -B 192.168.241.2 &
+ waitiperf $netns2
+ n1 iperf3 -Z -t 3 -c 192.168.241.2
+
+ # TCP over IPv6
+ n1 iperf3 -s -1 -B fd00::1 &
+ waitiperf $netns1
+ n2 iperf3 -Z -t 3 -c fd00::1
+
+ # UDP over IPv4
+ n1 iperf3 -s -1 -B 192.168.241.1 &
+ waitiperf $netns1
+ n2 iperf3 -Z -t 3 -b 0 -u -c 192.168.241.1
+
+ # UDP over IPv6
+ n2 iperf3 -s -1 -B fd00::2 &
+ waitiperf $netns2
+ n1 iperf3 -Z -t 3 -b 0 -u -c fd00::2
+}
+
+[[ $(ip1 link show dev wg0) =~ mtu\ ([0-9]+) ]] && orig_mtu="${BASH_REMATCH[1]}"
+big_mtu=$(( 34816 - 1500 + $orig_mtu ))
+
+# Test using IPv4 as outer transport
+n1 wg set wg0 peer "$pub2" endpoint 127.0.0.1:2
+n2 wg set wg0 peer "$pub1" endpoint 127.0.0.1:1
+# Before calling tests, we first make sure that the stats counters are working
+n2 ping -c 10 -f -W 1 192.168.241.1
+{ read _; read _; read _; read rx_bytes _; read _; read tx_bytes _; } < <(ip2 -stats link show dev wg0)
+(( rx_bytes == 1372 && (tx_bytes == 1428 || tx_bytes == 1460) ))
+{ read _; read _; read _; read rx_bytes _; read _; read tx_bytes _; } < <(ip1 -stats link show dev wg0)
+(( tx_bytes == 1372 && (rx_bytes == 1428 || rx_bytes == 1460) ))
+read _ rx_bytes tx_bytes < <(n2 wg show wg0 transfer)
+(( rx_bytes == 1372 && (tx_bytes == 1428 || tx_bytes == 1460) ))
+read _ rx_bytes tx_bytes < <(n1 wg show wg0 transfer)
+(( tx_bytes == 1372 && (rx_bytes == 1428 || rx_bytes == 1460) ))
+
+tests
+ip1 link set wg0 mtu $big_mtu
+ip2 link set wg0 mtu $big_mtu
+tests
+
+ip1 link set wg0 mtu $orig_mtu
+ip2 link set wg0 mtu $orig_mtu
+
+# Test using IPv6 as outer transport
+n1 wg set wg0 peer "$pub2" endpoint [::1]:2
+n2 wg set wg0 peer "$pub1" endpoint [::1]:1
+tests
+ip1 link set wg0 mtu $big_mtu
+ip2 link set wg0 mtu $big_mtu
+tests
+
+# Test that route MTUs work with the padding
+ip1 link set wg0 mtu 1300
+ip2 link set wg0 mtu 1300
+n1 wg set wg0 peer "$pub2" endpoint 127.0.0.1:2
+n2 wg set wg0 peer "$pub1" endpoint 127.0.0.1:1
+n0 iptables -A INPUT -m length --length 1360 -j DROP
+n1 ip route add 192.168.241.2/32 dev wg0 mtu 1299
+n2 ip route add 192.168.241.1/32 dev wg0 mtu 1299
+n2 ping -c 1 -W 1 -s 1269 192.168.241.1
+n2 ip route delete 192.168.241.1/32 dev wg0 mtu 1299
+n1 ip route delete 192.168.241.2/32 dev wg0 mtu 1299
+n0 iptables -F INPUT
+
+ip1 link set wg0 mtu $orig_mtu
+ip2 link set wg0 mtu $orig_mtu
+
+# Test using IPv4 that roaming works
+ip0 -4 addr del 127.0.0.1/8 dev lo
+ip0 -4 addr add 127.212.121.99/8 dev lo
+n1 wg set wg0 listen-port 9999
+n1 wg set wg0 peer "$pub2" endpoint 127.0.0.1:2
+n1 ping6 -W 1 -c 1 fd00::2
+[[ $(n2 wg show wg0 endpoints) == "$pub1 127.212.121.99:9999" ]]
+
+# Test using IPv6 that roaming works
+n1 wg set wg0 listen-port 9998
+n1 wg set wg0 peer "$pub2" endpoint [::1]:2
+n1 ping -W 1 -c 1 192.168.241.2
+[[ $(n2 wg show wg0 endpoints) == "$pub1 [::1]:9998" ]]
+
+# Test that crypto-RP filter works
+n1 wg set wg0 peer "$pub2" allowed-ips 192.168.241.0/24
+exec 4< <(n1 ncat -l -u -p 1111)
+nmap_pid=$!
+waitncatudp $netns1
+n2 ncat -u 192.168.241.1 1111 <<<"X"
+read -r -N 1 -t 1 out <&4 && [[ $out == "X" ]]
+kill $nmap_pid
+more_specific_key="$(pp wg genkey | pp wg pubkey)"
+n1 wg set wg0 peer "$more_specific_key" allowed-ips 192.168.241.2/32
+n2 wg set wg0 listen-port 9997
+exec 4< <(n1 ncat -l -u -p 1111)
+nmap_pid=$!
+waitncatudp $netns1
+n2 ncat -u 192.168.241.1 1111 <<<"X"
+! read -r -N 1 -t 1 out <&4 || false
+kill $nmap_pid
+n1 wg set wg0 peer "$more_specific_key" remove
+[[ $(n1 wg show wg0 endpoints) == "$pub2 [::1]:9997" ]]
+
+ip1 link del wg0
+ip2 link del wg0
+
+# Test using NAT. We now change the topology to this:
+# ┌────────────────────────────────────────┐ ┌────────────────────────────────────────────────┐ ┌────────────────────────────────────────┐
+# │ $ns1 namespace │ │ $ns0 namespace │ │ $ns2 namespace │
+# │ │ │ │ │ │
+# │ ┌─────┐ ┌─────┐ │ │ ┌──────┐ ┌──────┐ │ │ ┌─────┐ ┌─────┐ │
+# │ │ wg0 │─────────────│vethc│───────────┼────┼────│vethrc│ │vethrs│──────────────┼─────┼──│veths│────────────│ wg0 │ │
+# │ ├─────┴──────────┐ ├─────┴──────────┐│ │ ├──────┴─────────┐ ├──────┴────────────┐ │ │ ├─────┴──────────┐ ├─────┴──────────┐ │
+# │ │192.168.241.1/24│ │192.168.1.100/24││ │ │192.168.1.100/24│ │10.0.0.1/24 │ │ │ │10.0.0.100/24 │ │192.168.241.2/24│ │
+# │ │fd00::1/24 │ │ ││ │ │ │ │SNAT:192.168.1.0/24│ │ │ │ │ │fd00::2/24 │ │
+# │ └────────────────┘ └────────────────┘│ │ └────────────────┘ └───────────────────┘ │ │ └────────────────┘ └────────────────┘ │
+# └────────────────────────────────────────┘ └────────────────────────────────────────────────┘ └────────────────────────────────────────┘
+
+ip1 link add dev wg0 type wireguard
+ip2 link add dev wg0 type wireguard
+configure_peers
+
+ip0 link add vethrc type veth peer name vethc
+ip0 link add vethrs type veth peer name veths
+ip0 link set vethc netns $netns1
+ip0 link set veths netns $netns2
+ip0 link set vethrc up
+ip0 link set vethrs up
+ip0 addr add 192.168.1.1/24 dev vethrc
+ip0 addr add 10.0.0.1/24 dev vethrs
+ip1 addr add 192.168.1.100/24 dev vethc
+ip1 link set vethc up
+ip1 route add default via 192.168.1.1
+ip2 addr add 10.0.0.100/24 dev veths
+ip2 link set veths up
+waitiface $netns0 vethrc
+waitiface $netns0 vethrs
+waitiface $netns1 vethc
+waitiface $netns2 veths
+
+n0 bash -c 'printf 1 > /proc/sys/net/ipv4/ip_forward'
+n0 bash -c 'printf 2 > /proc/sys/net/netfilter/nf_conntrack_udp_timeout'
+n0 bash -c 'printf 2 > /proc/sys/net/netfilter/nf_conntrack_udp_timeout_stream'
+n0 iptables -t nat -A POSTROUTING -s 192.168.1.0/24 -d 10.0.0.0/24 -j SNAT --to 10.0.0.1
+
+n1 wg set wg0 peer "$pub2" endpoint 10.0.0.100:2 persistent-keepalive 1
+n1 ping -W 1 -c 1 192.168.241.2
+n2 ping -W 1 -c 1 192.168.241.1
+[[ $(n2 wg show wg0 endpoints) == "$pub1 10.0.0.1:1" ]]
+# Demonstrate n2 can still send packets to n1, since persistent-keepalive will prevent connection tracking entry from expiring (to see entries: `n0 conntrack -L`).
+pp sleep 3
+n2 ping -W 1 -c 1 192.168.241.1
+
+n0 iptables -t nat -F
+ip0 link del vethrc
+ip0 link del vethrs
+ip1 link del wg0
+ip2 link del wg0
+
+# Test that saddr routing is sticky but not too sticky, changing to this topology:
+# ┌────────────────────────────────────────┐ ┌────────────────────────────────────────┐
+# │ $ns1 namespace │ │ $ns2 namespace │
+# │ │ │ │
+# │ ┌─────┐ ┌─────┐ │ │ ┌─────┐ ┌─────┐ │
+# │ │ wg0 │─────────────│veth1│───────────┼────┼──│veth2│────────────│ wg0 │ │
+# │ ├─────┴──────────┐ ├─────┴──────────┐│ │ ├─────┴──────────┐ ├─────┴──────────┐ │
+# │ │192.168.241.1/24│ │10.0.0.1/24 ││ │ │10.0.0.2/24 │ │192.168.241.2/24│ │
+# │ │fd00::1/24 │ │fd00:aa::1/96 ││ │ │fd00:aa::2/96 │ │fd00::2/24 │ │
+# │ └────────────────┘ └────────────────┘│ │ └────────────────┘ └────────────────┘ │
+# └────────────────────────────────────────┘ └────────────────────────────────────────┘
+
+ip1 link add dev wg0 type wireguard
+ip2 link add dev wg0 type wireguard
+configure_peers
+ip1 link add veth1 type veth peer name veth2
+ip1 link set veth2 netns $netns2
+n1 bash -c 'printf 0 > /proc/sys/net/ipv6/conf/all/accept_dad'
+n2 bash -c 'printf 0 > /proc/sys/net/ipv6/conf/all/accept_dad'
+n1 bash -c 'printf 0 > /proc/sys/net/ipv6/conf/veth1/accept_dad'
+n2 bash -c 'printf 0 > /proc/sys/net/ipv6/conf/veth2/accept_dad'
+n1 bash -c 'printf 1 > /proc/sys/net/ipv4/conf/veth1/promote_secondaries'
+
+# First we check that we aren't overly sticky and can fall over to new IPs when old ones are removed
+ip1 addr add 10.0.0.1/24 dev veth1
+ip1 addr add fd00:aa::1/96 dev veth1
+ip2 addr add 10.0.0.2/24 dev veth2
+ip2 addr add fd00:aa::2/96 dev veth2
+ip1 link set veth1 up
+ip2 link set veth2 up
+waitiface $netns1 veth1
+waitiface $netns2 veth2
+n1 wg set wg0 peer "$pub2" endpoint 10.0.0.2:2
+n1 ping -W 1 -c 1 192.168.241.2
+ip1 addr add 10.0.0.10/24 dev veth1
+ip1 addr del 10.0.0.1/24 dev veth1
+n1 ping -W 1 -c 1 192.168.241.2
+n1 wg set wg0 peer "$pub2" endpoint [fd00:aa::2]:2
+n1 ping -W 1 -c 1 192.168.241.2
+ip1 addr add fd00:aa::10/96 dev veth1
+ip1 addr del fd00:aa::1/96 dev veth1
+n1 ping -W 1 -c 1 192.168.241.2
+
+# Now we show that we can successfully do reply to sender routing
+ip1 link set veth1 down
+ip2 link set veth2 down
+ip1 addr flush dev veth1
+ip2 addr flush dev veth2
+ip1 addr add 10.0.0.1/24 dev veth1
+ip1 addr add 10.0.0.2/24 dev veth1
+ip1 addr add fd00:aa::1/96 dev veth1
+ip1 addr add fd00:aa::2/96 dev veth1
+ip2 addr add 10.0.0.3/24 dev veth2
+ip2 addr add fd00:aa::3/96 dev veth2
+ip1 link set veth1 up
+ip2 link set veth2 up
+waitiface $netns1 veth1
+waitiface $netns2 veth2
+n2 wg set wg0 peer "$pub1" endpoint 10.0.0.1:1
+n2 ping -W 1 -c 1 192.168.241.1
+[[ $(n2 wg show wg0 endpoints) == "$pub1 10.0.0.1:1" ]]
+n2 wg set wg0 peer "$pub1" endpoint [fd00:aa::1]:1
+n2 ping -W 1 -c 1 192.168.241.1
+[[ $(n2 wg show wg0 endpoints) == "$pub1 [fd00:aa::1]:1" ]]
+n2 wg set wg0 peer "$pub1" endpoint 10.0.0.2:1
+n2 ping -W 1 -c 1 192.168.241.1
+[[ $(n2 wg show wg0 endpoints) == "$pub1 10.0.0.2:1" ]]
+n2 wg set wg0 peer "$pub1" endpoint [fd00:aa::2]:1
+n2 ping -W 1 -c 1 192.168.241.1
+[[ $(n2 wg show wg0 endpoints) == "$pub1 [fd00:aa::2]:1" ]]
+
+# What happens if the inbound destination address belongs to a different interface as the default route?
+ip1 link add dummy0 type dummy
+ip1 addr add 10.50.0.1/24 dev dummy0
+ip1 link set dummy0 up
+ip2 route add 10.50.0.0/24 dev veth2
+n2 wg set wg0 peer "$pub1" endpoint 10.50.0.1:1
+n2 ping -W 1 -c 1 192.168.241.1
+[[ $(n2 wg show wg0 endpoints) == "$pub1 10.50.0.1:1" ]]
+
+ip1 link del dummy0
+ip1 addr flush dev veth1
+ip2 addr flush dev veth2
+ip1 route flush dev veth1
+ip2 route flush dev veth2
+
+# Now we see what happens if another interface route takes precedence over an ongoing one
+ip1 link add veth3 type veth peer name veth4
+ip1 link set veth4 netns $netns2
+ip1 addr add 10.0.0.1/24 dev veth1
+ip2 addr add 10.0.0.2/24 dev veth2
+ip1 addr add 10.0.0.3/24 dev veth3
+ip1 link set veth1 up
+ip2 link set veth2 up
+ip1 link set veth3 up
+ip2 link set veth4 up
+waitiface $netns1 veth1
+waitiface $netns2 veth2
+waitiface $netns1 veth3
+waitiface $netns2 veth4
+ip1 route flush dev veth1
+ip1 route flush dev veth3
+ip1 route add 10.0.0.0/24 dev veth1 src 10.0.0.1 metric 2
+n1 wg set wg0 peer "$pub2" endpoint 10.0.0.2:2
+n1 ping -W 1 -c 1 192.168.241.2
+[[ $(n2 wg show wg0 endpoints) == "$pub1 10.0.0.1:1" ]]
+ip1 route add 10.0.0.0/24 dev veth3 src 10.0.0.3 metric 1
+n1 bash -c 'printf 0 > /proc/sys/net/ipv4/conf/veth1/rp_filter'
+n2 bash -c 'printf 0 > /proc/sys/net/ipv4/conf/veth4/rp_filter'
+n1 bash -c 'printf 0 > /proc/sys/net/ipv4/conf/all/rp_filter'
+n2 bash -c 'printf 0 > /proc/sys/net/ipv4/conf/all/rp_filter'
+n1 ping -W 1 -c 1 192.168.241.2
+[[ $(n2 wg show wg0 endpoints) == "$pub1 10.0.0.3:1" ]]
+
+ip1 link del veth1
+ip1 link del veth3
+ip1 link del wg0
+ip2 link del wg0
+
+# We test that Netlink/IPC is working properly by doing things that usually cause split responses
+ip0 link add dev wg0 type wireguard
+config=( "[Interface]" "PrivateKey=$(wg genkey)" "[Peer]" "PublicKey=$(wg genkey)" )
+for a in {1..255}; do
+ for b in {0..255}; do
+ config+=( "AllowedIPs=$a.$b.0.0/16,$a::$b/128" )
+ done
+done
+n0 wg setconf wg0 <(printf '%s\n' "${config[@]}")
+i=0
+for ip in $(n0 wg show wg0 allowed-ips); do
+ ((++i))
+done
+((i == 255*256*2+1))
+ip0 link del wg0
+ip0 link add dev wg0 type wireguard
+config=( "[Interface]" "PrivateKey=$(wg genkey)" )
+for a in {1..40}; do
+ config+=( "[Peer]" "PublicKey=$(wg genkey)" )
+ for b in {1..52}; do
+ config+=( "AllowedIPs=$a.$b.0.0/16" )
+ done
+done
+n0 wg setconf wg0 <(printf '%s\n' "${config[@]}")
+i=0
+while read -r line; do
+ j=0
+ for ip in $line; do
+ ((++j))
+ done
+ ((j == 53))
+ ((++i))
+done < <(n0 wg show wg0 allowed-ips)
+((i == 40))
+ip0 link del wg0
+ip0 link add wg0 type wireguard
+config=( )
+for i in {1..29}; do
+ config+=( "[Peer]" "PublicKey=$(wg genkey)" )
+done
+config+=( "[Peer]" "PublicKey=$(wg genkey)" "AllowedIPs=255.2.3.4/32,abcd::255/128" )
+n0 wg setconf wg0 <(printf '%s\n' "${config[@]}")
+n0 wg showconf wg0 > /dev/null
+ip0 link del wg0
+
+allowedips=( )
+for i in {1..197}; do
+ allowedips+=( abcd::$i )
+done
+saved_ifs="$IFS"
+IFS=,
+allowedips="${allowedips[*]}"
+IFS="$saved_ifs"
+ip0 link add wg0 type wireguard
+n0 wg set wg0 peer "$pub1"
+n0 wg set wg0 peer "$pub2" allowed-ips "$allowedips"
+{
+ read -r pub allowedips
+ [[ $pub == "$pub1" && $allowedips == "(none)" ]]
+ read -r pub allowedips
+ [[ $pub == "$pub2" ]]
+ i=0
+ for _ in $allowedips; do
+ ((++i))
+ done
+ ((i == 197))
+} < <(n0 wg show wg0 allowed-ips)
+ip0 link del wg0
+
+! n0 wg show doesnotexist || false
+
+ip0 link add wg0 type wireguard
+n0 wg set wg0 private-key <(echo "$key1") peer "$pub2" preshared-key <(echo "$psk")
+[[ $(n0 wg show wg0 private-key) == "$key1" ]]
+[[ $(n0 wg show wg0 preshared-keys) == "$pub2 $psk" ]]
+n0 wg set wg0 private-key /dev/null peer "$pub2" preshared-key /dev/null
+[[ $(n0 wg show wg0 private-key) == "(none)" ]]
+[[ $(n0 wg show wg0 preshared-keys) == "$pub2 (none)" ]]
+n0 wg set wg0 peer "$pub2"
+n0 wg set wg0 private-key <(echo "$key2")
+[[ $(n0 wg show wg0 public-key) == "$pub2" ]]
+[[ -z $(n0 wg show wg0 peers) ]]
+n0 wg set wg0 peer "$pub2"
+[[ -z $(n0 wg show wg0 peers) ]]
+n0 wg set wg0 private-key <(echo "$key1")
+n0 wg set wg0 peer "$pub2"
+[[ $(n0 wg show wg0 peers) == "$pub2" ]]
+ip0 link del wg0
+
+declare -A objects
+while read -t 0.1 -r line 2>/dev/null || [[ $? -ne 142 ]]; do
+ [[ $line =~ .*(wg[0-9]+:\ [A-Z][a-z]+\ [0-9]+)\ .*(created|destroyed).* ]] || continue
+ objects["${BASH_REMATCH[1]}"]+="${BASH_REMATCH[2]}"
+done < /dev/kmsg
+alldeleted=1
+for object in "${!objects[@]}"; do
+ if [[ ${objects["$object"]} != *createddestroyed ]]; then
+ echo "Error: $object: merely ${objects["$object"]}" >&3
+ alldeleted=0
+ fi
+done
+[[ $alldeleted -eq 1 ]]
+pretty "" "Objects that were created were also destroyed."
--
2.19.0
* Re: [PATCH 5/7] MIPS: mscc: ocelot: add GPIO4 pinmuxing DT node
From: Quentin Schulz @ 2018-09-14 16:26 UTC (permalink / raw)
To: Alexandre Belloni
Cc: ralf, paul.burton, jhogan, robh+dt, mark.rutland, davem, andrew,
f.fainelli, allan.nielsen, linux-mips, devicetree, linux-kernel,
netdev, thomas.petazzoni, antoine.tenart
In-Reply-To: <20180914145446.GQ14988@piout.net>
Hi Alexandre,
On Fri, Sep 14, 2018 at 04:54:46PM +0200, Alexandre Belloni wrote:
> Hi,
>
> On 14/09/2018 11:44:26+0200, Quentin Schulz wrote:
> > In order to use GPIO4 as a GPIO, we need to mux it in this mode so let's
> > declare a new pinctrl DT node for it.
> >
> > Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
> > ---
> > arch/mips/boot/dts/mscc/ocelot.dtsi | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/arch/mips/boot/dts/mscc/ocelot.dtsi b/arch/mips/boot/dts/mscc/ocelot.dtsi
> > index 8ce317c..b5c4c74 100644
> > --- a/arch/mips/boot/dts/mscc/ocelot.dtsi
> > +++ b/arch/mips/boot/dts/mscc/ocelot.dtsi
> > @@ -182,6 +182,11 @@
> > interrupts = <13>;
> > #interrupt-cells = <2>;
> >
> > + gpio4: gpio4 {
> > + pins = "GPIO_4";
> > + function = "gpio";
> > + };
> > +
>
> For a GPIO, I would do that in the board dts because it is not used
> directly in the dtsi.
>
And the day we've two boards using this pinctrl we move it to a dtsi. Is
that the plan?
Thanks,
Quentin
* Re: [PATCH net-next 2/7] net: phy: mscc: add support for VSC8584 PHY
From: Quentin Schulz @ 2018-09-14 16:28 UTC (permalink / raw)
To: Andrew Lunn
Cc: alexandre.belloni, ralf, paul.burton, jhogan, robh+dt,
mark.rutland, davem, f.fainelli, allan.nielsen, linux-mips,
devicetree, linux-kernel, netdev, thomas.petazzoni,
antoine.tenart
In-Reply-To: <20180914132930.fphdm3dm2incetbq@qschulz>
Hi Andrew,
On Fri, Sep 14, 2018 at 03:29:30PM +0200, Quentin Schulz wrote:
> Hi Andrew,
>
> On Fri, Sep 14, 2018 at 03:18:46PM +0200, Andrew Lunn wrote:
> > > Most of the init sequence of a PHY of the package is common to all PHYs
> > > in the package, thus we use the SMI broadcast feature which enables us
> > > to propagate a write in one register of one PHY to all PHYs in the
> > > package.
> >
> > Hi Quinten
> >
> > Could you say a bit more about the broadcast. Does the SMI broadcast
> > go to all PHY everywhere on an MDIO bus, or only all PHYs within one
> > package? I'm just thinking about the case you need two of these
> > packages to cover 8 switch ports.
> >
>
> Ah sorry, that wasn't very explicit. That's a feature on the PHY side so
> my wildest guess is that it wouldn't impact any other PHY outside of
> this package. Affecting any other PHY on the bus is counter-intuitive to
> me but I'll ask the HW engineers for confirmation.
>
Confirmed by HW engineers, it only impacts PHYs in the same package.
Quentin
* [PATCH] net: ethernet: ti: add missing GENERIC_ALLOCATOR dependency
From: Corentin Labbe @ 2018-09-14 11:20 UTC (permalink / raw)
To: davem; +Cc: linux-kernel, netdev, Corentin Labbe
This patch makes TI_DAVINCI_CPDMA select GENERIC_ALLOCATOR.
Without that, the following sparc64 build failure happens:
drivers/net/ethernet/ti/davinci_cpdma.o: In function `cpdma_check_free_tx_desc':
(.text+0x278): undefined reference to `gen_pool_avail'
drivers/net/ethernet/ti/davinci_cpdma.o: In function `cpdma_chan_submit':
(.text+0x340): undefined reference to `gen_pool_alloc'
(.text+0x5c4): undefined reference to `gen_pool_free'
drivers/net/ethernet/ti/davinci_cpdma.o: In function `__cpdma_chan_free':
davinci_cpdma.c:(.text+0x64c): undefined reference to `gen_pool_free'
drivers/net/ethernet/ti/davinci_cpdma.o: In function `cpdma_desc_pool_destroy.isra.6':
davinci_cpdma.c:(.text+0x17ac): undefined reference to `gen_pool_size'
davinci_cpdma.c:(.text+0x17b8): undefined reference to `gen_pool_avail'
davinci_cpdma.c:(.text+0x1824): undefined reference to `gen_pool_size'
davinci_cpdma.c:(.text+0x1830): undefined reference to `gen_pool_avail'
drivers/net/ethernet/ti/davinci_cpdma.o: In function `cpdma_ctlr_create':
(.text+0x19f8): undefined reference to `devm_gen_pool_create'
(.text+0x1a90): undefined reference to `gen_pool_add_virt'
Makefile:1011: recipe for target 'vmlinux' failed
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
---
drivers/net/ethernet/ti/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/ti/Kconfig b/drivers/net/ethernet/ti/Kconfig
index 9263d63..f932923 100644
--- a/drivers/net/ethernet/ti/Kconfig
+++ b/drivers/net/ethernet/ti/Kconfig
@@ -41,6 +41,7 @@ config TI_DAVINCI_MDIO
config TI_DAVINCI_CPDMA
tristate "TI DaVinci CPDMA Support"
depends on ARCH_DAVINCI || ARCH_OMAP2PLUS || COMPILE_TEST
+ select GENERIC_ALLOCATOR
---help---
This driver supports TI's DaVinci CPDMA dma engine.
--
2.7.4
* [PATCH] net: hp100: fix always-true check for link up state
From: Colin King @ 2018-09-14 16:39 UTC (permalink / raw)
To: Jaroslav Kysela, David S . Miller, netdev; +Cc: kernel-janitors, linux-kernel
From: Colin Ian King <colin.king@canonical.com>
The operation ~(hp100_inb(VG_LAN_CFG_1) & HP100_LINK_UP_ST) returns a value
that is always non-zero and hence the wait for the link to drop always
terminates prematurely. Fix this by using a logical not operator instead
of a bitwise complement. This issue has been in the driver since
pre-2.6.12-rc2.
Detected by CoverityScan, CID#114157 ("Logical vs. bitwise operator")
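The difference between the two operators can be seen in a minimal
standalone sketch. It is not driver code: LINK_UP_ST and its bit position
are hypothetical stand-ins for HP100_LINK_UP_ST, and the register value is
passed in directly instead of being read with hp100_inb().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit position, chosen only for this illustration. */
#define LINK_UP_ST 0x04u

/* Pre-patch form: bitwise complement of the masked register value. */
static int link_dropped_bitwise(uint8_t reg)
{
	return ~(reg & LINK_UP_ST) != 0;
}

/* Patched form: logical not of the masked register value. */
static int link_dropped_logical(uint8_t reg)
{
	return !(reg & LINK_UP_ST);
}

/* Exercises both forms with the link bit set and with it clear. */
static int link_check_demo(void)
{
	/* The complement is non-zero whether or not the bit is set. */
	assert(link_dropped_bitwise(LINK_UP_ST) == 1);
	assert(link_dropped_bitwise(0x00) == 1);
	/* The logical not is true only once the bit is clear. */
	assert(link_dropped_logical(LINK_UP_ST) == 0);
	assert(link_dropped_logical(0x00) == 1);
	return 0;
}
```

Masking leaves either 0 or the single mask bit, and complementing either
of those leaves many other bits set, so the pre-patch test can never be
false.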
Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
drivers/net/ethernet/hp/hp100.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/hp/hp100.c b/drivers/net/ethernet/hp/hp100.c
index c8c7ad2eff77..9b5a68b65432 100644
--- a/drivers/net/ethernet/hp/hp100.c
+++ b/drivers/net/ethernet/hp/hp100.c
@@ -2634,7 +2634,7 @@ static int hp100_login_to_vg_hub(struct net_device *dev, u_short force_relogin)
/* Wait for link to drop */
time = jiffies + (HZ / 10);
do {
- if (~(hp100_inb(VG_LAN_CFG_1) & HP100_LINK_UP_ST))
+ if (!(hp100_inb(VG_LAN_CFG_1) & HP100_LINK_UP_ST))
break;
if (!in_interrupt())
schedule_timeout_interruptible(1);
--
2.17.1
* Re: [PATCH net-next 2/7] net: phy: mscc: add support for VSC8584 PHY
From: Andrew Lunn @ 2018-09-14 16:58 UTC (permalink / raw)
To: Quentin Schulz
Cc: alexandre.belloni, ralf, paul.burton, jhogan, robh+dt,
mark.rutland, davem, f.fainelli, allan.nielsen, linux-mips,
devicetree, linux-kernel, netdev, thomas.petazzoni,
antoine.tenart
In-Reply-To: <20180914162828.5e75ffh5sig4om3d@qschulz>
> Confirmed by HW engineers, it only impacts PHYs in the same package.
Hi Quentin
Thanks for checking. As you said, it would be counter intuitive,
meaning a lot of confusion if it actually did happen.
Maybe you can add "in package" before broadcast in the commit message
and the code comments.
Andrew
* Re: [PATCH 5/7] MIPS: mscc: ocelot: add GPIO4 pinmuxing DT node
From: Andrew Lunn @ 2018-09-14 17:02 UTC (permalink / raw)
To: Quentin Schulz
Cc: Alexandre Belloni, ralf, paul.burton, jhogan, robh+dt,
mark.rutland, davem, f.fainelli, allan.nielsen, linux-mips,
devicetree, linux-kernel, netdev, thomas.petazzoni,
antoine.tenart
In-Reply-To: <20180914162638.fgzzjin2bzgx74de@qschulz>
On Fri, Sep 14, 2018 at 06:26:38PM +0200, Quentin Schulz wrote:
> Hi Alexandre,
>
> On Fri, Sep 14, 2018 at 04:54:46PM +0200, Alexandre Belloni wrote:
> > Hi,
> >
> > On 14/09/2018 11:44:26+0200, Quentin Schulz wrote:
> > > In order to use GPIO4 as a GPIO, we need to mux it in this mode so let's
> > > declare a new pinctrl DT node for it.
> > >
> > > Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
> > > ---
> > > arch/mips/boot/dts/mscc/ocelot.dtsi | 5 +++++
> > > 1 file changed, 5 insertions(+)
> > >
> > > diff --git a/arch/mips/boot/dts/mscc/ocelot.dtsi b/arch/mips/boot/dts/mscc/ocelot.dtsi
> > > index 8ce317c..b5c4c74 100644
> > > --- a/arch/mips/boot/dts/mscc/ocelot.dtsi
> > > +++ b/arch/mips/boot/dts/mscc/ocelot.dtsi
> > > @@ -182,6 +182,11 @@
> > > interrupts = <13>;
> > > #interrupt-cells = <2>;
> > >
> > > + gpio4: gpio4 {
> > > + pins = "GPIO_4";
> > > + function = "gpio";
> > > + };
> > > +
> >
> > For a GPIO, I would do that in the board dts because it is not used
> > directly in the dtsi.
> >
>
> And the day we've two boards using this pinctrl we move it to a dtsi. Is
> that the plan?
Hi Quentin
gpio4 appears to be pretty arbitrary. Could a different design use a
different gpio? To me, this seems like a board property.
Andrew
* Re: [PATCH] net/mlx4_core: print firmware version during driver loading
From: Qing Huang @ 2018-09-14 17:15 UTC (permalink / raw)
To: Leon Romanovsky; +Cc: netdev, linux-rdma, linux-kernel, tariqt, davem
In-Reply-To: <20180914044314.GC5257@mtr-leonro.mtl.com>
The FW version is actually a very crucial piece of information and is
printed only once here, when the driver is loaded. People tend to get
confused when switching between multiple FW files back and forth without
running separate utility tools, especially at customer sites.
IMHO, this information is very useful and takes up very little log file
space. :-)
I was also thinking of doing something slightly different. Maybe we
just trim down the output string and add something like this?
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -2208,6 +2208,11 @@ static int mlx4_init_fw(struct mlx4_dev *dev)
return err;
}
+ mlx4_info(dev, "Installed FW version is %d.%d.%03d.\n",
+ (int) (dev->caps.fw_ver >> 32),
+ (int) (dev->caps.fw_ver >> 16) & 0xffff,
+ (int) dev->caps.fw_ver & 0xffff);
+
err = mlx4_load_fw(dev);
if (err) {
mlx4_err(dev, "Failed to start FW, aborting\n");
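For reference, the shifts and masks in that format string imply that
fw_ver packs the major number above bit 32, the minor number in bits
16-31 and the sub-minor number in bits 0-15. A minimal userspace sketch
of the same decoding follows; format_fw_version() and the sample value
are hypothetical and not part of the driver.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Formats a packed 64-bit version with the same shifts and masks as above. */
static int format_fw_version(uint64_t fw_ver, char *buf, size_t len)
{
	return snprintf(buf, len, "%d.%d.%03d",
			(int)(fw_ver >> 32),
			(int)(fw_ver >> 16) & 0xffff,
			(int)fw_ver & 0xffff);
}

/* Packs a made-up version 2.42.7 and checks the formatted string. */
static int fw_version_demo(void)
{
	char buf[32];
	uint64_t fw_ver = ((uint64_t)2 << 32) | ((uint64_t)42 << 16) | 7;

	format_fw_version(fw_ver, buf, sizeof(buf));
	assert(strcmp(buf, "2.42.007") == 0);
	return 0;
}
```

Note that each cast binds more tightly than the following `&`, so the
value is narrowed to int first and masked afterwards, exactly as in the
kernel snippet.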
Thanks,
Qing
On 9/13/2018 9:43 PM, Leon Romanovsky wrote:
> On Thu, Sep 13, 2018 at 05:25:14PM -0700, Qing Huang wrote:
>> When debugging firmware related issues, it's very helpful to have
> ^^^^^^^^^^ exactly, this is why we set this print as mlx4_dbg and
> not mlx4_info.
>
>> the installed FW version info in the kernel log when the driver is
>> loaded. It's easier to match error/warning messages with different
>> FW versions in the log other than running a separate tool to get
>> the information back and forth.
>>
>> Signed-off-by: Qing Huang <qing.huang@oracle.com>
>> ---
>> drivers/net/ethernet/mellanox/mlx4/fw.c | 10 +++++-----
>> 1 file changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c
>> index babcfd9..e1c5218 100644
>> --- a/drivers/net/ethernet/mellanox/mlx4/fw.c
>> +++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
>> @@ -1686,11 +1686,11 @@ int mlx4_QUERY_FW(struct mlx4_dev *dev)
>> MLX4_GET(lg, outbox, QUERY_FW_MAX_CMD_OFFSET);
>> cmd->max_cmds = 1 << lg;
>>
>> - mlx4_dbg(dev, "FW version %d.%d.%03d (cmd intf rev %d), max commands %d\n",
>> - (int) (dev->caps.fw_ver >> 32),
>> - (int) (dev->caps.fw_ver >> 16) & 0xffff,
>> - (int) dev->caps.fw_ver & 0xffff,
>> - cmd_if_rev, cmd->max_cmds);
>> + mlx4_info(dev, "FW version %d.%d.%03d (cmd intf rev %d), max commands %d\n",
>> + (int)(dev->caps.fw_ver >> 32),
>> + (int)(dev->caps.fw_ver >> 16) & 0xffff,
>> + (int)dev->caps.fw_ver & 0xffff,
>> + cmd_if_rev, cmd->max_cmds);
>>
>> MLX4_GET(fw->catas_offset, outbox, QUERY_FW_ERR_START_OFFSET);
>> MLX4_GET(fw->catas_size, outbox, QUERY_FW_ERR_SIZE_OFFSET);
>> --
>> 2.9.3
>>
* [PATCH net-next] cxgb4: update supported DCB version
From: Ganesh Goudar @ 2018-09-14 12:05 UTC (permalink / raw)
To: netdev, davem; +Cc: nirranjan, indranil, dt, varun, Ganesh Goudar
- In CXGB4_DCB_STATE_FW_INCOMPLETE state check if the dcb
version is changed and update the dcb supported version.
- Also, fill the priority code point value for priority
based flow control.
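The version update described in the first point amounts to keeping the
CEE and IEEE capability bits mutually exclusive. A minimal standalone
sketch of that logic is shown here; the flag values, the enum and
update_support() are illustrative stand-ins, not the kernel definitions.

```c
#include <assert.h>

/* Illustrative stand-ins; the real values live in kernel/firmware headers. */
#define CAP_VER_CEE  0x04u
#define CAP_VER_IEEE 0x08u

enum dcb_version { VER_UNKNOWN, VER_CEE1D01, VER_IEEE };

/* Returns the capability mask with exactly one DCBX version bit selected. */
static unsigned int update_support(unsigned int supported, enum dcb_version ver)
{
	if (ver == VER_IEEE) {
		supported &= ~CAP_VER_CEE;
		supported |= CAP_VER_IEEE;
	} else if (ver == VER_CEE1D01) {
		supported &= ~CAP_VER_IEEE;
		supported |= CAP_VER_CEE;
	}
	/* Any other version leaves the mask untouched. */
	return supported;
}

/* Checks each branch, including that unrelated bits are preserved. */
static int update_support_demo(void)
{
	assert(update_support(CAP_VER_CEE | 0x01u, VER_IEEE) ==
	       (CAP_VER_IEEE | 0x01u));
	assert(update_support(CAP_VER_IEEE, VER_CEE1D01) == CAP_VER_CEE);
	assert(update_support(CAP_VER_CEE, VER_UNKNOWN) == CAP_VER_CEE);
	return 0;
}
```

Clearing a bit that is already clear is a no-op, so the sketch drops the
"is it set?" test before each clear; the result is the same either way.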
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
---
drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c | 27 ++++++++++++++++++++++++++
drivers/net/ethernet/chelsio/cxgb4/l2t.c | 6 ++++--
2 files changed, 31 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c
index b34f0f0..6ba3104 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c
@@ -114,6 +114,24 @@ void cxgb4_dcb_reset(struct net_device *dev)
cxgb4_dcb_state_init(dev);
}
+/* update the dcb port support, if version is IEEE then set it to
+ * FW_PORT_DCB_VER_IEEE and if DCB_CAP_DCBX_VER_CEE is already set then
+ * clear that. and if it is set to CEE then set dcb supported to
+ * DCB_CAP_DCBX_VER_CEE & if DCB_CAP_DCBX_VER_IEEE is set, clear it
+ */
+static inline void cxgb4_dcb_update_support(struct port_dcb_info *dcb)
+{
+ if (dcb->dcb_version == FW_PORT_DCB_VER_IEEE) {
+ if (dcb->supported & DCB_CAP_DCBX_VER_CEE)
+ dcb->supported &= ~DCB_CAP_DCBX_VER_CEE;
+ dcb->supported |= DCB_CAP_DCBX_VER_IEEE;
+ } else if (dcb->dcb_version == FW_PORT_DCB_VER_CEE1D01) {
+ if (dcb->supported & DCB_CAP_DCBX_VER_IEEE)
+ dcb->supported &= ~DCB_CAP_DCBX_VER_IEEE;
+ dcb->supported |= DCB_CAP_DCBX_VER_CEE;
+ }
+}
+
/* Finite State machine for Data Center Bridging.
*/
void cxgb4_dcb_state_fsm(struct net_device *dev,
@@ -165,6 +183,15 @@ void cxgb4_dcb_state_fsm(struct net_device *dev,
}
case CXGB4_DCB_STATE_FW_INCOMPLETE: {
+ if (transition_to != CXGB4_DCB_INPUT_FW_DISABLED) {
+ /* during this CXGB4_DCB_STATE_FW_INCOMPLETE state,
+ * check if the dcb version is changed (there can be
+ * mismatch in default config & the negotiated switch
+ * configuration at FW, so update the dcb support
+ * accordingly.
+ */
+ cxgb4_dcb_update_support(dcb);
+ }
switch (transition_to) {
case CXGB4_DCB_INPUT_FW_ENABLED: {
/* we're alreaady in firmware DCB mode */
diff --git a/drivers/net/ethernet/chelsio/cxgb4/l2t.c b/drivers/net/ethernet/chelsio/cxgb4/l2t.c
index 301c4df..99022c0 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/l2t.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/l2t.c
@@ -433,10 +433,12 @@ struct l2t_entry *cxgb4_l2t_get(struct l2t_data *d, struct neighbour *neigh,
else
lport = netdev2pinfo(physdev)->lport;
- if (is_vlan_dev(neigh->dev))
+ if (is_vlan_dev(neigh->dev)) {
vlan = vlan_dev_vlan_id(neigh->dev);
- else
+ vlan |= vlan_dev_get_egress_qos_mask(neigh->dev, priority);
+ } else {
vlan = VLAN_NONE;
+ }
write_lock_bh(&d->lock);
for (e = d->l2tab[hash].first; e; e = e->next)
--
2.1.0
* Re: [PATCH net-next v4 08/20] zinc: Poly1305 ARM and ARM64 implementations
From: Ard Biesheuvel @ 2018-09-14 17:27 UTC (permalink / raw)
To: Jason A. Donenfeld
Cc: Linux Kernel Mailing List, <netdev@vger.kernel.org>,
open list:HARDWARE RANDOM NUMBER GENERATOR CORE, David S. Miller,
Greg Kroah-Hartman, Samuel Neves, Andy Lutomirski,
Jean-Philippe Aumasson, Andy Polyakov, Russell King,
linux-arm-kernel
In-Reply-To: <20180914162240.7925-9-Jason@zx2c4.com>
On 14 September 2018 at 18:22, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> These NEON and non-NEON implementations come from Andy Polyakov's
> implementation. They are exactly the same as Andy Polyakov's original,
> with the following exceptions:
>
> - Entries and exits use the proper kernel convention macro.
> - CPU feature checking is done in C by the glue code, so that has been
> removed from the assembly.
> - The function names have been renamed to fit kernel conventions.
> - Labels have been renamed to fit kernel conventions.
> - The neon code can jump to the scalar code when it makes sense to do
> so.
>
> After '/^#/d;/^\..*[^:]$/d', the code has the following diff in actual
> instructions from the original.
>
As I asked in response to v3, could we please have this as a separate
patch on top? The diff below is corrupted.
Also, both Andy and Eric have offered to get involved in upstreaming
these changes to OpenSSL, so there is no delta to begin with.
> ARM:
>
> -poly1305_init:
> -.Lpoly1305_init:
> +ENTRY(poly1305_init_arm)
> stmdb sp!,{r4-r11}
>
> eor r3,r3,r3
> @@ -18,8 +25,6 @@
> moveq r0,#0
> beq .Lno_key
>
> - adr r11,.Lpoly1305_init
> - ldr r12,.LOPENSSL_armcap
> ldrb r4,[r1,#0]
> mov r10,#0x0fffffff
> ldrb r5,[r1,#1]
> @@ -34,8 +39,6 @@
> ldrb r7,[r1,#6]
> and r4,r4,r10
>
> - ldr r12,[r11,r12] @ OPENSSL_armcap_P
> - ldr r12,[r12]
> ldrb r8,[r1,#7]
> orr r5,r5,r6,lsl#8
> ldrb r6,[r1,#8]
> @@ -45,22 +48,6 @@
> ldrb r8,[r1,#10]
> and r5,r5,r3
>
> - tst r12,#ARMV7_NEON @ check for NEON
> - adr r9,poly1305_blocks_neon
> - adr r11,poly1305_blocks
> - it ne
> - movne r11,r9
> - adr r12,poly1305_emit
> - adr r10,poly1305_emit_neon
> - it ne
> - movne r12,r10
> - itete eq
> - addeq r12,r11,#(poly1305_emit-.Lpoly1305_init)
> - addne r12,r11,#(poly1305_emit_neon-.Lpoly1305_init)
> - addeq r11,r11,#(poly1305_blocks-.Lpoly1305_init)
> - addne r11,r11,#(poly1305_blocks_neon-.Lpoly1305_init)
> - orr r12,r12,#1 @ thumb-ify address
> - orr r11,r11,#1
> ldrb r9,[r1,#11]
> orr r6,r6,r7,lsl#8
> ldrb r7,[r1,#12]
> @@ -79,17 +66,16 @@
> str r6,[r0,#8]
> and r7,r7,r3
> str r7,[r0,#12]
> - stmia r2,{r11,r12} @ fill functions table
> - mov r0,#1
> - mov r0,#0
> .Lno_key:
> ldmia sp!,{r4-r11}
> bx lr @ bx lr
> tst lr,#1
> moveq pc,lr @ be binary compatible with V4, yet
> .word 0xe12fff1e @ interoperable with Thumb ISA:-)
> -poly1305_blocks:
> -.Lpoly1305_blocks:
> +ENDPROC(poly1305_init_arm)
> +
> +ENTRY(poly1305_blocks_arm)
> +.Lpoly1305_blocks_arm:
> stmdb sp!,{r3-r11,lr}
>
> ands r2,r2,#-16
> @@ -231,10 +217,11 @@
> tst lr,#1
> moveq pc,lr @ be binary compatible with V4, yet
> .word 0xe12fff1e @ interoperable with Thumb ISA:-)
> -poly1305_emit:
> +ENDPROC(poly1305_blocks_arm)
> +
> +ENTRY(poly1305_emit_arm)
> stmdb sp!,{r4-r11}
> .Lpoly1305_emit_enter:
> -
> ldmia r0,{r3-r7}
> adds r8,r3,#5 @ compare to modulus
> adcs r9,r4,#0
> @@ -305,8 +292,12 @@
> tst lr,#1
> moveq pc,lr @ be binary compatible with V4, yet
> .word 0xe12fff1e @ interoperable with Thumb ISA:-)
> +ENDPROC(poly1305_emit_arm)
> +
> +
>
> -poly1305_init_neon:
> +ENTRY(poly1305_init_neon)
> +.Lpoly1305_init_neon:
> ldr r4,[r0,#20] @ load key base 2^32
> ldr r5,[r0,#24]
> ldr r6,[r0,#28]
> @@ -515,8 +506,9 @@
> vst1.32 {d8[1]},[r7]
>
> bx lr @ bx lr
> +ENDPROC(poly1305_init_neon)
>
> -poly1305_blocks_neon:
> +ENTRY(poly1305_blocks_neon)
> ldr ip,[r0,#36] @ is_base2_26
> ands r2,r2,#-16
> beq .Lno_data_neon
> @@ -524,7 +516,7 @@
> cmp r2,#64
> bhs .Lenter_neon
> tst ip,ip @ is_base2_26?
> - beq .Lpoly1305_blocks
> + beq .Lpoly1305_blocks_arm
>
> .Lenter_neon:
> stmdb sp!,{r4-r7}
> @@ -534,7 +526,7 @@
> bne .Lbase2_26_neon
>
> stmdb sp!,{r1-r3,lr}
> - bl poly1305_init_neon
> + bl .Lpoly1305_init_neon
>
> ldr r4,[r0,#0] @ load hash value base 2^32
> ldr r5,[r0,#4]
> @@ -989,8 +981,9 @@
> ldmia sp!,{r4-r7}
> .Lno_data_neon:
> bx lr @ bx lr
> +ENDPROC(poly1305_blocks_neon)
>
> -poly1305_emit_neon:
> +ENTRY(poly1305_emit_neon)
> ldr ip,[r0,#36] @ is_base2_26
>
> stmdb sp!,{r4-r11}
> @@ -1055,6 +1048,6 @@
>
> ldmia sp!,{r4-r11}
> bx lr @ bx lr
> +ENDPROC(poly1305_emit_neon)
>
> ARM64:
>
> -poly1305_init:
> +ENTRY(poly1305_init_arm)
> cmp x1,xzr
> stp xzr,xzr,[x0] // zero hash value
> stp xzr,xzr,[x0,#16] // [along with is_base2_26]
> @@ -11,14 +15,9 @@
> csel x0,xzr,x0,eq
> b.eq .Lno_key
>
> - ldrsw x11,.LOPENSSL_armcap_P
> - ldr x11,.LOPENSSL_armcap_P
In the original, this looks like

  #ifdef __ILP32__
  	ldrsw $t1,.LOPENSSL_armcap_P
  #else
  	ldr $t1,.LOPENSSL_armcap_P
  #endif

so I guess git commit ate those lines.
> - adr x10,.LOPENSSL_armcap_P
> -
> ldp x7,x8,[x1] // load key
> mov x9,#0xfffffffc0fffffff
> movk x9,#0x0fff,lsl#48
> - ldr w17,[x10,x11]
> rev x7,x7 // flip bytes
> rev x8,x8
> and x7,x7,x9 // &=0ffffffc0fffffff
> @@ -26,24 +25,11 @@
> and x8,x8,x9 // &=0ffffffc0ffffffc
> stp x7,x8,[x0,#32] // save key value
>
> - tst w17,#ARMV7_NEON
> -
> - adr x12,poly1305_blocks
> - adr x7,poly1305_blocks_neon
> - adr x13,poly1305_emit
> - adr x8,poly1305_emit_neon
> -
> - csel x12,x12,x7,eq
> - csel x13,x13,x8,eq
> -
> - stp w12,w13,[x2]
> - stp x12,x13,[x2]
> -
> - mov x0,#1
> .Lno_key:
> ret
> +ENDPROC(poly1305_init_arm)
>
> -poly1305_blocks:
> +ENTRY(poly1305_blocks_arm)
> ands x2,x2,#-16
> b.eq .Lno_data
>
> @@ -100,8 +86,9 @@
>
> .Lno_data:
> ret
> +ENDPROC(poly1305_blocks_arm)
>
> -poly1305_emit:
> +ENTRY(poly1305_emit_arm)
> ldp x4,x5,[x0] // load hash base 2^64
> ldr x6,[x0,#16]
> ldp x10,x11,[x2] // load nonce
> @@ -124,7 +111,9 @@
> stp x4,x5,[x1] // write result
>
> ret
> -poly1305_mult:
> +ENDPROC(poly1305_emit_arm)
> +
> +__poly1305_mult:
> mul x12,x4,x7 // h0*r0
> umulh x13,x4,x7
>
> @@ -158,7 +147,7 @@
>
> ret
>
> -poly1305_splat:
> +__poly1305_splat:
> and x12,x4,#0x03ffffff // base 2^64 -> base 2^26
> ubfx x13,x4,#26,#26
> extr x14,x5,x4,#52
> @@ -182,11 +171,11 @@
>
> ret
>
> -poly1305_blocks_neon:
> +ENTRY(poly1305_blocks_neon)
> ldr x17,[x0,#24]
> cmp x2,#128
> b.hs .Lblocks_neon
> - cbz x17,poly1305_blocks
> + cbz x17,poly1305_blocks_arm
>
> .Lblocks_neon:
> stp x29,x30,[sp,#-80]!
> @@ -232,7 +221,7 @@
> adcs x5,x5,x13
> adc x6,x6,x3
>
> - bl poly1305_mult
> + bl __poly1305_mult
> ldr x30,[sp,#8]
>
> cbz x3,.Lstore_base2_64_neon
> @@ -274,7 +263,7 @@
> adcs x5,x5,x13
> adc x6,x6,x3
>
> - bl poly1305_mult
> + bl __poly1305_mult
>
> .Linit_neon:
> and x10,x4,#0x03ffffff // base 2^64 -> base 2^26
> @@ -301,19 +290,19 @@
> mov x5,x8
> mov x6,xzr
> add x0,x0,#48+12
> - bl poly1305_splat
> + bl __poly1305_splat
>
> - bl poly1305_mult // r^2
> + bl __poly1305_mult // r^2
> sub x0,x0,#4
> - bl poly1305_splat
> + bl __poly1305_splat
>
> - bl poly1305_mult // r^3
> + bl __poly1305_mult // r^3
> sub x0,x0,#4
> - bl poly1305_splat
> + bl __poly1305_splat
>
> - bl poly1305_mult // r^4
> + bl __poly1305_mult // r^4
> sub x0,x0,#4
> - bl poly1305_splat
> + bl __poly1305_splat
> ldr x30,[sp,#8]
>
> add x16,x1,#32
> @@ -743,10 +732,11 @@
> .Lno_data_neon:
> ldr x29,[sp],#80
> ret
> +ENDPROC(poly1305_blocks_neon)
>
> -poly1305_emit_neon:
> +ENTRY(poly1305_emit_neon)
> ldr x17,[x0,#24]
> - cbz x17,poly1305_emit
> + cbz x17,poly1305_emit_arm
>
> ldp w10,w11,[x0] // load hash value base 2^26
> ldp w12,w13,[x0,#8]
> @@ -788,6 +778,6 @@
> stp x4,x5,[x1] // write result
>
> ret
> +ENDPROC(poly1305_emit_neon)
>
> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> Cc: Samuel Neves <sneves@dei.uc.pt>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Greg KH <gregkh@linuxfoundation.org>
> Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
> Cc: Andy Polyakov <appro@openssl.org>
> Cc: Russell King <linux@armlinux.org.uk>
> Cc: linux-arm-kernel@lists.infradead.org
> ---
> lib/zinc/Makefile | 8 +
> lib/zinc/poly1305/poly1305-arm-glue.h | 69 ++
> lib/zinc/poly1305/poly1305-arm.S | 1117 +++++++++++++++++++++++++
> lib/zinc/poly1305/poly1305-arm64.S | 822 ++++++++++++++++++
> 4 files changed, 2016 insertions(+)
> create mode 100644 lib/zinc/poly1305/poly1305-arm-glue.h
> create mode 100644 lib/zinc/poly1305/poly1305-arm.S
> create mode 100644 lib/zinc/poly1305/poly1305-arm64.S
>
> diff --git a/lib/zinc/Makefile b/lib/zinc/Makefile
> index d1e3892e06d9..f37df89a3f87 100644
> --- a/lib/zinc/Makefile
> +++ b/lib/zinc/Makefile
> @@ -25,6 +25,14 @@ endif
>
> ifeq ($(CONFIG_ZINC_POLY1305),y)
> zinc-y += poly1305/poly1305.o
> +ifeq ($(CONFIG_ZINC_ARCH_ARM),y)
> +zinc-y += poly1305/poly1305-arm.o
> +CFLAGS_poly1305.o += -include $(srctree)/$(src)/poly1305/poly1305-arm-glue.h
> +endif
> +ifeq ($(CONFIG_ZINC_ARCH_ARM64),y)
> +zinc-y += poly1305/poly1305-arm64.o
> +CFLAGS_poly1305.o += -include $(srctree)/$(src)/poly1305/poly1305-arm-glue.h
> +endif
> endif
>
I still don't like the GCC -includes, especially because these .h
files contain function and variable definitions, so they are not
actually header files to begin with.

Also, you mentioned in the commit log that you got rid of defines and
made the code more modular, but as far as I can tell, libzinc is still
a single monolithic binary that is essentially always built in once we
move random.c to it.
> zinc-y += main.o
> diff --git a/lib/zinc/poly1305/poly1305-arm-glue.h b/lib/zinc/poly1305/poly1305-arm-glue.h
> new file mode 100644
> index 000000000000..53f8fec7f858
> --- /dev/null
> +++ b/lib/zinc/poly1305/poly1305-arm-glue.h
> @@ -0,0 +1,69 @@
> +/* SPDX-License-Identifier: GPL-2.0
> + *
> + * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
> + */
> +
> +#include <zinc/poly1305.h>
> +#include <asm/hwcap.h>
> +#include <asm/neon.h>
> +
> +asmlinkage void poly1305_init_arm(void *ctx, const u8 key[16]);
> +asmlinkage void poly1305_blocks_arm(void *ctx, const u8 *inp, const size_t len,
> + const u32 padbit);
> +asmlinkage void poly1305_emit_arm(void *ctx, u8 mac[16], const u32 nonce[4]);
> +#if IS_ENABLED(CONFIG_KERNEL_MODE_NEON) && \
> + (defined(CONFIG_64BIT) || __LINUX_ARM_ARCH__ >= 7)
> +#define ARM_USE_NEON
> +asmlinkage void poly1305_blocks_neon(void *ctx, const u8 *inp, const size_t len,
> + const u32 padbit);
> +asmlinkage void poly1305_emit_neon(void *ctx, u8 mac[16], const u32 nonce[4]);
> +#endif
> +
> +static bool poly1305_use_neon __ro_after_init;
> +
> +void __init poly1305_fpu_init(void)
> +{
> +#if defined(CONFIG_ARM64)
> + poly1305_use_neon = elf_hwcap & HWCAP_ASIMD;
> +#elif defined(CONFIG_ARM)
> + poly1305_use_neon = elf_hwcap & HWCAP_NEON;
> +#endif
> +}
> +
> +static inline bool poly1305_init_arch(void *ctx,
> + const u8 key[POLY1305_KEY_SIZE],
> + simd_context_t simd_context)
> +{
> + poly1305_init_arm(ctx, key);
> + return true;
> +}
> +
> +static inline bool poly1305_blocks_arch(void *ctx, const u8 *inp,
> + const size_t len, const u32 padbit,
> + simd_context_t simd_context)
> +{
> +#if defined(ARM_USE_NEON)
> + if (simd_context == HAVE_FULL_SIMD && poly1305_use_neon) {
> + poly1305_blocks_neon(ctx, inp, len, padbit);
> + return true;
> + }
> +#endif
> + poly1305_blocks_arm(ctx, inp, len, padbit);
> + return true;
> +}
> +
> +static inline bool poly1305_emit_arch(void *ctx, u8 mac[POLY1305_MAC_SIZE],
> + const u32 nonce[4],
> + simd_context_t simd_context)
> +{
> +#if defined(ARM_USE_NEON)
> + if (simd_context == HAVE_FULL_SIMD && poly1305_use_neon) {
> + poly1305_emit_neon(ctx, mac, nonce);
> + return true;
> + }
> +#endif
> + poly1305_emit_arm(ctx, mac, nonce);
> + return true;
> +}
> +
> +#define HAVE_POLY1305_ARCH_IMPLEMENTATION
We shouldn't #define HAVE_xxx constants in code but only in Kconfig.
> diff --git a/lib/zinc/poly1305/poly1305-arm.S b/lib/zinc/poly1305/poly1305-arm.S
> new file mode 100644
> index 000000000000..110f4317b5d7
> --- /dev/null
> +++ b/lib/zinc/poly1305/poly1305-arm.S
> @@ -0,0 +1,1117 @@
> +/* SPDX-License-Identifier: BSD-3-Clause OR GPL-2.0
> + *
> + * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
> + * Copyright (C) 2006-2017 CRYPTOGAMS by <appro@openssl.org>. All Rights Reserved.
> + *
> + * This is based in part on Andy Polyakov's implementation from CRYPTOGAMS.
> + */
> +
> +#include <linux/linkage.h>
> +
> +.text
> +#if defined(__thumb2__)
> +.syntax unified
> +.thumb
> +#else
> +.code 32
> +#endif
> +
> +.align 5
> +ENTRY(poly1305_init_arm)
> + stmdb sp!,{r4-r11}
> +
> + eor r3,r3,r3
> + cmp r1,#0
> + str r3,[r0,#0] @ zero hash value
> + str r3,[r0,#4]
> + str r3,[r0,#8]
> + str r3,[r0,#12]
> + str r3,[r0,#16]
> + str r3,[r0,#36] @ is_base2_26
> + add r0,r0,#20
> +
> +#ifdef __thumb2__
> + it eq
> +#endif
> + moveq r0,#0
> + beq .Lno_key
> +
> + ldrb r4,[r1,#0]
> + mov r10,#0x0fffffff
> + ldrb r5,[r1,#1]
> + and r3,r10,#-4 @ 0x0ffffffc
> + ldrb r6,[r1,#2]
> + ldrb r7,[r1,#3]
> + orr r4,r4,r5,lsl#8
> + ldrb r5,[r1,#4]
> + orr r4,r4,r6,lsl#16
> + ldrb r6,[r1,#5]
> + orr r4,r4,r7,lsl#24
> + ldrb r7,[r1,#6]
> + and r4,r4,r10
> +
> + ldrb r8,[r1,#7]
> + orr r5,r5,r6,lsl#8
> + ldrb r6,[r1,#8]
> + orr r5,r5,r7,lsl#16
> + ldrb r7,[r1,#9]
> + orr r5,r5,r8,lsl#24
> + ldrb r8,[r1,#10]
> + and r5,r5,r3
> +
> + ldrb r9,[r1,#11]
> + orr r6,r6,r7,lsl#8
> + ldrb r7,[r1,#12]
> + orr r6,r6,r8,lsl#16
> + ldrb r8,[r1,#13]
> + orr r6,r6,r9,lsl#24
> + ldrb r9,[r1,#14]
> + and r6,r6,r3
> +
> + ldrb r10,[r1,#15]
> + orr r7,r7,r8,lsl#8
> + str r4,[r0,#0]
> + orr r7,r7,r9,lsl#16
> + str r5,[r0,#4]
> + orr r7,r7,r10,lsl#24
> + str r6,[r0,#8]
> + and r7,r7,r3
> + str r7,[r0,#12]
> +.Lno_key:
> + ldmia sp!,{r4-r11}
> +#if __LINUX_ARM_ARCH__ >= 5
> + bx lr @ bx lr
> +#else
> + tst lr,#1
> + moveq pc,lr @ be binary compatible with V4, yet
> + .word 0xe12fff1e @ interoperable with Thumb ISA:-)
> +#endif
> +ENDPROC(poly1305_init_arm)
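For readers following the byte loads above: the ldrb/orr sequence
assembles four little-endian 32-bit words from the key, and the two
masks (0x0fffffff and 0x0ffffffc) are the usual Poly1305 clamp of r.
A minimal C model of that step, as a sketch only; the names
load_le32() and poly1305_clamp_model() are made up for illustration
and are not part of the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Assemble a 32-bit little-endian word from four bytes, mirroring the
 * ldrb/orr ...,lsl#8/#16/#24 sequence in poly1305_init_arm. */
static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Hypothetical model: clamp the 16-byte r half of the key into four
 * 32-bit words, using the same masks as the assembly. */
static void poly1305_clamp_model(uint32_t r[4], const uint8_t key[16])
{
	assert(r != 0 && key != 0);
	r[0] = load_le32(key + 0) & 0x0fffffffu;
	r[1] = load_le32(key + 4) & 0x0ffffffcu;
	r[2] = load_le32(key + 8) & 0x0ffffffcu;
	r[3] = load_le32(key + 12) & 0x0ffffffcu;
}
```

The assembly stores these four words at offsets 20..32 of the context,
which is where poly1305_init_neon later reloads them from.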
> +
> +.align 5
> +ENTRY(poly1305_blocks_arm)
> +.Lpoly1305_blocks_arm:
> + stmdb sp!,{r3-r11,lr}
> +
> + ands r2,r2,#-16
> + beq .Lno_data
> +
> + cmp r3,#0
> + add r2,r2,r1 @ end pointer
> + sub sp,sp,#32
> +
> + ldmia r0,{r4-r12} @ load context
> +
> + str r0,[sp,#12] @ offload stuff
> + mov lr,r1
> + str r2,[sp,#16]
> + str r10,[sp,#20]
> + str r11,[sp,#24]
> + str r12,[sp,#28]
> + b .Loop
> +
> +.Loop:
> +#if __LINUX_ARM_ARCH__ < 7
> + ldrb r0,[lr],#16 @ load input
> +#ifdef __thumb2__
> + it hi
> +#endif
> + addhi r8,r8,#1 @ 1<<128
> + ldrb r1,[lr,#-15]
> + ldrb r2,[lr,#-14]
> + ldrb r3,[lr,#-13]
> + orr r1,r0,r1,lsl#8
> + ldrb r0,[lr,#-12]
> + orr r2,r1,r2,lsl#16
> + ldrb r1,[lr,#-11]
> + orr r3,r2,r3,lsl#24
> + ldrb r2,[lr,#-10]
> + adds r4,r4,r3 @ accumulate input
> +
> + ldrb r3,[lr,#-9]
> + orr r1,r0,r1,lsl#8
> + ldrb r0,[lr,#-8]
> + orr r2,r1,r2,lsl#16
> + ldrb r1,[lr,#-7]
> + orr r3,r2,r3,lsl#24
> + ldrb r2,[lr,#-6]
> + adcs r5,r5,r3
> +
> + ldrb r3,[lr,#-5]
> + orr r1,r0,r1,lsl#8
> + ldrb r0,[lr,#-4]
> + orr r2,r1,r2,lsl#16
> + ldrb r1,[lr,#-3]
> + orr r3,r2,r3,lsl#24
> + ldrb r2,[lr,#-2]
> + adcs r6,r6,r3
> +
> + ldrb r3,[lr,#-1]
> + orr r1,r0,r1,lsl#8
> + str lr,[sp,#8] @ offload input pointer
> + orr r2,r1,r2,lsl#16
> + add r10,r10,r10,lsr#2
> + orr r3,r2,r3,lsl#24
> +#else
> + ldr r0,[lr],#16 @ load input
> +#ifdef __thumb2__
> + it hi
> +#endif
> + addhi r8,r8,#1 @ padbit
> + ldr r1,[lr,#-12]
> + ldr r2,[lr,#-8]
> + ldr r3,[lr,#-4]
> +#ifdef __ARMEB__
> + rev r0,r0
> + rev r1,r1
> + rev r2,r2
> + rev r3,r3
> +#endif
> + adds r4,r4,r0 @ accumulate input
> + str lr,[sp,#8] @ offload input pointer
> + adcs r5,r5,r1
> + add r10,r10,r10,lsr#2
> + adcs r6,r6,r2
> +#endif
> + add r11,r11,r11,lsr#2
> + adcs r7,r7,r3
> + add r12,r12,r12,lsr#2
> +
> + umull r2,r3,r5,r9
> + adc r8,r8,#0
> + umull r0,r1,r4,r9
> + umlal r2,r3,r8,r10
> + umlal r0,r1,r7,r10
> + ldr r10,[sp,#20] @ reload r10
> + umlal r2,r3,r6,r12
> + umlal r0,r1,r5,r12
> + umlal r2,r3,r7,r11
> + umlal r0,r1,r6,r11
> + umlal r2,r3,r4,r10
> + str r0,[sp,#0] @ future r4
> + mul r0,r11,r8
> + ldr r11,[sp,#24] @ reload r11
> + adds r2,r2,r1 @ d1+=d0>>32
> + eor r1,r1,r1
> + adc lr,r3,#0 @ future r6
> + str r2,[sp,#4] @ future r5
> +
> + mul r2,r12,r8
> + eor r3,r3,r3
> + umlal r0,r1,r7,r12
> + ldr r12,[sp,#28] @ reload r12
> + umlal r2,r3,r7,r9
> + umlal r0,r1,r6,r9
> + umlal r2,r3,r6,r10
> + umlal r0,r1,r5,r10
> + umlal r2,r3,r5,r11
> + umlal r0,r1,r4,r11
> + umlal r2,r3,r4,r12
> + ldr r4,[sp,#0]
> + mul r8,r9,r8
> + ldr r5,[sp,#4]
> +
> + adds r6,lr,r0 @ d2+=d1>>32
> + ldr lr,[sp,#8] @ reload input pointer
> + adc r1,r1,#0
> + adds r7,r2,r1 @ d3+=d2>>32
> + ldr r0,[sp,#16] @ reload end pointer
> + adc r3,r3,#0
> + add r8,r8,r3 @ h4+=d3>>32
> +
> + and r1,r8,#-4
> + and r8,r8,#3
> + add r1,r1,r1,lsr#2 @ *=5
> + adds r4,r4,r1
> + adcs r5,r5,#0
> + adcs r6,r6,#0
> + adcs r7,r7,#0
> + adc r8,r8,#0
> +
> + cmp r0,lr @ done yet?
> + bhi .Loop
> +
> + ldr r0,[sp,#12]
> + add sp,sp,#32
> + stmia r0,{r4-r8} @ store the result
> +
> +.Lno_data:
> +#if __LINUX_ARM_ARCH__ >= 5
> + ldmia sp!,{r3-r11,pc}
> +#else
> + ldmia sp!,{r3-r11,lr}
> + tst lr,#1
> + moveq pc,lr @ be binary compatible with V4, yet
> + .word 0xe12fff1e @ interoperable with Thumb ISA:-)
> +#endif
> +ENDPROC(poly1305_blocks_arm)
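The tail of .Loop (the "and r1,r8,#-4" through "adc r8,r8,#0" run)
folds everything at or above bit 130 back into the low limbs, relying
on 2^130 == 5 (mod 2^130 - 5). A minimal C model of that partial
reduction, as a sketch; poly1305_partial_reduce_model() is a
hypothetical name, and the 64-bit accumulator is a convenience of the
model rather than something the 32-bit assembly has:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the partial reduction at the end of .Loop.
 * h[0..3] are 32-bit limbs, h[4] holds the bits from 2^128 upwards. */
static void poly1305_partial_reduce_model(uint32_t h[5])
{
	/* (h4 & ~3) + (h4 >> 2) equals 5 * (h4 >> 2): the part of h at or
	 * above 2^130, multiplied by 5. */
	uint64_t c = (uint64_t)(h[4] & ~3u) + (h[4] >> 2);
	int i;

	assert(h != 0);
	h[4] &= 3;			/* keep only bits 128 and 129 */
	for (i = 0; i < 4; i++) {	/* adds/adcs chain over h0..h3 */
		c += h[i];
		h[i] = (uint32_t)c;
		c >>= 32;
	}
	h[4] += (uint32_t)c;		/* final adc into h4 */
}
```

The result is only partially reduced: h may still be slightly larger
than the modulus, which is why the emit function does a final compare.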
> +
> +.align 5
> +ENTRY(poly1305_emit_arm)
> + stmdb sp!,{r4-r11}
> +.Lpoly1305_emit_enter:
> + ldmia r0,{r3-r7}
> + adds r8,r3,#5 @ compare to modulus
> + adcs r9,r4,#0
> + adcs r10,r5,#0
> + adcs r11,r6,#0
> + adc r7,r7,#0
> + tst r7,#4 @ did it carry/borrow?
> +
> +#ifdef __thumb2__
> + it ne
> +#endif
> + movne r3,r8
> + ldr r8,[r2,#0]
> +#ifdef __thumb2__
> + it ne
> +#endif
> + movne r4,r9
> + ldr r9,[r2,#4]
> +#ifdef __thumb2__
> + it ne
> +#endif
> + movne r5,r10
> + ldr r10,[r2,#8]
> +#ifdef __thumb2__
> + it ne
> +#endif
> + movne r6,r11
> + ldr r11,[r2,#12]
> +
> + adds r3,r3,r8
> + adcs r4,r4,r9
> + adcs r5,r5,r10
> + adc r6,r6,r11
> +
> +#if __LINUX_ARM_ARCH__ >= 7
> +#ifdef __ARMEB__
> + rev r3,r3
> + rev r4,r4
> + rev r5,r5
> + rev r6,r6
> +#endif
> + str r3,[r1,#0]
> + str r4,[r1,#4]
> + str r5,[r1,#8]
> + str r6,[r1,#12]
> +#else
> + strb r3,[r1,#0]
> + mov r3,r3,lsr#8
> + strb r4,[r1,#4]
> + mov r4,r4,lsr#8
> + strb r5,[r1,#8]
> + mov r5,r5,lsr#8
> + strb r6,[r1,#12]
> + mov r6,r6,lsr#8
> +
> + strb r3,[r1,#1]
> + mov r3,r3,lsr#8
> + strb r4,[r1,#5]
> + mov r4,r4,lsr#8
> + strb r5,[r1,#9]
> + mov r5,r5,lsr#8
> + strb r6,[r1,#13]
> + mov r6,r6,lsr#8
> +
> + strb r3,[r1,#2]
> + mov r3,r3,lsr#8
> + strb r4,[r1,#6]
> + mov r4,r4,lsr#8
> + strb r5,[r1,#10]
> + mov r5,r5,lsr#8
> + strb r6,[r1,#14]
> + mov r6,r6,lsr#8
> +
> + strb r3,[r1,#3]
> + strb r4,[r1,#7]
> + strb r5,[r1,#11]
> + strb r6,[r1,#15]
> +#endif
> + ldmia sp!,{r4-r11}
> +#if __LINUX_ARM_ARCH__ >= 5
> + bx lr @ bx lr
> +#else
> + tst lr,#1
> + moveq pc,lr @ be binary compatible with V4, yet
> + .word 0xe12fff1e @ interoperable with Thumb ISA:-)
> +#endif
> +ENDPROC(poly1305_emit_arm)
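The "compare to modulus" step above works by computing h + 5 and
testing bit 2 of the top limb (bit 130 overall): if it is set, h was
at least 2^130 - 5 and the low 128 bits of h + 5 are the reduced
value. The nonce is then added modulo 2^128. A minimal C model of
this, as a sketch; poly1305_emit_model() is a hypothetical name, and
the mask-based select stands in for the conditional moves (no claim is
made here about how a compiler will schedule it):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of poly1305_emit_arm: final reduction, nonce
 * addition mod 2^128, little-endian store. */
static void poly1305_emit_model(const uint32_t h[5], const uint32_t nonce[4],
				uint8_t mac[16])
{
	uint32_t g[4], g4, mask, out;
	uint64_t c = 5;
	int i, j;

	assert(h != 0 && nonce != 0 && mac != 0);
	for (i = 0; i < 4; i++) {	/* g = h + 5 */
		c += h[i];
		g[i] = (uint32_t)c;
		c >>= 32;
	}
	g4 = h[4] + (uint32_t)c;
	/* All-ones if bit 130 of h + 5 is set ("tst r7,#4"), else zero. */
	mask = 0u - ((g4 >> 2) & 1u);

	c = 0;
	for (i = 0; i < 4; i++) {
		c += (uint64_t)((h[i] & ~mask) | (g[i] & mask)) + nonce[i];
		out = (uint32_t)c;	/* carry out of word 3 is dropped */
		c >>= 32;
		for (j = 0; j < 4; j++)
			mac[4 * i + j] = (uint8_t)(out >> (8 * j));
	}
}
```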
> +
> +
> +#if __LINUX_ARM_ARCH__ >= 7
> +.fpu neon
> +
> +.align 5
> +ENTRY(poly1305_init_neon)
> +.Lpoly1305_init_neon:
> + ldr r4,[r0,#20] @ load key base 2^32
> + ldr r5,[r0,#24]
> + ldr r6,[r0,#28]
> + ldr r7,[r0,#32]
> +
> + and r2,r4,#0x03ffffff @ base 2^32 -> base 2^26
> + mov r3,r4,lsr#26
> + mov r4,r5,lsr#20
> + orr r3,r3,r5,lsl#6
> + mov r5,r6,lsr#14
> + orr r4,r4,r6,lsl#12
> + mov r6,r7,lsr#8
> + orr r5,r5,r7,lsl#18
> + and r3,r3,#0x03ffffff
> + and r4,r4,#0x03ffffff
> + and r5,r5,#0x03ffffff
> +
> + vdup.32 d0,r2 @ r^1 in both lanes
> + add r2,r3,r3,lsl#2 @ *5
> + vdup.32 d1,r3
> + add r3,r4,r4,lsl#2
> + vdup.32 d2,r2
> + vdup.32 d3,r4
> + add r4,r5,r5,lsl#2
> + vdup.32 d4,r3
> + vdup.32 d5,r5
> + add r5,r6,r6,lsl#2
> + vdup.32 d6,r4
> + vdup.32 d7,r6
> + vdup.32 d8,r5
> +
> + mov r5,#2 @ counter
> +
> +.Lsquare_neon:
> + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> + @ d0 = h0*r0 + h4*5*r1 + h3*5*r2 + h2*5*r3 + h1*5*r4
> + @ d1 = h1*r0 + h0*r1 + h4*5*r2 + h3*5*r3 + h2*5*r4
> + @ d2 = h2*r0 + h1*r1 + h0*r2 + h4*5*r3 + h3*5*r4
> + @ d3 = h3*r0 + h2*r1 + h1*r2 + h0*r3 + h4*5*r4
> + @ d4 = h4*r0 + h3*r1 + h2*r2 + h1*r3 + h0*r4
> +
> + vmull.u32 q5,d0,d0[1]
> + vmull.u32 q6,d1,d0[1]
> + vmull.u32 q7,d3,d0[1]
> + vmull.u32 q8,d5,d0[1]
> + vmull.u32 q9,d7,d0[1]
> +
> + vmlal.u32 q5,d7,d2[1]
> + vmlal.u32 q6,d0,d1[1]
> + vmlal.u32 q7,d1,d1[1]
> + vmlal.u32 q8,d3,d1[1]
> + vmlal.u32 q9,d5,d1[1]
> +
> + vmlal.u32 q5,d5,d4[1]
> + vmlal.u32 q6,d7,d4[1]
> + vmlal.u32 q8,d1,d3[1]
> + vmlal.u32 q7,d0,d3[1]
> + vmlal.u32 q9,d3,d3[1]
> +
> + vmlal.u32 q5,d3,d6[1]
> + vmlal.u32 q8,d0,d5[1]
> + vmlal.u32 q6,d5,d6[1]
> + vmlal.u32 q7,d7,d6[1]
> + vmlal.u32 q9,d1,d5[1]
> +
> + vmlal.u32 q8,d7,d8[1]
> + vmlal.u32 q5,d1,d8[1]
> + vmlal.u32 q6,d3,d8[1]
> + vmlal.u32 q7,d5,d8[1]
> + vmlal.u32 q9,d0,d7[1]
> +
> + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> + @ lazy reduction as discussed in "NEON crypto" by D.J. Bernstein
> + @ and P. Schwabe
> + @
> + @ H0>>+H1>>+H2>>+H3>>+H4
> + @ H3>>+H4>>*5+H0>>+H1
> + @
> + @ Trivia.
> + @
> + @ Result of multiplication of n-bit number by m-bit number is
> + @ n+m bits wide. However! Even though 2^n is a n+1-bit number,
> + @ m-bit number multiplied by 2^n is still n+m bits wide.
> + @
> + @ Sum of two n-bit numbers is n+1 bits wide, sum of three - n+2,
> + @ and so is sum of four. Sum of 2^m n-m-bit numbers and n-bit
> + @ one is n+1 bits wide.
> + @
> + @ >>+ denotes Hnext += Hn>>26, Hn &= 0x3ffffff. This means that
> + @ H0, H2, H3 are guaranteed to be 26 bits wide, while H1 and H4
> + @ can be 27. However! In cases when their width exceeds 26 bits
> + @ they are limited by 2^26+2^6. This in turn means that *sum*
> + @ of the products with these values can still be viewed as sum
> + @ of 52-bit numbers as long as the amount of addends is not a
> + @ power of 2. For example,
> + @
> + @ H4 = H4*R0 + H3*R1 + H2*R2 + H1*R3 + H0 * R4,
> + @
> + @ which can't be larger than 5 * (2^26 + 2^6) * (2^26 + 2^6), or
> + @ 5 * (2^52 + 2*2^32 + 2^12), which in turn is smaller than
> + @ 8 * (2^52) or 2^55. However, the value is then multiplied by
> + @ 5, so we should be looking at 5 * 5 * (2^52 + 2^33 + 2^12),
> + @ which is less than 32 * (2^52) or 2^57. And when processing
> + @ data we are looking at triple as many addends...
> + @
> + @ In key setup procedure pre-reduced H0 is limited by 5*4+1 and
> + @ 5*H4 - by 5*5 52-bit addends, or 57 bits. But when hashing the
> + @ input H0 is limited by (5*4+1)*3 addends, or 58 bits, while
> + @ 5*H4 by 5*5*3, or 59[!] bits. How is this relevant? vmlal.u32
> + @ instruction accepts 2x32-bit input and writes 2x64-bit result.
> + @ This means that result of reduction have to be compressed upon
> + @ loop wrap-around. This can be done in the process of reduction
> + @ to minimize amount of instructions [as well as amount of
> + @ 128-bit instructions, which benefits low-end processors], but
> + @ one has to watch for H2 (which is narrower than H0) and 5*H4
> + @ not being wider than 58 bits, so that result of right shift
> + @ by 26 bits fits in 32 bits. This is also useful on x86,
> + @ because it allows to use paddd in place for paddq, which
> + @ benefits Atom, where paddq is ridiculously slow.
> +
> + vshr.u64 q15,q8,#26
> + vmovn.i64 d16,q8
> + vshr.u64 q4,q5,#26
> + vmovn.i64 d10,q5
> + vadd.i64 q9,q9,q15 @ h3 -> h4
> + vbic.i32 d16,#0xfc000000 @ &=0x03ffffff
> + vadd.i64 q6,q6,q4 @ h0 -> h1
> + vbic.i32 d10,#0xfc000000
> +
> + vshrn.u64 d30,q9,#26
> + vmovn.i64 d18,q9
> + vshr.u64 q4,q6,#26
> + vmovn.i64 d12,q6
> + vadd.i64 q7,q7,q4 @ h1 -> h2
> + vbic.i32 d18,#0xfc000000
> + vbic.i32 d12,#0xfc000000
> +
> + vadd.i32 d10,d10,d30
> + vshl.u32 d30,d30,#2
> + vshrn.u64 d8,q7,#26
> + vmovn.i64 d14,q7
> + vadd.i32 d10,d10,d30 @ h4 -> h0
> + vadd.i32 d16,d16,d8 @ h2 -> h3
> + vbic.i32 d14,#0xfc000000
> +
> + vshr.u32 d30,d10,#26
> + vbic.i32 d10,#0xfc000000
> + vshr.u32 d8,d16,#26
> + vbic.i32 d16,#0xfc000000
> + vadd.i32 d12,d12,d30 @ h0 -> h1
> + vadd.i32 d18,d18,d8 @ h3 -> h4
> +
> + subs r5,r5,#1
> + beq .Lsquare_break_neon
> +
> + add r6,r0,#(48+0*9*4)
> + add r7,r0,#(48+1*9*4)
> +
> + vtrn.32 d0,d10 @ r^2:r^1
> + vtrn.32 d3,d14
> + vtrn.32 d5,d16
> + vtrn.32 d1,d12
> + vtrn.32 d7,d18
> +
> + vshl.u32 d4,d3,#2 @ *5
> + vshl.u32 d6,d5,#2
> + vshl.u32 d2,d1,#2
> + vshl.u32 d8,d7,#2
> + vadd.i32 d4,d4,d3
> + vadd.i32 d2,d2,d1
> + vadd.i32 d6,d6,d5
> + vadd.i32 d8,d8,d7
> +
> + vst4.32 {d0[0],d1[0],d2[0],d3[0]},[r6]!
> + vst4.32 {d0[1],d1[1],d2[1],d3[1]},[r7]!
> + vst4.32 {d4[0],d5[0],d6[0],d7[0]},[r6]!
> + vst4.32 {d4[1],d5[1],d6[1],d7[1]},[r7]!
> + vst1.32 {d8[0]},[r6,:32]
> + vst1.32 {d8[1]},[r7,:32]
> +
> + b .Lsquare_neon
> +
> +.align 4
> +.Lsquare_break_neon:
> + add r6,r0,#(48+2*4*9)
> + add r7,r0,#(48+3*4*9)
> +
> + vmov d0,d10 @ r^4:r^3
> + vshl.u32 d2,d12,#2 @ *5
> + vmov d1,d12
> + vshl.u32 d4,d14,#2
> + vmov d3,d14
> + vshl.u32 d6,d16,#2
> + vmov d5,d16
> + vshl.u32 d8,d18,#2
> + vmov d7,d18
> + vadd.i32 d2,d2,d12
> + vadd.i32 d4,d4,d14
> + vadd.i32 d6,d6,d16
> + vadd.i32 d8,d8,d18
> +
> + vst4.32 {d0[0],d1[0],d2[0],d3[0]},[r6]!
> + vst4.32 {d0[1],d1[1],d2[1],d3[1]},[r7]!
> + vst4.32 {d4[0],d5[0],d6[0],d7[0]},[r6]!
> + vst4.32 {d4[1],d5[1],d6[1],d7[1]},[r7]!
> + vst1.32 {d8[0]},[r6]
> + vst1.32 {d8[1]},[r7]
> +
> + bx lr @ bx lr
> +ENDPROC(poly1305_init_neon)
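The opening of poly1305_init_neon (the block commented "base 2^32 ->
base 2^26") repacks the four 32-bit key words into five limbs of at
most 26 bits each, which is the representation the vmull/vmlal code
works in. A minimal C model of just that repacking, as a sketch;
base32_to_base26_model() is a hypothetical name, not a function in the
patch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the mov/orr/and sequence at the start of
 * poly1305_init_neon: split a 128-bit value held in four 32-bit words
 * into limbs of 26, 26, 26, 26 and 24 bits. */
static void base32_to_base26_model(uint32_t l[5], const uint32_t w[4])
{
	assert(l != 0 && w != 0);
	l[0] = w[0] & 0x03ffffffu;
	l[1] = ((w[0] >> 26) | (w[1] << 6)) & 0x03ffffffu;
	l[2] = ((w[1] >> 20) | (w[2] << 12)) & 0x03ffffffu;
	l[3] = ((w[2] >> 14) | (w[3] << 18)) & 0x03ffffffu;
	l[4] = w[3] >> 8;	/* top 24 bits, no mask needed */
}
```

The "add rX,rY,rY,lsl#2" instructions that follow in the assembly then
precompute 5 * l[1..4], which is what the h4*5*r1 style terms in the
multiplication comments refer to.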
> +
> +.align 5
> +ENTRY(poly1305_blocks_neon)
> + ldr ip,[r0,#36] @ is_base2_26
> + ands r2,r2,#-16
> + beq .Lno_data_neon
> +
> + cmp r2,#64
> + bhs .Lenter_neon
> + tst ip,ip @ is_base2_26?
> + beq .Lpoly1305_blocks_arm
> +
> +.Lenter_neon:
> + stmdb sp!,{r4-r7}
> + vstmdb sp!,{d8-d15} @ ABI specification says so
> +
> + tst ip,ip @ is_base2_26?
> + bne .Lbase2_26_neon
> +
> + stmdb sp!,{r1-r3,lr}
> + bl .Lpoly1305_init_neon
> +
> + ldr r4,[r0,#0] @ load hash value base 2^32
> + ldr r5,[r0,#4]
> + ldr r6,[r0,#8]
> + ldr r7,[r0,#12]
> + ldr ip,[r0,#16]
> +
> + and r2,r4,#0x03ffffff @ base 2^32 -> base 2^26
> + mov r3,r4,lsr#26
> + veor d10,d10,d10
> + mov r4,r5,lsr#20
> + orr r3,r3,r5,lsl#6
> + veor d12,d12,d12
> + mov r5,r6,lsr#14
> + orr r4,r4,r6,lsl#12
> + veor d14,d14,d14
> + mov r6,r7,lsr#8
> + orr r5,r5,r7,lsl#18
> + veor d16,d16,d16
> + and r3,r3,#0x03ffffff
> + orr r6,r6,ip,lsl#24
> + veor d18,d18,d18
> + and r4,r4,#0x03ffffff
> + mov r1,#1
> + and r5,r5,#0x03ffffff
> + str r1,[r0,#36] @ is_base2_26
> +
> + vmov.32 d10[0],r2
> + vmov.32 d12[0],r3
> + vmov.32 d14[0],r4
> + vmov.32 d16[0],r5
> + vmov.32 d18[0],r6
> + adr r5,.Lzeros
> +
> + ldmia sp!,{r1-r3,lr}
> + b .Lbase2_32_neon
> +
> +.align 4
> +.Lbase2_26_neon:
> + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> + @ load hash value
> +
> + veor d10,d10,d10
> + veor d12,d12,d12
> + veor d14,d14,d14
> + veor d16,d16,d16
> + veor d18,d18,d18
> + vld4.32 {d10[0],d12[0],d14[0],d16[0]},[r0]!
> + adr r5,.Lzeros
> + vld1.32 {d18[0]},[r0]
> + sub r0,r0,#16 @ rewind
> +
> +.Lbase2_32_neon:
> + add r4,r1,#32
> + mov r3,r3,lsl#24
> + tst r2,#31
> + beq .Leven
> +
> + vld4.32 {d20[0],d22[0],d24[0],d26[0]},[r1]!
> + vmov.32 d28[0],r3
> + sub r2,r2,#16
> + add r4,r1,#32
> +
> +#ifdef __ARMEB__
> + vrev32.8 q10,q10
> + vrev32.8 q13,q13
> + vrev32.8 q11,q11
> + vrev32.8 q12,q12
> +#endif
> + vsri.u32 d28,d26,#8 @ base 2^32 -> base 2^26
> + vshl.u32 d26,d26,#18
> +
> + vsri.u32 d26,d24,#14
> + vshl.u32 d24,d24,#12
> + vadd.i32 d29,d28,d18 @ add hash value and move to #hi
> +
> + vbic.i32 d26,#0xfc000000
> + vsri.u32 d24,d22,#20
> + vshl.u32 d22,d22,#6
> +
> + vbic.i32 d24,#0xfc000000
> + vsri.u32 d22,d20,#26
> + vadd.i32 d27,d26,d16
> +
> + vbic.i32 d20,#0xfc000000
> + vbic.i32 d22,#0xfc000000
> + vadd.i32 d25,d24,d14
> +
> + vadd.i32 d21,d20,d10
> + vadd.i32 d23,d22,d12
> +
> + mov r7,r5
> + add r6,r0,#48
> +
> + cmp r2,r2
> + b .Long_tail
> +
> +.align 4
> +.Leven:
> + subs r2,r2,#64
> + it lo
> + movlo r4,r5
> +
> + vmov.i32 q14,#1<<24 @ padbit, yes, always
> + vld4.32 {d20,d22,d24,d26},[r1] @ inp[0:1]
> + add r1,r1,#64
> + vld4.32 {d21,d23,d25,d27},[r4] @ inp[2:3] (or 0)
> + add r4,r4,#64
> + itt hi
> + addhi r7,r0,#(48+1*9*4)
> + addhi r6,r0,#(48+3*9*4)
> +
> +#ifdef __ARMEB__
> + vrev32.8 q10,q10
> + vrev32.8 q13,q13
> + vrev32.8 q11,q11
> + vrev32.8 q12,q12
> +#endif
> + vsri.u32 q14,q13,#8 @ base 2^32 -> base 2^26
> + vshl.u32 q13,q13,#18
> +
> + vsri.u32 q13,q12,#14
> + vshl.u32 q12,q12,#12
> +
> + vbic.i32 q13,#0xfc000000
> + vsri.u32 q12,q11,#20
> + vshl.u32 q11,q11,#6
> +
> + vbic.i32 q12,#0xfc000000
> + vsri.u32 q11,q10,#26
> +
> + vbic.i32 q10,#0xfc000000
> + vbic.i32 q11,#0xfc000000
> +
> + bls .Lskip_loop
> +
> + vld4.32 {d0[1],d1[1],d2[1],d3[1]},[r7]! @ load r^2
> + vld4.32 {d0[0],d1[0],d2[0],d3[0]},[r6]! @ load r^4
> + vld4.32 {d4[1],d5[1],d6[1],d7[1]},[r7]!
> + vld4.32 {d4[0],d5[0],d6[0],d7[0]},[r6]!
> + b .Loop_neon
> +
> +.align 5
> +.Loop_neon:
> + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> + @ ((inp[0]*r^4+inp[2]*r^2+inp[4])*r^4+inp[6]*r^2
> + @ ((inp[1]*r^4+inp[3]*r^2+inp[5])*r^3+inp[7]*r
> + @ ___________________/
> + @ ((inp[0]*r^4+inp[2]*r^2+inp[4])*r^4+inp[6]*r^2+inp[8])*r^2
> + @ ((inp[1]*r^4+inp[3]*r^2+inp[5])*r^4+inp[7]*r^2+inp[9])*r
> + @ ___________________/ ____________________/
> + @
> + @ Note that we start with inp[2:3]*r^2. This is because it
> + @ doesn't depend on reduction in previous iteration.
> + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> + @ d4 = h4*r0 + h3*r1 + h2*r2 + h1*r3 + h0*r4
> + @ d3 = h3*r0 + h2*r1 + h1*r2 + h0*r3 + h4*5*r4
> + @ d2 = h2*r0 + h1*r1 + h0*r2 + h4*5*r3 + h3*5*r4
> + @ d1 = h1*r0 + h0*r1 + h4*5*r2 + h3*5*r3 + h2*5*r4
> + @ d0 = h0*r0 + h4*5*r1 + h3*5*r2 + h2*5*r3 + h1*5*r4
> +
> + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> + @ inp[2:3]*r^2
> +
> + vadd.i32 d24,d24,d14 @ accumulate inp[0:1]
> + vmull.u32 q7,d25,d0[1]
> + vadd.i32 d20,d20,d10
> + vmull.u32 q5,d21,d0[1]
> + vadd.i32 d26,d26,d16
> + vmull.u32 q8,d27,d0[1]
> + vmlal.u32 q7,d23,d1[1]
> + vadd.i32 d22,d22,d12
> + vmull.u32 q6,d23,d0[1]
> +
> + vadd.i32 d28,d28,d18
> + vmull.u32 q9,d29,d0[1]
> + subs r2,r2,#64
> + vmlal.u32 q5,d29,d2[1]
> + it lo
> + movlo r4,r5
> + vmlal.u32 q8,d25,d1[1]
> + vld1.32 d8[1],[r7,:32]
> + vmlal.u32 q6,d21,d1[1]
> + vmlal.u32 q9,d27,d1[1]
> +
> + vmlal.u32 q5,d27,d4[1]
> + vmlal.u32 q8,d23,d3[1]
> + vmlal.u32 q9,d25,d3[1]
> + vmlal.u32 q6,d29,d4[1]
> + vmlal.u32 q7,d21,d3[1]
> +
> + vmlal.u32 q8,d21,d5[1]
> + vmlal.u32 q5,d25,d6[1]
> + vmlal.u32 q9,d23,d5[1]
> + vmlal.u32 q6,d27,d6[1]
> + vmlal.u32 q7,d29,d6[1]
> +
> + vmlal.u32 q8,d29,d8[1]
> + vmlal.u32 q5,d23,d8[1]
> + vmlal.u32 q9,d21,d7[1]
> + vmlal.u32 q6,d25,d8[1]
> + vmlal.u32 q7,d27,d8[1]
> +
> + vld4.32 {d21,d23,d25,d27},[r4] @ inp[2:3] (or 0)
> + add r4,r4,#64
> +
> + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> + @ (hash+inp[0:1])*r^4 and accumulate
> +
> + vmlal.u32 q8,d26,d0[0]
> + vmlal.u32 q5,d20,d0[0]
> + vmlal.u32 q9,d28,d0[0]
> + vmlal.u32 q6,d22,d0[0]
> + vmlal.u32 q7,d24,d0[0]
> + vld1.32 d8[0],[r6,:32]
> +
> + vmlal.u32 q8,d24,d1[0]
> + vmlal.u32 q5,d28,d2[0]
> + vmlal.u32 q9,d26,d1[0]
> + vmlal.u32 q6,d20,d1[0]
> + vmlal.u32 q7,d22,d1[0]
> +
> + vmlal.u32 q8,d22,d3[0]
> + vmlal.u32 q5,d26,d4[0]
> + vmlal.u32 q9,d24,d3[0]
> + vmlal.u32 q6,d28,d4[0]
> + vmlal.u32 q7,d20,d3[0]
> +
> + vmlal.u32 q8,d20,d5[0]
> + vmlal.u32 q5,d24,d6[0]
> + vmlal.u32 q9,d22,d5[0]
> + vmlal.u32 q6,d26,d6[0]
> + vmlal.u32 q8,d28,d8[0]
> +
> + vmlal.u32 q7,d28,d6[0]
> + vmlal.u32 q5,d22,d8[0]
> + vmlal.u32 q9,d20,d7[0]
> + vmov.i32 q14,#1<<24 @ padbit, yes, always
> + vmlal.u32 q6,d24,d8[0]
> + vmlal.u32 q7,d26,d8[0]
> +
> + vld4.32 {d20,d22,d24,d26},[r1] @ inp[0:1]
> + add r1,r1,#64
> +#ifdef __ARMEB__
> + vrev32.8 q10,q10
> + vrev32.8 q11,q11
> + vrev32.8 q12,q12
> + vrev32.8 q13,q13
> +#endif
> +
> + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> + @ lazy reduction interleaved with base 2^32 -> base 2^26 of
> + @ inp[0:3] previously loaded to q10-q13 and smashed to q10-q14.
> +
> + vshr.u64 q15,q8,#26
> + vmovn.i64 d16,q8
> + vshr.u64 q4,q5,#26
> + vmovn.i64 d10,q5
> + vadd.i64 q9,q9,q15 @ h3 -> h4
> + vbic.i32 d16,#0xfc000000
> + vsri.u32 q14,q13,#8 @ base 2^32 -> base 2^26
> + vadd.i64 q6,q6,q4 @ h0 -> h1
> + vshl.u32 q13,q13,#18
> + vbic.i32 d10,#0xfc000000
> +
> + vshrn.u64 d30,q9,#26
> + vmovn.i64 d18,q9
> + vshr.u64 q4,q6,#26
> + vmovn.i64 d12,q6
> + vadd.i64 q7,q7,q4 @ h1 -> h2
> + vsri.u32 q13,q12,#14
> + vbic.i32 d18,#0xfc000000
> + vshl.u32 q12,q12,#12
> + vbic.i32 d12,#0xfc000000
> +
> + vadd.i32 d10,d10,d30
> + vshl.u32 d30,d30,#2
> + vbic.i32 q13,#0xfc000000
> + vshrn.u64 d8,q7,#26
> + vmovn.i64 d14,q7
> + vaddl.u32 q5,d10,d30 @ h4 -> h0 [widen for a sec]
> + vsri.u32 q12,q11,#20
> + vadd.i32 d16,d16,d8 @ h2 -> h3
> + vshl.u32 q11,q11,#6
> + vbic.i32 d14,#0xfc000000
> + vbic.i32 q12,#0xfc000000
> +
> + vshrn.u64 d30,q5,#26 @ re-narrow
> + vmovn.i64 d10,q5
> + vsri.u32 q11,q10,#26
> + vbic.i32 q10,#0xfc000000
> + vshr.u32 d8,d16,#26
> + vbic.i32 d16,#0xfc000000
> + vbic.i32 d10,#0xfc000000
> + vadd.i32 d12,d12,d30 @ h0 -> h1
> + vadd.i32 d18,d18,d8 @ h3 -> h4
> + vbic.i32 q11,#0xfc000000
> +
> + bhi .Loop_neon
> +
> +.Lskip_loop:
> + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> + @ multiply (inp[0:1]+hash) or inp[2:3] by r^2:r^1
> +
> + add r7,r0,#(48+0*9*4)
> + add r6,r0,#(48+1*9*4)
> + adds r2,r2,#32
> + it ne
> + movne r2,#0
> + bne .Long_tail
> +
> + vadd.i32 d25,d24,d14 @ add hash value and move to #hi
> + vadd.i32 d21,d20,d10
> + vadd.i32 d27,d26,d16
> + vadd.i32 d23,d22,d12
> + vadd.i32 d29,d28,d18
> +
> +.Long_tail:
> + vld4.32 {d0[1],d1[1],d2[1],d3[1]},[r7]! @ load r^1
> + vld4.32 {d0[0],d1[0],d2[0],d3[0]},[r6]! @ load r^2
> +
> + vadd.i32 d24,d24,d14 @ can be redundant
> + vmull.u32 q7,d25,d0
> + vadd.i32 d20,d20,d10
> + vmull.u32 q5,d21,d0
> + vadd.i32 d26,d26,d16
> + vmull.u32 q8,d27,d0
> + vadd.i32 d22,d22,d12
> + vmull.u32 q6,d23,d0
> + vadd.i32 d28,d28,d18
> + vmull.u32 q9,d29,d0
> +
> + vmlal.u32 q5,d29,d2
> + vld4.32 {d4[1],d5[1],d6[1],d7[1]},[r7]!
> + vmlal.u32 q8,d25,d1
> + vld4.32 {d4[0],d5[0],d6[0],d7[0]},[r6]!
> + vmlal.u32 q6,d21,d1
> + vmlal.u32 q9,d27,d1
> + vmlal.u32 q7,d23,d1
> +
> + vmlal.u32 q8,d23,d3
> + vld1.32 d8[1],[r7,:32]
> + vmlal.u32 q5,d27,d4
> + vld1.32 d8[0],[r6,:32]
> + vmlal.u32 q9,d25,d3
> + vmlal.u32 q6,d29,d4
> + vmlal.u32 q7,d21,d3
> +
> + vmlal.u32 q8,d21,d5
> + it ne
> + addne r7,r0,#(48+2*9*4)
> + vmlal.u32 q5,d25,d6
> + it ne
> + addne r6,r0,#(48+3*9*4)
> + vmlal.u32 q9,d23,d5
> + vmlal.u32 q6,d27,d6
> + vmlal.u32 q7,d29,d6
> +
> + vmlal.u32 q8,d29,d8
> + vorn q0,q0,q0 @ all-ones, can be redundant
> + vmlal.u32 q5,d23,d8
> + vshr.u64 q0,q0,#38
> + vmlal.u32 q9,d21,d7
> + vmlal.u32 q6,d25,d8
> + vmlal.u32 q7,d27,d8
> +
> + beq .Lshort_tail
> +
> + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> + @ (hash+inp[0:1])*r^4:r^3 and accumulate
> +
> + vld4.32 {d0[1],d1[1],d2[1],d3[1]},[r7]! @ load r^3
> + vld4.32 {d0[0],d1[0],d2[0],d3[0]},[r6]! @ load r^4
> +
> + vmlal.u32 q7,d24,d0
> + vmlal.u32 q5,d20,d0
> + vmlal.u32 q8,d26,d0
> + vmlal.u32 q6,d22,d0
> + vmlal.u32 q9,d28,d0
> +
> + vmlal.u32 q5,d28,d2
> + vld4.32 {d4[1],d5[1],d6[1],d7[1]},[r7]!
> + vmlal.u32 q8,d24,d1
> + vld4.32 {d4[0],d5[0],d6[0],d7[0]},[r6]!
> + vmlal.u32 q6,d20,d1
> + vmlal.u32 q9,d26,d1
> + vmlal.u32 q7,d22,d1
> +
> + vmlal.u32 q8,d22,d3
> + vld1.32 d8[1],[r7,:32]
> + vmlal.u32 q5,d26,d4
> + vld1.32 d8[0],[r6,:32]
> + vmlal.u32 q9,d24,d3
> + vmlal.u32 q6,d28,d4
> + vmlal.u32 q7,d20,d3
> +
> + vmlal.u32 q8,d20,d5
> + vmlal.u32 q5,d24,d6
> + vmlal.u32 q9,d22,d5
> + vmlal.u32 q6,d26,d6
> + vmlal.u32 q7,d28,d6
> +
> + vmlal.u32 q8,d28,d8
> + vorn q0,q0,q0 @ all-ones
> + vmlal.u32 q5,d22,d8
> + vshr.u64 q0,q0,#38
> + vmlal.u32 q9,d20,d7
> + vmlal.u32 q6,d24,d8
> + vmlal.u32 q7,d26,d8
> +
> +.Lshort_tail:
> + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> + @ horizontal addition
> +
> + vadd.i64 d16,d16,d17
> + vadd.i64 d10,d10,d11
> + vadd.i64 d18,d18,d19
> + vadd.i64 d12,d12,d13
> + vadd.i64 d14,d14,d15
> +
> + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> + @ lazy reduction, but without narrowing
> +
> + vshr.u64 q15,q8,#26
> + vand.i64 q8,q8,q0
> + vshr.u64 q4,q5,#26
> + vand.i64 q5,q5,q0
> + vadd.i64 q9,q9,q15 @ h3 -> h4
> + vadd.i64 q6,q6,q4 @ h0 -> h1
> +
> + vshr.u64 q15,q9,#26
> + vand.i64 q9,q9,q0
> + vshr.u64 q4,q6,#26
> + vand.i64 q6,q6,q0
> + vadd.i64 q7,q7,q4 @ h1 -> h2
> +
> + vadd.i64 q5,q5,q15
> + vshl.u64 q15,q15,#2
> + vshr.u64 q4,q7,#26
> + vand.i64 q7,q7,q0
> + vadd.i64 q5,q5,q15 @ h4 -> h0
> + vadd.i64 q8,q8,q4 @ h2 -> h3
> +
> + vshr.u64 q15,q5,#26
> + vand.i64 q5,q5,q0
> + vshr.u64 q4,q8,#26
> + vand.i64 q8,q8,q0
> + vadd.i64 q6,q6,q15 @ h0 -> h1
> + vadd.i64 q9,q9,q4 @ h3 -> h4
> +
> + cmp r2,#0
> + bne .Leven
> +
> + @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> + @ store hash value
> +
> + vst4.32 {d10[0],d12[0],d14[0],d16[0]},[r0]!
> + vst1.32 {d18[0]},[r0]
> +
> + vldmia sp!,{d8-d15} @ epilogue
> + ldmia sp!,{r4-r7}
> +.Lno_data_neon:
> + bx lr @ bx lr
> +ENDPROC(poly1305_blocks_neon)
> +
> +.align 5
> +ENTRY(poly1305_emit_neon)
> + ldr ip,[r0,#36] @ is_base2_26
> +
> + stmdb sp!,{r4-r11}
> +
> + tst ip,ip
> + beq .Lpoly1305_emit_enter
> +
> + ldmia r0,{r3-r7}
> + eor r8,r8,r8
> +
> + adds r3,r3,r4,lsl#26 @ base 2^26 -> base 2^32
> + mov r4,r4,lsr#6
> + adcs r4,r4,r5,lsl#20
> + mov r5,r5,lsr#12
> + adcs r5,r5,r6,lsl#14
> + mov r6,r6,lsr#18
> + adcs r6,r6,r7,lsl#8
> + adc r7,r8,r7,lsr#24 @ can be partially reduced ...
> +
> + and r8,r7,#-4 @ ... so reduce
> + and r7,r6,#3
> + add r8,r8,r8,lsr#2 @ *= 5
> + adds r3,r3,r8
> + adcs r4,r4,#0
> + adcs r5,r5,#0
> + adcs r6,r6,#0
> + adc r7,r7,#0
> +
> + adds r8,r3,#5 @ compare to modulus
> + adcs r9,r4,#0
> + adcs r10,r5,#0
> + adcs r11,r6,#0
> + adc r7,r7,#0
> + tst r7,#4 @ did it carry/borrow?
> +
> + it ne
> + movne r3,r8
> + ldr r8,[r2,#0]
> + it ne
> + movne r4,r9
> + ldr r9,[r2,#4]
> + it ne
> + movne r5,r10
> + ldr r10,[r2,#8]
> + it ne
> + movne r6,r11
> + ldr r11,[r2,#12]
> +
> + adds r3,r3,r8 @ accumulate nonce
> + adcs r4,r4,r9
> + adcs r5,r5,r10
> + adc r6,r6,r11
> +
> +#ifdef __ARMEB__
> + rev r3,r3
> + rev r4,r4
> + rev r5,r5
> + rev r6,r6
> +#endif
> + str r3,[r1,#0] @ store the result
> + str r4,[r1,#4]
> + str r5,[r1,#8]
> + str r6,[r1,#12]
> +
> + ldmia sp!,{r4-r11}
> + bx lr @ bx lr
> +ENDPROC(poly1305_emit_neon)
> +
> +.align 5
> +.Lzeros:
> +.long 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
> +#endif
> diff --git a/lib/zinc/poly1305/poly1305-arm64.S b/lib/zinc/poly1305/poly1305-arm64.S
> new file mode 100644
> index 000000000000..c20023544183
> --- /dev/null
> +++ b/lib/zinc/poly1305/poly1305-arm64.S
> @@ -0,0 +1,822 @@
> +/* SPDX-License-Identifier: BSD-3-Clause OR GPL-2.0
> + *
> + * Copyright (C) 2015-2018 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
> + * Copyright (C) 2006-2017 CRYPTOGAMS by <appro@openssl.org>. All Rights Reserved.
> + *
> + * This is based in part on Andy Polyakov's implementation from CRYPTOGAMS.
> + */
> +
> +#include <linux/linkage.h>
> +.text
> +
> +.align 5
> +ENTRY(poly1305_init_arm)
> + cmp x1,xzr
> + stp xzr,xzr,[x0] // zero hash value
> + stp xzr,xzr,[x0,#16] // [along with is_base2_26]
> +
> + csel x0,xzr,x0,eq
> + b.eq .Lno_key
> +
> + ldp x7,x8,[x1] // load key
> + mov x9,#0xfffffffc0fffffff
> + movk x9,#0x0fff,lsl#48
> +#ifdef __ARMEB__
> + rev x7,x7 // flip bytes
> + rev x8,x8
> +#endif
> + and x7,x7,x9 // &=0ffffffc0fffffff
> + and x9,x9,#-4
> + and x8,x8,x9 // &=0ffffffc0ffffffc
> + stp x7,x8,[x0,#32] // save key value
> +
> +.Lno_key:
> + ret
> +ENDPROC(poly1305_init_arm)
> +
> +.align 5
> +ENTRY(poly1305_blocks_arm)
> + ands x2,x2,#-16
> + b.eq .Lno_data
> +
> + ldp x4,x5,[x0] // load hash value
> + ldp x7,x8,[x0,#32] // load key value
> + ldr x6,[x0,#16]
> + add x9,x8,x8,lsr#2 // s1 = r1 + (r1 >> 2)
> + b .Loop
> +
> +.align 5
> +.Loop:
> + ldp x10,x11,[x1],#16 // load input
> + sub x2,x2,#16
> +#ifdef __ARMEB__
> + rev x10,x10
> + rev x11,x11
> +#endif
> + adds x4,x4,x10 // accumulate input
> + adcs x5,x5,x11
> +
> + mul x12,x4,x7 // h0*r0
> + adc x6,x6,x3
> + umulh x13,x4,x7
> +
> + mul x10,x5,x9 // h1*5*r1
> + umulh x11,x5,x9
> +
> + adds x12,x12,x10
> + mul x10,x4,x8 // h0*r1
> + adc x13,x13,x11
> + umulh x14,x4,x8
> +
> + adds x13,x13,x10
> + mul x10,x5,x7 // h1*r0
> + adc x14,x14,xzr
> + umulh x11,x5,x7
> +
> + adds x13,x13,x10
> + mul x10,x6,x9 // h2*5*r1
> + adc x14,x14,x11
> + mul x11,x6,x7 // h2*r0
> +
> + adds x13,x13,x10
> + adc x14,x14,x11
> +
> + and x10,x14,#-4 // final reduction
> + and x6,x14,#3
> + add x10,x10,x14,lsr#2
> + adds x4,x12,x10
> + adcs x5,x13,xzr
> + adc x6,x6,xzr
> +
> + cbnz x2,.Loop
> +
> + stp x4,x5,[x0] // store hash value
> + str x6,[x0,#16]
> +
> +.Lno_data:
> + ret
> +ENDPROC(poly1305_blocks_arm)
> +
> +.align 5
> +ENTRY(poly1305_emit_arm)
> + ldp x4,x5,[x0] // load hash base 2^64
> + ldr x6,[x0,#16]
> + ldp x10,x11,[x2] // load nonce
> +
> + adds x12,x4,#5 // compare to modulus
> + adcs x13,x5,xzr
> + adc x14,x6,xzr
> +
> + tst x14,#-4 // see if it's carried/borrowed
> +
> + csel x4,x4,x12,eq
> + csel x5,x5,x13,eq
> +
> +#ifdef __ARMEB__
> + ror x10,x10,#32 // flip nonce words
> + ror x11,x11,#32
> +#endif
> + adds x4,x4,x10 // accumulate nonce
> + adc x5,x5,x11
> +#ifdef __ARMEB__
> + rev x4,x4 // flip output bytes
> + rev x5,x5
> +#endif
> + stp x4,x5,[x1] // write result
> +
> + ret
> +ENDPROC(poly1305_emit_arm)
> +
> +.align 5
> +__poly1305_mult:
> + mul x12,x4,x7 // h0*r0
> + umulh x13,x4,x7
> +
> + mul x10,x5,x9 // h1*5*r1
> + umulh x11,x5,x9
> +
> + adds x12,x12,x10
> + mul x10,x4,x8 // h0*r1
> + adc x13,x13,x11
> + umulh x14,x4,x8
> +
> + adds x13,x13,x10
> + mul x10,x5,x7 // h1*r0
> + adc x14,x14,xzr
> + umulh x11,x5,x7
> +
> + adds x13,x13,x10
> + mul x10,x6,x9 // h2*5*r1
> + adc x14,x14,x11
> + mul x11,x6,x7 // h2*r0
> +
> + adds x13,x13,x10
> + adc x14,x14,x11
> +
> + and x10,x14,#-4 // final reduction
> + and x6,x14,#3
> + add x10,x10,x14,lsr#2
> + adds x4,x12,x10
> + adcs x5,x13,xzr
> + adc x6,x6,xzr
> +
> + ret
> +
> +__poly1305_splat:
> + and x12,x4,#0x03ffffff // base 2^64 -> base 2^26
> + ubfx x13,x4,#26,#26
> + extr x14,x5,x4,#52
> + and x14,x14,#0x03ffffff
> + ubfx x15,x5,#14,#26
> + extr x16,x6,x5,#40
> +
> + str w12,[x0,#16*0] // r0
> + add w12,w13,w13,lsl#2 // r1*5
> + str w13,[x0,#16*1] // r1
> + add w13,w14,w14,lsl#2 // r2*5
> + str w12,[x0,#16*2] // s1
> + str w14,[x0,#16*3] // r2
> + add w14,w15,w15,lsl#2 // r3*5
> + str w13,[x0,#16*4] // s2
> + str w15,[x0,#16*5] // r3
> + add w15,w16,w16,lsl#2 // r4*5
> + str w14,[x0,#16*6] // s3
> + str w16,[x0,#16*7] // r4
> + str w15,[x0,#16*8] // s4
> +
> + ret
> +
> +.align 5
> +ENTRY(poly1305_blocks_neon)
> + ldr x17,[x0,#24]
> + cmp x2,#128
> + b.hs .Lblocks_neon
> + cbz x17,poly1305_blocks_arm
> +
> +.Lblocks_neon:
> + stp x29,x30,[sp,#-80]!
> + add x29,sp,#0
> +
> + ands x2,x2,#-16
> + b.eq .Lno_data_neon
> +
> + cbz x17,.Lbase2_64_neon
> +
> + ldp w10,w11,[x0] // load hash value base 2^26
> + ldp w12,w13,[x0,#8]
> + ldr w14,[x0,#16]
> +
> + tst x2,#31
> + b.eq .Leven_neon
> +
> + ldp x7,x8,[x0,#32] // load key value
> +
> + add x4,x10,x11,lsl#26 // base 2^26 -> base 2^64
> + lsr x5,x12,#12
> + adds x4,x4,x12,lsl#52
> + add x5,x5,x13,lsl#14
> + adc x5,x5,xzr
> + lsr x6,x14,#24
> + adds x5,x5,x14,lsl#40
> + adc x14,x6,xzr // can be partially reduced...
> +
> + ldp x12,x13,[x1],#16 // load input
> + sub x2,x2,#16
> + add x9,x8,x8,lsr#2 // s1 = r1 + (r1 >> 2)
> +
> + and x10,x14,#-4 // ... so reduce
> + and x6,x14,#3
> + add x10,x10,x14,lsr#2
> + adds x4,x4,x10
> + adcs x5,x5,xzr
> + adc x6,x6,xzr
> +
> +#ifdef __ARMEB__
> + rev x12,x12
> + rev x13,x13
> +#endif
> + adds x4,x4,x12 // accumulate input
> + adcs x5,x5,x13
> + adc x6,x6,x3
> +
> + bl __poly1305_mult
> + ldr x30,[sp,#8]
> +
> + cbz x3,.Lstore_base2_64_neon
> +
> + and x10,x4,#0x03ffffff // base 2^64 -> base 2^26
> + ubfx x11,x4,#26,#26
> + extr x12,x5,x4,#52
> + and x12,x12,#0x03ffffff
> + ubfx x13,x5,#14,#26
> + extr x14,x6,x5,#40
> +
> + cbnz x2,.Leven_neon
> +
> + stp w10,w11,[x0] // store hash value base 2^26
> + stp w12,w13,[x0,#8]
> + str w14,[x0,#16]
> + b .Lno_data_neon
> +
> +.align 4
> +.Lstore_base2_64_neon:
> + stp x4,x5,[x0] // store hash value base 2^64
> + stp x6,xzr,[x0,#16] // note that is_base2_26 is zeroed
> + b .Lno_data_neon
> +
> +.align 4
> +.Lbase2_64_neon:
> + ldp x7,x8,[x0,#32] // load key value
> +
> + ldp x4,x5,[x0] // load hash value base 2^64
> + ldr x6,[x0,#16]
> +
> + tst x2,#31
> + b.eq .Linit_neon
> +
> + ldp x12,x13,[x1],#16 // load input
> + sub x2,x2,#16
> + add x9,x8,x8,lsr#2 // s1 = r1 + (r1 >> 2)
> +#ifdef __ARMEB__
> + rev x12,x12
> + rev x13,x13
> +#endif
> + adds x4,x4,x12 // accumulate input
> + adcs x5,x5,x13
> + adc x6,x6,x3
> +
> + bl __poly1305_mult
> +
> +.Linit_neon:
> + and x10,x4,#0x03ffffff // base 2^64 -> base 2^26
> + ubfx x11,x4,#26,#26
> + extr x12,x5,x4,#52
> + and x12,x12,#0x03ffffff
> + ubfx x13,x5,#14,#26
> + extr x14,x6,x5,#40
> +
> + stp d8,d9,[sp,#16] // meet ABI requirements
> + stp d10,d11,[sp,#32]
> + stp d12,d13,[sp,#48]
> + stp d14,d15,[sp,#64]
> +
> + fmov d24,x10
> + fmov d25,x11
> + fmov d26,x12
> + fmov d27,x13
> + fmov d28,x14
> +
> + ////////////////////////////////// initialize r^n table
> + mov x4,x7 // r^1
> + add x9,x8,x8,lsr#2 // s1 = r1 + (r1 >> 2)
> + mov x5,x8
> + mov x6,xzr
> + add x0,x0,#48+12
> + bl __poly1305_splat
> +
> + bl __poly1305_mult // r^2
> + sub x0,x0,#4
> + bl __poly1305_splat
> +
> + bl __poly1305_mult // r^3
> + sub x0,x0,#4
> + bl __poly1305_splat
> +
> + bl __poly1305_mult // r^4
> + sub x0,x0,#4
> + bl __poly1305_splat
> + ldr x30,[sp,#8]
> +
> + add x16,x1,#32
> + adr x17,.Lzeros
> + subs x2,x2,#64
> + csel x16,x17,x16,lo
> +
> + mov x4,#1
> + str x4,[x0,#-24] // set is_base2_26
> + sub x0,x0,#48 // restore original x0
> + b .Ldo_neon
> +
> +.align 4
> +.Leven_neon:
> + add x16,x1,#32
> + adr x17,.Lzeros
> + subs x2,x2,#64
> + csel x16,x17,x16,lo
> +
> + stp d8,d9,[sp,#16] // meet ABI requirements
> + stp d10,d11,[sp,#32]
> + stp d12,d13,[sp,#48]
> + stp d14,d15,[sp,#64]
> +
> + fmov d24,x10
> + fmov d25,x11
> + fmov d26,x12
> + fmov d27,x13
> + fmov d28,x14
> +
> +.Ldo_neon:
> + ldp x8,x12,[x16],#16 // inp[2:3] (or zero)
> + ldp x9,x13,[x16],#48
> +
> + lsl x3,x3,#24
> + add x15,x0,#48
> +
> +#ifdef __ARMEB__
> + rev x8,x8
> + rev x12,x12
> + rev x9,x9
> + rev x13,x13
> +#endif
> + and x4,x8,#0x03ffffff // base 2^64 -> base 2^26
> + and x5,x9,#0x03ffffff
> + ubfx x6,x8,#26,#26
> + ubfx x7,x9,#26,#26
> + add x4,x4,x5,lsl#32 // bfi x4,x5,#32,#32
> + extr x8,x12,x8,#52
> + extr x9,x13,x9,#52
> + add x6,x6,x7,lsl#32 // bfi x6,x7,#32,#32
> + fmov d14,x4
> + and x8,x8,#0x03ffffff
> + and x9,x9,#0x03ffffff
> + ubfx x10,x12,#14,#26
> + ubfx x11,x13,#14,#26
> + add x12,x3,x12,lsr#40
> + add x13,x3,x13,lsr#40
> + add x8,x8,x9,lsl#32 // bfi x8,x9,#32,#32
> + fmov d15,x6
> + add x10,x10,x11,lsl#32 // bfi x10,x11,#32,#32
> + add x12,x12,x13,lsl#32 // bfi x12,x13,#32,#32
> + fmov d16,x8
> + fmov d17,x10
> + fmov d18,x12
> +
> + ldp x8,x12,[x1],#16 // inp[0:1]
> + ldp x9,x13,[x1],#48
> +
> + ld1 {v0.4s,v1.4s,v2.4s,v3.4s},[x15],#64
> + ld1 {v4.4s,v5.4s,v6.4s,v7.4s},[x15],#64
> + ld1 {v8.4s},[x15]
> +
> +#ifdef __ARMEB__
> + rev x8,x8
> + rev x12,x12
> + rev x9,x9
> + rev x13,x13
> +#endif
> + and x4,x8,#0x03ffffff // base 2^64 -> base 2^26
> + and x5,x9,#0x03ffffff
> + ubfx x6,x8,#26,#26
> + ubfx x7,x9,#26,#26
> + add x4,x4,x5,lsl#32 // bfi x4,x5,#32,#32
> + extr x8,x12,x8,#52
> + extr x9,x13,x9,#52
> + add x6,x6,x7,lsl#32 // bfi x6,x7,#32,#32
> + fmov d9,x4
> + and x8,x8,#0x03ffffff
> + and x9,x9,#0x03ffffff
> + ubfx x10,x12,#14,#26
> + ubfx x11,x13,#14,#26
> + add x12,x3,x12,lsr#40
> + add x13,x3,x13,lsr#40
> + add x8,x8,x9,lsl#32 // bfi x8,x9,#32,#32
> + fmov d10,x6
> + add x10,x10,x11,lsl#32 // bfi x10,x11,#32,#32
> + add x12,x12,x13,lsl#32 // bfi x12,x13,#32,#32
> + movi v31.2d,#-1
> + fmov d11,x8
> + fmov d12,x10
> + fmov d13,x12
> + ushr v31.2d,v31.2d,#38
> +
> + b.ls .Lskip_loop
> +
> +.align 4
> +.Loop_neon:
> + ////////////////////////////////////////////////////////////////
> + // ((inp[0]*r^4+inp[2]*r^2+inp[4])*r^4+inp[6]*r^2
> + // ((inp[1]*r^4+inp[3]*r^2+inp[5])*r^3+inp[7]*r
> + // ___________________/
> + // ((inp[0]*r^4+inp[2]*r^2+inp[4])*r^4+inp[6]*r^2+inp[8])*r^2
> + // ((inp[1]*r^4+inp[3]*r^2+inp[5])*r^4+inp[7]*r^2+inp[9])*r
> + // ___________________/ ____________________/
> + //
> + // Note that we start with inp[2:3]*r^2. This is because it
> + // doesn't depend on reduction in previous iteration.
> + ////////////////////////////////////////////////////////////////
> + // d4 = h0*r4 + h1*r3 + h2*r2 + h3*r1 + h4*r0
> + // d3 = h0*r3 + h1*r2 + h2*r1 + h3*r0 + h4*5*r4
> + // d2 = h0*r2 + h1*r1 + h2*r0 + h3*5*r4 + h4*5*r3
> + // d1 = h0*r1 + h1*r0 + h2*5*r4 + h3*5*r3 + h4*5*r2
> + // d0 = h0*r0 + h1*5*r4 + h2*5*r3 + h3*5*r2 + h4*5*r1
> +
> + subs x2,x2,#64
> + umull v23.2d,v14.2s,v7.s[2]
> + csel x16,x17,x16,lo
> + umull v22.2d,v14.2s,v5.s[2]
> + umull v21.2d,v14.2s,v3.s[2]
> + ldp x8,x12,[x16],#16 // inp[2:3] (or zero)
> + umull v20.2d,v14.2s,v1.s[2]
> + ldp x9,x13,[x16],#48
> + umull v19.2d,v14.2s,v0.s[2]
> +#ifdef __ARMEB__
> + rev x8,x8
> + rev x12,x12
> + rev x9,x9
> + rev x13,x13
> +#endif
> +
> + umlal v23.2d,v15.2s,v5.s[2]
> + and x4,x8,#0x03ffffff // base 2^64 -> base 2^26
> + umlal v22.2d,v15.2s,v3.s[2]
> + and x5,x9,#0x03ffffff
> + umlal v21.2d,v15.2s,v1.s[2]
> + ubfx x6,x8,#26,#26
> + umlal v20.2d,v15.2s,v0.s[2]
> + ubfx x7,x9,#26,#26
> + umlal v19.2d,v15.2s,v8.s[2]
> + add x4,x4,x5,lsl#32 // bfi x4,x5,#32,#32
> +
> + umlal v23.2d,v16.2s,v3.s[2]
> + extr x8,x12,x8,#52
> + umlal v22.2d,v16.2s,v1.s[2]
> + extr x9,x13,x9,#52
> + umlal v21.2d,v16.2s,v0.s[2]
> + add x6,x6,x7,lsl#32 // bfi x6,x7,#32,#32
> + umlal v20.2d,v16.2s,v8.s[2]
> + fmov d14,x4
> + umlal v19.2d,v16.2s,v6.s[2]
> + and x8,x8,#0x03ffffff
> +
> + umlal v23.2d,v17.2s,v1.s[2]
> + and x9,x9,#0x03ffffff
> + umlal v22.2d,v17.2s,v0.s[2]
> + ubfx x10,x12,#14,#26
> + umlal v21.2d,v17.2s,v8.s[2]
> + ubfx x11,x13,#14,#26
> + umlal v20.2d,v17.2s,v6.s[2]
> + add x8,x8,x9,lsl#32 // bfi x8,x9,#32,#32
> + umlal v19.2d,v17.2s,v4.s[2]
> + fmov d15,x6
> +
> + add v11.2s,v11.2s,v26.2s
> + add x12,x3,x12,lsr#40
> + umlal v23.2d,v18.2s,v0.s[2]
> + add x13,x3,x13,lsr#40
> + umlal v22.2d,v18.2s,v8.s[2]
> + add x10,x10,x11,lsl#32 // bfi x10,x11,#32,#32
> + umlal v21.2d,v18.2s,v6.s[2]
> + add x12,x12,x13,lsl#32 // bfi x12,x13,#32,#32
> + umlal v20.2d,v18.2s,v4.s[2]
> + fmov d16,x8
> + umlal v19.2d,v18.2s,v2.s[2]
> + fmov d17,x10
> +
> + ////////////////////////////////////////////////////////////////
> + // (hash+inp[0:1])*r^4 and accumulate
> +
> + add v9.2s,v9.2s,v24.2s
> + fmov d18,x12
> + umlal v22.2d,v11.2s,v1.s[0]
> + ldp x8,x12,[x1],#16 // inp[0:1]
> + umlal v19.2d,v11.2s,v6.s[0]
> + ldp x9,x13,[x1],#48
> + umlal v23.2d,v11.2s,v3.s[0]
> + umlal v20.2d,v11.2s,v8.s[0]
> + umlal v21.2d,v11.2s,v0.s[0]
> +#ifdef __ARMEB__
> + rev x8,x8
> + rev x12,x12
> + rev x9,x9
> + rev x13,x13
> +#endif
> +
> + add v10.2s,v10.2s,v25.2s
> + umlal v22.2d,v9.2s,v5.s[0]
> + umlal v23.2d,v9.2s,v7.s[0]
> + and x4,x8,#0x03ffffff // base 2^64 -> base 2^26
> + umlal v21.2d,v9.2s,v3.s[0]
> + and x5,x9,#0x03ffffff
> + umlal v19.2d,v9.2s,v0.s[0]
> + ubfx x6,x8,#26,#26
> + umlal v20.2d,v9.2s,v1.s[0]
> + ubfx x7,x9,#26,#26
> +
> + add v12.2s,v12.2s,v27.2s
> + add x4,x4,x5,lsl#32 // bfi x4,x5,#32,#32
> + umlal v22.2d,v10.2s,v3.s[0]
> + extr x8,x12,x8,#52
> + umlal v23.2d,v10.2s,v5.s[0]
> + extr x9,x13,x9,#52
> + umlal v19.2d,v10.2s,v8.s[0]
> + add x6,x6,x7,lsl#32 // bfi x6,x7,#32,#32
> + umlal v21.2d,v10.2s,v1.s[0]
> + fmov d9,x4
> + umlal v20.2d,v10.2s,v0.s[0]
> + and x8,x8,#0x03ffffff
> +
> + add v13.2s,v13.2s,v28.2s
> + and x9,x9,#0x03ffffff
> + umlal v22.2d,v12.2s,v0.s[0]
> + ubfx x10,x12,#14,#26
> + umlal v19.2d,v12.2s,v4.s[0]
> + ubfx x11,x13,#14,#26
> + umlal v23.2d,v12.2s,v1.s[0]
> + add x8,x8,x9,lsl#32 // bfi x8,x9,#32,#32
> + umlal v20.2d,v12.2s,v6.s[0]
> + fmov d10,x6
> + umlal v21.2d,v12.2s,v8.s[0]
> + add x12,x3,x12,lsr#40
> +
> + umlal v22.2d,v13.2s,v8.s[0]
> + add x13,x3,x13,lsr#40
> + umlal v19.2d,v13.2s,v2.s[0]
> + add x10,x10,x11,lsl#32 // bfi x10,x11,#32,#32
> + umlal v23.2d,v13.2s,v0.s[0]
> + add x12,x12,x13,lsl#32 // bfi x12,x13,#32,#32
> + umlal v20.2d,v13.2s,v4.s[0]
> + fmov d11,x8
> + umlal v21.2d,v13.2s,v6.s[0]
> + fmov d12,x10
> + fmov d13,x12
> +
> + /////////////////////////////////////////////////////////////////
> + // lazy reduction as discussed in "NEON crypto" by D.J. Bernstein
> + // and P. Schwabe
> + //
> + // [see discussion in poly1305-armv4 module]
> +
> + ushr v29.2d,v22.2d,#26
> + xtn v27.2s,v22.2d
> + ushr v30.2d,v19.2d,#26
> + and v19.16b,v19.16b,v31.16b
> + add v23.2d,v23.2d,v29.2d // h3 -> h4
> + bic v27.2s,#0xfc,lsl#24 // &=0x03ffffff
> + add v20.2d,v20.2d,v30.2d // h0 -> h1
> +
> + ushr v29.2d,v23.2d,#26
> + xtn v28.2s,v23.2d
> + ushr v30.2d,v20.2d,#26
> + xtn v25.2s,v20.2d
> + bic v28.2s,#0xfc,lsl#24
> + add v21.2d,v21.2d,v30.2d // h1 -> h2
> +
> + add v19.2d,v19.2d,v29.2d
> + shl v29.2d,v29.2d,#2
> + shrn v30.2s,v21.2d,#26
> + xtn v26.2s,v21.2d
> + add v19.2d,v19.2d,v29.2d // h4 -> h0
> + bic v25.2s,#0xfc,lsl#24
> + add v27.2s,v27.2s,v30.2s // h2 -> h3
> + bic v26.2s,#0xfc,lsl#24
> +
> + shrn v29.2s,v19.2d,#26
> + xtn v24.2s,v19.2d
> + ushr v30.2s,v27.2s,#26
> + bic v27.2s,#0xfc,lsl#24
> + bic v24.2s,#0xfc,lsl#24
> + add v25.2s,v25.2s,v29.2s // h0 -> h1
> + add v28.2s,v28.2s,v30.2s // h3 -> h4
> +
> + b.hi .Loop_neon
> +
> +.Lskip_loop:
> + dup v16.2d,v16.d[0]
> + add v11.2s,v11.2s,v26.2s
> +
> + ////////////////////////////////////////////////////////////////
> + // multiply (inp[0:1]+hash) or inp[2:3] by r^2:r^1
> +
> + adds x2,x2,#32
> + b.ne .Long_tail
> +
> + dup v16.2d,v11.d[0]
> + add v14.2s,v9.2s,v24.2s
> + add v17.2s,v12.2s,v27.2s
> + add v15.2s,v10.2s,v25.2s
> + add v18.2s,v13.2s,v28.2s
> +
> +.Long_tail:
> + dup v14.2d,v14.d[0]
> + umull2 v19.2d,v16.4s,v6.4s
> + umull2 v22.2d,v16.4s,v1.4s
> + umull2 v23.2d,v16.4s,v3.4s
> + umull2 v21.2d,v16.4s,v0.4s
> + umull2 v20.2d,v16.4s,v8.4s
> +
> + dup v15.2d,v15.d[0]
> + umlal2 v19.2d,v14.4s,v0.4s
> + umlal2 v21.2d,v14.4s,v3.4s
> + umlal2 v22.2d,v14.4s,v5.4s
> + umlal2 v23.2d,v14.4s,v7.4s
> + umlal2 v20.2d,v14.4s,v1.4s
> +
> + dup v17.2d,v17.d[0]
> + umlal2 v19.2d,v15.4s,v8.4s
> + umlal2 v22.2d,v15.4s,v3.4s
> + umlal2 v21.2d,v15.4s,v1.4s
> + umlal2 v23.2d,v15.4s,v5.4s
> + umlal2 v20.2d,v15.4s,v0.4s
> +
> + dup v18.2d,v18.d[0]
> + umlal2 v22.2d,v17.4s,v0.4s
> + umlal2 v23.2d,v17.4s,v1.4s
> + umlal2 v19.2d,v17.4s,v4.4s
> + umlal2 v20.2d,v17.4s,v6.4s
> + umlal2 v21.2d,v17.4s,v8.4s
> +
> + umlal2 v22.2d,v18.4s,v8.4s
> + umlal2 v19.2d,v18.4s,v2.4s
> + umlal2 v23.2d,v18.4s,v0.4s
> + umlal2 v20.2d,v18.4s,v4.4s
> + umlal2 v21.2d,v18.4s,v6.4s
> +
> + b.eq .Lshort_tail
> +
> + ////////////////////////////////////////////////////////////////
> + // (hash+inp[0:1])*r^4:r^3 and accumulate
> +
> + add v9.2s,v9.2s,v24.2s
> + umlal v22.2d,v11.2s,v1.2s
> + umlal v19.2d,v11.2s,v6.2s
> + umlal v23.2d,v11.2s,v3.2s
> + umlal v20.2d,v11.2s,v8.2s
> + umlal v21.2d,v11.2s,v0.2s
> +
> + add v10.2s,v10.2s,v25.2s
> + umlal v22.2d,v9.2s,v5.2s
> + umlal v19.2d,v9.2s,v0.2s
> + umlal v23.2d,v9.2s,v7.2s
> + umlal v20.2d,v9.2s,v1.2s
> + umlal v21.2d,v9.2s,v3.2s
> +
> + add v12.2s,v12.2s,v27.2s
> + umlal v22.2d,v10.2s,v3.2s
> + umlal v19.2d,v10.2s,v8.2s
> + umlal v23.2d,v10.2s,v5.2s
> + umlal v20.2d,v10.2s,v0.2s
> + umlal v21.2d,v10.2s,v1.2s
> +
> + add v13.2s,v13.2s,v28.2s
> + umlal v22.2d,v12.2s,v0.2s
> + umlal v19.2d,v12.2s,v4.2s
> + umlal v23.2d,v12.2s,v1.2s
> + umlal v20.2d,v12.2s,v6.2s
> + umlal v21.2d,v12.2s,v8.2s
> +
> + umlal v22.2d,v13.2s,v8.2s
> + umlal v19.2d,v13.2s,v2.2s
> + umlal v23.2d,v13.2s,v0.2s
> + umlal v20.2d,v13.2s,v4.2s
> + umlal v21.2d,v13.2s,v6.2s
> +
> +.Lshort_tail:
> + ////////////////////////////////////////////////////////////////
> + // horizontal add
> +
> + addp v22.2d,v22.2d,v22.2d
> + ldp d8,d9,[sp,#16] // meet ABI requirements
> + addp v19.2d,v19.2d,v19.2d
> + ldp d10,d11,[sp,#32]
> + addp v23.2d,v23.2d,v23.2d
> + ldp d12,d13,[sp,#48]
> + addp v20.2d,v20.2d,v20.2d
> + ldp d14,d15,[sp,#64]
> + addp v21.2d,v21.2d,v21.2d
> +
> + ////////////////////////////////////////////////////////////////
> + // lazy reduction, but without narrowing
> +
> + ushr v29.2d,v22.2d,#26
> + and v22.16b,v22.16b,v31.16b
> + ushr v30.2d,v19.2d,#26
> + and v19.16b,v19.16b,v31.16b
> +
> + add v23.2d,v23.2d,v29.2d // h3 -> h4
> + add v20.2d,v20.2d,v30.2d // h0 -> h1
> +
> + ushr v29.2d,v23.2d,#26
> + and v23.16b,v23.16b,v31.16b
> + ushr v30.2d,v20.2d,#26
> + and v20.16b,v20.16b,v31.16b
> + add v21.2d,v21.2d,v30.2d // h1 -> h2
> +
> + add v19.2d,v19.2d,v29.2d
> + shl v29.2d,v29.2d,#2
> + ushr v30.2d,v21.2d,#26
> + and v21.16b,v21.16b,v31.16b
> + add v19.2d,v19.2d,v29.2d // h4 -> h0
> + add v22.2d,v22.2d,v30.2d // h2 -> h3
> +
> + ushr v29.2d,v19.2d,#26
> + and v19.16b,v19.16b,v31.16b
> + ushr v30.2d,v22.2d,#26
> + and v22.16b,v22.16b,v31.16b
> + add v20.2d,v20.2d,v29.2d // h0 -> h1
> + add v23.2d,v23.2d,v30.2d // h3 -> h4
> +
> + ////////////////////////////////////////////////////////////////
> + // write the result, can be partially reduced
> +
> + st4 {v19.s,v20.s,v21.s,v22.s}[0],[x0],#16
> + st1 {v23.s}[0],[x0]
> +
> +.Lno_data_neon:
> + ldr x29,[sp],#80
> + ret
> +ENDPROC(poly1305_blocks_neon)
> +
> +.align 5
> +ENTRY(poly1305_emit_neon)
> + ldr x17,[x0,#24]
> + cbz x17,poly1305_emit_arm
> +
> + ldp w10,w11,[x0] // load hash value base 2^26
> + ldp w12,w13,[x0,#8]
> + ldr w14,[x0,#16]
> +
> + add x4,x10,x11,lsl#26 // base 2^26 -> base 2^64
> + lsr x5,x12,#12
> + adds x4,x4,x12,lsl#52
> + add x5,x5,x13,lsl#14
> + adc x5,x5,xzr
> + lsr x6,x14,#24
> + adds x5,x5,x14,lsl#40
> + adc x6,x6,xzr // can be partially reduced...
> +
> + ldp x10,x11,[x2] // load nonce
> +
> + and x12,x6,#-4 // ... so reduce
> + add x12,x12,x6,lsr#2
> + and x6,x6,#3
> + adds x4,x4,x12
> + adcs x5,x5,xzr
> + adc x6,x6,xzr
> +
> + adds x12,x4,#5 // compare to modulus
> + adcs x13,x5,xzr
> + adc x14,x6,xzr
> +
> + tst x14,#-4 // see if it's carried/borrowed
> +
> + csel x4,x4,x12,eq
> + csel x5,x5,x13,eq
> +
> +#ifdef __ARMEB__
> + ror x10,x10,#32 // flip nonce words
> + ror x11,x11,#32
> +#endif
> + adds x4,x4,x10 // accumulate nonce
> + adc x5,x5,x11
> +#ifdef __ARMEB__
> + rev x4,x4 // flip output bytes
> + rev x5,x5
> +#endif
> + stp x4,x5,[x1] // write result
> +
> + ret
> +ENDPROC(poly1305_emit_neon)
> +
> +.align 5
> +.Lzeros:
> +.long 0,0,0,0,0,0,0,0
> --
> 2.19.0
>
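The "base 2^64 -> base 2^26" and "base 2^26 -> base 2^64" conversions that recur in the arm64 code quoted above can be modelled in plain C. This is only a sketch for following the assembly, not code from the patch: the names `limbs26`, `split26` and `join26` are made up for illustration. The split mirrors the `and`/`ubfx`/`extr` sequence in `__poly1305_splat`, and the join mirrors the `adds`/`adc` chain in `poly1305_emit_neon`, which propagates carries because the limbs may be only partially reduced.

```c
#include <assert.h>
#include <stdint.h>

/* Five 26-bit limbs; a limb may be slightly wider when partially reduced. */
struct limbs26 {
	uint32_t l[5];
};

/*
 * base 2^64 -> base 2^26. (top:hi:lo) holds a value of up to 130 bits,
 * so top is expected to be a small number (a few bits at most).
 */
static struct limbs26 split26(uint64_t lo, uint64_t hi, uint64_t top)
{
	struct limbs26 out;

	out.l[0] = (uint32_t)(lo & 0x03ffffff);                        /* bits   0..25  */
	out.l[1] = (uint32_t)((lo >> 26) & 0x03ffffff);                /* bits  26..51  */
	out.l[2] = (uint32_t)(((lo >> 52) | (hi << 12)) & 0x03ffffff); /* bits  52..77  */
	out.l[3] = (uint32_t)((hi >> 14) & 0x03ffffff);                /* bits  78..103 */
	out.l[4] = (uint32_t)((hi >> 40) | (top << 24));               /* bits 104..129 */
	return out;
}

/* base 2^26 -> base 2^64, with carry propagation between the 64-bit words. */
static void join26(const struct limbs26 *in,
		   uint64_t *lo, uint64_t *hi, uint64_t *top)
{
	uint64_t l0 = in->l[0], l1 = in->l[1], l2 = in->l[2];
	uint64_t l3 = in->l[3], l4 = in->l[4];
	uint64_t a, b, carry;

	a = l0 + (l1 << 26);               /* cannot overflow 64 bits */
	b = a + (l2 << 52);                /* may wrap around */
	carry = b < a;
	*lo = b;

	a = (l2 >> 12) + (l3 << 14) + carry;
	b = a + (l4 << 40);                /* may wrap around */
	carry = b < a;
	*hi = b;

	*top = (l4 >> 24) + carry;         /* can exceed 3: partially reduced */
}
```

For fully reduced limbs the two functions are inverses of each other. A top limb equal to 2^26 comes back with `top == 4`, which is the "can be partially reduced" case that the assembly then folds back in via the multiply-by-5 step.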
* Re: [PATCH net-next 2/7] net: phy: mscc: add support for VSC8584 PHY
From: Andrew Lunn @ 2018-09-14 17:27 UTC (permalink / raw)
To: Quentin Schulz
Cc: alexandre.belloni, ralf, paul.burton, jhogan, robh+dt,
mark.rutland, davem, f.fainelli, allan.nielsen, linux-mips,
devicetree, linux-kernel, netdev, thomas.petazzoni,
antoine.tenart
In-Reply-To: <a61d9affd3f1ec9deb60c882cce1daf37fbe2427.1536916714.git-series.quentin.schulz@bootlin.com>
> struct vsc8531_private {
> int rate_magic;
> u16 supp_led_modes;
> @@ -181,6 +354,7 @@ struct vsc8531_private {
> struct vsc85xx_hw_stat *hw_stats;
> u64 *stats;
> int nstats;
> + bool pkg_init;
> +/* bus->mdio_lock should be locked when using this function */
> +static int vsc8584_cmd(struct mii_bus *bus, int phy, u16 val)
> +{
> + unsigned long deadline;
> + u16 reg_val;
> +
> + __mdiobus_write(bus, phy, MSCC_EXT_PAGE_ACCESS,
> + MSCC_PHY_PAGE_EXTENDED_GPIO);
> +
> + __mdiobus_write(bus, phy, MSCC_PHY_PROC_CMD, PROC_CMD_NCOMPLETED | val);
Hi Quentin
All the __mdiobus_write() calls look a bit ugly. Maybe add bus and base_addr
to the vsc8531_private structure. Then add helpers
phy_write_base_phy(priv, reg, val) and phy_read_base_phy(priv, reg).
You could also add in:
	if (unlikely(!mutex_is_locked(&priv->bus->mdio_lock))) {
		dev_err(&priv->bus->dev, "MDIO bus lock not held!\n");
		dump_stack();
	}
Having such code in the mv88e6xxx driver has found a few bugs for me.
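A minimal userspace sketch of that suggestion follows, with hypothetical names throughout: `struct mock_bus` and its `locked` flag stand in for `struct mii_bus` and `mutex_is_locked()`, the `regs` array stands in for the PHY registers, and `lock_violations` replaces `dump_stack()` so the effect can be observed. Only the helper names and argument order are taken from the proposal above. It is a model of the idea, not driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MOCK_NUM_PHYS 32
#define MOCK_NUM_REGS 32

/*
 * Hypothetical stand-in for struct mii_bus: a register file plus a flag
 * modelling whether bus->mdio_lock is currently held.
 */
struct mock_bus {
	bool locked;
	uint16_t regs[MOCK_NUM_PHYS][MOCK_NUM_REGS];
};

/* Hypothetical private data carrying the bus and the base PHY address. */
struct mock_priv {
	struct mock_bus *bus;
	int base_addr;
	unsigned int lock_violations;	/* accesses made without the lock */
};

/* Models the suggested lock check: complain loudly, then carry on. */
static void assert_bus_locked(struct mock_priv *priv)
{
	if (!priv->bus->locked) {
		fprintf(stderr, "MDIO bus lock not held!\n");
		priv->lock_violations++;
	}
}

static void phy_write_base_phy(struct mock_priv *priv, int reg, uint16_t val)
{
	assert_bus_locked(priv);
	priv->bus->regs[priv->base_addr][reg] = val;
}

static uint16_t phy_read_base_phy(struct mock_priv *priv, int reg)
{
	assert_bus_locked(priv);
	return priv->bus->regs[priv->base_addr][reg];
}
```

With the bus and base address stored once in the private structure, every call site shrinks to a register and a value, and a forgotten lock shows up on the first access instead of as a rare race.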
Andrew
* Re: [PATCH net-next v4 00/20] WireGuard: Secure Network Tunnel
From: Ard Biesheuvel @ 2018-09-14 17:39 UTC (permalink / raw)
To: Jason A. Donenfeld
Cc: Linux Kernel Mailing List, <netdev@vger.kernel.org>,
open list:HARDWARE RANDOM NUMBER GENERATOR CORE, David S. Miller,
Greg Kroah-Hartman
In-Reply-To: <20180914161954.7325-1-Jason@zx2c4.com>
On 14 September 2018 at 18:19, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> Changes v3->v4:
> - Remove mistaken double 07/17 patch.
> - Fix whitespace issues in blake2s assembly.
> - It's not possible to put compound literals into __initconst, so
> we now instead just use boring fixed size struct members.
> - Move away from makefile ifdef maze and instead prefer kconfig values,
> which also makes the design a bit more modular too, which could help
> in the future.
Could you elaborate on this? From the patches, it is not clear to me
how this has improved.
> - Port old crypto API implementations (ChaCha20 and Poly1305) to Zinc.
> - Port security/keys/big_key to Zinc as second example of a good usage of
> Zinc.
> - Document precisely what is different between the kernel code and
> CRYPTOGAMS code when the CRYPTOGAMS code is used.
> - Move changelog to top of 00/20 message so that people can
> actually find it.
>
> -----------------------------------------------------------
>
> This patchset is available on git.kernel.org in this branch, where it may be
> pulled directly for inclusion into net-next:
>
> * https://git.kernel.org/pub/scm/linux/kernel/git/zx2c4/linux.git/log/?h=jd/wireguard
>
> -----------------------------------------------------------
>
> WireGuard is a secure network tunnel written especially for Linux, which
> has faced around three years of serious development, deployment, and
> scrutiny. It delivers excellent performance and is extremely easy to
> use and configure. It has been designed with the primary goal of being
> both easy to audit by virtue of being small and highly secure from a
> cryptography and systems security perspective. WireGuard is used by some
> massive companies pushing enormous amounts of traffic, and likely
> already today you've consumed bytes that at some point transited through
> a WireGuard tunnel. Even as an out-of-tree module, WireGuard has been
> integrated into various userspace tools, Linux distributions, mobile
> phones, and data centers. There are ports in several languages to
> several operating systems, and even commercial hardware and services
> sold integrating WireGuard. It is time, therefore, for WireGuard to be
> properly integrated into Linux.
>
> Ample information, including documentation, installation instructions,
> and project details, is available at:
>
> * https://www.wireguard.com/
> * https://www.wireguard.com/papers/wireguard.pdf
>
> As it is currently an out-of-tree module, it lives in its own git repo
> and has its own mailing list, and every commit for the module is tested
> against every stable kernel since 3.10 on a variety of architectures
> using an extensive test suite:
>
> * https://git.zx2c4.com/WireGuard
> https://git.kernel.org/pub/scm/linux/kernel/git/zx2c4/WireGuard.git/
> * https://lists.zx2c4.com/mailman/listinfo/wireguard
> * https://www.wireguard.com/build-status/
>
> The project has been broadly discussed at conferences, and was presented
> to the Netdev developers in Seoul last November, where a paper was
> released detailing some interesting aspects of the project. Dave asked
> me after the talk if I would consider sending in a v1 "sooner rather
> than later", hence this patchset. A decision is still waiting from the
> Linux Plumbers Conference, but an update on these topics may be presented
> in Vancouver in a few months. Prior presentations:
>
> * https://www.wireguard.com/presentations/
> * https://www.wireguard.com/papers/wireguard-netdev22.pdf
>
> The cryptography in the protocol itself has been formally verified by
> several independent academic teams with positive results, and I know of
> two additional efforts on their way to further corroborate those
> findings. The version 1 protocol is "complete", and so the purpose of
> this review is to assess the implementation of the protocol. However, it
> still may be of interest to know that the thing you're reviewing uses a
> protocol with various nice security properties:
>
> * https://www.wireguard.com/formal-verification/
>
> This patchset is divided into four segments. The first introduces a very
> simple helper for working with the FPU state for the purposes of amortizing
> SIMD operations. The second segment is a small collection of cryptographic
> primitives, split up into several commits by primitive and by hardware. The
> third shows usage of Zinc within the existing crypto API and as a replacement
> to the existing crypto API. The last is WireGuard itself, presented as an
> unintrusive and self-contained virtual network driver.
>
> It is intended that this entire patch series enter the kernel through
> DaveM's net-next tree. Subsequently, WireGuard patches will go through
> DaveM's net-next tree, while Zinc patches will go through Greg KH's tree.
>
> Enjoy,
> Jason
* Re: [PATCH net-next v4 08/20] zinc: Poly1305 ARM and ARM64 implementations
From: Jason A. Donenfeld @ 2018-09-14 17:45 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: LKML, Netdev, Linux Crypto Mailing List, David Miller,
Greg Kroah-Hartman, Samuel Neves, Andrew Lutomirski,
Jean-Philippe Aumasson, Andy Polyakov, Russell King - ARM Linux,
linux-arm-kernel
In-Reply-To: <CAKv+Gu8BD=fLk3zm8tvRQ3H-yiePqzXOrKLEz1BLFSRRz2opOQ@mail.gmail.com>
Hi Ard,
On Fri, Sep 14, 2018 at 7:27 PM Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
> As I asked in response to v3, could we please have this as a separate
> patch on top? The diff below is corrupted.
I had played with that originally, but thought it made things actually
harder to review, whereas here you have the changes presented pretty
straightforwardly, and I'd appreciate your review of them. If you and
Eric both prefer I split this into two commits, with the first one
just plopping down the CRYPTOGAMS code as is and the second one
bringing it up to kernel snuff, I can do that.
> Also, both Andy and Eric have offered to get involved in upstreaming
> these changes to OpenSSL, so there is no delta to begin with.
Yes, I think this is probably a good long-term plan, which we can act
on sometime after Zinc is merged.
> I still don't like the GCC -includes, especially because these .h
> files contain function and variable definitions so they are not
> actually header files to begin with.
I very very strongly disagree with you here. I think doing it via
-include is significantly cleaner than any of the alternatives, and
allows the code to be cleanly expressed as conditionals that the
optimizer trivially compiles out in the case of stub functions
returning false and branch optimizes when the stub functions return
true. It is extremely important that these compile together as one
compilation unit. Yes, this is a different design than the crypto
API's approach, but I believe the approach presented here poses
significant improvements and is a lot cleaner.
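
To illustrate the point about stub functions being compiled out, here is a minimal sketch. It is not Zinc's actual code: every name (sum_arch, sum_generic, sum_bytes, EXAMPLE_HAVE_ARCH_SUM) is hypothetical, and a trivial byte sum stands in for a real primitive. In the real series the arch-specific definitions would arrive via -include; here both variants sit in one file so the example is self-contained.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical names throughout. EXAMPLE_HAVE_ARCH_SUM is assumed to be
 * set by the build system when an accelerated variant is available.
 */
#ifdef EXAMPLE_HAVE_ARCH_SUM
/* Stand-in for an accelerated implementation: a 4-way unrolled sum. */
static inline bool sum_arch(uint32_t *out, const uint8_t *in, size_t len)
{
	uint32_t acc = 0;
	size_t i = 0;

	for (; i + 4 <= len; i += 4)
		acc += in[i] + in[i + 1] + in[i + 2] + in[i + 3];
	for (; i < len; i++)
		acc += in[i];
	*out = acc;
	return true;
}
#else
/* Stub: always returns false, so the caller's branch is trivially dead. */
static inline bool sum_arch(uint32_t *out, const uint8_t *in, size_t len)
{
	(void)out;
	(void)in;
	(void)len;
	return false;
}
#endif

/* Portable fallback, always available. */
static void sum_generic(uint32_t *out, const uint8_t *in, size_t len)
{
	uint32_t acc = 0;

	for (size_t i = 0; i < len; i++)
		acc += in[i];
	*out = acc;
}

/* Single entry point: expressed as a plain conditional, no #ifdef here. */
uint32_t sum_bytes(const uint8_t *in, size_t len)
{
	uint32_t out;

	if (!sum_arch(&out, in, len))
		sum_generic(&out, in, len);
	return out;
}
```

Because the stub and the caller live in the same compilation unit, the compiler can see that sum_arch() is constant-false and drop the call, which is the property the -include approach relies on.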
> Also, you mentioned in the commit log that you got rid of defines and
> made the code more modular, but as far as I can tell, libzinc is still
> a single monolithic binary that is essentially always builtin once we
> move random.c to it.
Yes, it's still monolithic, but it's now trivial to split up when the
time comes to do that. If you and AndyL think that it should be split
into multiple modules _now_, then I can go ahead and do that for v5.
But if it's not essential, it seems simpler to keep it as is. I'll
wait for word from you two on this.
Jason
^ permalink raw reply
* Re: [PATCH net-next v4 00/20] WireGuard: Secure Network Tunnel
From: Jason A. Donenfeld @ 2018-09-14 17:47 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: LKML, Netdev, Linux Crypto Mailing List, David Miller,
Greg Kroah-Hartman
In-Reply-To: <CAKv+Gu_LYsNs88uF4+G1xfOtWvNPOjiiYZKqZf7qSBkvn6iEoA@mail.gmail.com>
On Fri, Sep 14, 2018 at 7:40 PM Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
> > - Move away from makefile ifdef maze and instead prefer kconfig values,
> > which also makes the design a bit more modular too, which could help
> > in the future.
>
> Could you elaborate on this? From the patches, it is not clear to me
> how this has improved.
Feature detection was previously done as a confusing set of ifeq and
ifdefs. Instead, I've now put the logic for this into the kconfig,
which makes the makefiles and header files a bit simpler. This also
makes it easier to later on modularize Zinc itself if deemed
necessary.
^ permalink raw reply
* Re: [PATCH net-next v4 18/20] crypto: port ChaCha20 to Zinc
From: Jason A. Donenfeld @ 2018-09-14 17:49 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: LKML, Netdev, Linux Crypto Mailing List, David Miller,
Greg Kroah-Hartman, Samuel Neves, Andrew Lutomirski,
Jean-Philippe Aumasson, Eric Biggers
In-Reply-To: <CAKv+Gu-wwFJOL82+iJYCu8rbzeDWLYH=5PtGOJBUouB1zdiZjg@mail.gmail.com>
On Fri, Sep 14, 2018 at 7:38 PM Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
> so could we please bring that discussion to a close before we drop the ARM code?
My understanding is that either these will find their way up to AndyP
and then back down here, or Eric or you will augment the .S in this
patch at a later date with an improvement commit that includes some
benchmarks.
Jason
^ permalink raw reply
* Re: [PATCH net-next 3/7] net: phy: mscc: split config_init in two functions for VSC8584
From: Florian Fainelli @ 2018-09-14 17:57 UTC (permalink / raw)
To: Quentin Schulz, alexandre.belloni, ralf, paul.burton, jhogan,
robh+dt, mark.rutland, davem, andrew
Cc: allan.nielsen, linux-mips, devicetree, linux-kernel, netdev,
thomas.petazzoni, antoine.tenart
In-Reply-To: <5daa7f3e467b218410238ef0fb97f01779f8f49f.1536916714.git-series.quentin.schulz@bootlin.com>
On 09/14/2018 02:44 AM, Quentin Schulz wrote:
> Part of the config init is common between the VSC8584 and the VSC8574,
> so to prepare the upcoming support for VSC8574, separate config_init
> PHY-specific code to config_pre_init function which is set in the probe
> function of the PHY and used in config_init.
>
> Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
> ---
> drivers/net/phy/mscc.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/phy/mscc.c b/drivers/net/phy/mscc.c
> index b450489..69cc3cf 100644
> --- a/drivers/net/phy/mscc.c
> +++ b/drivers/net/phy/mscc.c
> @@ -355,6 +355,7 @@ struct vsc8531_private {
> u64 *stats;
> int nstats;
> bool pkg_init;
> + int (*config_pre_init)(struct mii_bus *bus, int phy);
Isn't this overkill? Given that you have a reference to the phy_device,
you could check the phy_id to know which exact type you have and
call the appropriate pre_init function.
unsigned int phy might be more appropriate.
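
A rough sketch of the phy_id-based alternative, as a standalone model rather than driver code. The ID values, the mask, struct example_phy and the pre_init helpers are all made up for illustration; the real driver would use its own PHY ID constants and return kernel error codes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ID values and mask, for illustration only. */
#define EXAMPLE_PHY_ID_MASK 0xfffffff0u
#define EXAMPLE_PHY_ID_A    0x00070700u
#define EXAMPLE_PHY_ID_B    0x00070710u

/* Hypothetical stand-in for the phy_device fields used here. */
struct example_phy {
	uint32_t phy_id;
	char pre_init_ran; /* records which pre-init ran, for demonstration */
};

static int pre_init_a(struct example_phy *phy)
{
	phy->pre_init_ran = 'A';
	return 0;
}

static int pre_init_b(struct example_phy *phy)
{
	phy->pre_init_ran = 'B';
	return 0;
}

/* Pick the pre-init routine from the ID instead of a stored pointer. */
int example_config_init(struct example_phy *phy)
{
	switch (phy->phy_id & EXAMPLE_PHY_ID_MASK) {
	case EXAMPLE_PHY_ID_A:
		return pre_init_a(phy);
	case EXAMPLE_PHY_ID_B:
		return pre_init_b(phy);
	default:
		return -1; /* unknown model; a driver would return an errno */
	}
}
```

The trade-off is one switch in config_init versus an extra function pointer in the private struct that has to be set at probe time.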
--
Florian
^ permalink raw reply
* Re: [PATCH 5/7] MIPS: mscc: ocelot: add GPIO4 pinmuxing DT node
From: Alexandre Belloni @ 2018-09-14 18:02 UTC (permalink / raw)
To: Quentin Schulz
Cc: ralf, paul.burton, jhogan, robh+dt, mark.rutland, davem, andrew,
f.fainelli, allan.nielsen, linux-mips, devicetree, linux-kernel,
netdev, thomas.petazzoni, antoine.tenart
In-Reply-To: <20180914162638.fgzzjin2bzgx74de@qschulz>
On 14/09/2018 18:26:38+0200, Quentin Schulz wrote:
> Hi Alexandre,
>
> On Fri, Sep 14, 2018 at 04:54:46PM +0200, Alexandre Belloni wrote:
> > Hi,
> >
> > On 14/09/2018 11:44:26+0200, Quentin Schulz wrote:
> > > In order to use GPIO4 as a GPIO, we need to mux it in this mode so let's
> > > declare a new pinctrl DT node for it.
> > >
> > > Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
> > > ---
> > > arch/mips/boot/dts/mscc/ocelot.dtsi | 5 +++++
> > > 1 file changed, 5 insertions(+)
> > >
> > > diff --git a/arch/mips/boot/dts/mscc/ocelot.dtsi b/arch/mips/boot/dts/mscc/ocelot.dtsi
> > > index 8ce317c..b5c4c74 100644
> > > --- a/arch/mips/boot/dts/mscc/ocelot.dtsi
> > > +++ b/arch/mips/boot/dts/mscc/ocelot.dtsi
> > > @@ -182,6 +182,11 @@
> > > interrupts = <13>;
> > > #interrupt-cells = <2>;
> > >
> > > + gpio4: gpio4 {
> > > + pins = "GPIO_4";
> > > + function = "gpio";
> > > + };
> > > +
> >
> > For a GPIO, I would do that in the board dts because it is not used
> > directly in the dtsi.
> >
>
> And the day we've two boards using this pinctrl we move it to a dtsi. Is
> that the plan?
>
Not really, at least not for gpios. I've included the pinctrl for the
uart, i2c and spi because they are the only option if you are to use
those peripherals. Otherwise, I would have left the pinctrl to the board
file. From my point of view, the gpios are too board specific to be in a
soc dtsi.
--
Alexandre Belloni, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
^ permalink raw reply
* Re: [PATCH net-next 4/4] bnxt_en: Always forward VF MAC address to the PF.
From: Siwei Liu @ 2018-09-14 12:49 UTC (permalink / raw)
To: Michael Chan; +Cc: David Miller, Netdev, si-wei liu
In-Reply-To: <1525763921-20698-5-git-send-email-michael.chan@broadcom.com>
This commit is toxic; if possible, I hope it can be reverted and
reworked with a new patch.
First, the patch introduced backward-incompatible changes to the bnxt_en
VF driver that cause issues when interoperating with an old PF
driver lacking this commit. In that case, VF probing fails from
within the VM:
[ 5.660331] Broadcom NetXtreme-C/E driver bnxt_en v1.9.1
[ 5.663653] bnxt_en 0000:00:03.0 (unnamed net_device)
(uninitialized): hwrm req_type 0xf seq id 0x6 error 0x4
[ 5.665804] bnxt_en 0000:00:03.0 (unnamed net_device)
(uninitialized): VF MAC address 00:01:02:03:04:05 not approved by the
PF
[ 5.668268] bnxt_en 0000:00:03.0: Unable to initialize mac address.
[ 5.670974] bnxt_en: probe of 0000:00:03.0 failed with error -99
Second, this commit contains driver changes to both the PF and VF side,
and incorrectly assumes that both PF and VF can/should be updated at
the same time to resolve the original issue (zero VF MAC address in
'ip link show') it tried to address. In fact that is not warranted. A
potentially warranted fix is for the VF driver to ignore whatever
bnxt_approve_mac() returns when it got a valid MAC address from the
firmware. In this case the only purpose of the bnxt_approve_mac() call
is a best-effort attempt to inform the PF of the MAC address, rather than
failing the VF driver probe when talking to an old PF driver.
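
The best-effort behaviour described above could look roughly like the following. This is a standalone model, not the bnxt_en code: struct example_vf, example_approve_mac() and example_init_mac() are hypothetical, the PF's answer is simulated by a flag, and a fixed locally administered address stands in for a randomly generated one.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical stand-in for the VF state relevant to MAC setup. */
struct example_vf {
	uint8_t fw_mac[MAC_LEN];  /* MAC provided by firmware, all zero if none */
	uint8_t dev_mac[MAC_LEN]; /* MAC the netdev ends up using */
	bool pf_approves;         /* simulated PF reply; an old PF may reject */
};

/* Valid here means: not all zero and not a multicast address. */
static bool mac_is_valid(const uint8_t *mac)
{
	static const uint8_t zero[MAC_LEN];

	return memcmp(mac, zero, MAC_LEN) != 0 && !(mac[0] & 1);
}

/* Simulated request to the PF: 0 on approval, -1 on rejection. */
static int example_approve_mac(const struct example_vf *vf)
{
	return vf->pf_approves ? 0 : -1;
}

int example_init_mac(struct example_vf *vf)
{
	/* Placeholder for a random locally administered address. */
	static const uint8_t fallback[MAC_LEN] = { 0x02, 0, 0, 0, 0, 0x01 };

	if (mac_is_valid(vf->fw_mac)) {
		memcpy(vf->dev_mac, vf->fw_mac, MAC_LEN);
		/* Best effort: tell the PF, but do not fail probe on rejection. */
		(void)example_approve_mac(vf);
		return 0;
	}

	/* No firmware MAC: the self-chosen address still needs approval. */
	memcpy(vf->dev_mac, fallback, MAC_LEN);
	return example_approve_mac(vf);
}
```

With this shape, a VF that received a valid MAC from firmware still probes successfully against a PF that rejects the forwarded address, while a self-generated address keeps requiring approval.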
Canonical reported a similar issue a few days back due to the same cause.
https://www.spinics.net/lists/netdev/msg521428.html
Regards,
-Siwei
On Tue, May 8, 2018 at 12:18 AM, Michael Chan <michael.chan@broadcom.com> wrote:
> The current code already forwards the VF MAC address to the PF, except
> in one case. If the VF driver gets a valid MAC address from the firmware
> during probe time, it will not forward the MAC address to the PF,
> incorrectly assuming that the PF already knows the MAC address. This
> causes "ip link show" to show zero VF MAC addresses for this case.
>
> This assumption is not correct. Newer firmware remembers the VF MAC
> address last used by the VF and provides it to the VF driver during
> probe. So we need to always forward the VF MAC address to the PF.
>
> The forwarded MAC address may now be the PF assigned MAC address and so we
> need to make sure we approve it for this case.
>
> Signed-off-by: Michael Chan <michael.chan@broadcom.com>
> ---
> drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +-
> drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 3 ++-
> 2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index cd3ab78..dfa0839 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -8678,8 +8678,8 @@ static int bnxt_init_mac_addr(struct bnxt *bp)
> memcpy(bp->dev->dev_addr, vf->mac_addr, ETH_ALEN);
> } else {
> eth_hw_addr_random(bp->dev);
> - rc = bnxt_approve_mac(bp, bp->dev->dev_addr);
> }
> + rc = bnxt_approve_mac(bp, bp->dev->dev_addr);
> #endif
> }
> return rc;
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
> index cc21d87..a649108 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
> @@ -923,7 +923,8 @@ static int bnxt_vf_configure_mac(struct bnxt *bp, struct bnxt_vf_info *vf)
> if (req->enables & cpu_to_le32(FUNC_VF_CFG_REQ_ENABLES_DFLT_MAC_ADDR)) {
> if (is_valid_ether_addr(req->dflt_mac_addr) &&
> ((vf->flags & BNXT_VF_TRUST) ||
> - (!is_valid_ether_addr(vf->mac_addr)))) {
> + !is_valid_ether_addr(vf->mac_addr) ||
> + ether_addr_equal(req->dflt_mac_addr, vf->mac_addr))) {
> ether_addr_copy(vf->vf_mac_addr, req->dflt_mac_addr);
> return bnxt_hwrm_exec_fwd_resp(bp, vf, msg_size);
> }
> --
> 1.8.3.1
>
^ permalink raw reply
* Re: [PATCH] net/mlx4_core: print firmware version during driver loading
From: Andrew Lunn @ 2018-09-14 18:17 UTC (permalink / raw)
To: Qing Huang
Cc: Leon Romanovsky, netdev, linux-rdma, linux-kernel, tariqt, davem
In-Reply-To: <c580ad9d-b63d-743b-2278-1c4cf3553186@oracle.com>
On Fri, Sep 14, 2018 at 10:15:48AM -0700, Qing Huang wrote:
> The FW version is actually a very crucial piece of information and only
> printed once here
> when the driver is loaded. People tend to get confused when switching
> multiple FW files
> back and forth without running separate utility tools, especially at
> customer sites.
> IMHO, this information is very useful and only takes up very little log file
> space. :-)
Why not use ethtool -i ?
$ sudo ethtool -i eth0
driver: r8169
version: 2.3LK-NAPI
firmware-version: rtl8168g-2_0.0.1 02/06/13
Andrew
^ permalink raw reply
* KMSAN: uninit-value in do_ip_vs_set_ctl
From: syzbot @ 2018-09-14 18:23 UTC (permalink / raw)
To: coreteam, davem, fw, horms, ja, kadlec, linux-kernel, lvs-devel,
netdev, netfilter-devel, pablo, syzkaller-bugs, wensong
Hello,
syzbot found the following crash on:
HEAD commit: 06b2df0593a8 kmsan: unpoison only the created pages in get..
git tree: https://github.com/google/kmsan.git/master
console output: https://syzkaller.appspot.com/x/log.txt?x=11a6ae37800000
kernel config: https://syzkaller.appspot.com/x/.config?x=4ca1e57bafa8ab1f
dashboard link: https://syzkaller.appspot.com/bug?extid=23b5f9e7caf61d9a3898
compiler: clang version 7.0.0 (trunk 329391)
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=14008417800000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11deb017800000
IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+23b5f9e7caf61d9a3898@syzkaller.appspotmail.com
random: sshd: uninitialized urandom read (32 bytes read)
random: sshd: uninitialized urandom read (32 bytes read)
random: sshd: uninitialized urandom read (32 bytes read)
random: sshd: uninitialized urandom read (32 bytes read)
==================================================================
BUG: KMSAN: uninit-value in do_ip_vs_set_ctl+0x15ac/0x2760
net/netfilter/ipvs/ip_vs_ctl.c:2424
CPU: 1 PID: 4464 Comm: syz-executor844 Not tainted 4.17.0-rc3+ #94
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x185/0x1d0 lib/dump_stack.c:113
kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1084
__msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:683
do_ip_vs_set_ctl+0x15ac/0x2760 net/netfilter/ipvs/ip_vs_ctl.c:2424
nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
nf_setsockopt+0x476/0x4d0 net/netfilter/nf_sockopt.c:115
ip_setsockopt+0x24b/0x2b0 net/ipv4/ip_sockglue.c:1253
raw_setsockopt+0x2e5/0x350 net/ipv4/raw.c:868
sock_common_setsockopt+0x136/0x170 net/core/sock.c:3039
__sys_setsockopt+0x4af/0x560 net/socket.c:1903
__do_sys_setsockopt net/socket.c:1914 [inline]
__se_sys_setsockopt net/socket.c:1911 [inline]
__x64_sys_setsockopt+0x15c/0x1c0 net/socket.c:1911
do_syscall_64+0x154/0x220 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x43fca9
RSP: 002b:00007fff7a4795b8 EFLAGS: 00000213 ORIG_RAX: 0000000000000036
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043fca9
RDX: 0000000000000480 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 00000000006ca018 R08: 0000000000000000 R09: 00000000004002c8
R10: 0000000000000000 R11: 0000000000000213 R12: 00000000004015d0
R13: 0000000000401660 R14: 0000000000000000 R15: 0000000000000000
Local variable description: ----arg@do_ip_vs_set_ctl
Variable was created at:
read_pnet include/net/net_namespace.h:288 [inline]
sock_net include/net/sock.h:2306 [inline]
do_ip_vs_set_ctl+0x93/0x2760 net/netfilter/ipvs/ip_vs_ctl.c:2347
nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
nf_setsockopt+0x476/0x4d0 net/netfilter/nf_sockopt.c:115
==================================================================
---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches
^ permalink raw reply
* Re: [PATCH] net/mlx4_core: print firmware version during driver loading
From: Qing Huang @ 2018-09-14 18:33 UTC (permalink / raw)
To: Andrew Lunn
Cc: Leon Romanovsky, netdev, linux-rdma, linux-kernel, tariqt, davem
In-Reply-To: <20180914181718.GD3811@lunn.ch>
On 9/14/2018 11:17 AM, Andrew Lunn wrote:
> On Fri, Sep 14, 2018 at 10:15:48AM -0700, Qing Huang wrote:
>> The FW version is actually a very crucial piece of information and only
>> printed once here
>> when the driver is loaded. People tend to get confused when switching
>> multiple FW files
>> back and forth without running separate utility tools, especially at
>> customer sites.
>> IMHO, this information is very useful and only takes up very little log file
>> space. :-)
> Why not use ethtool -i ?
>
> $ sudo ethtool -i eth0
> driver: r8169
> version: 2.3LK-NAPI
> firmware-version: rtl8168g-2_0.0.1 02/06/13
>
> Andrew
Sure. You can also use the ibstat or ibv_devinfo tools if they are installed.
But that's not very convenient in some cases.
E.g. a customer upgrades FW on HCAs and encounters issues. During triage,
it's much easier to study customer-uploaded log files when remotely testing
different FW files.
Thanks.
^ permalink raw reply
* Re: [PATCH iproute2] q_cake: Add printing of no-split-gso option
From: Toke Høiland-Jørgensen @ 2018-09-14 13:40 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20180912130743.1adfe86b@xeon-e3>
Stephen Hemminger <stephen@networkplumber.org> writes:
> On Wed, 12 Sep 2018 00:32:16 +0200
> Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>
>> When the GSO splitting was turned into dual split-gso/no-split-gso options,
>> the printing of the latter was left out. Add that, so output is consistent
>> with the options passed.
>>
>> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
>
> Applied. I noticed that nat/nonat and wash/nowash have similar missing
> output.
Thanks! And yeah, you're right; I'll send another patch :)
-Toke
^ permalink raw reply
* [PATCH iproute2] q_cake: Also print nonat, nowash and no-ack-filter keywords
From: Toke Høiland-Jørgensen @ 2018-09-14 13:51 UTC (permalink / raw)
To: netdev; +Cc: cake, Toke Høiland-Jørgensen
Similar to the previous patch for no-split-gso, the negative keywords for
'nat', 'wash' and 'ack-filter' were not printed either. Add those as well.
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
---
tc/q_cake.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/tc/q_cake.c b/tc/q_cake.c
index 077bf84f..e827e3f1 100644
--- a/tc/q_cake.c
+++ b/tc/q_cake.c
@@ -468,6 +468,8 @@ static int cake_print_opt(struct qdisc_util *qu, FILE *f, struct rtattr *opt)
if (nat)
print_string(PRINT_FP, NULL, "nat ", NULL);
+ else
+ print_string(PRINT_FP, NULL, "nonat ", NULL);
print_bool(PRINT_JSON, "nat", NULL, nat);
if (tb[TCA_CAKE_WASH] &&
@@ -508,6 +510,8 @@ static int cake_print_opt(struct qdisc_util *qu, FILE *f, struct rtattr *opt)
if (wash)
print_string(PRINT_FP, NULL, "wash ", NULL);
+ else
+ print_string(PRINT_FP, NULL, "nowash ", NULL);
print_bool(PRINT_JSON, "wash", NULL, wash);
if (ingress)
@@ -520,7 +524,7 @@ static int cake_print_opt(struct qdisc_util *qu, FILE *f, struct rtattr *opt)
else if (ack_filter == CAKE_ACK_FILTER)
print_string(PRINT_ANY, "ack-filter", "ack-filter ", "enabled");
else
- print_string(PRINT_JSON, "ack-filter", NULL, "disabled");
+ print_string(PRINT_ANY, "ack-filter", "no-ack-filter ", "disabled");
if (split_gso)
print_string(PRINT_FP, NULL, "split-gso ", NULL);
--
2.18.0
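
The effect of the patch above on the plain-text output can be modelled with a small helper. This is only a sketch: example_format_flags() is hypothetical and not part of tc, it ignores the JSON side handled by print_bool()/print_string(), and it reduces ack-filter to a boolean even though the code above has more than two states.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of tc: always emit one keyword per
 * option, positive or negative, so the output mirrors what was passed.
 * Returns the number of characters snprintf() would have written.
 */
int example_format_flags(char *buf, size_t len,
			 bool nat, bool wash, bool ack_filter)
{
	return snprintf(buf, len, "%s%s%s",
			nat ? "nat " : "nonat ",
			wash ? "wash " : "nowash ",
			ack_filter ? "ack-filter " : "no-ack-filter ");
}
```

Before the change, a disabled option produced no text at all; after it, each option always contributes exactly one keyword.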
^ permalink raw reply related
* Re: [PATCH net] veth: Orphan skb before GRO
From: Paolo Abeni @ 2018-09-14 14:16 UTC (permalink / raw)
To: Toshiaki Makita, David S. Miller; +Cc: netdev, Eric Dumazet
In-Reply-To: <1536899624-2438-1-git-send-email-makita.toshiaki@lab.ntt.co.jp>
On Fri, 2018-09-14 at 13:33 +0900, Toshiaki Makita wrote:
> GRO expects skbs not to be owned by sockets, but when XDP is enabled veth
> passed skbs owned by sockets. It caused corrupted sk_wmem_alloc.
>
> Paolo Abeni reported the following splat:
>
> [ 362.098904] refcount_t overflow at skb_set_owner_w+0x5e/0xa0 in iperf3[1644], uid/euid: 0/0
> [ 362.108239] WARNING: CPU: 0 PID: 1644 at kernel/panic.c:648 refcount_error_report+0xa0/0xa4
> [ 362.117547] Modules linked in: tcp_diag inet_diag veth intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate intel_uncore intel_rapl_perf ipmi_ssif iTCO_wdt sg ipmi_si iTCO_vendor_support ipmi_devintf mxm_wmi ipmi_msghandler pcspkr dcdbas mei_me wmi mei lpc_ich acpi_power_meter pcc_cpufreq xfs libcrc32c sd_mod mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ixgbe igb ttm ahci mdio libahci ptp crc32c_intel drm pps_core libata i2c_algo_bit dca dm_mirror dm_region_hash dm_log dm_mod
> [ 362.176622] CPU: 0 PID: 1644 Comm: iperf3 Not tainted 4.19.0-rc2.vanilla+ #2025
> [ 362.184777] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.1.7 06/16/2016
> [ 362.193124] RIP: 0010:refcount_error_report+0xa0/0xa4
> [ 362.198758] Code: 08 00 00 48 8b 95 80 00 00 00 49 8d 8c 24 80 0a 00 00 41 89 c1 44 89 2c 24 48 89 de 48 c7 c7 18 4d e7 9d 31 c0 e8 30 fa ff ff <0f> 0b eb 88 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 41 54 49 89 fc
> [ 362.219711] RSP: 0018:ffff9ee6ff603c20 EFLAGS: 00010282
> [ 362.225538] RAX: 0000000000000000 RBX: ffffffff9de83e10 RCX: 0000000000000000
> [ 362.233497] RDX: 0000000000000001 RSI: ffff9ee6ff6167d8 RDI: ffff9ee6ff6167d8
> [ 362.241457] RBP: ffff9ee6ff603d78 R08: 0000000000000490 R09: 0000000000000004
> [ 362.249416] R10: 0000000000000000 R11: ffff9ee6ff603990 R12: ffff9ee664b94500
> [ 362.257377] R13: 0000000000000000 R14: 0000000000000004 R15: ffffffff9de615f9
> [ 362.265337] FS: 00007f1d22d28740(0000) GS:ffff9ee6ff600000(0000) knlGS:0000000000000000
> [ 362.274363] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 362.280773] CR2: 00007f1d222f35d0 CR3: 0000001fddfec003 CR4: 00000000001606f0
> [ 362.288733] Call Trace:
> [ 362.291459] <IRQ>
> [ 362.293702] ex_handler_refcount+0x4e/0x80
> [ 362.298269] fixup_exception+0x35/0x40
> [ 362.302451] do_trap+0x109/0x150
> [ 362.306048] do_error_trap+0xd5/0x130
> [ 362.315766] invalid_op+0x14/0x20
> [ 362.319460] RIP: 0010:skb_set_owner_w+0x5e/0xa0
> [ 362.324512] Code: ef ff ff 74 49 48 c7 43 60 20 7b 4a 9d 8b 85 f4 01 00 00 85 c0 75 16 8b 83 e0 00 00 00 f0 01 85 44 01 00 00 0f 88 d8 23 16 00 <5b> 5d c3 80 8b 91 00 00 00 01 8b 85 f4 01 00 00 89 83 a4 00 00 00
> [ 362.345465] RSP: 0018:ffff9ee6ff603e20 EFLAGS: 00010a86
> [ 362.351291] RAX: 0000000000001100 RBX: ffff9ee65deec700 RCX: ffff9ee65e829244
> [ 362.359250] RDX: 0000000000000100 RSI: ffff9ee65e829100 RDI: ffff9ee65deec700
> [ 362.367210] RBP: ffff9ee65e829100 R08: 000000000002a380 R09: 0000000000000000
> [ 362.375169] R10: 0000000000000002 R11: fffff1a4bf77bb00 R12: ffffc0754661d000
> [ 362.383130] R13: ffff9ee65deec200 R14: ffff9ee65f597000 R15: 00000000000000aa
> [ 362.391092] veth_xdp_rcv+0x4e4/0x890 [veth]
> [ 362.399357] veth_poll+0x4d/0x17a [veth]
> [ 362.403731] net_rx_action+0x2af/0x3f0
> [ 362.407912] __do_softirq+0xdd/0x29e
> [ 362.411897] do_softirq_own_stack+0x2a/0x40
> [ 362.416561] </IRQ>
> [ 362.418899] do_softirq+0x4b/0x70
> [ 362.422594] __local_bh_enable_ip+0x50/0x60
> [ 362.427258] ip_finish_output2+0x16a/0x390
> [ 362.431824] ip_output+0x71/0xe0
> [ 362.440670] __tcp_transmit_skb+0x583/0xab0
> [ 362.445333] tcp_write_xmit+0x247/0xfb0
> [ 362.449609] __tcp_push_pending_frames+0x2d/0xd0
> [ 362.454760] tcp_sendmsg_locked+0x857/0xd30
> [ 362.459424] tcp_sendmsg+0x27/0x40
> [ 362.463216] sock_sendmsg+0x36/0x50
> [ 362.467104] sock_write_iter+0x87/0x100
> [ 362.471382] __vfs_write+0x112/0x1a0
> [ 362.475369] vfs_write+0xad/0x1a0
> [ 362.479062] ksys_write+0x52/0xc0
> [ 362.482759] do_syscall_64+0x5b/0x180
> [ 362.486841] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 362.492473] RIP: 0033:0x7f1d22293238
> [ 362.496458] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 c5 54 2d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55
> [ 362.517409] RSP: 002b:00007ffebaef8008 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> [ 362.525855] RAX: ffffffffffffffda RBX: 0000000000002800 RCX: 00007f1d22293238
> [ 362.533816] RDX: 0000000000002800 RSI: 00007f1d22d36000 RDI: 0000000000000005
> [ 362.541775] RBP: 00007f1d22d36000 R08: 00000002db777a30 R09: 0000562b70712b20
> [ 362.549734] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000005
> [ 362.557693] R13: 0000000000002800 R14: 00007ffebaef8060 R15: 0000562b70712260
>
> In order to avoid this, orphan the skb before entering GRO.
>
> Fixes: 948d4f214fde ("veth: Add driver XDP")
> Reported-by: Paolo Abeni <pabeni@redhat.com>
> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
> ---
> drivers/net/veth.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> index 8d679c8..41a00cd 100644
> --- a/drivers/net/veth.c
> +++ b/drivers/net/veth.c
> @@ -463,6 +463,8 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq, struct sk_buff *skb,
> int mac_len, delta, off;
> struct xdp_buff xdp;
>
> + skb_orphan(skb);
> +
> rcu_read_lock();
> xdp_prog = rcu_dereference(rq->xdp_prog);
> if (unlikely(!xdp_prog)) {
> @@ -508,8 +510,6 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq, struct sk_buff *skb,
> skb_copy_header(nskb, skb);
> head_off = skb_headroom(nskb) - skb_headroom(skb);
> skb_headers_offset_update(nskb, head_off);
> - if (skb->sk)
> - skb_set_owner_w(nskb, skb->sk);
> consume_skb(skb);
> skb = nskb;
> }
I just gave it a run in my test environment, and it fixes the reported
issue.
Tested-by: Paolo Abeni <pabeni@redhat.com>
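
For readers following along, here is a toy model of what orphaning means for the socket's write-memory accounting. It is not kernel code: the structs and the example_* functions are simplified, hypothetical stand-ins for sk_buff, sock, skb_set_owner_w() and skb_orphan(), and a plain int replaces the refcount.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for sock and sk_buff. */
struct example_sock {
	int wmem_alloc; /* bytes currently charged to this socket */
};

struct example_skb {
	struct example_sock *sk;
	void (*destructor)(struct example_skb *skb);
	int truesize;
};

/* Destructor: give the charged bytes back to the owning socket. */
static void example_wfree(struct example_skb *skb)
{
	skb->sk->wmem_alloc -= skb->truesize;
}

/* Charge the skb to a socket and remember how to uncharge it. */
void example_set_owner_w(struct example_skb *skb, struct example_sock *sk)
{
	skb->sk = sk;
	skb->destructor = example_wfree;
	sk->wmem_alloc += skb->truesize;
}

/* Detach the skb from its socket, running the destructor exactly once. */
void example_orphan(struct example_skb *skb)
{
	if (skb->destructor) {
		skb->destructor(skb);
		skb->destructor = NULL;
	}
	skb->sk = NULL;
}
```

Once orphaned, the buffer no longer carries a socket reference or a pending uncharge, so later stages cannot disturb the socket's accounting through it.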
^ permalink raw reply
* [PATCH net] pppoe: fix reception of frames with no mac header
From: Guillaume Nault @ 2018-09-14 14:28 UTC (permalink / raw)
To: netdev; +Cc: Michal Ostrowski, Eric Dumazet
pppoe_rcv() needs to look back at the Ethernet header in order to
lookup the PPPoE session. Therefore we need to ensure that the mac
header is big enough to contain an Ethernet header. Otherwise
eth_hdr(skb)->h_source might access invalid data.
==================================================================
BUG: KMSAN: uninit-value in __get_item drivers/net/ppp/pppoe.c:172 [inline]
BUG: KMSAN: uninit-value in get_item drivers/net/ppp/pppoe.c:236 [inline]
BUG: KMSAN: uninit-value in pppoe_rcv+0xcef/0x10e0 drivers/net/ppp/pppoe.c:450
CPU: 0 PID: 4543 Comm: syz-executor355 Not tainted 4.16.0+ #87
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google
01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x185/0x1d0 lib/dump_stack.c:53
kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
__msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:683
__get_item drivers/net/ppp/pppoe.c:172 [inline]
get_item drivers/net/ppp/pppoe.c:236 [inline]
pppoe_rcv+0xcef/0x10e0 drivers/net/ppp/pppoe.c:450
__netif_receive_skb_core+0x47df/0x4a90 net/core/dev.c:4562
__netif_receive_skb net/core/dev.c:4627 [inline]
netif_receive_skb_internal+0x49d/0x630 net/core/dev.c:4701
netif_receive_skb+0x230/0x240 net/core/dev.c:4725
tun_rx_batched drivers/net/tun.c:1555 [inline]
tun_get_user+0x740f/0x7c60 drivers/net/tun.c:1962
tun_chr_write_iter+0x1d4/0x330 drivers/net/tun.c:1990
call_write_iter include/linux/fs.h:1782 [inline]
new_sync_write fs/read_write.c:469 [inline]
__vfs_write+0x7fb/0x9f0 fs/read_write.c:482
vfs_write+0x463/0x8d0 fs/read_write.c:544
SYSC_write+0x172/0x360 fs/read_write.c:589
SyS_write+0x55/0x80 fs/read_write.c:581
do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x4447c9
RSP: 002b:00007fff64c8fc28 EFLAGS: 00000297 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004447c9
RDX: 000000000000fd87 RSI: 0000000020000600 RDI: 0000000000000004
RBP: 00000000006cf018 R08: 00007fff64c8fda8 R09: 00007fff00006bda
R10: 0000000000005fe7 R11: 0000000000000297 R12: 00000000004020d0
R13: 0000000000402160 R14: 0000000000000000 R15: 0000000000000000
Uninit was created at:
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188
kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314
kmsan_slab_alloc+0x11/0x20 mm/kmsan/kmsan.c:321
slab_post_alloc_hook mm/slab.h:445 [inline]
slab_alloc_node mm/slub.c:2737 [inline]
__kmalloc_node_track_caller+0xaed/0x11c0 mm/slub.c:4369
__kmalloc_reserve net/core/skbuff.c:138 [inline]
__alloc_skb+0x2cf/0x9f0 net/core/skbuff.c:206
alloc_skb include/linux/skbuff.h:984 [inline]
alloc_skb_with_frags+0x1d4/0xb20 net/core/skbuff.c:5234
sock_alloc_send_pskb+0xb56/0x1190 net/core/sock.c:2085
tun_alloc_skb drivers/net/tun.c:1532 [inline]
tun_get_user+0x2242/0x7c60 drivers/net/tun.c:1829
tun_chr_write_iter+0x1d4/0x330 drivers/net/tun.c:1990
call_write_iter include/linux/fs.h:1782 [inline]
new_sync_write fs/read_write.c:469 [inline]
__vfs_write+0x7fb/0x9f0 fs/read_write.c:482
vfs_write+0x463/0x8d0 fs/read_write.c:544
SYSC_write+0x172/0x360 fs/read_write.c:589
SyS_write+0x55/0x80 fs/read_write.c:581
do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
==================================================================
Fixes: 224cf5ad14c0 ("ppp: Move the PPP drivers")
Reported-by: syzbot+f5f6080811c849739212@syzkaller.appspotmail.com
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
---
drivers/net/ppp/pppoe.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/net/ppp/pppoe.c b/drivers/net/ppp/pppoe.c
index ce61231e96ea..62dc564b251d 100644
--- a/drivers/net/ppp/pppoe.c
+++ b/drivers/net/ppp/pppoe.c
@@ -429,6 +429,9 @@ static int pppoe_rcv(struct sk_buff *skb, struct net_device *dev,
if (!skb)
goto out;
+ if (skb_mac_header_len(skb) < ETH_HLEN)
+ goto drop;
+
if (!pskb_may_pull(skb, sizeof(struct pppoe_hdr)))
goto drop;
--
2.19.0
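
The length check added by this patch can be modelled outside the kernel as below. This is a sketch under assumptions: struct example_skb_offsets and the example_* helpers are hypothetical, the MAC header length is taken to be the distance between the mac and network header offsets, and network_header is assumed to be at or after mac_header.

```c
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_ETH_HLEN 14 /* dst MAC (6) + src MAC (6) + ethertype (2) */

/* Hypothetical stand-in: only the two offsets the check depends on. */
struct example_skb_offsets {
	uint16_t mac_header;     /* offset of the link-layer header */
	uint16_t network_header; /* offset of the next header */
};

/* Bytes between the mac header and the network header. */
static inline unsigned int
example_mac_header_len(const struct example_skb_offsets *skb)
{
	return skb->network_header - skb->mac_header;
}

/* True only if a full Ethernet header precedes the payload. */
bool example_has_eth_header(const struct example_skb_offsets *skb)
{
	return example_mac_header_len(skb) >= EXAMPLE_ETH_HLEN;
}
```

A frame injected without any link-layer header yields a length of zero and is rejected before the source address is read.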
^ permalink raw reply related