* Re: [PATCH 3/4] of: Convert to using %pOFn instead of device_node.name
From: Joe Perches @ 2018-09-08 0:30 UTC (permalink / raw)
To: Thierry Reding, Rob Herring
Cc: Frank Rowand, devicetree, linux-kernel, Andrew Lunn,
Florian Fainelli, netdev
In-Reply-To: <20180907122928.GA5821@ulmo>
On Fri, 2018-09-07 at 14:29 +0200, Thierry Reding wrote:
> On Tue, Aug 28, 2018 at 10:52:53AM -0500, Rob Herring wrote:
> > In preparation to remove the node name pointer from struct device_node,
> > convert printf users to use the %pOFn format specifier.
> >
> > Cc: Frank Rowand <frowand.list@gmail.com>
> > Cc: Andrew Lunn <andrew@lunn.ch>
> > Cc: Florian Fainelli <f.fainelli@gmail.com>
> > Cc: devicetree@vger.kernel.org
> > Cc: netdev@vger.kernel.org
> > Signed-off-by: Rob Herring <robh@kernel.org>
> > ---
> > drivers/of/device.c | 4 ++--
> > drivers/of/of_mdio.c | 12 ++++++------
> > drivers/of/of_numa.c | 4 ++--
> > drivers/of/overlay.c | 4 ++--
> > drivers/of/platform.c | 8 ++++----
> > drivers/of/unittest.c | 12 ++++++------
> > 6 files changed, 22 insertions(+), 22 deletions(-)
> >
> > diff --git a/drivers/of/device.c b/drivers/of/device.c
> > index 5957cd4fa262..daa075d87317 100644
> > --- a/drivers/of/device.c
> > +++ b/drivers/of/device.c
> > @@ -219,7 +219,7 @@ static ssize_t of_device_get_modalias(struct device *dev, char *str, ssize_t len
> > return -ENODEV;
> >
> > /* Name & Type */
> > - csize = snprintf(str, len, "of:N%sT%s", dev->of_node->name,
> > + csize = snprintf(str, len, "of:N%pOFnT%s", dev->of_node,
> > dev->of_node->type);
> > tsize = csize;
> > len -= csize;
>
> This seems to cause the modalias to be improperly constructed. As a
> consequence, automatic module loading at boot time is now broken. I
> think the reason why this fails is because vsnprintf() will skip all
> alpha-numeric characters after a call to pointer(). Presumably this
> is meant to be a generic way of skipping whatever specifiers we throw
> at it.
>
> Unfortunately for the case of OF modaliases, this means that the 'T'
> character gets eaten, so we end up with something like this:
>
> # udevadm info /sys/bus/platform/devices/54200000.dc
> [...]
> E: MODALIAS=of:Ndc<NULL>Cnvidia,tegra124-dc
> [...]
>
> instead of this:
>
> # udevadm info /sys/bus/platform/devices/54200000.dc
> [...]
> E: MODALIAS=of:NdcT<NULL>Cnvidia,tegra124-dc
> [...]
>
> Everything is back to normal if I revert this patch. However, since
> that's obviously not what we want, I think perhaps what we need is a
> way for pointer() (and its implementations) to report back how many
> characters in the format string it consumed so that we can support
> these kinds of back-to-back strings.
>
> If nobody else has the time I can look into coding up a fix, but in the
> meantime it might be best to back this one out until we can handle the
> OF modalias format string.
Or just use 2 consecutive snprintf calls
csize = snprintf(str, len, "of:N%pOFn", dev->of_node);
csize += snprintf(str + csize, len - csize, "T%s",
dev->of_node->type);
^ permalink raw reply
* Re: Containers and checkpoint/restart micro-conference at LPC2018
From: Stéphane Graber @ 2018-09-08 4:59 UTC (permalink / raw)
To: lxc-devel-cunTk1MwBs9qMoObBWhMNEqPaTDuhLve2LY78lusg7I,
lxc-users-cunTk1MwBs9qMoObBWhMNEqPaTDuhLve2LY78lusg7I,
containers-cunTk1MwBs98uUxBSJOaYoYkZiVZrdSR2LY78lusg7I,
linux-security-module-u79uwXL29TY76Z2rM5mHXA,
linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20180813161015.GD18492@castiana>
[-- Attachment #1.1: Type: text/plain, Size: 1465 bytes --]
On Mon, Aug 13, 2018 at 12:10:15PM -0400, Stéphane Graber wrote:
> Hello,
>
> This year's edition of the Linux Plumbers Conference will once again
> have a containers micro-conference but this time around we'll have twice
> the usual amount of time and will include the content that would
> traditionally go into the checkpoint/restore micro-conference.
>
> LPC2018 will be held in Vancouver, Canada from the 13th to the 15th of
> November, co-located with the Linux Kernel Summit.
>
>
> We're looking for discussion topics around kernel work related to
> containers and namespacing, resource control, access control,
> checkpoint/restore of kernel structures, filesystem/mount handling for
> containers and any related userspace work.
>
>
> The format of the event will mostly be discussions where someone
> introduces a given topic/problem and it then gets discussed for 20-30min
> before moving on to something else. There will also be limited room for
> short demos of recent work with shorter 15min slots.
>
>
> Details can be found here:
>
> https://discuss.linuxcontainers.org/t/containers-micro-conference-at-linux-plumbers-2018/2417
>
>
> Looking forward to seeing you in Vancouver!
Hello,
We've added an extra week to the CFP, new deadline is Friday 14th of September.
If you were thinking about sending something bug then forgot or just
missed the deadline, now is your chance to send it!
Stéphane
[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
[-- Attachment #2: Type: text/plain, Size: 159 bytes --]
_______________________________________________
lxc-devel mailing list
lxc-devel@lists.linuxcontainers.org
http://lists.linuxcontainers.org/listinfo/lxc-devel
^ permalink raw reply
* [bpf-next, v2 3/3] selftests/bpf: test bpf flow dissection
From: Petar Penkov @ 2018-09-08 0:11 UTC (permalink / raw)
To: netdev
Cc: davem, ast, daniel, simon.horman, ecree, songliubraving, tom,
Petar Penkov, Willem de Bruijn
In-Reply-To: <20180908001110.200828-1-peterpenkov96@gmail.com>
From: Petar Penkov <ppenkov@google.com>
Adds a test that sends different types of packets over multiple
tunnels and verifies that valid packets are dissected correctly. To do
so, a tc-flower rule is added to drop packets on UDP src port 9, and
packets are sent from ports 8, 9, and 10. Only the packets on port 9
should be dropped. Because tc-flower relies on the flow dissector to
match flows, correct classification demonstrates correct dissection.
Also add support logic to load the BPF program and to inject the test
packets.
Signed-off-by: Petar Penkov <ppenkov@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
---
tools/testing/selftests/bpf/.gitignore | 2 +
tools/testing/selftests/bpf/Makefile | 6 +-
tools/testing/selftests/bpf/config | 1 +
.../selftests/bpf/flow_dissector_load.c | 140 ++++
.../selftests/bpf/test_flow_dissector.c | 782 ++++++++++++++++++
.../selftests/bpf/test_flow_dissector.sh | 115 +++
tools/testing/selftests/bpf/with_addr.sh | 54 ++
tools/testing/selftests/bpf/with_tunnels.sh | 36 +
8 files changed, 1134 insertions(+), 2 deletions(-)
create mode 100644 tools/testing/selftests/bpf/flow_dissector_load.c
create mode 100644 tools/testing/selftests/bpf/test_flow_dissector.c
create mode 100755 tools/testing/selftests/bpf/test_flow_dissector.sh
create mode 100755 tools/testing/selftests/bpf/with_addr.sh
create mode 100755 tools/testing/selftests/bpf/with_tunnels.sh
diff --git a/tools/testing/selftests/bpf/.gitignore b/tools/testing/selftests/bpf/.gitignore
index 4d789c1e5167..8a60c9b9892d 100644
--- a/tools/testing/selftests/bpf/.gitignore
+++ b/tools/testing/selftests/bpf/.gitignore
@@ -23,3 +23,5 @@ test_skb_cgroup_id_user
test_socket_cookie
test_cgroup_storage
test_select_reuseport
+test_flow_dissector
+flow_dissector_load
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index e65f50f9185e..fd3851d5c079 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -47,10 +47,12 @@ TEST_PROGS := test_kmod.sh \
test_tunnel.sh \
test_lwt_seg6local.sh \
test_lirc_mode2.sh \
- test_skb_cgroup_id.sh
+ test_skb_cgroup_id.sh \
+ test_flow_dissector.sh
# Compile but not part of 'make run_tests'
-TEST_GEN_PROGS_EXTENDED = test_libbpf_open test_sock_addr test_skb_cgroup_id_user
+TEST_GEN_PROGS_EXTENDED = test_libbpf_open test_sock_addr test_skb_cgroup_id_user \
+ flow_dissector_load test_flow_dissector
include ../lib.mk
diff --git a/tools/testing/selftests/bpf/config b/tools/testing/selftests/bpf/config
index b4994a94968b..3655508f95fd 100644
--- a/tools/testing/selftests/bpf/config
+++ b/tools/testing/selftests/bpf/config
@@ -18,3 +18,4 @@ CONFIG_CRYPTO_HMAC=m
CONFIG_CRYPTO_SHA256=m
CONFIG_VXLAN=y
CONFIG_GENEVE=y
+CONFIG_NET_CLS_FLOWER=m
diff --git a/tools/testing/selftests/bpf/flow_dissector_load.c b/tools/testing/selftests/bpf/flow_dissector_load.c
new file mode 100644
index 000000000000..d3273b5b3173
--- /dev/null
+++ b/tools/testing/selftests/bpf/flow_dissector_load.c
@@ -0,0 +1,140 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <error.h>
+#include <errno.h>
+#include <getopt.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <bpf/bpf.h>
+#include <bpf/libbpf.h>
+
+const char *cfg_pin_path = "/sys/fs/bpf/flow_dissector";
+const char *cfg_map_name = "jmp_table";
+bool cfg_attach = true;
+char *cfg_section_name;
+char *cfg_path_name;
+
+static void load_and_attach_program(void)
+{
+ struct bpf_program *prog, *main_prog;
+ struct bpf_map *prog_array;
+ int i, fd, prog_fd, ret;
+ struct bpf_object *obj;
+ int prog_array_fd;
+
+ ret = bpf_prog_load(cfg_path_name, BPF_PROG_TYPE_FLOW_DISSECTOR, &obj,
+ &prog_fd);
+ if (ret)
+ error(1, 0, "bpf_prog_load %s", cfg_path_name);
+
+ main_prog = bpf_object__find_program_by_title(obj, cfg_section_name);
+ if (!main_prog)
+ error(1, 0, "bpf_object__find_program_by_title %s",
+ cfg_section_name);
+
+ prog_fd = bpf_program__fd(main_prog);
+ if (prog_fd < 0)
+ error(1, 0, "bpf_program__fd");
+
+ prog_array = bpf_object__find_map_by_name(obj, cfg_map_name);
+ if (!prog_array)
+ error(1, 0, "bpf_object__find_map_by_name %s", cfg_map_name);
+
+ prog_array_fd = bpf_map__fd(prog_array);
+ if (prog_array_fd < 0)
+ error(1, 0, "bpf_map__fd %s", cfg_map_name);
+
+ i = 0;
+ bpf_object__for_each_program(prog, obj) {
+ fd = bpf_program__fd(prog);
+ if (fd < 0)
+ error(1, 0, "bpf_program__fd");
+
+ if (fd != prog_fd) {
+ printf("%d: %s\n", i, bpf_program__title(prog, false));
+ bpf_map_update_elem(prog_array_fd, &i, &fd, BPF_ANY);
+ ++i;
+ }
+ }
+
+ ret = bpf_prog_attach(prog_fd, 0 /* Ignore */, BPF_FLOW_DISSECTOR, 0);
+ if (ret)
+ error(1, 0, "bpf_prog_attach %s", cfg_path_name);
+
+ ret = bpf_object__pin(obj, cfg_pin_path);
+ if (ret)
+ error(1, 0, "bpf_object__pin %s", cfg_pin_path);
+
+}
+
+static void detach_program(void)
+{
+ char command[64];
+ int ret;
+
+ ret = bpf_prog_detach(0, BPF_FLOW_DISSECTOR);
+ if (ret)
+ error(1, 0, "bpf_prog_detach");
+
+ /* To unpin, it is necessary and sufficient to just remove this dir */
+ sprintf(command, "rm -r %s", cfg_pin_path);
+ ret = system(command);
+ if (ret)
+ error(1, errno, command);
+}
+
+static void parse_opts(int argc, char **argv)
+{
+ bool attach = false;
+ bool detach = false;
+ int c;
+
+ while ((c = getopt(argc, argv, "adp:s:")) != -1) {
+ switch (c) {
+ case 'a':
+ if (detach)
+ error(1, 0, "attach/detach are exclusive");
+ attach = true;
+ break;
+ case 'd':
+ if (attach)
+ error(1, 0, "attach/detach are exclusive");
+ detach = true;
+ break;
+ case 'p':
+ if (cfg_path_name)
+ error(1, 0, "only one prog name can be given");
+
+ cfg_path_name = optarg;
+ break;
+ case 's':
+ if (cfg_section_name)
+ error(1, 0, "only one section can be given");
+
+ cfg_section_name = optarg;
+ break;
+ }
+ }
+
+ if (detach)
+ cfg_attach = false;
+
+ if (cfg_attach && !cfg_path_name)
+ error(1, 0, "must provide a path to the BPF program");
+
+ if (cfg_attach && !cfg_section_name)
+ error(1, 0, "must provide a section name");
+}
+
+int main(int argc, char **argv)
+{
+ parse_opts(argc, argv);
+ if (cfg_attach)
+ load_and_attach_program();
+ else
+ detach_program();
+ return 0;
+}
diff --git a/tools/testing/selftests/bpf/test_flow_dissector.c b/tools/testing/selftests/bpf/test_flow_dissector.c
new file mode 100644
index 000000000000..12b784afba31
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_flow_dissector.c
@@ -0,0 +1,782 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Inject packets with all sorts of encapsulation into the kernel.
+ *
+ * IPv4/IPv6 outer layer 3
+ * GRE/GUE/BARE outer layer 4, where bare is IPIP/SIT/IPv4-in-IPv6/..
+ * IPv4/IPv6 inner layer 3
+ */
+
+#define _GNU_SOURCE
+
+#include <stddef.h>
+#include <arpa/inet.h>
+#include <asm/byteorder.h>
+#include <error.h>
+#include <errno.h>
+#include <linux/if_packet.h>
+#include <linux/if_ether.h>
+#include <linux/if_packet.h>
+#include <linux/ipv6.h>
+#include <netinet/ip.h>
+#include <netinet/in.h>
+#include <netinet/udp.h>
+#include <poll.h>
+#include <stdbool.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/ioctl.h>
+#include <sys/socket.h>
+#include <sys/stat.h>
+#include <sys/time.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+#define CFG_PORT_INNER 8000
+
+/* Add some protocol definitions that do not exist in userspace */
+
+struct grehdr {
+ uint16_t unused;
+ uint16_t protocol;
+} __attribute__((packed));
+
+struct guehdr {
+ union {
+ struct {
+#if defined(__LITTLE_ENDIAN_BITFIELD)
+ __u8 hlen:5,
+ control:1,
+ version:2;
+#elif defined (__BIG_ENDIAN_BITFIELD)
+ __u8 version:2,
+ control:1,
+ hlen:5;
+#else
+#error "Please fix <asm/byteorder.h>"
+#endif
+ __u8 proto_ctype;
+ __be16 flags;
+ };
+ __be32 word;
+ };
+};
+
+static uint8_t cfg_dsfield_inner;
+static uint8_t cfg_dsfield_outer;
+static uint8_t cfg_encap_proto;
+static bool cfg_expect_failure = false;
+static int cfg_l3_extra = AF_UNSPEC; /* optional SIT prefix */
+static int cfg_l3_inner = AF_UNSPEC;
+static int cfg_l3_outer = AF_UNSPEC;
+static int cfg_num_pkt = 10;
+static int cfg_num_secs = 0;
+static char cfg_payload_char = 'a';
+static int cfg_payload_len = 100;
+static int cfg_port_gue = 6080;
+static bool cfg_only_rx;
+static bool cfg_only_tx;
+static int cfg_src_port = 9;
+
+static char buf[ETH_DATA_LEN];
+
+#define INIT_ADDR4(name, addr4, port) \
+ static struct sockaddr_in name = { \
+ .sin_family = AF_INET, \
+ .sin_port = __constant_htons(port), \
+ .sin_addr.s_addr = __constant_htonl(addr4), \
+ };
+
+#define INIT_ADDR6(name, addr6, port) \
+ static struct sockaddr_in6 name = { \
+ .sin6_family = AF_INET6, \
+ .sin6_port = __constant_htons(port), \
+ .sin6_addr = addr6, \
+ };
+
+INIT_ADDR4(in_daddr4, INADDR_LOOPBACK, CFG_PORT_INNER)
+INIT_ADDR4(in_saddr4, INADDR_LOOPBACK + 2, 0)
+INIT_ADDR4(out_daddr4, INADDR_LOOPBACK, 0)
+INIT_ADDR4(out_saddr4, INADDR_LOOPBACK + 1, 0)
+INIT_ADDR4(extra_daddr4, INADDR_LOOPBACK, 0)
+INIT_ADDR4(extra_saddr4, INADDR_LOOPBACK + 1, 0)
+
+INIT_ADDR6(in_daddr6, IN6ADDR_LOOPBACK_INIT, CFG_PORT_INNER)
+INIT_ADDR6(in_saddr6, IN6ADDR_LOOPBACK_INIT, 0)
+INIT_ADDR6(out_daddr6, IN6ADDR_LOOPBACK_INIT, 0)
+INIT_ADDR6(out_saddr6, IN6ADDR_LOOPBACK_INIT, 0)
+INIT_ADDR6(extra_daddr6, IN6ADDR_LOOPBACK_INIT, 0)
+INIT_ADDR6(extra_saddr6, IN6ADDR_LOOPBACK_INIT, 0)
+
+static unsigned long util_gettime(void)
+{
+ struct timeval tv;
+
+ gettimeofday(&tv, NULL);
+ return (tv.tv_sec * 1000) + (tv.tv_usec / 1000);
+}
+
+static void util_printaddr(const char *msg, struct sockaddr *addr)
+{
+ unsigned long off = 0;
+ char nbuf[INET6_ADDRSTRLEN];
+
+ switch (addr->sa_family) {
+ case PF_INET:
+ off = __builtin_offsetof(struct sockaddr_in, sin_addr);
+ break;
+ case PF_INET6:
+ off = __builtin_offsetof(struct sockaddr_in6, sin6_addr);
+ break;
+ default:
+ error(1, 0, "printaddr: unsupported family %u\n",
+ addr->sa_family);
+ }
+
+ if (!inet_ntop(addr->sa_family, ((void *) addr) + off, nbuf,
+ sizeof(nbuf)))
+ error(1, errno, "inet_ntop");
+
+ fprintf(stderr, "%s: %s\n", msg, nbuf);
+}
+
+static unsigned long add_csum_hword(const uint16_t *start, int num_u16)
+{
+ unsigned long sum = 0;
+ int i;
+
+ for (i = 0; i < num_u16; i++)
+ sum += start[i];
+
+ return sum;
+}
+
+static uint16_t build_ip_csum(const uint16_t *start, int num_u16,
+ unsigned long sum)
+{
+ sum += add_csum_hword(start, num_u16);
+
+ while (sum >> 16)
+ sum = (sum & 0xffff) + (sum >> 16);
+
+ return ~sum;
+}
+
+static void build_ipv4_header(void *header, uint8_t proto,
+ uint32_t src, uint32_t dst,
+ int payload_len, uint8_t tos)
+{
+ struct iphdr *iph = header;
+
+ iph->ihl = 5;
+ iph->version = 4;
+ iph->tos = tos;
+ iph->ttl = 8;
+ iph->tot_len = htons(sizeof(*iph) + payload_len);
+ iph->id = htons(1337);
+ iph->protocol = proto;
+ iph->saddr = src;
+ iph->daddr = dst;
+ iph->check = build_ip_csum((void *) iph, iph->ihl << 1, 0);
+}
+
+static void ipv6_set_dsfield(struct ipv6hdr *ip6h, uint8_t dsfield)
+{
+ uint16_t val, *ptr = (uint16_t *)ip6h;
+
+ val = ntohs(*ptr);
+ val &= 0xF00F;
+ val |= ((uint16_t) dsfield) << 4;
+ *ptr = htons(val);
+}
+
+static void build_ipv6_header(void *header, uint8_t proto,
+ struct sockaddr_in6 *src,
+ struct sockaddr_in6 *dst,
+ int payload_len, uint8_t dsfield)
+{
+ struct ipv6hdr *ip6h = header;
+
+ ip6h->version = 6;
+ ip6h->payload_len = htons(payload_len);
+ ip6h->nexthdr = proto;
+ ip6h->hop_limit = 8;
+ ipv6_set_dsfield(ip6h, dsfield);
+
+ memcpy(&ip6h->saddr, &src->sin6_addr, sizeof(ip6h->saddr));
+ memcpy(&ip6h->daddr, &dst->sin6_addr, sizeof(ip6h->daddr));
+}
+
+static uint16_t build_udp_v4_csum(const struct iphdr *iph,
+ const struct udphdr *udph,
+ int num_words)
+{
+ unsigned long pseudo_sum;
+ int num_u16 = sizeof(iph->saddr); /* halfwords: twice byte len */
+
+ pseudo_sum = add_csum_hword((void *) &iph->saddr, num_u16);
+ pseudo_sum += htons(IPPROTO_UDP);
+ pseudo_sum += udph->len;
+ return build_ip_csum((void *) udph, num_words, pseudo_sum);
+}
+
+static uint16_t build_udp_v6_csum(const struct ipv6hdr *ip6h,
+ const struct udphdr *udph,
+ int num_words)
+{
+ unsigned long pseudo_sum;
+ int num_u16 = sizeof(ip6h->saddr); /* halfwords: twice byte len */
+
+ pseudo_sum = add_csum_hword((void *) &ip6h->saddr, num_u16);
+ pseudo_sum += htons(ip6h->nexthdr);
+ pseudo_sum += ip6h->payload_len;
+ return build_ip_csum((void *) udph, num_words, pseudo_sum);
+}
+
+static void build_udp_header(void *header, int payload_len,
+ uint16_t dport, int family)
+{
+ struct udphdr *udph = header;
+ int len = sizeof(*udph) + payload_len;
+
+ udph->source = htons(cfg_src_port);
+ udph->dest = htons(dport);
+ udph->len = htons(len);
+ udph->check = 0;
+ if (family == AF_INET)
+ udph->check = build_udp_v4_csum(header - sizeof(struct iphdr),
+ udph, len >> 1);
+ else
+ udph->check = build_udp_v6_csum(header - sizeof(struct ipv6hdr),
+ udph, len >> 1);
+}
+
+static void build_gue_header(void *header, uint8_t proto)
+{
+ struct guehdr *gueh = header;
+
+ gueh->proto_ctype = proto;
+}
+
+static void build_gre_header(void *header, uint16_t proto)
+{
+ struct grehdr *greh = header;
+
+ greh->protocol = htons(proto);
+}
+
+static int l3_length(int family)
+{
+ if (family == AF_INET)
+ return sizeof(struct iphdr);
+ else
+ return sizeof(struct ipv6hdr);
+}
+
+static int build_packet(void)
+{
+ int ol3_len = 0, ol4_len = 0, il3_len = 0, il4_len = 0;
+ int el3_len = 0;
+
+ if (cfg_l3_extra)
+ el3_len = l3_length(cfg_l3_extra);
+
+ /* calculate header offsets */
+ if (cfg_encap_proto) {
+ ol3_len = l3_length(cfg_l3_outer);
+
+ if (cfg_encap_proto == IPPROTO_GRE)
+ ol4_len = sizeof(struct grehdr);
+ else if (cfg_encap_proto == IPPROTO_UDP)
+ ol4_len = sizeof(struct udphdr) + sizeof(struct guehdr);
+ }
+
+ il3_len = l3_length(cfg_l3_inner);
+ il4_len = sizeof(struct udphdr);
+
+ if (el3_len + ol3_len + ol4_len + il3_len + il4_len + cfg_payload_len >=
+ sizeof(buf))
+ error(1, 0, "packet too large\n");
+
+ /*
+ * Fill packet from inside out, to calculate correct checksums.
+ * But create ip before udp headers, as udp uses ip for pseudo-sum.
+ */
+ memset(buf + el3_len + ol3_len + ol4_len + il3_len + il4_len,
+ cfg_payload_char, cfg_payload_len);
+
+ /* add zero byte for udp csum padding */
+ buf[el3_len + ol3_len + ol4_len + il3_len + il4_len + cfg_payload_len] = 0;
+
+ switch (cfg_l3_inner) {
+ case PF_INET:
+ build_ipv4_header(buf + el3_len + ol3_len + ol4_len,
+ IPPROTO_UDP,
+ in_saddr4.sin_addr.s_addr,
+ in_daddr4.sin_addr.s_addr,
+ il4_len + cfg_payload_len,
+ cfg_dsfield_inner);
+ break;
+ case PF_INET6:
+ build_ipv6_header(buf + el3_len + ol3_len + ol4_len,
+ IPPROTO_UDP,
+ &in_saddr6, &in_daddr6,
+ il4_len + cfg_payload_len,
+ cfg_dsfield_inner);
+ break;
+ }
+
+ build_udp_header(buf + el3_len + ol3_len + ol4_len + il3_len,
+ cfg_payload_len, CFG_PORT_INNER, cfg_l3_inner);
+
+ if (!cfg_encap_proto)
+ return il3_len + il4_len + cfg_payload_len;
+
+ switch (cfg_l3_outer) {
+ case PF_INET:
+ build_ipv4_header(buf + el3_len, cfg_encap_proto,
+ out_saddr4.sin_addr.s_addr,
+ out_daddr4.sin_addr.s_addr,
+ ol4_len + il3_len + il4_len + cfg_payload_len,
+ cfg_dsfield_outer);
+ break;
+ case PF_INET6:
+ build_ipv6_header(buf + el3_len, cfg_encap_proto,
+ &out_saddr6, &out_daddr6,
+ ol4_len + il3_len + il4_len + cfg_payload_len,
+ cfg_dsfield_outer);
+ break;
+ }
+
+ switch (cfg_encap_proto) {
+ case IPPROTO_UDP:
+ build_gue_header(buf + el3_len + ol3_len + ol4_len -
+ sizeof(struct guehdr),
+ cfg_l3_inner == PF_INET ? IPPROTO_IPIP
+ : IPPROTO_IPV6);
+ build_udp_header(buf + el3_len + ol3_len,
+ sizeof(struct guehdr) + il3_len + il4_len +
+ cfg_payload_len,
+ cfg_port_gue, cfg_l3_outer);
+ break;
+ case IPPROTO_GRE:
+ build_gre_header(buf + el3_len + ol3_len,
+ cfg_l3_inner == PF_INET ? ETH_P_IP
+ : ETH_P_IPV6);
+ break;
+ }
+
+ switch (cfg_l3_extra) {
+ case PF_INET:
+ build_ipv4_header(buf,
+ cfg_l3_outer == PF_INET ? IPPROTO_IPIP
+ : IPPROTO_IPV6,
+ extra_saddr4.sin_addr.s_addr,
+ extra_daddr4.sin_addr.s_addr,
+ ol3_len + ol4_len + il3_len + il4_len +
+ cfg_payload_len, 0);
+ break;
+ case PF_INET6:
+ build_ipv6_header(buf,
+ cfg_l3_outer == PF_INET ? IPPROTO_IPIP
+ : IPPROTO_IPV6,
+ &extra_saddr6, &extra_daddr6,
+ ol3_len + ol4_len + il3_len + il4_len +
+ cfg_payload_len, 0);
+ break;
+ }
+
+ return el3_len + ol3_len + ol4_len + il3_len + il4_len +
+ cfg_payload_len;
+}
+
+/* sender transmits encapsulated over RAW or unencap'd over UDP */
+static int setup_tx(void)
+{
+ int family, fd, ret;
+
+ if (cfg_l3_extra)
+ family = cfg_l3_extra;
+ else if (cfg_l3_outer)
+ family = cfg_l3_outer;
+ else
+ family = cfg_l3_inner;
+
+ fd = socket(family, SOCK_RAW, IPPROTO_RAW);
+ if (fd == -1)
+ error(1, errno, "socket tx");
+
+ if (cfg_l3_extra) {
+ if (cfg_l3_extra == PF_INET)
+ ret = connect(fd, (void *) &extra_daddr4,
+ sizeof(extra_daddr4));
+ else
+ ret = connect(fd, (void *) &extra_daddr6,
+ sizeof(extra_daddr6));
+ if (ret)
+ error(1, errno, "connect tx");
+ } else if (cfg_l3_outer) {
+ /* connect to destination if not encapsulated */
+ if (cfg_l3_outer == PF_INET)
+ ret = connect(fd, (void *) &out_daddr4,
+ sizeof(out_daddr4));
+ else
+ ret = connect(fd, (void *) &out_daddr6,
+ sizeof(out_daddr6));
+ if (ret)
+ error(1, errno, "connect tx");
+ } else {
+ /* otherwise using loopback */
+ if (cfg_l3_inner == PF_INET)
+ ret = connect(fd, (void *) &in_daddr4,
+ sizeof(in_daddr4));
+ else
+ ret = connect(fd, (void *) &in_daddr6,
+ sizeof(in_daddr6));
+ if (ret)
+ error(1, errno, "connect tx");
+ }
+
+ return fd;
+}
+
+/* receiver reads unencapsulated UDP */
+static int setup_rx(void)
+{
+ int fd, ret;
+
+ fd = socket(cfg_l3_inner, SOCK_DGRAM, 0);
+ if (fd == -1)
+ error(1, errno, "socket rx");
+
+ if (cfg_l3_inner == PF_INET)
+ ret = bind(fd, (void *) &in_daddr4, sizeof(in_daddr4));
+ else
+ ret = bind(fd, (void *) &in_daddr6, sizeof(in_daddr6));
+ if (ret)
+ error(1, errno, "bind rx");
+
+ return fd;
+}
+
+static int do_tx(int fd, const char *pkt, int len)
+{
+ int ret;
+
+ ret = write(fd, pkt, len);
+ if (ret == -1)
+ error(1, errno, "send");
+ if (ret != len)
+ error(1, errno, "send: len (%d < %d)\n", ret, len);
+
+ return 1;
+}
+
+static int do_poll(int fd, short events, int timeout)
+{
+ struct pollfd pfd;
+ int ret;
+
+ pfd.fd = fd;
+ pfd.events = events;
+
+ ret = poll(&pfd, 1, timeout);
+ if (ret == -1)
+ error(1, errno, "poll");
+ if (ret && !(pfd.revents & POLLIN))
+ error(1, errno, "poll: unexpected event 0x%x\n", pfd.revents);
+
+ return ret;
+}
+
+static int do_rx(int fd)
+{
+ char rbuf;
+ int ret, num = 0;
+
+ while (1) {
+ ret = recv(fd, &rbuf, 1, MSG_DONTWAIT);
+ if (ret == -1 && errno == EAGAIN)
+ break;
+ if (ret == -1)
+ error(1, errno, "recv");
+ if (rbuf != cfg_payload_char)
+ error(1, 0, "recv: payload mismatch");
+ num++;
+ };
+
+ return num;
+}
+
+static int do_main(void)
+{
+ unsigned long tstop, treport, tcur;
+ int fdt = -1, fdr = -1, len, tx = 0, rx = 0;
+
+ if (!cfg_only_tx)
+ fdr = setup_rx();
+ if (!cfg_only_rx)
+ fdt = setup_tx();
+
+ len = build_packet();
+
+ tcur = util_gettime();
+ treport = tcur + 1000;
+ tstop = tcur + (cfg_num_secs * 1000);
+
+ while (1) {
+ if (!cfg_only_rx)
+ tx += do_tx(fdt, buf, len);
+
+ if (!cfg_only_tx)
+ rx += do_rx(fdr);
+
+ if (cfg_num_secs) {
+ tcur = util_gettime();
+ if (tcur >= tstop)
+ break;
+ if (tcur >= treport) {
+ fprintf(stderr, "pkts: tx=%u rx=%u\n", tx, rx);
+ tx = 0;
+ rx = 0;
+ treport = tcur + 1000;
+ }
+ } else {
+ if (tx == cfg_num_pkt)
+ break;
+ }
+ }
+
+ /* read straggler packets, if any */
+ if (rx < tx) {
+ tstop = util_gettime() + 100;
+ while (rx < tx) {
+ tcur = util_gettime();
+ if (tcur >= tstop)
+ break;
+
+ do_poll(fdr, POLLIN, tstop - tcur);
+ rx += do_rx(fdr);
+ }
+ }
+
+ fprintf(stderr, "pkts: tx=%u rx=%u\n", tx, rx);
+
+ if (fdr != -1 && close(fdr))
+ error(1, errno, "close rx");
+ if (fdt != -1 && close(fdt))
+ error(1, errno, "close tx");
+
+ /*
+ * success (== 0) only if received all packets
+ * unless failure is expected, in which case none must arrive.
+ */
+ if (cfg_expect_failure)
+ return rx != 0;
+ else
+ return rx != tx;
+}
+
+
+static void __attribute__((noreturn)) usage(const char *filepath)
+{
+ fprintf(stderr, "Usage: %s [-e gre|gue|bare|none] [-i 4|6] [-l len] "
+ "[-O 4|6] [-o 4|6] [-n num] [-t secs] [-R] [-T] "
+ "[-s <osrc> [-d <odst>] [-S <isrc>] [-D <idst>] "
+ "[-x <otos>] [-X <itos>] [-f <isport>] [-F]\n",
+ filepath);
+ exit(1);
+}
+
+static void parse_addr(int family, void *addr, const char *optarg)
+{
+ int ret;
+
+ ret = inet_pton(family, optarg, addr);
+ if (ret == -1)
+ error(1, errno, "inet_pton");
+ if (ret == 0)
+ error(1, 0, "inet_pton: bad string");
+}
+
+static void parse_addr4(struct sockaddr_in *addr, const char *optarg)
+{
+ parse_addr(AF_INET, &addr->sin_addr, optarg);
+}
+
+static void parse_addr6(struct sockaddr_in6 *addr, const char *optarg)
+{
+ parse_addr(AF_INET6, &addr->sin6_addr, optarg);
+}
+
+static int parse_protocol_family(const char *filepath, const char *optarg)
+{
+ if (!strcmp(optarg, "4"))
+ return PF_INET;
+ if (!strcmp(optarg, "6"))
+ return PF_INET6;
+
+ usage(filepath);
+}
+
+static void parse_opts(int argc, char **argv)
+{
+ int c;
+
+ while ((c = getopt(argc, argv, "d:D:e:f:Fhi:l:n:o:O:Rs:S:t:Tx:X:")) != -1) {
+ switch (c) {
+ case 'd':
+ if (cfg_l3_outer == AF_UNSPEC)
+ error(1, 0, "-d must be preceded by -o");
+ if (cfg_l3_outer == AF_INET)
+ parse_addr4(&out_daddr4, optarg);
+ else
+ parse_addr6(&out_daddr6, optarg);
+ break;
+ case 'D':
+ if (cfg_l3_inner == AF_UNSPEC)
+ error(1, 0, "-D must be preceded by -i");
+ if (cfg_l3_inner == AF_INET)
+ parse_addr4(&in_daddr4, optarg);
+ else
+ parse_addr6(&in_daddr6, optarg);
+ break;
+ case 'e':
+ if (!strcmp(optarg, "gre"))
+ cfg_encap_proto = IPPROTO_GRE;
+ else if (!strcmp(optarg, "gue"))
+ cfg_encap_proto = IPPROTO_UDP;
+ else if (!strcmp(optarg, "bare"))
+ cfg_encap_proto = IPPROTO_IPIP;
+ else if (!strcmp(optarg, "none"))
+ cfg_encap_proto = IPPROTO_IP; /* == 0 */
+ else
+ usage(argv[0]);
+ break;
+ case 'f':
+ cfg_src_port = strtol(optarg, NULL, 0);
+ break;
+ case 'F':
+ cfg_expect_failure = true;
+ break;
+ case 'h':
+ usage(argv[0]);
+ break;
+ case 'i':
+ if (!strcmp(optarg, "4"))
+ cfg_l3_inner = PF_INET;
+ else if (!strcmp(optarg, "6"))
+ cfg_l3_inner = PF_INET6;
+ else
+ usage(argv[0]);
+ break;
+ case 'l':
+ cfg_payload_len = strtol(optarg, NULL, 0);
+ break;
+ case 'n':
+ cfg_num_pkt = strtol(optarg, NULL, 0);
+ break;
+ case 'o':
+ cfg_l3_outer = parse_protocol_family(argv[0], optarg);
+ break;
+ case 'O':
+ cfg_l3_extra = parse_protocol_family(argv[0], optarg);
+ break;
+ case 'R':
+ cfg_only_rx = true;
+ break;
+ case 's':
+ if (cfg_l3_outer == AF_INET)
+ parse_addr4(&out_saddr4, optarg);
+ else
+ parse_addr6(&out_saddr6, optarg);
+ break;
+ case 'S':
+ if (cfg_l3_inner == AF_INET)
+ parse_addr4(&in_saddr4, optarg);
+ else
+ parse_addr6(&in_saddr6, optarg);
+ break;
+ case 't':
+ cfg_num_secs = strtol(optarg, NULL, 0);
+ break;
+ case 'T':
+ cfg_only_tx = true;
+ break;
+ case 'x':
+ cfg_dsfield_outer = strtol(optarg, NULL, 0);
+ break;
+ case 'X':
+ cfg_dsfield_inner = strtol(optarg, NULL, 0);
+ break;
+ }
+ }
+
+ if (cfg_only_rx && cfg_only_tx)
+ error(1, 0, "options: cannot combine rx-only and tx-only");
+
+ if (cfg_encap_proto && cfg_l3_outer == AF_UNSPEC)
+ error(1, 0, "options: must specify outer with encap");
+ else if ((!cfg_encap_proto) && cfg_l3_outer != AF_UNSPEC)
+ error(1, 0, "options: cannot combine no-encap and outer");
+ else if ((!cfg_encap_proto) && cfg_l3_extra != AF_UNSPEC)
+ error(1, 0, "options: cannot combine no-encap and extra");
+
+ if (cfg_l3_inner == AF_UNSPEC)
+ cfg_l3_inner = AF_INET6;
+ if (cfg_l3_inner == AF_INET6 && cfg_encap_proto == IPPROTO_IPIP)
+ cfg_encap_proto = IPPROTO_IPV6;
+
+ /* RFC 6040 4.2:
+ * on decap, if outer encountered congestion (CE == 0x3),
+ * but inner cannot encode ECN (NoECT == 0x0), then drop packet.
+ */
+ if (((cfg_dsfield_outer & 0x3) == 0x3) &&
+ ((cfg_dsfield_inner & 0x3) == 0x0))
+ cfg_expect_failure = true;
+}
+
+static void print_opts(void)
+{
+ if (cfg_l3_inner == PF_INET6) {
+ util_printaddr("inner.dest6", (void *) &in_daddr6);
+ util_printaddr("inner.source6", (void *) &in_saddr6);
+ } else {
+ util_printaddr("inner.dest4", (void *) &in_daddr4);
+ util_printaddr("inner.source4", (void *) &in_saddr4);
+ }
+
+ if (!cfg_l3_outer)
+ return;
+
+ fprintf(stderr, "encap proto: %u\n", cfg_encap_proto);
+
+ if (cfg_l3_outer == PF_INET6) {
+ util_printaddr("outer.dest6", (void *) &out_daddr6);
+ util_printaddr("outer.source6", (void *) &out_saddr6);
+ } else {
+ util_printaddr("outer.dest4", (void *) &out_daddr4);
+ util_printaddr("outer.source4", (void *) &out_saddr4);
+ }
+
+ if (!cfg_l3_extra)
+ return;
+
+ if (cfg_l3_outer == PF_INET6) {
+ util_printaddr("extra.dest6", (void *) &extra_daddr6);
+ util_printaddr("extra.source6", (void *) &extra_saddr6);
+ } else {
+ util_printaddr("extra.dest4", (void *) &extra_daddr4);
+ util_printaddr("extra.source4", (void *) &extra_saddr4);
+ }
+
+}
+
+int main(int argc, char **argv)
+{
+ parse_opts(argc, argv);
+ print_opts();
+ return do_main();
+}
diff --git a/tools/testing/selftests/bpf/test_flow_dissector.sh b/tools/testing/selftests/bpf/test_flow_dissector.sh
new file mode 100755
index 000000000000..c0fb073b5eab
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_flow_dissector.sh
@@ -0,0 +1,115 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# Load BPF flow dissector and verify it correctly dissects traffic
+export TESTNAME=test_flow_dissector
+unmount=0
+
+# Kselftest framework requirement - SKIP code is 4.
+ksft_skip=4
+
+msg="skip all tests:"
+if [ $UID != 0 ]; then
+ echo $msg please run this as root >&2
+ exit $ksft_skip
+fi
+
+# This test needs to be run in a network namespace with in_netns.sh. Check if
+# this is the case and run it with in_netns.sh if it is being run in the root
+# namespace.
+if [[ -z $(ip netns identify $$) ]]; then
+ ../net/in_netns.sh "$0" "$@"
+ exit $?
+fi
+
+# Determine selftest success via shell exit code
+exit_handler()
+{
+ if (( $? == 0 )); then
+ echo "selftests: $TESTNAME [PASS]";
+ else
+ echo "selftests: $TESTNAME [FAILED]";
+ fi
+
+ set +e
+
+ # Cleanup
+ tc filter del dev lo ingress pref 1337 2> /dev/null
+ tc qdisc del dev lo ingress 2> /dev/null
+ ./flow_dissector_load -d 2> /dev/null
+ if [ $unmount -ne 0 ]; then
+ umount bpffs 2> /dev/null
+ fi
+}
+
+# Exit script immediately (well catched by trap handler) if any
+# program/thing exits with a non-zero status.
+set -e
+
+# (Use 'trap -l' to list meaning of numbers)
+trap exit_handler 0 2 3 6 9
+
+# Mount BPF file system
+if /bin/mount | grep /sys/fs/bpf > /dev/null; then
+ echo "bpffs already mounted"
+else
+ echo "bpffs not mounted. Mounting..."
+ unmount=1
+ /bin/mount bpffs /sys/fs/bpf -t bpf
+fi
+
+# Attach BPF program
+./flow_dissector_load -p bpf_flow.o -s dissect
+
+# Setup
+tc qdisc add dev lo ingress
+
+echo "Testing IPv4..."
+# Drops all IP/UDP packets coming from port 9
+tc filter add dev lo parent ffff: protocol ip pref 1337 flower ip_proto \
+ udp src_port 9 action drop
+
+# Send 10 IPv4/UDP packets from port 8. Filter should not drop any.
+./test_flow_dissector -i 4 -f 8
+# Send 10 IPv4/UDP packets from port 9. Filter should drop all.
+./test_flow_dissector -i 4 -f 9 -F
+# Send 10 IPv4/UDP packets from port 10. Filter should not drop any.
+./test_flow_dissector -i 4 -f 10
+
+echo "Testing IPIP..."
+# Send 10 IPv4/IPv4/UDP packets from port 8. Filter should not drop any.
+./with_addr.sh ./with_tunnels.sh ./test_flow_dissector -o 4 -e bare -i 4 \
+ -D 192.168.0.1 -S 1.1.1.1 -f 8
+# Send 10 IPv4/IPv4/UDP packets from port 9. Filter should drop all.
+./with_addr.sh ./with_tunnels.sh ./test_flow_dissector -o 4 -e bare -i 4 \
+ -D 192.168.0.1 -S 1.1.1.1 -f 9 -F
+# Send 10 IPv4/IPv4/UDP packets from port 10. Filter should not drop any.
+./with_addr.sh ./with_tunnels.sh ./test_flow_dissector -o 4 -e bare -i 4 \
+ -D 192.168.0.1 -S 1.1.1.1 -f 10
+
+echo "Testing IPv4 + GRE..."
+# Send 10 IPv4/GRE/IPv4/UDP packets from port 8. Filter should not drop any.
+./with_addr.sh ./with_tunnels.sh ./test_flow_dissector -o 4 -e gre -i 4 \
+ -D 192.168.0.1 -S 1.1.1.1 -f 8
+# Send 10 IPv4/GRE/IPv4/UDP packets from port 9. Filter should drop all.
+./with_addr.sh ./with_tunnels.sh ./test_flow_dissector -o 4 -e gre -i 4 \
+ -D 192.168.0.1 -S 1.1.1.1 -f 9 -F
+# Send 10 IPv4/GRE/IPv4/UDP packets from port 10. Filter should not drop any.
+./with_addr.sh ./with_tunnels.sh ./test_flow_dissector -o 4 -e gre -i 4 \
+ -D 192.168.0.1 -S 1.1.1.1 -f 10
+
+tc filter del dev lo ingress pref 1337
+
+echo "Testing IPv6..."
+# Drops all IPv6/UDP packets coming from port 9
+tc filter add dev lo parent ffff: protocol ipv6 pref 1337 flower ip_proto \
+ udp src_port 9 action drop
+
+# Send 10 IPv6/UDP packets from port 8. Filter should not drop any.
+./test_flow_dissector -i 6 -f 8
+# Send 10 IPv6/UDP packets from port 9. Filter should drop all.
+./test_flow_dissector -i 6 -f 9 -F
+# Send 10 IPv6/UDP packets from port 10. Filter should not drop any.
+./test_flow_dissector -i 6 -f 10
+
+exit 0
diff --git a/tools/testing/selftests/bpf/with_addr.sh b/tools/testing/selftests/bpf/with_addr.sh
new file mode 100755
index 000000000000..ffcd3953f94c
--- /dev/null
+++ b/tools/testing/selftests/bpf/with_addr.sh
@@ -0,0 +1,54 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# add private ipv4 and ipv6 addresses to loopback
+
+readonly V6_INNER='100::a/128'
+readonly V4_INNER='192.168.0.1/32'
+
+if getopts ":s" opt; then
+ readonly SIT_DEV_NAME='sixtofourtest0'
+ readonly V6_SIT='2::/64'
+ readonly V4_SIT='172.17.0.1/32'
+ shift
+fi
+
+fail() {
+ echo "error: $*" 1>&2
+ exit 1
+}
+
+setup() {
+ ip -6 addr add "${V6_INNER}" dev lo || fail 'failed to setup v6 address'
+ ip -4 addr add "${V4_INNER}" dev lo || fail 'failed to setup v4 address'
+
+ if [[ -n "${V6_SIT}" ]]; then
+ ip link add "${SIT_DEV_NAME}" type sit remote any local any \
+ || fail 'failed to add sit'
+ ip link set dev "${SIT_DEV_NAME}" up \
+ || fail 'failed to bring sit device up'
+ ip -6 addr add "${V6_SIT}" dev "${SIT_DEV_NAME}" \
+ || fail 'failed to setup v6 SIT address'
+ ip -4 addr add "${V4_SIT}" dev "${SIT_DEV_NAME}" \
+ || fail 'failed to setup v4 SIT address'
+ fi
+
+ sleep 2 # avoid race causing bind to fail
+}
+
+cleanup() {
+ if [[ -n "${V6_SIT}" ]]; then
+ ip -4 addr del "${V4_SIT}" dev "${SIT_DEV_NAME}"
+ ip -6 addr del "${V6_SIT}" dev "${SIT_DEV_NAME}"
+ ip link del "${SIT_DEV_NAME}"
+ fi
+
+ ip -4 addr del "${V4_INNER}" dev lo
+ ip -6 addr del "${V6_INNER}" dev lo
+}
+
+trap cleanup EXIT
+
+setup
+"$@"
+exit "$?"
diff --git a/tools/testing/selftests/bpf/with_tunnels.sh b/tools/testing/selftests/bpf/with_tunnels.sh
new file mode 100755
index 000000000000..e24949ed3a20
--- /dev/null
+++ b/tools/testing/selftests/bpf/with_tunnels.sh
@@ -0,0 +1,36 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# setup tunnels for flow dissection test
+
+readonly SUFFIX="test_$(mktemp -u XXXX)"
+CONFIG="remote 127.0.0.2 local 127.0.0.1 dev lo"
+
+setup() {
+ ip link add "ipip_${SUFFIX}" type ipip ${CONFIG}
+ ip link add "gre_${SUFFIX}" type gre ${CONFIG}
+ ip link add "sit_${SUFFIX}" type sit ${CONFIG}
+
+ echo "tunnels before test:"
+ ip tunnel show
+
+ ip link set "ipip_${SUFFIX}" up
+ ip link set "gre_${SUFFIX}" up
+ ip link set "sit_${SUFFIX}" up
+}
+
+
+cleanup() {
+ ip tunnel del "ipip_${SUFFIX}"
+ ip tunnel del "gre_${SUFFIX}"
+ ip tunnel del "sit_${SUFFIX}"
+
+ echo "tunnels after test:"
+ ip tunnel show
+}
+
+trap cleanup EXIT
+
+setup
+"$@"
+exit "$?"
--
2.19.0.rc2.392.g5ba43deb5a-goog
^ permalink raw reply related
* [bpf-next, v2 2/3] flow_dissector: implements eBPF parser
From: Petar Penkov @ 2018-09-08 0:11 UTC (permalink / raw)
To: netdev
Cc: davem, ast, daniel, simon.horman, ecree, songliubraving, tom,
Petar Penkov, Willem de Bruijn
In-Reply-To: <20180908001110.200828-1-peterpenkov96@gmail.com>
From: Petar Penkov <ppenkov@google.com>
This eBPF program extracts basic/control/ip address/ports keys from
incoming packets. It supports recursive parsing for IP encapsulation,
and VLAN, along with IPv4/IPv6 and extension headers. This program is
meant to show how flow dissection and key extraction can be done in
eBPF.
Link: http://vger.kernel.org/netconf2017_files/rx_hardening_and_udp_gso.pdf
Signed-off-by: Petar Penkov <ppenkov@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
---
tools/testing/selftests/bpf/Makefile | 2 +-
tools/testing/selftests/bpf/bpf_flow.c | 386 +++++++++++++++++++++++++
2 files changed, 387 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/bpf/bpf_flow.c
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index fff7fb1285fc..e65f50f9185e 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -35,7 +35,7 @@ TEST_GEN_FILES = test_pkt_access.o test_xdp.o test_l4lb.o test_tcp_estats.o test
test_get_stack_rawtp.o test_sockmap_kern.o test_sockhash_kern.o \
test_lwt_seg6local.o sendmsg4_prog.o sendmsg6_prog.o test_lirc_mode2_kern.o \
get_cgroup_id_kern.o socket_cookie_prog.o test_select_reuseport_kern.o \
- test_skb_cgroup_id_kern.o
+ test_skb_cgroup_id_kern.o bpf_flow.o
# Order correspond to 'make run_tests' order
TEST_PROGS := test_kmod.sh \
diff --git a/tools/testing/selftests/bpf/bpf_flow.c b/tools/testing/selftests/bpf/bpf_flow.c
new file mode 100644
index 000000000000..f1e33a574841
--- /dev/null
+++ b/tools/testing/selftests/bpf/bpf_flow.c
@@ -0,0 +1,386 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <limits.h>
+#include <stddef.h>
+#include <stdbool.h>
+#include <string.h>
+#include <linux/pkt_cls.h>
+#include <linux/bpf.h>
+#include <linux/in.h>
+#include <linux/if_ether.h>
+#include <linux/icmp.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+#include <linux/tcp.h>
+#include <linux/udp.h>
+#include <linux/if_packet.h>
+#include <sys/socket.h>
+#include <linux/if_tunnel.h>
+#include <linux/mpls.h>
+#include "bpf_helpers.h"
+#include "bpf_endian.h"
+
+int _version SEC("version") = 1;
+#define PROG(F) SEC(#F) int bpf_func_##F
+
+/* These are the identifiers of the BPF programs that will be used in tail
+ * calls. Name is limited to 16 characters, with the terminating character and
+ * bpf_func_ above, we have only 6 to work with, anything after will be cropped.
+ */
+enum {
+ IP,
+ IPV6,
+ IPV6OP, /* Destination/Hop-by-Hop Options IPv6 Extension header */
+ IPV6FR, /* Fragmentation IPv6 Extension Header */
+ MPLS,
+ VLAN,
+};
+
+#define IP_MF 0x2000
+#define IP_OFFSET 0x1FFF
+#define IP6_MF 0x0001
+#define IP6_OFFSET 0xFFF8
+
+struct vlan_hdr {
+ __be16 h_vlan_TCI;
+ __be16 h_vlan_encapsulated_proto;
+};
+
+struct gre_hdr {
+ __be16 flags;
+ __be16 proto;
+};
+
+struct frag_hdr {
+ __u8 nexthdr;
+ __u8 reserved;
+ __be16 frag_off;
+ __be32 identification;
+};
+
+struct bpf_map_def SEC("maps") jmp_table = {
+ .type = BPF_MAP_TYPE_PROG_ARRAY,
+ .key_size = sizeof(__u32),
+ .value_size = sizeof(__u32),
+ .max_entries = 8
+};
+
+struct bpf_dissect_cb {
+ __u16 nhoff;
+};
+
+static __always_inline void *bpf_flow_dissect_get_header(struct __sk_buff *skb,
+ __u16 hdr_size,
+ void *buffer)
+{
+ struct bpf_dissect_cb *cb = (struct bpf_dissect_cb *)(skb->cb);
+ void *data_end = (__u8 *)(long)skb->data_end;
+ void *data = (__u8 *)(long)skb->data;
+ __u8 *hdr;
+
+ /* Verifies this variable offset does not overflow */
+ if (cb->nhoff > (USHRT_MAX - hdr_size))
+ return NULL;
+
+ hdr = data + cb->nhoff;
+ if (hdr + hdr_size <= data_end)
+ return hdr;
+
+ if (bpf_skb_load_bytes(skb, cb->nhoff, buffer, hdr_size))
+ return NULL;
+
+ return buffer;
+}
+
+/* Dispatches on ETHERTYPE */
+static __always_inline int parse_eth_proto(struct __sk_buff *skb, __be16 proto)
+{
+ struct bpf_flow_keys *keys = (struct bpf_flow_keys *)(long)(skb->flow_keys);
+
+ keys->n_proto = proto;
+ switch (proto) {
+ case bpf_htons(ETH_P_IP):
+ bpf_tail_call(skb, &jmp_table, IP);
+ break;
+ case bpf_htons(ETH_P_IPV6):
+ bpf_tail_call(skb, &jmp_table, IPV6);
+ break;
+ case bpf_htons(ETH_P_MPLS_MC):
+ case bpf_htons(ETH_P_MPLS_UC):
+ bpf_tail_call(skb, &jmp_table, MPLS);
+ break;
+ case bpf_htons(ETH_P_8021Q):
+ case bpf_htons(ETH_P_8021AD):
+ bpf_tail_call(skb, &jmp_table, VLAN);
+ break;
+ default:
+ /* Protocol not supported */
+ return BPF_DROP;
+ }
+
+ return BPF_DROP;
+}
+
+SEC("dissect")
+int dissect(struct __sk_buff *skb)
+{
+ if (!skb->vlan_present)
+ return parse_eth_proto(skb, skb->protocol);
+ else
+ return parse_eth_proto(skb, skb->vlan_proto);
+}
+
+/* Parses on IPPROTO_* */
+static __always_inline int parse_ip_proto(struct __sk_buff *skb, __u8 proto)
+{
+ struct bpf_flow_keys *keys = (struct bpf_flow_keys *)(long)(skb->flow_keys);
+ struct bpf_dissect_cb *cb = (struct bpf_dissect_cb *)(skb->cb);
+ void *data_end = (void *)(long)skb->data_end;
+ struct icmphdr *icmp, _icmp;
+ struct gre_hdr *gre, _gre;
+ struct ethhdr *eth, _eth;
+ struct tcphdr *tcp, _tcp;
+ struct udphdr *udp, _udp;
+
+ keys->ip_proto = proto;
+ switch (proto) {
+ case IPPROTO_ICMP:
+ icmp = bpf_flow_dissect_get_header(skb, sizeof(*icmp), &_icmp);
+ if (!icmp)
+ return BPF_DROP;
+ return BPF_OK;
+ case IPPROTO_IPIP:
+ keys->is_encap = true;
+ return parse_eth_proto(skb, bpf_htons(ETH_P_IP));
+ case IPPROTO_IPV6:
+ keys->is_encap = true;
+ return parse_eth_proto(skb, bpf_htons(ETH_P_IPV6));
+ case IPPROTO_GRE:
+ gre = bpf_flow_dissect_get_header(skb, sizeof(*gre), &_gre);
+ if (!gre)
+ return BPF_DROP;
+
+ if (bpf_htons(gre->flags & GRE_VERSION))
+ /* Only inspect standard GRE packets with version 0 */
+ return BPF_OK;
+
+ cb->nhoff += sizeof(*gre); /* Step over GRE Flags and Proto */
+ if (GRE_IS_CSUM(gre->flags))
+ cb->nhoff += 4; /* Step over chksum and Padding */
+ if (GRE_IS_KEY(gre->flags))
+ cb->nhoff += 4; /* Step over key */
+ if (GRE_IS_SEQ(gre->flags))
+ cb->nhoff += 4; /* Step over sequence number */
+
+ keys->is_encap = true;
+
+ if (gre->proto == bpf_htons(ETH_P_TEB)) {
+ eth = bpf_flow_dissect_get_header(skb, sizeof(*eth),
+ &_eth);
+ if (!eth)
+ return BPF_DROP;
+
+ cb->nhoff += sizeof(*eth);
+
+ return parse_eth_proto(skb, eth->h_proto);
+ } else {
+ return parse_eth_proto(skb, gre->proto);
+ }
+ case IPPROTO_TCP:
+ tcp = bpf_flow_dissect_get_header(skb, sizeof(*tcp), &_tcp);
+ if (!tcp)
+ return BPF_DROP;
+
+ if (tcp->doff < 5)
+ return BPF_DROP;
+
+ if ((__u8 *)tcp + (tcp->doff << 2) > data_end)
+ return BPF_DROP;
+
+ keys->thoff = cb->nhoff;
+ keys->sport = tcp->source;
+ keys->dport = tcp->dest;
+ return BPF_OK;
+ case IPPROTO_UDP:
+ case IPPROTO_UDPLITE:
+ udp = bpf_flow_dissect_get_header(skb, sizeof(*udp), &_udp);
+ if (!udp)
+ return BPF_DROP;
+
+ keys->thoff = cb->nhoff;
+ keys->sport = udp->source;
+ keys->dport = udp->dest;
+ return BPF_OK;
+ default:
+ return BPF_DROP;
+ }
+
+ return BPF_DROP;
+}
+
+static __always_inline int parse_ipv6_proto(struct __sk_buff *skb, __u8 nexthdr)
+{
+ struct bpf_flow_keys *keys = (struct bpf_flow_keys *)(long)(skb->flow_keys);
+ struct bpf_dissect_cb *cb = (struct bpf_dissect_cb *)(skb->cb);
+
+ keys->ip_proto = nexthdr;
+ switch (nexthdr) {
+ case IPPROTO_HOPOPTS:
+ case IPPROTO_DSTOPTS:
+ bpf_tail_call(skb, &jmp_table, IPV6OP);
+ break;
+ case IPPROTO_FRAGMENT:
+ bpf_tail_call(skb, &jmp_table, IPV6FR);
+ break;
+ default:
+ return parse_ip_proto(skb, nexthdr);
+ }
+
+ return BPF_DROP;
+}
+
+PROG(IP)(struct __sk_buff *skb)
+{
+ struct bpf_flow_keys *keys = (struct bpf_flow_keys *)(long)(skb->flow_keys);
+ struct bpf_dissect_cb *cb = (struct bpf_dissect_cb *)(skb->cb);
+ void *data_end = (__u8 *)(long)skb->data_end;
+ void *data = (__u8 *)(long)skb->data;
+ bool done = false;
+ struct iphdr *iph, _iph;
+
+ iph = bpf_flow_dissect_get_header(skb, sizeof(*iph), &_iph);
+ if (!iph)
+ return BPF_DROP;
+
+ /* IP header cannot be smaller than 20 bytes */
+ if (iph->ihl < 5)
+ return BPF_DROP;
+
+ keys->addr_proto = ETH_P_IP;
+ keys->ipv4_src = iph->saddr;
+ keys->ipv4_dst = iph->daddr;
+
+ cb->nhoff += iph->ihl << 2;
+ if (data + cb->nhoff > data_end)
+ return BPF_DROP;
+
+ if (iph->frag_off & bpf_htons(IP_MF | IP_OFFSET)) {
+ keys->is_frag = true;
+ if (iph->frag_off & bpf_htons(IP_OFFSET))
+ /* From second fragment on, packets do not have headers
+ * we can parse.
+ */
+ done = true;
+ else
+ keys->is_first_frag = true;
+ }
+
+ if (done)
+ return BPF_OK;
+
+ return parse_ip_proto(skb, iph->protocol);
+}
+
+PROG(IPV6)(struct __sk_buff *skb)
+{
+ struct bpf_flow_keys *keys = (struct bpf_flow_keys *)(long)(skb->flow_keys);
+ struct bpf_dissect_cb *cb = (struct bpf_dissect_cb *)(skb->cb);
+ struct ipv6hdr *ip6h, _ip6h;
+
+ ip6h = bpf_flow_dissect_get_header(skb, sizeof(*ip6h), &_ip6h);
+ if (!ip6h)
+ return BPF_DROP;
+
+ keys->addr_proto = ETH_P_IPV6;
+ memcpy(&keys->ipv6_src, &ip6h->saddr, 2*sizeof(ip6h->saddr));
+
+ cb->nhoff += sizeof(struct ipv6hdr);
+
+ return parse_ipv6_proto(skb, ip6h->nexthdr);
+}
+
+PROG(IPV6OP)(struct __sk_buff *skb)
+{
+ struct bpf_dissect_cb *cb = (struct bpf_dissect_cb *)(skb->cb);
+ struct ipv6_opt_hdr *ip6h, _ip6h;
+
+ ip6h = bpf_flow_dissect_get_header(skb, sizeof(*ip6h), &_ip6h);
+ if (!ip6h)
+ return BPF_DROP;
+
+ /* hlen is in 8-octects and does not include the first 8 bytes
+ * of the header
+ */
+ cb->nhoff += (1 + ip6h->hdrlen) << 3;
+
+ return parse_ipv6_proto(skb, ip6h->nexthdr);
+}
+
+PROG(IPV6FR)(struct __sk_buff *skb)
+{
+ struct bpf_flow_keys *keys = (struct bpf_flow_keys *)(long)(skb->flow_keys);
+ struct bpf_dissect_cb *cb = (struct bpf_dissect_cb *)(skb->cb);
+ struct frag_hdr *fragh, _fragh;
+
+ fragh = bpf_flow_dissect_get_header(skb, sizeof(*fragh), &_fragh);
+ if (!fragh)
+ return BPF_DROP;
+
+ cb->nhoff += sizeof(*fragh);
+ keys->is_frag = true;
+ if (!(fragh->frag_off & bpf_htons(IP6_OFFSET)))
+ keys->is_first_frag = true;
+
+ return parse_ipv6_proto(skb, fragh->nexthdr);
+}
+
+PROG(MPLS)(struct __sk_buff *skb)
+{
+ struct bpf_dissect_cb *cb = (struct bpf_dissect_cb *)(skb->cb);
+ void *data = (void *)(long)skb->data;
+ struct mpls_label *mpls, _mpls;
+ __u8 *version;
+
+ mpls = bpf_flow_dissect_get_header(skb, sizeof(*mpls), &_mpls);
+ if (!mpls)
+ return BPF_DROP;
+
+ return BPF_OK;
+}
+
+PROG(VLAN)(struct __sk_buff *skb)
+{
+ struct bpf_dissect_cb *cb = (struct bpf_dissect_cb *)(skb->cb);
+ struct vlan_hdr *vlan, _vlan;
+ __be16 proto;
+
+ /* Peek back to see if single or double-tagging */
+ if (bpf_skb_load_bytes(skb, cb->nhoff - sizeof(proto), &proto,
+ sizeof(proto)))
+ return BPF_DROP;
+
+ /* Account for double-tagging */
+ if (proto == bpf_htons(ETH_P_8021AD)) {
+ vlan = bpf_flow_dissect_get_header(skb, sizeof(*vlan), &_vlan);
+ if (!vlan)
+ return BPF_DROP;
+
+ if (vlan->h_vlan_encapsulated_proto != bpf_htons(ETH_P_8021Q))
+ return BPF_DROP;
+
+ cb->nhoff += sizeof(*vlan);
+ }
+
+ vlan = bpf_flow_dissect_get_header(skb, sizeof(*vlan), &_vlan);
+ if (!vlan)
+ return BPF_DROP;
+
+ cb->nhoff += sizeof(*vlan);
+ /* Only allow 8021AD + 8021Q double tagging and no triple tagging.*/
+ if (vlan->h_vlan_encapsulated_proto == bpf_htons(ETH_P_8021AD) ||
+ vlan->h_vlan_encapsulated_proto == bpf_htons(ETH_P_8021Q))
+ return BPF_DROP;
+
+ return parse_eth_proto(skb, vlan->h_vlan_encapsulated_proto);
+}
+
+char __license[] SEC("license") = "GPL";
--
2.19.0.rc2.392.g5ba43deb5a-goog
^ permalink raw reply related
* [bpf-next, v2 1/3] flow_dissector: implements flow dissector BPF hook
From: Petar Penkov @ 2018-09-08 0:11 UTC (permalink / raw)
To: netdev
Cc: davem, ast, daniel, simon.horman, ecree, songliubraving, tom,
Petar Penkov, Willem de Bruijn
In-Reply-To: <20180908001110.200828-1-peterpenkov96@gmail.com>
From: Petar Penkov <ppenkov@google.com>
Adds a hook for programs of type BPF_PROG_TYPE_FLOW_DISSECTOR and
attach type BPF_FLOW_DISSECTOR that is executed in the flow dissector
path. The BPF program is per-network namespace.
Signed-off-by: Petar Penkov <ppenkov@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
---
include/linux/bpf.h | 1 +
include/linux/bpf_types.h | 1 +
include/linux/skbuff.h | 7 ++
include/net/net_namespace.h | 3 +
include/net/sch_generic.h | 12 ++-
include/uapi/linux/bpf.h | 25 ++++++
kernel/bpf/syscall.c | 8 ++
kernel/bpf/verifier.c | 32 ++++++++
net/core/filter.c | 67 ++++++++++++++++
net/core/flow_dissector.c | 136 +++++++++++++++++++++++++++++++++
tools/bpf/bpftool/prog.c | 1 +
tools/include/uapi/linux/bpf.h | 25 ++++++
tools/lib/bpf/libbpf.c | 2 +
13 files changed, 317 insertions(+), 3 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 523481a3471b..988a00797bcd 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -212,6 +212,7 @@ enum bpf_reg_type {
PTR_TO_PACKET_META, /* skb->data - meta_len */
PTR_TO_PACKET, /* reg points to skb->data */
PTR_TO_PACKET_END, /* skb->data + headlen */
+ PTR_TO_FLOW_KEYS, /* reg points to bpf_flow_keys */
};
/* The information passed from prog-specific *_is_valid_access
diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
index cd26c090e7c0..22083712dd18 100644
--- a/include/linux/bpf_types.h
+++ b/include/linux/bpf_types.h
@@ -32,6 +32,7 @@ BPF_PROG_TYPE(BPF_PROG_TYPE_LIRC_MODE2, lirc_mode2)
#ifdef CONFIG_INET
BPF_PROG_TYPE(BPF_PROG_TYPE_SK_REUSEPORT, sk_reuseport)
#endif
+BPF_PROG_TYPE(BPF_PROG_TYPE_FLOW_DISSECTOR, flow_dissector)
BPF_MAP_TYPE(BPF_MAP_TYPE_ARRAY, array_map_ops)
BPF_MAP_TYPE(BPF_MAP_TYPE_PERCPU_ARRAY, percpu_array_map_ops)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 17a13e4785fc..ce0e863f02a2 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -243,6 +243,8 @@ struct scatterlist;
struct pipe_inode_info;
struct iov_iter;
struct napi_struct;
+struct bpf_prog;
+union bpf_attr;
#if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
struct nf_conntrack {
@@ -1192,6 +1194,11 @@ void skb_flow_dissector_init(struct flow_dissector *flow_dissector,
const struct flow_dissector_key *key,
unsigned int key_count);
+int skb_flow_dissector_bpf_prog_attach(const union bpf_attr *attr,
+ struct bpf_prog *prog);
+
+int skb_flow_dissector_bpf_prog_detach(const union bpf_attr *attr);
+
bool __skb_flow_dissect(const struct sk_buff *skb,
struct flow_dissector *flow_dissector,
void *target_container,
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 9b5fdc50519a..99d4148e0f90 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -43,6 +43,7 @@ struct ctl_table_header;
struct net_generic;
struct uevent_sock;
struct netns_ipvs;
+struct bpf_prog;
#define NETDEV_HASHBITS 8
@@ -145,6 +146,8 @@ struct net {
#endif
struct net_generic __rcu *gen;
+ struct bpf_prog __rcu *flow_dissector_prog;
+
/* Note : following structs are cache line aligned */
#ifdef CONFIG_XFRM
struct netns_xfrm xfrm;
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index a6d00093f35e..1b81ba85fd2d 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -19,6 +19,7 @@ struct Qdisc_ops;
struct qdisc_walker;
struct tcf_walker;
struct module;
+struct bpf_flow_keys;
typedef int tc_setup_cb_t(enum tc_setup_type type,
void *type_data, void *cb_priv);
@@ -307,9 +308,14 @@ struct tcf_proto {
};
struct qdisc_skb_cb {
- unsigned int pkt_len;
- u16 slave_dev_queue_mapping;
- u16 tc_classid;
+ union {
+ struct {
+ unsigned int pkt_len;
+ u16 slave_dev_queue_mapping;
+ u16 tc_classid;
+ };
+ struct bpf_flow_keys *flow_keys;
+ };
#define QDISC_CB_PRIV_LEN 20
unsigned char data[QDISC_CB_PRIV_LEN];
};
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 66917a4eba27..3064706fcaaa 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -152,6 +152,7 @@ enum bpf_prog_type {
BPF_PROG_TYPE_LWT_SEG6LOCAL,
BPF_PROG_TYPE_LIRC_MODE2,
BPF_PROG_TYPE_SK_REUSEPORT,
+ BPF_PROG_TYPE_FLOW_DISSECTOR,
};
enum bpf_attach_type {
@@ -172,6 +173,7 @@ enum bpf_attach_type {
BPF_CGROUP_UDP4_SENDMSG,
BPF_CGROUP_UDP6_SENDMSG,
BPF_LIRC_MODE2,
+ BPF_FLOW_DISSECTOR,
__MAX_BPF_ATTACH_TYPE
};
@@ -2333,6 +2335,7 @@ struct __sk_buff {
/* ... here. */
__u32 data_meta;
+ __u32 flow_keys;
};
struct bpf_tunnel_key {
@@ -2778,4 +2781,26 @@ enum bpf_task_fd_type {
BPF_FD_TYPE_URETPROBE, /* filename + offset */
};
+struct bpf_flow_keys {
+ __u16 thoff;
+ __u16 addr_proto; /* ETH_P_* of valid addrs */
+ __u8 is_frag;
+ __u8 is_first_frag;
+ __u8 is_encap;
+ __be16 n_proto;
+ __u8 ip_proto;
+ union {
+ struct {
+ __be32 ipv4_src;
+ __be32 ipv4_dst;
+ };
+ struct {
+ __u32 ipv6_src[4]; /* in6_addr; network order */
+ __u32 ipv6_dst[4]; /* in6_addr; network order */
+ };
+ };
+ __be16 sport;
+ __be16 dport;
+};
+
#endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 3c9636f03bb2..b3c2d09bcf7a 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1615,6 +1615,9 @@ static int bpf_prog_attach(const union bpf_attr *attr)
case BPF_LIRC_MODE2:
ptype = BPF_PROG_TYPE_LIRC_MODE2;
break;
+ case BPF_FLOW_DISSECTOR:
+ ptype = BPF_PROG_TYPE_FLOW_DISSECTOR;
+ break;
default:
return -EINVAL;
}
@@ -1636,6 +1639,9 @@ static int bpf_prog_attach(const union bpf_attr *attr)
case BPF_PROG_TYPE_LIRC_MODE2:
ret = lirc_prog_attach(attr, prog);
break;
+ case BPF_PROG_TYPE_FLOW_DISSECTOR:
+ ret = skb_flow_dissector_bpf_prog_attach(attr, prog);
+ break;
default:
ret = cgroup_bpf_prog_attach(attr, ptype, prog);
}
@@ -1688,6 +1694,8 @@ static int bpf_prog_detach(const union bpf_attr *attr)
return sockmap_get_from_fd(attr, BPF_PROG_TYPE_SK_SKB, NULL);
case BPF_LIRC_MODE2:
return lirc_prog_detach(attr);
+ case BPF_FLOW_DISSECTOR:
+ return skb_flow_dissector_bpf_prog_detach(attr);
default:
return -EINVAL;
}
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 6ff1bac1795d..8ccbff4fff93 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -261,6 +261,7 @@ static const char * const reg_type_str[] = {
[PTR_TO_PACKET] = "pkt",
[PTR_TO_PACKET_META] = "pkt_meta",
[PTR_TO_PACKET_END] = "pkt_end",
+ [PTR_TO_FLOW_KEYS] = "flow_keys",
};
static char slot_type_char[] = {
@@ -965,6 +966,7 @@ static bool is_spillable_regtype(enum bpf_reg_type type)
case PTR_TO_PACKET:
case PTR_TO_PACKET_META:
case PTR_TO_PACKET_END:
+ case PTR_TO_FLOW_KEYS:
case CONST_PTR_TO_MAP:
return true;
default:
@@ -1238,6 +1240,7 @@ static bool may_access_direct_pkt_data(struct bpf_verifier_env *env,
case BPF_PROG_TYPE_LWT_XMIT:
case BPF_PROG_TYPE_SK_SKB:
case BPF_PROG_TYPE_SK_MSG:
+ case BPF_PROG_TYPE_FLOW_DISSECTOR:
if (meta)
return meta->pkt_access;
@@ -1321,6 +1324,18 @@ static int check_ctx_access(struct bpf_verifier_env *env, int insn_idx, int off,
return -EACCES;
}
+static int check_flow_keys_access(struct bpf_verifier_env *env, int off,
+ int size)
+{
+ if (size < 0 || off < 0 ||
+ (u64)off + size > sizeof(struct bpf_flow_keys)) {
+ verbose(env, "invalid access to flow keys off=%d size=%d\n",
+ off, size);
+ return -EACCES;
+ }
+ return 0;
+}
+
static bool __is_pointer_value(bool allow_ptr_leaks,
const struct bpf_reg_state *reg)
{
@@ -1422,6 +1437,9 @@ static int check_ptr_alignment(struct bpf_verifier_env *env,
* right in front, treat it the very same way.
*/
return check_pkt_ptr_alignment(env, reg, off, size, strict);
+ case PTR_TO_FLOW_KEYS:
+ pointer_desc = "flow keys ";
+ break;
case PTR_TO_MAP_VALUE:
pointer_desc = "value ";
break;
@@ -1692,6 +1710,17 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
err = check_packet_access(env, regno, off, size, false);
if (!err && t == BPF_READ && value_regno >= 0)
mark_reg_unknown(env, regs, value_regno);
+ } else if (reg->type == PTR_TO_FLOW_KEYS) {
+ if (t == BPF_WRITE && value_regno >= 0 &&
+ is_pointer_value(env, value_regno)) {
+ verbose(env, "R%d leaks addr into flow keys\n",
+ value_regno);
+ return -EACCES;
+ }
+
+ err = check_flow_keys_access(env, off, size);
+ if (!err && t == BPF_READ && value_regno >= 0)
+ mark_reg_unknown(env, regs, value_regno);
} else {
verbose(env, "R%d invalid mem access '%s'\n", regno,
reg_type_str[reg->type]);
@@ -1839,6 +1868,8 @@ static int check_helper_mem_access(struct bpf_verifier_env *env, int regno,
case PTR_TO_PACKET_META:
return check_packet_access(env, regno, reg->off, access_size,
zero_size_allowed);
+ case PTR_TO_FLOW_KEYS:
+ return check_flow_keys_access(env, reg->off, access_size);
case PTR_TO_MAP_VALUE:
return check_map_access(env, regno, reg->off, access_size,
zero_size_allowed);
@@ -4366,6 +4397,7 @@ static bool regsafe(struct bpf_reg_state *rold, struct bpf_reg_state *rcur,
case PTR_TO_CTX:
case CONST_PTR_TO_MAP:
case PTR_TO_PACKET_END:
+ case PTR_TO_FLOW_KEYS:
/* Only valid matches are exact, which memcmp() above
* would have accepted
*/
diff --git a/net/core/filter.c b/net/core/filter.c
index 8cb242b4400f..bc3725c26794 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5122,6 +5122,17 @@ sk_skb_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
}
}
+static const struct bpf_func_proto *
+flow_dissector_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
+{
+ switch (func_id) {
+ case BPF_FUNC_skb_load_bytes:
+ return &bpf_skb_load_bytes_proto;
+ default:
+ return bpf_base_func_proto(func_id);
+ }
+}
+
static const struct bpf_func_proto *
lwt_out_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
{
@@ -5237,6 +5248,7 @@ static bool bpf_skb_is_valid_access(int off, int size, enum bpf_access_type type
case bpf_ctx_range(struct __sk_buff, data):
case bpf_ctx_range(struct __sk_buff, data_meta):
case bpf_ctx_range(struct __sk_buff, data_end):
+ case bpf_ctx_range(struct __sk_buff, flow_keys):
if (size != size_default)
return false;
break;
@@ -5265,6 +5277,7 @@ static bool sk_filter_is_valid_access(int off, int size,
case bpf_ctx_range(struct __sk_buff, data):
case bpf_ctx_range(struct __sk_buff, data_meta):
case bpf_ctx_range(struct __sk_buff, data_end):
+ case bpf_ctx_range(struct __sk_buff, flow_keys):
case bpf_ctx_range_till(struct __sk_buff, family, local_port):
return false;
}
@@ -5290,6 +5303,7 @@ static bool lwt_is_valid_access(int off, int size,
case bpf_ctx_range(struct __sk_buff, tc_classid):
case bpf_ctx_range_till(struct __sk_buff, family, local_port):
case bpf_ctx_range(struct __sk_buff, data_meta):
+ case bpf_ctx_range(struct __sk_buff, flow_keys):
return false;
}
@@ -5500,6 +5514,7 @@ static bool tc_cls_act_is_valid_access(int off, int size,
case bpf_ctx_range(struct __sk_buff, data_end):
info->reg_type = PTR_TO_PACKET_END;
break;
+ case bpf_ctx_range(struct __sk_buff, flow_keys):
case bpf_ctx_range_till(struct __sk_buff, family, local_port):
return false;
}
@@ -5701,6 +5716,7 @@ static bool sk_skb_is_valid_access(int off, int size,
switch (off) {
case bpf_ctx_range(struct __sk_buff, tc_classid):
case bpf_ctx_range(struct __sk_buff, data_meta):
+ case bpf_ctx_range(struct __sk_buff, flow_keys):
return false;
}
@@ -5760,6 +5776,39 @@ static bool sk_msg_is_valid_access(int off, int size,
return true;
}
+static bool flow_dissector_is_valid_access(int off, int size,
+ enum bpf_access_type type,
+ const struct bpf_prog *prog,
+ struct bpf_insn_access_aux *info)
+{
+ if (type == BPF_WRITE) {
+ switch (off) {
+ case bpf_ctx_range_till(struct __sk_buff, cb[0], cb[4]):
+ break;
+ default:
+ return false;
+ }
+ }
+
+ switch (off) {
+ case bpf_ctx_range(struct __sk_buff, data):
+ info->reg_type = PTR_TO_PACKET;
+ break;
+ case bpf_ctx_range(struct __sk_buff, data_end):
+ info->reg_type = PTR_TO_PACKET_END;
+ break;
+ case bpf_ctx_range(struct __sk_buff, flow_keys):
+ info->reg_type = PTR_TO_FLOW_KEYS;
+ break;
+ case bpf_ctx_range(struct __sk_buff, tc_classid):
+ case bpf_ctx_range(struct __sk_buff, data_meta):
+ case bpf_ctx_range_till(struct __sk_buff, family, local_port):
+ return false;
+ }
+
+ return bpf_skb_is_valid_access(off, size, type, prog, info);
+}
+
static u32 bpf_convert_ctx_access(enum bpf_access_type type,
const struct bpf_insn *si,
struct bpf_insn *insn_buf,
@@ -6054,6 +6103,15 @@ static u32 bpf_convert_ctx_access(enum bpf_access_type type,
bpf_target_off(struct sock_common,
skc_num, 2, target_size));
break;
+
+ case offsetof(struct __sk_buff, flow_keys):
+ off = si->off;
+ off -= offsetof(struct __sk_buff, flow_keys);
+ off += offsetof(struct sk_buff, cb);
+ off += offsetof(struct qdisc_skb_cb, flow_keys);
+ *insn++ = BPF_LDX_MEM(BPF_SIZEOF(void *), si->dst_reg,
+ si->src_reg, off);
+ break;
}
return insn - insn_buf;
@@ -7017,6 +7075,15 @@ const struct bpf_verifier_ops sk_msg_verifier_ops = {
const struct bpf_prog_ops sk_msg_prog_ops = {
};
+const struct bpf_verifier_ops flow_dissector_verifier_ops = {
+ .get_func_proto = flow_dissector_func_proto,
+ .is_valid_access = flow_dissector_is_valid_access,
+ .convert_ctx_access = bpf_convert_ctx_access,
+};
+
+const struct bpf_prog_ops flow_dissector_prog_ops = {
+};
+
int sk_detach_filter(struct sock *sk)
{
int ret = -ENOENT;
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index ce9eeeb7c024..7eed48c46a94 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -25,6 +25,9 @@
#include <net/flow_dissector.h>
#include <scsi/fc/fc_fcoe.h>
#include <uapi/linux/batadv_packet.h>
+#include <linux/bpf.h>
+
+static DEFINE_MUTEX(flow_dissector_mutex);
static void dissector_set_key(struct flow_dissector *flow_dissector,
enum flow_dissector_key_id key_id)
@@ -62,6 +65,44 @@ void skb_flow_dissector_init(struct flow_dissector *flow_dissector,
}
EXPORT_SYMBOL(skb_flow_dissector_init);
+int skb_flow_dissector_bpf_prog_attach(const union bpf_attr *attr,
+ struct bpf_prog *prog)
+{
+ struct bpf_prog *attached;
+ struct net *net;
+
+ net = current->nsproxy->net_ns;
+ mutex_lock(&flow_dissector_mutex);
+ attached = rcu_dereference_protected(net->flow_dissector_prog,
+ lockdep_is_held(&flow_dissector_mutex));
+ if (attached) {
+ /* Only one BPF program can be attached at a time */
+ mutex_unlock(&flow_dissector_mutex);
+ return -EEXIST;
+ }
+ rcu_assign_pointer(net->flow_dissector_prog, prog);
+ mutex_unlock(&flow_dissector_mutex);
+ return 0;
+}
+
+int skb_flow_dissector_bpf_prog_detach(const union bpf_attr *attr)
+{
+ struct bpf_prog *attached;
+ struct net *net;
+
+ net = current->nsproxy->net_ns;
+ mutex_lock(&flow_dissector_mutex);
+ attached = rcu_dereference_protected(net->flow_dissector_prog,
+ lockdep_is_held(&flow_dissector_mutex));
+ if (!attached) {
+ mutex_unlock(&flow_dissector_mutex);
+ return -ENOENT;
+ }
+ bpf_prog_put(attached);
+ RCU_INIT_POINTER(net->flow_dissector_prog, NULL);
+ mutex_unlock(&flow_dissector_mutex);
+ return 0;
+}
/**
* skb_flow_get_be16 - extract be16 entity
* @skb: sk_buff to extract from
@@ -588,6 +629,60 @@ static bool skb_flow_dissect_allowed(int *num_hdrs)
return (*num_hdrs <= MAX_FLOW_DISSECT_HDRS);
}
+static void __skb_flow_bpf_to_target(const struct bpf_flow_keys *flow_keys,
+ struct flow_dissector *flow_dissector,
+ void *target_container)
+{
+ struct flow_dissector_key_control *key_control;
+ struct flow_dissector_key_basic *key_basic;
+ struct flow_dissector_key_addrs *key_addrs;
+ struct flow_dissector_key_ports *key_ports;
+
+ key_control = skb_flow_dissector_target(flow_dissector,
+ FLOW_DISSECTOR_KEY_CONTROL,
+ target_container);
+ key_control->thoff = flow_keys->thoff;
+ if (flow_keys->is_frag)
+ key_control->flags |= FLOW_DIS_IS_FRAGMENT;
+ if (flow_keys->is_first_frag)
+ key_control->flags |= FLOW_DIS_FIRST_FRAG;
+ if (flow_keys->is_encap)
+ key_control->flags |= FLOW_DIS_ENCAPSULATION;
+
+ key_basic = skb_flow_dissector_target(flow_dissector,
+ FLOW_DISSECTOR_KEY_BASIC,
+ target_container);
+ key_basic->n_proto = flow_keys->n_proto;
+ key_basic->ip_proto = flow_keys->ip_proto;
+
+ if (flow_keys->addr_proto == ETH_P_IP &&
+ dissector_uses_key(flow_dissector, FLOW_DISSECTOR_KEY_IPV4_ADDRS)) {
+ key_addrs = skb_flow_dissector_target(flow_dissector,
+ FLOW_DISSECTOR_KEY_IPV4_ADDRS,
+ target_container);
+ key_addrs->v4addrs.src = flow_keys->ipv4_src;
+ key_addrs->v4addrs.dst = flow_keys->ipv4_dst;
+ key_control->addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS;
+ } else if (flow_keys->addr_proto == ETH_P_IPV6 &&
+ dissector_uses_key(flow_dissector,
+ FLOW_DISSECTOR_KEY_IPV6_ADDRS)) {
+ key_addrs = skb_flow_dissector_target(flow_dissector,
+ FLOW_DISSECTOR_KEY_IPV6_ADDRS,
+ target_container);
+ memcpy(&key_addrs->v6addrs, &flow_keys->ipv6_src,
+ sizeof(key_addrs->v6addrs));
+ key_control->addr_type = FLOW_DISSECTOR_KEY_IPV6_ADDRS;
+ }
+
+ if (dissector_uses_key(flow_dissector, FLOW_DISSECTOR_KEY_PORTS)) {
+ key_ports = skb_flow_dissector_target(flow_dissector,
+ FLOW_DISSECTOR_KEY_PORTS,
+ target_container);
+ key_ports->src = flow_keys->sport;
+ key_ports->dst = flow_keys->dport;
+ }
+}
+
/**
* __skb_flow_dissect - extract the flow_keys struct and return it
* @skb: sk_buff to extract the flow from, can be NULL if the rest are specified
@@ -619,6 +714,7 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
struct flow_dissector_key_vlan *key_vlan;
enum flow_dissect_ret fdret;
enum flow_dissector_key_id dissector_vlan = FLOW_DISSECTOR_KEY_MAX;
+ struct bpf_prog *attached;
int num_hdrs = 0;
u8 ip_proto = 0;
bool ret;
@@ -658,6 +754,46 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
FLOW_DISSECTOR_KEY_BASIC,
target_container);
+ rcu_read_lock();
+ attached = skb ? rcu_dereference(dev_net(skb->dev)->flow_dissector_prog)
+ : NULL;
+ if (attached) {
+ /* Note that even though the const qualifier is discarded
+ * throughout the execution of the BPF program, all changes(the
+ * control block) are reverted after the BPF program returns.
+ * Therefore, __skb_flow_dissect does not alter the skb.
+ */
+ struct bpf_flow_keys flow_keys = {};
+ struct qdisc_skb_cb cb_saved;
+ struct qdisc_skb_cb *cb;
+ u16 *pseudo_cb;
+ u32 result;
+
+ cb = qdisc_skb_cb(skb);
+ pseudo_cb = (u16 *)bpf_skb_cb((struct sk_buff *)skb);
+
+ /* Save Control Block */
+ memcpy(&cb_saved, cb, sizeof(cb_saved));
+ memset(cb, 0, sizeof(cb_saved));
+
+ /* Pass parameters to the BPF program */
+ cb->flow_keys = &flow_keys;
+ *pseudo_cb = nhoff;
+
+ bpf_compute_data_pointers((struct sk_buff *)skb);
+ result = BPF_PROG_RUN(attached, skb);
+
+ /* Restore state */
+ memcpy(cb, &cb_saved, sizeof(cb_saved));
+
+ __skb_flow_bpf_to_target(&flow_keys, flow_dissector,
+ target_container);
+ key_control->thoff = min_t(u16, key_control->thoff, skb->len);
+ rcu_read_unlock();
+ return result == BPF_OK;
+ }
+ rcu_read_unlock();
+
if (dissector_uses_key(flow_dissector,
FLOW_DISSECTOR_KEY_ETH_ADDRS)) {
struct ethhdr *eth = eth_hdr(skb);
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index dce960d22106..b1cd3bc8db70 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -74,6 +74,7 @@ static const char * const prog_type_name[] = {
[BPF_PROG_TYPE_RAW_TRACEPOINT] = "raw_tracepoint",
[BPF_PROG_TYPE_CGROUP_SOCK_ADDR] = "cgroup_sock_addr",
[BPF_PROG_TYPE_LIRC_MODE2] = "lirc_mode2",
+ [BPF_PROG_TYPE_FLOW_DISSECTOR] = "flow_dissector",
};
static void print_boot_time(__u64 nsecs, char *buf, unsigned int size)
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 66917a4eba27..3064706fcaaa 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -152,6 +152,7 @@ enum bpf_prog_type {
BPF_PROG_TYPE_LWT_SEG6LOCAL,
BPF_PROG_TYPE_LIRC_MODE2,
BPF_PROG_TYPE_SK_REUSEPORT,
+ BPF_PROG_TYPE_FLOW_DISSECTOR,
};
enum bpf_attach_type {
@@ -172,6 +173,7 @@ enum bpf_attach_type {
BPF_CGROUP_UDP4_SENDMSG,
BPF_CGROUP_UDP6_SENDMSG,
BPF_LIRC_MODE2,
+ BPF_FLOW_DISSECTOR,
__MAX_BPF_ATTACH_TYPE
};
@@ -2333,6 +2335,7 @@ struct __sk_buff {
/* ... here. */
__u32 data_meta;
+ __u32 flow_keys;
};
struct bpf_tunnel_key {
@@ -2778,4 +2781,26 @@ enum bpf_task_fd_type {
BPF_FD_TYPE_URETPROBE, /* filename + offset */
};
+struct bpf_flow_keys {
+ __u16 thoff;
+ __u16 addr_proto; /* ETH_P_* of valid addrs */
+ __u8 is_frag;
+ __u8 is_first_frag;
+ __u8 is_encap;
+ __be16 n_proto;
+ __u8 ip_proto;
+ union {
+ struct {
+ __be32 ipv4_src;
+ __be32 ipv4_dst;
+ };
+ struct {
+ __u32 ipv6_src[4]; /* in6_addr; network order */
+ __u32 ipv6_dst[4]; /* in6_addr; network order */
+ };
+ };
+ __be16 sport;
+ __be16 dport;
+};
+
#endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 8476da7f2720..9ca8e0e624d8 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -1502,6 +1502,7 @@ static bool bpf_prog_type__needs_kver(enum bpf_prog_type type)
case BPF_PROG_TYPE_CGROUP_SOCK_ADDR:
case BPF_PROG_TYPE_LIRC_MODE2:
case BPF_PROG_TYPE_SK_REUSEPORT:
+ case BPF_PROG_TYPE_FLOW_DISSECTOR:
return false;
case BPF_PROG_TYPE_UNSPEC:
case BPF_PROG_TYPE_KPROBE:
@@ -2121,6 +2122,7 @@ static const struct {
BPF_PROG_SEC("sk_skb", BPF_PROG_TYPE_SK_SKB),
BPF_PROG_SEC("sk_msg", BPF_PROG_TYPE_SK_MSG),
BPF_PROG_SEC("lirc_mode2", BPF_PROG_TYPE_LIRC_MODE2),
+ BPF_PROG_SEC("flow_dissector", BPF_PROG_TYPE_FLOW_DISSECTOR),
BPF_SA_PROG_SEC("cgroup/bind4", BPF_CGROUP_INET4_BIND),
BPF_SA_PROG_SEC("cgroup/bind6", BPF_CGROUP_INET6_BIND),
BPF_SA_PROG_SEC("cgroup/connect4", BPF_CGROUP_INET4_CONNECT),
--
2.19.0.rc2.392.g5ba43deb5a-goog
^ permalink raw reply related
* [bpf-next, v2 0/3] Introduce eBPF flow dissector
From: Petar Penkov @ 2018-09-08 0:11 UTC (permalink / raw)
To: netdev
Cc: davem, ast, daniel, simon.horman, ecree, songliubraving, tom,
Petar Penkov
From: Petar Penkov <ppenkov@google.com>
This patch series hardens the RX stack by allowing flow dissection in BPF,
as previously discussed [1]. Because of the rigorous checks of the BPF
verifier, this provides significant security guarantees. In particular, the
BPF flow dissector cannot get inside of an infinite loop, as with
CVE-2013-4348, because BPF programs are guaranteed to terminate. It cannot
read outside of packet bounds, because all memory accesses are checked.
Also, with BPF the administrator can decide which protocols to support,
reducing potential attack surface. Rarely encountered protocols can be
excluded from dissection and the program can be updated without kernel
recompile or reboot if a bug is discovered.
Patch 1 adds infrastructure to execute a BPF program in __skb_flow_dissect.
This includes a new BPF program and attach type.
Patch 2 adds a flow dissector program in BPF. This parses most protocols in
__skb_flow_dissect in BPF for a subset of flow keys (basic, control, ports,
and address types).
Patch 3 adds a selftest that attaches the BPF program to the flow dissector
and sends traffic with different levels of encapsulation.
Performance Evaluation:
The in-kernel implementation was compared against the demo program from
patch 2 using the test in patch 3 with IPv4/UDP traffic over 10 seconds.
$perf record -a -C 4 taskset -c 4 ./test_flow_dissector -i 4 -f 8 \
-t 10
In-kernel Dissector:
__skb_flow_dissect overhead: 2.12%
Total Packets: 3,272,597 (from output of ./test_flow_dissector)
BPF Dissector:
__skb_flow_dissect overhead: 1.63%
Total Packets: 3,232,356 (from output of ./test_flow_dissector)
No-op BPF Dissector:
__skb_flow_dissect overhead: 1.52%
Total Packets: 3,330,635 (from output of ./test_flow_dissector)
Changes since v1:
1/ (Patch 1) LD_ABS instructions now disallowed for the new BPF prog type
2/ (Patch 1) now checks if skb is NULL in __skb_flow_dissect()
3/ (Patch 1) fixed incorrect accesses in flow_dissector_is_valid_access()
- writes to the flow_keys field now disallowed
- reads/writes to tc_classid and data_meta now disallowed
4/ (Patch 2) headers pulled with bpf_skb_load_data if direct access fails
Changes since RFC:
1/ (Patch 1) Flow dissector hook changed from global to per-netns
2/ (Patch 1) Defined struct bpf_flow_keys to be used in BPF flow dissector
programs instead of exposing the internal flow keys layout. Added a
function to translate from bpf_flow_keys to the internal layout after BPF
dissection is complete. The pointer to this struct is stored in
qdisc_skb_cb rather than inside of the 20 byte control block which
simplifies verification and allows access to all 20 bytes of the cb.
3/ (Patch 2) Removed GUE parsing as it relied on a hardcoded port
4/ (Patch 2) MPLS parsing now stops at the first label which is consistent
with the in-kernel flow dissector
5/ (Patch 2) Refactored to use direct packet access and to write out to
struct bpf_flow_keys
[1] http://vger.kernel.org/netconf2017_files/rx_hardening_and_udp_gso.pdf
Petar Penkov (3):
flow_dissector: implements flow dissector BPF hook
flow_dissector: implements eBPF parser
selftests/bpf: test bpf flow dissection
include/linux/bpf.h | 1 +
include/linux/bpf_types.h | 1 +
include/linux/skbuff.h | 7 +
include/net/net_namespace.h | 3 +
include/net/sch_generic.h | 12 +-
include/uapi/linux/bpf.h | 25 +
kernel/bpf/syscall.c | 8 +
kernel/bpf/verifier.c | 32 +
net/core/filter.c | 67 ++
net/core/flow_dissector.c | 136 +++
tools/bpf/bpftool/prog.c | 1 +
tools/include/uapi/linux/bpf.h | 25 +
tools/lib/bpf/libbpf.c | 2 +
tools/testing/selftests/bpf/.gitignore | 2 +
tools/testing/selftests/bpf/Makefile | 8 +-
tools/testing/selftests/bpf/bpf_flow.c | 386 +++++++++
tools/testing/selftests/bpf/config | 1 +
.../selftests/bpf/flow_dissector_load.c | 140 ++++
.../selftests/bpf/test_flow_dissector.c | 782 ++++++++++++++++++
.../selftests/bpf/test_flow_dissector.sh | 115 +++
tools/testing/selftests/bpf/with_addr.sh | 54 ++
tools/testing/selftests/bpf/with_tunnels.sh | 36 +
22 files changed, 1838 insertions(+), 6 deletions(-)
create mode 100644 tools/testing/selftests/bpf/bpf_flow.c
create mode 100644 tools/testing/selftests/bpf/flow_dissector_load.c
create mode 100644 tools/testing/selftests/bpf/test_flow_dissector.c
create mode 100755 tools/testing/selftests/bpf/test_flow_dissector.sh
create mode 100755 tools/testing/selftests/bpf/with_addr.sh
create mode 100755 tools/testing/selftests/bpf/with_tunnels.sh
--
2.19.0.rc2.392.g5ba43deb5a-goog
^ permalink raw reply
* Re: [**EXTERNAL**] Re: VRF with enslaved L3 enabled bridge
From: D'Souza, Nelson @ 2018-09-07 23:42 UTC (permalink / raw)
To: David Ahern, netdev@vger.kernel.org; +Cc: Ido Schimmel
In-Reply-To: <a32ada04-6dd6-3d09-19bf-8db4e1479e03@cumulusnetworks.com>
Thanks David and Ido, for finding the root-cause for bridge Rx packets getting dropped, also for coming up with a patch.
Regards,
Nelson
On 9/7/18, 9:09 AM, "David Ahern" <dsa@cumulusnetworks.com> wrote:
On 9/7/18 9:56 AM, D'Souza, Nelson wrote:
> ------------------------------------------------------------------------
> *From:* David Ahern <dsa@cumulusnetworks.com>
> *Sent:* Thursday, September 6, 2018 5:27 PM
> *To:* D'Souza, Nelson; netdev@vger.kernel.org
> *Subject:* Re: [**EXTERNAL**] Re: VRF with enslaved L3 enabled bridge
>
> On 9/5/18 12:00 PM, D'Souza, Nelson wrote:
>> Just following up.... would you be able to confirm that this is a
> Linux VRF issue?
>
> I can confirm that I can reproduce the problem. Need to find time to dig
> into it.
bridge's netfilter hook is dropping the packet.
bridge's netfilter code registers hook operations that are invoked when
nh_hook is called. It then sees all subsequent calls to nf_hook.
Packet wise, the bridge netfilter hook runs first. br_nf_pre_routing
allocates nf_bridge, sets in_prerouting to 1 and calls NF_HOOK for
NF_INET_PRE_ROUTING. It's finish function, br_nf_pre_routing_finish,
then resets in_prerouting flag to 0. Any subsequent calls to nf_hook
invoke ip_sabotage_in. That function sees in_prerouting is not
set and steals (drops) the packet.
The simplest change is to have ip_sabotage_in recognize that the bridge
can be enslaved to a VRF (L3 master device) and allow the packet to
continue.
Thanks to Ido for the hint on ip_sabotage_in.
This patch works for me:
diff --git a/net/bridge/br_netfilter_hooks.c
b/net/bridge/br_netfilter_hooks.c
index 6e0dc6bcd32a..37278dc280eb 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -835,7 +835,8 @@ static unsigned int ip_sabotage_in(void *priv,
struct sk_buff *skb,
const struct nf_hook_state *state)
{
- if (skb->nf_bridge && !skb->nf_bridge->in_prerouting) {
+ if (skb->nf_bridge && !skb->nf_bridge->in_prerouting &&
+ !netif_is_l3_master(skb->dev)) {
state->okfn(state->net, state->sk, skb);
return NF_STOLEN;
}
^ permalink raw reply
* Re: WARNING in bpf_jit_free
From: syzbot @ 2018-09-07 23:23 UTC (permalink / raw)
To: ast, daniel, linux-kernel, netdev, syzkaller-bugs
In-Reply-To: <000000000000e92d1805711f5552@google.com>
syzbot has found a reproducer for the following crash on:
HEAD commit: 28619527b8a7 Merge git://git.kernel.org/pub/scm/linux/kern..
git tree: bpf
console output: https://syzkaller.appspot.com/x/log.txt?x=1339498e400000
kernel config: https://syzkaller.appspot.com/x/.config?x=62e9b447c16085cf
dashboard link: https://syzkaller.appspot.com/bug?extid=2ff1e7cb738fd3c41113
compiler: gcc (GCC) 8.0.1 20180413 (experimental)
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=163cc149400000
IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+2ff1e7cb738fd3c41113@syzkaller.appspotmail.com
IPv6: ADDRCONF(NETDEV_CHANGE): veth0: link becomes ready
8021q: adding VLAN 0 to HW filter on device team0
WARNING: CPU: 0 PID: 5391 at kernel/bpf/core.c:628 bpf_jit_free+0x2e5/0x3f0
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 5391 Comm: kworker/0:0 Not tainted 4.19.0-rc2+ #50
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: events bpf_prog_free_deferred
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1c4/0x2b4 lib/dump_stack.c:113
panic+0x238/0x4e7 kernel/panic.c:184
__warn.cold.8+0x163/0x1ba kernel/panic.c:536
report_bug+0x254/0x2d0 lib/bug.c:186
fixup_bug arch/x86/kernel/traps.c:178 [inline]
do_error_trap+0x1fc/0x4d0 arch/x86/kernel/traps.c:296
do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:316
invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:993
RIP: 0010:bpf_jit_free+0x2e5/0x3f0
Code: 07 38 c8 7f 08 84 c0 0f 85 85 00 00 00 48 b8 00 02 00 00 00 00 ad de
44 0f b6 63 02 48 39 c2 0f 84 d9 fd ff ff e8 8b 4b f3 ff <0f> 0b e9 cd fd
ff ff e8 7f 4b f3 ff 4c 89 f0 48 ba 00 00 00 00 00
RSP: 0018:ffff8801b77bf648 EFLAGS: 00010293
RAX: ffff8801d8730500 RBX: ffffc9000192c000 RCX: 0000000000000002
RDX: 0000000000000000 RSI: ffffffff818b83a5 RDI: ffff8801cdcea6e8
RBP: ffff8801b77bf6e0 R08: ffff8801d8730dc8 R09: 0000000000000006
R10: 0000000000000000 R11: ffff8801d8730500 R12: 000000000000000f
R13: 1ffff10036ef7ecb R14: ffffc9000192c002 R15: ffffc9000192c020
BUG: unable to handle kernel paging request at fffffbfff4004000
PGD 21ffec067 P4D 21ffec067 PUD 21fe60067 PMD 1c1a04067 PTE 0
Oops: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 5391 Comm: kworker/0:0 Not tainted 4.19.0-rc2+ #50
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: events bpf_prog_free_deferred
RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:384 [inline]
RIP: 0010:bpf_tree_comp kernel/bpf/core.c:435 [inline]
RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
RIP: 0010:bpf_prog_kallsyms_find+0x2fe/0x4a0 kernel/bpf/core.c:509
Code: 8e f3 ff 4c 8b ad b0 fe ff ff 4c 89 e6 4c 89 ef e8 47 8f f3 ff 4d 39
e5 0f 82 a7 00 00 00 e8 89 8e f3 ff 4c 89 e0 48 c1 e8 03 <42> 0f b6 04 30
84 c0 74 08 3c 03 0f 8e 35 01 00 00 41 8b 04 24 4c
RSP: 0018:ffff8801b77bef80 EFLAGS: 00010806
RAX: 1ffffffff4004000 RBX: ffff8801cdcea6b0 RCX: ffffffff818b4099
RDX: 0000000000000000 RSI: ffffffff818b40a7 RDI: 0000000000000006
RBP: ffff8801b77bf0f8 R08: ffff8801d8730500 R09: ffffed003b5c4732
R10: ffffed003b5c4732 R11: ffff8801dae23993 R12: ffffffffa0020000
R13: ffffffffffffffff R14: dffffc0000000000 R15: ffff8801cdcea6b0
FS: 0000000000000000(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffbfff4004000 CR3: 000000000946a000 CR4: 00000000001406f0
Call Trace:
BUG: unable to handle kernel paging request at fffffbfff4004000
PGD 21ffec067 P4D 21ffec067 PUD 21fe60067 PMD 1c1a04067 PTE 0
Oops: 0000 [#2] PREEMPT SMP KASAN
CPU: 0 PID: 5391 Comm: kworker/0:0 Not tainted 4.19.0-rc2+ #50
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: events bpf_prog_free_deferred
RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:384 [inline]
RIP: 0010:bpf_tree_comp kernel/bpf/core.c:435 [inline]
RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
RIP: 0010:bpf_prog_kallsyms_find+0x2fe/0x4a0 kernel/bpf/core.c:509
Code: 8e f3 ff 4c 8b ad b0 fe ff ff 4c 89 e6 4c 89 ef e8 47 8f f3 ff 4d 39
e5 0f 82 a7 00 00 00 e8 89 8e f3 ff 4c 89 e0 48 c1 e8 03 <42> 0f b6 04 30
84 c0 74 08 3c 03 0f 8e 35 01 00 00 41 8b 04 24 4c
RSP: 0018:ffff8801b77be828 EFLAGS: 00010806
RAX: 1ffffffff4004000 RBX: ffff8801cdcea6b0 RCX: ffffffff818b4099
RDX: 0000000000000000 RSI: ffffffff818b40a7 RDI: 0000000000000006
RBP: ffff8801b77be9a0 R08: ffff8801d8730500 R09: 0000000000000001
R10: ffffed003b5c4732 R11: 0000000000000000 R12: ffffffffa0020000
R13: ffffffffffffffff R14: dffffc0000000000 R15: ffff8801cdcea6b0
FS: 0000000000000000(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffbfff4004000 CR3: 000000000946a000 CR4: 00000000001406f0
Call Trace:
BUG: unable to handle kernel paging request at fffffbfff4004000
PGD 21ffec067 P4D 21ffec067 PUD 21fe60067 PMD 1c1a04067 PTE 0
Oops: 0000 [#3] PREEMPT SMP KASAN
CPU: 0 PID: 5391 Comm: kworker/0:0 Not tainted 4.19.0-rc2+ #50
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: events bpf_prog_free_deferred
RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:384 [inline]
RIP: 0010:bpf_tree_comp kernel/bpf/core.c:435 [inline]
RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
RIP: 0010:bpf_prog_kallsyms_find+0x2fe/0x4a0 kernel/bpf/core.c:509
Code: 8e f3 ff 4c 8b ad b0 fe ff ff 4c 89 e6 4c 89 ef e8 47 8f f3 ff 4d 39
e5 0f 82 a7 00 00 00 e8 89 8e f3 ff 4c 89 e0 48 c1 e8 03 <42> 0f b6 04 30
84 c0 74 08 3c 03 0f 8e 35 01 00 00 41 8b 04 24 4c
RSP: 0018:ffff8801b77be0c8 EFLAGS: 00010806
RAX: 1ffffffff4004000 RBX: ffff8801cdcea6b0 RCX: ffffffff818b4099
RDX: 0000000000000000 RSI: ffffffff818b40a7 RDI: 0000000000000006
RBP: ffff8801b77be240 R08: ffff8801d8730500 R09: 0000000000000001
R10: ffffed003b5c4732 R11: 0000000000000000 R12: ffffffffa0020000
R13: ffffffffffffffff R14: dffffc0000000000 R15: ffff8801cdcea6b0
FS: 0000000000000000(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffbfff4004000 CR3: 000000000946a000 CR4: 00000000001406f0
Call Trace:
BUG: unable to handle kernel paging request at fffffbfff4004000
PGD 21ffec067 P4D 21ffec067 PUD 21fe60067 PMD 1c1a04067 PTE 0
Oops: 0000 [#4] PREEMPT SMP KASAN
CPU: 0 PID: 5391 Comm: kworker/0:0 Not tainted 4.19.0-rc2+ #50
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: events bpf_prog_free_deferred
RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:384 [inline]
RIP: 0010:bpf_tree_comp kernel/bpf/core.c:435 [inline]
RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
RIP: 0010:bpf_prog_kallsyms_find+0x2fe/0x4a0 kernel/bpf/core.c:509
Code: 8e f3 ff 4c 8b ad b0 fe ff ff 4c 89 e6 4c 89 ef e8 47 8f f3 ff 4d 39
e5 0f 82 a7 00 00 00 e8 89 8e f3 ff 4c 89 e0 48 c1 e8 03 <42> 0f b6 04 30
84 c0 74 08 3c 03 0f 8e 35 01 00 00 41 8b 04 24 4c
RSP: 0018:ffff8801b77bd968 EFLAGS: 00010806
RAX: 1ffffffff4004000 RBX: ffff8801cdcea6b0 RCX: ffffffff818b4099
RDX: 0000000000000000 RSI: ffffffff818b40a7 RDI: 0000000000000006
RBP: ffff8801b77bdae0 R08: ffff8801d8730500 R09: 0000000000000001
R10: ffffed003b5c4732 R11: 0000000000000000 R12: ffffffffa0020000
R13: ffffffffffffffff R14: dffffc0000000000 R15: ffff8801cdcea6b0
FS: 0000000000000000(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffbfff4004000 CR3: 000000000946a000 CR4: 00000000001406f0
Call Trace:
BUG: unable to handle kernel paging request at fffffbfff4004000
PGD 21ffec067 P4D 21ffec067 PUD 21fe60067 PMD 1c1a04067 PTE 0
Oops: 0000 [#5] PREEMPT SMP KASAN
CPU: 0 PID: 5391 Comm: kworker/0:0 Not tainted 4.19.0-rc2+ #50
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: events bpf_prog_free_deferred
RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:384 [inline]
RIP: 0010:bpf_tree_comp kernel/bpf/core.c:435 [inline]
RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
RIP: 0010:bpf_prog_kallsyms_find+0x2fe/0x4a0 kernel/bpf/core.c:509
Code: 8e f3 ff 4c 8b ad b0 fe ff ff 4c 89 e6 4c 89 ef e8 47 8f f3 ff 4d 39
e5 0f 82 a7 00 00 00 e8 89 8e f3 ff 4c 89 e0 48 c1 e8 03 <42> 0f b6 04 30
84 c0 74 08 3c 03 0f 8e 35 01 00 00 41 8b 04 24 4c
RSP: 0018:ffff8801b77bd208 EFLAGS: 00010806
RAX: 1ffffffff4004000 RBX: ffff8801cdcea6b0 RCX: ffffffff818b4099
RDX: 0000000000000000 RSI: ffffffff818b40a7 RDI: 0000000000000006
RBP: ffff8801b77bd380 R08: ffff8801d8730500 R09: 0000000000000001
R10: ffffed003b5c4732 R11: 0000000000000000 R12: ffffffffa0020000
R13: ffffffffffffffff R14: dffffc0000000000 R15: ffff8801cdcea6b0
FS: 0000000000000000(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffbfff4004000 CR3: 000000000946a000 CR4: 00000000001406f0
Call Trace:
BUG: unable to handle kernel paging request at fffffbfff4004000
PGD 21ffec067 P4D 21ffec067 PUD 21fe60067 PMD 1c1a04067 PTE 0
Oops: 0000 [#6] PREEMPT SMP KASAN
CPU: 0 PID: 5391 Comm: kworker/0:0 Not tainted 4.19.0-rc2+ #50
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: events bpf_prog_free_deferred
RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:384 [inline]
RIP: 0010:bpf_tree_comp kernel/bpf/core.c:435 [inline]
RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
RIP: 0010:bpf_prog_kallsyms_find+0x2fe/0x4a0 kernel/bpf/core.c:509
Code: 8e f3 ff 4c 8b ad b0 fe ff ff 4c 89 e6 4c 89 ef e8 47 8f f3 ff 4d 39
e5 0f 82 a7 00 00 00 e8 89 8e f3 ff 4c 89 e0 48 c1 e8 03 <42> 0f b6 04 30
84 c0 74 08 3c 03 0f 8e 35 01 00 00 41 8b 04 24 4c
RSP: 0018:ffff8801b77bcaa8 EFLAGS: 00010806
RAX: 1ffffffff4004000 RBX: ffff8801cdcea6b0 RCX: ffffffff818b4099
RDX: 0000000000000000 RSI: ffffffff818b40a7 RDI: 0000000000000006
RBP: ffff8801b77bcc20 R08: ffff8801d8730500 R09: 0000000000000001
R10: ffffed003b5c4732 R11: 0000000000000000 R12: ffffffffa0020000
R13: ffffffffffffffff R14: dffffc0000000000 R15: ffff8801cdcea6b0
FS: 0000000000000000(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffbfff4004000 CR3: 000000000946a000 CR4: 00000000001406f0
Call Trace:
BUG: unable to handle kernel paging request at fffffbfff4004000
PGD 21ffec067 P4D 21ffec067 PUD 21fe60067 PMD 1c1a04067 PTE 0
Oops: 0000 [#7] PREEMPT SMP KASAN
CPU: 0 PID: 5391 Comm: kworker/0:0 Not tainted 4.19.0-rc2+ #50
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: events bpf_prog_free_deferred
RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:384 [inline]
RIP: 0010:bpf_tree_comp kernel/bpf/core.c:435 [inline]
RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
RIP: 0010:bpf_prog_kallsyms_find+0x2fe/0x4a0 kernel/bpf/core.c:509
Code: 8e f3 ff 4c 8b ad b0 fe ff ff 4c 89 e6 4c 89 ef e8 47 8f f3 ff 4d 39
e5 0f 82 a7 00 00 00 e8 89 8e f3 ff 4c 89 e0 48 c1 e8 03 <42> 0f b6 04 30
84 c0 74 08 3c 03 0f 8e 35 01 00 00 41 8b 04 24 4c
RSP: 0018:ffff8801b77bc348 EFLAGS: 00010806
RAX: 1ffffffff4004000 RBX: ffff8801cdcea6b0 RCX: ffffffff818b4099
RDX: 0000000000000000 RSI: ffffffff818b40a7 RDI: 0000000000000006
RBP: ffff8801b77bc4c0 R08: ffff8801d8730500 R09: 0000000000000001
R10: ffffed003b5c4732 R11: 0000000000000000 R12: ffffffffa0020000
R13: ffffffffffffffff R14: dffffc0000000000 R15: ffff8801cdcea6b0
FS: 0000000000000000(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffbfff4004000 CR3: 000000000946a000 CR4: 00000000001406f0
Call Trace:
BUG: unable to handle kernel paging request at fffffbfff4004000
PGD 21ffec067 P4D 21ffec067 PUD 21fe60067 PMD 1c1a04067 PTE 0
Oops: 0000 [#8] PREEMPT SMP KASAN
CPU: 0 PID: 5391 Comm: kworker/0:0 Not tainted 4.19.0-rc2+ #50
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: events bpf_prog_free_deferred
RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:384 [inline]
RIP: 0010:bpf_tree_comp kernel/bpf/core.c:435 [inline]
RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
RIP: 0010:bpf_prog_kallsyms_find+0x2fe/0x4a0 kernel/bpf/core.c:509
Code: 8e f3 ff 4c 8b ad b0 fe ff ff 4c 89 e6 4c 89 ef e8 47 8f f3 ff 4d 39
e5 0f 82 a7 00 00 00 e8 89 8e f3 ff 4c 89 e0 48 c1 e8 03 <42> 0f b6 04 30
84 c0 74 08 3c 03 0f 8e 35 01 00 00 41 8b 04 24 4c
RSP: 0018:ffff8801b77bbbe8 EFLAGS: 00010806
RAX: 1ffffffff4004000 RBX: ffff8801cdcea6b0 RCX: ffffffff818b4099
RDX: 0000000000000000 RSI: ffffffff818b40a7 RDI: 0000000000000006
RBP: ffff8801b77bbd60 R08: ffff8801d8730500 R09: 0000000000000001
R10: ffffed003b5c4732 R11: 0000000000000000 R12: ffffffffa0020000
R13: ffffffffffffffff R14: dffffc0000000000 R15: ffff8801cdcea6b0
FS: 0000000000000000(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffbfff4004000 CR3: 000000000946a000 CR4: 00000000001406f0
Call Trace:
BUG: unable to handle kernel paging request at fffffbfff4004000
PGD 21ffec067 P4D 21ffec067 PUD 21fe60067 PMD 1c1a04067 PTE 0
Oops: 0000 [#9] PREEMPT SMP KASAN
CPU: 0 PID: 5391 Comm: kworker/0:0 Not tainted 4.19.0-rc2+ #50
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: events bpf_prog_free_deferred
RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:384 [inline]
RIP: 0010:bpf_tree_comp kernel/bpf/core.c:435 [inline]
RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
RIP: 0010:bpf_prog_kallsyms_find+0x2fe/0x4a0 kernel/bpf/core.c:509
Code: 8e f3 ff 4c 8b ad b0 fe ff ff 4c 89 e6 4c 89 ef e8 47 8f f3 ff 4d 39
e5 0f 82 a7 00 00 00 e8 89 8e f3 ff 4c 89 e0 48 c1 e8 03 <42> 0f b6 04 30
84 c0 74 08 3c 03 0f 8e 35 01 00 00 41 8b 04 24 4c
RSP: 0018:ffff8801b77bb488 EFLAGS: 00010806
RAX: 1ffffffff4004000 RBX: ffff8801cdcea6b0 RCX: ffffffff818b4099
RDX: 0000000000000000 RSI: ffffffff818b40a7 RDI: 0000000000000006
RBP: ffff8801b77bb600 R08: ffff8801d8730500 R09: 0000000000000001
R10: ffffed003b5c4732 R11: 0000000000000000 R12: ffffffffa0020000
R13: ffffffffffffffff R14: dffffc0000000000 R15: ffff8801cdcea6b0
FS: 0000000000000000(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffbfff4004000 CR3: 000000000946a000 CR4: 00000000001406f0
Call Trace:
BUG: unable to handle kernel paging request at fffffbfff4004000
PGD 21ffec067 P4D 21ffec067 PUD 21fe60067 PMD 1c1a04067 PTE 0
Oops: 0000 [#10] PREEMPT SMP KASAN
CPU: 0 PID: 5391 Comm: kworker/0:0 Not tainted 4.19.0-rc2+ #50
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: events bpf_prog_free_deferred
RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:384 [inline]
RIP: 0010:bpf_tree_comp kernel/bpf/core.c:435 [inline]
RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
RIP: 0010:bpf_prog_kallsyms_find+0x2fe/0x4a0 kernel/bpf/core.c:509
Code: 8e f3 ff 4c 8b ad b0 fe ff ff 4c 89 e6 4c 89 ef e8 47 8f f3 ff 4d 39
e5 0f 82 a7 00 00 00 e8 89 8e f3 ff 4c 89 e0 48 c1 e8 03 <42> 0f b6 04 30
84 c0 74 08 3c 03 0f 8e 35 01 00 00 41 8b 04 24 4c
RSP: 0018:ffff8801b77bad28 EFLAGS: 00010806
RAX: 1ffffffff4004000 RBX: ffff8801cdcea6b0 RCX: ffffffff818b4099
RDX: 0000000000000000 RSI: ffffffff818b40a7 RDI: 0000000000000006
RBP: ffff8801b77baea0 R08: ffff8801d8730500 R09: 0000000000000001
R10: ffffed003b5c4732 R11: 0000000000000000 R12: ffffffffa0020000
R13: ffffffffffffffff R14: dffffc0000000000 R15: ffff8801cdcea6b0
FS: 0000000000000000(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffbfff4004000 CR3: 000000000946a000 CR4: 00000000001406f0
Call Trace:
BUG: unable to handle kernel paging request at fffffbfff4004000
PGD 21ffec067 P4D 21ffec067 PUD 21fe60067 PMD 1c1a04067 PTE 0
Oops: 0000 [#11] PREEMPT SMP KASAN
CPU: 0 PID: 5391 Comm: kworker/0:0 Not tainted 4.19.0-rc2+ #50
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: events bpf_prog_free_deferred
RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:384 [inline]
RIP: 0010:bpf_tree_comp kernel/bpf/core.c:435 [inline]
RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
RIP: 0010:bpf_prog_kallsyms_find+0x2fe/0x4a0 kernel/bpf/core.c:509
Code: 8e f3 ff 4c 8b ad b0 fe ff ff 4c 89 e6 4c 89 ef e8 47 8f f3 ff 4d 39
e5 0f 82 a7 00 00 00 e8 89 8e f3 ff 4c 89 e0 48 c1 e8 03 <42> 0f b6 04 30
84 c0 74 08 3c 03 0f 8e 35 01 00 00 41 8b 04 24 4c
RSP: 0018:ffff8801b77ba5c8 EFLAGS: 00010806
RAX: 1ffffffff4004000 RBX: ffff8801cdcea6b0 RCX: ffffffff818b4099
RDX: 0000000000000000 RSI: ffffffff818b40a7 RDI: 0000000000000006
RBP: ffff8801b77ba740 R08: ffff8801d8730500 R09: 0000000000000001
R10: ffffed003b5c4732 R11: 0000000000000000 R12: ffffffffa0020000
R13: ffffffffffffffff R14: dffffc0000000000 R15: ffff8801cdcea6b0
FS: 0000000000000000(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffbfff4004000 CR3: 000000000946a000 CR4: 00000000001406f0
Call Trace:
BUG: unable to handle kernel paging request at fffffbfff4004000
PGD 21ffec067 P4D 21ffec067 PUD 21fe60067 PMD 1c1a04067 PTE 0
Oops: 0000 [#12] PREEMPT SMP KASAN
CPU: 0 PID: 5391 Comm: kworker/0:0 Not tainted 4.19.0-rc2+ #50
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: events bpf_prog_free_deferred
RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:384 [inline]
RIP: 0010:bpf_tree_comp kernel/bpf/core.c:435 [inline]
RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
RIP: 0010:bpf_prog_kallsyms_find+0x2fe/0x4a0 kernel/bpf/core.c:509
Code: 8e f3 ff 4c 8b ad b0 fe ff ff 4c 89 e6 4c 89 ef e8 47 8f f3 ff 4d 39
e5 0f 82 a7 00 00 00 e8 89 8e f3 ff 4c 89 e0 48 c1 e8 03 <42> 0f b6 04 30
84 c0 74 08 3c 03 0f 8e 35 01 00 00 41 8b 04 24 4c
RSP: 0018:ffff8801b77b9e68 EFLAGS: 00010806
RAX: 1ffffffff4004000 RBX: ffff8801cdcea6b0 RCX: ffffffff818b4099
RDX: 0000000000000000 RSI: ffffffff818b40a7 RDI: 0000000000000006
RBP: ffff8801b77b9fe0 R08: ffff8801d8730500 R09: 0000000000000001
R10: ffffed003b5c4732 R11: 0000000000000000 R12: ffffffffa0020000
R13: ffffffffffffffff R14: dffffc0000000000 R15: ffff8801cdcea6b0
FS: 0000000000000000(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffbfff4004000 CR3: 000000000946a000 CR4: 00000000001406f0
Call Trace:
BUG: unable to handle kernel paging request at fffffbfff4004000
PGD 21ffec067 P4D 21ffec067 PUD 21fe60067 PMD 1c1a04067 PTE 0
Oops: 0000 [#13] PREEMPT SMP KASAN
CPU: 0 PID: 5391 Comm: kworker/0:0 Not tainted 4.19.0-rc2+ #50
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: events bpf_prog_free_deferred
RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:384 [inline]
RIP: 0010:bpf_tree_comp kernel/bpf/core.c:435 [inline]
RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
RIP: 0010:bpf_prog_kallsyms_find+0x2fe/0x4a0 kernel/bpf/core.c:509
Code: 8e f3 ff 4c 8b ad b0 fe ff ff 4c 89 e6 4c 89 ef e8 47 8f f3 ff 4d 39
e5 0f 82 a7 00 00 00 e8 89 8e f3 ff 4c 89 e0 48 c1 e8 03 <42> 0f b6 04 30
84 c0 74 08 3c 03 0f 8e 35 01 00 00 41 8b 04 24 4c
RSP: 0018:ffff8801b77b9708 EFLAGS: 00010806
RAX: 1ffffffff4004000 RBX: ffff8801cdcea6b0 RCX: ffffffff818b4099
RDX: 0000000000000000 RSI: ffffffff818b40a7 RDI: 0000000000000006
RBP: ffff8801b77b9880 R08: ffff8801d8730500 R09: 0000000000000001
R10: ffffed003b5c4732 R11: 0000000000000000 R12: ffffffffa0020000
R13: ffffffffffffffff R14: dffffc0000000000 R15: ffff8801cdcea6b0
FS: 0000000000000000(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffbfff4004000 CR3: 000000000946a000 CR4: 00000000001406f0
Call Trace:
BUG: unable to handle kernel paging request at fffffbfff4004000
PGD 21ffec067 P4D 21ffec067 PUD 21fe60067 PMD 1c1a04067 PTE 0
Thread overran stack, or stack corrupted
Oops: 0000 [#14] PREEMPT SMP KASAN
CPU: 0 PID: 5391 Comm: kworker/0:0 Not tainted 4.19.0-rc2+ #50
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: events bpf_prog_free_deferred
RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:384 [inline]
RIP: 0010:bpf_tree_comp kernel/bpf/core.c:435 [inline]
RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
RIP: 0010:bpf_prog_kallsyms_find+0x2fe/0x4a0 kernel/bpf/core.c:509
Code: 8e f3 ff 4c 8b ad b0 fe ff ff 4c 89 e6 4c 89 ef e8 47 8f f3 ff 4d 39
e5 0f 82 a7 00 00 00 e8 89 8e f3 ff 4c 89 e0 48 c1 e8 03 <42> 0f b6 04 30
84 c0 74 08 3c 03 0f 8e 35 01 00 00 41 8b 04 24 4c
RSP: 0018:ffff8801b77b8fa8 EFLAGS: 00010806
RAX: 1ffffffff4004000 RBX: ffff8801cdcea6b0 RCX: ffffffff818b4099
RDX: 0000000000000000 RSI: ffffffff818b40a7 RDI: 0000000000000006
RBP: ffff8801b77b9120 R08: ffff8801d8730500 R09: 0000000000000001
R10: ffffed003b5c4732 R11: 0000000000000000 R12: ffffffffa0020000
R13: ffffffffffffffff R14: dffffc0000000000 R15: ffff8801cdcea6b0
FS: 0000000000000000(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffbfff4004000 CR3: 000000000946a000 CR4: 00000000001406f0
Call Trace:
BUG: unable to handle kernel paging request at fffffbfff4004000
PGD 21ffec067 P4D 21ffec067 PUD 21fe60067 PMD 1c1a04067 PTE 0
Thread overran stack, or stack corrupted
Oops: 0000 [#15] PREEMPT SMP KASAN
CPU: 0 PID: 5391 Comm: kworker/0:0 Not tainted 4.19.0-rc2+ #50
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: events bpf_prog_free_deferred
RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:384 [inline]
RIP: 0010:bpf_tree_comp kernel/bpf/core.c:435 [inline]
RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
RIP: 0010:bpf_prog_kallsyms_find+0x2fe/0x4a0 kernel/bpf/core.c:509
Code: 8e f3 ff 4c 8b ad b0 fe ff ff 4c 89 e6 4c 89 ef e8 47 8f f3 ff 4d 39
e5 0f 82 a7 00 00 00 e8 89 8e f3 ff 4c 89 e0 48 c1 e8 03 <42> 0f b6 04 30
84 c0 74 08 3c 03 0f 8e 35 01 00 00 41 8b 04 24 4c
RSP: 0018:ffff8801b77b8848 EFLAGS: 00010806
RAX: 1ffffffff4004000 RBX: ffff8801cdcea6b0 RCX: ffffffff818b4099
RDX: 0000000000000000 RSI: ffffffff818b40a7 RDI: 0000000000000006
RBP: ffff8801b77b89c0 R08: ffff8801d8730500 R09: 0000000000000001
R10: ffffed003b5c4732 R11: 0000000000000000 R12: ffffffffa0020000
R13: ffffffffffffffff R14: dffffc0000000000 R15: ffff8801cdcea6b0
FS: 0000000000000000(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffbfff4004000 CR3: 000000000946a000 CR4: 00000000001406f0
Call Trace:
==================================================================
BUG: KASAN: slab-out-of-bounds in do_error_trap+0x3b6/0x4d0
arch/x86/kernel/traps.c:296
Read of size 8 at addr ffff8801b77b7430 by task kworker/0:0/5391
CPU: 0 PID: 5391 Comm: kworker/0:0 Not tainted 4.19.0-rc2+ #50
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: events bpf_prog_free_deferred
Call Trace:
BUG: unable to handle kernel paging request at fffffbfff4004000
PGD 21ffec067 P4D 21ffec067 PUD 21fe60067 PMD 1c1a04067 PTE 0
Thread overran stack, or stack corrupted
Oops: 0000 [#16] PREEMPT SMP KASAN
CPU: 0 PID: 5391 Comm: kworker/0:0 Not tainted 4.19.0-rc2+ #50
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: events bpf_prog_free_deferred
RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:384 [inline]
RIP: 0010:bpf_tree_comp kernel/bpf/core.c:435 [inline]
RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
RIP: 0010:bpf_prog_kallsyms_find+0x2fe/0x4a0 kernel/bpf/core.c:509
Code: 8e f3 ff 4c 8b ad b0 fe ff ff 4c 89 e6 4c 89 ef e8 47 8f f3 ff 4d 39
e5 0f 82 a7 00 00 00 e8 89 8e f3 ff 4c 89 e0 48 c1 e8 03 <42> 0f b6 04 30
84 c0 74 08 3c 03 0f 8e 35 01 00 00 41 8b 04 24 4c
RSP: 0018:ffff8801b77b6e58 EFLAGS: 00010806
RAX: 1ffffffff4004000 RBX: ffff8801cdcea6b0 RCX: ffffffff818b4099
RDX: 0000000000000000 RSI: ffffffff818b40a7 RDI: 0000000000000006
RBP: ffff8801b77b6fd0 R08: ffff8801d8730500 R09: 0000000000000001
R10: ffffed003b5c4732 R11: 0000000000000000 R12: ffffffffa0020000
R13: ffffffffffffffff R14: dffffc0000000000 R15: ffff8801cdcea6b0
FS: 0000000000000000(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffbfff4004000 CR3: 000000000946a000 CR4: 00000000001406f0
Call Trace:
Modules linked in:
Dumping ftrace buffer:
(ftrace buffer empty)
CR2: fffffbfff4004000
---[ end trace 89eec6ca57f730dc ]---
RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:384 [inline]
RIP: 0010:bpf_tree_comp kernel/bpf/core.c:435 [inline]
RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
RIP: 0010:bpf_prog_kallsyms_find+0x2fe/0x4a0 kernel/bpf/core.c:509
Code: 8e f3 ff 4c 8b ad b0 fe ff ff 4c 89 e6 4c 89 ef e8 47 8f f3 ff 4d 39
e5 0f 82 a7 00 00 00 e8 89 8e f3 ff 4c 89 e0 48 c1 e8 03 <42> 0f b6 04 30
84 c0 74 08 3c 03 0f 8e 35 01 00 00 41 8b 04 24 4c
RSP: 0018:ffff8801b77bef80 EFLAGS: 00010806
RAX: 1ffffffff4004000 RBX: ffff8801cdcea6b0 RCX: ffffffff818b4099
RDX: 0000000000000000 RSI: ffffffff818b40a7 RDI: 0000000000000006
RBP: ffff8801b77bf0f8 R08: ffff8801d8730500 R09: ffffed003b5c4732
R10: ffffed003b5c4732 R11: ffff8801dae23993 R12: ffffffffa0020000
R13: ffffffffffffffff R14: dffffc0000000000 R15: ffff8801cdcea6b0
FS: 0000000000000000(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffbfff4004000 CR3: 000000000946a000 CR4: 00000000001406f0
^ permalink raw reply
* Re: GPL compliance issue with liquidio/lio_23xx_vsw.bin firmware
From: Felix Manlunas @ 2018-09-07 22:58 UTC (permalink / raw)
To: Ben Hutchings
Cc: Florian Weimer, linux-firmware, netdev@vger.kernel.org,
Derek Chickles, Satanand Burla, Felix Manlunas, Raghu Vatsavayi,
Manish Awasthi, Manojkumar.Panicker
In-Reply-To: <8dfbae2a66cfa1317e67e4ba21e017a1fa8054db.camel@decadent.org.uk>
On Wed, Aug 29, 2018 at 09:26:35PM +0100, Ben Hutchings wrote:
> Date: Wed, 29 Aug 2018 21:26:35 +0100
> From: Ben Hutchings <ben@decadent.org.uk>
> To: Felix Manlunas <felix.manlunas@cavium.com>, Florian Weimer
> <fweimer@redhat.com>
> CC: linux-firmware@kernel.org, "netdev@vger.kernel.org"
> <netdev@vger.kernel.org>, Derek Chickles
> <derek.chickles@caviumnetworks.com>, Satanand Burla
> <satananda.burla@caviumnetworks.com>, Felix Manlunas
> <felix.manlunas@caviumnetworks.com>, Raghu Vatsavayi
> <raghu.vatsavayi@caviumnetworks.com>, Manish Awasthi
> <manish.awasthi@cavium.com>, Manojkumar.Panicker@cavium.com
> Subject: Re: GPL compliance issue with liquidio/lio_23xx_vsw.bin firmware
> User-Agent: Evolution 3.29.92-1
>
> On Mon, 2018-08-27 at 17:04 -0700, Felix Manlunas wrote:
> > On Mon, Aug 27, 2018 at 05:01:10PM +0200, Florian Weimer wrote:
> > > liquidio/lio_23xx_vsw.bin contains a compiled MIPS Linux kernel:
> > >
> > > $ tail --bytes=+1313 liquidio/lio_23xx_vsw.bin > elf
> > > $ readelf -aW elf
> > > […]
> > > [ 6] __ksymtab PROGBITS ffffffff80e495f8 64a5f8 00d130
> > > 00 A 0 0 8
> > > [ 7] __ksymtab_gpl PROGBITS ffffffff80e56728 657728 008400
> > > 00 A 0 0 8
> > > [ 8] __ksymtab_strings PROGBITS ffffffff80e5eb28 65fb28 018868
> > > 00 A 0 0 1
> > > […]
> > > Symbol table '.symtab' contains 1349 entries:
> > > Num: Value Size Type Bind Vis Ndx Name
> > > 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
> > > 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS
> > > arch/mips/kernel/head.o
> > > 2: 0000000000000000 0 FILE LOCAL DEFAULT ABS init/main.c
> > > 3: 0000000000000000 0 FILE LOCAL DEFAULT ABS
> > > include/linux/types.h
> > > […]
> > >
> > > Yet there is no corresponding source provided, and LICENCE.cavium lacks
> > > the required notices.
> > >
> > > Thanks,
> > > Florian
> >
> > Cavium apologizes for the oversight. Cavium has been advertising the
> > appropriate license terms including the existence of GPL in the firmware
> > in our outbox releases. We will update the license terms in LICENCE.cavium
> > in our upstream contribution in collaboration with our legal team.
>
> Everything added to linux-firmware.git needs to be safe for Linux
> distributions to redistribute. (There are some ancient firmware images
> with unclear licensing, but we don't want to add any more.)
>
> My understanding is that GPL v2 requires that commercial distributors
> either provide the complete and corresponding source alongside the
> binaries, or include a written offer to provide it later. They
> *cannot* simply point to your offer to do this (only non-commercial
> distributors can do that). So the source needs to be published too.
>
> Adding the complete Linux kernel source code to linux-firmware.git
> doesn't seem like a sensible step, so maybe this particular firmware
> needs to live elsewhere.
>
> Ben.
Apologies for the delay. We're committed to provide sources for the GPL
components of LiquidIO firmware.
After Cavium's recent merger with Marvell, we are revisiting our Licensing
and expect to share the updated license soon. Since keeping Linux kernel
source code in linux-firmware.git doesn't make sense to us too, we're also
working internally on identifying the appropriate mechanism to share those
sources when requested. That said, it is our opinion that for all
practical purposes (for example, distribution of driver + firmware) the
firmware binary for LiquidIO adapters should continue to reside in the
linux-firmware.git as long as all the prerequisites for sharing the
firmware are met. In the case of liquidio_23xx_vsw.bin, this would mean an
additional license type in the LICENCE.cavium file for this firmware
including a reference to the accompanying sources for the Linux kernel
bundled in the firmware.
As we understand, there is no precedent in the linux-firmware.git for
device firmware that include a Linux kernel so we also request comments
from community on the best ways to support the LiquidIO firmware and
similar firmwares that may appear in future.
Cavium LiquidIO Team
^ permalink raw reply
* [PATCH net] netfilter: bridge: Don't sabotage nf_hook calls from an l3mdev
From: dsahern @ 2018-09-07 22:08 UTC (permalink / raw)
To: netdev; +Cc: ndsouza, idosch, pablo, fw, David Ahern
From: David Ahern <dsahern@gmail.com>
For starters, the bridge netfilter code registers operations that
are invoked any time nh_hook is called. Specifically, ip_sabotage_in
watches for nested calls for NF_INET_PRE_ROUTING when a bridge is in
the stack.
Packet wise, the bridge netfilter hook runs first. br_nf_pre_routing
allocates nf_bridge, sets in_prerouting to 1 and calls NF_HOOK for
NF_INET_PRE_ROUTING. It's finish function, br_nf_pre_routing_finish,
then resets in_prerouting flag to 0 and the packet continues up the
stack. The packet eventually makes it to the VRF driver and it invokes
nf_hook for NF_INET_PRE_ROUTING in case any rules have been added against
the vrf device.
Because of the registered operations the call to nf_hook causes
ip_sabotage_in to be invoked. That function sees the nf_bridge on the
skb and that in_prerouting is not set. Thinking it is an invalid nested
call it steals (drops) the packet.
Update ip_sabotage_in to recognize that the bridge or one of its upper
devices (e.g., vlan) can be enslaved to a VRF (L3 master device) and
allow the packet to go through the nf_hook a second time.
Fixes: 73e20b761acf ("net: vrf: Add support for PREROUTING rules on vrf device")
Reported-by: D'Souza, Nelson <ndsouza@ciena.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
---
net/bridge/br_netfilter_hooks.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index 6e0dc6bcd32a..37278dc280eb 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -835,7 +835,8 @@ static unsigned int ip_sabotage_in(void *priv,
struct sk_buff *skb,
const struct nf_hook_state *state)
{
- if (skb->nf_bridge && !skb->nf_bridge->in_prerouting) {
+ if (skb->nf_bridge && !skb->nf_bridge->in_prerouting &&
+ !netif_is_l3_master(skb->dev)) {
state->okfn(state->net, state->sk, skb);
return NF_STOLEN;
}
--
2.11.0
^ permalink raw reply related
* Re: [net-next] i40e(vf): remove i40e_ethtool_stats.h header file
From: David Miller @ 2018-09-07 22:02 UTC (permalink / raw)
To: jacob.e.keller
Cc: jeffrey.t.kirsher, netdev, dongsheng.wang, intel-wired-lan,
aaron.f.brown
In-Reply-To: <20180907215544.5957-1-jacob.e.keller@intel.com>
From: Jacob Keller <jacob.e.keller@intel.com>
Date: Fri, 7 Sep 2018 14:55:44 -0700
> Essentially reverts commit 8fd75c58a09a ("i40e: move ethtool
> stats boiler plate code to i40e_ethtool_stats.h", 2018-08-30), and
> additionally moves the similar code in i40evf into i40evf_ethtool.c.
>
> The code was intially moved from i40e_ethtool.c into i40e_ethtool_stats.h
> as a way of better logically organizing the code. This has two problems.
> First, we can't have an inline function with variadic arguments on all
> platforms. Second, it gave the appearance that we had plans to share
> code between the i40e and i40evf drivers, due to having a near copy of
> the contents in the i40evf/i40e_ethtool_stats.h file.
>
> Patches which actually attempt to combine or share code between the i40e
> and i40evf drivers have not materialized, and are likely a ways off.
>
> Rather than fixing the one function which causes build issues, just move
> this code back into the i40e_ethtool.c and i40evf_ethtool.c files. Note
> that we also change these functions back from static inlines to just
> statics, since they're no longer in a header file.
>
> We can revisit this if/when work is done to actually attempt to share
> code between drivers. Alternatively, this stats code could be made more
> generic so that it can be shared across drivers as part of ethtool
> kernel work.
>
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Applied and after the build checks out will be pushed to net-next.
Thanks.
^ permalink raw reply
* [net-next] i40e(vf): remove i40e_ethtool_stats.h header file
From: Jacob Keller @ 2018-09-07 21:55 UTC (permalink / raw)
To: David Miller, Jeffrey Kirsher, netdev
Cc: Wang Dongsheng, Intel Wired LAN, aaron.f.brown, Jacob Keller
Essentially reverts commit 8fd75c58a09a ("i40e: move ethtool
stats boiler plate code to i40e_ethtool_stats.h", 2018-08-30), and
additionally moves the similar code in i40evf into i40evf_ethtool.c.
The code was intially moved from i40e_ethtool.c into i40e_ethtool_stats.h
as a way of better logically organizing the code. This has two problems.
First, we can't have an inline function with variadic arguments on all
platforms. Second, it gave the appearance that we had plans to share
code between the i40e and i40evf drivers, due to having a near copy of
the contents in the i40evf/i40e_ethtool_stats.h file.
Patches which actually attempt to combine or share code between the i40e
and i40evf drivers have not materialized, and are likely a ways off.
Rather than fixing the one function which causes build issues, just move
this code back into the i40e_ethtool.c and i40evf_ethtool.c files. Note
that we also change these functions back from static inlines to just
statics, since they're no longer in a header file.
We can revisit this if/when work is done to actually attempt to share
code between drivers. Alternatively, this stats code could be made more
generic so that it can be shared across drivers as part of ethtool
kernel work.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
Sorry aboutthe delay, our iwl queue maintainer Jeff is on vacation, and
discussion about how best to resolve this issue was/is ongoing in the
IWL list.
I opted to just move the contents of these files back into
i40e_ethtool.c and i40evf_ethtool.c respectively, as I think it's better
than only moving the single offending function back into the .c file.
This is different from the approach suggested by Wang, Dongsheng
<dongsheng.wang@hxt-semitech.com> on Intel Wired LAN, but I think it's a
better approach for now.
I've also kindly asked if Aaron Brown could test this out.
.../net/ethernet/intel/i40e/i40e_ethtool.c | 219 ++++++++++++++++-
.../ethernet/intel/i40e/i40e_ethtool_stats.h | 221 ------------------
.../intel/i40evf/i40e_ethtool_stats.h | 221 ------------------
.../ethernet/intel/i40evf/i40evf_ethtool.c | 219 ++++++++++++++++-
4 files changed, 436 insertions(+), 444 deletions(-)
delete mode 100644 drivers/net/ethernet/intel/i40e/i40e_ethtool_stats.h
delete mode 100644 drivers/net/ethernet/intel/i40evf/i40e_ethtool_stats.h
diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
index d7d3974beca2..87fe2e60602f 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
@@ -6,7 +6,224 @@
#include "i40e.h"
#include "i40e_diag.h"
-#include "i40e_ethtool_stats.h"
+/* ethtool statistics helpers */
+
+/**
+ * struct i40e_stats - definition for an ethtool statistic
+ * @stat_string: statistic name to display in ethtool -S output
+ * @sizeof_stat: the sizeof() the stat, must be no greater than sizeof(u64)
+ * @stat_offset: offsetof() the stat from a base pointer
+ *
+ * This structure defines a statistic to be added to the ethtool stats buffer.
+ * It defines a statistic as offset from a common base pointer. Stats should
+ * be defined in constant arrays using the I40E_STAT macro, with every element
+ * of the array using the same _type for calculating the sizeof_stat and
+ * stat_offset.
+ *
+ * The @sizeof_stat is expected to be sizeof(u8), sizeof(u16), sizeof(u32) or
+ * sizeof(u64). Other sizes are not expected and will produce a WARN_ONCE from
+ * the i40e_add_ethtool_stat() helper function.
+ *
+ * The @stat_string is interpreted as a format string, allowing formatted
+ * values to be inserted while looping over multiple structures for a given
+ * statistics array. Thus, every statistic string in an array should have the
+ * same type and number of format specifiers, to be formatted by variadic
+ * arguments to the i40e_add_stat_string() helper function.
+ **/
+struct i40e_stats {
+ char stat_string[ETH_GSTRING_LEN];
+ int sizeof_stat;
+ int stat_offset;
+};
+
+/* Helper macro to define an i40e_stat structure with proper size and type.
+ * Use this when defining constant statistics arrays. Note that @_type expects
+ * only a type name and is used multiple times.
+ */
+#define I40E_STAT(_type, _name, _stat) { \
+ .stat_string = _name, \
+ .sizeof_stat = FIELD_SIZEOF(_type, _stat), \
+ .stat_offset = offsetof(_type, _stat) \
+}
+
+/* Helper macro for defining some statistics directly copied from the netdev
+ * stats structure.
+ */
+#define I40E_NETDEV_STAT(_net_stat) \
+ I40E_STAT(struct rtnl_link_stats64, #_net_stat, _net_stat)
+
+/* Helper macro for defining some statistics related to queues */
+#define I40E_QUEUE_STAT(_name, _stat) \
+ I40E_STAT(struct i40e_ring, _name, _stat)
+
+/* Stats associated with a Tx or Rx ring */
+static const struct i40e_stats i40e_gstrings_queue_stats[] = {
+ I40E_QUEUE_STAT("%s-%u.packets", stats.packets),
+ I40E_QUEUE_STAT("%s-%u.bytes", stats.bytes),
+};
+
+/**
+ * i40e_add_one_ethtool_stat - copy the stat into the supplied buffer
+ * @data: location to store the stat value
+ * @pointer: basis for where to copy from
+ * @stat: the stat definition
+ *
+ * Copies the stat data defined by the pointer and stat structure pair into
+ * the memory supplied as data. Used to implement i40e_add_ethtool_stats and
+ * i40e_add_queue_stats. If the pointer is null, data will be zero'd.
+ */
+static void
+i40e_add_one_ethtool_stat(u64 *data, void *pointer,
+ const struct i40e_stats *stat)
+{
+ char *p;
+
+ if (!pointer) {
+ /* ensure that the ethtool data buffer is zero'd for any stats
+ * which don't have a valid pointer.
+ */
+ *data = 0;
+ return;
+ }
+
+ p = (char *)pointer + stat->stat_offset;
+ switch (stat->sizeof_stat) {
+ case sizeof(u64):
+ *data = *((u64 *)p);
+ break;
+ case sizeof(u32):
+ *data = *((u32 *)p);
+ break;
+ case sizeof(u16):
+ *data = *((u16 *)p);
+ break;
+ case sizeof(u8):
+ *data = *((u8 *)p);
+ break;
+ default:
+ WARN_ONCE(1, "unexpected stat size for %s",
+ stat->stat_string);
+ *data = 0;
+ }
+}
+
+/**
+ * __i40e_add_ethtool_stats - copy stats into the ethtool supplied buffer
+ * @data: ethtool stats buffer
+ * @pointer: location to copy stats from
+ * @stats: array of stats to copy
+ * @size: the size of the stats definition
+ *
+ * Copy the stats defined by the stats array using the pointer as a base into
+ * the data buffer supplied by ethtool. Updates the data pointer to point to
+ * the next empty location for successive calls to __i40e_add_ethtool_stats.
+ * If pointer is null, set the data values to zero and update the pointer to
+ * skip these stats.
+ **/
+static void
+__i40e_add_ethtool_stats(u64 **data, void *pointer,
+ const struct i40e_stats stats[],
+ const unsigned int size)
+{
+ unsigned int i;
+
+ for (i = 0; i < size; i++)
+ i40e_add_one_ethtool_stat((*data)++, pointer, &stats[i]);
+}
+
+/**
+ * i40e_add_ethtool_stats - copy stats into ethtool supplied buffer
+ * @data: ethtool stats buffer
+ * @pointer: location where stats are stored
+ * @stats: static const array of stat definitions
+ *
+ * Macro to ease the use of __i40e_add_ethtool_stats by taking a static
+ * constant stats array and passing the ARRAY_SIZE(). This avoids typos by
+ * ensuring that we pass the size associated with the given stats array.
+ *
+ * The parameter @stats is evaluated twice, so parameters with side effects
+ * should be avoided.
+ **/
+#define i40e_add_ethtool_stats(data, pointer, stats) \
+ __i40e_add_ethtool_stats(data, pointer, stats, ARRAY_SIZE(stats))
+
+/**
+ * i40e_add_queue_stats - copy queue statistics into supplied buffer
+ * @data: ethtool stats buffer
+ * @ring: the ring to copy
+ *
+ * Queue statistics must be copied while protected by
+ * u64_stats_fetch_begin_irq, so we can't directly use i40e_add_ethtool_stats.
+ * Assumes that queue stats are defined in i40e_gstrings_queue_stats. If the
+ * ring pointer is null, zero out the queue stat values and update the data
+ * pointer. Otherwise safely copy the stats from the ring into the supplied
+ * buffer and update the data pointer when finished.
+ *
+ * This function expects to be called while under rcu_read_lock().
+ **/
+static void
+i40e_add_queue_stats(u64 **data, struct i40e_ring *ring)
+{
+ const unsigned int size = ARRAY_SIZE(i40e_gstrings_queue_stats);
+ const struct i40e_stats *stats = i40e_gstrings_queue_stats;
+ unsigned int start;
+ unsigned int i;
+
+ /* To avoid invalid statistics values, ensure that we keep retrying
+ * the copy until we get a consistent value according to
+ * u64_stats_fetch_retry_irq. But first, make sure our ring is
+ * non-null before attempting to access its syncp.
+ */
+ do {
+ start = !ring ? 0 : u64_stats_fetch_begin_irq(&ring->syncp);
+ for (i = 0; i < size; i++) {
+ i40e_add_one_ethtool_stat(&(*data)[i], ring,
+ &stats[i]);
+ }
+ } while (ring && u64_stats_fetch_retry_irq(&ring->syncp, start));
+
+ /* Once we successfully copy the stats in, update the data pointer */
+ *data += size;
+}
+
+/**
+ * __i40e_add_stat_strings - copy stat strings into ethtool buffer
+ * @p: ethtool supplied buffer
+ * @stats: stat definitions array
+ * @size: size of the stats array
+ *
+ * Format and copy the strings described by stats into the buffer pointed at
+ * by p.
+ **/
+static void __i40e_add_stat_strings(u8 **p, const struct i40e_stats stats[],
+ const unsigned int size, ...)
+{
+ unsigned int i;
+
+ for (i = 0; i < size; i++) {
+ va_list args;
+
+ va_start(args, size);
+ vsnprintf(*p, ETH_GSTRING_LEN, stats[i].stat_string, args);
+ *p += ETH_GSTRING_LEN;
+ va_end(args);
+ }
+}
+
+/**
+ * 40e_add_stat_strings - copy stat strings into ethtool buffer
+ * @p: ethtool supplied buffer
+ * @stats: stat definitions array
+ *
+ * Format and copy the strings described by the const static stats value into
+ * the buffer pointed at by p.
+ *
+ * The parameter @stats is evaluated twice, so parameters with side effects
+ * should be avoided. Additionally, stats must be an array such that
+ * ARRAY_SIZE can be called on it.
+ **/
+#define i40e_add_stat_strings(p, stats, ...) \
+ __i40e_add_stat_strings(p, stats, ARRAY_SIZE(stats), ## __VA_ARGS__)
#define I40E_PF_STAT(_name, _stat) \
I40E_STAT(struct i40e_pf, _name, _stat)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool_stats.h b/drivers/net/ethernet/intel/i40e/i40e_ethtool_stats.h
deleted file mode 100644
index bba1cb0b658f..000000000000
--- a/drivers/net/ethernet/intel/i40e/i40e_ethtool_stats.h
+++ /dev/null
@@ -1,221 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright(c) 2013 - 2018 Intel Corporation. */
-
-/* ethtool statistics helpers */
-
-/**
- * struct i40e_stats - definition for an ethtool statistic
- * @stat_string: statistic name to display in ethtool -S output
- * @sizeof_stat: the sizeof() the stat, must be no greater than sizeof(u64)
- * @stat_offset: offsetof() the stat from a base pointer
- *
- * This structure defines a statistic to be added to the ethtool stats buffer.
- * It defines a statistic as offset from a common base pointer. Stats should
- * be defined in constant arrays using the I40E_STAT macro, with every element
- * of the array using the same _type for calculating the sizeof_stat and
- * stat_offset.
- *
- * The @sizeof_stat is expected to be sizeof(u8), sizeof(u16), sizeof(u32) or
- * sizeof(u64). Other sizes are not expected and will produce a WARN_ONCE from
- * the i40e_add_ethtool_stat() helper function.
- *
- * The @stat_string is interpreted as a format string, allowing formatted
- * values to be inserted while looping over multiple structures for a given
- * statistics array. Thus, every statistic string in an array should have the
- * same type and number of format specifiers, to be formatted by variadic
- * arguments to the i40e_add_stat_string() helper function.
- **/
-struct i40e_stats {
- char stat_string[ETH_GSTRING_LEN];
- int sizeof_stat;
- int stat_offset;
-};
-
-/* Helper macro to define an i40e_stat structure with proper size and type.
- * Use this when defining constant statistics arrays. Note that @_type expects
- * only a type name and is used multiple times.
- */
-#define I40E_STAT(_type, _name, _stat) { \
- .stat_string = _name, \
- .sizeof_stat = FIELD_SIZEOF(_type, _stat), \
- .stat_offset = offsetof(_type, _stat) \
-}
-
-/* Helper macro for defining some statistics directly copied from the netdev
- * stats structure.
- */
-#define I40E_NETDEV_STAT(_net_stat) \
- I40E_STAT(struct rtnl_link_stats64, #_net_stat, _net_stat)
-
-/* Helper macro for defining some statistics related to queues */
-#define I40E_QUEUE_STAT(_name, _stat) \
- I40E_STAT(struct i40e_ring, _name, _stat)
-
-/* Stats associated with a Tx or Rx ring */
-static const struct i40e_stats i40e_gstrings_queue_stats[] = {
- I40E_QUEUE_STAT("%s-%u.packets", stats.packets),
- I40E_QUEUE_STAT("%s-%u.bytes", stats.bytes),
-};
-
-/**
- * i40e_add_one_ethtool_stat - copy the stat into the supplied buffer
- * @data: location to store the stat value
- * @pointer: basis for where to copy from
- * @stat: the stat definition
- *
- * Copies the stat data defined by the pointer and stat structure pair into
- * the memory supplied as data. Used to implement i40e_add_ethtool_stats and
- * i40e_add_queue_stats. If the pointer is null, data will be zero'd.
- */
-static inline void
-i40e_add_one_ethtool_stat(u64 *data, void *pointer,
- const struct i40e_stats *stat)
-{
- char *p;
-
- if (!pointer) {
- /* ensure that the ethtool data buffer is zero'd for any stats
- * which don't have a valid pointer.
- */
- *data = 0;
- return;
- }
-
- p = (char *)pointer + stat->stat_offset;
- switch (stat->sizeof_stat) {
- case sizeof(u64):
- *data = *((u64 *)p);
- break;
- case sizeof(u32):
- *data = *((u32 *)p);
- break;
- case sizeof(u16):
- *data = *((u16 *)p);
- break;
- case sizeof(u8):
- *data = *((u8 *)p);
- break;
- default:
- WARN_ONCE(1, "unexpected stat size for %s",
- stat->stat_string);
- *data = 0;
- }
-}
-
-/**
- * __i40e_add_ethtool_stats - copy stats into the ethtool supplied buffer
- * @data: ethtool stats buffer
- * @pointer: location to copy stats from
- * @stats: array of stats to copy
- * @size: the size of the stats definition
- *
- * Copy the stats defined by the stats array using the pointer as a base into
- * the data buffer supplied by ethtool. Updates the data pointer to point to
- * the next empty location for successive calls to __i40e_add_ethtool_stats.
- * If pointer is null, set the data values to zero and update the pointer to
- * skip these stats.
- **/
-static inline void
-__i40e_add_ethtool_stats(u64 **data, void *pointer,
- const struct i40e_stats stats[],
- const unsigned int size)
-{
- unsigned int i;
-
- for (i = 0; i < size; i++)
- i40e_add_one_ethtool_stat((*data)++, pointer, &stats[i]);
-}
-
-/**
- * i40e_add_ethtool_stats - copy stats into ethtool supplied buffer
- * @data: ethtool stats buffer
- * @pointer: location where stats are stored
- * @stats: static const array of stat definitions
- *
- * Macro to ease the use of __i40e_add_ethtool_stats by taking a static
- * constant stats array and passing the ARRAY_SIZE(). This avoids typos by
- * ensuring that we pass the size associated with the given stats array.
- *
- * The parameter @stats is evaluated twice, so parameters with side effects
- * should be avoided.
- **/
-#define i40e_add_ethtool_stats(data, pointer, stats) \
- __i40e_add_ethtool_stats(data, pointer, stats, ARRAY_SIZE(stats))
-
-/**
- * i40e_add_queue_stats - copy queue statistics into supplied buffer
- * @data: ethtool stats buffer
- * @ring: the ring to copy
- *
- * Queue statistics must be copied while protected by
- * u64_stats_fetch_begin_irq, so we can't directly use i40e_add_ethtool_stats.
- * Assumes that queue stats are defined in i40e_gstrings_queue_stats. If the
- * ring pointer is null, zero out the queue stat values and update the data
- * pointer. Otherwise safely copy the stats from the ring into the supplied
- * buffer and update the data pointer when finished.
- *
- * This function expects to be called while under rcu_read_lock().
- **/
-static inline void
-i40e_add_queue_stats(u64 **data, struct i40e_ring *ring)
-{
- const unsigned int size = ARRAY_SIZE(i40e_gstrings_queue_stats);
- const struct i40e_stats *stats = i40e_gstrings_queue_stats;
- unsigned int start;
- unsigned int i;
-
- /* To avoid invalid statistics values, ensure that we keep retrying
- * the copy until we get a consistent value according to
- * u64_stats_fetch_retry_irq. But first, make sure our ring is
- * non-null before attempting to access its syncp.
- */
- do {
- start = !ring ? 0 : u64_stats_fetch_begin_irq(&ring->syncp);
- for (i = 0; i < size; i++) {
- i40e_add_one_ethtool_stat(&(*data)[i], ring,
- &stats[i]);
- }
- } while (ring && u64_stats_fetch_retry_irq(&ring->syncp, start));
-
- /* Once we successfully copy the stats in, update the data pointer */
- *data += size;
-}
-
-/**
- * __i40e_add_stat_strings - copy stat strings into ethtool buffer
- * @p: ethtool supplied buffer
- * @stats: stat definitions array
- * @size: size of the stats array
- *
- * Format and copy the strings described by stats into the buffer pointed at
- * by p.
- **/
-static inline void __i40e_add_stat_strings(u8 **p, const struct i40e_stats stats[],
- const unsigned int size, ...)
-{
- unsigned int i;
-
- for (i = 0; i < size; i++) {
- va_list args;
-
- va_start(args, size);
- vsnprintf(*p, ETH_GSTRING_LEN, stats[i].stat_string, args);
- *p += ETH_GSTRING_LEN;
- va_end(args);
- }
-}
-
-/**
- * 40e_add_stat_strings - copy stat strings into ethtool buffer
- * @p: ethtool supplied buffer
- * @stats: stat definitions array
- *
- * Format and copy the strings described by the const static stats value into
- * the buffer pointed at by p.
- *
- * The parameter @stats is evaluated twice, so parameters with side effects
- * should be avoided. Additionally, stats must be an array such that
- * ARRAY_SIZE can be called on it.
- **/
-#define i40e_add_stat_strings(p, stats, ...) \
- __i40e_add_stat_strings(p, stats, ARRAY_SIZE(stats), ## __VA_ARGS__)
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_ethtool_stats.h b/drivers/net/ethernet/intel/i40evf/i40e_ethtool_stats.h
deleted file mode 100644
index 60b595dd8c39..000000000000
--- a/drivers/net/ethernet/intel/i40evf/i40e_ethtool_stats.h
+++ /dev/null
@@ -1,221 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright(c) 2013 - 2018 Intel Corporation. */
-
-/* ethtool statistics helpers */
-
-/**
- * struct i40e_stats - definition for an ethtool statistic
- * @stat_string: statistic name to display in ethtool -S output
- * @sizeof_stat: the sizeof() the stat, must be no greater than sizeof(u64)
- * @stat_offset: offsetof() the stat from a base pointer
- *
- * This structure defines a statistic to be added to the ethtool stats buffer.
- * It defines a statistic as offset from a common base pointer. Stats should
- * be defined in constant arrays using the I40E_STAT macro, with every element
- * of the array using the same _type for calculating the sizeof_stat and
- * stat_offset.
- *
- * The @sizeof_stat is expected to be sizeof(u8), sizeof(u16), sizeof(u32) or
- * sizeof(u64). Other sizes are not expected and will produce a WARN_ONCE from
- * the i40e_add_ethtool_stat() helper function.
- *
- * The @stat_string is interpreted as a format string, allowing formatted
- * values to be inserted while looping over multiple structures for a given
- * statistics array. Thus, every statistic string in an array should have the
- * same type and number of format specifiers, to be formatted by variadic
- * arguments to the i40e_add_stat_string() helper function.
- **/
-struct i40e_stats {
- char stat_string[ETH_GSTRING_LEN];
- int sizeof_stat;
- int stat_offset;
-};
-
-/* Helper macro to define an i40e_stat structure with proper size and type.
- * Use this when defining constant statistics arrays. Note that @_type expects
- * only a type name and is used multiple times.
- */
-#define I40E_STAT(_type, _name, _stat) { \
- .stat_string = _name, \
- .sizeof_stat = FIELD_SIZEOF(_type, _stat), \
- .stat_offset = offsetof(_type, _stat) \
-}
-
-/* Helper macro for defining some statistics directly copied from the netdev
- * stats structure.
- */
-#define I40E_NETDEV_STAT(_net_stat) \
- I40E_STAT(struct rtnl_link_stats64, #_net_stat, _net_stat)
-
-/* Helper macro for defining some statistics related to queues */
-#define I40E_QUEUE_STAT(_name, _stat) \
- I40E_STAT(struct i40e_ring, _name, _stat)
-
-/* Stats associated with a Tx or Rx ring */
-static const struct i40e_stats i40e_gstrings_queue_stats[] = {
- I40E_QUEUE_STAT("%s-%u.packets", stats.packets),
- I40E_QUEUE_STAT("%s-%u.bytes", stats.bytes),
-};
-
-/**
- * i40evf_add_one_ethtool_stat - copy the stat into the supplied buffer
- * @data: location to store the stat value
- * @pointer: basis for where to copy from
- * @stat: the stat definition
- *
- * Copies the stat data defined by the pointer and stat structure pair into
- * the memory supplied as data. Used to implement i40e_add_ethtool_stats and
- * i40evf_add_queue_stats. If the pointer is null, data will be zero'd.
- */
-static inline void
-i40evf_add_one_ethtool_stat(u64 *data, void *pointer,
- const struct i40e_stats *stat)
-{
- char *p;
-
- if (!pointer) {
- /* ensure that the ethtool data buffer is zero'd for any stats
- * which don't have a valid pointer.
- */
- *data = 0;
- return;
- }
-
- p = (char *)pointer + stat->stat_offset;
- switch (stat->sizeof_stat) {
- case sizeof(u64):
- *data = *((u64 *)p);
- break;
- case sizeof(u32):
- *data = *((u32 *)p);
- break;
- case sizeof(u16):
- *data = *((u16 *)p);
- break;
- case sizeof(u8):
- *data = *((u8 *)p);
- break;
- default:
- WARN_ONCE(1, "unexpected stat size for %s",
- stat->stat_string);
- *data = 0;
- }
-}
-
-/**
- * __i40evf_add_ethtool_stats - copy stats into the ethtool supplied buffer
- * @data: ethtool stats buffer
- * @pointer: location to copy stats from
- * @stats: array of stats to copy
- * @size: the size of the stats definition
- *
- * Copy the stats defined by the stats array using the pointer as a base into
- * the data buffer supplied by ethtool. Updates the data pointer to point to
- * the next empty location for successive calls to __i40evf_add_ethtool_stats.
- * If pointer is null, set the data values to zero and update the pointer to
- * skip these stats.
- **/
-static inline void
-__i40evf_add_ethtool_stats(u64 **data, void *pointer,
- const struct i40e_stats stats[],
- const unsigned int size)
-{
- unsigned int i;
-
- for (i = 0; i < size; i++)
- i40evf_add_one_ethtool_stat((*data)++, pointer, &stats[i]);
-}
-
-/**
- * i40e_add_ethtool_stats - copy stats into ethtool supplied buffer
- * @data: ethtool stats buffer
- * @pointer: location where stats are stored
- * @stats: static const array of stat definitions
- *
- * Macro to ease the use of __i40evf_add_ethtool_stats by taking a static
- * constant stats array and passing the ARRAY_SIZE(). This avoids typos by
- * ensuring that we pass the size associated with the given stats array.
- *
- * The parameter @stats is evaluated twice, so parameters with side effects
- * should be avoided.
- **/
-#define i40e_add_ethtool_stats(data, pointer, stats) \
- __i40evf_add_ethtool_stats(data, pointer, stats, ARRAY_SIZE(stats))
-
-/**
- * i40evf_add_queue_stats - copy queue statistics into supplied buffer
- * @data: ethtool stats buffer
- * @ring: the ring to copy
- *
- * Queue statistics must be copied while protected by
- * u64_stats_fetch_begin_irq, so we can't directly use i40e_add_ethtool_stats.
- * Assumes that queue stats are defined in i40e_gstrings_queue_stats. If the
- * ring pointer is null, zero out the queue stat values and update the data
- * pointer. Otherwise safely copy the stats from the ring into the supplied
- * buffer and update the data pointer when finished.
- *
- * This function expects to be called while under rcu_read_lock().
- **/
-static inline void
-i40evf_add_queue_stats(u64 **data, struct i40e_ring *ring)
-{
- const unsigned int size = ARRAY_SIZE(i40e_gstrings_queue_stats);
- const struct i40e_stats *stats = i40e_gstrings_queue_stats;
- unsigned int start;
- unsigned int i;
-
- /* To avoid invalid statistics values, ensure that we keep retrying
- * the copy until we get a consistent value according to
- * u64_stats_fetch_retry_irq. But first, make sure our ring is
- * non-null before attempting to access its syncp.
- */
- do {
- start = !ring ? 0 : u64_stats_fetch_begin_irq(&ring->syncp);
- for (i = 0; i < size; i++) {
- i40evf_add_one_ethtool_stat(&(*data)[i], ring,
- &stats[i]);
- }
- } while (ring && u64_stats_fetch_retry_irq(&ring->syncp, start));
-
- /* Once we successfully copy the stats in, update the data pointer */
- *data += size;
-}
-
-/**
- * __i40e_add_stat_strings - copy stat strings into ethtool buffer
- * @p: ethtool supplied buffer
- * @stats: stat definitions array
- * @size: size of the stats array
- *
- * Format and copy the strings described by stats into the buffer pointed at
- * by p.
- **/
-static void __i40e_add_stat_strings(u8 **p, const struct i40e_stats stats[],
- const unsigned int size, ...)
-{
- unsigned int i;
-
- for (i = 0; i < size; i++) {
- va_list args;
-
- va_start(args, size);
- vsnprintf(*p, ETH_GSTRING_LEN, stats[i].stat_string, args);
- *p += ETH_GSTRING_LEN;
- va_end(args);
- }
-}
-
-/**
- * 40e_add_stat_strings - copy stat strings into ethtool buffer
- * @p: ethtool supplied buffer
- * @stats: stat definitions array
- *
- * Format and copy the strings described by the const static stats value into
- * the buffer pointed at by p.
- *
- * The parameter @stats is evaluated twice, so parameters with side effects
- * should be avoided. Additionally, stats must be an array such that
- * ARRAY_SIZE can be called on it.
- **/
-#define i40e_add_stat_strings(p, stats, ...) \
- __i40e_add_stat_strings(p, stats, ARRAY_SIZE(stats), ## __VA_ARGS__)
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c b/drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c
index 9319971c5c92..50f65ab737c9 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c
@@ -6,7 +6,224 @@
#include <linux/uaccess.h>
-#include "i40e_ethtool_stats.h"
+/* ethtool statistics helpers */
+
+/**
+ * struct i40e_stats - definition for an ethtool statistic
+ * @stat_string: statistic name to display in ethtool -S output
+ * @sizeof_stat: the sizeof() the stat, must be no greater than sizeof(u64)
+ * @stat_offset: offsetof() the stat from a base pointer
+ *
+ * This structure defines a statistic to be added to the ethtool stats buffer.
+ * It defines a statistic as offset from a common base pointer. Stats should
+ * be defined in constant arrays using the I40E_STAT macro, with every element
+ * of the array using the same _type for calculating the sizeof_stat and
+ * stat_offset.
+ *
+ * The @sizeof_stat is expected to be sizeof(u8), sizeof(u16), sizeof(u32) or
+ * sizeof(u64). Other sizes are not expected and will produce a WARN_ONCE from
+ * the i40e_add_ethtool_stat() helper function.
+ *
+ * The @stat_string is interpreted as a format string, allowing formatted
+ * values to be inserted while looping over multiple structures for a given
+ * statistics array. Thus, every statistic string in an array should have the
+ * same type and number of format specifiers, to be formatted by variadic
+ * arguments to the i40e_add_stat_string() helper function.
+ **/
+struct i40e_stats {
+ char stat_string[ETH_GSTRING_LEN];
+ int sizeof_stat;
+ int stat_offset;
+};
+
+/* Helper macro to define an i40e_stat structure with proper size and type.
+ * Use this when defining constant statistics arrays. Note that @_type expects
+ * only a type name and is used multiple times.
+ */
+#define I40E_STAT(_type, _name, _stat) { \
+ .stat_string = _name, \
+ .sizeof_stat = FIELD_SIZEOF(_type, _stat), \
+ .stat_offset = offsetof(_type, _stat) \
+}
+
+/* Helper macro for defining some statistics directly copied from the netdev
+ * stats structure.
+ */
+#define I40E_NETDEV_STAT(_net_stat) \
+ I40E_STAT(struct rtnl_link_stats64, #_net_stat, _net_stat)
+
+/* Helper macro for defining some statistics related to queues */
+#define I40E_QUEUE_STAT(_name, _stat) \
+ I40E_STAT(struct i40e_ring, _name, _stat)
+
+/* Stats associated with a Tx or Rx ring */
+static const struct i40e_stats i40e_gstrings_queue_stats[] = {
+ I40E_QUEUE_STAT("%s-%u.packets", stats.packets),
+ I40E_QUEUE_STAT("%s-%u.bytes", stats.bytes),
+};
+
+/**
+ * i40evf_add_one_ethtool_stat - copy the stat into the supplied buffer
+ * @data: location to store the stat value
+ * @pointer: basis for where to copy from
+ * @stat: the stat definition
+ *
+ * Copies the stat data defined by the pointer and stat structure pair into
+ * the memory supplied as data. Used to implement i40e_add_ethtool_stats and
+ * i40evf_add_queue_stats. If the pointer is null, data will be zero'd.
+ */
+static void
+i40evf_add_one_ethtool_stat(u64 *data, void *pointer,
+ const struct i40e_stats *stat)
+{
+ char *p;
+
+ if (!pointer) {
+ /* ensure that the ethtool data buffer is zero'd for any stats
+ * which don't have a valid pointer.
+ */
+ *data = 0;
+ return;
+ }
+
+ p = (char *)pointer + stat->stat_offset;
+ switch (stat->sizeof_stat) {
+ case sizeof(u64):
+ *data = *((u64 *)p);
+ break;
+ case sizeof(u32):
+ *data = *((u32 *)p);
+ break;
+ case sizeof(u16):
+ *data = *((u16 *)p);
+ break;
+ case sizeof(u8):
+ *data = *((u8 *)p);
+ break;
+ default:
+ WARN_ONCE(1, "unexpected stat size for %s",
+ stat->stat_string);
+ *data = 0;
+ }
+}
+
+/**
+ * __i40evf_add_ethtool_stats - copy stats into the ethtool supplied buffer
+ * @data: ethtool stats buffer
+ * @pointer: location to copy stats from
+ * @stats: array of stats to copy
+ * @size: the size of the stats definition
+ *
+ * Copy the stats defined by the stats array using the pointer as a base into
+ * the data buffer supplied by ethtool. Updates the data pointer to point to
+ * the next empty location for successive calls to __i40evf_add_ethtool_stats.
+ * If pointer is null, set the data values to zero and update the pointer to
+ * skip these stats.
+ **/
+static void
+__i40evf_add_ethtool_stats(u64 **data, void *pointer,
+ const struct i40e_stats stats[],
+ const unsigned int size)
+{
+ unsigned int i;
+
+ for (i = 0; i < size; i++)
+ i40evf_add_one_ethtool_stat((*data)++, pointer, &stats[i]);
+}
+
+/**
+ * i40e_add_ethtool_stats - copy stats into ethtool supplied buffer
+ * @data: ethtool stats buffer
+ * @pointer: location where stats are stored
+ * @stats: static const array of stat definitions
+ *
+ * Macro to ease the use of __i40evf_add_ethtool_stats by taking a static
+ * constant stats array and passing the ARRAY_SIZE(). This avoids typos by
+ * ensuring that we pass the size associated with the given stats array.
+ *
+ * The parameter @stats is evaluated twice, so parameters with side effects
+ * should be avoided.
+ **/
+#define i40e_add_ethtool_stats(data, pointer, stats) \
+ __i40evf_add_ethtool_stats(data, pointer, stats, ARRAY_SIZE(stats))
+
+/**
+ * i40evf_add_queue_stats - copy queue statistics into supplied buffer
+ * @data: ethtool stats buffer
+ * @ring: the ring to copy
+ *
+ * Queue statistics must be copied while protected by
+ * u64_stats_fetch_begin_irq, so we can't directly use i40e_add_ethtool_stats.
+ * Assumes that queue stats are defined in i40e_gstrings_queue_stats. If the
+ * ring pointer is null, zero out the queue stat values and update the data
+ * pointer. Otherwise safely copy the stats from the ring into the supplied
+ * buffer and update the data pointer when finished.
+ *
+ * This function expects to be called while under rcu_read_lock().
+ **/
+static void
+i40evf_add_queue_stats(u64 **data, struct i40e_ring *ring)
+{
+ const unsigned int size = ARRAY_SIZE(i40e_gstrings_queue_stats);
+ const struct i40e_stats *stats = i40e_gstrings_queue_stats;
+ unsigned int start;
+ unsigned int i;
+
+ /* To avoid invalid statistics values, ensure that we keep retrying
+ * the copy until we get a consistent value according to
+ * u64_stats_fetch_retry_irq. But first, make sure our ring is
+ * non-null before attempting to access its syncp.
+ */
+ do {
+ start = !ring ? 0 : u64_stats_fetch_begin_irq(&ring->syncp);
+ for (i = 0; i < size; i++) {
+ i40evf_add_one_ethtool_stat(&(*data)[i], ring,
+ &stats[i]);
+ }
+ } while (ring && u64_stats_fetch_retry_irq(&ring->syncp, start));
+
+ /* Once we successfully copy the stats in, update the data pointer */
+ *data += size;
+}
+
+/**
+ * __i40e_add_stat_strings - copy stat strings into ethtool buffer
+ * @p: ethtool supplied buffer
+ * @stats: stat definitions array
+ * @size: size of the stats array
+ *
+ * Format and copy the strings described by stats into the buffer pointed at
+ * by p.
+ **/
+static void __i40e_add_stat_strings(u8 **p, const struct i40e_stats stats[],
+ const unsigned int size, ...)
+{
+ unsigned int i;
+
+ for (i = 0; i < size; i++) {
+ va_list args;
+
+ va_start(args, size);
+ vsnprintf(*p, ETH_GSTRING_LEN, stats[i].stat_string, args);
+ *p += ETH_GSTRING_LEN;
+ va_end(args);
+ }
+}
+
+/**
+ * 40e_add_stat_strings - copy stat strings into ethtool buffer
+ * @p: ethtool supplied buffer
+ * @stats: stat definitions array
+ *
+ * Format and copy the strings described by the const static stats value into
+ * the buffer pointed at by p.
+ *
+ * The parameter @stats is evaluated twice, so parameters with side effects
+ * should be avoided. Additionally, stats must be an array such that
+ * ARRAY_SIZE can be called on it.
+ **/
+#define i40e_add_stat_strings(p, stats, ...) \
+ __i40e_add_stat_strings(p, stats, ARRAY_SIZE(stats), ## __VA_ARGS__)
#define I40EVF_STAT(_name, _stat) \
I40E_STAT(struct i40evf_adapter, _name, _stat)
--
2.18.0.219.gaf81d287a9da
^ permalink raw reply related
* Re: kernels > v4.12 oops/crash with ipsec-traffic: bisected to b838d5e1c5b6e57b10ec8af2268824041e3ea911: ipv4: mark DST_NOGC and remove the operation of dst_free()
From: Wolfgang Walter @ 2018-09-07 21:10 UTC (permalink / raw)
To: Steffen Klassert; +Cc: netdev, Wei Wang, Tobias Hommel
In-Reply-To: <1591446.aUMgYVGcZN@stwm.de>
Hello Steffen,
in one of your emails to Thomas you wrote:
> xfrm_lookup+0x2a is at the very beginning of xfrm_lookup(), here we
> find:
>
> u16 family = dst_orig->ops->family;
>
> ops has an offset of 32 bytes (20 hex) in dst_orig, so looks like
> dst_orig is NULL.
>
> In the forwarding case, we get dst_orig from the skb and dst_orig
> can't be NULL here unless the skb itself is already fishy.
Is this really true?
If xfrm_lookup is called from
__xfrm_route_forward():
int __xfrm_route_forward(struct sk_buff *skb, unsigned short family)
{
struct net *net = dev_net(skb->dev);
struct flowi fl;
struct dst_entry *dst;
int res = 1;
if (xfrm_decode_session(skb, &fl, family) < 0) {
XFRM_INC_STATS(net, LINUX_MIB_XFRMFWDHDRERROR);
return 0;
}
skb_dst_force(skb);
dst = xfrm_lookup(net, skb_dst(skb), &fl, NULL, XFRM_LOOKUP_QUEUE);
if (IS_ERR(dst)) {
res = 0;
dst = NULL;
}
skb_dst_set(skb, dst);
return res;
}
couldn't it be possible that skb_dst_force(skb) actually sets dst to NULL if
it cannot safely lock it? If it is absolutely sure that skb_dst_force() never
can set dst to NULL I wonder why it is called at all?
Here is skb_dst_force()
static inline void skb_dst_force(struct sk_buff *skb)
{
if (skb_dst_is_noref(skb)) {
struct dst_entry *dst = skb_dst(skb);
WARN_ON(!rcu_read_lock_held());
if (!dst_hold_safe(dst))
dst = NULL;
skb->_skb_refdst = (unsigned long)dst;
}
}
and dst_hold_safe() is
static inline bool dst_hold_safe(struct dst_entry *dst)
{
return atomic_inc_not_zero(&dst->__refcnt);
}
Am Freitag, 7. September 2018, 22:22:39 schrieb Wolfgang Walter:
> Am Freitag, 31. August 2018, 08:50:24 schrieb Steffen Klassert:
> > On Thu, Aug 30, 2018 at 08:53:50PM +0200, Wolfgang Walter wrote:
> > > Hello,
> > >
> > > kernels > 4.12 do not work on one of our main routers. They crash as
> > > soon
> > > as ipsec-tunnels are configured and ipsec-traffic actually flows.
> >
> > Can you please send the backtrace of this crash?
>
> I bootet the b838d5e1c5b6e57b10ec8af2268824041e3ea911 several times but I
> could not record the complete trace. I think I have to log to the serial
> console but I can't do that before next week.
>
>
> What I could record ist:
>
> There is a always
> <IRQ> ... </IRQ>
> the callrace.
>
> This is the part I could see:
>
>
> irq_exit+0x71/0x80
> do_IRQ+0x4d/0xd0
> common_interrup+07a/0x7a
> </IRQ>
> RIP: 010:cpuidle_enter_state+0x11d/0x200
> RSP: 0018:ffffc9000321bee0 EFLAGS: 00000282 ORIG_RAX: ffffffffffffffc4
> RAX: ffff88085efde450 RBX: 0000000000000004 RCX: 00000003c9e63c13
> RDX: 00000003c9e63c13 RSI: ffb03103fe35ac43 RDI: 0000000000000000
> RBP: ffffe8ffff7cf600 R08: 000000000000000c R09: 0000000000000004
> R10: 0000000000000400 R11: 00000003c99e56fc R12: 00000003c9e63c13
> R13: 00000003c9da9567 R14: 0000000000000004 R15: ffffffff822763e0
> do_idle+0xd3/0x160
> cpu_startup_entry+0x14/0x20
> secondary_startup_64+0xa5/0xb0
> Code: 00 0f b7 83 c0 00 00 00 80 7c 02 08 01 0f 86 d3 02 00 00 41
> 8b 8c 24 3c 10 00 00 48 8b 6b 58 85 c9 0f 84 2f 01 00 00 48 83 e5 fe <f6> 45
> 60
> 02 0f 84 4e 01 00 00 f6 43 38 01 74 0d 80 00 bd ab 00 00
> RIP: ip_forward+0xd4/0x470 RSP: ffff88085efc3cb0
> CR2: 0000000000000060
> ----[ end trace 7205b53c25b7b35a ]---
> Kernel panic - not syncing: Fatal exception in interrupt
> Kernel Offset: disabled
> Rebooting in 60 seconds..
>
>
> I got an email from Tobias Hommel and I think it is the same problem.
>
> It is very clear that it is the difference from
>
> ipv4: call dst_hold_safe() properly
>
> to
>
> ipv4: mark DST_NOGC and remove the operation of dst_free()
>
> which triggers this bug.
>
> Regards,
Regards
--
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts
^ permalink raw reply
* Re: [RFC PATCH bpf-next v2 0/4] Implement bpf queue/stack maps
From: Mauricio Vasquez @ 2018-09-07 20:40 UTC (permalink / raw)
To: Alexei Starovoitov; +Cc: Alexei Starovoitov, Daniel Borkmann, netdev, joe
In-Reply-To: <20180907001317.fj7f6fg6ihljompp@ast-mbp.dhcp.thefacebook.com>
On 09/06/2018 07:13 PM, Alexei Starovoitov wrote:
> On Fri, Aug 31, 2018 at 11:25:48PM +0200, Mauricio Vasquez B wrote:
>> In some applications this is needed have a pool of free elements, like for
>> example the list of free L4 ports in a SNAT. None of the current maps allow
>> to do it as it is not possibleto get an any element without having they key
>> it is associated to.
>>
>> This patchset implements two new kind of eBPF maps: queue and stack.
>> Those maps provide to eBPF programs the peek, push and pop operations, and for
>> userspace applications a new bpf_map_lookup_and_delete_elem() is added.
>>
>> Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it>
>>
>> ---
>>
>> I am sending this as an RFC because there is still an issue I am not sure how
>> to solve.
>>
>> The queue/stack maps have a linked list for saving the nodes, and a
>> preallocation schema based on the pcpu_freelist already implemented and used
>> in the htabmap. Each time an element is pushed into the map, a node from the
>> pcpu_freelist is taken and then added to the linked list.
>>
>> The pop operation takes and *removes* the first node from the linked list, then
>> it uses *call_rcu* to postpose freeing the node, i.e, the node is only returned
>> to the pcpu_freelist when the rcu callback is executed. This is needed because
>> an element returned by the pop() operation should remain valid for the whole
>> duration of the eBPF program.
>>
>> The problem is that elements are not immediately returned to the free list, so
>> in some cases the push operation could fail because there are not free nodes
>> in the pcpu_freelist.
>>
>> The following code snippet exposes that problem.
>>
>> ...
>> /* Push MAP_SIZE elements */
>> for (i = 0; i < MAP_SIZE; i++)
>> assert(bpf_map_update_elem(fd, NULL, &vals[i], 0) == 0);
>>
>> /* Pop all elements */
>> for (i = 0; i < MAP_SIZE; i++)
>> assert(bpf_map_lookup_and_delete_elem(fd, NULL, &val) == 0 &&
>> val == vals[i]);
>>
>> // sleep(1) <-- If I put this sleep, everything works.
>> /* Push MAP_SIZE elements */
>> for (i = 0; i < MAP_SIZE; i++)
>> assert(bpf_map_update_elem(fd, NULL, &vals[i], 0) == 0);
>> ^^^
>> This fails because there are not available elements in pcpu_freelist
>> ...
>>
>> I think a possible solution is to oversize the pcpu_freelist (no idea by how
>> much, maybe double or, or make it 1.5 time the max elements in the map?)
>> I also have concerns about it, it would waste that memory in many cases and
>> this is also probably that it doesn't solve the issue because that code snippet
>> is puhsing and popping elements too fast, so even if the pcpu_freelist is much
>> large a certain time instant all the elements could be used.
>>
>> Is this really an important issue?
>> Any idea of how to solve it?
> It is important issue indeed and a difficult one to solve.
> We have the same issue with hash map.
> If the prog is doing:
> value = lookup(key);
> delete(key);
> // here the prog shouldn't be accessing the value anymore, since the memory
> // could have been reused, but value pointer is still valid and points to
> // allocated memory
Just to notice that for the queue map it is a little bit worse because
there isn't a way to mark an element to be reused, hence in some cases
the pool of free elements could be exhausted.
> bpf_map_pop_elem() is trying to do lookup_and_delete and preserve
> validity of value without races.
> With pcpu_freelist I don't think there is a solution.
> We can have this queue/stack map without prealloc and use kmalloc/kfree
> back and forth. Performance will not be as great, but for your use case,
> I suspect, it will be good enough.
I agree, for our use case we are not that worried about the performance,
it is still in the dataplane but let's say it is not in the "hot" path.
> The key issue with kmalloc/kfree is unbounded time of rcu callbacks.
> If somebody starts doing push/pop for every packet, the rcu subsystem
> will struggle and nothing we can do about it.
>
> The only way I could think of to resolve this problem is to reuse
> the logic that Joe is working on for socket lookups inside the program.
> Joe,
> how is that going? Could you repost the latest patches?
>
> In such case the api for stack map will look like:
>
> elem = bpf_map_pop_elem(stack);
> // access elem
> bpf_map_free_elem(elem);
> // here prog is not allowed to access elem and verifier will catch that
>
> elem = bpf_map_alloc_elem(stack);
> // populate elem
> bpf_map_push_elem(elem);
> // here prog is not allowed to access elem and verifier will catch that
>
> Then both pre-allocated elems and kmalloc/kfree will work fine
> and no unbounded rcu issues in both cases.
>
>
I read the Joe's proposal and using that for this problem looks like a
nice solution.
I think a good trade-off for now would be to go ahead with a queue/stack
map without preallocating support (or maybe include it having always in
mind that this issue has to be solved in the near future) and then, as a
separated work, try to use Joe's proposal in the map helpers.
What do you think?
Thanks,
Mauricio.
^ permalink raw reply
* [Patch net-next] htb: use anonymous union for simplicity
From: Cong Wang @ 2018-09-07 20:29 UTC (permalink / raw)
To: netdev; +Cc: jiri, Cong Wang, Jamal Hadi Salim
In-Reply-To: <20180907202914.21331-1-xiyou.wangcong@gmail.com>
cl->leaf.q is slightly more readable than cl->un.leaf.q.
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
net/sched/sch_htb.c | 98 ++++++++++++++++++++++++++---------------------------
1 file changed, 49 insertions(+), 49 deletions(-)
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 43c4bfe625a9..2c45e97b7b87 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -132,7 +132,7 @@ struct htb_class {
struct htb_class_inner {
struct htb_prio clprio[TC_HTB_NUMPRIO];
} inner;
- } un;
+ };
s64 pq_key;
int prio_activity; /* for which prios are we active */
@@ -411,13 +411,13 @@ static void htb_activate_prios(struct htb_sched *q, struct htb_class *cl)
int prio = ffz(~m);
m &= ~(1 << prio);
- if (p->un.inner.clprio[prio].feed.rb_node)
+ if (p->inner.clprio[prio].feed.rb_node)
/* parent already has its feed in use so that
* reset bit in mask as parent is already ok
*/
mask &= ~(1 << prio);
- htb_add_to_id_tree(&p->un.inner.clprio[prio].feed, cl, prio);
+ htb_add_to_id_tree(&p->inner.clprio[prio].feed, cl, prio);
}
p->prio_activity |= mask;
cl = p;
@@ -447,19 +447,19 @@ static void htb_deactivate_prios(struct htb_sched *q, struct htb_class *cl)
int prio = ffz(~m);
m &= ~(1 << prio);
- if (p->un.inner.clprio[prio].ptr == cl->node + prio) {
+ if (p->inner.clprio[prio].ptr == cl->node + prio) {
/* we are removing child which is pointed to from
* parent feed - forget the pointer but remember
* classid
*/
- p->un.inner.clprio[prio].last_ptr_id = cl->common.classid;
- p->un.inner.clprio[prio].ptr = NULL;
+ p->inner.clprio[prio].last_ptr_id = cl->common.classid;
+ p->inner.clprio[prio].ptr = NULL;
}
htb_safe_rb_erase(cl->node + prio,
- &p->un.inner.clprio[prio].feed);
+ &p->inner.clprio[prio].feed);
- if (!p->un.inner.clprio[prio].feed.rb_node)
+ if (!p->inner.clprio[prio].feed.rb_node)
mask |= 1 << prio;
}
@@ -555,7 +555,7 @@ htb_change_class_mode(struct htb_sched *q, struct htb_class *cl, s64 *diff)
*/
static inline void htb_activate(struct htb_sched *q, struct htb_class *cl)
{
- WARN_ON(cl->level || !cl->un.leaf.q || !cl->un.leaf.q->q.qlen);
+ WARN_ON(cl->level || !cl->leaf.q || !cl->leaf.q->q.qlen);
if (!cl->prio_activity) {
cl->prio_activity = 1 << cl->prio;
@@ -615,7 +615,7 @@ static int htb_enqueue(struct sk_buff *skb, struct Qdisc *sch,
__qdisc_drop(skb, to_free);
return ret;
#endif
- } else if ((ret = qdisc_enqueue(skb, cl->un.leaf.q,
+ } else if ((ret = qdisc_enqueue(skb, cl->leaf.q,
to_free)) != NET_XMIT_SUCCESS) {
if (net_xmit_drop_count(ret)) {
qdisc_qstats_drop(sch);
@@ -823,7 +823,7 @@ static struct htb_class *htb_lookup_leaf(struct htb_prio *hprio, const int prio)
cl = rb_entry(*sp->pptr, struct htb_class, node[prio]);
if (!cl->level)
return cl;
- clp = &cl->un.inner.clprio[prio];
+ clp = &cl->inner.clprio[prio];
(++sp)->root = clp->feed.rb_node;
sp->pptr = &clp->ptr;
sp->pid = &clp->last_ptr_id;
@@ -857,7 +857,7 @@ static struct sk_buff *htb_dequeue_tree(struct htb_sched *q, const int prio,
* graft operation on the leaf since last dequeue;
* simply deactivate and skip such class
*/
- if (unlikely(cl->un.leaf.q->q.qlen == 0)) {
+ if (unlikely(cl->leaf.q->q.qlen == 0)) {
struct htb_class *next;
htb_deactivate(q, cl);
@@ -873,12 +873,12 @@ static struct sk_buff *htb_dequeue_tree(struct htb_sched *q, const int prio,
goto next;
}
- skb = cl->un.leaf.q->dequeue(cl->un.leaf.q);
+ skb = cl->leaf.q->dequeue(cl->leaf.q);
if (likely(skb != NULL))
break;
- qdisc_warn_nonwc("htb", cl->un.leaf.q);
- htb_next_rb_node(level ? &cl->parent->un.inner.clprio[prio].ptr:
+ qdisc_warn_nonwc("htb", cl->leaf.q);
+ htb_next_rb_node(level ? &cl->parent->inner.clprio[prio].ptr:
&q->hlevel[0].hprio[prio].ptr);
cl = htb_lookup_leaf(hprio, prio);
@@ -886,16 +886,16 @@ static struct sk_buff *htb_dequeue_tree(struct htb_sched *q, const int prio,
if (likely(skb != NULL)) {
bstats_update(&cl->bstats, skb);
- cl->un.leaf.deficit[level] -= qdisc_pkt_len(skb);
- if (cl->un.leaf.deficit[level] < 0) {
- cl->un.leaf.deficit[level] += cl->quantum;
- htb_next_rb_node(level ? &cl->parent->un.inner.clprio[prio].ptr :
+ cl->leaf.deficit[level] -= qdisc_pkt_len(skb);
+ if (cl->leaf.deficit[level] < 0) {
+ cl->leaf.deficit[level] += cl->quantum;
+ htb_next_rb_node(level ? &cl->parent->inner.clprio[prio].ptr :
&q->hlevel[0].hprio[prio].ptr);
}
/* this used to be after charge_class but this constelation
* gives us slightly better performance
*/
- if (!cl->un.leaf.q->q.qlen)
+ if (!cl->leaf.q->q.qlen)
htb_deactivate(q, cl);
htb_charge_class(q, cl, level, skb);
}
@@ -972,10 +972,10 @@ static void htb_reset(struct Qdisc *sch)
for (i = 0; i < q->clhash.hashsize; i++) {
hlist_for_each_entry(cl, &q->clhash.hash[i], common.hnode) {
if (cl->level)
- memset(&cl->un.inner, 0, sizeof(cl->un.inner));
+ memset(&cl->inner, 0, sizeof(cl->inner));
else {
- if (cl->un.leaf.q)
- qdisc_reset(cl->un.leaf.q);
+ if (cl->leaf.q)
+ qdisc_reset(cl->leaf.q);
}
cl->prio_activity = 0;
cl->cmode = HTB_CAN_SEND;
@@ -1098,8 +1098,8 @@ static int htb_dump_class(struct Qdisc *sch, unsigned long arg,
*/
tcm->tcm_parent = cl->parent ? cl->parent->common.classid : TC_H_ROOT;
tcm->tcm_handle = cl->common.classid;
- if (!cl->level && cl->un.leaf.q)
- tcm->tcm_info = cl->un.leaf.q->handle;
+ if (!cl->level && cl->leaf.q)
+ tcm->tcm_info = cl->leaf.q->handle;
nest = nla_nest_start(skb, TCA_OPTIONS);
if (nest == NULL)
@@ -1142,9 +1142,9 @@ htb_dump_class_stats(struct Qdisc *sch, unsigned long arg, struct gnet_dump *d)
};
__u32 qlen = 0;
- if (!cl->level && cl->un.leaf.q) {
- qlen = cl->un.leaf.q->q.qlen;
- qs.backlog = cl->un.leaf.q->qstats.backlog;
+ if (!cl->level && cl->leaf.q) {
+ qlen = cl->leaf.q->q.qlen;
+ qs.backlog = cl->leaf.q->qstats.backlog;
}
cl->xstats.tokens = clamp_t(s64, PSCHED_NS2TICKS(cl->tokens),
INT_MIN, INT_MAX);
@@ -1172,14 +1172,14 @@ static int htb_graft(struct Qdisc *sch, unsigned long arg, struct Qdisc *new,
cl->common.classid, extack)) == NULL)
return -ENOBUFS;
- *old = qdisc_replace(sch, new, &cl->un.leaf.q);
+ *old = qdisc_replace(sch, new, &cl->leaf.q);
return 0;
}
static struct Qdisc *htb_leaf(struct Qdisc *sch, unsigned long arg)
{
struct htb_class *cl = (struct htb_class *)arg;
- return !cl->level ? cl->un.leaf.q : NULL;
+ return !cl->level ? cl->leaf.q : NULL;
}
static void htb_qlen_notify(struct Qdisc *sch, unsigned long arg)
@@ -1205,15 +1205,15 @@ static void htb_parent_to_leaf(struct htb_sched *q, struct htb_class *cl,
{
struct htb_class *parent = cl->parent;
- WARN_ON(cl->level || !cl->un.leaf.q || cl->prio_activity);
+ WARN_ON(cl->level || !cl->leaf.q || cl->prio_activity);
if (parent->cmode != HTB_CAN_SEND)
htb_safe_rb_erase(&parent->pq_node,
&q->hlevel[parent->level].wait_pq);
parent->level = 0;
- memset(&parent->un.inner, 0, sizeof(parent->un.inner));
- parent->un.leaf.q = new_q ? new_q : &noop_qdisc;
+ memset(&parent->inner, 0, sizeof(parent->inner));
+ parent->leaf.q = new_q ? new_q : &noop_qdisc;
parent->tokens = parent->buffer;
parent->ctokens = parent->cbuffer;
parent->t_c = ktime_get_ns();
@@ -1223,8 +1223,8 @@ static void htb_parent_to_leaf(struct htb_sched *q, struct htb_class *cl,
static void htb_destroy_class(struct Qdisc *sch, struct htb_class *cl)
{
if (!cl->level) {
- WARN_ON(!cl->un.leaf.q);
- qdisc_destroy(cl->un.leaf.q);
+ WARN_ON(!cl->leaf.q);
+ qdisc_destroy(cl->leaf.q);
}
gen_kill_estimator(&cl->rate_est);
tcf_block_put(cl->block);
@@ -1286,11 +1286,11 @@ static int htb_delete(struct Qdisc *sch, unsigned long arg)
sch_tree_lock(sch);
if (!cl->level) {
- unsigned int qlen = cl->un.leaf.q->q.qlen;
- unsigned int backlog = cl->un.leaf.q->qstats.backlog;
+ unsigned int qlen = cl->leaf.q->q.qlen;
+ unsigned int backlog = cl->leaf.q->qstats.backlog;
- qdisc_reset(cl->un.leaf.q);
- qdisc_tree_reduce_backlog(cl->un.leaf.q, qlen, backlog);
+ qdisc_reset(cl->leaf.q);
+ qdisc_tree_reduce_backlog(cl->leaf.q, qlen, backlog);
}
/* delete from hash and active; remainder in destroy_class */
@@ -1419,13 +1419,13 @@ static int htb_change_class(struct Qdisc *sch, u32 classid,
classid, NULL);
sch_tree_lock(sch);
if (parent && !parent->level) {
- unsigned int qlen = parent->un.leaf.q->q.qlen;
- unsigned int backlog = parent->un.leaf.q->qstats.backlog;
+ unsigned int qlen = parent->leaf.q->q.qlen;
+ unsigned int backlog = parent->leaf.q->qstats.backlog;
/* turn parent into inner node */
- qdisc_reset(parent->un.leaf.q);
- qdisc_tree_reduce_backlog(parent->un.leaf.q, qlen, backlog);
- qdisc_destroy(parent->un.leaf.q);
+ qdisc_reset(parent->leaf.q);
+ qdisc_tree_reduce_backlog(parent->leaf.q, qlen, backlog);
+ qdisc_destroy(parent->leaf.q);
if (parent->prio_activity)
htb_deactivate(q, parent);
@@ -1436,10 +1436,10 @@ static int htb_change_class(struct Qdisc *sch, u32 classid,
}
parent->level = (parent->parent ? parent->parent->level
: TC_HTB_MAXDEPTH) - 1;
- memset(&parent->un.inner, 0, sizeof(parent->un.inner));
+ memset(&parent->inner, 0, sizeof(parent->inner));
}
/* leaf (we) needs elementary qdisc */
- cl->un.leaf.q = new_q ? new_q : &noop_qdisc;
+ cl->leaf.q = new_q ? new_q : &noop_qdisc;
cl->common.classid = classid;
cl->parent = parent;
@@ -1455,8 +1455,8 @@ static int htb_change_class(struct Qdisc *sch, u32 classid,
qdisc_class_hash_insert(&q->clhash, &cl->common);
if (parent)
parent->children++;
- if (cl->un.leaf.q != &noop_qdisc)
- qdisc_hash_add(cl->un.leaf.q, true);
+ if (cl->leaf.q != &noop_qdisc)
+ qdisc_hash_add(cl->leaf.q, true);
} else {
if (tca[TCA_RATE]) {
err = gen_replace_estimator(&cl->bstats, NULL,
@@ -1478,7 +1478,7 @@ static int htb_change_class(struct Qdisc *sch, u32 classid,
psched_ratecfg_precompute(&cl->ceil, &hopt->ceil, ceil64);
/* it used to be a nasty bug here, we have to check that node
- * is really leaf before changing cl->un.leaf !
+ * is really leaf before changing cl->leaf !
*/
if (!cl->level) {
u64 quantum = cl->rate.rate_bytes_ps;
--
2.14.4
^ permalink raw reply related
* [Patch net-next] net_sched: remove redundant qdisc lock classes
From: Cong Wang @ 2018-09-07 20:29 UTC (permalink / raw)
To: netdev; +Cc: jiri, Cong Wang, Jamal Hadi Salim
We no longer take any spinlock on RX path for ingress qdisc,
so this lockdep annotation is no longer needed.
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
net/sched/sch_api.c | 7 -------
1 file changed, 7 deletions(-)
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 98541c6399db..411c40344b77 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -27,7 +27,6 @@
#include <linux/kmod.h>
#include <linux/list.h>
#include <linux/hrtimer.h>
-#include <linux/lockdep.h>
#include <linux/slab.h>
#include <linux/hashtable.h>
@@ -1053,10 +1052,6 @@ static int qdisc_block_indexes_set(struct Qdisc *sch, struct nlattr **tca,
return 0;
}
-/* lockdep annotation is needed for ingress; egress gets it only for name */
-static struct lock_class_key qdisc_tx_lock;
-static struct lock_class_key qdisc_rx_lock;
-
/*
Allocate and initialize new qdisc.
@@ -1121,7 +1116,6 @@ static struct Qdisc *qdisc_create(struct net_device *dev,
if (handle == TC_H_INGRESS) {
sch->flags |= TCQ_F_INGRESS;
handle = TC_H_MAKE(TC_H_INGRESS, 0);
- lockdep_set_class(qdisc_lock(sch), &qdisc_rx_lock);
} else {
if (handle == 0) {
handle = qdisc_alloc_handle(dev);
@@ -1129,7 +1123,6 @@ static struct Qdisc *qdisc_create(struct net_device *dev,
if (handle == 0)
goto err_out3;
}
- lockdep_set_class(qdisc_lock(sch), &qdisc_tx_lock);
if (!netif_is_multiqueue(dev))
sch->flags |= TCQ_F_ONETXQUEUE;
}
--
2.14.4
^ permalink raw reply related
* Re: kernels > v4.12 oops/crash with ipsec-traffic: bisected to b838d5e1c5b6e57b10ec8af2268824041e3ea911: ipv4: mark DST_NOGC and remove the operation of dst_free()
From: Wolfgang Walter @ 2018-09-07 20:22 UTC (permalink / raw)
To: Steffen Klassert; +Cc: netdev, Wei Wang, Tobias Hommel
In-Reply-To: <20180831065024.GG23674@gauss3.secunet.de>
Am Freitag, 31. August 2018, 08:50:24 schrieb Steffen Klassert:
> On Thu, Aug 30, 2018 at 08:53:50PM +0200, Wolfgang Walter wrote:
> > Hello,
> >
> > kernels > 4.12 do not work on one of our main routers. They crash as soon
> > as ipsec-tunnels are configured and ipsec-traffic actually flows.
>
> Can you please send the backtrace of this crash?
>
I bootet the b838d5e1c5b6e57b10ec8af2268824041e3ea911 several times but I
could not record the complete trace. I think I have to log to the serial
console but I can't do that before next week.
What I could record ist:
There is a always
<IRQ> ... </IRQ>
the callrace.
This is the part I could see:
irq_exit+0x71/0x80
do_IRQ+0x4d/0xd0
common_interrup+07a/0x7a
</IRQ>
RIP: 010:cpuidle_enter_state+0x11d/0x200
RSP: 0018:ffffc9000321bee0 EFLAGS: 00000282 ORIG_RAX: ffffffffffffffc4
RAX: ffff88085efde450 RBX: 0000000000000004 RCX: 00000003c9e63c13
RDX: 00000003c9e63c13 RSI: ffb03103fe35ac43 RDI: 0000000000000000
RBP: ffffe8ffff7cf600 R08: 000000000000000c R09: 0000000000000004
R10: 0000000000000400 R11: 00000003c99e56fc R12: 00000003c9e63c13
R13: 00000003c9da9567 R14: 0000000000000004 R15: ffffffff822763e0
do_idle+0xd3/0x160
cpu_startup_entry+0x14/0x20
secondary_startup_64+0xa5/0xb0
Code: 00 0f b7 83 c0 00 00 00 80 7c 02 08 01 0f 86 d3 02 00 00 41
8b 8c 24 3c 10 00 00 48 8b 6b 58 85 c9 0f 84 2f 01 00 00 48 83 e5 fe <f6> 45
60
02 0f 84 4e 01 00 00 f6 43 38 01 74 0d 80 00 bd ab 00 00
RIP: ip_forward+0xd4/0x470 RSP: ffff88085efc3cb0
CR2: 0000000000000060
----[ end trace 7205b53c25b7b35a ]---
Kernel panic - not syncing: Fatal exception in interrupt
Kernel Offset: disabled
Rebooting in 60 seconds..
I got an email from Tobias Hommel and I think it is the same problem.
It is very clear that it is the difference from
ipv4: call dst_hold_safe() properly
to
ipv4: mark DST_NOGC and remove the operation of dst_free()
which triggers this bug.
Regards,
--
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts
^ permalink raw reply
* Re: [PATCH net-next 08/13] net: sched: rename tcf_block_get{_ext}() and tcf_block_put{_ext}()
From: Cong Wang @ 2018-09-07 20:09 UTC (permalink / raw)
To: Vlad Buslov
Cc: Linux Kernel Network Developers, Jamal Hadi Salim, Jiri Pirko,
David Miller, Stephen Hemminger, Kirill Tkhai, Paul E. McKenney,
Nicolas Dichtel, Leon Romanovsky, Greg KH, mark.rutland,
Florian Westphal, David Ahern, lucien xin, Jakub Kicinski,
Christian Brauner, Jiri Benc
In-Reply-To: <1536220742-25650-9-git-send-email-vladbu@mellanox.com>
On Thu, Sep 6, 2018 at 12:59 AM Vlad Buslov <vladbu@mellanox.com> wrote:
>
> Functions tcf_block_get{_ext}() and tcf_block_put{_ext}() actually
> attach/detach block to specific Qdisc besides just taking/putting
> reference. Rename them according to their purpose.
Where exactly does it attach to?
Each qdisc provides a pointer to a pointer of a block, like
&cl->block. It is where the result is saved to. It takes a parameter
of Qdisc* merely for read-only purpose.
So, renaming it to *attach() is even confusing, at least not
any better. Please find other names or leave them as they are.
^ permalink raw reply
* Re: [PATCH v3 3/3] IB/ipoib: Log sysfs 'dev_id' accesses from userspace
From: Doug Ledford @ 2018-09-07 20:02 UTC (permalink / raw)
To: Leon Romanovsky; +Cc: Arseny Maslennikov, linux-rdma, Jason Gunthorpe, netdev
In-Reply-To: <20180907172821.GS3142@mtr-leonro.mtl.com>
[-- Attachment #1: Type: text/plain, Size: 4629 bytes --]
On Fri, 2018-09-07 at 20:28 +0300, Leon Romanovsky wrote:
> On Fri, Sep 07, 2018 at 01:14:37PM -0400, Doug Ledford wrote:
> > On Thu, 2018-09-06 at 16:03 +0300, Leon Romanovsky wrote:
> > > On Thu, Sep 06, 2018 at 10:04:33AM +0300, Arseny Maslennikov wrote:
> > > > On Wed, Sep 05, 2018 at 04:50:35PM +0300, Leon Romanovsky wrote:
> > > > > On Mon, Sep 03, 2018 at 07:13:16PM +0300, Arseny Maslennikov wrote:
> > > > > > Signed-off-by: Arseny Maslennikov <ar@cs.msu.ru>
> > > > > > ---
> > > > > > drivers/infiniband/ulp/ipoib/ipoib_main.c | 38 +++++++++++++++++++++++
> > > > > > 1 file changed, 38 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
> > > > > > index 30f840f874b3..7386e5bde3d3 100644
> > > > > > --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
> > > > > > +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
> > > > > > @@ -2386,6 +2386,42 @@ int ipoib_add_pkey_attr(struct net_device *dev)
> > > > > > return device_create_file(&dev->dev, &dev_attr_pkey);
> > > > > > }
> > > > > >
> > > > > > +/*
> > > > > > + * We erroneously exposed the iface's port number in the dev_id
> > > > > > + * sysfs field long after dev_port was introduced for that purpose[1],
> > > > > > + * and we need to stop everyone from relying on that.
> > > > > > + * Let's overload the shower routine for the dev_id file here
> > > > > > + * to gently bring the issue up.
> > > > > > + *
> > > > > > + * [1] https://www.spinics.net/lists/netdev/msg272123.html
> > > > > > + */
> > > > > > +static ssize_t dev_id_show(struct device *dev,
> > > > > > + struct device_attribute *attr, char *buf)
> > > > > > +{
> > > > > > + struct net_device *ndev = to_net_dev(dev);
> > > > > > + ssize_t ret = -EINVAL;
> > > > > > +
> > > > > > + if (ndev->dev_id == ndev->dev_port) {
> > > > > > + netdev_info_once(ndev,
> > > > > > + "\"%s\" wants to know my dev_id. "
> > > > > > + "Should it look at dev_port instead?\n",
> > > > > > + current->comm);
> > > > > > + netdev_info_once(ndev,
> > > > > > + "See Documentation/ABI/testing/sysfs-class-net for more info.\n");
> > > > > > + }
> > > > > > +
> > > > > > + ret = sprintf(buf, "%#x\n", ndev->dev_id);
> > > > > > +
> > > > > > + return ret;
> > > > > > +}
> > > > > > +static DEVICE_ATTR_RO(dev_id);
> > > > > > +
> > > > >
> > > > > I don't see this field among exposed by IPoIB, why should we expose it now?
> > > > >
> > > >
> > > > To deviate from standard netdev behaviour, which only prints the
> > > > field out. Doug wanted this to also print a deprecation message, and
> > > > netdev (obviously) does not do that. See below.
> > > >
> > > > > > +int ipoib_intercept_dev_id_attr(struct net_device *dev)
> > > > > > +{
> > > > > > + device_remove_file(&dev->dev, &dev_attr_dev_id);
> > > > > > + return device_create_file(&dev->dev, &dev_attr_dev_id);
> > > > >
> > > > > Why isn't enough to rely on netdev code?
> > > > >
> > > >
> > > > Netdev code relies on macros around a *static* function 'netdev_show',
> > > > which is defined in net/core/net-sysfs.c; it is not listed in any header
> > > > files, and the macros aren't as well. This all leads me to believe it
> > > > was not really meant to be used from outside net/core/net-sysfs.
> > > >
> > > > The only way we could use any netdev code here is to set up our own
> > > > handler (again), printk() a message, then call netdev_show — but we have
> > > > no access to it.
> > > >
> > > > Of course, it also may be that I'm terribly missing a clue.
> > >
> > > Thanks,
> > >
> > > IMHO, the end result of adequate Doug's request is a little bit too much.
> > > I don't think that it justifies such remove->create construction.
> > >
> > > Personal opinion.
> >
> > I agree with you that the end result is kinda bulky, *but*, we will need
> > to know if there are things using the old dev_id before we can remove
> > it. In particular, I'm concerned the IPoIB handling code of
> > NetworkManager uses it. It's worth the cost I think.
>
> I did my checks now and saw that ibdev2netdev uses the value provided
> from this dev_id, so every invocation of that script will generate the
> warning.
>
> See this line in Parav's repo
> https://github.com/Mellanox/container_scripts/blob/master/ibdev2netdev#L112
I'm pretty sure libvirt + qemu will trigger it too.
--
Doug Ledford <dledford@redhat.com>
GPG KeyID: B826A3330E572FDD
Key fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply
* Re: [PATCH net-next 09/13] net: sched: extend tcf_block with rcu
From: Cong Wang @ 2018-09-07 19:52 UTC (permalink / raw)
To: Vlad Buslov
Cc: Linux Kernel Network Developers, Jamal Hadi Salim, Jiri Pirko,
David Miller, Stephen Hemminger, Kirill Tkhai, Paul E. McKenney,
Nicolas Dichtel, Leon Romanovsky, Greg KH, mark.rutland,
Florian Westphal, David Ahern, lucien xin, Jakub Kicinski,
Christian Brauner, Jiri Benc
In-Reply-To: <1536220742-25650-10-git-send-email-vladbu@mellanox.com>
On Thu, Sep 6, 2018 at 12:59 AM Vlad Buslov <vladbu@mellanox.com> wrote:
>
> Extend tcf_block with rcu to allow safe deallocation when it is accessed
> concurrently.
This sucks, please fold this patch into where you call rcu_read_lock()
on tcf block.
This patch _alone_ is apparently not complete. This is not how
we split patches.
^ permalink raw reply
* Re: [PATCH net-next 13/13] net: sched: add flags to Qdisc class ops struct
From: Cong Wang @ 2018-09-07 19:50 UTC (permalink / raw)
To: Vlad Buslov
Cc: Linux Kernel Network Developers, Jamal Hadi Salim, Jiri Pirko,
David Miller, Stephen Hemminger, Kirill Tkhai, Paul E. McKenney,
Nicolas Dichtel, Leon Romanovsky, Greg KH, mark.rutland,
Florian Westphal, David Ahern, lucien xin, Jakub Kicinski,
Christian Brauner, Jiri Benc
In-Reply-To: <1536220742-25650-14-git-send-email-vladbu@mellanox.com>
On Thu, Sep 6, 2018 at 12:59 AM Vlad Buslov <vladbu@mellanox.com> wrote:
>
> Extend Qdisc_class_ops with flags. Create enum to hold possible class ops
> flag values. Add first class ops flags value QDISC_CLASS_OPS_DOIT_UNLOCKED
> to indicate that class ops functions can be called without taking rtnl
> lock.
We don't add anything that is not used.
This is the last patch in this series, so I am pretty sure you split
it in a wrong way, it certainly belongs to next series, not this series.
^ permalink raw reply
* [PATCH 5/5] ata: ahci_sunxi: use xxxsetbits32 functions
From: Corentin Labbe @ 2018-09-07 19:41 UTC (permalink / raw)
To: Gilles.Muller, Julia.Lawall, agust, alexandre.torgue, alistair,
benh, carlo, davem, galak, joabreu, khilman, maxime.ripard,
michal.lkml, mpe, mporter, nicolas.palix, oss, paulus,
peppe.cavallaro, tj, vitb, wens
Cc: cocci, linux-amlogic, linux-arm-kernel, linux-ide, linux-kernel,
linuxppc-dev, netdev, linux-sunxi, Corentin Labbe
In-Reply-To: <1536349307-20714-1-git-send-email-clabbe@baylibre.com>
This patch converts ahci_sunxi to use xxxsetbits32 functions
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
---
drivers/ata/ahci_sunxi.c | 51 ++++++++++++------------------------------------
1 file changed, 12 insertions(+), 39 deletions(-)
diff --git a/drivers/ata/ahci_sunxi.c b/drivers/ata/ahci_sunxi.c
index 911710643305..0799441f1237 100644
--- a/drivers/ata/ahci_sunxi.c
+++ b/drivers/ata/ahci_sunxi.c
@@ -25,6 +25,7 @@
#include <linux/of_device.h>
#include <linux/platform_device.h>
#include <linux/regulator/consumer.h>
+#include <linux/setbits.h>
#include "ahci.h"
#define DRV_NAME "ahci-sunxi"
@@ -58,34 +59,6 @@ MODULE_PARM_DESC(enable_pmp,
#define AHCI_P0PHYCR 0x0178
#define AHCI_P0PHYSR 0x017c
-static void sunxi_clrbits(void __iomem *reg, u32 clr_val)
-{
- u32 reg_val;
-
- reg_val = readl(reg);
- reg_val &= ~(clr_val);
- writel(reg_val, reg);
-}
-
-static void sunxi_setbits(void __iomem *reg, u32 set_val)
-{
- u32 reg_val;
-
- reg_val = readl(reg);
- reg_val |= set_val;
- writel(reg_val, reg);
-}
-
-static void sunxi_clrsetbits(void __iomem *reg, u32 clr_val, u32 set_val)
-{
- u32 reg_val;
-
- reg_val = readl(reg);
- reg_val &= ~(clr_val);
- reg_val |= set_val;
- writel(reg_val, reg);
-}
-
static u32 sunxi_getbits(void __iomem *reg, u8 mask, u8 shift)
{
return (readl(reg) >> shift) & mask;
@@ -100,22 +73,22 @@ static int ahci_sunxi_phy_init(struct device *dev, void __iomem *reg_base)
writel(0, reg_base + AHCI_RWCR);
msleep(5);
- sunxi_setbits(reg_base + AHCI_PHYCS1R, BIT(19));
- sunxi_clrsetbits(reg_base + AHCI_PHYCS0R,
+ setbits32(reg_base + AHCI_PHYCS1R, BIT(19));
+ clrsetbits32(reg_base + AHCI_PHYCS0R,
(0x7 << 24),
(0x5 << 24) | BIT(23) | BIT(18));
- sunxi_clrsetbits(reg_base + AHCI_PHYCS1R,
+ clrsetbits32(reg_base + AHCI_PHYCS1R,
(0x3 << 16) | (0x1f << 8) | (0x3 << 6),
(0x2 << 16) | (0x6 << 8) | (0x2 << 6));
- sunxi_setbits(reg_base + AHCI_PHYCS1R, BIT(28) | BIT(15));
- sunxi_clrbits(reg_base + AHCI_PHYCS1R, BIT(19));
- sunxi_clrsetbits(reg_base + AHCI_PHYCS0R,
+ setbits32(reg_base + AHCI_PHYCS1R, BIT(28) | BIT(15));
+ clrbits32(reg_base + AHCI_PHYCS1R, BIT(19));
+ clrsetbits32(reg_base + AHCI_PHYCS0R,
(0x7 << 20), (0x3 << 20));
- sunxi_clrsetbits(reg_base + AHCI_PHYCS2R,
+ clrsetbits32(reg_base + AHCI_PHYCS2R,
(0x1f << 5), (0x19 << 5));
msleep(5);
- sunxi_setbits(reg_base + AHCI_PHYCS0R, (0x1 << 19));
+ setbits32(reg_base + AHCI_PHYCS0R, (0x1 << 19));
timeout = 250; /* Power up takes aprox 50 us */
do {
@@ -130,7 +103,7 @@ static int ahci_sunxi_phy_init(struct device *dev, void __iomem *reg_base)
udelay(1);
} while (1);
- sunxi_setbits(reg_base + AHCI_PHYCS2R, (0x1 << 24));
+ setbits32(reg_base + AHCI_PHYCS2R, (0x1 << 24));
timeout = 100; /* Calibration takes aprox 10 us */
do {
@@ -158,10 +131,10 @@ static void ahci_sunxi_start_engine(struct ata_port *ap)
struct ahci_host_priv *hpriv = ap->host->private_data;
/* Setup DMA before DMA start */
- sunxi_clrsetbits(hpriv->mmio + AHCI_P0DMACR, 0x0000ff00, 0x00004400);
+ clrsetbits32(hpriv->mmio + AHCI_P0DMACR, 0x0000ff00, 0x00004400);
/* Start DMA */
- sunxi_setbits(port_mmio + PORT_CMD, PORT_CMD_START);
+ setbits32(port_mmio + PORT_CMD, PORT_CMD_START);
}
static const struct ata_port_info ahci_sunxi_port_info = {
--
2.16.4
^ permalink raw reply related
* [PATCH 1/5] powerpc: rename setbits32/clrbits32 to setbits32_be/clrbits32_be
From: Corentin Labbe @ 2018-09-07 19:41 UTC (permalink / raw)
To: Gilles.Muller, Julia.Lawall, agust, alexandre.torgue, alistair,
benh, carlo, davem, galak, joabreu, khilman, maxime.ripard,
michal.lkml, mpe, mporter, nicolas.palix, oss, paulus,
peppe.cavallaro, tj, vitb, wens
Cc: cocci, linux-amlogic, linux-arm-kernel, linux-ide, linux-kernel,
linuxppc-dev, netdev, linux-sunxi, Corentin Labbe
In-Reply-To: <1536349307-20714-1-git-send-email-clabbe@baylibre.com>
Since setbits32/clrbits32 work on be32, it's better to remove ambiguity on
the used data type.
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
---
arch/powerpc/include/asm/fsl_lbc.h | 2 +-
arch/powerpc/include/asm/io.h | 5 +-
arch/powerpc/platforms/44x/canyonlands.c | 4 +-
arch/powerpc/platforms/4xx/gpio.c | 28 ++++-----
arch/powerpc/platforms/512x/pdm360ng.c | 6 +-
arch/powerpc/platforms/52xx/mpc52xx_common.c | 6 +-
arch/powerpc/platforms/52xx/mpc52xx_gpt.c | 10 ++--
arch/powerpc/platforms/82xx/ep8248e.c | 2 +-
arch/powerpc/platforms/82xx/km82xx.c | 6 +-
arch/powerpc/platforms/82xx/mpc8272_ads.c | 10 ++--
arch/powerpc/platforms/82xx/pq2.c | 2 +-
arch/powerpc/platforms/82xx/pq2ads-pci-pic.c | 4 +-
arch/powerpc/platforms/82xx/pq2fads.c | 10 ++--
arch/powerpc/platforms/83xx/km83xx.c | 6 +-
arch/powerpc/platforms/83xx/mpc836x_mds.c | 2 +-
arch/powerpc/platforms/85xx/mpc85xx_mds.c | 2 +-
arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c | 4 +-
arch/powerpc/platforms/85xx/mpc85xx_rdb.c | 2 +-
arch/powerpc/platforms/85xx/p1022_ds.c | 4 +-
arch/powerpc/platforms/85xx/p1022_rdk.c | 4 +-
arch/powerpc/platforms/85xx/t1042rdb_diu.c | 4 +-
arch/powerpc/platforms/85xx/twr_p102x.c | 2 +-
arch/powerpc/platforms/86xx/mpc8610_hpcd.c | 4 +-
arch/powerpc/platforms/8xx/adder875.c | 2 +-
arch/powerpc/platforms/8xx/m8xx_setup.c | 4 +-
arch/powerpc/platforms/8xx/mpc86xads_setup.c | 4 +-
arch/powerpc/platforms/8xx/mpc885ads_setup.c | 28 ++++-----
arch/powerpc/platforms/embedded6xx/flipper-pic.c | 6 +-
arch/powerpc/platforms/embedded6xx/hlwd-pic.c | 8 +--
arch/powerpc/platforms/embedded6xx/wii.c | 10 ++--
arch/powerpc/sysdev/cpm1.c | 26 ++++-----
arch/powerpc/sysdev/cpm2.c | 16 ++---
arch/powerpc/sysdev/cpm_common.c | 4 +-
arch/powerpc/sysdev/fsl_85xx_l2ctlr.c | 8 +--
arch/powerpc/sysdev/fsl_lbc.c | 2 +-
arch/powerpc/sysdev/fsl_pci.c | 8 +--
arch/powerpc/sysdev/fsl_pmc.c | 2 +-
arch/powerpc/sysdev/fsl_rcpm.c | 74 ++++++++++++------------
arch/powerpc/sysdev/fsl_rio.c | 4 +-
arch/powerpc/sysdev/fsl_rmu.c | 8 +--
arch/powerpc/sysdev/mpic_timer.c | 12 ++--
41 files changed, 178 insertions(+), 177 deletions(-)
diff --git a/arch/powerpc/include/asm/fsl_lbc.h b/arch/powerpc/include/asm/fsl_lbc.h
index c7240a024b96..55d7aa0c27cf 100644
--- a/arch/powerpc/include/asm/fsl_lbc.h
+++ b/arch/powerpc/include/asm/fsl_lbc.h
@@ -276,7 +276,7 @@ static inline void fsl_upm_start_pattern(struct fsl_upm *upm, u8 pat_offset)
*/
static inline void fsl_upm_end_pattern(struct fsl_upm *upm)
{
- clrbits32(upm->mxmr, MxMR_OP_RP);
+ clrbits32_be(upm->mxmr, MxMR_OP_RP);
while (in_be32(upm->mxmr) & MxMR_OP_RP)
cpu_relax();
diff --git a/arch/powerpc/include/asm/io.h b/arch/powerpc/include/asm/io.h
index e0331e754568..29ecefd41ecb 100644
--- a/arch/powerpc/include/asm/io.h
+++ b/arch/powerpc/include/asm/io.h
@@ -873,8 +873,8 @@ static inline void * bus_to_virt(unsigned long address)
#endif /* CONFIG_PPC32 */
/* access ports */
-#define setbits32(_addr, _v) out_be32((_addr), in_be32(_addr) | (_v))
-#define clrbits32(_addr, _v) out_be32((_addr), in_be32(_addr) & ~(_v))
+#define setbits32_be(_addr, _v) out_be32((_addr), in_be32(_addr) | (_v))
+#define clrbits32_be(_addr, _v) out_be32((_addr), in_be32(_addr) & ~(_v))
#define setbits16(_addr, _v) out_be16((_addr), in_be16(_addr) | (_v))
#define clrbits16(_addr, _v) out_be16((_addr), in_be16(_addr) & ~(_v))
@@ -904,6 +904,7 @@ static inline void * bus_to_virt(unsigned long address)
#define clrsetbits_le16(addr, clear, set) clrsetbits(le16, addr, clear, set)
#define clrsetbits_8(addr, clear, set) clrsetbits(8, addr, clear, set)
+#define clrsetbits32_be(addr, clear, set) clrsetbits(be32, addr, clear, set)
#endif /* __KERNEL__ */
diff --git a/arch/powerpc/platforms/44x/canyonlands.c b/arch/powerpc/platforms/44x/canyonlands.c
index 157f4ce46386..7145a730769d 100644
--- a/arch/powerpc/platforms/44x/canyonlands.c
+++ b/arch/powerpc/platforms/44x/canyonlands.c
@@ -113,8 +113,8 @@ static int __init ppc460ex_canyonlands_fixup(void)
* USB2HStop and gpio19 will be USB2DStop. For more details refer to
* table 34-7 of PPC460EX user manual.
*/
- setbits32((vaddr + GPIO0_OSRH), 0x42000000);
- setbits32((vaddr + GPIO0_TSRH), 0x42000000);
+ setbits32_be((vaddr + GPIO0_OSRH), 0x42000000);
+ setbits32_be((vaddr + GPIO0_TSRH), 0x42000000);
err_gpio:
iounmap(vaddr);
err_bcsr:
diff --git a/arch/powerpc/platforms/4xx/gpio.c b/arch/powerpc/platforms/4xx/gpio.c
index 2238e369cde4..e84f2d20674e 100644
--- a/arch/powerpc/platforms/4xx/gpio.c
+++ b/arch/powerpc/platforms/4xx/gpio.c
@@ -82,9 +82,9 @@ __ppc4xx_gpio_set(struct gpio_chip *gc, unsigned int gpio, int val)
struct ppc4xx_gpio __iomem *regs = mm_gc->regs;
if (val)
- setbits32(®s->or, GPIO_MASK(gpio));
+ setbits32_be(®s->or, GPIO_MASK(gpio));
else
- clrbits32(®s->or, GPIO_MASK(gpio));
+ clrbits32_be(®s->or, GPIO_MASK(gpio));
}
static void
@@ -112,18 +112,18 @@ static int ppc4xx_gpio_dir_in(struct gpio_chip *gc, unsigned int gpio)
spin_lock_irqsave(&chip->lock, flags);
/* Disable open-drain function */
- clrbits32(®s->odr, GPIO_MASK(gpio));
+ clrbits32_be(®s->odr, GPIO_MASK(gpio));
/* Float the pin */
- clrbits32(®s->tcr, GPIO_MASK(gpio));
+ clrbits32_be(®s->tcr, GPIO_MASK(gpio));
/* Bits 0-15 use TSRL/OSRL, bits 16-31 use TSRH/OSRH */
if (gpio < 16) {
- clrbits32(®s->osrl, GPIO_MASK2(gpio));
- clrbits32(®s->tsrl, GPIO_MASK2(gpio));
+ clrbits32_be(®s->osrl, GPIO_MASK2(gpio));
+ clrbits32_be(®s->tsrl, GPIO_MASK2(gpio));
} else {
- clrbits32(®s->osrh, GPIO_MASK2(gpio));
- clrbits32(®s->tsrh, GPIO_MASK2(gpio));
+ clrbits32_be(®s->osrh, GPIO_MASK2(gpio));
+ clrbits32_be(®s->tsrh, GPIO_MASK2(gpio));
}
spin_unlock_irqrestore(&chip->lock, flags);
@@ -145,18 +145,18 @@ ppc4xx_gpio_dir_out(struct gpio_chip *gc, unsigned int gpio, int val)
__ppc4xx_gpio_set(gc, gpio, val);
/* Disable open-drain function */
- clrbits32(®s->odr, GPIO_MASK(gpio));
+ clrbits32_be(®s->odr, GPIO_MASK(gpio));
/* Drive the pin */
- setbits32(®s->tcr, GPIO_MASK(gpio));
+ setbits32_be(®s->tcr, GPIO_MASK(gpio));
/* Bits 0-15 use TSRL, bits 16-31 use TSRH */
if (gpio < 16) {
- clrbits32(®s->osrl, GPIO_MASK2(gpio));
- clrbits32(®s->tsrl, GPIO_MASK2(gpio));
+ clrbits32_be(®s->osrl, GPIO_MASK2(gpio));
+ clrbits32_be(®s->tsrl, GPIO_MASK2(gpio));
} else {
- clrbits32(®s->osrh, GPIO_MASK2(gpio));
- clrbits32(®s->tsrh, GPIO_MASK2(gpio));
+ clrbits32_be(®s->osrh, GPIO_MASK2(gpio));
+ clrbits32_be(®s->tsrh, GPIO_MASK2(gpio));
}
spin_unlock_irqrestore(&chip->lock, flags);
diff --git a/arch/powerpc/platforms/512x/pdm360ng.c b/arch/powerpc/platforms/512x/pdm360ng.c
index dc81f05e0bce..283925e49096 100644
--- a/arch/powerpc/platforms/512x/pdm360ng.c
+++ b/arch/powerpc/platforms/512x/pdm360ng.c
@@ -38,7 +38,7 @@ static int pdm360ng_get_pendown_state(void)
reg = in_be32(pdm360ng_gpio_base + 0xc);
if (reg & 0x40)
- setbits32(pdm360ng_gpio_base + 0xc, 0x40);
+ setbits32_be(pdm360ng_gpio_base + 0xc, 0x40);
reg = in_be32(pdm360ng_gpio_base + 0x8);
@@ -69,8 +69,8 @@ static int __init pdm360ng_penirq_init(void)
return -ENODEV;
}
out_be32(pdm360ng_gpio_base + 0xc, 0xffffffff);
- setbits32(pdm360ng_gpio_base + 0x18, 0x2000);
- setbits32(pdm360ng_gpio_base + 0x10, 0x40);
+ setbits32_be(pdm360ng_gpio_base + 0x18, 0x2000);
+ setbits32_be(pdm360ng_gpio_base + 0x10, 0x40);
return 0;
}
diff --git a/arch/powerpc/platforms/52xx/mpc52xx_common.c b/arch/powerpc/platforms/52xx/mpc52xx_common.c
index 565e3a83dc9e..8a8b3d79798d 100644
--- a/arch/powerpc/platforms/52xx/mpc52xx_common.c
+++ b/arch/powerpc/platforms/52xx/mpc52xx_common.c
@@ -314,13 +314,13 @@ int mpc5200_psc_ac97_gpio_reset(int psc_number)
/* enable gpio pins for output */
setbits8(&wkup_gpio->wkup_gpioe, reset);
- setbits32(&simple_gpio->simple_gpioe, sync | out);
+ setbits32_be(&simple_gpio->simple_gpioe, sync | out);
setbits8(&wkup_gpio->wkup_ddr, reset);
- setbits32(&simple_gpio->simple_ddr, sync | out);
+ setbits32_be(&simple_gpio->simple_ddr, sync | out);
/* Assert cold reset */
- clrbits32(&simple_gpio->simple_dvo, sync | out);
+ clrbits32_be(&simple_gpio->simple_dvo, sync | out);
clrbits8(&wkup_gpio->wkup_dvo, reset);
/* wait for 1 us */
diff --git a/arch/powerpc/platforms/52xx/mpc52xx_gpt.c b/arch/powerpc/platforms/52xx/mpc52xx_gpt.c
index 17cf249b18ee..88eef86f802c 100644
--- a/arch/powerpc/platforms/52xx/mpc52xx_gpt.c
+++ b/arch/powerpc/platforms/52xx/mpc52xx_gpt.c
@@ -142,7 +142,7 @@ static void mpc52xx_gpt_irq_unmask(struct irq_data *d)
unsigned long flags;
raw_spin_lock_irqsave(&gpt->lock, flags);
- setbits32(&gpt->regs->mode, MPC52xx_GPT_MODE_IRQ_EN);
+ setbits32_be(&gpt->regs->mode, MPC52xx_GPT_MODE_IRQ_EN);
raw_spin_unlock_irqrestore(&gpt->lock, flags);
}
@@ -152,7 +152,7 @@ static void mpc52xx_gpt_irq_mask(struct irq_data *d)
unsigned long flags;
raw_spin_lock_irqsave(&gpt->lock, flags);
- clrbits32(&gpt->regs->mode, MPC52xx_GPT_MODE_IRQ_EN);
+ clrbits32_be(&gpt->regs->mode, MPC52xx_GPT_MODE_IRQ_EN);
raw_spin_unlock_irqrestore(&gpt->lock, flags);
}
@@ -308,7 +308,7 @@ static int mpc52xx_gpt_gpio_dir_in(struct gpio_chip *gc, unsigned int gpio)
dev_dbg(gpt->dev, "%s: gpio:%d\n", __func__, gpio);
raw_spin_lock_irqsave(&gpt->lock, flags);
- clrbits32(&gpt->regs->mode, MPC52xx_GPT_MODE_GPIO_MASK);
+ clrbits32_be(&gpt->regs->mode, MPC52xx_GPT_MODE_GPIO_MASK);
raw_spin_unlock_irqrestore(&gpt->lock, flags);
return 0;
@@ -482,7 +482,7 @@ int mpc52xx_gpt_stop_timer(struct mpc52xx_gpt_priv *gpt)
return -EBUSY;
}
- clrbits32(&gpt->regs->mode, MPC52xx_GPT_MODE_COUNTER_ENABLE);
+ clrbits32_be(&gpt->regs->mode, MPC52xx_GPT_MODE_COUNTER_ENABLE);
raw_spin_unlock_irqrestore(&gpt->lock, flags);
return 0;
}
@@ -639,7 +639,7 @@ static int mpc52xx_wdt_release(struct inode *inode, struct file *file)
unsigned long flags;
raw_spin_lock_irqsave(&gpt_wdt->lock, flags);
- clrbits32(&gpt_wdt->regs->mode,
+ clrbits32_be(&gpt_wdt->regs->mode,
MPC52xx_GPT_MODE_COUNTER_ENABLE | MPC52xx_GPT_MODE_WDT_EN);
gpt_wdt->wdt_mode &= ~MPC52xx_GPT_IS_WDT;
raw_spin_unlock_irqrestore(&gpt_wdt->lock, flags);
diff --git a/arch/powerpc/platforms/82xx/ep8248e.c b/arch/powerpc/platforms/82xx/ep8248e.c
index 8fec050f2d5b..da4fee98085f 100644
--- a/arch/powerpc/platforms/82xx/ep8248e.c
+++ b/arch/powerpc/platforms/82xx/ep8248e.c
@@ -262,7 +262,7 @@ static void __init ep8248e_setup_arch(void)
/* When this is set, snooping CPM DMA from RAM causes
* machine checks. See erratum SIU18.
*/
- clrbits32(&cpm2_immr->im_siu_conf.siu_82xx.sc_bcr, MPC82XX_BCR_PLDP);
+ clrbits32_be(&cpm2_immr->im_siu_conf.siu_82xx.sc_bcr, MPC82XX_BCR_PLDP);
ep8248e_bcsr_node =
of_find_compatible_node(NULL, NULL, "fsl,ep8248e-bcsr");
diff --git a/arch/powerpc/platforms/82xx/km82xx.c b/arch/powerpc/platforms/82xx/km82xx.c
index 28860e40b5db..b5b34f8b1a9b 100644
--- a/arch/powerpc/platforms/82xx/km82xx.c
+++ b/arch/powerpc/platforms/82xx/km82xx.c
@@ -157,9 +157,9 @@ static void __init init_ioports(void)
cpm2_clk_setup(CPM_CLK_FCC2, CPM_CLK14, CPM_CLK_TX);
/* Force USB FULL SPEED bit to '1' */
- setbits32(&cpm2_immr->im_ioport.iop_pdata, 1 << (31 - 10));
+ setbits32_be(&cpm2_immr->im_ioport.iop_pdata, 1 << (31 - 10));
/* clear USB_SLAVE */
- clrbits32(&cpm2_immr->im_ioport.iop_pdata, 1 << (31 - 11));
+ clrbits32_be(&cpm2_immr->im_ioport.iop_pdata, 1 << (31 - 11));
}
static void __init km82xx_setup_arch(void)
@@ -172,7 +172,7 @@ static void __init km82xx_setup_arch(void)
/* When this is set, snooping CPM DMA from RAM causes
* machine checks. See erratum SIU18.
*/
- clrbits32(&cpm2_immr->im_siu_conf.siu_82xx.sc_bcr, MPC82XX_BCR_PLDP);
+ clrbits32_be(&cpm2_immr->im_siu_conf.siu_82xx.sc_bcr, MPC82XX_BCR_PLDP);
init_ioports();
diff --git a/arch/powerpc/platforms/82xx/mpc8272_ads.c b/arch/powerpc/platforms/82xx/mpc8272_ads.c
index d23c10a96bde..a9c8cd13a4b5 100644
--- a/arch/powerpc/platforms/82xx/mpc8272_ads.c
+++ b/arch/powerpc/platforms/82xx/mpc8272_ads.c
@@ -164,13 +164,13 @@ static void __init mpc8272_ads_setup_arch(void)
#define BCSR3_FETHIEN2 0x10000000
#define BCSR3_FETH2_RST 0x08000000
- clrbits32(&bcsr[1], BCSR1_RS232_EN1 | BCSR1_RS232_EN2 | BCSR1_FETHIEN);
- setbits32(&bcsr[1], BCSR1_FETH_RST);
+ clrbits32_be(&bcsr[1], BCSR1_RS232_EN1 | BCSR1_RS232_EN2 | BCSR1_FETHIEN);
+ setbits32_be(&bcsr[1], BCSR1_FETH_RST);
- clrbits32(&bcsr[3], BCSR3_FETHIEN2);
- setbits32(&bcsr[3], BCSR3_FETH2_RST);
+ clrbits32_be(&bcsr[3], BCSR3_FETHIEN2);
+ setbits32_be(&bcsr[3], BCSR3_FETH2_RST);
- clrbits32(&bcsr[3], BCSR3_USB_nEN);
+ clrbits32_be(&bcsr[3], BCSR3_USB_nEN);
iounmap(bcsr);
diff --git a/arch/powerpc/platforms/82xx/pq2.c b/arch/powerpc/platforms/82xx/pq2.c
index c4f7029fc9ae..43a9a948f064 100644
--- a/arch/powerpc/platforms/82xx/pq2.c
+++ b/arch/powerpc/platforms/82xx/pq2.c
@@ -25,7 +25,7 @@
void __noreturn pq2_restart(char *cmd)
{
local_irq_disable();
- setbits32(&cpm2_immr->im_clkrst.car_rmr, RMR_CSRE);
+ setbits32_be(&cpm2_immr->im_clkrst.car_rmr, RMR_CSRE);
/* Clear the ME,EE,IR & DR bits in MSR to cause checkstop */
mtmsr(mfmsr() & ~(MSR_ME | MSR_EE | MSR_IR | MSR_DR));
diff --git a/arch/powerpc/platforms/82xx/pq2ads-pci-pic.c b/arch/powerpc/platforms/82xx/pq2ads-pci-pic.c
index 8b065bdf7412..b691de4c580a 100644
--- a/arch/powerpc/platforms/82xx/pq2ads-pci-pic.c
+++ b/arch/powerpc/platforms/82xx/pq2ads-pci-pic.c
@@ -47,7 +47,7 @@ static void pq2ads_pci_mask_irq(struct irq_data *d)
unsigned long flags;
raw_spin_lock_irqsave(&pci_pic_lock, flags);
- setbits32(&priv->regs->mask, 1 << irq);
+ setbits32_be(&priv->regs->mask, 1 << irq);
mb();
raw_spin_unlock_irqrestore(&pci_pic_lock, flags);
@@ -63,7 +63,7 @@ static void pq2ads_pci_unmask_irq(struct irq_data *d)
unsigned long flags;
raw_spin_lock_irqsave(&pci_pic_lock, flags);
- clrbits32(&priv->regs->mask, 1 << irq);
+ clrbits32_be(&priv->regs->mask, 1 << irq);
raw_spin_unlock_irqrestore(&pci_pic_lock, flags);
}
}
diff --git a/arch/powerpc/platforms/82xx/pq2fads.c b/arch/powerpc/platforms/82xx/pq2fads.c
index 6c654dc74a4b..05e9c743712f 100644
--- a/arch/powerpc/platforms/82xx/pq2fads.c
+++ b/arch/powerpc/platforms/82xx/pq2fads.c
@@ -140,18 +140,18 @@ static void __init pq2fads_setup_arch(void)
/* Enable the serial and ethernet ports */
- clrbits32(&bcsr[1], BCSR1_RS232_EN1 | BCSR1_RS232_EN2 | BCSR1_FETHIEN);
- setbits32(&bcsr[1], BCSR1_FETH_RST);
+ clrbits32_be(&bcsr[1], BCSR1_RS232_EN1 | BCSR1_RS232_EN2 | BCSR1_FETHIEN);
+ setbits32_be(&bcsr[1], BCSR1_FETH_RST);
- clrbits32(&bcsr[3], BCSR3_FETHIEN2);
- setbits32(&bcsr[3], BCSR3_FETH2_RST);
+ clrbits32_be(&bcsr[3], BCSR3_FETHIEN2);
+ setbits32_be(&bcsr[3], BCSR3_FETH2_RST);
iounmap(bcsr);
init_ioports();
/* Enable external IRQs */
- clrbits32(&cpm2_immr->im_siu_conf.siu_82xx.sc_siumcr, 0x0c000000);
+ clrbits32_be(&cpm2_immr->im_siu_conf.siu_82xx.sc_siumcr, 0x0c000000);
pq2_init_pci();
diff --git a/arch/powerpc/platforms/83xx/km83xx.c b/arch/powerpc/platforms/83xx/km83xx.c
index d8642a4afc74..d13f11aac111 100644
--- a/arch/powerpc/platforms/83xx/km83xx.c
+++ b/arch/powerpc/platforms/83xx/km83xx.c
@@ -101,19 +101,19 @@ static void quirk_mpc8360e_qe_enet10(void)
* UCC1: write 0b11 to bits 18:19
* at address IMMRBAR+0x14A8
*/
- setbits32((base + 0xa8), 0x00003000);
+ setbits32_be((base + 0xa8), 0x00003000);
/*
* UCC2 option 1: write 0b11 to bits 4:5
* at address IMMRBAR+0x14A8
*/
- setbits32((base + 0xa8), 0x0c000000);
+ setbits32_be((base + 0xa8), 0x0c000000);
/*
* UCC2 option 2: write 0b11 to bits 16:17
* at address IMMRBAR+0x14AC
*/
- setbits32((base + 0xac), 0x0000c000);
+ setbits32_be((base + 0xac), 0x0000c000);
}
iounmap(base);
of_node_put(np_par);
diff --git a/arch/powerpc/platforms/83xx/mpc836x_mds.c b/arch/powerpc/platforms/83xx/mpc836x_mds.c
index fd44dd03e1f3..56e638fdbbc5 100644
--- a/arch/powerpc/platforms/83xx/mpc836x_mds.c
+++ b/arch/powerpc/platforms/83xx/mpc836x_mds.c
@@ -118,7 +118,7 @@ static void __init mpc836x_mds_setup_arch(void)
* IMMR + 0x14A8[4:5] = 11 (clk delay for UCC 2)
* IMMR + 0x14A8[18:19] = 11 (clk delay for UCC 1)
*/
- setbits32(immap, 0x0c003000);
+ setbits32_be(immap, 0x0c003000);
/*
* IMMR + 0x14AC[20:27] = 10101010
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_mds.c b/arch/powerpc/platforms/85xx/mpc85xx_mds.c
index d7e440e6dba3..06c18149dc5a 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_mds.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_mds.c
@@ -262,7 +262,7 @@ static void __init mpc85xx_mds_qe_init(void)
* and QE12 for QE MII management signals in PMUXCR
* register.
*/
- setbits32(&guts->pmuxcr, MPC85xx_PMUXCR_QE(0) |
+ setbits32_be(&guts->pmuxcr, MPC85xx_PMUXCR_QE(0) |
MPC85xx_PMUXCR_QE(3) |
MPC85xx_PMUXCR_QE(9) |
MPC85xx_PMUXCR_QE(12));
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c b/arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c
index f05325f0cc03..b1bb81a49a7f 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c
@@ -60,9 +60,9 @@ static void mpc85xx_freeze_time_base(bool freeze)
mask = CCSR_GUTS_DEVDISR_TB0 | CCSR_GUTS_DEVDISR_TB1;
if (freeze)
- setbits32(&guts->devdisr, mask);
+ setbits32_be(&guts->devdisr, mask);
else
- clrbits32(&guts->devdisr, mask);
+ clrbits32_be(&guts->devdisr, mask);
in_be32(&guts->devdisr);
}
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
index 10069503e39f..13ae0b12dd5a 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
@@ -115,7 +115,7 @@ static void __init mpc85xx_rdb_setup_arch(void)
* and QE12 for QE MII management singals in PMUXCR
* register.
*/
- setbits32(&guts->pmuxcr, MPC85xx_PMUXCR_QE(0) |
+ setbits32_be(&guts->pmuxcr, MPC85xx_PMUXCR_QE(0) |
MPC85xx_PMUXCR_QE(3) |
MPC85xx_PMUXCR_QE(9) |
MPC85xx_PMUXCR_QE(12));
diff --git a/arch/powerpc/platforms/85xx/p1022_ds.c b/arch/powerpc/platforms/85xx/p1022_ds.c
index 9fb57f78cdbe..adb7abdd291f 100644
--- a/arch/powerpc/platforms/85xx/p1022_ds.c
+++ b/arch/powerpc/platforms/85xx/p1022_ds.c
@@ -405,11 +405,11 @@ void p1022ds_set_pixel_clock(unsigned int pixclock)
pxclk = clamp_t(u32, pxclk, 2, 255);
/* Disable the pixel clock, and set it to non-inverted and no delay */
- clrbits32(&guts->clkdvdr,
+ clrbits32_be(&guts->clkdvdr,
CLKDVDR_PXCKEN | CLKDVDR_PXCKDLY | CLKDVDR_PXCLK_MASK);
/* Enable the clock and set the pxclk */
- setbits32(&guts->clkdvdr, CLKDVDR_PXCKEN | (pxclk << 16));
+ setbits32_be(&guts->clkdvdr, CLKDVDR_PXCKEN | (pxclk << 16));
iounmap(guts);
}
diff --git a/arch/powerpc/platforms/85xx/p1022_rdk.c b/arch/powerpc/platforms/85xx/p1022_rdk.c
index 276e00ab3dde..97698230f031 100644
--- a/arch/powerpc/platforms/85xx/p1022_rdk.c
+++ b/arch/powerpc/platforms/85xx/p1022_rdk.c
@@ -75,11 +75,11 @@ void p1022rdk_set_pixel_clock(unsigned int pixclock)
pxclk = clamp_t(u32, pxclk, 2, 255);
/* Disable the pixel clock, and set it to non-inverted and no delay */
- clrbits32(&guts->clkdvdr,
+ clrbits32_be(&guts->clkdvdr,
CLKDVDR_PXCKEN | CLKDVDR_PXCKDLY | CLKDVDR_PXCLK_MASK);
/* Enable the clock and set the pxclk */
- setbits32(&guts->clkdvdr, CLKDVDR_PXCKEN | (pxclk << 16));
+ setbits32_be(&guts->clkdvdr, CLKDVDR_PXCKEN | (pxclk << 16));
iounmap(guts);
}
diff --git a/arch/powerpc/platforms/85xx/t1042rdb_diu.c b/arch/powerpc/platforms/85xx/t1042rdb_diu.c
index dac36ba82fea..c11f95711a8a 100644
--- a/arch/powerpc/platforms/85xx/t1042rdb_diu.c
+++ b/arch/powerpc/platforms/85xx/t1042rdb_diu.c
@@ -114,11 +114,11 @@ static void t1042rdb_set_pixel_clock(unsigned int pixclock)
pxclk = clamp_t(u32, pxclk, 2, 255);
/* Disable the pixel clock, and set it to non-inverted and no delay */
- clrbits32(scfg + CCSR_SCFG_PIXCLKCR,
+ clrbits32_be(scfg + CCSR_SCFG_PIXCLKCR,
PIXCLKCR_PXCKEN | PIXCLKCR_PXCKDLY | PIXCLKCR_PXCLK_MASK);
/* Enable the clock and set the pxclk */
- setbits32(scfg + CCSR_SCFG_PIXCLKCR, PIXCLKCR_PXCKEN | (pxclk << 16));
+ setbits32_be(scfg + CCSR_SCFG_PIXCLKCR, PIXCLKCR_PXCKEN | (pxclk << 16));
iounmap(scfg);
}
diff --git a/arch/powerpc/platforms/85xx/twr_p102x.c b/arch/powerpc/platforms/85xx/twr_p102x.c
index 360f6253e9ff..b678ee2665d0 100644
--- a/arch/powerpc/platforms/85xx/twr_p102x.c
+++ b/arch/powerpc/platforms/85xx/twr_p102x.c
@@ -95,7 +95,7 @@ static void __init twr_p1025_setup_arch(void)
* and QE12 for QE MII management signals in PMUXCR
* register.
* Set QE mux bits in PMUXCR */
- setbits32(&guts->pmuxcr, MPC85xx_PMUXCR_QE(0) |
+ setbits32_be(&guts->pmuxcr, MPC85xx_PMUXCR_QE(0) |
MPC85xx_PMUXCR_QE(3) |
MPC85xx_PMUXCR_QE(9) |
MPC85xx_PMUXCR_QE(12));
diff --git a/arch/powerpc/platforms/86xx/mpc8610_hpcd.c b/arch/powerpc/platforms/86xx/mpc8610_hpcd.c
index a5d73fabe4d1..78472179b05a 100644
--- a/arch/powerpc/platforms/86xx/mpc8610_hpcd.c
+++ b/arch/powerpc/platforms/86xx/mpc8610_hpcd.c
@@ -261,11 +261,11 @@ void mpc8610hpcd_set_pixel_clock(unsigned int pixclock)
pxclk = clamp_t(u32, pxclk, 2, 31);
/* Disable the pixel clock, and set it to non-inverted and no delay */
- clrbits32(&guts->clkdvdr,
+ clrbits32_be(&guts->clkdvdr,
CLKDVDR_PXCKEN | CLKDVDR_PXCKDLY | CLKDVDR_PXCLK_MASK);
/* Enable the clock and set the pxclk */
- setbits32(&guts->clkdvdr, CLKDVDR_PXCKEN | (pxclk << 16));
+ setbits32_be(&guts->clkdvdr, CLKDVDR_PXCKEN | (pxclk << 16));
iounmap(guts);
}
diff --git a/arch/powerpc/platforms/8xx/adder875.c b/arch/powerpc/platforms/8xx/adder875.c
index bcef9f66191e..d21d0b8fd2a7 100644
--- a/arch/powerpc/platforms/8xx/adder875.c
+++ b/arch/powerpc/platforms/8xx/adder875.c
@@ -77,7 +77,7 @@ static void __init init_ioports(void)
cpm1_clk_setup(CPM_CLK_SMC1, CPM_BRG1, CPM_CLK_RTX);
/* Set FEC1 and FEC2 to MII mode */
- clrbits32(&mpc8xx_immr->im_cpm.cp_cptr, 0x00000180);
+ clrbits32_be(&mpc8xx_immr->im_cpm.cp_cptr, 0x00000180);
}
static void __init adder875_setup(void)
diff --git a/arch/powerpc/platforms/8xx/m8xx_setup.c b/arch/powerpc/platforms/8xx/m8xx_setup.c
index 027c42d8966c..2ed24abd0b40 100644
--- a/arch/powerpc/platforms/8xx/m8xx_setup.c
+++ b/arch/powerpc/platforms/8xx/m8xx_setup.c
@@ -103,7 +103,7 @@ void __init mpc8xx_calibrate_decr(void)
/* Force all 8xx processors to use divide by 16 processor clock. */
clk_r2 = immr_map(im_clkrst);
- setbits32(&clk_r2->car_sccr, 0x02000000);
+ setbits32_be(&clk_r2->car_sccr, 0x02000000);
immr_unmap(clk_r2);
/* Processor frequency is MHz.
@@ -203,7 +203,7 @@ void __noreturn mpc8xx_restart(char *cmd)
local_irq_disable();
- setbits32(&clk_r->car_plprcr, 0x00000080);
+ setbits32_be(&clk_r->car_plprcr, 0x00000080);
/* Clear the ME bit in MSR to cause checkstop on machine check
*/
mtmsr(mfmsr() & ~0x1000);
diff --git a/arch/powerpc/platforms/8xx/mpc86xads_setup.c b/arch/powerpc/platforms/8xx/mpc86xads_setup.c
index 8d02f5ff4481..a25e5ab15d65 100644
--- a/arch/powerpc/platforms/8xx/mpc86xads_setup.c
+++ b/arch/powerpc/platforms/8xx/mpc86xads_setup.c
@@ -87,7 +87,7 @@ static void __init init_ioports(void)
cpm1_clk_setup(CPM_CLK_SCC1, CPM_CLK2, CPM_CLK_RX);
/* Set FEC1 and FEC2 to MII mode */
- clrbits32(&mpc8xx_immr->im_cpm.cp_cptr, 0x00000180);
+ clrbits32_be(&mpc8xx_immr->im_cpm.cp_cptr, 0x00000180);
}
static void __init mpc86xads_setup_arch(void)
@@ -112,7 +112,7 @@ static void __init mpc86xads_setup_arch(void)
return;
}
- clrbits32(bcsr_io, BCSR1_RS232EN_1 | BCSR1_RS232EN_2 | BCSR1_ETHEN);
+ clrbits32_be(bcsr_io, BCSR1_RS232EN_1 | BCSR1_RS232EN_2 | BCSR1_ETHEN);
iounmap(bcsr_io);
}
diff --git a/arch/powerpc/platforms/8xx/mpc885ads_setup.c b/arch/powerpc/platforms/8xx/mpc885ads_setup.c
index a0c83c1905c6..8aad0fb9090b 100644
--- a/arch/powerpc/platforms/8xx/mpc885ads_setup.c
+++ b/arch/powerpc/platforms/8xx/mpc885ads_setup.c
@@ -123,7 +123,7 @@ static void __init init_ioports(void)
cpm1_clk_setup(CPM_CLK_SCC3, CPM_CLK6, CPM_CLK_RX);
/* Set FEC1 and FEC2 to MII mode */
- clrbits32(&mpc8xx_immr->im_cpm.cp_cptr, 0x00000180);
+ clrbits32_be(&mpc8xx_immr->im_cpm.cp_cptr, 0x00000180);
}
static void __init mpc885ads_setup_arch(void)
@@ -148,33 +148,33 @@ static void __init mpc885ads_setup_arch(void)
return;
}
- clrbits32(&bcsr[1], BCSR1_RS232EN_1);
+ clrbits32_be(&bcsr[1], BCSR1_RS232EN_1);
#ifdef CONFIG_MPC8xx_SECOND_ETH_FEC2
- setbits32(&bcsr[1], BCSR1_RS232EN_2);
+ setbits32_be(&bcsr[1], BCSR1_RS232EN_2);
#else
- clrbits32(&bcsr[1], BCSR1_RS232EN_2);
+ clrbits32_be(&bcsr[1], BCSR1_RS232EN_2);
#endif
- clrbits32(bcsr5, BCSR5_MII1_EN);
- setbits32(bcsr5, BCSR5_MII1_RST);
+ clrbits32_be(bcsr5, BCSR5_MII1_EN);
+ setbits32_be(bcsr5, BCSR5_MII1_RST);
udelay(1000);
- clrbits32(bcsr5, BCSR5_MII1_RST);
+ clrbits32_be(bcsr5, BCSR5_MII1_RST);
#ifdef CONFIG_MPC8xx_SECOND_ETH_FEC2
- clrbits32(bcsr5, BCSR5_MII2_EN);
- setbits32(bcsr5, BCSR5_MII2_RST);
+ clrbits32_be(bcsr5, BCSR5_MII2_EN);
+ setbits32_be(bcsr5, BCSR5_MII2_RST);
udelay(1000);
- clrbits32(bcsr5, BCSR5_MII2_RST);
+ clrbits32_be(bcsr5, BCSR5_MII2_RST);
#else
- setbits32(bcsr5, BCSR5_MII2_EN);
+ setbits32_be(bcsr5, BCSR5_MII2_EN);
#endif
#ifdef CONFIG_MPC8xx_SECOND_ETH_SCC3
- clrbits32(&bcsr[4], BCSR4_ETH10_RST);
+ clrbits32_be(&bcsr[4], BCSR4_ETH10_RST);
udelay(1000);
- setbits32(&bcsr[4], BCSR4_ETH10_RST);
+ setbits32_be(&bcsr[4], BCSR4_ETH10_RST);
- setbits32(&bcsr[1], BCSR1_ETHEN);
+ setbits32_be(&bcsr[1], BCSR1_ETHEN);
np = of_find_node_by_path("/soc@ff000000/cpm@9c0/serial@a80");
#else
diff --git a/arch/powerpc/platforms/embedded6xx/flipper-pic.c b/arch/powerpc/platforms/embedded6xx/flipper-pic.c
index db0be007fd06..6df4533aa851 100644
--- a/arch/powerpc/platforms/embedded6xx/flipper-pic.c
+++ b/arch/powerpc/platforms/embedded6xx/flipper-pic.c
@@ -53,7 +53,7 @@ static void flipper_pic_mask_and_ack(struct irq_data *d)
void __iomem *io_base = irq_data_get_irq_chip_data(d);
u32 mask = 1 << irq;
- clrbits32(io_base + FLIPPER_IMR, mask);
+ clrbits32_be(io_base + FLIPPER_IMR, mask);
/* this is at least needed for RSW */
out_be32(io_base + FLIPPER_ICR, mask);
}
@@ -72,7 +72,7 @@ static void flipper_pic_mask(struct irq_data *d)
int irq = irqd_to_hwirq(d);
void __iomem *io_base = irq_data_get_irq_chip_data(d);
- clrbits32(io_base + FLIPPER_IMR, 1 << irq);
+ clrbits32_be(io_base + FLIPPER_IMR, 1 << irq);
}
static void flipper_pic_unmask(struct irq_data *d)
@@ -80,7 +80,7 @@ static void flipper_pic_unmask(struct irq_data *d)
int irq = irqd_to_hwirq(d);
void __iomem *io_base = irq_data_get_irq_chip_data(d);
- setbits32(io_base + FLIPPER_IMR, 1 << irq);
+ setbits32_be(io_base + FLIPPER_IMR, 1 << irq);
}
diff --git a/arch/powerpc/platforms/embedded6xx/hlwd-pic.c b/arch/powerpc/platforms/embedded6xx/hlwd-pic.c
index 8112b39879d6..5487710bed1c 100644
--- a/arch/powerpc/platforms/embedded6xx/hlwd-pic.c
+++ b/arch/powerpc/platforms/embedded6xx/hlwd-pic.c
@@ -50,7 +50,7 @@ static void hlwd_pic_mask_and_ack(struct irq_data *d)
void __iomem *io_base = irq_data_get_irq_chip_data(d);
u32 mask = 1 << irq;
- clrbits32(io_base + HW_BROADWAY_IMR, mask);
+ clrbits32_be(io_base + HW_BROADWAY_IMR, mask);
out_be32(io_base + HW_BROADWAY_ICR, mask);
}
@@ -67,7 +67,7 @@ static void hlwd_pic_mask(struct irq_data *d)
int irq = irqd_to_hwirq(d);
void __iomem *io_base = irq_data_get_irq_chip_data(d);
- clrbits32(io_base + HW_BROADWAY_IMR, 1 << irq);
+ clrbits32_be(io_base + HW_BROADWAY_IMR, 1 << irq);
}
static void hlwd_pic_unmask(struct irq_data *d)
@@ -75,10 +75,10 @@ static void hlwd_pic_unmask(struct irq_data *d)
int irq = irqd_to_hwirq(d);
void __iomem *io_base = irq_data_get_irq_chip_data(d);
- setbits32(io_base + HW_BROADWAY_IMR, 1 << irq);
+ setbits32_be(io_base + HW_BROADWAY_IMR, 1 << irq);
/* Make sure the ARM (aka. Starlet) doesn't handle this interrupt. */
- clrbits32(io_base + HW_STARLET_IMR, 1 << irq);
+ clrbits32_be(io_base + HW_STARLET_IMR, 1 << irq);
}
diff --git a/arch/powerpc/platforms/embedded6xx/wii.c b/arch/powerpc/platforms/embedded6xx/wii.c
index 403523c061ba..dd511e19147a 100644
--- a/arch/powerpc/platforms/embedded6xx/wii.c
+++ b/arch/powerpc/platforms/embedded6xx/wii.c
@@ -134,7 +134,7 @@ static void __init wii_setup_arch(void)
hw_gpio = wii_ioremap_hw_regs("hw_gpio", HW_GPIO_COMPATIBLE);
if (hw_gpio) {
/* turn off the front blue led and IR light */
- clrbits32(hw_gpio + HW_GPIO_OUT(0),
+ clrbits32_be(hw_gpio + HW_GPIO_OUT(0),
HW_GPIO_SLOT_LED | HW_GPIO_SENSOR_BAR);
}
}
@@ -145,7 +145,7 @@ static void __noreturn wii_restart(char *cmd)
if (hw_ctrl) {
/* clear the system reset pin to cause a reset */
- clrbits32(hw_ctrl + HW_CTRL_RESETS, HW_CTRL_RESETS_SYS);
+ clrbits32_be(hw_ctrl + HW_CTRL_RESETS, HW_CTRL_RESETS_SYS);
}
wii_spin();
}
@@ -159,13 +159,13 @@ static void wii_power_off(void)
* set the owner of the shutdown pin to ARM, because it is
* accessed through the registers for the ARM, below
*/
- clrbits32(hw_gpio + HW_GPIO_OWNER, HW_GPIO_SHUTDOWN);
+ clrbits32_be(hw_gpio + HW_GPIO_OWNER, HW_GPIO_SHUTDOWN);
/* make sure that the poweroff GPIO is configured as output */
- setbits32(hw_gpio + HW_GPIO_DIR(1), HW_GPIO_SHUTDOWN);
+ setbits32_be(hw_gpio + HW_GPIO_DIR(1), HW_GPIO_SHUTDOWN);
/* drive the poweroff GPIO high */
- setbits32(hw_gpio + HW_GPIO_OUT(1), HW_GPIO_SHUTDOWN);
+ setbits32_be(hw_gpio + HW_GPIO_OUT(1), HW_GPIO_SHUTDOWN);
}
wii_spin();
}
diff --git a/arch/powerpc/sysdev/cpm1.c b/arch/powerpc/sysdev/cpm1.c
index 4f8dcf124828..9de5f13c51cb 100644
--- a/arch/powerpc/sysdev/cpm1.c
+++ b/arch/powerpc/sysdev/cpm1.c
@@ -60,14 +60,14 @@ static void cpm_mask_irq(struct irq_data *d)
{
unsigned int cpm_vec = (unsigned int)irqd_to_hwirq(d);
- clrbits32(&cpic_reg->cpic_cimr, (1 << cpm_vec));
+ clrbits32_be(&cpic_reg->cpic_cimr, (1 << cpm_vec));
}
static void cpm_unmask_irq(struct irq_data *d)
{
unsigned int cpm_vec = (unsigned int)irqd_to_hwirq(d);
- setbits32(&cpic_reg->cpic_cimr, (1 << cpm_vec));
+ setbits32_be(&cpic_reg->cpic_cimr, (1 << cpm_vec));
}
static void cpm_end_irq(struct irq_data *d)
@@ -188,7 +188,7 @@ unsigned int cpm_pic_init(void)
if (setup_irq(eirq, &cpm_error_irqaction))
printk(KERN_ERR "Could not allocate CPM error IRQ!");
- setbits32(&cpic_reg->cpic_cicr, CICR_IEN);
+ setbits32_be(&cpic_reg->cpic_cicr, CICR_IEN);
end:
of_node_put(np);
@@ -317,14 +317,14 @@ static void cpm1_set_pin32(int port, int pin, int flags)
&mpc8xx_immr->im_cpm.cp_pedir;
if (flags & CPM_PIN_OUTPUT)
- setbits32(&iop->dir, pin);
+ setbits32_be(&iop->dir, pin);
else
- clrbits32(&iop->dir, pin);
+ clrbits32_be(&iop->dir, pin);
if (!(flags & CPM_PIN_GPIO))
- setbits32(&iop->par, pin);
+ setbits32_be(&iop->par, pin);
else
- clrbits32(&iop->par, pin);
+ clrbits32_be(&iop->par, pin);
if (port == CPM_PORTB) {
if (flags & CPM_PIN_OPENDRAIN)
@@ -335,14 +335,14 @@ static void cpm1_set_pin32(int port, int pin, int flags)
if (port == CPM_PORTE) {
if (flags & CPM_PIN_SECONDARY)
- setbits32(&iop->sor, pin);
+ setbits32_be(&iop->sor, pin);
else
- clrbits32(&iop->sor, pin);
+ clrbits32_be(&iop->sor, pin);
if (flags & CPM_PIN_OPENDRAIN)
- setbits32(&mpc8xx_immr->im_cpm.cp_peodr, pin);
+ setbits32_be(&mpc8xx_immr->im_cpm.cp_peodr, pin);
else
- clrbits32(&mpc8xx_immr->im_cpm.cp_peodr, pin);
+ clrbits32_be(&mpc8xx_immr->im_cpm.cp_peodr, pin);
}
}
@@ -732,7 +732,7 @@ static int cpm1_gpio32_dir_out(struct gpio_chip *gc, unsigned int gpio, int val)
spin_lock_irqsave(&cpm1_gc->lock, flags);
- setbits32(&iop->dir, pin_mask);
+ setbits32_be(&iop->dir, pin_mask);
__cpm1_gpio32_set(mm_gc, pin_mask, val);
spin_unlock_irqrestore(&cpm1_gc->lock, flags);
@@ -750,7 +750,7 @@ static int cpm1_gpio32_dir_in(struct gpio_chip *gc, unsigned int gpio)
spin_lock_irqsave(&cpm1_gc->lock, flags);
- clrbits32(&iop->dir, pin_mask);
+ clrbits32_be(&iop->dir, pin_mask);
spin_unlock_irqrestore(&cpm1_gc->lock, flags);
diff --git a/arch/powerpc/sysdev/cpm2.c b/arch/powerpc/sysdev/cpm2.c
index 07718b9a2c99..445d6e45a6de 100644
--- a/arch/powerpc/sysdev/cpm2.c
+++ b/arch/powerpc/sysdev/cpm2.c
@@ -335,22 +335,22 @@ void cpm2_set_pin(int port, int pin, int flags)
pin = 1 << (31 - pin);
if (flags & CPM_PIN_OUTPUT)
- setbits32(&iop[port].dir, pin);
+ setbits32_be(&iop[port].dir, pin);
else
- clrbits32(&iop[port].dir, pin);
+ clrbits32_be(&iop[port].dir, pin);
if (!(flags & CPM_PIN_GPIO))
- setbits32(&iop[port].par, pin);
+ setbits32_be(&iop[port].par, pin);
else
- clrbits32(&iop[port].par, pin);
+ clrbits32_be(&iop[port].par, pin);
if (flags & CPM_PIN_SECONDARY)
- setbits32(&iop[port].sor, pin);
+ setbits32_be(&iop[port].sor, pin);
else
- clrbits32(&iop[port].sor, pin);
+ clrbits32_be(&iop[port].sor, pin);
if (flags & CPM_PIN_OPENDRAIN)
- setbits32(&iop[port].odr, pin);
+ setbits32_be(&iop[port].odr, pin);
else
- clrbits32(&iop[port].odr, pin);
+ clrbits32_be(&iop[port].odr, pin);
}
diff --git a/arch/powerpc/sysdev/cpm_common.c b/arch/powerpc/sysdev/cpm_common.c
index b74508175b67..d36a95708aaf 100644
--- a/arch/powerpc/sysdev/cpm_common.c
+++ b/arch/powerpc/sysdev/cpm_common.c
@@ -165,7 +165,7 @@ static int cpm2_gpio32_dir_out(struct gpio_chip *gc, unsigned int gpio, int val)
spin_lock_irqsave(&cpm2_gc->lock, flags);
- setbits32(&iop->dir, pin_mask);
+ setbits32_be(&iop->dir, pin_mask);
__cpm2_gpio32_set(mm_gc, pin_mask, val);
spin_unlock_irqrestore(&cpm2_gc->lock, flags);
@@ -183,7 +183,7 @@ static int cpm2_gpio32_dir_in(struct gpio_chip *gc, unsigned int gpio)
spin_lock_irqsave(&cpm2_gc->lock, flags);
- clrbits32(&iop->dir, pin_mask);
+ clrbits32_be(&iop->dir, pin_mask);
spin_unlock_irqrestore(&cpm2_gc->lock, flags);
diff --git a/arch/powerpc/sysdev/fsl_85xx_l2ctlr.c b/arch/powerpc/sysdev/fsl_85xx_l2ctlr.c
index c27058e5df26..2b7e2b4a2543 100644
--- a/arch/powerpc/sysdev/fsl_85xx_l2ctlr.c
+++ b/arch/powerpc/sysdev/fsl_85xx_l2ctlr.c
@@ -124,23 +124,23 @@ static int mpc85xx_l2ctlr_of_probe(struct platform_device *dev)
switch (ways) {
case LOCK_WAYS_EIGHTH:
- setbits32(&l2ctlr->ctl,
+ setbits32_be(&l2ctlr->ctl,
L2CR_L2E | L2CR_L2FI | L2CR_SRAM_EIGHTH);
break;
case LOCK_WAYS_TWO_EIGHTH:
- setbits32(&l2ctlr->ctl,
+ setbits32_be(&l2ctlr->ctl,
L2CR_L2E | L2CR_L2FI | L2CR_SRAM_QUART);
break;
case LOCK_WAYS_HALF:
- setbits32(&l2ctlr->ctl,
+ setbits32_be(&l2ctlr->ctl,
L2CR_L2E | L2CR_L2FI | L2CR_SRAM_HALF);
break;
case LOCK_WAYS_FULL:
default:
- setbits32(&l2ctlr->ctl,
+ setbits32_be(&l2ctlr->ctl,
L2CR_L2E | L2CR_L2FI | L2CR_SRAM_FULL);
break;
}
diff --git a/arch/powerpc/sysdev/fsl_lbc.c b/arch/powerpc/sysdev/fsl_lbc.c
index 5340a483cf55..994233e41b91 100644
--- a/arch/powerpc/sysdev/fsl_lbc.c
+++ b/arch/powerpc/sysdev/fsl_lbc.c
@@ -192,7 +192,7 @@ static int fsl_lbc_ctrl_init(struct fsl_lbc_ctrl *ctrl,
struct fsl_lbc_regs __iomem *lbc = ctrl->regs;
/* clear event registers */
- setbits32(&lbc->ltesr, LTESR_CLEAR);
+ setbits32_be(&lbc->ltesr, LTESR_CLEAR);
out_be32(&lbc->lteatr, 0);
out_be32(&lbc->ltear, 0);
out_be32(&lbc->lteccr, LTECCR_CLEAR);
diff --git a/arch/powerpc/sysdev/fsl_pci.c b/arch/powerpc/sysdev/fsl_pci.c
index 918be816b097..17aa5ee63d34 100644
--- a/arch/powerpc/sysdev/fsl_pci.c
+++ b/arch/powerpc/sysdev/fsl_pci.c
@@ -1196,11 +1196,11 @@ static int fsl_pci_pme_probe(struct pci_controller *hose)
pci = hose->private_data;
/* Enable PTOD, ENL23D & EXL23D */
- clrbits32(&pci->pex_pme_mes_disr,
+ clrbits32_be(&pci->pex_pme_mes_disr,
PME_DISR_EN_PTOD | PME_DISR_EN_ENL23D | PME_DISR_EN_EXL23D);
out_be32(&pci->pex_pme_mes_ier, 0);
- setbits32(&pci->pex_pme_mes_ier,
+ setbits32_be(&pci->pex_pme_mes_ier,
PME_DISR_EN_PTOD | PME_DISR_EN_ENL23D | PME_DISR_EN_EXL23D);
/* PME Enable */
@@ -1218,7 +1218,7 @@ static void send_pme_turnoff_message(struct pci_controller *hose)
int i;
/* Send PME_Turn_Off Message Request */
- setbits32(&pci->pex_pmcr, PEX_PMCR_PTOMR);
+ setbits32_be(&pci->pex_pmcr, PEX_PMCR_PTOMR);
/* Wait trun off done */
for (i = 0; i < 150; i++) {
@@ -1254,7 +1254,7 @@ static void fsl_pci_syscore_do_resume(struct pci_controller *hose)
int i;
/* Send Exit L2 State Message */
- setbits32(&pci->pex_pmcr, PEX_PMCR_EXL2S);
+ setbits32_be(&pci->pex_pmcr, PEX_PMCR_EXL2S);
/* Wait exit done */
for (i = 0; i < 150; i++) {
diff --git a/arch/powerpc/sysdev/fsl_pmc.c b/arch/powerpc/sysdev/fsl_pmc.c
index 232225e7f863..bbcf4cb89bb6 100644
--- a/arch/powerpc/sysdev/fsl_pmc.c
+++ b/arch/powerpc/sysdev/fsl_pmc.c
@@ -37,7 +37,7 @@ static int pmc_suspend_enter(suspend_state_t state)
{
int ret;
- setbits32(&pmc_regs->pmcsr, PMCSR_SLP);
+ setbits32_be(&pmc_regs->pmcsr, PMCSR_SLP);
/* At this point, the CPU is asleep. */
/* Upon resume, wait for SLP bit to be clear. */
diff --git a/arch/powerpc/sysdev/fsl_rcpm.c b/arch/powerpc/sysdev/fsl_rcpm.c
index 9259a94f70e1..bd2a7606bfce 100644
--- a/arch/powerpc/sysdev/fsl_rcpm.c
+++ b/arch/powerpc/sysdev/fsl_rcpm.c
@@ -33,10 +33,10 @@ static void rcpm_v1_irq_mask(int cpu)
int hw_cpu = get_hard_smp_processor_id(cpu);
unsigned int mask = 1 << hw_cpu;
- setbits32(&rcpm_v1_regs->cpmimr, mask);
- setbits32(&rcpm_v1_regs->cpmcimr, mask);
- setbits32(&rcpm_v1_regs->cpmmcmr, mask);
- setbits32(&rcpm_v1_regs->cpmnmimr, mask);
+ setbits32_be(&rcpm_v1_regs->cpmimr, mask);
+ setbits32_be(&rcpm_v1_regs->cpmcimr, mask);
+ setbits32_be(&rcpm_v1_regs->cpmmcmr, mask);
+ setbits32_be(&rcpm_v1_regs->cpmnmimr, mask);
}
static void rcpm_v2_irq_mask(int cpu)
@@ -44,10 +44,10 @@ static void rcpm_v2_irq_mask(int cpu)
int hw_cpu = get_hard_smp_processor_id(cpu);
unsigned int mask = 1 << hw_cpu;
- setbits32(&rcpm_v2_regs->tpmimr0, mask);
- setbits32(&rcpm_v2_regs->tpmcimr0, mask);
- setbits32(&rcpm_v2_regs->tpmmcmr0, mask);
- setbits32(&rcpm_v2_regs->tpmnmimr0, mask);
+ setbits32_be(&rcpm_v2_regs->tpmimr0, mask);
+ setbits32_be(&rcpm_v2_regs->tpmcimr0, mask);
+ setbits32_be(&rcpm_v2_regs->tpmmcmr0, mask);
+ setbits32_be(&rcpm_v2_regs->tpmnmimr0, mask);
}
static void rcpm_v1_irq_unmask(int cpu)
@@ -55,10 +55,10 @@ static void rcpm_v1_irq_unmask(int cpu)
int hw_cpu = get_hard_smp_processor_id(cpu);
unsigned int mask = 1 << hw_cpu;
- clrbits32(&rcpm_v1_regs->cpmimr, mask);
- clrbits32(&rcpm_v1_regs->cpmcimr, mask);
- clrbits32(&rcpm_v1_regs->cpmmcmr, mask);
- clrbits32(&rcpm_v1_regs->cpmnmimr, mask);
+ clrbits32_be(&rcpm_v1_regs->cpmimr, mask);
+ clrbits32_be(&rcpm_v1_regs->cpmcimr, mask);
+ clrbits32_be(&rcpm_v1_regs->cpmmcmr, mask);
+ clrbits32_be(&rcpm_v1_regs->cpmnmimr, mask);
}
static void rcpm_v2_irq_unmask(int cpu)
@@ -66,26 +66,26 @@ static void rcpm_v2_irq_unmask(int cpu)
int hw_cpu = get_hard_smp_processor_id(cpu);
unsigned int mask = 1 << hw_cpu;
- clrbits32(&rcpm_v2_regs->tpmimr0, mask);
- clrbits32(&rcpm_v2_regs->tpmcimr0, mask);
- clrbits32(&rcpm_v2_regs->tpmmcmr0, mask);
- clrbits32(&rcpm_v2_regs->tpmnmimr0, mask);
+ clrbits32_be(&rcpm_v2_regs->tpmimr0, mask);
+ clrbits32_be(&rcpm_v2_regs->tpmcimr0, mask);
+ clrbits32_be(&rcpm_v2_regs->tpmmcmr0, mask);
+ clrbits32_be(&rcpm_v2_regs->tpmnmimr0, mask);
}
static void rcpm_v1_set_ip_power(bool enable, u32 mask)
{
if (enable)
- setbits32(&rcpm_v1_regs->ippdexpcr, mask);
+ setbits32_be(&rcpm_v1_regs->ippdexpcr, mask);
else
- clrbits32(&rcpm_v1_regs->ippdexpcr, mask);
+ clrbits32_be(&rcpm_v1_regs->ippdexpcr, mask);
}
static void rcpm_v2_set_ip_power(bool enable, u32 mask)
{
if (enable)
- setbits32(&rcpm_v2_regs->ippdexpcr[0], mask);
+ setbits32_be(&rcpm_v2_regs->ippdexpcr[0], mask);
else
- clrbits32(&rcpm_v2_regs->ippdexpcr[0], mask);
+ clrbits32_be(&rcpm_v2_regs->ippdexpcr[0], mask);
}
static void rcpm_v1_cpu_enter_state(int cpu, int state)
@@ -95,10 +95,10 @@ static void rcpm_v1_cpu_enter_state(int cpu, int state)
switch (state) {
case E500_PM_PH10:
- setbits32(&rcpm_v1_regs->cdozcr, mask);
+ setbits32_be(&rcpm_v1_regs->cdozcr, mask);
break;
case E500_PM_PH15:
- setbits32(&rcpm_v1_regs->cnapcr, mask);
+ setbits32_be(&rcpm_v1_regs->cnapcr, mask);
break;
default:
pr_warn("Unknown cpu PM state (%d)\n", state);
@@ -114,16 +114,16 @@ static void rcpm_v2_cpu_enter_state(int cpu, int state)
switch (state) {
case E500_PM_PH10:
/* one bit corresponds to one thread for PH10 of 6500 */
- setbits32(&rcpm_v2_regs->tph10setr0, 1 << hw_cpu);
+ setbits32_be(&rcpm_v2_regs->tph10setr0, 1 << hw_cpu);
break;
case E500_PM_PH15:
- setbits32(&rcpm_v2_regs->pcph15setr, mask);
+ setbits32_be(&rcpm_v2_regs->pcph15setr, mask);
break;
case E500_PM_PH20:
- setbits32(&rcpm_v2_regs->pcph20setr, mask);
+ setbits32_be(&rcpm_v2_regs->pcph20setr, mask);
break;
case E500_PM_PH30:
- setbits32(&rcpm_v2_regs->pcph30setr, mask);
+ setbits32_be(&rcpm_v2_regs->pcph30setr, mask);
break;
default:
pr_warn("Unknown cpu PM state (%d)\n", state);
@@ -172,10 +172,10 @@ static void rcpm_v1_cpu_exit_state(int cpu, int state)
switch (state) {
case E500_PM_PH10:
- clrbits32(&rcpm_v1_regs->cdozcr, mask);
+ clrbits32_be(&rcpm_v1_regs->cdozcr, mask);
break;
case E500_PM_PH15:
- clrbits32(&rcpm_v1_regs->cnapcr, mask);
+ clrbits32_be(&rcpm_v1_regs->cnapcr, mask);
break;
default:
pr_warn("Unknown cpu PM state (%d)\n", state);
@@ -196,16 +196,16 @@ static void rcpm_v2_cpu_exit_state(int cpu, int state)
switch (state) {
case E500_PM_PH10:
- setbits32(&rcpm_v2_regs->tph10clrr0, 1 << hw_cpu);
+ setbits32_be(&rcpm_v2_regs->tph10clrr0, 1 << hw_cpu);
break;
case E500_PM_PH15:
- setbits32(&rcpm_v2_regs->pcph15clrr, mask);
+ setbits32_be(&rcpm_v2_regs->pcph15clrr, mask);
break;
case E500_PM_PH20:
- setbits32(&rcpm_v2_regs->pcph20clrr, mask);
+ setbits32_be(&rcpm_v2_regs->pcph20clrr, mask);
break;
case E500_PM_PH30:
- setbits32(&rcpm_v2_regs->pcph30clrr, mask);
+ setbits32_be(&rcpm_v2_regs->pcph30clrr, mask);
break;
default:
pr_warn("Unknown cpu PM state (%d)\n", state);
@@ -226,7 +226,7 @@ static int rcpm_v1_plat_enter_state(int state)
switch (state) {
case PLAT_PM_SLEEP:
- setbits32(pmcsr_reg, RCPM_POWMGTCSR_SLP);
+ setbits32_be(pmcsr_reg, RCPM_POWMGTCSR_SLP);
/* Upon resume, wait for RCPM_POWMGTCSR_SLP bit to be clear. */
result = spin_event_timeout(
@@ -253,9 +253,9 @@ static int rcpm_v2_plat_enter_state(int state)
switch (state) {
case PLAT_PM_LPM20:
/* clear previous LPM20 status */
- setbits32(pmcsr_reg, RCPM_POWMGTCSR_P_LPM20_ST);
+ setbits32_be(pmcsr_reg, RCPM_POWMGTCSR_P_LPM20_ST);
/* enter LPM20 status */
- setbits32(pmcsr_reg, RCPM_POWMGTCSR_LPM20_RQ);
+ setbits32_be(pmcsr_reg, RCPM_POWMGTCSR_LPM20_RQ);
/* At this point, the device is in LPM20 status. */
@@ -291,9 +291,9 @@ static void rcpm_common_freeze_time_base(u32 *tben_reg, int freeze)
if (freeze) {
mask = in_be32(tben_reg);
- clrbits32(tben_reg, mask);
+ clrbits32_be(tben_reg, mask);
} else {
- setbits32(tben_reg, mask);
+ setbits32_be(tben_reg, mask);
}
/* read back to push the previous write */
diff --git a/arch/powerpc/sysdev/fsl_rio.c b/arch/powerpc/sysdev/fsl_rio.c
index 5011ffea4e4b..278e63cc8afe 100644
--- a/arch/powerpc/sysdev/fsl_rio.c
+++ b/arch/powerpc/sysdev/fsl_rio.c
@@ -668,10 +668,10 @@ int fsl_rio_setup(struct platform_device *dev)
out_be32(priv->regs_win
+ RIO_CCSR + i*0x20, 0);
/* Set 1x lane */
- setbits32(priv->regs_win
+ setbits32_be(priv->regs_win
+ RIO_CCSR + i*0x20, 0x02000000);
/* Enable ports */
- setbits32(priv->regs_win
+ setbits32_be(priv->regs_win
+ RIO_CCSR + i*0x20, 0x00600000);
msleep(100);
if (in_be32((priv->regs_win
diff --git a/arch/powerpc/sysdev/fsl_rmu.c b/arch/powerpc/sysdev/fsl_rmu.c
index 88b35a3dcdc5..134ba53f0fcb 100644
--- a/arch/powerpc/sysdev/fsl_rmu.c
+++ b/arch/powerpc/sysdev/fsl_rmu.c
@@ -355,7 +355,7 @@ fsl_rio_dbell_handler(int irq, void *dev_instance)
dmsg->sid, dmsg->tid,
dmsg->info);
}
- setbits32(&fsl_dbell->dbell_regs->dmr, DOORBELL_DMR_DI);
+ setbits32_be(&fsl_dbell->dbell_regs->dmr, DOORBELL_DMR_DI);
out_be32(&fsl_dbell->dbell_regs->dsr, DOORBELL_DSR_DIQI);
}
@@ -909,10 +909,10 @@ fsl_open_inb_mbox(struct rio_mport *mport, void *dev_id, int mbox, int entries)
out_be32(&rmu->msg_regs->imr, 0x001b0060);
/* Set number of queue entries */
- setbits32(&rmu->msg_regs->imr, (get_bitmask_order(entries) - 2) << 12);
+ setbits32_be(&rmu->msg_regs->imr, (get_bitmask_order(entries) - 2) << 12);
/* Now enable the unit */
- setbits32(&rmu->msg_regs->imr, 0x1);
+ setbits32_be(&rmu->msg_regs->imr, 0x1);
out:
return rc;
@@ -1015,7 +1015,7 @@ void *fsl_get_inb_message(struct rio_mport *mport, int mbox)
rmu->msg_rx_ring.virt_buffer[buf_idx] = NULL;
out1:
- setbits32(&rmu->msg_regs->imr, RIO_MSG_IMR_MI);
+ setbits32_be(&rmu->msg_regs->imr, RIO_MSG_IMR_MI);
out2:
return buf;
diff --git a/arch/powerpc/sysdev/mpic_timer.c b/arch/powerpc/sysdev/mpic_timer.c
index 87e7c42777a8..70b02ba90220 100644
--- a/arch/powerpc/sysdev/mpic_timer.c
+++ b/arch/powerpc/sysdev/mpic_timer.c
@@ -154,7 +154,7 @@ static int set_cascade_timer(struct timer_group_priv *priv, u64 ticks,
tcr = casc_priv->tcr_value |
(casc_priv->tcr_value << MPIC_TIMER_TCR_ROVR_OFFSET);
- setbits32(priv->group_tcr, tcr);
+ setbits32_be(priv->group_tcr, tcr);
tmp_ticks = div_u64_rem(ticks, MAX_TICKS_CASCADE, &rem_ticks);
@@ -253,7 +253,7 @@ void mpic_start_timer(struct mpic_timer *handle)
struct timer_group_priv *priv = container_of(handle,
struct timer_group_priv, timer[handle->num]);
- clrbits32(&priv->regs[handle->num].gtbcr, TIMER_STOP);
+ clrbits32_be(&priv->regs[handle->num].gtbcr, TIMER_STOP);
}
EXPORT_SYMBOL(mpic_start_timer);
@@ -269,7 +269,7 @@ void mpic_stop_timer(struct mpic_timer *handle)
struct timer_group_priv, timer[handle->num]);
struct cascade_priv *casc_priv;
- setbits32(&priv->regs[handle->num].gtbcr, TIMER_STOP);
+ setbits32_be(&priv->regs[handle->num].gtbcr, TIMER_STOP);
casc_priv = priv->timer[handle->num].cascade_handle;
if (casc_priv) {
@@ -340,7 +340,7 @@ void mpic_free_timer(struct mpic_timer *handle)
u32 tcr;
tcr = casc_priv->tcr_value | (casc_priv->tcr_value <<
MPIC_TIMER_TCR_ROVR_OFFSET);
- clrbits32(priv->group_tcr, tcr);
+ clrbits32_be(priv->group_tcr, tcr);
priv->idle |= casc_priv->cascade_map;
priv->timer[handle->num].cascade_handle = NULL;
} else {
@@ -508,7 +508,7 @@ static void timer_group_init(struct device_node *np)
/* Init FSL timer hardware */
if (priv->flags & FSL_GLOBAL_TIMER)
- setbits32(priv->group_tcr, MPIC_TIMER_TCR_CLKDIV);
+ setbits32_be(priv->group_tcr, MPIC_TIMER_TCR_CLKDIV);
list_add_tail(&priv->node, &timer_group_list);
@@ -531,7 +531,7 @@ static void mpic_timer_resume(void)
list_for_each_entry(priv, &timer_group_list, node) {
/* Init FSL timer hardware */
if (priv->flags & FSL_GLOBAL_TIMER)
- setbits32(priv->group_tcr, MPIC_TIMER_TCR_CLKDIV);
+ setbits32_be(priv->group_tcr, MPIC_TIMER_TCR_CLKDIV);
}
}
--
2.16.4
^ permalink raw reply related
* [PATCH RFC 3/5] coccinelle: add xxxsetbitsXX converting spatch
From: Corentin Labbe @ 2018-09-07 19:41 UTC (permalink / raw)
To: Gilles.Muller, Julia.Lawall, agust, alexandre.torgue, alistair,
benh, carlo, davem, galak, joabreu, khilman, maxime.ripard,
michal.lkml, mpe, mporter, nicolas.palix, oss, paulus,
peppe.cavallaro, tj, vitb, wens
Cc: cocci, linux-amlogic, linux-arm-kernel, linux-ide, linux-kernel,
linuxppc-dev, netdev, linux-sunxi, Corentin Labbe
In-Reply-To: <1536349307-20714-1-git-send-email-clabbe@baylibre.com>
This patch add a spatch which convert all open coded of setbits32/clrbits32/clrsetbits32
and their 64 bits counterparts.
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
---
scripts/coccinelle/misc/setbits.cocci | 423 ++++++++++++++++++++++++++++++++++
1 file changed, 423 insertions(+)
create mode 100644 scripts/coccinelle/misc/setbits.cocci
diff --git a/scripts/coccinelle/misc/setbits.cocci b/scripts/coccinelle/misc/setbits.cocci
new file mode 100644
index 000000000000..c01ab6d75eb4
--- /dev/null
+++ b/scripts/coccinelle/misc/setbits.cocci
@@ -0,0 +1,423 @@
+virtual context
+
+@pclrsetbits32a@
+identifier rr;
+expression reg;
+expression set;
+expression clear;
+@@
+
+ u32 rr;
+ ...
+- rr = readl(reg);
+- rr &= ~clear;
+- rr |= set;
+- writel(rr, reg);
++ clrsetbits32(reg, clear, set);
+
+@pclrsetbits32b@
+identifier rr;
+expression reg;
+expression set;
+expression clear;
+@@
+
+ u32 rr;
+ ...
+- rr = readl(reg);
+- rr |= set;
+- rr &= ~clear;
+- writel(rr, reg);
++ clrsetbits32(reg, clear, set);
+
+@pclrbits32@
+identifier rr;
+expression reg;
+expression clear;
+@@
+
+ u32 rr;
+ ...
+- rr = readl(reg);
+- rr &= ~clear;
+- writel(rr, reg);
++ clrbits32(reg, clear);
+
+@psetbits32@
+identifier rr;
+expression reg;
+expression set;
+@@
+
+ u32 rr;
+ ...
+- rr = readl(reg);
+- rr |= set;
+- writel(rr, reg);
++ setbits32(reg, set);
+
+@psetbits32b@
+identifier rr;
+expression reg;
+expression set1;
+expression set2;
+@@
+
+ u32 rr;
+ ...
+- rr = readl(reg);
+- rr |= set1;
+- rr |= set2;
+- writel(rr, reg);
++ setbits32(reg, set1 | set2);
+
+@pclrsetbits64a@
+identifier rr;
+expression reg;
+expression set;
+expression clear;
+@@
+
+ u64 rr;
+ ...
+- rr = readq(reg);
+- rr &= ~clear;
+- rr |= set;
+- writeq(rr, reg);
++ clrsetbits64(reg, clear, set);
+
+@pclrsetbits64b@
+identifier rr;
+expression reg;
+expression set;
+expression clear;
+@@
+
+ u64 rr;
+ ...
+- rr = readq(reg);
+- rr |= set;
+- rr &= ~clear;
+- writeq(rr, reg);
++ clrsetbits64(reg, clear, set);
+
+@pclrbits64@
+identifier rr;
+expression reg;
+expression clear;
+@@
+
+ u64 rr;
+ ...
+- rr = readq(reg);
+- rr &= ~clear;
+- writeq(rr, reg);
++ clrbits64(reg, clear);
+
+@psetbits64@
+identifier rr;
+expression reg;
+expression set;
+@@
+
+ u64 rr;
+ ...
+- rr = readq(reg);
+- rr |= set;
+- writeq(rr, reg);
++ setbits64(reg, set);
+
+@@
+expression dwmac;
+expression reg;
+expression mask;
+expression value;
+@@
+
+- meson8b_dwmac_mask_bits(dwmac, reg, mask, value);
++ clrsetbits32(dwmac->regs + reg, mask, value);
+
+@ppclrsetbits32a@
+identifier rr;
+expression reg;
+expression set;
+expression clear;
+@@
+
+- u32 rr = readl(reg);
+- rr &= ~clear;
+- rr |= set;
+- writel(rr, reg);
++ clrsetbits32(reg, clear, set);
+
+@ppclrsetbits32b@
+identifier rr;
+expression reg;
+expression set;
+expression clear;
+@@
+
+- u32 rr = readl(reg);
+- rr |= set;
+- rr &= ~clear;
+- writel(rr, reg);
++ clrsetbits32(reg, clear, set);
+
+@ppclrbits32@
+identifier rr;
+expression reg;
+expression clear;
+@@
+
+- u32 rr = readl(reg);
+- rr &= ~clear;
+- writel(rr, reg);
++ clrbits32(reg, clear);
+
+@ppsetbits32@
+identifier rr;
+expression reg;
+expression set;
+@@
+
+- u32 rr = readl(reg);
+- rr |= set;
+- writel(rr, reg);
++ setbits32(reg, set);
+
+@ppclrsetbits64a@
+identifier rr;
+expression reg;
+expression set;
+expression clear;
+@@
+
+- u64 rr = readq(reg);
+- rr &= ~clear;
+- rr |= set;
+- writeq(rr, reg);
++ clrsetbits64(reg, clear, set);
+
+@ppclrsetbits64b@
+identifier rr;
+expression reg;
+expression set;
+expression clear;
+@@
+
+- u64 rr = readq(reg);
+- rr |= set;
+- rr &= ~clear;
+- writeq(rr, reg);
++ clrsetbits64(reg, clear, set);
+
+@ppclrbits64@
+identifier rr;
+expression reg;
+expression clear;
+@@
+
+- u64 rr = readq(reg);
+- rr &= ~clear;
+- writeq(rr, reg);
++ clrbits64(reg, clear);
+
+@ppsetbits64@
+identifier rr;
+expression reg;
+expression set;
+@@
+
+- u64 rr = readq(reg);
+- rr |= set;
+- writeq(rr, reg);
++ setbits64(reg, set);
+
+@pif_set_clr@
+identifier rr;
+expression reg;
+expression set;
+expression clear;
+@@
+
+ u32 rr;
+ ...
+- rr = readl(reg);
+ if (...)
+- rr |= set;
++ setbits32(reg, set);
+ else
+- rr &= ~clear;
++ clrbits32(reg, clear);
+- writel(rr, reg);
+
+@pifclrset@
+identifier rr;
+expression reg;
+expression set;
+expression clear;
+@@
+
+ u32 rr;
+ ...
+- rr = readl(reg);
+ if (...)
+- rr &= ~clear;
++ clrbits32(reg, clear);
+ else
+- rr |= set;
++ setbits32(reg, set);
+- writel(rr, reg);
+
+@pif_set_clr64@
+identifier rr;
+expression reg;
+expression set;
+expression clear;
+@@
+
+ u64 rr;
+ ...
+- rr = readq(reg);
+ if (...)
+- rr |= set;
++ setbits64(reg, set);
+ else
+- rr &= ~clear;
++ clrbits64(reg, clear);
+- writeq(rr, reg);
+
+@pif_clr_set64@
+identifier rr;
+expression reg;
+expression set;
+expression clear;
+@@
+
+ u64 rr;
+ ...
+- rr = readq(reg);
+ if (...)
+- rr &= ~clear;
++ clrbits64(reg, clear);
+ else
+- rr |= set;
++ setbits32(reg, set);
+- writeq(rr, reg);
+
+
+@p_setbits32_m2@
+identifier rr;
+expression reg;
+expression set;
+@@
+
+ u32 rr;
+ ...
+- rr = readl(reg);
+- writel(rr | set, reg);
++ setbits32(reg, set);
+
+@p_clrbits32_m2@
+identifier rr;
+expression reg;
+expression mask;
+@@
+
+ u32 rr;
+ ...
+- rr = readl(reg);
+- writel(rr & ~mask, reg);
++ clrbits32(reg, mask);
+
+
+@p_setbits_oneliner@
+expression addr;
+expression set;
+@@
+- writel(readl(addr) | set, addr);
++ setbits32(addr, set);
+
+@p_clrbits_oneliner@
+expression addr;
+expression mask;
+@@
+- writel(readl(addr) & ~mask, addr);
++ clrbits32(addr, mask);
+
+@p_clrsetbits_oneliner_a@
+expression addr;
+expression set;
+expression mask;
+@@
+- writel(readl(addr) | set & ~mask, addr);
++ clrsetbits32(addr, mask, set);
+
+@p_clrsetbits_oneliner_b@
+expression addr;
+expression set;
+expression mask;
+@@
+- writel(readl(addr) & ~mask | set, addr);
++ clrsetbits32(addr, mask, set);
+
+@p_clrsetbits_oneliner_a2@
+expression addr;
+expression set;
+expression mask;
+@@
+- writel((readl(addr) | set) & ~mask, addr);
++ clrsetbits32(addr, mask, set);
+
+@p_clrsetbits_oneliner_b2@
+expression addr;
+expression set;
+expression mask;
+@@
+- writel((readl(addr) & ~mask) | set, addr);
++ clrsetbits32(addr, mask, set);
+
+
+
+
+
+
+
+
+
+
+
+// sub optimal way to add header
+@header1 depends on psetbits32 || pclrbits32 || pclrsetbits32a || pclrsetbits32b || psetbits64 || pclrbits64 || pclrsetbits64a || pclrsetbits64b || ppsetbits32 || ppclrbits32 || ppclrsetbits32a || ppclrsetbits32b || ppsetbits64 || ppclrbits64 || ppclrsetbits64a || ppclrsetbits64b@
+@@
+ #include <linux/io.h>
++ #include <linux/setbits.h>
+
+@header2 depends on (psetbits32 || pclrbits32 || pclrsetbits32a || pclrsetbits32b || psetbits64 || pclrbits64 || pclrsetbits64a || pclrsetbits64b || ppsetbits32 || ppclrbits32 || ppclrsetbits32a || ppclrsetbits32b || ppsetbits64 || ppclrbits64 || ppclrsetbits64a || ppclrsetbits64b) && !header1@
+@@
+ #include <linux/interrupt.h>
++ #include <linux/setbits.h>
+
+@header3 depends on (psetbits32 || pclrbits32 || pclrsetbits32a || pclrsetbits32b || psetbits64 || pclrbits64 || pclrsetbits64a || pclrsetbits64b || ppsetbits32 || ppclrbits32 || ppclrsetbits32a || ppclrsetbits32b || ppsetbits64 || ppclrbits64 || ppclrsetbits64a || ppclrsetbits64b) && !header2@
+@@
+ #include <linux/of.h>
++ #include <linux/setbits.h>
+
+@@
+expression base;
+expression offset;
+expression value;
+@@
+
+- mtu3_setbits(base, offset, value);
++ setbits32(base + offset, value);
+
+@@
+expression base;
+expression offset;
+expression mask;
+@@
+
+- mtu3_clrbits(base, offset, mask);
++ clrbits32(base + offset, mask);
+
--
2.16.4
^ permalink raw reply related
* [PATCH 2/5] include: add setbits32/clrbits32/clrsetbits32/setbits64/clrbits64/clrsetbits64 in linux/setbits.h
From: Corentin Labbe @ 2018-09-07 19:41 UTC (permalink / raw)
To: Gilles.Muller, Julia.Lawall, agust, alexandre.torgue, alistair,
benh, carlo, davem, galak, joabreu, khilman, maxime.ripard,
michal.lkml, mpe, mporter, nicolas.palix, oss, paulus,
peppe.cavallaro, tj, vitb, wens
Cc: cocci, linux-amlogic, linux-arm-kernel, linux-ide, linux-kernel,
linuxppc-dev, netdev, linux-sunxi, Corentin Labbe
In-Reply-To: <1536349307-20714-1-git-send-email-clabbe@baylibre.com>
This patch adds setbits32/clrbits32/clrsetbits32 and
setbits64/clrbits64/clrsetbits64 in linux/setbits.h header.
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
---
include/linux/setbits.h | 55 +++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 55 insertions(+)
create mode 100644 include/linux/setbits.h
diff --git a/include/linux/setbits.h b/include/linux/setbits.h
new file mode 100644
index 000000000000..3e1e273551bb
--- /dev/null
+++ b/include/linux/setbits.h
@@ -0,0 +1,55 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __LINUX_SETBITS_H
+#define __LINUX_SETBITS_H
+
+#include <linux/io.h>
+
+#define __setbits(readfunction, writefunction, addr, set) \
+ writefunction((readfunction(addr) | (set)), addr)
+#define __clrbits(readfunction, writefunction, addr, mask) \
+ writefunction((readfunction(addr) & ~(mask)), addr)
+#define __clrsetbits(readfunction, writefunction, addr, mask, set) \
+ writefunction(((readfunction(addr) & ~(mask)) | (set)), addr)
+#define __setclrbits(readfunction, writefunction, addr, mask, set) \
+ writefunction(((readfunction(addr) | (seti)) & ~(mask)), addr)
+
+#define setbits32(addr, set) __setbits(readl, writel, addr, set)
+#define setbits32_relaxed(addr, set) __setbits(readl_relaxed, writel_relaxed, \
+ addr, set)
+
+#define clrbits32(addr, mask) __clrbits(readl, writel, addr, mask)
+#define clrbits32_relaxed(addr, mask) __clrbits(readl_relaxed, writel_relaxed, \
+ addr, mask)
+
+#define clrsetbits32(addr, mask, set) __clrsetbits(readl, writel, addr, mask, set)
+#define clrsetbits32_relaxed(addr, mask, set) __clrsetbits(readl_relaxed, \
+ writel_relaxed, \
+ addr, mask, set)
+
+#define setclrbits32(addr, mask, set) __setclrbits(readl, writel, addr, mask, set)
+#define setclrbits32_relaxed(addr, mask, set) __setclrbits(readl_relaxed, \
+ writel_relaxed, \
+ addr, mask, set)
+
+/* We cannot use CONFIG_64BIT as some x86 drivers use writeq */
+#if defined(writeq) && defined(readq)
+#define setbits64(addr, set) __setbits(readq, writeq, addr, set)
+#define setbits64_relaxed(addr, set) __setbits(readq_relaxed, writeq_relaxed, \
+ addr, set)
+
+#define clrbits64(addr, mask) __clrbits(readq, writeq, addr, mask)
+#define clrbits64_relaxed(addr, mask) __clrbits(readq_relaxed, writeq_relaxed, \
+ addr, mask)
+
+#define clrsetbits64(addr, mask, set) __clrsetbits(readq, writeq, addr, mask, set)
+#define clrsetbits64_relaxed(addr, mask, set) __clrsetbits(readq_relaxed, \
+ writeq_relaxed, \
+ addr, mask, set)
+
+#define setclrbits64(addr, mask, set) __setclrbits(readq, writeq, addr, mask, set)
+#define setclrbits64_relaxed(addr, mask, set) __setclrbits(readq_relaxed, \
+ writeq_relaxed, \
+ addr, mask, set)
+#endif /* writeq/readq */
+
+#endif /* __LINUX_SETBITS_H */
--
2.16.4
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox