All of lore.kernel.org
 help / color / mirror / Atom feed
* xdpgeneric, XDP_PASS, and bpf_xdp_adjust_head decapsulation dropping packets
@ 2019-07-31 21:12 Brandon Cazander
  2019-08-01  8:18 ` Jesper Dangaard Brouer
  2019-08-01 18:00 ` [net v1 PATCH 0/4] net: fix regressions for generic-XDP Jesper Dangaard Brouer
  0 siblings, 2 replies; 16+ messages in thread
From: Brandon Cazander @ 2019-07-31 21:12 UTC (permalink / raw)
  To: xdp-newbies@vger.kernel.org


I am having an issue with xdpgeneric specifically when using XDP_PASS after
bpf_xdp_adjust_head to pop some headers off. My test environment is qemu using
virtio_net specifically, but it also happens with e1000 in qemu/physical devices.

On a real NIC (ixgbe), the same program is successfully passing decapsulated
traffic, but fails in the same way when forcing xdpgeneric mode.

Here is the packet outside of my VM, with the IP+UDP+tun header.

# sudo tcpdump -ni ens5-outside -e udp -vvvXc1
tcpdump: listening on ens5-outside, link-type EN10MB (Ethernet), capture size 262144 bytes
15:54:14.263306 06:54:00:00:00:01 > 06:54:01:00:00:01, ethertype IPv4 (0x0800), length 127: (tos 0x0, ttl 255, id 0, offset 0, flags [DF], proto UDP (17), length 113)
    172.64.0.108.39999 > 172.64.0.101.7803: [no cksum] UDP, length 85
	0x0000:  4500 0071 0000 4000 ff11 222a ac40 006c  E..q..@..."*.@.l
	0x0010:  ac40 0065 9c3f 1e7b 005d 0000 0145 0000  .@.e.?.{.]...E..
	0x0020:  54e3 6b40 003f 01d5 e8c0 a800 02c0 a801  T.k@.?..........
	0x0030:  0208 002c b1b0 5eb8 f196 ca40 5d00 0000  ...,..^....@]...
	0x0040:  00c8 0304 0000 0000 0010 1112 1314 1516  ................
	0x0050:  1718 191a 1b1c 1d1e 1f20 2122 2324 2526  ..........!"#$%&
	0x0060:  2728 292a 2b2c 2d2e 2f30 3132 3334 3536  '()*+,-./0123456
	0x0070:  37                                       7

My XDP program copies the original ethhdr to the new offset and adjusts the head
forward, and I can see the resulting ICMP packet is valid with tcpdump inside
the VM:

# tcpdump -niens5 icmp -c1 -vXe
tcpdump: listening on ens5, link-type EN10MB (Ethernet), capture size 262144 bytes
15:53:25.602109 06:54:00:00:00:01 > 06:54:01:00:00:01, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 63, id 35986, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.0.2 > 192.168.1.2: ICMP echo request, id 45150, seq 43683, length 64
	0x0000:  4500 0054 8c92 4000 3f01 2cc2 c0a8 0002  E..T..@.?.,.....
	0x0010:  c0a8 0102 0800 ed79 b05e aaa3 6dca 405d  .......y.^..m.@]
	0x0020:  0000 0000 3589 0d00 0000 0000 1011 1213  ....5...........
	0x0030:  1415 1617 1819 1a1b 1c1d 1e1f 2021 2223  .............!"#
	0x0040:  2425 2627 2829 2a2b 2c2d 2e2f 3031 3233  $%&'()*+,-./0123
	0x0050:  3435 3637                                4567

Unfortunately, at this point, the packet is dropped in ip_rcv_core. I added a
perf probe on the specific drop line that I'm hitting, and printing the skb->len
and len (from ntohs(iph->tot_len)) variables. You can see the obviously wrong
len value below, while a packet capture in the VM does show the correct value
for tot_len in the IP header.

# perf probe -L ip_rcv_core:64+4 | cat
<ip_rcv_core@/usr/src/debug/kernel-debug-5.2.2-1.2.x86_64/linux-5.2/linux-obj/../net/ipv4/ip_input.c:64>
     64  	if (pskb_trim_rcsum(skb, len)) {
     65  		__IP_INC_STATS(net, IPSTATS_MIB_INDISCARDS);
     66  		goto drop;
         	}

# perf probe -a 'ip_rcv_core:66 skb=skb->len:u32 len=len:u32'
	swapper     0 [001] 84794.954487: probe:ip_rcv_core: (ffffffffbcbd5a7d) skb=84 len=4294931717
	swapper     0 [001] 84794.965473: probe:ip_rcv_core: (ffffffffbcbd5a7d) skb=84 len=4294936833

In contrast, here's what it looks like in XDP native mode (different line number
but looking):

# perf probe -a 'ip_rcv_core:57 skb=skb->len:u32 len=len:u32'
	swapper     0 [000]   353.187439: probe:ip_rcv_core: (ffffffffac9dcca7) skb=84 len=84
	swapper     0 [003]   353.187577: probe:ip_rcv_core: (ffffffffac9dcca7) skb=84 len=84

Here's the relevant portion of my program where I decapsulate:

static __always_inline int handle_peer_data_ipv4(struct xdp_md *ctx)
{
	void *data, *data_end;
	struct ethhdr *eth, *orig_eth;
	__u32 csum = 0;

	data = (void *)(unsigned long)ctx->data;
	data_end = (void *)(unsigned long)ctx->data_end;

	if (data + sizeof(struct ethhdr) + sizeof(struct iphdr) + sizeof(struct udphdr) + sizeof(struct tunnel_header) > data_end) {
		return XDP_DROP;
	}

	orig_eth = data;
	eth = data + sizeof(struct iphdr) + sizeof(struct udphdr) + sizeof(struct tunnel_header);
	memcpy(&eth->h_source, &orig_eth->h_source, ETH_ALEN);
	memcpy(&eth->h_dest, &orig_eth->h_dest, ETH_ALEN);
	eth->h_proto = __constant_htons(ETH_P_IP);

	/* Decapsulate by removing IP + UDP + tunnel headers */
	if (bpf_xdp_adjust_head(ctx, (int)(sizeof(struct iphdr) + sizeof(struct udphdr) +
					   sizeof(struct tunnel_header)))) {
		return XDP_DROP;
	}

	return XDP_PASS;
}

struct tunnel_header {
	__u8 flags;
};

Kernel is 5.2.2-1-debug, OS is openSUSE Tumbleweed 20190724. For qemu I run with
these arguments for the NICs:

# qemu (...) -netdev tap,br=vm-bridge,id=hostnet1,ifname=ens5-outside,queues=4,vhost=on \
-device virtio-net-pci,mq=on,vectors=9,guest_tso4=off,guest_tso6=off,netdev=hostnet1,id=net1,mac=06:54:01:00:00:01

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2019-08-05 18:19 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-07-31 21:12 xdpgeneric, XDP_PASS, and bpf_xdp_adjust_head decapsulation dropping packets Brandon Cazander
2019-08-01  8:18 ` Jesper Dangaard Brouer
2019-08-01 14:54   ` Jesper Dangaard Brouer
2019-08-01 16:05     ` Jesper Dangaard Brouer
2019-08-01 17:33   ` Brandon Cazander
2019-08-01 18:16     ` Jesper Dangaard Brouer
2019-08-01 18:53       ` Brandon Cazander
2019-08-01 18:00 ` [net v1 PATCH 0/4] net: fix regressions for generic-XDP Jesper Dangaard Brouer
2019-08-01 18:00   ` [net v1 PATCH 1/4] bpf: fix XDP vlan selftests test_xdp_vlan.sh Jesper Dangaard Brouer
2019-08-01 18:00   ` [net v1 PATCH 2/4] selftests/bpf: add wrapper scripts for test_xdp_vlan.sh Jesper Dangaard Brouer
2019-08-01 18:00   ` [net v1 PATCH 3/4] selftests/bpf: reduce time to execute test_xdp_vlan.sh Jesper Dangaard Brouer
2019-08-01 18:00   ` [net v1 PATCH 4/4] net: fix bpf_xdp_adjust_head regression for generic-XDP Jesper Dangaard Brouer
2019-08-02  0:44     ` Jakub Kicinski
2019-08-02  7:53       ` Jesper Dangaard Brouer
2019-08-02 17:08         ` Jakub Kicinski
2019-08-05 18:19   ` [net v1 PATCH 0/4] net: fix regressions " David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.