From: Jesper Dangaard Brouer
Subject: Re: [PATCH bpf-next v3 15/15] samples/bpf: sample application and documentation for AF_XDP sockets
Date: Wed, 2 May 2018 22:59:03 +0200
Message-ID: <20180502225903.39180be8@redhat.com>
In-Reply-To: <20180502110136.3738-16-bjorn.topel@gmail.com>
References: <20180502110136.3738-1-bjorn.topel@gmail.com> <20180502110136.3738-16-bjorn.topel@gmail.com>
To: Björn Töpel
Cc: magnus.karlsson@intel.com, alexander.h.duyck@intel.com, alexander.duyck@gmail.com, john.fastabend@gmail.com, ast@fb.com, willemdebruijn.kernel@gmail.com, daniel@iogearbox.net, mst@redhat.com, netdev@vger.kernel.org, michael.lundkvist@ericsson.com, jesse.brandeburg@intel.com, anjali.singhai@intel.com, qi.z.zhang@intel.com, Björn Töpel, brouer@redhat.com

On Wed, 2 May 2018 13:01:36 +0200
Björn Töpel wrote:

> +static void rx_drop(struct xdpsock *xsk)
> +{
> +	struct xdp_desc descs[BATCH_SIZE];
> +	unsigned int rcvd, i;
> +
> +	rcvd = xq_deq(&xsk->rx, descs, BATCH_SIZE);
> +	if (!rcvd)
> +		return;
> +
> +	for (i = 0; i < rcvd; i++) {
> +		u32 idx = descs[i].idx;
> +
> +		lassert(idx < NUM_FRAMES);
> +#if DEBUG_HEXDUMP
> +		char *pkt;
> +		char buf[32];
> +
> +		pkt = xq_get_data(xsk, idx, descs[i].offset);
> +		sprintf(buf, "idx=%d", idx);
> +		hex_dump(pkt, descs[i].len, buf);
> +#endif
> +	}
> +
> +	xsk->rx_npkts += rcvd;
> +
> +	umem_fill_to_kernel_ex(&xsk->umem->fq, descs, rcvd);
> +}

I would really like to see an option that can enable reading the
data/memory in the packet.  Else the test is rather fake...

I hacked it myself manually to read first u32.
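
For readers who want to reproduce the "read first u32" experiment, here is a minimal sketch of what such a read could look like. It is not the actual hack used for the numbers in this mail; the names touch_first_u32() and pkt_sink are hypothetical, and the only assumption about the sample is that xq_get_data() returns a pointer to the packet bytes, as in the quoted patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sink; volatile so the compiler cannot drop the load. */
static volatile uint32_t pkt_sink;

/*
 * Read the first 32 bits of a packet so that its first cache line is
 * really fetched by the consuming CPU.  Returns the value read, or 0
 * if the packet is shorter than 4 bytes.
 */
static uint32_t touch_first_u32(const void *pkt, size_t len)
{
	uint32_t v = 0;

	if (len >= sizeof(v))
		memcpy(&v, pkt, sizeof(v));	/* alignment-safe load */
	pkt_sink = v;
	return v;
}
```

Inside the rx_drop() loop this would presumably be called as touch_first_u32(xq_get_data(xsk, idx, descs[i].offset), descs[i].len), outside the DEBUG_HEXDUMP guard and ideally behind a command-line option so both modes can be benchmarked.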
 - Before: 10,771,083 pps
 - After:   9,430,741 pps

The slowdown is not as big as I expected, which is good :-)

With perf stat I can see more LLC-loads, but no more misses.  Reading
data that was written on the remote CPU is not getting registered as a
cache-miss.

p.s. these tests are with mlx5 (which only has XDP_REDIRECT RX-side).

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


Before:

$ sudo ~/perf stat -C3 -e L1-icache-load-misses -e cycles -e instructions -e cache-misses -e cache-references -e LLC-store-misses -e LLC-store -e LLC-load-misses -e LLC-load -r 3 sleep 1

 Performance counter stats for 'CPU(s) 3' (3 runs):

           200,020      L1-icache-load-misses                                 ( +-  0.76% )  (33.31%)
     3,920,754,587      cycles                                                ( +-  0.14% )  (44.50%)
     3,062,308,209      instructions      #    0.78  insn per cycle           ( +-  0.28% )  (55.65%)
               823      cache-misses      #    0.011 % of all cache refs      ( +- 70.81% )  (66.74%)
         7,587,132      cache-references                                      ( +-  0.48% )  (77.83%)
                 0      LLC-store-misses                                                     (77.83%)
           384,401      LLC-store                                             ( +-  2.97% )  (77.83%)
                15      LLC-load-misses   #    0.00% of all LL-cache hits     ( +-100.00% )  (22.17%)
         3,192,312      LLC-load                                              ( +-  0.35% )  (22.17%)

       1.001199221 seconds time elapsed                                       ( +-  0.00% )

After:

$ sudo ~/perf stat -C3 -e L1-icache-load-misses -e cycles -e instructions -e cache-misses -e cache-references -e LLC-store-misses -e LLC-store -e LLC-load-misses -e LLC-load -r 3 sleep 1

 Performance counter stats for 'CPU(s) 3' (3 runs):

           154,921      L1-icache-load-misses                                 ( +-  3.88% )  (33.31%)
     3,924,791,213      cycles                                                ( +-  0.10% )  (44.50%)
     2,930,116,185      instructions      #    0.75  insn per cycle           ( +-  0.33% )  (55.65%)
               342      cache-misses      #    0.002 % of all cache refs      ( +- 65.52% )  (66.74%)
        15,810,892      cache-references                                      ( +-  0.13% )  (77.83%)
                 0      LLC-store-misses                                                     (77.83%)
           925,544      LLC-store                                             ( +-  2.33% )  (77.83%)
               155      LLC-load-misses   #    0.00% of all LL-cache hits     ( +- 67.22% )  (22.17%)
        12,791,264      LLC-load                                              ( +-  0.04% )  (22.17%)

       1.001206058 seconds time elapsed                                       ( +-  0.00% )
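
To put the two pps figures in per-packet terms, the arithmetic can be sketched as below. The helper names ns_per_packet() and slowdown_pct() are made up for illustration; the only inputs are the pps numbers reported in this mail.

```c
/* Nanoseconds spent per packet at a given packets-per-second rate. */
static double ns_per_packet(double pps)
{
	return 1e9 / pps;
}

/* Relative slowdown in percent when going from `before` to `after` pps. */
static double slowdown_pct(double before, double after)
{
	return (before - after) / before * 100.0;
}
```

With 10,771,083 pps before and 9,430,741 pps after, this works out to roughly 92.8 ns versus 106.0 ns per packet, i.e. about 13 ns extra per packet, or a slowdown of around 12.4%. The LLC-load counters are multiplexed (22.17% sampling), so any per-packet estimate derived from them should be treated as approximate.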