From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Brenden Blanco <bblanco@plumgrid.com>
Cc: davem@davemloft.net, netdev@vger.kernel.org, tom@herbertland.com,
alexei.starovoitov@gmail.com, Or Gerlitz <ogerlitz@mellanox.com>,
daniel@iogearbox.net, john.fastabend@gmail.com,
brouer@redhat.com
Subject: Re: [RFC PATCH 5/5] Add sample for adding simple drop program to link
Date: Wed, 6 Apr 2016 22:01:00 +0200
Message-ID: <20160406220100.0df04925@redhat.com>
In-Reply-To: <20160406214848.7568235b@redhat.com>
On Wed, 6 Apr 2016 21:48:48 +0200
Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> I'm testing with this program and these patches, after getting past the
> challenge of compiling the samples/bpf files ;-)
>
>
> On Fri, 1 Apr 2016 18:21:58 -0700 Brenden Blanco <bblanco@plumgrid.com> wrote:
>
> > Add a sample program that only drops packets at the
> > BPF_PROG_TYPE_PHYS_DEV hook of a link. With the drop-only program,
> > observed single core rate is ~14.6Mpps.
>
> On my i7-4790K CPU @ 4.00GHz I'm seeing 9.7Mpps (single flow/cpu).
> (generator: pktgen_sample03_burst_single_flow.sh)
>
> # ./netdrvx1 $(</sys/class/net/mlx4p1/ifindex)
> sh: /sys/kernel/debug/tracing/kprobe_events: No such file or directory
> Success: Loaded file ./netdrvx1_kern.o
> proto 17: 9776320 drops/s
>
> These numbers are quite impressive. Compared to: sending it to local
> socket that drop packets 1.7Mpps. Compared to: dropping with iptables
> in "raw" table 3.7Mpps.
>
> If I do multiple flows, via ./pktgen_sample05_flow_per_thread.sh
> then I hit this strange 14.5Mpps limit (proto 17: 14505558 drops/s).
> And the RX 4x CPUs are starting to NOT use 100% in softirq, they have
> some cycles attributed to %idle. (I verified generator is sending at
> 24Mpps).
>
>
> > Other tests were run, for instance without the dropcnt increment or
> > without reading from the packet header, the packet rate was mostly
> > unchanged.
>
> If I change the program to not touch packet data (don't call
> load_byte()) then the performance increase to 14.6Mpps (single
> flow/cpu). And the RX CPU is mostly idle... mlx4_en_process_rx_cq()
> and page alloc/free functions taking the time.
>
> > $ perf record -a samples/bpf/netdrvx1 $(</sys/class/net/eth0/ifindex)
> > proto 17: 14597724 drops/s
> >
> > ./pktgen_sample03_burst_single_flow.sh -i $DEV -d $IP -m $MAC -t 4
> > Running... ctrl^C to stop
> > Device: eth4@0
> > Result: OK: 6486875(c6485849+d1026) usec, 23689465 (60byte,0frags)
> > 3651906pps 1752Mb/sec (1752914880bps) errors: 0
> > Device: eth4@1
> > Result: OK: 6486874(c6485656+d1217) usec, 23689489 (60byte,0frags)
> > 3651911pps 1752Mb/sec (1752917280bps) errors: 0
> > Device: eth4@2
> > Result: OK: 6486851(c6485730+d1120) usec, 23687853 (60byte,0frags)
> > 3651672pps 1752Mb/sec (1752802560bps) errors: 0
> > Device: eth4@3
> > Result: OK: 6486879(c6485807+d1071) usec, 23688954 (60byte,0frags)
> > 3651825pps 1752Mb/sec (1752876000bps) errors: 0
> >
> > perf report --no-children:
> > 18.36% ksoftirqd/1 [mlx4_en] [k] mlx4_en_process_rx_cq
> > 15.98% swapper [kernel.vmlinux] [k] poll_idle
> > 12.71% ksoftirqd/1 [mlx4_en] [k] mlx4_en_alloc_frags
> > 6.87% ksoftirqd/1 [mlx4_en] [k] mlx4_en_free_frag
> > 4.20% ksoftirqd/1 [kernel.vmlinux] [k] get_page_from_freelist
> > 4.09% swapper [mlx4_en] [k] mlx4_en_process_rx_cq
> > 3.32% ksoftirqd/1 [kernel.vmlinux] [k] sk_load_byte_positive_offset
> > 2.39% ksoftirqd/1 [mdio] [k] 0x00000000000074cd
> > 2.23% swapper [mlx4_en] [k] mlx4_en_alloc_frags
> > 2.20% ksoftirqd/1 [kernel.vmlinux] [k] free_pages_prepare
> > 2.08% ksoftirqd/1 [mlx4_en] [k] mlx4_call_bpf
> > 1.57% ksoftirqd/1 [kernel.vmlinux] [k] percpu_array_map_lookup_elem
> > 1.35% ksoftirqd/1 [mdio] [k] 0x00000000000074fa
> > 1.09% ksoftirqd/1 [kernel.vmlinux] [k] free_one_page
> > 1.02% ksoftirqd/1 [kernel.vmlinux] [k] bpf_map_lookup_elem
> > 0.90% ksoftirqd/1 [kernel.vmlinux] [k] __alloc_pages_nodemask
> > 0.88% swapper [kernel.vmlinux] [k] intel_idle
> > 0.82% ksoftirqd/1 [mdio] [k] 0x00000000000074be
> > 0.80% swapper [mlx4_en] [k] mlx4_en_free_frag
>
> My picture (single flow/cpu) looks a little bit different:
>
> + 64.33% ksoftirqd/7 [kernel.vmlinux] [k] __bpf_prog_run
> + 9.60% ksoftirqd/7 [mlx4_en] [k] mlx4_en_alloc_frags
> + 7.71% ksoftirqd/7 [mlx4_en] [k] mlx4_en_process_rx_cq
> + 5.47% ksoftirqd/7 [mlx4_en] [k] mlx4_en_free_frag
> + 1.68% ksoftirqd/7 [kernel.vmlinux] [k] get_page_from_freelist
> + 1.52% ksoftirqd/7 [mlx4_en] [k] mlx4_call_bpf
> + 1.02% ksoftirqd/7 [kernel.vmlinux] [k] free_pages_prepare
> + 0.72% ksoftirqd/7 [mlx4_en] [k] mlx4_alloc_pages.isra.20
> + 0.70% ksoftirqd/7 [kernel.vmlinux] [k] __rcu_read_unlock
> + 0.65% ksoftirqd/7 [kernel.vmlinux] [k] percpu_array_map_lookup_elem
>
> On my i7-4790K CPU, I don't have DDIO, thus I assume this high cost in
> __bpf_prog_run is due to a cache-miss on the packet data.
Before someone else points out the obvious... I forgot to enable JIT.
Enable it::
# echo 1 > /proc/sys/net/core/bpf_jit_enable
Performance increased to: 10.8Mpps (proto 17: 10819446 drops/s)
Samples: 51K of event 'cycles', Event count (approx.): 56775706510
Overhead Command Shared Object Symbol
+ 55.90% ksoftirqd/7 [kernel.vmlinux] [k] sk_load_byte_positive_offset
+ 10.71% ksoftirqd/7 [mlx4_en] [k] mlx4_en_alloc_frags
+ 8.26% ksoftirqd/7 [mlx4_en] [k] mlx4_en_process_rx_cq
+ 5.94% ksoftirqd/7 [mlx4_en] [k] mlx4_en_free_frag
+ 2.04% ksoftirqd/7 [kernel.vmlinux] [k] get_page_from_freelist
+ 2.03% ksoftirqd/7 [kernel.vmlinux] [k] percpu_array_map_lookup_elem
+ 1.42% ksoftirqd/7 [mlx4_en] [k] mlx4_call_bpf
+ 1.04% ksoftirqd/7 [kernel.vmlinux] [k] free_pages_prepare
+ 1.03% ksoftirqd/7 [kernel.vmlinux] [k] __rcu_read_unlock
+ 0.97% ksoftirqd/7 [mlx4_en] [k] mlx4_alloc_pages.isra.20
+ 0.95% ksoftirqd/7 [devlink] [k] 0x0000000000005f87
+ 0.58% ksoftirqd/7 [devlink] [k] 0x0000000000005f8f
+ 0.49% ksoftirqd/7 [kernel.vmlinux] [k] __free_pages_ok
+ 0.47% ksoftirqd/7 [kernel.vmlinux] [k] __rcu_read_lock
+ 0.46% ksoftirqd/7 [kernel.vmlinux] [k] free_one_page
+ 0.38% ksoftirqd/7 [kernel.vmlinux] [k] net_rx_action
+ 0.36% ksoftirqd/7 [kernel.vmlinux] [k] bpf_map_lookup_elem
+ 0.36% ksoftirqd/7 [kernel.vmlinux] [k] __mod_zone_page_state
+ 0.34% ksoftirqd/7 [kernel.vmlinux] [k] __alloc_pages_nodemask
+ 0.32% ksoftirqd/7 [kernel.vmlinux] [k] _raw_spin_lock
+ 0.31% ksoftirqd/7 [devlink] [k] 0x0000000000005f0a
+ 0.29% ksoftirqd/7 [kernel.vmlinux] [k] next_zones_zonelist
It is very likely a cache-miss in sk_load_byte_positive_offset().
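
As a rough sanity check on the numbers above, the drop rates can be
converted into a per-packet cycle budget. This is only a sketch: it
assumes the RX core runs at its nominal 4.00GHz (turbo and frequency
scaling ignored) and is fully busy, and it treats the perf overhead
percentage as a share of that budget. The helper name and constants are
illustrative, not part of the sample program.

```python
# Back-of-the-envelope cycle budget per packet on a single RX core.
# Assumption: core runs at the nominal 4.00GHz and is 100% busy.
CPU_HZ = 4.00e9


def cycles_per_packet(pps, cpu_hz=CPU_HZ):
    """Return the average number of CPU cycles available per packet."""
    return cpu_hz / pps


# Measured drop rates from this thread (single flow/cpu).
interp = cycles_per_packet(9_776_320)    # BPF interpreter
jit = cycles_per_packet(10_819_446)      # BPF JIT enabled

# Share of the JIT budget that perf attributes to
# sk_load_byte_positive_offset (55.90% in the report above).
load_byte = jit * 0.5590

print(f"interpreter: {interp:.1f} cycles/packet")
print(f"JIT:         {jit:.1f} cycles/packet")
print(f"load_byte:   {load_byte:.1f} cycles/packet")
```

Under these assumptions the single load accounts for a couple of hundred
cycles per packet, far more than a load that hits in cache would cost,
which fits the cache-miss explanation.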
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer