From: Daniel Borkmann <daniel@iogearbox.net>
To: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>,
Patrick McHardy <kaber@trash.net>,
Alexei Starovoitov <alexei.starovoitov@gmail.com>,
netfilter-devel@vger.kernel.org, davem@davemloft.net,
netdev@vger.kernel.org
Subject: Re: [PATCH 6/6] net: move qdisc ingress filtering on top of netfilter ingress hooks
Date: Fri, 01 May 2015 01:01:14 +0200 [thread overview]
Message-ID: <5542B43A.6020201@iogearbox.net> (raw)
In-Reply-To: <55427F81.4080807@iogearbox.net>
On 04/30/2015 09:16 PM, Daniel Borkmann wrote:
> On 04/30/2015 06:36 PM, Pablo Neira Ayuso wrote:
> ...
>> But where are the barriers? These unfounded performance claims are
>> simply absurd, qdisc ingress barely performs a bit better just because
>> it executes a bit less code and only in the single CPU scenario with
>> no rules at all.
>
> I think we're going in circles a bit. :( You are right in saying that
> currently, there's a central spinlock, which is worked on to get rid
> of, you've seen the patch on the list floating around already. Single
> CPU, artificial micro-benchmark, which were done show that you see on
> your machine ~613Kpps to ~545Kpps, others have seen it more amplified
> as 22.4Mpps to 18.0Mpps drop from __netif_receive_skb_core() up to an
> empty dummy u32_classify() rule, which has already been acknowledged
> that this gap needs to be improved. Lets call it unfounded then. I
> think we wouldn't even have this discussion if we wouldn't try brute
> forcing both worlds behind this single static key, or, have both
> invoked from within the same layer/list.
Ok, out of curiosity, I did the same as both of you: I'm using a pretty
standard Supermicro X10SLM-F/X10SLM-F, Xeon E3-1240 v3.
*** ingress + dummy u32, net-next:
w/o perf:
...
Result: OK: 5157948(c5157388+d559) usec, 100000000 (60byte,0frags)
19387551pps 9306Mb/sec (9306024480bps) errors: 100000000
perf record -C0 -ecycles:k ./pktgen.sh p1p1
...
Result: OK: 5182638(c5182057+d580) usec, 100000000 (60byte,0frags)
19295191pps 9261Mb/sec (9261691680bps) errors: 100000000
26.07% kpktgend_0 [kernel.kallsyms] [k] __netif_receive_skb_core
14.39% kpktgend_0 [kernel.kallsyms] [k] kfree_skb
13.69% kpktgend_0 [cls_u32] [k] u32_classify
11.75% kpktgend_0 [kernel.kallsyms] [k] _raw_spin_lock
5.34% kpktgend_0 [sch_ingress] [k] ingress_enqueue
5.21% kpktgend_0 [kernel.kallsyms] [k] tc_classify_compat
4.93% kpktgend_0 [kernel.kallsyms] [k] skb_defer_rx_timestamp
3.41% kpktgend_0 [kernel.kallsyms] [k] netif_receive_skb_internal
3.21% kpktgend_0 [pktgen] [k] pktgen_thread_worker
3.16% kpktgend_0 [kernel.kallsyms] [k] tc_classify
3.08% kpktgend_0 [kernel.kallsyms] [k] ip_rcv
2.05% kpktgend_0 [kernel.kallsyms] [k] __netif_receive_skb
1.60% kpktgend_0 [kernel.kallsyms] [k] netif_receive_skb_sk
1.15% kpktgend_0 [kernel.kallsyms] [k] classify
0.45% kpktgend_0 [kernel.kallsyms] [k] __local_bh_enable_ip
*** nf hook infra + ingress + dummy u32, net-next:
w/o perf:
...
Result: OK: 6555903(c6555744+d159) usec, 100000000 (60byte,0frags)
15253426pps 7321Mb/sec (7321644480bps) errors: 100000000
perf record -C0 -ecycles:k ./pktgen.sh p1p1
...
Result: OK: 6591291(c6591153+d138) usec, 100000000 (60byte,0frags)
15171532pps 7282Mb/sec (7282335360bps) errors: 100000000
25.94% kpktgend_0 [kernel.kallsyms] [k] __netif_receive_skb_core
12.19% kpktgend_0 [kernel.kallsyms] [k] kfree_skb
11.00% kpktgend_0 [kernel.kallsyms] [k] _raw_spin_lock
10.58% kpktgend_0 [cls_u32] [k] u32_classify
5.34% kpktgend_0 [sch_ingress] [k] handle_ing
4.68% kpktgend_0 [kernel.kallsyms] [k] nf_iterate
4.33% kpktgend_0 [kernel.kallsyms] [k] tc_classify_compat
4.32% kpktgend_0 [sch_ingress] [k] ingress_enqueue
3.62% kpktgend_0 [kernel.kallsyms] [k] skb_defer_rx_timestamp
2.95% kpktgend_0 [kernel.kallsyms] [k] nf_hook_slow
2.75% kpktgend_0 [kernel.kallsyms] [k] ip_rcv
2.60% kpktgend_0 [kernel.kallsyms] [k] tc_classify
2.52% kpktgend_0 [kernel.kallsyms] [k] netif_receive_skb_internal
2.50% kpktgend_0 [pktgen] [k] pktgen_thread_worker
1.77% kpktgend_0 [kernel.kallsyms] [k] __netif_receive_skb
1.28% kpktgend_0 [kernel.kallsyms] [k] netif_receive_skb_sk
0.94% kpktgend_0 [kernel.kallsyms] [k] classify
0.38% kpktgend_0 [kernel.kallsyms] [k] __local_bh_enable_ip
*** drop ingress spinlock (patch w/ bstats addition) + ingress +
dummy u32, net-next:
w/o perf:
...
Result: OK: 4789828(c4789353+d474) usec, 100000000 (60byte,0frags)
20877576pps 10021Mb/sec (10021236480bps) errors: 100000000
perf record -C0 -ecycles:k ./pktgen.sh p1p1
...
Result: OK: 4829276(c4828437+d839) usec, 100000000 (60byte,0frags)
20707036pps 9939Mb/sec (9939377280bps) errors: 100000000
33.11% kpktgend_0 [kernel.kallsyms] [k] __netif_receive_skb_core
15.27% kpktgend_0 [kernel.kallsyms] [k] kfree_skb
14.60% kpktgend_0 [cls_u32] [k] u32_classify
6.06% kpktgend_0 [sch_ingress] [k] ingress_enqueue
5.55% kpktgend_0 [kernel.kallsyms] [k] tc_classify_compat
5.31% kpktgend_0 [kernel.kallsyms] [k] skb_defer_rx_timestamp
3.77% kpktgend_0 [pktgen] [k] pktgen_thread_worker
3.45% kpktgend_0 [kernel.kallsyms] [k] netif_receive_skb_internal
3.33% kpktgend_0 [kernel.kallsyms] [k] tc_classify
3.33% kpktgend_0 [kernel.kallsyms] [k] ip_rcv
2.34% kpktgend_0 [kernel.kallsyms] [k] __netif_receive_skb
1.78% kpktgend_0 [kernel.kallsyms] [k] netif_receive_skb_sk
1.15% kpktgend_0 [kernel.kallsyms] [k] classify
0.48% kpktgend_0 [kernel.kallsyms] [k] __local_bh_enable_ip
That means, here, moving ingress behind nf hooks, I see a similar
slowdown in this micro-benchmark as Alexei of really worst case
of ~27%.
Now in real world that might probably just end up as a few percent
depending on the use case, but really, why should we go down that
path if we can just avoid that?
If you find a way, where both tc/nf hooks are triggered from within
the same list, then that would probably look better already. Or,
as a start, as mentioned, with a second static key for netfilter,
which can later on then still be reworked for a better integration,
although I agree with you that it's less clean and I see the point
of consolidating code.
If you want, I'm happy to provide numbers if you have a next set as
well, feel free to ping me.
Thanks,
Daniel
next prev parent reply other threads:[~2015-04-30 23:01 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-04-29 18:53 [PATCH 0/6 RFC] Netfilter ingress support (v2) Pablo Neira Ayuso
2015-04-29 18:53 ` [PATCH 1/6] netfilter: cleanup struct nf_hook_ops indentation Pablo Neira Ayuso
2015-04-29 18:53 ` [PATCH 2/6] netfilter: add hook list to nf_hook_state Pablo Neira Ayuso
2015-04-29 18:53 ` [PATCH 3/6] netfilter: add nf_hook_list_active() Pablo Neira Ayuso
2015-04-29 18:53 ` [PATCH 4/6] netfilter: move generic hook infrastructure into net/core/hooks.c Pablo Neira Ayuso
2015-04-29 23:59 ` Patrick McHardy
2015-04-29 18:53 ` [PATCH 5/6] net: add netfilter ingress hook Pablo Neira Ayuso
2015-04-29 18:53 ` [PATCH 6/6] net: move qdisc ingress filtering on top of netfilter ingress hooks Pablo Neira Ayuso
2015-04-29 20:27 ` Daniel Borkmann
2015-04-29 23:32 ` Pablo Neira Ayuso
2015-04-30 0:10 ` Daniel Borkmann
2015-04-30 0:20 ` Daniel Borkmann
2015-04-30 0:30 ` Patrick McHardy
2015-04-30 0:41 ` Daniel Borkmann
2015-04-30 0:48 ` Patrick McHardy
2015-04-30 1:16 ` Alexei Starovoitov
2015-04-30 1:34 ` Patrick McHardy
2015-04-30 2:22 ` Jamal Hadi Salim
2015-04-30 3:11 ` Patrick McHardy
2015-04-30 11:55 ` Jamal Hadi Salim
2015-04-30 15:33 ` Pablo Neira Ayuso
2015-04-30 16:09 ` Daniel Borkmann
2015-04-30 16:36 ` Pablo Neira Ayuso
2015-04-30 19:16 ` Daniel Borkmann
2015-04-30 23:01 ` Daniel Borkmann [this message]
2015-05-01 1:15 ` Jamal Hadi Salim
2015-04-30 10:12 ` Pablo Neira Ayuso
2015-04-30 19:05 ` Alexei Starovoitov
2015-04-30 0:37 ` Patrick McHardy
2015-04-30 1:04 ` Daniel Borkmann
2015-04-30 1:43 ` Patrick McHardy
2015-04-30 2:35 ` Jamal Hadi Salim
2015-04-30 3:29 ` Patrick McHardy
2015-04-30 4:05 ` Patrick McHardy
2015-04-30 6:02 ` Alexei Starovoitov
2015-04-30 9:24 ` Daniel Borkmann
2015-04-30 10:28 ` Pablo Neira Ayuso
2015-04-29 23:36 ` Patrick McHardy
2015-04-30 0:00 ` Daniel Borkmann
2015-04-30 0:15 ` Patrick McHardy
2015-04-29 21:53 ` Cong Wang
2015-04-29 23:37 ` Patrick McHardy
2015-04-29 23:42 ` Pablo Neira Ayuso
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5542B43A.6020201@iogearbox.net \
--to=daniel@iogearbox.net \
--cc=alexei.starovoitov@gmail.com \
--cc=davem@davemloft.net \
--cc=jhs@mojatatu.com \
--cc=kaber@trash.net \
--cc=netdev@vger.kernel.org \
--cc=netfilter-devel@vger.kernel.org \
--cc=pablo@netfilter.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.