From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Alexei Starovoitov <ast@plumgrid.com>
Cc: "David S. Miller" <davem@davemloft.net>,
	John Fastabend <john.r.fastabend@intel.com>,
	Jamal Hadi Salim <jhs@mojatatu.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	netdev@vger.kernel.org, brouer@redhat.com
Subject: Re: [PATCH v2 net-next] net: sched: run ingress qdisc without locks
Date: Mon, 4 May 2015 13:04:05 +0200
Message-ID: <20150504130405.3ff6672e@redhat.com>
In-Reply-To: <5546FFCB.50903@plumgrid.com>

On Sun, 03 May 2015 22:12:43 -0700
Alexei Starovoitov <ast@plumgrid.com> wrote:

> On 5/3/15 8:42 AM, Jesper Dangaard Brouer wrote:
> >
> > I was actually expecting to see a higher performance boost.
> > improvement diff     = -2.85 ns
> ...
> > The patch is removing two atomic operations, spin_{un,}lock, which I
> > have benchmarked[1] to cost approx 14ns on my system.  Your system
> > likely is faster, but not that much (p.s. benchmark your own system
> > with [1])
> >
> > [1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/time_bench_sample.c
> 
> have tried your tight loop spin_lock test on my box and it showed:
> time_bench: Type:spin_lock_unlock Per elem: 40 cycles(tsc) 11.070 ns
> and yet the total single cpu gain from removal of spin_lock/unlock
> in ingress path is smaller than 11ns. I think this observation is
> telling us that tight loop benchmarking is inherently flawed.
> I'm guessing that uops that cmpxchg is broken into can execute in
> parallel with uops of other insns, so tight loops of the same sequence
> of uops has more alu dependencies whereas in more normal insn flow
> these uops can mix and match better. Would be great if intel microarch
> experts can chime in.

How do you activate the ingress code path?

I'm just doing (is this enough?):
 export DEV=eth4
 tc qdisc add dev $DEV handle ffff: ingress
 

I re-ran the experiment, and I can also only show a 2.68ns
improvement.  This is rather strange, and I cannot explain it.

The lock clearly shows up in perf report[1] with 12.23% in _raw_spin_lock,
and in perf report[2] it is clearly gone, yet we don't see a 12% improvement
in performance, only around 4.7%.

Before activating qdisc ingress code : 25.3Mpps (25398057)
Activating qdisc ingress with lock   : 16.9Mpps (16989315)
Activating qdisc ingress without lock: 17.8Mpps (17800496)

(1/17800496*10^9)-(1/16989315*10^9) = -2.68 ns

The "cost" of activating the ingress qdisc is also interesting:
 (1/25398057*10^9)-(1/16989315*10^9) = -19.49 ns
 (1/25398057*10^9)-(1/17800496*10^9) = -16.81 ns
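
For anyone who wants to redo the arithmetic above, here is a minimal
sketch. The helper names (ns_per_packet, delta_ns) are made up for
illustration and are not part of any tool; the packet rates are the
ones reported above.

```python
def ns_per_packet(pps):
    """Average time per packet in nanoseconds at a given packet rate."""
    return 1e9 / pps


def delta_ns(pps_a, pps_b):
    """Per-packet time at rate pps_a minus per-packet time at rate pps_b.

    A negative result means a packet takes less time at pps_a.
    """
    return ns_per_packet(pps_a) - ns_per_packet(pps_b)


# Packet rates measured in this mail (packets per second).
baseline = 25398057      # before activating the ingress qdisc
with_lock = 16989315     # ingress qdisc activated, with lock
without_lock = 17800496  # ingress qdisc activated, without lock

# Gain from removing the lock, in ns per packet.
print(f"lock removal : {delta_ns(without_lock, with_lock):.2f} ns")

# "Cost" of activating the ingress qdisc, with and without the lock.
print(f"cost w/ lock : {delta_ns(baseline, with_lock):.2f} ns")
print(f"cost w/o lock: {delta_ns(baseline, without_lock):.2f} ns")

# Relative throughput gain from removing the lock, in percent.
gain_pct = (without_lock / with_lock - 1) * 100
print(f"throughput gain: {gain_pct:.2f} %")
```

The sign convention follows the expressions above: the faster case is
the first argument, so the differences come out negative.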

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

My setup
 * Tested on top of commit 4749c3ef854
 * gcc version 4.4.7 20120313 (Red Hat 4.4.7-11) (GCC)
 * CPU E5-2695(ES) @ 2.8GHz

[1] perf report with ingress qlock

 Samples: 2K of event 'cycles', Event count (approx.): 1762298819
   Overhead  Command        Shared Object     Symbol
 +   35.86%  kpktgend_0     [kernel.vmlinux]  [k] __netif_receive_skb_core
 +   17.81%  kpktgend_0     [kernel.vmlinux]  [k] kfree_skb
 +   12.23%  kpktgend_0     [kernel.vmlinux]  [k] _raw_spin_lock
    - _raw_spin_lock
       + 93.54% __netif_receive_skb_core
       + 6.46% __netif_receive_skb
 +    5.45%  kpktgend_0     [sch_ingress]     [k] ingress_enqueue
 +    4.65%  kpktgend_0     [pktgen]          [k] pktgen_thread_worker
 +    4.23%  kpktgend_0     [kernel.vmlinux]  [k] ip_rcv
 +    3.95%  kpktgend_0     [kernel.vmlinux]  [k] tc_classify_compat
 +    3.71%  kpktgend_0     [kernel.vmlinux]  [k] tc_classify
 +    3.03%  kpktgend_0     [kernel.vmlinux]  [k] netif_receive_skb_internal
 +    2.65%  kpktgend_0     [kernel.vmlinux]  [k] netif_receive_skb_sk
 +    1.97%  kpktgend_0     [kernel.vmlinux]  [k] __netif_receive_skb
 +    0.71%  kpktgend_0     [kernel.vmlinux]  [k] __local_bh_enable_ip
 +    0.28%  kpktgend_0     [kernel.vmlinux]  [k] kthread_should_stop

[2] perf report without ingress qlock

 Samples: 2K of event 'cycles', Event count (approx.): 1633499063
   Overhead  Command       Shared Object        Symbol
 +   39.29%  kpktgend_0    [kernel.vmlinux]     [k] __netif_receive_skb_core
 +   19.24%  kpktgend_0    [kernel.vmlinux]     [k] kfree_skb
 +   11.05%  kpktgend_0    [sch_ingress]        [k] ingress_enqueue
 +    4.69%  kpktgend_0    [kernel.vmlinux]     [k] tc_classify
 +    4.48%  kpktgend_0    [kernel.vmlinux]     [k] ip_rcv
 +    4.43%  kpktgend_0    [kernel.vmlinux]     [k] tc_classify_compat
 +    4.19%  kpktgend_0    [pktgen]             [k] pktgen_thread_worker
 +    3.50%  kpktgend_0    [kernel.vmlinux]     [k] netif_receive_skb_internal
 +    2.61%  kpktgend_0    [kernel.vmlinux]     [k] netif_receive_skb_sk
 +    2.26%  kpktgend_0    [kernel.vmlinux]     [k] __netif_receive_skb
 +    0.43%  kpktgend_0    [kernel.vmlinux]     [k] __local_bh_enable_ip
 +    0.13%  swapper       [kernel.vmlinux]     [k] mwait_idle

