netdev.vger.kernel.org archive mirror
From: Jesper Dangaard Brouer <brouer@redhat.com>
To: William Tu <u9012063@gmail.com>
Cc: "Björn Töpel" <bjorn.topel@gmail.com>,
	magnus.karlsson@intel.com,
	"Alexander Duyck" <alexander.h.duyck@intel.com>,
	"Alexander Duyck" <alexander.duyck@gmail.com>,
	"John Fastabend" <john.fastabend@gmail.com>,
	"Alexei Starovoitov" <ast@fb.com>,
	willemdebruijn.kernel@gmail.com,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Linux Kernel Network Developers" <netdev@vger.kernel.org>,
	"Björn Töpel" <bjorn.topel@intel.com>,
	michael.lundkvist@ericsson.com, jesse.brandeburg@intel.com,
	anjali.singhai@intel.com, jeffrey.b.shaw@intel.com,
	ferruh.yigit@intel.com, qi.z.zhang@intel.com, brouer@redhat.com,
	dendibakh@gmail.com
Subject: Re: [RFC PATCH 00/24] Introducing AF_XDP support
Date: Wed, 28 Mar 2018 10:01:36 +0200	[thread overview]
Message-ID: <20180328100136.6202448b@redhat.com> (raw)
In-Reply-To: <CALDO+Sb=8yTdEofBB5Nav-Ea+T-bzqm6eM6_1LLb46etMz+ULA@mail.gmail.com>

On Tue, 27 Mar 2018 17:06:50 -0700
William Tu <u9012063@gmail.com> wrote:

> On Tue, Mar 27, 2018 at 2:37 AM, Jesper Dangaard Brouer
> <brouer@redhat.com> wrote:
> > On Mon, 26 Mar 2018 14:58:02 -0700
> > William Tu <u9012063@gmail.com> wrote:
> >  
> >> > Again high count for NMI ?!?
> >> >
> >> > Maybe you just forgot to tell perf that you want it to decode the
> >> > bpf_prog correctly?
> >> >
> >> > https://prototype-kernel.readthedocs.io/en/latest/bpf/troubleshooting.html#perf-tool-symbols
> >> >
> >> > Enable via:
> >> >  $ sysctl net/core/bpf_jit_kallsyms=1
> >> >
> >> > And use perf report (while BPF is STILL LOADED):
> >> >
> >> >  $ perf report --kallsyms=/proc/kallsyms
> >> >
> >> > E.g. for emailing this you can use this command:
> >> >
> >> >  $ perf report --sort cpu,comm,dso,symbol --kallsyms=/proc/kallsyms --no-children --stdio -g none | head -n 40
> >> >  
> >>
> >> Thanks, I followed the steps; here is the result for l2fwd:
> >> # Total Lost Samples: 119
> >> #
> >> # Samples: 2K of event 'cycles:ppp'
> >> # Event count (approx.): 25675705627
> >> #
> >> # Overhead  CPU  Command  Shared Object       Symbol
> >> # ........  ...  .......  ..................  ..................................
> >> #
> >>     10.48%  013  xdpsock  xdpsock             [.] main
> >>      9.77%  013  xdpsock  [kernel.vmlinux]    [k] clflush_cache_range
> >>      8.45%  013  xdpsock  [kernel.vmlinux]    [k] nmi
> >>      8.07%  013  xdpsock  [kernel.vmlinux]    [k] xsk_sendmsg
> >>      7.81%  013  xdpsock  [kernel.vmlinux]    [k] __domain_mapping
> >>      4.95%  013  xdpsock  [kernel.vmlinux]    [k] ixgbe_xmit_frame_ring
> >>      4.66%  013  xdpsock  [kernel.vmlinux]    [k] skb_store_bits
> >>      4.39%  013  xdpsock  [kernel.vmlinux]    [k] syscall_return_via_sysret
> >>      3.93%  013  xdpsock  [kernel.vmlinux]    [k] pfn_to_dma_pte
> >>      2.62%  013  xdpsock  [kernel.vmlinux]    [k] __intel_map_single
> >>      2.53%  013  xdpsock  [kernel.vmlinux]    [k] __alloc_skb
> >>      2.36%  013  xdpsock  [kernel.vmlinux]    [k] iommu_no_mapping
> >>      2.21%  013  xdpsock  [kernel.vmlinux]    [k] alloc_skb_with_frags
> >>      2.07%  013  xdpsock  [kernel.vmlinux]    [k] skb_set_owner_w
> >>      1.98%  013  xdpsock  [kernel.vmlinux]    [k] __kmalloc_node_track_caller
> >>      1.94%  013  xdpsock  [kernel.vmlinux]    [k] ksize
> >>      1.84%  013  xdpsock  [kernel.vmlinux]    [k] validate_xmit_skb_list
> >>      1.62%  013  xdpsock  [kernel.vmlinux]    [k] kmem_cache_alloc_node
> >>      1.48%  013  xdpsock  [kernel.vmlinux]    [k] __kmalloc_reserve.isra.37
> >>      1.21%  013  xdpsock  xdpsock             [.] xq_enq
> >>      1.08%  013  xdpsock  [kernel.vmlinux]    [k] intel_alloc_iova
> >>  
> >
> > You did use net/core/bpf_jit_kallsyms=1 and the correct perf command to
> > decode bpf_prog symbols, so the perf top #3 'nmi' is likely a real NMI call... which looks wrong.
> >  
> Thanks, you're right. Let me dig more on this NMI behavior.
> 
> >  
> >> And l2fwd under "perf stat" looks OK to me. There are few context
> >> switches, the CPU is fully utilized, and 1.17 insn per cycle seems OK.
> >>
> >> Performance counter stats for 'CPU(s) 6':
> >>   10000.787420      cpu-clock (msec)          #    1.000 CPUs utilized
> >>             24      context-switches          #    0.002 K/sec
> >>              0      cpu-migrations            #    0.000 K/sec
> >>              0      page-faults               #    0.000 K/sec
> >> 22,361,333,647      cycles                    #    2.236 GHz
> >> 13,458,442,838      stalled-cycles-frontend   #   60.19% frontend cycles idle
> >> 26,251,003,067      instructions              #    1.17  insn per cycle
> >>                                               #    0.51  stalled cycles per insn
> >>  4,938,921,868      branches                  #  493.853 M/sec
> >>      7,591,739      branch-misses             #    0.15% of all branches
> >>   10.000835769 seconds time elapsed  
> >
> > This perf stat also indicates something is wrong.
> >
> > The 1.17 insn per cycle is NOT okay; it is too low compared to what I
> > usually see (e.g. 2.36 insn per cycle).
> >
> > It clearly says you have 'stalled-cycles-frontend' and '60.19% frontend
> > cycles idle'.   This means your CPU has an issue/bottleneck fetching
> > instructions, as explained by Andi Kleen here [1]
> > [1] https://github.com/andikleen/pmu-tools/wiki/toplev-manual
> >  
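For reference, the "frontend cycles idle" percentage is simply
stalled-cycles-frontend divided by cycles. A quick sketch reproducing the
60.19% figure from the two counters in the perf stat output above:

```shell
# Reproduce perf stat's "frontend cycles idle" percentage from the
# stalled-cycles-frontend and cycles counters quoted above.
awk -v stalled=13458442838 -v cycles=22361333647 \
    'BEGIN { printf "frontend cycles idle: %.2f%%\n", 100 * stalled / cycles }'
```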
> thanks for the link!
>
> It's definitely weird that my frontend-cycle (fetch and decode) stall
> count is so high.
>
> I assume the xdpsock code is small and should fit entirely in the icache.
> However, doing another perf stat on xdpsock l2fwd shows
> 
> 13,720,109,581      stalled-cycles-frontend   # 60.01% frontend cycles idle        (23.82%)
> <not supported>     stalled-cycles-backend
>      7,994,837      branch-misses             # 0.16% of all branches              (23.80%)
>    996,874,424      bus-cycles                # 99.679 M/sec                       (23.80%)
> 18,942,220,445      ref-cycles                # 1894.067 M/sec                     (28.56%)
>    100,983,226      LLC-loads                 # 10.097 M/sec                       (23.80%)
>      4,897,089      LLC-load-misses           # 4.85% of all LL-cache hits         (23.80%)
>     66,659,889      LLC-stores                # 6.665 M/sec                        (9.52%)
>          8,373      LLC-store-misses          # 0.837 K/sec                        (9.52%)
>    158,178,410      LLC-prefetches            # 15.817 M/sec                       (9.52%)
>      3,011,180      LLC-prefetch-misses       # 0.301 M/sec                        (9.52%)
>  8,190,383,109      dTLB-loads                # 818.971 M/sec                      (9.52%)
>     20,432,204      dTLB-load-misses          # 0.25% of all dTLB cache hits       (9.52%)
>  3,729,504,674      dTLB-stores               # 372.920 M/sec                      (9.52%)
>        992,231      dTLB-store-misses         # 0.099 M/sec                        (9.52%)
> <not supported>     dTLB-prefetches
> <not supported>     dTLB-prefetch-misses
>         11,619      iTLB-loads                # 0.001 M/sec                        (9.52%)
>      1,874,756      iTLB-load-misses          # 16135.26% of all iTLB cache hits   (14.28%)

What was the sample period for this perf stat?

> I have super high iTLB-load-misses. This is probably the cause of the
> high frontend stalls.

It looks very strange that your iTLB-loads count is only 11,619, while the
iTLB-load-misses count is much, much higher at 1,874,756.
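For reference, perf derives that percentage as misses divided by loads, so
with a loads count this low the ratio explodes far past 100%. A quick
sketch reproducing the figure from the counters above:

```shell
# Reproduce perf stat's iTLB miss percentage: misses as a fraction of
# loads, using the two counters quoted above.
awk -v misses=1874756 -v loads=11619 \
    'BEGIN { printf "%.2f%% of all iTLB cache hits\n", 100 * misses / loads }'
```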

> Do you know any way to improve iTLB hit rate?

The xdpsock code should be small enough to fit in the iCache, but it
might be laid out in memory in an unfortunate way.  You could experiment
with rearranging the C code (look at the alignments in the objdump output).
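E.g. a quick way to eyeball alignment, sketched here with hypothetical
addresses (in practice you would feed in the function start addresses that
`objdump -d xdpsock` prints):

```shell
# Check whether a function start address (as printed by objdump) falls on
# a 16-byte fetch boundary; unaligned hot functions can hurt fetch/decode.
is_aligned() {
    # $1 = hex address without 0x prefix, $2 = alignment in bytes
    if [ $(( 0x$1 % $2 )) -eq 0 ]; then
        echo "$1: aligned"
    else
        echo "$1: unaligned"
    fi
}

is_aligned 4006f0 16
is_aligned 4006f5 16
```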

If you want to know the details about code alignment issues, and how to
troubleshoot them, you should read this VERY excellent blog post by
Denis Bakhvalov:
https://dendibakh.github.io/blog/2018/01/18/Code_alignment_issues
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

