From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Andi Kleen <ak@linux.intel.com>
Cc: linux-mm@kvack.org, netdev@vger.kernel.org,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	brouer@redhat.com
Subject: Re: [MM PATCH V4.1 5/6] slub: support for bulk free with SLUB freelists
Date: Wed, 7 Oct 2015 14:31:20 +0200
Message-ID: <20151007143120.7068416d@redhat.com>
In-Reply-To: <20151006010703.09e2f0ff@redhat.com>

On Tue, 6 Oct 2015 01:07:03 +0200
Jesper Dangaard Brouer <brouer@redhat.com> wrote:

> (trimmed Cc list a little)
> 
> On Mon, 5 Oct 2015 14:20:45 -0700 Andi Kleen <ak@linux.intel.com> wrote:
> 
> > > My only problem left, is I want a perf measurement that pinpoint these
> > > kind of spots.  The difference in L1-icache-load-misses were significant
> > > (1,278,276 vs 2,719,158).  I tried to somehow perf record this with
> > > different perf events without being able to pinpoint the location (even
> > > though I know the spot now).  Even tried Andi's ocperf.py... maybe he
> > > will know what event I should try?
> > 
> > Run pmu-tools toplev.py -l3 with --show-sample. It tells you what the
> > bottle neck is and what to sample for if there is a suitable event and
> > even prints the command line.
> > 
> > https://github.com/andikleen/pmu-tools/wiki/toplev-manual#sampling-with-toplev
> > 
> 
> My result from (IP-forward flow hitting CPU 0):
>  $ sudo ./toplev.py -I 1000 -l3 -a --show-sample --core C0
> 
> So, what does this tell me?:
>
>  C0    BAD     Bad_Speculation:                                 0.00 % [  5.50%]
>  C0    BE      Backend_Bound:                                 100.00 % [  5.50%]
>  C0    BE/Mem  Backend_Bound.Memory_Bound:                     53.06 % [  5.50%]
>  C0    BE/Core Backend_Bound.Core_Bound:                       46.94 % [  5.50%]
>  C0-T0 FE      Frontend_Bound.Frontend_Latency.Branch_Resteers: 5.42 % [  5.50%]
>  C0-T0 BE/Mem  Backend_Bound.Memory_Bound.L1_Bound:            54.51 % [  5.50%]
>  C0-T0 BE/Core Backend_Bound.Core_Bound.Ports_Utilization:     20.99 % [  5.60%]
>  C0-T0         CPU utilization: 1.00 CPUs   	[100.00%]
>  C0-T1 FE      Frontend_Bound.Frontend_Latency.Branch_Resteers: 6.04 % [  5.50%]
>  C0-T1         CPU utilization: 1.00 CPUs   	[100.00%]

Reading https://github.com/andikleen/pmu-tools/wiki/toplev-manual
helped me understand most of the above.

My specific CPU (i7-4790K @ 4.00GHz) unfortunately seems to have
limited "Frontend" event support, e.g.:

 # perf record -g -a -e stalled-cycles-frontend
 Error:
 The stalled-cycles-frontend event is not supported.

And AFAIK icache misses are part of "frontend".


> Unfortunately the perf command it gives me fails with:
>  "invalid or unsupported event".
> 
> Perf command:
> 
>  sudo ./ocperf.py record -g -e \
>   cpu/event=0xc5,umask=0x0,name=Branch_Resteers_BR_MISP_RETIRED_ALL_BRANCHES:pp,period=400009/pp,\
>   cpu/event=0xd,umask=0x3,cmask=1,name=Bad_Speculation_INT_MISC_RECOVERY_CYCLES,period=2000003/,\
>   cpu/event=0xd1,umask=0x1,name=L1_Bound_MEM_LOAD_UOPS_RETIRED_L1_HIT:pp,period=2000003/pp,\
>   cpu/event=0xd1,umask=0x40,name=L1_Bound_MEM_LOAD_UOPS_RETIRED_HIT_LFB:pp,period=100003/pp \
>   -C 0,4 -a

I fixed this perf command by removing the ":pp" suffix from the
name= fields (the trailing /pp modifier stays).  Perhaps your tool
needs to fix that?

A working command line looks like this:

 sudo ./ocperf.py record -g -e \
cpu/event=0xc5,umask=0x0,name=Branch_Resteers_BR_MISP_RETIRED_ALL_BRANCHES,period=400009/pp,\
cpu/event=0xd,umask=0x3,cmask=1,name=Bad_Speculation_INT_MISC_RECOVERY_CYCLES,period=2000003/,\
cpu/event=0xd1,umask=0x1,name=L1_Bound_MEM_LOAD_UOPS_RETIRED_L1_HIT,period=2000003/pp,\
cpu/event=0xd1,umask=0x40,name=L1_Bound_MEM_LOAD_UOPS_RETIRED_HIT_LFB,period=100003/pp \
  -C 0,4 -a
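
For anyone hitting the same error, the edit is mechanical.  Below is a
minimal Python sketch, assuming the only problem is a ":pp" suffix that
ended up inside the name= fields.  The helper name strip_pp_from_names
is hypothetical and not part of pmu-tools; it only rewrites the event
string and leaves the trailing /pp modifier untouched.

```python
import re

# Matches a ":p"/":pp" suffix that leaked into a name= field of a raw
# cpu/.../ event spec.  The trailing "/pp" modifier starts with "/", not
# ":", so it is never matched.
_NAME_PP = re.compile(r"(name=[^,/:]+):p+(?=[,/])")


def strip_pp_from_names(events: str) -> str:
    """Return the event list with ':pp' removed from every name= value."""
    return _NAME_PP.sub(r"\1", events)


if __name__ == "__main__":
    # One of the events from the failing command line above.
    broken = (
        "cpu/event=0xd1,umask=0x1,"
        "name=L1_Bound_MEM_LOAD_UOPS_RETIRED_L1_HIT:pp,period=2000003/pp"
    )
    print(strip_pp_from_names(broken))
```

Events that carry no ":pp" in their name (such as the
INT_MISC_RECOVERY_CYCLES one) pass through unchanged.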

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

