From: Arnaldo Carvalho de Melo <acme@redhat.com>
To: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>,
	linux-mm@kvack.org, netdev@vger.kernel.org
Subject: Re: [MM PATCH V4.1 5/6] slub: support for bulk free with SLUB freelists
Date: Wed, 7 Oct 2015 10:36:19 -0300
Message-ID: <20151007133619.GR12682@redhat.com>
In-Reply-To: <20151007143120.7068416d@redhat.com>

Em Wed, Oct 07, 2015 at 02:31:20PM +0200, Jesper Dangaard Brouer escreveu:
> On Tue, 6 Oct 2015 01:07:03 +0200
> Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> > (trimmed Cc list a little)
> > 
> > On Mon, 5 Oct 2015 14:20:45 -0700 Andi Kleen <ak@linux.intel.com> wrote:
> > 
> > > > My only problem left, is I want a perf measurement that pinpoint these
> > > > kind of spots.  The difference in L1-icache-load-misses were significant
> > > > (1,278,276 vs 2,719,158).  I tried to somehow perf record this with
> > > > different perf events without being able to pinpoint the location (even
> > > > though I know the spot now).  Even tried Andi's ocperf.py... maybe he
> > > > will know what event I should try?
> > > 
> > > Run pmu-tools toplev.py -l3 with --show-sample. It tells you what the
> > > bottle neck is and what to sample for if there is a suitable event and
> > > even prints the command line.
> > > 
> > > https://github.com/andikleen/pmu-tools/wiki/toplev-manual#sampling-with-toplev
> > > 
> > 
> > My result from (IP-forward flow hitting CPU 0):
> >  $ sudo ./toplev.py -I 1000 -l3 -a --show-sample --core C0
> > 
> > So, what does this tell me?:
> >
> >  C0    BAD     Bad_Speculation:                                 0.00 % [  5.50%]
> >  C0    BE      Backend_Bound:                                 100.00 % [  5.50%]
> >  C0    BE/Mem  Backend_Bound.Memory_Bound:                     53.06 % [  5.50%]
> >  C0    BE/Core Backend_Bound.Core_Bound:                       46.94 % [  5.50%]
> >  C0-T0 FE      Frontend_Bound.Frontend_Latency.Branch_Resteers: 5.42 % [  5.50%]
> >  C0-T0 BE/Mem  Backend_Bound.Memory_Bound.L1_Bound:            54.51 % [  5.50%]
> >  C0-T0 BE/Core Backend_Bound.Core_Bound.Ports_Utilization:     20.99 % [  5.60%]
> >  C0-T0         CPU utilization: 1.00 CPUs   	[100.00%]
> >  C0-T1 FE      Frontend_Bound.Frontend_Latency.Branch_Resteers: 6.04 % [  5.50%]
> >  C0-T1         CPU utilization: 1.00 CPUs   	[100.00%]
> 
> Reading: https://github.com/andikleen/pmu-tools/wiki/toplev-manual
> Helped me understand most of above.
> 
> My specific CPU (i7-4790K @ 4.00GHz) unfortunately seems to have
> limited "Frontend" support. E.g. 
> 
>  # perf record -g -a -e stalled-cycles-frontend
>  Error:
>  The stalled-cycles-frontend event is not supported.
> 
> And AFAIK icache misses are part of "frontend".
> 
> 
> > Unfortunately the perf command it gives me fails with:
> >  "invalid or unsupported event".
> > 
> > Perf command:
> > 
> >  sudo ./ocperf.py record -g -e \
> >   cpu/event=0xc5,umask=0x0,name=Branch_Resteers_BR_MISP_RETIRED_ALL_BRANCHES:pp,period=400009/pp,\
> >   cpu/event=0xd,umask=0x3,cmask=1,name=Bad_Speculation_INT_MISC_RECOVERY_CYCLES,period=2000003/,\
> >   cpu/event=0xd1,umask=0x1,name=L1_Bound_MEM_LOAD_UOPS_RETIRED_L1_HIT:pp,period=2000003/pp,\
> >   cpu/event=0xd1,umask=0x40,name=L1_Bound_MEM_LOAD_UOPS_RETIRED_HIT_LFB:pp,period=100003/pp \
> >   -C 0,4 -a
> 
> I fixed the problem with this perf command by removing the ":pp" part.
> Perhaps your tool need to fix that?
> 
> A working command line looks like this:
> 
>  sudo ./ocperf.py record -g -e \
> cpu/event=0xc5,umask=0x0,name=Branch_Resteers_BR_MISP_RETIRED_ALL_BRANCHES,period=400009/pp,\
> cpu/event=0xd,umask=0x3,cmask=1,name=Bad_Speculation_INT_MISC_RECOVERY_CYCLES,period=2000003/,\
> cpu/event=0xd1,umask=0x1,name=L1_Bound_MEM_LOAD_UOPS_RETIRED_L1_HIT,period=2000003/pp,\
> cpu/event=0xd1,umask=0x40,name=L1_Bound_MEM_LOAD_UOPS_RETIRED_HIT_LFB,period=100003/pp \
>   -C 0,4 -a

There is a recent patch that may help here, see below, but maybe it's
just a matter of removing that :pp, since each event already ends with
/pp and there is no need to state that twice :)

With the patch below all those /pp would be replaced with /P.
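
A minimal sketch of that rewrite, in case it helps. fix_events is a
hypothetical helper made up for this mail; it is not part of perf,
ocperf.py or toplev. It assumes the event list is a single
comma-separated string of cpu/.../ specs like the ones quoted above.
It drops a ":pp" embedded in a name= term and turns a trailing "/pp"
modifier into "/P":

```shell
#!/bin/sh
# Hypothetical helper (not part of perf or pmu-tools): clean up an
# event list of the form cpu/...,name=FOO:pp,period=N/pp,cpu/.../...
fix_events() {
    # 1st expression: strip ":pp" that ended up inside the name= term.
    # 2nd expression: replace a "/pp" modifier that closes an event
    #                 (followed by "," or end of line) with "/P".
    printf '%s\n' "$1" | sed -E \
        -e 's/(name=[A-Za-z0-9_]+):pp/\1/g' \
        -e 's#/pp(,|$)#/P\1#g'
}

# Example with one of the events from the failing command line.
fix_events 'cpu/event=0xd1,umask=0x1,name=L1_Bound_MEM_LOAD_UOPS_RETIRED_L1_HIT:pp,period=2000003/pp'
# prints: cpu/event=0xd1,umask=0x1,name=L1_Bound_MEM_LOAD_UOPS_RETIRED_L1_HIT,period=2000003/P
```

The result would then be passed to -e as before. Events that close
with a bare "/" (no modifier) are left untouched, and /P is only
accepted by a perf that carries the patch below.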

- Arnaldo


https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/tools/perf?id=7f94af7a489fada17d28cc60e8f4409ce216bd6d

----------------------------------------------------------------------
perf tools: Introduce 'P' modifier to request max precision
The 'P' will cause the event to get maximum possible detected precise
level.

Following record:
  $ perf record -e cycles:P ...

will detect maximum precise level for 'cycles' event and use it.
----------------------------------------------------------------------

- Arnaldo

