Re: [PATCH] x86: Run checksumming in parallel accross multiple alu's

From: Ingo Molnar <mingo@kernel.org>
To: Neil Horman <nhorman@tuxdriver.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
	linux-kernel@vger.kernel.org, sebastien.dugue@bull.net,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	x86@kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH] x86: Run checksumming in parallel accross multiple alu's
Date: Fri, 1 Nov 2013 10:13:37 +0100
Message-ID: <20131101091337.GA27063@gmail.com>
In-Reply-To: <20131031143325.GB25894@hmsreliant.think-freely.org>


* Neil Horman <nhorman@tuxdriver.com> wrote:

> On Thu, Oct 31, 2013 at 11:22:00AM +0100, Ingo Molnar wrote:
> > 
> > * Neil Horman <nhorman@tuxdriver.com> wrote:
> > 
> > > > etc. For such short runtimes make sure the last column displays 
> > > > close to 100%, so that the PMU results become trustable.
> > > > 
> > > > A nehalem+ PMU will allow 2-4 events to be measured in parallel, 
> > > > plus generics like 'cycles', 'instructions' can be added 'for free' 
> > > > because they get counted in a separate (fixed purpose) PMU register.
> > > > 
> > > > The last column tells you what percentage of the runtime that 
> > > > particular event was actually active. 100% (or empty last column) 
> > > > means it was active all the time.
> > > > 
> > > > Thanks,
> > > > 
> > > > 	Ingo
> > > > 
> > > 
> > > Hmm, 
> > > 
> > > I ran this test:
> > > 
> > > for i in `seq 0 1 3`
> > > do
> > > echo $i > /sys/module/csum_test/parameters/module_test_mode
> > > taskset -c 0 perf stat --repeat 20 -C 0 -e L1-dcache-load-misses -e L1-dcache-prefetches -e cycles -e instructions -ddd ./test.sh
> > > done
> > 
> > You need to remove '-ddd' which is a shortcut for a ton of useful 
> > events, but here you want to use fewer events, to increase the 
> > precision of the measurement.
> > 
> > Thanks,
> > 
> > 	Ingo
> > 
> 
> Thank you Ingo, that fixed it.  I'm trying some other variants of 
> the csum algorithm that Doug and I discussed last night, but FWIW, 
> the relative performance of the 4 test cases 
> (base/prefetch/parallel/both) remains unchanged. I'm starting to 
> feel like at this point, there's very little point in doing 
> parallel ALU operations (unless we can find a way to break the 
> dependency on the carry flag, which is what I'm tinkering with 
> now).
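
For readers following along, one way to picture "breaking the dependency on the carry flag" is sketched below in portable C. This is only an illustration under stated assumptions, not the patch under discussion and not the kernel's do_csum(): the names fold64() and csum_two_acc() are hypothetical, the buffer is assumed small enough that the 64-bit accumulators cannot overflow (each one absorbs at most 32 bits per add), and whether the two chains really issue on separate ALUs is up to the compiler and CPU. Instead of an add-with-carry chain, 32-bit words are added into two independent 64-bit accumulators, so carries collect in the upper bits and are folded once at the end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fold a wide accumulator down to a 16-bit ones'-complement sum. */
static uint16_t fold64(uint64_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * Hypothetical helper, not kernel code. Returns the folded 16-bit
 * ones'-complement sum in native byte order, without final inversion.
 */
static uint16_t csum_two_acc(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint64_t a = 0, b = 0;	/* two independent dependency chains */

	assert(buf != NULL || len == 0);

	while (len >= 8) {
		uint32_t w0, w1;

		memcpy(&w0, p, sizeof(w0));	/* unaligned-safe loads */
		memcpy(&w1, p + 4, sizeof(w1));
		a += w0;	/* carries collect in bits 32..63 */
		b += w1;	/* does not depend on the add above */
		p += 8;
		len -= 8;
	}
	while (len >= 2) {	/* remaining 16-bit words */
		uint16_t w;

		memcpy(&w, p, sizeof(w));
		a += w;
		p += 2;
		len -= 2;
	}
	if (len) {		/* trailing byte, zero-padded */
		uint16_t w = 0;

		memcpy(&w, p, 1);
		a += w;
	}
	return fold64((uint64_t)fold64(a) + fold64(b));
}
```

Summing 32-bit words and folding gives the same result as summing 16-bit words because 2^16 is congruent to 1 modulo 0xffff. The price is that each add consumes only 4 bytes instead of 8, so this may or may not beat an adc chain on a given CPU; it would need to be measured.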

I would still like to encourage you to pick up the improvements that 
Doug measured (mostly via prefetch tweaking?) - that looked like 
some significant speedups that we don't want to lose!

Also, trying to stick the in-kernel implementation into 'perf bench' 
would be a useful first step as well, for this and future efforts.

See what we do in tools/perf/bench/mem-memcpy-x86-64-asm.S to pick 
up the in-kernel assembly memcpy implementations:

#define memcpy MEMCPY /* don't hide glibc's memcpy() */
#define altinstr_replacement text
#define globl p2align 4; .globl
#define Lmemcpy_c globl memcpy_c; memcpy_c
#define Lmemcpy_c_e globl memcpy_c_e; memcpy_c_e

#include "../../../arch/x86/lib/memcpy_64.S"

So it needed a bit of trickery/wrappery for 'perf bench mem memcpy', 
but that is a one-time effort - once it's done then the current 
in-kernel csum_partial() implementation would be easily measurable 
(and any performance regression in it bisectable, etc.) from that 
point on.

In user-space it would also be easier to add various parameters and 
experimental implementations and background cache-stressing 
workloads automatically.

Something similar might be possible for csum_partial(), 
csum_partial_copy*(), etc.
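
To make the user-space idea concrete, here is a rough sketch of what a stand-alone timing harness could look like. It is not perf bench code and does not pull in the in-kernel assembly: ref_csum() is a hypothetical plain-C stand-in for the routine under test, bench_csum() and the csum_fn type are names invented for this example, and the timing uses clock_gettime(CLOCK_MONOTONIC) rather than PMU counters. A real 'perf bench' integration would follow the conventions in tools/perf/bench/ instead.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

typedef uint16_t (*csum_fn)(const void *buf, size_t len);

/* Hypothetical stand-in for the routine under test; NOT csum_partial(). */
static uint16_t ref_csum(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint64_t sum = 0;

	while (len >= 2) {
		uint16_t w;

		memcpy(&w, p, sizeof(w));	/* native byte order */
		sum += w;
		p += 2;
		len -= 2;
	}
	if (len) {
		uint16_t w = 0;

		memcpy(&w, p, 1);		/* trailing byte, zero-padded */
		sum += w;
	}
	while (sum >> 16)			/* fold carries back in */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * Call fn(buf, len) iters times and return the average wall-clock
 * nanoseconds per call. The last checksum is stored through *out so
 * the compiler cannot discard the calls.
 */
static double bench_csum(csum_fn fn, const void *buf, size_t len,
			 unsigned int iters, uint16_t *out)
{
	struct timespec t0, t1;
	volatile uint16_t sink = 0;
	unsigned int i;

	assert(fn != NULL && iters > 0);

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < iters; i++)
		sink = fn(buf, len);
	clock_gettime(CLOCK_MONOTONIC, &t1);

	if (out)
		*out = sink;
	return ((double)(t1.tv_sec - t0.tv_sec) * 1e9 +
		(double)(t1.tv_nsec - t0.tv_nsec)) / iters;
}
```

Because the routine is passed as a function pointer, experimental variants can be timed side by side over the same buffer, for example bench_csum(ref_csum, buf, 1500, 100000, &sum), while a cache-stressing workload runs in the background.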

Note, if any of you ventures to add checksum-benchmarking to perf 
bench, please base any patches on top of tip:perf/core:

  git pull git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git perf/core

as there are a couple of perf bench enhancements in the pipeline 
already for v3.13.

Thanks,

	Ingo
