From: Ingo Molnar <mingo@kernel.org>
To: "H. Peter Anvin" <hpa@zytor.com>
Cc: Neil Horman <nhorman@tuxdriver.com>,
Eric Dumazet <eric.dumazet@gmail.com>,
linux-kernel@vger.kernel.org, sebastien.dugue@bull.net,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>,
x86@kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH] x86: Run checksumming in parallel accross multiple alu's
Date: Fri, 18 Oct 2013 08:43:21 +0200
Message-ID: <20131018064321.GG14264@gmail.com>
In-Reply-To: <52602A29.506@zytor.com>
* H. Peter Anvin <hpa@zytor.com> wrote:
> On 10/17/2013 01:41 AM, Ingo Molnar wrote:
> >
> > To correctly simulate the workload you'd have to:
> >
> > - allocate a buffer larger than your L2 cache.
> >
> > - to measure the effects of the prefetches you'd also have to randomize
> > the individual buffer positions. See how 'perf bench numa' implements a
> > random walk via --data_rand_walk, in tools/perf/bench/numa.c.
> > Otherwise the CPU might learn your simplistic stream direction and the
> > L2 cache might hw-prefetch your data, interfering with any explicit
> > prefetches the code does. In many real-life usecases packet buffers are
> > scattered.
> >
> > Also, it would be nice to see standard deviation noise numbers when two
> > averages are close to each other, to be able to tell whether differences
> > are statistically significant or not.
>
>
> Seriously, though, how much does it matter? All the above seems likely
> to do is to drown the signal by adding noise.
I think it matters a lot, and I don't think it 'adds' noise - it measures
something else (cache-cold behavior - which is the common case for
first-time csum_partial() use for network packets), which was not measured
before, and which by its nature has different noise patterns.

I've done many cache-cold measurements myself and had no trouble achieving
statistically significant results and high precision.
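
To make the kind of measurement described above concrete, here is a rough
user-space sketch. It is only an illustration under stated assumptions, not
the benchmark used in this thread: simple_csum() is a plain stand-in for
csum_partial(), and the names bench_random_offsets(), mean_var() and
xorshift64() are hypothetical helpers. Each sample checksums a "packet" at a
random offset inside a caller-supplied working set and records the elapsed
time, so the mean and the sample variance can be reported together.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Plain 16-bit ones'-complement sum; a stand-in, NOT the kernel's csum_partial(). */
static uint16_t simple_csum(const uint8_t *p, size_t len)
{
	uint64_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)p[i] | ((uint32_t)p[i + 1] << 8);
	if (len & 1)
		sum += p[len - 1];
	while (sum >> 16)		/* fold the carries back in */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Small PRNG so the offsets do not depend on RAND_MAX (hypothetical helper). */
static uint64_t xorshift64(uint64_t *s)
{
	uint64_t x = *s;

	x ^= x << 13;
	x ^= x >> 7;
	x ^= x << 17;
	return *s = x;
}

static double now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (double)ts.tv_sec * 1e9 + (double)ts.tv_nsec;
}

/* Mean and sample variance; the standard deviation is the variance's square root. */
static void mean_var(const double *v, size_t n, double *mean, double *var)
{
	double m = 0.0, ss = 0.0;
	size_t i;

	for (i = 0; i < n; i++)
		m += v[i];
	m /= (double)n;
	for (i = 0; i < n; i++)
		ss += (v[i] - m) * (v[i] - m);
	*mean = m;
	*var = n > 1 ? ss / (double)(n - 1) : 0.0;
}

/*
 * Time nr checksums of pkt_len bytes at random offsets in a buf_size buffer.
 * buf_size should be well above the last-level cache size for the samples
 * to be mostly cache-cold. Returns 0 on success, -1 on bad input or ENOMEM.
 */
static int bench_random_offsets(size_t buf_size, size_t pkt_len, size_t nr,
				double *mean_ns, double *var_ns, uint16_t *sink)
{
	uint64_t seed = 88172645463325252ULL;
	uint16_t acc = 0;
	uint8_t *buf;
	double *t;
	size_t i;

	if (!nr || buf_size <= pkt_len)
		return -1;
	buf = malloc(buf_size);
	t = malloc(nr * sizeof(*t));
	if (!buf || !t) {
		free(buf);
		free(t);
		return -1;
	}
	for (i = 0; i < buf_size; i++)	/* touch every page up front */
		buf[i] = (uint8_t)(i * 31u);
	for (i = 0; i < nr; i++) {
		size_t off = (size_t)(xorshift64(&seed) % (buf_size - pkt_len));
		double t0 = now_ns();

		acc ^= simple_csum(buf + off, pkt_len);
		t[i] = now_ns() - t0;
	}
	mean_var(t, nr, mean_ns, var_ns);
	*sink = acc;			/* keep the compiler from dropping the work */
	free(buf);
	free(t);
	return 0;
}
```

A call such as bench_random_offsets(64u << 20, 1500, 1000, &mean, &var, &sink)
would use a 64 MiB working set; that size is an assumption and needs to be
raised on machines with a larger last-level cache. The clock_gettime() overhead
is included in every sample, so very short packets would need a cheaper timer.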
> If the parallel (threaded) checksumming is faster, which theory says it
> should and microbenchmarking confirms, how important are the
> macrobenchmarks?
Microbenchmarks can be totally blind to things like the ideal prefetch
window size. (Or whether a prefetch should be done at all: some CPUs will
throw away prefetches if enough regular fetches arrive.)

Also, 'naive' single-threaded algorithms can occasionally be better in the
cache-cold case because a linear, predictable stream of memory accesses
might saturate the memory bus better than a somewhat random-looking,
interleaved web of accesses that might not harmonize with buffer depths.

I _think_ that, if correctly tuned, the parallel algorithm should be better
in the cache-cold case; I just don't know with what parameters (the
algorithm has at least one free parameter: the prefetch window size), and
I don't know how significant the effect is.

Also, more fundamentally, I absolutely detest doing no measurements or
measuring the wrong thing - IMHO there are too many 'blind' optimization
commits in the kernel with little to no observational data attached.
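
As an illustration of the free parameter mentioned above, a checksum loop can
take the prefetch window as an argument so that different values can be
measured rather than guessed. This is a hedged sketch, not the patch under
discussion: csum_prefetch() and fold16() are hypothetical names, the 64-byte
cache line is an assumption, and __builtin_prefetch() is a GCC/Clang builtin.
The two accumulators stand in for the 'parallel' idea: ones'-complement
addition is associative and commutative, so the split does not change the
result.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64u	/* assumed line size */

/* Fold a wide sum down to 16 bits with end-around carry. */
static uint16_t fold16(uint64_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * 16-bit ones'-complement sum with two independent accumulators.
 * prefetch_window is the distance in bytes to prefetch ahead of the
 * current position; 0 disables explicit prefetching.
 */
static uint16_t csum_prefetch(const uint8_t *p, size_t len,
			      size_t prefetch_window)
{
	uint64_t s0 = 0, s1 = 0;	/* two separate dependency chains */
	size_t i = 0;

	for (; i + 4 <= len; i += 4) {
		/* Issue one prefetch per cache line, and only inside the buffer. */
		if (prefetch_window && (i % CACHE_LINE) == 0 &&
		    i + prefetch_window < len)
			__builtin_prefetch(p + i + prefetch_window, 0, 0);
		s0 += (uint32_t)p[i]     | ((uint32_t)p[i + 1] << 8);
		s1 += (uint32_t)p[i + 2] | ((uint32_t)p[i + 3] << 8);
	}
	for (; i + 2 <= len; i += 2)	/* leftover 16-bit word */
		s0 += (uint32_t)p[i] | ((uint32_t)p[i + 1] << 8);
	if (i < len)			/* leftover odd byte */
		s0 += p[i];
	return fold16(s0 + s1);
}
```

The result must be identical for every window value, which makes it easy to
sweep prefetch_window (0, 64, 128, 256, ...) under a cache-cold benchmark and
compare timings with their noise figures attached.
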
Thanks,
Ingo