Linux RAID subsystem development
From: Shaohua Li <shli@kernel.org>
To: Doug Dumitru <doug@easyco.com>
Cc: linux-raid <linux-raid@vger.kernel.org>,
	gayatri.kammela@intel.com, ravi.v.shankar@intel.com,
	hpa@zytor.com, yu-cheng.yu@intel.com, yuanhan.liu@intel.com
Subject: Re: Prefetch in /lib/raid6/avx2.c
Date: Wed, 5 Oct 2016 16:17:10 -0700
Message-ID: <20161005231710.GB2804@kernel.org>
In-Reply-To: <CAFx4rwS5-TCWKxRYpXHeRsfTiJ=mTV0gxoL-yUuqoEbpXst08A@mail.gmail.com>

On Sun, Oct 02, 2016 at 03:40:09PM -0700, Doug Dumitru wrote:
> I have been doing some high-bandwidth testing of raid-6, and the
> prefetch in raid6_avx24_gen_syndrome appears to be less than optimal.
> 
> This is my patch (against 4.4.0-38, Ubuntu 16.04 LTS):
> 
> --- cut here ---
> --- lib/raid6/avx2.c0   2016-10-01 21:42:25.280347868 -0700
> +++ lib/raid6/avx2.c    2016-10-02 15:35:48.168480760 -0700
> @@ -189,10 +189,8 @@
> 
>                 for (z = z0; z >= 0; z--) {
> 
> -                       asm volatile("prefetchnta %0" : : "m" (dptr[z][d]));
> -                       asm volatile("prefetchnta %0" : : "m" (dptr[z][d+32]));
> -                       asm volatile("prefetchnta %0" : : "m" (dptr[z][d+64]));
> -                       asm volatile("prefetchnta %0" : : "m" (dptr[z][d+96]));
> +                       asm volatile("prefetchnta %0" : : "m" (dptr[z][d+128]));
> +                       asm volatile("prefetchnta %0" : : "m" (dptr[z][d+192]));
> 
>                         asm volatile("vpcmpgtb %ymm4,%ymm1,%ymm5");
>                         asm volatile("vpcmpgtb %ymm6,%ymm1,%ymm7");
> --- cut here ---
> 
> In perf, the share of CPU cycles spent in raid6_avx24_gen_syndrome
> drops from 5.3% to 3.0% in my test, and throughput increases from
> about 8.2GB/sec to almost 10GB/sec.  It is a very "synthetic" test,
> but the avx2 code does seem to be a factor.
> 
> I suspect other SSE and AVX "unroll variants" have similar issues, but
> I have not tested those.
> 
> My test system is an E5-1650 v3 (single socket) with DDR4.  This might
> help dual sockets even more.

CCing some Intel folks to see if they have ideas.
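
For anyone who wants to experiment outside the kernel, here is a minimal
user-space sketch of the idea in the patch: hint the cache lines for the
next loop iteration instead of the ones being read right now. It is not the
kernel code. It computes plain XOR parity only (no Q syndrome, no AVX2), and
the function name, the 128-byte step and the 64-byte cache-line size are all
assumptions made for illustration. __builtin_prefetch is the GCC/Clang
builtin; with locality 0 it usually compiles to prefetchnta on x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64   /* assumed cache-line size in bytes */
#define STEP       128  /* assumed bytes handled per loop iteration */

/*
 * Hypothetical helper, not taken from lib/raid6: XOR `ndisks` data blocks
 * into the parity block `p`, issuing non-temporal read prefetch hints
 * `dist` bytes ahead of the current offset `d`.
 */
void xor_parity_prefetch(uint8_t **dptr, int ndisks, uint8_t *p,
                         size_t bytes, size_t dist)
{
    assert(bytes % STEP == 0);

    for (size_t d = 0; d < bytes; d += STEP) {
        uint8_t acc[STEP] = {0};

        for (int z = ndisks - 1; z >= 0; z--) {
            /* One hint per cache line; skip hints past the end of the block. */
            for (size_t off = 0; off < STEP; off += CACHE_LINE) {
                if (d + dist + off < bytes)
                    __builtin_prefetch(&dptr[z][d + dist + off], 0, 0);
            }
            /* Accumulate this disk's contribution to the parity. */
            for (size_t i = 0; i < STEP; i++)
                acc[i] ^= dptr[z][d + i];
        }

        for (size_t i = 0; i < STEP; i++)
            p[d + i] = acc[i];
    }
}
```

Calling it with dist = 0 hints the data that is about to be read in the
same iteration, roughly like the old offsets, while dist = 128 hints the
next iteration's data, like the patched d+128 and d+192. The result is the
same either way, since the hints only affect timing. Whether a larger
distance helps will depend on the CPU and memory, so it needs measuring.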
