From mboxrd@z Thu Jan  1 00:00:00 1970
From: zhihong.wang-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org
Subject: [PATCH 0/4] DPDK memcpy optimization
Date: Mon, 19 Jan 2015 09:53:30 +0800
Message-ID: <1421632414-10027-1-git-send-email-zhihong.wang@intel.com>
To: dev-VfR2kkLFssw@public.gmane.org
Return-path: <dev-bounces-VfR2kkLFssw@public.gmane.org>
List-Id: patches and discussions about DPDK <dev.dpdk.org>

This patch set optimizes DPDK's memcpy on both SSE and AVX platforms.
It also extends memcpy test coverage with unaligned cases and more test points.

Optimization techniques are summarized below:

1. Utilize full cache bandwidth

2. Enforce aligned stores

3. Apply load address alignment based on architecture features

4. Make load/store address available as early as possible

5. General optimization techniques such as inlining, branch reduction, and prefetch-friendly access patterns
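
As a rough illustration of techniques 2 and 3 only, here is a minimal sketch, assuming an x86 target with SSE2. It is not the code from this patch set: the function name copy_aligned_store_sketch and the 32-byte threshold are hypothetical, chosen just for the example. It copies an unaligned head, advances the destination to a 16-byte boundary so that every store in the main loop is aligned, leaves the loads unaligned, and finishes with one overlapping unaligned tail copy.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper for illustration, not taken from rte_memcpy.h.
 * Like memcpy, it assumes src and dst do not overlap.
 */
static inline void *
copy_aligned_store_sketch(void *dst, const void *src, size_t n)
{
	uint8_t *d = (uint8_t *)dst;
	const uint8_t *s = (const uint8_t *)src;
	size_t skip;
	__m128i v;

	/* Small copies: defer to libc (threshold is an arbitrary assumption). */
	if (n < 32)
		return memcpy(dst, src, n);

	/* Head: one unaligned 16-byte copy covers the bytes before the boundary. */
	v = _mm_loadu_si128((const __m128i *)s);
	_mm_storeu_si128((__m128i *)d, v);

	/* Advance 1..16 bytes so that d lands on a 16-byte boundary. */
	skip = 16 - ((uintptr_t)d & 15);
	d += skip;
	s += skip;
	n -= skip;
	assert(((uintptr_t)d & 15) == 0);

	/* Main loop: unaligned loads, aligned stores. */
	while (n >= 16) {
		v = _mm_loadu_si128((const __m128i *)s);
		_mm_store_si128((__m128i *)d, v);
		d += 16;
		s += 16;
		n -= 16;
	}

	/* Tail: one unaligned copy ending at the last byte; it may rewrite
	 * bytes already copied, which is harmless without overlap. */
	if (n > 0) {
		v = _mm_loadu_si128((const __m128i *)(s + n - 16));
		_mm_storeu_si128((__m128i *)(d + n - 16), v);
	}

	return dst;
}
```

The overlapping head and tail copies replace byte-by-byte cleanup loops, which also removes branches (technique 5). Whether the loads should be aligned as well depends on the target microarchitecture, which is the point of technique 3.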

Zhihong Wang (4):
  Disabled VTA for memcpy test in app/test/Makefile
  Removed unnecessary test cases in test_memcpy.c
  Extended test coverage in test_memcpy_perf.c
  Optimized memcpy in arch/x86/rte_memcpy.h for both SSE and AVX
    platforms

 app/test/Makefile                                  |   6 +
 app/test/test_memcpy.c                             |  52 +-
 app/test/test_memcpy_perf.c                        | 238 +++++---
 .../common/include/arch/x86/rte_memcpy.h           | 664 +++++++++++++++------
 4 files changed, 656 insertions(+), 304 deletions(-)

-- 
1.9.3