From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jerin Jacob
Subject: Re: [PATCH] arch/arm: optimization for memcpy on AArch64
Date: Wed, 29 Nov 2017 18:01:56 +0530
Message-ID: <20171129123154.GA22644@jerin>
References: <1511768985-21639-1-git-send-email-herbert.guan@arm.com>
In-Reply-To: <1511768985-21639-1-git-send-email-herbert.guan@arm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
To: Herbert Guan
Cc: jianbo.liu@arm.com, dev@dpdk.org
List-Id: DPDK patches and discussions

-----Original Message-----
> Date: Mon, 27 Nov 2017 15:49:45 +0800
> From: Herbert Guan
> To: jerin.jacob@caviumnetworks.com, jianbo.liu@arm.com, dev@dpdk.org
> CC: Herbert Guan
> Subject: [PATCH] arch/arm: optimization for memcpy on AArch64
> X-Mailer: git-send-email 1.8.3.1

> +
> +/**************************************
> + * Beginning of customization section
> + **************************************/
> +#define ALIGNMENT_MASK 0x0F
> +#ifndef RTE_ARCH_ARM64_MEMCPY_STRICT_ALIGN
> +// Only src unalignment will be treaed as unaligned copy

C++-style comments may generate checkpatch errors; please use C-style
/* */ comments instead.

> +#define IS_UNALIGNED_COPY(dst, src) ((uintptr_t)(dst) & ALIGNMENT_MASK)
> +#else
> +// Both dst and src unalignment will be treated as unaligned copy
> +#define IS_UNALIGNED_COPY(dst, src) \
> +	(((uintptr_t)(dst) | (uintptr_t)(src)) & ALIGNMENT_MASK)
> +#endif
> +
> +
> +// If copy size is larger than threshold, memcpy() will be used.
> +// Run "memcpy_perf_autotest" to determine the proper threshold.
> +#define ALIGNED_THRESHOLD       ((size_t)(0xffffffff))
> +#define UNALIGNED_THRESHOLD     ((size_t)(0xffffffff))

Do you see any case where this threshold is useful?

> +
> +static inline void *__attribute__ ((__always_inline__))
> +rte_memcpy(void *restrict dst, const void *restrict src, size_t n)
> +{
> +	if (n < 16) {
> +		rte_memcpy_lt16((uint8_t *)dst, (const uint8_t *)src, n);
> +		return dst;
> +	}
> +	if (n < 64) {
> +		rte_memcpy_ge16_lt64((uint8_t *)dst, (const uint8_t *)src, n);
> +		return dst;
> +	}

Unfortunately, we have 128B-cache-line arm64 implementations too. Could
you please take care of that based on RTE_CACHE_LINE_SIZE?

> +	__builtin_prefetch(src, 0, 0);
> +	__builtin_prefetch(dst, 1, 0);

See the above point, and please use the DPDK equivalents: rte_prefetch*().
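
To make the review points concrete, here is a minimal standalone sketch of
what the comments above ask for: C-style /* */ comments instead of //, the
alignment-check macro gated by RTE_ARCH_ARM64_MEMCPY_STRICT_ALIGN, a copy
path parameterized on RTE_CACHE_LINE_SIZE, and prefetch hints behind an
rte_prefetch0()-style wrapper. The RTE_CACHE_LINE_SIZE fallback and the
stub rte_prefetch0() are assumptions standing in for the real DPDK
definitions (in the actual patch they come from rte_config.h and
rte_prefetch.h), and sketch_memcpy() is a hypothetical name, not the
patch's rte_memcpy():

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define ALIGNMENT_MASK 0x0F

#ifndef RTE_ARCH_ARM64_MEMCPY_STRICT_ALIGN
/* Only dst unalignment is treated as an unaligned copy */
#define IS_UNALIGNED_COPY(dst, src) ((uintptr_t)(dst) & ALIGNMENT_MASK)
#else
/* Both dst and src unalignment are treated as an unaligned copy */
#define IS_UNALIGNED_COPY(dst, src) \
	(((uintptr_t)(dst) | (uintptr_t)(src)) & ALIGNMENT_MASK)
#endif

/* Assumption: normally provided by the DPDK build config; some arm64
 * implementations (e.g. ThunderX) use 128 here instead of 64. */
#ifndef RTE_CACHE_LINE_SIZE
#define RTE_CACHE_LINE_SIZE 64
#endif

/* Stub for DPDK's rte_prefetch0(); the real one is in rte_prefetch.h
 * and maps to the same compiler builtin on GCC/Clang. */
static inline void
rte_prefetch0(const volatile void *p)
{
	__builtin_prefetch((const void *)p, 0, 3);
}

static inline void *
sketch_memcpy(void *restrict dst, const void *restrict src, size_t n)
{
	/* Hint the next cache line on each stream, whatever its size */
	rte_prefetch0((const char *)src + RTE_CACHE_LINE_SIZE);
	rte_prefetch0((const char *)dst + RTE_CACHE_LINE_SIZE);

	/* A real implementation would branch to a dedicated unaligned
	 * path here; this sketch just evaluates the predicate. */
	(void)IS_UNALIGNED_COPY(dst, src);

	return memcpy(dst, src, n);
}
```

Note that prefetching past the end of a buffer is safe: prefetch
instructions never fault, so the `+ RTE_CACHE_LINE_SIZE` hint needs no
bounds check.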