From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ferruh Yigit Subject: Re: [PATCH v3 15/29] crypto/qat: use eal I/O device memory read/write API Date: Fri, 13 Jan 2017 11:32:29 +0000 Message-ID: References: <1482832175-27199-1-git-send-email-jerin.jacob@caviumnetworks.com> <1484212646-10338-1-git-send-email-jerin.jacob@caviumnetworks.com> <1484212646-10338-16-git-send-email-jerin.jacob@caviumnetworks.com> <6bb9980b-f546-38d5-044a-63507510f6a5@intel.com> <20170113081641.GA17635@localhost.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 8bit Cc: dev@dpdk.org, konstantin.ananyev@intel.com, thomas.monjalon@6wind.com, bruce.richardson@intel.com, jianbo.liu@linaro.org, viktorin@rehivetech.com, santosh.shukla@caviumnetworks.com, John Griffin , Fiona Trahe , Deepak Kumar Jain , Thomas Monjalon To: Jerin Jacob Return-path: Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by dpdk.org (Postfix) with ESMTP id A53BA28F3 for ; Fri, 13 Jan 2017 12:32:33 +0100 (CET) In-Reply-To: <20170113081641.GA17635@localhost.localdomain> List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On 1/13/2017 8:17 AM, Jerin Jacob wrote: > On Thu, Jan 12, 2017 at 07:09:22PM +0000, Ferruh Yigit wrote: >> Hi Jerin, >> >> On 1/12/2017 9:17 AM, Jerin Jacob wrote: >> <...> >> >>> +#include >>> + >>> /* CSR write macro */ >>> -#define ADF_CSR_WR(csrAddr, csrOffset, val) \ >>> - (void)((*((volatile uint32_t *)(((uint8_t *)csrAddr) + csrOffset)) \ >>> - = (val))) >>> +#define ADF_CSR_WR(csrAddr, csrOffset, val) \ >>> + rte_write32(val, (((uint8_t *)csrAddr) + csrOffset)) >> >> For IA, this update introduces an extra compiler barrier (rte_io_wmb()), >> which is indeed not a must, is this correct? > > AFAIK, Compiler barrier is required for IA. I am not an IA expert, if > someone thinks it needs to changed then I can fix it in following commit > in this patch series by making rte_io_wmb() and rte_io_rmb() as empty. > > Let me know. > > AFAIK, Linux kernel code has a barrier in readl/writel for IA. > > Typically we don't use any non relaxed versions in fast path.In fast > typically all the drivers has explicit write barrier for doorbell write > and followed by a relaxed version of write. IMO, In any event, it won't > generate performance regression. > > [dpdk-master] $ git show > 70c343bdc8c33a51a9db23cd58122bdfc120a58f > commit 70c343bdc8c33a51a9db23cd58122bdfc120a58f > Author: Jerin Jacob > Date: Mon Dec 5 06:36:49 2016 +0530 > > eal/x86: define I/O device memory barriers for IA > > The patch does not provide any functional change for IA. > I/O barriers are mapped to existing smp barriers. > > CC: Bruce Richardson > CC: Konstantin Ananyev > Signed-off-by: Jerin Jacob > > diff --git a/lib/librte_eal/common/include/arch/x86/rte_atomic.h > b/lib/librte_eal/common/include/arch/x86/rte_atomic.h > index 00b1cdf..4eac666 100644 > --- a/lib/librte_eal/common/include/arch/x86/rte_atomic.h > +++ b/lib/librte_eal/common/include/arch/x86/rte_atomic.h > @@ -61,6 +61,12 @@ extern "C" { > > #define rte_smp_rmb() rte_compiler_barrier() > > +#define rte_io_mb() rte_mb() > + > +#define rte_io_wmb() rte_compiler_barrier() > + > +#define rte_io_rmb() rte_compiler_barrier() > + > /*------------------------- 16 bit atomic operations > * -------------------------*/ > > #ifndef RTE_FORCE_INTRINSICS > >> >> If so, does it make sense to override these functions for x86, and make >> rte_writeX = rte_writeX_relaxed >> rte_readX = rte_readX_relaxed >> >>> >>> /* CSR read macro */ >>> -#define ADF_CSR_RD(csrAddr, csrOffset) \ >>> - (*((volatile uint32_t *)(((uint8_t *)csrAddr) + csrOffset))) >>> +#define ADF_CSR_RD(csrAddr, csrOffset) \ >>> + rte_read32((((uint8_t *)csrAddr) + csrOffset)) >> >> This patchset both introduces new rte_readX/rte_writeX functions, also >> applies them into drivers. >> >> While applying them, it changes the behavior. >> Like above code was doing a read, but after update it does read and >> read_memory_barrier. >> >> What do you think this patchset updates usage in a manner that keeps >> behavior exact same. Like using rte_read32_relaxed for this case. >> And doing architecture related updates in a different patchset? > > Need to use rte_read32 at this commit otherwise it will break for ARM. > That's was all point for this patchset. Why it breaks the ARM, is it because rte_*mb() updated for ARM in this patchset (patch 7/29) ? I believe it is good to make these modifications in two phase: - First replace old usage with rte_readX/rte_writeX while keeping exact same behavior - Second, do architecture specific changes. Both in eal and drivers level if required. Thanks, ferruh > For performance regression, we can always verify by taking delta > between this changeset and the previous changeset. If you think, I need > to make rte_io_wmb()/rte_io_rmb() as empty for IA then I could do that > as well. > > >> >> This both makes easy to see architecture specific updates, and makes >> easy to trace any possible performance issues by this patchset. >> >>> >>> #define ADF_BANK_INT_SRC_SEL_MASK_0 0x4444444CUL >>> #define ADF_BANK_INT_SRC_SEL_MASK_X 0x44444444UL >>> >>