From mboxrd@z Thu Jan 1 00:00:00 1970
From: Matan Azrad
Subject: [PATCH v3 6/7] net/mlx4: mitigate Tx path memory barriers
Date: Mon, 30 Oct 2017 10:07:28 +0000
Message-ID: <1509358049-18854-7-git-send-email-matan@mellanox.com>
References: <1508768520-4810-1-git-send-email-ophirmu@mellanox.com>
 <1509358049-18854-1-git-send-email-matan@mellanox.com>
Mime-Version: 1.0
Content-Type: text/plain
Cc: dev@dpdk.org, Ophir Munk
To: Adrien Mazarguil
Received: from EUR02-HE1-obe.outbound.protection.outlook.com
 (mail-eopbgr10067.outbound.protection.outlook.com [40.107.1.67])
 by dpdk.org (Postfix) with ESMTP id AEE8B1B2FE
 for ; Mon, 30 Oct 2017 11:08:04 +0100 (CET)
In-Reply-To: <1509358049-18854-1-git-send-email-matan@mellanox.com>
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

Replace most of the memory barriers with compiler barriers, since the
stores they order all target DRAM. This improves code efficiency on
systems that enforce store order between different addresses.

Only the doorbell record store should be protected by a memory barrier,
since it targets the PCI memory domain.

Restrict the compiler barrier before the byte count store to systems
whose cache line size is smaller than 64B (the TXBB size).

Signed-off-by: Matan Azrad
---
 drivers/net/mlx4/mlx4_rxtx.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/net/mlx4/mlx4_rxtx.c b/drivers/net/mlx4/mlx4_rxtx.c
index 8ea8851..482c399 100644
--- a/drivers/net/mlx4/mlx4_rxtx.c
+++ b/drivers/net/mlx4/mlx4_rxtx.c
@@ -168,7 +168,7 @@ struct pv {
 		/*
 		 * Make sure we read the CQE after we read the ownership bit.
 		 */
-		rte_rmb();
+		rte_io_rmb();
 #ifndef NDEBUG
 		if (unlikely((cqe->owner_sr_opcode & MLX4_CQE_OPCODE_MASK) ==
 			     MLX4_CQE_OPCODE_ERROR)) {
@@ -203,7 +203,7 @@ struct pv {
 	 */
 	cq->cons_index = cons_index;
 	*cq->set_ci_db = rte_cpu_to_be_32(cq->cons_index & MLX4_CQ_DB_CI_MASK);
-	rte_wmb();
+	rte_io_wmb();
 	sq->tail = sq->tail + nr_txbbs;
 	/* Update the list of packets posted for transmission. */
 	elts_comp -= pkts;
@@ -321,6 +321,7 @@ static int handle_multi_segs(struct rte_mbuf *buf,
 		 * control segment.
 		 */
 		if ((uintptr_t)dseg & (uintptr_t)(MLX4_TXBB_SIZE - 1)) {
+#if RTE_CACHE_LINE_SIZE < 64
 			/*
 			 * Need a barrier here before writing the byte_count
 			 * fields to make sure that all the data is visible
@@ -331,6 +332,7 @@ static int handle_multi_segs(struct rte_mbuf *buf,
 			 * data, and end up sending the wrong data.
 			 */
 			rte_io_wmb();
+#endif /* RTE_CACHE_LINE_SIZE */
 			dseg->byte_count = byte_count;
 		} else {
 			/*
@@ -469,8 +471,7 @@ static int handle_multi_segs(struct rte_mbuf *buf,
 				break;
 			}
 #endif /* NDEBUG */
-			/* Need a barrier here before byte count store. */
-			rte_io_wmb();
+			/* Never be TXBB aligned, no need compiler barrier. */
 			dseg->byte_count = rte_cpu_to_be_32(buf->data_len);
 
 			/* Fill the control parameters for this packet. */
@@ -533,7 +534,7 @@ static int handle_multi_segs(struct rte_mbuf *buf,
 		 * setting ownership bit (because HW can start
 		 * executing as soon as we do).
 		 */
-		rte_wmb();
+		rte_io_wmb();
 		ctrl->owner_opcode = rte_cpu_to_be_32(owner_opcode |
 					      ((sq->head & sq->txbb_cnt) ?
 					      MLX4_BIT_WQE_OWN : 0));
-- 
1.8.3.1
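
For readers unfamiliar with the distinction the commit message draws, the
following is a minimal standalone sketch of the "fill the descriptor, then set
the ownership bit" pattern. It is not driver code: struct demo_wqe,
DEMO_OWN_BIT, demo_compiler_barrier(), demo_full_fence() and demo_post() are
hypothetical names, and the C11 fences are only rough stand-ins for
rte_io_wmb() and rte_wmb(), whose real definitions are architecture-specific.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor: payload fields plus an ownership word. */
struct demo_wqe {
	uint32_t byte_count;
	uint32_t addr_lo;
	_Atomic uint32_t owner_opcode;
};

/* Hypothetical ownership flag; not the real MLX4_BIT_WQE_OWN value. */
#define DEMO_OWN_BIT 0x80000000u

/*
 * Compiler-only barrier: stops the compiler from moving memory accesses
 * across this point, but emits no fence instruction.
 */
static inline void demo_compiler_barrier(void)
{
	atomic_signal_fence(memory_order_seq_cst);
}

/*
 * Full hardware fence: a portable stand-in that is stronger than a
 * store-only barrier. It costs a real instruction on most CPUs.
 */
static inline void demo_full_fence(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

/*
 * Fill the payload first, then publish the descriptor by setting the
 * ownership bit. The barrier between the two steps keeps the payload
 * stores ahead of the ownership store.
 */
static void demo_post(struct demo_wqe *wqe, uint32_t len, uint32_t addr,
		      uint32_t opcode, int use_full_fence)
{
	assert(wqe != NULL);
	wqe->byte_count = len;
	wqe->addr_lo = addr;
	if (use_full_fence)
		demo_full_fence();
	else
		demo_compiler_barrier();
	atomic_store_explicit(&wqe->owner_opcode, opcode | DEMO_OWN_BIT,
			      memory_order_relaxed);
}
```

Calling demo_post(&wqe, 64, 0x1000, 0x0a, 0) takes the cheaper path. Whether a
compiler-only barrier is sufficient depends on the CPU's store ordering
guarantees and on where the consumer reads from, which is the trade-off the
commit message describes.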