From mboxrd@z Thu Jan 1 00:00:00 1970 From: Yongseok Koh Subject: Re: [PATCH v2 5/5] net/mlx5: add vectorized Rx/Tx burst for SSE4.1 Date: Wed, 5 Jul 2017 10:41:32 -0700 Message-ID: <20170705174131.GA4716@yongseok-MBP.local> References: <20170628230403.10142-1-yskoh@mellanox.com> <969ef71aa84f02c19f7fe011fe75e25049177d76.1498850005.git.yskoh@mellanox.com> <20170704085852.GD21379@autoinstall.dev.6wind.com> <20170705003842.GA3440@minint-98vp2qg> <20170705082126.GF21379@autoinstall.dev.6wind.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit Cc: ferruh.yigit@intel.com, dev@dpdk.org, adrien.mazarguil@6wind.com To: =?iso-8859-1?Q?N=E9lio?= Laranjeiro Return-path: Received: from EUR01-VE1-obe.outbound.protection.outlook.com (mail-ve1eur01on0043.outbound.protection.outlook.com [104.47.1.43]) by dpdk.org (Postfix) with ESMTP id 19A471C0B for ; Wed, 5 Jul 2017 19:41:46 +0200 (CEST) Content-Disposition: inline In-Reply-To: <20170705082126.GF21379@autoinstall.dev.6wind.com> List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On Wed, Jul 05, 2017 at 10:21:26AM +0200, Nélio Laranjeiro wrote: > On Tue, Jul 04, 2017 at 05:38:44PM -0700, Yongseok Koh wrote: > > On Tue, Jul 04, 2017 at 10:58:52AM +0200, Nélio Laranjeiro wrote: > > > Yongseok, some comments in this huge and great work, > > > > > > On Fri, Jun 30, 2017 at 12:23:33PM -0700, Yongseok Koh wrote: > > > > To make vectorized burst routines enabled, it is required to run on x86_64 > > > > architecture which can support at least SSE4.1. If all the conditions are > > > > met, the vectorized burst functions are enabled automatically. The decision > > > > is made individually on RX and TX. There's no PMD option to make a > > > > selection. > > > > > > > > Signed-off-by: Yongseok Koh > > > > --- > > > > drivers/net/mlx5/Makefile | 10 + > > > > drivers/net/mlx5/mlx5_defs.h | 18 + > > > > drivers/net/mlx5/mlx5_ethdev.c | 28 +- > > > > drivers/net/mlx5/mlx5_rxq.c | 55 +- > > > > drivers/net/mlx5/mlx5_rxtx.c | 339 ++------ > > > > drivers/net/mlx5/mlx5_rxtx.h | 283 ++++++- > > > > drivers/net/mlx5/mlx5_rxtx_vec_sse.c | 1451 ++++++++++++++++++++++++++++++++++ > > > > drivers/net/mlx5/mlx5_txq.c | 2 +- > > > > 8 files changed, 1909 insertions(+), 277 deletions(-) > > > > create mode 100644 drivers/net/mlx5/mlx5_rxtx_vec_sse.c > > > > > > > > diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h > > > > index 51e258a15..2d0894fcd 100644 > > > > --- a/drivers/net/mlx5/mlx5_rxtx.h > > > > +++ b/drivers/net/mlx5/mlx5_rxtx.h > > [...] > > > > + txq_complete(txq); > > > > + /* A CQE slot must always be available. */ > > > > + assert((1u << txq->cqe_n) - (txq->cq_pi - txq->cq_ci)); > > > > > > This assert should be moved to the txq_complete(), or it should not be > > > an assert. > > txq_complete() is a common function, so this can't force to spare at least one > > slot in a completion queue. This assert is to force to allocate enough CQE slots > > by accurate calculation as completion is suppressed by MLX5_TX_COMP_THRESH. If > > the CQ size is well defined (e.g. size of Tx ring / MLX5_TX_COMP_THRESH), it > > doesn't need to check deficiency of CQ slot but checking slots in Tx ring > > (max_elts) is sufficient. If you are okay with this, please let me know, then > > I'll send out v3. > > Just using your comment... > /* A CQE slot must always be available. */ > > This is always true in any Tx function where this assumption should be > true to avoid testing it before posting a completion request and thus > avoid cycles waste and this independently of the computation for the CQ > ring size. > > As it is an assert it is only present to help the developer in error > code he developed (or for a user to point an issue in the code), this > can be any-where in the Tx data path. It becomes useful for any Tx > function using this txq_complete(). > > If this assert is only related to your code, it may means it should be > an if which avoids to post a completion request when no slots are > available. I think I was wrong about the assert. If the Tx ring is full (max_elts == 0), then there won't be any empty slot left in CQ. I'll remove the two asserts. Thanks Yongseok