From mboxrd@z Thu Jan 1 00:00:00 1970
From: Thomas Monjalon
Subject: Re: [PATCH] net/mlx5: restrict workaround of gcc AVX512F bug
Date: Tue, 13 Nov 2018 01:30:26 +0100
Message-ID: <3466805.ZzxQIDAxy8@xps>
References: <20181113000122.12594-1-thomas@monjalon.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7Bit
Cc: dev@dpdk.org, yskoh@mellanox.com, shahafs@mellanox.com,
 ferruh.yigit@intel.com, bruce.richardson@intel.com,
 konstantin.ananyev@intel.com, christian.ehrhardt@canonical.com,
 justin.parus@microsoft.com
To: Stephen Hemminger
Received: from out5-smtp.messagingengine.com (out5-smtp.messagingengine.com [66.111.4.29])
 by dpdk.org (Postfix) with ESMTP id 09D9A201
 for ; Tue, 13 Nov 2018 01:30:30 +0100 (CET)
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

13/11/2018 01:14, Stephen Hemminger:
> On Mon, Nov 12, 2018, 4:01 PM Thomas Monjalon
> > A bug was found when the inline function mlx5_tx_complete()
> > is optimized with AVX512F instructions. It corrupts an offset
> > in the instructions vmovdqu8 of the AVX2 version of rte_mov128(),
> > used in rte_memcpy(), which is called in rte_mempool_put_bulk().
> >
> > All the above functions are inline. So the workaround is
> > to disable AVX512F optimization for the functions calling the
> > top-level function of this call stack, i.e. mlx5_tx_complete().
> > All GCC versions supporting AVX512 are supposed to be affected.
> >
> > The root cause is not identified yet. It may be thought that
> > more related bugs may happen in other functions.
> > That's why the initial workaround was to disable AVX512F globally.
> > This patch takes the risk of applying the workaround only for the
> > functions known to be affected, in order to preserve the optimization
> > everywhere else.
> >
> > Bugzilla ID: 97
> > Fixes: 8d07c82b239f ("mk: disable gcc AVX512F support")
> >
> > Signed-off-by: Thomas Monjalon
>
> The additional annotations clutter the code.
> How big a performance hit is it to disable for whole driver? Or just use
> memcpy instead of rte_memcpy?

rte_memcpy() is used via rte_mempool_put_bulk().
I am not going to change it to memcpy...

About disabling AVX512F for the whole driver, the goal of this patch
is to reduce the scope of the workaround. If a per-function scope
is not chosen, then we can stay with a global safe scope.

If you are interested in knowing more, the bugzilla has tons of information:
	https://bugs.dpdk.org/show_bug.cgi?id=97

Given that we don't get much help on this major GCC bug,
we are probably going to stay on the safe side.
Anyway, I must stop working (alone) on this bug and focus instead
on getting 18.11 out.
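
For readers unfamiliar with the idea, the per-function scope discussed
above can be sketched roughly as follows. This is only an illustration,
assuming GCC's x86 target("no-avx512f") function attribute; the macro
name NO_AVX512F and the functions copy_bulk() and tx_burst_example() are
hypothetical stand-ins and are not taken from the patch or from DPDK.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper macro (not from the patch): on GCC-compatible
 * x86 compilers, ask the compiler not to use AVX512F instructions
 * when generating code for the annotated function.
 */
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define NO_AVX512F __attribute__((target("no-avx512f")))
#else
#define NO_AVX512F
#endif

/* Stand-in for an inline helper at the bottom of the call stack. */
static inline void
copy_bulk(void *dst, const void *src, size_t n)
{
	memcpy(dst, src, n);
}

/*
 * Stand-in for a function that calls the inline helper. Only this
 * caller is annotated, so the rest of the file keeps whatever
 * instruction set the build flags allow.
 */
NO_AVX512F
static int
tx_burst_example(unsigned char *dst, const unsigned char *src, size_t n)
{
	copy_bulk(dst, src, n);
	return (int)n;
}
```

The trade-off is the one raised in the thread: each affected caller
needs its own annotation, whereas a global build flag keeps the code
free of annotations at the cost of disabling AVX512F everywhere.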