From: Stephen Hemminger
Subject: Re: [PATCH v2] ring: use aligned memzone allocation
Date: Fri, 9 Jun 2017 10:16:25 -0700
Message-ID: <20170609101625.09075858@xeon-e3>
In-Reply-To: <6908e71a-c849-83d3-e86d-745acf9f9491@sts.kz>
To: Yerden Zhumabekov
Cc: "Ananyev, Konstantin", "Richardson, Bruce", "Verkamp, Daniel", "dev@dpdk.org"

On Fri, 9 Jun 2017 18:47:43 +0600
Yerden Zhumabekov wrote:

> On 06.06.2017 19:19, Ananyev, Konstantin wrote:
> >
> >>>> Maybe there is some deeper reason for the >= 128-byte alignment logic in rte_ring.h?
> >>> Might be, would be good to hear opinion the author of that change.
> >> It gives improved performance for core-2-core transfer.
> > You mean empty cache-line(s) after prod/cons, correct?
> > That's ok but why we can't keep them and whole rte_ring aligned on cache-line boundaries?
> > Something like that:
> > struct rte_ring {
> >    ...
> >    struct rte_ring_headtail prod __rte_cache_aligned;
> >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> >    struct rte_ring_headtail cons __rte_cache_aligned;
> >    EMPTY_CACHE_LINE   __rte_cache_aligned;
> > };
> >
> > Konstantin
> >
>
> I'm curious, can anyone explain, how does it actually affect
> performance? Maybe we can utilize it application code?

I think it is because Intel CPUs speculatively fetch the adjacent cache line.
If that adjacent line is written by another core, the result is false sharing.
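
For anyone who wants to try this in application code, here is a minimal C11 sketch of the padding idea discussed above. It is not the rte_ring.h definition. The names (sketch_ring, sketch_headtail, SKETCH_PAIR) are made up for illustration, and the 64-byte line size and 128-byte adjacent-line pairing are assumptions that should be checked against the target CPU.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions, not taken from DPDK: 64-byte cache lines, and a
 * prefetcher that pulls in the other half of a 128-byte aligned pair. */
#define SKETCH_CACHE_LINE 64
#define SKETCH_PAIR       (2 * SKETCH_CACHE_LINE)

/* Hypothetical stand-in for struct rte_ring_headtail. */
struct sketch_headtail {
    volatile uint32_t head;
    volatile uint32_t tail;
};

/* Hypothetical stand-in for struct rte_ring. Aligning prod and cons to
 * 128 bytes leaves an unused cache line after each of them, so the
 * producer and consumer indices never share a 128-byte pair. */
struct sketch_ring {
    uint32_t size;   /* read-mostly fields live in their own pair */
    uint32_t mask;
    alignas(SKETCH_PAIR) struct sketch_headtail prod;
    alignas(SKETCH_PAIR) struct sketch_headtail cons;
};

/* Compile-time checks of the layout described above. */
_Static_assert(offsetof(struct sketch_ring, prod) % SKETCH_PAIR == 0,
               "prod must start a 128-byte pair");
_Static_assert(offsetof(struct sketch_ring, cons) % SKETCH_PAIR == 0,
               "cons must start a 128-byte pair");

/* Index of the 128-byte pair that a given byte offset falls into. */
static size_t sketch_pair_index(size_t offset)
{
    return offset / SKETCH_PAIR;
}

/* Returns 1 if the header, prod and cons all sit in different pairs. */
static int sketch_indices_isolated(void)
{
    size_t h = sketch_pair_index(offsetof(struct sketch_ring, size));
    size_t p = sketch_pair_index(offsetof(struct sketch_ring, prod));
    size_t c = sketch_pair_index(offsetof(struct sketch_ring, cons));

    return h != p && p != c && h != c;
}
```

The cost is memory: each padded member takes a full 128 bytes, so this is probably only worth doing for fields that two cores write concurrently. The structure itself also has to be allocated on a 128-byte boundary for the layout to hold at run time.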