From mboxrd@z Thu Jan 1 00:00:00 1970 From: Stephen Hemminger Subject: Re: [PATCH v4] vhost_user: protect active rings from async ring changes Date: Wed, 20 Dec 2017 12:19:45 -0800 Message-ID: <20171220121945.0143b0af@xeon-e3> References: <20171220163752-mutt-send-email-victork@redhat.com> <20171220110616.21301e11@xeon-e3> <634157847.2119460.1513800390896.JavaMail.zimbra@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Cc: dev@dpdk.org, stable@dpdk.org, Jens Freimann , Maxime Coquelin , Yuanhan Liu , Tiwei Bie , Jianfeng Tan To: Victor Kaplansky Return-path: Received: from mail-pl0-f66.google.com (mail-pl0-f66.google.com [209.85.160.66]) by dpdk.org (Postfix) with ESMTP id A03031B1C7 for ; Wed, 20 Dec 2017 21:19:48 +0100 (CET) Received: by mail-pl0-f66.google.com with SMTP id g2so9656459pli.8 for ; Wed, 20 Dec 2017 12:19:48 -0800 (PST) In-Reply-To: <634157847.2119460.1513800390896.JavaMail.zimbra@redhat.com> List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On Wed, 20 Dec 2017 15:06:30 -0500 (EST) Victor Kaplansky wrote: > > Wrapping locking inline's adds nothing and makes life harder > > for static analysis tools. > > Yep. In this case it inhibits the details of how the locking is > implemented (e.g. the name of the lock). It also facilitates > replacement of locking mechanism, by another implementation. > See below. YAGNI You aren't gonna need it. Don't build infrastructure for things that you forsee. > > > > The bigger problem is that doing locking on all enqueue/dequeue > > can have a visible performance impact. Did you measure that? > > > > Could you invent an RCUish mechanism using compiler barriers? > > > > I've played a bit with measuring performance impact. Successful > lock adds on the average about 30 cycles on my Haswell cpu. > (and it successes 99.999...% of time). > > I can investigate it more, but my initial feeling is that adding a > memory barrier (the real one, not the compiler barrier) would add > about the same overhead. > > By the way, the way update_queuing_status() in > drivers/net/vhost/rte_eth_vhost.c tries to avoid contention with > the active queue by playing with "allow_queuing" and "while_queuing" > seems to be broken, since memory barriers are missing. CPU cycles alone don't matter on modern x86. What matters is cache and instructions per cycle. In this case locking requires locked instruction which causes the cpu prefetching and instruction pipeline to stall.