From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1756719AbZHFUvN (ORCPT );
	Thu, 6 Aug 2009 16:51:13 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1756224AbZHFUvM (ORCPT );
	Thu, 6 Aug 2009 16:51:12 -0400
Received: from ovro.ovro.caltech.edu ([192.100.16.2]:49711 "EHLO
	ovro.ovro.caltech.edu" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1756207AbZHFUvM (ORCPT );
	Thu, 6 Aug 2009 16:51:12 -0400
Date: Thu, 6 Aug 2009 13:51:09 -0700
From: "Ira W. Snyder" 
To: Gregory Haskins 
Cc: Arnd Bergmann , paulmck@linux.vnet.ibm.com,
	alacrityvm-devel@lists.sourceforge.net, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org
Subject: Re: [PATCH 1/7] shm-signal: shared-memory signals
Message-ID: <20090806205109.GA1330@ovro.caltech.edu>
References: <20090803171030.17268.26962.stgit@dev.haskins.net>
	<20090803171735.17268.37490.stgit@dev.haskins.net>
	<200908061556.55390.arnd@arndb.de>
	<4A7ABA530200005A00051C18@sinclair.provo.novell.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4A7ABA530200005A00051C18@sinclair.provo.novell.com>
User-Agent: Mutt/1.5.17+20080114 (2008-01-14)
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0
	(ovro.ovro.caltech.edu); Thu, 06 Aug 2009 13:51:10 -0700 (PDT)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: 
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Aug 06, 2009 at 09:11:15AM -0600, Gregory Haskins wrote:
> Hi Arnd,
> 
> >>> On 8/6/2009 at 9:56 AM, in message <200908061556.55390.arnd@arndb.de>, Arnd
> Bergmann wrote:
> > On Monday 03 August 2009, Gregory Haskins wrote:
> >> shm-signal provides a generic shared-memory based bidirectional
> >> signaling mechanism.
> >> It is used in conjunction with an existing
> >> signal transport (such as posix-signals, interrupts, pipes, etc) to
> >> increase the efficiency of the transport since the state information
> >> is directly accessible to both sides of the link. The shared-memory
> >> design provides very cheap access to features such as event-masking
> >> and spurious delivery mitigation, and is useful for implementing
> >> higher level shared-memory constructs such as rings.
> > 
> > Looks like a very useful feature in general.
> 
> Thanks, I was hoping that would be the case.
> 
> >> +struct shm_signal_irq {
> >> +	__u8	enabled;
> >> +	__u8	pending;
> >> +	__u8	dirty;
> >> +};
> > 
> > Won't this layout cause cache line ping pong? Other schemes I have
> > seen try to separate the bits so that each cache line is written to
> > by only one side.
> 
> It could possibly use some optimization in that regard. I generally
> consider myself an expert at concurrent programming, but this lockless
> stuff is, um, hard ;)  I was going for correctness first.
> 
> Long story short, any suggestions on ways to split this up are welcome
> (particularly now, before the ABI is sealed ;)
> 
> > This gets much more interesting if the two sides
> > are on remote ends of an I/O link, e.g. using a nontransparent
> > PCI bridge, where you only want to send stores over the wire, but
> > never fetches or even read-modify-write cycles.
> 
> /me head explodes ;)
> 

I've actually implemented this idea for virtio. Read the virtio-over-PCI
patches I posted, and you'll see that the entire virtqueue implementation
NEVER uses reads across the PCI bus, only writes. The slowpath
configuration space uses reads, but the virtqueues themselves are
write-only.

Some trivial benchmarking against an earlier driver that did writes+reads
across the PCI bus showed that the write-only driver was about 2x as fast.
(Throughput increased from ~30MB/sec to ~65MB/sec.)
I'm sure the write-only design was not the only change responsible for
the speedup, but it was definitely a contributing factor.

Ira