From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Wed, 24 Oct 2012 14:08:45 +1100
From: Dave Chinner
To: Mikulas Patocka
Cc: Peter Zijlstra, Oleg Nesterov, "Paul E. McKenney", Linus Torvalds,
	Ingo Molnar, Srikar Dronamraju, Ananth N Mavinakayanahalli,
	Anton Arapov, linux-kernel@vger.kernel.org, Thomas Gleixner
Subject: Re: [PATCH 1/2] brw_mutex: big read-write mutex
Message-ID: <20121024030845.GT4291@dastard>
References: <20121017165902.GB9872@redhat.com>
	<20121017224430.GC2518@linux.vnet.ibm.com>
	<20121018162409.GA28504@redhat.com>
	<20121018163833.GK2518@linux.vnet.ibm.com>
	<20121018175747.GA30691@redhat.com>
	<1350650286.30157.28.camel@twins>
	<1350668451.2768.60.camel@twins>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Oct 19, 2012 at 06:54:41PM -0400, Mikulas Patocka wrote:
> 
> 
> On Fri, 19 Oct 2012, Peter Zijlstra wrote:
> 
> > > Yes, I tried this approach - it involves doing a LOCK instruction on
> > > read lock, remembering the cpu and doing another LOCK instruction on
> > > read unlock (which will hopefully be on the same CPU, so no cacheline
> > > bouncing happens in the common case).
> > > It was slower than the approach without any LOCK instructions
> > > (43.3 seconds for the implementation with per-cpu LOCKed access,
> > > 42.7 seconds for this implementation without atomic instructions;
> > > the benchmark involved doing 512-byte direct-io reads and writes
> > > on a ramdisk with 8 processes on an 8-core machine).
> > 
> > So why is that a problem? Surely that's already tons better than what
> > you've currently got.
> 
> Percpu rw-semaphores do not improve performance at all. I put them there
> to avoid a performance regression, not to improve performance.
> 
> All Linux kernels have a race condition - when you change the block size
> of a block device and read or write the device at the same time, a crash
> may happen. This bug has been there forever. Recently, it started to
> cause major trouble - multiple high-profile business sites report
> crashes because of this race condition.
> 
> You can fix this race by using a read lock around the I/O paths and a
> write lock around block size changing, but a normal rw-semaphore causes
> cache line bouncing when taken for read by multiple processors, and the
> resulting I/O performance degradation is measurable.

This doesn't sound like a new problem. Hasn't this global access,
single modifier exclusion problem been solved before in the VFS?

e.g. mnt_want_write()/mnt_make_readonly()

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
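[Editor's sketch] The per-cpu LOCKed-access variant Mikulas benchmarked above can be sketched in userspace C11. All names here (brw_mutex, NSLOTS, the slot layout) are illustrative, not the kernel's actual implementation: each reader bumps a counter in its own cache-line-padded slot, so uncontended readers on different CPUs never bounce a shared cache line; the writer sets a flag and then waits for every slot to drain. The faster kernel variant "without atomic instructions" that Mikulas refers to would further replace these atomics with plain per-cpu increments paired with an RCU-style grace period on the write side, which this sketch does not attempt.

```c
#include <stdatomic.h>
#include <sched.h>
#include <assert.h>

#define NSLOTS 8                      /* stand-in for NR_CPUS */

/* One reader counter per "cpu", padded so each lives in its own
 * cache line and uncontended readers don't bounce a shared line. */
struct brw_mutex {
    struct {
        _Atomic long count;
        char pad[64 - sizeof(_Atomic long)];
    } slot[NSLOTS];
    _Atomic int writer;               /* set while a writer holds the lock */
};

static void brw_read_lock(struct brw_mutex *m, int slot)
{
    for (;;) {
        atomic_fetch_add_explicit(&m->slot[slot].count, 1,
                                  memory_order_acquire);
        if (!atomic_load_explicit(&m->writer, memory_order_acquire))
            return;                   /* common case: no writer active */
        /* A writer is in: back out our count and wait for it to finish. */
        atomic_fetch_sub_explicit(&m->slot[slot].count, 1,
                                  memory_order_release);
        while (atomic_load_explicit(&m->writer, memory_order_acquire))
            sched_yield();
    }
}

static void brw_read_unlock(struct brw_mutex *m, int slot)
{
    atomic_fetch_sub_explicit(&m->slot[slot].count, 1,
                              memory_order_release);
}

static void brw_write_lock(struct brw_mutex *m)
{
    int zero = 0;
    /* Claim the writer flag, then wait for all reader slots to drain. */
    while (!atomic_compare_exchange_weak_explicit(&m->writer, &zero, 1,
                memory_order_acq_rel, memory_order_relaxed)) {
        zero = 0;
        sched_yield();
    }
    for (int i = 0; i < NSLOTS; i++)
        while (atomic_load_explicit(&m->slot[i].count,
                                    memory_order_acquire) != 0)
            sched_yield();
}

static void brw_write_unlock(struct brw_mutex *m)
{
    atomic_store_explicit(&m->writer, 0, memory_order_release);
}
```

Note that the read path still pays two atomic read-modify-writes, which is exactly why it measured slower (43.3s vs 42.7s) than the atomic-free variant in the benchmark quoted above; the win over a plain rw-semaphore is only that those atomics hit per-cpu cache lines instead of one shared line.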