From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753289AbZK1CHa (ORCPT ); Fri, 27 Nov 2009 21:07:30 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1752341AbZK1CHa (ORCPT ); Fri, 27 Nov 2009 21:07:30 -0500
Received: from e8.ny.us.ibm.com ([32.97.182.138]:53416 "EHLO e8.ny.us.ibm.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751252AbZK1CH3 (ORCPT ); Fri, 27 Nov 2009 21:07:29 -0500
Date: Fri, 27 Nov 2009 18:07:39 -0800
From: "Paul E. McKenney"
To: Nick Piggin
Cc: Linux Kernel Mailing List , Linus Torvalds
Subject: Re: [rfc] "fair" rw spinlocks
Message-ID: <20091128020739.GA18149@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com
References: <20091123145409.GA29627@wotan.suse.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20091123145409.GA29627@wotan.suse.de>
User-Agent: Mutt/1.5.15+20070412 (2007-04-11)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Nov 23, 2009 at 03:54:09PM +0100, Nick Piggin wrote:
> Hi,
>
> Last time this issue came up that I could see, I don't think
> there were objections to making rwlocks fair, the main
> difficulty seemed to be that we allow reentrant read locks
> (so a write lock waiting must not block arbitrary read lockers).
>
> Nowadays our rwlock usage is smaller although still quite a
> few, so it would make better sense to do a conversion by
> introducing a new lock type and move them over I guess.
>
> Anyway, I would like to add some kind of fairness or at least
> anti starvation for writers. We have a customer seeing total
> livelock on tasklist_lock for write locking on a system as small
> as 8 core Opteron.
>
> This was basically reproduced by several cores executing wait
> with WNOHANG.
>
> Of course it would always be nice to improve locking so
> contention isn't an issue, but so long as we have rwlocks, we
> could possibly get into a situation where starvation is
> triggered *somehow*. So I'd really like to fix this.
>
> This particular starvation on tasklist lock I guess is a local
> DoS vulnerability even if the workload is not particularly
> realistic.
>
> Anyway, I don't have a patch yet. I'm sure it can be done
> without extra atomics in fastpaths. Comments?

The usual trick would be to keep per-fair-rwlock state in per-CPU
variables. If it is forbidden to read-acquire one nestable fair rwlock
while read-holding another, then this per-CPU state can be a single
pointer and a nesting count. On the other hand, if it is permitted to
read-acquire one nestable fair rwlock while holding another, then one
can use a small per-CPU array of pointer/count pairs.

Readers check the per-CPU state. If they already read-hold the lock,
they increment the nesting count; otherwise, they contend directly for
the lock (and set up the per-CPU state).

Same number of atomics on the fastpath as the current implementation.
Too bad about those array accesses, though! ;-) (Though on modern
hardware, the array accesses might be a non-event, performance-wise.)

Hey, you asked!!! And there are other ways to make this work,
including variations on brlock.

							Thanx, Paul
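
A minimal userspace sketch of the single-pointer-plus-nesting-count
variant described above, under several assumptions that are not in the
original mail: thread-local storage stands in for per-CPU variables, a
simple FIFO ticket lock stands in for whichever fair rwlock is chosen
(so its atomic-operation count is not representative of the fastpath
claim), and every name here (fair_rwlock, rd_state, nest_read_lock, and
so on) is hypothetical rather than an existing kernel API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical FIFO ("ticket") rwlock standing in for any fair rwlock. */
struct fair_rwlock {
	atomic_uint next_ticket;	/* next ticket to hand out */
	atomic_uint now_serving;	/* ticket currently allowed to proceed */
	atomic_uint readers;		/* readers currently inside */
};

/* Take a ticket and spin until it is our turn (strict arrival order). */
static void wait_for_turn(struct fair_rwlock *l)
{
	unsigned int t = atomic_fetch_add(&l->next_ticket, 1);

	while (atomic_load(&l->now_serving) != t)
		;	/* spin */
}

void fair_read_lock(struct fair_rwlock *l)
{
	wait_for_turn(l);
	atomic_fetch_add(&l->readers, 1);
	atomic_fetch_add(&l->now_serving, 1);	/* admit the next waiter */
}

void fair_read_unlock(struct fair_rwlock *l)
{
	atomic_fetch_sub(&l->readers, 1);
}

void fair_write_lock(struct fair_rwlock *l)
{
	wait_for_turn(l);			/* keep the turn while writing */
	while (atomic_load(&l->readers) != 0)
		;	/* wait for earlier readers to drain */
}

void fair_write_unlock(struct fair_rwlock *l)
{
	atomic_fetch_add(&l->now_serving, 1);
}

/*
 * Assumption: thread-local state models the per-CPU state. A single
 * pointer and count suffice only if a reader never read-holds two
 * different nestable fair rwlocks at once.
 */
static _Thread_local struct {
	struct fair_rwlock *held;	/* lock this thread read-holds, or NULL */
	unsigned int depth;		/* read-side nesting count */
} rd_state;

void nest_read_lock(struct fair_rwlock *l)
{
	if (rd_state.held == l) {
		rd_state.depth++;	/* nested: never touch the lock itself */
		return;
	}
	assert(rd_state.held == NULL);	/* one nestable lock at a time */
	fair_read_lock(l);		/* outermost: contend for the lock */
	rd_state.held = l;
	rd_state.depth = 1;
}

void nest_read_unlock(struct fair_rwlock *l)
{
	assert(rd_state.held == l && rd_state.depth > 0);
	if (--rd_state.depth == 0) {
		rd_state.held = NULL;
		fair_read_unlock(l);	/* outermost release only */
	}
}
```

Because a nested read-acquisition only bumps the local count, it never
queues behind a waiting writer, which is what would otherwise deadlock a
reentrant reader on a strictly FIFO lock. Allowing several nestable
locks to be read-held at once would replace rd_state with a small array
of such pointer/count pairs.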