From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751601AbcFYOZE (ORCPT );
	Sat, 25 Jun 2016 10:25:04 -0400
Received: from bombadil.infradead.org ([198.137.202.9]:39751 "EHLO
	bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751564AbcFYOZC (ORCPT );
	Sat, 25 Jun 2016 10:25:02 -0400
Date: Sat, 25 Jun 2016 16:24:47 +0200
From: Peter Zijlstra
To: Pan Xinhui
Cc: linux-kernel@vger.kernel.org, mingo@redhat.com, dave@stgolabs.net,
	will.deacon@arm.com, Waiman.Long@hpe.com, benh@kernel.crashing.org
Subject: Re: [PATCH] locking/osq: Drop the overload of osq lock
Message-ID: <20160625142447.GK30154@twins.programming.kicks-ass.net>
References: <1466876523-33437-1-git-send-email-xinhui.pan@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1466876523-33437-1-git-send-email-xinhui.pan@linux.vnet.ibm.com>
User-Agent: Mutt/1.5.23.1 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Jun 25, 2016 at 01:42:03PM -0400, Pan Xinhui wrote:
> An over-committed guest with more vCPUs than pCPUs has heavy overhead
> in osq_lock().
>
> This is because vCPU A holds the osq lock and yields out, while vCPU B
> waits for the per_cpu node->locked to be set. IOW, vCPU B waits for
> vCPU A to run and unlock the osq lock. Even with need_resched(), it
> does not help in this scenario.
>
> To fix this issue, add a threshold to one while-loop of osq_lock().
> The threshold is set roughly equal to SPIN_THRESHOLD.

Blergh, virt ...

So yes, lock holder preemption sucks. You would also want to limit the
immediate spin on owner.

Also; I really hate these random number spin-loop thresholds.

Is it at all possible to get feedback from your LPAR stuff that the
vcpu was preempted? Because at that point we could do something like:

	int vpc = vcpu_preempt_count();

	...

	for (;;) {
		/* the big spin loop */
		if (need_resched() || vpc != vcpu_preempt_count())
			/* bail */
	}

With a default implementation like:

	static inline int vcpu_preempt_count(void)
	{
		return 0;
	}

So the compiler can make it all go away. But on virt muck it would stop
spinning the moment the vcpu gets preempted, which is the right moment
I'm thinking.