From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751601AbcFYOZE (ORCPT );
	Sat, 25 Jun 2016 10:25:04 -0400
Received: from bombadil.infradead.org ([198.137.202.9]:39751 "EHLO
	bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751564AbcFYOZC (ORCPT );
	Sat, 25 Jun 2016 10:25:02 -0400
Date: Sat, 25 Jun 2016 16:24:47 +0200
From: Peter Zijlstra
To: Pan Xinhui
Cc: linux-kernel@vger.kernel.org, mingo@redhat.com, dave@stgolabs.net,
	will.deacon@arm.com, Waiman.Long@hpe.com, benh@kernel.crashing.org
Subject: Re: [PATCH] locking/osq: Drop the overload of osq lock
Message-ID: <20160625142447.GK30154@twins.programming.kicks-ass.net>
References: <1466876523-33437-1-git-send-email-xinhui.pan@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1466876523-33437-1-git-send-email-xinhui.pan@linux.vnet.ibm.com>
User-Agent: Mutt/1.5.23.1 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Jun 25, 2016 at 01:42:03PM -0400, Pan Xinhui wrote:
> An over-committed guest with more vCPUs than pCPUs has heavy overhead
> in osq_lock().
>
> This is because vCPU A holds the osq lock and yields out, while vCPU B
> waits for the per_cpu node->locked to be set. IOW, vCPU B waits for
> vCPU A to run and unlock the osq lock. Even with need_resched(), it
> does not help in this scenario.
>
> To fix this issue, add a threshold to one while-loop of osq_lock().
> The threshold is set roughly equal to SPIN_THRESHOLD.

Blergh, virt ...

So yes, lock holder preemption sucks. You would also want to limit the
immediate spin on owner.

Also; I really hate these random number spin-loop thresholds.

Is it at all possible to get feedback from your LPAR stuff that the
vcpu was preempted? Because at that point we could do something like:

	int vpc = vcpu_preempt_count();

	...

	for (;;) {
		/* the big spin loop */
		if (need_resched() || vpc != vcpu_preempt_count())
			/* bail */
	}

With a default implementation like:

	static inline int vcpu_preempt_count(void)
	{
		return 0;
	}

So the compiler can make it all go away. But on virt muck it would stop
spinning the moment the vcpu gets preempted, which is the right moment
I'm thinking.