From: Mike Galbraith
Subject: Re: [RFC -v3 PATCH 2/3] sched: add yield_to function
Date: Thu, 13 Jan 2011 04:26:09 +0100
Message-ID: <1294889169.8089.10.camel@marge.simson.net>
References: <20110103162637.29f23c40@annuminas.surriel.com>
 <20110103162918.577a9620@annuminas.surriel.com>
 <1294164289.2016.186.camel@laptop>
 <1294246647.8369.52.camel@marge.simson.net>
 <1294247065.2016.267.camel@laptop>
 <1294378146.8823.27.camel@marge.simson.net>
 <4D2E6B62.2000802@redhat.com>
In-Reply-To: <4D2E6B62.2000802@redhat.com>
To: Rik van Riel
Cc: Peter Zijlstra, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
 Avi Kivity, Srivatsa Vaddagiri, Chris Wright

On Wed, 2011-01-12 at 22:02 -0500, Rik van Riel wrote:
> Cgroups only makes the matter worse - libvirt places
> each KVM guest into its own cgroup, so a VCPU will
> generally always be alone on its own per-cgroup, per-cpu
> runqueue!  That can lead to pulling a VCPU onto our local
> CPU because we think we are alone, when in reality we
> share the CPU with others...

How can that happen?  If the task you're trying to accelerate isn't
in your task group, the whole attempt should be a noop.

> Removing the pulling code allows me to use all 4
> CPUs with a 4-VCPU KVM guest in an uncontended situation.
>
> > +	/* Tell the scheduler that we'd really like pse to run next. */
> > +	p_cfs_rq->next = pse;
>
> Using set_next_buddy propagates this up to the root,
> allowing the scheduler to actually know who we want to
> run next when cgroups is involved.
>
> > +	/* We know whether we want to preempt or not, but are we allowed? */
> > +	if (preempt && same_thread_group(p, task_of(p_cfs_rq->curr)))
> > +		resched_task(task_of(p_cfs_rq->curr));
>
> With this in place, we can get into the situation where
> we will gladly give up CPU time, but not actually give
> any to the other VCPUs in our guest.
>
> I believe we can get rid of that test, because pick_next_entity
> already makes sure it ignores ->next if picking ->next would
> lead to unfairness.

Preempting everybody who gets in your way isn't being a nice neighbor,
so I think at least the same_thread_group() test needs to stay.  But
that's Peter's call.  Starting a zillion threads to play wakeup-preempt
games and hog the CPU isn't nice either, but it's allowed.

> Removing this test (and simplifying yield_to_task_fair) seems
> to lead to more predictable test results.

Less is more :)

	-Mike