From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S968716AbXG3VZc (ORCPT ); Mon, 30 Jul 2007 17:25:32 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S968634AbXG3VZB (ORCPT ); Mon, 30 Jul 2007 17:25:01 -0400
Received: from ns1.suse.de ([195.135.220.2]:37558 "EHLO mx1.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S968625AbXG3VZA (ORCPT ); Mon, 30 Jul 2007 17:25:00 -0400
Date: Mon, 30 Jul 2007 23:24:57 +0200
From: Andrea Arcangeli
To: Chris Snook
Cc: tim.c.chen@linux.intel.com, mingo@elte.hu, linux-kernel@vger.kernel.org
Subject: Re: pluggable scheduler thread (was Re: Volanomark slows by 80% under CFS)
Message-ID: <20070730212457.GJ7503@v2.random>
References: <1185573687.19777.44.camel@localhost.localdomain>
	<46AA8E57.8010105@redhat.com>
	<20070728005920.GA31622@v2.random>
	<46AABB5B.3030702@redhat.com>
	<20070728050141.GC31622@v2.random>
	<46AAE760.9030602@redhat.com>
	<1185821379.19777.58.camel@localhost.localdomain>
	<46AE5322.9030605@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <46AE5322.9030605@redhat.com>
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jul 30, 2007 at 05:07:46PM -0400, Chris Snook wrote:
> [..] It's spending a lot less time in %sys despite the
> higher context switches, [..]

The workload takes 40% longer, so you have to add that additional 40% into your math as well. "A lot less time" sounds like an overstatement to me. You also have to take into account the cache effects of executing the scheduler so much, etc.

> [..] and there are far fewer tasks waiting for CPU
> time. The real problem seems to be that volanomark is optimized for a

It looks weird that there are far fewer tasks in R state. Could you press SysRq+T to see where those hundred tasks are sleeping in the CFS run?
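(As an aside, if there is no physical console handy, the SysRq+T task dump can also be triggered from a shell, assuming the kernel was built with CONFIG_MAGIC_SYSRQ; roughly:)

```shell
# Enable the magic SysRq key if it isn't already (requires root).
echo 1 > /proc/sys/kernel/sysrq

# 't' dumps the state and kernel stack trace of every task to the
# kernel log, showing where the sleeping tasks are blocked.
echo t > /proc/sysrq-trigger

# Read the resulting task dump from the kernel ring buffer.
dmesg | tail -n 100
```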
> That's not to say that we can't improve volanomark performance under CFS,
> but simply that CFS isn't so fundamentally flawed that this is impossible.

Given the increase in context switches, not all of the context switches are "userland mandated", so the first thing to try here is to increase the granularity with the new tunable sysctl. Increasing the granularity should reduce the context switch rate, and in turn reduce the slowdown to less than 40%. There's nothing necessarily flawed in CFS even if it's slower than O(1) in this load no matter how you tune it. The higher context switch rate needed to retain complete fairness is a feature, but fairness vs. global performance is generally a tradeoff.
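(For reference, the granularity tunable is exposed through sysctl; the exact name has changed across CFS revisions, so treat kernel.sched_granularity_ns below as an assumption for the early CFS patches and adjust for your kernel version:)

```shell
# Read the current CFS granularity, in nanoseconds. The sysctl name
# (kernel.sched_granularity_ns) is from the early CFS patches and may
# differ on later kernels (requires root to change).
sysctl kernel.sched_granularity_ns

# Raise the granularity to e.g. 10ms so tasks run longer before being
# preempted, which should lower the context switch rate.
sysctl -w kernel.sched_granularity_ns=10000000

# Watch the system-wide context switch rate before/after the change;
# the "cs" column of vmstat shows context switches per second.
vmstat 1 5
```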