From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751233AbWFSIm1 (ORCPT ); Mon, 19 Jun 2006 04:42:27 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751235AbWFSIm1 (ORCPT ); Mon, 19 Jun 2006 04:42:27 -0400 Received: from fgwmail5.fujitsu.co.jp ([192.51.44.35]:53977 "EHLO fgwmail5.fujitsu.co.jp") by vger.kernel.org with ESMTP id S1751233AbWFSIm0 (ORCPT ); Mon, 19 Jun 2006 04:42:26 -0400 Message-ID: <44966320.6080308@jp.fujitsu.com> Date: Mon, 19 Jun 2006 17:41:04 +0900 From: MAEDA Naoaki User-Agent: Thunderbird 1.5.0.4 (Windows/20060516) MIME-Version: 1.0 To: Sam Vilain Cc: vatsa@in.ibm.com, Nick Piggin , Kirill Korotaev , Mike Galbraith , Ingo Molnar , Peter Williams , Andrew Morton , sekharan@us.ibm.com, Balbir Singh , linux-kernel@vger.kernel.org, kurosawa@valinux.co.jp, ckrm-tech@lists.sourceforge.net, MAEDA Naoaki Subject: Re: [RFC] CPU controllers? References: <20060615134632.GA22033@in.ibm.com> <4493C1D1.4020801@yahoo.com.au> <20060617164812.GB4643@in.ibm.com> <4494DF50.2070509@yahoo.com.au> <4494EA66.8030305@vilain.net> <20060618071847.GA4988@in.ibm.com> <449606F5.6050909@vilain.net> <44964C89.6060003@jp.fujitsu.com> <44965E0C.9050508@vilain.net> In-Reply-To: <44965E0C.9050508@vilain.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Sam Vilain wrote: > MAEDA Naoaki wrote: >>> ok, so basically the bit in cpu_rc_load() where for_each_cpu_mask() is >>> called, in Maeda Naoaki's patch "CPU controller - Add class load >>> estimation support", is where O(N) creeps in that could be remedied with >>> a token bucket algorithm. You don't want this because if you have 10,000 >>> processes on a system in two resource groups, the aggregate performance >>> will suffer due to the large number of cacheline misses during the 5,000 >>> size loop that runs every resched. >>> >> Thank you for looking the code. >> >> cpu_rc_load() is never called unless sysadm tries to access the load >> information via configfs from userland. In addition, it sums up per-CPU >> group stats, so the size of loop is the number of CPU, not process in >> the group. >> >> However, there is a similer loop in cpu_rc_recalc_tsfactor(), which runs >> every CPU_RC_RECALC_INTERVAL that is defined as HZ. I don't think it >> will cause a big performance penalty. >> > > Ok, so that's not as bad as it looked. So, while it is still O(N), the > fact that it is O(N/HZ) makes this not a problem until you get to > possibly impractical levels of runqueue length. Do you mean N is the size of the loop? for_each_cpu_mask() loops the number of CPUs times. It is not directly related to runqueue length. > I'm thinking it's probably worth doing anyway, just so that it can be > performance tested to see if this performance guestimate is accurate. > >>> To apply the token bucket here, you would first change the per-CPU >>> struct cpu_rc to have the TBF fields; minimally: >>> >>> [...] >>> I think that the characteristics of these two approaches are subtly >>> different. Both scale timeslices, but in a different way - instead of >>> estimating the load and scaling back timeslices up front, busy Resource >>> Groups are relied on to deplete their tokens in a timely manner, and get >>> shorter slices allocated because of that. No doubt from 10,000 feet they >>> both look the same. >>> >> Current 0(1) scheduler gives extra bonus for interactive tasks by >> requeuing them to active array for a while. It would break >> the controller's efforts. So, I'm planning to stop the interactive >> task requeuing if the target share doesn't meet. >> >> Are there a similar issue on the vserver scheduler? >> > > Not an issue - those extra requeued timeslices are accounted for normally. It's great. Thanks, MAEDA Naoaki