From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752614AbYHRImX (ORCPT); Mon, 18 Aug 2008 04:42:23 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1751642AbYHRImP (ORCPT); Mon, 18 Aug 2008 04:42:15 -0400
Received: from mx2.mail.elte.hu ([157.181.151.9]:35388 "EHLO mx2.mail.elte.hu"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751514AbYHRImO (ORCPT); Mon, 18 Aug 2008 04:42:14 -0400
Date: Mon, 18 Aug 2008 10:42:01 +0200
From: Ingo Molnar
To: "Zhang, Yanmin"
Cc: a.p.zijlstra@chello.nl, Linux Kernel Mailing List
Subject: Re: scale sysctl_sched_shares_ratelimit with nr_cpus
Message-ID: <20080818084201.GA25432@elte.hu>
References: <37E52D09333DE2469A03574C88DBF40F024EBD2F@pdsmsx414.ccr.corp.intel.com>
	<20080818065220.GA2711@elte.hu>
	<37E52D09333DE2469A03574C88DBF40F024EBD69@pdsmsx414.ccr.corp.intel.com>
	<20080818070147.GA4801@elte.hu>
	<37E52D09333DE2469A03574C88DBF40F024EBE06@pdsmsx414.ccr.corp.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <37E52D09333DE2469A03574C88DBF40F024EBE06@pdsmsx414.ccr.corp.intel.com>
User-Agent: Mutt/1.5.18 (2008-05-17)
X-ELTE-VirusStatus: clean
X-ELTE-SpamScore: -1.5
X-ELTE-SpamLevel:
X-ELTE-SpamCheck: no
X-ELTE-SpamVersion: ELTE 2.0
X-ELTE-SpamCheck-Details: score=-1.5 required=5.9 tests=BAYES_00 autolearn=no
	SpamAssassin version=3.2.3
	-1.5 BAYES_00 BODY: Bayesian spam probability is 0 to 1%
	[score: 0.0000]
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

* Zhang, Yanmin wrote:

> >> Does a scheduler trace show anything about why that drop happens?
> >> Do something like this to trace the scheduler:
> >>
> >> assuming debugfs is mounted under /debug and CONFIG_SCHED_TRACER=y:
> >>
> >>   echo 1 > /debug/tracing/tracing_cpumask
> >>   echo sched_switch > /debug/tracing/current_tracer
> >>   cat /debug/tracing/trace_pipe > trace.txt
>
> [YM] Thanks for the pointer. I collected the data and didn't find
> anything abnormal except the waker pid.
>
>   Receiver-197-13665 [00] 1369.966423: 13665:120:R  +  13607:120:S
>   Receiver-197-13665 [00] 1369.966440: 13665:120:R  +  13611:120:S
>   Receiver-197-13665 [00] 1369.966458: 13665:120:R  +  13615:120:S
>   Receiver-197-13665 [00] 1369.966463: 13665:120:R  +  13619:120:S
>   Receiver-197-13665 [00] 1369.966466: 13665:120:R  +  13623:120:S
>   Receiver-197-13665 [00] 1369.966469: 13665:120:R  +  13627:120:S
>   Receiver-197-13665 [00] 1369.966475: 13665:120:R  +  13631:120:S
>   Receiver-197-13665 [00] 1369.966480: 13665:120:R  +  13635:120:S
>   Receiver-197-13665 [00] 1369.966485: 13665:120:R  +  13639:120:S
>   Receiver-197-13665 [00] 1369.966495: 13665:120:R  +  13643:120:S
>   Receiver-197-13665 [00] 1369.966507: 13871:120:R  +  13647:120:S
>
> The waker pid above is 13871 while the current pid is 13665. I found
> lots of such mismatched data.
>
>   Receiver-197-13665 [00] 1369.966513: 13465:120:R  +  13651:120:S
>   Receiver-197-13665 [00] 1369.966516: 13665:120:R  +  13655:120:S
>   Receiver-197-13665 [00] 1369.966521: 13665:120:R  +  13659:120:S
>   Receiver-197-13665 [00] 1369.966530: 13665:120:R  +  13667:120:S
>   Receiver-197-13665 [00] 1369.966544: 13883:120:R  +  13663:120:S
>   Receiver-197-13665 [00] 1369.966549: 13665:120:R ==> 13667:120:R
>     Sender-140-13667 [00] 1369.966573: 13351:120:R  +  13668:120:S
>     Sender-140-13667 [00] 1369.966578: 13667:120:R ==> 13659:120:R
>
> BTW, I analyzed the schedstat data and found that wake_affine and
> load_balance_newidle seem abnormal: 2.6.27-rc has more task pulls. I
> set CONFIG_GROUP_SCHED=n for the above testing.
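
(To cross-check such traces mechanically, a throwaway userspace filter
along the following lines could be run over the captured trace.txt.
mismatch.c is a hypothetical helper, not an existing ftrace tool, and
it assumes the exact "comm-pid [cpu] timestamp: waker:prio:state  +
wakee:prio:state" wakeup line layout quoted above.)

/*
 * mismatch.c -- flag sched_switch wakeup records whose waker pid
 * differs from the pid embedded in the emitting task's "comm-pid"
 * prefix.  Hypothetical helper, sketched against the trace layout
 * quoted above.
 *
 * Build: gcc -o mismatch mismatch.c
 * Use:   ./mismatch < trace.txt
 */
#include <stdio.h>
#include <string.h>

int main(void)
{
	char line[512];

	while (fgets(line, sizeof(line), stdin)) {
		char comm[64];
		char *dash;
		int prefix_pid, waker_pid, wakee_pid;

		/* wakeup records use " + "; context switches use "==>" */
		if (!strstr(line, " + "))
			continue;
		if (sscanf(line, "%63[^[] [%*d] %*f: %d:%*d:%*c + %d:",
			   comm, &waker_pid, &wakee_pid) != 3)
			continue;
		/* the pid is the last '-'-separated field of the prefix */
		dash = strrchr(comm, '-');
		if (!dash || sscanf(dash + 1, "%d", &prefix_pid) != 1)
			continue;
		if (prefix_pid != waker_pid)
			fputs(line, stdout);
	}
	return 0;
}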
hm, does this mean there's too much idle time during the test run,
because we don't load-balance aggressively enough?

	Ingo
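
(For reference on the subject line, the kind of boot-time scaling under
discussion might look roughly like the sketch below. This is a minimal
illustration only, assuming the 250000 ns default; the
scale_shares_ratelimit() hook name is hypothetical and this is not the
actual patch.)

/*
 * Sketch: scale the group-shares update ratelimit by the number of
 * online CPUs, so that large machines do not recompute task-group
 * shares disproportionately often.  Assumes the 250000 ns default;
 * scale_shares_ratelimit() is a hypothetical init-time hook, not
 * the actual upstream change.
 */
unsigned int sysctl_sched_shares_ratelimit = 250000;

static void __init scale_shares_ratelimit(void)
{
	sysctl_sched_shares_ratelimit *= num_online_cpus();
}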