Date: Tue, 24 Jul 2007 12:43:20 +0530
From: Srivatsa Vaddagiri
To: Dhaval Giani
Cc: Andrew Morton, Balbir Singh, linux-kernel@vger.kernel.org, Ingo Molnar
Subject: Re: System hangs on running kernbench
Message-ID: <20070724071320.GA12169@linux.vnet.ibm.com>
Reply-To: vatsa@linux.vnet.ibm.com
In-Reply-To: <20070718075648.GA4235@linux.vnet.ibm.com>

On Wed, Jul 18, 2007 at 01:26:48PM +0530, Dhaval Giani wrote:
> Hi Andrew,
>
> I was running kernbench on top of 2.6.22-rc6-mm1 and I got a Hangcheck
> alert (this is when kernbench reached make -j).
>
> Also, make -j is hanging.

[Refer to http://marc.info/?l=linux-kernel&m=118474574807055 for the
complete report of this bug.]

Ingo,

Dhaval tracked the root cause of this problem down to CFS (by the way,
the CFS patches weren't git-bisect safe). Basically, a "make -s -j"
workload hung the machine, leading to a lot of OOM kills. This was on
an 8-CPU machine with 4 GB of RAM and no swap space configured. The
same workload runs "fine" (to completion) on 2.6.22.

I played with the scheduler tunables a bit and found that the problem
goes away if I set sched_granularity_ns to 100ms (default value 32ms).
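For reference, the tunable can be changed at runtime through /proc. This
is only a sketch: it assumes the early-CFS name sched_granularity_ns
under /proc/sys/kernel/ (later kernels renamed the tunable), and it
needs root:

```shell
# Show the current preemption granularity, in nanoseconds.
# (Path assumed for early-CFS -mm kernels; check /proc/sys/kernel/
# on your kernel, as the tunable was later renamed.)
cat /proc/sys/kernel/sched_granularity_ns

# Raise it from the 32ms default to 100ms (100,000,000 ns).
echo 100000000 > /proc/sys/kernel/sched_granularity_ns
```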
So my theory is this: 32ms is too low a preemption granularity for any
compile thread to make "useful" progress. As a result of the rapid
context switching, the job completion ("retiral") rate falls behind the
job arrival rate. This builds up job pressure on the system very
quickly (more quickly than would have happened with a 100ms granularity
or on a 2.6.22 kernel), leading to OOM kills (and the hang).

From a user perspective, running with the default granularity_ns value,
this may be seen as a regression. Perhaps these new CFS tunables are
something users will have to get used to and tune to settings
appropriate for their systems. It would have been nice for the kernel
to auto-tune the settings based on the workload, but I guess that's
harder.

--
Regards,
vatsa