From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1760727AbZEGQ4W (ORCPT ); Thu, 7 May 2009 12:56:22 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1751983AbZEGQ4N (ORCPT ); Thu, 7 May 2009 12:56:13 -0400
Received: from smtp-outbound-1.vmware.com ([65.115.85.69]:60654 "EHLO
	smtp-outbound-1.vmware.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751684AbZEGQ4M (ORCPT ); Thu, 7 May 2009 12:56:12 -0400
Subject: Re: [PATCH] x86: Reduce the default HZ value
From: Alok Kataria
Reply-To: akataria@vmware.com
To: Chris Snook
Cc: "H. Peter Anvin" , Ingo Molnar , Thomas Gleixner ,
	the arch/x86 maintainers , LKML , "alan@lxorguk.ukuu.org.uk"
In-Reply-To: <13a12eea0905070935o5abbeb49n8320d06c15b19b56@mail.gmail.com>
References: <1241462661.412.8.camel@alok-dev1>
	<4A00ADDE.9000908@zytor.com>
	<1241560625.8665.17.camel@alok-dev1>
	<13a12eea0905070935o5abbeb49n8320d06c15b19b56@mail.gmail.com>
Content-Type: text/plain
Organization: VMware INC.
Date: Thu, 07 May 2009 09:56:13 -0700
Message-Id: <1241715373.32495.21.camel@alok-dev1>
Mime-Version: 1.0
X-Mailer: Evolution 2.12.3 (2.12.3-8.el5_2.3)
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 2009-05-07 at 09:35 -0700, Chris Snook wrote:
> On Tue, May 5, 2009 at 5:57 PM, Alok Kataria wrote:
> >
> > On Tue, 2009-05-05 at 14:21 -0700, H. Peter Anvin wrote:
> >> Alok Kataria wrote:
> >> > Hi,
> >> >
> >> > Given that there were no major objections that came up regarding
> >> > reducing the HZ value in http://lkml.org/lkml/2009/4/27/499.
> >> >
> >> > Below is the patch which actually reduces it, please consider for tip.
> >> >
> >>
> >> What is the benefit of this?
> >
> > I did some experiments on linux 2.6.29 guests running on VMware and
> > noticed that the number of timer interrupts could have some slowdown on
> > the total throughput on the system.
> > A simple tight loop experiment showed that with HZ=1000 we took about
> > 264sec to complete the loop and that same loop took about 255sec with
> > HZ=100.
> > You can find more information here http://lkml.org/lkml/2009/4/28/401
>
> This is why certain niches, such as HPC users, often prefer HZ=100
> kernels. For the rest of us, sacrificing a few percent CPU throughput
> for significant latency gains is well worth it.
>
> > And with HRT i don't see any downsides in terms of increased latencies
> > for device timer's or anything of that sought.
> >
> >>
> >> I can see at least one immediate downside: some timeout values in the
> >> kernel are still maintained in units of HZ (like poll, I believe), and
> >> so with a lower HZ value we'll have higher roundoff errors.
> >
> > If that at all is such a big problem shouldn't we think about moving to
> > using schedule_hrtimeout for such cases rather than relying on jiffy
> > based timeouts.
> > The hrtimer explanation over here http://www.tglx.de/hrtimers.html
> > also talks about where these HZ (timer wheel) based timeouts be used and
> > shouldn't really be dependent on accurate timing.
>
> But your patch doesn't do this.

It doesn't because poll and select already use hrtimers. So IMO no
important subsystem relies on jiffies for wakeups, and the latency
problem is not actually present in the kernel.

> If you want us to merge a patch that
> makes VMware systems faster, we're a lot more likely to take it if it
> make everyone else's systems faster, or at least not slower.

I doubt it would make any system slower. Running these simple
experiments is not hard at all, and one could run them on a native
system too to check.

>
> > Also the default HZ value was 250 before this commit
> >
> > commit 5cb04df8d3f03e37a19f2502591a84156be71772
> > x86: defconfig updates
> >
> > And it was 250 for a very long time before that too. The commit log
> > doesn't explain why the value was bumped up either.
>
> 250 was considered a compromise between 100 and 1000, but almost
> everyone who cared just ended up using one or the other, and most of
> them preferred 1000.
>
> Given your use case, what you really need to do is get Red Hat,
> Novell, et al. on the phone and ask them to ship kernels with HZ=100,
> because the distributions do their own thing anyway.

Yeah, but I don't think there is any better platform than LKML to
figure out whether this is a problem anymore. Once we are assured that
a low HZ is no longer a problem, I don't see why the various distros
would not consider reducing it.

> If you can
> figure out a way to do that without harming latency, they'll be
> thrilled.

Why do you think it would harm latency? The sched_tick too is driven by
hrtimers. If there is any specific subsystem which you think still
relies on jiffies, we could think about using hrtimers for it too,
right?
I did a quick scan and the only things that rely on jiffies are the
device timeouts, where latency is not an issue.
So please let me know in what cases you think it could affect system
latency.

Thanks,
Alok

>
> -- Chris
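
To put rough numbers on the roundoff concern quoted earlier in this
thread (timeouts kept in units of HZ), here is a minimal sketch. It is a
simplified model, not kernel code: it assumes a jiffy-based timeout is
simply rounded up to a whole number of ticks, and it does not model the
hrtimer path at all. The helper names effective_timeout_ms and
roundoff_ms are made up for illustration; msecs_to_jiffies only borrows
the name of the kernel helper.

```python
MSEC_PER_SEC = 1000


def msecs_to_jiffies(ms: int, hz: int) -> int:
    """Round a millisecond timeout up to whole ticks (simplified model)."""
    return (ms * hz + MSEC_PER_SEC - 1) // MSEC_PER_SEC


def effective_timeout_ms(ms: int, hz: int) -> float:
    """Timeout actually represented once it is stored as a tick count."""
    return msecs_to_jiffies(ms, hz) * MSEC_PER_SEC / hz


def roundoff_ms(ms: int, hz: int) -> float:
    """Extra delay introduced purely by rounding up to the tick size."""
    return effective_timeout_ms(ms, hz) - ms


if __name__ == "__main__":
    requested_ms = 15  # hypothetical timeout, chosen only for illustration
    for hz in (100, 250, 1000):
        print(f"HZ={hz}: {msecs_to_jiffies(requested_ms, hz)} ticks, "
              f"roundoff {roundoff_ms(requested_ms, hz)} ms")
```

Under this model the worst-case roundoff is just under one tick, so
about 10 ms at HZ=100 against 1 ms at HZ=1000. That only matters for
paths still driven by jiffies, which is the point being argued above.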