From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S262566AbTLPAjo (ORCPT ); Mon, 15 Dec 2003 19:39:44 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S264093AbTLPAjo (ORCPT ); Mon, 15 Dec 2003 19:39:44 -0500 Received: from gateway-1237.mvista.com ([12.44.186.158]:61423 "EHLO av.mvista.com") by vger.kernel.org with ESMTP id S262566AbTLPAjm (ORCPT ); Mon, 15 Dec 2003 19:39:42 -0500 Message-ID: <3FDE5449.60507@mvista.com> Date: Mon, 15 Dec 2003 16:39:37 -0800 From: George Anzinger Organization: MontaVista Software User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2) Gecko/20021202 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Nick Piggin CC: Guillaume Foliard , linux-kernel@vger.kernel.org Subject: Re: Scheduler degradation since 2.5.66 References: <200312142048.51579.guifo@wanadoo.fr> <3FDD205A.6040807@cyberone.com.au> <3FDD35F9.7090709@cyberone.com.au> In-Reply-To: <3FDD35F9.7090709@cyberone.com.au> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Nick Piggin wrote: > > > Nick Piggin wrote: > >> >> >> Guillaume Foliard wrote: >> >>> Hello, >>> >>> I have been playing with kernel 2.5/2.6 for around 6 months now. I >>> was quite pleased with 2.5.65 to see that the soft real-time >>> behaviour was much better than 2.4.x. Since then I tried most of the >>> 2.5/2.6 versions. But recently someone warned me about some >>> degradations with 2.6.0-test6. To show the degradation since 2.5.66 I >>> have run a simple test program on most of the versions. This simple >>> program is measuring the time it takes to a process to be woken up >>> after a call to nanosleep. >>> As the results are plots, please visit this small website for more >>> information : http://perso.wanadoo.fr/kayakgabon/linux >>> I'm ready to perform more tests or provide more information if >>> necessary. >>> >> >> This isn't a problem with the scheduler, its a problem with >> sys_nanosleep. >> jiffies_to_timespec( {1000000us} ) returns 2 jiffies, and nanosleep adds >> an extra one and asks to sleep for that long (ie. 3ms). > > > > I think you should actually sleep for 2 jiffies here. You have asked > to sleep for _at least_ 1 real millisecond and you really don't care > about the number of jiffies that is. Depending on when the last timer > interrupt had fired, the next jiffy might be in another microsecond. > > So I think you really must sleep for that extra jiffy (but 3 is too > many I think). Notice your first graphs are actually bad, because > some sleeps are much less than 1000us. > > I don't know much about the timer code though, perhaps you do need to > sleep for 3 jiffies... We get the request at some time t between tick tt and tt+1 to sleep for N ticks. We round this up to the next higher tick count convert to jiffies dropping any fraction and then add 1. So that should be 2 right? This is added to NOW which, in the test code, is pretty well pined to the last tick plus processing time. So why do you see 3? What is missing here is that the request was for 1.000000 ms and a tick is really 0.999849 ms. So the request is for a bit more than a tick which we are obligated to round up to 2 ticks. Then adding the 1 tick guard we get the 3 you are seeing. Now if you actually look at that elapsed time you should see it at about 2.999547 ms and ranging down to 1.999698 ms. Try running the test with a requested sleep time of something less than 0.999849 ms. All this is for the x86 which is using this time to do the best it can with the PIT which can only get this close to 1 ms ticks. You can even vary this number to see exactly where the round up actually happens. Ah, life in the nano world :) -- George Anzinger george@mvista.com High-res-timers: http://sourceforge.net/projects/high-res-timers/ Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml