Date: Wed, 7 Jul 2004 01:59:23 -0700
From: Elladan
To: "Povolotsky, Alexander"
Cc: "'Mike Galbraith'", "'elladan@eskimo.com'", "'linux-kernel@vger.kernel.org'"
Subject: Re: Maximum frequency of re-scheduling (minimum time quantum) question
Message-ID: <20040707085923.GA29731@eskimo.com>
References: <313680C9A886D511A06000204840E1CF08F42FD7@whq-msgusr-02.pit.comms.marconi.com>
In-Reply-To: <313680C9A886D511A06000204840E1CF08F42FD7@whq-msgusr-02.pit.comms.marconi.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jul 07, 2004 at 03:59:01AM -0400, Povolotsky, Alexander wrote:
> Thanks to both of you for answering !
>
> >The catch here is, without the preemptable kernel option, the kernel
> >can't preempt itself, so if the first process was doing something in the
> >kernel, there'd be a delay. Even with the option, it can't preempt
> >itself inside of a critical section, so there will still be a (shorter)
> >delay.
>
> Yes, I am aware, - thanks to the previous answer (not included here), about
> this Linux 2.6 configurable "preemptable kernel" option and was assuming
> it is configured and in effect.

Note that the preemptable kernel gives you no guarantee of latency,
though it does reduce the average latency.
A different patch was constructed in the 2.4 era which attempted to
provide guaranteed latency through a different approach (effectively,
having all long-running operations yield).

> >In addition, the kernel can only preempt if something happens which lets
> >it check its state. Unless the low priority process makes some system
> >calls,
>
> does above means that "some system calls" have internal "built-in"
> schedule() call within their implementation ?
> Is there (anywhere) documentation which lists all
> system calls, which internally invoke the scheduler via calling schedule()
> function call?

All system calls execute schedule() before returning to user space, if
a reschedule has been requested. It's done in the system call handler.
An interrupt will also schedule if necessary.

In addition, if you have a preemptable kernel, schedule() may be
executed on demand at any time the kernel isn't in a critical section,
or upon exiting a critical section if it is.

> >the only thing that will trigger this is the timer interrupt
> >which runs at eg. 100 or 400hz typically.
>
> So I think that above is answering my original question, that in the "worst
> case" scenario - unless the rescheduling is induced earlier by explicit or
> implicit (via certain system calls) invocation of the schedule() function
> call, - the attempt of rescheduling (again, of course, by calling schedule()
> function call) will be done at least at every "clock tick time" (say every
> 10 ms, which is default value) ?

The reschedule will be done whenever the kernel does it. There is no
guaranteed worst case. It's just based on "best effort."

If an interrupt from some device, or the timer interrupt, causes a
high-priority process to become runnable, the kernel will attempt to
schedule as soon as possible. With a preemptable kernel, that will be
as soon as it releases all locks and can thus be safely interrupted.
But the kernel makes no guarantee that locks won't be held for long
periods of time, so the worst case latency is the longest possible
duration of an operation in the kernel. That could be quite long (many
milliseconds) if it's walking tables, hitting worst-case hash behavior,
etc.

The thing to note about the timer is that, if you have no external
interrupt driving your wakeup, then you're waking up based on some sort
of timer. The best resolution you can get from a timer is approximately
based on HZ. In 2.6 kernels, the interrupt frequency is 1000 Hz, but
ticks are represented to userspace as if they were 100 Hz.

Also, you may have an RTC available if you need high frequency
interrupts.

-J
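
The tick arithmetic discussed above can be checked from userspace. What follows is only a rough sketch and not part of the original message: the helper names (tick_period_ms, jiffies_to_user_ticks, measure_sleep_overshoot_ms) are made up for illustration, and USER_HZ = 100 is taken from the "as if they were 100 Hz" remark. The measured overshoot depends entirely on the kernel and its load; on later kernels with high-resolution timers it may come out far below one tick, so treat any number it prints as an observation, not a bound.

```python
import os
import time

# Assumption: the tick rate reported to userspace, per the "100 Hz" remark above.
USER_HZ = 100


def tick_period_ms(hz):
    """Length of one timer tick in milliseconds for a given HZ."""
    return 1000.0 / hz


def jiffies_to_user_ticks(jiffies, kernel_hz, user_hz=USER_HZ):
    """Scale a kernel tick count down to the rate shown to userspace."""
    return jiffies * user_hz // kernel_hz


def measure_sleep_overshoot_ms(request_ms, samples=20):
    """Sleep repeatedly; return the worst observed overshoot in milliseconds.

    This is a best-effort observation, not a worst-case guarantee.
    """
    worst = 0.0
    for _ in range(samples):
        start = time.monotonic()
        time.sleep(request_ms / 1000.0)
        elapsed_ms = (time.monotonic() - start) * 1000.0
        worst = max(worst, elapsed_ms - request_ms)
    return worst


if __name__ == "__main__":
    for hz in (100, 1000):
        print("HZ=%d: one tick is %.1f ms" % (hz, tick_period_ms(hz)))
    if hasattr(os, "sysconf"):
        # Tick rate as exposed to userspace on this machine.
        print("userspace ticks per second:", os.sysconf("SC_CLK_TCK"))
    print("worst overshoot for a 1 ms sleep: %.3f ms"
          % measure_sleep_overshoot_ms(1))
```

Running it on an otherwise idle machine and again under heavy load gives a feel for how far a timer-driven wakeup can slip, which is the "best effort" behavior described above.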