* [QUESTION] Micro-Second timers in kernel ?
From: Jean Tourrilhes @ 2002-03-16 0:48 UTC
To: Linux kernel mailing list, Alan Cox

Hi,

I'm wondering what the finest timer resolution is that can be had in Linux across all platforms. The goal: I need microsecond-resolution delays in the hard_xmit function of the IrDA-USB driver, and I don't want to just grab the CPU.

The function sys_nanosleep() seems to indicate that under 2 ms we should not even bother using a timer. Well, on a modern CPU, 2 ms is a very long time (on the other hand, it seems OK for PDAs).

The definition of "tick" in timer.c indicates that timer_bh is called at most HZ times per second (which is consistent with the definition of jiffies). On i386, this would be one tick every 10 ms.

Well... I'm stuck. 10 ms is a very long time at 4 Mb/s. So I guess I'll continue to busy-wait before sending each packet. Ugh!

Regards,

Jean

* Re: [QUESTION] Micro-Second timers in kernel ?
From: george anzinger @ 2002-03-16 2:42 UTC
To: jt; +Cc: Linux kernel mailing list, Alan Cox

Jean Tourrilhes wrote:
> Hi,
>
> I'm wondering what the finest timer resolution is that can be had in Linux across all platforms. The goal: I need microsecond-resolution delays in the hard_xmit function of the IrDA-USB driver, and I don't want to just grab the CPU.
>
> The function sys_nanosleep() seems to indicate that under 2 ms we should not even bother using a timer. Well, on a modern CPU, 2 ms is a very long time (on the other hand, it seems OK for PDAs).
>
> The definition of "tick" in timer.c indicates that timer_bh is called at most HZ times per second (which is consistent with the definition of jiffies). On i386, this would be one tick every 10 ms.
>
> Well... I'm stuck. 10 ms is a very long time at 4 Mb/s. So I guess I'll continue to busy-wait before sending each packet. Ugh!

The overhead to do a timer is on the order of at least 100 us on an 800 MHz machine. Given this, a timer/interrupt-based delay of less than 100 us is probably a bad idea. Still, times in this range and up are available in the high-res-timers patch, BUT, while the patch makes a stab at providing POSIX timers for all "arch"s, the high-res part depends on the hardware and thus is different for each platform. Even for the x86 there are two versions (three if you want to work with a machine that does not have a TSC).

The upshot of this is that high-res timers will be available on some platforms soon, but it will take some time to find them on all platforms.

That said, there are a few kernel issues that need to be ironed out. Here are a couple:

1.) What interface(s) would you like to see in the kernel?

2.) Is there a standard-compliant way to extend high-res to the user APIs that currently, implicitly, reference CLOCK_REALTIME? The issue here is: CLOCK_REALTIME is rather firmly locked on to 1/HZ resolution. The select() API (to give just one example) does not specify a CLOCK and so implicitly uses CLOCK_REALTIME.

--
George george@mvista.com
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Real time sched: http://sourceforge.net/projects/rtsched/

* Re: [QUESTION] Micro-Second timers in kernel ?
From: Kasper Dupont @ 2002-03-20 20:44 UTC
To: Linux-Kernel

george anzinger wrote:
> Jean Tourrilhes wrote:
> > Well... I'm stuck. 10 ms is a very long time at 4 Mb/s. So I guess I'll continue to busy-wait before sending each packet. Ugh!
>
> The overhead to do a timer is on the order of at least 100 us on an 800 MHz machine. Given this, a timer/interrupt-based delay of less than 100 us is probably a bad idea.

I have been considering this whole timer thing for some time. I would actually like to see the interrupt at regular intervals completely removed from the kernel.

Instead I would rather use a programmable one-shot timer. The PIT used in i386-based machines can actually do that, although it is not the best-designed hardware I could imagine. I don't know what possibilities there are in other hardware.

The idea behind all this is that the timers are used for two different purposes. One purpose is that we want particular pieces of code executed at given times; the one-shot timer can be used for that. The other purpose is to measure time intervals; this can be done better in hardware without the need for interrupts, and the TSC is more or less proof of that.

What we need to do is this: whenever we want a function called at a specified time, we insert a struct containing the time and whatever additional information we need into a priority queue. We then take the difference between the current time and the time at the head of the queue and reprogram the one-shot timer if necessary.

Whenever the timer interrupt gets invoked it should keep executing the first element from the priority queue until the element at the head has a time stamp in the future. Then it should either busy-wait or reprogram the timer, depending on the needed delay compared to the overhead. All this should probably not be done by the timer interrupt itself, so perhaps we would use a kernel thread, a tasklet or something else. I haven't yet figured out exactly what would be best.

The problem with this idea is primarily that a lot of code might need to be rewritten. We would also need another unit for the jiffies variable; nanoseconds would probably be a good choice. But this would probably require it to be a 64-bit integer even on 32-bit architectures. And jiffies should no longer be a variable but a function call. (This is in some way similar to what happened to current at some point.)

I have been considering a few implementation details on the i386 architecture. Obviously a combination of the PIT and the TSC will be needed. The TSC cannot produce interrupts, and the PIT cannot reliably be used to measure time when used as a one-shot timer.

The PIT has a well-specified frequency; the TSC doesn't. I don't know about the accuracy of the two. We would need to know the frequency of the TSC, but I guess the kernel already measures that at bootup. I would suggest that the frequency, measured in Hz, be put in some proc pseudofile. Then there is an easy way for applications to read it, and root can change it if needed. The possibility of changing this pseudofile would have two main purposes. First, it can be used on hardware where the boot-time measurement of the frequency is for some reason wrong. Second, a feature could be added to ntpd to change the frequency used by the kernel in order to slowly adjust the time towards the right value.

The accuracy with which we know the TSC speed will affect the accuracy of the time. However, the accuracy of the PIT should not be that important. If interrupts come too early we will either busy-wait a very short time, or we will just schedule another interrupt. If interrupts come too late, some process will sleep slightly longer than it was supposed to, but that is no worse than what the kernel promises. And probably no worse than the current implementation.

With all this in place the scheduler no longer needs to use fixed-size timeslices; it could implement growing time slices and decreasing priorities for processes that need lots of CPU time, and OTOH use small time slices for interactive processes.

There are still a few questions for which I do not yet know the answers:
- How much overhead will the timer interrupts have with this new design?
- How much overhead will the needed 64-bit calculations in the kernel have?
- Will this be possible on all architectures on which Linux runs? And are there any architectures where this will be easy to implement?
- Is the TSC frequency reliable enough to be used for this?

I'm absolutely sure that this will require a lot of work to implement, but with the right architecture I believe it would be a very good idea. Is it worth a try?

--
Kasper Dupont -- who spends too much time on usenet.
For sending spam use mailto:razor-report@daimi.au.dk
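
The priority-queue scheme described in this message can be sketched in user-space C. This is only an illustration under stated assumptions, not code from the thread or from any kernel patch: the names `struct timer_event`, `timer_queue_add`, `timer_queue_run` and the `TIMER_OVERHEAD_NS` threshold are all hypothetical, times are plain 64-bit nanosecond counts, and a sorted singly linked list stands in for the priority queue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical threshold: below this delay, reprogramming the one-shot
 * timer is assumed to cost more than spinning. The value mirrors the
 * ~100 us overhead quoted in the thread for an 800 MHz machine. */
#define TIMER_OVERHEAD_NS 100000ULL

enum next_action { ACT_IDLE, ACT_BUSYWAIT, ACT_REPROGRAM };

struct timer_event {
    uint64_t expires_ns;          /* absolute expiry time */
    void (*fn)(void *arg);        /* callback to run at expiry */
    void *arg;
    struct timer_event *next;
};

struct timer_queue {
    struct timer_event *head;     /* earliest expiry first */
};

/* Insert ev, keeping the list sorted by expiry. Returns 1 if ev became
 * the new head, i.e. the caller would have to reprogram the one-shot
 * timer, and 0 otherwise. */
static int timer_queue_add(struct timer_queue *q, struct timer_event *ev)
{
    struct timer_event **pp = &q->head;

    assert(ev != NULL && ev->fn != NULL);
    while (*pp != NULL && (*pp)->expires_ns <= ev->expires_ns)
        pp = &(*pp)->next;
    ev->next = *pp;
    *pp = ev;
    return q->head == ev;
}

/* Run every event whose expiry is not in the future, then report what
 * to do about the new head: nothing, spin, or reprogram the timer.
 * *delay_ns receives the time remaining until the new head expires. */
static enum next_action timer_queue_run(struct timer_queue *q,
                                        uint64_t now_ns, uint64_t *delay_ns)
{
    while (q->head != NULL && q->head->expires_ns <= now_ns) {
        struct timer_event *ev = q->head;

        q->head = ev->next;       /* unlink before calling back */
        ev->next = NULL;
        ev->fn(ev->arg);
    }
    if (q->head == NULL) {
        *delay_ns = 0;
        return ACT_IDLE;
    }
    *delay_ns = q->head->expires_ns - now_ns;
    return *delay_ns < TIMER_OVERHEAD_NS ? ACT_BUSYWAIT : ACT_REPROGRAM;
}

/* Example callback: counts how many times it has fired. */
static void count_fire(void *arg)
{
    ++*(int *)arg;
}
```

A caller would invoke `timer_queue_run()` from whatever context handles the timer interrupt and act on the returned value. A real implementation would also need locking and a way to cancel pending events, both of which this sketch leaves out.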

* Re: [QUESTION] Micro-Second timers in kernel ?
From: george anzinger @ 2002-03-29 9:24 UTC
To: Kasper Dupont; +Cc: Linux-Kernel

Kasper Dupont wrote:
> george anzinger wrote:
> > Jean Tourrilhes wrote:
> > > Well... I'm stuck. 10 ms is a very long time at 4 Mb/s. So I guess I'll continue to busy-wait before sending each packet. Ugh!
> >
> > The overhead to do a timer is on the order of at least 100 us on an 800 MHz machine. Given this, a timer/interrupt-based delay of less than 100 us is probably a bad idea.
>
> I have been considering this whole timer thing for some time. I would actually like to see the interrupt at regular intervals completely removed from the kernel.

Check out the high-res-timers sourceforge site (see signature for URL). You will find a patch to make a "tick less" kernel. It does pretty much what you are thinking of. I rejected the whole notion based on the instrumentation results from that kernel. The problem, in a nutshell, is that a timer needs to be stopped and started on each context switch. While these operations are really quite fast, so is the context switch, and the extra overhead seems to cross the "ticked" system timer overhead when the context switch rate is, well, let's say, busy. The upshot is that the "tick less" system is overload prone: timer overhead increases with load (or context switch rate), whereas in the "ticked" system it is almost flat.

> Instead I would rather use a programmable one-shot timer. The PIT used in i386-based machines can actually do that, although it is not the best-designed hardware I could imagine. I don't know what possibilities there are in other hardware.
>
> The idea behind all this is that the timers are used for two different purposes. One purpose is that we want particular pieces of code executed at given times; the one-shot timer can be used for that. The other purpose is to measure time intervals; this can be done better in hardware without the need for interrupts, and the TSC is more or less proof of that.
>
> What we need to do is this: whenever we want a function called at a specified time, we insert a struct containing the time and whatever additional information we need into a priority queue. We then take the difference between the current time and the time at the head of the queue and reprogram the one-shot timer if necessary.

The wonderful PIT has a maximum time of somewhere around 500 ms (if I remember correctly). This would mean that you would need "keep alive" interrupts for longer times...

> Whenever the timer interrupt gets invoked it should keep executing the first element from the priority queue until the element at the head has a time stamp in the future. Then it should either busy-wait or reprogram the timer, depending on the needed delay compared to the overhead. All this should probably not be done by the timer interrupt itself, so perhaps we would use a kernel thread, a tasklet or something else. I haven't yet figured out exactly what would be best.

This is exactly what the run_timer_list code does, both in the standard kernel and in the high-res-timer code (not to be confused with the "tick less" experiment mentioned above). This code is run from a tasklet. You would need to take great care in using a kernel thread, as you could then lock out timer events with real-time tasks. I did do this (another time and system) with a kernel task that reset its own priority based on the timer event priority, so something like this is possible. It does, however, introduce latency in the event.

> The problem with this idea is primarily that a lot of code might need to be rewritten. We would also need another unit for the jiffies variable; nanoseconds would probably be a good choice. But this would probably require it to be a 64-bit integer even on 32-bit architectures. And jiffies should no longer be a variable but a function call. (This is in some way similar to what happened to current at some point.)

IMHO it is used far too often to be a function (overhead again). Also, for most things the 1/HZ resolution is satisfactory. We do need better resolution sometimes, but this usually comes at some cost, so it is best to use it only when needed.

> I have been considering a few implementation details on the i386 architecture. Obviously a combination of the PIT and the TSC will be needed. The TSC cannot produce interrupts, and the PIT cannot reliably be used to measure time when used as a one-shot timer.
>
> The PIT has a well-specified frequency; the TSC doesn't. I don't know about the accuracy of the two. We would need to know the frequency of the TSC, but I guess the kernel already measures that at bootup. I would suggest that the frequency, measured in Hz, be put in some proc pseudofile. Then there is an easy way for applications to read it, and root can change it if needed.

It is already in /proc.

> The possibility of changing this pseudofile would have two main purposes. First, it can be used on hardware where the boot-time measurement of the frequency is for some reason wrong. Second, a feature could be added to ntpd to change the frequency used by the kernel in order to slowly adjust the time towards the right value.

The actual value used to convert TSC to nanoseconds or microseconds is a scaled number which takes a small amount of math to compute. The result, however, is a very fast conversion when doing e.g. gettimeofday() (~0.65 microseconds on an 800 MHz PIII). So to do this you would have to tap on the kernel's shoulder and have it recompute these scaled numbers. This is not unreasonable, and IF we can get notice of throttling (which changes the TSC clock rate) this is exactly what we would want to do.

> The accuracy with which we know the TSC speed will affect the accuracy of the time. However, the accuracy of the PIT should not be that important. If interrupts come too early we will either busy-wait a very short time, or we will just schedule another interrupt. If interrupts come too late, some process will sleep slightly longer than it was supposed to, but that is no worse than what the kernel promises. And probably no worse than the current implementation.

The standards INSIST that we NEVER return too early, but allow late. The TSC and PIT are, in some (most?) hardware, driven from the same crystal, so they should always have the same error rate, usually a few PPM.

> With all this in place the scheduler no longer needs to use fixed-size timeslices; it could implement growing time slices and decreasing priorities for processes that need lots of CPU time, and OTOH use small time slices for interactive processes.
>
> There are still a few questions for which I do not yet know the answers:
> - How much overhead will the timer interrupts have with this new design?

See the "tick less" patch. It has instrumentation that measures this. The short answer is that the best you can do with the PIT is a 5 to 1 improvement (assuming keep-alive interrupts at ~500 ms). This, however, is not the problem. The real issue is the timer start/stop time incurred on each context switch. Most of the time there is a timer event closer than the end of the slice AND most of the time several context switches occur prior to the slice end. All this adds accounting time to the context switch, but does not add PIT programming time, since a close-in timer is usually active. The instrumentation shows this.

> - How much overhead will the needed 64-bit calculations in the kernel have?

Mostly none. You don't really need 64-bit stuff often enough to notice.

> - Will this be possible on all architectures on which Linux runs? And are there any architectures where this will be easy to implement?

There are certainly platforms with better timer interrupt hardware, and a couple with worse.

> - Is the TSC frequency reliable enough to be used for this?

Same rock as the PIT. The issue is throttling and power-management slowdowns.

> I'm absolutely sure that this will require a lot of work to implement, but with the right architecture I believe it would be a very good idea. Is it worth a try?

IMHO NO :)

--
George Anzinger george@mvista.com
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Real time sched: http://sourceforge.net/projects/rtsched/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml
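
The scaled TSC conversion mentioned in this reply (precompute a multiplier once, then convert with a multiply and a shift instead of a division) might look roughly like the following. This is a hedged sketch, not the kernel's actual code: the function names, the 22-bit shift and the use of nanoseconds are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fixed-point shift; 22 bits is an assumption of this sketch. */
#define TSC_SCALE_SHIFT 22

/* Slow path: compute the scaled multiplier, i.e. nanoseconds per cycle
 * times 2^TSC_SCALE_SHIFT. This is the "small amount of math" that would
 * be redone whenever the TSC frequency changes (for example on a
 * throttling notification). */
static uint64_t tsc_scale_mult(uint64_t tsc_hz)
{
    assert(tsc_hz > 0);
    return (1000000000ULL << TSC_SCALE_SHIFT) / tsc_hz;
}

/* Fast path: one multiply and one shift, no division.
 * cycles * mult must fit in 64 bits, so the result is only valid for
 * intervals shorter than about 2^42 ns (a bit over an hour). */
static uint64_t tsc_cycles_to_ns(uint64_t cycles, uint64_t mult)
{
    return (cycles * mult) >> TSC_SCALE_SHIFT;
}
```

At 800 MHz each cycle lasts 1.25 ns, so the multiplier works out to 1.25 * 2^22 = 5242880 and 800 cycles convert to 1000 ns. If the clock were throttled to 400 MHz, calling `tsc_scale_mult()` again with the new frequency would be enough to keep the fast path correct.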