* Re: RFC: THE OFFLINE SCHEDULER
From: raz ben yehuda @ 2009-08-23 9:09 UTC
To: Mike Galbraith
Cc: riel, mingo, peterz, andrew morton, wiseman, lkml, linux-rt-users

On Sun, 2009-08-23 at 07:21 +0200, Mike Galbraith wrote:
> On Sun, 2009-08-23 at 02:27 +0300, raz ben yehuda wrote:
> > The Open University of Israel
> > Department of Mathematics and Computer Science
> >
> > FINAL PAPER
> > OFFLINE SCHEDULER
> >
> > OFFSCHED is a platform for assigning a task to an offloaded processor. An offloaded processor is a processor that has been hot-unplugged from the operating system.
> >
> > Description
> >
> > In today's computer world, most processors have several embedded cores and hyper-threading. Most programmers do not really use these powerful features and let the operating system do the work.
> > At most, a programmer will bind an application to a certain processor or assign an interrupt to a different processor. In the end, we get a system busy maintaining tasks across processors, balancing interrupts, flushing TLBs and DTLBs with atomic operations even when not needed and, worst of all, spinning on locks across processors in vain; and the more processors, the merrier. I argue that in some cases, part of this behavior is due to the fact that the multiple-core operating system is not service oriented but system oriented. There is no easy way to assign a processor to perform a distinct service, undisturbed, accurate, and fast, as long as the processor is an active part of the operating system, while still having it share most of the operating system's address space.
> >
> > OFFSCHED Purpose
> >
> > The purpose of OFFSCHED is to create a platform for services. For example, assume a firewall is being attacked; the Linux operating system will generate an endless number of interrupts and/or softirqs to analyze the traffic and throw out bad packets. This comes at the expense of "good" packets. Have you ever tried to "ssh" to an attacked machine? Who protects the operating system?
> > What if we could simply do the packet analysis outside the operating system, without being interrupted?
> > Why not assign a core to do only "firewalling"? Or just routing? Design a new type of real time system? Maybe assign it as an ultra-accurate timer? Create a delaying service that does not just spin? Offload a TCP stack? Perhaps a new type of locking scheme? A new type of bottom halves? Debug a running kernel through an offloaded processor? Maybe assign a GPU to do things other than just graphics?
> > Amdahl's Law teaches us that linear speed-up is not very feasible, so why not spare a processor to do certain tasks better?
> > Technologically speaking, I am referring to the Linux kernel's ability to virtually hot-unplug a (SMT) processor; but instead of letting it wander in endless "halts", assign it a service.
>
> Seems to me this boils down to a different way to make a SW box in a HW
> box, which already exists. What does this provide that partitioning a
> box with csets and virtualization doesn't?

OFFSCHED does not compete with cpusets or virtualization. It is different.

1. Neither virtualization nor cpusets provide hard real time. OFFSCHED
does this at little cost and with no impact on the OS. OFFSCHED is not
just accurate, it is also extremely fast; after all, it is an NMI'd
processor.

2. OFFSCHED has access to every piece of memory in the system, so it can
act as a sentry for the system, or use Linux facilities. Also, the
kernel can access OFFSCHED memory; it is the same address space.

3. OFFSCHED can improve the Linux OS (NAPI, the OFFSCHED firewall,
RTOP), while a guest OS cannot.

4. cpusets cannot replace softirqs and hardirqs. OFFSCHED can. cpusets
deal with kernel threads and user-space threads. In OFFSCHED we use
offlets.

5. cpusets and virtualization are services provided by the kernel to the
"system". Who serves the kernel? Who protects the kernel?

6. Offlets give the programmer full control over an entire processor:
no preemption, no interrupts, no quiesce. You know what happens, and
when it happens.

I have had this hard real time system running for several years on my
SMP/MC/SMT machines. It serves me well. The core of the OFFSCHED patch
was 4 lines, so I simply compile an ***entirely regular*** Linux bzImage
and that's it. It does not mess with drivers, spinlocks, softirqs and so
on; OFFSCHED just directs cpu_down to my own hard real time piece of
code. The rest of the kernel remains the same.
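A minimal sketch of the kind of hook described above, in plain C. The
names offsched_register() and offsched_play_dead() are illustrative
assumptions, not the symbols the actual patch modifies:

    /*
     * Illustrative only: divert a hot-unplugged CPU from its idle
     * "halt" loop into a registered service routine (an "offlet").
     * The real OFFSCHED patch may hook a different point in the
     * cpu_down() path.
     */
    static void (*offsched_handler)(void);

    void offsched_register(void (*fn)(void))
    {
            offsched_handler = fn;
    }

    /* Runs on the dying CPU as the final step of cpu_down(). */
    void offsched_play_dead(void)
    {
            if (offsched_handler)
                    offsched_handler(); /* never returns: CPU runs the offlet */

            for (;;)
                    halt();             /* stock behaviour: park the CPU */
    }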
* Re: RFC: THE OFFLINE SCHEDULER
From: Mike Galbraith @ 2009-08-23 7:30 UTC
To: raz ben yehuda
Cc: riel, mingo, peterz, andrew morton, wiseman, lkml, linux-rt-users

On Sun, 2009-08-23 at 12:09 +0300, raz ben yehuda wrote:
> On Sun, 2009-08-23 at 07:21 +0200, Mike Galbraith wrote:
> > Seems to me this boils down to a different way to make a SW box in a HW box, which already exists. What does this provide that partitioning a box with csets and virtualization doesn't?
>
> OFFSCHED does not compete with cpusets or virtualization. It is different.
>
> 1. Neither virtualization nor cpusets provide hard real time. OFFSCHED does this at little cost and with no impact on the OS. OFFSCHED is not just accurate, it is also extremely fast; after all, it is an NMI'd processor.

Why not? Why can't I run an RT kernel with an RTOS guest and let it do
its deadline management thing?

> 2. OFFSCHED has access to every piece of memory in the system, so it can act as a sentry for the system, or use Linux facilities. Also, the kernel can access OFFSCHED memory; it is the same address space.

Hm. That appears to be a self-negating argument.

> 3. OFFSCHED can improve the Linux OS (NAPI, the OFFSCHED firewall, RTOP), while a guest OS cannot.
>
> 4. cpusets cannot replace softirqs and hardirqs. OFFSCHED can. cpusets deal with kernel threads and user-space threads. In OFFSCHED we use offlets.

Which still looks like OS-fu to me.

> 5. cpusets and virtualization are services provided by the kernel to the "system". Who serves the kernel? Who protects the kernel?

If either one can diddle the other's RAM, they are in no way isolated or
protected, so can't even defend against their own bugs.

What protects a hard RT deadline from VM pressure, memory bandwidth
consumption etc.? Looks to me like it's soft RT, because you can't
control the external variables.

> 6. Offlets give the programmer full control over an entire processor: no preemption, no interrupts, no quiesce. You know what happens, and when it happens.

If I can route interrupts such that only, say, network interrupts are
delivered to my cset/vm core, and the guest OS is a custom high speed
low drag application, I just don't see much difference.

> I have had this hard real time system running for several years on my SMP/MC/SMT machines. It serves me well. The core of the OFFSCHED patch was 4 lines, so I simply compile an ***entirely regular*** Linux bzImage and that's it. It does not mess with drivers, spinlocks, softirqs and so on; OFFSCHED just directs cpu_down to my own hard real time piece of code. The rest of the kernel remains the same.

Aaaaanyway, I'm not saying it's not a useful thing to do, just saying I
don't see any reason you can't get essentially the same result with
what's in the kernel now.

	-Mike
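For reference, the interrupt routing Mike mentions can be driven from
user space through /proc. A sketch, where IRQ 24 and CPU 3 are arbitrary
example values (real numbers come from /proc/interrupts):

    /*
     * Sketch: steer one IRQ to one CPU by writing a hex cpumask to
     * /proc/irq/<irq>/smp_affinity. Needs root.
     */
    #include <stdio.h>

    int main(void)
    {
            FILE *f = fopen("/proc/irq/24/smp_affinity", "w");

            if (!f) {
                    perror("smp_affinity");
                    return 1;
            }
            fprintf(f, "%x\n", 1u << 3); /* deliver IRQ 24 to CPU 3 only */
            fclose(f);
            return 0;
    }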
* Re: RFC: THE OFFLINE SCHEDULER
From: raz ben yehuda @ 2009-08-23 11:05 UTC
To: Mike Galbraith
Cc: riel, mingo, peterz, andrew morton, wiseman, lkml, linux-rt-users

On Sun, 2009-08-23 at 09:30 +0200, Mike Galbraith wrote:
> On Sun, 2009-08-23 at 12:09 +0300, raz ben yehuda wrote:
> > 1. Neither virtualization nor cpusets provide hard real time. OFFSCHED does this at little cost and with no impact on the OS. OFFSCHED is not just accurate, it is also extremely fast; after all, it is an NMI'd processor.
>
> Why not? Why can't I run an RT kernel with an RTOS guest and let it do
> its deadline management thing?

Have you ever tested how much a single context switch costs? Can you run
this system with 1us accuracy? You cannot. Try ftracing your system: the
interrupt alone costs several hundred nanoseconds. By the time you reach
your code, the deadline will be nearly gone.

> > 2. OFFSCHED has access to every piece of memory in the system, so it can act as a sentry for the system, or use Linux facilities. Also, the kernel can access OFFSCHED memory; it is the same address space.
>
> Hm. That appears to be a self-negating argument.

Correct. But I can receive packets over NAPI and transmit packets over
hard_start_xmit much faster than any guest OS. I can disable interrupts
and move to poll mode, thus helping the operating system. Can a guest OS
help Linux?

> > 3. OFFSCHED can improve the Linux OS (NAPI, the OFFSCHED firewall, RTOP), while a guest OS cannot.
> >
> > 4. cpusets cannot replace softirqs and hardirqs. OFFSCHED can. cpusets deal with kernel threads and user-space threads. In OFFSCHED we use offlets.
>
> Which still looks like OS-fu to me.

I do not understand this remark.

> > 5. cpusets and virtualization are services provided by the kernel to the "system". Who serves the kernel? Who protects the kernel?
>
> If either one can diddle the other's RAM, they are in no way isolated or
> protected, so can't even defend against their own bugs.

Correct, but the same applies to a hosting OS. As you said, it is a
self-negating argument. What if your system is attacked by an RT task
that saturates all CPU time? You will not even be able to know what is
wrong with your system. In OFFSCHED-RTOP I show that even when attacked
by a malicious task, I can still see the problem, because I can access
the task list and dump it to a remote machine. It is even possible to
"kill it" with the offlet server (I still need to write the killing
part).

> What protects a hard RT deadline from VM pressure, memory bandwidth
> consumption etc.? Looks to me like it's soft RT, because you can't
> control the external variables.

What protects a guest OS from the host? Also, in one of my applications
I wrote my own pre-allocation system; OFFSCHED used only its own pools,
so the VM was never a problem and it is a true hard real time system.
As for memory bandwidth pressure, OFFSCHED is not protected from it.
But if you design your application to use a small footprint, you will
be able to stay in the processor cache. When you have a kernel thread,
lazy TLB is not always promised. Can you say your RT task will never be
preempted? And again, if RTAI or anything of the like has facilities for
this kind of problem, OFFSCHED can use them as well.

> > 6. Offlets give the programmer full control over an entire processor: no preemption, no interrupts, no quiesce. You know what happens, and when it happens.
>
> If I can route interrupts such that only, say, network interrupts are
> delivered to my cset/vm core, and the guest OS is a custom high speed
> low drag application, I just don't see much difference.

There are other states a system must walk through; for example, a
processor must walk through a quiescent state, which means you cannot
have your real time thread running forever without losing the processor
from time to time. And how would you prevent RCU starvation? What about
IPIs?

> > I have had this hard real time system running for several years on my SMP/MC/SMT machines. It serves me well. The core of the OFFSCHED patch was 4 lines, so I simply compile an ***entirely regular*** Linux bzImage and that's it. It does not mess with drivers, spinlocks, softirqs and so on; OFFSCHED just directs cpu_down to my own hard real time piece of code. The rest of the kernel remains the same.
>
> Aaaaanyway, I'm not saying it's not a useful thing to do, just saying I
> don't see any reason you can't get essentially the same result with
> what's in the kernel now.

I thank you for your interest.

> 	-Mike
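A minimal sketch of the pre-allocation approach raz describes: carve all
memory out up front so the hard real time path never calls the kernel
allocator. Sizes and names here are illustrative, not the actual
OFFSCHED pool code:

    /*
     * Illustrative fixed-size pool: a static arena plus a free list,
     * both set up at init time. pool_alloc()/pool_free() are O(1),
     * lock-free, and intended for single-CPU (offlet) use only.
     */
    #include <stddef.h>

    #define POOL_OBJS  1024
    #define OBJ_SIZE   256

    static unsigned char pool_mem[POOL_OBJS][OBJ_SIZE];
    static void *free_list[POOL_OBJS];
    static int free_top;

    void pool_init(void)
    {
            for (free_top = 0; free_top < POOL_OBJS; free_top++)
                    free_list[free_top] = pool_mem[free_top];
    }

    void *pool_alloc(void)
    {
            return free_top ? free_list[--free_top] : NULL;
    }

    void pool_free(void *obj)
    {
            free_list[free_top++] = obj;
    }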
* Re: RFC: THE OFFLINE SCHEDULER
From: Mike Galbraith @ 2009-08-23 9:52 UTC
To: raz ben yehuda
Cc: riel, mingo, peterz, andrew morton, wiseman, lkml, linux-rt-users

On Sun, 2009-08-23 at 14:05 +0300, raz ben yehuda wrote:
> On Sun, 2009-08-23 at 09:30 +0200, Mike Galbraith wrote:
> > Why not? Why can't I run an RT kernel with an RTOS guest and let it do
> > its deadline management thing?
>
> Have you ever tested how much a single context switch costs? Can you run this system with 1us accuracy? You cannot. Try ftracing your system: the interrupt alone costs several hundred nanoseconds. By the time you reach your code, the deadline will be nearly gone.

I've measured context switch cost many times. The point, though, wasn't
how tight a constraint may be; you maintained that realtime was out the
window, and I didn't see any reason for that to be the case.

> > > 2. OFFSCHED has access to every piece of memory in the system, so it can act as a sentry for the system, or use Linux facilities. Also, the kernel can access OFFSCHED memory; it is the same address space.
> >
> > Hm. That appears to be a self-negating argument.
>
> Correct. But I can receive packets over NAPI and transmit packets over hard_start_xmit much faster than any guest OS. I can disable interrupts and move to poll mode, thus helping the operating system. Can a guest OS help Linux?

Depends entirely on the job at hand. If the job is running a firewall
in kernel mode, no, it won't cut the mustard. (No offense intended, but
this all sounds like a great big kernel module to me, one which doesn't
even taint the kernel.)

> > > 3. OFFSCHED can improve the Linux OS (NAPI, the OFFSCHED firewall, RTOP), while a guest OS cannot.
> > >
> > > 4. cpusets cannot replace softirqs and hardirqs. OFFSCHED can. cpusets deal with kernel threads and user-space threads. In OFFSCHED we use offlets.
> >
> > Which still looks like OS-fu to me.
>
> I do not understand this remark.

Whether it's an offlet, a tasklet, or insert-buzzword-of-the-day, it's
thread-of-execution management, which I called OS-fu, i.e. one of those
things that OSs do.

The rest I'll leave off replying to; we're kinda splitting hairs. I
don't see a big generic benefit to OFFSCHED or ilk, others do.

	-Mike
* Re: RFC: THE OFFLINE SCHEDULER
From: Christoph Lameter @ 2009-08-25 15:23 UTC
To: Mike Galbraith
Cc: raz ben yehuda, riel, mingo, peterz, andrew morton, wiseman, lkml, linux-rt-users

On Sun, 23 Aug 2009, Mike Galbraith wrote:
> The rest I'll leave off replying to; we're kinda splitting hairs. I
> don't see a big generic benefit to OFFSCHED or ilk, others do.

No, we are not splitting hairs. OFFSCHED takes the OS noise (interrupts,
timers, RCU, cacheline stealing etc.) out of certain processors. You
cannot run an undisturbed piece of software on the OS right now.
* Re: RFC: THE OFFLINE SCHEDULER
From: Mike Galbraith @ 2009-08-25 17:56 UTC
To: Christoph Lameter
Cc: raz ben yehuda, riel, mingo, peterz, andrew morton, wiseman, lkml, linux-rt-users

On Tue, 2009-08-25 at 11:23 -0400, Christoph Lameter wrote:
> On Sun, 23 Aug 2009, Mike Galbraith wrote:
> > The rest I'll leave off replying to; we're kinda splitting hairs. I
> > don't see a big generic benefit to OFFSCHED or ilk, others do.
>
> No, we are not splitting hairs. OFFSCHED takes the OS noise (interrupts,
> timers, RCU, cacheline stealing etc.) out of certain processors. You
> cannot run an undisturbed piece of software on the OS right now.

I asked the questions I did out of pure curiosity, and that curiosity
has been satisfied. It's not that I find it useless or whatnot (or that
my opinion matters to anyone but me ;). I personally find the concept of
injecting an RTOS into a general purpose OS with no isolation to be
alien. Intriguing, but very, very alien.

	-Mike
* Re: RFC: THE OFFLINE SCHEDULER
From: Christoph Lameter @ 2009-08-25 18:03 UTC
To: Mike Galbraith
Cc: raz ben yehuda, riel, mingo, peterz, andrew morton, wiseman, lkml, linux-rt-users

On Tue, 25 Aug 2009, Mike Galbraith wrote:
> I asked the questions I did out of pure curiosity, and that curiosity
> has been satisfied. It's not that I find it useless or whatnot (or that
> my opinion matters to anyone but me ;). I personally find the concept of
> injecting an RTOS into a general purpose OS with no isolation to be
> alien. Intriguing, but very very alien.

Well, let's work on the isolation piece then. We could run a regular
process on the RT cpu and switch back when OS services are needed?
* Re: RFC: THE OFFLINE SCHEDULER
From: Mike Galbraith @ 2009-08-25 18:12 UTC
To: Christoph Lameter
Cc: raz ben yehuda, riel, mingo, peterz, andrew morton, wiseman, lkml, linux-rt-users

On Tue, 2009-08-25 at 14:03 -0400, Christoph Lameter wrote:
> On Tue, 25 Aug 2009, Mike Galbraith wrote:
> > I asked the questions I did out of pure curiosity, and that curiosity
> > has been satisfied. It's not that I find it useless or whatnot (or that
> > my opinion matters to anyone but me ;). I personally find the concept of
> > injecting an RTOS into a general purpose OS with no isolation to be
> > alien. Intriguing, but very very alien.
>
> Well, let's work on the isolation piece then. We could run a regular
> process on the RT cpu and switch back when OS services are needed?

If there were isolation, that would make it much less alien to _me_.
Isolation would kinda destroy the reason it was written, though. The RT
application/OS is injected into the network stack, which is kinda cool,
but makes the hairs on my neck stand up.

	-Mike
* Fwd: RFC: THE OFFLINE SCHEDULER
From: Raz @ 2009-08-25 22:32 UTC
To: Linux Kernel, linux-rt-users

[-- Attachment #1: Type: text/plain, Size: 1209 bytes --]

There are other disturbances besides interrupts. Attached is a first
draft for the 11th real time conference (if it is accepted).

    tar zxvf offsched.tgz
    cd paper
    make
    kpdf offsched.pdf

thank you
raz

On Tue, Aug 25, 2009 at 9:12 PM, Mike Galbraith <efault@gmx.de> wrote:
> On Tue, 2009-08-25 at 14:03 -0400, Christoph Lameter wrote:
> > On Tue, 25 Aug 2009, Mike Galbraith wrote:
> > > I asked the questions I did out of pure curiosity, and that curiosity has been satisfied. It's not that I find it useless or whatnot (or that my opinion matters to anyone but me ;). I personally find the concept of injecting an RTOS into a general purpose OS with no isolation to be alien. Intriguing, but very very alien.
> >
> > Well, let's work on the isolation piece then. We could run a regular process on the RT cpu and switch back when OS services are needed?
>
> If there were isolation, that would make it much less alien to _me_. Isolation would kinda destroy the reason it was written, though. The RT application/OS is injected into the network stack, which is kinda cool, but makes the hairs on my neck stand up.
>
> 	-Mike

[-- Attachment #2: offsched.tgz --]
[-- Type: application/x-gzip, Size: 117378 bytes --]
* Re: RFC: THE OFFLINE SCHEDULER
From: Peter Zijlstra @ 2009-08-25 19:08 UTC
To: Christoph Lameter
Cc: Mike Galbraith, raz ben yehuda, riel, mingo, andrew morton, wiseman, lkml, linux-rt-users

On Tue, 2009-08-25 at 14:03 -0400, Christoph Lameter wrote:
> On Tue, 25 Aug 2009, Mike Galbraith wrote:
> > I asked the questions I did out of pure curiosity, and that curiosity
> > has been satisfied. It's not that I find it useless or whatnot (or that
> > my opinion matters to anyone but me ;). I personally find the concept of
> > injecting an RTOS into a general purpose OS with no isolation to be
> > alien. Intriguing, but very very alien.
>
> Well, let's work on the isolation piece then. We could run a regular
> process on the RT cpu and switch back when OS services are needed?

Christoph, stop being silly, this offline scheduler thing won't happen,
full stop.

It's not a maintainable solution, it doesn't integrate with existing
kernel infrastructure, and it's plain ugly.

If you want something to work within Linux, don't build kernels in
kernels or other such ugly hacks.
* Re: RFC: THE OFFLINE SCHEDULER
From: Christoph Lameter @ 2009-08-25 19:18 UTC
To: Peter Zijlstra
Cc: Mike Galbraith, raz ben yehuda, riel, mingo, andrew morton, wiseman, lkml, linux-rt-users

On Tue, 25 Aug 2009, Peter Zijlstra wrote:
> Christoph, stop being silly, this offline scheduler thing won't happen,
> full stop.

Well, there are still the low latency requirements. Those need to be
addressed in some form. Some of the ideas here are a starting point.

> It's not a maintainable solution, it doesn't integrate with existing
> kernel infrastructure, and it's plain ugly.
>
> If you want something to work within Linux, don't build kernels in
> kernels or other such ugly hacks.

OK, so how would you go about avoiding the OS noise which motivated the
patches for the offline scheduler?
* Re: RFC: THE OFFLINE SCHEDULER
From: Chris Friesen @ 2009-08-25 19:22 UTC
To: Peter Zijlstra
Cc: Christoph Lameter, Mike Galbraith, raz ben yehuda, riel, mingo, andrew morton, wiseman, lkml, linux-rt-users

On 08/25/2009 01:08 PM, Peter Zijlstra wrote:
> Christoph, stop being silly, this offline scheduler thing won't happen,
> full stop.
>
> It's not a maintainable solution, it doesn't integrate with existing
> kernel infrastructure, and it's plain ugly.
>
> If you want something to work within Linux, don't build kernels in
> kernels or other such ugly hacks.

Is it the whole concept of isolating one or more cpus from all normal
kernel tasks that you don't like, or just this particular
implementation?

I ask because I know of at least one project that would have used this
capability had it been available. As it stands, they have to live with
the usual kernel threads running on the cpu that they're trying to
dedicate to their app.

Chris
* Re: RFC: THE OFFLINE SCHEDULER
From: Sven-Thorsten Dietrich @ 2009-08-25 20:35 UTC
To: Chris Friesen
Cc: Peter Zijlstra, Christoph Lameter, Mike Galbraith, raz ben yehuda, riel, mingo, andrew morton, wiseman, lkml, linux-rt-users

On Tue, 2009-08-25 at 13:22 -0600, Chris Friesen wrote:
> Is it the whole concept of isolating one or more cpus from all normal
> kernel tasks that you don't like, or just this particular
> implementation?
>
> I ask because I know of at least one project that would have used this
> capability had it been available. As it stands, they have to live with
> the usual kernel threads running on the cpu that they're trying to
> dedicate to their app.

It's already possible to *almost* vacate a CPU, except for a handful of
kernel threads. There are various hacks being distributed which also
offload / suppress timer and RCU activity on specific CPUs.

Everything I have looked at has been hackish and racy, and no one using
this is pushing any of it upstream.

OFFLINING solves the problem in a minimalist way, and only for tasks
with very limited interaction with the kernel.

In contrast, however, almost all tasks with such limited kernel
interaction should be able to do fine under PREEMPT_RT after some cpuset
work.

For those which absolutely cannot handle a handful of kernel threads
sharing the CPU, the only option today is one or another form of
hackery, and amongst those options this one would seem attractive by its
mere simplicity.

But complete CPU isolation for user-space tasks still eludes us.

Sven
* Re: RFC: THE OFFLINE SCHEDULER
From: Peter Zijlstra @ 2009-08-26 5:31 UTC
To: Chris Friesen
Cc: Christoph Lameter, Mike Galbraith, raz ben yehuda, riel, mingo, andrew morton, wiseman, lkml, linux-rt-users

On Tue, 2009-08-25 at 13:22 -0600, Chris Friesen wrote:
> Is it the whole concept of isolating one or more cpus from all normal
> kernel tasks that you don't like, or just this particular
> implementation?
>
> I ask because I know of at least one project that would have used this
> capability had it been available. As it stands, they have to live with
> the usual kernel threads running on the cpu that they're trying to
> dedicate to their app.

It's the simple fact of going around the kernel instead of using the
kernel.

Going around the kernel doesn't benefit anybody, least of all Linux.

So it's the concept of running stuff on a CPU outside of Linux that I
don't like. I mean, if you want that, go ahead and run RTLinux, RTAI,
L4-Linux etc.; there are lots of special non-Linux hypervisor- or
exo-kernel-like things around for you to run things outside Linux with.
* Re: RFC: THE OFFLINE SCHEDULER
From: raz ben yehuda @ 2009-08-26 10:29 UTC
To: Peter Zijlstra
Cc: Chris Friesen, Christoph Lameter, Mike Galbraith, riel, mingo, andrew morton, wiseman, lkml, linux-rt-users

On Wed, 2009-08-26 at 07:31 +0200, Peter Zijlstra wrote:
> It's the simple fact of going around the kernel instead of using the
> kernel.
>
> Going around the kernel doesn't benefit anybody, least of all Linux.
>
> So it's the concept of running stuff on a CPU outside of Linux that I
> don't like. I mean, if you want that, go ahead and run RTLinux, RTAI,
> L4-Linux etc.; there are lots of special non-Linux hypervisor- or
> exo-kernel-like things around for you to run things outside Linux with.

Hello Peter, hello all.

First, it is a pleasure seeing that you take an interest in OFFSCHED. So
thank you.

In my opinion this is a matter of defining what a system is. Queuing
theory teaches us that a system is defined as everything within the
boundary of the computer; this includes peripherals, processors, RAM,
the operating system, the distribution and so on. The kernel is merely a
part of the SYSTEM, it is not THE SYSTEM; and it is not blasphemy to
bypass it. The kernel is not the goal and it is not sacred.

OFFSCHED is a bad name for my project. My project is called SOS =
Service Oriented System. SOS has nothing to do with real time. SOS is
about arranging the processors to serve the SYSTEM the best way we can;
if the kernel disturbs the service, put it aside, I say. How is the
kernel going to handle 32-processor machines? These numbers are no
longer science fiction.

What I am suggesting is merely a different approach to handling
multiple-core systems. Instead of thinking in processes, threads and so
on, I am thinking in services. Why not take a processor and define this
processor to do just firewalling? Encryption? Routing? Transmission?
Video processing... and so on.

Raz
* Re: RFC: THE OFFLINE SCHEDULER
From: Mike Galbraith @ 2009-08-26 8:02 UTC
To: raz ben yehuda
Cc: Peter Zijlstra, Chris Friesen, Christoph Lameter, riel, mingo, andrew morton, wiseman, lkml, linux-rt-users

On Wed, 2009-08-26 at 13:29 +0300, raz ben yehuda wrote:
> OFFSCHED is a bad name for my project. My project is called SOS =
> Service Oriented System. SOS has nothing to do with real time.

??

The paper you pointed me at maintains it's very much about realtime.

<quote>
This paper argues that OFFSCHED fits the niche of multiprocessor real
time systems by partitioning a system in two: the operating system and
OFFSCHED. OFFSCHED is a hybrid system. It is hybrid because it is both
real time and still a regular Linux server. Real time is mainly achieved
by the NMI characteristic and the CPU isolation. It is a hybrid system
because the OFFSCHED scheduler interacts with the operating system.
</quote>

	-Mike
* Re: RFC: THE OFFLINE SCHEDULER
From: Raz @ 2009-08-26 8:16 UTC
To: Mike Galbraith
Cc: Peter Zijlstra, Chris Friesen, Christoph Lameter, riel, mingo, andrew morton, wiseman, lkml, linux-rt-users

On Wed, Aug 26, 2009 at 10:02 AM, Mike Galbraith <efault@gmx.de> wrote:
> On Wed, 2009-08-26 at 13:29 +0300, raz ben yehuda wrote:
> > OFFSCHED is a bad name for my project. My project is called SOS =
> > Service Oriented System. SOS has nothing to do with real time.
>
> ??
>
> The paper you pointed me at maintains it's very much about realtime.

Hello Mike.

Correct. OFFSCHED has a real time facet that puts it in the SMP real
time system arena. It has other facets, such as security and monitoring.
If you take a look at OFFSCHED-RTOP and OFFSCHED-SECURED you will see
that I can actually get information and change system properties (a very
poor implementation so far, I am a bit tired...) even if the kernel is
not **accessible**.

Raz
* Re: RFC: THE OFFLINE SCHEDULER
From: Christoph Lameter @ 2009-08-26 13:47 UTC
To: raz ben yehuda
Cc: Peter Zijlstra, Chris Friesen, Mike Galbraith, riel, mingo, andrew morton, wiseman, lkml, linux-rt-users

On Wed, 26 Aug 2009, raz ben yehuda wrote:
> How is the kernel going to handle 32-processor machines? These numbers
> are no longer science fiction.

The kernel is already running on 4096-processor machines. Don't worry
about that.

> What I am suggesting is merely a different approach to handling
> multiple-core systems. Instead of thinking in processes, threads and so
> on, I am thinking in services. Why not take a processor and define this
> processor to do just firewalling? Encryption? Routing? Transmission?
> Video processing... and so on.

I think that is a valuable avenue to explore. What we do so far is treat
each processor equally. Dedicating a processor has benefits in terms of
cache hotness and limits OS noise.

Most of the large processor configurations already partition the system
using cpusets in order to limit the disturbance from OS processing. A
set of cpus is used for OS activities, and system daemons are put into
that set. But what can be done is limited, because the OS threads, as
well as interrupt and timer processing etc., cannot currently be moved.
The ideas that you are proposing are particularly useful for
applications that require low latencies and cannot easily tolerate OS
noise (Infiniband MPI-based jobs, for example).
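A sketch of the cpuset partitioning Christoph describes, using the
cpuset pseudo-filesystem of that era. The mount point /dev/cpuset and
the set name "service" are assumptions for illustration:

    /*
     * Sketch: carve CPUs 2-3 into an exclusive "service" cpuset and
     * move the calling task into it. Assumes the cpuset fs is mounted
     * at /dev/cpuset and the caller is root.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/stat.h>
    #include <unistd.h>

    static void write_file(const char *path, const char *val)
    {
            FILE *f = fopen(path, "w");

            if (!f) { perror(path); exit(1); }
            fputs(val, f);
            fclose(f);
    }

    int main(void)
    {
            char pid[16];

            mkdir("/dev/cpuset/service", 0755);
            write_file("/dev/cpuset/service/cpus", "2-3");
            write_file("/dev/cpuset/service/mems", "0");
            write_file("/dev/cpuset/service/cpu_exclusive", "1");

            snprintf(pid, sizeof(pid), "%d", getpid());
            write_file("/dev/cpuset/service/tasks", pid); /* move self in */
            return 0;
    }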
* Re: RFC: THE OFFLINE SCHEDULER
From: Maxim Levitsky @ 2009-08-26 14:45 UTC
To: Christoph Lameter
Cc: raz ben yehuda, Peter Zijlstra, Chris Friesen, Mike Galbraith, riel, mingo, andrew morton, wiseman, lkml, linux-rt-users

On Wed, 2009-08-26 at 09:47 -0400, Christoph Lameter wrote:
> On Wed, 26 Aug 2009, raz ben yehuda wrote:
> > What I am suggesting is merely a different approach to handling multiple-core systems. Instead of thinking in processes, threads and so on, I am thinking in services. Why not take a processor and define this processor to do just firewalling? Encryption? Routing? Transmission? Video processing... and so on.
>
> I think that is a valuable avenue to explore. What we do so far is treat each processor equally. Dedicating a processor has benefits in terms of cache hotness and limits OS noise.
>
> Most of the large processor configurations already partition the system using cpusets in order to limit the disturbance from OS processing. A set of cpus is used for OS activities, and system daemons are put into that set. But what can be done is limited, because the OS threads, as well as interrupt and timer processing etc., cannot currently be moved. The ideas that you are proposing are particularly useful for applications that require low latencies and cannot easily tolerate OS noise (Infiniband MPI-based jobs, for example).

My 0.2 cents:

I have always been fascinated by the idea of controlling another CPU
from the main CPU.

Usually these CPUs are custom, run proprietary software, and have no
datasheet for their I/O interfaces.

But being able to turn an ordinary CPU into something like that seems
very nice.

For example, it might help with profiling. Think about a program that
can run uninterrupted for as long as it wants.

It might even be better if the dedicated CPU used a predefined reserved
memory range (I wish there were a way to actually lock it to that
range).

On the other hand, I could see this as a jumping-off platform for more
proprietary code, something like: "we use Linux in our server platform,
but our <insert buzzword here> network stack pro+ can handle 100% more
load than Linux does, and it runs on a dedicated core..."

In other words, we might see 'firmwares' that take an entire CPU for
their own use.
* Re: RFC: THE OFFLINE SCHEDULER
From: raz ben yehuda @ 2009-08-26 14:54 UTC
To: Maxim Levitsky
Cc: Christoph Lameter, Peter Zijlstra, Chris Friesen, Mike Galbraith, riel, mingo, andrew morton, wiseman, lkml, linux-rt-users

On Wed, 2009-08-26 at 17:45 +0300, Maxim Levitsky wrote:
> I have always been fascinated by the idea of controlling another CPU from the main CPU.
>
> Usually these CPUs are custom, run proprietary software, and have no datasheet for their I/O interfaces.
>
> But being able to turn an ordinary CPU into something like that seems very nice.
>
> For example, it might help with profiling. Think about a program that can run uninterrupted for as long as it wants.
>
> I might even be better if the dedicated CPU used a predefined reserved memory range (I wish there were a way to actually lock it to that range).
>
> On the other hand, I could see this as a jumping-off platform for more proprietary code, something like: "we use Linux in our server platform, but our <insert buzzword here> network stack pro+ can handle 100% more load than Linux does, and it runs on a dedicated core..."
>
> In other words, we might see 'firmwares' that take an entire CPU for their own use.

This is exactly what offsched (SOS) is. You got it. SOS was partly
inspired by the notion of a GPU. Processors are going to become more and
more redundant, and Linux, as an evolutionary system, must use them. Why
not offload the raid5 write engine? Why not encrypt on a different
processor? Also, having so many processors in a single OS means a
bug-prone system, with endless contention points when two or more OS
processors interact. Let's make things simpler.
* Re: RFC: THE OFFLINE SCHEDULER
From: Pekka Enberg @ 2009-08-26 15:06 UTC
To: raz ben yehuda
Cc: Maxim Levitsky, Christoph Lameter, Peter Zijlstra, Chris Friesen, Mike Galbraith, riel, mingo, andrew morton, wiseman, lkml, linux-rt-users

On Wed, Aug 26, 2009 at 5:54 PM, raz ben yehuda <raziebe@gmail.com> wrote:
> > I have always been fascinated by the idea of controlling another CPU from the main CPU.
> >
> > Usually these CPUs are custom, run proprietary software, and have no datasheet for their I/O interfaces.
> >
> > But being able to turn an ordinary CPU into something like that seems very nice.
> >
> > In other words, we might see 'firmwares' that take an entire CPU for their own use.
>
> This is exactly what offsched (SOS) is. You got it. SOS was partly
> inspired by the notion of a GPU.

So where are the patches? The URL in the original post returns 404...
* Re: RFC: THE OFFLINE SCHEDULER
From: raz ben yehuda @ 2009-08-26 15:11 UTC
To: Pekka Enberg
Cc: Maxim Levitsky, Christoph Lameter, Peter Zijlstra, Chris Friesen, Mike Galbraith, riel, mingo, andrew morton, wiseman, lkml, linux-rt-users

SOS Linux is at:

    http://sos-linux.svn.sourceforge.net/viewvc/sos-linux/offsched/

You will find the modules, one-shot patches, split patches, and a
Documentation folder.

On Wed, 2009-08-26 at 18:06 +0300, Pekka Enberg wrote:
> On Wed, Aug 26, 2009 at 5:54 PM, raz ben yehuda <raziebe@gmail.com> wrote:
> > This is exactly what offsched (SOS) is. You got it. SOS was partly
> > inspired by the notion of a GPU.
>
> So where are the patches? The URL in the original post returns 404...
* Re: RFC: THE OFFLINE SCHEDULER
From: Peter Zijlstra @ 2009-08-26 15:30 UTC
To: raz ben yehuda
Cc: Maxim Levitsky, Christoph Lameter, Chris Friesen, Mike Galbraith, riel, mingo, andrew morton, wiseman, lkml, linux-rt-users

On Wed, 2009-08-26 at 17:54 +0300, raz ben yehuda wrote:
> This is exactly what offsched (SOS) is. You got it. SOS was partly
> inspired by the notion of a GPU.

It is not. GPUs and other paired chips form a hybrid system. Linux is
known to run on one or more of such chips and communicate through
whatever means these chips have. But what you propose here is hard
partitioning a homogeneous system, which is totally different.

> Processors are going to become more and more redundant, and Linux, as
> an evolutionary system, must use them. Why not offload the raid5 write
> engine? Why not encrypt on a different processor?

Why waste a whole cpu on something that could be done by part of one?

> Also, having so many processors in a single OS means a bug-prone
> system, with endless contention points when two or more OS processors
> interact.

You're bound to have interaction between the core OS and these
partitions you want, none of it different from how threads in the kernel
would interact, other than that you're going to re-invent everything
already present in the kernel.

> Let's make things simpler.

You don't; you make things more complex by introducing duplicate
functionality.

What's more, you burden the user with having to configure such a system,
and with making choices about giving up parts of his system; nothing
like that should be needed on a homogeneous system.

Work spent on trimming fat off the core kernel helps everybody, even
users not otherwise interested in things like giving up a whole cpu for
some odd purpose.

There is no reason something could be done more efficiently on a
dedicated CPU than not, when you assume a homogeneous system (which is
all Linux supports in the single image sense).

If you think the kernel is too fat and does superfluous things for your
needs, help trim it.
* Re: RFC: THE OFFLINE SCHEDULER
From: Christoph Lameter @ 2009-08-26 15:41 UTC
To: Peter Zijlstra
Cc: raz ben yehuda, Maxim Levitsky, Chris Friesen, Mike Galbraith, riel, mingo, andrew morton, wiseman, lkml, linux-rt-users

On Wed, 26 Aug 2009, Peter Zijlstra wrote:
> Why waste a whole cpu on something that could be done by part of one?

Because of latency and performance requirements.

> You're bound to have interaction between the core OS and these
> partitions you want, none of it different from how threads in the
> kernel would interact, other than that you're going to re-invent
> everything already present in the kernel.

The kernel interactions can be done while running on another (not
isolated) cpu.

> You don't; you make things more complex by introducing duplicate
> functionality.

The functionality does not exist. This is about new features.

> If you think the kernel is too fat and does superfluous things for
> your needs, help trim it.

Mind-boggling nonsense. Please stop fantasizing and trolling.
* Re: RFC: THE OFFLINE SCHEDULER
From: Peter Zijlstra @ 2009-08-26 16:03 UTC
To: Christoph Lameter
Cc: raz ben yehuda, Maxim Levitsky, Chris Friesen, Mike Galbraith, riel, mingo, andrew morton, wiseman, lkml, linux-rt-users

On Wed, 2009-08-26 at 11:41 -0400, Christoph Lameter wrote:
> On Wed, 26 Aug 2009, Peter Zijlstra wrote:
> > Why waste a whole cpu on something that could be done by part of one?
>
> Because of latency and performance requirements.

Latency is the only one, and yes, people have been using hacks like
this; I've also mentioned earlier RTAI, RTLinux and L4-Linux, which
basically do the same thing.

The problem is that it's not Linux: you cannot run something on these
off-cores and use the same functionality as Linux; if you could, it'd
not be offline.

The past year or so you've been whining about the tick latency, and I've
seen exactly _0_ patches from you slimming down the work done in there,
even though I pointed out some obvious things that could be done.

Carving out cpus just doesn't work in the long run (see below for more);
it adds configuration burdens on people, and it would duplicate
functionality (below) or provide it in a (near) useless manner.

If you were to work on lowering Linux latency in the full kernel sense,
you'd help out a lot of people, many use-cases would improve, and you'd
be helpful to the greater good. If you hack up special cases like this,
then only your one use-case gets better and the rest doesn't, or it
might actually get worse, because it got less attention.

> The kernel interactions can be done while running on another (not
> isolated) cpu.

There needs to be some communication between the isolated and
non-isolated parts, otherwise what's the point? Even if you let it
handle, say, a network device as a pure firewall, you'd need to
configure the thing, requiring interaction.

Interaction of any sort gets serialization requirements, and from there
on things tend to grow.

> The functionality does not exist. This is about new features.

It is not; he is proposing to use these cores for:

 - network stuff, we already have that
 - raid5 stuff, we already have that
 - other stuff we already have

Then there is the issue of what happens when a single core isn't
sufficient for the given task; then you'd need to split up, again
creating more interaction.

> Mind-boggling nonsense. Please stop fantasizing and trolling.

Oh, do lay down the crack-pipe and sod off.
* Re: RFC: THE OFFLINE SCHEDULER
From: Pekka Enberg @ 2009-08-26 16:16 UTC
To: Peter Zijlstra
Cc: Christoph Lameter, raz ben yehuda, Maxim Levitsky, Chris Friesen, Mike Galbraith, riel, mingo, andrew morton, wiseman, lkml, linux-rt-users

Hi Peter,

On Wed, Aug 26, 2009 at 7:03 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> There needs to be some communication between the isolated and
> non-isolated parts, otherwise what's the point? Even if you let it
> handle, say, a network device as a pure firewall, you'd need to
> configure the thing, requiring interaction.

The use case Christoph described was a user-space number-cruncher app
that does some network I/O over RDMA, IIRC. AFAICT, if he could isolate
a physical CPU for the thing, there would be little or no communication
with the non-isolated part. Yes, the setup sounds weird, but it's a real
workload, although pretty damn specialized.

> > Mind-boggling nonsense. Please stop fantasizing and trolling.
>
> Oh, do lay down the crack-pipe and sod off.

I guess I'll go for the magic mushrooms then.

			Pekka
* Re: RFC: THE OFFLINE SCHEDULER
From: Christoph Lameter @ 2009-08-26 16:20 UTC
To: Peter Zijlstra
Cc: raz ben yehuda, Maxim Levitsky, Chris Friesen, Mike Galbraith, riel, mingo, andrew morton, wiseman, lkml, linux-rt-users

On Wed, 26 Aug 2009, Peter Zijlstra wrote:
> Latency is the only one, and yes, people have been using hacks like
> this; I've also mentioned earlier RTAI, RTLinux and L4-Linux, which
> basically do the same thing.
>
> The problem is that it's not Linux: you cannot run something on these
> off-cores and use the same functionality as Linux; if you could, it'd
> not be offline.

Right. We discussed this. Why are you repeating the same old arguments?

> Carving out cpus just doesn't work in the long run (see below for
> more); it adds configuration burdens on people, and it would duplicate
> functionality (below) or provide it in a (near) useless manner.

It's pretty simple. Just isolate the cpu and forbid the OS to run
anything on it. Allow a user-space process to change its affinity to the
isolated cpu. Should the process be so stupid as to ask the OS for
services, then just switch it back to a regular processor. Interaction
is still possible via shared memory communication as well as
memory-mapped devices.

> If you hack up special cases like this, then only your one use-case
> gets better and the rest doesn't, or it might actually get worse,
> because it got less attention.

What special case? This is a generic mechanism.

> There needs to be some communication between the isolated and
> non-isolated parts, otherwise what's the point? Even if you let it
> handle, say, a network device as a pure firewall, you'd need to
> configure the thing, requiring interaction.

Shared memory, memory-mapped devices?

> Interaction of any sort gets serialization requirements, and from
> there on things tend to grow.

Yes, and there are mechanisms that provide the serialization without OS
services.

> It is not; he is proposing to use these cores for:
>
>  - network stuff, we already have that
>  - raid5 stuff, we already have that
>  - other stuff we already have

Right. I also want to use it for network stuff: Infiniband, which
supports memory-mapped registers and such. It's generic, not special as
you state.

> Then there is the issue of what happens when a single core isn't
> sufficient for the given task; then you'd need to split up, again
> creating more interaction.

Well, yes, you need to create synchronization methods that do not
require OS interaction.

> Oh, do lay down the crack-pipe and sod off.

Don't have one here. What's a sod off?
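A sketch of the affinity half of that idea: pin the calling process to
one CPU with sched_setaffinity(). CPU 3 is an arbitrary example; the
"switch back when the process asks for OS services" part would need
kernel support and is not shown:

    /*
     * Sketch: pin the calling process to CPU 3. Pairing this with
     * isolcpus= or an exclusive cpuset keeps everything else off
     * that CPU.
     */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
            cpu_set_t set;

            CPU_ZERO(&set);
            CPU_SET(3, &set);
            if (sched_setaffinity(0, sizeof(set), &set)) {
                    perror("sched_setaffinity");
                    return 1;
            }
            /* ... run the latency-sensitive loop here ... */
            return 0;
    }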
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-26 16:20 ` Christoph Lameter @ 2009-08-26 18:04 ` Ingo Molnar 2009-08-26 19:15 ` Christoph Lameter 0 siblings, 1 reply; 79+ messages in thread From: Ingo Molnar @ 2009-08-26 18:04 UTC (permalink / raw) To: Christoph Lameter Cc: Peter Zijlstra, raz ben yehuda, Maxim Levitsky, Chris Friesen, Mike Galbraith, riel, andrew motron, wiseman, lkml, linux-rt-users * Christoph Lameter <cl@linux-foundation.org> wrote: > On Wed, 26 Aug 2009, Peter Zijlstra wrote: > > > On Wed, 2009-08-26 at 11:41 -0400, Christoph Lameter wrote: > > > On Wed, 26 Aug 2009, Peter Zijlstra wrote: > > > > > > > Why waste a whole cpu for something that could be done by part of one? > > > > > > Because of latency and performance requirements > > > > Latency is the only one, and yes people have been using hacks > > like this, I've also earlier mentioned RTAI, RTLinux and > > L4-Linux which basically do the same thing. > > > > The problem is, that it's not linux, you cannot run something on > > these off-cores and use the same functionality as linux, if > > you could it'd not be offline. > > Right. We discussed this. Why are you repeating the same old > arguments? The thing is, you have cut out (and have not replied to) this crucial bit of what Peter wrote: > > The past year or so you've been whining about the tick latency, > > and I've seen exactly _0_ patches from you slimming down the > > work done in there, even though I pointed out some obvious > > things that could be done. ... which pretty much settles the issue as far as I'm concerned. If you were truly interested in a constructive solution to lower latencies in Linux you should have sent patches already for the low-hanging fruit Peter pointed out. Ingo ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-26 18:04 ` Ingo Molnar @ 2009-08-26 19:15 ` Christoph Lameter 2009-08-26 19:32 ` Ingo Molnar 2009-08-27 7:15 ` Mike Galbraith 0 siblings, 2 replies; 79+ messages in thread From: Christoph Lameter @ 2009-08-26 19:15 UTC (permalink / raw) To: Ingo Molnar Cc: Peter Zijlstra, raz ben yehuda, Maxim Levitsky, Chris Friesen, Mike Galbraith, riel, andrew motron, wiseman, lkml, linux-rt-users On Wed, 26 Aug 2009, Ingo Molnar wrote: > The thing is, you have cut out (and have not replied to) this > crucial bit of what Peter wrote: > > > > The past year or so you've been whining about the tick latency, > > > and I've seen exactly _0_ patches from you slimming down the > > > work done in there, even though I pointed out some obvious > > > things that could be done. > > ... which pretty much settles the issue as far as I'm concerned. If > you were truly interested in a constructive solution to lower > latencies in Linux you should have sent patches already for the > low-hanging fruit Peter pointed out. The noise latencies were already reduced years earlier to the minimum (f.e. the work on slab queue cleaning). Certainly more could be done there but that misses the point. The point of the OFFLINE scheduler is to completely eliminate the OS disturbances by getting rid of *all* OS processing on some cpus. For some reason scheduler developers seem to be threatened by this idea and they go into bizarre lines of argument to avoid the issue. It's simple and doable and the scheduler will still be there after we do this. ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-26 19:15 ` Christoph Lameter @ 2009-08-26 19:32 ` Ingo Molnar 2009-08-26 20:40 ` Christoph Lameter 2009-08-26 21:32 ` raz ben yehuda 1 sibling, 2 replies; 79+ messages in thread From: Ingo Molnar @ 2009-08-26 19:32 UTC (permalink / raw) To: Christoph Lameter Cc: Peter Zijlstra, raz ben yehuda, Maxim Levitsky, Chris Friesen, Mike Galbraith, riel, andrew motron, wiseman, lkml, linux-rt-users * Christoph Lameter <cl@linux-foundation.org> wrote: > On Wed, 26 Aug 2009, Ingo Molnar wrote: > > > The thing is, you have cut out (and have not replied to) this > > crucial bit of what Peter wrote: > > > > > > The past year or so you've been whining about the tick latency, > > > > and I've seen exactly _0_ patches from you slimming down the > > > > work done in there, even though I pointed out some obvious > > > > things that could be done. > > > > ... which pretty much settles the issue as far as I'm concerned. > > If you were truly interested in a constructive solution to lower > > latencies in Linux you should have sent patches already for the > > low-hanging fruit Peter pointed out. > > The noise latencies were already reduced years earlier to the > minimum (f.e. the work on slab queue cleaning). Certainly more > could be done there but that misses the point. Peter suggested various improvements to the timer tick related latencies _you_ were complaining about earlier this year. Those latencies sure were not addressed 'years earlier'. If you are unwilling to reduce the very latencies you apparently cared and complained about then you don't have much real standing to complain now. ( If you, on the other hand, were approaching this issue with pragmatism and with intellectual honesty, if you were at the end of a string of patches that gradually improved latencies but couldn't get them below a certain threshold, and if scheduler developers couldn't give you any ideas what else to improve, and _then_ suggested some other solution, you might have a point. You are far away from being able to claim that. ) Really, it's a straightforward application of Occam's Razor to the scheduler. We go for the simplest solution first, and try to help more people first, before going for some specialist hack. > The point of the OFFLINE scheduler is to completely eliminate the > OS disturbances by getting rid of *all* OS processing on some > cpus. > > For some reason scheduler developers seem to be threatened by this > idea and they go into bizarre lines of argument to avoid the > issue. It's simple and doable and the scheduler will still be there > after we do this. If you meant to include me in that summary categorization, I don't feel 'threatened' by any such patches (why would I? They don't seem to have sharp teeth nor any apparent poison fangs) - I simply concur with the reasons Peter listed that it is a technically inferior solution. Ingo ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-26 19:32 ` Ingo Molnar @ 2009-08-26 20:40 ` Christoph Lameter 2009-08-26 20:50 ` Andrew Morton 2009-08-26 21:08 ` Ingo Molnar 1 sibling, 2 replies; 79+ messages in thread From: Christoph Lameter @ 2009-08-26 20:40 UTC (permalink / raw) To: Ingo Molnar Cc: Peter Zijlstra, raz ben yehuda, Maxim Levitsky, Chris Friesen, Mike Galbraith, riel, andrew motron, wiseman, lkml, linux-rt-users On Wed, 26 Aug 2009, Ingo Molnar wrote: > ( If you, on the other hand, were approaching this issue with > pragmatism and with intellectual honesty, if you were at the end > of a string of patches that gradually improved latencies but > couldn't get them below a certain threshold, and if scheduler > developers couldn't give you any ideas what else to improve, and > _then_ suggested some other solution, you might have a point. > You are far away from being able to claim that. ) Intellectual honesty? I wish I were seeing it. So far there is not even the uptake required on your side to discuss the problem. There is no threshold. HPC and other industries want processors as a whole with all their abilities. They will squeeze the last bit of performance out of them. > to have sharp teeth nor any apparent poison fangs) - I simply concur > with the reasons Peter listed that it is a technically inferior > solution. OK, so you are saying that the reduction of OS latencies will make the processor completely available and have no disturbances like OFFLINE scheduling? Peter has not given a solution to the problem. Nor have you. ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-26 20:40 ` Christoph Lameter @ 2009-08-26 20:50 ` Andrew Morton 2009-08-26 21:09 ` Christoph Lameter ` (3 more replies) 2009-08-26 21:08 ` Ingo Molnar 1 sibling, 4 replies; 79+ messages in thread From: Andrew Morton @ 2009-08-26 20:50 UTC (permalink / raw) To: Christoph Lameter Cc: mingo, peterz, raziebe, maximlevitsky, cfriesen, efault, riel, wiseman, linux-kernel, linux-rt-users On Wed, 26 Aug 2009 16:40:09 -0400 (EDT) Christoph Lameter <cl@linux-foundation.org> wrote: > Peter has not given a solution to the problem. Nor have you. What problem? All I've seen is "I want 100% access to a CPU". That's not a problem statement - it's an implementation. What is the problem statement? My take on these patches: the kernel gives userspace unmediated access to memory resources if it wants that. The kernel gives userspace unmediated access to IO devices if it wants that. But for some reason people freak out at the thought of providing unmediated access to CPU resources. Don't get all religious about this. If the change is clean, maintainable and useful then there's no reason to not merge it. ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-26 20:50 ` Andrew Morton @ 2009-08-26 21:09 ` Christoph Lameter 2009-08-26 21:15 ` Chris Friesen ` (2 subsequent siblings) 3 siblings, 0 replies; 79+ messages in thread From: Christoph Lameter @ 2009-08-26 21:09 UTC (permalink / raw) To: Andrew Morton Cc: mingo, peterz, raziebe, maximlevitsky, cfriesen, efault, riel, wiseman, linux-kernel, linux-rt-users On Wed, 26 Aug 2009, Andrew Morton wrote: > All I've seen is "I want 100% access to a CPU". That's not a problem > statement - it's an implementation. Maybe. But it's a problem statement that I have seen in various industries. Multiple kernel hacks exist to do this in more or less contorted ways. We already have Linux scheduler functionality that partially does what is needed. See the isolcpus kernel parameter. isolcpus does not switch off OS sources of noise but it takes the processor away from the scheduler. We need a harder form of isolation where the excluded processors offer no OS services at all. > What is the problem statement? My definition (likely not covering all that the author of this patchset wants): How to make a processor in a multicore system completely available to a process. ^ permalink raw reply [flat|nested] 79+ messages in thread
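A minimal user-space sketch of the recipe Christoph outlines above - boot with isolcpus= so the scheduler leaves the core alone, then place the process there with the existing affinity call. This is an illustration only; it assumes the kernel was booted with "isolcpus=3", and the CPU number is arbitrary:

/* Sketch: pin the calling process to an isolated CPU.
 * Assumes the kernel was booted with "isolcpus=3"; the CPU
 * number is illustrative. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
        cpu_set_t set;

        CPU_ZERO(&set);
        CPU_SET(3, &set);       /* the isolcpus-reserved core */

        /* isolcpus only keeps the load balancer away; an explicit
         * affinity request may still place a task on the core. */
        if (sched_setaffinity(0, sizeof(set), &set) < 0) {
                perror("sched_setaffinity");
                exit(1);
        }

        for (;;)
                ;               /* the whole core; real work goes here */
}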
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-26 20:50 ` Andrew Morton 2009-08-26 21:09 ` Christoph Lameter @ 2009-08-26 21:15 ` Chris Friesen 2009-08-26 21:37 ` raz ben yehuda 2009-08-26 21:34 ` Ingo Molnar 2009-08-26 21:34 ` raz ben yehuda 3 siblings, 1 reply; 79+ messages in thread From: Chris Friesen @ 2009-08-26 21:15 UTC (permalink / raw) To: Andrew Morton Cc: Christoph Lameter, mingo, peterz, raziebe, maximlevitsky, efault, riel, wiseman, linux-kernel, linux-rt-users On 08/26/2009 02:50 PM, Andrew Morton wrote: > What problem? > > All I've seen is "I want 100% access to a CPU". That's not a problem > statement - it's an implementation. > > What is the problem statement? I can only speak for myself... In our case the problem statement was that we had an inherently single-threaded emulator app that we wanted to push as hard as absolutely possible. We gave it as close to a whole cpu as we could using cpu and irq affinity and we used message queues in shared memory to allow another cpu to handle I/O. In our case we still had kernel threads running on the app cpu, but if we'd had a straightforward way to avoid them we would have used it. Chris ^ permalink raw reply [flat|nested] 79+ messages in thread
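The shared-memory message-queue arrangement Chris describes can be as small as a single-producer/single-consumer ring; a sketch under the assumption of one I/O CPU producing and one app CPU consuming - the types, sizes and barrier choice are illustrative, not anything from his implementation:

/* Sketch: lock-free SPSC ring in shared memory.  The I/O CPU calls
 * ring_put(), the dedicated app CPU calls ring_get(); no syscalls on
 * the fast path. */
#define RING_SIZE 256                       /* power of two */

struct ring {
        volatile unsigned int head;         /* producer-written */
        volatile unsigned int tail;         /* consumer-written */
        void *slot[RING_SIZE];
};

static inline int ring_put(struct ring *r, void *msg)
{
        unsigned int head = r->head;

        if (head - r->tail == RING_SIZE)
                return -1;                  /* full */
        r->slot[head & (RING_SIZE - 1)] = msg;
        __sync_synchronize();               /* publish slot before head */
        r->head = head + 1;
        return 0;
}

static inline void *ring_get(struct ring *r)
{
        unsigned int tail = r->tail;
        void *msg;

        if (tail == r->head)
                return NULL;                /* empty */
        msg = r->slot[tail & (RING_SIZE - 1)];
        __sync_synchronize();               /* read slot before release */
        r->tail = tail + 1;
        return msg;
}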
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-26 21:15 ` Chris Friesen @ 2009-08-26 21:37 ` raz ben yehuda 2009-08-27 16:51 ` Chris Friesen 0 siblings, 1 reply; 79+ messages in thread From: raz ben yehuda @ 2009-08-26 21:37 UTC (permalink / raw) To: Chris Friesen Cc: Andrew Morton, Christoph Lameter, mingo, peterz, maximlevitsky, efault, riel, wiseman, linux-kernel, linux-rt-users On Wed, 2009-08-26 at 15:15 -0600, Chris Friesen wrote: > On 08/26/2009 02:50 PM, Andrew Morton wrote: > > > What problem? > > > > All I've seen is "I want 100% access to a CPU". That's not a problem > > statement - it's an implementation. > > > > What is the problem statement? > > I can only speak for myself... > > In our case the problem statement was that we had an inherently > single-threaded emulator app that we wanted to push as hard as > absolutely possible. > > We gave it as close to a whole cpu as we could using cpu and irq > affinity and we used message queues in shared memory to allow another > cpu to handle I/O. In our case we still had kernel threads running on > the app cpu, but if we'd had a straightforward way to avoid them we > would have used it. > > Chris Chris, I offer to help anyone who wishes to apply OFFSCHED. ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-26 21:37 ` raz ben yehuda @ 2009-08-27 16:51 ` Chris Friesen 2009-08-27 17:04 ` Christoph Lameter 2009-08-27 21:33 ` raz ben yehuda 0 siblings, 2 replies; 79+ messages in thread From: Chris Friesen @ 2009-08-27 16:51 UTC (permalink / raw) To: raz ben yehuda Cc: Andrew Morton, Christoph Lameter, mingo, peterz, maximlevitsky, efault, riel, wiseman, linux-kernel, linux-rt-users On 08/26/2009 03:37 PM, raz ben yehuda wrote: > > On Wed, 2009-08-26 at 15:15 -0600, Chris Friesen wrote: >> We gave it as close to a whole cpu as we could using cpu and irq >> affinity and we used message queues in shared memory to allow another >> cpu to handle I/O. In our case we still had kernel threads running on >> the app cpu, but if we'd had a straightforward way to avoid them we >> would have used it. > Chris. I offer myself to help anyone wishes to apply OFFSCHED. I just went and read the docs. One of the things I noticed is that it says that the offlined cpu cannot run userspace tasks. For our situation that's a showstopper, unfortunately. Chris ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-27 16:51 ` Chris Friesen @ 2009-08-27 17:04 ` Christoph Lameter 2009-08-27 21:09 ` Thomas Gleixner 1 sibling, 1 reply; 79+ messages in thread From: Christoph Lameter @ 2009-08-27 17:04 UTC (permalink / raw) To: Chris Friesen Cc: raz ben yehuda, Andrew Morton, mingo, peterz, maximlevitsky, efault, riel, wiseman, linux-kernel, linux-rt-users On Thu, 27 Aug 2009, Chris Friesen wrote: > I just went and read the docs. One of the things I noticed is that it > says that the offlined cpu cannot run userspace tasks. For our > situation that's a showstopper, unfortunately. It needs to be implemented the right way. Essentially this is a variation on the isolcpu kernel boot option. We probably need some syscall to move a user space process to a bare metal cpu since the cpu cannot be considered online in the regular sense. An isolated cpu can then only execute one process at a time. A process would do all initialization and lock its resources in memory before going to the isolated processor. Any attempt to use OS facilities needs to cause the process to be moved back to a cpu with OS services. ^ permalink raw reply [flat|nested] 79+ messages in thread
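The "initialize, lock, then migrate" sequence Christoph sketches could look like this from user space. The syscall he proposes does not exist, so plain sched_setaffinity() stands in for it; the CPU number and buffer size are illustrative:

/* Sketch: do all setup on a normal CPU, pin memory, then move to the
 * bare-metal CPU so no page fault or allocation happens there. */
#define _GNU_SOURCE
#include <sched.h>
#include <string.h>
#include <sys/mman.h>

#define ISOLATED_CPU 3

static char workbuf[1 << 20];          /* faulted in up front, below */

int enter_isolated_mode(void)
{
        cpu_set_t set;

        /* Touch and pin everything before leaving OS territory. */
        memset(workbuf, 0, sizeof(workbuf));
        if (mlockall(MCL_CURRENT | MCL_FUTURE) < 0)
                return -1;

        /* Stand-in for the proposed "move to bare-metal CPU" syscall. */
        CPU_ZERO(&set);
        CPU_SET(ISOLATED_CPU, &set);
        return sched_setaffinity(0, sizeof(set), &set);
}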
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-27 17:04 ` Christoph Lameter @ 2009-08-27 21:09 ` Thomas Gleixner 2009-08-27 22:22 ` Gregory Haskins ` (2 more replies) 0 siblings, 3 replies; 79+ messages in thread From: Thomas Gleixner @ 2009-08-27 21:09 UTC (permalink / raw) To: Christoph Lameter Cc: Chris Friesen, raz ben yehuda, Andrew Morton, mingo, peterz, maximlevitsky, efault, riel, wiseman, linux-kernel, linux-rt-users On Thu, 27 Aug 2009, Christoph Lameter wrote: > On Thu, 27 Aug 2009, Chris Friesen wrote: > > > I just went and read the docs. One of the things I noticed is that it > > says that the offlined cpu cannot run userspace tasks. For our > > situation that's a showstopper, unfortunately. > > It needs to be implemented the right way. Essentially this is a variation > on the isolcpu kernel boot option. We probably need some syscall to move > a user space process to a bare metal cpu since the cpu cannot be > considered online in the regular sense. It can. It needs to be flagged as reserved for special tasks and you need a separate mechanism to move and pin a task to such a CPU. > An isolated cpu can then only execute one process at a time. A process > would do all initialization and lock itsresources in memory before going > to the isolated processor. Any attempt to use OS facilities need to cause > the process to be moved back to a cpu with OS services. You are creating a "one special case" operation mode which is not justified in my opinion. Let's look at the problem you want to solve: Run exactly one thread on a dedicated CPU w/o any disturbance by the scheduler tick. You can move away anything else than the scheduler tick from a CPU today already w/o a single line of code change. But you want to impose restrictions like resource locking and moving back to another CPU in case of a syscall. What's the purpose of this ? It does not buy anything except additional complexity. That's just the wrong approach. All you need is a way to tell the kernel that CPUx can switch off the scheduler tick when only one thread is running and that very thread is running in user space. Once another thread arrives on that CPU or the single thread enters the kernel for a blocking syscall the scheduler tick has to be restarted. It's not rocket science to fix the well known issues of stopping and eventually restarting the scheduler tick, the CPU time accounting and some other small details. Such a modification would be of general use contrary to your proposed solution which is just a hack to solve your particular special case of operation. Thanks, tglx ^ permalink raw reply [flat|nested] 79+ messages in thread
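In rough pseudocode, the policy Thomas describes reduces to a check on the return-to-user path; this is purely conceptual, and every helper name here is invented for the sketch:

/* Conceptual only: cpu_tick_optional(), stop_sched_tick() and
 * restart_sched_tick() are made-up names, not kernel interfaces. */
static void on_return_to_user(struct rq *rq)
{
        /* one runnable task, about to run in user space: no tick needed */
        if (cpu_tick_optional(rq->cpu) && rq->nr_running == 1)
                stop_sched_tick(rq->cpu);
}

static void on_enqueue_or_syscall_entry(struct rq *rq)
{
        /* a second task arrived or we entered the kernel: tick back on */
        restart_sched_tick(rq->cpu);
}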
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-27 21:09 ` Thomas Gleixner @ 2009-08-27 22:22 ` Gregory Haskins 2009-08-28 2:15 ` Rik van Riel 2009-08-28 6:14 ` Peter Zijlstra 2009-08-27 23:51 ` Chris Friesen 2009-08-28 18:43 ` Christoph Lameter 2 siblings, 2 replies; 79+ messages in thread From: Gregory Haskins @ 2009-08-27 22:22 UTC (permalink / raw) To: Thomas Gleixner Cc: Christoph Lameter, Chris Friesen, raz ben yehuda, Andrew Morton, mingo, peterz, maximlevitsky, efault, riel, wiseman, linux-kernel, linux-rt-users [-- Attachment #1: Type: text/plain, Size: 3349 bytes --] Thomas Gleixner wrote: > On Thu, 27 Aug 2009, Christoph Lameter wrote: >> On Thu, 27 Aug 2009, Chris Friesen wrote: >> >>> I just went and read the docs. One of the things I noticed is that it >>> says that the offlined cpu cannot run userspace tasks. For our >>> situation that's a showstopper, unfortunately. >> It needs to be implemented the right way. Essentially this is a variation >> on the isolcpu kernel boot option. We probably need some syscall to move >> a user space process to a bare metal cpu since the cpu cannot be >> considered online in the regular sense. > > It can. It needs to be flagged as reserved for special tasks and you > need a separate mechanism to move and pin a task to such a CPU. > >> An isolated cpu can then only execute one process at a time. A process >> would do all initialization and lock its resources in memory before going >> to the isolated processor. Any attempt to use OS facilities needs to cause >> the process to be moved back to a cpu with OS services. > > You are creating a "one special case" operation mode which is not > justified in my opinion. Let's look at the problem you want to solve: > > Run exactly one thread on a dedicated CPU w/o any disturbance by the > scheduler tick. > > You can move away anything else than the scheduler tick from a CPU > today already w/o a single line of code change. > > But you want to impose restrictions like resource locking and moving > back to another CPU in case of a syscall. What's the purpose of this ? > It does not buy anything except additional complexity. > > That's just the wrong approach. All you need is a way to tell the > kernel that CPUx can switch off the scheduler tick when only one > thread is running and that very thread is running in user space. Once > another thread arrives on that CPU or the single thread enters the > kernel for a blocking syscall the scheduler tick has to be > restarted. > > It's not rocket science to fix the well known issues of stopping and > eventually restarting the scheduler tick, the CPU time accounting and > some other small details. Such a modification would be of general use > contrary to your proposed solution which is just a hack to solve your > particular special case of operation. I wonder if it makes sense to do something along the lines of the sched-class... IOW: What if we adopted one of the following models: 1) Create a new class that is higher prio than FIFO/RR and, when selected, disables the tick. 2) Modify FIFO so that it disables tick by default...update accounting info at next reschedule event. 3) Variation of 2..leave FIFO+tick as is by default, but have some kind of parameter to optionally disable tick if desired. In a way, we should probably consider (2) independently of this particular thread. FIFO doesn't need a tick anyway afaict...only a RESCHED+IPI truly ever matters here... or am I missing something obvious (probably w.r.t. accounting)?
You could then couple this solution with cpusets (possibly with a little work to get rid of any pesky per-cpu kthreads) to achieve the desired effect of interference-free operation. You wouldn't even have to have the funky rules alluded to above w.r.t. making sure only one userspace thread is running on the core. Thoughts? -Greg [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 267 bytes --] ^ permalink raw reply [flat|nested] 79+ messages in thread
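The cpuset half of Gregory's suggestion is plumbing that exists today; a sketch, assuming the cpuset filesystem is mounted at /dev/cpuset, with the partition name and CPU number chosen purely for illustration:

/* Sketch: carve CPU 3 into an exclusive cpuset and migrate a task in.
 * Error handling trimmed; assumes "mount -t cpuset none /dev/cpuset". */
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

static void put(const char *path, const char *val)
{
        FILE *f = fopen(path, "w");

        if (f) {
                fputs(val, f);
                fclose(f);
        }
}

int make_partition(pid_t pid)
{
        char buf[16];

        mkdir("/dev/cpuset/rt", 0755);
        put("/dev/cpuset/rt/cpus", "3");
        put("/dev/cpuset/rt/mems", "0");
        put("/dev/cpuset/rt/cpu_exclusive", "1");

        snprintf(buf, sizeof(buf), "%d", pid);
        put("/dev/cpuset/rt/tasks", buf);      /* migrate the task */
        return 0;
}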
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-27 22:22 ` Gregory Haskins @ 2009-08-28 2:15 ` Rik van Riel 2009-08-28 3:33 ` Gregory Haskins 2009-08-28 6:14 ` Peter Zijlstra 1 sibling, 1 reply; 79+ messages in thread From: Rik van Riel @ 2009-08-28 2:15 UTC (permalink / raw) To: Gregory Haskins Cc: Thomas Gleixner, Christoph Lameter, Chris Friesen, raz ben yehuda, Andrew Morton, mingo, peterz, maximlevitsky, efault, wiseman, linux-kernel, linux-rt-users Gregory Haskins wrote: > 2) Modify FIFO so that it disables tick by default...update accounting > info at next reschedule event. I like it. The only thing to watch out for is that events that wake up higher-priority FIFO tasks do not get deferred :) -- All rights reversed. ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-28 2:15 ` Rik van Riel @ 2009-08-28 3:33 ` Gregory Haskins 2009-08-28 4:27 ` Gregory Haskins 0 siblings, 1 reply; 79+ messages in thread From: Gregory Haskins @ 2009-08-28 3:33 UTC (permalink / raw) To: Rik van Riel Cc: Thomas Gleixner, Christoph Lameter, Chris Friesen, raz ben yehuda, Andrew Morton, mingo, peterz, maximlevitsky, efault, wiseman, linux-kernel, linux-rt-users [-- Attachment #1: Type: text/plain, Size: 1293 bytes --] Hi Rik, Rik van Riel wrote: > Gregory Haskins wrote: > >> 2) Modify FIFO so that it disables tick by default...update accounting >> info at next reschedule event. > > I like it. The only thing to watch out for is that > events that wake up higher-priority FIFO tasks do > not get deferred :) > Yeah, agreed. My (potentially half-baked) proposal should work at least from a pure scheduling perspective since FIFO technically does not reschedule based on a tick, and wakeups/migrations should still work bidirectionally with existing scheduler policies. However, and to what I believe is your point: it's not entirely clear to me what impact, if any, there would be w.r.t. any _other_ events that may be driven off of the scheduler tick (i.e. events other than scheduling policies, like timeslice expiration, etc). Perhaps someone else like Thomas, Ingo, or Peter has some input here. I guess the specific question to ask is: Does the scheduler tick code have any role other than timeslice policies and updating accounting information? Examples would include timer-expiry, for instance. I would think most of this logic is handled by finer grained components like HRT, but I am admittedly ignorant of the actual timer voodoo ;) Kind Regards, -Greg [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 267 bytes --] ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-28 3:33 ` Gregory Haskins @ 2009-08-28 4:27 ` Gregory Haskins 2009-08-28 10:26 ` Thomas Gleixner 0 siblings, 1 reply; 79+ messages in thread From: Gregory Haskins @ 2009-08-28 4:27 UTC (permalink / raw) To: Rik van Riel Cc: Thomas Gleixner, Christoph Lameter, Chris Friesen, raz ben yehuda, Andrew Morton, mingo, peterz, maximlevitsky, efault, wiseman, linux-kernel, linux-rt-users [-- Attachment #1: Type: text/plain, Size: 2572 bytes --] Gregory Haskins wrote: > Hi Rik, > > Rik van Riel wrote: >> Gregory Haskins wrote: >> >>> 2) Modify FIFO so that it disables tick by default...update accounting >>> info at next reschedule event. >> I like it. The only thing to watch out for is that >> events that wake up higher-priority FIFO tasks do >> not get deferred :) >> > > Yeah, agreed. My (potentially half-baked) proposal should work at least > from a pure scheduling perspective since FIFO technically does not > reschedule based on a tick, and wakeups/migrations should still work > bidirectionally with existing scheduler policies. > > However, and to what I believe is your point: it's not entirely clear to > me what impact, if any, there would be w.r.t. any _other_ events that > may be driven off of the scheduler tick (i.e. events other than > scheduling policies, like timeslice expiration, etc). Perhaps someone > else like Thomas, Ingo, or Peter has some input here. > > I guess the specific question to ask is: Does the scheduler tick code > have any role other than timeslice policies and updating accounting > information? Examples would include timer-expiry, for instance. I > would think most of this logic is handled by finer grained components > like HRT, but I am admittedly ignorant of the actual timer voodoo ;) > Thinking about this idea some more: I can't see why this isn't just a trivial variation of the nohz idle code already in mainline. In both cases (idle and FIFO tasks) the cpu is "consumed" 100% by some arbitrary job (spinning/HLT for idle, RT thread for FIFO) while we have the scheduler tick disabled. The only real difference is a matter of power-management (HLT/mwait go to sleep-states, whereas spinning/rt-task run full tilt). Therefore the answer may be as simple as bracketing the FIFO task with tick_nohz_stop_sched_tick() + tick_nohz_restart_sched_tick(). The nohz code will probably need some minor adjustments so it is not assuming things about the state being "idle" (e.g. "isidle") for places when it matters (idle_calls++ stat is one example). Potential problems: a) disabling/re-enabling the tick on a per-RT task schedule() may prove to be prohibitively expensive. b) we will need to make sure the rt-bandwidth protection mechanism is defeated so the task is allowed to consume 100% bandwidth. Perhaps these states should be in the cpuset/root-domain, and configured when you create the partition (e.g. "tick=off", "bandwidth=off" makes it an "offline" set). Kind Regards, -Greg [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 267 bytes --] ^ permalink raw reply [flat|nested] 79+ messages in thread
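As a shape only, the bracketing Gregory proposes might look like the sketch below. The two helpers are real (kernel/time/tick-sched.c), but today they assume they are called around the idle task, so the accounting and "isidle" adjustments he lists are exactly what is missing; nothing here should be read as working code:

/* Sketch: bracket a lone SCHED_FIFO task with the nohz helpers.
 * tick_nohz_stop_sched_tick()/tick_nohz_restart_sched_tick() exist
 * but are idle-loop-only today; making them safe here is the open
 * problem discussed above. */
static void fifo_enter_tickless(struct rq *rq)
{
        if (rq->nr_running == 1 && rt_task(rq->curr))
                tick_nohz_stop_sched_tick(0);   /* 0: not from idle */
}

static void fifo_exit_tickless(void)
{
        tick_nohz_restart_sched_tick();
        /* catch up CPU-time accounting for the tickless span here */
}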
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-28 4:27 ` Gregory Haskins @ 2009-08-28 10:26 ` Thomas Gleixner 2009-08-28 18:57 ` Christoph Lameter 0 siblings, 1 reply; 79+ messages in thread From: Thomas Gleixner @ 2009-08-28 10:26 UTC (permalink / raw) To: Gregory Haskins Cc: Rik van Riel, Christoph Lameter, Chris Friesen, raz ben yehuda, Andrew Morton, mingo, peterz, maximlevitsky, efault, wiseman, linux-kernel, linux-rt-users On Fri, 28 Aug 2009, Gregory Haskins wrote: > > However, and to what I believe is your point: it's not entirely clear to > > me what impact, if any, there would be w.r.t. any _other_ events that > > may be driven off of the scheduler tick (i.e. events other than > > scheduling policies, like timeslice expiration, etc). Perhaps someone > > else like Thomas, Ingo, or Peter has some input here. > > > > I guess the specific question to ask is: Does the scheduler tick code > > have any role other than timeslice policies and updating accounting > > information? Examples would include timer-expiry, for instance. I > > would think most of this logic is handled by finer grained components > > like HRT, but I am admittedly ignorant of the actual timer voodoo ;) There is not much happening in the scheduler tick: - accounting of CPU time. This can be delegated to some other CPU as long as the user space task is running and consuming 100% - timer list timers. If there is no service/device active on that CPU then there are no timers to run - rcu callbacks. Same as above, but might need some tweaking. - printk tick. Not really interesting - scheduler time slicing. Not necessary in such a context - posix cpu timers. Only interesting when the application uses them So there is not much which needs the tick in such a scenario. Of course we'd need to exclude that CPU from the do_timer duty as well. > Thinking about this idea some more: I can't see why this isn't just a > trivial variation of the nohz idle code already in mainline. In both > cases (idle and FIFO tasks) the cpu is "consumed" 100% by some arbitrary > job (spinning/HLT for idle, RT thread for FIFO) while we have the > scheduler tick disabled. The only real difference is a matter of > power-management (HLT/mwait go to sleep-states, whereas spinning/rt-task > run full tilt). > > Therefore the answer may be as simple as bracketing the FIFO task with > tick_nohz_stop_sched_tick() + tick_nohz_restart_sched_tick(). The nohz > code will probably need some minor adjustments so it is not assuming > things about the state being "idle" (e.g. "isidle") for places when it > matters (idle_calls++ stat is one example). Yeah, it's similar to what we do in nohz idle already, but we'd need to split out some of the functions very carefully to reuse them. > Potential problems: > > a) disabling/re-enabling the tick on a per-RT task schedule() may prove to > be prohibitively expensive. For a single task consuming 100% CPU it is a non-problem. You disable it once. But yes, on a standard system this needs to be investigated. > b) we will need to make sure the rt-bandwidth protection mechanism is > defeated so the task is allowed to consume 100% bandwidth. > > Perhaps these states should be in the cpuset/root-domain, and configured > when you create the partition (e.g. "tick=off", "bandwidth=off" makes it > an "offline" set). That makes sense and should not be rocket science to implement. Thanks, tglx ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-28 10:26 ` Thomas Gleixner @ 2009-08-28 18:57 ` Christoph Lameter 2009-08-28 19:23 ` Thomas Gleixner 0 siblings, 1 reply; 79+ messages in thread From: Christoph Lameter @ 2009-08-28 18:57 UTC (permalink / raw) To: Thomas Gleixner Cc: Gregory Haskins, Rik van Riel, Chris Friesen, raz ben yehuda, Andrew Morton, mingo, peterz, maximlevitsky, efault, wiseman, linux-kernel, linux-rt-users On Fri, 28 Aug 2009, Thomas Gleixner wrote: > That makes sense and should not be rocket science to implement. I like it and such a thing would do a lot for reducing noise. However, look at a typical task (from the HPC world) that would be running on an isolated processor. It would 1. Spin on some memory location waiting for an event. 2. Process data passed to it, prepare output data and then go back to 1. The enticing thing about doing 1 with shared memory and/or infiniband is that it can be done in a few hundred nanoseconds instead of 10-20 microseconds. This allows much faster IPC if we bypass the OS. For many uses deterministic responses are desired. If the handler that runs is never disturbed by extraneous processing (IPI, faults, irqs etc) then we can say that we run at the maximum speed that the machine can run at. That is what many sites expect. In an HPC environment synchronization points are essential and the frequency of synchronization points (where we spin on a cacheline) is important for the ability to scale the accuracy and the performance of the algorithm. If we can make N processors operate in a deterministic fashion on, f.e., an array of floating point numbers, then the rendezvous occurs with minimal wait time in each of the N processes. Getting rid of all sources of interruptions gets us the best performance possible. Right now, strong variability often makes it necessary to have long processing periods and to deal with long wait times because one of the N processes has not finished yet. ^ permalink raw reply [flat|nested] 79+ messages in thread
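The spin-on-a-cacheline rendezvous Christoph describes, reduced to a sketch; the generation-counter scheme and the x86 pause hint are illustrative choices, not anything from the patchset:

/* Sketch: N workers spin on a shared generation counter; the producer
 * bumps it to release them.  A few hundred ns through shared memory,
 * no syscall, no OS involvement on the isolated CPUs. */
struct rendezvous {
        volatile unsigned long gen;     /* one shared cacheline */
};

static void wait_for_event(struct rendezvous *r, unsigned long seen)
{
        while (r->gen == seen)
                __asm__ __volatile__("pause" ::: "memory");  /* x86 */
}

static void release_workers(struct rendezvous *r)
{
        __sync_synchronize();           /* data visible before release */
        r->gen++;
}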
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-28 18:57 ` Christoph Lameter @ 2009-08-28 19:23 ` Thomas Gleixner 2009-08-28 19:52 ` Christoph Lameter 0 siblings, 1 reply; 79+ messages in thread From: Thomas Gleixner @ 2009-08-28 19:23 UTC (permalink / raw) To: Christoph Lameter Cc: Gregory Haskins, Rik van Riel, Chris Friesen, raz ben yehuda, Andrew Morton, mingo, peterz, maximlevitsky, efault, wiseman, linux-kernel, linux-rt-users On Fri, 28 Aug 2009, Christoph Lameter wrote: > On Fri, 28 Aug 2009, Thomas Gleixner wrote: > > > That makes sense and should not be rocket science to implement. > > I like it and such a thing would do a lot for reducing noise. > > However, look at a typical task (from the HPC world) that would be > running on an isolated processors. It would > > 1. Spin on some memory location waiting for an event. > > 2. Process data passed to it, prepare output data and then go back to 1. > > The enticing thing about doing 1 with shared memory and/or infiniband is > that it can be done in a few hundred nanoseconds instead of 10-20 > microseconds. This allows a much faster IPC communication if we bypass > the OS. > > For many uses deterministic responses are desired. If the handler that > runs is never disturbed by extraneous processing (IPI, faults, irqs etc) > then we can say that we run at the maximum speed that the machine can run > at. That is what many sites expect. Right, and I think we can get there. The timer can be eliminated with some work. Faults shouldn't happen on that CPU and all other interrupts can be kept away with proper affinity settings. Softirqs should not happen on such a CPU either as there is neither a hardirq nor a user space task triggering them. Same applies for timers. So there are some remaining issues like IPIs, but I'm pretty sure that they can be tamed to zero as well. Thanks, tglx ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-28 19:23 ` Thomas Gleixner @ 2009-08-28 19:52 ` Christoph Lameter 2009-08-28 20:00 ` Thomas Gleixner 0 siblings, 1 reply; 79+ messages in thread From: Christoph Lameter @ 2009-08-28 19:52 UTC (permalink / raw) To: Thomas Gleixner Cc: Gregory Haskins, Rik van Riel, Chris Friesen, raz ben yehuda, Andrew Morton, mingo, peterz, maximlevitsky, efault, wiseman, linux-kernel, linux-rt-users On Fri, 28 Aug 2009, Thomas Gleixner wrote: > Right, and I think we can get there. The timer can be eliminated with > some work. Faults shouldn't happen on that CPU and all other > interrupts can be kept away with proper affinity settings. Softirqs > should not happen on such a CPU either as there is neither a hardirq > nor a user space task triggering them. Same applies for timers. So > there are some remaining issues like IPIs, but I'm pretty sure that > they can be tamed to zero as well. There are various timer generated thingies like vm statistics, slab queue management and device specific things that run on each processor. ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-28 19:52 ` Christoph Lameter @ 2009-08-28 20:00 ` Thomas Gleixner 2009-08-28 20:21 ` Christoph Lameter 0 siblings, 1 reply; 79+ messages in thread From: Thomas Gleixner @ 2009-08-28 20:00 UTC (permalink / raw) To: Christoph Lameter Cc: Gregory Haskins, Rik van Riel, Chris Friesen, raz ben yehuda, Andrew Morton, mingo, peterz, maximlevitsky, efault, wiseman, linux-kernel, linux-rt-users On Fri, 28 Aug 2009, Christoph Lameter wrote: > On Fri, 28 Aug 2009, Thomas Gleixner wrote: > > > Right, and I think we can get there. The timer can be eliminated with > > some work. Faults shouldn't happen on that CPU and all other > > interrupts can be kept away with proper affinity settings. Softirqs > > should not happen on such a CPU either as there is neither a hardirq > > nor a user space task triggering them. Same applies for timers. So > > there are some remaining issues like IPIs, but I'm pretty sure that > > they can be tamed to zero as well. > > There are various timer generated thingies like vm statistics, slab queue > management and device specific things that run on each processor. The statistics stuff needs to be tackled anyway as we need to offload the sched accounting to some other cpu. What slab queue stuff is running on timers and cannot be switched off in such a context? Device specific stuff should not happen on such a CPU when there is no device handled on it. Thanks, tglx ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-28 20:00 ` Thomas Gleixner @ 2009-08-28 20:21 ` Christoph Lameter 2009-08-28 20:34 ` Thomas Gleixner 2009-08-29 17:03 ` jim owens 0 siblings, 2 replies; 79+ messages in thread From: Christoph Lameter @ 2009-08-28 20:21 UTC (permalink / raw) To: Thomas Gleixner Cc: Gregory Haskins, Rik van Riel, Chris Friesen, raz ben yehuda, Andrew Morton, mingo, peterz, maximlevitsky, efault, wiseman, linux-kernel, linux-rt-users On Fri, 28 Aug 2009, Thomas Gleixner wrote: > The statistics stuff needs to be tackled anyway as we need to offload > the sched accounting to some other cpu. The vm statistics in mm/vmstat.c are different from the sched accounting. > What slab queue stuff is running on timers and cannot be switched off > in such a context? slab does run a timer every 2 seconds to age queues. If there was activity then there can be a relatively long time in which we periodically throw out portions of the cached data. > Device specific stuff should not happen on such a CPU when there is no > device handled on it. The device may periodically check for conditions that require action. ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-28 20:21 ` Christoph Lameter @ 2009-08-28 20:34 ` Thomas Gleixner 2009-08-31 19:19 ` Christoph Lameter 0 siblings, 1 reply; 79+ messages in thread From: Thomas Gleixner @ 2009-08-28 20:34 UTC (permalink / raw) To: Christoph Lameter Cc: Gregory Haskins, Rik van Riel, Chris Friesen, raz ben yehuda, Andrew Morton, mingo, peterz, maximlevitsky, efault, wiseman, linux-kernel, linux-rt-users On Fri, 28 Aug 2009, Christoph Lameter wrote: > On Fri, 28 Aug 2009, Thomas Gleixner wrote: > > > The statistics stuff needs to be tackled anyway as we need to offload > > the sched accounting to some other cpu. > > The vm statistics in mm/vmstat.c are different from the sched accounting. I know, but the problem is basically the same. Delegate the stats to someone else. > > What slab queue stuff is running on timers and cannot be switched off > > in such a context? > > slab does run a timer every 2 seconds to age queues. If there was activity > then there can be a relatively long time in which we periodically throw > out portions of the cached data. Right, but why does that affect a CPU which is marked "I'm not involved in that game" ? > > Device specific stuff should not happen on such a CPU when there is no > > device handled on it. > > The device may periodically check for conditions that require action. Errm. The device is associated to some other CPU, so why would it require action on an isolated one ? Or are you talking about a device which is associated to that isolated CPU ? Thanks, tglx ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-28 20:34 ` Thomas Gleixner @ 2009-08-31 19:19 ` Christoph Lameter 2009-08-31 17:44 ` Roland Dreier 0 siblings, 1 reply; 79+ messages in thread From: Christoph Lameter @ 2009-08-31 19:19 UTC (permalink / raw) To: Thomas Gleixner Cc: Gregory Haskins, Rik van Riel, Chris Friesen, raz ben yehuda, Andrew Morton, mingo, peterz, maximlevitsky, efault, wiseman, linux-kernel, linux-rt-users On Fri, 28 Aug 2009, Thomas Gleixner wrote: > > slab does run a timer every 2 seconds to age queues. If there was activity > > then there can be a relatively long time in which we periodically throw > > out portions of the cached data. > > Right, but why does that affect a CPU which is marked "I'm not > involved in that game" ? It's run unconditionally on every processor. The system needs to scan through all slabs and all queues to figure out if there is something to expire. > > The device may periodically check for conditions that require action. > > Errm. The device is associated to some other CPU, so why would it > require action on an isolated one ? Or are you talking about a device > which is associated to that isolated CPU ? Some devices now have per cpu threads. See f.e. Mellanox IB drivers. Maybe there is a way to restrict that. ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-31 19:19 ` Christoph Lameter @ 2009-08-31 17:44 ` Roland Dreier 2009-09-01 18:42 ` Christoph Lameter 0 siblings, 1 reply; 79+ messages in thread From: Roland Dreier @ 2009-08-31 17:44 UTC (permalink / raw) To: Christoph Lameter Cc: Thomas Gleixner, Gregory Haskins, Rik van Riel, Chris Friesen, raz ben yehuda, Andrew Morton, mingo, peterz, maximlevitsky, efault, wiseman, linux-kernel, linux-rt-users > Some devices now have per cpu threads. See f.e. Mellanox IB drivers. Maybe > there is a way to restrict that. AFAIK the Mellanox drivers just create a single-threaded workqueue. - R. ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-31 17:44 ` Roland Dreier @ 2009-09-01 18:42 ` Christoph Lameter 2009-09-01 16:15 ` Roland Dreier 0 siblings, 1 reply; 79+ messages in thread From: Christoph Lameter @ 2009-09-01 18:42 UTC (permalink / raw) To: Roland Dreier Cc: Thomas Gleixner, Gregory Haskins, Rik van Riel, Chris Friesen, raz ben yehuda, Andrew Morton, mingo, peterz, maximlevitsky, efault, wiseman, linux-kernel, linux-rt-users On Mon, 31 Aug 2009, Roland Dreier wrote: > > > Some devices now have per cpu threads. See f.e. Mellanox IB drivers. Maybe > > there is a way to restrict that. > > AFAIK the Mellanox drivers just create a single-threaded workqueue. Ok then these per cpu irqs are there to support something different? There are per cpu irqs here. Seems to be hardware supported?
 62:  13824 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  IR-PCI-MSI-edge  mlx4-comp-0
 63:      0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  IR-PCI-MSI-edge  mlx4-comp-1
 64:      0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  IR-PCI-MSI-edge  mlx4-comp-2
 65:      0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  IR-PCI-MSI-edge  mlx4-comp-3
 66:      0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  IR-PCI-MSI-edge  mlx4-comp-4
 67:      0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  IR-PCI-MSI-edge  mlx4-comp-5
 68:      0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  IR-PCI-MSI-edge  mlx4-comp-6
 69:      0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  IR-PCI-MSI-edge  mlx4-comp-7
 70:      0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  IR-PCI-MSI-edge  mlx4-comp-8
 71:      0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  IR-PCI-MSI-edge  mlx4-comp-9
 72:      0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  IR-PCI-MSI-edge  mlx4-comp-10
 73:      0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  IR-PCI-MSI-edge  mlx4-comp-11
 74:      0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  IR-PCI-MSI-edge  mlx4-comp-12
 75:      0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  IR-PCI-MSI-edge  mlx4-comp-13
 76:      0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  IR-PCI-MSI-edge  mlx4-comp-14
 77:      0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  IR-PCI-MSI-edge  mlx4-comp-15
 78:   3225 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  IR-PCI-MSI-edge  mlx4-async
 79:   1832 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  IR-PCI-MSI-edge  eth0-0
 80:   5546 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  IR-PCI-MSI-edge  eth0-1
 81:   2604 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  IR-PCI-MSI-edge  eth0-2
 82:    124 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  IR-PCI-MSI-edge  eth0-3
 83: 743126 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  IR-PCI-MSI-edge  eth0-4
 84:    857 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  IR-PCI-MSI-edge  eth0-5
 85:    321 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  IR-PCI-MSI-edge  eth0-6
 86:   2296 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  IR-PCI-MSI-edge  eth0-7
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-09-01 18:42 ` Christoph Lameter @ 2009-09-01 16:15 ` Roland Dreier 0 siblings, 0 replies; 79+ messages in thread From: Roland Dreier @ 2009-09-01 16:15 UTC (permalink / raw) To: Christoph Lameter Cc: Thomas Gleixner, Gregory Haskins, Rik van Riel, Chris Friesen, raz ben yehuda, Andrew Morton, mingo, peterz, maximlevitsky, efault, wiseman, linux-kernel, linux-rt-users > Ok then these per cpu irqs are there to support something different? There > are per cpu irqs here. Seems to be hardware supported? Yes, the driver now creates per-cpu IRQs for completions. However if you don't trigger any completion events then you won't get any interrupts. That's different from the workqueues, which are used to poll the hardware for port changes and internal errors (and which are single-threaded and can be put on whatever "system services" CPU you want) - R. ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-28 20:21 ` Christoph Lameter 2009-08-28 20:34 ` Thomas Gleixner @ 2009-08-29 17:03 ` jim owens 2009-08-31 19:22 ` Christoph Lameter 1 sibling, 1 reply; 79+ messages in thread From: jim owens @ 2009-08-29 17:03 UTC (permalink / raw) To: Christoph Lameter Cc: Thomas Gleixner, Gregory Haskins, Rik van Riel, Chris Friesen, raz ben yehuda, Andrew Morton, mingo, peterz, maximlevitsky, efault, wiseman, linux-kernel, linux-rt-users Christoph Lameter wrote: > On Fri, 28 Aug 2009, Thomas Gleixner wrote: >> What slab queue stuff is running on timers and cannot be switched off >> in such a context? > > slab does run a timer every 2 second to age queues. If there was activity > then there can be a relatively long time in which we periodically throw > out portions of the cached data. OK, you have me fully confused now. From other HPC people, I know the "no noise in my math application" requirement. But that means the user code that is running on the CPU must not do anything that wakes the kernel. Not even page faults, so they pin the memory at job start. Anything the user code does that needs kernel statistics or kernel action is "I must fix my user code", or "I accept that the noise is necessary". So we don't need to offload stats to other CPUs, stats are not needed. >> Device specific stuff should not happen on such a CPU when there is no >> device handled on it. > > The device may periodically check for conditions that require action. Again, what is this device and why is it controlled directly by user-space code. Devices should be controlled even in an HPC environment by the kernel. AFAIK HPC wants the kernel to be the bootstrap and data transfer manager running on a small subset of the total CPUs, with the dedicated CPUs running math jobs. jim ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-29 17:03 ` jim owens @ 2009-08-31 19:22 ` Christoph Lameter 2009-08-31 15:33 ` Peter Zijlstra 0 siblings, 1 reply; 79+ messages in thread From: Christoph Lameter @ 2009-08-31 19:22 UTC (permalink / raw) To: jim owens Cc: Thomas Gleixner, Gregory Haskins, Rik van Riel, Chris Friesen, raz ben yehuda, Andrew Morton, mingo, peterz, maximlevitsky, efault, wiseman, linux-kernel, linux-rt-users On Sat, 29 Aug 2009, jim owens wrote: > From other HPC people, I know the "no noise in my math application" > requirement. But that means the user code that is running on the > CPU must not do anything that wakes the kernel. Not even page faults, > so they pin the memory at job start. Right. > Anything the user code does that needs kernel statistics or > kernel action is "I must fix my user code", or "I accept that > the noise is necessary". > > So we don't need to offload stats to other CPUs, stats are not needed. Stats updates are performed whether needed or not. Same with slab expiration. That's why it's necessary to offline the cpu. ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-31 19:22 ` Christoph Lameter @ 2009-08-31 15:33 ` Peter Zijlstra 2009-09-01 18:46 ` Christoph Lameter 0 siblings, 1 reply; 79+ messages in thread From: Peter Zijlstra @ 2009-08-31 15:33 UTC (permalink / raw) To: Christoph Lameter Cc: jim owens, Thomas Gleixner, Gregory Haskins, Rik van Riel, Chris Friesen, raz ben yehuda, Andrew Morton, mingo, maximlevitsky, efault, wiseman, linux-kernel, linux-rt-users On Mon, 2009-08-31 at 15:22 -0400, Christoph Lameter wrote: > > Stats updates are performed whether needed or not. Same with slab expiration. > That's why it's necessary to offline the cpu. Or we fix it to not do anything when not needed. Come on, Christoph, this ain't rocket science. ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-31 15:33 ` Peter Zijlstra @ 2009-09-01 18:46 ` Christoph Lameter 0 siblings, 0 replies; 79+ messages in thread From: Christoph Lameter @ 2009-09-01 18:46 UTC (permalink / raw) To: Peter Zijlstra Cc: jim owens, Thomas Gleixner, Gregory Haskins, Rik van Riel, Chris Friesen, raz ben yehuda, Andrew Morton, mingo, maximlevitsky, efault, wiseman, linux-kernel, linux-rt-users On Mon, 31 Aug 2009, Peter Zijlstra wrote: > On Mon, 2009-08-31 at 15:22 -0400, Christoph Lameter wrote: > > > > Stats updates are performed whether needed or not. Same with slab expiration. > > That's why it's necessary to offline the cpu. > > Or we fix it to not do anything when not needed. > > Come on, Christoph, this ain't rocket science. Well, let's try it then. Not sure that I can be of much help as long as we have issues with UDP weirdness in the network layers. Bugs before feature work... Sigh. ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-27 22:22 ` Gregory Haskins 2009-08-28 2:15 ` Rik van Riel @ 2009-08-28 6:14 ` Peter Zijlstra 1 sibling, 0 replies; 79+ messages in thread From: Peter Zijlstra @ 2009-08-28 6:14 UTC (permalink / raw) To: Gregory Haskins Cc: Thomas Gleixner, Christoph Lameter, Chris Friesen, raz ben yehuda, Andrew Morton, mingo, maximlevitsky, efault, riel, wiseman, linux-kernel, linux-rt-users On Thu, 2009-08-27 at 18:22 -0400, Gregory Haskins wrote: > I wonder if it makes sense to do something along the lines of the > sched-class... Disabling the tick isn't a big deal from the scheduler's point of view, its all the other accounting crap that happens. ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-27 21:09 ` Thomas Gleixner 2009-08-27 22:22 ` Gregory Haskins @ 2009-08-27 23:51 ` Chris Friesen 2009-08-28 0:44 ` Thomas Gleixner 2009-08-28 18:43 ` Christoph Lameter 2 siblings, 1 reply; 79+ messages in thread From: Chris Friesen @ 2009-08-27 23:51 UTC (permalink / raw) To: Thomas Gleixner Cc: Christoph Lameter, raz ben yehuda, Andrew Morton, mingo, peterz, maximlevitsky, efault, riel, wiseman, linux-kernel, linux-rt-users On 08/27/2009 03:09 PM, Thomas Gleixner wrote: > That's just the wrong approach. All you need is a way to tell the > kernel that CPUx can switch off the scheduler tick when only one > thread is running and that very thread is running in user space. Once > another thread arrives on that CPU or the single thread enters the > kernel for a blocking syscall the scheduler tick has to be > restarted. That's an elegant approach...I like it. How would you deal with per-cpu kernel threads (softirqs, etc.) or softirq processing while in the kernel? Switching off the timer tick isn't sufficient because the scheduler will be triggered on the way back to userspace in a syscall. Chris ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-27 23:51 ` Chris Friesen @ 2009-08-28 0:44 ` Thomas Gleixner 2009-08-28 21:20 ` Chris Friesen 0 siblings, 1 reply; 79+ messages in thread From: Thomas Gleixner @ 2009-08-28 0:44 UTC (permalink / raw) To: Chris Friesen Cc: Christoph Lameter, raz ben yehuda, Andrew Morton, mingo, peterz, maximlevitsky, efault, riel, wiseman, linux-kernel, linux-rt-users On Thu, 27 Aug 2009, Chris Friesen wrote: > On 08/27/2009 03:09 PM, Thomas Gleixner wrote: > > > That's just the wrong approach. All you need is a way to tell the > > kernel that CPUx can switch off the scheduler tick when only one > > thread is running and that very thread is running in user space. Once > > another thread arrives on that CPU or the single thread enters the > > kernel for a blocking syscall the scheduler tick has to be > > restarted. > > That's an elegant approach...I like it. > > How would you deal with per-cpu kernel threads (softirqs, etc.) or > softirq processing while in the kernel? If you have pinned an interrupt to that CPU then you need to process the softirq for it as well. If that's the device your very single user space thread is talking to then you better want that, if you are not interested then simply pin that device irq to some other CPU: no irq -> no softirq. > Switching off the timer tick isn't sufficient because the scheduler > will be triggered on the way back to userspace in a syscall. If there is just one user space thread why is the NOOP call to the scheduler interesting ? If you go into the kernel you have some overhead anyway, so why would the few instructions to call schedule() and return with the same task (as it is the only runnable) matter ? Thanks, tglx ^ permalink raw reply [flat|nested] 79+ messages in thread
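The "pin the device irq to some other CPU" step Thomas mentions is a single mask write through the existing /proc interface; the irq number and mask below are illustrative:

/* Sketch: keep an IRQ away from the isolated CPU by restricting its
 * affinity mask, e.g. "7" = CPUs 0-2 on a 4-core box. */
#include <stdio.h>

static int pin_irq_away(int irq, const char *mask_hex)
{
        char path[64];
        FILE *f;

        snprintf(path, sizeof(path), "/proc/irq/%d/smp_affinity", irq);
        f = fopen(path, "w");
        if (!f)
                return -1;
        fputs(mask_hex, f);
        fclose(f);
        return 0;
}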
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-28 0:44 ` Thomas Gleixner @ 2009-08-28 21:20 ` Chris Friesen 0 siblings, 0 replies; 79+ messages in thread From: Chris Friesen @ 2009-08-28 21:20 UTC (permalink / raw) To: Thomas Gleixner Cc: Christoph Lameter, raz ben yehuda, Andrew Morton, mingo, peterz, maximlevitsky, efault, riel, wiseman, linux-kernel, linux-rt-users On 08/27/2009 06:44 PM, Thomas Gleixner wrote: > On Thu, 27 Aug 2009, Chris Friesen wrote: >> How would you deal with per-cpu kernel threads (softirqs, etc.) or >> softirq processing while in the kernel? > > If you have pinned an interrupt to that CPU then you need to process > the softirq for it as well. If that's the device your very single user > space thread is talking to then you better want that, if you are not > interested then simply pin that device irq to some other CPU: no irq > -> no softirq. Ah, okay. For some reason I had thought that the incoming work was queued up globally and might be handled by any softirq. Chris ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-27 21:09 ` Thomas Gleixner 2009-08-27 22:22 ` Gregory Haskins 2009-08-27 23:51 ` Chris Friesen @ 2009-08-28 18:43 ` Christoph Lameter 2 siblings, 0 replies; 79+ messages in thread From: Christoph Lameter @ 2009-08-28 18:43 UTC (permalink / raw) To: Thomas Gleixner Cc: Chris Friesen, raz ben yehuda, Andrew Morton, mingo, peterz, maximlevitsky, efault, riel, wiseman, linux-kernel, linux-rt-users On Thu, 27 Aug 2009, Thomas Gleixner wrote: > You are creating a "one special case" operation mode which is not > justified in my opinion. Let's look at the problem you want to solve: > > Run exactly one thread on a dedicated CPU w/o any disturbance by the > scheduler tick. That's not the problem I want to solve. There are multiple events that could disturb a process, like timers firing, softirqs and hardirqs. > You can move away anything else than the scheduler tick from a CPU > today already w/o a single line of code change. How do you remove the per processor kernel threads for allocators and other kernel subsystems? What about IPI broadcasts? ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-27 16:51 ` Chris Friesen 2009-08-27 17:04 ` Christoph Lameter @ 2009-08-27 21:33 ` raz ben yehuda 2009-08-27 22:05 ` Thomas Gleixner 1 sibling, 1 reply; 79+ messages in thread From: raz ben yehuda @ 2009-08-27 21:33 UTC (permalink / raw) To: Chris Friesen Cc: Andrew Morton, mingo, peterz, maximlevitsky, efault, riel, wiseman, linux-kernel, linux-rt-users On Thu, 2009-08-27 at 10:51 -0600, Chris Friesen wrote: > On 08/26/2009 03:37 PM, raz ben yehuda wrote: > > > > On Wed, 2009-08-26 at 15:15 -0600, Chris Friesen wrote: > > >> We gave it as close to a whole cpu as we could using cpu and irq >> affinity and we used message queues in shared memory to allow another >> cpu to handle I/O. In our case we still had kernel threads running on >> the app cpu, but if we'd had a straightforward way to avoid them we >> would have used it. > > > Chris, I offer to help anyone who wishes to apply OFFSCHED. > > I just went and read the docs. One of the things I noticed is that it > says that the offlined cpu cannot run userspace tasks. For our > situation that's a showstopper, unfortunately. Given that your entire software is of size T, and T' is the size of its real-time part, what is the ratio T'/T? If T'/T << 1 then dissect it, and put the T' in OFFSCHED. My software's T is about 100MB while the real-time section is about 60K. They communicate through simple ioctls. CPU isolation example: a transmission engine. In the image below, I am presenting 4 streaming engines, over 4 Intel 82598EB 10Gbps interfaces. A streaming engine is actually a Xeon E5420 2.5GHz. Each engine has ***full control*** over its own interface. So you can: 1. fully control the processor's usage. 2. know ***exactly*** how much each single packet transmission costs. For example, in this case on processor 3 the average single-packet transmission is 1974 TSCs, which is ~700ns. 3. know how many packets fail to transmit right **on time** (the Lates counter), and on time in this case means within the 122us jitter. 4. There are 8 cores in this machine. The remaining 4 OS cores are ~95% idle. The only resource these cores share is the bus. State: kstreamer UP.
Started at October 05 05:19:51
******************************************************
CPU 3, 63% usage, Sessions 1499, 6124301 kbps
CPU 5, 77% usage, Sessions 1499, 6123859 kbps
CPU 6, 78% usage, Sessions 1498, 6123709 kbps
CPU 7, 73% usage, Sessions 1498, 6117766 kbps
Summary: Throughput=24.489Gbps Sessions=5994
******************************************************
Streaming Processor 3
  Tx Count          : Tot=399990164 Good=399990164 Bad=0 ERR=0(LOC=0,FULL=0)
  Time              : GoodSendTsc(Max 1565895 Avg 1974) Lates=649
  Flow Errors       : Underflow (0,0) NotResched=0 GenErr=0
  Sessions          : Cur 1499(RTP=1499,UDP=0,MCAST=0,ENCRYPT=0) PAUSED=0
  CPU               : 63% usage
  Queue             : Max Size 92 Avg 69 Csc=79
  Throughput (bps)  : Tot 6271285040 MPEG 5988905440
  Throughput (Mbps) : Tot 6271 MPEG 5988
  Throughput        : Packets/sec 568855
Streaming Processor 5
  Tx Count          : Tot=399944597 Good=399944595 Bad=2 ERR=2(LOC=0,FULL=2)
  Time              : GoodSendTsc(Max 1566052 Avg 2464) Lates=5521
  Flow Errors       : Underflow (0,0) NotResched=0 GenErr=0
  Sessions          : Cur 1499(RTP=1499,UDP=0,MCAST=0,ENCRYPT=0) PAUSED=0
  CPU               : 77% usage
  Queue             : Max Size 95 Avg 69 Csc=79
  Throughput (bps)  : Tot 6270832416 MPEG 5988473792
  Throughput (Mbps) : Tot 6270 MPEG 5988
  Throughput        : Packets/sec 568814
Streaming Processor 6
  Tx Count          : Tot=399898586 Good=399898585 Bad=0 ERR=0(LOC=0,FULL=0)
  Time              : GoodSendTsc(Max 1661385 Avg 2474) Lates=8064
  Flow Errors       : Underflow (0,0) NotResched=0 GenErr=0
  Sessions          : Cur 1498(RTP=1498,UDP=0,MCAST=0,ENCRYPT=0) PAUSED=0
  CPU               : 78% usage
  Queue             : Max Size 91 Avg 69 Csc=87
  Throughput (bps)  : Tot 6270678560 MPEG 5988326400
  Throughput (Mbps) : Tot 6270 MPEG 5988
  Throughput        : Packets/sec 568800
Streaming Processor 7
  Tx Count          : Tot=399845166 Good=399845100 Bad=66 ERR=66(LOC=0,FULL=66)
  Time              : GoodSendTsc(Max 2962620 Avg 2377) Lates=42626
  Flow Errors       : Underflow (0,0) NotResched=0 GenErr=0
  Sessions          : Cur 1498(RTP=1498,UDP=0,MCAST=0,ENCRYPT=0) PAUSED=0
  CPU               : 73% usage
  Queue             : Max Size 94 Avg 69 Csc=66
  Throughput (bps)  : Tot 6264592672 MPEG 5982514944
  Throughput (Mbps) : Tot 6264 MPEG 5982
  Throughput        : Packets/sec 568248
--------------- Reservation Load Balancer ------------------
eth2 : 5994501 kbps
eth3 : 5994501 kbps
eth4 : 5990502 kbps
eth5 : 5990502 kbps
> Chris
>
^ permalink raw reply [flat|nested] 79+ messages in thread
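For reference, the GoodSendTsc numbers above are raw TSC deltas, so turning them into wall time only needs the machine's TSC rate. A small helper, assuming a constant-rate TSC (note that at exactly the quoted 2.5GHz, the Avg 1974 of processor 3 works out to roughly 790ns, so the ~700ns figure would correspond to a somewhat faster effective clock):

#include <stdio.h>

/* Convert a TSC delta to nanoseconds; tsc_hz must be the actual,
 * constant TSC frequency of the machine being measured. */
static double tsc_to_ns(unsigned long long tsc, double tsc_hz)
{
	return (double)tsc * 1e9 / tsc_hz;
}

int main(void)
{
	/* Average single-packet send cost reported for processor 3. */
	printf("%.0f ns\n", tsc_to_ns(1974, 2.5e9));
	return 0;
}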
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-27 21:33 ` raz ben yehuda @ 2009-08-27 22:05 ` Thomas Gleixner 2009-08-28 8:38 ` raz ben yehuda 0 siblings, 1 reply; 79+ messages in thread From: Thomas Gleixner @ 2009-08-27 22:05 UTC (permalink / raw) To: raz ben yehuda Cc: Chris Friesen, Andrew Morton, mingo, peterz, maximlevitsky, efault, riel, wiseman, linux-kernel, linux-rt-users On Fri, 28 Aug 2009, raz ben yehuda wrote: > On Thu, 2009-08-27 at 10:51 -0600, Chris Friesen wrote: > > On 08/26/2009 03:37 PM, raz ben yehuda wrote: > > > > > > On Wed, 2009-08-26 at 15:15 -0600, Chris Friesen wrote: > > > > >> We gave it as close to a whole cpu as we could using cpu and irq > > >> affinity and we used message queues in shared memory to allow another > > >> cpu to handle I/O. In our case we still had kernel threads running on > > >> the app cpu, but if we'd had a straightforward way to avoid them we > > >> would have used it. > > > > > Chris. I offer myself to help anyone wishes to apply OFFSCHED. > > > > I just went and read the docs. One of the things I noticed is that it > > says that the offlined cpu cannot run userspace tasks. For our > > situation that's a showstopper, unfortunately. > > Given that your entire software is T size , and T' is the amount of real > time size, what is the relation T'/T ? > If T'/T << 1 then dissect it, and put the T' in OFFSCHED. > My software T's is about 100MB while the real time section is about 60K. Chris was stating that your offlined cpu cannot run userspace tasks. How is your answer connected to Chris' statement ? Please stop useless marketing. LKML is about technical problems not advertisement. > They communicate through a simple ioctls. This is totally irrelevant and we all know how communication channels between Linux and whatever hackery (RT-Linux, RTAI, ... OFFLINE) works. > CPU isolation example: a transmission engine. > > In the image bellow, I am presenting 4 streaming engines, over 4 Intels > 82598EB 10Gbps. A streaming engine is actually a Xeon E5420 2.5GHz. > Each engine has ***full control*** over its own interface. So you can: > > 1. fully control the processor's usage. By disabling the OS control over the CPU resource. How innovative. > 2. know **exactly*** how much each single packet transmission costs. for > example, in this case in processor 3 a single packet average > transmission is 1974tscs, which is ~700ns. > > 3. know how many packets fails to transmit right **on time** ( the Lates > counter) . and on time in this case means within the 122us jitter. Are those statistics a crucial property of your OFFLINE thing ? > 4. There are 8 cores in this machine. The rest 4 OS cores are ~95% idle. > The only resource these cores share is the bus. That does not change the problem that you cannot run ordinary user space tasks on your offlined CPUs and you are simply hacking round the real problem which I outlined in my previous mail. Thanks, tglx ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-27 22:05 ` Thomas Gleixner @ 2009-08-28 8:38 ` raz ben yehuda 2009-08-28 10:05 ` Thomas Gleixner 2009-08-28 13:25 ` Rik van Riel 0 siblings, 2 replies; 79+ messages in thread From: raz ben yehuda @ 2009-08-28 8:38 UTC (permalink / raw) To: Thomas Gleixner Cc: Chris Friesen, Andrew Morton, mingo, peterz, maximlevitsky, efault, riel, wiseman, linux-kernel, linux-rt-users On Fri, 2009-08-28 at 00:05 +0200, Thomas Gleixner wrote: > On Fri, 28 Aug 2009, raz ben yehuda wrote: > > On Thu, 2009-08-27 at 10:51 -0600, Chris Friesen wrote: > > > On 08/26/2009 03:37 PM, raz ben yehuda wrote: > > > > > > > > On Wed, 2009-08-26 at 15:15 -0600, Chris Friesen wrote: > > > > > > >> We gave it as close to a whole cpu as we could using cpu and irq > > >> affinity and we used message queues in shared memory to allow another > > >> cpu to handle I/O. In our case we still had kernel threads running on > > >> the app cpu, but if we'd had a straightforward way to avoid them we > > >> would have used it. > > > > > > > Chris. I offer myself to help anyone wishes to apply OFFSCHED. > > > > > > I just went and read the docs. One of the things I noticed is that it > > > says that the offlined cpu cannot run userspace tasks. For our > > > situation that's a showstopper, unfortunately. > > > > Given that your entire software is T size , and T' is the amount of real > > time size, what is the relation T'/T ? > > If T'/T << 1 then dissect it, and put the T' in OFFSCHED. > > My software T's is about 100MB while the real time section is about 60K. > > Chris was stating that your offlined cpu cannot run userspace > tasks. How is your answer connected to Chris' statement ? Please stop > useless marketing. LKML is about technical problems not advertisement. > > > They communicate through a simple ioctls. > > This is totally irrelevant and we all know how communication channels > between Linux and whatever hackery (RT-Linux, RTAI, ... OFFLINE) > works. Why are you referring to the above projects as hacks? What is a hack? > > CPU isolation example: a transmission engine. > > > > In the image bellow, I am presenting 4 streaming engines, over 4 Intels > > 82598EB 10Gbps. A streaming engine is actually a Xeon E5420 2.5GHz. > > Each engine has ***full control*** over its own interface. So you can: > > > > 1. fully control the processor's usage. > > By disabling the OS control over the CPU resource. How innovative. I must say, when a John Doe like me receives this kind of response from a name like Thomas Gleixner, it hurts. > > 2. know **exactly*** how much each single packet transmission costs. for > > example, in this case in processor 3 a single packet average > > transmission is 1974tscs, which is ~700ns. > > > > 3. know how many packets fails to transmit right **on time** ( the Lates > > counter) . and on time in this case means within the 122us jitter. > > Are those statistics a crucial property of your OFFLINE thing ? Yes. Latency is a crucial property. > > 4. There are 8 cores in this machine. The rest 4 OS cores are ~95% idle. > > The only resource these cores share is the bus. > > That does not change the problem that you cannot run ordinary user > space tasks on your offlined CPUs and you are simply hacking round the > real problem which I outlined in my previous mail. OFFSCHED is not just about RT. It is about handing assignments to another resource outside the operating system, 
very much like GPUs, network processors, and so on, only with software that remains accessible to the OS. > Thanks, > > tglx ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-28 8:38 ` raz ben yehuda @ 2009-08-28 10:05 ` Thomas Gleixner 0 siblings, 0 replies; 79+ messages in thread From: Thomas Gleixner @ 2009-08-28 10:05 UTC (permalink / raw) To: raz ben yehuda Cc: Chris Friesen, Andrew Morton, mingo, peterz, maximlevitsky, efault, riel, wiseman, linux-kernel, linux-rt-users On Fri, 28 Aug 2009, raz ben yehuda wrote: > On Fri, 2009-08-28 at 00:05 +0200, Thomas Gleixner wrote: > > This is totally irrelevant and we all know how communication channels > > between Linux and whatever hackery (RT-Linux, RTAI, ... OFFLINE) > > works. > > Why are you referring to the above projects as hacks ? What is a hack ? Everything which works around the real problem instead of solving it. > > > 2. know **exactly*** how much each single packet transmission costs. for > > > example, in this case in processor 3 a single packet average > > > transmission is 1974tscs, which is ~700ns. > > > > > > 3. know how many packets fails to transmit right **on time** ( the Lates > > > counter) . and on time in this case means within the 122us jitter. > > > > Are those statistics a crucial property of your OFFLINE thing ? > yes. latency is a crucial property. You are not answering my question. Also, reducing latencies is something we want to do in the kernel proper in the first place. We all know that you can reduce the latencies by taking control away from the kernel and running a side show. But that's nothing new. It has been done for decades already and none of these projects has ever improved the kernel itself. > > > 4. There are 8 cores in this machine. The rest 4 OS cores are ~95% idle. > > > The only resource these cores share is the bus. > > > > That does not change the problem that you cannot run ordinary user > > space tasks on your offlined CPUs and you are simply hacking round the > > real problem which I outlined in my previous mail. > > OFFSCHED is not just about RT. it is about assigning assignments to > another resource outside the operating system. very much like GPUs, > network processors, and so on, but just with software that is > accessible to the OS. I was not talking about RT. I was talking about the problem that you cannot run an ordinary user space task on your offlined CPU. That's the main point where the design sucks. Specialized programming environments which impose tight restrictions on the application programmer for no good reason are horrible. Also, how are GPUs and network processors related to my statements ? Running specialized software on dedicated hardware which is an add-on to the base system controlled by the kernel is not new. There are enough real world applications running Linux on the main CPU and some extra code on an add-on DSP or whatever. Cell/SPU or the TI ARM/DSP combos are just the most obvious examples which come to my mind. Where is the point of OFFSCHED here ? In your earlier mails you talked about isolating cores of the base system by taking the control away from the kernel and what a wonderful solution this is because it allows you full control over that core. We can dedicate a core to special computations today and we can assign resources of any kind to it under the full control of the OS. The only disturbing factor is the scheduler tick. So you work around the scheduler tick problem by taking the core away from the OS. 
That does not solve the problem, it simply introduces a complete set of new problems:

 - accounting of CPU utilization excludes the offlined core
 - resource assignment is restricted to startup of the application
 - standard admin tools (top, ps ....) are not working
 - unnecessary restrictions for the application programmer:
   - no syscalls
   - no standard IPC
   - ....
 - debugging of the code which runs on the offlined core needs separate tools
 - performance analysis e.g. with profiling/performance counters cannot use the existing mechanisms
 - ....

Thanks,

	tglx

^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-28 8:38 ` raz ben yehuda 2009-08-28 10:05 ` Thomas Gleixner @ 2009-08-28 13:25 ` Rik van Riel 2009-08-28 13:37 ` jim owens 2009-08-28 15:22 ` raz ben yehuda 1 sibling, 2 replies; 79+ messages in thread From: Rik van Riel @ 2009-08-28 13:25 UTC (permalink / raw) To: raz ben yehuda Cc: Thomas Gleixner, Chris Friesen, Andrew Morton, mingo, peterz, maximlevitsky, efault, wiseman, linux-kernel, linux-rt-users raz ben yehuda wrote: > yes. latency is a crucial property. In the case of network packets, wouldn't you get a lower latency by transmitting the packet from the CPU that knows the packet should be transmitted, instead of sending an IPI to another CPU and waiting for that CPU to do the work? Inter-CPU communication has always been the bottleneck when it comes to SMP performance. Why does adding more inter-CPU communication make your system faster, instead of slower like one would expect? -- All rights reversed. ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-28 13:25 ` Rik van Riel @ 2009-08-28 13:37 ` jim owens 2009-08-28 15:22 ` raz ben yehuda 1 sibling, 0 replies; 79+ messages in thread From: jim owens @ 2009-08-28 13:37 UTC (permalink / raw) To: Rik van Riel Cc: raz ben yehuda, Thomas Gleixner, Chris Friesen, Andrew Morton, mingo, peterz, maximlevitsky, efault, wiseman, linux-kernel, linux-rt-users Rik van Riel wrote: > raz ben yehuda wrote: > >> yes. latency is a crucial property. > > In the case of network packets, wouldn't you get a lower > latency by transmitting the packet from the CPU that > knows the packet should be transmitted, instead of sending > an IPI to another CPU and waiting for that CPU to do the > work? > > Inter-CPU communication has always been the bottleneck > when it comes to SMP performance. Why does adding more > inter-CPU communication make your system faster, instead > of slower like one would expect? > Maybe just me being paranoid, but from the beginning this "use for dedicated IO processor" has scared the crap out of me. Reminds me of Winmodem... sell cheap hardware by stealing your CPU! The HPC FIFO user application on the other hand is a reasonable if somewhat edge-case specialized user batch job. jim ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-28 13:25 ` Rik van Riel 2009-08-28 13:37 ` jim owens @ 2009-08-28 15:22 ` raz ben yehuda 1 sibling, 0 replies; 79+ messages in thread From: raz ben yehuda @ 2009-08-28 15:22 UTC (permalink / raw) To: Rik van Riel Cc: Thomas Gleixner, Chris Friesen, Andrew Morton, mingo, peterz, maximlevitsky, efault, wiseman, linux-kernel, linux-rt-users On Fri, 2009-08-28 at 09:25 -0400, Rik van Riel wrote: > raz ben yehuda wrote: > > > yes. latency is a crucial property. > > In the case of network packets, wouldn't you get a lower > latency by transmitting the packet from the CPU that > knows the packet should be transmitted, instead of sending > an IPI to another CPU and waiting for that CPU to do the > work? Hello Rik. If I understand you correctly, you are asking whether I pass 1.5K packets to an offline CPU? If so, that is not what I do, because you are quite right, it would not make any sense. I do not pass packets to an offline cpu, I pass assignments. An assignment is a buffer with some context of what to do with it (like aio), and a buffer is ~1MB. Also, the offline processor holds the network interface as its own interface. No two offline processors transmit over a single interface (I modified the bonding driver to work with offline processors for that). I am aware of per-processor network queues, but benchmarks proved this was better (I do not have these benchmarks now). Also, these engines do not release any sk_buffs to the operating system; the packets are reused over and over to reduce the latency of memory allocation and to avoid cache misses. Also, in some cases I disabled the transmit interrupts and released packets (--skb->users was still greater than 0, so not really released) in an offline context; I learned this from the chelsio driver. This way, I removed more load from the operating system. It proved to be better in large 1Gbps arrays, and I was able to remove atomic_inc/atomic_dec in some variants of the code; atomic operations cost a lot. With MSI cards I did not find it useful; in the example I showed, I use MSI and the system is almost idle. Also, as I recall, an IPI will not reach an offloaded processor; OFFSCHED uses NMIs. Also, I would like to express my apologies if any of this correspondence seems as if I am trying to PR offsched. I am not. > Inter-CPU communication has always been the bottleneck > when it comes to SMP performance. Why does adding more > inter-CPU communication make your system faster, instead > of slower like one would expect? > ^ permalink raw reply [flat|nested] 79+ messages in thread
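The "assignments" raz describes can be pictured as a tiny aio-like submission API. A purely hypothetical sketch, since the thread never spells out the real OFFSCHED interface: the device node, ioctl number, and struct layout below are all invented for illustration.

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Everything named offsched_* here is hypothetical. */
struct offsched_assignment {
	void	*buf;	/* ~1MB payload for the offline engine */
	size_t	len;
	int	op;	/* what to do with it, aio-style */
};

#define OFFSCHED_SUBMIT	_IOW('o', 1, struct offsched_assignment)

static char payload[1 << 20];	/* the ~1MB buffer */

int main(void)
{
	struct offsched_assignment a = {
		.buf = payload, .len = sizeof(payload), .op = 0,
	};
	int fd = open("/dev/offsched0", O_RDWR);	/* hypothetical node */

	if (fd < 0) {
		perror("open");
		return 1;
	}
	if (ioctl(fd, OFFSCHED_SUBMIT, &a) < 0)
		perror("ioctl");
	close(fd);
	return 0;
}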
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-26 20:50 ` Andrew Morton 2009-08-26 21:09 ` Christoph Lameter 2009-08-26 21:15 ` Chris Friesen @ 2009-08-26 21:34 ` Ingo Molnar 2009-08-27 2:55 ` Frank Ch. Eigler 2009-08-26 21:34 ` raz ben yehuda 3 siblings, 1 reply; 79+ messages in thread From: Ingo Molnar @ 2009-08-26 21:34 UTC (permalink / raw) To: Andrew Morton Cc: Christoph Lameter, peterz, raziebe, maximlevitsky, cfriesen, efault, riel, wiseman, linux-kernel, linux-rt-users * Andrew Morton <akpm@linux-foundation.org> wrote: > On Wed, 26 Aug 2009 16:40:09 -0400 (EDT) > Christoph Lameter <cl@linux-foundation.org> wrote: > > > Peter has not given a solution to the problem. Nor have you. > > What problem? > > All I've seen is "I want 100% access to a CPU". That's not a problem > statement - it's an implementation. > > What is the problem statement? > > My take on these patches: the kernel gives userspace unmediated > access to memory resources if it wants that. The kernel gives > userspace unmediated access to IO devices if it wants that. But > for some reason people freak out at the thought of providing > unmediated access to CPU resources. Claiming all user-available CPU time from user-space is already possible: use SCHED_FIFO - the only question is the remaining latencies in the final 0.01% of CPU time you cannot claim via SCHED_FIFO. ( Btw., this scheduling feature was implemented in Linux well before raw IO block devices were implemented, so i'm not sure what you mean by 'freaking out'. ) What we are objecting to are these easy isolation side-hacks for the remaining 0.01% that fail to address the real problem: the latencies. Those latencies can hurt not just isolated apps but _non isolated_ (and latency critical) apps too, and what we insist on is getting the proper fixes, not just ugly workarounds that side-step the problem. ( a secondary objection is the extension and extra layering of something that could be done within existing APIs/ABIs too. We want to minimize the configuration space. ) > Don't get all religious about this. If the change is clean, > maintainable and useful then there's no reason to not merge it. Precisely. This feature as proposed here hinders the correct solution being implemented - and hence hurts long term maintainability and hence is a no-merge right now. [It also weakens the pressure to fix latencies for a much wider set of applications, hence hurts the quality of Linux in the long run. (i.e. is a net step backwards)] Ingo ^ permalink raw reply [flat|nested] 79+ messages in thread
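Claiming the CPU from user space the way Ingo describes takes only two existing syscalls. A minimal sketch using the stock affinity and SCHED_FIFO APIs (the CPU number and priority are arbitrary choices, and this needs RT privileges):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
	cpu_set_t set;
	struct sched_param sp = { .sched_priority = 50 };

	CPU_ZERO(&set);
	CPU_SET(3, &set);	/* pin ourselves to CPU 3 */
	if (sched_setaffinity(0, sizeof(set), &set) ||
	    sched_setscheduler(0, SCHED_FIFO, &sp)) {
		perror("sched");
		return 1;
	}
	/* From here on, only irqs, softirqs and higher-priority RT
	 * tasks can interfere: the residual latencies Ingo mentions. */
	for (;;)
		;	/* the latency-critical busy loop goes here */
}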
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-26 21:34 ` Ingo Molnar @ 2009-08-27 2:55 ` Frank Ch. Eigler 0 siblings, 0 replies; 79+ messages in thread From: Frank Ch. Eigler @ 2009-08-27 2:55 UTC (permalink / raw) To: Ingo Molnar Cc: Andrew Morton, Christoph Lameter, peterz, raziebe, maximlevitsky, cfriesen, efault, riel, wiseman, linux-kernel, linux-rt-users Ingo Molnar <mingo@elte.hu> writes: > [...] > >> Don't get all religious about this. If the change is clean, >> maintainable and useful then there's no reason to not merge it. > Precisely. This feature as proposed here hinders the correct > solution being implemented - and hence hurts long term > maintainability and hence is a no-merge right now. (Does it "hinder" this in any different way than the following, as in possibly reducing "pressure" for it?) > [It also weakens the pressure to fix latencies for a much wider set > of applications, hence hurts the quality of Linux in the long > run. (i.e. is a net step backwards)] How would you differentiate the above sentiment from "perfect is the enemy of the good"? - FChE ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-26 20:50 ` Andrew Morton ` (2 preceding siblings ...) 2009-08-26 21:34 ` Ingo Molnar @ 2009-08-26 21:34 ` raz ben yehuda 3 siblings, 0 replies; 79+ messages in thread From: raz ben yehuda @ 2009-08-26 21:34 UTC (permalink / raw) To: Andrew Morton Cc: Christoph Lameter, mingo, peterz, maximlevitsky, cfriesen, efault, riel, wiseman, linux-kernel, linux-rt-users On Wed, 2009-08-26 at 13:50 -0700, Andrew Morton wrote: > On Wed, 26 Aug 2009 16:40:09 -0400 (EDT) > Christoph Lameter <cl@linux-foundation.org> wrote: > > > Peter has not given a solution to the problem. Nor have you. > > What problem? > > All I've seen is "I want 100% access to a CPU". That's not a problem > statement - it's an implementation. > > What is the problem statement? > > > My take on these patches: the kernel gives userspace unmediated access > to memory resources if it wants that. The kernel gives userspace > unmediated access to IO devices if it wants that. But for some reason > people freak out at the thought of providing unmediated access to CPU > resources. > > Don't get all religious about this. If the change is clean, > maintainable and useful then there's no reason to not merge it. Thank you, Mr. Morton. Thank you!!! ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-26 20:40 ` Christoph Lameter 2009-08-26 20:50 ` Andrew Morton @ 2009-08-26 21:08 ` Ingo Molnar 2009-08-26 21:26 ` Christoph Lameter 1 sibling, 1 reply; 79+ messages in thread From: Ingo Molnar @ 2009-08-26 21:08 UTC (permalink / raw) To: Christoph Lameter Cc: Peter Zijlstra, raz ben yehuda, Maxim Levitsky, Chris Friesen, Mike Galbraith, riel, andrew motron, wiseman, lkml, linux-rt-users * Christoph Lameter <cl@linux-foundation.org> wrote: > > to have sharp teeth nor any apparent poison fangs) - i simply > > concur with the reasons Peter listed that it is a technically > > inferior solution. > > Ok so you are saying that the reduction of OS latencies will make > the processor completely available and have no disturbances like > OFFLINE scheduling? I'm saying that your lack of trying to reduce even low-hanging-fruit latency sources that were pointed out to you fundamentally destroys your credibility in claiming that they are unfixable for all practical purposes. Or, to come up with a car analogy: it's a bit as if at a repair shop you complained that your car has a scratch on its cooler grid that annoys you, and you insisted that it be outfitted with a new diesel engine which needs no cooler grid (throwing away the nice Hemi block it has currently) - and ignored the mechanic's opinion that he loves the Hemi and that to him the scratch looks very much like bird-sh*t and that a proper car wash might do the trick too ;-) > Peter has not given a solution to the problem. Nor have you. What do you mean by 'has given a solution' - a patch? Peter mentioned a few things that you can try to reduce the worst-case latency of the timer tick. Peter also implemented the hr-tick solution (CONFIG_SCHED_HRTICK) - it's mostly upstream but disabled because it had problems - if you are interested in improving this area you can fix and complete it. That would benefit ordinary Linux users too, not just rare isolation apps. Ingo ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-26 21:08 ` Ingo Molnar @ 2009-08-26 21:26 ` Christoph Lameter 0 siblings, 0 replies; 79+ messages in thread From: Christoph Lameter @ 2009-08-26 21:26 UTC (permalink / raw) To: Ingo Molnar Cc: Peter Zijlstra, raz ben yehuda, Maxim Levitsky, Chris Friesen, Mike Galbraith, riel, andrew motron, wiseman, lkml, linux-rt-users On Wed, 26 Aug 2009, Ingo Molnar wrote: > I'm saying that your lack of trying to reduce even low-hanging-fruit > latency sources that were pointed out to you fundamentally destroys > your credibility in claiming that they are unfixable for all > practical purposes. I have never claimed that they are unfixable. However, reducing latencies does not remove a disturbance. > Or, to come up with a car analogy: it's a bit as if at a repair shop > you complained that your car has a scratch on its cooler grid that > annoys you, and you insisted that it be outfitted with a new diesel > engine which needs no cooler grid (throwing away the nice Hemi block > it has currently) - and ignored the mechanic's opinion that he loves > the Hemi and that to him the scratch looks very much like bird-sh*t > and that a proper car wash might do the trick too ;-) Nope. It's like you want to get rid of your car and the person you talk to tries to convince you to keep it. He claims that if you would just wash it and repair it, then it will maybe become almost invisible and you would have reached your goal of not having a car. > That would benefit ordinary Linux users too, not just rare isolation > apps. We are talking about apps that need isolation here, not regular apps. ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-26 19:32 ` Ingo Molnar 2009-08-26 20:40 ` Christoph Lameter @ 2009-08-26 21:32 ` raz ben yehuda 1 sibling, 0 replies; 79+ messages in thread From: raz ben yehuda @ 2009-08-26 21:32 UTC (permalink / raw) To: Ingo Molnar Cc: Christoph Lameter, Peter Zijlstra, Maxim Levitsky, Chris Friesen, Mike Galbraith, riel, andrew motron, wiseman, lkml, linux-rt-users Hello Ingo. First, thank you for your interest. OFFSCHED is a variant of proprietary software. It is 4 years old. It is stable, and... well... this thing works. And it is so simple. SO VERY VERY SIMPLE. ONCE YOU GO OFFLINE YOU NEVER LOOK BACK. OFFSCHED has full access to many kernel facilities. My software transmits packets, encrypts packets, and reaches 25Gbps of network throughput, same as pktgen, while saturating its 8 SSD disks. My software takes statistics of an offloaded processor's usage, and unlike with OS processors, since I have full control of the processor, the usage grows quite linearly. There are no bursts of CPU usage; it remains stable at X% usage even when I transmit 25Gbps. OFFSCHED's __oldest__ patch was 4 lines. This is how it started: 4 lines of patch and my 2.6.18-8.el5 kernel is suddenly a hard real time kernel. Today, I patch this kernel, build only a bzImage, throw this 2MB bzImage on a server running a regular centos/redhat distribution, and caboom, I have a real time server in god-knows-where. I do not mess with any driver, I do not mess with the initrd. I just fix 4 lines. That is all. OFFSCHED is not just for real time. It can monitor the kernel, protect it, and do whatever comes to mind. Please see OFFSCHED-RTOP.pdf. Thank you, raz On Wed, 2009-08-26 at 21:32 +0200, Ingo Molnar wrote: > * Christoph Lameter <cl@linux-foundation.org> wrote: > > > On Wed, 26 Aug 2009, Ingo Molnar wrote: > > > > > The thing is, you have cut out (and have not replied to) this > > > crutial bit of what Peter wrote: > > > > > > > > The past year or so you've been whining about the tick latency, > > > > > and I've seen exactly _0_ patches from you slimming down the > > > > > work done in there, even though I pointed out some obvious > > > > > things that could be done. > > > > > > ... which pretty much settles the issue as far as i'm concerned. > > > If you were truly interested in a constructive solution to lower > > > latencies in Linux you should have sent patches already for the > > > low hanging fruits Peter pointed out. > > > > The noise latencies were already reduced in years earlier to the > > mininum (f.e. the work on slab queue cleaning). Certainly more > > could be done there but that misses the point. > > Peter suggested various improvements to the timer tick related > latencies _you_ were complaining about earlier this year. Those > latencies sure were not addressed 'years earlier'. > > If you are unwilling to reduce the very latencies you apparently > cared and complained about then you dont have much real standing to > complain now. > > ( If you on the other hand were approaching this issue with > pragmatism and with intellectual honesty, if you were at the end > of a string of patches that gradually improved latencies but > couldnt get them below a certain threshold, and if scheduler > developers couldnt give you any ideas what else to improve, and > _then_ suggested some other solution, you might have a point. > You are far away from being able to claim that. ) > > Really, it's a straightforward application of Occam's Razor to the > scheduler. 
We go for the simplest solution first, and try to help > more people first, before going for some specialist hack. > > > The point of the OFFLINE scheduler is to completely eliminate the > > OS disturbances by getting rid of *all* OS processing on some > > cpus. > > > > For some reason scheduler developers seem to be threatened by this > > idea and they go into bizarre lines of arguments to avoid the > > issue. Its simple and doable and the scheduler will still be there > > after we do this. > > If you meant to include me in that summary categorization, i dont > feel 'threatened' by any such patches (why would i? They dont seem > to have sharp teeth nor any apparent poison fangs) - i simply concur > with the reasons Peter listed that it is a technically inferior > solution. > > Ingo ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-26 19:15 ` Christoph Lameter 2009-08-26 19:32 ` Ingo Molnar @ 2009-08-27 7:15 ` Mike Galbraith 1 sibling, 0 replies; 79+ messages in thread From: Mike Galbraith @ 2009-08-27 7:15 UTC (permalink / raw) To: Christoph Lameter Cc: Ingo Molnar, Peter Zijlstra, raz ben yehuda, Maxim Levitsky, Chris Friesen, riel, andrew motron, wiseman, lkml, linux-rt-users On Wed, 2009-08-26 at 15:15 -0400, Christoph Lameter wrote: > The point of the OFFLINE scheduler is to completely eliminate the > OS disturbances by getting rid of *all* OS processing on some cpus. No, that's not the point of OFFSCHED. It's about offloading kernel functionality to a peer and, as it currently exists after some years of development, kernel functionality only. Raz has already stated that hard RT is not the point. <quote> (for full context, jump back a bit in this thread) > On the other hand, I could see this as a jump platform for more > proprietary code, something like that: we use linux in out server > platform, but out "insert buzzword here" network stack pro+ can handle > 100% more load that linux does, and it runs on a dedicated core.... > > In the other words, we might see 'firmwares' that take an entire cpu for > their usage. This is exactly what offsched (sos) is. you got it. SOS was partly inspired by the notion of a GPU. Processors are to become more and more redundant and Linux as an evolutionary system must use it. why not offload raid5 write engine ? why not encrypt in a different processor ? Also , having so many processors in a single OS means a bug prone system , with endless contention points when two or more OS processors interacts. let's make things simpler. </quote> -Mike ^ permalink raw reply [flat|nested] 79+ messages in thread
* RE: RFC: THE OFFLINE SCHEDULER 2009-08-26 14:54 ` raz ben yehuda 2009-08-26 15:06 ` Pekka Enberg 2009-08-26 15:30 ` Peter Zijlstra @ 2009-08-26 15:37 ` Chetan.Loke 2 siblings, 0 replies; 79+ messages in thread From: Chetan.Loke @ 2009-08-26 15:37 UTC (permalink / raw) To: raziebe, maximlevitsky Cc: cl, peterz, cfriesen, efault, riel, mingo, akpm, wiseman, linux-kernel, linux-rt-users > -----Original Message----- > From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel- > owner@vger.kernel.org] On Behalf Of raz ben yehuda > Sent: Wednesday, August 26, 2009 10:54 AM > To: Maxim Levitsky > Cc: Christoph Lameter; Peter Zijlstra; Chris Friesen; Mike Galbraith; > riel@redhat.com; mingo@elte.hu; andrew motron; wiseman@macs.biu.ac.il; > lkml; linux-rt-users@vger.kernel.org > Subject: Re: RFC: THE OFFLINE SCHEDULER > > > On Wed, 2009-08-26 at 17:45 +0300, Maxim Levitsky wrote: > > On Wed, 2009-08-26 at 09:47 -0400, Christoph Lameter wrote: > > > On Wed, 26 Aug 2009, raz ben yehuda wrote: > > > > > > > How will the kernel is going to handle 32 processors machines ? > These > > > > numbers are no longer a science-fiction. > > > > > > The kernel is already running on 4096 processor machines. Dont worry > about > > > that. > > > > > > > What i am suggesting is merely a different approach of how to handle > > > > multiple core systems. instead of thinking in processes, threads and > so > > > > on i am thinking in services. Why not take a processor and define > this > > > > processor to do just firewalling ? encryption ? routing ? > transmission ? > > > > video processing... and so on... > > > > > > I think that is a valuable avenue to explore. What we do so far is > > > treating each processor equally. Dedicating a processor has benefits > in > > > terms of cache hotness and limits OS noise. > > > > > > Most of the large processor configurations already partition the > system > > > using cpusets in order to limit the disturbance by OS processing. A > set of > > > cpus is used for OS activities and system daemons are put into that > set. > > > But what can be done is limited because the OS threads as well as > > > interrupt and timer processing etc cannot currently be moved. The > ideas > > > that you are proposing are particularly usedful for applications that > > > require low latencies and cannot tolerate OS noise easily (Infiniband > MPI > > > base jobs f.e.) > > > > My 0.2 cents: > > > > I have always been fascinated by the idea of controlling another cpu > > from the main CPU. > > > > Usually these cpus are custom, run proprietary software, and have no > > datasheet on their I/O interfaces. > > > > But, being able to turn an ordinary CPU into something like that seems > > to be very nice. > > > > For example, It might help with profiling. Think about a program that > > can run uninterrupted how much it wants. > > > > I might even be better, if the dedicated CPU would use a predefined > > reserved memory range (I wish there was a way to actually lock it to > > that range) > > > > On the other hand, I could see this as a jump platform for more > > proprietary code, something like that: we use linux in out server > > platform, but out "insert buzzword here" network stack pro+ can handle > > 100% more load that linux does, and it runs on a dedicated core.... > > > > In the other words, we might see 'firmwares' that take an entire cpu for > > their usage. > This is exactly what offsched (sos) is. you got it. SOS was partly > inspired by the notion of a GPU. 
> Processors are to become more and more redundant and Linux as an > evolutionary system must use it. why not offload raid5 write engine ? > why not encrypt in a different processor ? RAID/Encryption + GPU. You got it. This is what one of our teams did, but by offloading it onto a PCIe I/O module and using a couple of cores (protocol + application core). One core could run SAS/SATA and the other could run your home-grown application f/w and/or a linux distro, and you could make it do whatever you want. But that was then. Multi-core systems are now a commodity. Chetan Loke ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-26 5:31 ` Peter Zijlstra 2009-08-26 10:29 ` raz ben yehuda @ 2009-08-26 15:21 ` Pekka Enberg 1 sibling, 0 replies; 79+ messages in thread From: Pekka Enberg @ 2009-08-26 15:21 UTC (permalink / raw) To: Peter Zijlstra Cc: Chris Friesen, Christoph Lameter, Mike Galbraith, raz ben yehuda, riel, mingo, andrew motron, wiseman, lkml, linux-rt-users Hi Peter, On Wed, Aug 26, 2009 at 8:31 AM, Peter Zijlstra<peterz@infradead.org> wrote: >> Is it the whole concept of isolating one or more cpus from all normal >> kernel tasks that you don't like, or just this particular implementation? >> >> I ask because I know of at least one project that would have used this >> capability had it been available. As it stands they have to live with >> the usual kernel threads running on the cpu that they're trying to >> dedicate to their app. > > Its the simple fact of going around the kernel instead of using the > kernel. > > Going around the kernel doesn't benefit anybody, least of all Linux. > > So its the concept of running stuff on a CPU outside of Linux that I > don't like. I mean, if you want that, go ahead and run RTLinux, RTAI, > L4-Linux etc.. lots of special non-Linux hypervisor/exo-kernel like > things around for you to run things outside Linux with. Out of curiosity, what's the problem with it? Why can't the scheduler be taught to bind one user-space thread on a given CPU and make sure no other threads are scheduled on that CPU? I'm not a scheduler expert but that seems like a logical extension to the current cpuset logic and would help the low-latency workload Christoph has described in the past. Pekka ^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: RFC: THE OFFLINE SCHEDULER 2009-08-25 19:08 ` Peter Zijlstra 2009-08-25 19:18 ` Christoph Lameter 2009-08-25 19:22 ` Chris Friesen @ 2009-08-25 21:09 ` Éric Piel 2 siblings, 0 replies; 79+ messages in thread From: Éric Piel @ 2009-08-25 21:09 UTC (permalink / raw) To: Peter Zijlstra Cc: Christoph Lameter, Mike Galbraith, raz ben yehuda, riel, mingo, andrew motron, wiseman, lkml, linux-rt-users On 25-08-09 21:08, Peter Zijlstra wrote: > On Tue, 2009-08-25 at 14:03 -0400, Christoph Lameter wrote: >> On Tue, 25 Aug 2009, Mike Galbraith wrote: >> >>> I asked the questions I did out of pure curiosity, and that curiosity >>> has been satisfied. It's not that I find it useless or whatnot (or that >>> my opinion matters to anyone but me;). I personally find the concept of >>> injecting an RTOS into a general purpose OS with no isolation to be >>> alien. Intriguing, but very very alien. >> Well lets work on the isolation piece then. We could run a regular process >> on the RT cpu and switch back when OS services are needed? > > Christoph, stop being silly, this offline scheduler thing won't happen, > full stop. > > Its not a maintainable solution, it doesn't integrate with existing > kernel infrastructure, and its plain ugly. > > If you want something work within Linux, don't build kernels in kernels > or other such ugly hacks. Hello, For those interested in such an approach, you can have a look at a now-unmaintained project that we developed, ARTiS: http://www2.lifl.fr/west/artis/ It allows several RT tasks to share an "RT" cpu, and if a task tries to "cheat" by calling a kernel function which disables preemption or interrupts, it is temporarily migrated to another CPU. This is a working approach, with some good low-latency results, which can be seen in the papers on the website. See you, Eric ^ permalink raw reply [flat|nested] 79+ messages in thread
end of thread
Thread overview: 79+ messages
-- links below jump to the message on this page --
[not found] <1250983671.5688.21.camel@raz>
[not found] ` <1251004897.7043.70.camel@marge.simson.net>
2009-08-23 9:09 ` RFC: THE OFFLINE SCHEDULER raz ben yehuda
2009-08-23 7:30 ` Mike Galbraith
2009-08-23 11:05 ` raz ben yehuda
2009-08-23 9:52 ` Mike Galbraith
2009-08-25 15:23 ` Christoph Lameter
2009-08-25 17:56 ` Mike Galbraith
2009-08-25 18:03 ` Christoph Lameter
2009-08-25 18:12 ` Mike Galbraith
[not found] ` <5d96567b0908251522m3fd4ab98n76a52a34a11e874c@mail.gmail.com>
2009-08-25 22:32 ` Fwd: " Raz
2009-08-25 19:08 ` Peter Zijlstra
2009-08-25 19:18 ` Christoph Lameter
2009-08-25 19:22 ` Chris Friesen
2009-08-25 20:35 ` Sven-Thorsten Dietrich
2009-08-26 5:31 ` Peter Zijlstra
2009-08-26 10:29 ` raz ben yehuda
2009-08-26 8:02 ` Mike Galbraith
2009-08-26 8:16 ` Raz
2009-08-26 13:47 ` Christoph Lameter
2009-08-26 14:45 ` Maxim Levitsky
2009-08-26 14:54 ` raz ben yehuda
2009-08-26 15:06 ` Pekka Enberg
2009-08-26 15:11 ` raz ben yehuda
2009-08-26 15:30 ` Peter Zijlstra
2009-08-26 15:41 ` Christoph Lameter
2009-08-26 16:03 ` Peter Zijlstra
2009-08-26 16:16 ` Pekka Enberg
2009-08-26 16:20 ` Christoph Lameter
2009-08-26 18:04 ` Ingo Molnar
2009-08-26 19:15 ` Christoph Lameter
2009-08-26 19:32 ` Ingo Molnar
2009-08-26 20:40 ` Christoph Lameter
2009-08-26 20:50 ` Andrew Morton
2009-08-26 21:09 ` Christoph Lameter
2009-08-26 21:15 ` Chris Friesen
2009-08-26 21:37 ` raz ben yehuda
2009-08-27 16:51 ` Chris Friesen
2009-08-27 17:04 ` Christoph Lameter
2009-08-27 21:09 ` Thomas Gleixner
2009-08-27 22:22 ` Gregory Haskins
2009-08-28 2:15 ` Rik van Riel
2009-08-28 3:33 ` Gregory Haskins
2009-08-28 4:27 ` Gregory Haskins
2009-08-28 10:26 ` Thomas Gleixner
2009-08-28 18:57 ` Christoph Lameter
2009-08-28 19:23 ` Thomas Gleixner
2009-08-28 19:52 ` Christoph Lameter
2009-08-28 20:00 ` Thomas Gleixner
2009-08-28 20:21 ` Christoph Lameter
2009-08-28 20:34 ` Thomas Gleixner
2009-08-31 19:19 ` Christoph Lameter
2009-08-31 17:44 ` Roland Dreier
2009-09-01 18:42 ` Christoph Lameter
2009-09-01 16:15 ` Roland Dreier
2009-08-29 17:03 ` jim owens
2009-08-31 19:22 ` Christoph Lameter
2009-08-31 15:33 ` Peter Zijlstra
2009-09-01 18:46 ` Christoph Lameter
2009-08-28 6:14 ` Peter Zijlstra
2009-08-27 23:51 ` Chris Friesen
2009-08-28 0:44 ` Thomas Gleixner
2009-08-28 21:20 ` Chris Friesen
2009-08-28 18:43 ` Christoph Lameter
2009-08-27 21:33 ` raz ben yehuda
2009-08-27 22:05 ` Thomas Gleixner
2009-08-28 8:38 ` raz ben yehuda
2009-08-28 10:05 ` Thomas Gleixner
2009-08-28 13:25 ` Rik van Riel
2009-08-28 13:37 ` jim owens
2009-08-28 15:22 ` raz ben yehuda
2009-08-26 21:34 ` Ingo Molnar
2009-08-27 2:55 ` Frank Ch. Eigler
2009-08-26 21:34 ` raz ben yehuda
2009-08-26 21:08 ` Ingo Molnar
2009-08-26 21:26 ` Christoph Lameter
2009-08-26 21:32 ` raz ben yehuda
2009-08-27 7:15 ` Mike Galbraith
2009-08-26 15:37 ` Chetan.Loke
2009-08-26 15:21 ` Pekka Enberg
2009-08-25 21:09 ` Éric Piel