From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S261510AbVFKBlg (ORCPT ); Fri, 10 Jun 2005 21:41:36 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S261512AbVFKBlg (ORCPT ); Fri, 10 Jun 2005 21:41:36 -0400 Received: from e31.co.us.ibm.com ([32.97.110.129]:13745 "EHLO e31.co.us.ibm.com") by vger.kernel.org with ESMTP id S261510AbVFKBlL (ORCPT ); Fri, 10 Jun 2005 21:41:11 -0400 Date: Fri, 10 Jun 2005 18:41:33 -0700 From: "Paul E. McKenney" To: Andrea Arcangeli Cc: Bill Huey , Karim Yaghmour , Lee Revell , Tim Bird , linux-kernel@vger.kernel.org, tglx@linutronix.de, mingo@elte.hu, pmarques@grupopie.com, bruce@andrew.cmu.edu, nickpiggin@yahoo.com.au, ak@muc.de, sdietrich@mvista.com, dwalker@mvista.com, hch@infradead.org, akpm@osdl.org Subject: Re: Attempted summary of "RT patch acceptance" thread Message-ID: <20050611014133.GO1300@us.ibm.com> Reply-To: paulmck@us.ibm.com References: <20050609235026.GE1297@us.ibm.com> <1118372388.32270.6.camel@mindpipe> <20050610154745.GA1300@us.ibm.com> <20050610173728.GA6564@g5.random> <20050610193926.GA19568@nietzsche.lynx.com> <42A9F788.2040107@opersys.com> <20050610223724.GA20853@nietzsche.lynx.com> <20050610225231.GF6564@g5.random> <20050610230836.GD21618@nietzsche.lynx.com> <20050610232955.GH6564@g5.random> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050610232955.GH6564@g5.random> User-Agent: Mutt/1.4.1i Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org On Sat, Jun 11, 2005 at 01:29:55AM +0200, Andrea Arcangeli wrote: > On Fri, Jun 10, 2005 at 04:08:36PM -0700, Bill Huey wrote: > > On Sat, Jun 11, 2005 at 12:52:31AM +0200, Andrea Arcangeli wrote: > > > Just tell me how can you go to a customer and tell him that your > > > linux-RTOS has a guaranteed worst case latency of 50usec. How can you > > > tell that? Did you exercise all possible scheduler paths with cache > > > disabled and calculated the worst case latency with rdtsc + math? > > > > Ask Ingo. I'm through with this track and your misinformed comments > > You didn't provide an answer yourself, and you fallback on somebody else > cause there's no valid answer to this question. All I've seen so far are > measurements backed by some statistical significance, that's very far > from the meaning I give to the word _guarantee_. I just -know- I am going to regret stepping into the middle of this one... [Hey, Karim!!! Any extra room in that bunker of yours?] All I have seen Ingo claim is that -some- hardware configurations running -some- specific workloads had excellent -measured- maximum scheduling latencies. I have not seen Ingo claim that he has mathematically proven that CONFIG_PREEMPT_RT's worst-case scheduling latency is bounded, let alone make any claim of what such a mathematically proven bound might be. Nor have I seen Ingo claim that CONFIG_PREEMPT_RT has passed any sort of industry-standard acceptance test. If I missed such a mathematical proof, or such an industry-standard acceptance test, please send me the URL. Ingo -has- greatly reduced the amount of kernel code that must be inspected to guarantee realtime response, and this is a great accomplishment and a great contribution. I believe that his approach is more than "good enough" for a great number of applications that are traditionally thought of as "hard realtime". But, unless I missed something, it is not mathematically proven, nor is it in any way "certified". Some of the approaches involving separate RTOSes -can- legitimately claim have mathematically proven worst-case scheduling latencies. For example, the dual-OS/dual-core approach can do so, particularly in the case where the "RTOS" is a tight loop written on bare metal in hand-coded assembly. One might consider this to be a trivial example, perhaps even a stupid example, but there are a lot of apps out there that use exactly this approach. The migration-between-OSes approach, such as RTAI Fusion, might also be able to mathematically guarantee RTOS scheduling latencies. And, unlike the dual-OS/dual-core approaches, it does provide a single environment to applications. However, it still results in two separate OS instances to administer, and, as near as I can tell, the RTOS has to stick its hands so deeply into Linux's internals that it might as well be part of the Linux kernel. Unless I am missing something, any sort of bug in the Linux kernel has a reasonable chance of messing up the RTOS, since the RTOS must paw through Linux's tasks. Can we say that one of these approaches is definitely "good enough" for all reasonable Linux RT work, and that we should therefore stop working on the other approach? I might well be missing something, but I don't believe so, at least not yet. In the meantime, the more time anyone spends sending flames on LKML, the less time that person is putting into improving his/her favorite approach. So the greater number of flames one sends, the smaller one's favorite approach's chance of success! Thanx, Paul > I fully acknowledge there are some problems that aren't more equally > easily solved with full userland code, for those problems keeping things > simpler by making the kernel more RT aware has a value. > > I fully acknowledge that simulating infinite cpus has a value too in > discovering race conditions too ;). > > But I personally dislike all things that works by luck, preempt-RT that > aims to provide guarantees about worst case latencies is no exception, > so you shoudln't be surprised if I'm not a RTOS backer for problems that > can be more easily solved with ruby-hard RT designs like RTAI and > rtlinux. I don't care how things have worked in the past on non-linux > OS. Most of the time even current not-RT linux will work fine if it's > idle as it would be on some embedded usages like the printer example. > > This even ignoring the fact all those context switch will not be cheap, > kernel can execute a lot more hard-irqs than context switches per > second, and this is another fact. On a 4ghz cpu it doesn't matter, but > on a embedded it can matter, so the more reliable solution is obviously > higher performant too. Of the 50usec it takes to such project on > linuxdevices to execute probably a 10% of that is wasted in the irq > handling itself (ioapic hardware proper stuff). > > Anyway I've more fun things to do on than to talk about hard-RT, which > I'm doing just with the aim to provide my little contribution in form of > IMHO valid criticism that you clearly don't appreciate. >