From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <48819761.7000702@domain.hid>
Date: Sat, 19 Jul 2008 09:27:29 +0200
From: Philippe Gerum
MIME-Version: 1.0
References: <1216282324.3472.29.camel@domain.hid> <48804E28.80906@domain.hid>
In-Reply-To: <48804E28.80906@domain.hid>
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
Subject: Re: [Xenomai-help] Generel problem with realtime-tops(like adeos) over linux-kernel
Reply-To: rpm@xenomai.org
List-Id: Help regarding installation and common use of Xenomai
To: =?UTF-8?B?IlNjaGzDpGdsIFwiTWFuZnJlZCBqdW4uXCIi?=
Cc: xenomai@xenomai.org

Philippe Gerum wrote:
> Schlägl Manfred jun. wrote:
>> But... Linux is able to provide hard-realtime while interrupts are
>> locked. And many services(driver) use this.
>>
>> abstract example:
>> {{{
>> spin_lock_irqsave
>> if(hardware_data_valid())
>>     process_hardware_data()
>> spin_lock_irqrestore
>> }}}
>
> Well, real-time is not about allowing this handler to perform un-preempted, but
> rather to guarantee that the highest priority code will always get the CPU at
> any point in time. If that handler is the most time-critical work to do on your
> box, that's fine. But if it's not, this code is wrong, because it basically
> wrecks real-time behaviour.
>
> Actually, your basic assumption is flawed on a regular SMP kernel: what if the
> lock is currently held by a task running on another CPU, that ends up being
> preempted by an IRQ,

e.g. due to a mixed spin_lock (CPU #0) vs spin_lock_irq* (CPU #1) construct,
protecting two different accesses to the same critical resource. We could make
sure that such a construct never involves time-critical code, but that would
still require identifying the potentially problematic code and fixing it
accordingly. IOW, we would have to go through the very same audit process as
for a virtualized IRQ system like Adeos.
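
To put rough numbers on the mixed construct, here is a minimal user-space
sketch. It is not kernel code: it is a simplified timeline model, and the
struct, its fields and the function name are all hypothetical, made up for
illustration. It assumes a single lock holder on CPU #0 that took the lock with
plain spin_lock() (so it can be delayed by interrupt handling), and a contender
on CPU #1 that calls spin_lock_irqsave() and therefore spins with hw interrupts
off until the lock is released.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical timeline model; all times are in microseconds. */
struct lock_holder {
    uint64_t acquire_at;   /* when CPU #0 takes the lock via spin_lock() */
    uint64_t section_len;  /* undisturbed length of its critical section */
    uint64_t irq_delay;    /* time lost to IRQ handling while holding the lock */
};

/*
 * Time CPU #1 spends spinning with hw interrupts off, assuming it calls
 * spin_lock_irqsave() on the same lock at 'request_at'.
 * Returns 0 when the lock is free at that time in this simplified model.
 */
uint64_t irqs_off_spin_time(const struct lock_holder *h, uint64_t request_at)
{
    uint64_t release_at;

    assert(h != NULL);

    /* The holder releases the lock later by exactly the time IRQs stole. */
    release_at = h->acquire_at + h->section_len + h->irq_delay;

    if (request_at < h->acquire_at || request_at >= release_at)
        return 0;

    return release_at - request_at;
}
```

For instance, with a 10 us critical section entered at t=0 and 200 us lost to
interrupt handling on CPU #0, a request issued at t=5 on CPU #1 spins for
205 us with interrupts masked, versus 5 us when the holder is not preempted.
The delay seen by the irqs-off side is bounded by whatever can preempt the
other side, which is why the construct has to be found by auditing the code.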

On a vanilla non-RT kernel, we would always have to use the spin_lock_irq*()
form to serialize accesses; on an Adeos-enabled kernel, we would have to
convert regular locks to raw Adeos locks (ipipe_spin_lock_t).

Another scenario which comes to mind involves 3 CPUs and two locks on a vanilla
kernel, i.e.:

CPU #0                CPU #1                        CPU #2
spin_lock(&lockB);
                      spin_lock_irqsave(&lockA);
                      spin_lock(&lockB);
                                                    spin_lock_irqsave(&lockA);

In that case, although the locking sequence seems fine, the driver code on
CPU #2 that attempts to grab lockA will wait until lockB is released on CPU #0,
which can induce a significant delay while CPU #2 spins with hw interrupts off.
Again, that kind of nested construct could be banned, but how would you know it
is never used without actually auditing the code?

> or any higher priority task?

i.e. on a native preemption kernel, because spinlocks are turned into
rt-mutexes that allow rescheduling in what used to be truly atomic sections on
vanilla kernels.

What I mean with those examples is that your analysis is right when it comes to
the potential issue introduced by interrupt virtualization, but other problems
leading to the same consequence can appear with regular or native preemption
kernels as well. Therefore, the only reasonable answer is to address each
time-critical aspect specifically (e.g. MTD and such), and from that point, one
can fix them - such as introducing raw locks in Adeos-enabled kernels or
re-ordering task priorities with native preemption ones - regardless of the
underlying real-time infrastructure.

I'm not saying that such a task is an easy one; all I'm saying is that there is
no Linux kernel that is virtuous by essence when it comes to predictable
behaviour, so we have to help them a bit by knowing about our real-time
constraints in any case.

--
Philippe.