From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <48804E28.80906@domain.hid>
Date: Fri, 18 Jul 2008 10:02:48 +0200
From: Philippe Gerum
References: <1216282324.3472.29.camel@domain.hid>
In-Reply-To: <1216282324.3472.29.camel@domain.hid>
Subject: Re: [Xenomai-help] Generel problem with realtime-tops(like adeos) over linux-kernel
Reply-To: rpm@xenomai.org
List-Id: Help regarding installation and common use of Xenomai
To: "Schlägl \"Manfred jun.\""
Cc: xenomai@xenomai.org

Schlägl Manfred jun. wrote:
> Hi!
>
> I think we've discovered a generell logical problem with realtime-tops
> like adeos over the linux-kernel.
>
> The basic-assumption of such an system is: Linux is not a
> realtime-system, so it is not able to provide realtime to it's services,
> so no linux-service is able to use realtime-capabilities, so no
> linux-service has realtime-requirements.
> From this it follows that we are able use a top like adeos (send
> interrupts later, always interrupt the linux-kernel).
>
> But... Linux is able to provide hard-realtime while interrupts are
> locked. And many services(driver) use this.
>
> abstract example:
> {{{
> spin_lock_irqsave
> if(hardware_data_valid())
> process_hardware_data()
> spin_lock_irqrestore
> }}}

Well, real-time is not about allowing this handler to perform un-preempted,
but rather to guarantee that the highest priority code will always get the
CPU at any point in time. If that handler is the most time-critical work to
do on your box, that's fine. But if it's not, this code is wrong, because it
basically wrecks real-time behaviour.
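
To put rough numbers on that last point, here is a minimal user-space
sketch, not kernel code. The function names, parameters and the timing
model are made up for illustration, and context-switch costs are ignored.
It only computes how long an urgent duty waits for the CPU when some other
section is running, either with interrupts masked or under priority-based
preemption.

```c
/*
 * Hypothetical timing model (illustration only, all names are assumptions).
 *
 * Delay before an urgent duty arriving at urgent_arrival_us gets the CPU
 * when a section that started at section_start_us and lasts section_len_us
 * runs with interrupts masked, i.e. cannot be preempted at all.
 */
long latency_masked_us(long section_start_us, long section_len_us,
                       long urgent_arrival_us)
{
    long section_end_us = section_start_us + section_len_us;

    /* The urgent duty arrives outside the masked section: no delay. */
    if (urgent_arrival_us < section_start_us ||
        urgent_arrival_us >= section_end_us)
        return 0;

    /* Otherwise it has to wait until interrupts are unmasked again. */
    return section_end_us - urgent_arrival_us;
}

/*
 * Same situation, but the section is preemptible and priorities are
 * honored (larger value = more urgent). Only a section of equal or higher
 * priority can delay the urgent duty.
 */
long latency_preemptive_us(long section_start_us, long section_len_us,
                           long urgent_arrival_us,
                           int section_prio, int urgent_prio)
{
    if (urgent_prio > section_prio)
        return 0;

    return latency_masked_us(section_start_us, section_len_us,
                             urgent_arrival_us);
}
```

In this model, with a 200 us section starting at t=0 and an urgent event
arriving at t=50, the masked variant makes the urgent duty wait 150 us
whatever its priority is, while the preemptive variant lets it run at once
as long as its priority is the higher one.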

Actually, your basic assumption is flawed on a regular SMP kernel: what if
the lock is currently held by a task running on another CPU, that ends up
being preempted by an IRQ, or any higher priority task? Unless running an
RT-capable system, common spinlock loops are entered with hw interrupts off;
therefore you end up locking interrupts for an undefined amount of time on
your local CPU, before being able to enter your critical section. So much
for predictability, both for entering that critical section, but above all
for any other code that would want to get the local CPU's attention asap
while your code is waiting for the lock.

Masking interrupts may solve the preemption issue on a uniprocessor box, but
this does not guarantee that any other time-critical part of the system
requiring immediate attention, because of its higher priority, will get the
CPU on time. Therefore, I don't see how this construct could be used to
enforce real-time, precisely because it completely ignores priorities.

> works fine without adeos,

In fact, it does not work in a reliable manner, without actual RTOS support.
The point is not about Adeos, which is only an enabler for real-time
support, which in turn brings proper preemption and priority management.

> but with adeos there may be a relative long
> interruption between validation and processing. The hardware may overrun
> and process_hardware_data is called without valid data...
>
> In our case we have this problem while the rx-interrupt of our
> ethernet-driver. The dma is running permanently and generates an overrun
> between the error-checking(which would catch the overrun) part and the
> data-processing part of the handler.
>
> I think it is possible that there could be many such (latent) problems
> in linux-kernel. For example USB which itself has realtime-requirements,
> or eventually mtd (lost data as cause of wrong flash-write/erase
> timings), ...

The whole idea underlying dual RT/non-RT systems is that RT processes
1) share the available CPU horsepower between them all according to
arbitrary priorities, 2) should be part of a software system that leaves
some cycles to the non-RT processes whenever possible.

The example you described basically says: 1) any time-critical driver may
lock interrupts out in order to complete its duty un-preempted, 2) all
time-critical drivers have to perform their duty without stepping on each
other's toes with the rather limited help of a single giant traffic light
(i.e. hw interrupt masking, and no priority scheme).

When designing such a system, you would have to think about the potential
damage high priority tasks may cause to low priority tasks, because of
preemption, or absence thereof. But again, you would have to do that with
all kinds of RT frameworks. This is something which could only be sorted out
at the code design level, which means that you would have to review the
locking scheme for all time-critical sections you want to take care of in
any case.

Once identified, those sections can be fixed individually, including with
Adeos. If they happen to be too complex for using a different kind of
(ironed, Adeos-aware) lock, then maybe they do not qualify for being atomic
in the first place.

Practically, if you don't want to put your MTD flash at risk on a dual
RT/GPOS design, use VxWorks, but in that case, do not run the MTD task along
with other tasks that may preempt it for a dangerously long time. Back to
square #1, I'm afraid: you have to know what your RT constraints are.

>
> So ... what do you think about that.
>

Native preemption turns spinlocks into rt-mutexes, which allows the code
flow to be diverted from the critical section for an undefined amount of
time when the CPU has to turn its attention to a higher priority task. So
your spinlock actually gives you no guarantee beyond proper serialization of
the section in question.
Therefore, the issue you raised is not a co-kernel problem, it simply
expresses a general question about any RT design: which activity has the
highest priority and the lowest required response time? I don't think you
are going to solve that problem by relying only on hw interrupt masking.

To sum up, if a system has multiple time-critical duties to perform, well, I
see no other option than enumerating them, and building a sane priority
design accordingly. Whether you rely on multi-domain execution the Adeos
way or on native preemption makes no difference here. At the end of the day,
both of them will bring predictability to my RT code, whilst a non-RT kernel
will certainly cause me headaches.

-- 
Philippe.