From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4A0BF054.3040308@domain.hid>
Date: Thu, 14 May 2009 12:20:04 +0200
From: Jan Kiszka
MIME-Version: 1.0
References: <4A0AC1C8.4050006@domain.hid> <4A0AC3F9.9090103@domain.hid>
 <4A0AC8A6.1000701@domain.hid> <1242220962.26544.955.camel@domain.hid>
 <4A0AE726.5090107@domain.hid> <1242230121.26544.977.camel@domain.hid>
 <4A0AF109.5050804@domain.hid> <1242247840.26544.981.camel@domain.hid>
In-Reply-To: <1242247840.26544.981.camel@domain.hid>
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: 7bit
Subject: Re: [Xenomai-core] [PATCH] Fix host IRQ propagation
List-Id: Xenomai life and development
To: Philippe Gerum
Cc: xenomai-core

Philippe Gerum wrote:
> On Wed, 2009-05-13 at 18:10 +0200, Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> On Wed, 2009-05-13 at 17:28 +0200, Jan Kiszka wrote:
>>>> Philippe Gerum wrote:
>>>>> On Wed, 2009-05-13 at 15:18 +0200, Jan Kiszka wrote:
>>>>>> Gilles Chanteperdrix wrote:
>>>>>>> Jan Kiszka wrote:
>>>>>>>> Hi Gilles,
>>>>>>>>
>>>>>>>> I'm currently facing a nasty effect with switchtest over latest git head
>>>>>>>> (only tested this so far): running it inside my test VM (ie. with
>>>>>>>> frequent excessive latencies) I get a stalled Linux timer IRQ quite
>>>>>>>> quickly. System is otherwise still responsive, Xenomai timers are still
>>>>>>>> being delivered, other Linux IRQs too. switchtest complained about
>>>>>>>>
>>>>>>>> "Warning: Linux is compiled to use FPU in kernel-space."
>>>>>>>>
>>>>>>>> when it was started. Kernels are 2.6.28.9/ipipe-x86-2.2-07 and
>>>>>>>> 2.6.29.3/ipipe-x86-2.3-01 (LTTng patched in, but unused), both show the
>>>>>>>> same effect.
>>>>>>>>
>>>>>>>> Seen this before?
>>>>>>> The warning about Linux being compiled to use FPU in kernel-space means
>>>>>>> that you enabled soft RAID or compiled for K7, Geode, or any other
>>>>>> RAID is on (ordinary server config).
>>>>>>
>>>>>>> configuration using 3DNow for such simple operations as memcpy. It is
>>>>>>> harmless, it simply means that switchtest can not use fpu in kernel-space.
>>>>>>>
>>>>>>> The bug you have is probably the same as the one described here, which I
>>>>>>> am able to reproduce on my atom:
>>>>>>> https://mail.gna.org/public/xenomai-help/2009-04/msg00200.html
>>>>>>>
>>>>>>> Unfortunately, I for one am working on ARM issues and am not available
>>>>>>> to debug x86 issues. I think Philippe is busy too...
>>>>>> OK, looks like I got the same flu here.
>>>>>>
>>>>>> Philippe, did you find out any more details in the meantime? Then I'm
>>>>>> afraid I have to pick this up.
>>>>> No, I did not resume this task yet. Working from the powerpc side of the
>>>>> universe here.
>>>> Hoho, don't think this rain here over x86 would have never made it down
>>>> to ARM or PPC land! ;)
>>>>
>>>> Martin, could you check if this helps you, too?
>>>>
>>>> Jan
>>>>
>>>> (as usual, ready to be pulled from 'for-upstream')
>>>>
>>>> --------->
>>>>
>>>> Host IRQs may not only be triggered from non-root domains.
>>> Are you sure of this? I can't find any spot where this assumption would
>>> be wrong. host_pend() is basically there to relay RT timer ticks and
>>> device IRQs, and this only happens on behalf of the pipeline head. At
>>> least, this is how rthal_irq_host_pend() should be used in any case. If
>>> you did find a spot where this interface is being called from the lower
>>> stage, then this is the root bug to fix.
>> I haven't studied the I-pipe trace /wrt this in details yet, but I could
>> imagine that some shadow task is interrupted in primary mode by the
>> timer IRQ and then leaves the handler in secondary mode due to whatever
>> events between schedule-out and in at the end of xnintr_clock_handler.
>>
>
> You need a thread context to move to secondary, I just can't see how
> such scenario would be possible.

Here is the trace of events:

=> Shadow task starts migration to secondary
=> in xnpod_suspend_thread, nklock is briefly released before
   xnpod_schedule
=> timer IRQ intercepts
=> as the current CPU is marked for reschedule, we enter xnpod_schedule
   before propagating the host tick
=> once the migrating thread comes in again, it will run the
   xnintr_clock_handler tail, i.e. xnarch_relay_tick, already over the
   root domain

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
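
The ordering in the event trace above can be replayed with a small toy model. This is only a sketch in Python, not Xenomai or I-pipe code: the `Cpu` class, `clock_handler` and `run_trace` are hypothetical names made up for illustration, and the generator's `yield` merely stands in for the context switch that happens inside the timer handler. It assumes the handler tail runs in whatever domain the interrupted thread is resumed in, which is the point the trace makes.

```python
# Toy model only: the event names loosely mirror the trace, but none of
# this is Xenomai or I-pipe code. All identifiers here are hypothetical.

HEAD, ROOT = "head", "root"


class Cpu:
    def __init__(self):
        self.domain = HEAD          # domain the CPU currently runs over
        self.need_resched = False   # set when the current thread must be switched out
        self.log = []               # (event, domain) pairs, in order of occurrence


def clock_handler(cpu):
    """Models the timer IRQ handler; the yield stands for the context switch."""
    cpu.log.append(("timer_irq", cpu.domain))
    if cpu.need_resched:
        cpu.log.append(("reschedule", cpu.domain))
        yield                       # interrupted thread is switched out here
    # Handler tail: runs only once the interrupted thread is resumed.
    cpu.log.append(("relay_host_tick", cpu.domain))


def run_trace():
    cpu = Cpu()
    # Shadow task starts migrating to secondary; the lock is dropped
    # briefly while a reschedule is already pending.
    cpu.need_resched = True
    cpu.log.append(("migration_started", cpu.domain))
    # Timer IRQ preempts the thread in that window and reschedules
    # before the host tick has been propagated.
    handler = clock_handler(cpu)
    next(handler)
    # The migrating thread comes back in, now over the root domain.
    cpu.domain = ROOT
    cpu.log.append(("thread_resumed", cpu.domain))
    next(handler, None)             # resumes the handler tail
    return cpu.log


if __name__ == "__main__":
    for event, domain in run_trace():
        print(f"{event:18s} over {domain}")
```

In this model the last logged event is the host tick relay, and it is recorded over the root domain even though the timer IRQ itself was taken over the head domain.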