There seems to be quite a bit of confusion, and possibly some real problems, in the cpu_idle code of many architectures. Hopefully many will be false positives, but please bear with me, read this, and check your code if you maintain: alpha, arm26, h8300, ia64, m68knommu, parisc, ppc64, s390, sh64, sparc, um, or xtensa.

OK, so I've sent a patch to Andrew that makes the various rules a bit clearer (this did not introduce arch bugs; I found them when making this patch). I've attached the complete patch at the end of this email. You *ONLY* need to look at the small changes to your architecture, and read the following rules.

Your cpu_idle routines need to obey the following rules:

****
1. Preempt should now be disabled over idle routines. It should only be enabled to call schedule(), then disabled again.

2. need_resched/TIF_NEED_RESCHED is only ever set, and will never be cleared until the running task has called schedule(). Idle threads only ever need to query need_resched, never set or clear it.

3. When cpu_idle finds (need_resched() == 'true'), it should call schedule(). It should not call schedule() otherwise.

4. The only time interrupts need to be disabled when checking need_resched is if we are about to sleep the processor until the next interrupt (this doesn't provide any protection of need_resched; it prevents losing an interrupt).

4a. A common problem with this type of sleep appears to be:

        local_irq_disable();
        if (!need_resched()) {
                local_irq_enable();
                *** resched interrupt arrives here ***
                __asm__("sleep until next interrupt");
        }

    If the resched interrupt arrives at that point, the CPU goes to sleep with need_resched set and stays asleep until some other interrupt wakes it.

5. TIF_POLLING_NRFLAG can be set by idle routines that do not need an interrupt to wake them up when need_resched goes high. If TIF_POLLING_NRFLAG is set, and we do decide to enter an interrupt sleep, it needs to be cleared and then a memory barrier issued (followed by a test of need_resched with interrupts disabled, as explained in 4).
****

Possible arch problems I found (and either tried to fix or didn't):

alpha - set TIF_POLLING_NRFLAG. OK?
arm26 - how did it work?! It didn't appear to ever call schedule() (See #3). I don't know how it could have run other processes without CONFIG_PREEMPT (fixed? please check!)
h8300 - Fixed(?) to "sleep" only if need_resched is NOT set. (See #3)
      - Is such sleeping racy vs interrupts? (See #4a) The H8/300 manual I found indicates yes; however, disabling IRQs over the sleep means only NMIs can wake it up, so it can't be fixed easily without spin waiting.
ia64 - is the safe_halt call racy vs interrupts? (See #4a)
m68knommu - Fixed(?) to "stop" only if need_resched() is NOT set. (See #3)
          - Is such sleeping racy vs interrupts? (See #4a)
parisc - set TIF_POLLING_NRFLAG. OK?
ppc64 - don't test or clear need_resched in cpu_idle. (See #2)
s390 - local irq disable before checking need_resched doesn't gain anything (removed, OK?)
sh64 - Is sleeping racy vs interrupts? (See #4a)
sparc - IRQs are on at this point(?), so changed local_irq_save to _disable.
      - Changed the idle loop so we don't go to schedule() if pm_idle is NULL!
      - set TIF_POLLING_NRFLAG for SMP.
      - TODO: needs secondary CPUs to disable preempt (See #1)
um - I'm too lazy to really look. Might be OK :P
xtensa - obviously not tested with preempt (hopefully fixed?)

And attached is the patch, for reference. Please make sure that the fixes I have made are correct, and that the concerns I have raised are checked. Feel free to ask for help or clarification of anything.

Thanks,
Nick

--
SUSE Labs, Novell Inc.
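
As an illustration of rules 1 to 4a above, here is a minimal userspace sketch. It is not kernel code and is not taken from the attached patch. Everything in it is a hypothetical stand-in: the globals model the IRQ, preempt and TIF_NEED_RESCHED state, the functions named after kernel primitives are simple stubs, and arch_sleep_irqs_on() is an invented name for an architecture primitive that re-enables interrupts and sleeps as one atomic step (x86 "sti; hlt" behaves this way). Rule 5 (TIF_POLLING_NRFLAG) is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; none of this is the real kernel API. */
static bool irqs_enabled = true;
static bool resched_flag = false;   /* models TIF_NEED_RESCHED */
static int preempt_count = 0;
static int schedule_calls = 0;
static int sleeps = 0;

/* Stubs named after the kernel primitives they stand in for. */
static void local_irq_disable(void) { irqs_enabled = false; }
static void local_irq_enable(void) { irqs_enabled = true; }
static bool need_resched(void) { return resched_flag; }
static void preempt_disable(void) { preempt_count++; }
static void preempt_enable_no_resched(void) { preempt_count--; }

/* Rule 2: in this model only schedule() ever clears the flag. */
static void schedule(void)
{
    schedule_calls++;
    resched_flag = false;
}

/*
 * Invented name. Models "enable interrupts and sleep" as a single
 * atomic step, so no interrupt can slip in between the two (rule 4a).
 * The wakeup is modelled as a resched interrupt that sets the flag.
 */
static void arch_sleep_irqs_on(void)
{
    assert(!irqs_enabled);  /* caller must have disabled interrupts */
    irqs_enabled = true;
    sleeps++;
    resched_flag = true;
}

/* One pass of an idle loop; the caller holds preempt disabled (rule 1). */
static void cpu_idle_once(void)
{
    while (!need_resched()) {
        /* Rule 4: disable interrupts only because we may sleep. */
        local_irq_disable();
        if (!need_resched())
            arch_sleep_irqs_on();   /* returns with interrupts on */
        else
            local_irq_enable();
    }

    /* Rules 1 and 3: enable preempt only to call schedule(). */
    preempt_enable_no_resched();
    schedule();
    preempt_disable();
}
```

The important difference from the racy pattern in 4a is that interrupts stay disabled from the need_resched check right up to the sleep itself. On hardware without such an atomic enable-and-sleep step the race remains, which is the concern raised for h8300 above.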