* A question about cpu_idle() @ 2009-10-21 3:07 yi li 2009-10-21 4:47 ` Gregory Haskins 0 siblings, 1 reply; 7+ messages in thread From: yi li @ 2009-10-21 3:07 UTC (permalink / raw) To: linux-rt-users Hi RT users, While reading patch-2.6.31.4-rt14, there is a patch for cpu_idle() which I cannot understand. Could anyone kindly enough to tell me what is patch used for? diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c index ebefb54..c8d0ece 100644 --- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -152,9 +152,11 @@ void cpu_idle(void) } tick_nohz_restart_sched_tick(); - preempt_enable_no_resched(); - schedule(); + local_irq_disable(); + __preempt_enable_no_resched(); + __schedule(); preempt_disable(); + local_irq_enable(); } } Regards, -Yi ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: A question about cpu_idle() 2009-10-21 3:07 A question about cpu_idle() yi li @ 2009-10-21 4:47 ` Gregory Haskins 2009-10-22 4:30 ` yi li 0 siblings, 1 reply; 7+ messages in thread From: Gregory Haskins @ 2009-10-21 4:47 UTC (permalink / raw) To: yi li, linux-rt-users >>> On 10/20/2009 at 11:07 PM, in message <a0e7fce50910202007l587d2cfcu91b455ff0486110b@mail.gmail.com>, yi li <liyi.dev@gmail.com> wrote: > Hi RT users, > > While reading patch-2.6.31.4-rt14, there is a patch for cpu_idle() > which I cannot understand. > > Could anyone kindly enough to tell me what is patch used for? Hi Yi, I believe that logic is ensuring that the task is put to sleep instead of simply being preempted. The difference is that a preemption leaves the task on the RQ, whereas calling schedule() may or may not leave the caller on the RQ, depending on the status of current->state. The local_irq_disable+__preempt_enable_no_resched dance is a way of legally calling schedule() while effectively preventing preemption (since interrupts-off also disables preemption). The difference is that its legal to call __schedule() with interrupts off, but you can't with preempt_disable(). Long story short, the enable_no_resched() + schedule() pattern is only ever used when you want to make sure the task fully sleeps. However without this patch to disable interrupts, the original code appears racy and thus probably had issues achieving its intended goal. HTH, -Greg > > > diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c > index ebefb54..c8d0ece 100644 > --- a/arch/x86/kernel/process_64.c > +++ b/arch/x86/kernel/process_64.c > @@ -152,9 +152,11 @@ void cpu_idle(void) > } > > tick_nohz_restart_sched_tick(); > - preempt_enable_no_resched(); > - schedule(); > + local_irq_disable(); > + __preempt_enable_no_resched(); > + __schedule(); > preempt_disable(); > + local_irq_enable(); > } > } > > Regards, > -Yi > -- > To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: A question about cpu_idle() 2009-10-21 4:47 ` Gregory Haskins @ 2009-10-22 4:30 ` yi li 2009-10-22 12:34 ` Gregory Haskins 0 siblings, 1 reply; 7+ messages in thread From: yi li @ 2009-10-22 4:30 UTC (permalink / raw) To: Gregory Haskins; +Cc: linux-rt-users On Wed, Oct 21, 2009 at 12:47 PM, Gregory Haskins <ghaskins@novell.com> wrote: >>>> On 10/20/2009 at 11:07 PM, in message > <a0e7fce50910202007l587d2cfcu91b455ff0486110b@mail.gmail.com>, yi li > <liyi.dev@gmail.com> wrote: >> Hi RT users, >> >> While reading patch-2.6.31.4-rt14, there is a patch for cpu_idle() >> which I cannot understand. >> >> Could anyone kindly enough to tell me what is patch used for? > > Hi Yi, > > I believe that logic is ensuring that the task is put to sleep instead of simply being preempted. The difference is that a preemption leaves the task on the RQ, whereas calling schedule() may or may not leave the caller on the RQ, depending on the status of current->state. The local_irq_disable+__preempt_enable_no_resched dance is a way of legally calling schedule() while effectively preventing preemption (since interrupts-off also disables preemption). The difference is that its legal to call __schedule() with interrupts off, but you can't with preempt_disable(). > > Long story short, the enable_no_resched() + schedule() pattern is only ever used when you want to make sure the task fully sleeps. However without this patch to disable interrupts, the original code appears racy and thus probably had issues achieving its intended goal. > > HTH, > -Greg > Thanks Greg for the kind reply. But due to my limited knowledge on PREEMPT_RT, I still not fully understand. Given schedule() is defined as: asmlinkage void __sched schedule(void) { need_resched: local_irq_disable(); __schedule(); local_irq_enable(); if (need_resched()) goto need_resched; } Comparing bellow two code sequences: 1) preempt_enable_no_resched(); schedule(); preempt_disable(); 2) local_irq_disable(); __preempt_enable_no_resched(); __schedule(); preempt_disable(); local_irq_enable(); It seems to me, the main difference is to change the order of "preempt_enable_no_sched() / local_irq_disable()": i.e, from: "preempt_enable_no_sched(); /* Yi: will some race condition happens here? */ local_irq_disable(); __schedule();" to: "local_irq_disable(); /* Yi: disables irq effectively disables preemption? */ preempt_enable_no_sched(); __schedule();" Regards, -Yi >> >> >> diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c >> index ebefb54..c8d0ece 100644 >> --- a/arch/x86/kernel/process_64.c >> +++ b/arch/x86/kernel/process_64.c >> @@ -152,9 +152,11 @@ void cpu_idle(void) >> } >> >> tick_nohz_restart_sched_tick(); >> - preempt_enable_no_resched(); >> - schedule(); >> + local_irq_disable(); >> + __preempt_enable_no_resched(); >> + __schedule(); >> preempt_disable(); >> + local_irq_enable(); >> } >> } >> >> Regards, >> -Yi >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: A question about cpu_idle() 2009-10-22 4:30 ` yi li @ 2009-10-22 12:34 ` Gregory Haskins 2009-10-22 16:24 ` yi li 0 siblings, 1 reply; 7+ messages in thread From: Gregory Haskins @ 2009-10-22 12:34 UTC (permalink / raw) To: yi li; +Cc: linux-rt-users >>> On 10/22/2009 at 12:30 AM, in message <a0e7fce50910212130h385cdbe1ucbb4fe00233eb0bb@mail.gmail.com>, yi li <liyi.dev@gmail.com> wrote: > On Wed, Oct 21, 2009 at 12:47 PM, Gregory Haskins <ghaskins@novell.com> wrote: >>>>> On 10/20/2009 at 11:07 PM, in message >> <a0e7fce50910202007l587d2cfcu91b455ff0486110b@mail.gmail.com>, yi li >> <liyi.dev@gmail.com> wrote: >>> Hi RT users, >>> >>> While reading patch-2.6.31.4-rt14, there is a patch for cpu_idle() >>> which I cannot understand. >>> >>> Could anyone kindly enough to tell me what is patch used for? >> >> Hi Yi, >> >> I believe that logic is ensuring that the task is put to sleep instead of > simply being preempted. The difference is that a preemption leaves the task > on the RQ, whereas calling schedule() may or may not leave the caller on the > RQ, depending on the status of current->state. The > local_irq_disable+__preempt_enable_no_resched dance is a way of legally > calling schedule() while effectively preventing preemption (since > interrupts-off also disables preemption). The difference is that its legal > to call __schedule() with interrupts off, but you can't with > preempt_disable(). >> >> Long story short, the enable_no_resched() + schedule() pattern is only ever > used when you want to make sure the task fully sleeps. However without this > patch to disable interrupts, the original code appears racy and thus probably > had issues achieving its intended goal. >> >> HTH, >> -Greg >> > > Thanks Greg for the kind reply. > > But due to my limited knowledge on PREEMPT_RT, I still not fully understand. > > Given schedule() is defined as: > > asmlinkage void __sched schedule(void) > { > need_resched: > local_irq_disable(); > __schedule(); > local_irq_enable(); > > if (need_resched()) > goto need_resched; > } > > Comparing bellow two code sequences: > > 1) > preempt_enable_no_resched(); > schedule(); > preempt_disable(); > > > 2) > local_irq_disable(); > __preempt_enable_no_resched(); > __schedule(); > preempt_disable(); > local_irq_enable(); > > It seems to me, the main difference is to change the order of > "preempt_enable_no_sched() / local_irq_disable()": > i.e, from: > > "preempt_enable_no_sched(); > /* Yi: will some race condition happens here? */ This is exactly right. In a preemptible kernel (which includes standard mainline PREEMPT as well as PREEMPT_RT) we technically may preempt at this point. Normally a preempt_enable() will induce a type of "voluntary" preemption point when the preempt-disable count goes to zero. The "no_sched()" variant disables this voluntary reschedule, but it does nothing to prevent an _involuntary_ preemption (such as a RESCHEDULE interprocessor-interrupt (IPI) sent from another core). > local_irq_disable(); > __schedule();" > > to: > > "local_irq_disable(); /* Yi: disables irq effectively disables preemption? Yes, that is right. irq_disable is a effective superset of preempt-disable() in this context because it also blocks those RESCHED_IPI events from being received. Of course, disabling interrupts also has other side effects since it also disables *all* interrupts (like timers, etc) so it should be used sparingly. In this case, we are simply bridging the preempt_enable_no_resched() and schedule() to make sure it is truly an atomic transition to a sleep state, so its use is justified. I hope this helps, and feel free to ask any more questions you wish. -Greg > */ > preempt_enable_no_sched(); > __schedule();" > > Regards, > -Yi > >>> >>> >>> diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c >>> index ebefb54..c8d0ece 100644 >>> --- a/arch/x86/kernel/process_64.c >>> +++ b/arch/x86/kernel/process_64.c >>> @@ -152,9 +152,11 @@ void cpu_idle(void) >>> } >>> >>> tick_nohz_restart_sched_tick(); >>> - preempt_enable_no_resched(); >>> - schedule(); >>> + local_irq_disable(); >>> + __preempt_enable_no_resched(); >>> + __schedule(); >>> preempt_disable(); >>> + local_irq_enable(); >>> } >>> } >>> >>> Regards, >>> -Yi >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >> >> >> >> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: A question about cpu_idle() 2009-10-22 12:34 ` Gregory Haskins @ 2009-10-22 16:24 ` yi li 2009-10-22 18:59 ` Gregory Haskins 0 siblings, 1 reply; 7+ messages in thread From: yi li @ 2009-10-22 16:24 UTC (permalink / raw) To: Gregory Haskins; +Cc: linux-rt-users On Thu, Oct 22, 2009 at 8:34 PM, Gregory Haskins <ghaskins@novell.com> wrote: > Yes, that is right. irq_disable is a effective superset of preempt-disable() in this context because it also blocks those RESCHED_IPI events from being received. Of course, disabling interrupts also has other side effects since it also disables *all* interrupts (like timers, etc) so it should be used sparingly. In this case, we are simply bridging the preempt_enable_no_resched() and schedule() to make sure it is truly an atomic transition to a sleep state, so its use is justified. > > I hope this helps, and feel free to ask any more questions you wish. Thanks. Yes it make sense to me. Another question may be basic - but I just cannot figure out. Why PREEMPT_RT patch disables local irq (with local_irq_disable()) before __schedule()? Is it to ensure atomic excution of __schedule()? If so, does that mean, linux kernel without PREEMPT_RT patch suffers from racy problem in schedule()? asmlinkage void __sched schedule(void) { need_resched: local_irq_disable(); __schedule(); local_irq_enable(); if (need_resched()) goto need_resched; } I saw similiar code in preempt_schedule(), __cond_schedule() and preempt_schedule_irq(). Regards, -Yi -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: A question about cpu_idle() 2009-10-22 16:24 ` yi li @ 2009-10-22 18:59 ` Gregory Haskins 2009-10-24 10:04 ` yi li 0 siblings, 1 reply; 7+ messages in thread From: Gregory Haskins @ 2009-10-22 18:59 UTC (permalink / raw) To: yi li; +Cc: linux-rt-users >>> On 10/22/2009 at 12:24 PM, in message <a0e7fce50910220924s4b640cbi51af6d7e791e98e9@mail.gmail.com>, yi li <liyi.dev@gmail.com> wrote: > On Thu, Oct 22, 2009 at 8:34 PM, Gregory Haskins <ghaskins@novell.com> wrote: >> Yes, that is right. irq_disable is a effective superset of preempt-disable() > in this context because it also blocks those RESCHED_IPI events from being > received. Of course, disabling interrupts also has other side effects since > it also disables *all* interrupts (like timers, etc) so it should be used > sparingly. In this case, we are simply bridging the > preempt_enable_no_resched() and schedule() to make sure it is truly an atomic > transition to a sleep state, so its use is justified. >> >> I hope this helps, and feel free to ask any more questions you wish. > > Thanks. Yes it make sense to me. > > Another question may be basic - but I just cannot figure out. > > Why PREEMPT_RT patch disables local irq (with local_irq_disable()) > before __schedule()? Yes, but note that even mainline disables interrupts before doing a context switch. It just does it in a different place within schedule() (see spin_lock_irq(rq->lock)). I believe PREEMPT_RT introduces the non-irq disabled form of __schedule() so that it can be used in those no_sched() + schedule() instances that we already discussed. The schedule() function then becomes a client of __schedule(), properly disabling interrupts first as is required > Is it to ensure atomic excution of __schedule()? Yes, atomic execution of the context switch is a requirement for both mainline and PREEMPT_RT, and thus interrupts must eventually be turned off as the registers are reloaded. > If so, does that mean, linux kernel without PREEMPT_RT patch suffers > from racy problem in schedule()? No, mainline schedule() works just fine to my knowledge. The issue is for the implementation of preempt_enable_no_resched() + schedule(). On a preemptible kernel (such as PREEMPT or PREEMPT_RT), you may have a narrow race w.r.t. a preemption point occurring between the time that preemption is enabled and we call schedule(). IOW: The code would still work fine even if we hit the race, but it results in suboptimal behavior. What would happen is that the system may preempt the task, leave it on the RQ, run some other task, and then return control to this original task only to have it finish putting itself to sleep. It is more efficient to just let it fully complete the deactivation than it would be to allow the one extra context switch, but it technically works either way. Since RT is about reducing latency, and the scheduler itself is a source of latency, it is wise to eliminate any extraneous switches whenever possible. However, this enhancement is technically applicable for CONFIG_PREEMPT mainline as well, afaict. Kind Regards, -Greg ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: A question about cpu_idle() 2009-10-22 18:59 ` Gregory Haskins @ 2009-10-24 10:04 ` yi li 0 siblings, 0 replies; 7+ messages in thread From: yi li @ 2009-10-24 10:04 UTC (permalink / raw) To: Gregory Haskins; +Cc: linux-rt-users On Fri, Oct 23, 2009 at 2:59 AM, Gregory Haskins <ghaskins@novell.com> wrote: >>>> On 10/22/2009 at 12:24 PM, in message > <a0e7fce50910220924s4b640cbi51af6d7e791e98e9@mail.gmail.com>, yi li > <liyi.dev@gmail.com> wrote: >> On Thu, Oct 22, 2009 at 8:34 PM, Gregory Haskins <ghaskins@novell.com> wrote: >>> Yes, that is right. irq_disable is a effective superset of preempt-disable() >> in this context because it also blocks those RESCHED_IPI events from being >> received. Of course, disabling interrupts also has other side effects since >> it also disables *all* interrupts (like timers, etc) so it should be used >> sparingly. In this case, we are simply bridging the >> preempt_enable_no_resched() and schedule() to make sure it is truly an atomic >> transition to a sleep state, so its use is justified. >>> >>> I hope this helps, and feel free to ask any more questions you wish. >> >> Thanks. Yes it make sense to me. >> >> Another question may be basic - but I just cannot figure out. >> >> Why PREEMPT_RT patch disables local irq (with local_irq_disable()) >> before __schedule()? > > Yes, but note that even mainline disables interrupts before doing a context switch. It just does it in a different place within schedule() (see spin_lock_irq(rq->lock)). > > I believe PREEMPT_RT introduces the non-irq disabled form of __schedule() so that it can be used in those no_sched() + schedule() instances that we already discussed. The schedule() function then becomes a client of __schedule(), properly disabling interrupts first as is required > >> Is it to ensure atomic excution of __schedule()? > > Yes, atomic execution of the context switch is a requirement for both mainline and PREEMPT_RT, and thus interrupts must eventually be turned off as the registers are reloaded. > >> If so, does that mean, linux kernel without PREEMPT_RT patch suffers >> from racy problem in schedule()? > > No, mainline schedule() works just fine to my knowledge. The issue is for the implementation of preempt_enable_no_resched() + schedule(). On a preemptible kernel (such as PREEMPT or PREEMPT_RT), you may have a narrow race w.r.t. a preemption point occurring between the time that preemption is enabled and we call schedule(). IOW: The code would still work fine even if we hit the race, but it results in suboptimal behavior. > > What would happen is that the system may preempt the task, leave it on the RQ, run some other task, and then return control to this original task only to have it finish putting itself to sleep. It is more efficient to just let it fully complete the deactivation than it would be to allow the one extra context switch, but it technically works either way. Since RT is about reducing latency, and the scheduler itself is a source of latency, it is wise to eliminate any extraneous switches whenever possible. However, this enhancement is technically applicable for CONFIG_PREEMPT mainline as well, afaict. > > Kind Regards, > -Greg > Thanks Greg. This makes sense to me. I appreciate your explanation. -Yi -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2009-10-24 10:03 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-10-21 3:07 A question about cpu_idle() yi li 2009-10-21 4:47 ` Gregory Haskins 2009-10-22 4:30 ` yi li 2009-10-22 12:34 ` Gregory Haskins 2009-10-22 16:24 ` yi li 2009-10-22 18:59 ` Gregory Haskins 2009-10-24 10:04 ` yi li
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).