* Re: Serial port latency [not found] <000401c0b319517fea9@local> @ 2001-03-25 23:10 ` Pavel Machek 2001-03-29 7:58 ` Manfred Spraul 2001-04-10 0:37 ` Serial port latency Andrea Arcangeli 0 siblings, 2 replies; 21+ messages in thread From: Pavel Machek @ 2001-03-25 23:10 UTC (permalink / raw) To: Manfred Spraul; +Cc: geirt, linux-kernel Hi! > Is the computer otherwise idle? > I've seen one unexplainable report with atm problems that disappeared > (!) if a kernel compile was running. I've seen similar bugs. If you hook something on schedule_tq and forget to set current->need_resched, this is exactly what you get. -- Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt, details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Serial port latency 2001-03-25 23:10 ` Serial port latency Pavel Machek @ 2001-03-29 7:58 ` Manfred Spraul 2001-03-30 22:36 ` Pavel Machek 2001-04-10 0:37 ` Serial port latency Andrea Arcangeli 1 sibling, 1 reply; 21+ messages in thread From: Manfred Spraul @ 2001-03-29 7:58 UTC (permalink / raw) To: Pavel Machek; +Cc: geirt, linux-kernel From: "Pavel Machek" <pavel@suse.cz> > > Is the computer otherwise idle? > > I've seen one unexplainable report with atm problems that disappeared > > (!) if a kernel compile was running. > > I've seen similar bugs. If you hook something on schedule_tq and forget > to set current->need_resched, this is exactly what you get. > I'm running with a patch that printk's if cpu_idle() is called while a softirq is pending. If I access the floppy on my K6/200 every track triggers the check, and sometimes the console blanking code triggers it. What about creating a special cpu_is_idle() function that the idle functions must call before sleeping? -- Manfred ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Serial port latency 2001-03-29 7:58 ` Manfred Spraul @ 2001-03-30 22:36 ` Pavel Machek 2001-03-31 22:09 ` Manfred Spraul 0 siblings, 1 reply; 21+ messages in thread From: Pavel Machek @ 2001-03-30 22:36 UTC (permalink / raw) To: Manfred Spraul; +Cc: geirt, linux-kernel Hi! > > > Is the computer otherwise idle? > > > I've seen one unexplainable report with atm problems that > disappeared > > > (!) if a kernel compile was running. > > > > I've seen similar bugs. If you hook something on schedule_tq and > forget > > to set current->need_resched, this is exactly what you get. > > > I'm running with a patch that printk's if cpu_idle() is called while a > softirq is pending. > If I access the floppy on my K6/200 every track triggers the check, and > sometimes the console blanking code triggers it. Seems floppy and console is buggy, then. > What about creating a special cpu_is_idle() function that the idle > functions must call before sleeping? I'd say just fix all the bugs. Pavel -- The best software in life is free (not shareware)! Pavel GCM d? s-: !g p?:+ au- a--@ w+ v- C++@ UL+++ L++ N++ E++ W--- M- Y- R+ ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Serial port latency 2001-03-30 22:36 ` Pavel Machek @ 2001-03-31 22:09 ` Manfred Spraul 2001-04-03 23:07 ` softirq buggy [Re: Serial port latency] Pavel Machek 0 siblings, 1 reply; 21+ messages in thread From: Manfred Spraul @ 2001-03-31 22:09 UTC (permalink / raw) To: Pavel Machek; +Cc: linux-kernel [-- Attachment #1: Type: text/plain, Size: 1514 bytes --] Pavel Machek wrote: > > > > > > > I've seen similar bugs. If you hook something on schedule_tq and > > forget > > > to set current->need_resched, this is exactly what you get. > > > > > I'm running with a patch that printk's if cpu_idle() is called while a > > softirq is pending. > > If I access the floppy on my K6/200 every track triggers the check, and > > sometimes the console blanking code triggers it. > > Seems floppy and console is buggy, then. > No. The softirq implementation is buggy. I can trigger the problem with the TASKLET_HI (floppy), and both net rx and tx (ping -l) > > What about creating a special cpu_is_idle() function that the idle > > functions must call before sleeping? > > I'd say just fix all the bugs. > Ok, there are 2 bugs that are (afaics) impossible to fix without checking for pending softirq's in cpu_idle(): a) queue_task(my_task1, tq_immediate); mark_bh(); schedule(); ;within schedule: do_softirq() ;within my_task1: mark_bh(); ; bh returns, but do_softirq won't loop ; do_softirq returns. ; schedule() clears current->need_resched ; idle thread scheduled. --> idle can run although softirq's are pending I assume I trigger this race with the floppy driver. b) hw interrupt do_softirq within the net_rx handler: another hw interrupt, additional packets are queued do_softirq won't loop. returns to idle thread. --> packets delayed unnecessary. What about the attached patch? Obviously the other idle cpu must be converted to use the function as well. -- Manfred [-- Attachment #2: patch-proc --] [-- Type: text/plain, Size: 1025 bytes --] --- 2.4/arch/i386/kernel/process.c Thu Feb 22 22:28:52 2001 +++ build-2.4/arch/i386/kernel/process.c Sun Apr 1 00:05:21 2001 @@ -73,6 +73,30 @@ hlt_counter--; } +/** + * cpu_is_idle - helper function for idle functions + * + * pm_idle functions must call this function to verify that + * the cpu is really idle. It must be called with disabled + * local interrupts. + * Return values: + * 0: cpu was not idle, local interrupts reenabled. + * 1: go into power saving mode, local interrupts are + * still disabled. +*/ +static inline int cpu_is_idle(void) +{ + if (current->need_resched) { + __sti(); + return 0; + } + if (softirq_active(smp_processor_id()) & softirq_mask(smp_processor_id())) { + __sti(); + do_softirq(); + return 0; + } + return 1; +} /* * We use this if we don't have any better * idle routine.. @@ -81,10 +105,8 @@ { if (current_cpu_data.hlt_works_ok && !hlt_counter) { __cli(); - if (!current->need_resched) + if (cpu_is_idle()) safe_halt(); - else - __sti(); } } ^ permalink raw reply [flat|nested] 21+ messages in thread
* softirq buggy [Re: Serial port latency] 2001-03-31 22:09 ` Manfred Spraul @ 2001-04-03 23:07 ` Pavel Machek 2001-04-04 21:18 ` Manfred Spraul 0 siblings, 1 reply; 21+ messages in thread From: Pavel Machek @ 2001-04-03 23:07 UTC (permalink / raw) To: Manfred Spraul; +Cc: kernel list Hi! > > Seems floppy and console is buggy, then. > > > > No. The softirq implementation is buggy. > I can trigger the problem with the TASKLET_HI (floppy), and both net rx > and tx (ping -l) > > > > What about creating a special cpu_is_idle() function that the idle > > > functions must call before sleeping? > > > > I'd say just fix all the bugs. > > > > Ok, there are 2 bugs that are (afaics) impossible to fix without > checking for pending softirq's in cpu_idle(): > > a) > queue_task(my_task1, tq_immediate); > mark_bh(); > schedule(); > ;within schedule: do_softirq() > ;within my_task1: > mark_bh(); > ; bh returns, but do_softirq won't loop > ; do_softirq returns. > ; schedule() clears current->need_resched > ; idle thread scheduled. > --> idle can run although softirq's are pending Or anything else can run altrough softirqs are pending. If it is computation job, softinterrupts are delayed quiet a bit, right? So right fix seems to be "loop in do_softirq". Pavel > I assume I trigger this race with the floppy driver. > > b) > hw interrupt > do_softirq > within the net_rx handler: another hw interrupt, additional packets are > queued > do_softirq won't loop. > returns to idle thread. --> packets delayed unnecessary. > > What about the attached patch? Obviously the other idle cpu must be > converted to use the function as well. -- I'm pavel@ucw.cz. "In my country we have almost anarchy and I don't care." Panos Katsaloulis describing me w.r.t. patents at discuss@linmodems.org ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: softirq buggy [Re: Serial port latency] 2001-04-03 23:07 ` softirq buggy [Re: Serial port latency] Pavel Machek @ 2001-04-04 21:18 ` Manfred Spraul 2001-04-06 12:00 ` Pavel Machek 0 siblings, 1 reply; 21+ messages in thread From: Manfred Spraul @ 2001-04-04 21:18 UTC (permalink / raw) To: Pavel Machek; +Cc: kernel list From: "Pavel Machek" <pavel@ucw.cz> > > > Ok, there are 2 bugs that are (afaics) impossible to fix without > > checking for pending softirq's in cpu_idle(): > > > > a) > > queue_task(my_task1, tq_immediate); > > mark_bh(); > > schedule(); > > ;within schedule: do_softirq() > > ;within my_task1: > > mark_bh(); > > ; bh returns, but do_softirq won't loop > > ; do_softirq returns. > > ; schedule() clears current->need_resched > > ; idle thread scheduled. > > --> idle can run although softirq's are pending > > Or anything else can run altrough softirqs are pending. If it is > computation job, softinterrupts are delayed quiet a bit, right? > > So right fix seems to be "loop in do_softirq". > No, it's the wrong fix. A network server under high load would loop forever within the softirq, never returning to process level. do_softirq cannot loop, the right fix is "check often for pending softirq's". It's checked before a process returns to user space, it's checked when a process schedules. What's missing is that the idle functions must check for pending softirqs, too. -- Manfred ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: softirq buggy [Re: Serial port latency] 2001-04-04 21:18 ` Manfred Spraul @ 2001-04-06 12:00 ` Pavel Machek 2001-04-07 22:28 ` Manfred Spraul 0 siblings, 1 reply; 21+ messages in thread From: Pavel Machek @ 2001-04-06 12:00 UTC (permalink / raw) To: Manfred Spraul; +Cc: kernel list Hi! > > > Ok, there are 2 bugs that are (afaics) impossible to fix without > > > checking for pending softirq's in cpu_idle(): > > > > > > a) > > > queue_task(my_task1, tq_immediate); > > > mark_bh(); > > > schedule(); > > > ;within schedule: do_softirq() > > > ;within my_task1: > > > mark_bh(); > > > ; bh returns, but do_softirq won't loop > > > ; do_softirq returns. > > > ; schedule() clears current->need_resched > > > ; idle thread scheduled. > > > --> idle can run although softirq's are pending > > > > Or anything else can run altrough softirqs are pending. If it is > > computation job, softinterrupts are delayed quiet a bit, right? > > > > So right fix seems to be "loop in do_softirq". > > > No, it's the wrong fix. > A network server under high load would loop forever within the softirq, > never returning to process level. > > do_softirq cannot loop, the right fix is "check often for pending > softirq's". > It's checked before a process returns to user space, it's checked when a > process schedules. What's missing is that the idle functions must check > for pending softirqs, too. Ok. I was missing the fact it is checked on going userspace. -- The best software in life is free (not shareware)! Pavel GCM d? s-: !g p?:+ au- a--@ w+ v- C++@ UL+++ L++ N++ E++ W--- M- Y- R+ ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: softirq buggy [Re: Serial port latency] 2001-04-06 12:00 ` Pavel Machek @ 2001-04-07 22:28 ` Manfred Spraul 2001-04-08 16:58 ` kuznet 0 siblings, 1 reply; 21+ messages in thread From: Manfred Spraul @ 2001-04-07 22:28 UTC (permalink / raw) To: Pavel Machek; +Cc: kernel list [-- Attachment #1: Type: text/plain, Size: 132 bytes --] Pavel Machek wrote: > > Ok. I was missing the fact it is checked on going userspace. > What about the attached patch? -- Manfred [-- Attachment #2: patch-cpu-idle --] [-- Type: text/plain, Size: 2858 bytes --] // $Header$ // Kernel Version: // VERSION = 2 // PATCHLEVEL = 4 // SUBLEVEL = 3 // EXTRAVERSION = -ac3 --- 2.4/include/linux/interrupt.h Thu Feb 22 22:29:53 2001 +++ build-2.4/include/linux/interrupt.h Sat Apr 7 23:51:31 2001 @@ -269,4 +269,25 @@ extern int probe_irq_off(unsigned long); /* returns 0 or negative on failure */ extern unsigned int probe_irq_mask(unsigned long); /* returns mask of ISA interrupts */ +/** + * cpu_is_idle - helper function for idle functions + * + * pm_idle functions must call this function to verify that + * the cpu is really idle. It must be called with disabled + * local interrupts. + * Return values: + * 0: cpu was not idle. + * 1: go into power saving mode. + */ +static inline int cpu_is_idle(void) +{ + if (current->need_resched) { + return 0; + } + if (softirq_active(smp_processor_id()) & softirq_mask(smp_processor_id())) { + do_softirq(); + return 0; + } + return 1; +} #endif --- 2.4/arch/i386/kernel/process.c Thu Feb 22 22:28:52 2001 +++ build-2.4/arch/i386/kernel/process.c Sat Apr 7 23:51:46 2001 @@ -73,6 +73,7 @@ hlt_counter--; } + /* * We use this if we don't have any better * idle routine.. @@ -81,7 +82,7 @@ { if (current_cpu_data.hlt_works_ok && !hlt_counter) { __cli(); - if (!current->need_resched) + if (cpu_is_idle()) safe_halt(); else __sti(); @@ -92,26 +93,21 @@ * On SMP it's slightly faster (but much more power-consuming!) * to poll the ->need_resched flag instead of waiting for the * cross-CPU IPI to arrive. Use this option with caution. + * + * Isn't this identical to default_idle with the 'no-hlt' boot + * option? <manfred@colorfullife.com> */ static void poll_idle (void) { int oldval; + for(;;) { + if (!cpu_is_idle()) + break; + __sti(); + rep_nop(); + } __sti(); - - /* - * Deal with another CPU just having chosen a thread to - * run here: - */ - oldval = xchg(¤t->need_resched, -1); - - if (!oldval) - asm volatile( - "2:" - "cmpl $-1, %0;" - "rep; nop;" - "je 2b;" - : :"m" (current->need_resched)); } /* --- 2.4/drivers/acpi/cpu.c Sat Apr 7 22:02:01 2001 +++ build-2.4/drivers/acpi/cpu.c Sat Apr 7 23:55:17 2001 @@ -148,7 +148,7 @@ unsigned long diff; __cli(); - if (current->need_resched) + if (!cpu_is_idle()) goto out; if (acpi_bm_activity()) goto sleep2; @@ -171,7 +171,7 @@ unsigned long diff; __cli(); - if (current->need_resched) + if (!cpu_is_idle()) goto out; if (acpi_bm_activity()) goto sleep2; @@ -205,7 +205,7 @@ unsigned long diff; __cli(); - if (current->need_resched) + if (!cpu_is_idle()) goto out; time = acpi_read_pm_timer(); @@ -235,7 +235,7 @@ unsigned long diff; __cli(); - if (current->need_resched) + if (!cpu_is_idle()) goto out; time = acpi_read_pm_timer(); acpi_c1_count++; ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: softirq buggy [Re: Serial port latency] 2001-04-07 22:28 ` Manfred Spraul @ 2001-04-08 16:58 ` kuznet 2001-04-08 17:21 ` Manfred Spraul 0 siblings, 1 reply; 21+ messages in thread From: kuznet @ 2001-04-08 16:58 UTC (permalink / raw) To: Manfred Spraul; +Cc: linux-kernel Hello! > + if (softirq_active(smp_processor_id()) & softirq_mask(smp_processor_id())) { > + do_softirq(); > + return 0; BTW you may delete do_softirq()... schedule() will call this. > + * > + * Isn't this identical to default_idle with the 'no-hlt' boot > + * option? <manfred@colorfullife.com> Seeems, it is not. need_resched=-1 avoids useless IPIs. Alexey ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: softirq buggy [Re: Serial port latency] 2001-04-08 16:58 ` kuznet @ 2001-04-08 17:21 ` Manfred Spraul 2001-04-08 17:58 ` kuznet 0 siblings, 1 reply; 21+ messages in thread From: Manfred Spraul @ 2001-04-08 17:21 UTC (permalink / raw) To: kuznet; +Cc: linux-kernel From: <kuznet@ms2.inr.ac.ru> > Hello! > > > + if (softirq_active(smp_processor_id()) & softirq_mask(smp_processor_id())) { > > + do_softirq(); > > + return 0; > > BTW you may delete do_softirq()... schedule() will call this. > But with a huge overhead. I'd prefer to call it directly from within the idle functions, the overhead of schedule is IMHO too high. > > > + * > > + * Isn't this identical to default_idle with the 'no-hlt' boot > > + * option? <manfred@colorfullife.com> > > Seeems, it is not. need_resched=-1 avoids useless IPIs. > I already wondered why need_resched is set to -1 ;-) I'll remove that comment and repost the patch. -- Manfred ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: softirq buggy [Re: Serial port latency] 2001-04-08 17:21 ` Manfred Spraul @ 2001-04-08 17:58 ` kuznet 2001-04-08 18:16 ` Manfred Spraul 2001-04-08 21:35 ` [PATCH] Re: softirq buggy Manfred Spraul 0 siblings, 2 replies; 21+ messages in thread From: kuznet @ 2001-04-08 17:58 UTC (permalink / raw) To: Manfred Spraul; +Cc: linux-kernel Hello! > But with a huge overhead. I'd prefer to call it directly from within the > idle functions, the overhead of schedule is IMHO too high. + if (current->need_resched) { + return 0; ^^^^^^^^ + } + if (softirq_active(smp_processor_id()) & softirq_mask(smp_processor_id())) { + do_softirq(); + return 0; ^^^^^^^^^ You return one value in both casesand I decided it means "schedule". 8) Apparently you meaned return 1 in the first case. 8) But in this case it becomes wrong. do_softirq() can raise need_reshed and moreover irqs arrive during it. Order of check should be different. BTW what's about overhead... I suspect it is _lower_ in the case of schedule(). In the case of networking at least, when softirq most likely wakes some socket. Alexey ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: softirq buggy [Re: Serial port latency] 2001-04-08 17:58 ` kuznet @ 2001-04-08 18:16 ` Manfred Spraul 2001-04-08 21:35 ` [PATCH] Re: softirq buggy Manfred Spraul 1 sibling, 0 replies; 21+ messages in thread From: Manfred Spraul @ 2001-04-08 18:16 UTC (permalink / raw) To: kuznet; +Cc: linux-kernel From: <kuznet@ms2.inr.ac.ru> To: "Manfred Spraul" <manfred@colorfullife.com> Cc: <linux-kernel@vger.redhat.com> Sent: Sunday, April 08, 2001 7:58 PM Subject: Re: softirq buggy [Re: Serial port latency] > Hello! > > > But with a huge overhead. I'd prefer to call it directly from within the > > idle functions, the overhead of schedule is IMHO too high. > > > + if (current->need_resched) { > + return 0; > ^^^^^^^^ > + } > + if (softirq_active(smp_processor_id()) & softirq_mask(smp_processor_id())) { > + do_softirq(); > + return 0; > ^^^^^^^^^ > You return one value in both casesand I decided it means "schedule". 8) > Apparently you meaned return 1 in the first case. 8) > No, the code is correct. 0 means "don't stop the cpu". The pm_idle function pointer will return to cpu_idle() (arch/i386/kernel/process.c), and that function contains another while(!current->need_resched) idle(); loop ;-) > But in this case it becomes wrong. do_softirq() can raise need_reshed > and moreover irqs arrive during it. Order of check should be different. > Yes, I'll correct that. > > BTW what's about overhead... I suspect it is _lower_ in the case > of schedule(). In the case of networking at least, when softirq > most likely wakes some socket. > I'm not sure - what if the computer is just a router? But OTHO: the cpu is idle, so it doesn't matter at all if the idle cpu spends it's time within schedule() or within safe_hlt(), I'll change my patch. I have another question: I added cpu_is_idle() into <linux/interrupt.h>. Is that acceptable, or is there a better header file for such a function? -- Manfred ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH] Re: softirq buggy 2001-04-08 17:58 ` kuznet 2001-04-08 18:16 ` Manfred Spraul @ 2001-04-08 21:35 ` Manfred Spraul 2001-04-09 8:42 ` Albert D. Cahalan 2001-04-09 13:50 ` Andrea Arcangeli 1 sibling, 2 replies; 21+ messages in thread From: Manfred Spraul @ 2001-04-08 21:35 UTC (permalink / raw) To: linux-kernel [-- Attachment #1: Type: text/plain, Size: 377 bytes --] I've attached a new patch: * cpu_is_idle() moved to <linux/pm.h> * function uninlined due to header dependencies * cpu_is_idle() doesn't call do_softirq directly, instead the caller returns to schedule() * cpu_is_idle() exported for modules. * docu updated. I'd prefer to inline cpu_is_idle(), but optimizing the idle code path is probably not that important ;-) -- Manfred [-- Attachment #2: patch-cpu-idle --] [-- Type: text/plain, Size: 2940 bytes --] // $Header$ // Kernel Version: // VERSION = 2 // PATCHLEVEL = 4 // SUBLEVEL = 3 // EXTRAVERSION = -ac3 --- 2.4/include/linux/pm.h Thu Jan 4 23:50:47 2001 +++ build-2.4/include/linux/pm.h Sun Apr 8 21:02:02 2001 @@ -186,6 +186,8 @@ extern void (*pm_idle)(void); extern void (*pm_power_off)(void); +int cpu_is_idle(void); + #endif /* __KERNEL__ */ #endif /* _LINUX_PM_H */ --- 2.4/kernel/sched.c Sat Apr 7 22:02:27 2001 +++ build-2.4/kernel/sched.c Sun Apr 8 21:02:37 2001 @@ -1242,6 +1242,28 @@ sched_data->last_schedule = get_cycles(); } +/** + * cpu_is_idle - helper function for idle functions + * + * pm_idle functions must call this function to verify that + * the cpu is really idle. + * The return value is valid until local interrupts are + * reenabled. + * Return values: + * 1: go into power saving mode. + * 0: cpu is not idle, return to schedule() + */ +int cpu_is_idle(void) +{ + if (current->need_resched) + return 0; + + if (softirq_active(smp_processor_id()) & softirq_mask(smp_processor_id())) + return 0; + + return 1; +} + extern void init_timervecs (void); void __init sched_init(void) --- 2.4/kernel/ksyms.c Sat Apr 7 22:02:27 2001 +++ build-2.4/kernel/ksyms.c Sun Apr 8 21:42:48 2001 @@ -440,6 +440,7 @@ #endif EXPORT_SYMBOL(kstat); EXPORT_SYMBOL(nr_running); +EXPORT_SYMBOL(cpu_is_idle); /* misc */ EXPORT_SYMBOL(panic); --- 2.4/arch/i386/kernel/process.c Thu Feb 22 22:28:52 2001 +++ build-2.4/arch/i386/kernel/process.c Sun Apr 8 21:25:23 2001 @@ -81,7 +81,7 @@ { if (current_cpu_data.hlt_works_ok && !hlt_counter) { __cli(); - if (!current->need_resched) + if (cpu_is_idle()) safe_halt(); else __sti(); @@ -105,13 +105,10 @@ */ oldval = xchg(¤t->need_resched, -1); - if (!oldval) - asm volatile( - "2:" - "cmpl $-1, %0;" - "rep; nop;" - "je 2b;" - : :"m" (current->need_resched)); + if (!oldval) { + while(cpu_is_idle()) + rep_nop(); + } } /* @@ -131,7 +128,7 @@ void (*idle)(void) = pm_idle; if (!idle) idle = default_idle; - while (!current->need_resched) + while (cpu_is_idle()) idle(); schedule(); check_pgt_cache(); --- 2.4/drivers/acpi/cpu.c Sat Apr 7 22:02:01 2001 +++ build-2.4/drivers/acpi/cpu.c Sat Apr 7 23:55:17 2001 @@ -148,7 +148,7 @@ unsigned long diff; __cli(); - if (current->need_resched) + if (!cpu_is_idle()) goto out; if (acpi_bm_activity()) goto sleep2; @@ -171,7 +171,7 @@ unsigned long diff; __cli(); - if (current->need_resched) + if (!cpu_is_idle()) goto out; if (acpi_bm_activity()) goto sleep2; @@ -205,7 +205,7 @@ unsigned long diff; __cli(); - if (current->need_resched) + if (!cpu_is_idle()) goto out; time = acpi_read_pm_timer(); @@ -235,7 +235,7 @@ unsigned long diff; __cli(); - if (current->need_resched) + if (!cpu_is_idle()) goto out; time = acpi_read_pm_timer(); acpi_c1_count++; ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Re: softirq buggy 2001-04-08 21:35 ` [PATCH] Re: softirq buggy Manfred Spraul @ 2001-04-09 8:42 ` Albert D. Cahalan 2001-04-09 13:50 ` Andrea Arcangeli 1 sibling, 0 replies; 21+ messages in thread From: Albert D. Cahalan @ 2001-04-09 8:42 UTC (permalink / raw) To: Manfred Spraul; +Cc: linux-kernel > I'd prefer to inline cpu_is_idle(), but optimizing the idle > code path is probably not that important ;-) Sure it is, in one way: how fast can you get back to work? (not OK to take a millisecond getting out of the idle loop) ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Re: softirq buggy 2001-04-08 21:35 ` [PATCH] Re: softirq buggy Manfred Spraul 2001-04-09 8:42 ` Albert D. Cahalan @ 2001-04-09 13:50 ` Andrea Arcangeli 2001-04-09 15:26 ` Manfred Spraul 1 sibling, 1 reply; 21+ messages in thread From: Andrea Arcangeli @ 2001-04-09 13:50 UTC (permalink / raw) To: Manfred Spraul; +Cc: linux-kernel On Sun, Apr 08, 2001 at 11:35:36PM +0200, Manfred Spraul wrote: > I've attached a new patch: > > * cpu_is_idle() moved to <linux/pm.h> > * function uninlined due to header dependencies > * cpu_is_idle() doesn't call do_softirq directly, instead the caller > returns to schedule() > * cpu_is_idle() exported for modules. > * docu updated. > > I'd prefer to inline cpu_is_idle(), but optimizing the idle code path is > probably not that important ;-) your cpu_is_idle will return 0 in the need_resched != 0 check even if the cpu is idle (because of the -1 trick for avoiding the SMP-IPI to notify the cpu). The issue you are addressing is quite londstanding and it is not only related to the loop with an idle cpu. This is the way I prefer to fix it: ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.4/2.4.4pre1/ksoftirqd-1 Andrea ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Re: softirq buggy 2001-04-09 13:50 ` Andrea Arcangeli @ 2001-04-09 15:26 ` Manfred Spraul 2001-04-09 17:31 ` Andrea Arcangeli 2001-04-09 17:48 ` kuznet 0 siblings, 2 replies; 21+ messages in thread From: Manfred Spraul @ 2001-04-09 15:26 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: linux-kernel Andrea Arcangeli wrote: > > your cpu_is_idle will return 0 in the need_resched != 0 check even if the cpu > is idle (because of the -1 trick for avoiding the SMP-IPI to notify the cpu). > Fixed. > The issue you are addressing is quite londstanding and it is not only related > to the loop with an idle cpu. > > This is the way I prefer to fix it: > > ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.4/2.4.4pre1/ksoftirqd-1 > The return path to user space checks for pending softirqs. A delay of 1/HZ is only possible if the cpu loops in kernel space without returning to user space - and the functions that can loop check 'current->need_resched'. That means that either cpu_is_idle() must be renamed to schedule_required() and all 'need_resched' users should use that function, or something like your patch. Is a full thread really necessary? Just setting 'need_resched' should be enough, schedule() checks for pending softirqs. And do you have a rough idea how often that new thread is scheduled under load? Btw, you don't schedule the ksoftirqd thread if do_softirq() returns from the 'if(in_interrupt())' check. I assume that this is the most common case of delayed softirq processing: ; in process context spin_lock_bh(); ; hw interrupt arrives ; do_softirq returns immediately spin_unlock_bh(); -- Manfred ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Re: softirq buggy 2001-04-09 15:26 ` Manfred Spraul @ 2001-04-09 17:31 ` Andrea Arcangeli 2001-04-09 17:48 ` kuznet 1 sibling, 0 replies; 21+ messages in thread From: Andrea Arcangeli @ 2001-04-09 17:31 UTC (permalink / raw) To: Manfred Spraul; +Cc: linux-kernel [-- Attachment #1: Type: text/plain, Size: 2730 bytes --] On Mon, Apr 09, 2001 at 05:26:27PM +0200, Manfred Spraul wrote: > The return path to user space checks for pending softirqs. A delay of And it breaks the loop too if new softirq events become pending again in background. > 1/HZ is only possible if the cpu loops in kernel space without returning > to user space - and the functions that can loop check It is also possible when new events are posted and I think it makes sense to scale the softirq load with the scheduler when it is flooding. Theoretically one could move the _whole_ softirq load into the ksoftirqd, but that would increase the latency too much and I think it is better to use it only as a fallback when we have to giveup but we still would like to keep processing the softirq load so we let the scheduler to choose if we still can do that or if we should giveup on the softirq. > Is a full thread really necessary? Just setting 'need_resched' should be > enough, schedule() checks for pending softirqs. If you abuse need_resched then you can starve userspace again, if you are ok to starve userspace indefinitely then it is more efficient to keep looping forever into do_softirq as far as new events are posted in background instead of exiting do_softirq and waiting the scheduler to kickin again. > And do you have a rough idea how often that new thread is scheduled > under load? The scheduling is not as heavy as with tasks, it's a kernel thread so the tlb isn't touched. However yes it will generate some overhead with schedule() compared to just waiting the 1/HZ but letting the scheduler to understand when the softirq should keep running instead of another task is supposed to be a feature. I run a netpipe run with an alpha SMP as receiver with the ksoftirqd patch and then without and the numbers didn't changed at all even if the ksofitrqd was often running (1/2% of the load of the machine). > Btw, you don't schedule the ksoftirqd thread if do_softirq() returns > from the 'if(in_interrupt())' check. That's not necessary and it's intentional, such check will be passed in the last do_softirq executed before returning to userspace or kernel normal context, the reason of such check is only to avoid recursing too much on the stack during nested irqs. > I assume that this is the most common case of delayed softirq > processing: > > ; in process context > spin_lock_bh(); > ; hw interrupt arrives > ; do_softirq returns immediately > spin_unlock_bh(); This is yet another case and it's handled before returning to userspace so the latency should still be pretty small (and there would be no singificant advantage and almost certainly only a performance drop in waking up ksoftirqd from the `do_softirq returns immediatly' line). Andrea [-- Attachment #2: ksofitrqd.png --] [-- Type: image/png, Size: 3940 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Re: softirq buggy 2001-04-09 15:26 ` Manfred Spraul 2001-04-09 17:31 ` Andrea Arcangeli @ 2001-04-09 17:48 ` kuznet 2001-04-09 18:26 ` Andrea Arcangeli 1 sibling, 1 reply; 21+ messages in thread From: kuznet @ 2001-04-09 17:48 UTC (permalink / raw) To: Manfred Spraul; +Cc: linux-kernel Hello! > Btw, you don't schedule the ksoftirqd thread if do_softirq() returns > from the 'if(in_interrupt())' check. ksoftirqd will not be switched to before the first schedule or ret form syscall, when softirqs will be processed in any case. So, wake up in this case would be mistake. > I assume that this is the most common case of delayed softirq > processing: softirqs have the same latency warranty as rt threads, so that this is not a problem at all. The _real_ problem is softirqs generated from another softirqs: additonal thread is made _not_ to speed up softirqs, but to _tame_ them (if I understood Andres's explanations correctly). Alexey ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Re: softirq buggy 2001-04-09 17:48 ` kuznet @ 2001-04-09 18:26 ` Andrea Arcangeli 0 siblings, 0 replies; 21+ messages in thread From: Andrea Arcangeli @ 2001-04-09 18:26 UTC (permalink / raw) To: kuznet; +Cc: Manfred Spraul, linux-kernel On Mon, Apr 09, 2001 at 09:48:02PM +0400, kuznet@ms2.inr.ac.ru wrote: > Hello! > > > Btw, you don't schedule the ksoftirqd thread if do_softirq() returns > > from the 'if(in_interrupt())' check. > > ksoftirqd will not be switched to before the first schedule > or ret form syscall, when softirqs will be processed in any case. > So, wake up in this case would be mistake. Agreed. > The _real_ problem is softirqs generated from another softirqs: > additonal thread is made _not_ to speed up softirqs, but to _tame_ > them (if I understood Andres's explanations correctly). Exactly. Andrea ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Serial port latency 2001-03-25 23:10 ` Serial port latency Pavel Machek 2001-03-29 7:58 ` Manfred Spraul @ 2001-04-10 0:37 ` Andrea Arcangeli 1 sibling, 0 replies; 21+ messages in thread From: Andrea Arcangeli @ 2001-04-10 0:37 UTC (permalink / raw) To: Pavel Machek; +Cc: Manfred Spraul, geirt, linux-kernel On Sun, Mar 25, 2001 at 11:10:14PM +0000, Pavel Machek wrote: > I've seen similar bugs. If you hook something on schedule_tq and forget > to set current->need_resched, this is exactly what you get. keventd fixes tq_scheduler case (tq_scheduler is dead). Andrea ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Re: softirq buggy
@ 2001-04-09 11:37 Studierende der Universitaet des Saarlandes
0 siblings, 0 replies; 21+ messages in thread
From: Studierende der Universitaet des Saarlandes @ 2001-04-09 11:37 UTC (permalink / raw)
To: acahalan; +Cc: linux-kernel
> > I'd prefer to inline cpu_is_idle(), but optimizing the idle
> >code path is probably not that important ;-)
>
> Sure it is, in one way: how fast can you get back to work?
> (not OK to take a millisecond getting out of the idle loop)
2 short function calls instead of 2 "if(current->need_resched)" on the
way out.
I didn't try very hard to fix the inline dependencies, could you try to
move cpu_is_idle() from kernel/sched.c into <linux/pm.h>?
I'm sure it won't be more difficult than the last "Athlon+SMP doesn't
compile" problem ;-)
--
Manfred
^ permalink raw reply [flat|nested] 21+ messages in threadend of thread, other threads:[~2001-04-10 0:35 UTC | newest]
Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <000401c0b319517fea9@local>
2001-03-25 23:10 ` Serial port latency Pavel Machek
2001-03-29 7:58 ` Manfred Spraul
2001-03-30 22:36 ` Pavel Machek
2001-03-31 22:09 ` Manfred Spraul
2001-04-03 23:07 ` softirq buggy [Re: Serial port latency] Pavel Machek
2001-04-04 21:18 ` Manfred Spraul
2001-04-06 12:00 ` Pavel Machek
2001-04-07 22:28 ` Manfred Spraul
2001-04-08 16:58 ` kuznet
2001-04-08 17:21 ` Manfred Spraul
2001-04-08 17:58 ` kuznet
2001-04-08 18:16 ` Manfred Spraul
2001-04-08 21:35 ` [PATCH] Re: softirq buggy Manfred Spraul
2001-04-09 8:42 ` Albert D. Cahalan
2001-04-09 13:50 ` Andrea Arcangeli
2001-04-09 15:26 ` Manfred Spraul
2001-04-09 17:31 ` Andrea Arcangeli
2001-04-09 17:48 ` kuznet
2001-04-09 18:26 ` Andrea Arcangeli
2001-04-10 0:37 ` Serial port latency Andrea Arcangeli
2001-04-09 11:37 [PATCH] Re: softirq buggy Studierende der Universitaet des Saarlandes
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox