* softirq in pre3 and all linux ports
From: Andrea Arcangeli @ 2001-06-19 19:03 UTC
To: Linus Torvalds, Alan Cox, Ingo Molnar, kuznet; +Cc: linux-kernel

With pre3 there are bugs introduced into mainline that are being extended
to all architectures.

First of all, nuking the handle_softirq check from entry.S is wrong. ppc
copied without thinking and we'll need to resurrect it there too, for
example, so please arch maintainers don't kill that check (alpha in pre3
by luck didn't kill it, I think). Without such a check before returning
to userspace, any tasklet or softirq posted by kernel code will get a
latency of 1/HZ.

Secondly, the pre3 do_softirq re-enables irqs before returning, which is
wrong as well because it can generate an irq flood and stack overflow
when do_softirq does not run at the first, non-nested irq layer.

Third, if a softirq or a tasklet posts itself again while running,
do_softirq can starve userspace on one or more cpus.

Fourth, if the tasklet or softirq or bottom half handler is marked
running again because of another event (like a nested irq), the kernel
can starve userspace too. (Softirqs are much heavier than the irq
handler, so it can also live lock much more easily this way.)

This patch, which I have had in my tree for some days, fixes all those
issues. The assembler changes needed in the entry.S files while returning
to userspace can be described in C code this way. This is the 2.4.5 way:

	if (softirq_active(cpu) & softirq_mask(cpu))
		do_softirq();

This is the 2.4.6pre3+belowfix way:

	if (softirq_pending(cpu))
		do_softirq();

`pending' doesn't need to be a 64bit integer (it can be, though) but it
needs to be at least a 32bit integer. An `int' is fine for most archs; on
alpha we use a long though and that's fine too.
So I recommend Linus merge this patch that fixes all the above mentioned
bugs (the anti starvation/live lock logic is called ksoftirqd):

	ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.6pre3aa1/00_ksoftirqd-6

Plus these SMP race fixes for archs where atomic operations aren't
implicit memory barriers:

	ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.6pre3aa1/00_softirq-fixes-4

Plus this scheduler generic cpu binding fix to avoid ksoftirqd
deadlocking at boot:

	ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.6pre3aa1/00_cpus_allowed-1

I verified the patches apply just fine to 2.4.6pre3 and they're not
controversial. If you have any question on how to update a certain kernel
port I will do my best to help in the update process!

Andrea
* Re: softirq in pre3 and all linux ports
From: Paul Mackerras @ 2001-06-20 3:33 UTC
To: Andrea Arcangeli
Cc: Linus Torvalds, Alan Cox, Ingo Molnar, kuznet, linux-kernel

Andrea Arcangeli writes:
> With pre3 there are bugs introduced into mainline that are being
> extended to all architectures.
>
> First of all, nuking the handle_softirq check from entry.S is wrong.
> ppc copied without thinking and we'll need to resurrect it there too

Well, I object to the "without thinking" bit. It seems to me that code
that raises a softirq without having either hard interrupts or BHs
disabled is buggy - why would you want to do that? And if we do want to
allow that, shouldn't we put the check in raise_softirq or the
equivalent, to get the minimum latency?

> Fourth, if the tasklet or softirq or bottom half handler is marked
> running again because of another event (like a nested irq), the kernel
> can starve userspace too. (Softirqs are much heavier than the irq
> handler, so it can also live lock much more easily this way.)

Soft irqs should definitely not be much heavier than an irq handler; if
they are then we have implemented them wrongly somehow.

> So I recommend Linus merge this patch that fixes all the above
> mentioned bugs (the anti starvation/live lock logic is called
> ksoftirqd):

ksoftirqd seems like the wrong solution to the problem to me. If we are
really getting starved by softirqs then we need to look at whether
whatever is doing it should be a kernel thread itself, rather than doing
its work in softirqs. Do you have a concrete example of the
starvation/live lock that you can describe to us?

Regards,
Paul.
* Re: softirq in pre3 and all linux ports
From: Andrea Arcangeli @ 2001-06-20 3:54 UTC
To: Paul Mackerras
Cc: Linus Torvalds, Alan Cox, Ingo Molnar, kuznet, linux-kernel

On Wed, Jun 20, 2001 at 01:33:19PM +1000, Paul Mackerras wrote:
> Well, I object to the "without thinking" bit. [..]

Agreed, apologies.

> BHs disabled is buggy - why would you want to do that? And if we do

tasklet_schedule

> want to allow that, shouldn't we put the check in raise_softirq or the
> equivalent, to get the minimum latency?

We should release the stack before running the softirq (some places use
softirqs to release the stack and avoid overflows).

> Soft irqs should definitely not be much heavier than an irq handler,
> if they are then we have implemented them wrongly somehow.

ip + tcp are more intensive than just queueing a packet into a backlog.
That's why they're not done in irq context in the first place.

> ksoftirqd seems like the wrong solution to the problem to me, if we
> really getting starved by softirqs then we need to look at whether
> whatever is doing it should be a kernel thread itself rather than
> doing it in softirqs. Do you have a concrete example of the
> starvation/live lockup that you can describe to us?

I don't have gigabit ethernet so I cannot flood my boxes to death.
But I think it's real, and a softirq marking itself runnable again is
another case we have to handle without live locks or starvation.

Andrea
* Re: softirq in pre3 and all linux ports
From: David S. Miller @ 2001-06-20 4:00 UTC
To: Andrea Arcangeli
Cc: Paul Mackerras, Linus Torvalds, Alan Cox, Ingo Molnar, kuznet, linux-kernel

Andrea Arcangeli writes:
> I don't have gigabit ethernet so I cannot flood my boxes to death.
> But I think it's real, and a softirq marking itself runnable again is
> another case we have to handle without live locks or starvation.

I think (still) that you're just moving the problem around and not
actually changing anything.

Later,
David S. Miller
davem@redhat.com
* Re: softirq in pre3 and all linux ports
From: Andrea Arcangeli @ 2001-06-20 4:07 UTC
To: David S. Miller
Cc: Paul Mackerras, Linus Torvalds, Alan Cox, Ingo Molnar, kuznet, linux-kernel

On Tue, Jun 19, 2001 at 09:00:24PM -0700, David S. Miller wrote:
> Andrea Arcangeli writes:
> > I don't have gigabit ethernet so I cannot flood my boxes to death.
> > But I think it's real, and a softirq marking itself runnable again is
> > another case we have to handle without live locks or starvation.
>
> I think (still) that you're just moving the problem around and
> not actually changing anything.

Something will definitely have to change radically if the softirq marks
itself runnable again. But this to me sounds similar to the other case
(an irq flood that basically leaves the softirq pending every time you
check it).

Andrea
* Re: softirq in pre3 and all linux ports
From: kuznet @ 2001-06-20 18:06 UTC
To: Andrea Arcangeli
Cc: davem, paulus, torvalds, alan, mingo, linux-kernel

Hello!

> > Andrea Arcangeli writes:
> > > I don't have gigabit ethernet so I cannot flood my boxes to death.
> > > But I think it's real, and a softirq marking itself runnable again
> > > is another case we have to handle without live locks or starvation.

Andrea, you do not need gigabit interfaces to check this. 100Mbit ones
are enough and even better, because as a rule they do no interrupt
mitigation and consume more resources. 8)

Actually, you may laugh, but one 10Mbit(!) interface is enough in some
circumstances, namely when the stack does more work than usual:
sniffing, connection tracking in the presence of fragments, syn flooding
etc.

Actually, now I do not understand why TUX still works with Ingo's patch.
As soon as the bulk of the work is done in thread context, it should die
pretty fast, making no progress. :-)

> > I think (still) that you're just moving the problem around and
> > not actually changing anything.

Well, ksoftirqd is not just a sort of placebo, though. :-)

OK. Let's forget about the infinite thread latency and live lock
problems introduced by Ingo's patch. After all, BSD has done exactly the
same thing for ages and nobody but the security paranoid cried about
this too much. We are just fully bsd compliant now. 8)

Let's look at it from a different angle: f.e. with Ingo's patch, as soon
as one cpu processes some global BH, all the rest of the cpus will spin
waiting for the global bh lock release. Is this good? I am afraid this
is not quite good.

Alexey
* Re: softirq in pre3 and all linux ports
From: David S. Miller @ 2001-06-20 22:10 UTC
To: kuznet
Cc: Andrea Arcangeli, paulus, torvalds, alan, mingo, linux-kernel

kuznet@ms2.inr.ac.ru writes:
> Actually, now I do not understand why TUX still works with Ingo's
> patch. As soon as the bulk of the work is done in thread context, it
> should die pretty fast, making no progress. :-)

TUX also has the per-cpu timers patch of Ingo as well. Did you forget
this? :-)

> Let's look at it from a different angle: f.e. with Ingo's patch, as
> soon as one cpu processes some global BH, all the rest of the cpus
> will spin waiting for the global bh lock release. Is this good? I am
> afraid this is not quite good.

It is equivalent to some old dumb code doing cli(), right?

The only interesting global BHs left right now are:

1) Timers
2) SCSI BH

SCSI may be transformed right now, in 15 minutes of boring editing, into
a softirq; it has all the appropriate locking already.

Timers have no hard technical reason for not being a softirq either.
However, this would be work requiring real thought, not just mindless
edits.

Later,
David S. Miller
davem@redhat.com
* Re: softirq in pre3 and all linux ports
From: Andrea Arcangeli @ 2001-06-20 23:16 UTC
To: David S. Miller
Cc: kuznet, paulus, torvalds, alan, mingo, linux-kernel

On Wed, Jun 20, 2001 at 03:10:13PM -0700, David S. Miller wrote:
> TUX also has the per-cpu timers patch of Ingo as well.

Not in my tree; tux doesn't depend on it at all. That's a further
optimization that tcp will take advantage of regardless of tux; the same
applies to the pagecache scalability hashlock patch.

Andrea
* Re: softirq in pre3 and all linux ports
From: kuznet @ 2001-06-21 16:58 UTC
To: David S. Miller
Cc: andrea, paulus, torvalds, alan, mingo, linux-kernel

Hello!

> TUX also has the per-cpu timers patch of Ingo as well.
> Did you forget this? :-)

If I remember correctly, it has a threaded timer pool, but the timers
still acquire the global bh lock, so the things only become worse.
Apparently, it is invisible at first sight because the bulk work typical
for tux and triggered by timers is moved to cpu local tasklets (garbage
collection: time wait etc.).

> It is equivalent to some old dumb code doing cli(), right?

Sort of.

> The only interesting global BHs left right now are:
>
> 1) Timers
> 2) SCSI BH

In the generic server case, yes. But also add BH_IMMEDIATE and the BHs
used by hordes of devices.

> Timers have no hard technical reason for not being a softirq
> either. However, this would be work requiring real thought,
> not just mindless edits.

Yes. But, in any case, global BHs are not a pathology: they were a handy
tool, allowing us to hide lots of spinlocks. And not plain spinlocks,
but asynchronous ones. It was pretty lightweight, but had latency up to
1/HZ in the worst case. Now they have an unreasonably strict latency
(useless, as a rule) but eat cpu instead.

Alexey
* Re: softirq in pre3 and all linux ports
From: Paul Mackerras @ 2001-06-20 12:18 UTC
To: Andrea Arcangeli
Cc: Linus Torvalds, Alan Cox, Ingo Molnar, kuznet, linux-kernel

Andrea Arcangeli writes:
> We should release the stack before running the softirq (some places
> use softirqs to release the stack and avoid overflows).

Well, if they are relying on having a lot of stack available then those
places are buggy. Once the softirq is made pending it can run at any
time that interrupts are enabled. You can't rely on a softirq handler
having any more stack available than a hard interrupt handler has.

> ip + tcp are more intensive than just queueing a packet into a
> backlog. That's why they're not done in irq context in the first place.

Ah, ok, I misunderstood; I thought you were saying that the softirq
framework itself had a lot of overhead.

> I don't have gigabit ethernet so I cannot flood my boxes to death.
> But I think it's real, and a softirq marking itself runnable again is
> another case we have to handle without live locks or starvation.

As for the gigabit ethernet case, if we are having packets coming in and
generating hard interrupts at that sort of rate, then what we really
need is the sort of interrupt throttling that Jamal talked about at the
2.5 kernel kickoff.

It seems to me that softirqs are possibly being used in some places
where a kernel thread would be more appropriate. Instead of making
softirqs use a kernel thread, I think it would be better to find the
places that should use a thread and make them do so. Softirqs are
still, after all, interrupt handlers (ones that run at a lower priority
than any hardware interrupt) and should be treated as such.

Paul.
* Re: softirq in pre3 and all linux ports
From: Andrea Arcangeli @ 2001-06-20 12:52 UTC
To: Paul Mackerras
Cc: Linus Torvalds, Alan Cox, Ingo Molnar, kuznet, linux-kernel

On Wed, Jun 20, 2001 at 10:18:10PM +1000, Paul Mackerras wrote:
> Well, if they are relying on having a lot of stack available then
> those places are buggy. Once the softirq is made pending it can run
> at any

It's not about having lots of stack available, it's about avoiding
recursion.

Andrea
* Re: softirq in pre3 and all linux ports
From: kuznet @ 2001-06-20 18:16 UTC
To: paulus
Cc: andrea, torvalds, alan, mingo, linux-kernel

Hello!

> Soft irqs should definitely not be much heavier than an irq handler,
> if they are then we have implemented them wrongly somehow.

For example, all of networking fits nicely into this class. :-)

Alexey