* Re: [announce, patch, RFC] "big IRQ lock" removal, IRQ cleanups.
2002-07-20 20:54 Ingo Molnar
@ 2002-07-19 21:49 ` george anzinger
2002-07-20 14:25 ` Marcin Dalecki
2002-07-20 20:57 ` Ingo Molnar
2 siblings, 0 replies; 11+ messages in thread
From: george anzinger @ 2002-07-19 21:49 UTC (permalink / raw)
To: Ingo Molnar; +Cc: linux-kernel, Linus Torvalds, Russell King, David S. Miller
Ingo Molnar wrote:
>
> the following patch, against 2.5.26:
>
> http://redhat.com/~mingo/remove-irqlock-patches/remove-irqlock-2.5.26-A2
I haven't applied the patch, but just looking at it, it
appears that you have removed the conditional call to
softirq on local_bh_enable(). Isn't this needed?
-g
>
> is a work-in-progress massive cleanup of the IRQ subsystem. It's losely
> based on Linus' original idea and DaveM's original implementation, to fold
> our various irq, softirq and bh counters into the preemption counter.
>
> with this approach it was possible:
>
> - to remove the 'big IRQ lock' on SMP - on which sti() and cli() relied.
>
> - to streamline/simplify arch/i386/kernel/irq.c significantly.
>
> - to simplify the softirq code.
>
> - to remove the preemption count increase/decrease code from the lowlevel
> IRQ assembly code.
>
> - to speed up schedule() a bit.
>
> sti() and cli() is gone forever, there is no more globally synchronizing
> irq-disabling capability. All code that relied on sti() and cli() and
> restore_flags() must use other locking mechanisms from now on (spinlocks
> and __cli()/__sti()).
>
> obviously this patch breaks massive amounts of code, so only limited
> .configs are working at the moment, such as:
>
> http://redhat.com/~mingo/remove-irqlock-patches/config
>
> otherwise the patch was developed and tested on SMP systems, and while the
> code is still a bit rough in places, the base IRQ code appears to be
> pretty robust and clean.
>
> while it boots already so the worst is over, there is lots of work left:
> eg. to fix the serial layer to not use cli()/sti() and bhs ...
>
> RMK, is there any chance to get your new serial layer into 2.5 sometime
> soon? ['soon' as in 'tomorrow' :-) ] That is perhaps one of the biggest
> kernel subsystems that make use of cli()/sti() currently. The rest is
> drivers mostly, which is still not unsignificant, but perhaps a bit easier
> to manage.
>
> Ingo
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
--
George Anzinger george@mvista.com
High-res-timers:
http://sourceforge.net/projects/high-res-timers/
Real time sched: http://sourceforge.net/projects/rtsched/
Preemption patch:
http://www.kernel.org/pub/linux/kernel/people/rml
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [announce, patch, RFC] "big IRQ lock" removal, IRQ cleanups.
2002-07-20 20:54 Ingo Molnar
2002-07-19 21:49 ` george anzinger
@ 2002-07-20 14:25 ` Marcin Dalecki
2002-07-21 21:17 ` Ingo Molnar
2002-07-20 20:57 ` Ingo Molnar
2 siblings, 1 reply; 11+ messages in thread
From: Marcin Dalecki @ 2002-07-20 14:25 UTC (permalink / raw)
To: Ingo Molnar; +Cc: linux-kernel, Linus Torvalds, Russell King, David S. Miller
> RMK, is there any chance to get your new serial layer into 2.5 sometime
> soon? ['soon' as in 'tomorrow' :-) ] That is perhaps one of the biggest
Well I would like to allo wmysefl to allow a tad bit of advocacy for
this step.
I looked at it already quite a time ago and *second* it!
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [announce, patch, RFC] "big IRQ lock" removal, IRQ cleanups.
@ 2002-07-20 17:57 Oleg Nesterov
2002-07-21 23:43 ` george anzinger
0 siblings, 1 reply; 11+ messages in thread
From: Oleg Nesterov @ 2002-07-20 17:57 UTC (permalink / raw)
To: linux-kernel; +Cc: Ingo Molnar
Hello.
> - to remove the preemption count increase/decrease code from the lowlevel
> IRQ assembly code.
So do_IRQ() can start with preempt_count == 0.
Suppose another cpu sets TIF_NEED_RESCHED flag
at the same time.
spin_lock(&desc->lock) sets preempt_count == 1.
Before calling handle_IRQ_event() (which adds IRQ_OFFSET
to preempt_count), do_IRQ() does spin_unlock(&desc->lock)
and falls into schedule().
Am I missed something?
It seems to me that call to irq_enter() must be shifted
from handle_IRQ_event() to do_IRQ().
Oleg.
^ permalink raw reply [flat|nested] 11+ messages in thread
* [announce, patch, RFC] "big IRQ lock" removal, IRQ cleanups.
@ 2002-07-20 20:54 Ingo Molnar
2002-07-19 21:49 ` george anzinger
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Ingo Molnar @ 2002-07-20 20:54 UTC (permalink / raw)
To: linux-kernel; +Cc: Linus Torvalds, Russell King, David S. Miller
the following patch, against 2.5.26:
http://redhat.com/~mingo/remove-irqlock-patches/remove-irqlock-2.5.26-A2
is a work-in-progress massive cleanup of the IRQ subsystem. It's losely
based on Linus' original idea and DaveM's original implementation, to fold
our various irq, softirq and bh counters into the preemption counter.
with this approach it was possible:
- to remove the 'big IRQ lock' on SMP - on which sti() and cli() relied.
- to streamline/simplify arch/i386/kernel/irq.c significantly.
- to simplify the softirq code.
- to remove the preemption count increase/decrease code from the lowlevel
IRQ assembly code.
- to speed up schedule() a bit.
sti() and cli() is gone forever, there is no more globally synchronizing
irq-disabling capability. All code that relied on sti() and cli() and
restore_flags() must use other locking mechanisms from now on (spinlocks
and __cli()/__sti()).
obviously this patch breaks massive amounts of code, so only limited
.configs are working at the moment, such as:
http://redhat.com/~mingo/remove-irqlock-patches/config
otherwise the patch was developed and tested on SMP systems, and while the
code is still a bit rough in places, the base IRQ code appears to be
pretty robust and clean.
while it boots already so the worst is over, there is lots of work left:
eg. to fix the serial layer to not use cli()/sti() and bhs ...
RMK, is there any chance to get your new serial layer into 2.5 sometime
soon? ['soon' as in 'tomorrow' :-) ] That is perhaps one of the biggest
kernel subsystems that make use of cli()/sti() currently. The rest is
drivers mostly, which is still not unsignificant, but perhaps a bit easier
to manage.
Ingo
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [announce, patch, RFC] "big IRQ lock" removal, IRQ cleanups.
2002-07-20 20:54 Ingo Molnar
2002-07-19 21:49 ` george anzinger
2002-07-20 14:25 ` Marcin Dalecki
@ 2002-07-20 20:57 ` Ingo Molnar
2 siblings, 0 replies; 11+ messages in thread
From: Ingo Molnar @ 2002-07-20 20:57 UTC (permalink / raw)
To: linux-kernel; +Cc: Linus Torvalds, Russell King, David S. Miller
> the following patch, against 2.5.26:
>
> http://redhat.com/~mingo/remove-irqlock-patches/remove-irqlock-2.5.26-A2
forgot to mention that the patch applies ontop the scheduler-improvements
patch:
http://redhat.com/~mingo/O(1)-scheduler/sched-nobatch-2.5.26-C2
Ingo
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [announce, patch, RFC] "big IRQ lock" removal, IRQ cleanups.
2002-07-20 14:25 ` Marcin Dalecki
@ 2002-07-21 21:17 ` Ingo Molnar
0 siblings, 0 replies; 11+ messages in thread
From: Ingo Molnar @ 2002-07-21 21:17 UTC (permalink / raw)
To: martin; +Cc: linux-kernel, Linus Torvalds, Russell King, David S. Miller
Martin,
On Sat, 20 Jul 2002, Marcin Dalecki wrote:
> Well I would like to allow myself to allow a tad bit of advocacy for
> this step.
the attached patch was a quick hack to get rid of the global cli() from
drivers/ide/main.c - but it's obviously broken and i'd like to fix it.
how can this be done cleanly - should we do a:
synchronize_irq(drive->channel->irq);
before unregistering the driver?
this might not even be necessery i believe, since the implicit (?)
free_irq() synchronizes with all pending instances of that interrupt
source.
is there something else i'm missing, something else we need to synchronize
with - perhaps the timers?
Ingo
--- linux/drivers/ide/main.c.orig Sun Jul 21 20:37:12 2002
+++ linux/drivers/ide/main.c Sun Jul 21 21:06:28 2002
@@ -1091,18 +1091,18 @@
{
unsigned long flags;
- save_flags(flags); /* all CPUs */
- cli(); /* all CPUs */
+ __save_flags(flags); // FIXME: is this safe?
+ __cli();
#if 0
if (__MOD_IN_USE(ata_ops(drive)->owner)) {
- restore_flags(flags);
+ __restore_flags(flags); // FIXME: is this safe?
return 1;
}
#endif
if (drive->usage || drive->busy || !ata_ops(drive)) {
- restore_flags(flags); /* all CPUs */
+ __restore_flags(flags); // FIXME: is this safe?
return 1;
}
@@ -1111,7 +1111,7 @@
#endif
drive->driver = NULL;
- restore_flags(flags); /* all CPUs */
+ __restore_flags(flags); // FIXME: is this safe?
return 0;
}
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [announce, patch, RFC] "big IRQ lock" removal, IRQ cleanups.
2002-07-20 17:57 [announce, patch, RFC] "big IRQ lock" removal, IRQ cleanups Oleg Nesterov
@ 2002-07-21 23:43 ` george anzinger
2002-07-21 23:50 ` Ingo Molnar
0 siblings, 1 reply; 11+ messages in thread
From: george anzinger @ 2002-07-21 23:43 UTC (permalink / raw)
To: Oleg Nesterov; +Cc: linux-kernel, Ingo Molnar
Oleg Nesterov wrote:
>
> Hello.
>
> > - to remove the preemption count increase/decrease code from the lowlevel
> > IRQ assembly code.
IMHO this is unwise as the system NEEDS to exit to user (or
system) space when preempt_count is zero with a garuntee
that TIF_NEED_RESCHED is 0. To do this from irq.c means
that it must exit with interrupts off and the the low level
code needs to keep them off till the irtn. But this is
where we test TIF_NEED_RESCHED, and if it is set, schedule()
is called and should be called with preempt_count != 0, to
avoid stack overflow interrupt loops. And, schedule()
returns with the interrupt system on (even if we, unwisely,
call it with it off).
>
> So do_IRQ() can start with preempt_count == 0.
> Suppose another cpu sets TIF_NEED_RESCHED flag
> at the same time.
>
> spin_lock(&desc->lock) sets preempt_count == 1.
> Before calling handle_IRQ_event() (which adds IRQ_OFFSET
> to preempt_count), do_IRQ() does spin_unlock(&desc->lock)
> and falls into schedule().
>
> Am I missed something?
Nope, you got it right.
>
> It seems to me that call to irq_enter() must be shifted
> from handle_IRQ_event() to do_IRQ().
Even then...
>
> Oleg.
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
--
George Anzinger george@mvista.com
High-res-timers:
http://sourceforge.net/projects/high-res-timers/
Real time sched: http://sourceforge.net/projects/rtsched/
Preemption patch:
http://www.kernel.org/pub/linux/kernel/people/rml
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [announce, patch, RFC] "big IRQ lock" removal, IRQ cleanups.
2002-07-21 23:43 ` george anzinger
@ 2002-07-21 23:50 ` Ingo Molnar
0 siblings, 0 replies; 11+ messages in thread
From: Ingo Molnar @ 2002-07-21 23:50 UTC (permalink / raw)
To: george anzinger; +Cc: Oleg Nesterov, linux-kernel, Linus Torvalds
On Sun, 21 Jul 2002, george anzinger wrote:
> > > - to remove the preemption count increase/decrease code from the lowlevel
> > > IRQ assembly code.
>
> IMHO this is unwise as the system NEEDS to exit to user (or
> system) space when preempt_count is zero with a garuntee
> that TIF_NEED_RESCHED is 0. [...]
my comment was too compact - the bumping of the preemption count did not
get removed, it's just the dualness that got removed. Both the lowlevel
entry code and the irq_enter() code used to do the same thing - now only
irq_enter() does it.
in the latest patch irq_enter() is done early in do_IRQ(), and irq_exit()
is done before returning from do_IRQ().
> [...] To do this from irq.c means that it must exit with interrupts off
> and the the low level code needs to keep them off till the irtn. [...]
yes, we are very careful to keep irqs disabled in do_IRQ(), both before
and after calling the handler.
> [...] But this is where we test TIF_NEED_RESCHED, and if it is set,
> schedule() is called and should be called with preempt_count != 0, to
> avoid stack overflow interrupt loops. And, schedule() returns with the
> interrupt system on (even if we, unwisely, call it with it off).
yes, in this case schedule() is called with a nonzero preempt count
(PREEMPT_ACTIVE), to keep it from recursing into itself.
> > It seems to me that call to irq_enter() must be shifted from
> > handle_IRQ_event() to do_IRQ().
>
> Even then...
could you check out the latest patch, can you still see any hole in it?
All the above scenarios should be handled correctly.
Ingo
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [announce, patch, RFC] "big IRQ lock" removal, IRQ cleanups.
@ 2002-07-22 14:45 Oleg Nesterov
2002-07-22 18:03 ` george anzinger
0 siblings, 1 reply; 11+ messages in thread
From: Oleg Nesterov @ 2002-07-22 14:45 UTC (permalink / raw)
To: linux-kernel; +Cc: Ingo Molnar, george anzinger
Hello.
> > [...] To do this from irq.c means that it must exit with interrupts off
> > and the the low level code needs to keep them off till the irtn. [...]
> yes, we are very careful to keep irqs disabled in do_IRQ(), both before
> and after calling the handler.
Note that smp_xxx_interrupt() functions must be carefull
with preemt_{disable,enable} brackets.
For example, smp_invalidate_interrupt() may be preempted
after put_cpu(). Probably not big deal (it is return path),
but it is better to use preempt_enable_no_resched() here -
let ret_from_intr: do its job.
smp_{error,spurious,thermal}_interrupt() - all of them
use printk() without bumping preemt_count and have problem
after spin_unlock_irqrestore(&logbuf_lock, flags).
If these problems worth fixing, then preempt_stop (cli)
can be killed in entry.S:ret_from_intr(), yes? If i understand
correctly none of the irq handlers should return to low level
code with irq enabled.
P.S.
May I suggest somebody with good english fix
Documentation/preempt-locking.txt?
It states, that disabled interrupts prevents preemption.
Yes, but only in a sense, that the delivery of reschedule
interrupt is suppressed.
Process with irqs disabled and current->preempt_count == 0 can
be preempted (with interrupts enabled) after spin_lock/unlock etc.
Even in UP case preemption can happen while calling wake_up_...().
Oleg.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [announce, patch, RFC] "big IRQ lock" removal, IRQ cleanups.
2002-07-22 14:45 Oleg Nesterov
@ 2002-07-22 18:03 ` george anzinger
2002-07-22 18:04 ` Ingo Molnar
0 siblings, 1 reply; 11+ messages in thread
From: george anzinger @ 2002-07-22 18:03 UTC (permalink / raw)
To: Oleg Nesterov; +Cc: linux-kernel, Ingo Molnar
Oleg Nesterov wrote:
>
> Hello.
>
> > > [...] To do this from irq.c means that it must exit with interrupts off
> > > and the the low level code needs to keep them off till the irtn. [...]
>
> > yes, we are very careful to keep irqs disabled in do_IRQ(), both before
> > and after calling the handler.
>
> Note that smp_xxx_interrupt() functions must be carefull
> with preemt_{disable,enable} brackets.
>
> For example, smp_invalidate_interrupt() may be preempted
> after put_cpu(). Probably not big deal (it is return path),
> but it is better to use preempt_enable_no_resched() here -
> let ret_from_intr: do its job.
>
> smp_{error,spurious,thermal}_interrupt() - all of them
> use printk() without bumping preemt_count and have problem
> after spin_unlock_irqrestore(&logbuf_lock, flags).
>
> If these problems worth fixing, then preempt_stop (cli)
> can be killed in entry.S:ret_from_intr(), yes?
I REALLY have a problem with this abstraction for cli. I
think it just makes the code hard to read...
> If i understand
> correctly none of the irq handlers should return to low level
> code with irq enabled.
But schedule and signal code does return with interrupts
enabled, so a cli is still needed here. Also at least some
of the trap code returns with interrupts enabled.
>
> P.S.
>
> May I suggest somebody with good english fix
> Documentation/preempt-locking.txt?
> It states, that disabled interrupts prevents preemption.
> Yes, but only in a sense, that the delivery of reschedule
> interrupt is suppressed.
>
> Process with irqs disabled and current->preempt_count == 0 can
> be preempted (with interrupts enabled) after spin_lock/unlock etc.
> Even in UP case preemption can happen while calling wake_up_...().
This is really a bug and a fix is on the way. Turning
interrupts off MUST disable preemption, but trying to
preempt from this state is so rare that the test will be in
preempt_schedule() rather than inline or an attempt to put
disable/enable code along with each cli/sti.
--
George Anzinger george@mvista.com
High-res-timers:
http://sourceforge.net/projects/high-res-timers/
Real time sched: http://sourceforge.net/projects/rtsched/
Preemption patch:
http://www.kernel.org/pub/linux/kernel/people/rml
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [announce, patch, RFC] "big IRQ lock" removal, IRQ cleanups.
2002-07-22 18:03 ` george anzinger
@ 2002-07-22 18:04 ` Ingo Molnar
0 siblings, 0 replies; 11+ messages in thread
From: Ingo Molnar @ 2002-07-22 18:04 UTC (permalink / raw)
To: george anzinger; +Cc: Oleg Nesterov, linux-kernel
On Mon, 22 Jul 2002, george anzinger wrote:
> But schedule and signal code does return with interrupts enabled, so a
> cli is still needed here. Also at least some of the trap code returns
> with interrupts enabled.
the change only affects the ret_from_intr path, which is IRQ-only. The
signal and schedule path is still disabling interrupts.
Ingo
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2002-07-22 18:02 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-07-20 17:57 [announce, patch, RFC] "big IRQ lock" removal, IRQ cleanups Oleg Nesterov
2002-07-21 23:43 ` george anzinger
2002-07-21 23:50 ` Ingo Molnar
-- strict thread matches above, loose matches on Subject: below --
2002-07-20 20:54 Ingo Molnar
2002-07-19 21:49 ` george anzinger
2002-07-20 14:25 ` Marcin Dalecki
2002-07-21 21:17 ` Ingo Molnar
2002-07-20 20:57 ` Ingo Molnar
2002-07-22 14:45 Oleg Nesterov
2002-07-22 18:03 ` george anzinger
2002-07-22 18:04 ` Ingo Molnar
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox