public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* Re: 2.6.24-rc6-mm1
       [not found]   ` <20071223121410.0d572e03.akpm@linux-foundation.org>
@ 2007-12-24  1:25     ` Herbert Xu
  2007-12-30 13:10       ` 2.6.24-rc6-mm1 Ingo Molnar
  0 siblings, 1 reply; 12+ messages in thread
From: Herbert Xu @ 2007-12-24  1:25 UTC (permalink / raw)
  To: Andrew Morton; +Cc: trem, Ingo Molnar, Linux Kernel Mailing List

On Sun, Dec 23, 2007 at 12:14:10PM -0800, Andrew Morton wrote:
>
> No, the problem is that include/crypto/scatterwalk.h doesn't include enough
> header files to support its inlining fetish.  It needs sched.h.

I'll get it fixed in cryptodev.

> Ingo, it's not good that we have cond_resched() definitions conditionally
> duplicated in kernel.h - that's increasing the risk of bugs like this one.

Actually, why do we even have cond_resched when real preemption
is on? It seems to be a waste of space and time.

Any objections to something like this to remove cond_resched with
CONFIG_PREEMPT on (apart from the potential to uncover more bugs
like this one)?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 94bc996..a7283c9 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -105,8 +105,8 @@ struct user;
  * supposed to.
  */
 #ifdef CONFIG_PREEMPT_VOLUNTARY
-extern int cond_resched(void);
-# define might_resched() cond_resched()
+extern int _cond_resched(void);
+# define might_resched() _cond_resched()
 #else
 # define might_resched() do { } while (0)
 #endif
diff --git a/include/linux/sched.h b/include/linux/sched.h
index ac3d496..ae8e9bd 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1863,7 +1863,18 @@ static inline int need_resched(void)
  * cond_resched_lock() will drop the spinlock before scheduling,
  * cond_resched_softirq() will enable bhs before scheduling.
  */
-extern int cond_resched(void);
+#ifdef CONFIG_PREEMPT
+static inline int cond_resched(void)
+{
+	return 0;
+}
+#else
+extern int _cond_resched(void);
+static inline int cond_resched(void)
+{
+	return _cond_resched();
+}
+#endif
 extern int cond_resched_lock(spinlock_t * lock);
 extern int cond_resched_softirq(void);
 
diff --git a/kernel/sched.c b/kernel/sched.c
index 3df84ea..2dc2bbf 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -4683,7 +4683,8 @@ static void __cond_resched(void)
 	} while (need_resched());
 }
 
-int __sched cond_resched(void)
+#if !defined(CONFIG_PREEMPT) || defined(CONFIG_PREEMPT_VOLUNTARY)
+int __sched _cond_resched(void)
 {
 	if (need_resched() && !(preempt_count() & PREEMPT_ACTIVE) &&
 					system_state == SYSTEM_RUNNING) {
@@ -4692,7 +4693,8 @@ int __sched cond_resched(void)
 	}
 	return 0;
 }
-EXPORT_SYMBOL(cond_resched);
+EXPORT_SYMBOL(_cond_resched);
+#endif
 
 /*
  * cond_resched_lock() - if a reschedule is pending, drop the given lock,

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: 2.6.24-rc6-mm1
  2007-12-24  1:25     ` 2.6.24-rc6-mm1 Herbert Xu
@ 2007-12-30 13:10       ` Ingo Molnar
  2008-01-02 10:31         ` 2.6.24-rc6-mm1 Nick Piggin
  0 siblings, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2007-12-30 13:10 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Andrew Morton, trem, Linux Kernel Mailing List


* Herbert Xu <herbert@gondor.apana.org.au> wrote:

> > Ingo, it's not good that we have cond_resched() definitions 
> > conditionally duplicated in kernel.h - that's increasing the risk of 
> > bugs like this one.
> 
> Actually, why do we even have cond_resched when real preemption is on? 
> It seems to be a waste of space and time.

due to the BKL. cond_resched() in BKL code breaks up BKL latencies.

i dont mind not doing that though - we should increase the pain for BKL 
users, so that subsystems finally get rid of it altogether. 
lock_kernel() use within the kernel is still rampant - there are still 
more than 400 callsites to lock_kernel().

	Ingo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: 2.6.24-rc6-mm1
  2007-12-30 13:10       ` 2.6.24-rc6-mm1 Ingo Molnar
@ 2008-01-02 10:31         ` Nick Piggin
  2008-01-02 11:01           ` 2.6.24-rc6-mm1 Peter Zijlstra
  2008-01-02 11:08           ` 2.6.24-rc6-mm1 Ingo Molnar
  0 siblings, 2 replies; 12+ messages in thread
From: Nick Piggin @ 2008-01-02 10:31 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Herbert Xu, Andrew Morton, trem, Linux Kernel Mailing List

On Monday 31 December 2007 00:10, Ingo Molnar wrote:
> * Herbert Xu <herbert@gondor.apana.org.au> wrote:
> > > Ingo, it's not good that we have cond_resched() definitions
> > > conditionally duplicated in kernel.h - that's increasing the risk of
> > > bugs like this one.
> >
> > Actually, why do we even have cond_resched when real preemption is on?
> > It seems to be a waste of space and time.
>
> due to the BKL. cond_resched() in BKL code breaks up BKL latencies.
>
> i dont mind not doing that though - we should increase the pain for BKL
> users, so that subsystems finally get rid of it altogether.
> lock_kernel() use within the kernel is still rampant - there are still
> more than 400 callsites to lock_kernel().

It would be silly to potentially increase latency in some areas
for CONFIG_PREEMPT kernels, though.

Better may be to detect when there is CONFIG_PREEMPT and
CONFIG_PREEMPT_BKL, and ifdef away the cond_resched in that case
(or -- why do we even make CONFIG_PREEMPT_BKL an option? Are there
really workloads left where it causes throughput regressions?)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: 2.6.24-rc6-mm1
  2008-01-02 10:31         ` 2.6.24-rc6-mm1 Nick Piggin
@ 2008-01-02 11:01           ` Peter Zijlstra
  2008-01-02 11:12             ` 2.6.24-rc6-mm1 Nick Piggin
  2008-01-02 11:08           ` 2.6.24-rc6-mm1 Ingo Molnar
  1 sibling, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2008-01-02 11:01 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Ingo Molnar, Herbert Xu, Andrew Morton, trem,
	Linux Kernel Mailing List


On Wed, 2008-01-02 at 21:31 +1100, Nick Piggin wrote:
> On Monday 31 December 2007 00:10, Ingo Molnar wrote:
> > * Herbert Xu <herbert@gondor.apana.org.au> wrote:
> > > > Ingo, it's not good that we have cond_resched() definitions
> > > > conditionally duplicated in kernel.h - that's increasing the risk of
> > > > bugs like this one.
> > >
> > > Actually, why do we even have cond_resched when real preemption is on?
> > > It seems to be a waste of space and time.
> >
> > due to the BKL. cond_resched() in BKL code breaks up BKL latencies.
> >
> > i dont mind not doing that though - we should increase the pain for BKL
> > users, so that subsystems finally get rid of it altogether.
> > lock_kernel() use within the kernel is still rampant - there are still
> > more than 400 callsites to lock_kernel().
> 
> It would be silly to potentially increase latency in some areas
> for CONFIG_PREEMPT kernels, though.
> 
> Better may be to detect when there is CONFIG_PREEMPT and
> CONFIG_PREEMPT_BKL, and ifdef away the cond_resched in that case
> (or -- why do we even make CONFIG_PREEMPT_BKL an option? Are there
> really workloads left where it causes throughput regressions?)

I've seen 1s+ desktop latencies due to PREEMPT_BKL when I was still
using reiserfs.

Both reiserfs and tty were fighting for the bkl and massive prio
inversion ensued. Turning PREEMPT_BKL off made the system usable again.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: 2.6.24-rc6-mm1
  2008-01-02 10:31         ` 2.6.24-rc6-mm1 Nick Piggin
  2008-01-02 11:01           ` 2.6.24-rc6-mm1 Peter Zijlstra
@ 2008-01-02 11:08           ` Ingo Molnar
  1 sibling, 0 replies; 12+ messages in thread
From: Ingo Molnar @ 2008-01-02 11:08 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Herbert Xu, Andrew Morton, trem, Linux Kernel Mailing List


* Nick Piggin <nickpiggin@yahoo.com.au> wrote:

> (or -- why do we even make CONFIG_PREEMPT_BKL an option? [...]

thanks for the reminder - i just zapped it. Was a pleasure ;-)

	Ingo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: 2.6.24-rc6-mm1
  2008-01-02 11:01           ` 2.6.24-rc6-mm1 Peter Zijlstra
@ 2008-01-02 11:12             ` Nick Piggin
  2008-01-02 11:24               ` 2.6.24-rc6-mm1 Peter Zijlstra
  0 siblings, 1 reply; 12+ messages in thread
From: Nick Piggin @ 2008-01-02 11:12 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Herbert Xu, Andrew Morton, trem,
	Linux Kernel Mailing List

On Wednesday 02 January 2008 22:01, Peter Zijlstra wrote:
> On Wed, 2008-01-02 at 21:31 +1100, Nick Piggin wrote:
> > On Monday 31 December 2007 00:10, Ingo Molnar wrote:
> > > * Herbert Xu <herbert@gondor.apana.org.au> wrote:
> > > > > Ingo, it's not good that we have cond_resched() definitions
> > > > > conditionally duplicated in kernel.h - that's increasing the risk
> > > > > of bugs like this one.
> > > >
> > > > Actually, why do we even have cond_resched when real preemption is
> > > > on? It seems to be a waste of space and time.
> > >
> > > due to the BKL. cond_resched() in BKL code breaks up BKL latencies.
> > >
> > > i dont mind not doing that though - we should increase the pain for BKL
> > > users, so that subsystems finally get rid of it altogether.
> > > lock_kernel() use within the kernel is still rampant - there are still
> > > more than 400 callsites to lock_kernel().
> >
> > It would be silly to potentially increase latency in some areas
> > for CONFIG_PREEMPT kernels, though.
> >
> > Better may be to detect when there is CONFIG_PREEMPT and
> > CONFIG_PREEMPT_BKL, and ifdef away the cond_resched in that case
> > (or -- why do we even make CONFIG_PREEMPT_BKL an option? Are there
> > really workloads left where it causes throughput regressions?)
>
> I've seen 1s+ desktop latencies due to PREEMPT_BKL when I was still
> using reiserfs.

Fair enough; so the former ifdefery would be preferable for now then.

> Both reiserfs and tty were fighting for the bkl and massive prio
> inversion ensued. Turning PREEMPT_BKL off made the system usable again.

Are either of those subsystems actually using the BKL to protect against
anything else (than themselves)? It would be sweet to have them use
private mutexes for the job instead (although even then it probably
wouldn't be a straight conversion)...

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: 2.6.24-rc6-mm1
  2008-01-02 11:12             ` 2.6.24-rc6-mm1 Nick Piggin
@ 2008-01-02 11:24               ` Peter Zijlstra
  2008-01-02 12:19                 ` 2.6.24-rc6-mm1 Ingo Molnar
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2008-01-02 11:24 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Ingo Molnar, Herbert Xu, Andrew Morton, trem,
	Linux Kernel Mailing List, Alan Cox


On Wed, 2008-01-02 at 22:12 +1100, Nick Piggin wrote:
> On Wednesday 02 January 2008 22:01, Peter Zijlstra wrote:

> > I've seen 1s+ desktop latencies due to PREEMPT_BKL when I was still
> > using reiserfs.
> 
> Fair enough; so the former ifdefery would be preferable for now then.

To be honest, I must mention that the load that did that was a kernel
build -j5 on a dual socket Athlon MP box. With a current kernel and XFS
that load is making the box slow but its still very servicable.

> > Both reiserfs and tty were fighting for the bkl and massive prio
> > inversion ensued. Turning PREEMPT_BKL off made the system usable again.
> 
> Are either of those subsystems actually using the BKL to protect against
> anything else (than themselves)?

I doubt it.

IIRC Alan is working on getting tty BKL free.

>  It would be sweet to have them use
> private mutexes for the job instead (although even then it probably
> wouldn't be a straight conversion)...

I tried a quick conversion of reiser3 at the time, but it really wants a
recursive lock and I couldn't be bothered to fix a 'legacy' filesystem
so I just gave up and converted the filesystem to XFS.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: 2.6.24-rc6-mm1
  2008-01-02 11:24               ` 2.6.24-rc6-mm1 Peter Zijlstra
@ 2008-01-02 12:19                 ` Ingo Molnar
  2008-01-02 13:26                   ` 2.6.24-rc6-mm1 Alan Cox
  0 siblings, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2008-01-02 12:19 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Nick Piggin, Herbert Xu, Andrew Morton, trem,
	Linux Kernel Mailing List, Alan Cox


* Peter Zijlstra <peterz@infradead.org> wrote:

> >  It would be sweet to have them use private mutexes for the job 
> > instead (although even then it probably wouldn't be a straight 
> > conversion)...
> 
> I tried a quick conversion of reiser3 at the time, but it really wants 
> a recursive lock and I couldn't be bothered to fix a 'legacy' 
> filesystem so I just gave up and converted the filesystem to XFS.

as long as the only requirement is recursion, and not any of the other 
BKL properties, that could be wrapped. I guess fixing the TTY code to 
have no BKL dependencies has a higher chance of success - given that 
Alan is working on it :-)

	Ingo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: 2.6.24-rc6-mm1
  2008-01-02 12:19                 ` 2.6.24-rc6-mm1 Ingo Molnar
@ 2008-01-02 13:26                   ` Alan Cox
  2008-01-02 16:18                     ` 2.6.24-rc6-mm1 Ingo Molnar
  0 siblings, 1 reply; 12+ messages in thread
From: Alan Cox @ 2008-01-02 13:26 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Nick Piggin, Herbert Xu, Andrew Morton, trem,
	Linux Kernel Mailing List

> BKL properties, that could be wrapped. I guess fixing the TTY code to 
> have no BKL dependencies has a higher chance of success - given that 
> Alan is working on it :-)

Bit by bit when I can face it, and with a lot of other people
contributing parts. Right now the BKL mostly protects the open/close
paths and those are *really* ugly

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: 2.6.24-rc6-mm1
  2008-01-02 13:26                   ` 2.6.24-rc6-mm1 Alan Cox
@ 2008-01-02 16:18                     ` Ingo Molnar
  2008-01-02 22:49                       ` 2.6.24-rc6-mm1 Alan Cox
  0 siblings, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2008-01-02 16:18 UTC (permalink / raw)
  To: Alan Cox
  Cc: Peter Zijlstra, Nick Piggin, Herbert Xu, Andrew Morton, trem,
	Linux Kernel Mailing List


* Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:

> > BKL properties, that could be wrapped. I guess fixing the TTY code 
> > to have no BKL dependencies has a higher chance of success - given 
> > that Alan is working on it :-)
> 
> Bit by bit when I can face it, and with a lot of other people 
> contributing parts. Right now the BKL mostly protects the open/close 
> paths and those are *really* ugly

could we perhaps just replace it with a tty_mutex? (possibly a recursive 
one) I suspect by now most of the BKL dependencies there have become 
local to the tty code? Or are there deep VFS dependencies as well? (if 
yes, what type of dependencies?)

	Ingo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: 2.6.24-rc6-mm1
  2008-01-02 16:18                     ` 2.6.24-rc6-mm1 Ingo Molnar
@ 2008-01-02 22:49                       ` Alan Cox
  2008-01-02 23:29                         ` tty + BKL [Was: 2.6.24-rc6-mm1] Jiri Slaby
  0 siblings, 1 reply; 12+ messages in thread
From: Alan Cox @ 2008-01-02 22:49 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Nick Piggin, Herbert Xu, Andrew Morton, trem,
	Linux Kernel Mailing List

> could we perhaps just replace it with a tty_mutex? (possibly a recursive 
> one) I suspect by now most of the BKL dependencies there have become 
> local to the tty code? Or are there deep VFS dependencies as well? (if 
> yes, what type of dependencies?)

The big problem is that nobody actually knows where all the dependancies
are. That is why I've started with the bits we can decipher so that it
simplifies the mess each time we clean up the locking of some lower level
aspect.

Almost all the serial drivers clone the same open and release methods (or
worse older versions of it) so that also needs doing. Lots to do, so
little time.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* tty + BKL [Was: 2.6.24-rc6-mm1]
  2008-01-02 22:49                       ` 2.6.24-rc6-mm1 Alan Cox
@ 2008-01-02 23:29                         ` Jiri Slaby
  0 siblings, 0 replies; 12+ messages in thread
From: Jiri Slaby @ 2008-01-02 23:29 UTC (permalink / raw)
  To: Alan Cox
  Cc: Ingo Molnar, Peter Zijlstra, Nick Piggin, Herbert Xu,
	Andrew Morton, trem, Linux Kernel Mailing List

On 01/02/2008 11:49 PM, Alan Cox wrote:
> Almost all the serial drivers clone the same open and release methods (or
> worse older versions of it) so that also needs doing. Lots to do, so
> little time.

Could you be more specific here please, maybe somebody could help.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2008-01-02 23:31 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <9DtBq-2jD-3@gated-at.bofh.it>
     [not found] ` <476EAF98.6040004@yahoo.fr>
     [not found]   ` <20071223121410.0d572e03.akpm@linux-foundation.org>
2007-12-24  1:25     ` 2.6.24-rc6-mm1 Herbert Xu
2007-12-30 13:10       ` 2.6.24-rc6-mm1 Ingo Molnar
2008-01-02 10:31         ` 2.6.24-rc6-mm1 Nick Piggin
2008-01-02 11:01           ` 2.6.24-rc6-mm1 Peter Zijlstra
2008-01-02 11:12             ` 2.6.24-rc6-mm1 Nick Piggin
2008-01-02 11:24               ` 2.6.24-rc6-mm1 Peter Zijlstra
2008-01-02 12:19                 ` 2.6.24-rc6-mm1 Ingo Molnar
2008-01-02 13:26                   ` 2.6.24-rc6-mm1 Alan Cox
2008-01-02 16:18                     ` 2.6.24-rc6-mm1 Ingo Molnar
2008-01-02 22:49                       ` 2.6.24-rc6-mm1 Alan Cox
2008-01-02 23:29                         ` tty + BKL [Was: 2.6.24-rc6-mm1] Jiri Slaby
2008-01-02 11:08           ` 2.6.24-rc6-mm1 Ingo Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox