From: Ingo Molnar <mingo@elte.hu>
To: Linus Torvalds <torvalds@osdl.org>
Cc: linux-kernel@vger.kernel.org, Andrew Morton <akpm@osdl.org>,
	William Lee Irwin III <wli@holomorphy.com>,
	Arjan van de Ven <arjanv@redhat.com>,
	Lee Revell <rlrevell@joe-job.com>
Subject: Re: [patch] remove the BKL (Big Kernel Lock), this time for real
Date: Wed, 15 Sep 2004 17:55:55 +0200
Message-ID: <20040915155555.GA11019@elte.hu>
In-Reply-To: <Pine.LNX.4.58.0409150826150.2333@ppc970.osdl.org>


* Linus Torvalds <torvalds@osdl.org> wrote:

> > the attached patch is a new approach to get rid of Linux's Big Kernel
> > Lock as we know it today.
> 
> I really think this is wrong.
> 
> Maybe not from a conceptual standpoint, but that implementation with
> the scheduler doing "reaquire_kernel_lock()" and doing a down() there
> is just wrong, wrong, wrong.
> 
> If we're going to do a down() and block immediately after being
> scheduled, I don't think we should have been picked in the first
> place.

agreed, and that codepath is only there for correctness - most of the
time the semaphore code will make sure that this doesn't happen. It's the
same as the need_resched check - it doesn't happen in 99.9% of the cases,
but when it happens it must be handled.

> Yeah, yeah, you have all that magic to not recurse by setting
> lock-depth negative before doing the down(), but it still feels
> fundamentally wrong to me. There's also the question whether this
> actually _helps_ anything, since it may well just replace the spinning
> with lots of new scheduler activity. 
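
The recursion guard described in the quote above can be modelled in a few
lines of userspace C. This is only a hedged sketch of the idea, not the code
from the patch: every name here (model_task, model_sem, model_down and so on)
is hypothetical, and model_down() never sleeps, it only records what a real
down() would observe if it had to call schedule().

```c
#include <assert.h>

/* Hypothetical userspace model; these are not the kernel's structures. */
struct model_task { int lock_depth; };      /* -1 means "BKL not held" */
struct model_sem  { int count; int depth_seen_in_down; };

static struct model_sem kernel_sem = { 1, 0 };

/* A real down() may sleep; the model assumes the uncontended case and
 * records the caller's lock_depth at the point where it could sleep. */
static void model_down(struct model_sem *s, const struct model_task *t)
{
    assert(s->count > 0);
    s->depth_seen_in_down = t->lock_depth;
    s->count--;
}

static void model_up(struct model_sem *s)
{
    s->count++;
}

/* First-level acquisition takes the semaphore; nesting only counts. */
static void model_lock_kernel(struct model_task *t)
{
    int depth = t->lock_depth + 1;

    if (depth == 0)
        model_down(&kernel_sem, t);
    t->lock_depth = depth;
}

/* Called when the task is scheduled away while holding the BKL. */
static void model_release_kernel_lock(struct model_task *t)
{
    if (t->lock_depth >= 0)
        model_up(&kernel_sem);
}

/* Called when the task is scheduled back in. lock_depth is made negative
 * first, so a nested schedule() from inside down() sees "no BKL held"
 * and does not try to release or reacquire again. */
static void model_reacquire_kernel_lock(struct model_task *t)
{
    int saved = t->lock_depth;

    if (saved < 0)
        return;
    t->lock_depth = -1;
    model_down(&kernel_sem, t);
    t->lock_depth = saved;
}
```

In this model a task that took the lock once, was scheduled away and came
back ends up with its original lock_depth restored, while the down() in
between only ever saw a depth of -1.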

Rare activity that still runs under the BKL (e.g. mounting, or ioctls)
can introduce many milliseconds of scheduling delays that hurt
latencies. None of this is really performance-sensitive as it's almost
always used for some big old piece of code. Anything that is frequently
taken/dropped we've replaced with proper spinlocks already. So what's
left is the code for which 1) everyone hurts most from it not being a
real semaphore, and 2) no _user_ or maintainer of that code cares about
the BKL because it's not performance-critical.

> And you make schedule() a lot more expensive for kernel lock holders
> by copying the CPU map. [...]

we don't really care about BKL _users_, other than for correctness. We've
got cpusets for the bitmap overhead reduction.

> [...] You may have tested it on a machine where the CPU map is just a
> single word, but what about the big machines?

and it's not that bad - if it's, say, 512 CPUs then _BKL users_ copy 64
bytes. I don't think the 512-CPU guys will complain about this patch. If
there are any frequent BKL users on such big boxes then they need
serious and quick fixing anyway, because they are bouncing a global
spinlock madly across 512 CPUs, 32 cross-connects and 4 backplanes! The
64-byte copy within the CPU-local task structure really pales in
comparison...
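
The 64-byte figure follows from one bit per CPU, rounded up to whole
machine words. A minimal sketch of that arithmetic, assuming a plain
bitmap of unsigned longs (cpumask_bytes and MODEL_BITS_PER_LONG are
illustrative names, not kernel symbols):

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Bits in one unsigned long on the machine this is compiled for. */
#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Bytes needed for a bitmap with one bit per CPU, rounded up to a
 * whole number of unsigned longs. */
static size_t cpumask_bytes(size_t nr_cpus)
{
    size_t longs;

    assert(nr_cpus > 0);
    longs = (nr_cpus + MODEL_BITS_PER_LONG - 1) / MODEL_BITS_PER_LONG;
    return longs * sizeof(unsigned long);
}
```

For 512 CPUs this gives 512 / 8 = 64 bytes with either 32-bit or 64-bit
longs, and a map for up to one word's worth of CPUs is a single
unsigned long.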

> Spinlocks really _are_ cheaper. Wouldn't it be nice to just continue
> removing kernel lock users and keeping a very _simple_ kernel lock for
> legacy issues? 

yes, but progress in this area seems to have slowed down, and people are
hurting from the latencies introduced by the BKL meanwhile. Who cares if
some rare big chunk of code runs under a semaphore, as long as it's
preemptable?

a global spinlock isn't all that much cheaper than a semaphore, even if
it's not taken or lightly taken. If it's heavily taken then a semaphore
wins. There is clearly an interim area where a spinlock will win - but
only if the spinlock is well-localized to some resource or CPU - this is
absolutely not true for the BKL, which is as global as it gets.

> In other words, I'd _really_ like to see some serious numbers for
> this.

fully agreed about that too!

	Ingo
