linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Christoph Lameter <clameter@sgi.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Lee Schermerhorn <Lee.Schermerhorn@hp.com>,
	Rik van Riel <riel@redhat.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [patch 02/20] make the inode i_mmap_lock a reader/writer lock
Date: Thu, 20 Dec 2007 18:59:12 +1100	[thread overview]
Message-ID: <200712201859.12934.nickpiggin@yahoo.com.au> (raw)
In-Reply-To: <Pine.LNX.4.64.0712192301120.13118@schroedinger.engr.sgi.com>

On Thursday 20 December 2007 18:04, Christoph Lameter wrote:
> > The only reason the x86 ticket locks have the 256 CPu limit is that
> > if they go any bigger, we can't use the partial registers so would
> > have to have a few more instructions.
>
> x86_64 is going up to 4k or 16k cpus soon for our new hardware.
>
> > A 32 bit spinlock would allow 64K cpus (ticket lock has 2 counters,
> > each would be 16 bits). And it would actually shrink the spinlock in
> > the case of preempt kernels too (because it would no longer have the
> > lockbreak field).
> >
> > And yes, I'll go out on a limb and say that 64k CPUs ought to be
> > enough for anyone ;)
>
> I think those things need a timeframe applied to it. Thats likely
> going to be true for the next 3 years (optimistic assessment ;-)).

Yeah, that was tongue in cheek ;)


> Could you go to 32bit spinlock by default?

On x86, the size of the ticket locks is 32 bit, simply because I didn't
want to risk possible alignment bugs (a subsequent patch cuts it down to
16 bits, but this is a much smaller win than 64->32 in general because
of natural alignment of types).

Note that the ticket locks still support twice the number as the old
spinlocks, so I'm not causing a regression here... but yes, increasing
the size further will require an extra instruction or two.

> How about NUMA awareness for the spinlocks? Larger backoff periods for
> off node lock contentions please.

ticket locks can naturally tell you how many waiters there are, and how
many waiters are in front of you, so it is really nice for doing backoff
(eg. you can adapt the backoff *very* nicely depending on how many are in
front of you, and how quickly you are moving toward the front).

Also, since I got rid of the ->break_lock field, you could use that space
perhaps to add a cpu # of the lock holder for even more backoff context
(if you find that helps).

Anyway, I didn't do any of that because it obviously needs someone with
real hardware in order to tune it properly.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2007-12-20  7:59 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-12-18 21:15 [patch 00/20] VM pageout scalability improvements Rik van Riel
2007-12-18 21:15 ` [patch 01/20] convert anon_vma list lock a read/write lock Rik van Riel
2007-12-20  7:07   ` Christoph Lameter
2007-12-18 21:15 ` [patch 02/20] make the inode i_mmap_lock a reader/writer lock Rik van Riel
2007-12-19  0:48   ` Nick Piggin
2007-12-19  4:09     ` KOSAKI Motohiro
2007-12-19 15:52     ` Lee Schermerhorn
2007-12-19 16:31       ` Rik van Riel
2007-12-19 16:53         ` Lee Schermerhorn
2007-12-19 19:28           ` Peter Zijlstra
2007-12-19 23:40             ` Nick Piggin
2007-12-20  7:04               ` Christoph Lameter
2007-12-20  7:59                 ` Nick Piggin [this message]
2008-01-02 23:35                   ` Mike Travis
2008-01-03  6:07                     ` Nick Piggin
2008-01-03  8:55                       ` Ingo Molnar
2008-01-07  9:01                         ` Nick Piggin
2007-12-18 21:15 ` [patch 03/20] move isolate_lru_page() to vmscan.c Rik van Riel
2007-12-20  7:08   ` Christoph Lameter
2007-12-18 21:15 ` [patch 04/20] free swap space on swap-in/activation Rik van Riel
2007-12-18 21:15 ` [patch 05/20] define page_file_cache() function Rik van Riel
2007-12-18 21:15 ` [patch 06/20] debugging checks for page_file_cache() Rik van Riel
2007-12-18 21:15 ` [patch 07/20] Use an indexed array for LRU variables Rik van Riel
2007-12-18 21:15 ` [patch 08/20] split LRU lists into anon & file sets Rik van Riel
2007-12-18 21:15 ` [patch 09/20] split anon & file LRUs for memcontrol code Rik van Riel
2007-12-18 21:15 ` [patch 10/20] SEQ replacement for anonymous pages Rik van Riel
2007-12-19  5:17   ` KOSAKI Motohiro
2007-12-19 13:40     ` Rik van Riel
2007-12-20  2:04       ` KOSAKI Motohiro
2007-12-18 21:15 ` [patch 11/20] add newly swapped in pages to the inactive list Rik van Riel
2007-12-18 21:15 ` [patch 12/20] No Reclaim LRU Infrastructure Rik van Riel
2007-12-18 21:15 ` [patch 13/20] Non-reclaimable page statistics Rik van Riel
2007-12-18 21:15 ` [patch 14/20] Scan noreclaim list for reclaimable pages Rik van Riel
2007-12-18 21:15 ` [patch 15/20] ramfs pages are non-reclaimable Rik van Riel
2007-12-18 21:15 ` [patch 16/20] SHM_LOCKED pages are nonreclaimable Rik van Riel
2007-12-18 21:15 ` [patch 17/20] non-reclaimable mlocked pages Rik van Riel
2007-12-19  0:56   ` Nick Piggin
2007-12-19 13:45     ` Rik van Riel
2007-12-19 14:24       ` Peter Zijlstra
2007-12-19 14:53         ` Rik van Riel
2007-12-19 16:08           ` Lee Schermerhorn
2007-12-19 16:04       ` Lee Schermerhorn
2007-12-20 20:56         ` Rik van Riel
2007-12-21 10:52           ` Nick Piggin
2007-12-21 14:17             ` Rik van Riel
2007-12-23 12:22               ` Nick Piggin
2007-12-24  1:00                 ` Rik van Riel
2007-12-19 23:34       ` Nick Piggin
2007-12-20  7:19     ` Christoph Lameter
2007-12-20 15:33       ` Rik van Riel
2007-12-21 17:13         ` Lee Schermerhorn
2007-12-18 21:15 ` [patch 18/20] mlock vma pages under mmap_sem held for read Rik van Riel
2007-12-18 21:15 ` [patch 19/20] handle mlocked pages during map/unmap and truncate Rik van Riel
2007-12-18 21:15 ` [patch 20/20] account mlocked pages Rik van Riel
2007-12-22 20:27 ` [patch 00/20] VM pageout scalability improvements Balbir Singh
2007-12-23  0:21   ` Rik van Riel
2007-12-23 22:59     ` Balbir Singh
2007-12-24  1:11       ` Rik van Riel
2007-12-28  3:20         ` Matt Mackall

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200712201859.12934.nickpiggin@yahoo.com.au \
    --to=nickpiggin@yahoo.com.au \
    --cc=Lee.Schermerhorn@hp.com \
    --cc=clameter@sgi.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=peterz@infradead.org \
    --cc=riel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).