All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrea Arcangeli <andrea@suse.de>
To: Ingo Molnar <mingo@elte.hu>
Cc: Anton Blanchard <anton@samba.org>,
	Linus Torvalds <torvalds@transmeta.com>,
	Rik van Riel <riel@conectiva.com.br>,
	Momchil Velikov <velco@fadata.bg>,
	John Stoffel <stoffel@casc.com>,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] Radix-tree pagecache for 2.5
Date: Fri, 1 Feb 2002 15:44:33 +0100	[thread overview]
Message-ID: <20020201154433.C9904@athlon.random> (raw)
In-Reply-To: <20020131231242.GA4138@krispykreme> <Pine.LNX.4.33.0202010958220.2111-100000@localhost.localdomain>
In-Reply-To: <Pine.LNX.4.33.0202010958220.2111-100000@localhost.localdomain>; from mingo@elte.hu on Fri, Feb 01, 2002 at 10:04:50AM +0100

On Fri, Feb 01, 2002 at 10:04:50AM +0100, Ingo Molnar wrote:
> 
> On Fri, 1 Feb 2002, Anton Blanchard wrote:
> 
> > There were a few solutions (from davem and ingo) to allocate a larger
> > hash but with the radix patch we no longer have to worry about this.
> 
> there is one big issue we forgot to consider.
> 
> in the case of radix trees it's not only search depth that gets worse with
> big files. The thing i'm worried about is the 'big pagecache lock' being
> reintroduced again. If eg. a database application puts lots of data into a
> single file (multiple gigabytes - why not), then the
> mapping->i_shared_lock becomes a 'big pagecache lock' again, causing
> serious SMP contention for even the read() case. Benchmarks show that it's
> the distribution of locks that matters on big boxes.

exactly, this is the same thing I mentioned in some past email. It's not
that having per-inode data structures solves the locking completly, DBMS
are used to store stuff in a single file. And of course with a structure
like radix tree it would be a pain to have it scale within the same
file, unlike with the hashtable where each bucket is indipendent from
the others.

> 
> dbench hides this issue, because it uses many temporary files, so the

Indeed, a lot of workloads would benefit from the separate data
structure and locking, but not all, some important one not.

> locking overhead is distributed. Would you be willing to run benchmarks
> that measure the scalability of reading from one bigger file, from
> multiple CPUs?

Agreed, also with DaveM patch applied, sizing the hash properly so it
has a mean distribution of 1 entry per bucket or so, will decrease the
window for the spinlock collisions as well btw.

> 
> with hash based locking, the locking overhead is *always* distributed.
> 
> with radix trees the locking overhead is distributed only if multiple
> files are used. With one big file (or a few big files), the i_shared_lock
> will always bounce between CPUs wildly in read() workloads, degrading
> scalability just as much as it is degraded with the pagecache_lock now.
> 
> 	Ingo


Andrea

  parent reply	other threads:[~2002-02-01 14:44 UTC|newest]

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2002-01-29 15:54 [PATCH] Radix-tree pagecache for 2.5 Christoph Hellwig
2002-01-29 19:27 ` Linus Torvalds
2002-01-29 21:40   ` David S. Miller
2002-01-29 22:07     ` Linus Torvalds
2002-01-29 23:01       ` Rik van Riel
2002-01-29 23:32         ` Alan Cox
2002-01-29 23:35           ` Rik van Riel
2002-01-30  3:00             ` Daniel Phillips
2002-01-31 23:44           ` Anton Blanchard
2002-02-01  0:34             ` Alan Cox
2002-02-01 11:04               ` Rik van Riel
2002-02-01 11:33                 ` Arjan van de Ven
2002-02-02 18:57                 ` Richard Henderson
2002-02-02 21:15                   ` Rik van Riel
2002-01-29 23:02   ` Momchil Velikov
2002-01-29 23:33     ` Linus Torvalds
2002-01-29 23:45       ` Christoph Hellwig
2002-01-30 21:25       ` Momchil Velikov
2002-01-30 22:05         ` John Stoffel
2002-01-30 22:15           ` Momchil Velikov
2002-01-31  2:33             ` Andrea Arcangeli
2002-01-31 13:58               ` Rik van Riel
2002-01-31 14:36                 ` Andrea Arcangeli
2002-01-31 15:32                   ` Alan Cox
2002-01-31 16:39                   ` William Lee Irwin III
2002-01-31 17:19                     ` William Lee Irwin III
2002-01-31 17:21                     ` Andrea Arcangeli
2002-01-31 17:50                       ` William Lee Irwin III
2002-01-31 17:46                   ` Linus Torvalds
2002-01-31 18:02                     ` Andrea Arcangeli
2002-01-31 18:32                       ` Linus Torvalds
2002-01-31 18:38                         ` Rik van Riel
2002-01-31 18:49                           ` Linus Torvalds
2002-01-31 19:09                             ` Momchil Velikov
2002-01-31 19:26                             ` Andrew Morton
2002-01-31 21:12                             ` Momchil Velikov
2002-01-31 19:14                         ` Andrea Arcangeli
2002-01-31 19:23                           ` Linus Torvalds
2002-01-31 21:34                             ` Ingo Molnar
2002-01-31 23:12                               ` Anton Blanchard
2002-01-31 23:55                                 ` Andrea Arcangeli
2002-02-01  0:01                                   ` David S. Miller
2002-02-16 16:20                                     ` Andrea Arcangeli
2002-02-01  3:56                                   ` Anton Blanchard
2002-02-01  6:32                                 ` Momchil Velikov
2002-02-01 18:38                                   ` Anton Blanchard
2002-02-01  9:04                                 ` Ingo Molnar
2002-02-01  7:59                                   ` Momchil Velikov
2002-02-01 10:29                                     ` Ingo Molnar
2002-02-01  9:01                                       ` Momchil Velikov
2002-02-01  9:10                                         ` David S. Miller
2002-02-01  9:07                                       ` David S. Miller
2002-02-01  9:13                                         ` Momchil Velikov
2002-02-01 17:06                                       ` Linus Torvalds
2002-02-01 18:29                                         ` Jeff Garzik
2002-02-01 18:44                                           ` arjan
2002-02-01 19:47                                             ` Jeff Garzik
2002-02-02 15:39                                               ` Rik van Riel
2002-02-05 14:21                                                 ` Pavel Machek
2002-02-05 18:45                                                   ` Rik van Riel
2002-02-05 20:30                                                     ` Eric Dumazet
2002-02-05 20:46                                                     ` Pavel Machek
2002-02-06  9:07                                                     ` Daniel Phillips
2002-02-05  9:19                                           ` Zdenek Kabelac
2002-02-01 23:49                                         ` Ingo Molnar
2002-02-01 14:44                                   ` Andrea Arcangeli [this message]
2002-02-01 14:59                                   ` Momchil Velikov
2002-02-01 17:03                                     ` Ingo Molnar
2002-02-01 15:26                                       ` Momchil Velikov
2002-02-01 23:45                                         ` Ingo Molnar
2002-01-31 10:41             ` Josh MacDonald
2002-01-31 14:00               ` Rik van Riel
2002-01-31 14:21               ` Momchil Velikov
2002-01-30 22:22           ` Christoph Hellwig
2002-01-30  3:02     ` Daniel Phillips
2002-01-29 23:00 ` William Lee Irwin III
  -- strict thread matches above, loose matches on Subject: below --
2002-02-02 19:23 rwhron
2002-02-03 14:31 ` chris
2002-02-03 23:33 ` Momchil Velikov
2002-02-04  3:59   ` rwhron
2002-02-06  2:04   ` rwhron
2002-02-06 11:44     ` Rik van Riel
2002-02-06 21:34 rwhron
2002-02-06 21:37 ` Rik van Riel
2002-02-06 22:06   ` rwhron
2002-02-07 11:32   ` Daniel Phillips
2002-02-07 11:32   ` Daniel Phillips

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20020201154433.C9904@athlon.random \
    --to=andrea@suse.de \
    --cc=anton@samba.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=riel@conectiva.com.br \
    --cc=stoffel@casc.com \
    --cc=torvalds@transmeta.com \
    --cc=velco@fadata.bg \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.