git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Nicolas Pitre <nico@cam.org>
To: Junio C Hamano <junkio@cox.net>
Cc: Linus Torvalds <torvalds@osdl.org>,
	"Randal L. Schwartz" <merlyn@stonehenge.com>,
	git@vger.kernel.org
Subject: [PATCH] index-pack usage of mmap() is unacceptably slower on many OSes other than Linux
Date: Tue, 19 Dec 2006 10:53:08 -0500 (EST)	[thread overview]
Message-ID: <Pine.LNX.4.64.0612191027270.18171@xanadu.home> (raw)
In-Reply-To: <7v1wmwtfmk.fsf@assigned-by-dhcp.cox.net>

It was reported by Randal L. Schwartz <merlyn@stonehenge.com> that 
indexing the Linux repository ~150MB pack takes about an hour on OS x 
while it's a minute on Linux.  It seems that the OS X mmap() 
implementation is more than 2 orders of magnitude slower than the Linux 
one.

Linus proposed a patch replacing mmap() with pread() bringing index-pack 
performance on OS X in line with the Linux one.  The performances on 
Linux also improved by a small margin.

Signed-off-by: Nicolas Pitre <nico@cam.org>

---

OK looks like this has been sorted out while I was away.  Good!

On Tue, 19 Dec 2006, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > Hi,
> >
> > in a very unscientific test, without your patch local cloning of the 
> > LilyPond repo takes 1m33s (user), and with your patch (pread() instead of 
> > mmap()) it takes 1m13s (user). The real times are somewhat bogus, but 
> > still in favour of pread(), but only by 8 seconds instead of 20.
> >
> > This is on Linux 2.4.32.
> 
> Interesting.  Anybody have numbers from 2.6?

My 37 seconds of yesterday dropped to 32.

This is Linus's patch plus a few cosmetic changes.

diff --git a/index-pack.c b/index-pack.c
index 6d6c92b..e08a687 100644
--- a/index-pack.c
+++ b/index-pack.c
@@ -1,3 +1,8 @@
+#define _XOPEN_SOURCE 500
+#include <unistd.h>
+#include <sys/time.h>
+#include <signal.h>
+
 #include "cache.h"
 #include "delta.h"
 #include "pack.h"
@@ -6,8 +11,6 @@
 #include "commit.h"
 #include "tag.h"
 #include "tree.h"
-#include <sys/time.h>
-#include <signal.h>
 
 static const char index_pack_usage[] =
 "git-index-pack [-v] [-o <index-file>] [{ ---keep | --keep=<msg> }] { <pack-file> | --stdin [--fix-thin] [<pack-file>] }";
@@ -87,7 +90,7 @@ static unsigned display_progress(unsigned n, unsigned total, unsigned last_pc)
 static unsigned char input_buffer[4096];
 static unsigned long input_offset, input_len, consumed_bytes;
 static SHA_CTX input_ctx;
-static int input_fd, output_fd, mmap_fd;
+static int input_fd, output_fd, pack_fd;
 
 /* Discard current buffer used content. */
 static void flush(void)
@@ -148,14 +151,14 @@ static const char *open_pack_file(const char *pack_name)
 			output_fd = open(pack_name, O_CREAT|O_EXCL|O_RDWR, 0600);
 		if (output_fd < 0)
 			die("unable to create %s: %s\n", pack_name, strerror(errno));
-		mmap_fd = output_fd;
+		pack_fd = output_fd;
 	} else {
 		input_fd = open(pack_name, O_RDONLY);
 		if (input_fd < 0)
 			die("cannot open packfile '%s': %s",
 			    pack_name, strerror(errno));
 		output_fd = -1;
-		mmap_fd = input_fd;
+		pack_fd = input_fd;
 	}
 	SHA1_Init(&input_ctx);
 	return pack_name;
@@ -279,27 +282,25 @@ static void *get_data_from_pack(struct object_entry *obj)
 {
 	unsigned long from = obj[0].offset + obj[0].hdr_size;
 	unsigned long len = obj[1].offset - from;
-	unsigned pg_offset = from % getpagesize();
-	unsigned char *map, *data;
+	unsigned char *src, *data;
 	z_stream stream;
 	int st;
 
-	map = mmap(NULL, len + pg_offset, PROT_READ, MAP_PRIVATE,
-		   mmap_fd, from - pg_offset);
-	if (map == MAP_FAILED)
-		die("cannot mmap pack file: %s", strerror(errno));
+	src = xmalloc(len);
+	if (pread(pack_fd, src, len, from) != len)
+		die("cannot pread pack file: %s", strerror(errno));
 	data = xmalloc(obj->size);
 	memset(&stream, 0, sizeof(stream));
 	stream.next_out = data;
 	stream.avail_out = obj->size;
-	stream.next_in = map + pg_offset;
+	stream.next_in = src;
 	stream.avail_in = len;
 	inflateInit(&stream);
 	while ((st = inflate(&stream, Z_FINISH)) == Z_OK);
 	inflateEnd(&stream);
 	if (st != Z_STREAM_END || stream.total_out != obj->size)
 		die("serious inflate inconsistency");
-	munmap(map, len + pg_offset);
+	free(src);
 	return data;
 }

  parent reply	other threads:[~2006-12-19 15:53 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <86y7p57y05.fsf@blue.stonehenge.com>
     [not found] ` <Pine.LNX.4.63.0612182154170.19693@wbgn013.biozentrum.uni-wuerzburg.de>
     [not found]   ` <Pine.LNX.4.63.0612182213020.19693@wbgn013.biozentrum.uni-wuerzburg.de>
     [not found]     ` <Pine.LNX.4.64.0612181638220.18171@xanadu.home>
2006-12-18 21:55       ` [PATCH] fetch-pack: avoid fixing thin packs when unnecessary Johannes Schindelin
2006-12-18 22:17         ` Nicolas Pitre
     [not found] ` <Pine.LNX.4.64.0612181251020.3479@woody.osdl.org>
     [not found]   ` <86r6uw9azn.fsf@blue.stonehenge.com>
     [not found]     ` <Pine.LNX.4.64.0612181625140.18171@xanadu.home>
2006-12-18 22:01       ` cloning the kernel - why long time in "Resolving 313037 deltas" Randal L. Schwartz
2006-12-18 22:09         ` Nicolas Pitre
2006-12-18 22:21           ` Randal L. Schwartz
2006-12-18 22:50             ` Nicolas Pitre
2006-12-18 22:22         ` Linus Torvalds
2006-12-18 22:26           ` Randal L. Schwartz
2006-12-18 23:02             ` Martin Langhoff
2006-12-22  1:44               ` Kyle Moffett
2006-12-22  1:56                 ` Shawn Pearce
2006-12-22  8:04                 ` Marco Roeland
2007-01-03 13:55               ` Andreas Ericsson
2006-12-18 23:28             ` Linus Torvalds
2006-12-19  0:13               ` Nicolas Pitre
2006-12-19  5:11                 ` Theodore Tso
2006-12-19  6:39                   ` Shawn Pearce
2006-12-19  6:51                     ` Linus Torvalds
2006-12-19  7:26                       ` Shawn Pearce
2006-12-19  7:52                         ` Marco Roeland
2006-12-19  7:58                           ` Shawn Pearce
2006-12-19  8:32                       ` Shawn Pearce
2006-12-19  8:40                         ` Marco Roeland
2006-12-19  8:49                           ` Shawn Pearce
2006-12-19  9:13                             ` Marco Roeland
2006-12-19 20:28                               ` Alex Riesen
2006-12-21 20:35                                 ` Juergen Ruehle
2006-12-19 16:19                     ` Theodore Tso
2006-12-19 16:57                       ` Linus Torvalds
2006-12-20  1:54                         ` Shawn Pearce
2006-12-20  1:58                       ` Shawn Pearce
2006-12-19  6:47                   ` Linus Torvalds
2006-12-19  8:32                     ` Johannes Schindelin
2006-12-19  9:10                       ` Junio C Hamano
2006-12-19  9:47                         ` Jeff King
2006-12-19 10:24                         ` Andy Whitcroft
2006-12-19 15:53                         ` Nicolas Pitre [this message]
2006-12-19 19:00                           ` [PATCH] index-pack usage of mmap() is unacceptably slower on many OSes other than Linux Junio C Hamano
2006-12-19 19:14                             ` Nicolas Pitre
2006-12-19 19:55                               ` Linus Torvalds
2006-12-19 19:57                                 ` Randal L. Schwartz
2006-12-19 20:03                                   ` Randal L. Schwartz
2006-12-19 20:02                                 ` Jeff Garzik
2006-12-20  0:30                                   ` Junio C Hamano
2006-12-20  0:40                                     ` Linus Torvalds
2006-12-20  0:50                                       ` Jeff Garzik
2006-12-20  1:12                                       ` Junio C Hamano
2006-12-20 20:17                                         ` Junio C Hamano
2006-12-20 20:53                                           ` Linus Torvalds
2006-12-20 21:52                                             ` Junio C Hamano
2006-12-20 22:13                                 ` Nikolai Weibull

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.64.0612191027270.18171@xanadu.home \
    --to=nico@cam.org \
    --cc=git@vger.kernel.org \
    --cc=junkio@cox.net \
    --cc=merlyn@stonehenge.com \
    --cc=torvalds@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).