From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Cc: "René Scharfe" <l.s.r@web.de>,
"Rasmus Villemoes" <rv@rasmusvillemoes.dk>,
"Eric Sunshine" <sunshine@sunshineco.com>
Subject: [PATCH 7/9] strbuf_getwholeline: use getdelim if it is available
Date: Thu, 16 Apr 2015 05:01:38 -0400 [thread overview]
Message-ID: <20150416090138.GG17938@peff.net> (raw)
In-Reply-To: <20150416084733.GA17811@peff.net>
We spend a lot of time in strbuf_getwholeline in a tight
loop reading characters from a stdio handle into a buffer.
The libc getdelim() function can do this for us with less
overhead. It's in POSIX.1-2008, and was a GNU extension
before that. Therefore we can't rely on it, but can fall
back to the existing getc loop when it is not available.
The HAVE_GETDELIM knob is turned on automatically for Linux,
where we have glibc. We don't need to set any new
feature-test macros, because we already define _GNU_SOURCE.
Other systems that implement getdelim may need to other
macros (probably _POSIX_C_SOURCE >= 200809L), but we can
address that along with setting the Makefile knob after
testing the feature on those systems.
Running "git rev-parse refs/heads/does-not-exist" on a repo
with an extremely large (1.6GB) packed-refs file went from
(best-of-5):
real 0m8.601s
user 0m8.084s
sys 0m0.524s
to:
real 0m6.768s
user 0m6.340s
sys 0m0.432s
for a wall-clock speedup of 21%.
Based on a patch from Rasmus Villemoes <rv@rasmusvillemoes.dk>.
Signed-off-by: Jeff King <peff@peff.net>
---
If somebody has a FreeBSD or OS X system to test on, I'd
love to see what is needed to compile with HAVE_GETDELIM
there. And to confirm that the performance is much better.
Sharing my 1.6GB packed-refs file would be hard, but you
should be able to generate something large and ridiculous.
I'll leave that as an exercise to the reader.
Makefile | 6 ++++++
config.mak.uname | 1 +
strbuf.c | 42 ++++++++++++++++++++++++++++++++++++++++++
3 files changed, 49 insertions(+)
diff --git a/Makefile b/Makefile
index 5f3987f..36655d5 100644
--- a/Makefile
+++ b/Makefile
@@ -359,6 +359,8 @@ all::
# compiler is detected to support it.
#
# Define HAVE_BSD_SYSCTL if your platform has a BSD-compatible sysctl function.
+#
+# Define HAVE_GETDELIM if your system has the getdelim() function.
GIT-VERSION-FILE: FORCE
@$(SHELL_PATH) ./GIT-VERSION-GEN
@@ -1437,6 +1439,10 @@ ifdef HAVE_BSD_SYSCTL
BASIC_CFLAGS += -DHAVE_BSD_SYSCTL
endif
+ifdef HAVE_GETDELIM
+ BASIC_CFLAGS += -DHAVE_GETDELIM
+endif
+
ifeq ($(TCLTK_PATH),)
NO_TCLTK = NoThanks
endif
diff --git a/config.mak.uname b/config.mak.uname
index f4e77cb..d26665f 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -36,6 +36,7 @@ ifeq ($(uname_S),Linux)
HAVE_DEV_TTY = YesPlease
HAVE_CLOCK_GETTIME = YesPlease
HAVE_CLOCK_MONOTONIC = YesPlease
+ HAVE_GETDELIM = YesPlease
endif
ifeq ($(uname_S),GNU/kFreeBSD)
HAVE_ALLOCA_H = YesPlease
diff --git a/strbuf.c b/strbuf.c
index 921619e..0d4f4e5 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -435,6 +435,47 @@ int strbuf_getcwd(struct strbuf *sb)
return -1;
}
+#ifdef HAVE_GETDELIM
+int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term)
+{
+ ssize_t r;
+
+ if (feof(fp))
+ return EOF;
+
+ strbuf_reset(sb);
+
+ /* Translate slopbuf to NULL, as we cannot call realloc on it */
+ if (!sb->alloc)
+ sb->buf = NULL;
+ r = getdelim(&sb->buf, &sb->alloc, term, fp);
+
+ if (r > 0) {
+ sb->len = r;
+ return 0;
+ }
+ assert(r == -1);
+
+ /*
+ * Normally we would have called xrealloc, which will try to free
+ * memory and recover. But we have no way to tell getdelim() to do so.
+ * Worse, we cannot try to recover ENOMEM ourselves, because we have
+ * no idea how many bytes were read by getdelim.
+ *
+ * Dying here is reasonable. It mirrors what xrealloc would do on
+ * catastrophic memory failure. We skip the opportunity to free pack
+ * memory and retry, but that's unlikely to help for a malloc small
+ * enough to hold a single line of input, anyway.
+ */
+ if (errno == ENOMEM)
+ die("Out of memory, getdelim failed");
+
+ /* Restore slopbuf that we moved out of the way before */
+ if (!sb->buf)
+ strbuf_init(sb, 0);
+ return EOF;
+}
+#else
int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term)
{
int ch;
@@ -458,6 +499,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term)
sb->buf[sb->len] = '\0';
return 0;
}
+#endif
int strbuf_getline(struct strbuf *sb, FILE *fp, int term)
{
--
2.4.0.rc2.384.g7297a4a
next prev parent reply other threads:[~2015-04-16 9:02 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-04-05 1:06 [PATCH 0/6] address packed-refs speed regressions Jeff King
2015-04-05 1:07 ` [PATCH 1/6] strbuf_getwholeline: use getc macro Jeff King
2015-04-05 1:08 ` [PATCH 2/6] git-compat-util: add fallbacks for unlocked stdio Jeff King
2015-04-05 1:11 ` [PATCH 3/6] strbuf_getwholeline: use getc_unlocked Jeff King
2015-04-05 4:56 ` Jeff King
2015-04-05 5:27 ` Jeff King
2015-04-05 5:35 ` Jeff King
2015-04-05 20:49 ` Junio C Hamano
2015-04-05 14:36 ` Duy Nguyen
2015-04-05 18:24 ` Jeff King
2015-04-05 20:09 ` Junio C Hamano
2015-04-07 13:48 ` Rasmus Villemoes
2015-04-07 19:04 ` Jeff King
2015-04-07 22:43 ` Rasmus Villemoes
2015-04-08 0:17 ` Jeff King
2015-04-05 1:11 ` [PATCH 4/6] strbuf: add an optimized 1-character strbuf_grow Jeff King
2015-04-06 2:13 ` Eric Sunshine
2015-04-06 5:05 ` Jeff King
2015-04-05 1:11 ` [PATCH 5/6] t1430: add another refs-escape test Jeff King
2015-04-05 1:15 ` [PATCH 6/6] refname_is_safe: avoid expensive normalize_path_copy call Jeff King
2015-04-05 13:41 ` [PATCH 0/6] address packed-refs speed regressions René Scharfe
2015-04-05 18:52 ` Jeff King
2015-04-05 18:59 ` Jeff King
2015-04-05 23:04 ` René Scharfe
2015-04-05 22:39 ` René Scharfe
2015-04-06 4:49 ` Jeff King
2015-04-16 8:47 ` [PATCH v2 0/9] " Jeff King
2015-04-16 8:48 ` [PATCH 1/9] strbuf_getwholeline: use getc macro Jeff King
2015-04-16 8:48 ` [PATCH 2/9] git-compat-util: add fallbacks for unlocked stdio Jeff King
2015-04-16 8:49 ` [PATCH 3/9] strbuf_getwholeline: use getc_unlocked Jeff King
2015-04-16 8:51 ` [PATCH 4/9] config: use getc_unlocked when reading from file Jeff King
2015-04-16 8:53 ` [PATCH 5/9] strbuf_addch: avoid calling strbuf_grow Jeff King
2015-04-16 8:58 ` [PATCH 6/9] strbuf_getwholeline: " Jeff King
2015-04-16 9:01 ` Jeff King [this message]
2015-04-17 10:16 ` [PATCH 7/9] strbuf_getwholeline: use getdelim if it is available Eric Sunshine
2015-04-21 23:09 ` Jeff King
2015-05-08 23:56 ` Eric Sunshine
2015-05-09 1:09 ` Jeff King
2015-06-02 18:22 ` Eric Sunshine
2015-04-22 18:00 ` Johannes Schindelin
2015-04-22 18:06 ` Jeff King
2015-04-16 9:03 ` [PATCH 8/9] read_packed_refs: avoid double-checking sane refs Jeff King
2015-04-16 9:04 ` [PATCH 9/9] t1430: add another refs-escape test Jeff King
2015-04-16 9:25 ` [PATCH v2 0/9] address packed-refs speed regressions Jeff King
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150416090138.GG17938@peff.net \
--to=peff@peff.net \
--cc=git@vger.kernel.org \
--cc=l.s.r@web.de \
--cc=rv@rasmusvillemoes.dk \
--cc=sunshine@sunshineco.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.