From: Andreas Ericsson <ae@op5.se>
To: Nicolas Pitre <nico@cam.org>
Cc: Jon Smirl <jonsmirl@gmail.com>,
Git Mailing List <git@vger.kernel.org>,
Junio C Hamano <gitster@pobox.com>
Subject: Re: Better value for chunk_size when threaded
Date: Mon, 10 Dec 2007 11:26:59 +0100 [thread overview]
Message-ID: <475D1473.5090809@op5.se> (raw)
In-Reply-To: <alpine.LFD.0.99999.0712062014060.555@xanadu.home>
Nicolas Pitre wrote:
> On Thu, 6 Dec 2007, Jon Smirl wrote:
>
>> I tried some various ideas out for chunk_size and the best strategy I
>> found was to simply set it to a constant. How does 20,000 work on
>> other CPUs?
>
> That depends on the object size. If you have a repo with big objects
> but only 1000 of them for example, then the constant doesn't work.
>
> Ideally I'd opt for a value that tend towards around 5 seconds worth of
> work per segment, or something like that. Maybe using the actual
> objects size could be another way.
>
>> I'd turn on default threaded support with this change. With threads=1
>> versus non-threaded there is no appreciable difference in the time.
>
> Would need a way to determine pthreads availability from Makefile.
>
>> Is there an API to ask how many CPUs are in the system? It would be
>> nice to default the number of threads equal to the number of CPUs and
>> only use pack.threads=X to override.
>
> If there is one besides futzing with /proc/cpuinfo I'd like to know
> about it. Bonus points if it is portable.
>
Here is such a one. I've sent it before, using git-send-email, but that
one doesn't seem to work too well for all list-members, probably because
my own laptop appears to be the original SMTP-server and its name can't
be looked up. Sorry for inlining it here instead of sending it as a mail
on its own, but I have absolutely no idea how to get git-send-email to
do ldap authentication and connect to our tls-enabled smtp-server
without using /usr/bin/sendmail and adding my laptop as originating
smtp-server.
This patch replaces the one I sent earlier and *should* work on
everything from Irix and AIX to Linux, Windows and every other
posixish system. It passes all tests, both with and without
THREADED_DELTA_SEARCH, and causes our weekly repack of our
mother-ship repos to run roughly 4 times as fast (4 cores, no
previous thread config).
Extract with
sed -n -e /^##SEDMEHERE##/,/##TOHERE##/p -e /^##/d
##SEDMEHERE##
From ddf08303bd7962be385abbd5e964455a90ed6055 Mon Sep 17 00:00:00 2001
From: Andreas Ericsson <ae@op5.se>
Date: Thu, 6 Dec 2007 22:09:27 +0100
Subject: [PATCH] pack-objects: Add runtime detection of number of CPU's
Packing objects can be done in parallell nowadays, but
it's only done if the config option pack.threads is set
to a value above 1. Because of that, the code-path used
is sometimes not the most optimal one.
This patch adds a routine to detect the number of active
CPU's at runtime, which should provide a better default
and activate the (hopefully) better codepath more often.
The code is a rework of "numcpus.c", written by one
Philip Willoughby <pgw99@doc.ic.ac.uk>. numcpus.c is in
the public domain and can presently be downloaded from
http://csgsoft.doc.ic.ac.uk/numcpus/
Signed-off-by: Andreas Ericsson <ae@op5.se>
---
builtin-pack-objects.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 48 insertions(+), 1 deletions(-)
diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 250dc56..ccf5198 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -17,6 +17,13 @@
#ifdef THREADED_DELTA_SEARCH
#include <pthread.h>
+# ifdef _WIN32
+# define WIN32_LEAN_AND_MEAN
+# include <windows.h>
+# endif
+# if defined(hpux) || defined(__hpux) || defined(_hpux)
+# include <sys/pstat.h>
+# endif
#endif
static const char pack_usage[] = "\
@@ -70,7 +77,7 @@ static int progress = 1;
static int window = 10;
static uint32_t pack_size_limit;
static int depth = 50;
-static int delta_search_threads = 1;
+static long delta_search_threads;
static int pack_to_stdout;
static int num_preferred_base;
static struct progress *progress_state;
@@ -2004,6 +2011,44 @@ static int adjust_perm(const char *path, mode_t mode)
return adjust_shared_perm(path);
}
+/*
+ * this is a disgusting nest of #ifdef's. I just love
+ * non-portable interfaces. By doing it in two steps
+ * we can get the function to be fairly coherent anyways
+ */
+#ifndef _SC_NPROCESSORS_ONLN
+# ifdef _SC_NPROC_ONLN
+# define _SC_NPROCESSORS_ONLN _SC_NPROC_ONLN
+# elif defined _SC_CRAY_NCPU
+# define _SC_NPROCESSORS_ONLN _SC_CRAY_NCPU
+# endif
+#endif
+static long active_cpu_count(void)
+{
+#ifdef THREADED_DELTA_SEARCH
+# ifdef _SC_NPROCESSORS_ONLN
+ long ncpus;
+
+ if ((ncpus = (long)sysconf(_SC_NPROCESSORS_ONLN)) > 0)
+ return ncpus;
+# else
+# ifdef _WIN32
+ SYSTEM_INFO info;
+ GetSystemInfo(&info);
+
+ return (long)info.dwNumberOfProcessors;
+# endif /* _WIN32 */
+# if defined(hpux) || defined(__hpux) || defined(_hpux)
+ struct pst_dynamic psd;
+
+ if (!pstat_getdynamic(&psd, sizeof(psd), (size_t)1, 0))
+ return (long)psd.psd_proc_cnt;
+# endif /* hpux */
+# endif /* _SC_NPROCESSORS_ONLN */
+#endif /* THREADED_DELTA_SEARCH */
+ return 1;
+}
+
int cmd_pack_objects(int argc, const char **argv, const char *prefix)
{
int use_internal_rev_list = 0;
@@ -2019,6 +2064,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
rp_av[1] = "--objects"; /* --thin will make it --objects-edge */
rp_ac = 2;
+ delta_search_threads = active_cpu_count();
+
git_config(git_pack_config);
if (!pack_compression_seen && core_compression_seen)
pack_compression_level = core_compression_level;
--
1.5.3.6.2031.gf9bdc
##TOHERE##
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
next prev parent reply other threads:[~2007-12-10 10:27 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-12-06 23:58 Better value for chunk_size when threaded Jon Smirl
2007-12-07 1:27 ` Nicolas Pitre
2007-12-07 1:37 ` Jon Smirl
2007-12-07 3:25 ` Nicolas Pitre
2007-12-10 10:26 ` Andreas Ericsson [this message]
2007-12-10 13:46 ` Nicolas Pitre
2007-12-07 8:57 ` Andreas Ericsson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=475D1473.5090809@op5.se \
--to=ae@op5.se \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jonsmirl@gmail.com \
--cc=nico@cam.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).