* Re: Horrible re-packing?
From: Linus Torvalds @ 2006-06-05 18:44 UTC (permalink / raw)
To: Junio C Hamano, Nicolas Pitre; +Cc: Git Mailing List
In-Reply-To: <Pine.LNX.4.64.0606050951120.5498@g5.osdl.org>
On Mon, 5 Jun 2006, Linus Torvalds wrote:
>
> Whaah! That nice 6.33MB pack-file exploded to 14.5MB!
>
> And it's possibly broken by the fact that we've been renaming things
> lately (ie the "rev-list.c" -> "builtin-rev-list.c" thing ends up not
> finding things)
No, it's even simpler.
The breakage is entirely mine, and due to the tree-walking conversion of
the "process_tree()" function.
In that function, we used to have a local "const char *name" that
_shadowed_ the incoming _argument_ with the same type, and the
tree-walking conversion did not notice that the inner "name" should have
been converted to "entry.path" - so it used the outer-level "name".
Gaah. We should probably use -Wshadow or something, which would hopefully
have warned about the re-use of the same variable name in two different
scopes.
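A minimal sketch of this failure mode, not the actual builtin-rev-list.c code: `struct entry` and the three `last_name_*` functions are hypothetical names made up for illustration. The first function has the original shape, where an inner `name` shadows the argument (this is the declaration `gcc -Wshadow` would flag). The second shows what a conversion that drops the inner declaration leaves behind, and the third mirrors the fix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a tree entry; not git's real structure. */
struct entry {
    const char *path;
};

/* Original shape: an inner "name" shadows the parameter (-Wshadow warns). */
static const char *last_name_shadowed(const char *name,
                                      const struct entry *entries, size_t n)
{
    const char *result = name;
    size_t i;

    assert(entries != NULL || n == 0);
    for (i = 0; i < n; i++) {
        const char *name = entries[i].path; /* shadows the argument */
        result = name;
    }
    return result;
}

/*
 * After a careless conversion: the inner declaration is gone, so "name"
 * silently refers to the argument and the code still compiles cleanly.
 */
static const char *last_name_converted(const char *name,
                                       const struct entry *entries, size_t n)
{
    const char *result = name;
    size_t i;

    assert(entries != NULL || n == 0);
    for (i = 0; i < n; i++)
        result = name; /* bug: should have been entries[i].path */
    return result;
}

/* The fix mirrors the patch: use the entry's own path. */
static const char *last_name_fixed(const char *name,
                                   const struct entry *entries, size_t n)
{
    const char *result = name;
    size_t i;

    assert(entries != NULL || n == 0);
    for (i = 0; i < n; i++)
        result = entries[i].path;
    return result;
}
```

The converted variant passes the parent's name for every entry, which matches the symptom described above: the name used for each object no longer corresponds to its path.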
Regardless, this fixes it.
Linus
---
diff --git a/builtin-rev-list.c b/builtin-rev-list.c
index 17c04b9..e885624 100644
--- a/builtin-rev-list.c
+++ b/builtin-rev-list.c
@@ -135,9 +135,9 @@ static struct object_list **process_tree
while (tree_entry(&desc, &entry)) {
if (S_ISDIR(entry.mode))
- p = process_tree(lookup_tree(entry.sha1), p, &me, name);
+ p = process_tree(lookup_tree(entry.sha1), p, &me, entry.path);
else
- p = process_blob(lookup_blob(entry.sha1), p, &me, name);
+ p = process_blob(lookup_blob(entry.sha1), p, &me, entry.path);
}
free(tree->buffer);
tree->buffer = NULL;
* Re: [RFC] git commit --branch
From: Jon Loeliger @ 2006-06-05 18:22 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Martin Waitz, Git List
In-Reply-To: <7vd5dvyvkq.fsf@assigned-by-dhcp.cox.net>
On Tue, 2006-05-30 at 17:52, Junio C Hamano wrote:
> Martin Waitz <tali@admingilde.org> writes:
>
> >> And your approach is to backport the fix to its original topic
> >> and then re-pull the topic onto the test branch.
> >
> > yes. I was doing this after working on gitweb a bit.
> > In order to test gitweb, I need some local adaptations.
>
> Funny you mention this. I had exactly the same arrangement for
> hacking on gitweb. One "localconf" branch to tell it where the
> repositories are, "origin" to track upstream, "master" to use
> for deployment, and other topic branches.
We all do. :-)
BTW, did you (anyone?) see my patch to help the local
configuration issue some? It basically separates out the
config bits into a separate hash table in a separate file that
can be updated quite independently without even modifying
the original gitweb.cgi. That allows the gitweb.cgi
proper to be slammed down and updated much more readily.
http://marc.theaimsgroup.com/?l=git&m=114308224922372&w=2
jdl
* Re: [ANNOUNCE qgit-1.3]
From: Pavel Roskin @ 2006-06-05 17:59 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git
In-Reply-To: <e5u8fk$ju6$1@sea.gmane.org>
On Sun, 2006-06-04 at 11:17 +0200, Jakub Narebski wrote:
> > The big feature is the use of tabs instead of independent windows.
> >
> > This change alone could be enough for a release. It's a big rewrite of UI
> > code to make browsing revisions and patches quicker and easier.
>
> Of course that is an advantage _only_ if the tabs are independent, and one
> (usually) doesn't need to view them simultaneously, e.g. side by side.
What would you want to see side-by-side?
--
Regards,
Pavel Roskin
* [PATCH] Fix git_setup_directory_gently when GIT_DIR is set
From: Johannes Schindelin @ 2006-06-05 17:46 UTC (permalink / raw)
To: git, junkio
When calling setup_git_directory_gently() with GIT_DIR set, it just
ignored the variable nongit_ok.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
---
setup.c | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/setup.c b/setup.c
index fe7f884..74301c2 100644
--- a/setup.c
+++ b/setup.c
@@ -184,6 +184,10 @@ const char *setup_git_directory_gently(i
}
return NULL;
bad_dir_environ:
+ if (nongit_ok) {
+ *nongit_ok = 1;
+ return NULL;
+ }
path[len] = 0;
die("Not a git repository: '%s'", path);
}
--
1.3.3.gdb440-dirty
* Re: [PATCH] git: handle aliases defined in $GIT_DIR/config
From: Johannes Schindelin @ 2006-06-05 17:43 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <Pine.LNX.4.63.0606051902210.20820@wbgn013.biozentrum.uni-wuerzburg.de>
Hi,
On Mon, 5 Jun 2006, Johannes Schindelin wrote:
> Hi,
>
> sorry, I did not test with the subdir=... stuff I copied from Pasky's
> patch. It breaks things for me. Looking into it...
There were actually two bugs: I did not change the subdir back in all
cases, and setup_git_directory_gently had a bug (will send patch
separately). The updated patch:
---
git.c | 111 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 111 insertions(+), 0 deletions(-)
diff --git a/git.c b/git.c
index bc463c9..8854472 100644
--- a/git.c
+++ b/git.c
@@ -10,6 +10,7 @@ #include <limits.h>
#include <stdarg.h>
#include "git-compat-util.h"
#include "exec_cmd.h"
+#include "cache.h"
#include "builtin.h"
@@ -32,6 +33,113 @@ static void prepend_to_path(const char *
setenv("PATH", path, 1);
}
+static const char *alias_command;
+static char *alias_string = NULL;
+
+static int git_alias_config(const char *var, const char *value)
+{
+ if (!strncmp(var, "alias.", 6) && !strcmp(var + 6, alias_command)) {
+ alias_string = strdup(value);
+ }
+ return 0;
+}
+
+static int split_cmdline(char *cmdline, const char ***argv)
+{
+ int src, dst, count = 0, size = 16;
+ char quoted = 0;
+
+ *argv = malloc(sizeof(char*) * size);
+
+ /* split alias_string */
+ (*argv)[count++] = cmdline;
+ for (src = dst = 0; cmdline[src];) {
+ char c = cmdline[src];
+ if (!quoted && isspace(c)) {
+ cmdline[dst++] = 0;
+ while (cmdline[++src]
+ && isspace(cmdline[src]))
+ ; /* skip */
+ if (count >= size) {
+ size += 16;
+ *argv = realloc(*argv, sizeof(char*) * size);
+ }
+ (*argv)[count++] = cmdline + dst;
+ } else if(!quoted && (c == '\'' || c == '"')) {
+ quoted = c;
+ src++;
+ } else if (c == quoted) {
+ quoted = 0;
+ src++;
+ } else {
+ if (c == '\\' && quoted != '\'') {
+ src++;
+ c = cmdline[src];
+ if (!c) {
+ free(*argv);
+ *argv = NULL;
+ return error("cmdline ends with \\");
+ }
+ }
+ cmdline[dst++] = c;
+ src++;
+ }
+ }
+
+ cmdline[dst] = 0;
+
+ if (quoted) {
+ free(*argv);
+ *argv = NULL;
+ return error("unclosed quote");
+ }
+
+ return count;
+}
+
+static int handle_alias(int *argcp, const char ***argv)
+{
+ int nongit = 0, ret = 0;
+ const char *subdir;
+
+ subdir = setup_git_directory_gently(&nongit);
+ if (!nongit) {
+ int count;
+ const char** new_argv;
+
+ alias_command = (*argv)[0];
+ git_config(git_alias_config);
+ if (alias_string) {
+
+ count = split_cmdline(alias_string, &new_argv);
+
+ if (count < 1)
+ die("empty alias for %s", alias_command);
+
+ if (!strcmp(alias_command, new_argv[0]))
+ die("recursive alias: %s", alias_command);
+
+ /* insert after command name */
+ if (*argcp > 1) {
+ new_argv = realloc(new_argv, sizeof(char*) *
+ (count + *argcp - 1));
+ memcpy(new_argv + count, *argv, sizeof(char*) *
+ (*argcp - 1));
+ }
+
+ *argv = new_argv;
+ *argcp += count - 1;
+
+ ret = 1;
+ }
+ }
+
+ if (subdir)
+ chdir(subdir);
+
+ return ret;
+}
+
const char git_version_string[] = GIT_VERSION;
static void handle_internal_command(int argc, const char **argv, char **envp)
@@ -121,6 +229,7 @@ int main(int argc, const char **argv, ch
if (!strncmp(cmd, "git-", 4)) {
cmd += 4;
argv[0] = cmd;
+ handle_alias(&argc, &argv);
handle_internal_command(argc, argv, envp);
die("cannot handle %s internally", cmd);
}
@@ -178,6 +287,8 @@ int main(int argc, const char **argv, ch
exec_path = git_exec_path();
prepend_to_path(exec_path, strlen(exec_path));
+ handle_alias(&argc, &argv);
+
/* See if it's an internal command */
handle_internal_command(argc, argv, envp);
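As an illustration of the quoting rules the alias splitter applies, here is a simplified, self-contained sketch. It is an assumption-laden rewrite, not the patch's split_cmdline(): the name `split_words` and the fixed-capacity `words` array are hypothetical, and it returns -1 instead of calling error(). Single quotes take everything literally, double quotes still honour backslash escapes, and unquoted whitespace separates words.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Split cmdline in place into at most max words. Returns the word count,
 * or -1 on an unclosed quote, a trailing backslash, or too many words.
 */
static int split_words(char *cmdline, char **words, int max)
{
    int src = 0, dst = 0, count = 0;
    char quoted = 0;

    assert(cmdline != NULL && words != NULL && max > 0);
    words[count++] = cmdline;
    while (cmdline[src]) {
        char c = cmdline[src];

        if (!quoted && isspace((unsigned char)c)) {
            cmdline[dst++] = 0; /* terminate the current word */
            while (cmdline[++src] && isspace((unsigned char)cmdline[src]))
                ; /* skip the rest of the whitespace run */
            if (count >= max)
                return -1;
            words[count++] = cmdline + dst;
        } else if (!quoted && (c == '\'' || c == '"')) {
            quoted = c; /* opening quote is dropped */
            src++;
        } else if (c == quoted) {
            quoted = 0; /* closing quote is dropped */
            src++;
        } else {
            if (c == '\\' && quoted != '\'') {
                c = cmdline[++src]; /* take the escaped character */
                if (!c)
                    return -1;
            }
            cmdline[dst++] = c;
            src++;
        }
    }
    cmdline[dst] = 0;
    return quoted ? -1 : count;
}
```

With an alias value such as `log --pretty='format:%h %s' -3`, the quoted part stays inside a single word, so the expansion would be three arguments rather than four.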
* Horrible re-packing?
From: Linus Torvalds @ 2006-06-05 17:08 UTC (permalink / raw)
To: Junio C Hamano, Nicolas Pitre; +Cc: Git Mailing List
Junio, Nico,
I just tried doing a "git repack -a -d -f" because I expected a full
re-pack to do _better_ than doing occasional incrementals, and to verify
the pack generation, but imagine my shock when IT SUCKS.
I didn't look at where the suckage started, but look at this:
[torvalds@g5 git]$ git repack -a -d
Generating pack...
Done counting 21322 objects.
Deltifying 21322 objects.
100% (21322/21322) done
Writing 21322 objects.
100% (21322/21322) done
Total 21322, written 21322 (delta 14489), reused 21319 (delta 14486)
Pack pack-fe4ff117c9959ead3443b826a777423b3062b666 created.
[torvalds@g5 git]$ ll .git/objects/pack/
total 7008
-rw-r--r-- 1 torvalds torvalds 512792 Jun 5 09:41 pack-fe4ff117c9959ead3443b826a777423b3062b666.idx
-rw-r--r-- 1 torvalds torvalds 6643695 Jun 5 09:41 pack-fe4ff117c9959ead3443b826a777423b3062b666.pack
Ie, we have a nice 6.33MB pack-file.
Now:
[torvalds@g5 git]$ git repack -a -d -f
Generating pack...
Done counting 21322 objects.
Deltifying 21322 objects.
100% (21322/21322) done
Writing 21322 objects.
100% (21322/21322) done
Total 21322, written 21322 (delta 10187), reused 6777 (delta 0)
Pack pack-fe4ff117c9959ead3443b826a777423b3062b666 created.
[torvalds@g5 git]$ ll .git/objects/pack/
total 15352
-rw-r--r-- 1 torvalds torvalds 512792 Jun 5 09:41 pack-fe4ff117c9959ead3443b826a777423b3062b666.idx
-rw-r--r-- 1 torvalds torvalds 15176139 Jun 5 09:41 pack-fe4ff117c9959ead3443b826a777423b3062b666.pack
Whaah! That nice 6.33MB pack-file exploded to 14.5MB!
Doing repeated "git repack -a -d" to try to do incrementals, it stopped
improving after the sixth one, at which point it was down to 11.7MB, still
almost twice as big as before.
Re-doing it with
git repack -a -d -f --depth=100 --window=100
got me back to 6.94MB, but that's still 10% larger than the pack-file I
had before.
Interestingly, it's the "window" that matters more. The depth part didn't
make that huge of a difference, so it looks like it's the sorting
heuristic that may be broken again.
And it's possibly broken by the fact that we've been renaming things
lately (ie the "rev-list.c" -> "builtin-rev-list.c" thing ends up not
finding things)
Nico? Any ideas?
Linus
* Re: [PATCH] git: handle aliases defined in $GIT_DIR/config
From: Johannes Schindelin @ 2006-06-05 17:02 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <Pine.LNX.4.63.0606051847480.18604@wbgn013.biozentrum.uni-wuerzburg.de>
Hi,
sorry, I did not test with the subdir=... stuff I copied from Pasky's
patch. It breaks things for me. Looking into it...
Ciao,
Dscho
* Re: [PATCH] git: handle aliases defined in $GIT_DIR/config
From: Johannes Schindelin @ 2006-06-05 16:51 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <7v3bekacts.fsf@assigned-by-dhcp.cox.net>
Hi,
On Sun, 4 Jun 2006, Junio C Hamano wrote:
> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> > For me, short cuts have to be easy to type, so they never
> > include digits, and they are never case sensitive, so I do not
> > need any fancy config stuff...
>
> Fair enough, and the spirit is the same as what Pasky suggested
> earlier, I think.
>
> However, I am not sure about some parts of the code. I started
> mucking with it myself, but realized it is far easier for me to
> just let the original submitter, especially the capable one like
> you, do a bit more work ;-).
Are you trying to butter me up? If so, it's working ;-)
Here is a revised patch which addresses all of your comments (and Pasky's
implicit ones) except the move of split_cmdline to somewhere central (I
am not sure if that function is really needed elsewhere...):
---
git.c | 113 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 113 insertions(+), 0 deletions(-)
diff --git a/git.c b/git.c
index bc463c9..db6ac61 100644
--- a/git.c
+++ b/git.c
@@ -10,6 +10,7 @@ #include <limits.h>
#include <stdarg.h>
#include "git-compat-util.h"
#include "exec_cmd.h"
+#include "cache.h"
#include "builtin.h"
@@ -32,6 +33,115 @@ static void prepend_to_path(const char *
setenv("PATH", path, 1);
}
+static const char *alias_command;
+static char *alias_string = NULL;
+
+static int git_alias_config(const char *var, const char *value)
+{
+ if (!strncmp(var, "alias.", 6) && !strcmp(var + 6, alias_command)) {
+ alias_string = strdup(value);
+ }
+ return 0;
+}
+
+static int split_cmdline(char *cmdline, const char ***argv)
+{
+ int src, dst, count = 0, size = 16;
+ char quoted = 0;
+
+ *argv = malloc(sizeof(char*) * size);
+
+ /* split alias_string */
+ (*argv)[count++] = cmdline;
+ for (src = dst = 0; cmdline[src];) {
+ char c = cmdline[src];
+ if (!quoted && isspace(c)) {
+ cmdline[dst++] = 0;
+ while (cmdline[++src]
+ && isspace(cmdline[src]))
+ ; /* skip */
+ if (count >= size) {
+ size += 16;
+ *argv = realloc(*argv, sizeof(char*) * size);
+ }
+ (*argv)[count++] = cmdline + dst;
+ } else if(!quoted && (c == '\'' || c == '"')) {
+ quoted = c;
+ src++;
+ } else if (c == quoted) {
+ quoted = 0;
+ src++;
+ } else {
+ if (c == '\\' && quoted != '\'') {
+ src++;
+ c = cmdline[src];
+ if (!c) {
+ free(*argv);
+ *argv = NULL;
+ return error("cmdline ends with \\");
+ }
+ }
+ cmdline[dst++] = c;
+ src++;
+ }
+ }
+
+ cmdline[dst] = 0;
+
+ if (quoted) {
+ free(*argv);
+ *argv = NULL;
+ return error("unclosed quote");
+ }
+
+ return count;
+}
+
+static int handle_alias(int *argcp, const char ***argv)
+{
+ int nongit = 0;
+ const char *subdir;
+
+ subdir = setup_git_directory_gently(&nongit);
+ if (!nongit) {
+ int count;
+ const char** new_argv;
+
+ alias_command = (*argv)[0];
+ git_config(git_alias_config);
+ if (!alias_string)
+ return 0;
+
+ count = split_cmdline(alias_string, &new_argv);
+
+ if (count < 1)
+ die("empty alias for %s", alias_command);
+
+ if (!strcmp(alias_command, new_argv[0]))
+ die("recursive alias: %s", alias_command);
+
+ /* insert after command name */
+ if (*argcp > 1) {
+ new_argv = realloc(new_argv,
+ sizeof(char*) * (count + *argcp - 1));
+ memcpy(new_argv + count, *argv, sizeof(char*) * (*argcp - 1));
+ }
+
+ *argv = new_argv;
+ *argcp += count - 1;
+
+ if (subdir)
+ chdir(subdir);
+
+ return 1;
+ }
+
+ if (subdir)
+ chdir(subdir);
+
+ return 0;
+}
+
const char git_version_string[] = GIT_VERSION;
static void handle_internal_command(int argc, const char **argv, char **envp)
@@ -121,6 +231,7 @@ int main(int argc, const char **argv, ch
if (!strncmp(cmd, "git-", 4)) {
cmd += 4;
argv[0] = cmd;
+ handle_alias(&argc, &argv);
handle_internal_command(argc, argv, envp);
die("cannot handle %s internally", cmd);
}
@@ -178,6 +289,8 @@ int main(int argc, const char **argv, ch
exec_path = git_exec_path();
prepend_to_path(exec_path, strlen(exec_path));
+ handle_alias(&argc, &argv);
+
/* See if it's an internal command */
handle_internal_command(argc, argv, envp);
* Re: [PATCH 0/27] Documentation: Spelling fixes
From: Nikolai Weibull @ 2006-06-05 16:48 UTC (permalink / raw)
To: Andreas Ericsson; +Cc: Junio C Hamano, Horst.H.von.Brand, git
In-Reply-To: <4484239C.7020608@op5.se>
On 6/5/06, Andreas Ericsson <ae@op5.se> wrote:
> Nikolai Weibull wrote:
> > On 6/4/06, Junio C Hamano <junkio@cox.net> wrote:
> >
> >> Most do not seem to be typos, depending on where you learned
> >> the language (XYZour vs XYZor; ok, Ok, and OK; ie vs i.e.).
> >
> > Where do you write "ie" instead of "i.e."?
> >
>
> Mailing lists, online conversations, tech docs written in code
> editors...
Do you mean that code editors usually don't let you enter a dot into
the buffer, or what?
> Compare with online'ish abbrevs (afaict, iirc, imo, fyi).
That's hardly the same thing. Most people would upcase AFAICT, IIRC,
IMO, and FYI.
I wouldn't group "i.e." with such abbreviations in any case. (Hehe.)
> > In Swedish, there has been a trend to remove dots from abbreviated
> > expressions, but it seems people are returning to use dots.
> > Personally, I find that dots make things a lot clearer.
>
> Swedish has lots of abbreviations where one "part" of the abbreviation
> consists of multiple characters, like t.ex.
And "bl.a.".
> When each character of the abbrev defines one complete word dots are
> just prettiness-noise, their presence or absence decided by the gravity
> of the meaning ("R.I.P." vs "ie"). Obviously, correctness never hurts
> but this is, on two accounts, punktknulleri.
Considering that people don't want to get stuck trying to work out
what the word "ie" is supposed to mean in a manual page when all
they want is to understand what some command does (this happened to
me), I really think that fucking with the dots is called for.
Anyway, the general guidelines recommended by "The Chicago Manual of Style" are:
Use periods with abbreviations that appear in lowercase letters; use
no periods with abbreviations that appear in full capitals or small
capitals, whether two letters or more.
One possible solution is to expand "i.e." to "that is" (or something
equally befitting) and "e.g." to "for example", "such as", or similar.
nikolai
* Re: irc usage..
From: Sean @ 2006-06-05 16:07 UTC (permalink / raw)
To: antarus
Cc: martin.langhoff, spyderous, torvalds, ydirson, git, smurf,
Johannes.Schindelin
In-Reply-To: <448398BC.5090402@gentoo.org>
On Sun, 04 Jun 2006 22:36:44 -0400
Alec Warner <antarus@gentoo.org> wrote:
> I'll keep chugging on this one; it won't be the final import as I
> haven't used the complete Authors file, so I will try the repacking
> optimization next time I do an import.
Hi Alec,
You may want to go back and do another import for other reasons, but if
the only reason is to fix up the author information it would be _much_
faster to simply rewrite the git commit history. Cogito has something
called "cg-admin-rewritehist" which should do what you need and there
are other scripts floating around specifically for rewriting just the
author information.
HTH,
Sean
* Re: [PATCH, take 2] Add example xinetd(8) configuration to Documentation/everyday.txt
From: Dmitry V. Levin @ 2006-06-05 13:12 UTC (permalink / raw)
To: Horst von Brand; +Cc: git
In-Reply-To: <200606050054.k550sFCC018490@laptop11.inf.utfsm.cl>
On Sun, Jun 04, 2006 at 08:54:15PM -0400, Horst von Brand wrote:
> Dmitry V. Levin <ldv@altlinux.org> wrote:
> > It is bad advice to run git-daemon as root.
>
> Right, my bad. Fixed patch below.
[...]
> + user = root
Really?
--
ldv
* Re: Using pickaxe to track changed symbol CR4_FEATURES_ADDR
From: Andreas Ericsson @ 2006-06-05 12:43 UTC (permalink / raw)
To: Thomas Glanzmann; +Cc: GIT
In-Reply-To: <20060605102627.GB24346@cip.informatik.uni-erlangen.de>
Thomas Glanzmann wrote:
> Hello,
> I am looking for the symbol CR4_FEATURES_ADDR which must have gone in one
> of the last kernel revisions. Now how do I use pickaxe to track any
> changes that involve my missing symbol? Or is there a better way to
> track that change down?
>
$ git whatchanged -S'CR4_FEATURES_ADDR'
last time I checked, but that was 10 days and an immense amount of cheap
turkish alcohol ago so it's quite possible that I'm wrong.
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
* Re: [PATCH 0/27] Documentation: Spelling fixes
From: Andreas Ericsson @ 2006-06-05 12:29 UTC (permalink / raw)
To: Nikolai Weibull; +Cc: Junio C Hamano, Horst.H.von.Brand, git
In-Reply-To: <dbfc82860606041059l31605bc5j18ad2b35ea6f6dc0@mail.gmail.com>
Nikolai Weibull wrote:
> On 6/4/06, Junio C Hamano <junkio@cox.net> wrote:
>
>> Most do not seem to be typos, depending on where you learned
>> the language (XYZour vs XYZor; ok, Ok, and OK; ie vs i.e.).
>
>
> Where do you write "ie" instead of "i.e."?
>
Mailing lists, online conversations, tech docs written in code
editors... Compare with online'ish abbrevs (afaict, iirc, imo, fyi).
> In Swedish, there has been a trend to remove dots from abbreviated
> expressions, but it seems people are returning to use dots.
> Personally, I find that dots make things a lot clearer.
>
Swedish has lots of abbreviations where one "part" of the abbreviation
consists of multiple characters, like t.ex.
When each character of the abbrev defines one complete word dots are
just prettiness-noise, their presence or absence decided by the gravity
of the meaning ("R.I.P." vs "ie"). Obviously, correctness never hurts
but this is, on two accounts, punktknulleri.
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
* Re: Gitk feature - show nearby tags
From: Marco Costalba @ 2006-06-05 11:54 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <7vhd305dk9.fsf@assigned-by-dhcp.cox.net>
On 6/5/06, Junio C Hamano <junkio@cox.net> wrote:
> "Marco Costalba" <mcostalba@gmail.com> writes:
>
> I think your "start from positive ones, traverse one by one and
> stop traversal that hits the negative one" logic requires the
> negative one to be directly on the traversal paths starting from
> positive ones to have _any_ effect. We often ask "what's the
> ones that are still not merged to the master from the side
> branch" while dealing with topic branches:
>
>           c-------d---e master        time flows from
>          /       /                    left to right
>   --a---b---x---y---z side
>
> and the way to ask that question is "rev-list master..side"
> (which is "rev-list side ^master"). It should list z and not
> show y nor x nor b nor a.
>
> In order for it to be able to notice that y should not be
> listed, it needs to perform traversals from negative ones as
> well in order to learn that y is reachable from master.
>
Thanks for your clear explanation. Now I see much better what's the deal.
>
> I think one useful thing we can do is to generalize what
> "describe", "name-rev", and "merge-base" do to have a command
> that takes a committish X and a set of other committish T1..Tn,
> and examines if Ti (1<=i<=n) is reachable from X and if X is
> reachable from Ti (1<=i<=n), and give a short-hand to specify
> the set of T for common patterns like --heads --tags and --all.
>
I don't know if this is enough for our original problem of finding the
previous tag. Our problem is not only to find previous tags, but the
_nearest_ previous one, so I think we need a generalization that also
takes into account a kind of 'metric' among tags, because reachability
alone seems to fall short in finding the nearest one.
But I definitely need to think more about this ;-)
Marco
* Using pickaxe to track changed symbol CR4_FEATURES_ADDR
From: Thomas Glanzmann @ 2006-06-05 10:26 UTC (permalink / raw)
To: GIT
Hello,
I am looking for the symbol CR4_FEATURES_ADDR which must have gone in one
of the last kernel revisions. Now how do I use pickaxe to track any
changes that involve my missing symbol? Or is there a better way to
track that change down?
Thomas
* Re: [RFC] Add first whack at interpolated daemon paths.
From: Junio C Hamano @ 2006-06-05 6:20 UTC (permalink / raw)
To: Jon Loeliger; +Cc: git
In-Reply-To: <E1Fn6B9-00017u-BV@jdl.com>
Jon Loeliger <jdl@jdl.com> writes:
> This is really RFC-ish. No canonicalization of hostname is done.
That's fine. RFC-ish is what gets the ball rolling.
> diff --git a/Makefile b/Makefile
> index 004c216..6a02236 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -211,6 +211,7 @@ DIFF_OBJS = \
> LIB_OBJS = \
> blob.o commit.o connect.o csum-file.o cache-tree.o base85.o \
> date.o diff-delta.o entry.o exec_cmd.o ident.o index.o \
> + interpolate.o \
> object.o pack-check.o patch-delta.o path.o pkt-line.o \
> quote.o read-cache.o refs.o run-command.o dir.o \
> server-info.o setup.o sha1_file.o sha1_name.o strbuf.o \
A separate helper library is always a good idea. Is it
"interpolate", I wonder, however?
> diff --git a/connect.c b/connect.c
> index 54f7bf7..9e7b276 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -374,7 +374,13 @@ static int git_tcp_connect(int fd[2], co
>
> fd[0] = sockfd;
> fd[1] = sockfd;
> - packet_write(sockfd, "%s %s\n", prog, path);
> +
> + /*
> + * Separate original protocol components prog and path
> + * from extended components with a NUL byte.
> + */
> + packet_write(sockfd, "%s %s%cHOST=%s\n", prog, path, 0, host);
> +
Since packet interface reader knows the length of the total
string, it might make sense to use NUL as the terminator
between extended attributes (i.e. end the format with
"...HOST=%s%c" and append 0 as the last argument). Some of our
later extended attributes might want to have LF in them for
whatever reason we do not foresee here.
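A minimal sketch of the NUL-terminated layout suggested above, assuming each extended attribute is a `KEY=value` string followed by a NUL byte. The helpers `build_request` and `find_attr` are hypothetical names for illustration; the real client goes through packet_write(), and a real server would have to bound every scan by the packet length instead of trusting the terminators.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "prog path\0HOST=host\0" into buf. Returns the number of payload
 * bytes (embedded NULs included), or -1 if it does not fit.
 */
static int build_request(char *buf, size_t size, const char *prog,
                         const char *path, const char *host)
{
    int n;

    assert(buf != NULL && size > 0);
    n = snprintf(buf, size, "%s %s%cHOST=%s%c", prog, path, 0, host, 0);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}

/*
 * Look up an extended attribute by key in a request built as above.
 * Returns a pointer to its value, or NULL if the key is absent. This
 * assumes well-formed input; it is not hardened against hostile peers.
 */
static const char *find_attr(const char *buf, int len, const char *key)
{
    size_t klen = strlen(key);
    const char *p = buf + strlen(buf) + 1; /* skip "prog path" and its NUL */
    const char *end = buf + len;

    while (p < end) {
        if (strncmp(p, key, klen) == 0 && p[klen] == '=')
            return p + klen + 1;
        p += strlen(p) + 1; /* advance to the next attribute */
    }
    return NULL;
}
```

Because the reader already knows the total length, a value containing LF is carried through unchanged in this layout; only NUL is reserved.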
> return 0;
> }
>
> @@ -443,7 +449,13 @@ static int git_tcp_connect(int fd[2], co
>
> fd[0] = sockfd;
> fd[1] = sockfd;
> - packet_write(sockfd, "%s %s\n", prog, path);
> +
> + /*
> + * Separate original protocol components prog and path
> + * from extended components with a NUL byte.
> + */
> + packet_write(sockfd, "%s %s%cHOST=%s\n", prog, path, 0, host);
> +
> return 0;
> }
We probably would want to share this part between two
git_tcp_connect() implementations. Rename two #ifdef'ed
implementations of git_tcp_connect() to git_tcp_connect_sock(),
make its sole purpose to just open the connection and return the
sockfd. Write a new git_tcp_connect() which is not #ifdef'ed,
call git_tcp_connect_sock() from there, and do the part after
"if (sockfd < 0)" in the new git_tcp_connect().
> +/* Flag indicating client sent extra args. */
> +int saw_extended_args = 0;
>
> /* If defined, ~user notation is allowed and the string is inserted
> * after ~user/. E.g. a request to git://host/~alice/frotz would
> @@ -41,6 +50,23 @@ static char *user_path = NULL;
> static unsigned int timeout = 0;
> static unsigned int init_timeout = 0;
>
> +/*
> + * Static table for now. Ugh.
> + * Feel free to make dynamic as needed.
> + */
> +#define INTERP_SLOT_HOST (0)
> +#define INTERP_SLOT_DIR (1)
> +#define INTERP_SLOT_PERCENT (2)
> +
> +struct interp interp_table[] = {
> + { "%H", 0},
> + { "%D", 0},
> + { "%%", "%"},
> +};
As Linus mentioned, %H for full host and %h for the most
specific part may make sense. I wonder if %D has a practical
value. It lets you splice the client supplied directory path in
the middle of the final string, but would it be useful in
practice? Otherwise, maybe dropping %D and always appending at
the tail of "interpolated" path that is tailored for the virtual
host might be easier to code and explain.
As to naming, "interp" sounds as if we are dealing with some
"interpreter" here...
> +#define N_INTERPS (sizeof(interp_table) / sizeof(struct interp))
I would use ARRAY_SIZE() from git-compat-util.h where you would
use N_INTERPS and drop the #define altogether.
> + else if (interpolated_path && saw_extended_args) {
> + if (*dir != '/') {
> + /* Allow only absolute */
> + logerror("'%s': Non-absolute path denied (interpolated-path active)", dir);
> + return NULL;
> + }
> +
> + loginfo("Before interpolation '%s'", dir);
> + loginfo("Interp slot 0 (%s,%s)",
> + interp_table[0].name, interp_table[0].value);
> + loginfo("Interp slot 1 (%s,%s)",
> + interp_table[1].name, interp_table[1].value);
> + interpolate(interp_path, PATH_MAX, interpolated_path,
> + interp_table, N_INTERPS);
> + loginfo("After interpolation '%s'", interp_path);
> + dir = interp_path;
> + }
I suspect it would be easier to maintain the site if you let the
site administrator configure the default virtual host name, and for
requests from older clients use that name to interpolate %H, and
always use this codepath and nothing else (in other words, lose
"saw_extended_args" check) for both old and new clients.
With your patch, the administrator needs to configure the daemon
with --base-path=/mnt1/git.or.cz/git --interpolated-path=/mnt1/%H/git and
keep them in sync.
> diff --git a/interpolate.c b/interpolate.c
>...
> + if (p) {
> + /*
> + * Found a potential interpolation point.
> + */
> + for (i = 0; i < ninterps; i++) {
> + name = interps[i].name;
> + if (strncmp(p, name, strlen(name)) == 0)
> + break;
> + }
> +
> + value = interps[i].value;
> + valuelen = strlen(value);
> + printf("Interp: %s to %s\n", name, value);
Wouldn't a daemon misconfigured with --interpolated-path="/mnt/%X"
barf here by overstepping interps[] array?
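One way to avoid the overstep, sketched under assumptions: the `struct interp` layout (a name and a value) is inferred from the table quoted earlier, `find_interp` is a hypothetical helper, and ARRAY_SIZE is defined locally here as a stand-in for the one in git-compat-util.h. The lookup returns NULL when no entry matches, so the caller can copy the unknown escape through literally or reject the configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed layout, inferred from the quoted interp_table initializer. */
struct interp {
    const char *name;
    const char *value;
};

/* Local stand-in for the macro of the same name in git-compat-util.h. */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/*
 * Return the table entry whose name is a prefix of p, or NULL if none
 * matches. The caller never indexes the table with i == ninterps.
 */
static const struct interp *find_interp(const struct interp *interps,
                                        size_t ninterps, const char *p)
{
    size_t i;

    assert(interps != NULL && p != NULL);
    for (i = 0; i < ninterps; i++) {
        const char *name = interps[i].name;

        if (strncmp(p, name, strlen(name)) == 0)
            return &interps[i];
    }
    return NULL; /* unknown escape such as "%X" */
}
```

A caller would check the result before touching `value`, which also covers table slots whose value has not been filled in yet.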
* Re: Gitk feature - show nearby tags
From: Junio C Hamano @ 2006-06-05 6:20 UTC (permalink / raw)
To: Marco Costalba; +Cc: Junio C Hamano, git
In-Reply-To: <e5bfff550606040657p5c1a3dceq3eef254ab64f0e3a@mail.gmail.com>
"Marco Costalba" <mcostalba@gmail.com> writes:
> What I understand is that git-rev-list lists _first_ the given commit,
> then his parents, then his grandparents and so on _until_ a commit
> which is stated with a preceding '{caret}' is found.
> So everything that is between the given commit and HEAD is never found
> and ignored.
As you now know, the way it works is that it takes an unordered
set of committishes, and performs a set operation that says
"include everybody reachable from positive ones while excluding
everybody reachable from negative ones". --topo-order tells it
to topologically (instead of doing the commit date-order which
it does by default) sort the resulting list. The resulting list
is then written out.
> Is it a problem to change the git-rev-list behaviour to reflect (my
> understanding of) documentation or it breaks something?
I suspect it would break quite many things. Existing users use
the command knowing it is a set operation on an unordered set of
committishes, and expect the command to behave that way. Also
the most typical use A..B translates to ^A B (either internally
or by rev-parse) so "the first" would typically be a negative
one.
I think your "start from positive ones, traverse one by one and
stop traversal that hits the negative one" logic requires the
negative one to be directly on the traversal paths starting from
positive ones to have _any_ effect. We often ask "what's the
ones that are still not merged to the master from the side
branch" while dealing with topic branches:
          c-------d---e master        time flows from
         /       /                    left to right
  --a---b---x---y---z side
and the way to ask that question is "rev-list master..side"
(which is "rev-list side ^master"). It should list z and not
show y nor x nor b nor a.
In order for it to be able to notice that y should not be
listed, it needs to perform traversals from negative ones as
well in order to learn that y is reachable from master.
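The difference between the two readings can be sketched on this small history. This is a toy model, not how rev-list is implemented: the parent table hard-codes the diagram, assuming c forks from b and d merges y (the text only states that y is reachable from master), and the names `rev_list` and `walk_until` are made up for illustration.

```c
#include <assert.h>
#include <string.h>

enum { CA, CB, CX, CY, CZ, CC, CD, CE, NCOMMITS };

/* parents[i] lists up to two parents of commit i; -1 means none. */
static const int parents[NCOMMITS][2] = {
    [CA] = { -1, -1 },
    [CB] = { CA, -1 },
    [CX] = { CB, -1 },
    [CY] = { CX, -1 },
    [CZ] = { CY, -1 },  /* tip of "side" */
    [CC] = { CB, -1 },  /* assumed fork point: b */
    [CD] = { CC, CY },  /* merge of side (at y) into master */
    [CE] = { CD, -1 },  /* tip of "master" */
};

/* Mark every commit reachable from "commit" in seen[]. */
static void mark_reachable(int commit, unsigned char *seen)
{
    int i;

    if (commit < 0 || seen[commit])
        return;
    seen[commit] = 1;
    for (i = 0; i < 2; i++)
        mark_reachable(parents[commit][i], seen);
}

/* Set semantics: reachable from pos, minus reachable from neg. */
static void rev_list(int pos, int neg, unsigned char *out)
{
    unsigned char from_neg[NCOMMITS] = { 0 };
    int i;

    assert(out != NULL);
    memset(out, 0, NCOMMITS);
    mark_reachable(pos, out);
    mark_reachable(neg, from_neg);
    for (i = 0; i < NCOMMITS; i++)
        if (from_neg[i])
            out[i] = 0;
}

/* The proposed reading: walk from pos and stop only when hitting "stop". */
static void walk_until(int commit, int stop, unsigned char *out)
{
    int i;

    if (commit < 0 || commit == stop || out[commit])
        return;
    out[commit] = 1;
    for (i = 0; i < 2; i++)
        walk_until(parents[commit][i], stop, out);
}
```

In this model `rev_list(CZ, CE, ...)` keeps only z, while `walk_until(CZ, CE, ...)` never meets master's tip on its way down from side and so keeps z, y, x, b and a.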
How would you ask the same question to the modified rev-list
that does "start from positive ones, traverse one by one and
stop traversal that hits the negative one" logic?
I think one useful thing we can do is to generalize what
"describe", "name-rev", and "merge-base" do to have a command
that takes a committish X and a set of other committish T1..Tn,
and examines if Ti (1<=i<=n) is reachable from X and if X is
reachable from Ti (1<=i<=n), and give a short-hand to specify
the set of T for common patterns like --heads --tags and --all.
But that would not be rev-list; I suspect you would end up doing
something quite similar to what show-branch does.
* [RFC] Add first whack at interpolated daemon paths.
From: Jon Loeliger @ 2006-06-05 3:54 UTC (permalink / raw)
To: git
Modify git protocol to pass in client hostname and
allow it to be interpolated into daemon source dir.
New --interpolated-path=<path> option.
---
Makefile | 1 +
connect.c | 16 +++++++-
daemon.c | 113 ++++++++++++++++++++++++++++++++++++++++++++++++++-------
interpolate.c | 74 +++++++++++++++++++++++++++++++++++++
interpolate.h | 10 +++++
5 files changed, 198 insertions(+), 16 deletions(-)
This is really RFC-ish. No canonicalization of hostname is done.
It is backwards compatible with existing daemon path handling,
but also allows for future extensibility with more "extended"
client args being supplied. The interpolate code is pretty
generic, but the table driving it in this case is hard coded.
Tons of loginfo() crap left in here to be cleaned up.
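As a point of comparison for the interpolate code, here is a compact
sketch of table-driven, bounds-checked expansion. It is not the code in
this patch: interpolate_sketch() is a hypothetical name, the struct
mirrors interpolate.h but with const members, and the choices to copy an
unknown "%x" through unchanged and to fail on an unset value are
assumptions of mine.

```c
#include <assert.h>
#include <string.h>

struct interp {
    const char *name;     /* placeholder, e.g. "%H" */
    const char *value;    /* replacement text; NULL means "not set" */
};

/*
 * Expand placeholders from 'table' found in 'tmpl' into 'result'.
 * Returns 1 on success, 0 if the result would not fit in 'reslen'
 * bytes or a matched placeholder has no value. A '%' that starts no
 * known placeholder is copied through unchanged.
 */
int interpolate_sketch(char *result, size_t reslen, const char *tmpl,
                       const struct interp *table, size_t ntable)
{
    size_t used = 0;

    assert(result && tmpl && table);
    while (*tmpl) {
        const char *piece = tmpl;    /* default: copy one literal byte */
        size_t piecelen = 1;
        size_t skip = 1;
        size_t i;

        if (*tmpl == '%') {
            for (i = 0; i < ntable; i++) {
                size_t namelen = strlen(table[i].name);

                if (namelen && strncmp(tmpl, table[i].name, namelen) == 0) {
                    if (!table[i].value)
                        return 0;
                    piece = table[i].value;
                    piecelen = strlen(piece);
                    skip = namelen;
                    break;
                }
            }
        }
        if (used + piecelen >= reslen)
            return 0;                /* keep one byte for the NUL */
        memcpy(result + used, piece, piecelen);
        used += piecelen;
        tmpl += skip;
    }
    if (used >= reslen)
        return 0;
    result[used] = '\0';
    return 1;
}
```

With a table of { "%H", "example.org" } and { "%D", "/git.git" }, the
template "/pub/%H%D" expands to "/pub/example.org/git.git".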
diff --git a/Makefile b/Makefile
index 004c216..6a02236 100644
--- a/Makefile
+++ b/Makefile
@@ -211,6 +211,7 @@ DIFF_OBJS = \
LIB_OBJS = \
blob.o commit.o connect.o csum-file.o cache-tree.o base85.o \
date.o diff-delta.o entry.o exec_cmd.o ident.o index.o \
+ interpolate.o \
object.o pack-check.o patch-delta.o path.o pkt-line.o \
quote.o read-cache.o refs.o run-command.o dir.o \
server-info.o setup.o sha1_file.o sha1_name.o strbuf.o \
diff --git a/connect.c b/connect.c
index 54f7bf7..9e7b276 100644
--- a/connect.c
+++ b/connect.c
@@ -374,7 +374,13 @@ static int git_tcp_connect(int fd[2], co
fd[0] = sockfd;
fd[1] = sockfd;
- packet_write(sockfd, "%s %s\n", prog, path);
+
+ /*
+ * Separate original protocol components prog and path
+ * from extended components with a NUL byte.
+ */
+ packet_write(sockfd, "%s %s%cHOST=%s\n", prog, path, 0, host);
+
return 0;
}
@@ -443,7 +449,13 @@ static int git_tcp_connect(int fd[2], co
fd[0] = sockfd;
fd[1] = sockfd;
- packet_write(sockfd, "%s %s\n", prog, path);
+
+ /*
+ * Separate original protocol components prog and path
+ * from extended components with a NUL byte.
+ */
+ packet_write(sockfd, "%s %s%cHOST=%s\n", prog, path, 0, host);
+
return 0;
}
diff --git a/daemon.c b/daemon.c
index 776749e..0c9ebe3 100644
--- a/daemon.c
+++ b/daemon.c
@@ -10,6 +10,7 @@ #include <syslog.h>
#include "pkt-line.h"
#include "cache.h"
#include "exec_cmd.h"
+#include "interpolate.h"
static int log_syslog;
static int verbose;
@@ -18,7 +19,8 @@ static int reuseaddr;
static const char daemon_usage[] =
"git-daemon [--verbose] [--syslog] [--inetd | --port=n] [--export-all]\n"
" [--timeout=n] [--init-timeout=n] [--strict-paths]\n"
-" [--base-path=path] [--user-path | --user-path=path]\n"
+" [--base-path=path] [--interpolated-path=path]\n"
+" [--user-path | --user-path=path]\n"
" [--reuseaddr] [directory...]";
/* List of acceptable pathname prefixes */
@@ -28,8 +30,15 @@ static int strict_paths = 0;
/* If this is set, git-daemon-export-ok is not required */
static int export_all_trees = 0;
-/* Take all paths relative to this one if non-NULL */
+/*
+ * Take all paths relative to this one if non-NULL.
+ *
+ */
static char *base_path = NULL;
+static char *interpolated_path = NULL;
+
+/* Flag indicating client sent extra args. */
+int saw_extended_args = 0;
/* If defined, ~user notation is allowed and the string is inserted
* after ~user/. E.g. a request to git://host/~alice/frotz would
@@ -41,6 +50,23 @@ static char *user_path = NULL;
static unsigned int timeout = 0;
static unsigned int init_timeout = 0;
+/*
+ * Static table for now. Ugh.
+ * Feel free to make dynamic as needed.
+ */
+#define INTERP_SLOT_HOST (0)
+#define INTERP_SLOT_DIR (1)
+#define INTERP_SLOT_PERCENT (2)
+
+struct interp interp_table[] = {
+ { "%H", 0},
+ { "%D", 0},
+ { "%%", "%"},
+};
+
+#define N_INTERPS (sizeof(interp_table) / sizeof(struct interp))
+
+
static void logreport(int priority, const char *err, va_list params)
{
/* We should do a single write so that it is atomic and output
@@ -142,10 +168,15 @@ static int avoid_alias(char *p)
}
}
-static char *path_ok(char *dir)
+static char *path_ok(struct interp *itable)
{
static char rpath[PATH_MAX];
+ static char interp_path[PATH_MAX];
char *path;
+ char *dir;
+
+ dir = itable[INTERP_SLOT_DIR].value;
+ loginfo("Request for '%s'", dir);
if (avoid_alias(dir)) {
logerror("'%s': aliased", dir);
@@ -174,16 +205,34 @@ static char *path_ok(char *dir)
dir = rpath;
}
}
+ else if (interpolated_path && saw_extended_args) {
+ if (*dir != '/') {
+ /* Allow only absolute */
+ logerror("'%s': Non-absolute path denied (interpolated-path active)", dir);
+ return NULL;
+ }
+
+ loginfo("Before interpolation '%s'", dir);
+ loginfo("Interp slot 0 (%s,%s)",
+ interp_table[0].name, interp_table[0].value);
+ loginfo("Interp slot 1 (%s,%s)",
+ interp_table[1].name, interp_table[1].value);
+ interpolate(interp_path, PATH_MAX, interpolated_path,
+ interp_table, N_INTERPS);
+ loginfo("After interpolation '%s'", interp_path);
+ dir = interp_path;
+ }
else if (base_path) {
if (*dir != '/') {
/* Allow only absolute */
logerror("'%s': Non-absolute path denied (base-path active)", dir);
return NULL;
}
- else {
- snprintf(rpath, PATH_MAX, "%s%s", base_path, dir);
- dir = rpath;
- }
+ snprintf(rpath, PATH_MAX, "%s%s", base_path, dir);
+ loginfo("dir was %s", dir);
+ loginfo("base_path is %s", base_path);
+ loginfo("rpath now %s", rpath);
+ dir = rpath;
}
path = enter_repo(dir, strict_paths);
@@ -223,15 +272,13 @@ static char *path_ok(char *dir)
return NULL; /* Fallthrough. Deny by default */
}
-static int upload(char *dir)
+static int upload(struct interp *itable)
{
/* Timeout as string */
char timeout_buf[64];
const char *path;
- loginfo("Request for '%s'", dir);
-
- if (!(path = path_ok(dir)))
+ if (!(path = path_ok(itable)))
return -1;
/*
@@ -264,10 +311,34 @@ static int upload(char *dir)
return -1;
}
+void parse_extra_args(char *extra_args, int buflen)
+{
+ char *val;
+ int vallen;
+ char *end = extra_args + buflen;
+
+ while (extra_args < end && *extra_args) {
+ saw_extended_args = 1;
+ loginfo("Extended arg %s", extra_args);
+ if (strncasecmp("host=", extra_args, 5) == 0) {
+ val = extra_args + 5;
+ vallen = strlen(val) + 1;
+ if (*val) {
+ char *save = xmalloc(vallen);
+ interp_table[INTERP_SLOT_HOST].value = save;
+ safe_strncpy(save, val, vallen);
+ }
+ /* On to the next one */
+ extra_args = val + vallen;
+ }
+ }
+}
+
static int execute(void)
{
static char line[1000];
- int len;
+ int len; /* full packet length, including extended args */
+ int n; /* original protocol part size */
alarm(init_timeout ? init_timeout : timeout);
len = packet_read_line(0, line, sizeof(line));
@@ -276,8 +347,18 @@ static int execute(void)
if (len && line[len-1] == '\n')
line[--len] = 0;
- if (!strncmp("git-upload-pack ", line, 16))
- return upload(line+16);
+ /*
+ * Check for extended args after a NUL byte.
+ */
+ n = strlen(line);
+ if (n != len) {
+ parse_extra_args(line + n + 1, len - n - 1);
+ }
+
+ if (!strncmp("git-upload-pack ", line, 16)) {
+ interp_table[INTERP_SLOT_DIR].value = line+16;
+ return upload(interp_table);
+ }
logerror("Protocol error: '%s'", line);
return -1;
@@ -711,6 +792,10 @@ int main(int argc, char **argv)
base_path = arg+12;
continue;
}
+ if (!strncmp(arg, "--interpolated-path=", 20)) {
+ interpolated_path = arg+20;
+ continue;
+ }
if (!strcmp(arg, "--reuseaddr")) {
reuseaddr = 1;
continue;
diff --git a/interpolate.c b/interpolate.c
new file mode 100644
index 0000000..d936022
--- /dev/null
+++ b/interpolate.c
@@ -0,0 +1,74 @@
+#include <string.h>
+
+#include "interpolate.h"
+
+
+int
+interpolate(char *result, int reslen, char *orig,
+ struct interp *interps, int ninterps)
+{
+ int i;
+ char *p;
+ char *src = orig;
+ char *dest = result;
+ int newlen = 0;
+
+ char *name;
+ char *value;
+ int valuelen;
+
+ do {
+
+ p = strchr(src, '%');
+
+ if (p) {
+ /*
+ * Found a potential interpolation point.
+ */
+ for (i = 0; i < ninterps; i++) {
+ name = interps[i].name;
+ if (strncmp(p, name, strlen(name)) == 0)
+ break;
+ }
+
+ value = interps[i].value;
+ valuelen = strlen(value);
+ printf("Interp: %s to %s\n", name, value);
+
+ int len = p - src;
+ if (newlen + len < reslen) {
+ strncpy(dest, src, len);
+ newlen += len;
+ dest += len;
+ *dest = 0;
+ src = p + strlen(name);
+ if (newlen + valuelen < reslen) {
+ strncpy(dest, value, valuelen);
+ newlen += valuelen;
+ dest += valuelen;
+ *dest = 0;
+ } else {
+ printf("new value %s didn't fit.\n", value);
+ return 0; /* something's not fitting. */
+ }
+ } else {
+ printf("orig part %s didn't fit.\n", src);
+ return 0; /* something's not fitting. */
+ }
+
+ } else {
+ /* Copy remainder */
+ int len = strlen(src);
+ if (newlen + len < reslen) {
+ strncpy(dest, src, len);
+ dest += len;
+ *dest = 0;
+ } else {
+ printf("Remainder %s didn't fit.\n", src);
+ return 0;
+ }
+ }
+ } while (p);
+
+ return 1; /* successful */
+}
diff --git a/interpolate.h b/interpolate.h
new file mode 100644
index 0000000..3b710ad
--- /dev/null
+++ b/interpolate.h
@@ -0,0 +1,10 @@
+
+struct interp {
+ char *name;
+ char *value;
+};
+
+
+extern int interpolate(char *result, int reslen, char *orig,
+ struct interp *interps, int ninterps);
+
--
1.3.3.g16a4-dirty
^ permalink raw reply related
* Re: irc usage..
From: Martin Langhoff @ 2006-06-05 3:49 UTC (permalink / raw)
To: antarus
Cc: Donnie Berkholz, Linus Torvalds, Yann Dirson, Git Mailing List,
Matthias Urlichs, Johannes Schindelin
In-Reply-To: <448398BC.5090402@gentoo.org>
On 6/5/06, Alec Warner <antarus@gentoo.org> wrote:
> > I don't think you can do this in parallel. What I would do is remove
> > the -a from the git-repack invocation. It does hurt import times quite
> > a bit -- just do a git-repack -a -d when it's done.
>
> Only repack at the end then? disk space isn't an issue here so I'll give
> that a shot.
Not exactly -- by removing the -a from the git-repack invocation what
you get is cheap "partial" packing rather than a full repack. This is
somewhat inefficient disk-wise, perhaps by 10% or so. But full repacks
get more and more expensive as the repo grows.
So you don't need to run git-repack -a -d at the end, but it will be a
good measure to see how compact the packing gets.
> > And... having said that, there is still a memory leak somehow,
> > somewhere. It's been evading me for 2 weeks now, so I feel an idiot
> > now. Not too bad in general, but it shows clearly in the gentoo and
> > mozilla imports.
>
> 30565 antarus 17 0 470m 456m 1640 S 14 11.6 234:23.38
> git-cvsimport
> 30566 antarus 16 0 6753m 147m 752 S 7 3.7 120:27.06 cvs
>
> I'm on cvs-1.11.12 and the git version of git
Yep, I see roughly the same. It grows slowly and I don't know why :(
> I'll keep chugging on this one; it won't be the final import as I
> haven't used the complete Authors file, so I will try the repacking
> optimization next time I do an import.
Cool. If it dies for any reason, just do
git-update-ref refs/heads/master refs/heads/origin
git-update-ref HEAD origin
git-checkout
You only need to do this the first time -- after that, the core heads
are set. Rerun the script and it will pick up where it left off. If it
dies again, just do git-checkout to see the latest files.
(Above, replace origin with your -o option if you are using it. I
normally use -o cvshead.)
martin
^ permalink raw reply
* Re: git daemon directory munging?
From: H. Peter Anvin @ 2006-06-05 2:59 UTC (permalink / raw)
To: Jon Loeliger; +Cc: git
In-Reply-To: <E1Fn4Xf-0000bL-82@jdl.com>
Jon Loeliger wrote:
>> Well, you can bind different git daemons to different IP addresses
>> (IP-based vhosting) or different ports (with SRV records in DNS.)
>
> Is there existing support for telling the git-daemon what
> specific IP to bind to out of an inetd setup and I just
> missed it?
>
No, but that really should be added. It's a pretty trivial hack.
> I could set that up relatively easily and gain the
> functionality I wanted that way too.
>
> I've also hacked in a host interpolation too.
>
> But like you said, canonicalizing it and checking it is likely
> a bit of a pain. I've side-stepped one angle of that by
> symlinking in my /pub directory for multiple different
> hostnames too. :-)
>
Doesn't work very well. DNS is case-insensitive, and worse, there are
the PunyCode aliases or whatever they're called.
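To illustrate the case-folding half of this, here is a minimal sketch of
normalizing a client-supplied hostname before it is used as a directory
name. canon_host() is a hypothetical helper, not something in git-daemon,
and the accepted character set is my assumption. It does nothing about
the PunyCode problem: an "xn--" label passes through as plain ASCII, and
mapping a Unicode spelling onto it is left out entirely.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Copy 'host' into 'out' in lower case, dropping one trailing dot.
 * Returns 1 on success, 0 if the name is empty, too long, has an
 * empty label, or contains a byte outside [A-Za-z0-9.-]. That last
 * rule also rejects '/' and other path metacharacters.
 */
int canon_host(char *out, size_t outlen, const char *host)
{
    size_t len, i;

    assert(out && host);
    len = strlen(host);
    if (len > 0 && host[len - 1] == '.')
        len--;                       /* "example.org." names the same host */
    if (len == 0 || host[len - 1] == '.' || len >= outlen)
        return 0;
    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)host[i];

        if (c == '.') {
            /* no empty labels: rejects ".foo" and "a..b" */
            if (i == 0 || host[i - 1] == '.')
                return 0;
        } else if (!isalnum(c) && c != '-') {
            return 0;
        }
        out[i] = (char)tolower(c);
    }
    out[len] = '\0';
    return 1;
}
```

With this, "Git.Example.ORG" and "git.example.org." both come out as
"git.example.org", so a single directory or symlink covers them, while
".." and "a/b" are refused outright.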
-hpa
^ permalink raw reply
* Re: irc usage..
From: Alec Warner @ 2006-06-05 2:36 UTC (permalink / raw)
To: Martin Langhoff
Cc: Donnie Berkholz, Linus Torvalds, Yann Dirson, Git Mailing List,
Matthias Urlichs, Johannes Schindelin
In-Reply-To: <46a038f90606041906k66d85152v6e402c65151d7ab8@mail.gmail.com>
Martin Langhoff wrote:
> On 6/5/06, Alec Warner <antarus@gentoo.org> wrote:
>
>> Ok the box this was running on had issues, so I switched to using
>> pearl.amd64.dev.gentoo.org, a dual core amd64 X2 4600+ with 4 gigs of
>> ram and plenty of disk. The "problem" now is just conversion time...30
>> hours and I'm into 2004-09-17...but it's been in 2004 all day, seems
>> like most of the commits are in the last three years. Are there
>> architectural issues with doing this in parallel?
>
>
> I don't think you can do this in parallel. What I would do is remove
> the -a from the git-repack invocation. It does hurt import times quite
> a bit -- just do a git-repack -a -d when it's done.
Only repack at the end then? disk space isn't an issue here so I'll give
that a shot.
>
> And... having said that, there is still a memory leak somehow,
> somewhere. It's been evading me for 2 weeks now, so I feel an idiot
> now. Not too bad in general, but it shows clearly in the gentoo and
> mozilla imports.
30565 antarus 17 0 470m 456m 1640 S 14 11.6 234:23.38
git-cvsimport
30566 antarus 16 0 6753m 147m 752 S 7 3.7 120:27.06 cvs
I'm on cvs-1.11.12 and the git version of git
> You are forced to do it in a sequence because cvsps only tells you
> about the files added/removed/changed in a commit -- you need the
> ancestor to have a view of what the whole tree looked like. The only
> room for parallelism I see is to fork off new processes to work on
> branches in parallel.
Not helpful in the Gentoo case, since we only have one branch; minus an
accident when a dev branched gentoo-x86 a while back ;)
I'll keep chugging on this one; it won't be the final import as I
haven't used the complete Authors file, so I will try the repacking
optimization next time I do an import.
-Alec Warner
^ permalink raw reply
* [PATCH] Fix Documentation/everyday.txt: Junio's workflow
From: Horst H. von Brand @ 2006-06-05 2:10 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
The workflow for Junio was badly formatted.
Signed-off-by: Horst H. von Brand <vonbrand@inf.utfsm.cl>
---
Documentation/everyday.txt | 21 +++++++++++++--------
1 files changed, 13 insertions(+), 8 deletions(-)
diff --git a/Documentation/everyday.txt b/Documentation/everyday.txt
index ffba543..6745ab5 100644
--- a/Documentation/everyday.txt
+++ b/Documentation/everyday.txt
@@ -336,15 +336,20 @@ master, nor exposed as a part of a stabl
<11> make sure I did not accidentally rewind master beyond what I
already pushed out. "ko" shorthand points at the repository I have
at kernel.org, and looks like this:
- $ cat .git/remotes/ko
- URL: kernel.org:/pub/scm/git/git.git
- Pull: master:refs/tags/ko-master
- Pull: maint:refs/tags/ko-maint
- Push: master
- Push: +pu
- Push: maint
++
+------------
+$ cat .git/remotes/ko
+URL: kernel.org:/pub/scm/git/git.git
+Pull: master:refs/tags/ko-master
+Pull: maint:refs/tags/ko-maint
+Push: master
+Push: +pu
+Push: maint
+------------
++
In the output from "git show-branch", "master" should have
everything "ko-master" has.
+
<12> push out the bleeding edge.
<13> push the tag out, too.
@@ -390,7 +395,7 @@ service git
port = 9418
socket_type = stream
wait = no
- user = root
+ user = nobody
server = /usr/bin/git-daemon
server_args = --inetd --syslog --export-all --base-path=/pub/scm
log_on_failure += USERID
--
1.3.3.g16a4
^ permalink raw reply related
* Re: git daemon directory munging?
From: Jon Loeliger @ 2006-06-05 2:10 UTC (permalink / raw)
To: git
> Well, you can bind different git daemons to different IP addresses
> (IP-based vhosting) or different ports (with SRV records in DNS.)
Is there existing support for telling the git-daemon what
specific IP to bind to out of an inetd setup and I just
missed it?
I could set that up relatively easily and gain the
functionality I wanted that way too.
I've also hacked in a host interpolation too.
But like you said, canonicalizing it and checking it is likely
a bit of a pain. I've side-stepped one angle of that by
symlinking in my /pub directory for multiple different
hostnames too. :-)
jdl
^ permalink raw reply
* [PATCH] Fix Documentation/everyday.txt: Junio's workflow
From: Horst H. von Brand @ 2006-06-05 2:08 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Dmitry V. Levin
The workflow for Junio was badly formatted.
Signed-off-by: Horst H. von Brand <vonbrand@inf.utfsm.cl>
---
Documentation/everyday.txt | 21 +++++++++++++--------
1 files changed, 13 insertions(+), 8 deletions(-)
diff --git a/Documentation/everyday.txt b/Documentation/everyday.txt
index ffba543..6745ab5 100644
--- a/Documentation/everyday.txt
+++ b/Documentation/everyday.txt
@@ -336,15 +336,20 @@ master, nor exposed as a part of a stabl
<11> make sure I did not accidentally rewind master beyond what I
already pushed out. "ko" shorthand points at the repository I have
at kernel.org, and looks like this:
- $ cat .git/remotes/ko
- URL: kernel.org:/pub/scm/git/git.git
- Pull: master:refs/tags/ko-master
- Pull: maint:refs/tags/ko-maint
- Push: master
- Push: +pu
- Push: maint
++
+------------
+$ cat .git/remotes/ko
+URL: kernel.org:/pub/scm/git/git.git
+Pull: master:refs/tags/ko-master
+Pull: maint:refs/tags/ko-maint
+Push: master
+Push: +pu
+Push: maint
+------------
++
In the output from "git show-branch", "master" should have
everything "ko-master" has.
+
<12> push out the bleeding edge.
<13> push the tag out, too.
@@ -390,7 +395,7 @@ service git
port = 9418
socket_type = stream
wait = no
- user = root
+ user = nobody
server = /usr/bin/git-daemon
server_args = --inetd --syslog --export-all --base-path=/pub/scm
log_on_failure += USERID
--
1.3.3.g16a4
^ permalink raw reply related
* Re: irc usage..
From: Martin Langhoff @ 2006-06-05 2:06 UTC (permalink / raw)
To: antarus
Cc: Donnie Berkholz, Linus Torvalds, Yann Dirson, Git Mailing List,
Matthias Urlichs, Johannes Schindelin
In-Reply-To: <44837BDB.2090601@gentoo.org>
On 6/5/06, Alec Warner <antarus@gentoo.org> wrote:
> Ok the box this was running on had issues, so I switched to using
> pearl.amd64.dev.gentoo.org, a dual core amd64 X2 4600+ with 4 gigs of
> ram and plenty of disk. The "problem" now is just conversion time...30
> hours and I'm into 2004-09-17...but it's been in 2004 all day, seems
> like most of the commits are in the last three years. Are there
> architectural issues with doing this in parallel?
I don't think you can do this in parallel. What I would do is remove
the -a from the git-repack invocation. It does hurt import times quite
a bit -- just do a git-repack -a -d when it's done.
And... having said that, there is still a memory leak somehow,
somewhere. It's been evading me for 2 weeks now, so I feel an idiot
now. Not too bad in general, but it shows clearly in the gentoo and
mozilla imports.
> Since the repository commits are all in cvs, it should be possible to do
> the work in parallel, since you know what all the commits touch. The
> concern would be ordering of nodes in the tree; you'd end up building a
> bunch of subtrees and patching them together?
Well... parsecvs does a bit of this but in sequential fashion... it
imports all the files first, and then runs through the history
building the tree+commits in order, committing them. It saves a lot of
time in the file imports by parsing the RCS file directly. The
downside is that it must keep a filename+version=>sha1 mapping --
which I think is why parsecvs won't fit in memory until it's changed
to store it on disk somehow ;-)
You are forced to do it in a sequence because cvsps only tells you
about the files added/removed/changed in a commit -- you need the
ancestor to have a view of what the whole tree looked like. The only
room for parallelism I see is to fork off new processes to work on
branches in parallel.
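A toy model of why the order matters, in case it helps: each changeset
only lists what changed, so the full tree for a commit exists only after
every earlier changeset on that branch has been applied. The structs and
the names apply_changeset() and lookup_rev() below are invented for this
sketch; they are not the data structures of cvsps or git-cvsimport.

```c
#include <assert.h>
#include <string.h>

#define MAX_FILES 16

struct file_entry {
    const char *path;
    const char *rev;     /* CVS revision, e.g. "1.2" */
};

struct tree_state {
    struct file_entry files[MAX_FILES];
    int nfiles;
};

/* One line of a changeset; rev == NULL means the file was removed. */
struct change {
    const char *path;
    const char *rev;
};

static int find_file(const struct tree_state *t, const char *path)
{
    int i;

    for (i = 0; i < t->nfiles; i++)
        if (strcmp(t->files[i].path, path) == 0)
            return i;
    return -1;
}

/* Apply one changeset on top of the parent's tree; -1 if the table is full. */
int apply_changeset(struct tree_state *t, const struct change *ch, int n)
{
    int i;

    assert(t && ch);
    for (i = 0; i < n; i++) {
        int idx = find_file(t, ch[i].path);

        if (!ch[i].rev) {
            if (idx >= 0)
                t->files[idx] = t->files[--t->nfiles];   /* removed */
        } else if (idx >= 0) {
            t->files[idx].rev = ch[i].rev;               /* modified */
        } else {
            if (t->nfiles == MAX_FILES)
                return -1;
            t->files[t->nfiles].path = ch[i].path;       /* added */
            t->files[t->nfiles].rev = ch[i].rev;
            t->nfiles++;
        }
    }
    return 0;
}

/* Revision of 'path' in the current tree, or NULL if it is not there. */
const char *lookup_rev(const struct tree_state *t, const char *path)
{
    int idx = find_file(t, path);

    return idx >= 0 ? t->files[idx].rev : NULL;
}
```

A changeset that says only "main.c is now 1.2" cannot be turned into a
tree without knowing that Makefile 1.1 is still sitting next to it, and
that knowledge comes from the changesets applied before it.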
martin
^ permalink raw reply