* Re: git-log produces no output
From: Linus Torvalds @ 2006-04-21 18:11 UTC (permalink / raw)
To: Bob Portmann; +Cc: Git Mailing List
In-Reply-To: <20060421172001.44441.qmail@web60325.mail.yahoo.com>
On Fri, 21 Apr 2006, Bob Portmann wrote:
>
> I cannot get any output out of it and am wondering if I am using it
> correctly or it is broken.
You're using it correctly, but it isn't broken for me.
> As I understand it, git-log should just print out the log messages but
> not the changes, whereas git-whatchanged will print out both.
Well, in 1.3.0, "git log" can actually do both, and you can get the
whatchanged output by just saying "git log -p".
But yes, without the "-p", you should get just the log.
And that's exactly what I get, both with current HEAD git, and with a
v1.3.0 checkout.
> test-log> git log
> test-log>
>
> As you can see git log produces no output. I've tried it with other
> options with the same result.
Very strange indeed. Can you do
git log > file
to see if that changes (and see if the file contains anything)? The reason
I mention that is that by default "git log" will start a pager for you,
and if you somehow have a broken PAGER setup, I could imagine exactly the
behaviour you see (although I don't see why "git whatchanged" would work
either, in that case).
Finally, if that doesn't output anything either, please do (for just that
small repository, so that the trace is also small)
strace -o git-trace git log > /dev/null
and send out the result. Again, for PAGER reasons, that "> /dev/null" is
actually important, because we don't want to trigger the pager code.
Linus
^ permalink raw reply
* Split up builtin commands into separate files from git.c
From: Linus Torvalds @ 2006-04-21 17:27 UTC (permalink / raw)
To: Junio C Hamano, Git Mailing List
Right now it split it into "builtin-log.c" for log-related commands
("log", "show" and "whatchanged"), and "builtin-help.c" for the
informational commands (usage printing and "help" and "version").
This just makes things easier to read, I find.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
[ Hey, make of this what you will. There's no real code changes, except I
used that "git_version_string[]" variable to make things easier to
build, and I renamed "cmd_wc" to "cmd_whatchanged", since when it's
split up into another file, we don't want to have a gratuitous
short-hand that we have to remember across files.
I find things easier to work with the more you split them up along
conceptual lines, and as we do more and more built-ins, git.c would end
up a horrible mess unless we do _something_ like this.
But if people don't like it, it's not a big deal. Another throw-away
patch from me. ]
Makefile | 9 +-
builtin-help.c | 241 ++++++++++++++++++++++++++++++++++++++++++++
builtin-log.c | 69 +++++++++++++
builtin.h | 23 ++++
git.c | 305 +-------------------------------------------------------
5 files changed, 342 insertions(+), 305 deletions(-)
diff --git a/Makefile b/Makefile
index 3ecd674..a83c502 100644
--- a/Makefile
+++ b/Makefile
@@ -213,6 +213,9 @@ LIB_OBJS = \
fetch-clone.o revision.o pager.o tree-walk.o xdiff-interface.o \
$(DIFF_OBJS)
+BUILTIN_OBJS = \
+ builtin-log.o builtin-help.o
+
GITLIBS = $(LIB_FILE) $(XDIFF_LIB)
LIBS = $(GITLIBS) -lz
@@ -462,10 +465,10 @@ all:
strip: $(PROGRAMS) git$X
$(STRIP) $(STRIP_OPTS) $(PROGRAMS) git$X
-git$X: git.c common-cmds.h $(GITLIBS)
+git$X: git.c common-cmds.h $(BUILTIN_OBJS) $(GITLIBS)
$(CC) -DGIT_VERSION='"$(GIT_VERSION)"' \
$(ALL_CFLAGS) -o $@ $(filter %.c,$^) \
- $(ALL_LDFLAGS) $(LIBS)
+ $(BUILTIN_OBJS) $(ALL_LDFLAGS) $(LIBS)
$(BUILT_INS): git$X
rm -f $@ && ln git$X $@
@@ -565,7 +568,7 @@ init-db.o: init-db.c
$(CC) -c $(ALL_CFLAGS) \
-DDEFAULT_GIT_TEMPLATE_DIR='"$(template_dir_SQ)"' $*.c
-$(LIB_OBJS): $(LIB_H)
+$(LIB_OBJS) $(BUILTIN_OBJS): $(LIB_H)
$(patsubst git-%$X,%.o,$(PROGRAMS)): $(GITLIBS)
$(DIFF_OBJS): diffcore.h
diff --git a/builtin-help.c b/builtin-help.c
new file mode 100644
index 0000000..10a59cc
--- /dev/null
+++ b/builtin-help.c
@@ -0,0 +1,241 @@
+/*
+ * builtin-help.c
+ *
+ * Builtin help-related commands (help, usage, version)
+ */
+#include "cache.h"
+#include "builtin.h"
+#include "exec_cmd.h"
+#include "common-cmds.h"
+
+static const char git_usage[] =
+ "Usage: git [--version] [--exec-path[=GIT_EXEC_PATH]] [--help] COMMAND [ ARGS ]";
+
+/* most gui terms set COLUMNS (although some don't export it) */
+static int term_columns(void)
+{
+ char *col_string = getenv("COLUMNS");
+ int n_cols = 0;
+
+ if (col_string && (n_cols = atoi(col_string)) > 0)
+ return n_cols;
+
+#ifdef TIOCGWINSZ
+ {
+ struct winsize ws;
+ if (!ioctl(1, TIOCGWINSZ, &ws)) {
+ if (ws.ws_col)
+ return ws.ws_col;
+ }
+ }
+#endif
+
+ return 80;
+}
+
+static void oom(void)
+{
+ fprintf(stderr, "git: out of memory\n");
+ exit(1);
+}
+
+static inline void mput_char(char c, unsigned int num)
+{
+ while(num--)
+ putchar(c);
+}
+
+static struct cmdname {
+ size_t len;
+ char name[1];
+} **cmdname;
+static int cmdname_alloc, cmdname_cnt;
+
+static void add_cmdname(const char *name, int len)
+{
+ struct cmdname *ent;
+ if (cmdname_alloc <= cmdname_cnt) {
+ cmdname_alloc = cmdname_alloc + 200;
+ cmdname = realloc(cmdname, cmdname_alloc * sizeof(*cmdname));
+ if (!cmdname)
+ oom();
+ }
+ ent = malloc(sizeof(*ent) + len);
+ if (!ent)
+ oom();
+ ent->len = len;
+ memcpy(ent->name, name, len);
+ ent->name[len] = 0;
+ cmdname[cmdname_cnt++] = ent;
+}
+
+static int cmdname_compare(const void *a_, const void *b_)
+{
+ struct cmdname *a = *(struct cmdname **)a_;
+ struct cmdname *b = *(struct cmdname **)b_;
+ return strcmp(a->name, b->name);
+}
+
+static void pretty_print_string_list(struct cmdname **cmdname, int longest)
+{
+ int cols = 1, rows;
+ int space = longest + 1; /* min 1 SP between words */
+ int max_cols = term_columns() - 1; /* don't print *on* the edge */
+ int i, j;
+
+ if (space < max_cols)
+ cols = max_cols / space;
+ rows = (cmdname_cnt + cols - 1) / cols;
+
+ qsort(cmdname, cmdname_cnt, sizeof(*cmdname), cmdname_compare);
+
+ for (i = 0; i < rows; i++) {
+ printf(" ");
+
+ for (j = 0; j < cols; j++) {
+ int n = j * rows + i;
+ int size = space;
+ if (n >= cmdname_cnt)
+ break;
+ if (j == cols-1 || n + rows >= cmdname_cnt)
+ size = 1;
+ printf("%-*s", size, cmdname[n]->name);
+ }
+ putchar('\n');
+ }
+}
+
+static void list_commands(const char *exec_path, const char *pattern)
+{
+ unsigned int longest = 0;
+ char path[PATH_MAX];
+ int dirlen;
+ DIR *dir = opendir(exec_path);
+ struct dirent *de;
+
+ if (!dir) {
+ fprintf(stderr, "git: '%s': %s\n", exec_path, strerror(errno));
+ exit(1);
+ }
+
+ dirlen = strlen(exec_path);
+ if (PATH_MAX - 20 < dirlen) {
+ fprintf(stderr, "git: insanely long exec-path '%s'\n",
+ exec_path);
+ exit(1);
+ }
+
+ memcpy(path, exec_path, dirlen);
+ path[dirlen++] = '/';
+
+ while ((de = readdir(dir)) != NULL) {
+ struct stat st;
+ int entlen;
+
+ if (strncmp(de->d_name, "git-", 4))
+ continue;
+ strcpy(path+dirlen, de->d_name);
+ if (stat(path, &st) || /* stat, not lstat */
+ !S_ISREG(st.st_mode) ||
+ !(st.st_mode & S_IXUSR))
+ continue;
+
+ entlen = strlen(de->d_name);
+ if (4 < entlen && !strcmp(de->d_name + entlen - 4, ".exe"))
+ entlen -= 4;
+
+ if (longest < entlen)
+ longest = entlen;
+
+ add_cmdname(de->d_name + 4, entlen-4);
+ }
+ closedir(dir);
+
+ printf("git commands available in '%s'\n", exec_path);
+ printf("----------------------------");
+ mput_char('-', strlen(exec_path));
+ putchar('\n');
+ pretty_print_string_list(cmdname, longest - 4);
+ putchar('\n');
+}
+
+static void list_common_cmds_help(void)
+{
+ int i, longest = 0;
+
+ for (i = 0; i < ARRAY_SIZE(common_cmds); i++) {
+ if (longest < strlen(common_cmds[i].name))
+ longest = strlen(common_cmds[i].name);
+ }
+
+ puts("The most commonly used git commands are:");
+ for (i = 0; i < ARRAY_SIZE(common_cmds); i++) {
+ printf(" %s", common_cmds[i].name);
+ mput_char(' ', longest - strlen(common_cmds[i].name) + 4);
+ puts(common_cmds[i].help);
+ }
+ puts("(use 'git help -a' to get a list of all installed git commands)");
+}
+
+void cmd_usage(int show_all, const char *exec_path, const char *fmt, ...)
+{
+ if (fmt) {
+ va_list ap;
+
+ va_start(ap, fmt);
+ printf("git: ");
+ vprintf(fmt, ap);
+ va_end(ap);
+ putchar('\n');
+ }
+ else
+ puts(git_usage);
+
+ if (exec_path) {
+ putchar('\n');
+ if (show_all)
+ list_commands(exec_path, "git-*");
+ else
+ list_common_cmds_help();
+ }
+
+ exit(1);
+}
+
+static void show_man_page(const char *git_cmd)
+{
+ const char *page;
+
+ if (!strncmp(git_cmd, "git", 3))
+ page = git_cmd;
+ else {
+ int page_len = strlen(git_cmd) + 4;
+ char *p = malloc(page_len + 1);
+ strcpy(p, "git-");
+ strcpy(p + 4, git_cmd);
+ p[page_len] = 0;
+ page = p;
+ }
+
+ execlp("man", "man", page, NULL);
+}
+
+int cmd_version(int argc, const char **argv, char **envp)
+{
+ printf("git version %s\n", git_version_string);
+ return 0;
+}
+
+int cmd_help(int argc, const char **argv, char **envp)
+{
+ const char *help_cmd = argv[1];
+ if (!help_cmd)
+ cmd_usage(0, git_exec_path(), NULL);
+ else if (!strcmp(help_cmd, "--all") || !strcmp(help_cmd, "-a"))
+ cmd_usage(1, git_exec_path(), NULL);
+ else
+ show_man_page(help_cmd);
+ return 0;
+}
+
+
diff --git a/builtin-log.c b/builtin-log.c
new file mode 100644
index 0000000..418101d
--- /dev/null
+++ b/builtin-log.c
@@ -0,0 +1,69 @@
+/*
+ * Builtin "git log" and related commands (show, whatchanged)
+ *
+ * (C) Copyright 2006 Linus Torvalds
+ * 2006 Junio Hamano
+ */
+#include "cache.h"
+#include "commit.h"
+#include "diff.h"
+#include "revision.h"
+#include "log-tree.h"
+
+static int cmd_log_wc(int argc, const char **argv, char **envp,
+ struct rev_info *rev)
+{
+ struct commit *commit;
+
+ rev->abbrev = DEFAULT_ABBREV;
+ rev->commit_format = CMIT_FMT_DEFAULT;
+ rev->verbose_header = 1;
+ argc = setup_revisions(argc, argv, rev, "HEAD");
+
+ if (argc > 1)
+ die("unrecognized argument: %s", argv[1]);
+
+ prepare_revision_walk(rev);
+ setup_pager();
+ while ((commit = get_revision(rev)) != NULL) {
+ log_tree_commit(rev, commit);
+ free(commit->buffer);
+ commit->buffer = NULL;
+ }
+ return 0;
+}
+
+int cmd_whatchanged(int argc, const char **argv, char **envp)
+{
+ struct rev_info rev;
+
+ init_revisions(&rev);
+ rev.diff = 1;
+ rev.diffopt.recursive = 1;
+ return cmd_log_wc(argc, argv, envp, &rev);
+}
+
+int cmd_show(int argc, const char **argv, char **envp)
+{
+ struct rev_info rev;
+
+ init_revisions(&rev);
+ rev.diff = 1;
+ rev.diffopt.recursive = 1;
+ rev.combine_merges = 1;
+ rev.dense_combined_merges = 1;
+ rev.always_show_header = 1;
+ rev.ignore_merges = 0;
+ rev.no_walk = 1;
+ return cmd_log_wc(argc, argv, envp, &rev);
+}
+
+int cmd_log(int argc, const char **argv, char **envp)
+{
+ struct rev_info rev;
+
+ init_revisions(&rev);
+ rev.always_show_header = 1;
+ rev.diffopt.recursive = 1;
+ return cmd_log_wc(argc, argv, envp, &rev);
+}
diff --git a/builtin.h b/builtin.h
new file mode 100644
index 0000000..47408a0
--- /dev/null
+++ b/builtin.h
@@ -0,0 +1,23 @@
+#ifndef BUILTIN_H
+#define BUILTIN_H
+
+#ifndef PATH_MAX
+# define PATH_MAX 4096
+#endif
+
+extern const char git_version_string[];
+
+void cmd_usage(int show_all, const char *exec_path, const char *fmt, ...)
+#ifdef __GNUC__
+ __attribute__((__format__(__printf__, 3, 4), __noreturn__))
+#endif
+ ;
+
+extern int cmd_help(int argc, const char **argv, char **envp);
+extern int cmd_version(int argc, const char **argv, char **envp);
+
+extern int cmd_whatchanged(int argc, const char **argv, char **envp);
+extern int cmd_show(int argc, const char **argv, char **envp);
+extern int cmd_log(int argc, const char **argv, char **envp);
+
+#endif
diff --git a/git.c b/git.c
index 40b7e42..aa2b814 100644
--- a/git.c
+++ b/git.c
@@ -11,215 +11,8 @@ #include <stdarg.h>
#include <sys/ioctl.h>
#include "git-compat-util.h"
#include "exec_cmd.h"
-#include "common-cmds.h"
-#include "cache.h"
-#include "commit.h"
-#include "diff.h"
-#include "revision.h"
-#include "log-tree.h"
-
-#ifndef PATH_MAX
-# define PATH_MAX 4096
-#endif
-
-static const char git_usage[] =
- "Usage: git [--version] [--exec-path[=GIT_EXEC_PATH]] [--help] COMMAND [ ARGS ]";
-
-/* most gui terms set COLUMNS (although some don't export it) */
-static int term_columns(void)
-{
- char *col_string = getenv("COLUMNS");
- int n_cols = 0;
-
- if (col_string && (n_cols = atoi(col_string)) > 0)
- return n_cols;
-
-#ifdef TIOCGWINSZ
- {
- struct winsize ws;
- if (!ioctl(1, TIOCGWINSZ, &ws)) {
- if (ws.ws_col)
- return ws.ws_col;
- }
- }
-#endif
-
- return 80;
-}
-
-static void oom(void)
-{
- fprintf(stderr, "git: out of memory\n");
- exit(1);
-}
-
-static inline void mput_char(char c, unsigned int num)
-{
- while(num--)
- putchar(c);
-}
-
-static struct cmdname {
- size_t len;
- char name[1];
-} **cmdname;
-static int cmdname_alloc, cmdname_cnt;
-
-static void add_cmdname(const char *name, int len)
-{
- struct cmdname *ent;
- if (cmdname_alloc <= cmdname_cnt) {
- cmdname_alloc = cmdname_alloc + 200;
- cmdname = realloc(cmdname, cmdname_alloc * sizeof(*cmdname));
- if (!cmdname)
- oom();
- }
- ent = malloc(sizeof(*ent) + len);
- if (!ent)
- oom();
- ent->len = len;
- memcpy(ent->name, name, len);
- ent->name[len] = 0;
- cmdname[cmdname_cnt++] = ent;
-}
-
-static int cmdname_compare(const void *a_, const void *b_)
-{
- struct cmdname *a = *(struct cmdname **)a_;
- struct cmdname *b = *(struct cmdname **)b_;
- return strcmp(a->name, b->name);
-}
-
-static void pretty_print_string_list(struct cmdname **cmdname, int longest)
-{
- int cols = 1, rows;
- int space = longest + 1; /* min 1 SP between words */
- int max_cols = term_columns() - 1; /* don't print *on* the edge */
- int i, j;
-
- if (space < max_cols)
- cols = max_cols / space;
- rows = (cmdname_cnt + cols - 1) / cols;
-
- qsort(cmdname, cmdname_cnt, sizeof(*cmdname), cmdname_compare);
-
- for (i = 0; i < rows; i++) {
- printf(" ");
-
- for (j = 0; j < cols; j++) {
- int n = j * rows + i;
- int size = space;
- if (n >= cmdname_cnt)
- break;
- if (j == cols-1 || n + rows >= cmdname_cnt)
- size = 1;
- printf("%-*s", size, cmdname[n]->name);
- }
- putchar('\n');
- }
-}
-
-static void list_commands(const char *exec_path, const char *pattern)
-{
- unsigned int longest = 0;
- char path[PATH_MAX];
- int dirlen;
- DIR *dir = opendir(exec_path);
- struct dirent *de;
-
- if (!dir) {
- fprintf(stderr, "git: '%s': %s\n", exec_path, strerror(errno));
- exit(1);
- }
-
- dirlen = strlen(exec_path);
- if (PATH_MAX - 20 < dirlen) {
- fprintf(stderr, "git: insanely long exec-path '%s'\n",
- exec_path);
- exit(1);
- }
-
- memcpy(path, exec_path, dirlen);
- path[dirlen++] = '/';
-
- while ((de = readdir(dir)) != NULL) {
- struct stat st;
- int entlen;
-
- if (strncmp(de->d_name, "git-", 4))
- continue;
- strcpy(path+dirlen, de->d_name);
- if (stat(path, &st) || /* stat, not lstat */
- !S_ISREG(st.st_mode) ||
- !(st.st_mode & S_IXUSR))
- continue;
-
- entlen = strlen(de->d_name);
- if (4 < entlen && !strcmp(de->d_name + entlen - 4, ".exe"))
- entlen -= 4;
-
- if (longest < entlen)
- longest = entlen;
-
- add_cmdname(de->d_name + 4, entlen-4);
- }
- closedir(dir);
-
- printf("git commands available in '%s'\n", exec_path);
- printf("----------------------------");
- mput_char('-', strlen(exec_path));
- putchar('\n');
- pretty_print_string_list(cmdname, longest - 4);
- putchar('\n');
-}
-
-static void list_common_cmds_help(void)
-{
- int i, longest = 0;
-
- for (i = 0; i < ARRAY_SIZE(common_cmds); i++) {
- if (longest < strlen(common_cmds[i].name))
- longest = strlen(common_cmds[i].name);
- }
-
- puts("The most commonly used git commands are:");
- for (i = 0; i < ARRAY_SIZE(common_cmds); i++) {
- printf(" %s", common_cmds[i].name);
- mput_char(' ', longest - strlen(common_cmds[i].name) + 4);
- puts(common_cmds[i].help);
- }
- puts("(use 'git help -a' to get a list of all installed git commands)");
-}
-
-#ifdef __GNUC__
-static void cmd_usage(int show_all, const char *exec_path, const char *fmt, ...)
- __attribute__((__format__(__printf__, 3, 4), __noreturn__));
-#endif
-static void cmd_usage(int show_all, const char *exec_path, const char *fmt, ...)
-{
- if (fmt) {
- va_list ap;
-
- va_start(ap, fmt);
- printf("git: ");
- vprintf(fmt, ap);
- va_end(ap);
- putchar('\n');
- }
- else
- puts(git_usage);
-
- if (exec_path) {
- putchar('\n');
- if (show_all)
- list_commands(exec_path, "git-*");
- else
- list_common_cmds_help();
- }
-
- exit(1);
-}
+#include "builtin.h"
static void prepend_to_path(const char *dir, int len)
{
@@ -240,99 +33,7 @@ static void prepend_to_path(const char *
setenv("PATH", path, 1);
}
-static void show_man_page(const char *git_cmd)
-{
- const char *page;
-
- if (!strncmp(git_cmd, "git", 3))
- page = git_cmd;
- else {
- int page_len = strlen(git_cmd) + 4;
- char *p = malloc(page_len + 1);
- strcpy(p, "git-");
- strcpy(p + 4, git_cmd);
- p[page_len] = 0;
- page = p;
- }
-
- execlp("man", "man", page, NULL);
-}
-
-static int cmd_version(int argc, const char **argv, char **envp)
-{
- printf("git version %s\n", GIT_VERSION);
- return 0;
-}
-
-static int cmd_help(int argc, const char **argv, char **envp)
-{
- const char *help_cmd = argv[1];
- if (!help_cmd)
- cmd_usage(0, git_exec_path(), NULL);
- else if (!strcmp(help_cmd, "--all") || !strcmp(help_cmd, "-a"))
- cmd_usage(1, git_exec_path(), NULL);
- else
- show_man_page(help_cmd);
- return 0;
-}
-
-static int cmd_log_wc(int argc, const char **argv, char **envp,
- struct rev_info *rev)
-{
- struct commit *commit;
-
- rev->abbrev = DEFAULT_ABBREV;
- rev->commit_format = CMIT_FMT_DEFAULT;
- rev->verbose_header = 1;
- argc = setup_revisions(argc, argv, rev, "HEAD");
-
- if (argc > 1)
- die("unrecognized argument: %s", argv[1]);
-
- prepare_revision_walk(rev);
- setup_pager();
- while ((commit = get_revision(rev)) != NULL) {
- log_tree_commit(rev, commit);
- free(commit->buffer);
- commit->buffer = NULL;
- }
- return 0;
-}
-
-static int cmd_wc(int argc, const char **argv, char **envp)
-{
- struct rev_info rev;
-
- init_revisions(&rev);
- rev.diff = 1;
- rev.diffopt.recursive = 1;
- return cmd_log_wc(argc, argv, envp, &rev);
-}
-
-static int cmd_show(int argc, const char **argv, char **envp)
-{
- struct rev_info rev;
-
- init_revisions(&rev);
- rev.diff = 1;
- rev.diffopt.recursive = 1;
- rev.combine_merges = 1;
- rev.dense_combined_merges = 1;
- rev.always_show_header = 1;
- rev.ignore_merges = 0;
- rev.no_walk = 1;
- return cmd_log_wc(argc, argv, envp, &rev);
-}
-
-static int cmd_log(int argc, const char **argv, char **envp)
-{
- struct rev_info rev;
-
- init_revisions(&rev);
- rev.always_show_header = 1;
- rev.diffopt.recursive = 1;
- return cmd_log_wc(argc, argv, envp, &rev);
-}
+const char git_version_string[] = GIT_VERSION;
static void handle_internal_command(int argc, const char **argv, char **envp)
{
@@ -344,7 +45,7 @@ static void handle_internal_command(int
{ "version", cmd_version },
{ "help", cmd_help },
{ "log", cmd_log },
- { "whatchanged", cmd_wc },
+ { "whatchanged", cmd_whatchanged },
{ "show", cmd_show },
};
int i;
^ permalink raw reply related
* git-log produces no output
From: Bob Portmann @ 2006-04-21 17:20 UTC (permalink / raw)
To: Git Mailing List
I am just starting out with git and have a noob question about got-log.
I cannot get any output out of it and am wondering if I am using it
correctly or it is broken. As I understand it, git-log should just
print out the log messages but not the changes, whereas git-whatchanged
will print out both. But while git-whatchanged works, git-log never
does. I have a trivial example below which shows what I mean. But I
get the same result using my real archives and out of git.git as well.
Thanks,
Bob
PS I'm using git 1.3.0 and have tried this on both Mac OS X and Linux
with the same results.
Trivial example:
git-test> mkdir test-log
git-test> cd test-log
test-log> git-init-db
defaulting to local storage area
test-log> echo "Hello World" >hello
test-log> git add .
test-log> git commit -a -m 'One line hello'
Committing initial tree 117c62a8c5e01758bd284126a6af69deab9dbbe2
test-log> echo "Hello World 2" >>hello
test-log> git commit -a -m 'Two line hello'
test-log> git whatchanged -p
diff-tree 9a4d7602fff052b6796c2862edddd11ae2e45d08 (from
a38306518c5e5e8eb630c02Author: Bob Portmann <portmann@xxxx.xx.xx>
Date: Fri Apr 21 10:56:11 2006 -0600
Two line hello
diff --git a/hello b/hello
index 557db03..514e5c5 100644
--- a/hello
+++ b/hello
@@ -1 +1,2 @@
Hello World
+Hello World 2
test-log> git log
test-log>
As you can see git log produces no output. I've tried it with other
options with the same result.
__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com
^ permalink raw reply related
* Re: [RESEND] [PATCH] fix gitk with lots of tags
From: Linus Torvalds @ 2006-04-21 15:19 UTC (permalink / raw)
To: Paul Mackerras; +Cc: Jim Radford, Junio C Hamano, Git Mailing List
In-Reply-To: <17480.50829.466038.316769@cargo.ozlabs.ibm.com>
On Fri, 21 Apr 2006, Paul Mackerras wrote:
>
> Junio, did you tell me some time ago about a flag to git-rev-parse
> that spits out just the file/directory names? What was it again?
git-rev-parse --no-flags --no-revs "$@"
should fo what you want.
Linus
^ permalink raw reply
* Re: [RESEND] [PATCH] fix gitk with lots of tags
From: Paul Mackerras @ 2006-04-21 11:48 UTC (permalink / raw)
To: Jim Radford; +Cc: Junio C Hamano, Git Mailing List
In-Reply-To: <20060418180614.GA31543@blackbean.org>
Jim Radford writes:
> I've gotten no reposnse from Paul on this patch[1]. If it seems ok to
> you, would you mind putting it in your queue for him? I hate to see
> gitk die with "argument list too long" messages. They're so 640k.
The reservation I have about this is that I need to be able to tell
the file/directory names from the tags/heads/SHA1 IDs. After the pass
through git-rev-parse it's easy; I just take the things that match
^[a-f0-9]{40}$ as IDs and the rest as file/directory names or
switches.
Junio, did you tell me some time ago about a flag to git-rev-parse
that spits out just the file/directory names? What was it again?
> [1] Maybe he judges people by the color of their IP address?
As in _black_bean.org? :)
> Then again, he could just be busy.
Yeah. Or just returned from international travel, or something like
that. :)
Paul.
^ permalink raw reply
* Re: n-heads and patch dependency chains
From: Andreas Ericsson @ 2006-04-21 8:50 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <7v4q0oyt3w.fsf@assigned-by-dhcp.cox.net>
Junio C Hamano wrote:
> Jon Loeliger <jdl@freescale.com> writes:
>
>
>>On Tue, 2006-04-04 at 06:47, Andreas Ericsson wrote:
>>
>>
>>>No, I mean that this would commit both to the testing branch (being the
>>>result of several merged topic-branches) and to the topic-branch merged
>>>in. Commit as in regular commit, with a commit-message and a patch. The
>>>resulting repository would be the exact same as if the change was
>>>committed only to the topic-branch and then cherry-picked on to the
>>>testing-branch.
>
>
> To be consistent, I think the result should be "as if the change
> was commited only to the topic-branch and then the topic-branch
> was *merged* into the testing-branch", since you start your
> testing branch as "being the result of several merged topic-branches".
>
> I do that (manually) all the time, with:
>
> $ git checkout next
> $ hack hack hack
>
> $ git checkout -m one/topic
> $ git commit -o this-path that-path
> $ git checkout next
> $ git pull . one/topic
>
> Giving a short-hand for the last four-command sequence would
> certainly be nice.
>
Ah. That's easier than what I originally looked at doing.
>
>>I am your number one fan! If I finish reading these 600+
>>messages, will I find out you have already implemented it,
>>it's committed, and you just need me to test it now? :-)
>
>
> Likewise... ;-)
>
Sorry to disappoint you so far. I'll see if I can turn up my
shell-skills a notch or two and get the hang of the commit-script enough
to implement it.
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
^ permalink raw reply
* Re: 1.3.0 creating bigger packs than 1.2.3
From: Nicolas Pitre @ 2006-04-21 3:07 UTC (permalink / raw)
To: Shawn Pearce; +Cc: Junio C Hamano, git
In-Reply-To: <20060421024012.GA1213@spearce.org>
On Thu, 20 Apr 2006, Shawn Pearce wrote:
> Nicolas Pitre <nico@cam.org> wrote:
> > With the patch above the Linux kernel pack is 0.3% smaller with 1% more
> > CPU usage. But like for the diff-delta hash list limiting code this
> > small overhead is certainly a good compromize to avoid big degradations
> > in some other cases.
>
> Hmm. See the email I just sent. I was seeing a good 10% increase
> in my own tests on a Linux kernel repository. But I guess I can
> hope that my test was flawed somehow and it really is closer to a 1%
> increase in running time, making it more likely that the above fix
> makes it into GIT.
Well, I repeated the kernel run and this time it took 2.5% more CPU with
the patch.
But the thing is that I get a +/- 1% difference between successive runs.
So while the patch does add a certain overhead, it appears to be in the
same range as noise here.
Nicolas
^ permalink raw reply
* Re: 1.3.0 creating bigger packs than 1.2.3
From: Shawn Pearce @ 2006-04-21 2:40 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Junio C Hamano, git
In-Reply-To: <Pine.LNX.4.64.0604202213470.2215@localhost.localdomain>
Nicolas Pitre <nico@cam.org> wrote:
> On Thu, 20 Apr 2006, Shawn Pearce wrote:
>
> > Based on Linus' comment I changed your patch to just the following.
> > It still produced the 46M pack file, so the first hunk apears to
> > not have had much of an affect with this data.
> >
> > From a running time perspective it appears as though this patch is
> > making things slightly better, not worse. I ran it a few times
> > for each case always using the 46M pack as input for
> > "git-repack -a -d -f".
> >
> > 'next' 137.13 real 95.82 user 25.24 sys
> > 'next'+patch 131.62 real 89.35 user 28.56 sys
> >
> > but even if the running time was an extra 6 seconds I'd still rather
> > spend 4% more running time to use 1/2 the storage space.
> >
> >
> > diff --git a/pack-objects.c b/pack-objects.c
> > index 09f4f2c..f7d6217 100644
> > --- a/pack-objects.c
> > +++ b/pack-objects.c
> > @@ -1052,7 +1052,7 @@ static int try_delta(struct unpacked *cu
> > if (cur_entry->delta)
> > max_size = cur_entry->delta_size-1;
> > if (sizediff >= max_size)
> > - return -1;
> > + return 0;
> > delta_buf = diff_delta(old->data, oldsize,
> > cur->data, size, &delta_size, max_size);
> > if (!delta_buf)
>
> I can confirm this is indeed the best fix so far. Any "smarter"
> solution I could think of did increase the size of the final pack quite
> spectacularly and rather unexpectedly with Shawn's repository.
Wow. I'm such a trouble maker. *grin*
> Of course removing the if (sizediff >= max_size) entirely does produce a
> smaller pack (39MB) but it takes about twice the CPU.
Eh, that's not worth it. 7M disk space saved for twice the work isn't
that good of a tradeoff. I'm not in favor of that version.
> With the patch above the Linux kernel pack is 0.3% smaller with 1% more
> CPU usage. But like for the diff-delta hash list limiting code this
> small overhead is certainly a good compromize to avoid big degradations
> in some other cases.
Hmm. See the email I just sent. I was seeing a good 10% increase
in my own tests on a Linux kernel repository. But I guess I can
hope that my test was flawed somehow and it really is closer to a 1%
increase in running time, making it more likely that the above fix
makes it into GIT.
--
Shawn.
^ permalink raw reply
* Re: 1.3.0 creating bigger packs than 1.2.3
From: Shawn Pearce @ 2006-04-21 2:32 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Nicolas Pitre, git
In-Reply-To: <20060421012029.GB819@spearce.org>
I just tried the patch below on a couple-month-old Linux 2.6
repository from Linus (last commit: Feb 14 2006). It did not
decrease the pack file size by much despite the higher delta:
'next' Total 189435, written 189435 (delta 142093), reused 44057 (delta 0)
'next'+patch Total 189435, written 189435 (delta 142712), reused 43954 (delta 0)
'next' 104464297 bytes
'next'+patch 104092920 bytes (99.6% of 'next')
'next' 328.98 real 206.02 user 93.60 sys
'next'+patch 363.06 real 218.98 user 94.72 sys
So it looks like the patch is taking longer to run, and by about 10%.
An expensive price to pay for what amounts to only a 0.4% reduction
in pack size on the kernel.
Shawn Pearce <spearce@spearce.org> wrote:
> Based on Linus' comment I changed your patch to just the following.
> It still produced the 46M pack file, so the first hunk apears to
> not have had much of an affect with this data.
>
> From a running time perspective it appears as though this patch is
> making things slightly better, not worse. I ran it a few times
> for each case always using the 46M pack as input for
> "git-repack -a -d -f".
>
> 'next' 137.13 real 95.82 user 25.24 sys
> 'next'+patch 131.62 real 89.35 user 28.56 sys
>
> but even if the running time was an extra 6 seconds I'd still rather
> spend 4% more running time to use 1/2 the storage space.
>
>
> diff --git a/pack-objects.c b/pack-objects.c
> index 09f4f2c..f7d6217 100644
> --- a/pack-objects.c
> +++ b/pack-objects.c
> @@ -1052,7 +1052,7 @@ static int try_delta(struct unpacked *cu
> if (cur_entry->delta)
> max_size = cur_entry->delta_size-1;
> if (sizediff >= max_size)
> - return -1;
> + return 0;
> delta_buf = diff_delta(old->data, oldsize,
> cur->data, size, &delta_size, max_size);
> if (!delta_buf)
--
Shawn.
^ permalink raw reply
* Re: 1.3.0 creating bigger packs than 1.2.3
From: Nicolas Pitre @ 2006-04-21 2:28 UTC (permalink / raw)
To: Shawn Pearce; +Cc: Junio C Hamano, git
In-Reply-To: <20060421012029.GB819@spearce.org>
On Thu, 20 Apr 2006, Shawn Pearce wrote:
> Based on Linus' comment I changed your patch to just the following.
> It still produced the 46M pack file, so the first hunk apears to
> not have had much of an affect with this data.
>
> From a running time perspective it appears as though this patch is
> making things slightly better, not worse. I ran it a few times
> for each case always using the 46M pack as input for
> "git-repack -a -d -f".
>
> 'next' 137.13 real 95.82 user 25.24 sys
> 'next'+patch 131.62 real 89.35 user 28.56 sys
>
> but even if the running time was an extra 6 seconds I'd still rather
> spend 4% more running time to use 1/2 the storage space.
>
>
> diff --git a/pack-objects.c b/pack-objects.c
> index 09f4f2c..f7d6217 100644
> --- a/pack-objects.c
> +++ b/pack-objects.c
> @@ -1052,7 +1052,7 @@ static int try_delta(struct unpacked *cu
> if (cur_entry->delta)
> max_size = cur_entry->delta_size-1;
> if (sizediff >= max_size)
> - return -1;
> + return 0;
> delta_buf = diff_delta(old->data, oldsize,
> cur->data, size, &delta_size, max_size);
> if (!delta_buf)
I can confirm this is indeed the best fix so far. Any "smarter"
solution I could think of did increase the size of the final pack quite
spectacularly and rather unexpectedly with Shawn's repository.
Of course removing the if (sizediff >= max_size) entirely does produce a
smaller pack (39MB) but it takes about twice the CPU.
With the patch above the Linux kernel pack is 0.3% smaller with 1% more
CPU usage. But like for the diff-delta hash list limiting code this
small overhead is certainly a good compromize to avoid big degradations
in some other cases.
Nicolas
^ permalink raw reply
* PATCH RESEND: git-svnimport memory leak plug
From: Auke Kok @ 2006-04-21 2:12 UTC (permalink / raw)
To: git
I was unable with todays git-svnimport to convert a modest 140mb svn repository to git using svn-git-import. the process would bomb out after 2000 revisions and I have 20000. It consumed over 1.5gb vm space on my 1gb machine, and died with a 'cannot fork error'. this also killed my project server today after I discivered it the wrong way.
The original patch send earlier by Santi Bejar fixes the problem. I include the patch again here so it can be merged. my mailer probably will nuke it so here's the link to the archive post:
http://marc.theaimsgroup.com/?l=git&m=114345884526971&w=2
Please include this patch.
Cheers,
Auke
On 3/24/06, Santi Béjar <sbejar@gmail.com> wrote:
> Jan-Benedict Glaw <jbglaw@lug-owl.de> writes:
>
> > On Wed, 2006-03-22 14:33:37 +0100, Jan-Benedict Glaw <jbglaw@lug-owl.de> wrote:
> >
> > Since it seems nobody looked at the GCC import run (which means to use
> > the svnimport), I ran it again, under strace control:
> >
> >> GCC
> >> ~~~
> >> $ /home/jbglaw/bin/git svnimport -C gcc -v svn://gcc.gnu.org/svn/gcc
> >
> >> Committed change 3936:/ 1993-03-31 05:44:03)
> >> Commit ID ceff85145f8671fb2a9d826a761cedc2a507bd1e
> >> Writing to refs/heads/origin
> >> DONE: 3936 origin ceff85145f8671fb2a9d826a761cedc2a507bd1e
> >> ... 3937 trunk/gcc/final.c ...
> >> Can't fork at /home/jbglaw/bin/git-svnimport line 379.
> >
>
> I have the same (?) problem with one of my svn repository. It worked
> before (I've redone the import with the -r flag), so I bisected it.
> The problematic commit seems to be:
>
> diff-tree 4802426... (from 525c0d7...)
> Author: Karl Hasselström <kha@treskal.com>
> Date: Sun Feb 26 06:11:27 2006 +0100
>
> svnimport: Convert executable flag
>
> Convert the svn:executable property to file mode 755 when converting
> an SVN repository to GIT.
>
> Signed-off-by: Karl Hasselström <kha@treskal.com>
> Signed-off-by: Junio C Hamano <junkio@cox.net>
>
> :100755 100755 ee2940f... 6603b96... M git-svnimport.perl
>
> I think it has a memory leak, it used up to 140m of memory.
>
> $ git reset --hard 4802426^
> $ time ../git-svnimport.perl file:///path/
> Use of uninitialized value in string eq at ../git-svnimport.perl line 463.
> Use of uninitialized value in substitution (s///) at ../git-svnimport.perl line 466.
> real 0m55.801s
> user 0m30.578s
> sys 0m23.084s
>
> $ git reset --hard 4802426
> $ time ../git-svnimport.perl file:///path/
> Use of uninitialized value in string eq at ../git-svnimport.perl line 463.
> Use of uninitialized value in substitution (s///) at ../git-svnimport.perl line 466.
> Can't fork at /home/santi/usr/src/scm/git/git-svnimport.perl line 331.
> real 6m2.163s
> user 0m20.332s
> sys 0m50.180s
>
> and it didn't finished. Hope it helps.
And this patch fixes my problems.
---
Introduced in 4802426.
Signed-off-by: Santi Béjar <sbejar@gmail.com>
---
git-svnimport.perl | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/git-svnimport.perl b/git-svnimport.perl
index 639aa41..f2cf062 100755
--- a/git-svnimport.perl
+++ b/git-svnimport.perl
@@ -135,8 +135,10 @@
print "... $rev $path ...\n" if $opt_v;
my (undef, $properties);
+ my $pool = SVN::Pool->new();
eval { (undef, $properties)
- = $self->{'svn'}->get_file($path,$rev,$fh); };
+ = $self->{'svn'}->get_file($path,$rev,$fh,$pool); };
+ $pool->clear;
if($@) {
return undef if $@ =~ /Attempted to get checksum/;
die $@;
^ permalink raw reply related
* Re: 1.3.0 creating bigger packs than 1.2.3
From: Shawn Pearce @ 2006-04-21 1:20 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Nicolas Pitre, git
In-Reply-To: <7vfyk8vscl.fsf@assigned-by-dhcp.cox.net>
Based on Linus' comment I changed your patch to just the following.
It still produced the 46M pack file, so the first hunk apears to
not have had much of an affect with this data.
>From a running time perspective it appears as though this patch is
making things slightly better, not worse. I ran it a few times
for each case always using the 46M pack as input for
"git-repack -a -d -f".
'next' 137.13 real 95.82 user 25.24 sys
'next'+patch 131.62 real 89.35 user 28.56 sys
but even if the running time was an extra 6 seconds I'd still rather
spend 4% more running time to use 1/2 the storage space.
diff --git a/pack-objects.c b/pack-objects.c
index 09f4f2c..f7d6217 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -1052,7 +1052,7 @@ static int try_delta(struct unpacked *cu
if (cur_entry->delta)
max_size = cur_entry->delta_size-1;
if (sizediff >= max_size)
- return -1;
+ return 0;
delta_buf = diff_delta(old->data, oldsize,
cur->data, size, &delta_size, max_size);
if (!delta_buf)
^ permalink raw reply related
* Re: 1.3.0 creating bigger packs than 1.2.3
From: Shawn Pearce @ 2006-04-21 1:01 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Nicolas Pitre, git
In-Reply-To: <7vy7xzvpsg.fsf@assigned-by-dhcp.cox.net>
Junio C Hamano <junkio@cox.net> wrote:
[snip]
> I suspect the test patch makes pack-objects a lot more
> expensive.
Which patch are you talking about the previous patch or the one in
the message I'm now replying to?
> The code before the test patch said "if the size is very small
> or size difference is too great, do not consider this, and do
> not consider any more objects in the delta window, because we
> know they are either even smaller of the same path, they have
> different names, or they are of different type". The test patch
> you tried was a quick and dirty hack that said "under the
> too-small condition, skip this one, but keep trying the rest of
> the delta window".
>
> Here is a cleaned up patch. What it does is "under the
> too-small condition, see if the object has the same basename,
> and if so keep going, but otherwise skip the rest as before".
[snip]
The patch below does not help very much:
Total 46391, written 46391 (delta 6686), reused 37979 (delta 0)
129M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack
> diff --git a/pack-objects.c b/pack-objects.c
> index 09f4f2c..2173709 100644
> --- a/pack-objects.c
> +++ b/pack-objects.c
> @@ -1036,8 +1036,6 @@ static int try_delta(struct unpacked *cu
> oldsize = old_entry->size;
> sizediff = oldsize > size ? oldsize - size : size - oldsize;
>
> - if (size < 50)
> - return -1;
> if (old_entry->depth >= max_depth)
> return 0;
>
> @@ -1048,20 +1046,27 @@ static int try_delta(struct unpacked *cu
> * more space-efficient (deletes don't have to say _what_ they
> * delete).
> */
> - max_size = size / 2 - 20;
> - if (cur_entry->delta)
> - max_size = cur_entry->delta_size-1;
> - if (sizediff >= max_size)
> - return -1;
> - delta_buf = diff_delta(old->data, oldsize,
> - cur->data, size, &delta_size, max_size);
> - if (!delta_buf)
> + if (50 <= size) {
> + max_size = size / 2 - 20;
> + if (cur_entry->delta)
> + max_size = cur_entry->delta_size-1;
> + if (sizediff < max_size) {
> + delta_buf = diff_delta(old->data, oldsize,
> + cur->data, size,
> + &delta_size, max_size);
> + if (!delta_buf)
> + return 0;
> + cur_entry->delta = old_entry;
> + cur_entry->delta_size = delta_size;
> + cur_entry->depth = old_entry->depth + 1;
> + free(delta_buf);
> + return 0;
> + }
> + }
> + /* Keep going as long as the basename matches */
> + if (((cur_entry->hash ^ old_entry->hash) >>DIRBITS) == 0)
> return 0;
> - cur_entry->delta = old_entry;
> - cur_entry->delta_size = delta_size;
> - cur_entry->depth = old_entry->depth + 1;
> - free(delta_buf);
> - return 0;
> + return -1;
> }
>
> static void progress_interval(int signum)
>
--
Shawn.
^ permalink raw reply
* Re: 1.3.0 creating bigger packs than 1.2.3
From: Nicolas Pitre @ 2006-04-21 0:52 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <7vfyk8vscl.fsf@assigned-by-dhcp.cox.net>
On Thu, 20 Apr 2006, Junio C Hamano wrote:
> Nicolas Pitre <nico@cam.org> writes:
>
> >> But I suspect we have a built-in "we sort bigger to smaller, and
> >> we cut off when we switch bins" somewhere in find_delta() loop,
> >> which I do not recall touching when I did that change, so that
> >> may be interfering and preventing 0-11-AdjLite.deg from all over
> >> the place to delta against each other.
> >
> > I just cannot find something that would do that in the code. When
> > --no-reuse-delta is specified, the only things that will break the loop
> > in find_delta() is when try_delta() returns -1, and that happens only
> > when changing object type or when the size difference is too big, but
> > nothing looks at the name hash.
>
> The list is sorted by type then hash then size (type_size_sort),
> so if you have t/Makefile that are big medium small too-small
> and then doc/Makefile that are big medium, once you see the
> too-small t/Makefile it would not consider the big doc/Makefile
> as a candidate X-<.
Bingo! I didn't think it all through before.
Nicolas
^ permalink raw reply
* [cogito-0.17.2] Bug in cg-log selection
From: Blaisorblade @ 2006-04-21 0:08 UTC (permalink / raw)
To: Petr Baudis; +Cc: git
(Please CC me on replies as I'm not subscribed).
On a standard Linux tree:
$ cg-branch-ls
origin
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-2.6.16.y.git
this command selects two commit instead than the right one (which is selected
by git-whatchanged):
$ cg-log -r v2.6.16:v2.6.16.9 kernel/power/process.c
shows commits 6b2467e45179a336f1e5b70d2b2ae1fe89a00133 and
1dd6f008de5a04251d9cbe4c1cf67e4c708f9fe9, but the latter doesn't touch that
file, and
$ git-whatchanged -p v2.6.16..v2.6.16.9 kernel/power/process.c
only shows the first commit.
Verbatim output of short listing:
$ cg-log -c -s -r v2.6.16:v2.6.16.9 kernel/power/process.c|cat
6b2467e45179a336f1e5b70d2b2ae1fe89a00133 Pavel Machek 2006-04-17 13:16
[PATCH] Fix suspend with traced tasks
1dd6f008de5a04251d9cbe4c1cf67e4c708f9fe9 Jeff Garzik 2006-03-27 22:47
[PATCH] sata_mv: fix irq port status usage
Bye
--
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade
___________________________________
Yahoo! Mail: gratis 1GB per i messaggi e allegati da 10MB
http://mail.yahoo.it
^ permalink raw reply
* Cogito bug on Debian
From: Martin Langhoff @ 2006-04-20 23:17 UTC (permalink / raw)
To: Git Mailing List, Petr Baudis
This was spotted circulating on Catalyst's IRC channel. Apparently,
the bug "causes non-serious data loss".
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=330031
cheers,
martin
^ permalink raw reply
* Re: 1.3.0 creating bigger packs than 1.2.3
From: Junio C Hamano @ 2006-04-20 23:02 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0604201630320.2215@localhost.localdomain>
Nicolas Pitre <nico@cam.org> writes:
>> I originally thought, with one single notable exception of
>> Makefile, having the identically named file in many different
>> directories is not common nor sane,
>
> I'd tend to disagree with that but...
I disagree with that myself now. The kernel tree has many
files with the same basename (e.g. arch/*/kernel/irq.c).
It is a different issue if they are good delta base candidates
with each other, though.
^ permalink raw reply
* Re: 1.3.0 creating bigger packs than 1.2.3
From: Linus Torvalds @ 2006-04-20 22:59 UTC (permalink / raw)
To: Shawn Pearce; +Cc: Junio C Hamano, Nicolas Pitre, git
In-Reply-To: <20060420220240.GB32748@spearce.org>
On Thu, 20 Apr 2006, Shawn Pearce wrote:
> Junio C Hamano <junkio@cox.net> wrote:
> >
> > This _might_ improve things:
> >
> > diff --git a/pack-objects.c b/pack-objects.c
> > index 09f4f2c..0c6abe9 100644
> > --- a/pack-objects.c
> > +++ b/pack-objects.c
> > @@ -1037,7 +1039,7 @@ static int try_delta(struct unpacked *cu
> > sizediff = oldsize > size ? oldsize - size : size - oldsize;
> >
> > if (size < 50)
> > - return -1;
> > + return 0;
> > if (old_entry->depth >= max_depth)
> > return 0;
> >
> > @@ -1052,7 +1054,7 @@ static int try_delta(struct unpacked *cu
> > if (cur_entry->delta)
> > max_size = cur_entry->delta_size-1;
> > if (sizediff >= max_size)
> > - return -1;
> > + return 0;
> > delta_buf = diff_delta(old->data, oldsize,
> > cur->data, size, &delta_size, max_size);
> > if (!delta_buf)
>
> Holy cow, it did:
>
> Total 46391, written 46391 (delta 8391), reused 37774 (delta 0)
> 46M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack
>
> That's the smallest packing I've seen yet. And it doesn't have a
> negative affect on repacking GIT either.
I think I know what's going on, and why your bisection claimed it was the
re-hashing change that was the problem, even though it really wasn't.
That
if (sizediff >= max_size)
return -1;
check is actually fairly _old_. It's from June 2005, ie it's from pretty
much the first two days of that packing thing existing in the first place.
(The initial repacking was done June 25th, with a lot of tweaking over the
next few days. That sizediff thing was part of the fairly early tweaking).
The thing is, that check made sense back then. Why? Because we sorted
things in decreasing size order back then (I think this was before we even
did any name-based heuristic sorting at all), so that when we tried the
delta algorithm, and the size diff was bigger than the last delta size, we
pretty much _knew_ the new delta would be bigger still, and there was no
point in continuing with try_delta.
HOWEVER. We have since changed the sorting to sort according to name
before it sorts according to size, so that old heuristic that depended on
the size being monotonically increasing simply doesn't make any sense any
more.
So I think at that second "return -1" really _should_ be changed to a
"return 0", and not just because it helps your particular case. It's
literally a bug these days, because the assumptions that caused it to
return -1 simply aren't true any more.
(It wasn't _strictly_ true even originally: even if the sizediff is huge,
the _delta_ may not be huge, since we can delete data with a small delta.
So it's quite likely that we should compare the "old is bigger than new"
and "new is bigger than old" separately and have different heuristics for
them. Again, that was simply not much of an issue back when we sorted just
by size).
So even the "return 0" might not be completely right. We might actually
want to look at how big the delta is, and return only once that fails.
Linus
^ permalink raw reply
* Re: 1.3.0 creating bigger packs than 1.2.3
From: Junio C Hamano @ 2006-04-20 22:35 UTC (permalink / raw)
To: Shawn Pearce; +Cc: Nicolas Pitre, git
In-Reply-To: <20060420220240.GB32748@spearce.org>
Shawn Pearce <spearce@spearce.org> writes:
> Junio C Hamano <junkio@cox.net> wrote:
>
>> The list is sorted by type then hash then size (type_size_sort),
>> so if you have t/Makefile that are big medium small too-small
>> and then doc/Makefile that are big medium, once you see the
>> too-small t/Makefile it would not consider the big doc/Makefile
>> as a candidate X-<.
>>
>> This _might_ improve things:
>>...
>
> Holy cow, it did:
>
> Total 46391, written 46391 (delta 8391), reused 37774 (delta 0)
> 46M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack
>
> That's the smallest packing I've seen yet. And it doesn't have a
> negative affect on repacking GIT either.
Thanks for trying. Mind trying one more?
I suspect the test patch makes pack-objects a lot more
expensive.
The code before the test patch said "if the size is very small
or size difference is too great, do not consider this, and do
not consider any more objects in the delta window, because we
know they are either even smaller of the same path, they have
different names, or they are of different type". The test patch
you tried was a quick and dirty hack that said "under the
too-small condition, skip this one, but keep trying the rest of
the delta window".
Here is a cleaned up patch. What it does is "under the
too-small condition, see if the object has the same basename,
and if so keep going, but otherwise skip the rest as before".
If you have objects like this and are trying to pack the first
object (this list is sorted in the order pack-object tries):
(size) (path)
1000 t/0-11-AdjLite.deg
10 t/0-11-AdjLite.deg
800 s/0-11-AdjLite.deg
20 t/0-12-AdjLite.deg
the current code stops after checking t/0-11-AdjLite.deg. The
test patch tries all of them. This patch skips that file, but
tries "s/0-11-AdjLite.deg", and then stops at the next one.
-- >8 --
diff --git a/pack-objects.c b/pack-objects.c
index 09f4f2c..2173709 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -1036,8 +1036,6 @@ static int try_delta(struct unpacked *cu
oldsize = old_entry->size;
sizediff = oldsize > size ? oldsize - size : size - oldsize;
- if (size < 50)
- return -1;
if (old_entry->depth >= max_depth)
return 0;
@@ -1048,20 +1046,27 @@ static int try_delta(struct unpacked *cu
* more space-efficient (deletes don't have to say _what_ they
* delete).
*/
- max_size = size / 2 - 20;
- if (cur_entry->delta)
- max_size = cur_entry->delta_size-1;
- if (sizediff >= max_size)
- return -1;
- delta_buf = diff_delta(old->data, oldsize,
- cur->data, size, &delta_size, max_size);
- if (!delta_buf)
+ if (50 <= size) {
+ max_size = size / 2 - 20;
+ if (cur_entry->delta)
+ max_size = cur_entry->delta_size-1;
+ if (sizediff < max_size) {
+ delta_buf = diff_delta(old->data, oldsize,
+ cur->data, size,
+ &delta_size, max_size);
+ if (!delta_buf)
+ return 0;
+ cur_entry->delta = old_entry;
+ cur_entry->delta_size = delta_size;
+ cur_entry->depth = old_entry->depth + 1;
+ free(delta_buf);
+ return 0;
+ }
+ }
+ /* Keep going as long as the basename matches */
+ if (((cur_entry->hash ^ old_entry->hash) >>DIRBITS) == 0)
return 0;
- cur_entry->delta = old_entry;
- cur_entry->delta_size = delta_size;
- cur_entry->depth = old_entry->depth + 1;
- free(delta_buf);
- return 0;
+ return -1;
}
static void progress_interval(int signum)
^ permalink raw reply related
* Re: 1.3.0 creating bigger packs than 1.2.3
From: Shawn Pearce @ 2006-04-20 22:02 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Nicolas Pitre, git
In-Reply-To: <7vfyk8vscl.fsf@assigned-by-dhcp.cox.net>
Junio C Hamano <junkio@cox.net> wrote:
> Nicolas Pitre <nico@cam.org> writes:
>
> >> But I suspect we have a built-in "we sort bigger to smaller, and
> >> we cut off when we switch bins" somewhere in find_delta() loop,
> >> which I do not recall touching when I did that change, so that
> >> may be interfering and preventing 0-11-AdjLite.deg from all over
> >> the place to delta against each other.
> >
> > I just cannot find something that would do that in the code. When
> > --no-reuse-delta is specified, the only things that will break the loop
> > in find_delta() is when try_delta() returns -1, and that happens only
> > when changing object type or when the size difference is too big, but
> > nothing looks at the name hash.
>
> The list is sorted by type then hash then size (type_size_sort),
> so if you have t/Makefile that are big medium small too-small
> and then doc/Makefile that are big medium, once you see the
> too-small t/Makefile it would not consider the big doc/Makefile
> as a candidate X-<.
>
> This _might_ improve things:
>
> diff --git a/pack-objects.c b/pack-objects.c
> index 09f4f2c..0c6abe9 100644
> --- a/pack-objects.c
> +++ b/pack-objects.c
> @@ -1037,7 +1039,7 @@ static int try_delta(struct unpacked *cu
> sizediff = oldsize > size ? oldsize - size : size - oldsize;
>
> if (size < 50)
> - return -1;
> + return 0;
> if (old_entry->depth >= max_depth)
> return 0;
>
> @@ -1052,7 +1054,7 @@ static int try_delta(struct unpacked *cu
> if (cur_entry->delta)
> max_size = cur_entry->delta_size-1;
> if (sizediff >= max_size)
> - return -1;
> + return 0;
> delta_buf = diff_delta(old->data, oldsize,
> cur->data, size, &delta_size, max_size);
> if (!delta_buf)
Holy cow, it did:
Total 46391, written 46391 (delta 8391), reused 37774 (delta 0)
46M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack
That's the smallest packing I've seen yet. And it doesn't have a
negative affect on repacking GIT either.
--
Shawn.
^ permalink raw reply
* Re: 1.3.0 creating bigger packs than 1.2.3
From: Jakub Narebski @ 2006-04-20 21:56 UTC (permalink / raw)
To: git
In-Reply-To: <7vodywvsrq.fsf@assigned-by-dhcp.cox.net>
Junio C Hamano wrote:
> v1.2.3 hash was base-name only. doc/Makefile and t/Makefile
> were thrown in the same bin and sorted by size. When the
> history you are packing is deep, and doc/Makefile and t/Makefile
> are not related at all, this made effective size of delta window
> 1/N where N is the number of such duplicates.
>
> The one you found above uses a hash that is fully full-path.
> The two are in completely different bins, and bins are totally
> random. This was not a good strategy.
>
> v1.3.0 hash is base-name hash concatenated with leading-path
> has. t/Makefile and doc/Makefile go in separate bins, but the
> bins are close to each other; this avoids the problem in v1.2.3
> when you have deep history, but at the same time if you do not
> have many many versions of t/Makefile to overflow the delta
> window, it gives t/Makefile a chance to delta with doc/Makefile.
[...]
> You could try this patch to resurrect the hash used in v1.2.3,
> and you may get better packing for your particular repository;
> but I am not sure if it gives better results in the general
> case. I am running the test myself now while waiting for my
> day-job database to load X-<.
Perhaps the packing code could check which version gives smaller pack, or at
least be instructed that one might want different packing heuristic for
specific repository? Surely 2x difference in size is worth considering (and
complication)...
--
Jakub Narebski
Warsaw, Poland
^ permalink raw reply
* Re: 1.3.0 creating bigger packs than 1.2.3
From: Shawn Pearce @ 2006-04-20 21:53 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Nicolas Pitre, Linus Torvalds
In-Reply-To: <7vodywvsrq.fsf@assigned-by-dhcp.cox.net>
Junio C Hamano <junkio@cox.net> wrote:
> Shawn Pearce <spearce@spearce.org> writes:
>
> > I just spent some time bisecting this issue and it looks like the
> > following change by Junio may be the culprit:
> >
> > commit 1d6b38cc76c348e2477506ca9759fc241e3d0d46
[snip]
> Unfortunately, that is not the same hash we use in v1.3.0, so we
> need to look elsewhere for interactions.
Pity. Then either bisect goofed or there was a goof in meatspace
while using bisect. I honestly expected bisect to point at the
problem commit. I tried reverting 1d6b38cc but it didn't apply
cleanly and I didn't feel like working through all of the conflicts
at the time.
[snip]
> The earlier observation by Linus on reverting eeef7135 is
> consistent with it; that commit was the one that introduced
> v1.3.0 hash.
Yet reverting that didn't help either.
[snip]
> You could try this patch to resurrect the hash used in v1.2.3,
> and you may get better packing for your particular repository;
> but I am not sure if it gives better results in the general
> case. I am running the test myself now while waiting for my
> day-job database to load X-<.
[snip]
Nope. When applied to 'next' it didn't help very much:
Total 46391, written 46391 (delta 6466), reused 38662 (delta 0)
118M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack
Just to note: the 1.3.0 packer is saving 1M in the GIT repository
over the 1.2.3 packer. So for a real project it does seem to have
some benefit. And if you benchmarked the 1.3.0 packer against
the Linux kernel and found it to be better than the 1.2.3 packer
that's even better.
I think this repository of mine may just be a degenerate case which
GIT doesn't pack very well. GIT can't be all things to all people!
--
Shawn.
^ permalink raw reply
* Re: 1.3.0 creating bigger packs than 1.2.3
From: Junio C Hamano @ 2006-04-20 21:40 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0604201630320.2215@localhost.localdomain>
Nicolas Pitre <nico@cam.org> writes:
>> But I suspect we have a built-in "we sort bigger to smaller, and
>> we cut off when we switch bins" somewhere in find_delta() loop,
>> which I do not recall touching when I did that change, so that
>> may be interfering and preventing 0-11-AdjLite.deg from all over
>> the place to delta against each other.
>
> I just cannot find something that would do that in the code. When
> --no-reuse-delta is specified, the only things that will break the loop
> in find_delta() is when try_delta() returns -1, and that happens only
> when changing object type or when the size difference is too big, but
> nothing looks at the name hash.
The list is sorted by type then hash then size (type_size_sort),
so if you have t/Makefile that are big medium small too-small
and then doc/Makefile that are big medium, once you see the
too-small t/Makefile it would not consider the big doc/Makefile
as a candidate X-<.
This _might_ improve things:
diff --git a/pack-objects.c b/pack-objects.c
index 09f4f2c..0c6abe9 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -1037,7 +1039,7 @@ static int try_delta(struct unpacked *cu
sizediff = oldsize > size ? oldsize - size : size - oldsize;
if (size < 50)
- return -1;
+ return 0;
if (old_entry->depth >= max_depth)
return 0;
@@ -1052,7 +1054,7 @@ static int try_delta(struct unpacked *cu
if (cur_entry->delta)
max_size = cur_entry->delta_size-1;
if (sizediff >= max_size)
- return -1;
+ return 0;
delta_buf = diff_delta(old->data, oldsize,
cur->data, size, &delta_size, max_size);
if (!delta_buf)
^ permalink raw reply related
* Re: cg-clone produces "___" file and no working tree
From: Zack Brown @ 2006-04-20 21:34 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <7vodywxago.fsf@assigned-by-dhcp.cox.net>
Hi,
> honestly I would not recommend "runnning
> without installing" unless you know what you are doing ;-).
OK, you're right, the problem was that I was not doing a proper install. I
followed the directions and it worked. Thanks!
Be well,
Zack
On Thu, Apr 20, 2006 at 01:23:35PM -0700, Junio C Hamano wrote:
> Junio C Hamano <junkio@cox.net> writes:
>
> > Zack Brown <zbrown@tumblerings.org> writes:
> >
> >> Not true. I went into the git source directory, and ran "make". Nothing more.
> >
> > Ah, I misunderstood. You are trying to run it _without_
> > installing it.
> >
> > Well, then probably you do not have templates installed
> > anywhere, especially not where git-init-db expects them to be
> > found.
>
> (sorry for the short message sent unfinished by mistake).
>
> Running things without installing is somewhat tricky, but test
> framework needs to do that, so there are some things you would
> need to do.
>
> - "git init-db" takes --template argument; in the source area
> before installing, they are built in templates/blt/.
>
> - "git" and programs that need to invoke other git programs
> (e.g. git-send-pack) expects things to be found in gitexecdir
> you set when you build. If you are not installing, you need
> to override that with GIT_EXEC_PATH environment variable.
>
> There might be other things, but you should be able to find them
> from what t/Makefile and t/test-lib.sh do.
>
> Having said that, honestly I would not recommend "runnning
> without installing" unless you know what you are doing ;-).
>
--
^ permalink raw reply
* Re: 1.3.0 creating bigger packs than 1.2.3
From: Junio C Hamano @ 2006-04-20 21:31 UTC (permalink / raw)
To: Shawn Pearce; +Cc: git, Nicolas Pitre, Linus Torvalds
In-Reply-To: <20060420173131.GF31738@spearce.org>
Shawn Pearce <spearce@spearce.org> writes:
> I just spent some time bisecting this issue and it looks like the
> following change by Junio may be the culprit:
>
> commit 1d6b38cc76c348e2477506ca9759fc241e3d0d46
> Author: Junio C Hamano <junkio@cox.net>
> Date: Wed Feb 22 22:10:24 2006 -0800
>
> pack-objects: use full pathname to help hashing with "thin" pack.
>
> This uses the same hashing algorithm to the "preferred base
> tree" objects and the incoming pathnames, to group the same
> files from different revs together, while spreading files with
> the same basename in different directories.
>
> Signed-off-by: Junio C Hamano <junkio@cox.net>
>
Unfortunately, that is not the same hash we use in v1.3.0, so we
need to look elsewhere for interactions.
v1.2.3 hash was base-name only. doc/Makefile and t/Makefile
were thrown in the same bin and sorted by size. When the
history you are packing is deep, and doc/Makefile and t/Makefile
are not related at all, this made effective size of delta window
1/N where N is the number of such duplicates.
The one you found above uses a hash that is fully full-path.
The two are in completely different bins, and bins are totally
random. This was not a good strategy.
v1.3.0 hash is base-name hash concatenated with leading-path
has. t/Makefile and doc/Makefile go in separate bins, but the
bins are close to each other; this avoids the problem in v1.2.3
when you have deep history, but at the same time if you do not
have many many versions of t/Makefile to overflow the delta
window, it gives t/Makefile a chance to delta with doc/Makefile.
The earlier observation by Linus on reverting eeef7135 is
consistent with it; that commit was the one that introduced
v1.3.0 hash.
You could try this patch to resurrect the hash used in v1.2.3,
and you may get better packing for your particular repository;
but I am not sure if it gives better results in the general
case. I am running the test myself now while waiting for my
day-job database to load X-<.
NOTE NOTE NOTE. The hash in v1.2.3 was done with the basename
(relying on rev-list --objects to only show the basename) and
hashed from front to back. The current one uses the same hash
scrambling function, but it hashes from back to front, and it
knows rev-list --objects gives it a full path.
What this patch does is to stop the hashing after we are done
with the basename part. So it still gives different hash value
to the same path from v1.2.3 version, but the distribution
should be equivalent.
NOTE 2. Feeding output from the current "rev-list --objects" to
v1.2.3 pack-object is the same as "hash full path and spread
things out" intermediate version, which is the worst performer.
-- >8 --
git diff
diff --git a/pack-objects.c b/pack-objects.c
index 09f4f2c..e58e169 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -492,6 +492,8 @@ static unsigned name_hash(struct name_pa
name_hash = hash;
hash = 0;
}
+ return name_hash;
+
for (p = path; p; p = p->up) {
hash = hash * 11 + '/';
n = p->elem + p->len;
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox