* Re: [PATCH] [RFD] Add repoid identifier to commit
From: Junio C Hamano @ 2005-05-12 21:44 UTC (permalink / raw)
To: Sean; +Cc: tglx, Junio C Hamano, H. Peter Anvin, git
In-Reply-To: <2477.10.10.10.24.1115933520.squirrel@linux1>
>>>>> "S" == Sean <seanlkml@sympatico.ca> writes:
S> When an object is committed locally it is set to the local time. You can
S> only have this feature when you use private commit objects (shared blobs
S> are okay).
This brings up an interesting possibility, which is off topic
from this thread.
You _could_ (I am not advocating this, just thinking aloud) have
GIT_OBJECT_DIRECTORY and GIT_COMMIT_OBJECT_DIRECTORY pointing at
two separate object pools, with the value of
GIT_COMMIT_OBJECT_DIRECTORY being on the
GIT_ALTERNATE_OBJECT_DIRECTORIES list. Your commits go to
GIT_COMMIT_OBJECT_DIRECTORY (local to the tree) and everything
else goes to GIT_OBJECT_DIRECTORY (can be shared across trees).
Hmm.... Interesting. My gut feeling tells me not to go there,
though.
^ permalink raw reply
* Re: [PATCH Cogito] Improve option parsing for cg-log
From: Marcel Holtmann @ 2005-05-12 21:49 UTC (permalink / raw)
To: Petr Baudis; +Cc: GIT Mailing List
In-Reply-To: <20050512211315.GP324@pasky.ji.cz>
Hi Petr,
> > the attached patch changes the option parsing, because otherwise we are
> > stuck to a specific order.
>
> thanks, applied. However, you didn't include the -r options parsing in
> there yet.
what do you mean by that?
Regards
Marcel
^ permalink raw reply
* [PATCH Cogito] Add -u option to cg-log to show only commits from a specific user
From: Marcel Holtmann @ 2005-05-12 21:58 UTC (permalink / raw)
To: Petr Baudis; +Cc: GIT Mailing List
[-- Attachment #1: Type: text/plain, Size: 292 bytes --]
Hi Petr,
the attached patch introduces the -u option for cg-log. Now you can give
a username or part of a username, and only commits with a matching
author or committer will be displayed. Based on a patch from Sean.
Regards
Marcel
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
[-- Attachment #2: patch --]
[-- Type: text/x-patch, Size: 1168 bytes --]
Index: cg-log
===================================================================
--- 7fa81b554162c34c616e74392960939412d18081/cg-log (mode:100755)
+++ uncommitted/cg-log (mode:100755)
@@ -15,6 +15,9 @@
#
# Takes an -f option to list which files was changed.
#
+# Takes -u"username" to list only commits where author or
+# committer contains username.
+#
# Takes an -r followed with id resolving to a commit to start from
# (HEAD by default), or id1:id2 representing an (id1;id2] range
# of commits to show.
@@ -34,6 +37,7 @@
colsignoff=
coldefault=
list_files=
+user=
while [ "$1" ]; do
# TODO: Parse -r here too.
case "$1" in
@@ -51,6 +55,10 @@
list_files=1
shift
;;
+ -u*)
+ user="${1#-u}"
+ shift
+ ;;
*)
break
;;
@@ -123,6 +131,9 @@
parent=$(git-cat-file commit $commit | sed -n '2s/parent //p;2Q')
[ "$parent" ] && [ "$(git-diff-tree -r $commit $parent "$@")" ] || continue
fi
+ if [ "$user" ]; then
+ git-cat-file commit $commit | grep -e '^author ' -e '^committer ' | grep -qi "$user" || continue
+ fi
echo $colheader""commit ${commit%:*} $coldefault;
git-cat-file commit $commit | \
while read key rest; do
^ permalink raw reply
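The filter added by the -u patch can be tried without a repository. The following is a minimal sketch, not part of the patch: parse_user, matches_user and sample_commit are hypothetical helper names, the names and addresses in the canned commit text are made up, and that text stands in for the output of git-cat-file commit. Like the patch, it searches every line starting with "author " or "committer ", so a log message line that happens to begin that way would also be searched.

```shell
#!/bin/sh
# Sketch of the -u filter, run against canned commit text instead of
# `git-cat-file commit $commit`. All helper names here are hypothetical.

parse_user () {
	# Mirrors the patch: strip the leading "-u" from the option word.
	case "$1" in
	-u*) printf '%s\n' "${1#-u}" ;;
	*)   printf '\n' ;;
	esac
}

matches_user () {
	# $1 = user pattern; commit object text arrives on stdin.
	# Only author/committer lines are searched, case-insensitively.
	grep -e '^author ' -e '^committer ' | grep -qi "$1"
}

sample_commit () {
	cat <<\EOF
tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
author A U Thor <author@example.xz> 1115934000 +0000
committer C O Mitter <committer@example.xz> 1115934000 +0000

Mention of someoneelse in the log message only.
EOF
}

user=$(parse_user -uthor)
if sample_commit | matches_user "$user"
then
	echo "shown"
else
	echo "skipped"
fi
```

Because the pattern is handed to grep unquoted as a regular expression, characters such as "." or "*" in the username keep their regex meaning, both here and in the patch.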
* Re: [PATCH] [RFD] Add repoid identifier to commit
From: Thomas Gleixner @ 2005-05-12 22:06 UTC (permalink / raw)
To: Sean; +Cc: Junio C Hamano, H. Peter Anvin, git
In-Reply-To: <2477.10.10.10.24.1115933520.squirrel@linux1>
On Thu, 2005-05-12 at 17:32 -0400, Sean wrote:
> > How do you enforce correct timestamps ?
>
> When an object is committed locally it is set to the local time. You can
> only have this feature when you use private commit objects (shared blobs
> are okay). It doesn't matter if the timestamps are correct in the global
> sense, just that they're correct for the local server, because they'll
> only ever be compared against each other.
That limits the usefulness to a local place, which makes no sense in a
distributed development scenario.
> By the way, repoid doesn't work when all the branches are done in the same
> repository. You'd need to use something like repoid-branch.
Right. That was my basic idea: collect the information either from an
environment variable or deduce it from the current working directory,
which is unlikely to be the same for different branches. hpa's arguments
against this approach are quite good, but I think something like a
per-branch repository id is not too hard to implement.
tglx
^ permalink raw reply
* Re: [PATCH] [RFD] Add repoid identifier to commit
From: Sean @ 2005-05-12 22:24 UTC (permalink / raw)
To: tglx; +Cc: Junio C Hamano, H. Peter Anvin, git
In-Reply-To: <1115935604.11872.97.camel@tglx>
On Thu, May 12, 2005 6:06 pm, Thomas Gleixner said:
Thomas,
> That limits the usefulness to a local place, which makes no sense in a
> distributed development scenario.
I don't think that is true. The only time you'd use this timestamp is
when comparing against other commits from the same repository. As you
download the commits you're interested in from a remote repository, you
compare them to each other to get the order.
Sean
^ permalink raw reply
* Re: Mercurial 0.4e vs git network pull
From: Matt Mackall @ 2005-05-12 22:29 UTC (permalink / raw)
To: Daniel Barkalow; +Cc: Petr Baudis, linux-kernel, git, mercurial, Linus Torvalds
In-Reply-To: <Pine.LNX.4.21.0505121709250.30848-100000@iabervon.org>
On Thu, May 12, 2005 at 05:24:27PM -0400, Daniel Barkalow wrote:
> On Thu, 12 May 2005, Matt Mackall wrote:
>
> > Does this need an HTTP request (and round trip) per object? It appears
> > to. That's 2200 requests/round trips for my 800 patch benchmark.
>
> It requires a request per object, but it should be possible (with
> somewhat more complicated code) to overlap them such that it doesn't
> require a serial round trip for each. Since the server is sending static
> files, the overhead for each should be minimal.
It's not minimal. The size of an HTTP request is often not much
different than the size of a compressed file delta. Here's one of the
indexes from a file in an hg repo:
rev offset length base linkrev p1 p2 nodeid
0 0 2307 0 0 0000000000.. 0000000000.. b6444347c6..
1 2307 77 0 5 b6444347c6.. 0000000000.. 06763db6de..
2 2384 225 0 11 06763db6de.. 0000000000.. acc8e2b2f0..
3 2609 40 0 16 acc8e2b2f0.. 0000000000.. 461b079d98..
4 2649 261 0 17 461b079d98.. 0000000000.. 8507ba44cc..
5 2910 486 0 18 8507ba44cc.. 0000000000.. b68523252b..
6 3396 98 0 21 b68523252b.. 0000000000.. b3f2586243..
7 3494 238 0 22 b3f2586243.. 0000000000.. d73d0f8ee9..
8 3732 39 0 23 d73d0f8ee9.. 0000000000.. caaf506196..
9 3771 266 0 24 caaf506196.. 0000000000.. 54485fc96f..
10 4037 81 0 29 54485fc96f.. 0000000000.. b9eae7b990..
11 4118 310 0 31 b9eae7b990.. 0000000000.. a9926b092a..
12 4428 545 0 33 a9926b092a.. 0000000000.. f26c600172..
13 4973 419 0 34 f26c600172.. 0000000000.. ec4ab0acb7..
14 5392 136 0 38 ec4ab0acb7.. 0000000000.. eb5f3f76c8..
15 5528 161 0 39 eb5f3f76c8.. 0000000000.. 4fc5f3a3ae..
16 5689 258 0 46 4fc5f3a3ae.. 0000000000.. 3ad83891fb..
17 5947 171 0 49 3ad83891fb.. 0000000000.. 3983ac6cd2..
18 6118 195 0 50 3983ac6cd2.. 0000000000.. f138865e04..
19 6313 79 0 52 f138865e04.. 0000000000.. 3566c1f449..
20 6392 85 0 53 3566c1f449.. 0000000000.. 0694a4e3eb..
21 6477 91 0 54 0694a4e3eb.. 0000000000.. 5f98ae7426..
22 6568 208 0 56 5f98ae7426.. 0000000000.. dae5cb80db..
23 6776 286 0 62 dae5cb80db.. 0000000000.. 90ff243869..
All the junk that gets bundled in an http request/response will be
similar in size to the stuff in the third column.
Relative to the 10-20x overhead of not sending deltas, yes, it's only 10%.
> > How does git find the outstanding changesets?
>
> In the present mainline, you first have to find the head commit you
> want. I have a patch which does this for you over the same
> connection. Starting from that point, it tracks reachability on the
> receiving end, and requests anything it doesn't have.
Does it do this recursively? E.g., if the server has 800 new linear
commits, does the client have to do 800 round trips following parent
pointers to find all the new changesets? In this case, Mercurial does
about 6 round trips, totalling less than 1K, plus one request
that pulls everything.
--
Mathematics is the supreme nostalgia of our time.
^ permalink raw reply
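The size comparison in the message above can be checked with a little arithmetic on the third column of the index listing. This is only a sketch: the lengths are copied from the listing (revision 0, the full 2307-byte text, is left out so that only deltas are counted), and http_overhead=300 is an assumed round figure for one request plus response header, not a measurement.

```shell
#!/bin/sh
# Delta lengths for revisions 1..23, taken from the index listing.
lengths="77 225 40 261 486 98 238 39 266 81 310 545 419 136 161 258 171 195 79 85 91 208 286"

# Assumption: bytes of HTTP header traffic per object fetched.
http_overhead=300

total=0
count=0
for n in $lengths
do
	total=$((total + n))
	count=$((count + 1))
done
mean=$((total / count))	# integer division, rounds down

echo "deltas: $count, total: $total bytes, mean: $mean bytes"
echo "assumed HTTP overhead for $count requests: $((count * http_overhead)) bytes"
```

With the listed lengths this reports 23 deltas totalling 4755 bytes, a mean of 206 bytes, so any per-request header cost in the low hundreds of bytes is of the same order as the payload itself.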
* Re: New version of gitk
From: Alex Riesen @ 2005-05-12 23:06 UTC (permalink / raw)
To: Paul Mackerras; +Cc: git
In-Reply-To: <17026.43676.670725.66502@cargo.ozlabs.ibm.com>
On 5/12/05, Paul Mackerras <paulus@samba.org> wrote:
> I have just put a new version of gitk at:
>
> http://ozlabs.org/~paulus/gitk-0.9
>
Very, very nice and useful. Thank you!
Btw, what does the tree look like with unconnected (unmerged) branches?
And what about the case where Linus just pointed HEAD to the most recent commit?
There are some confusing interconnections in the tree, like around this commit:
"Author: David Woodhouse <dwmw2@shinybook.infradead.org> 2005-05-05 14:59:37
Committer: David Woodhouse <dwmw2@shinybook.infradead.org> 2005-05-05 14:59:37
Merge with master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6.git"
(Maybe show sha1's, just for reference?)
And I got the error below trying to run gitk in kernel git (latest),
probably because I closed the window before the script finished something.
Error in startup script: invalid command name ".ctop.clist.canv"
while executing
"$canv create line $x $linestarty($level) $x $canvy -width 2 -fill
$colormap($id)"
(procedure "drawgraph" line 40)
invoked from within
"drawgraph $start"
invoked from within
"if {$start != {}} {
drawgraph $start
}"
(file "/home/raa/bin/gitk" line 703)
--
Alex
^ permalink raw reply
* [PATCH] Fix git-diff-files for symlinks.
From: Junio C Hamano @ 2005-05-12 23:51 UTC (permalink / raw)
To: Petr Baudis; +Cc: git
In-Reply-To: <20050512192941.GC324@pasky.ji.cz>
Again I am not sure why this was missed during the last round,
but git-diff-files mishandles symlinks on the filesystem. This
patch fixes it.
Signed-off-by: Junio C Hamano <junkio@cox.net>
---
*** Petr, this one falls into "emergency obvious fix" category.
*** I found it while adding more "basic" tests to the test suite,
*** which turns out to be quite useful.
diff-files.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletion(-)
--- a/diff-files.c
+++ b/diff-files.c
@@ -126,7 +126,8 @@
continue;
oldmode = ntohl(ce->ce_mode);
- mode = S_IFREG | ce_permissions(st.st_mode);
+ mode = (S_ISLNK(st.st_mode) ? S_IFLNK :
+ S_IFREG | ce_permissions(st.st_mode));
show_modified(oldmode, mode, ce->sha1, null_sha1,
ce->name);
------------------------------------------------
^ permalink raw reply
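The effect of the diff-files fix can be illustrated in shell. This is a sketch under assumptions: worktree_mode is a hypothetical helper, it prints the octal mode strings that appear in git-ls-files --stage output (100644, 100755, 120000), and it approximates ce_permissions() with test -x, which asks whether the current user may execute the file rather than inspecting st_mode directly.

```shell
#!/bin/sh
# Report the mode a work tree path would be compared with.
# Checking for a symlink first is the point of the fix: before it,
# every path was treated as a regular file.
worktree_mode () {
	if [ -L "$1" ]
	then
		echo 120000
	elif [ -x "$1" ]
	then
		echo 100755
	else
		echo 100644
	fi
}

# Small demonstration in a scratch directory.
tmp=$(mktemp -d) || exit 1
echo "hello path0" >"$tmp/path0"
ln -s "hello path0" "$tmp/path0sym"
worktree_mode "$tmp/path0"
worktree_mode "$tmp/path0sym"
rm -rf "$tmp"
```

The symlink test has to come first because test -x follows symlinks and would otherwise report on the link target.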
* [PATCH 0/3] Core GIT fixes and additions.
From: Junio C Hamano @ 2005-05-13 0:14 UTC (permalink / raw)
To: Petr Baudis; +Cc: git
In-Reply-To: <20050512192941.GC324@pasky.ji.cz>
Pasky,
I am sending you three patches rediffed against the tip
of git-pb tree for inclusion.
* [PATCH 1/3] Introduce "rev-list --stop-at=<commit>".
* [PATCH 2/3] Support symlinks in git-ls-files --others.
* [PATCH 3/3] Add git-ls-files -k.
The first one is independent from other two. 3/3 touches the
same file as 2/3.
^ permalink raw reply
* [PATCH 1/3] Introduce "rev-list --stop-at=<commit>".
From: Junio C Hamano @ 2005-05-13 0:15 UTC (permalink / raw)
To: Petr Baudis; +Cc: git
An additional option, --stop-at=<commit>, is introduced. The
git-rev-list output stops just before showing the named commit.
This is based on Thomas Gleixner's patch but slightly reworked.
Signed-off-by: Junio C Hamano <junkio@cox.net>
---
Documentation/git-rev-list.txt | 18 +++++++++++++++++-
rev-list.c | 20 ++++++++++++++++----
2 files changed, 33 insertions(+), 5 deletions(-)
--- a/Documentation/git-rev-list.txt
+++ b/Documentation/git-rev-list.txt
@@ -9,7 +9,10 @@
SYNOPSIS
--------
-'git-rev-list' <commit>
+'git-rev-list' [--max-count=<number>]
+ [--max-age=<unixtime>]
+ [--min-age=<unixtime>]
+ [--stop-at=<commit>] <commit>
DESCRIPTION
-----------
@@ -17,6 +20,19 @@
given commit, taking ancestry relationship into account. This is
useful to produce human-readable log output.
+OPTIONS
+-------
+--max-count=<number>::
+ Stop after showing <number> commits.
+
+--max-age=<unixtime>::
+ Stop after showing commit made before <unixtime>.
+
+--min-age=<unixtime>::
+ Skip until commit made before <unixtime>.
+
+--stop-at=<commit>::
+ Stop just before showing <commit>.
Author
------
--- a/rev-list.c
+++ b/rev-list.c
@@ -1,12 +1,21 @@
#include "cache.h"
#include "commit.h"
+static const char *rev_list_usage =
+"usage: rev-list [OPTION] commit-id\n"
+" --max-count=nr\n"
+" --max-age=epoch\n"
+" --min-age=epoch\n"
+" --stop-at=commit\n";
+
int main(int argc, char **argv)
{
unsigned char sha1[20];
struct commit_list *list = NULL;
struct commit *commit;
char *commit_arg = NULL;
+ unsigned char stop_at[20];
+ int has_stop_at = 0;
int i;
unsigned long max_age = -1;
unsigned long min_age = -1;
@@ -21,16 +30,17 @@
max_age = atoi(arg + 10);
} else if (!strncmp(arg, "--min-age=", 10)) {
min_age = atoi(arg + 10);
+ } else if (!strncmp(arg, "--stop-at=", 10)) {
+ if (get_sha1(arg + 10, stop_at))
+ usage(rev_list_usage);
+ has_stop_at = 1;
} else {
commit_arg = arg;
}
}
if (!commit_arg || get_sha1(commit_arg, sha1))
- usage("usage: rev-list [OPTION] commit-id\n"
- " --max-count=nr\n"
- " --max-age=epoch\n"
- " --min-age=epoch\n");
+ usage(rev_list_usage);
commit = lookup_commit(sha1);
if (!commit || parse_commit(commit) < 0)
@@ -46,6 +56,8 @@
break;
if (max_count != -1 && !max_count--)
break;
+ if (has_stop_at && !memcmp(stop_at, commit->object.sha1, 20))
+ break;
printf("%s\n", sha1_to_hex(commit->object.sha1));
} while (list);
return 0;
------------------------------------------------
^ permalink raw reply
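The behaviour of --stop-at can be pictured as a filter over the list of commit ids that git-rev-list would otherwise print. The following is a sketch only: stop_before is a hypothetical helper, and the ids c4..c1 are placeholders for real SHA-1 names. It mirrors the comparison added to the output loop, where the check happens before the printf, so the named commit itself is never shown.

```shell
#!/bin/sh
# $1 = commit id to stop at; candidate ids arrive one per line on stdin.
stop_before () {
	while read -r id
	do
		# Same order as the patched loop: test first, print afterwards.
		[ "$id" = "$1" ] && break
		printf '%s\n' "$id"
	done
}

# Newest first, as rev-list emits them.
printf '%s\n' c4 c3 c2 c1 | stop_before c2
```

If the named commit never appears in the stream, everything is printed, which matches the patched loop simply running until the list is exhausted.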
* [PATCH 2/3] Support symlinks in git-ls-files --others.
From: Junio C Hamano @ 2005-05-13 0:16 UTC (permalink / raw)
To: Petr Baudis; +Cc: git
It is kind of surprising that this was missed in the last round,
but the work tree scanner in git-ls-files was still deliberately
ignoring symlinks. This patch fixes it, so that --others will
correctly report unregistered symlinks.
Signed-off-by: Junio C Hamano <junkio@cox.net>
---
cache.h | 1 +
ls-files.c | 8 +++++---
2 files changed, 6 insertions(+), 3 deletions(-)
--- a/cache.h
+++ b/cache.h
@@ -27,6 +27,7 @@
#define DT_UNKNOWN 0
#define DT_DIR 1
#define DT_REG 2
+#define DT_LNK 3
#define DTYPE(de) DT_UNKNOWN
#endif
--- a/ls-files.c
+++ b/ls-files.c
@@ -109,8 +109,9 @@
/*
* Read a directory tree. We currently ignore anything but
- * directories and regular files. That's because git doesn't
- * handle them at all yet. Maybe that will change some day.
+ * directories, regular files and symlinks. That's because git
+ * doesn't handle them at all yet. Maybe that will change some
+ * day.
*
* Also, we currently ignore all names starting with a dot.
* That likely will not change.
@@ -141,7 +142,7 @@
case DT_UNKNOWN:
if (lstat(fullname, &st))
continue;
- if (S_ISREG(st.st_mode))
+ if (S_ISREG(st.st_mode) || S_ISLNK(st.st_mode))
break;
if (!S_ISDIR(st.st_mode))
continue;
@@ -152,6 +153,7 @@
baselen + len + 1);
continue;
case DT_REG:
+ case DT_LNK:
break;
}
add_name(fullname, baselen + len);
------------------------------------------------
^ permalink raw reply
* [PATCH 3/3] Add git-ls-files -k.
From: Junio C Hamano @ 2005-05-13 0:17 UTC (permalink / raw)
To: Petr Baudis; +Cc: git
When checkout-cache attempts to check out a non-directory where
a directory exists on the work tree, or to check out a file
under directory D when path D is a non-directory on the work
tree, the attempt fails. Before running checkout-cache, the
user can run git-ls-files with the -k (killed) option to get a
list of such paths. The tagged output format uses "K" to denote
them. This lets the Porcelain layer be careful when
dealing with the recently corrected behaviour of checkout-cache.
Signed-off-by: Junio C Hamano <junkio@cox.net>
---
*** This should be considered a companion patch to the
*** checkout-cache fix you merged recently. An extra safety
*** net just like git-check-files.
Documentation/git-ls-files.txt | 10 +++-
ls-files.c | 101 ++++++++++++++++++++++++++++++++++-------
2 files changed, 92 insertions(+), 19 deletions(-)
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -10,8 +10,8 @@
SYNOPSIS
--------
'git-ls-files' [-z] [-t]
- (--[cached|deleted|others|ignored|stage|unmerged])\*
- (-[c|d|o|i|s|u])\*
+ (--[cached|deleted|others|ignored|stage|unmerged|killed])\*
+ (-[c|d|o|i|s|u|k])\*
[-x <pattern>|--exclude=<pattern>]
[-X <file>|--exclude-from=<file>]
@@ -45,6 +45,11 @@
-u|--unmerged::
Show unmerged files in the output (forces --stage)
+-k|--killed::
+ Show files on the filesystem that need to be removed due
+ to file/directory conflicts for checkout-cache to
+ succeed.
+
-z::
\0 line termination on output
@@ -65,6 +70,7 @@
H cached
M unmerged
R removed/deleted
+ K to be killed
? other
Output
--- a/ls-files.c
+++ b/ls-files.c
@@ -16,12 +16,14 @@
static int show_ignored = 0;
static int show_stage = 0;
static int show_unmerged = 0;
+static int show_killed = 0;
static int line_terminator = '\n';
static const char *tag_cached = "";
static const char *tag_unmerged = "";
static const char *tag_removed = "";
static const char *tag_other = "";
+static const char *tag_killed = "";
static int nr_excludes;
static const char **excludes;
@@ -87,24 +89,30 @@
return 0;
}
-static const char **dir;
+struct nond_on_fs {
+ int len;
+ char name[0];
+};
+
+static struct nond_on_fs **dir;
static int nr_dir;
static int dir_alloc;
static void add_name(const char *pathname, int len)
{
- char *name;
+ struct nond_on_fs *ent;
if (cache_name_pos(pathname, len) >= 0)
return;
if (nr_dir == dir_alloc) {
dir_alloc = alloc_nr(dir_alloc);
- dir = xrealloc(dir, dir_alloc*sizeof(char *));
+ dir = xrealloc(dir, dir_alloc*sizeof(ent));
}
- name = xmalloc(len + 1);
- memcpy(name, pathname, len + 1);
- dir[nr_dir++] = name;
+ ent = xmalloc(sizeof(*ent) + len + 1);
+ ent->len = len;
+ memcpy(ent->name, pathname, len + 1);
+ dir[nr_dir++] = ent;
}
/*
@@ -164,11 +172,62 @@
static int cmp_name(const void *p1, const void *p2)
{
- const char *n1 = *(const char **)p1;
- const char *n2 = *(const char **)p2;
- int l1 = strlen(n1), l2 = strlen(n2);
+ const struct nond_on_fs *e1 = *(const struct nond_on_fs **)p1;
+ const struct nond_on_fs *e2 = *(const struct nond_on_fs **)p2;
+
+ return cache_name_compare(e1->name, e1->len,
+ e2->name, e2->len);
+}
- return cache_name_compare(n1, l1, n2, l2);
+static void show_killed_files()
+{
+ int i;
+ for (i = 0; i < nr_dir; i++) {
+ struct nond_on_fs *ent = dir[i];
+ char *cp, *sp;
+ int pos, len, killed = 0;
+
+ for (cp = ent->name; cp - ent->name < ent->len; cp = sp + 1) {
+ sp = strchr(cp, '/');
+ if (!sp) {
+ /* If ent->name is prefix of an entry in the
+ * cache, it will be killed.
+ */
+ pos = cache_name_pos(ent->name, ent->len);
+ if (0 <= pos)
+ die("bug in show-killed-files");
+ pos = -pos - 1;
+ while (pos < active_nr &&
+ ce_stage(active_cache[pos]))
+ pos++; /* skip unmerged */
+ if (active_nr <= pos)
+ break;
+ /* pos points at a name immediately after
+ * ent->name in the cache. Does it expect
+ * ent->name to be a directory?
+ */
+ len = ce_namelen(active_cache[pos]);
+ if ((ent->len < len) &&
+ !strncmp(active_cache[pos]->name,
+ ent->name, ent->len) &&
+ active_cache[pos]->name[ent->len] == '/')
+ killed = 1;
+ break;
+ }
+ if (0 <= cache_name_pos(ent->name, sp - ent->name)) {
+ /* If any of the leading directories in
+ * ent->name is registered in the cache,
+ * ent->name will be killed.
+ */
+ killed = 1;
+ break;
+ }
+ }
+ if (killed)
+ printf("%s%.*s%c", tag_killed,
+ dir[i]->len, dir[i]->name,
+ line_terminator);
+ }
}
static void show_files(void)
@@ -176,11 +235,16 @@
int i;
/* For cached/deleted files we don't need to even do the readdir */
- if (show_others) {
+ if (show_others || show_killed) {
read_directory(".", "", 0);
- qsort(dir, nr_dir, sizeof(char *), cmp_name);
- for (i = 0; i < nr_dir; i++)
- printf("%s%s%c", tag_other, dir[i], line_terminator);
+ qsort(dir, nr_dir, sizeof(struct nond_on_fs *), cmp_name);
+ if (show_others)
+ for (i = 0; i < nr_dir; i++)
+ printf("%s%.*s%c", tag_other,
+ dir[i]->len, dir[i]->name,
+ line_terminator);
+ if (show_killed)
+ show_killed_files();
}
if (show_cached | show_stage) {
for (i = 0; i < active_nr; i++) {
@@ -219,8 +283,8 @@
}
static const char *ls_files_usage =
- "ls-files [-z] [-t] (--[cached|deleted|others|stage|unmerged])* "
- "[ --ignored [--exclude=<pattern>] [--exclude-from=<file>) ]";
+"ls-files [-z] [-t] (--[cached|deleted|others|stage|unmerged|killed])* "
+"[ --ignored [--exclude=<pattern>] [--exclude-from=<file>) ]";
int main(int argc, char **argv)
{
@@ -236,6 +300,7 @@
tag_unmerged = "M ";
tag_removed = "R ";
tag_other = "? ";
+ tag_killed = "K ";
} else if (!strcmp(arg, "-c") || !strcmp(arg, "--cached")) {
show_cached = 1;
} else if (!strcmp(arg, "-d") || !strcmp(arg, "--deleted")) {
@@ -246,6 +311,8 @@
show_ignored = 1;
} else if (!strcmp(arg, "-s") || !strcmp(arg, "--stage")) {
show_stage = 1;
+ } else if (!strcmp(arg, "-k") || !strcmp(arg, "--killed")) {
+ show_killed = 1;
} else if (!strcmp(arg, "-u") || !strcmp(arg, "--unmerged")) {
/* There's no point in showing unmerged unless
* you also show the stage information.
@@ -271,7 +338,7 @@
}
/* With no flags, we default to showing the cached files */
- if (!(show_stage | show_deleted | show_others | show_unmerged))
+ if (!(show_stage | show_deleted | show_others | show_unmerged | show_killed))
show_cached = 1;
read_cache();
------------------------------------------------
^ permalink raw reply
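The two conditions that make a work tree path "killed" can be sketched in shell. This is an illustration under assumptions, not the algorithm from the patch: is_killed and in_cache are hypothetical helpers, the cache is a plain list of names without whitespace or glob characters, and linear scans replace the sorted lookup done by cache_name_pos(). Unmerged entries, which the patch skips, are not modelled.

```shell
#!/bin/sh
# Stand-in for the registered cache entries.
cache_paths='path0
path2/file2
path3'

in_cache () {
	for entry in $cache_paths
	do
		[ "$entry" = "$1" ] && return 0
	done
	return 1
}

# $1 = a non-directory on the filesystem that is not itself in the cache.
is_killed () {
	p=$1
	# Case 1: a leading directory of the path is registered in the
	# cache, so the cache wants a non-directory where a directory is.
	dir=$p
	while :
	do
		case "$dir" in
		*/*) dir=${dir%/*} ;;
		*)   break ;;
		esac
		in_cache "$dir" && return 0
	done
	# Case 2: the cache has entries below the path, so it expects
	# the path to be a directory.
	for entry in $cache_paths
	do
		case "$entry" in
		"$p"/*) return 0 ;;
		esac
	done
	return 1
}

for f in path0/extra path2 unrelated
do
	if is_killed "$f"
	then
		echo "K $f"
	fi
done
```

The loop at the end prints the "K " tag used by the tagged output format for each path that would have to be removed before checkout-cache could succeed.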
* [PATCH 1/2] Add test suite and an example.
From: Junio C Hamano @ 2005-05-13 0:20 UTC (permalink / raw)
To: Petr Baudis; +Cc: git
This adds a t/ directory to host the test suite, a test helper
library, and a basic set of tests that has already helped
find a bug introduced when we added symlink support.
Signed-off-by: Junio C Hamano <junkio@cox.net>
---
*** Note: please make sure to chmod +x t*.sh after applying, before committing.
t/t0000-basic.sh | 133 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
t/test-lib.sh | 118 ++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 251 insertions(+)
t/t0000-basic.sh (. --> 100755)
t/test-lib.sh (. --> 100755)
--- a/t/t0000-basic.sh
+++ b/t/t0000-basic.sh
@@ -0,0 +1,133 @@
+#!/bin/sh
+#
+# Copyright (c) 2005 Junio C Hamano
+#
+
+test_description='Test the very basics part #1.
+
+The rest of the test suite does not check the basic operation of git
+plumbing commands to work very carefully. Their job is to concentrate
+on tricky features that caused bugs in the past to detect regression.
+
+This test runs very basic features, like registering things in cache,
+writing tree, etc.
+
+Note that this test *deliberately* hard-codes many expected object
+IDs. When object ID computation changes, like in the previous case of
+swapping compression and hashing order, the person who is making the
+modification *should* take notice and update the test vectors here.
+'
+. ./test-lib.sh
+
+################################################################
+# init-db has been done in an empty repository.
+# make sure it is empty.
+
+find .git/objects -type f -print >should-be-empty
+test_expect_success 'cmp -s /dev/null should-be-empty'
+
+# also it should have 256 subdirectories. 257 is counting "objects"
+find .git/objects -type d -print >full-of-directories
+test_expect_success 'test "$(wc -l full-of-directories | sed -e "s/ .*//")" = 257'
+
+################################################################
+# Basics of the basics
+
+# updating a new file without --add should fail.
+test_expect_failure 'git-update-cache should-be-empty'
+
+# and with --add it should succeed, even if it is empty (it used to fail).
+test_expect_success 'git-update-cache --add should-be-empty'
+
+test_expect_success 'tree=$(git-write-tree)'
+
+# we know the shape and contents of the tree and know the object ID for it.
+test_expect_success 'test "$tree" = 7bb943559a305bdd6bdee2cef6e5df2413c3d30a'
+
+# Removing paths.
+rm -f should-be-empty full-of-directories
+test_expect_failure 'git-update-cache should-be-empty'
+test_expect_success 'git-update-cache --remove should-be-empty'
+
+# Empty tree can be written with recent write-tree.
+test_expect_success 'tree=$(git-write-tree)'
+test_expect_success 'test "$tree" = 4b825dc642cb6eb9a060e54bf8d69288fbee4904'
+
+# Various types of objects
+mkdir path2 path3 path3/subp3
+for p in path0 path2/file2 path3/file3 path3/subp3/file3
+do
+ echo "hello $p" >$p
+ ln -s "hello $p" ${p}sym
+done
+test_expect_success 'find path* ! -type d -print0 | xargs -0 -r git-update-cache --add'
+
+# Show them and see that matches what we expect.
+test_expect_success 'git-ls-files --stage >current'
+
+cat >expected <<\EOF
+100644 f87290f8eb2cbbea7857214459a0739927eab154 0 path0
+120000 15a98433ae33114b085f3eb3bb03b832b3180a01 0 path0sym
+100644 3feff949ed00a62d9f7af97c15cd8a30595e7ac7 0 path2/file2
+120000 d8ce161addc5173867a3c3c730924388daedbc38 0 path2/file2sym
+100644 0aa34cae68d0878578ad119c86ca2b5ed5b28376 0 path3/file3
+120000 8599103969b43aff7e430efea79ca4636466794f 0 path3/file3sym
+100644 00fb5908cb97c2564a9783c0c64087333b3b464f 0 path3/subp3/file3
+120000 6649a1ebe9e9f1c553b66f5a6e74136a07ccc57c 0 path3/subp3/file3sym
+EOF
+test_expect_success 'diff current expected'
+
+test_expect_success 'tree=$(git-write-tree)'
+test_expect_success 'test "$tree" = 087704a96baf1c2d1c869a8b084481e121c88b5b'
+
+test_expect_success 'git-ls-tree $tree >current'
+cat >expected <<\EOF
+100644 blob f87290f8eb2cbbea7857214459a0739927eab154 path0
+120000 blob 15a98433ae33114b085f3eb3bb03b832b3180a01 path0sym
+040000 tree 58a09c23e2ca152193f2786e06986b7b6712bdbe path2
+040000 tree 21ae8269cacbe57ae09138dcc3a2887f904d02b3 path3
+EOF
+test_expect_success 'diff current expected'
+
+test_expect_success 'git-ls-tree -r $tree >current'
+cat >expected <<\EOF
+100644 blob f87290f8eb2cbbea7857214459a0739927eab154 path0
+120000 blob 15a98433ae33114b085f3eb3bb03b832b3180a01 path0sym
+040000 tree 58a09c23e2ca152193f2786e06986b7b6712bdbe path2
+100644 blob 3feff949ed00a62d9f7af97c15cd8a30595e7ac7 path2/file2
+120000 blob d8ce161addc5173867a3c3c730924388daedbc38 path2/file2sym
+040000 tree 21ae8269cacbe57ae09138dcc3a2887f904d02b3 path3
+100644 blob 0aa34cae68d0878578ad119c86ca2b5ed5b28376 path3/file3
+120000 blob 8599103969b43aff7e430efea79ca4636466794f path3/file3sym
+040000 tree 3c5e5399f3a333eddecce7a9b9465b63f65f51e2 path3/subp3
+100644 blob 00fb5908cb97c2564a9783c0c64087333b3b464f path3/subp3/file3
+120000 blob 6649a1ebe9e9f1c553b66f5a6e74136a07ccc57c path3/subp3/file3sym
+EOF
+test_expect_success 'diff current expected'
+
+################################################################
+# read-tree followed by write-tree should be idempotent
+
+rm .git/index
+test_expect_success 'git-read-tree $tree &&
+test -f .git/index &&
+newtree=$(git-write-tree) &&
+test "$newtree" = "$tree"'
+
+cat >expected <<\EOF
+*100644->100644 blob f87290f8eb2cbbea7857214459a0739927eab154->0000000000000000000000000000000000000000 path0
+*120000->120000 blob 15a98433ae33114b085f3eb3bb03b832b3180a01->0000000000000000000000000000000000000000 path0sym
+*100644->100644 blob 3feff949ed00a62d9f7af97c15cd8a30595e7ac7->0000000000000000000000000000000000000000 path2/file2
+*120000->120000 blob d8ce161addc5173867a3c3c730924388daedbc38->0000000000000000000000000000000000000000 path2/file2sym
+*100644->100644 blob 0aa34cae68d0878578ad119c86ca2b5ed5b28376->0000000000000000000000000000000000000000 path3/file3
+*120000->120000 blob 8599103969b43aff7e430efea79ca4636466794f->0000000000000000000000000000000000000000 path3/file3sym
+*100644->100644 blob 00fb5908cb97c2564a9783c0c64087333b3b464f->0000000000000000000000000000000000000000 path3/subp3/file3
+*120000->120000 blob 6649a1ebe9e9f1c553b66f5a6e74136a07ccc57c->0000000000000000000000000000000000000000 path3/subp3/file3sym
+EOF
+test_expect_success 'git-diff-files >current && cmp -s current expected'
+
+test_expect_success 'git-update-cache --refresh'
+
+test_expect_success 'git-diff-files >current && cmp -s current /dev/null'
+
+test_done
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -0,0 +1,118 @@
+#!/bin/sh
+#
+# Copyright (c) 2005 Junio C Hamano
+#
+
+# For repeatability, reset the environment to known value.
+LANG=C
+TZ=UTC
+export LANG TZ
+unset AUTHOR_DATE
+unset AUTHOR_EMAIL
+unset AUTHOR_NAME
+unset COMMIT_AUTHOR_EMAIL
+unset COMMIT_AUTHOR_NAME
+unset GIT_ALTERNATE_OBJECT_DIRECTORIES
+unset GIT_AUTHOR_DATE
+unset GIT_AUTHOR_EMAIL
+unset GIT_AUTHOR_NAME
+unset GIT_COMMITTER_EMAIL
+unset GIT_COMMITTER_NAME
+unset GIT_DIFF_OPTS
+unset GIT_DIR
+unset GIT_EXTERNAL_DIFF
+unset GIT_INDEX_FILE
+unset GIT_OBJECT_DIRECTORY
+unset SHA1_FILE_DIRECTORIES
+unset SHA1_FILE_DIRECTORY
+
+# Each test should start with something like this, after copyright notices:
+#
+# test_description='Description of this test...
+# This test checks if command xyzzy does the right thing...
+# '
+# . ./test-lib.sh
+
+error () {
+ echo >&2 "* error: $*"
+ exit 1
+}
+
+say () {
+ echo "* $*"
+}
+
+case "${test_description}" in
+'')
+ error "test script did not set test_description." ;;
+esac
+
+while case "$#" in 0) break;; esac
+do
+ case "$1" in
+ -d|--d|--de|--deb|--debu|--debug)
+ debug=t; shift ;;
+ -h|--h|--he|--hel|--help)
+ say "$test_description"
+ exit 0
+ ;;
+ *)
+ break ;;
+ esac
+done
+test_failure=0
+
+test_debug () {
+ case "$debug" in '') ;; ?*) eval "$*" ;; esac
+}
+
+test_ok () {
+ say "$@"
+}
+
+test_failure () {
+ say "***BAD*** $@"
+ test_failure=1;
+}
+
+test_expect_failure () {
+ say "expecting failure: $1"
+ eval "$1"
+ case $? in
+ 0) test_failure "did not fail as expected." ;;
+ *) test_ok "failed as expected." ;;
+ esac
+}
+
+test_expect_success () {
+ say "expecting success: $1"
+ eval "$1"
+ case $? in
+ 0) test_ok "succeeded as expected." ;;
+ *) test_failure "did not succeed as expected." ;;
+ esac
+}
+
+test_done () {
+ case "$test_failure" in
+ 0)
+ # we could:
+ # cd .. && rm -fr trash
+ # but that means we forbid any tests that use their own
+ # subdirectory from calling test_done without coming back
+ # to where they started from.
+ exit 0 ;;
+ *) exit 1 ;;
+ esac
+}
+
+# Test the binaries we have just built. The tests are kept in
+# t/ subdirectory and are run in trash subdirectory.
+PATH=$(pwd)/..:$PATH
+
+# Test repository
+test=trash
+rm -fr "$test"
+mkdir "$test"
+cd "$test"
+git-init-db 2>/dev/null || error "cannot run git-init-db"
------------------------------------------------
^ permalink raw reply
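The way a test script drives the helpers can be tried in isolation. The following sketch carries trimmed stand-ins for say, test_expect_success and test_expect_failure so that it runs without the git tree; the real test-lib.sh also resets the environment, parses options and creates the trash repository, none of which is reproduced here. The two commands being tested are arbitrary examples, not tests from the patch.

```shell
#!/bin/sh
# Trimmed stand-ins for the helpers in t/test-lib.sh.
test_failure=0

say () {
	echo "* $*"
}

test_expect_success () {
	say "expecting success: $1"
	if eval "$1"
	then
		say "succeeded as expected."
	else
		say "***BAD*** did not succeed as expected."
		test_failure=1
	fi
}

test_expect_failure () {
	say "expecting failure: $1"
	if eval "$1"
	then
		say "***BAD*** did not fail as expected."
		test_failure=1
	else
		say "failed as expected."
	fi
}

# Each test is a single shell string that the helper evals, so
# variables set inside it remain visible to later tests.
test_expect_success 'greeting=$(echo hello)'
test_expect_success 'test "$greeting" = hello'
test_expect_failure 'test -f /nonexistent/should-be-missing'

echo "test_failure=$test_failure"
```

Passing the command as a string rather than running it directly is what lets a line such as 'tree=$(git-write-tree)' in t0000-basic.sh set a variable that the following test then compares against a known object ID.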
* [PATCH 2/2] Test recent additions to core GIT.
From: Junio C Hamano @ 2005-05-13 0:22 UTC (permalink / raw)
To: Petr Baudis; +Cc: git
This is a set of tests I used to verify recently made changes.
Signed-off-by: Junio C Hamano <junkio@cox.net>
---
*** Note: Please do not forget chmod +x t/t*.sh after applying
*** before committing.
t/t0100-environment-names.sh | 84 +++++++++++++++++++++++++++++++++++++++++++
t/t0200-update-cache.sh | 47 ++++++++++++++++++++++++
t/t0400-ls-files.sh | 29 ++++++++++++++
t/t0500-ls-files.sh | 55 ++++++++++++++++++++++++++++
t/t1000-checkout-cache.sh | 54 +++++++++++++++++++++++++++
t/t1001-checkout-cache.sh | 76 ++++++++++++++++++++++++++++++++++++++
6 files changed, 345 insertions(+)
t/t0100-environment-names.sh (. --> 100755)
t/t0200-update-cache.sh (. --> 100755)
t/t0400-ls-files.sh (. --> 100755)
t/t0500-ls-files.sh (. --> 100755)
t/t1000-checkout-cache.sh (. --> 100755)
t/t1001-checkout-cache.sh (. --> 100755)
--- a/t/t0100-environment-names.sh
+++ b/t/t0100-environment-names.sh
@@ -0,0 +1,84 @@
+#!/bin/sh
+#
+# Copyright (c) 2005 Junio C Hamano
+#
+
+test_description='general environment name warning test.
+
+This test makes sure that use of deprecated environment variables
+trigger the warnings from gitenv().'
+
+env_vars='GIT_AUTHOR_DATE:AUTHOR_DATE
+GIT_AUTHOR_EMAIL:AUTHOR_EMAIL
+GIT_AUTHOR_NAME:AUTHOR_NAME
+GIT_COMMITTER_EMAIL:COMMIT_AUTHOR_EMAIL
+GIT_COMMITTER_NAME:COMMIT_AUTHOR_NAME
+GIT_ALTERNATE_OBJECT_DIRECTORIES:SHA1_FILE_DIRECTORIES
+GIT_OBJECT_DIRECTORY:SHA1_FILE_DIRECTORY
+'
+
+. ./test-lib.sh
+
+export_them () {
+ for ev in $env_vars
+ do
+ new=$(expr "$ev" : '\(.*\):')
+ old=$(expr "$ev" : '.*:\(.*\)')
+ # Build and eval the following:
+ # case "${VAR+set}" in set) export VAR;; esac
+ evstr='case "${'$new'+set}" in set) export '$new';; esac'
+ eval "$evstr"
+ evstr='case "${'$old'+set}" in set) export '$old';; esac'
+ eval "$evstr"
+ done
+}
+
+date >path0
+git-update-cache --add path0
+tree=$(git-write-tree)
+
+AUTHOR_DATE='Wed May 11 23:55:18 2005'
+AUTHOR_EMAIL='author@example.xz'
+AUTHOR_NAME='A U Thor'
+COMMIT_AUTHOR_EMAIL='author@example.xz'
+COMMIT_AUTHOR_NAME='A U Thor'
+SHA1_FILE_DIRECTORY=.git/objects
+
+export_them
+
+test_debug 'echo with only old variables exported.'
+
+echo 'foo' | git-commit-tree $tree >/dev/null 2>errmsg
+cat >expected-err <<\EOF
+warning: Attempting to use SHA1_FILE_DIRECTORY
+warning: GIT environment variables have been renamed.
+warning: Please adjust your scripts and environment.
+warning: old AUTHOR_DATE => new GIT_AUTHOR_DATE
+warning: old AUTHOR_EMAIL => new GIT_AUTHOR_EMAIL
+warning: old AUTHOR_NAME => new GIT_AUTHOR_NAME
+warning: old COMMIT_AUTHOR_EMAIL => new GIT_COMMITTER_EMAIL
+warning: old COMMIT_AUTHOR_NAME => new GIT_COMMITTER_NAME
+warning: old SHA1_FILE_DIRECTORY => new GIT_OBJECT_DIRECTORY
+EOF
+sed -ne '/^warning: /p' <errmsg >generated-err
+test_debug 'cat errmsg'
+test_expect_success 'cmp generated-err expected-err'
+
+test_debug 'echo with new variables exported.'
+
+for ev in $env_vars
+do
+ new=$(expr "$ev" : '\(.*\):')
+ old=$(expr "$ev" : '.*:\(.*\)')
+ # Build and eval the following:
+ # NEWENV=$OLDENV
+ evstr="$new=\$$old"
+ eval "$evstr"
+done
+export_them
+echo 'foo' | git-commit-tree $tree >/dev/null 2>errmsg
+sed -ne '/^warning: /p' <errmsg >generated-err
+test_debug 'cat errmsg'
+test_expect_success 'cmp generated-err /dev/null'
+
+test_done
--- a/t/t0200-update-cache.sh
+++ b/t/t0200-update-cache.sh
@@ -0,0 +1,47 @@
+#!/bin/sh
+#
+# Copyright (c) 2005 Junio C Hamano
+#
+
+test_description='git-update-cache nonsense-path test.
+
+This test creates the following structure in the cache:
+
+ path0 - a file
+ path1 - a symlink
+ path2/file2 - a file in a directory
+ path3/file3 - a file in a directory
+
+and tries to git-update-cache --add the following:
+
+ path0/file0 - a file in a directory
+ path1/file1 - a file in a directory
+ path2 - a file
+ path3 - a symlink
+
+All of the attempts should fail.
+'
+
+. ./test-lib.sh
+
+mkdir path2 path3
+date >path0
+ln -s xyzzy path1
+date >path2/file2
+date >path3/file3
+
+git-update-cache --add -- path0 path1 path2/file2 path3/file3
+
+rm -fr path?
+
+mkdir path0 path1
+date >path2
+ln -s frotz path3
+date >path0/file0
+date >path1/file1
+
+for p in path0/file0 path1/file1 path2 path3
+do
+ test_expect_failure "git-update-cache --add -- $p"
+done
+test_done
--- a/t/t0400-ls-files.sh
+++ b/t/t0400-ls-files.sh
@@ -0,0 +1,29 @@
+#!/bin/sh
+#
+# Copyright (c) 2005 Junio C Hamano
+#
+
+test_description='git-ls-files test (--others should pick up symlinks).
+
+This test runs git-ls-files --others with the following on the
+filesystem.
+
+ path0 - a file
+ path1 - a symlink
+ path2/file2 - a file in a directory
+'
+. ./test-lib.sh
+
+date >path0
+ln -s xyzzy path1
+mkdir path2
+date >path2/file2
+git-ls-files --others >.output
+cat >.expected <<EOF
+path0
+path1
+path2/file2
+EOF
+
+test_expect_success 'diff .output .expected'
+test_done
--- a/t/t0500-ls-files.sh
+++ b/t/t0500-ls-files.sh
@@ -0,0 +1,55 @@
+#!/bin/sh
+#
+# Copyright (c) 2005 Junio C Hamano
+#
+
+test_description='git-ls-files -k flag test.
+
+This test prepares the following in the cache:
+
+ path0 - a file
+ path1 - a symlink
+ path2/file2 - a file in a directory
+ path3/file3 - a file in a directory
+
+and the following on the filesystem:
+
+ path0/file0 - a file in a directory
+ path1/file1 - a file in a directory
+ path2 - a file
+ path3 - a symlink
+ path4 - a file
+ path5 - a symlink
+ path6/file6 - a file in a directory
+
+git-ls-files -k should report all existing filesystem objects
+except path4, path5 and path6/file6 as ones to be killed.
+'
+. ./test-lib.sh
+
+date >path0
+ln -s xyzzy path1
+mkdir path2 path3
+date >path2/file2
+date >path3/file3
+git-update-cache --add -- path0 path1 path?/file?
+
+rm -fr path?
+date >path2
+ln -s frotz path3
+ln -s nitfol path5
+mkdir path0 path1 path6
+date >path0/file0
+date >path1/file1
+date >path6/file6
+
+git-ls-files -k >.output
+cat >.expected <<EOF
+path0/file0
+path1/file1
+path2
+path3
+EOF
+
+test_expect_success 'diff .output .expected'
+test_done
--- a/t/t1000-checkout-cache.sh
+++ b/t/t1000-checkout-cache.sh
@@ -0,0 +1,54 @@
+#!/bin/sh
+#
+# Copyright (c) 2005 Junio C Hamano
+#
+
+test_description='git-checkout-cache test.
+
+This test registers the following filesystem structure in the
+cache:
+
+ path0 - a file
+ path1/file1 - a file in a directory
+
+And then tries to checkout in a work tree that has the following:
+
+ path0/file0 - a file in a directory
+ path1 - a file
+
+The git-checkout-cache command should fail when attempting to checkout
+path0, finding it is occupied by a directory, and path1/file1, finding
+path1 is occupied by a non-directory. With "-f" flag, it should remove
+the conflicting paths and succeed.
+'
+. ./test-lib.sh
+
+date >path0
+mkdir path1
+date >path1/file1
+git-update-cache --add path0 path1/file1
+test_debug 'git-ls-files --stage'
+
+rm -fr path0 path1
+mkdir path0
+date >path0/file0
+date >path1
+test_debug 'git-ls-files --stage'
+test_debug 'find path*'
+
+test_expect_failure 'git-checkout-cache -a'
+test_debug 'find path*'
+
+test_expect_success 'git-checkout-cache -f -a'
+test_debug 'find path*'
+
+if test -f path0 && test -d path1 && test -f path1/file1
+then
+ test_ok "checkout successful"
+else
+ test_failure "checkout failed"
+fi
+
+test_done
+
+
--- a/t/t1001-checkout-cache.sh
+++ b/t/t1001-checkout-cache.sh
@@ -0,0 +1,76 @@
+#!/bin/sh
+#
+# Copyright (c) 2005 Junio C Hamano
+#
+
+test_description='git-checkout-cache test.
+
+This test registers the following filesystem structure in the cache:
+
+ path0/file0 - a file in a directory
+ path1/file1 - a file in a directory
+
+and attempts to check it out when the work tree has:
+
+ path0/file0 - a file in a directory
+ path1 - a symlink pointing at "path0"
+
+Checkout cache should fail to extract path1/file1 because the leading
+path path1 is occupied by a non-directory. With "-f" it should remove
+the symlink path1 and create directory path1 and file path1/file1.
+'
+. ./test-lib.sh
+
+show_files() {
+ # show filesystem files, just [-dl] for type and name
+ find path? -ls |
+ sed -e 's/^[0-9]* * [0-9]* * \([-bcdl]\)[^ ]* *[0-9]* *[^ ]* *[^ ]* *[0-9]* [A-Z][a-z][a-z] [0-9][0-9] [^ ]* /fs: \1 /'
+ # what's in the cache, just mode and name
+ git-ls-files --stage |
+ sed -e 's/^\([0-9]*\) [0-9a-f]* [0-3] /ca: \1 /'
+ # what's in the tree, just mode and name.
+ git-ls-tree -r "$1" |
+ sed -e 's/^\([0-9]*\) [^ ]* [0-9a-f]* /tr: \1 /'
+}
+
+mkdir path0
+date >path0/file0
+git-update-cache --add path0/file0
+tree1=$(git-write-tree)
+test_debug 'show_files $tree1'
+
+mkdir path1
+date >path1/file1
+git-update-cache --add path1/file1
+tree2=$(git-write-tree)
+test_debug 'show_files $tree2'
+
+rm -fr path1
+git-read-tree -m $tree1
+git-checkout-cache -f -a
+test_debug 'show_files $tree1'
+
+ln -s path0 path1
+git-update-cache --add path1
+tree3=$(git-write-tree)
+test_debug 'show_files $tree3'
+
+# Morten says "Got that?" here.
+# Test begins.
+
+git-read-tree $tree2
+test_expect_success 'git-checkout-cache -f -a'
+test_debug 'show_files $tree2'
+
+if test ! -h path0 && test -d path0 &&
+ test ! -h path1 && test -d path1 &&
+ test ! -h path0/file0 && test -f path0/file0 &&
+ test ! -h path1/file1 && test -f path1/file1
+then
+ test_ok "checked out correctly."
+else
+ test_failure "did not check out correctly."
+fi
+
+test_done
+
------------------------------------------------
^ permalink raw reply
* Re: Mercurial 0.4e vs git network pull
From: Daniel Barkalow @ 2005-05-13 0:33 UTC (permalink / raw)
To: Matt Mackall; +Cc: Petr Baudis, linux-kernel, git, mercurial, Linus Torvalds
In-Reply-To: <20050512222943.GI5914@waste.org>
On Thu, 12 May 2005, Matt Mackall wrote:
> On Thu, May 12, 2005 at 05:24:27PM -0400, Daniel Barkalow wrote:
> > On Thu, 12 May 2005, Matt Mackall wrote:
> >
> > > Does this need an HTTP request (and round trip) per object? It appears
> > > to. That's 2200 requests/round trips for my 800 patch benchmark.
> >
> > It requires a request per object, but it should be possible (with
> > somewhat more complicated code) to overlap them such that it doesn't
> > require a serial round trip for each. Since the server is sending static
> > files, the overhead for each should be minimal.
>
> It's not minimal. The size of an HTTP request is often not much
> different than the size of a compressed file delta.
I was thinking of server-side processing overhead, not bandwidth. It's
true that the bandwidth could be noticeable for these small files.
> All the junk that gets bundled in an http request/response will be
> similar in size to the stuff in the third column.
kernel.org seems to send 283-byte responses, to be completely
precise. This could be cut down substantially if Apache were tweaked a bit
to skip all the optional headers which are useless or wrong in this
context. (E.g., that includes sending a content-type of "text/plain" for
the binary data)
> Does it do this recursively? Eg, if the server has 800 new linear
> commits, does the client have to do 800 round trips following parent
> pointers to find all the new changesets?
Yes, although that also includes pulling the commits, and may be
interleaved with pulling the trees and objects to cover the
latency. (I.e., one round trip gets the new head hash; the second gets
that commit; on the third the tree and the parent(s) can be requested at
once; on the fourth the contents of the tree and the grandparents, at
which point the bandwidth will probably be the limiting factor for the
rest of the operation.)
> In this case, Mercurial does about 6 round trips, totalling less than
> 1K, plus one request that pulls everything.
I must be misunderstanding your numbers, because 6 HTTP responses is more
than 1K, ignoring any actual content from the server, and 1K for 800
commits is less than 2 bytes per commit.
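
A minimal sketch of the arithmetic behind these figures, assuming the
283-byte response size observed above applies to every response and using
plain integer math; the helper name overhead is only illustrative:

```shell
#!/bin/sh
# Rough per-response overhead arithmetic using figures quoted in this thread.
# Assumption: every HTTP response carries about 283 bytes of headers.
response_bytes=283

# Total header overhead in bytes for the given number of responses.
overhead () {
	echo $(( $1 * response_bytes ))
}

overhead 6               # six round trips: 1698 bytes, already over 1K
overhead 2200            # one response per object in the 800-patch case: 622600
echo $(( 1024 / 800 ))   # whole bytes per commit if 1K covered 800 commits: 1
```

Only response headers are counted here; request headers would add to the
total.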
I'm also worried about testing on 800 linear commits, since the projects
under consideration tend to have very non-linear histories.
-Daniel
*This .sig left intentionally blank*
^ permalink raw reply
* Re: Mercurial 0.4e vs git network pull
From: Matt Mackall @ 2005-05-13 1:11 UTC (permalink / raw)
To: Daniel Barkalow; +Cc: Petr Baudis, linux-kernel, git, mercurial, Linus Torvalds
In-Reply-To: <Pine.LNX.4.21.0505121949210.30848-100000@iabervon.org>
On Thu, May 12, 2005 at 08:33:56PM -0400, Daniel Barkalow wrote:
> On Thu, 12 May 2005, Matt Mackall wrote:
>
> > On Thu, May 12, 2005 at 05:24:27PM -0400, Daniel Barkalow wrote:
> > > On Thu, 12 May 2005, Matt Mackall wrote:
> > >
> > > > Does this need an HTTP request (and round trip) per object? It appears
> > > > to. That's 2200 requests/round trips for my 800 patch benchmark.
> > >
> > > It requires a request per object, but it should be possible (with
> > > somewhat more complicated code) to overlap them such that it doesn't
> > > require a serial round trip for each. Since the server is sending static
> > > files, the overhead for each should be minimal.
> >
> > It's not minimal. The size of an HTTP request is often not much
> > different than the size of a compressed file delta.
>
> I was thinking of server-side processing overhead, not bandwidth. It's
> true that the bandwidth could be noticeable for these small files.
>
> > All the junk that gets bundled in an http request/response will be
> > similar in size to the stuff in the third column.
>
> kernel.org seems to send 283-byte responses, to be completely
> precise. This could be cut down substantially if Apache were tweaked a bit
> to skip all the optional headers which are useless or wrong in this
> context. (E.g., that includes sending a content-type of "text/plain" for
> the binary data)
>
> > Does it do this recursively? Eg, if the server has 800 new linear
> > commits, does the client have to do 800 round trips following parent
> > pointers to find all the new changesets?
>
> Yes, although that also includes pulling the commits, and may be
> interleaved with pulling the trees and objects to cover the
> latency. (I.e., one round trip gets the new head hash; the second gets
> that commit; on the third the tree and the parent(s) can be requested at
> once; on the fourth the contents of the tree and the grandparents, at
> which point the bandwidth will probably be the limiting factor for the
> rest of the operation.)
What if a changeset is smaller than the bandwidth-delay product of
your link? As an extreme example, Mercurial is currently at a point
where its -entire repo- changegroup (set of all changesets) can be in
flight on the wire on a typical link.
> > In this case, Mercurial does about 6 round trips, totalling less than
> > 1K, plus one request that pulls everything.
>
> I must be misunderstanding your numbers, because 6 HTTP responses is more
> than 1K, ignoring any actual content from the server, and 1K for 800
> commits is less than 2 bytes per commit.
1k of application-level data, sorry. And my whole point is that I
don't send those 800 commit identifiers (which are 40 bytes each as
hex). I send about 30 or so. It's basically a negotiation to find the
earliest commits not known to the client with a minimum of round trips
and data exchange.
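
As a rough illustration of the saving, assuming 40-byte hex identifiers
as stated and ignoring any framing (the helper name ids_cost is
hypothetical):

```shell
#!/bin/sh
# Bytes needed to send a number of commit identifiers as 40-character hex.
id_bytes=40

ids_cost () {
	echo $(( $1 * id_bytes ))
}

ids_cost 800   # listing every new commit: 32000 bytes
ids_cost 30    # the "about 30 or so" sent during negotiation: 1200 bytes
```

The second figure is in the same ballpark as the 1k of application-level
data mentioned above.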
> I'm also worried about testing on 800 linear commits, since the projects
> under consideration tend to have very non-linear histories.
Not true at all. Dumps from Andrew to Linus via patch bombs will
result in runs of hundreds of linear commits on a regular basis.
Linear patch series are the preferred way to make changes and series
of 30 or 40 small patches are not at all uncommon.
--
Mathematics is the supreme nostalgia of our time.
^ permalink raw reply
* Re: [PATCH] [RFD] Add repoid identifier to commit [it's a workspace id, isn't it?]
From: Jon Seymour @ 2005-05-13 1:37 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: tglx, git
In-Reply-To: <428291CD.7010701@zytor.com>
>
> I would like to suggest a few limiters are set on the repoid. In
> particular, I'd like to suggest that a repoid is a UUID, that a file is
> used to track it (.git/repoid), and that if it doesn't exist, a new one
> is created from /dev/urandom.
>
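
The quoted suggestion could be sketched roughly as follows. This is only
an illustration under stated assumptions: the function name get_repoid is
hypothetical, and the id is simply 16 random bytes written as 32 hex
digits rather than a strictly formatted UUID.

```shell
#!/bin/sh
# Print the id stored in the given file (default .git/repoid), creating it
# from /dev/urandom on first use.
# Assumption: 16 random bytes as 32 hex digits; no UUID version/variant
# bits or dashes are produced.
get_repoid () {
	repoid_file="${1:-.git/repoid}"
	if ! test -s "$repoid_file"
	then
		od -An -N16 -tx1 /dev/urandom | tr -cd '0-9a-f' >"$repoid_file" ||
			return 1
	fi
	cat "$repoid_file"
}
```

Calling get_repoid twice in the same tree prints the same value, since
the file is only written when it is missing or empty.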
I think I understand what Thomas is trying to achieve, but I think
there is a naming problem here. The marker really isn't a repoid - it
is a workspace id.
Two workspaces can share the same physical repository, yet have
different "repoid"s. So the thing being identified isn't the
repository - it's the workspace in which the commit was performed.
Thomas' objective, I think, is the following:
from the point of view of a given workspace, determine the merge
order of the global repository (and there really is only _one_
repository for this purpose) from the point of view of that
workspace. The interesting workspaces are workspaces that
contributed commits to the global history.
Thomas is correct to point out that committer id is not a substitute
for a workspace identifier since a given committer may work in
multiple workspaces concurrently.
I can also see why an identifier in the commit is necessary to
reconstruct the history.
Consider the following history:
Rn
| \
Rn-1 Mn
| /
Rn-2
| \
Rn-3 Mn-1
| /
Rn-4
Assume that changes Mn and Mn-1 are made in the same workspace, M. Then, from the
point of view of workspace M, the history is:
Rn
Rn-1
Mn
Rn-2
Rn-3
Mn-1
Rn-4
From the point of view of a given change epoch, M always wants to see
"local changes occur first". To know what changes were local to M you
need to mark the changes that workspace M made with an identifier
saying that M did this in this workspace, hence the need for the
marker that Thomas is proposing.
Assuming that there is value in being able to reconstruct the merge
order from the perspective of workspaces that have contributed to the
global history it would seem that Thomas's suggestion of marking each
commit with an identifier is reasonable, however, I think the name of
the identifier should change - what's being tracked is a workspace,
not a repository.
jon.
^ permalink raw reply
* Re: Mercurial 0.4e vs git network pull
From: Daniel Barkalow @ 2005-05-13 2:23 UTC (permalink / raw)
To: Matt Mackall; +Cc: Petr Baudis, linux-kernel, git, mercurial, Linus Torvalds
In-Reply-To: <20050513011149.GK5914@waste.org>
On Thu, 12 May 2005, Matt Mackall wrote:
> On Thu, May 12, 2005 at 08:33:56PM -0400, Daniel Barkalow wrote:
>
> > Yes, although that also includes pulling the commits, and may be
> > interleaved with pulling the trees and objects to cover the
> > latency. (I.e., one round trip gets the new head hash; the second gets
> > that commit; on the third the tree and the parent(s) can be requested at
> > once; on the fourth the contents of the tree and the grandparents, at
> > which point the bandwidth will probably be the limiting factor for the
> > rest of the operation.)
>
> What if a changeset is smaller than the bandwidth-delay product of
> your link? As an extreme example, Mercurial is currently at a point
> where its -entire repo- changegroup (set of all changesets) can be in
> flight on the wire on a typical link.
If this is common for the repository in question, then it will be forced
to wait for the parent to come in, true. If you have a number of merges,
however, you start using more total bandwidth relative to latency while
tracking them in parallel.
> > I must be misunderstanding your numbers, because 6 HTTP responses is more
> > than 1K, ignoring any actual content from the server, and 1K for 800
> > commits is less than 2 bytes per commit.
>
> 1k of application-level data, sorry. And my whole point is that I
> don't send those 800 commit identifiers (which are 40 bytes each as
> hex). I send about 30 or so. It's basically a negotiation to find the
> earliest commits not known to the client with a minimum of round trips
> and data exchange.
Does this rely on the history being entirely linear? I suppose that
requesting a rev-list from the server (which could have it as a static
file generated when a new head was pushed) could jumpstart the
process. The client could request all of the commits it doesn't have in
rapid succession, and then request trees as the commits started coming
in. Of course, this would get inefficient if you were, for example,
pulling a merge with a branch with a long history, since you'd get a ton
of old mainline (which you already have) interleaved with occasional new
things.
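
The client side of that idea might look roughly like the sketch below.
It assumes the server's list arrives as one hex id per line on stdin and
that local objects are stored loose as <objdir>/xx/yyyy...; the function
name missing_objects is hypothetical.

```shell
#!/bin/sh
# Print the ids read from stdin that are not present as loose objects
# under the given object directory.
missing_objects () {
	objdir="$1"
	while read -r sha
	do
		dir=${sha%"${sha#??}"}    # first two hex digits
		file=${sha#??}            # remaining digits
		test -f "$objdir/$dir/$file" || echo "$sha"
	done
}
```

The resulting list is what the client would then request in rapid
succession; it does nothing about the inefficiency with long-lived
branches mentioned above.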
> > I'm also worried about testing on 800 linear commits, since the projects
> > under consideration tend to have very non-linear histories.
>
> Not true at all. Dumps from Andrew to Linus via patch bombs will
> result in runs of hundreds of linear commits on a regular basis.
> Linear patch series are the preferred way to make changes and series
> of 30 or 40 small patches are not at all uncommon.
It has sounded like Andrew had some interest in using git, and a number of
other developers are using it already. If this becomes still more common,
it may be the case that, instead of sending patch bombs, Andrew will point
Linus at authors' original series, in which case the mainline would be
merges of a hundred linear series of various lengths. I had the
impression, although I never looked carefully, that this was happening on
a smaller scale with BK, where work by BK users got included using BK,
rather than as patches applied out of a bomb.
It certainly makes sense as a design goal to be able to support everything
happening within the system, rather than getting exported and reimported.
-Daniel
*This .sig left intentionally blank*
^ permalink raw reply
* Re: Mercurial 0.4e vs git network pull
From: Matt Mackall @ 2005-05-13 2:44 UTC (permalink / raw)
To: Daniel Barkalow; +Cc: Petr Baudis, linux-kernel, git, mercurial, Linus Torvalds
In-Reply-To: <Pine.LNX.4.21.0505122148480.30848-100000@iabervon.org>
On Thu, May 12, 2005 at 10:23:01PM -0400, Daniel Barkalow wrote:
> On Thu, 12 May 2005, Matt Mackall wrote:
>
> > On Thu, May 12, 2005 at 08:33:56PM -0400, Daniel Barkalow wrote:
> >
> > > Yes, although that also includes pulling the commits, and may be
> > > interleaved with pulling the trees and objects to cover the
> > > latency. (I.e., one round trip gets the new head hash; the second gets
> > > that commit; on the third the tree and the parent(s) can be requested at
> > > once; on the fourth the contents of the tree and the grandparents, at
> > > which point the bandwidth will probably be the limiting factor for the
> > > rest of the operation.)
> >
> > What if a changeset is smaller than the bandwidth-delay product of
> > your link? As an extreme example, Mercurial is currently at a point
> > where its -entire repo- changegroup (set of all changesets) can be in
> > flight on the wire on a typical link.
>
> If this is common for the repository in question, then it will be forced
> to wait for the parent to come in, true. If you have a number of merges,
> however, you start using more total bandwidth relative to latency while
> tracking them in parallel.
No, you're missing my point. If you can request all the files in a
changeset in less than a round-trip time, you have a pipeline stall.
Let's say a changeset is 10k and round trip time is 100ms. That means
you'll stall on any pipe with more than 100k/s. You won't know what
changeset to request next as it'll still be in flight.
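
The threshold in that example is just size divided by round-trip time.
A small sketch, taking 10k as 10000 bytes (the function name
stall_threshold is only illustrative):

```shell
#!/bin/sh
# Bandwidth in bytes/s above which a changeset of the given size is fully
# sent within one round trip, leaving the link idle until the reply that
# names the next changeset arrives.
stall_threshold () {
	size_bytes=$1
	rtt_ms=$2
	echo $(( size_bytes * 1000 / rtt_ms ))
}

stall_threshold 10000 100   # 100000 bytes/s, i.e. roughly 100k/s
```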
> > > I must be misunderstanding your numbers, because 6 HTTP responses is more
> > > than 1K, ignoring any actual content from the server, and 1K for 800
> > > commits is less than 2 bytes per commit.
> >
> > 1k of application-level data, sorry. And my whole point is that I
> > don't send those 800 commit identifiers (which are 40 bytes each as
> > hex). I send about 30 or so. It's basically a negotiation to find the
> > earliest commits not known to the client with a minimum of round trips
> > and data exchange.
>
> Does this rely on the history being entirely linear? I suppose that
> requesting a rev-list from the server (which could have it as a static
> file generated when a new head was pushed) could jumpstart the
> process. The client could request all of the commits it doesn't have in
> rapid succession, and then request trees as the commits started coming
> in. Of course, this would get inefficient if you were, for example,
> pulling a merge with a branch with a long history, since you'd get a ton
> of old mainline (which you already have) interleaved with occasional new
> things.
I don't depend on history being linear (I'm not reinventing CVS here)
and I don't grab a list of all revisions (the point is to be
scalable). In fact, I do something fairly clever, and something I
don't think will work with git, because, yet again, it lacks the
metadata.
> > > I'm also worried about testing on 800 linear commits, since the projects
> > > under consideration tend to have very non-linear histories.
> >
> > Not true at all. Dumps from Andrew to Linus via patch bombs will
> > result in runs of hundreds of linear commits on a regular basis.
> > Linear patch series are the preferred way to make changes and series
> > of 30 or 40 small patches are not at all uncommon.
>
> It has sounded like Andrew had some interest in using git, and a number of
> other developers are using it already. If this becomes still more common,
> it may be the case that, instead of sending patch bombs, Andrew will point
> Linus at authors' original series, in which case the mainline would be
> merges of a hundred linear series of various lengths. I had the
> impression, although I never looked carefully, that this was happening on
> a smaller scale with BK, where work by BK users got included using BK,
> rather than as patches applied out of a bomb.
Andrew already uses git, in a manner much like he used BK. He does a
pull from a repo, generates a patch of that repo vs mainline, and puts
that in -mm. And never passes that stuff on to Linus.
--
Mathematics is the supreme nostalgia of our time.
^ permalink raw reply
* Regarding gitk
From: Tejun Heo @ 2005-05-13 5:19 UTC (permalink / raw)
To: paulus, git
Hello, Paulus.
First of all, thanks a lot for gitk. I was working on something using
graphviz/pygtk to do about the same thing for a couple of weeks (and got
pretty far with it) but ditched it as gitk seemed much better. I really
love how gitk shows the commit graph. :-)
As I don't wanna ditch any more of my time, it would be great if you
let me know what you're currently working on, so that I can coordinate
with you. Here are the things I have in mind.
* integrate two-way diff view w/ diff map into gitk from mgdiff.
* show the current cache and working files at the head of graph
* demand-load commits as the user scrolls down the graph
I wrote a commit viewing utility (gitkdiff) modified from mgdiff two
weeks ago so I'm quite familiar with mgdiff source, and implemented
demand-loading of commits in my own project in python, but writing a
separate c utility with the same algorithm wouldn't take that much time.
Thanks.
--
tejun
^ permalink raw reply
* Re: [PATCH] Stop git-rev-list at sha1 match
From: Petr Baudis @ 2005-05-13 5:26 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Junio C Hamano, git
In-Reply-To: <1115857873.22180.253.camel@tglx>
Dear diary, on Thu, May 12, 2005 at 02:31:13AM CEST, I got a letter
where Thomas Gleixner <tglx@linutronix.de> told me that...
> On Thu, 2005-05-12 at 01:44 +0200, Petr Baudis wrote:
> > for extensive discussion on how (it is impossible or very hard) to do
> > better.
>
> :)
>
> > So how would you order the list of commits?
>
> Rn
> merged Mn
> merged Mn-1
> Rn-1
> ....
>
> That's the relevant information in repository R. Looking at it from
> repository M after M updated to Rn
>
> (Mn+1) == Rn ; Mn+1 is not created due to head forward
> merged Rn
> ..
> merged Rn-3
> Mn
> Mn-1
>
> That's the historically correct ordering from a repository point of view.
> That's the only relevant information IMNSHO.
But it is impossible to reconstruct without the repoid or something. So
my point that it makes no sense and is actually dangerous with the
current rev-list output order holds.
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor
^ permalink raw reply
* Re: [PATCH 1/3] Introduce "rev-list --stop-at=<commit>".
From: Petr Baudis @ 2005-05-13 5:29 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <7v3bssc770.fsf@assigned-by-dhcp.cox.net>
Dear diary, on Fri, May 13, 2005 at 02:15:15AM CEST, I got a letter
where Junio C Hamano <junkio@cox.net> told me that...
> Additional option, --stop-at=<commit>, is introduced. The
> git-rev-list output stops just before showing the named commit.
>
> This is based on Thomas Gleixner's patch but slightly reworked.
>
> Signed-off-by: Junio C Hamano <junkio@cox.net>
Won't apply for now - as I already said in the relevant thread, this
makes no sense with the current git-rev-list output order, and even
encourages using it in the wrong way. It is ok when the merges are reported
in a different way, but that's impossible without some repoid (I have
yet to catch up with that thread :-).
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor
^ permalink raw reply
* Re: [PATCH Cogito] Improve option parsing for cg-log
From: Petr Baudis @ 2005-05-13 5:41 UTC (permalink / raw)
To: Marcel Holtmann; +Cc: GIT Mailing List
In-Reply-To: <1115934586.18499.70.camel@pegasus>
Dear diary, on Thu, May 12, 2005 at 11:49:46PM CEST, I got a letter
where Marcel Holtmann <marcel@holtmann.org> told me that...
> Hi Petr,
>
> > > the attached patch changes the option parsing, because otherwise we are
> > > stuck to a specific order.
> >
> > thanks, applied. However, you didn't include the -r options parsing in
> > there yet.
>
> what do you mean by that?
The -r option still must be after all the other options.
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor
^ permalink raw reply
* Re: Mercurial 0.4e vs git network pull
From: Petr Baudis @ 2005-05-13 5:44 UTC (permalink / raw)
To: Daniel Barkalow
Cc: Matt Mackall, linux-kernel, git, mercurial, Linus Torvalds
In-Reply-To: <Pine.LNX.4.21.0505121709250.30848-100000@iabervon.org>
Dear diary, on Thu, May 12, 2005 at 11:24:27PM CEST, I got a letter
where Daniel Barkalow <barkalow@iabervon.org> told me that...
> In the present mainline, you first have to find the head commit you
> want. I have a patch which does this for you over the same
> connection. Starting from that point, it tracks reachability on the
> receiving end, and requests anything it doesn't have.
Could we get the patch, please? :-)
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor
^ permalink raw reply
* Re: [PATCH 1/3] Introduce "rev-list --stop-at=<commit>".
From: Noel Grandin @ 2005-05-13 6:07 UTC (permalink / raw)
To: Petr Baudis; +Cc: Junio C Hamano, git
In-Reply-To: <20050513052901.GB16464@pasky.ji.cz>
[-- Attachment #1: Type: text/plain, Size: 1185 bytes --]
Also, it should be called --stop-before given its behaviour.
--stop-at implies that it includes the given commit.
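
The difference between the two meanings can be sketched on a plain list
of ids read from stdin. Both function names are hypothetical and this is
not how git-rev-list implements the option:

```shell
#!/bin/sh
# Print ids up to, but not including, the named one (what the patch does).
stop_before () {
	awk -v stop="$1" '$0 "" == stop { exit } { print }'
}

# Print ids up to and including the named one (what "--stop-at" suggests).
stop_at_inclusive () {
	awk -v stop="$1" '{ print } $0 "" == stop { exit }'
}
```

For input lines a, b, c, d and the id c, stop_before prints a and b,
while stop_at_inclusive prints a, b and c.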
Petr Baudis wrote:
>Dear diary, on Fri, May 13, 2005 at 02:15:15AM CEST, I got a letter
>where Junio C Hamano <junkio@cox.net> told me that...
>
>
>>Additional option, --stop-at=<commit>, is introduced. The
>>git-rev-list output stops just before showing the named commit.
>>
>>This is based on Thomas Gleixner's patch but slightly reworked.
>>
>>Signed-off-by: Junio C Hamano <junkio@cox.net>
>>
>>
>
>Won't apply for now - as I already said in the relevant thread, this
>makes no sense with the current git-rev-list output order, and even
>encourages using it in the wrong way. It is ok when the merges are reported
>in a different way, but that's impossible without some repoid (I have
>yet to catch up with that thread :-).
>
>
>
^ permalink raw reply