* [FILE] GNU BIT
From: Andreas Gal @ 2005-04-25 6:24 UTC (permalink / raw)
To: Git Mailing List
BIT - a little bit like SCM
BIT is a training exercise in shell programming and is the result of
my attempts to wrap my head around GIT's inner workings. BIT's
command line interface should be very familiar to anyone who has worked
with other(tm) SCM tools before. I try not to depend on custom
GIT features. BIT uses the off-the-shelf GIT core tools distributed
by Linus. This means that BIT has about 2% of the features of
Cogito. Also, it has about 0.1% of the user base of Cogito, so it's
probably very broken and I really don't recommend using it.
You can obtain BIT from the following GIT repository:
http://nil.ics.uci.edu/~gal/public/bit
You can use the GIT core utilities to pull and check-out BIT:
$ init-db
$ curl http://nil.ics.uci.edu/~gal/public/bit/HEAD > .git/HEAD
$ http-pull -a `cat .git/HEAD` \
http://nil.ics.uci.edu/~gal/public/bit
$ read-tree `cat .git/HEAD`
$ checkout-cache -f -a
Naturally, you can also use BIT to pull the current sources, which
is much simpler:
$ bit clone http://nil.ics.uci.edu/~gal/public/bit
This will create a directory "bit", pull the latest sources and perform
a check-out for you.
INSTALLATION
Put "bit" anywhere in your search path. It's only a single bash script. It
requires a link "bit-resolve" to itself, and that (soft) link should reside
in the same directory as "bit" itself. "bit" acts as a merge script when
invoked through that link.
At this point, BIT's functionality is minimal. It does what I need it for.
I will obviously add more commands as we go along, but I won't touch
things like tags and stuff like that until Linus makes up his mind how to
do it *RIGHT*.
BASIC CONCEPTS
- BIT always checks out all files in your repository and it does so
automatically. In other words, there is no "bit co" command. Not
everyone will like this, but I do.
- You can't seek around in a repository. Checked-out files always
match the HEAD revision. You can diff files against older versions,
but if you want to check out an older version, you will have to
clone the repository (see below).
TRACKING SOMEONE ELSE'S TREE
$ bit clone http://www.kernel.org/pub/linux/kernel/people/torvalds/git.git/
(Note: Don't forget the '/' at the end, otherwise http-pull won't work.)
This command pulls Linus' latest GIT tree to a local repository "git.git".
You can change the latter by giving clone an additional argument.
$ bit clone http://www.kernel.org/pub/linux/kernel/people/torvalds/git.git/ \
git-trunk
Once you have a copy of the remote repository, you can check whether there
are new changesets in the remote repository that you haven't seen yet:
$ bit changes -R \
http://www.kernel.org/pub/linux/kernel/people/torvalds/git.git/
If you see any changes, you can merge them into your own tree:
$ bit pull http://www.kernel.org/pub/linux/kernel/people/torvalds/git.git/
BRANCHING
Now let's assume you want to work on an extension to GIT. For this, we will
clone the repository:
$ bit clone git-trunk git-bit
This will create a copy of the git-trunk repository and name it "git-bit".
The object directory is shared (using a soft link), which has the nice
benefit that once you run "bit pull" on one of the repositories, the
other one will be able to merge changes without any network traffic
(except for reading the current HEAD).
Let's say we make some changes to sha1_file.c and want to commit them to
our local repository "git-bit":
... edit sha1_file.c ...
In case we already forgot which file we edited, "bit pending" will tell us:
$ bit pending
sha1_file.c
In case we can no longer remember what we changed, there is "bit diffs",
which shows a diff against the current HEAD or any other version of our
tree.
$ bit diffs
--- 28ad1598e54200ca8ee1261ed7beb4e31e20b2f1/sha1_file.c
+++ sha1_file.c
@@ -70,6 +70,7 @@
int i;
static char *name, *base;
+ /* added a cool new feature here */
if (!base) {
char *sha1_file_directory = getenv(DB_ENVIRONMENT) ? : ...
int len = strlen(sha1_file_directory);
To commit our changes, we use "bit commit". It will fire up "vi" to ask for
a commit message.
$ bit commit
... enter commit message in vi ...
MERGING
Let's assume Linus put out a new version of GIT, so we want to update both of
our repositories. First let's do this for "git-trunk".
$ cd git-trunk
$ bit pull http://www.kernel.org/pub/linux/kernel/people/torvalds/git.git/
(Note: You have to specify the URL explicitly every time because there is no
consensus yet on where to store this information. Once that's sorted out, this
will be automatic, of course.)
As this repository only tracks Linus' sources, there should be no conflicts.
Now let's go to our "git-bit" repository and do the same there:
$ cd git-bit
$ bit pull http://www.kernel.org/pub/linux/kernel/people/torvalds/git.git/
Because both repositories share the object directory, you will get away
with minimal network traffic. Conflicts are resolved using RCS merge. If
that fails, you have to edit the offending files yourself.
PUSHING PATCHES UPSTREAM
Let's assume we want to send our improvements to Linus. For this, we can
ask "bit changes" to show us all local changes in our repository:
$ bit changes -L \
http://www.kernel.org/pub/linux/kernel/people/torvalds/git.git/
There is currently no mechanism in BIT to generate patches automatically,
but I will add one shortly. What is working already is that you can
push your repository to a remote location:
$ bit push ssh://gal@sam.ics.uci.edu/.nfs/public_html/public/git/
This will update the remote repository via SSH and set HEAD to point
to your latest version. Please note that you have to create a
repository at the remote location using "init-db".
HELP
Try "bit --help" to get some simple instructions on how to use BIT. All
commands have built-in help as well. Try "bit commit --help". Not all
options are implemented yet. Feel free to send me a patch.
[-- Attachment #2: Type: APPLICATION/x-gzip, Size: 5057 bytes --]
* Re: [PATCH] New option (-H) for rpush/rpull to update HEAD
From: Daniel Barkalow @ 2005-04-25 5:18 UTC (permalink / raw)
To: Andreas Gal; +Cc: Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0504242149330.5553@sam.ics.uci.edu>
On Sun, 24 Apr 2005, Andreas Gal wrote:
> Why? Updating HEAD right after writing the commit id and all its children
> to the object directory seems reasonable and prevents race conditions when
> the remote repository is shared via HTTP etc.
Come to think of it, the option I was thinking of is orthogonal to your
option; I was thinking of an option to read the head from the sending
side's file, rather than from the command line.
In any case, if you're sharing the repository by HTTP, there's no hurry to
update the HEAD right after, since the old head doesn't stop being valid
(although it's obviously not going to be current in a moment).
> rpull, in contrast, should never touch HEAD, because conflicts might
> force a merge that will set HEAD to something else. For the pull case we
> should let the script(tm) do that.
For the rpull case, what you want is to just say "HEAD" (or something),
and the remote server will send you the HEAD and you pull down that
commit. You're probably right that you don't want to automatically write
it to the local HEAD, though, although the future format should give us
somewhere good to put the result.
> It's local anyway. rpush is different because the script(tm) has to do
> some SSH magic to update HEAD. I will gladly supply a patch to fix what
> to read/write once you have figured out the final layout, but I really
> need a working rpush _NOW_ ;).
That was actually my motive in getting rpush/rpull/http-pull in, too.
If you're going to do anything serious with this, you should probably remove
the "if (has_sha1_file()) continue;" bit in process_commit(), so that it
will make sure that the repository gets pulled completely with -a, even if
some commits have already been pulled. (This will make things less
efficient, but less error-prone, and we'll fix the inefficiency later.)
-Daniel
*This .sig left intentionally blank*
* [PATCH 2/2] Introduce diff-tree-helper.
From: Junio C Hamano @ 2005-04-25 5:17 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <7v1x8zsamn.fsf_-_@assigned-by-dhcp.cox.net>
This patch introduces a new program, diff-tree-helper. It reads
output from diff-cache and diff-tree, and produces a patch file.
The diff format can be customized the same way as for
show-diff, because this program uses the same external diff
interface that the previous patch split out of show-diff.
It is used like the following examples:
$ diff-cache --cached -z <tree> | diff-tree-helper -z -R paths...
$ diff-tree -r -z <tree1> <tree2> | diff-tree-helper -z paths...
- As usual, the use of the -z flag is recommended in the script
to pass NUL-terminated filenames through the pipe between
commands.
- The -R flag is used to generate a reverse diff. It does not
matter for the diff-tree case, but it is sometimes useful to get
a patch in the desired direction out of diff-cache.
- The paths parameters are used to restrict the paths that
appear in the output. Again this is useful with
diff-cache, which, unlike diff-tree, does not take such path
restriction parameters.
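The path restriction described above amounts to a directory-prefix
match. A minimal sketch, assuming each path parameter names either a
file or a leading directory; the function names path_has_prefix and
matches_any are illustrative and are not taken from the patch:

```c
#include <assert.h>
#include <string.h>

/*
 * Return 1 if `path` is exactly `prefix`, or lies underneath it
 * as a directory ("lib" matches "lib/strbuf.c" but not "library.c").
 * A prefix written with a trailing slash is not handled specially.
 */
int path_has_prefix(const char *path, const char *prefix)
{
	size_t plen, len;

	assert(path != NULL && prefix != NULL);
	plen = strlen(prefix);
	len = strlen(path);
	if (plen > len)
		return 0;
	if (strncmp(path, prefix, plen) != 0)
		return 0;
	/* The match must end at a path component boundary. */
	return path[plen] == '\0' || path[plen] == '/';
}

/* Return 1 if `path` is selected by any of the `count` specs. */
int matches_any(const char *path, const char **specs, int count)
{
	int i;

	for (i = 0; i < count; i++)
		if (path_has_prefix(path, specs[i]))
			return 1;
	return 0;
}
```

Under this rule the parameters "lib" and "Makefile" would select
lib/strbuf.c and the top-level Makefile, but neither library.c nor
doc/Makefile.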
Signed-off-by: Junio C Hamano <junkio@cox.net>
---
Makefile | 6 -
diff-tree-helper.c | 302 +++++++++++++++++++++++++++++++++++++++++++++++++++++
strbuf.c | 43 +++++++
strbuf.h | 13 ++
4 files changed, 363 insertions(+), 1 deletion(-)
--- k/Makefile
+++ l/Makefile
@@ -16,7 +16,8 @@ AR=ar
PROG= update-cache show-diff init-db write-tree read-tree commit-tree \
cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
check-files ls-tree merge-base merge-cache unpack-file git-export \
- diff-cache convert-cache http-pull rpush rpull rev-list
+ diff-cache convert-cache http-pull rpush rpull rev-list \
+ diff-tree-helper
all: $(PROG)
@@ -27,6 +28,9 @@ LIB_OBJS=read-cache.o sha1_file.o usage.
LIB_FILE=libgit.a
LIB_H=cache.h object.h
+LIB_H += strbuf.h
+LIB_OBJS += strbuf.o
+
LIB_H += diff.h
LIB_OBJS += diff.o
--- k/diff-tree-helper.c
+++ l/diff-tree-helper.c
@@ -0,0 +1,302 @@
+#include "cache.h"
+#include "strbuf.h"
+#include "diff.h"
+
+static int matches_pathspec(const char *name, char **spec, int cnt)
+{
+ int i;
+ int namelen = strlen(name);
+ for (i = 0; i < cnt; i++) {
+ int speclen = strlen(spec[i]);
+ if (! strncmp(spec[i], name, speclen) &&
+ speclen <= namelen &&
+ (name[speclen] == 0 ||
+ name[speclen] == '/'))
+ return 1;
+ }
+ return 0;
+}
+
+static int parse_oneside_change(const char *cp, unsigned char *sha1,
+ char *path) {
+ int ch;
+ while ((ch = *cp) && '0' <= ch && ch <= '7')
+ cp++; /* skip mode bits */
+ if (strncmp(cp, "\tblob\t", 6))
+ return -1;
+ cp += 6;
+ if (get_sha1_hex(cp, sha1))
+ return -1;
+ cp += 40;
+ if (*cp++ != '\t')
+ return -1;
+ strcpy(path, cp);
+ return 0;
+}
+
+#define STATUS_CACHED 0 /* cached and sha1 valid */
+#define STATUS_ABSENT 1 /* diff-tree says old removed or new added */
+#define STATUS_UNCACHED 2 /* diff-cache output: read from working tree */
+
+static int parse_diff_tree_output(const char *buf,
+ unsigned char *old_sha1,
+ int *old_status,
+ unsigned char *new_sha1,
+ int *new_status,
+ char *path) {
+ const char *cp = buf;
+ int ch;
+ static unsigned char null_sha[20] = { 0, };
+
+ switch (*cp++) {
+ case '+':
+ *old_status = STATUS_ABSENT;
+ *new_status = (memcmp(new_sha1, null_sha, sizeof(null_sha)) ?
+ STATUS_CACHED : STATUS_UNCACHED);
+ return parse_oneside_change(cp, new_sha1, path);
+ case '-':
+ *new_status = STATUS_ABSENT;
+ *old_status = (memcmp(old_sha1, null_sha, sizeof(null_sha)) ?
+ STATUS_CACHED : STATUS_UNCACHED);
+ return parse_oneside_change(cp, old_sha1, path);
+ case '*':
+ break;
+ default:
+ return -1;
+ }
+
+ /* This is for '*' entries */
+ while ((ch = *cp) && ('0' <= ch && ch <= '7'))
+ cp++; /* skip mode bits */
+ if (strncmp(cp, "->", 2))
+ return -1;
+ cp += 2;
+ while ((ch = *cp) && ('0' <= ch && ch <= '7'))
+ cp++; /* skip mode bits */
+ if (strncmp(cp, "\tblob\t", 6))
+ return -1;
+ cp += 6;
+ if (get_sha1_hex(cp, old_sha1))
+ return -1;
+ cp += 40;
+ if (strncmp(cp, "->", 2))
+ return -1;
+ cp += 2;
+ if (get_sha1_hex(cp, new_sha1))
+ return -1;
+ cp += 40;
+ if (*cp++ != '\t')
+ return -1;
+ strcpy(path, cp);
+ *old_status = (memcmp(old_sha1, null_sha, sizeof(null_sha)) ?
+ STATUS_CACHED : STATUS_UNCACHED);
+ *new_status = (memcmp(new_sha1, null_sha, sizeof(null_sha)) ?
+ STATUS_CACHED : STATUS_UNCACHED);
+ return 0;
+}
+
+static int sha1err(const char *path, const unsigned char *sha1)
+{
+ return error("diff-tree-helper: unable to read sha1 file of %s (%s)",
+ path, sha1_to_hex(sha1));
+}
+
+static int fserr(const char *path)
+{
+ return error("diff-tree-helper: unable to read file %s", path);
+}
+
+static char *map_whole_file(const char *path, unsigned long *size) {
+ int fd;
+ struct stat st;
+ void *buf;
+
+ if ((fd = open(path, O_RDONLY)) < 0) {
+ error("diff-tree-helper: unable to read file %s", path);
+ return 0;
+ }
+ if (fstat(fd, &st) < 0) {
+ close(fd);
+ error("diff-tree-helper: unable to stat file %s", path);
+ return 0;
+ }
+ *size = st.st_size;
+ buf = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
+ close(fd);
+ return buf;
+}
+
+static int show_diff(const unsigned char *old_sha1, int old_status,
+ const unsigned char *new_sha1, int new_status,
+ const char *path, int reverse_diff)
+{
+ char other[PATH_MAX];
+ unsigned long size;
+ char type[20];
+ int fd;
+ int reverse;
+ void *blob = 0;
+ const char *fs = 0;
+ int need_unmap = 0;
+ int need_unlink = 0;
+
+
+ switch (old_status) {
+ case STATUS_CACHED:
+ blob = read_sha1_file(old_sha1, type, &size);
+ if (! blob)
+ return sha1err(path, old_sha1);
+
+ switch (new_status) {
+ case STATUS_CACHED:
+ strcpy(other, ".diff_tree_helper_XXXXXX");
+ fd = mkstemp(other);
+ if (fd < 0)
+ die("unable to create temp-file");
+ if (write(fd, blob, size) != size)
+ die("unable to write temp-file");
+ close(fd);
+ free(blob);
+
+ blob = read_sha1_file(new_sha1, type, &size);
+ if (! blob)
+ return sha1err(path, new_sha1);
+
+ need_unlink = 1;
+ /* new = blob, old = fs */
+ reverse = !reverse_diff;
+ fs = other;
+ break;
+
+ case STATUS_ABSENT:
+ case STATUS_UNCACHED:
+ fs = ((new_status == STATUS_ABSENT) ?
+ "/dev/null" : path);
+ reverse = reverse_diff;
+ break;
+
+ default:
+ reverse = reverse_diff;
+ }
+ break;
+
+ case STATUS_ABSENT:
+ switch (new_status) {
+ case STATUS_CACHED:
+ blob = read_sha1_file(new_sha1, type, &size);
+ if (! blob)
+ return sha1err(path, new_sha1);
+ /* old = fs, new = blob */
+ fs = "/dev/null";
+ reverse = !reverse_diff;
+ break;
+
+ case STATUS_ABSENT:
+ return error("diff-tree-helper: absent from both old and new?");
+ case STATUS_UNCACHED:
+ fs = path;
+ blob = strdup("");
+ size = 0;
+ /* old = blob, new = fs */
+ reverse = reverse_diff;
+ break;
+ default:
+ reverse = reverse_diff;
+ }
+ break;
+
+ case STATUS_UNCACHED:
+ fs = path; /* old = fs, new = blob */
+ reverse = !reverse_diff;
+
+ switch (new_status) {
+ case STATUS_CACHED:
+ blob = read_sha1_file(new_sha1, type, &size);
+ if (! blob)
+ return sha1err(path, new_sha1);
+ break;
+
+ case STATUS_ABSENT:
+ blob = strdup("");
+ size = 0;
+ break;
+
+ case STATUS_UNCACHED:
+ /* old = fs */
+ blob = map_whole_file(path, &size);
+ if (! blob)
+ return fserr(path);
+ need_unmap = 1;
+ break;
+ default:
+ reverse = reverse_diff;
+ }
+ break;
+
+ default:
+ reverse = reverse_diff;
+ }
+
+ if (fs)
+ show_differences(fs,
+ path, /* label */
+ blob,
+ size,
+ reverse /* 0: diff blob fs
+ 1: diff fs blob */);
+
+ if (need_unlink)
+ unlink(other);
+ if (need_unmap && blob)
+ munmap(blob, size);
+ else
+ free(blob);
+ return 0;
+}
+
+static const char *diff_tree_helper_usage =
+"diff-tree-helper [-R] [-z] paths...";
+
+int main(int ac, char **av) {
+ struct strbuf sb;
+ int reverse_diff = 0;
+ int line_termination = '\n';
+
+ strbuf_init(&sb);
+
+ while (1 < ac && av[1][0] == '-') {
+ if (av[1][1] == 'R')
+ reverse_diff = 1;
+ else if (av[1][1] == 'z')
+ line_termination = 0;
+ else
+ usage(diff_tree_helper_usage);
+ ac--; av++;
+ }
+ /* the remaining parameters are paths patterns */
+
+ prepare_diff_cmd();
+
+ while (1) {
+ int old_status, new_status;
+ unsigned char old_sha1[20], new_sha1[20];
+ char path[PATH_MAX];
+ read_line(&sb, stdin, line_termination);
+ if (sb.eof)
+ break;
+ if (parse_diff_tree_output(sb.buf,
+ old_sha1, &old_status,
+ new_sha1, &new_status,
+ path)) {
+ fprintf(stderr, "cannot parse %s\n", sb.buf);
+ continue;
+ }
+ if (1 < ac && ! matches_pathspec(path, av+1, ac-1))
+ continue;
+
+ show_diff(old_sha1, old_status,
+ new_sha1, new_status,
+ path, reverse_diff);
+ }
+ return 0;
+}
--- k/strbuf.c
+++ l/strbuf.c
@@ -0,0 +1,43 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include "strbuf.h"
+
+void strbuf_init(struct strbuf *sb) {
+ sb->buf = 0;
+ sb->eof = sb->alloc = sb->len = 0;
+}
+
+static void strbuf_begin(struct strbuf *sb) {
+ free(sb->buf);
+ strbuf_init(sb);
+}
+
+static void inline strbuf_add(struct strbuf *sb, int ch) {
+ if (sb->alloc <= sb->len) {
+ sb->alloc = sb->alloc * 3 / 2 + 16;
+ sb->buf = realloc(sb->buf, sb->alloc);
+ }
+ sb->buf[sb->len++] = ch;
+}
+
+static void strbuf_end(struct strbuf *sb) {
+ strbuf_add(sb, 0);
+}
+
+void read_line(struct strbuf *sb, FILE *fp, int term) {
+ int ch;
+ strbuf_begin(sb);
+ if (feof(fp)) {
+ sb->eof = 1;
+ return;
+ }
+ while ((ch = fgetc(fp)) != EOF) {
+ if (ch == term)
+ break;
+ strbuf_add(sb, ch);
+ }
+ if (sb->len == 0)
+ sb->eof = 1;
+ strbuf_end(sb);
+}
+
--- k/strbuf.h
+++ l/strbuf.h
@@ -0,0 +1,13 @@
+#ifndef STRBUF_H
+#define STRBUF_H
+struct strbuf {
+ int alloc;
+ int len;
+ int eof;
+ unsigned char *buf;
+};
+
+extern void strbuf_init(struct strbuf *);
+extern void read_line(struct strbuf *, FILE *, int);
+
+#endif /* STRBUF_H */
* [PATCH 1/2] Split external diff command interface to a separate file.
From: Junio C Hamano @ 2005-04-25 5:15 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <7v1x8zsamn.fsf_-_@assigned-by-dhcp.cox.net>
With this patch, the non-core'ish part of show-diff command that
invokes an external "diff" command to obtain patches is split
into a separate file. The next patch will introduce a new
command, diff-tree-helper, which uses this common diff interface
to format diff-tree and diff-cache output into a patch form.
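Because the external diff is run through the shell, path names that
are interpolated into the command line have to be quoted. A minimal
sketch of the single-quote escaping, assuming a hypothetical helper
shell_quote; unlike sq_expand in this patch, which leaves the
enclosing quotes to the format string, the sketch adds the outer
quotes itself:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a malloc'ed copy of `src` wrapped in single quotes, with
 * every embedded single quote rewritten as '\'' so that a POSIX
 * shell reads the result back as one word. The caller frees it.
 * Returns NULL if memory cannot be allocated.
 */
char *shell_quote(const char *src)
{
	size_t len = 3;	/* opening quote, closing quote, NUL */
	const char *cp;
	char *buf, *bp;

	assert(src != NULL);
	for (cp = src; *cp; cp++)
		len += (*cp == '\'') ? 4 : 1;

	buf = malloc(len);
	if (!buf)
		return NULL;

	bp = buf;
	*bp++ = '\'';
	for (cp = src; *cp; cp++) {
		if (*cp == '\'') {
			memcpy(bp, "'\\''", 4);	/* close, escaped quote, reopen */
			bp += 4;
		} else {
			*bp++ = *cp;
		}
	}
	*bp++ = '\'';
	*bp = '\0';
	return buf;
}
```

A name such as a'b comes out as 'a'\''b', and a name with a space
such as "a b" comes out as 'a b'.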
Signed-off-by: Junio C Hamano <junkio@cox.net>
---
Makefile | 4 ++
diff.c | 106 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
diff.h | 17 +++++++++
show-diff.c | 101 +--------------------------------------------------------
4 files changed, 130 insertions(+), 98 deletions(-)
--- k/Makefile
+++ l/Makefile
@@ -27,6 +27,9 @@ LIB_OBJS=read-cache.o sha1_file.o usage.
LIB_FILE=libgit.a
LIB_H=cache.h object.h
+LIB_H += diff.h
+LIB_OBJS += diff.o
+
LIBS = $(LIB_FILE)
LIBS += -lz
@@ -66,6 +69,7 @@ checkout-cache.o: $(LIB_H)
commit.o: $(LIB_H)
commit-tree.o: $(LIB_H)
convert-cache.o: $(LIB_H)
+diff.o: $(LIB_H)
diff-cache.o: $(LIB_H)
diff-tree.o: $(LIB_H)
fsck-cache.o: $(LIB_H)
--- k/diff.c
+++ l/diff.c
@@ -0,0 +1,106 @@
+#include "cache.h"
+#include "diff.h"
+
+static char *diff_cmd = "diff -L 'k/%s' -L 'l/%s' ";
+static char *diff_opts = "-p -u";
+static char *diff_arg_forward = " - '%s'";
+static char *diff_arg_reverse = " '%s' -";
+
+void prepare_diff_cmd(void)
+{
+ /*
+ * Default values above are meant to match the
+ * Linux kernel development style. Examples of
+ * alternative styles you can specify via environment
+ * variables are:
+ *
+ * GIT_DIFF_CMD="diff -L '%s' -L '%s'"
+ * GIT_DIFF_OPTS="-c";
+ */
+ diff_cmd = getenv("GIT_DIFF_CMD") ? : diff_cmd;
+ diff_opts = getenv("GIT_DIFF_OPTS") ? : diff_opts;
+}
+
+/* Help to copy the thing properly quoted for the shell safety.
+ * any single quote is replaced with '\'', and the caller is
+ * expected to enclose the result within a single quote pair.
+ *
+ * E.g.
+ * original sq_expand result
+ * name ==> name ==> 'name'
+ * a b ==> a b ==> 'a b'
+ * a'b ==> a'\''b ==> 'a'\''b'
+ */
+static char *sq_expand(const char *src)
+{
+ static char *buf = NULL;
+ int cnt, c;
+ const char *cp;
+ char *bp;
+
+ /* count bytes needed to store the quoted string. */
+ for (cnt = 1, cp = src; *cp; cnt++, cp++)
+ if (*cp == '\'')
+ cnt += 3;
+
+ if (! (buf = malloc(cnt)))
+ return buf;
+ bp = buf;
+ while ((c = *src++)) {
+ if (c != '\'')
+ *bp++ = c;
+ else {
+ bp = strcpy(bp, "'\\''");
+ bp += 4;
+ }
+ }
+ *bp = 0;
+ return buf;
+}
+
+void show_differences(const char *name, /* filename on the filesystem */
+ const char *label, /* diff label to use */
+ void *old_contents, /* contents in core */
+ unsigned long long old_size, /* size in core */
+ int reverse /* 0: diff core file
+ 1: diff file core */)
+{
+ FILE *f;
+ char *name_sq = sq_expand(name);
+ const char *label_sq = (name != label) ? sq_expand(label) : name_sq;
+ char *diff_arg = reverse ? diff_arg_reverse : diff_arg_forward;
+ int cmd_size = strlen(name_sq) + strlen(label_sq) * 2 +
+ strlen(diff_cmd) + strlen(diff_opts) + strlen(diff_arg);
+ char *cmd = malloc(cmd_size);
+ int next_at;
+
+ fflush(stdout);
+ next_at = snprintf(cmd, cmd_size, diff_cmd, label_sq, label_sq);
+ next_at += snprintf(cmd+next_at, cmd_size-next_at, "%s", diff_opts);
+ next_at += snprintf(cmd+next_at, cmd_size-next_at, diff_arg, name_sq);
+ f = popen(cmd, "w");
+ if (old_size)
+ fwrite(old_contents, old_size, 1, f);
+ pclose(f);
+ if (label_sq != name_sq)
+ free((void*)label_sq); /* constness */
+ free(name_sq);
+ free(cmd);
+}
+
+void show_diff_empty(const unsigned char *sha1,
+ const char *name,
+ int reverse)
+{
+ char *old;
+ unsigned long int size;
+ unsigned char type[20];
+
+ old = read_sha1_file(sha1, type, &size);
+ if (! old) {
+ error("unable to read blob object for %s (%s)", name,
+ sha1_to_hex(sha1));
+ return;
+ }
+ show_differences("/dev/null", name, old, size, reverse);
+}
--- k/diff.h
+++ l/diff.h
@@ -0,0 +1,17 @@
+#ifndef DIFF_H
+#define DIFF_H
+
+extern void prepare_diff_cmd(void);
+
+extern void show_differences(const char *name, /* filename on the filesystem */
+ const char *label, /* diff label to use */
+ void *old_contents, /* contents in core */
+ unsigned long long old_size, /* size in core */
+ int reverse /* 0: diff core file
+ 1: diff file core */);
+
+extern void show_diff_empty(const unsigned char *sha1,
+ const char *name,
+ int reverse);
+
+#endif /* DIFF_H */
--- k/show-diff.c
+++ l/show-diff.c
@@ -4,103 +4,7 @@
* Copyright (C) Linus Torvalds, 2005
*/
#include "cache.h"
-
-static char *diff_cmd = "diff -L 'a/%s' -L 'b/%s' ";
-static char *diff_opts = "-p -u";
-static char *diff_arg_forward = " - '%s'";
-static char *diff_arg_reverse = " '%s' -";
-
-static void prepare_diff_cmd(void)
-{
- /*
- * Default values above are meant to match the
- * Linux kernel development style. Examples of
- * alternative styles you can specify via environment
- * variables are:
- *
- * GIT_DIFF_CMD="diff -L '%s' -L '%s'"
- * GIT_DIFF_OPTS="-c";
- */
- diff_cmd = getenv("GIT_DIFF_CMD") ? : diff_cmd;
- diff_opts = getenv("GIT_DIFF_OPTS") ? : diff_opts;
-}
-
-/* Help to copy the thing properly quoted for the shell safety.
- * any single quote is replaced with '\'', and the caller is
- * expected to enclose the result within a single quote pair.
- *
- * E.g.
- * original sq_expand result
- * name ==> name ==> 'name'
- * a b ==> a b ==> 'a b'
- * a'b ==> a'\''b ==> 'a'\''b'
- */
-static char *sq_expand(char *src)
-{
- static char *buf = NULL;
- int cnt, c;
- char *cp;
-
- /* count bytes needed to store the quoted string. */
- for (cnt = 1, cp = src; *cp; cnt++, cp++)
- if (*cp == '\'')
- cnt += 3;
-
- if (! (buf = malloc(cnt)))
- return buf;
- cp = buf;
- while ((c = *src++)) {
- if (c != '\'')
- *cp++ = c;
- else {
- cp = strcpy(cp, "'\\''");
- cp += 4;
- }
- }
- *cp = 0;
- return buf;
-}
-
-static void show_differences(char *name, char *label, void *old_contents,
- unsigned long long old_size, int reverse)
-{
- FILE *f;
- char *name_sq = sq_expand(name);
- char *label_sq = (name != label) ? sq_expand(label) : name_sq;
- char *diff_arg = reverse ? diff_arg_reverse : diff_arg_forward;
- int cmd_size = strlen(name_sq) + strlen(label_sq) * 2 +
- strlen(diff_cmd) + strlen(diff_opts) + strlen(diff_arg);
- char *cmd = malloc(cmd_size);
- int next_at;
-
- fflush(stdout);
- next_at = snprintf(cmd, cmd_size, diff_cmd, label_sq, label_sq);
- next_at += snprintf(cmd+next_at, cmd_size-next_at, "%s", diff_opts);
- next_at += snprintf(cmd+next_at, cmd_size-next_at, diff_arg, name_sq);
- f = popen(cmd, "w");
- if (old_size)
- fwrite(old_contents, old_size, 1, f);
- pclose(f);
- if (label_sq != name_sq)
- free(label_sq);
- free(name_sq);
- free(cmd);
-}
-
-static void show_diff_empty(struct cache_entry *ce, int reverse)
-{
- char *old;
- unsigned long int size;
- unsigned char type[20];
-
- old = read_sha1_file(ce->sha1, type, &size);
- if (! old) {
- error("unable to read blob object for %s (%s)", ce->name,
- sha1_to_hex(ce->sha1));
- return;
- }
- show_differences("/dev/null", ce->name, old, size, reverse);
-}
+#include "diff.h"
static const char *show_diff_usage = "show-diff [-q] [-s] [-z] [paths...]";
@@ -183,7 +87,8 @@ int main(int argc, char **argv)
else {
printf("%s: %s\n", ce->name, strerror(errno));
if (errno == ENOENT)
- show_diff_empty(ce, reverse);
+ show_diff_empty(ce->sha1, ce->name,
+ reverse);
}
continue;
}
* [PATCH 0/2] diff-tree/diff-cache helper
From: Junio C Hamano @ 2005-04-25 5:12 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0504232202340.19877@ppc970.osdl.org>
I use a set of small scripts [*1*] directly on top of the core
git, and they needed a way to make patches out of diff-tree and
diff-cache output. This helper does that; its output is compatible
with what show-diff produces.
Since this helper program would be generally useful for other
wrapper layer programs like Cogito, I am submitting it in two
parts. The first part rips the external diff program interface
out of show-diff.c and moves it into a common library. The
second part introduces the diff-tree-helper program.
[Footnotes]
*1* This is found at http://members.cox.net/junkio/
* Re: [PATCH] New option (-H) for rpush/rpull to update HEAD
From: Andreas Gal @ 2005-04-25 4:55 UTC (permalink / raw)
To: Daniel Barkalow; +Cc: Git Mailing List
In-Reply-To: <Pine.LNX.4.21.0504250041500.30848-100000@iabervon.org>
Why? Updating HEAD right after writing the commit id and all its children
to the object directory seems reasonable and prevents race conditions when
the remote repository is shared via HTTP etc. rpull, in contrast, should
never touch HEAD, because conflicts might force a merge that will set
HEAD to something else. For the pull case we should let the script(tm) do
that. It's local anyway. rpush is different because the script(tm) has to
do some SSH magic to update HEAD. I will gladly supply a patch to fix what
to read/write once you have figured out the final layout, but I really
need a working rpush _NOW_ ;).
Andreas
On Mon, 25 Apr 2005, Daniel Barkalow wrote:
> On Sun, 24 Apr 2005, Andreas Gal wrote:
>
> > This patch adds a new option -H to rpush/rpull to update the
> > HEAD pointer when pushing a new release to a remote repository.
>
> Updating the head pointer (in either direction) should be instead of
> specifying a commit, and should also apply to http-pull. I've also
> suggested some changes to the organization of HEAD and related items, so
> the logical things to read and write are likely to change soon.
>
> -Daniel
> *This .sig left intentionally blank*
* Re: [PATCH] New option (-H) for rpush/rpull to update HEAD
From: Daniel Barkalow @ 2005-04-25 4:47 UTC (permalink / raw)
To: Andreas Gal; +Cc: Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0504242132280.5064@sam.ics.uci.edu>
On Sun, 24 Apr 2005, Andreas Gal wrote:
> This patch adds a new option -H to rpush/rpull to update the
> HEAD pointer when pushing a new release to a remote repository.
Updating the head pointer (in either direction) should be instead of
specifying a commit, and should also apply to http-pull. I've also
suggested some changes to the organization of HEAD and related items, so
the logical things to read and write are likely to change soon.
-Daniel
*This .sig left intentionally blank*
* [PATCH] New option (-H) for rpush/rpull to update HEAD
From: Andreas Gal @ 2005-04-25 4:39 UTC (permalink / raw)
To: Git Mailing List
In-Reply-To: <Pine.LNX.4.62.0504250443380.14200@sheen.jakma.org>
This patch adds a new option -H to rpush/rpull to update the
HEAD pointer when pushing a new release to a remote repository.
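For illustration only, here is a more defensive way to write the head
file than the plain fopen()/fprintf() in the patch below: write to a
temporary file, check every step, and rename it into place. The helper
name update_head_file and the ".new" suffix are assumptions made for
this sketch; it is not what rpull or rpush does:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write "<commit_id>\n" to "<path>.new" and rename it over `path`.
 * Returns 0 on success and -1 on any failure, in which case the
 * existing file at `path` is left untouched.
 */
int update_head_file(const char *path, const char *commit_id)
{
	char tmp[4096];
	FILE *fp;
	int n;

	assert(path != NULL && commit_id != NULL);
	n = snprintf(tmp, sizeof(tmp), "%s.new", path);
	if (n < 0 || (size_t)n >= sizeof(tmp))
		return -1;	/* path too long for the buffer */

	fp = fopen(tmp, "w");
	if (!fp)
		return -1;
	if (fprintf(fp, "%s\n", commit_id) < 0) {
		fclose(fp);
		remove(tmp);
		return -1;
	}
	if (fclose(fp) != 0) {
		remove(tmp);
		return -1;
	}
	if (rename(tmp, path) != 0) {
		remove(tmp);
		return -1;
	}
	return 0;
}
```

On POSIX systems rename() replaces the target atomically, so someone
fetching HEAD over HTTP would see either the old commit ID or the new
one, never a partially written file.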
Signed-off-by: Andreas Gal <gal@uci.edu>
--- c27af2c2464de28732b8ad1fff3ed8a0804250d6/rpull.c
+++ rpull.c
@@ -11,6 +11,7 @@
static int tree = 0;
static int commits = 0;
static int all = 0;
+static int update_head = 0;
static int fd_in;
static int fd_out;
@@ -104,11 +105,13 @@
all = 1;
tree = 1;
commits = 1;
+ } else if (argv[arg][1] == 'H') {
+ update_head = 1;
}
arg++;
}
if (argc < arg + 2) {
- usage("rpull [-c] [-t] [-a] commit-id url");
+ usage("rpull [-c] [-t] [-a] [-H] commit-id url");
return 1;
}
commit_id = argv[arg];
@@ -123,6 +126,11 @@
return 1;
if (process_commit(sha1))
return 1;
+ if (update_head) {
+ FILE* fp = fopen("HEAD", "w+");
+ fprintf(fp, "%s\n", commit_id);
+ fclose(fp);
+ }
return 0;
}
--- 0293a1a46311d7e20b13177143741ab9d6d0d201/rpush.c
+++ rpush.c
@@ -56,7 +56,7 @@
arg++;
}
if (argc < arg + 2) {
- usage("rpush [-c] [-t] [-a] commit-id url");
+ usage("rpush [-c] [-t] [-a] [-H] commit-id url");
return 1;
}
commit_id = argv[arg];
* Re: Git-commits mailing list feed.
From: Paul Jakma @ 2005-04-25 3:47 UTC (permalink / raw)
To: David A. Wheeler
Cc: Linus Torvalds, Sean, Thomas Glanzmann, David Woodhouse,
Jan Dittmer, Greg KH, Kernel Mailing List, Git Mailing List
In-Reply-To: <Pine.LNX.4.62.0504250435050.14200@sheen.jakma.org>
On Mon, 25 Apr 2005, Paul Jakma wrote:
> Uh, I have no idea whether verifying a signature of a commit object is
> sufficient, i.e. equivalent to signing each file.
>
> A commit refers to tree objects, which I presume list the SHA-1 object IDs of
> files, but IIRC Linus already described why a signature of the commit object
> should not be used to trust the rest of the commit (I'll have to find his
> mail). If so, an index is required.
Ah, apparently it is sufficient:
Linus:
“Just signing the commit is indeed sufficient to just say "I trust
this commit". But I essentially want to also say what I trust it
_for_ as well.”
So this would work for commit objects.
It would also work for tag objects, if you pointed people at the signature
object rather than the actual tag object.
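To make the idea concrete: a tool that follows a signature object to
the object it signs would only need to read one header. A minimal
sketch, assuming the 'Signing <object ID>' header suggested earlier in
this thread is a single line carrying a 40-character hex SHA-1;
neither this header format nor the function name parse_signing_header
exists in git today:

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Extract the object ID from a header line of the hypothetical form
 * "Signing <40 hex digits>", optionally followed by a newline.
 * On success, copy the ID into `id` (NUL-terminated) and return 0;
 * return -1 if the line does not have that form.
 */
int parse_signing_header(const char *line, char id[41])
{
	static const char prefix[] = "Signing ";
	const size_t plen = sizeof(prefix) - 1;
	int i;

	assert(line != NULL && id != NULL);
	if (strncmp(line, prefix, plen) != 0)
		return -1;
	line += plen;
	for (i = 0; i < 40; i++) {
		/* isxdigit() rejects NUL, so a short line stops here. */
		if (!isxdigit((unsigned char)line[i]))
			return -1;
		id[i] = line[i];
	}
	if (line[40] != '\0' && line[40] != '\n')
		return -1;
	id[40] = '\0';
	return 0;
}
```

As noted in the thread, following the reference does not require
checking that the signature itself is valid; that would be a separate
step.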
regards,
--
Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A
Fortune:
Humor in the Court:
Q. Were you aquainted with the deceased?
A. Yes, sir.
Q. Before or after he died?
* Re: Git-commits mailing list feed.
From: Paul Jakma @ 2005-04-25 3:40 UTC (permalink / raw)
To: David A. Wheeler
Cc: Linus Torvalds, Sean, Thomas Glanzmann, David Woodhouse,
Jan Dittmer, Greg KH, Kernel Mailing List, Git Mailing List
In-Reply-To: <Pine.LNX.4.62.0504250413200.14200@sheen.jakma.org>
On Mon, 25 Apr 2005, Paul Jakma wrote:
> You don't even need it, see my other mail. If:
>
> - the signature is an object and added after the commit object
>
> - tools know that signatures are 'proxies of' or precursors to the
> objects they are signing (which makes sense, a signature by
> definition refers to something else)
>
> - the signature object refers to the object it is signing (eg a
> 'Signing <object ID>' header)
>
> Then head can simply be the signature object and tools can find the
> commit by following the 'Signing' field of the signature (they dont
> even need to check the signature is valid). No index lookup needed.
> You only need the index for historical verification really, and you can
> always generate an index if needs be. (and have the tools maintain it).
Uh, I have no idea whether verifying a signature of a commit object
is sufficient, ie equivalent to signing each file.
commit refers to tree objects, which I presume lists the SHA-1 object
IDs of files, but IIRC Linus already described why a signature of the
commit object should not be used to trust the rest of commit.. (i'll
have to find his mail). If so, an index is required.
regards,
--
Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A
Fortune:
Old programmers never die, they just hit account block limit.
* Re: Git-commits mailing list feed.
From: David A. Wheeler @ 2005-04-25 3:32 UTC (permalink / raw)
To: Linus Torvalds
Cc: Fabian Franz, Paul Jakma, Sean, Thomas Glanzmann, David Woodhouse,
Jan Dittmer, Greg KH, Kernel Mailing List, Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0504241938410.18901@ppc970.osdl.org>
On Mon, 25 Apr 2005, Fabian Franz wrote:
>> What about just <sha1 hash of object>.sig or <sha1 hash of object>.asc?
If you mean "hash of object being signed", the problem is that
there may be more than one signature of a given object.
Keys get stolen, for example, so you want to re-sign the objects.
Yes, you could replace the files, but it's nicer to make it
so there's never a need to replace files in the first place.
That's one of the nice properties of the git object database;
so if we can have that property everywhere, I think we should.
Instead, store the signatures in the normal object database, &
give it type "signature". To speed access FROM a commit or tag
to a signature (and FROM a commit to a tag), create a
separate reverse directory that tells you what objects reference
a given object. Like this:
.git/
objects/
00/
0195297c2a6336c2007548f909769e0862b509 <= a commit object
02/
0395297c2a6336c2007548f909769e0862b509 <= signature of commit
04/
0595297c2a6336c2007548f909769e0862b509 <= a tag
06/
0795297c2a6336c2007548f909769e0862b509 <= signature of tag
reverse/
00/
0195297c2a6336c2007548f909769e0862b509/
020395297c2a6336c2007548f909769e0862b509 "this signs commit"
.... other later signatures of this commit go here.
04/
0595297c2a6336c2007548f909769e0862b509/
060795297c2a6336c2007548f909769e0862b509
.... other later signatures of this tag go here.
The reverse directory's contents are basically the filenames.
The files themselves could be symlinks back up, or not.
Content-free files are probably more portable across filesystems,
and it's probably also good for space efficiency
(though I haven't examined that carefully).
"git"'s knowledge of signatures should be VERY limited, and
not dependent on PGP. I think that'd be easy.
You could prepend some signature data into the "signature" file to
make it much easier to reconstruct the reverse directory and
to make it easy to check things WITHOUT knowledge of PGP or whatever.
Here's potential output:
$ cat-file commit 000195297c2a6336c2007548f909769e0862b509
tree 2aaf94eae20acc451553766f3c063bc46cfa75c6
parent dc459bf85b3ff97333e759d641c5d18f4dad470d
author Petr Baudis <pasky@ucw.cz> 1114303479 +0200
committer Petr Baudis <xpasky@machine.sinus.cz> 1114303479 +0200
Added the whatsit flag.
$ cat-file signature 020395297c2a6336c2007548f909769e0862b509
signatureof commit 000195297c2a6336c2007548f909769e0862b509
signer Petr Baudis <pasky@ucw.cz>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
iD8DBQBCbFaRCxlT/+f+SU4RAgYSAKCWpPNlDKDkxuuA649zJop7WkQPnACdF1Fg
JgXatbJU8YJ7JHqvgyGepRU=
=Kttg
-----END PGP SIGNATURE-----
$
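A minimal Python sketch of how the reverse directory could be regenerated from the "signatureof" headers, assuming the signature objects have already been read into memory as text. The function name `build_reverse_index` and the use of a dict (instead of files under .git/reverse/) are illustrative assumptions, not existing git code.

```python
from collections import defaultdict


def build_reverse_index(signatures):
    """Map each signed object ID to the IDs of the signatures that sign it.

    `signatures` maps a signature object's ID to its text, whose first
    line is assumed to be "signatureof <type> <object ID>".
    """
    reverse = defaultdict(list)
    for sig_id, text in signatures.items():
        fields = text.split("\n", 1)[0].split()
        if len(fields) != 3 or fields[0] != "signatureof":
            continue  # no usable header; skip rather than guess
        reverse[fields[2]].append(sig_id)
    for sig_ids in reverse.values():
        sig_ids.sort()  # deterministic order, like a directory listing
    return dict(reverse)


# Object IDs taken from the directory layout shown earlier in this mail.
signatures = {
    "020395297c2a6336c2007548f909769e0862b509":
        "signatureof commit 000195297c2a6336c2007548f909769e0862b509\n"
        "signer Petr Baudis <pasky@ucw.cz>\n",
    "060795297c2a6336c2007548f909769e0862b509":
        "signatureof tag 040595297c2a6336c2007548f909769e0862b509\n",
}
index = build_reverse_index(signatures)
```

Because the index is derived only from the signature objects, it can be deleted and rebuilt at any time, which is the property the proposal relies on.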
--- David A. Wheeler
* Re: Git-commits mailing list feed.
From: Paul Jakma @ 2005-04-25 3:24 UTC (permalink / raw)
To: David A. Wheeler
Cc: Linus Torvalds, Sean, Thomas Glanzmann, David Woodhouse,
Jan Dittmer, Greg KH, Kernel Mailing List, Git Mailing List
In-Reply-To: <426C5F43.8010705@dwheeler.com>
On Sun, 24 Apr 2005, David A. Wheeler wrote:
> Now you just have to FIND the signature of a signed object, i.e.
> efficiently go the "other way" from signed object to detached
> signature. A separate directory with this mapping, or embedding
> the mapping inside the object directory (HASH.d/<list>) both solve
> it.
You don't even need it, see my other mail. If:
- the signature is an object and added after the commit object
- tools know that signatures are 'proxies of' or precursors to the
objects they are signing (which makes sense, a signature by
definition refers to something else)
- the signature object refers to the object it is signing (eg a
'Signing <object ID>' header)
Then head can simply be the signature object and tools can find the
commit by following the 'Signing' field of the signature (they don't
even need to check the signature is valid). No index lookup needed.
You only need the index for historical verification really, and you
can always generate an index if needs be. (and have the tools
maintain it).
> The more I think about it, the more I think a separate "reverse"
> index directory would be a better idea. It just needs to map from
> "me" to "who references me", at least so that you can quickly
> find all signatures of a given object. If the reverse directory
> gets wonky, anyone can just delete the reverse index directory
> at any time & reconstruct it by iterating the objects.
> Before "-----BEGIN PGP SIGNATURE-----" you should add:
> signatureof HASHVALUE
> to make reconstruction easy; PGP processors ignore stuff
> before "-----".
Oof, don't do this:
- makes assumptions about the format of the signature
- that it is ASCII
- that you can change it
Just add a git header which is independent of the signature data.
In lieu of the 'signature object as precursor' approach above, just
have the tools maintain an index. It can be maintained as objects are
added, and can always be blown away and recreated by inspection of
the repository data.
regards,
--
Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A
Fortune:
To doubt everything or to believe everything are two equally convenient
solutions; both dispense with the necessity of reflection.
-- H. Poincaré
* Revised PPC assembly implementation
From: linux @ 2005-04-25 3:13 UTC (permalink / raw)
To: paulus; +Cc: git, linux
In-Reply-To: <17003.9009.226712.220822@cargo.ozlabs.ibm.com>
Three changes:
- Added stack frame as per your description.
- Found two bugs. (Cutting & pasting too fast.) Fixed.
- Minor scheduling improvements. More to come.
Which leads to three questions:
- Is the stack set properly now?
- Does it produce the right answer now?
- Is it any faster?
Thanks for your help!
/*
* SHA-1 implementation for PowerPC.
*
* Copyright (C) 2005 Paul Mackerras <paulus@samba.org>
*/
/*
* We roll the registers for A, B, C, D, E around on each
* iteration; E on iteration t is D on iteration t+1, and so on.
* We use registers 6 - 10 for this. (Registers 27 - 31 hold
* the previous values.)
*/
#define RA(t) ((((t)+4)%5)+6)
#define RB(t) ((((t)+3)%5)+6)
#define RC(t) ((((t)+2)%5)+6)
#define RD(t) ((((t)+1)%5)+6)
#define RE(t) ((((t)+0)%5)+6)
/* We use registers 11 - 26 for the W values */
#define W(t) (((t)%16)+11)
/* Register 5 is used for the constant k */
/*
* Note that, in the previous step, RC was rotated, and RA was computed.
* So try to postpone using them, *especially* the latter.
*/
/* f(b,c,d) = "bitwise b ? c : d" = (b & c) + (~b & d) */
#define STEPD0(t) \
andc %r0,RD(t),RB(t); \
add %r0,%r0,W(t); \
add RE(t),RE(t),%r0; \
and %r0,RC(t),RB(t); \
add %r0,%r0,%r5; \
add RE(t),RE(t),%r0; \
rotlwi %r0,RA(t),5; \
rotlwi RB(t),RB(t),30; \
add RE(t),RE(t),%r0;
/* f(b,c,d) = b ^ c ^ d */
#define STEPD1(t) \
xor %r0,RD(t),RB(t); \
xor %r0,%r0,RC(t); \
add %r0,%r0,W(t); \
add RE(t),RE(t),%r0; \
rotlwi %r0,RA(t),5; \
add %r0,%r0,%r5; \
rotlwi RB(t),RB(t),30; \
add RE(t),RE(t),%r0;
/* f(b,c,d) = majority(b,c,d) = (b & d) + (c & (b ^ d)) */
#define STEPD2(t) \
and %r0,RD(t),RB(t); \
add %r0,%r0,W(t); \
add RE(t),RE(t),%r0; \
xor %r0,RD(t),RB(t); \
and %r0,%r0,RC(t); \
add RE(t),RE(t),%r0; \
rotlwi %r0,RA(t),5; \
add %r0,%r0,%r5; \
rotlwi RB(t),RB(t),30; \
add RE(t),RE(t),%r0;
#define LOADW(t) \
lwz W(t),(t)*4(%r4)
#define UPDATEW(t) \
xor %r0,W((t)-3),W((t)-8); \
xor W(t),W((t)-16),W((t)-14); \
xor W(t),W(t),%r0; \
rotlwi W(t),W(t),1
#define STEP0LD4(t) \
STEPD0(t); LOADW((t)+4); \
STEPD0((t)+1); LOADW((t)+5); \
STEPD0((t)+2); LOADW((t)+6); \
STEPD0((t)+3); LOADW((t)+7)
#define STEPUP4(t, fn) \
STEP##fn(t); UPDATEW((t)+4); \
STEP##fn((t)+1); UPDATEW((t)+5); \
STEP##fn((t)+2); UPDATEW((t)+6); \
STEP##fn((t)+3); UPDATEW((t)+7)
#define STEPUP20(t, fn) \
STEPUP4(t, fn); \
STEPUP4((t)+4, fn); \
STEPUP4((t)+8, fn); \
STEPUP4((t)+12, fn); \
STEPUP4((t)+16, fn)
.globl sha1_core
sha1_core:
stwu %r1,-80(%r1)
stmw %r13,4(%r1)
/* Load up A - E */
lmw %r27,0(%r3)
mtctr %r5
1: mr RA(0),%r27
LOADW(0)
mr RB(0),%r28
LOADW(1)
mr RC(0),%r29
LOADW(2)
mr RD(0),%r30
LOADW(3)
mr RE(0),%r31
lis %r5,0x5a82 /* K0-19 */
ori %r5,%r5,0x7999
STEP0LD4(0)
STEP0LD4(4)
STEP0LD4(8)
STEPUP4(12, D0)
STEPUP4(16, D0)
lis %r5,0x6ed9 /* K20-39 */
ori %r5,%r5,0xeba1
STEPUP20(20, D1)
lis %r5,0x8f1b /* K40-59 */
ori %r5,%r5,0xbcdc
STEPUP20(40, D2)
lis %r5,0xca62 /* K60-79 */
ori %r5,%r5,0xc1d6
STEPUP4(60, D1)
STEPUP4(64, D1)
STEPUP4(68, D1)
STEPUP4(72, D1)
STEPD1(76)
STEPD1(77)
STEPD1(78)
STEPD1(79)
/* Add results to original values */
add %r27,%r27,RA(0)
add %r28,%r28,RB(0)
add %r29,%r29,RC(0)
add %r30,%r30,RD(0)
add %r31,%r31,RE(0)
addi %r4,%r4,64
bdnz 1b
/* Save final hash, restore registers, and return */
stmw %r27,0(%r3)
lmw %r13,4(%r1)
addi %r1,%r1,80
blr
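For checking the answers, a small Python reference of the same computation may help. This is only a sketch and not part of the posted code: it uses the same three round functions and the same 16-word circular W buffer as the macros, and it renames the five working variables each step instead of moving them. The argument order (state, data) mirrors what the assembly appears to expect in r3 and r4; the padding helper `sha1` is an addition for testing.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF


def rotl(x, n):
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK


def sha1_core(state, data):
    """Process len(data) // 64 blocks, updating the five-word state in place."""
    for off in range(0, len(data), 64):
        w = list(struct.unpack(">16I", data[off:off + 64]))
        a, b, c, d, e = state
        for t in range(80):
            if t >= 16:
                # Same recurrence as UPDATEW, on a 16-word circular buffer.
                w[t % 16] = rotl(w[(t - 3) % 16] ^ w[(t - 8) % 16] ^
                                 w[(t - 14) % 16] ^ w[(t - 16) % 16], 1)
            if t < 20:
                f, k = (b & c) + (~b & d), 0x5A827999
            elif t < 40:
                f, k = b ^ c ^ d, 0x6ED9EBA1
            elif t < 60:
                f, k = (b & d) + (c & (b ^ d)), 0x8F1BBCDC
            else:
                f, k = b ^ c ^ d, 0xCA62C1D6
            e = (e + rotl(a, 5) + f + k + w[t % 16]) & MASK
            b = rotl(b, 30)
            # Rename rather than move: the new A is the old E, and so on.
            a, b, c, d, e = e, a, b, c, d
        for i, v in enumerate((a, b, c, d, e)):
            state[i] = (state[i] + v) & MASK


def sha1(message):
    """Pad `message` as SHA-1 requires and run it through sha1_core."""
    state = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack(">Q", 8 * len(message))
    sha1_core(state, padded)
    return struct.pack(">5I", *state).hex()


# Compare against the library implementation on a multi-block input.
sample = b"The quick brown fox jumps over the lazy dog" * 10
matches = sha1(sample) == hashlib.sha1(sample).hexdigest()
```

Dumping a, b, c, d, e after each step here and comparing with the registers in a debugger should narrow down any step where the assembly diverges.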
* Re: Git-commits mailing list feed.
From: Paul Jakma @ 2005-04-25 3:08 UTC (permalink / raw)
To: David A. Wheeler
Cc: Linus Torvalds, Sean, Thomas Glanzmann, David Woodhouse,
Jan Dittmer, Greg KH, Kernel Mailing List, Git Mailing List
In-Reply-To: <Pine.LNX.4.62.0504250323040.14200@sheen.jakma.org>
Ah, to add to below..
If one wished, one could optionally store the actual signature data
as a separate blob object and refer to it in the signing object. Not
needed really for a GPG ASCII clear-signed detached signature (tiny
and they're ASCII obviously :) ), but who knows.
On Mon, 25 Apr 2005, Paul Jakma wrote:
> - add the 'signature object' to the repository after the signed
> object
>
> So a 'signed commit' turns into the
>
> - tool preparing the commit object,
> - get the user to sign it
> - save the detached signature for later
> - adding the commit object to the repository
- adding the signature blob, if it is to be stored as a blob
> - prepare the signing object and add to repository
> The repository head then refers to the signature object, which could
> (handwaving) look something like:
>
> Object Signature
> Signing <object ID, in this case of the commit object>
> Sign-type GPG
With either a 'Signature <ID of signature data blob>' or else:
> <signature data>
regards,
--
Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A
Fortune:
May you have many beautiful and obedient daughters.
* Re: Git-commits mailing list feed.
From: David A. Wheeler @ 2005-04-25 3:08 UTC (permalink / raw)
To: Linus Torvalds
Cc: Paul Jakma, Sean, Thomas Glanzmann, David Woodhouse, Jan Dittmer,
Greg KH, Kernel Mailing List, Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0504241846290.18901@ppc970.osdl.org>
Linus Torvalds wrote:
>
> On Sun, 24 Apr 2005, David A. Wheeler wrote:
>
>>It may be better to have them as simple detached signatures, which are
>>completely separate files (see gpg --detached).
>
> Actually, if we do totally separate files, then the detached thing is ok,
> and we might decide to not call them objects at all, since that seems to be
> unnecessarily complex.
>
> Maybe we'll just have signed tags by doing exactly that: just a collection
> of detached signature files. The question becomes one of how to name such
> things in a distributed tree. That is the thing that using an object for
> them would have solved very naturally.
I agree, naming signatures using the same way other objects are named
would be very clean. So, why not? It's perfectly reasonable to
just store detached signatures as hashed objects, just like the rest;
just create a new object type ("signature").
If 3 different keys are used to sign the same object, the detached
signatures will have different hash values, so they'll get named easily.
Now you just have to FIND the signature of a signed object,
i.e. efficiently go the "other way" from signed object to detached
signature. A separate directory with this mapping, or embedding the
mapping inside the object directory (HASH.d/<list>) both solve it.
The more I think about it, the more I think a separate "reverse"
index directory would be a better idea. It just needs to map from
"me" to "who references me", at least so that you can quickly
find all signatures of a given object. If the reverse directory
gets wonky, anyone can just delete the reverse index directory
at any time & reconstruct it by iterating the objects.
Before "-----BEGIN PGP SIGNATURE-----" you should add:
signatureof HASHVALUE
to make reconstruction easy; PGP processors ignore stuff
before "-----". The PGP data does include a hash, but it's not
easy to get it out (I don't see a way to do it in gpg from the
command line), and it's quite possible that a signer won't
use SHA-1 when they sign something (they may not even
realize it; it depends on their implementation's configuration).
Better to include something about what was signed with the signature.
Hmm, probably worth backtracking to see what's needed.
There needs to be a way to identify tags, and a way to sign that
tag so that you can decide to trust some tags & not others.
There needs to be a way to sign commits, and store that info
for later. And really, these are special cases of general
assertions about other things; you might want someone to be
able to make other signed assertions (e.g., that it
passed test suite XYZ).
If tags & commits are all you plan to sign for now, well, you
already have commits. You can just add a "tag" type and a
"signature" type of object (the "signature" is just a detached
OpenPGP signature). "signature" can sign tag or commit types.
I still like the idea of a more general "assertion" type, esp.
for assertions that something passed a test suite on a certain date
or was reviewed at a certain date by someone, but admittedly
that could be added later in the same manner.
Then you need to be able to quickly find a signature, given a
commit or tag. A "reverse" directory then does that nicely,
and if you put enough information in front of the signature,
you can regenerate the reverse directory whenever you wish.
--- David A. Wheeler
* Re: Git-commits mailing list feed.
From: Paul Jakma @ 2005-04-25 3:03 UTC (permalink / raw)
To: David A. Wheeler
Cc: Linus Torvalds, Sean, Thomas Glanzmann, David Woodhouse,
Jan Dittmer, Greg KH, Kernel Mailing List, Git Mailing List
In-Reply-To: <426C5266.6050200@dwheeler.com>
On Sun, 24 Apr 2005, David A. Wheeler wrote:
> Right. I suggested putting it in the same directory as the
> objects, so that rsync users get them "for free", but a separate
> directory has its own advantages & that'd be fine too. In fact, the
> more I think about it, I think it'd be cleaner to have it separate.
> You could prepend on top of the signature (if signatures are
> separate from assertions) WHAT got signed so that the index could
> be recreated from scratch when desired.
Well, i'm trying to play with git right now to see what would fit
with how it abstracts things.
I think possibly:
- add the 'signature object' to the repository after the signed
object
So a 'signed commit' turns into the
- tool preparing the commit object,
- get the user to sign it
- save the detached signature for later
- adding the commit object to the repository
- prepare the signing object and add to repository
The repository head then refers to the signature object, which could
(handwaving) look something like:
Object Signature
Signing <object ID, in this case of the commit object>
Sign-type GPG
<signature data>
Tools should then treat signature objects as 'stand ins' for the
object they are signing (verify the signature - if desired - and then
just retrieve the 'Signing' object ID and use that further).
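A rough Python sketch of the 'stand in' behaviour, assuming the handwaved layout above with a blank line before the signature data (the blank line, the in-memory `store` dict and both function names are assumptions made for illustration; nothing like this exists in git):

```python
def parse_signature_object(text):
    """Split a signature object into (headers, signature data)."""
    lines = text.split("\n")
    headers = {}
    for i, line in enumerate(lines):
        if line == "":  # assumption: a blank line ends the headers
            return headers, "\n".join(lines[i + 1:])
        key, _, value = line.partition(" ")
        headers[key] = value
    return headers, ""


def resolve(object_id, store):
    """Follow 'Signing' headers until a non-signature object is reached."""
    seen = set()
    while store[object_id].startswith("Object Signature\n"):
        if object_id in seen:
            raise ValueError("signature objects form a cycle")
        seen.add(object_id)
        headers, _ = parse_signature_object(store[object_id])
        object_id = headers["Signing"]
    return object_id


# Hypothetical IDs; real ones would be SHA-1 names.
store = {
    "commit1": "tree tree1\nauthor someone\n\nA commit message.\n",
    "sig1": "Object Signature\nSigning commit1\nSign-type GPG\n\n<signature data>\n",
}
head = resolve("sig1", store)
```

Note that `resolve` never verifies anything; checking the signature data stays an optional, separate step, as described above.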
I have no working knowledge of git though, other than following this
list. So I have no idea whether above is at all appropriate or
workable.
> If you mean "the signatures aren't stored with the objects", NO.
> Please don't! If the signatures are not stored in the database,
> then over time they'll get lost.
No more lost than anything else in the git 'fs'.
If someone prunes old objects, they'll lose the signed objects along
with the signatures. If those files weren't replicated anywhere else,
well they've just blown away history for good, both the history of
the source and corresponding signatures.
> It's important to me to store the record of trust, as well as what
> changed, so that ANYONE can later go back and verify that things
> are as they're supposed to be, or exactly who trusted what.
See above.
> git definitely doesn't have this currently, though you could run
> the fsck tools which end up creating a lot of the info (but it's
> then thrown away).
Well, it could be retained then.
> Yes. The problem is that maintaining the index is a pain.
Possibly.
> It's probably worth it for signatures, because the primary use is
> the other direction ("who signed this?"); it's not clear that the
> other direction is common for other data.
In CVS it is. If you 'cvs log' a file, you can get a report on which
revisions of the file belong to which tags (which can be useful
information sometimes: "ah, so that release had the buggy version"
type of thing. Or as a sanity check to make sure you got a tag right
- particularly when you have to move a wrong tag[1]). So, in addition
to signatures, a general 'referrers of this object' index could be
useful for reports.
1. This might be just a CVS thing, and not wanted for git -> the
ability to tag historical revisions and indeed change what tags refer
to.
regards,
--
Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A
Fortune:
Decaffeinated coffee? Just Say No.
* Re: Git-commits mailing list feed.
From: Jan Harkes @ 2005-04-25 2:43 UTC (permalink / raw)
To: Kernel Mailing List, Git Mailing List
In-Reply-To: <20050425023420.GA14696@lists.us.dell.com>
On Sun, Apr 24, 2005 at 09:34:20PM -0500, Matt Domsch wrote:
> On Sun, Apr 24, 2005 at 09:01:28PM -0400, David A. Wheeler wrote:
> > It may be better to have them as simple detached signatures, which are
> > completely separate files (see gpg --detached).
> > Yeah, gpg currently implements detached signatures
> > by repeating what gets signed, which is unfortunate,
> > but the _idea_ is the right one.
>
> I solve this with two simple scripts, "sign" calls "cutsig".
...
> gpg --armor --clearsign --detach-sign --default-key "${DEFAULT_KEY} -v -v -o - ${1} | \
> ${CUTSIG} > ${1}.sign
You could also just leave out the --clearsign option and it will DTRT.
Jan
* Re: Git-commits mailing list feed.
From: Linus Torvalds @ 2005-04-25 2:44 UTC (permalink / raw)
To: Fabian Franz
Cc: David A. Wheeler, Paul Jakma, Sean, Thomas Glanzmann,
David Woodhouse, Jan Dittmer, Greg KH, Kernel Mailing List,
Git Mailing List
In-Reply-To: <200504250417.17231.FabianFranz@gmx.de>
On Mon, 25 Apr 2005, Fabian Franz wrote:
>
> What about just <sha1 hash of object>.sig or <sha1 hash of object>.asc?
Well, the SHA1 of an object really is not a very good name, unless you
have something to manage it with. Again, the object database has something
to manage and find those objects with - things like .git/HEAD, but also
"fsck" to find dangling and unnamed objects.
Maybe we'll never have so many tags that we need to manage them, and yes,
if so, we can just have ".git/signatures" be a directory with objects that
are just named for their content SHA1, the same way the object database
is, but separately (and probably just using a flat file structure, no need
for the subdirectory fan-out that the object directory has).
No need for a ".sig" thing, since they'd be defined to be signatures just
from their location.
> Or would this violate the concept of the object database to just contain
> hashes?
This wouldn't be an object at all in that case, they'd be totally outside
the scope of the git object model.
And yes, if they were to be git objects, they'd follow totally different
rules: they'd have to have the "tag+length+'\0'" format, and they would be
zlib-compressed.
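To make the "tag+length+'\0'" rule concrete, here is a small Python sketch of wrapping content as an object. It assumes the header is "<type> <decimal size>" followed by a NUL byte, and that the object name is the SHA-1 of the uncompressed header plus content; the function names, and the idea of passing "signature" as a type, are purely illustrative.

```python
import hashlib
import zlib


def encode_object(obj_type, content):
    """Return (name, stored bytes) for an object of the given type."""
    header = obj_type.encode("ascii") + b" " + str(len(content)).encode("ascii") + b"\0"
    raw = header + content
    # Assumption: the name is the SHA-1 of the uncompressed header+content.
    name = hashlib.sha1(raw).hexdigest()
    return name, zlib.compress(raw)


def decode_object(stored):
    """Inverse of encode_object: return (type, content) after a size check."""
    raw = zlib.decompress(stored)
    header, _, content = raw.partition(b"\0")
    obj_type, _, size = header.partition(b" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type.decode("ascii"), content


name, stored = encode_object("signature", b"-----BEGIN PGP SIGNATURE-----\n")
obj_type, content = decode_object(stored)
```

Two different signatures of the same object have different content, so they get different names without any extra naming scheme.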
If they are totally outside of git, then I don't care what the object
format is, and then it could be just a regular text-file with a signature
and content, and just happen to be named for the SHA1 hash so that there
is no confusion about what happens when multiple people happen to create
different tags with the same name.
Linus
* Re: Git-commits mailing list feed.
From: Andreas Gal @ 2005-04-25 2:39 UTC (permalink / raw)
To: Fabian Franz
Cc: Linus Torvalds, David A. Wheeler, Paul Jakma, Sean,
Thomas Glanzmann, David Woodhouse, Jan Dittmer, Greg KH,
Git Mailing List
In-Reply-To: <200504250417.17231.FabianFranz@gmx.de>
It may sound a little weird, but we could actually store the signature in
the inode/filename. GPG signatures seem to be around 80 bytes of ASCII;
that's well below MAXPATH and should work even if your repository is
somewhere/deep/in/your/filesystem/hierarchy.
1. Signed objects are named sha1-sig (sig is an 80-character signature
here, not the three letters sig).
2. To make sure we can find objects without their signature, there is
always a soft link sha1 -> sha1-sig (fsck can check this and create
missing links).
3. To find a signature, just follow the link and look at the real name.
4. Files can be distributed without signature (content is unchanged) and
you can sign them in your local tree with your own signature,
effectively throwing my signature away.
The only limitation is that each object can only be signed by one person.
On the other hand, this might not be a limitation at all. If I create a
file, I sign it. Nobody else. Same goes for trees and commits that I
create. You can sign your own commit object when you merge my stuff, and
then push that commit object out (along with your co-signature).
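Roughly, in Python, steps 1 to 3 could look like the sketch below. It assumes the signature has already been turned into a filename-safe string (armored signatures can contain "/", which cannot appear in a file name), and both function names are made up for this example.

```python
import os


def store_signed(directory, sha1, sig, content):
    """Write the object as sha1-sig and add the soft link sha1 -> sha1-sig."""
    real_name = sha1 + "-" + sig
    with open(os.path.join(directory, real_name), "wb") as f:
        f.write(content)
    # Relative link, so the directory can be moved or copied as a whole.
    os.symlink(real_name, os.path.join(directory, sha1))


def find_signature(directory, sha1):
    """Follow the link for sha1 and read the signature off the real name."""
    target = os.readlink(os.path.join(directory, sha1))
    prefix = sha1 + "-"
    if not target.startswith(prefix):
        raise ValueError("link does not point at a signed name")
    return target[len(prefix):]
```

Re-signing locally (step 4) would then be a rename of the real file plus replacing the link; the content itself is never touched.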
Andreas
On Mon, 25 Apr 2005, Fabian Franz wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Am Montag, 25. April 2005 03:50 schrieb Linus Torvalds:
>
> > Maybe we'll just have signed tags by doing exactly that: just a collection
> > of detached signature files. The question becomes one of how to name such
> > things in a distributed tree. That is the thing that using an object for
> > them would have solved very naturally.
>
> What about just <sha1 hash of object>.sig or <sha1 hash of object>.asc?
>
> Or would this violate the concept of the object database to just contain
> hashes?
>
> cu
>
> Fabian
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.2.4 (GNU/Linux)
>
> iD8DBQFCbFMsI0lSH7CXz7MRAof0AKCILjPE/M72cMSVNDC/DWYSzmrU/ACggOuS
> ogNPwUf2ASAwmbwixzSTuPs=
> =pW5D
> -----END PGP SIGNATURE-----
>
* Re: Git-commits mailing list feed.
From: Matt Domsch @ 2005-04-25 2:34 UTC (permalink / raw)
To: David A. Wheeler
Cc: Paul Jakma, Linus Torvalds, Sean, Thomas Glanzmann,
David Woodhouse, Jan Dittmer, Greg KH, Kernel Mailing List,
Git Mailing List
In-Reply-To: <426C4168.6030008@dwheeler.com>
On Sun, Apr 24, 2005 at 09:01:28PM -0400, David A. Wheeler wrote:
> It may be better to have them as simple detached signatures, which are
> completely separate files (see gpg --detached).
> Yeah, gpg currently implements detached signatures
> by repeating what gets signed, which is unfortunate,
> but the _idea_ is the right one.
I solve this with two simple scripts, "sign" calls "cutsig".
--------------
sign
#!/bin/sh
DEFAULT_KEY="my-private-key-string"
CUTSIG=~/bin/cutsig.pl
usage()
{
echo "usage: $0 filename"
echo " produces filename.sign"
}
if [ $# -lt 1 ]; then
usage
exit 1;
fi
gpg --armor --clearsign --detach-sign --default-key "${DEFAULT_KEY}" -v -v -o - "${1}" | \
${CUTSIG} > "${1}.sign"
exit 0
-----------------
cutsig
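#!/usr/bin/perl -w
# Skip everything before the signature block, then copy the rest to stdout.
do {
$line = <STDIN>;
exit 1 unless defined $line; # EOF before any signature was seen
} until $line =~ /-----BEGIN PGP SIGNATURE-----/;
print $line;
while ( $line = <STDIN>) {
print $line;
}
exit 0;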
* Re: Git-commits mailing list feed.
From: Fabian Franz @ 2005-04-25 2:17 UTC (permalink / raw)
To: Linus Torvalds, David A. Wheeler
Cc: Paul Jakma, Sean, Thomas Glanzmann, David Woodhouse, Jan Dittmer,
Greg KH, Kernel Mailing List, Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0504241846290.18901@ppc970.osdl.org>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Am Montag, 25. April 2005 03:50 schrieb Linus Torvalds:
> Maybe we'll just have signed tags by doing exactly that: just a collection
> of detached signature files. The question becomes one of how to name such
> things in a distributed tree. That is the thing that using an object for
> them would have solved very naturally.
What about just <sha1 hash of object>.sig or <sha1 hash of object>.asc?
Or would this violate the concept of the object database to just contain
hashes?
cu
Fabian
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
iD8DBQFCbFMsI0lSH7CXz7MRAof0AKCILjPE/M72cMSVNDC/DWYSzmrU/ACggOuS
ogNPwUf2ASAwmbwixzSTuPs=
=pW5D
-----END PGP SIGNATURE-----
* Re: Git-commits mailing list feed.
From: David A. Wheeler @ 2005-04-25 2:13 UTC (permalink / raw)
To: Paul Jakma
Cc: Linus Torvalds, Sean, Thomas Glanzmann, David Woodhouse,
Jan Dittmer, Greg KH, Kernel Mailing List, Git Mailing List
In-Reply-To: <Pine.LNX.4.62.0504250212200.14200@sheen.jakma.org>
Paul Jakma wrote:
> On Sun, 24 Apr 2005, David A. Wheeler wrote:
> Hmm, what do you mean by "repeating what gets signed"?
Forget it, irrelevant. I vaguely remembered some problem with
gpg's detached signatures, but it was probably either a really
early alpha version or someone was using "--clearsign" instead
of "--armor". I just did a quick check with:
gpg --armor --detach -o junk.sig junk.c
and it worked "as expected"; no repeat of the data.
>> Yes, and see my earlier posting. It'd be easy to store signatures in
>> the current objects directory, of course. The trick is to be able
>> to go from signed-object to the signature;
> Two ways:
> 1. An index of sigs to signed-object.
> (or more generally: objects to referring-objects)
Right. I suggested putting it in the same directory as the objects,
so that rsync users get them "for free", but a separate directory
has its own advantages & that'd be fine too.
In fact, the more I think about it, I think it'd be cleaner
to have it separate. You could prepend on top of the signature
(if signatures are separate from assertions) WHAT got signed so
that the index could be recreated from scratch when desired.
> 2. Just give people the URI of the signature, let them (or their
> tools) follow the 'parent' link to the object of interest
If you mean "the signatures aren't stored with the objects", NO.
Please don't! If the signatures are not stored in the database,
then over time they'll get lost. It's important to me to
store the record of trust, as well as what changed, so that
ANYONE can later go back and verify that things are as they're
supposed to be, or exactly who trusted what.
> I think it might be more useful just to provide a general index to
> lookup 'referring' objects (if git does not already - I dont think it
> does, but I dont know enough to know for sure).
git definitely doesn't have this currently, though you could run the
fsck tools which end up creating a lot of the info (but it's then
thrown away).
> So you could ask "which
> {commit,tag,signature,tree}(s) refer(s) to this object?" - that general
> concept will always work.
Yes. The problem is that maintaining the index is a pain.
It's probably worth it for signatures, because the primary use
is the other direction ("who signed this?"); it's not clear that
the other direction is common for other data.
--- David A. Wheeler
* unpack_sha1_file issues
From: Morten Welinder @ 2005-04-25 2:01 UTC (permalink / raw)
To: GIT Mailing List, Linus Torvalds
unpack_sha1_file is being called by fsck-cache. Therefore it should not
assume that a \0 occurs within the first 8192 bytes of the uncompressed
data. However, currently both the sscanf and strlen calls do just that.
Also, unpack_sha1_file should call inflateEnd in a couple of the error cases.
Finally, if *size==0, shouldn't unpack_sha1_file allocate a single byte
to prevent malloc from returning NULL?
Morten
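To illustrate the first point outside of C, a bounded header parse could look like the Python sketch below. It is not a patch for unpack_sha1_file; `parse_header` is a hypothetical helper that refuses to go on when no \0 is found in the bytes it was given, instead of scanning past them.

```python
def parse_header(buf):
    """Parse b"<type> <size>\\0" from the start of buf, or return None.

    Returns (type, size, offset of the first content byte) on success.
    """
    end = buf.find(b"\0")
    if end < 0:
        return None  # no terminator in the data we have: corrupt or truncated
    fields = buf[:end].split(b" ")
    if len(fields) != 2 or not fields[1].isdigit():
        return None
    return fields[0].decode("ascii", "replace"), int(fields[1]), end + 1


ok = parse_header(b"blob 5\0hello")
bad = parse_header(b"x" * 8192)  # a full first chunk without any \0
empty = parse_header(b"blob 0\0")  # zero-length content is still valid
```

The zero-size case parses fine here; in C it is the malloc(0) question raised above that needs separate handling.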
* Re: Git-commits mailing list feed.
From: Linus Torvalds @ 2005-04-25 1:50 UTC (permalink / raw)
To: David A. Wheeler
Cc: Paul Jakma, Sean, Thomas Glanzmann, David Woodhouse, Jan Dittmer,
Greg KH, Kernel Mailing List, Git Mailing List
In-Reply-To: <426C4168.6030008@dwheeler.com>
On Sun, 24 Apr 2005, David A. Wheeler wrote:
>
> It may be better to have them as simple detached signatures, which are
> completely separate files (see gpg --detached).
> Yeah, gpg currently implements detached signatures
> by repeating what gets signed, which is unfortunate,
> but the _idea_ is the right one.
Actually, if we do totally separate files, then the detached thing is ok,
and we might decide to not call them objects at all, since that seems to be
unnecessarily complex.
Maybe we'll just have signed tags by doing exactly that: just a collection
of detached signature files. The question becomes one of how to name such
things in a distributed tree. That is the thing that using an object for
them would have solved very naturally.
Linus
* Re: Git-commits mailing list feed.
From: Paul Jakma @ 2005-04-25 1:35 UTC (permalink / raw)
To: David A. Wheeler
Cc: Linus Torvalds, Sean, Thomas Glanzmann, David Woodhouse,
Jan Dittmer, Greg KH, Kernel Mailing List, Git Mailing List
In-Reply-To: <426C4168.6030008@dwheeler.com>
On Sun, 24 Apr 2005, David A. Wheeler wrote:
> It may be better to have them as simple detached signatures, which
> are completely separate files (see gpg --detached). Yeah, gpg
> currently implements detached signatures by repeating what gets
> signed, which is unfortunate, but the _idea_ is the right one.
Hmm, what do you mean by "repeating what gets signed"?
> Yes, and see my earlier posting. It'd be easy to store signatures in
> the current objects directory, of course. The trick is to be able
> to go from signed-object to the signature;
Two ways:
1. An index of sigs to signed-object.
(or more generally: objects to referring-objects)
2. Just give people the URI of the signature, let them (or their
tools) follow the 'parent' link to the object of interest
> this could be done just by creating a subdirectory using a variant
> of the name of the signed-object's file, and in that directory
> store the hash values of the signatures. E.G.:
> 00/
> 3b128932189018329839019 <- object to sign
> 3b128932189018329839019.d/
> 0143709289032890234323451
> 01/
> 43709289032890234323451 <- signature
You could hack it in to the namespace somehow I guess. I'm not sure
hacking it in would be a good thing though.
I think it might be more useful just to provide a general index to
lookup 'referring' objects (if git does not already - I don't think it
does, but I don't know enough to know for sure). So you could ask
"which {commit,tag,signature,tree}(s) refer(s) to this object?" -
that general concept will always work. If you wanted to make the
implementation of this index use some kind of sub directory as in the
above, fine..
See also method 2 above. Which would be more efficient for tools if,
within a project, some developers sign their 'updates' and some
don't.. (you never need to check whether there's a signature or not -
you'll know it from the URI automatically).
> There are LOTS of reasons for storing signatures so that they can
> be checked later on, just like there are lots of reasons for
> storing old code... they give you evidence that the reputed history
> is true (and if you doubt it, they give you a way to limit the
> doubt).
Indeed.
Anyway, we shall see what Linus does. :)
(But I do hope at least that signatures are /not/ included inline
using BEGIN PGP.. in the object that is signed.)
regards,
--
Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A
Fortune:
To err is human, to purr feline.
To err is human, two curs canine.
To err is human, to moo bovine.