* Re: gitweb wishlist
From: Thomas Glanzmann @ 2005-05-20 19:13 UTC (permalink / raw)
To: git
In-Reply-To: <1116615600.12975.33.camel@dhcp-188>
Hello Kay,
I would like to be able to click on the file name itself instead of the
separate 'blob' link in a directory view, because that is more intuitive and
you can already click on directories:
http://www.kernel.org/git/?p=git/git.git;a=tree;h=665a48af9e192ed84d2707c95d4c0d9c45eb45ad;hb=411746940f02f6fb90c4b6b97c6f07cee599c2e1
Thanks for this great tool!
Sincerely,
Thomas
^ permalink raw reply
* Re: gitweb wishlist
From: Linus Torvalds @ 2005-05-20 19:22 UTC (permalink / raw)
To: Kay Sievers; +Cc: Petr Baudis, Git Mailing List, Peter Anvin
In-Reply-To: <1116615600.12975.33.camel@dhcp-188>
On Fri, 20 May 2005, Kay Sievers wrote:
>
> Something like this?:
> http://kernel.org/git/?p=git/git.git;a=commitdiff;h=de809dbbce497e0d107562615c1d85ff35b4e0c5
Btw, at least for me, this looks much more interesting than the "commit"
thing, and maybe it would make sense to make the summary links be to the
"commitdiff" instead of the "commit"?
Or is it just so much more expensive to generate, that we want to not have
people go there normally? (hpa cc'd, since he may have some insight into
whether this is likely to be an issue or not? It's not like git-diff-tree
is that expensive, but it _does_ end up doing a "diff" against each
changed file, of course, modulo any caching of results).
Linus
^ permalink raw reply
* add conf file support to gitweb
From: Andres Salomon @ 2005-05-20 19:29 UTC (permalink / raw)
To: git, Kay Sievers
[-- Attachment #1: Type: text/plain, Size: 405 bytes --]
Hi,
The attached patch makes gitweb read and eval variables from
an /etc/gitweb.conf file. This is useful for distributions; I'm
packaging gitweb for debian, and want to have a separate config file
that users can edit that won't get overwritten when they upgrade gitweb.
Even if you don't take this patch, please consider some other method
that decouples the configuration from the gitweb.cgi script.
[-- Attachment #2: gitweb.conf.patch --]
[-- Type: text/x-patch, Size: 687 bytes --]
Index: gitweb.cgi
===================================================================
--- 8b7a4b08ba4892970a2531d4c1584e3881a13586/gitweb.cgi (mode:100644)
+++ fe8329b147103e115e2ad727bfca34c2ecfa901d/gitweb.cgi (mode:100755)
@@ -40,6 +40,16 @@
#my $projects_list = $projectroot;
my $projects_list = "index/index.aux";
+# allow config file to override settings above
+if (-r '/etc/gitweb.conf') {
+ open(CONF, '/etc/gitweb.conf') || die_error(undef, "Cannot open /etc/gitweb.conf.");
+ while (<CONF>) {
+ chomp;
+ eval($_) if ($_ =~ /^\s*(\$[\w]+)\s*=\s*(.*)\s*$/);
+ }
+ close(CONF);
+}
+
# input validation and dispatch
my $action = $cgi->param('a');
if (defined $action) {
^ permalink raw reply
* Re: gitweb and kernel.org
From: Jeff Garzik @ 2005-05-20 19:47 UTC (permalink / raw)
To: Kay Sievers; +Cc: Git Mailing List, FTP Admin
In-Reply-To: <1116615502.12975.29.camel@dhcp-188>
Kay Sievers wrote:
> Initial support for branches added! :)
Wow, that was fast. Thanks.
Jeff
^ permalink raw reply
* Re: gitweb wishlist
From: H. Peter Anvin @ 2005-05-20 20:34 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Kay Sievers, Petr Baudis, Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505201219420.2206@ppc970.osdl.org>
Linus Torvalds wrote:
>
> On Fri, 20 May 2005, Kay Sievers wrote:
>
>>Something like this?:
>> http://kernel.org/git/?p=git/git.git;a=commitdiff;h=de809dbbce497e0d107562615c1d85ff35b4e0c5
>
>
> Btw, at least for me, this looks much more interesting than the "commit"
> thing, and maybe it would make sense to make the summary links be to the
> "commitdiff" instead of the "commit"?
>
> Or is it just so much more expensive to generate, that we want to not have
> people go there normally? (hpa cc'd, since he may have some insight into
> whether this is likely to be an issue or not? It's not like git-diff-tree
> is that expensive, but it _does_ end up doing a "diff" against each
> changed file, of course, modulo any caching of results).
>
What I ended up doing for the diff viewer on kernel.org is that every
page that's generated gets stuffed in a cache (locklessly indexed by a
SHA-1 of a canonicalized form of the query); the pages people actually
see are then simply pulled from the cache. This caching was just an
enormous win. In the case of the diff viewer, the header is generated
each time, since I allow the user to select a custom style sheet (and
don't want to cache versions for each style sheet), but that's a trivial
detail.
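
A minimal sketch of what "canonicalized" could mean for the cache key, assuming gitweb-style queries are ';'-separated key=value pairs and that sorting them is enough to make equivalent queries identical. The names canonicalize_query and MAX_PARAMS are made up for illustration; the actual kernel.org code is not shown in this thread. The returned string is what would then be hashed with SHA-1 to index the cache.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_PARAMS 64	/* hypothetical upper bound, chosen for this sketch */

/* qsort() comparator: order two "key=value" strings bytewise. */
static int cmp_param(const void *a, const void *b)
{
	return strcmp(*(char *const *)a, *(char *const *)b);
}

/*
 * Return a newly allocated canonical form of a ';'-separated query
 * string: empty parameters are dropped and the rest are sorted.
 * Returns NULL on allocation failure or if there are more than
 * MAX_PARAMS parameters.  The caller frees the result.
 */
char *canonicalize_query(const char *query)
{
	size_t len = strlen(query);
	char *copy = malloc(len + 1);
	char *out = malloc(len + 1);
	char *params[MAX_PARAMS];
	char *tok, *p;
	size_t n = 0, i;

	if (!copy || !out)
		goto fail;
	memcpy(copy, query, len + 1);

	/* strtok() skips empty fields, so "a=1;;b=2" yields two tokens. */
	for (tok = strtok(copy, ";"); tok; tok = strtok(NULL, ";")) {
		if (n == MAX_PARAMS)
			goto fail;
		params[n++] = tok;
	}
	qsort(params, n, sizeof(params[0]), cmp_param);

	/* Re-join the sorted parameters; this never exceeds the input length. */
	p = out;
	for (i = 0; i < n; i++) {
		size_t plen = strlen(params[i]);
		if (i)
			*p++ = ';';
		memcpy(p, params[i], plen);
		p += plen;
	}
	*p = '\0';
	free(copy);
	return out;

fail:
	free(copy);
	free(out);
	return NULL;
}
```

With this, "h=...;a=commitdiff;p=git/git.git" and "p=git/git.git;a=commitdiff;h=..." would map to the same key, so both requests would hit the same cached page.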
-hpa
^ permalink raw reply
* Re: gitweb wishlist
From: Linus Torvalds @ 2005-05-20 20:49 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Kay Sievers, Petr Baudis, Git Mailing List
In-Reply-To: <428E49DD.406@zytor.com>
On Fri, 20 May 2005, H. Peter Anvin wrote:
>
> What I ended up doing for the diff viewer on kernel.org is that every
> page that's generated gets stuffed in a cache (locklessly indexed by a
> SHA-1 of a canonicalized form of the query); the pages people actually
> see are then simply pulled from the cache. This caching was just an
> enormous win.
Ok. That still leaves the bandwidth issue (the full diffs are bigger than
the commit object), but usually the diffs in individual commits aren't
_that_ large, so maybe it's a non-issue.
Oh, btw, I notice that you moved klibc over to git - care to share your
cvs->git script (I assume you scripted it ;)? That would seem to be an
obvious addition to the core stuff..
Linus
^ permalink raw reply
* Re: gitweb wishlist
From: H. Peter Anvin @ 2005-05-20 20:50 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Kay Sievers, Petr Baudis, Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505201346330.2206@ppc970.osdl.org>
Linus Torvalds wrote:
>
> Oh, btw, I notice that you moved klibc over to git - care to share your
> cvs->git script (I assume you scripted it ;)? That would seem to be an
> obvious addition to the core stuff..
>
Actually, Kay did the conversion... the scripts are clearly very
cantankerous, because if *I* run them -- I tried -- they don't work!
Since it's Kay's work, I'll leave them to him, but I would definitely
love to move more of my CVS repos over to git, especially syslinux.
-hpa
^ permalink raw reply
* [PATCH 1/3] delta read
From: Nicolas Pitre @ 2005-05-20 20:57 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
This patch makes the core code aware of delta objects and undeltafy
them as needed. The convention is to use read_sha1_file() to have
undeltafication done automatically (most users do that already so
this is transparent).
If the delta object itself has to be accessed then it must be done
through map_sha1_file() and unpack_sha1_file().
In that context mktag.c has been switched to read_sha1_file() as there
is no reason to do the full map+unpack manually.
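
For readers following along, here is a rough sketch of the payload layout that the sha1_file.c hunk relies on: the first 20 bytes of an unpacked "delta" object are the SHA-1 of the reference object, and the remainder is the delta data handed to patch_delta(). The struct delta_view and split_delta_payload() names are hypothetical helpers for illustration only, not part of the patch.

```c
#define SHA1_RAW_LEN 20	/* length of a raw (binary) SHA-1 */

/* Hypothetical read-only view onto an unpacked delta object payload. */
struct delta_view {
	const unsigned char *ref_sha1;	/* SHA-1 of the reference object */
	const unsigned char *data;	/* delta data for patch_delta() */
	unsigned long data_size;	/* size of the delta data in bytes */
};

/*
 * Split an unpacked "delta" payload into its two parts.
 * Returns 0 on success, or -1 if the payload is too short to hold a
 * reference SHA-1 plus at least one byte of delta data.
 */
int split_delta_payload(const void *buf, unsigned long size,
			struct delta_view *out)
{
	const unsigned char *p = buf;

	if (!buf || size <= SHA1_RAW_LEN)
		return -1;
	out->ref_sha1 = p;
	out->data = p + SHA1_RAW_LEN;
	out->data_size = size - SHA1_RAW_LEN;
	return 0;
}
```

The size check mirrors the "delta_size > 20" test in the patch: a payload of 20 bytes or fewer cannot contain any delta data.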
Signed-off-by: Nicolas Pitre <nico@cam.org>
Index: git/sha1_file.c
===================================================================
--- git.orig/sha1_file.c
+++ git/sha1_file.c
@@ -9,6 +9,7 @@
#include <stdarg.h>
#include <limits.h>
#include "cache.h"
+#include "delta.h"
#ifndef O_NOATIME
#if defined(__linux__) && (defined(__i386__) || defined(__PPC__))
@@ -353,6 +354,19 @@
if (map) {
buf = unpack_sha1_file(map, mapsize, type, size);
munmap(map, mapsize);
+ if (buf && !strcmp(type, "delta")) {
+ void *ref = NULL, *delta = buf;
+ unsigned long ref_size, delta_size = *size;
+ buf = NULL;
+ if (delta_size > 20)
+ ref = read_sha1_file(delta, type, &ref_size);
+ if (ref)
+ buf = patch_delta(ref, ref_size,
+ delta+20, delta_size-20,
+ size);
+ free(delta);
+ free(ref);
+ }
return buf;
}
return NULL;
Index: git/mktag.c
===================================================================
--- git.orig/mktag.c
+++ git/mktag.c
@@ -25,20 +25,14 @@
static int verify_object(unsigned char *sha1, const char *expected_type)
{
int ret = -1;
- unsigned long mapsize;
- void *map = map_sha1_file(sha1, &mapsize);
+ char type[100];
+ unsigned long size;
+ void *buffer = read_sha1_file(sha1, type, &size);
- if (map) {
- char type[100];
- unsigned long size;
- void *buffer = unpack_sha1_file(map, mapsize, type, &size);
-
- if (buffer) {
- if (!strcmp(type, expected_type))
- ret = check_sha1_signature(sha1, buffer, size, type);
- free(buffer);
- }
- munmap(map, mapsize);
+ if (buffer) {
+ if (!strcmp(type, expected_type))
+ ret = check_sha1_signature(sha1, buffer, size, type);
+ free(buffer);
}
return ret;
}
^ permalink raw reply
* [PATCH 2/3] delta check
From: Nicolas Pitre @ 2005-05-20 20:59 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
This patch adds knowledge of delta objects to fsck-cache and various
object parsing code. A new switch to git-fsck-cache is provided to
display the maximum delta depth found in a repository.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Index: git/fsck-cache.c
===================================================================
--- git.orig/fsck-cache.c
+++ git/fsck-cache.c
@@ -6,15 +6,46 @@
#include "tree.h"
#include "blob.h"
#include "tag.h"
+#include "delta.h"
#define REACHABLE 0x0001
static int show_root = 0;
static int show_tags = 0;
static int show_unreachable = 0;
+static int show_max_delta_depth = 0;
static int keep_cache_objects = 0;
static unsigned char head_sha1[20];
+static void expand_deltas(void)
+{
+ int i, max_depth = 0;
+
+ /*
+ * To be as efficient as possible we look for delta heads and
+ * recursively process them going backward, and parsing
+ * resulting objects along the way. This allows for processing
+ * each delta object only once regardless of the delta depth.
+ */
+ for (i = 0; i < nr_objs; i++) {
+ struct object *obj = objs[i];
+ if (obj->parsed && !obj->delta && obj->attached_deltas) {
+ int depth = 0;
+ char type[10];
+ unsigned long size;
+ void *buf = read_sha1_file(obj->sha1, type, &size);
+ if (!buf)
+ continue;
+ depth = process_deltas(buf, size, obj->type,
+ obj->attached_deltas);
+ if (max_depth < depth)
+ max_depth = depth;
+ }
+ }
+ if (show_max_delta_depth)
+ printf("maximum delta depth = %d\n", max_depth);
+}
+
static void check_connectivity(void)
{
int i;
@@ -25,7 +56,12 @@
struct object_list *refs;
if (!obj->parsed) {
- printf("missing %s %s\n", obj->type, sha1_to_hex(obj->sha1));
+ if (obj->delta)
+ printf("unresolved delta %s\n",
+ sha1_to_hex(obj->sha1));
+ else
+ printf("missing %s %s\n",
+ obj->type, sha1_to_hex(obj->sha1));
continue;
}
@@ -43,7 +79,12 @@
continue;
if (show_unreachable && !(obj->flags & REACHABLE)) {
- printf("unreachable %s %s\n", obj->type, sha1_to_hex(obj->sha1));
+ if (obj->attached_deltas)
+ printf("foreign delta reference %s\n",
+ sha1_to_hex(obj->sha1));
+ else
+ printf("unreachable %s %s\n",
+ obj->type, sha1_to_hex(obj->sha1));
continue;
}
@@ -201,6 +242,8 @@
return fsck_commit((struct commit *) obj);
if (obj->type == tag_type)
return fsck_tag((struct tag *) obj);
+ if (!obj->type && obj->delta)
+ return 0;
return -1;
}
@@ -384,6 +427,10 @@
show_root = 1;
continue;
}
+ if (!strcmp(arg, "--delta-depth")) {
+ show_max_delta_depth = 1;
+ continue;
+ }
if (!strcmp(arg, "--cache")) {
keep_cache_objects = 1;
continue;
@@ -400,6 +447,8 @@
}
fsck_sha1_list();
+ expand_deltas();
+
heads = 0;
for (i = 1; i < argc; i++) {
const char *arg = argv[i];
@@ -423,7 +472,7 @@
}
/*
- * If we've not been gived any explicit head information, do the
+ * If we've not been given any explicit head information, do the
* default ones from .git/refs. We also consider the index file
* in this case (ie this implies --cache).
*/
Index: git/delta.c
===================================================================
--- /dev/null
+++ git/delta.c
@@ -0,0 +1,115 @@
+#include "object.h"
+#include "blob.h"
+#include "tree.h"
+#include "commit.h"
+#include "tag.h"
+#include "delta.h"
+#include "cache.h"
+#include <string.h>
+
+/* the delta object definition (it can alias any other object) */
+struct delta {
+ union {
+ struct object object;
+ struct blob blob;
+ struct tree tree;
+ struct commit commit;
+ struct tag tag;
+ } u;
+};
+
+struct delta *lookup_delta(unsigned char *sha1)
+{
+ struct object *obj = lookup_object(sha1);
+ if (!obj) {
+ struct delta *ret = xmalloc(sizeof(struct delta));
+ memset(ret, 0, sizeof(struct delta));
+ created_object(sha1, &ret->u.object);
+ return ret;
+ }
+ return (struct delta *) obj;
+}
+
+int parse_delta_buffer(struct delta *item, void *buffer, unsigned long size)
+{
+ struct object *reference;
+ struct object_list *p;
+
+ if (item->u.object.delta)
+ return 0;
+ item->u.object.delta = 1;
+ if (size <= 20)
+ return -1;
+ reference = lookup_object(buffer);
+ if (!reference) {
+ struct delta *ref = xmalloc(sizeof(struct delta));
+ memset(ref, 0, sizeof(struct delta));
+ created_object(buffer, &ref->u.object);
+ reference = &ref->u.object;
+ }
+
+ p = xmalloc(sizeof(*p));
+ p->item = &item->u.object;
+ p->next = reference->attached_deltas;
+ reference->attached_deltas = p;
+ return 0;
+}
+
+int process_deltas(void *src, unsigned long src_size, const char *src_type,
+ struct object_list *delta_list)
+{
+ int deepest = 0;
+ do {
+ struct object *obj = delta_list->item;
+ static char type[10];
+ void *map, *delta, *buf;
+ unsigned long map_size, delta_size, buf_size;
+ map = map_sha1_file(obj->sha1, &map_size);
+ if (!map)
+ continue;
+ delta = unpack_sha1_file(map, map_size, type, &delta_size);
+ munmap(map, map_size);
+ if (!delta)
+ continue;
+ if (strcmp(type, "delta") || delta_size <= 20) {
+ free(delta);
+ continue;
+ }
+ buf = patch_delta(src, src_size,
+ delta+20, delta_size-20,
+ &buf_size);
+ free(delta);
+ if (!buf)
+ continue;
+ if (check_sha1_signature(obj->sha1, buf, buf_size, src_type) < 0)
+ printf("sha1 mismatch for delta %s\n", sha1_to_hex(obj->sha1));
+ if (obj->type && obj->type != src_type) {
+ error("got %s when expecting %s for delta %s",
+ src_type, obj->type, sha1_to_hex(obj->sha1));
+ free(buf);
+ continue;
+ }
+ obj->type = src_type;
+ if (src_type == blob_type) {
+ parse_blob_buffer((struct blob *)obj, buf, buf_size);
+ } else if (src_type == tree_type) {
+ parse_tree_buffer((struct tree *)obj, buf, buf_size);
+ } else if (src_type == commit_type) {
+ parse_commit_buffer((struct commit *)obj, buf, buf_size);
+ } else if (src_type == tag_type) {
+ parse_tag_buffer((struct tag *)obj, buf, buf_size);
+ } else {
+ error("unknown object type %s", src_type);
+ free(buf);
+ continue;
+ }
+ if (obj->attached_deltas) {
+ int depth = process_deltas(buf, buf_size, src_type,
+ obj->attached_deltas);
+ if (deepest < depth)
+ deepest = depth;
+ }
+ free(buf);
+ } while ((delta_list = delta_list->next));
+ return deepest + 1;
+}
Index: git/tag.c
===================================================================
--- git.orig/tag.c
+++ git/tag.c
@@ -13,6 +13,8 @@
ret->object.type = tag_type;
return ret;
}
+ if (!obj->type)
+ obj->type = tag_type;
if (obj->type != tag_type) {
error("Object %s is a %s, not a tree",
sha1_to_hex(sha1), obj->type);
Index: git/tree.c
===================================================================
--- git.orig/tree.c
+++ git/tree.c
@@ -83,6 +83,8 @@
ret->object.type = tree_type;
return ret;
}
+ if (!obj->type)
+ obj->type = tree_type;
if (obj->type != tree_type) {
error("Object %s is a %s, not a tree",
sha1_to_hex(sha1), obj->type);
Index: git/blob.c
===================================================================
--- git.orig/blob.c
+++ git/blob.c
@@ -14,6 +14,8 @@
ret->object.type = blob_type;
return ret;
}
+ if (!obj->type)
+ obj->type = blob_type;
if (obj->type != blob_type) {
error("Object %s is a %s, not a blob",
sha1_to_hex(sha1), obj->type);
Index: git/delta.h
===================================================================
--- git.orig/delta.h
+++ git/delta.h
@@ -1,6 +1,21 @@
+#ifndef DELTA_H
+#define DELTA_H
+
+/* handling of delta buffers */
extern void *diff_delta(void *from_buf, unsigned long from_size,
void *to_buf, unsigned long to_size,
unsigned long *delta_size);
extern void *patch_delta(void *src_buf, unsigned long src_size,
void *delta_buf, unsigned long delta_size,
unsigned long *dst_size);
+
+/* handling of delta objects */
+struct delta;
+struct object_list;
+extern struct delta *lookup_delta(unsigned char *sha1);
+extern int parse_delta_buffer(struct delta *item, void *buffer, unsigned long size);
+extern int parse_delta(struct delta *item, unsigned char *sha1);
+extern int process_deltas(void *src, unsigned long src_size,
+ const char *src_type, struct object_list *delta);
+
+#endif
Index: git/commit.c
===================================================================
--- git.orig/commit.c
+++ git/commit.c
@@ -37,6 +37,8 @@
ret->object.type = commit_type;
return ret;
}
+ if (!obj->type)
+ obj->type = commit_type;
return check_commit(obj, sha1);
}
Index: git/object.c
===================================================================
--- git.orig/object.c
+++ git/object.c
@@ -4,6 +4,7 @@
#include "commit.h"
#include "cache.h"
#include "tag.h"
+#include "delta.h"
#include <stdlib.h>
#include <string.h>
@@ -104,6 +105,7 @@
unsigned long mapsize;
void *map = map_sha1_file(sha1, &mapsize);
if (map) {
+ int is_delta;
struct object *obj;
char type[100];
unsigned long size;
@@ -111,9 +113,14 @@
munmap(map, mapsize);
if (!buffer)
return NULL;
- if (check_sha1_signature(sha1, buffer, size, type) < 0)
+ is_delta = !strcmp(type, "delta");
+ if (!is_delta && check_sha1_signature(sha1, buffer, size, type) < 0)
printf("sha1 mismatch %s\n", sha1_to_hex(sha1));
- if (!strcmp(type, "blob")) {
+ if (is_delta) {
+ struct delta *delta = lookup_delta(sha1);
+ parse_delta_buffer(delta, buffer, size);
+ obj = (struct object *) delta;
+ } else if (!strcmp(type, "blob")) {
struct blob *blob = lookup_blob(sha1);
parse_blob_buffer(blob, buffer, size);
obj = &blob->object;
Index: git/Makefile
===================================================================
--- git.orig/Makefile
+++ git/Makefile
@@ -36,7 +36,7 @@
$(INSTALL) $(PROG) $(SCRIPTS) $(dest)$(bin)
LIB_OBJS=read-cache.o sha1_file.o usage.o object.o commit.o tree.o blob.o \
- tag.o date.o index.o diff-delta.o patch-delta.o
+ tag.o delta.o date.o index.o diff-delta.o patch-delta.o
LIB_FILE=libgit.a
LIB_H=cache.h object.h blob.h tree.h commit.h tag.h delta.h
Index: git/object.h
===================================================================
--- git.orig/object.h
+++ git/object.h
@@ -9,10 +9,12 @@
struct object {
unsigned parsed : 1;
unsigned used : 1;
+ unsigned delta : 1;
unsigned int flags;
unsigned char sha1[20];
const char *type;
struct object_list *refs;
+ struct object_list *attached_deltas;
};
extern int nr_objs;
^ permalink raw reply
* [PATCH 3/3] delta creation
From: Nicolas Pitre @ 2005-05-20 21:00 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
This patch adds the ability to actually create delta objects using a
new tool: git-mkdelta. It uses an ordered list of potential objects
to deltafy against earlier objects in the list. A cap on the depth of
delta references can be provided as well, otherwise the default is to
not have any limit. A limit of 0 will also undeltafy any given object.
Also provided is the beginning of a script to deltafy an entire
repository.
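
The depth rules can be summarized in a small sketch, assuming (as in mkdelta.c below) that "no limit" is represented by the largest unsigned value. The names DEPTH_UNLIMITED, must_undeltafy() and may_add_link() are illustrative only and do not appear in the patch.

```c
#include <limits.h>

/* "No limit" is the largest unsigned value, like depth_max = -1 in the patch. */
#define DEPTH_UNLIMITED UINT_MAX

/*
 * An existing object whose delta chain is deeper than the cap has to
 * be restored to a full object.  With a cap of 0 this is true for
 * every deltafied object, since any delta has depth >= 1.
 */
int must_undeltafy(unsigned int depth, unsigned int depth_max)
{
	return depth > depth_max;
}

/*
 * A new delta against a reference of depth depth_ref would have depth
 * depth_ref + 1.  Comparing with '<' expresses depth_ref + 1 <= depth_max
 * without risking unsigned overflow.
 */
int may_add_link(unsigned int depth_ref, unsigned int depth_max)
{
	return depth_ref < depth_max;
}
```

So a cap of 0 never creates deltas and expands existing ones, while the default cap accepts chains of any practical length.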
Signed-off-by: Nicolas Pitre <nico@cam.org>
Index: git/Makefile
===================================================================
--- git.orig/Makefile
+++ git/Makefile
@@ -19,7 +19,8 @@
INSTALL=install
SCRIPTS=git-apply-patch-script git-merge-one-file-script git-prune-script \
- git-pull-script git-tag-script git-resolve-script git-whatchanged
+ git-pull-script git-tag-script git-resolve-script git-whatchanged \
+ git-deltafy-script
PROG= git-update-cache git-diff-files git-init-db git-write-tree \
git-read-tree git-commit-tree git-cat-file git-fsck-cache \
@@ -28,7 +29,7 @@
git-unpack-file git-export git-diff-cache git-convert-cache \
git-http-pull git-rpush git-rpull git-rev-list git-mktag \
git-diff-helper git-tar-tree git-local-pull git-write-blob \
- git-get-tar-commit-id
+ git-get-tar-commit-id git-mkdelta
all: $(PROG)
@@ -107,6 +108,7 @@
git-diff-helper: diff-helper.c
git-tar-tree: tar-tree.c
git-write-blob: write-blob.c
+git-mkdelta: mkdelta.c
git-http-pull: LIBS += -lcurl
Index: git/mkdelta.c
===================================================================
--- /dev/null
+++ git/mkdelta.c
@@ -0,0 +1,317 @@
+/*
+ * Deltafication of a GIT database.
+ *
+ * (C) 2005 Nicolas Pitre <nico@cam.org>
+ *
+ * This code is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "cache.h"
+#include "delta.h"
+
+static int replace_object(char *buf, unsigned long size, unsigned char *sha1)
+{
+ char tmpfile[PATH_MAX];
+ int fd;
+
+ snprintf(tmpfile, sizeof(tmpfile), "%s/obj_XXXXXX", get_object_directory());
+ fd = mkstemp(tmpfile);
+ if (fd < 0)
+ return error("%s: %s\n", tmpfile, strerror(errno));
+ if (write(fd, buf, size) != size) {
+ perror("unable to write file");
+ close(fd);
+ unlink(tmpfile);
+ return -1;
+ }
+ fchmod(fd, 0444);
+ close(fd);
+ if (rename(tmpfile, sha1_file_name(sha1))) {
+ perror("unable to replace original object");
+ unlink(tmpfile);
+ return -1;
+ }
+ return 0;
+}
+
+static void *create_object(char *buf, unsigned long len, char *hdr, int hdrlen,
+ unsigned long *retsize)
+{
+ char *compressed;
+ unsigned long size;
+ z_stream stream;
+
+ /* Set it up */
+ memset(&stream, 0, sizeof(stream));
+ deflateInit(&stream, Z_BEST_COMPRESSION);
+ size = deflateBound(&stream, len+hdrlen);
+ compressed = xmalloc(size);
+
+ /* Compress it */
+ stream.next_out = compressed;
+ stream.avail_out = size;
+
+ /* First header.. */
+ stream.next_in = hdr;
+ stream.avail_in = hdrlen;
+ while (deflate(&stream, 0) == Z_OK)
+ /* nothing */;
+
+ /* Then the data itself.. */
+ stream.next_in = buf;
+ stream.avail_in = len;
+ while (deflate(&stream, Z_FINISH) == Z_OK)
+ /* nothing */;
+ deflateEnd(&stream);
+ *retsize = stream.total_out;
+ return compressed;
+}
+
+static int restore_original_object(char *buf, unsigned long len,
+ char *type, unsigned char *sha1)
+{
+ char hdr[50];
+ int hdrlen, ret;
+ void *compressed;
+ unsigned long size;
+
+ hdrlen = sprintf(hdr, "%s %lu", type, len)+1;
+ compressed = create_object(buf, len, hdr, hdrlen, &size);
+ ret = replace_object(compressed, size, sha1);
+ free(compressed);
+ return ret;
+}
+
+static void *create_delta_object(char *buf, unsigned long len,
+ unsigned char *sha1_ref, unsigned long *size)
+{
+ char hdr[50];
+ int hdrlen;
+
+ /* Generate the header + sha1 of reference for delta */
+ hdrlen = sprintf(hdr, "delta %lu", len+20)+1;
+ memcpy(hdr + hdrlen, sha1_ref, 20);
+ hdrlen += 20;
+
+ return create_object(buf, len, hdr, hdrlen, size);
+}
+
+static unsigned long get_object_size(unsigned char *sha1)
+{
+ struct stat st;
+ if (stat(sha1_file_name(sha1), &st))
+ die("%s: %s", sha1_to_hex(sha1), strerror(errno));
+ return st.st_size;
+}
+
+static void *get_buffer(unsigned char *sha1, char *type, unsigned long *size)
+{
+ unsigned long mapsize;
+ void *map = map_sha1_file(sha1, &mapsize);
+ if (map) {
+ void *buffer = unpack_sha1_file(map, mapsize, type, size);
+ munmap(map, mapsize);
+ if (buffer)
+ return buffer;
+ }
+ error("unable to get object %s", sha1_to_hex(sha1));
+ return NULL;
+}
+
+static void *expand_delta(void *delta, unsigned long delta_size, char *type,
+ unsigned long *size, unsigned int *depth, char *head)
+{
+ void *buf = NULL;
+ (*depth)++;
+ if (delta_size < 20) {
+ error("delta object is bad");
+ free(delta);
+ } else {
+ unsigned long ref_size;
+ void *ref = get_buffer(delta, type, &ref_size);
+ if (ref && !strcmp(type, "delta"))
+ ref = expand_delta(ref, ref_size, type, &ref_size,
+ depth, head);
+ else
+ memcpy(head, delta, 20);
+ if (ref)
+ buf = patch_delta(ref, ref_size, delta+20,
+ delta_size-20, size);
+ free(ref);
+ free(delta);
+ }
+ return buf;
+}
+
+static char *mkdelta_usage =
+"mkdelta [ --max-depth=N ] <reference_sha1> <target_sha1> [ <next_sha1> ... ]";
+
+int main(int argc, char **argv)
+{
+ unsigned char sha1_ref[20], sha1_trg[20], head_ref[20], head_trg[20];
+ char type_ref[20], type_trg[20];
+ void *buf_ref, *buf_trg, *buf_delta;
+ unsigned long size_ref, size_trg, size_orig, size_delta;
+ unsigned int depth_ref, depth_trg, depth_max = -1;
+ int i, verbose = 0;
+
+ for (i = 1; i < argc; i++) {
+ if (!strcmp(argv[i], "-v")) {
+ verbose = 1;
+ } else if (!strcmp(argv[i], "-d") && i+1 < argc) {
+ depth_max = atoi(argv[++i]);
+ } else if (!strncmp(argv[i], "--max-depth=", 12)) {
+ depth_max = atoi(argv[i]+12);
+ } else
+ break;
+ }
+
+ if (i + (depth_max != 0) >= argc)
+ usage(mkdelta_usage);
+
+ if (get_sha1(argv[i], sha1_ref))
+ die("bad sha1 %s", argv[i]);
+ depth_ref = 0;
+ buf_ref = get_buffer(sha1_ref, type_ref, &size_ref);
+ if (buf_ref && !strcmp(type_ref, "delta"))
+ buf_ref = expand_delta(buf_ref, size_ref, type_ref,
+ &size_ref, &depth_ref, head_ref);
+ else
+ memcpy(head_ref, sha1_ref, 20);
+ if (!buf_ref)
+ die("unable to obtain initial object %s", argv[i]);
+
+ if (depth_ref > depth_max) {
+ if (restore_original_object(buf_ref, size_ref, type_ref, sha1_ref))
+ die("unable to restore %s", argv[i]);
+ if (verbose)
+ printf("undelta %s (depth was %d)\n", argv[i], depth_ref);
+ depth_ref = 0;
+ }
+
+ /*
+ * TODO: deltafication should be tried against any early object
+ * in the object list and not only the previous object.
+ */
+
+ while (++i < argc) {
+ if (get_sha1(argv[i], sha1_trg))
+ die("bad sha1 %s", argv[i]);
+ depth_trg = 0;
+ buf_trg = get_buffer(sha1_trg, type_trg, &size_trg);
+ if (buf_trg && !size_trg) {
+ if (verbose)
+ printf("skip %s (object is empty)\n", argv[i]);
+ continue;
+ }
+ size_orig = size_trg;
+ if (buf_trg && !strcmp(type_trg, "delta")) {
+ if (!memcmp(buf_trg, sha1_ref, 20)) {
+ /* delta already in place */
+ depth_ref++;
+ memcpy(sha1_ref, sha1_trg, 20);
+ buf_ref = patch_delta(buf_ref, size_ref,
+ buf_trg+20, size_trg-20,
+ &size_ref);
+ if (!buf_ref)
+ die("unable to apply delta %s", argv[i]);
+ if (depth_ref > depth_max) {
+ if (restore_original_object(buf_ref, size_ref,
+ type_ref, sha1_ref))
+ die("unable to restore %s", argv[i]);
+ if (verbose)
+ printf("undelta %s (depth was %d)\n", argv[i], depth_ref);
+ depth_ref = 0;
+ continue;
+ }
+ if (verbose)
+ printf("skip %s (delta already in place)\n", argv[i]);
+ continue;
+ }
+ buf_trg = expand_delta(buf_trg, size_trg, type_trg,
+ &size_trg, &depth_trg, head_trg);
+ } else
+ memcpy(head_trg, sha1_trg, 20);
+ if (!buf_trg)
+ die("unable to read target object %s", argv[i]);
+
+ if (depth_trg > depth_max) {
+ if (restore_original_object(buf_trg, size_trg, type_trg, sha1_trg))
+ die("unable to restore %s", argv[i]);
+ if (verbose)
+ printf("undelta %s (depth was %d)\n", argv[i], depth_trg);
+ depth_trg = 0;
+ size_orig = size_trg;
+ }
+
+ if (depth_max == 0)
+ goto skip;
+
+ if (strcmp(type_ref, type_trg))
+ die("type mismatch for object %s", argv[i]);
+
+ if (!size_ref) {
+ if (verbose)
+ printf("skip %s (initial object is empty)\n", argv[i]);
+ goto skip;
+ }
+
+ if (depth_ref + 1 > depth_max) {
+ if (verbose)
+ printf("skip %s (exceeding max link depth)\n", argv[i]);
+ goto skip;
+ }
+
+ if (!memcmp(head_ref, sha1_trg, 20)) {
+ if (verbose)
+ printf("skip %s (would create a loop)\n", argv[i]);
+ goto skip;
+ }
+
+ buf_delta = diff_delta(buf_ref, size_ref, buf_trg, size_trg, &size_delta);
+ if (!buf_delta)
+ die("out of memory");
+
+ /* no need to even try to compress if original
+ uncompressed is already smaller */
+ if (size_delta+20 < size_orig) {
+ void *buf_obj;
+ unsigned long size_obj;
+ buf_obj = create_delta_object(buf_delta, size_delta,
+ sha1_ref, &size_obj);
+ free(buf_delta);
+ size_orig = get_object_size(sha1_trg);
+ if (size_obj >= size_orig) {
+ free(buf_obj);
+ if (verbose)
+ printf("skip %s (original is smaller)\n", argv[i]);
+ goto skip;
+ }
+ if (replace_object(buf_obj, size_obj, sha1_trg))
+ die("unable to write delta for %s", argv[i]);
+ free(buf_obj);
+ depth_ref++;
+ if (verbose)
+ printf("delta %s (size=%ld.%02ld%%, depth=%d)\n",
+ argv[i], size_obj*100 / size_orig,
+ (size_obj*10000 / size_orig)%100,
+ depth_ref);
+ } else {
+ free(buf_delta);
+ if (verbose)
+ printf("skip %s (original is smaller)\n", argv[i]);
+ skip:
+ depth_ref = depth_trg;
+ memcpy(head_ref, head_trg, 20);
+ }
+
+ free(buf_ref);
+ buf_ref = buf_trg;
+ size_ref = size_trg;
+ memcpy(sha1_ref, sha1_trg, 20);
+ }
+
+ return 0;
+}
Index: git/git-deltafy-script
===================================================================
--- /dev/null
+++ git/git-deltafy-script
@@ -0,0 +1,39 @@
+#!/bin/bash
+
+# Script to deltafy an entire GIT repository based on the commit list.
+# The most recent version of a file is the reference and previous versions
+# are made delta against the best earlier version available. And so on for
+# successive versions going back in time. This way the delta overhead is
+# pushed towards older version of any given file.
+#
+# NOTE: the "best earlier version" is not implemented in mkdelta yet
+# and therefore only the next earlier version is used at this time.
+#
+# TODO: deltafy tree objects as well.
+#
+# The -d argument sets a limit on the delta chain depth.
+# If 0 is passed then everything is undeltafied.
+
+set -e
+
+depth=
+[ "$1" == "-d" ] && depth="--max-depth=$2" && shift 2
+
+curr_file=""
+
+git-rev-list HEAD |
+git-diff-tree -r --stdin |
+sed -n '/^\*/ s/^.*->\(.\{41\}\)\(.*\)$/\2 \1/p' | sort | uniq |
+while read file sha1; do
+ if [ "$file" == "$curr_file" ]; then
+ list="$list $sha1"
+ else
+ if [ "$list" ]; then
+ echo "Processing $curr_file"
+ echo "$head $list" | xargs git-mkdelta $depth -v
+ fi
+ curr_file="$file"
+ list=""
+ head="$sha1"
+ fi
+done
^ permalink raw reply
* checkout-cache -f: a better way?
From: Jeff Garzik @ 2005-05-20 21:05 UTC (permalink / raw)
To: Git Mailing List
[-- Attachment #1: Type: text/plain, Size: 563 bytes --]
Being a weirdo, I don't use cogito for kernel development, just git
itself. I store branches in .git/refs/heads/ per the de facto standard,
and use the attached script to switch the working directory from one
branch to another.
Problem is, 'git-checkout-cache -q -f -a' really pounds the disk, and
takes quite a while.
Is there any way to avoid -f, while ensuring that the working directory
truly represents the new branch?
BitKeeper has a secret checkout arg '-S', which will leave files
untouched if the mtime/size information is unchanged.
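
To make the idea concrete, here is a rough sketch of such an mtime/size test, assuming the values from the previous checkout were recorded somewhere. The struct saved_stat and file_looks_unchanged() names are hypothetical; this is not how BitKeeper or git-checkout-cache actually implement it.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical record of what was known about a file at the last checkout. */
struct saved_stat {
	time_t mtime;	/* modification time recorded earlier */
	off_t size;	/* file size recorded earlier */
};

/*
 * Return 1 if the file at path still has the recorded mtime and size
 * (so rewriting it could be skipped), or 0 if either differs or the
 * file cannot be stat()ed.
 */
int file_looks_unchanged(const char *path, const struct saved_stat *saved)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return 0;
	return st.st_mtime == saved->mtime && st.st_size == saved->size;
}
```

A checkout that only rewrote files failing this test would touch far fewer files, at the cost of trusting the stat information.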
Jeff
[-- Attachment #2: git-switch-tree --]
[-- Type: text/plain, Size: 381 bytes --]
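#!/bin/sh
# Switch the working directory to the branch named in $1 (if given).
if [ "x$1" != "x" ]
then
	if [ "$1" = "master" ]
	then
		( cd .git && rm -f HEAD && ln -s refs/heads/master HEAD )
	else
		if [ ! -f ".git/refs/heads/$1" ]
		then
			echo "Branch $1 not found."
			exit 1
		fi
		( cd .git && rm -f HEAD && ln -s "refs/heads/$1" HEAD )
	fi
fi
# Read the branch head into the index, force-checkout every file,
# then refresh the cached stat information.
git-read-tree $(cat .git/HEAD) && \
git-checkout-cache -q -f -a && \
git-update-cache --refresh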
^ permalink raw reply
* Re: gitweb wishlist
From: Thomas Glanzmann @ 2005-05-20 21:16 UTC (permalink / raw)
To: Git Mailing List
In-Reply-To: <428E4D8C.3020606@zytor.com>
[-- Attachment #1: Type: text/plain, Size: 496 bytes --]
Hello,
I imported the mutt CVS repository for the 1.5 branch into GIT using the
following command. It is a hack, but I think I will use something like it
to build CVS->GIT vendor tracking.
cvsps -x -z 10 -b HEAD -g -p ../../patches/
I then used the attached script to import the patches into GIT. It works
quite well.
See also msgid: 1115080139.21105.18.camel@localhost.localdomain, which has
the scripts he used to convert the CVS to GIT for HPA. My scripts are
based on his work.
Thomas
[-- Attachment #2: cvsps-import.pl --]
[-- Type: text/plain, Size: 1810 bytes --]
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw/ tempfile tempdir /;
# ---------------------
# PatchSet 1
# Date: 2002/07/23 07:41:30
# Author: hpa
# Branch: HEAD
# Tag: (none)
# Log:
# Initial revision
#
# Members:
# klibc.cvsroot/snprintf.c:INITIAL->1.1
# klibc.cvsroot/vsnprintf.c:INITIAL->1.1
# klibc.cvsroot/klibc/Makefile:INITIAL->1.1
# klibc.cvsroot/klibc/snprintf.c:INITIAL->1.1
# klibc.cvsroot/klibc/vsnprintf.c:INITIAL->1.1
#
# --- /dev/null 2005-04-30 18:00:24.840397008 +0200
# +++ klibc/klibc.cvsroot/snprintf.c 2005-05-02 19:57:42.879913000 +0200
# @@ -0,0 +1,19 @@
# +/*
my $patch = $ARGV[0];
# Map CVS user names to the real name and e-mail address used for git.
my %committer = (
    brendan  => [ 'Brendan Cully', 'brendan@kublai.com' ],
    me       => [ 'Michael Elkins', 'me@sigpipe.org' ],
    roessler => [ 'Thomas Roessler', 'roessler@does-not-exist.org' ]
);
my @log = ();
$ENV{GIT_AUTHOR_EMAIL} = "";
$ENV{GIT_COMMITTER_EMAIL} = "";
open(my $fd, '<', $patch) or die "cannot open $patch: $!";
while (my $line = <$fd>) {
    if ($line =~ m/^Date: (.*)/) {
        $ENV{GIT_AUTHOR_DATE} = $1;
    } elsif ($line =~ m/^Author: (.*)/) {
        if (defined($committer{$1})) {
            $ENV{GIT_COMMITTER_NAME} = $committer{$1}[0];
            $ENV{GIT_COMMITTER_EMAIL} = $committer{$1}[1];
            $ENV{GIT_AUTHOR_NAME} = $committer{$1}[0];
            $ENV{GIT_AUTHOR_EMAIL} = $committer{$1}[1];
        } else {
            $ENV{GIT_COMMITTER_NAME} = $1;
            $ENV{GIT_AUTHOR_NAME} = $1;
        }
    } elsif ($line =~ m/^Log:/) {
        # Collect the log message up to the "Members:" line.
        while (my $line = <$fd>) {
            if ($line =~ m/^Members: $/) {
                pop(@log);
                last;
            } elsif ($line =~ /^From: (.+) <([^>]+@[^>]+)>$/) {
                # A "From:" line in the log overrides the author.
                $ENV{GIT_AUTHOR_NAME} = $1;
                $ENV{GIT_AUTHOR_EMAIL} = $2;
            }
            push @log, $line;
        }
    }
}
close($fd);
# UNLINK (not CLEANUP, which is a tempdir option) removes the file at exit.
my ($fh, $logfile) = tempfile(UNLINK => 1);
print $fh @log;
close($fh);    # flush the log to disk before git reads it
system("git patch $patch < $logfile");
^ permalink raw reply
* Re: gitweb wishlist
From: Kay Sievers @ 2005-05-20 21:41 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Petr Baudis, Git Mailing List, Peter Anvin
In-Reply-To: <Pine.LNX.4.58.0505201219420.2206@ppc970.osdl.org>
On Fri, 2005-05-20 at 12:22 -0700, Linus Torvalds wrote:
>
> On Fri, 20 May 2005, Kay Sievers wrote:
> >
> > Somehting like this?:
> > http://kernel.org/git/?p=git/git.git;a=commitdiff;h=de809dbbce497e0d107562615c1d85ff35b4e0c5
>
> Btw, at least for me, this looks much more interesting than the "commit"
> thing, and maybe it would make sense to make the summary links be to the
> "commitdiff" instead of the "commit"?
How about this:
http://www.kernel.org/git/?p=git/git.git;a=summary
The default link is still the same, but you can use the link at the end.
Kay
^ permalink raw reply
* Re: gitweb wishlist
From: Kay Sievers @ 2005-05-20 22:04 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Linus Torvalds, Petr Baudis, Git Mailing List
In-Reply-To: <428E4D8C.3020606@zytor.com>
[-- Attachment #1: Type: text/plain, Size: 1607 bytes --]
On Fri, 2005-05-20 at 13:50 -0700, H. Peter Anvin wrote:
> Linus Torvalds wrote:
> >
> > Oh, btw, I notice that you moved klibc over to git - care to share your
> > cvs->git script (I assume you scripted it ;)? That would seem to be an
> > obvious addition to the core stuff..
> >
>
> Actually, Kay did the conversion... the scripts are clearly very
> cantankerous, because if *I* run them -- I tried -- they don't work!
> Since it's Kay's work, I'll leave them to him, but I would definitely
> love to move more of my CVS repos over to git, especially syslinux.
Here we go;
These scripts are just a quick hack; I just wanted to know how nicely the
stupid cvs file history can be converted to git-commits.
It exports the CVS repo to individual patches with the help of the nice
cvsps. (Every patch contains something like a "ChangeSet", built by
searching for file revisions with the same checkin date.)
Then the patches with their headers are split into individual files for
committing them into git (similar to Linus' git-mbox-tools).
If we reach a CVS tag with a patch during sequential patching, the
script throws away the whole current working tree and checks the
revision out of CVS. This way we make sure that the git tag matches the
tree CVS has tagged. (I've encountered two mismatches between the
"patch chain" and the CVS revision tag. These corrections are hardcoded
into the script. :)
For every CVS revision tag, a git tag without any content except the name
is created.
And the klibc-repo was created with a patched git-commit to fake the
commit date with the author date. :)
Good luck with it,
Kay
[-- Attachment #2: export-to-git.sh --]
[-- Type: application/x-shellscript, Size: 2581 bytes --]
[-- Attachment #3: split-cvsps-patch.pl --]
[-- Type: application/x-perl, Size: 2386 bytes --]
^ permalink raw reply
* Re: gitweb wishlist
From: H. Peter Anvin @ 2005-05-20 22:13 UTC (permalink / raw)
To: Kay Sievers; +Cc: Linus Torvalds, Petr Baudis, Git Mailing List
In-Reply-To: <1116626652.12975.118.camel@dhcp-188>
Kay Sievers wrote:
>
> And the klibc-repo was created with a patched git-commit to fake the
> commit date with the author date. :)
>
In fact, I kind of wish we'd also made committer == author.
Since this whole thing is an import from another revision control
system, one really wants that. It's one of those very rare situations
in which fudging the commit date is not only fully legitimate, but darn
near required.
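For readers trying this with a current Git, here is a minimal sketch of making the committer mirror the author during an import. It assumes a Git that honors GIT_COMMITTER_DATE (current releases do; the git discussed in this thread evidently needed a patched git-commit for that). The function name mirror_author_to_committer is made up for illustration.

```shell
# Hypothetical helper: copy the author identity and date to the committer
# variables, so that an imported commit records committer == author.
mirror_author_to_committer() {
    GIT_COMMITTER_NAME=$GIT_AUTHOR_NAME
    GIT_COMMITTER_EMAIL=$GIT_AUTHOR_EMAIL
    GIT_COMMITTER_DATE=$GIT_AUTHOR_DATE
    export GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL GIT_COMMITTER_DATE
}
```

An import loop would set the three GIT_AUTHOR_* variables from the patch header, call the helper, and only then create the commit.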
-hpa
^ permalink raw reply
* Re: checkout-cache -f: a better way?
From: Junio C Hamano @ 2005-05-20 22:38 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Git Mailing List
In-Reply-To: <428E5102.60003@pobox.com>
>>>>> "JG" == Jeff Garzik <jgarzik@pobox.com> writes:
JG> Being a weirdo, I don't use cogito for kernel development, just git
JG> itself.
My customer, in other words ;-).
JG> git-read-tree $(cat .git/HEAD) && \
JG> git-checkout-cache -q -f -a && \
JG> git-update-cache --refresh
I have to check checkout-cache.c, but assuming that you start
from an already populated work tree with a valid cache when you
do the git-read-tree at the third line from the last, using
"git-read-tree -m HEAD" (you do not need to say $(cat .git/HEAD)
in the modern git anymore) would be a good place to start.
Also the modern git-checkout-cache has a '-u' option and with it
you should not need 'git-update-cache --refresh' after that.
Let me know if you have any problems. Single tree '-m' is what
Linus did and '-u' option to git-checkout-cache is mine.
^ permalink raw reply
* Re: gitweb wishlist
From: Linus Torvalds @ 2005-05-20 23:25 UTC (permalink / raw)
To: Kay Sievers; +Cc: H. Peter Anvin, Petr Baudis, Git Mailing List
In-Reply-To: <1116626652.12975.118.camel@dhcp-188>
On Sat, 21 May 2005, Kay Sievers wrote:
>
> These scripts are just a quick hack, I just wanted to know how nice the
> stupid cvs file history can be converted to git-commits.
Ugh, indeed.
Is it a cvsps bug or what that causes you to have to re-order the patches?
Or is it that you don't handle branches or something in CVS? The fact that
you also remove one of the tags "suppress ash-branch" in that same number
sequence that you had to fix up by re-ordering seems to imply that the
breakage has something to do with branching.
Does anybody have any suggestions for a nice and smallish CVS project that
has branches that I should look at?
Linus
^ permalink raw reply
* Re: checkout-cache -f: a better way?
From: Linus Torvalds @ 2005-05-20 23:33 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Git Mailing List
In-Reply-To: <428E5102.60003@pobox.com>
On Fri, 20 May 2005, Jeff Garzik wrote:
>
> Problem is, 'git-checkout-cache -q -f -a' really pounds the disk, and
> takes quite a while.
No. "git" is perfect, and "git-checkout-cache -f" already does exactly
what you want.
> Is there any way to avoid -f, while ensuring that the working directory
> truly represents the new branch?
You don't need to avoid -f, it already has the logic to avoid writing
files that are already up-to-date.
HOWEVER, your script is broken:
git-read-tree $(cat .git/HEAD) && \
git-checkout-cache -q -f -a && \
git-update-cache --refresh
you need to use the "-m" switch to git-read-tree to tell it to merge the
index information from your previous tree with the new one.
Also, don't do the "$(cat .git/HEAD)" thing any more, since modern git
does this so much more nicely, and allows you to use your branch names
directly.
Finally, use the new "-u" flag to git-checkout-cache, which will update
the cache as it goes along.
In other words, those lines in your script should look like this:
git-read-tree -m HEAD && git-checkout-cache -q -f -u -a
and you'll be a lot happier.
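As a sketch only, the tail of the switch script with those two changes could be wrapped like this. The run helper and the DRY_RUN variable are additions for illustration (they are not part of the original script); with DRY_RUN=1 the commands are printed rather than executed.

```shell
# Hypothetical wrapper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Refresh the working tree after .git/HEAD has been re-pointed.
refresh_tree() {
    # -m merges the index information from the previous tree with the new
    # one, so files that are already up to date are not rewritten.
    run git-read-tree -m HEAD &&
    # -u updates the cache while checking out, so no separate
    # "git-update-cache --refresh" pass is needed afterwards.
    run git-checkout-cache -q -f -u -a
}
```

Running refresh_tree with DRY_RUN=1 is a cheap way to see what would be executed before letting it touch a real tree.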
Linus
^ permalink raw reply
* Re: checkout-cache -f: a better way?
From: Jeff Garzik @ 2005-05-20 23:33 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Git Mailing List
In-Reply-To: <7vacmpsetb.fsf@assigned-by-dhcp.cox.net>
Junio C Hamano wrote:
>>>>>>"JG" == Jeff Garzik <jgarzik@pobox.com> writes:
>
>
> JG> Being a weirdo, I don't use cogito for kernel development, just git
> JG> itself.
>
> My customer, in other words ;-).
>
> JG> git-read-tree $(cat .git/HEAD) && \
> JG> git-checkout-cache -q -f -a && \
> JG> git-update-cache --refresh
>
> I have to check checkout-cache.c, but assuming that you start
> from an already populated work tree with a valid cache when you
> do the git-read-tree at the third line from the last, using
> "git-read-tree -m HEAD" (you do not need to say $(cat .git/HEAD)
> in the modern git anymore) would be a good place to start.
>
> Also the modern git-checkout-cache has a '-u' option and with it
> you should not need 'git-update-cache --refresh' after that.
>
> Let me know if you have any problems. Single tree '-m' is what
> Linus did and '-u' option to git-checkout-cache is mine.
Pardon my ignorance (I'm slow :)), but how do those changes address the
fact that git-checkout-cache appears to check out the entire kernel tree
(over 100MB of writes) when using '-f'?
git-checkout-cache -f writes out every file, even if it exists, correct?
Jeff
^ permalink raw reply
* Re: checkout-cache -f: a better way?
From: Junio C Hamano @ 2005-05-20 23:39 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Git Mailing List
In-Reply-To: <428E73B9.1080907@pobox.com>
>>>>> "JG" == Jeff Garzik <jgarzik@pobox.com> writes:
JG> git-checkout-cache -f writes out every file, even if it exists, correct?
No, that's not correct. To translate my prose, you would want
this:
git-read-tree -m HEAD && git-checkout-cache -q -f -u -a
(notice that I do not have git-update-cache --refresh after
that).
^ permalink raw reply
* Re: checkout-cache -f: a better way?
From: Linus Torvalds @ 2005-05-20 23:51 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505201626560.2206@ppc970.osdl.org>
On Fri, 20 May 2005, Linus Torvalds wrote:
>
> In other words, those lines in your script should look like this:
>
> git-read-tree -m HEAD && git-checkout-cache -q -f -u -a
>
> and you'll be a lot happier.
Btw, I do realize that I'm a total wiener, and that my inability to use
"getopt_long()" is shameful and stupid.
What can I say? I'm easily confused, and besides, I really seldom program
in user mode.
So if somebody were to getopt'ify git, _without_ adding crapola like
autoconf (which probably implies that git would just require GNU getopt),
and others agree that it's ok to just say that we expect getopt_long() to
exist, then I'd not have any objections to making the above just be
git-read-tree -m HEAD && git-checkout-cache -fqua
(to which the beavis-and-butthead in me says "hehhehhehh.. He said fqua.
Hehhehh. fire fire fire.")
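The thread is about getopt_long() in C, but the bundling convention is the same one the POSIX shell builtin getopts follows, so a small shell sketch can show why "-fqua" and "-f -q -u -a" would be equivalent. The function name parse_flags and its output format are made up for this illustration.

```shell
# Hypothetical demo: parse the four checkout-cache style flags and report
# which ones were seen. Bundled and separate spellings give the same result.
parse_flags() {
    force=0 quiet=0 update=0 all=0
    OPTIND=1                      # restart option parsing for this call
    while getopts fqua opt; do
        case $opt in
            f) force=1 ;;
            q) quiet=1 ;;
            u) update=1 ;;
            a) all=1 ;;
            *) return 1 ;;        # unknown option
        esac
    done
    echo "force=$force quiet=$quiet update=$update all=$all"
}
```

Calling parse_flags -fqua and parse_flags -f -q -u -a prints the same line.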
Linus
^ permalink raw reply
* Re: checkout-cache -f: a better way?
From: Jeff Garzik @ 2005-05-20 23:55 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505201641160.2206@ppc970.osdl.org>
[-- Attachment #1: Type: text/plain, Size: 720 bytes --]
Linus Torvalds wrote:
>
> On Fri, 20 May 2005, Linus Torvalds wrote:
>
>>In other words, those lines in your script should look like this:
>>
>> git-read-tree -m HEAD && git-checkout-cache -q -f -u -a
>>
>>and you'll be a lot happier.
>
>
> Btw, I do realize that I'm a total wiener, and that my inability to use
> "getopt_long()" is shameful and stupid.
info libc argp :) argp is a lot more flexible, but with the same basic
structure as getopt_long().
If you pick a random git program, I would be willing to convert it as an
example. I attached my implementation of ipcrm[1] as an example.
Jeff
[1] from 'posixutils', my project to implement all the POSIX command
line utilities. Yes, I'm crazy too.
[-- Attachment #2: ipcrm.c --]
[-- Type: text/x-csrc, Size: 5023 bytes --]
/*
* Copyright 2004-2005 Jeff Garzik <jgarzik@pobox.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; see the file COPYING. If not, write to
* the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA.
*
*/
#ifndef HAVE_CONFIG_H
#error missing autoconf-generated config.h.
#endif
#include "posixutils-config.h"
#include <sys/types.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/sem.h>
#include <sys/shm.h>
#include <argp.h>
#include <libpu.h>
static const char doc[] =
	N_("ipcrm - remove a message queue, semaphore set or shared memory id");

static struct argp_option options[] = {
	{ NULL, 'q', "msgid", 0,
	  N_("Remove message queue identifier msgid from system") },
	{ NULL, 'm', "shmid", 0,
	  N_("Remove shared memory identifier shmid from system") },
	{ NULL, 's', "semid", 0,
	  N_("Remove semaphore identifier semid from system") },
	{ NULL, 'Q', "msgkey", 0,
	  N_("Remove message queue identifier, created with key msgkey, from system") },
	{ NULL, 'M', "shmkey", 0,
	  N_("Remove shared memory identifier, created with key shmkey, from system") },
	{ NULL, 'S', "semkey", 0,
	  N_("Remove semaphore identifier, created with key semkey, from system") },
	{ }
};

static error_t parse_opt (int key, char *arg, struct argp_state *state);
static const struct argp argp = { options, parse_opt, NULL, doc };
enum parse_options_bits {
	OPT_MSG = (1 << 0),
	OPT_SHM = (1 << 1),
	OPT_SEM = (1 << 2),
	OPT_KEY = (1 << 3),
};

struct arglist {
	struct arglist *next;
	int mask;
	unsigned long arg;
};

#ifdef _SEM_SEMUN_UNDEFINED
union semun
{
	int val;
	struct semid_ds *buf;
	unsigned short int *array;
	struct seminfo *__buf;
};
#endif

static int exit_status = EXIT_SUCCESS;
static struct arglist *arglist;

static const char *arg_name(int mask)
{
	if (mask & OPT_MSG) return "msg";
	if (mask & OPT_SHM) return "shm";
	if (mask & OPT_SEM) return "sem";
	return NULL;
}
static void push_opt(int mask, unsigned long arg)
{
	struct arglist *tmp, *node = xcalloc(1, sizeof(struct arglist));

	node->mask = mask;
	node->arg = arg;
	tmp = arglist;
	if (!tmp) {
		arglist = node;
	} else {
		while (tmp->next)
			tmp = tmp->next;
		tmp->next = node;
	}
}

static void push_arg_opt(int mask, const char *arg)
{
	int base = (mask & OPT_KEY) ? 0 : 10;
	char *end = NULL;
	unsigned long l;

	l = strtoul(arg, &end, base);
	if ((*end != 0) || /* entire string is -not- valid */
	    ((mask & OPT_KEY) && (l == IPC_PRIVATE))) {
		fprintf(stderr, "%s%s '%s' invalid\n",
			arg_name(mask),
			mask & OPT_KEY ? "key" : "id",
			arg);
		exit_status = EXIT_FAILURE;
		return;
	}
	push_opt(mask, l);
}
static error_t parse_opt (int key, char *arg, struct argp_state *state)
{
	switch (key) {
	case 'q': push_arg_opt(OPT_MSG, arg); break;
	case 'm': push_arg_opt(OPT_SHM, arg); break;
	case 's': push_arg_opt(OPT_SEM, arg); break;
	case 'Q': push_arg_opt(OPT_MSG | OPT_KEY, arg); break;
	case 'M': push_arg_opt(OPT_SHM | OPT_KEY, arg); break;
	case 'S': push_arg_opt(OPT_SEM | OPT_KEY, arg); break;
	default:
		return ARGP_ERR_UNKNOWN;
	}
	return 0;
}

static void pinterr(const char *msg, long l)
{
	fprintf(stderr, msg, l, strerror(errno));
	exit_status = 1;
}
static void remove_one(int mask, unsigned long arg)
{
	int rc;
	int id = (int) arg;
	const char *errmsg = NULL;

	if (mask & OPT_KEY) {
		if (mask & OPT_MSG)
			id = msgget(arg, 0);
		else if (mask & OPT_SHM)
			id = shmget(arg, 0, 0);
		else if (mask & OPT_SEM)
			id = semget(arg, 0, 0);
		else
			abort(); /* should never happen */
	}
	if (id < 0) {
		pinterr("key 0x%lx lookup failed: %s\n", arg);
		return;
	}
	if (mask & OPT_MSG) {
		rc = msgctl(id, IPC_RMID, NULL);
		errmsg = "msgctl(0x%x): %s\n";
	}
	else if (mask & OPT_SHM) {
		rc = shmctl(id, IPC_RMID, NULL);
		errmsg = "shmctl(0x%x): %s\n";
	}
	else if (mask & OPT_SEM) {
		union semun dummy;
		dummy.val = 0;
		rc = semctl(id, 0, IPC_RMID, dummy);
		errmsg = "semctl(0x%x): %s\n";
	}
	else
		abort(); /* should never happen */
	if (rc < 0) {
		fprintf(stderr, errmsg, id, strerror(errno));
		exit_status = 1;
	}
}
static void remove_stuff(void)
{
	struct arglist *tmp = arglist;

	while (tmp) {
		remove_one(tmp->mask, tmp->arg);
		tmp = tmp->next;
	}
}

int main (int argc, char *argv[])
{
	error_t rc;

	pu_init();
	rc = argp_parse(&argp, argc, argv, 0, NULL, NULL);
	if (rc) {
		fprintf(stderr, "argp_parse failed: %s\n", strerror(rc));
		return 1;
	}
	remove_stuff();
	return exit_status;
}
^ permalink raw reply
* Re: checkout-cache -f: a better way?
From: Jeff Garzik @ 2005-05-20 23:58 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Git Mailing List
In-Reply-To: <7vvf5dqxfq.fsf@assigned-by-dhcp.cox.net>
Junio C Hamano wrote:
>>>>>>"JG" == Jeff Garzik <jgarzik@pobox.com> writes:
>
>
> JG> git-checkout-cache -f writes out every file, even if it exists, correct?
>
> No, that's not correct. To translate my prose, you would want
> this:
Thanks, I stand corrected :)
> git-read-tree -m HEAD && git-checkout-cache -q -f -u -a
>
> (notice that I do not have git-update-cache --refresh after
> that).
Yep, thanks. Script does seem faster now. Numbers for hot cache (first
is pre-modification, second is with your mod):
[jgarzik@pretzel libata-dev]$ time git-switch-tree adma-mwi
real 0m7.069s
user 0m4.183s
sys 0m2.817s
[jgarzik@pretzel libata-dev]$ time git-switch-tree adma
real 0m0.389s
user 0m0.294s
sys 0m0.094s
^ permalink raw reply
* Re: checkout-cache -f: a better way?
From: Junio C Hamano @ 2005-05-20 23:59 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Jeff Garzik, Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505201641160.2206@ppc970.osdl.org>
>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:
LT> (to which the beavis-and-butthead in me says "hehhehhehh.. He said fqua.
LT> Hehhehh. fire fire fire.")
Earlier this week I sent out a "Request for Help" listing
some janitorial work, of which this was one of the items. I
believe Jeff suggested use of argp over GNU getopt(), but other
than that I do not think we had any volunteers (hint hint). I
haven't looked into any of the RFH items myself yet.
^ permalink raw reply
* Re: gitweb wishlist
From: Linus Torvalds @ 2005-05-21 0:50 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Kay Sievers, Petr Baudis, Thomas Glanzmann, Git Mailing List
In-Reply-To: <428E745C.30304@zytor.com>
[ Thomas added to cc, since he seems to have also worked on this ]
On Fri, 20 May 2005, H. Peter Anvin wrote:
>
> Here is my "main" OSS CVS repository; look at the syslinux module. It
> has at least some minor branching.
Ok, "cvsps" output scares me. I wonder what
WARNING: Invalid PatchSet 775, Tag syslinux-2_12-pre7:
memdisk/init32.asm:1.3=after, memdisk/Makefile:1.26=before. Treated as 'before'
WARNING: Invalid PatchSet 775, Tag syslinux-2_12-pre7:
memdisk/init32.asm:1.3=after, memdisk/e820test.c:1.7=before. Treated as 'before'
...
means..
Also, your syslinux repo is interesting and shows another thing: doing a
cvsps -g -p separate
ends badly with
Directing PatchSet 938 to file separate/938.patch
cvs rdiff: failed to read diff file header /tmp/cvso8PswZ for mdiskchk.com,v: end of file
system command returned non-zero exit status: 1: aborting
which doesn't look very promising and causes an empty diff for
mdiskchk.com. Trying with --cvs-direct shows the reason:
Index: syslinux/sample/mdiskchk.com
===================================================================
RCS file:
/home/torvalds/src/osscvs/cvsroot/syslinux/sample/mdiskchk.com,v
retrieving revision 1.1
retrieving revision 1.2
diff -u -r1.1 -r1.2
Binary files /tmp/cvsU6MGU0 and /tmp/cvsiskFVR differ
which shows that anything that bases itself on diffs (ie uses "-g" with
cvsps) is just doomed to failure, since there's no good way to handle
binary data. Both Kay's and Thomas' scripts try to do the "-g" thing, and
that's just not right.
So the cvs->git thing would need to be based on the actual objects, which
obviously fits git quite well, but I was really hoping to have cvsps give
some nice intermediate format..
So it looks like we should avoid the diff format, and instead use
cvsps -p separate
and then just parse the "Members" thing, turning each of them either
into a "delete" (for ->.*DEAD) or "cvs checkout -rxxx" (for ".*->xxx").
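A rough sketch of that per-member decision in shell, assuming each Members line has already had its leading whitespace stripped and has the form PATH:OLD->NEW, with "(DEAD)" appended for removed files. The function name classify_member and the "delete"/"checkout" output words are invented for illustration.

```shell
# Hypothetical helper: turn one cvsps "Members" entry into an action.
# Prints "delete PATH" for a dead revision, "checkout REV PATH" otherwise.
classify_member() {
    line=$1
    path=${line%:*}         # everything before the last ':'
    revs=${line##*:}        # "OLD->NEW" or "OLD->NEW(DEAD)"
    new=${revs#*"->"}       # drop the "OLD->" part
    case $new in
        *"(DEAD)") echo "delete $path" ;;
        *)         echo "checkout $new $path" ;;
    esac
}
```

A caller would map "delete" to rm plus git-update-cache --remove, and "checkout" to a cvs checkout of that revision plus git-update-cache --add.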
Handling branches by literally treating them as different heads in git
sounds quite simple, and indeed it looks like the basic logic for cvs->git
translation would be
for-each-patch-from-cvsps
do
	git-read-tree -m branchname-from-patch
	git-checkout-cache -f -u -q -a
	for-each-member-in-patch
	do
		if [ DEAD ]; then
			rm member
			git-update-cache --remove member
		else
			cvs co -rREV member
			git-update-cache --add member
		fi
	done
	cat commit-message-from-patch |
		git-commit-tree $(git-write-tree) -p branchname-from-patch > .git/refs/heads/branchname-from-patch
done
which looks like it should work, and handle binary files right.
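The "branchname-from-patch" and "commit-message-from-patch" placeholders could be filled in roughly as follows. This is a sketch that assumes the cvsps header layout shown earlier in the thread (a "Branch:" line, then "Log:", the message, then "Members:"); the function names patch_branch and patch_log are made up.

```shell
# Hypothetical helper: print the branch named in a cvsps PatchSet header.
patch_branch() {
    sed -n 's/^Branch: //p' "$1" | head -n 1
}

# Hypothetical helper: print the lines between "Log:" and "Members:".
# A log line that itself starts with "Members:" would end the range early.
patch_log() {
    sed -n '/^Log:/,/^Members:/p' "$1" | sed '1d;$d'
}
```

Both take the path of one patch file written by "cvsps -p separate".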
There seem to be two questions:
- what to do about branch creation (ie a branch name we haven't seen
before): it looks like cvsps doesn't tell you what the _originating_
branch was for a new branch (that may be my confusion - maybe you can't
create branches off branches in CVS?)
For syslinux, it looks like you can always base it on HEAD, or possibly
just the previous patch (which looks like it is always HEAD). The above
pseudo-script will actually do that automatically, simply by virtue of
the "git-read-tree -m" at the top of the loop failing when the
branchname doesn't exist yet.
- whether to bother to create merge entries for when somebody tried to
merge a branch back or forth in CVS.
CVS fundamentally doesn't have the notion of such a thing, and cvsps
can't either. But we could try to guess, based on the commit message,
perhaps.
NOTE! Such a "merge" would not have any real GIT merge functionality
what-so-ever. It would just introduce a second parent into the commit,
nothing more.
Bah. What crud.
Linus
^ permalink raw reply