* Re: seek request
From: Petr Baudis @ 2005-05-26 8:29 UTC
To: Zack Brown; +Cc: Git Mailing List
In-Reply-To: <20050522071106.GA8060@tumblerings.org>
Dear diary, on Sun, May 22, 2005 at 09:11:06AM CEST, I got a letter
where Zack Brown <zbrown@tumblerings.org> told me that...
> Hi folks,
>
> In Cogito, it would be nice to have a
>
> cg-seek +
>
> that would seek to the next archive state. This way, I could start off seeking
> back to the beginning of an archive, and quickly step forward, looking at files
> as I went, to the present.
>
> A corresponding
> cg-seek -
> would go the reverse direction, back toward the beginning of a project.
>
> I'm not sure how useful this would be for actual source code - I suspect any
> benefit would be minimal - but the benefit for documentation and text files,
> where the only way to test improvements is to read them by eye, would be
> significant.
Well, and what if the commit has multiple parents? Or - even much more
interestingly - multiple children?
If we keep applying the first parent rule, we could just traverse the
graph from heads/master to HEAD following this rule, and then just take
a step back to where we came from for cg-seek +. It wouldn't be exactly
cheap, but it'd probably work.
Patch welcomed. ;-)
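A minimal sketch of the first-parent walk described above, written in C. The struct commit type and the seek_forward name are hypothetical, made up for illustration; real Cogito would have to read parents out of the object database instead of following in-memory pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory commit: only the first parent matters here. */
struct commit {
    const char *id;
    const struct commit *first_parent; /* NULL for a root commit */
};

/*
 * Walk first parents from `tip` (e.g. heads/master) down to `current`
 * (the commit HEAD is seeked to) and return the commit visited just
 * before `current`, i.e. its child on the first-parent chain.
 * Returns NULL if `current` is the tip itself or is not on the chain.
 */
static const struct commit *seek_forward(const struct commit *tip,
                                         const struct commit *current)
{
    const struct commit *child = NULL;
    const struct commit *c;

    assert(tip != NULL && current != NULL);
    for (c = tip; c != NULL; c = c->first_parent) {
        if (c == current)
            return child;
        child = c;
    }
    return NULL; /* current is not reachable via first parents */
}
```

The cost is linear in the distance between the tip and the current commit, which matches the "not exactly cheap" remark: every cg-seek + would repeat the walk from the tip.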
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor
* Re: [doc]playing with git, and netdev/libata-dev trees
From: Frank Sorenson @ 2005-05-26 8:19 UTC
To: Git Mailing List; +Cc: jgarzik
In-Reply-To: <42955DF7.4000805@pobox.com>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Jeff Garzik wrote:
> Hopefully, this email can quick-start some people on git.
I think the quick-start is great. Definitely needed to get people
up-to-speed with git.
> 1) installing git
>
> git requires bootstrapping, since you must have git installed in order
> to check out git.git (git repo), and linux-2.6.git (kernel repo). I
> have put together a bootstrap tarball of today's git repository.
>
> Download tarball from:
> http://www.kernel.org/pub/linux/kernel/people/jgarzik/git-20050526.tar.bz2
A tarball of git will work, but it's a big bootstrap, and will need
periodic updating.
I'm curious whether we couldn't make a git-bootstrap that contained a
significantly stripped-down version that did nothing other than
bootstrap git. A single program/file (perl script perhaps?) that...
mkdir git
cd git
rsync -rl rsync://rsync.kernel.org/pub/scm/git/git.git/ .git
echo rsync://rsync.kernel.org/pub/scm/git/git.git > .git/branches/origin
beginning with .git/HEAD, start extracting files
...and nothing more. Then, we could tell people to just download
git-bootstrap and run it to create an up-to-date git repo.
Frank
- --
Frank Sorenson - KD7TZK
Systems Manager, Computer Science Department
Brigham Young University
frank@tuxrocks.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
iD8DBQFClYaKaI0dwg4A47wRAp1MAKCcGK8vTWtw1gnTCjFbMFpbZkSO8QCff5RE
NC8Z7RVHFP4qcbKRMSJ2rzg=
=fXsr
-----END PGP SIGNATURE-----
* [doc][git] playing with git, and netdev/libata-dev trees
From: Jeff Garzik @ 2005-05-26 5:26 UTC
To: Linux Kernel, Netdev, linux-ide@vger.kernel.org
Cc: Andrew Morton, Git Mailing List
Hopefully, this email can quick-start some people on git.
One of the things Linus's new 'git' tool allows me to do is make public
the 50+ repositories that were previously only available on my local
workstation. This should make it a lot easier for developers to see
precisely what I have merged, and makes generating follow-up patches a
whole lot easier.
When I merge a patch for drivers/net/forcedeth.c, I merge it into a
brand new 'forcedeth' repository, a peer to the 40+ other such
repositories. Under BitKeeper, I made these repositories available merged
together into one big "netdev-2.6" repository because it was too time
consuming to make the individual 50+ trees publicly available. With
git, developers have direct access to the individual trees.
I thought I would write up a quick guide describing how to mess around
with the netdev and libata-dev trees, and with git in general.
1) installing git
git requires bootstrapping, since you must have git installed in order
to check out git.git (git repo), and linux-2.6.git (kernel repo). I
have put together a bootstrap tarball of today's git repository.
Download tarball from:
http://www.kernel.org/pub/linux/kernel/people/jgarzik/git-20050526.tar.bz2
tarball build-deps: zlib, libcurl
install tarball: unpack && make && sudo make prefix=/usr/local install
jgarzik helper scripts, not in official git distribution:
http://www.kernel.org/pub/linux/kernel/people/jgarzik/git-switch-tree
http://www.kernel.org/pub/linux/kernel/people/jgarzik/git-new-branch
http://www.kernel.org/pub/linux/kernel/people/jgarzik/git-changes-script
After reading the rest of this document, come back and update your copy
of git to the latest:
rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/git.git
2) download a linux kernel tree for the very first time
mkdir -p linux-2.6/.git
cd linux-2.6
rsync -a --delete --verbose --stats --progress \
rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/ \
.git/
3) download latest changes to on-disk local tree
cd linux-2.6
git-pull-script \
rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
4) check out files from the git repository into the working directory
cd linux-2.6
git-read-tree -m HEAD && git-checkout-cache -q -f -u -a
5) check in your own modifications (e.g. apply a patch)
# go to repo
cd linux-2.6
# make some modifications
patch -sp1 < /tmp/my.patch
diffstat -p1 < /tmp/my.patch
# NOTE: add '--add' and/or '--remove' if files were added or removed
git-update-cache <list of all files changed>
# commit changes
GIT_AUTHOR_NAME="John Doe" \
GIT_AUTHOR_EMAIL="jdoe@foo.com" \
GIT_COMMITTER_NAME="Jeff Garzik" \
GIT_COMMITTER_EMAIL="jgarzik@pobox.com" \
git-commit-tree `git-write-tree` \
-p $(cat .git/HEAD ) \
< changelog.txt \
> .git/HEAD
6) List all changes in working dir, in diff format.
git-diff-cache -p HEAD
7) List all changesets (i.e. show each cset's description text) in local
tree that are not present in remote tree.
cd my-kernel-tree-2.6
git-changes-script -L ../linux-2.6 | less
8) List all changesets:
git-whatchanged
9) apply all patches in a Berkeley mbox-format file
First, download and add to your PATH Linus's git tools:
rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/git-tools.git
cd my-kernel-tree-2.6
dotest /path/to/mbox # yes, Linus has no taste in naming useful scripts
10) don't forget to download tags from time to time.
git-pull-script only downloads sha1-indexed object data, and the
requested remote head. This misses updates to the .git/refs/tags/ and
.git/refs/heads directories. It is advisable to update your kernel .git
directories periodically with a full rsync command, to make sure you got
everything:
cd linux-2.6
rsync -a --delete --verbose --stats --progress \
rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/ \
.git/
11) [jg-specific] list all branches found in netdev-2.6 or libata-dev trees.
Download
rsync://rsync.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git
or
rsync://rsync.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev.git
cd netdev-2.6
ls .git/refs/heads/
{ these are the current netdev-2.6 branches }
> 8139cp         forcedeth    master       qeth             smc91x         we18
> 8139too-iomap  for-linus    natsemi      r8169            smc91x-eeprom  wifi
> airo           hdlc         ns83820      register-netdev  starfire
> atmel          ieee80211    orinoco      remove-drivers   tlan
> chelsio        iff-running  orinoco-hch  sis900           veth
> dm9000         janitor      ppp          skge             viro
12) [jg-specific] make desired branch current in working directory
git-switch-tree $branch
13) [jg-specific] create a new branch, and make it current
git-new-branch $branch
14) [jg-specific] examine which branch is current
ls -l .git/HEAD
15) undo all local modifications (same as checkout):
git-read-tree -m HEAD && git-checkout-cache -q -f -u -a
* Re: [PATCH] mkdelta enhancements
From: Junio C Hamano @ 2005-05-26 4:48 UTC
To: Nicolas Pitre; +Cc: Linus Torvalds, git
In-Reply-To: <Pine.LNX.4.62.0505252340270.16151@localhost.localdomain>
I have not measured and I have not studied how the current
mkdelta does things, but if you are not doing so already, it may
make sense to keep the latest one expanded in full and represent
older ones as deltas, since that would give faster access to more
often used items. (I am assuming more recent versions are likely
to be accessed more frequently, which I believe is what SCMs
typically do. RCS format is geared towards this AFAICR.)
* Re: [PATCH] ls-tree matching a prefix
From: Junio C Hamano @ 2005-05-26 4:44 UTC
To: Jason McMullan; +Cc: git
In-Reply-To: <20050526034756.GA1488@port.evillabs.net>
>>>>> "JM" == Jason McMullan <jason.mcmullan@timesys.com> writes:
JM> For this purpose, I've enhanced git-ls-tree to allow the
JM> specification of an optional 'match path' that restricts
JM> the output of git-ls-tree to just the path requested.
JM> If the path has a '/' in it, it implies -r.
I'd rather see the behaviour match existing commands with path
restriction, like diff-tree, diff-cache, and diff-files. That
is, to take a list of paths and limit your output to those that
match one of them. I do not think this enhancement would
negatively affect your stated use of getting one entry with the
exact match.
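A rough sketch in C of limiting output to a list of paths. It assumes the usual convention that a path spec matches either the exact entry or anything below it at a directory boundary; the function names are hypothetical and are not taken from diff-tree or ls-tree.

```c
#include <assert.h>
#include <string.h>

/*
 * Return 1 if `path` equals `spec` or lies underneath it
 * (spec "t" matches "t" and "t/test-lib.sh", but not "tools/x.c").
 */
static int path_matches_spec(const char *path, const char *spec)
{
    size_t len = strlen(spec);

    /* ignore trailing slashes on the spec, so "t/" behaves like "t" */
    while (len > 0 && spec[len - 1] == '/')
        len--;
    if (len == 0)
        return 1;               /* empty spec matches everything */
    if (strncmp(path, spec, len) != 0)
        return 0;
    return path[len] == '\0' || path[len] == '/';
}

/* Return 1 if `path` matches any of the `nr` specs; no specs means no limit. */
static int path_matches_any(const char *path, const char **specs, int nr)
{
    int i;

    assert(path != NULL);
    if (nr == 0)
        return 1;
    for (i = 0; i < nr; i++)
        if (path_matches_spec(path, specs[i]))
            return 1;
    return 0;
}
```

With a single spec that names a file exactly, this degenerates to the "get one entry with the exact match" use case from the original patch.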
* Re: new cvsps version fixes issues for cvs2git
From: David Mansfield @ 2005-05-26 4:35 UTC
To: Linus Torvalds; +Cc: Git Mailing List, Thomas Glanzmann
In-Reply-To: <Pine.LNX.4.58.0505252111580.2307@ppc970.osdl.org>
>
> Since the CVS information doesn't contain any timezone, it would be bogus
> to use one, and the only sane git conversion is to always use UTC. Using
> the timezone of the converter is also bogus, since that just makes
> different converters get different results.
The cvs log is now properly handled as UTC. It wasn't before. That's
one good thing. And yes, git conversion better always be UTC, no
argument here.
>
> So I'd much rather see you add a flag that just always does the native CVS
> time (ie UTC)? Quite frankly, it's wrong to do anything else, exactly
> because it makes no sense to print out dates in a timezone that has no
> relevance (what relevance does Pacific time have for somebody who
> committed something at 8AM Eastern? _None_).
It will always print in the localtime of the user running cvsps, not the
timezone of the committer (in fact, we don't know the timezone of the
committer at all). I hate committing something, running cvsps and
having it tell me I'm about to commit in five hours, but I *do* see your
point.
>
> The fact is, if we depend on people doing "TZ=UTC", people will forget,
> and then people will have different conversions.
That's true, that would be terrible. But I'm arguing that the actual
conversion program (which actually wants machine readable output) should
make it happen. If that means we need a shell-script wrapper, then
so be it. By letting cvsps display in any timezone, including UTC, it
can work for everyone (keep the policy out of the program).
> (My personal preference would be to _default_ to UTC, and instead have a
> special flag that says "use localtime to print stuff out", since
> localtime really is the least relevant one most of the time)
>
The thing you may be missing (and, hey, why not?) is that some people
will actually still be using cvs, and cvsps to them is a tool that
produces output for humans. For you, it is a stone in the path to git's
domination of the world.
I'll have to think about it. At the very least a flag requesting UTC,
or a flag requesting localtime makes sense. Setting environment
variables and then running a program always seemed a bit like abuse of
global variables.
David
* Re: new cvsps version fixes issues for cvs2git
From: Linus Torvalds @ 2005-05-26 4:20 UTC
To: David Mansfield; +Cc: Git Mailing List, Thomas Glanzmann
In-Reply-To: <42954A6D.6020503@cobite.com>
On Thu, 26 May 2005, David Mansfield wrote:
>
> 4) patchset date/time problems. the date/time handling was bogus. some
> of it was patched for some time in my tree, but not released. also we
> now report all dates in LOCALTIME. use the TZ variable to get a
> different time. Note: 'cvs log' format is always UTC.
>
> Linus, based on #4, you may want to set 'export TZ=UTC' before running,
> and handle date/time conversion in cvs2git counting on that. otherwise,
> I think problems may occur with daylight savings (apr/oct).
cvs2git only wants UTC times, and doesn't do any conversion, since that's
the native git format (git considers all times to be UTC, but also records
a "what timezone was the thing done in" so that if you want to, you can
print it out not in localtime, but in "localtime as it was for the
committer"). Nothing else really makes sense - it's totally senseless to
print it out as "in localtime of user".
Since the CVS information doesn't contain any timezone, it would be bogus
to use one, and the only sane git conversion is to always use UTC. Using
the timezone of the converter is also bogus, since that just makes
different converters get different results.
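To illustrate why a UTC-only conversion gives every converter the same answer, here is a small C sketch that turns broken-down UTC fields (as they would be read from a 'cvs log' date) into seconds since the epoch without consulting TZ at all. The function names are hypothetical; the day-count arithmetic is the well-known civil-calendar formula, not anything taken from cvsps or cvs2git.

```c
#include <assert.h>

/* Days between 1970-01-01 and the given proleptic Gregorian date. */
static long days_from_civil(long y, int m, int d)
{
    long era, yoe, doy, doe;
    int mp;

    assert(m >= 1 && m <= 12 && d >= 1 && d <= 31);
    if (m <= 2)
        y--;                        /* count Jan/Feb as the end of the previous year */
    era = (y >= 0 ? y : y - 399) / 400;
    yoe = y - era * 400;            /* year of era, 0..399 */
    mp = m > 2 ? m - 3 : m + 9;     /* March = 0 ... February = 11 */
    doy = (153 * mp + 2) / 5 + d - 1;
    doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + doe - 719468;
}

/* Seconds since 1970-01-01 00:00:00 UTC; never looks at TZ or localtime. */
static long long utc_to_epoch(long y, int mon, int d, int h, int min, int s)
{
    return (long long)days_from_civil(y, mon, d) * 86400
        + h * 3600L + min * 60L + s;
}
```

Because nothing here depends on the environment, two people running the conversion in different timezones, or on either side of a daylight-saving switch, compute identical timestamps.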
So I'd much rather see you add a flag that just always does the native CVS
time (ie UTC)? Quite frankly, it's wrong to do anything else, exactly
because it makes no sense to print out dates in a timezone that has no
relevance (what relevance does Pacific time have for somebody who
committed something at 8AM Eastern? _None_).
The fact is, if we depend on people doing "TZ=UTC", people will forget,
and then people will have different conversions.
(My personal preference would be to _default_ to UTC, and instead have a
special flag that says "use localtime to print stuff out", since
localtime really is the least relevant one most of the time)
Linus
* new cvsps version fixes issues for cvs2git
From: David Mansfield @ 2005-05-26 4:02 UTC
To: Git Mailing List, Linus Torvalds; +Cc: Thomas Glanzmann
Hi,
I just put out the 2.1 tarball on
http://www.cobite.com/cvsps/cvsps-2.1.tar.gz. I tested it out with the
syslinux, mutt, and a bunch of my own repos. It fixes the following
issues that were causing some of the problems with cvs2git:
1) proper detection and reporting of branch ancestry, with the -A
option. This patch was sent under separate cover, but now I also
explicitly disallow the bogus 'import' branch from being an ancestor.
The ancestor will only be reported when a new branch appears.
2) patchset ordering problems. actual revision ancestry is considered
when ordering the patchsets. this mainly affects the 'patchset 1 and
patchset 2 are swapped' problem, but could be others
3) patchset 'globbing' problems. previously, cvsps would allow the same
file into a patchset more than once. this is clearly bogus, and now it
isn't allowed. combined with #2 and some minor date tweaking, the
ordering should be 'more perfect than ever.'
4) patchset date/time problems. the date/time handling was bogus. some
of it was patched for some time in my tree, but not released. also we
now report all dates in LOCALTIME. use the TZ variable to get a
different time. Note: 'cvs log' format is always UTC.
Linus, based on #4, you may want to set 'export TZ=UTC' before running,
and handle date/time conversion in cvs2git counting on that. otherwise,
I think problems may occur with daylight savings (apr/oct).
If there are any remaining issues, let me know.
David
* [PATCH] mkdelta enhancements
From: Nicolas Pitre @ 2005-05-26 3:57 UTC
To: Linus Torvalds; +Cc: git
Although it was described as such, git-mkdelta didn't really attempt to
find the best delta against any previous object in the list but was only
creating a delta against the preceding object. This patch fixes that.
This means that
git-mkdelta sha1 sha2 sha3 sha4 sha5 sha6
will now create a sha2 delta against sha1, a sha3 delta against either
sha2 or sha1 and keep the best one, a sha4 delta against either sha3,
sha2 or sha1, etc. The --max-behind argument limits that search for the
best delta to the specified number of previous objects in the list. If
no limit is specified it is unlimited.
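The selection rule can be sketched in C roughly as follows. This is a simplified illustration, not the code from the patch: struct candidate and pick_best_ref are hypothetical names, and the sketch compares precomputed delta sizes directly, whereas the patch also accounts for the 20-byte reference header and for compressed sizes.

```c
#include <assert.h>

struct candidate {
    unsigned long delta_size; /* size of the delta against this reference */
    unsigned int depth;       /* delta chain depth of the reference */
};

/*
 * Return the index of the best reference among the last `max_behind`
 * of `nr` candidates (0 means no limit), or -1 if none qualifies.
 * Smaller deltas win; on a tie, the shallower reference wins.
 */
static int pick_best_ref(const struct candidate *c, int nr,
                         int max_behind, unsigned int max_depth,
                         unsigned long orig_size)
{
    int first = (max_behind > 0 && nr > max_behind) ? nr - max_behind : 0;
    int best = -1;
    int i;

    assert(nr >= 0);
    for (i = first; i < nr; i++) {
        if (c[i].depth >= max_depth)
            continue;           /* would exceed the chain depth limit */
        if (c[i].delta_size >= orig_size)
            continue;           /* no size reduction over the original */
        if (best < 0 ||
            c[i].delta_size < c[best].delta_size ||
            (c[i].delta_size == c[best].delta_size &&
             c[i].depth < c[best].depth))
            best = i;
    }
    return best;
}
```

A larger window can only find an equal or smaller delta, at the price of computing one diff per candidate, which is the trade-off --max-behind exposes.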
Also added a -q (quiet) switch so it is possible to have 3 levels of
output: -q for nothing, -v for verbose, and if neither -q nor -v is
specified then only actual changes to the object database are shown.
Finally the git-deltafy-script has been updated to use the -t switch of
git-diff-tree.
Signed-off-by: Nicolas Pitre <nico@cam.org>
diff --git a/git-deltafy-script b/git-deltafy-script
--- a/git-deltafy-script
+++ b/git-deltafy-script
@@ -6,11 +6,6 @@
# successive versions going back in time. This way the delta overhead is
# pushed towards older version of any given file.
#
-# NOTE: the "best earlier version" is not implemented in mkdelta yet
-# and therefore only the next eariler version is used at this time.
-#
-# TODO: deltafy tree objects as well.
-#
# The -d argument allows to provide a limit on the delta chain depth.
# If 0 is passed then everything is undeltafied.
@@ -22,7 +17,7 @@ depth=
curr_file=""
git-rev-list HEAD |
-git-diff-tree -r --stdin |
+git-diff-tree -r -t --stdin $@ |
awk '/^:/ { if ($5 == "M" || $5 == "N") print $4, $6 }' |
LC_ALL=C sort -s -k 2 | uniq |
while read sha1 file; do
diff --git a/mkdelta.c b/mkdelta.c
--- a/mkdelta.c
+++ b/mkdelta.c
@@ -98,21 +98,17 @@ static void *create_delta_object(char *b
return create_object(buf, len, hdr, hdrlen, size);
}
-static unsigned long get_object_size(unsigned char *sha1)
-{
- struct stat st;
- if (stat(sha1_file_name(sha1), &st))
- die("%s: %s", sha1_to_hex(sha1), strerror(errno));
- return st.st_size;
-}
-
-static void *get_buffer(unsigned char *sha1, char *type, unsigned long *size)
+static void *get_buffer(unsigned char *sha1, char *type,
+ unsigned long *size, unsigned long *compsize)
{
unsigned long mapsize;
- void *map = map_sha1_file(sha1, &mapsize);
+ void *map;
+ map = map_sha1_file(sha1, &mapsize);
if (map) {
void *buffer = unpack_sha1_file(map, mapsize, type, size);
munmap(map, mapsize);
+ if (compsize)
+ *compsize = mapsize;
if (buffer)
return buffer;
}
@@ -120,25 +116,23 @@ static void *get_buffer(unsigned char *s
return NULL;
}
-static void *expand_delta(void *delta, unsigned long delta_size, char *type,
- unsigned long *size, unsigned int *depth, char *head)
+static void *expand_delta(void *delta, unsigned long *size, char *type,
+ unsigned int *depth, char *head)
{
void *buf = NULL;
*depth++;
- if (delta_size < 20) {
+ if (*size < 20) {
error("delta object is bad");
free(delta);
} else {
unsigned long ref_size;
- void *ref = get_buffer(delta, type, &ref_size);
+ void *ref = get_buffer(delta, type, &ref_size, NULL);
if (ref && !strcmp(type, "delta"))
- ref = expand_delta(ref, ref_size, type, &ref_size,
- depth, head);
+ ref = expand_delta(ref, &ref_size, type, depth, head);
else
memcpy(head, delta, 20);
if (ref)
- buf = patch_delta(ref, ref_size, delta+20,
- delta_size-20, size);
+ buf = patch_delta(ref, ref_size, delta+20, *size-20, size);
free(ref);
free(delta);
}
@@ -146,172 +140,197 @@ static void *expand_delta(void *delta, u
}
static char *mkdelta_usage =
-"mkdelta [ --max-depth=N ] <reference_sha1> <target_sha1> [ <next_sha1> ... ]";
+"mkdelta [--max-depth=N] [--max-behind=N] <reference_sha1> <target_sha1> [<next_sha1> ...]";
+struct delta {
+ unsigned char sha1[20]; /* object sha1 */
+ void *buf; /* object content */
+ unsigned long size; /* object size */
+ unsigned char head[20]; /* top most delta reference object */
+ unsigned int depth; /* delta depth */
+};
+
int main(int argc, char **argv)
{
- unsigned char sha1_ref[20], sha1_trg[20], head_ref[20], head_trg[20];
- char type_ref[20], type_trg[20];
- void *buf_ref, *buf_trg, *buf_delta;
- unsigned long size_ref, size_trg, size_orig, size_delta;
- unsigned int depth_ref, depth_trg, depth_max = -1;
- int i, verbose = 0;
+ struct delta *ref, trg;
+ char ref_type[20], trg_type[20], *skip_reason;
+ void *best_buf;
+ unsigned long best_size, orig_size, orig_compsize;
+ int r, orig_ref, best_ref, nb_refs, next_ref, max_refs = 0;
+ int i, best_skip, verbose = 0, quiet = 0;
+ unsigned int max_depth = -1;
for (i = 1; i < argc; i++) {
if (!strcmp(argv[i], "-v")) {
verbose = 1;
+ quiet = 0;
+ } else if (!strcmp(argv[i], "-q")) {
+ quiet = 1;
+ verbose = 0;
} else if (!strcmp(argv[i], "-d") && i+1 < argc) {
- depth_max = atoi(argv[++i]);
+ max_depth = atoi(argv[++i]);
} else if (!strncmp(argv[i], "--max-depth=", 12)) {
- depth_max = atoi(argv[i]+12);
+ max_depth = atoi(argv[i]+12);
+ } else if (!strcmp(argv[i], "-b") && i+1 < argc) {
+ max_refs = atoi(argv[++i]);
+ } else if (!strncmp(argv[i], "--max-behind=", 13)) {
+ max_refs = atoi(argv[i]+13);
} else
break;
}
- if (i + (depth_max != 0) >= argc)
+ if (i + (max_depth != 0) >= argc)
usage(mkdelta_usage);
- if (get_sha1(argv[i], sha1_ref))
- die("bad sha1 %s", argv[i]);
- depth_ref = 0;
- buf_ref = get_buffer(sha1_ref, type_ref, &size_ref);
- if (buf_ref && !strcmp(type_ref, "delta"))
- buf_ref = expand_delta(buf_ref, size_ref, type_ref,
- &size_ref, &depth_ref, head_ref);
- else
- memcpy(head_ref, sha1_ref, 20);
- if (!buf_ref)
- die("unable to obtain initial object %s", argv[i]);
-
- if (depth_ref > depth_max) {
- if (restore_original_object(buf_ref, size_ref, type_ref, sha1_ref))
- die("unable to restore %s", argv[i]);
- if (verbose)
- printf("undelta %s (depth was %d)\n", argv[i], depth_ref);
- depth_ref = 0;
- }
-
- /*
- * TODO: deltafication should be tried against any early object
- * in the object list and not only the previous object.
- */
+ if (!max_refs)
+ max_refs = argc - i;
+ ref = xmalloc(max_refs * sizeof(*ref));
+ for (r = 0; r < max_refs; r++)
+ ref[r].buf = NULL;
+ next_ref = nb_refs = 0;
- while (++i < argc) {
- if (get_sha1(argv[i], sha1_trg))
+ do {
+ if (get_sha1(argv[i], trg.sha1))
die("bad sha1 %s", argv[i]);
- depth_trg = 0;
- buf_trg = get_buffer(sha1_trg, type_trg, &size_trg);
- if (buf_trg && !size_trg) {
+ trg.buf = get_buffer(trg.sha1, trg_type, &trg.size, &orig_compsize);
+ if (trg.buf && !trg.size) {
if (verbose)
printf("skip %s (object is empty)\n", argv[i]);
continue;
}
- size_orig = size_trg;
- if (buf_trg && !strcmp(type_trg, "delta")) {
- if (!memcmp(buf_trg, sha1_ref, 20)) {
- /* delta already in place */
- depth_ref++;
- memcpy(sha1_ref, sha1_trg, 20);
- buf_ref = patch_delta(buf_ref, size_ref,
- buf_trg+20, size_trg-20,
- &size_ref);
- if (!buf_ref)
- die("unable to apply delta %s", argv[i]);
- if (depth_ref > depth_max) {
- if (restore_original_object(buf_ref, size_ref,
- type_ref, sha1_ref))
- die("unable to restore %s", argv[i]);
- if (verbose)
- printf("undelta %s (depth was %d)\n", argv[i], depth_ref);
- depth_ref = 0;
- continue;
- }
- if (verbose)
- printf("skip %s (delta already in place)\n", argv[i]);
- continue;
+ orig_size = trg.size;
+ orig_ref = -1;
+ trg.depth = 0;
+ if (trg.buf && !strcmp(trg_type, "delta")) {
+ for (r = 0; r < nb_refs; r++)
+ if (!memcmp(trg.buf, ref[r].sha1, 20))
+ break;
+ if (r < nb_refs) {
+ /* no need to load the reference object */
+ trg.buf = patch_delta(ref[r].buf, ref[r].size,
+ trg.buf+20, trg.size-20,
+ &trg.size);
+ trg.depth = ref[r].depth + 1;
+ memcpy(trg.head, ref[r].head, 20);
+ strcpy(trg_type, ref_type);
+ orig_ref = r;
+ } else {
+ trg.buf = expand_delta(trg.buf, &trg.size, trg_type,
+ &trg.depth, trg.head);
}
- buf_trg = expand_delta(buf_trg, size_trg, type_trg,
- &size_trg, &depth_trg, head_trg);
- } else
- memcpy(head_trg, sha1_trg, 20);
- if (!buf_trg)
- die("unable to read target object %s", argv[i]);
-
- if (depth_trg > depth_max) {
- if (restore_original_object(buf_trg, size_trg, type_trg, sha1_trg))
- die("unable to restore %s", argv[i]);
- if (verbose)
- printf("undelta %s (depth was %d)\n", argv[i], depth_trg);
- depth_trg = 0;
- size_orig = size_trg;
+ } else {
+ memcpy(trg.head, trg.sha1, 20);
}
+ if (!trg.buf)
+ die("unable to read target object %s", argv[i]);
- if (depth_max == 0)
- goto skip;
-
- if (strcmp(type_ref, type_trg))
+ if (max_depth && nb_refs && strcmp(ref_type, trg_type)) {
die("type mismatch for object %s", argv[i]);
-
- if (!size_ref) {
- if (verbose)
- printf("skip %s (initial object is empty)\n", argv[i]);
- goto skip;
- }
-
- if (depth_ref + 1 > depth_max) {
- if (verbose)
- printf("skip %s (exceeding max link depth)\n", argv[i]);
- goto skip;
- }
-
- if (!memcmp(head_ref, sha1_trg, 20)) {
- if (verbose)
- printf("skip %s (would create a loop)\n", argv[i]);
- goto skip;
+ } else {
+ strcpy(ref_type, trg_type);
}
- buf_delta = diff_delta(buf_ref, size_ref, buf_trg, size_trg, &size_delta);
- if (!buf_delta)
- die("out of memory");
-
- /* no need to even try to compress if original
- uncompressed is already smaller */
- if (size_delta+20 < size_orig) {
- void *buf_obj;
- unsigned long size_obj;
- buf_obj = create_delta_object(buf_delta, size_delta,
- sha1_ref, &size_obj);
- free(buf_delta);
- size_orig = get_object_size(sha1_trg);
- if (size_obj >= size_orig) {
- free(buf_obj);
- if (verbose)
- printf("skip %s (original is smaller)\n", argv[i]);
- goto skip;
+ best_buf = NULL;
+ best_size = -1;
+ best_ref = -1;
+ best_skip = 0;
+ skip_reason = NULL;
+ for (r = 0; max_depth && r < nb_refs; r++) {
+ void *delta_buf, *comp_buf;
+ unsigned long delta_size, comp_size;
+
+ if (ref[r].depth >= max_depth) {
+ if (best_skip < 1) {
+ skip_reason = "exceeding max link depth";
+ best_skip = 1;
+ }
+ continue;
+ }
+ if (!memcmp(ref[r].head, trg.sha1, 20)) {
+ if (best_skip < 2) {
+ skip_reason = "would create a loop";
+ best_skip = 2;
+ }
+ continue;
+ }
+ if (r == orig_ref) {
+ if (best_skip < 3) {
+ skip_reason = "delta already in place";
+ best_skip = 3;
+ }
+ continue;
+ }
+ delta_buf = diff_delta(ref[r].buf, ref[r].size,
+ trg.buf, trg.size, &delta_size);
+ if (!delta_buf)
+ die("out of memory");
+ if (trg.depth <= max_depth &&
+ delta_size+20 >= orig_size) {
+ /* no need to even try to compress if original
+ object is smaller than this delta */
+ free(delta_buf);
+ if (best_skip < 4) {
+ skip_reason = "no size reduction";
+ best_skip = 4;
+ }
+ continue;
+ }
+ comp_buf = create_delta_object(delta_buf, delta_size,
+ ref[r].sha1, &comp_size);
+ if (!comp_buf)
+ die("out of memory");
+ free(delta_buf);
+ if (trg.depth <= max_depth &&
+ comp_size >= orig_compsize) {
+ free(comp_buf);
+ if (best_skip < 5) {
+ skip_reason = "no size reduction";
+ best_skip = -1;
+ }
+ continue;
+ }
+ if ((comp_size < best_size) ||
+ (comp_size == best_size &&
+ ref[r].depth < ref[best_ref].depth)) {
+ free(best_buf);
+ best_buf = comp_buf;
+ best_size = comp_size;
+ best_ref = r;
}
- if (replace_object(buf_obj, size_obj, sha1_trg))
- die("unable to write delta for %s", argv[i]);
- free(buf_obj);
- depth_ref++;
- if (verbose)
- printf("delta %s (size=%ld.%02ld%%, depth=%d)\n",
- argv[i], size_obj*100 / size_orig,
- (size_obj*10000 / size_orig)%100,
- depth_ref);
- } else {
- free(buf_delta);
- if (verbose)
- printf("skip %s (original is smaller)\n", argv[i]);
- skip:
- depth_ref = depth_trg;
- memcpy(head_ref, head_trg, 20);
}
- free(buf_ref);
- buf_ref = buf_trg;
- size_ref = size_trg;
- memcpy(sha1_ref, sha1_trg, 20);
- }
+ if (best_buf) {
+ if (replace_object(best_buf, best_size, trg.sha1))
+ die("unable to write delta for %s", argv[i]);
+ free(best_buf);
+ trg.depth = ref[best_ref].depth + 1;
+ memcpy(trg.head, ref[best_ref].head, 20);
+ if (!quiet)
+ printf("delta %s (size=%ld.%02ld%% depth=%d dist=%d)\n",
+ argv[i], best_size*100 / orig_compsize,
+ (best_size*10000 / orig_compsize)%100,
+ trg.depth,
+ (next_ref - best_ref + max_refs)
+ % (max_refs + 1) + 1);
+ } else if (trg.depth > max_depth) {
+ if (restore_original_object(trg.buf, trg.size, trg_type, trg.sha1))
+ die("unable to restore %s", argv[i]);
+ if (!quiet)
+ printf("undelta %s (depth was %d)\n",
+ argv[i], trg.depth);
+ trg.depth = 0;
+ memcpy(trg.head, trg.sha1, 20);
+ } else if (skip_reason && verbose) {
+ printf("skip %s (%s)\n", argv[i], skip_reason);
+ }
+
+ free(ref[next_ref].buf);
+ ref[next_ref] = trg;
+ if (++next_ref > nb_refs)
+ nb_refs = next_ref;
+ if (next_ref == max_refs)
+ next_ref = 0;
+ } while (++i < argc);
return 0;
}
* [PATCH] ls-tree matching a prefix
From: Jason McMullan @ 2005-05-26 3:47 UTC
To: git
In the Porcelain I've been working on, I have found it useful
to retrieve a single file's SHA1 out of a tree when I don't
want to create an index.
For this purpose, I've enhanced git-ls-tree to allow the
specification of an optional 'match path' that restricts
the output of git-ls-tree to just the path requested.
If the path has a '/' in it, it implies -r.
ie:
$ git-ls-tree HEAD Makefile
100644 blob 92d0e87535ecaa5e52a6503c43dd30dd546ea6b7 Makefile
$ git-ls-tree HEAD t
040000 tree 33ce2f3201c99d5da785bb777639c1e2374c44d2 t
$ git-ls-tree HEAD t/test-lib.sh
100755 blob d3f71d1932310197219155b426687d155bf63c5b t/test-lib.sh
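The matching works one path component per tree level. A minimal, bounds-checked C sketch of that splitting step is shown here; split_first_component is a hypothetical helper written for illustration and does not appear in the patch, which does the equivalent inline with strchr and strncpy.

```c
#include <assert.h>
#include <string.h>

/*
 * Copy the first component of `path` into `first` (capacity `cap`) and
 * store a pointer to the remainder after the '/' in `*rest`, or NULL if
 * nothing follows. Returns 0 on success, -1 if the component does not
 * fit in the buffer.
 */
static int split_first_component(const char *path, char *first, size_t cap,
                                 const char **rest)
{
    const char *slash = strchr(path, '/');
    size_t len = slash ? (size_t)(slash - path) : strlen(path);

    assert(cap > 0);
    if (len >= cap)
        return -1;              /* component too long for the buffer */
    memcpy(first, path, len);
    first[len] = '\0';
    /* a trailing slash ("t/") leaves nothing to descend into */
    *rest = (slash && slash[1] != '\0') ? slash + 1 : NULL;
    return 0;
}
```

For "t/test-lib.sh" this yields "t" to compare against entries of the top-level tree and "test-lib.sh" to pass down when recursing into the matching subtree.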
Signed-Off-By: Jason McMullan <jason.mcmullan@timesys.com>
diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt
--- a/Documentation/git-ls-tree.txt
+++ b/Documentation/git-ls-tree.txt
@@ -27,6 +27,10 @@ OPTIONS
-z::
\0 line termination on output
+[path]::
+ Only return items that match the specified path, relative to the
+ root of the tree. If the path has a '/' in it, implies -r
+
Output Format
-------------
<mode>\t <type>\t <object>\t <file>
diff --git a/ls-tree.c b/ls-tree.c
--- a/ls-tree.c
+++ b/ls-tree.c
@@ -26,10 +26,32 @@ static void print_path_prefix(struct pat
static void list_recursive(void *buffer,
const char *type,
unsigned long size,
- struct path_prefix *prefix)
+ struct path_prefix *prefix,
+ const char *match_path)
{
struct path_prefix this_prefix;
this_prefix.prev = prefix;
+ char mpref[PATH_MAX];
+ size_t mlen = 0;
+ char *cp = NULL;
+ if (match_path != NULL) {
+ if (*match_path == 0)
+ return;
+ cp = strchr(match_path,'/');
+ if (cp == NULL) {
+ strcpy(mpref,match_path);
+ match_path = NULL;
+ } else {
+ recursive = 1;
+ strncpy(mpref,match_path,cp-match_path);
+ mpref[cp-match_path]=0;
+ cp++;
+ match_path = cp;
+ if (*cp == 0)
+ cp = NULL;
+ }
+ mlen = strlen(mpref);
+ }
if (strcmp(type, "tree"))
die("expected a 'tree' node");
@@ -48,27 +70,35 @@ static void list_recursive(void *buffer,
buffer = sha1 + 20;
size -= namelen + 20;
- printf("%06o\t%s\t%s\t", mode,
- S_ISDIR(mode) ? "tree" : "blob",
- sha1_to_hex(sha1));
- print_path_prefix(prefix);
- fputs(path, stdout);
- putchar(line_termination);
+ if (mlen && strcmp(mpref, path) != 0)
+ continue;
+
+ if (cp == NULL) {
+ printf("%06o\t%s\t%s\t", mode,
+ S_ISDIR(mode) ? "tree" : "blob",
+ sha1_to_hex(sha1));
+ print_path_prefix(prefix);
+ fputs(path, stdout);
+ putchar(line_termination);
+ }
if (! recursive || ! S_ISDIR(mode))
continue;
+ if (mlen && cp == NULL)
+ continue;
+
if (! (eltbuf = read_sha1_file(sha1, elttype, &eltsize)) ) {
error("cannot read %s", sha1_to_hex(sha1));
continue;
}
this_prefix.name = path;
- list_recursive(eltbuf, elttype, eltsize, &this_prefix);
+ list_recursive(eltbuf, elttype, eltsize, &this_prefix, match_path);
free(eltbuf);
}
}
-static int list(unsigned char *sha1)
+static int list(unsigned char *sha1, const char *match_path)
{
void *buffer;
unsigned long size;
@@ -76,12 +106,12 @@ static int list(unsigned char *sha1)
buffer = read_object_with_reference(sha1, "tree", &size, NULL);
if (!buffer)
die("unable to read sha1 file");
- list_recursive(buffer, "tree", size, NULL);
+ list_recursive(buffer, "tree", size, NULL, match_path);
free(buffer);
return 0;
}
-static const char *ls_tree_usage = "git-ls-tree [-r] [-z] <key>";
+static const char *ls_tree_usage = "git-ls-tree [-r] [-z] <key> [path]";
int main(int argc, char **argv)
{
@@ -101,11 +131,11 @@ int main(int argc, char **argv)
argc--; argv++;
}
- if (argc != 2)
+ if (argc != 2 && argc != 3)
usage(ls_tree_usage);
if (get_sha1(argv[1], sha1) < 0)
usage(ls_tree_usage);
- if (list(sha1) < 0)
+ if (list(sha1, argc==3 ? argv[2] : NULL) < 0)
die("list failed");
return 0;
}
^ permalink raw reply
* [RFC/PATCH] Detect copies harder in diff-tree.
From: Junio C Hamano @ 2005-05-26 3:17 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.58.0505232314510.2307@ppc970.osdl.org>
>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:
LT> That said, I don't think -C is that important.
By now, you know I won't listen ;-).
I've done preliminary --detect-copies-harder (that is to feed all
the unmodified files to diffcore when doing -C) and --use-cache
(this is to allow diffcore to avoid expanding blob when there is
an already matching file in the work tree) changes to diff-tree.
This example is from the linux-2.6 tree, with the tip of the
tree in the work tree and the cache, and looking at the commit
when include/asm-um was cleaned up (May 5th). The first one
does not detect copies from "unmodified" files, but the latter
two do. In this commit there isn't any copy that the "harder"
version finds but the ordinary one doesn't.
: siamese; time ../git.junio/git-diff-tree -r \
-C dbc35cc73f2edd6e39d7e814dbb6eddad6294665 >/dev/null
real 0m0.010s
user 0m0.010s
sys 0m0.000s
: siamese; time ../git.junio/git-diff-tree -r --detect-copies-harder \
-C dbc35cc73f2edd6e39d7e814dbb6eddad6294665 >/dev/null
real 0m19.938s
user 0m11.520s
sys 0m1.240s
: siamese; time ../git.junio/git-diff-tree -r \
--detect-copies-harder --use-cache -C \
dbc35cc73f2edd6e39d7e814dbb6eddad6294665 >/dev/null
real 0m5.858s
user 0m5.110s
sys 0m0.710s
------------
Add --detect-copies-harder and --use-cache to diff-tree.
This adds two new options to diff-tree. Even when -C is used,
diff-tree does not normally feed "unmodified" filepair to the
diffcore, so copy detection is done only among the files that
have changed. With --detect-copies-harder, it can also detect
copies made from an unmodified file (this behavior is the
default for diff-files and diff-cache). When this option is
used, it is recommended to also give --use-cache, which lets
diffcore avoid expanding a blob when the work tree has the same
file unmodified.
Signed-off-by: Junio C Hamano <junkio@cox.net>
---
cd /opt/packrat/playpen/public/in-place/git/git.junio/
jit-diff
# - linus: git-rev-list: add "end" commit and "--header" flag
# + (working tree)
diff --git a/diff-tree.c b/diff-tree.c
--- a/diff-tree.c
+++ b/diff-tree.c
@@ -6,6 +6,8 @@ static int show_root_diff = 0;
static int verbose_header = 0;
static int ignore_merges = 1;
static int recursive = 0;
+static int use_cache = 0;
+static int detect_copies_harder = 0;
static int show_tree_entry_in_recursive = 0;
static int read_stdin = 0;
static int diff_output_format = DIFF_FORMAT_HUMAN;
@@ -108,7 +110,8 @@ static int compare_tree_entry(void *tree
show_file("+", tree2, size2, base);
return 1;
}
- if (!memcmp(sha1, sha2, 20) && mode1 == mode2)
+ if (!memcmp(sha1, sha2, 20) && mode1 == mode2 &&
+ (detect_rename != DIFF_DETECT_COPY || !detect_copies_harder))
return 0;
/*
@@ -549,6 +552,14 @@ int main(int argc, const char **argv)
read_stdin = 1;
continue;
}
+ if (!strcmp(arg, "--use-cache")) {
+ use_cache = 1;
+ continue;
+ }
+ if (!strcmp(arg, "--detect-copies-harder")) {
+ detect_copies_harder = 1;
+ continue;
+ }
if (!strcmp(arg, "--root")) {
show_root_diff = 1;
continue;
@@ -566,6 +577,16 @@ int main(int argc, const char **argv)
pathlens[i] = strlen(paths[i]);
}
+ if (detect_rename && use_cache && !active_cache) {
+ /* read-cache does not die even when it fails
+ * so it is safe for us to do this here. Also
+ * it does not smudge active_cache or active_nr
+ * when it fails, so we do not have to worry about
+ * cleaning it up ourselves either.
+ */
+ read_cache();
+ }
+
switch (nr_sha1) {
case 0:
if (!read_stdin)
Compilation finished at Wed May 25 19:59:09
^ permalink raw reply
* Re: [PATCH] Test case portability fix.
From: Junio C Hamano @ 2005-05-26 2:55 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505251935210.2307@ppc970.osdl.org>
Sorry, I think I sent a stale copy from my draft box by
accident and you already have exactly the same patch.
About the "From: " thing, I understood.
^ permalink raw reply
* Re: gitweb wishlist
From: David Mansfield @ 2005-05-26 2:51 UTC (permalink / raw)
To: Thomas Glanzmann; +Cc: Git Mailing List
In-Reply-To: <20050524045840.GI12141@cip.informatik.uni-erlangen.de>
Thomas Glanzmann wrote:
> Hello,
>
>
>> WARNING: Invalid PatchSet 775, Tag syslinux-2_12-pre7:
>> memdisk/init32.asm:1.3=after, memdisk/Makefile:1.26=before. Treated as 'before'
>> WARNING: Invalid PatchSet 775, Tag syslinux-2_12-pre7:
>> memdisk/init32.asm:1.3=after, memdisk/e820test.c:1.7=before. Treated as 'before'
>> ...
>
>
> actually I think this is the broken upstream version. It can't parse
> dates right. Just look at the exported patches and see if they are all from
> 1970. However the debian package has a patch in which solves it:
>
> maybe you should try with the attached patch or with the version that
> comes with debian sarge. I also reported this problem a while back to
> the original author.
>
I was about to apply this and I already had it in my cvs tree! Funny
how these things go. I must have gotten it before, applied it and never
released a new version. Funny that this one has the tm.tm_isdst = 0
that is missing from the version I applied (and fixes an important bug).
Anyway, I'm about to release a new version cumulative with all this, a
fixed ancestor version, correct ordering for those pesky import commits,
and a couple other annoying fixes.
BTW: the above warnings are actually legit in this case.
David
^ permalink raw reply
* Re: [PATCH] Make cvs2git support remote CVS repos
From: Linus Torvalds @ 2005-05-26 2:42 UTC (permalink / raw)
To: Mark Allen; +Cc: git
In-Reply-To: <20050525181132.75705.qmail@web41204.mail.yahoo.com>
On Wed, 25 May 2005, Mark Allen wrote:
>
> Added a "--module=cvsmodule" command line option and (since we're going to process argv
> anyway) made "-v" for verbose mode a command line option too, instead of a compile time
> option.
Ahh.. You found out how to get CVS to check out individual files.
The reason I use RCS "co" directly is because I couldn't figure out how
CVS can be made to do it. Of course, the raw RCS possibly also performs
better, but somebody should check that. If the overhead of using CVS to do
this is low enough, we should drop the raw RCS access, which should
simplify your patch and get rid of the need for "RCSDIR".
Anybody up for some performance testing?
Linus
^ permalink raw reply
* Re: [PATCH] Test case portability fix.
From: Linus Torvalds @ 2005-05-26 2:36 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Git Mailing List
In-Reply-To: <7vr7fug2i4.fsf@assigned-by-dhcp.cox.net>
On Wed, 25 May 2005, Junio C Hamano wrote:
>
> This is the remainder of testcase fix by Mark Allen to make them
> work on his Darwin box. I was using "xargs -r" (GNU) where it
> was not needed, sed -ne '/^\(author\|committer\)/s|>.*|>|p'
> where some sed does not know what to do with '\|', and also
> "cmp - file" to compare standard input with a file, which his
> cmp does not support.
>
> Author: Mark Allen <mrallen1@yahoo.com>
> Author-Date:
Btw, do this as
From: Mark Allen <mrallen1@yahoo.com>
at the top of the email body, and my patch-application scripts will
automatically do the right thing.
The Author-date thing you might as well drop for now..
Linus
^ permalink raw reply
* [PATCH] Test case portability fix.
From: Junio C Hamano @ 2005-05-26 2:11 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <7v1x7uj4i3.fsf_-_@assigned-by-dhcp.cox.net>
This is the remainder of testcase fix by Mark Allen to make them
work on his Darwin box. I was using "xargs -r" (GNU) where it
was not needed, sed -ne '/^\(author\|committer\)/s|>.*|>|p'
where some sed does not know what to do with '\|', and also
"cmp - file" to compare standard input with a file, which his
cmp does not support.
Author: Mark Allen <mrallen1@yahoo.com>
Author-Date:
Signed-off-by: Junio C Hamano <junkio@cox.net>
---
t/t0000-basic.sh | 2 +-
t/t0110-environment-names-old.sh | 6 ++----
2 files changed, 3 insertions(+), 5 deletions(-)
diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
--- a/t/t0000-basic.sh
+++ b/t/t0000-basic.sh
@@ -84,7 +84,7 @@ do
done
test_expect_success \
'adding various types of objects with git-update-cache --add.' \
- 'find path* ! -type d -print0 | xargs -0 -r git-update-cache --add'
+ 'find path* ! -type d -print0 | xargs -0 git-update-cache --add'
# Show them and see that matches what we expect.
test_expect_success \
diff --git a/t/t0110-environment-names-old.sh b/t/t0110-environment-names-old.sh
--- a/t/t0110-environment-names-old.sh
+++ b/t/t0110-environment-names-old.sh
@@ -86,8 +86,7 @@ committer A U Thor <author@example.xz>
EOF
test_expect_success \
'verify old AUTHOR variables were used correctly in commit' \
- 'sed -ne '\''/^\(author\|committer\)/s|>.*|>|p'\'' current |
- cmp - expected'
+ 'sed -ne '\''/^\(author\)/s|>.*|>|p'\'' -e'\''/^\(committer\)/s|>.*|>|p'\''\ current > out && cmp out expected'
unset GIT_DIR
test_expect_success \
@@ -128,7 +127,6 @@ committer R O Htua <rohtua@example.xz>
EOF
test_expect_success \
'verify new AUTHOR variables were used correctly in commit.' \
- 'sed -ne '\''/^\(author\|committer\)/s|>.*|>|p'\'' current |
- cmp - expected'
+ 'sed -ne '\''/^\(author\)/s|>.*|>|p'\'' -e'\''/^\(committer\)/s|>.*|>|p'\''\ current > out && cmp out expected'
test_done
------------------------------------------------
^ permalink raw reply
* Re: Summary of core GIT while you are away.
From: Linus Torvalds @ 2005-05-26 1:53 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Kay Sievers, pasky, braddr, nico, david, Git Mailing List
In-Reply-To: <7vzmuig561.fsf@assigned-by-dhcp.cox.net>
On Wed, 25 May 2005, Junio C Hamano wrote:
>
> I haven't done anything about this yet. I was kind of waiting
> for the blob retention API for "struct object" derivatives to
> come to the conclusion.
I decided on it, and it's a hacky (but reasonable) decision.
"parse_commit()" always retains the buffer for the commit. It's very
commit-specific, since (a) that's the main data structure that you'd
actually want to retain and (b) of all the core objects (ignoring tags)
commits tend to be the smallest and fewest, so this fixed policy has the
least impact memory-usage-wise.
If you want to free it, you can do
free(commit->buffer);
commit->buffer = NULL;
and you should be all good (but the only user I thought might want to is
"fsck", and that one doesn't need to, since it never calls
"parse_commit()" at all - it calls "parse_commit_buffer()" which does
_not_ do this, since the caller already knows the buffer and can choose to
do it if it wants to).
Oh, thinking about that actually brings up a bug: mixing parse_commit()
and parse_commit_buffer() is unsafe, since the state of the buffer is
unclear. Normally that's ok, _except_ when we use just "parse_object()"
(like we do in "lookup_commit_reference()"), when we really don't know.
Hmm.
Linus
^ permalink raw reply
* Re: Summary of core GIT while you are away.
From: Linus Torvalds @ 2005-05-26 1:45 UTC (permalink / raw)
To: Kay Sievers; +Cc: Junio C Hamano, pasky, braddr, nico, david, Git Mailing List
In-Reply-To: <20050526004411.GA12360@vrfy.org>
On Thu, 26 May 2005, Kay Sievers wrote:
>
> On Mon, May 16, 2005 at 09:10:10AM -0700, Linus Torvalds wrote:
> >
> > Then you could just do
> >
> > git-rev-list -v --header HEAD | grep -z 'author[^\n]*Linus'
> >
> > to tell it to do the "verbose" thing (only showing the header of the
> > commit, not the whole message), and grep for "Linus" in the author line.
>
> What happened to that idea? That's not already working in some other way I
> missed, right? The pickaxe stuff is nice and was easy to call from the cgi,
> but searching in commit messages would be nice too.
> If it's not going to happen in the git-binaries, I may do it just in the
> cgi itself.
Ok, you twisted my arm. Checked in.
git-rev-list --header HEAD | grep -z 'author[^\n]*Linus'
and you will get output that is a series of commits by me (strictly
speaking, that's not true, since the "author" thing might be in the
non-header part).
The format is:
<commit-id> '\n' <commit-msg> '\0'
and if the full commit message is binary and contains NUL bytes in itself
(only crazy people do that, but let's keep in mind that crazy people
exist), we naturally truncate it at the first NUL. This makes it easy to
parse, and is what makes "grep -z" work, for example.
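
As a rough illustration of why the NUL-terminated record format is easy to consume, here is a minimal C sketch that walks such a buffer. The helper names (count_matching_records, record_commit_id) are hypothetical and not part of git, and the sketch assumes the whole output has already been read into memory with the last record NUL-terminated.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of git: walk a buffer holding
 * "<commit-id>\n<commit-msg>\0" records and count the records whose
 * text contains `needle`. `len` is the total number of bytes in `buf`;
 * the last record is assumed to be NUL-terminated.
 */
static int count_matching_records(const char *buf, size_t len,
				  const char *needle)
{
	size_t off = 0;
	int hits = 0;

	while (off < len) {
		const char *rec = buf + off;
		size_t reclen = strlen(rec);	/* stops at the record's '\0' */

		if (strstr(rec, needle))
			hits++;
		off += reclen + 1;		/* step past the terminator */
	}
	return hits;
}

/*
 * Hypothetical helper: copy the commit id (everything before the
 * first '\n' of a record) into `out`. Returns -1 if it does not fit.
 */
static int record_commit_id(const char *rec, char *out, size_t outsz)
{
	const char *nl = strchr(rec, '\n');
	size_t n = nl ? (size_t)(nl - rec) : strlen(rec);

	if (n + 1 > outsz)
		return -1;
	memcpy(out, rec, n);
	out[n] = '\0';
	return 0;
}
```

Like the grep example, a plain substring match here can also hit text in the free-form part of the message, so a stricter consumer would look only at the header lines of each record.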
It also has a rudimentary "stop at commit x" function, but I'm not doing
any reachability analysis, so it's purely based on the commit date, and is
thus _not_ equivalent to "git-rev-tree START ^END". I reserve the right to
try to change that behaviour, though, if I decide I (or somebody else ;)
can do a nice incremental reachability thing.
So you can do
git-rev-list --header HEAD v2.6.12-rc5
and it will print out all the commits in date order until it hits
v2.6.12-rc5 (which it won't print out).
The output format is really optimized for something like "git log" or
"gitweb". I suspect it's pretty much perfect for doing the "last 10"
things, without having to do "git-cat-file" for each commit.
Oh, and I considered adding a "--header-lines=c" which limits the header
printout to 'c' lines (probably starting at where the free-form thing
starts, so that "--header-lines=0" would print out just the fixed-format
header of the commit messages).
(This is slightly different from what I initially envisioned, which had
"-v" and "--header", and I'm not sure it's better but it ended up being
what I did. If you think my first idea was better, send me a (by now
trivial) patch).
Linus
^ permalink raw reply
* Re: Summary of core GIT while you are away.
From: Junio C Hamano @ 2005-05-26 1:13 UTC (permalink / raw)
To: Kay Sievers; +Cc: Linus Torvalds, pasky, braddr, nico, david, Git Mailing List
In-Reply-To: <20050526004411.GA12360@vrfy.org>
>>>>> "KS" == Kay Sievers <kay.sievers@vrfy.org> writes:
KS> What happened to that idea? That's not already working in some other way I
KS> missed, right? The pickaxe stuff is nice and was easy to call from the cgi,
KS> but searching in commit messages would be nice too.
KS> If it's not going to happen in the git-binaries, I may do it just in the
KS> cgi itself.
I haven't done anything about this yet. I was kind of waiting
for the blob retention API for "struct object" derivatives to
come to the conclusion.
^ permalink raw reply
* Re: [PATCH] Make sure diff-helper can tell rename/copy in the new diff-raw format.
From: Junio C Hamano @ 2005-05-26 0:55 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.58.0505230736180.2307@ppc970.osdl.org>
>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:
LT> ... But the thing is,
LT> that's actually what I _want_, because I was planning on writing a tool
LT> that applies patches that applies them all-or-nothing.
I was going through past messages and realized I missed this
part of your message. Now I think I understand what git-apply
program is all about.
There is one thing [*1*] currently missing from diff-patch
output for your plan to fully work.
A type change, like a file turning into a symlink, is currently
something the built-in diff punts on. Your earlier response to "What
about modified and type changed" question suggests that you
would want it to be expressed as a delete and a create, so I
imagine that the "diff --git" output for this diff-raw:
:100644 120000 abcdef... abcdef... T frotz frotz
you would want to see output as this:
diff --git a/frotz b/frotz
deleted file mode 100644
--- frotz
+++ /dev/null
@@ -1 +0,0 @@
-rezrov
\ No newline at end of file
diff --git a/frotz b/frotz
new file mode 120000
--- /dev/null
+++ frotz
@@ -0,0 +1 @@
+rezrov
\ No newline at end of file
Even simpler for me is not to do this "splitting a filepair into
create and delete", and have diff compare the two blobs
directly, though that would make a patch that does not make
sense to humans:
diff --git a/frotz b/frotz
old mode 100644
new mode 120000
... diff between readlink and file contents if any ...
A tree turning into a file and vice versa is something you are
already taking care of in diff-tree when feeding the diffcore,
and diff-cache and diff-files do not even see tree objects to
begin with, so tree-to-file is something that will never be fed
to the output routine as a matched filepair, and you will always
get a delete/create pair with the current code. I am fairly
certain, therefore, tree-to-file is not a problem. Only symlink
vs file case is problematic with the current output routine.
[Footnote]
*1* Strictly speaking, there is another. Changes in tree object
are not shown, either. This however will not be a problem for
git-apply, because as long as the files underneath are handled
correctly you will end up with the right tree.
^ permalink raw reply
* Re: Summary of core GIT while you are away.
From: Kay Sievers @ 2005-05-26 0:44 UTC (permalink / raw)
To: Linus Torvalds
Cc: Junio C Hamano, pasky, braddr, nico, david, Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505160837080.28162@ppc970.osdl.org>
On Mon, May 16, 2005 at 09:10:10AM -0700, Linus Torvalds wrote:
>
> The only thing I personally
> think sucks is the author/committer matching of git-rev-list/tree, since
> it would seem like somebody might well like to match on an arbitrary part
> of a commit, and special-casing author/committer seems somewhat broken.
> I personally suspect that both git-rev-list and git-rev-tree should have
> an alternate output format that could be more easily grepped by subsequent
> commands. For example, right now git-rev-list just outputs a list of
> commit ID's, and it might make sense to have a flag to just append the
> commit message to the output, and zero-terminate it (and if the commit
> message has a NUL byte in it, just truncate it at that point).
>
> Then you could just do
>
> git-rev-list -v --header HEAD | grep -z 'author[^\n]*Linus'
>
> to tell it to do the "verbose" thing (only showing the header of the
> commit, not the whole message), and grep for "Linus" in the author line.
What happened to that idea? That's not already working in some other way I
missed, right? The pickaxe stuff is nice and was easy to call from the cgi,
but searching in commit messages would be nice too.
If it's not going to happen in the git-binaries, I may do it just in the
cgi itself.
Kay
^ permalink raw reply
* [PATCH] Make tests more portable
From: Junio C Hamano @ 2005-05-26 0:07 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Mark Allen, git
In-Reply-To: <20050525045229.29706.qmail@web41205.mail.yahoo.com>
This is the remainder of testcase fix by Mark Allen to make them
work on his Darwin box. I was using "xargs -r" (GNU) where it
was not needed, sed -ne '/^\(author\|committer\)/s|>.*|>|p'
where his sed does not know what to do with '\|', and "cmp -
file" to compare standard input with a file, which his cmp does
not support.
Another problem his patch fixed has been merged in the tip of
your git.git already.
Author: Mark Allen <mrallen1@yahoo.com>
Author-Date: Tue, 24 May 2005 21:52:28 -0700
Signed-off-by: Junio C Hamano <junkio@cox.net>
---
t/t0000-basic.sh | 2 +-
t/t0110-environment-names-old.sh | 6 ++----
2 files changed, 3 insertions(+), 5 deletions(-)
diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
--- a/t/t0000-basic.sh
+++ b/t/t0000-basic.sh
@@ -84,7 +84,7 @@ do
done
test_expect_success \
'adding various types of objects with git-update-cache --add.' \
- 'find path* ! -type d -print0 | xargs -0 -r git-update-cache --add'
+ 'find path* ! -type d -print0 | xargs -0 git-update-cache --add'
# Show them and see that matches what we expect.
test_expect_success \
diff --git a/t/t0110-environment-names-old.sh b/t/t0110-environment-names-old.sh
--- a/t/t0110-environment-names-old.sh
+++ b/t/t0110-environment-names-old.sh
@@ -86,8 +86,7 @@ committer A U Thor <author@example.xz>
EOF
test_expect_success \
'verify old AUTHOR variables were used correctly in commit' \
- 'sed -ne '\''/^\(author\|committer\)/s|>.*|>|p'\'' current |
- cmp - expected'
+ 'sed -ne '\''/^\(author\)/s|>.*|>|p'\'' -e'\''/^\(committer\)/s|>.*|>|p'\''\ current > out && cmp out expected'
unset GIT_DIR
test_expect_success \
@@ -128,7 +127,6 @@ committer R O Htua <rohtua@example.xz>
EOF
test_expect_success \
'verify new AUTHOR variables were used correctly in commit.' \
- 'sed -ne '\''/^\(author\|committer\)/s|>.*|>|p'\'' current |
- cmp - expected'
+ 'sed -ne '\''/^\(author\)/s|>.*|>|p'\'' -e'\''/^\(committer\)/s|>.*|>|p'\''\ current > out && cmp out expected'
test_done
------------------------------------------------
^ permalink raw reply
* [PATCH] Mode only changes from diff.
From: Junio C Hamano @ 2005-05-25 23:00 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Kay Sievers, Git Mailing List
In-Reply-To: <Pine.LNX.4.58.0505251544250.2307@ppc970.osdl.org>
This fixes another bug.
- Mode-only changes were pruned incorrectly from the output.
- Added test to catch the above problem.
- Normalize rename/copy similarity score in the diff-raw output
to per-cent, no matter what scale we internally use.
Signed-off-by: Junio C Hamano <junkio@cox.net>
---
diff-helper.c | 2 ++
diff.c | 6 ++++--
t/t4006-diff-mode.sh | 34 ++++++++++++++++++++++++++++++++++
3 files changed, 40 insertions(+), 2 deletions(-)
new file (100755): t/t4006-diff-mode.sh
diff --git a/diff-helper.c b/diff-helper.c
--- a/diff-helper.c
+++ b/diff-helper.c
@@ -4,6 +4,7 @@
#include "cache.h"
#include "strbuf.h"
#include "diff.h"
+#include "diffcore.h" /* just for MAX_SCORE */
static const char *pickaxe = NULL;
static int line_termination = '\n';
@@ -77,6 +78,7 @@ int main(int ac, const char **av) {
if (status == 'R' || status == 'C') {
two_paths = 1;
sscanf(cp, "%d", &score);
+ score = score * MAX_SCORE / 100;
if (line_termination) {
cp = strchr(cp,
inter_name_termination);
diff --git a/diff.c b/diff.c
--- a/diff.c
+++ b/diff.c
@@ -517,7 +517,8 @@ static void diff_flush_raw(struct diff_f
switch (p->status) {
case 'C': case 'R':
two_paths = 1;
- sprintf(status, "%c%1d", p->status, p->score);
+ sprintf(status, "%c%03d", p->status,
+ (int)(0.5 + p->score * 100.0/MAX_SCORE));
break;
default:
two_paths = 0;
@@ -750,7 +751,8 @@ static void diff_resolve_rename_copy(voi
if (!p->status)
p->status = 'R';
}
- else if (memcmp(p->one->sha1, p->two->sha1, 20))
+ else if (memcmp(p->one->sha1, p->two->sha1, 20) ||
+ p->one->mode != p->two->mode)
p->status = 'M';
else
/* this is a "no-change" entry */
diff --git a/t/t4006-diff-mode.sh b/t/t4006-diff-mode.sh
new file mode 100755
--- /dev/null
+++ b/t/t4006-diff-mode.sh
@@ -0,0 +1,34 @@
+#!/bin/sh
+#
+# Copyright (c) 2005 Junio C Hamano
+#
+
+test_description='Test mode change diffs.
+
+'
+. ./test-lib.sh
+
+test_expect_success \
+ 'setup' \
+ 'echo frotz >rezrov &&
+ git-update-cache --add rezrov &&
+ tree=`git-write-tree` &&
+ echo $tree'
+
+test_expect_success \
+ 'chmod' \
+ 'chmod +x rezrov &&
+ git-update-cache rezrov &&
+ git-diff-cache $tree >current'
+
+_x40='[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]'
+_x40="$_x40$_x40$_x40$_x40$_x40$_x40$_x40$_x40"
+sed -e 's/\(:100644 100755\) \('"$_x40"'\) \2 /\1 X X /' <current >check
+echo ":100644 100755 X X M rezrov" >expected
+
+test_expect_success \
+ 'verify' \
+ 'diff -u expected check'
+
+test_done
+
------------------------------------------------
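
The score normalization in the patch above can be tried in isolation with the following sketch. EXAMPLE_MAX_SCORE is an assumed value chosen for illustration only (the real constant is MAX_SCORE in diffcore.h), and the function names are hypothetical; only the two arithmetic expressions are taken from the diff.c and diff-helper.c hunks.

```c
#include <stdio.h>

/* Assumed value, for illustration only; the real one is MAX_SCORE in diffcore.h. */
#define EXAMPLE_MAX_SCORE 10000

/* Internal score -> whole per-cent, rounded to nearest (same expression as the diff.c hunk). */
static int score_to_percent(int score)
{
	return (int)(0.5 + score * 100.0 / EXAMPLE_MAX_SCORE);
}

/* Per-cent read from diff-raw -> internal scale (same expression as the diff-helper.c hunk). */
static int percent_to_score(int percent)
{
	return percent * EXAMPLE_MAX_SCORE / 100;
}

/* Format the status field the way the patched sprintf does: letter plus three digits. */
static void format_status(char *out, size_t outsz, char status, int score)
{
	snprintf(out, outsz, "%c%03d", status, score_to_percent(score));
}
```

Note that the round trip loses precision: an internal score that is not a whole per-cent comes back from diff-helper as the nearest whole per-cent on the internal scale.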
^ permalink raw reply
* Re: change of git-diff-tree and symlinks
From: Linus Torvalds @ 2005-05-25 22:47 UTC (permalink / raw)
To: Kay Sievers; +Cc: Junio C Hamano, Git Mailing List
In-Reply-To: <20050525222622.GA8552@vrfy.org>
On Thu, 26 May 2005, Kay Sievers wrote:
>
> If we introduce 'T', how is a content _and_ a type change represented
> if they happen at the same time?
A 'T' _always_ implies a content change, imho.
Yes, that strange udev changeset actually had files that had the same
content as the symlinks, but from a patch perspective, that should
probably really still be a "file got entirely deleted" + "we created a
symlink with new content". Anything else just doesn't make any sense.
So in that way, 'T' really is different from 'M'. 'M' implies a patch
(which might be empty, of course), while 'T' implies that the old thing
was deleted and entirely replaced with something totally different.
Linus
^ permalink raw reply
* Re: change of git-diff-tree and symlinks
From: Junio C Hamano @ 2005-05-25 22:43 UTC (permalink / raw)
To: Kay Sievers; +Cc: Linus Torvalds, Git Mailing List
In-Reply-To: <20050525222622.GA8552@vrfy.org>
>>>>> "KS" == Kay Sievers <kay.sievers@vrfy.org> writes:
KS> If we introduce 'T', how is a content _and_ a type change represented
KS> if they happen at the same time?
If you have this pair in two trees:
ln -s frotz xyzzy
echo -n frotz >xyzzy
it is a 'T'. If you instead have these in two trees:
ln -s rezrov xyzzy
echo -n frotz >xyzzy
it is also a 'T'.
I do not think we would want patch format to give us a diff
showing that string rezrov changing into frotz in the latter
example anyway. When we have a type change, content change is
irrelevant.
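
The rule described in this thread (a type change is reported as 'T' regardless of content) could be sketched as below. This is a hypothetical classifier, not git's actual code; classify_pair and EXAMPLE_TYPE_MASK are made-up names, and the mask is the conventional octal file-type mask matching modes such as 100644 and 120000 seen in diff-raw output.

```c
#include <string.h>

/* Assumed file-type mask for tree entry modes (100644 = file, 120000 = symlink). */
#define EXAMPLE_TYPE_MASK 0170000

/*
 * Hypothetical classifier: pick a diff-raw status letter for a pair of
 * entries that exist on both sides. Returns 'T' when the object type
 * differs, 'M' when content or permission bits differ, 0 for no change.
 */
static char classify_pair(unsigned mode1, const unsigned char *sha1,
			  unsigned mode2, const unsigned char *sha2)
{
	if ((mode1 & EXAMPLE_TYPE_MASK) != (mode2 & EXAMPLE_TYPE_MASK))
		return 'T';	/* type change wins; content is not consulted */
	if (memcmp(sha1, sha2, 20) || mode1 != mode2)
		return 'M';	/* content change or mode-only change */
	return 0;
}
```

With this ordering both of the xyzzy examples above yield 'T', whether or not the symlink target and the file content happen to be the same string.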
^ permalink raw reply