Git development


* Re: new gitk feature
From: Linus Torvalds @ 2006-04-26 15:09 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: git
In-Reply-To: <17487.21137.344427.173131@cargo.ozlabs.ibm.com>

On Wed, 26 Apr 2006, Paul Mackerras wrote:
>
> I just pushed some changes to gitk which add a new feature, the
> ability to have multiple "views" of a repository.  Each view is a
> subgraph of the full graph.  At the moment the only subgraph that you
> can specify is the subgraph containing the commits that affect a
> specified set of files or directories.  You can switch between views
> quickly, and if the currently selected commit exists in the new view
> when you switch views, it is selected in the new view.

This gets close to something I wanted, but at the same time falls very 
short of it because the views are always shown completely disjoint.

I've wanted for a long time to have a way to _highlight_ commits. That's 
actually very much a "view" thing, but it's a mode where you really see 
one view, but the commits that exist in another view have a different 
color (or have the commits that _don't_ exist in the other view be grayed 
out).

I hope that your new "view" thing would support this notion too: instead 
of having to totally switch between views, it would be wonderful if you 
could have one "master view" and then use another view to "highlight".

Also, I think revision information should be part of a view. For example, 
in the "highlight" case, I'd love to have the "main view" be the default 
"everything", and then have some way to _highlight_ the view that is 
defined by the revision pattern "v1.3.1.."

Any possibility of something like that? I'd _love_ to be able to see the 
whole tree, but with things that touch certain files or things that are 
newer highlighted.

(Btw, the "revision information" is also cool things like "--unpacked". I 
actually use "gitk --unpacked" every once in a while, just because it's 
such a cool way to say "show me everything I've added since I packed the 
repo last").

			Linus

^ permalink raw reply

* Re: lstat() call in rev-parse.c
From: Matthias Lederhofer @ 2006-04-26 15:28 UTC (permalink / raw)
  To: git
In-Reply-To: <Pine.LNX.4.64.0604230906370.3701@g5.osdl.org>

> So the rule is: if you don't give that "--", then we have to be able to 
> confirm that the filenames are really files. Not a misspelled revision 
> name, or a revision name that was correctly spelled, but for the wrong 
> project, because you were in the wrong subdirectory ;)

Shouldn't git rev-parse try to stat the file (additionally?) in the
current directory instead of the top git directory? git (diff|log|..)
seem to fail everytime in a subdirectory without --.

^ permalink raw reply

* Re: lstat() call in rev-parse.c
From: Linus Torvalds @ 2006-04-26 15:43 UTC (permalink / raw)
  To: Matthias Lederhofer; +Cc: git
In-Reply-To: <E1FYlwn-0005mf-CL@moooo.ath.cx>

On Wed, 26 Apr 2006, Matthias Lederhofer wrote:

> > So the rule is: if you don't give that "--", then we have to be able 
> > to confirm that the filenames are really files. Not a misspelled 
> > revision name, or a revision name that was correctly spelled, but for 
> > the wrong project, because you were in the wrong subdirectory ;)
> 
> Shouldn't git rev-parse try to stat the file (additionally?) in the 
> current directory instead of the top git directory? git (diff|log|..) 
> seem to fail everytime in a subdirectory without --.

Good point. However, the reason for that is that it actually _does_ stat 
the file in the current directory, but it has done the 

	revs->prefix = setup_git_directory();

in the init path (and it does need to do that, since that's what figures 
out where the .git directory is, so that we can parse the revisions 
correctly).

And that "setup_git_directory()" will chdir() to the root of the project.

So the "lstat()" should probably take "revs->prefix" into account, the 
way get_pathspec() does. Ie we should probably use

	char *name = argv[i];
	if (revs->prefix)
		name = prefix_filename(revs->prefix, strlen(revs->prefix), name);
	if (lstat(name, ..) < 0)
		die(...)

instead of just a plain lstat().

Probably worth doing as a small helper function of its own (and get rid of 
the current "die_badfile()" - and do all of that inside the helper 
function).

Somebody?

		Linus

^ permalink raw reply
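
A minimal sketch of the kind of helper suggested in the message above, not the actual git code. The names join_prefix() and exists_from_root() are hypothetical; join_prefix() only stands in for prefix_filename(). The assumption is that the process has already done chdir() to the project root, so a name typed in a subdirectory has to be re-prefixed before lstat() can find it.

```c
#include <assert.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical stand-in for prefix_filename(): write prefix + name
 * into buf.  "prefix" is the subdirectory the command was started in,
 * relative to the project root and ending in '/' (for example
 * "Documentation/"), or NULL when the command was started at the root.
 * Returns buf, or NULL if the result does not fit.
 */
static const char *join_prefix(const char *prefix, const char *name,
			       char *buf, size_t size)
{
	size_t plen = prefix ? strlen(prefix) : 0;
	size_t nlen;

	assert(name != NULL);
	nlen = strlen(name);
	if (plen + nlen + 1 > size)
		return NULL;
	if (plen)
		memcpy(buf, prefix, plen);
	memcpy(buf + plen, name, nlen + 1);	/* copies the NUL too */
	return buf;
}

/*
 * Return 1 if "name", as typed in the original subdirectory, exists
 * when looked up from the current directory (assumed to be the
 * project root), else 0.
 */
static int exists_from_root(const char *prefix, const char *name)
{
	char buf[4096];
	struct stat st;
	const char *path = join_prefix(prefix, name, buf, sizeof(buf));

	return path && lstat(path, &st) == 0;
}
```

With a prefix of "Documentation/" and an argument of "git.txt", the lookup is done on "Documentation/git.txt", which is what the user meant when typing the name inside that subdirectory.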

* Re: [PATCH/RFC] reverse the pack-objects delta window logic
From: Nicolas Pitre @ 2006-04-26 15:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vpsj4sxer.fsf@assigned-by-dhcp.cox.net>

On Tue, 25 Apr 2006, Junio C Hamano wrote:

> Nicolas Pitre <nico@cam.org> writes:
> 
> > Note, this is a RFC particularly to Junio since the resulting pack is 
> > larger than without the patch with git-repack -a -f.  However using a 
> > subsequent git-repack -a brings the pack size down to expected size.  So 
> > I'm not sure I've got everything right.
> 
> I haven't tested it seriously yet, but there is nothing that
> looks obviously wrong that might cause the inflation problem,
> from the cursory look after applying the patch on top of your
> last round.
> 
> > +	if (nr_objects == nr_result && trg_entry->delta_limit >= max_depth)
> > +		return 0;
> 
> The older code was loosening this check only for a delta chain
> that is already in pack (which is limited to its previous
> max_depth).  The end result is almost the same -- a thin pack
> recipient would have deeper delta than it asked. The difference
> is that the earlier code had implicit 2*max_depth limit,

Ah.  Indeed.  Didn't realize that.  I can restore that behavior quite 
easily if necessary.

> but this one makes the chain length unbounded, which I do not think it 
> is necessarily a bad change.

Well as long as the thin pack doesn't carry too many revisions it should 
be fine since, as the comment in the code says, those packs are always 
unpacked.

Initially I had a bug where the delta depth was completely ignored.  I 
was pretty excited when repacking the kernel produced a pack 20% smaller 
although I didn't know why at that time.  But when attempting another 
git-repack -a -f then the initial object counting was sooooooo 
slooooooooow.

> > -	/*
> > -	 * NOTE!
> > -	 *
> > -	 * We always delta from the bigger to the smaller, since that's
> > -	 * more space-efficient (deletes don't have to say _what_ they
> > -	 * delete).
> > -	 */
> 
> This comment by Linus still applies, even though the scan order
> is now reversed; no need to remove it.

This is not exactly true.  In general it is so, but as we fixed the 
deltification of objects with the same name but in different directories 
it is well possible to go from smaller to larger and leaving that 
comment there is misleading.

This is also why I changed the sizediff rule such that:

	sizediff = src_size < size ? size - src_size : 0;

Since the src buffer already has its delta index computed, it costs 
almost nothing to attempt matching much smaller objects against it.  
However if we go from small to larger then the previous logic still 
applies.

> > +	if (trg_entry->delta) {
> > +		/*
> > +		 * The target object already has a delta base but we just
> > +		 * found a better one.  Remove it from its former base
> > +		 * childhood and redetermine the base delta_limit (if used).
> > +		 */
> 
> And you are making the delta chain unbound for thin case, you
> can probably omit this with the same if() here; the
> recomputation seems rather expensive.

Ah right.  I was doing so partly, but I can skip any tree maintenance 
altogether in that case as well.

> > +		if (!size)
> > +			continue;
> > +		delta_index = create_delta_index(n->data, size);
> > +		if (!delta_index)
> > +			die("out of memory");
> 
> It might be worth saying "if (size < 50)" here as well; no point
> wasting the delta window for small sources.

Good point.  No real effect on the pack size though.

> > -#if 0
> > -		/* if we made n a delta, and if n is already at max
> > -		 * depth, leaving it in the window is pointless.  we
> > -		 * should evict it first.
> > -		 * ... in theory only; somehow this makes things worse.
> > -		 */
> > -		if (entry->delta && depth <= entry->depth)
> > -			continue;
> > -#endif
> 
> I was almost tempted to suggest that the degradation you are
> seeing might be related to this mystery I did not get around to
> solve.  By allowing to give chance to try delta against less
> optimum candidates, it appeared that we ended up making the
> final pack size bigger than otherwise, which suggests that our
> choice between plain undeltified and a delta half its size might
> be favoring delta too much.  But it does not appear to be
> related to the inflation you are seeing.

Certainly not, since git-repack -a may only delta _more_ and the pack 
size actually goes down a lot in my case.

The mystery I'm facing is why would a second pass with git-repack -a fix 
things?  It has a different window behavior since objects already 
deltified do not occupy window space. Hmmm.  That would certainly 
explain why doing a git-repack -a after a git-repack -a -f produces a 
smaller pack even currently.

> BTW, have you tried it without --no-reuse-pack on an object list
> that is not thin?  It appears you are busting the depth limit.
> 
> Using the same "git rev-list --objects v1.2.3..v1.3.0" as input,
> git-pack-objects without --no-reuse-pack gives this
> distribution:
> 
> chain length = 1: 364 objects
> chain length = 2: 269 objects
> chain length = 3: 198 objects
> chain length = 4: 164 objects
> chain length = 5: 148 objects
> chain length = 6: 123 objects
> chain length = 7: 122 objects
> chain length = 8: 103 objects
> chain length = 9: 92 objects
> chain length = 10: 234 objects
> chain length = 11: 12 objects
> chain length = 12: 1 object
> chain length = 13: 2 objects

Oops.  OK fixed.

> So it _might_ be that the depth limiting code is subtly broken
> which is causing you throw away a perfectly good delta base
> which in turn results in a bad pack.

Actually no.  That bug instead allowed each given base to deltify more 
targets than it should have.

Nicolas

^ permalink raw reply
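
The sizediff rule quoted in the message above can be sketched in isolation. This is an illustration only: size_penalty() and worth_trying() are hypothetical names, and the comparison against a max_size budget is an assumption about how the value would be used, not a copy of the pack-objects code.

```c
#include <assert.h>

/*
 * The rule from the message: only a target that is larger than the
 * source counts against the pair; a smaller target costs nothing.
 */
static unsigned long size_penalty(unsigned long src_size, unsigned long size)
{
	return src_size < size ? size - src_size : 0;
}

/*
 * Hypothetical use of the rule: skip the pair once the penalty
 * reaches the size budget for the delta.
 */
static int worth_trying(unsigned long src_size, unsigned long size,
			unsigned long max_size)
{
	assert(max_size > 0);	/* a zero budget would reject everything */
	return size_penalty(src_size, size) < max_size;
}
```

Under this rule a much smaller target is always attempted against a larger source, while a small source is still rejected for a much larger target.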

* [PATCH] git-fetch: resolve remote symrefs for HTTP transport
From: Nick Hengeveld @ 2006-04-26 16:10 UTC (permalink / raw)
  To: git

git-fetch validates that a remote ref resolves to a SHA1 prior to calling
git-http-fetch.  This adds support for resolving a few levels of symrefs
to get to the SHA1.

Signed-off-by: Nick Hengeveld <nickh@reactrix.com>


---

Maybe this isn't the right way to handle this - since we're already
calling perl we could use LWP to do the transfers (using keepalive
even?) or we could let git-http-fetch take care of it and deal with
remote names that don't resolve.  It may also make sense to modify
git-http-fetch so it can fetch more than one head at a time.

 git-fetch.sh |   16 ++++++++++++----
 1 files changed, 12 insertions(+), 4 deletions(-)

aa50f9012834993d8bd080050bc13b23465f9185
diff --git a/git-fetch.sh b/git-fetch.sh
index 83143f8..280f62e 100755
--- a/git-fetch.sh
+++ b/git-fetch.sh
@@ -270,14 +270,22 @@ fetch_main () {
 	  if [ -n "$GIT_SSL_NO_VERIFY" ]; then
 	      curl_extra_args="-k"
 	  fi
-	  remote_name_quoted=$(perl -e '
+	  max_depth=5
+	  depth=0
+	  head="ref: $remote_name"
+	  while (expr "z$head" : "zref:" && expr $depth \< $max_depth) >/dev/null
+	  do
+	    remote_name_quoted=$(perl -e '
 	      my $u = $ARGV[0];
+              $u =~ s/^ref:\s*//;
 	      $u =~ s{([^-a-zA-Z0-9/.])}{sprintf"%%%02x",ord($1)}eg;
 	      print "$u";
-	  ' "$remote_name")
-	  head=$(curl -nsfL $curl_extra_args "$remote/$remote_name_quoted") &&
+	  ' "$head")
+	    head=$(curl -nsfL $curl_extra_args "$remote/$remote_name_quoted")
+	    depth=$( expr \( $depth + 1 \) )
+	  done
 	  expr "z$head" : "z$_x40\$" >/dev/null ||
-		  die "Failed to fetch $remote_name from $remote"
+	      die "Failed to fetch $remote_name from $remote"
 	  echo >&2 Fetching "$remote_name from $remote" using http
 	  git-http-fetch -v -a "$head" "$remote/" || exit
 	  ;;
-- 
1.3.0.g368f0-dirty

^ permalink raw reply related
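
The loop added by the patch above can be sketched in C as follows. This is not part of the patch: the in-memory table and fetch_file() are hypothetical stand-ins for the curl call, and file contents are assumed to have their trailing newline already stripped, as the shell command substitution does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMREF_DEPTH 5	/* same limit as max_depth in the patch */

/* Hypothetical stand-in for "fetch $remote/$name over HTTP". */
struct remote_file {
	const char *name;
	const char *content;
};

static const char *fetch_file(const struct remote_file *files, size_t nr,
			      const char *name)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (!strcmp(files[i].name, name))
			return files[i].content;
	return NULL;	/* plays the role of a failed fetch */
}

/*
 * Follow "ref: <name>" indirections starting at "name", fetching at
 * most MAX_SYMREF_DEPTH files.  Returns the first content that is not
 * a symref (the caller still has to check that it looks like a SHA1),
 * or NULL if a fetch fails or the chain is too long.
 */
static const char *resolve_symref(const struct remote_file *files, size_t nr,
				  const char *name)
{
	int depth;

	assert(name != NULL);
	for (depth = 0; depth < MAX_SYMREF_DEPTH; depth++) {
		const char *content = fetch_file(files, nr, name);

		if (!content)
			return NULL;
		if (strncmp(content, "ref:", 4))
			return content;
		name = content + 4;	/* skip "ref:" and any blanks */
		while (*name == ' ' || *name == '\t')
			name++;
	}
	return NULL;
}
```

The depth limit is what keeps a symref that points back at itself from looping forever; such a chain ends up reported as a failed fetch.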

* Re: [PATCH] git-fetch: resolve remote symrefs for HTTP transport
From: Shawn Pearce @ 2006-04-26 17:09 UTC (permalink / raw)
  To: Nick Hengeveld; +Cc: git
In-Reply-To: <20060426161001.GH32744@reactrix.com>

Nick Hengeveld <nickh@reactrix.com> wrote:
> 
> Maybe this isn't the right way to handle this - since we're already
> calling perl we could use LWP to do the transfers (using keepalive
> even?)

LWP, no.  My Mac OS X perl installation appears to have LWP installed
by dumb luck but my Gentoo Linux perl doesn't have LWP anywhere
in @INC.  :-) Yet both systems run GIT happily.

The HTTP support in GIT is already linked against libcurl and libcurl
is required to use said HTTP support.  I would think that libcurl
is capable of using Keep-Alive when possible, and libcurl and C
are certainly available anywhere GIT's HTTP support is currently
being used.  Ideally any HTTP feature should either be using the
curl command line tool, or better, be written in C against the
libcurl library.  But not LWP.  It's not always available even though
a valid perl is.

-- 
Shawn.

^ permalink raw reply

* [PATCH] Fix filename verification when in a subdirectory
From: Linus Torvalds @ 2006-04-26 17:15 UTC (permalink / raw)
  To: Junio C Hamano, Matthias Lederhofer; +Cc: Git Mailing List, Paul Mackerras
In-Reply-To: <Pine.LNX.4.64.0604260832240.3701@g5.osdl.org>


When we are in a subdirectory of a git archive, we need to take the prefix 
of that subdirectory into account when we verify filename arguments.

Noted by Matthias Lederhofer

This also uses the improved error reporting for all the other git commands 
that use the revision parsing interfaces, not just git-rev-parse. Also, it 
makes the error reporting for mixed filenames and argument flags clearer 
(you cannot put flags after the start of the pathname list).

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
On Wed, 26 Apr 2006, Linus Torvalds wrote:
> > 
> > Shouldn't git rev-parse try to stat the file (additionally?) in the 
> > current directory instead of the top git directory? git (diff|log|..) 
> > seem to fail everytime in a subdirectory without --.
> 
> Good point. However, the reason for that is that it actually _does_ stat 
> the file in the current directory, but it has done the 
> 
> 	revs->prefix = setup_git_directory();
> 
> in the init path (and it does need to do that, since that's what figures 
> out where the .git directory is, so that we can parse the revisions 
> correctly).
> 
> And that "setup_git_directory()" will chdir() to the root of the project.

diff --git a/cache.h b/cache.h
index 69801b0..4d8fabc 100644
--- a/cache.h
+++ b/cache.h
@@ -134,6 +134,7 @@ extern const char *setup_git_directory_g
 extern const char *setup_git_directory(void);
 extern const char *prefix_path(const char *prefix, int len, const char *path);
 extern const char *prefix_filename(const char *prefix, int len, const char *path);
+extern void verify_filename(const char *prefix, const char *name);
 
 #define alloc_nr(x) (((x)+16)*3/2)
 
diff --git a/rev-parse.c b/rev-parse.c
index 7f66ae2..62e16af 100644
--- a/rev-parse.c
+++ b/rev-parse.c
@@ -160,14 +160,6 @@ static int show_file(const char *arg)
 	return 0;
 }
 
-static void die_badfile(const char *arg)
-{
-	if (errno != ENOENT)
-		die("'%s': %s", arg, strerror(errno));
-	die("'%s' is ambiguous - revision name or file/directory name?\n"
-	    "Please put '--' before the list of filenames.", arg);
-}
-
 int main(int argc, char **argv)
 {
 	int i, as_is = 0, verify = 0;
@@ -177,14 +169,12 @@ int main(int argc, char **argv)
 	git_config(git_default_config);
 
 	for (i = 1; i < argc; i++) {
-		struct stat st;
 		char *arg = argv[i];
 		char *dotdot;
 
 		if (as_is) {
 			if (show_file(arg) && as_is < 2)
-				if (lstat(arg, &st) < 0)
-					die_badfile(arg);
+				verify_filename(prefix, arg);
 			continue;
 		}
 		if (!strcmp(arg,"-n")) {
@@ -350,8 +340,7 @@ int main(int argc, char **argv)
 			continue;
 		if (verify)
 			die("Needed a single revision");
-		if (lstat(arg, &st) < 0)
-			die_badfile(arg);
+		verify_filename(prefix, arg);
 	}
 	show_default();
 	if (verify && revs_count != 1)
diff --git a/revision.c b/revision.c
index f9c7d15..f2a9f25 100644
--- a/revision.c
+++ b/revision.c
@@ -752,17 +752,15 @@ int setup_revisions(int argc, const char
 			arg++;
 		}
 		if (get_sha1(arg, sha1) < 0) {
-			struct stat st;
 			int j;
 
 			if (seen_dashdash || local_flags)
 				die("bad revision '%s'", arg);
 
 			/* If we didn't have a "--", all filenames must exist */
-			for (j = i; j < argc; j++) {
-				if (lstat(argv[j], &st) < 0)
-					die("'%s': %s", argv[j], strerror(errno));
-			}
+			for (j = i; j < argc; j++)
+				verify_filename(revs->prefix, argv[j]);
+
 			revs->prune_data = get_pathspec(revs->prefix, argv + i);
 			break;
 		}
diff --git a/setup.c b/setup.c
index 36ede3d..119ef7d 100644
--- a/setup.c
+++ b/setup.c
@@ -62,6 +62,29 @@ const char *prefix_filename(const char *
 	return path;
 }
 
+/*
+ * Verify a filename that we got as an argument for a pathspec
+ * entry. Note that a filename that begins with "-" never verifies
+ * as true, because even if such a filename were to exist, we want
+ * it to be preceded by the "--" marker (or we want the user to
+ * use a format like "./-filename")
+ */
+void verify_filename(const char *prefix, const char *arg)
+{
+	const char *name;
+	struct stat st;
+
+	if (*arg == '-')
+		die("bad flag '%s' used after filename", arg);
+	name = prefix ? prefix_filename(prefix, strlen(prefix), arg) : arg;
+	if (!lstat(name, &st))
+		return;
+	if (errno == ENOENT);
+		die("ambiguous argument '%s': unknown revision or filename\n"
+		    "Use '--' to separate filenames from revisions", arg);
+	die("'%s': %s", arg, strerror(errno));
+}
+
 const char **get_pathspec(const char *prefix, const char **pathspec)
 {
 	const char *entry = *pathspec;

^ permalink raw reply related

* Re: [PATCH/RFC] reverse the pack-objects delta window logic
From: Nicolas Pitre @ 2006-04-26 17:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vpsj4sxer.fsf@assigned-by-dhcp.cox.net>

On Tue, 25 Apr 2006, Junio C Hamano wrote:

> Nicolas Pitre <nico@cam.org> writes:
> 
> > Note, this is a RFC particularly to Junio since the resulting pack is 
> > larger than without the patch with git-repack -a -f.  However using a 
> > subsequent git-repack -a brings the pack size down to expected size.  So 
> > I'm not sure I've got everything right.
> 
> I haven't tested it seriously yet, but there is nothing that
> looks obviously wrong that might cause the inflation problem,
> from the cursory look after applying the patch on top of your
> last round.

Never mind.  I found a flaw in the determination of delta_limit when 
reparenting a delta target.  The immediate parent's delta_limit is 
readjusted when its longest delta is moved to another base, but if that 
parent was itself a delta then the delta_limit adjustment is not 
propagated back up to the top.  This means that some objects were 
falsely credited with too high delta_limit.

And actually I'm not sure how to solve that without walking the tree 
up to the top each time, which I want to avoid as much as possible.

Nicolas

^ permalink raw reply
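
To make the propagation problem described above concrete, here is a sketch of the straightforward fix, the "walk the tree up" approach the message would like to avoid. The struct is a simplified assumption: the field names follow the patch, but detach_from_base() and limit_from_children() are hypothetical and this is not the pack-objects code.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified node; field names follow the patch, the rest is assumed. */
struct entry {
	struct entry *delta;		/* base this object is a delta of */
	struct entry *delta_child;	/* first object deltified against us */
	struct entry *delta_sibling;	/* next object sharing our base */
	unsigned int delta_limit;	/* longest delta chain below us */
};

/* Recompute one node's delta_limit from its direct children. */
static unsigned int limit_from_children(const struct entry *e)
{
	unsigned int limit = 0;
	const struct entry *c;

	for (c = e->delta_child; c; c = c->delta_sibling)
		if (limit < c->delta_limit + 1)
			limit = c->delta_limit + 1;
	return limit;
}

/*
 * Unlink "child" from its base, then walk up towards the top of the
 * tree, refreshing delta_limit until a node's value stops changing.
 */
static void detach_from_base(struct entry *child)
{
	struct entry *base = child->delta;
	struct entry **p;

	assert(base != NULL);
	for (p = &base->delta_child; *p; p = &(*p)->delta_sibling) {
		if (*p == child) {
			*p = child->delta_sibling;
			break;
		}
	}
	child->delta = NULL;
	child->delta_sibling = NULL;

	for (; base; base = base->delta) {
		unsigned int limit = limit_from_children(base);

		if (limit == base->delta_limit)
			break;	/* ancestors are unaffected */
		base->delta_limit = limit;
	}
}
```

The walk stops early when a node's limit is unchanged, but in the worst case every reparenting visits the whole chain up to the top, and each step scans that node's children.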

* Re: [PATCH] Fix filename verification when in a subdirectory
From: Timo Hirvonen @ 2006-04-26 18:05 UTC (permalink / raw)
  To: torvalds; +Cc: junkio, matled, git, paulus
In-Reply-To: <Pine.LNX.4.64.0604261010390.3701@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> wrote:

> +void verify_filename(const char *prefix, const char *arg)
> +{
> +	const char *name;
> +	struct stat st;
> +
> +	if (*arg == '-')
> +		die("bad flag '%s' used after filename", arg);
> +	name = prefix ? prefix_filename(prefix, strlen(prefix), arg) : arg;
> +	if (!lstat(name, &st))
> +		return;
> +	if (errno == ENOENT);

Extra semicolon.

-- 
http://onion.dynserv.net/~timo/

^ permalink raw reply

* Re: [PATCH] Fix filename verification when in a subdirectory
From: Linus Torvalds @ 2006-04-26 18:14 UTC (permalink / raw)
  To: Timo Hirvonen; +Cc: junkio, matled, git, paulus
In-Reply-To: <20060426210541.5e145e88.tihirvon@gmail.com>



On Wed, 26 Apr 2006, Timo Hirvonen wrote:
> 
> Extra semicolon.

Duh, indeed. It just didn't show up in any of the normal cases.

Junio, just apply without that stupid semicolon..

		Linus

^ permalink raw reply

* Re: [PATCH] Add --continue and --abort options to git-rebase.
From: Junio C Hamano @ 2006-04-26 20:05 UTC (permalink / raw)
  To: sean; +Cc: git
In-Reply-To: <BAYC1-PASMTP025110BEB495EC4F07CDE2AEBC0@CEZ.ICE>

sean <seanlkml@sympatico.ca> writes:

>   git rebase [--onto <newbase>] <upstream> [<branch>]
>   git rebase --continue
>   git rebase --abort
>
> ---
>
> Take 2.  Much simpler patch which doesn't try to 
> rejigger the command line too much.

This second round seems to make more sense.  Sign-off?

^ permalink raw reply

* Re: [PATCH] send-email: Change from Mail::Sendmail to Net::SMTP
From: Junio C Hamano @ 2006-04-26 20:17 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: Eric Wong, Junio C Hamano, git, Ryan Anderson
In-Reply-To: <46a038f90604251745u1b15ad99ka1aeff1cd8d8c344@mail.gmail.com>

"Martin Langhoff" <martin.langhoff@gmail.com> writes:

>  * This box has nothing listening on port 25. It doesn't get email
> from the net, being a LAN machine, so I've told the debian config
> system that we don't need an smtp daemon. Net::SMTP doesn't know how
> to use /usr/bin/sendmail

Wouldn't --smtp-server=that.smtp.server work for you?  Ah, that
would not work if your use is to send a local mail.  Hmph...

^ permalink raw reply

* Re: [PATCH] send-email: Change from Mail::Sendmail to Net::SMTP
From: Martin Langhoff @ 2006-04-26 20:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Eric Wong, git, Ryan Anderson
In-Reply-To: <7vy7xsm6qa.fsf@assigned-by-dhcp.cox.net>

On 4/27/06, Junio C Hamano <junkio@cox.net> wrote:
> > system that we don't need an smtp daemon. Net::SMTP doesn't know how
> > to use /usr/bin/sendmail

> Wouldn't --smtp-server=that.smtp.server work for you?  Ah, that
> would not work if your use is to send a local mail.  Hmph...

Well, the machine knows what the smtp server is (I mean, files in /etc
have the right values in them), but I don't think often about it. Only
when I am installing OSs or MTAs...

I know... I'm a whiner! ;-) I'll probably do something that does an
eval and tries Mail::Sendmail and post it.

martin

^ permalink raw reply

* [PATCH] revision parsing: make "rev -- paths" checks stronger.
From: Junio C Hamano @ 2006-04-26 22:22 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0604220921430.3701@g5.osdl.org>

If you don't have a "--" marker, then:

 - all of the arguments we are going to assume are pathspecs
   must exist in the working tree.

 - none of the arguments we parsed as revisions could be
   interpreted as a filename.

so that there really isn't any possibility of confusion in case
somebody does have a revision that looks like a pathname too.

The former rule has been in effect; this implements the latter.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

 * I did not understand 'if you have a pathspec, but' part, and
   'none of the paths are "--"' part of your rules, so this
   might probably be a bit different from what you had in
   mind.

   The patch, lacking the "if you have a pathspec" part of the
   rule, would make this complain:

   	$ >v1.0.0
        $ git show v1.0.0

   BTW, this would silently do what you would not find
   interesting, with or without the patch:

       $ >v2.9.6
	 ... long time later after you forget you had such a
	 ... bogus file
       $ git show v2.9.6

  Linus Torvalds <torvalds@osdl.org> writes:

  > ... I was going to say that if you have a pathspec, 
  > but don't have a "--" marker, then we'd additionally check:
  >
  >  - none of the arguments we parsed as revisions could be interpreted as a 
  >    filename.
  >
  >  - none of the paths are "--" 
  >
  > so that there really isn't...



 cache.h    |    1 +
 revision.c |   14 +++++++++++++-
 setup.c    |   24 ++++++++++++++++++++++--
 3 files changed, 36 insertions(+), 3 deletions(-)

diff --git a/cache.h b/cache.h
index 4d8fabc..a4f253e 100644
--- a/cache.h
+++ b/cache.h
@@ -135,6 +135,7 @@ extern const char *setup_git_directory(v
 extern const char *prefix_path(const char *prefix, int len, const char *path);
 extern const char *prefix_filename(const char *prefix, int len, const char *path);
 extern void verify_filename(const char *prefix, const char *name);
+extern void verify_non_filename(const char *prefix, const char *name);
 
 #define alloc_nr(x) (((x)+16)*3/2)
 
diff --git a/revision.c b/revision.c
index f2a9f25..b6ed014 100644
--- a/revision.c
+++ b/revision.c
@@ -740,6 +740,11 @@ int setup_revisions(int argc, const char
 				include = get_reference(revs, next, sha1, flags);
 				if (!exclude || !include)
 					die("Invalid revision range %s..%s", arg, next);
+
+				if (!seen_dashdash) {
+					*dotdot = '.';
+					verify_non_filename(revs->prefix, arg);
+				}
 				add_pending_object(revs, exclude, this);
 				add_pending_object(revs, include, next);
 				continue;
@@ -757,13 +762,20 @@ int setup_revisions(int argc, const char
 			if (seen_dashdash || local_flags)
 				die("bad revision '%s'", arg);
 
-			/* If we didn't have a "--", all filenames must exist */
+			/* If we didn't have a "--":
+			 * (1) all filenames must exist;
+			 * (2) all rev-args must not be interpretable
+			 *     as a valid filename.
+			 * but the latter we have checked in the main loop.
+			 */
 			for (j = i; j < argc; j++)
 				verify_filename(revs->prefix, argv[j]);
 
 			revs->prune_data = get_pathspec(revs->prefix, argv + i);
 			break;
 		}
+		if (!seen_dashdash)
+			verify_non_filename(revs->prefix, arg);
 		object = get_reference(revs, arg, sha1, flags ^ local_flags);
 		add_pending_object(revs, object, arg);
 	}
diff --git a/setup.c b/setup.c
index cce9bb8..fe7f884 100644
--- a/setup.c
+++ b/setup.c
@@ -80,11 +80,31 @@ void verify_filename(const char *prefix,
 	if (!lstat(name, &st))
 		return;
 	if (errno == ENOENT)
-		die("ambiguous argument '%s': unknown revision or filename\n"
-		    "Use '--' to separate filenames from revisions", arg);
+		die("ambiguous argument '%s': unknown revision or path not in the working tree.\n"
+		    "Use '--' to separate paths from revisions", arg);
 	die("'%s': %s", arg, strerror(errno));
 }
 
+/*
+ * Opposite of the above: the command line did not have -- marker
+ * and we parsed the arg as a refname.  It should not be interpretable
+ * as a filename.
+ */
+void verify_non_filename(const char *prefix, const char *arg)
+{
+	const char *name;
+	struct stat st;
+
+	if (*arg == '-')
+		return; /* flag */
+	name = prefix ? prefix_filename(prefix, strlen(prefix), arg) : arg;
+	if (!lstat(name, &st))
+		die("ambiguous argument '%s': both revision and filename\n"
+		    "Use '--' to separate filenames from revisions", arg);
+	if (errno != ENOENT)
+		die("'%s': %s", arg, strerror(errno));
+}
+
 const char **get_pathspec(const char *prefix, const char **pathspec)
 {
 	const char *entry = *pathspec;
-- 
1.3.1.ga6c7

^ permalink raw reply related

* Re: [PATCH/RFC] reverse the pack-objects delta window logic
From: Nicolas Pitre @ 2006-04-27  3:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0604261341200.18520@localhost.localdomain>

On Wed, 26 Apr 2006, Nicolas Pitre wrote:

> On Tue, 25 Apr 2006, Junio C Hamano wrote:
> 
> > Nicolas Pitre <nico@cam.org> writes:
> > 
> > > Note, this is a RFC particularly to Junio since the resulting pack is 
> > > larger than without the patch with git-repack -a -f.  However using a 
> > > subsequent git-repack -a brings the pack size down to expected size.  So 
> > > I'm not sure I've got everything right.
> > 
> > I haven't tested it seriously yet, but there is nothing that
> > looks obviously wrong that might cause the inflation problem,
> > from the cursory look after applying the patch on top of your
> > last round.
> 
> Never mind.  I found a flaw in the determination of delta_limit when 
> reparenting a delta target.  The immediate parent's delta_limit is 
> readjusted when its longest delta is moved to another base, but if that 
> parent was itself a delta then the delta_limit adjustment is not 
> propagated back up to the top.  This means that some objects were 
> falsely credited with too high delta_limit.
> 
> And actually I'm not sure how to solve that without walking the tree 
> up to the top each time, which I want to avoid as much as possible.

Well, that seems to be unavoidable.

Reversing the window logic isn't that much of a good idea after all.  As 
soon as we need to control the delta depth we sorta need to maintain 
trees of deltas and those trees are built from leaves up to the trunk.  

But each new object in the list is then used as a possible new base for 
previously created deltas still in the object window.  When 
the new base is determined to produce a better delta then the 
relationship with the old base must be broken, which means that the 
information about delta length for the old base has to be updated.  And 
if the detached delta chain was the longest for that old base then the 
remaining longest delta chain from that old base has to be found and 
that information reflected up to the ultimate base in that tree.  And 
doing so isn't necessarily trivial as can be seen in the patch below.

Then there is the possibility of having a delta "branch" with maximum 
depth, meaning that the trunk for that branch may not be deltified. But 
if a later object does constitute a better delta base for the 
object in the middle of that branch then the branch will be broken in 
the middle to be transplanted onto the new base as explained previously.  
Which means that the initial trunk no longer has a maximum depth and 
some objects that were skipped because of the depth limit could now have 
been tested, leading to suboptimal delta matching.  This is why 
running pack-objects a second time improved things as it picked up 
those missed delta opportunities.  But having to run pack-objects 
multiple times rather defeats the point, and even if pack-objects 
processed the object list multiple times it certainly won't be 
faster than the current code.

In short, trying to deltify objects by keeping the base object constant 
for the match window really sucks, even from a theoretical point of 
view.  It does produce noticeably larger packs and it does take longer 
to do so.  I'm therefore dropping that approach.  My current patch can 
be found below.  If someone smarter than me (there are plenty I'm sure) 
can come up with improvements to it then be my guest.

Therefore I really think the best approach is to simply keep delta index 
data along with object data in the match window and keep the current 
window direction for matching.  It obviously will take up more memory, 
but most probably less than if we set the window size = 20.  And 
building delta trees from the trunk to the leaves will always be optimal 
regardless with no delta rebasing needed.  I'll post another patch doing 
just that.

---

diff --git a/pack-objects.c b/pack-objects.c
index c0acc46..17280fb 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -19,19 +19,17 @@ struct object_entry {
 	unsigned long offset;	/* offset into the final pack file;
 				 * nonzero if already written.
 				 */
-	unsigned int depth;	/* delta depth */
-	unsigned int delta_limit;	/* base adjustment for in-pack delta */
+	unsigned int delta_limit;	/* deepest delta from this object */
 	unsigned int hash;	/* name hint hash */
 	enum object_type type;
 	enum object_type in_pack_type;	/* could be delta */
 	unsigned long delta_size;	/* delta data size (uncompressed) */
 	struct object_entry *delta;	/* delta base object */
-	struct packed_git *in_pack; 	/* already in pack */
-	unsigned int in_pack_offset;
 	struct object_entry *delta_child; /* delitified objects who bases me */
 	struct object_entry *delta_sibling; /* other deltified objects who
-					     * uses the same base as me
-					     */
+					       uses the same base as me */
+	struct packed_git *in_pack; 	/* already in pack */
+	unsigned int in_pack_offset;
 	int preferred_base;	/* we do not pack this, but is encouraged to
 				 * be used as the base objectto delta huge
 				 * objects against.
@@ -884,17 +882,16 @@ static void check_object(struct object_e
 		    sha1_to_hex(entry->sha1), type);
 }
 
-static unsigned int check_delta_limit(struct object_entry *me, unsigned int n)
+static unsigned int find_delta_limit(struct object_entry *me)
 {
 	struct object_entry *child = me->delta_child;
-	unsigned int m = n;
 	while (child) {
-		unsigned int c = check_delta_limit(child, n + 1);
-		if (m < c)
-			m = c;
+		unsigned int c = find_delta_limit(child);
+		if (me->delta_limit <= c)
+			me->delta_limit = c + 1;
 		child = child->delta_sibling;
 	}
-	return m;
+	return me->delta_limit;
 }
 
 static void get_object_details(void)
@@ -906,11 +903,11 @@ static void get_object_details(void)
 	for (i = 0, entry = objects; i < nr_objects; i++, entry++)
 		check_object(entry);
 
-	if (nr_objects == nr_result) {
+	if (!no_reuse_delta && nr_objects == nr_result) {
 		/*
-		 * Depth of objects that depend on the entry -- this
-		 * is subtracted from depth-max to break too deep
-		 * delta chain because of delta data reusing.
+		 * We must determine the maximum depth of reused deltas
+		 * for those objects used as their base before find_deltas()
+		 * starts considering them as potential delta targets.
 		 * However, we loosen this restriction when we know we
 		 * are creating a thin pack -- it will have to be
 		 * expanded on the other end anyway, so do not
@@ -919,8 +916,7 @@ static void get_object_details(void)
 		 */
 		for (i = 0, entry = objects; i < nr_objects; i++, entry++)
 			if (!entry->delta && entry->delta_child)
-				entry->delta_limit =
-					check_delta_limit(entry, 1);
+				find_delta_limit(entry);
 	}
 }
 
@@ -994,6 +990,7 @@ static int type_size_sort(const struct o
 struct unpacked {
 	struct object_entry *entry;
 	void *data;
+	int pos;
 };
 
 /*
@@ -1004,64 +1001,94 @@ struct unpacked {
  * more importantly, the bigger file is likely the more recent
  * one.
  */
-static int try_delta(struct unpacked *cur, struct unpacked *old, unsigned max_depth)
+static int try_delta(struct unpacked *trg, struct unpacked *src,
+		     struct delta_index *src_index, unsigned max_depth)
 {
-	struct object_entry *cur_entry = cur->entry;
-	struct object_entry *old_entry = old->entry;
-	unsigned long size, oldsize, delta_size, sizediff;
-	long max_size;
+	struct object_entry *trg_entry = trg->entry;
+	struct object_entry *src_entry = src->entry;
+	unsigned long size, src_size, delta_size, sizediff, max_size;
 	void *delta_buf;
 
 	/* Don't bother doing diffs between different types */
-	if (cur_entry->type != old_entry->type)
+	if (trg_entry->type != src_entry->type)
 		return -1;
 
 	/* We do not compute delta to *create* objects we are not
 	 * going to pack.
 	 */
-	if (cur_entry->preferred_base)
-		return -1;
+	if (trg_entry->preferred_base)
+		return 0;
 
-	/* If the current object is at pack edge, take the depth the
-	 * objects that depend on the current object into account --
-	 * otherwise they would become too deep.
+	/*
+	 * Make sure deltifying this object won't make its deepest delta
+	 * too deep, but only when not producing a thin pack.
 	 */
-	if (cur_entry->delta_child) {
-		if (max_depth <= cur_entry->delta_limit)
-			return 0;
-		max_depth -= cur_entry->delta_limit;
-	}
-
-	size = cur_entry->size;
-	oldsize = old_entry->size;
-	sizediff = oldsize > size ? oldsize - size : size - oldsize;
+	if (trg_entry->delta_limit >= max_depth && nr_objects == nr_result)
+		return 0;
 
+	/* Now some size filtering heuristics. */
+	size = trg_entry->size;
 	if (size < 50)
-		return -1;
-	if (old_entry->depth >= max_depth)
 		return 0;
-
-	/*
-	 * NOTE!
-	 *
-	 * We always delta from the bigger to the smaller, since that's
-	 * more space-efficient (deletes don't have to say _what_ they
-	 * delete).
-	 */
 	max_size = size / 2 - 20;
-	if (cur_entry->delta)
-		max_size = cur_entry->delta_size-1;
+	if (trg_entry->delta)
+		max_size = trg_entry->delta_size-1;
+	src_size = src_entry->size;
+	sizediff = src_size < size ? size - src_size : 0;
 	if (sizediff >= max_size)
 		return 0;
-	delta_buf = diff_delta(old->data, oldsize,
-			       cur->data, size, &delta_size, max_size);
+
+	delta_buf = create_delta(src_index, trg->data, size, &delta_size, max_size);
 	if (!delta_buf)
 		return 0;
-	cur_entry->delta = old_entry;
-	cur_entry->delta_size = delta_size;
-	cur_entry->depth = old_entry->depth + 1;
+
+	if (trg_entry->delta && nr_objects == nr_result) {
+		/*
+		 * The target object already has a delta base but we just
+		 * found a better one.  Remove it from its former base
+		 * childhood and redetermine the base delta_limit.
+		 * But again, only when not creating a thin pack.
+		 */
+		struct object_entry *base = trg_entry->delta;
+		struct object_entry **child_link = &base->delta_child;
+		unsigned int limit = base->delta_limit;
+		base->delta_limit = 0;
+		while (*child_link) {
+			if (*child_link == trg_entry) {
+				*child_link = trg_entry->delta_sibling;
+				continue;
+			}
+			if (base->delta_limit <= (*child_link)->delta_limit)
+				base->delta_limit =
+					(*child_link)->delta_limit + 1;
+			child_link = &(*child_link)->delta_sibling;
+		}
+		if (base->delta_limit == limit)
+			goto out;
+		while ((base = base->delta) && ++limit == base->delta_limit) {
+			struct object_entry *child = base->delta_child;
+			base->delta_limit = 0;
+			do {
+				if(base->delta_limit <= child->delta_limit) {
+					base->delta_limit =
+						child->delta_limit + 1;
+					if (base->delta_limit == limit)
+						goto out;
+				}
+				child = child->delta_sibling;
+			} while (child);
+		}
+		out:;
+	}
+
+	trg_entry->delta = src_entry;
+	trg_entry->delta_size = delta_size;
+	trg_entry->delta_sibling = src_entry->delta_child;
+	src_entry->delta_child = trg_entry;
+	if (src_entry->delta_limit <= trg_entry->delta_limit)
+		src_entry->delta_limit = trg_entry->delta_limit + 1;
 	free(delta_buf);
-	return 0;
+	return 1;
 }
 
 static void progress_interval(int signum)
@@ -1078,14 +1105,15 @@ static void find_deltas(struct object_en
 	unsigned last_percent = 999;
 
 	memset(array, 0, array_size);
-	i = nr_objects;
+	i = 0;
 	idx = 0;
 	if (progress)
 		fprintf(stderr, "Deltifying %d objects.\n", nr_result);
 
-	while (--i >= 0) {
-		struct object_entry *entry = list[i];
+	while (i < nr_objects) {
+		struct object_entry *entry = list[i++];
 		struct unpacked *n = array + idx;
+		struct delta_index *delta_index;
 		unsigned long size;
 		char type[10];
 		int j;
@@ -1109,11 +1137,18 @@ static void find_deltas(struct object_en
 			 */
 			continue;
 
+		if (entry->size < 50)
+			continue;
 		free(n->data);
+		n->pos = i;
 		n->entry = entry;
 		n->data = read_sha1_file(entry->sha1, type, &size);
 		if (size != entry->size)
-			die("object %s inconsistent object length (%lu vs %lu)", sha1_to_hex(entry->sha1), size, entry->size);
+			die("object %s inconsistent object length (%lu vs %lu)",
+			    sha1_to_hex(entry->sha1), size, entry->size);
+		delta_index = create_delta_index(n->data, size);
+		if (!delta_index)
+			die("out of memory");
 
 		j = window;
 		while (--j > 0) {
@@ -1124,18 +1159,10 @@ static void find_deltas(struct object_en
 			m = array + other_idx;
 			if (!m->entry)
 				break;
-			if (try_delta(n, m, depth) < 0)
+			if (try_delta(m, n, delta_index, depth) < 0)
 				break;
 		}
-#if 0
-		/* if we made n a delta, and if n is already at max
-		 * depth, leaving it in the window is pointless.  we
-		 * should evict it first.
-		 * ... in theory only; somehow this makes things worse.
-		 */
-		if (entry->delta && depth <= entry->depth)
-			continue;
-#endif
+		free_delta_index(delta_index);
 		idx++;
 		if (idx >= window)
 			idx = 0;

^ permalink raw reply related

* [PATCH] use delta index data when finding best delta matches
From: Nicolas Pitre @ 2006-04-27  3:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0604262210120.18520@localhost.localdomain>


This patch allows for computing the delta index for each base object 
only once and reusing it when trying to find the best delta match.

This should set the mark and pave the way for possibly better delta 
generator algorithms.

Signed-off-by: Nicolas Pitre <nico@cam.org>

---

... as mentioned in my previous post
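
For readers skimming the diff, the idea can be sketched roughly as 
follows.  This is a toy model under stated assumptions, not the patch 
itself: struct toy_index, toy_index_create(), scan() and the builds 
counter are all hypothetical, and the "index" here holds nothing useful.  
The point is only that each window slot builds its index once when it is 
filled, and every later match attempt against that slot reuses it.

```c
#include <assert.h>
#include <stdlib.h>

#define WINDOW 4

/* Hypothetical placeholder for whatever the delta code would index. */
struct toy_index { const char *src; size_t len; };

struct slot {
	const char *data;
	size_t len;
	struct toy_index *index; /* built once, when the slot is filled */
};

int builds; /* how many times an index was constructed */

struct toy_index *toy_index_create(const char *buf, size_t len)
{
	struct toy_index *ix = malloc(sizeof(*ix));
	if (!ix)
		return NULL;
	ix->src = buf;
	ix->len = len;
	builds++;
	return ix;
}

/* Slide objs[0..n) through the window; return the number of match attempts. */
int scan(const char **objs, const size_t *lens, int n)
{
	struct slot win[WINDOW] = {{0}};
	int idx = 0, tries = 0, i, j;

	for (i = 0; i < n; i++) {
		struct slot *cur = &win[idx];

		free(cur->index); /* evict the oldest entry */
		cur->data = objs[i];
		cur->len = lens[i];
		cur->index = toy_index_create(cur->data, cur->len);
		if (!cur->index)
			abort();
		for (j = 1; j < WINDOW; j++) {
			struct slot *old = &win[(idx + WINDOW - j) % WINDOW];
			if (!old->index)
				break; /* window not full yet */
			/* a real matcher would use old->index against cur->data */
			tries++;
		}
		idx = (idx + 1) % WINDOW;
	}
	for (j = 0; j < WINDOW; j++)
		free(win[j].index);
	assert(tries <= n * (WINDOW - 1));
	return tries;
}
```

Six objects through a window of four give twelve match attempts but 
only six index builds; without the cached index there would be one 
build per attempt.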

diff --git a/pack-objects.c b/pack-objects.c
index c0acc46..5b2ef9a 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -994,6 +994,7 @@ static int type_size_sort(const struct o
 struct unpacked {
 	struct object_entry *entry;
 	void *data;
+	struct delta_index *index;
 };
 
 /*
@@ -1004,64 +1005,56 @@ struct unpacked {
  * more importantly, the bigger file is likely the more recent
  * one.
  */
-static int try_delta(struct unpacked *cur, struct unpacked *old, unsigned max_depth)
+static int try_delta(struct unpacked *trg, struct unpacked *src,
+		     struct delta_index *src_index, unsigned max_depth)
 {
-	struct object_entry *cur_entry = cur->entry;
-	struct object_entry *old_entry = old->entry;
-	unsigned long size, oldsize, delta_size, sizediff;
-	long max_size;
+	struct object_entry *trg_entry = trg->entry;
+	struct object_entry *src_entry = src->entry;
+	unsigned long size, src_size, delta_size, sizediff, max_size;
 	void *delta_buf;
 
 	/* Don't bother doing diffs between different types */
-	if (cur_entry->type != old_entry->type)
+	if (trg_entry->type != src_entry->type)
 		return -1;
 
 	/* We do not compute delta to *create* objects we are not
 	 * going to pack.
 	 */
-	if (cur_entry->preferred_base)
+	if (trg_entry->preferred_base)
 		return -1;
 
-	/* If the current object is at pack edge, take the depth the
+	/*
+	 * If the current object is at pack edge, take the depth the
 	 * objects that depend on the current object into account --
 	 * otherwise they would become too deep.
 	 */
-	if (cur_entry->delta_child) {
-		if (max_depth <= cur_entry->delta_limit)
+	if (trg_entry->delta_child) {
+		if (max_depth <= trg_entry->delta_limit)
 			return 0;
-		max_depth -= cur_entry->delta_limit;
+		max_depth -= trg_entry->delta_limit;
 	}
-
-	size = cur_entry->size;
-	oldsize = old_entry->size;
-	sizediff = oldsize > size ? oldsize - size : size - oldsize;
-
-	if (size < 50)
-		return -1;
-	if (old_entry->depth >= max_depth)
+	if (src_entry->depth >= max_depth)
 		return 0;
 
-	/*
-	 * NOTE!
-	 *
-	 * We always delta from the bigger to the smaller, since that's
-	 * more space-efficient (deletes don't have to say _what_ they
-	 * delete).
-	 */
+	/* Now some size filtering heuristics. */
+	size = trg_entry->size;
 	max_size = size / 2 - 20;
-	if (cur_entry->delta)
-		max_size = cur_entry->delta_size-1;
+	if (trg_entry->delta)
+		max_size = trg_entry->delta_size-1;
+	src_size = src_entry->size;
+	sizediff = src_size < size ? size - src_size : 0;
 	if (sizediff >= max_size)
 		return 0;
-	delta_buf = diff_delta(old->data, oldsize,
-			       cur->data, size, &delta_size, max_size);
+
+	delta_buf = create_delta(src_index, trg->data, size, &delta_size, max_size);
 	if (!delta_buf)
 		return 0;
-	cur_entry->delta = old_entry;
-	cur_entry->delta_size = delta_size;
-	cur_entry->depth = old_entry->depth + 1;
+
+	trg_entry->delta = src_entry;
+	trg_entry->delta_size = delta_size;
+	trg_entry->depth = src_entry->depth + 1;
 	free(delta_buf);
-	return 0;
+	return 1;
 }
 
 static void progress_interval(int signum)
@@ -1109,11 +1102,19 @@ static void find_deltas(struct object_en
 			 */
 			continue;
 
+		if (entry->size < 50)
+			continue;
+		if (n->index)
+			free_delta_index(n->index);
 		free(n->data);
 		n->entry = entry;
 		n->data = read_sha1_file(entry->sha1, type, &size);
 		if (size != entry->size)
-			die("object %s inconsistent object length (%lu vs %lu)", sha1_to_hex(entry->sha1), size, entry->size);
+			die("object %s inconsistent object length (%lu vs %lu)",
+			    sha1_to_hex(entry->sha1), size, entry->size);
+		n->index = create_delta_index(n->data, size);
+		if (!n->index)
+			die("out of memory");
 
 		j = window;
 		while (--j > 0) {
@@ -1124,7 +1125,7 @@ static void find_deltas(struct object_en
 			m = array + other_idx;
 			if (!m->entry)
 				break;
-			if (try_delta(n, m, depth) < 0)
+			if (try_delta(n, m, m->index, depth) < 0)
 				break;
 		}
 #if 0
@@ -1144,8 +1145,11 @@ #endif
 	if (progress)
 		fputc('\n', stderr);
 
-	for (i = 0; i < window; ++i)
+	for (i = 0; i < window; ++i) {
+		if (array[i].index)
+			free_delta_index(array[i].index);
 		free(array[i].data);
+	}
 	free(array);
 }
 

^ permalink raw reply related

* Re: [PATCH/RFC] reverse the pack-objects delta window logic
From: Junio C Hamano @ 2006-04-27  5:04 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0604262210120.18520@localhost.localdomain>

Nicolas Pitre <nico@cam.org> writes:

>> And actually I'm not sure how to solve that without walking the tree 
>> up to the top each time, which I want to avoid as much as possible.
>
> Well, that seems to be unavoidable.
>
> Reversing the window logic isn't that much of a good idea after all.
> ...

> Then there is the possibility of having a delta "branch" with maximum 
> depth meaning that the trunk for that branch may not be deltified. But 
> if a later objects to come does constitute a better delta base for the 
> object in the middle of that branch then the branch will be broken in 
> the middle to be transplanted onto the new base as explained previously.  
> Which means that the initial trunk no longer has a maximum depth and some
> objects that were skipped because of the depth limit could now have been
> tested, leading to suboptimal delta matching.

Good analysis, and I tend to agree with your conclusion.

BTW, Geert mentioned on the #git channel that about half the
file pairs git-pack-objects asks diff_delta() to try have long
sequences of matching bytes at the beginning and at the end.  It
might be worthwhile if we can take advantage of that, whichever
delta algorithm we use.
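
A minimal sketch of what "taking advantage" could look like, assuming 
the delta code is free to strip the matching head and tail before doing 
any real work.  The helper names common_prefix() and common_suffix() 
are made up for illustration; nothing like them is claimed to exist in 
diff-delta.c.

```c
#include <assert.h>
#include <stddef.h>

/* Number of leading bytes that a and b have in common. */
size_t common_prefix(const unsigned char *a, size_t alen,
		     const unsigned char *b, size_t blen)
{
	size_t max = alen < blen ? alen : blen, n = 0;

	while (n < max && a[n] == b[n])
		n++;
	return n;
}

/* Trailing common bytes, never overlapping the first `skip` bytes. */
size_t common_suffix(const unsigned char *a, size_t alen,
		     const unsigned char *b, size_t blen, size_t skip)
{
	size_t max, n = 0;

	assert(skip <= alen && skip <= blen);
	max = (alen < blen ? alen : blen) - skip;
	while (n < max && a[alen - 1 - n] == b[blen - 1 - n])
		n++;
	return n;
}
```

Only the bytes between the prefix and the suffix would then need to go 
through the expensive matching; the two ends become a single copy 
operation each.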

^ permalink raw reply

* Re: [PATCH] Add --continue and --abort options to git-rebase.
From: sean @ 2006-04-27  5:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7v3bg0nlvb.fsf@assigned-by-dhcp.cox.net>

On Wed, 26 Apr 2006 13:05:28 -0700
Junio C Hamano <junkio@cox.net> wrote:

> sean <seanlkml@sympatico.ca> writes:
> 
> >   git rebase [--onto <newbase>] <upstream> [<branch>]
> >   git rebase --continue
> >   git rebase --abort
> >
> > ---
> >
> > Take 2.  Much simpler patch which doesn't try to 
> > rejigger the command line too much.
> 
> This second round seems to make more sense.  Sign-off?
> 

Sure,

    Signed-off-by: Sean Estabrooks <seanlkml@sympatico.ca>

Sean

^ permalink raw reply

* how to trace the patch?
From: Aubrey @ 2006-04-27 10:06 UTC (permalink / raw)
  To: git

Hi all,

When I update the kernel git tree after a few days, there can be
a lot of new patches. If I then find that one file has changed, how
can I know which patch the modification belongs to?
How can I find the patch?
Many thanks,

Regards,
-Aubrey

^ permalink raw reply

* [PATCH] C version of git-count-objects
From: Peter Hagervall @ 2006-04-27 10:12 UTC (permalink / raw)
  To: git; +Cc: junkio

Answering the call Linus made[1], sort of, but for a completely
different program.

Anyway, it ought to be at least as portable as the shell script, and a
whole lot faster, however much that matters.

Signed-off-by: Peter Hagervall <hager@cs.umu.se>

[1] http://article.gmane.org/gmane.comp.version-control.git/19073

---

 Makefile             |    5 +--
 count-objects.c      |   56 +++++++++++++++++++++++++++++++++++++
 git-count-objects.sh |   31 --------------------
 3 files changed, 59 insertions(+), 33 deletions(-)


diff --git a/Makefile b/Makefile
index 8ce27a6..53e7591 100644
--- a/Makefile
+++ b/Makefile
@@ -115,7 +115,7 @@ ### --- END CONFIGURATION SECTION ---
 SCRIPT_SH = \
 	git-add.sh git-bisect.sh git-branch.sh git-checkout.sh \
 	git-cherry.sh git-clean.sh git-clone.sh git-commit.sh \
-	git-count-objects.sh git-diff.sh git-fetch.sh \
+	git-diff.sh git-fetch.sh \
 	git-format-patch.sh git-ls-remote.sh \
 	git-merge-one-file.sh git-parse-remote.sh \
 	git-prune.sh git-pull.sh git-push.sh git-rebase.sh \
@@ -165,7 +165,8 @@ PROGRAMS = \
 	git-upload-pack$X git-verify-pack$X git-write-tree$X \
 	git-update-ref$X git-symbolic-ref$X git-check-ref-format$X \
 	git-name-rev$X git-pack-redundant$X git-repo-config$X git-var$X \
-	git-describe$X git-merge-tree$X git-blame$X git-imap-send$X
+	git-describe$X git-merge-tree$X git-blame$X git-imap-send$X \
+	git-count-objects$X
 
 BUILT_INS = git-log$X
 
diff --git a/count-objects.c b/count-objects.c
new file mode 100644
index 0000000..67ab6f0
--- /dev/null
+++ b/count-objects.c
@@ -0,0 +1,56 @@
+#include "cache.h"
+#include "git-compat-util.h"
+
+static char pathname[PATH_MAX + 1];
+static int numobjects, numblocks;
+static const char hex_digits[] = "0123456789abcdef";
+
+void count_objects(void)
+{
+	char subdir[3];
+	int i, j;
+	struct stat statbuf;
+	struct dirent *dirp;
+	DIR *dp;
+	subdir[2] = '\0';
+	for (i = 0; i < 16; i++) {
+		subdir[0] = hex_digits[i];
+		for (j = 0; j < 16; j++) {
+			subdir[1] = hex_digits[j];
+			if (access(subdir, R_OK | X_OK))
+				continue;
+			chdir(subdir);
+			if (!(dp = opendir("."))) {
+				error("can't open subdir %s", subdir);
+				continue;
+			}
+			while ((dirp = readdir(dp))) {
+				if (!strcmp(dirp->d_name, ".") ||
+					!strcmp(dirp->d_name, ".."))
+					continue;
+				if (lstat(dirp->d_name, &statbuf)) {
+					error("can't stat file %s", dirp->d_name);
+					continue;
+				}
+				numblocks += statbuf.st_blocks;
+				numobjects++;
+			}
+			closedir(dp);
+			chdir("..");
+		}
+	}
+}
+
+int main(int argc, char **argv)
+{
+	setup_git_directory();
+
+	if (chdir(".git/objects"))
+		die("%s", strerror(errno));
+
+	count_objects();
+
+	printf("%d objects, %d kilobytes\n", numobjects, numblocks / 2);
+
+	return 0;
+}
diff --git a/git-count-objects.sh b/git-count-objects.sh
deleted file mode 100755
index 40c58ef..0000000
--- a/git-count-objects.sh
+++ /dev/null
@@ -1,31 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) 2005 Junio C Hamano
-#
-
-GIT_DIR=`git-rev-parse --git-dir` || exit $?
-
-dc </dev/null 2>/dev/null || {
-	# This is not a real DC at all -- it just knows how
-	# this script feeds DC and does the computation itself.
-	dc () {
-		while read a b
-		do
-			case $a,$b in
-			0,)	acc=0 ;;
-			*,+)	acc=$(($acc + $a)) ;;
-			p,)	echo "$acc" ;;
-			esac
-		done
-	}
-}
-
-echo $(find "$GIT_DIR/objects"/?? -type f -print 2>/dev/null | wc -l) objects, \
-$({
-    echo 0
-    # "no-such" is to help Darwin folks by not using xargs -r.
-    find "$GIT_DIR/objects"/?? -type f -print 2>/dev/null |
-    xargs du -k "$GIT_DIR/objects/no-such" 2>/dev/null |
-    sed -e 's/[ 	].*/ +/'
-    echo p
-} | dc) kilobytes

^ permalink raw reply related

* cg-clone not fetching all tags?
From: Wolfgang Denk @ 2006-04-27 10:52 UTC (permalink / raw)
  To: Git Mailing List

Hi,

it seems that "cg-clone" does not fetch all tags any more - only the
most recent ones (modified in the last N days?) seem to be fetched.
[Possibly the "N days" corresponds to "changing tools to version X",
but I have no way to find out.]

This happens only when using HTTP; using ssh or rsync works fine.
Also, if we follow the "cg-clone" by a "git-fetch -t" command, this
will load the missing tags.

Is this intentional, or am I doing anything wrong?

[For testing, try "cg-clone http://www.denx.de/git/u-boot.git"]

Best regards,

Wolfgang Denk

-- 
Software Engineering:  Embedded and Realtime Systems,  Embedded Linux
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
In theory, there is no difference between  theory  and  practice.  In
practice, however, there is.

^ permalink raw reply

* Re: how to trace the patch?
From: sean @ 2006-04-27 10:57 UTC (permalink / raw)
  To: Aubrey; +Cc: git
In-Reply-To: <6d6a94c50604270306j44c280bdo283591f2f595f74e@mail.gmail.com>

On Thu, 27 Apr 2006 18:06:09 +0800
Aubrey <aubreylee@gmail.com> wrote:

> When I update the kernel git tree a few days later, you know, there
> could be a lot of patches. Then I found one file changed, how can I
> know which patch the modification belong to?
> How can I find the patch?

Hi Aubrey,

$ git log -- <filename>

To see a list of commits that affected the file you're interested in.

$ git log -p -- <filename>

Will include a diff after each commit showing you how the file was
changed.  And if you want to see what other changes happened in each
commit that modified your file, add "--full-diff" to the command above.

Note that you can also replace the <filename> with a <directory> 
to see a list of commits that affected any file below that directory.

HTH,
Sean

^ permalink raw reply

* Re: [PATCH] C version of git-count-objects
From: Morten Welinder @ 2006-04-27 13:16 UTC (permalink / raw)
  To: Peter Hagervall; +Cc: git, junkio
In-Reply-To: <20060427101254.GA22769@peppar.cs.umu.se>

> +                       if (access(subdir, R_OK | X_OK))
> +                               continue;
> +                       chdir(subdir);

You've got yourself a needless race condition right there.  Just
do the chdir and check the return value.  (And besides, access
checks with the wrong set of permissions, should this ever end
up in set[ug]id context.)
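
Something along these lines, as a rough sketch only (count_entries() 
is a made-up helper, not a proposed replacement for the patch): the 
chdir() result is the only check, and the function goes back to the 
parent directory on every path, including the opendir() failure.

```c
#include <assert.h>
#include <dirent.h>
#include <string.h>
#include <unistd.h>

/*
 * Count the entries in subdirectory `subdir` of the current directory.
 * Returns -1 if it cannot be entered or read.  The working directory is
 * restored before returning, assuming ".." leads back (i.e. subdir is
 * not a symlink).
 */
int count_entries(const char *subdir)
{
	struct dirent *de;
	DIR *dp;
	int n = 0;

	assert(subdir != NULL);
	if (chdir(subdir)) /* no access() probe beforehand */
		return -1;
	dp = opendir(".");
	if (dp) {
		while ((de = readdir(dp)) != NULL) {
			if (!strcmp(de->d_name, ".") ||
			    !strcmp(de->d_name, ".."))
				continue;
			n++;
		}
		closedir(dp);
	} else {
		n = -1;
	}
	if (chdir("..")) /* go back on success and on failure alike */
		return -1;
	return n;
}
```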

Morten

^ permalink raw reply

* Re: [PATCH] C version of git-count-objects
From: Nicolas Pitre @ 2006-04-27 13:23 UTC (permalink / raw)
  To: Peter Hagervall; +Cc: git, junkio
In-Reply-To: <20060427101254.GA22769@peppar.cs.umu.se>

On Thu, 27 Apr 2006, Peter Hagervall wrote:

> Answering the call Linus made[1], sort of, but for a completely
> different program.
> 
> Anyway, it ought to be at least as portable as the shell script, and a
> whole lot faster, however much that matters.
> 
[...]
> +	for (i = 0; i < 16; i++) {
> +		subdir[0] = hex_digits[i];
> +		for (j = 0; j < 16; j++) {
> +			subdir[1] = hex_digits[j];
> +			if (access(subdir, R_OK | X_OK))
> +				continue;
> +			chdir(subdir);
> +			if (!(dp = opendir("."))) {
> +				error("can't open subdir %s", subdir);
> +				continue;
> +			}

Looks like you're missing a chdir(".."); there.


Nicolas

^ permalink raw reply

* Two gitweb feature requests
From: David Woodhouse @ 2006-04-27 13:27 UTC (permalink / raw)
  To: Kay Sievers; +Cc: git

First... When publishing trees, I currently give both the git:// URL for
people who want to pull the tree, and the http:// URL to gitweb for
those who just want to browse.

It would be useful if I could get away with giving just one URL --
probably the http:// one to gitweb. If gitweb were to have a mode in
which it gave a referral to the git:// URL, and if the git tools would
use that, then that would work well.

Secondly, it would be useful if gitweb would list the branches in a
repository and allow each of them to be viewed in the same way as it
does the master branch.

-- 
dwmw2

^ permalink raw reply
