Git development
 help / color / mirror / Atom feed
* Re: [PATCH] pull: refuse complete src:dst fetchspec arguments
From: Junio C Hamano @ 2009-10-20 21:11 UTC (permalink / raw)
  To: Sean Estabrooks; +Cc: Thomas Rast, git
In-Reply-To: <BLU0-SMTP97AA2287062D9A104101C8AEC00@phx.gbl>

Sean Estabrooks <seanlkml@sympatico.ca> writes:

>> -test_expect_success 'pulling into void using master:master' '
>> -	mkdir cloned-uho &&
>> -	(
>> -		cd cloned-uho &&
>> -		git init &&
>> -		git pull .. master:master
>> -	) &&
>> -	test -f file &&
>> -	test -f cloned-uho/file &&
>> -	test_cmp file cloned-uho/file
>> -
>> -
>>  test_expect_success 'test . as a remote' '
>>  
>>  	git branch copy master &&
>> -- 
>> 
>
> Instead of removing this test it should be modified or replaced
> with a test that ensures the new functionality operates correctly.
> In this case that would mean checking that using a full refspec
> errors out.

Shouldn't "git pull .. master" still work in this case, too?  So this test
will probably become two tests, one for "git pull .. master:master" that
correctly fails, and the other for "git pull .. master" to still work as
expected.

^ permalink raw reply

* Re: git gc and kernel.org
From: Junio C Hamano @ 2009-10-20 21:02 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Johan Herland, git, H. Peter Anvin, ftpadmin
In-Reply-To: <alpine.LFD.2.00.0910201442480.21460@xanadu.home>

Nicolas Pitre <nico@fluxnic.net> writes:

> Still... Hopefully this is going to become redundant information in the 
> future with the eventual deployment of smart protocol over HTTP.  So I 
> think that as a config option being off by default this is a good 
> compromize.  Site administrators can turn it on by default in 
> /etc/gitconfig.

Ok, something like this?


 Documentation/config.txt |    8 ++++++++
 builtin-receive-pack.c   |   18 ++++++++++++++++++
 2 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index cd17814..2d3b4f5 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1320,6 +1320,10 @@ rebase.stat::
 	Whether to show a diffstat of what changed upstream since the last
 	rebase. False by default.
 
+receive.autogc::
+	If set to true, git-receive-pack will run "git-gc --auto"
+	after receiving data from git-push and updating refs.
+
 receive.fsckObjects::
 	If it is set to true, git-receive-pack will check all received
 	objects. It will abort in the case of a malformed object or a
@@ -1355,6 +1359,10 @@ receive.denyNonFastForwards::
 	even if that push is forced. This configuration variable is
 	set when initializing a shared repository.
 
+receive.updateserverinfo::
+	If set to true, git-receive-pack will run git-update-server-info
+	after receiving data from git-push and updating refs.
+
 remote.<name>.url::
 	The URL of a remote repository.  See linkgit:git-fetch[1] or
 	linkgit:git-push[1].
diff --git a/builtin-receive-pack.c b/builtin-receive-pack.c
index b771fe9..6573226 100644
--- a/builtin-receive-pack.c
+++ b/builtin-receive-pack.c
@@ -28,6 +28,8 @@ static int transfer_unpack_limit = -1;
 static int unpack_limit = 100;
 static int report_status;
 static int prefer_ofs_delta = 1;
+static int auto_update_server_info;
+static int auto_gc;
 static const char *head_name;
 static char *capabilities_to_send;
 
@@ -88,6 +90,16 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (strcmp(var, "receive.updateserverinfo") == 0) {
+		auto_update_server_info = git_config_bool(var, value);
+		return 0;
+	}
+
+	if (strcmp(var, "receive.autogc") == 0) {
+		auto_gc = git_config_bool(var, value);
+		return 0;
+	}
+
 	return git_default_config(var, value, cb);
 }
 
@@ -672,6 +684,12 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 			report(unpack_status);
 		run_receive_hook(post_receive_hook);
 		run_update_post_hook(commands);
+		if (auto_update_server_info)
+			update_server_info(0);
+		if (auto_gc) {
+			const char *argv_gc_auto[] = { "gc", "--auto", NULL };
+			run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);
+		}
 	}
 	return 0;
 }

^ permalink raw reply related

* Re: git hang with corrupted .pack
From: Junio C Hamano @ 2009-10-20 20:50 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Shawn O. Pearce, Andy Isaacson, git, Alex Riesen
In-Reply-To: <alpine.LFD.2.00.0910201538180.21460@xanadu.home>

Nicolas Pitre <nico@fluxnic.net> writes:

> I didn't spend the time needed to think about this issue and your 
> proposed fix yet.  However I think that using sizeof(delta_head)-1 
> makes the code a bit confusing.  At this point i'd use:
>
> 	int size = sizeof(delta_head) - 1;
>
> and use that variable instead just like it is done in 
> unpack_compressed_entry() to have the same code pattern.

Sounds good.  Here is a reroll with a bit more explanation.

-- >8 --
From: Junio C Hamano <gitster@pobox.com>
Date: Tue, 20 Oct 2009 12:40:02 -0700
Subject: [PATCH] Fix "corrupt input stream" check while reading from packfiles

An ealier "fix" made us break out of the loop when we get Z_BUF_ERROR back
from inflate(), and either the input stream still had some data to
consume, or we have already got the full output we expected.

This is the same kind of mistake as we corrected with 456cdf6 (Fix loose
object uncompression check., 2007-03-19); it is valid for inflate() to
produce full output before it consumes the input stream fully; e.g.
immediately before reading the end of stream marker.

Instead, detect corrupt input stream by feeding the input as long as
inflate() wants to without detecting a real error, and giving it an output
buffer that is one byte longer than necessary.  If it touches the extra
byte, we know that the input stream is corrupt; otherwise inflate() will
notice the broken input stream by itself.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 sha1_file.c |   18 ++++++++++--------
 1 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 4cc8939..f0907b8 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1344,7 +1344,8 @@ unsigned long get_size_from_delta(struct packed_git *p,
 			          off_t curpos)
 {
 	const unsigned char *data;
-	unsigned char delta_head[20], *in;
+	unsigned char delta_head[21], *in;
+	unsigned long expected_size = sizeof(delta_head) - 1;
 	z_stream stream;
 	int st;
 
@@ -1357,13 +1358,14 @@ unsigned long get_size_from_delta(struct packed_git *p,
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
 		st = git_inflate(&stream, Z_FINISH);
-		if (st == Z_BUF_ERROR && (stream.avail_in || !stream.avail_out))
-			break;
+		if (!stream.avail_out)
+			break; /* the payload is larger than it should be! */
 		curpos += stream.next_in - in;
 	} while ((st == Z_OK || st == Z_BUF_ERROR) &&
-		 stream.total_out < sizeof(delta_head));
+		 stream.total_out < expected_size);
 	git_inflate_end(&stream);
-	if ((st != Z_STREAM_END) && stream.total_out != sizeof(delta_head)) {
+	if ((st != Z_STREAM_END) &&
+	    stream.total_out != expected_size) {
 		error("delta data unpack-initial failed");
 		return 0;
 	}
@@ -1589,15 +1591,15 @@ static void *unpack_compressed_entry(struct packed_git *p,
 	buffer[size] = 0;
 	memset(&stream, 0, sizeof(stream));
 	stream.next_out = buffer;
-	stream.avail_out = size;
+	stream.avail_out = size + 1;
 
 	git_inflate_init(&stream);
 	do {
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
 		st = git_inflate(&stream, Z_FINISH);
-		if (st == Z_BUF_ERROR && (stream.avail_in || !stream.avail_out))
-			break;
+		if (!stream.avail_out)
+			break; /* the payload is larger than it should be! */
 		curpos += stream.next_in - in;
 	} while (st == Z_OK || st == Z_BUF_ERROR);
 	git_inflate_end(&stream);
-- 
1.6.5.1.107.gba912

^ permalink raw reply related

* Re: git fsck not identifying corrupted packs
From: Alex Riesen @ 2009-10-20 20:49 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Junio C Hamano, Matthieu Moy, Johannes Schindelin, Johannes Sixt,
	Sergio Callegari, git
In-Reply-To: <alpine.LFD.2.00.0910201437240.21460@xanadu.home>

On Tue, Oct 20, 2009 at 20:39, Nicolas Pitre <nico@fluxnic.net> wrote:
> On Mon, 19 Oct 2009, Junio C Hamano wrote:
>
>> Actually, I changed my mind.  I do not think this so big that we need to
>> wait for a major version bump.  Why not shoot for 1.6.6?
>
> Agreed.  With a prominent note in the release notes to point people at
> it when they don't read release notes and complain that fsck suddenly
> became very slow after they upgraded.

I have a feeling that it either wont be noticed at all (i.e., I have run fsck
with something other than --full only one time in my whole bash history,
and never shown it otherwise to anyone else), or people will immediately
like it ("Oh, finallly! Now that feels like it is doing something!" :)

^ permalink raw reply

* Re: ident hash usage question
From: Alex Riesen @ 2009-10-20 20:43 UTC (permalink / raw)
  To: Eugene Sajine; +Cc: git
In-Reply-To: <76c5b8580910201330r45cf625k3a41b5b9e24b3e01@mail.gmail.com>

On Tue, Oct 20, 2009 at 22:30, Eugene Sajine <euguess@gmail.com> wrote:
> One of my friends said that git is not working for their development
> model... C++ development with static linking across the board, where
> they need to see exactly which version of the file has got to the
> executable. Roughly, they are using CVS' keywords and revision numbers
> and a script wich matches them between two versions of the
> executables.

As soon as your friend understand, that a commit describes the complete
state of the repository on the moment of commit, he/she will notice that
the commit allows to find each what content each file in the product
had at the moment of compilation and linking (assuming they weren't
compiling uncommitted tree, which RCS/CVS/SVN/Perforce mindset
tends to encourage).

> I've got curious if Git can support it and how it can be done with
> minimal changes to workflow.

Depends on workflow, I afraid. And I personally wouldn't bother.

>>> # little script or regexp here (don’t have it)
>>>
>>> Does it make sense?
>>
>> Not much. You'll always get a long list of commits which didn't
>> change the damned blob. And you have absolutely no way
>> to find out exactly which of the commits have produced
>> the blob you're looking at (because you decided to do away
>> with the information).
>
> How is that? It seams to me that git log <path> will show only commits
> where <path> was changed/committed? Considering the fact that I've got
> the initial path from the blob, i should get the exact commit history
> (or last commit in my example) for the file(s) (Files if renaming
> occurred without content change).

The blob is present in each commit since it was introduced. Except
when your project contains only that one blob, isn't the state of
the other parts of an interest?

^ permalink raw reply

* Re: ident hash usage question
From: Junio C Hamano @ 2009-10-20 20:43 UTC (permalink / raw)
  To: Eugene Sajine; +Cc: Alex Riesen, git
In-Reply-To: <76c5b8580910201330r45cf625k3a41b5b9e24b3e01@mail.gmail.com>

Eugene Sajine <euguess@gmail.com> writes:

> One of my friends said that git is not working for their development
> model... C++ development with static linking across the board, where
> they need to see exactly which version of the file has got to the
> executable. Roughly, they are using CVS' keywords and revision numbers
> and a script wich matches them between two versions of the
> executables.
>
> I've got curious if Git can support it and how it can be done with
> minimal changes to workflow.

If the project is already arranged to be compiled with decent automation,
I do not think you need any change to the workflow.

You would have a version.cc file with

	static char program_version[] = "My Program " VERSION_STRING;

in it, and teach the build procedure how to compile and link this file.
Something like:

    version.o: version.cc
    	$(CXX) -o $@ -DVERSION_STRING=\""$(git describe HEAD)"\" $?

^ permalink raw reply

* Re: git hang with corrupted .pack
From: Nicolas Pitre @ 2009-10-20 19:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn O. Pearce, Andy Isaacson, git, Alex Riesen
In-Reply-To: <7vpr8hn9ly.fsf@alter.siamese.dyndns.org>

On Tue, 20 Oct 2009, Junio C Hamano wrote:

> Perhaps it would be as simple as this?
> 
> We deliberately give one byte more than what we expect to see and error
> out when we do get that extra byte filled.
> 
>  sha1_file.c |   17 +++++++++--------
>  1 files changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/sha1_file.c b/sha1_file.c
> index 4cc8939..8c9f392 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -1344,7 +1344,7 @@ unsigned long get_size_from_delta(struct packed_git *p,
>  			          off_t curpos)
>  {
>  	const unsigned char *data;
> -	unsigned char delta_head[20], *in;
> +	unsigned char delta_head[21], *in;

I didn't spend the time needed to think about this issue and your 
proposed fix yet.  However I think that using sizeof(delta_head)-1 
makes the code a bit confusing.  At this point i'd use:

	int size = sizeof(delta_head) - 1;

and use that variable instead just like it is done in 
unpack_compressed_entry() to have the same code pattern.


Nicolas

^ permalink raw reply

* Re: git fsck not identifying corrupted packs
From: Nicolas Pitre @ 2009-10-20 18:39 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Matthieu Moy, Johannes Schindelin, Johannes Sixt,
	Sergio Callegari, git
In-Reply-To: <7vfx9esgvt.fsf@alter.siamese.dyndns.org>

On Mon, 19 Oct 2009, Junio C Hamano wrote:

> Actually, I changed my mind.  I do not think this so big that we need to
> wait for a major version bump.  Why not shoot for 1.6.6?

Agreed.  With a prominent note in the release notes to point people at 
it when they don't read release notes and complain that fsck suddenly 
became very slow after they upgraded.


Nicolas

^ permalink raw reply

* Re: git gc and kernel.org
From: Nicolas Pitre @ 2009-10-20 18:46 UTC (permalink / raw)
  To: Johan Herland; +Cc: Junio C Hamano, git, H. Peter Anvin, ftpadmin
In-Reply-To: <200910201054.48315.johan@herland.net>

On Tue, 20 Oct 2009, Johan Herland wrote:

> On Tuesday 20 October 2009, Junio C Hamano wrote:
> > At one point, update-server-info used to compute a lot more than what we
> > currently compute and it made some sense to oppose against it on
> > performance ground.
> > 
> > But these days it only lists the refs and packs and does nothing else;
> > the performance impact should be immeasurable and it adds only two files
> > to the repository.  It cannot be a big deal, unless you oppose to http
> > transport on a non-technical ground.
> 
> FYI, update-server-info takes ~0.7 seconds in a repo with one pack and 
> ~50000 refs, so I guess it's acceptable to enable it by default, even in 
> those kinds of repos.

Still... Hopefully this is going to become redundant information in the 
future with the eventual deployment of smart protocol over HTTP.  So I 
think that as a config option being off by default this is a good 
compromize.  Site administrators can turn it on by default in 
/etc/gitconfig.


Nicolas

^ permalink raw reply

* [PATCH] git-clone.txt: Fix grammar and formatting
From: Björn Gustavsson @ 2009-10-20 20:38 UTC (permalink / raw)
  To: git; +Cc: gitster

Add the missing definite article ("the") in several places.

Change "note to..." to "note for...", since "note to" means that
that the note is addressed to someone (source: Google search).

Change "progressbar" to "progress bar" (source: Wikipedia).

Format git commands, options, and file names consistently using
back quotes (i.e. a fixed font in the resulting HTML document).

Signed-off-by: Björn Gustavsson <bgustavsson@gmail.com>
---
I am not a native English speaker, so if my attempted corrections
of the grammar is wrong, please correct me!

 Documentation/git-clone.txt |   26 +++++++++++++-------------
 1 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
index 5ebcba1..7e7d9fc 100644
--- a/Documentation/git-clone.txt
+++ b/Documentation/git-clone.txt
@@ -39,7 +39,7 @@ OPTIONS
 --local::
 -l::
 	When the repository to clone from is on a local machine,
-	this flag bypasses normal "git aware" transport
+	this flag bypasses the normal "git aware" transport
 	mechanism and clones the repository by making a copy of
 	HEAD and everything under objects and refs directories.
 	The files under `.git/objects/` directory are hardlinked
@@ -60,7 +60,7 @@ OPTIONS
 -s::
 	When the repository to clone is on the local machine,
 	instead of using hard links, automatically setup
-	.git/objects/info/alternates to share the objects
+	`.git/objects/info/alternates` to share the objects
 	with the source repository.  The resulting repository
 	starts out without any object of its own.
 +
@@ -69,7 +69,7 @@ it unless you understand what it does. If you clone your
 repository using this option and then delete branches (or use any
 other git command that makes any existing commit unreferenced) in the
 source repository, some objects may become unreferenced (or dangling).
-These objects may be removed by normal git operations (such as 'git-commit')
+These objects may be removed by normal git operations (such as `git commit`)
 which automatically call `git gc --auto`. (See linkgit:git-gc[1].)
 If these objects are removed and were referenced by the cloned repository,
 then the cloned repository will become corrupt.
@@ -86,13 +86,13 @@ objects from the source repository into a pack in the cloned repository.
 
 --reference <repository>::
 	If the reference repository is on the local machine,
-	automatically setup .git/objects/info/alternates to
+	automatically setup `.git/objects/info/alternates` to
 	obtain objects from the reference repository.  Using
 	an already existing repository as an alternate will
 	require fewer objects to be copied from the repository
 	being cloned, reducing network and local storage costs.
 +
-*NOTE*: see NOTE to --shared option.
+*NOTE*: see the NOTE for the `--shared` option.
 
 --quiet::
 -q::
@@ -101,7 +101,7 @@ objects from the source repository into a pack in the cloned repository.
 
 --verbose::
 -v::
-	Display the progressbar, even in case the standard output is not
+	Display the progress bar, even in case the standard output is not
 	a terminal.
 
 --no-checkout::
@@ -121,17 +121,17 @@ objects from the source repository into a pack in the cloned repository.
 	configuration variables are created.
 
 --mirror::
-	Set up a mirror of the remote repository.  This implies --bare.
+	Set up a mirror of the remote repository.  This implies `--bare`.
 
 --origin <name>::
 -o <name>::
-	Instead of using the remote name 'origin' to keep track
-	of the upstream repository, use <name>.
+	Instead of using the remote name `origin` to keep track
+	of the upstream repository, use `<name>`.
 
 --branch <name>::
 -b <name>::
 	Instead of pointing the newly created HEAD to the branch pointed
-	to by the cloned repository's HEAD, point to <name> branch
+	to by the cloned repository's HEAD, point to `<name>` branch
 	instead. In a non-bare repository, this is the branch that will
 	be checked out.
 
@@ -158,7 +158,7 @@ objects from the source repository into a pack in the cloned repository.
 --recursive::
 	After the clone is created, initialize all submodules within,
 	using their default settings. This is equivalent to running
-	'git submodule update --init --recursive' immediately after
+	`git submodule update --init --recursive` immediately after
 	the clone is finished. This option is ignored if the cloned
 	repository does not have a worktree/checkout (i.e. if any of
 	`--no-checkout`/`-n`, `--bare`, or `--mirror` is given)
@@ -171,8 +171,8 @@ objects from the source repository into a pack in the cloned repository.
 <directory>::
 	The name of a new directory to clone into.  The "humanish"
 	part of the source repository is used if no directory is
-	explicitly given ("repo" for "/path/to/repo.git" and "foo"
-	for "host.xz:foo/.git").  Cloning into an existing directory
+	explicitly given (`repo` for `/path/to/repo.git` and `foo`
+	for `host.xz:foo/.git`).  Cloning into an existing directory
 	is only allowed if the directory is empty.
 
 :git-clone: 1
-- 
1.6.5.1.2.g310c

^ permalink raw reply related

* Re: [PATCH] pull: refuse complete src:dst fetchspec arguments
From: Sean Estabrooks @ 2009-10-20 20:30 UTC (permalink / raw)
  To: Thomas Rast; +Cc: git
In-Reply-To: <d561e70f0aa802ceb96eba16d3bb2316134d69c8.1256062808.git.trast@student.ethz.ch>

On Tue, 20 Oct 2009 20:23:06 +0200
Thomas Rast <trast@student.ethz.ch> wrote:

Hi Thomas,

> git-pull has historically accepted full fetchspecs, meaning that you
> could do
> 
>   git pull $repo A:B
> 
> which would simultaneously fetch the remote branch A into the local
> branch B and merge B into HEAD.  This got especially confusing if B
> was checked out.  New users variously mistook pull for fetch or read
> that command as "merge the remote A into my B", neither of which is
> correct.
> 
> Since the above usage should be very rare and can be done with
> separate calls to fetch and merge, we just disallow full fetchspecs in
> git-pull.

It is however a handy shortcut to be able to specify the full refspec
and specify where you want the head stored locally.  It seems a shame to
throw away that functionality because of one confusing case.   Wouldn't
it be better to test of the confusing case and instead error out if the
local refname is already checked out?


[...]
> diff --git a/t/t5520-pull.sh b/t/t5520-pull.sh
> index dd2ee84..a566a99 100755
> --- a/t/t5520-pull.sh
> +++ b/t/t5520-pull.sh
> @@ -29,18 +29,6 @@ test_expect_success 'checking the results' '
>  	diff file cloned/file
>  '
>  
> -test_expect_success 'pulling into void using master:master' '
> -	mkdir cloned-uho &&
> -	(
> -		cd cloned-uho &&
> -		git init &&
> -		git pull .. master:master
> -	) &&
> -	test -f file &&
> -	test -f cloned-uho/file &&
> -	test_cmp file cloned-uho/file
> -
> -
>  test_expect_success 'test . as a remote' '
>  
>  	git branch copy master &&
> -- 
> 

Instead of removing this test it should be modified or replaced
with a test that ensures the new functionality operates correctly.
In this case that would mean checking that using a full refspec
errors out.

Cheers,
Sean

^ permalink raw reply

* Re: ident hash usage question
From: Alex Riesen @ 2009-10-20 20:35 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Eugene Sajine, git
In-Reply-To: <200910202222.37563.j6t@kdbg.org>

On Tue, Oct 20, 2009 at 22:22, Johannes Sixt <j6t@kdbg.org> wrote:
> On Dienstag, 20. Oktober 2009, Alex Riesen wrote:
>> What's so hard with storing the SHA-1 of the *commit*, anyway?
>
> The hard part is that you get a cycle: commit SHA1 depends on contents depends
> on commit SHA1.

Don't store it in the repo. Store it in the output.

^ permalink raw reply

* Re: ident hash usage question
From: Eugene Sajine @ 2009-10-20 20:30 UTC (permalink / raw)
  To: Alex Riesen; +Cc: git
In-Reply-To: <81b0412b0910201219q4d16c472n43cab4b5d17cf63c@mail.gmail.com>

First, thank you for your answers!

>
> Very likely it isn't, but it's your choice.

;)

>
>> Because after having this hash one can build up all necessary info from it:
>
> Depending on your definition of "necessary".

I'm trying to find a way to get to state where i can do

$ git log <path>

Or any other variants of it without introducing any non-default
scripts/features/keywords and limiting keywords to one to avoid any
related problem...

One of my friends said that git is not working for their development
model... C++ development with static linking across the board, where
they need to see exactly which version of the file has got to the
executable. Roughly, they are using CVS' keywords and revision numbers
and a script wich matches them between two versions of the
executables.

I've got curious if Git can support it and how it can be done with
minimal changes to workflow.

>
>> #finding blobs with SHA indicated in $Id$ keword
>> $ git log --no-abbrev --raw --all | grep SHA-1
>
> yeah. These are all starting from commit which introduced
> the hash under a specific path, ending at the commit where
> the path contains another SHA-1.

Actually, grep "SHA-1 A" will show added path(s) only, so this is resolved

>
>> # little script or regexp here (don’t have it)
>> $ pull out path from result
>>
>> # last commit for the path with all corresponding info
>> $ git log -1 HEAD path
>>
>> So, this seems to cover most of the needs of people who would like to
>> use keywords expansion, if they are not ready to forget about them…
>>
>> Does it make sense?
>
> Not much. You'll always get a long list of commits which didn't
> change the damned blob. And you have absolutely no way
> to find out exactly which of the commits have produced
> the blob you're looking at (because you decided to do away
> with the information).

How is that? It seams to me that git log <path> will show only commits
where <path> was changed/committed? Considering the fact that I've got
the initial path from the blob, i should get the exact commit history
(or last commit in my example) for the file(s) (Files if renaming
occurred without content change).

Thanks,
Eugene

^ permalink raw reply

* Re: [PATCH] blame: make sure that the last line ends in an LF
From: Junio C Hamano @ 2009-10-20 20:28 UTC (permalink / raw)
  To: Sverre Rabbelier; +Cc: Junio C Hamano, Git List, Johannes Schindelin
In-Reply-To: <fabb9a1e0910201304j2730e6f3j1ebc5c17be07dd12@mail.gmail.com>

Sverre Rabbelier <srabbelier@gmail.com> writes:

> On Tue, Oct 20, 2009 at 11:55, Junio C Hamano <gitster@pobox.com> wrote:
>> That is kind of surprising ;-) as I do remember that I never thought about
>> this issue of dealing with the incomplete lines while writing the blame
>> algorithm.  I actually didn't even think about "well this will not work
>> because I am ignoring the incomplete lines".
>
> I used the following repo for testing:
>
> $ git init
> Initialized empty Git repository in /home/sverre/code/test/.git/
> $ echo "first line" > test
> $ git add test
> $ git commit -m "initial"
> [master (root-commit) d573d06] initial
>  1 files changed, 1 insertions(+), 0 deletions(-)
>  create mode 100644 test
> $ echo -n "second line, no newline" >> test
> $ git add test
> $ git commit -m "second"
> [master 76ad2f9] second
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> Regular output looks good:
>
> $ git blame test
> ^d573d06 (Sverre Rabbelier 2009-10-20 12:30:56 -0500 1) first line
> 76ad2f90 (Sverre Rabbelier 2009-10-20 12:31:57 -0500 2) second line, no newline
>
> Porcelain output looks good too:
>
> $ git blame -p test
> d573d06f0dd50148ba8e59bf8f1ef8fa7ee9fc88 1 1 1
> author Sverre Rabbelier
> author-mail <srabbelier@gmail.com>
> author-time 1256059856
> author-tz -0500
> committer Sverre Rabbelier
> committer-mail <srabbelier@gmail.com>
> committer-time 1256059856
> committer-tz -0500
> summary initial
> boundary
> filename test
>         first line
> 76ad2f90bde689a65715e37afd37d45942c74954 2 2 1
> author Sverre Rabbelier
> author-mail <srabbelier@gmail.com>
> author-time 1256059917
> author-tz -0500
> committer Sverre Rabbelier
> committer-mail <srabbelier@gmail.com>
> committer-time 1256059917
> committer-tz -0500
> summary second
> previous d573d06f0dd50148ba8e59bf8f1ef8fa7ee9fc88 test
> filename test
>         second line, no newline


Thanks.

For both styles of output, adding an extra LF after "no newline" would be
necessary to make the output legible (for human) and parsable (for
scripts).

In addition, it would help Porcelains to re-construct the final text if
you added a boolean "incomplete-line" (put it on its own line, immediately
after "filename test" line).  Then they will know that LF after "second
line, no newline" was not there in the original and was added for
parsability.

I am not sure what we want to do for non-porcelain output (other than
adding the extra LF at the end).  Assuming that they are meant to be read
by humans (and casual scripts that do not bother reading --porcelain
format), it might be best not to add any extra marking.

^ permalink raw reply

* Re: ident hash usage question
From: Johannes Sixt @ 2009-10-20 20:22 UTC (permalink / raw)
  To: Alex Riesen; +Cc: Eugene Sajine, git
In-Reply-To: <81b0412b0910201219q4d16c472n43cab4b5d17cf63c@mail.gmail.com>

On Dienstag, 20. Oktober 2009, Alex Riesen wrote:
> What's so hard with storing the SHA-1 of the *commit*, anyway?

The hard part is that you get a cycle: commit SHA1 depends on contents depends 
on commit SHA1.

-- Hannes

^ permalink raw reply

* Re: [PATCH] blame: make sure that the last line ends in an LF
From: Sverre Rabbelier @ 2009-10-20 20:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List, Johannes Schindelin
In-Reply-To: <7vbpk2ovio.fsf@alter.siamese.dyndns.org>

Heya,

On Tue, Oct 20, 2009 at 11:55, Junio C Hamano <gitster@pobox.com> wrote:
> That is kind of surprising ;-) as I do remember that I never thought about
> this issue of dealing with the incomplete lines while writing the blame
> algorithm.  I actually didn't even think about "well this will not work
> because I am ignoring the incomplete lines".

I used the following repo for testing:

$ git init
Initialized empty Git repository in /home/sverre/code/test/.git/
$ echo "first line" > test
$ git add test
$ git commit -m "initial"
[master (root-commit) d573d06] initial
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 test
$ echo -n "second line, no newline" >> test
$ git add test
$ git commit -m "second"
[master 76ad2f9] second
 1 files changed, 1 insertions(+), 0 deletions(-)

Regular output looks good:

$ git blame test
^d573d06 (Sverre Rabbelier 2009-10-20 12:30:56 -0500 1) first line
76ad2f90 (Sverre Rabbelier 2009-10-20 12:31:57 -0500 2) second line, no newline

Porcelain output looks good too:

$ git blame -p test
d573d06f0dd50148ba8e59bf8f1ef8fa7ee9fc88 1 1 1
author Sverre Rabbelier
author-mail <srabbelier@gmail.com>
author-time 1256059856
author-tz -0500
committer Sverre Rabbelier
committer-mail <srabbelier@gmail.com>
committer-time 1256059856
committer-tz -0500
summary initial
boundary
filename test
        first line
76ad2f90bde689a65715e37afd37d45942c74954 2 2 1
author Sverre Rabbelier
author-mail <srabbelier@gmail.com>
author-time 1256059917
author-tz -0500
committer Sverre Rabbelier
committer-mail <srabbelier@gmail.com>
committer-time 1256059917
committer-tz -0500
summary second
previous d573d06f0dd50148ba8e59bf8f1ef8fa7ee9fc88 test
filename test
        second line, no newline

-- 
Cheers,

Sverre Rabbelier

^ permalink raw reply

* Re: [RFC] pull/fetch rename
From: Wesley J. Landaker @ 2009-10-20 19:59 UTC (permalink / raw)
  To: Thomas Rast; +Cc: git, Björn Steinbrink
In-Reply-To: <200910201947.50423.trast@student.ethz.ch>

On Tuesday 20 October 2009 11:47:45 Thomas Rast wrote:
> Especially on IRC, we see many people who are some combination of
> misunderstanding, misusing or overusing git-pull.  I figure this is
> the result of several factors, notably
> 
> a) pull/push are not symmetric,
> 
> b) guides/tutorials recommend pull for situations where they
>    shouldn't,
> 
> c) people blindly fire commands at git.

This may be minor, but also:

d) in mercurial, pull/push are symmetric, but fetch means pull+merge

> As you probably guessed by now, here is an idea for a very aggressive
> transition plan to address (a) in four phases:

I would love to see this change, not because I get confused about pull/fetch 
(it honestly only took a few days to get used to), but because having 
push/pull be symmetric just is so much more conceptually pure / easier 
explain to co-workers / separation between orthogonal operations / 
satisfying to my inner perfectionist / etc.
  
> 1. git-fetch gets options --merge/-m and --rebase that make it behave
>    like (current) git-pull, but requiring explicit arguments.
>    git-pull gets a new option --merge (-m) that only enforces presence
>    of arguments.
> 
> 2. git-pull refuses to do any work unless given either --merge or
>    --rebase.  Deprecation warnings for this start at the same time as
>    (1.).
>
> 3. git-pull becomes a synonym for git-fetch.
> 
> 4. git-fetch gives deprecation warnings that point the user to
>    git-pull instead.

Hmmm, maybe this would be better for easier transition; replace 2-4 above 
with:

2. git-pull learns --merge and gets a configuration option that allows 
turning auto-merging off (e.g. pull.merge = merge/yes (default), rebase, or 
no). This doesn't change any behavior by default, but allows individual 
users to essentially make pull == fetch, and is forward compatible with 
changes up to #4.

3. git-pull gives a deprecation warning if the configuration option is not 
set, but otherwise defaults to merge. To get rid of the warning, you can set 
it explicitly (one way or another).

4. The configuration option default changes to "no", and a helpful message 
is printed telling you that you can set the configuration option to merge to 
get the old behavior.

5. Drop deprecation messages. At this point, git fetch and git pull are 
identical, except git fetch never merges, regardless of the pull 
configuration setting.

This has a few nice properties:

  * There is lots and lots of warning; this transition could happen slowly.

  * Early on, it will be possible to make git pull have symmetric behavior 
by default, which is the desired endgame.

  * In the end, people who want "git pull" to always keep it's current 
behavior can do so by setting the proper configuration variable.

  * git fetch doesn't need to be deprecated (but could be).

^ permalink raw reply

* Re: Re: [RFC PATCH] hooks: add some defaults to support sane workflow to pre-commit
From: Heiko Voigt @ 2009-10-20 19:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7v4opv4vjh.fsf@alter.siamese.dyndns.org>

On Mon, Oct 19, 2009 at 01:50:26AM -0700, Junio C Hamano wrote:
> Heiko Voigt <git-list@hvoigt.net> writes:
> 
> > It sometimes happens that you move to a new machine and forget to setup your
> > name and email address. To find this out after 10 commits can be quite
> > frustrating.
> 
> Doesn't env_hint[] kick in in fmt_ident() in such a case, though?

It is not want you usually want. You get <username>@<hostname> as the
committer but if you are working on a workstation this is seldomly your
email address. At least in my experience.

I would like this as a safeguard that you have done some basic setup
before you create any commits.

cheers Heiko

^ permalink raw reply

* Re: [PATCH] blame: make sure that the last line ends in an LF
From: Johannes Schindelin @ 2009-10-20 19:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Sverre Rabbelier, Git List
In-Reply-To: <7vbpk2ovio.fsf@alter.siamese.dyndns.org>

Hi,

On Tue, 20 Oct 2009, Junio C Hamano wrote:

> Sverre Rabbelier <srabbelier@gmail.com> writes:
> 
> >> Or am I worrying too much?
> >
> > No, I think your concerns are valid, we should go with (2) and DTRT. 
> > Does the updated patch address your concerns? If so I can send a new 
> > version.
> 
> Assuming the internal blame algorithm correctly works with such an 
> input, I'd be happier with an approach to allow users to tell the 
> difference. The --porcelain output was designed to be extensible, and it 
> might make sense to show the "this line is incomplete" as a separate 
> bit, though.

Sorry, you lost me.  If, say, line 614 is the last line that does not end 
in a new line, if I ask for it to be blamed, I want to know who is 
responsible for the current form of line 614.

Not whether the line ends in a new line or not.

Ciao,
Dscho

^ permalink raw reply

* Re: git hang with corrupted .pack
From: Junio C Hamano @ 2009-10-20 19:33 UTC (permalink / raw)
  To: Shawn O. Pearce, Nicolas Pitre; +Cc: Andy Isaacson, git, Alex Riesen
In-Reply-To: <7vzl7mng35.fsf@alter.siamese.dyndns.org>

Junio C Hamano <gitster@pobox.com> writes:

> Junio C Hamano <gitster@pobox.com> writes:
>
>>> We now abort the loop if inflate() returns Z_BUF_ERROR without
>>> consuming the entire input buffer it was given, or has filled
>>> the entire output buffer but has not yet returned Z_STREAM_END.
>>> Either state is a clear indicator that this loop is not working
>>> as expected, and should not continue.
>>
>> When the inflated contents is of size 0, avail_out would be 0 and avail_in
>> would still have something because the input stream needs to have the end
>> of stream marker that is more than zero byte long.
>
> After thinking about this a bit more, I am reasonably sure that this is
> it.  The contents does not have to be a 0-length string, but you would hit
> this if the pure-data portion of the deflated stream aligns at the end of
> your (un)pack window and it happens to require another use_pack() to move
> the window to read the end-of-stream signal.  In that situation, the
> output buffer has already been filled, but you haven't read the input
> stream fully.  Would't the new check incorrectly trigger in such a case?
>
>>>  		st = git_inflate(&stream, Z_FINISH);
>>> +		if (st == Z_BUF_ERROR && (stream.avail_in || !stream.avail_out))
>>> +			break;
>
> We won't see this on 64-bit platforms because we use larger (un)pack
> window and the condition is much less likely to be met.

Perhaps it would be as simple as this?

We deliberately give one byte more than what we expect to see and error
out when we do get that extra byte filled.

 sha1_file.c |   17 +++++++++--------
 1 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 4cc8939..8c9f392 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1344,7 +1344,7 @@ unsigned long get_size_from_delta(struct packed_git *p,
 			          off_t curpos)
 {
 	const unsigned char *data;
-	unsigned char delta_head[20], *in;
+	unsigned char delta_head[21], *in;
 	z_stream stream;
 	int st;
 
@@ -1357,13 +1357,14 @@ unsigned long get_size_from_delta(struct packed_git *p,
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
 		st = git_inflate(&stream, Z_FINISH);
-		if (st == Z_BUF_ERROR && (stream.avail_in || !stream.avail_out))
-			break;
+		if (!stream.avail_out)
+			break; /* the payload is larger than it should be! */
 		curpos += stream.next_in - in;
 	} while ((st == Z_OK || st == Z_BUF_ERROR) &&
-		 stream.total_out < sizeof(delta_head));
+		 stream.total_out < (sizeof(delta_head) - 1));
 	git_inflate_end(&stream);
-	if ((st != Z_STREAM_END) && stream.total_out != sizeof(delta_head)) {
+	if ((st != Z_STREAM_END) &&
+	    stream.total_out != sizeof(delta_head) - 1) {
 		error("delta data unpack-initial failed");
 		return 0;
 	}
@@ -1589,15 +1590,15 @@ static void *unpack_compressed_entry(struct packed_git *p,
 	buffer[size] = 0;
 	memset(&stream, 0, sizeof(stream));
 	stream.next_out = buffer;
-	stream.avail_out = size;
+	stream.avail_out = size + 1;
 
 	git_inflate_init(&stream);
 	do {
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
 		st = git_inflate(&stream, Z_FINISH);
-		if (st == Z_BUF_ERROR && (stream.avail_in || !stream.avail_out))
-			break;
+		if (!stream.avail_out)
+			break; /* the payload is larger than it should be! */
 		curpos += stream.next_in - in;
 	} while (st == Z_OK || st == Z_BUF_ERROR);
 	git_inflate_end(&stream);

^ permalink raw reply related

* Re: [PATCH] pull: refuse complete src:dst fetchspec arguments
From: Wesley J. Landaker @ 2009-10-20 19:29 UTC (permalink / raw)
  To: Thomas Rast; +Cc: git
In-Reply-To: <d561e70f0aa802ceb96eba16d3bb2316134d69c8.1256062808.git.trast@student.ethz.ch>

On Tuesday 20 October 2009 12:23:06 Thomas Rast wrote:
> git-pull has historically accepted full fetchspecs, meaning that you
> could do
> 
>   git pull $repo A:B
> 
> which would simultaneously fetch the remote branch A into the local
> branch B and merge B into HEAD.  This got especially confusing if B
> was checked out.  New users variously mistook pull for fetch or read
> that command as "merge the remote A into my B", neither of which is
> correct.

One thought here is that if the change you suggested (and I personally like) 
in your "[RFC] pull/fetch rename" thread was made, then I would expect to be 
able to run this exact command to have git fetch the remote branch A into 
the local branch B (with no merging taking place, because I didn't say --
merge). So basically, it would be like "git fetch $repo A:B" is now.

I readily agree that the *current* behavior of that command would have 
probably caught me off-guard, since I probably only would have typed that on 
accident (e.g. using "pull" when I meant "fetch").

^ permalink raw reply

* Re: ident hash usage question
From: Alex Riesen @ 2009-10-20 19:19 UTC (permalink / raw)
  To: Eugene Sajine; +Cc: git
In-Reply-To: <76c5b8580910201159i75a90f28pb882e83f0c7c40ae@mail.gmail.com>

On Tue, Oct 20, 2009 at 20:59, Eugene Sajine <euguess@gmail.com> wrote:
>>
>> Well, not exactly impossible, but you can end up with multiple paths,
>> some of which may not have anything to do the original path.
>>
>> Just run git log --no-abbrev --raw --all and grep for the SHA-1
>
> This is exactly what I was looking for! Thank you!
>

Very likely it isn't, but it's your choice.

> Because after having this hash one can build up all necessary info from it:

Depending on your definition of "necessary".

> #finding blobs with SHA indicated in $Id$ keword
> $ git log --no-abbrev --raw --all | grep SHA-1

yeah. These are all starting from commit which introduced
the hash under a specific path, ending at the commit where
the path contains another SHA-1.

> # little script or regexp here (don’t have it)
> $ pull out path from result
>
> # last commit for the path with all corresponding info
> $ git log -1 HEAD path
>
> So, this seems to cover most of the needs of people who would like to
> use keywords expansion, if they are not ready to forget about them…
>
> Does it make sense?

Not much. You'll always get a long list of commits which didn't
change the damned blob. And you have absolutely no way
to find out exactly which of the commits have produced
the blob you're looking at (because you decided to do away
with the information).

What's so hard with storing the SHA-1 of the *commit*, anyway?

^ permalink raw reply

* Re: [RFC/PATCH] fsck: default to "git fsck --full"
From: Junio C Hamano @ 2009-10-20 19:11 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Junio C Hamano, Matthieu Moy, Johannes Schindelin, Alex Riesen,
	Johannes Sixt, Sergio Callegari, git
In-Reply-To: <alpine.LFD.2.00.0910201457310.21460@xanadu.home>

Nicolas Pitre <nico@fluxnic.net> writes:

> On Tue, 20 Oct 2009, Junio C Hamano wrote:
>
>> diff --git a/Documentation/RelNotes-1.6.6.txt b/Documentation/RelNotes-1.6.6.txt
>> index 5f1fecb..1896e05 100644
>> --- a/Documentation/RelNotes-1.6.6.txt
>> +++ b/Documentation/RelNotes-1.6.6.txt
>> @@ -1,6 +1,13 @@
>>  GIT v1.6.6 Release Notes
>>  ========================
>>  
>> +In this release, "git fsck" defaults to "git fsck --full" and checks
>> +packfiles.  If you prefer a quicker check only on loose objects (the
>              ^^
>
> Might be worth mentioning explicitly that, because of that change, plain 
> fsck is now going to take much longer to complete.

Sounds fair; thanks.

diff --git a/Documentation/RelNotes-1.6.6.txt b/Documentation/RelNotes-1.6.6.txt
index 0adf998..fa0e11a 100644
--- a/Documentation/RelNotes-1.6.6.txt
+++ b/Documentation/RelNotes-1.6.6.txt
@@ -2,7 +2,8 @@ GIT v1.6.6 Release Notes
 ========================
 
 In this release, "git fsck" defaults to "git fsck --full" and checks
-packfiles.  If you prefer a quicker check only on loose objects (the
+packfiles, and because of this it will take much longer to complete
+than before.  If you prefer a quicker check only on loose objects (the
 old default), you can say "git fsck --no-full".  This has been
 supported by 1.5.4 and newer versions of git, so it is safe to write
 it in your script even if you use slightly older git on some of your

^ permalink raw reply related

* Re: [RFC/PATCH] fsck: default to "git fsck --full"
From: Nicolas Pitre @ 2009-10-20 19:00 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Matthieu Moy, Johannes Schindelin, Alex Riesen, Johannes Sixt,
	Sergio Callegari, git
In-Reply-To: <7v3a5doqcg.fsf_-_@alter.siamese.dyndns.org>

On Tue, 20 Oct 2009, Junio C Hamano wrote:

> diff --git a/Documentation/RelNotes-1.6.6.txt b/Documentation/RelNotes-1.6.6.txt
> index 5f1fecb..1896e05 100644
> --- a/Documentation/RelNotes-1.6.6.txt
> +++ b/Documentation/RelNotes-1.6.6.txt
> @@ -1,6 +1,13 @@
>  GIT v1.6.6 Release Notes
>  ========================
>  
> +In this release, "git fsck" defaults to "git fsck --full" and checks
> +packfiles.  If you prefer a quicker check only on loose objects (the
             ^^

Might be worth mentioning explicitly that, because of that change, plain 
fsck is now going to take much longer to complete.


Nicolas

^ permalink raw reply

* Re: ident hash usage question
From: Eugene Sajine @ 2009-10-20 18:59 UTC (permalink / raw)
  To: Alex Riesen; +Cc: git, Eugene Sajine
In-Reply-To: <81b0412b0910201119w7583487ag276cf964d0a85e@mail.gmail.com>

>
> Well, not exactly impossible, but you can end up with multiple paths,
> some of which may not have anything to do the original path.
>
> Just run git log --no-abbrev --raw --all and grep for the SHA-1
>

This is exactly what I was looking for! Thank you!

I understand that in some cases this can give me two paths instead of
one, but it will only demonstrate that I have absolute copy of a file
inside my repo, which is also good to diagnose (because the
probability of having two meaningful files with the same hash, but
different content is way too low).

So, this means that after few little tricks the keyword expansion
problem may be resolved by only using $Id$ keyword.
Because after having this hash one can build up all necessary info from it:

#finding blobs with SHA indicated in $Id$ keword
$ git log --no-abbrev --raw --all | grep SHA-1

# little script or regexp here (don’t have it)
$ pull out path from result

# last commit for the path with all corresponding info
$ git log -1 HEAD path

So, this seems to cover most of the needs of people who would like to
use keywords expansion, if they are not ready to forget about them…

Does it make sense?

Thanks,
Eugene

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox