Git development
 help / color / mirror / Atom feed
* Re: [PATCH] mergetool: clean up temp files when aborted
From: Phil Hord @ 2013-01-24 21:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Phil Hord, git@vger.kernel.org, Theodore Ts'o
In-Reply-To: <7vbocebblo.fsf@alter.siamese.dyndns.org>

On Thu, Jan 24, 2013 at 2:54 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Phil Hord <hordp@cisco.com> writes:
>
>> When handling a symlink conflict or a deleted-file conflict, mergetool
>> stops to ask the user what to do. If the user chooses any option besides
>> "(a)bort", then the temporary files which mergetool created in
>> preparation for handling the conflict are removed.  But these temporary
>> files are not removed when the user chooses to abort the operation.
>>
>>     $ git cherry-pick other/branch
>>     error: could not apply 4e43581... Fix foo.c
>>
>>     $ git status --short
>>     DU foo.c
>>
>>     $ git mergetool
>>     Merging:
>>     foo.c
>>
>>     Deleted merge conflict for 'foo.c':
>>       {local}: deleted
>>       {remote}: modified file
>>     Use (m)odified or (d)eleted file, or (a)bort? a
>>     Continue merging other unresolved paths (y/n) ? n
>>
>>     $ git status --short
>>     DU foo.c
>>     ?? foo.c.BACKUP.16929.c
>>     ?? foo.c.BASE.16929.c
>>     ?? foo.c.LOCAL.16929.c
>>     ?? foo.c.REMOTE.16929.c
>>
>> These temporary files should not remain after the mergetool operation is
>> completed.
>
> Aren't there cases where people "abort" so that they can have a
> chance inspect them outside mergetool program?  If there are no such
> cases, then I would agree with your claim "should not remain", but
> the proposed log message does not explay why it is sure that there
> are no such use cases.

You may be right about other people's workflows which I forgot to
consider.  I have noticed a lot of inconsistency in this tool wrt to
mergetool.keepBackup and mergetool.keepTemporaries.  Perhaps this
change should hinge on the latter.

Would you agree with this rephrased statement (accompanied by matching logic)?

    When mergetool.keepTemporaries is false or not set, these temporary files
    should not remain when the mergetool operation is aborted.


Here is the wording from git help config:

      mergetool.keepBackup
           After performing a merge, the original file with conflict
markers can be saved as a file with a .orig extension. If this
variable is set to false then this file
           is not preserved. Defaults to true (i.e. keep the backup files).

       mergetool.keepTemporaries
           When invoking a custom merge tool, git uses a set of
temporary files to pass to the tool. If the tool returns an error and
this variable is set to true, then
           these temporary files will be preserved, otherwise they
will be removed after the tool has exited. Defaults to false.


I have another commit in the works to clean up consistency around the
keepBackup.  I'll re-roll this one at the same time.

Phil

^ permalink raw reply

* Re: [PATCH] mergetool: clean up temp files when aborted
From: Junio C Hamano @ 2013-01-24 21:41 UTC (permalink / raw)
  To: Phil Hord; +Cc: Phil Hord, git@vger.kernel.org, Theodore Ts'o
In-Reply-To: <CABURp0qxj0HVHbYOR1Enfx_A1-ALPDGhLriGp=gab-LBigwt8g@mail.gmail.com>

Phil Hord <phil.hord@gmail.com> writes:

> On Thu, Jan 24, 2013 at 2:54 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Phil Hord <hordp@cisco.com> writes:
>> ...
>>> These temporary files should not remain after the mergetool operation is
>>> completed.
>>
>> Aren't there cases where people "abort" so that they can have a
>> chance inspect them outside mergetool program?  If there are no such
>> cases, then I would agree with your claim "should not remain", but
>> the proposed log message does not explay why it is sure that there
>> are no such use cases.
>
> You may be right about other people's workflows which I forgot to
> consider.  I have noticed a lot of inconsistency in this tool wrt to
> mergetool.keepBackup and mergetool.keepTemporaries.  Perhaps this
> change should hinge on the latter.
>
> Would you agree with this rephrased statement (accompanied by matching logic)?
>
>     When mergetool.keepTemporaries is false or not set, these temporary files
>     should not remain when the mergetool operation is aborted.

That is perfectly sensible (and should apply equally to non-"abort"
cases, I think).

Thanks.

^ permalink raw reply

* Re: [PATCH] t9902: Instruct git-completion.bash about a test mode
From: Junio C Hamano @ 2013-01-24 21:46 UTC (permalink / raw)
  To: Jean-Noël AVILA; +Cc: git, Jeff King
In-Reply-To: <201301242220.09067.jn.avila@free.fr>

"Jean-Noël AVILA" <jn.avila@free.fr> writes:

> In test mode, git completion should propose commands only if they
> belong to the list of authorized commands.
>
> Signed-off-by: Jean-Noel Avila <jn.avila@free.fr>
> ---
>
> Better show some code than try to explain. Here is what I mean by
> "filter the output git help -a". 

Why do you have to make an extra shell function call for each and
every possible Git subcommand to slow down everybody's work when not
in testing mode?

Comparing it with Peff's suggestion, it is fairly clear which one we
should pick, I think.



>  contrib/completion/git-completion.bash | 16 +++++++++++++++-
>  t/t9902-completion.sh                  |  4 ++--
>  2 files changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
> index 14dd5e7..6490553 100644
> --- a/contrib/completion/git-completion.bash
> +++ b/contrib/completion/git-completion.bash
> @@ -531,6 +531,20 @@ __git_complete_strategy ()
>  	return 1
>  }
>  
> +if test -z "$AUTHORIZED_CMD_LIST"; then
> +	__git_cmdlist ()
> +	{ 
> +		echo $1;
> +	}
> +else
> +	__git_cmdlist ()
> +	{ 
> +		if [[ " $AUTHORIZED_CMD_LIST " =~ " $1 " ]] ; then
> +				echo $1
> +		fi
> +	}
> +fi
> +
>  __git_list_all_commands ()
>  {
>  	local i IFS=" "$'\n'
> @@ -538,7 +552,7 @@ __git_list_all_commands ()
>  	do
>  		case $i in
>  		*--*)             : helper pattern;;
> -		*) echo $i;;
> +		*) __git_cmdlist $i;;
>  		esac
>  	done
>  }
> diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh
> index 3cd53f8..5e7d81e 100755
> --- a/t/t9902-completion.sh
> +++ b/t/t9902-completion.sh
> @@ -12,8 +12,8 @@ complete ()
>  	# do nothing
>  	return 0
>  }
> -
> -. "$GIT_BUILD_DIR/contrib/completion/git-completion.bash"
> +AUTHORIZED_CMD_LIST=" checkout show add filter-branch ls-files send-email describe"
> +. "$GIT_BUILD_DIR/contrib/completion/git-completion.bash" 
>  
>  # We don't need this function to actually join words or do anything special.
>  # Also, it's cleaner to avoid touching bash's internal completion variables.

^ permalink raw reply

* Re: [PATCH] t9902: Instruct git-completion.bash about a test mode
From: Jean-Noël AVILA @ 2013-01-24 22:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King
In-Reply-To: <7vtxq68day.fsf@alter.siamese.dyndns.org>

Le jeudi 24 janvier 2013 22:46:13, Junio C Hamano a écrit :
> "Jean-Noël AVILA" <jn.avila@free.fr> writes:
> 
> > In test mode, git completion should propose commands only if they
> > belong to the list of authorized commands.
> >
> > Signed-off-by: Jean-Noel Avila <jn.avila@free.fr>
> > ---
> >
> > Better show some code than try to explain. Here is what I mean by
> > "filter the output git help -a". 
> 
> Why do you have to make an extra shell function call for each and
> every possible Git subcommand to slow down everybody's work when not
> in testing mode?
> 

My rational was to be sure to put the environment variable out of the 
way once the script has been sourced. I can make two alternative 
definitions of __git_list_all_commands () depending on the presence of 
$AUTHORIZED_CMD_LIST if you are worried about performance.

> Comparing it with Peff's suggestion, it is fairly clear which one we
> should pick, I think.
> 
> 
> 
> >  contrib/completion/git-completion.bash | 16 +++++++++++++++-
> >  t/t9902-completion.sh                  |  4 ++--
> >  2 files changed, 17 insertions(+), 3 deletions(-)
> >
> > diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
> > index 14dd5e7..6490553 100644
> > --- a/contrib/completion/git-completion.bash
> > +++ b/contrib/completion/git-completion.bash
> > @@ -531,6 +531,20 @@ __git_complete_strategy ()
> >  	return 1
> >  }
> >  
> > +if test -z "$AUTHORIZED_CMD_LIST"; then
> > +	__git_cmdlist ()
> > +	{ 
> > +		echo $1;
> > +	}
> > +else
> > +	__git_cmdlist ()
> > +	{ 
> > +		if [[ " $AUTHORIZED_CMD_LIST " =~ " $1 " ]] ; then
> > +				echo $1
> > +		fi
> > +	}
> > +fi
> > +
> >  __git_list_all_commands ()
> >  {
> >  	local i IFS=" "$'\n'
> > @@ -538,7 +552,7 @@ __git_list_all_commands ()
> >  	do
> >  		case $i in
> >  		*--*)             : helper pattern;;
> > -		*) echo $i;;
> > +		*) __git_cmdlist $i;;
> >  		esac
> >  	done
> >  }
> > diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh
> > index 3cd53f8..5e7d81e 100755
> > --- a/t/t9902-completion.sh
> > +++ b/t/t9902-completion.sh
> > @@ -12,8 +12,8 @@ complete ()
> >  	# do nothing
> >  	return 0
> >  }
> > -
> > -. "$GIT_BUILD_DIR/contrib/completion/git-completion.bash"
> > +AUTHORIZED_CMD_LIST=" checkout show add filter-branch ls-files send-email describe"
> > +. "$GIT_BUILD_DIR/contrib/completion/git-completion.bash" 
> >  
> >  # We don't need this function to actually join words or do anything special.
> >  # Also, it's cleaner to avoid touching bash's internal completion variables.
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply

* Re: [PATCH] mergetools: Add tortoisegitmerge helper
From: Sven Strickroth @ 2013-01-24 22:07 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Sebastian Schuberth, davvid, Jeff King
In-Reply-To: <7vfw1qbbr4.fsf@alter.siamese.dyndns.org>

Am 24.01.2013 20:51 schrieb Junio C Hamano:
> Sven Strickroth <sven.strickroth@tu-clausthal.de> writes:
> 
>> - The TortoiseGit team renamed TortoiseMerge.exe to TortoiseGitMerge.exe
>>   (starting with 1.8.0) in order to make clear that this one has special
>>   support for git and prevent confusion with the TortoiseSVN TortoiseMerge
>>   version.
> 
> Wouldn't it make more sense in such a situation if your users can
> keep using the old "tortoisemerge" configured in their configuration
> and when the renamed one is found the mergetool automatically used
> it, rather than the way your patch is done?

That was also my first idea, however, TortoiseMerge uses parameters as
follows: '-base:"$BASE"'. TortoiseGitMerge uses values separated by
space from keys: '-base "$BASE"'. So both are incompatible (the first
approach has problems with spaces in filenames, the TortoiseGitMerge
approach fixes this).

>> diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt
>> index 4314ad0..13cbe5b 100644
>> --- a/Documentation/diff-config.txt
>> +++ b/Documentation/diff-config.txt
>> @@ -151,7 +151,7 @@ diff.<driver>.cachetextconv::
>>  diff.tool::
>>  	The diff tool to be used by linkgit:git-difftool[1].  This
>>  	option overrides `merge.tool`, and has the same valid built-in
>> -	values as `merge.tool` minus "tortoisemerge" and plus
>> -	"kompare".  Any other value is treated as a custom diff tool,
>> +	values as `merge.tool` minus "tortoisemerge"/"tortoisegitmerge" and
>> +	plus "kompare".  Any other value is treated as a custom diff tool,
>>  	and there must be a corresponding `difftool.<tool>.cmd`
>>  	option.
> 
> So in short, two tortoises and kompare are only valid as mergetool
> but cannot be used as difftool?  No, I am reading it wrong.
> merge.tool can be used for both, kompare can be used as difftool,
> and two tortoises can only be used as mergetool.
> 
> This paragraph needs to be rewritten to unconfuse readers.  The
> original is barely intelligible, and it becomes unreadable as the
> set of tools subtracted by "minus" and added by "plus" grows.

But I think this should not be part of this patch.

-- 
Best regards,
 Sven Strickroth
 PGP key id F5A9D4C4 @ any key-server

^ permalink raw reply

* Re: [PATCH] mergetools: Add tortoisegitmerge helper
From: Junio C Hamano @ 2013-01-24 22:15 UTC (permalink / raw)
  To: Sven Strickroth; +Cc: git, Sebastian Schuberth, davvid, Jeff King
In-Reply-To: <5101B0A5.1020308@tu-clausthal.de>

Sven Strickroth <sven.strickroth@tu-clausthal.de> writes:

> Am 24.01.2013 20:51 schrieb Junio C Hamano:
>> Sven Strickroth <sven.strickroth@tu-clausthal.de> writes:
>> 
>>> - The TortoiseGit team renamed TortoiseMerge.exe to TortoiseGitMerge.exe
>>>   (starting with 1.8.0) in order to make clear that this one has special
>>>   support for git and prevent confusion with the TortoiseSVN TortoiseMerge
>>>   version.
>> 
>> Wouldn't it make more sense in such a situation if your users can
>> keep using the old "tortoisemerge" configured in their configuration
>> and when the renamed one is found the mergetool automatically used
>> it, rather than the way your patch is done?
>
> That was also my first idea, however, TortoiseMerge uses parameters as
> follows: '-base:"$BASE"'. TortoiseGitMerge uses values separated by
> space from keys: '-base "$BASE"'. So both are incompatible (the first
> approach has problems with spaces in filenames, the TortoiseGitMerge
> approach fixes this).

OK.  Please unconfuse future readers of "git log" by saying why such
a unification does not work in the proposed log message.

>>> diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt
>>> index 4314ad0..13cbe5b 100644
>>> --- a/Documentation/diff-config.txt
>>> +++ b/Documentation/diff-config.txt
>>> @@ -151,7 +151,7 @@ diff.<driver>.cachetextconv::
>>>  diff.tool::
>>>  	The diff tool to be used by linkgit:git-difftool[1].  This
>>>  	option overrides `merge.tool`, and has the same valid built-in
>>> -	values as `merge.tool` minus "tortoisemerge" and plus
>>> -	"kompare".  Any other value is treated as a custom diff tool,
>>> +	values as `merge.tool` minus "tortoisemerge"/"tortoisegitmerge" and
>>> +	plus "kompare".  Any other value is treated as a custom diff tool,
>>>  	and there must be a corresponding `difftool.<tool>.cmd`
>>>  	option.
>> 
>> So in short, two tortoises and kompare are only valid as mergetool
>> but cannot be used as difftool?  No, I am reading it wrong.
>> merge.tool can be used for both, kompare can be used as difftool,
>> and two tortoises can only be used as mergetool.
>> 
>> This paragraph needs to be rewritten to unconfuse readers.  The
>> original is barely intelligible, and it becomes unreadable as the
>> set of tools subtracted by "minus" and added by "plus" grows.
>
> But I think this should not be part of this patch.

I agree that it can be done (and it is better to be done) as a
preparatory step.  The current text is barely readable, but with
this patch there will be two "minus", and the result becomes
unreadable at that point.

It also could be done as a follow-up documentation readability fix.

^ permalink raw reply

* Re: [PATCH] t9902: Instruct git-completion.bash about a test mode
From: Junio C Hamano @ 2013-01-24 22:29 UTC (permalink / raw)
  To: Jean-Noël AVILA; +Cc: git, Jeff King
In-Reply-To: <201301242304.16078.jn.avila@free.fr>

"Jean-Noël AVILA" <jn.avila@free.fr> writes:

> My rational was to be sure to put the environment variable out of the 
> way once the script has been sourced. I can make two alternative 
> definitions of __git_list_all_commands () depending on the presence of 
> $AUTHORIZED_CMD_LIST if you are worried about performance.

Actually, it does not matter, as once __git_list_all_commands is
run, its result is kept in a variable.

So Peff's

  if test -z "$FAKE_COMMAND_LIST"; then
          __git_cmdlist() {
                  git help -a | egrep '^  [a-zA-Z0-9]'
          }
  else
          __git_cmdlist() {
                  printf '%s' "$FAKE_COMMAND_LIST"
          }
  fi

would be just as simple if not simpler, does the same thing, and is
sufficient, I think.

The t9902 test is only interested in making sure that the completion
works, and we do not want "git help -a" that omits a subcommand from
its output that is not built in your particular environment to get
in the way, which will not be an issue with this approach.

^ permalink raw reply

* Re: [PATCH v4 1/3] push: further clean up fields of "struct ref"
From: Eric Sunshine @ 2013-01-24 22:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Chris Rorvick
In-Reply-To: <1358978130-12216-2-git-send-email-gitster@pobox.com>

On Wed, Jan 23, 2013 at 4:55 PM, Junio C Hamano <gitster@pobox.com> wrote:
> The "nonfastforward" and "update" fields are only used while
> deciding what value to assign to the "status" locally in a single
> function.  Remove them from the "struct ref".
>
> The "requires_force" field is not used to decide if the proposed
> update requires a --force option to succeed, or to record such a
> decision made elsewhere.  It is used by status reporting code that
> the particular update was "forced".  Rename it to "forced_udpate",

s/udpate/update/

^ permalink raw reply

* Re: [PATCH] t9902: Instruct git-completion.bash about a test mode
From: Jean-Noël AVILA @ 2013-01-24 22:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King
In-Reply-To: <7vlibi8baq.fsf@alter.siamese.dyndns.org>

Le jeudi 24 janvier 2013 23:29:33, Junio C Hamano a écrit :
> "Jean-Noël AVILA" <jn.avila@free.fr> writes:
> 
> > My rational was to be sure to put the environment variable out of the 
> > way once the script has been sourced. I can make two alternative 
> > definitions of __git_list_all_commands () depending on the presence of 
> > $AUTHORIZED_CMD_LIST if you are worried about performance.
> 
> Actually, it does not matter, as once __git_list_all_commands is
> run, its result is kept in a variable.
> 
> So Peff's
> 
>   if test -z "$FAKE_COMMAND_LIST"; then
>           __git_cmdlist() {
>                   git help -a | egrep '^  [a-zA-Z0-9]'
>           }
>   else
>           __git_cmdlist() {
>                   printf '%s' "$FAKE_COMMAND_LIST"
>           }
>   fi
> 
> would be just as simple if not simpler, does the same thing, and is
> sufficient, I think.
> 
> The t9902 test is only interested in making sure that the completion
> works, and we do not want "git help -a" that omits a subcommand from
> its output that is not built in your particular environment to get
> in the way, which will not be an issue with this approach.
> 


Ah. I totally missed the point. I though that the requested change was
to intersect the list needed for the test with the one provided by the 
standard completion.

^ permalink raw reply

* [PATCH] t9902: protect test from stray build artifacts
From: Junio C Hamano @ 2013-01-24 23:07 UTC (permalink / raw)
  To: Jeff King; +Cc: Jean-Noël AVILA, git
In-Reply-To: <20130122003954.GA23297@sigill.intra.peff.net>

When you have random build artifacts in your build directory, left
behind by running "make" while on another branch, the "git help -a"
command run by __git_list_all_commands in the completion script that
is being tested does not have a way to know that they are not part
of the subcommands this build will ship.  Such extra subcommands may
come from the user's $PATH.  They will interfere with the tests that
expect a certain prefix to uniquely expand to a known completion.

Instrument the completion script and give it a way for us to tell
what (subset of) subcommands we are going to ship.

Also add a test to "git --help <prefix><TAB>" expansion.  It needs
to show not just commands but some selected documentation pages.

Based on an idea by Jeff King.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 contrib/completion/git-completion.bash | 11 ++++++++++-
 t/t9902-completion.sh                  | 25 ++++++++++++++++++++++++-
 2 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
index a4c48e1..6139b50 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -531,10 +531,19 @@ __git_complete_strategy ()
 	return 1
 }
 
+__git_commands () {
+	if test -n "${GIT_TESTING_COMMAND_COMPLETION:-}"
+	then
+		printf "%s" "${GIT_TESTING_COMMAND_COMPLETION}"
+	else
+		git help -a|egrep '^  [a-zA-Z0-9]'
+	fi
+}
+
 __git_list_all_commands ()
 {
 	local i IFS=" "$'\n'
-	for i in $(git help -a|egrep '^  [a-zA-Z0-9]')
+	for i in $(__git_commands)
 	do
 		case $i in
 		*--*)             : helper pattern;;
diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh
index 3cd53f8..adc1372 100755
--- a/t/t9902-completion.sh
+++ b/t/t9902-completion.sh
@@ -13,6 +13,25 @@ complete ()
 	return 0
 }
 
+# Be careful when updating this list:
+#
+# (1) The build tree may have build artifact from different branch, or
+#     the user's $PATH may have a random executable that may begin
+#     with "git-check" that are not part of the subcommands this build
+#     will ship, e.g.  "check-ignore".  The tests for completion for
+#     subcommand names tests how "check" is expanded; we limit the
+#     possible candidates to "checkout" and "check-attr" to make sure
+#     "check-attr", which is known by the filter function as a
+#     subcommand to be thrown out, while excluding other random files
+#     that happen to begin with "check" to avoid letting them get in
+#     the way.
+#
+# (2) A test makes sure that common subcommands are included in the
+#     completion for "git <TAB>", and a plumbing is excluded.  "add",
+#     "filter-branch" and "ls-files" are listed for this.
+
+GIT_TESTING_COMMAND_COMPLETION='add checkout check-attr filter-branch ls-files'
+
 . "$GIT_BUILD_DIR/contrib/completion/git-completion.bash"
 
 # We don't need this function to actually join words or do anything special.
@@ -196,7 +215,6 @@ test_expect_success 'general options plus command' '
 	test_completion "git --paginate check" "checkout " &&
 	test_completion "git --git-dir=foo check" "checkout " &&
 	test_completion "git --bare check" "checkout " &&
-	test_completion "git --help des" "describe " &&
 	test_completion "git --exec-path=foo check" "checkout " &&
 	test_completion "git --html-path check" "checkout " &&
 	test_completion "git --no-pager check" "checkout " &&
@@ -207,6 +225,11 @@ test_expect_success 'general options plus command' '
 	test_completion "git --no-replace-objects check" "checkout "
 '
 
+test_expect_success 'git --help completion' '
+	test_completion "git --help ad" "add " &&
+	test_completion "git --help core" "core-tutorial "
+'
+
 test_expect_success 'setup for ref completion' '
 	echo content >file1 &&
 	echo more >file2 &&

^ permalink raw reply related

* [regression] Re: [PATCHv2 10/15] drop length limitations on gecos-derived names and emails
From: Jonathan Nieder @ 2013-01-24 23:21 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Angus Hammond, git, Mihai Rusu
In-Reply-To: <20120521231017.GJ10981@sigill.intra.peff.net>

Hi Jeff,

Jeff King wrote:

> When we pull the user's name from the GECOS field of the
> passwd file (or generate an email address based on their
> username and hostname), we put the result into a
> static buffer.
[...]
> The conversion is mostly mechanical: replace string copies
> with strbuf equivalents, and access the strbuf.buf directly.
> There are a few exceptions:

This broke /etc/mailname handling.  Before:

	$ git var GIT_COMMITTER_IDENT
	Jonathan Nieder <jrn@mailname.example.com> 1359069165 -0800

After:

	$ git var GIT_COMMITTER_IDENT
	Jonathan Nieder <mailname.example.com> 1359069192 -0800

The cause:

[...]
> --- a/ident.c
> +++ b/ident.c
> @@ -19,42 +18,27 @@ int user_ident_explicitly_given;
[...]
> -static int add_mailname_host(char *buf, size_t len)
> +static int add_mailname_host(struct strbuf *buf)
>  {
>  	FILE *mailname;
>  
> @@ -65,7 +49,7 @@ static int add_mailname_host(char *buf, size_t len)
>  				strerror(errno));
>  		return -1;
>  	}
> -	if (!fgets(buf, len, mailname)) {
> +	if (strbuf_getline(buf, mailname, '\n') == EOF) {

This clears the strbuf.

How about something like this as a quick fix?

Reported-by: Mihai Rusu <dizzy@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>

diff --git a/ident.c b/ident.c
index 73a06a1..cabd73f 100644
--- a/ident.c
+++ b/ident.c
@@ -41,6 +41,7 @@ static void copy_gecos(const struct passwd *w, struct strbuf *name)
 static int add_mailname_host(struct strbuf *buf)
 {
 	FILE *mailname;
+	struct strbuf mailnamebuf = STRBUF_INIT;
 
 	mailname = fopen("/etc/mailname", "r");
 	if (!mailname) {
@@ -49,14 +50,17 @@ static int add_mailname_host(struct strbuf *buf)
 				strerror(errno));
 		return -1;
 	}
-	if (strbuf_getline(buf, mailname, '\n') == EOF) {
+	if (strbuf_getline(&mailnamebuf, mailname, '\n') == EOF) {
 		if (ferror(mailname))
 			warning("cannot read /etc/mailname: %s",
 				strerror(errno));
+		strbuf_release(&mailnamebuf);
 		fclose(mailname);
 		return -1;
 	}
 	/* success! */
+	strbuf_addbuf(buf, &mailnamebuf);
+	strbuf_release(&mailnamebuf);
 	fclose(mailname);
 	return 0;
 }

^ permalink raw reply related

* Re: segmentation fault (nullpointer) with git log --submodule -p
From: Jeff King @ 2013-01-24 23:27 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Stefan Näwe, Armin, Junio C Hamano, Jonathon Mah,
	git@vger.kernel.org
In-Reply-To: <CACsJy8CEofqi9S8-SDx_O+Ko0i56aRZ4KEJrVnbFum6zzsJrJg@mail.gmail.com>

On Thu, Jan 24, 2013 at 09:14:47PM +0700, Nguyen Thai Ngoc Duy wrote:

> >>> I did. My bisection told me this is the suspect:
> >>>
> >>> ccdc603 (parse_object: try internal cache before reading object db)
> >>
> >> diff --git a/object.c b/object.c
> >> index d8d09f9..6b06297 100644
> >> --- a/object.c
> >> +++ b/object.c
> >> @@ -191,10 +191,15 @@ struct object *parse_object(const unsigned char *sha1)
> >>         enum object_type type;
> >>         int eaten;
> >>         const unsigned char *repl = lookup_replace_object(sha1);
> >> -       void *buffer = read_sha1_file(sha1, &type, &size);
> >> +       void *buffer;
> >> +       struct object *obj;
> >> +
> >> +       obj = lookup_object(sha1);
> >> +       if (obj && obj->parsed)
> >> +               return obj;
> >>
> >> Any chance obj->parsed is 1 but ((struct commit*)obj)->buffer is NULL?
> >> What if you change that "if" to
> >>
> >> if (obj && obj->parsed && (obj->type != OBJ_COMMIT || ((struct commit
> >> *)obj)->buffer))
> >>
> >
> > No more segfault!
> 
> Sweet. I have no idea how that fixes it. Maybe Jeff can give some
> explanation after he wakes up.

Ugh. I think I know why it fixes it. We free the commit's buffer as part
of the log traversal, but then later want to access it as part of the
diff. We presumably call parse_object somewhere in the middle to make
sure it is parsed.

Before ccdc603, a side effect of parse_object is that even for a parsed
object, we would fill in the buffer field of a commit or tree. See
parse_object_buffer:

        } else if (type == OBJ_COMMIT) {
                struct commit *commit = lookup_commit(sha1);
                if (commit) {
                        if (parse_commit_buffer(commit, buffer, size))
                                return NULL;
                        if (!commit->buffer) {
                                commit->buffer = buffer;
                                eaten = 1;
                        }
                        obj = &commit->object;
                }

When this patch was originally proposed, I wrote[1]:

  On Thu, Jan 05, 2012 at 01:55:22PM -0800, Junio C Hamano wrote:
  > > So I think it is safe short of somebody doing some horrible manual
  > > munging of a "struct object".
  >
  > Yeah, I was worried about codepaths like commit-pretty-printing
  > might be mucking with the contents of commit->buffer, perhaps
  > reencoding the text and then calling parse_object() to get the
  > unmodified original back, or something silly like that. But the
  > lookup_object() call at the beginning of the parse_object() already
  > prevents us from doing such a thing, so we should be OK, I would
  > think.

  [...]

  What saves you is that the parse_*_buffer functions all do nothing
  when the object.parsed flag is set, and the code I added makes sure
  that object.parsed is set in the object that lookup_object returns.

  So yeah, anytime you tweak the contents of commit->buffer but don't
  unset the "parsed" flag, you are asking for trouble.

Which is true, but obviously I missed that in addition to calling
parse_*_buffer, which will be a no-op, we _also_ set the buffer
independently. So parse_object was functioning in a belt-and-suspenders
for that case. And I think this is probably the same root cause as the
segfault which came up here:

  http://thread.gmane.org/gmane.comp.version-control.git/214366

So what to do?

We can revert ccdc603, but I do not think we need to. We can catch the
problematic cases with something like your patch, but still get the
optimization when the buffer really is already filled in. I think we'd
need to extend your patch to handle trees, too, to be totally correct.

But there are still some loose ends that I note:

  1. Making such a change would be parse_object erring on the side of
     providing the buffer. But it doesn't actually know if the buffer is
     desired or not. For instance, upload-pack benefited from this
     optimization, but does not need save_commit_buffer on at all. So
     commit->buffer is _always_ NULL there, and that's just fine; we
     really don't need to read the object.

     Now this may be a bad example, because due to my follow-on patches,
     we avoid calling parse_object at all in most cases, so I don't
     think it matters any longer to upload-pack. But I suspect there are
     other places with similar circumstances. Fundamentally parse_object
     doesn't know what the caller is interested in.

  2. This means that parse_commit and parse_object behave differently in
     this regard. The former will leave the buffer unfilled. Meaning we
     may still have issues with code paths that munge the buffer without
     resetting the parsed flag, independent of ccdc603 and fixing this.

To me, these highlight that our commit->buffer management is fragile and
is largely about guessing in various circumstances whether somebody will
later want the buffer. I'm not sure of the right solution, though. It
seems like something that inherently blurs the lines between bits of
code (e.g., how should "log" know that a submodule diff might later want
to see the same entry? Should we optimistically free and then make it
easier for the later user to reliably ensure the buffer is primed? Or
should we err on the side of keeping it in place?).

-Peff

[1] http://article.gmane.org/gmane.comp.version-control.git/188000

^ permalink raw reply

* Re: [PATCH] parse_object: clear "parsed" when freeing buffers
From: Jonathon Mah @ 2013-01-24 23:27 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git
In-Reply-To: <20130124070715.GD610@sigill.intra.peff.net>

On 2013-01-23, at 23:07, Jeff King <peff@peff.net> wrote:

> On Wed, Jan 23, 2013 at 01:25:04PM -0800, Jonathon Mah wrote:
> 
>> Several areas of code would free buffers for object structs that
>> contained them ("struct tree" and "struct commit"), but without clearing
>> the "parsed" flag. parse_object would clear the flag for "struct tree",
>> but commits would remain in an invalid state (marked as parsed but with
>> a NULL buffer). Because the objects are uniqued (ccdc6037fee), the
>> invalid objects stay around and can lead to bad behavior.
>> 
>> In particular, running pickaxe on all refs in a repository with a cached
>> textconv could segfault. If the textconv cache ref was evaluated first
>> by cmd_log_walk, a subsequent notes_cache_match_validity call would
>> dereference NULL.
> 
> Just to be sure I understand, what is going on is something like this?
> 
> Log reads all refs, including notes. After showing the notes commit,
> log frees the buffer to save memory.  Later, doing the diff on a
> commit causes us to use the notes_cache mechanism. That will look at
> the commit at the tip of the notes ref, which log has already
> processed. It will call parse_commit, but that will do nothing, as the
> parsed flag is already set, but the commit buffer remains NULL.
> 
> I wonder if this could be the source of the segfault here, too:
> 
> http://thread.gmane.org/gmane.comp.version-control.git/214322/focus=214355

Indeed, that description matches what I see. The other segfault seems to be in the same place too.

The segfault hits with the following stack (optimization off):

0   git-log         parse_commit_header + 39 (pretty.c:738)
1   git-log         format_commit_one + 3102 (pretty.c:1138)
2   git-log         format_commit_item + 177 (pretty.c:1203)
3   git-log         strbuf_expand + 172 (strbuf.c:247)
4   git-log         format_commit_message + 196 (pretty.c:1263)
5   git-log         notes_cache_match_validity + 215 (notes-cache.c:22)
6   git-log         notes_cache_init + 212 (notes-cache.c:41)
7   git-log         userdiff_get_textconv + 176 (userdiff.c:301)
8   git-log         get_textconv + 66 (diff.c:2233)
9   git-log         has_changes + 53 (diffcore-pickaxe.c:205)
10  git-log         pickaxe + 299 (diffcore-pickaxe.c:40)
11  git-log         diffcore_pickaxe_count + 344 (diffcore-pickaxe.c:275)
12  git-log         diffcore_pickaxe + 58 (diffcore-pickaxe.c:289)
13  git-log         diffcore_std + 162 (diff.c:4673)
14  git-log         log_tree_diff_flush + 43 (log-tree.c:721)
15  git-log         log_tree_diff + 566 (log-tree.c:826)
16  git-log         log_tree_commit + 63 (log-tree.c:847)
17  git-log         cmd_log_walk + 172 (log.c:301)
18  git-log         cmd_log + 186 (log.c:568)
19  git-log         run_builtin + 475 (git.c:273)
20  git-log         handle_internal_command + 199 (git.c:434)
21  git-log         main + 149 (git.c:523)
22  libdyld.dylib   start + 1

commit->message was freed and nulled earlier in a call to cmd_log_walk. This test reproduces it reliably on my machine:

diff --git a/t/t4042-diff-textconv-caching.sh b/t/t4042-diff-textconv-caching.sh
index 91f8198..d7e26ca 100755
--- a/t/t4042-diff-textconv-caching.sh
+++ b/t/t4042-diff-textconv-caching.sh
@@ -106,4 +106,15 @@ test_expect_success 'switching diff driver produces correct results' '
	test_cmp expect actual
'

+cat >expect <<EOF
+./helper other (refs/notes/textconv/magic)
+one
+EOF
+# add empty commit on master to make bug more reproducible
+test_expect_success 'pickaxe with cached textconv' '
+	git commit --allow-empty -m another &&
+	git log -S other --pretty=tformat:%s%d --all >actual &&
+	test_cmp expect actual
+'
+
test_done




Jonathon Mah
me@JonathonMah.com

^ permalink raw reply related

* Re: segmentation fault (nullpointer) with git log --submodule -p
From: Junio C Hamano @ 2013-01-24 23:56 UTC (permalink / raw)
  To: Jeff King
  Cc: Duy Nguyen, Stefan Näwe, Armin, Jonathon Mah,
	git@vger.kernel.org
In-Reply-To: <20130124232721.GA16036@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> ... (e.g., how should "log" know that a submodule diff might later want
> to see the same entry? Should we optimistically free and then make it
> easier for the later user to reliably ensure the buffer is primed? Or
> should we err on the side of keeping it in place?).

My knee-jerk reaction is that we should consider that commit->buffer
belongs to the revision traversal machinery.  Any other uses bolted
on later can borrow it if buffer still exists (I do not think pretty
code rewrites the buffer contents in place in any way), or they can
ask read_sha1_file() to read it themselves and free when they are
done.

^ permalink raw reply

* Re: segmentation fault (nullpointer) with git log --submodule -p
From: Jeff King @ 2013-01-25  0:55 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Duy Nguyen, Stefan Näwe, Armin, Jonathon Mah,
	git@vger.kernel.org
In-Reply-To: <7va9ry87a0.fsf@alter.siamese.dyndns.org>

On Thu, Jan 24, 2013 at 03:56:23PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > ... (e.g., how should "log" know that a submodule diff might later want
> > to see the same entry? Should we optimistically free and then make it
> > easier for the later user to reliably ensure the buffer is primed? Or
> > should we err on the side of keeping it in place?).
> 
> My knee-jerk reaction is that we should consider that commit->buffer
> belongs to the revision traversal machinery.  Any other uses bolted
> on later can borrow it if buffer still exists (I do not think pretty
> code rewrites the buffer contents in place in any way), or they can
> ask read_sha1_file() to read it themselves and free when they are
> done.

Yeah, that is probably the sanest way forward. It at least leaves
ownership in one place, and everybody else is an opportunistic consumer.
We do need to annotate each user site though with something like the
"ensure" function I mentioned.

If they are not owners, then the better pattern is probably something
like:

  /*
   * Get the commit buffer, either opportunistically using
   * the cached version attached to the commit object, or loading it
   * from disk if necessary.
   */
  const char *use_commit_buffer(struct commit *c)
  {
          enum object_type type;
          unsigned long size;

          if (c->buffer)
                  return c->buffer;
          /*
           * XXX check type == OBJ_COMMIT?
           * XXX die() on NULL as a convenience to callers?
           */
          return read_sha1_file(c->object.sha1, &type, &size);
  }

  void unuse_commit_buffer(const char *buf, struct commit *c)
  {
          /*
           * If we used the cached copy attached to the commit,
           * we don't want to free it; it's not our responsibility.
           */
          if (buf == c->buffer)
                  return;

          free((char *)buf);
  }

I suspect that putting a use/unuse pair inside format_commit_message
would handle most cases.

-Peff

^ permalink raw reply

* Re: [regression] Re: [PATCHv2 10/15] drop length limitations on gecos-derived names and emails
From: Jeff King @ 2013-01-25  1:05 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Junio C Hamano, Angus Hammond, git, Mihai Rusu
In-Reply-To: <20130124232146.GA17458@google.com>

On Thu, Jan 24, 2013 at 03:21:46PM -0800, Jonathan Nieder wrote:

> This broke /etc/mailname handling.  Before:
> 
> 	$ git var GIT_COMMITTER_IDENT
> 	Jonathan Nieder <jrn@mailname.example.com> 1359069165 -0800
> 
> After:
> 
> 	$ git var GIT_COMMITTER_IDENT
> 	Jonathan Nieder <mailname.example.com> 1359069192 -0800

Ick. I wonder how that slipped through...I know I was testing with
/etc/mailname when developing the series, because I'm on a Debian
system. We do even check this code path in t7502 (if you have the
AUTOIDENT prereq), but of course we can't verify the actual value
automatically, because it could be anything. So I guess I just missed it
during my manual testing, and the automated testing is insufficient to
catch this particular breakage.

> > -	if (!fgets(buf, len, mailname)) {
> > +	if (strbuf_getline(buf, mailname, '\n') == EOF) {
> 
> This clears the strbuf.

Right. Definitely the problem.

> How about something like this as a quick fix?
> 
> Reported-by: Mihai Rusu <dizzy@google.com>
> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
> 
> diff --git a/ident.c b/ident.c
> index 73a06a1..cabd73f 100644
> --- a/ident.c
> +++ b/ident.c
> @@ -41,6 +41,7 @@ static void copy_gecos(const struct passwd *w, struct strbuf *name)
>  static int add_mailname_host(struct strbuf *buf)
>  {
>  	FILE *mailname;
> +	struct strbuf mailnamebuf = STRBUF_INIT;
>  
>  	mailname = fopen("/etc/mailname", "r");
>  	if (!mailname) {
> @@ -49,14 +50,17 @@ static int add_mailname_host(struct strbuf *buf)
>  				strerror(errno));
>  		return -1;
>  	}
> -	if (strbuf_getline(buf, mailname, '\n') == EOF) {
> +	if (strbuf_getline(&mailnamebuf, mailname, '\n') == EOF) {
>  		if (ferror(mailname))
>  			warning("cannot read /etc/mailname: %s",
>  				strerror(errno));
> +		strbuf_release(&mailnamebuf);
>  		fclose(mailname);
>  		return -1;
>  	}
>  	/* success! */
> +	strbuf_addbuf(buf, &mailnamebuf);
> +	strbuf_release(&mailnamebuf);
>  	fclose(mailname);
>  	return 0;
>  }

I think that is the only reasonable fix. Thanks for figuring it out.

We could expand the test in t7502 to check for "@" in the email, but it
feels weirdly specific to this bug. Either way,

Acked-by: Jeff King <peff@peff.net>

(with a proper commit message, of course).

-Peff

^ permalink raw reply

* Re: [PATCH] t9902: protect test from stray build artifacts
From: Jeff King @ 2013-01-25  1:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jean-Noël AVILA, git
In-Reply-To: <7vehha89j5.fsf_-_@alter.siamese.dyndns.org>

On Thu, Jan 24, 2013 at 03:07:42PM -0800, Junio C Hamano wrote:

> When you have random build artifacts in your build directory, left
> behind by running "make" while on another branch, the "git help -a"
> command run by __git_list_all_commands in the completion script that
> is being tested does not have a way to know that they are not part
> of the subcommands this build will ship.  Such extra subcommands may
> come from the user's $PATH.  They will interfere with the tests that
> expect a certain prefix to uniquely expand to a known completion.
> 
> Instrument the completion script and give it a way for us to tell
> what (subset of) subcommands we are going to ship.
> 
> Also add a test to "git --help <prefix><TAB>" expansion.  It needs
> to show not just commands but some selected documentation pages.
> 
> Based on an idea by Jeff King.
> 
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  contrib/completion/git-completion.bash | 11 ++++++++++-
>  t/t9902-completion.sh                  | 25 ++++++++++++++++++++++++-
>  2 files changed, 34 insertions(+), 2 deletions(-)

This looks good to me.

The only thing I might add is a test just to double-check that "git help
-a" is parsed correctly. Like:

  test_expect_success 'command completion works without test harness' '
           GIT_TESTING_COMMAND_COMPLETION= run_completion "git bun" &&
           grep "^bundle\$" out
  '

(we know we are running bash here, so the one-shot variable is OK to be
used with a function).

-Peff

^ permalink raw reply

* Re: [PATCH 1/7] sha1_file: keep track of where an SHA-1 object comes from
From: Duy Nguyen @ 2013-01-25  1:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jens Lehmann
In-Reply-To: <7va9rycw4t.fsf@alter.siamese.dyndns.org>

On Fri, Jan 25, 2013 at 12:45 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>  How about this way instead: we keep track of where objects come from
>>  so we can verify object source when we create or update something
>>  that contains SHA-1.
>
> The overall approach taken by this series may be worth considering, but
> I do not think the check implemented here is correct.
>
> An object may be found in an alternate odb but we may also have our
> own copy of the same object.  You need to prove that a suspicious
> object is visible to us *ONLY* through add_submodule_odb().

The way alt odbs are linked (new odbs area always at the end), if we
have the same copy, their copy will never be read (we check out alt
odbs from head to tail). So those duplicate suspicious objects are
actually invisible to us.

> Once you do add_submodule_odb() to contaminate our object pool, you
> make everything a suspicious object that needs to be checked; that
> is the worst part of the story.

And because we never really touch their alt copy, the returned alt
source is "ours", trusted and not checked. The check should only occur
on objects that we do not have.
-- 
Duy

^ permalink raw reply

* Re: [PATCH v3 01/10] wildmatch: fix "**" special case
From: Duy Nguyen @ 2013-01-25  1:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7v1udacugj.fsf@alter.siamese.dyndns.org>

On Fri, Jan 25, 2013 at 1:22 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Duy Nguyen <pclouds@gmail.com> writes:
>
>> I don't think we need to support two different sets of wildcards in
>> the long run. I'm thinking of adding ":(glob)" with WM_PATHNAME.
>> Pathspec without :(glob) still uses the current wildcards (i.e. no
>> FNM_PATHNAME). At some point, like 2.0, we either switch the behavior
>> of patterns-without-:(glob) to WM_PATHNAME, or just disable wildcards
>> when :(glob) is not present.
>
> Yeah, I think that is sensible.
>
> I am meaning to merge your retire-fnmatch topic to 'master'.

I thought you wanted it to stay in 'next' longer :-)

> What should the Release Notes say for the upcoming release?
>
>     You can build with USE_WILDMATCH=YesPlease to use a replacement
>     implementation of pattern matching logic used for pathname-like
>     things, e.g.  refnames and paths in the repository.  The new
>     implementation is not expected change the existing behaviour of
>     Git at all, except for "git for-each-ref" where you can now say
>     "refs/**/master" and match with both refs/heads/master and
>     refs/remotes/origin/master.  In future versions of Git, we plan
>     to use this new implementation in wider places (e.g. "git log
>     '**/t*.sh'" may find commits that touch a shell script whose
>     name begins with "t" at any level), but we are not there yet.
>     By building with USE_WILDMATCH, using the resulting Git daily
>     and reporting when you find breakages, you can help us get
>     closer to that goal.

That looks ok. You may want to mention that "**" syntax is enabled in
.gitignore and .gitattributes as well (even without USE_WILDMATCH). We
could even stop the behavior change in for-each-ref (FNM_PATHNAME in
general) but I guess because it's a nice regression, nobody would
complain.
-- 
Duy

^ permalink raw reply

* Re: [PATCH 6/7] read-cache: refuse to create index referring to external objects
From: Duy Nguyen @ 2013-01-25  2:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jens Lehmann
In-Reply-To: <7vpq0ubdec.fsf@alter.siamese.dyndns.org>

On Fri, Jan 25, 2013 at 2:15 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:
>
>> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
>> ---
>>  read-cache.c | 20 ++++++++++++++++++++
>>  1 file changed, 20 insertions(+)
>>
>> diff --git a/read-cache.c b/read-cache.c
>> index fda78bc..4579215 100644
>> --- a/read-cache.c
>> +++ b/read-cache.c
>> @@ -1720,6 +1720,26 @@ static int ce_write_entry(git_SHA_CTX *c, int fd, struct cache_entry *ce,
>>                             ce->name + common, ce_namelen(ce) - common);
>>       }
>>
>> +     if (object_database_contaminated) {
>> +             struct object_info oi;
>> +             switch (ce->ce_mode) {
>> +             case S_IFGITLINK:
>> +                     break;
>> +             case S_IFDIR:
>> +                     if (sha1_object_info_extended(ce->sha1, &oi) != OBJ_TREE ||
>
> This case should never happen.  Do we store any tree object in the
> index in the first place?

No it was copy/paste mistake (from cache-tree.c, before I deleted it
and added check_sha1_file_for_external_source in 3/7)

>> +                         (oi.alt && oi.alt->external))
>> +                             die("cannot create index referring to an external tree %s",
>> +                                 sha1_to_hex(ce->sha1));
>> +                     break;
>> +             default:
>> +                     if (sha1_object_info_extended(ce->sha1, &oi) != OBJ_BLOB ||
>> +                                 (oi.alt && oi.alt->external))
>> +                             die("cannot create index referring to an external blob %s",
>> +                                 sha1_to_hex(ce->sha1));
>> +                     break;
>> +             }
>> +     }
>> +
>>       result = ce_write(c, fd, ondisk, size);
>>       free(ondisk);
>>       return result;
>
> I think the check you want to add is to the cache-tree code; before
> writing the new index out, if you suspect you might have called
> cache_tree_update() before, invalidate any hierarchy that is
> recorded to produce a tree object that refers to an object that is
> only available through external object store.

cache-tree is covered by check_sha1_file_for_external_source() when it
actually writes a tree (dry-run mode goes through unchecked). Even
when cache-tree is not involved, I do not want the index to point to
an non-existing SHA-1 ("git diff --cached" may fail next time, for
example).

> As to the logic to check if a object is something that we should
> refuse to create a new dependent, I think you should not butcher
> sha1_object_info_extended(), and instead add a new call that checks
> if a given SHA-1 yields an object if you ignore alternate object
> databases that are marked as "external", whose signature may
> resemble:
>
>         int has_sha1_file_proper(const unsigned char *sha1);
>
> or something.

Agreed.

> This is a tangent, but in addition, you may want to add an even
> narrower variant that checks the same but ignoring _all_ alternate
> object databases, "external" or not:
>
>         int has_sha1_file_local(const unsigned char *sha1);
>
> That may be useful if we want to later make the alternate safer to
> use; instead of letting the codepath to create an object ask
> has_sha1_file() to see if it already exists and refrain from writing
> a new copy, we switch to ask has_sha1_file_locally() and even if an
> alternate object database we borrow from has it, we keep our own
> copy in our repository.
>
> Not tested beyond syntax, but rebasing something like this patch on
> top of your "mark external sources" change may take us closer to
> that.

Thanks, will incorporate in the reroll.
-- 
Duy

^ permalink raw reply

* Re: segmentation fault (nullpointer) with git log --submodule -p
From: Duy Nguyen @ 2013-01-25  2:05 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Stefan Näwe, Armin, Jonathon Mah,
	git@vger.kernel.org
In-Reply-To: <20130125005528.GA27325@sigill.intra.peff.net>

On Fri, Jan 25, 2013 at 7:55 AM, Jeff King <peff@peff.net> wrote:
> On Thu, Jan 24, 2013 at 03:56:23PM -0800, Junio C Hamano wrote:
>
>> Jeff King <peff@peff.net> writes:
>>
>> > ... (e.g., how should "log" know that a submodule diff might later want
>> > to see the same entry? Should we optimistically free and then make it
>> > easier for the later user to reliably ensure the buffer is primed? Or
>> > should we err on the side of keeping it in place?).
>>
>> My knee-jerk reaction is that we should consider that commit->buffer
>> belongs to the revision traversal machinery.  Any other uses bolted
>> on later can borrow it if buffer still exists (I do not think pretty
>> code rewrites the buffer contents in place in any way), or they can
>> ask read_sha1_file() to read it themselves and free when they are
>> done.
>
> Yeah, that is probably the sanest way forward. It at least leaves
> ownership in one place, and everybody else is an opportunistic consumer.
> We do need to annotate each user site though with something like the
> "ensure" function I mentioned.
>
> If they are not owners, then the better pattern is probably something
> like:

You probably should rename "buffer" (to _buffer or something) and let
the compiler catches all the places commit->buffer illegally used.

>
>   /*
>    * Get the commit buffer, either opportunistically using
>    * the cached version attached to the commit object, or loading it
>    * from disk if necessary.
>    */
>   const char *use_commit_buffer(struct commit *c)
>   {
>           enum object_type type;
>           unsigned long size;
>
>           if (c->buffer)
>                   return c->buffer;
>           /*
>            * XXX check type == OBJ_COMMIT?
>            * XXX die() on NULL as a convenience to callers?
>            */
>           return read_sha1_file(c->object.sha1, &type, &size);
>   }
>
>   void unuse_commit_buffer(const char *buf, struct commit *c)
>   {
>           /*
>            * If we used the cached copy attached to the commit,
>            * we don't want to free it; it's not our responsibility.
>            */
>           if (buf == c->buffer)
>                   return;
>
>           free((char *)buf);
>   }
>
> I suspect that putting a use/unuse pair inside format_commit_message
> would handle most cases.
>
> -Peff
-- 
Duy

^ permalink raw reply

* Re: [PATCH] t9902: protect test from stray build artifacts
From: Jonathan Nieder @ 2013-01-25  2:59 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Jean-Noël AVILA, git, Daniel Baumann
In-Reply-To: <20130125011349.GB27657@sigill.intra.peff.net>

Jeff King wrote:
> On Thu, Jan 24, 2013 at 03:07:42PM -0800, Junio C Hamano wrote[1]:

>> Instrument the completion script and give it a way for us to tell
>> what (subset of) subcommands we are going to ship.
[...]
> The only thing I might add is a test just to double-check that "git help
> -a" is parsed correctly. Like:
>
>   test_expect_success 'command completion works without test harness' '
>            GIT_TESTING_COMMAND_COMPLETION= run_completion "git bun" &&
>            grep "^bundle\$" out
>   '

Yes.  Since there are no other 'git help -a' tests, I think we need
this.

Aside from that, the fix looks good to me.

Jonathan

[1] http://thread.gmane.org/gmane.comp.version-control.git/214167/focus=214469

^ permalink raw reply

* [PATCH] Update renamed file merge-file.h in Makefile
From: Jiang Xin @ 2013-01-25  3:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List, Jiang Xin

Commit v1.8.1-rc0-3-gfa2364e renamed merge-file.h to merge-blobs.h, but
forgot to update the reference of merge-file.h in Makefile. This would
break the build of "po/git.pot", which depends on $(LOCALIZED_C), then
fallback to the missing file "merge-file.h".

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
---
 Makefile |    2 +-
 1 个文件被修改,插入 1 行(+),删除 1 行(-)

diff --git a/Makefile b/Makefile
index 1b30d7b..a786d4c 100644
--- a/Makefile
+++ b/Makefile
@@ -649,7 +649,7 @@ LIB_H += list-objects.h
 LIB_H += ll-merge.h
 LIB_H += log-tree.h
 LIB_H += mailmap.h
-LIB_H += merge-file.h
+LIB_H += merge-blobs.h
 LIB_H += merge-recursive.h
 LIB_H += mergesort.h
 LIB_H += notes-cache.h
-- 
1.8.1.1

^ permalink raw reply related

* Re: [PATCH 1/7] sha1_file: keep track of where an SHA-1 object comes from
From: Junio C Hamano @ 2013-01-25  3:58 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: git, Jens Lehmann
In-Reply-To: <CACsJy8Ag6v7wUnupRwGid26AUzgZ=WbdA5F-MpjUv5ktaj5Asg@mail.gmail.com>

Duy Nguyen <pclouds@gmail.com> writes:

> On Fri, Jan 25, 2013 at 12:45 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>>  How about this way instead: we keep track of where objects come from
>>>  so we can verify object source when we create or update something
>>>  that contains SHA-1.
>>
>> The overall approach taken by this series may be worth considering, but
>> I do not think the check implemented here is correct.
>>
>> An object may be found in an alternate odb but we may also have our
>> own copy of the same object.  You need to prove that a suspicious
>> object is visible to us *ONLY* through add_submodule_odb().
>
> The way alt odbs are linked (new odbs area always at the end), if we
> have the same copy, their copy will never be read (we check out alt
> odbs from head to tail). So those duplicate suspicious objects are
> actually invisible to us.

The way I read find_pack_entry() is that our heuristics to start
our search by looking at the pack in which we found an object the
last time will invalidate your assumption above.  Am I mistaken?

^ permalink raw reply

* Re: segmentation fault (nullpointer) with git log --submodule -p
From: Junio C Hamano @ 2013-01-25  3:59 UTC (permalink / raw)
  To: Jeff King
  Cc: Duy Nguyen, Stefan Näwe, Armin, Jonathon Mah,
	git@vger.kernel.org
In-Reply-To: <7va9ry87a0.fsf@alter.siamese.dyndns.org>

Junio C Hamano <gitster@pobox.com> writes:

> Jeff King <peff@peff.net> writes:
>
>> ... (e.g., how should "log" know that a submodule diff might later want
>> to see the same entry? Should we optimistically free and then make it
>> easier for the later user to reliably ensure the buffer is primed? Or
>> should we err on the side of keeping it in place?).
>
> My knee-jerk reaction is that we should consider that commit->buffer
> belongs to the revision traversal machinery.  Any other uses bolted
> on later can borrow it if buffer still exists (I do not think pretty
> code rewrites the buffer contents in place in any way), or they can
> ask read_sha1_file() to read it themselves and free when they are
> done.

I've been toying with an idea along this line.

 commit.h        | 16 ++++++++++++++++
 builtin/blame.c | 27 ++++++++-------------------
 commit.c        | 20 ++++++++++++++++++++
 3 files changed, 44 insertions(+), 19 deletions(-)

diff --git a/commit.h b/commit.h
index c16c8a7..b559535 100644
--- a/commit.h
+++ b/commit.h
@@ -226,4 +226,20 @@ extern void print_commit_list(struct commit_list *list,
 			      const char *format_cur,
 			      const char *format_last);
 
+extern int ensure_commit_buffer(struct commit *);
+extern void discard_commit_buffer(struct commit *);
+
+#define with_commit_buffer(commit) \
+	do { \
+		int had_buffer_ = !!commit->buffer; \
+		if (!had_buffer_) \
+			ensure_commit_buffer(commit); \
+		do
+
+#define done_with_commit_buffer(commit) \
+		while (0); \
+		if (!had_buffer_) \
+			discard_commit_buffer(commit); \
+	} while (0)
+
 #endif /* COMMIT_H */
diff --git a/builtin/blame.c b/builtin/blame.c
index b431ba3..8b2e4a5 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -1424,25 +1424,14 @@ static void get_commit_info(struct commit *commit,
 
 	commit_info_init(ret);
 
-	/*
-	 * We've operated without save_commit_buffer, so
-	 * we now need to populate them for output.
-	 */
-	if (!commit->buffer) {
-		enum object_type type;
-		unsigned long size;
-		commit->buffer =
-			read_sha1_file(commit->object.sha1, &type, &size);
-		if (!commit->buffer)
-			die("Cannot read commit %s",
-			    sha1_to_hex(commit->object.sha1));
-	}
-	encoding = get_log_output_encoding();
-	reencoded = logmsg_reencode(commit, encoding);
-	message   = reencoded ? reencoded : commit->buffer;
-	get_ac_line(message, "\nauthor ",
-		    &ret->author, &ret->author_mail,
-		    &ret->author_time, &ret->author_tz);
+	with_commit_buffer(commit) {
+		encoding = get_log_output_encoding();
+		reencoded = logmsg_reencode(commit, encoding);
+		message   = reencoded ? reencoded : commit->buffer;
+		get_ac_line(message, "\nauthor ",
+			    &ret->author, &ret->author_mail,
+			    &ret->author_time, &ret->author_tz);
+	} done_with_commit_buffer(commit);
 
 	if (!detailed) {
 		free(reencoded);
diff --git a/commit.c b/commit.c
index e8eb0ae..a627eea 100644
--- a/commit.c
+++ b/commit.c
@@ -1357,3 +1357,23 @@ void print_commit_list(struct commit_list *list,
 		printf(format, sha1_to_hex(list->item->object.sha1));
 	}
 }
+
+int ensure_commit_buffer(struct commit *commit)
+{
+	enum object_type type;
+	unsigned long size;
+
+	if (commit->buffer)
+		return 0;
+	commit->buffer = read_sha1_file(commit->object.sha1, &type, &size);
+	if (commit->buffer)
+		return -1;
+	else
+		return 0;
+}
+
+void discard_commit_buffer(struct commit *commit)
+{
+	free(commit->buffer);
+	commit->buffer = NULL;
+}

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox