Git development
 help / color / mirror / Atom feed
* Re: [PATCH] Replace git-cvsimport with a rewrite that fixes major bugs.
From: Michael Haggerty @ 2013-01-03 16:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Antoine Pelisse, Chris Rorvick, Eric Raymond, git
In-Reply-To: <7vip7expd8.fsf@alter.siamese.dyndns.org>

On 01/03/2013 04:22 PM, Junio C Hamano wrote:
> Antoine Pelisse <apelisse@gmail.com> writes:
> 
>>> Doesn't Python come with a standard subprocess module that lets you
>>> spawn external programs safely, similar to the way Perl's list form
>>> open(), e.g. "open($fh, "-|", 'git', @args)", works?
> 
> ... and of course a more boring "system('git', $subcmd, @args)", as well.

Python's os.system() takes exactly one argument, which must be a string,
and executes it in a subshell.  subprocess is indeed the way to go.

>> You mean something like this:
>>
>>   p1 = subprocess.Popen([backend.command()], stdout=subprocess.PIPE)
>>   subprocess.Popen(["git", "fast-import", "--quiet"] + gitopts,
>> cwd=outdir, stdin=p1.stdout)
>>
>> Assuming gitopts is a list rather than a string. (care must be taken
>> with backend.command() also)
> 
> Yes.
> 
> I vaguely recall that the subprocess module once used to be one
> portability issue but that was between Python 2.3 and 2.4 or some
> ancient history, and it should no longer be relevant.

subprocess was added in Python 2.4, and the above example should work
fine in any version >= 2.4.  But please note that other functions have
been added to the module since then, like check_call() (v2.5),
check_output (v2.7), and some methods were added to the Popen object in
v2.6.

Such things are documented pretty reliably in the Python library
documentation [1]; when in doubt, one can view older versions of the
library documentation, which are all available online [2].

Michael

[1] http://docs.python.org/2/library/
[2] http://www.python.org/doc/versions/

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply

* Re: [RFH] NetBSD 6?
From: Junio C Hamano @ 2013-01-03 16:17 UTC (permalink / raw)
  To: Greg Troxel; +Cc: git
In-Reply-To: <rmi8v8av05d.fsf@fnord.ir.bbn.com>

Greg Troxel <gdt@ir.bbn.com> writes:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> Greg Troxel <gdt@ir.bbn.com> writes:
>>
>>> I realize a README.foo file for N different systems could be clutter,
>>> but having these checked in would provide the concise help that people
>>> on any of those platforms need.
>>
>> Our Makefile documents knobs people on various platforms can tweak
>> (PYTHON_PATH and OLD_ICONV are two examples of them), sets default
>> values for them based on $(uname -S), then includes config.mak file
>> the user can optionally create to override these platform defaults.
>> This infrastructure is used across platforms, not just for NetBSD.
>
> If you have to choose a single PYTHON_PATH, the one you picked is right
> (for now, and likely will be right for a long time).
>
> But as a general rule, I think configure tests are preferable to
> OS-specific variables.

I forgot to mention that we also ship configure (and keep track of
configure.ac) so that optionally people can let autoconf machinery
to create config.mak.autogen to be included at the same place as
handcrafted config.mak in their build process.  I do not offhand
know if we do "for p in python python2.6 python2.7; do ..." kind of
thing, though.

> A large number of people on NetBSD use git, but almost all of them get
> it via pkgsrc (which is also where they get perl, emacs, svn, and
> everything else you didn't find in /usr/bin).  The exception would be
> people that want to hack on git itself.

Yeah, that much I figured ;-)

> People who want gnu libiconv can install the libiconv package from
> pkgsrc.  (I'm guessing OLD_ICONV means "POSIX iconv", without GNU
> extensions (iconvctl?).)

It refers to the type of the second parameter to iconv(); OLD_ICONV
makes it take "const char *", as opposed to "char *", the latter of
which matches

  http://pubs.opengroup.org/onlinepubs/9699919799/functions/iconv.html

> So, it doesn't matter too much for pkgsrc what you change, because it
> can be patched anyway (once, for all users).

Yes.  The values in the Makefile are fallback defaults, and matters
only to people who build from the source.

^ permalink raw reply

* [PATCH v3 5/5] format-patch: pick up branch description when no ref is specified
From: Nguyễn Thái Ngọc Duy @ 2013-01-03 16:16 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy
In-Reply-To: <7v7gnuxo4q.fsf@alter.siamese.dyndns.org>

We only try to get branch name in "format-patch origin" case or
similar and not "format-patch -22" where HEAD is automatically
added. Without correct branch name, branch description cannot be
added. Make sure we always get branch name.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 On Thu, Jan 3, 2013 at 10:49 PM, Junio C Hamano <gitster@pobox.com> wrote:
 > You could (and I think you should) do something like this:
 >
 >         if (0 <= positive) {
 >                 ref = rev->cmdline.rev[positive].name;
 >                 tip_sha1 = rev->cmdline.rev[positive].item->sha1;
 >         } else if (... defaulted to implied HEAD? ...) {
 >                 ref = "HEAD";
 >                 tip_sha1 = rev->pending.objects[0].item->sha1;
 >         } else {
 >                 return NULL;
 >         }
 >
 >         if (dwim_ref(...) && !prefixcmp(full_ref, "refs/heads/") &&
 >             !hashcmp(tip_sha1, branch_sha1))
 >
 > to preserve that safety instead.

 Good idea. Done.

 builtin/log.c           | 18 +++++++++++++++---
 t/t4014-format-patch.sh |  7 +++++++
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 039bf67..72a368a 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1016,6 +1016,7 @@ static char *find_branch_name(struct rev_info *rev)
 {
 	int i, positive = -1;
 	unsigned char branch_sha1[20];
+	const unsigned char *tip_sha1;
 	const char *ref;
 	char *full_ref, *branch = NULL;
 
@@ -1027,12 +1028,23 @@ static char *find_branch_name(struct rev_info *rev)
 		else
 			return NULL;
 	}
-	if (positive < 0)
+	if (0 <= positive) {
+		ref = rev->cmdline.rev[positive].name;
+		tip_sha1 = rev->cmdline.rev[positive].item->sha1;
+	} else if (!rev->cmdline.nr && rev->pending.nr == 1 &&
+		   !strcmp(rev->pending.objects[0].name, "HEAD")) {
+		/*
+		 * No actual ref from command line, but "HEAD" from
+		 * rev->def was added in setup_revisions()
+		 * e.g. format-patch --cover-letter -12
+		 */
+		ref = "HEAD";
+		tip_sha1 = rev->pending.objects[0].item->sha1;
+	} else
 		return NULL;
-	ref = rev->cmdline.rev[positive].name;
 	if (dwim_ref(ref, strlen(ref), branch_sha1, &full_ref) &&
 	    !prefixcmp(full_ref, "refs/heads/") &&
-	    !hashcmp(rev->cmdline.rev[positive].item->sha1, branch_sha1))
+	    !hashcmp(tip_sha1, branch_sha1))
 		branch = xstrdup(full_ref + strlen("refs/heads/"));
 	free(full_ref);
 	return branch;
diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index 903a797..7e52941 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -998,4 +998,11 @@ test_expect_success 'cover letter using branch description (5)' '
 	grep hello actual >/dev/null
 '
 
+test_expect_success 'cover letter using branch description (6)' '
+	git checkout rebuild-1 &&
+	test_config branch.rebuild-1.description hello &&
+	git format-patch --stdout --cover-letter -2 >actual &&
+	grep hello actual >/dev/null
+'
+
 test_done
-- 
1.8.0.rc2.23.g1fb49df

^ permalink raw reply related

* Re: cvsps, parsecvs, svn2git and the CVS exporter mess
From: Martin Langhoff @ 2013-01-03 15:51 UTC (permalink / raw)
  To: Eric S. Raymond
  Cc: Yann Dirson, Michael Haggerty, Heiko Voigt, Antoine Pelisse,
	Bart Massey, Keith Packard, David Mansfield, Git Mailing List
In-Reply-To: <20121222173649.04C5B44119@snark.thyrsus.com>

On Sat, Dec 22, 2012 at 12:36 PM, Eric S. Raymond <esr@thyrsus.com> wrote:
> It is pure accident that I now maintain two of these.

Maintainership is always temporary.

> Having three different tools for this job seems to me duplicative and
> pointless; two of them should probably be let die an honorable death.

Perhaps just maintain the code that serves your goals. That way, you
don't need long trolly emails nor approval from anyone.




m
--
 martin.langhoff@gmail.com
 martin@laptop.org -- Software Architect - OLPC
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff

^ permalink raw reply

* Re: [PATCH v2 5/5] format-patch: pick up branch description when no ref is specified
From: Junio C Hamano @ 2013-01-03 15:49 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git
In-Reply-To: <1357221791-7496-6-git-send-email-pclouds@gmail.com>

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> We only try to get branch name in "format-patch origin" case or
> similar and not "format-patch -22" where HEAD is automatically
> added. Without correct branch name, branch description cannot be
> added. Make sure we always get branch name.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>  builtin/log.c           | 16 +++++++++++++---
>  t/t4014-format-patch.sh |  7 +++++++
>  2 files changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/builtin/log.c b/builtin/log.c
> index 039bf67..81683f6 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -1027,12 +1027,22 @@ static char *find_branch_name(struct rev_info *rev)
>  		else
>  			return NULL;
>  	}
> -	if (positive < 0)
> +	if (positive >= 0)
> +		ref = rev->cmdline.rev[positive].name;
> +	else if (!rev->cmdline.nr && rev->pending.nr == 1 &&
> +		 !strcmp(rev->pending.objects[0].name, "HEAD"))
> +		/*
> +		 * No actual ref from command line, but "HEAD" from
> +		 * rev->def was added in setup_revisions()
> +		 * e.g. format-patch --cover-letter -12
> +		 */
> +		ref = "HEAD";
> +	else
>  		return NULL;
> -	ref = rev->cmdline.rev[positive].name;
>  	if (dwim_ref(ref, strlen(ref), branch_sha1, &full_ref) &&
>  	    !prefixcmp(full_ref, "refs/heads/") &&
> -	    !hashcmp(rev->cmdline.rev[positive].item->sha1, branch_sha1))
> +	    (positive < 0 ||
> +	     !hashcmp(rev->cmdline.rev[positive].item->sha1, branch_sha1)))

This hashcmp() is to make sure that the tip we _guessed_ with
dwim_ref() that the user meant matches what we are going to use, so
I do not think disabling the check is a good idea.

You could (and I think you should) do something like this:

	if (0 <= positive) {
        	ref = rev->cmdline.rev[positive].name;
                tip_sha1 = rev->cmdline.rev[positive].item->sha1;
	} else if (... defaulted to implied HEAD? ...) {
		ref = "HEAD";
                tip_sha1 = rev->pending.objects[0].item->sha1;
	} else {
		return NULL;
	}

        if (dwim_ref(...) && !prefixcmp(full_ref, "refs/heads/") &&
            !hashcmp(tip_sha1, branch_sha1))

to preserve that safety instead.

^ permalink raw reply

* Re: cvsps, parsecvs, svn2git and the CVS exporter mess
From: Michael Haggerty @ 2013-01-03 15:37 UTC (permalink / raw)
  To: Eric S. Raymond
  Cc: Yann Dirson, Heiko Voigt, Antoine Pelisse, Bart Massey,
	Keith Packard, David Mansfield, git
In-Reply-To: <20121222173649.04C5B44119@snark.thyrsus.com>

On 12/22/2012 06:36 PM, Eric S. Raymond wrote:
> * One is Michael Haggerty's cvs2git.  I had bad experiences with the
> cvs2svn code it's derived from in the past, but Michael believes those
> problems have been fixed and I will accept that - at least until I can
> test for myself.  Its documented interface is not quite good enough
> yet; as the documentation says, "The data that should be fed to git
> fast-import are written to two files, which have to be loaded into git
> fast-import manually."

There are two good reasons that the output is written to two separate files:

1. The files are generated during different passes of cvs2git, and since
the cvs2git conversion is restartable pass-by-pass, the first file might
only need to be generated once even while the user is iterating on
adjustments to other conversion options.

2. The first ("blobfile") contains blob definitions for file revisions,
which are read out of the RCS files in the order they are held in the
RCS file.  This is vastly faster than reading the file revisions in the
order that they are needed for git commits because (1) all revisions for
a file can be computed from one serial read of the RCS file; (2) there
is no need to jump around from rcsfile to rcsfile.  The second
("dumpfile") stitches the blobs together into git commits by referring
to the blobs that are needed.  This file is smaller because it doesn't
contain the actual file contents.  Another advantage of this approach is
that a blob need only appear once in the blobfile even if it is used
multiple times in the git history.

Anyway, surely cat'ing two output files together is not such a difficult
problem?

A potentially bigger problem is that if you want to handle such
blob/dump output, you have to deal with git-fast-import format's "blob"
command as opposed to only handling inline blobs.  However, if that is a
problem, it is possible to configure cvs2git to write the blobs inline
with the rest of the dumpfile (this mode is supported because "hg
fast-import" doesn't support detached blobs).  You would have to create
an options file that uses GitRevisionInlineWriter, similar to what is
done in cvs2hg-example.options.

> [...]
> Having three different tools for this job seems to me duplicative and
> pointless; two of them should probably be let die an honorable death.
> I don't actually care which of the three survives - and, in
> particular, if I determine that cvs2git is doing the best job of the
> three I am quite willing to declare end-of-life for cvsps and
> parsecvs.  It's not like I don't have plenty of other projects to work
> on.

cvs2git does not currently support incremental conversions; therefore, a
cvsps-based option (if it would actually work, that is) would have at
least one advantage over cvs2git.

> I presently know of three test suites other than mine. One was built
> by Heiko to test cvsps, another lives in the git t/ directory, and the
> third is cvs2git's. I haven't looked at cv2git's yet, but the others
> are not in their present form suited to where I am taking cvsps and
> parsecvs.  Heiko's relies on the default human-readable cvsps format,
> which I consider obsolete and uninteresting.  The git tests are
> dependent on details of porcelain behavior.  I think it would be
> better to test import-stream output.

cvs2svn has an extensive test suite which includes tests derived from
bug reports that we have received over the years.  I adapted a few of
its test repositories to create the git test suite additions that I made
in Feb 2009, but there are many more in our project.

A lot of our test suite deals with additional conversion features, like:

* Re-encoding filenames, usernames, and log messages from whatever
happens to have been used in the CVS repository into UTF-8

* Fixing CVS branches, tags, and mixed branch/tag messes according to
user wishes; renaming branches and tags

* Allowing the user to influence the choice of which branch should serve
as the source for another branch/tag (CVS records this information very
ambiguously)

* Fixing binary vs. text files, expanding/contracting CVS keywords, etc.

* Removing lots of synthetic revisions and other cruft generated by CVS
to fit within the RCS file format

* Dealing with vendor branches in a sensible way, especially considering
that very many users misuse vendor branches for initial imports

* Dealing with various common types of CVS repository corruption

See our list of features [1] for more details.  Presumably many of these
features would not be covered by your test framework, and are not
supported by the other conversion tools.

Unfortunately, our tests are mostly based on cvs2svn (i.e., not 2git);
that is, the conversion is done with cvs2svn and checked by verifying
the contents of the resulting Subversion repository.

The script contrib/verify-cvs2svn.py is another kind of test; it checks
every branch and tag out of CVS and the destination repository and
verifies that their contents are identical.  This script is intended to
be used by users to check their own conversion.  Please note that it
doesn't check the history, only the branch/tag tips.  But this script
works with both Subversion and git (at least it should; it probably
doesn't get tested much).

> Here is what I propose.  Let's build a common test suite that cvs2git,
> git-cvsimport, cvsps, and parsecvs can all use, apply it rigorously,
> and let the best tool win.  (This would mean, among other things, that
> git can stop carrying things that are essentially cvsps tests in its
> tree.)

I think it would be great to have a way to test across tools, though
please realize that the inference of the most plausible "true" CVS
history is partly objective but also often a matter of heuristics and
taste.  Moreover, the choice of how to represent the inferred history in
git, which has rather a different model than CVS/Subversion, is also
non-obvious and somewhat controversial.  I expect that there will be a
number of simple CVS repositories for which we can all agree about the
correct git output, but not far away will be a vast number for which the
"correct" answer is unclear.  Many of the interesting tests would fall
into the latter category.

> The two people I most need to sign off on this are, I guess, Michael
> Haggerty and either Junio Hamano or whoever specifically owns
> git-cvsimport and its tests.  [...]

It's not clear what you want me to sign off on.  I guess you want to
replace (or augment?) the cvs2svn test suite with one based on your
framework?  Right off the top of my head I can think of a few
considerations from the point of view of the cvs2svn project:

* We definitely want to continue testing the Subversion output of
cvs2svn.  A test suite that only tests the git output could at best be
an addition to the current test suite, not a replacement for it.  (That
being said, the addition of good tests of the 2git output would be great.)

* A test suite that tests only the easy cases wouldn't really be
interesting, because the difficult cases are where the potential
problems lie.

* It would be unfortunate if the cvs2svn test suite would grow another
run-time dependency or if we would have to invest a lot of time
synchronizing with another project, though if the gain were big enough
we could consider it.

* The licenses obviously have to be compatible to the extent required by
the level of coupling.

* I don't have a lot of time to work on the integration.  cvs2svn has
long been at a level of maturity where it doesn't need much care and
feeding, and I would like to keep it that way :-)  Nowadays I am far
more interested in working on the git project with my little available
open-sourcin' time.


Rereading this email, I realize that it is not clear to me why your new
testing project needs the "signoff" or cooperation from any of the
conversions tool projects (git-cvsimport or cvs2svn or parsecvs or ...)
in the first place.  The essence of your project will be a collection of
CVS test repositories, and code that can read the conversion output
(whether via git or as fast-input data) and verify that it matches
expectations (right?).  Presumably it will have a place where any of the
conversion tools could be plugged into it, and perhaps a bit of code
that knows how to configure and run the best-known tools (and perhaps
even to download and build them).

It would seem natural to me that your project stops there, and stays at
arms-length from the conversion projects.  If your test suite proves
itself to be obviously better than the cvs2svn test suite, then we might
try to integrate it *then* (or not; even then it wouldn't really be
obligatory).

Michael

[1] http://cvs2svn.tigris.org/features.html

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply

* Re: [BUG] two-way read-tree can write null sha1s into index
From: Junio C Hamano @ 2013-01-03 15:34 UTC (permalink / raw)
  To: Jeff King; +Cc: Jonathan Nieder, git
In-Reply-To: <20130103083712.GC32377@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> On Tue, Jan 01, 2013 at 02:24:46PM -0800, Junio C Hamano wrote:
>
>> Jeff King <peff@peff.net> writes:
>> 
>> > So I think we need to update twoway_merge to recognize unmerged entries,
>> > which gives us two options:
>> >
>> >   1. Reject the merge.
>> >
>> >   2. Throw away the current unmerged entry in favor of the "new" entry
>> >      (when old and new are the same, of course; otherwise we would
>> >      reject).
>> >
>> > I think (2) is the right thing.
>> 
>> As "--reset" in "read-tree --reset -u A B" is a way to say "we know
>> we are based on A and we want to go to B no matter what", I agree we
>> should not reject the merge.
>> 
>> With -m instead of --reset, I am not sure what the right semantics
>> should be, though.
>
> Good point; I was just thinking about the --reset case.
>
> With "-m", though, we could in theory carry over the unmerged entries
> (again, assuming that "old" and "new" are the same; otherwise it is an
> obvious reject). But those entries would be confused with any new
> unmerged entries we create. It seems we already protect against this,
> though: "read-tree -m" will not run at all if you have unmerged entries.
>
> Likewise, "checkout" seems to have similar protections.
>
> So I think it may be a non-issue.

Yeah.  Also earlier in the thread you mentioned three-way case, but
I do not think we ever would want --reset with three trees, so I
think that too is a non-issue for the same reason.

I would still feel safer if we expressed the expectation of
the callee in the code, perhaps like this in the two-way case:

	if (current->ce_flags & CE_CONFLICTED) {
        	if (!o->reset) {
                	... either die or fail ...
		} else {
                	... your fix ...
		}
	}

^ permalink raw reply

* Re: [PATCH] Replace git-cvsimport with a rewrite that fixes major bugs.
From: Junio C Hamano @ 2013-01-03 15:22 UTC (permalink / raw)
  To: Antoine Pelisse; +Cc: Chris Rorvick, Eric Raymond, git
In-Reply-To: <CALWbr2xx0beca_LUHO45pGMZ4Y0jZ9-iMWq8WBO6PmW==Ysw=A@mail.gmail.com>

Antoine Pelisse <apelisse@gmail.com> writes:

>> Doesn't Python come with a standard subprocess module that lets you
>> spawn external programs safely, similar to the way Perl's list form
>> open(), e.g. "open($fh, "-|", 'git', @args)", works?

... and of course a more boring "system('git', $subcmd, @args)", as well.

> You mean something like this:
>
>   p1 = subprocess.Popen([backend.command()], stdout=subprocess.PIPE)
>   subprocess.Popen(["git", "fast-import", "--quiet"] + gitopts,
> cwd=outdir, stdin=p1.stdout)
>
> Assuming gitopts is a list rather than a string. (care must be taken
> with backend.command() also)

Yes.

I vaguely recall that the subprocess module once used to be one
portability issue but that was between Python 2.3 and 2.4 or some
ancient history, and it should no longer be relevant.

^ permalink raw reply

* [PATCH v2 5/5] format-patch: pick up branch description when no ref is specified
From: Nguyễn Thái Ngọc Duy @ 2013-01-03 14:03 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy
In-Reply-To: <1357221791-7496-1-git-send-email-pclouds@gmail.com>

We only try to get branch name in "format-patch origin" case or
similar and not "format-patch -22" where HEAD is automatically
added. Without correct branch name, branch description cannot be
added. Make sure we always get branch name.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/log.c           | 16 +++++++++++++---
 t/t4014-format-patch.sh |  7 +++++++
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 039bf67..81683f6 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1027,12 +1027,22 @@ static char *find_branch_name(struct rev_info *rev)
 		else
 			return NULL;
 	}
-	if (positive < 0)
+	if (positive >= 0)
+		ref = rev->cmdline.rev[positive].name;
+	else if (!rev->cmdline.nr && rev->pending.nr == 1 &&
+		 !strcmp(rev->pending.objects[0].name, "HEAD"))
+		/*
+		 * No actual ref from command line, but "HEAD" from
+		 * rev->def was added in setup_revisions()
+		 * e.g. format-patch --cover-letter -12
+		 */
+		ref = "HEAD";
+	else
 		return NULL;
-	ref = rev->cmdline.rev[positive].name;
 	if (dwim_ref(ref, strlen(ref), branch_sha1, &full_ref) &&
 	    !prefixcmp(full_ref, "refs/heads/") &&
-	    !hashcmp(rev->cmdline.rev[positive].item->sha1, branch_sha1))
+	    (positive < 0 ||
+	     !hashcmp(rev->cmdline.rev[positive].item->sha1, branch_sha1)))
 		branch = xstrdup(full_ref + strlen("refs/heads/"));
 	free(full_ref);
 	return branch;
diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index 903a797..7e52941 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -998,4 +998,11 @@ test_expect_success 'cover letter using branch description (5)' '
 	grep hello actual >/dev/null
 '
 
+test_expect_success 'cover letter using branch description (6)' '
+	git checkout rebuild-1 &&
+	test_config branch.rebuild-1.description hello &&
+	git format-patch --stdout --cover-letter -2 >actual &&
+	grep hello actual >/dev/null
+'
+
 test_done
-- 
1.8.0.rc2.23.g1fb49df

^ permalink raw reply related

* [PATCH v2 4/5] format-patch: pick up correct branch name from symbolic ref
From: Nguyễn Thái Ngọc Duy @ 2013-01-03 14:03 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy
In-Reply-To: <1357221791-7496-1-git-send-email-pclouds@gmail.com>

find_branch_name() assumes to take refs/heads/<branch>. But we also
have symbolic refs, such as HEAD, that can point to a valid branch in
refs/heads and do not follow refs/heads/<branch> syntax. Remove the
assumption and apply normal ref resolution. After all it would be
confusing if rev machinery resolves a ref in one way and
find_branch_name() another.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/log.c           | 21 +++++++++------------
 t/t4014-format-patch.sh | 14 ++++++++++++++
 2 files changed, 23 insertions(+), 12 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index e7b7db1..039bf67 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1016,8 +1016,8 @@ static char *find_branch_name(struct rev_info *rev)
 {
 	int i, positive = -1;
 	unsigned char branch_sha1[20];
-	struct strbuf buf = STRBUF_INIT;
-	const char *branch;
+	const char *ref;
+	char *full_ref, *branch = NULL;
 
 	for (i = 0; i < rev->cmdline.nr; i++) {
 		if (rev->cmdline.rev[i].flags & UNINTERESTING)
@@ -1029,16 +1029,13 @@ static char *find_branch_name(struct rev_info *rev)
 	}
 	if (positive < 0)
 		return NULL;
-	strbuf_addf(&buf, "refs/heads/%s", rev->cmdline.rev[positive].name);
-	branch = resolve_ref_unsafe(buf.buf, branch_sha1, 1, NULL);
-	if (!branch ||
-	    prefixcmp(branch, "refs/heads/") ||
-	    hashcmp(rev->cmdline.rev[positive].item->sha1, branch_sha1))
-		branch = NULL;
-	strbuf_release(&buf);
-	if (branch)
-		return xstrdup(rev->cmdline.rev[positive].name);
-	return NULL;
+	ref = rev->cmdline.rev[positive].name;
+	if (dwim_ref(ref, strlen(ref), branch_sha1, &full_ref) &&
+	    !prefixcmp(full_ref, "refs/heads/") &&
+	    !hashcmp(rev->cmdline.rev[positive].item->sha1, branch_sha1))
+		branch = xstrdup(full_ref + strlen("refs/heads/"));
+	free(full_ref);
+	return branch;
 }
 
 int cmd_format_patch(int argc, const char **argv, const char *prefix)
diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index ec9ef9e..903a797 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -984,4 +984,18 @@ test_expect_success 'cover letter using branch description (3)' '
 	grep hello actual >/dev/null
 '
 
+test_expect_success 'cover letter using branch description (4)' '
+	git checkout rebuild-1 &&
+	test_config branch.rebuild-1.description hello &&
+	git format-patch --stdout --cover-letter master.. >actual &&
+	grep hello actual >/dev/null
+'
+
+test_expect_success 'cover letter using branch description (5)' '
+	git checkout rebuild-1 &&
+	test_config branch.rebuild-1.description hello &&
+	git format-patch --stdout --cover-letter -2 HEAD >actual &&
+	grep hello actual >/dev/null
+'
+
 test_done
-- 
1.8.0.rc2.23.g1fb49df

^ permalink raw reply related

* [PATCH v2 3/5] t4014: a few more tests on cover letter using branch description
From: Nguyễn Thái Ngọc Duy @ 2013-01-03 14:03 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy
In-Reply-To: <1357221791-7496-1-git-send-email-pclouds@gmail.com>


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 t/t4014-format-patch.sh | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index 16a4ca1..ec9ef9e 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -963,4 +963,25 @@ test_expect_success 'format patch ignores color.ui' '
 	test_cmp expect actual
 '
 
+test_expect_success 'cover letter using branch description (1)' '
+	git checkout rebuild-1 &&
+	test_config branch.rebuild-1.description hello &&
+	git format-patch --stdout --cover-letter master >actual &&
+	grep hello actual >/dev/null
+'
+
+test_expect_success 'cover letter using branch description (2)' '
+	git checkout rebuild-1 &&
+	test_config branch.rebuild-1.description hello &&
+	git format-patch --stdout --cover-letter rebuild-1~2..rebuild-1 >actual &&
+	grep hello actual >/dev/null
+'
+
+test_expect_success 'cover letter using branch description (3)' '
+	git checkout rebuild-1 &&
+	test_config branch.rebuild-1.description hello &&
+	git format-patch --stdout --cover-letter ^master rebuild-1 >actual &&
+	grep hello actual >/dev/null
+'
+
 test_done
-- 
1.8.0.rc2.23.g1fb49df

^ permalink raw reply related

* [PATCH v2 2/5] branch: delete branch description if it's empty
From: Nguyễn Thái Ngọc Duy @ 2013-01-03 14:03 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy
In-Reply-To: <1357221791-7496-1-git-send-email-pclouds@gmail.com>


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/branch.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/branch.c b/builtin/branch.c
index 1ec9c02..873f624 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -725,7 +725,7 @@ static int edit_branch_description(const char *branch_name)
 	stripspace(&buf, 1);
 
 	strbuf_addf(&name, "branch.%s.description", branch_name);
-	status = git_config_set(name.buf, buf.buf);
+	status = git_config_set(name.buf, buf.len ? buf.buf : NULL);
 	strbuf_release(&name);
 	strbuf_release(&buf);
 
-- 
1.8.0.rc2.23.g1fb49df

^ permalink raw reply related

* [PATCH v2 1/5] config.txt: a few lines about branch.<name>.description
From: Nguyễn Thái Ngọc Duy @ 2013-01-03 14:03 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy
In-Reply-To: <1357221791-7496-1-git-send-email-pclouds@gmail.com>


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/config.txt | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index bf8f911..ee64846 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -735,6 +735,12 @@ branch.<name>.rebase::
 it unless you understand the implications (see linkgit:git-rebase[1]
 for details).
 
+branch.<name>.description::
+	Branch description, can be edited with
+	`git branch --edit-description`. Branch description is
+	automatically added in the format-patch cover letter or
+	request-pull summary.
+
 browser.<tool>.cmd::
 	Specify the command to invoke the specified browser. The
 	specified command is evaluated in shell with the URLs passed
-- 
1.8.0.rc2.23.g1fb49df

^ permalink raw reply related

* [PATCH v2 0/5] nd/maint-branch-desc-doc reroll
From: Nguyễn Thái Ngọc Duy @ 2013-01-03 14:03 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

 - move the comment in 5/5 into the commented code block
 - resolve branch name for "HEAD" in 4/5

Nguyễn Thái Ngọc Duy (5):
  config.txt: a few lines about branch.<name>.description
  branch: delete branch description if it's empty
  t4014: a few more tests on cover letter using branch description
  format-patch: pick up correct branch name from symbolic ref
  format-patch: pick up branch description when no ref is specified

 Documentation/config.txt |  6 ++++++
 builtin/branch.c         |  2 +-
 builtin/log.c            | 33 ++++++++++++++++++++-------------
 t/t4014-format-patch.sh  | 42 ++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 69 insertions(+), 14 deletions(-)

-- 
1.8.0.rc2.23.g1fb49df

^ permalink raw reply

* Re: [RFH] NetBSD 6?
From: Greg Troxel @ 2013-01-03 13:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vd2xnypt6.fsf@alter.siamese.dyndns.org>

[-- Attachment #1: Type: text/plain, Size: 2791 bytes --]


Junio C Hamano <gitster@pobox.com> writes:

> Greg Troxel <gdt@ir.bbn.com> writes:
>
>> I realize a README.foo file for N different systems could be clutter,
>> but having these checked in would provide the concise help that people
>> on any of those platforms need.
>
> Our Makefile documents knobs people on various platforms can tweak
> (PYTHON_PATH and OLD_ICONV are two examples of them), sets default
> values for them based on $(uname -S), then includes config.mak file
> the user can optionally create to override these platform defaults.
> This infrastructure is used across platforms, not just for NetBSD.

If you have to choose a single PYTHON_PATH, the one you picked is right
(for now, and likely will be right for a long time).

But as a general rule, I think configure tests are preferable to
OS-specific variables.

> The part shown in the patch was to update the platform default for
> NetBSD.  The setting we have been shipping in our Makefile seemed to
> be different from what I needed on my NetBSD 6 install, and I was
> wondering if we have no real users of Git on the platorm (which
> would explain why we didn't get any complaints or patches to update
> this part).  Or there are some optional packages all real NetBSD
> users install, but being a NetBSD newbie I didn't, that makes the
> values I showed in the patch inappropriate for them (e.g. Perhaps
> there is a mechanism other than pkgsrc that installs perl and python
> under /usr/bin?  Perhaps an optional libi18n package that gives
> iconv(3) with new function signature?), in which case I definitely
> should not apply that patch to my tree, as it would only be an
> improvement for one person and a regression for existing users at
> the same time.

A large number of people on NetBSD use git, but almost all of them get
it via pkgsrc (which is also where they get perl, emacs, svn, and
everything else you didn't find in /usr/bin).  The exception would be
people that want to hack on git itself.

People who want gnu libiconv can install the libiconv package from
pkgsrc.  (I'm guessing OLD_ICONV means "POSIX iconv", without GNU
extensions (iconvctl?).)  The git package doesn't depend on GNU iconv,
though (perhaps it should but we tend to avoid dependencies that arren't
really needed).

There are no mechanisms to install python/perl in /usr/bin, and doing so
would be viewed as bizarre.  /usr/bin belongs to the base system,
/usr/pkg to pkgsrc and /usr/local to the user.

So, it doesn't matter too much for pkgsrc what you change, because it
can be patched anyway (once, for all users).  It's probably better to
make a straightforward build from source come out right.
But, if the build respects the PYTHON_PATH environment variable, that's
easier (and more robust against changes) than patching.

[-- Attachment #2: Type: application/pgp-signature, Size: 194 bytes --]

^ permalink raw reply

* Fwd: git-gui / Warning: "No newline at end of file”
From: Tobias Preuss @ 2013-01-03 12:26 UTC (permalink / raw)
  To: git
In-Reply-To: <CADEaiE-J4nEwyK4WSiH=zzaH6cb85mw15O3wxrWrTXJtZfJixQ@mail.gmail.com>

Hello. I never got a response. Did my email pass the distribution
list? Best, Tobias


---------- Forwarded message ----------
From: Tobias Preuss <tobias.preuss@googlemail.com>
Date: Tue, Nov 13, 2012 at 9:26 PM
Subject: git-gui / Warning: "No newline at end of file”
To: git <git@vger.kernel.org>


Hello.
I noticed a problem when working with git-gui which might be a bug.
The issue only affects you when you are visually trying to stage
changes line by line. Here are the steps to reproduce the problem:

1. Initialize a new repository.
2. Create a file with three lines of content each with the word
"Hello". Do not put a new line at the end of the file.
3. Add and commit the file.
4. Edit the same file putting words inbetween the three lines.
5. Open git-gui and try to stage the changes line by line.

The editor will append the warning "No newline at end of file” to the
end of the diff. When you are trying to stage a line an error occurs.
The problem is also illustrated in a question on Stackoverflow [1].

Please let me know if you need more information or if I should send
this problem to some other mailing list.
Thank you, Tobias

____________
[1] http://stackoverflow.com/questions/13223868/how-to-stage-line-by-line-in-git-gui-although-no-newline-at-end-of-file-warnin

^ permalink raw reply

* Re: git filter-branch doesn't dereference annotated tags
From: Johannes Sixt @ 2013-01-03 10:33 UTC (permalink / raw)
  To: Grégory Pakosz; +Cc: Junio C Hamano, git
In-Reply-To: <CAC_01E2HXSnXBgDm=Cbwgi5PbiuHp_qPpoaqT_=pdDWDMnC5jA@mail.gmail.com>

Am 03.01.2013 10:50, schrieb Grégory Pakosz:
>>
>> IOW, if the command was something like
>>
>>   git filter-branch ...filter options... -- v1.0 master ...
>>
>> and v1.0 is an annotated tag, then it is reasonable to expect v1.0 to be
>> deleted if the commit it points to goes away. But if the commit did not
>> go away, but was rewritten, then it is equally reasonable to expect that
>> the tag is also rewritten. But I don't think that we currently do the
>> latter.
>>
> When the commit doesn't go away, the tag is currently being rewritten properly.

Indeed, but only if a --tag-name-filter was specified.

>> Therefore, IMO, a change that implements the former behavior should also
>> implement the latter behavior.
>>
> The patch in my latest email does both. (yet lacks unit tests for now)

If it deletes a tag only when --tag-name-filter was specified, than that
should be fine.

-- Hannes

^ permalink raw reply

* Re: [RFD] annnotating a pair of commit objects?
From: Johannes Sixt @ 2013-01-03 10:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vr4m2ycij.fsf@alter.siamese.dyndns.org>

Am 03.01.2013 08:03, schrieb Junio C Hamano:
> The intended use case is to "go beyond rerere".  Given a history of
> this shape:
> 
>     o---o---o---I      mainline
>    / 
>   O---o---X---o---A    topic A
>    \
>     o---Y---o---o---B  topic B
> 
> Suppose in the original O we had a function "distimmed_doshes()" to
> tell if doshes are already distimmed, with some call sites.  On the
> branch leading to A, at commit X, this function was renamed to
> "doshes_are_distimmed()", and all existing call sites were adjusted.
> On the side branch leading to B, however, at commit Y, a new call
> site to the old function was added in a file that was not touched
> between O..A at all.
> 
> When merging either the topic A or the topic B (but not both) to the
> integration branch that did not touch this function or use of it, no
> special care needs to be taken, but when merging the second topic
> after merging the other one, we need to resolve a semantic conflict.
> Namely, the callsite to "distimmed_doshes()" introduced by commit Y
> needs to be adjusted to call "doshes_are_distimmed()" instead.

I guess this issue comes up when you rebuild pu. Perhaps you (and other
integrators with a similar workflow) are sufficiently served with
something that resembles rebase -p --first-parent, as proposed here:

http://thread.gmane.org/gmane.comp.version-control.git/198125/focus=198483

(A proposal of the same idea appeared already years earlier:

http://thread.gmane.org/gmane.comp.version-control.git/62782/focus=62794

but its implementation only re-did the merge, which would not help your
case.)

-- Hannes

^ permalink raw reply

* Re: [RFD] annnotating a pair of commit objects?
From: Michael Haggerty @ 2013-01-03  9:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King
In-Reply-To: <7vr4m2ycij.fsf@alter.siamese.dyndns.org>

On 01/03/2013 08:03 AM, Junio C Hamano wrote:
> I'd like a datastore that maps a pair of commit object names to
> another object name, such that:
> 
>  * When looking at two commits A and B, efficiently query all data
>    associated with a pair of commits <X,Y> where X is contained in
>    the range A..B and not in B..A, and Y is contained in the range
>    B..A and not in A..B.
> 
>  * When <X,Y> is registered in the datastore, and X is rewritten to
>    X' and/or Y is rewritten to Y', the mapping is updated so that it
>    can be queried with <X',Y'> as a new key, similar to the way a
>    notes tree that maps object X can be updated to map object X'
>    when such a rewrite happens.
> 
> The intended use case is to "go beyond rerere".  Given a history of
> this shape:
> 
>     o---o---o---I      mainline
>    / 
>   O---o---X---o---A    topic A
>    \
>     o---Y---o---o---B  topic B

If we ignore rewriting for a moment, the information that you want to
record is essentially the merge M of X and Y, no?  Namely, X and Y
conflict logically with each other (though perhaps not textually) and
you, the human, want to record how to reconcile them:

     o---o---o---I      mainline
    /
   O---o---X---o---A    topic A
    \       \
     \       M
      \     /
       o---Y---o---o---B  topic B

However, you don't necessarily want to go to the trouble to make a
branch to point at M, nor to do the bookkeeping manually that would be
required to take the information stored in M into account when merging A
and B later.

Suppose we had M; how could we make use of it in future merges?

> [...] and can create a merge J without semantic adjustment.
> 
>     o---o---o---I---J      mainline
>    /               /  
>   O---o---X---o---A        topic A
>    \
>     o---Y---o---o---B      topic B

That would become

     o---o---o---I---J      mainline
    /               /
   O---o---X---o---A        topic A
    \       \
     \       M
      \     /
       o---Y---o---o---B  topic B


> When I later merge topic B to the integration branch, however, [...]
> to notice that we need to be careful when creating the merge K:
> 
>     o---o---o---I---J---K  mainline
>    /               /   /
>   O---o---X---o---A   /    topic A
>    \                 /
>     o---Y---o---o---B      topic B

When doing this merge, I think your goal is equivalent to discovering
that M includes part of the merge of J and B, and adding M as an
(implicit or explicit) third parent to the merge:

     o---o---o---I---J-------K  mainline
    /               /    .  /
   O---o---X---o---A    .  /    topic A
    \       \          .  /
     \       M.........  /
      \     /           /
       o---Y---o---o---B        topic B

How could M be stored?  Assuming that these type of premerge merges are
sparse, then Jeff's analysis seems good.  Concretely, one could simply
store pointers to M from both X and Y; e.g.,

* Add a note to X with the information "when merging this commit with Y,
use premerge M"

* Add a note to Y with the information "when merging this commit with X,
use premerge M"

Then, when merging, create the set J..B, scan all of the commits on B..J
for these "premerge" notes (O(|B..J|)), and for each one, look in the
set J..B to see if it is present.  The effort should scale like

    O( |J..B| + |B..J| * lg(|J..B|) )

where, of course J and B could be exchanged for either aesthetic or
performance reasons.  (One would also need a mechanism for preventing M
from being garbage-collected.)

Incidentally, this is just the sort of thing I have been thinking about
using to implement a kind of "incremental merge"; I've started writing
up my thoughts on my blog [1,2,3] (including how to make pretty pictures
of merge conflicts).

Michael

[1]
http://softwareswirl.blogspot.de/2012/12/the-conflict-frontier-of-nightmare-merge.html
[2]
http://softwareswirl.blogspot.de/2012/12/mapping-merge-conflict-frontier.html
[3]
http://softwareswirl.blogspot.de/2012/12/real-world-conflict-diagrams.html

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply

* Re: git filter-branch doesn't dereference annotated tags
From: Grégory Pakosz @ 2013-01-03  9:50 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Junio C Hamano, git
In-Reply-To: <50E55198.5080202@kdbg.org>

>
> IOW, if the command was something like
>
>   git filter-branch ...filter options... -- v1.0 master ...
>
> and v1.0 is an annotated tag, then it is reasonable to expect v1.0 to be
> deleted if the commit it points to goes away. But if the commit did not
> go away, but was rewritten, then it is equally reasonable to expect that
> the tag is also rewritten. But I don't think that we currently do the
> latter.
>
When the commit doesn't go away, the tag is currently being rewritten properly.

> Therefore, IMO, a change that implements the former behavior should also
> implement the latter behavior.
>
The patch in my latest email does both. (yet lacks unit tests for now)

Gregory

^ permalink raw reply

* Re: git filter-branch doesn't dereference annotated tags
From: Johannes Sixt @ 2013-01-03  9:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Grégory Pakosz, git
In-Reply-To: <7v623f18ci.fsf@alter.siamese.dyndns.org>

Am 03.01.2013 00:19, schrieb Junio C Hamano:
> Grégory Pakosz <gpakosz@visionobjects.com> writes:
> 
>> So we have an annotated tag that points to a commit that is rewritten
>> to nothing as the result of the filtering. What should happen?
> 
> If the user asked to filter that tag itself, it may make sense to
> remove it, rather than keeping it pointing at the original commit,
> because the commit it used to point at no longer exists in the
> alternate history being created by filter-branch.

IOW, if the command was something like

  git filter-branch ...filter options... -- v1.0 master ...

and v1.0 is an annotated tag, then it is reasonable to expect v1.0 to be
deleted if the commit it points to goes away. But if the commit did not
go away, but was rewritten, then it is equally reasonable to expect that
the tag is also rewritten. But I don't think that we currently do the
latter.

Therefore, IMO, a change that implements the former behavior should also
implement the latter behavior.

-- Hannes

^ permalink raw reply

* Re: [BUG] two-way read-tree can write null sha1s into index
From: Jeff King @ 2013-01-03  8:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Nieder, git
In-Reply-To: <7vvcbg7d8x.fsf@alter.siamese.dyndns.org>

On Tue, Jan 01, 2013 at 02:24:46PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > So I think we need to update twoway_merge to recognize unmerged entries,
> > which gives us two options:
> >
> >   1. Reject the merge.
> >
> >   2. Throw away the current unmerged entry in favor of the "new" entry
> >      (when old and new are the same, of course; otherwise we would
> >      reject).
> >
> > I think (2) is the right thing.
> 
> As "--reset" in "read-tree --reset -u A B" is a way to say "we know
> we are based on A and we want to go to B no matter what", I agree we
> should not reject the merge.
> 
> With -m instead of --reset, I am not sure what the right semantics
> should be, though.

Good point; I was just thinking about the --reset case.

With "-m", though, we could in theory carry over the unmerged entries
(again, assuming that "old" and "new" are the same; otherwise it is an
obvious reject). But those entries would be confused with any new
unmerged entries we create. It seems we already protect against this,
though: "read-tree -m" will not run at all if you have unmerged entries.

Likewise, "checkout" seems to have similar protections.

So I think it may be a non-issue.

-Peff

^ permalink raw reply

* Re: [RFD] annnotating a pair of commit objects?
From: Jeff King @ 2013-01-03  8:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vr4m2ycij.fsf@alter.siamese.dyndns.org>

On Wed, Jan 02, 2013 at 11:03:00PM -0800, Junio C Hamano wrote:

> I'd like a datastore that maps a pair of commit object names to
> another object name, such that:
> 
>  * When looking at two commits A and B, efficiently query all data
>    associated with a pair of commits <X,Y> where X is contained in
>    the range A..B and not in B..A, and Y is contained in the range
>    B..A and not in A..B.
> [...]
> Obviously, with O(cnt(A..B))*O(cnt(B..A)) complexity, this can be
> trivially coded, by trying all pairs in symmetric difference.
> 
> But I am hoping we can do better than that.
> 
> Ideas?

Just thinking out loud, the problem can be generalized in math terms as:

  - you have a universe of elements, `U` (i.e., all commits)

  - you have two sets, `X` and `Y`, such that each is a subset of `U`
    (these correspond to the two sides of the range, but we can think of
    them just as sets of commits). We happen to know that the sets are
    disjoint, but I don't know if that is helpful here.

  - you have a set of sets `M` that is a subset of the cartesian product
    `U x U` (i.e., its elements are "{x,y}" pairs, and membership in
    that set is your "bit"; you could also think of it as a mapping if
    you wanted more than a bit).

  - you want to know the intersection of `X x Y` and `M` (which of your
    pairs are in the mapping set).

Without doing any careful analysis, my gut says that in the worst case,
you are going to be stuck with `O(|X|*|Y|)` (i.e., what you are trying
to do better than above). But if we assume that `M` is relatively sparse
(which it should be; you only create entries when you do a merge between
two commits, and even then, only when it is tricky), we can probably do
better in practice.

For example, consider this naive way of doing it. Store `M` as a mapping
of commits to sets of commits, with fast lookup (a hash, or sorted
list).  For each element of `X`, look it up in `M`; call its value `V`
(which, remember, is a set itself).  For each element of `Y`, look it up
in `V`. The complexity would be:

  O(|X| * lg(|M|) * |Y| * lg(V_avg))

where "V_avg" is the average cardinality of each of the value sets we
map in the first step. But imagine we pre-sort `Y`, and then in the
second step, rather than looking up each `Y` in `V`, we instead look up
each `V` in `Y`. Then we have:

  O(|X| * lg(|M|) * V_avg * lg(|Y|))

IOW, we get to apply the log to |Y|. This is a win if we expect that
V_avg is going to be much smaller than |Y|. Which I think it would be,
since we would only have entries for merges we've done before[1].

That's just off the top of my head. This seems like it should be a
classic databases problem (since the cartesian product here is really
just a binary relation), but I couldn't find anything obvious online
(and I never paid attention in class, either).

-Peff

[1] You can do the same inversion trick for looking up elements of `M`
    in `X` instead of vice versa. It would probably buy you less, as you
    have a lot of commits that have merges at all (i.e., `M` is big),
    but only a few matching partners for each entry (i.e., `V` is
    small).

^ permalink raw reply

* Re: [PATCH] Replace git-cvsimport with a rewrite that fixes major bugs.
From: Antoine Pelisse @ 2013-01-03  7:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Chris Rorvick, Eric Raymond, git
In-Reply-To: <7vmwwqyc8w.fsf@alter.siamese.dyndns.org>

> Doesn't Python come with a standard subprocess module that lets you
> spawn external programs safely, similar to the way Perl's list form
> open(), e.g. "open($fh, "-|", 'git', @args)", works?

You mean something like this:

  p1 = subprocess.Popen([backend.command()], stdout=subprocess.PIPE)
  subprocess.Popen(["git", "fast-import", "--quiet"] + gitopts,
cwd=outdir, stdin=p1.stdout)

Assuming gitopts is a list rather than a string. (care must be taken
with backend.command() also)

^ permalink raw reply

* [PATCH] tests: turn on test-lint by default
From: Jeff King @ 2013-01-03  7:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Torsten Bögershausen, git
In-Reply-To: <7vhamzyqev.fsf@alter.siamese.dyndns.org>

On Wed, Jan 02, 2013 at 06:02:48PM -0800, Junio C Hamano wrote:

> Torsten Bögershausen <tboegi@web.de> writes:
> 
> > When the dust has settled, we can either enable the check always,
> > or mention "make test-lint-shell-syntax" in the Documentation.
> 
> In the longer term, I'm pretty much in favor of enabling all the
> checks that are cheap by default, as that would help people catch
> easy mistakes while preparing their patches.  People do not tend to
> enable any check if it were optional.

That is fine with me, and I always intended that we turn the lint on by
default at some point (I'm not really sure why we didn't -- looking at
the list archives, I think I did not push it because it seemed like
nobody was really that interested).

Certainly the two existing checks are cheap and do not produce false
positives, and should be safe to turn on. Like this:

-- >8 --
Subject: [PATCH] tests: turn on test-lint by default

The test Makefile knows about a few "lint" checks for common
errors. However, they are not enabled as part of "make test"
by default, which means that many people do not bother
running them. Since they are both quick to run and accurate
(i.e., no false positives), there should be no harm in
turning them on and helping submitters catch errors earlier.

We could just set:

  TEST_LINT = test-lint

to enable all tests. But that would be unnecessarily
annoying later on if we add slower or less accurate tests
that should not be part of the default. Instead, we name the
tests individually.

Signed-off-by: Jeff King <peff@peff.net>
---
 t/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/Makefile b/t/Makefile
index 3025418..5c6de81 100644
--- a/t/Makefile
+++ b/t/Makefile
@@ -13,6 +13,7 @@ DEFAULT_TEST_TARGET ?= test
 RM ?= rm -f
 PROVE ?= prove
 DEFAULT_TEST_TARGET ?= test
+TEST_LINT ?= test-lint-duplicates test-lint-executable
 
 # Shell quote;
 SHELL_PATH_SQ = $(subst ','\'',$(SHELL_PATH))
-- 
1.8.1.rc3.4.gf3a2f57





> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox