Git development
 help / color / mirror / Atom feed
* Re: [PATCH 4/5] mv: improve overwrite warning
From: Jeff King @ 2011-12-12 21:52 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jari Aalto, git
In-Reply-To: <7vobvd36ms.fsf@alter.siamese.dyndns.org>

On Mon, Dec 12, 2011 at 11:57:31AM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > This message looks overly long to me, but I wanted to match the existing
> > messages. Another option would be just:
> >
> >   warning: overwriting 'three/one'
> 
> Yes, I think it makes perfect sense to drop the ugly "source=one destination=two"
> cruft, both for single-source and multiple-source cases.

Yeah, the more I look at the message in the patch I sent, the uglier it
gets. Here's a re-rolled 4 and 5 with the nicer format.

-Peff

^ permalink raw reply

* [PATCHv2 4/5] mv: improve overwrite warning
From: Jeff King @ 2011-12-12 21:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jari Aalto, git
In-Reply-To: <20111212215238.GD9754@sigill.intra.peff.net>

When we try to "git mv" over an existing file, the error
message is fairly informative:

  $ git mv one two
  fatal: destination exists, source=one, destination=two

When the user forces the overwrite, we give a warning:

  $ git mv -f one two
  warning: destination exists; will overwrite!

This is less informative, but still sufficient in the simple
rename case, as there is only one rename happening.

But when moving files from one directory to another, it
becomes useless:

  $ mkdir three
  $ touch one two three/one
  $ git add .
  $ git mv one two three
  fatal: destination exists, source=one, destination=three/one
  $ git mv -f one two three
  warning: destination exists; will overwrite!

The first message is helpful, but the second one gives us no
clue about what was overwritten. Let's mention the name of
the destination file:

  $ git mv -f one two three
  warning: overwriting 'three/one'

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/mv.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index ae6c30c..8dd5a45 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -177,7 +177,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 				 * check both source and destination
 				 */
 				if (S_ISREG(st.st_mode) || S_ISLNK(st.st_mode)) {
-					warning(_("%s; will overwrite!"), bad);
+					warning(_("overwriting '%s'"), dst);
 					bad = NULL;
 				} else
 					bad = _("Cannot overwrite");
-- 
1.7.8.13.g74677

^ permalink raw reply related

* [PATCHv2 5/5] mv: be quiet about overwriting
From: Jeff King @ 2011-12-12 21:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jari Aalto, git
In-Reply-To: <20111212215238.GD9754@sigill.intra.peff.net>

When a user asks us to force a mv and overwrite the
destination, we print a warning. However, since a typical
use would be:

  $ git mv one two
  fatal: destination exists, source=one, destination=two
  $ git mv -f one two
  warning: overwriting 'two'

this warning is just noise. We already know we're
overwriting; that's why we gave -f!

This patch silences the warning unless "--verbose" is given.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/mv.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index 8dd5a45..2a144b0 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -177,7 +177,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 				 * check both source and destination
 				 */
 				if (S_ISREG(st.st_mode) || S_ISLNK(st.st_mode)) {
-					warning(_("overwriting '%s'"), dst);
+					if (verbose)
+						warning(_("overwriting '%s'"), dst);
 					bad = NULL;
 				} else
 					bad = _("Cannot overwrite");
-- 
1.7.8.13.g74677

^ permalink raw reply related

* Re: Breakage (?) in configure and git_vsnprintf()
From: Jonathan Nieder @ 2011-12-12 21:56 UTC (permalink / raw)
  To: Jeff King
  Cc: Michael Haggerty, Junio C Hamano, git discussion list,
	Michal Rokos, Brandon Casey
In-Reply-To: <20111212064305.GA16511@sigill.intra.peff.net>

Jeff King wrote:

> I'll leave the issue of "-std=c89" triggering SNPRINTF_RETURNS_BOGUS to
> people who know and care about autoconf. My gut is to say "don't do
> that". Git is not actually pure c89. We typically target systems that
> are _at least_ c89, but it's more important to match and run well on
> real-world systems than what was defined in the standard. So we don't
> depend on c99, but we do depend on quirks and features that were
> prominent in mid-90's Unix variants.

Using vsnprintf isn't even wrong from the standards-pedant point of
view --- vsnprintf has been in POSIX for a long time.  The problem is
that the configure script is not setting appropriate feature test
macros such as _XOPEN_SOURCE (like git-compat-util.h does) during its
feature tests.

Maybe it would be possible to hook the appropriate magic into
AC_LANG_PROGRAM.

^ permalink raw reply

* Re: Question about commit message wrapping
From: Frans Klaver @ 2011-12-12 22:16 UTC (permalink / raw)
  To: Holger Hellmuth
  Cc: Andrew Ardill, Jakub Narebski, Sidney San Martín, git
In-Reply-To: <4EE62DB9.8030406@ira.uka.de>

On Mon, 12 Dec 2011 17:37:13 +0100, Holger Hellmuth <hellmuth@ira.uka.de>  
wrote:

>> I am starting to think that we need to somehow keep the current
>> behavior, but override at smaller widths. Maybe even use format=flowed
>> in format-patch.
>
> Could you explain what overriding at smaller widths would achieve?  
> Supporting format=flowed would be nice though.

I specifically meant trying to detect pre-wrapped text, removing the  
new-lines where we think it is because of wrapping at 80 characters. So  
the result would be: perfect on screens up to 80 characters wide, and ok  
on anything wider.

The implementation of that however could affect code in the mail if it  
isn't done properly.

>
>> On the other hand, the fundamental use with git is to
>> communicate code, and I'm not sure how that [cw]ould be handled.
>
> I prefer wrapped code to code that is cut of at a specific column.  
> Wrapped code has much less possibility for misinterpretation. Python  
> programmers might disagree?

Wrapped code as in auto-wrapped? Or as in manually wrapped? Python  
programmers have significant white space, but you can still hard wrap  
stuff, as long as the next statement is properly indented.

>
> I see your proposal mainly as an improvement to the display of texts,  
> not code.

Me too.


-- 
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/

^ permalink raw reply

* [PATCH v2] Update documentation for stripspace
From: Conrad Irwin @ 2011-12-12 22:28 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Conrad Irwin
In-Reply-To: <7vy5ui5h0k.fsf@alter.siamese.dyndns.org>

On Sun, Dec 11, 2011 at 10:41 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Conrad Irwin <conrad.irwin@gmail.com> writes:
>
> The original says "remove" and new one says "normalize*s*". I think we
> tend to say things in imperative mood (i.e. without the trailing "s").

Good point, fixed.

>
> I do not think 'user-provided metadata' is a good wording. This is just a
> simple text clean-up filter and you can use it to clean your text files
> that you mean to store in the repository as well.

Changed just to "text", as that seems simpler. I've left the example
uses as is, they were the prime reason for that sentence existing.

>
> The last one is a bit funny, though.
>
> By definition, you cannot end the last line with more than one '\n' (upon
> seeing the second '\n', you would realize immediately that the line you
> saw was _not_ the last line). I think you meant the file does not end with
> an incomplete line, i.e. "ensures the output does not end with an
> incomplete line by adding '\n' at the end if needed".

Hmm, I'm not sure that's the best way of describing it — I've gone with:
"add a missing '\n' to the last line if necessary.".

>
>> +In the case where the input consists entirely of whitespace characters, no
>> +output will be produced.
>> +
>> +*NOTE*: This is intended for cleaning metadata, prefer the `--whitespace=fix`
>> +mode of linkgit:git-apply[1] for correcting whitespace of patches or files in
>> +the repository.
>
> I can tell that these three lines were the _primary_ thing you wanted to
> add with this patch, having never seen anybody got confused between the
> whitespace breakage fix and text cleaning, I wonder if this is adding
> clarity or giving users an impression that git can do too many things than
> they can wrap their mind around and forcing them to wonder if they have to
> learn everything git can do for them.

The motivation for this patch was an old post to the Cairo mailing list
[1]. There they were using (previously undocumented) behaviour of
git stripspace, instead of git apply --whitespace=fix (it may be that
--whitespace=fix didn't exist at that point?).

I've moved the *NOTE* into a SEE ALSO section where I think it reads
less opinionatedly — is that better?

>
>>  OPTIONS
>>  -------
>>  -s::
>>  --strip-comments::
>> -     In addition to empty lines, also strip lines starting with '#'.
>> +     Also remove all lines starting with '#'.
[snip]
>
> If I were touching this description, I probably would say something like
> "Treat lines starting with a '#' as if they are empty lines".
>

If only it were that simple! If you have a commented line between two
non-commented lines, then no empty line results. I've added an example
to the re-rolled patch that show-cases the behaviour of comment
stripping in both cases. I went with "skip and remove" as the verb, to
imply that the lines are ignored by the previous transformations.

Thanks for the detailed feedback.

Conrad

[1] http://lists.freedesktop.org/archives/cairo/2006-June/007062.html

----8<----

Tell the user what this command is intended for, and expand the
description of what it does.

Signed-off-by: Conrad Irwin <conrad.irwin@gmail.com>
---
 Documentation/git-stripspace.txt |   69 ++++++++++++++++++++++++++++++++++---
 builtin/stripspace.c             |    2 +-
 2 files changed, 64 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-stripspace.txt b/Documentation/git-stripspace.txt
index b78f031..a0a6ea4 100644
--- a/Documentation/git-stripspace.txt
+++ b/Documentation/git-stripspace.txt
@@ -3,26 +3,83 @@ git-stripspace(1)
 
 NAME
 ----
-git-stripspace - Filter out empty lines
+git-stripspace - Remove unnecessary whitespace
 
 
 SYNOPSIS
 --------
 [verse]
-'git stripspace' [-s | --strip-comments] < <stream>
+'git stripspace' [-s | --strip-comments] < input
 
 DESCRIPTION
 -----------
-Remove multiple empty lines, and empty lines at beginning and end.
+
+Clean the input in the manner used by 'git' for text such as commit
+messages, notes, tags and branch descriptions.
+
+With no arguments, this will:
+
+- remove trailing whitespace from all lines
+- collapse multiple consecutive empty lines into one empty line
+- remove blank lines from the beginning and end of the input
+- add a missing '\n' to the last line if necessary.
+
+In the case where the input consists entirely of whitespace characters, no
+output will be produced.
 
 OPTIONS
 -------
 -s::
 --strip-comments::
-	In addition to empty lines, also strip lines starting with '#'.
+	Skip and remove all lines starting with '#'.
+
+EXAMPLES
+--------
+
+Given the following noisy input with '$' indicating the end of a line:
+
+--------
+|A brief introduction   $
+|   $
+|$
+|A new paragraph$
+|# with a commented-out line    $
+|explaining lots of stuff.$
+|$
+|# An old paragraph, also commented-out. $
+|      $
+|The end.$
+|  $
+---------
+
+Use 'git stripspace' with no arguments to obtain:
+
+--------
+|A brief introduction$
+|$
+|A new paragraph$
+|# with a commented-out line$
+|explaining lots of stuff.$
+|$
+|# An old paragraph, also commented-out.$
+|$
+|The end.$
+---------
 
-<stream>::
-	Byte stream to act on.
+Use 'git stripspace --strip-comments' to obtain:
+
+--------
+|A brief introduction$
+|$
+|A new paragraph$
+|explaining lots of stuff.$
+|$
+|The end.$
+---------
+
+SEE ALSO
+--------
+The `--whitespace=fix` mode of linkgit:git-apply[1].
 
 GIT
 ---
diff --git a/builtin/stripspace.c b/builtin/stripspace.c
index 1288ffc..f16986c 100644
--- a/builtin/stripspace.c
+++ b/builtin/stripspace.c
@@ -75,7 +75,7 @@ int cmd_stripspace(int argc, const char **argv, const char *prefix)
 				!strcmp(argv[1], "--strip-comments")))
 		strip_comments = 1;
 	else if (argc > 1)
-		usage("git stripspace [-s | --strip-comments] < <stream>");
+		usage("git stripspace [-s | --strip-comments] < input");
 
 	if (strbuf_read(&buf, 0, 1024) < 0)
 		die_errno("could not read the input");
-- 
1.7.8.164.g00d7e

^ permalink raw reply related

* Re: [PATCH v4 2/2] push: teach --recurse-submodules the on-demand option
From: Jens Lehmann @ 2011-12-12 22:29 UTC (permalink / raw)
  To: Phil Hord; +Cc: Junio C Hamano, Heiko Voigt, Fredrik Gustafsson, git
In-Reply-To: <CABURp0okOmsk4JV9Ku5pHJb5vT-kr_fmweNNBKZ_OoRyfZan=Q@mail.gmail.com>

Am 12.12.2011 22:16, schrieb Phil Hord:
> On Tue, Oct 18, 2011 at 4:58 PM, Jens Lehmann <Jens.Lehmann@web.de> wrote:
>> Am 18.10.2011 00:33, schrieb Junio C Hamano:
>>> We could even declare that the gitlink for such a
>>> submodule should record 0{40} SHA-1 in the superproject, but I do not
>>> think that is necessary.
>>
>> Me neither, e.g. the SHA-1 which was the submodules HEAD when it was added
>> should do nicely. And that would avoid referencing a non-existing commit
>> in case you later want to turn a floating submodule into an exact one.
> 
> 
> I'm sorry I missed this comment before.
> 
> I hope we can allow storing the actual gitlink in the superproject for
> each commit even when we're using floating submodules.

I think you misread my statement, I was just talking about the initial
commit containing the newly added submodule, not any subsequent ones.
Floating makes differences between the original SHA-1 and the current
tip of the branch invisible, so there is nothing to commit.

>  I thought-experimented with this a bit last year and came to the
> conclusion that I should be able to 'float' to tips (developer
> convenience) and also to store the SHA-1 of each gitlink through
> history (automated maybe; as-needed).

Which means that after "git submodule update" floated a submodule branch
further, you would have to commit that in the superproject.

> The problem with "float-only" is that it loses history so, for
> example, git-bisect doesn't work.

Yep. And different developers can have the same superproject commit
checked out but their submodules can be quite different.

> The problem with "float + gitlinks", of course, is that it looks like
> "not floating" to the developers (git-status is dirty unless
> overridden, etc.)

Yeah. But what if each "git submodule update" would update the tip of
the submodule branch and add that to the superproject? You could follow
a tip but still produce reproducible trees.

^ permalink raw reply

* Re: [RFC/PATCH] add update to branch support for "floating submodules"
From: Jens Lehmann @ 2011-12-12 22:31 UTC (permalink / raw)
  To: Leif Gruenwoldt; +Cc: Andreas T.Auer, Junio C Hamano, git
In-Reply-To: <CALFF=ZRB7qjj7VMhzr12ySdHmZsySoqceu5brFht8rX1+W3NPg@mail.gmail.com>

Am 12.12.2011 20:13, schrieb Leif Gruenwoldt:
> On Mon, Dec 12, 2011 at 1:42 PM, Andreas T.Auer
> <andreas.t.auer_gtml_37453@ursus.ath.cx> wrote:
> 
>> The next question is: Wouldn't you like to have the new stable branch only
>> pulled in, when the projectX (as the superproject) is currently on that new
>> development branch (maybe master)?
>>
>> But if you checkout that fixed released version 1.2.9.8, wouldn't it be
>> better that in that case the gitlinked version of the submodule is checked
>> out instead of some unrelated new version? I mean, when the gitlinks are
>> tracked with the projectX commits, this should work well.
>>
>> And what about a maintenance branch, which is not a fixed version but a
>> quite stable branch which should only have bugfixes. Shouldn't the auto-pull
>> be disabled in that case, too?
>>
>> I think the "auto-pull" behavior should depend on the currently checked out
>> branch. So the configuration options should allow the definition of one or
>> more mappings.
> 
> Yes. I think you nailed it. The floating behaviour would best be
> configured per branch.

Why not use .gitmodules and make the "branch" setting work without having to
sync it?

> An aside. Would this mean a "git pull" on the product repo would
> automatically do a pull (git submodule update) on the submodule too?

Me thinks that only "git submodule update" should do that. Or what should
happen for people not using pull but just doing fetch and checkout?

^ permalink raw reply

* Re: [PATCH v3 0/3] grep attributes and multithreading
From: René Scharfe @ 2011-12-12 22:37 UTC (permalink / raw)
  To: Thomas Rast; +Cc: git, Junio C Hamano, Jeff King, Eric Herman
In-Reply-To: <cover.1323723759.git.trast@student.ethz.ch>

Am 12.12.2011 22:16, schrieb Thomas Rast:
> I think we should finish up these three patches for the next release.

> I already posted a bunch of POC patches, but doing the timing manually
> has been getting on my nerves lately.  So I would first like to
> formalize some of the performance testing, perhaps along lines already
> drawn up by Michael Hagger, perhaps not.  Then we can revisit the
> issue of grep performance.  But I would prefer not to block the -W fix
> and two easy and confirmed speedups on that.

Yes, that's a good idea.  The three patches are uncontroversial 
incremental improvements.

> I dropped this part entirely:
>
>> How about adding a parameter to control the number of threads
>> (--threads?) instead that defaults to eight (or five) for the worktree
>> and one for the rest?  That would also make benchmarking easier.
>
> It does make testing a lot easier, but the interface is IMHO not fit
> for users and I have a feeling that the "right" for-debugging
> interface will end up falling out of the performance testing work
> (probably an environment variable).  The end-user option should be a
> config setting, if any.

Agreed; users shouldn't need to specify such a parameter -- our 
heuristic should be good enough.

René

^ permalink raw reply

* Re: [PATCH 4/2] grep: turn off threading for non-worktree
From: René Scharfe @ 2011-12-12 22:37 UTC (permalink / raw)
  To: Pete Wyckoff; +Cc: Thomas Rast, git, Eric Herman, Junio C Hamano
In-Reply-To: <20111210131305.GA13344@arf.padd.com>

Am 10.12.2011 14:13, schrieb Pete Wyckoff:
> rene.scharfe@lsrfire.ath.cx wrote on Wed, 07 Dec 2011 18:00 +0100:
>> Am 07.12.2011 09:12, schrieb Thomas Rast:
>>> René Scharfe wrote:
>>>> diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
>>>> index 47ac188..e981a9b 100644
>>>> --- a/Documentation/git-grep.txt
>>>> +++ b/Documentation/git-grep.txt
>>>> @@ -228,8 +228,9 @@ OPTIONS
>>>>   	there is a match and with non-zero status when there isn't.
>>>>
>>>>   --threads<n>::
>>>> +	Run<n>  search threads in parallel.  Default is 8 when searching
>>>> +	the worktree and 0 otherwise.  This option is ignored if git was
>>>> +	built without support for POSIX threads.
>>> [...]
>>>> -		nr_threads = (online_cpus()>  1) ? THREADS : 0;
>>>> +		nr_threads = (online_cpus()>  1&&  !list.nr) ? THREADS : 0;
>>>
>>> It would be more consistent to stick to the pack.threads convention
>>> where 0 means "all of my cores", so to disable threading the user
>>> would set the number of threads to 1.  Or were you trying to measure
>>> the contention between the worker thread and the add_work() thread?
>>
>> Yes, indeed, the cost for the threading overhead does interest me.  The
>> documentation should perhaps mention --no-threads explicitly to avoid
>> confusion.
>>
>> Currently there is no way to specify "as many threads as there are
>> cores" here.  Previous measurements indicated that it wasn't too useful,
>> however, because I/O parallelism was beneficial even for machines with
>> less than eight cores and more threads didn't pay off.
>
> Right.  Even for single CPU machines this is true, so the
> nr_threads calculation above should still use all 8 THREADS
> regardless of the number of online_cpus().

That makes sense.  However, in a quick test with a simple regex against 
a cache-warm Linux repo threading increased the runtime of git grep by 
30% on a single-core virtual machine.  Let's keep that check until we 
understand this better..

René

^ permalink raw reply

* Re: Auto update submodules after merge and reset
From: Jens Lehmann @ 2011-12-12 22:39 UTC (permalink / raw)
  To: Andreas T.Auer; +Cc: Phil Hord, git
In-Reply-To: <4EE51D7B.7020806@ursus.ath.cx>

Am 11.12.2011 22:15, schrieb Andreas T.Auer:
> 
> 
> On 10.12.2011 00:57 Phil Hord wrote:
>>  On Thu, Dec 8, 2011 at 4:13 AM,
>>  <andreas.t.auer_gtml_37453@ursus.ath.cx> wrote:
>>
>>  Yes, but maybe you can update this information in the .gitmodules
>>  file easily with a command.  Maybe it could be something simpler
>>  than "git sync-gitmodules-branches", but that is essentially what it
>>  would do: it would save the current branch in each submodule as the
>>  "tracked" branch in the .gitmodules file.
> 
> Ok, I have read a better description of the "floating submodule" model now, so it is a different use case and somehow it makes sense. In that case there are probably just a few branches that you would like to follow, maybe an "unstable" for the newest development or "stable" for the current release or some maintenance branches.
> 
>>  Now this makes sense.  I want the same thing.  I want to preserve
>>  history on "old" commits, but I want to "advance to tip" on "new"
>>  commits.
>>
>>  The trouble, I think, is in telling the difference between "old" and
>>   "new".  I think it means there is a switch, like --auto-follow (or
>>  --no-auto-follow for the alternate if core.auto_follow is set).  But
>>   having a config option as the default is likely to break lots of
>>  scripts.
> 
> In my use case the branches on the submodules follow the superproject and (mostly) versions that are committed there, it just adds the possibility to keep on working without checking out a branch after an update and without colliding with existing branchnames in the submodule.

Using superproject branch names in the submodules make no sense for a
lot of use cases.

> The other use case wants to follow the commits of that other submodule without checking the corresponding gitlinks into the superproject. But wouldn't it also make sense here to define actually a mapping in the .gitmodule that says "if the branch 'develop' is checkedout in the supermodule then with every submodule update automatically pull the newest 'unstable' commit from the submodule"? Or for "master" follow "stable" or for the "maint" branch follow updates in the "bugfixes" branch.
> 
> For example
> 
> [submodule "commonlib"]
>     update = heads develop:unstable master:stable maint:bugfixes

Having that configured with "branch=unstable", "branch=stable" etc. in
.gitmodules on the superprojects branches would be simpler and achieve
the same functionality.

> So whenever a defined branch is checked out in the superproject the mapped branch will be checked out in the submodule ("new" commit), but if a (e.g. tagged) commit is checked out ("old" commit) then the gitlink in the superproject is used to check out the referenced commit in the submodule.

I think checkout should only use the submodule commit recorded in the
superproject and a subsequent "git submodule update" should be needed
to update the submodule to tip. Otherwise you record SHA-1 but still
won't be able to bisect ...

> In http://thread.gmane.org/gmane.comp.version-control.git/183837 was discussed whether the gitlink in the superproject should be set to all-zero if updates follow the tip or maybe use the SHA1 of the commit when the submodule was added. I think the gitlink should be updated everytime when a new commit in the superproject is created.

Nope, only when "git submodule update" is run. Otherwise you'll spray the
history with submodule updates totally unrelated to the commits in the
superproject, which is rather confusing.

^ permalink raw reply

* Re: [RFC/PATCH] add update to branch support for "floating submodules"
From: Phil Hord @ 2011-12-12 22:56 UTC (permalink / raw)
  To: Leif Gruenwoldt; +Cc: Andreas T.Auer, Junio C Hamano, git
In-Reply-To: <CALFF=ZRB7qjj7VMhzr12ySdHmZsySoqceu5brFht8rX1+W3NPg@mail.gmail.com>

On Mon, Dec 12, 2011 at 2:13 PM, Leif Gruenwoldt <leifer@gmail.com> wrote:
> On Mon, Dec 12, 2011 at 1:42 PM, Andreas T.Auer
> <andreas.t.auer_gtml_37453@ursus.ath.cx> wrote:
>
>> The next question is: Wouldn't you like to have the new stable branch only
>> pulled in, when the projectX (as the superproject) is currently on that new
>> development branch (maybe master)?
>>
>> But if you checkout that fixed released version 1.2.9.8, wouldn't it be
>> better that in that case the gitlinked version of the submodule is checked
>> out instead of some unrelated new version? I mean, when the gitlinks are
>> tracked with the projectX commits, this should work well.
>>
>> And what about a maintenance branch, which is not a fixed version but a
>> quite stable branch which should only have bugfixes. Shouldn't the auto-pull
>> be disabled in that case, too?
>>
>> I think the "auto-pull" behavior should depend on the currently checked out
>> branch. So the configuration options should allow the definition of one or
>> more mappings.
>
> Yes. I think you nailed it. The floating behaviour would best be
> configured per branch.

Yes, I think you nailed it too.  I've been thinking the same thing for
a while now, but I didn't know how to express it completely.  Some of
the discussion on here last week gelled the last bits in my mind.

To wit, I think I would want something like this in my project:

Use gitlinks when the superproject HEAD is one of these:
    refs/heads/maint/*
    refs/heads/svn/*     (historic branches)
    refs/tags/*
    <SHA1> (detached)

Float on the rest, using the branch given in .gitmodules (which may be
* to mean "use the same branch as the superproject".)

But maybe it is foolish of me to keep branches where I really want
lightweight tags.  If so, I could get away with this:

   Float if .git/HEAD begins with "refs/heads"
   Else, use the SHA1.


> An aside. Would this mean a "git pull" on the product repo would
> automatically do a pull (git submodule update) on the submodule too?

Good question.

I want to say "eventually, but not yet."  But someone else may
disagree.  "git pull --recurse-submodules=yes" does not do this yet.
A separate git-submodule-update is still required.  But I think this
is a separate issue from floating submodules.

Phil

^ permalink raw reply

* Re: [PATCH 1/3] test-terminal: give the child an empty stdin TTY
From: Jonathan Nieder @ 2011-12-12 23:25 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Thomas Rast, git, Michael Haggerty
In-Reply-To: <20111212191602.GA14061@sigill.intra.peff.net>

Jeff King wrote:

> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -198,6 +198,9 @@ else
>  	exec 4>/dev/null 3>/dev/null
>  fi
>
> +exec 6<&0
> +exec 0</dev/null
> +
>  test_failure=0
>  test_count=0
>  test_fixed=0

Independently of the changes to explicitly pass HEAD to "git shortlog"
in t7006 (we should) and make test_terminal make stdin into a tty, too
(if tests still make sure "git log --stdin" launches a pager with
stdin not a tty, then it should be a fine change), this looks good to
me.

^ permalink raw reply

* Re: [PATCH 2/2] Makefile: optionally exclude code that needs Unix sockets
From: Junio C Hamano @ 2011-12-12 23:31 UTC (permalink / raw)
  To: Jeff King; +Cc: Johannes Sixt, git
In-Reply-To: <20111212213951.GB9754@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> In theory we should also disable the documentation for credential-cache.
> But that means surgery not only to Documentation/Makefile, but figuring
> out how to pass the flags down to the actual asciidoc process (since
> gitcredentials(7) mentions it in the text). Certainly possible, but I
> don't know if it's worth the effort or not.

I do not think it matters that much. We've been shipping documentation for
stuff like remote archiver and daemon without conditional compilation, no?

^ permalink raw reply

* Re: Breakage (?) in configure and git_vsnprintf()
From: Brandon Casey @ 2011-12-12 23:25 UTC (permalink / raw)
  To: Jeff King
  Cc: Andreas Schwab, Michael Haggerty, Junio C Hamano,
	git discussion list, Michal Rokos
In-Reply-To: <20111212142550.GA23875@sigill.intra.peff.net>


FYI: all tests passed on IRIX (not that I suspected otherwise).

-Brandon

On 12/12/2011 08:25 AM, Jeff King wrote:
> On Mon, Dec 12, 2011 at 10:30:27AM +0100, Andreas Schwab wrote:
> 
>> Jeff King <peff@peff.net> writes:
>>
>>>  	if (maxsize > 0) {
>>> -		ret = vsnprintf(str, maxsize-SNPRINTF_SIZE_CORR, format, ap);
>>> +		va_copy(cp, ap);
>>> +		ret = vsnprintf(str, maxsize-SNPRINTF_SIZE_CORR, format, cp);
>>
>> You also need to call va_end on the copy.
> 
> Silly me. Thanks for catching.
> 
> Junio, here's a corrected version.
> 
> -- >8 --
> Subject: [PATCH] compat/snprintf: don't look at va_list twice
> 
> If you define SNPRINTF_RETURNS_BOGUS, we use a special
> git_vsnprintf wrapper assumes that vsnprintf returns "-1"
> instead of the number of characters that you would need to
> store the result.
> 
> To do this, it invokes vsnprintf multiple times, growing a
> heap buffer until we have enough space to hold the result.
> However, this means we evaluate the va_list parameter
> multiple times, which is generally a bad thing (it may be
> modified by calls to vsnprintf, yielding undefined
> behavior).
> 
> Instead, we must va_copy it and hand the copy to vsnprintf,
> so we always have a pristine va_list.
> 
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  compat/snprintf.c |    9 +++++++--
>  1 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/compat/snprintf.c b/compat/snprintf.c
> index e1e0e75..42ea1ac 100644
> --- a/compat/snprintf.c
> +++ b/compat/snprintf.c
> @@ -19,11 +19,14 @@
>  #undef vsnprintf
>  int git_vsnprintf(char *str, size_t maxsize, const char *format, va_list ap)
>  {
> +	va_list cp;
>  	char *s;
>  	int ret = -1;
>  
>  	if (maxsize > 0) {
> -		ret = vsnprintf(str, maxsize-SNPRINTF_SIZE_CORR, format, ap);
> +		va_copy(cp, ap);
> +		ret = vsnprintf(str, maxsize-SNPRINTF_SIZE_CORR, format, cp);
> +		va_end(cp);
>  		if (ret == maxsize-1)
>  			ret = -1;
>  		/* Windows does not NUL-terminate if result fills buffer */
> @@ -42,7 +45,9 @@ int git_vsnprintf(char *str, size_t maxsize, const char *format, va_list ap)
>  		if (! str)
>  			break;
>  		s = str;
> -		ret = vsnprintf(str, maxsize-SNPRINTF_SIZE_CORR, format, ap);
> +		va_copy(cp, ap);
> +		ret = vsnprintf(str, maxsize-SNPRINTF_SIZE_CORR, format, cp);
> +		va_end(cp);
>  		if (ret == maxsize-1)
>  			ret = -1;
>  	}

^ permalink raw reply

* Re: [PATCH v2] Update documentation for stripspace
From: Junio C Hamano @ 2011-12-12 23:41 UTC (permalink / raw)
  To: Conrad Irwin; +Cc: Junio C Hamano, git
In-Reply-To: <1323728909-7847-1-git-send-email-conrad.irwin@gmail.com>

Conrad Irwin <conrad.irwin@gmail.com> writes:

> ...
> I've moved the *NOTE* into a SEE ALSO section where I think it reads
> less opinionatedly ― is that better?

I think it looks a lot worse.

At least your original hinted that some people may confuse the two and the
NOTE was there for such people; other people who would not even dream of
such a confusion won't find the existence of the note a "Huh?". But the
updated patch with a link in SEE ALSO section without any explanation
would be a definite "Huh?" for those of us who find that stripspace does
not have anything to do with what "apply --whitespace=fix" does.

The new example section looks good. Perhaps we can just drop the extra SEE
ALSO and resurrect the *NOTE* from your v1 patch.

Thanks.

^ permalink raw reply

* Re: Auto update submodules after merge and reset
From: Phil Hord @ 2011-12-12 23:43 UTC (permalink / raw)
  To: Jens Lehmann; +Cc: Andreas T.Auer, git
In-Reply-To: <4EE682A3.8070704@web.de>

On Mon, Dec 12, 2011 at 5:39 PM, Jens Lehmann <Jens.Lehmann@web.de> wrote:
> Am 11.12.2011 22:15, schrieb Andreas T.Auer:
>> In http://thread.gmane.org/gmane.comp.version-control.git/183837 was discussed whether the gitlink in the superproject should be set to all-zero if updates follow the tip or maybe use the SHA1 of the commit when the submodule was added. I think the gitlink should be updated everytime when a new commit in the superproject is created.
>
> Nope, only when "git submodule update" is run. Otherwise you'll spray the
> history with submodule updates totally unrelated to the commits in the
> superproject, which is rather confusing.


And this is why my superproject is a makefile, a .gitmodules file and
a bunch of gitlinks.  We only use it to track the advancement of
submodule activity.

So yes, I want my superproject's gitlinks to update automatically.  I
just don't know a smart way to make that happen.

Phil

^ permalink raw reply

* Re: [PATCH v3 0/3] grep attributes and multithreading
From: Junio C Hamano @ 2011-12-12 23:44 UTC (permalink / raw)
  To: Thomas Rast; +Cc: git, René Scharfe, Jeff King, Eric Herman
In-Reply-To: <cover.1323723759.git.trast@student.ethz.ch>

Thanks; sign-off?

^ permalink raw reply

* Re: [PATCH v4 2/2] push: teach --recurse-submodules the on-demand option
From: Phil Hord @ 2011-12-12 23:50 UTC (permalink / raw)
  To: Jens Lehmann; +Cc: Junio C Hamano, Heiko Voigt, Fredrik Gustafsson, git
In-Reply-To: <4EE6805D.7020708@web.de>

On Mon, Dec 12, 2011 at 5:29 PM, Jens Lehmann <Jens.Lehmann@web.de> wrote:
> Am 12.12.2011 22:16, schrieb Phil Hord:
>> On Tue, Oct 18, 2011 at 4:58 PM, Jens Lehmann <Jens.Lehmann@web.de> wrote:
>>> Am 18.10.2011 00:33, schrieb Junio C Hamano:
>>>> We could even declare that the gitlink for such a
>>>> submodule should record 0{40} SHA-1 in the superproject, but I do not
>>>> think that is necessary.
>>>
>>> Me neither, e.g. the SHA-1 which was the submodules HEAD when it was added
>>> should do nicely. And that would avoid referencing a non-existing commit
>>> in case you later want to turn a floating submodule into an exact one.
>>
>>
>> I'm sorry I missed this comment before.
>>
>> I hope we can allow storing the actual gitlink in the superproject for
>> each commit even when we're using floating submodules.
>
> I think you misread my statement, I was just talking about the initial
> commit containing the newly added submodule, not any subsequent ones.
> Floating makes differences between the original SHA-1 and the current
> tip of the branch invisible, so there is nothing to commit.
>
>>  I thought-experimented with this a bit last year and came to the
>> conclusion that I should be able to 'float' to tips (developer
>> convenience) and also to store the SHA-1 of each gitlink through
>> history (automated maybe; as-needed).
>
> Which means that after "git submodule update" floated a submodule branch
> further, you would have to commit that in the superproject.

Sadly, yes.  Currently I have my CI-server do this for me after it
verifies each new submodule commit is able to build successfully.

>> The problem with "float-only" is that it loses history so, for
>> example, git-bisect doesn't work.
>
> Yep. And different developers can have the same superproject commit
> checked out but their submodules can be quite different.

>> The problem with "float + gitlinks", of course, is that it looks like
>> "not floating" to the developers (git-status is dirty unless
>> overridden, etc.)
>
> Yeah. But what if each "git submodule update" would update the tip of
> the submodule branch and add that to the superproject? You could follow
> a tip but still produce reproducible trees.

Yes, and that's what I want.

Not what it sounded like was being suggested before, which (to my
eyes) implied that the submodule gitlinks were useless noise.

Phil

^ permalink raw reply

* [PATCH] Update documentation for stripspace
From: Conrad Irwin @ 2011-12-12 23:52 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Conrad Irwin
In-Reply-To: <7vwra1z7bg.fsf@alter.siamese.dyndns.org>

2011/12/12 Junio C Hamano <gitster@pobox.com>:
> Conrad Irwin <conrad.irwin@gmail.com> writes:
>> I've moved the *NOTE* into a SEE ALSO section where I think it reads
>> less opinionatedly ― is that better?
>
> I think it looks a lot worse.
[snip]
>
> The new example section looks good. Perhaps we can just drop the extra SEE
> ALSO and resurrect the *NOTE* from your v1 patch.
>
> Thanks.

Done.

Conrad

---8<---

Tell the user what this command is intended for, and expand the
description of what it does.

Signed-off-by: Conrad Irwin <conrad.irwin@gmail.com>
---
 Documentation/git-stripspace.txt |   69 ++++++++++++++++++++++++++++++++++---
 builtin/stripspace.c             |    2 +-
 2 files changed, 64 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-stripspace.txt b/Documentation/git-stripspace.txt
index b78f031..a548444 100644
--- a/Documentation/git-stripspace.txt
+++ b/Documentation/git-stripspace.txt
@@ -3,26 +3,83 @@ git-stripspace(1)
 
 NAME
 ----
-git-stripspace - Filter out empty lines
+git-stripspace - Remove unnecessary whitespace
 
 
 SYNOPSIS
 --------
 [verse]
-'git stripspace' [-s | --strip-comments] < <stream>
+'git stripspace' [-s | --strip-comments] < input
 
 DESCRIPTION
 -----------
-Remove multiple empty lines, and empty lines at beginning and end.
+
+Clean the input in the manner used by 'git' for text such as commit
+messages, notes, tags and branch descriptions.
+
+With no arguments, this will:
+
+- remove trailing whitespace from all lines
+- collapse multiple consecutive empty lines into one empty line
+- remove empty lines from the beginning and end of the input
+- add a missing '\n' to the last line if necessary.
+
+In the case where the input consists entirely of whitespace characters, no
+output will be produced.
+
+*NOTE*: This is intended for cleaning metadata, prefer the `--whitespace=fix`
+mode of linkgit:git-apply[1] for correcting whitespace of patches or files in
+the repository.
 
 OPTIONS
 -------
 -s::
 --strip-comments::
-	In addition to empty lines, also strip lines starting with '#'.
+	Skip and remove all lines starting with '#'.
+
+EXAMPLES
+--------
+
+Given the following noisy input with '$' indicating the end of a line:
 
-<stream>::
-	Byte stream to act on.
+--------
+|A brief introduction   $
+|   $
+|$
+|A new paragraph$
+|# with a commented-out line    $
+|explaining lots of stuff.$
+|$
+|# An old paragraph, also commented-out. $
+|      $
+|The end.$
+|  $
+---------
+
+Use 'git stripspace' with no arguments to obtain:
+
+--------
+|A brief introduction$
+|$
+|A new paragraph$
+|# with a commented-out line$
+|explaining lots of stuff.$
+|$
+|# An old paragraph, also commented-out.$
+|$
+|The end.$
+---------
+
+Use 'git stripspace --strip-comments' to obtain:
+
+--------
+|A brief introduction$
+|$
+|A new paragraph$
+|explaining lots of stuff.$
+|$
+|The end.$
+---------
 
 GIT
 ---
diff --git a/builtin/stripspace.c b/builtin/stripspace.c
index 1288ffc..f16986c 100644
--- a/builtin/stripspace.c
+++ b/builtin/stripspace.c
@@ -75,7 +75,7 @@ int cmd_stripspace(int argc, const char **argv, const char *prefix)
 				!strcmp(argv[1], "--strip-comments")))
 		strip_comments = 1;
 	else if (argc > 1)
-		usage("git stripspace [-s | --strip-comments] < <stream>");
+		usage("git stripspace [-s | --strip-comments] < input");
 
 	if (strbuf_read(&buf, 0, 1024) < 0)
 		die_errno("could not read the input");
-- 
1.7.8.164.g00d7e

^ permalink raw reply related

* Re: [RFC/PATCH] add update to branch support for "floating submodules"
From: Brandon Casey @ 2011-12-13  0:12 UTC (permalink / raw)
  To: Leif Gruenwoldt; +Cc: Junio C Hamano, git
In-Reply-To: <CALFF=ZQKRgx_AodBQH17T9cSe_JFtoKie7DoMMfkTXCyCFospw@mail.gmail.com>

On Sat, Dec 10, 2011 at 9:27 AM, Leif Gruenwoldt <leifer@gmail.com> wrote:
> On Sat, Dec 10, 2011 at 1:30 AM, Junio C Hamano <gitster@pobox.com> wrote:
>
>> So that use case does not sound like a good rationale to require addition
>> of floating submodules.
>
> Ok I will try another scenario :)
>
> Imagine again products A, B and C and a common library. The products are in
> a stable state of development and track a stable branch of the common lib.
> Then imagine an important security fix gets made to the common library. On
> the next pull of products A, B, and C they get this fix for free
> because they were
> floating. They didn't need to communicate with the maintainer of the common
> repo to know this. In fact they don't really care. They just want the
> latest stable
> code for that release branch.
>
> This is how package management on many linux systems works. Dependencies
> get updated and all products reap the benefit (or catastrophe) automatically.

What happens if the update to the floating submodule introduces a bug
that prevents A, B, and/or C from building/running correctly?  If the
submodule states are not recorded, how would the previously working
submodule version be restored so that development on A, B, and C,
could proceed?  I guess each developer could manually checkout @{1} in
the submodule in their working directory, though that wouldn't work
for a new clone, and it's not very elegant.

I presume that if A, B, and C, do not care to know exactly what was
fixed in the common library, they probably do not care to investigate
the breakage in that repo either.  Or, they may not have the
expertise.

-Brandon

^ permalink raw reply

* Re: Debugging git-commit slowness on a large repo
From: Joshua Redstone @ 2011-12-13  0:15 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy, Carlos Martín Nieto, Tomas Carnecky,
	Junio C Hamano
  Cc: git@vger.kernel.org
In-Reply-To: <CB069308.2C9DD%joshua.redstone@fb.com>

Sorry for the poor formatting of the stack trace.

I've written two scripts to reproduce the slow commit behavior that I see.
 I've posted both to:
   https://gist.github.com/1469760

To repro, first create a dir with lots of files (it defaults to creating 1
million files in 1000 dirs):

$ loadGen.py --baseDir=./bigdir

then, run the simulator scripts to generate and commit a series of small
changes to the repo:

$ git reset --hard HEAD && simulate.py ./bigdir git

The git reset is to clean up any cruft left over from a previous partial
invocation of simulate.py

Note that loadGen.py defaults to creating 1 million files and committing
them in one commit.  With a flash drive this took < 30 min, and subsequent
small commits in simulate.py took about 6 seconds.  With a hard-drive,
it's taking > 1hr (still waiting for it to finish).

Cheers,
Josh


On 12/8/11 4:17 PM, "Joshua Redstone" <joshua.redstone@fb.com> wrote:

>Btw, I also tried doing some very poor-man's profiling on "git commit"
>without any of the readtree/writetree/updateindex commands.
>
>Around 50% of the time was in (bottom few frames may have varied)
>
>#1  0x00000000004c467e in find_pack_entry (sha1=0x1475a44 ,
>e=0x7fff2621f070) at sha1_file.c:2027
>#2  0x00000000004c57b0 in has_sha1_file (sha1=0x7fe2cd9c7900 "00") at
>sha1_file.c:2567  
>                  
>                 
>#3  0x000000000046e4af in update_one (it=<value optimized out>,
>cache=<value optimized out>, entries=<value optimized out>, base=<value
>optimized out>, baselen=<value optimized out>, missing_ok=<value optimized
>out>, dryrun=0) at cache-\
>tree.c:333        
>                  
>                  
>            
>#4  0x000000000046e278 in update_one (it=<value optimized out>,
>cache=<value optimized out>, entries=<value optimized out>, base=<value
>optimized out>, baselen=<value optimized out>, missing_ok=<value optimized
>out>, dryrun=0) at cache-\
>tree.c:285        
>                  
>                  
>            
>#5  0x000000000046e278 in update_one (it=<value optimized out>,
>cache=<value optimized out>, entries=<value optimized out>, base=<value
>optimized out>, baselen=<value optimized out>, missing_ok=<value optimized
>out>, dryrun=0) at cache-\
>tree.c:285        
>                  
>                  
>            
>#6  0x000000000046e278 in update_one (it=<value optimized out>,
>cache=<value optimized out>, entries=<value optimized out>, base=<value
>optimized out>, baselen=<value optimized out>, missing_ok=<value optimized
>out>, dryrun=0) at cache-\
>tree.c:285        
>                  
>                  
>            
>#7  0x000000000046e278 in update_one (it=<value optimized out>,
>cache=<value optimized out>, entries=<value optimized out>, base=<value
>optimized out>, baselen=<value optimized out>, missing_ok=<value optimized
>out>, dryrun=0) at cache-\
>tree.c:285        
>                  
>                  
>            
>#8  0x000000000046e278 in update_one (it=<value optimized out>,
>cache=<value optimized out>, entries=<value optimized out>, base=<value
>optimized out>, baselen=<value optimized out>, missing_ok=<value optimized
>out>, dryrun=0) at cache-\
>tree.c:285        
>                  
>                  
>            
>#9  0x000000000046e869 in cache_tree_update (it=<value optimized out>,
>cache=<value optimized out>, entries=dwarf2_read_address: Corrupted DWARF
>expression.       
>                 
>) at cache-tree.c:379
>                  
>                  
>            
>#10 0x000000000041cade in prepare_to_commit (index_file=0x781740
>".git/index", prefix=<value optimized out>, current_head=<value optimized
>out>, s=0x7fff26220d00, author_ident=<value optimized out>) at
>builtin/commit.c:866
>#11 0x000000000041d891 in cmd_commit (argc=0, argv=0x7fff262213a0,
>prefix=0x0) at builtin/commit.c:1407
>                  
>                  
>#12 0x0000000000404bf7 in handle_internal_command (argc=4,
>argv=0x7fff262213a0) at git.c:308
>                  
>                  
>#13 0x0000000000404e2f in main (argc=4, argv=0x7fff262213a0) at git.c:512
>                  
>                  
>            
> 
>
>
>And 30% of the time was in:
>
>#0  0x00000034af2c34a5 in _lxstat () from /lib64/libc.so.6
>                  
>                  
>            
>#1  0x00000000004abe0f in refresh_cache_ent (istate=0x780940,
>ce=0x7f8462a34e40, options=0, err=0x7fff6dd9f588) at
>/usr/include/sys/stat.h:443
>                  
>#2  0x00000000004ac1a0 in refresh_index (istate=0x780940, flags=<value
>optimized out>, pathspec=<value optimized out>, seen=<value optimized
>out>, header_msg=0x0) at read-cache.c:1133
>                  
>#3  0x000000000041b60a in refresh_cache_or_die (refresh_flags=<value
>optimized out>) at builtin/commit.c:331
>                  
>                  
>#4  0x000000000041bc39 in prepare_index (argc=0, argv=0x7fff6dda0310,
>prefix=0x0, current_head=<value optimized out>, is_status=<value optimized
>out>) at builtin/commit.c:414
>                 
>#5  0x000000000041d878 in cmd_commit (argc=0, argv=0x7fff6dda0310,
>prefix=0x0) at builtin/commit.c:1403
>                  
>                  
>  
>
>
>Josh
>
>
>On 12/8/11 4:09 PM, "Joshua Redstone" <joshua.redstone@fb.com> wrote:
>
>>On 12/7/11 5:39 PM, "Nguyen Thai Ngoc Duy" <pclouds@gmail.com> wrote:
>>
>>>On Thu, Dec 8, 2011 at 5:48 AM, Joshua Redstone <joshua.redstone@fb.com>
>>>wrote:
>>>> Hi Duy,
>>>> Thanks for the documentation link.
>>>>
>>>> git ls-files shows 100k files, which matches # of files in the working
>>>> tree ('find . -type f -print | wc -l').
>>>
>>>Any chance you can split it into smaller repositories, or remove files
>>>from working directory (e.g. if you store logs, you don't have to keep
>>>logs from all time in working directory, they can be retrieved from
>>>history).
>>
>>It's not really feasible to split it into smaller repositories.  In fact,
>>we're expecting it to grow between 3x and 5x in number of files and
>>number
>>of commits.
>>
>>>
>>>> I added a 'git read-tree HEAD' before the git-add, and a 'git
>>>>write-tree'
>>>> after the add.  With that, the commit time slowed down to 8 seconds
>>>>per
>>>> commit, plus 4 more seconds for the read-tree/add/write-tree ops.  The
>>>> read-tree/add/write-tree each took about a second.
>>>
>>>read-tree destroys stat info in index, refreshing 100k entries in
>>>index in this case may take some time. Try this to see if commit time
>>>reduces and how much time update-index takes
>>>
>>>read-tree HEAD
>>>update-index --refresh
>>>add ....
>>>write-tree
>>>commit -q
>>
>>I added the "update-index --refresh" and the time for commit became more
>>like 0.6 seconds.
>>In this setup: read-tree takes ~2 seconds, update-index takes ~8 seconds,
>>git-add takes 1 to 4 seconds, and write-tree takes less than 1 second.
>>
>>>
>>>> As an experiment, I also tried removing the 'git read-tree' and just
>>>> having the git-write-tree.  That sped up commits to 0.6 seconds, but
>>>>the
>>>> overall time for add/write-tree/commit was still 3 to 6 seconds.
>>>
>>>overall time is not really important because we duplicate work here
>>>(write-tree is done as part of commit again). What I'm trying to do is
>>>to determine how much time each operation in commit may take.
>>>-- 
>>>Duy
>>
>

^ permalink raw reply

* Re: [PATCH 3/4] Guard memory overwriting in resolve_ref's static buffer
From: Junio C Hamano @ 2011-12-13  0:37 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, Jonathan Nieder
In-Reply-To: <1323688832-32016-3-git-send-email-pclouds@gmail.com>

Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes:

> diff --git a/cache.h b/cache.h
> index 4887a3e..ba5e911 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -865,7 +865,8 @@ extern int read_ref(const char *filename, unsigned char *sha1);
>   *
>   * errno is sometimes set on errors, but not always.
>   */
> -extern const char *resolve_ref(const char *ref, unsigned char *sha1, int reading, int *flag);
> +#define resolve_ref(ref, sha1, reading, flag) resolve_ref_real(ref, sha1, reading, flag, __FILE__, __LINE__)
> +extern const char *resolve_ref_real(const char *ref, unsigned char *sha1, int reading, int *flag, const char *file, int line);
>  extern char *resolve_refdup(const char *ref, unsigned char *sha1, int reading, int *flag);
>  
>  extern int dwim_ref(const char *str, int len, unsigned char *sha1, char **ref);

Eek.

> diff --git a/wrapper.c b/wrapper.c
> index 85f09df..02b6c81 100644
> --- a/wrapper.c
> +++ b/wrapper.c
> @@ -60,6 +60,33 @@ void *xmallocz(size_t size)
>  	return ret;
>  }
>  
> +void *xmalloc_mmap(size_t size, const char *file, int line)
> +{
> +	struct alloc_header *block;
> +	size += offsetof(struct alloc_header,buf);
> +	block = mmap(NULL, size, PROT_READ | PROT_WRITE,
> +		   MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
> +	if (block == (struct alloc_header*)-1)
> +		die_errno("unable to mmap %lu bytes anonymously",
> +			  (unsigned long)size);
> +
> +	block->file = file;
> +	block->line = line;
> +	block->size = size;
> +	return block->buf;
> +}
> +

Double eek. A refname is ordinarily way too small than a page and we spend
a full page every time resolve_ref_unsafe() is called. That is acceptable
only for debugging build, but then having to patch the main codepath like
the following, renaming the "real" implementation of a rather important
function is not acceptable in a non-debugging build.

> diff --git a/refs.c b/refs.c
> index 8ffb32f..cf8dfcc 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -497,12 +497,21 @@ static int get_packed_ref(const char *ref, unsigned char *sha1)
>  	return -1;
>  }
>  
> -const char *resolve_ref(const char *ref, unsigned char *sha1, int reading, int *flag)
> +const char *resolve_ref_real(const char *ref, unsigned char *sha1,
> +			     int reading, int *flag, const char *file, int line)
>  {
>  	int depth = MAXDEPTH;
>  	ssize_t len;
>  	char buffer[256];
> -	static char ref_buffer[256];
> +	static char real_ref_buffer[256];
> +	static char *ref_buffer;
> +
> +	if (!ref_buffer && !getenv("GIT_DEBUG_MEMCHECK"))
> +		ref_buffer = real_ref_buffer;
> +	if (ref_buffer != real_ref_buffer) {
> +		xfree_mmap(ref_buffer);
> +		ref_buffer = xmalloc_mmap(256, file, line);
> +	}

I'll drop 3/4 from the series, adjust 4/4, and queue the result as a
three-patch series for now.

Thanks.

^ permalink raw reply

* Re: [PATCH v2 02/51] refs: rename "refname" variables
From: Junio C Hamano @ 2011-12-13  0:37 UTC (permalink / raw)
  To: mhagger
  Cc: git, Jeff King, Drew Northup, Jakub Narebski, Heiko Voigt,
	Johan Herland, Julian Phillips
In-Reply-To: <1323668338-1764-3-git-send-email-mhagger@alum.mit.edu>

This added otherwise unnecessary conflicts with topics in flight, but the
conflicts were nothing Git and a human couldn't manage, and overall it is
not a bad readability change for the longer-term code health.

Thanks.

^ permalink raw reply

* Re: [PATCH v2 08/51] is_dup_ref(): extract function from sort_ref_array()
From: Junio C Hamano @ 2011-12-12 22:33 UTC (permalink / raw)
  To: mhagger
  Cc: git, Jeff King, Drew Northup, Jakub Narebski, Heiko Voigt,
	Johan Herland, Julian Phillips
In-Reply-To: <1323668338-1764-9-git-send-email-mhagger@alum.mit.edu>

mhagger@alum.mit.edu writes:

> +/*
> + * Emit a warning and return true iff ref1 and ref2 have the same name
> + * and the same sha1.  Die if they have the same name but different
> + * sha1s.
> + */
> +static int is_dup_ref(const struct ref_entry *ref1, const struct ref_entry *ref2)
> +{
> +	if (!strcmp(ref1->name, ref2->name)) {
> +		/* Duplicate name; make sure that the SHA1s match: */
> +		if (hashcmp(ref1->sha1, ref2->sha1))
> +			die("Duplicated ref, and SHA1s don't match: %s",
> +			    ref1->name);
> +		warning("Duplicated ref: %s", ref1->name);
> +		return 1;
> +	} else {
> +		return 0;
> +	}
> +}

At this step, is_dup_ref() is only called from sort_ref_array() which in
turn is only called on either a collection of loose or packed refs, so
warning is warranted. IOW I do not see anything wrong with _this_ patch.

Later in the series, however, the collection of extra refs seems to be
shoved into a phoney "direntry" and made to go through the same add_ref()
machinery that uses find_containing_direntry() which in turn calls
search_ref_dir() that smells like a definite no-no.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox