Git development

Git development
 help / color / mirror / Atom feed

* Re: [PATCH 2/2] Be more careful when determining whether a remote was configured
From: Jeff King @ 2017-01-19 18:27 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Junio C Hamano, Thomas Gummerer, Andrew Arnott
In-Reply-To: <alpine.DEB.2.20.1701181700020.3469@virtualbox>

On Wed, Jan 18, 2017 at 05:22:40PM +0100, Johannes Schindelin wrote:

> > > I want to err on the side of caution. That's why.
> > 
> > I guess I just don't see why changing the behavior with respect to
> > "prune" or "proxy" is any less conservative than changing the one for
> > "refspec".
> 
> Let's take a step back and consider the problem I try to solve, okay?

OK. Though after reading your message over several times, I think I may
have confused things by raising two separate issues.

So let me re-state my issues here for clarity:

  1. I'm concerned that a setting for remote.<url>.X would fail to apply
     to a bare "git fetch <url>" if "X" is considered as "not really"
     configuring a remote.

  2. I'm concerned that splitting the remote.*.X keys into two classes:
     "really configures" and "does not really configure" creates a
     confusing interface for users. Some keys are OK to set in your
     ~/.gitconfig and others are not.

Let's talk about concern 1 first.  Based on your analysis in this
message, it looks like it _isn't_ a problem with your patch (because
is_configured is never used for applying values, only for add/del/rename
types of operations in remote.c).

So good. Thanks for addressing my concern. And that makes your
"conservative" comment make more sense; the idea is that you are not
breaking anything, but just loosening selectively for some values of
"X".

I think there are still some weird corner cases even for "prune",
though. E.g., try:

  git init
  git remote add foo whatever
  git config remote.foo.prune true
  git config remote.other.prune false

So now we have:

	[remote "foo"]
		url = whatever
		fetch = +refs/heads/*:refs/remotes/foo/*
		prune = true
	[remote "other"]
		prune = false

Now try "git remote rename foo other". With current versions of git,
it's disallowed. With your patch, I get:

	[remote "other"]
		url = whatever
		prune = true
	[remote "other"]
		prune = false
		fetch = +refs/heads/*:refs/remotes/other/*

The old value of remote.other.prune is overriding the one we tried to
move into place.

In your motivating example, the old value is in ~/.gitconfig, so it
automatically takes lower precedence, and everything just works (I think
it would also just work if "other" had been defined originally _before_
foo in the config file).

I think you could fix it by having git-remote teach the "not really"
config values (like "prune") to overwrite any existing value when
rewriting the config. I think this just needs the multi_replace flag set
when calling git_config_set_multivar().

That raises questions about what should happen when multi-value keys
like "refspec" would be set (if we were to add them to the "not really"
set). Should they be overwritten, or merged? And in that sense, your
patch lets you punt on those issues.

I still think my second concern is valid. It's made worse by your patch
(if only because everything was disallowed before, and now some things
are and some things aren't). If this is a first step towards a final
state where the rules are simple, then starting conservatively makes
sense. And until we get there, the answer "yes, it should, but nobody
has worked out the semantics yet; care to make a patch?" is an OK one.
But it sounds like you do not ever want to loosen the "refspec" case.

I don't think that's ideal, but at the very least if that's the end
state, the list of "OK to use in ~/.gitconfig" keys should probably be
documented.

Reading your message, though, I still wonder if we can do better...

> The problem is that `git remote rename <old> <new>` refuses to do its job
> if it detects a configured `<new>`. And it detects a configured `<new>`
> even in cases where there is not *really* any remote configured.

I'd add to this that "git remote add <new>" should work in a similar way
(that was the one that I think people often ran into with
remote.origin.fetch refspecs).

> The example use case is to configure `remote.origin.prune = true` in
> ~/.gitconfig, i.e. changing Git's default for all "origin" remotes in all
> of the user's repositories.
> 
> Now, the *real* fix would be to detect whether the remote was "configured"
> in the current repository, or in ~/.gitconfig. But that may not even be
> desirable, as we would want a more general fix for the question: can `git
> remote` rename a given remote to a new name, i.e. is that new name already
> taken?

I think _this_ is a much better way of framing the problem. It is not
about which keys are set, but about _where_ they are set. IOW, a
reasonable rule would be: if there is any remote.*.X in the repo config,
then git-remote should consider it a configured repo. And otherwise, no
matter what is in ~/.gitconfig or elsewhere, git-remote should proceed
as if it doesn't exist (and repo-level config can take precedence over
config defined elsewhere).

I.e., something like this:

diff --git a/remote.c b/remote.c
index 298f2f93f..720d616be 100644
--- a/remote.c
+++ b/remote.c
@@ -373,6 +373,8 @@ static int handle_config(const char *key, const char *value, void *cb)
 	}
 	remote = make_remote(name, namelen);
 	remote->origin = REMOTE_CONFIG;
+	if (current_config_scope() == CONFIG_SCOPE_REPO)
+		remote->configured = 1;
 	if (!strcmp(subkey, "mirror"))
 		remote->mirror = git_config_bool(key, value);
 	else if (!strcmp(subkey, "skipdefaultupdate"))

That doesn't make your test pass, but I think that is only because your
test is not covering the interesting case (it puts the new config in the
repo config, not in ~/.gitconfig).

What do you think?

> Originally, I would even have put the "vcs" into that set, as I could see
> a legitimate use case for users to configure "remote.svn.vcs = vcs" in
> ~/.gitconfig. But the regression test suite specifically tests for that
> case, and I trust that there was a good reason, even if Thomas did not
> describe that good reason in the commit message nor in any reply to this
> patch pair.

The config-scope thing above would allow "remote.svn.vcs" in
~/.gitconfig. But I don't think the test script actually checks that; it
checks for the repo-level config. And we would continue to do the right
thing there.

-Peff

^ permalink raw reply related

* Re: [PATCH v5 3/3] log --graph: customize the graph lines with config log.graphColors
From: Junio C Hamano @ 2017-01-19 18:27 UTC (permalink / raw)
  To: Jeff King; +Cc: Nguyễn Thái Ngọc Duy, git
In-Reply-To: <20170119165127.6cxw64fjz7aevkq2@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> On Thu, Jan 19, 2017 at 06:41:23PM +0700, Nguyễn Thái Ngọc Duy wrote:
>
>> +static void parse_graph_colors_config(struct argv_array *colors, const char *string)
>> +{
>> +	const char *end, *start;
>> +
>> +	start = string;
>> +	end = string + strlen(string);
>> +	while (start < end) {
>> +		const char *comma = strchrnul(start, ',');
>> +		char color[COLOR_MAXLEN];
>> +
>> +		if (!color_parse_mem(start, comma - start, color))
>> +			argv_array_push(colors, color);
>> +		else
>> +			warning(_("ignore invalid color '%.*s' in log.graphColors"),
>> +				(int)(comma - start), start);
>> +		start = comma + 1;
>> +	}
>> +	argv_array_push(colors, GIT_COLOR_RESET);
>> +}
>
> This looks good.
>
>> @@ -207,9 +228,24 @@ struct git_graph *graph_init(struct rev_info *opt)
>>  {
>>  	struct git_graph *graph = xmalloc(sizeof(struct git_graph));
>>  
>> -	if (!column_colors)
>> -		graph_set_column_colors(column_colors_ansi,
>> -					column_colors_ansi_max);
>> +	if (!column_colors) {
>> +		struct argv_array ansi_colors = {
>> +			column_colors_ansi,
>> +			column_colors_ansi_max + 1
>> +		};
>
> Hrm. At first I thought this would cause memory corrution, because your
> argv_array_clear() would try to free() the non-heap array you've stuffed
> inside. But you only clear the custom_colors array which actually is
> dynamically allocated. This outer one is just here to give uniform
> access:
>
>> +		struct argv_array *colors = &ansi_colors;
>> +		char *string;
>> +
>> +		if (!git_config_get_string("log.graphcolors", &string)) {
>> +			static struct argv_array custom_colors = ARGV_ARRAY_INIT;
>> +			argv_array_clear(&custom_colors);
>> +			parse_graph_colors_config(&custom_colors, string);
>> +			free(string);
>> +			colors = &custom_colors;
>> +		}
>> +		/* graph_set_column_colors takes a max-index, not a count */
>> +		graph_set_column_colors(colors->argv, colors->argc - 1);
>> +	}
>
> Since there's only one line that cares about the result of "colors",
> maybe it would be less confusing to do:
>
>   if (!git_config_get-string("log.graphcolors", &string)) {
> 	... parse, etc ...
> 	graph_set_column_colors(colors.argv, colors.argc - 1);
>   } else {
> 	graph_set_column_colors(column_colors_ansi,
> 	                        column_colors_ansi_max);
>   }

Yes, that would be much much less confusing.  By doing so, the cover
letter can lose "pushing the limits of abuse", too ;-).



^ permalink raw reply

* Re: [PATCH v3 1/5] doc: add documentation for OPT_STRING_LIST
From: Junio C Hamano @ 2017-01-19 17:58 UTC (permalink / raw)
  To: Jacob Keller
  Cc: Jacob Keller, Git mailing list, Johannes Sixt,
	Johannes Schindelin
In-Reply-To: <CA+P7+xrv35w0RYNEZ88K_EWC57A4utBQTibtw6+TJBCtzA9Ybw@mail.gmail.com>

Jacob Keller <jacob.keller@gmail.com> writes:

> On Wed, Jan 18, 2017 at 11:45 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>
>> I do not know if it is clear enough that 'option' in the last
>> sentence is a placeholder.  I then wondered if spelling it as
>> `--no-<long>` would make it a bit clearer, but that is ugly.
>
> To be fair, this is exactly how the rest of the doc spells these
> things, so I would rather be consistent with the doc as is, and a
> future patch could clean this up. See OPT_SET_INT, for an example of
> `--no-option`.

Ah, OK, then I have no issues with it.

> Maybe we can rephrase this "The list is reset via `--no-option`"?

I think I saw that in your latest interdiff and I think it is good.

Thanks.


^ permalink raw reply

* Re: [RFC] stash --continue
From: Marc Branchaud @ 2017-01-19 18:38 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Stephan Beyer, git
In-Reply-To: <alpine.DEB.2.20.1701191341530.3469@virtualbox>

On 2017-01-19 10:49 AM, Johannes Schindelin wrote:
> Hi Marc,
>
> On Wed, 18 Jan 2017, Marc Branchaud wrote:
>
>> On 2017-01-18 11:34 AM, Johannes Schindelin wrote:
>>>
>>> On Wed, 18 Jan 2017, Marc Branchaud wrote:
>>>
>>>> On 2017-01-16 05:54 AM, Johannes Schindelin wrote:
>>>>
>>>>> On Mon, 16 Jan 2017, Stephan Beyer wrote:
>>>>>
>>>>>> a git-newbie-ish co-worker uses git-stash sometimes. Last time
>>>>>> he used "git stash pop", he got into a merge conflict. After he
>>>>>> resolved the conflict, he did not know what to do to get the
>>>>>> repository into the wanted state. In his case, it was only "git
>>>>>> add <resolved files>" followed by a "git reset" and a "git stash
>>>>>> drop", but there may be more involved cases when your index is
>>>>>> not clean before "git stash pop" and you want to have your index
>>>>>> as before.
>>>>>>
>>>>>> This led to the idea to have something like "git stash
>>>>>> --continue"[1]
>>>>>
>>>>> More like "git stash pop --continue". Without the "pop" command,
>>>>> it does not make too much sense.
>>>>
>>>> Why not?  git should be able to remember what stash command created
>>>> the conflict.  Why should I have to?  Maybe the fire alarm goes off
>>>> right when I run the stash command, and by the time I get back to it
>>>> I can't remember which operation I did.  It would be nice to be able
>>>> to tell git to "just finish off (or abort) the stash operation,
>>>> whatever it was".
>>>
>>> That reeks of a big potential for confusion.
>>>
>>> Imagine for example a total Git noob who calls `git stash list`,
>>> scrolls two pages down, then hits `q` by mistake. How would you
>>> explain to that user that `git stash --continue` does not continue
>>> showing the list at the third page?
>>
>> Sorry, but I have trouble taking that example seriously.  It assumes
>> such a level of "noobness" that the user doesn't even understand how
>> standard command output paging works, not just with git but with any
>> shell command.
>
> Yeah, well, I thought you understood what I meant.
>
> The example was the best I could come up with quickly, and it only tried
> to show that there are *other* stash operations that one might perceive
> to happen at the same time as the "pop" operation, so your flimsical
> comment "why not continue the latest operation" may very well be
> ambiguous.
>
> And if it is not ambiguous in "stash", it certainly will be in other Git
> operations. And therefore, having a DWIM in "stash" to allow "--continue"
> without any specific subcommand, but not having it in other Git commands,
> is just a very poor user interface design. It is prone to confuse users,
> which is always a hallmark of a bad user interface.

Please don't underestimate the power of syntactic consistency in helping 
users achieve their goals.  Having some commands use "git foo 
--continue" while others use "git foo bar --continue" *will* confuse 
people, regardless of how logical the reasons for those differences.

But in the case of stash, I still don't see the utility in having 
operation-specific continuation.  Consider the following sequence (as 
you say, this doesn't work yet, but making it work seems reasonable):

	git stash pop  # creates some conflicts
	git stash apply stash@{4} # creates some other conflicts
	# (User resolves the conflicts created by the pop.)
	git stash pop --continue

Given the description of the original proposal (do "git reset; git stash 
drop"), what's the state of the index and the working tree?

In particular, what has the user gained by continuing just that pop?

Another thing to ask is, how common is such a scenario likely to be?  I 
suggest that it will be far more common for users to resolve all the 
conflicts and then want to continue all their interrupted stash 
operations.  If so, fussily forcing them to explicitly continue the pop 
and the apply is just a waste.

> Hence my objection to "git stash --continue". No argument in favor of "git
> stash --continue" I heard so far comes even close to being convincing.

Well, what about the potential for a slippery slope?  If the user is 
forced to be specific about continuing either a pop or an apply, why 
wouldn't git allow them to be specific about *which* pop or apply they 
want to continue?  Consider another hypothetical scenario:

	git stash pop  # creates some conflicts
	git stash pop  # creates some more conflicts
	git stash pop  # creates even more conflicts
	# (User resolves the conflicts created by second pop.)
	git stash pop --continue
	# Oops, there's still some unresovled pops!

Obviously the user isn't ready to finish off all the pops, so they'll 
want some way to specify which pop to continue.  Dealing with that just 
feels like a lot of work for minimal benefit.

>>> Even worse: `git stash` (without arguments) defaults to the `save`
>>> operation, so any user who does not read the documentation (and who
>>> does?) would assume that `git stash --continue` *also* implies `save`.
>>
>> Like the first example, your user is trying to "continue" a command that
>> is already complete.
>
> Says who? You may understand the semantics better than other users, but
> who are you to judge?
>
> But that's besides the point.
>
> My point (which you did not quite understand) was that it can be ambiguous
> what to continue when looking at *all* Git commands. To special-case "git
> stash"'s user interface makes things more confusing, and therefore less
> usable for everyone.
>
> And even with `git stash apply`, you could construct a very plausible
> scenario (which does not work yet, but we may want to make it work): if
> `git stash apply` causes conflicts, and `git stash apply stash@{1}`
> conflicts in a *different* set of files, why don't we allow the second
> operation to succeed (adding its conflicts)?

Running those two commands should be perfectly fine.  The interesting 
question is what it means to *continue* from that state.

> That example is like `git cherry-pick -n` with two different commits, both
> of which conflict with the current worktree, but in different files. Both
> cherry-picks would do their job if called after one another, and the
> result is a worktree with the *combined* conflicts. That is a legitimate
> use case (which I happened to *actually* perform just the other day).

I don't mean to suggest that you shouldn't be able to do that.

> If we fix "git stash" (and there is no reason we should not), it would
> also allow "git stash pop; git stash pop" to work with two stashes that
> both conflict with the current worktree, just in different files.

Sounds fine to me.  So, what would it mean to continue from that state?

> So I challenge you to get less hung up on the *exact* example I present,
> and to try to see through the example what the issue is that I am trying
> to get at.
>
>>> If that was not enough, there would still be the overall design of
>>> Git's user interface. You can call it confusing, inconsistent, with a
>>> lot of room for improvement, and you would be correct. But none of
>>> Git's commands has a `--continue` option that remembers the latest
>>> subcommand and continues that. To introduce that behavior in `git
>>> stash` would disimprove the situation.
>>
>> I think it's more the case that none of the current continuable commands
>> have subcommands (though I can't think of all the continuable or
>> abortable operations offhand, so maybe I'm wrong).  I think we're
>> discussing new UI ground here.
>
> Nope, we are not entering new UI ground here. The principle is clear with
> the existing --continue options: you pass them to the same operation that
> you want to continue. By that reasoning, "git stash --continue" should
> continue the (implicit) "save" operation. But that is not at all what you
> want.
>
>> And since the pattern is already "git foo --continue",
>
> But foo *is the operation*! By that reasoning, you should agree that "git
> stash --continue" is *wrong*!

No, in the user's mind *stash* is the operation!  The user is doing 
"stash" stuff.  She doesn't care if the conflicts came from a pop or an 
apply.  She has resolved the conflicts, and now she just wants to continue.

There's no confusion about possibly continuing some other stash 
subcommand, like save or list or drop.  None of those other commands are 
continuable.  In the following sequence

	git stash "let me save my work now"
	git stash --continue

That continue command does nothing, regardless of whether the implied 
save command succeeded or failed.  There's simply nothing to continue.

With the sequence

	git stash pop  # creates some conflicts
	git stash "got some popped conflicts"
	git stash --continue

Again the continue command should do nothing:  If the save command 
succeeded it should have cleaned up the interrupted pop.  If the save 
command failed, then the continue command can't continue from a 
conflicted state.  (OTOH, a "git stash --abort" would abort the pop.)

>> Think of it this way:  All the currently continuable/abortable commands
>> put the repository in a shaky state, where performing certain other
>> operations would be ill advised.  Attempting to start a rebase while a
>> merge conflict is unresolved, for example.  IIRC, git actually tries to
>> stop users from shooting their feet in this way.
>>
>> And so it should be for the stash operation:  If applying a stash yields
>> a conflict, it has to be resolved or aborted before something like a
>> rebase or merge is attempted.
>
> That already happens, and I have no idea how you think this safe-guarding
> has anything to do whether the "--continue" option makes sense in "git
> stash", or only in "git stash pop".
>
>> In the long run, I think there's even the possibility of generic "git
>> continue" and "git abort" commands,
>
> Wrong.
>
> You can call "git cherry-pick" (and "git cherry-pick --continue") while
> running a "git rebase -i".
>
> You can run "git rebase", "git stash", "git cherry-pick" and many other
> commands while running a "git bisect".
>
> You can even run a "git rebase" or a "git cherry-pick" while resolving an
> interrupted "git am".
>
> Many, many examples that make it *impossible* for Git to know *what* you
> want to continue, *what* you want to abort.

Right, I'd missed that.  I agree that a generic "git continue" is nonsense.

		M.


^ permalink raw reply

* Re: Git: new feature suggestion
From: Joao Pinto @ 2017-01-19 17:55 UTC (permalink / raw)
  To: Konstantin Khomoutov, Joao Pinto
  Cc: git, Linus Torvalds, CARLOS.PALMINHA@synopsys.com
In-Reply-To: <20170119093313.ea57832dfd1bc7e0b0f1e630@domain007.com>


Hi,

Às 6:33 AM de 1/19/2017, Konstantin Khomoutov escreveu:
> On Wed, 18 Jan 2017 10:40:52 +0000
> Joao Pinto <Joao.Pinto@synopsys.com> wrote:
> 
> [...]
>> I have seen a lot of Linux developers avoid this re-organization
>> operations because they would lose the renamed file history, because
>> a new log is created for the new file, even if it is a renamed
>> version of itself. I am sending you this e-mail to suggest the
>> creation of a new feature in Git: when renamed, a file or folder
>> should inherit his parent’s log and a “rename: …” would be
>> automatically created or have some kind of pointer to its “old” form
>> to make history analysis easier.
> 
> Git does not record renames because of its stance that what matters is
> code _of the whole project_ as opposed to its location in a particular
> file.
> 
> Hence with regard to renames Git "works backwards" by detecting them
> dynamically while traversing the history (such as with `git log`
> etc).  This detection uses certain heuristics which can be controlled
> with knobs pointed to by Stefan Beller.
> 
> Still, I welcome you to read the sort-of "reference" post by Linus
> Torvalds [1] in which he explains the reasoning behind this approach
> implemented in Git.  IMO, understanding the reasoning behind the idea
> is much better than just mechanically learning how to use it.
> 
> The whole thread (esp. Torvalds' replies) is worth reading, but that
> particular mail summarizes the whole thing very well.
> 
> (The reference link to it used to be [2], but Gmane is not fully
> recovered to be able to display it.)
> 
> 1. https://urldefense.proofpoint.com/v2/url?u=http-3A__public-2Dinbox.org_git_Pine.LNX.4.58.0504150753440.7211-40ppc970.osdl.org_&d=DwIDaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=s2fO0hii0OGNOv9qQy_HRXy-xAJUD1NNoEcc3io_kx0&m=X0bQCOGTuZF-uq6smPwJDw4Q47qHgjWaewgTHCbhMnM&s=97U97toe9A6XOAJxbhxvWeYpzl-wPw9QvlhQfAEUTdI&e= 
> 2. https://urldefense.proofpoint.com/v2/url?u=http-3A__thread.gmane.org_gmane.comp.version-2Dcontrol.git_27_focus-3D217&d=DwIDaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=s2fO0hii0OGNOv9qQy_HRXy-xAJUD1NNoEcc3io_kx0&m=X0bQCOGTuZF-uq6smPwJDw4Q47qHgjWaewgTHCbhMnM&s=agYFOBCbLeaKAB6frWWzcwHkZyrMZLW4ExgDxzQyVlI&e= 
> 

Thank you very much for the info!

Joao

^ permalink raw reply

* Re: Git: new feature suggestion
From: Linus Torvalds @ 2017-01-19 18:39 UTC (permalink / raw)
  To: Konstantin Khomoutov
  Cc: Joao Pinto, Git Mailing List, CARLOS.PALMINHA@synopsys.com
In-Reply-To: <20170119093313.ea57832dfd1bc7e0b0f1e630@domain007.com>

On Wed, Jan 18, 2017 at 10:33 PM, Konstantin Khomoutov
<kostix+git@007spb.ru> wrote:
>
> Still, I welcome you to read the sort-of "reference" post by Linus
> Torvalds [1] in which he explains the reasoning behind this approach
> implemented in Git.

It's worth noting that that discussion was from some _very_ early days
in git (one week into the whole thing), when none of those
visualization tools were actually implemented.

Even now, ten years after the fact, plain git doesn't actually do what
I outlined. Yes, "git blame -Cw" works fairly well, and is in general
better than the traditional per-file "annotate". And yes, "git log
--follow" does another (small) part of the outlined thing, but is
really not very powerful.

Some tools on top of git do more, but I think in general this is an
area that could easily be improved upon. For example, the whole
iterative and interactive drilling down in history of a particular
file is very inconvenient to do with "git blame" (you find a commit
that change the area in some way that you don't find interesting, so
then you have to restart git blame with the parent of that
unintersting commit).

You can do it in tig, but I suspect a more graphical tool might be better.

.. and we still end up having a lot of things where we simply just
work with pathnames. For example, when doing merges, it' absolutely
_wonderful_ doing

   gitk --merge <filename>

to see what happened to that filename that has a conflict during the
merge. But it's all based on the whole-file changes, and sometimes
you'd like to see just the commits that generate one particular
conflict (in the kernel, things like the MAINTAINERS file can have
quite a lot of changes, but they are all pretty idnependent, and what
you want to see is just "changes to this area").

We do have the "-L" flag to git log, but it doesn't actually work for
this particular case because of limitations.

So what I'm trying to say is that the argument from 10+ years ago that
"you can do better with intelligent tools after-the-fact" is very much
true, but it's also true that we don't actually have all those
intelligent tools, and this is an area that could still be improved
upon. Some of them are actually available as add-ons in various
graphical IDE's that use git.

                 Linus

^ permalink raw reply

* Re: [RFC 1/2] grep: only add delimiter if there isn't one already
From: Junio C Hamano @ 2017-01-19 18:39 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: git
In-Reply-To: <20170119150347.3484-2-stefanha@redhat.com>

Stefan Hajnoczi <stefanha@redhat.com> writes:

> git-grep(1) output does not follow git's own syntax:
>
>   $ git grep malloc v2.9.3:
>   v2.9.3::Makefile:       COMPAT_CFLAGS += -Icompat/nedmalloc
>   $ git show v2.9.3::Makefile
>   fatal: Path ':Makefile' does not exist in 'v2.9.3'
>
> This patch avoids emitting the unnecessary ':' delimiter if the name
> already ends with ':' or '/':
>
>   $ git grep malloc v2.9.3:
>   v2.9.3:Makefile:       COMPAT_CFLAGS += -Icompat/nedmalloc
>   $ git show v2.9.3:Makefile
>   (succeeds)

I am mildly negative on this one.  I suspect that the above example
is a made-up one and you might have a more compelling real-world use
case in mind, but at least the above does not convince me.

The end-user explicitly gave the extra ':' because she wanted to
feed the tree object, not a tag that leads to the tree, for some
reason.  I think the output should respect that and show how she
spelled the starting point.  IOW, it is not "':' added unnecessarily
by Git".  It is ':' added by the end-user for whatever reason.

>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  builtin/grep.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/grep.c b/builtin/grep.c
> index 462e607..3643d8a 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -490,7 +490,11 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
>  		strbuf_init(&base, PATH_MAX + len + 1);
>  		if (len) {
>  			strbuf_add(&base, name, len);
> -			strbuf_addch(&base, ':');
> +
> +			/* Add a delimiter if there isn't one already */
> +			if (name[len - 1] != '/' && name[len - 1] != ':') {
> +				strbuf_addch(&base, ':');
> +			}
>  		}
>  		init_tree_desc(&tree, data, size);
>  		hit = grep_tree(opt, pathspec, &tree, &base, base.len,

^ permalink raw reply

* Re: [RFC 2/2] grep: use '/' delimiter for paths
From: Junio C Hamano @ 2017-01-19 18:54 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: git
In-Reply-To: <20170119150347.3484-3-stefanha@redhat.com>

Stefan Hajnoczi <stefanha@redhat.com> writes:

> If the tree contains a sub-directory then git-grep(1) output contains a
> colon character instead of a path separator:
>
>   $ git grep malloc v2.9.3:t
>   v2.9.3:t:test-lib.sh:	setup_malloc_check () {
>   $ git show v2.9.3:t:test-lib.sh
>   fatal: Path 't:test-lib.sh' does not exist in 'v2.9.3'

I am slightly less negative on this compared to 1/2, but not by a
big margin.  The user wanted to feed a subtree to the command,
instead of doing the more natural

    $ git grep -e malloc v2.9.3 -- t

So again, "contains a colon character" is not coming from what Git
does, but the user gave Git "a colon character" when she didn't have
to.

Having said that, if we wanted to start ignoring what the end-user
said in the initial input like 1/2 and 2/2 does (i.e. "this specific
tree object" as an input), I think the approach to solve for 1/2 and
2/2 should be the same.  I think we should decide to do a slash
instead of a colon, not because v2.9.3: has colon at the end and
v2.9.3:t has colon in it, but because both of these are both bare
tree objects.  The patches presented does not seem to base their
decision on the actual object type but on the textual input, which
seems wrong.

^ permalink raw reply

* Re: grep open pull requests
From: Jeff King @ 2017-01-19 19:02 UTC (permalink / raw)
  To: Jack Bates; +Cc: git
In-Reply-To: <276f6ed9-ff06-a00f-b88a-4d40d55c6d40@nottheoilrig.com>

On Thu, Jan 19, 2017 at 11:12:03AM -0700, Jack Bates wrote:

> I have a couple questions around grepping among open pull requests.
> 
> First, "git for-each-ref --no-merged": When I run the following,
> it lists refs/pull/1112/head, even though #1112 was merged in commit
> ced4da1. I guess this is because the tip of refs/pull/1112/head is 107fc59,
> not ced4da1?

Right. Git's `--no-merged` is about commit ancestry.

> This maybe shows my lack of familiarity with Git details,
> but superficially the two commits appear identical -- [1] and [2] --
> same parent, etc. Nonetheless they have different SHA-1s.
> I'm not sure why that is -- I didn't merge the commit --
> but regardless, GitHub somehow still connects ced4da1 with #1112.

The commits differ only in the committer timestamp. Try:

  diff -u <(git cat-file commit 107fc5910) \
          <(git cat-file commit ced4da132)

Is it possible that somebody cherry-pick or rebased it? Looking at the
history of apache/trafficserver, I don't see any merges, which implies
to me that the project is using a rebase workflow to merge PRs.

It's much trickier to find from the git topology whether a particular
history contains rebased versions of commits.  You can look at the
--cherry options to "git log", which use patch-ids to try to equate
commits. Something like:

  git for-each-ref --format='%(refname)' 'refs/pull/*/head' |
  while read refname; do
	if test -z "$(git rev-list --right-only --cherry-pick -1 origin...$refname)
	then
		echo "$refname: not merged"
	fi
  done

That's obviously much less efficient than `--no-merged`, but it should
generally work. The exception is if the rebase changed the commit
sufficiently that its patch-id may have changed.

> So my question is, how are they doing that,

I suspect the answer is "somebody clicked the rebase button GitHub"
which simultaneously did the rebase and marked the PR as merged.

> Lastly, a question more about GitHub than Git, but: Given the way GitHub is
> setup, I hope I can get a list of unmerged pull requests from Git alone. Can
> you think of a way to list *open* pull requests,
> or is that status only available out of band?

That information isn't reflected in the git topology. It's in GitHub's
database. You can ask the API:

  https://developer.github.com/v3/

There are libraries to help with that:

  https://developer.github.com/libraries/

I think that's probably the best answer to your "unmerged" question,
too. Ask the API which PRs are unmerged, and then do whatever git-level
analysis you want based on that.

-Peff

^ permalink raw reply

* Re: [RFC 2/2] grep: use '/' delimiter for paths
From: Brandon Williams @ 2017-01-19 18:29 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: git, gitster
In-Reply-To: <20170119150347.3484-3-stefanha@redhat.com>

On 01/19, Stefan Hajnoczi wrote:
> If the tree contains a sub-directory then git-grep(1) output contains a
> colon character instead of a path separator:
> 
>   $ git grep malloc v2.9.3:t
>   v2.9.3:t:test-lib.sh:	setup_malloc_check () {
>   $ git show v2.9.3:t:test-lib.sh
>   fatal: Path 't:test-lib.sh' does not exist in 'v2.9.3'
> 
> This patch attempts to use the correct delimiter:
> 
>   $ git grep malloc v2.9.3:t
>   v2.9.3:t/test-lib.sh:	setup_malloc_check () {
>   $ git show v2.9.3:t/test-lib.sh
>   (success)
> 
> This patch does not cope with @{1979-02-26 18:30:00} syntax and treats
> it as a path because it contains colons.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  builtin/grep.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/builtin/grep.c b/builtin/grep.c
> index 3643d8a..06f8b47 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -493,7 +493,8 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
>  
>  			/* Add a delimiter if there isn't one already */
>  			if (name[len - 1] != '/' && name[len - 1] != ':') {
> -				strbuf_addch(&base, ':');
> +				/* rev: or rev:path/ */
> +				strbuf_addch(&base, strchr(name, ':') ? '/' : ':');

As Jeff mentioned it may be better to base which character gets appended
by checking the obj->type field like this maybe:

diff --git a/builtin/grep.c b/builtin/grep.c
index 69dab5dc5..9dfe11dc7 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -495,7 +495,8 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
 			/* Add a delimiter if there isn't one already */
 			if (name[len - 1] != '/' && name[len - 1] != ':') {
 				/* rev: or rev:path/ */
-				strbuf_addch(&base, strchr(name, ':') ? '/' : ':');
+				char del = obj->type == OBJ_COMMIT ? ':' : '/';
+				strbuf_addch(&base, del);
 			}
 		}
 		init_tree_desc(&tree, data, size);

-- 
Brandon Williams

^ permalink raw reply related

* Re: [PATCH] clear_delta_base_cache(): don't modify hashmap while iterating
From: Junio C Hamano @ 2017-01-19 19:16 UTC (permalink / raw)
  To: Jeff King; +Cc: Ulrich Spörlein, Ed Maste, git
In-Reply-To: <20170119163350.zsfb33lmigkyljjh@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> Thanks. Here it is rolled up with a commit message.
>
> -- >8 --
> Subject: clear_delta_base_cache(): don't modify hashmap while iterating
>
> Removing entries while iterating causes fast-import to
> access an already-freed `struct packed_git`, leading to
> various confusing errors.
>
> What happens is that clear_delta_base_cache() drops the
> whole contents of the cache by iterating over the hashmap,
> calling release_delta_base_cache() on each entry. That
> function removes the item from the hashmap. The hashmap code
> may then shrink the table, but the hashmap_iter struct
> retains an offset from the old table.
>
> As a result, the next call to hashmap_iter_next() may claim
> that the iteration is done, even though some items haven't
> been visited.
>
> The only caller of clear_delta_base_cache() is fast-import,
> which wants to clear the cache because it is discarding the
> packed_git struct for its temporary pack. So by failing to
> remove all of the entries, we still have references to the
> freed packed_git.
>
> To make things even more confusing, this doesn't seem to
> trigger with the test suite, because it depends on
> complexities like the size of the hash table, which entries
> got cleared, whether we try to access them before they're
> evicted from the cache, etc.
>
> So I've been able to identify the problem with large
> imports like freebsd's svn import, or a fast-export of
> linux.git. But nothing that would be reasonable to run as
> part of the normal test suite.
>
> We can fix this easily by iterating over the lru linked list
> instead of the hashmap. They both contain the same entries,
> and we can use the "safe" variant of the list iterator,
> which exists for exactly this case.
>
> Let's also add a warning to the hashmap API documentation to
> reduce the chances of getting bit by this again.
>
> Reported-by: Ulrich Spörlein <uqs@freebsd.org>
> Signed-off-by: Jeff King <peff@peff.net>
> ---

Makes sense.  Thanks, both, for reporting, finding and fixing.

Will apply.

>  Documentation/technical/api-hashmap.txt | 4 +++-
>  sha1_file.c                             | 9 ++++-----
>  2 files changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/technical/api-hashmap.txt b/Documentation/technical/api-hashmap.txt
> index 28f5a8b71..a3f020cd9 100644
> --- a/Documentation/technical/api-hashmap.txt
> +++ b/Documentation/technical/api-hashmap.txt
> @@ -188,7 +188,9 @@ Returns the removed entry, or NULL if not found.
>  `void *hashmap_iter_next(struct hashmap_iter *iter)`::
>  `void *hashmap_iter_first(struct hashmap *map, struct hashmap_iter *iter)`::
>  
> -	Used to iterate over all entries of a hashmap.
> +	Used to iterate over all entries of a hashmap. Note that it is
> +	not safe to add or remove entries to the hashmap while
> +	iterating.
>  +
>  `hashmap_iter_init` initializes a `hashmap_iter` structure.
>  +
> diff --git a/sha1_file.c b/sha1_file.c
> index 1eb47f611..d20714d6b 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -2342,11 +2342,10 @@ static inline void release_delta_base_cache(struct delta_base_cache_entry *ent)
>  
>  void clear_delta_base_cache(void)
>  {
> -	struct hashmap_iter iter;
> -	struct delta_base_cache_entry *entry;
> -	for (entry = hashmap_iter_first(&delta_base_cache, &iter);
> -	     entry;
> -	     entry = hashmap_iter_next(&iter)) {
> +	struct list_head *lru, *tmp;
> +	list_for_each_safe(lru, tmp, &delta_base_cache_lru) {
> +		struct delta_base_cache_entry *entry =
> +			list_entry(lru, struct delta_base_cache_entry, lru);
>  		release_delta_base_cache(entry);
>  	}
>  }

^ permalink raw reply

* Re: [PATCH v3 14/21] read-cache: touch shared index files when used
From: Junio C Hamano @ 2017-01-19 19:00 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Christian Couder, git, Ævar Arnfjörð Bjarmason,
	Christian Couder
In-Reply-To: <CACsJy8B_xe6RtszPrncvDdSYosNUQxvhcEQ3DO_WO0sepXqvvQ@mail.gmail.com>

Duy Nguyen <pclouds@gmail.com> writes:

> On Mon, Jan 9, 2017 at 9:34 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Duy Nguyen <pclouds@gmail.com> writes:
>>
>>> On Sun, Jan 8, 2017 at 4:46 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>>> Christian Couder <christian.couder@gmail.com> writes:
>>>>
>>>>> So what should we do if freshen_file() returns 0 which means that the
>>>>> freshening failed?
>>>>
>>>> You tell me ;-)  as you are the one who is proposing this feature.
>>>
>>> My answer is, we are not worse than freshening loose objects case
>>> (especially since I took the idea from there).
>>
>> I do not think so, unfortunately.  Loose object files with stale
>> timestamps are not removed as long as objects are still reachable.
>
> But there are plenty of unreachable loose objects, added in index,
> then got replaced with new versions. cache-tree can create loose trees
> too and it's been run more often, behind user's back, to take
> advantage of the shortcut in unpack-trees.

I am not sure if I follow.  Aren't objects reachable from the
cache-tree in the index protected from gc?

Not that I think freshening would actually fail in a repository
where you can actually write into to update the index or its refs to
make a difference (iow, even if we make it die() loudly when shared
index cannot be "touched" because we are paranoid, no real life
usage will trigger that die(), and if a repository does trigger the
die(), I think you would really want to know about it).

^ permalink raw reply

* Re: Git: new feature suggestion
From: Linus Torvalds @ 2017-01-19 19:16 UTC (permalink / raw)
  To: Joao Pinto
  Cc: Konstantin Khomoutov, Git Mailing List,
	CARLOS.PALMINHA@synopsys.com
In-Reply-To: <991ef396-3fc3-27d6-283c-b8dffa10a7b7@synopsys.com>

On Thu, Jan 19, 2017 at 10:54 AM, Joao Pinto <Joao.Pinto@synopsys.com> wrote:
>
> I am currently facing some challenges in one of Linux subsystems where a rename
> of a set of folders and files would be the perfect scenario for future
> development, but the suggestion is not accepted, not because it's not correct,
> but because it makes the maintainer life harder in backporting bug fixes and new
> features to older kernel versions and because it is not easy to follow the
> renamed file/folder history from the kernel.org history logs.

Honestly, that's less of a git issue, and more of a "patch will not
apply across versions" issue.

No amount of rename detection will ever fix that, simply because the
rename hadn't even _happened_ in the old versions that things get
backported to.

("git cherry-pick" can do a merge resolution and thus do "backwards"
renaming too, so tooling can definitely help, but it still ends up
meaning that even trivial patches are no longer the _same_ trivial
patch across versions).

So renaming things increases maintainer workloads in those situations
regardless of any tooling issues.

(You may also be referring to the mellanox mess, where this issue is
very much exacerbated by having different groups working on the same
thing, and maintainers having very much a "I will not take _anything_
from any of the groups that makes my life more complicated" model,
because those groups fucked up so much in the past).

In other words, quite often issues are about workflows rather than
tools. The networking layer probably has more of this, because David
actually does the backports himself, so he _really_ doesn't want to
complicate things.

               Linus

^ permalink raw reply

* Re: Git: new feature suggestion
From: Joao Pinto @ 2017-01-19 18:54 UTC (permalink / raw)
  To: Linus Torvalds, Konstantin Khomoutov
  Cc: Joao Pinto, Git Mailing List, CARLOS.PALMINHA@synopsys.com
In-Reply-To: <CA+55aFxAe8bH2xXkx1p5gYN+nc-D-vjNnfUeA_64Q3ttpbHq+w@mail.gmail.com>


Hi Linus,

Às 6:39 PM de 1/19/2017, Linus Torvalds escreveu:
> On Wed, Jan 18, 2017 at 10:33 PM, Konstantin Khomoutov
> <kostix+git@007spb.ru> wrote:
>>
>> Still, I welcome you to read the sort-of "reference" post by Linus
>> Torvalds [1] in which he explains the reasoning behind this approach
>> implemented in Git.
> 
> It's worth noting that that discussion was from some _very_ early days
> in git (one week into the whole thing), when none of those
> visualization tools were actually implemented.
> 
> Even now, ten years after the fact, plain git doesn't actually do what
> I outlined. Yes, "git blame -Cw" works fairly well, and is in general
> better than the traditional per-file "annotate". And yes, "git log
> --follow" does another (small) part of the outlined thing, but is
> really not very powerful.
> 
> Some tools on top of git do more, but I think in general this is an
> area that could easily be improved upon. For example, the whole
> iterative and interactive drilling down in history of a particular
> file is very inconvenient to do with "git blame" (you find a commit
> that change the area in some way that you don't find interesting, so
> then you have to restart git blame with the parent of that
> unintersting commit).
> 
> You can do it in tig, but I suspect a more graphical tool might be better.
> 
> .. and we still end up having a lot of things where we simply just
> work with pathnames. For example, when doing merges, it' absolutely
> _wonderful_ doing
> 
>    gitk --merge <filename>
> 
> to see what happened to that filename that has a conflict during the
> merge. But it's all based on the whole-file changes, and sometimes
> you'd like to see just the commits that generate one particular
> conflict (in the kernel, things like the MAINTAINERS file can have
> quite a lot of changes, but they are all pretty idnependent, and what
> you want to see is just "changes to this area").
> 
> We do have the "-L" flag to git log, but it doesn't actually work for
> this particular case because of limitations.
> 
> So what I'm trying to say is that the argument from 10+ years ago that
> "you can do better with intelligent tools after-the-fact" is very much
> true, but it's also true that we don't actually have all those
> intelligent tools, and this is an area that could still be improved
> upon. Some of them are actually available as add-ons in various
> graphical IDE's that use git.
> 
>                  Linus
> 

I am currently facing some challenges in one of Linux subsystems where a rename
of a set of folders and files would be the perfect scenario for future
development, but the suggestion is not accepted, not because it's not correct,
but because it makes the maintainer life harder in backporting bug fixes and new
features to older kernel versions and because it is not easy to follow the
renamed file/folder history from the kernel.org history logs.

Like nature shows us, the ability to adapt is the key for survival, so Linux
would gain a lot with some new features in git that can make maintainers life
easier. Assisted-backporting would be an excellent feature for them.

Did you ever thought about optimization backport operations through git or by an
add-on to it?

I am available to help if this feature makes sense for git users.

Thanks,
Joao

^ permalink raw reply

* [RFC/PATCH] Disallow commands from within unpopulated submodules.
From: Stefan Beller @ 2017-01-19 19:30 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

Consider you have a submodule at path "sub". What should happen in case
you run a command such as "git -C sub add ." ?

Here is what currently happens:
1) The submodule is populated, i.e. there is a .git (file/dir) inside
   "sub". This is equivalent of running "git add ." in the submodule and
   it behaves as you would expect, adding all files to the index.
2) The submodule is not populated or even not initialized.
   For quite some time we got
   $ git -C sub add .
   git: pathspec.c:317: prefix_pathspec: Assertion `item->nowildcard_len <= item->len && item->prefix <= item->len' failed.
   Aborted (core dumped)
   (This is fixed by another patch in flight to not assert,
    but rather die with a better message instead; but that patch is
    merely a fix of a corner case in the pathspec code.)

While 1) is rather uncontroversial, there are multiple things the user
may have intended with this command in 2):
 * add the submodule to the superproject
 * add all files inside the sub/ directory to the submodule or
   superproject.
It is unclear what the user intended, so rather error out instead.

Now let's ask the same question for "git -C sub status ." (which is a
command that is only reading and not writing to the repository)

1) If the submodule is populated, the user clearly intended to know
   more about the submodules status
2) It is unclear if the user wanted to learn about the submodules state
   (So ideally: "The submodule 'sub' is not initialized. To init ...")
   or the status check should be applied to the superproject instead.

Avoid the confusion in 2) as well and just error out for now. Later on
we may want to add another flag to git.c to allow commands to be run
inside unpopulated submodules and each command reacts appropriately.

Signed-off-by: Stefan Beller <sbeller@google.com>
---

This is the next logical step after sb/pathspec-errors (pathspec:
give better message for submodule related pathspec error). If you are in
a path that is clearly a submodules, I would expect that most users would
expect the git operation to apply to the submodule.  In case of unpopulated
submodules, this is not the case though, but we apply the operation to the
superproject, which may be wrong or confusing.  Hence just error out for now.
Later we may want to add a flag that allows specific commands to run in such
a setup (e.g. git status could give a fancier message than a die(..)).

I marked this as RFC
* to request for comments if this is a good idea from a UI-perspective
* because I did not adapt any test for this patch. (A lot of submodule tests
  seem to break with this; From a cursory read of those tests, I'd rather
  blame the tests for being sloppy than this patch damaging user expectations)

Thanks,
Stefan

 git.c       |  3 +++
 submodule.c | 36 ++++++++++++++++++++++++++++++++++++
 submodule.h |  1 +
 3 files changed, 40 insertions(+)

diff --git a/git.c b/git.c
index bbaa949e9c..ca53b61474 100644
--- a/git.c
+++ b/git.c
@@ -2,6 +2,7 @@
 #include "exec_cmd.h"
 #include "help.h"
 #include "run-command.h"
+#include "submodule.h"

 const char git_usage_string[] =
 	"git [--version] [--help] [-C <path>] [-c name=value]\n"
@@ -364,6 +365,8 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
 		if (prefix)
 			die("can't use --super-prefix from a subdirectory");
 	}
+	if (prefix)
+		check_prefix_inside_submodule(prefix);

 	if (!help && p->option & NEED_WORK_TREE)
 		setup_work_tree();
diff --git a/submodule.c b/submodule.c
index 73521cdbb2..357a22b804 100644
--- a/submodule.c
+++ b/submodule.c
@@ -495,6 +495,42 @@ void set_config_fetch_recurse_submodules(int value)
 	config_fetch_recurse_submodules = value;
 }

+/* check if the given prefix is inside an uninitialized submodule */
+void check_prefix_inside_submodule(const char *prefix)
+{
+	const char *work_tree = get_git_work_tree();
+	if (work_tree) {
+		int pos;
+		const struct cache_entry *in_submodule = NULL;
+
+		if (read_cache() < 0)
+			die("index file corrupt");
+		pos = cache_name_pos(prefix, strlen(prefix));
+		/*
+		 * gitlinks are recored with no ending '/' in the index,
+		 * but the prefix has an ending '/', so we will never find
+		 * an exact match, but always the position where we'd
+		 * insert the prefix.
+		 */
+		if (pos < 0) {
+			const struct cache_entry *ce;
+			int len = strlen(prefix);
+			/* Check the previous position */
+			pos = (-1 - pos) - 1;
+			ce = active_cache[pos];
+			if (ce->ce_namelen < len)
+				len = ce->ce_namelen;
+			if (!memcmp(ce->name, prefix, len))
+				in_submodule = ce;
+		} else
+			/* This case cannot happen because */
+			die("BUG: prefixes end with '/', but we do not record ending slashes in the index");
+
+		if (in_submodule)
+			die(_("command from inside unpopulated submodule '%s' not supported."), in_submodule->name);
+	}
+}
+
 static int has_remote(const char *refname, const struct object_id *oid,
 		      int flags, void *cb_data)
 {
diff --git a/submodule.h b/submodule.h
index b7576d6f43..6b30542822 100644
--- a/submodule.h
+++ b/submodule.h
@@ -30,6 +30,7 @@ struct submodule_update_strategy {
 };
 #define SUBMODULE_UPDATE_STRATEGY_INIT {SM_UPDATE_UNSPECIFIED, NULL}

+extern void check_prefix_inside_submodule(const char *prefix);
 int is_staging_gitmodules_ok(void);
 int update_path_in_gitmodules(const char *oldpath, const char *newpath);
 int remove_path_from_gitmodules(const char *path);
-- 
2.11.0.299.g762782ba8a

^ permalink raw reply related

* Re: [PATCH v5 3/3] log --graph: customize the graph lines with config log.graphColors
From: Junio C Hamano @ 2017-01-19 19:34 UTC (permalink / raw)
  To: Jeff King; +Cc: Nguyễn Thái Ngọc Duy, git
In-Reply-To: <xmqq1svy3nxi.fsf@gitster.mtv.corp.google.com>

Junio C Hamano <gitster@pobox.com> writes:

>> Since there's only one line that cares about the result of "colors",
>> maybe it would be less confusing to do:
>>
>>   if (!git_config_get-string("log.graphcolors", &string)) {
>> 	... parse, etc ...
>> 	graph_set_column_colors(colors.argv, colors.argc - 1);
>>   } else {
>> 	graph_set_column_colors(column_colors_ansi,
>> 	                        column_colors_ansi_max);
>>   }
>
> Yes, that would be much much less confusing.  By doing so, the cover
> letter can lose "pushing the limits of abuse", too ;-).

Will queue with a squashable change.

Thanks.

 graph.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/graph.c b/graph.c
index 822d40ae20..00aeee36d8 100644
--- a/graph.c
+++ b/graph.c
@@ -229,22 +229,20 @@ struct git_graph *graph_init(struct rev_info *opt)
 	struct git_graph *graph = xmalloc(sizeof(struct git_graph));
 
 	if (!column_colors) {
-		struct argv_array ansi_colors = {
-			column_colors_ansi,
-			column_colors_ansi_max + 1
-		};
-		struct argv_array *colors = &ansi_colors;
 		char *string;
-
-		if (!git_config_get_string("log.graphcolors", &string)) {
+		if (git_config_get_string("log.graphcolors", &string)) {
+			/* not configured -- use default */
+			graph_set_column_colors(column_colors_ansi,
+						column_colors_ansi_max);
+		} else {
 			static struct argv_array custom_colors = ARGV_ARRAY_INIT;
 			argv_array_clear(&custom_colors);
 			parse_graph_colors_config(&custom_colors, string);
 			free(string);
-			colors = &custom_colors;
+			/* graph_set_column_colors takes a max-index, not a count */
+			graph_set_column_colors(custom_colors.argv,
+						custom_colors.argc - 1);
 		}
-		/* graph_set_column_colors takes a max-index, not a count */
-		graph_set_column_colors(colors->argv, colors->argc - 1);
 	}
 
 	graph->commit = NULL;

^ permalink raw reply related

* Re: The design of refs backends, linked worktrees and submodules
From: Junio C Hamano @ 2017-01-19 19:44 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Michael Haggerty, Git Mailing List
In-Reply-To: <CACsJy8CHoroX2k9GqOFmXkvvPCPN4SBeCg+6aC2WSWNSKVmWQw@mail.gmail.com>

Duy Nguyen <pclouds@gmail.com> writes:

> ... A bit worried about transactions though,
> because I think per-repo and per-worktree updates will be separated in
> two transactions. But that's probably ok because future backends, like
> lmdb, will have to go through the same route anyway.

If we have per-worktree ref-store, what can be done as a useful
thing by future backends like lmdb may be to use the same single
database to store shared and per-worktree refs, similar to the way
Michael mentioned in his response to your message I am responding
to, i.e.

    I kindof think that it would have been a better design to store the
    references for all linked worktrees in the main repository's ref-store.
    For example, the "bisect" refs for a worktree named "<name>" could have
    been stored under "refs/worktrees/<name>/bisect/*".

The current design is heavily influenced by how "contrib/workdir"
lays things out, for the latter of which I take the blame X-<, but
perhaps the files backend can be retrofitted to use that layout in
the longer term?

^ permalink raw reply

* Re: problem with insider build for windows and git
From: Johannes Schindelin @ 2017-01-19 19:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Gooch, git
In-Reply-To: <xmqq8tq8b4mr.fsf@gitster.mtv.corp.google.com>

Hi Junio,

On Wed, 18 Jan 2017, Junio C Hamano wrote:

> Aside from the "ouch, one topic has merged earlier iteration, that
> was merged to 'master', also now merged to 'maint', and we need to
> follow up on both" you sent out earlier,

I know of one more "ouch" moment where my latest iterations did not get
picked up: my latest version of the "Avoid a segmentation fault with
renaming merges" patch did not output an error message in case of !nce
because the code flow will result in more appropriate error messages later
anyway. I did not provide a follow-up patch for that because the current
version in `maint` is not wrong per se.

> are there any other topic that are already in 'master' that should go to
> 2.11.x track?

Personally, I would have merged 'nd/config-misc-fixes' into `maint`, I
guess, and 'jc/abbrev-autoscale-config', and probably also 'jc/latin-1'.
The 'rj/git-version-gen-do-not-force-abbrev' topic would be another
candidate for inclusion. The 'ah/grammos' strikes me as obvious `maint`
material, as well as 'ew/svn-fixes'. I have no opinion on the p4 topics
(five, by my counting), as I have no experience with (or for that matter,
need for) Perforce, but Lars might have a strong opinion on those.

Having said that, these are the topics that *I* would merge into `maint`
if I maintained Git. I don't, so this is just my opinion, man [*1*].

Ciao,
Johannes

Footnote *1*: While you read that last part of the sentence, imagine me in
slippers and a bathrobe, with a White Russian in my left hand for which I
used milk instead of cream (for the White Russian, that is, not for my
left hand).

^ permalink raw reply

* Re: [PATCH v5 0/4] Per-worktree config file support
From: Stefan Beller @ 2017-01-19 20:03 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: git@vger.kernel.org, Junio C Hamano, Michael J Gruber,
	Jens Lehmann, Lars Schneider, Michael Haggerty, Max Kirillov
In-Reply-To: <CACsJy8BHgc8o+SydeiVnqaZRCbkJEWVzqDZM4sgey04ZLtG3tQ@mail.gmail.com>

On Thu, Jan 19, 2017 at 4:09 AM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Wed, Jan 11, 2017 at 12:01 AM, Stefan Beller <sbeller@google.com> wrote:
>> On Tue, Jan 10, 2017 at 3:25 AM, Nguyễn Thái Ngọc Duy <pclouds@gmail.com> wrote:
>>> Let's get this rolling again. To refresh your memory because it's half
>>> a year since v4 [1], this is about letting each worktree in multi
>>> worktree setup has their own config settings. The most prominent ones
>>> are core.worktree, used by submodules, and core.sparseCheckout.
>>
>> Thanks for getting this rolling again.
>>
>>>
>>> This time I'm not touching submodules at all. I'm leaving it in the
>>> good hands of "submodule people". All I'm providing is mechanism. How
>>> you use it is up to you. So the series benefits sparse checkout users
>>> only.
>>
>> As one of the "submodule people", I have no complaints here.
>>
>>>
>>> Not much has changed from v4, except that the migration to new config
>>> layout is done automatically _update_ a config variable with "git
>>> config --worktree".
>>>
>>> I think this one is more or less ready. I have an RFC follow-up patch
>>> about core.bare, but that could be handled separately.
>>
>> I have read through the series and think the design is sound for worktrees
>> (though I have little knowledge about them).
>
> Submodules and multi worktrees start to look very similar, the more I
> think about it. Well, except that multi worktree does not separate odb
> and config files, maybe.

Similar to worktrees submodules can appear and disappear without
affecting the project/main tree. (though the mechanism is different,
for submodules, you'd checkout a version that doesn't have the submodule,
whereas for worktrees the user explicitely says: "I don't want to see this
worktree any more")

> And we have already seen both have a need to
> share code (like the moving .git dir operation). I suspect I'll learn
> more about submodules along the way, and you worktrees ;-)

I sure hope so.

>
>> Now even further:
>>
>> So to build on top of this series, I'd like to make submodules usable
>> with worktrees (i.e. shared object store, only clone/fetch once and
>> all worktrees
>> benefit from it), the big question is how to get the underlying data
>> model right.
>>
>> Would a submodule go into the superprojects
>>
>>     .git/worktrees/<worktree-name>/modules/<submodule-name>/
>>
>> or rather
>>
>>     .git/modules<submodule-name>/worktrees/<worktree-name>
>>
>> Or both (one of them being a gitlink file pointing at the other?)
>>
>> I have not made up my mind, as I haven't laid out all cases that are
>> relevant here.
>
> I would go with a conservative step first, keep submodules
> per-worktree. After it's sorted out. You can change the layout (e.g.
> as a config extension). The latter probably has some complication (but
> yeah sharing would be a big plus).

The sharing is what we are asked for as it would "make
submodules usable" (compared to the repo tool, which
doesn't have object sharing AFAIK). ;)

Currently I am leaning to put the worktree directory first and the
submodules within, i.e.

.git/worktrees/<worktree-name>/modules/<submodule-name>/

but in that directory, we'd only have the per-worktree
specific stuff, the object store would live with the superprojects
main worktree, i.e. at .git/modules/<submodule-name> we'd have
the main git dir for the submodule.

Thanks,
Stefan

> --
> Duy

^ permalink raw reply

* Re: The design of refs backends, linked worktrees and submodules
From: Johannes Schindelin @ 2017-01-19 20:04 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Duy Nguyen, Git Mailing List
In-Reply-To: <341999fc-4496-b974-c117-c18a2fca1358@alum.mit.edu>

Hi,

On Thu, 19 Jan 2017, Michael Haggerty wrote:

> On 01/19/2017 12:55 PM, Duy Nguyen wrote:
> > I've started working on fixing the "git gc" issue with multiple
> > worktrees, which brings me back to this. Just some thoughts. Comments
> > are really appreciated.
> > 
> > In the current code, files backend has special cases for both
> > submodules (explicitly) and linked worktrees (hidden behind git_path).
> 
> There is another terrible hack also needed to implement linked
> worktrees, namely that the `refs/bisect/` hierarchy is manually inserted
> into the `ref_cache`, because otherwise it wouldn't be noticed when
> iterating over loose references via `readdir()`.
> 
> Other similar hacks would be required if other reference subtrees are
> declared to be per-worktree.
> 
> > But if a backend has to handle this stuff, all future backends have to
> > too. Which does not sound great. Imagine we have "something" in
> > addition to worktrees and submodules in future, then all backends have
> > to learn about it.
> 
> Agreed, the status quo is not pretty.
> 
> I kindof think that it would have been a better design to store the
> references for all linked worktrees in the main repository's ref-store.
> For example, the "bisect" refs for a worktree named "<name>" could have
> been stored under "refs/worktrees/<name>/bisect/*".

That strikes me as a good design, indeed. It addresses very explicitly the
root cause of the worktree problems: Git's source code was developed for
years with the assumption that there is a 1:1 mapping between ref names
and SHA-1s in each repository, and the way worktrees were implemented
broke that assumption.

So introducing a new refs/ namespace -- that is visible to all other
worktrees -- would have addressed that problem.

This, BTW, is related to my concerns about introducing a "shadow" config
layer for worktrees: I still think it would be a bad idea, and very likely
to cause regressions in surprising ways, to allow such config "overlays"
per-worktree, as Git's current code's assumption is that there is only one
config per repository, and that it can, say, set one config setting to
match another (which in the per-worktree case would possibly hold true in
only one worktree only). Instead, introducing a new "namespace" in the
(single) config similar to refs/worktrees/<name> could address that
problem preemptively.

Ciao,
Johannes

^ permalink raw reply

* Re: [PATCHv3 1/4] cache.h: document index_name_pos
From: Junio C Hamano @ 2017-01-19 20:17 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git
In-Reply-To: <20170119031854.4570-2-sbeller@google.com>

Stefan Beller <sbeller@google.com> writes:

> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>  cache.h | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
>
> diff --git a/cache.h b/cache.h
> index 1b67f078dd..1469ddeafe 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -575,7 +575,26 @@ extern int verify_path(const char *path);
>  extern int index_dir_exists(struct index_state *istate, const char *name, int namelen);
>  extern void adjust_dirname_case(struct index_state *istate, char *name);
>  extern struct cache_entry *index_file_exists(struct index_state *istate, const char *name, int namelen, int igncase);
> +
> +/*
> + * Searches for an entry defined by name and namelen in the given index.
> + * If the return value is positive (including 0) it is the position of an
> + * exact match. If the return value is negative, the negated value minus 1
> + * is the position where the entry would be inserted.

So if the return value is -3, you negate it to get 3 and then
subtract 1 to get 2.  The function is telling you that "e" will
sit at active_cache[2] in the following example.

Which is correct.

> + * Example: The current index consists of these files and its stages:
> + *
> + *   b#0, d#0, f#1, f#3
> + *
> + * index_name_pos(&index, "a", 1) -> -1
> + * index_name_pos(&index, "b", 1) ->  0
> + * index_name_pos(&index, "c", 1) -> -2
> + * index_name_pos(&index, "d", 1) ->  1
> + * index_name_pos(&index, "e", 1) -> -3
> + * index_name_pos(&index, "f", 1) -> -3
> + * index_name_pos(&index, "g", 1) -> -5
> + */

This time the counting seems correct.

Thanks.

^ permalink raw reply

* Re: [PATCH 2/2] Be more careful when determining whether a remote was configured
From: Jeff King @ 2017-01-19 20:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Schindelin, git, Thomas Gummerer, Andrew Arnott
In-Reply-To: <xmqq4m0u24hs.fsf@gitster.mtv.corp.google.com>

On Thu, Jan 19, 2017 at 12:12:47PM -0800, Junio C Hamano wrote:

> > The config-scope thing above would allow "remote.svn.vcs" in
> > ~/.gitconfig. But I don't think the test script actually checks that; it
> > checks for the repo-level config. And we would continue to do the right
> > thing there.
> 
> I am not "you" you are addressing to, but I think tying it to where
> the variable came from makes quite sense.  
> 
> Because it makes it no longer possible to just inspect the
> configured result to answer "is the remote configured?",
> introduction of the configured field also needs to be preserved from
> the original by Dscho, so does reading from historical non-config
> sources like $GIT_DIR/remotes/*, which are by definition
> per-repository thing.
> 
> IOW, with this tweak (and not setting ->configured based on what
> keys are set), I think Dscho's patch makes sense.

Yeah, worry if that wasn't clear: the hunk I posted was a just a
partial. The actual thing I built and ran against the test suite was
exactly as you described.

-Peff

^ permalink raw reply

* Re: [PATCH 2/2] Be more careful when determining whether a remote was configured
From: Junio C Hamano @ 2017-01-19 20:12 UTC (permalink / raw)
  To: Jeff King; +Cc: Johannes Schindelin, git, Thomas Gummerer, Andrew Arnott
In-Reply-To: <20170119182721.7y2zzrbaalfqjjn6@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> On Wed, Jan 18, 2017 at 05:22:40PM +0100, Johannes Schindelin wrote:
>
>> > > I want to err on the side of caution. That's why.
>> > 
>> > I guess I just don't see why changing the behavior with respect to
>> > "prune" or "proxy" is any less conservative than changing the one for
>> > "refspec".
>> 
> I think _this_ is a much better way of framing the problem. It is not
> about which keys are set, but about _where_ they are set. IOW, a
> reasonable rule would be: if there is any remote.*.X in the repo config,
> then git-remote should consider it a configured repo. And otherwise, no
> matter what is in ~/.gitconfig or elsewhere, git-remote should proceed
> as if it doesn't exist (and repo-level config can take precedence over
> config defined elsewhere).
>
> I.e., something like this:
>
> diff --git a/remote.c b/remote.c
> index 298f2f93f..720d616be 100644
> --- a/remote.c
> +++ b/remote.c
> @@ -373,6 +373,8 @@ static int handle_config(const char *key, const char *value, void *cb)
>  	}
>  	remote = make_remote(name, namelen);
>  	remote->origin = REMOTE_CONFIG;
> +	if (current_config_scope() == CONFIG_SCOPE_REPO)
> +		remote->configured = 1;
>  	if (!strcmp(subkey, "mirror"))
>  		remote->mirror = git_config_bool(key, value);
>  	else if (!strcmp(subkey, "skipdefaultupdate"))
>
> That doesn't make your test pass, but I think that is only because your
> test is not covering the interesting case (it puts the new config in the
> repo config, not in ~/.gitconfig).
>
> What do you think?
>
>> Originally, I would even have put the "vcs" into that set, as I could see
>> a legitimate use case for users to configure "remote.svn.vcs = vcs" in
>> ~/.gitconfig. But the regression test suite specifically tests for that
>> case, and I trust that there was a good reason, even if Thomas did not
>> describe that good reason in the commit message nor in any reply to this
>> patch pair.
>
> The config-scope thing above would allow "remote.svn.vcs" in
> ~/.gitconfig. But I don't think the test script actually checks that; it
> checks for the repo-level config. And we would continue to do the right
> thing there.

I am not "you" you are addressing to, but I think tying it to where
the variable came from makes quite sense.  

Because it makes it no longer possible to just inspect the
configured result to answer "is the remote configured?",
introduction of the configured field also needs to be preserved from
the original by Dscho, so does reading from historical non-config
sources like $GIT_DIR/remotes/*, which are by definition
per-repository thing.

IOW, with this tweak (and not setting ->configured based on what
keys are set), I think Dscho's patch makes sense.


^ permalink raw reply

* [RESEND PATCHv2] contrib: remove git-convert-objects
From: Stefan Beller @ 2017-01-19 20:29 UTC (permalink / raw)
  To: gitster; +Cc: git, torvalds, Stefan Beller
In-Reply-To: <20161228180205.29213-1-sbeller@google.com>

git-convert-objects, originally named git-convert-cache was used in
early 2005 to convert to a new repository format, e.g. adding an author
date.

By now the need for conversion of the very early repositories is less
relevant, we no longer need to keep it in contrib; remove it.

Signed-off-by: Stefan Beller <sbeller@google.com>
---

v2:
  no changes since v1, but as there were no replies nor do I find this
  patch cooking, I decided to resend.
  
Thanks,
Stefan

 contrib/convert-objects/convert-objects.c       | 329 ------------------------
 contrib/convert-objects/git-convert-objects.txt |  29 ---
 2 files changed, 358 deletions(-)
 delete mode 100644 contrib/convert-objects/convert-objects.c
 delete mode 100644 contrib/convert-objects/git-convert-objects.txt

diff --git a/contrib/convert-objects/convert-objects.c b/contrib/convert-objects/convert-objects.c
deleted file mode 100644
index f3b57bf1d2..0000000000
--- a/contrib/convert-objects/convert-objects.c
+++ /dev/null
@@ -1,329 +0,0 @@
-#include "cache.h"
-#include "blob.h"
-#include "commit.h"
-#include "tree.h"
-
-struct entry {
-	unsigned char old_sha1[20];
-	unsigned char new_sha1[20];
-	int converted;
-};
-
-#define MAXOBJECTS (1000000)
-
-static struct entry *convert[MAXOBJECTS];
-static int nr_convert;
-
-static struct entry * convert_entry(unsigned char *sha1);
-
-static struct entry *insert_new(unsigned char *sha1, int pos)
-{
-	struct entry *new = xcalloc(1, sizeof(struct entry));
-	hashcpy(new->old_sha1, sha1);
-	memmove(convert + pos + 1, convert + pos, (nr_convert - pos) * sizeof(struct entry *));
-	convert[pos] = new;
-	nr_convert++;
-	if (nr_convert == MAXOBJECTS)
-		die("you're kidding me - hit maximum object limit");
-	return new;
-}
-
-static struct entry *lookup_entry(unsigned char *sha1)
-{
-	int low = 0, high = nr_convert;
-
-	while (low < high) {
-		int next = (low + high) / 2;
-		struct entry *n = convert[next];
-		int cmp = hashcmp(sha1, n->old_sha1);
-		if (!cmp)
-			return n;
-		if (cmp < 0) {
-			high = next;
-			continue;
-		}
-		low = next+1;
-	}
-	return insert_new(sha1, low);
-}
-
-static void convert_binary_sha1(void *buffer)
-{
-	struct entry *entry = convert_entry(buffer);
-	hashcpy(buffer, entry->new_sha1);
-}
-
-static void convert_ascii_sha1(void *buffer)
-{
-	unsigned char sha1[20];
-	struct entry *entry;
-
-	if (get_sha1_hex(buffer, sha1))
-		die("expected sha1, got '%s'", (char *) buffer);
-	entry = convert_entry(sha1);
-	memcpy(buffer, sha1_to_hex(entry->new_sha1), 40);
-}
-
-static unsigned int convert_mode(unsigned int mode)
-{
-	unsigned int newmode;
-
-	newmode = mode & S_IFMT;
-	if (S_ISREG(mode))
-		newmode |= (mode & 0100) ? 0755 : 0644;
-	return newmode;
-}
-
-static int write_subdirectory(void *buffer, unsigned long size, const char *base, int baselen, unsigned char *result_sha1)
-{
-	char *new = xmalloc(size);
-	unsigned long newlen = 0;
-	unsigned long used;
-
-	used = 0;
-	while (size) {
-		int len = 21 + strlen(buffer);
-		char *path = strchr(buffer, ' ');
-		unsigned char *sha1;
-		unsigned int mode;
-		char *slash, *origpath;
-
-		if (!path || strtoul_ui(buffer, 8, &mode))
-			die("bad tree conversion");
-		mode = convert_mode(mode);
-		path++;
-		if (memcmp(path, base, baselen))
-			break;
-		origpath = path;
-		path += baselen;
-		slash = strchr(path, '/');
-		if (!slash) {
-			newlen += sprintf(new + newlen, "%o %s", mode, path);
-			new[newlen++] = '\0';
-			hashcpy((unsigned char *)new + newlen, (unsigned char *) buffer + len - 20);
-			newlen += 20;
-
-			used += len;
-			size -= len;
-			buffer = (char *) buffer + len;
-			continue;
-		}
-
-		newlen += sprintf(new + newlen, "%o %.*s", S_IFDIR, (int)(slash - path), path);
-		new[newlen++] = 0;
-		sha1 = (unsigned char *)(new + newlen);
-		newlen += 20;
-
-		len = write_subdirectory(buffer, size, origpath, slash-origpath+1, sha1);
-
-		used += len;
-		size -= len;
-		buffer = (char *) buffer + len;
-	}
-
-	write_sha1_file(new, newlen, tree_type, result_sha1);
-	free(new);
-	return used;
-}
-
-static void convert_tree(void *buffer, unsigned long size, unsigned char *result_sha1)
-{
-	void *orig_buffer = buffer;
-	unsigned long orig_size = size;
-
-	while (size) {
-		size_t len = 1+strlen(buffer);
-
-		convert_binary_sha1((char *) buffer + len);
-
-		len += 20;
-		if (len > size)
-			die("corrupt tree object");
-		size -= len;
-		buffer = (char *) buffer + len;
-	}
-
-	write_subdirectory(orig_buffer, orig_size, "", 0, result_sha1);
-}
-
-static unsigned long parse_oldstyle_date(const char *buf)
-{
-	char c, *p;
-	char buffer[100];
-	struct tm tm;
-	const char *formats[] = {
-		"%c",
-		"%a %b %d %T",
-		"%Z",
-		"%Y",
-		" %Y",
-		NULL
-	};
-	/* We only ever did two timezones in the bad old format .. */
-	const char *timezones[] = {
-		"PDT", "PST", "CEST", NULL
-	};
-	const char **fmt = formats;
-
-	p = buffer;
-	while (isspace(c = *buf))
-		buf++;
-	while ((c = *buf++) != '\n')
-		*p++ = c;
-	*p++ = 0;
-	buf = buffer;
-	memset(&tm, 0, sizeof(tm));
-	do {
-		const char *next = strptime(buf, *fmt, &tm);
-		if (next) {
-			if (!*next)
-				return mktime(&tm);
-			buf = next;
-		} else {
-			const char **p = timezones;
-			while (isspace(*buf))
-				buf++;
-			while (*p) {
-				if (!memcmp(buf, *p, strlen(*p))) {
-					buf += strlen(*p);
-					break;
-				}
-				p++;
-			}
-		}
-		fmt++;
-	} while (*buf && *fmt);
-	printf("left: %s\n", buf);
-	return mktime(&tm);
-}
-
-static int convert_date_line(char *dst, void **buf, unsigned long *sp)
-{
-	unsigned long size = *sp;
-	char *line = *buf;
-	char *next = strchr(line, '\n');
-	char *date = strchr(line, '>');
-	int len;
-
-	if (!next || !date)
-		die("missing or bad author/committer line %s", line);
-	next++; date += 2;
-
-	*buf = next;
-	*sp = size - (next - line);
-
-	len = date - line;
-	memcpy(dst, line, len);
-	dst += len;
-
-	/* Is it already in new format? */
-	if (isdigit(*date)) {
-		int datelen = next - date;
-		memcpy(dst, date, datelen);
-		return len + datelen;
-	}
-
-	/*
-	 * Hacky hacky: one of the sparse old-style commits does not have
-	 * any date at all, but we can fake it by using the committer date.
-	 */
-	if (*date == '\n' && strchr(next, '>'))
-		date = strchr(next, '>')+2;
-
-	return len + sprintf(dst, "%lu -0700\n", parse_oldstyle_date(date));
-}
-
-static void convert_date(void *buffer, unsigned long size, unsigned char *result_sha1)
-{
-	char *new = xmalloc(size + 100);
-	unsigned long newlen = 0;
-
-	/* "tree <sha1>\n" */
-	memcpy(new + newlen, buffer, 46);
-	newlen += 46;
-	buffer = (char *) buffer + 46;
-	size -= 46;
-
-	/* "parent <sha1>\n" */
-	while (!memcmp(buffer, "parent ", 7)) {
-		memcpy(new + newlen, buffer, 48);
-		newlen += 48;
-		buffer = (char *) buffer + 48;
-		size -= 48;
-	}
-
-	/* "author xyz <xyz> date" */
-	newlen += convert_date_line(new + newlen, &buffer, &size);
-	/* "committer xyz <xyz> date" */
-	newlen += convert_date_line(new + newlen, &buffer, &size);
-
-	/* Rest */
-	memcpy(new + newlen, buffer, size);
-	newlen += size;
-
-	write_sha1_file(new, newlen, commit_type, result_sha1);
-	free(new);
-}
-
-static void convert_commit(void *buffer, unsigned long size, unsigned char *result_sha1)
-{
-	void *orig_buffer = buffer;
-	unsigned long orig_size = size;
-
-	if (memcmp(buffer, "tree ", 5))
-		die("Bad commit '%s'", (char *) buffer);
-	convert_ascii_sha1((char *) buffer + 5);
-	buffer = (char *) buffer + 46;    /* "tree " + "hex sha1" + "\n" */
-	while (!memcmp(buffer, "parent ", 7)) {
-		convert_ascii_sha1((char *) buffer + 7);
-		buffer = (char *) buffer + 48;
-	}
-	convert_date(orig_buffer, orig_size, result_sha1);
-}
-
-static struct entry * convert_entry(unsigned char *sha1)
-{
-	struct entry *entry = lookup_entry(sha1);
-	enum object_type type;
-	void *buffer, *data;
-	unsigned long size;
-
-	if (entry->converted)
-		return entry;
-	data = read_sha1_file(sha1, &type, &size);
-	if (!data)
-		die("unable to read object %s", sha1_to_hex(sha1));
-
-	buffer = xmalloc(size);
-	memcpy(buffer, data, size);
-
-	if (type == OBJ_BLOB) {
-		write_sha1_file(buffer, size, blob_type, entry->new_sha1);
-	} else if (type == OBJ_TREE)
-		convert_tree(buffer, size, entry->new_sha1);
-	else if (type == OBJ_COMMIT)
-		convert_commit(buffer, size, entry->new_sha1);
-	else
-		die("unknown object type %d in %s", type, sha1_to_hex(sha1));
-	entry->converted = 1;
-	free(buffer);
-	free(data);
-	return entry;
-}
-
-int main(int argc, char **argv)
-{
-	unsigned char sha1[20];
-	struct entry *entry;
-
-	setup_git_directory();
-
-	if (argc != 2)
-		usage("git-convert-objects <sha1>");
-	if (get_sha1(argv[1], sha1))
-		die("Not a valid object name %s", argv[1]);
-
-	entry = convert_entry(sha1);
-	printf("new sha1: %s\n", sha1_to_hex(entry->new_sha1));
-	return 0;
-}
diff --git a/contrib/convert-objects/git-convert-objects.txt b/contrib/convert-objects/git-convert-objects.txt
deleted file mode 100644
index f871880cfb..0000000000
--- a/contrib/convert-objects/git-convert-objects.txt
+++ /dev/null
@@ -1,29 +0,0 @@
-git-convert-objects(1)
-======================
-
-NAME
-----
-git-convert-objects - Converts old-style git repository
-
-
-SYNOPSIS
---------
-[verse]
-'git-convert-objects'
-
-DESCRIPTION
------------
-Converts old-style git repository to the latest format
-
-
-Author
-------
-Written by Linus Torvalds <torvalds@osdl.org>
-
-Documentation
---------------
-Documentation by David Greaves, Junio C Hamano and the git-list <git@vger.kernel.org>.
-
-GIT
----
-Part of the linkgit:git[7] suite
-- 
2.11.0.299.g762782ba8a


^ permalink raw reply related

* [PATCH v6 0/3] Turn the difftool into a builtin
From: Johannes Schindelin @ 2017-01-19 20:30 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, David Aguilar, Dennis Kaarsemaker, Paul Sbarra
In-Reply-To: <cover.1484668473.git.johannes.schindelin@gmx.de>

This patch series converts the difftool from a Perl script into a
builtin, for three reasons:

1. Perl is really not native on Windows. Not only is there a performance
   penalty to be paid just for running Perl scripts, we also have to deal
   with the fact that users may have different Perl installations, with
   different options, and some other Perl installation may decide to set
   PERL5LIB globally, wreaking havoc with Git for Windows' Perl (which we
   have to use because almost all other Perl distributions lack the
   Subversion bindings we need for `git svn`).

2. As the Perl script uses Unix-y paths that are not native to Windows,
   the Perl interpreter has to go through a POSIX emulation layer (the
   MSYS2 runtime). This means that paths have to be converted from
   Unix-y paths to Windows-y paths (and vice versa) whenever crossing
   the POSIX emulation barrier, leading to quite possibly surprising path
   translation errors.

3. Perl makes for a rather large reason that Git for Windows' installer
   weighs in with >30MB. While one Perl script less does not relieve us
   of that burden, it is one step in the right direction.

Changes since v5:

- reworded the commit message of 2/3 to account for the change in v4
  where we no longer keep both scripted and builtin difftool working
  (with the switch difftool.useBuiltin deciding which one is used).

Johannes Schindelin (3):
  difftool: add a skeleton for the upcoming builtin
  difftool: implement the functionality in the builtin
  Retire the scripted difftool

 Makefile                                           |   2 +-
 builtin.h                                          |   1 +
 builtin/difftool.c                                 | 692 +++++++++++++++++++++
 .../examples/git-difftool.perl                     |   0
 git.c                                              |   1 +
 t/t7800-difftool.sh                                |  92 +--
 6 files changed, 741 insertions(+), 47 deletions(-)
 create mode 100644 builtin/difftool.c
 rename git-difftool.perl => contrib/examples/git-difftool.perl (100%)

base-commit: ffac48d093d4b518a0cc0e8bf1b7cb53e0c3d7a2
Published-As: https://github.com/dscho/git/releases/tag/builtin-difftool-v6
Fetch-It-Via: git fetch https://github.com/dscho/git builtin-difftool-v6

-- 
2.11.0.windows.3

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox