Git development

Git development
 help / color / mirror / Atom feed

* Re: [ANNOUNCE] Git Merge Contributor Summit topic planning
From: Junio C Hamano @ 2017-02-01 18:06 UTC (permalink / raw)
  To: Jeff King; +Cc: Erik van Zijst, git, ssaasen, mheemskerk
In-Reply-To: <20170201145300.4pn3faodhdb72jly@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> If you _can_ do that latter part, and you take "I only care about
> resumability" to the simplest extreme, you'd probably end up with a
> protocol more like:
>
>   Client: I need a packfile with this want/have
>   Server: OK, here it is; its opaque id is XYZ.
>   ... connection interrupted ...
>   Client: It's me again. I have up to byte N of pack XYZ
>   Server: OK, resuming
>           [or: I don't have XYZ anymore; start from scratch]
>
> Then generating XYZ and generating that bundle are basically the same
> task.

The above allows a simple and naive implementation of generating a
packstream and "tee"ing it to a spool file to be kept while sending
to the first client that asks XYZ.

The story I heard from folks who run git servers at work for Android
and other projects, however, is that they rarely see two requests
with want/have that result in an identical XYZ, unless "have" is an
empty set (aka "clone").  In a busy repository, between two clone
requests relatively close together, somebody would be pushing, so
you'd need many XYZs in your spool even if you want to support only
the "clone" case.

So in the real life, I think that the exchange needs to be more
like this:

    C: I need a packfile with this want/have
    ... C/S negotiate what "have"s are common ...
    S: Sorry, but our negitiation indicates that you are way too
       behind.  I'll send you a packfile that brings you up to a
       slightly older set of "want", so pretend that you asked for
       these slightly older "want"s instead.  The opaque id of that
       packfile is XYZ.  After getting XYZ, come back to me with
       your original set of "want"s.  You would give me more recent
       "have" in that request.  
    ... connection interrupted ...
    C: It's me again.  I have up to byte N of pack XYZ
    S: OK, resuming (or: I do not have it anymore, start from scratch)
    ... after 0 or more iterations C fully receives and digests XYZ ...

and then the above will iterate until the server does not have to
say "Sorry but you are way too behind" and returns a packfile
without having to tweak the "want".

That way, you can limit the number of XYZ you would need to keep to
a reasonable number.

The recent proposal by Jonathan Tan also allows the server side to
tweak the final tips the client receives after the protocol exchange
started.  I suspect the above two will become related.

^ permalink raw reply

* Re: [PATCH 1/5] add SWAP macro
From: René Scharfe @ 2017-02-01 18:06 UTC (permalink / raw)
  To: Jeff King, Johannes Schindelin
  Cc: Brandon Williams, Johannes Sixt, Git List, Junio C Hamano
In-Reply-To: <20170201114750.r5xdy6emdczmnh4j@sigill.intra.peff.net>

Am 01.02.2017 um 12:47 schrieb Jeff King:
> I'm not altogether convinced that SWAP() is an improvement in
> readability. I really like that it's shorter than the code it replaces,
> but there is a danger with introducing opaque constructs. It's one more
> thing for somebody familiar with C but new to the project to learn or
> get wrong.

I'm biased, of course, but SIMPLE_SWAP and SWAP exchange the values of 
two variables -- how could someone get that wrong?  Would it help to 
name the macro SWAP_VALUES?  Or XCHG? ;)

> IMHO the main maintenance gain from René's patch is that you don't have
> to specify the type, which means you can never have a memory-overflow
> bug due to incorrect types. If we lose that, then I don't see all that
> much value in the whole thing.

Size checks could be added to SIMPLE_SWAP as well.

The main benefit of a swap macro is reduced repetition IMHO: Users 
specify the variables to swap once instead of twice in a special order, 
and with SWAP they don't need to declare their type again.  Squeezing 
out redundancy makes the code easier to read and modify.

René

^ permalink raw reply

* Re: [PATCH 1/5] add SWAP macro
From: Junio C Hamano @ 2017-02-01 18:33 UTC (permalink / raw)
  To: René Scharfe
  Cc: Jeff King, Johannes Schindelin, Brandon Williams, Johannes Sixt,
	Git List
In-Reply-To: <bfd0d758-a9c8-9792-6294-9f9ed632cc98@web.de>

René Scharfe <l.s.r@web.de> writes:

> Size checks could be added to SIMPLE_SWAP as well.

Between SWAP() and SIMPLE_SWAP() I am undecided.

If the compiler can infer the type and the size of the two
"locations" given to the macro, there is no technical reason to
require the caller to specify the type as an extra argument, so
SIMPLE_SWAP() may not necessarily an improvement over SWAP() from
that point of view.  If the redundancy is used as a sanity check,
I'd be in favor of SIMPLE_SWAP(), though.

If the goal of SIMPLE_SWAP(), on the other hand, were to support the
"only swap char with int for small value" example earlier in the
thread, it means you cannot sanity check the type of things being
swapped in the macro, and the readers of the code need to know about
the details of what variables are being swapped.  It looks to me
that it defeats the main benefit of using a macro.

> The main benefit of a swap macro is reduced repetition IMHO: Users
> specify the variables to swap once instead of twice in a special
> order, and with SWAP they don't need to declare their type again.
> Squeezing out redundancy makes the code easier to read and modify.

Yes.

^ permalink raw reply

* Re: What's cooking in git.git (Jan 2017, #06; Tue, 31)
From: Junio C Hamano @ 2017-02-01 18:40 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, psteinhardt
In-Reply-To: <20170201110851.GA475@pks-pc>

Patrick Steinhardt <ps@pks.im> writes:

> I just tried to write additional tests to exercise this in our
> config-tests by using `git config --get-urlmatch` with multiple
> `http.extraheader`s set. I've been stumped that it still didn't
> work correctly with my patch, only the last value was actually
> ever returned.

I think that is very much in line with "git config --get" works, and
"--get-urlmatch" being an enhancement of "--get", I would expect
that "--get-urlmatch" to follow the usual "the last one wins" rule.

If you want to see _all_ values for a multi-valued variable, you
would say "git config --get-all".  IIUC, there currently is no
"--get-all-urlmatch", but there may need one.  I dunno.

^ permalink raw reply

* Re: [PATCH v3 4/4] connect: Add the envvar GIT_SSH_VARIANT and ssh.variant config
From: Junio C Hamano @ 2017-02-01 19:19 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Segev Finer, Jeff King
In-Reply-To: <9780d67c9f11c056202987377c542d0313772ba2.1485950225.git.johannes.schindelin@gmx.de>

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> index 2734b9a1ca..7f1f802396 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -694,10 +694,14 @@ static const char *get_ssh_command(void)
>  static int handle_ssh_variant(const char *ssh_command, int is_cmdline,
>  			      int *port_option, int *needs_batch)
>  {
> -	const char *variant;
> +	const char *variant = getenv("GIT_SSH_VARIANT");
>  	char *p = NULL;
>  
> -	if (!is_cmdline) {
> +	if (variant)
> +		; /* okay, fall through */
> +	else if (!git_config_get_string("ssh.variant", &p))
> +		variant = p;
> +	else if (!is_cmdline) {
>  		p = xstrdup(ssh_command);
>  		variant = basename(p);
>  	} else {
> @@ -717,7 +721,8 @@ static int handle_ssh_variant(const char *ssh_command, int is_cmdline,
>  	}
>  
>  	if (!strcasecmp(variant, "plink") ||
> -	    !strcasecmp(variant, "plink.exe"))
> +	    !strcasecmp(variant, "plink.exe") ||
> +	    !strcasecmp(variant, "putty"))

This means that "putty" that appear as the first word of the command
line, not the value configured for ssh.variant or GIT_SSH_VARIANT,
will also trigger the option for "plink", no?

Worse, "plink.exe" configured as the value of "ssh.variant", is
anything other than 'ssh', 'plink', 'putty', or 'tortoiseplink', but
it is not treated as normal ssh, contrary to the documentation.

> +ssh.variant::
> +	Depending on the value of the environment variables `GIT_SSH` or
> +	`GIT_SSH_COMMAND`, or the config setting `core.sshCommand`, Git
> +	auto-detects whether to adjust its command-line parameters for use
> +	with plink or tortoiseplink, as opposed to the default (OpenSSH).
> ++
> +The config variable `ssh.variant` can be set to override this auto-detection;
> +valid values are `ssh`, `plink`, `putty` or `tortoiseplink`. Any other value
> +will be treated as normal ssh. This setting can be overridden via the
> +environment variable `GIT_SSH_VARIANT`.

I was hoping that the "rewrite that retains simplicity" was also
correct, but let's not go in this direction.  The more lines you
touch in a hurry, the worse the code will get, and that is to be
expected.

I'd suggest taking the documentation updates from this series, and
then minimally plug the leak introduced by the previous round,
perhaps like the attached SQUASH.  As I said, GIT_SSH_VARIANT and
ssh.variant are expected to be rare cases, and the case in which
they are set does not have to be optimized if it makes the necessary
change smaller and more likely to be correct.

I think restructuring along the line of 3/4 of this round is very
sensible in the longer term (if anything, handle_ssh_variant() now
really handles ssh variants in all cases, which makes it worthy of
its name, as opposed to the one in the previous round that only
reacts to the overrides).  But it seems that it will take longer to
get such a rewrite right, and my priority is seeing this topic to
add autodetection to GIT_SSH_COMMAND and other smaller topics in the
upcoming -rc0 in a serviceable and correct shape.

The restructuring done by 3/4 can come later after the dust settles,
if somebody cares deeply enough about performance in the rare cases.

 Documentation/config.txt | 14 +++++++++-----
 Documentation/git.txt    |  9 ++++-----
 connect.c                |  9 +++++----
 3 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index f2c210f0a0..b88df57ab6 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1950,11 +1950,15 @@ matched against are those given directly to Git commands.  This means any URLs
 visited as a result of a redirection do not participate in matching.
 
 ssh.variant::
-	Override the autodetection of plink/tortoiseplink in the SSH
-	command that 'git fetch' and 'git push' use. It can be set to
-	either `ssh`, `plink`, `putty` or `tortoiseplink`. Any other
-	value will be treated as normal ssh. This is useful in case
-	that Git gets this wrong.
+	Depending on the value of the environment variables `GIT_SSH` or
+	`GIT_SSH_COMMAND`, or the config setting `core.sshCommand`, Git
+	auto-detects whether to adjust its command-line parameters for use
+	with plink or tortoiseplink, as opposed to the default (OpenSSH).
++
+The config variable `ssh.variant` can be set to override this auto-detection;
+valid values are `ssh`, `plink`, `putty` or `tortoiseplink`. Any other value
+will be treated as normal ssh. This setting can be overridden via the
+environment variable `GIT_SSH_VARIANT`.
 
 i18n.commitEncoding::
 	Character encoding the commit messages are stored in; Git itself
diff --git a/Documentation/git.txt b/Documentation/git.txt
index c322558aa7..a0c6728d1a 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -1021,11 +1021,10 @@ personal `.ssh/config` file.  Please consult your ssh documentation
 for further details.
 
 `GIT_SSH_VARIANT`::
-	If this environment variable is set, it overrides the autodetection
-	of plink/tortoiseplink in the SSH command that 'git fetch' and 'git
-	push' use. It can be set to either `ssh`, `plink`, `putty` or
-	`tortoiseplink`. Any other value will be treated as normal ssh. This
-	is useful in case that Git gets this wrong.
+	If this environment variable is set, it overrides Git's autodetection
+	whether `GIT_SSH`/`GIT_SSH_COMMAND`/`core.sshCommand` refer to OpenSSH,
+	plink or tortoiseplink. This variable overrides the config setting
+	`ssh.variant` that serves the same purpose.
 
 `GIT_ASKPASS`::
 	If this environment variable is set, then Git commands which need to
diff --git a/connect.c b/connect.c
index 7b4437578b..da51fef9ee 100644
--- a/connect.c
+++ b/connect.c
@@ -693,10 +693,11 @@ static const char *get_ssh_command(void)
 
 static int handle_ssh_variant(int *port_option, int *needs_batch)
 {
-	const char *variant;
+	char *variant;
 
-	if (!(variant = getenv("GIT_SSH_VARIANT")) &&
-		git_config_get_string_const("ssh.variant", &variant))
+	variant = xstrdup_or_null(getenv("GIT_SSH_VARIANT"));
+	if (!variant &&
+	    git_config_get_string("ssh.variant", &variant))
 		return 0;
 
 	if (!strcmp(variant, "plink") || !strcmp(variant, "putty"))
@@ -705,7 +706,7 @@ static int handle_ssh_variant(int *port_option, int *needs_batch)
 		*port_option = 'P';
 		*needs_batch = 1;
 	}
-
+	free(variant);
 	return 1;
 }
 

^ permalink raw reply related

* Re: [PATCH v3 4/4] connect: Add the envvar GIT_SSH_VARIANT and ssh.variant config
From: Junio C Hamano @ 2017-02-01 19:46 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Segev Finer, Jeff King
In-Reply-To: <xmqq1svhpvm0.fsf@gitster.mtv.corp.google.com>

Junio C Hamano <gitster@pobox.com> writes:

> I think restructuring along the line of 3/4 of this round is very
> sensible in the longer term (if anything, handle_ssh_variant() now
> really handles ssh variants in all cases, which makes it worthy of
> its name, as opposed to the one in the previous round that only
> reacts to the overrides).  But it seems that it will take longer to
> get such a rewrite right, and my priority is seeing this topic to
> add autodetection to GIT_SSH_COMMAND and other smaller topics in the
> upcoming -rc0 in a serviceable and correct shape.
>
> The restructuring done by 3/4 can come later after the dust settles,
> if somebody cares deeply enough about performance in the rare cases.

Having said all that, because I like the patch 3/4 so much, here is
another way to fix this, and I think (of course I am biased, but...)
the result is easier to grok.  Not only it makes it clear that the
namespace for the actual command names on the command line and the
namespace for the supported values of the configuration are distinct,
it makes it clear that we do not do anything extra when the user
explicitly tells us which variant to use.

This is designed to be squashed into [4/4] of this latest round (as
opposed to the one in the message I am responding to, which is to be
squashed into the previous round).

 connect.c | 42 +++++++++++++++++++++++++++++++-----------
 1 file changed, 31 insertions(+), 11 deletions(-)

diff --git a/connect.c b/connect.c
index 7f1f802396..12ad8d4c8b 100644
--- a/connect.c
+++ b/connect.c
@@ -691,17 +691,39 @@ static const char *get_ssh_command(void)
 	return NULL;
 }
 
-static int handle_ssh_variant(const char *ssh_command, int is_cmdline,
-			      int *port_option, int *needs_batch)
+static int override_ssh_variant(int *port_option, int *needs_batch)
 {
-	const char *variant = getenv("GIT_SSH_VARIANT");
+	char *variant;
+
+	variant = xstrdup_or_null(getenv("GIT_SSH_VARIANT"));
+	if (!variant &&
+	    git_config_get_string("ssh.variant", &variant))
+		return 0;
+
+	if (!strcmp(variant, "plink") || !strcmp(variant, "putty")) {
+		*port_option = 'P';
+		*needs_batch = 0;
+	} else if (!strcmp(variant, "tortoiseplink")) {
+		*port_option = 'P';
+		*needs_batch = 1;
+	} else {
+		*port_option = 'p';
+		*needs_batch = 0;
+	}
+	free(variant);
+	return 1;
+}
+
+static void handle_ssh_variant(const char *ssh_command, int is_cmdline,
+			       int *port_option, int *needs_batch)
+{
+	const char *variant;
 	char *p = NULL;
 
-	if (variant)
-		; /* okay, fall through */
-	else if (!git_config_get_string("ssh.variant", &p))
-		variant = p;
-	else if (!is_cmdline) {
+	if (override_ssh_variant(port_option, needs_batch))
+		return;
+
+	if (!is_cmdline) {
 		p = xstrdup(ssh_command);
 		variant = basename(p);
 	} else {
@@ -717,7 +739,7 @@ static int handle_ssh_variant(const char *ssh_command, int is_cmdline,
 			 */
 			free(ssh_argv);
 		} else
-			return 0;
+			return;
 	}
 
 	if (!strcasecmp(variant, "plink") ||
@@ -730,8 +752,6 @@ static int handle_ssh_variant(const char *ssh_command, int is_cmdline,
 		*needs_batch = 1;
 	}
 	free(p);
-
-	return 1;
 }
 
 /*

^ permalink raw reply related

* Re: [ANNOUNCE] Git Merge Contributor Summit topic planning
From: Christian Couder @ 2017-02-01 19:51 UTC (permalink / raw)
  To: Jeff King; +Cc: git
In-Reply-To: <20170131005900.fo5qnbsb67rzgnjt@sigill.intra.peff.net>

On Tue, Jan 31, 2017 at 1:59 AM, Jeff King <peff@peff.net> wrote:
> On Tue, Jan 31, 2017 at 01:48:05AM +0100, Jeff King wrote:
>
>> The list of topics is totally open. If you're coming and have something
>> you'd like to present or discuss, then propose it here. If you're _not_
>> coming, you may still chime in with input on topics, but please don't
>> suggest a topic unless somebody who is there will agree to lead the
>> discussion.
>
> Here are the two topics I plan on bringing:
>
>   - Git / Software Freedom Conservancy yearly report. I'll plan to give
>     a rundown of the past year's activities and financials, along with
>     some open questions that could benefit from community input.
>
>   - The git-scm.com website: who runs that thing, anyway? An overview
>     of the site, how it's managed, and what it needs.
>
> I plan to send out detailed emails on both topics to the list on
> Wednesday, and then follow-up with a summary of any useful in-person
> discussion (since obviously not everybody will be at the summit).
>
> I'd encourage anybody with a topic to present to consider doing
> something similar.

GitLab people at the Summit (this includes me) would like to spend a
few minutes to introduce https://gitlab.com/gitlab-org/gitaly/ and
answer any questions.

^ permalink raw reply

* Re: [PATCH v3 0/4] Handle PuTTY (plink/tortoiseplink) even in GIT_SSH_COMMAND
From: Junio C Hamano @ 2017-02-01 20:07 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Jeff King
In-Reply-To: <cover.1485950225.git.johannes.schindelin@gmx.de>

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> It is quite preposterous to call this an "iteration" of the patch
> series, because the code is so different now. I say this because I want
> to caution that this code has not been tested as thoroughly, by far, as
> the first iteration.
>
> The primary purpose of code review is correctness, everything else is
> either a consequence of it, or a means to make reviewing easier.

You are utterly wrong here.

The primary purpose of code review is to spot and correct the
problems the submitter has missed.  The problems can span from
stupid bugs that affect correctness to maintainability, to design
mistakes, to clarity of explanation for both end users and future
developers.

Among them, correctness problems are, as long as the problem to be
solved is specified clearly enough, the easiest to spot by the
submitter before the patch is sent out.  The submitter is focused on
solving one problem, and if the updated code does not even work as
the submitter advertises it would, that can be caught by the
submitter before the patch is even sent out.  

Of course, humans are not perfect, and catching correctness problems
is important, but that is not the only (let alone primary) thing.

When a submitter has been focusing on solving one problem, it is
easy to lose the larger picture and to end up adding something that
may be "correct" (in the sense of "works as advertised by the
submitter") but does not fit well with the rest of the system, or
covers some use cases but misses other important and related use
cases.  If the "does not fit well" surfaces to the end user level,
that would become a design problem.  If it affects the future
developers, that would become a maintainability problem.

Compared to the correctness issue, these are much harder to spot by
the submitter alone, who focused so intensely to get his code
"correct".  The review process is of greater value to spot these
issues.

I've already read and commented on the series; as I said, I think
the rewrite in 3/4 makes the resulting code much easier to read, and
with the fix-up I just sent out, I think the end result is correct,
works as advertised, the way it is advertised is clear to end users,
and is maintainable.  

But I do share your uneasiness about the "new-ness" of the code.

^ permalink raw reply

* Re: [ANNOUNCE] Git Merge Contributor Summit topic planning
From: Stefan Beller @ 2017-02-01 20:37 UTC (permalink / raw)
  To: Jeff King; +Cc: git@vger.kernel.org
In-Reply-To: <20170131004804.p5sule4rh2xrgtwe@sigill.intra.peff.net>

On Mon, Jan 30, 2017 at 4:48 PM, Jeff King <peff@peff.net> wrote:
> The Contributor Summit is only a few days away; I'd like to work out a
> few bits of logistics ahead of time.
>
> We're up to 26 attendees. The room layout will probably be three big
> round-tables, with a central projector. We should be able to have
> everybody pay attention to a single speaker, or break into 3 separate
> conversations.
>
> The list of topics is totally open. If you're coming and have something
> you'd like to present or discuss, then propose it here. If you're _not_
> coming, you may still chime in with input on topics, but please don't
> suggest a topic unless somebody who is there will agree to lead the
> discussion.

submodules and X (How do submodules and worktrees interact,
should they?, Which functions need support for submodules, e.g. checkout,
branch,  grep, etc...? Are we interested in keeping a submodule its own
logical unit? Do we want to have dedicated plumbing commands for
submodules?)

... would be my line of talk,

Stefan

^ permalink raw reply

* Re: [PATCH] merge-recursive: make "CONFLICT (rename/delete)" message show both paths
From: Junio C Hamano @ 2017-02-01 20:56 UTC (permalink / raw)
  To: Matt McCutchen; +Cc: git
In-Reply-To: <xmqqvaswrv5q.fsf@gitster.mtv.corp.google.com>

Junio C Hamano <gitster@pobox.com> writes:

> Matt McCutchen <matt@mattmccutchen.net> writes:
> ...
>> Please check that my refactoring is indeed correct!  All the existing tests pass
>> for me, but the existing test coverage of these conflict messages looks poor.
>
> This unfortunately is doing a bit too many things at once from that
> point of view.  I need to reserve a solid quiet 20-minutes without
> distraction to check it, which I am hoping to do tonight.

Let me make sure if I understood your changes correctly:

 * conflict_rename_delete() knew which one is renamed and which one
   is deleted (even though the deleted one was called "other"), but
   because in the original code handle_change_delete() wants to
   always see tree A first and then tree B in its parameter list,
   the original code swapped a/b before calling it.  In the original
   code, handle_change_delete() needed to figure out which one is
   the deleted one by looking at a_oid or b_oid.

 * In the updated code, the knowledge of which branch survives and
   which branch is deleted is passed from the caller to
   handle_change_delete(), which no longer needs to figure out by
   looking at a_oid/b_oid.  The updated API only passes the oid for
   surviving branch (as deleted one would have been 0{40} anyway).

 * In the updated code, handle_change_delete() is told the names of
   both branches (the one that survives and the other that was
   deleted).  It no longer has to switch between o->branch[12]
   depending on the NULLness of a_oid/b_oid; it knows both names and
   which one is which.

 * handle_modify_delete() also calls handle_change_delete().  Unlike
   conflict_rename_delete(), it is not told by its caller which
   branch keeps the path and which branch deletes the path, and
   instead relies on handle_change_delete() to figure it out.
   Because of the above change to the API, now it needs to sort it
   out before calling handle_change_delete().

It all makes sense to me.  

The single call to update_file() that appears near the end of
handle_change_delete() in the updated code corresponds to calls to
the same function in 3 among 4 codepaths in the function in the
original code.  It is a bit tricky to reason about, though.

In the original code, update_file() was omitted when we didn't have
to come up with a unique alternate filename and the one that is left
is a_oid (i.e. our side).  The way to tell if we did not come up
with a unique alternate filename used to be to see the "renamed"
variable but now it is the NULL-ness of "alt_path".  And the way to
tell if the side that is left is ours, we check to see o->branch1
is the change_branch, not delete_branch.

So the condition to guard the call to update_file() also looks
correct to me.

Thanks.

>> -	char *renamed = NULL;
>> +	char *alt_path = NULL;
>> +	const char *update_path = path;
>>  	int ret = 0;
>> +
>>  	if (dir_in_way(path, !o->call_depth, 0)) {
>> -		renamed = unique_path(o, path, a_oid ? o->branch1 : o->branch2);
>> +		update_path = alt_path = unique_path(o, path, change_branch);
>>  	}
>>  
>>  	if (o->call_depth) {
>> @@ -1081,43 +1085,43 @@ static int handle_change_delete(struct merge_options *o,
>>  		 */
>>  		ret = remove_file_from_cache(path);
>>  		if (!ret)
>> -			ret = update_file(o, 0, o_oid, o_mode,
>> -					  renamed ? renamed : path);
>> -	} else if (!a_oid) {
>> -		if (!renamed) {
>> -			output(o, 1, _("CONFLICT (%s/delete): %s deleted in %s "
>> -			       "and %s in %s. Version %s of %s left in tree."),
>> -			       change, path, o->branch1, change_past,
>> -			       o->branch2, o->branch2, path);
>> -			ret = update_file(o, 0, b_oid, b_mode, path);
>> -		} else {
>> -			output(o, 1, _("CONFLICT (%s/delete): %s deleted in %s "
>> -			       "and %s in %s. Version %s of %s left in tree at %s."),
>> -			       change, path, o->branch1, change_past,
>> -			       o->branch2, o->branch2, path, renamed);
>> -			ret = update_file(o, 0, b_oid, b_mode, renamed);
>> -		}
>> +			ret = update_file(o, 0, o_oid, o_mode, update_path);
>>  	} else {
>> -		if (!renamed) {
>> -			output(o, 1, _("CONFLICT (%s/delete): %s deleted in %s "
>> -			       "and %s in %s. Version %s of %s left in tree."),
>> -			       change, path, o->branch2, change_past,
>> -			       o->branch1, o->branch1, path);
>> +		if (!alt_path) {
>> +			if (!old_path) {
>> +				output(o, 1, _("CONFLICT (%s/delete): %s deleted in %s "
>> +				       "and %s in %s. Version %s of %s left in tree."),
>> +				       change, path, delete_branch, change_past,
>> +				       change_branch, change_branch, path);
>> +			} else {
>> +				output(o, 1, _("CONFLICT (%s/delete): %s deleted in %s "
>> +				       "and %s to %s in %s. Version %s of %s left in tree."),
>> +				       change, old_path, delete_branch, change_past, path,
>> +				       change_branch, change_branch, path);
>> +			}
>>  		} else {
>> -			output(o, 1, _("CONFLICT (%s/delete): %s deleted in %s "
>> -			       "and %s in %s. Version %s of %s left in tree at %s."),
>> -			       change, path, o->branch2, change_past,
>> -			       o->branch1, o->branch1, path, renamed);
>> -			ret = update_file(o, 0, a_oid, a_mode, renamed);
>> +			if (!old_path) {
>> +				output(o, 1, _("CONFLICT (%s/delete): %s deleted in %s "
>> +				       "and %s in %s. Version %s of %s left in tree at %s."),
>> +				       change, path, delete_branch, change_past,
>> +				       change_branch, change_branch, path, alt_path);
>> +			} else {
>> +				output(o, 1, _("CONFLICT (%s/delete): %s deleted in %s "
>> +				       "and %s to %s in %s. Version %s of %s left in tree at %s."),
>> +				       change, old_path, delete_branch, change_past, path,
>> +				       change_branch, change_branch, path, alt_path);
>> +			}
>>  		}
>>  		/*
>> -		 * No need to call update_file() on path when !renamed, since
>> -		 * that would needlessly touch path.  We could call
>> -		 * update_file_flags() with update_cache=0 and update_wd=0,
>> -		 * but that's a no-op.
>> +		 * No need to call update_file() on path when change_branch ==
>> +		 * o->branch1 && !alt_path, since that would needlessly touch
>> +		 * path.  We could call update_file_flags() with update_cache=0
>> +		 * and update_wd=0, but that's a no-op.
>>  		 */
>> +		if (change_branch != o->branch1 || alt_path)
>> +			ret = update_file(o, 0, changed_oid, changed_mode, update_path);
>>  	}
>> -	free(renamed);
>> +	free(alt_path);
>>  
>>  	return ret;
>>  }
>> @@ -1125,28 +1129,17 @@ static int handle_change_delete(struct merge_options *o,
>>  static int conflict_rename_delete(struct merge_options *o,
>>  				   struct diff_filepair *pair,
>>  				   const char *rename_branch,
>> -				   const char *other_branch)
>> +				   const char *delete_branch)
>>  {
>>  	const struct diff_filespec *orig = pair->one;
>>  	const struct diff_filespec *dest = pair->two;
>> -	const struct object_id *a_oid = NULL;
>> -	const struct object_id *b_oid = NULL;
>> -	int a_mode = 0;
>> -	int b_mode = 0;
>> -
>> -	if (rename_branch == o->branch1) {
>> -		a_oid = &dest->oid;
>> -		a_mode = dest->mode;
>> -	} else {
>> -		b_oid = &dest->oid;
>> -		b_mode = dest->mode;
>> -	}
>>  
>>  	if (handle_change_delete(o,
>>  				 o->call_depth ? orig->path : dest->path,
>> +				 o->call_depth ? NULL : orig->path,
>>  				 &orig->oid, orig->mode,
>> -				 a_oid, a_mode,
>> -				 b_oid, b_mode,
>> +				 &dest->oid, dest->mode,
>> +				 rename_branch, delete_branch,
>>  				 _("rename"), _("renamed")))
>>  		return -1;
>>  
>> @@ -1665,11 +1658,27 @@ static int handle_modify_delete(struct merge_options *o,
>>  				 struct object_id *a_oid, int a_mode,
>>  				 struct object_id *b_oid, int b_mode)
>>  {
>> +	const char *modify_branch, *delete_branch;
>> +	struct object_id *changed_oid;
>> +	int changed_mode;
>> +
>> +	if (a_oid) {
>> +		modify_branch = o->branch1;
>> +		delete_branch = o->branch2;
>> +		changed_oid = a_oid;
>> +		changed_mode = a_mode;
>> +	} else {
>> +		modify_branch = o->branch2;
>> +		delete_branch = o->branch1;
>> +		changed_oid = b_oid;
>> +		changed_mode = b_mode;
>> +	}
>> +
>>  	return handle_change_delete(o,
>> -				    path,
>> +				    path, NULL,
>>  				    o_oid, o_mode,
>> -				    a_oid, a_mode,
>> -				    b_oid, b_mode,
>> +				    changed_oid, changed_mode,
>> +				    modify_branch, delete_branch,
>>  				    _("modify"), _("modified"));
>>  }
>>  
>> diff --git a/t/t6045-merge-rename-delete.sh b/t/t6045-merge-rename-delete.sh
>> new file mode 100755
>> index 0000000..8f33244
>> --- /dev/null
>> +++ b/t/t6045-merge-rename-delete.sh
>> @@ -0,0 +1,23 @@
>> +#!/bin/sh
>> +
>> +test_description='Merge-recursive rename/delete conflict message'
>> +. ./test-lib.sh
>> +
>> +test_expect_success 'rename/delete' '
>> +echo foo >A &&
>> +git add A &&
>> +git commit -m "initial" &&
>> +
>> +git checkout -b rename &&
>> +git mv A B &&
>> +git commit -m "rename" &&
>> +
>> +git checkout master &&
>> +git rm A &&
>> +git commit -m "delete" &&
>> +
>> +test_must_fail git merge --strategy=recursive rename >output &&
>> +test_i18ngrep "CONFLICT (rename/delete): A deleted in HEAD and renamed to B in rename. Version rename of B left in tree." output
>> +'
>> +
>> +test_done

^ permalink raw reply

* Re: [ANNOUNCE] Git Merge Contributor Summit topic planning
From: Jeff King @ 2017-02-01 21:28 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Erik van Zijst, git, ssaasen, mheemskerk
In-Reply-To: <xmqqh94dpyzs.fsf@gitster.mtv.corp.google.com>

On Wed, Feb 01, 2017 at 10:06:15AM -0800, Junio C Hamano wrote:

> > If you _can_ do that latter part, and you take "I only care about
> > resumability" to the simplest extreme, you'd probably end up with a
> > protocol more like:
> >
> >   Client: I need a packfile with this want/have
> >   Server: OK, here it is; its opaque id is XYZ.
> >   ... connection interrupted ...
> >   Client: It's me again. I have up to byte N of pack XYZ
> >   Server: OK, resuming
> >           [or: I don't have XYZ anymore; start from scratch]
> >
> > Then generating XYZ and generating that bundle are basically the same
> > task.
> 
> The above allows a simple and naive implementation of generating a
> packstream and "tee"ing it to a spool file to be kept while sending
> to the first client that asks XYZ.
> 
> The story I heard from folks who run git servers at work for Android
> and other projects, however, is that they rarely see two requests
> with want/have that result in an identical XYZ, unless "have" is an
> empty set (aka "clone").  In a busy repository, between two clone
> requests relatively close together, somebody would be pushing, so
> you'd need many XYZs in your spool even if you want to support only
> the "clone" case.

Yeah, I agree a tag "XYZ" does not cover all cases, especially for
fetches.

We do caching at GitHub based on the sha1(want+have+options) tag, and it
does catch quite a lot of parallelism, but not all. It catches most
clones, and many fetches that are done by "thundering herds" of similar
clients.

One thing you could do with such a pure "resume XYZ" tag is to represent
the generated pack _without_ replicating the actual object bytes, but
take shortcuts by basing particular bits on the on-disk packfile. Just
enough to serve a deterministic packfile for the same want/have bits.

For instance, if the server knew that XYZ meant

  - send bytes m through n of packfile p, then...

  - send the object at position i of packfile p, as a delta against the
    object at position j of packfile q

  - ...and so on

Then you could store very small "instruction sheets" for each XYZ that
rely on the data in the packfiles. If those packfiles go away (e.g., due
to a repack) that invalidates all of your current XYZ tags. That's OK as
long as this is an optimization, not a correctness requirement.

I haven't actually built anything like this, though, so I don't have a
complete language for the instruction sheets, nor numbers on how big
they would be for average cases.

> So in the real life, I think that the exchange needs to be more
> like this:
> 
>     C: I need a packfile with this want/have
>     ... C/S negotiate what "have"s are common ...
>     S: Sorry, but our negitiation indicates that you are way too
>        behind.  I'll send you a packfile that brings you up to a
>        slightly older set of "want", so pretend that you asked for
>        these slightly older "want"s instead.  The opaque id of that
>        packfile is XYZ.  After getting XYZ, come back to me with
>        your original set of "want"s.  You would give me more recent
>        "have" in that request.  
>     ... connection interrupted ...
>     C: It's me again.  I have up to byte N of pack XYZ
>     S: OK, resuming (or: I do not have it anymore, start from scratch)
>     ... after 0 or more iterations C fully receives and digests XYZ ...
> 
> and then the above will iterate until the server does not have to
> say "Sorry but you are way too behind" and returns a packfile
> without having to tweak the "want".

Yes, I think that is a reasonable variant. The client knows about
seeding, but the XYZ conversation continues to happen inside the git
protocol. So it loses flexibility versus a true CDN redirection, but it
would "just work" when the server/client both understand the feature,
without the server admin having to set up a separate bundle-over-http
infrastructure.

-Peff

^ permalink raw reply

* Re: [ANNOUNCE] Git Merge Contributor Summit topic planning
From: Junio C Hamano @ 2017-02-01 21:35 UTC (permalink / raw)
  To: Jeff King; +Cc: Erik van Zijst, git, ssaasen, mheemskerk
In-Reply-To: <20170201212825.advj7f3ucnfbspbj@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> For instance, if the server knew that XYZ meant
>
>   - send bytes m through n of packfile p, then...
>   
>   - send the object at position i of packfile p, as a delta against the
>     object at position j of packfile q
>
>   - ...and so on
>
> Then you could store very small "instruction sheets" for each XYZ that
> rely on the data in the packfiles. If those packfiles go away (e.g., due
> to a repack) that invalidates all of your current XYZ tags. That's OK as
> long as this is an optimization, not a correctness requirement.

Yes.  You can play optimization games.

>> So in the real life, I think that the exchange needs to be more
>> like this:
>> 
>>     C: I need a packfile with this want/have
>>     ... C/S negotiate what "have"s are common ...
>>     S: Sorry, but our negitiation indicates that you are way too
>>        behind.  I'll send you a packfile that brings you up to a
>>        slightly older set of "want", so pretend that you asked for
>>        these slightly older "want"s instead.  The opaque id of that
>>        packfile is XYZ.  After getting XYZ, come back to me with
>>        your original set of "want"s.  You would give me more recent
>>        "have" in that request.  
>>     ... connection interrupted ...
>>     C: It's me again.  I have up to byte N of pack XYZ
>>     S: OK, resuming (or: I do not have it anymore, start from scratch)
>>     ... after 0 or more iterations C fully receives and digests XYZ ...
>> 
>> and then the above will iterate until the server does not have to
>> say "Sorry but you are way too behind" and returns a packfile
>> without having to tweak the "want".
>
> Yes, I think that is a reasonable variant. The client knows about
> seeding, but the XYZ conversation continues to happen inside the git
> protocol. So it loses flexibility versus a true CDN redirection, but it
> would "just work" when the server/client both understand the feature,
> without the server admin having to set up a separate bundle-over-http
> infrastructure.

You can also do a CDN offline as a natural extension.  When the
server says "Sorry, you are way too behind.", the above example
tells "I'll update you to a slightly stale version first" to the
client.  An natural extension could say "Go update yourself to a
slightly stale version first by grabbing that bundle over there."

But I agree that doing everything in-line may be a logical and
simpler first step to get there.

^ permalink raw reply

* git commit results in many lstat()s
From: Gumbel, Matthew K @ 2017-02-01 21:45 UTC (permalink / raw)
  To: git@vger.kernel.org

Hello,

My high level problem is to speed up git commit on a large repository stored on NFS filesystem. I see via strace that it is slow because it makes a large number (~50,000) of lstat() calls in serial. Every call is a round-trip to the NFS server.

I do not understand why git commit must call lstat() on every file in the repository, even when I specify the name of the file I want to commit on the command line. Can somebody explain why it must call lstat on every file?

My command-line looks like this: git commit -uno -o -m asdf file-to-commit.txt

Secondly, are there any optimizations I can make to avoid this behavior?

Thanks,
Matt

^ permalink raw reply

* [PATCH] doc: add note about ignoring --no-create-reflog
From: cornelius.weig @ 2017-02-01 22:07 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Cornelius Weig

From: Cornelius Weig <cornelius.weig@tngtech.com>

The commands git-branch and git-tag accept a `--create-reflog` argument.
On the other hand, the negated form `--no-create-reflog` is accepted as
a valid option but has no effect. This silent noop may puzzle users.

To communicate that this behavior is intentional, add a short note in
the manuals for git-branch and git-tag.

Signed-off-by: Cornelius Weig <cornelius.weig@tngtech.com>
---

Notes:
    In a previous discussion (<xmqqbmunrwbf.fsf@gitster.mtv.corp.google.com>) it
    was found that git-branch and git-tag accept a "--no-create-reflog" argument,
    but it has no effect, does not produce a warning, and is undocumented.

 Documentation/git-branch.txt | 1 +
 Documentation/git-tag.txt    | 1 +
 2 files changed, 2 insertions(+)

diff --git a/Documentation/git-branch.txt b/Documentation/git-branch.txt
index 1fae4ee..fca3754 100644
--- a/Documentation/git-branch.txt
+++ b/Documentation/git-branch.txt
@@ -91,6 +91,7 @@ OPTIONS
 	based sha1 expressions such as "<branchname>@\{yesterday}".
 	Note that in non-bare repositories, reflogs are usually
 	enabled by default by the `core.logallrefupdates` config option.
+	The negated form `--no-create-reflog` is silently ignored.
 
 -f::
 --force::
diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt
index 5b2288c..b0b933e 100644
--- a/Documentation/git-tag.txt
+++ b/Documentation/git-tag.txt
@@ -152,6 +152,7 @@ This option is only applicable when listing tags without annotation lines.
 --create-reflog::
 	Create a reflog for the tag. To globally enable reflogs for tags, see
 	`core.logAllRefUpdates` in linkgit:git-config[1].
+	The negated form `--no-create-reflog` is silently ignored.
 
 <tagname>::
 	The name of the tag to create, delete, or describe.
-- 
2.10.2


^ permalink raw reply related

* Re: git commit results in many lstat()s
From: Junio C Hamano @ 2017-02-01 22:11 UTC (permalink / raw)
  To: Gumbel, Matthew K; +Cc: git@vger.kernel.org
In-Reply-To: <DA0A42D68346B1469147552440A645039A9C56D4@ORSMSX115.amr.corp.intel.com>

"Gumbel, Matthew K" <matthew.k.gumbel@intel.com> writes:

> I do not understand why git commit must call lstat() on every file
> in the repository, even when I specify the name of the file I want
> to commit on the command line.

Assuming the "COPYING" and "README.md" files are already tracked:

    $ >COPYING
    $ >README.md
    $ git commit COPYING

would open an editor, in which you would see list of files under
"Changes to be committed", "Changes not staged for commit", etc.
Among the second class you would see README.md listed.

To figure out what paths are "changed", without having to open all
files and compare their contents with what is recorded in the commit
you are building on top of, we do lstat(2) to see if the timestamp
(and other information in the inode) of the files are the same since
you checked them out of HEAD.

    $ git commit --no-status COPYING

would reduce the number of lstat(2) somewhat, because the codepath
is told that it does not have to make the list to be shown in the
editor.  So would

    $ git commit -m "empty COPYING" COPYING

These two only halve the number of lstat(2), by taking advantage of
the fact that the list of "modified files" does not have to be
built.  There probably are other things that can be optimized.

^ permalink raw reply

* Re: [PATCH v3 0/4] Handle PuTTY (plink/tortoiseplink) even in GIT_SSH_COMMAND
From: Johannes Schindelin @ 2017-02-01 22:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King
In-Reply-To: <xmqqr33hoetx.fsf@gitster.mtv.corp.google.com>

Hi Junio,

On Wed, 1 Feb 2017, Junio C Hamano wrote:

> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
> 
> > It is quite preposterous to call this an "iteration" of the patch
> > series, because the code is so different now. I say this because I want
> > to caution that this code has not been tested as thoroughly, by far, as
> > the first iteration.
> >
> > The primary purpose of code review is correctness, everything else is
> > either a consequence of it, or a means to make reviewing easier.
> 
> You are utterly wrong here.
> 
> The primary purpose of code review is to spot and correct the
> problems the submitter has missed.  The problems can span from
> stupid bugs that affect correctness to maintainability, to design
> mistakes, to clarity of explanation for both end users and future
> developers.
> 
> Among them, correctness problems are, as long as the problem to be
> solved is specified clearly enough, the easiest to spot by the
> submitter before the patch is sent out.  The submitter is focused on
> solving one problem, and if the updated code does not even work as
> the submitter advertises it would, that can be caught by the
> submitter before the patch is even sent out.  
> 
> Of course, humans are not perfect, and catching correctness problems
> is important, but that is not the only (let alone primary) thing.
> 
> When a submitter has been focusing on solving one problem, it is
> easy to lose the larger picture and to end up adding something that
> may be "correct" (in the sense of "works as advertised by the
> submitter") but does not fit well with the rest of the system, or
> covers some use cases but misses other important and related use
> cases.  If the "does not fit well" surfaces to the end user level,
> that would become a design problem.  If it affects the future
> developers, that would become a maintainability problem.
> 
> Compared to the correctness issue, these are much harder to spot by
> the submitter alone, who focused so intensely to get his code
> "correct".  The review process is of greater value to spot these
> issues.

We will never agree on this.

From my perspective, design, explanation and maintainability are a
consequence of making it easy for reviewers to spot where the code is
incorrect.

And correctness is not covered by "the submitter tested this". Correctness
includes all the corner cases, where the "many eyes make bugs shallow"
really shines.

I'd rather have reviewers find bugs than users.

I will *never* be a fan of a review process that pushes correctness to a
back seat (yes, it is much harder than spotting typos or lines longer than
80 columns per row, but the ultimate goal is to deliver value to the end
user, not to make life easy for the maintainer).

Ciao,
Johannes

^ permalink raw reply

* Re: [PATCH v3 4/4] connect: Add the envvar GIT_SSH_VARIANT and ssh.variant config
From: Johannes Schindelin @ 2017-02-01 22:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Segev Finer, Jeff King
In-Reply-To: <xmqqwpd9ofry.fsf@gitster.mtv.corp.google.com>

Hi Junio,

On Wed, 1 Feb 2017, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
> 
> > I think restructuring along the line of 3/4 of this round is very
> > sensible in the longer term (if anything, handle_ssh_variant() now
> > really handles ssh variants in all cases, which makes it worthy of
> > its name, as opposed to the one in the previous round that only
> > reacts to the overrides).  But it seems that it will take longer to
> > get such a rewrite right, and my priority is seeing this topic to
> > add autodetection to GIT_SSH_COMMAND and other smaller topics in the
> > upcoming -rc0 in a serviceable and correct shape.
> >
> > The restructuring done by 3/4 can come later after the dust settles,
> > if somebody cares deeply enough about performance in the rare cases.
> 
> Having said all that, because I like the patch 3/4 so much, here is
> another way to fix this, and I think (of course I am biased, but...)
> the result is easier to grok.  Not only it makes it clear that the
> namespace for the actual command names on the command line and the
> namespace for the supported values of the configuration are distinct,
> it makes it clear that we do not do anything extra when the user
> explicitly tells us which variant to use.
> 
> This is designed to be squashed into [4/4] of this latest round (as
> opposed to the one in the message I am responding to, which is to be
> squashed into the previous round).
> 
>  connect.c | 42 +++++++++++++++++++++++++++++++-----------
>  1 file changed, 31 insertions(+), 11 deletions(-)
> 
> diff --git a/connect.c b/connect.c
> index 7f1f802396..12ad8d4c8b 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -691,17 +691,39 @@ static const char *get_ssh_command(void)
>  	return NULL;
>  }
>  
> -static int handle_ssh_variant(const char *ssh_command, int is_cmdline,
> -			      int *port_option, int *needs_batch)
> +static int override_ssh_variant(int *port_option, int *needs_batch)
>  {
> -	const char *variant = getenv("GIT_SSH_VARIANT");
> +	char *variant;
> +
> +	variant = xstrdup_or_null(getenv("GIT_SSH_VARIANT"));
> +	if (!variant &&
> +	    git_config_get_string("ssh.variant", &variant))
> +		return 0;
> +
> +	if (!strcmp(variant, "plink") || !strcmp(variant, "putty")) {
> +		*port_option = 'P';
> +		*needs_batch = 0;
> +	} else if (!strcmp(variant, "tortoiseplink")) {
> +		*port_option = 'P';
> +		*needs_batch = 1;
> +	} else {
> +		*port_option = 'p';
> +		*needs_batch = 0;
> +	}
> +	free(variant);
> +	return 1;
> +}
> +
> +static void handle_ssh_variant(const char *ssh_command, int is_cmdline,
> +			       int *port_option, int *needs_batch)
> +{
> +	const char *variant;
>  	char *p = NULL;
>  
> -	if (variant)
> -		; /* okay, fall through */
> -	else if (!git_config_get_string("ssh.variant", &p))
> -		variant = p;
> -	else if (!is_cmdline) {
> +	if (override_ssh_variant(port_option, needs_batch))
> +		return;
> +
> +	if (!is_cmdline) {
>  		p = xstrdup(ssh_command);
>  		variant = basename(p);
>  	} else {
> @@ -717,7 +739,7 @@ static int handle_ssh_variant(const char *ssh_command, int is_cmdline,
>  			 */
>  			free(ssh_argv);
>  		} else
> -			return 0;
> +			return;
>  	}
>  
>  	if (!strcasecmp(variant, "plink") ||
> @@ -730,8 +752,6 @@ static int handle_ssh_variant(const char *ssh_command, int is_cmdline,
>  		*needs_batch = 1;
>  	}
>  	free(p);
> -
> -	return 1;
>  }
>  
>  /*

That leaves the "putty" case in handle_ssh_variant(), does it not? Was it
not your specific objection that that is the case?

No worries, I will let this simmer for a while. Your fixup has a lot of
duplicated code (so much for maintainability as an important goal... ;-))
and I will have to think about it. My immediate thinking is to *not*
duplicate code, strip the .exe extension directly after calling basename()
in the cases where we handle commands, and handle "putty" in the other
cases by replacing it with the string "plink".

But as I said, this will simmer for a while.

Ciao,
Johannes

^ permalink raw reply

* RE: git commit results in many lstat()s
From: Gumbel, Matthew K @ 2017-02-01 22:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git@vger.kernel.org
In-Reply-To: <xmqq8tppo92x.fsf@gitster.mtv.corp.google.com>

"Junio C Hamano" <jch2355@gmail.com> writes:
> There probably are other things that can be optimized.

Yes, I think that when the user passes --only flag to git-commit, then git does not
need to call refresh_cache() in prepare_index() in builtin/commit.c.

I may experiment with that. Do you see any downside or negative side-effects?

Matt

^ permalink raw reply

* Re: [PATCH] doc: add note about ignoring --no-create-reflog
From: Junio C Hamano @ 2017-02-01 22:30 UTC (permalink / raw)
  To: cornelius.weig; +Cc: git, peff
In-Reply-To: <20170201220727.18070-1-cornelius.weig@tngtech.com>

cornelius.weig@tngtech.com writes:

> From: Cornelius Weig <cornelius.weig@tngtech.com>
>
> The commands git-branch and git-tag accept a `--create-reflog` argument.

For the purpose of contrasting the above with "--no-create-reflog",
I find it a bit too weak to just say "accept".  How about

    The commands git-branch and git-tag accept a `--create-reflog`
    option, and creates reflog even in a repository where
    core.logallrefupdates configuration is set not to.

or something?  After all "--no-create-reflog" is accepted.  It just
does not override the configured (or unconfigured) default.

> On the other hand, the negated form `--no-create-reflog` is accepted as
> a valid option but has no effect. This silent noop may puzzle users.

True, very true.

> To communicate that this behavior is intentional, add a short note in
> the manuals for git-branch and git-tag.

Hmph.  The added "short note" merely states the fact; it does not
hint that it is intentional or it explains what reasoning is behind
that intention.

> Signed-off-by: Cornelius Weig <cornelius.weig@tngtech.com>
> ---
>
> Notes:
>     In a previous discussion (<xmqqbmunrwbf.fsf@gitster.mtv.corp.google.com>) it
>     was found that git-branch and git-tag accept a "--no-create-reflog" argument,
>     but it has no effect, does not produce a warning, and is undocumented.

Reading what Peff said in the thread, I do not think we actively
wanted this behaviour; we agreed that it is merely acceptable.  

So perhaps s/this behaviour is intentional/this is known/ to weaken
the log message?  That way, when somebody else who really cares
comes later and finds this commit that adds explicit notes to these
manual pages via "git blame", s/he would not be dissuaded from
making things better.  Such an update may make it warn when
core.logallrefupdates is not set to false (and continue to ignore
the command line option), or it may make the command line option
actually override the configured default.

With such an update to the log message, I think the patch looks
good.

Thanks.

>  Documentation/git-branch.txt | 1 +
>  Documentation/git-tag.txt    | 1 +
>  2 files changed, 2 insertions(+)
>
> diff --git a/Documentation/git-branch.txt b/Documentation/git-branch.txt
> index 1fae4ee..fca3754 100644
> --- a/Documentation/git-branch.txt
> +++ b/Documentation/git-branch.txt
> @@ -91,6 +91,7 @@ OPTIONS
>  	based sha1 expressions such as "<branchname>@\{yesterday}".
>  	Note that in non-bare repositories, reflogs are usually
>  	enabled by default by the `core.logallrefupdates` config option.
> +	The negated form `--no-create-reflog` is silently ignored.
>  
>  -f::
>  --force::
> diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt
> index 5b2288c..b0b933e 100644
> --- a/Documentation/git-tag.txt
> +++ b/Documentation/git-tag.txt
> @@ -152,6 +152,7 @@ This option is only applicable when listing tags without annotation lines.
>  --create-reflog::
>  	Create a reflog for the tag. To globally enable reflogs for tags, see
>  	`core.logAllRefUpdates` in linkgit:git-config[1].
> +	The negated form `--no-create-reflog` is silently ignored.
>  
>  <tagname>::
>  	The name of the tag to create, delete, or describe.

^ permalink raw reply

* Re: [PATCH v3 4/4] connect: Add the envvar GIT_SSH_VARIANT and ssh.variant config
From: Junio C Hamano @ 2017-02-01 22:33 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Segev Finer, Jeff King
In-Reply-To: <alpine.DEB.2.20.1702012319460.3496@virtualbox>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> That leaves the "putty" case in handle_ssh_variant(), does it not? Was it
> not your specific objection that that is the case?

Yup, you can remove that while you reroll.

> No worries, I will let this simmer for a while. Your fixup has a lot of
> duplicated code (so much for maintainability as an important goal... ;-))
> and I will have to think about it. My immediate thinking is to *not*
> duplicate code,...

You need to realize that the namespaces of the configuration and the
command names are distinct.  There is no code duplication.

^ permalink raw reply

* Re: [PATCH] doc: add note about ignoring --no-create-reflog
From: Jeff King @ 2017-02-01 22:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: cornelius.weig, git
In-Reply-To: <xmqq4m0do86p.fsf@gitster.mtv.corp.google.com>

On Wed, Feb 01, 2017 at 02:30:38PM -0800, Junio C Hamano wrote:

> > Notes:
> >     In a previous discussion (<xmqqbmunrwbf.fsf@gitster.mtv.corp.google.com>) it
> >     was found that git-branch and git-tag accept a "--no-create-reflog" argument,
> >     but it has no effect, does not produce a warning, and is undocumented.
> 
> Reading what Peff said in the thread, I do not think we actively
> wanted this behaviour; we agreed that it is merely acceptable.
> 
> So perhaps s/this behaviour is intentional/this is known/ to weaken
> the log message?  That way, when somebody else who really cares
> comes later and finds this commit that adds explicit notes to these
> manual pages via "git blame", s/he would not be dissuaded from
> making things better.  Such an update may make it warn when
> core.logallrefupdates is not set to false (and continue to ignore
> the command line option), or it may make the command line option
> actually override the configured default.

Yeah, I'd consider it more of a "known bug" or "known limitation" than
anything.

Those can go in a separate section, but they're probably more likely to
be read when supplied next to the actual option.

> With such an update to the log message, I think the patch looks
> good.
> [...]
> > @@ -91,6 +91,7 @@ OPTIONS
> >  	based sha1 expressions such as "<branchname>@\{yesterday}".
> >  	Note that in non-bare repositories, reflogs are usually
> >  	enabled by default by the `core.logallrefupdates` config option.
> > +	The negated form `--no-create-reflog` is silently ignored.

This might be nitpicking, but it's _not_ ignored. It still negates an
earlier "--create-reflog". It is only that it does not override the
decision to create a reflog caused by the setting of
core.logallrefupdates.

-Peff

^ permalink raw reply

* Re: [PATCH v3 4/4] connect: Add the envvar GIT_SSH_VARIANT and ssh.variant config
From: Johannes Schindelin @ 2017-02-01 22:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Segev Finer, Jeff King
In-Reply-To: <xmqqzii5mthp.fsf@gitster.mtv.corp.google.com>

Hi Junio,

On Wed, 1 Feb 2017, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > That leaves the "putty" case in handle_ssh_variant(), does it not? Was it
> > not your specific objection that that is the case?
> 
> Yup, you can remove that while you reroll.
> 
> > No worries, I will let this simmer for a while. Your fixup has a lot of
> > duplicated code (so much for maintainability as an important goal... ;-))
> > and I will have to think about it. My immediate thinking is to *not*
> > duplicate code,...
> 
> You need to realize that the namespaces of the configuration and the
> command names are distinct.  There is no code duplication.

Since your 2/4 did away with the "plink" and "tortoiseplink" values in
favor of "port_option" and "batch_option", there is a duplication of logic
which I tried to undo in v3 and which you reintroduce in your fixup.

Maybe that refactoring was not so smart to begin with, and we should have
instead modified the code to use an enum instead and keep the original
conditionals instead of setting a port_option and a batch_option
explicitly.

Ciao,
Johannes

^ permalink raw reply

* Re: [PATCH v3 4/4] connect: Add the envvar GIT_SSH_VARIANT and ssh.variant config
From: Junio C Hamano @ 2017-02-01 22:43 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Segev Finer, Jeff King
In-Reply-To: <xmqqzii5mthp.fsf@gitster.mtv.corp.google.com>

Junio C Hamano <gitster@pobox.com> writes:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
>> That leaves the "putty" case in handle_ssh_variant(), does it not? Was it
>> not your specific objection that that is the case?
>
> Yup, you can remove that while you reroll.
>
>> No worries, I will let this simmer for a while. Your fixup has a lot of
>> duplicated code (so much for maintainability as an important goal... ;-))
>> and I will have to think about it. My immediate thinking is to *not*
>> duplicate code,...
>
> You need to realize that the namespaces of the configuration and the
> command names are distinct.  There is no code duplication.

To explain this a bit, there is no reason why allowed values for
SSH_VARIANT must be "putty" and "tortoiseplink".  An alternative
design could be "port_option=-p,needs_batch=yes" and it may be more
logical and futureproof if a variant of tortoiseplink decides to use
"-p" instead of "-P" and still require "-batch".

Prematurely attempting to share code, only because the current
vocabularies for two distinct concepts happen to overlap, is not
de-duplicating the code for maintainability.  It is adding
unnecessary work other people need to do in the future when they
want to extend the system.

^ permalink raw reply

* Re: [PATCH v3 0/4] Handle PuTTY (plink/tortoiseplink) even in GIT_SSH_COMMAND
From: Junio C Hamano @ 2017-02-01 22:55 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Jeff King
In-Reply-To: <alpine.DEB.2.20.1702012311440.3496@virtualbox>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> Compared to the correctness issue, these are much harder to spot by
>> the submitter alone, who focused so intensely to get his code
>> "correct".  The review process is of greater value to spot these
>> issues.
>
> We will never agree on this.

That's too bad.

> From my perspective, design, explanation and maintainability are a
> consequence of making it easy for reviewers to spot where the code is
> incorrect.
>
> And correctness is not covered by "the submitter tested this". Correctness
> includes all the corner cases, where the "many eyes make bugs shallow"
> really shines.
>
> I'd rather have reviewers find bugs than users.

I'd rather have submitters find bugs than reviewers.

> I will *never* be a fan of a review process that pushes correctness to a
> back seat (yes, it is much harder than spotting typos or lines longer than
> 80 columns per row, but the ultimate goal is to deliver value to the end
> user, not to make life easy for the maintainer).

Did I ever say correctness is pushed to a back seat?

I said that it is easier to spot correctness issues for you as a
submitter than other kinds of issues without outside help (and
implied that if you are a diligent contributor, you should aim for,
and you should be able to achieve, a patch series where correctness
issues do not need to be pointed out).  But other higher level
issues are harder for any submitter to spot (regardless of
experience and competence of the submitter), because one gets so
married to one's own code, design and worldview.  And that is why
"review is primarily to spot bugs" can never be a correct viewpoint.

A reviewee needs to be prepared to accept review comments on higher
level issues, even more readily than comments on correctness issues,
because it is too easy to be constrained by early decisions one has
already made while preparing a patch series and become blind to
bigger picture after staring one's own new code for number of hours.
Higher level issues can be more easily spotted by reviewers, whose
eyes are still fresh to the series.

^ permalink raw reply

* Re: [PATCH v3 4/4] connect: Add the envvar GIT_SSH_VARIANT and ssh.variant config
From: Johannes Schindelin @ 2017-02-01 23:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Segev Finer, Jeff King
In-Reply-To: <xmqqvastmt09.fsf@gitster.mtv.corp.google.com>

Hi Junio,

On Wed, 1 Feb 2017, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
> 
> > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> >
> >> That leaves the "putty" case in handle_ssh_variant(), does it not? Was it
> >> not your specific objection that that is the case?
> >
> > Yup, you can remove that while you reroll.
> >
> >> No worries, I will let this simmer for a while. Your fixup has a lot of
> >> duplicated code (so much for maintainability as an important goal... ;-))
> >> and I will have to think about it. My immediate thinking is to *not*
> >> duplicate code,...
> >
> > You need to realize that the namespaces of the configuration and the
> > command names are distinct.  There is no code duplication.
> 
> To explain this a bit, there is no reason why allowed values for
> SSH_VARIANT must be "putty" and "tortoiseplink".  An alternative
> design could be "port_option=-p,needs_batch=yes" and it may be more
> logical and futureproof if a variant of tortoiseplink decides to use
> "-p" instead of "-P" and still require "-batch".
> 
> Prematurely attempting to share code, only because the current
> vocabularies for two distinct concepts happen to overlap, is not
> de-duplicating the code for maintainability.  It is adding
> unnecessary work other people need to do in the future when they
> want to extend the system.

Except, of course, that your hypothetical port_option and needs_batch
settings would be handled at a different point altogether.

I sense very strongly that this discussion has taken a very emotional
turn, which is detrimental to the quality. So let's take a break here.

Ciao,
Johannes

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox