Git development
 help / color / mirror / Atom feed
* Re: [PATCH] diff: fix up SHA-1 abbreviations outside of repository
From: Junio C Hamano @ 2016-12-05  7:19 UTC (permalink / raw)
  To: Jack Bates; +Cc: git, Jeff King, Jack Bates
In-Reply-To: <20161204194747.7100-1-jack@nottheoilrig.com>

Jack Bates <bk874k@nottheoilrig.com> writes:

> The three cases where "git diff" operates outside of a repository are 1)
> when we run it outside of a repository, 2) when one of the files we're
> comparing is outside of the repository we're in, and 3) the --no-index
> option. Commit 4f03666 ("diff: handle sha1 abbreviations outside of
> repository", 2016-10-20) only worked in the first case.
> ---
>  builtin/diff.c                  |  4 +++-
>  t/t4063-diff-no-index-abbrev.sh | 50 +++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 53 insertions(+), 1 deletion(-)
>  create mode 100755 t/t4063-diff-no-index-abbrev.sh
>
> diff --git a/builtin/diff.c b/builtin/diff.c
> index 7f91f6d..ec7c432 100644
> --- a/builtin/diff.c
> +++ b/builtin/diff.c
> @@ -342,9 +342,11 @@ int cmd_diff(int argc, const char **argv, const char *prefix)
>  		       "--no-index" : "[--no-index]");
>  
>  	}
> -	if (no_index)
> +	if (no_index) {
>  		/* If this is a no-index diff, just run it and exit there. */
> +		startup_info->have_repository = 0;
>  		diff_no_index(&rev, argc, argv);
> +	}

This kind of change makes me nervous (partly because I am not seeing
the whole code but only this part of the patch).

Some code may react to "have_repository" being zero and do the right
thing (which I think is what you are using from your previous "we
did one of the three cases" change here), but the codepath that led
to "have_repository" being set to non-zero previously must have done
a lot more than just flipping that field to non-zero, and setting
zero to this field alone would not "undo" what it did.

I wonder if we're better off if we made sure that diff_no_index()
works the same way regardless of the value of "have_repository"
field?




^ permalink raw reply

* Re: Should reset_revision_walk clear more flags?
From: Junio C Hamano @ 2016-12-05  7:26 UTC (permalink / raw)
  To: Jeff King; +Cc: Mike Hommey, git
In-Reply-To: <20161205054013.taosbwjamxiwzocn@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> Which I think would include both of the flags you mentioned, along with
> others like SYMMETRIC_LEFT, PATCHSAME, etc. Probably really everything
> mentioned in revision.h, which should be summed up as ALL_REV_FLAGS.
> Some callsites already seem to feed that to clear_commit_marks().
>
> I doubt you can go too wrong by clearing more flags. 

This and ...

> It's possible that
> some caller is relying on a flag _not_ being cleared between two
> traversals, but in that case it should probably be using
> clear_commit_marks() or clear_object_flags() explicitly to make it clear
> what it expects to be saved.

... this are contradictory, no?  

A caller may use two sets of its own flags in addition to letting
the traversal machinery use the basic ones, and it may want to keep
one of its own two sets while clearing the other set.  It would
clear_commit_marks() to clear the latter.  And then it would let
that the next traversal to reset_revision_walk() clear the basic
ones.

So if you make reset_revision_walk() clear "more flags", that would
break such a caller that is using clear_commit_marks() to make it
clear what its expectations are, no?

I dunno.


^ permalink raw reply

* Re: [PATCH] diff: fix up SHA-1 abbreviations outside of repository
From: Jeff King @ 2016-12-05  7:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jack Bates, git, Jack Bates
In-Reply-To: <xmqqr35m6dwt.fsf@gitster.mtv.corp.google.com>

On Sun, Dec 04, 2016 at 11:19:46PM -0800, Junio C Hamano wrote:

> > -	if (no_index)
> > +	if (no_index) {
> >  		/* If this is a no-index diff, just run it and exit there. */
> > +		startup_info->have_repository = 0;
> >  		diff_no_index(&rev, argc, argv);
> > +	}
> 
> This kind of change makes me nervous (partly because I am not seeing
> the whole code but only this part of the patch).
> 
> Some code may react to "have_repository" being zero and do the right
> thing (which I think is what you are using from your previous "we
> did one of the three cases" change here), but the codepath that led
> to "have_repository" being set to non-zero previously must have done
> a lot more than just flipping that field to non-zero, and setting
> zero to this field alone would not "undo" what it did.

I _think_ it's OK because the only substantive change would be the
chdir() to the top of the working tree. But that information is carried
through by revs->prefix, which we act on regardless of the value of
startup_info->have_repository when we call prefix_filename().

I agree that it may be an accident waiting to happen, though, as soon as
some buried sub-function needs to care about the distinction.

> I wonder if we're better off if we made sure that diff_no_index()
> works the same way regardless of the value of "have_repository"
> field?

If you mean adding a diffopt flag like "just abbreviate everything to
FALLBACK_DEFAULT_ABBREV even if we're in a repository", and then setting
that in diff_no_index(), I agree that is a lot cleaner.

I'm still not 100% convinced that it's actually the correct behavior,
but at least doing a more contained version wouldn't take away other
functionality like reading config.

-Peff

^ permalink raw reply

* Re: Should reset_revision_walk clear more flags?
From: Jeff King @ 2016-12-05  7:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Mike Hommey, git
In-Reply-To: <xmqqmvga6dmb.fsf@gitster.mtv.corp.google.com>

On Sun, Dec 04, 2016 at 11:26:04PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > Which I think would include both of the flags you mentioned, along with
> > others like SYMMETRIC_LEFT, PATCHSAME, etc. Probably really everything
> > mentioned in revision.h, which should be summed up as ALL_REV_FLAGS.
> > Some callsites already seem to feed that to clear_commit_marks().
> >
> > I doubt you can go too wrong by clearing more flags. 
> 
> This and ...
> 
> > It's possible that
> > some caller is relying on a flag _not_ being cleared between two
> > traversals, but in that case it should probably be using
> > clear_commit_marks() or clear_object_flags() explicitly to make it clear
> > what it expects to be saved.
> 
> ... this are contradictory, no?

I don't think so. If you are calling reset_revision_walk(), then I'm not
sure you have any right to expect that particular flags are left intact.
You _should_ be using an option that lets you specify the full set of
flags, rather than making an implicit assumption about which flags are
left.

In other words, I'd much rather see:

  clear_object_flags(ALL_REV_FLAGS & (~TMP_MARK));

than:

  /* This leaves TMP_MARK intact! */
  reset_revision_walk();

if you need to leave TMP_MARK intact.

Which isn't to say every caller does it correctly already. My claim
above is only "should", not "does". :)

But given that many of them _do_ use the more specific flag-clearing
functions, I think most of the calls to reset_revision_walk() are
blanket "I'm done now, clean up after me" calls.

> A caller may use two sets of its own flags in addition to letting
> the traversal machinery use the basic ones, and it may want to keep
> one of its own two sets while clearing the other set.  It would
> clear_commit_marks() to clear the latter.  And then it would let
> that the next traversal to reset_revision_walk() clear the basic
> ones.
> 
> So if you make reset_revision_walk() clear "more flags", that would
> break such a caller that is using clear_commit_marks() to make it
> clear what its expectations are, no?

I'd argue that it should not be calling reset_revision_walk() at all if
it wants anything saved. It should mask out the flags it wants, as
above. But I do realize that reasoning is circular: it's OK for
reset_revision_walk() to reset everything because that's what it does,
or at least should do from its name. :)

-Peff

^ permalink raw reply

* Re: Error after calling git difftool -d with
From: P. Duijst @ 2016-12-05  7:58 UTC (permalink / raw)
  To: David Aguilar; +Cc: Johannes Schindelin, git
In-Reply-To: <20161205051510.itftw4hyzkv6nnxn@gmail.com>

On 12/5/2016 06:15, David Aguilar wrote:
> On Fri, Dec 02, 2016 at 05:05:06PM +0100, Johannes Schindelin wrote:
>> Hi Peter,
>>
>> On Fri, 2 Dec 2016, P. Duijst wrote:
>>
>>> Incase filenames are used with a quote ' or a bracket [  (and maybe some more
>>> characters), git "diff" and "difftool -y" works fine, but git *difftool **-d*
>>> gives the next error message:
>>>
>>>     peter@scm_ws_10 MINGW64 /d/Dev/test (master)
>>>     $ git diff
>>>     diff --git a/Test ''inch.txt b/Test ''inch.txt
>>>     index dbff793..41f3257 100644
>>>     --- a/Test ''inch.txt
>>>     +++ b/Test ''inch.txt
>>>     @@ -1 +1,3 @@
>>>     +
>>>     +ddd
>>>       Test error in simple repository
>>>     warning: LF will be replaced by CRLF in Test ''inch.txt.
>>>     The file will have its original line endings in your working directory.
>>>
>>>     peter@scm_ws_10 MINGW64 /d/Dev/test (master)
>>>     *$ git difftool -d*
>>>     *fatal: Cannot open '/d/Dev/test//Test ''inch.txt': No such file or
>>>     directory*
>>>     *hash-object /d/Dev/test//Test ''inch.txt: command returned error: 128*
>>>
>>>     peter@scm_ws_10 MINGW64 /d/Dev/test (master)
>>>     $
>>>
>>>
>>> This issue is inside V2.10.x and V2.11.0.
>>> V2.9.0 is working correctly...
>> You say v2.11.0, but did you also try the new, experimental builtin
>> difftool? You can test without reinstalling:
>>
>> 	git -c difftool.useBuiltin=true difftool -d ...
> FWIW, I verified that this problem does not manifest itself on
> Linux, using the current scripted difftool.
>
> Peter, what actual diff tool are you using?
>
> Since these filenames work fine with "difftool -d" on Linux, it
> suggests that this is either a tool-specific issue, or an issue
> related to unix-to-windows path translation.
Hi all,

@Johannes: "git -c difftool.useBuiltin=true difftool -d" works OK :-), 
beyond compare is launching with the diff's displayed
@David: I am using Beyond Compare V4.1.9

Best regards,

Peter


^ permalink raw reply

* Re: [PATCH v1] git-p4: add config to retry p4 commands; retry 3 times by default
From: Lars Schneider @ 2016-12-05  9:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <20161204140311.26269-1-larsxschneider@gmail.com>


> On 04 Dec 2016, at 15:03, larsxschneider@gmail.com wrote:
> 
> From: Lars Schneider <lars.schneider@autodesk.com>

Hi Junio,

if you decide to queue this patch and/or the "git-p4: fix empty file 
processing for large file system backend GitLFS", please use my 
signed-off address. I accidentally messed up the author field in
both.

Thanks,
Lars


> 
> P4 commands can fail due to random network issues. P4 users can counter
> these issues by using a retry flag supported by all p4 commands [1].
> 
> Add an integer Git config value `git-p4.retries` to define the number of
> retries for all p4 invocations. If the config is not defined then set
> the default retry count to 3.
> 
> [1] https://www.perforce.com/perforce/doc.current/manuals/cmdref/global.options.html
> 
> Signed-off-by: Lars Schneider <larsxschneider@gmail.com>


^ permalink raw reply

* Re: [PATCH 1/4] shallow.c: make paint_alloc slightly more robust
From: Duy Nguyen @ 2016-12-05 10:02 UTC (permalink / raw)
  To: Jeff King; +Cc: Rasmus Villemoes, Git Mailing List
In-Reply-To: <20161203051454.vp772xtto5ddxe7g@sigill.intra.peff.net>

On Sat, Dec 3, 2016 at 12:14 PM, Jeff King <peff@peff.net> wrote:
> On Fri, Dec 02, 2016 at 09:31:01PM +0100, Rasmus Villemoes wrote:
>
>> I have no idea if this is a real issue, but it's not obvious to me that
>> paint_alloc cannot be called with info->nr_bits greater than about
>> 4M (\approx 8*COMMIT_SLAB_SIZE). In that case the new slab would be too
>> small. So just round up the allocation to the maximum of
>> COMMIT_SLAB_SIZE and size.
>
> I had trouble understanding what the problem is from this description,
> but I think i figured it out from the code.
>
> Let me try to restate it to make sure I understand.
>
> The paint_alloc() may be asked to allocate a certain number of bits,
> which it does across a series of independently allocated slabs. Each
> slab holds a fixed size, but we only allocate a single slab. If the
> number we need to allocate is larger than fits in a single slab, then at
> the end we'll have under-allocated.

Each bit here represents a ref. This code walks the commit graph and
"paints" all commits reachable by the n-th ref with the n-th bit,
stored in the commit slab. But because the majority of commits will
have the same bitmap (e.g. when you exclude tag ABC and nothing else,
then all commits from ABC will have the same bitmap "1"), it's a waste
to allocate the same bitmap per commit (and it's also inefficient to
let malloc allocate 1 bit). I tried to reduce the memory usage: if the
a commit and its parent has the same bitmap, and the slab pointer of
the child commit points to the memory of the parent's, no extra
allocation is done. This manual memory management is pretty much like
alloc.c

The COMMIT_SLAB_SIZE here is really an arbitrary big number so that we
don't have to allocate often. It's basically allocating a new memory
pool. When we use all of that pool, we allocate a new one.. Yeah I
probably should define a new one instead of reusing COMMIT_SLAB_SIZE.
Tthe chances of under-allocation is super low, but still possible: you
need to send more than 4M "exclude" (or "shallow") requests to
upload-pack, to create a bitmap of over 512KiB. That's a lot of
traffic in git protocol.

> Your solution is to make the slab we allocate bigger. But that seems
> odd to me. Usually when we are using COMMIT_SLAB_SIZE, we are allocating
> a series of slabs that make up a virtual array, and we know that each
> slab has the same size. So if you need to find the k-th item, and each
> slab has length n, then you'd look at slab (k / n), and then at item (k
> % n) within that slab.
>
> In other words, I think the solution isn't to make the one slab bigger,
> but to allocate slabs until we have enough of them to meet the request.

If I still understand my code (it's been a long time since I wrote
this thing), then I think we just need to catch the problem and die().
Normal users should never ask the server to allocate this much.
-- 
Duy

^ permalink raw reply

* Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
From: Johannes Schindelin @ 2016-12-05 10:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, git, David Aguilar, Dennis Kaarsemaker
In-Reply-To: <xmqqlgvz6x87.fsf@gitster.mtv.corp.google.com>

Hi Junio,

On Thu, 1 Dec 2016, Junio C Hamano wrote:

> As to "I have to spawn config", I think it is sensible to start the
> cmd_difftool() wrapper without adding RUN_SETUP to the command table,
> then call git_config_get_bool() to check the configuration only from
> system and per-user files, and then finally either call into
> builtin_difftool() where setup_git_directory() is called, or spawn the
> scripted difftool, as Peff already said.

I just spent a considerable amount of time to figure out that I overlooked
your comment about "only from system and per-user files".

> Your "users opt-in while installing" is not about setting per-repository
> option.

Wow. That would make things really inconsistent.

I know that *I* would want to be able to switch that opt-in feature off
for repositories where I absolutely rely on some rock-solid difftool. And
as a user I would not only be shocked when it would not work as expected,
but I would be *positively* shocked to learn that this is intended, by
design.

Seriously, do you really think it is a good idea to have
git_config_get_value() *ignore* any value in .git/config?

Really?

It caught *me* by surprise, and it definitely makes for a very, very bad
user experience.

And this is much more fundamental than just difftool.useBuiltin.

I see for example that our pager.<cmd> setting is ignoring per-repository
settings. That is, if the user sets pager.<cmd> in ~/.gitconfig and then
tries to override this in a specific repository (which any user would only
do for very good reasons, I am sure), Git would happily ignore that
careful setup and create an angry user.

The only reason this did not blow up in our face yet is that users
apparently do not make use of this feature yet. Which does not make this
behavior (that "git_config_get_value()" happily ignores .git/config) less
broken.

We need to fix this.

Ciao,
Dscho

^ permalink raw reply

* Re: Error after calling git difftool -d with
From: Johannes Schindelin @ 2016-12-05 10:56 UTC (permalink / raw)
  To: P. Duijst; +Cc: David Aguilar, git
In-Reply-To: <c0c8c333-adfa-ad58-f1ec-7239a3a16528@gmail.com>

Hi Peter,

On Mon, 5 Dec 2016, P. Duijst wrote:

> On 12/5/2016 06:15, David Aguilar wrote:
> > On Fri, Dec 02, 2016 at 05:05:06PM +0100, Johannes Schindelin wrote:
> > >
> > > On Fri, 2 Dec 2016, P. Duijst wrote:
> > >
> > > > Incase filenames are used with a quote ' or a bracket [  (and
> > > > maybe some more characters), git "diff" and "difftool -y" works
> > > > fine, but git *difftool **-d* gives the next error message:
> > > >
> > > >     peter@scm_ws_10 MINGW64 /d/Dev/test (master)
> > > >     $ git diff
> > > >     diff --git a/Test ''inch.txt b/Test ''inch.txt
> > > >     index dbff793..41f3257 100644
> > > >     --- a/Test ''inch.txt
> > > >     +++ b/Test ''inch.txt
> > > >     @@ -1 +1,3 @@
> > > >     +
> > > >     +ddd
> > > >       Test error in simple repository
> > > >     warning: LF will be replaced by CRLF in Test ''inch.txt.
> > > >     The file will have its original line endings in your working
> > > >     directory.
> > > >
> > > >     peter@scm_ws_10 MINGW64 /d/Dev/test (master)
> > > >     *$ git difftool -d*
> > > >     *fatal: Cannot open '/d/Dev/test//Test ''inch.txt': No such file or
> > > >     directory*
> > > >     *hash-object /d/Dev/test//Test ''inch.txt: command returned error:
> > > >     128*
> > > >
> > > >     peter@scm_ws_10 MINGW64 /d/Dev/test (master)
> > > >     $
> > > >
> > > >
> > > > This issue is inside V2.10.x and V2.11.0.
> > > > V2.9.0 is working correctly...
> > > You say v2.11.0, but did you also try the new, experimental builtin
> > > difftool? You can test without reinstalling:
> > >
> > >  git -c difftool.useBuiltin=true difftool -d ...
> >
> > FWIW, I verified that this problem does not manifest itself on Linux,
> > using the current scripted difftool.
> >
> > Peter, what actual diff tool are you using?
> >
> > Since these filenames work fine with "difftool -d" on Linux, it
> > suggests that this is either a tool-specific issue, or an issue
> > related to unix-to-windows path translation.
> 
> @Johannes: "git -c difftool.useBuiltin=true difftool -d" works OK :-), beyond
> compare is launching with the diff's displayed

Perfect.

In that case, I think it is not worth fixing the scripted tool but focus
on getting rid of it in favor of the builtin version.

It's not like it is the only problem with having difftool implemented
as a script...

Ciao,
Johannes

^ permalink raw reply

* Re: [PATCH v1] git-p4: add config to retry p4 commands; retry 3 times by default
From: Luke Diamand @ 2016-12-05 11:02 UTC (permalink / raw)
  To: Lars Schneider; +Cc: Git Users, Ori Rawlings, Lars Schneider
In-Reply-To: <20161204140311.26269-1-larsxschneider@gmail.com>

On 4 December 2016 at 14:03,  <larsxschneider@gmail.com> wrote:
> From: Lars Schneider <lars.schneider@autodesk.com>
>
> P4 commands can fail due to random network issues. P4 users can counter
> these issues by using a retry flag supported by all p4 commands [1].
>
> Add an integer Git config value `git-p4.retries` to define the number of
> retries for all p4 invocations. If the config is not defined then set
> the default retry count to 3.

Looks good to me, ack.

>
> [1] https://www.perforce.com/perforce/doc.current/manuals/cmdref/global.options.html
>
> Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
> ---
>
> Notes:
>     Base Commit: 454cb6b (v2.11.0)
>     Diff on Web: https://github.com/git/git/compare/454cb6b...larsxschneider:654c727
>     Checkout:    git fetch https://github.com/larsxschneider/git git-p4/retries-v1 && git checkout 654c727
>
>  Documentation/git-p4.txt | 4 ++++
>  git-p4.py                | 5 +++++
>  2 files changed, 9 insertions(+)
>
> diff --git a/Documentation/git-p4.txt b/Documentation/git-p4.txt
> index c83aaf39c3..656587248c 100644
> --- a/Documentation/git-p4.txt
> +++ b/Documentation/git-p4.txt
> @@ -467,6 +467,10 @@ git-p4.client::
>         Client specified as an option to all p4 commands, with
>         '-c <client>', including the client spec.
>
> +git-p4.retries::
> +       Specifies the number of times to retry a p4 command (notably,
> +       'p4 sync') if the network times out. The default value is 3.
> +
>  Clone and sync variables
>  ~~~~~~~~~~~~~~~~~~~~~~~~
>  git-p4.syncFromOrigin::
> diff --git a/git-p4.py b/git-p4.py
> index fd5ca52462..2422178210 100755
> --- a/git-p4.py
> +++ b/git-p4.py
> @@ -78,6 +78,11 @@ def p4_build_cmd(cmd):
>      if len(client) > 0:
>          real_cmd += ["-c", client]
>
> +    retries = gitConfigInt("git-p4.retries")
> +    if retries is None:
> +        # Perform 3 retries by default
> +        retries = 3
> +    real_cmd += ["-r", str(retries)]
>
>      if isinstance(cmd,basestring):
>          real_cmd = ' '.join(real_cmd) + ' ' + cmd
> --
> 2.11.0
>

^ permalink raw reply

* RE: git 2.11.0 error when pushing to remote located on a windows share
From: thomas.attwood @ 2016-12-05 11:05 UTC (permalink / raw)
  To: tboegi, peff; +Cc: git
In-Reply-To: <20161204080914.GB2415@tb-raspi>

On Sun, 4 Dec 2016 08:09:14 +0000, Torsten Bögershausen wrote:
> There seems to be another issue, which may or may not being related:
> https://github.com/git-for-windows/git/issues/979

I think this is the same issue. I've posted my trace command output there as
It might be more appropriate:
https://github.com/git-for-windows/git/issues/979#issuecomment-264816175


^ permalink raw reply

* Re: [RFC PATCH] GIT-VERSION-GEN: set --abbrev=9 to match auto-scaling
From: Ramsay Jones @ 2016-12-05 11:21 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, GIT Mailing-list
In-Reply-To: <20161205053258.jtnqq64gp5n7vtni@sigill.intra.peff.net>



On 05/12/16 05:32, Jeff King wrote:
> On Sun, Dec 04, 2016 at 08:45:59PM +0000, Ramsay Jones wrote:
>> I recently noticed that:
>>
>>     $ make >pout 2>&1
>>     $ ./git version
>>     git version 2.11.0.286.g109e8a9
>>     $ git describe
>>     v2.11.0-286-g109e8a99d
>>     $
>>
>> ... for non-release builds, the commit part of the version
>> string was still using an --abbrev=7.
> 
> It seems like this kind of discussion ought to go in the commit message.
> :)
> 
> That said, I think the right patch may be to just drop --abbrev
> entirely.

Heh, that was the first version of the patch. However, I got to thinking
about why --abbrev=7 was there in the first place; the only reason I
could think of was to defeat local configuration to get a measure of
reproducibility.

Unfortunately, you can't get the 'auto' behaviour from --abbrev
(on the pu branch):

    $ ./git describe --abbrev=-1
    v2.11.0-286-g109e8
    $ ./git describe --abbrev=0
    v2.11.0
    $ ./git describe
    v2.11.0-286-g109e8a99d
    $

I did think about using '-c core.abbrev=auto', but that would
depend on Junio's patch (nothing wrong with that, of course):

    $ git version
    git version 2.11.0
    $ git -c core.abbrev=auto describe
    fatal: bad numeric config value 'auto' for 'core.abbrev': invalid unit
    $ ./git -c core.abbrev=auto describe
    v2.11.0-286-g109e8a99d
    $ 

What do you think?

ATB,
Ramsay Jones


^ permalink raw reply

* Re: difftool -d not populating left correctly when not in git root
From: Johannes Schindelin @ 2016-12-05 11:46 UTC (permalink / raw)
  To: David Aguilar; +Cc: Frank Becker, git, John Keeping, Junio C Hamano
In-Reply-To: <20161204223139.kk2ejrfc5elxmsro@gmail.com>

Hi David,

On Sun, 4 Dec 2016, David Aguilar wrote:

> Date: Sun, 4 Dec 2016 14:27:17 -0800
> Subject: [PATCH] difftool: properly handle being launched from a subdirectory
> 
> 9ec26e797781239b36ebccb87c590e5778358007 corrected how path arguments
> are handled in a subdirectory, but it introduced a regression in how
> entries outside of the subdirectory are handled by the dir-diff.
> 
> When preparing the right-side of the diff we only construct the parts
> that changed.
> 
> When constructing the left side we construct an index, but we were
> constructing it relative to the subdirectory, and thus it ends up empty
> because we are in a subdirectory and the paths are incorrect.
> 
> Teach difftool to chdir to the toplevel of the repository before
> preparing its temporary indexes.  This ensures that all of the
> toplevel-relative paths are valid.
> 
> Add a test case to exercise this use case.
> 
> Reported-by: Frank Becker <fb@mooflu.com>
> Signed-off-by: David Aguilar <davvid@gmail.com>

I applied this to my builtin-difftool branch (which, as per Peff's
suggestion, runs t7800 both with and without the builtin difftool) and the
tests pass:

https://github.com/dscho/git/commit/9b9a69c5ee975a4a2565343ae0a9199a6ac89609

Which means that the builtin difftool already had fixed the bug, more by
coincidence than by design, I have to admit.

Ciao,
Johannes

^ permalink raw reply

* Re: [PATCH 4/4] shallow.c: remove useless test
From: Duy Nguyen @ 2016-12-05 12:00 UTC (permalink / raw)
  To: Jeff King; +Cc: Rasmus Villemoes, Git Mailing List
In-Reply-To: <20161203052422.hhaj3idboo6r6dz5@sigill.intra.peff.net>

On Sat, Dec 3, 2016 at 12:24 PM, Jeff King <peff@peff.net> wrote:
> On Fri, Dec 02, 2016 at 09:31:04PM +0100, Rasmus Villemoes wrote:
>
>> It seems to be odd to do x=y if x==y. Maybe there's a bug somewhere near
>> this, but as is this is somewhat confusing.
>
> Yeah, this code is definitely wrong, but I'm not sure what it's trying
> to do. This is the first time I've looked at it.

I'm sorry I don't know why it's there either :( The first version that
has this was v3 [1] which still uses "util" pointdf instead of the
commit slab but the logic does not differ much.

This is the place when we "paint" the parent commit with the same
bitmap as the child I mentioned earlier (though I think I mentioned it
backward). You see similar code in the same loop just a bit earlier:
if the commit has not been painted, it gets a new bitmap, otherwise
new refs are OR'd to its bitmap. It's the OR part (when the bitmaps
pointed by *p_refs and *refs differ, it's not just about pointer
comparison) that's probably missing here.

But it looks like we can safely delete the " || *p_refs == *refs" part
because the commit in question is inserted back to the commit list
"head" and revisited in the next iteration. If its bitmap is different
from the child's, then in the next iteration it should hit the "if
(memcmp(tmp, *refs, bitmap_size))" line above, in the same loop, then
the new bit will be added. If it's marked UNINTERESTING though, that
won't happen. I'll need more time to stare at this code...

[1] http://public-inbox.org/git/1385351754-9954-9-git-send-email-pclouds@gmail.com/
-- 
Duy

^ permalink raw reply

* Re: Git v2.11.0 breaks max depth nested alternates
From: Philip Oakley @ 2016-12-05 12:07 UTC (permalink / raw)
  To: Jeff King; +Cc: Kyle J. McKay, Junio C Hamano, Git mailing list
In-Reply-To: <20161205071822.ndeswelgj5epej5k@sigill.intra.peff.net>

From: "Jeff King" <peff@peff.net>
> On Sun, Dec 04, 2016 at 11:22:52AM -0000, Philip Oakley wrote:
>
>> > Ever since 722ff7f876 (receive-pack: quarantine objects until
>> > pre-receive accepts, 2016-10-03, v2.11.0), Git has been quarantining
>> > objects and packs received during an incoming push into a separate
>> > objects directory and using the alternates mechanism to make them
>> > available until they are either accepted and moved into the main
>> > objects directory or rejected and discarded.
>>
>> Is there a step here that after the accepted/rejected stage, it should 
>> then
>> decrement the limit back to its original value. The problem description
>> suggests that might be the case.
>
> No. I thought that at first, too, but this increment happens in the
> sub-process which is using the extra level of alternates for its entire
> lifetime. So it "resets" it by exiting, and the parent process never
> increments its internal value at all.
>
Thanks for the clarification.
--
Philip 


^ permalink raw reply

* Re: [PATCH 0/6] restricting http redirects
From: Jeff King @ 2016-12-05 13:08 UTC (permalink / raw)
  To: git; +Cc: Jann Horn
In-Reply-To: <20161201090336.xjbb47bublfcpglo@sigill.intra.peff.net>

On Thu, Dec 01, 2016 at 04:03:37AM -0500, Jeff King wrote:

> Jann Horn brought up on the git-security list some interesting
> social-engineering attacks around the way Git handles HTTP redirects.
> These patches are my attempt to harden our redirect handling against
> these attacks.

There's one other possible attack I thought of while discussing [1],
that is worth mentioning.

We limited the number of http redirects in b25811646 (http: limit
redirection depth, 2015-09-22). But what about http-alternates? Could
you redirect to yourself via http-alternates and convince a client to
loop infinitely?

It looks like no, because we do not seem to handle recursive
alternates at all in the http walker. Which means that repositories with
recursive local alternates cannot be fetched over dumb-http. But it also
means that we don't have to worry about limiting the recursion depth.

-Peff

[1] http://public-inbox.org/git/fe33de5b5f0b3da68b249cc4a49a6d7@3c843fe6ba8f3c586a21345a2783aa0/

^ permalink raw reply

* Re: [PATCH] docs: warn about possible '=' in clean/smudge filter process values
From: Jeff King @ 2016-12-05 13:12 UTC (permalink / raw)
  To: larsxschneider; +Cc: git
In-Reply-To: <20161203194516.12879-1-larsxschneider@gmail.com>

On Sat, Dec 03, 2016 at 08:45:16PM +0100, larsxschneider@gmail.com wrote:

> From: Lars Schneider <larsxschneider@gmail.com>
> 
> A pathname value in a clean/smudge filter process "key=value" pair can
> contain the '=' character (introduced in edcc858). Make the user aware
> of this issue in the docs, add a corresponding test case, and fix the
> issue in filter process value parser of the example implementation in
> contrib.

Yeah, I just naturally assumed it would work this way, as it's pretty
standard in our other key=value protocols. But certainly it's reasonable
to document it (and to keep the t0021 filter as a good example).

-Peff

^ permalink raw reply

* Re: [RFC/PATCH v3 00/16] Add initial experimental external ODB support
From: Jeff King @ 2016-12-05 13:23 UTC (permalink / raw)
  To: Lars Schneider
  Cc: Christian Couder, git, Junio C Hamano, Nguyen Thai Ngoc Duy,
	Mike Hommey, Eric Wong, Christian Couder
In-Reply-To: <A5ABBF3E-BED9-4FF3-9DE5-B529DEF0B8E8@gmail.com>

On Sat, Dec 03, 2016 at 07:47:51PM +0100, Lars Schneider wrote:

> >  - "<command> have": the command should output the sha1, size and
> > type of all the objects the external ODB contains, one object per
> > line.
> 
> This looks impractical. If a repo has 10k external files with
> 100 versions each then you need to read/transfer 1m hashes (this is
> not made up - I am working with Git repos than contain >>10k files
> in GitLFS).

Are you worried about the client-to-server communication, or the pipe
between git and the helper? I had imagined that the client-to-server
communication happen infrequently and be cached.

But 1m hashes is 20MB, which is still a lot to dump over the pipe.
Another option is that Git defines a simple on-disk data structure
(e.g., a flat file of sorted 20-byte binary sha1s), and occasionally
asks the filter "please update your on-disk index".

That still leaves open the question of how the external odb script
efficiently gets updates from the server. It can use an ETag or similar
to avoid downloading an identical copy, but if one hash is added, we'd
want to know that efficiently. That is technically outside the scope of
the git<->external-odb interface, but obviously it's related. The design
of the on-disk format might be make that problem easier or harder on the
external-odb script.

> Wouldn't it be better if Git collects all hashes that it currently 
> needs and then asks the external ODBs if they have them?

I think you're going to run into latency problems when git wants to ask
"do we have this object" and expects the answer to be no. You wouldn't
want a network request for each.

And I think it would be quite complex to teach all operations to work on
a promise-like system where the answer to "do we have it" might be
"maybe; check back after you've figured out the whole batch of hashes
you're interested in".

> >  - "<command> get <sha1>": the command should then read from the
> > external ODB the content of the object corresponding to <sha1> and
> > output it on stdout.
> > 
> >  - "<command> put <sha1> <size> <type>": the command should then read
> > from stdin an object and store it in the external ODB.
> 
> Based on my experience with Git clean/smudge filters I think this kind 
> of single shot protocol will be a performance bottleneck as soon as 
> people store more than >1000 files in the external ODB.
> Maybe you can reuse my "filter process protocol" (edcc858) here?

Yeah. This interface comes from my original proposal, which used the
rationale "well, the files are big, so process startup shouldn't be a
big deal". And I don't think I wrote it down, but an implicit rationale
was "it seems to work for LFS, so it should work here too". But of
course LFS hit scaling problems, and so would this. It was one of the
reasons I was interested in making sure your filter protocol could be
used as a generic template, and I think we would want to do something
similar here.

> > * Transfer
> > 
> > To tranfer information about the blobs stored in external ODB, some
> > special refs, called "odb ref", similar as replace refs, are used.
> > 
> > For now there should be one odb ref per blob. Each ref name should be
> > refs/odbs/<odbname>/<sha1> where <sha1> is the sha1 of the blob stored
> > in the external odb named <odbname>.
> > 
> > These odb refs should all point to a blob that should be stored in the
> > Git repository and contain information about the blob stored in the
> > external odb. This information can be specific to the external odb.
> > The repos can then share this information using commands like:
> > 
> > `git fetch origin "refs/odbs/<odbname>/*:refs/odbs/<odbname>/*"`

I'd worry about scaling this part. Traditionally our refs storage does
not scale very well.

-Peff

^ permalink raw reply

* [ANNOUNCE] Git Merge Contributor's Summit Feb 2, 2017, Brussels
From: Jeff King @ 2016-12-05 13:35 UTC (permalink / raw)
  To: git

Git Merge 2017 is happening on Feb 3rd; there will be a Contributor's
Summit the day before. The details are largely what I wrote earlier in
[1], but here's a recap:

  When: Thursday, Feb 2, 2017. 10am-3pm.
  Where: The Egg[2], Brussels, Belgium
  What: Git nerds having a round-table discussion
  Who: All contributors to git or related projects in the git ecosystem
       are invited; if you're not sure if you qualify, just ask!

Registration for the conference is open, and you need to register
specifically for the Contributor's Summit to attend. There's a special
code for registering for the Contributor's Summit; email me off-list to
get instructions. It works either when registering for the main
conference, or separately. You don't have to attend the main conference
on Friday to come to the Contributor's Summit, but I encourage you to.

As I mentioned before, the content is up to us. So start thinking of
things you'd like to discuss. We can probably organize topics on the
kernel.org wiki (if anybody really wants to start things going, feel
free to start a wiki page).

Let me know if you have any questions.

-Peff

[1] http://public-inbox.org/git/20161025162829.jcy6fmnmdjual6io@sigill.intra.peff.net/

[2] https://www.google.com/maps/place/The+Egg/@50.8357424,4.321425,15z/data=!4m13!1m7!3m6!1s0x0:0xe028c6680611d1da!2sThe+Egg!3b1!8m2!3d50.8331946!4d4.3270469!3m4!1s0x0:0xe028c6680611d1da!8m2!3d50.8331946!4d4.3270469


^ permalink raw reply

* Bug: stash staged file move loses original file deletion
From: Matthew Patey @ 2016-12-05 14:37 UTC (permalink / raw)
  To: git

Git version 2.8.0 (sorry can't update to more recent on my machine) on Ubuntu.

After moving a file, if the new file is in the index but the deletion
of the old file is not staged, git stash loses the deletion operation.
Repro:

1. git mv a b
This will have both the "deletion" and the "file added" in the index

2. git stash; git stash pop
Now file a is still in the index, but file b deletion is not staged.

3. git stash; git stash show -v
This will show that the deletion operation is not in the stash

4. git stash pop
Again confirms the issue, file a is in the index, but file b is
present and unmodified in the working directory.

^ permalink raw reply

* Feature request: Set git svn options in .git/config file
From: Juergen Kosel @ 2016-12-05 17:34 UTC (permalink / raw)
  To: git

Hello,

while working with a git-svn repository, I like to use the command
line options --use-log-author and --add-author-from for all calls of
git svn fetch,
git svn rebase,
git svn dcommit, ...

Doing so consequently, the commit-ids of each git repository, which
has been cloned from the same svn-repository match.

Unfortunately, it is possible to forget these options on the command
line. (And 2nd, tortoise-git does not surport it, see
https://gitlab.com/tortoisegit/tortoisegit/issues/2824 ).

Therefore I believe, that it would be the best solution to store the
settings of --add-author-from, --use-log-author and maybe
--authors-prog in the .git/config file.


Greetings
	Juergen

^ permalink raw reply

* Re: [PATCH] diff: fix up SHA-1 abbreviations outside of repository
From: Jack Bates @ 2016-12-05 17:45 UTC (permalink / raw)
  To: Jeff King, Junio C Hamano; +Cc: git
In-Reply-To: <20161205072614.zg6yglqnznna65vf@sigill.intra.peff.net>

On 05/12/16 12:26 AM, Jeff King wrote:
> On Sun, Dec 04, 2016 at 11:19:46PM -0800, Junio C Hamano wrote:
>>> -	if (no_index)
>>> +	if (no_index) {
>>>  		/* If this is a no-index diff, just run it and exit there. */
>>> +		startup_info->have_repository = 0;
>>>  		diff_no_index(&rev, argc, argv);
>>> +	}
>>
>> This kind of change makes me nervous (partly because I am not seeing
>> the whole code but only this part of the patch).
>>
>> Some code may react to "have_repository" being zero and do the right
>> thing (which I think is what you are using from your previous "we
>> did one of the three cases" change here), but the codepath that led
>> to "have_repository" being set to non-zero previously must have done
>> a lot more than just flipping that field to non-zero, and setting
>> zero to this field alone would not "undo" what it did.
>
> I _think_ it's OK because the only substantive change would be the
> chdir() to the top of the working tree. But that information is carried
> through by revs->prefix, which we act on regardless of the value of
> startup_info->have_repository when we call prefix_filename().
>
> I agree that it may be an accident waiting to happen, though, as soon as
> some buried sub-function needs to care about the distinction.
>
>> I wonder if we're better off if we made sure that diff_no_index()
>> works the same way regardless of the value of "have_repository"
>> field?
>
> If you mean adding a diffopt flag like "just abbreviate everything to
> FALLBACK_DEFAULT_ABBREV even if we're in a repository", and then setting
> that in diff_no_index(), I agree that is a lot cleaner.
>
> I'm still not 100% convinced that it's actually the correct behavior,
> but at least doing a more contained version wouldn't take away other
> functionality like reading config.

I don't have a strong reason for wanting these three cases to behave 
identically, I was merely surprised that they don't. I think you 
expected them to behave the same as well? I'll withdraw this patch.

Conceptually I do think of "git diff" as having two separate modes, "in 
repository" and "out of repository", with the --no-index option forcing 
the "out of repository" mode. But maybe there are good reasons why this 
isn't accurate, or maybe it doesn't matter that it's not 100% accurate.

In summary, currently all of the three cases are "no index" but only the 
first case doesn't "have repository".

Thank you for your thoughtful feedback!

^ permalink raw reply

* Re: [PATCH] docs: warn about possible '=' in clean/smudge filter process values
From: Junio C Hamano @ 2016-12-05 17:49 UTC (permalink / raw)
  To: Jeff King; +Cc: larsxschneider, git
In-Reply-To: <20161205131219.kdq4zafith26vro2@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> On Sat, Dec 03, 2016 at 08:45:16PM +0100, larsxschneider@gmail.com wrote:
>
>> From: Lars Schneider <larsxschneider@gmail.com>
>> 
>> A pathname value in a clean/smudge filter process "key=value" pair can
>> contain the '=' character (introduced in edcc858). Make the user aware
>> of this issue in the docs, add a corresponding test case, and fix the
>> issue in filter process value parser of the example implementation in
>> contrib.
>
> Yeah, I just naturally assumed it would work this way, as it's pretty
> standard in our other key=value protocols. But certainly it's reasonable
> to document it (and to keep the t0021 filter as a good example).

There are diplomatic ways and other ways to say the same thing, it
seems, and yours is almost always more diplomatic than mine ;-) 

My knee-jerk reaction last night that I didn't send was "I would
understand if the new test were about quotes or whitespaces, but is
it even useful to make sure that the value can have an equal sign in
it?"



^ permalink raw reply

* Re: [RFC PATCH] GIT-VERSION-GEN: set --abbrev=9 to match auto-scaling
From: Junio C Hamano @ 2016-12-05 18:01 UTC (permalink / raw)
  To: Jeff King; +Cc: Ramsay Jones, GIT Mailing-list
In-Reply-To: <20161205053258.jtnqq64gp5n7vtni@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> On Sun, Dec 04, 2016 at 08:45:59PM +0000, Ramsay Jones wrote:
>
>> 
>> Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
>> ---
>> 
>> Hi Junio,
>> 
>> I recently noticed that:
>> 
>>     $ make >pout 2>&1
>>     $ ./git version
>>     git version 2.11.0.286.g109e8a9
>>     $ git describe
>>     v2.11.0-286-g109e8a99d
>>     $
>> 
>> ... for non-release builds, the commit part of the version
>> string was still using an --abbrev=7.
>
> It seems like this kind of discussion ought to go in the commit message.
> :)
>
> That said, I think the right patch may be to just drop --abbrev
> entirely.
> ...
> I think at that point it was a noop, as 7 should have been the default.
> And now we probably ought to drop it, so that we can use the
> auto-scaling default.

Yeah, I agree.

It does mean that snapshot binaries built out of the same commit in
the same repository before and after a repack have higher chances of
getting named differently, which may surprise people, but that
already is possible with a fixed length if the repacking involves
pruning (albeit with lower probabilities), and I do not think it is
a problem.

^ permalink raw reply

* Re: [RFC PATCH] GIT-VERSION-GEN: set --abbrev=9 to match auto-scaling
From: Junio C Hamano @ 2016-12-05 18:10 UTC (permalink / raw)
  To: Ramsay Jones; +Cc: Jeff King, GIT Mailing-list
In-Reply-To: <ab1e7ce9-1022-0c72-2f72-63e3b9182bc9@ramsayjones.plus.com>

Ramsay Jones <ramsay@ramsayjones.plus.com> writes:

> Heh, that was the first version of the patch. However, I got to thinking
> about why --abbrev=7 was there in the first place; the only reason I
> could think of was to defeat local configuration to get a measure of
> reproducibility.
>
> Unfortunately, you can't get the 'auto' behaviour from --abbrev
> (on the pu branch):
>
>     $ ./git describe --abbrev=-1
>     v2.11.0-286-g109e8
>     $ ./git describe --abbrev=0
>     v2.11.0
>     $ ./git describe
>     v2.11.0-286-g109e8a99d
>     $

What is the reason why the last one is undesirable?  Is it because
the user may have core.abbrev set to some value in the configuration
and you want to override it to force "auto"?

I am not sure how rigid GIT-VERSION-GEN wants to be to countermand
such an explicit user preference (i.e. existing configuration).

> I did think about using '-c core.abbrev=auto', 

Having said that, if countermanding end-user's configuration is
desireble, I agree that "-c core.abbrev=auto" is the way to do so.

> but that would depend on Junio's patch (nothing wrong with that,
> of course):

You caught me.  I'll need to polish that into a usable shape soon
then.  And that is orthogonal to the "does it make sense to force
'auto' in this context?" question.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox