git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* The fetch command should "always" honor remote.<name>.fetch
@ 2014-04-05 22:43 Lewis Diamond
  2014-04-07 18:23 ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Lewis Diamond @ 2014-04-05 22:43 UTC (permalink / raw)
  To: git

Hi all,
//for tldr; see EOM
I recently setup git for my team and we ran into a small annoyance. 
Though this may be solved with a simple alias, I decided to spend this 
Saturday afternoon exploring the Git source to find out how easy an 
actual patch could be. Unfortunately, the fix isn't trivial and I'm not 
familiar with the Git code base. I'd definitely prefer fixing issue 
properly in the source since the behaviour is, in my opinion, 
inconsistent, so I'll need some insights from experienced git developers.

Let me first describe the issue. Within a single repository, multiple 
ref name spaces can be defined to hold, for example, user branches. For 
example, the main development branches can be located under 
refs/heads/[master | develop] while users push their personal branches 
to refs/user/[bob | alice]/heads/* in order to save their work or 
collaborate with one another. The problem arises when fetching from 
configured fetch refspecs. The behaviour isn't consistent with the push 
behaviour. Let's look at an example:

==
Alice has a repository with a configured remote named foo:
[remote "foo"]
     url = ...
     fetch = +refs/heads/*:refs/remotes/foo/*
     fetch = +refs/users/bob/heads/*:refs/remotes/foo/bob/
     p
     push = refs/heads/*:refs/users/alice/heads/*

Let's say the following refs exist on the remote:
refs/heads/master
refs/users/bob/heads/master
refs/users/bob/heads/develop

If Alice has a master and feature_a branch, 'git push foo' would result in:
master -> refs/users/alice/heads/master
feature_a -> refs/users/alice/heads/feature/a

'git push foo feature/a' would result in:
feature/a -> refs/users/alice/heads/feature/a

'git fetch foo' would result in:
[new ref] refs/users/bob/heads/master -> foo/bob/master //OK
[new ref] refs/users/bob/heads/develop -> foo/bob/develop //OK
[new ref] refs/heads/master -> foo/master //OK

'git fetch foo refs/users/bob/heads/develop' would result in (FETCH_HEAD 
omitted):
[new ref] refs/users/bob/heads/develop -> foo/bob/develop //OK

'git fetch foo refs/heads/master' would result in (FETCH_HEAD omitted):
[new ref] refs/heads/master -> foo/master //OK

'git fetch foo develop' would result in:
fatal: Couldn't find remote ref test2 //Not OK, (case 1)

'git fetch foo master' would result in (FETCH_HEAD omitted):
[new ref] refs/heads/master -> foo/master //OK, but missing another ref! 
(case 2)
//It should also fetch refs/users/bob/heads/master -> foo/bob/master

If you remove this configuration line: fetch = 
+refs/heads/*:refs/remotes/foo/*
Then you run 'git fetch foo master', this would result in:
  * branch master -> FETCH_HEAD //Debatable whether this is OK or not, 
but it's definitely missing bob's master! (case 3)
==

case 1: The ref abbreviation is received in builtin/fetch.c: get_ref_map 
and passed to remote.c:get_fetch_map with missing_ok = 0. The 
abbreviated ref is then evaluated through remote.c:get_remote_ref --> 
remote.c:find_ref_by_name_abbrev to finally end up being matched in 
refs.c:ref_rev_parse_rules. Since in this case, develop doesn't exist 
under 'develop', 'refs/develop', 'refs/tags/develop', 
'refs/heads/develop', 'refs/remotes/develop' or 
'refs/remotes/develop/HEAD', no ref is matched and the command dies (due 
to missing_ok=0). All of this happens without ever looking at the 
configured fetch refspec.

case 2: Starts like case 1, except that the ref_rev_parse_rules get a 
match and only this match is used. The destination is then mapped in a 
later call to get_fetch_map, from builtin/fetch.c@311

case 3: Same as 2, but it maps to HEAD.

tldr;
In my opinion, these issues can be resolved with the following rules:
1. Explicit refs (starts with refs/) are fetched 'as-is' and put into 
FETCH_HEAD and into every configured refspec->dst for which refspec->src 
matches the input ref. (The push command should be made the same for 
consistency. It currently only pushes to the first matching refspec.)
2. Abbreviated refs (e.g. 'master') are matched only against the 
remote_refs matching remote->fetch_refspec (I think remote.c 
get_expanded_map can provide this). If multiple matches exist, all of 
them are fetched to their corresponding refspec->dst (and FETCH_HEAD). 
If multiple matches exist and have conflicting refspec->dst, error out 
(or should we honour the first matching refspec?).

Note that for the 2nd rule, if remote->fetch_refspec doesn't include 
refs/heads/*, it means I went through the trouble of removing it from 
the configuration. Therefore, I think it shouldn't even be fetched. 
However, we need to make sure to handle 'git fetch <URL> master', such 
that it is still evaluated by ref_rev_parse_rules.

Any input is be appreciated.
Thanks,
Lewis Diamond

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: The fetch command should "always" honor remote.<name>.fetch
  2014-04-05 22:43 The fetch command should "always" honor remote.<name>.fetch Lewis Diamond
@ 2014-04-07 18:23 ` Junio C Hamano
  2014-04-07 18:46   ` Lewis Diamond
  0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2014-04-07 18:23 UTC (permalink / raw)
  To: Lewis Diamond; +Cc: git

Lewis Diamond <git@lewisdiamond.com> writes:

> 'git fetch foo develop' would result in:
> fatal: Couldn't find remote ref test2 //Not OK, (case 1)

I have no idea where the "test2" comes from, as it does not appear
anywhere in the above write-up, and it could be a bug.

> 'git fetch foo master' would result in (FETCH_HEAD omitted):
> [new ref] refs/heads/master -> foo/master //OK, but missing another
> ref! (case 2)
> //It should also fetch refs/users/bob/heads/master -> foo/bob/master

This is an incorrect expectation.

The user who gave the command line said only "master", and did not
want to grab "users/bob/heads/master".  If the user wanted to get it
as well, the command line would have said so, e.g.

	git fetch there master users/bob/heads/master

> If you remove this configuration line: fetch =
> +refs/heads/*:refs/remotes/foo/*
> Then you run 'git fetch foo master', this would result in:
>  * branch master -> FETCH_HEAD //Debatable whether this is OK or not,
> but it's definitely missing bob's master! (case 3)

Likewise.

The 'master' short-hand is designed not to match refs/users/any/thing.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: The fetch command should "always" honor remote.<name>.fetch
  2014-04-07 18:23 ` Junio C Hamano
@ 2014-04-07 18:46   ` Lewis Diamond
  2014-04-09  1:45     ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Lewis Diamond @ 2014-04-07 18:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Lewis Diamond, git

On Mon, Apr 7, 2014 at 2:23 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Lewis Diamond <git@lewisdiamond.com> writes:
>
>> 'git fetch foo develop' would result in:
>> fatal: Couldn't find remote ref test2 //Not OK, (case 1)
>
> I have no idea where the "test2" comes from, as it does not appear
> anywhere in the above write-up, and it could be a bug.
>

Sorry, "test2" was a local test to copy the error message. It should read "foo".

>> 'git fetch foo master' would result in (FETCH_HEAD omitted):
>> [new ref] refs/heads/master -> foo/master //OK, but missing another
>> ref! (case 2)
>> //It should also fetch refs/users/bob/heads/master -> foo/bob/master
>
> This is an incorrect expectation.

Indeed this is the "correct" behaviour since it works as designed.
However, this behaviour is inconsistent with the push command. An
expectation is never "incorrect" as we are talking about intuitive vs
non-intuitive.

>
> The user who gave the command line said only "master", and did not
> want to grab "users/bob/heads/master".  If the user wanted to get it
> as well, the command line would have said so, e.g.
>
>         git fetch there master users/bob/heads/master
>

Actually, the user specifically configured the remote to fetch
refs/users/bob/heads/*, meaning "those are the branches I'm interested
in".

>> If you remove this configuration line: fetch =
>> +refs/heads/*:refs/remotes/foo/*
>> Then you run 'git fetch foo master', this would result in:
>>  * branch master -> FETCH_HEAD //Debatable whether this is OK or not,
>> but it's definitely missing bob's master! (case 3)
>
> Likewise.
>
> The 'master' short-hand is designed not to match refs/users/any/thing.
>

Sure, this shorthand is designed to match refs listed in the rev parse
rules list. However, a quick survey showed me that most people would
expect their configuration to be honoured when using the shorthand for
fetching (like it is for push). This thread is about improving the
fetch command so that the short-hand works in such cases (and make it
consistent with what push does).

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: The fetch command should "always" honor remote.<name>.fetch
  2014-04-07 18:46   ` Lewis Diamond
@ 2014-04-09  1:45     ` Junio C Hamano
  2014-04-09 15:57       ` Lewis Diamond
  0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2014-04-09  1:45 UTC (permalink / raw)
  To: Lewis Diamond; +Cc: Lewis Diamond, git

Lewis Diamond <me@lewisdiamond.com> writes:

>>> 'git fetch foo master' would result in (FETCH_HEAD omitted):
>>> [new ref] refs/heads/master -> foo/master //OK, but missing another
>>> ref! (case 2)
>>> //It should also fetch refs/users/bob/heads/master -> foo/bob/master
>>
>> This is an incorrect expectation.
>
> Indeed this is the "correct" behaviour since it works as designed.
> However, this behaviour is inconsistent with the push command. An
> expectation is never "incorrect" as we are talking about intuitive vs
> non-intuitive.
>
>>
>> The user who gave the command line said only "master", and did not
>> want to grab "users/bob/heads/master".  If the user wanted to get it
>> as well, the command line would have said so, e.g.
>>
>>         git fetch there master users/bob/heads/master
>>
>
> Actually, the user specifically configured the remote to fetch
> refs/users/bob/heads/*, meaning "those are the branches I'm interested
> in".
>
>>> If you remove this configuration line: fetch =
>>> +refs/heads/*:refs/remotes/foo/*
>>> Then you run 'git fetch foo master', this would result in:
>>>  * branch master -> FETCH_HEAD //Debatable whether this is OK or not,
>>> but it's definitely missing bob's master! (case 3)
>>
>> Likewise.
>>
>> The 'master' short-hand is designed not to match refs/users/any/thing.
>
> Sure, this shorthand is designed to match refs listed in the rev parse
> rules list. However, a quick survey showed me that most people would
> expect their configuration to be honoured when using the shorthand for
> fetching (like it is for push). This thread is about improving the
> fetch command so that the short-hand works in such cases (and make it
> consistent with what push does).

Now, I am puzzled, as I do not think push behaves in such an insane
way.  You got me worried enough that I had to make sure (see below).

Perhaps there is some misunderstanding.

With two repositories src and dst, I prepared these in src:

    $ git ls-remote ../src
    cae2fe07f0954772ec9d871391313cb91a030cba	HEAD
    cae2fe07f0954772ec9d871391313cb91a030cba	refs/heads/master
    cae2fe07f0954772ec9d871391313cb91a030cba	refs/users/bob/master

and then this config in dst/.git/config:

    [remote "origin"]
            url = ../src
            fetch = +refs/heads/*:refs/remotes/origin/*
            fetch = +refs/users/bob/*:refs/remotes/bob/*
            push = refs/heads/*:refs/users/alice/*

Now, from such an empty dst repository:

    $ cd dst
    $ git fetch -v origin
    From ../src
     * [new branch]      master     -> origin/master
     * [new ref]         refs/users/bob/master -> bob/master
    $ git reset --hard origin/master
    $ git ls-remote .
    cae2fe07f0954772ec9d871391313cb91a030cba	HEAD
    cae2fe07f0954772ec9d871391313cb91a030cba	refs/heads/master
    cae2fe07f0954772ec9d871391313cb91a030cba	refs/remotes/bob/master
    cae2fe07f0954772ec9d871391313cb91a030cba	refs/remotes/origin/master

    $ git commit --allow-empty -m another
    $ git push -v origin master
    Pushing to ../src
    Counting objects: 1, done.
    Writing objects: 100% (1/1), 185 bytes | 0 bytes/s, done.
    Total 1 (delta 0), reused 0 (delta 0)
    To ../src
       cae2fe0..faae8fb  master -> refs/users/alice/master

Redoing the same experiment with this config with an extra item in
dst/.git/config:

    [remote "origin"]
            url = ../src
            fetch = +refs/heads/*:refs/remotes/origin/*
            fetch = +refs/users/bob/*:refs/remotes/bob/*
            push = refs/heads/*:refs/users/alice/*
	    push = refs/remotes/bob/*:refs/users/bob/*

gives the same.

    ... the same procedure to prepare 'master' that is one commit
    ... ahead with "allow-empty"
    $ git update-ref refs/remotes/bob/master HEAD
    $ git push -v origin master
    Pushing to ../src
    Counting objects: 1, done.
    Writing objects: 100% (1/1), 185 bytes | 0 bytes/s, done.
    Total 1 (delta 0), reused 0 (delta 0)
    To ../src
       cae2fe0..faae8fb  master -> refs/users/alice/master

We do not look at remotes/bob/master we have, and we do not touch
users/bob/master at the remote site, either.

When you explicitly say what you want to push on the command line,
that is what you are telling Git to push.  'master' which is an
abbreviation of refs/heads/master.  Where it goes may be determined
by using remote.origin.push as a mapping, but the left hand side of
the mapping is still what you gave from the commad line (after
dwimming the abbreviated refname exactly the same way other commands
like "git log master" interpret them).

This is very much deliberately so and unlikely to change.  And that
goes the same for fetching as well.

The command line interface is optimized for the two most common use
cases.  Either you want to fetch everything you are interested in
(in which case you do not say what you want to fetch on the command
line and let the configured refspecs kick in), or you want to
explicitly want to state what you want.  Allowing random set of refs
that happens to match left hand side of wildcard patterns of refspec
will break the expectation big time, when somebody asks "git push
origin master" meaning "I want to push my 'master' branch out" (or
"git pull origin master" meaning "I want to merge the master branch
from the origin") and you instead push out "remotes/bob/master" (or
even worse, create an octopus merge with origin's master and bob's
master).

Hope this clarifies the confusion.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: The fetch command should "always" honor remote.<name>.fetch
  2014-04-09  1:45     ` Junio C Hamano
@ 2014-04-09 15:57       ` Lewis Diamond
  2014-04-09 18:40         ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Lewis Diamond @ 2014-04-09 15:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

What I mean by "behave like push" is the following:

[remote "foo"]
    url = ...
    fetch = refs/users/bob/heads/*:refs/heads/*  #Note that the fetch
and push configuration match.
    push = refs/heads/*:refs/users/bob/heads/*

git ls-remote foo
refs/heads/master

git push foo master
refs/heads/master -> refs/users/bob/heads/master

git fetch foo master
refs/heads/master -> FETCH_HEAD

In this case, the fetch commit isn't the same as the one we just
pushed! Yes, I agree that the abbreviation expansion works as designed
(using the rev_parse_rules), but the point of that email is that those
rules become counter-intuitive when you configure a fetch refspec. You
specifically tell git that you don't care about refs/heads/* and all
you care about is refs/users/bob/heads/*, why would it go and fetch
refs/heads/master? Of course, you should still be able to fetch it
when you explicitly asks for it but the configured refspec should take
precedence over the default rev_parse_rules.

Maybe more people could give their opinion as to whether this would be
more intuitive or not. I (and other people I've asked) find the
current behaviour counter-intuitive. Our opinion seem to differ
because you consider the 'master' abbreviation as being explicit. In
my opinion, explicit means 'refs/heads/master' and an abbreviation is
absolutely not explicit.

If a patch wouldn't be welcome I'll fix it with aliases or a plugin.

-Lewis

On Tue, Apr 8, 2014 at 9:45 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Lewis Diamond <me@lewisdiamond.com> writes:
>
>>>> 'git fetch foo master' would result in (FETCH_HEAD omitted):
>>>> [new ref] refs/heads/master -> foo/master //OK, but missing another
>>>> ref! (case 2)
>>>> //It should also fetch refs/users/bob/heads/master -> foo/bob/master
>>>
>>> This is an incorrect expectation.
>>
>> Indeed this is the "correct" behaviour since it works as designed.
>> However, this behaviour is inconsistent with the push command. An
>> expectation is never "incorrect" as we are talking about intuitive vs
>> non-intuitive.
>>
>>>
>>> The user who gave the command line said only "master", and did not
>>> want to grab "users/bob/heads/master".  If the user wanted to get it
>>> as well, the command line would have said so, e.g.
>>>
>>>         git fetch there master users/bob/heads/master
>>>
>>
>> Actually, the user specifically configured the remote to fetch
>> refs/users/bob/heads/*, meaning "those are the branches I'm interested
>> in".
>>
>>>> If you remove this configuration line: fetch =
>>>> +refs/heads/*:refs/remotes/foo/*
>>>> Then you run 'git fetch foo master', this would result in:
>>>>  * branch master -> FETCH_HEAD //Debatable whether this is OK or not,
>>>> but it's definitely missing bob's master! (case 3)
>>>
>>> Likewise.
>>>
>>> The 'master' short-hand is designed not to match refs/users/any/thing.
>>
>> Sure, this shorthand is designed to match refs listed in the rev parse
>> rules list. However, a quick survey showed me that most people would
>> expect their configuration to be honoured when using the shorthand for
>> fetching (like it is for push). This thread is about improving the
>> fetch command so that the short-hand works in such cases (and make it
>> consistent with what push does).
>
> Now, I am puzzled, as I do not think push behaves in such an insane
> way.  You got me worried enough that I had to make sure (see below).
>
> Perhaps there is some misunderstanding.
>
> With two repositories src and dst, I prepared these in src:
>
>     $ git ls-remote ../src
>     cae2fe07f0954772ec9d871391313cb91a030cba    HEAD
>     cae2fe07f0954772ec9d871391313cb91a030cba    refs/heads/master
>     cae2fe07f0954772ec9d871391313cb91a030cba    refs/users/bob/master
>
> and then this config in dst/.git/config:
>
>     [remote "origin"]
>             url = ../src
>             fetch = +refs/heads/*:refs/remotes/origin/*
>             fetch = +refs/users/bob/*:refs/remotes/bob/*
>             push = refs/heads/*:refs/users/alice/*
>
> Now, from such an empty dst repository:
>
>     $ cd dst
>     $ git fetch -v origin
>     From ../src
>      * [new branch]      master     -> origin/master
>      * [new ref]         refs/users/bob/master -> bob/master
>     $ git reset --hard origin/master
>     $ git ls-remote .
>     cae2fe07f0954772ec9d871391313cb91a030cba    HEAD
>     cae2fe07f0954772ec9d871391313cb91a030cba    refs/heads/master
>     cae2fe07f0954772ec9d871391313cb91a030cba    refs/remotes/bob/master
>     cae2fe07f0954772ec9d871391313cb91a030cba    refs/remotes/origin/master
>
>     $ git commit --allow-empty -m another
>     $ git push -v origin master
>     Pushing to ../src
>     Counting objects: 1, done.
>     Writing objects: 100% (1/1), 185 bytes | 0 bytes/s, done.
>     Total 1 (delta 0), reused 0 (delta 0)
>     To ../src
>        cae2fe0..faae8fb  master -> refs/users/alice/master
>
> Redoing the same experiment with this config with an extra item in
> dst/.git/config:
>
>     [remote "origin"]
>             url = ../src
>             fetch = +refs/heads/*:refs/remotes/origin/*
>             fetch = +refs/users/bob/*:refs/remotes/bob/*
>             push = refs/heads/*:refs/users/alice/*
>             push = refs/remotes/bob/*:refs/users/bob/*
>
> gives the same.
>
>     ... the same procedure to prepare 'master' that is one commit
>     ... ahead with "allow-empty"
>     $ git update-ref refs/remotes/bob/master HEAD
>     $ git push -v origin master
>     Pushing to ../src
>     Counting objects: 1, done.
>     Writing objects: 100% (1/1), 185 bytes | 0 bytes/s, done.
>     Total 1 (delta 0), reused 0 (delta 0)
>     To ../src
>        cae2fe0..faae8fb  master -> refs/users/alice/master
>
> We do not look at remotes/bob/master we have, and we do not touch
> users/bob/master at the remote site, either.
>
> When you explicitly say what you want to push on the command line,
> that is what you are telling Git to push.  'master' which is an
> abbreviation of refs/heads/master.  Where it goes may be determined
> by using remote.origin.push as a mapping, but the left hand side of
> the mapping is still what you gave from the commad line (after
> dwimming the abbreviated refname exactly the same way other commands
> like "git log master" interpret them).
>
> This is very much deliberately so and unlikely to change.  And that
> goes the same for fetching as well.
>
> The command line interface is optimized for the two most common use
> cases.  Either you want to fetch everything you are interested in
> (in which case you do not say what you want to fetch on the command
> line and let the configured refspecs kick in), or you want to
> explicitly want to state what you want.  Allowing random set of refs
> that happens to match left hand side of wildcard patterns of refspec
> will break the expectation big time, when somebody asks "git push
> origin master" meaning "I want to push my 'master' branch out" (or
> "git pull origin master" meaning "I want to merge the master branch
> from the origin") and you instead push out "remotes/bob/master" (or
> even worse, create an octopus merge with origin's master and bob's
> master).
>
> Hope this clarifies the confusion.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: The fetch command should "always" honor remote.<name>.fetch
  2014-04-09 15:57       ` Lewis Diamond
@ 2014-04-09 18:40         ` Junio C Hamano
  0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2014-04-09 18:40 UTC (permalink / raw)
  To: Lewis Diamond; +Cc: git

Lewis Diamond <git@lewisdiamond.com> writes:

> ... Yes, I agree that the abbreviation expansion works as designed
> (using the rev_parse_rules), 

I am not fundamentally opposed if you want to add a new command line
option to "git fetch" so that the shortened "what to fetch" are
dwimmed differently, but changing how "git fetch there master"
without any such option behaves will not fly well.  It will break
those who have already learned Git who expect that that is the way
to explicitly ask to fetch the master branch regardless of what
configuration the repository might have.

It is true that "git fetch there 'master'" cannot possibly mean the
'master' branch we locally have, so there is no fundamental reason
why we should use the same rev-parse dwim rules to grok them.

In fact we used to have different dwim rules for local (rev-parse
dwim rules) and for remote access --- I do not offhand recall if we
had rules for push and fetch separately, but I wouldn't be surprised
if we did.  The underlying mechanism certainly allowed us to use
separate rules for them back then.

Over time, however, having separate rules for remote and local was
found confusing by users, and that is why we changed the code to use
the same rule everywhere when dwimming the abbreviated refname on
the command line these days.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2014-04-09 18:40 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-04-05 22:43 The fetch command should "always" honor remote.<name>.fetch Lewis Diamond
2014-04-07 18:23 ` Junio C Hamano
2014-04-07 18:46   ` Lewis Diamond
2014-04-09  1:45     ` Junio C Hamano
2014-04-09 15:57       ` Lewis Diamond
2014-04-09 18:40         ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).