Git development

Git development
 help / color / mirror / Atom feed

* Re: [PATCH 2/2] RF+ENH(TST): compare the entire list of submodule status --recursive to stay intact
From: Yaroslav O Halchenko @ 2018-12-13 22:43 UTC (permalink / raw)
  To: git
In-Reply-To: <CAGZ79kZb28bCvM7cOYHC4cpJWpA-3_gcbxS_g-rG0yy=9jXquw@mail.gmail.com>

On Thu, 13 Dec 2018, Stefan Beller wrote:

> > and kaboom -- we have a new test.  If we decide to test more -- just tune up
> > test_expect_unchanged_submodule_status and done -- all the tests remain
> > sufficiently prescribed.

> > What do you think?

> That is pretty cool. Maybe my gut reaction on the previous patch
> also had to do with the numbers, i.e. having 2 extra function for
> only having 2 tests more legible. A framework is definitely better
> once we have more tests.

cool, thanks for the feedback - I will then try to make it happen

quick one (so when I get to it I know):  should I replicate all those
tests you have for other update strategies? (treating of config
specifications etc)  There is no easy way to parametrize them somehow?
;)    In Python world I might have mocked the actual underlying call to
update, to see what option it would be getting and assure that it is the
one I specified via config, and then sweepped through all of them
to make sure nothing interim changes it.  Just wondering if may be
something like that exists in git's tests support.

BTW - sorry if RTFM and unrelated, is there  a way to  

    update --merge  

but allowing only  fast-forwards?  My use case is collection of this
submodules: http://datasets.datalad.org/?dir=/openneuro  which all
should come from github and I should not have any changes of my own.
Sure thing if all is clean etc, merge should result in fast-forward.  I
just do not want to miss a case where there was some (temporary?) "dirt"
which I forgot to reset and it would then get merged etc.

-- 
Yaroslav O. Halchenko
Center for Open Neuroscience     http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik        

^ permalink raw reply

* Re: [PATCH 2/2] RF+ENH(TST): compare the entire list of submodule status --recursive to stay intact
From: Stefan Beller @ 2018-12-13 23:58 UTC (permalink / raw)
  To: debian; +Cc: git
In-Reply-To: <20181213224356.GI4633@hopa.kiewit.dartmouth.edu>

On Thu, Dec 13, 2018 at 2:44 PM Yaroslav O Halchenko
<debian@onerussian.com> wrote:
>
>
> On Thu, 13 Dec 2018, Stefan Beller wrote:
>
> > > and kaboom -- we have a new test.  If we decide to test more -- just tune up
> > > test_expect_unchanged_submodule_status and done -- all the tests remain
> > > sufficiently prescribed.
>
> > > What do you think?
>
> > That is pretty cool. Maybe my gut reaction on the previous patch
> > also had to do with the numbers, i.e. having 2 extra function for
> > only having 2 tests more legible. A framework is definitely better
> > once we have more tests.
>
> cool, thanks for the feedback - I will then try to make it happen
>
> quick one (so when I get to it I know):  should I replicate all those
> tests you have for other update strategies? (treating of config
> specifications etc)

If there is a sensible way to do so?
I have the impression that there are enough differences, that it
may not be possible to replicate all tests meaningfully from the
other modes.

> There is no easy way to parametrize them somehow?

There is t/lib-submodule-update.sh, which brings this to
an extreme, as it makes a "test suite in a test suite"; and I would
not follow that example for this change.

> ;)    In Python world I might have mocked the actual underlying call to
> update, to see what option it would be getting and assure that it is the
> one I specified via config, and then sweepped through all of them
> to make sure nothing interim changes it.  Just wondering if may be
> something like that exists in git's tests support.

gits tests are very heavy on end to end testing, i.e. run a whole command
and observe its output. This makes our command setup code, (i.e. finding
the repository, parsing options, reading possible config, etc) a really well
exercised code path. ;-)

There is a recent push towards testing only units, most of
t/helper is used for that, e.g. c.f. 4c7bb45269 (test-reach:
test get_reachable_subset, 2018-11-02).

So if you have a good idea how to focus the submodule
tests more on the (new) unit that you add, that would be cool.

> BTW - sorry if RTFM and unrelated, is there  a way to
>
>     update --merge
>
> but allowing only  fast-forwards?  My use case is collection of this
> submodules: http://datasets.datalad.org/?dir=/openneuro  which all
> should come from github and I should not have any changes of my own.

So you want the merge option  --ff-only
to be passed to the submodule merge command. I guess you could make
a patch, that update takes another option (--ff-only, only useful when
--merge is given), which is then propagated.

I am not sure if we could have a more generalized option passing,
which would allow to pass any option (for its respective command)
to the command that is run in the update mode.

> Sure thing if all is clean etc, merge should result in fast-forward.  I
> just do not want to miss a case where there was some (temporary?) "dirt"
> which I forgot to reset and it would then get merged etc.

maybe use --rebase, such that your potential change would bubble
up and possibly produce a merge conflict?

^ permalink raw reply

* Re: [PATCH 1/1] worktree refs: fix case sensitivity for 'head'
From: brian m. carlson @ 2018-12-14  0:33 UTC (permalink / raw)
  To: Mike Rappazzo
  Cc: Stefan Beller, Nguyễn Thái Ngọc, gitgitgadget,
	Git List, Junio C Hamano
In-Reply-To: <CANoM8SWQTAEYGiUC9PnWi8u9oAJYPcyiE5+5usoRvR7Vw2z0JA@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1185 bytes --]

On Thu, Dec 13, 2018 at 04:14:28PM -0500, Mike Rappazzo wrote:
> On Thu, Dec 13, 2018 at 3:48 PM Stefan Beller <sbeller@google.com> wrote:
> >
> > > > The current situation is definitely a problem.  If I am in a worktree,
> > > > using "head" should be the same as "HEAD".
> >
> > By any chance, is your file system case insensitive?
> > That is usually the source of confusion for these discussions.
> 
> This behavior is the same for MacOS (High Sierra) and Windows 7.  I
> assume other derivatives of those act the same.
> 
> On CentOS "head" is an ambiguous ref.  If Windows and Mac resulted in
> an ambiguous ref, that would also be OK, but as it is now, they return
> the result of "HEAD" on the primary worktree.

I'm pretty sure that we want HEAD to be only written as "HEAD". It's
known that systems with case-insensitive file systems sometimes allow
"head" instead of "HEAD" because the ref is written in the file system.

I think the improvement we'd want here is to reject HEAD being written
as "head" on those systems, but in a global way that affects all uses of
HEAD.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

^ permalink raw reply

* Re: [PATCH v2 1/2] .gitattributes: ensure t/oid-info/* has eol=lf
From: brian m. carlson @ 2018-12-14  0:51 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget; +Cc: git, Junio C Hamano, Derrick Stolee
In-Reply-To: <4a22502a318a65f144b3b6542cc5e711a1811c78.1544638490.git.gitgitgadget@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1324 bytes --]

On Wed, Dec 12, 2018 at 10:14:53AM -0800, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
> 
> The new test_oid machinery in the test library requires reading
> some information from t/oid-info/hash-info and t/oid-info/oid.
> The shell logic that reads from these files is sensitive to CRLF
> line endings, causing a problem when the test suite is run on a
> Windows machine that converts LF to CRLF.
> 
> Exclude the files in this folder from this conversion.
> 
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  .gitattributes | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/.gitattributes b/.gitattributes
> index acf853e029..3738cea7eb 100644
> --- a/.gitattributes
> +++ b/.gitattributes
> @@ -13,3 +13,4 @@
>  /Documentation/gitk.txt conflict-marker-size=32
>  /Documentation/user-manual.txt conflict-marker-size=32
>  /t/t????-*.sh conflict-marker-size=32
> +/t/oid-info/* eol=lf

Yeah, this seems like a sensible thing to do. I assumed the shell on
Windows would read data as text files, not as binary files. It's kinda
hard for me as a non-Windows user to predict what will need CRLF endings
and what will need LF endings with Git on Windows.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

^ permalink raw reply

* Re: [PATCH 1/3] serve: pass "config context" through to individual commands
From: Junio C Hamano @ 2018-12-14  2:09 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Brandon Williams, Jonathan Tan
In-Reply-To: <20181211104342.GA7233@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> In protocol v2, instead of just running "upload-pack", we have a generic
> "serve" loop which runs command requests from the client. What used to
> be "upload-pack" is now generally split into two operations: "ls-refs"
> and "fetch". The latter knows it must respect uploadpack.* config, but
> the former is not actually specific to a fetch operation (we do not yet
> do v2 receive-pack, but eventually we may, and ls-refs would support
> both operations).
>
> However, ls-refs does need to know which operation we're performing, in
> order to read the correct config (e.g., uploadpack.hiderefs versus
> receive.hiderefs; we don't read _either_ right now, but this is the
> first step to fixing that).
>
> In the generic "git-serve" program, we don't have that information. But
> in the protocol as it is actually used by clients, the client still asks
> to run "git-upload-pack", and then issues an "ls-refs" from there. So we
> _do_ know at that point that "uploadpack" is the right config context.
> This patch teaches the protocol v2 "serve" code to pass that context
> through to the individual commands (a future patch will teach ls-refs to
> actually use it).

Thanks for a clear description of ugly status quo X-<.

> diff --git a/ls-refs.h b/ls-refs.h
> index b62877e8da..da26fc9824 100644
> --- a/ls-refs.h
> +++ b/ls-refs.h
> @@ -4,7 +4,8 @@
>  struct repository;
>  struct argv_array;
>  struct packet_reader;
> -extern int ls_refs(struct repository *r, struct argv_array *keys,
> +extern int ls_refs(struct repository *r, const char *config_context,
> +		   struct argv_array *keys,
>  		   struct packet_reader *request);

One thing I wonder is if we want to pass the whole *_opt thing,
instead of only one field out of it.

>  #endif /* LS_REFS_H */
> diff --git a/serve.c b/serve.c
> index bda085f09c..70f89cf0d9 100644
> --- a/serve.c
> +++ b/serve.c
> @@ -48,6 +48,7 @@ struct protocol_capability {
>  	 * This field should be NULL for capabilities which are not commands.
>  	 */
>  	int (*command)(struct repository *r,
> +		       const char *config_context,

Likewise here.

>  		       struct argv_array *keys,
>  		       struct packet_reader *request);
>  };

Thanks.

^ permalink raw reply

* Re: [PATCH 2/3] parse_hide_refs_config: handle NULL section
From: Junio C Hamano @ 2018-12-14  2:11 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Brandon Williams, Jonathan Tan
In-Reply-To: <20181211104358.GB7233@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> This helper function looks for config in two places: transfer.hiderefs,
> or $section.hiderefs, where $section is passed in by the caller (and is
> "uploadpack" or "receive", depending on the context).
>
> In preparation for callers which do not even have that context (namely
> the "git-serve" command), let's handle a NULL by looking only at
> transfer.hiderefs (as opposed to segfaulting).

Makes sense, and I too wonder if we want to keep the "serve" thing
in the longer term, at least in the current form.


^ permalink raw reply

* Re: [PATCH v2 8/8] tests: mark tests broken under GIT_TEST_PROTOCOL_VERSION=2
From: Junio C Hamano @ 2018-12-14  2:18 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Brandon Williams, Jonathan Tan
In-Reply-To: <87pnu51kac.fsf@evledraar.gmail.com>

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> +	# Other protocol versions (e.g. 2) allow fetching an unadvertised
>> +	# object, so run this test with the default protocol version (0).
>> +	test_must_fail env GIT_TEST_PROTOCOL_VERSION= git -C client fetch-pack ../server \
>>  		$(git -C server rev-parse refs/heads/master^) 2>err &&
>
> What? So the equivalent of uploadpack.allowAnySHA1InWant=true is on for
> v2 all the time?

UnfortunatelyYes (see Peff's three-patch series).

^ permalink raw reply

* Re: [PATCH v2 1/2] .gitattributes: ensure t/oid-info/* has eol=lf
From: Junio C Hamano @ 2018-12-14  2:23 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Derrick Stolee via GitGitGadget, git, Derrick Stolee,
	SZEDER Gábor
In-Reply-To: <6d6ca9a2-9245-e5e9-6d7a-e6d3004054bb@gmail.com>

Derrick Stolee <stolee@gmail.com> writes:

>> FWIW, I personally do not think "is sensitive to CRLF" is too bad,
>> as my attempt to clarify it does not make it much better, e.g.
>>
>> 	The logic to read from these files in shell uses built-in
>> 	"read" command, which leaves CR at the end of these text
>> 	files when they are checked out with CRLF line endings, at
>> 	least when run with bash shipped with Git for Windows.  This
>> 	results in an unexpected value in the variable these lines
>> 	are read into, leading the tests to fail.
>>
>> So, I'll keep what I queued when I received v1 for now.
>
> Sorry for missing the edit to the message. You are correct that v2
> just added one commit.
>
> I like your rewording, if you feel like editing it.

I'm kinda neutral ;-).

^ permalink raw reply

* Re: [PATCH] cherry-pick: do not error on non-merge commits when '-m 1' is specified
From: Junio C Hamano @ 2018-12-14  2:36 UTC (permalink / raw)
  To: Sergey Organov; +Cc: git
In-Reply-To: <878t0twibw.fsf@javad.com>

Sergey Organov <sorganov@gmail.com> writes:

> I came up with the following as a preparatory change. Looks acceptable?
>
> -- 8< --
>
>     t3510: stop using '-m 1' to force failure mid-sequence of cherry-picks
>     
>     We are going to allow 'git cherry-pick -m 1' for non-merge commits, so
>     this method to force failure will stop to work.
>     
>     Use '-m 4' instead as it's very unlikely we will ever have such an
>     octopus in this test setup.

Yeah, that is a good approach.  Thanks for coming up with it.

I agree that it also is a good idea to use a variable to avoid
repeating "4" (and risking the two uses of the constant drifting
apart), but I find a single letter variable 'm' a bit too bland and
not descriptive enough.  Perhaps spell it out as mainline=4,
possibly with a comment why that is not a more-commonly-seen number
like "1"?

	# to make sure that the session to cherry-pick a sequence
	# gets interrupted, use a high-enough number that is larger
	# than the number of parents of any commit
	mainline=4 &&

or something.

> 	Modified   t/t3510-cherry-pick-sequence.sh
> diff --git a/t/t3510-cherry-pick-sequence.sh b/t/t3510-cherry-pick-sequence.sh
> index c84eeef..a873cf4 100755
> --- a/t/t3510-cherry-pick-sequence.sh
> +++ b/t/t3510-cherry-pick-sequence.sh
> @@ -61,7 +61,8 @@ test_expect_success 'cherry-pick mid-cherry-pick-sequence' '
>  
>  test_expect_success 'cherry-pick persists opts correctly' '
>  	pristine_detach initial &&
> -	test_expect_code 128 git cherry-pick -s -m 1 --strategy=recursive -X patience -X ours initial..anotherpick &&
> +	m=4 &&
> +	test_expect_code 128 git cherry-pick -s -m $m --strategy=recursive -X patience -X ours initial..anotherpick &&
>  	test_path_is_dir .git/sequencer &&
>  	test_path_is_file .git/sequencer/head &&
>  	test_path_is_file .git/sequencer/todo &&
> @@ -69,7 +70,7 @@ test_expect_success 'cherry-pick persists opts correctly' '
>  	echo "true" >expect &&
>  	git config --file=.git/sequencer/opts --get-all options.signoff >actual &&
>  	test_cmp expect actual &&
> -	echo "1" >expect &&
> +	echo "$m" >expect &&
>  	git config --file=.git/sequencer/opts --get-all options.mainline >actual &&
>  	test_cmp expect actual &&
>  	echo "recursive" >expect &&
>
> -- 8< --

^ permalink raw reply

* Re: Preparing for 2.20.1 brown-paper-bag maintenance release
From: Junio C Hamano @ 2018-12-14  2:40 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Ævar Arnfjörð Bjarmason, Git Mailing List,
	Johannes Schindelin, Derrick Stolee
In-Reply-To: <CACsJy8CexGXtyBa1oNP0jVx-+8JPTcfTd5wf1ZyfJWf8HDtLVQ@mail.gmail.com>

Duy Nguyen <pclouds@gmail.com> writes:

> On Thu, Dec 13, 2018 at 7:08 PM Duy Nguyen <pclouds@gmail.com> wrote:
>> There's also a bug in my patch (-2 is already being used by
>> parse_opt_unknown_cb and my patch will change behavior of git-blame at
>> least in theory).
>
> Ah no. Too many magic numbers in parse-options.c code. It's working
> fine but I'll need to give these some names to avoid confusion in the
> future.

OK.


^ permalink raw reply

* Re: [PATCH] submodule update: run at most one fetch job unless otherwise set
From: Junio C Hamano @ 2018-12-14  2:53 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, sjon
In-Reply-To: <20181213190248.247083-1-sbeller@google.com>

Stefan Beller <sbeller@google.com> writes:

> From: Junio C Hamano <gitster@pobox.com>
>
> In a028a1930c (fetching submodules: respect `submodule.fetchJobs`
> config option, 2016-02-29), we made sure to keep the default behavior
> of a fetching at most one submodule at once when not setting the
> newly introduced `submodule.fetchJobs` config.
>
> This regressed in 90efe595c5 (builtin/submodule--helper: factor
> out submodule updating, 2018-08-03). Fix it.
>
> Reported-by: Sjon Hortensius <sjon@parse.nl>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---

Thanks for tying the loose ends.

We may want to convert the _INIT macro to use the designated
initializers; I had to count the fields twice to make sure I was
tweaking the right one.  

It did not help that I saw, before looking at the current code, that
90efe595c5 added one field at the end to the struct but did not
touch _INIT macro at all.  That made me guess that max_jobs=0 was
due to _missing_ initialization value, leading me to an incorrect
fix to append an extra 1 at the end, but that was bogus.  The
missing initialization left by 90efe595 ("builtin/submodule--helper:
factor out submodule updating", 2018-08-03) was silently fixed by
f1d15713 ("builtin/submodule--helper: store update_clone information
in a struct", 2018-08-03), I think, so replacing the 0 at the end is
1 happens to be the right fix, but with designated initializers, all
these confusions are more easily avoided.

>  builtin/submodule--helper.c | 2 +-
>  t/t5526-fetch-submodules.sh | 2 ++
>  2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index d38113a31a..1f8a4a9d52 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -1551,7 +1551,7 @@ struct submodule_update_clone {
>  #define SUBMODULE_UPDATE_CLONE_INIT {0, MODULE_LIST_INIT, 0, \
>  	SUBMODULE_UPDATE_STRATEGY_INIT, 0, 0, -1, STRING_LIST_INIT_DUP, 0, \
>  	NULL, NULL, NULL, \
> -	NULL, 0, 0, 0, NULL, 0, 0, 0}
> +	NULL, 0, 0, 0, NULL, 0, 0, 1}
>  
>  
>  static void next_submodule_warn_missing(struct submodule_update_clone *suc,
> diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
> index 6c2f9b2ba2..a0317556c6 100755
> --- a/t/t5526-fetch-submodules.sh
> +++ b/t/t5526-fetch-submodules.sh
> @@ -524,6 +524,8 @@ test_expect_success 'fetching submodules respects parallel settings' '
>  	git config fetch.recurseSubmodules true &&
>  	(
>  		cd downstream &&
> +		GIT_TRACE=$(pwd)/trace.out git fetch &&
> +		grep "1 tasks" trace.out &&
>  		GIT_TRACE=$(pwd)/trace.out git fetch --jobs 7 &&
>  		grep "7 tasks" trace.out &&
>  		git config submodule.fetchJobs 8 &&

^ permalink raw reply

* Re: [Question] Complex textconv text
From: Junio C Hamano @ 2018-12-14  3:12 UTC (permalink / raw)
  To: Randall S. Becker; +Cc: git
In-Reply-To: <004501d492f5$98f0fcc0$cad2f640$@nexbridge.com>

"Randall S. Becker" <rsbecker@nexbridge.com> writes:

> Hi all,
>
> I have a strange situation and need help with resolving funky characters in
> .git/config. My situation is this:
>
> [diff "*.dat"]
> 	textconv = enscribe-conv
> --format=-a1\(A=-a1,-a16,-a32\|P=-a1,-a32,-a16\|=-a1,-d,a14\),-a224
>
> Basically this is a formatter for diff so that I can show structured binary
> data. The unquoted syntax of the format string is:
>  --format=-a1(A=-a1,-a16,-a32|P=-a1,-a32,-a16|=-a1,-d,a14),-a224
>
> Content is not really important. The issue is that git is reporting fatal:
> bad config line 2 in file .git/config when I escape the (, ), and |
> characters.

That failure is understandable, as

    The following escape sequences (beside `\"` and `\\`) are recognized:
    `\n` for newline character (NL), `\t` for horizontal tabulation (HT, TAB)
    and `\b` for backspace (BS).  Other char escape sequences (including octal
    escape sequences) are invalid.

is what Documentation/config.txt says.  \(, \) and \| is not a way
to escape these letters from the .git/config parser (they do not
need to be escaped from .git/config parser)..

> I get syntax errors otherwise from the shell running the
> textconv. I have tried
> --format="-a1(A=-a1,-a16,-a32|P=-a1,-a32,-a16|=-a1,-d,a14),-a224", to no
> avail.

Would 

	textconv = enscribe-conv --format=\"-a1(A=...,-a224\"

work?

We want to show the shell what you wrote so that pipes and parens
are inside a dq pair, but the .git/config language strips pair of
dq, so the .git/config language parser needs to be told that these
dq are not for it to eat.

^ permalink raw reply

* Re: [PATCH v3] revision: use commit graph in get_reference()
From: Junio C Hamano @ 2018-12-14  3:20 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, peff, stolee
In-Reply-To: <20181213185450.230953-1-jonathantanmy@google.com>

Jonathan Tan <jonathantanmy@google.com> writes:

> When fetching into a repository, a connectivity check is first made by
> check_exist_and_connected() in builtin/fetch.c that runs:
>
>   git rev-list --objects --stdin --not --all --quiet <(list of objects)
>
> If the client repository has many refs, this command can be slow,
> regardless of the nature of the server repository or what is being
> fetched. A profiler reveals that most of the time is spent in
> setup_revisions() (approx. 60/63), and of the time spent in
> setup_revisions(), most of it is spent in parse_object() (approx.
> 49/60). This is because setup_revisions() parses the target of every ref
> (from "--all"), and parse_object() reads the buffer of the object.
>
> Reading the buffer is unnecessary if the repository has a commit graph
> and if the ref points to a commit (which is typically the case). This
> patch uses the commit graph wherever possible; on my computer, when I
> run the above command with a list of 1 object on a many-ref repository,
> I get a speedup from 1.8s to 1.0s.
>
> Another way to accomplish this effect would be to modify parse_object()
> to use the commit graph if possible; however, I did not want to change
> parse_object()'s current behavior of always checking the object
> signature of the returned object.
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
> This patch is still on master. Junio, let me know if you would rather
> have me base it on sb/more-repo-in-api instead.

Unless we all agree that we will abandon sb/more-repo-in-api,
rerolling this on 'master' will force me to resolve similar but
different conflicts every time.  Unless we fast-track that other
topic, that is, but I do not think that is what you meant to do.

> Change in v3: Now uses a simpler API with the algorithm suggested by
> Peff in [2], except that I also retain the existing optimization that
> checks if graph_pos is already set.

OK.

^ permalink raw reply

* Re: [PATCH 1/1] worktree refs: fix case sensitivity for 'head'
From: Junio C Hamano @ 2018-12-14  3:31 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Mike Rappazzo, gitgitgadget, Git Mailing List
In-Reply-To: <CACsJy8Bj=8xHp3JA8dLbyM=RwJey7utMK6DTVe_0AjBNVHxJyg@mail.gmail.com>

Duy Nguyen <pclouds@gmail.com> writes:

> If you make "head" work like "HEAD", then it should work for _all_
> commands, not just worktree, and "MASTER" should match
> "refs/heads/master" and so on. I don't think it's as simple as
> changing strcmp to strcasecmp. You would need to make ref management
> case-insensitive (and make sure if still is case-sensitive if
> configured so). I don't think anybody has managed that.

And it is unclear why anybody would even want to do so.

Thanks for a doze of sanity.

^ permalink raw reply

* Re: [PATCH 0/4] Expose gpgsig in pretty-print
From: Michał Górny @ 2018-12-14  4:11 UTC (permalink / raw)
  To: John Passaro, git; +Cc: gitster, alex.crezoff, peff
In-Reply-To: <20181213212256.48122-1-john.a.passaro@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2378 bytes --]

On Thu, 2018-12-13 at 16:22 -0500, John Passaro wrote:
> Currently, users who do not have GPG installed have no way to discern
> signed from unsigned commits without examining raw commit data. I
> propose two new pretty-print placeholders to expose this information:
> 
> %GR: full ("R"aw) contents of gpgsig header
> %G+: Y/N if the commit has nonempty gpgsig header or not
> 
> The second is of course much more likely to be used, but having exposed
> the one, exposing the other too adds almost no complexity.
> 
> I'm open to suggestion on the names of these placeholders.
> 
> This commit is based on master but e5a329a279 ("run-command: report exec
> failure" 2018-12-11) is required for the tests to pass.
> 
> One note is that this change touches areas of the pretty-format
> documentation that are radically revamped in aw/pretty-trailers: see
> 42617752d4 ("doc: group pretty-format.txt placeholders descriptions"
> 2018-12-08). I have another version of this branch based on that branch
> as well, so you can use that in case conflicts with aw/pretty-trailers
> arise.
> 
> See:
> - https://github.com/jpassaro/git/tree/jp/pretty-expose-gpgsig
> - https://github.com/jpassaro/git/tree/jp/pretty-expose-gpgsig--based-on-aw-pretty-trailers
> 
> John Passaro (4):
>   pretty: expose raw commit signature
>   t/t7510-signed-commit.sh: test new placeholders
>   doc, tests: pretty behavior when gpg missing
>   docs/pretty-formats: add explanation + copy edits
> 
>  Documentation/pretty-formats.txt |  21 ++++--
>  pretty.c                         |  36 ++++++++-
>  t/t7510-signed-commit.sh         | 125 +++++++++++++++++++++++++++++--
>  3 files changed, 167 insertions(+), 15 deletions(-)
> 
> 
> base-commit: 5d826e972970a784bd7a7bdf587512510097b8c7
> prerequisite-patch-id: aedfe228fd293714d9cd0392ac22ff1cba7365db

Just a suggestion: since the raw signature is not very useful without
the commit data to check it against, and the commit data is non-trivial
to construct (requires mangling raw data anyway), maybe you could either
add another placeholder to get the data for signature verification, or
(alternatively or simultaneously) add a placeholder that prints both
data and signature in the OpenPGP message format (i.e. something you can
pass straight to 'gpg --verify').

-- 
Best regards,
Michał Górny

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 963 bytes --]

^ permalink raw reply

* Re: [PATCH 2/2] RF+ENH(TST): compare the entire list of submodule status --recursive to stay intact
From: Yaroslav O Halchenko @ 2018-12-14  4:22 UTC (permalink / raw)
  To: git
In-Reply-To: <CAGZ79kaLpMa+AYwtNh+HUkYk3ORDephxZ74hcTbS=z1CQ+bf6A@mail.gmail.com>


On Thu, 13 Dec 2018, Stefan Beller wrote:

> > cool, thanks for the feedback - I will then try to make it happen

> > quick one (so when I get to it I know):  should I replicate all those
> > tests you have for other update strategies? (treating of config
> > specifications etc)

> If there is a sensible way to do so?
> I have the impression that there are enough differences, that it
> may not be possible to replicate all tests meaningfully from the
> other modes.

oh, by replicate I just meant to copy/paste and adjust for expected for
--reset-hard test behavior (and possibly introduced helper),
nothing fancy, just duplication as for replication ;-) 

> > There is no easy way to parametrize them somehow?

> There is t/lib-submodule-update.sh, which brings this to
> an extreme, as it makes a "test suite in a test suite"; and I would
> not follow that example for this change.

ok

> > ;)    In Python world I might have mocked the actual underlying call to
> > update, to see what option it would be getting and assure that it is the
> > one I specified via config, and then sweepped through all of them
> > to make sure nothing interim changes it.  Just wondering if may be
> > something like that exists in git's tests support.

> gits tests are very heavy on end to end testing, i.e. run a whole command
> and observe its output. This makes our command setup code, (i.e. finding
> the repository, parsing options, reading possible config, etc) a really well
> exercised code path. ;-)

> There is a recent push towards testing only units, most of
> t/helper is used for that, e.g. c.f. 4c7bb45269 (test-reach:
> test get_reachable_subset, 2018-11-02).

> So if you have a good idea how to focus the submodule
> tests more on the (new) unit that you add, that would be cool.

no, not really any good ideas -- I am new here, but I will keep an eye open.

> > BTW - sorry if RTFM and unrelated, is there  a way to

> >     update --merge

> > but allowing only  fast-forwards?  My use case is collection of this
> > submodules: http://datasets.datalad.org/?dir=/openneuro  which all
> > should come from github and I should not have any changes of my own.

> So you want the merge option  --ff-only
> to be passed to the submodule merge command. I guess you could make
> a patch, that update takes another option (--ff-only, only useful when
> --merge is given), which is then propagated.

> I am not sure if we could have a more generalized option passing,
> which would allow to pass any option (for its respective command)
> to the command that is run in the update mode.

wouldn't it be (theoretically) possible, in principle, to pass
them via some config variable?  e.g. instead of  

submodule update --reset-hard

have

-c submodule.update.reset.opts=--hard update --reset

and then analogously

-c submodule.update.merge.opts=--ff-only update --merge

(--ff-only I guess would make no sense for any "supermodule" - a repo
with submodules)

> > Sure thing if all is clean etc, merge should result in fast-forward.  I
> > just do not want to miss a case where there was some (temporary?) "dirt"
> > which I forgot to reset and it would then get merged etc.

> maybe use --rebase, such that your potential change would bubble
> up and possibly produce a merge conflict?

that is a good idea as a workaround, thanks!

-- 
Yaroslav O. Halchenko
Center for Open Neuroscience     http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik        

^ permalink raw reply

* Re: [PATCH 1/1] worktree refs: fix case sensitivity for 'head'
From: Jacob Keller @ 2018-12-14  6:49 UTC (permalink / raw)
  To: Mike Rappazzo
  Cc: Stefan Beller, Nguyễn Thái Ngọc, gitgitgadget,
	Git List, Junio C Hamano
In-Reply-To: <CANoM8SWQTAEYGiUC9PnWi8u9oAJYPcyiE5+5usoRvR7Vw2z0JA@mail.gmail.com>

On Thu, Dec 13, 2018 at 1:16 PM Mike Rappazzo <rappazzo@gmail.com> wrote:
>
> On Thu, Dec 13, 2018 at 3:48 PM Stefan Beller <sbeller@google.com> wrote:
> >
> > > > The current situation is definitely a problem.  If I am in a worktree,
> > > > using "head" should be the same as "HEAD".
> >
> > By any chance, is your file system case insensitive?
> > That is usually the source of confusion for these discussions.
>
> This behavior is the same for MacOS (High Sierra) and Windows 7.  I
> assume other derivatives of those act the same.
>
> On CentOS "head" is an ambiguous ref.  If Windows and Mac resulted in
> an ambiguous ref, that would also be OK, but as it is now, they return
> the result of "HEAD" on the primary worktree.
>

Because refs are *not* case sensitive, and we know that "HEAD" should
be per-worktree, it gets checked in the per-worktree refs section. But
lowercase head is known to not be a per-worktree ref, so we then ask
the main worktree about head. Since you happen to be on a case
insensitive file system, it then finds refs/HEAD in the main refs
worktree, and returns that.

I don't understand why the CentOS shows it as ambiguous, unless you
actually happen to have a ref named head. (possibly a branch?)

I suspect we could improve things by attempting to figure out if our
file system is case insensitive and warn users. However, I recall
patches which tried this, and no suitable method was found. Partly
because it's not just "case" that is the only problem. There might be
things like unicode characters which don't get properly encoded, etc.

The best solution would be to get a non-filesystem backed ref storage
working that could be used in place of the filesystem.

Thanks,
Jake

> >
> > Maybe in worktree code we have a spillover between path
> > resolution and ref namespace?

^ permalink raw reply

* Re: [PATCH 1/1] worktree refs: fix case sensitivity for 'head'
From: Duy Nguyen @ 2018-12-14  7:37 UTC (permalink / raw)
  To: Jacob Keller
  Cc: Mike Rappazzo, Stefan Beller, gitgitgadget, Git Mailing List,
	Junio C Hamano
In-Reply-To: <CA+P7+xonxvfuhw4W+FUL87We8CaOwxsndFkN5bcgBhdsnZ5QAg@mail.gmail.com>

On Fri, Dec 14, 2018 at 7:50 AM Jacob Keller <jacob.keller@gmail.com> wrote:
>
> On Thu, Dec 13, 2018 at 1:16 PM Mike Rappazzo <rappazzo@gmail.com> wrote:
> >
> > On Thu, Dec 13, 2018 at 3:48 PM Stefan Beller <sbeller@google.com> wrote:
> > >
> > > > > The current situation is definitely a problem.  If I am in a worktree,
> > > > > using "head" should be the same as "HEAD".
> > >
> > > By any chance, is your file system case insensitive?
> > > That is usually the source of confusion for these discussions.
> >
> > This behavior is the same for MacOS (High Sierra) and Windows 7.  I
> > assume other derivatives of those act the same.
> >
> > On CentOS "head" is an ambiguous ref.  If Windows and Mac resulted in
> > an ambiguous ref, that would also be OK, but as it is now, they return
> > the result of "HEAD" on the primary worktree.
> >
>
> Because refs are *not* case sensitive, and we know that "HEAD" should
> be per-worktree, it gets checked in the per-worktree refs section. But
> lowercase head is known to not be a per-worktree ref, so we then ask
> the main worktree about head. Since you happen to be on a case
> insensitive file system, it then finds refs/HEAD in the main refs
> worktree, and returns that.
>
> I don't understand why the CentOS shows it as ambiguous, unless you
> actually happen to have a ref named head. (possibly a branch?)

I think it's just our default answer when we can't decide

$ git rev-parse head
head
fatal: ambiguous argument 'head': unknown revision or path not in the
working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
$ git rev-parse head --
fatal: bad revision 'head'

> I suspect we could improve things by attempting to figure out if our
> file system is case insensitive and warn users. However, I recall
> patches which tried this, and no suitable method was found. Partly
> because it's not just "case" that is the only problem. There might be
> things like unicode characters which don't get properly encoded, etc.
>
> The best solution would be to get a non-filesystem backed ref storage
> working that could be used in place of the filesystem.

Even with a new ref storage, I'm pretty sure pseudo refs like HEAD,
FETCH_HEAD... will forever be backed by filesystem. HEAD for example
is part of the repository signature and must exist as a file. We could
also lookup pseudo refs with readdir() instead of lstat(). On
case-preserving-and-insensitive filesystems, we can reject "head" this
way. But that comes with a high cost.
-- 
Duy

^ permalink raw reply

* Re: [PATCH 1/3] serve: pass "config context" through to individual commands
From: Jeff King @ 2018-12-14  8:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Brandon Williams, Jonathan Tan
In-Reply-To: <xmqqo99oeu4u.fsf@gitster-ct.c.googlers.com>

On Fri, Dec 14, 2018 at 11:09:53AM +0900, Junio C Hamano wrote:

> > diff --git a/ls-refs.h b/ls-refs.h
> > index b62877e8da..da26fc9824 100644
> > --- a/ls-refs.h
> > +++ b/ls-refs.h
> > @@ -4,7 +4,8 @@
> >  struct repository;
> >  struct argv_array;
> >  struct packet_reader;
> > -extern int ls_refs(struct repository *r, struct argv_array *keys,
> > +extern int ls_refs(struct repository *r, const char *config_context,
> > +		   struct argv_array *keys,
> >  		   struct packet_reader *request);
> 
> One thing I wonder is if we want to pass the whole *_opt thing,
> instead of only one field out of it.

I actually started by doing that, but "struct serve_options" is not
currently known by ls-refs.c, upload-pack.c, etc. So they'd have to
start including "serve.h". I don't think that's the end of the world,
but it felt like a funny mutual circular to me (my mental model now is
that ls-refs, upload-pack, etc are low-level commands, tied together by
the "serve" stuff).

-Peff

^ permalink raw reply

* Re: [PATCH 0/3] protocol v2 and hidden refs
From: Jeff King @ 2018-12-14  8:35 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, bwilliamseng
In-Reply-To: <20181213195305.249059-1-jonathantanmy@google.com>

On Thu, Dec 13, 2018 at 11:53:05AM -0800, Jonathan Tan wrote:

> >     I don't know if there's a good solution. I tried running the whole
> >     test suite with v2 as the default. It does find this bug, but it has
> >     a bunch of other problems (notably fetch-pack won't run as v2, but
> >     some other tests I think also depend on v0's reachability rules,
> >     which v2 is documented not to enforce).
> 
> I think Aevar's patches (sent after you wrote this) is a good start, and
> I have started looking at it too.

Yeah, I'm excited to see it working with fetch-pack, as the current
behavior is to complain if you've tried to enable v2 config:

  $ git config protocol.version 2
  $ git fetch-pack git://github.com/git/git
  fatal: support for protocol v2 not implemented yet

I haven't actually run into it in the real world, but somebody might if
they have scripted around fetch-pack and are experimenting with v2. A
much friendlier behavior would be falling back to v1, but actually
supporting v2 is better still. :)

> >   - The "serve" command is funky, because it has no concept of whether
> >     the "ls-refs" is for fetching or pushing. Is git-serve even a thing
> >     that we want to support going forward?  I know part of the original
> >     v2 conception was that one would be able to just connect to
> >     "git-serve" and do a number of operations. But in practice the v2
> >     probing requires saying "I'd like to git-upload-pack, and v2 if you
> >     please". So no client ever calls git-serve.
> > 
> >     Is this something we plan to eventually move to? Or can it be
> >     considered a funny vestige of the development? In the latter case, I
> >     think we should consider removing it.
> 
> Personally, I lean towards removing it, but there are arguments on both
> sides. In particular, removing "serve" means that both developers and
> users of Git need not be concerned with a 3rd endpoint, but preserving
> "serve" (and planning to migrate away from "upload-pack" and
> "receive-pack") means that we will only have one endpoint, eliminating
> confusion about which endpoint to use when making certain requests (when
> we add requests other than "fetch" and "push").

Yeah, at first glance I like the simplicity of a unified model. But the
separate fetch/push endpoints have been useful in the past. Separate
uploadpack/receive.hiderefs that I dealt with here are one form. Another
is that many people do HTTP access control using the endpoints. For
example, if I have a repo which is public-read and private-write, the
config we recommend in git-http-backend(1) is to lock down the
receive-pack access using webserver config.

If all the webserver sees is "somebody wants to connect to git-serve",
it doesn't know if it should be challenging them for authentication or
not. It would have to start peeking into the git-serve conversation to
see what the client actually wants to do. That's _possible_ to do, but
it gets pretty awkward with existing webserver tools (whereas matching
the URI endpoint is pretty easy).

Ditto for locked down ssh sessions like git-shell (or custom solutions
like gitolite). Right now we can "git-upload-pack is OK on this repo,
git-receive-pack is not". But blindly running "git serve" would be
dangerous. In this case I think we have a few more options, because the
user has always already authenticated. So we can just tell "git serve"
via the environment whether the user is authorized for push. It's harder
with HTTP because most setups avoid even challenging for auth unless
it's necessary.

So I'm a bit worried that the unified endpoint model is going to be a
dead end, at which point carrying around git-serve just makes things
more complicated.

-Peff

^ permalink raw reply

* Re: [PATCH 1/3] serve: pass "config context" through to individual commands
From: Jonathan Nieder @ 2018-12-14  8:36 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Brandon Williams, Jonathan Tan
In-Reply-To: <20181211104342.GA7233@sigill.intra.peff.net>

Hi,

Jeff King wrote:

> In protocol v2, instead of just running "upload-pack", we have a generic
> "serve" loop which runs command requests from the client. What used to
> be "upload-pack" is now generally split into two operations: "ls-refs"
> and "fetch". The latter knows it must respect uploadpack.* config, but
> the former is not actually specific to a fetch operation (we do not yet
> do v2 receive-pack, but eventually we may, and ls-refs would support
> both operations).

I think I'm missing something.  Why wouldn't "ls-refs for push" not pass
the information that it's for push as part of the *body* of the ls-refs
request?

(That's a separate issue from whether we need to have ls-refs for push
at all, as opposed to specifying a policy for the requested ref
updates and getting a list of "have"s without ref names attached.  But
that's a discussion for another day.)

Is there some other more immediate motivation for this patch?  In the
spirit of YAGNI, I would rather understand that motivation instead of
one that in many possible designs would never materialize.

Thanks,
Jonathan

^ permalink raw reply

* Re: [PATCH v3] revision: use commit graph in get_reference()
From: Jeff King @ 2018-12-14  8:45 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, stolee, gitster
In-Reply-To: <20181213185450.230953-1-jonathantanmy@google.com>

On Thu, Dec 13, 2018 at 10:54:50AM -0800, Jonathan Tan wrote:

> -static int parse_commit_in_graph_one(struct commit_graph *g, struct commit *item)
> +static struct commit *parse_commit_in_graph_one(struct repository *r,
> +						struct commit_graph *g,
> +						const struct object_id *oid)

Making sure I understand the new logic...

>  {
> +	struct object *obj;
> +	struct commit *commit;
>  	uint32_t pos;
>  
> -	if (item->object.parsed)
> -		return 1;
> +	obj = lookup_object(r, oid->hash);
> +	commit = obj && obj->type == OBJ_COMMIT ? (struct commit *) obj : NULL;
> +	if (commit && obj->parsed)
> +		return commit;

OK, so if it's a commit and we have it parsed, we return that. By using
lookup_object(), if it's a non-commit, we haven't changed anything.
Good.

> -	if (find_commit_in_graph(item, g, &pos))
> -		return fill_commit_in_graph(item, g, pos);
> +	if (commit && commit->graph_pos != COMMIT_NOT_FROM_GRAPH)
> +		pos = commit->graph_pos;
> +	else if (bsearch_graph(g, oid, &pos))
> +		; /* bsearch_graph sets pos */
> +	else
> +		return NULL;

And then we try to find it in the commit graph. If we didn't, then we'll
end up returning NULL. Good.

> -	return 0;
> +	if (!commit) {
> +		commit = lookup_commit(r, oid);
> +		if (!commit)
> +			return NULL;
> +	}

And at this point we found it in the commit graph, so we know it's a
commit. lookup_commit() should succeed, but in the off chance that it's
in the commit graph _and_ we previously found it as a non-commit
(yikes!), we'll return NULL. That's equivalent to just pretending we
didn't find it in the commit graph, and the caller can sort it out (when
they read the object, either it will match the previous type, or it
really will be a commit and they'll follow the normal complaining path).
Good.

So this all makes sense. The one thing we don't do here is actually
parse an unparsed commit that isn't in the graph, and instead leave that
to the caller. E.g. get_reference() now does:

> -	object = parse_object(revs->repo, oid);
> +	object = (struct object *) parse_commit_in_graph(revs->repo, oid);
> +	if (!object)
> +		object = parse_object(revs->repo, oid);

In theory we could save another lookup_object() in parse_object() by
combining these steps, but I don't think it's really worth worrying too
much about.

So overall this looks good to me.

-Peff

^ permalink raw reply

* Re: [PATCH 1/3] serve: pass "config context" through to individual commands
From: Jeff King @ 2018-12-14  8:55 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, Brandon Williams, Jonathan Tan
In-Reply-To: <20181214083621.GA7121@google.com>

On Fri, Dec 14, 2018 at 12:36:21AM -0800, Jonathan Nieder wrote:

> Jeff King wrote:
> 
> > In protocol v2, instead of just running "upload-pack", we have a generic
> > "serve" loop which runs command requests from the client. What used to
> > be "upload-pack" is now generally split into two operations: "ls-refs"
> > and "fetch". The latter knows it must respect uploadpack.* config, but
> > the former is not actually specific to a fetch operation (we do not yet
> > do v2 receive-pack, but eventually we may, and ls-refs would support
> > both operations).
> 
> I think I'm missing something.  Why wouldn't "ls-refs for push" not pass
> the information that it's for push as part of the *body* of the ls-refs
> request?

I don't know. Why doesn't the current "ls-refs" say "ls-refs for fetch"?

Certainly if that information was carried from the client request it
would work fine, and ls-refs would have enough to know which config to
respect. But I could not find any documentation on this, nor discussion
of plans for a v2 push. Since that information isn't passed now, we'd
have to assume that "ls-refs" without "for-push" always means "for
fetch".

I'm conceptually OK with that, but if that is the plan for going
forward, it was not at all obvious to me (and it does feel rather
implicit).

> Is there some other more immediate motivation for this patch?  In the
> spirit of YAGNI, I would rather understand that motivation instead of
> one that in many possible designs would never materialize.

Without this information, in patch 3 ls-refs cannot know to look at
uploadpack.hiderefs, unless it makes the implicit assumption that it is
always serving a fetch.

-Peff

^ permalink raw reply

* Re: [PATCH 1/3] serve: pass "config context" through to individual commands
From: Jonathan Nieder @ 2018-12-14  9:28 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Brandon Williams, Jonathan Tan
In-Reply-To: <20181214085507.GD11777@sigill.intra.peff.net>

Jeff King wrote:
> On Fri, Dec 14, 2018 at 12:36:21AM -0800, Jonathan Nieder wrote:
>> Jeff King wrote:

>>> In protocol v2, instead of just running "upload-pack", we have a generic
>>> "serve" loop which runs command requests from the client. What used to
>>> be "upload-pack" is now generally split into two operations: "ls-refs"
>>> and "fetch". The latter knows it must respect uploadpack.* config, but
>>> the former is not actually specific to a fetch operation (we do not yet
>>> do v2 receive-pack, but eventually we may, and ls-refs would support
>>> both operations).
>>
>> I think I'm missing something.  Why wouldn't "ls-refs for push" not pass
>> the information that it's for push as part of the *body* of the ls-refs
>> request?
>
> I don't know. Why doesn't the current "ls-refs" say "ls-refs for fetch"?

Also YAGNI. ;-)

In other words, since the design for push isn't set in stone yet, we had
nothing to be consistent with.  And if there's an option to ls-ref to
indicate whether it's for fetch or for push, then it can default to
fetch.

As an aside, my experience from teaching people about Git protocol is
that the concept of "ls-remote for push" producing a different result
from "git ls-remote" is very confusing.  Given what it is used for, I am
motivated to replace it with something more tailored.

> Certainly if that information was carried from the client request it
> would work fine, and ls-refs would have enough to know which config to
> respect. But I could not find any documentation on this, nor discussion
> of plans for a v2 push.

Interesting.  The last discussion of push v2 plans was in
https://public-inbox.org/git/20180717210915.139521-1-bmwill@google.com/.
Expect to hear more soon.

>                         Since that information isn't passed now, we'd
> have to assume that "ls-refs" without "for-push" always means "for
> fetch".
>
> I'm conceptually OK with that, but if that is the plan for going
> forward, it was not at all obvious to me (and it does feel rather
> implicit).

Don't get me wrong: I haven't wrapped my head around config context
and how it fits into the broader picture yet, but it may be a very
good thing to have.  So please consider this comment to be about the
commit message only.

Based on the motivation you're describing here, I think treating it as
uploadpack and adding a NEEDSWORK comment would be a good way forward.
If we're moving toward a world with more protocol commands that don't
fit in the upload-pack / receive-pack categories, then we need to
figure out in more detail what that world looks like:

- do we keep on adding new endpoints, in the same spirit as
  upload-archive?  If so, what endpoint should a client use to get
  capabilities before it decides which endpoint to use?

- do we merge everything in "git serve" except where a specific
  endpoint is needed for protocol v0 compatibility?  That would lose
  the ability to distinguish fetches from pushes without looking at
  the body of requests (which is useful to some people for monitoring,
  blocking, etc) --- do we consider that to be an acceptable loss?

- once we've decided what the future should look like, how does the
  transition to that future look?

>> Is there some other more immediate motivation for this patch?  In the
>> spirit of YAGNI, I would rather understand that motivation instead of
>> one that in many possible designs would never materialize.
>
> Without this information, in patch 3 ls-refs cannot know to look at
> uploadpack.hiderefs, unless it makes the implicit assumption that it is
> always serving a fetch.

I think that's a reasonable assumption to make, especially if made
explicit using a simple comment. :)

Thanks for explaining,
Jonathan

^ permalink raw reply

* Re: [PATCH 1/3] serve: pass "config context" through to individual commands
From: Jeff King @ 2018-12-14  9:55 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, Brandon Williams, Jonathan Tan
In-Reply-To: <20181214092820.GB7121@google.com>

On Fri, Dec 14, 2018 at 01:28:20AM -0800, Jonathan Nieder wrote:

> > Certainly if that information was carried from the client request it
> > would work fine, and ls-refs would have enough to know which config to
> > respect. But I could not find any documentation on this, nor discussion
> > of plans for a v2 push.
> 
> Interesting.  The last discussion of push v2 plans was in
> https://public-inbox.org/git/20180717210915.139521-1-bmwill@google.com/.
> Expect to hear more soon.

The words "ls-refs" and "advertisement" are notably absent from that
thread. ;)

> > I'm conceptually OK with that, but if that is the plan for going
> > forward, it was not at all obvious to me (and it does feel rather
> > implicit).
> 
> Don't get me wrong: I haven't wrapped my head around config context
> and how it fits into the broader picture yet, but it may be a very
> good thing to have.  So please consider this comment to be about the
> commit message only.

If we're OK with accepting that the client will pass along the
fetch/push context for each individual command, then I don't think we
would ever need this. It is literally about relaying the fact of "the
original request was via upload-pack". If the commands already know the
context the client is interested in from another method, then I don't
think they should ever need to care about that fact.

> Based on the motivation you're describing here, I think treating it as
> uploadpack and adding a NEEDSWORK comment would be a good way forward.
> If we're moving toward a world with more protocol commands that don't
> fit in the upload-pack / receive-pack categories, then we need to
> figure out in more detail what that world looks like:
> 
> - do we keep on adding new endpoints, in the same spirit as
>   upload-archive?  If so, what endpoint should a client use to get
>   capabilities before it decides which endpoint to use?
> 
> - do we merge everything in "git serve" except where a specific
>   endpoint is needed for protocol v0 compatibility?  That would lose
>   the ability to distinguish fetches from pushes without looking at
>   the body of requests (which is useful to some people for monitoring,
>   blocking, etc) --- do we consider that to be an acceptable loss?
> 
> - once we've decided what the future should look like, how does the
>   transition to that future look?

I agree those are all interesting open questions. I didn't want to solve
any of them now, but just fix this (IMHO pretty serious) regression. I
was mostly trying to do so without making any assumptions about where
we'd go in the future (and even NEEDSWORK feels a little funny; it's not
clear to me whether that work is going to be needed or not).

> > Without this information, in patch 3 ls-refs cannot know to look at
> > uploadpack.hiderefs, unless it makes the implicit assumption that it is
> > always serving a fetch.
> 
> I think that's a reasonable assumption to make, especially if made
> explicit using a simple comment. :)

The big danger is that somebody does implement a "push" command and
forgets to touch ls-refs. That would be wrong and buggy for more reasons
than this (as you noted earlier, it should handle .have refs somehow).
But what worries me is that the failure mode for the bug is to start
exposing refs which are meant to be hidden. Which to me is a little more
serious than "the new functionality doesn't work".

So I guess I considered it to mostly be defensive (and I'd be fine if it
was later ripped out when a more elegant approach becomes obvious).

That said, I'm not totally opposed to the implicit thing if that's where
we all think the protocol code should be headed. The patch is certainly
smaller. The whole series could be replaced with this:

-- >8 --
Subject: [PATCH] upload-pack: support hidden refs with protocol v2

In the v2 protocol, upload-pack's advertisement has been moved to the
"ls-refs" command. That command does not respect hidden-ref config (like
transfer.hiderefs) at all, and advertises everything.

While there are some features that are not supported in v2 (e.g., v2
always allows fetching any sha1 without respect to advertisements), the
lack of this feature is not documented and is likely just a bug. Let's
make it work, as otherwise upgrading a server to a v2-capable git will
start exposing these refs that the repository admin has asked to remain
hidden.

Note that we assume we're operating on behalf of a fetch here, since
that's the only thing implemented in v2 at this point. See the in-code
comment.

Signed-off-by: Jeff King <peff@peff.net>
---
 ls-refs.c            | 16 ++++++++++++++++
 t/t5512-ls-remote.sh |  6 ++++++
 2 files changed, 22 insertions(+)

diff --git a/ls-refs.c b/ls-refs.c
index a06f12eca8..9c9a7c647f 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -5,6 +5,7 @@
 #include "argv-array.h"
 #include "ls-refs.h"
 #include "pkt-line.h"
+#include "config.h"

 /*
  * Check if one of the prefixes is a prefix of the ref.
@@ -40,6 +41,9 @@ static int send_ref(const char *refname, const struct object_id *oid,
 	const char *refname_nons = strip_namespace(refname);
 	struct strbuf refline = STRBUF_INIT;

+	if (ref_is_hidden(refname_nons, refname))
+		return 0;
+
 	if (!ref_match(&data->prefixes, refname))
 		return 0;

@@ -69,6 +73,16 @@ static int send_ref(const char *refname, const struct object_id *oid,
 	return 0;
 }

+static int ls_refs_config(const char *var, const char *value, void *data)
+{
+	/*
+	 * We only serve fetches over v2 for now, so respect only "uploadpack"
+	 * config. This may need to eventually be expanded to "receive", but we
+	 * don't yet know how that information will be passed to ls-refs.
+	 */
+	return parse_hide_refs_config(var, value, "uploadpack");
+}
+
 int ls_refs(struct repository *r, struct argv_array *keys,
 	    struct packet_reader *request)
 {
@@ -76,6 +90,8 @@ int ls_refs(struct repository *r, struct argv_array *keys,

 	memset(&data, 0, sizeof(data));

+	git_config(ls_refs_config, NULL);
+
 	while (packet_reader_read(request) != PACKET_READ_FLUSH) {
 		const char *arg = request->line;
 		const char *out;
diff --git a/t/t5512-ls-remote.sh b/t/t5512-ls-remote.sh
index 32e722db2e..ca69636fd5 100755
--- a/t/t5512-ls-remote.sh
+++ b/t/t5512-ls-remote.sh
@@ -204,6 +204,12 @@ test_expect_success 'overrides work between mixed transfer/upload-pack hideRefs'
 	grep refs/tags/magic actual
 '

+test_expect_success 'protocol v2 supports hiderefs' '
+	test_config uploadpack.hiderefs refs/tags &&
+	git -c protocol.version=2 ls-remote . >actual &&
+	! grep refs/tags actual
+'
+
 test_expect_success 'ls-remote --symref' '
 	git fetch origin &&
 	cat >expect <<-EOF &&
-- 
2.20.0.738.gdb22cab611

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox