* Re: [Suggestion] Documentation: quick-install-man not propagating DESTDIR
From: Junio C Hamano @ 2020-09-02 19:44 UTC (permalink / raw)
To: Randall S. Becker; +Cc: 'git'
In-Reply-To: <077a01d680af$2ad65510$8082ff30$@nexbridge.com>
"Randall S. Becker" <rsbecker@nexbridge.com> writes:
> The make quick-install-man rule is not propagating DESTDIR when GNU Make
> 4.2.1 is used.
I wonder, instead of having to change all "$(MAKE) -C elsewhere", we
can add DESTDIR to the list of variables that are exported.
... goes and looks ...
Hmph, DESTDIR is exported together with DIFF, TAR, INSTALL and
SHELL_PATH. We do rely on SHELL_PATH to be exported correctly to
t/Makefile for "make test" to work, so it is puzzling.
It is doubly puzzling that we use $(INSTALL) in Documentation/Makefile
on the same line as $(DESTDIR) is used, and apparently you are not
reporting problem on that one.
> It seems like a bit of a nit to report this, but I discovered that the
> installation is not putting the manuals in the same place as git. It’s a
> pretty simple fix. I can put a patch together if desired.
I do not think we want that patch. Instead I think we'd want a
patch that uses the same trick as what makes INSTALL work.
Thanks.
> diff --git a/Makefile b/Makefile
> index 372139f1f2..dae2d99a7f 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -2992,10 +2992,10 @@ install-gitweb:
> $(MAKE) -C gitweb install
>
> install-doc: install-man-perl
> - $(MAKE) -C Documentation install
> + $(MAKE) -C Documentation install DESTDIR=$(DESTDIR)
>
> install-man: install-man-perl
> - $(MAKE) -C Documentation install-man
> + $(MAKE) -C Documentation install-man DESTDIR=$(DESTDIR)
>
> install-man-perl: man-perl
> $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(mandir_SQ)/man3'
> @@ -3006,19 +3006,19 @@ install-html:
> $(MAKE) -C Documentation install-html
>
> install-info:
> - $(MAKE) -C Documentation install-info
> + $(MAKE) -C Documentation install-info DESTDIR=$(DESTDIR)
>
> install-pdf:
> - $(MAKE) -C Documentation install-pdf
> + $(MAKE) -C Documentation install-pdf DESTDIR=$(DESTDIR)
>
> quick-install-doc:
> - $(MAKE) -C Documentation quick-install
> + $(MAKE) -C Documentation quick-install DESTDIR=$(DESTDIR)
>
> quick-install-man:
> - $(MAKE) -C Documentation quick-install-man
> + $(MAKE) -C Documentation quick-install-man DESTDIR=$(DESTDIR)
>
> quick-install-html:
> - $(MAKE) -C Documentation quick-install-html
> + $(MAKE) -C Documentation quick-install-html DESTDIR=$(DESTDIR)
>
> -- Brief whoami:
> NonStop developer since approximately 211288444200000000
> UNIX developer since approximately 421664400
> -- In my real life, I talk too much.
^ permalink raw reply
* Re: Bug report: git cat-file -e / rev-list disagree with git fsck on empty tree
From: Junio C Hamano @ 2020-09-02 19:52 UTC (permalink / raw)
To: Anish R Athalye; +Cc: git@vger.kernel.org
In-Reply-To: <1006A7F3-8C48-46E3-8F7C-3F82181E3619@mit.edu>
Anish R Athalye <aathalye@mit.edu> writes:
> This is related to the change made in f06ab027efd2 (rev-list: allow cached
> objects in existence check).
>
> That patch seemed designed to allow the workflow where the empty tree is
> missing from the object store, so
> `git cat-file -e 4b825dc642cb6eb9a060e54bf8d69288fbee4904` and
> `git rev-list --objects 4b825dc642cb6eb9a060e54bf8d69288fbee4904`
> both return success even when the object is not physically present.
That sounds buggy. I know git knows about both empty tree and empty
blob, but replacing the empty tree object name with the empty blob
object name in the above in a freshly-created empty repository gives
me errors from both of them, which is what I'd expect.
> However, in the same situation:
>
> $ git fsck
> [...]
> missing tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
... and if some other tree references to the empty tree (which is
unusual---I do not think we record such a tree, but some third-party
tools might), it is understandable fsck would complain.
> I'm not sure if this is the intended behavior (the tree is indeed missing, so
> in some sense, this is reasonable). But it seems somewhat confusing that it
> disagrees with the interrogation commands.
^ permalink raw reply
* RE: [Suggestion] Documentation: quick-install-man not propagating DESTDIR
From: Randall S. Becker @ 2020-09-02 19:52 UTC (permalink / raw)
To: 'Junio C Hamano'; +Cc: 'git'
In-Reply-To: <xmqqft80arpy.fsf@gitster.c.googlers.com>
On September 2, 2020 3:45 PM, Junio C Hamano wrote:
> "Randall S. Becker" <rsbecker@nexbridge.com> writes:
>
> > The make quick-install-man rule is not propagating DESTDIR when GNU
> > Make
> > 4.2.1 is used.
>
> I wonder, instead of having to change all "$(MAKE) -C elsewhere", we can
> add DESTDIR to the list of variables that are exported.
>
> ... goes and looks ...
>
> Hmph, DESTDIR is exported together with DIFF, TAR, INSTALL and
> SHELL_PATH. We do rely on SHELL_PATH to be exported correctly to
> t/Makefile for "make test" to work, so it is puzzling.
>
> It is doubly puzzling that we use $(INSTALL) in Documentation/Makefile on
> the same line as $(DESTDIR) is used, and apparently you are not reporting
> problem on that one.
>
> > It seems like a bit of a nit to report this, but I discovered that the
> > installation is not putting the manuals in the same place as git. It’s
> > a pretty simple fix. I can put a patch together if desired.
>
> I do not think we want that patch. Instead I think we'd want a patch that
> uses the same trick as what makes INSTALL work.
I'll look into that approach. Thanks for the direction.
Regards,
Randall
^ permalink raw reply
* Re: [PATCH] fetch: no FETCH_HEAD display if --no-write-fetch-head
From: Junio C Hamano @ 2020-09-02 20:07 UTC (permalink / raw)
To: Jonathan Tan; +Cc: git, derrickstolee
In-Reply-To: <20200902190232.317477-1-jonathantanmy@google.com>
Jonathan Tan <jonathantanmy@google.com> writes:
> 887952b8c6 ("fetch: optionally allow disabling FETCH_HEAD update",
> 2020-08-18) introduced the ability to disable writing to FETCH_HEAD
> during fetch, but did not suppress the "<source> -> FETCH_HEAD" message
> when this ability is used. This message is misleading in this case,
> because FETCH_HEAD is not written.
Thanks for noticing, but I wonder if we should keep this for users
of dry-run, which tends to give more output to what _would_ have
been done if it were not dry-run?
> This might be important for Stolee's maintenance prefetch patch [1] too
> - presumably we don't want to show FETCH_HEAD there, as it would be
> misleading and would clutter in the same way (albeit to a lesser
> extent).
Yes, that makes sense.
> Also, because "fetch" is used to
> lazy-fetch missing objects in a partial clone, this significantly
> clutters up the output in that case since the objects to be fetched are
> potentially numerous. Therefore, suppress this message when
> --no-write-fetch-head is passed.
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
> This is on origin/jt/lazy-fetch.
I think this patch, with possible correction for "we are not writing
FETCH_HEAD but we do want to show what would have been done when the
user asked --dry-run", should be done as a brown-paper-bag bugfix
directly on jc/no-update-fetch-head topic and merged quicly down to
'master', instead of taken hostage of some other topic in flight,
but it would soon already be a week since jt/lazy-fetch was merged
to 'next', so it probably is OK to apply on jt/lazy-fetch, and it
indeed is easier to manage that way.
> [1] https://lore.kernel.org/git/da64c51a8182ec13aeed8f0157079fb29a09ee85.1598380599.git.gitgitgadget@gmail.com/
> ---
> builtin/fetch.c | 3 ++-
> t/t0410-partial-clone.sh | 7 +++++--
> 2 files changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 320ba9471d..442df05f5a 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -1023,11 +1023,12 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
> rc |= update_local_ref(ref, what, rm, ¬e,
> summary_width);
> free(ref);
> - } else
> + } else if (write_fetch_head) {
> format_display(¬e, '*',
> *kind ? kind : "branch", NULL,
> *what ? what : "HEAD",
> "FETCH_HEAD", summary_width);
> + }
> if (note.len) {
> if (verbosity >= 0 && !shown_url) {
> fprintf(stderr, _("From %.*s\n"),
> diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
> index d681e90640..584a039b85 100755
> --- a/t/t0410-partial-clone.sh
> +++ b/t/t0410-partial-clone.sh
> @@ -183,7 +183,7 @@ test_expect_success 'missing CLI object, but promised, passes fsck' '
> '
>
> test_expect_success 'fetching of missing objects' '
> - rm -rf repo &&
> + rm -rf repo err &&
> test_create_repo server &&
> test_commit -C server foo &&
> git -C server repack -a -d --write-bitmap-index &&
> @@ -194,7 +194,10 @@ test_expect_success 'fetching of missing objects' '
>
> git -C repo config core.repositoryformatversion 1 &&
> git -C repo config extensions.partialclone "origin" &&
> - git -C repo cat-file -p "$HASH" &&
> + git -C repo cat-file -p "$HASH" 2>err &&
> +
> + # Ensure that no spurious FETCH_HEAD messages are written
> + ! grep FETCH_HEAD err &&
Test also --dry-run, but that perhaps needs to be done outside the
context of partial-clone. The above "lazy fetching should be silent
and should not bother users with mention of FETCH_HEAD" is good test
in the context of partial-clone, though.
jc/no-update-fetch-head added its own test to t/t5510, and both the
"output lacks FETCH_HEAD when --no-write-fetch-head is given" test
and the "output still mentions FETCH_HEAD with --dry-run" test
belong there.
Thanks.
^ permalink raw reply
* Re: [PATCH v5 6/8] config: correctly read worktree configs in submodules
From: Jonathan Nieder @ 2020-09-02 20:15 UTC (permalink / raw)
To: Matheus Tavares; +Cc: git, gitster, stolee, newren, jonathantanmy
In-Reply-To: <3e02e1bd248438e0b435a19d857432edcaa15a2c.1599026986.git.matheus.bernardino@usp.br>
Matheus Tavares wrote:
> The config machinery is not able to read worktree configs from a
> submodule in a process where the_repository represents the superproject.
... where the_repository represents the superproject and
extensions.worktreeConfig is not set there, right?
> Furthermore, when extensions.worktreeConfig is set on the superproject,
> querying for a worktree config in a submodule will, instead, return
> the value set at the superproject.
>
> The problem resides in do_git_config_sequence(). Although the function
> receives a git_dir string, it uses the_repository->git_dir when making
This part of the commit message seems to be rephrasing what the patch
says; for that kind of thing, it seems better to let the patch speak
for itself. Can we describe what is happening at a higher level (in
other words the intent instead of the details of how that is
manifested in code)? For example,
The relevant code is in do_git_config_sequence. Although it is designed
to act on an arbitrary repository, specified by the passed-in git_dir
string, it accidentally depends on the_repository in two places:
- it reads the global variable `repository_format_worktree_config`,
which is set based on the content of the_repository, to determine
whether extensions.worktreeConfig is set
- it uses the git_pathdup helper to find the config.worktree file,
instead of making a path using the passed-in git_dir falue
Sever these dependencies.
[...]
> --- a/config.c
> +++ b/config.c
> @@ -1747,11 +1747,22 @@ static int do_git_config_sequence(const struct config_options *opts,
> ret += git_config_from_file(fn, repo_config, data);
>
> current_parsing_scope = CONFIG_SCOPE_WORKTREE;
> - if (!opts->ignore_worktree && repository_format_worktree_config) {
> + if (!opts->ignore_worktree && repo_config && opts->git_dir) {
repo_config is non-NULL if and only if commondir is non-NULL and
commondir is always non-NUlL if git_dir is non-NULL (as checked higher
in the function), right? I think that means this condition could be
written more simply as
if (!opts->ignore_worktree && opts->git_dir) {
which I think should be easier for the reader to understand.
> + struct repository_format repo_fmt = REPOSITORY_FORMAT_INIT;
> + struct strbuf buf = STRBUF_INIT;
> +
> + read_repository_format(&repo_fmt, repo_config);
> +
> + if (!verify_repository_format(&repo_fmt, &buf) &&
> + repo_fmt.worktree_config) {
In the common case where we are acting on the_repository, this add
extra complexity and slows the routine down.
Would passing in the 'struct repository *' to allow distinguishing
that case help? Something like this:
diff --git i/builtin/config.c w/builtin/config.c
index 5e39f618854..ca4caedf33a 100644
--- i/builtin/config.c
+++ w/builtin/config.c
@@ -699,10 +699,8 @@ int cmd_config(int argc, const char **argv, const char *prefix)
config_options.respect_includes = !given_config_source.file;
else
config_options.respect_includes = respect_includes_opt;
- if (!nongit) {
- config_options.commondir = get_git_common_dir();
- config_options.git_dir = get_git_dir();
- }
+ if (!nongit)
+ config_options.repo = the_repository;
if (end_nul) {
term = '\0';
diff --git i/config.c w/config.c
index 2bdff4457be..70a1dd0ad3f 100644
--- i/config.c
+++ w/config.c
@@ -222,8 +222,8 @@ static int include_by_gitdir(const struct config_options *opts,
const char *git_dir;
int already_tried_absolute = 0;
- if (opts->git_dir)
- git_dir = opts->git_dir;
+ if (opts->repo && opts->repo->gitdir)
+ git_dir = opts->repo->gitdir;
else
goto done;
@@ -1720,10 +1720,10 @@ static int do_git_config_sequence(const struct config_options *opts,
char *repo_config;
enum config_scope prev_parsing_scope = current_parsing_scope;
- if (opts->commondir)
- repo_config = mkpathdup("%s/config", opts->commondir);
- else if (opts->git_dir)
- BUG("git_dir without commondir");
+ if (opts->repo && opts->repo->commondir)
+ repo_config = mkpathdup("%s/config", opts->repo->commondir);
+ else if (opts->repo && opts->repo->gitdir)
+ BUG("gitdir without commondir");
else
repo_config = NULL;
@@ -1824,27 +1824,33 @@ void read_early_config(config_fn_t cb, void *data)
struct config_options opts = {0};
struct strbuf commondir = STRBUF_INIT;
struct strbuf gitdir = STRBUF_INIT;
+ struct repository the_early_repo = {0};
opts.respect_includes = 1;
if (have_git_dir()) {
- opts.commondir = get_git_common_dir();
- opts.git_dir = get_git_dir();
+ opts.repo = the_repository;
/*
* When setup_git_directory() was not yet asked to discover the
* GIT_DIR, we ask discover_git_directory() to figure out whether there
* is any repository config we should use (but unlike
- * setup_git_directory_gently(), no global state is changed, most
+ * setup_git_directory_gently(), no global state is changed; most
* notably, the current working directory is still the same after the
* call).
+ *
+ * NEEDSWORK: There is some duplicate work between
+ * discover_git_directory and repo_init. Update to use a variant of
+ * repo_init that does its own repository discovery once available.
*/
} else if (!discover_git_directory(&commondir, &gitdir)) {
- opts.commondir = commondir.buf;
- opts.git_dir = gitdir.buf;
+ repo_init(&the_early_repo, gitdir.buf, NULL);
+ opts.repo = &the_early_repo;
}
config_with_options(cb, data, NULL, &opts);
+ if (the_early_repo.settings.initialized)
+ repo_clear(&the_early_repo);
strbuf_release(&commondir);
strbuf_release(&gitdir);
}
@@ -2097,8 +2103,7 @@ static void repo_read_config(struct repository *repo)
struct config_options opts = { 0 };
opts.respect_includes = 1;
- opts.commondir = repo->commondir;
- opts.git_dir = repo->gitdir;
+ opts.repo = repo;
if (!repo->config)
repo->config = xcalloc(1, sizeof(struct config_set));
diff --git i/config.h w/config.h
index 91cdfbfb414..e56293fb29f 100644
--- i/config.h
+++ w/config.h
@@ -21,6 +21,7 @@
*/
struct object_id;
+struct repository;
/* git_config_parse_key() returns these negated: */
#define CONFIG_INVALID_KEY 1
@@ -87,8 +88,7 @@ struct config_options {
unsigned int ignore_worktree : 1;
unsigned int ignore_cmdline : 1;
unsigned int system_gently : 1;
- const char *commondir;
- const char *git_dir;
+ struct repository *repo;
config_parser_event_fn_t event_fn;
void *event_fn_data;
enum config_error_action {
==== >8 ====
[...]
> --- a/t/helper/test-config.c
> +++ b/t/helper/test-config.c
[...]
> @@ -72,14 +80,34 @@ static int early_config_cb(const char *var, const char *value, void *vdata)
> #define TC_VALUE_NOT_FOUND 1
> #define TC_CONFIG_FILE_ERROR 2
>
> +static const char *test_config_usage[] = {
> + "test-tool config [--submodule=<path>] <cmd> [<args>]",
> + NULL
> +};
> +
> int cmd__config(int argc, const char **argv)
> {
> int i, val, ret = 0;
> const char *v;
> const struct string_list *strptr;
> struct config_set cs;
> + struct repository subrepo, *repo = the_repository;
> + const char *subrepo_path = NULL;
> +
> + struct option options[] = {
> + OPT_STRING(0, "submodule", &subrepo_path, "path",
> + "run <cmd> on the submodule at <path>"),
> + OPT_END()
> + };
Nice.
> +
> + argc = parse_options(argc, argv, NULL, options, test_config_usage,
> + PARSE_OPT_KEEP_ARGV0 | PARSE_OPT_STOP_AT_NON_OPTION);
> + if (argc < 2)
> + die("Please, provide a command name on the command-line");
optional nit: can use usage_with_options here. It produces a better
error message than any other I can think of (all I can think of are
things like "need a <cmd>").
This is from existing code, but the use of parse_options opens up the
possibility of taking advantage of the parse-options generated message. :)
[...]
> --- a/t/t2404-worktree-config.sh
> +++ b/t/t2404-worktree-config.sh
> @@ -78,4 +78,20 @@ test_expect_success 'config.worktree no longer read without extension' '
> test_cmp_config -c wt2 shared this.is
> '
>
> +test_expect_success 'correctly read config.worktree from submodules' '
> + test_unconfig extensions.worktreeconfig &&
> + git init sub &&
> + (
> + cd sub &&
> + test_commit a &&
> + git config extensions.worktreeconfig true &&
> + git config --worktree wtconfig.sub test-value
> + ) &&
> + git submodule add ./sub &&
> + git commit -m "add sub" &&
> + echo test-value >expect &&
> + test-tool config --submodule=sub get_value wtconfig.sub >actual &&
> + test_cmp expect actual
> +'
Lovely.
Summary: I like the direction this change goes in.
I think we can do it without repeating repository format discovery in
the the_repository case and without duplicating repository format
discovery code in the submodule case. If it proves too fussy, then a
NEEDSWORK comment would be helpful to help the reader see what is
going on.
Thanks and hope that helps,
Jonathan
^ permalink raw reply related
* [PATCH] vcbuild: fix library name for expat with make MSVC=1
From: Orgad Shaneh via GitGitGadget @ 2020-09-02 20:16 UTC (permalink / raw)
To: git; +Cc: Orgad Shaneh, Orgad Shaneh
From: Orgad Shaneh <orgads@gmail.com>
Signed-off-by: Orgad Shaneh <orgads@gmail.com>
---
vcbuild: fix library name for expat with make MSVC=1
Signed-off-by: Orgad Shaneh orgads@gmail.com [orgads@gmail.com]
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-722%2Forgads%2Fvcbuild-expat-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-722/orgads/vcbuild-expat-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/722
compat/vcbuild/scripts/clink.pl | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/compat/vcbuild/scripts/clink.pl b/compat/vcbuild/scripts/clink.pl
index 61ad084a7b..df167d1e1a 100755
--- a/compat/vcbuild/scripts/clink.pl
+++ b/compat/vcbuild/scripts/clink.pl
@@ -66,7 +66,7 @@
}
push(@args, $lib);
} elsif ("$arg" eq "-lexpat") {
- push(@args, "expat.lib");
+ push(@args, "libexpat.lib");
} elsif ("$arg" =~ /^-L/ && "$arg" ne "-LTCG") {
$arg =~ s/^-L/-LIBPATH:/;
push(@lflags, $arg);
base-commit: e19713638985533ce461db072b49112da5bd2042
--
gitgitgadget
^ permalink raw reply related
* [PATCH] vcbuild: fix batch file name in README
From: Orgad Shaneh via GitGitGadget @ 2020-09-02 20:18 UTC (permalink / raw)
To: git; +Cc: Orgad Shaneh, Orgad Shaneh
From: Orgad Shaneh <orgads@gmail.com>
Signed-off-by: Orgad Shaneh <orgads@gmail.com>
---
vcbuild: Fix batch file name in README
Signed-off-by: Orgad Shaneh orgads@gmail.com [orgads@gmail.com]
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-721%2Forgads%2Fvcbuild-readme-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-721/orgads/vcbuild-readme-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/721
compat/vcbuild/README | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/compat/vcbuild/README b/compat/vcbuild/README
index 42292e7c09..51fb083dbb 100644
--- a/compat/vcbuild/README
+++ b/compat/vcbuild/README
@@ -26,8 +26,8 @@ The Steps to Build Git with VS2015 or VS2017 from the command line.
Use ONE of the following forms which should match how you want to
compile git.exe.
- $ ./compat/vcbuild/vcpkg_copy_packages.bat debug
- $ ./compat/vcbuild/vcpkg_copy_packages.bat release
+ $ ./compat/vcbuild/vcpkg_copy_dlls.bat debug
+ $ ./compat/vcbuild/vcpkg_copy_dlls.bat release
3. Build git using MSVC from an SDK bash window using one of the
following commands:
base-commit: e19713638985533ce461db072b49112da5bd2042
--
gitgitgadget
^ permalink raw reply related
* Re: [PATCH v3 12/14] commit-graph: add large-filters bitmap chunk
From: Taylor Blau @ 2020-09-02 20:23 UTC (permalink / raw)
To: SZEDER Gábor; +Cc: Taylor Blau, git, peff, dstolee, gitster
In-Reply-To: <20200819133526.GE29528@szeder.dev>
(Finally getting back to this review after working on some other topics
for a while..., sorry for the late response).
On Wed, Aug 19, 2020 at 03:35:26PM +0200, SZEDER Gábor wrote:
> On Tue, Aug 11, 2020 at 04:52:07PM -0400, Taylor Blau wrote:
> > When a commit has more than a certain number of changed paths (commonly
> > 512), the commit-graph machinery represents it as a zero-length filter.
> > This is done since having many entries in the Bloom filter has
> > undesirable effects on the false positivity rate.
>
> This is not the case, the false positive probability depends on the
> ratio of the Bloom filter's size and the number of elements it
> contains, and we size the filters proportional to the number of
> elements they contain, so the number of elements shouldn't affect the
> false positive rate.
I'm not sure that I understand. I agree that the FPR depends on the
ratio between the number of elements in the filter and the filter's
"size". But, consider a Bloom filter that is too small to faithfully
represent all its elements. Such a filter would likely have all its bits
set high, in which case every query would return "maybe", and the FPR
would go up.
> On the contrary, it's the small filters, up to around 30-35 bytes,
> that tend to have larger than expected false positive rate when using
> double hashing.
I agree that small filters suffer from the same, but I think this is an
"in addition" not an "on the contrary".
In either case, I don't think that this is an important detail for the
commit message. What matters is the representation (that we truncate >=
512 elements to a length-zero filter), not why (that can be found in
another commit). I'd have expected to find the rationale in ed591febb4
(bloom.c: core Bloom filter implementation for changed paths.,
2020-03-30), but I couldn't find anything there.
So, I'll drop this sentence entirely to avoid an unimportant detail.
Thanks,
Taylor
^ permalink raw reply
* Re: [PATCH] fetch: do not look for submodule changes in unchanged refs
From: Junio C Hamano @ 2020-09-02 20:26 UTC (permalink / raw)
To: Orgad Shaneh via GitGitGadget; +Cc: git, Orgad Shaneh
In-Reply-To: <pull.720.git.1599056635276.gitgitgadget@gmail.com>
"Orgad Shaneh via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Orgad Shaneh <orgads@gmail.com>
>
> This operation is very expensive, as it scans all the refs using
> setup_revisions, which resolves each ref, including checking if it
> is ambiguous, or if it is a file name etc.
Nobody can tell what "This operation" is without looking at the
patch/diff text. Our commit message typically gives minimum
explanation of the situation and the problem it tries to solve first
to make it self sufficient. And then we go on to order the code
base to be in a better shape. Something along the lines of ...
When fetching recursively with submodules, for each ref in the
superproject, we call check_for_new_submodule_commits() to
figure out X and Y for the object the ref was pointing at before
the fetch in the superproject, in order to ensure Z. This is
expensive because of A, B and C, but it unnecessary if the fetch
in the superproject did not update the ref (i.e. the objects
that are required to exist in the submodule did not change).
Check if we are making any change to the ref, and skip the check
if we aren't.
... but I didn't fill the most important bits in the above, as by
now you, as the person who encountered the issue and figured out a
good way to solve it, would know what to fill the placeholders with
far better than I would ;-)
> There is no reason to do all that for refs that haven't changed in this
> fetch.
>
> Reported here:
> https://public-inbox.org/git/CAGHpTBKSUJzFSWc=uznSu2zB33qCSmKXM-iAjxRCpqNK5bnhRg@mail.gmail.com/
>
> Amends commit be76c2128234d94b47f7087152ee55d08bb65d88.
I am not sure what this reference is trying to achieve. Fixing a
bug in be76c212 (fetch: ensure submodule objects fetched,
2018-12-06)? If so, please say so more directly, perhaps like
be76c212 (fetch: ensure submodule objects fetched, 2018-12-06)
tried to do what we are trying to do here, but it botched the
exectuion by forgetting the fact that ...
or somesuch. The cited commit says
The submodule checks were done only when a ref in the
superproject changed,...
so it is not clear what we are really fixing with this patch,
though. Is the assertion "checks were done only when changed"
it made incorrect and instead we were doing unnecessary check
always?
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 0f23dd4b8c..d3f922fc89 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -958,8 +958,10 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
> ref->force = rm->peer_ref->force;
> }
>
> - if (recurse_submodules != RECURSE_SUBMODULES_OFF)
> + if (recurse_submodules != RECURSE_SUBMODULES_OFF &&
> + (!rm->peer_ref || !oideq(&ref->old_oid, &ref->new_oid))) {
> check_for_new_submodule_commits(&rm->old_oid);
> + }
The original before be76c212 fed ref->new_oid to the check
function. Now that we are using ref->{old,new}_oid in the
condition, would it make more sense to pass ref->new_oid
like we did before the commit, or is that an object that is
different from rm->old_oid?
Thanks.
> if (!strcmp(rm->name, "HEAD")) {
> kind = "";
>
> base-commit: e19713638985533ce461db072b49112da5bd2042
^ permalink raw reply
* Re: [PATCH v3 12/14] commit-graph: add large-filters bitmap chunk
From: Taylor Blau @ 2020-09-02 20:40 UTC (permalink / raw)
To: SZEDER Gábor; +Cc: Taylor Blau, git, peff, dstolee, gitster
In-Reply-To: <20200901143546.GA5678@szeder.dev>
On Tue, Sep 01, 2020 at 04:35:46PM +0200, SZEDER Gábor wrote:
> On Tue, Aug 11, 2020 at 04:52:07PM -0400, Taylor Blau wrote:
> > This is already undesirable, since it means that we are wasting
> > considerable effort to discover that a commit with too many changed
> > paths, only to throw that effort away (and then repeat the process the
> > next time a roll-up is performed).
>
> Is this really the case?
>
> If a commit has a corresponding entry in the Bloom filters chunks,
> then the commit-graph machinery does know the Bloom filter associated
> with that commit. The size of that filter should not matter, i.e.
> even if it is a zero-length filter, the commit-graph machinery should
> know about it all the same. And as far as I can tell it does indeed,
> because load_bloom_filter_from_graph() sets a non-NULL 'filter->data'
> pointer even if 'filter->len' is zero, which get_bloom_filter() treats
> as "we know about this", and returns early without (re)computing the
> filter. Even the test 'Bloom generation does not recompute too-large
> filters' added in this patch is expected to succeed, and my
> superficial and makeshift testing seems to corroborate this; at least
> I couldn't find a combination of commands and options that would
> recompute any existing zero-length Bloom filters.
>
> Am I missing something?
I'm not sure that this would work, or at least that it creates another
pair of cases that we have to disambiguate. Here's what I'm thinking
after reading what you wrote:
- By this point in the series, we expect every commit in a
commit-graph layer to have a corresponding entry in the 'BIDX' chunk
(if one exists, obviously all of this is meaningless without
specifying '--changed-paths').
- In a future commit (specifically, "builtin/commit-graph.c: introduce
'--max-new-filters=<n>'"), this will no longer be the case.
- So, by that point in the series, we would have two possible reasons
why a commit did not show up in the BIDX. Either:
* The commit has too many changed paths, in which case we don't
want to write it down, or
* We had already computed too many new filters, and so didn't have
the budget to compute the filter for some commit(s).
In the first of those two cases, we want to avoid recomputing the
too-large filter. But, in the latter case, we *do* want to compute the
filter from scratch, since we want to "backfill" commits with missing
filters by letting them trickle in over time.
I might be missing something, too, but I think that those two cases are
just as indistinguishable as what's in the commit-graph today (without
the 'BFXL' chunk).
> > To address this, add a new chunk which encodes a bitmap where the ith
> > bit is on iff the ith commit has zero or at least 512 changed paths.
> > Likewise store the maximum number of changed paths we are willing to
> > store in order to prepare for eventually making this value more easily
> > customizable.
>
> I don't understand how storing this value would make it any easier to
> customize.
It doesn't, but it's also not meant to. The idea is that if the value
ever were easy to customize, then we could keep track of which layers
were written with which threshold. This would allow you to invent fancy
rules like dropping filters when the threshold lowers, or recomputing
ones that you wouldn't have otherwise when it goes up.
> > Another approach would be to introduce a new BIDX chunk (say, one
> > identified by 'BID2') which is identical to the existing BIDX chunk,
> > except the most-significant bit of each offset is interpreted as "this
> > filter is too big" iff looking at a BID2 chunk. This avoids having to
> > write a bitmap, but forces older clients to rewrite their commit-graphs
> > (as well as reduces the theoretical largest Bloom filters we couldl
>
> And it reduces the max possible size of the BDAT chunk, and thus the
> max number of commits with Bloom filters as well.
Fair point.
> s/couldl/could/
Oops, thanks.
> > write, and forces us to maintain the code necessary to translate BIDX
> > chunks to BID2 ones). Separately from this patch, I implemented this
> > alternate approach and did not find it to be advantageous.
>
> Let's take a step back to reconsider what should be stored in this
> bitmap for a moment. Sure, setting a bit for each commit that doesn't
> modify any paths or modifies too many makes it possible to repliably
> identify commits that don't have Bloom filters yet. But isn't it a
> bit roundabout way? I think it would be better to directly track
> which commits don't have Bloom filters yet. IOW what you really want
> is a, say, BNCY "Bloom filter Not Computed Yet" chunk, where we set
> the corresponding bit for each commit which has an entry in the BIDX
> chunk but for which a Bloom filter hasn't been computed yet.
>
> - It's simpler and easier to explain (IMO).
>
> - This bitmap chunk can easily be made optional: if all Bloom
> filters have been computed, then the bitmap will contain all
> zeros. So why bother writing it, when we can save a bit of space
> instead?
I don't think that this is true. Omitting the chunk (because you have
computed--or tried to compute--filters for every commit in the graph)
isn't distinguishable from what exists today, so we are back at square
one there.
> - It avoids the unpleasentness of setting a bit in the _Large_ Bloom
> Filters chunks for commits _not_ modifying any paths.
I agree it's unpleasant, but I also don't think it's a show-stopper.
> - Less incentive to spill implementation details to the format
> specification (e.g. 512 modified paths).
>
> Now, let's take another step back: is such a bitmap really necessary?
> We could write a single-byte Bloom filter with no bits set for commits
> not modifying any paths, and a single-byte Bloom filter with all bits
> set for commits modifying too many paths. This is compatible with the
> specs and any existing implementation should do the right thing when
> reading such filters, this would allow us to interpret zero-length
> filters as "not computed yet", and if that bitmap chunk won't be
> optional, then this would save space as long as less than 1/8 of
> commits modify no or too many paths. Unfortunately, however, all
> existing zero-length Bloom filters have to be recomputed.
I think this is a semantic difference: either you store a bitmap, or a
Bloom filter containing the same data. To me, I don't think there's a
huge difference, since we're talking about 1 bit per commit. If we were
really worried, we could store them as EWAH-compressed bitmaps, but I
don't get a sense that such a concern exists.
I do feel strongly about using a non-probabilistic data structure,
though, since the point of this feature is to be able to make tighter
guarentees about the runtime of 'git commit-graph write'.
>
> > Signed-off-by: Taylor Blau <me@ttaylorr.com>
> > ---
> > .../technical/commit-graph-format.txt | 12 +++
> > bloom.h | 2 +-
> > commit-graph.c | 96 ++++++++++++++++---
> > commit-graph.h | 4 +
> > t/t4216-log-bloom.sh | 25 ++++-
> > 5 files changed, 124 insertions(+), 15 deletions(-)
> >
> > diff --git a/Documentation/technical/commit-graph-format.txt b/Documentation/technical/commit-graph-format.txt
> > index 440541045d..5f2d9ab4d7 100644
> > --- a/Documentation/technical/commit-graph-format.txt
> > +++ b/Documentation/technical/commit-graph-format.txt
> > @@ -123,6 +123,18 @@ CHUNK DATA:
> > of length zero.
> > * The BDAT chunk is present if and only if BIDX is present.
> >
> > + Large Bloom Filters (ID: {'B', 'F', 'X', 'L'}) [Optional]
> > + * It starts with a 32-bit unsigned integer specifying the maximum number of
> > + changed-paths that can be stored in a single Bloom filter.
>
> Should this value be the same in all elements of a commit-graph chain?
> Note that in this case having different values won't affect revision
> walks using modified path Bloom filters.
I don't think it needs to be the same necessarily.
> > + * It then contains a list of 64-bit words (the length of this list is
> > + determined by the width of the chunk) which is a bitmap. The 'i'th bit is
> > + set exactly when the 'i'th commit in the graph has a changed-path Bloom
> > + filter with zero entries (either because the commit is empty, or because
> > + it contains more than 512 changed paths).
>
> Please make clear the byte order of these 64 bit words in the specs as
> well.
>
> Furthermore, that 512 path limit is an implementation detail, so it
> would be better if it didn't leak into the specification of this new
> chunk.
Addressed both, thanks.
Thanks,
Taylor
^ permalink raw reply
* Re: [PATCH v2 3/3] ci: stop linking built-ins to the dashed versions
From: Junio C Hamano @ 2020-09-02 20:50 UTC (permalink / raw)
To: Johannes Schindelin
Cc: SZEDER Gábor, Johannes Schindelin via GitGitGadget, git
In-Reply-To: <nycvar.QRO.7.76.6.2009020902080.56@tvgsbejvaqbjf.bet>
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> What this patch does is simply to complete the mission of e4597aae6590
> (run test suite without dashed git-commands in PATH, 2009-12-02): to make
> sure that our very own scripts do not use dashed invocations of built-in
> commands.
OK, then I totally misread the proposed log message, or the proposed
log message didn't adequately describe what it was trying to solve.
With e4597aae (run test suite without dashed git-commands in PATH,
2009-12-02), we should not need git-foo in bin-wrappers/ for most of
git subcommands, as long as our tests and scripted porcelains the
test invokes never use the dashed form. And with this step, we come
closer to that goal?
"git checkout next && make test" does not seem to populate anything
extra in bin-wrappers but bare minimum that needs due to the protocol
requirements, though, without this patch, so the above may not be
the whole story.
Apparently, the patch does not achieve this goal.
For that reason, we just introduced a Makefile knob to skip linking
them. TO make sure that this keeps working, teach the CI
(and PR) builds to skip generating those hard-links.
because what matters to tests, when run without with-dash, which is
the sensible modern default, is what is in bin-wrappers/ and we do
not populate it with builtin git-add... at all even before this
step. In other words, this change has no way to make sure "skip
linking them" will keep working more than what we already have.
Rather,
Since e4597aae6590, we stopped running our tests with "git-foo"
binaries found at the top-level of a freshly built source tree;
instead we have placed only "git" and selected "git-foo"
commands that must be on $PATH in bin-wrappers/ and they were
what used by the tests. This is to catch the tests and scripted
Porcelains that are tested will get caught if they try to use
the dashed form.
Since CI jobs will not install the built Git to anywhere, and
the links we make at the top-level of the source tree for
"git-add" and friends are not even used during tests, they are
pure waste of resources these days.
Thanks to the newly invented SKIP_DASHED_BUILT_INS knob, we can
now skip creating these links in the source tree. Do so.
or something along the line, perhaps.
Thanks.
^ permalink raw reply
* Re: [PATCH v3 13/14] commit-graph: rename 'split_commit_graph_opts'
From: Taylor Blau @ 2020-09-02 21:02 UTC (permalink / raw)
To: SZEDER Gábor; +Cc: Taylor Blau, git, peff, dstolee, gitster
In-Reply-To: <20200819095656.GD29528@szeder.dev>
On Wed, Aug 19, 2020 at 11:56:56AM +0200, SZEDER Gábor wrote:
> On Tue, Aug 11, 2020 at 04:52:11PM -0400, Taylor Blau wrote:
> > In the subsequent commit, additional options will be added to the
> > commit-graph API which have nothing to do with splitting.
> >
> > Rename the 'split_commit_graph_opts' structure to the more-generic
> > 'commit_graph_opts' to encompass both.
>
> Good. Note, however, that write_commit_graph() has a 'flags'
> parameter as well, and when a feature is enabled via this 'flags',
> then first a corresponding 'ctx->foo' field is set, and that
> 'ctx->foo' is checked while computing and writing the commit-graph.
> With the generic options struct some other feature will be enabled via
> the 'opts->bar' field, so simply 'ctx->opts->bar' is checked while
> writing the commit-graph.
>
> With the generic options struct there really is no need for a separate
> flags parameter, the values in the flags can be stored in the options
> struct, and we can eliminate this inconsistency instead of adding even
> more.
I like the direction that you're headed in, but I'm not entirely sure
what you're suggesting. Do you want to make the
'enum commit_graph_write_flags flags' part of the new options struct?
Break out the fields into individual bits on that struct?
I'm not opposed to either, but note that there is also already a 'flags'
field on the options structure related to splitting, so that would have
to be untangled, too.
What I'm trying to say is that I think there's more complexity here than
you're giving it credit for. I'd rather press on with what we have here,
and devote adequate time to unraveling the complexity appropriately than
try to shove in another patch that takes a half-step in the right
direction.
>
> > diff --git a/commit-graph.h b/commit-graph.h
> > index ddbca1b59d..af08c4505d 100644
> > --- a/commit-graph.h
> > +++ b/commit-graph.h
> > @@ -109,7 +109,7 @@ enum commit_graph_split_flags {
> > COMMIT_GRAPH_SPLIT_REPLACE = 2
> > };
> >
> > -struct split_commit_graph_opts {
> > +struct commit_graph_opts {
> > int size_multiple;
> > int max_commits;
> > timestamp_t expire_time;
> enum commit_graph_split_flags flags;
>
> While this was 'struct split_commit_graph_opts *split_opts' it was
> clear what kind of flags were in this 'flags' field. Now that the
> struct is generic it's not clear anymore, so perhaps it should be
> renamed as well (e.g. 'split_flags'), or even turned into a couple of
> bit fields.
This I can definitely vouch for, so I'll 's/flags/split_flags' in the
next revision.
Thanks,
Taylor
^ permalink raw reply
* Re: [PATCH v3 14/14] builtin/commit-graph.c: introduce '--max-new-filters=<n>'
From: Taylor Blau @ 2020-09-02 21:03 UTC (permalink / raw)
To: SZEDER Gábor; +Cc: Taylor Blau, git, peff, dstolee, gitster
In-Reply-To: <20200817225004.GB29528@szeder.dev>
On Tue, Aug 18, 2020 at 12:50:04AM +0200, SZEDER Gábor wrote:
> On Fri, Aug 14, 2020 at 04:20:21PM -0400, Taylor Blau wrote:
> > > > @@ -1486,10 +1499,15 @@ static void compute_bloom_filters(struct write_commit_graph_context *ctx)
> > > > ctx->order_by_pack ? commit_pos_cmp : commit_gen_cmp,
> > > > &ctx->commits);
> > > >
> > > > + max_new_filters = ctx->opts->max_new_filters >= 0 ?
> > > > + ctx->opts->max_new_filters : ctx->commits.nr;
> > >
> > > git_test_write_commit_graph_or_die() invokes
> > > write_commit_graph_reachable() with opts=0x0, so 'ctx->opts' is NULL,
> > > and we get segfault. This breaks a lot of tests when run with
> > > GIT_TEST_COMMIT_GRAPH=1 GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=1.
> >
> > Great catch, thanks. Fixing this is as simple as adding 'ctx->opts &&'
> > right before we dereference 'ctx->opts', since setting this variable
> > equal to 'ctx->commits.nr' is the right thing to do in that case.
>
> That would avoid the segfault, sure, but I would rather see all
> callers of write_commit_graph{_reachable}() passing a valid opts
> instance. Just like we don't call the diff machinery with a NULL
> diff_options, or the revision walking machinery with a NULL rev_info.
I wouldn't mind that either, but this is definitely a common pattern
throughout the commit-graph machinery. So, if/when we do get away from
it, I'd rather do so uniformly than in some spots.
> > Unrelated to this comment, I am hoping to send out a final version of
> > this series sometime next week so that we can keep moving forward with
> > Bloom filter improvements.
> >
> > Have you had a chance to review the rest of the patches? I'll happily
> > wait until you have had a chance to do so before sending v5 so that we
>
> v5? This is v3, and I'm unable to a find a v4.
Sorry, I clearly had too much on my mind when I was writing this ;). I'm
hopeful that with your careful review that v4 will be the last of this
topic.
Thanks,
Taylor
^ permalink raw reply
* [PATCH v2] fetch: no FETCH_HEAD display if --no-write-fetch-head
From: Jonathan Tan @ 2020-09-02 21:05 UTC (permalink / raw)
To: git; +Cc: Jonathan Tan, gitster
In-Reply-To: <xmqq7dtcaqob.fsf@gitster.c.googlers.com>
887952b8c6 ("fetch: optionally allow disabling FETCH_HEAD update",
2020-08-18) introduced the ability to disable writing to FETCH_HEAD
during fetch, but did not suppress the "<source> -> FETCH_HEAD" message
when this ability is used. This message is misleading in this case,
because FETCH_HEAD is not written. Also, because "fetch" is used to
lazy-fetch missing objects in a partial clone, this significantly
clutters up the output in that case since the objects to be fetched are
potentially numerous.
Therefore, suppress this message when --no-write-fetch-head is passed
(but not when --dry-run is set).
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
This is on origin/jt/lazy-fetch.
Junio writes [1]:
> Test also --dry-run, but that perhaps needs to be done outside the
> context of partial-clone. The above "lazy fetching should be silent
> and should not bother users with mention of FETCH_HEAD" is good test
> in the context of partial-clone, though.
>
> jc/no-update-fetch-head added its own test to t/t5510, and both the
> "output lacks FETCH_HEAD when --no-write-fetch-head is given" test
> and the "output still mentions FETCH_HEAD with --dry-run" test
> belong there.
Ah, thanks for catching that. Here's an updated version.
[1] https://lore.kernel.org/git/xmqq7dtcaqob.fsf@gitster.c.googlers.com/
---
builtin/fetch.c | 8 +++++++-
t/t0410-partial-clone.sh | 7 +++++--
t/t5510-fetch.sh | 18 ++++++++++--------
3 files changed, 22 insertions(+), 11 deletions(-)
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 320ba9471d..c6c4689250 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1023,11 +1023,17 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
rc |= update_local_ref(ref, what, rm, ¬e,
summary_width);
free(ref);
- } else
+ } else if (write_fetch_head || dry_run) {
+ /*
+ * Display fetches written to FETCH_HEAD (or
+ * would be written to FETCH_HEAD, if --dry-run
+ * is set).
+ */
format_display(¬e, '*',
*kind ? kind : "branch", NULL,
*what ? what : "HEAD",
"FETCH_HEAD", summary_width);
+ }
if (note.len) {
if (verbosity >= 0 && !shown_url) {
fprintf(stderr, _("From %.*s\n"),
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index d681e90640..584a039b85 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -183,7 +183,7 @@ test_expect_success 'missing CLI object, but promised, passes fsck' '
'
test_expect_success 'fetching of missing objects' '
- rm -rf repo &&
+ rm -rf repo err &&
test_create_repo server &&
test_commit -C server foo &&
git -C server repack -a -d --write-bitmap-index &&
@@ -194,7 +194,10 @@ test_expect_success 'fetching of missing objects' '
git -C repo config core.repositoryformatversion 1 &&
git -C repo config extensions.partialclone "origin" &&
- git -C repo cat-file -p "$HASH" &&
+ git -C repo cat-file -p "$HASH" 2>err &&
+
+ # Ensure that no spurious FETCH_HEAD messages are written
+ ! grep FETCH_HEAD err &&
# Ensure that the .promisor file is written, and check that its
# associated packfile contains the object
diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index 2a1abe91f0..759aec9305 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -543,16 +543,18 @@ test_expect_success 'fetch into the current branch with --update-head-ok' '
'
-test_expect_success 'fetch --dry-run does not touch FETCH_HEAD' '
- rm -f .git/FETCH_HEAD &&
- git fetch --dry-run . &&
- ! test -f .git/FETCH_HEAD
+test_expect_success 'fetch --dry-run does not touch FETCH_HEAD, but still prints what would be written' '
+ rm -f .git/FETCH_HEAD err &&
+ git fetch --dry-run . 2>err &&
+ ! test -f .git/FETCH_HEAD &&
+ grep FETCH_HEAD err
'
-test_expect_success '--no-write-fetch-head does not touch FETCH_HEAD' '
- rm -f .git/FETCH_HEAD &&
- git fetch --no-write-fetch-head . &&
- ! test -f .git/FETCH_HEAD
+test_expect_success '--no-write-fetch-head does not touch FETCH_HEAD, and does not print what would be written' '
+ rm -f .git/FETCH_HEAD err &&
+ git fetch --no-write-fetch-head . 2>err &&
+ ! test -f .git/FETCH_HEAD &&
+ ! grep FETCH_HEAD err
'
test_expect_success '--write-fetch-head gets defeated by --dry-run' '
--
2.28.0.402.g5ffc5be6b7-goog
^ permalink raw reply related
* Re: [PATCH v2] fetch: no FETCH_HEAD display if --no-write-fetch-head
From: Junio C Hamano @ 2020-09-02 21:27 UTC (permalink / raw)
To: Jonathan Tan; +Cc: git
In-Reply-To: <20200902210539.1981453-1-jonathantanmy@google.com>
Thanks; this round looks good.
^ permalink raw reply
* Re: Repo state broken due to mismatched file name casing during merge
From: brian m. carlson @ 2020-09-02 23:23 UTC (permalink / raw)
To: Rafał Grzybowski; +Cc: git
In-Reply-To: <CANG6M-ri1uvEPCcssP=Q0iM25Vhr5QZ220zUUhMdH4AVR9OLSw@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 1815 bytes --]
On 2020-09-02 at 16:26:27, Rafał Grzybowski wrote:
> What did you do before the bug happened? (Steps to reproduce your issue)
>
> mkdir repo
> cd repo
>
> git init
>
> "sample file" > file.txt
> git add file.txt
> git commit -m "Added file.txt"
>
>
> git checkout -b other_branch
> echo other file > other_file.txt
> git add other_file.txt
> git commit -m "Added other_file.txt"
>
> git checkout master
> echo Other file > Other_file.txt
> git add Other_file.txt
> git commit -m "Added Other_file.txt"
>
> git merge other_branch
> git status
>
> What did you expect to happen? (Expected behavior)
>
> A clean state, no unstaged changes.
>
> What happened instead? (Actual behavior)
>
> There is always an unstaged file other_file.txt which case changes if
> I try to discard and the unstaged change stays.
> If I try to delete the file, I get two unstaged file removal changes.
Git always preserves the case of files internally. However, the default
configuration on Windows does not. When you performed this merge, you
ended up with two files, but it's not possible to write both of them
into the working tree. Since they are not the same, when Git checks the
status of one of the files, it finds it to be modified.
The solution here is to do "git rm --cached Other_file.txt" followed by
a commit. That removes the file from the index but doesn't touch the
working tree. In Windows 10, you can also set a directory to be case
sensitive, which should fix this if you do a "git checkout .".
This isn't a bug in Git; Git doesn't know about file name encodings, and
even if it only handled Unicode file names, it's impossible to correctly
perform locale-insensitive case folding, so Git doesn't even try.
--
brian m. carlson: Houston, Texas, US
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 263 bytes --]
^ permalink raw reply
* Re: [PATCH v2] fetch: no FETCH_HEAD display if --no-write-fetch-head
From: Jonathan Nieder @ 2020-09-02 23:56 UTC (permalink / raw)
To: Jonathan Tan; +Cc: git, gitster
In-Reply-To: <20200902210539.1981453-1-jonathantanmy@google.com>
Jonathan Tan wrote:
> 887952b8c6 ("fetch: optionally allow disabling FETCH_HEAD update",
> 2020-08-18) introduced the ability to disable writing to FETCH_HEAD
> during fetch, but did not suppress the "<source> -> FETCH_HEAD" message
> when this ability is used. This message is misleading in this case,
> because FETCH_HEAD is not written. Also, because "fetch" is used to
> lazy-fetch missing objects in a partial clone, this significantly
> clutters up the output in that case since the objects to be fetched are
> potentially numerous.
>
> Therefore, suppress this message when --no-write-fetch-head is passed
> (but not when --dry-run is set).
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
[...]
> builtin/fetch.c | 8 +++++++-
> t/t0410-partial-clone.sh | 7 +++++--
> t/t5510-fetch.sh | 18 ++++++++++--------
> 3 files changed, 22 insertions(+), 11 deletions(-)
Thanks for fixing it, and sorry I didn't catch it during initial
review.
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 320ba9471d..c6c4689250 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -1023,11 +1023,17 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
> rc |= update_local_ref(ref, what, rm, ¬e,
> summary_width);
> free(ref);
> - } else
> + } else if (write_fetch_head || dry_run) {
> + /*
> + * Display fetches written to FETCH_HEAD (or
> + * would be written to FETCH_HEAD, if --dry-run
nit: to fix the parallel construction, s/would/that would/ or
s/written/that were written/
> + * is set).
> + */
> format_display(¬e, '*',
> *kind ? kind : "branch", NULL,
> *what ? what : "HEAD",
> "FETCH_HEAD", summary_width);
> + }
> if (note.len) {
> if (verbosity >= 0 && !shown_url) {
> fprintf(stderr, _("From %.*s\n"),
> diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
> index d681e90640..584a039b85 100755
> --- a/t/t0410-partial-clone.sh
> +++ b/t/t0410-partial-clone.sh
> @@ -183,7 +183,7 @@ test_expect_success 'missing CLI object, but promised, passes fsck' '
> '
>
> test_expect_success 'fetching of missing objects' '
> - rm -rf repo &&
> + rm -rf repo err &&
> test_create_repo server &&
> test_commit -C server foo &&
> git -C server repack -a -d --write-bitmap-index &&
> @@ -194,7 +194,10 @@ test_expect_success 'fetching of missing objects' '
>
> git -C repo config core.repositoryformatversion 1 &&
> git -C repo config extensions.partialclone "origin" &&
> - git -C repo cat-file -p "$HASH" &&
> + git -C repo cat-file -p "$HASH" 2>err &&
> +
> + # Ensure that no spurious FETCH_HEAD messages are written
> + ! grep FETCH_HEAD err &&
This test feels very specific to the bug being addressed. That said,
since we *also* have a test below for the generic behavior I don't mind.
>
> # Ensure that the .promisor file is written, and check that its
> # associated packfile contains the object
> diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
> index 2a1abe91f0..759aec9305 100755
> --- a/t/t5510-fetch.sh
> +++ b/t/t5510-fetch.sh
> @@ -543,16 +543,18 @@ test_expect_success 'fetch into the current branch with --update-head-ok' '
>
> '
>
> -test_expect_success 'fetch --dry-run does not touch FETCH_HEAD' '
> - rm -f .git/FETCH_HEAD &&
> - git fetch --dry-run . &&
> - ! test -f .git/FETCH_HEAD
> +test_expect_success 'fetch --dry-run does not touch FETCH_HEAD, but still prints what would be written' '
> + rm -f .git/FETCH_HEAD err &&
> + git fetch --dry-run . 2>err &&
> + ! test -f .git/FETCH_HEAD &&
> + grep FETCH_HEAD err
> '
>
> -test_expect_success '--no-write-fetch-head does not touch FETCH_HEAD' '
> - rm -f .git/FETCH_HEAD &&
> - git fetch --no-write-fetch-head . &&
> - ! test -f .git/FETCH_HEAD
> +test_expect_success '--no-write-fetch-head does not touch FETCH_HEAD, and does not print what would be written' '
> + rm -f .git/FETCH_HEAD err &&
> + git fetch --no-write-fetch-head . 2>err &&
> + ! test -f .git/FETCH_HEAD &&
> + ! grep FETCH_HEAD err
Nice.
With or without the following squashed in,
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
diff --git i/builtin/fetch.c w/builtin/fetch.c
index c6c46892503..387eeb1ec08 100644
--- i/builtin/fetch.c
+++ w/builtin/fetch.c
@@ -1025,9 +1025,9 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
free(ref);
} else if (write_fetch_head || dry_run) {
/*
- * Display fetches written to FETCH_HEAD (or
- * would be written to FETCH_HEAD, if --dry-run
- * is set).
+ * Display fetches that wrote to FETCH_HEAD (or,
+ * if --dry-run is set, that would write to
+ * FETCH_HEAD).
*/
format_display(¬e, '*',
*kind ? kind : "branch", NULL,
^ permalink raw reply related
* Re: [PATCH] vcbuild: fix batch file name in README
From: Jonathan Nieder @ 2020-09-03 0:02 UTC (permalink / raw)
To: Orgad Shaneh via GitGitGadget; +Cc: git, Orgad Shaneh, Jeff Hostetler
In-Reply-To: <pull.721.git.1599077900986.gitgitgadget@gmail.com>
(cc-ing Jeff Hostetler, vcbuild expert)
Orgad Shaneh wrote:
> Signed-off-by: Orgad Shaneh <orgads@gmail.com>
> ---
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-721%2Forgads%2Fvcbuild-readme-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-721/orgads/vcbuild-readme-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/721
>
> compat/vcbuild/README | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
Makes sense. This discrepancy is already present in the initial
contribution dce7d295514 (msvc: support building Git using MS Visual
C++, 2019-06-25), so it's probably from documentation going out of
date between review rounds.
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
> diff --git a/compat/vcbuild/README b/compat/vcbuild/README
> index 42292e7c09..51fb083dbb 100644
> --- a/compat/vcbuild/README
> +++ b/compat/vcbuild/README
> @@ -26,8 +26,8 @@ The Steps to Build Git with VS2015 or VS2017 from the command line.
> Use ONE of the following forms which should match how you want to
> compile git.exe.
>
> - $ ./compat/vcbuild/vcpkg_copy_packages.bat debug
> - $ ./compat/vcbuild/vcpkg_copy_packages.bat release
> + $ ./compat/vcbuild/vcpkg_copy_dlls.bat debug
> + $ ./compat/vcbuild/vcpkg_copy_dlls.bat release
>
> 3. Build git using MSVC from an SDK bash window using one of the
> following commands:
>
> base-commit: e19713638985533ce461db072b49112da5bd2042
> --
> gitgitgadget
^ permalink raw reply
* Re: [PATCH] vcbuild: fix library name for expat with make MSVC=1
From: Jonathan Nieder @ 2020-09-03 0:06 UTC (permalink / raw)
To: Orgad Shaneh via GitGitGadget
Cc: git, Orgad Shaneh, Jeff Hostetler, Johannes Schindelin
In-Reply-To: <pull.722.git.1599077798953.gitgitgadget@gmail.com>
(cc-ing a couple of Windows experts)
Orgad Shaneh wrote:
> Subject: vcbuild: fix library name for expat with make MSVC=1
Do you have more details? For example, what error message does the
build produce without this change?
> Signed-off-by: Orgad Shaneh <orgads@gmail.com>
> ---
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-722%2Forgads%2Fvcbuild-expat-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-722/orgads/vcbuild-expat-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/722
>
> compat/vcbuild/scripts/clink.pl | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
I'm ignorant enough about the platform-specific details involved that
I'd like an Ack from one of the Windows folks.
Thanks,
Jonathan
> diff --git a/compat/vcbuild/scripts/clink.pl b/compat/vcbuild/scripts/clink.pl
> index 61ad084a7b..df167d1e1a 100755
> --- a/compat/vcbuild/scripts/clink.pl
> +++ b/compat/vcbuild/scripts/clink.pl
> @@ -66,7 +66,7 @@
> }
> push(@args, $lib);
> } elsif ("$arg" eq "-lexpat") {
> - push(@args, "expat.lib");
> + push(@args, "libexpat.lib");
> } elsif ("$arg" =~ /^-L/ && "$arg" ne "-LTCG") {
> $arg =~ s/^-L/-LIBPATH:/;
> push(@lflags, $arg);
>
> base-commit: e19713638985533ce461db072b49112da5bd2042
> --
> gitgitgadget
^ permalink raw reply
* Re: [PATCH v2] fetch: no FETCH_HEAD display if --no-write-fetch-head
From: Junio C Hamano @ 2020-09-03 2:03 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: Jonathan Tan, git
In-Reply-To: <20200902235628.GB4035286@google.com>
Jonathan Nieder <jrnieder@gmail.com> writes:
>> builtin/fetch.c | 8 +++++++-
>> t/t0410-partial-clone.sh | 7 +++++--
>> t/t5510-fetch.sh | 18 ++++++++++--------
>> 3 files changed, 22 insertions(+), 11 deletions(-)
>
> Thanks for fixing it, and sorry I didn't catch it during initial
> review.
> ...
>> diff --git a/builtin/fetch.c b/builtin/fetch.c
>> index 320ba9471d..c6c4689250 100644
>> --- a/builtin/fetch.c
>> +++ b/builtin/fetch.c
>> @@ -1023,11 +1023,17 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
>> rc |= update_local_ref(ref, what, rm, ¬e,
>> summary_width);
>> free(ref);
>> - } else
>> + } else if (write_fetch_head || dry_run) {
>> + /*
>> + * Display fetches written to FETCH_HEAD (or
>> + * would be written to FETCH_HEAD, if --dry-run
>
> nit: to fix the parallel construction, s/would/that would/ or
> s/written/that were written/
True.
>> + * is set).
>> + */
>> format_display(¬e, '*',
>> *kind ? kind : "branch", NULL,
>> *what ? what : "HEAD",
>> "FETCH_HEAD", summary_width);
>> + }
Strictly speaking, I suspect that this is still broken when the user
says "fetch --no-write-fetch-head --dry-run" in which case we should
skip this block.
And to fix it properly, we would probably need to keep track of
three things semi-independently.
- were we told this is a "dry-run"? (current 'dry_run' variable)
- were we told not to store fetch-head? (missing)
- after all, are we going to write or not write fetch-head (current
'write_fetch_head' variable)
And the conditional to protect this block would be fixed to use only
the second and new "have we seen --no-fetch-head on the command
line?" variable, and ignore the settings of the dry_run variable, I
think.
Thanks.
^ permalink raw reply
* Re: [PATCH] vcbuild: fix library name for expat with make MSVC=1
From: Junio C Hamano @ 2020-09-03 2:07 UTC (permalink / raw)
To: Jonathan Nieder
Cc: Orgad Shaneh via GitGitGadget, git, Orgad Shaneh, Jeff Hostetler,
Johannes Schindelin
In-Reply-To: <20200903000640.GD4035286@google.com>
Jonathan Nieder <jrnieder@gmail.com> writes:
> (cc-ing a couple of Windows experts)
> Orgad Shaneh wrote:
>
>> Subject: vcbuild: fix library name for expat with make MSVC=1
>
> Do you have more details? For example, what error message does the
> build produce without this change?
Presumably you'd get an error at link time, saying either 'no such
library: expat.lib', or 'symbol X not found' for symbols that were
supposed to come from libexpat.lib.
I do not think we want to see exact error message. If an empty body
of the log message bothers us, I probably would say that it is
sufficient to write something like
The name of the expat library is libexpat.lib, not expat.lib;
otherwise we'd get linkage errors.
> I'm ignorant enough about the platform-specific details involved that
> I'd like an Ack from one of the Windows folks.
I saw Dscho looked at both pull requests and commented on them, but
haven't seen him (or anybody else) acking or nacking the version
that was submitted. It would be nice to see an Ack from Windows'
side.
Thanks.
^ permalink raw reply
* Re: [PATCH] xrealloc: do not reuse pointer freed by zero-length realloc()
From: Jonathan Nieder @ 2020-09-03 3:50 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Jeff King, Derrick Stolee, git
In-Reply-To: <xmqq3641ebep.fsf@gitster.c.googlers.com>
Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>> If we do handle it up-front, then I think we'd actually want:
>>
>> if (!size) {
>> free(ptr);
>> return xmalloc(0);
>> }
>>
>> (i.e., to never return NULL for consistency with xmalloc() and
>> xcalloc()).
>
> Makes sense. I suspect that this is optimizing for a wrong case,
> but in practice that should not matter. Not having to worry about
> a request to resize to 0-byte in the remainder of the function is
> actually a plus for readability, I would say.
I agree with both points: if we were repeatedly shrinking and growing
a buffer and cared about its performance, then we'd want the first
version, and since we aren't, we should prefer this version that is
more readable.
Thanks,
Jonathan
^ permalink raw reply
* Re: Git in Outreachy?
From: Jeff King @ 2020-09-03 5:41 UTC (permalink / raw)
To: Emily Shaffer; +Cc: git, Christian Couder, Johannes Schindelin
In-Reply-To: <20200901125119.GA3250129@coredump.intra.peff.net>
On Tue, Sep 01, 2020 at 08:51:20AM -0400, Jeff King wrote:
> On Mon, Aug 31, 2020 at 11:05:37AM -0700, Emily Shaffer wrote:
>
> > I'm interested in mentoring or co-mentoring this term. (I'm not working
> > this week but I didn't want to miss a deadline to volunteer.)
>
> OK, between you and Christian, then, it sounds like it's worth pursuing.
> I'll sign us up and try to arrange funding.
I'm still working out funding details, but in the meantime we're signed
up. Potential mentors should propose projects here:
https://www.outreachy.org/communities/cfp/git/
Sooner is better than later. We can technically submit projects up until
the 24th, but student applications are open now, and have to be in by
September 20th.
-Peff
^ permalink raw reply
* Re: [PATCH] RFC: refs: add GIT_DEBUG_REFS debugging mechanism
From: Jonathan Nieder @ 2020-09-03 5:44 UTC (permalink / raw)
To: Jonathan Tan
Cc: gitgitgadget, git, hanwenn, hanwen, git, stolee, Jeff King,
Josh Steadmon
In-Reply-To: <20200902174939.3391882-1-jonathantanmy@google.com>
(+Peff and Josh, trace experts)
Jonathan Tan wrote:
> Han-Wen Nienhuys wrote:
>> This should be integrated with the trace2 sub-system, and I would appreciate
>> pointers on how to start.
>
> The trace2 subsystem seems to be designed to detect errors coarsely
> (e.g. by looking at process invocations) and log timings. It currently
> doesn't seem to be made for this kind of fine debugging information -
> and perhaps this is by design, because such logging information would
> not be useful to most users and would just clutter up the logs.
>
> But I think there is a place for this in Git - in particular, we have
> packet tracing (GIT_TRACE_PACKET), and this has been useful both in
> automated tests (t/????-*.sh) and in manual tests. Ref backend tracing
> seems to be similar. But this would be best if we had an additional
> option that could control whether ref backend tracing was on or off,
> independent from other things like the trace2 target.
>
> Is the plan to migrate all usages of "trace" to "trace2" or for both to
> exist simultaneously? If the latter, then ref backend tracing could just
> use "trace", but if the former, we'd have to designing this additional
> option.
Here's my not-completely-thought-through take:
trace.h defines a convenient and simple API for unstructured traces.
You can define a key
static struct trace_key trace_refs = TRACE_KEY_INIT(REFS);
and then you have available
trace_want(&trace_refs)
trace_printf_key(&trace_refs, fmt, args...)
trace_strbuf(&trace_refs, data)
and so on.
One use of trace_printf (i.e. tracing without a key) is to print lines
for important events like starting a subprocess. For those, trace2 is a
natural replacement, with the advantage that the resulting event is
associated with any enclosing trace2 regions.
One design goal of trace2 was to be able to replace the original trace
subsystem completely. Its output routines are flexible that it should
be able to produce GIT_TRACE style output. It should be possible to
implement a trace_printf style API on top of trace2.
For ref tracing, do we care much about the human-friendliness of the
output? Do we expect this to be useful for debugging once reftable is
more established? Is it something we'd want to extract timing regions
from and use for performance improvements?
For ad hoc debugging, the structured tracing features of trace2 are
not all that important. So if using the trace_key API is simpler,
then for that application I'd say go for it. Some day we could
reimplement that same API on top of trace2 (or use a coccinelle
transform to a similar API) and your could would still work. :)
For producing logs that get aggregated, the structured output of
trace2 tends to be quite useful. Event tracing doesn't have a notion
of setting which subsystems to trace on but it would be a natural
thing to add. See TR2_SYSENV_CFG_PARAM for an example of configurable
"what to trace" information used by GIT_TRACE2_EVENT. Happy to
provide more advice if this seems applicable for the problem at hand.
Summary:
- if this is only going to be used for tests and for ad hoc debugging,
I'd suggest sticking to the simple trace_key based unstructured
tracing API
- if we expect it to be something we want to aggregate over many runs
and query, then structure becomes more important and trace2 starts
to pull its weight
Thanks,
Jonathan
^ permalink raw reply
* Re: Git in Outreachy?
From: Jonathan Nieder @ 2020-09-03 6:00 UTC (permalink / raw)
To: Christian Couder; +Cc: Jeff King, git, Christian Couder, Johannes Schindelin
In-Reply-To: <CAP8UFD2rpNhDhyHdQNxS-KJZgcumsCpK_JQ5koCqXJd70s-+_w@mail.gmail.com>
Hi,
Christian Couder wrote:
> I would appreciate help to find project ideas though. Are there still
> scripts that are worth converting to C (excluding git-bisect.sh and
> git-submodule.sh that are still worked on)? Are there worthy
> refactorings or improvements that we could propose as projects?
I think setting up something like snowpatch[*] to run CI on patches
that have hit the mailing list but not yet hit "seen" might be a good
project for an interested applicant (and I'd be interested in
co-mentoring if we find a taker).
Some other topics that could be interesting:
- better support for handling people's name changing
- making signing features such as signed push easier to use (for
example by allowing signing with SSH keys to simplify PKI) and more
useful (for example by standardizing a way to publish signed push
logs in Git)
- protocol: sharing notes and branch descriptions
- formats: on-disk reverse idx
- obliterate
- cache server to take advantage of multiple promisors+packfile URIs
Jonathan
[*] https://github.com/ruscur/snowpatch
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox