Git development

Git development
 help / color / mirror / Atom feed

* Re: [PATCH] clone: support cloning of filtered bundles
From: Nikolay Edigaryev @ 2024-01-14 21:26 UTC (permalink / raw)
  To: phillip.wood; +Cc: Nikolay Edigaryev via GitGitGadget, git, Derrick Stolee
In-Reply-To: <CAFX5hXTCPt-rDrWZ-RN8S84o_FooY3Ck2H1kMYdHQGzuetPBSw@mail.gmail.com>

> Note that currently, when you clone a normal non-filtered bundle with a
> '--filter' argument specified, no filtering will take place and no error
> will be thrown. "promisor = true" and "partialclonefilter = ..." options
> will be set in the repo config, but no .promisor file will be created.
> This is even more confusing IMO, but that's how it currently on
> Git 2.43.0.

I was wrong about this one: the .promisor pack file actually gets created.

And the filtering is not being done because the "uploadpack.allowFilter"
global option is not enabled by default.

Unfortunately the only way to figure this out is to run Git with
GIT_TRACE=2, which shows:

warning: filtering not recognized by server, ignoring

So please feel free to skip this part from the consideration.


On Sun, Jan 14, 2024 at 11:39 PM Nikolay Edigaryev <edigaryev@gmail.com> wrote:
>
> Hello Phillip,
>
> > As I understand it if you're cloning from a bundle file then then there
> > is no remote so how can we set a remote-specific config?
>
> There is a remote, for more details see df61c88979 (clone: also
> configure url for bare clones, 2010-03-29), which has the following
> code:
>
> strbuf_addf(&key, "remote.%s.url", remote_name);
> git_config_set(key.buf, repo);
> strbuf_reset(&key);
>
> You can verify this by creating a bundle on Git 2.43.0 with "git create
> bundle bundle.bundle --all" and then cloning it with "git clone
> --bare /path/to/bundle.bundle", in my case the following repo-wide
> configuration file was created:
>
> [core]
> repositoryformatversion = 0
> filemode = true
> bare = true
> ignorecase = true
> precomposeunicode = true
> [remote "origin"]
> url = /Users/edi/src/cirrus-cli/cli.bundle
>
> > I'm surprised that the proposed change does not require the user to pass
> > "--filter" to "git clone" as I expected that we'd want to check that the
> > filter on the command line was compatible with the filter used to create
> > the bundle. Allowing "git clone" to create a partial clone without the
> > user asking for it by passing the "--filter" option feels like is going
> > to end up confusing users.
>
> Note that currently, when you clone a normal non-filtered bundle with a
> '--filter' argument specified, no filtering will take place and no error
> will be thrown. "promisor = true" and "partialclonefilter = ..." options
> will be set in the repo config, but no .promisor file will be created.
> This is even more confusing IMO, but that's how it currently on
> Git 2.43.0.
>
> You have a good point, but I feel like completely preventing cloning of
> filtered bundles and requiring a '--filter' argument is very taxing. If
> you've already specified a '--filter' when creating a bundle (and thus
> your intent to use partially cloned data), why do it multiple times?
>
> What I propose as an alternative here is to act based on the user's
> intent when cloning:
>
> * when the user specifies no '--filter' argument, do nothing special,
>    allow cloning both types of bundles: normal and filtered (with the
>    logic from this patch)
>
> * when the user does specify a '--filter' argument, either:
>   * throw an error explaining that filtering of filtered bundles is not
>     supported
>   * or compare the user's filter specification and the one that is
>     in the bundle and only throw an error if they mismatch
>
> Let me know what you think about this (and perhaps you have a more
> concrete example in mind where this will have negative consequences)
> and I'll be happy to do a next iteration.
>
>
> On Sun, Jan 14, 2024 at 10:00 PM Phillip Wood <phillip.wood123@gmail.com> wrote:
> >
> > Hi Nikolay
> >
> > On 14/01/2024 11:16, Nikolay Edigaryev via GitGitGadget wrote:
> > > From: Nikolay Edigaryev <edigaryev@gmail.com>
> > >
> > > f18b512bbb (bundle: create filtered bundles, 2022-03-09) introduced an
> > > incredibly useful ability to create filtered bundles, which advances
> > > the partial clone/promisor support in Git and allows for archiving
> > > large repositories to object storages like S3 in bundles that are:
> > >
> > > * easy to manage
> > >    * bundle is just a single file, it's easier to guarantee atomic
> > >      replacements in object storages like S3 and they are faster to
> > >      fetch than a bare repository since there's only a single GET
> > >      request involved
> > > * incredibly tiny
> > >    * no indexes (which may be more than 10 MB for some repositories)
> > >      and other fluff, compared to cloning a bare repository
> > >    * bundle can be filtered to only contain the tips of refs neccessary
> > >      for e.g. code-analysis purposes
> > >
> > > However, in 86fdd94d72 (clone: fail gracefully when cloning filtered
> > > bundle, 2022-03-09) the cloning of such bundles was disabled, with a
> > > note that this behavior is not desired, and it the long-term this
> > > should be possible.
> > >
> > > The commit above states that it's not possible to have this at the
> > > moment due to lack of remote and a repository-global config that
> > > specifies an object filter, yet it's unclear why a remote-specific
> > > config can't be used instead, which is what this change does.
> >
> > As I understand it if you're cloning from a bundle file then then there
> > is no remote so how can we set a remote-specific config?
> >
> > I'm surprised that the proposed change does not require the user to pass
> > "--filter" to "git clone" as I expected that we'd want to check that the
> > filter on the command line was compatible with the filter used to create
> > the bundle. Allowing "git clone" to create a partial clone without the
> > user asking for it by passing the "--filter" option feels like is going
> > to end up confusing users.
> >
> > > +test_expect_success 'cloning from filtered bundle works' '
> > > +     git bundle create partial.bdl --all --filter=blob:none &&
> > > +     git clone --bare partial.bdl partial 2>err
> >
> > The redirection hides any error message which will make debugging test
> > failures harder. It would be nice to see this test check any config set
> > when cloning and that git commands can run successfully in the repository.
> >
> > Best Wishes
> >
> > Phillip

^ permalink raw reply

* Re: [PATCH 00/10] Enrich Trailer API
From: Linus Arver @ 2024-01-14 20:05 UTC (permalink / raw)
  To: Junio C Hamano, Linus Arver via GitGitGadget
  Cc: git, Emily Shaffer, Christian Couder
In-Reply-To: <owlyil3yhv70.fsf@fine.c.googlers.com>

Linus Arver <linusa@google.com> writes:

>     interpret-trailers: fail if given unrecognized arguments
>       (Summary: E.g., for "--where", only accept recognized WHERE_* enum
>       values. If we get something unrecognized, fail with an error
>       instead of silently doing nothing. Ditto for "--if-exists" and
>       "--if-missing".)
>
> The last one is a different class of bug than the first two, and perhaps
> less interesting.

Actually, upon closer inspection I realize that we already fail if given
unrecognized arguments to --where, --if-exists, and --if-missing. But we
don't explain why to the user because no error message is printed. This
commit has been retitled to "interpret-trailers: print error if given
unrecognized arguments".

Thanks.

^ permalink raw reply

* [PATCH] rev-list-options: fix off-by-one in '--filter=blob:limit=<n>' explainer
From: Nikolay Edigaryev via GitGitGadget @ 2024-01-14 19:50 UTC (permalink / raw)
  To: git; +Cc: Nikolay Edigaryev, Nikolay Edigaryev

From: Nikolay Edigaryev <edigaryev@gmail.com>

'--filter=blob:limit=<n>' was introduced in 25ec7bcac0 (list-objects:
filter objects in traverse_commit_list, 2017-11-21) and later expanded
to bitmaps in 84243da129 (pack-bitmap: implement BLOB_LIMIT filtering,
2020-02-14)

The logic that was introduced in these commits (and that still persists
to this day) omits blobs larger than _or equal_ to n bytes or units.

However, the documentation (Documentation/rev-list-options.txt) states:

>The form '--filter=blob:limit=<n>[kmg]' omits blobs larger than n
bytes or units. n may be zero.

Moreover, the t6113-rev-list-bitmap-filters.sh tests for exactly this
logic, so it seems it is the documentation that needs fixing, not the
code.

This changes the explanation to be similar to
Documentation/git-clone.txt, which is correct.

Signed-off-by: Nikolay Edigaryev <edigaryev@gmail.com>
---
    rev-list-options: fix off-by-one in '--filter=blob:limit=' explainer
    
    '--filter=blob:limit=' was introduced in 25ec7bcac0 (list-objects:
    filter objects in traverse_commit_list, 2017-11-21) and later expanded
    to bitmaps in 84243da129 (pack-bitmap: implement BLOB_LIMIT filtering,
    2020-02-14)
    
    The logic that was introduced in these commits (and that still persists
    to this day) omits blobs larger than or equal to n bytes or units.
    
    However, the documentation (Documentation/rev-list-options.txt) states:
    
    > The form '--filter=blob:limit=[kmg]' omits blobs larger than n bytes
    > or units. n may be zero.
    
    Moreover, the t6113-rev-list-bitmap-filters.sh tests for exactly this
    logic, so it seems it is the documentation that needs fixing, not the
    code.
    
    This changes the explanation to be similar to
    Documentation/git-clone.txt, which is correct.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1645%2Fedigaryev%2Ffix-blob-limit-off-by-one-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1645/edigaryev/fix-blob-limit-off-by-one-v1
Pull-Request: https://github.com/git/git/pull/1645

 Documentation/rev-list-options.txt | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index 2bf239ff030..a583b52c612 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -947,10 +947,10 @@ ifdef::git-rev-list[]
 +
 The form '--filter=blob:none' omits all blobs.
 +
-The form '--filter=blob:limit=<n>[kmg]' omits blobs larger than n bytes
-or units.  n may be zero.  The suffixes k, m, and g can be used to name
-units in KiB, MiB, or GiB.  For example, 'blob:limit=1k' is the same
-as 'blob:limit=1024'.
+The form '--filter=blob:limit=<n>[kmg]' omits blobs of size at least n
+bytes or units.  n may be zero.  The suffixes k, m, and g can be used
+to name units in KiB, MiB, or GiB.  For example, 'blob:limit=1k'
+is the same as 'blob:limit=1024'.
 +
 The form '--filter=object:type=(tag|commit|tree|blob)' omits all objects
 which are not of the requested type.

base-commit: 564d0252ca632e0264ed670534a51d18a689ef5d
-- 
gitgitgadget

^ permalink raw reply related

* Re: [PATCH] clone: support cloning of filtered bundles
From: Nikolay Edigaryev @ 2024-01-14 19:39 UTC (permalink / raw)
  To: phillip.wood; +Cc: Nikolay Edigaryev via GitGitGadget, git, Derrick Stolee
In-Reply-To: <24b82309-34e9-49a0-983b-7e94dad3d06b@gmail.com>

Hello Phillip,

> As I understand it if you're cloning from a bundle file then then there
> is no remote so how can we set a remote-specific config?

There is a remote, for more details see df61c88979 (clone: also
configure url for bare clones, 2010-03-29), which has the following
code:

strbuf_addf(&key, "remote.%s.url", remote_name);
git_config_set(key.buf, repo);
strbuf_reset(&key);

You can verify this by creating a bundle on Git 2.43.0 with "git create
bundle bundle.bundle --all" and then cloning it with "git clone
--bare /path/to/bundle.bundle", in my case the following repo-wide
configuration file was created:

[core]
repositoryformatversion = 0
filemode = true
bare = true
ignorecase = true
precomposeunicode = true
[remote "origin"]
url = /Users/edi/src/cirrus-cli/cli.bundle

> I'm surprised that the proposed change does not require the user to pass
> "--filter" to "git clone" as I expected that we'd want to check that the
> filter on the command line was compatible with the filter used to create
> the bundle. Allowing "git clone" to create a partial clone without the
> user asking for it by passing the "--filter" option feels like is going
> to end up confusing users.

Note that currently, when you clone a normal non-filtered bundle with a
'--filter' argument specified, no filtering will take place and no error
will be thrown. "promisor = true" and "partialclonefilter = ..." options
will be set in the repo config, but no .promisor file will be created.
This is even more confusing IMO, but that's how it currently on
Git 2.43.0.

You have a good point, but I feel like completely preventing cloning of
filtered bundles and requiring a '--filter' argument is very taxing. If
you've already specified a '--filter' when creating a bundle (and thus
your intent to use partially cloned data), why do it multiple times?

What I propose as an alternative here is to act based on the user's
intent when cloning:

* when the user specifies no '--filter' argument, do nothing special,
   allow cloning both types of bundles: normal and filtered (with the
   logic from this patch)

* when the user does specify a '--filter' argument, either:
  * throw an error explaining that filtering of filtered bundles is not
    supported
  * or compare the user's filter specification and the one that is
    in the bundle and only throw an error if they mismatch

Let me know what you think about this (and perhaps you have a more
concrete example in mind where this will have negative consequences)
and I'll be happy to do a next iteration.


On Sun, Jan 14, 2024 at 10:00 PM Phillip Wood <phillip.wood123@gmail.com> wrote:
>
> Hi Nikolay
>
> On 14/01/2024 11:16, Nikolay Edigaryev via GitGitGadget wrote:
> > From: Nikolay Edigaryev <edigaryev@gmail.com>
> >
> > f18b512bbb (bundle: create filtered bundles, 2022-03-09) introduced an
> > incredibly useful ability to create filtered bundles, which advances
> > the partial clone/promisor support in Git and allows for archiving
> > large repositories to object storages like S3 in bundles that are:
> >
> > * easy to manage
> >    * bundle is just a single file, it's easier to guarantee atomic
> >      replacements in object storages like S3 and they are faster to
> >      fetch than a bare repository since there's only a single GET
> >      request involved
> > * incredibly tiny
> >    * no indexes (which may be more than 10 MB for some repositories)
> >      and other fluff, compared to cloning a bare repository
> >    * bundle can be filtered to only contain the tips of refs neccessary
> >      for e.g. code-analysis purposes
> >
> > However, in 86fdd94d72 (clone: fail gracefully when cloning filtered
> > bundle, 2022-03-09) the cloning of such bundles was disabled, with a
> > note that this behavior is not desired, and it the long-term this
> > should be possible.
> >
> > The commit above states that it's not possible to have this at the
> > moment due to lack of remote and a repository-global config that
> > specifies an object filter, yet it's unclear why a remote-specific
> > config can't be used instead, which is what this change does.
>
> As I understand it if you're cloning from a bundle file then then there
> is no remote so how can we set a remote-specific config?
>
> I'm surprised that the proposed change does not require the user to pass
> "--filter" to "git clone" as I expected that we'd want to check that the
> filter on the command line was compatible with the filter used to create
> the bundle. Allowing "git clone" to create a partial clone without the
> user asking for it by passing the "--filter" option feels like is going
> to end up confusing users.
>
> > +test_expect_success 'cloning from filtered bundle works' '
> > +     git bundle create partial.bdl --all --filter=blob:none &&
> > +     git clone --bare partial.bdl partial 2>err
>
> The redirection hides any error message which will make debugging test
> failures harder. It would be nice to see this test check any config set
> when cloning and that git commands can run successfully in the repository.
>
> Best Wishes
>
> Phillip

^ permalink raw reply

* Re: [PATCH] strvec: use correct member name in comments
From: Linus Arver @ 2024-01-14 18:20 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Linus Arver via GitGitGadget, git
In-Reply-To: <20240113073131.GA657764@coredump.intra.peff.net>

Jeff King <peff@peff.net> writes:

> On Fri, Jan 12, 2024 at 04:37:46PM -0800, Linus Arver wrote:
>
>> OTOH if we were treating these .h files as something meant for direct
>> external consumption (that is, if strvec.h is libified and external
>> users outside of Git are expected to use it directly as their first
>> point of documentation), at that point it might make sense to name the
>> parameters (akin to the style of manpages for syscalls). But I imagine
>> at that point we would have some other means of developer docs (beyond
>> raw header files) for libified parts of Git, so even in that case it's
>> probably fine to keep things as is.
>
> I think this is mostly orthogonal to libification. Whether the audience
> is other parts of Git or users outside of Git, they need to know how to
> call the function. Our main source of documentation there is comments
> above the declaration (we've marked these with "/**" which would allow a
> parser to pull them into a separate doc file, but AFAIK in the 9 years
> since we started that convention, nobody has bothered to write such a
> script).
>
> Naming the parameters can help when writing those comments, because you
> can then refer to them (e.g., see the comment above strbuf_addftime).
> Even without that, I think they can be helpful, but I don't think I'd
> bother adding them in unless taking a pass over the whole file, looking
> for comments that do not sufficiently explain their matching functions.

So in summary you are saying that the comments are the most important
source of documentation that we have currently, and unless naming the
parameters improves these comments, we shouldn't bother naming these
parameters. I agree.

> I don't doubt that some of that would be necessary for libification,
> just to increase the quality of the documentation. But I think it's
> largely separate from the patch in this thread.

I agree with both statements. Thanks.

^ permalink raw reply

* Re: [PATCH] clone: support cloning of filtered bundles
From: Phillip Wood @ 2024-01-14 18:00 UTC (permalink / raw)
  To: Nikolay Edigaryev via GitGitGadget, git; +Cc: Derrick Stolee, Nikolay Edigaryev
In-Reply-To: <pull.1644.git.git.1705231010118.gitgitgadget@gmail.com>

Hi Nikolay

On 14/01/2024 11:16, Nikolay Edigaryev via GitGitGadget wrote:
> From: Nikolay Edigaryev <edigaryev@gmail.com>
> 
> f18b512bbb (bundle: create filtered bundles, 2022-03-09) introduced an
> incredibly useful ability to create filtered bundles, which advances
> the partial clone/promisor support in Git and allows for archiving
> large repositories to object storages like S3 in bundles that are:
> 
> * easy to manage
>    * bundle is just a single file, it's easier to guarantee atomic
>      replacements in object storages like S3 and they are faster to
>      fetch than a bare repository since there's only a single GET
>      request involved
> * incredibly tiny
>    * no indexes (which may be more than 10 MB for some repositories)
>      and other fluff, compared to cloning a bare repository
>    * bundle can be filtered to only contain the tips of refs neccessary
>      for e.g. code-analysis purposes
> 
> However, in 86fdd94d72 (clone: fail gracefully when cloning filtered
> bundle, 2022-03-09) the cloning of such bundles was disabled, with a
> note that this behavior is not desired, and it the long-term this
> should be possible.
> 
> The commit above states that it's not possible to have this at the
> moment due to lack of remote and a repository-global config that
> specifies an object filter, yet it's unclear why a remote-specific
> config can't be used instead, which is what this change does.

As I understand it if you're cloning from a bundle file then then there 
is no remote so how can we set a remote-specific config?

I'm surprised that the proposed change does not require the user to pass 
"--filter" to "git clone" as I expected that we'd want to check that the 
filter on the command line was compatible with the filter used to create 
the bundle. Allowing "git clone" to create a partial clone without the 
user asking for it by passing the "--filter" option feels like is going 
to end up confusing users.

> +test_expect_success 'cloning from filtered bundle works' '
> +	git bundle create partial.bdl --all --filter=blob:none &&
> +	git clone --bare partial.bdl partial 2>err

The redirection hides any error message which will make debugging test 
failures harder. It would be nice to see this test check any config set 
when cloning and that git commands can run successfully in the repository.

Best Wishes

Phillip

^ permalink raw reply

* [PATCH] clone: support cloning of filtered bundles
From: Nikolay Edigaryev via GitGitGadget @ 2024-01-14 11:16 UTC (permalink / raw)
  To: git; +Cc: Derrick Stolee, Nikolay Edigaryev, Nikolay Edigaryev

From: Nikolay Edigaryev <edigaryev@gmail.com>

f18b512bbb (bundle: create filtered bundles, 2022-03-09) introduced an
incredibly useful ability to create filtered bundles, which advances
the partial clone/promisor support in Git and allows for archiving
large repositories to object storages like S3 in bundles that are:

* easy to manage
  * bundle is just a single file, it's easier to guarantee atomic
    replacements in object storages like S3 and they are faster to
    fetch than a bare repository since there's only a single GET
    request involved
* incredibly tiny
  * no indexes (which may be more than 10 MB for some repositories)
    and other fluff, compared to cloning a bare repository
  * bundle can be filtered to only contain the tips of refs neccessary
    for e.g. code-analysis purposes

However, in 86fdd94d72 (clone: fail gracefully when cloning filtered
bundle, 2022-03-09) the cloning of such bundles was disabled, with a
note that this behavior is not desired, and it the long-term this
should be possible.

The commit above states that it's not possible to have this at the
moment due to lack of remote and a repository-global config that
specifies an object filter, yet it's unclear why a remote-specific
config can't be used instead, which is what this change does.

Signed-off-by: Nikolay Edigaryev <edigaryev@gmail.com>
---
    clone: support cloning of filtered bundles
    
    f18b512bbb (bundle: create filtered bundles, 2022-03-09) introduced an
    incredibly useful ability to create filtered bundles, which advances the
    partial clone/promisor support in Git and allows for archiving large
    repositories to object storages like S3 in bundles that are:
    
     * easy to manage
       * bundle is just a single file, it's easier to guarantee atomic
         replacements in object storages like S3 and they are faster to
         fetch than a bare repository since there's only a single GET
         request involved
     * incredibly tiny
       * no indexes (which may be more than 10 MB for some repositories) and
         other fluff, compared to cloning a bare repository
       * bundle can be filtered to only contain the tips of refs neccessary
         for e.g. code-analysis purposes
    
    However, in 86fdd94d72 (clone: fail gracefully when cloning filtered
    bundle, 2022-03-09) the cloning of such bundles was disabled, with a
    note that this behavior is not desired, and it the long-term this should
    be possible.
    
    The commit above states that it's not possible to have this at the
    moment due to lack of remote and a repository-global config that
    specifies an object filter, yet it's unclear why a remote-specific
    config can't be used instead, which is what this change does.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1644%2Fedigaryev%2Fclone-filtered-bundles-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1644/edigaryev/clone-filtered-bundles-v1
Pull-Request: https://github.com/git/git/pull/1644

 builtin/clone.c        | 13 +++++++++++--
 t/t6020-bundle-misc.sh | 13 +++----------
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index c6357af9498..4b3fedf78ed 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1227,9 +1227,18 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 
 		if (fd > 0)
 			close(fd);
+
+		if (has_filter) {
+			strbuf_addf(&key, "remote.%s.promisor", remote_name);
+			git_config_set(key.buf, "true");
+			strbuf_reset(&key);
+
+			strbuf_addf(&key, "remote.%s.partialclonefilter", remote_name);
+			git_config_set(key.buf, expand_list_objects_filter_spec(&header.filter));
+			strbuf_reset(&key);
+		}
+
 		bundle_header_release(&header);
-		if (has_filter)
-			die(_("cannot clone from filtered bundle"));
 	}
 
 	transport_set_option(transport, TRANS_OPT_KEEP, "yes");
diff --git a/t/t6020-bundle-misc.sh b/t/t6020-bundle-misc.sh
index 3e6bcbf30cd..f449df00642 100755
--- a/t/t6020-bundle-misc.sh
+++ b/t/t6020-bundle-misc.sh
@@ -555,16 +555,9 @@ do
 	'
 done
 
-# NEEDSWORK: 'git clone --bare' should be able to clone from a filtered
-# bundle, but that requires a change to promisor/filter config options.
-# For now, we fail gracefully with a helpful error. This behavior can be
-# changed in the future to succeed as much as possible.
-test_expect_success 'cloning from filtered bundle has useful error' '
-	git bundle create partial.bdl \
-		--all \
-		--filter=blob:none &&
-	test_must_fail git clone --bare partial.bdl partial 2>err &&
-	grep "cannot clone from filtered bundle" err
+test_expect_success 'cloning from filtered bundle works' '
+	git bundle create partial.bdl --all --filter=blob:none &&
+	git clone --bare partial.bdl partial 2>err
 '
 
 test_expect_success 'verify catches unreachable, broken prerequisites' '

base-commit: 564d0252ca632e0264ed670534a51d18a689ef5d
-- 
gitgitgadget

^ permalink raw reply related

* Re: [PATCH v2 2/5] reftable/stack: refactor reloading to use file descriptor
From: Jeff King @ 2024-01-14 10:14 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Han-Wen Nienhuys, Junio C Hamano
In-Reply-To: <36b9f6b6240686cc5b0a761b889614fc31f01d34.1704966670.git.ps@pks.im>

On Thu, Jan 11, 2024 at 11:06:43AM +0100, Patrick Steinhardt wrote:

> We're about to introduce a stat(3P)-based caching mechanism to reload
> the list of stacks only when it has changed. In order to avoid race
> conditions this requires us to have a file descriptor available that we
> can use to call fstat(3P) on.
> 
> Prepare for this by converting the code to use `fd_read_lines()` so that
> we have the file descriptor readily available.

Coverity noted a case with this series where we might feed a negative
value to fstat(). I'm not sure if it's a bug or not.

The issue is that here:

> @@ -329,9 +330,19 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
>  		if (tries > 3 && tv_cmp(&now, &deadline) >= 0)
>  			goto out;
>  
> -		err = read_lines(st->list_file, &names);
> -		if (err < 0)
> -			goto out;
> +		fd = open(st->list_file, O_RDONLY);
> +		if (fd < 0) {
> +			if (errno != ENOENT) {
> +				err = REFTABLE_IO_ERROR;
> +				goto out;
> +			}
> +
> +			names = reftable_calloc(sizeof(char *));
> +		} else {
> +			err = fd_read_lines(fd, &names);
> +			if (err < 0)
> +				goto out;
> +		}

...we might end up with fd as "-1" after calling open() on the list
file. For most errors we'll jump to "out", which makes sense. But if we
get ENOENT, we keep going with an empty file-list, which makes sense.

But we then do other stuff with "fd". I think this case is OK:

> @@ -356,12 +367,16 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
>  		names = NULL;
>  		free_names(names_after);
>  		names_after = NULL;
> +		close(fd);
> +		fd = -1;

We only get here if reftable_stack_reload_once() returned an error,
which it won't do since we feed it a blank set of names (and anyway,
close(-1) is a harmless noop).

But if we actually get to the end of the function, it's more
questionable. As of this patch, it's OK:

>  		delay = delay + (delay * rand()) / RAND_MAX + 1;
>  		sleep_millisec(delay);
>  	}
>  
>  out:
> +	if (fd >= 0)
> +		close(fd);
>  	free_names(names);
>  	free_names(names_after);
>  	return err;

But in the next patch we have this hunk:

> @@ -374,7 +375,11 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
>                 sleep_millisec(delay);
>         }
> 
> +       stat_validity_update(&st->list_validity, fd);
> +
>  out:
> +       if (err)
> +               stat_validity_clear(&st->list_validity);
>         if (fd >= 0)
>                 close(fd);
>         free_names(names);

which means we'll feed a negative value to stat_validity_update(). I
think this may be OK, because I'd imagine the only sensible thing to do
is call stat_validity_clear() instead. And using a negative fd means
fstat() will fail, which will cause stat_validity_update() to clear the
validity struct anyway. But I thought it was worth double-checking.

-Peff

^ permalink raw reply

* [PATCH v2] tests: move t0009-prio-queue.sh to the new unit testing framework
From: Chandra Pratap via GitGitGadget @ 2024-01-14  8:18 UTC (permalink / raw)
  To: git; +Cc: Chandra Pratap, Chandra Pratap
In-Reply-To: <pull.1642.git.1705219829965.gitgitgadget@gmail.com>

From: Chandra Pratap <chandrapratap3519@gmail.com>

t/t0009-prio-queue.sh along with t/helper/test-prio-queue.c unit
tests Git's implementation of a priority queue. Migrate the
test over to the new unit testing framework to simplify debugging
and reduce test run-time. Refactor the required logic and add
a new test case in addition to porting over the original ones in
shell.

Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
---
    tests: move t0009-prio-queue.sh to the new unit testing framework

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1642%2FChand-ra%2Fprio-queue-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1642/Chand-ra/prio-queue-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1642

Range-diff vs v1:

 1:  ca20abe95ea ! 1:  60ac9b3c259 tests: move t0009-prio-queue.sh to the new unit testing framework
     @@ t/unit-tests/t-prio-queue.c (new)
      +#define REVERSE_STACK_INPUT "stack", "1", "2", "3", "4", "5", "6", "reverse", "dump"
      +#define REVERSE_STACK_EXPECTED 1, 2, 3, 4, 5, 6
      +
     -+#define TEST_INPUT(INPUT, EXPECTED, name)		\
     ++#define TEST_INPUT(INPUT, EXPECTED, name)			\
      +  static void test_##name(void)					\
     -+{											\
     -+	const char *input[] = {INPUT};						\
     -+	int expected[] = {EXPECTED};					\
     -+	int result[INPUT_SIZE];								\
     -+	test_prio_queue(input, result);						\
     ++{								\
     ++	const char *input[] = {INPUT};				\
     ++	int expected[] = {EXPECTED};				\
     ++	int result[INPUT_SIZE];					\
     ++	test_prio_queue(input, result);				\
      +	check(!memcmp(expected, result, sizeof(expected)));	\
      +}
      +


 Makefile                    |  2 +-
 t/helper/test-prio-queue.c  | 51 -------------------
 t/helper/test-tool.c        |  1 -
 t/helper/test-tool.h        |  1 -
 t/t0009-prio-queue.sh       | 66 -------------------------
 t/unit-tests/t-prio-queue.c | 99 +++++++++++++++++++++++++++++++++++++
 6 files changed, 100 insertions(+), 120 deletions(-)
 delete mode 100644 t/helper/test-prio-queue.c
 delete mode 100755 t/t0009-prio-queue.sh
 create mode 100644 t/unit-tests/t-prio-queue.c

diff --git a/Makefile b/Makefile
index 312d95084c1..d5e48e656b3 100644
--- a/Makefile
+++ b/Makefile
@@ -828,7 +828,6 @@ TEST_BUILTINS_OBJS += test-partial-clone.o
 TEST_BUILTINS_OBJS += test-path-utils.o
 TEST_BUILTINS_OBJS += test-pcre2-config.o
 TEST_BUILTINS_OBJS += test-pkt-line.o
-TEST_BUILTINS_OBJS += test-prio-queue.o
 TEST_BUILTINS_OBJS += test-proc-receive.o
 TEST_BUILTINS_OBJS += test-progress.o
 TEST_BUILTINS_OBJS += test-reach.o
@@ -1340,6 +1339,7 @@ THIRD_PARTY_SOURCES += sha1dc/%
 
 UNIT_TEST_PROGRAMS += t-basic
 UNIT_TEST_PROGRAMS += t-strbuf
+UNIT_TEST_PROGRAMS += t-prio-queue
 UNIT_TEST_PROGS = $(patsubst %,$(UNIT_TEST_BIN)/%$X,$(UNIT_TEST_PROGRAMS))
 UNIT_TEST_OBJS = $(patsubst %,$(UNIT_TEST_DIR)/%.o,$(UNIT_TEST_PROGRAMS))
 UNIT_TEST_OBJS += $(UNIT_TEST_DIR)/test-lib.o
diff --git a/t/helper/test-prio-queue.c b/t/helper/test-prio-queue.c
deleted file mode 100644
index f0bf255f5f0..00000000000
--- a/t/helper/test-prio-queue.c
+++ /dev/null
@@ -1,51 +0,0 @@
-#include "test-tool.h"
-#include "prio-queue.h"
-
-static int intcmp(const void *va, const void *vb, void *data UNUSED)
-{
-	const int *a = va, *b = vb;
-	return *a - *b;
-}
-
-static void show(int *v)
-{
-	if (!v)
-		printf("NULL\n");
-	else
-		printf("%d\n", *v);
-	free(v);
-}
-
-int cmd__prio_queue(int argc UNUSED, const char **argv)
-{
-	struct prio_queue pq = { intcmp };
-
-	while (*++argv) {
-		if (!strcmp(*argv, "get")) {
-			void *peek = prio_queue_peek(&pq);
-			void *get = prio_queue_get(&pq);
-			if (peek != get)
-				BUG("peek and get results do not match");
-			show(get);
-		} else if (!strcmp(*argv, "dump")) {
-			void *peek;
-			void *get;
-			while ((peek = prio_queue_peek(&pq))) {
-				get = prio_queue_get(&pq);
-				if (peek != get)
-					BUG("peek and get results do not match");
-				show(get);
-			}
-		} else if (!strcmp(*argv, "stack")) {
-			pq.compare = NULL;
-		} else {
-			int *v = xmalloc(sizeof(*v));
-			*v = atoi(*argv);
-			prio_queue_put(&pq, v);
-		}
-	}
-
-	clear_prio_queue(&pq);
-
-	return 0;
-}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 876cd2dc313..5f591b9718f 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -57,7 +57,6 @@ static struct test_cmd cmds[] = {
 	{ "path-utils", cmd__path_utils },
 	{ "pcre2-config", cmd__pcre2_config },
 	{ "pkt-line", cmd__pkt_line },
-	{ "prio-queue", cmd__prio_queue },
 	{ "proc-receive", cmd__proc_receive },
 	{ "progress", cmd__progress },
 	{ "reach", cmd__reach },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index 70dd4eba119..b921138d8ec 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -50,7 +50,6 @@ int cmd__partial_clone(int argc, const char **argv);
 int cmd__path_utils(int argc, const char **argv);
 int cmd__pcre2_config(int argc, const char **argv);
 int cmd__pkt_line(int argc, const char **argv);
-int cmd__prio_queue(int argc, const char **argv);
 int cmd__proc_receive(int argc, const char **argv);
 int cmd__progress(int argc, const char **argv);
 int cmd__reach(int argc, const char **argv);
diff --git a/t/t0009-prio-queue.sh b/t/t0009-prio-queue.sh
deleted file mode 100755
index eea99107a48..00000000000
--- a/t/t0009-prio-queue.sh
+++ /dev/null
@@ -1,66 +0,0 @@
-#!/bin/sh
-
-test_description='basic tests for priority queue implementation'
-
-TEST_PASSES_SANITIZE_LEAK=true
-. ./test-lib.sh
-
-cat >expect <<'EOF'
-1
-2
-3
-4
-5
-5
-6
-7
-8
-9
-10
-EOF
-test_expect_success 'basic ordering' '
-	test-tool prio-queue 2 6 3 10 9 5 7 4 5 8 1 dump >actual &&
-	test_cmp expect actual
-'
-
-cat >expect <<'EOF'
-2
-3
-4
-1
-5
-6
-EOF
-test_expect_success 'mixed put and get' '
-	test-tool prio-queue 6 2 4 get 5 3 get get 1 dump >actual &&
-	test_cmp expect actual
-'
-
-cat >expect <<'EOF'
-1
-2
-NULL
-1
-2
-NULL
-EOF
-test_expect_success 'notice empty queue' '
-	test-tool prio-queue 1 2 get get get 1 2 get get get >actual &&
-	test_cmp expect actual
-'
-
-cat >expect <<'EOF'
-3
-2
-6
-4
-5
-1
-8
-EOF
-test_expect_success 'stack order' '
-	test-tool prio-queue stack 8 1 5 4 6 2 3 dump >actual &&
-	test_cmp expect actual
-'
-
-test_done
diff --git a/t/unit-tests/t-prio-queue.c b/t/unit-tests/t-prio-queue.c
new file mode 100644
index 00000000000..07b112f5894
--- /dev/null
+++ b/t/unit-tests/t-prio-queue.c
@@ -0,0 +1,99 @@
+#include "test-lib.h"
+#include "prio-queue.h"
+
+static int intcmp(const void *va, const void *vb, void *data UNUSED)
+{
+	const int *a = va, *b = vb;
+	return *a - *b;
+}
+
+static int show(int *v)
+{
+	int ret = -1;
+	if (v)
+		ret = *v;
+	free(v);
+	return ret;
+}
+
+static int test_prio_queue(const char **input, int *result)
+{
+	struct prio_queue pq = { intcmp };
+	int i = 0;
+
+	while (*input) {
+		if (!strcmp(*input, "get")) {
+			void *peek = prio_queue_peek(&pq);
+			void *get = prio_queue_get(&pq);
+			if (peek != get)
+				BUG("peek and get results do not match");
+			result[i++] = show(get);
+		} else if (!strcmp(*input, "dump")) {
+			void *peek;
+			void *get;
+			while ((peek = prio_queue_peek(&pq))) {
+				get = prio_queue_get(&pq);
+				if (peek != get)
+					BUG("peek and get results do not match");
+				result[i++] = show(get);
+			}
+		} else if (!strcmp(*input, "stack")) {
+			pq.compare = NULL;
+		} else if (!strcmp(*input, "reverse")) {
+			prio_queue_reverse(&pq);
+		} else {
+			int *v = xmalloc(sizeof(*v));
+			*v = atoi(*input);
+			prio_queue_put(&pq, v);
+		}
+		input++;
+	}
+
+	clear_prio_queue(&pq);
+
+	return 0;
+}
+
+#define INPUT_SIZE 6
+
+#define BASIC_INPUT "1", "2", "3", "4", "5", "5", "dump"
+#define BASIC_EXPECTED 1, 2, 3, 4, 5, 5
+
+#define MIXED_PUT_GET_INPUT "6", "2", "4", "get", "5", "3", "get", "get", "1", "dump"
+#define MIXED_PUT_GET_EXPECTED 2, 3, 4, 1, 5, 6
+
+#define EMPTY_QUEUE_INPUT "1", "2", "get", "get", "get", "1", "2", "get", "get", "get"
+#define EMPTY_QUEUE_EXPECTED 1, 2, -1, 1, 2, -1
+
+#define STACK_INPUT "stack", "1", "5", "4", "6", "2", "3", "dump"
+#define STACK_EXPECTED 3, 2, 6, 4, 5, 1
+
+#define REVERSE_STACK_INPUT "stack", "1", "2", "3", "4", "5", "6", "reverse", "dump"
+#define REVERSE_STACK_EXPECTED 1, 2, 3, 4, 5, 6
+
+#define TEST_INPUT(INPUT, EXPECTED, name)			\
+  static void test_##name(void)					\
+{								\
+	const char *input[] = {INPUT};				\
+	int expected[] = {EXPECTED};				\
+	int result[INPUT_SIZE];					\
+	test_prio_queue(input, result);				\
+	check(!memcmp(expected, result, sizeof(expected)));	\
+}
+
+TEST_INPUT(BASIC_INPUT, BASIC_EXPECTED, basic)
+TEST_INPUT(MIXED_PUT_GET_INPUT, MIXED_PUT_GET_EXPECTED, mixed)
+TEST_INPUT(EMPTY_QUEUE_INPUT, EMPTY_QUEUE_EXPECTED, empty)
+TEST_INPUT(STACK_INPUT, STACK_EXPECTED, stack)
+TEST_INPUT(REVERSE_STACK_INPUT, REVERSE_STACK_EXPECTED, reverse)
+
+int cmd_main(int argc, const char **argv)
+{
+	TEST(test_basic(), "prio-queue works for basic input");
+	TEST(test_mixed(), "prio-queue works for mixed put & get commands");
+	TEST(test_empty(), "prio-queue works when queue is empty");
+	TEST(test_stack(), "prio-queue works when used as a LIFO stack");
+	TEST(test_reverse(), "prio-queue works when LIFO stack is reversed");
+
+	return test_done();
+}

base-commit: 1a87c842ece327d03d08096395969aca5e0a6996
-- 
gitgitgadget

^ permalink raw reply related

* Re: [PATCH] tests: move t0009-prio-queue.sh to the new unit testing framework
From: Chandra Pratap @ 2024-01-14  8:15 UTC (permalink / raw)
  To: Chandra Pratap via GitGitGadget; +Cc: git, Chandra Pratap
In-Reply-To: <pull.1642.git.1705219829965.gitgitgadget@gmail.com>

Please ignore the patch above, my e-mail client seems to have
messed up the backslash indentation. I will follow up with the
corrected patch shortly.

On Sun, 14 Jan 2024 at 13:40, Chandra Pratap via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Chandra Pratap <chandrapratap3519@gmail.com>
>
> t/t0009-prio-queue.sh along with t/helper/test-prio-queue.c unit
> tests Git's implementation of a priority queue. Migrate the
> test over to the new unit testing framework to simplify debugging
> and reduce test run-time. Refactor the required logic and add
> a new test case in addition to porting over the original ones in
> shell.
>
> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
> ---
>     tests: move t0009-prio-queue.sh to the new unit testing framework
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1642%2FChand-ra%2Fprio-queue-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1642/Chand-ra/prio-queue-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1642
>
>  Makefile                    |  2 +-
>  t/helper/test-prio-queue.c  | 51 -------------------
>  t/helper/test-tool.c        |  1 -
>  t/helper/test-tool.h        |  1 -
>  t/t0009-prio-queue.sh       | 66 -------------------------
>  t/unit-tests/t-prio-queue.c | 99 +++++++++++++++++++++++++++++++++++++
>  6 files changed, 100 insertions(+), 120 deletions(-)
>  delete mode 100644 t/helper/test-prio-queue.c
>  delete mode 100755 t/t0009-prio-queue.sh
>  create mode 100644 t/unit-tests/t-prio-queue.c
>
> diff --git a/Makefile b/Makefile
> index 312d95084c1..d5e48e656b3 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -828,7 +828,6 @@ TEST_BUILTINS_OBJS += test-partial-clone.o
>  TEST_BUILTINS_OBJS += test-path-utils.o
>  TEST_BUILTINS_OBJS += test-pcre2-config.o
>  TEST_BUILTINS_OBJS += test-pkt-line.o
> -TEST_BUILTINS_OBJS += test-prio-queue.o
>  TEST_BUILTINS_OBJS += test-proc-receive.o
>  TEST_BUILTINS_OBJS += test-progress.o
>  TEST_BUILTINS_OBJS += test-reach.o
> @@ -1340,6 +1339,7 @@ THIRD_PARTY_SOURCES += sha1dc/%
>
>  UNIT_TEST_PROGRAMS += t-basic
>  UNIT_TEST_PROGRAMS += t-strbuf
> +UNIT_TEST_PROGRAMS += t-prio-queue
>  UNIT_TEST_PROGS = $(patsubst %,$(UNIT_TEST_BIN)/%$X,$(UNIT_TEST_PROGRAMS))
>  UNIT_TEST_OBJS = $(patsubst %,$(UNIT_TEST_DIR)/%.o,$(UNIT_TEST_PROGRAMS))
>  UNIT_TEST_OBJS += $(UNIT_TEST_DIR)/test-lib.o
> diff --git a/t/helper/test-prio-queue.c b/t/helper/test-prio-queue.c
> deleted file mode 100644
> index f0bf255f5f0..00000000000
> --- a/t/helper/test-prio-queue.c
> +++ /dev/null
> @@ -1,51 +0,0 @@
> -#include "test-tool.h"
> -#include "prio-queue.h"
> -
> -static int intcmp(const void *va, const void *vb, void *data UNUSED)
> -{
> -       const int *a = va, *b = vb;
> -       return *a - *b;
> -}
> -
> -static void show(int *v)
> -{
> -       if (!v)
> -               printf("NULL\n");
> -       else
> -               printf("%d\n", *v);
> -       free(v);
> -}
> -
> -int cmd__prio_queue(int argc UNUSED, const char **argv)
> -{
> -       struct prio_queue pq = { intcmp };
> -
> -       while (*++argv) {
> -               if (!strcmp(*argv, "get")) {
> -                       void *peek = prio_queue_peek(&pq);
> -                       void *get = prio_queue_get(&pq);
> -                       if (peek != get)
> -                               BUG("peek and get results do not match");
> -                       show(get);
> -               } else if (!strcmp(*argv, "dump")) {
> -                       void *peek;
> -                       void *get;
> -                       while ((peek = prio_queue_peek(&pq))) {
> -                               get = prio_queue_get(&pq);
> -                               if (peek != get)
> -                                       BUG("peek and get results do not match");
> -                               show(get);
> -                       }
> -               } else if (!strcmp(*argv, "stack")) {
> -                       pq.compare = NULL;
> -               } else {
> -                       int *v = xmalloc(sizeof(*v));
> -                       *v = atoi(*argv);
> -                       prio_queue_put(&pq, v);
> -               }
> -       }
> -
> -       clear_prio_queue(&pq);
> -
> -       return 0;
> -}
> diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
> index 876cd2dc313..5f591b9718f 100644
> --- a/t/helper/test-tool.c
> +++ b/t/helper/test-tool.c
> @@ -57,7 +57,6 @@ static struct test_cmd cmds[] = {
>         { "path-utils", cmd__path_utils },
>         { "pcre2-config", cmd__pcre2_config },
>         { "pkt-line", cmd__pkt_line },
> -       { "prio-queue", cmd__prio_queue },
>         { "proc-receive", cmd__proc_receive },
>         { "progress", cmd__progress },
>         { "reach", cmd__reach },
> diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
> index 70dd4eba119..b921138d8ec 100644
> --- a/t/helper/test-tool.h
> +++ b/t/helper/test-tool.h
> @@ -50,7 +50,6 @@ int cmd__partial_clone(int argc, const char **argv);
>  int cmd__path_utils(int argc, const char **argv);
>  int cmd__pcre2_config(int argc, const char **argv);
>  int cmd__pkt_line(int argc, const char **argv);
> -int cmd__prio_queue(int argc, const char **argv);
>  int cmd__proc_receive(int argc, const char **argv);
>  int cmd__progress(int argc, const char **argv);
>  int cmd__reach(int argc, const char **argv);
> diff --git a/t/t0009-prio-queue.sh b/t/t0009-prio-queue.sh
> deleted file mode 100755
> index eea99107a48..00000000000
> --- a/t/t0009-prio-queue.sh
> +++ /dev/null
> @@ -1,66 +0,0 @@
> -#!/bin/sh
> -
> -test_description='basic tests for priority queue implementation'
> -
> -TEST_PASSES_SANITIZE_LEAK=true
> -. ./test-lib.sh
> -
> -cat >expect <<'EOF'
> -1
> -2
> -3
> -4
> -5
> -5
> -6
> -7
> -8
> -9
> -10
> -EOF
> -test_expect_success 'basic ordering' '
> -       test-tool prio-queue 2 6 3 10 9 5 7 4 5 8 1 dump >actual &&
> -       test_cmp expect actual
> -'
> -
> -cat >expect <<'EOF'
> -2
> -3
> -4
> -1
> -5
> -6
> -EOF
> -test_expect_success 'mixed put and get' '
> -       test-tool prio-queue 6 2 4 get 5 3 get get 1 dump >actual &&
> -       test_cmp expect actual
> -'
> -
> -cat >expect <<'EOF'
> -1
> -2
> -NULL
> -1
> -2
> -NULL
> -EOF
> -test_expect_success 'notice empty queue' '
> -       test-tool prio-queue 1 2 get get get 1 2 get get get >actual &&
> -       test_cmp expect actual
> -'
> -
> -cat >expect <<'EOF'
> -3
> -2
> -6
> -4
> -5
> -1
> -8
> -EOF
> -test_expect_success 'stack order' '
> -       test-tool prio-queue stack 8 1 5 4 6 2 3 dump >actual &&
> -       test_cmp expect actual
> -'
> -
> -test_done
> diff --git a/t/unit-tests/t-prio-queue.c b/t/unit-tests/t-prio-queue.c
> new file mode 100644
> index 00000000000..98a69790f7e
> --- /dev/null
> +++ b/t/unit-tests/t-prio-queue.c
> @@ -0,0 +1,99 @@
> +#include "test-lib.h"
> +#include "prio-queue.h"
> +
> +static int intcmp(const void *va, const void *vb, void *data UNUSED)
> +{
> +       const int *a = va, *b = vb;
> +       return *a - *b;
> +}
> +
> +static int show(int *v)
> +{
> +       int ret = -1;
> +       if (v)
> +               ret = *v;
> +       free(v);
> +       return ret;
> +}
> +
> +static int test_prio_queue(const char **input, int *result)
> +{
> +       struct prio_queue pq = { intcmp };
> +       int i = 0;
> +
> +       while (*input) {
> +               if (!strcmp(*input, "get")) {
> +                       void *peek = prio_queue_peek(&pq);
> +                       void *get = prio_queue_get(&pq);
> +                       if (peek != get)
> +                               BUG("peek and get results do not match");
> +                       result[i++] = show(get);
> +               } else if (!strcmp(*input, "dump")) {
> +                       void *peek;
> +                       void *get;
> +                       while ((peek = prio_queue_peek(&pq))) {
> +                               get = prio_queue_get(&pq);
> +                               if (peek != get)
> +                                       BUG("peek and get results do not match");
> +                               result[i++] = show(get);
> +                       }
> +               } else if (!strcmp(*input, "stack")) {
> +                       pq.compare = NULL;
> +               } else if (!strcmp(*input, "reverse")) {
> +                       prio_queue_reverse(&pq);
> +               } else {
> +                       int *v = xmalloc(sizeof(*v));
> +                       *v = atoi(*input);
> +                       prio_queue_put(&pq, v);
> +               }
> +               input++;
> +       }
> +
> +       clear_prio_queue(&pq);
> +
> +       return 0;
> +}
> +
> +#define INPUT_SIZE 6
> +
> +#define BASIC_INPUT "1", "2", "3", "4", "5", "5", "dump"
> +#define BASIC_EXPECTED 1, 2, 3, 4, 5, 5
> +
> +#define MIXED_PUT_GET_INPUT "6", "2", "4", "get", "5", "3", "get", "get", "1", "dump"
> +#define MIXED_PUT_GET_EXPECTED 2, 3, 4, 1, 5, 6
> +
> +#define EMPTY_QUEUE_INPUT "1", "2", "get", "get", "get", "1", "2", "get", "get", "get"
> +#define EMPTY_QUEUE_EXPECTED 1, 2, -1, 1, 2, -1
> +
> +#define STACK_INPUT "stack", "1", "5", "4", "6", "2", "3", "dump"
> +#define STACK_EXPECTED 3, 2, 6, 4, 5, 1
> +
> +#define REVERSE_STACK_INPUT "stack", "1", "2", "3", "4", "5", "6", "reverse", "dump"
> +#define REVERSE_STACK_EXPECTED 1, 2, 3, 4, 5, 6
> +
> +#define TEST_INPUT(INPUT, EXPECTED, name)              \
> +  static void test_##name(void)                                        \
> +{                                                                                      \
> +       const char *input[] = {INPUT};                                          \
> +       int expected[] = {EXPECTED};                                    \
> +       int result[INPUT_SIZE];                                                         \
> +       test_prio_queue(input, result);                                         \
> +       check(!memcmp(expected, result, sizeof(expected)));     \
> +}
> +
> +TEST_INPUT(BASIC_INPUT, BASIC_EXPECTED, basic)
> +TEST_INPUT(MIXED_PUT_GET_INPUT, MIXED_PUT_GET_EXPECTED, mixed)
> +TEST_INPUT(EMPTY_QUEUE_INPUT, EMPTY_QUEUE_EXPECTED, empty)
> +TEST_INPUT(STACK_INPUT, STACK_EXPECTED, stack)
> +TEST_INPUT(REVERSE_STACK_INPUT, REVERSE_STACK_EXPECTED, reverse)
> +
> +int cmd_main(int argc, const char **argv)
> +{
> +       TEST(test_basic(), "prio-queue works for basic input");
> +       TEST(test_mixed(), "prio-queue works for mixed put & get commands");
> +       TEST(test_empty(), "prio-queue works when queue is empty");
> +       TEST(test_stack(), "prio-queue works when used as a LIFO stack");
> +       TEST(test_reverse(), "prio-queue works when LIFO stack is reversed");
> +
> +       return test_done();
> +}
>
> base-commit: 1a87c842ece327d03d08096395969aca5e0a6996
> --
> gitgitgadget

^ permalink raw reply

* [PATCH] tests: move t0009-prio-queue.sh to the new unit testing framework
From: Chandra Pratap via GitGitGadget @ 2024-01-14  8:10 UTC (permalink / raw)
  To: git; +Cc: Chandra Pratap, Chandra Pratap

From: Chandra Pratap <chandrapratap3519@gmail.com>

t/t0009-prio-queue.sh along with t/helper/test-prio-queue.c unit
tests Git's implementation of a priority queue. Migrate the
test over to the new unit testing framework to simplify debugging
and reduce test run-time. Refactor the required logic and add
a new test case in addition to porting over the original ones in
shell.

Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
---
    tests: move t0009-prio-queue.sh to the new unit testing framework

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1642%2FChand-ra%2Fprio-queue-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1642/Chand-ra/prio-queue-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1642

 Makefile                    |  2 +-
 t/helper/test-prio-queue.c  | 51 -------------------
 t/helper/test-tool.c        |  1 -
 t/helper/test-tool.h        |  1 -
 t/t0009-prio-queue.sh       | 66 -------------------------
 t/unit-tests/t-prio-queue.c | 99 +++++++++++++++++++++++++++++++++++++
 6 files changed, 100 insertions(+), 120 deletions(-)
 delete mode 100644 t/helper/test-prio-queue.c
 delete mode 100755 t/t0009-prio-queue.sh
 create mode 100644 t/unit-tests/t-prio-queue.c

diff --git a/Makefile b/Makefile
index 312d95084c1..d5e48e656b3 100644
--- a/Makefile
+++ b/Makefile
@@ -828,7 +828,6 @@ TEST_BUILTINS_OBJS += test-partial-clone.o
 TEST_BUILTINS_OBJS += test-path-utils.o
 TEST_BUILTINS_OBJS += test-pcre2-config.o
 TEST_BUILTINS_OBJS += test-pkt-line.o
-TEST_BUILTINS_OBJS += test-prio-queue.o
 TEST_BUILTINS_OBJS += test-proc-receive.o
 TEST_BUILTINS_OBJS += test-progress.o
 TEST_BUILTINS_OBJS += test-reach.o
@@ -1340,6 +1339,7 @@ THIRD_PARTY_SOURCES += sha1dc/%
 
 UNIT_TEST_PROGRAMS += t-basic
 UNIT_TEST_PROGRAMS += t-strbuf
+UNIT_TEST_PROGRAMS += t-prio-queue
 UNIT_TEST_PROGS = $(patsubst %,$(UNIT_TEST_BIN)/%$X,$(UNIT_TEST_PROGRAMS))
 UNIT_TEST_OBJS = $(patsubst %,$(UNIT_TEST_DIR)/%.o,$(UNIT_TEST_PROGRAMS))
 UNIT_TEST_OBJS += $(UNIT_TEST_DIR)/test-lib.o
diff --git a/t/helper/test-prio-queue.c b/t/helper/test-prio-queue.c
deleted file mode 100644
index f0bf255f5f0..00000000000
--- a/t/helper/test-prio-queue.c
+++ /dev/null
@@ -1,51 +0,0 @@
-#include "test-tool.h"
-#include "prio-queue.h"
-
-static int intcmp(const void *va, const void *vb, void *data UNUSED)
-{
-	const int *a = va, *b = vb;
-	return *a - *b;
-}
-
-static void show(int *v)
-{
-	if (!v)
-		printf("NULL\n");
-	else
-		printf("%d\n", *v);
-	free(v);
-}
-
-int cmd__prio_queue(int argc UNUSED, const char **argv)
-{
-	struct prio_queue pq = { intcmp };
-
-	while (*++argv) {
-		if (!strcmp(*argv, "get")) {
-			void *peek = prio_queue_peek(&pq);
-			void *get = prio_queue_get(&pq);
-			if (peek != get)
-				BUG("peek and get results do not match");
-			show(get);
-		} else if (!strcmp(*argv, "dump")) {
-			void *peek;
-			void *get;
-			while ((peek = prio_queue_peek(&pq))) {
-				get = prio_queue_get(&pq);
-				if (peek != get)
-					BUG("peek and get results do not match");
-				show(get);
-			}
-		} else if (!strcmp(*argv, "stack")) {
-			pq.compare = NULL;
-		} else {
-			int *v = xmalloc(sizeof(*v));
-			*v = atoi(*argv);
-			prio_queue_put(&pq, v);
-		}
-	}
-
-	clear_prio_queue(&pq);
-
-	return 0;
-}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 876cd2dc313..5f591b9718f 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -57,7 +57,6 @@ static struct test_cmd cmds[] = {
 	{ "path-utils", cmd__path_utils },
 	{ "pcre2-config", cmd__pcre2_config },
 	{ "pkt-line", cmd__pkt_line },
-	{ "prio-queue", cmd__prio_queue },
 	{ "proc-receive", cmd__proc_receive },
 	{ "progress", cmd__progress },
 	{ "reach", cmd__reach },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index 70dd4eba119..b921138d8ec 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -50,7 +50,6 @@ int cmd__partial_clone(int argc, const char **argv);
 int cmd__path_utils(int argc, const char **argv);
 int cmd__pcre2_config(int argc, const char **argv);
 int cmd__pkt_line(int argc, const char **argv);
-int cmd__prio_queue(int argc, const char **argv);
 int cmd__proc_receive(int argc, const char **argv);
 int cmd__progress(int argc, const char **argv);
 int cmd__reach(int argc, const char **argv);
diff --git a/t/t0009-prio-queue.sh b/t/t0009-prio-queue.sh
deleted file mode 100755
index eea99107a48..00000000000
--- a/t/t0009-prio-queue.sh
+++ /dev/null
@@ -1,66 +0,0 @@
-#!/bin/sh
-
-test_description='basic tests for priority queue implementation'
-
-TEST_PASSES_SANITIZE_LEAK=true
-. ./test-lib.sh
-
-cat >expect <<'EOF'
-1
-2
-3
-4
-5
-5
-6
-7
-8
-9
-10
-EOF
-test_expect_success 'basic ordering' '
-	test-tool prio-queue 2 6 3 10 9 5 7 4 5 8 1 dump >actual &&
-	test_cmp expect actual
-'
-
-cat >expect <<'EOF'
-2
-3
-4
-1
-5
-6
-EOF
-test_expect_success 'mixed put and get' '
-	test-tool prio-queue 6 2 4 get 5 3 get get 1 dump >actual &&
-	test_cmp expect actual
-'
-
-cat >expect <<'EOF'
-1
-2
-NULL
-1
-2
-NULL
-EOF
-test_expect_success 'notice empty queue' '
-	test-tool prio-queue 1 2 get get get 1 2 get get get >actual &&
-	test_cmp expect actual
-'
-
-cat >expect <<'EOF'
-3
-2
-6
-4
-5
-1
-8
-EOF
-test_expect_success 'stack order' '
-	test-tool prio-queue stack 8 1 5 4 6 2 3 dump >actual &&
-	test_cmp expect actual
-'
-
-test_done
diff --git a/t/unit-tests/t-prio-queue.c b/t/unit-tests/t-prio-queue.c
new file mode 100644
index 00000000000..98a69790f7e
--- /dev/null
+++ b/t/unit-tests/t-prio-queue.c
@@ -0,0 +1,99 @@
+#include "test-lib.h"
+#include "prio-queue.h"
+
+static int intcmp(const void *va, const void *vb, void *data UNUSED)
+{
+	const int *a = va, *b = vb;
+	return *a - *b;
+}
+
+static int show(int *v)
+{
+	int ret = -1;
+	if (v)
+		ret = *v;
+	free(v);
+	return ret;
+}
+
+static int test_prio_queue(const char **input, int *result)
+{
+	struct prio_queue pq = { intcmp };
+	int i = 0;
+
+	while (*input) {
+		if (!strcmp(*input, "get")) {
+			void *peek = prio_queue_peek(&pq);
+			void *get = prio_queue_get(&pq);
+			if (peek != get)
+				BUG("peek and get results do not match");
+			result[i++] = show(get);
+		} else if (!strcmp(*input, "dump")) {
+			void *peek;
+			void *get;
+			while ((peek = prio_queue_peek(&pq))) {
+				get = prio_queue_get(&pq);
+				if (peek != get)
+					BUG("peek and get results do not match");
+				result[i++] = show(get);
+			}
+		} else if (!strcmp(*input, "stack")) {
+			pq.compare = NULL;
+		} else if (!strcmp(*input, "reverse")) {
+			prio_queue_reverse(&pq);
+		} else {
+			int *v = xmalloc(sizeof(*v));
+			*v = atoi(*input);
+			prio_queue_put(&pq, v);
+		}
+		input++;
+	}
+
+	clear_prio_queue(&pq);
+
+	return 0;
+}
+
+#define INPUT_SIZE 6
+
+#define BASIC_INPUT "1", "2", "3", "4", "5", "5", "dump"
+#define BASIC_EXPECTED 1, 2, 3, 4, 5, 5
+
+#define MIXED_PUT_GET_INPUT "6", "2", "4", "get", "5", "3", "get", "get", "1", "dump"
+#define MIXED_PUT_GET_EXPECTED 2, 3, 4, 1, 5, 6
+
+#define EMPTY_QUEUE_INPUT "1", "2", "get", "get", "get", "1", "2", "get", "get", "get"
+#define EMPTY_QUEUE_EXPECTED 1, 2, -1, 1, 2, -1
+
+#define STACK_INPUT "stack", "1", "5", "4", "6", "2", "3", "dump"
+#define STACK_EXPECTED 3, 2, 6, 4, 5, 1
+
+#define REVERSE_STACK_INPUT "stack", "1", "2", "3", "4", "5", "6", "reverse", "dump"
+#define REVERSE_STACK_EXPECTED 1, 2, 3, 4, 5, 6
+
+#define TEST_INPUT(INPUT, EXPECTED, name)		\
+  static void test_##name(void)					\
+{											\
+	const char *input[] = {INPUT};						\
+	int expected[] = {EXPECTED};					\
+	int result[INPUT_SIZE];								\
+	test_prio_queue(input, result);						\
+	check(!memcmp(expected, result, sizeof(expected)));	\
+}
+
+TEST_INPUT(BASIC_INPUT, BASIC_EXPECTED, basic)
+TEST_INPUT(MIXED_PUT_GET_INPUT, MIXED_PUT_GET_EXPECTED, mixed)
+TEST_INPUT(EMPTY_QUEUE_INPUT, EMPTY_QUEUE_EXPECTED, empty)
+TEST_INPUT(STACK_INPUT, STACK_EXPECTED, stack)
+TEST_INPUT(REVERSE_STACK_INPUT, REVERSE_STACK_EXPECTED, reverse)
+
+int cmd_main(int argc, const char **argv)
+{
+	TEST(test_basic(), "prio-queue works for basic input");
+	TEST(test_mixed(), "prio-queue works for mixed put & get commands");
+	TEST(test_empty(), "prio-queue works when queue is empty");
+	TEST(test_stack(), "prio-queue works when used as a LIFO stack");
+	TEST(test_reverse(), "prio-queue works when LIFO stack is reversed");
+
+	return test_done();
+}

base-commit: 1a87c842ece327d03d08096395969aca5e0a6996
-- 
gitgitgadget

^ permalink raw reply related

* Re: What's cooking in git.git (Jan 2024, #01; Tue, 2)
From: SZEDER Gábor @ 2024-01-13 23:41 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Junio C Hamano, git
In-Reply-To: <ZaMJU6MJ5wZxyLeM@nand.local>

On Sat, Jan 13, 2024 at 05:06:11PM -0500, Taylor Blau wrote:
> Hi Gábor,
> 
> Thanks very much for your patience while I figure all of this out. I'm
> confident that with the fixup! below that your test passes in the next
> round of this series.
> 
> On Sat, Jan 13, 2024 at 07:35:44PM +0100, SZEDER Gábor wrote:
> > On Wed, Jan 03, 2024 at 10:08:18AM -0800, Junio C Hamano wrote:
> > > Taylor Blau <me@ttaylorr.com> writes:
> > >
> > > >> * tb/path-filter-fix (2023-10-18) 17 commits
> > > >>  - bloom: introduce `deinit_bloom_filters()`
> > > >>  ...
> > > >>  - t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()`
> > > >>
> > > >>  The Bloom filter used for path limited history traversal was broken
> > > >>  on systems whose "char" is unsigned; update the implementation and
> > > >>  bump the format version to 2.
> > > >>
> > > >>  Expecting a reroll.
> > > >>  cf. <20231023202212.GA5470@szeder.dev>
> > > >>  source: <cover.1697653929.git.me@ttaylorr.com>
> > > >
> > > > I was confused by this one, since I couldn't figure out which tests
> > > > Gábor was referring to here. I responded in [1], but haven't heard back
> > > > since the end of October.
> > > > ...
> > > > [1]: https://lore.kernel.org/git/ZUARCJ1MmqgXfS4i@nand.local/
> >
> > I keep referring to the test in:
> >
> >   https://public-inbox.org/git/20230830200218.GA5147@szeder.dev/
> >
> > which, rather disappointingly, is still the only test out there
> > exercising the interaction between split commit graphs and different
> > modified path Bloom filter versions.  Note that in that message I
> > mentioned that merging layers with differenet Bloom filter versions
> > seemed to work, but that, alas, is no longer the case, because it's
> > now broken in Taylor's recent iterations of the patch series.
> 
> Thanks for clarifying. With all of the different versions of this series
> and the couple of tests that you and I were talking about, I think that
> I got confused and thought you were referring to a different test.
> 
> Indeed, the current version of this series (v4) does not pass the test
> in <20230830200218.GA5147@szeder.dev>, but the fix in order for it to
> pass is relatively straightforward.
> 
> I have this on top of my local branch as a fixup! and I'll squash it
> down on Tuesday[^1] and send a reroll after double-checking that the fix
> is correct and doesn't conflict with any other parts of the series.
> 
> Since it's been so long since I've looked at this, I'll re-read the
> series as a whole with fresh eyes to make sure that everything still
> makes sense.
> 
> In any case, here's the patch on top (with a lightly modified version of
> the test you wrote in <20230830200218.GA5147@szeder.dev>:

I certainly hope that I'm just misunderstanding, and you don't
actually imply that this one test in its current form would qualify as
thorough testing... :)


> Subject: [PATCH] fixup! commit-graph: ensure Bloom filters are read with
>  consistent settings
> 
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>  commit-graph.c       |  3 ++-
>  t/t4216-log-bloom.sh | 29 ++++++++++++++++++++++++++++-
>  2 files changed, 30 insertions(+), 2 deletions(-)
> 
> diff --git a/commit-graph.c b/commit-graph.c
> index 60fa64d956..82a4632c4e 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -507,7 +507,8 @@ static void validate_mixed_bloom_settings(struct commit_graph *g)
>  		}
> 
>  		if (g->bloom_filter_settings->bits_per_entry != settings->bits_per_entry ||
> -		    g->bloom_filter_settings->num_hashes != settings->num_hashes) {
> +		    g->bloom_filter_settings->num_hashes != settings->num_hashes ||
> +		    g->bloom_filter_settings->hash_version != settings->hash_version) {
>  			g->chunk_bloom_indexes = NULL;
>  			g->chunk_bloom_data = NULL;
>  			FREE_AND_NULL(g->bloom_filter_settings);
> diff --git a/t/t4216-log-bloom.sh b/t/t4216-log-bloom.sh
> index 569f2b6f8b..d8c42e2e27 100755
> --- a/t/t4216-log-bloom.sh
> +++ b/t/t4216-log-bloom.sh
> @@ -438,7 +438,7 @@ test_expect_success 'setup for mixed Bloom setting tests' '
>  	done
>  '
> 
> -test_expect_success 'ensure incompatible Bloom filters are ignored' '
> +test_expect_success 'ensure Bloom filters with incompatible settings are ignored' '
>  	# Compute Bloom filters with "unusual" settings.
>  	git -C $repo rev-parse one >in &&
>  	GIT_TEST_BLOOM_SETTINGS_NUM_HASHES=3 git -C $repo commit-graph write \
> @@ -488,6 +488,33 @@ test_expect_success 'merge graph layers with incompatible Bloom settings' '
>  	test_must_be_empty err
>  '
> 
> +test_expect_success 'ensure Bloom filter with incompatible versions are ignored' '
> +	rm "$repo/$graph" &&
> +
> +	git -C $repo log --oneline --no-decorate -- $CENT >expect &&
> +
> +	# Compute v1 Bloom filters for commits at the bottom.
> +	git -C $repo rev-parse HEAD^ >in &&
> +	git -C $repo commit-graph write --stdin-commits --changed-paths \
> +		--split <in &&
> +
> +	# Compute v2 Bloomfilters for the rest of the commits at the top.
> +	git -C $repo rev-parse HEAD >in &&
> +	git -C $repo -c commitGraph.changedPathsVersion=2 commit-graph write \
> +		--stdin-commits --changed-paths --split=no-merge <in &&
> +
> +	test_line_count = 2 $repo/$chain &&
> +
> +	git -C $repo log --oneline --no-decorate -- $CENT >actual 2>err &&
> +	test_cmp expect actual &&
> +
> +	layer="$(head -n 1 $repo/$chain)" &&
> +	cat >expect.err <<-EOF &&
> +	warning: disabling Bloom filters for commit-graph layer $SQ$layer$SQ due to incompatible settings
> +	EOF
> +	test_cmp expect.err err
> +'
> +
>  get_first_changed_path_filter () {
>  	test-tool read-graph bloom-filters >filters.dat &&
>  	head -n 1 filters.dat
> 
> --
> 2.38.0.16.g393fd4c6db
> 
> (Excuse the old version of Git here, I'm in the middle of a long-running
> process which checks out older versions of the tree to count the number
> of unused-parameters over time.)
> 
> Thanks,
> Taylor
> 
> [^1]: Monday is a US holiday, so I likely won't be at my computer.

^ permalink raw reply

* Re: What's cooking in git.git (Jan 2024, #01; Tue, 2)
From: SZEDER Gábor @ 2024-01-13 22:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, git
In-Reply-To: <20240113183544.GA3000857@szeder.dev>

On Sat, Jan 13, 2024 at 07:35:44PM +0100, SZEDER Gábor wrote:
> On Wed, Jan 03, 2024 at 10:08:18AM -0800, Junio C Hamano wrote:
> > Taylor Blau <me@ttaylorr.com> writes:
> > 
> > >> * tb/path-filter-fix (2023-10-18) 17 commits
> > >>  - bloom: introduce `deinit_bloom_filters()`
> > >>  ...
> > >>  - t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()`
> > >>
> > >>  The Bloom filter used for path limited history traversal was broken
> > >>  on systems whose "char" is unsigned; update the implementation and
> > >>  bump the format version to 2.
> > >>
> > >>  Expecting a reroll.
> > >>  cf. <20231023202212.GA5470@szeder.dev>
> > >>  source: <cover.1697653929.git.me@ttaylorr.com>
> > >
> > > I was confused by this one, since I couldn't figure out which tests
> > > Gábor was referring to here. I responded in [1], but haven't heard back
> > > since the end of October.
> > > ...
> > > [1]: https://lore.kernel.org/git/ZUARCJ1MmqgXfS4i@nand.local/
> 
> I keep referring to the test in:
> 
>   https://public-inbox.org/git/20230830200218.GA5147@szeder.dev/
> 
> which, rather disappointingly, is still the only test out there
> exercising the interaction between split commit graphs and different
> modified path Bloom filter versions.  Note that in that message I
> mentioned that merging layers with differenet Bloom filter versions
> seemed to work, but that, alas, is no longer the case, because it's
> now broken in Taylor's recent iterations of the patch series.
> 
> At the risk of sounding like a broken record: the interaction of split
> commit graphs and different Bloom filter versions should be thoroughly
> tested.

On a related note, if current git (I tried current master and v2.43.0)
encounters a commit graph layer containing v2 Bloom filters (created
by current seen) while writing a new commit graph, then it segfaults
dereferencing a NULL 'settings' pointer in
get_or_compute_bloom_filter().

The test below demonstrates this, but it's quite hacky using two
different git versions: it has to be run by an old git version not yet
supporting v2 Bloom filters, and a new git version already supporting
them should be installed at /tmp/git-new/.


diff --git a/t/t4216-log-bloom.sh b/t/t4216-log-bloom.sh
index 2ba0324a69..0464dd68d5 100755
--- a/t/t4216-log-bloom.sh
+++ b/t/t4216-log-bloom.sh
@@ -454,4 +454,33 @@ test_expect_success 'Bloom reader notices out-of-order index offsets' '
 	test_cmp expect.err err
 '
 
+CENT=$(printf "\302\242")
+test_expect_success 'split commit graph vs changed paths Bloom filter v2 vs old git' '
+	git init split-v2-old &&
+	(
+		cd split-v2-old &&
+		git commit --allow-empty -m "Bloom filters are written but still ignored for root commits :(" &&
+		for i in 1 2 3
+		do
+			echo $i >$CENT &&
+			git add $CENT &&
+			git commit -m "$i" || return 1
+		done &&
+		git log --oneline -- $CENT >expect &&
+
+		# Here we write a commit graph layer containing v2 changed
+		# path Bloom filters using a git binary built from current
+		# 'seen' branch.
+		git rev-parse HEAD^ |
+		/tmp/git-new/bin/git -c commitgraph.changedPathsVersion=2 \
+			commit-graph write --stdin-commits --changed-paths --split &&
+
+		# This is current master, and segfaults.
+		git commit-graph write --reachable --changed-paths &&
+
+		git log --oneline -- $CENT >actual &&
+		test_cmp expect actual
+	)
+'
+
 test_done



(gdb) r
Starting program: /home/szeder/src/git/git commit-graph write --reachable --changed-paths
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00005555556b8714 in get_or_compute_bloom_filter (r=0x555555a0e5a0 <the_repo>, c=0x555555a1cdd0, compute_if_not_present=1, settings=0x0, computed=0x7fffffffd634) at bloom.c:253
253		diffopt.max_changes = settings->max_changed_paths;
(gdb) l
248			return NULL;
249	
250		repo_diff_setup(r, &diffopt);
251		diffopt.flags.recursive = 1;
252		diffopt.detect_rename = 0;
253		diffopt.max_changes = settings->max_changed_paths;
254		diff_setup_done(&diffopt);
255	
256		/* ensure commit is parsed so we have parent information */
257		repo_parse_commit(r, c);
(gdb) bt
#0  0x00005555556b8714 in get_or_compute_bloom_filter (
    r=0x555555a0e5a0 <the_repo>, c=0x555555a1cdd0, compute_if_not_present=1, 
    settings=0x0, computed=0x7fffffffd634) at bloom.c:253
#1  0x00005555556d16d5 in compute_bloom_filters (ctx=0x555555a1c200)
    at commit-graph.c:1783
#2  0x00005555556d4329 in write_commit_graph (odb=0x555555a19210, 
    pack_indexes=0x0, commits=0x7fffffffd7c0, 
    flags=(COMMIT_GRAPH_WRITE_PROGRESS | COMMIT_GRAPH_WRITE_BLOOM_FILTERS), 
    opts=0x5555559ef8d0 <write_opts>) at commit-graph.c:2603
#3  0x00005555556d19ab in write_commit_graph_reachable (odb=0x555555a19210, 
    flags=(COMMIT_GRAPH_WRITE_PROGRESS | COMMIT_GRAPH_WRITE_BLOOM_FILTERS), 
    opts=0x5555559ef8d0 <write_opts>) at commit-graph.c:1849
#4  0x00005555555a63c2 in graph_write (argc=0, argv=0x7fffffffde20, prefix=0x0)
    at builtin/commit-graph.c:294
#5  0x00005555555a66f4 in cmd_commit_graph (argc=3, argv=0x7fffffffde20, 
    prefix=0x0) at builtin/commit-graph.c:353
#6  0x0000555555575b43 in run_builtin (p=0x5555559da260 <commands+576>, 
    argc=4, argv=0x7fffffffde20) at git.c:469
#7  0x0000555555575fcb in handle_builtin (argc=4, argv=0x7fffffffde20)
    at git.c:724
#8  0x0000555555576272 in run_argv (argcp=0x7fffffffdc7c, argv=0x7fffffffdc70)
    at git.c:788
#9  0x000055555557685d in cmd_main (argc=4, argv=0x7fffffffde20) at git.c:923
#10 0x000055555568ad55 in main (argc=5, argv=0x7fffffffde18)
    at common-main.c:62



^ permalink raw reply related

* Re: [PATCH v2 2/2] t5541: remove lockfile creation
From: Justin Tobler @ 2024-01-13 22:25 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, Justin Tobler via GitGitGadget, git,
	Patrick Steinhardt
In-Reply-To: <xmqqo7dqbfin.fsf@gitster.g>

Great! Thanks everyone for the help!
-Justin

On Fri, Jan 12, 2024 at 11:58 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Jeff King <peff@peff.net> writes:
>
> > On Thu, Jan 11, 2024 at 08:24:30PM +0000, Justin Tobler via GitGitGadget wrote:
> >
> >> -    # the new branch should not have been created upstream
> >> -    test_must_fail git -C "$d" show-ref --verify refs/heads/atomic &&
> >> -
> >> -    # upstream should still reflect atomic2, the last thing we pushed
> >> -    # successfully
> >> -    git rev-parse atomic2 >expected &&
> >> -    # ...to other.
> >> -    git -C "$d" rev-parse refs/heads/other >actual &&
> >> -    test_cmp expected actual &&
> >> -
> >> -    # the new branch should not have been created upstream
> >> +    # The atomic and other branches should be created upstream.
> >>      test_must_fail git -C "$d" show-ref --verify refs/heads/atomic &&
> >> +    test_must_fail git -C "$d" show-ref --verify refs/heads/other &&
> >
> > This last comment should say "should not be created", I think?
> >
> > Other than that, both patches look good to me.
>
> Thanks.  Will queue with the following and "Acked-by: peff".
>
> diff --git c/t/t5541-http-push-smart.sh w/t/t5541-http-push-smart.sh
> index 9a8bed6c32..71428f3d5c 100755
> --- c/t/t5541-http-push-smart.sh
> +++ w/t/t5541-http-push-smart.sh
> @@ -242,7 +242,7 @@ test_expect_success 'push --atomic fails on server-side errors' '
>         # --atomic should cause entire push to be rejected
>         test_must_fail git push --atomic "$up" atomic other 2>output  &&
>
> -       # The atomic and other branches should be created upstream.
> +       # The atomic and other branches should not be created upstream.
>         test_must_fail git -C "$d" show-ref --verify refs/heads/atomic &&
>         test_must_fail git -C "$d" show-ref --verify refs/heads/other &&
>
>

^ permalink raw reply

* Re: What's cooking in git.git (Jan 2024, #01; Tue, 2)
From: Taylor Blau @ 2024-01-13 22:06 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Junio C Hamano, git
In-Reply-To: <20240113183544.GA3000857@szeder.dev>

Hi Gábor,

Thanks very much for your patience while I figure all of this out. I'm
confident that with the fixup! below that your test passes in the next
round of this series.

On Sat, Jan 13, 2024 at 07:35:44PM +0100, SZEDER Gábor wrote:
> On Wed, Jan 03, 2024 at 10:08:18AM -0800, Junio C Hamano wrote:
> > Taylor Blau <me@ttaylorr.com> writes:
> >
> > >> * tb/path-filter-fix (2023-10-18) 17 commits
> > >>  - bloom: introduce `deinit_bloom_filters()`
> > >>  ...
> > >>  - t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()`
> > >>
> > >>  The Bloom filter used for path limited history traversal was broken
> > >>  on systems whose "char" is unsigned; update the implementation and
> > >>  bump the format version to 2.
> > >>
> > >>  Expecting a reroll.
> > >>  cf. <20231023202212.GA5470@szeder.dev>
> > >>  source: <cover.1697653929.git.me@ttaylorr.com>
> > >
> > > I was confused by this one, since I couldn't figure out which tests
> > > Gábor was referring to here. I responded in [1], but haven't heard back
> > > since the end of October.
> > > ...
> > > [1]: https://lore.kernel.org/git/ZUARCJ1MmqgXfS4i@nand.local/
>
> I keep referring to the test in:
>
>   https://public-inbox.org/git/20230830200218.GA5147@szeder.dev/
>
> which, rather disappointingly, is still the only test out there
> exercising the interaction between split commit graphs and different
> modified path Bloom filter versions.  Note that in that message I
> mentioned that merging layers with differenet Bloom filter versions
> seemed to work, but that, alas, is no longer the case, because it's
> now broken in Taylor's recent iterations of the patch series.

Thanks for clarifying. With all of the different versions of this series
and the couple of tests that you and I were talking about, I think that
I got confused and thought you were referring to a different test.

Indeed, the current version of this series (v4) does not pass the test
in <20230830200218.GA5147@szeder.dev>, but the fix in order for it to
pass is relatively straightforward.

I have this on top of my local branch as a fixup! and I'll squash it
down on Tuesday[^1] and send a reroll after double-checking that the fix
is correct and doesn't conflict with any other parts of the series.

Since it's been so long since I've looked at this, I'll re-read the
series as a whole with fresh eyes to make sure that everything still
makes sense.

In any case, here's the patch on top (with a lightly modified version of
the test you wrote in <20230830200218.GA5147@szeder.dev>:

Subject: [PATCH] fixup! commit-graph: ensure Bloom filters are read with
 consistent settings

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 commit-graph.c       |  3 ++-
 t/t4216-log-bloom.sh | 29 ++++++++++++++++++++++++++++-
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c
index 60fa64d956..82a4632c4e 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -507,7 +507,8 @@ static void validate_mixed_bloom_settings(struct commit_graph *g)
 		}

 		if (g->bloom_filter_settings->bits_per_entry != settings->bits_per_entry ||
-		    g->bloom_filter_settings->num_hashes != settings->num_hashes) {
+		    g->bloom_filter_settings->num_hashes != settings->num_hashes ||
+		    g->bloom_filter_settings->hash_version != settings->hash_version) {
 			g->chunk_bloom_indexes = NULL;
 			g->chunk_bloom_data = NULL;
 			FREE_AND_NULL(g->bloom_filter_settings);
diff --git a/t/t4216-log-bloom.sh b/t/t4216-log-bloom.sh
index 569f2b6f8b..d8c42e2e27 100755
--- a/t/t4216-log-bloom.sh
+++ b/t/t4216-log-bloom.sh
@@ -438,7 +438,7 @@ test_expect_success 'setup for mixed Bloom setting tests' '
 	done
 '

-test_expect_success 'ensure incompatible Bloom filters are ignored' '
+test_expect_success 'ensure Bloom filters with incompatible settings are ignored' '
 	# Compute Bloom filters with "unusual" settings.
 	git -C $repo rev-parse one >in &&
 	GIT_TEST_BLOOM_SETTINGS_NUM_HASHES=3 git -C $repo commit-graph write \
@@ -488,6 +488,33 @@ test_expect_success 'merge graph layers with incompatible Bloom settings' '
 	test_must_be_empty err
 '

+test_expect_success 'ensure Bloom filter with incompatible versions are ignored' '
+	rm "$repo/$graph" &&
+
+	git -C $repo log --oneline --no-decorate -- $CENT >expect &&
+
+	# Compute v1 Bloom filters for commits at the bottom.
+	git -C $repo rev-parse HEAD^ >in &&
+	git -C $repo commit-graph write --stdin-commits --changed-paths \
+		--split <in &&
+
+	# Compute v2 Bloomfilters for the rest of the commits at the top.
+	git -C $repo rev-parse HEAD >in &&
+	git -C $repo -c commitGraph.changedPathsVersion=2 commit-graph write \
+		--stdin-commits --changed-paths --split=no-merge <in &&
+
+	test_line_count = 2 $repo/$chain &&
+
+	git -C $repo log --oneline --no-decorate -- $CENT >actual 2>err &&
+	test_cmp expect actual &&
+
+	layer="$(head -n 1 $repo/$chain)" &&
+	cat >expect.err <<-EOF &&
+	warning: disabling Bloom filters for commit-graph layer $SQ$layer$SQ due to incompatible settings
+	EOF
+	test_cmp expect.err err
+'
+
 get_first_changed_path_filter () {
 	test-tool read-graph bloom-filters >filters.dat &&
 	head -n 1 filters.dat

--
2.38.0.16.g393fd4c6db

(Excuse the old version of Git here, I'm in the middle of a long-running
process which checks out older versions of the tree to count the number
of unused-parameters over time.)

Thanks,
Taylor

[^1]: Monday is a US holiday, so I likely won't be at my computer.

^ permalink raw reply related

* Re: [PATCH v3 1/2] completion: refactor existence checks for pseudorefs
From: SZEDER Gábor @ 2024-01-13 21:07 UTC (permalink / raw)
  To: Stan Hu; +Cc: git, Patrick Steinhardt, Christian Couder
In-Reply-To: <29c928649aba7dfce4dab1b5d923bc23b7af2d32.1703022850.git.stanhu@gmail.com>

On Tue, Dec 19, 2023 at 02:14:17PM -0800, Stan Hu wrote:
> In preparation for the reftable backend, this commit introduces a
> '__git_pseudoref_exists' function that continues to use 'test -f' to
> determine whether a given pseudoref exists in the local filesystem.
> 
> Signed-off-by: Stan Hu <stanhu@gmail.com>
> ---
>  contrib/completion/git-completion.bash | 20 +++++++++++++++-----
>  1 file changed, 15 insertions(+), 5 deletions(-)
> 
> diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
> index 13a39ebd2e..8edd002eed 100644
> --- a/contrib/completion/git-completion.bash
> +++ b/contrib/completion/git-completion.bash
> @@ -122,6 +122,15 @@ __git ()
>  		${__git_dir:+--git-dir="$__git_dir"} "$@" 2>/dev/null
>  }
>  
> +# Runs git in $__git_repo_path to determine whether a pseudoref exists.
> +# 1: The pseudo-ref to search
> +__git_pseudoref_exists ()
> +{
> +	local ref=$1
> +
> +	[ -f "$__git_repo_path/$ref" ]

This new helper function relies on $__git_repo_path being set, but it
doesn't ensure that it's actually set by calling __git_find_repo_path.
Instead, it relies on its callers calling __git_find_repo_path,
although after this patch none of those callers directly access
$__git_repo_path anymore.

It would be better to call __git_find_repo_path in this helper to make
it more self-contained, and then the now unnecessary
__git_find_repo_path calls from the three callers can be removed.

Yeah, I know it's late, because this patch is already in master, but
this would be a good preparation step for eventual dedicated tests of
__git_pseudoref_exists that came up in:

  https://public-inbox.org/git/20240113191749.GB3000857@szeder.dev/

> +}
> +
>  # Removes backslash escaping, single quotes and double quotes from a word,
>  # stores the result in the variable $dequoted_word.
>  # 1: The word to dequote.
> @@ -1625,7 +1634,7 @@ __git_cherry_pick_inprogress_options=$__git_sequencer_inprogress_options
>  _git_cherry_pick ()
>  {
>  	__git_find_repo_path
> -	if [ -f "$__git_repo_path"/CHERRY_PICK_HEAD ]; then
> +	if __git_pseudoref_exists CHERRY_PICK_HEAD; then
>  		__gitcomp "$__git_cherry_pick_inprogress_options"
>  		return
>  	fi
> @@ -2067,7 +2076,7 @@ _git_log ()
>  	__git_find_repo_path
>  
>  	local merge=""
> -	if [ -f "$__git_repo_path/MERGE_HEAD" ]; then
> +	if __git_pseudoref_exists MERGE_HEAD; then
>  		merge="--merge"
>  	fi
>  	case "$prev,$cur" in
> @@ -2934,6 +2943,7 @@ _git_reset ()
>  
>  _git_restore ()
>  {
> +	__git_find_repo_path
>  	case "$prev" in
>  	-s)
>  		__git_complete_refs
> @@ -2952,7 +2962,7 @@ _git_restore ()
>  		__gitcomp_builtin restore
>  		;;
>  	*)
> -		if __git rev-parse --verify --quiet HEAD >/dev/null; then
> +		if __git_pseudoref_exists HEAD; then
>  			__git_complete_index_file "--modified"
>  		fi
>  	esac
> @@ -2963,7 +2973,7 @@ __git_revert_inprogress_options=$__git_sequencer_inprogress_options
>  _git_revert ()
>  {
>  	__git_find_repo_path
> -	if [ -f "$__git_repo_path"/REVERT_HEAD ]; then
> +	if __git_pseudoref_exists REVERT_HEAD; then
>  		__gitcomp "$__git_revert_inprogress_options"
>  		return
>  	fi
> @@ -3592,7 +3602,7 @@ __gitk_main ()
>  	__git_find_repo_path
>  
>  	local merge=""
> -	if [ -f "$__git_repo_path/MERGE_HEAD" ]; then
> +	if __git_pseudoref_exists MERGE_HEAD; then
>  		merge="--merge"
>  	fi
>  	case "$cur" in
> -- 
> 2.42.0
> 
> 

^ permalink raw reply

* Re: [PATCH 2/2] completion: silence pseudoref existence check
From: SZEDER Gábor @ 2024-01-13 19:17 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Stan Hu
In-Reply-To: <24563975fca8df6ae73917e9ee3534933d47c429.1704969119.git.ps@pks.im>

On Thu, Jan 11, 2024 at 11:41:59AM +0100, Patrick Steinhardt wrote:
> In 44dbb3bf29 (completion: support pseudoref existence checks for
> reftables, 2023-12-19), we have extended the Bash completion script to
> support future ref backends better by using git-rev-parse(1) to check
> for pseudo-ref existence. This conversion has introduced a bug, because
> even though we pass `--quiet` to git-rev-parse(1) it would still output
> the resolved object ID of the ref in question if it exists.
> 
> Fix this by redirecting its stdout to `/dev/null` and add a test that
> catches this behaviour. Note that the test passes even without the fix
> for the "files" backend because we parse pseudo refs via the filesystem
> directly in that case. But the test will fail with the "reftable"
> backend.
> 
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  contrib/completion/git-completion.bash | 2 +-
>  t/t9902-completion.sh                  | 8 ++++++++
>  2 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
> index 8c40ade494..8872192e2b 100644
> --- a/contrib/completion/git-completion.bash
> +++ b/contrib/completion/git-completion.bash
> @@ -146,7 +146,7 @@ __git_pseudoref_exists ()
>  	if __git_eread "$__git_repo_path/HEAD" head; then
>  		b="${head#ref: }"
>  		if [ "$b" == "refs/heads/.invalid" ]; then

Nit: I guess these last two lines above came from the git prompt
script, where we do need the name of the ref, but here we don't need
that, so the condition could have simply been

  if [ "$head" = "ref: refs/heads/.invalid" ]

With a single = instead of ==, because there is no pattern matching
here.

> -			__git -C "$__git_repo_path" rev-parse --verify --quiet "$ref" 2>/dev/null
> +			__git -C "$__git_repo_path" rev-parse --verify --quiet "$ref" >/dev/null 2>&1

You don't need the '-C $__git_repo_path' option and you don't have to
redirect stderr either, because the purpose of the __git wrapper
function is to take care of both.

>  			return $?
>  		fi
>  	fi
> diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh
> index 78cb93bea7..b14ae4de14 100755
> --- a/t/t9902-completion.sh
> +++ b/t/t9902-completion.sh
> @@ -1927,6 +1927,14 @@ test_expect_success 'git checkout - --orphan with branch already provided comple
>  	EOF
>  '
>  
> +test_expect_success 'git reset completes modified files' '

The description of the test case mentions 'git reset' ...

> +	test_commit A a.file &&
> +	echo B >a.file &&
> +	test_completion "git restore a." <<-\EOF

... but it invokes 'git restore'.

Anyway, I think it would be better to add a dedicated test or two to
exercise the __git_pseudoref_exists helper function instead (or
perhaps in addition).

> +	a.file
> +	EOF
> +'
> +
>  test_expect_success 'teardown after ref completion' '
>  	git branch -d matching-branch &&
>  	git tag -d matching-tag &&
> -- 
> 2.43.GIT
> 



^ permalink raw reply

* Re: What's cooking in git.git (Jan 2024, #01; Tue, 2)
From: SZEDER Gábor @ 2024-01-13 18:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Taylor Blau, git
In-Reply-To: <xmqqa5pm9tnx.fsf@gitster.g>

On Wed, Jan 03, 2024 at 10:08:18AM -0800, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
> 
> >> * tb/path-filter-fix (2023-10-18) 17 commits
> >>  - bloom: introduce `deinit_bloom_filters()`
> >>  ...
> >>  - t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()`
> >>
> >>  The Bloom filter used for path limited history traversal was broken
> >>  on systems whose "char" is unsigned; update the implementation and
> >>  bump the format version to 2.
> >>
> >>  Expecting a reroll.
> >>  cf. <20231023202212.GA5470@szeder.dev>
> >>  source: <cover.1697653929.git.me@ttaylorr.com>
> >
> > I was confused by this one, since I couldn't figure out which tests
> > Gábor was referring to here. I responded in [1], but haven't heard back
> > since the end of October.
> > ...
> > [1]: https://lore.kernel.org/git/ZUARCJ1MmqgXfS4i@nand.local/

I keep referring to the test in:

  https://public-inbox.org/git/20230830200218.GA5147@szeder.dev/

which, rather disappointingly, is still the only test out there
exercising the interaction between split commit graphs and different
modified path Bloom filter versions.  Note that in that message I
mentioned that merging layers with differenet Bloom filter versions
seemed to work, but that, alas, is no longer the case, because it's
now broken in Taylor's recent iterations of the patch series.

At the risk of sounding like a broken record: the interaction of split
commit graphs and different Bloom filter versions should be thoroughly
tested.

^ permalink raw reply

* Re: Suggested clarification for .gitattributes reference documentation
From: Matthias Aßhauer @ 2024-01-13  9:24 UTC (permalink / raw)
  To: Torsten Bögershausen
  Cc: Michael Litwak, brian m. carlson, git@vger.kernel.org
In-Reply-To: <20240113074323.GA6819@tb-raspi4>



On Sat, 13 Jan 2024, Torsten Bögershausen wrote:

> On Sat, Jan 13, 2024 at 02:56:27AM +0000, Michael Litwak wrote:
>> I just installed Git for Windows 2.43.0 and noticed the installer offers three options for altering the PATH:
>>
>> 1) Run git from git bash only
>>
>> 2) Run git from git bash, cmd.exe and PowerShell (RECOMMENDED)
>>
>> 3) Run git from git bash, cmd.exe and PowerShell with optional utilities (warning: will override find, sort and other system utilities).
>>
>> It turns out iconv.exe is accessible from cmd.exe (Command Prompt) only when you take the third option.  But iconv.exe is NOT optional.  It is required for git to deal with UTF-16LE with BOM text conversions (and probably for numerous other encoding conversions).

For end users directly calling iconv.exe is definitely optional.

> Plese wait a second - and thanks for bringing this up.
> To my knowledge the binary iconv.exe (or just iconv under non-Windows)
> is never called from Git itself.
> Git is using iconv_open() and friends, which are all inside
> a library, either the C-library "libc", or "libiconv"
> (not 100% sure about the naming here)

Exactly. I can't find a single instance of Git for Windows calling 
iconv.exe instead of using the corresponding library functions. [1]

And even if it did, iconv.exe is definitely on the path for git.exe [2] 
unless you're calling /(mingw|clangarm)(64|32)/bin/git.exe directly, in 
which case the solution is to call /cmd/git.exe instead.

[1] 
https://github.com/search?q=repo%3Agit-for-windows%2Fgit%20iconv%20NOT%20path%3A%2F%5Et%5C%2F%2F%20NOT%20path%3A%2F%5EDocumentation%5C%2F%2F&type=code
[2] 
https://github.com/git-for-windows/MINGW-packages/blob/0c91cf2079184ae6a604e8f7a406a47d39305e72/mingw-w64-git/git-wrapper.c#L166-L258

> iconv.exe is not needed in everyday life, or is it ?
> If yes, when ?
> iconv.exe is used when you run the test-suite, to verify
> what Git is doing.
>
> Could you elaborate a little bit more,
> when iconv.exe is missing, and what is happening, please ?
>
>>
>> But when PATH option #2 is chosen, and iconv.exe is unreachable from a Windows Command Prompt, the git commands which call upon iconv.exe do NOT indicate the error.  The call to iconv.exe fails silently.  It is only later after you commit, push and clone the repo again that you see the encoding failures.
>>
>> And the warning about overriding find and sort must be taken with a grain of salt, since the Windows versions of those programs are accessed via a Windows folder which appears earlier in the PATH.

We should probably consider rewording that warning.

>> So this Git for Windows installer screen is misleading.  And perhaps iconv.exe should be relocated so it is accessible even when PATH option #2 is chosen.  I intend to submit an issue on the Git for Windows issue tracker regarding this.  I'll also submit an issue about the lack of an error when running 'git add' for a UTF-16LE with BOM file under PATH option #2.
>>
>> Thanks,
>> - Michael
>>
> []
>

^ permalink raw reply

* Re: Suggested clarification for .gitattributes reference documentation
From: Torsten Bögershausen @ 2024-01-13  7:43 UTC (permalink / raw)
  To: Michael Litwak; +Cc: brian m. carlson, git@vger.kernel.org
In-Reply-To: <SJ0PR10MB56932ABBEEEC6F8ADE23995AFA6E2@SJ0PR10MB5693.namprd10.prod.outlook.com>

On Sat, Jan 13, 2024 at 02:56:27AM +0000, Michael Litwak wrote:
> I just installed Git for Windows 2.43.0 and noticed the installer offers three options for altering the PATH:
>
> 1) Run git from git bash only
>
> 2) Run git from git bash, cmd.exe and PowerShell (RECOMMENDED)
>
> 3) Run git from git bash, cmd.exe and PowerShell with optional utilities (warning: will override find, sort and other system utilities).
>
> It turns out iconv.exe is accessible from cmd.exe (Command Prompt) only when you take the third option.  But iconv.exe is NOT optional.  It is required for git to deal with UTF-16LE with BOM text conversions (and probably for numerous other encoding conversions).

Plese wait a second - and thanks for bringing this up.
To my knowledge the binary iconv.exe (or just iconv under non-Windows)
is never called from Git itself.
Git is using iconv_open() and friends, which are all inside
a library, either the C-library "libc", or "libiconv"
(not 100% sure about the naming here)

iconv.exe is not needed in everyday life, or is it ?
If yes, when ?
iconv.exe is used when you run the test-suite, to verify
what Git is doing.

Could you elaborate a little bit more,
when iconv.exe is missing, and what is happening, please ?

>
> But when PATH option #2 is chosen, and iconv.exe is unreachable from a Windows Command Prompt, the git commands which call upon iconv.exe do NOT indicate the error.  The call to iconv.exe fails silently.  It is only later after you commit, push and clone the repo again that you see the encoding failures.
>
> And the warning about overriding find and sort must be taken with a grain of salt, since the Windows versions of those programs are accessed via a Windows folder which appears earlier in the PATH.
>
> So this Git for Windows installer screen is misleading.  And perhaps iconv.exe should be relocated so it is accessible even when PATH option #2 is chosen.  I intend to submit an issue on the Git for Windows issue tracker regarding this.  I'll also submit an issue about the lack of an error when running 'git add' for a UTF-16LE with BOM file under PATH option #2.
>
> Thanks,
> - Michael
>
[]

^ permalink raw reply

* Re: [PATCH] advice: allow disabling the automatic hint in advise_if_enabled()
From: Jeff King @ 2024-01-13  7:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Rubén Justo, Git List, Taylor Blau, Dragan Simic
In-Reply-To: <xmqqsf326vpn.fsf@gitster.g>

On Fri, Jan 12, 2024 at 02:19:32PM -0800, Junio C Hamano wrote:

> Rubén Justo <rjusto@gmail.com> writes:
> 
> >  advice.c          | 109 +++++++++++++++++++++++++---------------------
> >  t/t0018-advice.sh |   1 -
> >  2 files changed, 59 insertions(+), 51 deletions(-)
> >
> > diff --git a/advice.c b/advice.c
> > index f6e4c2f302..8474966ce1 100644
> > --- a/advice.c
> > +++ b/advice.c
> > @@ -33,50 +33,56 @@ static const char *advise_get_color(enum color_advice ix)
> >  	return "";
> >  }
> >  
> > +enum advice_level {
> > +	ADVICE_LEVEL_DEFAULT = 0,
> 
> We may want to say "_NONE" not "_DEFAULT" to match the convention
> used elsewhere, but I have no strong preference.  I do have a
> preference to see that, when we explicitly say "In this enum, there
> is ADVICE_LEVEL_DEFAULT and its value is zero" with "= 0" in the
> definition, the places we use a variable of this enum type, we say
> 
> 	if (!variable)
> 
> and not
> 
> 	if (variable == ADVICE_LEVEL_DEFAULT)

This may be getting into the realm of bikeshedding, but... ;)

For a tri-state we often use "-1" for "not specified". That has the nice
side effect in this case that "if (level)" shows the advice (that works
because "unspecified" and "explicitly true" both show the advice. And
then "if (level < 0)" is used for just the hint. But maybe that is too
clever/fragile.

Of course that means that all of the initializers have to use "-1"
explicitly. So zero-initialization is sometimes nice, too.

-Peff

^ permalink raw reply

* Re: [PATCH] strvec: use correct member name in comments
From: Jeff King @ 2024-01-13  7:31 UTC (permalink / raw)
  To: Linus Arver; +Cc: Junio C Hamano, Linus Arver via GitGitGadget, git
In-Reply-To: <owlyle8uhxut.fsf@fine.c.googlers.com>

On Fri, Jan 12, 2024 at 04:37:46PM -0800, Linus Arver wrote:

> OTOH if we were treating these .h files as something meant for direct
> external consumption (that is, if strvec.h is libified and external
> users outside of Git are expected to use it directly as their first
> point of documentation), at that point it might make sense to name the
> parameters (akin to the style of manpages for syscalls). But I imagine
> at that point we would have some other means of developer docs (beyond
> raw header files) for libified parts of Git, so even in that case it's
> probably fine to keep things as is.

I think this is mostly orthogonal to libification. Whether the audience
is other parts of Git or users outside of Git, they need to know how to
call the function. Our main source of documentation there is comments
above the declaration (we've marked these with "/**" which would allow a
parser to pull them into a separate doc file, but AFAIK in the 9 years
since we started that convention, nobody has bothered to write such a
script).

Naming the parameters can help when writing those comments, because you
can then refer to them (e.g., see the comment above strbuf_addftime).
Even without that, I think they can be helpful, but I don't think I'd
bother adding them in unless taking a pass over the whole file, looking
for comments that do not sufficiently explain their matching functions.

I don't doubt that some of that would be necessary for libification,
just to increase the quality of the documentation. But I think it's
largely separate from the patch in this thread.

-Peff

^ permalink raw reply

* Re: [PATCH] diffcore-delta: avoid ignoring final 'line' of file
From: Junio C Hamano @ 2024-01-13  6:21 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Elijah Newren via GitGitGadget, git
In-Reply-To: <CABPp-BGp0NMQKLYg=OxJgnVxARffNF57B_N2bLmwT2R2EZqhdA@mail.gmail.com>

Elijah Newren <newren@gmail.com> writes:

>> I am not very happy with the hardcoded 97.  You are already using
>> the non-standard 10% threshold.  If the delta detection that
>> forgets about the last line is so broken as your proposed log
>> message noted, shouldn't you be able to construct a sample pair of
>> preimage and postimage for which the broken version gives so low
>> similarity to be judged not worth treating as a rename, while the
>> fixed version gives reasonable similarity to be made into a rename,
>> by the default threshold?  That way, the test only needs to see if
>> we got a rename (with any similarity) or a delete and an add.
>
> Oops, the threshold is entirely unnecessary here; not sure why I
> didn't remember to take it out (originally used the threshold while
> testing without the fix to just how low of a similarity git thought
> these nearly identical files had).
>
> Since you don't like the threshold, and since we don't seem to have a
> summary format that reports on the rename without the percentage, I
> guess I need to munge the output with sed:
>
>       sed -e "s/^R[0-9]* /R /" actual >actual.munged &&

Heh, I was hoping that we should be able to use "diff --name-only".

 $ git mv Makefile Breakfile
 $ git diff --name-only -M HEAD
 Breakfile
 $ git reset --hard
 $ git rm Makefile
 $ >Breakfile && git add Breakfile
 $ git diff --name-only -M HEAD
 Breakfile
 Makefile
 $ git reset --hard

^ permalink raw reply

* [PATCH v2] diffcore-delta: avoid ignoring final 'line' of file
From: Elijah Newren via GitGitGadget @ 2024-01-13  4:26 UTC (permalink / raw)
  To: git; +Cc: Taylor Blau, Elijah Newren, Elijah Newren, Elijah Newren
In-Reply-To: <pull.1637.git.1705006074626.gitgitgadget@gmail.com>

From: Elijah Newren <newren@gmail.com>

hash_chars() would hash lines to integers, and store them in a spanhash,
but cut lines at 64 characters.  Thus, whenever it reached 64 characters
or a newline, it would create a new spanhash.  The problem is, the final
part of the file might not end 64 characters after the previous 'line'
and might not end with a newline.  This could, for example, cause an
85-byte file with 12 lines and only the first character in the file
differing to appear merely 23% similar rather than the expected 97%.
Ensure the last line is included, and add a testcase that would have
caught this problem.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
    diffcore-delta: avoid ignoring final 'line' of file
    
    Found while experimenting with converting portions of diffcore-delta to
    Rust.
    
    Changes since v1:
    
     * Removed the unnecessary similarity threshold specification
     * Munged the discovered similarity percentage so we are only checking
       that a rename is detected
     * Fixed up filenames

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1637%2Fnewren%2Ffix-diffcore-final-line-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1637/newren/fix-diffcore-final-line-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1637

Range-diff vs v1:

 1:  f62b22a54c3 ! 1:  e0223bbfeb5 diffcore-delta: avoid ignoring final 'line' of file
     @@ t/t4001-diff-rename.sh: test_expect_success 'basename similarity vs best similar
       '
       
      +test_expect_success 'last line matters too' '
     -+	test_write_lines a 0 1 2 3 4 5 6 7 8 9 >nonewline &&
     -+	printf "git ignores final up to 63 characters if not newline terminated" >>nonewline &&
     -+	git add nonewline &&
     ++	{
     ++		test_write_lines a 0 1 2 3 4 5 6 7 8 9 &&
     ++		printf "git ignores final up to 63 characters if not newline terminated"
     ++	} >no-final-lf &&
     ++	git add no-final-lf &&
      +	git commit -m "original version of file with no final newline" &&
      +
      +	# Change ONLY the first character of the whole file
     -+	test_write_lines b 0 1 2 3 4 5 6 7 8 9 >nonewline &&
     -+	printf "git ignores final up to 63 characters if not newline terminated" >>nonewline &&
     -+	git add nonewline &&
     -+	git mv nonewline still-no-newline &&
     -+	git commit -a -m "rename nonewline -> still-no-newline" &&
     -+	git diff-tree -r -M01 --name-status HEAD^ HEAD >actual &&
     ++	{
     ++		test_write_lines b 0 1 2 3 4 5 6 7 8 9 &&
     ++		printf "git ignores final up to 63 characters if not newline terminated"
     ++	} >no-final-lf &&
     ++	git add no-final-lf &&
     ++	git mv no-final-lf still-absent-final-lf &&
     ++	git commit -a -m "rename no-final-lf -> still-absent-final-lf" &&
     ++	git diff-tree -r -M --name-status HEAD^ HEAD >actual &&
     ++	sed -e "s/^R[0-9]*	/R	/" actual >actual.munged &&
      +	cat >expected <<-\EOF &&
     -+	R097	nonewline	still-no-newline
     ++	R	no-final-lf	still-absent-final-lf
      +	EOF
     -+	test_cmp expected actual
     ++	test_cmp expected actual.munged
      +'
      +
       test_done


 diffcore-delta.c       |  4 ++++
 t/t4001-diff-rename.sh | 24 ++++++++++++++++++++++++
 2 files changed, 28 insertions(+)

diff --git a/diffcore-delta.c b/diffcore-delta.c
index c30b56e983b..7136c3dd203 100644
--- a/diffcore-delta.c
+++ b/diffcore-delta.c
@@ -159,6 +159,10 @@ static struct spanhash_top *hash_chars(struct repository *r,
 		n = 0;
 		accum1 = accum2 = 0;
 	}
+	if (n > 0) {
+		hashval = (accum1 + accum2 * 0x61) % HASHBASE;
+		hash = add_spanhash(hash, hashval, n);
+	}
 	QSORT(hash->data, (size_t)1ul << hash->alloc_log2, spanhash_cmp);
 	return hash;
 }
diff --git a/t/t4001-diff-rename.sh b/t/t4001-diff-rename.sh
index 85be1367de6..49c042a38ae 100755
--- a/t/t4001-diff-rename.sh
+++ b/t/t4001-diff-rename.sh
@@ -286,4 +286,28 @@ test_expect_success 'basename similarity vs best similarity' '
 	test_cmp expected actual
 '
 
+test_expect_success 'last line matters too' '
+	{
+		test_write_lines a 0 1 2 3 4 5 6 7 8 9 &&
+		printf "git ignores final up to 63 characters if not newline terminated"
+	} >no-final-lf &&
+	git add no-final-lf &&
+	git commit -m "original version of file with no final newline" &&
+
+	# Change ONLY the first character of the whole file
+	{
+		test_write_lines b 0 1 2 3 4 5 6 7 8 9 &&
+		printf "git ignores final up to 63 characters if not newline terminated"
+	} >no-final-lf &&
+	git add no-final-lf &&
+	git mv no-final-lf still-absent-final-lf &&
+	git commit -a -m "rename no-final-lf -> still-absent-final-lf" &&
+	git diff-tree -r -M --name-status HEAD^ HEAD >actual &&
+	sed -e "s/^R[0-9]*	/R	/" actual >actual.munged &&
+	cat >expected <<-\EOF &&
+	R	no-final-lf	still-absent-final-lf
+	EOF
+	test_cmp expected actual.munged
+'
+
 test_done

base-commit: 055bb6e9969085777b7fab83e3fee0017654f134
-- 
gitgitgadget

^ permalink raw reply related

* [PATCH v5 2/2] t7501: add tests for --amend --signoff
From: Ghanshyam Thakkar @ 2024-01-13  4:21 UTC (permalink / raw)
  To: git; +Cc: gitster, phillip.wood123, christian.couder, Ghanshyam Thakkar
In-Reply-To: <20240113042254.38602-1-shyamthakkar001@gmail.com>

Add tests for amending the commit to add Signed-off-by trailer. And
also to check if it does not add another trailer if one already exists.

Currently, there are tests for --signoff separately in t7501, however,
they are not tested with --amend.

Therefore, these tests belong with other similar tests of --amend in
t7501-commit-basic-functionality.

Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
---
 t/t7501-commit-basic-functionality.sh | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/t/t7501-commit-basic-functionality.sh b/t/t7501-commit-basic-functionality.sh
index 3e18b879b5..db37e9051b 100755
--- a/t/t7501-commit-basic-functionality.sh
+++ b/t/t7501-commit-basic-functionality.sh
@@ -3,8 +3,7 @@
 # Copyright (c) 2007 Kristian Høgsberg <krh@redhat.com>
 #
 
-# FIXME: Test the various index usages, test reflog,
-# signoff
+# FIXME: Test the various index usages, test reflog
 
 test_description='git commit'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
@@ -464,6 +463,28 @@ test_expect_success 'amend commit to fix date' '
 
 '
 
+test_expect_success 'amend commit to add signoff' '
+
+	test_commit "msg" file content &&
+	git commit --amend --signoff &&
+	test_commit_message HEAD <<-EOF
+	msg
+
+	Signed-off-by: $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>
+	EOF
+'
+
+test_expect_success 'amend does not add signoff if it already exists' '
+
+	test_commit --signoff "tenor" file newcontent &&
+	git commit --amend --signoff &&
+	test_commit_message HEAD <<-EOF
+	tenor
+
+	Signed-off-by: $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL>
+	EOF
+'
+
 test_expect_success 'commit mentions forced date in output' '
 	git commit --amend --date=2010-01-02T03:04:05 >output &&
 	grep "Date: *Sat Jan 2 03:04:05 2010" output
-- 
2.43.0


^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox