* Re: [PATCH 2/3] revision: clear decoration structs during release_revisions()
From: Junio C Hamano @ 2023-10-06 16:42 UTC (permalink / raw)
To: Jeff King; +Cc: git
In-Reply-To: <20231006005132.GA992085@coredump.intra.peff.net>
Jeff King <peff@peff.net> writes:
> On Thu, Oct 05, 2023 at 04:00:54PM -0700, Junio C Hamano wrote:
>
>> Wow, nested maze of callbacks make my head spin ;-) but they all
>> look reasonable. Thanks.
>
> Yeah, I don't love those one-liner callbacks just to handle the cast.
>
> The other alternative is to write some kind of for_each_decoration()
> macro, but I think it ends up in the usual macro hell (requiring the
> caller to provide iterator variables, hanging half-open braces, and so
> on). It might be worth it if iterating could be used in other places,
> but I don't think it is.
>
> So I tried to choose the lesser of two evils. :)
Oh, I am happy with the result.
^ permalink raw reply
* Re: [PATCH v3 1/3] diff-merges: improve --diff-merges documentation
From: Elijah Newren @ 2023-10-06 14:41 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Sergey Organov, git
In-Reply-To: <xmqq34yog3ux.fsf@gitster.g>
Hi,
On Thu, Oct 5, 2023 at 2:24 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Sergey Organov <sorganov@gmail.com> writes:
>
> > ---diff-merges=(off|none|on|first-parent|1|separate|m|combined|c|dense-combined|cc|remerge|r)::
> > +-m::
> > + Show diffs for merge commits in the default format. This is
> > + similar to '--diff-merges=on' (which see) except `-m` will
> > + produce no output unless `-p` is given as well.
>
> I think the sentence reads better without the translated (q.v.) that
> confused Eric.
Agreed; confused me too.
> > +-c::
> > + Produce combined diff output for merge commits.
> > + Shortcut for '--diff-merges=combined -p'.
> > +
> > +--cc::
> > + Produce dense combined diff output for merge commits.
> > + Shortcut for '--diff-merges=dense-combined -p'.
>
> Good.
>
> > +--remerge-diff::
> > + Produce diff against re-merge.
> > + Shortcut for '--diff-merges=remerge -p'.
>
> I suspect that many people do not get what "re-merge" in "against
> re-merge" really is. As "combined diff" and "dense combined diff"
> are not explained in the previous two entries either, and expect the
> readers to read the real description (which more or less matches
> what the original description for "-c" and "--cc" had, which is
> good), it would be better to say "Produce remerge-diff output for
> merge commits." here, too. It makes it consistent, and "for merge
> commits" makes it clear the "magic" does not apply to regular
> commits (which the above entries for "-c" and "--cc" do, which is
> very good).
Perhaps:
Produce remerge-diff output for merge commits, in order to show how
conflicts were resolved.
> > +separate::
> > + Show full diff with respect to each of parents.
> > + Separate log entry and diff is generated for each parent.
>
> In the early days of Git before -c/--cc were invented, we explained
> this mode as "pairwise comparison", and the phrase "pairwise" still
> may be the best one to describe the behaviour here. In fact, we see
> in the updated description of combined below the exact phrase is used
> to refer to this oldest output format.
>
> Show the `--patch` output pairwise, together with the commit
> header, repeated for each parent for a merge commit.
I like this.
> or something, perhaps. I added "repeated" here to make the contrast
> with "simultaneously" stand out.
>
> > +combined, c::
> > + Show differences from each of the parents to the merge
> > + result simultaneously instead of showing pairwise diff between
> > + a parent and the result one at a time. Furthermore, it lists
> > + only files which were modified from all parents.
> > ++
> > +dense-combined, cc::
> > + Further compress output produced by `--diff-merges=combined`
> > + by omitting uninteresting hunks whose contents in the parents
> > + have only two variants and the merge result picks one of them
> > + without modification.
> > ++
> > +remerge, r::
> > + Remerge two-parent merge commits to create a temporary tree
> > + object--potentially containing files with conflict markers
> > + and such. A diff is then shown between that temporary tree
> > + and the actual merge commit.
>
> The original says "two-parent merge comimts are remerged" so it is
> not a failure of this patch, but the first verb "Remerge" sounds
> unnecessarily unfriendly to the readers.
>
> For a two-parent merge commit, a merge of these two commits
> is retried to create a temporary tree object, potentially
> containing files with conflict markers. A `--patch` output
> then is shown between ...
>
> would be easier to follow and more faithful to the original
> description added by db757e8b (show, log: provide a --remerge-diff
> capability, 2022-02-02).
I like it. Perhaps it may also benefit from explaining why this mode
is useful as well:
For a two-parent merge commit, a merge of these two commits is
retried to create a temporary tree object, potentially containing
files with conflict markers. A diff is then shown between that
temporary tree and the actual merge commit. This has the effect
of showing whether and how both semantic and textual conflicts
were resolved by the user (i.e. what changes the user made after
running 'git merge' and before finally committing).
> Either way, it makes readers wonder what happens to merges with more
> than 2 parents (octopus merges). It is not a new problem and this
> topic should not attempt to fix it.
We could add:
For octopus merges (merges with more than two parents), currently
only shows a warning about skipping such commits.
if wanted.
But perhaps I've distracted too much from Sergey's topic, and I should
submit these wording tweaks as a patch on top? I'm fine either way.
> [Footnote]
>
> * When a project allows fast-forward merges, something like this can
> happen (and Git was _designed_ to allow and even encourage it)
>
> - Linus pulls from Sergey and sees merge conflicts that are very
> messy. Sergey is asked to resolve the conflict, as Linus knows
> Sergey understands the changes he is asking Linus to pull much
> better than Linus does.
>
> - Sergey does "git pull origin" that would give the same set of
> conflicts Linus saw, perhaps ours/theirs sides swapped, resolves
> the conflicts, and comits the merge result. He may even add a
> few other improvements on top (or may not). He tells Linus that
> his tree is ready to be pulled again.
>
> - Linus pulls from Sergey again. This time it is fast-forward,
> without an extra merge commit that records the Linus's previous
> tip as the first parent and Sergey's work as the second parent.
>
> - Linus continues working from here.
>
> In such a workflow, merges are nothing more than "combining
> multiple histories together" and the first parenthood is NOT
> inherently special among parents at all. The original "-m -p"
> (aka "pairwise diff") output reflects this world view and ensures
> that all parents are shown more or less as equals (yes, the first
> parent diff is shown first before the other parents, but you
> cannot avoid it when outputting to a single dimension medium).
>
> This world view was the only world view Git supported, until I
> added the "--first-parent" traversal in 0053e902 (git-log
> --first-parent: show only the first parent log, 2007-03-13).
>
> With the "--first-parent", with "--no-ff" option to "git merge", a
> different world view becomes possible. A merge is not merely
> combining multiple histories, which are equals. It is bringing
> work done on a side branch into the trunk. To see the overview of
> the history, "git log --first-parent" would give the outline,
> which would be full of merges from side branches, each of which
> can be seen as summarizing the work done on the side branch that
> was merged, and it may occasionally have single-parent commits
> that are hotfixes or trivial clean-ups or project administrivia
> commits. With "-p", "git log" would show the changes the work
> done on a side branch as a single unit for a merge, and individual
> commits if they are single-parent. The life is good.
>
> It all breaks down if the "diff against the first parent" is done
> on a merge that is not bringing the work on a side branch in to
> the trunk. The merge done in the second step Sergey did for Linus
> in the above example will have his work on the history leading to
> its first parent, and from the overall project's point of view,
> the second parent is the tip of the history of the trunk. Showing
> first-parent diff for a merge that was *not* discovered via the
> first-parent traversal would show such a meaningless patch. This
> is an illustration of the fallout from mixing two incompatible
> world views together, "--diff-merges=first-parent" wants to work
> in a world where the first-parent is special among parents, but
> traversal without "--first-parent" wants to treat all the branches
> equally.
>
> All the other <format>s accepted by the "--diff-merges=<format>"
> option are symmetrical and they work equally well when in a
> history of a project that considers the first-parenthood special
> (i.e. work on a side branch is brought into the trunk history) or
> in a history with merges whose parent order should not matter, so
> unlike "--diff-merges=first-parent", it makes sense to apply them
> with or without first-parent traversal. It however is not true
> for the "--diff-merges=first-parent" variant, which is asymmetric.
>
> And that is why I think use of "--diff-merges=first-parent"
> without "--first-parent" traversal is a bad thing to teach users
> to use.
Thanks for writing this up. In the past, I didn't know how to put
into words why I didn't particularly care for this mode. You explain
it rather well.
^ permalink raw reply
* Re: [OUTREACHY] Links to Microprojects
From: Christian Couder @ 2023-10-06 9:34 UTC (permalink / raw)
To: Naomi Ibe; +Cc: git
In-Reply-To: <CACS=G2wzmFKih0g2S3P8UHnn+wKog4sJFD6Z66iEpptGZ4w83g@mail.gmail.com>
On Fri, Oct 6, 2023 at 10:37 AM Naomi Ibe <naomi.ibeh69@gmail.com> wrote:
>
> Yes I've checked the available links on Git internship on the
> Outreachy dashboard but I could only find a link for the 2021-2022
> winter round (https://git.github.io/Outreachy-23-Microprojects/) and
> the patches on the mailing list look more advanced and not beginner
> friendly too.
It looks like you are responding to a previous email, but your email
doesn't contain any "In-Reply-To:" header to link it to other previous
emails which is confusing.
In this reply to Junio's reply to your previous email:
https://lore.kernel.org/git/CAP8UFD1cd5YZqAxYbYUMNkAYJLLGjBpNe_NK5nVq3eLxxSDzEQ@mail.gmail.com/
I discuss why there is no specific page for this Outreachy round and
why we might not set up specific pages for Outreachy or GSoC rounds
anymore.
Anyway I think https://git.github.io/Outreachy-23-Microprojects/ still
contains good microproject ideas like:
- Add more builtin patterns for userdiff
- Avoid pipes in git related commands in test scripts
- Modernize a test script
I am not sure if the other ideas are still relevant, but they could be.
> This link (https://github.com/gitgitgadget/git/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22)
> has "good-first-issues" from 2019-2022, considering the time when they
> were recorded, are we allowed to pick an issue from there?
Yes, you are allowed to pick ideas from anywhere, but for old issues
or ideas it's better to make sure that they are still relevant and not
too complex. It depends on the issue how you can make sure of that.
For a bug for example you can try to reproduce it to see if it's still
relevant. Searching the mailing list can often help as you can perhaps
find if others have already worked on it or similar issues.
^ permalink raw reply
* Re: How To Pick And Work On A Microproject
From: Christian Couder @ 2023-10-06 9:02 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Naomi Ibe, Kaartic Sivaraam
In-Reply-To: <xmqq7co0elnt.fsf@gitster.g>
On Fri, Oct 6, 2023 at 12:42 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Naomi Ibe <naomi.ibeh69@gmail.com> writes:
>
> > "Select a microproject and check that it has not yet been taken or
> > discussed by searching the mailing list. Public Inbox is your friend."
>
> Yeah, that is VERY unfriendly. There is no mention on the pool of
> microproject ideas from which you can "select" here. I wonder if
> some HTML link is missing in the sentence (i.e., clicking a word
> leading to a page that lists what you can select from), or it has
> always been like this.
This sentence has been like that for a long time. I have just improved
it to the following:
"* Select a microproject and check that it has not yet been taken or
discussed by searching the mailing list. Please read all the
sections below related to finding or selecting a microproject."
I have also made it clearer that the list these sentences are in is a
"Summary of the steps needed to complete a microproject" by adding a
section with that title.
> Later in the same document, I see
>
> How to find other ideas for microprojects
>
> First check the specific page(s) or information about Git
> microprojects related to your program that should have been
> published on this site or on the GSoC or Outreachy site. But then
> still read on everything below!
>
> which is much more realistic, as long as the "specific page(s)" are
> well curated (which I have no idea myself, as I have never been in
> the mentoring pool). Naomi, have you checked and found such a page
> on Outreachy site?
There is no such page as I haven't taken the time to write one. But I
have added the following paragraph just below the above one:
"It’s also possible that we haven’t taken the time to put up a page
listing microprojects ideas for your program. The pages we used to
create for that were named “XXXX Applicant Microprojects” where XXXX
is the program name and a date, for example “SoC 2016 Applicant
Microprojects” for the GSoC in 2016, or “Outreachy Winter 2021-2022
Applicant Microprojects” for Outreachy in 2021-2022. See the following
directory to find these old pages that might still be useful:
https://git.github.io/Historical-SoC-Outreachy/"
I am not sure how others feel about this, but I think it would be
better in the future to not have to prepare such pages, and to just
have a section with a number of examples of good microprojects on this
https://git.github.io/General-Microproject-Information/ page. It will
be easier to update this section when we know about other good ideas
or better ideas, or when we want to remove an idea that we don't
consider good anymore, or just update an idea.
> Then it goes on to suggest finding a bug report, but I tend to think
> that fixing them is way oversized to be a good microproject.
I agree that it's oversized for most bugs. I have just added the
following paragraph at the end of this "Searching for bug reports"
subsection:
"Also some bugs are difficult to understand and require too much or
too difficult work for a microproject, so don’t spend too much time on
them if it appears that they might not be simple to fix, and don’t
hesitate to ask on the mailing list if they are a good microproject."
> And finally it gives a casual mention of good+first+issue, which is
> probably the closest to what _should_ be listed as the first place
> to try (sorry, I however do not know how well the list is curated,
> either, but from a cursory look it looks legit).
>
> https://github.com/gitgitgadget/git/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22
>
> There also is a mention of #leftoverbits in the document, but by its
> nature, they can easily become stale or irrelevant, and they tend to
> be more real issues, and I would expect them to be unnecessarily
> harder than what dip-your-toe-in-the-water-and-say-hello
> microprojects need to be.
I have just added the following at the end of the subsection about
#leftoverbits:
"As for bugs, and many things really, you can also ask if you are not
sure it's simple enough to fix."
Thanks for reviewing the doc!
^ permalink raw reply
* [OUTREACHY] Links to Microprojects
From: Naomi Ibe @ 2023-10-06 8:33 UTC (permalink / raw)
To: git
Yes I've checked the available links on Git internship on the
Outreachy dashboard but I could only find a link for the 2021-2022
winter round (https://git.github.io/Outreachy-23-Microprojects/) and
the patches on the mailing list look more advanced and not beginner
friendly too.
This link (https://github.com/gitgitgadget/git/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22)
has "good-first-issues" from 2019-2022, considering the time when they
were recorded, are we allowed to pick an issue from there?
^ permalink raw reply
* Re: [PATCH] merge-ort: initialize repo in index state
From: Elijah Newren @ 2023-10-06 5:14 UTC (permalink / raw)
To: John Cai via GitGitGadget; +Cc: git, John Cai
In-Reply-To: <pull.1583.git.git.1696519349407.gitgitgadget@gmail.com>
On Thu, Oct 5, 2023 at 9:14 AM John Cai via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: John Cai <johncai86@gmail.com>
>
> initialize_attr_index() does not initialize the repo member of
> attr_index. Starting in 44451a2e5e (attr: teach "--attr-source=<tree>"
> global option to "git", 2023-05-06), this became a problem because
> istate->repo gets passed down the call chain starting in
> git_check_attr(). This gets passed all the way down to
> replace_refs_enabled(), which segfaults when accessing r->gitdir.
>
> Fix this by initializing the repository in the index state.
>
> Signed-off-by: John Cai <johncai86@gmail.com>
> Helped-by: Christian Couder <christian.couder@gmail.com>
> ---
> merge-ort: initialize repo in index state
>
> initialize_attr_index() does not initialize the repo member of
> attr_index. Starting in 44451a2e5e (attr: teach "--attr-source=" global
> option to "git", 2023-05-06), this became a problem because istate->repo
> gets passed down the call chain starting in git_check_attr(). This gets
> passed all the way down to replace_refs_enabled(), which segfaults when
> accessing r->gitdir.
>
> Fix this by initializing the repository in the index state.
Out of curiosity, are the changes in 44451a2e5e and its predecessors
sufficient to allow us to gut this nasty initialize_attr_index() hack
from merge-ort? See the comment at the top of the function and this
old mailing list thread:
https://lore.kernel.org/git/CABPp-BE1TvFJ1eOa8Ci5JTMET+dzZh3m3NxppqqWPyEp1UeAVg@mail.gmail.com/.
I never wanted to generate an index, Stolee didn't like it either, but
the attribute code seemed hardcoded to require an index and I had gone
down enough rabbit holes trying to get merge-ort into shape. But I
still absolutely hate this awful hack.
If we do have to live with it still, then...
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1583%2Fjohn-cai%2Fjc%2Fpopulate-repo-when-init-attr-index-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1583/john-cai/jc/populate-repo-when-init-attr-index-v1
> Pull-Request: https://github.com/git/git/pull/1583
>
> merge-ort.c | 1 +
> t/t4300-merge-tree.sh | 20 ++++++++++++++++++++
> 2 files changed, 21 insertions(+)
>
> diff --git a/merge-ort.c b/merge-ort.c
> index 7857ce9fbd1..172dc7d497d 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -1902,6 +1902,7 @@ static void initialize_attr_index(struct merge_options *opt)
> struct index_state *attr_index = &opt->priv->attr_index;
> struct cache_entry *ce;
>
> + attr_index->repo = the_repository;
Can we use opt->repo instead and reduce the number of places
hardcoding the_repository?
> attr_index->initialized = 1;
>
> if (!opt->renormalize)
> diff --git a/t/t4300-merge-tree.sh b/t/t4300-merge-tree.sh
> index 57c4f26e461..254453fff9c 100755
> --- a/t/t4300-merge-tree.sh
> +++ b/t/t4300-merge-tree.sh
> @@ -86,6 +86,26 @@ EXPECTED
> test_cmp expected actual
> '
>
> +test_expect_success '3-way merge with --attr-source' '
> + test_when_finished rm -rf 3-way &&
> + git init 3-way &&
> + (
> + cd 3-way &&
> + test_commit initial file1 foo &&
> + base=$(git rev-parse HEAD) &&
> + git checkout -b brancha &&
> + echo bar>>file1 &&
> + git commit -am "adding bar" &&
> + source=$(git rev-parse HEAD) &&
> + echo baz>>file1 &&
> + git commit -am "adding baz" &&
> + merge=$(git rev-parse HEAD) &&
> + test_must_fail git --attr-source=HEAD merge-tree -z --write-tree \
> + --merge-base "$base" --end-of-options "$source" "$merge" >out &&
> + grep "Merge conflict in file1" out
> + )
> +'
> +
> test_expect_success 'file change A, B (same)' '
> git reset --hard initial &&
> test_commit "change-a-b-same-A" "initial-file" "AAA" &&
>
> base-commit: 493f4622739e9b64f24b465b21aa85870dd9dc09
> --
> gitgitgadget
^ permalink raw reply
* Re: [PATCH 1/1] Allow attr magic for git-add, git-stash.
From: Junio C Hamano @ 2023-10-06 2:42 UTC (permalink / raw)
To: Joanna Wang; +Cc: git
In-Reply-To: <20231005232423.1834298-1-jojwang@google.com>
Joanna Wang <jojwang@google.com> writes:
> This lets users limit added/stashed files or exclude files based on file
> attributes. For example, the chromium project would like to use
> this like "git add --all ':(exclude,attr:submodule)'", as submodules
> are managed in a unique way and often results in submodule changes
> that users do not want in their commits.
> ...
Everything else before the following line was written quite well
(perhaps except for the commit title on the Subject: header). Very
pleasant to see in the first iteration of a new contributor.
> Any bugs this may trigger will be fixed in follow-up patches.
This is rather a poor statement to make, as it hints that there are
known breakages this change will reveal that you are not telling us,
although I suspect that it is not the case.
But in case there are such existing-but-not-revealed bugs, the above
is not how the world works around here. Instead, any existing code
whose bug has been hidden behind the GUARD_PATHSPEC() and known to
be revealed with this change should be fixed with N preliminary
clean-up patches, and then finally this patch would become the last
[N+1/N+1] step of such a N+1 patch series.
If no bugs are known to be revealed by applying this code change,
then we should just drop the above comment. It is common to see a
patch that changes behaviour without intending to break anything has
unintended and unfortunate side effects, and we cannot really avoid
it. The fallouts from such a change will be fixed after they are
discovered, and that is not something we need or want to repeat at
the end of every commit message ;-)
> I have tested this thoroughly with different flags, non-root directories,
> and other magic pathspecs, but may be unaware of some edge cases.
These are good things to add to the new test this patch adds. Your
one time testing will not protect the current code that works for
these undocumented cases from future breakages, but when made into
a part of the test suite, it will.
> diff --git a/builtin/add.c b/builtin/add.c
> index c27254a5cd..ef0b8d55fd 100644
> --- a/builtin/add.c
> +++ b/builtin/add.c
> @@ -424,7 +424,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
> * Check the "pathspec '%s' did not match any files" block
> * below before enabling new magic.
> */
> - parse_pathspec(&pathspec, PATHSPEC_ATTR,
> + parse_pathspec(&pathspec, 0,
> PATHSPEC_PREFER_FULL |
> PATHSPEC_SYMLINK_LEADING_PATH,
> prefix, argv);
> @@ -504,7 +504,8 @@ int cmd_add(int argc, const char **argv, const char *prefix)
> PATHSPEC_LITERAL |
> PATHSPEC_GLOB |
> PATHSPEC_ICASE |
> - PATHSPEC_EXCLUDE);
> + PATHSPEC_EXCLUDE |
> + PATHSPEC_ATTR);
Both hunks look quite as expected. Looking good.
> diff --git a/dir.c b/dir.c
> index 8486e4d56f..55c9607b1a 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -2173,13 +2173,7 @@ static int exclude_matches_pathspec(const char *path, int pathlen,
> if (!pathspec || !pathspec->nr)
> return 0;
>
> - GUARD_PATHSPEC(pathspec,
> - PATHSPEC_FROMTOP |
> - PATHSPEC_MAXDEPTH |
> - PATHSPEC_LITERAL |
> - PATHSPEC_GLOB |
> - PATHSPEC_ICASE |
> - PATHSPEC_EXCLUDE);
> + GUARD_PATHSPEC(pathspec, PATHSPEC_ALL_MAGIC);
Hmph, is it a good idea in general to use ALL_MAGIC in guard? The
programmer who wrote this not just promises that everything in the
current set of PATHSPEC_FOO bits are supported in this codepath
right now, but also says any *new* PATHSPEC_FOO magic will be
automatically supported. How does the updated code guarantee it?
> @@ -239,14 +253,18 @@ test_expect_success 'fail on multiple attr specifiers in one pathspec item' '
> test_i18ngrep "Only one" actual
> '
>
> -test_expect_success 'fail if attr magic is used places not implemented' '
> - # The main purpose of this test is to check that we actually fail
> - # when you attempt to use attr magic in commands that do not implement
> - # attr magic. This test does not advocate git-add to stay that way,
> - # though, but git-add is convenient as it has its own internal pathspec
> - # parsing.
> - test_must_fail git add ":(attr:labelB)" 2>actual &&
> - test_i18ngrep "magic not supported" actual
As the comment says, this test is primarily about making sure that
parse_pathspec() that limits the allowed pathspec magic and
GUARD_PATHSPEC() that forbids certain magic are working as expected.
It is strongly preferrable, instead of butchering this test that
guards these two mechanisms from being broken, to find a command
that still has some restriction on the magic it allows, and use it
to make sure they still trigger and give "magic not supported" error
message. IOW, do not remove the above test, but find something
other than "add" that is suitable to still follow the original
intention of the test.
It is, of course, very much welcome to add a new test below to
protect the new feature. IIRC, "add --all" and "add -u" used
somewhat different codepaths to find and update the changed paths.
Don't we need a test each for both new paths (which "--all" is about
and these two new files created in the test are) and also tracked
paths (which "-u" would try to enumerate and update the index with)?
For that matter, wouldn't "add . ':(exclude)...'" be also something
we need to check? Or do these three all use pretty-much the same
codepath under the hood?
> +test_expect_success 'check that attr magic works for git-add' '
> + # attr magic was previously blocked for git-add. With attr magic
> + # enabled for git-add, add a basic test to make sure it works as
> + # expected and add more tests if more bugs are discovered.
These three lines are unwanted.
"was X, now Y" and "do Z now" may belong to the log message where
previous state and new state are captured in the commit. But not in
the tracked contents. 6 months down the road, when reading t6135 in
order to further polish the tests, nobody cares if "add" did not
support the attr magic in the past. The title of the test already
says the test is about attr magic used in pathspec fed to "git add",
so "add a basic test" is unnecessary to say.
> + cat <<-\EOF >expect &&
> + sub/newFileA-foo
> + EOF
> + touch sub/newFileA-foo &&
> + touch sub/newFileB-foo &&
Unless it matters that these files have recent timestamps, do not
use "touch", merely to ensure presence of a file. We often use a
simple redirection
>sub/newFileA-foo &&
for such purpose (I thought we had it somewhere in the coding
guidelines document, but if not, we should add it).
> + git add --all ":(exclude,attr:labelB)" &&
> + git diff --name-only --cached >actual &&
> + test_cmp expect actual
> '
So we have two new paths (I do not offhand know if there are
existing paths that are already tracked), and one is with label and
the other is without. We tell "add" to grab all paths except for
the paths with the label and see that we do not see the one with the
label. OK.
Thanks.
^ permalink raw reply
* Re: [PATCH 2/3] revision: clear decoration structs during release_revisions()
From: Jeff King @ 2023-10-06 0:51 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <xmqq34yoekt5.fsf@gitster.g>
On Thu, Oct 05, 2023 at 04:00:54PM -0700, Junio C Hamano wrote:
> Wow, nested maze of callbacks make my head spin ;-) but they all
> look reasonable. Thanks.
Yeah, I don't love those one-liner callbacks just to handle the cast.
The other alternative is to write some kind of for_each_decoration()
macro, but I think it ends up in the usual macro hell (requiring the
caller to provide iterator variables, hanging half-open braces, and so
on). It might be worth it if iterating could be used in other places,
but I don't think it is.
So I tried to choose the lesser of two evils. :)
-Peff
^ permalink raw reply
* Re: [PATCH 0/10] some commit-graph leak fixes
From: Junio C Hamano @ 2023-10-06 0:39 UTC (permalink / raw)
To: Taylor Blau; +Cc: Jeff King, git
In-Reply-To: <ZR732biF718ju9QU@nand.local>
Taylor Blau <me@ttaylorr.com> writes:
> On Tue, Oct 03, 2023 at 04:25:04PM -0400, Jeff King wrote:
>> I noticed while working on the jk/commit-graph-verify-fix topic that
>> free_commit_graph() leaks any slices of a commit-graph-chain except for
>> the first. I naively hoped that fixing that would make t5324 leak-free,
>> but it turns out there were a number of other leaks, so I fixed those,
>> too. A couple of them were in the merge code, which in turn means a
>> bunch of new test scripts are now leak-free.
>>
>> Even though I saw the problem on that other topic, there's no dependency
>> here; this series can be applied directly to master (or possibly even
>> maint, though I didn't try).
>
> Thanks for carefully finding and explaining these various leaks. The
> series is a definite improvement, and after reviewing closely I couldn't
> find anything worth changing. LGTM!
Thanks, both. Let's merge it down to 'next'.
^ permalink raw reply
* [PATCH 1/1] Allow attr magic for git-add, git-stash.
From: Joanna Wang @ 2023-10-05 23:23 UTC (permalink / raw)
To: git; +Cc: Joanna Wang
This lets users limit added/stashed files or exclude files based on file
attributes. For example, the chromium project would like to use
this like "git add --all ':(exclude,attr:submodule)'", as submodules
are managed in a unique way and often results in submodule changes
that users do not want in their commits.
This does not change any attr magic implementation. It is only adding
attr as an allowed pathspec in git-add/stash, which was previously
blocked (for stash/add) by GUARD_PATHSPEC and (for add only) a pathspec
mask in parse_pathspec()).
With this patch, attr is supported for the two commands. It is possible
that when the attr pathspec feature was first added in
b0db704652 (pathspec: allow querying for attributes, 2017-03-13),
"PATHSPEC_ATTR" was just unintentionally left out of a few
GUARD_PATHSPEC() invocations. Later, to get a more user-friendly error
message when attr was used with git-add, PATHSPEC_ATTR was added as a
mask to git-add's invocation of parse_pathspec() in
84d938b732 (add: do not accept pathspec magic 'attr', 2018-09-18).
Any bugs this may trigger will be fixed in follow-up patches.
Signed-off-by: Joanna Wang <jojwang@google.com>
---
I have tested this thoroughly with different flags, non-root directories,
and other magic pathspecs, but may be unaware of some edge cases.
The GUARD_PATHSPEC() in dir.c is a potential part of the call stack of
fill_directory(), which is called by add, stash, grep, ls-files,
sparse-checkout, and status.
However, this patch only seems to add new behavior to git-add and git-stash.
grep, ls-files, and status already support attr pathspecs and running some of
these commands with the debugger confirms this GUARD_PATHSPEC() is never reached.
But again I am not aware of all edge cases. For sparse-checkout I'm not sure
if/how pathspec can be used, but either way did not see new behavior.
For context, attr magic support for git-add is part of work discussed here:
https://lore.kernel.org/git/xmqqfs2yjgvr.fsf@gitster.g/T/#t
builtin/add.c | 5 +++--
dir.c | 8 +-------
t/t6135-pathspec-with-attrs.sh | 34 ++++++++++++++++++++++++++--------
3 files changed, 30 insertions(+), 17 deletions(-)
diff --git a/builtin/add.c b/builtin/add.c
index c27254a5cd..ef0b8d55fd 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -424,7 +424,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
* Check the "pathspec '%s' did not match any files" block
* below before enabling new magic.
*/
- parse_pathspec(&pathspec, PATHSPEC_ATTR,
+ parse_pathspec(&pathspec, 0,
PATHSPEC_PREFER_FULL |
PATHSPEC_SYMLINK_LEADING_PATH,
prefix, argv);
@@ -504,7 +504,8 @@ int cmd_add(int argc, const char **argv, const char *prefix)
PATHSPEC_LITERAL |
PATHSPEC_GLOB |
PATHSPEC_ICASE |
- PATHSPEC_EXCLUDE);
+ PATHSPEC_EXCLUDE |
+ PATHSPEC_ATTR);
for (i = 0; i < pathspec.nr; i++) {
const char *path = pathspec.items[i].match;
diff --git a/dir.c b/dir.c
index 8486e4d56f..55c9607b1a 100644
--- a/dir.c
+++ b/dir.c
@@ -2173,13 +2173,7 @@ static int exclude_matches_pathspec(const char *path, int pathlen,
if (!pathspec || !pathspec->nr)
return 0;
- GUARD_PATHSPEC(pathspec,
- PATHSPEC_FROMTOP |
- PATHSPEC_MAXDEPTH |
- PATHSPEC_LITERAL |
- PATHSPEC_GLOB |
- PATHSPEC_ICASE |
- PATHSPEC_EXCLUDE);
+ GUARD_PATHSPEC(pathspec, PATHSPEC_ALL_MAGIC);
for (i = 0; i < pathspec->nr; i++) {
const struct pathspec_item *item = &pathspec->items[i];
diff --git a/t/t6135-pathspec-with-attrs.sh b/t/t6135-pathspec-with-attrs.sh
index f70c395e75..e11d8cb119 100755
--- a/t/t6135-pathspec-with-attrs.sh
+++ b/t/t6135-pathspec-with-attrs.sh
@@ -64,12 +64,23 @@ test_expect_success 'setup .gitattributes' '
fileSetLabel label
fileValue label=foo
fileWrongLabel label☺
+ newFileA* labelA
+ newFileB* labelB
EOF
echo fileSetLabel label1 >sub/.gitattributes &&
git add .gitattributes sub/.gitattributes &&
git commit -m "add attributes"
'
+test_expect_success 'setup .gitignore' '
+ cat <<-\EOF >.gitignore &&
+ actual
+ expect
+ EOF
+ git add .gitignore &&
+ git commit -m "add gitignore"
+'
+
test_expect_success 'check specific set attr' '
cat <<-\EOF >expect &&
fileSetLabel
@@ -150,6 +161,7 @@ test_expect_success 'check specific value attr (2)' '
test_expect_success 'check unspecified attr' '
cat <<-\EOF >expect &&
.gitattributes
+ .gitignore
fileA
fileAB
fileAC
@@ -175,6 +187,7 @@ test_expect_success 'check unspecified attr' '
test_expect_success 'check unspecified attr (2)' '
cat <<-\EOF >expect &&
HEAD:.gitattributes
+ HEAD:.gitignore
HEAD:fileA
HEAD:fileAB
HEAD:fileAC
@@ -200,6 +213,7 @@ test_expect_success 'check unspecified attr (2)' '
test_expect_success 'check multiple unspecified attr' '
cat <<-\EOF >expect &&
.gitattributes
+ .gitignore
fileC
fileNoLabel
fileWrongLabel
@@ -239,14 +253,18 @@ test_expect_success 'fail on multiple attr specifiers in one pathspec item' '
test_i18ngrep "Only one" actual
'
-test_expect_success 'fail if attr magic is used places not implemented' '
- # The main purpose of this test is to check that we actually fail
- # when you attempt to use attr magic in commands that do not implement
- # attr magic. This test does not advocate git-add to stay that way,
- # though, but git-add is convenient as it has its own internal pathspec
- # parsing.
- test_must_fail git add ":(attr:labelB)" 2>actual &&
- test_i18ngrep "magic not supported" actual
+test_expect_success 'check that attr magic works for git-add' '
+ # attr magic was previously blocked for git-add. With attr magic
+ # enabled for git-add, add a basic test to make sure it works as
+ # expected and add more tests if more bugs are discovered.
+ cat <<-\EOF >expect &&
+ sub/newFileA-foo
+ EOF
+ touch sub/newFileA-foo &&
+ touch sub/newFileB-foo &&
+ git add --all ":(exclude,attr:labelB)" &&
+ git diff --name-only --cached >actual &&
+ test_cmp expect actual
'
test_expect_success 'abort on giving invalid label on the command line' '
--
2.42.0.582.g8ccd20d70d-goog
^ permalink raw reply related
* Re: [PATCH 2/3] revision: clear decoration structs during release_revisions()
From: Junio C Hamano @ 2023-10-05 23:00 UTC (permalink / raw)
To: Jeff King; +Cc: git
In-Reply-To: <20231005213014.GB986467@coredump.intra.peff.net>
Jeff King <peff@peff.net> writes:
Wow, nested maze of callbacks make my head spin ;-) but they all
look reasonable. Thanks.
> diff --git a/line-log.c b/line-log.c
> index 790ab73212..24a1ecb677 100644
> --- a/line-log.c
> +++ b/line-log.c
> @@ -1327,3 +1327,13 @@ int line_log_filter(struct rev_info *rev)
>
> return 0;
> }
> +
> +static void free_void_line_log_data(void *data)
> +{
> + free_line_log_data(data);
> +}
> +
> +void line_log_free(struct rev_info *rev)
> +{
> + clear_decoration(&rev->line_log_data, free_void_line_log_data);
> +}
> diff --git a/line-log.h b/line-log.h
> index adff361b1b..4291da8d79 100644
> --- a/line-log.h
> +++ b/line-log.h
> @@ -60,4 +60,6 @@ int line_log_process_ranges_arbitrary_commit(struct rev_info *rev,
>
> int line_log_print(struct rev_info *rev, struct commit *commit);
>
> +void line_log_free(struct rev_info *rev);
> +
> #endif /* LINE_LOG_H */
> diff --git a/revision.c b/revision.c
> index e789834dd1..219dc76716 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -3083,6 +3083,11 @@ static void release_revisions_mailmap(struct string_list *mailmap)
>
> static void release_revisions_topo_walk_info(struct topo_walk_info *info);
>
> +static void free_void_commit_list(void *list)
> +{
> + free_commit_list(list);
> +}
> +
> void release_revisions(struct rev_info *revs)
> {
> free_commit_list(revs->commits);
> @@ -3100,6 +3105,10 @@ void release_revisions(struct rev_info *revs)
> diff_free(&revs->pruning);
> reflog_walk_info_release(revs->reflog_info);
> release_revisions_topo_walk_info(revs->topo_walk_info);
> + clear_decoration(&revs->children, free_void_commit_list);
> + clear_decoration(&revs->merge_simplification, free);
> + clear_decoration(&revs->treesame, free);
> + line_log_free(revs);
> }
>
> static void add_child(struct rev_info *revs, struct commit *parent, struct commit *child)
> diff --git a/t/t4217-log-limit.sh b/t/t4217-log-limit.sh
> index 6e01e2629c..613f0710e9 100755
> --- a/t/t4217-log-limit.sh
> +++ b/t/t4217-log-limit.sh
> @@ -2,6 +2,7 @@
>
> test_description='git log with filter options limiting the output'
>
> +TEST_PASSES_SANITIZE_LEAK=true
> . ./test-lib.sh
>
> test_expect_success 'setup test' '
^ permalink raw reply
* Re: How To Pick And Work On A Microproject
From: Junio C Hamano @ 2023-10-05 22:42 UTC (permalink / raw)
To: Christian Couder; +Cc: git, Naomi Ibe
In-Reply-To: <CACS=G2z84_-WkWa-BnK8bNNqb9z_o37BC3-kb_NkrjzAL=r4Sg@mail.gmail.com>
Naomi Ibe <naomi.ibeh69@gmail.com> writes:
> "Select a microproject and check that it has not yet been taken or
> discussed by searching the mailing list. Public Inbox is your friend."
Yeah, that is VERY unfriendly. There is no mention on the pool of
microproject ideas from which you can "select" here. I wonder if
some HTML link is missing in the sentence (i.e., clicking a word
leading to a page that lists what you can select from), or it has
always been like this.
Later in the same document, I see
How to find other ideas for microprojects
First check the specific page(s) or information about Git
microprojects related to your program that should have been
published on this site or on the GSoC or Outreachy site. But then
still read on everything below!
which is much more realistic, as long as the "specific page(s)" are
well curated (which I have no idea myself, as I have never been in
the mentoring pool). Naomi, have you checked and found such a page
on Outreachy site?
Then it goes on to suggest finding a bug report, but I tend to think
that fixing them is way oversized to be a good microproject.
And finally it gives a casual mention of good+first+issue, which is
probably the closest to what _should_ be listed as the first place
to try (sorry, I however do not know how well the list is curated,
either, but from a cursory look it looks legit).
https://github.com/gitgitgadget/git/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22
There also is a mention of #leftoverbits in the document, but by its
nature, they can easily become stale or irrelevant, and they tend to
be more real issues, and I would expect them to be unnecessarily
harder than what dip-your-toe-in-the-water-and-say-hello
microprojects need to be.
^ permalink raw reply
* [OUTREACHY] How To Pick And Work On A Microproject
From: Naomi Ibe @ 2023-10-05 22:17 UTC (permalink / raw)
To: git
I have gone through this link
https://git.github.io/General-Microproject-Information/ and I am not
really clear with it especially this line
"Select a microproject and check that it has not yet been taken or
discussed by searching the mailing list. Public Inbox is your friend."
On the mailing list I see messages with the [PATCH] keyword in front
of them, am I expected to pick one and reply to it? How do I reply
directly under the thread which contains the issue? Please, how do I
find the issue on the Git repository? I checked the repo and could not
find the "issues" section also .
Please HELP!!! Any explanations would be very much appreciated, as I
would prefer to start working on it as early as possible.
Thank you
^ permalink raw reply
* How To Pick And Work On A Microproject
From: Naomi Ibe @ 2023-10-05 22:14 UTC (permalink / raw)
To: git
I have gone through this link
https://git.github.io/General-Microproject-Information/ and I am not
really clear with it especially this line
"Select a microproject and check that it has not yet been taken or
discussed by searching the mailing list. Public Inbox is your friend."
On the mailing list I see messages with the [PATCH] keyword in front
of them, am I expected to pick one and reply to it? How do I reply
directly under the thread which contains the issue? Please, how do I
find the issue on the Git repository? I checked the repo and could not
find the "issues" section also .
Please HELP!!! Any explanations would be very much appreciated, as I
would prefer to start working on it as early as possible.
Thank you.
^ permalink raw reply
* Re: What's cooking in git.git (Oct 2023, #01; Mon, 2)
From: Junio C Hamano @ 2023-10-05 21:47 UTC (permalink / raw)
To: Sergey Organov; +Cc: git
In-Reply-To: <87lecgeqfu.fsf@osv.gnss.ru>
Sergey Organov <sorganov@gmail.com> writes:
> Overall, as an example, I'd understand if you had deflected the patch
> with "let's rather use -d for '--decorate=short', or '--date=relative'",
> or something like that, but you don't, leaving me uncertain about your
> actual worries and intentions.
Oh, I would be very much more sympathetic if somebody wanted to make
a short-and-sweet single-letter option to stand for "--first-parent
-p", if they come with the "first-parent chain is special---it is
the trunk history of the development" world view. And the resulting
behaviour would be "give me the diffs" in their world view, so I
would understand if they wanted to use "-d" for such an operation.
However, to folks who do not subscribe to "the first parent chain is
the trunk history" world view, "give me the diffs" is not an
explanation of the resulting behaviour, because in "-d" there is no
trace of hint that it is also about first-parent traversal.
So "-d" may not be a perfect fit for it, either. But at least it is
based on a more consistent world view, I would think, than
"--diff-merges=1 -p", whose behaviour becomes unexplainable when it
hits "reverse" merges in a world where the first parent chain is not
necessarily the trunk.
Anyway, I've tentatively queued the "--dd" round. Naming is hard,
I cannot tell what "dd" stards for, and I suspect no user can X-<.
Thanks.
^ permalink raw reply
* Re: [PATCH v3 2/3] diff-merges: introduce '--dd' option
From: Junio C Hamano @ 2023-10-05 21:45 UTC (permalink / raw)
To: Sergey Organov; +Cc: git
In-Reply-To: <20231004214558.210339-3-sorganov@gmail.com>
Sergey Organov <sorganov@gmail.com> writes:
> This option provides a shortcut to request diff with respect to first
> parent for any kind of commit, universally. It's implemented as pure
> synonym for "--diff-merges=first-parent --patch".
That explains what the patch does, but it does not tell us why it is
useful [*].
> NOTE: originally proposed as '-d', and renamed to '--dd' due to Junio
> request to keep "short-and-sweet" '-d' reserved for other uses.
The note is not grammatical, and more importantly, readers of "git
log" 6 months down the road would not care. I'd rather not see it
in the proposed log message. It is suitable material to place after
the three-dash line, or in the cover letter for the iteration.
> diff --git a/diff-merges.c b/diff-merges.c
> index ec97616db1df..45507588a279 100644
> --- a/diff-merges.c
> +++ b/diff-merges.c
> @@ -131,6 +131,9 @@ int diff_merges_parse_opts(struct rev_info *revs, const char **argv)
> } else if (!strcmp(arg, "--cc")) {
> set_dense_combined(revs);
> revs->merges_imply_patch = 1;
> + } else if (!strcmp(arg, "--dd")) {
> + set_first_parent(revs);
> + revs->merges_imply_patch = 1;
Quite straight-forward as expected. I do not think "--dd" clicks
for many people as "first parent diffs all over", though.
> } else if (!strcmp(arg, "--remerge-diff")) {
> set_remerge_diff(revs);
> revs->merges_imply_patch = 1;
> diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
> index 5de1d190759f..4b474808311e 100755
> --- a/t/t4013-diff-various.sh
> +++ b/t/t4013-diff-various.sh
> @@ -473,6 +473,14 @@ test_expect_success 'log --diff-merges=on matches --diff-merges=separate' '
> test_cmp expected actual
> '
>
> +test_expect_success 'log --dd matches --diff-merges=1 -p' '
> + git log --diff-merges=1 -p master >result &&
> + process_diffs result >expected &&
> + git log --dd master >result &&
> + process_diffs result >actual &&
> + test_cmp expected actual
> +'
> +
> test_expect_success 'deny wrong log.diffMerges config' '
> test_config log.diffMerges wrong-value &&
> test_expect_code 128 git log
Looking good.
Thanks.
[Footnote]
* As I said elsewhere, I do not think it is a good idea to encourage
users' to adopt a screwed-up worldview in which first parent is
special but not special, and does the wrong thing for reverse
merges. If the option were short-hand for "--first-parent -p",
at least I would be more sympathetic.
^ permalink raw reply
* Re: [PATCH v3 3/3] completion: complete '--dd'
From: Junio C Hamano @ 2023-10-05 21:45 UTC (permalink / raw)
To: Sergey Organov; +Cc: git
In-Reply-To: <20231004214558.210339-4-sorganov@gmail.com>
Sergey Organov <sorganov@gmail.com> writes:
> '--dd' only makes sense for 'git log' and 'git show', so add it to
> __git_log_show_options which is referenced in the completion for these
> two commands.
>
> Signed-off-by: Sergey Organov <sorganov@gmail.com>
> ---
> contrib/completion/git-completion.bash | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
> index 133ec92bfae7..ca4fa39f3ff8 100644
> --- a/contrib/completion/git-completion.bash
> +++ b/contrib/completion/git-completion.bash
> @@ -2042,7 +2042,7 @@ __git_log_shortlog_options="
> "
> # Options accepted by log and show
> __git_log_show_options="
> - --diff-merges --diff-merges= --no-diff-merges --remerge-diff
> + --diff-merges --diff-merges= --no-diff-merges --dd --remerge-diff
> "
>
> __git_diff_merges_opts="off none on first-parent 1 separate m combined c dense-combined cc remerge r"
Quite straight-forward. I am kind of surprised that we do not have
to list "--cc" here. Perhaps it is so short and common that people
do not need completion help?
But that is not a new problem caused by this series, so it is OK.
Unless "--cc" gets completed without being listed here, using some
automation like the "--git-completion-helper" option, in which case
we may want to see if we can remove all of the above and complete
them the same way as "--cc" gets completed. I didn't check.
Thanks.
^ permalink raw reply
* [PATCH 3/3] daemon: free listen_addr before returning
From: Jeff King @ 2023-10-05 21:33 UTC (permalink / raw)
To: git
In-Reply-To: <20231005212802.GA982892@coredump.intra.peff.net>
We build up a string list of listen addresses from the command-line
arguments, but never free it. This causes t5811 to complain of a leak
(though curiously it seems to do so only when compiled with gcc, not
with clang).
To handle this correctly, we have to do a little refactoring:
- there are two exit points from the main function, depending on
whether we are entering the main loop or serving a single client
(since rather than a traditional fork model, we re-exec ourselves
with the extra "--serve" argument to accommodate Windows).
We don't need --listen at all in the --serve case, of course, but it
is passed along by the parent daemon, which simply copies all of the
command-line options it got.
- we just "return serve()" to run the main loop, giving us no chance
to do any cleanup
So let's use a "ret" variable to store the return code, and give
ourselves a single exit point at the end. That gives us one place to do
cleanup.
Note that this code also uses the "use a no-dup string-list, but
allocate strings we add to it" trick, meaning string_list_clear() will
not realize it should free them. We can fix this by switching to a "dup"
string-list, but using the "append_nodup" function to add to it (this is
preferable to tweaking the strdup_strings flag before clearing, as it
puts all the subtle memory-ownership code together).
Signed-off-by: Jeff King <peff@peff.net>
---
Diff is a little messy due to indentation. Viewing with "-w" makes it
more clear, or just viewing the post-image.
daemon.c | 37 ++++++++++++++++++++----------------
t/t5811-proto-disable-git.sh | 2 ++
2 files changed, 23 insertions(+), 16 deletions(-)
diff --git a/daemon.c b/daemon.c
index f5e597114b..17d331b2f3 100644
--- a/daemon.c
+++ b/daemon.c
@@ -1243,19 +1243,20 @@ static int serve(struct string_list *listen_addr, int listen_port,
int cmd_main(int argc, const char **argv)
{
int listen_port = 0;
- struct string_list listen_addr = STRING_LIST_INIT_NODUP;
+ struct string_list listen_addr = STRING_LIST_INIT_DUP;
int serve_mode = 0, inetd_mode = 0;
const char *pid_file = NULL, *user_name = NULL, *group_name = NULL;
int detach = 0;
struct credentials *cred = NULL;
int i;
+ int ret;
for (i = 1; i < argc; i++) {
const char *arg = argv[i];
const char *v;
if (skip_prefix(arg, "--listen=", &v)) {
- string_list_append(&listen_addr, xstrdup_tolower(v));
+ string_list_append_nodup(&listen_addr, xstrdup_tolower(v));
continue;
}
if (skip_prefix(arg, "--port=", &v)) {
@@ -1437,22 +1438,26 @@ int cmd_main(int argc, const char **argv)
die_errno("failed to redirect stderr to /dev/null");
}
- if (inetd_mode || serve_mode)
- return execute();
+ if (inetd_mode || serve_mode) {
+ ret = execute();
+ } else {
+ if (detach) {
+ if (daemonize())
+ die("--detach not supported on this platform");
+ }
- if (detach) {
- if (daemonize())
- die("--detach not supported on this platform");
- }
+ if (pid_file)
+ write_file(pid_file, "%"PRIuMAX, (uintmax_t) getpid());
- if (pid_file)
- write_file(pid_file, "%"PRIuMAX, (uintmax_t) getpid());
+ /* prepare argv for serving-processes */
+ strvec_push(&cld_argv, argv[0]); /* git-daemon */
+ strvec_push(&cld_argv, "--serve");
+ for (i = 1; i < argc; ++i)
+ strvec_push(&cld_argv, argv[i]);
- /* prepare argv for serving-processes */
- strvec_push(&cld_argv, argv[0]); /* git-daemon */
- strvec_push(&cld_argv, "--serve");
- for (i = 1; i < argc; ++i)
- strvec_push(&cld_argv, argv[i]);
+ ret = serve(&listen_addr, listen_port, cred);
+ }
- return serve(&listen_addr, listen_port, cred);
+ string_list_clear(&listen_addr, 0);
+ return ret;
}
diff --git a/t/t5811-proto-disable-git.sh b/t/t5811-proto-disable-git.sh
index 8ac6b2a1d0..ed773e7432 100755
--- a/t/t5811-proto-disable-git.sh
+++ b/t/t5811-proto-disable-git.sh
@@ -1,6 +1,8 @@
#!/bin/sh
test_description='test disabling of git-over-tcp in clone/fetch'
+
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
. "$TEST_DIRECTORY/lib-proto-disable.sh"
. "$TEST_DIRECTORY/lib-git-daemon.sh"
--
2.42.0.836.g4b88bf80c5
^ permalink raw reply related
* [PATCH 2/3] revision: clear decoration structs during release_revisions()
From: Jeff King @ 2023-10-05 21:30 UTC (permalink / raw)
To: git
In-Reply-To: <20231005212802.GA982892@coredump.intra.peff.net>
The point of release_revisions() is to free memory associated with the
rev_info struct, but we have several "struct decoration" members that
are left untouched. Since the previous commit introduced a function to
do that, we can just call it.
We do have to provide some specialized callbacks to map the void
pointers onto real ones (the alternative would be casting the existing
function pointers; this generally works because "void *" is usually
interchangeable with a struct pointer, but it is technically forbidden
by the standard).
Since the line-log code does not expose the type it stores in the
decoration (nor of course the function to free it), I put this behind a
generic line_log_free() entry point. It's possible we may need to add
more line-log specific bits anyway (running t4211 shows a number of
other leaks in the line-log code).
While this doubtless cleans up many leaks triggered by the test suite,
the only script which becomes leak-free is t4217, as it does very little
beyond a simple traversal (its existing leak was from the use of
--children, which is now fixed).
Signed-off-by: Jeff King <peff@peff.net>
---
line-log.c | 10 ++++++++++
line-log.h | 2 ++
revision.c | 9 +++++++++
t/t4217-log-limit.sh | 1 +
4 files changed, 22 insertions(+)
diff --git a/line-log.c b/line-log.c
index 790ab73212..24a1ecb677 100644
--- a/line-log.c
+++ b/line-log.c
@@ -1327,3 +1327,13 @@ int line_log_filter(struct rev_info *rev)
return 0;
}
+
+static void free_void_line_log_data(void *data)
+{
+ free_line_log_data(data);
+}
+
+void line_log_free(struct rev_info *rev)
+{
+ clear_decoration(&rev->line_log_data, free_void_line_log_data);
+}
diff --git a/line-log.h b/line-log.h
index adff361b1b..4291da8d79 100644
--- a/line-log.h
+++ b/line-log.h
@@ -60,4 +60,6 @@ int line_log_process_ranges_arbitrary_commit(struct rev_info *rev,
int line_log_print(struct rev_info *rev, struct commit *commit);
+void line_log_free(struct rev_info *rev);
+
#endif /* LINE_LOG_H */
diff --git a/revision.c b/revision.c
index e789834dd1..219dc76716 100644
--- a/revision.c
+++ b/revision.c
@@ -3083,6 +3083,11 @@ static void release_revisions_mailmap(struct string_list *mailmap)
static void release_revisions_topo_walk_info(struct topo_walk_info *info);
+static void free_void_commit_list(void *list)
+{
+ free_commit_list(list);
+}
+
void release_revisions(struct rev_info *revs)
{
free_commit_list(revs->commits);
@@ -3100,6 +3105,10 @@ void release_revisions(struct rev_info *revs)
diff_free(&revs->pruning);
reflog_walk_info_release(revs->reflog_info);
release_revisions_topo_walk_info(revs->topo_walk_info);
+ clear_decoration(&revs->children, free_void_commit_list);
+ clear_decoration(&revs->merge_simplification, free);
+ clear_decoration(&revs->treesame, free);
+ line_log_free(revs);
}
static void add_child(struct rev_info *revs, struct commit *parent, struct commit *child)
diff --git a/t/t4217-log-limit.sh b/t/t4217-log-limit.sh
index 6e01e2629c..613f0710e9 100755
--- a/t/t4217-log-limit.sh
+++ b/t/t4217-log-limit.sh
@@ -2,6 +2,7 @@
test_description='git log with filter options limiting the output'
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
test_expect_success 'setup test' '
--
2.42.0.836.g4b88bf80c5
^ permalink raw reply related
* [PATCH 1/3] decorate: add clear_decoration() function
From: Jeff King @ 2023-10-05 21:29 UTC (permalink / raw)
To: git
In-Reply-To: <20231005212802.GA982892@coredump.intra.peff.net>
There's not currently any way to free the resources associated with a
decoration struct. As a result, we have several memory leaks which
cannot easily be plugged.
Let's add a "clear" function and make use of it in the example code of
t9004. This removes the only leak from that script, so we can mark it as
passing the leak sanitizer.
Curiously this leak is found only when running SANITIZE=leak with clang,
but not with gcc. But it is a bog-standard leak: we allocate some
memory in a local variable struct, and then exit main() without
releasing it. I'm not sure why gcc doesn't find it. After this
patch, both compilers report it as leak-free.
Note that the clear function takes a callback to free the individual
entries. That's not needed for our example (which is just decorating
with ints), but will be for real callers.
Signed-off-by: Jeff King <peff@peff.net>
---
decorate.c | 15 +++++++++++++++
decorate.h | 10 ++++++++++
t/helper/test-example-decorate.c | 2 ++
t/t9004-example.sh | 2 ++
4 files changed, 29 insertions(+)
diff --git a/decorate.c b/decorate.c
index a5c43c0c14..69aeb142b4 100644
--- a/decorate.c
+++ b/decorate.c
@@ -81,3 +81,18 @@ void *lookup_decoration(struct decoration *n, const struct object *obj)
j = 0;
}
}
+
+void clear_decoration(struct decoration *n, void (*free_cb)(void *))
+{
+ if (free_cb) {
+ unsigned int i;
+ for (i = 0; i < n->size; i++) {
+ void *d = n->entries[i].decoration;
+ if (d)
+ free_cb(d);
+ }
+ }
+
+ FREE_AND_NULL(n->entries);
+ n->size = n->nr = 0;
+}
diff --git a/decorate.h b/decorate.h
index ee43dee1f0..cdeb17c9df 100644
--- a/decorate.h
+++ b/decorate.h
@@ -58,4 +58,14 @@ void *add_decoration(struct decoration *n, const struct object *obj, void *decor
*/
void *lookup_decoration(struct decoration *n, const struct object *obj);
+/*
+ * Clear all decoration entries, releasing any memory used by the structure.
+ * If free_cb is not NULL, it is called for every decoration value currently
+ * stored.
+ *
+ * After clearing, the decoration struct can be used again. The "name" field is
+ * retained.
+ */
+void clear_decoration(struct decoration *n, void (*free_cb)(void *));
+
#endif
diff --git a/t/helper/test-example-decorate.c b/t/helper/test-example-decorate.c
index 2ed910adaa..8f59f6be4c 100644
--- a/t/helper/test-example-decorate.c
+++ b/t/helper/test-example-decorate.c
@@ -72,5 +72,7 @@ int cmd__example_decorate(int argc UNUSED, const char **argv UNUSED)
if (objects_noticed != 2)
BUG("should have 2 objects");
+ clear_decoration(&n, NULL);
+
return 0;
}
diff --git a/t/t9004-example.sh b/t/t9004-example.sh
index 7e8894a4a7..590aab0304 100755
--- a/t/t9004-example.sh
+++ b/t/t9004-example.sh
@@ -1,6 +1,8 @@
#!/bin/sh
test_description='check that example code compiles and runs'
+
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
test_expect_success 'decorate' '
--
2.42.0.836.g4b88bf80c5
^ permalink raw reply related
* [PATCH 0/3] a few more leak fixes
From: Jeff King @ 2023-10-05 21:28 UTC (permalink / raw)
To: git
I was really bothered that using clang with SANITIZE=leak found a leak
that gcc didn't. And then I was doubly bothered to find that there is
one that gcc finds that clang doesn't!
I don't think either of these are urgent or important leaks on their
own, but the flaky nature of the results makes it annoying while trying
to find and clean up other leaks. So here are some fixes. Patches 1 and
3 are those two cases, respectively. Patch 2 is a more interesting
leak-fix enabled by the infrastructure added in patch 1.
[1/3]: decorate: add clear_decoration() function
[2/3]: revision: clear decoration structs during release_revisions()
[3/3]: daemon: free listen_addr before returning
daemon.c | 37 ++++++++++++++++++--------------
decorate.c | 15 +++++++++++++
decorate.h | 10 +++++++++
line-log.c | 10 +++++++++
line-log.h | 2 ++
revision.c | 9 ++++++++
t/helper/test-example-decorate.c | 2 ++
t/t4217-log-limit.sh | 1 +
t/t5811-proto-disable-git.sh | 2 ++
t/t9004-example.sh | 2 ++
10 files changed, 74 insertions(+), 16 deletions(-)
-Peff
^ permalink raw reply
* Re: [PATCH v3 1/3] diff-merges: improve --diff-merges documentation
From: Junio C Hamano @ 2023-10-05 21:24 UTC (permalink / raw)
To: Sergey Organov; +Cc: git
In-Reply-To: <20231004214558.210339-2-sorganov@gmail.com>
Sergey Organov <sorganov@gmail.com> writes:
> ---diff-merges=(off|none|on|first-parent|1|separate|m|combined|c|dense-combined|cc|remerge|r)::
> +-m::
> + Show diffs for merge commits in the default format. This is
> + similar to '--diff-merges=on' (which see) except `-m` will
> + produce no output unless `-p` is given as well.
I think the sentence reads better without the translated (q.v.) that
confused Eric.
> +-c::
> + Produce combined diff output for merge commits.
> + Shortcut for '--diff-merges=combined -p'.
> +
> +--cc::
> + Produce dense combined diff output for merge commits.
> + Shortcut for '--diff-merges=dense-combined -p'.
Good.
> +--remerge-diff::
> + Produce diff against re-merge.
> + Shortcut for '--diff-merges=remerge -p'.
I suspect that many people do not get what "re-merge" in "against
re-merge" really is. As "combined diff" and "dense combined diff"
are not explained in the previous two entries either, and expect the
readers to read the real description (which more or less matches
what the original description for "-c" and "--cc" had, which is
good), it would be better to say "Produce remerge-diff output for
merge commits." here, too. It makes it consistent, and "for merge
commits" makes it clear the "magic" does not apply to regular
commits (which the above entries for "-c" and "--cc" do, which is
very good).
> --no-diff-merges::
> + Synonym for '--diff-merges=off'.
> +
> +--diff-merges=<format>::
> Specify diff format to be used for merge commits. Default is
> - {diff-merges-default} unless `--first-parent` is in use, in which case
> - `first-parent` is the default.
> + {diff-merges-default} unless `--first-parent` is in use, in
> + which case `first-parent` is the default.
This reads well.
In the longer term, "--diff-merge=first-parent" that is used without
first-parent traversal should be discouraged and be deprecated, I
think, but that is a separate story [*].
> ---diff-merges=(off|none):::
> ---no-diff-merges:::
> +The following formats are supported:
> ++
> +--
> +off, none::
> Disable output of diffs for merge commits. Useful to override
> implied value.
> +
> ---diff-merges=on:::
> ---diff-merges=m:::
> --m:::
> - This option makes diff output for merge commits to be shown in
> - the default format. `-m` will produce the output only if `-p`
> - is given as well. The default format could be changed using
> +on, m::
> + Make diff output for merge commits to be shown in the default
> + format. The default format could be changed using
> `log.diffMerges` configuration parameter, which default value
> is `separate`.
The original is already wrong so these are not problems this patch
introduces, but
- "configuration variable" is how we refer to these entities.
- "which default value" -> "whose default value".
> ---diff-merges=first-parent:::
> ---diff-merges=1:::
> - This option makes merge commits show the full diff with
> - respect to the first parent only.
> +first-parent, 1::
> + Show full diff with respect to first parent. This is the same
> + format as `--patch` produces for non-merge commits.
> +
Yes, this is the same output as `-p`, as if parents other than the
first parent of the merge commit did not exist.
This was inherited from the original elsewhere, but it makes it
unnecessary confusing to say "full diff" here and in the next one.
Show `--patch` output with respoect to the first parent for a
merge commit, as if the other parents did not exist.
perhaps?
> +separate::
> + Show full diff with respect to each of parents.
> + Separate log entry and diff is generated for each parent.
In the early days of Git before -c/--cc were invented, we explained
this mode as "pairwise comparison", and the phrase "pairwise" still
may be the best one to describe the behaviour here. In fact, we see
in the updated description of combined below the exact phrase is used
to refer to this oldest output format.
Show the `--patch` output pairwise, together with the commit
header, repeated for each parent for a merge commit.
or something, perhaps. I added "repeated" here to make the contrast
with "simultaneously" stand out.
> +combined, c::
> + Show differences from each of the parents to the merge
> + result simultaneously instead of showing pairwise diff between
> + a parent and the result one at a time. Furthermore, it lists
> + only files which were modified from all parents.
> ++
> +dense-combined, cc::
> + Further compress output produced by `--diff-merges=combined`
> + by omitting uninteresting hunks whose contents in the parents
> + have only two variants and the merge result picks one of them
> + without modification.
> ++
> +remerge, r::
> + Remerge two-parent merge commits to create a temporary tree
> + object--potentially containing files with conflict markers
> + and such. A diff is then shown between that temporary tree
> + and the actual merge commit.
The original says "two-parent merge comimts are remerged" so it is
not a failure of this patch, but the first verb "Remerge" sounds
unnecessarily unfriendly to the readers.
For a two-parent merge commit, a merge of these two commits
is retried to create a temporary tree object, potentially
containing files with conflict markers. A `--patch` output
then is shown between ...
would be easier to follow and more faithful to the original
description added by db757e8b (show, log: provide a --remerge-diff
capability, 2022-02-02).
Either way, it makes readers wonder what happens to merges with more
than 2 parents (octopus merges). It is not a new problem and this
topic should not attempt to fix it.
Looks very good otherwise. Let me read on.
Thanks.
[Footnote]
* When a project allows fast-forward merges, something like this can
happen (and Git was _designed_ to allow and even encourage it)
- Linus pulls from Sergey and sees merge conflicts that are very
messy. Sergey is asked to resolve the conflict, as Linus knows
Sergey understands the changes he is asking Linus to pull much
better than Linus does.
- Sergey does "git pull origin" that would give the same set of
conflicts Linus saw, perhaps ours/theirs sides swapped, resolves
the conflicts, and comits the merge result. He may even add a
few other improvements on top (or may not). He tells Linus that
his tree is ready to be pulled again.
- Linus pulls from Sergey again. This time it is fast-forward,
without an extra merge commit that records the Linus's previous
tip as the first parent and Sergey's work as the second parent.
- Linus continues working from here.
In such a workflow, merges are nothing more than "combining
multiple histories together" and the first parenthood is NOT
inherently special among parents at all. The original "-m -p"
(aka "pairwise diff") output reflects this world view and ensures
that all parents are shown more or less as equals (yes, the first
parent diff is shown first before the other parents, but you
cannot avoid it when outputting to a single dimension medium).
This world view was the only world view Git supported, until I
added the "--first-parent" traversal in 0053e902 (git-log
--first-parent: show only the first parent log, 2007-03-13).
With the "--first-parent", with "--no-ff" option to "git merge", a
different world view becomes possible. A merge is not merely
combining multiple histories, which are equals. It is bringing
work done on a side branch into the trunk. To see the overview of
the history, "git log --first-parent" would give the outline,
which would be full of merges from side branches, each of which
can be seen as summarizing the work done on the side branch that
was merged, and it may occasionally have single-parent commits
that are hotfixes or trivial clean-ups or project administrivia
commits. With "-p", "git log" would show the changes the work
done on a side branch as a single unit for a merge, and individual
commits if they are single-parent. The life is good.
It all breaks down if the "diff against the first parent" is done
on a merge that is not bringing the work on a side branch in to
the trunk. The merge done in the second step Sergey did for Linus
in the above example will have his work on the history leading to
its first parent, and from the overall project's point of view,
the second parent is the tip of the history of the trunk. Showing
first-parent diff for a merge that was *not* discovered via the
first-parent traversal would show such a meaningless patch. This
is an illustration of the fallout from mixing two incompatible
world views together, "--diff-merges=first-parent" wants to work
in a world where the first-parent is special among parents, but
traversal without "--first-parent" wants to treat all the branches
equally.
All the other <format>s accepted by the "--diff-merges=<format>"
option are symmetrical and they work equally well when in a
history of a project that considers the first-parenthood special
(i.e. work on a side branch is brought into the trunk history) or
in a history with merges whose parent order should not matter, so
unlike "--diff-merges=first-parent", it makes sense to apply them
with or without first-parent traversal. It however is not true
for the "--diff-merges=first-parent" variant, which is asymmetric.
And that is why I think use of "--diff-merges=first-parent"
without "--first-parent" traversal is a bad thing to teach users
to use.
^ permalink raw reply
* Re: [PATCH v3 1/3] diff-merges: improve --diff-merges documentation
From: Junio C Hamano @ 2023-10-05 21:11 UTC (permalink / raw)
To: Eric Sunshine; +Cc: Sergey Organov, git
In-Reply-To: <CAPig+cT63L2+XmDRKw4Pc+iDmUL+UFcyummOcOtS+3wYaNbFvg@mail.gmail.com>
Eric Sunshine <sunshine@sunshineco.com> writes:
>> diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
>> @@ -43,66 +43,74 @@ endif::git-diff[]
>> +-m::
>> + Show diffs for merge commits in the default format. This is
>> + similar to '--diff-merges=on' (which see) except `-m` will
>> + produce no output unless `-p` is given as well.
>
> I'm having difficulty grasping the parenthetical "(which see)" comment.
I am, too. I know what it means when written in the more common
Latin abbreviation (q.v.), but I suspect it may be rare to spell it
in English like this. I found
https://writingcenter.unc.edu/tips-and-tools/latin-terms-and-abbreviations/
that starts its explanation with this:
The abbreviation q.v. stands for quod vide, which translates
literally as “which see,” although in practice it mea
something more like “for which see elsewhere.
and it goes on to say:
The reader is expected to know how to locate this information
without further assistance. Since there is always the
possibility that the reader won’t be able to find the
information cited by q.v., it’s better to use a simple English
phrase such as “for more on this topic, see pages 72-3” or
“a detailed definition appears on page 16.” Such phrases are
immediately comprehensible to the reader (who may not even know
what q.v. means) and remove any ambiguity about where
additional information is located.
which only applies halfway to this example, as with the text before
it makes it very clear for readers that they need to learn about
"--diff-merges=on". It is so clear to the point that the only
effect "(which see)" here has is to waste bytes and confuses
readers, I am afraid.
^ permalink raw reply
* Re: [PATCH 08/10] commit-graph: free write-context entries before overwriting
From: Jeff King @ 2023-10-05 21:03 UTC (permalink / raw)
To: Taylor Blau; +Cc: git
In-Reply-To: <ZR73rpsKiTBF/rYj@nand.local>
On Thu, Oct 05, 2023 at 01:51:42PM -0400, Taylor Blau wrote:
> @@ -1033,11 +1033,11 @@ struct write_commit_graph_context {
> uint64_t progress_cnt;
>
> char *base_graph_name;
> - int num_commit_graphs_before;
> - int num_commit_graphs_after;
> - char **commit_graph_filenames_before;
> - char **commit_graph_filenames_after;
> - char **commit_graph_hash_after;
> + struct {
> + size_t nr;
> + char **fname;
> + char **hash;
> + } graphs_before, graphs_after;
> uint32_t new_num_commits_in_base;
> struct commit_graph *new_base_graph;
> --- >8 ---
>
> ...making the corresponding changes throughout the rest of the file. But
> that is definitely out of scope here, and could easily be left for
> another day.
I agree that it would make things a bit more readable, but there
currently is no "hash_before". So they're not quite symmetric.
-Peff
^ permalink raw reply
* Re: [PATCH] doc/cat-file: clarify description regarding various command forms
From: Jeff King @ 2023-10-05 21:01 UTC (permalink / raw)
To: Štěpán Němec; +Cc: git, avarab
In-Reply-To: <20231005194852+0200.262756-stepnem@smrk.net>
On Thu, Oct 05, 2023 at 07:48:52PM +0200, Štěpán Němec wrote:
> > Yeah, I think that is a big improvement over the status quo. I might
> > also be worth starting with a single-sentence overview of what is common
> > to both modes. Something like:
> >
> > Output the contents or details of one or more objects. [...]
>
> I thought about that when proposing the rewrite, but feel that it would
> again just duplicate what's said elsewhere, in this case even before,
> not after, in the very first line of the man page:
>
> git-cat-file - Provide content or type and size information for
> repository objects
Ah, true, I was thinking that the DESCRIPTION section would be the first
thing users would read, but I didn't notice the headline. I agree that
what it says is probably sufficient (though arguably "type and size" is
slightly inaccurate there; I said "details" in my proposed text but
maybe that is too vague).
> > This command can operate in two modes, depending on whether an
> > option from the --batch family is specified.
> >
> > In non-batch mode, the command provides information on a single object
> > given on the command line.
> ^^^^^
> Any particular reason you prefer "given" to "named"? However absurd a
> notion of giving an actual object on the command line might seem, to me
> "named" is better in that it leaves no room for such misinterpretation.
> And the <object> description in OPTIONS talks about "ways to spell
> object names", building on the same concept.
Nope, I didn't even do that replacement consciously (I was just fleshing
out my example, and ended up deciding nothing else needed to be
changed). So "named" is fine by me.
Thanks.
-Peff
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox