Git development

Git development
 help / color / mirror / Atom feed

* Re: [PATCH v2] doc/git-bisect: clarify `git bisect run` syntax
From: Javier Mora @ 2023-10-23 22:53 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: cousteau via GitGitGadget, git, Patrick Steinhardt
In-Reply-To: <CAPig+cQuBwzaG7ZssGUY6k8wf8pcGZHAGLnbRy579uTPMKqwKQ@mail.gmail.com>

> the patch subject becomes a bit outdated with this addition.

Right; I wanted to change it to something like "clarify `git bisect
run` syntax and other minor changes" but wanted to keep the title
concise.
I guess I could change it to just "clarify `git bisect` syntax" though
remove the "run").

> the following two lines are already referencing placeholders
> <term-new> and <term-old>

That's why I added it; that `(bad|new|<term-new>)` felt a bit awkward
with no previous explanation of what <term-new> was.

> ...now we have an inconsistency again since this text just uses the
> generic <term>. However, I haven't convinced myself that we need to
> care about this inconsistency.

I thought about that, but in THAT case it wasn't necessary because
<term-new> and <term-old> are never used there (and I wanted to avoid
making -h too long).  But it's true that it feels inconsistent; I may
add it just for the sake of consistency.

Overall, maybe I should leave that change to a separate patch, even if
it's a minor correction.  (This made more sense when I had in mind the
plan to move everything from description to synopsis so I would need
to touch all those lines anyway.)  The changes will be compatible
anyway (they're far away enough to not cause merge conflicts).  What
do you think?

^ permalink raw reply

* Re: [RFC PATCH 0/5] Introduce -t, --table for status/add commands
From: Jacob Stopak @ 2023-10-23 23:17 UTC (permalink / raw)
  To: Oswald Buddenhagen; +Cc: Dragan Simic, git
In-Reply-To: <ZTbVY7Nf+DTYqHky@ugly>

On Mon, Oct 23, 2023 at 10:19:47PM +0200, Oswald Buddenhagen wrote:
> On Mon, Oct 23, 2023 at 12:29:12PM -0700, Jacob Stopak wrote:
> > Those arrows showing how things move only really apply to "simulating"
> > (dry runs) for specific commands like add, restore, rm, commit, stash,
> > etc, so making the --table proposal a default status output would still
> > miss those scenarios.
> > 
> you're too focused on the status quo of your own tool. :-)

Ha it's possible. I created git-sim with a very specific use case in mind
so you're right it's probably worth rethinking that while taking into account
Git's other functionality that wasn't in the picture with an external tool.

> there is really nothing that would speak against the real commands reporting
> what they just *actually did*. this would seem rather helpful for noobs and
> other insecure users.

That's true, it would be just as easy to report the results of a command
(and even easier in some cases) than forecasting the result in a dry-run.
The question is how to decide which one? Do you report the results of
certain commands by default while hints are enabled? And only as a dry run
when the --dry-run / -n flag is added? That actually would make sense as
it wouldn't add "special" behavior to the dry run. The dry run would just
report the exact same default output as the normal command, but omit the action.

> if one really wanted, "you can also use this with --dry-run" could be part
> of the hint that would say how to turn off the extra verbosity (or just the
> hint itself, if one likes the verbosity).

Interesting. So many ways to think about how to optimize the user
interaction...

> one could even go one step further and put at least the destructive commands
> into interactive/confirmation mode by default. but that's probably a bridge
> too far, as it would be potentially habit-forming in a bad way.

That would be an interesting add on to our discussion above. So as a
thought experiment, let's pretend there are no restrictions from
traditional users, we could:

  1) Enable verbose results output by default to certain commands, which
     could include a visual table-based output where applicable, and a
     hint to disable it. (Are hints currently command-specific? Or even
     scenario-specific within a command? Or is it all or nothing?)

  2) Include verbose/table results on dry runs, and add dry run flags onto
     commands that should seem to have it for consistency. For example
     "git add" has a dry run flag but "git restore" (the hinted inverse
     operation) does not. I assume on dry runs hints wouldn't be used to
     communicate anything.

  3) Well, I'll admit about once every 3 months I run "git stash --all"
     when I really meant "git stash --include-untracked" and proceed to
     lose a small part of my soul. This would be saved by a simple
     confirmation. I find that the stuff like "git reset --hard" doesn't
     bother me anymore since I know when to be careful with it and what
     things I can get back and how to do it. But I find the nasty ones
     are the things that sound like what you want but end up doing
     something bad. Not sure there is a way to isolate those though...

     I'm rambling now. But maybe for interactivity, at the very least it
     could be added to dry runs, like a "Here's what would happen, want
     to run it now?" I got this feedback a few times for git-sim as well.

^ permalink raw reply

* Re: [PATCH v2] doc/git-bisect: clarify `git bisect run` syntax
From: Eric Sunshine @ 2023-10-23 23:18 UTC (permalink / raw)
  To: Javier Mora; +Cc: cousteau via GitGitGadget, git, Patrick Steinhardt
In-Reply-To: <CAH1-q0hNSKgr1-dtZac=z7Bx15gON0Y-1pyBM57zuXaFPaJJKQ@mail.gmail.com>

On Mon, Oct 23, 2023 at 6:53 PM Javier Mora
<cousteaulecommandant@gmail.com> wrote:
> > the patch subject becomes a bit outdated with this addition.
>
> Right; I wanted to change it to something like "clarify `git bisect
> run` syntax and other minor changes" but wanted to keep the title
> concise.
> I guess I could change it to just "clarify `git bisect` syntax" though
> remove the "run").

Yup.

> > the following two lines are already referencing placeholders
> > <term-new> and <term-old>
>
> That's why I added it; that `(bad|new|<term-new>)` felt a bit awkward
> with no previous explanation of what <term-new> was.
>
> > ...now we have an inconsistency again since this text just uses the
> > generic <term>. However, I haven't convinced myself that we need to
> > care about this inconsistency.
>
> I thought about that, but in THAT case it wasn't necessary because
> <term-new> and <term-old> are never used there (and I wanted to avoid
> making -h too long).  But it's true that it feels inconsistent; I may
> add it just for the sake of consistency.

I don't feel strongly about the inconsistency at this point.

> Overall, maybe I should leave that change to a separate patch, even if
> it's a minor correction.  (This made more sense when I had in mind the
> plan to move everything from description to synopsis so I would need
> to touch all those lines anyway.)  The changes will be compatible
> anyway (they're far away enough to not cause merge conflicts).  What
> do you think?

I can certainly see the "{new,bad}" to "(new|bad") and <term> to
<new-term>/<old-term> changes being separated out, making this a two-
or three-patch series.

^ permalink raw reply

* Re: [PATCH 2/2] fetch: no redundant error message for atomic fetch
From: Jiang Xin @ 2023-10-23 23:20 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Git List, Junio C Hamano, Jiang Xin
In-Reply-To: <ZTZF3AbNNuGpy38l@tanuki>

On Mon, Oct 23, 2023 at 6:07 PM Patrick Steinhardt <ps@pks.im> wrote:
>
> On Mon, Oct 23, 2023 at 05:16:20PM +0800, Jiang Xin wrote:
> > On Mon, Oct 23, 2023 at 4:27 PM Patrick Steinhardt <ps@pks.im> wrote:
> > >
> > > On Thu, Oct 19, 2023 at 10:34:33PM +0800, Jiang Xin wrote:
> > > > @@ -1775,10 +1775,8 @@ static int do_fetch(struct transport *transport,
> > > >       }
> > > >
> > > >  cleanup:
> > > > -     if (retcode && transaction) {
> > > > -             ref_transaction_abort(transaction, &err);
> > > > +     if (retcode && transaction && ref_transaction_abort(transaction, &err))
> > > >               error("%s", err.buf);
> > > > -     }
> > >
> > > Right. We already call `error()` in all cases where `err` was populated
> > > before we `goto cleanup;`, so calling it unconditionally a second time
> > > here is wrong.
> > >
> > > That being said, `ref_transaction_abort()` will end up calling the
> > > respective backend's implementation of `transaction_abort`, and for the
> > > files backend it actually ignores `err` completely. So if the abort
> > > fails, we would still end up calling `error()` with an empty string.
> >
> > The transaction_abort implementations of the two builtin refs backends
> > will not use "err“ because they never fail (always return 0). Some one
> > may want to implement their own refs backend which may use the "err"
> > variable in their "transaction_abort". So follow the pattern as
> > update-ref.c and files-backend.c to call ref_transaction_abort() is
> > safe.
> >
> > > Furthermore, it can happen that `transaction_commit` fails, writes to
> > > the buffer and then prints the error. If the abort now fails as well, we
> > > would end up printing the error message twice.
> >
> > The abort never fails so error message from transaction_commit() will
> > not reach the code.
>
> With that reasoning we could get rid of the error handling of abort
> completely as it's known not to fail. But only because it does not fail
> right now doesn't mean that it won't in the future, as the infra for it
> to fail is all in place. And in case it ever does the current code will
> run into the bug I described.

If in the future ref_transaction_abort() fails for some reason, the
err variable will be filled with the error message and the previous
error message will be discarded, no duplication will occur. So I think
use this fix is OK.

> So in my opinion, we should either refactor the code to clarify that
> this cannot fail indeed. Or do the right thing and handle the error case
> correctly, which right now we don't.
>
> Patrick

^ permalink raw reply

* Re: [PATCH v5 0/5] merge-ort: implement support for packing objects together
From: Junio C Hamano @ 2023-10-23 23:31 UTC (permalink / raw)
  To: Taylor Blau
  Cc: git, Elijah Newren, Eric W. Biederman, Jeff King,
	Patrick Steinhardt
In-Reply-To: <cover.1698101088.git.me@ttaylorr.com>

Taylor Blau <me@ttaylorr.com> writes:

> But I
> think that this approach ended up being less heavy-weight than I had
> originally imagined, so I think that this version is a worthwhile
> improvement over v4.

;-).

This version is a good place to stop, a bit short of going full OO.
Nicely done.


^ permalink raw reply

* Re: [PATCH v3 5/7] refs: add pseudorefs array and iteration functions
From: Junio C Hamano @ 2023-10-24  0:08 UTC (permalink / raw)
  To: Andy Koppe; +Cc: git, stolee
In-Reply-To: <20231023221143.72489-6-andy.koppe@gmail.com>

Andy Koppe <andy.koppe@gmail.com> writes:

> Define const array 'pseudorefs' with the names of the pseudorefs that
> are documented in gitrevisions.1, and add functions for_each_pseudoref()
> and refs_for_each_pseudoref() for iterating over them.

Makes sense, and we can later (ab|re)use the same mechanism to
extend "git for-each-ref" that currently only knows how to show
things under "refs/" hierarchy.

> The functions process the pseudorefs in the same way as head_ref() and
> refs_head_ref() process HEAD, invoking an each_ref_fn callback on each
> pseudoref that exists.

Good.

^ permalink raw reply

* Re: [PATCH v2] doc/git-bisect: clarify `git bisect run` syntax
From: Junio C Hamano @ 2023-10-24  0:12 UTC (permalink / raw)
  To: Javier Mora
  Cc: Eric Sunshine, cousteau via GitGitGadget, git, Patrick Steinhardt
In-Reply-To: <CAH1-q0hNSKgr1-dtZac=z7Bx15gON0Y-1pyBM57zuXaFPaJJKQ@mail.gmail.com>

Javier Mora <cousteaulecommandant@gmail.com> writes:

>> the patch subject becomes a bit outdated with this addition.
>
> Right; I wanted to change it to something like "clarify `git bisect
> run` syntax and other minor changes" but wanted to keep the title
> concise.
> I guess I could change it to just "clarify `git bisect` syntax" though
> remove the "run").

Quite honestly, I think at this point we are entering into the
"diminishing returns" territory.  The title is still clear enough,
and the patch is good.

Thanks.  The patch has been merged to 'next'.

^ permalink raw reply

* Re: [RFC PATCH 0/5] Introduce -t, --table for status/add commands
From: Dragan Simic @ 2023-10-24  1:10 UTC (permalink / raw)
  To: Jacob Stopak; +Cc: Oswald Buddenhagen, git
In-Reply-To: <ZTb/HeILRHnZjaz6.jacob@initialcommit.io>

On 2023-10-24 01:17, Jacob Stopak wrote:
> That's true, it would be just as easy to report the results of a 
> command
> (and even easier in some cases) than forecasting the result in a 
> dry-run.
> The question is how to decide which one? Do you report the results of
> certain commands by default while hints are enabled? And only as a dry 
> run
> when the --dry-run / -n flag is added? That actually would make sense 
> as
> it wouldn't add "special" behavior to the dry run. The dry run would 
> just
> report the exact same default output as the normal command, but omit 
> the action.

IMHO, it would be the best to simply implement support for 
"<command>.verbose = table" in the git configuration, similarly to how 
we already have "commit.verbose = true".  That way tables could be 
enabled per command, according to the user's preferences, regardless of 
performing dry runs or not.

The new hint would be placed somewhere, which should be decided 
separately, but having the hint enabled or disabled wouldn't affect 
anything else.  Just like with the other hints.

^ permalink raw reply

* Re: [RFC PATCH 0/5] Introduce -t, --table for status/add commands
From: Junio C Hamano @ 2023-10-24  2:03 UTC (permalink / raw)
  To: Dragan Simic; +Cc: Jacob Stopak, Oswald Buddenhagen, git
In-Reply-To: <62164acf4a787042dbb6e5abe212559b@manjaro.org>

Dragan Simic <dsimic@manjaro.org> writes:

> IMHO, it would be the best to simply implement support for
> "<command>.verbose = table" in the git configuration, similarly to how
> we already have "commit.verbose = true".  That way tables could be
> enabled per command, according to the user's preferences, regardless
> of performing dry runs or not.

I think the "use more verbose report format to help relatively
inexperienced folks, in exchange for spending more screen real
estate" is a good direction to think about this thing.

I am not personally interested in adding such support all that much
myself, but one piece of advice I can offer those who are interested
is not to be too deeply attached to the word "table".

It may be that "git status" (not "status -s" [*] but the current
version for human consumption) shows "paths with changes to be
committed (i.e. changes added to the index exist)" and "paths with
changes you could add to the index (i.e. changes yet to be added to
the index exist)" in a separate list, and Jacob may have found that
it gives a more understandable view into the states of each path if
the output is turned 90-degrees and the changes are shown in a
tabular form.  But "table" in this example is merely an
implementation detail of one presentation that is easier to
understand for "git status", and calling it "table" and making it a
word in the vocabulary of <command>.verbose is like a tail wagging
the dog.  We want to convey to the users that the option is about
"with extra verbosity, the user is buying a bit more clarity", not
necessarily "use tabular form no matter what".

For some of the commands, tabular presentation might not even be the
presentation form that is the easiest to understand to novices.  For
example, I just pushed out today's integration result to some of my
repositories, and "git push" output looks like so:

To github.com:gitster/git.git
 + 5cb4030332...7dc6f5ada8 ak/color-decorate-symbols -> ak/colo...
 + a71066b71b...c80a646458 jch -> jch (forced update)
 + 89a1ffc6a4...416cdf7260 seen -> seen (forced update)
 + 7ff160463b...2c610ca9ff tb/merge-tree-write-pack -> tb/merge...
   2e3b7b2460..57243409ad  refs/notes/amlog -> refs/notes/amlog

This is already "tabular" and gives enough information to tell me
which ones did not get updated (e.g., I do not see 'next' there) and
which ones are forced ('jch' and 'seen' are usually forced and I'll
notice that I may have made mistakes if they are not).  But a
hypothetical presentation that is easier for novices to read may
show "git log --oneline --graph old...new" (or some abbreviated form
of it) between the old and new tips of the affected branches.  At
that point, calling the improved output as "table" would make little
sense.

For commands that Jacob found it easier to explain in tabular form,
like "git add", it is very possible that two years down the road,
another Jacob comes around and proposes a different presentation
that is even easier for novices to understand, and it may not be a
tabular form.

So be very careful when choosing what to call this new thing, and
avoid naming it after the implementation details (e.g., in what
particular shape the data gets presented) that may turn out not to
be the most important part of the concept.

[Footnote]

 * FWIW, "git status -s" is a tabular presentation.  Maybe we can
   add a more verbose form of "-s" and be done with it for the
   command?

^ permalink raw reply

* Re: [RFC PATCH 0/5] Introduce -t, --table for status/add commands
From: Dragan Simic @ 2023-10-24  2:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jacob Stopak, Oswald Buddenhagen, git
In-Reply-To: <xmqqil6w6al3.fsf@gitster.g>

On 2023-10-24 04:03, Junio C Hamano wrote:
> I think the "use more verbose report format to help relatively
> inexperienced folks, in exchange for spending more screen real
> estate" is a good direction to think about this thing.
> 
> I am not personally interested in adding such support all that much
> myself, but one piece of advice I can offer those who are interested
> is not to be too deeply attached to the word "table".
> 
> ... snip ...
> 
> So be very careful when choosing what to call this new thing, and
> avoid naming it after the implementation details (e.g., in what
> particular shape the data gets presented) that may turn out not to
> be the most important part of the concept.

Totally agreed, "table" simply sneaked in and remained here as the term. 
  Perhaps "<command>.verbose = extra" or something similar would be a 
good choice.

> [Footnote]
> 
>  * FWIW, "git status -s" is a tabular presentation.  Maybe we can
>    add a more verbose form of "-s" and be done with it for the
>    command?

That's also an option.

^ permalink raw reply

* Re: [PATCH v4 4/7] bulk-checkin: implement `SOURCE_INCORE` mode for `bulk_checkin_source`
From: Patrick Steinhardt @ 2023-10-24  6:34 UTC (permalink / raw)
  To: Jeff King
  Cc: Taylor Blau, git, Elijah Newren, Eric W. Biederman,
	Junio C Hamano
In-Reply-To: <20231023185842.GE1537181@coredump.intra.peff.net>

[-- Attachment #1: Type: text/plain, Size: 1213 bytes --]

On Mon, Oct 23, 2023 at 02:58:42PM -0400, Jeff King wrote:
> On Mon, Oct 23, 2023 at 11:19:13AM +0200, Patrick Steinhardt wrote:
> 
> > > +	case SOURCE_INCORE:
> > > +		assert(source->read <= source->size);
> > 
> > Is there any guideline around when to use `assert()` vs `BUG()`? I think
> > that this assertion here is quite critical, because when it does not
> > hold we can end up performing out-of-bounds reads and writes. But as
> > asserts are typically missing in non-debug builds, this safeguard would
> > not do anything for our end users, right?
> 
> I don't think we have a written guideline. My philosophy is: always use
> BUG(), because you will never be surprised that the assertion was not
> compiled in (and I think compiling without assertions is almost
> certainly premature optimization, especially given the way we tend to
> use them).
> 
> -Peff

I'm inclined to agree with your philosophy. Makes me wonder whether we
should write a Coccinelle rule to catch this. But a quick-and-dirty grep
in our codebase shows that such a rule would cause quite a lot of churn:

$ git grep BUG\( | wc -l
677
$ git grep assert\( | wc -l
549

Probably not worth it.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH 08/11] t4207: delete replace references via git-update-ref(1)
From: Patrick Steinhardt @ 2023-10-24  6:42 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Han-Wen Nienhuys
In-Reply-To: <ZTaiYEyhKT/yZwHZ@nand.local>

[-- Attachment #1: Type: text/plain, Size: 3339 bytes --]

On Mon, Oct 23, 2023 at 12:42:08PM -0400, Taylor Blau wrote:
> On Wed, Oct 18, 2023 at 07:35:37AM +0200, Patrick Steinhardt wrote:
> > In t4207 we set up a set of replace objects via git-replace(1). Because
> > these references should not be impacting subsequent tests we also set up
> > some cleanup logic that deletes the replacement references via a call to
> > `rm -rf`. This reaches into the internal implementation details of the
> > reference backend and will thus break when we grow an alternative refdb
> > implementation.
> >
> > Refactor the tests to delete the replacement refs via Git commands so
> > that we become independent of the actual refdb that's in use. As we
> > don't have a nice way to delete all replacements or all references in a
> > certain namespace, we opt for a combination of git-for-each-ref(1) and
> > git-update-ref(1)'s `--stdin` mode.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> >  t/t4207-log-decoration-colors.sh | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/t/t4207-log-decoration-colors.sh b/t/t4207-log-decoration-colors.sh
> > index 21986a866df..d138e513a04 100755
> > --- a/t/t4207-log-decoration-colors.sh
> > +++ b/t/t4207-log-decoration-colors.sh
> > @@ -71,7 +71,7 @@ ${c_tag}tag: ${c_reset}${c_tag}A${c_reset}${c_commit})${c_reset} A
> >  '
> >
> >  test_expect_success 'test coloring with replace-objects' '
> > -	test_when_finished rm -rf .git/refs/replace* &&
> > +	test_when_finished "git for-each-ref refs/replace*/** --format=${SQ}delete %(refname)${SQ} | git update-ref --stdin" &&
> 
> Here and below, should we avoid the for-each-ref showing up on the
> left-hand side of the pipe? I'd think we want something closer to:
> 
>     test_when_finished "git for-each-ref refs/replace*/** --format=${SQ}delete %(refname)${SQ} >in && git update-ref --stdin <in" &&
> 
> But having to quote the --format argument with "${SQ}"s makes the whole
> thing a little awkward to read and parse.
> 
> Do you think that something like the below would be a readability
> improvement?

Yes, this certainly looks like a good improvement to me, thanks!

Patrick

> diff --git a/t/t4207-log-decoration-colors.sh b/t/t4207-log-decoration-colors.sh
> index d138e513a0..de8f6638cb 100755
> --- a/t/t4207-log-decoration-colors.sh
> +++ b/t/t4207-log-decoration-colors.sh
> @@ -70,8 +70,13 @@ ${c_tag}tag: ${c_reset}${c_tag}A${c_reset}${c_commit})${c_reset} A
>  	cmp_filtered_decorations
>  '
> 
> --- >8 ---
> +remove_replace_refs () {
> +	git for-each-ref 'refs/replace*/**' --format='delete %(refname)' >in &&
> +	git update-ref --stdin <in
> +}
> +
>  test_expect_success 'test coloring with replace-objects' '
> -	test_when_finished "git for-each-ref refs/replace*/** --format=${SQ}delete %(refname)${SQ} | git update-ref --stdin" &&
> +	test_when_finished remove_replace_refs &&
>  	test_commit C &&
>  	test_commit D &&
> 
> @@ -99,7 +104,7 @@ EOF
>  '
> 
>  test_expect_success 'test coloring with grafted commit' '
> -	test_when_finished "git for-each-ref refs/replace*/** --format=${SQ}delete %(refname)${SQ} | git update-ref --stdin" &&
> +	test_when_finished remove_replace_refs &&
> 
>  	git replace --graft HEAD HEAD~2 &&
> --- 8< ---
> 
> Thanks,
> Taylor

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* [OUTREACHY] Final Application For Git Internshhip
From: Naomi Ibe @ 2023-10-24  7:47 UTC (permalink / raw)
  To: git

I'm Ibe Naomi Amarachi,I'm a Nigerian and I currently live in
Lagos,Nigeria. I'm a graduate of the African Leadership Xcelerator
Software Engineering program (ALX SE), with specialization in backend
web development and I'm applying for the "Moving Existing Tests to a
Unit Testing Framework" project

Some of my projects that involve working with Shell,C and Git can be found here:

https://github.com/Amarajah/alx-system_engineering-devops

https://github.com/Amarajah/alx-low_level_programming

Git has been a part of my software engineering journey from day one
and it's allowed me to collaborate with peers and also keep track of
my personal projects. Currently Git uses end-to-end tests for error
conditions that could easily be captured by unit tests, the project is
aimed at turning end-to-end tests to unit tests, and I'd love to be a
part of it

My microproject contribution is here:

https://public-inbox.org/git/20231009011652.1791-1-naomi.ibeh69@gmail.com/T/#u

And here is my updated contribution link after review by the Git community:

https://public-inbox.org/git/xmqqttqox5cp.fsf@gitster.g/T/#u

Below is my project timeline:
(Of course I'd be very much willing to work with the community and
mentors to edit it so it perfectly meets up to the community's
expectations)

October 2, 2023 - October 30, 2023

Familiarizing myself with the community , mailing list and
contributing my microproject

November 20, 2023 - December 4, 2023

Familiarizing myself with the already existing tests and also the
chosen unit test framework

Do more research on the internship projects and find better ways to
get it done in harmony with coding guidelines and community
requirements

December 4, 2023 - December 31, 2023

First document the initial state of the test files, then make sure all
test files conform to coding guidelines down to the tiniest details
(e.g git/t/helper/test-write-cache.c and git/t/helper/test-advise.c
have die() messages which do not conform to coding guidelines)

January 2, 2024 - January 31, 2024

Run the tests and verify they still work as originally intended

Begin migrations of test files

Test migrated files and make necessary changes based on feedback
received from teammates and mentors

February 1, 2024 - March 1, 2024

Continue testing migrated files and making necessary changes based on
feedback received from teammates and mentors

Document each step and request for reviews from teammates and mentors

Tidy up the project,make sure all necessary files are migrated, they
all work as intended, they are well documented and that there are no
conflicts

^ permalink raw reply

* Re: [RESEND v2] git-rebase.txt: rewrite docu for fixup/squash (again)
From: Phillip Wood @ 2023-10-24  9:22 UTC (permalink / raw)
  To: Oswald Buddenhagen, phillip.wood
  Cc: git, Junio C Hamano, Christian Couder, Charvi Mendiratta,
	Marc Branchaud, Johannes Sixt
In-Reply-To: <ZTayxB0Nm7AEyafp@ugly>

Hi Oswald

On 23/10/2023 18:52, Oswald Buddenhagen wrote:
> On Mon, Oct 23, 2023 at 05:01:02PM +0100, Phillip Wood wrote:
>> On 23/10/2023 14:00, Oswald Buddenhagen wrote:
>>> +unless "fixup -c" is used. In the latter case, the message is obtained
>>> +only from the "fixup -c" commit (having more than one of these is
>>> +incorrect).
>>
>> This change is incorrect - it is perfectly fine to have more than one 
>> "fixup -c" command. In that case we use the message of the commit of 
>> the final "fixup -c" command.
>>
> i know that this is the case, see the previous thread (which i failed to 
> link by header, cf. 
> https://lore.kernel.org/all/20231020092707.917514-1-oswald.buddenhagen@gmx.de/T/#u ).

Ah, I see Marc has already raised this point.

>> One case where there can be multiple "fixup -c" commands is  when a 
>> commit has been reworded several times via "git commit 
>> --fixup=reword:<commit>" and the user runs "git rebase --autosquash"
>>
> a cleaner solution would be recognizing the situation and not generating 
> these contradicting commands in the first place. of course that would be 
> more complexity, but it would also allow catching accidental use.
> 
> of course i can go back to documenting the status quo, but it seems kind 
> of wrong.

I agree there is an argument for improving the implementation of 
--autosquash but until we do I think it is counterproductive to change 
the documentation like this as it will cause users to wonder why "rebase 
--autosquash" generates a todo list that is incorrect according to the 
documentation.

>> In the case of
>>
>> pick A
>> fixup -C B
>>
>> don't we keep the authorship from A and just use the commit message 
>> from B?
>>
> uhm. we clearly do. that means i was given incorrect advice in 
> https://lore.kernel.org/all/YjXRM5HiRizZ035p@ugly/T/#u (and so the 
> thread is still looking for a resolution) ...

I'll take a look at that thread and comment there.

I do think it is a good idea to document where the authorship of a 
rebased commit comes from.

Best Wishes

Phillip

^ permalink raw reply

* Re: using oldest date when squashing commits
From: Phillip Wood @ 2023-10-24  9:26 UTC (permalink / raw)
  To: Johannes Sixt, Oswald Buddenhagen; +Cc: git
In-Reply-To: <9fae5292-d58f-95da-245b-6e205383cb50@kdbg.org>

On 20/03/2022 08:05, Johannes Sixt wrote:
> Am 19.03.22 um 13:48 schrieb Oswald Buddenhagen:
>> during interactive rebasing, i sometimes find it necessary to move a
>> hunk from one commit to a later one in the branch. now, if that hunk
>> cannot be re-ordered with the later commit due to conflicting with it,
>> it becomes necessary to squash the later commit onto a temporary commit
>> created from the extracted hunk, not the other way around (or using a
>> stash). unfortunately, this causes the author date of the later commit
>> to be reset, which can rather seriously falsify the date if the branch
>> is long-lived.
> 
> You want `fixup -C` in the todo-list. See the hints near the end of the
> todo-list.

Unfortunately "fixup -C" only copies the commit message not the 
authorship (that's usually a good thing but not it means it wont work 
for what Oswald wants to do). Maybe we should add another flag for 
fixup/squash commands to take the authorship from that commit. In the 
meantime creating the temporary commit with "git commit -C" is probably 
the easiest way to keep the original authorship.

Best Wishes

Phillip

^ permalink raw reply

* Re: using oldest date when squashing commits
From: Oswald Buddenhagen @ 2023-10-24 10:18 UTC (permalink / raw)
  To: phillip.wood; +Cc: Johannes Sixt, git
In-Reply-To: <a99b16a8-a06c-4d38-bb78-46ce17411597@gmail.com>

On Tue, Oct 24, 2023 at 10:26:29AM +0100, Phillip Wood wrote:
>On 20/03/2022 08:05, Johannes Sixt wrote:
>> Am 19.03.22 um 13:48 schrieb Oswald Buddenhagen:
>>> during interactive rebasing, i sometimes find it necessary to move a
>>> hunk from one commit to a later one in the branch. now, if that hunk
>>> cannot be re-ordered with the later commit due to conflicting with it,
>>> it becomes necessary to squash the later commit onto a temporary commit
>>> created from the extracted hunk, not the other way around (or using a
>>> stash). unfortunately, this causes the author date of the later commit
>>> to be reset, which can rather seriously falsify the date if the branch
>>> is long-lived.
>> 
>> You want `fixup -C` in the todo-list. See the hints near the end of the
>> todo-list.
>
>Unfortunately "fixup -C" only copies the commit message not the 
>authorship

>(that's usually a good thing
>
why? what would that be useful for? it seems rather counter-intuitive.
it's also inconsistent with commit -c/-C's behavior, which seems like a 
red flag to me.

>but not it means it wont work for what Oswald wants to do).

>Maybe we should add another flag for fixup/squash commands to take the 
>authorship from that commit.
>
that's a possibility. but given the above, it might be better to simply 
change the behavior of -c/-C to keep the UI lean and consistent with 
commit's behavior.

regards

^ permalink raw reply

* ls-remote bug
From: Lior Zeltzer @ 2023-10-24 10:55 UTC (permalink / raw)
  To: git@vger.kernel.org
In-Reply-To: <BL0PR18MB2130C672D503E49F955E04E8BADFA@BL0PR18MB2130.namprd18.prod.outlook.com>

>uname -a
Linux dc3lp-veld0045 3.10.0-1160.21.1.el7.x86_64 #1 SMP Tue Mar 16 18:28:22 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Gerrit version :
3.8.0

Bug description : 
When running ls-remote : sometime data gets cut in the middle

Reproducing :
You need a few files with a few repo names (I used 4 files with 10 repos each)
Call then l1..l4
And the code below just cd into each of them does ls-remote twice and compares the data
Doing it in parallel on all lists.
Data received in both ls-remotes should be the same , if not, it prints ***
Repos should contain a lot of tags and refs

Note : 
1.  without stderr redirection (2>&1) all works well
2. On local repos (not through gerrit) all works well

I compared various git vers and found the bug to be between 2.31.8 and 2.32.0
Comparing ls-remote.c file between those vers gave me :

Lines :
if (transport_disconnect(transport))
		return 1;

moved to end of sub

copying ls-remote.c from 2.31.8 to 2.32.0 - fixed the bug

Code reproducing bug :

#!/proj/mislcad/areas/DAtools/tools/perl/5.10.1/bin/perl -w
use strict;
use Cwd qw(cwd);

my $count = 4;
for my $f (1..$count) {
  my $child = fork();
  if (!$child) {
    my $curr = cwd();

    my @repos = `cat l$f`;
    foreach my $repo (@repos) {
      chomp $repo;
      print "$repo\n";
      chdir($repo);
      my $remote_tags_str = `git ls-remote  2>&1`;
      my $remote_tags_str2 = `git ls-remote  2>&1 `;
      chdir($curr);
      if ( $remote_tags_str ne $remote_tags_str2) {
         print "***\n";
      }
    }

    exit(0);
  }
}
while (wait != -1) {}
1;

^ permalink raw reply

* Re: [PATCH v3 3/3] rev-list: add commit object support in `--missing` option
From: Karthik Nayak @ 2023-10-24 11:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, ps
In-Reply-To: <xmqq1qdptffk.fsf@gitster.g>

On Fri, Oct 20, 2023 at 6:41 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Junio C Hamano <gitster@pobox.com> writes:
>
> > Rather, I was wondering if we need to use object flags to mark these
> > objects, or can do what we want to do without using any object flags
> > at all.  For the purpose of reporting "missing" objects, wouldn't it
> > be sufficient to walk the object graph and report our findings as we
> > go?  To avoid reporting the same object twice, as we reasonably can
> > expect that the missing objects are minority (compared to the total
> > number of objects), perhaps the codepath that makes such a report
> > can use a hashmap of object_ids or something, for example.
>
> Digging from the bottom,
>
>  * builtin/rev-list.c:show_commit() gets "struct rev_list_info *"
>    that has "struct rev_info *" [*].
>
>  * list-objects.c:do_traverse() calls revision.c:get_revision() to
>    obtain commits, some of which may be missing ones, and things
>    behind get_revision() are responsible for marking the commit as
>    missing.  It has "struct traversal_context *", among whose
>    members is the "revs" member that is the "struct rev_info *".
>
>  * revision.c:get_revision() and machinery behind it ultimately
>    discovers a missing commit in the revision.c:process_parents()
>    that loops over the parents commit_list.  It of course has access
>    to "struct rev_info *".
>
> So, presumably, if we add a new member to "struct rev_info" that
> optionally [*] points at an oidset that records the object names of
> missing objects we discovered so far (i.e., the set of missing
> objects), the location we set the MISSING bit of a commit can
> instead add the object name of the commit to the set.  And we can
> export a function that takes "struct rev_info *" and "struct object
> *" (or "struct object_id *") to check for membership in the "set of
> missing objects", which would be used where we checked the MISSING
> bit of a commit.
>
> I do not know the performance implications of going this route, but
> if we do not find a suitable vacant bit, we do not have to use any
> object flags bit to do this, if we go this route, I would think.  I
> may be missing some details that breaks the above outline, though.
>
>
> [Footnotes]
>
>  * A potential #leftoverbits tangent.
>
>    Why is "rev_list_info" structure declared in <bisect.h>?  I
>    suspect that this is a fallout from recent header file shuffling,
>    but given who uses it (among which is rev-list:show_commit() that
>    has very little to do with bisection and uses the information in
>    rev_list_info when doing its normal non-bisect things), it does
>    not make much sense.
>
>  * When .do_not_die_on_missing_objects is false, it can and should
>    be left NULL, but presumably we use the "do not die" bit even
>    when we are not necessarily collecting the missing objects?  So
>    the new member cannot replace the "do not die" bit completely.

Thanks for the suggestion, this does seem like a good way to go ahead without
using flags. The only performance issue being if there are too many commits
which are missing, then oidset would be large.

But I think that's okay though.

> Thanks for researching.  It sounds like it may be a better bit to
> steal than the one used by the commit-graph, as long as there is no
> reason to expect that blame may want to work in a corrupt repository
> with missing objects, but when it happens, we may regret the
> decision we are making here.
>

I don't see blame working with missing commits though, because it relies on
parsing commits to get information to show to the user. So I think it's a safe
bit to steal. Also, when the time comes we could always release the bit and
move to the solution you mentioned above.

Anyways on the whole I think keeping it future compatible makes a lot
more sense.
I'll send a patch series to implement an oidset instead of flags soon.

- Karthik

^ permalink raw reply

* [PATCH v4 1/3] revision: rename bit to `do_not_die_on_missing_objects`
From: Karthik Nayak @ 2023-10-24 12:26 UTC (permalink / raw)
  To: karthik.188; +Cc: git, gitster, ps
In-Reply-To: <20231024122631.158415-1-karthik.188@gmail.com>

The bit `do_not_die_on_missing_tree` is used in revision.h to ensure the
revision walker does not die when encountering a missing tree. This is
currently exclusively set within `builtin/rev-list.c` to ensure the
`--missing` option works with missing trees.

In the upcoming commits, we will extend `--missing` to also support
missing commits. So let's rename the bit to
`do_not_die_on_missing_objects`, which is object type agnostic and can
be used for both trees/commits.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
 builtin/reflog.c   |  2 +-
 builtin/rev-list.c |  2 +-
 list-objects.c     |  2 +-
 revision.h         | 17 +++++++++--------
 4 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/builtin/reflog.c b/builtin/reflog.c
index df63a5892e..9e369a5977 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -298,7 +298,7 @@ static int cmd_reflog_expire(int argc, const char **argv, const char *prefix)
 		struct rev_info revs;
 
 		repo_init_revisions(the_repository, &revs, prefix);
-		revs.do_not_die_on_missing_tree = 1;
+		revs.do_not_die_on_missing_objects = 1;
 		revs.ignore_missing = 1;
 		revs.ignore_missing_links = 1;
 		if (verbose)
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index ff715d6918..ea77489c38 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -561,7 +561,7 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 	}
 
 	if (arg_missing_action)
-		revs.do_not_die_on_missing_tree = 1;
+		revs.do_not_die_on_missing_objects = 1;
 
 	argc = setup_revisions(argc, argv, &revs, &s_r_opt);
 
diff --git a/list-objects.c b/list-objects.c
index c25c72b32c..47296dff2f 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -177,7 +177,7 @@ static void process_tree(struct traversal_context *ctx,
 		    is_promisor_object(&obj->oid))
 			return;
 
-		if (!revs->do_not_die_on_missing_tree)
+		if (!revs->do_not_die_on_missing_objects)
 			die("bad tree object %s", oid_to_hex(&obj->oid));
 	}
 
diff --git a/revision.h b/revision.h
index 50091bbd13..c73c92ef40 100644
--- a/revision.h
+++ b/revision.h
@@ -212,18 +212,19 @@ struct rev_info {
 
 			/*
 			 * Blobs are shown without regard for their existence.
-			 * But not so for trees: unless exclude_promisor_objects
+			 * But not so for trees/commits: unless exclude_promisor_objects
 			 * is set and the tree in question is a promisor object;
 			 * OR ignore_missing_links is set, the revision walker
-			 * dies with a "bad tree object HASH" message when
-			 * encountering a missing tree. For callers that can
-			 * handle missing trees and want them to be filterable
+			 * dies with a "bad <type> object HASH" message when
+			 * encountering a missing object. For callers that can
+			 * handle missing trees/commits and want them to be filterable
 			 * and showable, set this to true. The revision walker
-			 * will filter and show such a missing tree as usual,
-			 * but will not attempt to recurse into this tree
-			 * object.
+			 * will filter and show such a missing object as usual,
+			 * but will not attempt to recurse into this tree/commit
+			 * object. The revision walker will also set the MISSING
+			 * flag for such objects.
 			 */
-			do_not_die_on_missing_tree:1,
+			do_not_die_on_missing_objects:1,
 
 			/* for internal use only */
 			exclude_promisor_objects:1;
-- 
2.42.0


^ permalink raw reply related

* [PATCH v4 2/3] rev-list: move `show_commit()` to the bottom
From: Karthik Nayak @ 2023-10-24 12:26 UTC (permalink / raw)
  To: karthik.188; +Cc: git, gitster, ps
In-Reply-To: <20231024122631.158415-1-karthik.188@gmail.com>

The `show_commit()` function already depends on `finish_commit()`, and
in the upcoming commit, we'll also add a dependency on
`finish_object__ma()`. Since in C symbols must be declared before
they're used, let's move `show_commit()` below both `finish_commit()`
and `finish_object__ma()`, so the code is cleaner as a whole without the
need for declarations.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
 builtin/rev-list.c | 85 +++++++++++++++++++++++-----------------------
 1 file changed, 42 insertions(+), 43 deletions(-)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index ea77489c38..98542e8b3c 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -100,7 +100,48 @@ static off_t get_object_disk_usage(struct object *obj)
 	return size;
 }
 
-static void finish_commit(struct commit *commit);
+static inline void finish_object__ma(struct object *obj)
+{
+	/*
+	 * Whether or not we try to dynamically fetch missing objects
+	 * from the server, we currently DO NOT have the object.  We
+	 * can either print, allow (ignore), or conditionally allow
+	 * (ignore) them.
+	 */
+	switch (arg_missing_action) {
+	case MA_ERROR:
+		die("missing %s object '%s'",
+		    type_name(obj->type), oid_to_hex(&obj->oid));
+		return;
+
+	case MA_ALLOW_ANY:
+		return;
+
+	case MA_PRINT:
+		oidset_insert(&missing_objects, &obj->oid);
+		return;
+
+	case MA_ALLOW_PROMISOR:
+		if (is_promisor_object(&obj->oid))
+			return;
+		die("unexpected missing %s object '%s'",
+		    type_name(obj->type), oid_to_hex(&obj->oid));
+		return;
+
+	default:
+		BUG("unhandled missing_action");
+		return;
+	}
+}
+
+static void finish_commit(struct commit *commit)
+{
+	free_commit_list(commit->parents);
+	commit->parents = NULL;
+	free_commit_buffer(the_repository->parsed_objects,
+			   commit);
+}
+
 static void show_commit(struct commit *commit, void *data)
 {
 	struct rev_list_info *info = data;
@@ -219,48 +260,6 @@ static void show_commit(struct commit *commit, void *data)
 	finish_commit(commit);
 }
 
-static void finish_commit(struct commit *commit)
-{
-	free_commit_list(commit->parents);
-	commit->parents = NULL;
-	free_commit_buffer(the_repository->parsed_objects,
-			   commit);
-}
-
-static inline void finish_object__ma(struct object *obj)
-{
-	/*
-	 * Whether or not we try to dynamically fetch missing objects
-	 * from the server, we currently DO NOT have the object.  We
-	 * can either print, allow (ignore), or conditionally allow
-	 * (ignore) them.
-	 */
-	switch (arg_missing_action) {
-	case MA_ERROR:
-		die("missing %s object '%s'",
-		    type_name(obj->type), oid_to_hex(&obj->oid));
-		return;
-
-	case MA_ALLOW_ANY:
-		return;
-
-	case MA_PRINT:
-		oidset_insert(&missing_objects, &obj->oid);
-		return;
-
-	case MA_ALLOW_PROMISOR:
-		if (is_promisor_object(&obj->oid))
-			return;
-		die("unexpected missing %s object '%s'",
-		    type_name(obj->type), oid_to_hex(&obj->oid));
-		return;
-
-	default:
-		BUG("unhandled missing_action");
-		return;
-	}
-}
-
 static int finish_object(struct object *obj, const char *name UNUSED,
 			 void *cb_data)
 {
-- 
2.42.0


^ permalink raw reply related

* [PATCH v4 0/3] rev-list: add support for commits in `--missing`
From: Karthik Nayak @ 2023-10-24 12:26 UTC (permalink / raw)
  To: karthik.188; +Cc: git, gitster, ps
In-Reply-To: <20231019121024.194317-1-karthik.188@gmail.com>

The `--missing` option in git-rev-list(1) was introduced intitally
to deal with missing blobs in the context of promissory notes.
Eventually the option was extended to also support tree objects in
7c0fe330d5 (rev-list: handle missing tree objects properly,2018-10-05).

This patch series extends the `--missing` option to also add support for
commit objects. We do this by introducing a new flag `MISSING` which is
added whenever we encounter a missing commit during traversal. Then in
`builtin/rev-list` we check for this flag and take the appropriate
action based on the `--missing=*` option used.

This series is an alternate to the patch series I had posted earlier:
https://lore.kernel.org/git/20230908174208.249184-1-karthik.188@gmail.com/.
In that patch, we introduced an option `--ignore-missing-links` which
was added to expose the `ignore_missing_links` bit to the user. The
issue in that patch was that, the option `--ignore-missing-links` didn't
play well the pre-existing `--missing` option. This series avoids that
route and just extends the `--missing` option for commits to solve the
same problem.

V3 of the series can be found here: https://lore.kernel.org/git/20231019121024.194317-1-karthik.188@gmail.com/T/#mf6a442e06f323a78a45af086ddd353998bab0052

Changes from v3:
- Instead of using flags on the object level as in the previous vesrions, we
add and use a missing_objects oidset in rev_info. This avoids flag bit-collision
since we were using an existing bit and doesn't require extending the flag size
as done in v3. 

Changelog vs v3:

1:  8c469cf479 = 1:  8c469cf479 revision: rename bit to `do_not_die_on_missing_objects`
2:  76ce43d973 = 2:  76ce43d973 rev-list: move `show_commit()` to the bottom
3:  4c640f9ab4 ! 3:  d892f0b82d rev-list: add commit object support in `--missing` option
    @@ Commit message
         a fatal error.
     
         Let's extend the functionality of `--missing` option to also support
    -    commit objects. This is done by adding a new `MISSING` flag that the
    -    revision walker sets whenever it encounters a missing tree/commit. The
    -    revision walker will now continue the traversal and call `show_commit()`
    -    even for missing commits. In rev-list we can then check for this flag
    -    and call the existing code for parsing `--missing` objects.
    +    commit objects. This is done by adding a `missing_objects` field to
    +    `rev_info`. This field is an `oidset` to which we'll add the missing
    +    commits as we encounter them. The revision walker will now continue the
    +    traversal and call `show_commit()` even for missing commits. In rev-list
    +    we can then check if the commit is a missing commit and call the
    +    existing code for parsing `--missing` objects.
     
         A scenario where this option would be used is to find the boundary
         objects between different object directories. Consider a repository with
    @@ Commit message
         alternate object directory allows us to find the boundary objects
         between the main and alternate object directory.
     
    +    Helped-by: Patrick Steinhardt <ps@pks.im>
         Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
     
      ## builtin/rev-list.c ##
    @@ builtin/rev-list.c: static void show_commit(struct commit *commit, void *data)
      	display_progress(progress, ++progress_counter);
      
     +	if (revs->do_not_die_on_missing_objects &&
    -+	    commit->object.flags & MISSING) {
    ++	    oidset_contains(&revs->missing_objects, &commit->object.oid)) {
     +		finish_object__ma(&commit->object);
     +		return;
     +	}
    @@ builtin/rev-list.c: static void show_commit(struct commit *commit, void *data)
     
      ## list-objects.c ##
     @@ list-objects.c: static void do_traverse(struct traversal_context *ctx)
    - 		 * an uninteresting boundary commit may not have its tree
    - 		 * parsed yet, but we are not going to show them anyway
      		 */
    --		if (!ctx->revs->tree_objects)
    -+		if (!ctx->revs->tree_objects || commit->object.flags & MISSING)
    + 		if (!ctx->revs->tree_objects)
      			; /* do not bother loading tree */
    ++		else if (ctx->revs->do_not_die_on_missing_objects &&
    ++			 oidset_contains(&ctx->revs->missing_objects, &commit->object.oid))
    ++			;
      		else if (repo_get_commit_tree(the_repository, commit)) {
      			struct tree *tree = repo_get_commit_tree(the_repository,
    -
    - ## object.h ##
    -@@ object.h: void object_array_init(struct object_array *array);
    - 
    - /*
    -  * object flag allocation:
    -- * revision.h:               0---------10         15             23------27
    -+ * revision.h:               0---------10         15             22------28
    -  * fetch-pack.c:             01    67
    -  * negotiator/default.c:       2--5
    -  * walker.c:                 0-2
    -@@ object.h: void object_array_init(struct object_array *array);
    -  * builtin/show-branch.c:    0-------------------------------------------26
    -  * builtin/unpack-objects.c:                                 2021
    -  */
    --#define FLAG_BITS  28
    -+#define FLAG_BITS  29
    - 
    - #define TYPE_BITS 3
    - 
    + 								 commit);
     
      ## revision.c ##
    +@@
    + #include "object-name.h"
    + #include "object-file.h"
    + #include "object-store-ll.h"
    ++#include "oidset.h"
    + #include "tag.h"
    + #include "blob.h"
    + #include "tree.h"
     @@ revision.c: static int process_parents(struct rev_info *revs, struct commit *commit,
    - 	struct commit_list *parent = commit->parents;
    - 	unsigned pass_flags;
      
    --	if (commit->object.flags & ADDED)
    -+	if (commit->object.flags & (ADDED | MISSING))
    + 	if (commit->object.flags & ADDED)
      		return 0;
    ++	if (revs->do_not_die_on_missing_objects &&
    ++		oidset_contains(&revs->missing_objects, &commit->object.oid))
    ++		return 0;
      	commit->object.flags |= ADDED;
      
    + 	if (revs->include_check &&
     @@ revision.c: static int process_parents(struct rev_info *revs, struct commit *commit,
      	for (parent = commit->parents; parent; parent = parent->next) {
      		struct commit *p = parent->item;
    @@ revision.c: static int process_parents(struct rev_info *revs, struct commit *com
     +			if (!revs->do_not_die_on_missing_objects)
     +				return -1;
     +			else
    -+				p->object.flags |= MISSING;
    ++				oidset_insert(&revs->missing_objects, &p->object.oid);
      		}
      		if (revs->sources) {
      			char **slot = revision_sources_at(revs->sources, p);
    +@@ revision.c: int prepare_revision_walk(struct rev_info *revs)
    + 				       FOR_EACH_OBJECT_PROMISOR_ONLY);
    + 	}
    + 
    ++	if (revs->do_not_die_on_missing_objects)
    ++		oidset_init(&revs->missing_objects, 0);
    ++
    + 	if (!revs->reflog_info)
    + 		prepare_to_use_bloom_filter(revs);
    + 	if (!revs->unsorted_input)
     
      ## revision.h ##
     @@
    - #define ANCESTRY_PATH	(1u<<27)
    - #define ALL_REV_FLAGS	(((1u<<11)-1) | NOT_USER_GIVEN | TRACK_LINEAR | PULL_MERGE)
    + #include "commit.h"
    + #include "grep.h"
    + #include "notes.h"
    ++#include "oidset.h"
    + #include "pretty.h"
    + #include "diff.h"
    + #include "commit-slab-decl.h"
    +@@ revision.h: struct rev_info {
      
    -+#define MISSING (1u<<28)
    + 	/* Location where temporary objects for remerge-diff are written. */
    + 	struct tmp_objdir *remerge_objdir;
     +
    - #define DECORATE_SHORT_REFS	1
    - #define DECORATE_FULL_REFS	2
    ++	/* Missing objects to be tracked without failing traversal. */
    ++	struct oidset missing_objects;
    + };
      
    + /**
     
      ## t/t6022-rev-list-missing.sh (new) ##
     @@


Karthik Nayak (3):
  revision: rename bit to `do_not_die_on_missing_objects`
  rev-list: move `show_commit()` to the bottom
  rev-list: add commit object support in `--missing` option

 builtin/reflog.c            |  2 +-
 builtin/rev-list.c          | 93 +++++++++++++++++++------------------
 list-objects.c              |  5 +-
 revision.c                  | 16 ++++++-
 revision.h                  | 21 +++++----
 t/t6022-rev-list-missing.sh | 74 +++++++++++++++++++++++++++++
 6 files changed, 155 insertions(+), 56 deletions(-)
 create mode 100755 t/t6022-rev-list-missing.sh

-- 
2.42.0


^ permalink raw reply

* [PATCH v4 3/3] rev-list: add commit object support in `--missing` option
From: Karthik Nayak @ 2023-10-24 12:26 UTC (permalink / raw)
  To: karthik.188; +Cc: git, gitster, ps
In-Reply-To: <20231024122631.158415-1-karthik.188@gmail.com>

The `--missing` object option in rev-list currently works only with
missing blobs/trees. For missing commits the revision walker fails with
a fatal error.

Let's extend the functionality of `--missing` option to also support
commit objects. This is done by adding a `missing_objects` field to
`rev_info`. This field is an `oidset` to which we'll add the missing
commits as we encounter them. The revision walker will now continue the
traversal and call `show_commit()` even for missing commits. In rev-list
we can then check if the commit is a missing commit and call the
existing code for parsing `--missing` objects.

A scenario where this option would be used is to find the boundary
objects between different object directories. Consider a repository with
a main object directory (GIT_OBJECT_DIRECTORY) and one or more alternate
object directories (GIT_ALTERNATE_OBJECT_DIRECTORIES). In such a
repository, using the `--missing=print` option while disabling the
alternate object directory allows us to find the boundary objects
between the main and alternate object directory.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
 builtin/rev-list.c          |  6 +++
 list-objects.c              |  3 ++
 revision.c                  | 16 +++++++-
 revision.h                  |  4 ++
 t/t6022-rev-list-missing.sh | 74 +++++++++++++++++++++++++++++++++++++
 5 files changed, 101 insertions(+), 2 deletions(-)
 create mode 100755 t/t6022-rev-list-missing.sh

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 98542e8b3c..37b52520b5 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -149,6 +149,12 @@ static void show_commit(struct commit *commit, void *data)
 
 	display_progress(progress, ++progress_counter);
 
+	if (revs->do_not_die_on_missing_objects &&
+	    oidset_contains(&revs->missing_objects, &commit->object.oid)) {
+		finish_object__ma(&commit->object);
+		return;
+	}
+
 	if (show_disk_usage)
 		total_disk_usage += get_object_disk_usage(&commit->object);
 
diff --git a/list-objects.c b/list-objects.c
index 47296dff2f..260089388c 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -389,6 +389,9 @@ static void do_traverse(struct traversal_context *ctx)
 		 */
 		if (!ctx->revs->tree_objects)
 			; /* do not bother loading tree */
+		else if (ctx->revs->do_not_die_on_missing_objects &&
+			 oidset_contains(&ctx->revs->missing_objects, &commit->object.oid))
+			;
 		else if (repo_get_commit_tree(the_repository, commit)) {
 			struct tree *tree = repo_get_commit_tree(the_repository,
 								 commit);
diff --git a/revision.c b/revision.c
index 219dc76716..e60646c1a7 100644
--- a/revision.c
+++ b/revision.c
@@ -6,6 +6,7 @@
 #include "object-name.h"
 #include "object-file.h"
 #include "object-store-ll.h"
+#include "oidset.h"
 #include "tag.h"
 #include "blob.h"
 #include "tree.h"
@@ -1112,6 +1113,9 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
 
 	if (commit->object.flags & ADDED)
 		return 0;
+	if (revs->do_not_die_on_missing_objects &&
+		oidset_contains(&revs->missing_objects, &commit->object.oid))
+		return 0;
 	commit->object.flags |= ADDED;
 
 	if (revs->include_check &&
@@ -1168,7 +1172,8 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
 	for (parent = commit->parents; parent; parent = parent->next) {
 		struct commit *p = parent->item;
 		int gently = revs->ignore_missing_links ||
-			     revs->exclude_promisor_objects;
+			     revs->exclude_promisor_objects ||
+			     revs->do_not_die_on_missing_objects;
 		if (repo_parse_commit_gently(revs->repo, p, gently) < 0) {
 			if (revs->exclude_promisor_objects &&
 			    is_promisor_object(&p->object.oid)) {
@@ -1176,7 +1181,11 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
 					break;
 				continue;
 			}
-			return -1;
+
+			if (!revs->do_not_die_on_missing_objects)
+				return -1;
+			else
+				oidset_insert(&revs->missing_objects, &p->object.oid);
 		}
 		if (revs->sources) {
 			char **slot = revision_sources_at(revs->sources, p);
@@ -3800,6 +3809,9 @@ int prepare_revision_walk(struct rev_info *revs)
 				       FOR_EACH_OBJECT_PROMISOR_ONLY);
 	}
 
+	if (revs->do_not_die_on_missing_objects)
+		oidset_init(&revs->missing_objects, 0);
+
 	if (!revs->reflog_info)
 		prepare_to_use_bloom_filter(revs);
 	if (!revs->unsorted_input)
diff --git a/revision.h b/revision.h
index c73c92ef40..f6bf422f0e 100644
--- a/revision.h
+++ b/revision.h
@@ -4,6 +4,7 @@
 #include "commit.h"
 #include "grep.h"
 #include "notes.h"
+#include "oidset.h"
 #include "pretty.h"
 #include "diff.h"
 #include "commit-slab-decl.h"
@@ -373,6 +374,9 @@ struct rev_info {
 
 	/* Location where temporary objects for remerge-diff are written. */
 	struct tmp_objdir *remerge_objdir;
+
+	/* Missing objects to be tracked without failing traversal. */
+	struct oidset missing_objects;
 };
 
 /**
diff --git a/t/t6022-rev-list-missing.sh b/t/t6022-rev-list-missing.sh
new file mode 100755
index 0000000000..40265a4f66
--- /dev/null
+++ b/t/t6022-rev-list-missing.sh
@@ -0,0 +1,74 @@
+#!/bin/sh
+
+test_description='handling of missing objects in rev-list'
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+# We setup the repository with two commits, this way HEAD is always
+# available and we can hide commit 1.
+test_expect_success 'create repository and alternate directory' '
+	test_commit 1 &&
+	test_commit 2 &&
+	test_commit 3
+'
+
+for obj in "HEAD~1" "HEAD~1^{tree}" "HEAD:1.t"
+do
+	test_expect_success "rev-list --missing=error fails with missing object $obj" '
+		oid="$(git rev-parse $obj)" &&
+		path=".git/objects/$(test_oid_to_path $oid)" &&
+
+		mv "$path" "$path.hidden" &&
+		test_when_finished "mv $path.hidden $path" &&
+
+		test_must_fail git rev-list --missing=error --objects \
+			--no-object-names HEAD
+	'
+done
+
+for obj in "HEAD~1" "HEAD~1^{tree}" "HEAD:1.t"
+do
+	for action in "allow-any" "print"
+	do
+		test_expect_success "rev-list --missing=$action with missing $obj" '
+			oid="$(git rev-parse $obj)" &&
+			path=".git/objects/$(test_oid_to_path $oid)" &&
+
+			# Before the object is made missing, we use rev-list to
+			# get the expected oids.
+			git rev-list --objects --no-object-names \
+				HEAD ^$obj >expect.raw &&
+
+			# Blobs are shared by all commits, so evethough a commit/tree
+			# might be skipped, its blob must be accounted for.
+			if [ $obj != "HEAD:1.t" ]; then
+				echo $(git rev-parse HEAD:1.t) >>expect.raw &&
+				echo $(git rev-parse HEAD:2.t) >>expect.raw
+			fi &&
+
+			mv "$path" "$path.hidden" &&
+			test_when_finished "mv $path.hidden $path" &&
+
+			git rev-list --missing=$action --objects --no-object-names \
+				HEAD >actual.raw &&
+
+			# When the action is to print, we should also add the missing
+			# oid to the expect list.
+			case $action in
+			allow-any)
+				;;
+			print)
+				grep ?$oid actual.raw &&
+				echo ?$oid >>expect.raw
+				;;
+			esac &&
+
+			sort actual.raw >actual &&
+			sort expect.raw >expect &&
+			test_cmp expect actual
+		'
+	done
+done
+
+test_done
-- 
2.42.0


^ permalink raw reply related

* Re: Regression: git send-email fails with "Use of uninitialized value $address" + "unable to extract a valid address"
From: Uwe Kleine-König @ 2023-10-24 13:00 UTC (permalink / raw)
  To: Michael Strawbridge; +Cc: git, Luben Tuikov, entwicklung
In-Reply-To: <68d7e5c3-6b4a-4d0d-9885-f3d4e2199f26@amd.com>

[-- Attachment #1: Type: text/plain, Size: 2375 bytes --]

Hello,

On Fri, Oct 20, 2023 at 05:06:36PM -0400, Michael Strawbridge wrote:
> On 10/20/23 06:04, Uwe Kleine-König wrote:
> > hello,
> > 
> > On Fri, Oct 13, 2023 at 04:14:37PM +0200, Uwe Kleine-König wrote:
> >> Hello,
> >>
> >> 	$ git send-email --to 'A B <a@b.org>, C D <c@d.org>' lala.patch
> >> 	Use of uninitialized value $address in sprintf at /usr/lib/git-core/git-send-email line 1172.
> >> 	error: unable to extract a valid address from:
> >>
> >> This happens for me with git 2.42.0 and also on master (59167d7d09fd, "The seventeenth batch").
> >>
> >> Bisection points at
> >>
> >> 	a8022c5f7b67 ("send-email: expose header information to git-send-email's sendemail-validate hook")
> >>
> >> I didn't try to understand that change and fix the problem.
> > 
> > Another (similar?) problem with non-ascii-chars:
> > 
> > 	$ git send-email --to 'Will Deacon <will@kernel.org>' --to 'Krzysztof Wilczyński <kw@linux.com>' --to 'Lorenzo Pieralisi <lpieralisi@kernel.org>' --cc 'Rob Herring <robh@kernel.org>' --to 'Bjorn Helgaas <bhelgaas@google.com>' --cc 'linux-pci@vger.kernel.org' --cc kernel@pengutronix.de -1 --base=@~
> > 	Use of uninitialized value $address in sprintf at /home/uwe/gsrc/git/git-send-email line 1162.
> > 	error: unable to extract a valid address from:
> > 
> > Bisection points to the same commit, when dropping ń in Krzysztof's
> > name, it works fine.
> > 
> This is interesting.  Thanks for reporting it.  If you are able, could you please try the patches found in the below threads:
> - https://public-inbox.org/git/20230918212004.GC2163162@coredump.intra.peff.net/T/#mae64003cbb72f015bf5c0c04216524fcb6bb8d09

On main (2e8e77cbac8a) this one is already applied, with that the error
message reduces to:

	$ git send-email --to 'Uwe Kleine-König <u.kleine-koenig@pengutronix.de>' -1
	error: unable to extract a valid address from: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

> - https://public-inbox.org/git/f5c6a72b-f888-4d43-8be8-3ce2c878c669@gmail.com/T/#mca12dc95ccfd3ce2b94e7752ebaae9891201084f

This one doesn't help either. With it applied on top of main I get the
sams result as on vanilla main.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* [PATCH 00/12] show-ref: introduce mode to check for ref existence
From: Patrick Steinhardt @ 2023-10-24 13:10 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Eric Sunshine, Han-Wen Nienhuys

[-- Attachment #1: Type: text/plain, Size: 2772 bytes --]

Hi,

this patch series introduces a new `--exists` mode to git-show-ref(1) to
explicitly check for the existence of a reference, only. It tries to
address a gap in our plumbing tools: while we have a plethora of tools
to resolve revisions and thus also references, we do not have any tool
that can generically check for the existence of both direct and symoblic
references without resolving its contents.

This series has been split out of my other patch series that refactors
our test suite to reduce direct access to on-disk data structures. It is
structured as follows:

    - Patches 1-8 refactor the code to stop relying on global variables,
      addressing some smaller issues that surface. Furthermore, the
      different modes that git-show-ref(1) has are made more explicit
      such that the command becomes more extensible.

    - Patch 9 ensures that the user does not request mutually exclusive
      modes.

    - Patch 10 updates the documentation to better reflect how the modes
      are to be used.

    - Patch 11 introduces the new `--exists` mode as well as a bunch of
      tests for it.

    - Patch 12 introduces two test helpers `test_ref_exists` and
      `test_ref_missing` and updates many of our tests to use those
      instead.

I admittedly may have went a bit overboard with this series. But I had a
hard time understanding git-show-ref(1) and how the global state affects
the different modes.

Patrick

[1]: <cover.1697607222.git.ps@pks.im>

Patrick Steinhardt (12):
  builtin/show-ref: convert pattern to a local variable
  builtin/show-ref: split up different subcommands
  builtin/show-ref: fix leaking string buffer
  builtin/show-ref: fix dead code when passing patterns
  builtin/show-ref: refactor `--exclude-existing` options
  builtin/show-ref: stop using global variable to count matches
  builtin/show-ref: stop using global vars for `show_one()`
  builtin/show-ref: refactor options for patterns subcommand
  builtin/show-ref: ensure mutual exclusiveness of subcommands
  builtin/show-ref: explicitly spell out different modes in synopsis
  builtin/show-ref: add new mode to check for reference existence
  t: use git-show-ref(1) to check for ref existence

 Documentation/git-show-ref.txt |  16 +-
 builtin/show-ref.c             | 275 ++++++++++++++++++++++-----------
 t/t1403-show-ref.sh            |  70 +++++++++
 t/t1430-bad-ref-name.sh        |  27 ++--
 t/t3200-branch.sh              |  33 ++--
 t/t5521-pull-options.sh        |   4 +-
 t/t5605-clone-local.sh         |   2 +-
 t/test-lib-functions.sh        |  55 +++++++
 8 files changed, 363 insertions(+), 119 deletions(-)


base-commit: a9ecda2788e229afc9b611acaa26d0d9d4da53ed
-- 
2.42.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* [PATCH 01/12] builtin/show-ref: convert pattern to a local variable
From: Patrick Steinhardt @ 2023-10-24 13:10 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Eric Sunshine, Han-Wen Nienhuys
In-Reply-To: <cover.1698152926.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 3886 bytes --]

The `pattern` variable is a global variable that tracks either the
reference names (not patterns!) for the `--verify` mode or the patterns
for the non-verify mode. This is a bit confusing due to the slightly
different meanings.

Convert the variable to be local. While this does not yet fix the double
meaning of the variable, this change allows us to address it in a
subsequent patch more easily by explicitly splitting up the different
subcommands of git-show-ref(1).

Note that we introduce a `struct show_ref_data` to pass the patterns to
`show_ref()`. While this is overengineered now, we will extend this
structure in a subsequent patch.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/show-ref.c | 46 ++++++++++++++++++++++++++++------------------
 1 file changed, 28 insertions(+), 18 deletions(-)

diff --git a/builtin/show-ref.c b/builtin/show-ref.c
index 5110814f796..7efab14b96c 100644
--- a/builtin/show-ref.c
+++ b/builtin/show-ref.c
@@ -20,7 +20,6 @@ static const char * const show_ref_usage[] = {
 
 static int deref_tags, show_head, tags_only, heads_only, found_match, verify,
 	   quiet, hash_only, abbrev, exclude_arg;
-static const char **pattern;
 static const char *exclude_existing_arg;
 
 static void show_one(const char *refname, const struct object_id *oid)
@@ -50,15 +49,21 @@ static void show_one(const char *refname, const struct object_id *oid)
 	}
 }
 
+struct show_ref_data {
+	const char **patterns;
+};
+
 static int show_ref(const char *refname, const struct object_id *oid,
-		    int flag UNUSED, void *cbdata UNUSED)
+		    int flag UNUSED, void *cbdata)
 {
+	struct show_ref_data *data = cbdata;
+
 	if (show_head && !strcmp(refname, "HEAD"))
 		goto match;
 
-	if (pattern) {
+	if (data->patterns) {
 		int reflen = strlen(refname);
-		const char **p = pattern, *m;
+		const char **p = data->patterns, *m;
 		while ((m = *p++) != NULL) {
 			int len = strlen(m);
 			if (len > reflen)
@@ -180,6 +185,9 @@ static const struct option show_ref_options[] = {
 
 int cmd_show_ref(int argc, const char **argv, const char *prefix)
 {
+	struct show_ref_data show_ref_data = {0};
+	const char **patterns;
+
 	git_config(git_default_config, NULL);
 
 	argc = parse_options(argc, argv, prefix, show_ref_options,
@@ -188,38 +196,40 @@ int cmd_show_ref(int argc, const char **argv, const char *prefix)
 	if (exclude_arg)
 		return exclude_existing(exclude_existing_arg);
 
-	pattern = argv;
-	if (!*pattern)
-		pattern = NULL;
+	patterns = argv;
+	if (!*patterns)
+		patterns = NULL;
 
 	if (verify) {
-		if (!pattern)
+		if (!patterns)
 			die("--verify requires a reference");
-		while (*pattern) {
+		while (*patterns) {
 			struct object_id oid;
 
-			if ((starts_with(*pattern, "refs/") || !strcmp(*pattern, "HEAD")) &&
-			    !read_ref(*pattern, &oid)) {
-				show_one(*pattern, &oid);
+			if ((starts_with(*patterns, "refs/") || !strcmp(*patterns, "HEAD")) &&
+			    !read_ref(*patterns, &oid)) {
+				show_one(*patterns, &oid);
 			}
 			else if (!quiet)
-				die("'%s' - not a valid ref", *pattern);
+				die("'%s' - not a valid ref", *patterns);
 			else
 				return 1;
-			pattern++;
+			patterns++;
 		}
 		return 0;
 	}
 
+	show_ref_data.patterns = patterns;
+
 	if (show_head)
-		head_ref(show_ref, NULL);
+		head_ref(show_ref, &show_ref_data);
 	if (heads_only || tags_only) {
 		if (heads_only)
-			for_each_fullref_in("refs/heads/", show_ref, NULL);
+			for_each_fullref_in("refs/heads/", show_ref, &show_ref_data);
 		if (tags_only)
-			for_each_fullref_in("refs/tags/", show_ref, NULL);
+			for_each_fullref_in("refs/tags/", show_ref, &show_ref_data);
 	} else {
-		for_each_ref(show_ref, NULL);
+		for_each_ref(show_ref, &show_ref_data);
 	}
 	if (!found_match) {
 		if (verify && !quiet)
-- 
2.42.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox