Git development

Git development
 help / color / mirror / Atom feed

* Usema
From: Monchi Asenov @ 2023-01-24  6:42 UTC (permalink / raw)
  To: git



Inviato da iPhone

^ permalink raw reply

* Re: [PATCH v4] sparse-checkout.txt: new document with sparse-checkout directions
From: Elijah Newren @ 2023-01-24  3:17 UTC (permalink / raw)
  To: ZheNing Hu
  Cc: Elijah Newren via GitGitGadget, git, Victoria Dye, Derrick Stolee,
	Shaoxuan Yuan, Matheus Tavares, Glen Choo, Martin von Zweigbergk
In-Reply-To: <CAOLTT8S3qaA7E0=qMHOKKs7F3zcNeXnscL-VrP3pVSHmxc7YZw@mail.gmail.com>

On Mon, Jan 23, 2023 at 7:05 AM ZheNing Hu <adlternative@gmail.com> wrote:
>
> Elijah Newren <newren@gmail.com> 于2023年1月20日周五 12:30写道：
> >
> > On Sat, Jan 14, 2023 at 2:19 AM ZheNing Hu <adlternative@gmail.com> wrote:
[...]
> > > Yeah, it seems that git status fetch these .gitigore files.
> > > So what's the wrong here?
> >
> > Nothing; this looks exactly as I'd expect.
> >
> > `git status` is supposed to check whether untracked files are ignored
> > or not, and it needs .gitignore files to do so.  If an entire
> > directory is missing, then it can avoid loading a .gitignore under
> > that directory, but you set up your patterns such that no directory is
> > likely to be excluded.
> >
>
> This may be a bit strange, why doesn't the cone mode have this problem?

I wouldn't classify it as a problem.  But, cone mode technically can
experience the same behavior too, it's just that it should be fairly
rare.  That's because...

In a working tree, if an entire directory is missing, then obviously
there are no files inside that directory.  If there are no files, then
there cannot be any untracked files.  Since there are no untracked
files within that directory, we do not need to find out which of those
0 files within that directory are ignored.  Therefore, we do not need
the .gitignore file(s) underneath that missing directory.  Since in
cone mode you only have the entire directory or none of it, whenever
we are missing a .gitignore file, it is because we are missing the
whole directory and the entire directory will be missing from the
worktree too.  So, we simply don't typically need the .gitignore files
outside the sparse patterns.

If the directory is present, though, then you do need the .gitignore
files so that you can classify untracked files within that directory
into ignored and non-ignored categories.

Since non-cone mode tends to leave the directory around and have some
files in it, you are going to need the .gitignore files for `git
status` to correctly operate.  That seems to me to be pretty intrinsic
to that mode.

Now, cone mode can also experience this behavior by having a directory
present in the worktree which is normally meant to be missing.  This
could happen due to the user manually creating the directory, or could
be done by git if e.g. there are conflicts in directories outside the
sparse cone when merging/rebasing/etc.  Either of those cases would be
rare, but in such a case we would need to download the .gitignore
files within those directories as well.  But, again, that'd make sense
because we want to know which untracked files in the present directory
are ignored when we run `git status`, so we need the .gitignore files.

> > Maybe there are other special cases that one could attempt to handle
> > (e.g. first check if there are any untracked files in a directory and
> > only then check for which are ignored, but that'd probably take some
> > big refactoring of some hairy dir.c code to accomplish something like
> > that, and you'd have to be really careful to not regress the
> > performance of non-sparse cases).  Personally, I don't think it's
> > worth it.  If you really don't like it, I'd suggest choosing any one
> > of the following ways to avoid it:
> >
>
> This "bug" may not be very important, because it has no application
> at present,  scalar also uses cone mode by default.

I don't consider it a bug.  However, you have some statements below
which appear contradictory to me, which may suggest that I'm
misunderstanding what you are saying or misunderstanding the project
setup you have.

> >   * Include the .gitignore files in your sparse-checkout; might as
> > well since you're going to get them anyway.
>
> This can also seem like a pain in the butt because of the extra
> unnecessary .gitigore downloads.

Why are you calling them unnecessary?  Aren't they required for
correct operation of `git status`?

> >   * Set status.showUntrackedFiles to `no` so that .gitignore files are
> > irrelevant
>
> This may also be a temporary rather than a complete solution.

How so?  Do you expect users to override this option?  If so, isn't
that because they are interested in untracked files and likely want to
know which ones are not ignored?  And if so, isn't it then fine that
we download the .gitignore files needed to answer their questions?

> >   * Use cone mode instead of non-cone mode (and yes, restructure your
> > repo if needed)
>
> No, I might think cone mode has some other disadvantages: it includes
> too many files. I would prefer to get the behavior of non-cone mode.

I understand that restructuring a repository might not be enticing for
a variety of reasons, and sometimes is just flat impossible.  That may
well be the case for you, but I just want to point out that your
reason for not choosing this option seems to suggest you didn't
understand the choice as I worded it, because:

Using cone mode does not necessarily imply more files, once you allow
for restructuring the repository.  If you restructure your repository
such that the files you don't want in your sparse-checkout are in a
separate directory, and you then use cone mode to select directories
other than the ones with files you want excluded, then cone mode would
give you the exact same selection of files as non-cone mode.

[...]
> >   * Remove the .gitignore files and commit their deletion.  Just do
> > without them, somehow.
>
> That's not a proper solution absolutely.

Why not?  Above you called the .gitignore files "unnecessary".  If
they are unnecessary, then surely you can just get rid of them?

The point of the .gitignore files is to allow git to classify
untracked files as either ignored or not.  If classifying untracked
files as ignored-or-not is important to you, then `git status` needs
to download these files in order to do that classifying whenever there
are any untracked files.  So, the question for you is whether this
classifying is necessary.  If it's not necessary, then just delete the
files since they serve no other purpose.  If it is necessary, then why
complain that they are downloaded?

^ permalink raw reply

* Re: [PATCH v4 5/9] rebase: add coverage of other incompatible options
From: Elijah Newren @ 2023-01-24  2:36 UTC (permalink / raw)
  To: phillip.wood
  Cc: Elijah Newren via GitGitGadget, git, Derrick Stolee,
	Eric Sunshine, Martin Ågren, Phillip Wood
In-Reply-To: <759fb313-ce88-4eb7-96c0-4adeb75ca9f9@dunelm.org.uk>

Hi Phillip,

On Mon, Jan 23, 2023 at 12:08 PM Phillip Wood <phillip.wood123@gmail.com> wrote:
>
> Hi Elijah
>
> On 22/01/2023 06:12, Elijah Newren via GitGitGadget wrote:
> > From: Elijah Newren <newren@gmail.com>
> >
> > The git-rebase manual noted several sets of incompatible options, but
> > we were missing tests for a few of these.  Further, we were missing
> > code checks for some of these, which could result in command line
> > options being silently ignored.
> >
> > Also, note that adding a check for autosquash means that using
> > --whitespace=fix together with the config setting rebase.autosquash=true
> > will trigger an error.  A subsequent commit will improve the error
> > message.
>
> Thanks for updating the commit message and for the new commits at the
> end of the series.
>
> > Signed-off-by: Elijah Newren <newren@gmail.com>
> > --- a/builtin/rebase.c
> > +++ b/builtin/rebase.c
> > @@ -1224,6 +1224,26 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
> >               if (options.fork_point < 0)
> >                       options.fork_point = 0;
> >       }
> > +     /*
> > +      * The apply backend does not support --[no-]reapply-cherry-picks.
> > +      * The behavior it implements by default is equivalent to
> > +      * --no-reapply-cherry-picks (due to passing --cherry-picks to
> > +      * format-patch), but --keep-base alters the upstream such that no
> > +      * cherry-picks can be found (effectively making it act like
> > +      * --reapply-cherry-picks).
> > +      *
> > +      * Now, if the user does specify --[no-]reapply-cherry-picks, but
> > +      * does so in such a way that options.reapply_cherry_picks ==
> > +      * keep_base, then the behavior they get will match what they
> > +      * expect despite options.reapply_cherry_picks being ignored.  We
> > +      * could just allow the flag in that case, but it seems better to
> > +      * just alert the user that they've specified a flag that the
> > +      * backend ignores.
> > +      */
>
> I'm a bit confused by this. --keep-base works with either
> --reapply-cherry-picks (which is the default if --keep-base is given) or
> --no-reapply-cherry-picks. Just below this hunk we have
>
>         if (options.reapply_cherry_picks < 0)
>                 options.reapply_cherry_picks = keep_base;
>
> So we only set options.reapply_cherry_picks to match keep_base if the
> user did not specify -[-no]-reapply-cherry-picks on the commandline.

options.reapply_cherry_picks is totally ignored by the apply backend,
regardless of whether it's set by the user or the setup code in
builtin/rebase.c.  And if we have an option which is ignored, isn't it
nicer to provide an error message to the user if they tried to set it?

Said another way, while users could start with these command lines:

    (Y) git rebase --whitespace=fix
    (Z) git rebase --whitespace=fix --keep-base

and modify them to include flags that would be ignored, we could allow:

    (A) git rebase --whitespace=fix --no-reapply-cherry-picks
    (B) git rebase --whitespace=fix --keep-base --reapply-cherry-picks

But we could not allow commands like

    (C) git rebase --whitespace=fix --reapply-cherry-picks
    (D) git rebase --whitespace=fix --keep-base --no-reapply-cherry-picks

For all four cases (A)-(D), the apply backend will ignore whatever
--[no-]reapply-cherry-picks flag is provided.  For (A) and (B), the
behavior the apply backend provides happens to match what the user
is requesting, while for (C) and (D) the behavior does not match.
So we should at least reject (C) and (D).  But, although we could
technically allow (A) and (B), what advantage would it provide?  I
think the results of allowing those two commands would be:

    1) Confusion by end users -- why should (C) & (D) throw errors if
       (A) and (B) are accepted?  That's not an easy rule to understand.

    2) More confusion by end users -- the documentation for years has
       stated that --reapply-cherry-picks is incompatible with the apply
       backend, suggesting users would be surprised if at least (B) and
       probably (A) didn't throw error messages.

    3) Confusing documentation -- If we don't want to throw errors for
       (A) and (B), how do we modify the "INCOMPATIBLE OPTIONS" section
       of Documentation/git-rebase.txt to explain the relevant details
       of when these flags are (or are not) incompatible with the apply
       backend?   I think it'd end up with a very verbose explanation
       that likely confuses more than it helps.

    4) Excessively complicated code -- The previous attempts to
       implement this got it wrong.  Prior to ce5238a690 ("rebase
       --keep-base: imply --reapply-cherry-picks", 2022-10-17), the code
       would error out on (B) and (C).  After that commit, it would only
       error out on (C).  Both solutions are incorrect since they miss
       (D), and I think the code just becomes hard to hard to follow in
       order to only error out on both (C) and (D) without (A) and (B).

(#2 and #3 might just be a repeat of the same issue, documentation,
but it seemed easier to write separately.)

I think it's simpler for the code, for the documentation, and for end
users to just error out on all of (A), (B), (C), and (D).
 --[no-]reapply-cherry-picks is not supported by the apply backend.

But, given this lengthy email, perhaps I should split out the handling
of --[no-]reapply-cherry-picks into its own commit and copy some or
all of the description above into the commit message?

^ permalink raw reply

* Re: [PATCH v4 0/9] rebase: fix several code/testing/documentation issues around flag incompatibilities
From: Elijah Newren @ 2023-01-24  2:05 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Elijah Newren via GitGitGadget, git, Eric Sunshine,
	Martin Ågren, Phillip Wood
In-Reply-To: <1d9748df-4f54-908d-75cf-49ff1d154fcb@github.com>

On Mon, Jan 23, 2023 at 7:56 AM Derrick Stolee <derrickstolee@github.com> wrote:
>
> On 1/22/2023 1:12 AM, Elijah Newren via GitGitGadget wrote:
> > We had a report about --update-refs being ignored when --whitespace=fix was
> > passed, confusing an end user. These were already marked as incompatible in
> > the manual, but the code didn't check for the incompatibility and provide an
> > error to the user.
> >
> > Folks brought up other flags tangentially when reviewing an earlier round of
> > this series, and I found we had more that were missing checks, and that we
> > were missing some testcases, and that the documentation was wrong on some of
> > the relevant options. So, I ended up getting lots of little fixes to
> > straighten these all out.
>
> Wow, this really expanded since I last looked at it. Thanks for taking on all
> of that extra work! (That was not my intention when recommending that you take
> over the fix.)

Yeah, I know you were willing to let some of the work wait for some
future series, and I intended to take that route.  But...
  * both you and Phillip brought up --autosquash, and it turns out we
didn't check it
  * the above made me check the whole list of incompatibilities and
discover other missing checks
  * adding tests made me discover that the documented
incompatibilities had a few issues
  * you mentioned that an explicit --apply wasn't checked either, and
since I was already diving in I figured I might as well handle that
too
  * even though you mentioned the config fix wasn't needed, both Junio
and Phillip brought it up as well,
    and based on the various feedback gotten so far, I started think
that just addressing it might require
    less work than going through more back and forth on review of the
feature without that implementation.

> > Changes since v3:
> >
> >  * Corrected the code surrounding --[no-]reapply-cherry-picks, and extended
> >    the testcases (Thanks to Phillip for pointing out my error)
> >  * I went ahead and implemented the better error message when the merge
> >    backend is triggered solely via config options.
>
> I really appreciate this extra attention to detail. I'm also really happy with
> how you implemented it, using different variables to signal how the option was
> specified (and using -1 for "unset" in both cases).
>
> While I had not been following version 2 or 3, I read this version in its
> entirety and everything looked good to me. These improvements to our docs,
> tests, and implementation will be felt by users.

Thanks for reading!

^ permalink raw reply

* Re: [PATCH 5/5] hook: support a --to-stdin=<path> option for testing
From: Michael Strawbridge @ 2023-01-24  0:59 UTC (permalink / raw)
  To: Junio C Hamano, Ævar Arnfjörð Bjarmason
  Cc: git, Emily Shaffer, Eric Sunshine, Felipe Contreras, Taylor Blau
In-Reply-To: <xmqqtu0gkaye.fsf@gitster.g>


On 2023-01-23 19:55, Junio C Hamano wrote:
> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>> From: Emily Shaffer <emilyshaffer@google.com>
>>
>> Expose the "path_to_stdin" API added in the preceding commit in the
>> "git hook run" command. For now we won't be using this command
>> interface outside of the tests, but exposing this functionality makes
>> it easier to test the hook API.
> Presumably, the send-email validation topic would be using this
> immediately once it becomes available, no?

Yes I would be trying to use it for my send-email header patch right away.

>
> When "git hook" finds and runs more than one hook script, do they
> get the same input?  What I am wondering is if "to-stdin" should be
> exposed like this interface (which may be sufficient for testing
> purposes).  I imagine that scripters (e.g. send-email developers)
> would find it more convenient if they do not have to come up with a
> temporary file and they can just run "git hook" and feed whatever
> they want to give to the hook from its standard input.  "git hook"
> command, upon startup, should do the reading of its standard input
> and spooling it to a temporary file it uses to pass the contents to
> the hook scripts, in other words.
>
> Other than that, it surely looks not just handy for tests, but has
> immediate uses.
>
> Thanks.

^ permalink raw reply

* Re: [PATCH 5/5] hook: support a --to-stdin=<path> option for testing
From: Junio C Hamano @ 2023-01-24  0:55 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Emily Shaffer, Eric Sunshine, Felipe Contreras, Taylor Blau,
	Michael Strawbridge
In-Reply-To: <patch-5.5-cb9ef7a89c4-20230123T170551Z-avarab@gmail.com>

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> From: Emily Shaffer <emilyshaffer@google.com>
>
> Expose the "path_to_stdin" API added in the preceding commit in the
> "git hook run" command. For now we won't be using this command
> interface outside of the tests, but exposing this functionality makes
> it easier to test the hook API.

Presumably, the send-email validation topic would be using this
immediately once it becomes available, no?

When "git hook" finds and runs more than one hook script, do they
get the same input?  What I am wondering is if "to-stdin" should be
exposed like this interface (which may be sufficient for testing
purposes).  I imagine that scripters (e.g. send-email developers)
would find it more convenient if they do not have to come up with a
temporary file and they can just run "git hook" and feed whatever
they want to give to the hook from its standard input.  "git hook"
command, upon startup, should do the reading of its standard input
and spooling it to a temporary file it uses to pass the contents to
the hook scripts, in other words.

Other than that, it surely looks not just handy for tests, but has
immediate uses.

Thanks.

^ permalink raw reply

* Re: [PATCH 3/5] hook API: support passing stdin to hooks, convert am's 'post-rewrite'
From: Junio C Hamano @ 2023-01-23 23:13 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Emily Shaffer, Eric Sunshine, Felipe Contreras, Taylor Blau,
	Michael Strawbridge
In-Reply-To: <patch-3.5-c6b9b69c516-20230123T170551Z-avarab@gmail.com>

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> @@ -53,8 +53,14 @@ static int pick_next_hook(struct child_process *cp,
>  	if (!hook_path)
>  		return 0;
>  
> -	cp->no_stdin = 1;
>  	strvec_pushv(&cp->env, hook_cb->options->env.v);
> +	/* reopen the file for stdin; run_command closes it. */
> +	if (hook_cb->options->path_to_stdin) {
> +		cp->no_stdin = 0;
> +		cp->in = xopen(hook_cb->options->path_to_stdin, O_RDONLY);
> +	} else {
> +		cp->no_stdin = 1;
> +	}

By the way, using the path_to_stdin as the customization machinery
for the API users, and keeping it to the API implementation to
actually open the file and stuff .in member with it, is a good way
to make sure that multiple processes do not compete for the same
standard input stream.  IOW, what I was worried about in my review
of [2/5] is addressed by this mechanism.

Thanks.

^ permalink raw reply

* Re: [PATCH 3/5] hook API: support passing stdin to hooks, convert am's 'post-rewrite'
From: Junio C Hamano @ 2023-01-23 23:10 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Emily Shaffer, Eric Sunshine, Felipe Contreras, Taylor Blau,
	Michael Strawbridge
In-Reply-To: <patch-3.5-c6b9b69c516-20230123T170551Z-avarab@gmail.com>

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> diff --git a/hook.c b/hook.c
> index a4fa1031f28..86c6dc1fe70 100644
> --- a/hook.c
> +++ b/hook.c
> @@ -53,8 +53,14 @@ static int pick_next_hook(struct child_process *cp,
>  	if (!hook_path)
>  		return 0;
>  
> -	cp->no_stdin = 1;
>  	strvec_pushv(&cp->env, hook_cb->options->env.v);
> +	/* reopen the file for stdin; run_command closes it. */
> +	if (hook_cb->options->path_to_stdin) {
> +		cp->no_stdin = 0;
> +		cp->in = xopen(hook_cb->options->path_to_stdin, O_RDONLY);
> +	} else {
> +		cp->no_stdin = 1;
> +	}

Do we need this else clause?  I thought that we've made sure
no_stdin is the default.  Is it just being explicit?

> diff --git a/hook.h b/hook.h
> index 4258b13da0d..19ab9a5806e 100644
> --- a/hook.h
> +++ b/hook.h
> @@ -30,6 +30,11 @@ struct run_hooks_opt
>  	 * was invoked.
>  	 */
>  	int *invoked_hook;
> +
> +	/**
> +	 * Path to file which should be piped to stdin for each hook.
> +	 */
> +	const char *path_to_stdin;
>  };
>  
>  #define RUN_HOOKS_OPT_INIT { \

^ permalink raw reply

* Re: [PATCH 2/5] run-command: allow stdin for run_processes_parallel
From: Junio C Hamano @ 2023-01-23 22:52 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Emily Shaffer, Eric Sunshine, Felipe Contreras, Taylor Blau,
	Michael Strawbridge
In-Reply-To: <patch-2.5-81eef2f60a0-20230123T170551Z-avarab@gmail.com>

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> +	/*
> +	 * By default, do not inherit stdin from the parent process - otherwise,
> +	 * all children would share stdin! Users may overwrite this to provide
> +	 * something to the child's stdin by having their 'get_next_task'
> +	 * callback assign 0 to .no_stdin and an appropriate integer to .in.
> +	 */
> +	pp->children[i].process.no_stdin = 1;
> +
>  	code = opts->get_next_task(&pp->children[i].process,
>  				   opts->ungroup ? NULL : &pp->children[i].err,
>  				   opts->data,
> @@ -1601,7 +1609,6 @@ static int pp_start_one(struct parallel_processes *pp,
>  		pp->children[i].process.err = -1;
>  		pp->children[i].process.stdout_to_stderr = 1;
>  	}
> -	pp->children[i].process.no_stdin = 1;

For this single process, by default .no_stdin is set before it is
passed to start_command(), so the default behaviour does not change.

This needs a new safety to ensure that processes that have .no_stdin
turned off do not share the same value in their .in member, doesn't
it?  Hopefully that will be added in a later step in the series?

Provided that such a safety will appear in the end result, this
conversion does make sense to me.  Let's read on.

>  	if (start_command(&pp->children[i].process)) {
>  		if (opts->start_failure)

^ permalink raw reply

* Re: [PATCH 1/5] run-command.c: remove dead assignment in while-loop
From: Junio C Hamano @ 2023-01-23 22:48 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Emily Shaffer, Eric Sunshine, Felipe Contreras, Taylor Blau,
	Michael Strawbridge
In-Reply-To: <patch-1.5-351c6a55a41-20230123T170551Z-avarab@gmail.com>

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Remove code that's been unused since it was added in
> c553c72eed6 (run-command: add an asynchronous parallel child
> processor, 2015-12-15), the next use of "i" in this function is:
>
> 	for (i = 0; ...

And it has been updated to a different type, i.e.

	for (size_t i = 0; ...

so it doubly makes sense to kill that unused variable.

Makes sense.

> So we'll always clobber the "i" that's set here. Presumably the "i"
> assignment is an artifact of WIP code that made it into our tree.
>
> A subsequent commit will need to adjust the type of the "i" variable
> in the otherwise unrelated for-loop, which is why this is being
> removed now.

That, together with the earlier mention of the other i (which I
think came from the same source---perhaps the original topic this
was taken from had int->size_t change in it) are both stale.

Please proofread what you send out before submitting.

Thanks.

^ permalink raw reply

* Re: [PATCH v2 2/2] ls-files: add %(skipworktree) atom to format option
From: Junio C Hamano @ 2023-01-23 22:39 UTC (permalink / raw)
  To: ZheNing Hu
  Cc: Elijah Newren, ZheNing Hu via GitGitGadget, git,
	Ævar Arnfjörð Bjarmason, Derrick Stolee,
	Victoria Dye, Nguyễn Thái Ngọc Duy
In-Reply-To: <CAOLTT8TztbirR9FmD0s_5iPQ9+NETfecXHE8xeJDNXQUNojSJA@mail.gmail.com>

ZheNing Hu <adlternative@gmail.com> writes:

>> Perhaps it becomes useful in conjunction with %(if) and friends,
>> when they become avaiable?
>> ...
> Can this strange design be considered as a bad design of %(if) and
> %(else) in ref-filter?

Sorry, but I am not sure what "strange design" you are referring to.

On the ref-filter side, thanks to these conditional formatting
primitives, a boolean %(token) that gives an empty string vs a
non-empty string does make sense for a good design.  If another
--format="" wants to imitate it, it would be a good idea to first
see what part of the implementation can be shared and reused from
there.

^ permalink raw reply

* Re: [PATCH v2 01/10] bundle: optionally skip reachability walk
From: Junio C Hamano @ 2023-01-23 22:30 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Derrick Stolee via GitGitGadget, git, me, vdye, avarab, steadmon,
	chooglen
In-Reply-To: <xmqqilgxm2ky.fsf@gitster.g>

Junio C Hamano <gitster@pobox.com> writes:

> A repository having some unreachable objects floating in the object
> store is not corrupt.  As long as all the objects reachable from refs
> are connected, that is a perfectly sane state.
>
> But allowing unbundling with the sanity check loosened WILL corrupt
> it, at the moment you point some objects from the bundle with refs.

While all of the above is true, I think existing check done by
bundle.c::verify_bundle() is stricter than necessary.  We make sure
that the prerequiste objects exist and are reachable from the refs.
But for the purpose of ensuring the health of the repo after the
operation, it is also OK if the prerequisite objects exist and they
pass connected.c::check_connected() test to reach existing refs.
verify_bundle() that is used in unbundle() does not allow it.

In a simplest case, with a single ref and a single strand of pearls
history, with a few cruft history that are connected to the main
history but are not anchored by any ref (e.g. the tip of 'main' was
rewound recently and we haven't done any GC).

                             7---8 (cruft)
                            /
   0---1---2---3---4---5---6 refs/heads/main

And they have another repository created back when '5' was the
latest and greatest, which built three commits on top of it.

   0---1---2---3---4---5---A---B---C

They create a bundle of 5..C to update us to C.  In the meantime, we
have created three commits ourselves (6, 7, and 8) but threw two
away.

If a bundle requires commit '5', because it is reachable from an
existing ref (which points at the 'main' branch), we can unbundle it
and point a ref at the tip of the history contained within the
bundle safely.  Commit '5' being pointed by a ref means the commit,
its ancestors, and all trees and blobs referenced are available to
the repository (some may be fetched lazily from promisor), and
unless the producer lied and placed unconnected data in the bundle,
unpacking a bundle on top of '5' should give us all the objects that
are needed to complete the new tip proposed by the bundle (e.g. 'C').

                             7---8 (cruft)
                            /
   0---1---2---3---4---5---6 refs/heads/main
                        \
                         A---B---C refs/bundle-1/main

And existing check that I called "sticter than necessary" easily
makes sure that it is safe to point 'C' with our ref.

Imagine another party cloned us back when 'main' was pointing at '8'
(which we since then have rewound), and built a few commits on it.

                                   X---Y refs/bundle-2/main
                                  /
                             7---8 (cruft)
                            /
   0---1---2---3---4---5---6 refs/heads/main

As they did not know we'd rewind and discard 7 and 8, they would
have created their bundle to cover 8..Y.  The current test will fail
because '8' that is prerequisite is not reachable from any ref on
our side.  But that is too pessimistic.

As long as we haven't garbage collected so that all objects
reachable from '7' and '8' are available to this repository,
however, we should be able to unbundle the bundle that has '8' as
its prerequisite.  For that, we only need that '8' passes the
check_connected() check, which essentially means "we shouldn't find
any missing link while traversing history from '8' that stops at any
existing refs".

Again this relies on the fact that unbundling code makes sure that
incoming data is fully connected (i.e. bundle producer did not lie
about the prerequisite).

^ permalink raw reply

* Re: [PATCH v4] win32: fix thread usage for win32
From: Johannes Sixt @ 2023-01-23 21:47 UTC (permalink / raw)
  To: Jeff Hostetler, Rose via GitGitGadget; +Cc: Seija Kijin, git
In-Reply-To: <9e75f76b-081c-c763-0fae-edd6d97fbc88@jeffhostetler.com>

Am 23.01.23 um 18:43 schrieb Jeff Hostetler:
> 
> 
> On 1/23/23 11:48 AM, Rose via GitGitGadget wrote:
>> From: Seija Kijin <doremylover123@gmail.com>
>>
>> Use _beginthreadex instead of CreateThread
>> since we use the Windows CRT,
>> as Microsoft recommends _beginthreadex
>> over CreateThread for these situations.
>>
>> Finally, check for NULL handles, not "INVALID_HANDLE,"
>> as _beginthreadex guarantees a valid handle in most cases
>>
>> Signed-off-by: Seija Kijin <doremylover123@gmail.com>
>> ---
>>      win32: fix thread usage for win32
>>           Use pthread_exit instead of async_exit.
>>           This means we do not have to deal with Windows's implementation
>>      requiring an unsigned exit coded despite the POSIX exit code
>> requiring a
>>      signed exit code.
>>           Use _beginthreadex instead of CreateThread since we use the
>> Windows CRT.
>>           Finally, check for NULL handles, not "INVALID_HANDLE," as
>> _beginthreadex
>>      guarantees a valid handle in most cases
>>           Signed-off-by: Seija Kijin doremylover123@gmail.com
>>
>> Published-As:
>> https://github.com/gitgitgadget/git/releases/tag/pr-git-1440%2FAtariDreams%2FCreateThread-v4
>> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git
>> pr-git-1440/AtariDreams/CreateThread-v4
>> Pull-Request: https://github.com/git/git/pull/1440
>>
>> Range-diff vs v3:
>>
>>   1:  68baafba2bd ! 1:  2e2d5ce7745 win32: fix thread usage for win32
>>       @@ Commit message
>>                   Signed-off-by: Seija Kijin <doremylover123@gmail.com>
>>              - ## compat/mingw.c ##
>>       -@@ compat/mingw.c: static int start_timer_thread(void)
>>       -     timer_event = CreateEvent(NULL, FALSE, FALSE, NULL);
>>       -     if (timer_event) {
>>       -         timer_thread = (HANDLE) _beginthreadex(NULL, 0,
>> ticktack, NULL, 0, NULL);
>>       --        if (!timer_thread )
>>       -+        if (!timer_thread)
>>       -             return errno = ENOMEM,
>>       -                 error("cannot start timer thread");
>>       -     } else
>>       -
>>         ## compat/winansi.c ##
>>        @@ compat/winansi.c: enum {
>>             TEXT = 0, ESCAPE = 033, BRACKET = '['
>>
>>
>>   compat/winansi.c | 8 ++++----
>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/compat/winansi.c b/compat/winansi.c
>> index 3abe8dd5a27..be65b27bd75 100644
>> --- a/compat/winansi.c
>> +++ b/compat/winansi.c
>> @@ -340,7 +340,7 @@ enum {
>>       TEXT = 0, ESCAPE = 033, BRACKET = '['
>>   };
>>   -static DWORD WINAPI console_thread(LPVOID unused)
>> +static unsigned int WINAPI console_thread(LPVOID unused)
>>   {
>>       unsigned char buffer[BUFFER_SIZE];
>>       DWORD bytes;
>> @@ -643,9 +643,9 @@ void winansi_init(void)
>>           die_lasterr("CreateFile for named pipe failed");
>>         /* start console spool thread on the pipe's read end */
>> -    hthread = CreateThread(NULL, 0, console_thread, NULL, 0, NULL);
>> -    if (hthread == INVALID_HANDLE_VALUE)
>> -        die_lasterr("CreateThread(console_thread) failed");
>> +    hthread = (HANDLE)_beginthreadex(NULL, 0, console_thread, NULL,
>> 0, NULL);
>> +    if (!hthread)
>> +        die_lasterr("_beginthreadex(console_thread) failed");
>>         /* schedule cleanup routine */
>>       if (atexit(winansi_exit))
>>
>> base-commit: 56c8fb1e95377900ec9d53c07886022af0a5d3c2
> 
> This change may or may not be harmless, but it scares me
> because it is possibly a very subtle change and is being
> made for an unknown reason -- is there a problem being
> fixed here?  Or is this just churn for the sake of churn
> to avoid an awkward cast of the return code?
> 
> What does _beginthreadex() specifically do that we need
> it to do for us?
> 
> _beginthreadex() does some CRT init and then calls CreateThread(),
> so what are we missing by calling CreateThread() directly?

I also question the value of this change. As long as the thread does not
call into any CRT functions, we do not need the services of
_beginthreadex(). AFAICS, it only uses WinAPI functions and some
uncritical C functions like memmove and memset. Am I missing something?

> 
> The code in question is 11+ years old and it hasn't been a
> problem (right?), so I have to wonder what value do we get
> from this change.
> 
> The containing function here is setting up a special console
> thread and named pipe to access the console, so I doubt that
> any of the tests in the test suite actually would actually
> exercise this change (since the tests aren't interactive).
> 
> The low-level Windows startup code is very tricky and sensitive
> (and we need to test with both GCC's CRT and MSVC's CRT).
> As I said earlier, the change may or may not be harmless, but
> I question the need for it.
> 
> Jeff
> 
> 


^ permalink raw reply

* Re: [PATCH v2 01/10] bundle: optionally skip reachability walk
From: Junio C Hamano @ 2023-01-23 21:08 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Derrick Stolee via GitGitGadget, git, me, vdye, avarab, steadmon,
	chooglen
In-Reply-To: <eae85534-89c9-6eff-69d5-7d4b2be85fb6@github.com>

Derrick Stolee <derrickstolee@github.com> writes:

> Do we do the same as we unpack from a fetch?

We should.

We only consider tips of refs and objects that are reachable from
them to be "present", and there may be random objects that float in
the object store without any guarantee that no the objects that
ought to be reachable from them are missing from the object store,
but they do not play part in the common ancestor discovery.

And then after we unpack, we ensure that the proposed updates to
refs made by the fetch operation will not corrupt the repository.
This can be guaranteed by making sure that objects to be placed at
the updated tip can all reach some existing tips of refs.  We trust
refs before the operation (in this case, 'git fetch') begins.  We
ensure that refs after the operation can be trusted before updating
them.  Where "trust" here means "objects pointed at by them are
connected.

^ permalink raw reply

* Re: [PATCH v2 01/10] bundle: optionally skip reachability walk
From: Junio C Hamano @ 2023-01-23 20:13 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Derrick Stolee via GitGitGadget, git, me, vdye, avarab, steadmon,
	chooglen
In-Reply-To: <eae85534-89c9-6eff-69d5-7d4b2be85fb6@github.com>

Derrick Stolee <derrickstolee@github.com> writes:

> We are specifically removing the requirement that the objects are
> reachable from refs, we still check that the objects are in the
> object store. Thus, we can only be in a bad state afterwards if
> the required objects for a bundle were in the object store,
> previously unreachable, and one of these two things happened:
>
> 1. Some objects reachable from those required commits were already
>    missing in the repository (so the repo's object store was broken
>    but only for some unreachable objects).

A repository having some unreachable objects floating in the object
store is not corrupt.  As long as all the objects reachable from refs
are connected, that is a perfectly sane state.

But allowing unbundling with the sanity check loosened WILL corrupt
it, at the moment you point some objects from the bundle with refs.

> I think we should trust the repository to not be in the first state,

So, I think this line of thought is simply mistaken.

>> I am OK as long as we check the assumption holds true at the end;
>> this looks like a good optimization.
>  
> So are you recommending that we verify all objects reachable from
> the new refs/bundles/* are present after unbundling?

Making sure that prerequisites are connected will reduce the span of
the DAG we would need to verify.  After unbundling all bundles, but
before updating the refs to point at the tips in the bundles, if we
can make sure that these prerequisite objects named in the bundles
are reachable from the tips recorded in the bundles, while stopping
the traversal at the tips of original refs (remember: we have only
updated objects in the object store, but haven't updated the refs from
the bundles), that would allow us to make sure that the updates to
refs proposed by the bundles will not corrupt the repository.

^ permalink raw reply

* Re: nested submodules detection w/o .gitmodules file
From: Emily Shaffer @ 2023-01-23 20:12 UTC (permalink / raw)
  To: Andry; +Cc: git
In-Reply-To: <1716310675.20230122233403@inbox.ru>

On Sun, Jan 22, 2023 at 12:45 PM Andry <andry@inbox.ru> wrote:
>
> Hello Git,
>
> I have a pretty long investigation has been started from usage 3dparty projects related directly or indirectly to the git submodules:
>
> `svn externals replacement` : https://github.com/chronoxor/gil/issues/6
> `svn complete replacement for externals` : https://github.com/dirk-thomas/vcstool/issues/243
>
> And stumbled on this discussion:
>
> `nested submodules detection w/o .gitmodules file` : https://github.com/gitextensions/gitextensions/discussions/10644 (https://github.com/gitextensions/gitextensions/issues/10642)
>
> The main question here is that, could the git have has submodules without `.submodules` file?

There is a little nuance here. Git can have nested repositories in a
few different ways; submodules are just one of them. A submodule is
the combination of a gitlink object in the object graph *and* a
corresponding entry in the .gitmodules file. It's certainly possible
to embed a nested repository in other ways, such as by ignoring the
.gitmodules file, but then your nested repository is no longer a
submodule, and operations which recurse over submodules will not
consider that nested repository. The reason is that we need to
understand where to clone the submodule from - that information isn't
contained in the superproject's repository URL, and it can't be
contained in the gitlink directly (which is in essence just a commit
object, but one that exists in the nested repository, not in the
superproject repository). If we didn't have a way of writing down the
submodule's remote URL and version controlling it, we wouldn't have a
way to populate the submodules when a user is cloning.

>
> If no, then all side projects which utilizes it's own input file for the externals may subsequentially fail:
>
> https://github.com/chronoxor/gil
> https://github.com/dirk-thomas/vcstool
> https://github.com/ingydotnet/git-subrepo
>
> If yes, then other projects which does rely on the `.submodules` would have not actual or even invalid state:
>
> https://github.com/gitextensions/gitextensions
>
> Or even the github itself:
>
> `Zip archive to include submodule` : https://github.com/dear-github/dear-github/issues/214
>
> (`[PATCH] archive: add –recurse-submodules to git-archive command` : https://git.github.io/rev_news/2022/11/30/edition-93/, https://lore.kernel.org/git/pull.1359.git.git.1665597148042.gitgitgadget@gmail.com/)
>
> Mine point here is that:
>
> Git database is a primary storage. The `.gitmodules` file is not a primary storage, so can be in not an actual or desync state with the database.
> And any application or a 3dparty project must read the database directly.

The .gitmodules exists to help at clone time; it's possible, as I
think you're pointing out, to have some intermediate state locally.
But this file is what needs to be the source of truth for putting
together the repository on a new machine for the first time.

>
> But another problem here is that the git still does not have a stable API for that.
> For example, a submodule can be declared directly from the `.git/config` file in a working copy:
>
> https://git-scm.com/docs/git-submodule#Documentation/git-submodule.txt-init--ltpathgt82308203
> https://git-scm.com/docs/gitsubmodules#_active_submodules
>
> So, who is right and what is wrong here?
>

^ permalink raw reply

* Re: [PATCH v4 5/9] rebase: add coverage of other incompatible options
From: Phillip Wood @ 2023-01-23 20:08 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget, git
  Cc: Derrick Stolee, Elijah Newren, Eric Sunshine, Martin Ågren,
	Phillip Wood
In-Reply-To: <5e4851e611ee18112bd71939ee900e02a8d590c5.1674367961.git.gitgitgadget@gmail.com>

Hi Elijah

On 22/01/2023 06:12, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>
> 
> The git-rebase manual noted several sets of incompatible options, but
> we were missing tests for a few of these.  Further, we were missing
> code checks for some of these, which could result in command line
> options being silently ignored.
> 
> Also, note that adding a check for autosquash means that using
> --whitespace=fix together with the config setting rebase.autosquash=true
> will trigger an error.  A subsequent commit will improve the error
> message.

Thanks for updating the commit message and for the new commits at the 
end of the series.

> Signed-off-by: Elijah Newren <newren@gmail.com>
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -1224,6 +1224,26 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>   		if (options.fork_point < 0)
>   			options.fork_point = 0;
>   	}
> +	/*
> +	 * The apply backend does not support --[no-]reapply-cherry-picks.
> +	 * The behavior it implements by default is equivalent to
> +	 * --no-reapply-cherry-picks (due to passing --cherry-picks to
> +	 * format-patch), but --keep-base alters the upstream such that no
> +	 * cherry-picks can be found (effectively making it act like
> +	 * --reapply-cherry-picks).
> +	 *
> +	 * Now, if the user does specify --[no-]reapply-cherry-picks, but
> +	 * does so in such a way that options.reapply_cherry_picks ==
> +	 * keep_base, then the behavior they get will match what they
> +	 * expect despite options.reapply_cherry_picks being ignored.  We
> +	 * could just allow the flag in that case, but it seems better to
> +	 * just alert the user that they've specified a flag that the
> +	 * backend ignores.
> +	 */

I'm a bit confused by this. --keep-base works with either 
--reapply-cherry-picks (which is the default if --keep-base is given) or 
--no-reapply-cherry-picks. Just below this hunk we have

	if (options.reapply_cherry_picks < 0)
		options.reapply_cherry_picks = keep_base;

So we only set options.reapply_cherry_picks to match keep_base if the 
user did not specify -[-no]-reapply-cherry-picks on the commandline.

Best Wishes

Phillip

> +	if (options.reapply_cherry_picks >= 0)
> +		imply_merge(&options, options.reapply_cherry_picks ? "--reapply-cherry-picks" :
> +								     "--no-reapply-cherry-picks");
> +
>   	/*
>   	 * --keep-base defaults to --reapply-cherry-picks to avoid losing
>   	 * commits when using this option.
> @@ -1406,13 +1426,6 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>   	if (options.empty != EMPTY_UNSPECIFIED)
>   		imply_merge(&options, "--empty");
>   
> -	/*
> -	 * --keep-base implements --reapply-cherry-picks by altering upstream so
> -	 * it works with both backends.
> -	 */
> -	if (options.reapply_cherry_picks && !keep_base)
> -		imply_merge(&options, "--reapply-cherry-picks");
> -
>   	if (gpg_sign)
>   		options.gpg_sign_opt = xstrfmt("-S%s", gpg_sign);
>   
> @@ -1503,6 +1516,9 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>   	if (options.update_refs)
>   		imply_merge(&options, "--update-refs");
>   
> +	if (options.autosquash)
> +		imply_merge(&options, "--autosquash");
> +
>   	if (options.type == REBASE_UNSPECIFIED) {
>   		if (!strcmp(options.default_backend, "merge"))
>   			imply_merge(&options, "--merge");
> diff --git a/t/t3422-rebase-incompatible-options.sh b/t/t3422-rebase-incompatible-options.sh
> index f86274990b0..6a17b571ec7 100755
> --- a/t/t3422-rebase-incompatible-options.sh
> +++ b/t/t3422-rebase-incompatible-options.sh
> @@ -50,6 +50,11 @@ test_rebase_am_only () {
>   		test_must_fail git rebase $opt --strategy-option=ours A
>   	"
>   
> +	test_expect_success "$opt incompatible with --autosquash" "
> +		git checkout B^0 &&
> +		test_must_fail git rebase $opt --autosquash A
> +	"
> +
>   	test_expect_success "$opt incompatible with --interactive" "
>   		git checkout B^0 &&
>   		test_must_fail git rebase $opt --interactive A
> @@ -60,6 +65,26 @@ test_rebase_am_only () {
>   		test_must_fail git rebase $opt --exec 'true' A
>   	"
>   
> +	test_expect_success "$opt incompatible with --keep-empty" "
> +		git checkout B^0 &&
> +		test_must_fail git rebase $opt --keep-empty A
> +	"
> +
> +	test_expect_success "$opt incompatible with --empty=..." "
> +		git checkout B^0 &&
> +		test_must_fail git rebase $opt --empty=ask A
> +	"
> +
> +	test_expect_success "$opt incompatible with --no-reapply-cherry-picks" "
> +		git checkout B^0 &&
> +		test_must_fail git rebase $opt --no-reapply-cherry-picks A
> +	"
> +
> +	test_expect_success "$opt incompatible with --reapply-cherry-picks" "
> +		git checkout B^0 &&
> +		test_must_fail git rebase $opt --reapply-cherry-picks A
> +	"
> +
>   	test_expect_success "$opt incompatible with --update-refs" "
>   		git checkout B^0 &&
>   		test_must_fail git rebase $opt --update-refs A

^ permalink raw reply

* Re: [PATCH v2] Documentation: render dash correctly
From: Andrei Rybak @ 2023-01-23 18:52 UTC (permalink / raw)
  To: Martin Ågren; +Cc: git, Junio C Hamano
In-Reply-To: <CAN0heSogz0cdhVJdiZhCc2_fcHzJggPjbS0wCAQkRh1uZMxLig@mail.gmail.com>

On 23/01/2023 12:04, Martin Ågren wrote:
> On Mon, 23 Jan 2023 at 10:01, Andrei Rybak <rybak.a.v@gmail.com> wrote:
> 
>>   highlighted with a `$` sign; if you are trying to recreate these
>> -example by hand, do not cut and paste them---they are there
>> +example by hand, do not cut and paste them--they are there
>>   primarily to highlight extra whitespace at the end of some lines.
> 
> OK, so this is one of the new ones compared to v1. I can see the
> argument for adding some spaces around the "--" for consistency and to
> make this a bit easier to read in the resulting manpage (which can of
> course be very subjective), but then I can also see that kind of change

There are some less subjective guidelines.  Asciidoc turns "--" into an
em-dash.[1]  In English, em-dash is almost always not surrounded by
spaces (it is in French, for example), while en-dash is spaced in
English when used instead of an em-dash.[2][3][4]

This means that it's all the other places that use " -- " with spaces
that are incorrect.

References:

1. https://docs.asciidoctor.org/asciidoc/latest/syntax-quick-reference/#text-replacements

2. English Wikipedia is clear about its usage of en- and em-dashes:
    https://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style#Dashes

3. Chicago Manual of Style FAQ doesn't spell out the spacing, but it's
    clear from examples:
    https://www.chicagomanualofstyle.org/qanda/data/faq/topics/HyphensEnDashesEmDashes/faq0002.html

4. More confirmation on English Language and Usage Q&A website on Stack
    Exchange network: https://english.stackexchange.com/a/154998/54197

> being left out as orthogonal to this patch.

Indeed, correcting spacing around dashes is orthogonal.  Also, it might
not be very desirable to have so much churn for spacing issues.
  
> This v2 patch looks good to me.

Thank you for review.
  
> Martin


^ permalink raw reply

* Re: [PATCH v2 01/10] bundle: optionally skip reachability walk
From: Derrick Stolee @ 2023-01-23 18:24 UTC (permalink / raw)
  To: Junio C Hamano, Derrick Stolee via GitGitGadget
  Cc: git, me, vdye, avarab, steadmon, chooglen
In-Reply-To: <xmqqsfg1m8l6.fsf@gitster.g>

On 1/23/2023 1:03 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> From: Derrick Stolee <derrickstolee@github.com>
>>
>> When unbundling a bundle, the verify_bundle() method checks two things
>> with regards to the prerequisite commits:
>>
>>  1. Those commits are in the object store, and
>>  2. Those commits are reachable from refs.
>>
>> During testing of the bundle URI feature, where multiple bundles are
>> unbundled in the same process, the ref store did not appear to be
>> refreshing with the new refs/bundles/* references added within that
>> process. This caused the second half -- the reachability walk -- report
>> that some commits were not present, despite actually being present.
>>
>> One way to attempt to fix this would be to create a way to force-refresh
>> the ref state. That would correct this for these cases where the
>> refs/bundles/* references have been updated. However, this still is an
>> expensive operation in a repository with many references.
>>
>> Instead, optionally allow callers to skip this portion by instead just
>> checking for presence within the object store. Use this when unbundling
>> in bundle-uri.c.
> 
> This step is new in this round.
> 
> I am assuming that this approach is to avoid repeated "now we
> unbundled one, let's spend enormous cycles to update the in-core
> view of the ref store before processing the next bundle"---instead
> we unbundle all, assuming the prerequisites for each and every
> bundle are satisfied.

We are specifically removing the requirement that the objects are
reachable from refs, we still check that the objects are in the
object store. Thus, we can only be in a bad state afterwards if
the required objects for a bundle were in the object store,
previously unreachable, and one of these two things happened:

1. Some objects reachable from those required commits were already
   missing in the repository (so the repo's object store was broken
   but only for some unreachable objects).

2. A GC pruned those objects between verifying the bundle and
   writing the refs/bundles/* refs after unbundling.

I think we should trust the repository to not be in the first state,
but the race condition in the second option will create a state
where we have missing objects that are now reachable from refs.

> I am OK as long as we check the assumption holds true at the end;
> this looks like a good optimization.

So are you recommending that we verify all objects reachable from
the new refs/bundles/* are present after unbundling? That would
prevent the possibility of a GC race, but at some significant run-
time performance costs. Do we do the same as we unpack from a
fetch? Do we apply the same scrutiny to the objects during
unbundling as we do from a fetched pack? They both use 'git
index-pack --stdin --fix-thin', so my guess is that we do the same
amount of checks for both cases.

Since this is only one of multiple ways to add objects that depend
on possibly-unreachable objects, my preferred way to avoid the
GC race is by using care in the GC process itself (such as the new
--expire-to option to recover from these races).

Thanks,
-Stolee

^ permalink raw reply

* Re: [CI]: Is t7527 known to be flakey?
From: SZEDER Gábor @ 2023-01-23 18:12 UTC (permalink / raw)
  To: Jeff Hostetler; +Cc: Junio C Hamano, Jeff Hostetler, edecosta, git
In-Reply-To: <f7449e73-7f50-67ea-2be4-2037f98a69f3@jeffhostetler.com>

On Mon, Jan 23, 2023 at 11:56:53AM -0500, Jeff Hostetler wrote:
> Was this on Linux or MacOS ?

On an average-ish Linux (an Ubuntu LTS variant).  So the issue is not
specific to musl.

^ permalink raw reply

* Re: [PATCH v2 01/10] bundle: optionally skip reachability walk
From: Junio C Hamano @ 2023-01-23 18:03 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: git, me, vdye, avarab, steadmon, chooglen, Derrick Stolee
In-Reply-To: <b3828725bc8f8887b9b4777a0e3d84224a427f31.1674487310.git.gitgitgadget@gmail.com>

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Derrick Stolee <derrickstolee@github.com>
>
> When unbundling a bundle, the verify_bundle() method checks two things
> with regards to the prerequisite commits:
>
>  1. Those commits are in the object store, and
>  2. Those commits are reachable from refs.
>
> During testing of the bundle URI feature, where multiple bundles are
> unbundled in the same process, the ref store did not appear to be
> refreshing with the new refs/bundles/* references added within that
> process. This caused the second half -- the reachability walk -- report
> that some commits were not present, despite actually being present.
>
> One way to attempt to fix this would be to create a way to force-refresh
> the ref state. That would correct this for these cases where the
> refs/bundles/* references have been updated. However, this still is an
> expensive operation in a repository with many references.
>
> Instead, optionally allow callers to skip this portion by instead just
> checking for presence within the object store. Use this when unbundling
> in bundle-uri.c.

This step is new in this round.

I am assuming that this approach is to avoid repeated "now we
unbundled one, let's spend enormous cycles to update the in-core
view of the ref store before processing the next bundle"---instead
we unbundle all, assuming the prerequisites for each and every
bundle are satisfied.

I am OK as long as we check the assumption holds true at the end;
this looks like a good optimization.


^ permalink raw reply

* Re: [PATCH v2] MyFirstContribution: refrain from self-iterating too much
From: Torsten Bögershausen @ 2023-01-23 17:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <xmqqzga9opdu.fsf@gitster.g>

On Sun, Jan 22, 2023 at 08:18:05PM -0800, Junio C Hamano wrote:

The whole thing is much more convenient to read, so to say.
Some nit-picking inline.

> Finding mistakes in and improving your own patches is a good idea,
> but doing so too quickly is being inconsiderate to reviewers who
> have just seen the initial iteration and taking their time to review
> it.  Encourage new developers to perform such a self review before
> they send out their patches, not after.

I think that this is what V1 was about. Review first, send then.
Is there still so much focus on this ?
Or is it more about "what to do when it went wrong?"

How about this, or similar ?

...it.  Encourage developers to wait with a new version too early.
But if they plan to send one later, they are welcome to comment
their own work as they where reviers.


>
> Helped-by: Torsten Bögershausen <tboegi@web.de>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  Documentation/MyFirstContribution.txt | 30 +++++++++++++++++++++++++++
>  1 file changed, 30 insertions(+)
>
> diff --git a/Documentation/MyFirstContribution.txt b/Documentation/MyFirstContribution.txt
> index ccfd0cb5f3..3e4f1c7764 100644
> --- a/Documentation/MyFirstContribution.txt
> +++ b/Documentation/MyFirstContribution.txt
> @@ -1256,6 +1256,36 @@ index 88f126184c..38da593a60 100644
>  [[now-what]]
>  == My Patch Got Emailed - Now What?
>
> +After you sent out your first patch, you may find mistakes in it, or
> +a different and better way to achieve the goal of the patch.  But
> +resist the temptation to send a new version immediately.
> +
> + - If the mistakes you found are minor, send a reply to your patch as
> +   if you were a reviewer and mention that you will fix them in an
> +   updated version.
> +
> + - On the other hand, if you think you want to change the course so
> +   drastically that reviews on the initial patch would become
> +   useless, send a reply to your patch to say so immediately to
> +   avoid wasting others' time (e.g. "I am working on a better
> +   approach, so please ignore this patch, and wait for the updated
> +   version.")
> +
> +And give reviewers enough time to process your initial patch before
> +sending an updated version.
> +
> +The above is a good practice if you sent your initial patch
> +prematurely without polish.  But a better approach of course is to
> +avoid sending your patch prematurely in the first place.
> +
> +Keep in mind that people in the development community do not have to
> +see your patch immediately after you wrote it.  Instead of seeing
> +the initial version right now, that you will follow up with several
> +updated "oops, I like this version better than the previous one"
> +versions over 2 days, reviewers would much appreciate if a single
> +more polished version came 2 days late and that version, that
> +contains fewer mistakes, were the only one they need to review.
> +
>  [[reviewing]]
>  === Responding to Reviews
>
> --
> 2.39.1-308-g56c8fb1e95
>

^ permalink raw reply

* Re: [PATCH v4] win32: fix thread usage for win32
From: Jeff Hostetler @ 2023-01-23 17:43 UTC (permalink / raw)
  To: Rose via GitGitGadget, git; +Cc: Johannes Sixt, Rose, Seija Kijin
In-Reply-To: <pull.1440.v4.git.git.1674492499537.gitgitgadget@gmail.com>



On 1/23/23 11:48 AM, Rose via GitGitGadget wrote:
> From: Seija Kijin <doremylover123@gmail.com>
> 
> Use _beginthreadex instead of CreateThread
> since we use the Windows CRT,
> as Microsoft recommends _beginthreadex
> over CreateThread for these situations.
> 
> Finally, check for NULL handles, not "INVALID_HANDLE,"
> as _beginthreadex guarantees a valid handle in most cases
> 
> Signed-off-by: Seija Kijin <doremylover123@gmail.com>
> ---
>      win32: fix thread usage for win32
>      
>      Use pthread_exit instead of async_exit.
>      
>      This means we do not have to deal with Windows's implementation
>      requiring an unsigned exit coded despite the POSIX exit code requiring a
>      signed exit code.
>      
>      Use _beginthreadex instead of CreateThread since we use the Windows CRT.
>      
>      Finally, check for NULL handles, not "INVALID_HANDLE," as _beginthreadex
>      guarantees a valid handle in most cases
>      
>      Signed-off-by: Seija Kijin doremylover123@gmail.com
> 
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1440%2FAtariDreams%2FCreateThread-v4
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1440/AtariDreams/CreateThread-v4
> Pull-Request: https://github.com/git/git/pull/1440
> 
> Range-diff vs v3:
> 
>   1:  68baafba2bd ! 1:  2e2d5ce7745 win32: fix thread usage for win32
>       @@ Commit message
>        
>            Signed-off-by: Seija Kijin <doremylover123@gmail.com>
>        
>       - ## compat/mingw.c ##
>       -@@ compat/mingw.c: static int start_timer_thread(void)
>       - 	timer_event = CreateEvent(NULL, FALSE, FALSE, NULL);
>       - 	if (timer_event) {
>       - 		timer_thread = (HANDLE) _beginthreadex(NULL, 0, ticktack, NULL, 0, NULL);
>       --		if (!timer_thread )
>       -+		if (!timer_thread)
>       - 			return errno = ENOMEM,
>       - 				error("cannot start timer thread");
>       - 	} else
>       -
>         ## compat/winansi.c ##
>        @@ compat/winansi.c: enum {
>         	TEXT = 0, ESCAPE = 033, BRACKET = '['
> 
> 
>   compat/winansi.c | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/compat/winansi.c b/compat/winansi.c
> index 3abe8dd5a27..be65b27bd75 100644
> --- a/compat/winansi.c
> +++ b/compat/winansi.c
> @@ -340,7 +340,7 @@ enum {
>   	TEXT = 0, ESCAPE = 033, BRACKET = '['
>   };
>   
> -static DWORD WINAPI console_thread(LPVOID unused)
> +static unsigned int WINAPI console_thread(LPVOID unused)
>   {
>   	unsigned char buffer[BUFFER_SIZE];
>   	DWORD bytes;
> @@ -643,9 +643,9 @@ void winansi_init(void)
>   		die_lasterr("CreateFile for named pipe failed");
>   
>   	/* start console spool thread on the pipe's read end */
> -	hthread = CreateThread(NULL, 0, console_thread, NULL, 0, NULL);
> -	if (hthread == INVALID_HANDLE_VALUE)
> -		die_lasterr("CreateThread(console_thread) failed");
> +	hthread = (HANDLE)_beginthreadex(NULL, 0, console_thread, NULL, 0, NULL);
> +	if (!hthread)
> +		die_lasterr("_beginthreadex(console_thread) failed");
>   
>   	/* schedule cleanup routine */
>   	if (atexit(winansi_exit))
> 
> base-commit: 56c8fb1e95377900ec9d53c07886022af0a5d3c2

This change may or may not be harmless, but it scares me
because it is possibly a very subtle change and is being
made for an unknown reason -- is there a problem being
fixed here?  Or is this just churn for the sake of churn
to avoid an awkward cast of the return code?

What does _beginthreadex() specifically do that we need
it to do for us?

_beginthreadex() does some CRT init and then calls CreateThread(),
so what are we missing by calling CreateThread() directly?

The code in question is 11+ years old and it hasn't been a
problem (right?), so I have to wonder what value do we get
from this change.

The containing function here is setting up a special console
thread and named pipe to access the console, so I doubt that
any of the tests in the test suite actually would actually
exercise this change (since the tests aren't interactive).

The low-level Windows startup code is very tricky and sensitive
(and we need to test with both GCC's CRT and MSVC's CRT).
As I said earlier, the change may or may not be harmless, but
I question the need for it.

Jeff



^ permalink raw reply

* Re: GSoC 2023
From: Kaartic Sivaraam @ 2023-01-23 17:41 UTC (permalink / raw)
  To: Christian Couder; +Cc: git
In-Reply-To: <CAP8UFD3jzX5zRRYKS5uES2X9vB4eKJruzT7o6+7KytqLSmmZRg@mail.gmail.com>

Hi Christian,

On 17/01/23 15:04, Christian Couder wrote:
> 
> GSoC Org Applications open next week on Monday, January 23rd at 1800
> UTC and close on Tuesday, February 7th at 1800 UTC.
> 
> I am interested in mentoring and being an org admin for Git again this
> year, so I plan to apply for Git soon.
>

I would be glad to help as an Org admin this year. I don't suppose I 
have the capacity to mentor / co-mentor this year, though. I could very 
well help as much as I can passively where needed.

One thing to note about the Org application. As per Google's claim we 
should be able to complete this year's application quickly since the new 
webapp allows us to reuse last year's application details. To quote Google,

   If your org applied to GSoC in 2022 your Org Admin will have a very
   quick and easy 2023 application. : ) As mentioned last year, the
   webapp is now evergreen so if your org applied in 2022 all of the
   information for the org application is still in the system so the Org
   Admin will have a couple of quick steps to follow during the
   application window.

Do feel free to let me know if it works otherwise, Christian. I'll be 
glad to look into it.

> The GSoC contributor application deadline is April 4 - 18:00 UTC, so
> (co-)mentors and org admins are already welcome to volunteer. We also
> need project ideas to refresh our idea page from last year
> (https://git.github.io/SoC-2022-Ideas/).
>

We certainly are in need of ideas and mentors for this year. Do chime in 
with your thoughts. :-)

-- 
Sivaraam

^ permalink raw reply

* Re: [PATCH v2] Documentation: render dash correctly
From: Junio C Hamano @ 2023-01-23 17:39 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Andrei Rybak, git
In-Reply-To: <CAN0heSogz0cdhVJdiZhCc2_fcHzJggPjbS0wCAQkRh1uZMxLig@mail.gmail.com>

Martin Ågren <martin.agren@gmail.com> writes:

> On Mon, 23 Jan 2023 at 10:01, Andrei Rybak <rybak.a.v@gmail.com> wrote:
>
>>  highlighted with a `$` sign; if you are trying to recreate these
>> -example by hand, do not cut and paste them---they are there
>> +example by hand, do not cut and paste them--they are there
>>  primarily to highlight extra whitespace at the end of some lines.
>
> OK, so this is one of the new ones compared to v1. I can see the
> argument for adding some spaces around the "--" for consistency and to
> make this a bit easier to read in the resulting manpage (which can of
> course be very subjective), but then I can also see that kind of change
> being left out as orthogonal to this patch.
>
> This v2 patch looks good to me.

Thanks, both.  Will queue.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox